Evaluating Cultural Alignment of Multimodal AI Models in Saudi Cultural Images
Project Overview
This project examines whether multimodal large language models can interpret Saudi cultural images beyond surface-level object recognition. The study may include images of Saudi food, clothing, architecture, heritage sites, traditional objects, or social practices, with attention to whether the model recognizes cultural meaning, symbolism, and context.
Research Aim
To assess how accurately multimodal LLMs describe Saudi cultural images and whether they demonstrate culturally grounded understanding.
Proposed Methodology
Students will curate a small benchmark of Saudi cultural images, each paired with an expert-annotated description. They will test selected multimodal models using structured prompts and compare the model responses against these culturally informed reference descriptions. The analysis may classify errors into categories such as object misidentification, missing cultural significance, stereotyping, overgeneralization, or incomplete context.
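One possible way to organize the benchmark items and the error annotations is sketched below. It is an illustration only, not a prescribed implementation: the class names, field names, and the tally_errors helper are hypothetical, and the error labels are assumed to be assigned by human annotators rather than computed automatically. The category list simply mirrors the examples named above and can be extended.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class ErrorCategory(Enum):
    """Error types taken from the methodology text; extend as needed."""
    OBJECT_MISIDENTIFICATION = "object misidentification"
    MISSING_CULTURAL_SIGNIFICANCE = "missing cultural significance"
    STEREOTYPING = "stereotyping"
    OVERGENERALIZATION = "overgeneralization"
    INCOMPLETE_CONTEXT = "incomplete context"


@dataclass
class BenchmarkItem:
    """One image with its structured prompt and expert reference (hypothetical schema)."""
    image_path: str
    prompt: str
    reference_description: str


@dataclass
class AnnotatedResponse:
    """A model response plus the error labels a human annotator assigned to it."""
    model_name: str
    item: BenchmarkItem
    response_text: str
    errors: List[ErrorCategory] = field(default_factory=list)


def tally_errors(responses: List[AnnotatedResponse]) -> Dict[str, Counter]:
    """Count how often each error category was assigned, per model."""
    tallies: Dict[str, Counter] = {}
    for response in responses:
        counter = tallies.setdefault(response.model_name, Counter())
        counter.update(response.errors)
    return tallies
```

Keeping the annotations separate from the benchmark items means the same item can be reused across models, and the per-model counts from tally_errors feed directly into a comparative table.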
Expected Deliverables
A curated dataset of images and prompts, an evaluation rubric for cultural alignment, a comparative model analysis, and design recommendations for culturally aware AI systems.
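As a rough illustration of what an evaluation rubric could look like in practice, the sketch below combines per-criterion ratings into a single normalized score. Everything here is an assumption made for the example: the four criteria are drawn from the themes in the project overview (object recognition, cultural meaning, symbolism, context), the 0 to 2 rating scale and the equal default weights are placeholders, and the function name rubric_score is hypothetical. The actual rubric is a project deliverable and may differ.

```python
from typing import Dict, Optional

# Assumed criteria, based on the themes named in the project overview.
CRITERIA = ("object_recognition", "cultural_meaning", "symbolism", "context")

# Assumed rating scale: 0 = absent or wrong, 1 = partial, 2 = fully correct.
MAX_RATING = 2


def rubric_score(ratings: Dict[str, int],
                 weights: Optional[Dict[str, float]] = None) -> float:
    """Return a weighted score in [0, 1] from per-criterion ratings."""
    if weights is None:
        # Equal weighting is a placeholder choice, not a recommendation.
        weights = {criterion: 1.0 for criterion in CRITERIA}

    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")

    for criterion in CRITERIA:
        if not 0 <= ratings[criterion] <= MAX_RATING:
            raise ValueError(f"rating for {criterion} is outside 0..{MAX_RATING}")

    total_weight = sum(weights[c] for c in CRITERIA)
    weighted_sum = sum(weights[c] * ratings[c] for c in CRITERIA)
    return weighted_sum / (MAX_RATING * total_weight)
```

A single aggregate number is convenient for comparing models, but it hides which criterion failed, so reporting the per-criterion ratings alongside it is likely to be more informative.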