Seminars and Events
NL Seminar - Cultural Knowledge and Cultural Biases: Analyzing the Multilingual Performance of Text-to-Image Models
Event Details
Speaker: Michael Saxon, UCSB
Conference Rm Location: ISI-MDR #689. In-person attendance is permitted for USC/ISI faculty, staff, and students only. Open to the public virtually via Zoom.
REMINDER:
If you do not have access to the 6th Floor, please check in at the main reception desk on the 10th floor, and someone will escort you to the conference room before the start of the talk.
Meeting hosts only admit guests that they know to the Zoom meeting. Hence, you’re highly encouraged to use your USC account to sign into Zoom.
If you’re an outside visitor, please send your full name, title, and name of workplace to nlg-seminar-host(at)isi.edu beforehand so we’ll be aware of your attendance. Also, let us know whether you plan to attend in person or virtually.
For more information on the NL Seminar series and upcoming talks, please visit:
https://nlg.isi.edu/nl-seminar/
Hosts: Jon May and Justin Cho
Despite being ostensibly trained on solely English data, most text-to-image (T2I) models carry some degree of multilingual capability, with significant variation in performance between models and languages. To guide the future development of T2I systems, it is desirable to both measure and qualitatively analyze these language-specific performance variations, in order to mitigate cross-lingual disparities in performance as well as language-specific demographic biases.
To quantify multilingual performance we introduce the Conceptual Coverage Across Languages (CoCo-CroLa) benchmark, which allows us to measure the “possession” of a set of tangible noun “concepts” across English, Spanish, German, Chinese, Japanese, Hebrew, and Indonesian. This technique allows us to estimate how well suited a model is to a target language as well as identify model-specific weaknesses, spurious correlations, and biases without any a priori assumptions of their form. We demonstrate how it can be used to rank T2I models in terms of multilinguality, and that despite its simplicity our method captures the necessary conditions for the impressive “creative” generative abilities users expect from T2I models.
We then build on this benchmarking work with a detailed qualitative analysis of “failure” and “success” cases for specific concepts. Even in the “possession” case, concepts are expressed differently across languages. These qualitative cross-lingual variations in model behaviors form a continuous spectrum of ethical acceptability, running the gamut from culturally variable popular dog breeds to racially-biased sexualization in depictions of women. While the edge cases are easy to laud or condemn, drawing the line of acceptability in between them is an open ethical question as well as an open technical challenge. Unfortunately, interventions that successfully remove the most deleterious biases also erase cultural distinctiveness, motivating a need for more targeted interventions in future work.
Speaker Bio
Michael Saxon is a CS Ph.D. candidate in the NLP Group at the University of California, Santa Barbara. His research is driven by a desire to improve our objective understanding of the semantic capabilities of large generative AI systems, in particular generative image and language models. Toward this goal he focuses on developing novel data resources and metrics to model semantic phenomena in generative models, as well as techniques for model-driven dataset improvement to remove biases and spurious correlations. He has previously interned at Meta AI and Amazon working on NLP and speech, and is supported by the NSF Graduate Research Fellowship Program.
The speaker has approved recording of this NL Seminar talk; it will be posted on our USC/ISI YouTube page within 1-2 business days: https://www.youtube.com/user/USCISI.
Subscribe here to learn more about upcoming seminars: https://www-staging.isi.edu/events/