Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis

Lukas Struppek; Dominik Hintersdorf; Felix Friedrich; Manuel Brack; Patrick Schramowski; Kristian Kersting

doi:10.1613/jair.1.15388

Published: Dec 18, 2023

Keywords:

machine learning

Lukas Struppek

Technical University of Darmstadt

Dominik Hintersdorf

Technical University of Darmstadt

Felix Friedrich

Technical University of Darmstadt

Manuel Brack

Technical University of Darmstadt

Patrick Schramowski

Technical University of Darmstadt

Kristian Kersting

Technical University of Darmstadt

Abstract

Models for text-to-image synthesis, such as DALL-E 2 and Stable Diffusion, have recently drawn a lot of interest from academia and the general public. These models are capable of producing high-quality images that depict a variety of concepts and styles when conditioned on textual descriptions. However, these models adopt cultural characteristics associated with specific Unicode scripts from their vast amount of training data, which may not be immediately apparent. We show that by simply inserting single non-Latin characters in the textual description, common models reflect cultural biases in their generated images. We analyze this behavior both qualitatively and quantitatively and identify a model’s text encoder as the root cause of the phenomenon. Such behavior can be interpreted as a model feature, offering users a simple way to customize the image generation and reflect their own cultural background. Yet, malicious users or service providers may also try to intentionally bias the image generation. One goal might be to create racist stereotypes by replacing Latin characters with similar-looking characters from non-Latin scripts, so-called homoglyphs. To mitigate such unnoticed script attacks, we propose a novel homoglyph unlearning method to fine-tune a text encoder, making it robust against homoglyph manipulations.
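
To make the notion of a homoglyph concrete, the following minimal Python sketch builds two prompts that look alike when rendered but differ in one character, and flags letters whose Unicode name does not begin with "LATIN". It is only an illustration and not the homoglyph unlearning method proposed in the article: the function name `non_latin_letters`, the example prompts, and the name-prefix heuristic are assumptions introduced here.

```python
import unicodedata


def non_latin_letters(prompt: str) -> list:
    """Return (index, character, Unicode name) for each letter outside the Latin script.

    Naive heuristic (an assumption of this sketch): a letter counts as Latin
    if its Unicode character name starts with "LATIN".
    """
    flagged = []
    for i, ch in enumerate(prompt):
        if not ch.isalpha():
            continue  # skip spaces, digits, and punctuation
        name = unicodedata.name(ch, "UNKNOWN")
        if not name.startswith("LATIN"):
            flagged.append((i, ch, name))
    return flagged


# Hypothetical prompts for illustration only.
latin_prompt = "A photo of an actress"
# Same prompt, but the "a" in "actress" is Cyrillic U+0430, a homoglyph of Latin "a".
spoofed_prompt = "A photo of an \u0430ctress"

print(latin_prompt == spoofed_prompt)     # the strings differ despite looking alike
print(non_latin_letters(latin_prompt))    # no letters flagged
print(non_latin_letters(spoofed_prompt))  # flags the Cyrillic letter
```

A check of this kind only surfaces script mixing in the input. It also flags legitimate non-Latin text, so it is a diagnostic aid rather than a mitigation comparable to making the text encoder itself robust.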

Issue

Vol. 78 (2023)

Section

Articles
