Female historians and male nurses do not exist, Google Translate tells its European users

An experiment shows that Google Translate systematically changes the gender of translations when they do not fit with stereotypes. It is all because of English, Google says.

Nicolas Kayser-Bril

If you were to read a story about male and female historians translated by Google, you might be forgiven for overlooking the women in the group. The phrase “vier Historikerinnen und Historiker” (four male and female historians) is rendered as “cuatro historiadores” (four male historians) in Spanish, with similar results in Italian, French and Polish. Female historians are simply removed from the text.

In an experiment, I translated 11 occupations from one gender-inflected language to another. I analyzed 440 translation pairs to and from German, Italian, Polish, Spanish and French. Together, these languages are natively spoken by three in four citizens of the European Union.
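The figure of 440 can be reproduced with a short count. This is only a sketch of the arithmetic, assuming that each occupation was tested in both its feminine and masculine form and in every direction between the five languages; the variable names are illustrative and are not taken from the experiment's published code.

```python
from itertools import permutations

# The five languages named in the article.
LANGUAGES = ["German", "Italian", "Polish", "Spanish", "French"]
OCCUPATIONS = 11  # number of occupations tested
# Assumption: every occupation was tested in both grammatical genders.
FORMS = ["feminine", "masculine"]

# Ordered (source, target) pairs: German -> French is a different
# translation task from French -> German.
directions = list(permutations(LANGUAGES, 2))

pairs_per_occupation = len(directions) * len(FORMS)
total_pairs = pairs_per_occupation * OCCUPATIONS

print(len(directions), pairs_per_occupation, total_pairs)  # prints: 20 40 440
```

Under these assumptions, each occupation accounts for 40 translations, which matches the "out of 40" score reported per occupation.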

Fitting the stereotypes

In many cases, Google changed the gender of the word in a grossly stereotypical way. “Die Präsidentin” (the female president) is rendered as “il presidente” in Italian, although the correct translation is “la presidente”. “Der Krankenpfleger” (the male nurse in German) becomes “l’infirmière” (the female nurse) in French.

In my list, shop assistant was best translated by Google, with 33 correct translations out of 40. From French to Spanish for instance, “la vendeuse” was correctly translated to “la vendedora” and “le vendeur” to “el vendedor”.

Errors are not systematic, showing that they can be fixed. “Kierowniczka” (Polish for female director) was correctly translated in all four target languages, although “die Chefin”, “la capa”, “la jefa” and “la cheffe” were wrongly translated to their masculine forms. (When Google correctly translated a feminine occupation, it was often because the target language’s word was not gender-inflected. For instance, “l'insegnante” in Italian designates both a female and a male teacher.)

The experiment’s code and data are available online.

This experiment might not reflect what Google Translate shows when translating web pages or longer texts. In some cases, especially when nearby words contain feminine forms, Google correctly translates gender-inflected forms.

Digital colonialism

Stereotypes sneak into translations because Google optimizes translations for English.

A Google spokesperson told AlgorithmWatch that “translating between language pairs requires high volumes of bilingual data that often don’t exist for all language pairs. The way to enable these translations is by using a technique called ‘bridging’. Language bridging in translation means that to translate from X to Y a third language is introduced (E) based on the existence of bilingual data to translate X to E and then E to Y. The most common language used as bridge is English.”

“The majority of nouns in English are gender-neutral: so, when translating the feminine term for ‘nurse’ from a gender-inflected language to English, the gender is ‘lost’ in the translation to the bridging language,” the Google spokesperson added.
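The mechanism the spokesperson describes can be illustrated with a toy example. The lookup tables below are hypothetical and bear no relation to Google's actual system. They only show how gender information that is missing from the English bridge term cannot be recovered in the second step, so the target side falls back on whichever gendered form it associates with the English word.

```python
# Toy lookup tables, hypothetical and for illustration only.
# Step 1: source language -> English (the bridge). Gender is dropped here.
GERMAN_TO_ENGLISH = {
    "der Krankenpfleger": "the nurse",      # male nurse
    "die Präsidentin": "the president",     # female president
}

# Step 2: English -> target language. The English term carries no gender,
# so each entry can only hold one default (here: the stereotypical) form.
ENGLISH_TO_FRENCH = {"the nurse": "l'infirmière"}
ENGLISH_TO_ITALIAN = {"the president": "il presidente"}


def bridge_translate(term, to_bridge, from_bridge):
    """Translate X -> E -> Y using two bilingual tables."""
    bridged = to_bridge[term]      # X -> E
    return from_bridge[bridged]    # E -> Y


print(bridge_translate("der Krankenpfleger", GERMAN_TO_ENGLISH, ENGLISH_TO_FRENCH))
print(bridge_translate("die Präsidentin", GERMAN_TO_ENGLISH, ENGLISH_TO_ITALIAN))
```

In this sketch the male nurse comes out as a female nurse and the female president as a male one, mirroring the errors observed in the experiment.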

Several experts I talked to agreed that the community of researchers working on machine translation was not very concerned about non-English languages. Only in May 2020 did the Association for Computational Linguistics, a large professional body, tell reviewers of its annual conference that they could not reject a paper solely because it was about a language other than English.

Window dressing

In 2018, Google introduced a feature that alerted users that some words could be gender-specific when translating from English.

However, it is unclear whether such efforts were made in earnest. Over two years after the changes were deployed, “developer” is correctly translated into French both in the masculine form as “le développeur” and in the feminine as “la développeuse”. But “the developer” translates to “le développeur” and all the sentences I tried translated into the masculine, including the phrase “the developer is a woman”.

“developer” translates to both genders.
But “the developer” translates to the masculine only.

Verified falsehoods

In my experiment, 182 of the 440 translations (about 41 percent) turned out to be wrong. The vast majority of the errors involved feminine forms converted to their masculine equivalent. Google marked 68 of the wrong translations as “verified”.

The Google spokesperson declined to explain precisely how the “verified” label was awarded. “We mark translations as ‘verified’ when they’ve been reviewed by several volunteers in the Google Translate Community and these volunteers agree the translation is correct”, they said. “We are improving our detection of low-quality contributions with automated scoring methods and periodic knowledge checks.”

My experiment raised other issues. “Le chef” (the boss, in French) was translated to “der Führer” in German, a word meaning “the leader” or “the guide” and very strongly linked to the Nazi era. The translation was marked as verified.

But Google reassured me that no extremist group infiltrated the “Google Translate Community” to spread far-right language. “In this specific case, [the error] is due to the ‘bridging’ process”, the spokesperson said. “If you do a translation for ‘le chef’ from French to English we get ‘leader’. If you then translate ‘leader’ from English to German you get ‘Führer’”.

No escape

Google Translate is not just another translation service. It is a feature that Europeans can hardly escape.

Since an update in April 2019, Google Chrome prompts users to instantly translate web pages. Anyone visiting a website in a foreign language is asked to choose between the original and the Google-translated version, even if the website offers an official translation in the user’s preferred language. (Google cannot detect websites that provide an official translation and “errs on the side of helpfulness by offering a translate option in all circumstances”, the spokesperson said. They also said users could turn off the translation prompt.)

Approximately 250 million, or one in two, citizens of the European Union use an Android phone. Unless they manage to bypass the system’s blocks (by “rooting” their device), they cannot remove Google Chrome. It is likely that many of them use Google Translate, perhaps unwittingly.
