Researchers at the University of Southern California used artificial intelligence to conclude that male characters outnumber female characters four to one in literature.
One of the study’s authors, Mayank Kejriwal, says he was inspired by recent work on implicit gender bias, as well as by his own experience in natural language processing, a field that helps build language tools for various purposes. Although several previously published studies have analyzed the qualitative aspects of the representation of women in literature and media, this new work is based primarily on quantitative data collected using machine learning algorithms.
To obtain these results, Mr. Kejriwal and his colleague Akarsh Nagaraj used data from Project Gutenberg, comprising the equivalent of 3,000 books in English. The genres ranged from romance to science fiction, mystery and adventure, in various forms: novels, short stories and poetry.
“The gender bias is very real, and when we see that women are four times less present in literature, it has a subliminal impact on people who consume culture,” says Mr. Kejriwal. “We have revealed, quantitatively and indirectly, what biases are still present in our culture.”
According to Nagaraj, “books are a window into the past, and the writings of these authors give us an insight into how people perceive the world and how it has evolved.”
Men everywhere … and in the foreground
The study details several methods used to assess the prevalence of women in literature. The researchers used a technique to detect characters whose gender is clearly defined. “One of the ways to achieve our goal was to count the number of feminine pronouns in a book, compared to the masculine pronouns,” Kejriwal said.
The other technique was to count the female characters who serve as protagonists in the works.
Ultimately, this allowed the researchers to assess whether the male characters were central to the story.
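As an illustration, the pronoun-counting idea described above can be sketched in a few lines of Python. This is only a minimal sketch and not the researchers' actual code: the pronoun lists, the function names and the sample sentence are all assumptions made for the example, and a real study would need to handle far more than simple word matching.

```python
import re
from collections import Counter

# Assumed pronoun lists for illustration; the article does not give
# the word lists actually used in the study.
FEMININE = {"she", "her", "hers", "herself"}
MASCULINE = {"he", "him", "his", "himself"}


def pronoun_counts(text):
    """Return (feminine, masculine) pronoun counts, ignoring case."""
    # Split into lowercase alphabetic words, so "He's" yields "he" and "s".
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    feminine = sum(counts[w] for w in FEMININE)
    masculine = sum(counts[w] for w in MASCULINE)
    return feminine, masculine


def masculine_to_feminine_ratio(text):
    """Return masculine/feminine pronoun ratio, or None if no feminine pronouns."""
    feminine, masculine = pronoun_counts(text)
    if feminine == 0:
        return None
    return masculine / feminine


# Hypothetical sample text, not taken from the study's corpus.
sample = "He said that his sister was late. She told him that he was wrong."
print(pronoun_counts(sample))
print(masculine_to_feminine_ratio(sample))
```

Run over an entire book, a ratio well above 1 would suggest that masculine pronouns dominate the text, although a simple count like this cannot tell which characters actually drive the story.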
The study also found that the gap between male and female characters narrows when stories are written by women. “It clearly showed that women writers represented women more than male writers would,” Nagaraj said.
The research team acknowledges, however, that its methods for assessing female representation in books have limitations, especially when the authors are neither men nor women. Mr. Kejriwal also noted that it would be very difficult, if not impossible, to identify a text written by a transgender person using these two research techniques.
In fact, the researcher pointed out that there is still no artificial intelligence tool capable of detecting the pronoun “they” in English (or “iel” in French, for example, Ed.). Nevertheless, he believes the study makes it possible to address certain social problems.
The study also made it possible to construct a lexical field associated with each character type, again with often obvious biases. Thus, female characters were generally described as “weak”, “esteemed”, “handsome”, sometimes even “stupid”, while male characters were described as “leaders”, “powerful” and “strong” people, and tended to be associated with politics in particular.