Some artificial intelligence tools for health care may get confused by the ways people of different genders and races talk, according to a new study led by University of Colorado Boulder computer scientist Theodora Chaspari.
The study hinges on a perhaps unspoken reality of human society: Not everyone talks the same way. Women, for example, tend to speak at a higher pitch than men, and similar differences can pop up between, say, white and Black speakers.
Now, researchers have found that those natural variations could confound algorithms that screen humans for mental health concerns like anxiety or depression. The results add to a growing body of research showing that AI, just like people, can make assumptions based on race or gender.
"If AI isn't trained well, or doesn't include enough representative data, it can propagate these human or societal biases," said Chaspari, associate professor in the Department of Computer Science.
She and her colleagues published their findings in the journal Frontiers in Digital Health.
Chaspari noted that AI could be a promising technology in the healthcare world. Finely tuned algorithms can sift through recordings of people speaking, searching for subtle changes in the way they talk that could indicate underlying mental health concerns.
But those tools have to perform consistently for patients from many demographic groups, the computer scientist said. To find out if AI is up to the task, the researchers fed audio samples of real humans into a common set of machine learning algorithms. The results raised a few red flags: The AI tools, for example, seemed to underdiagnose women at risk of depression more often than men, an outcome that, in the real world, could keep people from getting the care they need.
"With artificial intelligence, we can identify these fine-grained patterns that humans can't always perceive," said Chaspari, who conducted the work as a faculty member at Texas A&M University. "However, while there is this opportunity, there is also a lot of risk."
Speech and emotions
She added that the way humans talk can be a powerful window into their underlying emotions and well-being, something that poets and playwrights have long known.
Research suggests that people diagnosed with clinical depression often speak more softly and in more of a monotone than others. People with anxiety disorders, meanwhile, tend to talk with a higher pitch and with more "jitter," a measurement of the breathiness in speech.
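To make the feature concrete: one common definition of local jitter is the average cycle-to-cycle change in pitch period, normalized by the mean period. The sketch below is illustrative only; the function name and input format are assumptions, not details from the study.

```python
def local_jitter(periods):
    """Local jitter: mean absolute difference between consecutive
    pitch periods (in seconds), divided by the mean period."""
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

# A steady voice has near-identical periods (low jitter); an anxious,
# breathy voice shows more period-to-period wobble (higher jitter).
steady = local_jitter([0.0100, 0.0101, 0.0100, 0.0101])
wobbly = local_jitter([0.0100, 0.0112, 0.0097, 0.0110])
```

Real toolkits (such as Praat-style analyzers) extract the pitch periods from the waveform first; this sketch assumes that step is already done.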
"We know that speech is very much influenced by one's anatomy," Chaspari said. "For depression, there have been some studies showing changes in the way vibrations in the vocal folds happen, or even in how the voice is modulated by the vocal tract."
Over the years, scientists have developed AI tools to look for just those kinds of changes.
Chaspari and her colleagues decided to put the algorithms under the microscope. To do that, the team drew on recordings of humans talking in a range of scenarios: In one, people had to give a 10- to 15-minute talk to a group of strangers. In another, men and women talked for a longer time in a setting similar to a doctor's visit. In both cases, the speakers separately filled out questionnaires about their mental health. The study included Michael Yang and Abd-Allah El-Attar, undergraduate students at Texas A&M.
Fixing biases
The results seemed to be all over the place.
In the public speaking recordings, for example, the Latino participants reported that they felt a lot more nervous on average than the white or Black speakers. The AI, however, failed to detect that heightened anxiety. In the second experiment, the algorithms also flagged equal numbers of men and women as being at risk of depression. In reality, the female speakers had experienced symptoms of depression at much higher rates.
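The kind of disparity described above only shows up when a screener's accuracy is broken out by group. A minimal sketch of that check (hypothetical function name, assuming binary at-risk labels and screener predictions):

```python
from collections import defaultdict

def per_group_recall(labels, preds, groups):
    """Of the people who are truly at risk (label == 1) in each
    demographic group, what fraction did the screener flag?"""
    hits = defaultdict(int)       # true positives per group
    positives = defaultdict(int)  # truly at-risk members per group
    for y, p, g in zip(labels, preds, groups):
        if y == 1:
            positives[g] += 1
            hits[g] += (p == 1)
    return {g: hits[g] / positives[g] for g in positives}
```

An aggregate accuracy number can look fine while one group's recall lags far behind another's, which is the pattern the study flags for female speakers.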
Chaspari noted that the team's results are just a first step. The researchers will need to analyze recordings of a lot more people from a wide range of demographic groups before they can understand why the AI fumbled in certain cases, and how to fix those biases.
But, she said, the study is a sign that AI developers should proceed with caution before bringing AI tools into the medical world:
"If we think that an algorithm actually underestimates depression for a specific group, this is something we need to inform clinicians about."