© University of Glasgow

A new system capable of reading lips with remarkable accuracy even when speakers are wearing face masks could help create a new generation of hearing aids. An international team of engineers and computing scientists developed the technology, which pairs radio-frequency sensing with artificial intelligence for the first time to identify lip movements, writes the University of Glasgow in a press release.

The system, when integrated with conventional hearing aid technology, could help tackle the ‘cocktail party effect’, a common shortcoming of traditional hearing aids. Currently, hearing aids assist hearing-impaired people by amplifying all ambient sounds around them, which can be helpful in many aspects of everyday life.

However, in noisy situations such as cocktail parties, hearing aids’ broad spectrum of amplification can make it difficult for users to focus on specific sounds, like a conversation with a particular person. One potential solution to the cocktail party effect is to make ‘smart’ hearing aids, which combine conventional audio amplification with a second device to collect additional data for improved performance.

Why we write about this topic:

Around 5 percent of the world’s population suffers from some kind of hearing impairment. New technologies in this realm help foster equality and inclusiveness for all.

Gathering data

To develop the system, the researchers asked male and female volunteers to repeat the five vowel sounds (A, E, I, O, and U) first while unmasked and then while wearing a surgical mask. As the volunteers repeated the vowel sounds, their faces were scanned using radio-frequency signals from both a dedicated radar sensor and a Wi-Fi transmitter. Their faces were also scanned while their lips remained still.

Then, the 3,600 samples of data collected during the scans were used to ‘teach’ machine learning and deep learning algorithms how to recognize the characteristic lip and mouth movements associated with each vowel sound. Because the radio-frequency signals can easily pass through the volunteers’ masks, the algorithms could also learn to read masked users’ vowel formation.
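For readers curious what ‘teaching’ an algorithm on labeled samples looks like in practice, here is a toy sketch. It is not the researchers’ method: the press release does not describe their models or signal features, so the feature vectors below are synthetic, the function names are made up for illustration, and a simple nearest-centroid classifier stands in for their machine learning and deep learning algorithms. The six labels mirror the five vowels plus the still-lips scans.

```python
import random
from collections import defaultdict

# Five vowels plus the "lips still" case described in the article.
LABELS = ["A", "E", "I", "O", "U", "still"]


def synthetic_sample(label, rng, noise=0.1):
    """Made-up feature vector: a one-hot pattern for the label plus Gaussian noise.

    Real radar or Wi-Fi features would look nothing like this; it is a stand-in.
    """
    hot = LABELS.index(label)
    return [(1.0 if i == hot else 0.0) + rng.gauss(0.0, noise)
            for i in range(len(LABELS))]


def fit_centroids(samples):
    """'Training': average the feature vectors of each label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(predict(centroids, f) == y for f, y in samples)
    return correct / len(samples)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
training = [(synthetic_sample(l, rng), l) for l in LABELS for _ in range(50)]
held_out = [(synthetic_sample(l, rng), l) for l in LABELS for _ in range(20)]

centroids = fit_centroids(training)
print(f"held-out accuracy: {accuracy(centroids, held_out):.2f}")
```

The point of the sketch is the workflow rather than the numbers: labeled samples are split into a training set and a held-out set, the model is fitted on the first, and its success rate is measured on the second.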

High success rates

The system proved to be capable of correctly reading the volunteers’ lips most of the time. Wi-Fi data was correctly interpreted by the learning algorithms up to 95 percent of the time for unmasked lips, and 80 percent of the time for masked lips. Meanwhile, the radar data was interpreted correctly up to 91 percent of the time without a mask and 83 percent of the time with a mask.

Dr Qammer Abbasi, of the University of Glasgow’s James Watt School of Engineering, is the paper’s lead author. He said: “With this research, we have shown that radio-frequency signals can be used to accurately read vowel sounds on people’s lips, even when their mouths are covered. While the results of lip-reading with radar signals are slightly more accurate, the Wi-Fi signals also demonstrated impressive accuracy.”
