Lip-reading is a very difficult skill to master, and human lip-reading is often unreliable even when it is performed by trained lip-readers. “We all lip read, for example in noisy situations like a bar or party, but even the performance of expert lip readers can be very poor. It appears that the best lip-readers are the ones who learned to speak a language before they lost their hearing and who have been taught lip-reading intensively. It is a very desirable skill,” said Dr. Richard Harvey, a professor at UEA’s School of Computing Sciences and a leading researcher on the project.
UEA researchers are working in collaboration with experts at the Centre for Vision, Speech and Signal Processing at the University of Surrey, who have already developed advanced face and lip motion tracking systems. Both teams are currently collecting data, such as videos, for an analysis designed to determine as accurately as possible which lip movements and facial expressions are associated with specific letter combinations. The scientists say that very little is known about exactly what kind of visual information is required for effective lip-reading, so one of their main challenges is building a database of photographs and videos that is accurate and large enough to form a reliable basis for the lip-reading software. Their ultimate goal is to build machines that can automatically convert lip-motion videos into text. The researchers also plan to extend the silent speech-recognition system to additional languages.
“To be effective the systems must accurately track the head over a variety of poses, extract numbers or features that describe the lips, and then learn what features correspond to what text,” said Dr. Harvey. “To tackle the problem we will need to use information collected from audio speech, so this project will also investigate how to use the extensive information known about audio speech to recognise visual speech. The work will be highly experimental. We hope to produce a system that will demonstrate the ability to lip-read in more general situations than we have done so far.”
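For readers who want a feel for the last two steps Dr. Harvey describes (extracting numbers that describe the lips, then learning which numbers go with which label), here is a minimal toy sketch in Python. It is not the researchers' method: it assumes head tracking has already produced four lip landmark points per video frame, and it swaps in a very simple nearest-centroid classifier for whatever learning technique the project actually uses. The function names, the landmark keys and the "open"/"closed" labels are all hypothetical.

```python
from math import dist
from statistics import mean


def lip_features(landmarks):
    """Turn four (x, y) lip landmarks into numbers describing the mouth shape.

    `landmarks` is a dict with the hypothetical keys
    'left', 'right', 'top' and 'bottom'.
    """
    width = dist(landmarks["left"], landmarks["right"])
    height = dist(landmarks["top"], landmarks["bottom"])
    # Dividing by the width makes the number independent of how
    # large the face appears in the frame.
    return (height / width,)


def train(examples):
    """Average the feature vectors of each label into one centroid per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*vectors))
        for label, vectors in by_label.items()
    }


def classify(centroids, features):
    """Return the label whose centroid lies closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Toy training data: mouth openness values with made-up labels.
centroids = train([
    ((1.0,), "open"),
    ((0.8,), "open"),
    ((0.1,), "closed"),
    ((0.2,), "closed"),
])

# A new frame with a fairly open mouth (invented coordinates).
frame = {"left": (0, 0), "right": (4, 0), "top": (2, 1.5), "bottom": (2, -1.5)}
print(classify(centroids, lip_features(frame)))  # prints: open
```

A real system would need far richer features and would have to map sequences of mouth shapes to text rather than single frames to two labels, which is where the database of videos described above comes in.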
Researchers say that, apart from being extremely helpful to people with hearing impairments, such a system could be used to silently dictate commands to electronic devices equipped with a simple camera, such as mobile phones, microwaves or even a car’s dashboard. The UK’s Home Office Scientific Development Branch, an institution dedicated to exploring technologies with potential applications in the fight against crime, is currently investigating the feasibility of using lip-reading software as an additional tool for gathering information about criminals or for collecting evidence.
TFOT previously covered MIT’s Lecture Browser software, which allows users to search video content using keywords. You can also read about NASA’s sub-vocal speech-recognition research, which aims to enable silent communication and speech augmentation in extremely noisy environments.
More information can be found on the UEA website (PDF).