Silent speech recognition from
continuous articulatory movements
In the demo below, the top panel plots the input (x, y, and z coordinates of sensors attached to the tongue and lips); the bottom panel shows the predicted sounds (timing in red) and the actual sounds (timing in blue). The algorithm performs segmentation (detection of the onsets and offsets of the sentences) and recognition simultaneously from the continuous tongue and lip movements. The participant is mouthing the three corner vowels /a/, /i/, and /u/ (without producing any voice); a computer behind him is actually producing the synthesized sounds.
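To make the "segmentation and recognition simultaneously" idea concrete, here is a minimal sketch of one common approach: detect movement onsets and offsets by thresholding articulator speed in the continuous sensor stream, then hand each detected segment to a classifier. All function names, thresholds, and the velocity-based detection rule are illustrative assumptions, not the method used in this work.

```python
import numpy as np

def segment_and_classify(coords, fs=100.0, vel_thresh=0.5, min_len=10,
                         classify=None):
    """Detect movement segments in a continuous articulatory stream and
    label each one.

    coords: (T, D) array of sensor coordinates (e.g. the x/y/z positions
    of tongue and lip sensors over time). Onsets/offsets are found where
    the summed articulator speed crosses `vel_thresh`; each segment is
    then passed to `classify`. Hypothetical sketch, not the paper's method.
    """
    # Frame-to-frame speed, pooled over all sensor dimensions.
    speed = np.linalg.norm(np.diff(coords, axis=0), axis=1) * fs
    moving = speed > vel_thresh
    # Rising/falling edges of the "moving" mask give onset/offset frames.
    edges = np.diff(moving.astype(int))
    onsets = np.where(edges == 1)[0] + 1
    offsets = np.where(edges == -1)[0] + 1
    if moving[0]:
        onsets = np.r_[0, onsets]
    if moving[-1]:
        offsets = np.r_[offsets, len(moving)]
    segments = []
    for a, b in zip(onsets, offsets):
        if b - a < min_len:          # drop spurious single-frame blips
            continue
        label = classify(coords[a:b + 1]) if classify else None
        segments.append((a, b, label))
    return segments
```

A usage example: a flat stream with one ramp of movement in frames 100-199 yields a single labeled segment spanning roughly those frames.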
Quantitative articulatory vowel space
The left panel shows the quantitative articulatory vowel space I derived from more than 1,500 vowel samples of tongue and lip movements collected from ten speakers; it resembles the long-standing descriptive articulatory vowel space (right panel). I'm now investigating the scientific and clinical applications of the quantitative articulatory vowel space.
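The page does not state how the space was derived, but a standard way to turn many high-dimensional sensor samples into a low-dimensional articulatory space is principal component analysis over the concatenated sensor coordinates. The sketch below is an illustrative stand-in under that assumption, not the analysis actually used.

```python
import numpy as np

def derive_articulatory_space(samples, n_dims=2):
    """Project articulatory samples into a low-dimensional space via PCA.

    samples: (N, D) matrix where each row is one vowel token, e.g. the
    concatenated (x, y, z) coordinates of the tongue and lip sensors at
    the vowel midpoint. Plain PCA here is an assumption for illustration.
    """
    X = samples - samples.mean(axis=0)          # center each coordinate
    # SVD of the centered data gives principal axes sorted by variance.
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    coords = X @ Vt[:n_dims].T                  # token positions in the space
    explained = (S ** 2) / np.sum(S ** 2)       # variance ratio per axis
    return coords, explained[:n_dims]
```

Plotting `coords` for tokens of /a/, /i/, and /u/ would then show whether the derived space reproduces the familiar descriptive vowel quadrilateral.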
Articulatory consonant space
Using the same approach, articulatory consonant spaces were derived from about 2,100 consonant samples of tongue and lip movements collected from ten speakers; see the figure below (2D on the left, 3D on the right). Both consonant spaces are consistent with the descriptive articulatory features that distinguish consonants (particularly place of articulation). Another interesting finding is that a third dimension is not necessary for the articulatory vowel space but is very useful for the consonant space. I'm now investigating the scientific and clinical applications of the articulatory consonant space as well.
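One way to make the "a third dimension helps for consonants but not for vowels" observation quantitative is to compare how much of the total variance the top two versus top three principal axes capture. The helper below is a hedged sketch of that comparison (plain PCA on a sample matrix), not the original analysis.

```python
import numpy as np

def explained_variance_ratio(samples, n_dims):
    """Fraction of total variance captured by the top `n_dims` principal
    axes of the centered sample matrix. Illustrative only: a simple way
    to ask whether a 2-D or a 3-D space suffices for a set of
    articulatory samples."""
    X = samples - samples.mean(axis=0)
    S = np.linalg.svd(X, compute_uv=False)      # singular values, descending
    return float(np.sum(S[:n_dims] ** 2) / np.sum(S ** 2))
```

On data whose variance spreads over three directions (the consonant-like case), two axes leave a large share of variance unexplained, while on data concentrated in two directions (the vowel-like case) they capture nearly all of it.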