Updated in August 2014





Research & Demos

 
  • DJ and his friend: A demo of conversation using the real-time silent speech interface
      This demo shows the silent speech interface being used in an everyday conversation. DJ (the user) communicates with his friend (off screen) by mouthing his words, i.e., without producing any voice; the silent speech interface displays the recognized text on the screen and produces synthesized speech (a female voice) (Wang et al., SLPAT 2014).
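
      For a concrete picture of the pipeline this demo illustrates, here is a minimal sketch in Python. It is not the actual system: recognize_words, display, and speak below are hypothetical stubs that only mark where the real components (articulatory recognition, on-screen text, and speech synthesis) would plug in.

from typing import Iterable, List

Frame = List[float]  # one sample of sensor coordinates (e.g., x/y/z of tongue and lip sensors)

def recognize_words(frames: List[Frame]) -> str:
    """Stub recognizer: the real system maps a buffered stretch of tongue/lip movement to text."""
    return "hello, how are you?"  # placeholder output

def display(text: str) -> None:
    """Stub for the on-screen text shown to the conversation partner."""
    print(f"[screen] {text}")

def speak(text: str) -> None:
    """Stub for the speech synthesizer (the demo uses a female synthetic voice)."""
    print(f"[synthesized voice] {text}")

def silent_speech_interface(utterances: Iterable[List[Frame]]) -> None:
    """Run the recognize -> display -> speak loop over buffered utterances."""
    for frames in utterances:
        text = recognize_words(frames)
        display(text)
        speak(text)

# Toy run: one fake utterance of 100 frames, 9 coordinates each (3 sensors x 3 axes assumed)
silent_speech_interface([[[0.0] * 9 for _ in range(100)]])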

  • Demo of algorithm for word recognition from continuous articulatory movements
      In the demo below, the top panel plots the input (x, y, and z coordinates of sensors attached to the tongue and lips); the bottom panel shows the predicted sounds (timing in red) and the actual sounds (timing in blue). The algorithm performs segmentation (detecting the onsets and offsets of words) and recognition simultaneously from the continuous tongue and lip movements (Wang et al., Interspeech 2012; SLPAT 2013).
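
      As a concrete illustration of simultaneous segmentation and recognition, here is a minimal sketch; it is not the published algorithm. It detects candidate word boundaries in the continuous sensor stream by thresholding frame-to-frame movement energy, then labels each detected segment with its nearest stored word template under dynamic time warping (DTW). The sensor dimensionality, thresholds, and toy data are illustrative assumptions.

import numpy as np

def detect_segments(stream, energy_thresh=0.05, min_len=10):
    """Return (onset, offset) index pairs where articulatory movement exceeds a threshold."""
    # movement energy = frame-to-frame displacement over all sensor coordinates
    energy = np.linalg.norm(np.diff(stream, axis=0), axis=1)
    active = energy > energy_thresh
    segments, start = [], None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i
        elif not is_active and start is not None:
            if i - start >= min_len:
                segments.append((start, i))
            start = None
    if start is not None and len(active) - start >= min_len:
        segments.append((start, len(active)))
    return segments

def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two (time, features) arrays."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def recognize(stream, templates):
    """Segment the continuous stream and label each segment with its nearest word template."""
    results = []
    for onset, offset in detect_segments(stream):
        segment = stream[onset:offset]
        word = min(templates, key=lambda w: dtw_distance(segment, templates[w]))
        results.append((onset, offset, word))
    return results

# Toy usage with synthetic 9-D articulatory frames (3 sensors x 3 coordinates)
rng = np.random.default_rng(0)
templates = {"hello": rng.normal(size=(40, 9)), "yes": rng.normal(size=(30, 9))}
stream = np.concatenate([np.zeros((50, 9)), templates["yes"], np.zeros((50, 9))])
print(recognize(stream, templates))  # expected: one segment labeled "yes"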

  • Demo of algorithm for sentence recognition from continuous articulatory movements
      In the demo below, the top panel plots the input (x, y, and z coordinates of sensors attached to the tongue and lips); the bottom panel shows the predicted sounds (timing in red) and the actual sounds (timing in blue). The algorithm performs segmentation (detecting the onsets and offsets of sentences) and recognition simultaneously from the continuous tongue and lip movements (Wang et al., ICASSP 2012).

  • Articulation-to-speech synthesis
     The participant is mouthing the three corner vowels /a/, /i/, and /u/ (without producing any voice); the sounds you hear are actually synthesized by a computer behind him.
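
     As a rough illustration of the idea rather than the system in the demo, the sketch below synthesizes the three corner vowels with a crude source-filter model: an impulse-train glottal source filtered through two resonances placed at textbook-approximate formant frequencies. A real articulation-to-speech system would map the measured tongue and lip positions to such acoustic targets; the formant values, pitch, and filter design here are assumptions for illustration.

import numpy as np
from scipy.io import wavfile
from scipy.signal import lfilter

FS = 16000  # sampling rate in Hz

# Approximate (F1, F2) formant targets in Hz for the three corner vowels (textbook values).
CORNER_VOWELS = {"a": (750, 1200), "i": (300, 2300), "u": (320, 800)}

def resonator(signal, freq, bandwidth=80, fs=FS):
    """Pass the signal through a single two-pole resonance centered at `freq` Hz."""
    r = np.exp(-np.pi * bandwidth / fs)
    theta = 2 * np.pi * freq / fs
    return lfilter([1 - r], [1, -2 * r * np.cos(theta), r ** 2], signal)

def synthesize_vowel(f1, f2, dur=0.4, f0=120, fs=FS):
    """Crude source-filter synthesis: impulse-train source shaped by two formant resonances."""
    source = np.zeros(int(dur * fs))
    source[:: int(fs / f0)] = 1.0  # glottal pulses at the voice pitch f0
    out = resonator(resonator(source, f1), f2)
    return 0.9 * out / np.max(np.abs(out))  # normalize to avoid clipping

audio = np.concatenate([synthesize_vowel(*CORNER_VOWELS[v]) for v in ("a", "i", "u")])
wavfile.write("corner_vowels.wav", FS, (audio * 32767).astype(np.int16))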

     The left panel shows the quantitative articulatory vowel space I derived from more than 1,500 vowel samples of tongue and lip movements collected from ten speakers; it resembles the long-standing descriptive articulatory vowel space (right panel). I'm now investigating the scientific and clinical applications of the quantitative articulatory vowel space (Wang et al., Interspeech 2011; JSLHR 2013).
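
     One simple way to picture how such a quantitative space can be derived is to project per-token articulatory features down to two dimensions, for example with PCA. The sketch below does exactly that on synthetic stand-in data; the analysis behind the figure above uses its own procedure and real sensor recordings, so treat this only as an illustration of the general approach.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)

# Synthetic stand-in for per-token articulatory features:
# each row is one vowel token, columns are x/y/z coordinates of tongue and lip sensors.
vowel_labels = np.repeat(["a", "i", "u"], 50)
centers = {"a": [0.0, -1.0, 0.0, 0.0, 0.0, 0.0],
           "i": [1.0, 1.0, 0.0, 0.2, 0.0, 0.0],
           "u": [-1.0, 1.0, 0.0, -0.2, 0.0, 0.0]}
samples = np.vstack([rng.normal(centers[v], 0.2) for v in vowel_labels])

# Project all tokens into a 2-D "articulatory vowel space".
space = PCA(n_components=2).fit_transform(samples)

for v in ("a", "i", "u"):
    centroid = space[vowel_labels == v].mean(axis=0)
    print(f"/{v}/ centroid in the 2-D space: {np.round(centroid, 2)}")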
 
     Using the same approach, articulatory consonant spaces were derived from about 2,100 consonant samples of tongue and lip movements collected from ten speakers; see the figure below (2D on the left, 3D on the right). Both consonant spaces are consistent with the descriptive articulatory features that distinguish consonants (particularly place of articulation). Another interesting finding is that a third dimension is not necessary for the articulatory vowel space but is very useful for the consonant space. I'm now investigating the scientific and clinical applications of the articulatory consonant space as well (Wang et al., JSLHR 2013).
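
     The dimensionality observation can be probed with an equally simple diagnostic: check how much variance each principal component of the articulatory data explains. The snippet below shows only the mechanics; vowel_samples and consonant_samples are synthetic placeholders, not the real recordings, so the printed numbers illustrate the method rather than the finding.

import numpy as np
from sklearn.decomposition import PCA

def dimension_report(samples, name):
    """Print how much variance the first three principal components explain."""
    ratios = PCA(n_components=3).fit(samples).explained_variance_ratio_
    print(f"{name}: variance explained by PC1-PC3 = {np.round(ratios, 3)}")

# Placeholder matrices (rows = tokens, columns = sensor coordinates), synthetic for runnability.
rng = np.random.default_rng(2)
vowel_samples = rng.normal(size=(1500, 9)) * [3, 2, 0.2, 1, 0.8, 0.2, 0.6, 0.5, 0.2]
consonant_samples = rng.normal(size=(2100, 9)) * [3, 2, 1.6, 1, 0.8, 0.6, 0.6, 0.5, 0.4]

dimension_report(vowel_samples, "vowels")
dimension_report(consonant_samples, "consonants")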

     I'm part of a study on the comprehensive assessment of bulbar dysfunction in ALS (amyotrophic lateral sclerosis). My own work focuses on the articulatory subsystem of the ALS bulbar system (Green et al., ALSFD 2013).

  • Optispeech: A real-time, 3D tongue motion feedback system for speech training
        This is a collaborative project at the Communication Technology Center. The goal is to develop a system that provides real-time feedback of tongue motion during speech for training and therapy purposes. In the demo below, the user is pronouncing some basic English sounds while watching her tongue motion in real time (Katz et al., Interspeech 2014).
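
        To give a feel for what a real-time tongue-motion feedback loop involves, here is a minimal sketch, not the Optispeech implementation: it polls sensor positions at a fixed frame rate and redraws them in a 3-D view. read_sensor_frame is a hypothetical stand-in for the actual hardware interface and simply generates a synthetic moving tongue contour.

import numpy as np
import matplotlib.pyplot as plt

def read_sensor_frame(t, n_sensors=4):
    """Placeholder for a hardware read: returns (n_sensors, 3) tongue-sensor coordinates in cm."""
    x = np.linspace(0.0, 3.0, n_sensors)                # front-to-back sensor placement
    y = np.zeros(n_sensors)                             # midsagittal line
    z = 0.5 * np.sin(2 * np.pi * (0.5 * t + x / 6.0))   # synthetic up/down tongue motion
    return np.column_stack([x, y, z])

plt.ion()                                               # interactive mode for live updates
fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
line = ax.plot([], [], [], "o-")[0]
ax.set_xlim(0, 3)
ax.set_ylim(-1, 1)
ax.set_zlim(-1, 1)
ax.set_xlabel("front-back (cm)")
ax.set_ylabel("left-right (cm)")
ax.set_zlabel("up-down (cm)")

for frame in range(200):                                # roughly 10 seconds at 20 fps
    coords = read_sensor_frame(frame / 20.0)
    line.set_data(coords[:, 0], coords[:, 1])           # update x/y of the plotted sensors...
    line.set_3d_properties(coords[:, 2])                # ...and their z (tongue height)
    plt.pause(0.05)                                     # redraw and wait for the next frame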

More demos are coming.
 
 
Return to the welcome page.