Toward a real-time silent speech interface driven by ultrasound and video imaging



Ultraspeech 2 acquisition system

Hardware: Thermoformed helmet (collaboration with ESPCI ParisTech) + Terason T3000 ultrasound system + CMOS camera (infrared emitter/filter)

Ultraspeech 2 - Acquisition system

Software: Recording of the data streams and visual feature extraction (Discrete Cosine Transform) are performed with the Ultraspeech software (www.ultraspeech.com)
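To illustrate the feature extraction step, here is a minimal sketch of DCT-based visual features in Python with NumPy. It is not the Ultraspeech code: the function names, the `keep` parameter, and the choice of retaining the low-frequency top-left block of coefficients are assumptions made for the example.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row has its own scaling
    return m

def dct_features(frame: np.ndarray, keep: int = 8) -> np.ndarray:
    """2D DCT of a grayscale frame, reduced to a short feature vector.

    Assumption: the `keep` x `keep` lowest-frequency coefficients are
    retained; the selection used by Ultraspeech may differ.
    """
    rows, cols = frame.shape
    coeffs = dct_matrix(rows) @ frame.astype(float) @ dct_matrix(cols).T
    return coeffs[:keep, :keep].ravel()
```

Each ultrasound or video frame would be passed through `dct_features` to obtain one feature vector per frame, for example `dct_features(frame, keep=8)` for a 64-dimensional vector.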


UltraspeechMax v.1.0 - GMM-based approach

Main Max/MSP patch for (1) contextualization of the visual feature vectors, (2) real-time GMM-based articulatory-to-acoustic mapping (no external linguistic information), and (3) speech waveform generation using our real-time implementation of the MLSA vocoder (excited with white noise).
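Steps (1) and (2) can be sketched in Python with NumPy as follows. This is an illustration only, not the Max/MSP patch: it assumes that contextualization means stacking each frame with its neighbours, and that the mapping is the usual frame-by-frame conditional expectation under a joint GMM over visual and acoustic features. The names `add_context` and `gmm_map`, the `width` parameter, and the GMM parameter layout are hypothetical. The vocoder step (3) is not covered.

```python
import numpy as np

def add_context(feats: np.ndarray, width: int = 2) -> np.ndarray:
    """Stack each frame with its `width` left and right neighbours.

    feats has shape (T, D); the result has shape (T, (2*width+1)*D).
    Edge frames are repeated so that every frame has a full context.
    """
    n = len(feats)
    padded = np.pad(feats, ((width, width), (0, 0)), mode="edge")
    return np.hstack([padded[i:i + n] for i in range(2 * width + 1)])

def gmm_map(x, weights, means, covs):
    """Estimate acoustic features y from one visual vector x as E[y | x].

    Assumed layout of the joint GMM over z = [x, y]:
    weights (M,), means (M, dx+dy), covs (M, dx+dy, dx+dy).
    """
    dx = x.shape[0]
    n_comp = len(weights)
    log_post = np.empty(n_comp)
    cond_means = np.empty((n_comp, means.shape[1] - dx))
    for m, (w, mu, cov) in enumerate(zip(weights, means, covs)):
        diff = x - mu[:dx]
        sxx = cov[:dx, :dx]
        solved = np.linalg.solve(sxx, diff)          # Sxx^-1 (x - mu_x)
        _, logdet = np.linalg.slogdet(sxx)
        # Log of w * N(x; mu_x, Sxx), up to a constant shared by all components
        log_post[m] = np.log(w) - 0.5 * (diff @ solved + logdet)
        # Per-component conditional mean: mu_y + Syx Sxx^-1 (x - mu_x)
        cond_means[m] = mu[dx:] + cov[dx:, :dx] @ solved
    post = np.exp(log_post - log_post.max())         # stable normalization
    post /= post.sum()
    return post @ cond_means
```

In a real-time setting, `gmm_map` would be called once per incoming contextualized frame, and its output (for example spectral coefficients) would be fed to the vocoder.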

Ultraspeech 2 - GMM-based approach - Max/MSP main patch


Some (very!) preliminary results using the GMM-based approach (the relatively low quality of the results is mainly due to the vocoder; we're working on it!)


Click here to watch the video


What's next? Ultraspeech Max v1.1 - HMM-based approach (coming soon!)