F0 scale factor, FF scale factor
Click the button in the middle to hear the unshifted original voice.
The other buttons represent frequency-shifted versions of this voice,
ordered consecutively in rows. The voice in the upper left corner of
the grid simulates a large vocal tract with a low fundamental; the
voice in the lower right corner simulates a small vocal tract with a
high fundamental. Position the mouse over each button to see the
fundamental frequency (F0) shift factor and the spectrum envelope (or
formant frequency, FF) shift factor, which scales the length of the
simulated talker's vocal tract. The combinations of F0 and FF scale
factors were selected based on a linear regression of these properties
measured in natural vowels. Analysis-resynthesis was performed using
the STRAIGHT vocoder (Kawahara, 1997; Kawahara et al., 1999).
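
To make the two scale factors concrete, the following minimal Python sketch
applies an F0 factor and an FF factor to a voice described only by its
fundamental and a few formant frequencies. It is an illustration, not the
STRAIGHT procedure: the function name, the example values (a 120 Hz
fundamental and formants at 500, 1500 and 2500 Hz) and the uniform-tube
assumption that vocal tract length varies inversely with the FF factor are
all assumptions introduced here.

```python
def shift_voice(f0_hz, formants_hz, f0_scale, ff_scale):
    """Apply F0 and FF scale factors to a simple voice description.

    Hypothetical helper for illustration only. It assumes a uniform-tube
    vocal tract, so scaling every formant by ff_scale corresponds to
    scaling the tract length by 1 / ff_scale.
    """
    if f0_scale <= 0 or ff_scale <= 0:
        raise ValueError("scale factors must be positive")
    shifted_f0 = f0_hz * f0_scale                      # new fundamental (Hz)
    shifted_formants = [f * ff_scale for f in formants_hz]  # new formants (Hz)
    relative_tract_length = 1.0 / ff_scale             # relative to original
    return shifted_f0, shifted_formants, relative_tract_length


# Illustrative values, not taken from the demo.
original_f0 = 120.0
original_formants = [500.0, 1500.0, 2500.0]

# A higher fundamental and a shorter simulated vocal tract.
f0, formants, tract = shift_voice(original_f0, original_formants,
                                  f0_scale=2.0, ff_scale=1.25)
print(f0, formants, tract)
```

Under these assumptions, factors above 1 move toward the lower right corner of
the grid (higher fundamental, shorter tract), and factors below 1 move toward
the upper left corner.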
Kawahara, H. (1997). Speech representation and transformation using adaptive interpolation of weighted spectrum: Vocoder revisited. Proceedings of the ICASSP, pp. 1303-1306.
Kawahara, H., Masuda-Katsuse, I., and de Cheveigné, A. (1999). Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction. Speech Communication 27, 187-207.