This paper presents a study toward an improved dynamic glottal model based on high-speed imaging (HSI). Speech production is commonly described in three parts: the speech source, speech resonance, and lip radiation. Of these, the speech source is the most important because it is the basis of speech. Acoustical models of the speech source are well established in speech production research, but the physiological speech source, that is, the activity of the glottis, has seldom been studied because vocal fold vibration is difficult to observe and sample. An earlier glottal model (Kong, 2007) represented the static glottis with four quarter-ellipses in three modes (normal, leakage, and open) and approximated the dynamic glottal control function as the product of a sine and an exponential function. The weakness of that dynamic model is that, although it can simulate the glottis, its control parameters cannot be readily interpreted. In this study, more high-speed images were sampled, the image processing was greatly improved, and the dynamic glottal control function was modeled with parameters that are significant for speech perception.
From: 陈雪飞
