Statistical Speech Classification on Mandarin Monosyllables
|Keywords:||Speech Classification; Speech Recognition (語音辨識)||Publisher:||Department of Applied Mathematics (應用數學系所)||Abstract:||
With the rapid development of computers, they have become ever more closely tied to daily life, and speech recognition is increasingly important for building a high-quality, human-centered living environment. This thesis studies speaker-independent speech recognition. Each Mandarin syllable is represented by a sequence of linear predictive coding (LPC) cepstra vectors, which is then compressed into a feature matrix. Finally, a simplified Bayes decision rule is used to classify the Mandarin syllables. Both feature extraction and classification are computationally fast and accurate.
Key Words: speaker-independent speech recognition, linear predictive coding, cepstra, Bayes decision rule
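The pipeline summarized in the abstract (LPC cepstra per frame, compression into a fixed-size feature matrix, then a simplified Bayes decision rule) can be illustrated with a minimal sketch. The abstract does not give the details, so everything specific here is an assumption made for illustration: the frame length, hop, and LPC order, the compression by averaging cepstra over equal time segments, and the reading of "simplified Bayes" as equal priors with independent Gaussian features. The names `lpc`, `lpc_cepstrum`, `feature_matrix`, `SimplifiedBayes`, and `synth` are hypothetical, and the "syllables" are synthetic noise signals, not speech.

```python
import numpy as np

def lpc(frame, order):
    """Predictor coefficients a_1..a_p (autocorrelation method, Levinson-Durbin)."""
    n = len(frame)
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)              # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 1e-10:                 # silent or perfectly predictable frame
            break
        k = (r[i] - np.dot(a[1:i], r[i - 1:0:-1])) / err
        prev = a.copy()
        a[i] = k
        a[1:i] = prev[1:i] - k * prev[i - 1:0:-1]
        err *= 1.0 - k * k
    return a[1:]

def lpc_cepstrum(a):
    """Cepstra c_1..c_p from LPC coefficients via the standard recursion."""
    p = len(a)
    c = np.zeros(p)
    for n in range(1, p + 1):
        acc = a[n - 1]
        for k in range(1, n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c[n - 1] = acc
    return c

def feature_matrix(signal, order=8, frame_len=256, hop=128, segments=4):
    """Assumed compression: average frame cepstra inside equal time segments."""
    window = np.hamming(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    ceps = np.array([lpc_cepstrum(lpc(signal[s:s + frame_len] * window, order))
                     for s in starts])
    return np.array([part.mean(axis=0) for part in np.array_split(ceps, segments)])

class SimplifiedBayes:
    """Assumed simplification: equal priors, independent Gaussian features."""
    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        self.classes_ = np.unique(y)
        self.mean_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.var_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-6
        return self

    def predict(self, X):
        X = np.asarray(X)
        diff = X[:, None] - self.mean_[None]          # (samples, classes, seg, order)
        cost = 0.5 * np.log(self.var_)[None] + 0.5 * diff ** 2 / self.var_[None]
        cost = cost.reshape(len(X), len(self.classes_), -1).sum(axis=2)
        return self.classes_[cost.argmin(axis=1)]     # smallest negative log-likelihood

def synth(theta, rng, n=2048):
    """Hypothetical stand-in for a syllable: noise with one spectral peak at theta."""
    a1, a2 = 2 * 0.9 * np.cos(theta), -0.81
    s = rng.standard_normal(n)
    for t in range(2, n):
        s[t] += a1 * s[t - 1] + a2 * s[t - 2]
    return s

# Synthetic demo: two classes that differ only in the position of the spectral peak.
rng = np.random.default_rng(0)
train = [(feature_matrix(synth(th, rng)), label)
         for label, th in enumerate((0.3, 1.5)) for _ in range(10)]
model = SimplifiedBayes().fit([m for m, _ in train], [lab for _, lab in train])
print(model.predict([feature_matrix(synth(0.3, rng)), feature_matrix(synth(1.5, rng))]))
```

Averaging over a fixed number of segments gives every utterance a feature matrix of the same shape regardless of its duration, which is what lets a per-entry Gaussian model be applied without time alignment. Real Mandarin syllables would additionally need endpoint detection and tone handling, which this sketch does not attempt.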
|Appears in Collections:||Department of Applied Mathematics (應用數學系所)|