Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/37825
Title: Statistical Behavior Analysis of Smoothing Methods for Language Models of Mandarin Data Sets
Authors: Yu, M.S.
余明興 
Huang, F.L.
Tsai, P.Y.
Keywords: language models; smoothing methods; statistical behaviors; cross entropy; natural language processing; sparse data; probabilities
Project: Lecture Notes in Computer Science
Journal/Report no.: Lecture Notes in Computer Science, Volume 4182, Page(s) 172-186.
Abstract:
In this paper, we discuss the statistical behavior and entropies of three smoothing methods, two well-known and one proposed, applied to three language models on Mandarin data sets. Because of data sparseness, smoothing methods are employed to estimate the probability of each event (both seen and unseen) in a language model. We propose a set of properties for analyzing the statistical behavior of the three smoothing methods; our proposed smoothing method complies with all of these properties. We implement the three language models on Mandarin data sets and then discuss the resulting entropies. In general, the entropies of the proposed smoothing method for the three models are lower than those of the other two methods.
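To make the abstract's two central ideas concrete (giving unseen events a nonzero probability, then comparing models by entropy), here is a minimal sketch. It is not the paper's method: this record does not name the three smoothing methods, so add-one (Laplace) smoothing on a unigram model is used purely as a familiar stand-in. The function names and the toy character data are hypothetical.

```python
import math
from collections import Counter

def add_one_unigram(train_tokens, vocab):
    """Add-one (Laplace) smoothed unigram distribution over vocab.

    Assumes every training token is in vocab, so the result sums to 1.
    Illustrative stand-in only; not the smoothing method proposed in the paper.
    """
    counts = Counter(train_tokens)
    total = len(train_tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def cross_entropy(model, test_tokens):
    """Average negative log2 probability per test token, in bits."""
    return -sum(math.log2(model[w]) for w in test_tokens) / len(test_tokens)

# Toy data, made up for illustration: single characters as tokens.
train = list("我們我們他")
vocab = sorted(set(train) | {"你"})  # "你" never occurs in the training data
model = add_one_unigram(train, vocab)
test_entropy = cross_entropy(model, list("你我"))
```

Without smoothing, the unseen token would have probability zero and the cross entropy on any test text containing it would be infinite. A lower cross entropy on held-out text indicates a better-fitting model, which is the sense in which the abstract compares the three methods.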
URI: http://hdl.handle.net/11455/37825
ISSN: 0302-9743
Appears in Collections: Department of Computer Science and Engineering

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.