|Title:||Statistical Behavior analysis of smoothing methods for language models of mandarin data sets||Author:||Yu, M.S.
|Keywords:||language models;smoothing methods;statistical behaviors;cross entropy;natural language processing;sparse data;probabilities||Project:||Lecture Notes in Computer Science||Journal/Report no.:||Lecture Notes in Computer Science, Volume 4182, Page(s) 172-186.||Abstract:||
In this paper, we discuss the statistical behavior and entropies of three smoothing methods, two well-known and one proposed, applied to three language models on Mandarin data sets. Because of data sparseness, smoothing methods are employed to estimate the probability of each event (both seen and unseen) in a language model. We propose a set of properties for analyzing the statistical behavior of the three smoothing methods. Our proposed smoothing method complies with all of the properties. We implement the three language models on Mandarin data sets and then discuss their entropies. In general, the entropies of the proposed smoothing method for the three models are lower than those of the other two methods.
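As an illustration of the idea in the abstract (assigning nonzero probability to unseen events, then comparing models by entropy), the following is a minimal sketch using additive (add-delta) smoothing on a character unigram model. The abstract does not name the three smoothing methods, so additive smoothing is an assumption chosen for simplicity and is not the paper's proposed method. The function names `additive_unigram` and `cross_entropy`, the parameter `delta`, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def additive_unigram(train_tokens, vocab, delta=1.0):
    """Add-delta smoothed unigram probabilities over a fixed vocabulary.

    Assumes every training token is in `vocab`, so the result sums to 1.
    Illustrative only; not the method proposed in the paper.
    """
    counts = Counter(train_tokens)
    total = len(train_tokens) + delta * len(vocab)
    # Unseen events have count 0 but still receive delta / total.
    return {w: (counts[w] + delta) / total for w in vocab}


def cross_entropy(model, test_tokens):
    """Average negative log2 probability of the test tokens (bits per token)."""
    return -sum(math.log2(model[w]) for w in test_tokens) / len(test_tokens)


# Toy data (hypothetical): single Mandarin characters as events.
train = list("天天天地地人")
test = list("天地玄")  # "玄" never occurs in the training data
vocab = set(train) | set(test)

model = additive_unigram(train, vocab)
print(model["玄"])                 # nonzero despite being unseen
print(cross_entropy(model, test))  # lower is better when comparing methods
```

Without smoothing, the unseen character would get probability zero and the cross entropy on the test data would be infinite, which is why every event, seen or unseen, needs a positive estimate.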
|Appears in Collections:||Department of Computer Science and Engineering|