Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/17671
標題: 以視窗架構之二階層快速演算法估計改變點位置
A Fast Two-Stage Algorithm by Window Method for Estimating the Location of Change-Points
作者: 張國清
Chang, Kuo-Ching
關鍵字: Change-points;改變點;Exponential family;Dynamic programming;Segmental K-means;Bayesian information criteria;指數家族;動態程序;K均值分割;貝氏訊息準則
出版社: 應用數學系所
引用: [1] Akaike, H. (1973). Information theory and an extension of the maximum likeli-hood principle. 2nd International Symposium of Information Theory. 267-281. [2] Bellman, R. E. (1957). Dynamic Programming, Princeton University, Princeton. [3] Bhattacharyya, G. K. and Johnson, R. A. (1968). Non-parametric tests for shift at an unknown time point. Annals of Mathematical Statistics 39, 1731-1743. [4] Boukai, B. (1994). Graphically based interval estimation for the change-point. Computational Statistics & Data Analysis 17, 77-86. [5] Braun, J. V., Braun, R. K. and Muller, H. G. (2000). Multiple changepoint fitting via quasilikelihood, with application to DNA sequence segmentation. Biometrika 87, 301-314. [6] Braun, J. V. and Muller, H. G. (1998). Statistical Methods for DNA Sequence Segmentation. Statistical Science 13, 42-162 [7] Carlstein, E. (1988). Nonparametric change-point estimation. Annals of Statistics 16, 188-197. [8] Chen, J. and Gupta, A. K. (1997). Testing and locating variance change points with application to stock prices. Journal of the American Statistical Association 92, 739-747. [9] Chen, J. and Gupta, A. K. (1999). Change point analysis of a Gaussian model. Sta-tistical Papers 40, 323-333. [10] Chernoff, H. and Zacks, S. (1964). Estimating the current mean of a normal dis-tribution which is subject to changes in time. Annals of Mathematical Statistics 35, 999-1018. [11] Cobb, G. W. (1978). The problem of the Nile: Conditional solution to change-point problem. Biometrike 65, 243-251. [12] Fearnhead, P. (2006). Exact and efficient Bayesian inference for multiple change-point problems. Statistics and Computing 16, 203-213. [13] Fearnhead, P. and Clifford, P. (2003). On-line inference for hidden Markov mod-els via particle filters, Journal of the Royal Statistical Society, Series B 65, 887-899. [14] Fearnhead, P. and Liu, Z. (2007). On-line inference for multiple change points problem. Journal of the Royal Statistical Society, Series B 69, 589-602. [15] Forney, G.. D. (1973). The Viterbi algorithm. Proceedings of the IEEE 61, 268-278. [16] Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82, 711-732. [17] Guan, Z. (2004). A semiparametric changepoint model. Biometrika 91, 849-862. [18] Gupta, A. K. and Chen, J. (1996). Detecting changes of mean in multidimensional normal sequences with application to literature and geology. Computational Statis-tics 11, 211-221. [19] Hawkins, D. M. (1972). On the choice of segments in piecewise approximation. IMA Journal of Applied Mathematics 9, 250-256. [20] Hawkins, D. M. (2001). Fitting multiple change-point models to data. Computa-tional Statistics & Data Analysis 37, 323-341. [21] Hsu, D. A. (1977). Tests for Variance Shift at an Unknown Time Point. Journal of the Royal Statistical Society, Series C 26, 279-284. [22] Hsu, D. A. (1979). Detecting Shifts of Parameter in Gamma Sequences with Ap-plications to Stock Price and Air Traffic Flow Analysis. Journal of the American Statistical Association 74, 31-40. [23] Juang, B. H. and Rabiner, L. R. (1990). The segmental K-means algorithm for es-timating parameters of hidden Markov models. IEEE Transactions on Acoustics, Speech and Signal Processing 38, 1639-1641. [24] Killick, R., Fearnhead, P. and Eckley, I. A. (2011). Optimal detection of change-points with a linear computational cost. arXiv: 1101.1438. [25] Lee, C. B. (1996). Nonparametric multiple change-point estimators. Statistics & probability letters 27, 295-304. [26] Lee, C.B. (1997). Estimating the number of change point in exponential families distributions. Scandinavian Journal of Statistics 24, 201-210. [27] Lee, C. B. (1998). Bayesian analysis of a change-point in exponential families with application. Computational Statistics & Data Analysis 27, 195-208. [28] Lehmann, E. L. and Casella, G. (1998). Theory of point Estimation. Springer-Verlag, New York. [29] Lombard, F. and Carroll, R. J. (1992). Change point estimation via running cusums. Technical report No. 155, Statistics Department, Texas A&M University. [30] Muller, H. G. (1992). Change-points in nonparametric regression analysis. Annals of Statistics 20, 737-761. [31] O'Ruanaidh, J. J. K. and Fitzgerald, W. J. (1996). Numerical Bayesian methods applied to signal processing. Springer, New York. [32] Page, E. S. (1954). Continuous inspection schemes. Biometrika 41, 100-115. [33] Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics 6, 461-464. [34] Tateishi, S. and Konishi, S. (2010). Nonlinear regression modeling and detecting change points via the relevance vector machine. Computational Statistics Online First. DOI: 10.1007/s00180-010-0220-6. [35] Venter, J. and Steel, S. (1996). Finding multiple abrupt change-points. Computa-tional Statistics & Data Analysis 22, 481-504. [36] Viterbi, A. J. (1967). Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on, Information Theory 13, 260-267. [37] Wichern, D. W., Miller, R. B. and Hsu, D. A. (1976). Changes of variance in first order autoregressive time series models with an application. Applied Statistics 25, 248-356. [38] Worsley, K. J. (1986). Confidence regions and tests for a change-point in a se-quence of exponential family random variables. Biometrika 73, 91-104. [39] Wyse, J. and Friel, N. (2010). Simulation-based Bayesian analysis for multiple changepoints. arXiv: 1011.2932v1. [40] Wyse, J., Friel, N. and Rue, H. (2010). Approximate simulation free multiple changepoint analysis with Gaussian Markov random field segment. arXiv: 1011.5038v1. [41] Zangwill, W. I. (1969). Nonlinear Programming: A Unified Approach. Pren-tice-Hall, Englewood Cliffs, New Jersey. [42] Zou, C. L., Liu Y. K., Qin P. and Wang, Z. J. (2007). Empirical likelihood ratio test for the change-point problem. Statistics & Probability Letters 77, 374-382.
摘要: 
本論文提出快速二階層演算法於獨立指數家族隨機變數多改變點模型,而此演算法既簡單又有趣。本演算法分為二個階層,在第一階層為利用視窗方法結合概似比測度尋找可能改變點位置的集合,再利用動態程序演算法運用先前的集合估計改變點的位置;第二階層則利用K均值分割演算法來估算最後改變點的位置,而其初始值的設定則是運用第一階層所得到的結果。論文中並使用四個實際資料,分別為尼羅河資料、道瓊工業指數資料、飛機到達資料與測井資料,而本論文所提供之演算法於實際資料中獲得到不錯的結果。運用視窗法可將原始的資料去除平均約74.9%,且二階層演算法估計改變點位置的時間比動態程序演算法最快約24.13倍,如果所得樣本越大時,此視窗法可大大的降低計算量;而所估算出來改變點的位置也相同。而針對模擬資料,視窗法可將資料平均去除約75.2%,而在未知改變點個數與已知改變點個數二種不同的條件假設下,其二階層演算法的速度比動態程序演算法最快約24.75及21.12倍。

A fast two-stage (TS) algorithm is proposed to estimate the location of change-points in a sequence of independent exponential family random variables. The idea is simple and interesting. In the first stage, we apply the window method by using the log-likelihood ratio measure to find a subset of candidate change-points; and use the dynamic programming (DP) algorithm on the chosen subset to obtain good initial change-points which will be proximate to the locations of the true change-points. In the second stage, the segmental K-means (SKM) algorithm is applied on the initial change-points obtained in the first stage. The real examples obtain Nile data, Dow-Jones industrial data, aircraft arrival data and Well-log data and simulated data sets are investigated. The results show that our algorithm works very well in application. In DP algorithm, the location of each data point is a candidate change-point. Using the window method, we find that the subset of candidate change-points can remove about 74.9% (on average) of change-points from DP algorithm to TS algorithm in real examples. In the comparison of CPU times, the TS algorithm is 24.13 times faster than the DP algorithm. It substantially reduces the heavy computation load for a long sequence of data. For simulated data sets, the performance of DP and TS algorithm are almost same in the estimation errors of change-points, but the subset of candidate change-points can remove 75.2 (on average) of change-points. Thus, the speed of TS algorithm can be up to 24.75 and 21.12 times than that of DP algorithm with unknown state and known state, respectively.
URI: http://hdl.handle.net/11455/17671
其他識別: U0005-1506201117551400
Appears in Collections:應用數學系所

Show full item record
 

Google ScholarTM

Check


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.