Title: 最佳基因個數的評估
Evaluation of the top gene number
Author: 李佳瑾
Li, Chia-Chin
Keywords: Gene expression profiles; Prediction; Gene ranking; Dimension reduction; Proportional hazards model
Publisher: Department of Applied Mathematics
References:
Bair E., and Tibshirani R. (2004), “Semi-supervised methods to predict patient survival from gene expression data,” PLoS Biology, 2: 511-522.
Bair E., Hastie T., Paul D., and Tibshirani R. (2006), “Prediction by supervised principal components,” Journal of the American Statistical Association, 101: 119-137.
Bhattacharjee A., Richards W.G., Staunton J., Li C., Monti S., Vasa P., Ladd C., Beheshti J., Bueno R., Gillette M., Loda M., Weber G., Mark E.J., Lander E.S., Wong W., Johnson B.E., Golub T.R., Sugarbaker D.J., and Meyerson M. (2001), “Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses,” PNAS, 98: 13790-13795.
Boulesteix A.L., and Strimmer K. (2007), “Partial least squares: a versatile tool for the analysis of high-dimensional genomic data,” Briefings in Bioinformatics, 8: 32-44.
Bøvelstad H.M., Nygård S., Størvold H.L., Aldrin M., Borgan Ø., Frigessi A., and Lingjærde O.C. (2007), “Predicting survival from microarray data - a comparative study,” Bioinformatics, 23: 2080-2087.
Chen D.T., Schell M.J., Chen J.J., Fulp W.J., Eschrich S., and Yeatman T. (2008), “A predictive risk probability approach for microarray data with survival as the endpoint,” Journal of Biopharmaceutical Statistics, 18: 841-852.
Cox D.R. (1972), “Regression models and life tables (with discussion),” Journal of the Royal Statistical Society, Series B (Methodological), 34: 187-220.
Efron B., Hastie T., Johnstone I., and Tibshirani R. (2004), “Least angle regression,” Annals of Statistics, 32: 407-499.
Gui J., and Li H. (2005a), “Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data,” Bioinformatics, 21: 3001-3008.
Gui J., and Li H. (2005b), “Threshold gradient descent method for censored data regression with applications in pharmacogenomics,” Pacific Symposium on Biocomputing, 10: 272-283.
Nguyen D.V., and Rocke D.M. (2002), “Partial least squares proportional hazard regression for application to DNA microarray survival data,” Bioinformatics, 18: 1625-1632.
Nguyen D.V. (2005), “Partial least squares dimension reduction for microarray gene expression data with a censored response,” Mathematical Biosciences, 193: 119-137.
Park P.J., Tian L., and Kohane I.S. (2002), “Linking gene expression data with patient survival times using partial least squares,” Bioinformatics, 18: S120-S127.
Sha N., Tadesse M.G., and Vannucci M. (2006), “Bayesian variable selection for the analysis of microarray data with censored outcomes,” Bioinformatics, 22: 2262-2268.
Tan Q., Thomassen M., Jochumsen K.M., Mogensen O., Christensen K., and Kruse T.A. (2008), “Gene selection for predicting survival outcomes of cancer patients in microarray studies,” in Tarek Sobh (ed.), Advances in Computer and Information Sciences and Engineering, Springer Netherlands, 405-409.
van't Veer L.J., Dai H., van de Vijver M.J., He Y.D., Hart A.A., Mao M., Peterse H.L., van der Kooy K., Marton M.J., Witteveen A.T., Schreiber G.J., Kerkhoven R.M., Roberts C., Linsley P.S., Bernards R., and Friend S.H. (2002), “Gene expression profiling predicts clinical outcome of breast cancer,” Nature, 415: 530-536.
Verweij P.J.M., and van Houwelingen H.C. (1993), “Cross-validation in survival analysis,” Statistics in Medicine, 12: 2305-2314.
Wu T., Sun W., Yuan S., Chen C.H., and Li K.C. (2008), “A method for analyzing censored survival phenotype with gene expression data,” BMC Bioinformatics, 9: 417.

An important application of microarray gene expression data is predicting patients' clinical outcomes. Accurate selection of significant genes is a crucial step in building a prediction model that performs well. In this study, we rank the genes of lung cancer patients by two statistics separately, the p-value and the Cox score, and then choose the optimal number of top genes by examining how the number of top-ranked genes affects prediction when principal component analysis, supervised principal components, and partial least squares are each combined with the Cox proportional hazards model. Finally, we use the selected significant genes to rebuild a predictive model for each method and compare the results with the methods in the references. We assess predictive performance with three different evaluation criteria. The results show that our predictive methods, applied after the gene selection procedure, achieve better predictive performance.
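The ranking-then-reduction idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the data are synthetic, the function names (`cox_scores`, `top_genes`, `first_pc`) are hypothetical, and the Cox score is taken here to be the univariate score test statistic at zero effect, which is an assumption about the definition used. The sketch ranks genes by absolute Cox score, keeps the top k, and computes the first principal component of the selected genes, which could then enter a Cox proportional hazards model as a single covariate.

```python
import numpy as np

def cox_scores(X, time, event):
    """Univariate Cox score statistic (score test at beta = 0) per gene.

    X: (n_samples, n_genes) expression matrix
    time: (n_samples,) follow-up times
    event: (n_samples,) True/1 if the event was observed, False/0 if censored
    """
    X = np.asarray(X, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event).astype(bool)
    U = np.zeros(X.shape[1])  # score numerator for each gene
    V = np.zeros(X.shape[1])  # variance of the score for each gene
    for i in np.flatnonzero(event):
        at_risk = time >= time[i]  # risk set at the i-th event time
        mean = X[at_risk].mean(axis=0)
        U += X[i] - mean
        V += (X[at_risk] ** 2).mean(axis=0) - mean ** 2
    return U / np.sqrt(np.maximum(V, 1e-12))

def top_genes(scores, k):
    """Indices of the k genes with the largest absolute score."""
    return np.argsort(-np.abs(scores))[:k]

def first_pc(X_sub):
    """Sample scores on the first principal component of X_sub."""
    centered = X_sub - X_sub.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[0]

# Synthetic data (assumption): only gene 0 affects the hazard.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
t_event = rng.exponential(scale=np.exp(-2.0 * X[:, 0]))  # hazard = exp(2 * x0)
t_cens = rng.exponential(scale=2.0, size=n)              # independent censoring
time = np.minimum(t_event, t_cens)
event = t_event <= t_cens

scores = cox_scores(X, time, event)
selected = top_genes(scores, k=5)
pc1 = first_pc(X[:, selected])
print("selected genes:", selected)
```

To explore the effect of the number of top genes, one would repeat the last three steps over a grid of k values and compare a cross-validated prediction criterion for each k; the choice of criterion is left open here.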
Other identifiers: U0005-2307200917254000
Appears in Collections: Department of Applied Mathematics
