Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/60832
Title: 預測單胺基酸突變對蛋白質穩定性的影響之研究-使用資料融合策略
A study of influence of single amino acid mutation on protein stability: using data fusion strategy
Author: 林傑倫
Lin, Jerome
Keywords: Protein stability; Machine learning; Support Vector Machine; Predictor
Publisher: Institute of Genomics and Bioinformatics
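Abstract:
When a single amino acid in a protein sequence is changed to another by mutation, the mutation can substantially alter the protein's structure and function and lead to the loss of its normal function. By studying existing data on how mutations change protein stability, researchers who understand the mechanisms by which these mutations affect structural stability can help biochemists and protein designers modify protein stability, giving existing proteins higher thermostability or a longer working life, and even attempt to design stable novel proteins. Many prediction tools are now available online that let users predict how a protein's stability changes after a mutation at a specific position; however, no prediction software achieves perfect predictive performance. When the results of different tools conflict, users cannot tell which one to trust. In such cases, an integrated prediction site can analyze the strengths and weaknesses of different sites on different proteins and offer users an objective third-party prediction, judging on the user's behalf which prediction is likely to be correct. By integrating seven prediction methods from five prediction websites together with sequence information of the mutated protein, we built an integrated predictor named iStable. In support vector machine training and cross-validation, this predictor clearly outperformed all of the existing predictors used to build it, and its performance was especially strong on stabilizing mutations. We also compared iStable with the other predictors for mutation sites in different structural environments and showed that iStable predicts stability changes better across different secondary structures and relative solvent accessibilities. In addition, we separated out the three most abundant protein types in the training data and compared the predictors on each of them; iStable again showed the best predictive ability.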

Mutation of a single amino acid residue can change a protein and may lead to a loss of protein function. Predicting protein stability changes provides candidate mutations for novel protein design. Although many prediction tools are available, conflicting results from different tools can confuse users. To solve this problem, we propose an integrated predictor, iStable, constructed from sequence information and the prediction results of different element predictors. In its learning model, iStable uses a support vector machine as the integrator rather than simply taking the majority answer given by the element predictors. Furthermore, the role played by sequence information in the model was analyzed, and a window size of 11 residues was chosen. iStable offers two prediction strategies: structural and sequential. After training and cross-validation, iStable performs better than all of the element predictors on several datasets. Under different classifications and validation conditions, iStable was also shown to have better overall performance across different types of secondary structure, ranges of relative solvent accessibility, and protein superfamily memberships.
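
The integration idea described in the abstract can be illustrated with a minimal sketch. It assumes a one-hot encoding of an 11-residue window, binary element-predictor labels (+1 stabilizing, -1 destabilizing), and an example sequence and votes that are all hypothetical; the abstract does not specify the actual encoding used by iStable. The sketch only assembles the feature vector that an integrator such as a support vector machine could be trained on, and contrasts it with the simple majority-vote baseline.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WINDOW = 11  # window size reported in the abstract


def sequence_window(seq, pos, size=WINDOW, pad="X"):
    """Return `size` residues centred on 0-based index `pos`, padded with `pad` at the ends."""
    half = size // 2
    return "".join(
        seq[i] if 0 <= i < len(seq) else pad
        for i in range(pos - half, pos + half + 1)
    )


def one_hot(residue):
    """20-dimensional indicator vector; padding or unknown residues map to all zeros."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def build_features(seq, pos, mutant, element_predictions):
    """Concatenate window encoding, mutant-residue encoding, and element predictor outputs.

    The layout is an assumption made for illustration, not the thesis's documented encoding.
    """
    features = []
    for residue in sequence_window(seq, pos):
        features.extend(one_hot(residue))
    features.extend(one_hot(mutant))
    features.extend(float(p) for p in element_predictions)
    return features


def majority_vote(element_predictions):
    """Baseline integrator: +1 if most predictors say stabilizing, otherwise -1."""
    return 1 if sum(element_predictions) > 0 else -1


# Hypothetical example: mutate position 2 (0-based) to alanine, with seven element votes.
example_seq = "MKTAYIAKQRQISFVKSHFSRQ"
example_votes = [1, -1, -1, 1, -1, -1, 1]
example_features = build_features(example_seq, 2, "A", example_votes)
```

A trained integrator can weight the element predictors differently depending on the sequence context, which a plain majority vote cannot do; that is the motivation the abstract gives for using a support vector machine instead.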
URI: http://hdl.handle.net/11455/60832
Other identifier: U0005-2707201021511700
Appears in Collections: Institute of Genomics and Bioinformatics

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.