Title: A Study of the Properties of On-Line Fault-Tolerant Training Combining Weight Decay with Random Fault or Noise Injection (3)
On-Line Fault Tolerant Learning Algorithms III: Node Fault Injection-Based BPA for MLP
Author: 沈培輝
Keywords: Information Science (Software); Basic Research; Gradient Descent; Fault Tolerance; KL Divergence; Learning Theory; Neural Networks
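In training neural networks, an important issue is making the network robust to abrupt changes in its structure and to noise in its weights or neurons. Over the past two decades, many training methods have been proposed. The first type is objective-function oriented: an objective function is defined, and gradient descent is then used to design an iterative method for training the network. The second type is more direct but lacks theoretical support (such as convergence results): starting from online back-propagation (BPA), random faults or noise are injected at every update step, forcing the neural network to learn under these random faults or noise. Unfortunately, most conclusions in this second line of research rest only on computer simulation results, and theoretical studies are very scarce. Moreover, we found that many researchers hold seriously mistaken ideas about the objective function of weight noise injection training. We therefore applied for research funding over the past two years to study this class of training methods. In the first year, taking RBF networks as the subject, we used the Gladyshev Theorem to prove the convergence of online training methods that combine weight noise or node fault injection with weight decay, and derived their objective functions. At the same time, we found that the Gladyshev Theorem does not apply to MLPs. In the second year we therefore took MLPs as the subject and used Doob's Martingale Convergence Theorem to prove the convergence of online weight noise injection training and to derive its objective functions. In this application we continue this line of research and study, one by one, training under online node fault injection for three structures: MLP with a single linear output node, MLP with multiple linear output nodes, and MLP with a single sigmoid output node. We will again use Doob's Martingale Convergence Theorem to prove the convergence of online node fault injection-based BPA with constant step size when applied to these three MLPs, and derive their objective functions. In addition, we will derive the mean prediction errors produced by each training method, and use the derived objective functions to compare these methods with training methods developed from the concept of regularization. Finally, computer simulations will be used to examine whether these online node fault injection training methods yield better generalization on function approximation, regression, time series prediction, and classification problems.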

While injecting noise (input noise or weight noise) or faults (weight faults or node faults) during training has been applied to improve the fault tolerance of neural networks, little analysis has been done to explain the success of such learning methods. This project continues the research of the projects funded in July 2009 and July 2010. While the previous projects focused on (1) fault/noise injection-based BPA for radial basis function (RBF) networks (August 2009 to July 2010) and (2) weight noise injection-based BPA for multilayer perceptrons (MLPs) (August 2010 to July 2011), this project targets node fault injection-based BPA for MLPs.

Improving the fault tolerance of a neural network is an important issue that has been studied for more than two decades. Various training algorithms and synthesizing methods have been proposed over that time, and online node fault injection-based algorithms are among them. While the idea behind these training algorithms is simple, theoretical analyses of their objective functions and convergence proofs are far from complete. Therefore, in this project we investigate the objective functions and convergence proofs for this type of online node fault injection-based training algorithm as applied to three different structures of MLPs, namely (i) MLPs with a single linear output node, (ii) MLPs with multiple linear output nodes, and (iii) MLPs with a single sigmoid output node. For the convergence proofs, we will apply the Martingale Convergence Theorem to show that algorithms of this type converge with probability one. The objective functions minimized by these algorithms will then be derived, followed by the mean prediction errors of the MLPs obtained by these online node fault injection-based training algorithms.
The convergence behaviors of the online node fault injection-based algorithms and the performance of the MLPs generated by these algorithms will be further studied through simulated examples, including function approximation, regression, time-series prediction, and classification problems. Finally, the algorithms based on online node fault injection will be compared with those based on regularization.
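As a rough illustration of the kind of algorithm the abstract refers to, the following Python sketch shows online BPA with node fault injection for an MLP with a single linear output node and a constant step size. It is not the project's implementation. The stuck-at-zero fault model, the tanh hidden nodes, the scalar input, and all names (`train_node_fault_bpa`, `predict`, `fault_rate`, `step`) are assumptions made for this example only.

```python
import math
import random


def train_node_fault_bpa(data, n_hidden=8, step=0.05, fault_rate=0.2,
                         epochs=200, seed=0):
    """Online BPA with random node faults (illustrative assumptions only).

    data: list of (x, y) pairs with scalar input x and scalar target y.
    At every update each hidden node is independently forced to output 0
    (assumed stuck-at-zero fault) with probability fault_rate.
    """
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1) for _ in range(n_hidden)]      # input-to-hidden weights
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]      # hidden biases
    v = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]  # hidden-to-output weights
    for _ in range(epochs):
        for x, y in rng.sample(data, len(data)):  # one sample per update
            # Draw a fresh fault pattern for this update step.
            mask = [0.0 if rng.random() < fault_rate else 1.0
                    for _ in range(n_hidden)]
            h = [m * math.tanh(wi * x + bi) for m, wi, bi in zip(mask, w, b)]
            err = y - sum(vi * hi for vi, hi in zip(v, h))
            for i in range(n_hidden):
                # Gradient step on 0.5 * err**2 through the faulty network;
                # a faulty node (mask 0) passes no gradient to w[i], b[i].
                back = err * v[i] * mask[i] * (1.0 - h[i] ** 2)
                v[i] += step * err * h[i]
                w[i] += step * back * x
                b[i] += step * back
    return w, b, v


def predict(params, x, mask=None):
    """Network output; mask optionally marks faulty hidden nodes with 0.0."""
    w, b, v = params
    mask = mask or [1.0] * len(v)
    return sum(m * vi * math.tanh(wi * x + bi)
               for m, vi, wi, bi in zip(mask, v, w, b))
```

A possible use would be to train on samples of a smooth function, for example `data = [(x / 10.0, math.sin(x / 10.0)) for x in range(-20, 21)]`, and then compare `predict` with and without a fault mask. Because faults are drawn anew at each step, the update follows the gradient of the faulty network's error, which is the setting whose objective function and convergence the project sets out to analyze.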
Other Identifiers: NSC100-2221-E005-083
Appears in Collections: Graduate Institute of Technology Management

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.