Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/22484
Title: A study on injecting weight noise and weight decay during multilayer perceptron training
Empirical studies on the online learning algorithms based on combining weight noise injection and weight decay
Authors: 梁晏綸
Liang, Yen-Lun
Keywords: Neural Networks; Multilayer Perceptron (MLP); Fault Tolerance; Weight Decay; Weight Noise
Publisher: Graduate Institute of Technology Management
References:
[1] G. An. The effects of adding noise during backpropagation training on a generalization performance. Neural Computation, Vol. 8, pp. 643-674, 1996.
[2] G. Basalyga and E. Salinas. When response variability increases neural network robustness to synaptic noise. Neural Computation, Vol. 18, pp. 1349-1379, 2006.
[3] J. L. Bernier, J. Ortega, E. Ros, I. Rojas, and A. Prieto. Obtaining fault tolerant multilayer perceptrons using an explicit regularization. Neural Processing Letters, Vol. 12, pp. 107-113, 2000.
[4] J. L. Bernier, J. Ortega, I. Rojas, and A. Prieto. Improving the tolerance of multilayer perceptrons by minimizing the statistical sensitivity to weight deviations. Neurocomputing, Vol. 31, pp. 87-103, Jan. 2000.
[5] G. Bolt. Fault Tolerance in Multi-Layer Perceptrons. PhD thesis, University of York, UK, 1992.
[6] M. Buscema. MetaNet: the theory of independent judges. Substance Use and Misuse, Vol. 33(2), pp. 439-461, Jan. 1998.
[7] S. Cavalieri and O. Mirabella. A novel learning algorithm which improves the partial fault tolerance of multilayer neural networks. Neural Networks, Vol. 12, pp. 91-106, 1999.
[8] S. Chen. Local regularization assisted orthogonal least squares regression. Neurocomputing, pp. 559-585, 2006.
[9] D. Deodhare, M. Vidyasagar, and S. Sathiya Keerthi. Synthesis of fault-tolerant feedforward neural networks using minimax optimization. IEEE Transactions on Neural Networks, Vol. 9(5), pp. 891-900, 1998.
[10] P. J. Edwards and A. F. Murray. Can deterministic penalty terms model the effects of synaptic weight noise on network fault-tolerance? International Journal of Neural Systems, Vol. 6(4), pp. 401-416, 1995.
[11] P. J. Edwards and A. F. Murray. Fault tolerance via weight noise in analog VLSI implementations of MLPs - a case study with EPSILON. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, Vol. 45(9), pp. 1255-1262, Sep. 1998.
[12] C. T. Chiu et al. Modifying training algorithms for improved fault tolerance. Proc. ICNN'94, Vol. I, pp. 333-338, 1994.
[13] N. Kamiura et al. On a weight limit approach for enhancing fault tolerance of feedforward neural networks. IEICE Transactions on Information & Systems, Vol. E83-D, No. 11, 2000.
[14] R. Velazco et al. SEU fault tolerance in artificial neural networks. IEEE Transactions on Nuclear Science, Vol. 42(6), pp. 1856-1862, 1995.
[15] N. C. Hammadi and H. Ito. A learning algorithm for fault tolerant feedforward neural networks. IEICE Transactions on Information & Systems, Vol. E80-D, No. 1, 1997.
[16] B. Hassibi and D. G. Stork. Second order derivatives for network pruning: optimal brain surgeon. In Hanson et al. (eds.), Advances in Neural Information Processing Systems, pp. 164-171, 1993.
[17] S. Himavathi, D. Anitha, and A. Muthuramalingam. Feedforward neural network implementation in FPGA using layer multiplexing for effective resource utilization. IEEE Transactions on Neural Networks, Vol. 18, pp. 880-888, 2007.
[18] K. Ho, C. S. Leung, and J. Sum. On weight-noise-injection training. In M. Koeppen, N. Kasabov and G. Coghill (eds.), Advances in Neuro-Information Processing, Springer LNCS 5507, pp. 919-926, 2009.
[19] K. C. Jim, C. L. Giles, and B. G. Horne. An analysis of noise in recurrent neural networks: convergence. IEEE Transactions on Neural Networks, Vol. 7, pp. 1424-1438, 1996.
[20] E. W. M. Lee, C. P. Lim, R. K. K. Yuen, and S. M. Lo. A hybrid neural network model for noisy data regression. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, Vol. 34, No. 2, pp. 951-960, Apr. 2004.
[21] C. S. Leung and J. Sum. A fault tolerant regularizer for RBF networks. IEEE Transactions on Neural Networks, Vol. 19(3), pp. 493-507, 2008.
[22] C. S. Leung, K. W. Wong, J. Sum, and L. W. Chan. On-line training and pruning for the RLS algorithm. Electronics Letters, Vol. 32, No. 23, pp. 2152-2153, 1996.
[23] C. S. Leung, K. W. Wong, P. F. Sum, and L. W. Chan. A pruning method for the recursive least squares algorithm. Neural Networks, 2001.
[24] C. S. Leung, G. H. Young, J. Sum, and W. K. Kan. On the regularization of forgetting recursive least square. IEEE Transactions on Neural Networks, Vol. 10, pp. 1842-1846, 1999.
[25] J. E. Moody. Note on generalization, regularization, and architecture selection in nonlinear learning systems. Proc. First IEEE-SP Workshop on Neural Networks for Signal Processing, 1991.
[26] N. Murata, S. Yoshizawa, and S. Amari. Network information criterion - determining the number of hidden units for an artificial neural network model. IEEE Transactions on Neural Networks, Vol. 5(6), pp. 865-872, 1994.
[27] A. F. Murray and P. J. Edwards. Synaptic weight noise during multilayer perceptron training: fault tolerance and training improvements. IEEE Transactions on Neural Networks, Vol. 4(4), pp. 722-725, 1993.
[28] A. F. Murray and P. J. Edwards. Enhanced MLP performance and fault tolerance resulting from synaptic weight noise during training. IEEE Transactions on Neural Networks, Vol. 5(5), pp. 792-802, 1994.
[29] M. W. Pedersen, L. K. Hansen, and J. Larsen. Pruning with generalization based weight saliencies: γOBD, γOBS. Advances in Neural Information Processing Systems 8, pp. 512-528, 1996.
[30] D. S. Phatak and I. Koren. Complete and partial fault tolerance of feedforward neural nets. IEEE Transactions on Neural Networks, Vol. 6, pp. 446-456, 1995.
[31] R. Reed. Pruning algorithms - a survey. IEEE Transactions on Neural Networks, Vol. 4(5), pp. 740-747, 1993.
[32] A. W. Savich, M. Moussa, and S. Areibi. The impact of arithmetic representation on implementing MLP-BP on FPGAs: a study. IEEE Transactions on Neural Networks, Vol. 18, pp. 240-252, 2007.
[33] C. Neti, M. H. Schneider, and E. D. Young. Maximally fault tolerant neural networks. IEEE Transactions on Neural Networks, Vol. 3(1), pp. 14-23, 1992.
[34] C. H. Sequin and R. D. Clay. Fault tolerance in feedforward artificial neural networks. Neural Networks, Vol. 4, pp. 111-141, 1991.
[35] S. Singh. Noise impact on time-series forecasting using an intelligent pattern matching technique. Pattern Recognition, Vol. 32, pp. 1389-1398, 1999.
[36] M. Sugiyama and H. Ogawa. Optimal design of regularization term and regularization parameter by subspace information criterion. Neural Networks, Vol. 15, pp. 349-361, 2002.
[37] J. Sum and K. Ho. SNIWD: simultaneous weight noise injection with weight decay for MLP training. Proc. ICONIP 2009, Bangkok, Thailand, 2009.
[38] J. Sum, C. S. Leung, and K. Ho. On objective function, regularizer and prediction error of a learning algorithm for dealing with multiplicative weight noise. IEEE Transactions on Neural Networks, Vol. 20(1), Jan. 2009.
Abstract:
In neural networks, injecting weight noise during training has been widely adopted as a means of improving fault tolerance, but this practice has not been fully validated either theoretically or empirically. In this thesis, we discuss two aspects of the online learning algorithms that combine weight noise injection and weight decay during training. Multiplicative weight noise and additive weight noise are treated separately. To assess the convergence behavior and the fault tolerance of these algorithms, we obtain the required results through extensive computer simulations. The experimental results show that: (1) injecting weight noise alone during training does not make the weights converge; (2) combining weight noise injection with weight decay yields better convergence behavior than injecting weight noise alone; and (3) combining weight noise injection with weight decay yields better fault tolerance than injecting weight noise alone.

This thesis makes two contributions. First, part of these results complements the recent findings by Ho, Leung & Sum on the convergence behavior of weight noise injection during training. Second, the remaining results, which concern fault tolerance, are new to the field of neural networks. Finally, this thesis also conveys an important message about adding weight decay during training: weight decay not only improves the convergence of the weights, but also improves the fault tolerance of the neural network.

While injecting weight noise during training has been widely adopted for attaining fault tolerant neural networks, theoretical and empirical studies on the online algorithms developed from this strategy are not yet complete. In this thesis, we investigate two important aspects of the online learning algorithms based on combining weight noise injection and weight decay. Multiplicative weight noise and additive weight noise are considered separately. The convergence behaviors and the performance of these learning algorithms are investigated via intensive computer simulations. It is found that (i) the online learning algorithm based on purely multiplicative weight noise injection does not converge, (ii) the algorithms combining weight noise injection and weight decay exhibit better convergence behaviors than their pure weight noise injection counterparts, and (iii) the neural networks attained by the algorithms combining weight noise injection and weight decay show better fault tolerance than those attained by the pure weight noise injection-based algorithms.
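To make the combined update concrete, the following Python sketch shows one plausible form of such a per-sample update: the weights are perturbed by multiplicative or additive Gaussian noise before the gradient is computed, and a weight-decay term is added to the step. This is an illustration under assumed names only, not the exact algorithms evaluated in the thesis; the function train_online, the gradient callback grad_fn, and the parameters sigma (noise level) and decay (weight-decay coefficient) are hypothetical.

    import numpy as np

    def train_online(X, Y, w_init, grad_fn, epochs=100, lr=0.01,
                     noise_type="multiplicative", sigma=0.01, decay=1e-4):
        """Online training with weight noise injected at every update.

        grad_fn(w, x, y) should return the gradient of the per-sample loss
        evaluated at the (noise-perturbed) weight vector w.
        """
        w = w_init.copy()
        for _ in range(epochs):
            for x, y in zip(X, Y):
                # Perturb the weights before computing the gradient.
                if noise_type == "multiplicative":
                    w_noisy = w * (1.0 + sigma * np.random.randn(*w.shape))
                else:  # additive weight noise
                    w_noisy = w + sigma * np.random.randn(*w.shape)
                g = grad_fn(w_noisy, x, y)
                # Gradient step plus a weight-decay (L2) pull toward zero.
                w -= lr * (g + decay * w)
        return w

In this sketch, setting decay to zero recovers a pure weight-noise-injection update, the baseline against which the combined algorithms are compared.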

The contributions of these results are twofold. First, part of the empirical results complements the recent findings of Ho, Leung & Sum on the convergence behaviors of weight noise injection-based learning algorithms. Second, the other part of the results, which concerns fault tolerance, is new to the area. Finally, the results presented in this thesis also convey an important message about adding weight decay during training: weight decay not only improves the convergence of an algorithm, but also improves the weight noise tolerance of a neural network attained by these online algorithms.
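One common way to quantify the weight noise tolerance mentioned above is to perturb the trained weights repeatedly and average the resulting test loss; the short Python sketch below illustrates this idea. It is an assumption for illustration, not necessarily the evaluation protocol used in the thesis, and the names weight_noise_tolerance, loss_fn, sigma, and trials are hypothetical.

    import numpy as np

    def weight_noise_tolerance(w, X, Y, loss_fn, sigma=0.05, trials=100,
                               multiplicative=True):
        """Average test loss of trained weights w over random weight perturbations."""
        losses = []
        for _ in range(trials):
            # Simulate a faulty/noisy implementation of the trained network.
            if multiplicative:
                w_faulty = w * (1.0 + sigma * np.random.randn(*w.shape))
            else:
                w_faulty = w + sigma * np.random.randn(*w.shape)
            losses.append(loss_fn(w_faulty, X, Y))
        return np.mean(losses)

A lower average loss under such perturbations indicates a more fault tolerant network.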
URI: http://hdl.handle.net/11455/22484
Other Identifiers: U0005-1007201018120600
Appears in Collections: Graduate Institute of Technology Management
