Title: Perceptual Video Coding Algorithm and Architecture Design for H.264/AVC Standard
Author: Tai, Cheng-Han (戴成翰)
Keywords: Perceptual video coding; Visual sensitivity; Texture; Visual masking; Visual attention
Publisher: Department of Electrical Engineering



The main techniques in current video compression are prediction, transform, quantization, and entropy coding. Among them, quantization is the most flexible, since its parameters offer the widest range of choices. In a conventional encoder, however, quantization is imperfect: every position in a frame is treated as equally important. Although such equality seems reasonable at first glance, it does not match natural human visual responses, because viewers cannot perceive different contents in an image identically. If these visual properties are taken into account during quantization, coding performance can therefore be improved.
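The idea of steering quantization by perceptual importance can be illustrated with a minimal sketch. This is an assumed mapping for illustration only, not the thesis's actual algorithm: a perceptual weight in [0, 1] is turned into a QP offset, with the result clamped to the H.264 QP range of 0 to 51.

```python
# A minimal sketch (assumed mapping, not the thesis's tuned algorithm) of
# perceptually adaptive quantization: each macroblock's QP is offset by a
# perceptual weight, so visually sensitive regions are quantized more finely.
def adjust_qp(base_qp, weight, max_offset=6):
    """Map a perceptual weight in [0, 1] (1 = most sensitive) to a QP offset.

    Sensitive macroblocks get a lower QP (finer quantization); regions
    where distortion is well masked get a higher QP (coarser quantization).
    """
    offset = round((0.5 - weight) * 2 * max_offset)  # in [-max_offset, +max_offset]
    return min(51, max(0, base_qp + offset))         # clamp to H.264 QP range

# Example: a visually sensitive macroblock versus a well-masked one.
print(adjust_qp(26, 1.0))  # 20: finer quantization for a sensitive region
print(adjust_qp(26, 0.0))  # 32: coarser quantization where distortion is masked
```

The clamp matters: H.264 only allows QP values from 0 to 51, so any perceptual offset must stay within that range regardless of the base QP.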

This thesis adjusts quantization parameters according to these visual properties. Analysis shows, however, that varying the quantization parameters can interfere with other coding algorithms and thereby introduce extra overhead. To quantify these effects and verify that the adjustment is reasonable, three experiments are designed in the thesis: on the interactions between intra macroblocks, on the effects that intra frames impose on inter frames, and on the interactions between inter frames. A perceptual analysis module is then proposed for the perceptual algorithms. It combines spatial and temporal considerations, including the detection of texture properties, visual masking, and visual attention. The texture-property stage classifies regions as smooth, edge, or complex. The visual-masking stage accounts for the dynamic properties of the video, analyzing frame-to-frame changes to judge whether distortions can be noticed easily. The visual-attention stage focuses on the motions where distortions are most likely to be conspicuous. The final quantization parameters are adjusted based on these algorithms, and which algorithms are applied to each frame is decided by the completeness of the data available for that frame type.

For the hardware implementation, the perceptual module is designed to fit a two-stage macroblock-pipelined encoder, and its schedule takes into account the data dependence between the encoder and the perceptual module. The resulting design meets the throughput the encoder needs to achieve real-time video compression.
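One way to see the dependence constraint is as a scheduling problem: each macroblock's adjusted QP must be ready before the encoder stage that quantizes it begins, so the perceptual analysis runs one pipeline stage ahead. The sketch below is a conceptual simulation of that scheduling assumption, not the actual RTL design:

```python
# Conceptual sketch (assumed scheduling, not the thesis's RTL) of a
# perceptual-analysis stage running one macroblock ahead of the encode
# stage in a 2-stage MB pipeline.
def pipeline_schedule(num_mbs):
    """Return, per cycle, (MB under analysis, MB being encoded); None = idle."""
    schedule = []
    for t in range(num_mbs + 1):
        analyze = t if t < num_mbs else None   # stage 0: perceptual analysis
        encode = t - 1 if t >= 1 else None     # stage 1: encode with QP ready
        schedule.append((analyze, encode))
    return schedule

for cycle, (a, e) in enumerate(pipeline_schedule(3)):
    print(f"cycle {cycle}: analyze MB {a}, encode MB {e}")
```

Because the two stages overlap, the perceptual module adds only a one-macroblock latency rather than lengthening the per-macroblock cycle budget, which is what lets the encoder keep its real-time throughput.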
Other identifiers: U0005-2108201020474800
Appears in Collections: Department of Electrical Engineering




Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.