Please use this identifier to cite or link to this item: http://hdl.handle.net/11455/8893
Title: Perceptual Video Coding Algorithm and Architecture Design for H.264/AVC Standard
Author: Tai, Cheng-Han (戴成翰)
Keywords: Perceptual video coding; Visual sensitivity; Texture; Visual masking; Visual attention
Publisher: Department of Electrical Engineering
Abstract:

The core techniques of current video compression are prediction, transform, quantization, and entropy coding. Among them, quantization is the most flexible, because its parameters can be chosen freely. In a typical encoder, however, quantization is applied imperfectly: every position in a frame is treated as equally important. Although this equal treatment seems reasonable at first glance, it does not match human visual characteristics, since viewers do not perceive all parts of a picture with the same sensitivity. If these visual properties are taken into account during quantization, coding efficiency can therefore be further improved.
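
As background for why the quantization parameter (QP) is the natural knob for such per-region adjustment, the following minimal C sketch (not taken from the thesis) evaluates the commonly tabulated H.264/AVC relation in which the quantization step size roughly doubles for every increase of 6 in QP, so even a small per-macroblock QP offset changes the coding coarseness noticeably.

    #include <stdio.h>

    /* Illustrative sketch: H.264/AVC quantization step size as a function of QP.
       The step size doubles for every increase of 6 in QP, which is why a small
       per-region QP offset already spans a wide range of coarseness levels. */
    static double qstep_from_qp(int qp)
    {
        /* Base step sizes for QP 0..5; each further +6 doubles the step. */
        static const double base[6] = {0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125};
        return base[qp % 6] * (double)(1 << (qp / 6));
    }

    int main(void)
    {
        for (int qp = 20; qp <= 32; qp += 6)
            printf("QP %d -> Qstep %.3f\n", qp, qstep_from_qp(qp));
        return 0;
    }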

This thesis adjusts the quantization parameters according to visual properties. Analysis shows, however, that adjusting the quantization parameters may disturb other coding algorithms and introduce extra overhead. To gauge how severe these effects are, and whether adjusting the quantization parameters is reasonable, three simulation scenarios are designed: the influence between intra macroblocks, the influence of intra frames on inter frames, and the influence between inter frames. For the perceptual evaluation itself, a perceptual analysis module is proposed that spans both spatial and temporal considerations, covering texture characteristics, visual masking, and visual attention. The texture analysis classifies regions as smooth, edge, or complex. Visual masking addresses the dynamic nature of video by analyzing inter-frame changes and how easily distortion can be noticed. Visual attention focuses on detecting the kinds of motion that tend to make distortion visible. The final quantization parameters are then adjusted according to these algorithms, and the number of algorithms applied for each frame type is decided by how complete the available source data is.
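
To make the intended flow concrete, the hypothetical C sketch below shows how per-macroblock texture classification, temporal masking, and motion attention could be folded into a single QP offset. All thresholds, weights, and function names here are illustrative assumptions rather than the algorithms defined in the thesis.

    #include <stdio.h>
    #include <math.h>

    enum texture_class { SMOOTH, EDGE, COMPLEX };

    /* Classify a 16x16 luma block from a crude horizontal-gradient measure
       (assumed thresholds, for illustration only). */
    static enum texture_class classify_texture(const unsigned char *blk, int stride)
    {
        double g_sum = 0.0, g_max = 0.0;
        for (int y = 0; y < 16; y++)
            for (int x = 1; x < 16; x++) {
                double g = fabs((double)blk[y * stride + x] - blk[y * stride + x - 1]);
                g_sum += g;
                if (g > g_max) g_max = g;
            }
        double g_mean = g_sum / 240.0;            /* 16 rows x 15 horizontal diffs */

        if (g_mean < 2.0)         return SMOOTH;  /* flat region */
        if (g_max > 4.0 * g_mean) return EDGE;    /* few strong transitions */
        return COMPLEX;                           /* activity spread everywhere */
    }

    /* Combine the three cues into a QP offset: protect smooth, edge, and
       attended (fast-moving) regions; spend fewer bits where spatial or
       temporal masking hides the distortion. */
    static int perceptual_qp_offset(enum texture_class tc,
                                    double frame_diff,  /* mean |cur - prev| over the MB */
                                    double motion_mag)  /* motion-vector magnitude */
    {
        int dqp = 0;
        if (tc == SMOOTH || tc == EDGE) dqp -= 2;  /* distortion easily seen */
        if (tc == COMPLEX)              dqp += 2;  /* spatial masking */
        if (frame_diff > 12.0)          dqp += 1;  /* temporal masking (assumed) */
        if (motion_mag > 8.0)           dqp -= 1;  /* attention-grabbing motion */
        return dqp;                                /* added to the frame-level QP */
    }

    int main(void)
    {
        unsigned char mb[16 * 16];
        for (int i = 0; i < 256; i++)              /* synthetic block with a vertical edge */
            mb[i] = (unsigned char)(128 + ((i % 16 < 8) ? 0 : 64));
        enum texture_class tc = classify_texture(mb, 16);
        printf("class=%d, dQP=%d\n", (int)tc, perceptual_qp_offset(tc, 5.0, 10.0));
        return 0;
    }

In an actual encoder, the returned offset would be carried per macroblock (H.264/AVC supports a per-macroblock QP delta), which is what makes this kind of region-wise adjustment possible in the first place.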

For the hardware implementation, the perceptual module circuit is designed around an encoder with a two-stage macroblock-pipeline schedule. The dependencies between the encoder and the perceptual module are taken into account when placing the module in the pipeline, and the design is sized so that the targeted encoder can still meet its real-time video compression specification.
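
As a rough illustration of the intended scheduling, the following software model (with placeholder function names, not the thesis hardware) overlaps perceptual analysis of macroblock n+1 with the encoding of macroblock n in a two-stage pipeline, so that each macroblock's QP offset is ready before that macroblock enters the encoding stage.

    #include <stdio.h>

    /* Placeholder hooks; in the real design these would be the perceptual
       analysis hardware and the encoder datapath (hypothetical names). */
    static int analyze_perceptual_qp(int mb_index) { (void)mb_index; return 28; }
    static void encode_macroblock(int mb_index, int qp)
    {
        printf("MB %d encoded with QP %d\n", mb_index, qp);
    }

    /* Two-stage macroblock pipeline model: while macroblock n is being
       encoded, the perceptual module already analyzes macroblock n+1. */
    static void encode_frame(int mb_count)
    {
        int qp_next = analyze_perceptual_qp(0);          /* prime the pipeline */
        for (int n = 0; n < mb_count; n++) {
            int qp_cur = qp_next;
            if (n + 1 < mb_count)
                qp_next = analyze_perceptual_qp(n + 1);  /* stage 1: look-ahead analysis */
            encode_macroblock(n, qp_cur);                /* stage 2: actual encoding */
        }
    }

    int main(void) { encode_frame(4); return 0; }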
URI: http://hdl.handle.net/11455/8893
Other Identifiers: U0005-2108201020474800
Appears in Collections: Department of Electrical Engineering
