Algorithm and Architecture Design for 2D and 3D Video Signal Processing
2D to 3D
Depth Map Estimation
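|Abstract:||In recent years, 3D display technology has advanced rapidly and attracted growing attention. Because a large amount of 2D video content still exists on the market, 2D-to-3D video conversion plays an important role in producing 3D content. This thesis proposes a hybrid depth map estimation algorithm and applies it to 2D-to-3D video conversion. It uses three depth cues to estimate the depth map: moving-object information, linear perspective, and texture characteristics. Depth-image-based rendering (DIBR) combines the depth map with the original 2D image to synthesize the left and right views of the 3D output. In addition, a high-quality view synthesis algorithm and architecture for 2D-to-3D video conversion are proposed. The method has two parts: 3D image warping and hole filling. 3D image warping maps the 2D camera image plane to a 3D coordinate plane through a coordinate transformation; however, it maps integer pixel positions to irregularly spaced points, which produces holes (occlusions) and overlapping points. This thesis proposes a hole-filling algorithm that improves PSNR by about 0.2 to 1.5 dB relative to other algorithms. The design is realized through hardware/software co-design and implemented on an FPGA platform.
In multimedia systems, many standards are in use, such as MPEG-1/2/4, VC-1, and H.264/AVC, and all of them include an inverse discrete cosine transform (IDCT). To support multiple standards at lower cost, a reconfigurable IDCT architecture is needed so that a single hardware design can serve the different video standards. The IDCT architecture proposed in this thesis supports MPEG-1/2/4, VC-1, and H.264/AVC and four transform block sizes: 8x8, 8x4, 4x8, and 4x4. Its advantage is that it needs no multipliers or read-only memory (ROM); adders and shifters suffice. The final part of the thesis proposes a six-tap filter algorithm with an adjustable window and image enhancement. The algorithm suits touch-based embedded platforms and is easy to implement; it has low computational complexity while preserving image quality, and it can be applied to the digital displays of various consumer electronics products.|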
In recent years, 3D display technology has received increasing attention. Because an enormous number of 2D videos already exist, 2D-to-3D video conversion plays an important role in 3D content production. In this thesis, we propose a hybrid depth-generation algorithm for 2D-to-3D conversion in 3D displays. We use three depth cues for depth estimation: motion information, linear perspective, and texture characteristics. Depth-image-based rendering (DIBR) combines the depth map with the original 2D image and outputs the rendered views for the left and right eyes. Moreover, we propose a high-quality view synthesis algorithm and architecture for 2D-to-3D conversion. The proposed view synthesis algorithm consists of two parts: 3D image warping and inpainting (hole filling). 3D image warping transforms the 2D camera image plane to a 3D coordinate plane. However, the integer grid points of the reference view are warped to irregularly spaced points in the virtual view, which causes occlusion problems, so inpainting is needed to repair the virtual images. The proposed algorithm improves PSNR by 0.2 to 1.5 dB over other algorithms. We adopt hardware/software co-design to realize the proposed view synthesis algorithm: the image inpainting is implemented on an FPGA device and the rest of the algorithm runs in software. In multimedia systems, there are many standards, such as MPEG-1/2/4, VC-1, and H.264, all of which include an inverse discrete cosine transform (IDCT). To support multiple standards at lower cost, a video decoder needs a reconfigurable IDCT architecture that can decode the various video standards. The proposed IDCT architecture supports VC-1, MPEG-1/2/4, and H.264/AVC, and it handles four transform types: 8x8, 8x4, 4x8, and 4x4. Its advantage is that it requires no multipliers or ROM; it needs only adders and shifters.
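To illustrate what an adder-and-shifter-only transform looks like, the sketch below implements the standard 4x4 inverse core transform defined by H.264/AVC, which is multiplier-free by construction. It is not the reconfigurable architecture proposed in the thesis, and it covers only the 4x4 H.264 case; the function names `inverse_1d` and `inverse_4x4` are chosen here for illustration.

```python
def inverse_1d(w):
    """4-point inverse butterfly built only from adds, subtracts, and shifts."""
    w0, w1, w2, w3 = w
    e0 = w0 + w2
    e1 = w0 - w2
    e2 = (w1 >> 1) - w3          # the 1/2 factor is a right shift, not a multiply
    e3 = w1 + (w3 >> 1)
    return [e0 + e3, e1 + e2, e1 - e2, e0 - e3]


def inverse_4x4(block):
    """Apply the 1-D butterfly to every row, then to every column, then round."""
    rows = [inverse_1d(r) for r in block]
    # cols[c][r] holds the value at row r, column c after the column pass
    cols = [inverse_1d([rows[r][c] for r in range(4)]) for c in range(4)]
    # Final rounding and scaling: (x + 32) >> 6
    return [[(cols[c][r] + 32) >> 6 for c in range(4)] for r in range(4)]


if __name__ == "__main__":
    dc_only = [[64, 0, 0, 0]] + [[0, 0, 0, 0] for _ in range(3)]
    for row in inverse_4x4(dc_only):
        print(row)
```

A block holding only a DC coefficient spreads evenly over all 16 output samples, which is a quick sanity check for the butterfly wiring.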
Finally, we propose a scalable view window with quality enhancement for touch displays. The algorithm is easy to implement, its computational complexity is low enough for an embedded system, and it still delivers good image quality. It is suitable for digital display applications such as digital picture frames.
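The view synthesis step described in the abstract (3D image warping followed by hole filling) can be pictured with a minimal one-row sketch. It rests on simplifying assumptions that are not taken from the thesis: the depth map has already been converted to a non-negative integer horizontal disparity per pixel, a larger disparity means a nearer object, and holes are filled by copying the nearest valid neighbour, which is a plain placeholder rather than the proposed inpainting algorithm. The names `warp_row` and `fill_holes` are hypothetical.

```python
def warp_row(pixels, disparity):
    """Shift each pixel by its disparity; the nearest pixel wins on overlap."""
    width = len(pixels)
    out = [None] * width          # None marks a hole (no source pixel landed here)
    best = [-1] * width           # largest disparity written to each target so far
    for x in range(width):
        d = disparity[x]
        tx = x + d
        if 0 <= tx < width and d > best[tx]:
            out[tx] = pixels[x]
            best[tx] = d
    return out


def fill_holes(row):
    """Fill each hole from the left neighbour, or from the first valid pixel to the right."""
    filled = list(row)
    for x in range(len(filled)):
        if filled[x] is None:
            if x > 0 and filled[x - 1] is not None:
                filled[x] = filled[x - 1]
            else:
                for k in range(x + 1, len(filled)):
                    if filled[k] is not None:
                        filled[x] = filled[k]
                        break
    return filled


if __name__ == "__main__":
    pixels = [10, 20, 30, 40, 50]
    disparity = [0, 0, 2, 0, 0]   # only the middle pixel is near the camera
    warped = warp_row(pixels, disparity)
    print(warped)
    print(fill_holes(warped))
```

In the example, the foreground pixel moves two positions to the right, covers a background pixel there (the overlapping-point case), and leaves a hole at its old position (the disocclusion case) that the fill step then patches.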
|Appears in Collections:||Department of Electrical Engineering (電機工程學系所)|