Title: 高效率AI加速器之選擇性搜尋法硬體實現
High Efficient Hardware Implementation of Selective Search for AI Accelerator
Author: 簡漢君
Han-Chun Chien
Keywords: Image Segmentation; Object Detection; Artificial Intelligence (AI); Acceleration; Hardware Architecture; RCNN
This thesis proposes a hardware architecture and implementation of the Selective Search algorithm. Parts of the algorithm are modified and simplified to suit hardware design and to reduce computational complexity. First, the size similarity and fill similarity measures are implemented in hardware; because the two measures are computed from the same features, those features can be shared to reduce hardware complexity. Second, region labels are reordered so that overlapping regions are arranged together, which reduces memory storage (in bits) by 20%. Finally, the features of the original regions are reused so that the features and similarities of a new region are processed in parallel, which reduces computation time by 25%. The proposed architecture supports Full HD video at 30 fps and, after synthesis, achieves real-time operation at a clock frequency of 188 MHz.
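The feature sharing between the two similarity measures can be illustrated with a small software sketch. It assumes the size and fill similarity definitions from the original Selective Search formulation (Uijlings et al.); the abstract does not state the exact formulas or datapath used in the thesis. The names `Region`, `merged_bbox`, `similarities`, and `merge` are hypothetical, and this is Python for illustration rather than the thesis's hardware description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical per-region feature record: pixel count plus inclusive bounding box."""
    size: int
    x0: int
    y0: int
    x1: int
    y1: int


def merged_bbox(a: Region, b: Region) -> tuple:
    """Tightest bounding box enclosing both regions."""
    return (min(a.x0, b.x0), min(a.y0, b.y0), max(a.x1, b.x1), max(a.y1, b.y1))


def similarities(a: Region, b: Region, image_size: int) -> tuple:
    """Return (size similarity, fill similarity) for a pair of regions.

    Assumed definitions (original Selective Search formulation):
      s_size = 1 - (size(a) + size(b)) / size(image)
      s_fill = 1 - (size(bbox) - size(a) - size(b)) / size(image)
    """
    shared = a.size + b.size  # computed once, used by both measures
    x0, y0, x1, y1 = merged_bbox(a, b)
    bbox_area = (x1 - x0 + 1) * (y1 - y0 + 1)
    s_size = 1.0 - shared / image_size
    s_fill = 1.0 - (bbox_area - shared) / image_size
    return s_size, s_fill


def merge(a: Region, b: Region) -> Region:
    """Derive the new region's features from the originals, without rescanning pixels."""
    x0, y0, x1, y1 = merged_bbox(a, b)
    return Region(a.size + b.size, x0, y0, x1, y1)
```

For example, `similarities(Region(4, 0, 0, 1, 1), Region(4, 2, 0, 3, 1), 16)` evaluates the summed size once and uses it in both measures. `merge` shows how a new region's features follow directly from the features already stored for the two original regions, which is the kind of reuse the abstract refers to.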
