Graduate student (English name): Chih-Fu Tung
Thesis title (English): A CTRGCN-based model for Isolated Sign Language Recognition
Advisor (English name): Mu-Chun Su
Keywords (English): Deep learning; Skeleton recognition; Sign language recognition; Graph convolutional neural network
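[...] word recognition, we designed an improved CTRGCN model and proposed a multi-branch
architecture to improve recognition accuracy. We trained on the WLASL100 dataset and compared against existing models [...]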
In recent years, the hearing-impaired population has been gradually
increasing, and public demand for sign language learning has risen steadily
as well. However, sign language is difficult to learn and learning
resources are limited, making it a relatively challenging task.
To address this issue, this paper proposes a skeleton-based sign language
word recognition algorithm built on the Channel-Topology Refinement Graph Convolutional
Network (CTRGCN). The method addresses the challenges of sign language
word recognition by designing an improved CTRGCN model to enhance
recognition accuracy. We trained the model on the WLASL100 dataset and
compared it with existing models. The results show that our method outperforms
existing techniques in most scenarios, demonstrating its potential and
practicality for sign language word recognition. We hope this approach can
provide further assistance for sign language learning.
1.1 Research Motivation
1.2 Research Objectives
1.3 Thesis Organization
2.1 Background
2.1.1 Types of Sign Language
2.1.2 Types of Sign Language Recognition
2.1.3 Introduction to Graph Convolution (GCN)
2.2 Literature Review
2.2.1 Related Work on Keypoint Detection
2.2.2 Related Work on Skeleton-Based Action Recognition
2.2.3 Related Work on 3DCNN-Based Video Recognition
2.2.4 Related Work on Skeleton-Based Sign Language Word Recognition
3.1 System Architecture
3.2 Preprocessing
3.3 Model Architecture
3.3.1 CTRGCN Model
3.3.2 Modified CTRGCN Model
3.3.3 Multi-Branch Architecture
3.3.4 Methods for Merging Model Outputs
3.3.5 Fusing RGB Results
4.1 Dataset
4.2 Experimental Setup
4.3 Evaluation of Experimental Results
4.3.1 Comparison of Additional Branch Results
4.3.2 Comparison of Branch-Merging Methods
4.3.3 Effect of Reducing the Number of Layers
4.3.4 Effect of Module Modifications
4.3.5 Comparison of Branch Combinations
4.3.6 Comparison of Different Parameters
4.3.7 Comparison with Existing Sign Language Word Recognition Models
5.1 Conclusion
5.2 Future Work
