National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

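Author: 張志瑜
Author (English): Chih-Yu Chang
Title: 基於隱藏式馬可夫模型之唇語辨識系統
Title (English): A Lipreading System Based on Hidden Markov Model
Advisor: 謝景棠
Advisor (English): Ching-Tang Hsieh
Degree: Master's
Institution: Tamkang University (淡江大學)
Department: Department of Electrical Engineering, Master's Program
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2009
Academic year of graduation:
Language: Chinese
Number of pages:
Keywords: lipreading, hidden Markov model, chromaticity color model, K-means algorithm
Keywords (English): Lipreading, HMM, chromaticity color space, K-means algorithm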


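Abstract:

Conventional speech recognition systems that rely on audio information are already common in everyday applications such as voice-controlled switches. Their greatest weakness, however, is susceptibility to noise. Better recording equipment, such as directional microphones, can reduce noise interference, but only at a high cost. Many researchers have therefore proposed improvements, including speech recognition based on visual information, that is, lipreading. A lipreading system is unaffected by acoustic noise and can also be combined with an audio-based speech recognition system to raise its recognition rate. The aim of this study is to design a lipreading system that combines the chromaticity color space with the K-means algorithm to segment lip images, extracts lip-shape features from the segmented region, and uses hidden Markov models to improve the recognition rate. The experiments compare lip segmentation techniques based on different color spaces and the recognition rates obtained with different features.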
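The segmentation step described in the abstract (chromaticity conversion followed by K-means thresholding) might look roughly like the sketch below. It is not the thesis implementation: the record does not give the exact chromaticity formula or cluster count, so the normalized red component r = R / (R + G + B) and two clusters are assumptions, and the function names `red_chromaticity`, `kmeans_threshold`, and `segment_lips` are hypothetical. The Gaussian smoothing and morphological processing listed in the table of contents are omitted.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def red_chromaticity(pixel: Pixel) -> float:
    """Normalized red r = R / (R + G + B) (assumed formula).

    A pure black pixel has no defined chromaticity; map it to the neutral 1/3.
    """
    r, g, b = pixel
    total = r + g + b
    return r / total if total else 1.0 / 3.0


def kmeans_threshold(values: Sequence[float], max_iter: int = 100) -> float:
    """Two-cluster 1-D K-means; returns the midpoint between the two centroids.

    In one dimension, assigning each value to its nearest centroid is the
    same as splitting at the midpoint between the centroids.
    """
    lo, hi = min(values), max(values)  # initial centroids
    if lo == hi:
        return lo  # uniform input: nothing to separate
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        low = [v for v in values if v <= mid]   # assignment step
        high = [v for v in values if v > mid]
        new_lo = sum(low) / len(low)            # update step
        new_hi = sum(high) / len(high)
        if new_lo == lo and new_hi == hi:
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0


def segment_lips(image: List[List[Pixel]]) -> List[List[int]]:
    """Binary mask: 1 where red chromaticity exceeds the K-means threshold."""
    chroma = [[red_chromaticity(p) for p in row] for row in image]
    threshold = kmeans_threshold([v for row in chroma for v in row])
    return [[1 if v > threshold else 0 for v in row] for row in chroma]


if __name__ == "__main__":
    skin, lip = (200, 150, 130), (180, 60, 70)  # made-up sample colors
    for row in segment_lips([[skin, lip, lip], [skin, skin, lip]]):
        print(row)
```

Thresholding on chromaticity rather than raw RGB removes most of the dependence on brightness, which is the usual reason for choosing a normalized color space for skin and lip separation.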

Abstract (English):

Conventional speech recognition systems are now used in many applications. However, they are susceptible to acoustic noise, which lowers the recognition rate in noisy conditions. Researchers have therefore proposed speech recognition based solely on visual features, that is, lipreading, to avoid the effect of acoustic noise. A lipreading system can also assist a conventional speech recognition system and raise its recognition rate. In this research, we propose a lipreading system whose lip image segmentation stage combines the chromaticity color space with the K-means algorithm, and whose recognition stage uses hidden Markov models to improve the recognition rate. In the experiments, we compare our method with other color-based lip segmentation methods and compare the recognition rates of different features.
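The recognition stage can be illustrated with a minimal sketch of HMM-based word classification: one model per word, each scored with the Viterbi algorithm, and the best-scoring word chosen. The record does not state whether the thesis uses discrete or continuous observations, how many states each model has, or how features are quantized, so the discrete-symbol model, the class `DiscreteHMM`, and the helper `classify` are assumptions made for illustration. Parameter training with the EM algorithm listed in the table of contents is not shown.

```python
import math
from typing import Dict, List, Sequence, Tuple

NEG_INF = float("-inf")


def _log(p: float) -> float:
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF


class DiscreteHMM:
    """HMM over integer observation symbols, stored in the log domain."""

    def __init__(self, start: Sequence[float],
                 trans: Sequence[Sequence[float]],
                 emit: Sequence[Sequence[float]]) -> None:
        self.log_start = [_log(p) for p in start]
        self.log_trans = [[_log(p) for p in row] for row in trans]
        self.log_emit = [[_log(p) for p in row] for row in emit]

    def viterbi(self, obs: Sequence[int]) -> Tuple[float, List[int]]:
        """Return (log-probability, state path) of the most likely path."""
        if not obs:
            raise ValueError("observation sequence must be non-empty")
        n = len(self.log_start)
        delta = [self.log_start[s] + self.log_emit[s][obs[0]] for s in range(n)]
        backptrs: List[List[int]] = []
        for o in obs[1:]:
            new_delta, ptr = [], []
            for s in range(n):
                # Best predecessor of state s at this time step.
                prev = max(range(n), key=lambda q: delta[q] + self.log_trans[q][s])
                ptr.append(prev)
                new_delta.append(delta[prev] + self.log_trans[prev][s]
                                 + self.log_emit[s][o])
            delta = new_delta
            backptrs.append(ptr)
        last = max(range(n), key=lambda s: delta[s])
        path = [last]
        for ptr in reversed(backptrs):  # backtrack from the final state
            path.append(ptr[path[-1]])
        path.reverse()
        return delta[last], path


def classify(models: Dict[str, DiscreteHMM], obs: Sequence[int]) -> str:
    """Pick the word whose model gives the highest Viterbi score."""
    return max(models, key=lambda word: models[word].viterbi(obs)[0])


if __name__ == "__main__":
    # Two toy left-to-right models with made-up parameters.
    trans = [[0.5, 0.5], [0.0, 1.0]]
    models = {
        "a": DiscreteHMM([1.0, 0.0], trans, [[0.9, 0.1], [0.1, 0.9]]),
        "b": DiscreteHMM([1.0, 0.0], trans, [[0.1, 0.9], [0.9, 0.1]]),
    }
    print(classify(models, [0, 0, 1, 1]))
```

Working in the log domain avoids numerical underflow on long frame sequences, which is why the probabilities are converted once in the constructor.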

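Table of Contents
Chinese Abstract I
English Abstract II
Table of Contents III
List of Figures VI
List of Tables VIII
Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Research Motivation 2
1.3 Chapter Overview 3
Chapter 2 Related Work 4
2.1 Lip Segmentation 4
2.1.1 Model-Based Lip Segmentation 4
2.1.2 Color-Based Lip Segmentation 6
2.1.3 Clustering-Based Lip Segmentation 9
2.2 Lip Features 11
2.3 Conclusion 12
Chapter 3 System Architecture 14
3.1 System Flow 14
3.2 Image Preprocessing 15
3.2.1 Chromaticity Color Space Conversion 15
3.2.2 Image Smoothing 17
3.3 Lip Segmentation 18
3.3.1 Lip Segmentation Using the K-means Algorithm 18
3.3.2 Morphological Processing 19
3.4 Lip Feature Extraction 20
Chapter 4 Recognition System 24
4.1 Hidden Markov Model 25
4.2 Viterbi Algorithm 27
4.3 EM (Expectation-Maximization) Algorithm 28
Chapter 5 Experimental Results 30
5.1 Experimental Environment 30
5.2 Lip Segmentation Experiments 30
5.3 Lipreading Recognition Experiments 38
5.4 Results and Discussion 41
Chapter 6 Conclusion and Future Work 44
References 46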

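List of Figures
Figure 2.1 RGB color space 7
Figure 2.2 HSI color space 8
Figure 2.3 Illustration of the {H1,V1,V2,V3} and {S,A} lip features 11
Figure 2.4 Radius lip features 12
Figure 3.1 System flowchart 14
Figure 3.2 Color space conversion of a lip image 16
Figure 3.3 Gaussian low-pass filtering result 17
Figure 3.4 Binarization result using a threshold obtained with the K-means algorithm 19
Figure 3.5 Morphological operations 20
Figure 3.6 Histogram projection 21
Figure 3.7 All lip feature points 21
Figure 3.8 Lip feature points 22
Figure 3.9 Lip feature points 22
Figure 3.10 {H1,V1,V2,V3} feature extraction result 23
Figure 4.1 State transition diagram of a hidden Markov model 26
Figure 4.2 Viterbi procedure diagram 27
Figure 5.1 Comparison of lip segmentation results for Frame 1 of the utterance "零" (zero) 31
Figure 5.2 Comparison of lip segmentation results for Frame 27 of the utterance "一" (one) 32
Figure 5.3 Comparison of lip segmentation results for Frame 14 of the utterance "二" (two) 33
Figure 5.4 Comparison of lip segmentation results for Frame 18 of the utterance "四" (four) 34
Figure 5.5 Comparison of lip segmentation results for Frame 25 of the utterance "五" (five) 35
Figure 5.6 Comparison of lip segmentation results for Frame 25 of the utterance "八" (eight) 36
Figure 5.7 Comparison of lip segmentation results for Frame 30 of the utterance "九" (nine) 37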

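List of Tables
Table 5.1 Recognition rate of feature A 39
Table 5.2 Recognition rate of feature B 39
Table 5.3 Recognition rate of feature C 40
Table 5.4 Recognition rate of feature D 40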

[1]N. Deshmukh, A. Ganapathiraju and J. Picone, “Hierarchical search for large-vocabulary conversational speech recognition,” IEEE Signal Processing Magazine, vol. 16, Sept. 1999, pp. 84-107.

[2]D. Nguyen, D. Halupka, P. Aarabi and A. Sheikholeslami, “Real-time face detection and lip feature extraction using field-programmable gate arrays,” IEEE Trans. on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 36, Aug. 2006, pp. 902-912.

[3]T. Chen, and R. R. Rao, “Audio-Visual Integration in Multimodal Communication,” Proc. of the IEEE, vol. 86, May 1998, pp. 837-852.

[4]A. S. M. Sohail, and P. Bhattacharya, “Automated lip contour detection using the level set segmentation method,” in Proc. Int. Image Analysis and Processing Conf. (ICIAP’07), Sept. 2007, pp.425-430.

[5]M. Kass, A. Witkin, and D. Terzopoulos, “Snakes: Active Contour Models,” Int. Journal of Computer Vision, vol. 1, 1988, pp. 321-331.

[6]R. C. Gonzalez, and R. E. Woods, Digital Image Processing, 2nd ed., Prentice-Hall, 2002.

[7]X. Zhang, and R.M. Mersereau, “Lip Feature Extraction Towards an Automatic Speechreading System,” in Proc. of IEEE Int. Image Processing Conf., Sept. 2000, pp. 226-229.

[8]A. Hurlbert and T. Poggio, “Synthesizing a color algorithm from examples,” Science, New Series, vol. 239, Jan. 1988, pp. 482-485.

[9]H. J. Trussell, M. J. Vrhel and E. Saber, “Color Image Processing [basics and special issue overview],” IEEE Signal Processing Mag., vol. 22, Jan. 2005, pp. 14-22.

[10]M. Sadeghi, J. Kittler and K. Messer, “Segmentation of lip pixels for lip tracker initialization,” in Proc. of Int. Image Processing Conf., Oct. 2001, pp. 7-10.

[11]M. N. Q. Kaynak, A. D. Cheok, K. Sengupta, Z. Jian and K. C. Chung, “Analysis of lip geometric features for audio-visual speech recognition,” IEEE Trans. on Systems, Man, and Cybernetics Part A: Systems and Humans., vol. 34, Jul. 2004, pp. 564-570.

[12]L. G. Silveira, J. Facon, and D. L. Borges, “Visual Speech Recognition: a Solution from Feature Extraction to Words Classification,” in Proc. of Int. Computer Graphics and Image Processing Conf., Oct. 2003, pp. 399-405.

[13]M. J. Lyons, C. H. Chan, and N. Tetsutani, “Mouth Type: text entry by hand and mouth,” in Proc. of Human Factors in Computing Systems Conf., Apr. 2004, pp. 1383-1386.

[14]T. Saitoh, and R. Konishi, “Lip reading based on sampled active contour model,” Image analysis and recognition Conf.(ICIAR’05), Sept. 2005, pp.507-515.

[15]T. Saitoh, and R. Konishi, “Word recognition based on two dimensional lip motion trajectory,” Int. Symposium on Intelligent Signal Processing and Communications (ISPACS’06), Dec. 2006, pp. 287-290.

[16]N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Trans. on Sys., Man., Cyber, vol. 9, Jan. 1979, pp. 62-66.

[17]H. S. Hippert, C. E. Pedreira, and R. C. Souza, “Neural Networks for Short-Term Load Forecasting: A Review and Evaluation,” IEEE Trans. on Power Systems, vol. 16, Feb. 2001, pp. 44-55.

[18]J. Huang, X. Shao, and H. Wechsler, “Face pose discrimination using support vector machines (SVM),” in Proc. of Int. Pattern Recognition Conf.(ICPR’98), Aug. 1998, pp. 155-156.

[19]L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proceedings of the IEEE, vol. 77, Feb. 1989, pp. 257-286.

[20]S. L. Wang, A. W. C. Liew, W. H. Lau, and H. S. Leung, “An Automatic Lipreading System for Spoken Digits With Limited Training Data,” IEEE Trans. on Circuits and Systems for Video Technology, vol. 18, Dec. 2008, pp. 1760-1765.

[21]L. R. Rabiner, and B. H. Juang, Fundamentals of Speech Recognition. Englewood Cliffs, NJ: Prentice-Hall, 1993.
