臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)

Detailed record

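Author: 連立生
Author (romanized): LIEN, LI-SHENG
Thesis title (Chinese): 語音辨識在分散式系統的應用開發
Thesis title (English): Application of Speech Recognition in a Decentralized System
Advisor: 李福星
Advisor (romanized): LEE, FU-SHIN
Oral defense committee: 李春穎, 蕭俊祥
Oral defense committee (romanized): LEE, CHUN-YING; XIAO, JUN-XIANG
Oral defense date: 2022-07-21
Degree: Master's
Institution: 華梵大學 (Huafan University)
Department: 智慧生活科技學系碩士班 (Master's Program, Department of Smart Living Technology)
Discipline: Computer Science
Sub-discipline: Software Development
Thesis type: Academic thesis
Publication year: 2022
Graduation academic year: 110 (ROC calendar; academic year 2021-2022)
Language: Chinese
Pages: 41
Chinese keywords: 智慧物聯網, 語音辨識, 語意識別, 分散式網絡
English keywords: AIoT, Speech Recognition, Semantic Recognition, Decentralized Network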
Voice-control systems for home appliances today are mostly centralized architectures built around a smart speaker; when the smart speaker fails, the entire control system fails with it. This thesis proposes a decentralized AIoT control architecture that uses two Raspberry Pi devices, each equipped with a microphone module and connected to the same local area network. Each device is preset with a different set of control commands and builds a latent semantic indexing (LSI) model from it. Over UDP, the devices exchange their command sets and rebuild the LSI model so that it covers the command sets of all devices. Each device independently uses the Google Cloud Chinese speech recognition service over the Internet to convert the user's speech into Traditional Chinese text, then uses the LSI model to find the command most similar to that text. If the similarity score exceeds a threshold, the command is treated as valid and the device that owns it is notified, so that a device that did not hear the spoken command itself still receives the control command through notification from other devices on the network.
The experimental results show that speech recognition devices with no master-slave relationship can quickly exchange command sets and command notifications over the network, effectively forming a decentralized voice command recognition system.
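The flow the abstract describes (exchange command sets over UDP, merge them, match recognized text against the merged set, notify the owning device) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the JSON message layout, the `Device` class, the device names, the sample commands, and the 0.5 threshold are all hypothetical, and cosine similarity over character bigrams is swapped in for the latent semantic indexing model so that the sketch runs with the Python standard library only. Both "devices" live in one process and talk over the loopback interface, and speech-to-text is assumed to have already happened.

```python
import json
import math
import socket
from collections import Counter

THRESHOLD = 0.5  # hypothetical cut-off; the abstract does not state the value used


def bigrams(text):
    """Character bigram counts; a crude stand-in for segmentation plus LSI."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine(a, b):
    """Cosine similarity between two bigram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Device:
    """One peer: holds its own commands plus any learned from other peers."""

    def __init__(self, name, commands):
        self.name = name
        self.index = {c: name for c in commands}  # command text -> owning device
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.bind(("127.0.0.1", 0))  # let the OS pick a free UDP port
        self.sock.settimeout(2.0)
        self.addr = self.sock.getsockname()

    def _send(self, peer_addr, message):
        self.sock.sendto(json.dumps(message).encode("utf-8"), peer_addr)

    def send_commands(self, peer_addr):
        """Send only the commands this device owns (hypothetical message layout)."""
        own = [c for c, owner in self.index.items() if owner == self.name]
        self._send(peer_addr, {"type": "commands", "device": self.name, "commands": own})

    def notify(self, peer_addr, command):
        """Tell the owning device that one of its commands was recognized."""
        self._send(peer_addr, {"type": "notify", "device": self.name, "command": command})

    def receive(self):
        """Read one datagram; merge the peer's commands if it carries a command set."""
        data, _ = self.sock.recvfrom(65535)
        message = json.loads(data.decode("utf-8"))
        if message["type"] == "commands":
            for c in message["commands"]:
                self.index[c] = message["device"]
        return message

    def match(self, text):
        """Return (command, owner, score) for the best match, or None below THRESHOLD."""
        if not self.index:
            return None
        query = bigrams(text)
        score, best = max((cosine(query, bigrams(c)), c) for c in self.index)
        if score < THRESHOLD:
            return None
        return best, self.index[best], score

    def close(self):
        self.sock.close()


def demo():
    a = Device("pi-a", ["打開客廳電燈", "關閉客廳電燈"])
    b = Device("pi-b", ["打開電風扇", "關閉電風扇"])
    try:
        a.send_commands(b.addr)
        b.receive()
        b.send_commands(a.addr)
        a.receive()
        hit = a.match("請打開電風扇")    # text heard by pi-a, command owned by pi-b
        miss = a.match("今天天氣如何")   # unrelated sentence, expected to be rejected
        a.notify(b.addr, hit[0])
        notice = b.receive()
        return hit, miss, notice
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo())
```

In this sketch each peer keeps a single dictionary from command text to owning device, so the lookup that decides whom to notify is the same lookup that scores the sentence. A real deployment would replace `bigrams` and `cosine` with an actual LSI model and would need to handle lost datagrams, since UDP gives no delivery guarantee.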
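Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
1. Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Literature Review
1.4 Thesis Organization
2. Theory and Methods
2.1 Speech Recognition
2.1.1 Pattern Recognition
2.1.2 Speech Recognition Tasks
2.1.3 Speech Signal Features
2.1.3.1 Speech Signal Digitization
2.1.3.2 Spectrogram
2.1.3.3 Mel-Scale Filter Bank
2.1.3.4 Mel-Frequency Cepstral Coefficients
2.1.4 Speech Recognition Models
2.2 Natural Language Processing
2.2.1 Lexical Analysis
2.2.2 Information Extraction
2.2.3 Text Clustering and Text Classification
2.2.4 Syntactic Parsing
2.2.5 Latent Semantic Analysis
2.3 Internet Protocol Suite
2.4 Raspberry Pi
3. Implementation and Development
3.1 Device System Platform
3.2 Development Preparation
3.2.1 Command Syntax
3.2.2 Device Command Files
3.2.3 Communication Format
3.2.4 Program Modules
3.3 Program Development
3.3.1 Initial Setup
3.3.2 UDP_HANDLER Thread
3.3.3 SPEECH_RECOGNIZER Thread
4. Experimental Process and Analysis
4.1 Single-Device Experiments
4.1.1 Single-Device Initial Startup Experiment
4.1.2 Local Voice Command Experiment
4.1.3 Unknown Voice Command Experiment
4.2 Two-Device Experiments
4.2.1 Two-Device Command Set Exchange Experiment
4.2.2 Two-Device Speech Recognition Command Transmission Experiment
4.3 Analysis of System Experiments
5. Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References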
