
National Digital Library of Theses and Dissertations in Taiwan

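Author: 吳振瑋 (WU, ZHEN-WEI)
Thesis title: 基於深度學習之語音識別應用於無人車模擬控制
Thesis title (English): Speech Recognition Based on Deep Learning Applied to Simulation Control of Unmanned Vehicles
Advisor: 賴俊吉 (LAI, CHUN-CHI)
Oral defense committee: 林永欽 (LIN, YUNG-CHIN), 吳文誌 (WU, WEN-CHIH)
Oral defense date: 2023-06-24
Degree: Master's
Institution: National Yunlin University of Science and Technology (國立雲林科技大學)
Department: Department of Electrical Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2023
Graduation academic year: 111 (ROC calendar)
Language: Chinese
Number of pages: 94
Keywords: robotic operating system; optical radar (LiDAR); simultaneous localization and mapping; mel frequency cepstral coefficient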
This thesis proposes a speech model for controlling the movement of a self-propelled vehicle in a simulated environment. In the voice-control part, specific spoken commands move the vehicle to the corresponding target location. The speech model uses Mel-frequency cepstral coefficients (MFCC) to filter background noise out of the input signal and to extract the acoustic features that best characterize the speaker's voice; these features serve as training samples. Transformer and LSTM models are then trained separately on these features to achieve speech recognition.
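
The MFCC front end named above can be sketched in a few lines. This is a minimal, standard-library-only illustration of the usual pipeline (pre-emphasis, frame blocking, Hamming window, spectrum, Mel filter bank, logarithm, DCT), not the thesis's implementation. The function names, the pre-emphasis factor 0.97, and the toy frame and filter sizes are all assumptions chosen for readability; a naive DFT stands in for the FFT.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    """y[n] = x[n] - alpha * x[n-1]; boosts high frequencies (alpha is assumed)."""
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]

def frame_signal(signal, frame_len, hop):
    """Split into overlapping frames, dropping an incomplete final frame."""
    return [signal[s:s + frame_len] for s in range(0, len(signal) - frame_len + 1, hop)]

def hamming(frame):
    """Apply a Hamming window to reduce spectral leakage at the frame edges."""
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1))) for i, x in enumerate(frame)]

def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (an FFT would be used in practice)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        spec.append((re * re + im * im) / n)
    return spec

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centre frequencies are equally spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * n_bins
        for k in range(lo, mid):      # rising edge
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling edge
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def dct2(values, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
            for k in range(n_coeffs)]

def mfcc(signal, sample_rate, frame_len=64, hop=32, n_filters=10, n_coeffs=5):
    """Return one list of cepstral coefficients per frame (toy sizes, assumed)."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    feats = []
    for frame in frame_signal(pre_emphasis(signal), frame_len, hop):
        spec = power_spectrum(hamming(frame))
        # Floor the energies so the logarithm is always defined.
        energies = [max(sum(w * p for w, p in zip(row, spec)), 1e-10) for row in bank]
        feats.append(dct2([math.log(e) for e in energies], n_coeffs))
    return feats

sr = 8000
tone = [math.sin(2 * math.pi * 440 * n / sr) for n in range(256)]
feats = mfcc(tone, sr)
print(len(feats), len(feats[0]))  # 7 5
```

Each row of `feats` is the feature vector for one frame; a sequence model such as an LSTM or a Transformer would consume these rows in order.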
The simulation environment is built on the Robot Operating System (ROS) with Gazebo for simulation, and the robot performs simultaneous localization and mapping (SLAM). The robot is first equipped with a 2D light detection and ranging (LiDAR) sensor that scans the current state of the working environment and the distances to obstacles, which makes indoor mapping possible. Adaptive Monte Carlo localization (AMCL) then locates the robot's current position on the resulting 2D grid map. When a target is issued by voice command, the global planner plans a path to the target point on the known grid map, and while the vehicle is moving the local planner uses the dynamic window approach (DWA) to detect and avoid obstacles.
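
To make the local-planner step concrete, the following is a simplified sketch of one dynamic window approach (DWA) control cycle. It is not the ROS local planner used in the thesis. The unicycle motion model, the point obstacles, the scoring weights, and every parameter value are assumptions for illustration, and the braking-distance admissibility test of the original method is reduced to a plain collision check.

```python
import math

def simulate(x, y, theta, v, w, dt, steps):
    """Forward-simulate a unicycle model and return the visited (x, y, theta) poses."""
    traj = []
    for _ in range(steps):
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += w * dt
        traj.append((x, y, theta))
    return traj

def dwa_step(pose, vel, goal, obstacles, *, v_max=0.5, w_max=1.5, a_max=0.5,
             aw_max=2.0, dt=0.1, horizon=2.0, robot_radius=0.2, n_v=7, n_w=15,
             k_goal=1.0, k_clear=0.2, k_speed=0.1):
    """Pick the (v, w) command with the best score inside the dynamic window.

    All limits and weights are hypothetical defaults, not tuned values.
    """
    x, y, theta = pose
    v0, w0 = vel
    # Dynamic window: velocities reachable within one control period dt.
    v_lo, v_hi = max(0.0, v0 - a_max * dt), min(v_max, v0 + a_max * dt)
    w_lo, w_hi = max(-w_max, w0 - aw_max * dt), min(w_max, w0 + aw_max * dt)
    steps = int(round(horizon / dt))
    # Fallback when every candidate collides: command a stop.
    best, best_score = (0.0, 0.0), -math.inf
    for i in range(n_v):
        v = v_lo + (v_hi - v_lo) * i / (n_v - 1)
        for j in range(n_w):
            w = w_lo + (w_hi - w_lo) * j / (n_w - 1)
            traj = simulate(x, y, theta, v, w, dt, steps)
            clearance = min((math.hypot(px - ox, py - oy)
                             for px, py, _ in traj for ox, oy in obstacles),
                            default=math.inf)
            if clearance <= robot_radius:
                continue  # trajectory would hit an obstacle: inadmissible
            ex, ey, _ = traj[-1]
            score = (-k_goal * math.hypot(goal[0] - ex, goal[1] - ey)  # progress to goal
                     + k_clear * min(clearance, 1.0)                   # distance to obstacles
                     + k_speed * v)                                    # prefer faster motion
            if score > best_score:
                best, best_score = (v, w), score
    return best

# Robot at the origin facing +x, moving at 0.2 m/s, with one obstacle near its path.
cmd = dwa_step((0.0, 0.0, 0.0), (0.2, 0.0), (2.0, 0.0), [(1.0, 0.05)])
print(cmd)
```

In a navigation loop this function would run once per control period, with `pose` supplied by the localization step and `goal` taken from the next waypoint of the global path.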

Abstract (Chinese)
Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Motivation and Objectives
1.2 Literature Review
1.3 Research Direction
1.4 Thesis Organization
Chapter 2 Background
2.1 Robot Operating System
2.2 ROS Communication Architecture
2.3 Common ROS Commands
2.4 Coordinate Transforms with TF (Transform)
2.5 ROS Software Tools
2.5.1 Gazebo Simulation
2.5.2 RViz
2.5.3 RQT
Chapter 3 Mapping and Navigation Architecture
3.1 SLAM: Simultaneous Localization and Mapping
3.1.1 GMapping SLAM
3.2 Localization System
3.2.1 Adaptive Monte Carlo Localization (AMCL)
3.3 Costmap
3.3.1 Costmap Layers
3.3.2 Grid Cell Costs
3.4 Navigation
3.4.1 Global Path Planning (Global Planner)
3.4.2 Local Obstacle Avoidance with the Dynamic Window Approach (Local Planner)
Chapter 4 Speech Model Architecture
4.1 Mel-Frequency Cepstral Coefficients (MFCC)
4.1.1 Pre-emphasis
4.1.2 Frame Blocking
4.1.3 Hamming Window
4.1.4 Fast Fourier Transform (FFT)
4.1.5 Mel Filter Bank
4.1.6 Logarithmic Operation
4.1.7 Discrete Cosine Transform (DCT)
4.2 Transformer
4.2.1 Encoder
4.2.2 Decoder
4.2.3 Input Embedding Layer
4.2.4 Positional Encoding
4.2.5 Self-Attention
4.2.6 Multi-Head Attention
4.2.7 Feed-Forward Layer
4.2.8 Residual Connection and Normalization (Add & Norm)
4.3 Long Short-Term Memory (LSTM)
4.3.1 Forget Gate
4.3.2 Input Gate
4.3.3 Updating the Cell State
4.3.4 Output Gate
Chapter 5 Experimental Process and Results
5.1 Feature Extraction Process
5.2 Speech Model Training
5.2.1 Transformer Training Results
5.2.2 LSTM Training Results
Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
References

