
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed Record

Student: CHO, CHE-YU (卓摺宇)
Title: 線上學習環境下背景音影響度智慧辨識系統
Title (English): Intelligent Recognition System for Assessing the Impact of Background Sound in Online Learning Environments
Advisor: WANG, TSAN-PIN (王讚彬)
Committee Members: WANG, PI-CHUNG; WEI, CHING-CHUAN; LEE, GWO-CHUAN; WANG, TSAN-PIN
Oral Defense Date: 2023-07-19
Degree: Master's
Institution: National Taichung University of Education
Department: Department of Computer Science (資訊工程學系)
Discipline: Engineering
Academic Field: Electrical and Computer Engineering
Thesis Type: Academic thesis
Publication Year: 2023
Graduation Academic Year: 111
Language: Chinese
Pages: 71
Keywords (Chinese): 人工智慧; 深度學習; 線上學習; 學習環境影響度辨識; 相似度比對
Keywords (English): Artificial Intelligence; Deep Learning; Online Learning; Learning Environment Impact Level Recognition; Similarity Matching
Record statistics:
  • Cited by: 0
  • Views: 18
  • Rating: (none)
  • Downloads: 2
  • Bookmarked: 0
Since the outbreak of Covid-19, traditional classroom teaching has been severely disrupted, and online distance learning has risen in its place. In this mode of teaching, students are easily affected by their surroundings, and teachers find it difficult to monitor their students' situations. To improve learning efficiency, we developed an Intelligent Recognition System for Assessing the Impact of Background Sound in Online Learning Environments, which monitors and identifies the background sound of each student's environment. The environment is divided into two scenarios. In the first, students listen through speakers; a deep learning model performs prediction and analysis using background-sound similarity and the predictions of a sound recognition model as features. In the second, students listen through headphones; a deep learning model is likewise used, with background-sound decibel levels and the predictions of the sound recognition model as features. After converting these features into impact levels, the system sends a message to the teacher, who can then gauge how strongly each student's learning environment is affected. Because it is impossible to know in advance how a student will listen, an integrated system was developed to address the limitations of relying solely on similarity or on decibel levels; it also uses a deep learning model for prediction and analysis, taking background-sound decibel levels, similarity, and the sound recognition model's predictions as features. Finally, the system was validated on manually labeled test data that was excluded from model training. Under this validation, the system achieved 79.75% accuracy in the speaker scenario and 90.75% accuracy in the headphone scenario, and the integrated system combining both scenarios achieved 88.5% accuracy.
Table of Contents
Abstract i
Table of Contents iv
List of Tables vi
List of Figures viii
1. Introduction 1
1.1 Background and Significance 1
1.2 Purpose and Motivation 1
1.3 Thesis Organization 2
2. Related Work 3
2.1 Effects of Noise on Humans 3
2.2 MFCCs (Mel-Frequency Cepstral Coefficients) 3
2.3 Chroma [6] 3
2.4 Tonnetz [7] 4
2.5 Decibel Conversion 4
2.6 Sound Similarity 5
2.7 Long Short-Term Memory 6
3. Intelligent Recognition System for Assessing the Impact of Background Sound in Online Learning Environments 7
3.1 Speaker Listening Scenario 7
3.1.1 Similarity-Based Automatic Labeling Mechanism 8
3.1.2 Similarity-Based Background Sound Impact Recognition System 9
3.2 Headphone Listening Scenario 11
3.2.1 Decibel-Based Automatic Labeling Mechanism 11
3.2.2 Decibel-Based Background Sound Impact Recognition System 13
3.3 Integrated Similarity- and Decibel-Based Background Sound Impact Recognition System 13
4. Training the Models of the Intelligent Recognition System 15
4.1 Hardware and Experimental Environment 15
4.2 Background Sound Recognition Model: Data Collection and Training 15
4.3 Similarity-Based Automatic Labeling: Data Collection and Training 20
4.4 Decibel-Based Automatic Labeling: Data Collection and Training 27
4.5 Integrated Automatic Labeling: Data Collection and Training 29
5. Model Prediction Results and Analysis 34
5.1 Accuracy Analysis of the Similarity-Based Automatic Labeling Method 34
5.2 Accuracy Analysis of the Decibel-Based Automatic Labeling Method 42
5.3 Accuracy Analysis of the Integrated Automatic Labeling Method 44
6. Conclusion and Future Work 56
6.1 Conclusion 56
6.2 Future Work 57
References 58


[1] Schlittmeier, S.J., Feil, A., Liebl, A., & Hellbrück, J.R. "The impact of road traffic noise on cognitive performance in attention-based tasks depends on noise level even within moderate-level ranges." Noise Health, 17(76), pp. 148-157, 2015.
[2] Sepehri, S., Aliabadi, M., Golmohammadi, R., & Babamiri, M. "The Effects of Noise on Human Cognitive Performance and Thermal Perception under Different Air Temperatures." Journal of Research in Health Sciences, 19(4), e00464, 2019.
[3] 吳國瑋, 張宗倫, & 黃柏琮. "Online Real-Time Concentration Detection and Feedback System" (線上專注度及時偵測回饋系統), Department of Computer Science capstone project report, class of 111, National Taichung University of Education, 2021.
[4] Jafari, M.J., Khosrowabadi, R., Khodakarim, S., & Mohammadian, F. "The Effect of Noise Exposure on Cognitive Performance and Brain Activity Patterns." Open Access Macedonian Journal of Medical Sciences, 7(17), pp. 2924-2931, 2019.
[5] McFee, B., et al. "librosa: Audio and music signal analysis in python." Proceedings of the 14th Python in Science Conference, Vol. 8, 2015.
[6] Ellis, D.P.W. "Chroma feature analysis and synthesis," 2007.
[7] Harte, C., Sandler, M., & Gasser, M. "Detecting Harmonic Change in Musical Audio." Proceedings of the 1st ACM Workshop on Audio and Music Computing Multimedia, Santa Barbara, CA, USA: ACM Press, pp. 21-26, 2006.
[8] Foote, J. "Visualizing music and audio using self-similarity." Proceedings of the Seventh ACM International Conference on Multimedia (Part 1), pp. 77-80, 1999.
[9] Rahutomo, F., Kitasuka, T., & Aritsugi, M. "Semantic cosine similarity." The 7th International Student Conference on Advanced Science and Technology (ICAST), Vol. 4, No. 1, 2012.
[10] Thomé, C., Piwell, S., & Utterbäck, O. "Musical Audio Similarity with Self-supervised Convolutional Neural Networks." arXiv preprint arXiv:2202.02112, 2022.
[11] Manocha, P., Jin, Z., Zhang, R., & Finkelstein, A. "CDPAM: Contrastive learning for perceptual audio similarity." IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 196-200, 2021.
[12] Lin, Q., Yin, R., Li, M., Bredin, H., & Barras, C. "LSTM based Similarity Measurement with Spectral Clustering for Speaker Diarization." Annual Conference of the International Speech Communication Association (Interspeech), 2019.
[13] Yu, Y., Si, X., Hu, C., & Zhang, J. "A review of recurrent neural networks: LSTM cells and network architectures." Neural Computation, 31(7), pp. 1235-1270, 2019.
[14] Sherstinsky, A. "Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network." Physica D: Nonlinear Phenomena, 404, 2020.
[15] Likas, A., Vlassis, N., & Verbeek, J.J. "The global k-means clustering algorithm." Pattern Recognition, pp. 451-461, 2003.
[16] Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., et al. "TensorFlow: a system for Large-Scale machine learning." 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265-283, 2016.
[17] Chollet, F. "Keras," 2015.
[18] Hunter, J.D. "Matplotlib: A 2D Graphics Environment." Computing in Science & Engineering, 9(3), pp. 90-95, 2007.
[19] Pedregosa, F., et al. "Scikit-learn: Machine Learning in Python." JMLR, 12, pp. 2825-2830, 2011.
[20] Rafii, Z., Liutkus, A., Stöter, F.R., Mimilakis, S.I., & Bittner, R. "The MUSDB18 corpus for music separation," 2017.
[21] Mesaros, A., Heittola, T., & Virtanen, T. "TAU Urban Acoustic Scenes 2019, Development dataset" [Data set]. Zenodo, 2019.
[22] Nagrani, A., Chung, J.S., & Zisserman, A. "VoxCeleb: A Large-Scale Speaker Identification Dataset." Proc. Interspeech, 2017.
