
National Digital Library of Theses and Dissertations in Taiwan

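Graduate student: 林俊慶
Graduate student (romanized): LIN, JUN-QING
Thesis title (Chinese): 結合離群值偵測與特徵選取改善預測模型性能
Thesis title (English): Improving Performance of Prediction Model with Outlier Detection and Feature Selection
Advisor: 楊鎮華
Degree: Master's
Institution: National Central University
Department: Department of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar)
Language: Chinese
Number of pages: 42
Chinese keywords: 離群值偵測, 特徵選取, 多元線性迴歸, 學習成效預測
English keywords: Outlier Detection, Feature Selection, Multiple Linear Regression, Learning Performance Prediction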
Identifying at-risk students early and accurately, so that teachers can intervene in time and improve students' learning performance, is a central concern of much related research.
A blended course combines online and offline learning: unlike a traditional offline course, students can also study in various ways through an online learning platform. In doing so they leave many records, such as homework grades, video-viewing behavior, online activity frequency, and online quiz grades. This thesis therefore applies data mining and machine learning techniques to learning-activity data collected from a blended calculus course and uses multiple linear regression to predict students' final grades.
Related research indicates that the accuracy of a prediction model is easily affected by outliers. This thesis therefore uses the RANSAC algorithm as its outlier detection method and removes the detected outliers from the data. To improve accuracy further once the outliers have been removed, it uses the t-test as its feature selection method, retaining only the key features that have a significant effect on the final grade.
With the proposed outlier detection and feature selection process, the prediction error falls from 15.516 points to 4.571 points, an improvement of about 70%.
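The process described in the abstract (RANSAC outlier removal, then t-test feature selection, then multiple linear regression) might be sketched as follows. This is a minimal illustration on synthetic data, not the thesis's implementation: the generated data, the residual threshold, the subsample size, the number of trials, the |t| > 2 cutoff (roughly a two-sided 5% level for large samples), and the use of in-sample RMSE as the error measure are all assumptions made for the example. Only the 15.516 and 4.571 figures come from the abstract.

```python
import numpy as np

def design(X):
    """Prepend an intercept column to the feature matrix."""
    return np.column_stack([np.ones(len(X)), X])

def fit_ols(X, y):
    """Ordinary least squares; returns coefficients with the intercept first."""
    beta, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return beta

def rmse(X, y, beta):
    """Root-mean-square error of the fitted model on (X, y)."""
    return float(np.sqrt(np.mean((y - design(X) @ beta) ** 2)))

def ransac_inliers(X, y, threshold, sample_size, n_trials=300, seed=0):
    """Boolean mask of the largest consensus set found over random subsamples."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(y), dtype=bool)
    for _ in range(n_trials):
        idx = rng.choice(len(y), size=sample_size, replace=False)
        beta = fit_ols(X[idx], y[idx])
        mask = np.abs(y - design(X) @ beta) < threshold
        if mask.sum() > best.sum():
            best = mask
    return best

def slope_t_statistics(X, y):
    """t statistic of each slope in the OLS fit (intercept excluded)."""
    A = design(X)
    beta = fit_ols(X, y)
    resid = y - A @ beta
    sigma2 = resid @ resid / (A.shape[0] - A.shape[1])
    std_err = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    return (beta / std_err)[1:]

def relative_improvement(before, after):
    """Fraction by which the error shrinks."""
    return (before - after) / before

# Synthetic stand-in for the course data (hypothetical): 120 students,
# 3 activity features, of which only the first two drive the final grade.
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 3))
y = 60 + 8 * X[:, 0] + 5 * X[:, 1] + rng.normal(scale=2.0, size=120)
outlier_idx = np.arange(10)
y[outlier_idx] -= 50  # ten students whose grades do not follow the trend

# Baseline: regression on all students and all features.
rmse_before = rmse(X, y, fit_ols(X, y))

# Step 1: keep only the RANSAC consensus set.
inlier_mask = ransac_inliers(X, y, threshold=6.0, sample_size=10)
X_in, y_in = X[inlier_mask], y[inlier_mask]

# Step 2: keep features whose slope is significant (assumed cutoff |t| > 2).
T_CRIT = 2.0
t_stats = slope_t_statistics(X_in, y_in)
selected = [j for j in range(X.shape[1]) if abs(t_stats[j]) > T_CRIT]

# Step 3: refit the regression on inliers and selected features only.
X_sel = X_in[:, selected]
rmse_after = rmse(X_sel, y_in, fit_ols(X_sel, y_in))

print("inliers kept:", int(inlier_mask.sum()), "of", len(y))
print("selected feature indices:", selected)
print("RMSE before:", round(rmse_before, 3), "after:", round(rmse_after, 3))
# Figures reported in the abstract: about 70.5%
print(f"reported improvement: {relative_improvement(15.516, 4.571):.1%}")
```

Note that in this sketch the error after cleaning is measured only on the retained inliers, so part of the drop comes from excluding hard-to-predict students. The thesis's table of contents lists a cross-validation step, which this example does not reproduce.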
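Abstract (Chinese)
Abstract (English)
List of Figures
List of Tables
1. Introduction
2. Literature Review
2.1 Random Sample Consensus (RANSAC)
2.2 Feature Selection
2.3 Learning Risk Prediction
2.4 Summary
3. Blended Calculus Course
4. Methods
4.1 Data Collection
4.2 Data Preprocessing
4.2.1 Filling Missing Values
4.2.2 Data Integration
4.2.3 Data Aggregation
4.3 Outlier Detection
4.4 Feature Selection
4.5 Residual Analysis
4.6 Regression Analysis
4.7 Cross-Validation
4.8 Data Standardization
5. Results and Discussion
5.1 Research Question 1
5.1.1 Results of the Process Without Outlier Removal
5.1.2 Results of the Process With Outlier Removal
5.1.3 Summary of Results
5.2 Research Question 2
5.2.1 Results of the Process With Feature Selection Added
5.2.2 Summary of Results
5.3 Research Question 3
5.3.1 Results of the Process on the "week 1 ~ week 3" Dataset
5.3.2 Summary of Results
6. Conclusion
References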