
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

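Author: 吳冠霖 (Kuan-Lin Wu)
Thesis title: 基於時序注意力分群模型以弱監督學習偵測異常行為
Thesis title (English): Weakly Supervised Learning for Video Anomaly Detection via Temporal Attention Clustering
Advisor: 李明穗 (Ming-Sui Lee)
Committee members: 莊永裕 (Yung-Yu Chuang), 胡敏君 (Ming-Chun Hu)
Oral defense date: 2021-07-21
Degree: Master's
Institution: National Taiwan University (國立臺灣大學)
Department: Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2021
Graduation academic year: 109 (ROC calendar)
Language: English
Pages: 38
Keywords: Anomaly Detection (異常偵測), Weakly Supervised Learning (弱監督學習), Multiple Instance Learning (多示例學習)
DOI: 10.6342/NTU202102822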
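Abstract (translated from the Chinese):
In current video surveillance systems, footage can often only be replayed and reviewed after an incident has occurred. Learning of accidents or crimes as they happen requires substantial manpower to watch the screens, which is also inefficient. This thesis aims to detect anomalous events in video automatically and promptly, such as robbery, abuse, or violence, so that further harm can be prevented. However, such data are scarce and hard to collect, and manually annotating the class of every frame is even more costly. This thesis therefore proposes a weakly supervised model that is trained on videos labeled only with whether they contain an anomalous event, and that learns to find the key frames that actually contain abnormal behavior.
The proposed model comprises: (1) a temporal attention mechanism, which lets the model infer which events along the timeline deserve attention by predicting whether the video contains abnormal behavior; (2) a clustering model, which classifies frames by their own features to separate normal frames from abnormal ones; and (3) an entropy smoothness loss, training with which makes the predictions temporally consistent and therefore more plausible.
In experiments on UCF-Crime and ShanghaiTech, two datasets of different scale and type, the proposed model shows highly competitive performance, and on UCF-Crime it achieves state-of-the-art results compared with existing methods.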
Abstract (English):
In current video surveillance systems, detecting accidents or crimes in time requires people to monitor the screens, which is expensive and inefficient. Such data are also rare and hard to collect, and manually labeling individual frames is costly. This thesis therefore proposes a weakly supervised learning model that is trained on video-level ground truth, which indicates only whether a video contains abnormal events, and that finds the key frames containing abnormal behavior.
The proposed model includes (1) a temporal attention module that helps the model detect the key instances, (2) a cluster module that divides the video according to segment features, and (3) an entropy smoothness loss that helps stabilize the prediction curve.
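To make the first and third components more concrete, the following is a minimal Python sketch of two generic building blocks: attention-weighted pooling of per-segment anomaly scores into a video-level score, and a temporal smoothness penalty on neighbouring scores. This record does not give the thesis's actual formulation, so everything here is an assumption for illustration: the function names `attention_pool` and `temporal_smoothness` are hypothetical, the attention logits are supplied by hand rather than learned, and the squared-difference penalty is a common stand-in, not the entropy smoothness loss itself.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(segment_scores, attention_logits):
    """Aggregate per-segment scores into one video-level score.

    The weights come from a softmax over the (hypothetical) attention
    logits, so segments with high logits dominate the video-level score.
    """
    weights = softmax(attention_logits)
    video_score = sum(w * s for w, s in zip(weights, segment_scores))
    return video_score, weights


def temporal_smoothness(segment_scores):
    """Sum of squared differences between neighbouring segment scores.

    A generic smoothness penalty used as a stand-in; it is NOT the
    entropy smoothness loss of the thesis, whose form is not given here.
    """
    pairs = zip(segment_scores, segment_scores[1:])
    return sum((later - earlier) ** 2 for earlier, later in pairs)


# Toy video with five segments; segments 2 and 3 look anomalous.
scores = [0.1, 0.1, 0.9, 0.8, 0.1]
logits = [0.0, 0.0, 3.0, 3.0, 0.0]
video_score, weights = attention_pool(scores, logits)
penalty = temporal_smoothness(scores)
```

In this toy example the pooled video-level score is pulled toward the two high-attention segments, which is how a video-level label can push a model to single out key segments. The penalty is zero for a flat score curve and grows with abrupt jumps between neighbouring segments.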
Experiments are conducted on the UCF-Crime and ShanghaiTech datasets. Notably, the model achieves a state-of-the-art result on the UCF-Crime dataset (AUC 84.75%).
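For readers unfamiliar with the reported metric, the sketch below computes the area under the ROC curve (AUC) from binary labels and predicted anomaly scores, using the pairwise ranking definition. It assumes the evaluation is done per frame, which this record does not state explicitly; the function name `roc_auc` and the toy data are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels: iterable of 0 (normal) or 1 (anomalous)
    scores: predicted anomaly scores, same length as labels
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: six frames, three of them anomalous.
frame_labels = [0, 0, 1, 1, 0, 1]
frame_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
auc = roc_auc(frame_labels, frame_scores)  # 8 of 9 pairs ranked correctly
```

The pairwise loop is quadratic in the number of frames, so it suits small illustrations only; a sort-based or library implementation would be the usual choice for full datasets.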
Table of contents:
Oral Examination Committee Certification
Acknowledgements
Abstract (Chinese)
Abstract
Contents
List of Figures
List of Tables
1. Introduction
2. Related Work
2.1 Video Anomaly Detection
2.1.1 Unsupervised Learning Method
2.1.2 Weakly-Supervised Learning Method
2.2 Multiple Instance Learning for Key Instance Detection
2.3 Action Analysis
3. Method
3.1 Data Organization
3.2 Feature Backbone
3.3 Temporal Attention Module
3.4 Cluster Mechanism
3.5 Loss Function
4. Experiment
4.1 Datasets
4.1.1 UCF-Crime
4.1.2 ShanghaiTech
4.2 Evaluation Metrics
4.3 Implementation Details
4.4 Experiments on UCF-Crime
4.4.1 Quantitative comparison
4.4.2 Qualitative analysis
4.5 Experiments on ShanghaiTech
4.5.1 Quantitative comparison
4.5.2 Qualitative analysis
4.6 Ablation Study
4.6.1 Temporal Convolution Network
4.6.2 Temporal Attention Module
4.6.3 Cluster Mechanism
4.6.4 Entropy Smoothness Loss
5. Conclusion
References