National Digital Library of Theses and Dissertations in Taiwan

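Author: 陳立民
Author (romanized): Li-Min Chen
Thesis title (translated from Chinese): Detecting People's Behavior in Supermarkets via Image Captioning Based on Virtual Data
Thesis title (English): BEHAVIOR DETECTION IN SUPERMARKET BASED ON IMAGE CAPTION WITH UNITY
Advisor: 林維亮
Advisor (romanized): Wei-Liang Lin
Committee members: 黃穎聰, 陳冠宏
Committee members (romanized): Yin-Tsung Hwang, Kuan-Hung Chen
Oral defense date: 2021-07-20
Degree: Master's
Institution: National Chung Hsing University (國立中興大學)
Department: Department of Electrical Engineering
Discipline: Engineering
Subject category: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2021
Academic year of graduation: 109 (ROC academic year)
Language: Chinese
Pages: 45
Keywords (translated from Chinese): game engine, semantic description, transformer, attention
Keywords (English): unity, image caption, transformer, attention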
This thesis uses image captioning to describe human behavior in supermarkets. People's current behavior is judged from their body posture and serves as the basis for artificial vehicle navigation. An image captioning algorithm produces a semantic description of each observed behavior, and the keywords in the resulting sentence are then checked to determine whether the behavior is dangerous.

We reconstruct various human postures and scenes in the Unity3D game engine and use the method above to collect a dataset of human activity in supermarkets.

We hope to recognize people's behavior not only in supermarkets but also in other scenes such as streets, hospitals, and factories, and to base the recognition on people's finer-grained movements.
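The keyword-judgment step described in the abstract can be illustrated with a minimal sketch. It assumes the captioning model returns an English sentence. The keyword list, the function names `matched_keywords` and `is_dangerous`, and the sample captions are all hypothetical and are not taken from the thesis.

```python
import re

# Hypothetical keyword list; the thesis record does not state which words are used.
DANGER_KEYWORDS = frozenset({"running", "falling", "lying", "fighting"})


def matched_keywords(caption, keywords=DANGER_KEYWORDS):
    """Return the danger keywords found in a caption, in order of appearance."""
    # Lowercase the caption and split it into alphabetic tokens.
    tokens = re.findall(r"[a-z]+", caption.lower())
    return [token for token in tokens if token in keywords]


def is_dangerous(caption, keywords=DANGER_KEYWORDS):
    """Flag a caption as dangerous if it contains at least one danger keyword."""
    return len(matched_keywords(caption, keywords)) > 0


if __name__ == "__main__":
    # Illustrative captions only; a real system would take them from the captioning model.
    for caption in ("A man is falling next to the shelf.",
                    "A woman is pushing a shopping cart."):
        print(caption, "->", "dangerous" if is_dangerous(caption) else "safe")
```

Exact token matching keeps the check cheap enough for an embedded device, but it misses inflected forms such as "fell"; a real deployment would need a keyword list tuned to the captioning vocabulary.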
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation and Background
1.2 Introduction to Unity3D
Chapter 2 Related Tools
2.1 Introduction to Convolutional Neural Networks
2.1.1 Convolutional Layer
2.1.2 Pooling Layer
2.1.3 Fully Connected Layer
2.1.4 Activation Function
2.1.5 Softmax Layer
2.1.6 Loss Function
2.2 ResNet Model [4]
2.2.1 ResNet101
2.3 Natural Language Processing
2.4 Image Caption
2.4.1 Image Caption Generator
2.4.2 Visual Attention
2.4.3 Visual Sentinel
2.4.4 Transformer
Chapter 3 Proposed Architecture and Experimental Procedure
3.1 Architecture
3.2 Experimental Procedure
Chapter 4 Data Collection
4.1 Collecting the Virtual-World Dataset
4.2 Collecting Real-World Images
Chapter 5 Experimental Results
5.1 Virtual Scene Tests
5.2 Real Scene Tests
5.3 Comparison of Results
5.4 Jetson Nano Application
5.5 TensorRT
Chapter 6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work
References
[1] Sa Wang, Zhengli Mao, Changhai Zeng, Huili Gong, Shanshan Li, Beibei Chen, “A new method of virtual reality based on Unity3D”, In 2010 18th International Conference on Geoinformatics.
[2] Rikiya Yamashita, Mizuho Nishio, Richard Kinh Gian Do, Kaori Togashi, “Convolutional neural networks: an overview and application in radiology”, In PMC, 2018.
[3] Zewen Li, Wenjie Yang, Shouheng Peng, Fan Liu, “A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects”, In IEEE, 2020.
[4] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, “Deep Residual Learning for Image Recognition”, In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition”, In ICLR, 2015.
[6] Varsha Kesavan, Vaidehi Muley, Megha Kolhekar, “Deep Learning based Automatic Image Caption Generation”, In 2019 Global Conference for Advancement in Technology (GCAT).
[7] Jingqiang Chen, Hai Zhuge, “News Image Captioning Based on Text Summarization Using Image as Query”, In 2019 15th International Conference on Semantics, Knowledge and Grids (SKG).
[8] SiZhen Li, Linlin Huang, Qi Wu, “Context-based Image Caption using Deep Learning”, In ICSP 2021.
[9] Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan, “Show and tell: A neural image caption generator”, In CVPR 2015.
[10] Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhudinov, Rich Zemel, Yoshua Bengio, “Show, Attend and Tell: Neural Image Caption Generation with Visual Attention”, In PMLR 2015.
[11] Jiasen Lu, Caiming Xiong, Devi Parikh, Richard Socher, “Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning”, In CVPR 2017.
[12] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, “Attention Is All You Need”, In NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems.
[13] Hengshuang Zhao, Jiaya Jia, Vladlen Koltun, “Exploring Self-Attention for Image Recognition”, In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio, “Neural Machine Translation by Jointly Learning to Align and Translate”, In ICLR, 2015.
[15] Lourdes Martínez-Villaseñor, Hiram Ponce, Jorge Brieva, Ernesto Moya-Albor, José Núñez-Martínez, Carlos Peñafort-Asturiano, “UP-Fall Detection Dataset: A Multimodal Approach”, In MDPI 2019.