Author: 黃浩然
Author (English): Huang, Hao-Juang
Thesis Title: 360相機的掌鏡偵測
Thesis Title (English): Director360: Introducing Camera Handling to 360 Camera
Advisor: 詹力韋
Advisor (English): Chan, Li-Wei
Committee Members: 林文杰、陳冠文、黃大源、詹力韋
Committee Members (English): Lin, Wen-Chieh; Chen, Kuan-Wen; Huang, Da-Yuan; Chan, Li-Wei
Degree: Master's
Institution: 國立交通大學 (National Chiao Tung University)
Department: 多媒體工程研究所 (Institute of Multimedia Engineering)
Discipline: Computer Science
Field: Software Development
Document Type: Academic thesis
Year of Publication: 2018
Graduation Academic Year: 107 (ROC calendar)
Language: English
Number of Pages: 37
Keywords (Chinese): 360 相機、全景影像、深度學習、慣性測量單元
Keywords (English): 360 Camera; Panorama Image; Deep Learning; IMU
Abstract (translated from the Chinese): Unlike traditional cameras, which let the photographer handle the shot through a viewfinder, a 360 camera offers its operator no such viewfinder during shooting. This absence of camera handling makes post-hoc editing of 360 videos more difficult and degrades the viewing experience. This thesis introduces the concept of camera handling for 360 cameras and proposes Director360, a 360 camera device with two novel handling techniques: Pointer handling, an explicit interaction, and gaze (FoV) handling, an implicit one. Together they restore the "process" of handling to 360 video, allowing shooters to record their intent in the footage as they capture it. In Pointer handling, the shooter points the 360 camera at the subject, and the system interprets that direction as the shooter's handling intent; in gaze handling, a deep-learning model directly predicts the shooter's viewing angle of the surrounding environment as the handling intent. We detail the implementation of both techniques and design a video-editing system, Director360 Editor, which incorporates the handling data to simplify the editing process. To understand how Director360 helps in practice with shooting and editing 360 videos, we conducted a user study with four participants, who shot and created narrative videos in three target scenarios, and we report their feedback.
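To make the Pointer-handling data concrete, here is a minimal sketch of how per-frame handling metadata might be logged while shooting: one orientation sample from the camera's IMU per video frame, stored alongside the footage so the pointing direction can later be read back as shooting intent. The function name, JSON layout, and `read_orientation` callback are illustrative assumptions, not the thesis's actual format or API.

```python
import json

def record_handling(frame_timestamps, read_orientation, log_path="handling.json"):
    """Log one (yaw, pitch) sample per video frame.

    `read_orientation` is any callable returning the camera's current
    orientation in degrees, e.g. a thin wrapper around the device's
    IMU driver (hypothetical here)."""
    log = []
    for t in frame_timestamps:
        yaw, pitch = read_orientation(t)
        log.append({"t": round(t, 3), "yaw": yaw, "pitch": pitch})
    with open(log_path, "w") as f:
        json.dump(log, f, indent=2)

# Usage with a dummy IMU that pans from 0 to 90 degrees over a 3-second clip:
if __name__ == "__main__":
    timestamps = [i / 30 for i in range(90)]      # 30 fps, 3 s
    dummy_imu = lambda t: (30.0 * t, 0.0)         # yaw sweeps, pitch stays level
    record_handling(timestamps, dummy_imu)
```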
Abstract (English): Unlike photographers using traditional cameras, photographers of 360 videos have no viewfinder through which to handle the camera during shooting. This makes editing difficult and hinders the viewing experience of 360-degree videos. This work introduces handling methods for the 360-degree camera and proposes Director360, a 360 camera enhanced with two novel handling modes. The Pointer and FoV modes explicitly and implicitly capture the camera operator's attention to the omnidirectional scenery while shooting video footage. The Pointer mode lets users specify an object of interest by pointing the 360 camera directly at it, while in the FoV mode a deep-learning algorithm estimates the operator's field of view within the 360 scene. We detail the implementation and experimentally demonstrate the feasibility of the deep-learning-based FoV mode. We also present the Director360 Editor, which incorporates the handling data to streamline the editing process. To understand how Director360 aids the shooting and editing of 360 videos, we recruited four participants for a user study in which they created storytelling videos in three target scenarios; their feedback is reported.
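The playback side of this pipeline, turning a recorded viewing direction into a conventional flat view, amounts to a gnomonic (perspective) reprojection of the equirectangular frame. The sketch below is an assumed implementation, not the renderer from the thesis: it uses nearest-neighbor sampling and assumes y-down image coordinates with positive pitch looking up; `extract_fov` is an illustrative name.

```python
import numpy as np

def extract_fov(equi, yaw_deg, pitch_deg, h_fov_deg=90.0, out_w=640, out_h=360):
    """Extract a normal-FoV perspective view from an equirectangular frame.

    equi: H x W x 3 array (W = 2H); yaw/pitch in degrees from the handling log."""
    H, W = equi.shape[:2]
    yaw, pitch = np.radians(yaw_deg), np.radians(pitch_deg)
    f = (out_w / 2) / np.tan(np.radians(h_fov_deg) / 2)   # pinhole focal length

    # Cast a ray through each output pixel in camera coordinates (z forward).
    x = np.arange(out_w) - out_w / 2 + 0.5
    y = np.arange(out_h) - out_h / 2 + 0.5
    xx, yy = np.meshgrid(x, y)
    dirs = np.stack([xx, yy, np.full_like(xx, f)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

    # Rotate the rays by pitch (about x), then yaw (about y).
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    dirs = dirs @ (Ry @ Rx).T

    # Convert rays to longitude/latitude, then to equirectangular pixels.
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])            # [-pi, pi]
    lat = np.arcsin(np.clip(dirs[..., 1], -1.0, 1.0))       # [-pi/2, pi/2]
    u = ((lon / (2 * np.pi) + 0.5) * W).astype(int) % W
    v = np.clip(((lat / np.pi + 0.5) * H).astype(int), 0, H - 1)
    return equi[v, u]   # nearest-neighbor sample, shape (out_h, out_w, 3)
```

Paired with the handling log above, each frame of an edited cut is then just `extract_fov(frame, entry["yaw"], entry["pitch"])` for the chosen handling entry.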
1 Introduction
1.1 Motivation and Problem Description . . . . . . . . . . . . . . . . . . . . .1
1.2 Director360 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2
2 Background
2.1 Extraction of 360 Video ROIs . . . . . . . . . . . . . . . . . . . . . . . . .4
2.2 360 Video Editing and Presentation . . . . . . . . . . . . . . . . . . . . . .5
3 Hardware Prototyping
3.1 Hardware Prototyping . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6
3.2 Interaction Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7
3.2.1 Pointer Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . .7
3.2.2 FoV Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8
4 Data Collection and Deep Learning
4.1 Apparatus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .11
4.2 Acquiring Head Region Label . . . . . . . . . . . . . . . . . . . . . . . . .12
4.3 Acquiring Head Orientation Label . . . . . . . . . . . . . . . . . . . . . . .13
4.4 Deep-Learning Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . .13
4.4.1 Training HeadLocNet . . . . . . . . . . . . . . . . . . . . . . . . .13
4.4.2 Training HeadPoseNet . . . . . . . . . . . . . . . . . . . . . . . . .14
4.5 Data Collection Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . .15
4.5.1 Participants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16
4.5.2 Task and Procedure . . . . . . . . . . . . . . . . . . . . . . . . . .17
4.6 Data Augmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18
5 Performance Evaluation
5.1 Performance of Head Localization . . . . . . . . . . . . . . . . . . . . . . .20
5.2 Performance of Head Pose Estimation . . . . . . . . . . . . . . . . . . . .21
6 Director360 Editor
7 User Study
7.1 Participants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27
7.2 Task and Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27
7.2.1 Questionnaire and interview . . . . . . . . . . . . . . . . . . . . . .28
7.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . .29
8 Limitation and Conclusion
8.1 Limitation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .32
8.1.1 360 Video Stabilization . . . . . . . . . . . . . . . . . . . . . . . .32
8.1.2 Estimation of User Field-of-View . . . . . . . . . . . . . . . . . . . .32
8.1.3 Scope of Interaction . . . . . . . . . . . . . . . . . . . . . . . . . .33
8.2 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .33
9 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .34