Graduate student: 潘秀蓮 (Yulia)
Thesis title: Transition Motion Synthesis for Video-Based Text to ASL
Advisor: 楊傳凱 (Chuan-Kai Yang)
Oral examination committee: 林伯慎 (Bor-Shen Lin), 孫沛立 (Pei-Li Sun)
Oral examination date: 2019-07-26
Degree: Master's
Institution: National Taiwan University of Science and Technology (國立臺灣科技大學)
Department: Department of Information Management
Discipline: Computer Science
Field: General Computer Science
Thesis type: Academic thesis
Year of publication: 2019
Graduation academic year: 107 (ROC calendar)
Language: English
Number of pages: 68
Keywords: ASL; Sign Language; Deaf Talk; OpenPose; Transition Motion Synthesis
This research describes a novel approach to providing text-to-ASL (American Sign Language) media: a video-based text-to-ASL system. Hearing-impaired people, referred to here as the Deaf, communicate using sign language. When they face spoken language, they have difficulty reading the spoken words as quickly as hearing people do.
The availability of a public dataset, the ASL Lexicon Dataset, poses the challenge of building a video-based interpreter for the Deaf. The problem lies in the transition from one word to the next, because such transitions do not exist in the original dataset. Our focus is therefore on producing a better transition from one word to another than an abrupt cut (a blink).
After the dataset has been preprocessed, the videos are fed to the OpenPose library, which extracts the signers' skeletons and saves them as JSON files. The user enters a sequence of glosses as text, and the system finds the JSON files and videos for the corresponding glosses. The complete sequences of the original videos are also fed into the system to serve as transition pools. The frames of the requested glosses are then combined with the transition pools to construct the sequence of transition frames. Once the sequences have been obtained, a smoothing algorithm is applied to make the motion smoother.
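The steps above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the mean Euclidean pose distance, the greedy frame selection, and the moving-average smoothing of keypoints are all assumptions made for illustration. The only detail taken from OpenPose is its JSON layout, where each detected person has a `pose_keypoints_2d` field holding a flat list of x, y, confidence triples.

```python
import math


def keypoints_from_openpose(frame_json):
    """Return [(x, y), ...] for the first person in one OpenPose frame dict."""
    flat = frame_json["people"][0]["pose_keypoints_2d"]  # x0, y0, c0, x1, y1, c1, ...
    return [(flat[i], flat[i + 1]) for i in range(0, len(flat), 3)]


def pose_distance(a, b):
    """Mean Euclidean distance between corresponding keypoints (assumed metric)."""
    return sum(math.hypot(p[0] - q[0], p[1] - q[1]) for p, q in zip(a, b)) / len(a)


def pick_transition(start, end, pool, max_frames=10):
    """Greedily walk from the start pose toward the end pose using pool frames.

    At each step, only pool frames strictly closer to `end` than the current
    pose are considered, and the one nearest to the current pose is taken.
    The walk stops when no pool frame brings the pose closer to `end`.
    """
    path, current = [], start
    for _ in range(max_frames):
        gap = pose_distance(current, end)
        closer = [f for f in pool if pose_distance(f, end) < gap]
        if not closer:
            break
        current = min(closer, key=lambda f: pose_distance(current, f))
        path.append(current)
    return path


def smooth(frames, window=3):
    """Moving average of each keypoint over neighbouring frames (assumed smoother)."""
    half = window // 2
    out = []
    for i in range(len(frames)):
        neigh = frames[max(0, i - half):min(len(frames), i + half + 1)]
        out.append([
            (sum(f[k][0] for f in neigh) / len(neigh),
             sum(f[k][1] for f in neigh) / len(neigh))
            for k in range(len(frames[i]))
        ])
    return out
```

In this sketch, `start` would be the last pose of one gloss and `end` the first pose of the next. If the pool contains no frame that lies between the two poses, `pick_transition` returns an empty list, so the quality of the result depends entirely on what the pool contains.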
Because the algorithm depends entirely on the transition pools, its ability to produce a good transition is limited. If the transition frames needed for a logically and visually correct motion are not available, the result will be suboptimal. As long as the needed frames are available, however, the system can generate logically and visually correct transitions.
Recommendation Letter
Approval Letter
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
List of Pseudocodes
1 Introduction
1.1 Motivation
1.2 Contribution
1.3 Outline
2 Literature Review
2.1 Introduction to ASL
2.2 Previous Text to ASL System
2.3 ASL Lexicon Video Dataset
2.4 OpenPose
2.5 Motion Synthesis
3 Proposed System
3.1 System Overview
3.2 System Architecture
3.3 Dataset
3.4 OpenPose Library
3.5 Constructing Transition Frames
3.5.1 Keypoints Selection
3.5.2 Similarity Measurement
3.5.3 Composing Transition Frames Sequence
3.5.4 Outliers Prevention
3.5.5 Animation Smoothing
4 Experimental Result
4.1 Experiments
4.2 Parameters
4.3 Algorithm Verification
4.4 Scene Change Detection
4.5 User Study
4.5.1 Video Type Preference
4.5.2 Motion Smoothness Quality
5 Conclusions
5.1 Conclusions
5.2 Limitations and Future Works
References