National Digital Library of Theses and Dissertations in Taiwan

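Author: 林文杰
Author (English): Lin, Wen-Chieh
Thesis title: 可辨識運動姿態的時空延遲網路及其在唇語辨識之應用
Thesis title (English): A Space-Time Delay Neural Network for Motion Recognition and Its Application to Lipreading in Bimodal Speech Recognition
Advisor: 林進燈
Advisor (English): Chin-Teng Lin
Degree: Master's
Institution: National Chiao Tung University
Department: Department of Control Engineering
Discipline: Engineering
Field: Electrical and Information Engineering
Thesis type: Academic thesis
Year of publication: 1996
Academic year of graduation: 84 (ROC calendar)
Language: Chinese
Pages: 49
Keywords (translated from Chinese): space-time delay network; lipreading; neural network; speech recognition; computer vision; motion recognition
Keywords (English): STDNN; Lipreading; Neural Network; Speech Recognition; Computer Vision; Motion Recognition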
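In recent years, growing demand for computer vision in fields
such as unattended surveillance systems, multimodal
human-computer interfaces, and traffic control systems has drawn
increasing attention to the problem of recognizing object motion.
Most existing methods take the image sequence to be recognized,
apply a 2-D image feature extraction method to convert it into a
sequence of feature vectors, and then feed that sequence to a
recognizer. The main drawback of such methods is that the
information useful for recognizing object motion is confined to
either the spatial dimension or the temporal dimension. We
believe, however, that the information describing object motion
resides in space-time, not in the temporal or the spatial
dimension alone. We therefore propose a space-time delay neural
network to handle the motion recognition problem. This new neural
network can process 3-D dynamic information, so that motion
recognition is carried out in the space-time domain and the
drawback above is avoided. In addition, the network still
recognizes object motion effectively when the motion is slightly
shifted or distorted along the temporal or spatial dimension.
This lightens the load on the preceding image tracking system,
because the network still works effectively even when the object
is not located very precisely.
We applied this network to lipreading. The experimental results
show that it learns and recognizes better than a recognition
system built on the conventional time delay network.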


Motion recognition has received increasing attention in recent
years because the need for computer vision is growing in many
domains, such as surveillance systems, multimodal human-computer
interfaces, and traffic control systems. Most existing approaches
split recognition into spatial feature extraction followed by
time-domain recognition. However, we believe that the information
of motion resides in the space-time domain and is not restricted
to the time domain or the space domain alone. Consequently, it
seems more reasonable to integrate feature extraction and
classification across the space and time domains together. We
propose a Space-Time Delay Neural Network (STDNN) that can deal
with 3-D dynamic information, as arises in motion recognition.
For the motion recognition problem that we focus on in this
paper, the STDNN is a unified structure in which low-level
spatiotemporal feature extraction and space-time recognition are
embedded. It possesses spatiotemporal shift-invariant recognition
abilities inherited from the time delay neural network (TDNN) and
the space displacement neural network (SDNN). Unlike the
multilayer perceptron (MLP), TDNN, and SDNN, the STDNN is built
from vector-type nodes and matrix-type links, so that
spatiotemporal information can be represented naturally in a
neural network. Experiments were conducted to evaluate the
performance of the proposed STDNN. In the moving Arabic numerals
(MAN) experiments, which use image sequences to simulate an
object moving in the space-time domain, the STDNN shows its
ability to generalize in spatiotemporal shift-invariant
recognition. In the lipreading experiment, the STDNN recognizes
lip motions from real image sequences. The results show that the
STDNN performs better than the existing TDNN-based system,
especially in generalization ability. Although lipreading is a
fairly specific application, the STDNN can be applied to other
applications, since no domain-dependent knowledge is used in the
experiment.
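
The abstract's description of vector-type nodes, matrix-type links,
and spatiotemporal shift invariance can be illustrated with a small
NumPy sketch. This is not the thesis's implementation: the tensor
layout, the kernel sizes, the tanh activation, the summation over all
space-time positions, and the names `stdnn_layer`, `pooled_scores`,
and `place_pattern` are all assumptions made for illustration. Each
node at a space-time position (t, y, x) holds a vector, and each link
is a matrix shared across all positions.

```python
import numpy as np


def stdnn_layer(x, weights, bias):
    """One space-time delay layer (illustrative sketch, not the thesis code).

    x:       (T, H, W, C_in)  one vector-type node per space-time position
    weights: (kt, kh, kw, C_out, C_in)  one matrix-type link per offset,
             shared by every position (assumed weight sharing)
    bias:    (C_out,)
    """
    T, H, W, _ = x.shape
    kt, kh, kw, c_out, _ = weights.shape
    out_t, out_h, out_w = T - kt + 1, H - kh + 1, W - kw + 1
    pre = np.zeros((out_t, out_h, out_w, c_out))
    for dt in range(kt):
        for dy in range(kh):
            for dx in range(kw):
                # All input nodes seen through the (dt, dy, dx) delay.
                patch = x[dt:dt + out_t, dy:dy + out_h, dx:dx + out_w, :]
                # Apply the link matrix for this offset to every node vector.
                pre += patch @ weights[dt, dy, dx].T
    return np.tanh(pre + bias)


def pooled_scores(x, weights, bias):
    """Sum the layer output over all space-time positions (assumed pooling)."""
    return stdnn_layer(x, weights, bias).sum(axis=(0, 1, 2))


def place_pattern(pattern, t0, y0, x0, shape=(10, 12, 12)):
    """Embed a small space-time pattern in an all-zero image sequence."""
    seq = np.zeros(shape + (pattern.shape[-1],))
    pt, ph, pw, _ = pattern.shape
    seq[t0:t0 + pt, y0:y0 + ph, x0:x0 + pw, :] = pattern
    return seq


rng = np.random.default_rng(0)
weights = 0.5 * rng.normal(size=(2, 3, 3, 4, 1))  # hypothetical sizes
bias = 0.1 * rng.normal(size=4)
pattern = rng.normal(size=(3, 4, 4, 1))

# The same pattern at two different space-time locations, both far enough
# from the borders that its full response stays inside the output volume.
seq_a = place_pattern(pattern, 2, 3, 3)
seq_b = place_pattern(pattern, 4, 5, 2)
layer_out = stdnn_layer(seq_a, weights, bias)
scores_a = pooled_scores(seq_a, weights, bias)
scores_b = pooled_scores(seq_b, weights, bias)
print(layer_out.shape, np.allclose(scores_a, scores_b))
```

Because the link matrices are shared across positions, shifting the
input pattern only shifts the layer's response, so the pooled scores
match for both placements. In this sketch the match holds only while
the pattern stays at least one kernel width minus one away from every
border.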
