
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

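Graduate student: 鄭紹余
Graduate student (English name): Shau-Yu Cheng
Thesis title: 中文雜誌內對中英文字與圖混合之切字
Thesis title (English): Character Segmentation in Chinese Magazines with Mixed Alphabets, Numerals and Figures
Advisor: 李錫堅
Advisor (English name): Hsi-Jian Lee
Degree: Master's
Institution: National Chiao Tung University (國立交通大學)
Department: Computer Science and Information Engineering (資訊工程系)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 1999
Academic year of graduation: 87 (ROC calendar)
Language: English
Number of pages: 53
Chinese keywords: 切字
English keywords: character segmentation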
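Abstract (translated from the Chinese): A general document processing system consists of two parts: character segmentation and character recognition. This thesis proposes an efficient character segmentation system.
The system contains two modules: document analysis and character segmentation. In the document analysis module, we first reduce the image and extract connected components, then classify each connected component as an image component or a text component. After the text components on the page have been extracted, we merge them into text blocks and check whether any image component contains text components; if so, those text components are extracted and merged into the text blocks. Finally, every text block is segmented into individual text lines.
Once the text lines of the blocks have been separated, we check each text block for an enlarged initial character (initial cap) and extract it if present. Finally, we perform character segmentation on each text line to separate Chinese characters, English letters, and numerals.
In our experiments, the character segmentation accuracy is about 98.9%, and a document containing 1158 characters takes 5 seconds to process. This demonstrates the efficiency of our system.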

A general document processing system usually includes two major modules: a character segmentation module and a character recognition module. In this thesis, we present an automatic system that segments characters efficiently. Our character segmentation system contains two modules: document layout analysis and character segmentation. In the document layout analysis module, we first perform image reduction and connected-component extraction. In the component classification procedure, the connected components are classified as image components or text components. In the block segmentation procedure, we merge all text components into text blocks. We then check whether any image component contains text components; if so, these are extracted and merged into the text blocks. Finally, we perform text line segmentation to separate all text lines in the text blocks.
After all text lines have been segmented, we detect and extract the initial caps if any exist in the text blocks. Finally, we segment the Chinese characters, English letters, and numerals in the character segmentation module.
In our experiments, the character segmentation rate of our system is about 98.9%, and the processing time is about 5 seconds for a page with 1158 characters. This demonstrates the efficiency of the proposed system.
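The first layout-analysis steps described above (connected-component extraction followed by classification into image and text components) can be illustrated with a minimal Python sketch. It assumes a binary page image stored as rows of 0/1 values, uses a breadth-first flood fill with 8-connectivity, and separates components with a single bounding-box size threshold. The function names and the `max_text_side` threshold are hypothetical and chosen only for illustration; this record does not give the thesis's actual procedure or thresholds.

```python
from collections import deque


def connected_components(image):
    """Return bounding boxes (top, left, bottom, right) of the 8-connected
    foreground regions of a binary image given as a list of rows of 0/1."""
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Start a new component and flood-fill it breadth-first.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def classify_by_size(boxes, max_text_side=4):
    """Split boxes into (text, image) lists. A box whose longer side exceeds
    max_text_side (a hypothetical threshold) is treated as an image component."""
    text, image = [], []
    for box in boxes:
        top, left, bottom, right = box
        longer_side = max(bottom - top + 1, right - left + 1)
        (text if longer_side <= max_text_side else image).append(box)
    return text, image


# Toy page: a 2x2 blob, a 5x5 blob, and a single isolated pixel.
page = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
]
boxes = connected_components(page)
text_boxes, image_boxes = classify_by_size(boxes)
```

On the toy page, the two small blobs fall under the threshold and the 5x5 blob is treated as an image component. A real system would tune the threshold to the scan resolution and combine it with further features.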
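As a rough illustration of the text line and character segmentation steps summarized in the abstract, the sketch below cuts a binary block at empty rows to find text lines and then at empty columns to find character candidates. This is a plain pixel-count projection profile, a common baseline technique that is swapped in here for illustration; it is not the thesis's method. The sketch also assumes horizontal text lines and characters that do not touch, and all function names are hypothetical.

```python
def projection_runs(profile):
    """Return (start, end) index pairs of maximal runs of non-zero values."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run begins
        elif not value and start is not None:
            runs.append((start, i - 1))    # a run ends at the previous index
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs


def segment_lines(image):
    """Find text lines as runs of rows that contain foreground pixels."""
    return projection_runs([sum(row) for row in image])


def segment_characters(image, line):
    """Find character candidates in one line as runs of non-empty columns."""
    top, bottom = line
    width = len(image[0]) if image else 0
    column_profile = [
        sum(image[r][c] for r in range(top, bottom + 1)) for c in range(width)
    ]
    return projection_runs(column_profile)


# Toy block with two text lines separated by one blank row.
block = [
    [1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
]
lines = segment_lines(block)
characters = [segment_characters(block, line) for line in lines]
```

A plain projection like this cannot separate touching characters or decide whether two narrow pieces belong to one Chinese character, which is presumably why the thesis adds size checks and recognition-based segmentation for such sequences.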

ABSTRACT IN CHINESE
ABSTRACT IN ENGLISH
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
Chapter 1. Introduction
1.1 Motivation
1.2 Problem Definition
1.2.1 Block identification
1.2.2 Character Segmentation
1.3 Survey of Related Research
1.4 System Description and Assumptions
1.4.1 System Description
1.4.2 Assumptions
1.5 Thesis Organization
Chapter 2. Document Layout Analysis
2.1 Image Reducing
2.2 Connected-Components Extraction
2.3 Connected-Components Classification
2.3.1 Features
2.3.1.1 Component size
2.3.1.2 Stroke width of components
2.3.2 Procedure of connected-components classification
2.4 Text Block Segmentation
2.5 Text Components Extraction from Image Components
2.6 Text Line Segmentation
Chapter 3. Character Segmentation
3.1 Initial Cap Detection and Extraction
3.1.1 Initial cap detection
3.1.2 Initial cap extraction
3.2 Character Segmentation
3.2.1 Projection of the count of connected-components
3.2.2 Check of the component size
3.2.3 Check for long consecutive half-component sequence
3.2.4 Check for short consecutive half-component sequence
3.2.5 Recognition-based character segmentation for half-components sequences
Chapter 4. Experimental Results and Analysis
Chapter 5. Conclusions and Future Works
References

