臺灣博碩士論文加值系統

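Author: 陳鳴遠
Author (English): Ming-Yuan Chen
Thesis title: 名片欄位之辨識
Thesis title (English): Identifying Items from Business Cards
Advisor: 李錫堅
Advisor (English): Hsi-Jian Lee
Degree: Master's
Institution: National Chiao Tung University (國立交通大學)
Department: Department of Computer Science and Information Engineering (資訊工程系)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 1999
Academic year of graduation: 87 (ROC calendar)
Language: English
Number of pages: 54
Chinese keywords: 名片辨識 (business card recognition)
English keywords: Item Identification; scorebook; business card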
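Business cards are an important tool when people make contact with one another today. To make it easy to manage all of one's business cards, in this thesis we design a system that automatically identifies each item on a business card. The system operates on business card images that have already been pre-processed and whose text has all been recognized by an optical character recognition (OCR) engine.
Based on the commonality of the contents of business cards, we divide all items into two classes: major items and minor items. We analyze how the items are arranged on a card and build item identification rules from their characteristics. For every card it processes, the system builds a scorebook for that card, which records the score that each block on the card obtains for each item according to the item identification rules. Once the scores of all blocks for every item have been computed, the items on the card are identified from the scores. After the items are identified, we carry out several post-processing steps: splitting a block that contains two or more items; building keyword databases to correct the contents of the items; checking the characters adjacent to the keywords and re-recognizing any character that should not appear there; and finally assigning the blocks whose items were not identified according to keywords, placing blocks that carry no keyword information at all in the note item.
We ran our experiments on 100 horizontal and 100 vertical Chinese business cards. The resulting item identification rate is 93.05%; the rates for horizontal and vertical cards are 92.99% and 93.91%, respectively.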

Business cards are an important tool for making contact with people today. In this thesis, we design a system that automatically identifies the items on a business card so that a collection of cards can be managed more easily. The system takes as input a business card image that has already been pre-processed and whose characters have all been recognized by an optical character recognition (OCR) engine.
We first classify the items on a business card into two classes, major items and minor items, according to the commonality of the contents of business cards. We analyze the arrangement of the items and build item identification rules from the characteristics of each item. For each business card, we build a scorebook: when we examine a block on the card, we apply the rules of each item and record the score that the block obtains for that item. Once all blocks have been scored, we identify each item on the card from its scores. In the post-processing steps, we split blocks that contain more than one item, build keyword databases to correct the contents of the items, and check the characters adjacent to key-characters, re-recognizing any improper ones. Finally, we assign undecided blocks to items according to the keywords of each item, and we label blocks that contain no keywords as note items.
In our experiments, we test the item identification system for Chinese cards on 100 horizontal cards and 100 vertical cards. The overall accuracy of item identification is 93.05%; the accuracy is 92.29% for horizontal cards and 93.91% for vertical cards.
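
The scorebook idea described in the abstract can be illustrated with a minimal sketch. It assumes that each block is simply its OCR text and that each item has a single scoring rule. The rule set, the weights, the item names other than "note", and the `min_score` threshold are all hypothetical, since the abstract does not give the thesis's actual rules (which also use the arrangement of items on the card). Sending every low-scoring block straight to "note" is likewise a simplification of the keyword-based post-processing.

```python
import re
from typing import Callable, Dict, List

Rule = Callable[[str], int]

# Hypothetical rules: each maps a block's OCR text to a score for one item.
RULES: Dict[str, Rule] = {
    "phone": lambda t: 2 * bool(re.search(r"TEL|電話", t, re.I))
    + bool(re.search(r"\d{3,}", t)),
    "fax": lambda t: 3 * bool(re.search(r"FAX|傳真", t, re.I))
    + bool(re.search(r"\d{3,}", t)),
    "email": lambda t: 3 * ("@" in t),
    "address": lambda t: 2 * bool(re.search(r"路|街|Road|Rd\.", t)),
}


def build_scorebook(blocks: List[str]) -> List[Dict[str, int]]:
    """One row per block; each row holds the block's score for every item."""
    return [{item: rule(text) for item, rule in RULES.items()} for text in blocks]


def identify_items(scorebook: List[Dict[str, int]], min_score: int = 2) -> List[str]:
    """Label each block with its best-scoring item, or "note" if no item scores enough."""
    labels = []
    for scores in scorebook:
        best = max(scores, key=scores.get)
        labels.append(best if scores[best] >= min_score else "note")
    return labels


# Made-up sample blocks, not taken from the thesis's test set.
blocks = [
    "TEL: 02-12345678",
    "FAX: 02-87654321",
    "chen@example.com",
    "新竹市大學路1001號",
    "Quality first",
]
labels = identify_items(build_scorebook(blocks))
```

Keeping the full score table rather than only the winning label is what lets later steps revisit a decision, for example when a block scores well for two items and may need to be split.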

ABSTRACT IN CHINESE
ABSTRACT IN ENGLISH
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
Chapter 1. Introduction
1.1 Motivation
1.2 Previous Study on Business Card Processing
1.3 Problem Definitions
1.4 Module Description and Assumptions
1.4.1 System Description
1.4.2 Assumptions
1.5 Thesis Organization
Chapter 2. Item Identification of the Chinese Cards
2.1 Introduction
2.2 The Scorebook Method for Identifying Blocks
2.3 The Major Item Identification Method
2.4 The Minor Item Identification Method
Chapter 3. Item Identification of the English Cards
3.1 Introduction
3.2 The Word Segmentation from a Linear Block of English Cards
3.3 The Item Identification Method
Chapter 4. Post-Processing after Item Identification
4.1 Introduction
4.2 Item Block Splitting
4.3 Content Correction Using Lexicons
4.4 Re-Recognizing Characters in an Item Block
4.4.1 Character Re-Merging
4.4.2 Character Re-Splitting
4.4.3 Character Re-Recognition
4.5 Identifying the Undecided Blocks
Chapter 5. Experimental Results and Analysis
5.1 Introduction
5.2 Results of Item Block Splitting
5.3 Results of Item Identification
Chapter 6. Conclusions
References

[1] S. H. Lee, Design of a Chinese Business Card Understanding System, Master's thesis, Institute of Computer Science and Information Engineering, National Chiao Tung University, Taiwan, R.O.C., 1998.
[2] C. H. Wu, Chinese Hand-written Characters Segmentation in Form Document, Master's thesis, Institute of Computer Science and Information Engineering, National Chiao Tung University, Taiwan, R.O.C., 1997.
[3] U. Manber, Introduction to Algorithms: A Creative Approach, Addison-Wesley Publishing Company, 1989.
