Result: Comprehensive printed Tibetan/English mixed text segmentation method
Title:
Comprehensive printed Tibetan/English mixed text segmentation method
Authors:
Source:
Document recognition and retrieval XI (San Jose CA, 21-22 January 2004). SPIE proceedings series. 5296:136-146
Publisher Information:
Bellingham WA: SPIE, 2004.
Publication Year:
2004
Physical Description:
print, 11 ref
Original Material:
INIST-CNRS
Subject Terms:
Documentation, Electronics, Optics, Physics, Telecommunications, Exact sciences and technology, Sciences and techniques of general use, Information science. Documentation, Information retrieval systems. Information and document management system, Interfaces. Software, Information and communication sciences, Bilingualism, Segmentation, Text, Tibetan, Digitized document
Document Type:
Conference Paper
File Description:
text
Language:
English
Author Affiliations:
Image Division, Dept. of Electronic Engineering, Tsinghua Univ., Beijing 100084, China
Rights:
Copyright 2004 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS
Notes:
Sciences of information and communication. Documentation
FRANCIS
Accession Number:
edscal.16075794
Database:
PASCAL Archive
Further Information
Text segmentation plays a crucial role in a text recognition system. A comprehensive method is proposed for segmenting printed Tibetan/English mixed text. Two algorithms, based on Tibetan inter-syllabic tshegs and on a discriminant function, respectively, are presented to perform skew detection before text line separation. A dynamic recursive character segmentation algorithm integrating multi-level information is then developed. Encouraging experimental results on a large-scale Tibetan/English mixed text set show the validity of the proposed method.