Speeding up target-language driven part-of-speech tagger training for machine translation
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS.
Abstract
When hidden-Markov-model-based part-of-speech (PoS) taggers embedded in machine translation systems are trained in an unsupervised manner, the use of target-language information has proven to give better results than the standard Baum-Welch algorithm. The target-language-driven training algorithm proceeds by translating every possible PoS tag sequence resulting from the disambiguation of the words in each source-language text segment into the target language, and then using a target-language model to estimate the likelihood of each resulting translation. The main disadvantage of this method is that the number of translations to perform grows exponentially with segment length, and translation is by far the most time-consuming step. In this paper, we present a method that uses a priori knowledge, obtained in an unsupervised manner, to prune unlikely disambiguations in each text segment, so that the number of translations performed during training is reduced. The experimental results show that this new pruning method drastically reduces the number of translations performed during training (and, consequently, the running time of the algorithm) without degrading the tagging accuracy achieved.
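As a rough illustration of the training step described above, the Python sketch below enumerates the tag paths of a segment, prunes those whose a priori likelihood falls too far below the best path, and translates and scores only the survivors with a target-language model. This is a minimal sketch under stated assumptions, not the paper's implementation: the helpers prior_logprob, translate, and tl_logprob, as well as the threshold value, are hypothetical interfaces supplied by the caller.

import itertools
import math

def tag_paths(segment, lexicon):
    # One tag per word, drawn from each word's ambiguity class in the
    # lexicon; the number of paths grows exponentially with segment length.
    return itertools.product(*(lexicon[word] for word in segment))

def pruned_training_step(segment, lexicon, prior_logprob, translate,
                         tl_logprob, threshold=math.log(1e-4)):
    # Score every disambiguation path with the cheap a priori model first,
    # and only translate paths within `threshold` (log scale) of the best.
    paths = list(tag_paths(segment, lexicon))
    priors = [prior_logprob(segment, path) for path in paths]
    best = max(priors)
    survivors = [path for path, lp in zip(paths, priors)
                 if lp - best >= threshold]
    scores = {}
    for path in survivors:
        translation = translate(segment, path)  # the expensive step being saved
        scores[path] = tl_logprob(translation)  # target-language model likelihood
    return scores  # likelihoods used to reestimate the tagger's parameters

Pruning before, rather than after, the translate call is the whole point: the a priori model is cheap to evaluate, so the fraction of paths that survive directly determines how many translations are actually performed.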