Part-of-Speech Tagging with Two Sequential Transducers

in Computational Linguistics in the Netherlands 2000

Abstract

The article presents a method of constructing and applying a cascade consisting of a left- and a right-sequential finite-state transducer, T1 and T2, for part-of-speech (POS) disambiguation. In the process of POS tagging, every word is first assigned a unique ambiguity class that represents the set of alternative tags that this word can occur with. The sequence of the ambiguity classes of all words of one sentence is then mapped by T1 to a sequence of reduced ambiguity classes, in which some of the less likely tags have been removed. That sequence is finally mapped by T2 to a sequence of single tags. Compared to a hidden Markov model tagger, this transducer cascade has the advantage of significantly higher processing speed, but at the cost of slightly lower accuracy. Applications such as information retrieval, where speed can be more important than accuracy, could benefit from this approach.
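The three steps described in the abstract (ambiguity-class lookup, a left-to-right pass by T1, a right-to-left pass by T2) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the lexicon, the tag names, the hand-written transition tables, and the copy-through fallback are hypothetical, and the abstract does not say how the article actually constructs T1 and T2.

```python
from typing import Dict, List, Tuple

# (state, input symbol) -> (next state, output symbol)
Transitions = Dict[Tuple[str, str], Tuple[str, str]]

# Hypothetical toy lexicon: each word maps to exactly one ambiguity class.
LEXICON = {
    "the": "DET",
    "can": "AUX|NOUN|VERB",
    "rusts": "NOUN|VERB",
}

# Hypothetical T1 (reads left to right): after a determiner, drop AUX.
T1: Transitions = {
    ("DET", "AUX|NOUN|VERB"): ("NOUN|VERB", "NOUN|VERB"),
}

# Hypothetical T2 (reads right to left): choose one tag using right context.
T2: Transitions = {
    ("START", "NOUN|VERB"): ("VERB", "VERB"),  # sentence-final word
    ("VERB", "NOUN|VERB"): ("NOUN", "NOUN"),   # word followed by a verb
}


def apply_sequential(transitions: Transitions, symbols: List[str],
                     right_to_left: bool = False) -> List[str]:
    """Run a deterministic transducer over the symbols in one direction."""
    seq = list(reversed(symbols)) if right_to_left else list(symbols)
    state = "START"
    out = []
    for sym in seq:
        # Assumed fallback for this sketch: a pair with no listed transition
        # is copied through unchanged and the symbol becomes the new state.
        state, emitted = transitions.get((state, sym), (sym, sym))
        out.append(emitted)
    return list(reversed(out)) if right_to_left else out


def tag(words: List[str]) -> List[str]:
    classes = [LEXICON[w] for w in words]          # ambiguity classes
    reduced = apply_sequential(T1, classes)        # T1: left-sequential pass
    return apply_sequential(T2, reduced, right_to_left=True)  # T2: right-sequential pass


print(tag(["the", "can", "rusts"]))
```

Each pass is a single dictionary lookup per word with no search over alternative tag sequences, which is the kind of property that would make such a cascade fast. In this toy version, T2 only yields single tags for the pairs it lists; a real T2 would have to cover every reduced class it can receive.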

Computational Linguistics in the Netherlands 2000

Selected Papers from the Eleventh CLIN Meeting
