Tagset description: Penn Treebank Tagset, Parole Reduced Tagset
The SVMTool is a very simple and effective generator of sequential taggers based on Support Vector Machines. It has been successfully applied to a number of NLP problems, such as Part-of-speech Tagging and Base Phrase Chunking, for different languages.
By means of a rigorous experimental evaluation, we conclude that the proposed SVM-based tagger is robust and flexible for feature modelling (including lexicalization), trains efficiently with almost no parameters to tune, and is able to tag thousands of words per second, which makes it practical for real-world NLP applications.
Regarding evaluation, the SVM-based tagger significantly outperforms the TnT PoS tagger under exactly the same conditions, and achieves a very competitive accuracy of 97.2% on the WSJ corpus, which is comparable to the best PoS taggers reported to date.
NOTES: