SVMTool

SVMTool v 1.3 (Perl) [includes 'multiple-column' features]

Here you can find information about SVMTool, an open source generator of sequential taggers, released under the GNU Lesser General Public License (LGPL) of the Free Software Foundation. This tool has been developed by the NLP group of the TALP Research Center, at the Universitat Politècnica de Catalunya.

The SVMTool is a simple and effective generator of sequential taggers based on Support Vector Machines. We have applied the SVMTool to the problem of part-of-speech (PoS) tagging. Based on a rigorous experimental evaluation, we conclude that the proposed SVM-based tagger is robust and flexible for feature modelling (including lexicalization), trains efficiently with almost no parameters to tune, and is able to tag thousands of words per second, which makes it practical for real NLP applications. Regarding accuracy, the SVM-based tagger significantly outperforms the TnT tagger under exactly the same conditions, and achieves a very competitive accuracy of 97.2% for English on the Wall Street Journal corpus, which is comparable to the best taggers reported to date.

The models are trained with SVMlight, Thorsten Joachims's software implementation of Vapnik's Support Vector Machine [Vapnik, 1995]. For further information on it see:

(visit http://svmlight.joachims.org/ )

Components

The SVMTool consists of three main components: the learner (SVMTlearn), the tagger (SVMTagger), and the evaluator (SVMTeval). A detailed description can be found in the SVMTool Technical Manual. [.ps] [.pdf]

(1) SVMTlearn

Given a training set of examples (either annotated or unannotated), SVMTlearn trains a set of SVM classifiers.

(2) SVMTagger

Given a text corpus (one token per line) and the path to a previously learned SVM model, SVMTagger performs sequential tagging of the words in the corpus.
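
The one-token-per-line input format mentioned above can be produced from raw text with a few lines of code. The following is a minimal sketch, assuming plain whitespace tokenization. The function name `to_one_token_per_line` is hypothetical and is not part of SVMTool, and a real corpus will usually need a proper tokenizer (for example, to split punctuation from words).

```python
def to_one_token_per_line(text: str) -> str:
    """Return `text` rewritten with one whitespace-delimited token per line.

    Assumption: tokens are already separated by whitespace in the input.
    This is an illustrative helper, not part of SVMTool.
    """
    # str.split() with no arguments splits on any run of whitespace
    # and drops leading/trailing whitespace.
    tokens = text.split()
    if not tokens:
        return ""
    # Terminate the last token with a newline so the output is line-oriented.
    return "\n".join(tokens) + "\n"
```

The resulting text can then be saved to a file and passed to the tagger as its input corpus.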

(3) SVMTeval

Given an SVMTool predicted tagging output and the corresponding gold standard, SVMTeval evaluates the performance in terms of accuracy.
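
To make the accuracy measure concrete, here is a minimal sketch of token-level tagging accuracy: the fraction of tokens whose predicted tag matches the gold-standard tag. It is not SVMTeval's implementation. It assumes that both files hold one token per line followed by its tag, separated by whitespace, and the helper names `read_tagged` and `tagging_accuracy` are hypothetical.

```python
from typing import Iterable, List, Tuple


def read_tagged(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse lines of the assumed form 'token tag' into (token, tag) pairs."""
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and lines without a tag
        pairs.append((fields[0], fields[1]))
    return pairs


def tagging_accuracy(predicted: List[Tuple[str, str]],
                     gold: List[Tuple[str, str]]) -> float:
    """Return the fraction of tokens whose predicted tag equals the gold tag."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must contain the same number of tokens")
    if not gold:
        return 0.0
    # Count positions where the two tags agree.
    correct = sum(1 for (_, p), (_, g) in zip(predicted, gold) if p == g)
    return correct / len(gold)
```

For example, if the predicted output agrees with the gold standard on two of three tokens, `tagging_accuracy` returns 2/3.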


SVMTool Discussion Group
 
  Discussion of features and bugs of this software, as well as information about upcoming updates, takes place on the SVMTool group, to which you can subscribe at:
 
  http://groups-beta.google.com/group/SVMTool
 
  and post messages at:
 
  SVMTool at googlegroups.com


Contributing

The SVMTool library is licensed under the LGPL, which means that it may be linked to and used by commercial software packages. However, the license also requires that any changes or improvements made to the library (and in this case also to the morphological data) be redistributed under LGPL terms.

Thus, if you improve the software or data, whether by adding new functionality, fixing bugs, or building sequential taggers on different data, you cannot distribute the results under conditions other than those stated in the license (i.e. freely and with no usage restrictions).

If you want your changes and improvements to be useful to the many other people using this free software, please contact us.
 

References

Please reference this tool in your academic works citing the following paper:
 

