Task #09: Multilevel Semantic Annotation of Catalan and Spanish




June 15th, 2007: Check the 'Systems & Results' section for a preview of the results report.
May 22nd, 2007: The competition is now over. The evaluation has been carried out and its results will be reported at the SemEval-2007 Workshop. We hope to see you all there!

April 12th, 2007: Updated version of the NER official scorer available. Download it together with updated Semeval task#9 software v1.4.
March 24th, 2007: Full NSD dictionaries relating lemmas and WordNet senses for Catalan and Spanish available. Check the Download section.
March 21st, 2007: NSD note: We only expect sense labels for the nouns marked as targets (nouns labeled with 'CS' are not marked as targets).
March 20th, 2007: Updated version of the official scorer available. Download it together with updated Semeval task#9 software v1.3. Check the Download section.
March 12th, 2007: Test set available at the SemEval-2007 webpage for task #9.
March 12th, 2007 (update): Complete training data, following the textual order of the predicates, available at the SemEval-2007 webpage for task #9. SRL columns must follow the textual order of the predicates. Check the Technical Setting section for updated information.

March 9th, 2007: Beta version of the official scorer released. Download it together with updated Semeval task#9 software v1.2. Check the Download section. Updated verbal lexicon for the whole train dataset. Check the Download section.
March 5th, 2007: Complete training data available at the SemEval-2007 webpage for task #9. Semeval task#9 software version 1.0 available. Check the Download section.
March 2nd, 2007: Updated verbal lexicon for the whole train dataset. Check the Download section.
February 27th, 2007: Dictionaries relating nouns and WordNet senses for Catalan and Spanish available. Check the Download section.
February 26th, 2007: The evaluation period begins. The first part of the training data is already available at the SemEval-2007 webpage for task #9. Check the Technical Setting section for updated information on the evaluation. Updated Verbal Lexicon.
February 23rd, 2007: Train and test data release calendar available. Check the Download section.
February 21st, 2007: Updated trial dataset, with minor errors fixed, available at the SemEval-2007 webpage for task #9.
February 19th, 2007: Registration open at the registration site for SemEval-2007.
February 18th, 2007: Evaluation period extended to 4 weeks: from the moment you download the training set you will have 4 weeks to upload the outputs of your system on the test set. Updated Verbal Lexicon and full Catalan and Spanish WordNets available in the Download section.
January 10th, 2007: Trial datasets are now available. The whole website has been updated accordingly. The task description has been updated and further information added. Check the Technical Setting and Download sections.
October 13, 2006: This new website has been posted. Welcome to the task!!!


Lluís Màrquez
TALP Research Center
Universitat Politècnica de Catalunya
M. Antònia Martí
Centre de Llenguatge i Computació, CLiC
Universitat de Barcelona
Luis Villarejo
TALP Research Center
Universitat Politècnica de Catalunya
Mariona Taulé
Centre de Llenguatge i Computació, CLiC
Universitat de Barcelona

Please direct all your questions regarding the SemEval-1 task on Multilevel Semantic Annotation of Catalan and Spanish to the following email address: semeval-msacs@lsi.upc.edu.


The following people worked very hard on the development of the corpora used in this task, manually annotating all linguistic layers and developing various software tools for their processing. Thanks to all. We owe you a lot!

Juan Aparicio, Manu Bertran, Oriol Borrega, Núria Bufí, Joan Castellví, Maria Jesús Díaz, Marina Lloberes, Difda Monterde, Aina Peris, Lourdes Puiggrós, Marta Recasens, Santi Reig, and Bàrbara Soriano.

General Task Description


In this task, we aim at evaluating and comparing automatic systems for semantic annotation at several levels for the Catalan and Spanish languages. The three semantic levels considered are: semantic roles and verb disambiguation, disambiguation of all nouns, and named entity recognition.

[1, Semantic Role Labeling, SRL] The annotation of semantic roles of verb predicates follows a style similar to PropBank (Palmer et al. 2005; Taulé et al. 2005; Taulé et al. 2006), and the task setting is similar to that of the CoNLL-2005 shared task. Verb disambiguation refers to the assignment of the proper semantic-class tag to the verb, which is a much coarser-grained level than the usual sense disambiguation. This tag is composed of the thematic structure number (as indexed in the role set file for the verb predicate) and the lexico-semantic class, which is used to map the numbered arguments into semantic roles.

[2, Noun Sense Disambiguation, NSD] The disambiguation of nouns will have a similar shape to an "all-words" disambiguation task, with the exception that only the frequent nouns will be treated. The sense repository used for the annotation will consist of the current versions of the Catalan and Spanish WordNets.

[3, Named Entity Recognition, NER] The annotation of named entities will include recognition and classification of simple entity types (person, location, organization, etc.) and will also cover embedded entities. We will be considering core "strong" entities (e.g., [US]_loc) and "weak" entities, which, by definition, include some strong entities (e.g., The [president of [US]_loc]_per) (Arévalo, Civit & Martí 2004; Arévalo et al. 2002).
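The bracketed notation used in the examples above can be unfolded into nested word spans with a small stack-based parser. The following Python sketch is only an illustration: the function name `parse_entities` is hypothetical, and the inline bracket notation is taken from the examples in this paragraph, not from the official task format (which is column-based and described in the Technical Setting section).

```python
import re

# Tokens: an opening bracket, a closing bracket with its type label, or a word.
TOKEN_RE = re.compile(r"\[|\]_(\w+)|[^\s\[\]]+")

def parse_entities(text):
    """Return (words, entities); entities are (start, end, type) word spans.

    `end` is exclusive. Inner (strong) entities are closed, and therefore
    listed, before the outer (weak) entities that contain them.
    """
    words, entities, open_starts = [], [], []
    for match in TOKEN_RE.finditer(text):
        token = match.group(0)
        if token == "[":
            open_starts.append(len(words))   # remember where this entity starts
        elif token.startswith("]_"):
            if not open_starts:
                raise ValueError("closing bracket without an opening one")
            start = open_starts.pop()
            entities.append((start, len(words), match.group(1)))
        else:
            words.append(token)
    if open_starts:
        raise ValueError("unclosed entity bracket")
    return words, entities

words, entities = parse_entities("The [president of [US]_loc]_per")
print(words)     # ['The', 'president', 'of', 'US']
print(entities)  # [(3, 4, 'loc'), (1, 4, 'per')]
```

Using a stack keeps the embedding explicit: the strong entity [US]_loc is reported as a span that lies inside the weak entity ending at the same word.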

All semantic annotation tasks will be performed on exactly the same corpora for each language. We present all the annotation levels together as a complex global task, since we are interested in approaches which address these problems jointly, possibly taking into account cross-dependencies among them. However, we will also accept systems approaching the annotation in a pipeline style, or addressing any of the particular subtasks in any of the languages (3 levels x 2 languages = 6 subtasks).

More particularly, the input for training will consist of a medium-size set of sentences (~150K words per language) with gold-standard full syntactic annotation (including function tags) and the semantic annotations of SRL, NSD, and NER, which constitute the target knowledge to be learned. The full parse trees are provided only to ease the learning process; participants are not required to use them. The test corpus will be about 10 times smaller than the training corpus and will include the full syntactic annotation without the semantic levels, which have to be predicted. The parse trees of the test set will also be the manually revised gold-standard ones. Unfortunately, we have had no time to prepare automatic parsers for both languages to provide automatically generated syntactic input levels, as we initially planned.

Formats are formally described in the Technical Setting section of the task webpage. They are very similar to those of the CoNLL-2005 shared task (column-style presentation of levels of annotation), in order to be able to share evaluation tools and already developed scripts for format conversion.
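As an illustration of what a column-style presentation implies for processing, here is a minimal Python sketch of a reader. It assumes the usual CoNLL shared-task convention (one token per line, whitespace-separated columns, a blank line between sentences). The function names and the toy columns are invented for this example; the actual column inventory is the one defined in the Technical Setting section.

```python
import io

def read_column_sentences(lines):
    """Yield sentences as lists of rows; each row is a list of column values.

    Assumes one token per line, whitespace-separated columns, and a blank
    line between sentences.
    """
    sentence = []
    for line in lines:
        line = line.strip()
        if not line:
            if sentence:
                yield sentence
                sentence = []
        else:
            sentence.append(line.split())
    if sentence:                 # the last sentence may lack a trailing blank line
        yield sentence

def column(sentence, index):
    """Extract one annotation layer (one column) from a sentence."""
    return [row[index] for row in sentence]

# Toy input with invented columns: word, a coarse tag, a bracketed entity tag.
sample = io.StringIO(
    "Barcelona  n  (LOC)\n"
    "creix      v  *\n"
    "\n"
    "Plou       v  *\n"
)
sentences = list(read_column_sentences(sample))
print(len(sentences))            # 2
print(column(sentences[0], 0))   # ['Barcelona', 'creix']
```

Keeping each annotation level in its own column means a system that addresses only one subtask can read the columns it needs and ignore the rest.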


We will use standard evaluation metrics for each of the defined subtasks (SRL, NSD, NER), based on precision/recall/F1 measures, since they are basically recognition tasks. Classification accuracy will also be calculated for verb disambiguation and NSD.
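For readers less familiar with these measures, the following Python sketch computes precision, recall, F1 and accuracy. It is not the official scorer: the function names are hypothetical, and the exact-match criterion over (sentence, start, end, label) tuples is an assumption made for illustration.

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall and F1 over two collections of items.

    An item is any hashable description of a labeled unit, for example a
    (sentence_id, start, end, label) tuple. A prediction counts as correct
    only if it matches a gold item exactly (an assumption of this sketch).
    """
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

def accuracy(gold_labels, predicted_labels):
    """Fraction of positions where the predicted label equals the gold one."""
    pairs = list(zip(gold_labels, predicted_labels))
    return sum(g == p for g, p in pairs) / len(pairs) if pairs else 0.0

# Invented toy annotations: one of two predictions matches one of three gold items.
gold = {(0, 3, 4, "loc"), (0, 1, 4, "per"), (1, 0, 1, "org")}
pred = {(0, 3, 4, "loc"), (1, 0, 1, "per")}
precision, recall, f1 = precision_recall_f1(gold, pred)
print(precision, recall, f1)
print(accuracy(["s1", "s2", "s1"], ["s1", "s1", "s1"]))
```

In this toy case precision is 1/2 and recall is 1/3, so F1, their harmonic mean, is 0.4.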

All systems will be ranked and studied according to the official evaluation metrics in each of the six subtasks (SRL-ca, NSD-ca, NER-ca, SRL-es, NSD-es, NER-es). Additionally, global measures will be derived as a combination of all partial evaluations to rank systems' performance per language and for the complete global task (language independent).

The organization will prepare a simple baseline processor for each of the subtasks. Participant teams that do not submit results for some of the subtasks will be evaluated using the baseline processors in those subtasks in order to obtain global performance scores. The evaluation on the test set will be carried out by the organizers based on the outputs submitted by participant systems. The official evaluation software will be available to participants within the first week of the evaluation period.
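The baseline fallback and the global measures can be sketched in Python as follows. The text does not state how partial evaluations are combined, so the unweighted mean used here is purely an assumption, and all function names and scores are invented for illustration.

```python
from statistics import mean

LEVELS = ("SRL", "NSD", "NER")
LANGUAGES = ("ca", "es")
# The six subtasks: SRL-ca, NSD-ca, NER-ca, SRL-es, NSD-es, NER-es.
SUBTASKS = [f"{level}-{lang}" for lang in LANGUAGES for level in LEVELS]

def fill_with_baseline(system_scores, baseline_scores):
    """Use the baseline score for every subtask the system did not submit."""
    return {task: system_scores.get(task, baseline_scores[task])
            for task in SUBTASKS}

def global_scores(system_scores, baseline_scores):
    """Per-language and overall scores as unweighted means (an assumption)."""
    filled = fill_with_baseline(system_scores, baseline_scores)
    result = {
        lang: mean(filled[f"{level}-{lang}"] for level in LEVELS)
        for lang in LANGUAGES
    }
    result["global"] = mean(filled.values())
    return result

# Invented numbers, for illustration only.
baseline = {task: 50.0 for task in SUBTASKS}
system = {"SRL-ca": 80.0, "NER-ca": 70.0, "NER-es": 60.0}
print(SUBTASKS)
print(global_scores(system, baseline))
```

With this fallback, a team that submits only some subtasks still receives per-language and global scores, with the missing subtasks contributing exactly the baseline performance.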


Apart from a comprehensive list of documents describing formats and tagsets, we will provide as many resources and tools as possible to participants, with the aim of easing the participation of teams with few resources, tools, or experience in the Spanish and Catalan languages:
  • Full syntactic annotation of training and test files (including segmentation, POS tags, lemmas, full parse trees and syntactic functions), which can be very useful for feature extraction and data traversal.
  • Updated Catalan and Spanish WordNets, which are linked to English WordNet 1.6 for all noun synsets, some of them enriched with glosses, examples, collocations, etc.
  • Verb lexicon with the rolesets for each of the treated verbs and several examples.
  • General scripts for format conversion, which are very useful to convert CoNLL-style files into representations better suited to automatic processing.
  • Annotation guides of several resources.
Development of resources

All the resources are provided by the organizers. They are free for research and academic usage, so participants need to meet no special requirements to obtain and use them (signing a simple license agreement for all the distributed materials will suffice). All these resources and tools are being developed in a joint effort by several NLP research groups, partially funded by the Spanish government under several projects: 3LB (FIT-150500-2002-244), responsible in 2003-2004 for the syntactic annotation of 100Kw Catalan and Spanish corpora together with noun/verb sense annotation; CESS-ECE (HUM-2004-21127-E), which is currently extending the 3LB corpora to 500Kw, including a first annotation of semantic roles; and a follow-up project, PRAXEM, which will provide extra resources to complete the SRL annotation and to include the labeling of named entities.


  • Arévalo, M., M. Civit and M.A. Martí (2004) MICE: a Module for Named-Entities Recognition and Classification, in International Journal of Corpus Linguistics, vol. 9 num. 1. John Benjamins, Amsterdam.
  • Arévalo, M., X. Carreras, L. Màrquez, M.A. Martí, L. Padró, M.J. Simón (2002) A proposal for Wide-Coverage Spanish Named Entity Recognition, in Procesamiento del Lenguaje Natural, revista 28. SEPLN, Alicante.
  • Palmer, M., P. Kingsbury, D. Gildea (2005) The Proposition Bank: An Annotated Corpus of Semantic Roles, Computational Linguistics, 31 (1), MIT Press, USA.
  • Taulé, M., J. Aparicio, J. Castellví, M.A. Martí (2005) 'Mapping syntactic functions into semantic roles', Proceedings of the Fourth Workshop on Treebanks and Linguistic Theories (TLT 2005). Barcelona: Universitat de Barcelona.
  • Taulé, M., J. Castellví, M.A. Martí, J. Aparicio (2006) 'Fundamentos teóricos y metodológicos para el etiquetado semántico de CESS-CAT y CESS-ESP', Procesamiento del Lenguaje Natural, SEPLN, Zaragoza.

Last update: June 15th, 2007

 For more information, visit the SemEval-2007 home page.