CoNLL-2005 Shared Task: 

Semantic Role Labeling: Description & Goal





Goal

The data is a collection of sentences, each of which contains a number of target verbs and other annotations. The goal of the task is to develop a machine learning system that recognizes the participants of the propositions governed by the target verbs. For simplicity, we refer to all kinds of participants in a proposition as arguments, including adjuncts, references and the verb realization. The output, thus, is a set of arguments for each target proposition.

The annotations provided as input to support the recognition consist of syntactic information and named entities. Following earlier CoNLL Shared Tasks, the syntactic information consists of part-of-speech tags, base chunks and clauses. Full syntactic parses are provided as well. These input annotations are predicted by state-of-the-art preprocessors.

Training and development data are provided for building the learning system. Both datasets contain predicted input annotations together with the correct outputs. The training set is used to train systems; the development set is used to tune the parameters of the learning systems.

Evaluation will be performed on a separate test set, which will be provided with target verbs and predicted input annotations. A system will be evaluated with respect to precision, recall and the F1 measure of recognized arguments. For an argument to be correctly recognized, both the words spanning the argument and its type have to be correct. The srl-eval.pl program, distributed by the organizers in the software section, is the official program to compute the scores. As in the CoNLL-2004 edition, arguments annotating the verb predicate (i.e., V arguments) will not be evaluated.
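
As a rough illustration of the metric (a simplified sketch only, NOT the official srl-eval.pl scorer), the following Python fragment scores one proposition; representing an argument as a (type, start, end) tuple is an assumption of this sketch:

# Simplified sketch of the exact-match metric; the official scorer
# is srl-eval.pl.  Arguments are hypothetical (type, start, end) tuples.

def score(gold, pred):
    """Precision, recall and F1 over exactly matched arguments.

    gold, pred: sets of (type, start, end) tuples for one proposition,
    with V arguments already removed, since they are not evaluated.
    """
    correct = len(gold & pred)          # span AND type must both match
    p = correct / len(pred) if pred else 0.0
    r = correct / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Example: the A1 span is off by one word, so only A0 counts as correct.
gold = {("A0", 0, 5), ("A1", 7, 17)}
pred = {("A0", 0, 5), ("A1", 7, 16)}
print(score(gold, pred))                # (0.5, 0.5, 0.5)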

Specifically, the datasets are sections of the Wall Street Journal (WSJ) part of the Penn TreeBank II (TB). We follow the standard WSJ partition used in syntactic parsing:

Training set:      WSJ Sections 02-21
Development set:   WSJ Section 24
Test set:          WSJ Section 23 + fresh sentences


The annotations of predicate-argument structures have been derived from PropBank (PB), while the preprocessors that predict the input annotations have been developed on the standard partition of the Penn TreeBank.

In this edition, the test set will include a collection of fresh sentences that are not part of the WSJ. The aim is to evaluate the robustness of systems on data outside the training corpus.


The Shared Task evaluation is separated into two challenges:

Closed Challenge: systems have to be built strictly with information contained in the training sections of the datasets.

Open Challenge: systems may additionally make use of any kind of external tools and resources.


Preprocessing Systems

The input annotations we provide have been computed with the following state-of-the-art systems:

Format

Here is an example of a fully-annotated sentence:

WORDS---->  NE--->  POS   PARTIAL_SYNT  FULL_SYNT  VS  TARGETS  PROPS------->

The         *       DT    (NP*    (S*   (S(NP*     -   -        (A0*  (A0*
$           *       $     *       *     (ADJP(QP*  -   -        *     *
1.4         *       CD    *       *     *          -   -        *     *
billion     *       CD    *       *     *))        -   -        *     *
robot       *       NN    *       *     *          -   -        *     *
spacecraft  *       NN    *)      *     *)         -   -        *)    *)
faces       *       VBZ   (VP*)   *     (VP*       01  face     (V*)  *
a           *       DT    (NP*    *     (NP*       -   -        (A1*  *
six-year    *       JJ    *       *     *          -   -        *     *
journey     *       NN    *)      *     *          -   -        *     *
to          *       TO    (VP*    (S*   (S(VP*     -   -        *     *
explore     *       VB    *)      *     (VP*       01  explore  *     (V*)
Jupiter     (ORG*)  NNP   (NP*)   *     (NP(NP*)   -   -        *     (A1*
and         *       CC    *       *     *          -   -        *     *
its         *       PRP$  (NP*    *     (NP*       -   -        *     *
16          *       CD    *       *     *          -   -        *     *
known       *       JJ    *       *     *          -   -        *     *
moons       *       NNS   *)      *)    *)))))))   -   -        *)    *)
.           *       .     *       *)    *)         -   -        *     *


There is one line for each token, and a blank line after the last token of a sentence. The columns, separated by whitespace, represent different annotations of the sentence, aligned along its words. For structured annotations (named entities, chunks, clauses, parse trees, arguments), we use the Start-End format.

The Start-End format represents phrases (chunks, arguments, and syntactic constituents) that form a well-formed bracketing of the sentence: phrases never overlap, though they may embed. Each tag has the form STARTS*ENDS, and represents the phrases that start and end at the corresponding word. A phrase of type k contributes an opening parenthesis (k in the STARTS part of its first word and a closing parenthesis ) in the ENDS part of its last word. Scripts will be provided to transform a column in Start-End format into other standard formats (IOB1, IOB2, WSJ trees). The Start-End format used last year (which repeated the phrase type in both the start and end parts) remains compatible with the current software and scripts.
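
As a sketch of how a Start-End column might be decoded into phrases (the provided scripts are the reference implementation; decode_start_end is a hypothetical helper):

import re

def decode_start_end(column):
    """Decode a Start-End column into (type, start, end) phrases.

    column: list of per-word tags such as ["(A0*", "*", "*)", ...].
    Each tag is STARTS*ENDS: "(k" opens a phrase of type k at this word,
    and each ")" closes the most recently opened phrase.  Phrases may
    embed but never overlap, so a stack suffices.
    """
    phrases, stack = [], []
    for i, tag in enumerate(column):
        starts, ends = tag.split("*")
        for k in re.findall(r"\(([^()]+)", starts):
            stack.append((k, i))             # open a phrase of type k
        for _ in range(ends.count(")")):
            k, start = stack.pop()           # close the innermost phrase
            phrases.append((k, start, i))
    assert not stack, "unbalanced brackets"
    return phrases

# PROPS column for the target verb "faces" in the example above:
col = ["(A0*", "*", "*", "*", "*", "*)", "(V*)", "(A1*",
       "*", "*", "*", "*", "*", "*", "*", "*", "*", "*)", "*"]
print(decode_start_end(col))
# [('A0', 0, 5), ('V', 6, 6), ('A1', 7, 17)]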


The different annotations in a sentence are grouped in the following blocks, in column order:

WORDS: the words of the sentence.
NE: named entities.
POS: part-of-speech tags.
PARTIAL_SYNT: partial syntax, namely base chunks and clauses.
FULL_SYNT: the full syntactic parse of the sentence.
VS: the sense of each target verb.
TARGETS: the target verbs of the sentence, as lemmas.
PROPS: for each target verb, in order, a column with the arguments of its proposition.

For instance, the FULL_SYNT column of the example above encodes the following parse tree:

(S
  (NP (DT The)
      (ADJP
        (QP ($ $) (CD 1.4) (CD billion) ))
      (NN robot) (NN spacecraft) )
  (VP (VBZ faces)
      (NP (DT a) (JJ six-year) (NN journey)
          (S
            (VP (TO to)
                (VP (VB explore)
                    (NP
                      (NP (NNP Jupiter) )
                      (CC and)
                      (NP (PRP$ its) (CD 16) (JJ known) (NNS moons) )))))))
  (. .) )
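
Since every FULL_SYNT tag places the word at its * position, the tree above can be rebuilt by expanding each * into a (POS word) leaf. A minimal sketch with a hypothetical helper name (the provided scripts perform this conversion officially):

def synt_to_tree(words, pos, full_synt):
    """Rebuild a WSJ-style tree string from the FULL_SYNT column.

    Each Start-End tag such as "(S(NP*" marks where constituents open
    and close around the word; expanding the "*" into a "(POS word)"
    leaf and concatenating the tags yields the bracketed tree.
    """
    return " ".join(
        tag.replace("*", f"({p} {w})")
        for w, p, tag in zip(words, pos, full_synt)
    )

# A complete toy sentence:
print(synt_to_tree(["Stocks", "fell", "."],
                   ["NNS", "VBD", "."],
                   ["(S(NP*)", "(VP*)", "*)"]))
# (S(NP(NNS Stocks)) (VP(VBD fell)) (. .))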


Some notes on Propositions and Arguments

The data includes the following types of arguments: numbered arguments (A0-A5), which identify verb-specific roles; adjuncts (AM-); references (R-) to other arguments of the verb; and the verb itself (V). Phrases continuing a discontinuous argument carry the prefix C-, as described below.

Although we represent the arguments of each proposition in a format that allows embedding, in practice the arguments of a proposition governed by a verb never embed within one another.

Some arguments of a proposition may appear in the sentence split into several discontiguous phrases. In this case, each piece of an argument of type k is represented as a phrase in Start-End format: the first phrase carries the label k, and the remaining phrases carry the label C-k (for Continuation). For a system to correctly recognize a discontinuous argument, all and only its phrases have to be correctly recognized.
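
To make the matching criterion concrete, here is a sketch of how continuation phrases could be merged before comparison (the function and representation are illustrative, not part of the official scorer):

def group_continuations(phrases):
    """Merge C-k continuation phrases into discontinuous arguments.

    phrases: (type, start, end) tuples in sentence order.  A phrase
    labeled "C-k" is attached to the most recent argument of type k.
    Each argument becomes (type, frozenset of spans), so a discontinuous
    argument matches only if all and only its phrases are recognized.
    """
    args = []
    for k, s, e in phrases:
        if k.startswith("C-"):
            base = k[2:]
            for typ, spans in reversed(args):
                if typ == base:
                    spans.append((s, e))    # continuation of earlier k
                    break
        else:
            args.append((k, [(s, e)]))      # a new argument starts
    return {(typ, frozenset(spans)) for typ, spans in args}

# "A1 ... V ... C-A1": one A1 argument split into two phrases.
print(group_continuations([("A1", 0, 2), ("V", 3, 3), ("C-A1", 4, 6)]))
# e.g. {('A1', frozenset({(0, 2), (4, 6)})), ('V', frozenset({(3, 3)}))}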





Last update: January 28, 2005. Xavier Carreras, Lluís Màrquez.