
More than twenty years of Natural Language Research

Dr. Horacio Rodríguez is one of the researchers who witnessed the birth of the Research Group in Natural Language Processing (GRPLN). He became interested in Natural Language Processing when it was an almost unknown discipline in Spain. Today the GRPLN group brings together more than twenty researchers and has attained international recognition. Many companies have shown interest in the field, which has become a research line in its own right and has given rise to Linguistic Engineering.



Where do we begin? At the beginning...
Some days ago I met Horacio Rodríguez for the first time. At that moment I did not know who he was. The LSI department sometimes seems so big that my first months of work here have not been enough to draw a complete map of all the research and all the people who work here. And Dr. Rodríguez is not someone to be missed. He has been here since the very beginning of the Barcelona Informatics School.

After we exchanged a few words, curiosity drew me to find out who this scientist interested in Natural Language was. Today I wanted to ask him how GRPLN started.


 
The origins
Nowadays...
I have been told that you are the father of the Natural Language Processing Research Group. Is that right? How did you start it?
Well, yes, I have been here since the very beginning but, if I am the father, we could say that first there was a grandmother (he smiles). Felisa Verdejo was the first person in Spain to become interested in Natural Language. She was the advisor for my PhD thesis as well as Núria Castell's, and she was the one who created the Natural Language Processing research groups here in Barcelona, in Donosti and in Madrid.
Did this interest in Natural Language start in the rest of the world at the same time as in Spain?
Not at all. More than twenty years ago there was already a strong tradition outside Spain in the study of Natural Language, mostly in English.

Thirty years ago, when you started your career, the Barcelona Informatics School was just being founded. What had you studied, then?

Yes, I am one of the longest-serving professors here, together with Pere Botella, Rafel Cases, Antoni Olivé, Josep Díaz... When we started there were no computer scientists or computer engineers as such. We had the good fortune to live through the birth of this discipline in our country as well as the start of the Barcelona Informatics School. It is very moving to have shared a little part of the history of computing. But going back to your question, I studied Industrial Engineering and Physics.
Industrial Engineering and Physics? And with this background, what was it that drew you to Computer Science to the point of devoting your life to it?
In truth, at that time Computer Science appeared as something very novel. The market was huge and many students of science and technical degrees joined in with a lot of excitement.

What was your doctoral research about?
(sighs) It is so far back in time, and we have researched so many things since then... It feels strange to talk about those beginnings. I studied man-machine communication in Natural Language. At the time it was a true revolution. Year after year the forefront of research has changed.

Your PhD thesis may lie far back in time, but you probably feel closer to all the theses you have supervised since then?
Yes! It is true that at the start there were very few of us. And Felisa left Barcelona early. She now works at UNED (the Spanish Distance Education University). I have had to supervise a lot of PhD theses. Eleven up to now.

This represents a lot of effort and a lot of time for me, but I feel very happy when I think that this investment of my energy has turned into something that is very enriching. At the beginning I had to get acquainted with a lot of different topics that were new to me.  This gave me a broad vision and, often, it helps me to create links between very different aspects.

Where are all your PhD students now?
Well, most of them are now professors here and researchers at GRPLN. For example, I was the supervisor of Lluís Padró, Lluís Màrquez, Jordi Turmo, Alicia Ageno, Marta Gatius... among others.
Are you still supervising PhD theses?
Fortunately the group has grown a lot. Now we are twenty people, so my role is not so necessary. Our project keeps going and now I can enjoy several "Doctoral grandchildren".


What are your current research lines?
With so many people in the group it is easy to have different research lines or interests. Some people are more interested in the theoretical aspects of Machine Learning, others are more drawn to applications such as Machine Translation, and others prefer to study linguistic processors.

Evidently this is strongly influenced by the projects we are working on and the grants we are awarded. Last year we completed several EU projects such as CHIL and HOPS. We are currently working on three projects with grants from the Spanish Science Ministry.
What are they about?
On the one hand we are working on Text-Mess. Alicia Ageno is the principal investigator of this project. The name is, in fact, a pun: it is a broad project that aims to put an end to text mess.

The contents of this project are varied: information extraction, question answering, information retrieval, etc. Six Spanish universities take part in it.

We are also working on the Know project, which aims at the large-scale development of multilingual technology for Natural Language understanding.

Finally, we are involved in Open MT. This last project is related to automatic translation. We have been very much committed to this line.

On the resources side we have two basic lines: the development of lexical ontologies and improvements to the linguistic processor. Our system, Freeling, has become the most widely used tool for processing Catalan and Spanish, thanks to Lluís Padró.

We have also worked with Wordnets, which are enormous databases made up of words and networks of related words. At the beginning these databases were only available in English, but thanks to the EU project EuroWordNet they have been extended to other languages, such as Spanish or Catalan.

Catalan?
Yes, this was a project funded by the Generalitat de Catalunya, and we had help from linguists from the University of Barcelona.


I have been told that you work not only in Catalan, Spanish and English. You also work with more complex languages, like Arabic. Tell me more about this...
Yes, we have been working, with funding from the US government, on the creation of a Wordnet for the Arabic language, Arabic Wordnet. It has been extremely interesting work, which we completed last Christmas.



I imagine that working with Arabic is quite different from working with Western languages. What are the main difficulties?
Arabic has been a challenge for us. We have worked in close cooperation with two linguists, one from Syria, Musa, and another from Lebanon, Sabri.

Bear in mind that spoken Arabic has vowels and consonants like many languages; in written Arabic, however, the vowels disappear, and that makes the job harder. We have vowelized the texts we have worked with. This technique is also used in textbooks for beginners' Arabic courses and in the Koran.

Can you speak or read Arabic?
Yes, I can. It is a matter of personal interest. I decided to learn Arabic years ago. At the beginning, when I tried to read a newspaper, I thought it was going to be impossible to understand it; little by little, however, it came to seem no harder than shorthand or today's mobile text messages.
 
Do you feel that this type of initiative helps with cultural integration?
Of course! I think these are very positive contributions towards a good relationship and understanding between languages and countries. I have a personal interest in these topics. On the other hand, the very idea of European projects seems to me to be a good integration tool.


What other milestones has GRPLN attained?
Our group has now gained a certain international recognition, and in Spain we could say that we are one of the main research groups in Natural Language. We have taken part in projects such as Meaning, NAMIC, FAME, CHIL, etc.

Do you think there is still a road ahead for research?
There are many topics where we are just at the beginning or, at least, very far from where we could get to. Translation systems, for example, either give poor performance or are extremely expensive. I personally work with text, but other researchers in the group, for example Jordi Turmo, devote themselves to the study of spoken question answering, which adds another level of difficulty to the task.

This is a never ending job!

The future of Natural Language Processing?

How do you foresee the future of Natural Language Processing?
All these research topics are currently going through a good period. There are many companies interested in these techniques and lines of research. In fact, all this is gaining its own identity. Some people call this field Linguistic Engineering.

You have to bear in mind that, in the whole history of the study of language, we are living in one of its quietest moments. When the first machine translators appeared, there was an incredible boom. A lot of people were interested in knowing more about them and in improving the newborn techniques. But, at the time, the expectations proved to be overly ambitious. Now the interest is still there but we are more realistic. I think this is very positive. I encourage everyone who feels curious about language to join the master's and PhD programmes offered by our group, within the postgraduate Artificial Intelligence studies. There is a big world ahead!


Press Contact:
ilapuente@lsi.upc.edu

 
Last modified: May 2008
© UPC. Technical University of Catalonia
Departament de Llenguatges i Sistemes Informàtics