Algorithms for understanding humans
Lluís Màrquez is a researcher in the GRPLN research group. He has spent over fifteen years teaching machines to understand and to talk in human languages. Nowadays his research focuses on solving semantic processing problems and on applying natural language processing to machine translation.
In this newsletter we have already interviewed the founder of the Research Group in Natural Language Processing (GRPLN), Dr. Horacio Rodríguez. Now, in this third article, we talk to one of his former PhD students, Dr. Lluís Màrquez, now a professor at UPC and a researcher in the same group.
Lluís Màrquez
I studied Computer Science at the Informatics School of Barcelona (FIB). Then I went on to my PhD studies under the supervision of Horacio Rodríguez and I have been lecturing here at UPC since 1993.
What was your PhD Thesis about?
I studied subjects related to machine learning. The main theme of my research was a part-of-speech tagger. In natural language, the same word can play different grammatical roles in a sentence, and these different roles lead to different meanings; it is therefore necessary to distinguish between them.
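As an illustration of the ambiguity described in this answer, here is a minimal sketch of a most-frequent-tag baseline tagger. It is not the method developed in the thesis; the toy corpus, the tag names and the function names are assumptions made only for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy training data: sentences of (word, tag) pairs.
TRAIN = [
    [("the", "DET"), ("book", "NOUN"), ("is", "VERB"), ("old", "ADJ")],
    [("they", "PRON"), ("book", "VERB"), ("a", "DET"), ("flight", "NOUN")],
    [("a", "DET"), ("book", "NOUN"), ("fell", "VERB")],
]

def train_baseline(sentences):
    """Count how often each word carries each tag in the training data."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return counts

def tag_words(words, counts, default="NOUN"):
    """Give each word its most frequent training tag; context is ignored."""
    result = []
    for word in words:
        seen = counts.get(word.lower())
        best = seen.most_common(1)[0][0] if seen else default
        result.append((word, best))
    return result

counts = train_baseline(TRAIN)
tagged = tag_words(["they", "book", "the", "flight"], counts)
print(tagged)
```

In this toy data "book" is seen twice as a noun and once as a verb, so the baseline labels it a noun even in "they book the flight", where it is a verb. A tagger that resolves this kind of case has to look at the surrounding words as well.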
How would you classify your research?
Our research group, GRPLN, belonged to the Artificial Intelligence section of the Software Department (LSI) at UPC. The Software Department no longer has research sections, so there is no distinct Artificial Intelligence section; however, natural language processing is clearly a branch of Artificial Intelligence.
We develop computational models for language processing. In particular, we do research on the understanding of written language and on its applications, such as machine translation.
In fact, Natural Language is a speciality within the Artificial Intelligence master's and PhD studies. There are two optional subjects:
- Natural Language Processing for the massive treatment of textual information.
- Natural Language Processing for the machine/human communication.
I coordinate the first of these two subjects.
We are also part of a bigger interdepartmental group, the Centre de Tecnologies i Aplicacions del Llenguatge i la Parla (TALP). TALP is formed by researchers from our department and from the Departament de Teoria del senyal i la Comunicació at UPC. Our group is more focused on the textual side of language: we work on language processing, understanding, reasoning about content and a variety of related problems. The telecommunication engineers at TALP, on the other hand, work on the acoustic processing of the signal. Acoustics comes in twice: in speech recognition, i.e., the conversion of an acoustic signal into text, and in speech generation, i.e., the conversion of text into an acoustic signal (synthesized speech). Our job is to deal with the process in between, and we apply it to applications for telephone companies, open question services, information services, etc.
How do you apply Artificial Intelligence to your research?
I work with statistical machine learning. As you know, in our department there are other groups who also use Machine Learning in their research. What makes us different from the others is that we apply it to language issues.
For example, if we have a certain translation problem and a good bilingual corpus where we can find many examples of sentences in both languages, we can develop a general algorithm which is able to learn, from examples, the necessary knowledge to translate texts.
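The idea of learning translation knowledge from examples can be sketched with a simplified version of IBM Model 1, a classic word-level statistical translation model trained with expectation-maximization. This is only an illustration, not the group's system: the four-sentence English-Spanish corpus and all function names are assumptions, and the NULL word of the full model is left out.

```python
from collections import defaultdict

# Hypothetical toy parallel corpus: (English words, Spanish words).
CORPUS = [
    ("the house".split(), "la casa".split()),
    ("the book".split(), "el libro".split()),
    ("a book".split(), "un libro".split()),
    ("a house".split(), "una casa".split()),
]

def train_model1(corpus, iterations=20):
    """Estimate t[(f, e)], an approximation of P(target word f | source word e)."""
    # Start from a constant table: every pairing is equally plausible.
    t = defaultdict(lambda: 1.0)
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each target word among the source words of its sentence.
        for src, tgt in corpus:
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalise the expected counts into probabilities.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

def best_translation(t, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {f: p for (f, e), p in t.items() if e == source_word}
    return max(candidates, key=candidates.get)

table = train_model1(CORPUS)
print(best_translation(table, "house"), best_translation(table, "book"))
```

No dictionary is given to the algorithm. The pairing of "house" with "casa" and of "book" with "libro" emerges only because those words keep appearing together across sentence pairs, which is the sense in which the knowledge is learned from the corpus.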
Language seems very complex. How did you start your research?
At a basic level, it is important to break text into words and to treat their morphology. The difficulty of this task depends on the language we are dealing with. The morphological analysis of highly inflected and agglutinative languages can be very hard. Arabic, for example, is usually written without vowels; therefore it is highly ambiguous and a big challenge. By the way, although we mostly work with Catalan, Spanish and English, we also work with other languages such as Arabic or Chinese. A basic characteristic of Machine Learning is that we pursue and obtain general theories, which are almost independent of the language, so we can apply our methods to a variety of languages.
Syntax, on the other hand, tells us how words group together to form structures. This knowledge is useful for the right interpretation of texts.
We can find two branches of semantics in relation to text interpretation. Lexical semantics studies the meaning of words. Propositional semantics studies the meaning of the predicates of sentences.
Finally, pragmatics studies meaning in the light of discourse properties and world knowledge.
We build tools to solve problems that can be of help in any of these fields.
How do you organise yourselves as a research group?
In our group you can find people who devote their time to any of the following areas and applications:
- Machine Learning
- Information extraction
- Question-answering Systems
- Machine Translation
- Summarization
- Machine-person dialogue Systems
- Development of basic linguistic processing tools.
On the other hand, our research group has three specialists in linguistics and we collaborate with other research groups on computational linguistics.
Of these specialities, which one do you feel is most your own?
In the translation field we are trying to create a hybrid system that combines statistical machine translation, rule-based machine translation and example-based machine translation. On the other hand, one of our objectives is to improve statistical machine translation by introducing high-level linguistic information.
Have you been successful?
Well, there are international evaluation competitions which propose a Natural Language Processing problem together with all the necessary data and a strictly defined experimental setting. Participating groups have some months to work on different systems and solutions. Finally there is a conference where all the results are shared and the best ones receive a prize. We have participated many times and we have obtained very good results. The "Shared Tasks" of the Conference on Computational Natural Language Learning (CoNLL) are some of the most representative NLP competitions; we have participated in them and have even been involved in their organisation. These competitions, which started in 1999, are organised by SIGNLL (Special Interest Group on Natural Language Learning), which is a SIG of ACL (Association for Computational Linguistics).
In my opinion the field has a long and very promising future. Nowadays, there are more researchers in love with Natural Language and its applications. Although there is a long way to go, Natural Language research is moving forward. We are a group of 33 people: 14 professors, 3 researchers, 12 PhD students, 2 master's students and 2 developers, who participate in many national and European research projects.
I believe that our group is becoming a leading group in the area of Natural Language Processing Research.
