<!--#include virtual="cabecera.html" -->  
        <h3>Project Summary</h3>
        MEANING will be concerned with automatically collecting and analysing language
        data from the WWW on a large scale, and building more comprehensive multilingual
        lexical knowledge bases to support improved word sense disambiguation (WSD).
        <p>Current web access applications are based on words; MEANING will open
        the way for access to the Multilingual Web based on concepts, providing
        applications with capabilities that significantly exceed those currently
        available. MEANING will facilitate the development of concept-based, open-domain
        Internet applications (such as Question Answering, Cross-Lingual Information
        Retrieval, Summarisation, Text Categorisation, Event Tracking, Information
        Extraction and Machine Translation). Furthermore, MEANING will supply
        a common conceptual structure to Internet documents, thus facilitating
        knowledge management of web content.
        <p>Progress is being made in Human Language Technology (HLT), but there
        is still a long way to go towards Natural Language Understanding (NLU). An important
        step towards this goal is the development of technologies and resources
        that deal with concepts rather than words. MEANING will develop concept-based
        technologies and resources through large-scale knowledge processing over
        the web, robust and fast machine learning algorithms, very large lexical
        resources and novel strategies for combining them. Small-scale, isolated
        experiments with limited infrastructure (such as Internet access, processing
        power, and storage space) have no chance of bridging the gap to understanding.
        Advances in this area can only be expected in the context of large-scale
        long-term research projects.
        <p>MEANING will treat the web as a (huge) corpus to learn information from,
        since even the largest conventional corpora available (e.g. the Reuters
        corpus, the British National Corpus) are not large enough to yield
        reliable information in sufficient detail about language behaviour.
        Moreover, most European languages lack sufficiently large or diverse
        corpora.
        <br>
        <a href='fullsummary.html' >more...</a> 
        <!--#include virtual="pie.html" -->