Language technologies, a field of artificial intelligence (AI), give machines the ability to read, analyse and process human language. Common examples are automatic speech recognition, automatic translation, classification and search engines.
In Europe we have a complex language landscape, which must be taken into account. The Charter of Fundamental Rights of the EU prohibits discrimination on grounds of language and places an obligation on the EU to respect linguistic diversity. These rights can only be guaranteed by an unbiased use of AI in language technologies. Accountability, transparency, fairness, visibility and respect for our values are only a few of the ethical implications of that use.
What ingredients are needed to develop language technologies?
Language resources, training algorithms and the resulting language models are some of the key raw materials for developing tools and services.
- Language resources are the raw data essential to build, improve and evaluate natural language processing tools. Language resources can take various forms and shapes, including written or spoken corpora, grammars or terminology databases.
- Training algorithms, based on artificial intelligence principles such as artificial neural networks and adapted to the specificities of our languages, analyse and model these language resources.
- Language models resulting from the training process can be used for a large variety of applications, some of which are still emerging. The larger the resources and models are, the more encompassing and generic their applications are. The potential of such models is vast and yet relatively uncharted. Their ability to better extract meaning from text, audio or video is likely to generate new digital services with huge societal and economic impact.
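How the three ingredients fit together can be illustrated with a deliberately tiny sketch. It is not how production systems are built: a simple bigram counter stands in for the neural networks mentioned above, and the three-sentence corpus and the function names `train_bigram_model` and `next_word_probabilities` are invented purely for illustration.

```python
from collections import Counter, defaultdict

# Language resource: a tiny, made-up written corpus (illustrative only).
corpus = [
    "the commission supports language technologies",
    "the commission supports linguistic diversity",
    "language technologies support linguistic diversity",
]


def train_bigram_model(sentences):
    """Training algorithm (toy stand-in): count which word follows which."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Add start and end markers so sentence boundaries are modelled too.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for previous, following in zip(tokens, tokens[1:]):
            counts[previous][following] += 1
    return counts


def next_word_probabilities(model, word):
    """Language model in use: relative frequency of each word seen after `word`."""
    followers = model.get(word, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # The word never appeared in the training corpus.
    return {w: c / total for w, c in followers.items()}


model = train_bigram_model(corpus)
print(next_word_probabilities(model, "commission"))
print(next_word_probabilities(model, "supports"))
```

Even at this scale the dependence on the resource is visible: the model can only predict continuations it has seen, so a word absent from the corpus yields no prediction at all. Larger and more varied resources are what make the resulting models more generic.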
The European language technology industry plays a key role in Europe’s strategic and technological autonomy, which should be further strengthened. Our specific market needs are best known by European language technology providers, of which hundreds are listed in the Catalogue of eTranslation Services.
Public solutions, such as eTranslation, provide basic tools and services that complement what the market offers. They ensure that language technologies are available to all European public administrations and SMEs. The European Commission is encouraging and supporting the development and deployment of language tools, such as named entity recognition, summarisation, automatic segmentation, sentiment analysis and speech transcription services. Data anonymisation solutions to ensure GDPR compliance are also being developed.
The European Language Grid (ELG) has created a one-stop shop of specialised language technology solutions. Dissemination and community-building efforts have helped to build a common understanding of the need to join public and private forces and benefit from the best of both worlds in research and deployment.
European actions exist on various levels:
- The Horizon Europe Programme fosters research and innovation through cross-sectoral support of language technologies. Initial efforts will focus on developing workflows, algorithms and knowledge for building multimodal interactive models that cover various modalities and languages.
- The Digital Europe Programme encourages European public and private sectors to deploy language technologies. The European Commission coordinates the EU effort across the Member States and private institutions, developing a European language technology ecosystem.
Combining all these ingredients is a major challenge for the EU, European language technology providers, and the national public administrations in charge of AI and language technology strategies. The ultimate purpose is to support Europe’s Digital Decade for the benefit of all.