Lexical data for AI

At Oxford Languages, our language data specialists build clean, accurate, and human-curated text and speech datasets that are optimized for model training and a range of other natural language processing (NLP) applications.

Human-curated data

Lexical datasets in more than 60 languages, curated by Oxford’s world-renowned lexicographers and language experts.

Extensive data features

Dataset features to support a wide variety of use cases, including machine translation, AI voice generation, conversational AI, and more.

Data sourcing

A dedicated team of language, data, and product sourcing specialists.

Data support

Support from our Customer Success team to help you get the most value from our data.

How our partners are using our data

Elemental Cognition

enabling AI to learn language and interpret meaning

From powering programmes to empowering people, our partnerships span the spectrum of language and technology development as we work with like-minded innovators to advance learning and communication worldwide. How could you partner with Oxford Languages?

using neural networks to individually tailor language


developing inclusive technologies for literacy confidence



award-winning language apps licensing Oxford Languages data