Language data samples

Sample our English language datasets

At Oxford Languages, we are the leading provider of lexical and language datasets. Our dictionary data can be used for a range of purposes, including:

Dictionary display: Let users look up the definition of a word within your product, so that they do not have to leave it.

Advanced feature support: For example, confirming whether your users have spelled or used a word correctly.

Training data: Dictionary data to help train your AI and NLP models.

We have a range of structured language datasets that support a wide variety of use cases.

What we sell...


We offer language content datasets to support software product development and enhancement. See our popular English monolingual datasets below:

What we offer...


Accurate and trusted data

Our data is human-curated by our team of expert lexicographers. The Oxford name, and the standards expected of it, back the data that will bolster your products and brands.

Flexible data delivery

Our language datasets are available in several data formats, including JSON via API and XML.

Support

Our Customer Success team is available to help you get the most value from our data.

How do companies use our data?


 

Astrid

Astrid use Oxford Languages data to validate their voice-based, AI-powered language learning platform, enabling their users to communicate confidently in English.

Read our case study on Astrid ⟶

Kobo

Oxford Languages and Kobo collaborated to develop a tailor-made solution for Kobo's built-in dictionary feature, creating a seamless experience for users.

Read our case study on Kobo ⟶

Other types of data we offer...


Monolingual

Ideal for dictionary look up and display.

Bilingualized

Ideal for language learners.

Pronunciations

Ideal for demonstrating the pronunciation of a word.

Bilingual

Ideal for translation.

Thesaurus

Ideal for synonym suggestions and NLP.

Sentences

Ideal for understanding how a word is used in context.

Our datasets feature: headwords, definitions, translations, pronunciations, parts of speech, senses, example sentences, synonyms, and etymologies.

Get in touch if you would like to sample our other datasets ⟶