What Oxford Dictionaries Licensing offers
Oxford Dictionaries Licensing is the leading provider of human-annotated lexical data for artificial intelligence, natural language processing, machine learning, and a wide range of language technologies.
Oxford Dictionaries Licensing combines our own curated content with that of other respected publishers and content owners worldwide, all validated by our dictionary team. After discussing your project and requirements with you, our team of language engineers will deliver a flexible and innovative solution for your unique needs.
Contact us to request data samples or further information.
What datasets and content do we provide?
We provide the following types of datasets:
- Monolingual dictionary data
- Bilingual dictionary data
- Sentence dictionaries
- Synonym content (Thesauri)
- General and domain-specific wordlists
- Inflected forms linked to monolingual or bilingual dictionary data
- Corpora-derived n-grams and frequency
- Human audio pronunciation files
- Hyphenation information
- Multilingual wordlists
- Parallel corpora
- Curated corpora
- AI training data
- Domain-specific wordlists and dictionaries
Our language datasets
We offer datasets for 29 of the world’s languages. These datasets include definitions, translations, examples and idioms, phonetics and phonetic transcriptions, regional varieties, and inflected forms.
- number of defined terms
- number of translated terms
- number of examples and idioms
- inclusion of phonetics and phonetic transcriptions
- availability of inflected forms
- regional varieties covered
What could you develop with our data?
Discover what solutions these companies developed with our lexical data.
Visit our Frequently Asked Questions page for more information.