Semantic English Language Database
The Semantic English Language Database (SELD) provides unrivalled universal coverage of English from across the English-speaking world, enhanced and optimized for machine learning projects.
Built from Oxford’s world-renowned English dictionaries, SELD is a fully combined resource with interlinked thesauri, morphology, and more than two million example sentences drawn from real-life usage through the Oxford Corpus.
Natural Language Processing metadata tags within the data act as sense-level links between senses, synonyms, and example sentences to produce a semantically annotated, structured dataset that distinguishes between fine shades of meaning for word-sense disambiguation.
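As a rough illustration of how sense-level links can support word-sense disambiguation, here is a minimal sketch. It is not SELD's actual schema or method: the `Sense` record, its field names, the identifiers such as `bank_n_1`, and the toy entries are all assumptions made up for this example. The technique is a simplified Lesk-style word overlap, chosen only because it is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    """One sense of a headword. All field names are hypothetical."""
    sense_id: str
    definition: str
    synonyms: list[str] = field(default_factory=list)
    examples: list[str] = field(default_factory=list)


# Toy entries written for this sketch; they are not taken from SELD.
BANK = [
    Sense(
        sense_id="bank_n_1",
        definition="the land alongside a river or lake",
        synonyms=["shore", "edge"],
        examples=["We walked along the bank of the river."],
    ),
    Sense(
        sense_id="bank_n_2",
        definition="a financial institution that accepts deposits and makes loans",
        synonyms=["lender"],
        examples=["She deposited the money at the bank."],
    ),
]

# Words ignored when comparing texts, including the target word itself.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "at", "in",
             "that", "to", "we", "she", "he", "bank"}


def tokens(text: str) -> set[str]:
    """Lowercase the words, strip surrounding punctuation, drop stopwords."""
    return {w.strip(".,;:!?").lower() for w in text.split()} - STOPWORDS


def disambiguate(context: str, senses: list[Sense]) -> Sense:
    """Return the sense whose linked texts share the most words with the context."""
    ctx = tokens(context)

    def overlap(sense: Sense) -> int:
        # The definition, synonyms, and examples linked to a sense
        # together form its "signature".
        signature = tokens(sense.definition)
        for linked_text in sense.synonyms + sense.examples:
            signature |= tokens(linked_text)
        return len(ctx & signature)

    return max(senses, key=overlap)


chosen = disambiguate("He sat on the bank of the river fishing", BANK)
print(chosen.sense_id)
```

Because each example sentence is attached to one specific sense, the overlap score draws on real usage as well as the definition, which is the advantage the sense-level links are meant to provide.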
With complete phonetics and links to our industry-standard human-voiced pronunciation audio, SELD brings together the same data that powers the best search engines and voice assistants used by millions of people around the world every day.
How could SELD power your products?
Our lexicographers are experts in identifying and interpreting language nuance, and through SELD you can leverage Oxford’s unrivalled language expertise to power your AI and solve natural language processing challenges.
With a choice of components and regular updates available to meet your specific requirements, SELD offers a dependably high-quality, thorough resource for rich, semantically linked lexical data.
Get in touch to learn more about SELD from our team, or to receive a SELD sample.
How can SELD resolve word-sense disambiguation challenges?
Humans are naturals at word-sense disambiguation, easily able to identify and differentiate the senses of a word in context. For the machines that we build, however, this is a steep learning challenge. How can we leverage human expertise in word-sense disambiguation to advance machine learning?
In the video, Zach Haynes, our director of global business development, explains how SELD enables AI to understand and interpret language nuance with ease.
SELD+ bilinguals
With SELD+, the structured, semantically linked language data of the monolingual SELD product can be deployed cross-lingually in 10 major languages: Arabic, Chinese (simplified), French, German, Italian, Portuguese, Russian, Spanish, Tamil, and Telugu.
SELD+ provides semantic links from English definitions and real-life sense-annotated examples to translations of each meaning in Oxford’s premier bilingual dictionaries, making this a unique resource to improve outputs from machine translation.
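To show why linking translations to individual senses, rather than to whole words, can help machine translation, here is a small sketch. The `TRANSLATIONS` mapping, the sense identifiers, and the `translate_sense` helper are hypothetical and do not reflect the actual SELD+ data format.

```python
# Hypothetical sense-to-translation links; identifiers and layout are assumptions.
TRANSLATIONS = {
    "bank_n_1": {"fr": ["rive", "berge"], "es": ["orilla"]},  # riverside sense
    "bank_n_2": {"fr": ["banque"], "es": ["banco"]},          # financial sense
}


def translate_sense(sense_id: str, language: str) -> list[str]:
    """Return the translations linked to one sense, or an empty list if none."""
    return TRANSLATIONS.get(sense_id, {}).get(language, [])


print(translate_sense("bank_n_1", "fr"))
print(translate_sense("bank_n_2", "fr"))
```

Once the English sense has been identified, the lookup returns only the translations for that meaning, so the riverside and financial senses of "bank" are never confused.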
Get in touch to learn more about SELD+ from our team, or to receive a SELD+ sample.
Languages available with SELD+ | Dictionary and coverage |
---|---|
Arabic | The Oxford Arabic Dictionary contains fully up-to-date vocabulary, reflecting contemporary Arabic and English usage with 30,000 headwords per side and over 200,000 translations. |
Chinese (simplified) | The New Oxford English-Chinese Dictionary includes 360,000 words, phrases, and definitions, as well as example sentences. |
French | The Oxford-Hachette French Dictionary is the most comprehensive dictionary in French and English, covering over 360,000 words and phrases. It is up-to-date both lexically and culturally. |
German | The Oxford German Dictionary provides authoritative and culturally up-to-date coverage of 320,000 words and phrases, and over 520,000 translations. |
Italian | The Oxford-Paravia Italian Dictionary is the most extensive Italian dictionary available, capturing idiomatic, colloquial, spoken, and written forms of Italian and English. It contains over 300,000 words and phrases and 450,000 translations. |
Portuguese | The Oxford Portuguese Dictionary consists of more than 200,000 words and phrases and 320,000 translations and is endorsed worldwide. |
Russian | The Oxford Russian Dictionary continues to be the leader in its field and covers over 180,000 words and phrases and 290,000 translations. |
Spanish | The Oxford Spanish Dictionary extensively covers 300,000 words and phrases and 500,000 translations, making it the most authoritative Spanish bilingual available. |
Tamil | The Oxford English-Tamil Dictionary accurately covers 10,000 headwords and over 25,000 translations. |
Telugu | The Oxford Telugu-English Dictionary is based on everyday language and boasts over 28,000 headwords and 46,000 translations. |