Semantic English Language Database
The Semantic English Language Database (SELD) provides unrivalled universal coverage of English from across the English-speaking world, enhanced and optimized for machine learning projects.
Built from Oxford’s world-renowned English dictionaries, SELD is a fully combined resource with interlinked thesauri, morphology, and more than two million example sentences drawn from real-life usage through the Oxford Corpus.
Metadata tags designed for natural language processing link each sense to its synonyms and example sentences, producing a semantically annotated, structured dataset that distinguishes fine shades of meaning for word-sense disambiguation.
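To picture what sense-level linking means in practice, here is a minimal sketch in Python. It is illustrative only: the field names, sense identifiers, definitions, synonyms, and example sentences are all invented for this sketch and are not SELD's actual schema or content.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    """One meaning of a word, with everything linked at sense level.

    Hypothetical structure for illustration; not the SELD data format.
    """
    sense_id: str
    lemma: str
    definition: str
    synonyms: list[str] = field(default_factory=list)
    examples: list[str] = field(default_factory=list)


# Invented sample entries: two senses of the same headword, each with
# its own synonyms and example sentences.
SENSES = [
    Sense(
        sense_id="bank.n.1",
        lemma="bank",
        definition="the land alongside a river or lake",
        synonyms=["shore", "riverside"],
        examples=["They walked along the bank of the river."],
    ),
    Sense(
        sense_id="bank.n.2",
        lemma="bank",
        definition="a financial institution that holds deposits and makes loans",
        synonyms=["lender"],
        examples=["She paid the cheque into her bank."],
    ),
]


def senses_for(lemma: str) -> list[Sense]:
    """Return every sense recorded for a headword."""
    return [s for s in SENSES if s.lemma == lemma]
```

Because synonyms and examples hang off a sense rather than off the headword, a lookup for "bank" returns two separate records, and "shore" is never offered as a synonym for the financial meaning.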
With complete phonetics and links to our industry-standard human-voiced pronunciation audio, SELD brings together the same data that powers the best search engines and voice assistants used by millions of people around the world every day.
How could SELD power your products?
Our lexicographers are experts in identifying and interpreting language nuance, and through SELD you can leverage Oxford’s unrivalled language expertise to power your AI and solve natural language processing challenges.
With a choice of components and regular updates available to meet your specific requirements, SELD offers a dependably high-quality, thorough resource for rich, semantically linked lexical data.
Get in touch to learn more about SELD from our team, or to receive a SELD sample.
How can SELD resolve word-sense disambiguation challenges?
Humans are naturals at word-sense disambiguation, easily identifying and differentiating the senses of a word in context. For the machines that we build, however, this is a steep learning challenge. How can we leverage human expertise in word-sense disambiguation to advance machine learning?
In the video, Zach Haynes, our director of global business development, explains how SELD enables AI to understand and interpret language nuance with ease.
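As a rough illustration of how sense-annotated definitions and examples can support word-sense disambiguation, the sketch below uses a simplified Lesk-style word-overlap heuristic in Python. This is a generic textbook technique, not a description of how SELD or any Oxford product works, and the sense identifiers, definitions, example sentences, and stopword list are invented for the sketch.

```python
import re

# Small hand-picked stopword list, assumed for this sketch only.
STOPWORDS = {
    "a", "an", "the", "of", "or", "and", "to", "in", "into", "on",
    "along", "her", "his", "she", "he", "they", "that",
}

# Invented sample data: each sense ID maps to its definition plus
# sense-annotated example sentences.
SENSES = {
    "bank.n.1": [
        "the land alongside a river or lake",
        "They walked along the bank of the river.",
    ],
    "bank.n.2": [
        "a financial institution that holds deposits and makes loans",
        "She paid the cheque into her bank.",
    ],
}


def tokens(text: str) -> set[str]:
    """Lowercase the text and return its content words as a set."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def disambiguate(context: str, senses: dict[str, list[str]] = SENSES) -> str:
    """Pick the sense whose definition and examples share the most words
    with the context sentence (simplified Lesk overlap)."""
    ctx = tokens(context)
    scores = {}
    for sense_id, texts in senses.items():
        signature = set()
        for text in texts:
            signature |= tokens(text)
        scores[sense_id] = len(ctx & signature)
    return max(scores, key=scores.get)
```

Calling `disambiguate("We fished from the bank of the river")` selects `bank.n.1`, while `disambiguate("He asked the bank about loans and deposits")` selects `bank.n.2`. Production systems typically rely on far richer signals than word overlap, but the sketch shows why examples tied to individual senses are useful training and evaluation material.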
With SELD+, the structured, semantically linked language data of the monolingual SELD product can be deployed cross-lingually in 10 major languages: Arabic, Chinese (simplified), French, German, Italian, Portuguese, Russian, Spanish, Tamil, and Telugu.
SELD+ provides semantic links from English definitions and real-life sense-annotated examples to translations of each meaning in Oxford’s premier bilingual dictionaries, making this a unique resource to improve outputs from machine translation.
Get in touch to learn more about SELD+ from our team, or to receive a SELD+ sample.