Today you will find Oxford Dictionaries powering a huge range of technology, apps, and digital services.
Our world-renowned dictionary data powers search engines, provides definitions in e-readers, and makes possible predictive text and language-learning software. We use cutting-edge technology to integrate, optimize, and link our rich language data, working with partners across the world of technology to create the most flexible and reliable platforms and processes.
Rich, detailed, versatile data
We have comprehensive dictionary datasets in many different languages, from global languages such as English and Chinese to regional languages such as isiZulu and Malay. These datasets go far beyond simply giving the definitions of words: they include information about levels of formality, regional variations, example sentences, audio pronunciations, and more.
The dictionary technology team
We have a top-flight team of data engineers, software developers, data architects, computational linguists, and other technical specialists. Together, they make sure our data conforms to standard data models and that it’s reliable, flexible, and interlinked, so that it can easily be used on any platform.
Technical innovation and the Oxford English Dictionary (OED)
Oxford Dictionaries has a long history of technical innovation. In the 1980s, the Oxford English Dictionary (OED) was one of the first major reference projects to be fully digitized. It was published on CD-ROM in 1992, and a fully searchable prototype website was built as early as 1994. In March 2000, oed.com became the first major national, historical dictionary to be published online.
The OED now comprises a series of rich, interlinked datasets covering semantics, etymology, pronunciation, orthography, and frequency across the history of English, as well as large corpora of primary linguistic evidence. This makes the OED increasingly versatile, not only as a dictionary but as a data resource for use in applications and research.
Language data today
We mine huge amounts of text from the web and compile it into corpora. We then use language processing techniques, machine learning methods, and data analytics to update and enhance our existing resources and to create new ones.
The results can be used for a wide range of purposes. For example, we build language data for digitally under-resourced languages such as Indonesian and isiZulu. We also provide technology companies with bespoke datasets such as customized wordlists, full-fledged morphology information, or word-usage frequency data.
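To make "word-usage frequency data" concrete, here is a minimal sketch of how such a dataset could be derived from a small corpus. It is an illustration only and does not describe our production pipeline: the function names, the simple regular-expression tokenizer, and the two sample sentences are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical two-sentence "corpus"; a real corpus would hold far more text.
SAMPLE_TEXTS = [
    "The bat flew out of the cave.",
    "He swung the bat and hit the ball.",
]


def tokenize(text):
    """Lowercase the text and return its words.

    A word is a run of letters, optionally joined by an apostrophe or
    hyphen (e.g. "don't", "well-known"). Real tokenizers are more careful.
    """
    return re.findall(r"[^\W\d_]+(?:['’-][^\W\d_]+)*", text.lower())


def count_words(texts):
    """Count how often each word occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def per_million(counts):
    """Convert raw counts to occurrences per million words."""
    total = sum(counts.values())
    return {word: count * 1_000_000 / total for word, count in counts.items()}


counts = count_words(SAMPLE_TEXTS)
frequencies = per_million(counts)

# A simple customized wordlist: every word seen at least twice.
wordlist = sorted(word for word, count in counts.items() if count >= 2)
```

Normalizing to a per-million rate, rather than reporting raw counts, is a common way to keep frequencies comparable between corpora of different sizes.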
LEAP: A powerful platform
We’ve developed an innovative Lexical Engine and Platform (LEAP) that puts language content through a series of transformations, links the data, and houses multiple languages in a single platform. LEAP makes our data flexible and reusable, enabling anyone to explore and illustrate relationships between words, concepts, and expressions in one or more languages. This linking also helps keep translations accurate and reliable. For example, if you’re looking for the Spanish equivalent of the English noun ‘bat’, you will want to make sure you use the right word, whether you mean the implement used in sports such as cricket or the nocturnal mammal. LEAP makes this possible.
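The ‘bat’ example can be sketched in a few lines of code. What follows is a hypothetical illustration of translations attached to individual senses rather than to spellings; it is not LEAP’s actual data model or API. The `Sense` class, the `TRANSLATIONS` table, and both helper functions are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    """One meaning of a word (hypothetical structure, for illustration)."""
    lemma: str
    pos: str
    gloss: str


# Two senses that share the same English spelling and part of speech.
BAT_IMPLEMENT = Sense("bat", "noun", "implement used in sports such as cricket")
BAT_ANIMAL = Sense("bat", "noun", "nocturnal flying mammal")

SENSES = [BAT_IMPLEMENT, BAT_ANIMAL]

# Translations are keyed by (sense, target language), not by the bare word,
# so each meaning can map to a different Spanish equivalent.
TRANSLATIONS = {
    (BAT_IMPLEMENT, "es"): "bate",
    (BAT_ANIMAL, "es"): "murciélago",
}


def senses_for(lemma, pos):
    """Return every known sense of a word with the given part of speech."""
    return [s for s in SENSES if s.lemma == lemma and s.pos == pos]


def translate(sense, target_lang):
    """Look up the translation of one specific sense, or None if unknown."""
    return TRANSLATIONS.get((sense, target_lang))


# List each sense of the noun 'bat' alongside its Spanish equivalent.
options = [(s.gloss, translate(s, "es")) for s in senses_for("bat", "noun")]
```

Keying the lookup on the sense forces the caller to choose a meaning first, which is what prevents the sports implement from being confused with the animal.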
We work with some of the best technical partners and organizations in the world to create new, cutting-edge language solutions and services. We are proud of our partners and the thinking, ideas, and enthusiasm they bring to our products and services.