What is a dictionary dataset?
At Oxford Languages we provide lexical and language datasets for a wide range of technologies and applications. We offer dictionary data in more than 60 languages, and each dataset is made up of a number of different components.
There are many different types of dictionaries. The three main types are monolingual, bilingual, and bilingualized. There are also thesauruses, which are not dictionaries but are closely related.
A monolingual dictionary gives definitions of words in a single language. The main categories within monolingual dictionaries are:
- Current dictionaries, which give the current meanings of a word. These are what most people think of when they hear the word ‘dictionary’.
- Learner’s dictionaries, which give definitions in simple language, and are most suitable for learners of the language.
- Historical dictionaries, which list all meanings a word has ever had, from the first to the most recent, e.g. the Oxford English Dictionary.
A bilingual dictionary gives translations of the headword in a different language. The language of the headword is called the source language, and the language of the translation is called the target language. A bilingual dictionary can be monodirectional, translating only from the source language into the target language, or bidirectional, translating both from ‘language A’ into ‘language B’ and from ‘language B’ into ‘language A’.
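The difference between monodirectional and bidirectional dictionaries can be illustrated with a minimal Python sketch. The class names `Direction` and `BilingualDictionary`, the language codes, and the Spanish-to-English pair are assumptions made for illustration only; they do not describe how Oxford Languages datasets are actually structured.

```python
from dataclasses import dataclass, field


@dataclass
class Direction:
    """One direction of a bilingual dictionary (hypothetical model)."""
    source_language: str
    target_language: str
    # Maps each headword to its translations in the target language.
    translations: dict[str, list[str]] = field(default_factory=dict)

    def translate(self, headword: str) -> list[str]:
        # Unknown headwords yield an empty list rather than an error.
        return self.translations.get(headword, [])


@dataclass
class BilingualDictionary:
    """Holds one direction (monodirectional) or two (bidirectional)."""
    directions: list[Direction]

    @property
    def is_bidirectional(self) -> bool:
        # Bidirectional means some A -> B direction has a matching B -> A.
        pairs = {(d.source_language, d.target_language) for d in self.directions}
        return any((target, source) in pairs for source, target in pairs)

    def translate(self, headword: str, source_language: str) -> list[str]:
        for direction in self.directions:
            if direction.source_language == source_language:
                return direction.translate(headword)
        raise KeyError(f"no direction with source language {source_language!r}")


en_es = Direction("en", "es", {"advisedly": ["con conocimiento de causa"]})
es_en = Direction("es", "en", {"gato": ["cat"]})  # illustrative pair only

monodirectional = BilingualDictionary([en_es])
bidirectional = BilingualDictionary([en_es, es_en])

print(monodirectional.is_bidirectional)       # False
print(bidirectional.is_bidirectional)         # True
print(bidirectional.translate("gato", "es"))  # ['cat']
```

In this sketch the two directions are stored separately rather than derived by inverting one mapping, on the assumption that each side of a bidirectional dictionary has its own headword list.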
Bilingualized dictionaries, also called semi-bilingual dictionaries, start life as monolingual dictionaries and then have translations added. They give the definition of the headword in the same language as the headword, together with a translation in the target language. Some bilingualized dictionaries translate the headword, some translate the definition, and some do both in different entries.
Thesauruses give lists of synonyms and near-synonyms for words. Some also list antonyms.
Types of data in dictionaries
|Entry:
|One word in the dictionary with all its information: senses, translations/definitions, examples, etc.
|Headword:
|A headword is a word as you would look it up in a dictionary, e.g. the word cat, but not the word cats. All other features, such as examples, phrasal verbs, and idioms, are listed under the headword. A headword can be a single word, but it can also be, for example, an abbreviation (ETA) or a compound word (cat flap).
|Definition: monolingual and bilingualized dictionaries only
|Explains what the headword means, in the same language as the headword.
|Translation: bilingual and bilingualized dictionaries only
|The translation of the headword. This is the equivalent of the headword that would be used in the same position in the sentence with the same function. A one-word headword can sometimes have a translation that consists of more than one word, e.g. English advisedly has the Spanish translation con conocimiento de causa, and vice versa, e.g. English in any case has the German translation jedenfalls.
|Part of speech:
|Tells you whether a word is a noun, verb, etc.
|Pronunciation:
|Tells you how a word is pronounced, using IPA, respelling, transliteration, or audio.
|Sense:
|Many words have more than one meaning, e.g. table as a piece of furniture or table as in the periodic table. These different meanings are called senses; each sense is listed separately and given its own definition or translation.
|Indicator: bilingual dictionaries only
|Gives a short description of each sense. Indicators help the user find the sense of the headword they want to translate and guide them to the right translation.
|Label:
|Gives extra information on a word or sense. There are three types of label:
|Gloss: bilingual dictionaries only
|If there is no translation for a particular word or sense, a short description can be given in the target language, so that speakers of that language can still understand what the word means. For example, there is no English equivalent of the Arabic word musahharati, so the Arabic-English dictionary gives person who walks the streets during Ramadan to wake up people for the morning meal, typically using a drum and calling. This is like a definition, but it is in the target language.
|Glossarial note: bilingual dictionaries only
|Gives extra information on the translation when the translation is less specific than the headword, e.g. harvest moon: pleine lune (de l’équinoxe d’automne).
|Example:
|An example shows the headword inside a sentence or phrase. Examples in monolingual dictionaries usually show a typical use of the headword. Examples in bilingual dictionaries may do the same, or show a use that is particularly noteworthy for learners or translators. In bilingual (and some bilingualized) dictionaries, examples usually have translations.
|Idiom:
|An idiom is a phrase whose meaning can’t be derived from its individual words, e.g. over the moon, upset the apple cart, it’s raining cats and dogs.
|Phrasal verb:
|A phrasal verb is a verb combined with an adverb or preposition (or both) to give a specific meaning, e.g. break down, see to, look down on.
|Etymology:
|An etymology explains the historical origin of the word. Depending on the style of the dictionary, an etymology can be as simple as stating the language the word came from, or it may give an extensive narrative description.
|Synonyms:
|Synonyms are different words which mean the same thing as a headword or sense. Some dictionaries give synonyms as well as translations or definitions.
|Antonyms:
|Antonyms are words which mean the opposite of a headword or sense. Some dictionaries give antonyms, as do some thesauruses.
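To show how these components might fit together in a dataset, here is a minimal Python sketch of a single entry. The class and field names (`Entry`, `Sense`, `Example` and their attributes) are hypothetical, the definitions are placeholder wording rather than text from any Oxford dictionary, and the model covers only a subset of the components described above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Example:
    text: str
    # Usually filled in bilingual (and some bilingualized) dictionaries.
    translation: Optional[str] = None


@dataclass
class Sense:
    # Monolingual and bilingualized dictionaries carry a definition.
    definition: Optional[str] = None
    # Bilingual and bilingualized dictionaries carry translations.
    translations: list[str] = field(default_factory=list)
    # Short description that guides users to the right translation.
    indicator: Optional[str] = None
    labels: list[str] = field(default_factory=list)
    examples: list[Example] = field(default_factory=list)
    synonyms: list[str] = field(default_factory=list)
    antonyms: list[str] = field(default_factory=list)


@dataclass
class Entry:
    headword: str
    part_of_speech: Optional[str] = None
    pronunciation: Optional[str] = None
    etymology: Optional[str] = None
    senses: list[Sense] = field(default_factory=list)
    idioms: list[str] = field(default_factory=list)
    phrasal_verbs: list[str] = field(default_factory=list)


# Placeholder definitions for the two senses of "table" mentioned in the text.
table = Entry(
    headword="table",
    part_of_speech="noun",
    senses=[
        Sense(definition="a piece of furniture with a flat top"),
        Sense(definition="a set of facts arranged in rows and columns, "
                         "as in the periodic table"),
    ],
)

print(table.headword, len(table.senses))  # table 2
```

Keeping definitions, translations, and examples on the sense rather than on the entry mirrors the description above, where each sense is given its own definition or translation.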