
Dictionary Language Datasets
Expertly curated dictionary datasets for over 60 major world languages
Available Languages
We provide expertly curated datasets for over 60 major world languages, available as ready-made packages or customized bundles to suit your specific needs.
Please note that some datasets may be subject to rights restrictions. Contact us for more information.
Monolingual Datasets
Monolingual datasets consist of resources in a single language.
Depending on the language, they can include dictionaries, thesauruses, example sentences, pronunciations, and word lists. The table below shows what is included in each monolingual dataset.
| Language | Dictionary | Thesaurus | Sentences | Pronunciation | Word list |
|---|---|---|---|---|---|
| Arabic | ✔️ | | | | |
| Catalan | ✔️ | | | | |
| Croatian | ✔️ | | | | |
| Chinese (Simplified) | ✔️ | | | | |
| Chinese (Traditional) | ✔️ | | | | |
| Dutch | ✔️ | | | | |
| English (American) | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| English (Australian) | ✔️ | ✔️ | | | |
| English (British) | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| English (Canadian) | ✔️ | | | | |
| English (New Zealand) | ✔️ | | | | |
| English (Indian) | ✔️ | | | | |
| French | ✔️ | | | | |
| German | ✔️ | | | | |
| Gujarati | ✔️ | | | | |
| Hebrew (Modern) | ✔️ | | | | |
| Hindi | ✔️ | ✔️ | ✔️ | | |
| Indonesian | ✔️ | | | | |
| Italian | ✔️ | ✔️ | | | |
| Korean | ✔️ | | | | |
| Latvian | ✔️ | | | | |
| Malay | ✔️ | | | | |
| Malayalam | ✔️ | | | | |
| Portuguese | ✔️ | | | | |
| Romanian | ✔️ | | | | |
| Russian | ✔️ | | | | |
| Spanish | ✔️ | ✔️ | ✔️ | ✔️ | |
| Swedish | ✔️ | | | | |
| Tamil | ✔️ | | | | |
Bilingual Datasets
We offer bilingual datasets that provide translations between pairs of languages, most of them pairing English with another language.
We provide two types of bilingual datasets: bidirectional and monodirectional.
Bidirectional Bilingual Datasets
Translate between the two languages in both directions.
Arabic <> English (British)
Chinese (Simplified) <> English (British)
Chinese (Traditional) <> English (British)
Czech <> English (British)
French (European) <> English (British)
Georgian <> English (British)
German <> English (British)
Greek (Modern) <> English (British)
Gujarati <> English (British)
Hausa <> English (British)
Hindi <> English (British)
Indonesian <> English (British)
isiXhosa <> English (British)
Italian <> English (British)
Kazakh <> English (British)
Korean <> English (American)
Malay <> English (British)
Marathi <> English (British)
Polish <> English (American)
Portuguese (Brazilian) <> English (American)
Russian <> English (British)
Slovak <> English (British)
Spanish (European) <> English (American)
Spanish (Latin American) <> Quechua
Tatar <> English (British)
Thai <> English (British)
Turkish <> English (British)
Ukrainian <> English (British)
Welsh <> English (British)
Monodirectional Bilingual Datasets
Translate in one direction only, from the source language into the target language.
Bengali > English (British)
Cantonese > English (British)
English (British) > Igbo
English (British) > Persian
English (British) > Romanian
English (British) > Tajik
English (British) > Yoruba
Latin > English (British)
Punjabi > English (British)
Telugu > English (British)
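To make the difference between bidirectional and monodirectional datasets concrete, here is a minimal Python sketch. The `BilingualDataset` class, its fields, and the sample entries are hypothetical illustrations and do not reflect the actual delivery format or content of the datasets.

```python
from dataclasses import dataclass, field


@dataclass
class BilingualDataset:
    """Hypothetical in-memory model of a bilingual dataset."""
    source: str
    target: str
    bidirectional: bool
    # Maps a source-language headword to its target-language translations.
    forward: dict = field(default_factory=dict)
    # Reverse direction; only used by bidirectional datasets.
    backward: dict = field(default_factory=dict)

    def translate(self, word: str, from_lang: str) -> list:
        """Return translations of `word`, or raise if the direction is unsupported."""
        if from_lang == self.source:
            return self.forward.get(word, [])
        if from_lang == self.target and self.bidirectional:
            return self.backward.get(word, [])
        raise ValueError(
            f"{self.source} > {self.target} does not translate from {from_lang}"
        )


# Illustrative sample entries only, not taken from the actual datasets.
welsh_english = BilingualDataset(
    source="Welsh", target="English (British)", bidirectional=True,
    forward={"ci": ["dog"]}, backward={"dog": ["ci"]},
)
latin_english = BilingualDataset(
    source="Latin", target="English (British)", bidirectional=False,
    forward={"canis": ["dog"]},
)

print(welsh_english.translate("ci", "Welsh"))
print(welsh_english.translate("dog", "English (British)"))
print(latin_english.translate("canis", "Latin"))
```

In this sketch, asking the Latin > English (British) dataset to translate from English raises a `ValueError`, because a monodirectional dataset only covers the source-to-target direction.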
Semi-bilingual Datasets
A semi-bilingual dictionary is a hybrid dictionary that combines monolingual and bilingual information. In a semi-bilingual dictionary, a headword will be presented with definitions and examples in the language the user is learning to facilitate comprehension, followed by a translation equivalent of the headword and/or the definition to reassure the user that they have understood the meaning of the word.
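The entry structure described above can be sketched roughly as follows. This is an illustrative model only: the `Sense` and `SemiBilingualEntry` classes, the `render` helper, and the sample English to Chinese (Simplified) entry are assumptions made for this example, not the actual schema or content of the datasets.

```python
from dataclasses import dataclass


@dataclass
class Sense:
    definition: str   # written in the language the user is learning
    examples: list    # example sentences, also in the learned language
    translation: str  # equivalent in the user's own language


@dataclass
class SemiBilingualEntry:
    headword: str
    senses: list


def render(entry: SemiBilingualEntry) -> str:
    """Format an entry: definition and examples first, translation last."""
    lines = [entry.headword]
    for number, sense in enumerate(entry.senses, start=1):
        lines.append(f"{number}. {sense.definition}")
        lines.extend(f"   e.g. {example}" for example in sense.examples)
        lines.append(f"   = {sense.translation}")
    return "\n".join(lines)


# Illustrative sample entry only, not taken from the actual datasets.
entry = SemiBilingualEntry(
    headword="book",
    senses=[
        Sense(
            definition="a set of printed pages fastened together inside a cover",
            examples=["She is reading a book about birds."],
            translation="书",
        )
    ],
)
print(render(entry))
```

Placing the translation after the definition and examples mirrors the order described above: the learner reads the monolingual information first, then checks their understanding against the translation.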
We provide two types of semi-bilingual datasets: bidirectional and monodirectional.
Bidirectional Semi-bilingual Datasets
Translate between the two languages in both directions.
Kannada <> English (British)
Malayalam <> English (British)
Tamil <> English (British)
Monodirectional Semi-bilingual Datasets
Translate in one direction only, from the source language into the target language.
Assamese > English (British)
English (British) > Bengali
English (British) > Cantonese
English (British) > Chinese (Simplified)
English (British) > Chinese (Traditional)
English (British) > Hindi
English (British) > Marathi
English (British) > Odia
English (British) > Punjabi
English (British) > Telugu
Use cases we support

Validation
Ideal for integrating word validation and spellcheck functionalities into user interfaces.

Display
Enhance edtech products with features such as click-to-define and synonym expansion.

Translation
Perfect for language learning platforms and services offering translation and localization.

Enhanced Reading
Suggest synonyms, collocations, or definitions to support vocabulary building and reading comprehension.

Prediction
Speed up the creation of user-generated content in applications such as predictive text systems.

Pronunciation
Accurate transcriptions suitable for transliteration services and apps dealing with multiple scripts.
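As an illustration of the Validation use case above, a word list could back a simple validity check with spelling suggestions along these lines. This is a minimal sketch: the `WORDLIST` contents and the `validate` function are hypothetical, and it relies only on Python's standard `difflib` module rather than any API supplied with the datasets.

```python
import difflib

# Hypothetical sample; a real word list would be loaded from a licensed dataset.
WORDLIST = {"colour", "dictionary", "language", "thesaurus", "translation"}


def validate(word: str, wordlist: set, max_suggestions: int = 3):
    """Return (is_valid, suggestions) for a user-typed word."""
    normalized = word.strip().lower()
    if normalized in wordlist:
        return True, []
    # Fall back to close matches ranked by similarity for spellcheck hints.
    suggestions = difflib.get_close_matches(
        normalized, sorted(wordlist), n=max_suggestions, cutoff=0.8
    )
    return False, suggestions


print(validate("Colour", WORDLIST))
print(validate("dictonary", WORDLIST))
```

The similarity cutoff of 0.8 is an arbitrary choice for this example; a production spellchecker would typically tune it per language and script.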

Ready to get started?
Connect with our experts for a personalized consultation to find the best solution to meet your unique needs.