Text-to-speech is an assistive technology that reads text aloud, and it can be enhanced with Oxford Languages' fully mapped audio and IPA transcriptions. Our data has been carefully compiled by our in-house team of logophiles as an output of our language research programme, one of the largest in the world.
This dataset is aligned to the words and inflections found in the Oxford Dictionary of English and New Oxford American Dictionary. Its unparalleled coverage and accuracy ensure higher-quality output from speech synthesizers, and with our data features we are confident your text-to-speech application's output will be more accurate.

Data features


Our datasets contain features that enable the most accurate and comprehensive text-to-speech applications:

  • Over 400,000 transcriptions, with over 200,000 each for British and American English
  • Syllabified and non-syllabified IPA (International Phonetic Alphabet) transcriptions for each wordform
  • Variant spellings of each word (# is used as a separator)
  • Variety of English, British or American
  • Pronunciation group identifier, a unique identifier for each pronunciation group. Pronunciations that share an identifier can be used interchangeably, e.g. engross /ɪnˈɡroʊs/, /ɛnˈɡroʊs/
  • Parts of speech (# is used as a separator), to aid disambiguation where the correct pronunciation is not clear from the spelling
  • Sense-level caution, a further warning that the pronunciation may vary according to sense (i.e. the correct pronunciation is not made clear by the spelling and the given parts of speech)
  • Sound file mappings from each transcription to the appropriate audio file
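
As a rough illustration of how the features above might be consumed, the sketch below parses records with "#"-separated fields and collects pronunciations that share a pronunciation group identifier. The field names (wordform, variant_spellings, variety, ipa, group_id, parts_of_speech), the identifier value "g1" and the sample rows are assumptions made for this example; they are not the dataset's actual schema, so check the sample download for the real layout.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Transcription:
    # Hypothetical field names; the real dataset's columns may differ.
    wordform: str
    variant_spellings: list[str]
    variety: str  # "British" or "American"
    ipa: str
    group_id: str  # pronunciation group identifier
    parts_of_speech: list[str]


def split_hash(value: str) -> list[str]:
    """Split a '#'-separated field, dropping empty items."""
    return [item.strip() for item in value.split("#") if item.strip()]


def parse_record(raw: dict[str, str]) -> Transcription:
    """Turn one raw row (all strings) into a typed record."""
    return Transcription(
        wordform=raw["wordform"],
        variant_spellings=split_hash(raw["variant_spellings"]),
        variety=raw["variety"],
        ipa=raw["ipa"],
        group_id=raw["group_id"],
        parts_of_speech=split_hash(raw["parts_of_speech"]),
    )


def group_interchangeable(records: list[Transcription]) -> dict[str, list[str]]:
    """Collect IPA transcriptions that share a pronunciation group identifier."""
    groups: dict[str, list[str]] = defaultdict(list)
    for record in records:
        groups[record.group_id].append(record.ipa)
    return dict(groups)


# Illustrative rows only: the two transcriptions of "engross" come from the
# feature list above, while the identifier "g1" and the layout are invented.
rows = [
    {"wordform": "engross", "variant_spellings": "", "variety": "American",
     "ipa": "ɪnˈɡroʊs", "group_id": "g1", "parts_of_speech": "verb"},
    {"wordform": "engross", "variant_spellings": "", "variety": "American",
     "ipa": "ɛnˈɡroʊs", "group_id": "g1", "parts_of_speech": "verb"},
]

records = [parse_record(row) for row in rows]
interchangeable = group_interchangeable(records)
```

A text-to-speech front end could then pick any transcription from a group when several are interchangeable, and fall back to the parts of speech (or the sense-level caution) when groups differ for the same spelling.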


An example of what's in our data, focusing on the word friendliness:

Our British English audio pronunciation

Our American English audio pronunciation

Download a sample


Our sample data provides a selection of what you can expect to see in a full dataset. It has been curated to showcase the range of this dataset, especially for complex words that have multiple pronunciations or inflections.