Sensitivity Labelling

We ensure our language data is accurate, inclusive, and responsibly managed. Our sensitivity labelling helps customers filter and tailor content to meet the needs of their users and their own ethical guidelines.

What is sensitive content, and why is it included in our language data?

Sensitive content is anything that may cause offence to a reader or user, particularly in relation to religion, race, gender, politics, sexuality, disability, or vulgar language.

Sensitive content is a valid part of language, so our products will contain sensitive material. Our dictionaries and datasets are descriptive: if a user wants to look up what an offensive word means, they should be able to find it. Otherwise, they may not even realize it’s offensive.

However, we aim to ensure that all vulgar, offensive, or derogatory content is correctly labelled, that our editorial text is appropriately worded, and that any reference to potentially sensitive topics is avoided unless it is contextually relevant and appropriate.

Reviewing sensitive content

Our expert team of lexicographers conducts major sensitivity reviews across our English and bilingual dictionary data, as part of our ongoing work to ensure that our content is fresh, contemporary, and appropriate for all users. For languages other than English, we work collaboratively with native speakers of each language to ensure reviews meet our rigorous standards.

Examples of current projects include a review of terms relating to mental health, intellectual disability, and psychological conditions, and a review of political content in Current English, including corpus monitor work to identify any emerging politically sensitive language. We also have ongoing projects related to sex and gender, and a review of racial terms and language relating to ethnicity.

Once a review has been conducted, content that does not meet our sensitivity standards is modified. This can include content in definitions, example sentences, labels, synonyms, etymologies, and usage notes.

Sensitive content in our data

One of the most effective ways we can denote sensitive content in our data is through labelling. Including labels such as dated, derogatory, euphemism, offensive, and vulgar slang helps users or readers to understand the real-life usage of a term.
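As a rough illustration of how a customer might use these labels to filter content, here is a minimal Python sketch. It assumes each entry carries a set of label strings; the `Entry` class, its field names, the `filter_entries` function, and the placeholder entries are all hypothetical and do not reflect the actual dataset format or API. The label names are the ones listed above, which may not be the full set.

```python
from dataclasses import dataclass, field

# Labels named in the text above; the real label set may be larger (assumption).
SENSITIVITY_LABELS = frozenset(
    {"dated", "derogatory", "euphemism", "offensive", "vulgar slang"}
)


@dataclass(frozen=True)
class Entry:
    """Hypothetical dictionary entry; not the real dataset schema."""
    headword: str
    definition: str
    labels: frozenset = field(default_factory=frozenset)


def filter_entries(entries, excluded_labels):
    """Return the entries that carry none of the excluded labels."""
    excluded = frozenset(excluded_labels)
    unknown = excluded - SENSITIVITY_LABELS
    if unknown:
        # Fail loudly on a misspelt label rather than silently filtering nothing.
        raise ValueError(f"Unknown sensitivity labels: {sorted(unknown)}")
    return [entry for entry in entries if not (entry.labels & excluded)]


# Placeholder entries for illustration only; not real dictionary data.
entries = [
    Entry("placeholder-a", "A neutral placeholder definition."),
    Entry("placeholder-b", "A placeholder definition.", frozenset({"derogatory"})),
    Entry("placeholder-c", "A placeholder definition.", frozenset({"dated", "euphemism"})),
]

visible = filter_entries(entries, {"derogatory", "offensive"})
print([entry.headword for entry in visible])
```

In this sketch, an application aimed at younger users could exclude derogatory and offensive entries, while a reference product could keep every entry and display the labels alongside the definitions instead.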

Examples of entries labelled as derogatory:

Updates

The outcomes of sensitivity reviews are made available to our customers and users via regular dataset updates, to ensure the content we provide is contemporary, appropriate, and best reflects the current usage of languages.

Find out more information about updates to our datasets here:

Explore our updates
