A repository of articles, academic work and other content that inspires and informs our work.


Ethical and social risks of harm from Language Models

Laura Weidinger et al.

DeepMind, arXiv

December 2021

bias
discrimination
Perpetuating harmful stereotypes and discrimination is a well-documented harm in machine learning models that represent natural language (page 9).
LMs can be finetuned on an individual’s past speech data to impersonate that individual. Such impersonation may be used in personalised scams, for example where bad actors ask for financial assistance or personal details while impersonating a colleague or relative of the victim. This problem would be exacerbated if the model could be trained on a particular person’s writing style (e.g. from chat history) and successfully emulate it. Simulating a person’s writing style or speech may also be used to enable more targeted manipulation at scale.
Large-scale machine learning models, including LMs, have the potential to create significant environmental costs via their energy demands, the associated carbon emissions for training and operating the models, and the demand for fresh water to cool the data centres where computations are run.
Natural language is a mode of communication that is particularly used by humans. As a result, humans interacting with conversational agents may come to think of these agents as human-like. Anthropomorphising LMs may inflate users’ estimates of the conversational agent’s competencies.
LMs may predict hate speech or other language that is “toxic”. … Moreover, the problem of toxic speech from LMs on online platforms is not easy to address. Toxicity mitigation techniques have been shown to perpetuate discriminatory biases, whereby toxicity detection tools more often falsely flag utterances from historically marginalised groups as toxic.
Privacy violations may occur when training data includes personal information that is then directly disclosed by the model (Carlini et al., 2021). Disclosure of private information can have the same effects as doxing, namely causing psychological and material harm.
Privacy violations may occur at the time of inference even without the individual’s private data being present in the training dataset. Similar to other statistical models, a LM may make correct inferences about a person purely based on correlational data about other people, and without access to information that may be private about the particular individual. Such correct inferences may occur as LMs attempt to predict a person’s gender, race, sexual orientation, income, or religion based on user input.
In conversation, users may reveal private information that would otherwise be difficult to access, such as thoughts, opinions, or emotions. Capturing such information may enable downstream applications that violate privacy rights or cause harm to users, such as via surveillance or the creation of addictive applications.