ChatGPT Might Tell the World You’re a Criminal. Can You Sue?
By Catherine Wang (‘25) and Micah Musser (‘26)
ChatGPT tells a user that an Australian mayor spent time in prison for bribery. A chatbot accuses a court reporter who covers child abuse cases of himself molesting children. A search engine using GPT-4 to summarize search results conflates a professor with a similarly named convicted terrorist.
These are examples of AI “hallucinations,” a common problem facing companies that deploy large language models. Chatbots like ChatGPT are trained to predict the next word in a conversation. When confronted with a question its training data does not answer, a model is likely to make up a plausible-seeming response. This can lead models to cite fake legal cases or recommend code libraries that do not exist.
When made by human speakers, these kinds of accusations can give rise to liability for defamation, one of the doctrines the Supreme Court has long recognized as a limit on Americans’ free speech rights. If a speaker makes a false statement about someone else, the injured party can usually sue, provided they can show that the speaker acted wrongfully and caused them harm as a result. But should you be able to sue an AI company like OpenAI if its tools mix you up with someone else and call you a murderer, a child molester, or a fraudster?
This month, the Technology Law and Policy Clinic at NYU School of Law weighed in with an amicus brief in Walters v. OpenAI, a defamation case in Georgia state court. In this litigation, talk show host Mark Walters is suing OpenAI over a fictitious legal complaint that ChatGPT generated for a third-party user, erroneously describing Walters as having been accused of fraud and embezzlement. The litigation so far has raised two pressing issues that cut straight to the core of defamation law. Should we treat outputs from generative AI tools as factual assertions? And if we do, how do the traditional fault requirements of defamation law—which demand that a plaintiff show a statement was made negligently or, in some cases, with “actual malice”—apply to AI companies?
On the first question, OpenAI has argued that because users are warned about the possibility of inaccuracies, they can never reasonably treat ChatGPT’s responses as accurate statements of fact. But paradoxically, OpenAI also markets ChatGPT as reliable and accurate. AI programs from OpenAI and others have been integrated into search engines that display AI-generated summaries at the top of search results, and OpenAI has added search functionality to ChatGPT itself. The more entrenched these programs become in how people look for information online, the more an average user will expect them to be reliable.
On the second question, OpenAI has dodged, arguing that an AI company cannot really be held responsible for the statements of a generative AI tool because no human can monitor AI outputs in real time. But this focus on “real-time” monitoring is worrisome: it implies that private citizens would have no legal remedy even where an AI tool repeatedly makes the same false accusation about them, and even if the AI company is fully aware of the problem. Walters, for his part, has claimed that ChatGPT’s statements were negligently made simply because OpenAI knows about the general risk of AI hallucinations. If that alone were enough to show negligence, it would become essentially impossible for AI companies to deploy generative AI tools at all, and crippling defamation liability could stifle innovation in a new technology.
The Clinic’s brief argues that other bodies of law offer helpful analogies and more nuanced ways to evaluate the fault of AI developers. For instance, a plaintiff might show negligence by demonstrating that developers could have made alternative design decisions that would have reduced the risk of hallucinations without undermining the product’s functionality. And once AI programs are out in the world, a plaintiff might show negligence by demonstrating that the company failed to adequately monitor their outputs.
Ultimately, the problem of AI hallucinations forces a difficult tradeoff between protecting private individuals’ reputations and the social goal of fostering innovation in new and (mostly) useful technologies. There are ways to accommodate both values by analogizing to other bodies of law that govern companies’ responsibility for their products and agents. The Clinic’s brief aims to show the danger of courts accepting simplistic arguments that would impose either blanket liability or blanket immunity on AI developers for the outputs of their tools. AI hallucinations are here to stay, and the brief argues that this novel risk demands an equally innovative approach to holding AI companies accountable.