Reddit sues Anthropic for unauthorized data harvesting

Reddit Inc. is suing Anthropic for illegally scraping its data.
Reddit has clear terms in its user agreement against AI scraping, and has struck lucrative licensing deals for its data.
The company, home to a valuable archive of 20 years of human exchanges, says Anthropic illegally copied from its archives at least 100,000 times.

The lawsuit, filed on Wednesday in San Francisco Superior Court, claims Reddit reached out to Anthropic several times to discuss licensing and the scraping, but that the company «refused to engage.»

Not a white knight
The suit calls Anthropic a «late-blooming artificial intelligence company that bills itself as the white knight of the AI industry,» adding that «it is anything but.»

«Reddit’s humanity is uniquely valuable in a world flattened by AI,» Ben Lee, Reddit’s chief legal officer, tells The Verge. «Now more than ever, people are seeking authentic human-to-human conversation. Reddit hosts nearly 20 years of rich, human discussion on virtually every topic imaginable. These conversations don’t happen anywhere else, and they’re central to training language models like Claude.»

Made deals for content
This marks the first time a big technology company has sued an AI lab for unauthorized scraping of its content. Reddit has previously inked lucrative content-licensing deals with Google and OpenAI, whose CEO Sam Altman remains a large investor in Reddit.

«In clear violation of Reddit’s terms and despite repeated requests to stop, Anthropic has been caught accessing or attempting to access Reddit content via automated bots at least 100,000 times. This isn’t a misunderstanding, it’s a sustained effort to extract value from Reddit while ignoring legal and ethical boundaries,» a Reddit spokesperson tells Engadget.

Others also claim copyright
There are seven ongoing lawsuits against AI labs for scraping and copyright violations. Notable among them is The New York Times’ suit against OpenAI and Microsoft over roughly the same issue, unauthorized scraping. Sarah Silverman and other authors have also sued Meta for training on their works without authorization, TechCrunch notes.

In that case, a judge recently questioned Meta’s «fair use» defense.

In a statement, Anthropic says «we disagree with Reddit’s claims and will defend ourselves vigorously.»

Read more: The actual filing is at redditinc.com. CNBC has financials and quotes; Engadget, TechCrunch and The Verge have spokesperson statements.