Questionable content, possibly linked

Tag: anthropic

Updated Hollywood Reporter Piece

Following a complaint I submitted to their legal department, which is ongoing, The Hollywood Reporter modified the article I wrote about here. The original version called me a “fraudster”; the new ending of the article presently reads as follows (underlines added by me to highlight changed text elements):

The authors also argue that Anthropic is depriving authors of book sales by facilitating the creation of rip-offs. When Kara Swisher released Burn Book earlier this year, Amazon was flooded with AI-generated copycats, according to the complaint. In another instance, author Jane Friedman discovered a “cache of garbage books” written under her name.

According to the lawsuit, authors have turned to Claude to generate “cheap book content,” and the complaint highlights an individual who have created dozens of books in a short period of time to make its case.

The authors claim that Anthropic used a dataset called “The Pile,” which incorporates nearly 200,000 books from a shadow library site, to train Claude. In July, Anthropic confirmed the use of the dataset to various publications, according to the lawsuit.

Anthropic didn’t immediately respond to a request for comment.

Aug. 23, 9 am Updated to revise a paragraph within this story as well as include more detail from the complaint and remove an incorrect reference to author Tim Boucher.

Text from the Anthropic Lawsuit

I’ll go into more detail next week about why I think this is wrong, but for now I just want to capture the most relevant paragraphs from the latest class action lawsuit against Anthropic, in which I am erroneously (I think) referenced. Original PDF from the case.

  1. Since the explosion of LLM use in 2023, which coincided with the release of Claude, there has been an explosion of AI-generated books. When journalist Kara Swisher released her memoir Burn Book earlier this year, Amazon was flooded AI generated copycats. This was not an isolated incident. In another instance, author Jane Friedman discovered “a cache of garbage books” written under her name for sale on Amazon. As LLMs have become more advanced—and enabled to train on more and more copyrighted material—they are able to generate more content and more sophisticated content. The result is that it is easier than ever to generate rip-offs of copyrighted books that compete with the original, or at a minimum dilute the market for the original copyrighted work.
  2. Claude in particular has been used to generate cheap book content. For example, in May 2023, it was reported that a man named Tim Boucher had “written” 97 books using Anthropic’s Claude (as well as OpenAI’s ChatGPT) in less than year, and sold them at prices from $1.99 to $5.99. Each book took a mere “six to eight hours” to “write” from beginning to end. Claude could not generate this kind of long-form content if it were not trained on a large quantity of books, books for which Anthropic paid authors nothing.
  3. In short, the success and profitability of Anthropic is predicated on mass copyright infringement without a word of permission from or a nickel of compensation to copyright owners, including Plaintiffs here.
