These 183,000 Books are Fueling the Biggest Fight in Publishing and Tech

anna.

colors your eyes with what's not there
These 183,000 Books are Fueling the Biggest Fight in Publishing and Tech
Use our new search tool to see which authors have been used to train the machines.
Story by Alex Reisner

Editor’s note: This searchable database is part of The Atlantic’s series on Books3. You can read about the origins of the database here, and an analysis of what’s in it here.

Since my article appeared, I’ve heard from several authors wanting to know if their work is in Books3. In almost all cases, the answer has been yes. These authors spent years thinking, researching, imagining, and writing, and had no idea that their books were being used to train machines that could one day replace them. Meanwhile, the people building and training these machines stand to profit enormously.

Reached for comment, a spokesperson for Meta did not directly answer questions about the use of pirated books to train LLaMA, the company’s generative-AI product. Instead, she pointed me to a court filing from last week related to the Silverman lawsuit, in which lawyers for Meta argue that the case should be dismissed in part because neither the LLaMA model nor its outputs are “substantially similar” to the authors’ books.

It may be beyond the scope of copyright law to address the harms being done to authors by generative AI, and the point remains that AI-training practices are secretive and fundamentally nonconsensual. Very few people understand exactly how these programs are developed, even as such initiatives threaten to upend the world as we know it. Books are stored in Books3 as large, unlabeled blocks of text. To identify their authors and titles, I extracted ISBNs from these blocks of text and looked them up in a book database. Of the 191,000 titles I identified, 183,000 have associated author information. You can use the search tool below to look up authors in this subset and see which of their titles are included.
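The ISBN step described above can be illustrated with a rough sketch. This is not the article's actual code: the regular expression, the function names, and the tiny `BOOK_DB` dictionary are all assumptions made up for illustration, with the dictionary standing in for a real book database. The idea is to scan a block of text for ISBN-shaped strings, keep only those whose check digit is valid, and look the survivors up.

```python
import re

# Hypothetical pattern: an optional 978/979 prefix, then nine digits and a
# final check character, with optional hyphens or spaces between them.
ISBN_CANDIDATE = re.compile(r"\b(?:97[89][\s-]?)?(?:\d[\s-]?){9}[\dXx]\b")


def is_valid_isbn10(s):
    """ISBN-10 check: weights 10..1, total must be divisible by 11."""
    if len(s) != 10 or not s[:9].isdigit() or s[9] not in "0123456789X":
        return False
    total = sum((10 - i) * (10 if ch == "X" else int(ch)) for i, ch in enumerate(s))
    return total % 11 == 0


def is_valid_isbn13(s):
    """ISBN-13 check: alternating weights 1 and 3, total divisible by 10."""
    if len(s) != 13 or not s.isdigit():
        return False
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(s))
    return total % 10 == 0


def extract_isbns(text):
    """Return checksum-valid ISBNs found in text, in order, without repeats."""
    found = []
    for match in ISBN_CANDIDATE.finditer(text):
        digits = re.sub(r"[\s-]", "", match.group()).upper()
        if (is_valid_isbn10(digits) or is_valid_isbn13(digits)) and digits not in found:
            found.append(digits)
    return found


# Stand-in for a real book database (placeholder entry, not real metadata).
BOOK_DB = {"9780306406157": {"title": "Example Title", "author": "Example Author"}}


def identify(text):
    """Look up each extracted ISBN; unknown ISBNs map to None."""
    return {isbn: BOOK_DB.get(isbn) for isbn in extract_isbns(text)}
```

A check digit filters out most random digit strings but not all of them, so a phone number or catalogue number can occasionally pass, which is one way a false positive could slip through.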

Before you begin, please note several caveats: Some books appear multiple times, reflecting different editions, translations, abridgements, or annotations. Because of inconsistencies in the spelling of author names, the search may not return books that are, in fact, in Books3. It may also deliver a jumble of odd formatting: A query for Agatha Christie will also return books labeled Agatha Christie and Christie Agatha, for example. And because of possible errors in the book-identification process, which involves detecting an ISBN within the text of the books and using a book database to find their author and title, there is a very small chance of false positives.
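To make the name-spelling caveat concrete, here is a small sketch of one way such variants could be grouped. It is an assumption for illustration only, not how the search tool works: `name_key` and `group_by_author` are hypothetical helpers that lowercase a name, drop punctuation, and sort its words so that "Agatha Christie" and "Christie, Agatha" land under the same key.

```python
import re
from collections import defaultdict


def name_key(name):
    """Build an order-insensitive key from the letters-only words of a name."""
    tokens = re.findall(r"[^\W\d_]+", name.lower())
    return " ".join(sorted(tokens))


def group_by_author(records):
    """Group (author, title) pairs under their normalized author key."""
    groups = defaultdict(list)
    for author, title in records:
        groups[name_key(author)].append(title)
    return dict(groups)
```

The trade-off is that sorting words can also merge two different people whose names are reorderings of each other, so this kind of grouping reduces missed matches at the cost of a few wrong ones.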
 

anna.

colors your eyes with what's not there
Authors are starting to sue:

A group of prominent novelists, including John Grisham, Jonathan Franzen and Elin Hilderbrand, are joining the legal battle against OpenAI over its chatbot technology, as fears about the encroachment of artificial intelligence on creative industries continue to grow.

More than a dozen authors filed a lawsuit against OpenAI on Tuesday, accusing the company, which has been backed with billions of dollars in investment from Microsoft, of infringing on their copyrights by using their books to train its popular ChatGPT chatbot. The complaint, which was filed along with the Authors Guild, said that OpenAI’s chatbots can now produce “derivative works” that can mimic and summarize the authors’ books, potentially harming the market for authors’ work, and that the writers were neither compensated nor notified by the company.
 

Twilight Hue

Twilight, not bright nor dark, good nor bad.
Like it or not, AI is here to stay.

I'm not saying it as rudeness, but with concern as I agree with your post.

Even a number of those who helped create AI are pushing for regulation because it is not a benign technology. It's potentially destructive and extremely dangerous.
 

icehorse

......unaffiliated...... anti-dogmatist
Premium Member
And now we can add to the list of issues something along the lines of massive copyright infringement.
 

anna.

colors your eyes with what's not there
Twilight Hue said:
Like it or not, AI is here to stay.

I'm not saying it as rudeness, but with concern as I agree with your post.

Even a number of those who helped create AI are pushing for regulation because it is not a benign technology. It's potentially destructive and extremely dangerous.

For once, we agree. :)

I was just reading again today (going beyond AI in publishing) about future autonomous military robots. Even drones have human controllers, but these are different because they're "powered by artificial intelligence (AI) and can technically operate independently to take out targets without human help."
Black Mirror's "Metalhead" comes to mind. I brought this up 7 years ago elsewhere, but it was downplayed by most, who dismissed the possibilities and the rise of AI as sci-fi.
 

anna.

colors your eyes with what's not there
icehorse said:
And now we can add to the list of issues something along the lines of massive copyright infringement.

It's alarming on many levels. And all for profit at the expense of the original authors. I hope the authors prevail in court.
 