this post was submitted on 21 Oct 2023
133 points (97.8% liked)

Technology

all 18 comments
[–] ericisshort@lemmy.world 23 points 8 months ago* (last edited 8 months ago) (2 children)

I’m very interested in this case and am curious to see where the courts draw the line here.

Beware of an incoming hot take - I don’t see the concept of training AI on published works as much different than a human learning from published works as long as they both go on to make their own original works. I have definitely seen AIs straight-up plagiarize before, but that seems like a different issue entirely from producing similar works. I think allowing plagiarism is a problem with the constraints of the training rather than a fundamental problem with the entire concept of AI training.

[–] Armok_the_bunny@lemmy.world 4 points 8 months ago (2 children)

A standard I could see being applied, and one that I think has some precedent: if the work an output is supposed to be similar to is anywhere in the training set, then it's a copyright violation. One of the valid defenses against copyright claims in court is that the defendant could reasonably have been unaware of the original work, and that seems to me like a reasonable equivalent.

[–] ericisshort@lemmy.world 7 points 8 months ago (1 children)

But humans make works that are similar to other works all the time. I just hope that we set the same standards for AI violating copyright as we have for humans. There is a big difference between derivative works and those that violate copyright.

[–] lemmyvore@feddit.nl 3 points 8 months ago (3 children)

Doesn't this argument assume that AIs are human? That's a pretty huge reach if you ask me. It's not even clear if LLMs are AI, never mind giving them human rights.

[–] ericisshort@lemmy.world 4 points 8 months ago (1 children)

No, I’m not assuming that. It’s not about concluding AIs are human. It’s about having concrete standards on which to design laws. Setting a lower standard for copyright violation by LLMs would be like setting a lower speed limit for a self-driving car, and I don’t think it makes any logical sense. To me that would be a disappointingly protectionist and luddite perspective to apply to this new technology.

[–] lemmyvore@feddit.nl 0 points 8 months ago (1 children)

If LLMs are software then they can't commit copyright violation; the onus for breaking laws falls on the people who use them. And until someone proves otherwise in a court of law, they are software.

[–] ericisshort@lemmy.world 3 points 8 months ago

No one is saying we charge a piece of software with a crime. Corporations aren’t human, but they can absolutely be charged with copyright violations, so being human isn’t a requirement for this at all.

Depending on the situation, you would either charge the user of the software (if they directed the software to violate copyright) and/or the company that makes the software (if they negligently release an LLM that has been proven to produce results that violate copyright).

[–] Saganastic@kbin.social 3 points 8 months ago (1 children)

Machine learning falls under the category of AI. I agree that works produced by LLMs should count as derivative works, as long as they're not too similar.

[–] nybble41@programming.dev 2 points 8 months ago

Not every work produced by an LLM should count as a derivative work, just the ones that embody unique, identifiable creative elements from specific work(s) in the training set. We don't consider every work produced by a human to be a derivative work of everything they were trained on; work produced by (a human using) an AI should be no different.

[–] p03locke@lemmy.dbzer0.com 4 points 8 months ago (1 children)

Beware of an incoming hot take - I don’t see the concept of training AI on published works as much different than a human learning from published works as long as they both go on to make their own original works.

The fact that this is considered a "hot take" is depressing.

[–] ericisshort@lemmy.world 6 points 8 months ago (1 children)

It’s much less of a hot take for people in the tech community, but it is for many artists and creatives who feel threatened by AI’s potential to devalue what they’ve dedicated their lives to.

[–] p03locke@lemmy.dbzer0.com 3 points 8 months ago (1 children)

They should have felt threatened by the sheer weight of an incredibly oversaturated industry, sabotaging itself with a system that rewards the lucky and punishes 99.99% of the people who try to get into it. Everybody else who "made it" is practicing survivorship bias to justify their career choices.

Leaps in AI technology were just another barbell added to the pile.

[–] ericisshort@lemmy.world 1 points 8 months ago

These dumb fucks should go ahead and sue Google, then, if searching and providing song lyrics is considered copyright infringement.