this post was submitted on 08 Oct 2023
484 points (96.9% liked)

BBC will block ChatGPT AI from scraping its content

ChatGPT will be blocked by the BBC from scraping content in a move to protect copyrighted material.

[–] csm10495@sh.itjust.works 73 points 1 year ago (3 children)

I wonder if anyone actually thinks robots.txt is binding, or that it isn't simply ignored by anyone who wants to ignore it.
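
For context, here is a rough sketch of what "honoring robots.txt" looks like from the crawler's side, using Python's standard `urllib.robotparser`. The robots.txt content and the `may_fetch` helper are made up for illustration (this is not the BBC's actual file), and `GPTBot` is assumed to be the user-agent token OpenAI's crawler identifies itself with. Nothing here is enforced by the server: the check only happens if the crawler chooses to run it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, for illustration only (not any real site's file).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    This is the voluntary check a well-behaved crawler performs before
    requesting a page. A crawler that skips it is not stopped by anything.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain.
    print(may_fetch("GPTBot", "https://example.com/news"))
    print(may_fetch("SomeOtherBot", "https://example.com/news"))
```

With this file, the `GPTBot` check comes back disallowed and the other agent falls through to the wildcard rule, but a crawler that never calls something like `may_fetch` just fetches the page anyway.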

[–] lemmyvore@feddit.nl 44 points 1 year ago

OpenAI will have to deal with a lot of lawsuits in the future. Robots.txt may not be legally binding but disobeying it after claiming otherwise would go a long way towards establishing intent.

[–] andrew@lemmy.stuart.fun 14 points 1 year ago

I mean, under the CFAA you could probably pursue charges fairly easily once you have explicitly deauthorized certain agents from accessing your data. Plenty of people have been threatened and prosecuted for less.

https://www.nacdl.org/Landing/ComputerFraudandAbuseAct

[–] totallynotfbi@lemm.ee 6 points 1 year ago

I mean, you could just block the IP addresses of OpenAI's crawlers, if you wanted to.
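
A minimal sketch of that kind of server-side block, using Python's standard `ipaddress` module. The CIDR ranges below are reserved documentation ranges used as placeholders, not OpenAI's real crawler addresses; you would have to substitute whatever ranges the crawler operator actually publishes. `BLOCKED_RANGES` and `is_blocked` are hypothetical names for this example.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), NOT real crawler addresses.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str, blocked=BLOCKED_RANGES) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in blocked)


if __name__ == "__main__":
    print(is_blocked("192.0.2.17"))   # inside the first placeholder range
    print(is_blocked("203.0.113.5"))  # outside both placeholder ranges
```

Unlike robots.txt, this is enforced on your end, so it works whether or not the crawler cooperates. It only holds up as long as the crawler keeps coming from the ranges you listed.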