this post was submitted on 10 Apr 2024

Programmer Humor

MalReynolds@slrpnk.net 1 point 2 months ago (last edited 2 months ago)

I see this a lot, but do you really think the big players haven't backed up the pre-2022 datasets? Also, synthetic (LLM-generated) data is routinely used in fine-tuning to good effect, and it's likely that architectures exist that can happily do primary training on synthetic data as well.