this post was submitted on 23 Jan 2024

it sure beats having to buy it, but seriously come on... (i.imgur.com)

submitted 5 months ago by empireOfLove2@lemmy.dbzer0.com to c/piracy@lemmy.dbzer0.com

Not being able to Ctrl-F a textbook or use click-to-chapter links sure makes studying harder these days... and any scanning software worth its salt will at least do the bare minimum OCR automatically...

[–] Moonrise2473@feddit.it 120 points 5 months ago* (last edited 5 months ago)

I'd much rather do the OCR myself if it's really needed than get a half-assed "book" full of typos and broken tables just because someone ran an automated OCR but didn't have the 5-6 hours required to manually edit it into something decent.

Just be thankful that someone took the time to flip through it page by page on their scanner and upload it somewhere.

[–] raven@hexbear.net 87 points 5 months ago

I hope this sentiment never stops someone from uploading a textbook without OCR. Once it's scanned it can always be OCRed at a later time.

[–] RegalPotoo@lemmy.world 78 points 5 months ago (1 children)

Look, it's all about authorial intent - if the author had wanted their book to be easy to reference or accessible to people who use screen readers, they would have published a DRM free PDF in the first place. Gotta respect the artist's vision.

[–] trucy@lemmy.blahaj.zone 10 points 5 months ago (1 children)

...and sometimes the artist turns out to be an idiot :D

[–] nilloc@discuss.tchncs.de 2 points 5 months ago

Or the professor who’s profiting off requiring the latest edition of their own book each year.

[–] flipflop97@feddit.nl 70 points 5 months ago* (last edited 5 months ago) (1 children)

You can do it yourself:

https://ocrmypdf.readthedocs.io/
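
For anyone who wants to script it, here is a minimal sketch of calling the tool from Python. It assumes the `ocrmypdf` command-line program is already installed and on your PATH; the helper names and the file names are made up for illustration, and the flags chosen (`--skip-text`, `--deskew`, `-l`) are just one reasonable starting point, not the only way to run it.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(src, dst, language="eng"):
    """Build the ocrmypdf command line for one scanned PDF (illustrative helper)."""
    return [
        "ocrmypdf",
        "--skip-text",   # leave pages that already have a text layer untouched
        "--deskew",      # straighten crooked scans before recognition
        "-l", language,  # Tesseract language code, e.g. "eng"
        str(Path(src)),
        str(Path(dst)),
    ]


def ocr_pdf(src, dst, language="eng"):
    """Run ocrmypdf if it is installed; return True when it exits cleanly."""
    if shutil.which("ocrmypdf") is None:
        print("ocrmypdf not found on PATH; install it first")
        return False
    result = subprocess.run(build_ocr_command(src, dst, language))
    return result.returncode == 0
```

Usage would be something like `ocr_pdf("textbook_scan.pdf", "textbook_searchable.pdf")` (hypothetical file names). The output keeps the original page images and adds a searchable text layer, so Ctrl-F works afterwards.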

[–] jlow@beehaw.org 3 points 5 months ago

There are so many:

https://stirlingtools.com/

https://docs.paperless-ngx.com/

https://www.openpaper.work/en/

[–] Kingofthezyx@lemm.ee 68 points 5 months ago (2 children)

Bitch you can't ctrl-F or click to chapter in an actual book either.

[–] JackbyDev@programming.dev 60 points 5 months ago

Wait until OP hears about the Index at the back of the book.

[–] empireOfLove2@lemmy.dbzer0.com 18 points 5 months ago

I know, that's my point! PDFs are inherently superior BECAUSE you can usually Ctrl-F them.

[–] Anamnesis@lemmy.world 56 points 5 months ago (2 children)

Simple: pirate Adobe Acrobat and OCR them yourself.

[–] Lemongrab@lemmy.one 42 points 5 months ago

Or OCRMyPDF https://github.com/ocrmypdf/OCRmyPDF

[–] empireOfLove2@lemmy.dbzer0.com 16 points 5 months ago

I might just do that and reupload the OCR'd copy. I already have 3 or 4 books set aside to cut the bindings off and scan in, so I'm gonna need OCR for those too.

In my free time, of course. University waits for no student...

[–] antonim@lemmy.dbzer0.com 41 points 5 months ago (1 children)

By the time you finished making this snarky meme, you could've set up a program to OCR a book yourself.

[–] 5714@lemmy.dbzer0.com 2 points 5 months ago

'A' book, yes, but the more scanned pages you get, the more annoyed you get.

[–] Gork@lemm.ee 41 points 5 months ago

OCR'ing a book before uploading saves so many hours on the user's end. I wish it were done more often so I didn't have to leave my computer running overnight to batch OCR stuff.
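
A rough sketch of what such an overnight batch job could look like, assuming the `ocrmypdf` command-line tool is installed and on PATH. The function names and folder layout are hypothetical; the idea is simply to skip files that already have an output so an interrupted run can be resumed.

```python
import shutil
import subprocess
from pathlib import Path


def pending_jobs(in_dir, out_dir):
    """List (source, destination) pairs for PDFs that have no OCR'd output yet."""
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    jobs = []
    for src in sorted(in_dir.glob("*.pdf")):
        dst = out_dir / src.name
        if not dst.exists():  # already processed on an earlier run
            jobs.append((src, dst))
    return jobs


def run_batch(in_dir, out_dir):
    """OCR every pending PDF one after another; return how many succeeded."""
    if shutil.which("ocrmypdf") is None:
        print("ocrmypdf not found on PATH")
        return 0
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    done = 0
    for src, dst in pending_jobs(in_dir, out_dir):
        # --skip-text leaves pages that already contain text untouched
        result = subprocess.run(["ocrmypdf", "--skip-text", str(src), str(dst)])
        if result.returncode == 0:
            done += 1
        else:
            print(f"failed: {src.name}")
    return done
```

Calling `run_batch("scans", "searchable")` (hypothetical folder names) before bed would work through the pile in order, and rerunning it the next night only picks up whatever is still missing.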

[–] DontNoodles@discuss.tchncs.de 36 points 5 months ago

Sites like Anna's library should let users flag books that lack OCR and submit OCR'd versions of them.

[–] unperson@hexbear.net 34 points 5 months ago* (last edited 5 months ago)

Be the change you wish to see in the world.

https://library.bz/main/upload/ (anonymous; username: genesis, password: upload)

[–] lanolinoil@lemmy.world 17 points 5 months ago (1 children)

https://projectnaptha.com/

[–] nameisnotimportant@lemmy.ml 11 points 5 months ago (1 children)

Very impressive! That's a bummer that you need Chrome to make it happen though :/

[–] lanolinoil@lemmy.world 1 points 5 months ago

It will still work on PDFs loaded in Chrome to be fair though.

[–] Creddit@lemmy.world 16 points 5 months ago (2 children)

There are a bunch of online tools that are free and let you upload a PDF to have it go through OCR.

Just Google "Free PDF OCR" and click through all the ads to upload, then give them a temporary email address to get a download link to the finished product.

Hot tip: There are free temporary email address sites too, if you need one to avoid getting on their ad lists.

[–] imkali@lemmy.dbzer0.com 11 points 5 months ago* (last edited 5 months ago)

List of free temporary email solutions:

https://www.guerrillamail.com/

https://10minutemail.com/

https://addy.io/ - this one is slightly different

and about a billion similar ones.

[–] Agent641@lemmy.world 2 points 5 months ago

If you have a JPG or PNG file, you can upload it to Google Drive, then right-click and open it in Google Docs, and it will OCR the text for you.

[–] OgdenTO@hexbear.net 5 points 5 months ago

Use the index

[–] Mango@lemmy.world 4 points 5 months ago (1 children)

OCR?

[–] Gaspar@lemmy.dbzer0.com 14 points 5 months ago

Optical Character Recognition. Essentially, software that "reads" an image and pulls text out of it.