this post was submitted on 09 Sep 2023
23 points (79.5% liked)

 
  1. Don't have ChatGPT
  2. OCR needed
  3. Preferably Android

Thanks.

top 16 comments
[–] JoBo@feddit.uk 24 points 1 year ago

It will be a great deal quicker just to read the damn thing.

[–] starman@programming.dev 12 points 1 year ago* (last edited 1 year ago) (1 children)
  1. Download any OCR software from f-droid, or preferred store.
  2. Copy text.
  3. Run llama-gpt¹ if you want something self-hosted, or any LLM² on huggingface chat if you want a ready-made solution.
  4. Paste text and write something like "summary:" below.

¹Theoretically possible on mobile, but for better performance, run it on PC.

²Default one should do the job.

Disclaimer: I think this should work, but I haven't done anything like it before.
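
Steps 2 and 4 above can be sketched in a few lines of Python. This is only an illustration, not something from the thread: the function name `build_summary_prompt`, the `max_chars` limit, and the whitespace cleanup are all assumptions, and the right length limit depends on whichever model you paste into.

```python
import re


def build_summary_prompt(ocr_text: str, max_chars: int = 6000) -> str:
    """Tidy raw OCR output and append the 'summary:' cue from step 4.

    Hypothetical helper; max_chars is an arbitrary illustrative limit.
    """
    # Crude de-hyphenation: re-join words that were split across a line break.
    # This will also join genuinely hyphenated words that happen to wrap.
    text = re.sub(r"-\n(?=\w)", "", ocr_text)
    # OCR output is usually full of stray newlines; collapse all whitespace runs.
    text = re.sub(r"\s+", " ", text).strip()
    # Keep the prompt short enough for a small context window (assumed limit).
    text = text[:max_chars]
    return f"{text}\n\nsummary:"


if __name__ == "__main__":
    print(build_summary_prompt("Some scanned  para-\ngraph\nof text."))
```

The returned string is what you would paste into the chat box of whichever model you picked in step 3.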

[–] Ziggurat@sh.itjust.works 2 points 1 year ago

I have actually tried it, but from doc files on a PC and running python.

My main issue is that the models that do it well need a commercial licence. I have the pay grade to experiment by myself on work time, but not the one to spend the company's money on it. And IT just signed a contract to get GPT-4 as part of Bing Chat pro.

[–] nottheengineer@feddit.de 7 points 1 year ago (2 children)

Android won't be easy, but you can slap together a python script that runs tesseract or easyOCR and feeds the output through a pretrained LLM like T5. Those are well-known and well-documented, so chatGPT can probably write the script for you without too many hiccups.
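
A rough sketch of what such a script could look like, under stated assumptions: it uses the `pytesseract` wrapper (which needs the tesseract binary installed) and the `t5-small` checkpoint through Hugging Face `transformers` (which also needs `torch` and `sentencepiece`). The function names, the 300-word chunk size, and the token limits are illustrative choices, not anything the thread specifies, and `t5-small` summaries are fairly rough.

```python
import sys
from typing import List


def chunk_text(text: str, max_words: int = 300) -> List[str]:
    """Split text into word-limited pieces so each fits T5's input window."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def ocr_image(path: str) -> str:
    """Run tesseract on one image file and return the recognised text."""
    # Imported lazily so the rest of the script loads without OCR installed.
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(path))


def summarize(text: str, model_name: str = "t5-small") -> str:
    """Summarise text chunk by chunk with a pretrained T5 checkpoint."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    parts = []
    for chunk in chunk_text(text):
        # T5 was trained with task prefixes; "summarize: " selects summarisation.
        inputs = tokenizer("summarize: " + chunk, return_tensors="pt",
                           truncation=True, max_length=512)
        ids = model.generate(**inputs, max_new_tokens=80)
        parts.append(tokenizer.decode(ids[0], skip_special_tokens=True))
    return " ".join(parts)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python summarize_scan.py page.png
    print(summarize(ocr_image(sys.argv[1])))
```

Chunking is there because T5 truncates long inputs; summarising each chunk and joining the results is the simplest workaround, not necessarily the best one.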

[–] starman@programming.dev 6 points 1 year ago* (last edited 1 year ago) (2 children)

chatGPT can probably write the script for you

From OP:

  1. Don't have ChatGPT

[–] nottheengineer@feddit.de 5 points 1 year ago (1 children)

I read that as either "I don't have premium" or "I can't run this data through chatgpt for whatever reason".

Free chatGPT is viable for writing scripts in any case.

[–] starman@programming.dev 2 points 1 year ago

Yeah, maybe they don't have API access. I didn't think about it that way.

[–] flashgnash@lemm.ee 1 points 1 year ago (1 children)

I'm guessing they meant they don't want to use chatGPT, considering it's free.

[–] starman@programming.dev 1 points 1 year ago* (last edited 1 year ago) (1 children)

Well, you give OpenAI a lot of personal data, so it's not free from a certain point of view. That may be the reason why OP doesn't want to use it.

[–] flashgnash@lemm.ee 1 points 1 year ago

Plenty of valid reasons not to want to use it; it was just the wording that seemed odd.

[–] ciagovv@lemm.ee 0 points 1 year ago (1 children)

And you can run that in termux, so you can use it on android.

[–] nottheengineer@feddit.de 2 points 1 year ago (1 children)

Good luck trying to install tesseract and a deep learning framework in termux.

[–] HamBrick@programming.dev 2 points 1 year ago

You can’t tell me what to do! Just watch me

[–] Tarte@kbin.social 4 points 1 year ago* (last edited 1 year ago)

What's the worth of AI generated summaries if they are not factually reliable? The new Google search result previews that are generated by AI (and I believe Google as a large company has more resources than most of us do) contain so many obvious factual errors (e.g. made-up names, wrong places, false dates) that I really doubt current generation AI is ready to be a reliable help in this use case.

I, too, like the idea of not having to do all this work manually. But we’re not there yet.

[–] AdamEatsAss@lemmy.world 3 points 1 year ago* (last edited 1 year ago)

It would probably just be easier to read it and write a summary? Or try this

[–] Cyberflunk@lemmy.world 1 points 1 year ago