this post was submitted on 12 May 2024

110 points (97.4% liked)


FOSS dictation and transcription software (lemmy.ml)

submitted 7 months ago by archer@lemmy.ml to c/opensource@lemmy.ml


Hi people

Is there any good FOSS dictation and transcription software similar to Just Press Record for Android and/or Linux?

I couldn't find anything so far. Thanks a lot.

top 16 comments


[–] lemann@lemmy.dbzer0.com 20 points 7 months ago (2 children)

Not FOSS as it's under another license, but there's "FUTO Voice Input" if you're looking for a local alternative to Google's voice dictation on Android

https://gitlab.futo.org/alex/voiceinput

The repo has a list of supported and unsupported Android keyboards. Under the hood it uses OpenAI Whisper

[–] mister_monster@monero.town 5 points 7 months ago

I just tried this out. Apart from the fact that it doesn't stream and only transcribes once you've finished speaking, it's absolutely fantastic. Of course, accuracy would suffer if it were transcribing a live stream, so I count that as a plus rather than a minus, although some people might disagree.

[–] Substance_P@lemmy.world 4 points 7 months ago

This is one of my most-used apps at the moment. It works 100% on your device and is great for filling in search terms, AI prompts, messages, etc. The only downside is that it seems to have a character limit, so it may not be what OP is looking for.

[–] leraje@lemmy.blahaj.zone 19 points 7 months ago (1 children)

I can vouch for whisper.cpp. It's not 100% perfect, but it's good enough to transcribe a half-hour podcast with numerous speakers, and the result needs pretty minimal fixing afterwards.

[–] Infinite@lemmy.dbzer0.com 3 points 7 months ago

Agreed.

OP, this is the best Speech-to-Text solution, IMO. I've used Whisper on Windows (link to GitHub) successfully to transcribe graduate-level class recordings with very minimal manual fixing, mostly only certain last names.
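For anyone who wants to script the whisper.cpp workflow mentioned above, here is a minimal sketch in Python. It rests on a few assumptions that are not from this thread: the binary is called `./whisper-cli` (older builds name it `main`), the model file lives at `models/ggml-base.en.bin`, and `ffmpeg` is installed to convert the recording to the 16 kHz mono WAV that whisper.cpp expects. The function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def ffmpeg_to_wav_cmd(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command that converts src to 16 kHz mono 16-bit PCM WAV."""
    return ["ffmpeg", "-y", "-i", src,
            "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", dst]


def whisper_cpp_cmd(binary: str, model: str, wav: str) -> list[str]:
    """Build a whisper.cpp command: -m model, -f input file, -otxt writes a .txt."""
    return [binary, "-m", model, "-f", wav, "-otxt"]


def transcribe(src: str,
               binary: str = "./whisper-cli",              # assumed binary name
               model: str = "models/ggml-base.en.bin"      # assumed model path
               ) -> str:
    """Convert a recording, run whisper.cpp on it, and return the transcript."""
    source = Path(src)
    # Write to a separate file so a .wav input is never overwritten in place.
    wav = str(source.with_name(source.stem + "_16k.wav"))
    subprocess.run(ffmpeg_to_wav_cmd(src, wav), check=True)
    subprocess.run(whisper_cpp_cmd(binary, model, wav), check=True)
    # Assumption: with -otxt the transcript lands next to the input as <input>.txt.
    return Path(wav + ".txt").read_text(encoding="utf-8")
```

Calling `transcribe("podcast.mp3")` would then return the transcript as a string. Everything runs on your own machine, so neither the recording nor the text leaves it.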

[–] AnEilifintChorcra@sopuli.xyz 12 points 7 months ago (1 children)

Maybe not exactly what you're looking for but I found this a few weeks ago https://github.com/k2-fsa/sherpa-onnx and I haven't really seen anyone talk about it

I've been using the TTS on Android for navigation and it's way better than RHVoice and eSpeak.

I did try the STT on Android and it worked great, but I've never used STT before, so I don't know how it compares to other options.

[–] mister_monster@monero.town 2 points 7 months ago

This looks really cool, thanks for sharing! I'm an RHVoice user for TTS, and for STT I use Sayboard with a Vosk model, because I can't find a single Android application besides Google's that integrates as an assistant. I wonder if this does that or if they plan to; it seems a bit out of scope for this project.

[–] punkcoder@lemmy.world 10 points 7 months ago (1 children)

Falling into the "not sure how open source it is because AI is a mess, but it works" category…

https://github.com/manzolo/openai-whisper-docker

[–] FlatFootFox@lemmy.world 15 points 7 months ago

It’s still surreal to see OpenAI’s need for training data be so vast that they casually developed and open sourced a generational leap in transcription technology just so that they could scrape online videos better.

[–] makingStuffForFun@lemmy.ml 3 points 7 months ago

There is Talon Voice. It's not FLOSS, nor even Open Source in any way. But it's good.

I'll watch this thread as I'm keen to know myself.

[–] moreeni@lemm.ee 3 points 7 months ago

https://github.com/pluja/awesome-privacy?tab=readme-ov-file#speech-to-text

[–] Chump@hexbear.net 3 points 7 months ago

Not sure if it's literally FOSS, but OpenAI's Whisper is the best transcription I've ever used and the models are all free to download and use locally.

[–] Suoko@feddit.it 2 points 7 months ago

https://www.omglinux.com/speech-note-transcribe-voice-to-text-on-linux/

[–] gravitywell@sh.itjust.works 1 points 7 months ago (1 children)

Depends on your use case, but I found a plugin that uses OpenAI for dictation in Obsidian.

[–] archer@lemmy.ml 8 points 7 months ago

I'd prefer to keep both the recordings and the transcription local if possible, either on device or self hosted.

[–] spinning_your_wheels@thelemmy.club 1 points 7 months ago

https://github.com/openai/whisper

It's under the MIT license.
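Since OP wants both the recordings and the transcripts to stay local, here is a rough sketch of how the Python package from that repo could be used on your own machine. It assumes `pip install openai-whisper` and `ffmpeg` on the PATH. The helper names (`format_timestamp`, `segments_to_srt`, `transcribe_locally`) and the default `"base"` model are my own choices for illustration, not something from the thread.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (the SRT subtitle timestamp style)."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Turn Whisper-style segments (dicts with start, end, text) into SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def transcribe_locally(audio_path: str, model_name: str = "base") -> dict:
    """Run Whisper on this machine; the model is downloaded once, then cached."""
    import whisper  # imported lazily so the helpers above work without it

    model = whisper.load_model(model_name)
    # The result is a dict with "text", "segments" and "language" keys.
    return model.transcribe(audio_path)
```

Usage would look like `result = transcribe_locally("memo.m4a")`, then `print(result["text"])` for plain text or `segments_to_srt(result["segments"])` for timestamped output. Larger models are usually more accurate but slower, so it's worth trying a small one first.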