I’ve been thinking about this one for years. I’d written about voice transcription on my German blog more than once, always circling the same frustration: I think faster than I type, I often think in German but need to write in English, and by the time I’ve opened an app and found the right place to put a thought, the thought has changed shape or disappeared entirely.
The tool I wanted was simple to describe and surprisingly difficult to build well. Press one button on my iPhone. Talk. Done. The system figures out the rest.
Here’s what it does now. I press the button, I speak. The voice memo syncs to my Mac Mini via iCloud. A watcher picks it up, runs it through Whisper large-v3-turbo for transcription, then sends the text to Granite 3.3:8b running locally on Ollama for cleanup: removing filler words, adding punctuation, formatting paragraphs, detecting the language.
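The cleanup step is the easiest part to picture in code. What follows is a minimal sketch, not the actual implementation: it assumes Ollama is listening on its default local port, that the model is pulled under the tag `granite3.3:8b`, and it uses a made-up prompt (`CLEANUP_PROMPT`) and made-up helper names. Language detection is left out.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing here leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumed Ollama tag for the model named in the post.
MODEL = "granite3.3:8b"

# Hypothetical prompt; the real one is not shown in the post.
CLEANUP_PROMPT = (
    "Clean up this voice transcript. Remove filler words, add punctuation, "
    "and split it into paragraphs. Keep the original language and wording.\n\n"
    "Transcript:\n{transcript}"
)


def build_cleanup_payload(transcript: str) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": MODEL,
        "prompt": CLEANUP_PROMPT.format(transcript=transcript),
        "stream": False,  # ask for one JSON object instead of a token stream
    }


def clean_transcript(transcript: str, timeout: float = 300.0) -> str:
    """Send the raw transcript to the local model and return the cleaned text."""
    body = json.dumps(build_cleanup_payload(transcript)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"].strip()
```

Keeping the payload builder separate from the network call means the prompt can be checked without a running model.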
Then it routes. The first word I say determines where the text ends up:
- Say “Hugo” and it becomes a blog draft, filed into my German or English folder depending on the language
- Say “Aiden King” and it becomes an Aiden King blog draft
- Say “Check in” or “Check out” and it becomes a journal entry
- Say “Aufgabe” and it extracts tasks and creates Apple Reminders with due dates and priorities
- Say anything else and it becomes a plain note in my Obsidian vault
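The routing rules above boil down to a prefix match on the cleaned transcript. Here is a rough sketch of that idea, under some assumptions: the destination names (`blog_draft`, `journal`, and so on) and the `route` function are invented for illustration, and the keyword is stripped from the text before filing, which the real tool may or may not do.

```python
import re

# Spoken keyword -> hypothetical destination name.
ROUTES = {
    "hugo": "blog_draft",
    "aiden king": "aiden_king_draft",
    "check in": "journal",
    "check out": "journal",
    "aufgabe": "reminders",
}


def route(transcript: str) -> tuple[str, str]:
    """Return (destination, remaining_text) based on the opening keyword."""
    for keyword, destination in ROUTES.items():
        # Tolerate punctuation or extra spaces inside and after the keyword,
        # since transcription may produce "Check-in," or "Hugo." at the start.
        words = r"\W+".join(re.escape(word) for word in keyword.split())
        match = re.match(r"^\W*" + words + r"\b\W*", transcript, re.IGNORECASE)
        if match:
            return destination, transcript[match.end():].strip()
    # No keyword matched: file it as a plain note.
    return "note", transcript.strip()
```

For example, `route("Aufgabe: Milch kaufen bis Freitag")` yields the `reminders` destination with the keyword removed, while a transcript with no keyword falls through to `note`. Picking the German or English folder for a `blog_draft` would be a second step that uses the detected language.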
All the processing stays local. The transcription runs on my machine. The language model runs on my machine. Nothing goes to an external server for processing. The output lands in iCloud as a markdown file, searchable in Spotlight, ready to edit.
It runs as a background service. I don’t open it, I don’t configure it, I don’t think about it. I press the button and talk. Three minutes later, the text is where it needs to be.
I wanted this for a long time. Now it works, and it works every time. That’s the part that still surprises me. Not that it’s possible, but that it’s reliable. Rock-solid reliable. The kind of reliable where you stop thinking about whether it will work and just use it the way you use a light switch.
Several of the posts on this site started as voice memos that hit the “Aiden King” route and landed in my drafts folder as markdown. Including the one about the tool that mostly tells me no.