A dictionary for the words the model misses
The single highest-leverage configuration for a developer using a dictation tool is the custom dictionary. A guide to seeding one in a working week.
Jamie van der Pijll
- product
- developer-workflow
The reason a general-purpose speech model misses your team’s service names is not a flaw of the model. It is a flaw of the prior. The model has heard “next js” a few hundred times. It has heard “Next.js” once or twice. When you say the words, it picks the more probable phrase and writes “next js” into your editor. You correct it. The next time you dictate the same paragraph, you correct it again.
The custom dictionary is the right place to break that loop. It is a post-processing pass that runs after the model produces a transcript: “from → to” replacements, applied automatically, on every dictation, regardless of which app the text is going into.
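As a rough illustration of what such a pass could look like, here is a minimal Python sketch. It is not Voiacast's implementation; the function name `apply_dictionary`, the sample entries, and the choice of case-insensitive whole-phrase matching are all assumptions made for the example.

```python
import re

# Hypothetical entries: the "from" side is what the model tends to emit,
# the "to" side is the spelling you actually want.
DICTIONARY = {
    "next js": "Next.js",
    "post gres": "Postgres",
}


def apply_dictionary(transcript: str, entries: dict[str, str]) -> str:
    """Apply every "from -> to" entry in a single pass over the transcript."""
    if not entries:
        return transcript
    # Look up matches by their lowercased form so "Next JS" and "next js"
    # hit the same entry.
    lookup = {source.lower(): target for source, target in entries.items()}
    # Longest phrases first, so a short entry cannot pre-empt a longer one.
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(a) for a in alternatives) + r")(?!\w)",
        re.IGNORECASE,
    )
    # A function replacement inserts the target text literally.
    return pattern.sub(lambda match: lookup[match.group(0).lower()], transcript)


print(apply_dictionary("we moved the next js app onto post gres", DICTIONARY))
```

Doing all replacements in one pass means the output of one entry is never re-scanned by another, which keeps the behaviour easy to reason about as the list grows.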
If you take one piece of advice from this post, take this one: seed the dictionary in the first week of using a dictation tool, and you will fix more accuracy problems in an afternoon than you would by upgrading to a larger model.
What goes in
Three categories cover most of the dictionary entries that matter.
The first is proper nouns. The names of your services, your products, your clients, your team, the cities your customers are in, the specific brands of equipment you use. A general speech model has heard “hummingbird”, the bird; it has not heard “Hummingbird”, the internal microservice at your employer. The dictionary fills that gap.
The second is technical jargon. The acronyms your industry uses every day — RFC, ARP, RBAC, OAuth, GDPR, JWT, S3, IAM, ECS. The names of the frameworks and tools — Next.js, Postgres, Kubernetes, FastAPI, SvelteKit, tRPC. Anything where the canonical spelling is non-obvious.
The third is your own peculiarities. The way you write your name in lowercase. The way your team writes “PR” instead of “MR” in commit messages. The casing you use for the project codenames that mean something only to you.
How to seed it in a week
A practical algorithm. For the first week, dictate normally and keep a plain-text scratchpad open. Every time you correct a dictated word by hand, write it down: “from → to”. At the end of the week, walk the list and add the entries that came up more than twice. You will typically have between fifteen and forty entries that matter.
This is enough. The dictionary’s marginal value drops off after the first fifty or so entries; the long tail is exactly that. A working dictionary is a few dozen carefully chosen replacements, not a few hundred speculative ones.
A note on cleanup. Resist the urge to add every entry that misfires once. Some words misfire because of acoustic conditions or a half-mumbled syllable; adding them to the dictionary causes spurious replacements later, whenever the “from” text turns up as a correct transcription. “More than twice” is the rule of thumb.
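The week-long routine above can be approximated with a short script, if you would rather not tally the scratchpad by hand. This is only a sketch: the one-correction-per-line scratchpad format, the function names, and the sample data are assumptions, not anything the tool prescribes.

```python
from collections import Counter


def parse_scratchpad(text: str) -> Counter:
    """Count how often each "from -> to" correction appears in the scratchpad."""
    counts: Counter = Counter()
    for line in text.splitlines():
        # Accept either the arrow character or a plain "->" as the separator.
        normalized = line.replace("→", "->")
        if "->" not in normalized:
            continue
        source, target = (part.strip() for part in normalized.split("->", 1))
        if source and target:
            counts[(source.lower(), target)] += 1
    return counts


def seed_entries(counts: Counter, min_count: int = 3) -> dict[str, str]:
    """Keep corrections seen more than twice, that is, at least three times."""
    return {
        source: target
        for (source, target), seen in counts.items()
        if seen >= min_count
    }


# Hypothetical scratchpad contents after a week of dictation.
SCRATCHPAD = """\
next js -> Next.js
cube control -> kubectl
next js -> Next.js
their -> there
next js -> Next.js
cube control -> kubectl
cube control -> kubectl
"""

print(seed_entries(parse_scratchpad(SCRATCHPAD)))
```

The one-off “their -> there” correction is dropped, which is the point of the threshold: it is probably an acoustic accident rather than a vocabulary gap.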
What it cannot do
The dictionary is a post-processor; it works on text the model already produced. That has a few honest limits.
Words the model gets phonetically wrong — names that sound nothing like any English phrase — are not dictionary-fixable. The model produces a guess that does not pattern-match the entry, the dictionary does not fire, and you correct by hand. For these, a larger model — or a bring-your-own-key cloud model — helps where the dictionary cannot.
Casing-only fixes are reliable. Spelling fixes for similar-sounding words are reliable. Word-boundary fixes — “next js” → “Next.js” — are reliable. Anything beyond simple substitution is a different kind of post-processor and not what the dictionary is for.
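To make the boundary concrete, the sketch below runs one hypothetical entry of each reliable kind against a transcript, plus one entry for a name the model garbled into something else entirely. The entries, the transcript, and the helper `apply_with_report` are invented for illustration and assume a simple case-insensitive, whole-phrase matcher.

```python
import re

# Hypothetical entries, one per kind of fix, plus a name the model mishears.
ENTRIES = {
    "oauth": "OAuth",           # casing only
    "cube control": "kubectl",  # similar-sounding spelling
    "next js": "Next.js",       # word boundary
    "shevon": "Siobhan",        # only fires if the model emits exactly "shevon"
}


def apply_with_report(transcript: str, entries: dict[str, str]) -> tuple[str, list[str]]:
    """Apply each entry and report which "from" phrases actually fired."""
    fired = []
    for source, target in entries.items():
        pattern = re.compile(
            r"(?<!\w)" + re.escape(source) + r"(?!\w)", re.IGNORECASE
        )
        # subn returns the new text and the number of replacements made.
        transcript, hits = pattern.subn(lambda _match, t=target: t, transcript)
        if hits:
            fired.append(source)
    return transcript, fired


# The model guessed "chevron" for the name, so the "shevon" entry never matches.
text, fired = apply_with_report(
    "ask chevron whether the next js app still uses oauth via cube control",
    ENTRIES,
)
print(text)
print(fired)
```

The three substitution entries fire; the name stays wrong because the model's guess does not contain the text the entry is looking for. No amount of dictionary tuning fixes that case reliably.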
The dictionary also does not learn. It is whatever you put in it. There is no “auto-add the words you correct” loop in v1. I think there should be eventually, but only as a suggestion the user explicitly accepts — silently growing the dictionary based on heuristics produces a footgun where one wrong heuristic correction shows up across every future paragraph.
Where the dictionary lives matters
A dictionary is a uniquely specific snapshot of what you work on. It is the names of your clients, the codenames of your projects, and the acronyms your team uses. Treating it as a piece of personal infrastructure rather than a synced-to-the-cloud convenience is, I think, the right call.
Voiacast’s dictionary is a local file. It does not sync. The Pro tier adds export and import, so a team can curate a shared dictionary and distribute it as a file. The trade-off is that you have to copy a file when you set up a new Mac; the upside is that the dictionary list does not leave the Macs it belongs on.
For most users, the trade-off is the right one. For a user who wants seamless cross-machine sync, the export/import pattern is one extra step on a Mac you set up once every few years. Worth it, in my book, for keeping the list local.
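For a sense of how little machinery a file-based workflow needs, here is a sketch of export, import, and merging a shared team list with a personal one. The JSON layout, the file name, and every function name here are assumptions for illustration; the actual export format is not described in this post.

```python
import json
import tempfile
from pathlib import Path


def export_dictionary(entries: dict[str, str], path: Path) -> None:
    """Write the entries as a JSON object (a hypothetical format)."""
    text = json.dumps(entries, indent=2, sort_keys=True, ensure_ascii=False)
    path.write_text(text, encoding="utf-8")


def import_dictionary(path: Path) -> dict[str, str]:
    """Read entries back, rejecting anything that is not a string-to-string map."""
    data = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(data, dict) or not all(
        isinstance(k, str) and isinstance(v, str) for k, v in data.items()
    ):
        raise ValueError(f"{path} is not a from-to dictionary file")
    return data


def merge_dictionaries(shared: dict[str, str], personal: dict[str, str]) -> dict[str, str]:
    """Combine a team list with a personal one; personal entries win on conflict."""
    return {**shared, **personal}


# Usage: a teammate exports the shared list, you import it on a new machine.
with tempfile.TemporaryDirectory() as tmp:
    shared_file = Path(tmp) / "team-dictionary.json"
    export_dictionary({"next js": "Next.js", "hummingbird": "Hummingbird"}, shared_file)
    merged = merge_dictionaries(import_dictionary(shared_file), {"cube control": "kubectl"})

print(merged)
```

Letting personal entries override shared ones is a design choice, not a requirement; a team that wants one canonical spelling might prefer the opposite order.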
The thing it actually does
I want to leave you with a concrete picture. I dictated the sentence “Hummingbird is rebalancing the IAM policies and the kubectl applies are landing in the wrong namespace.” With no dictionary, the model produces something like “hummingbird is rebalancing the iam policies and the cube control applies are landing in the wrong name space.” With dictionary entries for Hummingbird, IAM, kubectl, and namespace, the model's output is the same, but the text that lands in the editor is the sentence I spoke. The same audio. A different text.
That difference is what the dictionary is. Once you have one, raw dictation speed stops mattering; what matters is whether the system speaks your vocabulary back to you. The dictionary is how you teach it to.