The honest summary
Privacy is not a single dimension. For dictation, three questions matter, and a tool's answers can differ from one question to the next:
- Where is the audio processed?
- Where is the transcript stored?
- What identifying metadata is attached to the request?
Below is a tool-by-tool answer to each question, with the load-bearing claims sourced from each vendor’s own site or documentation.
Apple Dictation
Apple has moved Mac dictation on-device for supported languages on recent macOS releases. Audio is processed locally and the transcript is delivered to the focused field. No Apple account is required for dictation itself, although the Mac may be signed in to an Apple account for other features.
- Where the audio goes: on-device for supported languages on recent macOS.
- Where the transcript lives: only in the field you typed into.
- What metadata: minimal beyond standard system telemetry.
Wispr Flow
Wispr Flow is a cross-platform voice keyboard. Its documented default is cloud transcription — the audio is sent to Wispr Flow’s servers, transcribed there, and the text is returned to the focused field. Wispr Flow publishes a security and privacy page describing what is stored and for how long.
- Where the audio goes: Wispr Flow’s cloud.
- Where the transcript lives: in the focused field and on Wispr Flow’s servers, per their retention policy.
- What metadata: standard request metadata plus account state.
Superwhisper
Superwhisper runs on-device by default. Optional cloud modes use AI provider keys the user controls.
- Where the audio goes: on-device by default; user-controlled cloud for the optional modes.
- Where the transcript lives: in the focused field; modes that integrate with an external AI provider send the intermediate text to that provider.
- What metadata: the user account context for paid features.
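The bring-your-own-key pattern is easy to reason about: the app builds the request against the provider’s API using the user’s own key, so no vendor server sits in the middle. A minimal sketch, assuming an OpenAI-style transcription endpoint; the URL, model name, and helper function are illustrative, not any specific app’s implementation:

```python
def build_transcription_request(audio_path: str, api_key: str) -> dict:
    """Sketch of a bring-your-own-key transcription request.

    The audio goes straight to the provider the key belongs to;
    there is no intermediary server to apply its own retention policy.
    """
    return {
        # Illustrative OpenAI-style endpoint; the real URL depends on the provider.
        "url": "https://api.openai.com/v1/audio/transcriptions",
        # The user-supplied key is the only credential attached.
        "headers": {"Authorization": f"Bearer {api_key}"},
        "files": {"file": audio_path},
        "data": {"model": "whisper-1"},
    }

req = build_transcription_request("note.wav", "sk-user-supplied")
# The only host the audio ever reaches is the provider's own.
assert "api.openai.com" in req["url"]
```

The privacy consequence is that the retention policy you are trusting is the provider’s, not the dictation vendor’s: whoever issued the key is the party holding the audio.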
MacWhisper
MacWhisper runs fully on-device; audio is processed locally by whichever Whisper-family model the user selects.
- Where the audio goes: on-device.
- Where the transcript lives: in the file or document the user is producing.
- What metadata: minimal beyond standard purchase and update telemetry.
Voiacast
Voiacast runs on-device by default. Audio is processed locally and discarded as soon as the words appear on screen. The optional bring-your-own-key cloud transcription mode (Pro only) sends audio directly to the AI provider whose key the user supplied; Voiacast does not proxy it.
- Where the audio goes: on-device by default; user-controlled cloud when the user explicitly turns BYOK on.
- Where the transcript lives: in the focused field, never on a Voiacast server.
- What metadata: the license-key validation call carries no audio, no transcript, and no usage data, only the license key and the machine fingerprint.
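That metadata claim is easy to picture as a payload. A minimal sketch, assuming the fingerprint is a SHA-256 hash of a stable hardware identifier so the raw ID never leaves the machine; the field names and hashing choice are illustrative, not Voiacast’s actual wire format:

```python
import hashlib

def license_check_payload(license_key: str, hardware_id: str) -> dict:
    """Sketch of a license validation call that carries no audio,
    no transcript, and no usage data -- only a key and a hashed
    machine fingerprint."""
    # Hashing means the server can recognize a repeat activation
    # without ever learning the raw hardware identifier.
    fingerprint = hashlib.sha256(hardware_id.encode("utf-8")).hexdigest()
    return {"license_key": license_key, "machine_fingerprint": fingerprint}

payload = license_check_payload("VOIA-1234-5678", "platform-uuid-example")
# Exactly two fields: nothing about what was dictated, or how often.
assert set(payload) == {"license_key", "machine_fingerprint"}
```

The point of the sketch is the shape of the payload: identity is reduced to a key and an opaque hash, so no transcription content can ride along with it.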
The questions worth asking before you pick
Whichever tool you pick, the same three questions do most of the work:
- What is the default? A tool that defaults to local processing and asks before changing that default is harder to misconfigure than one that defaults the other way.
- What is the metadata? A tool that requires an account by default attaches identity to every transcription; a tool that uses a license key plus a hashed machine fingerprint does not.
- What is the retention? On-device tools retain nothing beyond the text in your document. Cloud tools have to publish — and stand behind — a retention window for the audio and the transcript.
See also
- On-device dictation — what the default means in practice.
- Bring-your-own-key cloud transcription — when cloud is the right call.
- Voiacast vs Apple Dictation — the side-by-side comparison.
Last reviewed .