help

how does this thing work?

engine picks, hotkey shapes, what every toggle does. read what you need, skip the rest.

getting started

  1. download and open the app. the icon shows up in your menu bar.
  2. grant accessibility in system settings → privacy & security → accessibility. required for the global hotkey to work in any app.
  3. grant microphone access when prompted. without it, there is nothing to dictate.
  4. pick a hotkey and a target app in settings → hotkeys. the default is the backtick (`) key targeting TextEdit; change it to whatever pair you actually use.
  5. pick a transcription engine in settings → engine. Apple Speech needs no download; everything else needs a one-click model fetch in settings → models.
  6. hold the hotkey, speak, release. the text appears in your target app while you stay in whatever you were doing.

engines & models

four engines, sixteen total options. local engines run entirely on your Mac with no network calls. the cloud API engine is the only one that uploads your audio, and you opt in by entering your own OpenAI key.

Apple Speech

zero download. built into macOS. picks up instantly after install.

Apple Speech

0 MB · 27 locales

already on your Mac. good enough for chat-style sentences, weak on technical jargon. pick this if you want to try the app before downloading anything.

first-run, short dictation, casual use

WhisperKit

OpenAI Whisper running on the Neural Engine. 10 options: English-only and multilingual variants from Tiny through Medium, plus two multilingual Large models.

Tiny English

75 MB · English

smallest and fastest. quality is limited; useful as a "is the pipeline working" sanity check.

low-RAM Macs, lots of small dictations

Tiny Multilingual

75 MB · ~99 langs

same speed, broader language coverage, at some cost to English accuracy.

low-RAM Macs, mixed-language dictation

Base English

145 MB · English

noticeably better than Tiny. reasonable default for English-only users on any Mac.

comfortable everyday English

Base Multilingual

145 MB · ~99 langs

same size, all languages. fresh installs land here.

comfortable everyday multilingual

Small English

465 MB · English

big jump in accuracy on jargon, code, and proper nouns. sweet spot if you mostly dictate English.

English with technical vocabulary

Small Multilingual

465 MB · ~99 langs

same accuracy class as Small English but across languages.

higher-accuracy multilingual

Medium English

1.4 GB · English

great for paragraphs of dense English. noticeably slower than Small.

long-form English

Medium Multilingual

1.4 GB · ~99 langs

high accuracy in any language; slower than Small.

long-form multilingual

Large v3 Turbo (Recommended)

1.6 GB · ~99 langs

near-Large accuracy, roughly four times faster than full Large. the best speed-to-quality ratio in the WhisperKit lineup on Apple Silicon. default pick if you do not know which model to use.

most people who want the best balance

Large v3

2.9 GB · ~99 langs

highest accuracy WhisperKit offers. slower than Turbo. pick this only if Turbo is leaving accuracy on the table for your specific use case.

maximum accuracy regardless of speed

Parakeet (FluidAudio)

NVIDIA Parakeet TDT and Qwen3-ASR via FluidAudio. optimised for the Neural Engine. includes a vocabulary-boost option.

Parakeet TDT v3

~500 MB · 25 European langs

strong English plus French, German, Spanish, Italian, Polish, and most other EU languages. auto-detects language; no language picker needed.

high-accuracy European-language dictation

Parakeet TDT v2

~500 MB · English

predecessor of v3, English-only. slightly different acoustic model. try if v3 has trouble with your voice.

English-only Parakeet workflows

Parakeet TDT-CTC 110m

~110 MB · English

smallest, fastest Parakeet. pairs with the vocabulary boost: type your terminology into the vocabulary list and the recognizer prefers it during decoding.

snappy English dictation on any Mac

Qwen3-ASR 0.6B

~900 MB · 50+ langs

widest language coverage in the local lineup. particularly strong on East Asian languages.

Mandarin, Cantonese, Japanese, Korean, plus most EU/SEA languages

Cloud API

bring your own OpenAI key. audio uploads, no model lives on your Mac.

OpenAI Whisper API

nothing local · ~99 langs

nothing to download. you pay per minute via your own OpenAI account. disabled by default. if you set it, your audio is sent to OpenAI for that one transcription. everything else stays local.

old Macs, very long dictations, fallback when local models do not fit

the refine step

after transcription, the raw text can be cleaned up: punctuation, capitalisation, filler words, and obvious grammar errors get patched. the refine step is fully optional. you pick one of three backends in settings → refine.

four rule toggles on the refine tab let you turn each cleanup category on or off individually: punctuation, spoken-punctuation conversion ("comma" → ","), filler removal ("um", "uh", "like"), and grammar fixes.
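
a rough sketch of how two of those toggles could behave, in Python. this is an illustration under assumptions, not HeyClanker's actual refiner: the `RefineRules` class, the word lists, and the regex approach are all hypothetical, and "like" is left out of the filler list here because a naive pattern would also delete legitimate uses of the word.

```python
import re
from dataclasses import dataclass


@dataclass
class RefineRules:
    # Hypothetical toggle names; only two of the four categories are sketched.
    spoken_punctuation: bool = True
    filler_removal: bool = True


# Assumed word lists, for illustration only.
SPOKEN_PUNCTUATION = {"comma": ",", "period": ".", "question mark": "?"}
FILLERS = ("um", "uh")


def refine(text: str, rules: RefineRules) -> str:
    """Apply only the cleanup categories that are switched on."""
    if rules.filler_removal:
        # Drop whole-word fillers plus an optional trailing comma and spaces.
        pattern = r"\b(?:" + "|".join(FILLERS) + r")\b,?\s*"
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    if rules.spoken_punctuation:
        for word, mark in SPOKEN_PUNCTUATION.items():
            # Swallow the space before the spoken word so the mark
            # attaches to the previous word.
            text = re.sub(r"\s*\b" + re.escape(word) + r"\b", mark, text,
                          flags=re.IGNORECASE)
    return text.strip()


print(refine("um hello comma world", RefineRules()))
print(refine("um hello comma world", RefineRules(spoken_punctuation=False)))
```

the naive replacement would also rewrite "a period of time", which is one reason to keep each category behind its own toggle.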

hotkeys & target apps

each hotkey slot is one pair: a key combination and a target app. hold the key, dictate, release. the text is typed into the target. your foreground app stays foreground.

set up multiple slots if you dictate into different apps. one for Discord while gaming, one for Notes while in a meeting, one for your terminal while reading docs. each slot has its own target app and an optional pair of pre- and post-speech key combos (Cmd+L to focus a chat box before you start, Return to send after you stop).
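
one way to picture a slot, sketched in Python. the `HotkeySlot` fields and the `plan_steps` helper are hypothetical names chosen for illustration; the real settings format is not documented here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class HotkeySlot:
    hotkey: str                              # key combination you hold
    target_app: str                          # app that receives the text
    pre_speech_combo: Optional[str] = None   # e.g. "Cmd+L" to focus a chat box
    post_speech_combo: Optional[str] = None  # e.g. "Return" to send


def plan_steps(slot: HotkeySlot, text: str) -> List[Tuple[str, str]]:
    """List the actions sent to the target app, in order."""
    steps: List[Tuple[str, str]] = []
    if slot.pre_speech_combo:
        steps.append(("press", slot.pre_speech_combo))
    steps.append(("type", text))
    if slot.post_speech_combo:
        steps.append(("press", slot.post_speech_combo))
    return steps


chat = HotkeySlot(hotkey="`", target_app="Discord",
                  pre_speech_combo="Cmd+L", post_speech_combo="Return")
for step in plan_steps(chat, "on my way"):
    print(step)
```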

the hotkey listener distinguishes "hold to dictate" from "tap the key normally". tapping the hotkey types the key as usual. holding it for the configured threshold starts recording. the threshold is short and tuned to feel snappy.
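
a minimal sketch of the tap-versus-hold decision, in Python. the 0.25-second threshold and every name below are assumptions for illustration, not the app's real values. it also simplifies by classifying a press after the key is released, whereas a live listener would start recording as soon as the threshold elapses.

```python
from dataclasses import dataclass

# Hypothetical threshold; the app's configured value is not stated here.
HOLD_THRESHOLD_SECONDS = 0.25


@dataclass
class KeyPress:
    down_at: float  # key-down timestamp, in seconds
    up_at: float    # key-up timestamp, in seconds


def classify_press(press: KeyPress,
                   threshold: float = HOLD_THRESHOLD_SECONDS) -> str:
    """Return "hold" if the key stayed down for at least `threshold`, else "tap"."""
    held_for = press.up_at - press.down_at
    if held_for < 0:
        raise ValueError("key-up cannot precede key-down")
    return "hold" if held_for >= threshold else "tap"


print(classify_press(KeyPress(down_at=0.0, up_at=0.05)))  # quick tap
print(classify_press(KeyPress(down_at=0.0, up_at=0.8)))   # held to dictate
```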

trigger words

a trigger word is a phrase that ends your dictation with a specific keyboard action. three action types: a single key, a key combo (like Cmd+Return), or a literal text insert.

example. the default trigger word is "execute", set to press Return. you dictate "ship it tomorrow morning execute" into Slack. HeyClanker pastes "ship it tomorrow morning" and presses Return. the trigger phrase itself is stripped from what gets typed.
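
the example above can be sketched in Python. the `TRIGGERS` table and the `split_trigger` function are hypothetical, and matching on the end of the transcript with a regex is an assumption about how the stripping might work.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical trigger table: phrase -> key to press after pasting.
TRIGGERS: Dict[str, str] = {"execute": "Return"}


def split_trigger(transcript: str,
                  triggers: Dict[str, str] = TRIGGERS) -> Tuple[str, Optional[str]]:
    """Strip a trailing trigger phrase; return (text_to_paste, action or None)."""
    # Ignore whitespace and punctuation the recognizer may add at the end.
    cleaned = transcript.rstrip(" \t\n.,!?")
    for phrase, action in triggers.items():
        # The phrase must be a whole word at the very end of the dictation.
        match = re.search(r"(?:^|\s)" + re.escape(phrase) + r"$",
                          cleaned, flags=re.IGNORECASE)
        if match:
            return cleaned[:match.start()].rstrip(" \t\n,"), action
    return transcript, None


print(split_trigger("ship it tomorrow morning execute"))
print(split_trigger("no trigger in this one"))
```

requiring the phrase to be the final word means saying "execute" mid-sentence is typed like any other word.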

vocabulary & learned corrections

vocabulary. a list of words the recognizer often gets wrong. type them in the spelling you want to see: kubectl, psql, your colleague's name, your product name, internal acronyms. the recognizer biases toward your list during decoding. list lives on your Mac. nothing uploads. nothing trains.

learned corrections. optional, off by default. with the toggle on, HeyClanker watches the focused field for a few seconds after each paste. if you fix a word or two, it records the swap and applies it next time. one- and two-word substitutions only. never whole sentences. the pairs live on your Mac, ranked by how often each appears. the refiner uses them as hints. to delete or audit them, open settings → refine → learned corrections.
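
a small Python sketch of the bookkeeping described above: one- and two-word swaps only, ranked by how often each appears. `CorrectionStore` and its methods are hypothetical names, and an in-memory counter stands in for whatever local storage the app really uses.

```python
from collections import Counter
from typing import List, Tuple

Swap = Tuple[str, str]  # (what was transcribed, what you changed it to)


class CorrectionStore:
    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def record(self, heard: str, fixed: str) -> bool:
        """Store a swap only if both sides are one or two words long."""
        if heard == fixed:
            return False
        if not (1 <= len(heard.split()) <= 2 and 1 <= len(fixed.split()) <= 2):
            return False  # never whole sentences
        self.counts[(heard, fixed)] += 1
        return True

    def ranked(self) -> List[Tuple[Swap, int]]:
        """Most frequent swaps first."""
        return self.counts.most_common()


store = CorrectionStore()
store.record("cube control", "kubectl")
store.record("cube control", "kubectl")
store.record("post gres", "Postgres")
store.record("this is a whole sentence", "something else entirely")  # rejected
print(store.ranked())
```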

privacy posture

nothing to sign up for, no email to verify, no magic link that arrives in spam fourteen minutes late. the app launches, the hotkey works, that is the entire flow.

there is no HeyClanker backend in the middle. the app talks to your Mac and, if you opt in, your chosen LLM provider. no analytics, no usage events, no "anonymous diagnostics" feeding a dashboard somewhere. your dictation is not, under any circumstances, going to wind up in next quarter's "voice intelligence dataset" the company will swear is fully anonymised.

audio stays on your Mac with every engine except cloud API, which uploads only the audio for the one transcription you asked for. API keys live in the macOS Keychain, not a plaintext config file you accidentally commit to a public repo. your vocabulary list and any learned corrections are local. none of it trains anything.

FAQ

how do I dictate without leaving the app I am in?

set a target app per hotkey slot in settings → hotkeys. when you hold the hotkey and speak, HeyClanker types into that target without bringing it forward. your current window stays focused. great for dictating into Discord while you stay in your game, or into a terminal on a second monitor while a meeting runs on the first.

which model should I download first?

if you want to try the app immediately, Apple Speech needs no download and works on any Mac. if you want better accuracy, download Whisper Large v3 Turbo. it is the recommended balance of accuracy and speed on Apple Silicon. for East Asian languages, Cantonese included, pick Qwen3-ASR.

what is the refine step?

optional. after the transcription is done, HeyClanker can pass it through a cleanup model to fix punctuation, capitalisation, filler words, and grammar mistakes. three backends: off (no cleanup), Apple Intelligence (on-device, free, requires macOS 26 and Apple Intelligence enabled), and cloud API (your own OpenAI or Anthropic key, charges your account per call). default is off.

why does my dictation not pick up the trailing word?

the audio recorder waits a short, calibrated grace window after you release the hotkey to catch trailing samples. if you are consistently losing the last word, try speaking the last syllable a touch more slowly. Bluetooth headsets have higher input latency than wired mics; the recorder accounts for this automatically.

does the app work in fullscreen games?

yes. the hotkey listener runs system-wide via macOS accessibility, so it fires inside fullscreen apps, Spaces, and Mission Control. the recording overlay floats above fullscreen apps without stealing focus. you stay in the game, the text lands in the target app for that hotkey.

what if I want to cancel a recording mid-thought?

press ESC while recording. the audio is discarded, nothing gets transcribed, nothing gets pasted. you can also say "force quit now" if the app ever gets stuck and you need it gone without opening Activity Monitor.

how do trigger words work?

you set up phrases that end your dictation with a specific action. the default trigger is "execute": say it at the end of your dictation and HeyClanker presses Return after pasting, ending the line or starting a new paragraph depending on the app. the trigger phrase itself is stripped from the transcript. configure them in settings → triggers. three action types: a single key, a key combo (like Cmd+Return), or a literal text insert.

what is the vocabulary list for?

words the recognizer often gets wrong: terminal commands, internal product names, jargon, proper nouns. type them into settings → engine → vocabulary one per line, in the spelling you want to see. the recognizer biases toward your terms during decoding. the list lives on your Mac and is never uploaded.

does HeyClanker learn from my corrections?

only if you turn it on. with the toggle on, when you paste a transcript and then fix a word or two in the focused field, HeyClanker quietly records that swap and applies it next time. one-to-two-word corrections only; never whole sentences. stored locally. off by default to match the "100% local, nothing trains on you" stance.

why is my voice never sent to a cloud server?

because the default engines run entirely on your Mac. the cloud API engine is the one exception. you opt into it explicitly by entering an OpenAI key, and even then only the audio for that one transcription goes to OpenAI. there is no HeyClanker backend in the middle, no telemetry, no "anonymous usage data" feeding a dashboard somewhere.

what does the advanced silence detection slider do?

controls how long a pause inside a dictation gets preserved before HeyClanker treats it as dead air to compress. the default of 1.0 second is calibrated for natural end-of-sentence pauses. lower it if transcriptions feel like they have unnecessarily long gaps; raise it if punctuation feels too "running on" with no breath between sentences.
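
a toy Python model of what the slider controls. it assumes the audio has already been split into speech and silence segments, and that a pause longer than the threshold is shortened to the threshold; both are assumptions for illustration, since the actual compression behaviour is not documented here.

```python
from typing import List, Tuple

Segment = Tuple[str, float]  # ("speech" or "silence", duration in seconds)


def compress_silence(segments: List[Segment],
                     threshold: float = 1.0) -> List[Segment]:
    """Keep pauses up to `threshold`; clamp longer ones down to it."""
    out: List[Segment] = []
    for kind, duration in segments:
        if kind == "silence" and duration > threshold:
            duration = threshold  # assumed behaviour: shorten, do not drop
        out.append((kind, duration))
    return out


take = [("speech", 2.0), ("silence", 0.4), ("speech", 1.5), ("silence", 3.0)]
print(compress_silence(take))                 # default 1.0 s threshold
print(compress_silence(take, threshold=0.3))  # lower slider, shorter gaps
```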

how big are the downloads, really?

the app itself is roughly 15 MB. Apple Speech is free. the smallest local model is 75 MB; the largest is 2.9 GB. you only download the ones you actually pick. switching to a model that is not yet downloaded shows a one-click download prompt in settings → models with progress.

I revoked accessibility permission. now the hotkey does nothing.

grant it again in system settings → privacy & security → accessibility. the hotkey listener will start working immediately without a relaunch. if it still does not respond after a minute, quit and reopen the app.

how do I cancel my trial?

quit the app and drag it to the trash. the 14-day trial expires locally if you do nothing. there is no account to close, no exit survey, no "sorry to see you go" email with a discount code attached.

didn't find what you needed?

open an issue on GitHub.