Skip to main content

Latency Budget

RFWhisper treats latency as a hard budget, not a goal.

End-to-end targets​

Hardwarev0.1 p99v0.3 p99
Intel i5-8xxx (laptop)< 100 ms< 50 ms
Apple M1< 100 ms< 45 ms
Raspberry Pi 5 (8 GB)< 100 ms< 70 ms

Anything worse than the v0.1 column is a release blocker (criterion A4).

Per-stage budget​

Stagev0.1v0.3Notes
Audio capture (frame)10–20 ms5–10 msPortAudio / ALSA / JACK / WASAPI callback
Pre-emphasis + feature extraction≤ 5 ms≤ 2 msVOLK windowing + liquid FFT with cached plans
ONNX inference (DFN3)≤ 30 ms≤ 15 msXNNPACK/CoreML/DirectML; INT8 stretch
Post-processing + overlap-add≤ 5 ms≤ 2 ms—
Output buffering10–30 ms5–15 msMatches input blocksize
Total (p99)< 100 ms< 50 ms—

RPi Zero 2 W ships with RNNoise (not DFN3) and has a separate, looser budget.

Measurement methodology​

python -m rfwhisper.bench latency \
--backend portaudio \
--in default --out default \
--blocksize 480 \
--duration 120 \
--out build/latency/

What it does:

  1. Generates an impulse train at known intervals.
  2. Plays impulses out → routes through the virtual cable → back into RFWhisper's input.
  3. RFWhisper processes them as normal audio.
  4. The probe aligns the reference and captured impulses, logging the round-trip delta to an HDR histogram.
  5. Publishes p50 / p95 / p99 / max; renders a histogram image.

We report p99, not mean. A pipeline that averages 30 ms but stalls at 180 ms twice per minute is unusable; one that's 85 ms tight is fine.

Hard rules​

  • No allocations in the audio callback. Verified in CI with an ASan build that flags malloc from RT threads.
  • GIL released in the inference path. Python code that doesn't respect this fails profiler checks.
  • Thread affinity is intentional, never left to the scheduler when it matters.
  • Frame sizes are 10–30 ms — shorter for CW (transient fidelity), longer for FT8 (decoder tolerates, NN gets more context).

Real-time scheduling per OS​

OSWhat we request
LinuxSCHED_FIFO at rtprio 95 if the user is in a realtime group
macOSthread_time_constraint_policy with appropriate period/constraint/deadline
WindowsMMCSS AvSetMmThreadCharacteristics with Pro Audio characteristic

If RT priority is unavailable (rootless container, no realtime group), RFWhisper falls back gracefully and logs a one-time warning.

When you miss the budget​

  1. Run python -m rfwhisper.bench latency --trace to get per-stage histograms.
  2. Check CPU governor (Linux) and power mode (macOS / Windows).
  3. Check virtual cable internal rate (48 kHz) matches RFWhisper's.
  4. Reduce blocksize carefully: smaller blocks lower latency but raise xrun risk.
  5. Switch to INT8 model variant (--model-variant int8) if you're CPU-bound.

If none of that helps, file an issue with the full rfwhisper doctor output and the trace HDR histogram.