

Architecture

RFWhisper is a thin layer of glue on top of a small number of very good libraries.

Principles

  1. Local-first. No runtime network calls. Models fetched once (SHA-256 verified) and cached.
  2. Latency is a feature. Target p99 latency is under 100 ms in v0.1 and under 50 ms in v0.3. See Latency Budget.
  3. Preserve the signal. Classical DSP handles what it's good at (resampling, notching, framing); the NN handles what it's good at (complex stationary + impulsive noise mixtures). Neither replaces the other.
  4. Predictable. Preallocated buffers, the GIL released in hot paths, no hidden per-frame reallocations.
  5. Swappable. Models are ONNX artefacts loaded at runtime; swap DFN3 ↔ RNNoise ↔ your own with a CLI flag or a GUI dropdown.
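The "fetched once, SHA-256 verified, cached" rule in principle 1 can be sketched as follows. This is a minimal illustration using only the standard library; the function names `sha256_of` and `is_cached_and_valid` are hypothetical and are not taken from the RFWhisper codebase.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_cached_and_valid(path: Path, expected_sha256: str) -> bool:
    """True only if the cached file exists and matches the pinned digest.

    A False result means the caller should (re)download the model.
    This function itself never touches the network.
    """
    return path.is_file() and sha256_of(path) == expected_sha256.lower()
```

Checking the digest on every load, not only after download, would also catch a cache file that was truncated or modified later, at the cost of one extra read of the model file at startup.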

Runtime topology

Threads

| Thread | Priority | Responsibility |
| --- | --- | --- |
| Audio capture | Realtime (SCHED_FIFO / Pro Audio MMCSS / mach thread constraint) | Pulls frames from the input device into a lock-free SPSC ring |
| Pre / Inference / Post | Realtime | Pops a frame, runs feature extraction → ONNX → overlap-add, then pushes to the output ring |
| Audio output | Realtime | Drains the output ring to the device |
| Telemetry | Nice +5 | Samples HDR histograms every second; writes JSON/TB logs if enabled |
| GUI (when running) | Default | Never blocks the audio path |

Buffers

Every buffer is preallocated at startup to the worst-case frame size. Audio callbacks perform no allocation and take no locks.
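To make the preallocated single-producer / single-consumer (SPSC) ring concrete, here is a small structural sketch. The class name `SpscRing` and its methods are hypothetical, and plain Python is used only to show the indexing scheme: the interpreter still creates small temporary objects, so a real hot path would presumably live in native code.

```python
class SpscRing:
    """Single-producer / single-consumer ring of fixed-size frames.

    All frame storage is allocated once in __init__; push/pop only copy
    bytes into or out of existing slots. One slot is kept empty so that
    "full" and "empty" can be told apart without a shared counter.
    """

    def __init__(self, slots: int, frame_bytes: int) -> None:
        self._frame_bytes = frame_bytes
        self._slots = slots + 1  # one spare slot distinguishes full from empty
        self._buf = bytearray(self._slots * frame_bytes)
        self._view = memoryview(self._buf)
        self._head = 0  # written only by the producer
        self._tail = 0  # written only by the consumer

    def push(self, frame) -> bool:
        """Copy one frame in; return False (frame dropped) if the ring is full."""
        nxt = (self._head + 1) % self._slots
        if nxt == self._tail:
            return False
        start = self._head * self._frame_bytes
        self._view[start:start + self._frame_bytes] = frame
        self._head = nxt  # publish only after the data is in place
        return True

    def pop_into(self, out: bytearray) -> bool:
        """Copy the oldest frame into a caller-owned buffer; False if empty."""
        if self._tail == self._head:
            return False
        start = self._tail * self._frame_bytes
        out[:] = self._view[start:start + self._frame_bytes]
        self._tail = (self._tail + 1) % self._slots
        return True
```

Because each index has exactly one writer, neither side needs a lock. The consumer passes in its own reusable output buffer so that popping a frame does not allocate new frame storage.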

Subsystems

DSP (classical)

Lives in rfwhisper/dsp/. Covers windowing, STFT (via liquid-dsp), polyphase resampling, pre/de-emphasis, overlap-add, and the adaptive narrowband notch (v0.3). VOLK supplies SIMD kernels for the performance-critical loops.

→ Signal Flow for the detailed block-by-block view.
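As a small illustration of why windowing and overlap-add pair up, the sketch below checks that periodic Hann windows at 50 % overlap sum to a constant. The window choice, frame length, and hop are assumptions made for the example; the section above does not state which window or overlap RFWhisper uses.

```python
import math


def periodic_hann(n: int) -> list[float]:
    """Periodic Hann window of length n (denominator n, not n - 1)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def overlap_add(frames: list[list[float]], hop: int) -> list[float]:
    """Sum equal-length frames placed `hop` samples apart."""
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for k, frame in enumerate(frames):
        for i, sample in enumerate(frame):
            out[k * hop + i] += sample
    return out


FRAME_LEN = 8          # illustrative only; real frames are far longer
HOP = FRAME_LEN // 2   # 50 % overlap
window = periodic_hann(FRAME_LEN)

# Windowing a constant signal of 1.0 yields the window itself in every frame.
summed = overlap_add([window[:] for _ in range(4)], HOP)

# Away from the first and last half-frame, the overlapped windows sum to 1,
# so a signal that passes through unmodified is reconstructed exactly.
steady = summed[HOP:-HOP]
```

The first and last half-frames are excluded because only one window covers them; a streaming implementation handles this with a short warm-up at start.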

Inference

Lives in rfwhisper/models/. Thin ONNX Runtime wrapper:

  • Providers chosen in order: CoreML → DirectML → CUDA → XNNPACK → CPU.
  • Session options: intra_op_num_threads tuned per target, inter_op_num_threads=1, enable_cpu_mem_arena=True.
  • I/O tensors are pinned and reused per frame.
  • FP32 + INT8 (QDQ) variants ship side-by-side; users pick via --model-variant.
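The provider order and session options in the list above map onto the ONNX Runtime Python API roughly as follows. This is a sketch, not the project's wrapper: `pick_providers`, `make_session`, and the default of two intra-op threads are placeholders chosen for the example.

```python
PREFERRED_PROVIDERS = [
    "CoreMLExecutionProvider",
    "DmlExecutionProvider",      # DirectML
    "CUDAExecutionProvider",
    "XnnpackExecutionProvider",
    "CPUExecutionProvider",
]


def pick_providers(available: list[str]) -> list[str]:
    """Keep the preferred order, dropping providers this build lacks."""
    chosen = [p for p in PREFERRED_PROVIDERS if p in available]
    return chosen or ["CPUExecutionProvider"]


def make_session(model_path: str, intra_threads: int = 2):
    """Build an InferenceSession with the session options listed above.

    `intra_threads=2` is a placeholder, not a tuned value.
    """
    import onnxruntime as ort  # lazy import keeps pick_providers dependency-free

    opts = ort.SessionOptions()
    opts.intra_op_num_threads = intra_threads
    opts.inter_op_num_threads = 1
    opts.enable_cpu_mem_arena = True
    return ort.InferenceSession(
        model_path,
        sess_options=opts,
        providers=pick_providers(ort.get_available_providers()),
    )
```

Filtering against `ort.get_available_providers()` means the same preference list can be used on every platform: a Linux build without CoreML or DirectML simply falls through to the next entry.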

→ Models for training, fine-tuning, and the model hub.

Real-time runtime

Lives in rfwhisper/realtime/. PortAudio / WASAPI / CoreAudio / ALSA backends with a unified callback surface. Lock-free SPSC rings between stages. HDR histograms for p50/p95/p99.
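The p50/p95/p99 figures come from latency histograms. The sketch below is a deliberately simplified stand-in for an HDR histogram: it uses linear buckets of fixed width rather than HDR's variable-precision buckets, and the class name, bucket width, and range are all assumptions for illustration. Latencies are assumed to be non-negative integers in microseconds.

```python
import math


class LatencyHistogram:
    """Fixed-resolution latency histogram with preallocated buckets.

    Linear buckets of `resolution_us` microseconds up to `max_us`;
    anything larger is clamped into the last bucket.
    """

    def __init__(self, max_us: int = 200_000, resolution_us: int = 100) -> None:
        self._res = resolution_us
        self._counts = [0] * (max_us // resolution_us + 1)
        self._total = 0

    def record(self, latency_us: int) -> None:
        """Count one sample; touches only preallocated storage."""
        idx = min(latency_us // self._res, len(self._counts) - 1)
        self._counts[idx] += 1
        self._total += 1

    def percentile(self, pct: float) -> int:
        """Upper edge (in microseconds) of the bucket holding the percentile."""
        if self._total == 0:
            raise ValueError("no samples recorded")
        target = max(1, math.ceil(self._total * pct / 100.0))
        seen = 0
        for idx, count in enumerate(self._counts):
            seen += count
            if seen >= target:
                return (idx + 1) * self._res
        return len(self._counts) * self._res
```

Recording is a single index computation and increment, which is why a histogram suits the audio path better than keeping every sample; the percentile scan can then run on the lower-priority telemetry thread.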

GNU Radio integration (v0.2+)

Lives in gr-rfwhisper/ (OOT module) and flowgraphs/ (per-SDR .grc files). Uses stock gr-dnn where possible; adds our own blocks where we need runtime model-swap, telemetry, and profile awareness.

flowchart LR
  A([RTL-SDR v4<br/>2.048 Msps]) --> B[SoapySDR Source]
  B --> C[Freq Xlating FIR<br/>decimate 64]
  C --> D[SSB Demod<br/>USB/LSB]
  D --> E[Audio Resampler<br/>→ 48 kHz]
  E --> F["gr-dnn · DeepFilterNet3"]
  F --> G[Overlap-add / Limiter]
  G --> H[/Virtual Cable/]
  F -.-> T{{Telemetry}}

→ Flowgraphs has one flowgraph per SDR.
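The sample rates in the RTL-SDR flowgraph can be checked with a few lines of arithmetic. The numbers are taken from the diagram; the variable names are illustrative only.

```python
from fractions import Fraction

SDR_RATE_HZ = 2_048_000      # RTL-SDR v4 source rate from the flowgraph
DECIMATION = 64              # Freq Xlating FIR decimation factor
AUDIO_RATE_HZ = 48_000       # audio resampler output rate in the flowgraph

demod_rate_hz = SDR_RATE_HZ // DECIMATION          # rate seen by the SSB demod
ratio = Fraction(AUDIO_RATE_HZ, demod_rate_hz)     # resampling ratio, reduced

interpolation, decimation = ratio.numerator, ratio.denominator
print(f"{demod_rate_hz} Hz -> {AUDIO_RATE_HZ} Hz: up {interpolation}, down {decimation}")
```

Decimating by 64 leaves a 32 kHz stream, so the audio resampler needs a rational 3/2 ratio to reach 48 kHz, which is the kind of conversion a polyphase resampler handles directly.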

Where things live in the repo

rfwhisper/
├── constants.py   ← shared constants (rates, frame sizes, opset)
├── dsp/           ← classical DSP (windows, STFT, resample, notch)
├── models/        ← ONNX loaders, providers, registry, fetch
├── realtime/      ← audio backends, SPSC rings, scheduling
├── profiles/      ← YAML per-mode (ssb.yaml, cw.yaml, ft8.yaml, …)
├── gui/           ← PySide6 app (v0.4)
├── train/         ← fine-tuning pipeline (v0.5)
├── bench/         ← latency probe, RTF, CPU, memory
└── cli.py         ← the rfwhisper command
gr-rfwhisper/      ← GNU Radio OOT module (C++ + GRC YAML)
flowgraphs/        ← .grc + generated .py
tests/audio/       ← acceptance harness tied to ROADMAP criteria

Dependencies at a glance

| Purpose | Library | License |
| --- | --- | --- |
| DSP framework | GNU Radio 3.10.x | GPLv3 |
| SDR abstraction | SoapySDR | Boost |
| Inference | ONNX Runtime | MIT |
| Classical DSP primitives | liquid-dsp | MIT |
| SIMD kernels | VOLK | GPLv3 |
| Primary model | DeepFilterNet3 | MIT / Apache-2.0 |
| Fallback model | RNNoise | BSD-3-Clause |
| Audio backends | PortAudio / WASAPI / CoreAudio / ALSA | Varies (all permissive) |
| GUI | PySide6 / Qt 6 | LGPL |
| Packaging | PEP 621 pyproject.toml | n/a |

All GPLv3-compatible.