Xen Voice Line

Telnyx → FreeSWITCH → M4 receiver → faster-whisper → warm session → TTS → your ear. Status & fix log — 2026-06-17.

LINE GREEN — two-way live on 470-369-6463
| Component | Status |
|---|---|
| Call Receiver | :4451 healthy |
| STT (faster-whisper) | :8791 healthy |
| Fast Brain (Sonnet) | launchd · auto-restart |
| Voice | canonical · rate 210 |
| Barge-in | re-enabled |
| DID | 470-369-6463 |

What was broken → what fixed it

| Symptom | Root cause | Fix |
|---|---|---|
| "autogoal daemon on" | fast-responder looping a placeholder report aloud | killed the loop + drained queue; no rogue daemon existed |
| "wrong voice" | call synth at rate 188; canonical voice is 210 (same voice, slower = sounds different) | ttsB64 pinned to rate 210 from identity.json, no -v |
| placeholder / JSON read aloud | reply tee spoke "Working in the shell:" lines that carried the call tag | tee junk filter (shell / JSON / code-fence / tool-narration dropped) |
| ~13s latency / "engine being blocked" | turns routed to heavy Opus session which blocks on reasoning | fast Sonnet twin now submits → receiver routes there |
| latency (transport) | buffered up to 8s of audio before transcribing | buffer cap 8s→4s · silence gap 900→600ms · flush at 0.4s |
| "not barging in" | barge-in disabled after an earlier self-cut bug | re-enabled w/ corrected send-time guard · rms>2500 · 3-frame · debounce |
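
The barge-in row can be sketched roughly as below. This is a minimal illustration, not the receiver's actual code: the RMS threshold (2500) and the 3-frame requirement come from the log, while the class name `BargeInDetector`, the 1.0 s debounce value, the 16-bit little-endian PCM framing, and the reading of "send-time guard" as "only evaluate barge-in while TTS audio is being sent" are all assumptions.

```python
import math
import struct
from dataclasses import dataclass

RMS_THRESHOLD = 2500   # from the fix log: rms > 2500
FRAMES_REQUIRED = 3    # from the fix log: 3-frame
DEBOUNCE_S = 1.0       # assumed value; the log only says "debounce"


def frame_rms(pcm: bytes) -> float:
    """RMS of one frame of signed 16-bit little-endian mono PCM (assumed format)."""
    n = len(pcm) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", pcm[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)


@dataclass
class BargeInDetector:
    """Hypothetical detector: fires after N consecutive loud frames, then debounces."""
    loud_frames: int = 0
    last_trigger: float = float("-inf")

    def feed(self, pcm: bytes, now: float, sending_tts: bool) -> bool:
        # Send-time guard (assumed meaning): ignore caller audio unless
        # TTS is actually being sent, so idle noise cannot cut anything off.
        if not sending_tts:
            self.loud_frames = 0
            return False
        # Count consecutive frames above the threshold; any quiet frame resets.
        if frame_rms(pcm) > RMS_THRESHOLD:
            self.loud_frames += 1
        else:
            self.loud_frames = 0
        # Fire only with enough loud frames and outside the debounce window.
        if self.loud_frames >= FRAMES_REQUIRED and now - self.last_trigger >= DEBOUNCE_S:
            self.loud_frames = 0
            self.last_trigger = now
            return True
        return False
```

Requiring several consecutive frames keeps a single click or pop from interrupting playback, and the debounce keeps one burst of speech from firing repeatedly.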

The big win — fast brain unblocked

The fast Sonnet session (/tmp/xen-fast.sock) held its socket but silently never submitted any inject, so every call turn fell back to the slow Opus brain. Root cause: the wrapper wrote the text and the Enter key in two separate writes, and Claude TUI 2.1.168 paste-detection ate the Enter ([Pasted text #N]). Fixed with an atomic double-CR, and the twin is now a permanent launchd service (login-shell launch for CLAUDE_CONFIG_DIR, KeepAlive, survives reboot). Verified: clean submit, omni-aware, stays up.
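
A minimal sketch of the atomic-submit idea, assuming the wrapper accepts raw bytes on the Unix socket and forwards them to the TUI. The socket path is from the log; the helper names `build_inject`, `submit`, and `submit_to_fast_brain` are hypothetical, and the real wrapper's framing may differ. The point is only that the text and both carriage returns leave as one payload instead of two writes.

```python
import socket

SOCK_PATH = "/tmp/xen-fast.sock"  # path from the log


def build_inject(text: str) -> bytes:
    """Return the text followed by two carriage returns as a single payload."""
    # Strip any trailing newline so the only submit keys are the two CRs.
    return text.rstrip("\r\n").encode("utf-8") + b"\r\r"


def submit(sock: socket.socket, text: str) -> None:
    """Send the text and the double CR in one call, never as separate writes."""
    sock.sendall(build_inject(text))


def submit_to_fast_brain(text: str, path: str = SOCK_PATH) -> None:
    """Hypothetical entry point: connect to the fast session's socket and submit."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(path)
        submit(s, text)
```

For short call turns a single `sendall` normally reaches the peer as one write; very large payloads could still be split by the kernel, so this is a sketch of the approach rather than a hard guarantee.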

Next tier — toward sub-2s

  1. VAD-gated endpointing (Silero + WebRTC pre-filter) — flush on 300-500ms silence, not a fixed window.
  2. Streaming STT (LocalAgreement-2 on faster-whisper) — partial hypotheses before the caller stops.
  3. Token-stream the LLM into sentence-chunked TTS — audio starts at first sentence, not full completion.
  4. In-process PCMU encoding — drop the per-utterance ffmpeg subprocess.
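
Item 3 above could look something like the following. It is a sketch under stated assumptions, not the planned implementation: `sentence_chunks` is a hypothetical helper, the token stream is modelled as any iterable of strings, and a sentence boundary is taken to be `.`, `!`, or `?` followed by whitespace.

```python
import re
from typing import Iterable, Iterator

# A terminator followed by whitespace; "0.4s" does not split because
# no whitespace follows the dot.
_BOUNDARY = re.compile(r"[.!?]\s+")


def sentence_chunks(tokens: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as the token stream finishes them."""
    buf = ""
    for tok in tokens:
        buf += tok
        while True:
            m = _BOUNDARY.search(buf)
            if m is None:
                break
            # Keep the terminator with the sentence, drop the whitespace after it.
            sentence = buf[: m.start() + 1].strip()
            buf = buf[m.end():]
            if sentence:
                yield sentence
    # Flush whatever is left when the stream ends.
    tail = buf.strip()
    if tail:
        yield tail
```

Each yielded sentence can be handed to TTS immediately, so playback of the first sentence can start while the model is still generating the rest.
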
Xen / Exodus · one brain, two arms (M4 + nitro) · verified end-to-end 2026-06-17. Internal status surface — noindex.