Xen Voice Line

Telnyx → FreeSWITCH → M4 receiver → faster-whisper → warm session → TTS → your ear. Status & fix log — 2026-06-17.

LINE GREEN — two-way live on 470-369-6463

Call Receiver	:4451 healthy
STT (faster-whisper)	:8791 healthy
Fast Brain (Sonnet)	launchd · auto-restart
Voice	canonical · rate 210
Barge-in	re-enabled
DID	470-369-6463

What was broken → what fixed it

Symptom	Root cause	Fix
"autogoal daemon on"	fast-responder looping a placeholder report aloud	killed the loop + drained queue — no rogue daemon existed
"wrong voice"	call synth at rate `188`; canonical voice is `210` (same voice, slower = sounds different)	`ttsB64` pinned to rate 210 from `identity.json`, no `-v`
placeholder / JSON read aloud	reply tee spoke `Working in the shell:` lines that carried the call tag	tee junk filter (shell / JSON / code-fence / tool-narration dropped)
~13s latency / "engine being blocked"	turns routed to heavy Opus session which blocks on reasoning	fast Sonnet twin now submits → receiver routes there
latency (transport)	buffered up to 8s of audio before transcribing	buffer cap 8s→4s · silence gap 900→600ms · flush at 0.4s
"not barging in"	barge-in disabled after an earlier self-cut bug	re-enabled w/ corrected send-time guard · rms>2500 · 3-frame · debounce
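
The barge-in row above names three gates: an RMS threshold of 2500, a 3-frame requirement, and a debounce, plus a send-time guard. A minimal sketch of how such a detector could fit together follows. It is not the receiver's actual code. The class and method names, the 16-bit little-endian PCM frame format, the 1.0 s debounce, the 0.3 s guard, and the reading of "send-time guard" as "ignore input for a short window after TTS starts sending" are all assumptions made for illustration.

```python
import math
import struct


def frame_rms(pcm16: bytes) -> float:
    """RMS of one frame of little-endian signed 16-bit PCM (assumed format)."""
    n = len(pcm16) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", pcm16[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)


class BargeInDetector:
    """Hypothetical barge-in gate: loud for N consecutive frames, debounced."""

    def __init__(self, rms_threshold=2500.0, frames_required=3,
                 debounce_s=1.0, guard_s=0.3):
        self.rms_threshold = rms_threshold      # from the fix log
        self.frames_required = frames_required  # from the fix log ("3-frame")
        self.debounce_s = debounce_s            # assumed value
        self.guard_s = guard_s                  # assumed value
        self._loud_run = 0
        self._last_fire = float("-inf")
        self._tts_started = None

    def tts_started(self, now: float) -> None:
        """Call when outbound TTS audio starts being sent."""
        self._tts_started = now
        self._loud_run = 0

    def tts_stopped(self) -> None:
        """Call when outbound TTS audio has finished or was cut."""
        self._tts_started = None
        self._loud_run = 0

    def feed(self, pcm16: bytes, now: float) -> bool:
        """Feed one inbound frame; return True when TTS should be cut."""
        if self._tts_started is None:
            return False  # nothing is playing, so nothing to interrupt
        if now - self._tts_started < self.guard_s:
            self._loud_run = 0  # inside the guard window: ignore input
            return False
        if frame_rms(pcm16) > self.rms_threshold:
            self._loud_run += 1
        else:
            self._loud_run = 0  # one quiet frame resets the run
        loud_enough = self._loud_run >= self.frames_required
        debounced = now - self._last_fire >= self.debounce_s
        if loud_enough and debounced:
            self._last_fire = now
            self._loud_run = 0
            return True
        return False
```

Requiring consecutive loud frames filters out single clicks, the guard window is one way to stop the line from cutting itself off on its own first audio, and the debounce keeps a single interruption from firing repeatedly.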

The big win — fast brain unblocked

The fast Sonnet session (/tmp/xen-fast.sock) held its socket open but silently never submitted any inject, so every call turn fell back to the slow Opus brain. Root cause: the wrapper wrote the text and the Enter key in two separate writes, and Claude TUI 2.1.168 paste detection ate the Enter ([Pasted text #N]). Fixed with an atomic double-CR, and the twin is now a permanent launchd service (login-shell launch for CLAUDE_CONFIG_DIR, KeepAlive, survives reboot). Verified: clean submit, omni-aware, stays up.
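
As an illustration of the fix described above, the sketch below builds the text and both carriage returns into one buffer and hands it to the socket in one call, instead of writing the text and the Enter separately. The function names are hypothetical. The assumption that the wrapper socket accepts raw UTF-8 bytes is a guess, and "atomic double-CR" is read here as "text plus `\r\r` in one payload"; the real wire format is not described in this log.

```python
import socket

SUBMIT = b"\r\r"  # the "double-CR" from the fix log


def build_inject(text: str) -> bytes:
    """Return the turn text and its submit keys as one contiguous payload."""
    # Trailing line endings are dropped so the only terminator is SUBMIT.
    return text.rstrip("\r\n").encode("utf-8") + SUBMIT


def send_inject(sock: socket.socket, text: str) -> int:
    """Send text and submit keys together; returns the payload length."""
    payload = build_inject(text)
    # One buffer, one call: the Enter never travels as a separate write.
    sock.sendall(payload)
    return len(payload)


def inject_via_path(path: str, text: str) -> int:
    """Connect to a Unix socket path (e.g. the twin's socket) and inject."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(path)
        return send_inject(s, text)
```

For very large payloads `sendall` may still issue more than one underlying send, so this only matches the described fix for short turn text.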

Next tier — toward sub-2s

VAD-gated endpointing (Silero + WebRTC pre-filter) — flush on 300-500ms silence, not a fixed window.
Streaming STT (LocalAgreement-2 on faster-whisper) — partial hypotheses before the caller stops.
Token-stream the LLM into sentence-chunked TTS — audio starts at first sentence, not full completion.
In-process PCMU encoding — drop the per-utterance ffmpeg subprocess.
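
The token-streaming item in the list above can be sketched as a small generator that buffers LLM tokens and yields each sentence as soon as it is complete, so TTS could start on the first sentence. This is a hedged sketch, not the planned implementation. The function name and the punctuation-based splitting rule are assumptions, and a production splitter would need to handle abbreviations such as "e.g." that this one splits on.

```python
import re
from typing import Iterable, Iterator

# Sentence end: closing punctuation followed by whitespace (assumed rule).
_SENTENCE_END = re.compile(r"([.!?]+)(\s+)")


def sentence_chunks(tokens: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences from a stream of text fragments."""
    buf = ""
    for tok in tokens:
        buf += tok
        while True:
            m = _SENTENCE_END.search(buf)
            if m is None:
                break
            sentence = buf[: m.end(1)].strip()  # keep the punctuation
            buf = buf[m.end():]                 # drop the whitespace after it
            if sentence:
                yield sentence
    tail = buf.strip()
    if tail:
        yield tail  # flush whatever is left when the stream ends


if __name__ == "__main__":
    stream = ["Line is ", "green. ", "Barge-in", " works! And", " latency dropped"]
    for chunk in sentence_chunks(stream):
        print(chunk)  # each chunk would be handed to TTS here
```

Because a sentence is only emitted once whitespace follows its punctuation, the last sentence of a reply is released at end of stream rather than at the final period.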

Xen / Exodus · one brain, two arms (M4 + nitro) · verified end-to-end 2026-06-17. Internal status surface — noindex.