MLE (Live Agent & Post-Processing)

Machine Learning Engineer - Live Agent & Speech Post-Processing
$200,000 - $300,000
San Francisco, hybrid (3x per week)
Full time / Permanent

This company builds AI tools and devices that help professionals capture and use what's said in real conversations across meetings, calls, and voice notes. It's profitable, bootstrapped, and scaling fast: $250M revenue run rate in under three years, used by over 1.5 million people globally.

The product works. Now they need someone to make the live speech experience feel polished and seamless, fixing the small things that frustrate users at scale.

What you'll do

Build and maintain test suites and automated evaluation platforms for multilingual, multi-model live systems, covering hallucinations, casing, punctuation, number formatting, and segmentation
Set up benchmarks for live agent systems: VAD false triggers, interruption latency, and turn-taking transitions
Fix the friction points that hurt user experience: poor segmentation, inconsistent casing, hallucinated words
Optimize VAD, barge-in models, and turn-taking logic to reduce end-to-end latency and false interruption rates

What "great" looks like

1–3 years of hands-on experience in speech algorithm training, focused on pre- or post-processing or on full-duplex voice system optimization
You've worked on ASR pre-processing or post-processing in a real product
You understand how live voice systems break and know how to fix them
You have published research at Interspeech or ICASSP, or possess speech-related patents

Why join

Profitable company at ~$250M run rate - you'll see the impact of your work immediately in a product used daily by professionals worldwide
Direct ownership of the live speech quality stack, not a supporting role in a large org
Hybrid San Francisco team with real access to large, diverse, multilingual audio datasets
Short feedback loops - improvements ship fast and metrics are visible
Clear path toward senior technical leadership as the audio team grows
