$200,000 - $250,000
San Francisco, CA
Onsite, 5 days per week in office
Full time / Permanent
Most "agentic AI" roles are still research projects dressed up as products.
This one isn't. The agent is live, in production, handling real workflows for Tesla, BMW, Meta, and Amazon. You'll be in the codebase from day one, improving an agent that's already shipping, not starting from a blank page.
This company builds AI-powered automation for mechanical engineers. Think Claude Code, but for CAD, CAE, and PLM workflows. $32M raised, backed by Eric Schmidt and early investors in Anthropic and OpenAI. Over 1,000 customers running production workflows on the platform.
They're hiring a Staff Engineer to own the agent intelligence layer, the system that takes an engineer's intent and executes it reliably across complex desktop software.
What you'd own:
- Evals infrastructure and agent benchmarking - defining what "good" looks like for a domain with no established benchmark
- Agent harness build-out and ongoing performance improvement (task success rate, token efficiency, workflow coverage)
- Architecture decisions: tool-calling strategy, state management, context handling, model routing, error recovery
- Technical leadership of a small team of AI engineers - player-coach, not a pure manager
The profile they're looking for:
- Proven experience taking agentic systems into production: systems that take real actions, with tool-calling, multi-step state, failure handling, and cost constraints. Ideally 2 years of that experience
- Strong on evals: task completion, failure mode analysis, regression detection
- Production-first mindset - you'd rather ship 70% coverage reliably than demo a clever system you can't measure
- Builders over researchers. Too much academic background is a flag here
Desktop automation experience is nice-to-have. Mechanical engineering background is nice-to-have. Experience taking agentic systems into production is the bar.
