Senior MLOps Engineer
Fully Remote (United States) | up to $200k base + equity
The role
Our client is hiring a Senior MLOps Engineer to build and operate the production platform powering their ML and LLM-driven healthcare workflows. You will design reliable, secure, and compliant systems for model development, evaluation, deployment, monitoring, and continuous improvement, working closely with ML, data, security, and product teams.
This is the right seat for someone who has shipped ML systems in production and is excited about LLM orchestration, RAG, evaluations, guardrails, and observability inside a regulated healthcare environment.
What you will be doing
MLOps and ML platform
  • Design and operate ML platforms supporting end-to-end workflows: data ingestion, feature engineering, training, evaluation, deployment, and monitoring.
  • Build and maintain CI/CD for ML, including testing, packaging, versioning, reproducibility, automated rollbacks, and approvals.
  • Implement MLOps best practices: model registry, experiment tracking, lineage, governance, and reproducible training environments.
  • Develop scalable training infrastructure: distributed training, GPU scheduling, cost controls, and auto-scaling.
  • Build and maintain feature pipelines and feature stores, ensuring consistency between training and inference.
  • Establish model monitoring and observability: performance, drift, fairness signals where relevant, latency, throughput, and data quality.
  • Own end-to-end LLM delivery pipelines: prompt versioning, retrieval, orchestration, evaluation, deployment, monitoring, and iterative improvement.
  • Build LLM evaluation harnesses, both offline and online: golden datasets, automated regression testing, human-in-the-loop review, and risk scoring.
  • Implement cost controls: token and cost budgeting, caching, autoscaling, and performance tuning.
Deployment, reliability, and operations
  • Productionize ML models on GCP using containers and orchestration (GKE, Cloud Run).
  • Build CI/CD for ML and LLM systems with automated tests and safe rollouts.
  • Implement observability: tracing, metrics, logs, dashboards, and alerting for model and system health, including hallucination indicators and retrieval quality.
Data, governance, and healthcare compliance
  • Design systems with security and privacy by default: IAM, least privilege, secrets management, audit logs, encryption, retention, and PHI/PII handling.
  • Implement governance: model and prompt lineage, dataset provenance, evaluation traceability, and approval workflows aligned with healthcare compliance expectations.
  • Integrate guardrails: content filters, policy checks, prompt injection defenses, structured output validation, and fallback strategies.
What we are looking for
Essential
  • 6 years in software or platform engineering, including 4 years operating ML systems in production.
  • Strong ML engineering background: training pipelines, evaluation, deployment patterns, monitoring, and iteration loops.
  • Demonstrated hands-on experience with LLM systems in production.
  • Strong Python plus production-grade experience building APIs and services.
  • Strong experience with GCP services and cloud-native patterns.
  • Production experience with Vertex AI (pipelines, endpoints, feature store, model registry, evaluation) and/or managed vector search on GCP.
  • Containerization and orchestration with Docker, Kubernetes/GKE, and/or Cloud Run.
Work authorization
Open to US Citizens, Green Card holders, and candidates already in the US on a valid H-1B (transfers considered).
About the company
Our client is an AI-first healthtech company on a mission to detect cancer earlier and prevent it where possible. Their platform has already assessed over 700,000 patients and identified more than 75,000 cancers, and they are now expanding their US footprint with a greenfield product build off the back of a fresh Series A round, backed by one of the most respected VCs in the world.
Most of the cancer industry focuses on treatment. This team is focused on detection and prevention, where the impact on survival rates is greatest. The founders are practising doctors who have lived in the problem space first-hand, and the company is tech-first, with the majority of headcount sitting in engineering, data, and ML.
Why join
  • Real-world impact: AI that directly contributes to earlier cancer detection and improved patient outcomes.
  • Greenfield US build at a critical inflection point, with high ownership from day one.
  • Series A backing from a top-tier global VC.
  • Builder culture: production-grade work, not research or prototypes.
  • Direct exposure to the CTO and senior AI leadership in a flat, fast-moving environment.
  • Continuous learning, with access to the latest tools and methods in AI and healthcare.
Benefits
  • Competitive base salary plus meaningful equity.
  • Fully remote across the United States.
  • Flexible working arrangements.