Senior MLOps / ML Infrastructure Engineer
About the Company

Our client is a Series B, venture-backed deep-tech company building a Physics AI platform that helps engineering teams bring products to market faster, reduce development risk, and explore better designs with greater confidence. The platform combines large-scale simulation data with modern machine learning to generate high-fidelity predictions of physical behavior in near real time.
Customers include leading organizations across aerospace, automotive, and advanced manufacturing, working on some of the most demanding real-world engineering problems.
The Role

This role focuses on building and operating the infrastructure that powers physics-based AI systems at scale. The position enables ML engineers and scientists to train, track, deploy, and monitor models reliably without managing low-level infrastructure. The work sits at the intersection of ML systems, cloud infrastructure, and large-scale simulation data, with a strong emphasis on performance, reliability, and developer productivity. It is a hands-on engineering role in a fast-moving, in-office environment, working closely with ML researchers, platform engineers, and product teams.
What You’ll Do

Design, build, and maintain robust MLOps infrastructure supporting the full ML lifecycle, from experimentation and training through to production deployment and monitoring
Implement automated training pipelines, experiment tracking, and model lifecycle management using tools such as Kubeflow, MLflow, and Argo Workflows
Develop scalable data pipelines capable of handling large volumes of unstructured data, particularly 3D geometric data and physics simulation outputs
Deploy machine learning models into production inference systems with strong standards for performance, reliability, and observability
Manage model registries and integrate them with CI/CD workflows to support consistent and reliable model releases
Implement monitoring systems that continuously track model health and performance in production
Collaborate closely with ML researchers, platform engineers, and product teams to evolve the infrastructure platform for physics-based AI applications
Write production-grade code and optimize cloud infrastructure built on Docker and Kubernetes, primarily on Google Cloud Platform, while making thoughtful trade-offs among scalability, cost, and operational simplicity

What We’re Looking For

Bachelor’s degree or higher in Computer Science, Data Science, Applied Mathematics, or a closely related field
5 years of industry experience building MLOps platforms or ML systems in production environments
Strong proficiency in Python, with working knowledge of Bash and SQL
Hands-on experience with cloud infrastructure such as GCP, AWS, or Azure
Experience with containerization and orchestration tools including Docker and Kubernetes
Familiarity with modern MLOps frameworks such as Kubeflow, MLflow, and Argo Workflows
Experience building and maintaining scalable data pipelines, ideally working with unstructured or high-dimensional data
Ability to independently deploy models and implement monitored inference systems in production
Comfort with troubleshooting complex distributed systems and building reliable infrastructure that other teams depend on

Nice to Have

Interest in physics simulation, scientific computing, or HPC environments
Experience building production MLOps platforms in deep-tech or simulation-heavy environments
Familiarity with additional programming languages such as Go or C

Working Style and Culture

This role suits someone who enjoys startup environments, learns quickly, and communicates clearly across disciplines. The team works on-site five days a week and values close collaboration, fast feedback loops, and hands-on problem solving. The team believes that great infrastructure should be largely invisible, enabling engineers and scientists to move faster without friction.
