Jacob Graham


Jacob specialises in relationship-driven recruitment focused on MLOps, Infrastructure, and Engineering across the DACH region for DeepRec.

He began his recruitment career in CRM, successfully placing across the full spectrum of Salesforce specialists into partners, ISVs, and end users throughout multiple geographies. Jacob then transitioned into data recruitment, delivering niche SME hires and high-volume contract placements, primarily within investment banking.

Today, he combines market intelligence with targeted headhunting, systematically mapping the market to understand who’s building, hiring, and advancing ML infrastructure. 

Every introduction is intentional, informed, and backed by data.


JOBS FROM JACOB

Spain
MLOps Engineer
Location: Barcelona (Hybrid)
Contract: Fixed-term until June 2026
Salary: €55,000 base pro rata
Bonuses: €3,000 sign-on; €500/month retention bonus
Relocation: €2,000 package available
Eligibility: EU work authorisation required

The opportunity
We’re hiring an MLOps Engineer to join a fast-scaling European deep-tech company working at the forefront of AI model efficiency and deployment. This team is solving a very real problem: how to take large, cutting-edge language models and run them reliably, efficiently, and cost-effectively in production. Their technology is already live with major enterprise customers and is reshaping how AI systems are deployed at scale.

This is a hands-on engineering role with real ownership. You’ll sit close to both research and production, helping turn advanced ML into systems that actually work in the real world.

What you’ll be working on
- Building and operating end-to-end ML and LLM pipelines, from data ingestion and training through to deployment and monitoring
- Deploying production-grade AI systems for large enterprise customers
- Designing robust automation using CI/CD, GitOps, Docker, and Kubernetes
- Monitoring model performance, drift, latency, and cost, and improving reliability over time
- Working with distributed training and serving setups, including model and data parallelism
- Collaborating closely with ML researchers, product teams, and DevOps engineers to optimise performance and infrastructure usage
- Managing and scaling cloud infrastructure (primarily Azure, with some AWS exposure)

Tech you’ll be exposed to
- Python for ML and backend systems
- Cloud platforms: Azure (AKS, ML services, CycleCloud, Managed Lustre), plus AWS
- Containerisation and orchestration: Docker, Kubernetes
- Automation and DevOps: CI/CD pipelines, GitOps
- Distributed ML tooling: Ray, DeepSpeed, FSDP, Megatron-LM
- Large language models such as GPT-style models, Llama, Mistral, and similar

What they’re looking for
- 3 years’ experience in MLOps, ML engineering, or LLM-focused roles
- Strong experience running ML workloads in public cloud environments
- Hands-on background with production ML pipelines and monitoring
- Solid understanding of distributed training, parallelism, and optimisation
- Comfortable working across infrastructure, ML, and engineering teams
- Strong English communication skills; Spanish is a plus but not required

Nice to have
- Experience with mixture-of-experts models
- LLM observability, inference optimisation, or API management
- Exposure to hybrid or multi-cloud environments
- Real-time or streaming ML systems

Why this role stands out
- Work on AI systems that are already in production with global customers
- Tackle real infrastructure and scaling challenges, not toy problems
- Competitive salary plus meaningful bonuses
- Hybrid setup in Spain with relocation support
- Join a well-funded, high-growth deep-tech environment with long-term impact
Jacob Graham
Berlin, Germany
Training Infrastructure Engineer
Salary: €80,000 to €150,000 plus equity
Location: Fully remote within Europe (CET ±2 hours)
Stage: Recently funded Series A AI startup

We are partnering with a fast-growing generative AI company building the next generation of creative tooling. Their platform generates hyper-realistic sound, speech, and music directly from video, effectively bringing silent content to life. The technology is already being used across gaming, video platforms, and creator ecosystems, with a clear ambition to become foundational infrastructure for audio-visual storytelling.

Backed by top-tier venture capital and fresh Series A funding, the company is now scaling its core engineering group. This is a chance to join at a point where the technical challenges are deep, the scope is wide, and individual impact is unmistakable.

The Role
As a Training Infrastructure Engineer, you will own and evolve the full model training stack. This is a hands-on, systems-level role focused on making large-scale training fast, reliable, and efficient. You will work close to the hardware and close to the models, shaping how cutting-edge generative systems are trained and iterated.

What You Will Do
- Design and evaluate optimal training strategies, including parallelism approaches and precision trade-offs across different model sizes and workloads
- Profile, debug, and optimise GPU workloads at single and multi-GPU level, using low-level tooling to understand real hardware behaviour
- Improve the entire training pipeline end to end, from data storage and loading through distributed training, checkpointing, and logging
- Build scalable systems for experiment tracking, model and data versioning, and training insights
- Design, deploy, and maintain large-scale training clusters orchestrated with SLURM

What We Are Looking For
- Proven experience optimising training and inference workloads through hands-on implementation, not just theory
- Deep understanding of GPU memory hierarchy and compute constraints, including the gap between theoretical and practical performance
- Strong intuition for memory-bound vs compute-bound workloads and how to optimise for each
- Expertise in efficient attention mechanisms and how their performance characteristics change at scale

Nice to Have
- Experience writing custom GPU kernels and integrating them into PyTorch
- Background working with diffusion or autoregressive models
- Familiarity with high-performance storage systems such as VAST or large-scale object storage
- Experience managing SLURM clusters in production environments

Why This Role
- Join at a pivotal growth stage with fresh funding and strong momentum
- Genuine ownership and autonomy from day one, with direct influence over technical direction
- Competitive salary and equity so you share in the upside you help create
- Work on technology that is redefining how creators produce and experience content

If you want to operate at the intersection of deep systems engineering and frontier generative AI, this is one of the strongest opportunities in the European market right now.
Anthony Kelly

INSIGHTS FROM JACOB