Research Engineer – Training Optimisation and Infrastructure
Location: Berlin - Remote within Europe (within ±2 hours of CET)
Level: Mid to Staff
Package: Competitive salary plus equity

The Opportunity
A Series A generative AI company is hiring a Research Engineer to drive optimisation across training strategy and ML infrastructure. The business builds state-of-the-art audio and music generation models and is backed by a leading generative AI fund. The team includes researchers and engineers from Google Brain, Meta FAIR, Amazon, ETH Zürich, and Max Planck.

Role Summary
You will focus on optimising end-to-end training pipelines for large generative models. This includes GPU-level performance tuning, distributed systems work, and driving efficiency across data, storage, orchestration, and experimentation systems.

Key Responsibilities

Develop and refine training strategies, including parallelism approaches and precision choices, for varied model scales and compute profiles
Profile, debug, and optimise single- and multi-GPU workloads using tools such as Nsight
Improve training pipelines covering data storage, data loading, distributed training, checkpointing, and logging
Build scalable systems for experiment tracking, model and data versioning, and experiment insights
Design, deploy, and maintain large-scale training clusters using SLURM

Ideal Experience

Strong hands-on experience optimising training and inference workloads
Deep understanding of GPU memory hierarchy and hardware performance limits
Experience tuning both memory-bound and compute-bound operations
Knowledge of efficient attention algorithms and their performance implications at different scales

Nice-to-Have

Experience writing custom GPU kernels and integrating them into PyTorch
Familiarity with diffusion or autoregressive models
Understanding of high-performance storage solutions such as VAST
Experience running SLURM clusters at scale

Why Apply

Work on frontier audio and music generation models
Influence training strategy and infrastructure at scale
Join a high-calibre research and engineering team
