This is an opportunity to join one of the smartest, most ambitious teams in the AI space. Founded in 2023, this fast-growing research and product company is already being talked about alongside some of the biggest names in foundation model development. They’re building powerful, intelligent agent systems and frontier-scale models - and they believe software engineering is the most direct path toward achieving AGI.
With major backing from industry leaders, significant compute infrastructure, and a focus on mission-critical enterprise and public-sector environments, they’re tackling some of the hardest AI challenges out there.
The Role
As a Member of Technical Staff (Pre-Training / Data), you’ll join a high-performing Data team within the Applied Research group that powers the company’s pre-training and reinforcement learning breakthroughs. Your goal: build the datasets that make better models possible. This is a hands-on, deeply technical role at the intersection of data engineering, research, and large-scale systems.
What You’ll Do
- Build, scale, and refine massive datasets of natural language and source code to train next-generation language models
- Work closely with pre-training, RL, and infrastructure teams to validate your work through fast feedback loops
- Stay ahead of the curve on data generation, curation, and pre-training strategies
- Develop systems to ingest, filter, and structure billions of tokens across diverse sources
- Design controlled experiments that help uncover what works and what doesn’t
- Be a core voice in shaping how the team approaches data for model training - a vital part of their long-term AGI mission
What You Bring
- Solid hands-on experience with large language models or large-scale ML systems
- Strong track record building or working with massive datasets - from raw extraction through to filtering and packaging
- Exposure to training models from scratch - ideally using distributed GPU clusters
- Proficiency in Python and ML frameworks like PyTorch or JAX, plus confidence working in Linux, Git, Docker, and cloud/HPC environments
- Great if you also have some background in C++/CUDA, Triton kernels, or GPU debugging
- You’re a thinker and a builder - someone who can read the latest paper and turn it into something real, quickly
What’s In It for You
- Fully remote (US)
- 37 days of paid time off annually
- Comprehensive health cover for you and your dependents
- Monthly team meetups - travel, accommodation, and even family attendance covered
- Home office and wellbeing budget
- A competitive salary plus meaningful equity
- The chance to work with some of the brightest minds in AGI and do genuinely original work
What the Process Looks Like
- Recruiter intro call
- First technical interview focused on LLMs, performance, or core engineering skills
- Second technical deep dive into your domain (pre-training, data, scaling, etc.)
- Culture conversation with the founding engineers
- Final discussion on compensation and alignment
If you’re driven by building systems that could reshape how intelligence works - and you want to be surrounded by people who share that fire - this team is where you belong.