Member of Technical Staff - Pre-Training (Remote US)


This is an opportunity to join one of the smartest, most ambitious teams in the AI space. Founded in 2023, this fast-growing research and product company is already mentioned alongside some of the biggest names in foundation model development. They’re building intelligent agent systems and frontier-scale models, and they believe software engineering is the most direct path toward AGI.
With major backing from industry leaders, significant compute infrastructure, and a focus on mission-critical enterprise and public-sector environments, they’re tackling some of the hardest AI challenges out there.


The Role

As a Member of Technical Staff (Pre-Training / Data), you’ll join a high-performing Data team within Applied Research, which powers the company’s pre-training and reinforcement learning work. Your goal: build the datasets that make better models possible. This is a hands-on, deeply technical role at the intersection of data engineering, research, and large-scale systems.


What You’ll Do

  • Build, scale, and refine huge datasets made up of natural language and source code to train next-gen language models
  • Work closely with pre-training, RL, and infrastructure teams to validate your work through fast feedback loops
  • Stay ahead of the curve on data generation, curation, and pre-training strategies
  • Develop systems to ingest, filter, and structure billions of tokens across diverse sources
  • Design controlled experiments that help uncover what works and what doesn’t
  • Be a core voice in shaping how the team approaches data for model training - a vital part of their long-term AGI mission


What You Bring

  • Solid hands-on experience with large language models or large-scale ML systems
  • Strong track record building or working with massive datasets - from raw extraction through to filtering and packaging
  • Exposure to training models from scratch - ideally using distributed GPU clusters
  • Proficiency in Python and ML frameworks such as PyTorch or JAX, plus confidence working with Linux, Git, Docker, and cloud/HPC environments
  • Bonus: experience with C++/CUDA, Triton kernels, or GPU debugging
  • You’re a thinker and a builder - someone who can read the latest paper and turn it into something real, quickly


What’s In It for You

  • Fully remote within the US
  • 37 days of paid time off annually
  • Comprehensive health cover for you and your dependents
  • Monthly team meetups - travel, accommodation, and even family attendance covered
  • Home office and wellbeing budget
  • A competitive salary plus meaningful equity
  • The chance to work with some of the brightest minds in AGI and do genuinely original work


What the Process Looks Like

  1. Recruiter intro call
  2. First technical interview focused on LLMs, performance, or core engineering skills
  3. Second technical deep dive into your domain (pre-training, data, scaling, etc.)
  4. Culture conversation with the founding engineers
  5. Final discussion on compensation and alignment


If you’re driven by building systems that could reshape how intelligence works - and you want to be surrounded by people who share that fire - this team is where you belong.