Multimodal LLM Researcher
$300,000 - $400,000
Remote, Palo Alto
Full-time / Permanent
DeepRec has partnered with a high-growth generative AI company (Series B, $130M raised). They're building multimodal, multi-agent systems that combine language, vision, audio, and video. If you've been looking for a role where your research reaches production and shapes how millions interact with creative AI, this is worth a closer look.
You'll help define the next generation of multimodal AI systems. Your work will span research, experimentation, and deployment, with a focus on real-time performance, multimodal reasoning, and agent-based workflows. You'll have the freedom to explore ambitious ideas while working alongside engineers who can bring them into production.
What You'll Do
- Lead research across LLMs, VLMs, and Audio Language Models
- Design novel multimodal model architectures and training approaches
- Improve real-time inference across text, image, audio, and video
- Train and fine-tune autoregressive and diffusion models
- Build and curate high-quality multimodal datasets
- Collaborate with engineering teams to deploy research outcomes
- Publish findings at leading AI conferences and journals
What You'll Bring
Essential
- Strong research track record in multimodal AI or foundation models
- First-author publications at recognised ML, vision, or audio conferences
- Deep expertise in LLMs, VLMs, Audio LMs, or related fields
- Strong Python and deep learning experience using modern frameworks
Desirable
- Experience with diffusion models or world models
- Background in real-time AI systems and model serving
- Experience building large-scale multimodal datasets
We encourage you to apply even if you don't meet every requirement. The right mindset matters as much as the right CV.
What's In It For You
- USD 300,000–400,000 salary
- Fully remote working arrangement
- Ownership of research that shapes production systems
- Opportunity to publish and contribute to the field
- Direct collaboration with product and engineering leadership
This role offers the chance to work on multimodal AI problems that sit at the intersection of research and real-world deployment. If you're excited by advancing the field while seeing your work reach users, we'd love to hear from you.
Location
Palo Alto, California, United States
Job Type
Permanent
Salary
$300,000 - $400,000 per annum
