Location: Heidelberg / Remote
Salary: Negotiable
About the Company:
This stealth startup is building a next-generation AI infrastructure platform designed to maximize GPU utilization, optimize LLM performance, and reduce operational costs for large-scale AI workloads. The platform simulates, manages, and continuously adapts AI infrastructure, ensuring that every request, from model input to GPU execution, is handled efficiently. By combining deep knowledge of LLMs with intelligent infrastructure orchestration, the company enables faster, more efficient AI model execution at scale.
Mission:
The LLM Trace Generation Engineer will focus on optimizing LLM performance by analyzing the full request-to-GPU cycle, helping the platform run models as efficiently as possible.
Responsibilities:
- Analyze end-to-end LLM request and GPU processing flows to identify bottlenecks.
- Work closely with internal GPU experts to implement optimizations.
- Develop tools and insights to improve LLM performance across the platform.
- Contribute to the evolution of the AI infrastructure platform, ensuring it scales efficiently with workloads.
Requirements:
- Deep expertise in LLMs.
What We Offer:
- Opportunity to work at a stealth AI startup tackling cutting-edge infrastructure challenges.
- Collaborative environment with engineers specializing in both ML and GPU systems.
- Direct impact on the performance and efficiency of large-scale AI workloads.