AI Data Lead  


$180,000 - $200,000  | New York (Remote) | Full-Time  

Deeprec.ai is proud to announce our partnership with a leading AI Compliance company. They use a modern tech stack to help companies stay on track with laws and regulations worldwide. This remote AI Data Lead role lets you join a forward-thinking engineering culture that embraces AI tools as practical productivity accelerators and innovation enablers.

 

What You’ll Do 

  • Own the end-to-end Client & AI Data Vault, from ingestion to AI retrieval. 

  • Build and scale vector databases and RAG infrastructure in production. 

  • Prototype chunking and embedding strategies using real client data and AI coding tools. 

  • Develop parsers for complex documents including PDFs, DOCX, spreadsheets, and scans. 

  • Design data models connecting client content to regulatory concepts and gap analysis. 

  • Maintain high standards for data quality, performance, testing, and engineering practices. 

Requirements 

  • Production experience with vector databases (e.g. Qdrant, Pinecone, Weaviate, pgvector), including tuning for performance and recall  

  • Experience building chunking and embedding pipelines for complex documents  

  • Strong SQL and data modelling skills in production systems  

  • Experience extracting data from PDFs, DOCX, and scanned documents (incl. OCR/layout-aware parsing)  

  • Strong Python plus at least one systems-level language  

  • Experience with Azure (preferred) or AWS/GCP, CI/CD, and containers  

  • Hands-on experience with RAG or hybrid retrieval systems  

  • Effective use of AI coding assistants in development workflows  

  • Proven track record of shipping production AI or data systems 

What Will Make You Great

  • Experience with multi-tenant data architectures and isolation patterns  

  • Experience with Elasticsearch, OpenSearch, or similar search engines  

  • Background in NLP, information extraction, or document understanding  

  • Experience with Kafka or similar messaging systems  

  • Experience in regulated industries with strict audit and versioning requirements  

  • Contributions to open-source retrieval, embedding, or parsing tools 


What You’ll Get

  • Join a small, high-impact AI team where the data layer is a core product enabler, not backend plumbing 

  • Direct access to leadership with fast feedback loops and real influence on architecture 

  • AI-first culture that treats AI tools as productivity multipliers 

  • Competitive compensation, benefits, and flexible working 

  • Opportunity to build the core data foundation of a fast-scaling compliance intelligence platform