Aldea

Research Engineer (Machine Learning)

Posted Yesterday
In-Office or Remote
Hiring Remotely in San Francisco, CA
Mid level
The Research Engineer will build and optimize AI training and inference systems for multi-modal applications, enabling rapid experimentation and real-time deployment of models.

About Aldea

Aldea is a multi-modal foundational AI company reimagining the scaling laws of intelligence. We believe today's architectures create unnecessary bottlenecks for the evolution of software. Our mission is to build the next generation of foundational models that power a more expressive, contextual, and intelligent human–machine interface.


The Role 

We are hiring a Research Engineer (Machine Learning) to build the infrastructure that powers Aldea's multi-modal AI research. You will design, optimize, and scale the training and inference systems that enable our research team to explore next-generation architectures across language, speech, and multi-modal domains. 


This is a high-leverage role where your work directly enables breakthrough research. You'll build production-grade systems supporting rapid experimentation at billion-parameter scale and real-time deployment of speech and language models. If you're passionate about building the systems that accelerate AI research, this role is for you. 


What You'll Do

  • Build and maintain distributed training infrastructure that supports researchers across language and speech domains at billion-plus-parameter scale. 
  • Optimize training and inference performance across the stack, delivering significant speedups through framework optimization, custom kernels, and system-level improvements. 
  • Design experiment infrastructure including automated evaluation pipelines, experiment tracking, and monitoring systems that enable rapid iteration. 
  • Scale infrastructure from single-node to multi-node distributed training and deploy production inference systems for real-time applications. 
  • Support researchers with fast turnaround on infrastructure issues and maintain high reliability across all systems. 
  • Collaborate with research scientists, data engineers, and leadership to define technical priorities and the infrastructure roadmap. 

Minimum Qualifications

  • Bachelor's degree in Computer Science, Engineering, or related field, or equivalent practical experience. 
  • 3+ years of experience with PyTorch and distributed training frameworks (DDP, FSDP, DeepSpeed, or similar). 
  • Experience training large-scale deep learning models at 1B+ parameters. 
  • Deep understanding of training optimization techniques including mixed precision, gradient checkpointing, and memory management. 
  • Proven ability to build production-grade ML infrastructure with high reliability. 
  • Track record of delivering significant performance optimizations in ML training or inference systems.

Preferred Qualifications 

  • Experience with custom kernel development (CUDA, Triton) or GPU optimization. 
  • Hands-on experience with large-scale pretraining (100B+ tokens, ideally trillion+ scale). 
  • Experience optimizing inference for production: quantization, vLLM, TensorRT, or custom serving engines. 
  • Familiarity with speech/audio ML systems and real-time inference constraints. 
  • Experience building automated evaluation frameworks and experiment tracking systems. 
  • Knowledge of profiling tools and multi-node training across 8-32+ GPUs. 
  • Exposure to job orchestration systems (SLURM, Kubernetes, Ray). 
  • Master's or PhD in Computer Science, Machine Learning, or related field.

Compensation & Benefits

  • Competitive base salary
  • Performance-based bonus aligned with research and model milestones
  • Equity participation
  • Comprehensive health, dental, and vision coverage
  • Flexible paid time off


Aldea is proud to be an equal-opportunity employer. We are committed to building a diverse and inclusive culture that celebrates authenticity to win as one. We do not discriminate on the basis of race, religion, color, national origin, gender, gender identity, sexual orientation, age, marital status, disability, protected veteran status, citizenship or immigration status, or any other legally protected characteristics.


Aldea uses E-Verify to confirm employment eligibility in compliance with federal law. For more information, please visit https://www.e-verify.gov.


Please note: We do not accept unsolicited resumes from recruiters or employment agencies and will not be responsible for any fees related to unsolicited resumes.

Top Skills

CUDA
DeepSpeed
Kubernetes
Python
PyTorch
Ray
SLURM
TensorRT
Triton
