Airbnb

Senior Staff Machine Learning Engineer, Data & Eval

Reposted 22 Days Ago
Remote
Hiring Remotely in United States
244K-305K Annually
Expert/Leader
As a Senior Staff Machine Learning Engineer, you will drive AI product development, collaborate with cross-functional teams, and enhance ML models at scale.
The summary above was generated by AI

Airbnb was born in 2007 when two hosts welcomed three guests to their San Francisco home, and has since grown to over 5 million hosts who have welcomed over 2 billion guest arrivals in almost every country across the globe. Every day, hosts offer unique stays and experiences that make it possible for guests to connect with communities in a more authentic way.

The Community You Will Join: 

AI and ML are at the heart of the Airbnb product. From Trust to Payments, and from Customer Service to Marketing, we rely on ML to ensure that guests and hosts have the best possible experience with Airbnb.

The Core ML team is responsible for driving CSxAI (Customer Support x Artificial Intelligence) initiatives by adopting Generative AI technologies to enable an intelligent, scalable, and exceptional service experience. The team develops and enhances AI models, ML services, and tools including LLM fine-tuning and optimization, RAG/Search, LLM evaluation and testing automation, feedback-based learning, and guardrails for a wide range of applications at Airbnb.

The richness of Airbnb's data, the complexity of its marketplace, and the variety innate in our product mean that we need to operate at the state of the art of AI practice. We are committed to long-term innovation to solve complex problems, and to do that we need experienced ML leaders to join us.

The Difference You Will Make:

In this Senior Staff role, you will set technical direction and lead execution for ML evaluation and the end-to-end data flywheel powering CSxAI products (e.g., assistive agents, issue resolution, and tooling). Your work will define how we measure quality, how we turn feedback into learning signals, and how we continuously improve models and products safely and efficiently. You will partner closely with product, engineering, design, and operations to build evaluation systems that are trusted, scalable, and actionable, connecting offline metrics to online outcomes.

A Typical Day: 

  • Work with large-scale structured and unstructured data; explore, experiment, build, and continuously improve Machine Learning models and pipelines for Airbnb product, business, and operational use cases.
  • Work collaboratively with cross-functional partners, including product managers, operations, and data scientists, to identify opportunities for business impact; understand, refine, and prioritize requirements for machine learning; and drive engineering decisions.
  • Develop, productionize, and operate Machine Learning models and pipelines hands-on and at scale, for both batch and real-time use cases.
  • Leverage third-party and in-house Machine Learning tools and infrastructure to develop reusable, highly differentiating, and high-performing Machine Learning systems that enable fast model development, low-latency serving, and easy upkeep of model quality.

Your Expertise:

  • Define evaluation strategy and success metrics for GenAI systems, aligning offline evaluation with online business and customer experience outcomes.
  • Build and scale evaluation frameworks (golden sets, synthetic data, automated regressions, rubric-based grading, LLM-as-judge where appropriate) with strong controls for bias, drift, and reliability.
  • Design the data flywheel: instrumentation, feedback collection, data quality checks, labeling strategy, dataset versioning, and governance to support continuous improvement.
  • Lead cross-functional quality initiatives across product, ops, and engineering, driving clarity on what “good” looks like and how teams act on evaluation results.
  • Develop and productionize pipelines for dataset creation, model monitoring, evaluation-at-scale, and continuous testing (pre-deploy and post-deploy).
  • Drive technical decisions and architecture for evaluation and data infrastructure, balancing speed, rigor, cost, and safety.

Minimum Qualifications:

  • Educational Background: PhD in Computer Science, Mathematics, Statistics, or related technical field (or equivalent practical experience).
  • Industry Experience: 10+ years building, testing, and shipping ML/AI systems end-to-end, including 2+ years of experience with GenAI/LLM systems in production.
  • Leadership Experience: 5+ years leading large, ambiguous technical initiatives as a senior IC, influencing roadmap and engineering/science direction across teams.
  • Technical Proficiency:
    • Deep expertise in evaluation methodology (offline/online alignment, metric design, human-in-the-loop evaluation, A/B testing, power analysis, regression testing).
    • Hands-on experience with GenAI systems, including orchestration, retrieval, tool calling, memory, etc.
    • Experience building data pipelines and quality systems (labeling workflows, dataset curation, versioning, monitoring, and governance).
    • Solid ML fundamentals and best practices (model selection, training/serving, monitoring, reliability, and model lifecycle management).

Preferred Qualifications:

  • Customer Support Systems: Experience applying ML/AI to customer support workflows (e.g., agent assist, classification/routing, resolution recommendation, QA).
  • Infrastructure & Quality at Scale: Experience building robust evaluation platforms for agent behavior validation, safety/guardrails, and continuous improvement.
  • Agile Practice for Applied AI: Proven ability to take evaluation and data flywheel work from incubation to production, iterating quickly while maintaining scientific rigor.
  • Continuous Learner: Strong curiosity and ability to absorb new techniques (e.g., judge models, preference optimization, synthetic data generation) and apply them pragmatically.

Your Location:

This position is US - Remote Eligible. The role may include occasional work at an Airbnb office or attendance at offsites, as agreed to with your manager. While the position is Remote Eligible, you must live in a state where Airbnb, Inc. has a registered entity. Click here for the up-to-date list of excluded states. This list is continuously evolving, so please check back with us if the state you live in is on the exclusion list. If your position is employed by another Airbnb entity, your recruiter will inform you what states you are eligible to work from.

How We'll Take Care of You:

Our job titles may span more than one career level. The actual base pay is dependent upon many factors, such as: training, transferable skills, work experience, business needs and market demands. The base pay range is subject to change and may be modified in the future. This role may also be eligible for bonus, equity, benefits, and Employee Travel Credits.  

Pay Range
$244,000-$305,000 USD

Top Skills

Agile Methodologies
Artificial Intelligence
Deep Learning
Machine Learning
NLP
Software Engineering
