Machinify

Data Engineer | Infra

Reposted 20 Days Ago

In-Office or Remote

2 Locations

220K-250K

Senior level

The Staff/Lead Data Engineer will build and manage data pipelines, ensuring data availability and quality while using technologies such as Apache Airflow and Spark.

Machinify is the leading provider of AI-powered software products that transform healthcare claims and payment operations. Each year, the healthcare industry generates over $200B in claims mispayments, creating incredible waste, friction, and frustration for all participants: patients, providers, and especially payers. Machinify’s revolutionary AI platform has enabled the company to develop and deploy, at light speed, industry-specific products that increase the speed and accuracy of claims processing by orders of magnitude.

Why This Role Matters

Our data pipelines power payment decisions, product insights, ML models, and customer operations, so our data engineering infrastructure must evolve to support scaling and efficiency.

As a Staff/Lead Data Engineer, Infra, you will play a pivotal role in enabling every data engineer to be faster, more reliable, and more productive. You will build the core frameworks, tools, and observability systems that:

Abstract common pipeline patterns into reusable components
Implement robust monitoring and testing across the platform
Drive testability and data quality at scale
Unify tooling and standards across our post-merger environment
Explore the use of GenAI to further accelerate data engineering productivity

You’ll collaborate deeply with Data Engineers, Data Scientists, Platform Engineers, ML, Product, and SMEs to shape the foundation of our next-generation data platform.

What You’ll Do

🛠 Build Core Data Engineering Infrastructure

Develop and maintain internal DE SDKs / libraries to abstract and standardize Spark + Airflow patterns.
Design and implement pipeline testing frameworks to enable CI/CD-based data validation.
Create pipeline observability & monitoring systems (Grafana, ELK, DataDog) to ensure reliability and visibility.
Drive adoption of data validation frameworks to automate and scale data quality enforcement.

🚀 Drive Platform & Unification Initiatives

Lead initiatives to unify our data platform post-merger by defining scalable standards and patterns.
Partner with Data Engineers and SMEs to improve canonical modeling, schema evolution, and data versioning.
Help architect centralized metadata management to replace fragmented, ad-hoc systems.

🤖 Innovate with GenAI & Modern Practices

Leverage GenAI and modern tooling to improve pipeline development, monitoring, and debugging.
Prototype new ways to improve developer productivity, data quality and pipeline reliability using emerging technologies.

🤝 Collaborate Across Teams

Work with core Data Engineers to identify and address productivity bottlenecks and scaling challenges.
Support customer data onboarding frameworks indirectly through improved tooling and processes.
Partner closely with Platform/Server Engineering to build or request new features.

What You Bring

Required Technical Skills & Experience

7+ years of experience as a Data Engineer / Software Engineer / Platform Engineer, with strong expertise in building internal tooling and frameworks.
Proficient in Python and SQL.
Deep expertise with Apache Spark (core processing engine today).
Advanced experience with Apache Airflow.
Experience building monitoring & observability systems (Grafana, ELK Stack, or DataDog).
Experience designing and implementing data validation & testing frameworks at scale.
Strong understanding of AWS (primary cloud environment).
Proficient in Kubernetes and modern orchestration patterns.
Deep understanding of schema evolution, data modeling, and versioning for large-scale data platforms.
Proven ability to operate as a Staff/Lead: driving technical strategy, mentoring others, and collaborating cross-functionally.

Bonus Experience (Nice to Have)

Experience working with Kafka, Spark Streaming, or other modern streaming platforms.
Scala experience (Spark internals, performance tuning, advanced transformations).
Familiarity with metadata management tools such as DataHub, Amundsen, or OpenMetadata.
Experience using GenAI to improve data engineering workflows and developer experience.
Experience building unified data platforms post-merger.
Exposure to GCP or multi-cloud environments.

Why Join Us

Real impact: your work will make the entire data engineering organization dramatically more effective.
Total ownership: build the core frameworks and standards that define the future of our data platform.
Opportunity to innovate: drive adoption of GenAI and modern data engineering practices.
Cross-functional leadership: collaborate across Data, ML, Platform, Product, and SMEs at a pivotal stage of company growth.
Fast-growing environment: contribute at a moment when building scalable, unified data infrastructure will have outsized impact.

Equal Employment Opportunity at Machinify

Machinify is committed to hiring talented and qualified individuals with diverse backgrounds for all of its positions. Machinify believes that the gathering and celebration of unique backgrounds, qualities, and cultures enriches the workplace.

Top Skills

Apache Airflow
AWS
Docker
ELK Stack
GCP
Grafana
Kafka
Kubernetes
Python
Spark
SQL
