Machinify

Data Engineer | Infra

Reposted 20 Days Ago
In-Office or Remote
2 Locations
220K-250K
Senior level
The Staff/Lead Data Engineer will build and manage data pipelines, ensuring data availability and quality while using technologies such as Apache Airflow and Spark.

Machinify is the leading provider of AI-powered software products that transform healthcare claims and payment operations. Each year, the healthcare industry generates over $200B in claims mispayments, creating incredible waste, friction and frustration for all participants: patients, providers, and especially payers. Machinify’s revolutionary AI platform has enabled the company to develop and deploy, at light speed, industry-specific products that increase the speed and accuracy of claims processing by orders of magnitude.

Why This Role Matters

Our data pipelines power payment decisions, product insights, ML models, and customer operations, so our data engineering infrastructure must evolve to support scaling and efficiency.

As a Staff/Lead Data Engineer, Infra, you will play a pivotal role in enabling every data engineer to be faster, more reliable, and more productive. You will build the core frameworks, tools, and observability systems that:

  • Abstract common pipeline patterns into reusable components

  • Implement robust monitoring and testing across the platform

  • Drive testability and data quality at scale

  • Unify tooling and standards across our post-merger environment

  • Explore the use of GenAI to further accelerate data engineering productivity

You’ll collaborate deeply with Data Engineers, Data Scientists, Platform Engineers, ML, Product, and SMEs to shape the foundation of our next-generation data platform.

What You’ll Do

🛠 Build Core Data Engineering Infrastructure
  • Develop and maintain internal DE SDKs / libraries to abstract and standardize Spark + Airflow patterns.

  • Design and implement pipeline testing frameworks to enable CI/CD-based data validation.

  • Create pipeline observability & monitoring systems (Grafana, ELK, DataDog) to ensure reliability and visibility.

  • Drive adoption of data validation frameworks to automate and scale data quality enforcement.

🚀 Drive Platform & Unification Initiatives
  • Lead initiatives to unify our data platform post-merger by defining scalable standards and patterns.

  • Partner with Data Engineers and SMEs to improve canonical modeling, schema evolution, and data versioning.

  • Help architect centralized metadata management to replace fragmented, ad-hoc systems.

🤖 Innovate with GenAI & Modern Practices
  • Leverage GenAI and modern tooling to improve pipeline development, monitoring, and debugging.

  • Prototype new ways to improve developer productivity, data quality and pipeline reliability using emerging technologies.

🤝 Collaborate Across Teams
  • Work with core Data Engineers to identify and address productivity bottlenecks and scaling challenges.

  • Support customer data onboarding frameworks indirectly through improved tooling and processes.

  • Partner closely with Platform/Server Engineering to build or request new features.

What You Bring

Required Technical Skills & Experience
  • 7+ years of experience as a Data Engineer / Software Engineer / Platform Engineer, with strong expertise building internal tooling and frameworks.

  • Proficient in Python and SQL.

  • Deep expertise with Apache Spark (core processing engine today).

  • Advanced experience with Apache Airflow.

  • Experience building monitoring & observability systems (Grafana, ELK Stack, or DataDog).

  • Experience designing and implementing data validation & testing frameworks at scale.

  • Strong understanding of AWS (primary cloud environment).

  • Proficient in Kubernetes and modern orchestration patterns.

  • Deep understanding of schema evolution, data modeling, and versioning for large-scale data platforms.

  • Proven ability to operate as a Staff/Lead: driving technical strategy, mentoring others, and collaborating cross-functionally.

Bonus Experience (Nice to Have)
  • Experience working with Kafka, Spark Streaming, or other modern streaming platforms.

  • Scala experience (Spark internals, performance tuning, advanced transformations).

  • Familiarity with metadata management tools such as DataHub, Amundsen, or OpenMetadata.

  • Experience using GenAI to improve data engineering workflows and developer experience.

  • Experience building unified data platforms post-merger.

  • Exposure to GCP or multi-cloud environments.

Why Join Us
  • Real impact — your work will make the entire data engineering organization dramatically more effective.

  • Total ownership — build the core frameworks and standards that define the future of our data platform.

  • Opportunity to innovate — drive adoption of GenAI and modern data engineering practices.

  • Cross-functional leadership — collaborate across Data, ML, Platform, Product, and SMEs at a pivotal stage of company growth.

  • Fast-growing environments — contribute at a moment when building scalable, unified data infrastructure will have outsized impact.

Equal Employment Opportunity at Machinify

Machinify is committed to hiring talented and qualified individuals with diverse backgrounds for all of its positions. Machinify believes that the gathering and celebration of unique backgrounds, qualities, and cultures enriches the workplace. 

Top Skills

Apache Airflow
AWS
Docker
ELK Stack
GCP
Grafana
Kafka
Kubernetes
Python
Spark
SQL
