i4DM

PySpark & Delta Lake Developer

Reposted 3 Days Ago
Remote
Hiring Remotely in USA
Mid level
The PySpark & Delta Lake Developer is responsible for designing scalable ETL pipelines for healthcare data, ensuring ACID compliance, data quality, and optimal performance within AWS.
The summary above was generated by AI

About Our Team

Our employees thrive in a culture that's fast-paced and ego-free, where innovation and collaboration are encouraged at every turn. We are an organization that provides federal agencies instant access to experienced and talented professionals who understand their unique challenges and know the most efficient ways to address them. We continually invest in resources and talent, so we stay prepared with specialized teams in place who are experts in creating tailored technologies. Our solutions empower federal organizations to grow, modernize, and succeed in a rapidly evolving landscape.

We welcome diverse perspectives and seek individuals who are passionate about technology and creative problem-solving. If you enjoy learning, growing, and tackling real-world challenges, you will thrive here. Veterans and military spouses are strongly encouraged to apply and bring their unique experience to our team.

About the Role:

Our core values of People Matter, Integrity, and a Commitment to Excellence drive all that we do. By joining us, you will become a part of a fun and diverse team of talented and creative consultants who share the goal of using the latest technology to solve business challenges. We provide our clients with a dynamic mix of services and deliver focused solutions like no one else.

We are seeking talented and bright team players who are passionate about technology and want to work in a fast-paced, dynamic, and ego-free culture while applying a creative approach to problem-solving. Team members who like to grow their skill sets while solving challenging, real-world business problems thrive here.

We are looking for an experienced PySpark & Delta Lake Developer who will be responsible for designing, building, and maintaining scalable ETL pipelines to process and analyze large-scale healthcare claims data. This role emphasizes building robust Delta Lake tables and ensuring ACID-compliant data lakes. The ideal candidate will focus on developing efficient PySpark scripts and leveraging Delta Lake capabilities to deliver data reliability, high performance, and seamless schema evolution within an AWS environment.

Key Responsibilities:

  • Design, develop, and maintain robust ETL pipelines using PySpark and Delta Lake for large and complex healthcare data workloads.
  • Implement and optimize data lake solutions using Delta Lake table formats, supporting ACID transactions, schema enforcement, and time travel.
  • Write efficient, reusable, and well-documented PySpark scripts for data ingestion, transformation, cleansing, and aggregation.
  • Collaborate with data engineers, architects, and data scientists to understand business and data requirements and translate them into scalable data solutions.
  • Ensure data quality, consistency, lineage, and integrity across all stages of data processing.
  • Troubleshoot, debug, and optimize PySpark applications and Delta Lake workflows for cost, speed, and reliability within AWS.
  • Maintain detailed and up-to-date technical documentation of code, data pipelines, and standard operating procedures.
  • Stay current with the latest Delta Lake and Spark advancements and advocate for best practices in data management and analytics.

Required Qualifications:

  • Strong proficiency in Python and PySpark, with hands-on experience developing data pipelines.
  • Advanced experience with Delta Lake and its ACID transaction and schema management features.
  • Solid SQL skills for querying, joining, and optimizing data in distributed environments.
  • Hands-on experience with AWS cloud data services (e.g., S3, Glue, EMR, Athena).
  • Familiarity with data lake concepts, partitioning, and performance tuning.
  • Excellent communication skills and a desire to continuously learn and adapt to innovative technologies.
  • Familiarity with CI/CD, version control (e.g., Git), and infrastructure as code.

Preferred Qualifications:

  • Experience with healthcare or claims data.
  • Knowledge of data governance, security, data cataloging (AWS Glue Catalog), and compliance best practices.
  • Strong ability to prioritize and execute tasks independently and within collaborative team environments.
  • Previous experience working in a government or public sector setting.

Top Skills

Athena
AWS
Delta Lake
EMR
Glue
PySpark
Python
S3
SQL
