Data Engineer
Why Work for Us?
1. Contribute to an inspiring workplace! Tackle meaningful work that moves the needle on solving some of the most challenging problems in health care today.
2. Top venture partners are accelerating our growth. We completed a Series A round of funding led by one of the top venture capital firms in the nation, with participation from existing investors.
3. Be part of a cohesive, entrepreneurial team. Take part in company culture collaborations and social gatherings, and enjoy the amenities at our Denver office.
4. Competitive salary, bonus and equity structure. Our compensation strategy is designed to attract and retain top talent by positioning us to be very competitive amongst companies of our stage and in our region.
5. When you feel well, you do well! We offer rich benefits including vision, dental, and medical benefit options, with certain plans paid at 100% for employees.
Description
We are seeking a strong Data Engineer who thrives in a fast-paced, agile development environment. This position reports to the CTO as part of the Data Engineering Team and partners heavily with our Analytics Team. The role will apply innovative techniques to develop, troubleshoot and support our data platform and analytics engine. Qualified candidates will have the skills to design, construct, install, test and maintain highly scalable data pipelines.
Responsibilities
- Design, construct, install, test and maintain highly scalable data pipelines
- Oversee data collection, cleansing and predictive modeling environments, deploying models to production
- Own data modeling, administration, configuration management, monitoring, debugging, and performance tuning of data pipelines to meet stringent SLAs
- Employ a variety of languages and tools (e.g., scripting languages) to integrate existing systems
- Recommend ways to improve data reliability, efficiency and quality
- Develop data set processes for data modeling, mining and production
- Define the overall BI data, ETL processes, data marts and data lake architecture
- Develop a practical roadmap for an enterprise-wide BI reporting and analytics platform
- Design solutions to handle structured datasets and potentially semi-structured and unstructured datasets
Qualifications
Required
- 7+ years’ overall data engineering experience
- 2 years’ experience taking on leadership responsibilities with Engineering and Analytics teams
- Experience effectively teaming with scrum masters and multi-disciplinary team members in the design, planning and governance of technical projects
- Strong hands-on development experience in Python, with a solid grasp of engineering practices, deployment procedures and agile development methodology
- Ability to understand and evaluate existing code written in programming/scripting languages, primarily Python, C# and Java
- Proficient in both Windows and Linux operating systems, with an understanding of how differences between operating systems affect building data pipelines
- Ability to benchmark systems, analyze system bottlenecks, and propose solutions that eliminate them to improve scalability and operational efficiency
- Experience in evaluating the pros and cons of various technologies/platforms and recommending solutions and trade-offs
- Ability to document use cases, solutions and recommendations
- Experience with at least one of the large cloud-computing infrastructure platforms, such as Microsoft Azure or AWS
- Excellent written and verbal communication skills and the ability to translate technical concepts into plain language
- Based in Denver
- Bachelor’s degree in computer science or related field
Preferred
1. 3-5 years’ experience working with ETL tools such as Airflow, Talend, Informatica, and/or Pentaho
2. Experience working with an analytical data pipeline and teaming with data scientists
3. Strong grasp of data pipelines, including the difference between batch and real-time data processing
4. Familiarity with Lambda Architecture’s batch, speed and serving layers
5. Exposure to containerization, operating system virtualization and decomposing large workloads to run in smaller batches across distributed nodes/containers
6. Exposure to distributed data processing platforms (e.g., Hortonworks, Cloudera), Azure data pipelines and VM scale sets
7. 1-2 years’ hands-on experience with Hadoop, Spark, MapReduce, Pig