Data Platform Engineer
The Data Platform Engineer combines the skills of a computer scientist and a statistician. This is a high-impact opportunity to shape Sunrun’s strategy as we continue to lead the residential solar industry, directly influencing how Sunrun operates and maintains its existing enterprise BI solution across solar production, the sales platform, web analytics, Salesforce, and ERP applications. This individual will also contribute to the BI product platform for grid monitoring services, customer services, pipeline management, and utility ingestion. The position can be based in San Francisco or San Jose, CA, or downtown Denver, CO.
- Use GCP (Dataproc, Dataflow, BigQuery, Cloud SQL, Bigtable) or equivalent AWS services (ECS, DynamoDB, Kinesis, Athena, S3, API Gateway, CloudFormation), together with Node.js and Java, to deliver grid monitoring services, the sales funnel, and utility ingestion.
- Analyze information and reporting business requirements, explore and evaluate options, and develop a prototype with reporting and dashboard tools early in the project life cycle, thereby validating the requirements, eliminating ambiguity, and giving the business early visibility into the final solution.
- Participate in the design of the data warehouse data model and provide input and feedback from a Business Intelligence tool perspective.
- Improve the accuracy and reliability of solar performance estimations by identifying and characterizing the root causes of variances between expected and actual performance.
- Develop a strong understanding of business processes for functions such as Sales, Finance, and HR in order to be able to better support their Business Intelligence needs.
- Participate in and assist with User Acceptance Testing, and train super users and power users on the tool and the solution.
- Bachelor's Degree in Computer Science or a related field
- 2+ years experience engineering Big Data solutions
- Extract, Transform, Load (ETL) experience using Python, Spark, Hadoop, Apache Beam, or similar technologies
- Programming experience with Java or Python
- Experience with Big Data and knowledge of data warehousing concepts, methodologies, and frameworks
- SQL & NoSQL analytical expertise, data modeling, and relational database experience
- Experience on Unix/Linux OS, general server administration and Shell scripting
- Experience with stream-processing systems such as Storm or Spark Streaming
- Experience with Machine Learning methods and libraries
- Ability to communicate technical concepts to a business-focused audience.
- Familiarity with applying machine learning and data mining techniques to gain unexpected insights, and with applying sophisticated data modeling techniques to identify underlying trends and patterns (supervised/unsupervised classification, time series, probabilistic models)
- Experience with statistical analysis tools (e.g., MATLAB, R, JMP, Python)
- Experience with cloud platforms such as GCP (Dataproc, Dataflow, BigQuery, Cloud SQL, Bigtable)
- Experience working with cloud solutions, preferably GCP (Dataflow, Dataproc), and NoSQL data sources (e.g., Hadoop, DynamoDB, MongoDB)
- Data visualization experience using Looker