Software Engineer - APM Reliability
About Datadog:
We're on a mission to build the best platform in the world for engineers to understand and scale their systems, applications, and teams. We operate at high scale—trillions of data points per day—providing always-on alerting, metrics visualization, logs, and application tracing for tens of thousands of companies. Our engineering culture values pragmatism, honesty, and simplicity to solve hard problems the right way.
The team:
At Datadog, APM Reliability Engineers are strong developers focused on improving the performance, stability, and release quality of our tracing libraries. We instrument the critical paths of systems at scale, and the team's mission is to ensure our libraries are not intrusive and don't degrade the performance or reliability of those systems.
The opportunity:
Datadog is building a world-class APM product that traces requests as they flow across complex systems at scale. As an APM Reliability Engineer, you will work with multiple teams to measure and analyze the performance impact our tracing and profiling tools may introduce in those systems. You will provide guidance and make improvements that push our tracing tools to the next level. Come and join us to build fast and reliable open-source software.
You will:
- Measure the performance of our tracing libraries to detect and solve performance issues that nobody else has been able to crack
- Coach other engineers to validate the reliability of our tracing libraries by introducing methodologies and testing approaches such as defensive programming or fuzzing
- Build high-leverage tools for your day-to-day work that introduce chaos into our tracing libraries and validate whether they are resilient to unexpected errors
- Own the key performance metrics across libraries to ensure we never introduce regressions
Requirements:
- You have significant experience optimizing software in at least one of the following languages: Java, Go, or C++
- You have a proven track record of understanding the performance of large-scale services
- You have significant experience using profilers, debuggers, tracers, or similar tools to improve the quality of the software you write
- You communicate well and your enthusiasm for the craft is contagious
Bonus points:
- You have experience deploying services on Kubernetes or other orchestrators
- You have experience building distributed systems
Is this you? Let's chat!
In accordance with the Colorado Equal Pay Transparency Rule (“EPT”)
If you are a Colorado resident, this applies to you:
At Datadog, we are committed to providing competitive pay and benefits that are in line with industry standards. We analyze and carefully consider several factors when determining compensation, including your work history and professional experience. These considerations can cause your compensation to vary.
The Software Engineer - APM Reliability role has an annual starting salary of $100,000, plus a competitive equity package. The actual pay may be higher depending on your skills, qualifications, and experience. Total compensation may further vary depending on whether this role is eligible for discretionary bonuses or commissions. In addition, Datadog offers a wide range of employee benefits. To learn more about benefits, click here.
Equal Opportunity at Datadog:
Datadog is an Affirmative Action and Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements.
Your Privacy:
Any information you submit to Datadog as part of your application will be processed in accordance with Datadog’s Applicant and Candidate Privacy Notice.