Topstep

Staff Site Reliability Engineer

Reposted 5 Hours Ago
Remote
Hiring Remotely in United States
200K-250K Annually
Expert/Leader
As a Staff Site Reliability Engineer, you will shape reliability practices, optimize AWS infrastructure, lead incident response, and mentor engineers.

Summary 

Are you a systems-minded engineer who thrives on building resilient infrastructure, driving operational excellence, and enabling teams to move fast with confidence? As a Staff Site Reliability Engineer at Topstep, you'll play a foundational role in shaping how we approach reliability, observability, and infrastructure at scale. You'll be instrumental in building out our SRE practice, defining our incident response culture, closing observability gaps, and optimizing our AWS infrastructure for both performance and cost. This role is ideal for someone who brings both deep technical expertise and a builder's mindset: someone who's excited to establish best practices from the ground up, embed reliability into engineering culture, and create the foundations that let teams ship with speed and confidence. Join us and help define what operational excellence looks like at Topstep.

Key Responsibilities 

  • Set technical direction for reliability and observability across the entire engineering organization, influencing architectural decisions
  • Build and mature our SRE practice by defining SLOs, incident response protocols, and on-call standards
  • Own the observability stack using Datadog (primary platform for metrics, APM, logging) and CloudWatch (AWS-native monitoring), instrumenting distributed tracing and closing gaps that currently prevent diagnosis of production issues
  • Partner with engineering teams to embed reliability principles early in the design process and improve system resilience
  • Lead incident response and blameless post-mortems, turning outages into opportunities for systematic improvement
  • Mentor engineers across the organization on reliability practices, operational thinking, and production ownership
  • Champion a culture of transparency, continuous improvement, and shared ownership of production systems

Required Qualifications and Key Competencies

  • 7+ years of professional experience in SRE, infrastructure, or platform engineering, with demonstrated impact building practices that scaled across multiple teams
  • Proven track record of either starting an SRE function from scratch or scaling an existing practice with measurable improvements to MTTR, MTTD, change failure rate, or availability
  • Strong proficiency with Datadog for end-to-end observability (metrics, APM, logs, distributed tracing) and building alerting that catches real issues without causing fatigue
  • Deep expertise with AWS infrastructure (EKS, ECS, EC2, and RDS) running production services at scale, and hands-on experience optimizing for both reliability and cost
  • Solid foundation in distributed systems, networking, database performance, and debugging complex system failures across service boundaries
  • Comfortable reading code, writing automation scripts, and contributing to infrastructure tooling when needed
  • Proficiency with infrastructure as code (Terraform) and GitOps practices
  • Track record of influencing engineering culture through documentation, tooling, mentorship, and technical leadership
  • Excellent communication skills with the ability to explain complex system behavior and trade-offs to varied audiences
  • Comfortable making pragmatic trade-offs between long-term platform vision and immediate business needs

Company Culture & Perks

  • Topstep offers an engaging work environment that ranges from fully remote to hybrid. We foster a culture of collaboration with cameras on during meetings and a robust Slack environment for communication.
  • 10 company-paid holidays and generous family leave. Paid time off is accrued monthly.
  • Competitive 401(k) matching and health, dental, and vision insurance are offered to full-time employees.
  • Vacations are encouraged with a bonus for taking 5 consecutive days. Employee referrals earn a bonus. Topstep offers a food and groceries budget and contributes towards health and wellness.

New Hire Base Salary Range

  • $200,000-$250,000
  • Bonus: This position is eligible for a performance-based bonus as provided by the plan terms and governing documents.
  • The compensation offered will take into account internal compensation structure and may vary depending on the candidate's geographic region, job-related knowledge, skills, and experience among other factors.

Equal Opportunity Employer

Topstep is an Equal Opportunity Employer. We are committed to fostering an inclusive environment where all employees and applicants are valued. All qualified candidates will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, age, disability, or veteran status, in compliance with applicable federal, state, and local laws.

Interested in the role? Apply today with your resume and cover letter!

At this time, immigration sponsorship is not available for this position (including H-1B, STEM OPT training plans, etc.).

Top Skills

AWS
Datadog
GitOps
Terraform
