Cutover

Site Reliability Engineer

Posted 13 Days Ago
Remote
Hiring Remotely in US
120K-130K Annually
Mid level
As a Site Reliability Engineer, you will ensure production system reliability, optimize performance, respond to incidents, and collaborate on infrastructure improvements.
The summary above was generated by AI

An inclusive work environment is an empowering one. At Cutover, we lead with empathy and enable others to succeed through curiosity, kindness, and self-expression.

Location: Remote, United States

This role requires on-call shifts, roughly 1 in 4 weeks and 1 in 4 weekends. 2nd shift: 2:00 PM - 11:00 PM PST (10:00 PM - 7:00 AM UTC).

Cutover provides enterprise technology operations teams with an AI-powered SaaS solution that automates and streamlines complex processes with intelligent runbooks. The Cutover solution enables teams to respond to incidents quickly, recover from IT outages, and manage cloud migrations with precision and efficiency. Cutover is used in many of the world's largest financial institutions to support their critical technology operations, including 5 out of the top 6 largest asset managers and 3 out of the top 5 US banks.

We’re looking for a Site Reliability Engineer (SRE) to add to our US team. This role will report to our SRE Lead.

Cutover’s SRE team is responsible for ensuring the reliability and performance levels of our production systems and applications. As a team, we’re committed to constantly improving our engineering culture to maintain a balance between risk and reliability.


What tech stack do we use here at Cutover?

The platform is built on a ReactJS frontend with a Ruby on Rails API, all hosted on Amazon Web Services (AWS).

Your role will involve close collaboration with our support and engineering teams. Together, we actively engage in maintaining and optimizing the platform's reliability, utilizing cutting-edge tools and occasionally leveraging in-house software and scripts.

If you're passionate about ensuring the dependability and efficiency of complex systems and thrive in an environment where technologies like React, Ruby, AWS, Kubernetes, Terraform, Git, and Ansible are at the forefront, we invite you to join our team. Together, let's elevate the reliability of our Cutover Enterprise platform to new heights.


As a Site Reliability Engineer, here's what you'll be up to:

  • Incident Response: Respond to incidents and alerts, triaging urgency and investigating root cause
  • Documentation: Contribute regularly to our documentation on system design, troubleshooting, best practices, and engineering processes
  • Root Cause Analysis: Contribute to post-mortems and help identify long-term improvements under guidance
  • Collaboration: Support cross-functional teams during investigations and post-incident reviews
  • Observability: Support and enhance observability tools and techniques by identifying metrics, logging, and alerting improvements
  • Automation: Write and execute simple automation scripts (e.g. Python, Ruby, Bash) to improve reliability and reduce toil
  • Development: Work on internal tools, pipelines, and IaC solutions to help improve the speed of software delivery and recovery
  • System Reliability: Work on efforts to enhance the reliability and performance of our application and systems, ensuring optimal uptime and minimal disruptions.
  • Infrastructure Optimization: Work closely with the development and platform engineering teams to optimize the infrastructure on AWS, ensuring scalability and efficiency.

Please note that this role involves a rotating on-call schedule, which will require occasional evening and weekend availability.


What we'd like you to bring to the table:

  • A genuine excitement for complex problem solving within our tech stack, applying what you know to our unique problems.
  • Familiarity with at least one scripting language such as Ruby, JavaScript, Python, Bash
  • Experience with containerization (e.g. Docker) or IaC (e.g. Terraform, Helm, CloudFormation)
  • An eagerness to follow modern engineering practices and learn from others
  • Familiarity with observability tools such as DataDog, New Relic, Grafana, Prometheus, ELK, or OpenTelemetry
  • Understanding of core networking concepts (DNS, HTTP/S, Load Balancing, etc.)
  • A collaborative mindset with clear communication skills
  • A willingness to ask questions to gain a better understanding of new or complex concepts

Nice to haves…

  • Exposure to major incident response processes
  • AWS Certified Cloud Practitioner or hands-on experience with cloud environments

The good stuff…

  • We're excited to offer Share Options as part of our compensation package.
  • 20 days of PTO per year + public holidays, and we want you to take all of them!
  • 3 volunteer days to use for any charitable/voluntary cause you would like.
  • A top-tier private health insurance package.
  • 401k contribution plan
  • Work from home stipend
  • A personal learning and development budget through Learnerbly. You’ll be supported in your quest for knowledge, whatever that looks like to you.
  • If you’re thinking of starting or growing your family, then you’ll be in great company - more than half of our team are parents and we’ve built a globally consistent parental leave approach that we’re proud of.
  • Employee Referral Scheme.
  • Safeguarding the mental health of our teams is paramount for us. If you’d like to, then you’ll be able to avail yourself of multiple Cutover mental health initiatives, from fully subsidised therapy sessions to subscriptions to leading wellbeing platforms.

Target compensation package: $120,000 - $130,000 base, + stock options + benefits. 

The final offer may vary from the target compensation package, taking into consideration factors such as your experience level and skill set.  If we aren't aligned on salary at this stage, we’d still love to hear from you to better understand if there are more suitable opportunities at Cutover.


Diversity Statement - Empowering Our Teams

We encourage our team to bring their authentic selves to work, which we have found has strengthened workplace relationships and fostered a genuine sense of community.

If you are excited by this role, we invite you to apply! Even if your profile doesn’t check all the boxes, please don't simply scroll past! We recognize that talent lies everywhere and that some demographic groups are more likely to apply for a "stretch role" than others. We are always open to different perspectives and professional backgrounds to keep Cutover's culture evolving and to ensure that we never stop learning. 

Cutover is an Equal Opportunity Employer. Maintaining an equitable hiring process is imperative to our mission. All applicants are considered without regard to race, ethnicity, national origin, religion, sex, gender identity, sexual orientation, age, mental or physical disability, marital status, protected veteran or parental status.

Learn more about Life at Cutover, our Guiding Principles, and our latest news on LinkedIn.

Top Skills

Ansible
AWS
Bash
Datadog
Docker
ELK
Git
Grafana
Kubernetes
New Relic
OpenTelemetry
Prometheus
Python
React
Ruby
Ruby on Rails
Terraform
