Ooma

Site Reliability Engineer

Posted 16 Days Ago

Remote

Hiring Remotely in US

118K-158K Annually

Senior level

The Site Reliability Engineer will ensure system stability and efficiency by monitoring performance, managing infrastructure, and collaborating on deployment processes in a cloud-based environment.

The summary above was generated by AI

Here at Ooma we empower people to connect in smarter ways. We do this by creating powerful communication experiences through our cloud-based platform to bring people together at work and at home. Our solutions help small business owners stay connected with their customers and manage their businesses from anywhere. For larger companies we provide customized unified communications solutions to meet their unique needs. At home, we help our customers connect with their loved ones by providing the #1 rated VoIP phone service available. We also provide them with peace of mind through our innovative smart home security solution. At Ooma, all our products and services are priced competitively, because we believe advanced technology should be accessible to all.

About the Role:

As a Site Reliability Engineer, you will leverage your extensive expertise in Linux Systems, Virtualization, containers, Kubernetes clusters, and CI/CD pipelines to ensure the stability and efficiency of our systems. Your role will involve collaborating with various teams to implement best practices for infrastructure management, automated deployment, and application performance monitoring. A strong understanding of on-premises environments and the management of large data centers is essential.

What You’ll Do:

Monitor and troubleshoot system performance, reliability, and availability issues using modern observability tools and techniques, with a strong emphasis on diagnosing and resolving issues in operating systems and bare metal environments.
Design, implement, and maintain scalable and reliable infrastructure using containers, Kubernetes, and microservices architecture.
Manage CI/CD pipelines to facilitate efficient software development and deployment processes.
Implement GitOps workflows using ArgoCD or Flux, and manage Helm charts and Kustomize configurations for declarative application deployment and version control.
Oversee configuration management to ensure consistent and reliable software releases across environments, using Ansible for system configuration, patch management, and provisioning across datacenter infrastructure.
Design and operate high-throughput Kafka clusters for event streaming, managing topics, partitions, replication, consumer lag monitoring, and disaster recovery strategies across datacenter infrastructure.
Collaborate with development teams to influence system design choices and operational policies.
Provide expert guidance on managing large data centers, including hundreds of bare metal servers and virtual machines (VMs), ensuring optimal configuration and performance.
Implement name services and server management practices to support our infrastructure needs.
Continuously evaluate and integrate new technologies to enhance operational efficiency and reliability.
Participate in on-call rotations to provide support for production systems as necessary, conduct blameless post-mortems with root cause analysis, and maintain incident response runbooks and procedures.
Create comprehensive technical documentation, runbooks, architectural diagrams, network topology maps, and maintain knowledge bases for operational procedures and best practices.

Experience We’re Looking For:

Bachelor’s degree in Computer Science, Engineering, or a related field; advanced degree preferred.
5+ years of experience as an SRE or in a related role, with a strong focus on production systems, containers, microservices, and service delivery.
Extensive experience managing and maintaining CI/CD pipelines and the tooling that supports them (GitOps workflows, ArgoCD, Helm charts).
Comprehensive knowledge of observability tools such as Prometheus, the ELK Stack, log collectors, and Grafana for visualization.
Extensive on-premises experience managing large data centers with hundreds of bare metal servers and VMs.
Deep knowledge of Linux operating systems, their configuration, performance tuning, and troubleshooting.
Experience with configuration management tools.
Familiarity with networking concepts and protocols in the scope of Linux Operating Systems.
Proven ability to analyze complex systems, identify bottlenecks, and implement solutions with strong troubleshooting skills.
Excellent communication skills, with the ability to collaborate effectively with cross-functional teams.
Experience with containers and orchestration technologies, particularly Kubernetes, is a plus.

What We Offer:

Working at Ooma means being a team player while allowing your individual voice to come through. You'll also receive competitive compensation, benefits, and generous company perks.

Comprehensive Medical/Dental/Vision insurance for you and eligible dependents
- HMO, PPO, or a PPO with an HDHP (including an HSA, which Ooma helps fund)
Employer Paid Income Protection Benefits (Basic Life and AD&D, Short- and Long-term disability)
FSA Healthcare & Dependent Care
Commuter Benefits
Voluntary Accident, Critical Illness, Hospital Indemnity and Legal
401(k), including employer match, and Roth
Employee Stock Purchase Plan (ESPP)
Paid Time Off, Sick Time, and observed corporate holidays
Employee Assistance Program
Life Balance benefits with Travel Assistance Services and Identity Theft
Additional Benefits include a Discount Program, Credit Union, Medicare Assistance, etc.

The base salary range for candidates within the United States is listed below. Actual base pay will depend on a variety of factors such as education, skills, experience, specific location, etc. The base pay range is subject to change and may be modified in the future. Regular employees may also be eligible for bonus(es), sales incentive(s) (target included in OTE) and/or stock in the form of Restricted Stock Units (RSUs).

United States Pay Range

$118,000—$158,000 USD

Top Skills

CI/CD

Configuration Management

Containers

Kubernetes

Linux

Virtualization
