Duetto

Senior SRE

Posted 14 Days Ago
Remote
Hiring Remotely in United States
Senior level
Lead and enhance system reliability by designing scalable infrastructure solutions, managing cloud services, and ensuring high system availability in a collaborative team environment.
The summary above was generated by AI

THE COMPANY
Duetto offers an open and collaborative work environment and believes that by cultivating a team with diverse backgrounds, perspectives, and experiences, it will continue to lead the industry with its cutting-edge platform-based hospitality technology. 

We are an ambitious, well-funded, high-growth global technology company transforming the hotel industry. At Duetto, we are passionate about creating innovative solutions to help hoteliers thrive. Although we work hard and operate at “Duetto speed,” the work atmosphere is casual, flexible, collaborative, and most of all, fun.

POSITION SUMMARY AND OPPORTUNITY 

We are seeking a highly experienced Lead Site Reliability Engineer (SRE) to join our dynamic team. The ideal candidate will have a proven track record of designing, implementing, and maintaining scalable, secure, and highly reliable systems. As a key contributor, you will collaborate with cross-functional teams to drive architecture decisions, implement best practices, and ensure high system availability.

Our technology stack includes Java server technologies, NoSQL, GitHub, single-page JavaScript web techniques (jQuery, Backbone, React, and RequireJS), and patent-pending analytical methods on top of MongoDB and AWS. We use Terraform to manage our infrastructure, Chef for configuration management, and Datadog and Prometheus for monitoring.

KEY RESPONSIBILITIES 

  • Architect and implement infrastructure solutions to facilitate seamless migration of critical systems while ensuring uptime, reliability, and a high-quality experience for end users.
  • Design, develop, test, and maintain tools and processes to efficiently manage and operate SaaS products hosted on AWS, with a focus on scalability and automation.
  • Partner with developers to enhance the reliability, performance, scalability, and security of server and application architectures.
  • Build and maintain critical components of our infrastructure, emphasizing robustness, security, and high availability to meet demanding service-level expectations.
  • Foster strong cross-team collaboration by driving engagement, promoting shared goals, and ensuring alignment across technical and non-technical teams.
  • Lead efforts to ensure systems are secure by default, addressing vulnerabilities proactively and implementing best practices for cybersecurity preparedness.
  • Be the last line of support for services that thousands of customers (hotels, resorts, casinos, etc.) around the world depend on 24/7.
  • Troubleshoot on-call incidents to ensure rapid resolution and minimal service disruption. Participate in detailed Root Cause Analysis (RCA) to identify underlying issues and work cross-functionally to implement preventative measures and long-term solutions, ensuring similar problems are avoided in the future.

REQUIREMENTS 

  • 7+ years of experience in an Ops, DevOps, or SRE role.
  • Experience in System Design and Architecture. 
  • Engineer-level experience with networking and security concepts.
  • Understanding of the fundamentals behind load-balancing technologies. Experience configuring Layer 7 load balancing is a plus.
  • Experience collaborating with engineers on architecture decisions.
  • Experience administering cloud computing services such as AWS (preferred), Azure, or GCP, including working knowledge of permissions structures, multi-account management structures, and single sign-on (SSO).
  • Experience with AWS ecosystem tools such as AWS IAM, VPC, EC2, ELB, RDS, S3, Lambda, API Gateway, Secrets Manager, KMS, CloudWatch, and CloudTrail.
  • Experience with security compliance certifications such as SOC2.
  • Experience working in an environment with a heavy emphasis on DevOps and Service Reliability mindset.
  • Experience provisioning, configuring, administering, and using enterprise monitoring ecosystems such as Prometheus, Grafana, Datadog, or similar.
  • Experience with CI/CD Tools such as GitHub, GitHub Actions, JFrog Artifactory, Jenkins, and GitOps methodologies.
  • Experience writing and using infrastructure as code with Terraform.
  • Experience with configuration-management toolsets such as Chef or Puppet.
  • Experience with containers and container orchestration tools such as ECS/EKS (a plus).
  • Experience managing infrastructure and contributing as part of a multi-user infrastructure team, using Terraform and associated toolsets. Relevant SOC2 experience is also a plus.
  • Fluency in reading Java, Ruby, Bash/Zsh, HCL, Python, and JavaScript.
  • Strong experience in troubleshooting and resolving complex on-call incidents with a focus on minimizing service disruption and downtime.
  • Proven ability to lead and participate in detailed Root Cause Analysis (RCA) processes to identify and address underlying issues effectively.
  • Demonstrated expertise in implementing preventative measures and long-term solutions based on RCA findings to ensure recurring issues are mitigated.
  • Experience constructing and maintaining build/deploy automation tooling.
  • Willingness to participate in a weekly on-call rotation.
  • Ability to work both independently and within a team environment.
  • A passion for technology and a drive to stay up to date with best practices.

PROFILE OF THE IDEAL CANDIDATE 

  • Team Player - Works well with others, is highly collaborative, and acts as a strong partner to other team members and functions.
  • Execution - Desire to work on a fast-paced team and help set direction and architecture.
  • Creativity - Thrives in an environment without a set playbook.
  • Quality - Takes pride in delivering robust, high-quality implementations.
  • Ownership - Enjoys owning and driving projects.


About Duetto:
Duetto delivers a suite of cloud-native SaaS applications that help hospitality businesses optimize every booking opportunity for greater revenue impact. The unique combination of hospitality experience and technology leadership drives Duetto to look for innovative solutions to industry challenges. The software-as-a-service platform allows hotels, casinos, and resorts to leverage real-time dynamic data sources and actionable insights into pricing and demand across the enterprise. For more information, please visit https://www.duettocloud.com.

Top Skills

AWS
Backbone
Chef
Datadog
Git
Java
jQuery
MongoDB
NoSQL
Prometheus
React
RequireJS
Terraform


