Team Lead, Site Reliability Engineer - Denver
Our purpose
Small business makes the world go round – it's the heart of the global economy. At Xero we want millions of small businesses to thrive through beautiful software, advice and connections. We aim to make being a small business more efficient and profitable, and more enjoyable too.
How you'll make an impact
You will lead one of the teams that provide 24/7/365 support for Xero's customer-facing applications to achieve new levels of operational performance. In this role, you will work closely with Xero's product teams to agree shared objectives, build and implement sophisticated monitoring and remediation toolsets, and create a culture focussed on continually improving the operation of Xero's platform and applications.
What you'll do
- Help the Head of SRE to transform an existing group of engineers into teams of site reliability engineers.
- Engage with Xero's product teams to build close relationships and ways of working together.
- Assist the Head of Site Reliability with implementing and customising practices that are appropriate for Xero, for example, Error Budgets and Incident Management Processes.
- Assist with the development of an engineering culture where team members have the skills and desire to write automation, monitoring and alerting code.
- Prioritise feature requests, enhancements and bug fixes into an achievable roadmap.
- Work with the product teams to help them understand the outputs from Post Mortems to improve their products.
- Provide leadership within Xero on Site Reliability Engineering.
- Work closely with your team to define milestones and ensure on-time and in-scope delivery.
- Lead the team to take operational responsibility for their services and to establish processes and monitoring that drive incremental improvement.
Success looks like
- Build a site reliability team that is well connected to the Product teams and driving improvement in the production performance (e.g., availability and efficiency) of Xero's platform and products.
- The Site Reliability Team is seen as an attractive place for developers, internal and external, to work.
- Clear Service Level Objectives have been agreed with the Product teams and for the overall platform. These are accurately monitored and regularly reported on. Teams are meeting or exceeding their objectives.
- Xero's product reliability is improving as a result of your team.
- Monitor agreed Service Level Objectives for operational performance and continue to improve the way the service is operated and monitored.
- A well-run Incident Management process.
- Effective leadership of a team of engineers working in different areas of the business.
What you'll bring with you
Critical competencies
- Strong reverse engineering skills
- Ability to think statistically
- Ability to improvise
- Builds effective relationships
- Mentorship
- Diplomacy
Experience
- Lead roles in designing, building and debugging software.
- At least 5 years of software development experience.
- Experience building high-performing teams, including senior engineers and graduates.
- Experience running teams to support critical, high-scale infrastructure.