Site Reliability DevOps Engineer
Our Purpose
Small business makes the world go round – it's the heart of the global economy. At Xero we want millions of small businesses to thrive through beautiful software, advice and connections. We aim to make being a small business more efficient and profitable, and more enjoyable too.
We offer a competitive salary, shares in the company, and a great office environment with a view of the Rockies.
How you'll make an impact
You will be part of the team that provides 24/7/365 support for Xero's customer-facing applications to achieve new levels of operational performance. In this role, you will work closely with Xero's product teams to agree on shared objectives, build and implement sophisticated monitoring and remediation toolsets, and create a culture focussed on continually improving the operation of Xero's platform and applications.
What you'll do
- The Site Reliability Engineer will form a critical component in Xero's Site Reliability Engineering team, assisting the team to achieve its goals to provide tooling and guidance towards improving Xero's reliability standards and supporting Xero as we continue to scale.
- This role will be responsible for ensuring our tooling is an enabler of organizational agility. Specifically, you will ensure that we have robust and automated processes that move code from commit, through build and automated quality assurance processes, to deployment to staging environments and thereafter into production environments.
- To do this role you will need knowledge and experience of the relevant technologies used throughout the SDLC. These include object-oriented software development languages and frameworks (such as Python, C# and JavaScript), source control systems (in particular Git), build tools (such as Jenkins and CodeBuild), deployment tools (such as Terraform, CloudFormation and Docker) and monitoring tools (such as New Relic, CloudWatch, Sumo Logic and AWS X-Ray).
- In this role, you will be expected to contribute to the team's direction and to help lead the team towards setting and achieving its business goals.
- Your communication and collaboration skills will be top notch and set an example for others in your team and in the wider company.
- This role is also responsible for ensuring that the products owned by the Site Reliability Engineering team are available, secure, scalable, robust, high performing and cost-effective, serving both Xero's engineering team and our customers. This will require knowledge and experience across a broad range of technology domains including software development, distributed systems design, networking, Linux, Windows, and Docker. It will also require knowledge of the PaaS and IaaS services offered by AWS.
Success looks like
- Design and build tools for application deployment, system automation, and configuration management
- Work closely with other software engineers, CloudOps, DevOps, product managers and QA personnel to deliver cutting-edge cloud solutions.
- Create systems to allow Xero product/development teams to self-service their AWS infrastructure needs
- Daily interaction with Xero product development and platform teams (Networking, Security, and Customer Tech Support teams)
- Consult with teams on best practices for AWS platform and end-to-end lifecycle system deployments
- Create and implement automation processes and standards for AWS cloud services
- Ensure all cloud infrastructure components meet proper availability, cost, performance and security standards
- Create scalable alerting and auto remediation systems
- You will share an on-call rotation backed by all our product teams.
- Perform advanced troubleshooting and monitoring of our systems to ensure SLA and capacity requirements are met.
- Help define the tools and philosophies used among the team around deployment, monitoring, testing and security.
What you'll bring with you
- Software development experience in an agile environment
- Hands-on experience managing infrastructure in a high-availability cloud-based environment, preferably AWS.
- Comfortable with Agile methodologies
- Proven communication and collaboration skills
- Successful track record of providing tooling and support for development teams
- Ideally, experience improving release processes and deployment pipelines using approaches like blue/green deployments, canary releases and testing in production
- Experience building and maintaining highly available systems through infrastructure as code with tools such as Terraform or CloudFormation
- Previous experience working in a DevOps, Release Engineering, Software Development, or similar role
- Ideally, solid experience with test-driven or spec-driven development and QA practices
- An interest in non-functional development concepts and tools, such as monitoring, logging, observability, Service Level Objectives, and uptime.
It's time to introduce yourself
Now that we've caught your attention, it's time to catch ours. Please apply if you:
Love doing #beautiful work – contribute to a beautiful experience for our customers.
Show your passion for the #human side of software development through cultivating a deep understanding of the needs and aspirations of small business owners.
Seek the #challenge of complex technical problems, crazy-smart collaborators and a fast-paced work environment.
#Champion creative development by bringing a unique point of view while inspiring impeccable work from others.
Take #ownership by driving meaningful change and delivering results with passion and purpose.