Site Reliability Engineering Manager
Job Description
As a Manager of Site Reliability Engineering, you will be accountable for providing scalable, maintainable, and reliable IT services to our business through service development, transition, and support. This talented leader will focus on managing the large-scale physical and virtual server environment that underpins our global ad delivery platform. Though focused on Linux systems administration, this individual will be a member of our platform department with broad exposure to a range of infrastructure and technologies. The majority of our project work focuses on scaling our server architecture, performance tuning, and automating operational maintenance. This role works with a cross-functional team of managers, delivery teams, vendors, and external service providers to execute, maintain, and improve the underlying services through the use of IT best practices.
Qualifications:
- BA/BS degree from a 4-year college or university or equivalent experience
- 5+ years of experience in Linux systems engineering
- 2+ years of experience in direct management
- Experience working with automation tools such as Puppet, Chef, or Ansible
- Familiarity with load balancing tools and techniques
- Experience working in containerized and virtual environments such as Docker, Kubernetes, and VMware
- Familiarity with IP and Ethernet networks as well as transport protocols
- Scripting ability a plus (Bash, PHP, Perl, Python)
- Ability to handle pressure situations with clarity, focus and professionalism
- Open to flexible work conditions to ensure the availability of services and the timely delivery of solutions
- Possesses a service- and solution-oriented approach
- Very strong organizational, time management, and priority-setting skills
- Driven by a desire for continuous improvement
- Must be enthusiastic, communicative and eager to learn
Position Reports to: Sr. Director, Reliability Engineering