Site Reliability Engineer - Kubernetes

Gloo

Remote | Hybrid

Sorry, this job was removed at 11:23 a.m. (MST) on Monday, January 18, 2021

Gloo provides a personal growth platform that enables service providers, our “Champions,” to exchange better insights, resources, and technology to serve their people. Our company name reflects the trusted bond between people that serves as the foundation for growth, and everything we build strengthens that bond. We’re leveraging the same exponential tech that’s driving success in other industries and making it available through personal growth resources such as custom assessments, growth plans, and more. As one of Boulder’s most innovative and growing tech companies, Gloo needs more talented professionals who are driven to make a positive impact.

The Opportunity:

As the Senior Kubernetes Engineer on our Site-Reliability Engineering team, you will serve as a technical expert and leader in all things Kubernetes. You will ensure the company’s innovative digital products get built on reliable, scalable, resilient, and secure cloud infrastructure. Further, you will play a key role in maturing our Continuous Integration/Continuous Deployment (CI/CD) practices, policies, and tools, specifically leveraging best-of-breed tools to support our ever-growing microservices deployment. Overall, you will provide key leadership to help ensure our infrastructure services meet the demands of our fast-paced, dynamic organization.

You bring a deep understanding of cloud-native architectures, practices, and tools. With a background in software development and systems administration, you are as comfortable writing code in Python or Go as you are in Bash. You not only understand Infrastructure as Code as a concept, but you also have experience applying its principles across an organization.

You love solving deep technical challenges, and proactively improving on earlier iterations. You look to automate manual operations, and cringe at the thought of making changes in a web-based console. You'd rather spend your day in an IDE and terminal than a GUI.

Finally, you have solid experience working within agile product teams to deliver enterprise-class services and infrastructure, while also managing operational accountabilities. Above all, you are passionate about solving real problems for real people, and bring a positive attitude about the change you can make in the world.

What You'll Be Doing

Lead the use of Kubernetes services and environments, and apply best practices.
Lead the Kubernetes engineering roadmap, integrating the voice of the customer from stakeholders, product owners, and partners.
Be part of a leadership team that advances our organization's CI/CD practices and infrastructure as code, continually seeking improvements, new technology advantages, and efficiencies.
Contribute to the development of best practices for Infrastructure as Code, software build tools, and Continuous Integration.
Own configuration management and automated monitoring of all infrastructure.
Maintain our open-source software infrastructure components, including application and database servers.
Remain mindful of security and access control within all environments.
Watch for trouble spots and create mitigation strategies.
Brainstorm with your team to continually support scalable architecture.

What you’ll bring to the table:

Technical Knowledge & Experience:
5+ years of site reliability engineering or systems engineering
3+ years of developing SRE practices and promoting a DevOps culture
3+ years of senior-level expertise in containerization, especially Kubernetes
3+ years of infrastructure automation, configuration management, or container orchestration, working in a modern engineering services team where you’ve built and extended an automated CI/CD pipeline
3+ years of architecting and/or designing scalable infrastructure solutions
BA/BS in Computer Science, Information Technology or a related technical field (preferred, but not required)
Periodic participation in an after-hours on-call rotation supporting production environments 24x7
Experience supporting a big data environment including Elasticsearch, Kafka, Elastic MapReduce/Hadoop
A lead in cloud-native designs and systems, with strong experience in Amazon Web Services (AWS) such as EC2, S3, RDS, ECS, VPC, SNS, SQS, and Route 53
Sound experience with modern tools like Git, GitHub, Jenkins, Ansible/Puppet, Linux, Docker, Redis, Memcached, CloudFormation/Terraform, Nginx, and PostgreSQL
Comfortable with monitoring tools like Sumo Logic, Wavefront, Monit, and CloudWatch
Expert software development skills and experience, ideally Python, Go, Bash
Ability to design and develop APIs (REST or GraphQL), and CLI tools
Experience in configuring and designing service mesh and discovery using Istio.
Experience in building observability solutions with metrics that feed SLOs and KPIs
Experience minimizing and hardening the attack surface of microservices and public-facing API gateways
Experience with continuous delivery using tools such as Argo CD
Broad experience with and strong knowledge of open source
Expert Linux admin skills, both to do the following yourself and to lead others in doing so:
○ Debug networking issues
○ Attach and mount volumes
○ Debug containers (networking, file systems, CPU/memory allocation vs. usage, stateful vs. stateless containers)
Demonstrable experience and expertise in multi-tiered, distributed systems.
Deep understanding of and experience with CloudFormation
Good experience with Git, including GitHub and Bitbucket

Kubernetes Expertise -- Leadership & Technical Skills:

Deep expertise in designing and supporting highly-scalable, highly-available infrastructure and applications in Kubernetes as well as promoting microservice architectures in cloud-native environments.
Subject matter expertise on all aspects of our containerized deployments, including deployment, configuration, scaling, security, and upgrades.
Demonstrated experience and success in providing leadership across engineering teams, serving as the subject matter expert on Kubernetes.
Operational experience demonstrating the ability to respond to critical incidents and outages, and demonstrated experience in debugging problems in production and test environments.
Expertise and experience in automations that improve deployment speed and service reliability in the containerized environment.
Demonstrated experience as a mentor of other team members and customers on the adoption of best practices, new technologies, and architecture principles.
Significant real-world production expertise in the deployment, management, and monitoring of Kubernetes, including developing automation that improves deployment speed and service reliability in the containerized environment.

Great Interpersonal Skills

Ability to self-manage, prioritize work across the team, and lead entire initiatives
You don't wait for senior members to assign work to you, but rather you actively look for work that needs to be done
○ You look for ways to constantly improve our code, our use of technology, patterns, and tools, and find ways to leverage those things within our organization
You can mentor other engineers in their development, including through pairing, code reviews, and knowledge sharing.
Ability to navigate ambiguity: you can take a nebulous project, talk to the right people, define it, split it up into tasks, distribute work across team members, and ultimately get it done quickly and efficiently.
You proactively reach out to stakeholders and other teams to collect and clarify requirements
You can whiteboard and architect solutions and explain technical details to non-technical stakeholders
A one-team attitude and a willingness to work across several diverse teams
Driven to help others - a passion that is evident in your character

The Perks/Benefits

Compensation and bonus commensurate with experience
Plenty of time off to keep you balanced
Medical with HSA contribution
A dynamic, talented team, dedicated to changing the world and building an incredible business
Remote Flexibility
Headquartered in downtown Boulder on Pearl Street, steps from coffee shops and blocks from hiking trails

Compensation: $130,000 - $180,000 DOE

Applications welcomed from those who are US Citizens or hold a Green Card.
