Senior Site Reliability Engineer

Sorry, this job was removed at 08:33 p.m. (MST) on Monday, Nov 18, 2024
Hiring Remotely in United States
Remote
Internship
Fintech • Kids + Family • Software
Procare streamlines the administrative functions in child care centers, so they can focus on the kiddos.
The Role

About Procare

Our mission is to simplify childcare operations and create meaningful connections by providing technology, expertise, and unparalleled service.

Procare Solutions is the #1 name in childcare software – used by more than 35,000 childcare businesses across the country. For over 30 years, childcare professionals have looked to Procare to provide real-time information for making critical decisions, maintaining compliance with local and state regulations, and adhering to business best practices.

We make childcare management run smoothly, so that our customers can spend more time focusing on the kiddos, not back office administrative duties.

About the Role:

We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) to join our team. The ideal candidate will have a deep understanding of and extensive experience with AWS, a thorough knowledge of the Linux and Windows operating systems, and a robust background in managing and optimizing infrastructure and services in a cloud environment. As a Senior SRE, you will be responsible for maintaining the reliability, availability, and performance of our applications and infrastructure. You will also mentor other SREs and help design and build robust solutions for Procare.

Key Responsibilities:

  • Infrastructure Management: Design, implement, and maintain scalable, reliable, and secure AWS infrastructure using best practices.
  • Monitoring & Alerting: Develop and maintain monitoring, logging, and alerting solutions to ensure the health and performance of our systems. Utilize tools such as New Relic, AWS CloudWatch, Prometheus, Grafana, and ELK stack.
  • Automation & Scripting: Automate infrastructure provisioning, configuration, and deployment processes using tools like Terraform, CloudFormation, and Ansible.
  • Incident Management: Respond to and resolve production incidents, conduct root cause analysis, and implement corrective measures to prevent recurrence.
  • Performance Optimization: Continuously analyze system performance and implement tuning improvements to enhance the overall efficiency and scalability of the infrastructure.
  • Security Compliance: Ensure all systems and infrastructure comply with security best practices and policies. Implement and manage IAM roles and policies, VPC configurations, and security groups.
  • Mentoring: Work closely with other SREs. Mentor and guide more junior members of the team and help with the design and implementation of new solutions.
  • Collaboration: Work closely with development teams to integrate reliability into the software development lifecycle, including CI/CD pipeline management using tools such as Jenkins or AWS CodePipeline.
  • Documentation: Maintain comprehensive documentation of infrastructure, processes, and incident reports to ensure knowledge sharing and transparency.


Required Skills and Experience:

  • AWS Expertise: Minimum 5 years of hands-on experience with AWS services including EC2, S3, RDS, Lambda, ECS/EKS, CloudFormation, CloudWatch, VPC, and IAM.
  • Linux Expertise: Deep knowledge and extensive experience with Linux operating systems, including system administration, shell scripting, and troubleshooting.
  • Windows Expertise: Deep knowledge and extensive experience with Windows operating systems, including system administration, shell scripting, and troubleshooting.
  • SRE Tools & Technologies: Familiarity with common SRE-related services and tools such as Kubernetes, Docker, New Relic and Splunk.
  • Automation & Configuration Management: Proficiency in infrastructure as code (IaC) tools like Terraform, Ansible, and CloudFormation.
  • Monitoring & Logging: Experience with monitoring and logging solutions, including setting up metrics and creating dashboards and alerts.
  • Networking: Strong understanding of networking concepts, including DNS, load balancing, VPN, firewalls, and network security.
  • Programming & Scripting: Proficiency in at least one programming/scripting language such as Python, Go, or Bash.
  • Problem-Solving: Excellent problem-solving skills with a proactive and analytical approach to resolving issues.
  • Communication: Strong written and verbal communication skills, with the ability to collaborate effectively with cross-functional teams.

Preferred Qualifications:

  • Certifications: AWS Certified Solutions Architect – Professional, AWS Certified DevOps Engineer, or similar certifications.
  • SysAdmin and DevOps Engineering Background: Experience in DevOps engineering, including continuous integration and continuous deployment (CI/CD) practices and tools, as well as systems engineering and administration.
  • Experience: Previous experience in a similar SRE role within a large-scale, complex environment.

Why Procare?

  • Excellent comprehensive benefits packages, including medical, dental, and vision plans
  • HSA option with employer contributions
  • Vacation time, holidays, sick days, volunteer & personal days
  • 401K Plan with employer match and immediate vesting
  • Employee Stock Purchase Plan
  • Employee Discount Program
  • Medical, Dependent Care, and Transportation FSA Plans
  • Company-paid short- and long-term disability and life insurance
  • RTD EcoPass for all Denver employees
  • Tuition Reimbursement and continued Professional Development
  • Fast-paced, high-energy workplace environment in prime downtown location
  • Regular company-provided meals

Salary

$110,000-$150,000/year DOE

Location

While our preference is a candidate located in Denver, CO, this role is open to remote candidates in the following states: AL, AZ, CA, CO, CT, FL, GA, ID, IL, IN, IA, KY, ME, MD, MA, MI, MN, MO, NV, NJ, NY, NC, OH, OR, PA, TN, TX, VA, WA, WI.



The Company
Denver, CO
445 Employees
Hybrid Workplace
Year Founded: 1992

What We Do

For more than 30 years, Procare Solutions has been helping early childhood educators simplify operations and create meaningful connections with families, so they can focus on what matters most – the children in their care.
From registration, attendance tracking, staff management and lesson planning to family engagement, tuition collection and reporting, we help ease the challenges faced with running a child care business.

Our dedicated team of support professionals also makes it easy to get up and running quickly and answers questions along the way.

That’s why over 37,000 customers choose Procare. We are proud to be number one in child care management software. Visit us at www.procaresolutions.com.

Why Work With Us

Child care has a multi-generational impact, and we are proud to design, develop, and support solutions that not only help business owners, but also create meaningful connections between caregivers and families. One of our values is Grow Together: this applies not only to us as employees, but also to the businesses and families we support.

Procare Solutions Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Typical time on-site: Not Specified
Denver, CO
