Attentive

Staff Site Reliability Engineer

Posted 11 Days Ago
Remote
Hiring Remotely in United States
156K-240K Annually
Senior level
Design and implement systems to enhance platform reliability and scalability. Lead initiatives, mentor team members, and collaborate across teams to drive impactful projects and establish best practices.
The summary above was generated by AI

Attentive® is the AI-powered mobile marketing platform transforming the way brands personalize consumer engagement. Attentive enables marketers to craft tailored journeys for every subscriber, driving higher recurring revenue and maximizing campaign performance. Activating real-time data from multiple channels and advanced AI, the platform personalizes content, tone, and timing to deliver 1:1 messages that truly resonate.


With a top-rated customer success team recognized on G2, Attentive partners with marketers to provide strategic guidance and optimize SMS and email campaigns. Trusted by leading global brands like Neiman Marcus, Samsung, Wayfair, and Dyson, Attentive ensures enterprise-grade compliance and deliverability, supporting trillions of interactions across more than 70 industries. To learn more or request a demo, visit www.attentive.com or follow us on LinkedIn, X (formerly Twitter), or Instagram.


Attentive’s growth has been recognized by Deloitte’s Fast 500, LinkedIn’s Top Startups, and Forbes Cloud 100, all thanks to the hard work of our global employees!


About the Role

Our Platform Infrastructure team is the backbone of everything we do at Attentive, providing a resilient and cost-effective platform that seamlessly handles billions of events from over 100 million customers daily. We own everything from compute, persistence, and networking to observability and deployments. Joining our team offers a high-growth career opportunity to collaborate with some of the world’s most talented engineers in a high-performance, high-impact culture.


As part of the Infrastructure and Platform organization, the Production Engineering Team is focused on delivering a fast and reliable platform that empowers Attentive engineers to deliver solutions quickly and safely. We build scalable systems that automate routine tasks so we can focus on other impactful efforts. Reliability, scalability, and security are our areas of expertise. We focus on release, observability, and cost optimization. Our mission is to create robust platforms and tools that allow stakeholders to concentrate on delivering exceptional products.


As a Staff Engineer, you will take a strategic role in designing and implementing solutions that enhance the reliability and scalability of our systems, while mentoring others and influencing technical roadmaps across the organization.

What You'll Accomplish

  • Design and Deliver High-Impact Solutions: Design and implement systems that enhance reliability, observability, traceability, and incident management, ensuring the platform scales effectively
  • Lead Strategic Initiatives: Take ownership of cross-team collaborations and drive impactful projects by providing technical leadership and guidance
  • Partner Across Teams: Collaborate with engineers from AI/ML, Data, Platform, Product, and other groups to deliver best-in-class services
  • Establish Standards and Best Practices: Define and enforce production standards, processes, and tools to ensure operational excellence
  • Champion Reliability Goals: Advocate for and implement SLIs, SLOs, and other reliability-focused metrics across the engineering organization
  • Mentorship and Knowledge Sharing: Guide and mentor team members, fostering technical growth and helping to develop the next generation of engineering leaders
  • Innovate and Inspire: Drive continuous improvement by bringing creative ideas and challenging the status quo

Your Expertise

  • 7+ years of experience in Production Engineering, Backend Engineering, SRE, DevOps, or a similar role
  • Proficient Problem-Solver: Strong coding ability in at least one language (e.g., Golang, Python, Java, TypeScript) with the capability to solve complex issues through code
  • Track Record of Success: Demonstrated experience delivering medium to large-scale projects that drive meaningful improvements in platform reliability and scalability
  • Reliability Expertise: Deep understanding of production reliability concepts, including SLIs, SLOs, and incident management
  • Strong Communicator: Excellent verbal and written communication skills with the ability to influence and collaborate across technical and non-technical teams
  • Fast-Paced Experience: Familiarity with working in dynamic, reliability-focused production environments (preferred)

What We Use

  • Our infrastructure runs primarily on Kubernetes, hosted on AWS EKS
  • Infrastructure tooling includes Istio, Datadog, Terraform, Cloudflare, and Helm
  • Our backend is Java / Spring Boot microservices, built with Gradle, coupled with things like DynamoDB, Kinesis, Airflow, Postgres, PlanetScale, and Redis, hosted via AWS
  • Our frontend is built with React and TypeScript, and uses tools like GraphQL, Storybook, Radix UI, Vite, esbuild, and Playwright
  • Our automation is driven by custom and open source machine learning models and lots of data, and is built with Python, Metaflow, Hugging Face 🤗, PyTorch, TensorFlow, and pandas

You'll get competitive perks and benefits, from health & wellness to equity, to help you bring your best self to work.


For US-based applicants:

- The US base salary range for this full-time position is $156,000 - $240,000 annually + equity + benefits

- Equity is a substantial part of the total compensation package

- Our salary ranges are determined by role, level and location



Attentive Company Values

Default to Action - Move swiftly and with purpose

Be One Unstoppable Team - Rally as each other’s champions

Champion the Customer - Our success is defined by our customers' success

Act Like an Owner - Take responsibility for Attentive’s success


Learn more about AWAKE, Attentive’s collective of employee resource groups.


If you do not meet all the requirements listed here, we still encourage you to apply! No job description is perfect, and we may also have another opportunity that closely matches your skills and experience.


At Attentive, we know that our Company's strength lies in the diversity of our employees. Attentive is an Equal Opportunity Employer and we welcome applicants from all backgrounds. Our policy is to provide equal employment opportunities for all employees, applicants and covered individuals regardless of protected characteristics. We prioritize and maintain a fair, inclusive and equitable workplace free from discrimination, harassment, and retaliation. Attentive is also committed to providing reasonable accommodations for candidates with disabilities. If you need any assistance or reasonable accommodations, please let your recruiter know. 

Top Skills

Airflow
AWS
Cloudflare
Datadog
DynamoDB
EKS
esbuild
Gradle
GraphQL
Helm
Hugging Face
Istio
Java
Kinesis
Kubernetes
Metaflow
Pandas
PlanetScale
Playwright
Postgres
Python
PyTorch
Radix UI
React
Redis
Spring Boot
Storybook
TensorFlow
Terraform
TypeScript
Vite
