Docker, Inc

Senior Infrastructure Engineer

Posted Yesterday
Remote
2 Locations
208K-286K Annually
Senior level
Design, build, and operate Docker's cloud infrastructure services, focusing on reliability, performance, and cost-efficiency, while leading automation and implementing best practices.
The summary above was generated by AI

At Docker, we make app development easier so developers can focus on what matters. Our remote-first team spans the globe, united by a passion for innovation and great developer experiences. With over 20 million monthly users and 20 billion image pulls, Docker is the #1 tool for building, sharing, and running apps—trusted by startups and Fortune 100s alike. We’re growing fast and just getting started. Come join us for a whale of a ride!

Role Summary

Our Infrastructure Engineering team is the backbone of Docker’s cloud-native platform, powering products such as Docker Hub, Docker Build Cloud, and Docker Scout for millions of developers worldwide. We don’t just keep the lights on: we design, build, and operate the infrastructure services and platforms that make Docker fast, reliable, and secure at global scale.

We own and operate the core building blocks of Docker’s application platform:

  • Compute – Multi-tenant EKS clusters, autoscaling, and capacity management.

  • Edge & Internal Networking – Ingress, rate limiting, VPN, and secure inter-cluster connectivity.

  • Observability – End-to-end metrics, logs, tracing, probes, and alerting.

  • Deployment – GitOps workflows powered by Argo CD.

  • Security – IAM for services and humans, plus robust secret management.

  • Cloud Infra Provisioning & FinOps – Automated cloud resource provisioning and cost transparency.

In this role, you’ll:
  • Architect and run globally distributed platform services that hundreds of engineers rely on every day.

  • Continuously evolve our compute, edge, observability and deployment layers for maximum resilience, performance, and cost-efficiency at a global scale.

  • Lead with automation, Infrastructure as Code, and SLO-driven operations to deliver reliability through software, not toil.

If you thrive on high-impact engineering, love solving complex distributed systems challenges, and want to shape the foundation of one of the most widely used developer platforms on the planet — this is your stage.

How We Work
  • Code first: we tackle infra problems with software, design docs, and rigorous code review.

  • Async & remote‑first: decisions are documented in RFCs; incident reviews are blameless and written.

  • Cross‑functional: platform, product, and security engineers collaborate daily to unblock each other.

  • Continuous improvement: we ship small, measure impact, and iterate quickly.

Responsibilities

1. Ship & Operate Cloud Services
  • Design, develop, and ship internal platform services (e.g. provisioning, cost insights, rate‑limiting) in Go or Python.

  • Partner with product and engineering teams to provide paved‑road patterns for deployment, observability, and security.

2. Infrastructure as Code & Reliability
  • Codify infrastructure with Terraform and Go; champion GitOps best practices.

  • Define and own SLOs, lead on‑call rotations, conduct blameless post‑mortems, and implement remediations.

  • Advance observability by operating metrics, logs, tracing, probes, and alerting pipelines at cloud scale.

3. Evolve Compute & Networking Foundations
  • Evolve Docker’s ingress stack—Envoy Gateway, ALB/NLB, AWS VPC CNI—to deliver secure, reliable, and cost‑efficient request routing.

  • Operate and scale multi‑tenant EKS clusters; guide the evaluation and adoption of new infrastructure technologies.

4. Optimize Cloud Provisioning and Cost

  • Build and operate self-serve cloud resource provisioning platforms at scale.

  • Deliver real-time cost visibility and lead company-wide cost-efficiency initiatives.

Qualifications

Core Engineering Skills (must‑have)
  • Strong software development skills in Go, Python, Java or similar (design, testing, and code review).

  • Significant experience shipping and operating cloud infrastructure and distributed systems at scale in production (typically 5+ years of relevant work).

  • Solid foundation in Linux, Networking, and Cloud Security.

  • Excellent cross-team collaboration and strong written and verbal communication skills in a remote environment.

Depth in one or more of the following (nice‑to‑have)
  • Kubernetes ecosystem and Containerization (EKS, ingress, CNI, service mesh).

  • Observability tooling (OpenTelemetry, Prometheus, Grafana).

  • CI/CD & release automation (GitHub Actions, Argo CD).

  • Cost optimization at scale (FinOps, capacity modelling).

Demonstrated expertise in at least one of these areas is welcome; we don’t expect candidates to be experts in all.

What to Expect

First 30 Days
  • Complete onboarding and build relationships across Engineering, Security, and Product.

  • Ship your first Terraform or internal service change and shadow on-call.

  • Gain a deep understanding of our platform architecture, SLOs, and current reliability initiatives.

First 60 Days
  • Take ownership of a critical service or infrastructure component.

  • Lead a medium-complexity project from design to production.

  • Rotate fully into the on‑call schedule and confidently lead incident response when needed.

First 90 Days
  • Lead a high-impact project from design to production.

  • Contribute to refining our platform roadmap and champion initiatives that reduce toil and accelerate delivery.

First Year
  • Lead the design and launch of a major, company-wide infrastructure initiative.

  • Become a recognized subject matter expert in Docker’s cloud infrastructure.

  • Mentor newer engineers and influence engineering culture through technical leadership and continuous improvement.

We use Covey as part of our hiring and/or promotional process for jobs in NYC, and certain features may qualify it as an automated employment decision tool (AEDT). As part of the evaluation process, we provide Covey with job requirements and candidate-submitted applications. We began using Covey Scout for Inbound on April 13, 2024.

Please see the independent bias audit report covering our use of Covey here.

Perks

  • Freedom & flexibility; fit your work around your life

  • Designated quarterly Whaleness Days

  • Home office setup; we want you comfortable while you work

  • 16 weeks of paid parental leave

  • Technology stipend equivalent to $100 net/month

  • PTO plan that encourages you to take time to do the things you enjoy

  • Quarterly, company-wide hackathons

  • Training stipend for conferences, courses and classes

  • Equity; we are a growing start-up and want all employees to have a share in the success of the company

  • Docker Swag

  • Medical benefits, retirement and holidays vary by country

Docker embraces diversity and equal opportunity. We are committed to building a team that represents a variety of backgrounds, perspectives, and skills. The more inclusive we are, the better our company will be.

Due to the remote nature of this role, we are unable to provide visa sponsorship.

Top Skills

AWS
Envoy
Go
Grafana
Java
Kubernetes
Prometheus
Python
Terraform
