Comet is building the development platform for teams who want to ship robust, reliable, and responsible AI applications. Opik, our open source LLM evaluation framework, has quickly become one of the most popular tools in the space. Our experiment management platform is used by data scientists at companies like Uber, Netflix, and Etsy. Tens of thousands of researchers, engineers, academics, and hobbyists use Comet every day to build the future of AI.
Working at Comet will give you access to the most exciting work being done in all areas of machine learning. Some of the top researchers and companies working on self-driving cars, drug discovery, particle research, diffusion models, and LLMs use Comet every day. Your work has the potential to accelerate the development of some of the most impactful technology in the world, and you will be doing it alongside a team of passionate, caring individuals. If that sounds exciting, Comet is the right place for you.
Comet is backed by more than $63 million in venture capital funding and powers some of the best machine-learning teams in the world, including Netflix, Uber, Etsy, and Mobileye. We are a remote-first company with offices in New York City (USA) and Tel Aviv (Israel).
We are seeking a Senior DevOps Engineer who wants to have a real impact on how we build, deliver, and operate our products. In this role, you won’t just “keep the lights on”; you’ll design and evolve the infrastructure, automation, and delivery systems that power our business.
Working in a remote-first company (this role is fully remote, based on the US East Coast), you’ll be trusted to operate with autonomy while collaborating across time zones with a team that values technical excellence, knowledge sharing, and a bias for action.
If you thrive on solving complex problems, care deeply about reliability, and love enabling teams to ship faster with confidence, we’d like to meet you.
Responsibilities:
- Design, implement, and manage scalable, secure, and reliable cloud-based infrastructure
- Build and maintain CI/CD pipelines for efficient and consistent application delivery
- Implement and manage Infrastructure as Code (IaC) to ensure consistency across environments
- Drive adoption of best practices in automation, observability, and system reliability
- Ensure security and compliance across infrastructure and deployments
- Optimize cost management of cloud infrastructure
- Collaborate closely within the DevOps team to share knowledge, improve processes, and raise the technical bar
- Partner with development, QA, and other teams to ensure applications are designed and delivered for operability
- Troubleshoot, investigate, and resolve production issues that directly impact customers
Requirements:
- 5+ years of experience in a DevOps, SRE, or related role, including significant production experience
- Proven remote work experience and strong collaboration skills in distributed teams
- Deep understanding of DevOps practices, automation, CI/CD, and infrastructure-as-code
- Passion for troubleshooting and root cause analysis
- Strong experience with cloud platforms (AWS preferred, GCP a plus) and managing infrastructure with Terraform
- Solid understanding of networking, security, and infrastructure best practices
- Significant hands-on experience with containerization and orchestration (Docker, Kubernetes, Helm)
- Experience with observability tools such as Prometheus, Grafana, or New Relic
- Strong background in Linux/Unix system administration
- Proficiency in scripting (Bash, Python)
- Experience in software development (Java, Python, Go) - a plus
- Knowledge of database management and performance optimization - a plus
What We Offer:
- Competitive base salary based on proven experience, skills and location.
- Competitive benefits package.
- Flexible working hours and remote work options.
- Opportunities for professional growth and development.
- A collaborative and innovative work environment.
- The chance to work with cutting-edge technologies and projects.
This role is fully remote, based on the US East Coast, working with a global team (with a large presence in the US, Tel Aviv, and Europe). Some flexibility with work hours is required.
Comet is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees without regard to race, religion, color, sex, gender identity, gender expression, sexual orientation, national origin, ancestry, citizenship status, uniform service member status, marital status, pregnancy, age, medical condition, physical or mental disability, genetic information/characteristics, and any other characteristic protected by State or Federal law.