Tenstorrent Inc.

Staff Software Engineer, Cloud Infrastructure

Reposted 12 Days Ago
Remote
Hiring Remotely in United States
100K-500K
Senior level
Lead design and implementation of distributed systems for AI in cloud environments, driving project lifecycle and collaborating with stakeholders.
The summary above was generated by AI

Tenstorrent is leading the industry on cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.

This Staff Software Engineer, Cloud Infrastructure position will bring new specialized expertise to the team in distributed high-performance and AI computing, especially in Kubernetes-based cloud-native environments. You will drive the design, implementation, and integration of systems that scale compute capabilities seamlessly from single-host systems to exaflop-scale clusters.

This role is hybrid, based out of Santa Clara, CA or Austin, TX.

We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the one in this posting.


Responsibilities:

  • Design and drive implementation of distributed systems for AI computing applications in Cloud and novel supercomputing cluster environments
  • Hands-on software development, testing, integration, operations, and support
  • Closely collaborate with the team through the full stack and life cycle of AI data center applications, from data center design and rollout to MLOps
  • Operate within on-premises data centers and public cloud environments
  • Drive projects through their whole software development lifecycle, on both the technical and non-technical sides
  • Collaborate with both highly technical and non-technical stakeholders from differing backgrounds, communicating highly complex topics to diverse audiences
  • Continuously improve engineering practices through code reviews and adoption of relevant techniques and technologies

Experience & Qualifications:

  • 10+ years of hands-on software engineering experience working with distributed systems in Cloud and/or HPC environments
  • 5+ years of experience working with clustered (multi-host) AI hardware and applications for training and inference
  • 5+ years of experience with Kubernetes clusters, including cluster and application deployment (e.g., CNI, CSI, Helm), operations, and development of extensions (e.g., Device plugins, Operators)
  • Strong working knowledge of Python and Go
  • Infrastructure as Code as a first-class citizen (e.g. Ansible)
  • Strong Git, GitOps, and CI/CD experience
  • Familiarity with performance requirement implications of AI/ML workloads, both inference and training
  • Familiarity with virtualization technologies and platforms
  • Hands-on experience with MLOps concepts and frameworks for end-to-end model training pipelines
  • Strong understanding of networking concepts – experience with network hardware configuration and management is a plus
  • Familiarity with security implications of multi-tenant environments at the hardware, software, and networking levels
  • Familiarity with observability, monitoring and alerting tools (e.g., Grafana, Prometheus, Loki)
  • Agile / lean software project management experience
  • Strong programming skills with years of experience in various programming languages; familiarity with both object-oriented and functional programming
  • REST API development and integration experience – full-stack web development experience is a plus

Compensation for all engineers at Tenstorrent ranges from $100k - $500k including base and variable compensation targets. Experience, skills, education, background and location all impact the actual offer made.

Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.

Due to U.S. Export Control laws and regulations, Tenstorrent is required to ensure compliance with licensing regulations when transferring technology to nationals of certain countries that have had licensing conditions set by the U.S. government.

Our engineering positions and certain engineering support positions require access to information, systems, or technologies that are subject to U.S. Export Control laws and regulations. Please note that citizenship/permanent residency, asylee, and refugee information and/or documentation will be required and considered as Tenstorrent moves through the employment process.

If a U.S. export license is required, employment will not begin until a license with acceptable conditions is granted by the U.S. government.  If a U.S. export license with acceptable conditions is not granted by the U.S. government, then the offer of employment will be rescinded.

Top Skills

Ansible
CI/CD
Git
GitOps
Go
Grafana
Kubernetes
Loki
Prometheus
Python


