Our mission at Tensorwave Cloud is to build seamless, secure, reliable, and resilient AI infrastructure at scale, eliminating barriers and challenging the status quo to empower builders and support AI innovation.
About the role
We are seeking a Senior Network Engineer focused on implementing and operating large-scale, Arista-based RoCEv2 data center networks powering next-generation AI and ML infrastructure.
You’ll work hand in hand with our network architect to design the infrastructure that keeps over 8,000 GPUs running, and you’ll play a critical role in implementing and maintaining our next-generation systems, with cluster sizes reaching over 100,000 GPUs.
You’ll work hands-on with high-speed optics, switching, and routing in production clusters and implement modern automation and tooling critical to how the network is deployed, validated, and operated.
Responsibilities
Design, deploy, and operate large-scale RoCEv2 data center networks supporting AI and ML clusters from thousands to 100,000+ GPUs
Own congestion management and performance tuning across RDMA fabrics, including PFC, ECN, and DCQCN, in production environments
Implement and maintain automation, validation, and observability tooling using Python, Ansible, Terraform, and modern DevOps workflows
Ensure high availability and reliability across multi-tenant environments by leading operational excellence, incident response, and continuous improvement
Required Experience
Bachelor of Science in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience
Deep experience with RDMA and RoCEv2 in large-scale production data centers supporting AI or HPC workloads
Strong Arista expertise, including EOS, hardware platforms, and operating high-speed Ethernet fabrics
Proven knowledge of congestion management and performance tuning using PFC, ECN, and DCQCN
Hands-on experience with high-speed optics and cabling, including 400G and 800G optics; AEC, AOC, and DAC cables; and structured cabling in dense environments
Automation and operations mindset, with experience using Python, Ansible, Terraform, Git, and observability tooling in always-on production systems
What We Bring
Mission-driven company
Competitive Salary
Stock Options
100% paid Medical, Dental, and Vision insurance
Flexible PTO
Paid Holidays
401(k)
Parental Leave
Flexible Spending Account
Short-Term Disability Insurance
Life and Voluntary Supplemental Insurance
Mental Health Benefits through Spring Health
We’re looking for resilient, adaptable people to join our team: people who believe in the mission and think at massive scale. The solutions that worked on a handful of devices will not work at exascale. Be prepared to be pushed daily, to learn a lot, and to literally build the future.
Tensorwave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, national origin, or veteran status.



