VAST Data

Senior Solutions Engineer, AI Infrastructure

Posted 17 Days Ago
Remote or Hybrid
Hiring Remotely in United States
Senior level
The Senior Solutions Engineer will design and implement infrastructure for AI and HPC workloads, engage with customers, and lead technical discovery and architecture design.
The summary above was generated by AI
Description

We're looking for a deeply technical Solutions Architect to help customers design, evaluate, and deploy infrastructure for large-scale AI, HPC, analytics, and data-intensive workloads.

This is a customer-facing technical role for someone who has lived inside production infrastructure. You may have been a platform engineer, infrastructure engineer, SRE, MLOps engineer, AI infrastructure engineer, storage engineer, cloud engineer, or HPC systems engineer. What matters most is that you have built, operated, or architected real systems, and can bring that credibility into customer conversations.

Our customers are building infrastructure at serious scale: GPU clusters, high-performance storage systems, Kubernetes platforms, distributed training environments, inference platforms, data pipelines, lakehouses, and large enterprise systems. You'll help them reason about architectures involving 10,000+ GPUs, 100PB+ of storage, high-performance networking, distributed filesystems, orchestration layers, and demanding production workloads.

You'll own technical discovery, architecture design, PoC planning, competitive positioning, and customer technical strategy. You'll work from the first whiteboard session through evaluation, deployment planning, and production success. You'll also partner closely with product and engineering teams to bring field feedback into the roadmap.

We're looking for someone who can go deep technically, communicate clearly, operate without a rigid playbook, and translate complex infrastructure into customer outcomes.

Responsibilities

  • Lead technical discovery with customers across infrastructure, platform, ML, data, and executive stakeholders.
  • Design architectures for large-scale AI, HPC, analytics, and enterprise data workloads.
  • Help customers evaluate infrastructure involving GPUs, storage, networking, orchestration, and data movement.
  • Translate complex technical requirements into clear solution designs, reference architectures, and deployment guidance.
  • Debug customer issues across Linux, storage, networking, Kubernetes, schedulers, GPUs, and application workloads.
  • Build technical assets, runbooks, and field guidance for repeatable customer engagements.
  • Partner with product and engineering to communicate customer requirements, gaps, and roadmap opportunities.
  • Help customers move from architecture design to production deployment.

Requirements

  • 8 to 12+ years of technical experience, with significant hands-on infrastructure experience.
  • Experience building, operating, or architecting production platform infrastructure.
  • Strong understanding of Linux kernel internals, distributed systems (including consensus protocols such as Paxos and Raft), storage implementation details such as NAND flash behavior and write amplification, store-and-forward networking, load-balancing designs, and production operations.
  • Experience with one or more of: GPU infrastructure, large-scale HPC systems, Kubernetes platforms built from scratch, MLOps, storage systems, cloud infrastructure, data platforms, or large-scale enterprise infrastructure.
  • Ability to communicate credibly with engineers, architects, technical executives, and business stakeholders.
  • Strong discovery, problem-solving, and systems debugging skills.
  • Comfort operating in ambiguous, fast-moving environments.
  • Interest in customer-facing technical work, solution design, and business outcomes.

Preferred Experience

  • Experience with large-scale GPU clusters, distributed training, inference infrastructure, or AI platforms.
  • Experience with petabyte-scale storage or high-performance data systems.
  • Experience with Kubernetes, Slurm, Ray, Spark, or other orchestration / scheduling systems.
  • Domain expertise with one or more of the following: Lustre, Ceph, Weka, BeeGFS, GPFS, VAST, object storage, or distributed filesystems.
  • Experience with large-scale InfiniBand, RoCE, RDMA, high-performance Ethernet, or NVIDIA/Mellanox networking.
  • Direct experience with CUDA, NCCL, DCGM, GPUDirect, checkpointing, dataset staging, or model-serving infrastructure.
  • Experience across multiple industries or customer environments.
