Graphcore

Technical Lead - System Validation Architect

Reposted Yesterday
Hybrid
Austin, TX
Senior level
Lead the architecture and execution of Linux-based validation frameworks for Arm-based data center SoCs, defining validation strategy and ensuring system quality.
About us

Graphcore is one of the world’s leading innovators in Artificial Intelligence compute.
It is developing hardware, software and systems infrastructure that will unlock the next generation of AI breakthroughs and power the widespread adoption of AI solutions across every industry.
As part of the SoftBank Group, Graphcore is a member of an elite family of companies responsible for some of the world’s most transformative technologies. Together, they share a bold vision: to enable Artificial Super Intelligence and ensure its benefits are accessible to everyone.
Graphcore’s teams are drawn from diverse backgrounds and bring a broad range of skills and perspectives. A melting pot of AI research specialists, silicon designers, software engineers and systems architects, Graphcore enjoys a culture of continuous learning and constant innovation.

Job Summary

We are seeking a Technical Lead – System Validation Architect to lead the architecture and execution of Linux-based validation frameworks for Arm-based data center SoCs. This role will define validation strategy, test coverage, and methodology across CPU, memory, interconnect, and high-speed I/O subsystems. You will provide technical leadership in validation architecture, automation, benchmarking, and debug to ensure robust system quality and scalability.

The Team

The Systems Validation Architecture team is responsible for defining and enabling scalable validation methodologies for Graphcore’s next-generation AI compute platforms. The team collaborates closely with hardware, firmware, and systems engineering groups to deliver comprehensive validation coverage and high-quality system enablement.

Responsibilities and Duties
  • Define end-to-end validation strategy and coverage model:
    • Functional, stress, performance, and corner-case testing
  • Translate hardware specifications into structured, parameterized test plans
  • Guide the team in:
    • Selecting appropriate tools
    • Defining workload models and parameter configurations
  • Establish standards for:
    • Test case definition (parameters, metrics, pass/fail criteria)
    • Result validation and reporting
  • Apply multi-core and parallel programming expertise, including workload scaling and CPU affinity management
  • Review Python-based automation, orchestration, and analysis
  • Collaborate with hardware, firmware, and system teams to debug issues
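
As an illustration only, a structured, parameterized test case definition of the kind listed above (parameters, metrics, pass/fail criteria) might be sketched in Python as follows. This is a minimal sketch, not Graphcore's tooling: the names ValidationCase and expand_plan, the parameter names, and the threshold values are all hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class ValidationCase:
    """One test case: its parameters, the metric it reports, and a pass/fail rule."""
    name: str
    params: dict
    metric: str                      # hypothetical metric name, e.g. "bandwidth_gbps"
    threshold: float                 # hypothetical limit; real limits come from the spec
    higher_is_better: bool = True

    def passed(self, measured: float) -> bool:
        # Throughput-style metrics must meet or exceed the threshold;
        # latency-style metrics must stay at or below it.
        if self.higher_is_better:
            return measured >= self.threshold
        return measured <= self.threshold


def expand_plan(name, param_grid, metric, threshold, higher_is_better=True):
    """Expand a parameter grid into one ValidationCase per combination."""
    keys = sorted(param_grid)        # sorted so case order and names are deterministic
    cases = []
    for values in product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        label = name + "[" + ",".join(f"{k}={v}" for k, v in params.items()) + "]"
        cases.append(ValidationCase(label, params, metric, threshold, higher_is_better))
    return cases


if __name__ == "__main__":
    # Hypothetical memory-bandwidth sweep over thread count and block size.
    plan = expand_plan(
        "mem_bw",
        {"threads": [1, 2], "block_kib": [4, 64]},
        metric="bandwidth_gbps",
        threshold=10.0,
    )
    for case in plan:
        print(case.name, case.params)
```

Keeping the pass/fail rule on the case itself means a reporting step can evaluate any measured value without knowing which tool produced it.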

Candidate Profile

Essential:

  • Strong knowledge of Arm SoC architecture and Linux systems.
  • 8+ years of experience in system validation, performance engineering, or low-level systems development.
  • Deep understanding of CPU architecture, cache coherency, memory systems (DDR, HBM, NUMA), and high-speed I/O technologies such as PCIe.
  • Proven ability to define validation strategies, coverage models, and validation methodologies.
  • Hands-on experience using and tuning benchmarking tools such as stress-ng, fio, and iperf.
  • Strong Python programming skills for automation, orchestration, and data analysis.
  • Experience with performance analysis tools such as perf and hardware PMU counters.
  • Strong analytical and problem-solving skills, and the ability to collaborate in multi-functional environments.

Desirable:
  • Experience working with large-scale or data center systems.
  • Strong programming skills in C/C++ and Python for system-level development.
  • Previous technical leadership or mentoring experience.
  • Experience with scalable validation infrastructure and automation frameworks.
  • Knowledge of AI infrastructure or hyperscale compute systems.

USA Benefits

In addition to a competitive salary, Graphcore offers flexible working and a comprehensive benefits package designed to support your health, wellbeing and financial future. Our benefits include medical, dental and vision coverage, Flexible Spending Accounts (FSAs), Health Savings Accounts (HSAs), disability and life insurance, a 401(k) retirement plan, commuter benefits, wellness services and an Employee Assistance Programme (EAP).

We welcome people of different backgrounds and experiences; we're committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments.
