Voltage Park

Infrastructure Engineer (Observability)

Reposted 2 Days Ago
In-Office or Remote
2 Locations
140K-180K Annually
Senior level
The Infrastructure Engineer focuses on observability, designing platforms for metrics and alerting, creating dashboards, deploying telemetry, and collaborating across teams to enhance reliability and transparency.

Voltage Park is seeking an Infrastructure Engineer with a focus on Observability to join our Infrastructure Engineering team. Our engineers design and operate the systems that manage thousands of bare-metal servers, GPUs, and high-performance networks across multiple data centers.

This role combines the breadth of a core infrastructure engineer with a specialty in observability and telemetry. You’ll design and operate metrics, logs, traces, and alerting pipelines that give both internal teams and external customers actionable insights, helping to ensure reliability and transparency at scale.

This is a fully remote position, although candidates must be based in the continental United States. Unfortunately, we are unable to provide sponsorship for this role.

Responsibilities
  • Design, build, and maintain observability platforms spanning metrics, logs, traces, and events.

  • Create dashboards and alerting for internal stakeholders (InfraOps, Engineering, Customer Success) and scoped visibility for external customers.

  • Ingest and correlate telemetry from GPUs, CPUs, networking (Ethernet & InfiniBand), containers, APIs, and BMC/Redfish.

  • Implement noise-resistant alerting pipelines that improve detection and reduce operational load.

  • Collaborate with infrastructure, platform, and customer-facing teams to embed observability into workflows.

  • Contribute to broader infrastructure engineering projects beyond observability.

Qualifications
  • 8+ years in infrastructure engineering, SRE, or observability roles.

  • Strong experience with monitoring systems (Prometheus, Grafana, ELK, VictoriaMetrics, or similar).

  • Proficiency in Python, Go, or Bash for automation and data integration.

  • Familiarity with container/Kubernetes observability.

  • Understanding of streaming telemetry pipelines (Kafka, OpenTelemetry, Promtail, or equivalent).

  • Strong written and verbal communication skills.

Ideal Experiences
  • Experience with GPU observability, particularly NVIDIA DCGM.

  • Designing multi-tenant observability solutions with RBAC and scoped queries.

  • Prior work with correlation engines for RCA, forecasting, or predictive alerting.

  • Broader exposure to infrastructure domains (networking, storage, provisioning).

Culture
  • You enjoy working with a small, highly motivated team.

  • You’re comfortable balancing autonomy with company-wide priorities.

  • You value clarity, documentation, and actionable insights in observability systems.

  • You’re excited to specialize in observability while contributing as a core infrastructure engineer.

Voltage Park is an equal opportunity employer and makes employment decisions on the basis of merit. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected under federal, state, or local law. If you require an accommodation during the job application process, please notify your recruiter.

Compensation Range: $140K - $180K


Top Skills

Bash
BMC/Redfish
ELK
Go
Grafana
Kafka
Kubernetes
OpenTelemetry
Prometheus
Promtail
Python
VictoriaMetrics
