Clarium

AI Engineer - Data Intelligence

Reposted 4 Days Ago
Remote
Hiring Remotely in US
150K-180K Annually
Junior
As an AI Engineer at Clarium, you will build and maintain data enrichment pipelines, design classification workflows, analyze datasets, and ensure data quality, primarily using Python and SQL, and work closely with senior engineers and data scientists.
The summary above was generated by AI

Why Clarium?

The healthcare industry overspends on its supply chain by over $25B each year, the result of fragmented data, inefficient workflows, and wasted supplies. Clarium is fixing that. Our AI-powered platform, Astra OS, gives hospitals end-to-end visibility into their supply chain operations, automating workflows and surfacing actionable insights so supply chain teams can focus on what matters most: patient care. We're trusted by some of the world's leading health systems, including Yale New Haven Health, Stanford, Geisinger, and Kaiser Permanente.

Founded in 2020, Clarium has raised $43M in total funding. Our Series A was led by Northzone, with participation from General Catalyst, AlleyCorp, Kaiser Permanente Ventures, Texas Medical Center Ventures, and 1984 Ventures.

The Opportunity

AI-powered platforms, like Clarium’s, deliver the highest impact when they are supported by high-quality data. As we scale to more health systems and deepen our offering of intelligent, data-driven workflows, the master data enrichment pipeline (the system that classifies and contextualizes every product flowing through a hospital's supply chain) has become a critical growth lever. We're investing in the team and infrastructure to make that layer faster, smarter, and more reliable.

You'll join the Data Products team, a small, unusually senior group responsible for the data assets, data science, and analytics that drive measurable value for our clients. Day-to-day, you'll build and own components of our enrichment pipeline: classification workflows, entity resolution systems, evaluation harnesses, and the production tooling that keeps it all running. You'll work closely with engineers and data scientists who've shipped real ML systems at scale, and your work will feed directly into decisions made by supply chain teams at some of the country's leading health systems.

This is a rare early-career opportunity to learn fast and own real work from day one. As the first junior hire on the team, you won't be buried under layers of abstraction. You'll work directly alongside people who've done this before, on problems that actually matter. You'll get short feedback loops, real stakes, and the kind of hands-on growth that's hard to find this early in a career. It's the opportunity many of us wish we'd had starting out.

In This Role You Will

  • Build and maintain components of Clarium's master data enrichment pipeline, the system that classifies and enriches every product flowing through our platform

  • Design and own classification and entity resolution workflows that combine deterministic logic and LLMs for production data processing

  • Build and operate evaluation harnesses, label sets, and regression suites (we use Braintrust) to measure and improve pipeline quality with confidence

  • Write production Python and SQL; the majority of your time will be spent in code, not in configuration tools

  • Analyze complex datasets using statistics and ML to surface actionable insights and inform pipeline improvements

  • Proactively audit data for quality issues; find the problems no one else has noticed yet, diagnose root causes, and ship fixes

What You'll Bring

  • Strong Python skills and a track record of writing production code, not just scripts or notebooks

  • Strong SQL, including complex joins, window functions, performance tuning, and data modeling

  • Comfort working in ambiguous environments; you can scope a problem, make a plan, and execute without hand-holding

  • A genuine, non-negotiable commitment to data quality; you treat silent bugs as real failures

  • Ability to go deep on an unfamiliar domain and develop meaningful expertise over time

Nice to Have

  • Experience with LLM integrations, prompt evaluation, or classification at scale

  • Familiarity with eval frameworks such as Braintrust, Promptfoo, or equivalent

  • Prior work in healthcare, supply chain, or another domain where data quality has direct operational consequences

Skills & Tools You'll Use

Need to Know: Python · SQL · PostgreSQL · CI/CD · Production observability

Nice to Know: Temporal · Braintrust · Snowflake · AWS · Sigma

What You Get at Clarium

Target Base Salary Range: $150K - $180K

The base salary Clarium offers may vary depending upon the ultimate scope and responsibilities of the position and on the candidate’s job-related knowledge, skills, and experience. The total package will include equity, in addition to a full range of medical and/or other benefits, depending on the position offered. Pay and benefits are subject to change at any time, consistent with the terms of any applicable compensation or benefit plans.

Incentive Stock Options proportionate to your salary

Fully remote, with a NYC co-working space available; distributed team across multiple time zones with opportunities for in-person time

Unlimited PTO

Top-tier health, vision, and dental benefits

401(k)

The opportunity to build on a strong foundational team with deep data and engineering roots at a stage where your work genuinely shapes the product

Equal Opportunity Statement

Clarium is committed to promoting an inclusive work environment free of discrimination and harassment. We value a diverse and balanced team where everyone can belong.
