Lead the design of decentralized Data Mesh architectures, manage scalable data ecosystems, implement data governance, and mentor engineers in a hybrid working model.
Principal Data Architect – Modern Data Platforms & Data Mesh
San Jose, CA | Newport Beach, CA | Hybrid (2–3 days onsite)
Role Overview
We are seeking a highly experienced, hands-on Principal Data Architect to define the strategic vision for a next-generation, petabyte-scale data platform and lead the transition from centralized to decentralized data management. This is a principal-level role, and the bar goes well beyond senior. We are looking for someone with a long, demonstrated tenure in complex, large-scale data environments who has driven real architectural transformation, not just advised on it.
The core mandate is data decentralization — moving from a monolithic data warehouse model to a Data Mesh paradigm — alongside modernizing data security, governance, and real-time capabilities. Consulting backgrounds and serial short-tenure candidates will face significant additional scrutiny; we want people who have gone deep and have the organizational impact to show for it.
What You'll Do
Strategic Architecture & Data Mesh
- Lead the design and implementation of a scalable, decentralized Data Mesh architecture; define domain boundaries, data products, and federated governance standards
- Drive the organization from centralized to decentralized data management — change management experience is as important as technical depth
- Establish data contracts and self-service analytics capabilities across the organization
Hands-On Engineering Leadership
- Architect and prototype resilient data pipelines using Databricks, Snowflake, and Spark, ensuring high availability and low latency
- Drive adoption of open table formats (Delta Lake, Apache Iceberg) for ACID compliance, time travel, and schema evolution across the data lakehouse
- Troubleshoot complex performance bottlenecks in distributed systems
Data Security & Governance
- Architect advanced data security patterns including dynamic data masking, tokenization, and row-level security
- Implement centralized discovery and access control using Unity Catalog or equivalent enterprise data catalogs
- Implement technical controls for data privacy regulations (GDPR, CCPA) including encryption at rest and in transit
Real-Time & Event-Driven Systems
- Design and deploy high-throughput streaming architectures using Kafka or Pulsar for real-time data ingestion and processing
- Apply deep knowledge of workflow orchestration tools (Airflow, dbt, Dagster)
MLOps & Analytics Integration
- Build robust MLOps pipelines and feature stores bridging data engineering and production AI/ML
- Collaborate with data scientists to operationalize machine learning models end-to-end
Technical Leadership & Culture
- Mentor senior data engineers and architects; foster a culture of technical excellence and "Data as a Product" thinking
- Treat data pipelines as code — strict CI/CD, unit testing, and version control practices expected
About You
Experience & Background
- 10+ years of software and data engineering experience, with a significant portion in technical leadership or architecture
- FAANG or Big Tech background strongly preferred
- Proven track record in large-scale enterprise environments; petabyte-scale data infrastructure is the baseline expectation
- Long tenure in key roles is a strong positive signal; we are not looking for contract-to-contract backgrounds or candidates with a pattern of 1–2 year stints
- Consulting backgrounds are not disqualifying but will be scrutinized — be prepared to speak specifically to what you owned vs. what you touched
Technical Requirements
- Data Mesh: proven experience transforming monolithic data warehouses into decentralized Data Mesh architectures including federated governance
- Platforms: deep hands-on expertise with Databricks and Snowflake including compute optimization and cost management
- Big Data: strong proficiency in Apache Spark, Delta Lake, Iceberg, and Hudi
- Coding: expert-level Python and SQL; Scala or Java is a nice-to-have
- Streaming: Kafka or Kinesis; orchestration via Airflow, dbt, or Dagster
- Governance: hands-on with Unity Catalog, Alation, or Collibra
- MLOps: solid understanding of model registry, feature stores, and model serving
Location & Work Model
Hybrid position based out of San Jose, CA or Newport Beach, CA — candidates must be within commutable distance of one of these two offices. Onsite 2–3 days per week expected. Out-of-area candidates considered only in exceptional circumstances and held to a significantly higher bar.
Compensation
Competitive base compensation commensurate with experience. 25% target bonus. Full benefits including medical/dental/vision, retirement plans, paid parental leave, and paid time off.
Top Skills
Airflow
Alation
Apache Iceberg
Collibra
Dagster
Databricks
dbt
Delta Lake
Java
Kafka
Pulsar
Python
Scala
Snowflake
Spark
SQL
Unity Catalog