Nubank

Staff Machine Learning Engineer (Infrastructure)

Reposted 2 Days Ago
Remote
Hiring Remotely in USA
Senior level
Design and optimize AI infrastructure, ensuring reliability and efficiency. Lead projects, create ETL pipelines, and support ML workload operations.
The summary above was generated by AI
About Nu

Nu is the world’s largest digital banking platform outside of Asia, serving over 105 million customers across Brazil, Mexico, and Colombia. The company has been leading an industry transformation by leveraging data and proprietary technology to develop innovative products and services. Guided by its mission to fight complexity and empower people, Nu caters to customers’ complete financial journey, promoting financial access and advancement with responsible lending and transparency. The company is powered by an efficient and scalable business model that combines low cost to serve with growing returns. Nu’s impact has been recognized in multiple awards, including Time 100 Companies, Fast Company’s Most Innovative Companies, and Forbes World’s Best Banks. Learn more: https://international.nubank.com.br/careers/


About the role

At Nubank, one of our engineering principles is "Leverage Through Platforms". We believe that platforms are a very efficient way of addressing complex concerns shared across different products and teams.

The AI Infrastructure Squad within the AI Core BU builds and scales the foundational cloud, data, and AI infrastructure that powers machine learning workloads across the organization. We design and optimize high-performance training, inference, and data processing systems while ensuring reliability, scalability, and efficiency. Our team enables AI practitioners by providing robust compute, model serving, monitoring, and orchestration frameworks to drive innovation and operational excellence.


As a Software Engineer in the AI Core BU, you are expected to demonstrate:
  • Strong expertise in systems and infrastructure
    Proven experience designing, building, and operating distributed systems in cloud environments, with a focus on performance, scalability, and reliability.
  • Deep experience with production-grade ML pipelines
    Ability to design, operate and orchestrate pipelines for model training, evaluation, and deployment - with solid understanding of distributed training, resource scheduling, and inference serving.
  • Experience optimizing performance and efficiency for AI workloads
    Experience with techniques such as parameter-efficient fine-tuning (PEFT), kernel fusion, mixed-precision training, and pipeline optimizations to reduce cost and latency.
  • Experience with cloud platforms
    Experience with GCP/AWS, Kubernetes, GPU/CPU orchestration, multi-region deployments, and infrastructure-as-code (Terraform, Pulumi).
  • Experience with observability and reliability
    Strong background in monitoring, alerting, logging, and fault tolerance - applied to both batch training jobs and real-time inference systems.
  • Strong software engineering foundations
    Expertise in Python, Go, or similar languages, with emphasis on clean, maintainable, and testable code.

We’re looking for individuals who thrive in horizontal, high-impact teams that build foundational infrastructure for multiple AI initiatives. We want people who enjoy solving deep technical challenges at the intersection of AI, cloud, and distributed systems, and who take ownership with a strong product mindset - ensuring infrastructure is reliable, scalable, and built around user needs. We value collaborators and mentors who help teammates grow while upholding high engineering standards. If you’re passionate about building scalable, efficient, and cost-effective AI infrastructure that drives meaningful, real-world impact, we’d love to meet you.

If these challenges interest you and you want to work on a highly engaged and talented team, this is the place for you!


What we have to offer

High-Impact, Cross-Functional Work – Our team sits at the core of AI operations, enabling ML engineers, researchers, and data scientists to build and deploy models at scale. You'll work across multiple teams and business units, directly shaping AI-driven products and decisions.

Cutting-Edge AI & Cloud Infrastructure – Be part of a team that designs and operates high-performance AI infrastructure, spanning cloud, data, and ML platforms. You'll tackle technical challenges in distributed systems, model serving, and large-scale data processing.

0 to 1 & Large-Scale Initiatives – Work on both greenfield projects and mission-critical AI infrastructure, from building scalable training pipelines to optimizing real-time inference workloads. Your work will directly influence the efficiency and scalability of AI across the company.

Growth & Ownership Opportunities – As a senior engineer, you'll have the autonomy to drive technical direction, lead high-impact projects, and contribute to architectural decisions. You'll also have opportunities to mentor others, shape engineering best practices, and grow into a leadership role.

Culture of Excellence & Collaboration – Join a team that values deep technical expertise, curiosity, and a strong engineering culture. We operate in a fast-moving environment where innovation, reliability, and efficiency drive everything we build.


Our Benefits

Remote work, with quarterly trips to São Paulo to build relationships with coworkers.

Top Tier Medical Insurance

Top Tier Dental and Vision Insurance

20 days of time off, 14 company holidays, and a great culture that emphasizes work-life balance.

Life Insurance and AD&D

Extended maternity and paternity leaves 

Nucleo - Our learning platform of courses

NuLanguage - Our language learning program

NuCare - Our mental health and wellness assistance program

401K

Saving Plans - Health Saving Account and Flexible Spending Account

Top Skills

Airflow
AWS
BigQuery
Dagster
GCP
Go
Kubernetes
ML Infrastructure
Pulumi
Python
Ray Serve
Spark
Terraform
vLLM
