Writer

Performance engineer

Reposted 21 Days Ago

In-Office or Remote

2 Locations

Expert/Leader

Lead performance optimization for Generative AI, enhancing LLM and RAG systems through strategic performance testing and infrastructure collaboration.

📐 About this role
WRITER is seeking a highly skilled and motivated Principal Performance Engineer to lead the performance optimization of our Generative AI technology stack. This role is critical to the scalability, efficiency, and reliability of our Large Language Models (LLMs) and Retrieval Augmented Generation (RAG) systems. You will identify and resolve performance bottlenecks, optimize resource utilization, and help ensure a seamless user experience. You will work closely with our AI research, software engineering, and infrastructure teams to deliver world-class AI solutions.

🦸🏻‍♀️ Your responsibilities

Performance leadership:
- Define and implement performance engineering strategies for our full Generative AI stack, including services, applications, LLMs, RAG pipelines, and related infrastructure.
- Lead performance testing, profiling, and analysis efforts to identify and resolve performance bottlenecks.
- Establish and maintain performance benchmarks and SLAs for critical AI services.
- Provide technical leadership and mentorship to performance engineering team members.
LLM capacity and tuning:
- Analyze and improve LLM inference performance, including latency, throughput, and resource utilization.
- Develop and implement strategies for LLM capacity planning and scaling.
- Collaborate with AI researchers to evaluate and improve LLM architectures and training techniques for performance.
- Optimize LLM inference through techniques such as quantization, distillation, and optimized kernel implementations.
RAG performance optimization:
- Design and implement performance tests for RAG pipelines, including retrieval, ranking, and generation components.
- Identify and optimize performance bottlenecks in RAG systems, such as database queries, vector search, and document processing.
- Evaluate and optimize RAG system architectures for scalability and efficiency.
- Tune vector databases for optimal recall and latency.
Infrastructure optimization:
- Collaborate with infrastructure teams to optimize hardware and software configurations for AI workloads.
- Evaluate and recommend new technologies and tools for performance monitoring and analysis.
- Develop and maintain performance dashboards and reports to track key metrics.
- Optimize GPU utilization and memory management for LLM inference.
Collaboration and communication:
- Work closely with AI researchers, software engineers, and product managers to ensure performance requirements are met.
- Communicate performance findings and recommendations to stakeholders at all levels.
- Stay up to date with the latest developments in Generative AI and performance engineering.

⭐️ Is this you?

Education:
- Bachelor's degree in Computer Science, Engineering, or a related field (Master's preferred).
Experience:
- 10+ years of experience in performance engineering, with a focus on large-scale distributed systems.
- 2+ years of experience working with AI/ML technologies.
- Proven experience in performance testing, profiling, and analysis of complex software systems.
- Deep understanding of NLP architectures, training, and inference.
- Experience with vector databases and search technologies.
- Experience with cloud computing platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Docker, Kubernetes).
- Strong programming skills in Python.
- Familiarity with Postgres and Elasticsearch.
- Experience with performance analysis tools (e.g., profilers, debuggers, monitoring tools).
Skills:
- Strong analytical and problem-solving skills.
- Excellent communication and collaboration skills.
- Ability to work in a fast-paced and dynamic environment.
- Passion for AI and a desire to push the boundaries of performance engineering.

🍩 Benefits & perks (US Full-time employees)

Generous PTO, plus company holidays
Medical, dental, and vision coverage for you and your family
Paid parental leave for all parents (12 weeks)
Fertility and family planning support
Early-detection cancer testing through Galleri
Flexible spending account and dependent FSA options
Health savings account for eligible plans with company contribution
Annual work-life stipends for:
- Home office setup, cell phone, internet
- Wellness stipend for gym, massage/chiropractor, personal training, etc.
- Learning and development stipend
Company-wide off-sites and team off-sites
Competitive compensation, company stock options, and 401(k)

WRITER is an equal-opportunity employer and is committed to diversity. We don't make hiring or employment decisions based on race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other basis protected by applicable local, state or federal law. Under the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

By submitting your application on the application page, you acknowledge and agree to WRITER's Global Candidate Privacy Notice.

Compensation Range: $195.9K - $345.1K

Top Skills: AWS, Azure, Docker, Elasticsearch, GCP, Kubernetes, Performance Analysis Tools, Postgres, Python
