Solutions Architect, Inference Deployments at NVIDIA

Work type: onsite

Location: US, CA, Santa Clara

Salary: $152,000 – $241,500/yr

Type: Full-time

Summary

You are a solutions architect with at least five years of experience deploying distributed systems and AI inference workloads on Kubernetes. You hold a bachelor’s degree in computer science or engineering and possess a strong technical background in GPU orchestration and model optimization.

**What makes it worth a look...**

NVIDIA is hiring for this full-time on-site role in Santa Clara, offering a base salary range between $152,000 and $241,500 plus equity and comprehensive benefits. It is a chance to work directly on high-performance generative AI pipelines at the industry leader in GPU technology.

**You might be a good fit if you...**

* Have hands-on experience with TensorRT-LLM, Triton Inference Server, or NVIDIA Dynamo.
* Understand GPU partitioning and memory hierarchies like HBM and DRAM.
* Can tune large language models for low-latency production environments.
* Are comfortable managing Kubernetes clusters and low-latency networking protocols such as RDMA.

Job Description

We’re forming a team of innovators to roll out and enhance AI inference solutions at scale, demonstrating NVIDIA’s GPU technology and Kubernetes. As a Solutions Architect focused on inference, you’ll collaborate closely with our engineering and DevOps teams and with customers to develop enterprise AI solutions. Together, we’ll deliver generative AI to production!

What you'll be doing:





What we need to see:







Ways to stand out from the crowd:





Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD.

You will also be eligible for equity and [benefits](https://www.nvidia.com/en-us/benefits/).

Applications for this job will be accepted at least until April 19, 2026.

This posting is for an existing vacancy. 

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
