Work type: hybrid
Location: San Francisco
Type: Full-time
This role is ideal for a mid-to-senior-level engineer (3–7+ years) who thrives at the intersection of machine learning research and production software engineering. You should have a deep mastery of Python and PyTorch, with a proven track record of moving generative AI models beyond the notebook and into low-latency, high-throughput production environments.

The most compelling aspect of this position is the opportunity to work at a high-growth startup founded by Stanford AI scientists. While an exact salary isn't listed, the package includes "meaningful equity," which offers significant upside potential as the platform scales across federal and commercial health systems. You'll be solving complex technical challenges like model quantization and distributed inference while making a tangible impact on clinician burnout.

**You might be a good fit if you...**

* Have extensive experience deploying transformer-based LLMs or speech-to-text systems on AWS.
* Are comfortable building automated evaluation pipelines to ensure model reliability in high-stakes healthcare settings.
* Enjoy the "applied" side of ML: optimizing for speed, batching, and caching rather than just training.
* Prefer a hybrid work environment in San Francisco and want to work on a product with clear social utility.
About Knowtex
Knowtex is building the future of voice AI operating systems for clinicians, transforming how healthcare documentation happens at the point of care. Founded by Stanford AI scientists with deep clinical experience, we're experiencing explosive growth across both commercial health systems and federal healthcare, with our ambient documentation platform scaling rapidly to thousands of clinicians across hundreds of specialties. We're at an inflection point where cutting-edge AI meets real clinical impact, giving clinicians hours back each day to focus on what matters most - their patients.
Position Overview
We are seeking an Applied ML Engineer to productionize and scale machine learning systems powering our voice AI platform. This role bridges research and engineering — transforming models into reliable, low-latency, production-grade systems deployed across enterprise healthcare environments.
You will work closely with ML Scientists, Backend Engineers, and Platform teams to optimize inference performance, build evaluation pipelines, and ensure robust model deployment in regulated environments.
Key Responsibilities