Principal AI Platform Engineer (US) at PointClickCare
Work type: remote
Location: Remote, USA
Salary: $179,000 – $199,000/yr
Type: Full-time
Summary
This role is ideal for a high-level Infrastructure or DevOps Engineer who has pivoted into the AI landscape. You should have extensive experience managing Kubernetes environments and a proven track record of building the "plumbing" for Generative AI, specifically around model gateways, vector databases, and LLM observability.
The position offers a competitive salary range of $179k – $199k and the flexibility of being fully remote within the US. As a Principal hire for a newly centralized team, you will have significant influence over the architectural standards for AI deployment across the entire organization, balancing high-scale performance with rigorous security and cost-efficiency.
**You might be a good fit if you:**
* Have hands-on experience with LLM inference frameworks like vLLM or SGLang.
* Are an expert in Kubernetes, container security, and OpenTelemetry.
* Can design and manage the lifecycle of vector databases and prompt versioning stores.
* Want to lead the technical foundation for "agentic" workflows and real-time AI insights.
Job Description
The Team
This team will serve as the product owner for GenAI capabilities within PointClickCare, working closely with other engineering teams across the organization to identify, build, and support generative AI solutions. It is a centralized team with deep specialization, closely integrated with key horizontal partners to ensure delivery of safe, scalable, and high-impact AI products.
Job summary
The Principal AI Platform Engineer will focus on building the infrastructure that connects AI systems with existing products and will enable seamless delivery of AI-generated insights into agent workflows.
Key responsibilities
- Design, build, and maintain the core infrastructure layer supporting GenAI products, including model gateways, prompt/versioning stores, vector databases, and LLM evaluation tools.
- Implement secure access controls and authentication mechanisms integrated by default into the AI platform components.
- Develop and manage observability, monitoring, and logging solutions for GenAI workloads and infrastructure.
- Collaborate closely with product and engineering teams to integrate GenAI infrastructure with agent frameworks and downstream applications.
- Optimize infrastructure for scalability, high availability, and cost efficiency in production workloads.
Qualifications & Skills
- Extensive experience with Kubernetes, container security, and building and maintaining AI platform infrastructure.
- Demonstrated expertise in observability and monitoring frameworks, with a focus on real-time performance (e.g., experience with OpenTelemetry, MLflow).
- Experience with AI infrastructure components such as vector databases, prompt/versioning stores, and AI IDEs.
Preferred experience
- Familiarity with vLLM, SGLang, or a similar framework for hosting LLM inference workloads.
- Experience with CI/CD pipelines and automation for AI model deployment and platform operations.
- Strong knowledge of authentication and authorization frameworks integrated into AI platforms.