Principal AI Platform Engineer (CA) at PointClickCare
Work type: remote
Location: Mississauga
Salary: CA$169,000 – CA$188,000/yr
Type: Full-time
Summary
This role is ideal for a high-level infrastructure specialist who sits at the intersection of DevOps and Machine Learning. You should have deep experience building the "piping" for Generative AI, specifically managing Kubernetes-based environments, vector databases, and model gateways. This isn't just about training models; it's about architecting the scalable production environments that allow AI products to function securely.
As a **Principal** role, this position offers a high level of autonomy and a competitive salary range of **$169k – $188k CAD**. The role is fully remote within Canada (or via the Mississauga office), providing excellent flexibility. You’ll be a founding member of a centralized AI team, giving you the chance to set the standard for how GenAI is deployed across a major healthcare technology platform.
**You might be a good fit if you...**
* Have hands-on experience with LLM inference frameworks like vLLM or SGLang.
* Are an expert in Kubernetes and securing containerized workloads.
* Have built observability stacks using OpenTelemetry or MLflow for production AI systems.
* Can transition between high-level architectural design and hands-on CI/CD automation.
Job Description
The Team
This team will serve as the product owner for GenAI capabilities within PointClickCare, working closely with other engineering teams across the organization to identify, build, and support generative AI solutions. It is a centralized team with deep specialization, closely integrated with key horizontal partners to ensure delivery of safe, scalable, and high-impact AI products.
Job summary
The Principal AI Platform Engineer will focus on building the infrastructure that connects AI systems with existing products and will enable seamless delivery of AI-generated insights into agent workflows.
Key responsibilities
- Design, build, and maintain the core infrastructure layer supporting GenAI products, including model gateways, prompt/versioning stores, vector databases, and LLM evaluation tools.
- Implement secure access controls and authentication mechanisms integrated by default into the AI platform components.
- Develop and manage observability, monitoring, and logging solutions for GenAI workloads and infrastructure.
- Collaborate closely with product and engineering teams to integrate GenAI infrastructure with agent frameworks and downstream applications.
- Optimize infrastructure for scalability, high availability, and cost efficiency for production workloads.
Qualifications & Skills
- Extensive experience building and maintaining AI platform infrastructure, Kubernetes, and container security.
- Demonstrated expertise in observability and monitoring frameworks, with a focus on real-time performance (e.g., experience with OpenTelemetry, MLflow).
- Experience with AI infrastructure components such as vector databases, prompt/versioning stores, and AI IDEs.
Preferred experience
- Familiarity with vLLM, SGLang, or a similar framework for hosting LLM inference workloads.
- Experience with CI/CD pipelines and automation for AI model deployment and platform operations.
- Strong knowledge of authentication and authorization frameworks integrated into AI platforms.