Senior AI Governance Scientist - Artificial Intelligence at Centene
Work type: remote
Location: Remote-MO | Remote-GA | Remote-NE | Remote-FL | Remote-WA State | Remote-NY | Remote-SC | Remote-NC | Remote-IL | Remote-MD | Remote-VA | Remote-MA | Remote-PA | Remote-NJ | Remote-TX | Remote-CA
Salary: $107,700 – $199,300/yr
Type: Full-time
Summary
**Who this is for**
This position is for a Senior AI Governance Scientist with a strong technical background in AI/ML, eager to conduct hands-on evaluations of AI systems to ensure safety, reliability, and alignment.
**Key highlights**
You will be instrumental in operationalizing AI governance by designing and executing rigorous evaluations, red teaming exercises, and developing scalable testing frameworks for diverse AI models.
**You might be a good fit if you...**
- Hold a Bachelor's or Master's degree in a quantitative field (e.g., Computer Science, Statistics) or have equivalent research experience.
- Possess 5+ years of AI/ML experience, with at least 3 years focused on model evaluation, safety, or robustness.
- Have hands-on experience with Python-based AI/ML stacks and agentic AI frameworks, and familiarity with LLM observability tools.
- Can design and implement evaluation methodologies, conduct experiments, and document findings in technical reports.
Job Description
You could be the one who changes everything for our 28 million members. Centene is transforming the health of our communities, one person at a time. As a diversified, national organization, you’ll have access to competitive benefits including a fresh perspective on workplace flexibility.
Position Purpose:
Leads advanced technical evaluation and assurance activities within our AI Governance function. This role is hands-on and execution-focused, responsible for designing, conducting, and scaling rigorous evaluations of AI systems—including traditional machine learning, generative AI, and agentic AI—to assess safety, reliability, robustness, and alignment with intended use.
Plays a critical role in operationalizing AI governance through experimentation, red teaming, and evaluation frameworks, while partnering closely with engineering, product, and research teams to embed evaluation practices into the AI development lifecycle.
- Executes comprehensive red team and stress-testing exercises to identify vulnerabilities, failure modes, and safety risks across AI systems, including large language models, generative models, and autonomous agents.
- Designs, implements, and refines evaluation methodologies and protocols to assess AI performance, safety, reliability, and alignment with intended use cases.
- Evaluates the adequacy and sufficiency of existing AI evaluations, identifies gaps in coverage or rigor, and recommends targeted improvements.
- Designs and conducts reproducible experiments to measure AI value, impact, and risk, applying statistical methods and causal inference techniques where appropriate.
- Develops and maintains automated testing frameworks and evaluation pipelines that scale across the organization’s AI portfolio.
- Researches and applies novel attack vectors and stress-testing approaches for generative AI (e.g., prompt injection, jailbreaking, hallucination risks) and agentic systems (e.g., autonomy boundary violations, goal misalignment).
- Creates and curates benchmarks, datasets, and metrics aligned to specific AI capabilities, risk profiles, and governance requirements.
- Documents evaluation methodologies, findings, and recommendations in clear, governance-ready technical reports for review by governance bodies and cross-functional stakeholders.
- Partners with product, engineering, and research teams to integrate evaluation and assurance practices into AI design, development, and deployment workflows.
- Performs other duties as assigned.
- Complies with all policies and standards.
Education/Experience:
- Bachelor's Degree in Computer Science, Machine Learning, Statistics, or a related quantitative field, or equivalent applied research experience required. Master's Degree preferred.
- 5+ years of AI/ML research or applied AI development, including at least 3 years focused on model evaluation, safety, robustness, or validation required. 7+ years preferred.
Technical Skills:
- Strong technical foundation in machine learning and deep learning, with hands-on experience evaluating or developing modern AI systems required
- Demonstrated experience designing and executing AI evaluation, testing, or validation methodologies across multiple AI paradigms required
- Solid understanding of statistical analysis, experimental design, and data analysis techniques relevant to AI evaluation required
- Experience designing evaluation methodologies and a publication record preferred
- Familiarity with a Python‑based AI/ML stack using PyTorch and Databricks, and with agentic AI frameworks (LangChain, LlamaIndex, LangGraph, AutoGen, CrewAI) for single‑ and multi‑agent systems. Strong focus on LLM observability, MLOps, and evaluation using LangSmith, MLflow, Weights & Biases, Datadog, OpenTelemetry, and testing frameworks such as DeepEval and LangTest
Licenses/Certifications:
- Industry-related certifications preferred
Pay Range: $107,700.00 - $199,300.00 per year

Centene offers a comprehensive benefits package including: competitive pay, health insurance, 401K and stock purchase plans, tuition reimbursement, paid time off plus holidays, and a flexible approach to work with remote, hybrid, field or office work schedules. Actual pay will be adjusted based on an individual's skills, experience, education, and other job-related factors permitted by law, including full-time or part-time status. Total compensation may also include additional forms of incentives. Benefits may be subject to program eligibility.