Senior Machine Learning Engineer (Research Scientist) - Data Foundation & AI at Plaid

Work type: hybrid

Location: San Francisco | New York | Seattle, WA

Salary: $228,960 – $315,360/yr

Type: Full-time

Summary

You are a seasoned machine learning researcher with deep expertise in large-scale model training and production-grade software development. You have a proven track record of moving complex models from experimental prototypes into real-world systems.

**What makes it worth a look...**

Plaid offers a competitive salary range of $228,960 to $315,360 per year for this hybrid role located in San Francisco, New York, or Seattle. You will get the rare opportunity to build foundation models on one of the most extensive and unique proprietary financial datasets in existence.

**You might be a good fit if you...**

* Possess deep technical knowledge of Transformers, LLMs, and representation learning.
* Have hands-on experience with distributed training and building production ML pipelines.
* Are proficient in Python and possess strong general software engineering fundamentals.
* Have successfully deployed ML models into production environments rather than just building prototypes.

Job Description

We believe that the way people interact with their finances will drastically improve in the next few years. We’re dedicated to empowering this transformation by building the tools and experiences that thousands of developers use to create their own products. Plaid powers the tools millions of people rely on to live a healthier financial life. We work with thousands of companies like Venmo, SoFi, several of the Fortune 500, and many of the largest banks to make it easy for people to connect their financial accounts to the apps and services they want to use. Plaid’s network covers 12,000 financial institutions across the US, Canada, UK and Europe. Founded in 2013, the company is headquartered in San Francisco with offices in New York, Washington D.C., London and Amsterdam.

The Data Foundation and AI team within Plaid’s Data organization builds and maintains the shared machine learning and AI infrastructure that powers capabilities across Plaid’s product suite. The team transforms Plaid’s unique financial network data into general-purpose representations that can be leveraged by teams across the company. They are responsible for the full lifecycle of these systems, including pretraining data curation, model development and training, as well as production deployment, serving, and ongoing monitoring.

As a Senior Research Scientist on the Data Foundation and AI team, you will lead applied research on Plaid’s foundation model by designing model architectures, pretraining objectives, and fine-tuning strategies that generalize across a wide range of downstream product use cases. You will also build and maintain end-to-end production machine learning systems, including training pipelines, model serving infrastructure, feature engineering, and monitoring. In addition, you will develop robust evaluation frameworks to assess model performance across diverse tasks, ensuring quality beyond single-metric optimization.

## Responsibilities

Building a foundation model on one of the world’s richest financial datasets, one that no one else has.

Doing research that ships: moving from experimentation and prototypes to production systems serving real customers.

Working across the full ML stack, from pretraining objectives and architectures to serving infrastructure and monitoring.

Collaborating with a high-caliber team and seeing your work amplify the capabilities of multiple product teams.

Helping hundreds of millions of consumers achieve greater financial freedom through data-driven products.

## Qualifications

Strong applied ML research skills with production delivery experience.

Depth in Transformers/LLMs, representation learning, or large-scale model training.

Demonstrated ability to ship models to production (not just prototype).

Distributed training experience and strong Python + software engineering fundamentals.

Fintech / financial data domain experience is a plus.

External publications or open-source contributions are a plus.
