Research Engineer — Search/IR at firecrawl

Work type: remote

Location: San Francisco, CA (Hybrid) OR Remote (Americas, UTC-3 to UTC-10)

Salary: $180,000 – $290,000/yr

Type: Full-time

Summary

You're a seasoned Research Engineer with 3+ years of hands-on experience building and operating large-scale search and information retrieval systems, particularly those focused on ranking quality, freshness, and speed.

**What makes it worth a look...**

This is a full-time role at Firecrawl, based in San Francisco or fully remote (Americas, UTC-3 to UTC-10), offering a salary of $180,000–$290,000/year and up to 0.15% equity.

**You might be a good fit if you...**

* Have built and operated search indexes at massive scale, handling billions of documents and complex sharding strategies.
* Are proficient with ranking and relevance algorithms, including BM25, learned ranking, and embedding-based retrieval.
* Can own the entire search pipeline from ingestion to serving and have solved freshness, dedup, and incremental indexing challenges.
* Are a self-directed experimenter who independently designs and ships improvements to production.

Job Description

# Research Engineer — Search/IR

You'll own the search and information retrieval systems at the core of Firecrawl — the infrastructure that determines how we find, rank, index, and serve web content at scale. Retrieval quality is Firecrawl's deepest moat. As AI agents increasingly depend on multi-step search and enrichment, the gap between good retrieval and great retrieval compounds. You're the person who closes that gap — and widens it against every competitor. This is a full-stack search role where you'll build and operate everything from ingestion pipelines to serving layers. If you've built search indexes at massive scale and care deeply about ranking quality, freshness, and retrieval speed, this is the role.

Salary Range: $180,000–$290,000/year (Range shown is for U.S.-based employees. Compensation outside the U.S. is adjusted fairly based on your country's cost of living. You can explore how we calculate this here: [https://www.firecrawl.dev/careers/compensation](https://www.firecrawl.dev/careers/compensation).)

Equity Range: Up to 0.15%

Location: San Francisco, CA or Remote (Americas, UTC-3 to UTC-10)

Job Type: Full-Time

Experience: 3+ years building search/IR systems at scale

Visa: US Citizenship/Visa required for SF; N/A for Remote

# About Firecrawl

Firecrawl is the easiest way to extract data from the web. Developers use us to reliably convert URLs into LLM-ready markdown or structured data with a single API call. In just a year, we've hit 8 figures in ARR and 100k+ GitHub stars by building the fastest way for developers to get LLM-ready data.

We're a small, fast-moving, technical team building the essential infrastructure that superintelligence will use to gather data on the web. We ship fast and deep.

# What You'll Do

**Build and operate search indexes at massive scale.** Design, build, and maintain the indexing infrastructure that powers Firecrawl's core product. You'll handle billions of documents and care about every millisecond of latency and every byte of storage.

**Own the full stack from ingestion to serving.** You don't just build one piece; you own the entire pipeline: ingestion, processing, indexing, ranking, query understanding, and serving. When something breaks at 3am, you know where to look because you built it.

**Solve ranking, relevance, and query understanding.** Make sure the right content surfaces for the right queries. You'll build and iterate on ranking models, relevance scoring, and query parsing systems that directly impact product quality.

**Tackle freshness, dedup, and incremental indexing.** The web changes constantly. You'll build systems that keep our index fresh without re-crawling everything, deduplicate content intelligently, and handle incremental updates at scale without rebuilding from scratch.

**Run experiments and ship results to production.** You design experiments, measure results rigorously, and ship winners to production fast. You don't need someone to tell you what to try next; you have a backlog of ideas and the judgment to prioritize them.

**Collaborate closely with the team.** Work directly with the RL-focused Research Engineer and the engineering team to connect search/IR improvements with model training and the broader product roadmap.

# What We're Looking For

**Has built search indexes at massive scale.** Not a tutorial project, but real indexes serving real traffic with real latency requirements. You've dealt with the hard problems: sharding strategies, index compaction, schema evolution, and the operational complexity of keeping billions of documents queryable and fast.

**Hands-on with ranking, relevance, and query understanding.** You've built or meaningfully improved ranking systems. You understand BM25, learned ranking, embedding-based retrieval, and when to use which. You can reason about relevance tradeoffs and you've shipped ranking changes that moved metrics in production.

**Owns the full stack: ingestion → index → serving.** You're not a specialist who only touches one layer. You've built and operated the entire search pipeline, from how documents enter the system to how results get served. You understand the dependencies between layers and make good architectural decisions because you see the whole picture.

**Has solved freshness, dedup, and incremental indexing problems.** You know that building the initial index is the easy part. Keeping it accurate, fresh, and deduplicated at scale is where the real engineering lives. You've built systems that handle continuous updates without full rebuilds and you've debugged the subtle correctness issues that come with incremental processing.

**Self-directed experimenter who ships without handholding.** You generate your own hypotheses, design your own experiments, and ship your own code. You don't wait for a roadmap or a sprint planning meeting. You see what needs to improve, you try something, you measure it, and you ship it if it works.

**Backgrounds that tend to do well:**

* Search engineers at companies with large-scale indexes (web search, e-commerce, document search).
* IR researchers who've shipped their work to production.
* Infrastructure engineers who've built and operated real-time indexing pipelines.
* Engineers from Elasticsearch, Algolia, Vespa, or similar search infrastructure teams who got frustrated that they could only tune the knobs and wanted to build the engine.

# What We're NOT Looking For

**Search users, not search builders.** If your experience is configuring Elasticsearch or tuning Solr queries but you haven't built search infrastructure from scratch, this isn't the right role. We need someone who builds the engine.

**Researchers who don't ship.** If your best search/IR work lives in a paper and you've never deployed a ranking model to production, this isn't it. Every experiment here ends with code running in prod.

**Engineers who only work on one layer.** If you only do indexing, or only do ranking, or only do serving, and you're not interested in owning the full stack, you'll be frustrated here. We need someone who sees the whole pipeline and can work anywhere in it.

**People who need clean infrastructure to be productive.** The systems you'll work on are evolving fast. If you need everything to be perfectly abstracted and well-documented before you can contribute, you'll stall. We need someone who can build and improve infrastructure while shipping on it.

# A Note On Pace

We operate at an absurd level of urgency because the window for what we're building won't stay open forever. If that excites you, keep reading. If it doesn't, no hard feelings — but this role probably isn't for you.

# Benefits & Perks

## Available to all employees

## Available to US-based full-time employees

## Available to SF-based employees

# Interview Process

**Application Review.** Send us your work and a quick note on why this excites you. Show us what you've built: search systems, indexing pipelines, ranking improvements. We care about what you've shipped, not where you went to school.

**Intro Chat (~20 min).** A quick conversation to get to know each other before we go deep. We'll talk about what you've been working on, what drew you to Firecrawl, and what you're looking for in your next role. Time for your questions too.

**Technical Deep Dive (~60 min).** Go deep on search/IR systems you've built: architecture decisions, scale challenges, ranking approaches, and production tradeoffs. We'll explore a live problem: how you'd approach a real search/indexing challenge at Firecrawl's scale. We're looking for depth across the full stack, production instincts, and the ability to reason about tradeoffs under constraints.

**Founder Chat (~30 min).** Culture, pace, ownership, and how you like to work. Time for your questions too.

**Paid Work Trial (1–2 weeks).** Tackle a real search/IR problem with production implications. We evaluate on technical depth, experimentation rigor, and how fast you ship something meaningful.

**Decision.** We move fast after the trial.

If you've built search systems at scale and want to work on one of the most interesting web data problems in AI infrastructure — this is your shot.

👉 Apply now.
