Designing the Esy Research Pipeline — From Question to Cited Artifact
A deep dive into how Esy's workflow engine transforms a research question into a fully cited, structured artifact — the pipeline stages, LLM selection per stage, and quality gates.
The promise of Esy is simple: you provide a research question, and you receive a structured, cited artifact. The engineering behind that promise is anything but simple. This post breaks down the complete research pipeline — every stage, every decision point, and every quality gate between input and output.
The Pipeline
Every Esy workflow follows the same macro-structure, though the specifics vary by template:
- Intake — Parse the user's input, extract intent, identify constraints
- Research — Source discovery, relevance scoring, fact extraction
- Outline — Structural design based on artifact type and content
- Draft — Content generation within the structural framework
- Cite & Format — Citation verification, format compliance, consistency checks
- Artifact — Final assembly, export formatting, metadata attachment
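The macro-structure above can be sketched as an ordered chain of stage functions passing a shared context downstream. Everything here is illustrative — `Context`, the stage names, and the stub bodies are hypothetical, not Esy's actual internals:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Context:
    """Accumulates state as the question moves through the pipeline."""
    question: str
    data: dict = field(default_factory=dict)

def intake(ctx: Context) -> Context:
    # Parse input, extract intent, identify constraints (stubbed).
    ctx.data["intent"] = ctx.question.strip().rstrip("?")
    return ctx

def research(ctx: Context) -> Context:
    # Source discovery and relevance scoring would happen here.
    ctx.data["sources"] = ["source-1", "source-2", "source-3"]
    return ctx

# Outline, draft, cite, and artifact stages would follow the same shape.
STAGES: list[Callable[[Context], Context]] = [intake, research]

def run_pipeline(question: str) -> Context:
    ctx = Context(question=question)
    for stage in STAGES:
        ctx = stage(ctx)
    return ctx
```

The linear shape is the point: each stage consumes exactly what the previous stage produced, which is what makes per-stage validation (discussed below) possible.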
LLM Selection Per Stage
Not every stage benefits from the same model. Research-heavy stages need models with strong factual grounding and source awareness. Drafting stages need models with strong prose quality. Citation stages need precision and consistency over creativity.
At Esy, different pipeline stages use different models — and this isn't about cost optimization. It's about quality optimization. A model that writes beautiful prose might hallucinate citations. A model that's rigorous about facts might produce stilted writing.
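One way to express that tradeoff in code is a per-stage routing table mapping each stage to a model and sampling settings. The model names and temperature values below are purely illustrative placeholders, not Esy's actual configuration:

```python
# Hypothetical routing table: creative stages get higher temperature,
# precision stages get lower. Model names are invented for illustration.
STAGE_MODELS = {
    "research": {"model": "fact-grounded-xl", "temperature": 0.2},
    "draft":    {"model": "prose-writer-lg",  "temperature": 0.7},
    "cite":     {"model": "precision-sm",     "temperature": 0.0},
}

def model_for(stage: str) -> dict:
    # Unlisted stages fall back to the most conservative settings.
    return STAGE_MODELS.get(stage, {"model": "precision-sm", "temperature": 0.0})
```

Centralizing the mapping like this also makes it cheap to A/B a new model on a single stage without touching the rest of the pipeline.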
Quality Gates
Between each stage, a validation step checks the output before passing it downstream:
- After Research: Are sources real? Are they relevant? Is there sufficient coverage?
- After Outline: Does the structure match the artifact type? Are all sources assigned?
- After Draft: Does the content follow the outline? Are claims supported?
- After Cite: Do all citations resolve? Is the format consistent?
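A gate is simplest to reason about as a function that either passes the stage's output through or raises. This sketch covers the first two gates only; the thresholds and field names are assumptions for illustration:

```python
class GateError(Exception):
    """Raised when a stage's output fails validation."""

def gate_research(output: dict) -> None:
    # Sources must exist and meet a minimum coverage threshold
    # (the threshold of 3 is an arbitrary illustrative choice).
    sources = output.get("sources", [])
    if not sources:
        raise GateError("research: no sources found")
    if len(sources) < 3:
        raise GateError("research: insufficient source coverage")

def gate_outline(output: dict) -> None:
    # Every discovered source must be assigned to at least one section.
    assigned = {s for section in output.get("sections", [])
                for s in section.get("sources", [])}
    unassigned = set(output.get("sources", [])) - assigned
    if unassigned:
        raise GateError(f"outline: unassigned sources {sorted(unassigned)}")
```

Raising instead of returning a boolean means a failed gate halts the pipeline by default, so bad output can never silently flow downstream.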
These gates are the difference between a demo and a product. Without them, you get impressive-looking output that falls apart under scrutiny.
The Hard Problem: Citation Grounding
The single hardest engineering problem in this pipeline is citation grounding — ensuring that when the artifact says "According to Smith et al. (2024)," there is a real Smith et al. (2024) paper, and that it actually says what the artifact claims. This is where most AI writing tools fail, and it's where Esy invests the most engineering effort.