Designing the Esy Research Pipeline — From Question to Cited Artifact
A deep dive into how Esy's workflow engine transforms a research question into a fully cited, structured artifact — the pipeline stages, LLM selection per stage, and quality gates.
The promise of Esy is simple: you provide a research question, and you receive a structured, cited artifact. The engineering behind that promise is anything but simple. This post breaks down the complete research pipeline — every stage, every decision point, and every quality gate between input and output.
The Pipeline
Every Esy workflow follows the same macro-structure, though the specifics vary by template:
- Intake — Parse the user's input, extract intent, identify constraints
- Research — Source discovery, relevance scoring, fact extraction
- Outline — Structural design based on artifact type and content
- Draft — Content generation within the structural framework
- Cite & Format — Citation verification, format compliance, consistency checks
- Artifact — Final assembly, export formatting, metadata attachment
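The macro-structure above can be sketched as an ordered chain of stage functions passing a shared context downstream. Everything here is illustrative — `Context`, the stage names, and the stub bodies are hypothetical, not Esy's actual internals:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Context:
    """Accumulates state as the question moves through the pipeline."""
    question: str
    data: dict = field(default_factory=dict)

def intake(ctx: Context) -> Context:
    # Parse input, extract intent, identify constraints (stubbed).
    ctx.data["intent"] = ctx.question.strip().rstrip("?")
    return ctx

def research(ctx: Context) -> Context:
    # Source discovery and relevance scoring would happen here.
    ctx.data["sources"] = ["source-1", "source-2", "source-3"]
    return ctx

# Outline, draft, cite, and artifact stages would follow the same shape.
STAGES: list[Callable[[Context], Context]] = [intake, research]

def run_pipeline(question: str) -> Context:
    ctx = Context(question=question)
    for stage in STAGES:
        ctx = stage(ctx)
    return ctx
```

The linear shape is the point: each stage consumes exactly what the previous stage produced, which is what makes per-stage validation (discussed below) possible.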
LLM Selection Per Stage
Not every stage benefits from the same model. Research-heavy stages need models with strong factual grounding and source awareness. Drafting stages need models with strong prose quality. Citation stages need precision and consistency over creativity.
At Esy, different pipeline stages use different models — and this isn't about cost optimization. It's about quality optimization. A model that writes beautiful prose might hallucinate citations. A model that's rigorous about facts might produce stilted writing.
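One way to express that tradeoff in code is a per-stage routing table mapping each stage to a model and sampling settings. The model names and temperature values below are purely illustrative placeholders, not Esy's actual configuration:

```python
# Hypothetical routing table: creative stages get higher temperature,
# precision stages get lower. Model names are invented for illustration.
STAGE_MODELS = {
    "research": {"model": "fact-grounded-xl", "temperature": 0.2},
    "draft":    {"model": "prose-writer-lg",  "temperature": 0.7},
    "cite":     {"model": "precision-sm",     "temperature": 0.0},
}

def model_for(stage: str) -> dict:
    # Unlisted stages fall back to the most conservative settings.
    return STAGE_MODELS.get(stage, {"model": "precision-sm", "temperature": 0.0})
```

Centralizing the mapping like this also makes it cheap to A/B a new model on a single stage without touching the rest of the pipeline.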
Quality Gates
Between each stage, a validation step checks the output before passing it downstream:
- After Research: Are sources real? Are they relevant? Is there sufficient coverage?
- After Outline: Does the structure match the artifact type? Are all sources assigned?
- After Draft: Does the content follow the outline? Are claims supported?
- After Cite: Do all citations resolve? Is the format consistent?
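A gate is simplest to reason about as a function that either passes the stage's output through or raises. This sketch covers the first two gates only; the thresholds and field names are assumptions for illustration:

```python
class GateError(Exception):
    """Raised when a stage's output fails validation."""

def gate_research(output: dict) -> None:
    # Sources must exist and meet a minimum coverage threshold
    # (the threshold of 3 is an arbitrary illustrative choice).
    sources = output.get("sources", [])
    if not sources:
        raise GateError("research: no sources found")
    if len(sources) < 3:
        raise GateError("research: insufficient source coverage")

def gate_outline(output: dict) -> None:
    # Every discovered source must be assigned to at least one section.
    assigned = {s for section in output.get("sections", [])
                for s in section.get("sources", [])}
    unassigned = set(output.get("sources", [])) - assigned
    if unassigned:
        raise GateError(f"outline: unassigned sources {sorted(unassigned)}")
```

Raising instead of returning a boolean means a failed gate halts the pipeline by default, so bad output can never silently flow downstream.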
These gates are the difference between a demo and a product. Without them, you get impressive-looking output that falls apart under scrutiny.
The Hard Problem: Citation Grounding
The single hardest engineering problem in this pipeline is citation grounding — ensuring that when the artifact says "According to Smith et al. (2024)," there is a real Smith et al. (2024) paper, and that it actually says what the artifact claims. This is where most AI writing tools fail, and it's where Esy invests the most engineering effort.