Consensus
An AI that reads the scientific literature and tells you where it actually lands.
Someone at dinner claims that intermittent fasting extends lifespan. You want to know what the science actually says without reading twenty papers. You are not going to be a lifelong expert on fasting. You need calibrated, cited, five-minute competence.
Why this tool matters
Consensus is a search engine built specifically for empirical research questions. You ask a yes/no empirical question (Does X cause Y? Does Z work?), and Consensus searches the peer-reviewed literature, extracts claims from each paper, and shows you, for each paper, whether the authors found yes, no, or mixed / possibly. A meter at the top shows the overall direction of the evidence.
This is a specialized tool with a specialized use case: reasoning about empirical questions you care about but are not an expert on. It is particularly good at the class of questions where the general internet will confidently give you whichever answer you want to hear — nutrition, exercise, sleep, supplements, psychological interventions, educational practices. Consensus forces the evidence to answer for itself.
It is not a replacement for Elicit. Elicit is for structured data extraction across a literature; Consensus is for stance classification on a specific empirical claim. Most research workflows use both, at different moments.
Setup
Account: free at consensus.app. The free tier rate-limits AI features (summaries, the Consensus Meter) but allows unlimited searches. The paid tier starts at around $9/month for students.
Best for: questions that have the shape “Does X cause Y?” or “Is Z effective for W?”. Empirical, testable, with a literature.
Walkthrough
Step 1: Ask an empirical yes/no question
Go to consensus.app. In the search bar, type a question phrased in natural language: Does meditation reduce anxiety? Is reading to children effective? Does creatine improve muscle growth?
Step 2: Read the Consensus Meter
At the top of the results, Consensus shows a meter: how many of the top papers say yes, no, or possibly. This is an orientation — not an answer. The meter is based on the top 20 papers and their extracted claim classifications.
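To make the arithmetic behind a meter like this concrete, here is a minimal sketch that tallies stance labels into percentage shares. It is not Consensus's actual computation, which the tool does not expose here; the function name `tally_stances` and the example labels are hypothetical, standing in for labels you would copy by hand from a results page.

```python
from collections import Counter


def tally_stances(stances):
    """Return each stance label's share of the total, as a percentage."""
    counts = Counter(stances)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Hypothetical labels for 20 papers (illustrative only, not real results).
example = ["yes"] * 12 + ["possibly"] * 5 + ["no"] * 3
print(tally_stances(example))
```

A tally like this only counts labels. It says nothing about how strong each paper is, which is why the meter works as orientation rather than as an answer.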
Step 3: Skim the top ten findings
Below the meter, each paper is represented with the extracted claim, the classification (yes / no / possibly), the publication venue, and the sample size. This is dense and useful — the page is almost a literature scan on its own.
Step 4: Click into a representative paper
Pick a paper whose classification matches the meter's majority direction. Click to see the full abstract. Consensus highlights the exact sentence it extracted the claim from. Read the abstract and check whether the paper actually says what the extracted claim says.
Step 5: Use the Consensus Summary
For questions with enough papers, Consensus offers a Consensus Summary — a single paragraph that reconciles the top results. Treat it as a hypothesis, not a conclusion; check it against two or three of the cited papers.
Your turn
Basic: Resolve one health myth
Pick a health or wellness claim you have heard often and are not sure about: intermittent fasting and longevity, blue light and sleep, omega-3s and depression, cold exposure and cognition. Search it on Consensus. Read the meter, skim the top five findings, and click into the highest-impact paper.
In three sentences, write what the evidence actually says — acknowledging uncertainty where it exists.
Advanced: Build a position paragraph with evidence
Pick an empirical question that matters for a real decision in your life or work. Run Consensus on it. Open five of the top papers. For each: note the effect size, the sample, the population studied, and the key limitation.
Then write a 300-word position paragraph that (a) states the evidence-based answer, (b) acknowledges the limitations that the Consensus Meter smooths over, and (c) makes a calibrated recommendation. This is how professional analysts actually write.
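One way to keep the per-paper notes from this exercise honest is to record them in a fixed structure, so a skipped field is obvious. The sketch below is only a suggestion: the `PaperNote` class, its field names, and the helper functions are hypothetical, and the sample values are placeholders rather than real findings.

```python
from dataclasses import dataclass, fields


@dataclass
class PaperNote:
    title: str
    effect_size: str      # as reported by the paper, e.g. "d = 0.3"
    sample: str           # e.g. "n = 120"
    population: str       # who was studied
    key_limitation: str   # the main caveat you would flag


def missing_fields(note):
    """List the names of any fields left blank in a note."""
    return [f.name for f in fields(note) if not getattr(note, f.name).strip()]


def word_count(text):
    """Count whitespace-separated words, to check the 300-word target."""
    return len(text.split())


# Placeholder note with the limitation not yet filled in.
note = PaperNote("Placeholder title", "d = 0.3", "n = 120", "adults", "")
print(missing_fields(note))
```

Running `missing_fields` on all five notes before you start writing shows which papers still need a closer read.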
Pitfalls and pro tips
The Meter is a sketch, not a conclusion. It classifies the stated claim of each paper, not the quality of the evidence behind it. Five weak studies saying “yes” can tilt it in the wrong direction. Always read the top two papers yourself.
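The sketch below illustrates how a count of stated claims can diverge from a view that accounts for study size. It is an assumption-laden toy: the `Finding` class and both functions are hypothetical, the numbers are invented, and sample size is only a crude stand-in for evidence quality. Consensus is not known to weight results this way.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    stance: str        # "yes", "no", or "possibly"
    sample_size: int   # participants reported by the paper


def share_yes_by_count(findings):
    """Fraction of papers whose stated claim is 'yes'."""
    return sum(f.stance == "yes" for f in findings) / len(findings)


def share_yes_by_sample(findings):
    """Fraction of all participants who sit in 'yes' papers."""
    total = sum(f.sample_size for f in findings)
    yes = sum(f.sample_size for f in findings if f.stance == "yes")
    return yes / total


# Invented example: five small studies say yes, one large trial says no.
findings = [Finding("yes", 20) for _ in range(5)] + [Finding("no", 900)]
print(round(share_yes_by_count(findings), 2))
print(round(share_yes_by_sample(findings), 2))
```

By paper count, the evidence looks strongly positive. By participants, most of the data sits in the one study that found no effect. Neither number is the truth, but the gap between them is a signal to open the papers.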
Question phrasing matters. “Does X cause Y?” and “Does X reduce Y?” can return different classifications from the same papers. Try rephrasing to see whether the direction of the meter holds up.
Publication bias in the literature. Consensus reflects the published literature, which has its own biases (novelty bias, positive-result bias, funding-source bias). A strongly tilted Consensus Meter is no guarantee that the real world agrees.
How it compares
Consensus is unusual. The closest competitor is Scite, which classifies whether later papers' citations of a given paper support, contrast with, or merely mention it: a different but related classification problem. Elicit (Day 3) does structured extraction but does not classify stance by default. For a yes/no empirical question, Consensus remains the sharpest tool we have.
When to use — and when not to
Use Consensus when the question you are chewing on is empirical and has a literature (medicine, psychology, nutrition, education, exercise science). It is particularly valuable for health and wellness claims, where internet sources are noisy.
Do not use Consensus when the question is theoretical, legal, historical, or a matter of design choice — these do not have a “meter” in the same empirical sense.