TL;DR
Think of Unbiased as an X-ray for news articles: it scans a piece for political slant, factual claims, and potential bias, then returns a strict JSON report. The hard part was not wiring an LLM. The hard part was defining what slant, bias, and balance mean, then encoding those rules so outputs stay stable.
Live: https://unbiased.adriancares.com/
Code: https://github.com/adriansprk/unbiased
Final Prompt: https://github.com/adriansprk/unbiased/blob/main/docs/Prompt.md
What users see
- Paste a URL and receive a structured report: slant, 5-10 factual claims with quotes, and a bias analysis across five dimensions.
- Output can be in English or German.
Builder’s view: the real challenge
The problem
Labels like Liberal/Progressive, Conservative, Balanced, or Biased are contested. Without operational rules, a model drifts to its priors and answers differently on the same article.
What I operationalized
- Separate axes: slant direction and bias quality are judged independently.
- Evidence discipline: claims must be verifiable and backed by verbatim quotes.
- Five fixed lenses: Word Choice / Tone, Framing / Emphasis, Source Selection, Fairness / Balance, Headline / Title Bias. Each has a short summary and one of four statuses: Balanced, Caution, Biased, Unknown.
- Single contract: exactly three top-level JSON keys: `slant`, `claims`, `report`. The UI can trust the shape (sketched below).
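
A minimal sketch of that contract as TypeScript types. The three top-level keys, the 5-10 claim count, the five lenses, and the status and bias-level labels all come from the rules above; the individual field names are my own placeholders, not the authoritative shape (that lives in the prompt spec linked at the top).

```ts
// Sketch of the three-key contract. Field names are illustrative.
type DimensionStatus = "Balanced" | "Caution" | "Biased" | "Unknown";
type BiasLevel = "Low" | "Moderate" | "High";

interface Claim {
  statement: string;    // the factual, verifiable claim
  quote: string;        // verbatim supporting quote from the article
  significance: string; // short note on why the claim matters
}

interface DimensionSummary {
  status: DimensionStatus;
  summary: string;
}

interface AnalysisResult {
  slant: string;   // one of the defined slant categories from the prompt
  claims: Claim[]; // 5-10 entries, each backed by a quote
  report: {
    overallAssessment: string;
    overallBiasLevel: BiasLevel;
    dimensions: {
      wordChoiceTone: DimensionSummary;
      framingEmphasis: DimensionSummary;
      sourceSelection: DimensionSummary;
      fairnessBalance: DimensionSummary;
      headlineTitleBias: DimensionSummary;
    };
  };
}
```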
Why it matters
This is editorial product design. The prompt acts like a rulebook that turns judgment into a process. Less roll of the dice. More referee with a whistle.
How the current prompt works
- Flow:
  - Classify slant from a defined list.
  - Extract 5-10 factual claims with quotes and a short significance note.
  - Produce a bias report with an overall assessment, five dimension summaries, and an overall bias level: Low, Moderate, or High.
- Scope: Judge only the provided text, not outlet reputation.
- Format: Output must be a single valid JSON object. No extra prose (example below).
- Language rule: Explanations can be EN or DE. Keys and status labels remain English.
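
An invented, abbreviated example of a conforming response, reusing the types from the contract sketch above. It shows the language rule in action (German prose, English keys and labels) and the lenses moving independently: word choice reads Balanced while source selection is flagged.

```ts
// Illustrative output only; the claim and summaries are made up.
const example: AnalysisResult = {
  slant: "Conservative",
  claims: [
    {
      statement: "Die Arbeitslosenquote sank im März auf 5,9 Prozent.",
      quote: "Die Quote fiel im März auf 5,9 Prozent, wie die Behörde mitteilte.",
      significance: "Zentrale Zahl, auf der die Argumentation des Artikels aufbaut.",
    },
    // ...4-9 further claims
  ],
  report: {
    overallAssessment: "Überwiegend sachlich, aber einseitige Quellenauswahl.",
    overallBiasLevel: "Moderate",
    dimensions: {
      wordChoiceTone: { status: "Balanced", summary: "Neutrale Wortwahl." },
      framingEmphasis: { status: "Caution", summary: "Kritikpunkte erst am Ende." },
      sourceSelection: { status: "Biased", summary: "Nur regierungsnahe Stimmen zitiert." },
      fairnessBalance: { status: "Caution", summary: "Gegenposition nur kurz erwähnt." },
      headlineTitleBias: { status: "Balanced", summary: "Titel deckt den Inhalt." },
    },
  },
};
```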
Evaluation loop
- Seed set: a mix of straight news, analysis, and opinion across topics.
- Checks:
  - Label stability: does the same piece keep the same labels across runs? (Sketched after this list.)
  - Evidence alignment: do quotes support the extracted claims?
  - Dimension coverage: are statuses justified by the summaries?
- Goal: lower variance, fewer vibe-only labels, honest use of Unknown when a dimension cannot be assessed.
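
A sketch of the stability check, assuming a hypothetical `analyzeArticle` wrapper around the prompt call and the `AnalysisResult` type from the contract sketch: run the same article several times and ask whether the headline labels stay put.

```ts
// Hypothetical wrapper around the prompt call, returning the parsed contract.
declare function analyzeArticle(text: string): Promise<AnalysisResult>;

// Label stability: the same piece should keep the same labels across runs.
async function isLabelStable(text: string, runs = 5): Promise<boolean> {
  const results: AnalysisResult[] = [];
  for (let i = 0; i < runs; i++) {
    results.push(await analyzeArticle(text));
  }
  const slants = new Set(results.map((r) => r.slant));
  const levels = new Set(results.map((r) => r.report.overallBiasLevel));
  // Stable means every run agrees on slant and overall bias level.
  return slants.size === 1 && levels.size === 1;
}
```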
Product decisions and trade-offs
- Strict schema over free text: better consistency and simpler rendering.
- Evidence before judgment: claims first, labels second.
- Granular lenses: tone can be balanced while sources are skewed. The model must say which.
- Conservative defaults: when text is thin, prefer Unknown instead of confident guesses.
Stack
- Frontend: Next.js, React, Tailwind
- Backend: Node and TypeScript with a queue and worker for analysis jobs (sketched below)
- LLM: provider-agnostic. The prompt and schema are the contract.
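
The post doesn't name a queue library, so here is a minimal sketch assuming BullMQ on Redis; `fetchArticleText`, `analyzeArticle`, and `saveReport` are hypothetical helpers, and `AnalysisResult` is the type from the contract sketch.

```ts
import { Queue, Worker } from "bullmq";

// Hypothetical helpers standing in for the real implementation.
declare function fetchArticleText(url: string): Promise<string>;
declare function analyzeArticle(text: string): Promise<AnalysisResult>;
declare function saveReport(url: string, result: AnalysisResult): Promise<void>;

const connection = { host: "localhost", port: 6379 };

// API side: enqueue an analysis job when a URL is submitted.
const analysisQueue = new Queue("analysis", { connection });
function enqueueAnalysis(url: string) {
  return analysisQueue.add("analyze", { url });
}

// Worker side: fetch the article, run the analysis, persist the report.
new Worker(
  "analysis",
  async (job) => {
    const text = await fetchArticleText(job.data.url);
    const result = await analyzeArticle(text);
    await saveReport(job.data.url, result);
  },
  { connection }
);
```

The split keeps the slow LLM call off the request path: the page can poll for the finished report while the worker grinds through the job.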
Potential improvements
These are future enhancements.
- Insufficient evidence flag: Add a boolean for partial extractions or cases that cannot be judged meaningfully.
- Evidence-linked rationales: Let slant and bias rationales reference claim IDs or character spans. Tie judgments to exact quotes.
- Determinism and versioning: Standardize runtime settings and stamp outputs with a prompt spec version.
- Genre detection first: Identify news vs analysis vs opinion, then adjust fairness expectations.
- Sharper category semantics: Add short operational definitions and micro-examples for each slant category.
- Self-consistency checks: Require at least one detailed finding for every dimension marked Caution or Biased. Auto-downgrade to Unknown if missing.
- Pre-flight schema validator: Enforce the JSON contract and auto-retry when fields or counts are invalid (see the sketch after this list).
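
A sketch of that last item, assuming Zod and the field names from the contract sketch above: validate the shape and the 5-10 claim count before anything reaches the UI, and retry the call on failure. The same pass would be a natural home for the self-consistency downgrade to Unknown.

```ts
import { z } from "zod";

const StatusSchema = z.enum(["Balanced", "Caution", "Biased", "Unknown"]);
const DimensionSchema = z.object({ status: StatusSchema, summary: z.string().min(1) });

// Zod mirror of the assumed contract, including the 5-10 claim count.
const ResultSchema = z.object({
  slant: z.string().min(1),
  claims: z
    .array(z.object({ statement: z.string(), quote: z.string(), significance: z.string() }))
    .min(5)
    .max(10),
  report: z.object({
    overallAssessment: z.string(),
    overallBiasLevel: z.enum(["Low", "Moderate", "High"]),
    dimensions: z.object({
      wordChoiceTone: DimensionSchema,
      framingEmphasis: DimensionSchema,
      sourceSelection: DimensionSchema,
      fairnessBalance: DimensionSchema,
      headlineTitleBias: DimensionSchema,
    }),
  }),
});

// Hypothetical retry loop around the raw LLM call.
declare function callLlm(text: string): Promise<unknown>;

async function analyzeWithRetry(text: string, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await callLlm(text); // assumed to return already-parsed JSON
    const parsed = ResultSchema.safeParse(raw);
    if (parsed.success) return parsed.data;
  }
  throw new Error("Analysis failed schema validation after retries");
}
```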
PM takeaways
- Definitions are the API. A crisp taxonomy beats clever prompts.
- Auditability builds trust. If you cannot quote it, do not claim it.
- Stability is a feature. A steady compass matters more than a flashy map.