Why you can trust this
Every summary on this site is generated by a language model. Language models are fluent. They produce confident, well-structured text. They can also make things up [1]. Current estimates put hallucination rates at 3–27% depending on the model and task [2]. We can't eliminate that entirely. But we can make the credibility mechanisms overt: visible, inspectable, and verifiable by you. Not "trust us." See for yourself.
AI transparency
All summaries on ovr.news are generated by AI using Gemma 3 (27B), an open-weight model running on our own hardware. Translation uses DeepL and Gemini Flash (Google) as cloud services. Each article shows an AI summary label. AI can make mistakes. That's why we always link to the original article, so you can verify.
Layer 1: Ground the model
The most effective defense against hallucination is also the simplest: constrain what the model may say. Every summary prompt includes explicit grounding instructions:
Use facts from the article. Do not invent statistics, quotes, or claims. Do not add context, background, or interpretation beyond what the source article states.
Research on retrieval-augmented generation shows that anchoring responses in source documents substantially reduces hallucination rates [3]. Our prompts also ban words that signal editorializing rather than reporting: "groundbreaking," "innovative," "significant," "highlights," "showcases," "underscores."
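As a rough illustration of how such a prompt could be assembled, here is a minimal sketch. The grounding text and the banned-word list are taken from this page; the prompt layout, the function names `build_prompt` and `find_banned_words`, and the idea of re-checking a finished summary for banned words are assumptions made for the example, not a description of the production pipeline.

```python
import re

# Grounding instructions quoted from this page.
GROUNDING_RULES = (
    "Use facts from the article. Do not invent statistics, quotes, or claims. "
    "Do not add context, background, or interpretation beyond what the "
    "source article states."
)

# Words this page lists as signals of editorializing.
BANNED_WORDS = [
    "groundbreaking", "innovative", "significant",
    "highlights", "showcases", "underscores",
]


def build_prompt(article_text: str) -> str:
    """Hypothetical prompt layout: rules first, then the article."""
    banned = ", ".join(BANNED_WORDS)
    return (
        f"{GROUNDING_RULES}\n"
        f"Do not use these words: {banned}.\n\n"
        f"Article:\n{article_text}\n\nSummary:"
    )


def find_banned_words(summary: str) -> list[str]:
    """Return banned words found in a summary (whole word, case-insensitive).

    This post-check is an assumption of the sketch, not a documented step.
    """
    return [
        word for word in BANNED_WORDS
        if re.search(rf"\b{re.escape(word)}\b", summary, flags=re.IGNORECASE)
    ]
```

A check like `find_banned_words` would let a pipeline reject or regenerate a summary that slipped past the prompt instructions.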
Layer 2: Lower the temperature
Language models have a parameter called temperature that controls randomness. Higher values produce more creative, more drift-prone output. Lower values keep the model closer to its input. A 2026 study across 172 billion tokens found that hallucination rates increase measurably with temperature [4]. Our summaries run at 0.7: close to the source material while allowing enough flexibility for natural phrasing.
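To make the effect of temperature concrete, the sketch below applies the standard temperature-scaled softmax to a toy set of scores. The scores are made up for illustration, and the function name is our own. Real models apply the same scaling across a vocabulary of many thousands of tokens.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores to probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Toy scores for three candidate next tokens (made-up numbers).
logits = [2.0, 1.0, 0.0]
low = softmax_with_temperature(logits, 0.7)
high = softmax_with_temperature(logits, 1.5)

# The most likely candidate gets a larger share at the lower temperature,
# so sampling strays from it less often.
print([round(p, 2) for p in low])
print([round(p, 2) for p in high])
```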
We also require the model to reason through the prompt constraints before generating output. Rather than jumping straight to fluent text, it first works through the rules: what the article says, what the grounding instructions allow, and what the word limits require.
Layer 3: Clean the input
A model can only be as faithful as its input. Before any article reaches the language model, it passes through a content quality gate — density-based heuristics that check for:
- CSS and HTML leakage — if more than 3–5% of the text is markup, the extraction failed. Rejected.
- Cookie banners — short articles dominated by consent language. Rejected.
- Paywall stubs — articles under 800 characters with phrases like "subscribe to continue." Rejected.
- Navigation debris — when most lines are shorter than 30 characters, you're reading a menu, not an article. Rejected.
Articles that fail the quality gate never reach the language model. Better to show nothing than a confident summary of garbage.
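The checks above could look something like the following sketch. The 800-character and 30-character figures come from this page, and the 5% markup cutoff is the upper end of the range given above. The regular expression, the phrase lists, the consent-hit count, and the "more than half the lines" rule are illustrative assumptions, and `quality_gate` is a hypothetical name.

```python
import re

# Matches HTML tags and simple CSS blocks. Illustrative, not exhaustive.
MARKUP_PATTERN = re.compile(r"<[^>]+>|\{[^{}]*\}")
PAYWALL_PHRASES = ("subscribe to continue",)                # phrase from this page
CONSENT_PHRASES = ("cookie", "consent", "privacy settings")  # assumed list


def quality_gate(text: str) -> tuple[bool, str]:
    """Return (passed, reason) for an extracted article text."""
    stripped = text.strip()
    if not stripped:
        return False, "empty"
    lowered = stripped.lower()

    # CSS/HTML leakage: share of characters inside tags or CSS blocks.
    markup_chars = sum(len(m.group()) for m in MARKUP_PATTERN.finditer(stripped))
    if markup_chars / len(stripped) > 0.05:
        return False, "markup leakage"

    # Paywall stub: short text containing a paywall phrase.
    if len(stripped) < 800 and any(p in lowered for p in PAYWALL_PHRASES):
        return False, "paywall stub"

    # Cookie banner: short text dominated by consent language (assumed cutoff).
    consent_hits = sum(lowered.count(p) for p in CONSENT_PHRASES)
    if len(stripped) < 800 and consent_hits >= 3:
        return False, "cookie banner"

    # Navigation debris: most non-empty lines are shorter than 30 characters.
    lines = [ln for ln in stripped.splitlines() if ln.strip()]
    short = sum(1 for ln in lines if len(ln.strip()) < 30)
    if len(lines) > 1 and short / len(lines) > 0.5:
        return False, "navigation debris"

    return True, "ok"
```

Returning a reason alongside the verdict keeps rejections inspectable, in the same spirit as the rest of the pipeline.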
Layers 1 through 3 are about prevention. The next three layers are about verification: making it possible for you to check.
Layer 4: Link to the source
Every article on this site has a "Read Original" link. The original URL, the source name, and the publication domain are preserved from the moment an article enters the pipeline to the moment it appears on your screen.
Research on AI transparency shows that source attribution is one of the strongest predictors of user trust [5]. When other outlets report the same story, we show those too. Independent corroboration from multiple sources is a stronger trust signal than any single summary.
Layer 5: Show the scores
Every article shows its weighted average score. Open the article, and you can see which dimensions were scored and how. The filter definitions are published on GitHub, and two of the trained filters are available on Hugging Face.
You might disagree with a score. That's fine. The point is that the judgment is inspectable.
Layer 6: Editorial rules
After scoring and summarization, a rule-based editorial layer makes final decisions. It removes near-duplicate stories, ensures scientific and research sources get representation, and promotes corroborated stories. Every editorial action is logged with a reason. These rules are not AI — they're deterministic checks with configurable thresholds.
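A deterministic rule with a configurable threshold and an audit log might be sketched as follows. This is only an illustration of the idea: comparing titles with Python's `difflib.SequenceMatcher`, the 0.85 threshold, and the log entry format are assumptions, not the rules actually in use.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.85  # hypothetical, configurable


def drop_near_duplicates(titles, threshold=SIMILARITY_THRESHOLD):
    """Keep the first of each group of near-identical titles.

    Returns (kept_titles, log), where each log entry records the
    action taken and the reason for it.
    """
    kept, log = [], []
    for title in titles:
        # Find an already-kept title that is similar enough, if any.
        match = next(
            (k for k in kept
             if SequenceMatcher(None, title.lower(), k.lower()).ratio() >= threshold),
            None,
        )
        if match is None:
            kept.append(title)
        else:
            log.append({
                "action": "removed",
                "title": title,
                "reason": f"near-duplicate of {match!r}",
            })
    return kept, log


kept, log = drop_near_duplicates([
    "Solar farm opens in Spain",
    "Solar farm opens in Spain.",
    "New vaccine trial begins",
])
```

The same input always produces the same output and the same log, which is what separates a rule like this from a model's judgment.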
| Layer | What it does |
|---|---|
| Grounded prompts | Constrain what the model can say |
| Low temperature | Favor fidelity over creativity |
| Content quality gate | Reject junk before it reaches the model |
| Source links | Make verification one click away |
| Visible scores | Make the reasoning inspectable |
| Editorial rules | Deterministic checks with an audit trail |
What we're honest about
- We can't verify facts. The model summarizes what the article says. If the article contains an error, the summary will too.
- Summaries can still drift. Despite grounding prompts and low temperature, subtle distortion happens. A nuance gets lost. An emphasis shifts. This is why the source link exists [6].
- Quality depends on extraction. Some websites make it hard to extract clean text. The quality gate usually catches poor extraction. Sometimes it doesn't.
- Scores reflect our lens definitions. The scoring system encodes what we think "belonging" or "discovery" means. Those definitions are published, but they're still editorial choices.
What AI doesn't do
AI is a tool, not an editor. There are things we deliberately don't leave to AI:
- Whether numbers in an article are accurate
- Whether a source is reliable
- Whether a claim is proven
We leave that judgment to you. We give you the context to decide for yourself.
We'd rather you trust us because you verified, not because we asked you to.
Source quality
Not all news sources are equal. We assess each source for reliability, so you know where the news comes from.
The tiers
Verified: Reliability confirmed by independent databases or manually reviewed by our editorial team. These sources have a credibility score from 0 to 10.
Examples: Reuters, BBC, Nature, AP News, The Lancet, public broadcasters
Curated: Deliberately added to our source collection, but not externally verified. These sources were chosen because they fit our lenses, but don't have an independent credibility score.
Examples: specialized publications, regional media, non-profit news services
Unknown: Source is not in our database. This doesn't mean the source is unreliable. It only means we haven't been able to establish its reliability.
Credibility score
Verified sources receive a score from 0 to 10, based on independent assessments:
| Score | Rating | Examples |
|---|---|---|
| 9.0 – 10.0 | Very high | Nature, The Lancet, NIH, EU institutions |
| 7.5 – 8.9 | High | Reuters, BBC, AP, arXiv, public broadcasters |
| 6.0 – 7.4 | Medium | Major newspapers, think tanks |
| 4.0 – 5.9 | Neutral | Mixed factual reporting |
| < 4.0 | Low | State media, tabloids. Rarely in our selection. |
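The bands in the table can be expressed as a simple lookup. In this sketch the function name is hypothetical, and treating each band's lower bound as inclusive (so that 8.95 counts as "High") is an assumption, since the table lists scores to one decimal place.

```python
def rating_for(score: float) -> str:
    """Map a 0-10 credibility score to the rating label from the table."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("score must be between 0 and 10")
    if score >= 9.0:
        return "Very high"
    if score >= 7.5:
        return "High"
    if score >= 6.0:
        return "Medium"
    if score >= 4.0:
        return "Neutral"
    return "Low"
```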
Where do the scores come from?
Credibility scores are computed as a weighted average across three independent databases:
- IDIAP Research Institute: Academic database with NewsGuard scores and reliability labels for ~5,300 domains
- Media Bias/Fact Check: Independent assessment of factual reporting and political bias for ~4,400 domains
- Wikipedia Perennial Sources: Community-consensus reliability ratings maintained by Wikipedia editors for ~420 domains
For sources not covered by these databases, our editorial team assigns scores manually. Where a manual score overlaps with an external database, we run automated checks to flag significant disagreements.
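As a sketch of the calculation, the example below averages whichever database ratings are available for a domain and flags a manual score that strays too far from the result. The weights, the assumption that each database rating is already on the 0 to 10 scale, the 1.5-point tolerance, and all names are hypothetical. This page does not publish the actual weighting.

```python
from typing import Optional

# Hypothetical weights: the actual weighting is not published here.
WEIGHTS = {"idiap": 0.4, "mbfc": 0.4, "wikipedia": 0.2}


def credibility_score(ratings: dict) -> Optional[float]:
    """Weighted average over the databases that cover a domain.

    `ratings` maps a database key to a 0-10 rating. Weights are
    renormalized when a database has no entry for the domain.
    """
    available = {k: v for k, v in ratings.items() if k in WEIGHTS}
    if not available:
        return None  # no external data: falls back to editorial review
    total_weight = sum(WEIGHTS[k] for k in available)
    return sum(WEIGHTS[k] * v for k, v in available.items()) / total_weight


def disagrees(manual: float, external: float, tolerance: float = 1.5) -> bool:
    """Flag a manual score that differs from the external one by more than `tolerance`."""
    return abs(manual - external) > tolerance
```

For example, a domain rated 8.0 by one database and 9.0 by another, with equal weights, averages to 8.5 under these assumptions.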
Current coverage
We currently track ~1,000 source domains:
| Method | Domains | What it means |
|---|---|---|
| External databases | ~270 (27%) | Score backed by IDIAP, MBFC, and/or Wikipedia. Nearly all confirmed by 2+ independent sources. |
| Editorial review | ~650 (65%) | Score assigned by our team. These are our judgment calls, not independently verified. |
| Unscored | ~80 (8%) | In our collection but no credibility data available. Shown without a score. |
Source type
Beyond reliability, we also classify sources by type:
- Wire service: Reuters, AP, AFP
- Academic: Nature, The Lancet, arXiv
- Public broadcaster: BBC, NOS, NPO
- NGO / non-profit: Positive News, Solutions Journalism Network
- Newspaper: The Guardian, El País, de Volkskrant
- Government / institutional: EU, WHO, NIH
Our editorial stance
- We show all tiers. We don't hide articles from curated or unknown sources.
- No score doesn't mean unreliable. It means we couldn't verify.
- A good source can publish a bad article. We assess at the domain level, not per article.
References
[1] Pesaranghader, A. & Li, E. (2026). "Hallucination Detection and Mitigation in Large Language Models." arXiv:2601.09929.
[2] Saxena, H. (2025). "Hallucination in Generative Artificial Intelligence: Challenges, Causes, and Mitigation Strategies." SSRN 5976335.
[3] Li, Y. et al. (2025). "Mitigating Hallucination in Large Language Models: An Application-Oriented Survey on RAG, Reasoning, and Agentic Systems." arXiv:2510.24476.
[4] Roig, J.V. (2026). "How Much Do LLMs Hallucinate in Document Q&A Scenarios? A 172-Billion-Token Study Across Temperatures, Context Lengths, and Hardware Platforms." arXiv:2603.08274.
[5] Zerilli, J., Bhatt, U. & Weller, A. (2022). "How transparency modulates trust in artificial intelligence." Patterns, 3(4). doi:10.1016/j.patter.2022.100455.
[6] Dang, A.-H., Tran, V. & Nguyen, L.-M. (2025). "Survey and analysis of hallucinations in large language models: attribution to prompting strategies or model behavior." Frontiers in Artificial Intelligence. doi:10.3389/frai.2025.1622292.
Last updated: April 2026