HORIZON Methodology
HORIZON is built to a single principle: every claim is auditable back to a public source. No anonymous tips, no scraped social posts presented as fact, no laundered citations. This page documents the exact qualification chain every record passes through.
1. Source qualification — NATO Admiralty Scale
Every source registered on HORIZON receives a two-axis rating per NATO AJP-2.1:
| Axis | Scale |
|---|---|
| Reliability (source) | A completely reliable · B usually reliable · C fairly reliable · D not usually reliable · E unreliable · F cannot be judged |
| Credibility (info) | 1 confirmed · 2 probably true · 3 possibly true · 4 doubtful · 5 improbable · 6 cannot be judged |
A WHO Disease Outbreak News bulletin rates A1 (completely reliable source, confirmed info). A peer-reviewed Lancet ID paper typically rates A2. A verified national-authority press release rates B1 to B2. Reuters and AP wire reports rate B2 to B3. A single-source social media post rates D4 or worse; such posts are stored but never auto-applied to incident counts.
Auto-application of an extracted fact to the incident ontology requires either a source rated A1, A2, B1, or B2, or three corroborating independent sources within 48 hours. The path that qualified each record is documented in the extraction_proposals audit log.
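The auto-application rule above can be sketched as a small decision function. This is an illustrative sketch, not HORIZON's implementation: the `Report` type, the `auto_apply_path` name, and the returned path labels are hypothetical. It assumes "independent" means distinct source identifiers and that all three sources must fall inside one 48-hour span.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

AUTO_APPLY_RATINGS = {"A1", "A2", "B1", "B2"}
CORROBORATION_WINDOW = timedelta(hours=48)
MIN_CORROBORATING_SOURCES = 3


@dataclass(frozen=True)
class Report:
    """One sighting of an extracted fact (hypothetical structure)."""
    source_id: str      # assumed proxy for source independence
    rating: str         # Admiralty rating, e.g. "B2"
    seen_at: datetime


def auto_apply_path(reports):
    """Return "rating", "corroboration", or None for one extracted fact."""
    # Path 1: any single source rated A1, A2, B1, or B2.
    if any(r.rating in AUTO_APPLY_RATINGS for r in reports):
        return "rating"
    # Path 2: at least three distinct sources inside one 48-hour span.
    ordered = sorted(reports, key=lambda r: r.seen_at)
    for i, start in enumerate(ordered):
        sources_in_window = {
            r.source_id
            for r in ordered[i:]
            if r.seen_at - start.seen_at <= CORROBORATION_WINDOW
        }
        if len(sources_in_window) >= MIN_CORROBORATING_SOURCES:
            return "corroboration"
    return None
```

Returning the name of the qualifying path, rather than a bare boolean, makes it straightforward to write that path into an audit log entry.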
2. ICD 206 Source Reference Citation
Every src_citation field follows the US intelligence-community
ICD 206 format: [CLASSIFICATION] AUTHOR (RELIABILITY/CREDIBILITY)
"TITLE" PUBLICATION, DATE, IDENTIFIER. Example for the MV Hondius
WHO bulletin:
[PUBLIC] WHO (A1) "Disease Outbreak News 2026-DON600: Andes hantavirus — MV Hondius cluster" World Health Organization, 2026-05-11
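As a rough illustration, a citation string in this layout could be assembled as follows. The `Citation` type and `format_citation` function are hypothetical names, and omitting the identifier when none is available is an assumption based on the example above, which ends at the date.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Citation:
    """Fields of one src_citation entry (hypothetical structure)."""
    classification: str
    author: str
    rating: str                       # reliability/credibility, e.g. "A1"
    title: str
    publication: str
    date: str                         # ISO 8601 date string
    identifier: Optional[str] = None  # assumed optional


def format_citation(c):
    """Render: [CLASSIFICATION] AUTHOR (RATING) "TITLE" PUBLICATION, DATE, IDENTIFIER."""
    # Skip the identifier when it is missing so no trailing comma is left behind.
    tail = ", ".join(part for part in (c.publication, c.date, c.identifier) if part)
    return f'[{c.classification}] {c.author} ({c.rating}) "{c.title}" {tail}'


example = Citation(
    classification="PUBLIC",
    author="WHO",
    rating="A1",
    # \u2014 is the dash that appears in the bulletin title.
    title="Disease Outbreak News 2026-DON600: Andes hantavirus \u2014 MV Hondius cluster",
    publication="World Health Organization",
    date="2026-05-11",
)
print(format_citation(example))
```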
3. Dual confidence model
Pipeline confidence (machine, 0.0 to 1.0) reflects the statistical confidence of the auto-extraction process — entity disambiguation, deduplication match score, regex pattern specificity. Analyst confidence (human, 0.0 to 1.0, nullable) is set only when a 79th Unit analyst has manually reviewed the record.
The two scores are never conflated. Front-end displays distinguish them clearly: amber for pipeline (provisional), green for analyst (vetted). Exports require a non-null analyst confidence on every included object.
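One way to keep the two confidence values separate is to store them in distinct fields and never derive one from the other. The sketch below assumes hypothetical names (`Record`, `badges`, `missing_analyst_review`); only the 0.0 to 1.0 range, the nullable analyst field, the colors, and the export rule come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """A record carrying both confidence values (hypothetical structure)."""
    record_id: str
    pipeline_confidence: float                   # machine, always present
    analyst_confidence: Optional[float] = None   # human, None until reviewed

    def __post_init__(self):
        # Both scores live on the 0.0 to 1.0 scale; None is allowed for analyst only.
        for name in ("pipeline_confidence", "analyst_confidence"):
            value = getattr(self, name)
            if value is not None and not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be within [0.0, 1.0], got {value}")


def badges(record):
    """Return one (color, kind, value) badge per score that is set."""
    result = [("amber", "pipeline", record.pipeline_confidence)]
    if record.analyst_confidence is not None:
        result.append(("green", "analyst", record.analyst_confidence))
    return result


def missing_analyst_review(records):
    """Return the IDs that block an export because no analyst has reviewed them."""
    return [r.record_id for r in records if r.analyst_confidence is None]
```

An export step would refuse to proceed whenever `missing_analyst_review` returns a non-empty list.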
4. Berkeley Protocol chain-of-custody
The Berkeley Protocol on Digital Open Source Investigations defines the chain-of-custody requirements that make OSINT admissible in legal proceedings. Every fetched document on HORIZON is hashed (SHA-256) at ingestion, and the hash is stored alongside the fetch timestamp, the URL, and the User-Agent that retrieved it. A re-fetch produces a new row if the hash changes; history is never overwritten.
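The append-only behavior described above might look like the following minimal sketch. The `record_fetch` function, the in-memory list used as the log, and the row keys are hypothetical. Skipping the insert when the hash is unchanged is also an assumption, since the text only specifies what happens when the hash changes.

```python
import hashlib
from datetime import datetime, timezone


def record_fetch(log, url, body, user_agent, fetched_at=None):
    """Append a custody row for a fetched document unless its hash is unchanged.

    `log` is a plain list standing in for the real table; rows are only
    ever appended, never updated or deleted.
    """
    digest = hashlib.sha256(body).hexdigest()
    rows_for_url = [row for row in log if row["url"] == url]
    if rows_for_url and rows_for_url[-1]["sha256"] == digest:
        # Assumed behavior: identical content adds no new row.
        return rows_for_url[-1]
    row = {
        "url": url,
        "sha256": digest,
        "fetched_at": fetched_at or datetime.now(timezone.utc),
        "user_agent": user_agent,
    }
    log.append(row)
    return row
```

Hashing the raw response bytes, before any decoding or normalization, keeps the digest reproducible by anyone who holds the same bytes.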
5. Cluster-tie scoring (incident-specific)
For the MV Hondius cluster, an article must pass a cluster-tie classifier before any extracted facts auto-apply to the ontology:
- Strong tie (score 1.0) — explicit MV Hondius / Oceanwide Expeditions / Hondius port-name mention.
- Medium tie (0.5) — hantavirus + repatriation/evacuation context + route country.
- Weak tie (0.0) — hantavirus mention without ship/port/repatriation context. Produces no proposals.
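The three tiers can be approximated with simple keyword checks, as in the sketch below. It is not the production classifier: the function names and the matching strategy (case-insensitive substring and regex tests) are assumptions. The port and route-country lists are passed in as parameters because the real lists are not given here.

```python
import re

STRONG_TERMS = ("mv hondius", "oceanwide expeditions")
# Matches repatriate/repatriation/repatriated and evacuate/evacuation/evacuated.
CONTEXT_PATTERN = re.compile(r"\b(repatriat\w*|evacuat\w*)\b")


def cluster_tie_score(text, port_names=(), route_countries=()):
    """Score an article's tie to the cluster: 1.0 strong, 0.5 medium, 0.0 weak."""
    lowered = text.lower()
    # Strong tie: explicit ship, operator, or port-name mention.
    if any(term in lowered for term in STRONG_TERMS):
        return 1.0
    if any(port.lower() in lowered for port in port_names):
        return 1.0
    # Medium tie: hantavirus + repatriation/evacuation context + route country.
    has_hantavirus = "hantavirus" in lowered
    has_context = CONTEXT_PATTERN.search(lowered) is not None
    has_country = any(country.lower() in lowered for country in route_countries)
    if has_hantavirus and has_context and has_country:
        return 0.5
    # Weak tie: everything else, including bare hantavirus mentions.
    return 0.0


def produces_proposals(score):
    """A weak tie (0.0) yields no proposals."""
    return score > 0.0
```

For example, `cluster_tie_score("Passengers of the MV Hondius were tested.")` returns `1.0`, while a bare hantavirus mention with no ship, port, or repatriation context returns `0.0`.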
6. Per-country authoritative cap (anti-inflation)
News articles frequently report cluster totals ("infections grow to 9 as Spanish passenger falls ill") that the extractor could misattribute to the country mentioned nearby. HORIZON enforces a global cap on per-country proposals: any proposal whose value_numeric is greater than or equal to the WHO confirmed total is rejected as a cluster-total misattribution. The cap is sourced from the authoritative WHO Disease Outbreak News count, with ECDC corroboration.
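The cap amounts to a single comparison per proposal. The following sketch assumes proposals are dictionaries with a `value_numeric` key; the `apply_country_cap` name, the `country` key, and the `reject_reason` label are hypothetical.

```python
def apply_country_cap(proposals, who_confirmed_total):
    """Split per-country proposals into (accepted, rejected) using the global cap.

    A per-country value that reaches or exceeds the WHO confirmed cluster
    total is treated as a cluster total that was misattributed to a country.
    """
    accepted, rejected = [], []
    for proposal in proposals:
        if proposal["value_numeric"] >= who_confirmed_total:
            # Keep the rejected proposal, annotated, so the decision stays auditable.
            rejected.append({**proposal, "reject_reason": "cluster_total_misattribution"})
        else:
            accepted.append(proposal)
    return accepted, rejected


# Illustrative values only: a headline total of 9 attributed to one country.
accepted, rejected = apply_country_cap(
    [
        {"country": "ES", "value_numeric": 9},
        {"country": "ES", "value_numeric": 1},
    ],
    who_confirmed_total=9,
)
```

Here the proposal with value 9 is rejected and the proposal with value 1 is accepted, because the comparison is greater than or equal to, not strictly greater than.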
7. Open data and API
All non-pre-decisional data is published live at /api/v1/cases, /api/v1/incidents, /api/v1/sources, and /api/v1/meta/events under CC BY 4.0. OpenAPI schema: /api/openapi.json.