Curation versioning

Anything that is effectively a curation decision lives in curation_versions as snapshot rows and is edited via /admin/curation/*. This covers umbrella memberships, vocabulary aliases, status maps, biomarker umbrellas, low-info inclusion tags, drug aliases, city/state/country aliases, and the stale-trial-years threshold.

A doctor or curator can fix an umbrella membership or add an alias without a code deploy. The database is the source of truth.

Curation index showing 9 domains

The 9 domains

| Domain | Shape | Used by |
| --- | --- | --- |
| umbrella_members | {umbrella: [primary, …]} | cancer_site umbrella mapping |
| cancer_site_aliases | {alias: canonical} | Free-form site normalisation |
| drug_class_aliases | {alias: canonical} | Drug-class normalisation |
| low_info_inclusion_tags | [INC_DEMOGRAPHICS, …] | Tags hidden from the Advanced Search sidebar |
| ctri_status_map | {source: canonical} | CTRI status normalisation in transform |
| biomarker_umbrella | {key: [{dimension, tag}, …]} | Cross-dimension biomarker matching (TRIPLE_NEG ⇄ triple_negative, HER2_POS, etc.) |
| stale_trial_threshold | {years: N} | The recency-filter cutoff (default 7) |
| drug_aliases | {generic: {brand_name, drug_class}} | Drug-picker normalisation |
| city_state_aliases | {cities, states, countries: {alias: canonical}} | Location autocomplete + bidirectional alias expansion |
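
To make the shapes concrete, here is a minimal Python sketch of how two of these payloads might be consumed. Only the payload shapes come from the domain list above; the sample values and the helper names (normalise_site, expand_umbrella) are hypothetical and do not describe the project's actual code.

```python
# Hypothetical sample payloads; only their shapes follow the domain list.
cancer_site_aliases = {"breast ca": "breast", "mammary": "breast"}
umbrella_members = {"gi": ["colon", "rectum", "stomach"]}


def normalise_site(raw, aliases):
    """Map a free-form site string to its canonical name, if an alias exists."""
    key = raw.strip().lower()
    return aliases.get(key, key)


def expand_umbrella(site, umbrellas):
    """Return the umbrella's primary sites, or the site itself if it is not an umbrella."""
    return list(umbrellas.get(site, [site]))
```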

How a save works

Curation editor with raw JSON + history sidebar

Every save inserts a new row with an auto-incremented version_number. The newly saved version becomes is_active=TRUE and the prior active version is demoted. A unique partial index guarantees that a domain can never have more than one active version, so together with the save logic there is exactly one active version per domain.
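
The save path can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions, not the project's schema: the column list beyond version_number and is_active, the index name, and the save_version helper are all hypothetical, and the real database engine may differ.

```python
import json
import sqlite3

# Assumed schema: only version_number and is_active are named in the text.
SCHEMA = """
CREATE TABLE curation_versions (
    id             INTEGER PRIMARY KEY,
    domain         TEXT    NOT NULL,
    version_number INTEGER NOT NULL,
    payload        TEXT    NOT NULL,
    is_active      INTEGER NOT NULL DEFAULT 0,
    UNIQUE (domain, version_number)
);
-- Partial unique index: at most one active row per domain.
CREATE UNIQUE INDEX one_active_per_domain
    ON curation_versions (domain) WHERE is_active = 1;
"""


def save_version(conn, domain, payload):
    """Insert the next version for a domain and make it the active one."""
    with conn:  # one transaction: the demotion and the insert commit together
        conn.execute(
            "UPDATE curation_versions SET is_active = 0 "
            "WHERE domain = ? AND is_active = 1",
            (domain,),
        )
        (next_version,) = conn.execute(
            "SELECT COALESCE(MAX(version_number), 0) + 1 "
            "FROM curation_versions WHERE domain = ?",
            (domain,),
        ).fetchone()
        conn.execute(
            "INSERT INTO curation_versions "
            "(domain, version_number, payload, is_active) VALUES (?, ?, ?, 1)",
            (domain, next_version, json.dumps(payload)),
        )
    return next_version


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
save_version(conn, "stale_trial_threshold", {"years": 7})
save_version(conn, "stale_trial_threshold", {"years": 5})
```

Demoting the old row before inserting the new one keeps the partial index satisfied at every step inside the transaction.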

Reads everywhere (src/lib/curation.js for JS, scraper/curation_loader.py for Python) hit a 5-minute in-process cache. A save busts the cache so the next request sees the change.
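
A 5-minute in-process cache with an explicit bust might look roughly like the sketch below. The names get_curation, bust_cache, and the load_active callback are hypothetical; the actual curation_loader.py may be organised differently.

```python
import time

CACHE_TTL_SECONDS = 5 * 60
_cache = {}  # domain -> (expires_at, payload)


def get_curation(domain, load_active, now=time.monotonic):
    """Return the cached payload for a domain, reloading once the entry expires."""
    entry = _cache.get(domain)
    if entry is not None and entry[0] > now():
        return entry[1]
    payload = load_active(domain)  # hypothetical loader that queries the database
    _cache[domain] = (now() + CACHE_TTL_SECONDS, payload)
    return payload


def bust_cache(domain=None):
    """Drop one domain's entry, or the whole cache when no domain is given."""
    if domain is None:
        _cache.clear()
    else:
        _cache.pop(domain, None)
```

In this sketch the cache is a module-level dict, so bust_cache only clears the process that calls it. Any other process would keep its cached copy until the TTL expires.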

Every save also writes a manual_dispatches audit row capturing action='curation_update', target domain, actor (from x-admin-user header if supplied), and payload.
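
As a rough illustration, the audit row could be assembled as follows. The action value and the x-admin-user header come from the description above; the dictionary keys target, actor, and payload, and the build_audit_row helper, are assumed names rather than the real manual_dispatches columns.

```python
def build_audit_row(domain, payload, headers):
    """Build the audit record for a curation save (field names are assumptions)."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "action": "curation_update",
        "target": domain,
        "actor": lowered.get("x-admin-user"),  # None when the header is absent
        "payload": payload,
    }
```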

Rollback

Use /admin/curation/[domain]/rollback to re-activate an older version. Rather than re-flagging the old row, the system inserts a new row pointing at the same payload as the old version (so the audit trail stays linear) and makes that new row the active one.
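
A rollback along these lines can be sketched with sqlite3. The schema, the rollback helper, and the sample rows are assumptions made for illustration. The sketch copies the old payload into the new row, which is one possible reading of "pointing at the same payload".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and sample data, for illustration only.
conn.executescript("""
CREATE TABLE curation_versions (
    domain         TEXT    NOT NULL,
    version_number INTEGER NOT NULL,
    payload        TEXT    NOT NULL,
    is_active      INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (domain, version_number)
);
CREATE UNIQUE INDEX one_active_per_domain
    ON curation_versions (domain) WHERE is_active = 1;
INSERT INTO curation_versions VALUES
    ('stale_trial_threshold', 1, '{"years": 7}', 0),
    ('stale_trial_threshold', 2, '{"years": 5}', 1);
""")


def rollback(conn, domain, to_version):
    """Re-activate an old version by copying its payload into a new row."""
    with conn:  # single transaction
        row = conn.execute(
            "SELECT payload FROM curation_versions "
            "WHERE domain = ? AND version_number = ?",
            (domain, to_version),
        ).fetchone()
        if row is None:
            raise LookupError(f"{domain} has no version {to_version}")
        conn.execute(
            "UPDATE curation_versions SET is_active = 0 "
            "WHERE domain = ? AND is_active = 1",
            (domain,),
        )
        (next_version,) = conn.execute(
            "SELECT MAX(version_number) + 1 FROM curation_versions "
            "WHERE domain = ?",
            (domain,),
        ).fetchone()
        conn.execute(
            "INSERT INTO curation_versions VALUES (?, ?, ?, 1)",
            (domain, next_version, row[0]),
        )
    return next_version


new_version = rollback(conn, "stale_trial_threshold", 1)
```

Versions 1 and 2 stay in the table untouched, so the history still reads as a straight sequence of saves.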