Confidence aggregation
The medlit domain service computes composite confidence from a list of provenance records by taking the strongest study-weighted record and adding a capped replication bonus:
from typing import Literal

from pydantic import BaseModel

StudyType = Literal[
    "meta_analysis", "rct", "cohort",
    "case_control", "observational", "review", "case_report"
]

# Multiplier applied to a record's confidence, keyed by study design.
STUDY_WEIGHTS: dict[StudyType, float] = {
    "meta_analysis": 0.95,
    "rct": 1.0,
    "cohort": 0.8,
    "case_control": 0.7,
    "observational": 0.6,
    "review": 0.5,
    "case_report": 0.4,
}

# Bonus added per supporting record beyond the first, and its upper bound.
REPLICATION_BONUS_PER_PAPER = 0.02
MAX_REPLICATION_BONUS = 0.15


class ProvenanceRecord(BaseModel):
    paper_id: str
    section_type: str
    paragraph_idx: int
    extraction_method: str
    confidence: float
    study_type: StudyType


def compute_confidence(records: list[ProvenanceRecord]) -> float:
    # No supporting evidence means no confidence.
    if not records:
        return 0.0
    # Strongest single record after weighting by study type.
    base = max(
        r.confidence * STUDY_WEIGHTS[r.study_type]
        for r in records
    )
    # Every record beyond the first adds a fixed bonus, up to the cap.
    replication_bonus = min(
        (len(records) - 1) * REPLICATION_BONUS_PER_PAPER,
        MAX_REPLICATION_BONUS,
    )
    # Clamp so the composite score never reaches 1.0.
    return min(base + replication_bonus, 0.99)
The base confidence is the maximum weighted confidence across all supporting records, so the strongest single piece of evidence sets the floor. The replication bonus rewards claims that appear in multiple independent papers: each record beyond the first adds 0.02, up to a cap of 0.15. The cap prevents a large number of weak case reports from inflating a claim beyond what the evidence supports. The final result is also clamped to 0.99, so the composite score never reaches 1.0.
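To make the arithmetic concrete, here is a minimal worked sketch of the same calculation. It is not the service's code: `Record` is a hypothetical standard-library stand-in for `ProvenanceRecord` that keeps only the fields the calculation reads, the weight table is trimmed to three study types, and the paper IDs and confidence values are made up for illustration.

```python
from dataclasses import dataclass

# Subset of the weights from the listing above (values unchanged).
STUDY_WEIGHTS = {"rct": 1.0, "cohort": 0.8, "case_report": 0.4}
REPLICATION_BONUS_PER_PAPER = 0.02
MAX_REPLICATION_BONUS = 0.15


@dataclass
class Record:
    """Hypothetical stand-in for ProvenanceRecord (assumption, not the real model)."""
    paper_id: str
    confidence: float
    study_type: str


def compute_confidence(records):
    # Same logic as the listing above, restated so this sketch runs on its own.
    if not records:
        return 0.0
    base = max(r.confidence * STUDY_WEIGHTS[r.study_type] for r in records)
    bonus = min(
        (len(records) - 1) * REPLICATION_BONUS_PER_PAPER,
        MAX_REPLICATION_BONUS,
    )
    return min(base + bonus, 0.99)


# Illustrative inputs only; the IDs and confidences are invented.
single_rct = [Record("p1", 0.9, "rct")]
rct_plus_two = single_rct + [
    Record("p2", 0.8, "cohort"),
    Record("p3", 0.7, "case_report"),
]
many_case_reports = [Record(f"c{i}", 0.9, "case_report") for i in range(50)]
same_paper_twice = [Record("p1", 0.9, "rct"), Record("p1", 0.9, "rct")]

print(round(compute_confidence(single_rct), 2))         # 0.9  (no bonus)
print(round(compute_confidence(rct_plus_two), 2))       # 0.94 (0.9 + 2 * 0.02)
print(round(compute_confidence(many_case_reports), 2))  # 0.51 (0.36 + capped 0.15)
print(round(compute_confidence(same_paper_twice), 2))   # 0.92 (bonus counts records)
```

Fifty case reports end up well below a single RCT, which is the behavior the cap is meant to guarantee. The last case shows that the bonus is driven by the number of records rather than the number of distinct `paper_id` values. If several records can come from one paper, deduplicating by `paper_id` before counting would be one way to keep the bonus tied to independent papers.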