Documentation
How Regest works: concepts, architecture, and reference.
The Problem
Similarity ≠ Insightfulness. The most meaningful insights are often the least obvious ones, which makes traditional similarity-driven retrieval poorly suited to surfacing novel or consequential ideas.
Standard vector search answers the question "what is most similar to my query?" — but journalists, researchers, and analysts typically need the answer to a different question: "what is most surprising, significant, or consequential?"
What is Regest?
The Regest project indexes conversational media data, leveraging the lineage of ideas and conversational structure to quantify surprise, argumentative significance, and narrative progression.
Regest enables high-throughput, low-cost discovery of impactful insights across large volumes of spoken or written content — without requiring heavy agentic LLM orchestration.
Core Concepts
- Surprise scoring — measuring how unexpected a statement is given its conversational context
- Argumentative significance — identifying claims that carry rhetorical or evidential weight
- Narrative progression — tracking how ideas develop, shift, and compound across a conversation
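To make the first concept concrete, here is a minimal toy sketch of a surprise score. It is not Regest's scoring model, which this page has not yet documented. It assumes each statement is already represented as an embedding vector, and it defines surprise as one minus the cosine similarity between a statement and the average of the statements that preceded it. The function names (`toy_surprise`, `mean_vector`, `cosine_similarity`) and the two-dimensional vectors are hypothetical and exist only for illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise average of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def toy_surprise(statement: Vector, context: Sequence[Vector]) -> float:
    """Hypothetical surprise score: 1 - cosine similarity to the context average.

    Assumption for illustration only; this is not Regest's actual model.
    Returns 0.0 when there is no prior context to compare against.
    """
    if not context:
        return 0.0
    return 1.0 - cosine_similarity(statement, mean_vector(context))


# Made-up 2-D "embeddings": the context clusters along the first axis.
context = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1]]
on_topic = [1.0, 0.05]   # points the same way as the context
off_topic = [0.0, 1.0]   # points in a nearly orthogonal direction

# The off-topic statement scores as more surprising than the on-topic one.
print(toy_surprise(on_topic, context) < toy_surprise(off_topic, context))
```

Note the inversion relative to standard vector search: a similarity ranking would place `on_topic` first, while this toy score ranks `off_topic` higher. Distance from context is only a crude proxy. It says nothing about argumentative significance or narrative progression, which depend on conversational structure that a single centroid cannot capture.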
Architecture
Coming soon — system design, pipeline overview, and data flow.
Indexing
Coming soon — how conversational media is ingested, segmented, and indexed.
Scoring
Coming soon — the retrieval scoring model and how impact is quantified.
API
Coming soon — endpoint reference and usage examples.