The Research Layer

Peter: Iterative Research Over Explicit Knowledge

Foundation models compress knowledge implicitly into opaque weights. Walter builds it explicitly into inspectable graphs. Peter is the research layer that navigates those graphs — not by retrieving text, but by conducting iterative, human-like research: asking, inspecting, identifying gaps, refining, and drilling deeper.

The Research Loop

At the heart of Peter is an iterative research process — not a single query-and-retrieve step, but a loop that mirrors how a skilled human researcher works.

Peter navigates Walter's compiled knowledge graphs with dependency awareness, prerequisite chains, and supporting concepts — refining its understanding with each pass through the loop.

This approach transforms knowledge retrieval from "find similar documents" into "build structured understanding iteratively".

Peter's Research Process

1. Query the knowledge graph
2. Inspect the structure of results
3. Identify gaps and contradictions
4. Refine the question
5. Drill deeper or broaden scope
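
The loop can be sketched in a few lines. Everything here, from the `Finding` shape to the gap heuristic, is an illustrative assumption rather than Peter's actual API:

```python
# A minimal sketch of the research loop, assuming a hypothetical graph client
# exposing query() and contains(). Peter's real interface will differ.
from dataclasses import dataclass, field

@dataclass
class Finding:
    claim: str
    confidence: float
    prerequisites: list[str] = field(default_factory=list)

def research(graph, question: str, max_passes: int = 5) -> list[Finding]:
    findings: list[Finding] = []
    query = question
    for _ in range(max_passes):
        results = graph.query(query)              # 1. query the knowledge graph
        findings.extend(results)                  # 2. inspect the results
        gaps = [p for f in results for p in f.prerequisites
                if not graph.contains(p)]         # 3. identify missing prerequisites
        if not gaps:
            break                                 # no gaps left: understanding is built
        query = gaps[0]                           # 4-5. refine and drill deeper
    return findings
```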

Multi-Signal Knowledge Matching

Peter scores relevance using a hybrid approach that prevents false positives while capturing genuine conceptual relationships.

Hybrid Scoring Weights

Signal                  Weight
Semantic (embeddings)   50%
BM25 (keywords)         30%
Jaccard (entities)      20%
Temporal                × multiplier on the combined score
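
A minimal sketch of how these weights might combine, assuming each signal has already been normalised to [0, 1]; the exact formula is an assumption:

```python
# Weighted blend of three relevance signals, scaled by a temporal multiplier.
# Weights come from the table above; the normalisation is assumed.
SEMANTIC_W, BM25_W, JACCARD_W = 0.50, 0.30, 0.20

def hybrid_score(semantic: float, bm25: float, jaccard: float,
                 temporal: float) -> float:
    base = SEMANTIC_W * semantic + BM25_W * bm25 + JACCARD_W * jaccard
    return base * temporal  # temporal acts as a multiplier, not a fourth weight
```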

Why Multi-Signal Matters

Single-signal retrieval fails in predictable ways:

  • Semantic-only matches conceptually related but technically irrelevant content
  • Keyword-only misses paraphrased or differently-termed knowledge
  • Entity-only over-indexes on surface mentions without conceptual depth

Quality Gates

Peter's hybrid scoring with disqualification thresholds ensures:

  • Minimum semantic relevance required (conceptual gate)
  • Combined score must exceed threshold (quality gate)
  • Below-threshold matches explicitly rejected (no noise)
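
An illustrative version of the two gates; the threshold values are placeholders, not Peter's shipped defaults:

```python
# Disqualification thresholds: a match must clear both gates to be returned.
MIN_SEMANTIC = 0.35   # conceptual gate (placeholder value)
MIN_COMBINED = 0.50   # quality gate (placeholder value)

def passes_gates(semantic: float, combined: float) -> bool:
    if semantic < MIN_SEMANTIC:
        return False                 # conceptually unrelated, however strong the keywords
    return combined >= MIN_COMBINED  # below-threshold matches are rejected outright
```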

Two-Stage Retrieval Pipeline

Peter uses a two-stage retrieval architecture that optimises for both speed and precision:

STAGE 1: Bi-encoder (fast recall) → Top 100 → STAGE 2: Cross-encoder (high precision) → Top 20

Cross-encoders score the query and document together, achieving 15-30% higher precision than bi-encoders alone.
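
A sketch of the pipeline using the open-source sentence-transformers library. The checkpoints named here are common public models, not necessarily the ones Peter runs:

```python
# Stage 1 recalls broadly with a bi-encoder; stage 2 reranks precisely with a
# cross-encoder that scores query and document jointly.
from sentence_transformers import SentenceTransformer, CrossEncoder, util

bi_encoder = SentenceTransformer("all-MiniLM-L6-v2")
cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def retrieve(query: str, corpus: list[str]) -> list[str]:
    # Stage 1: fast embedding search over the whole corpus -> top 100
    corpus_emb = bi_encoder.encode(corpus, convert_to_tensor=True)
    query_emb = bi_encoder.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(query_emb, corpus_emb, top_k=100)[0]
    candidates = [corpus[h["corpus_id"]] for h in hits]

    # Stage 2: cross-encoder reranking of the candidates -> top 20
    scores = cross_encoder.predict([(query, doc) for doc in candidates])
    reranked = sorted(zip(scores, candidates), key=lambda p: p[0], reverse=True)
    return [doc for _, doc in reranked[:20]]
```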

Temporal Modes

Not all knowledge ages equally. Peter applies three temporal modes to distinguish foundational concepts from time-sensitive developments.

Concept
No decay. Definitions, principles, and foundational standards contribute equally regardless of when compiled.

Event (~70d)
Exponential decay. Regulatory updates, rulings, and announcements lose relevance over time.

Hybrid (~140d)
Slow decay with 0.5 floor. Best practices and evolving interpretations retain at least half their relevance.
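
Read as decay curves, the three modes might look like this. Treating ~70d and ~140d as half-lives is an assumption; the real constants are Peter's internals:

```python
def temporal_weight(mode: str, age_days: float) -> float:
    if mode == "concept":
        return 1.0                                  # no decay, ever
    if mode == "event":
        return 0.5 ** (age_days / 70.0)             # halves roughly every 70 days
    if mode == "hybrid":
        return max(0.5, 0.5 ** (age_days / 140.0))  # slow decay, floored at 0.5
    raise ValueError(f"unknown temporal mode: {mode!r}")
```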

This Means

  • A foundational legal definition contributes equally regardless of compilation date
  • A specific enforcement action from 18 months ago contributes less than one from last month
  • Industry best practices retain at least 50% relevance even as they age
  • The system automatically balances recency against foundational importance

Knowledge Graph Navigation

Walter's knowledge graphs encode explicit relationships between concepts. When Peter researches a specific topic, it automatically surfaces the prerequisite knowledge, supporting concepts, and dependent ideas — providing the full context a researcher needs.

PREREQUISITE: Data Protection Principles
  └── supports → DEPENDENT: GDPR Article 22 Enforcement

PREREQUISITE: AI Ethics Frameworks
  └── supports → BUILDS ON: EU AI Act Implementation

PREREQUISITE: Jurisdictional Reach
  └── depends-on → DEPENDENT: Cross-Border Transfer Rulings

Relationship Type   Meaning
prerequisite        Must be understood before the dependent concept makes sense
supports            Foundational knowledge that reinforces the dependent idea
depends-on          Cannot be fully evaluated without the prerequisite
related             Thematic connection that provides useful context
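
A toy traversal over such a graph, following only the relationship types that gate understanding. The adjacency map is a stand-in for Walter's compiled output:

```python
# Walk prerequisite and depends-on edges to surface what must be read first.
EDGES: dict[str, list[tuple[str, str]]] = {
    "GDPR Article 22 Enforcement": [("prerequisite", "Data Protection Principles")],
    "Data Protection Principles": [("supports", "GDPR Article 22 Enforcement")],
    "Cross-Border Transfer Rulings": [("depends-on", "Jurisdictional Reach")],
}

def prerequisite_chain(concept: str) -> list[str]:
    chain: list[str] = []
    stack = [concept]
    while stack:
        for relation, target in EDGES.get(stack.pop(), []):
            if relation in ("prerequisite", "depends-on") and target not in chain:
                chain.append(target)   # must be understood first...
                stack.append(target)   # ...along with its own prerequisites
    return chain
```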

Gap and Conflict Detection

Peter doesn't just retrieve what's known — it identifies what's missing, where sources disagree, and what needs more evidence.

Gap Identification

  • Detects missing prerequisites in the knowledge graph
  • Flags concepts referenced but never compiled
  • Identifies thin areas where evidence is sparse
  • Surfaces assumptions that lack explicit support
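
One concrete version of the first two checks: concepts referenced as prerequisites but never compiled. The data shapes are assumptions:

```python
# units maps each compiled knowledge unit to the concepts it references.
def find_gaps(units: dict[str, list[str]], compiled: set[str]) -> set[str]:
    referenced = {c for refs in units.values() for c in refs}
    return referenced - compiled   # referenced but never compiled = a gap
```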

Conflict Surfacing

  • Compares claims across multiple compiled sources
  • Flags areas where sources reach different conclusions
  • Distinguishes genuine contradictions from scope differences
  • Prioritises conflicts by impact on the research question

Evidence Assessment

  • Evaluates how well-supported each claim is
  • Counts independent sources per knowledge unit
  • Flags single-source claims in high-stakes areas
  • Tracks confidence levels across the research path

Research Guidance

  • Suggests where to drill deeper based on gaps
  • Recommends broadening when context is missing
  • Proposes follow-up queries to resolve conflicts
  • Builds a map of what's known, unknown, and contested

Contradiction Detection

Knowledge sources often disagree. Peter uses Natural Language Inference (NLI) to detect and surface contradictions rather than hiding them.

How It Works

  • New claims are compared against semantically similar existing claims
  • NLI model classifies pairs as: entailment, contradiction, or neutral
  • Contradictions above confidence threshold are stored with relationship links
  • Both claims remain visible with conflict annotation
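
A sketch of the classification step using the Hugging Face transformers pipeline with a public NLI checkpoint; the model and the 0.8 threshold are stand-ins for Peter's actual choices:

```python
# Classify a claim pair as entailment / contradiction / neutral with an
# off-the-shelf MNLI model, and keep only confident contradictions.
from transformers import pipeline

nli = pipeline("text-classification", model="roberta-large-mnli")

def detect_contradiction(claim_a: str, claim_b: str,
                         threshold: float = 0.8) -> float | None:
    result = nli({"text": claim_a, "text_pair": claim_b})
    if result["label"] == "CONTRADICTION" and result["score"] >= threshold:
        return result["score"]   # store both claims with a conflict link
    return None                  # entailment or neutral: nothing to flag
```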

Why This Matters

  • Traditional systems flatten contradictions or pick one side arbitrarily
  • Research requires seeing where sources disagree
  • Confidence scores indicate strength of contradiction
  • Researchers can investigate and resolve conflicts

Example Contradiction

Claim A: "GDPR Article 22 prohibits automated decision-making"
CONTRADICTION (87% confidence)
Claim B: "GDPR Article 22 permits automated decisions with appropriate safeguards"

Both claims are preserved. The nuance (prohibition vs permitted-with-safeguards) is surfaced, not hidden.

Full Traceability Chain

Every assertion Peter surfaces traces back to sources. No "trust me" — every response is auditable to source documents.

Research Query Response
│
├── Knowledge Unit
│   ├── claim_text: "GDPR Article 22 requires..."
│   ├── confidence: 0.92
│   └── sources[]
│       ├── document: "EU_GDPR_2016_679.pdf"
│       ├── section: "Article 22, Paragraph 1"
│       └── context: "automated individual decision-making"
│
├── Knowledge Context
│   ├── related_units: 47
│   ├── prerequisite_chain: ["data-subject-rights", "lawful-basis"]
│   └── contradictions: ["member-state-derogations"]
│
└── Match Score Breakdown
    ├── semantic_score: 0.82
    ├── bm25_score: 0.71
    ├── jaccard_score: 0.45
    └── temporal_weight: 0.95

How Peter Differs from RAG

RAG answers: "Here are documents that seem related."
Peter answers: "Here's what you need to know, what you need to understand first, where the sources disagree, and exactly where each claim comes from."

Traditional RAG                           Peter
Knowledge implicit in model weights       Knowledge explicit in inspectable graphs
Retrieve similar chunks                   Navigate structured knowledge graphs
Single-stage embedding search             Two-stage: bi-encoder recall → cross-encoder precision
Single-signal matching                    Multi-signal hybrid scoring with disqualification
No temporal awareness                     Temporal modes for different knowledge types
Flat retrieval                            Knowledge graph navigation with prerequisite chains
Contradictions flattened or hidden        NLI-based contradiction detection and surfacing
Direct query embedding                    Query transformation (hypothetical document generation)
One-shot similarity                       Iterative refinement with prerequisite chains
"Here are related documents"              "Here's context, conflicts, and confidence"

Making Papers Think Together

Peter transforms compiled knowledge into navigable understanding.
  • Iterative research loop — Ask, inspect, identify gaps, refine, drill deeper
  • Knowledge graph navigation — Prerequisite chains and dependency awareness
  • Gap identification — Surfaces what's missing and under-evidenced
  • Multi-signal matching — Beyond semantic similarity
  • Temporal awareness — Foundational vs time-sensitive
  • Contradiction detection — NLI-powered conflict surfacing
  • Full traceability — Every claim auditable to source
  • Two-stage retrieval — Bi-encoder recall, cross-encoder precision

Walter compiles the knowledge. Peter makes it think.
Together, they form the basis of a new kind of cognitive infrastructure.
