The Auto-Research Engine

How expert-designed knowledge graphs produce compounding intelligence. A technical report on the SBPI Semantic Layer — the first production instance of a continuously self-improving competitive intelligence system.

March 2026 · 1,672 RDF Triples · 22 Companies Tracked · 11 Weeks of Data

What We Built

A system that gets smarter every night while we sleep.

The SBPI Semantic Layer tracks competitive brand power across the micro-drama vertical — a $3.7B market growing 40% year-over-year. Every week, new data flows through a formal OWL ontology, gets validated against SHACL shape constraints, loads into an Oxigraph SPARQL endpoint, and produces automated insight digests with predictive signals.

  • 1,672 RDF triples
  • 9 momentum predictions
  • 22 companies tracked
  • 11 weeks of data
  • 30+ OWL classes
  • 85% top signal confidence

The Pipeline

Two data flows feed the system. Both run without human intervention after setup.

Ecosystem Graph Flow

Session Records → claude -p → InfraNodus API → Knowledge Graph

Headless Claude agents read session records, extract entity relationship statements, and post them to the InfraNodus knowledge graph. 27 sessions processed in 16 minutes. 96% success rate. 150 nodes, 849 edges, 12 clusters.

Semantic Layer Flow

SBPI State JSON → sbpi_to_rdf.py → SHACL Validation → Oxigraph Store → SPARQL Queries → Insight Digest

Weekly competitive intelligence data converts to RDF triples, validates against SHACL shapes, loads into an Oxigraph SPARQL endpoint, and gets queried by the nightly insights runner. 7 query types. 11 SPARQL files. Scheduled daily at 6:13 AM.
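The conversion step can be sketched in a few lines. This is a simplified, stdlib-only sketch rather than the real sbpi_to_rdf.py (which uses rdflib); the sbpi: namespace URI and property names are assumptions for illustration.

```python
import json

# Assumed namespace URI; the real sbpi.ttl prefix may differ.
PREFIX = "@prefix sbpi: <https://example.org/sbpi#> ."

def record_to_turtle(rec):
    """Serialize one weekly score record (JSON dict) as Turtle triples."""
    subj = f"sbpi:{rec['slug']}-{rec['week']}"
    lines = [
        f"{subj} a sbpi:ScoreRecord ;",
        f"    sbpi:company sbpi:{rec['slug']} ;",
        f"    sbpi:week sbpi:{rec['week']} ;",
        f"    sbpi:composite {rec['composite']} ;",
    ]
    for code, value in rec["dimensions"].items():
        lines.append(f"    sbpi:{code} {value} ;")
    lines[-1] = lines[-1][:-1] + "."   # close the final statement
    return "\n".join(lines)

raw = ('{"slug": "reelshort", "week": "W11-2026", "composite": 84.0,'
       ' "dimensions": {"dp": 88, "cs": 85}}')
print(PREFIX)
print(record_to_turtle(json.loads(raw)))
```

The output of a run like this is what gets handed to SHACL validation before it can enter the store.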

OTK Lineage

This architecture has a name. In 2001, a European research consortium built OTK — the Ontology-based Knowledge Management Toolkit. Their pipeline: Extract → Structure → Store → Query → Present. Our pipeline in 2026 uses the same architecture with different tools. The difference that matters: AI agents perform extraction, and the extraction improves with each cycle because the ontology constrains what counts as a valid fact.

Production Stack

| Component | Tool | Purpose |
|---|---|---|
| Ontology | OWL 2 (Turtle) | Domain model: 30+ classes, 50+ properties |
| Validation | SHACL (pySHACL) | Schema enforcement on every data load |
| Triple Store | Oxigraph | SPARQL endpoint (RocksDB backend) |
| ETL | Python (rdflib) | JSON → RDF conversion + store loading |
| Queries | SPARQL 1.1 | 11 query files, 8 analysis patterns |
| Predictions | Python | Momentum + anomaly detection + confidence |
| Scheduling | launchd | Daily nightly insights at 6:13 AM |
| Knowledge Graph | InfraNodus | Entity extraction, clusters, gaps |
| Batch Processing | claude -p | Headless session → entity extraction |
| Deployment | Cloudflare Pages | Live dashboards and editorial sites |

The Ontology as IP

The SBPI ontology (sbpi.ttl) is not a generic schema. It is a domain-specific model of how competitive brand power works in entertainment verticals. The dimension weights encode expert judgment that took months of client engagements to calibrate.

Five Scoring Dimensions

| Dimension | Code | Weight | What It Measures |
|---|---|---|---|
| Content Strength | cs | 20% | Volume, quality, and exclusivity of content produced |
| Narrative Ownership | no | 20% | Control over press coverage, thought leadership, recognition |
| Distribution Power | dp | 25% | App store rankings, global availability, partnerships |
| Community Strength | cm | 20% | Size, engagement intensity, and loyalty of audience |
| Monetization Infrastructure | mi | 15% | Revenue generation: payments, ads, subscriptions, coin systems |

Distribution Power carries the highest weight (25%) because in a mobile-first vertical, app store presence and partnership reach determine whether content reaches audiences. Monetization Infrastructure carries the lowest (15%) because several dominant players subsidize the vertical from existing revenue streams.
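The five weights combine into the composite Brand Power Score as a straight weighted sum. A minimal sketch; the dimension scores in the example are hypothetical, not real SBPI data.

```python
WEIGHTS = {"cs": 0.20, "no": 0.20, "dp": 0.25, "cm": 0.20, "mi": 0.15}

def composite(dim_scores):
    """Weighted composite Brand Power Score from the five dimension scores (0-100)."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9  # weights must sum to 1
    return round(sum(WEIGHTS[d] * dim_scores[d] for d in WEIGHTS), 1)

# Hypothetical dimension scores, not real SBPI data:
print(composite({"cs": 85, "no": 80, "dp": 90, "cm": 82, "mi": 78}))   # 83.6
```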

Performance Tiers

Dominant (85-100)

Market leaders with category-defining presence. Currently: no companies in micro-drama reach this tier.

Strong (70-84)

Established competitors with sustainable advantages. ReelShort (84.0), DramaBox (77.3).

Emerging (55-69)

Growing players gaining structural power. Netflix (70.3), Disney (67.0), Amazon (63.5).

Niche (40-54)

Specialized players with narrow but defensible positions.

Limited (<40)

Pre-launch, resource-constrained, or strategically irrelevant.

Validation: SHACL Shapes

Every data load runs through SHACL shape validation. Bad data fails before entering the store. This is schema enforcement, not error handling.

Enforced Constraints

| Shape | Validates | Constraints |
|---|---|---|
| CompanyShape | Company instances | Exactly 1 name, slug matches ^[a-z][a-z0-9-]*$, isPlatformGiant required |
| ScoreRecordShape | Score records | Exactly 1 company + week, composite 0-100, exactly 5 dimension scores |
| DimensionScoreShape | Dimension scores | Exactly 1 dimension, value 0-100 integer |
| WeekShape | Week instances | Label matches ^W[0-9]{1,2}-[0-9]{4}$ |
| SignalShape | Market signals | signalText required |
| AttestationShape | Provenance records | Confidence 0.0-1.0, sourceType required |
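Production validation runs through pySHACL against the shapes themselves; the flavor of the constraints can be illustrated in plain Python. A sketch covering four of the checks above, with field names assumed to mirror the JSON state file:

```python
import re

SLUG_RE = re.compile(r"^[a-z][a-z0-9-]*$")        # CompanyShape slug pattern
WEEK_RE = re.compile(r"^W[0-9]{1,2}-[0-9]{4}$")   # WeekShape label pattern

def violations(record):
    """Return constraint violations for one score record (empty list = valid)."""
    errs = []
    if not SLUG_RE.match(record.get("slug", "")):
        errs.append("slug must match ^[a-z][a-z0-9-]*$")
    if not WEEK_RE.match(record.get("week", "")):
        errs.append("week label must match ^W[0-9]{1,2}-[0-9]{4}$")
    if not 0 <= record.get("composite", -1) <= 100:
        errs.append("composite must be 0-100")
    if len(record.get("dimensions", {})) != 5:
        errs.append("exactly 5 dimension scores required")
    return errs

good = {"slug": "reelshort", "week": "W11-2026", "composite": 84.0,
        "dimensions": {"cs": 85, "no": 80, "dp": 88, "cm": 82, "mi": 78}}
print(violations(good))                                  # []
print(violations({"slug": "ReelShort!", "week": "w11"}))
```

As with SHACL, a non-empty violation list means the record never reaches the store.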

Attestation Layer

Every score carries provenance metadata tracking confidence and source quality:

Attestation:
    confidence: 0.85         # 0.0-1.0 scale
    sourceType: "expert_judgment"
    # Options: primary_data, secondary_analysis, expert_judgment, automated_inference

The attestation upgrade engine progressively improves scores as evidence quality increases. Signal URLs upgrade confidence from 0.85 to 0.90. Scoring rationales add a second attestation at 0.95. The system tracks not just what it knows, but how well it knows it.
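The upgrade rules reduce to a small function. The confidence values come from the text; the function name and dict layout are assumptions for illustration.

```python
def upgrade_attestation(att, has_signal_url=False, rationale=None):
    """Apply the two upgrade rules described above; returns the attestation list."""
    atts = [dict(att)]
    if has_signal_url:
        # A signal URL upgrades confidence from 0.85 to 0.90.
        atts[0]["confidence"] = max(atts[0]["confidence"], 0.90)
    if rationale is not None:
        # A scoring rationale adds a second, higher-confidence attestation.
        atts.append({"confidence": 0.95, "sourceType": "expert_judgment",
                     "rationale": rationale})
    return atts

base = {"confidence": 0.85, "sourceType": "expert_judgment"}
print(upgrade_attestation(base, has_signal_url=True, rationale="partnership confirmed"))
```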

The Karpathy Auto-Research Pattern

Andrej Karpathy described the future of AI as "auto-research" — headless AI agents that continuously process information, extract structured knowledge, and produce insights without human prompting. We implemented this literally.

Headless Batch Processing

The batch-mapupdate.sh script uses claude -p (Claude's headless CLI mode) to process session records. Each session takes ~20 seconds:

1. claude -p reads session file + extracts entity statements (text only)
2. curl POSTs entities to InfraNodus REST API
3. curl GETs full graph stats (nodes, edges, clusters, gaps)
4. Shell appends MOC entry with parsed results
5. Dashboard regenerated and deployed to Cloudflare Pages

27 sessions processed in a single batch run. 26 succeeded, 1 failed (entity extraction produced raw output instead of parsed entities). 96% success rate. No human interaction required.

Nightly Insights Runner

The nightly-insights.py scheduler fires daily at 6:13 AM and runs seven SPARQL query types (three on every run, four on a weekly cadence):

| Query | Schedule | What It Detects |
|---|---|---|
| Weekly Movers | Nightly | Biggest week-over-week delta changes |
| Dimension Anomalies | Nightly | Dimension-composite gaps >20 points |
| Predictive Signals | Nightly | Momentum patterns predicting next-week movements |
| Tier Transitions | Weekly | Companies crossing tier boundaries |
| Distribution-Community Gap | Weekly | High distribution but low community |
| Attestation Coverage | Weekly | Source backing quality per score record |
| Platform vs. Pure-Play | Weekly | Structural scoring differences by company type |
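The Dimension Anomalies check (a gap of more than 20 points between any dimension and the composite) reduces to a one-line filter. A sketch with hypothetical scores:

```python
def dimension_anomalies(composite, dims, threshold=20):
    """Return dimensions whose score diverges from the composite by > threshold points."""
    return {d: round(v - composite, 1) for d, v in dims.items()
            if abs(v - composite) > threshold}

# Hypothetical scores: distribution far above composite, community far below.
print(dimension_anomalies(63.5, {"cs": 60, "no": 58, "dp": 85, "cm": 40, "mi": 70}))
```

A positive gap flags a dimension propping up the composite; a negative gap flags a structural weakness the headline score hides.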

W12-2026 Forecast

9 momentum predictions generated by the SBPI Prediction Engine. Confidence scores derived from a documented, reproducible formula.

Momentum Signals

| Company | Direction | Momentum | Confidence | Signal |
|---|---|---|---|---|
| JioHotstar | ▲ UP | +11.0 | 85% | Strongest mover in the vertical — JV between Reliance and Disney rapidly gaining structural power |
| COL Group / BeLive | ▲ UP | +8.2 | 75% | Southeast Asian pure-play accelerating distribution partnerships |
| Disney | ▲ UP | +6.5 | 75% | Platform giant gaining in micro-drama through Hotstar JV leverage |
| Amazon | ▼ DOWN | -6.4 | 75% | Declining relative position despite broader entertainment dominance |
| Netflix | ▼ DOWN | -6.0 | 65% | Losing micro-drama ground — vertical-specific weakness, not company-wide |
| GoodShort | ▲ UP | +5.6 | 65% | Pure-play newcomer gaining content strength and community traction |
| Lifetime / A+E | ▲ UP | +5.6 | 65% | Legacy broadcaster adapting to short-form with existing content library |
| DramaBox | ▲ UP | +2.5 | 60% | Strong tier player maintaining upward trajectory |
| ReelShort | ▼ DOWN | -1.1 | 60% | Market leader deceleration — classic signal for competitive entry |

Key Signal

JioHotstar is the breakout mover at +11.0 momentum and 85% confidence. Platform giants (Netflix, Amazon) are losing relative position to pure-play micro-drama companies (ReelShort, DramaBox, GoodShort). The structural shift favors vertical specialists over horizontal aggregators in this market.

Confidence Scoring

Every prediction carries a confidence score computed from a documented formula. No black box.

base = 0.60  # minimum: 2 consecutive weeks, same direction

Adjustments:
  +0.10 if avg |delta| > 3.0    # strong movement
  +0.10 if avg |delta| > 5.0    # very strong (cumulative with above)
  +0.05 if both |delta| > 2.0   # consistent magnitude

Cap: 0.95  # never assert certainty from 2 weeks of data
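The formula is small enough to reproduce directly. The two-week deltas in the example are assumed (JioHotstar's +11.0 momentum split evenly across two weeks), chosen to show the formula reproducing the published 85% figure.

```python
def prediction_confidence(delta_prev, delta_last):
    """Confidence for a two-week momentum prediction, per the formula above."""
    if delta_prev * delta_last <= 0:
        return None                    # requires 2 consecutive weeks, same direction
    avg = (abs(delta_prev) + abs(delta_last)) / 2
    conf = 0.60                        # base
    if avg > 3.0:
        conf += 0.10                   # strong movement
    if avg > 5.0:
        conf += 0.10                   # very strong (cumulative with above)
    if abs(delta_prev) > 2.0 and abs(delta_last) > 2.0:
        conf += 0.05                   # consistent magnitude
    return round(min(conf, 0.95), 2)   # cap: never assert certainty from 2 weeks

print(prediction_confidence(5.5, 5.5))   # 0.85
print(prediction_confidence(3.0, 3.0))   # 0.65
```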

Stagnation Detection

15 companies show zero movement for 2+ consecutive weeks — the stagnation signal that identifies non-competitive tail entries. These companies are either pre-launch, resource-constrained, or have exited the competitive frame.
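Stagnation detection is a trailing-window check. A sketch, assuming each company's weekly composite scores are available as a list:

```python
def is_stagnant(weekly_scores, min_weeks=2):
    """True if the trailing scores show zero movement for min_weeks consecutive weeks."""
    if len(weekly_scores) < min_weeks + 1:
        return False
    tail = weekly_scores[-(min_weeks + 1):]
    return all(a == b for a, b in zip(tail, tail[1:]))

print(is_stagnant([41.0, 41.0, 41.0]))   # True: two consecutive zero-delta weeks
print(is_stagnant([41.0, 41.5, 41.5]))   # False: only one flat week
```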

The Accuracy Experiment

A controlled comparison framework is built and awaiting the first evaluation cycle:

| Method | Description | Expected Accuracy |
|---|---|---|
| Persistence | Predict no change (delta = 0) | ~33% directional |
| Naive Momentum | Same delta as last week | ~40-50% |
| Mean Reversion | Regression toward tier midpoint | ~40-50% |
| KG-Augmented | SBPI engine (momentum + anomaly + confidence) | Target: >60% |
| LLM Zero-Shot | LLM with no KG context | TBD |
| LLM + KG | LLM with full semantic layer context | TBD |

The hypothesis: KG-augmented predictions outperform statistical baselines because the ontology encodes structural information (dimension weights, tier boundaries, company categories) that pure statistical methods cannot access.
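Directional accuracy, the metric behind the "~33% directional" persistence baseline, can be scored as follows. The 0.5-point dead zone for "flat" and the example deltas are assumptions, not the experiment's actual parameters.

```python
def direction(delta, eps=0.5):
    """Bucket a weekly delta into UP / DOWN / FLAT (eps is an assumed dead zone)."""
    return "UP" if delta > eps else ("DOWN" if delta < -eps else "FLAT")

def directional_accuracy(predicted, actual):
    """Share of companies whose predicted direction matched the realized one."""
    hits = sum(direction(p) == direction(a) for p, a in zip(predicted, actual))
    return hits / len(predicted)

# Hypothetical deltas for five companies; not the W12 results.
pred   = [11.0, 6.5, -6.4, 2.5, -1.1]
actual = [ 8.0, 1.2, -3.0, 0.1, -2.2]
print(directional_accuracy(pred, actual))   # 0.8
```

With three possible directions, a random guesser lands near 1/3, which is why persistence is the floor every other method must beat.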

The Scaling Path

Knowledge graphs follow a power-law value curve. Below 1,000 facts, a graph is a reference document. At 10,000-100,000 facts, it becomes a reasoning engine. At 1,000,000+ facts, it becomes a prediction platform.

| Scale | Triples | What Unlocks | Timeline |
|---|---|---|---|
| Current | 1,672 | Single-vertical weekly intelligence | Now |
| 10K | 10,000 | Cross-vertical comparison (micro-drama vs K-drama vs anime) | Q2 2026 |
| 100K | 100,000 | Temporal pattern library across all verticals + historical data | Q3 2026 |
| 1M | 1,000,000 | Full entertainment landscape with investor signal detection | Q4 2026 |
| 100M | 100,000,000 | Multi-industry competitive intelligence platform | 2027 |
| 1B | 1,000,000,000 | General-purpose business intelligence ontology as a service | 2028+ |

Each row is architecturally supported today. The ETL pipeline ingests any JSON state file. The ontology extends to any vertical by adding instances, not changing the schema. The SPARQL queries generalize — weekly-movers.rq works identically whether tracking 22 companies or 22,000.

Why the IP Is Defensible

1. The Ontology Itself

The dimension weights, tier boundaries, and scoring methodology encode years of domain expertise. Copying the code is trivial. Replicating the judgment embedded in sbpi:DistributionPower weight 0.25 requires understanding why distribution matters more than monetization in a mobile-first vertical. That understanding comes from 7+ client engagements and dozens of competitive analysis cycles.

2. The Attestation Chain

Every fact has a confidence score and source type. As the graph grows, the attestation layer creates a trust gradient. Facts backed by primary data and expert judgment outrank automated inference. This trust gradient is itself queryable: "Show me all ReelShort scores where confidence > 0.90" returns only the most reliable data.
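That attestation query can be written directly in SPARQL. This is a sketch only: the prefix URI and property names are assumptions, not the real sbpi.ttl vocabulary.

```sparql
# Sketch: prefix URI and property names are illustrative assumptions.
PREFIX sbpi: <https://example.org/sbpi#>

SELECT ?score ?confidence WHERE {
  ?record sbpi:company     sbpi:reelshort ;
          sbpi:composite   ?score ;
          sbpi:attestation ?att .
  ?att    sbpi:confidence  ?confidence .
  FILTER (?confidence > 0.90)
}
ORDER BY DESC(?confidence)
```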

3. The Prediction Track Record

Each prediction cycle writes RDF instances to the store with timestamps and provenance. Over time, this creates an auditable prediction history. A prediction engine with a proven track record is worth orders of magnitude more than one without. The accuracy experiment quantifies this value every cycle.

Capital Deployment Map

| Investment | Amount | Return |
|---|---|---|
| Additional verticals (3-5) | $15K-25K | 5x data volume, cross-vertical correlation signals |
| Historical data backfill (2+ years) | $10K-15K | Temporal pattern library, seasonal signal detection |
| LLM integration for prediction | $5K-10K | Zero-shot vs KG-augmented accuracy comparison |
| Automated source ingestion | $10K-20K | News feeds, app store data, social signals → RDF |
| Platform productization | $30K-50K | Self-service dashboard, API access, white-label reports |

Total: $70K-120K to reach 100K+ triples and prove the prediction accuracy thesis.

Revenue Model Already Validated

Each client engagement (FrameBright, Fiserv, Long Zhu, AHA) produces intelligence briefs worth $2K-10K using this infrastructure. The marginal cost of each additional engagement decreases as the ontology and tooling improve. The infrastructure is the product.

The Self-Improving Cycle

The system is autopoietic — it improves itself through use.

1. Client Engagement: entity extraction produces KG data.
2. Gap Analysis: reveals structural blind spots in the ontology.
3. Ontology Refinement: blind spots inform dimension weight adjustments and new classes.
4. Next Engagement: starts with a better system; extraction is more precise, gaps are narrower.
5. Compounding Intelligence: each cycle makes the next one faster, cheaper, and more accurate.

Evidence From Production

  • Long Zhu → Produced the Layered Ontology Architecture (K1-K3 + O1-O3). Didn't exist before we needed it.
  • FrameBright → Validated the two-site editorial pattern. Now deploys in hours instead of days.
  • Fiserv → Proved Brand Power Score scales to corporate brands without schema changes.
  • MicroCo (SBPI) → First production semantic layer with SPARQL-queryable predictions.
  • Batch Pipeline → 27 sessions processed autonomously. System writes its own history into structured knowledge.

Parametric vs. Non-Parametric Knowledge

This is where the billion-node thesis comes from. The distinction between what an LLM "knows" and what a knowledge graph knows is the foundation of the entire business model.

Parametric Knowledge (LLM Weights)

When an LLM "knows" that Paris is the capital of France, that knowledge is embedded in its billions of numerical weights. It is static, hard to update without retraining, and prone to hallucination because every answer is a probabilistic guess at what comes next.

  • Frozen at training time
  • Cannot cite its sources
  • Expensive to update (retraining costs millions)
  • Broad but shallow — knows a little about everything

Non-Parametric Knowledge (Our KGs)

Knowledge stored in an external structured format — RDF triples, SPARQL-queryable, with provenance and confidence scores. It is explicit, verifiable, and can be updated instantly without retraining any model.

  • Updated in real-time (weekly ETL cycles)
  • Every fact has an attestation chain
  • Update cost: near zero (append new triples)
  • Narrow but deep — expert-level in specific domains

The ATLAS Argument (Soldaini et al., 2024)

The ATLAS paper argues that for a knowledge graph to be a "peer" to an LLM, it needs to reach billion-node critical mass to match the sheer density of facts an LLM has memorized in its parameters. That's the general case. Our hypothesis is that expert-designed, domain-dense knowledge graphs outperform LLM parametric knowledge in concept-specific work at 1/100th the total scale.

Why Domain Density Changes the Math

A general-purpose knowledge graph like Wikidata holds billions of triples but knows approximately nothing about micro-drama competitive dynamics. GPT-4 has seen the Wikipedia page for ReelShort but cannot tell you that ReelShort's Distribution Power score dropped 0.55 points last week, or that JioHotstar's momentum is +11.0 at 85% confidence.

Our 1,672 triples contain more actionable intelligence about the micro-drama vertical than the entire parametric knowledge of any LLM. That's the domain density argument: a small graph with expert-designed ontology beats a large model with generic training in any task that requires structured reasoning over specific facts.

The IP Engine Model

Every intelligence briefing is not just a deliverable. It is a CapEx investment into a proprietary asset. The revenue pays for the construction of a knowledge graph that will eventually be worth millions in recurring licensing.

The Variables

| Variable | Symbol | Definition | Current Value |
|---|---|---|---|
| Node Density | Nd | Unique entities and relationships extracted per briefing | ~50-150 per engagement |
| Ontological Alpha | α | Uniqueness of schema vs. public sets (Wikidata, DBpedia). This is the moat. | High: SBPI dimensions, tier logic, attestation chains have no public equivalent |
| Extraction Efficiency | Ec | Cost (compute + agentic labor) to move a fact from unstructured briefing to structured KG node | ~$0.02/triple (claude -p + InfraNodus API) |
| Decay Rate | λ | How fast domain information becomes obsolete | Low for structural data (ontology), high for scores (weekly refresh) |

The Core Hypothesis

"By increasing Ontological Alpha through expert-driven schema design, ShurIQ creates a domain-dense KG that outperforms GPT-4's parametric knowledge in specific expert tasks, even at 1/100th the total scale."

Dual Revenue: Service + Asset Accumulation

Every $2K-10K engagement produces two things simultaneously:

Revenue (Deliverable)

Intelligence brief, editorial site, competitive analysis, Brand Power Score

Value: $2K-10K per engagement
Lifecycle: Consumed by client
Compounding: No — each deliverable is standalone

IP Asset (Knowledge Graph)

50-150 new triples, dimension calibrations, pattern library entries, prediction history

Value: Compounds with every engagement
Lifecycle: Permanent (ontology is additive)
Compounding: Yes — every fact makes the next extraction better

The investor pitch: you're not paying for consultants' time. You're funding the construction of a proprietary database that will eventually be worth millions in recurring licensing. The consulting revenue subsidizes the R&D.

The Karpathy Auto-Research Loop

Andrej Karpathy's "Software 2.0" thesis applied to ontology discovery. Agents don't just extract — they propose, critique, rank, and refine the knowledge graph structure itself.

1. Seed: give an agent a Seed Ontology (sbpi.ttl) and a corpus of briefing data.
2. Extract & Propose: the agent extracts facts and proposes new ontological nodes it thinks are missing.
3. Cross-Reference: a critic agent checks proposed nodes against the existing KG for redundancy and contradictions.
4. Stack Rank: the agent ranks each proposed node by how often it bridges gaps between disparate data points.
5. Finalize: humans approve high-value nodes, which are permanently baked into the KG.

What We've Built So Far

Steps 1-2 are operational today. The batch-mapupdate pipeline runs headless claude -p agents that read session records and extract entity statements. The InfraNodus API processes these into graph nodes with cluster analysis and gap detection.

Steps 3-4 are partially implemented through SHACL validation (catches schema violations) and the prediction experiment (ranks signal value by accuracy over time).

Step 5 is operational through the wrapup skill and session capture pipeline — human review of extracted entities before they become permanent graph entries.

The Use of Funds Flywheel

Each dollar invested in ShurIQ doesn't pay for a consultant's time. It builds a flywheel where service revenue funds IP accumulation, and the IP makes each subsequent service faster and more valuable.

| Investment Area | Activity | Scalable Asset (IP) |
|---|---|---|
| Agentic Extraction | Automating the Dolma-style pipeline for private briefings. Headless claude -p batch processing. | A proprietary, locally-hosted "ATLAS" of your client's industry. Updated weekly, queryable via SPARQL. |
| Ontology Design | Mapping dimension weights, tier boundaries, and scoring logic for specific verticals. | A "Schema Library" that can be licensed to other firms. Each vertical adds a new ontology module. |
| Post-Training / Fine-Tuning | Fine-tuning small language models (SLMs) on the specific KG data. | Models that don't hallucinate and "think" like your best analysts. Domain-specific, not general-purpose. |
| Prediction Engine | Running accuracy experiments comparing KG-augmented vs statistical baselines. | A proven prediction track record with documented methodology. The accuracy data IS the moat. |
| Attestation Infrastructure | Building the confidence scoring and source verification pipeline. | Trust gradient across all data. Clients can query "show me only facts backed by primary data." |

The Compounding Math

  • $33 cost per triple (current)
  • $0.02 cost per triple (at scale)
  • 1,650x extraction efficiency gain

At current scale (1,672 triples from ~$55K in engagement work), each triple costs ~$33. At the automated batch processing rate (20 seconds per session, ~$0.02 per extracted triple), the marginal cost drops 1,650x. The fixed cost is ontology design. The variable cost approaches zero.
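The arithmetic behind those headline figures, using the ~$55K engagement total from the text:

```python
engagement_spend = 55_000          # ~$55K of engagement work to date (from the text)
triples = 1_672
automated_cost = 0.02              # $/triple in headless batch mode

cost_now = round(engagement_spend / triples)   # dollars per triple today
gain = round(cost_now / automated_cost)        # efficiency multiple at scale
print(cost_now, gain)   # 33 1650
```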

From Service to Platform

Phase 1: Service Revenue. $2K-10K intelligence briefs; revenue pays for KG construction. Current phase.
Phase 2: Schema Licensing. Ontology modules licensed to other CI firms; recurring revenue from the schema itself.
Phase 3: Platform API. SPARQL endpoint as a service; clients query the KG directly; usage-based pricing.
Phase 4: Private Brain. SLMs fine-tuned on client-specific KG data. "Your analyst, always on."

The Strategic Pivot

This is not service-based reporting. This is IP-based infrastructure. The $100K reports are subsidized R&D that pays for the construction of a database that will eventually be worth millions in recurring licensing. Every engagement makes the next one cheaper, faster, and more accurate — because the knowledge graph grows, and the ontology sharpens.

The Experiment That Proves It

The prediction accuracy experiment is the mechanism that converts this thesis from an argument into evidence.

Experimental Design

For each weekly scoring cycle, we record predictions from 4+ methods against the same companies, then evaluate accuracy when new data loads.

| Method | What It Uses | What It Proves If It Wins |
|---|---|---|
| Persistence | Nothing (predict no change) | Market is random — no method beats chance |
| Naive Momentum | Last week's delta only | Simple statistics are sufficient |
| Mean Reversion | Tier midpoints only | Markets self-correct — no structural model needed |
| KG-Augmented | Full ontology (dimensions, tiers, attestation, signals) | Expert-designed ontology adds predictive value beyond statistics |
| LLM Zero-Shot | Raw question, no KG context | Parametric knowledge alone is sufficient |
| LLM + KG | Full SPARQL context injected into prompt | Non-parametric KG amplifies LLM reasoning |

The hypothesis chain:

  1. KG-Augmented > Statistical Baselines → Proves ontology design has predictive value
  2. LLM + KG > LLM Zero-Shot → Proves non-parametric knowledge amplifies LLM reasoning
  3. Accuracy improves as graph grows → Proves the compounding thesis

If all three hold, the investment thesis is proven: every dollar into the knowledge graph produces exponentially more predictive value over time.

Current Status

68 predictions recorded for W12-2026 across all methods. First evaluation cycle runs when W12 data loads. The framework is built (prediction_experiment.py), the predictions are stored, and the accuracy comparison will produce the first quantitative evidence for the thesis.