The Auto-Research Engine

How expert-designed knowledge graphs produce compounding intelligence. A technical report on the SBPI Semantic Layer — the first production instance of a continuously self-improving competitive intelligence system.

March 2026 · 1,672 RDF Triples · 22 Companies Tracked · 11 Weeks of Data

What We Built

A system that gets smarter every night while we sleep.

The SBPI Semantic Layer tracks competitive brand power across the micro-drama vertical — a $3.7B market growing 40% year-over-year. Every week, new data flows through a formal OWL ontology, gets validated against SHACL shape constraints, loads into an Oxigraph SPARQL endpoint, and produces automated insight digests with predictive signals.

  • 1,672 RDF triples
  • 9 momentum predictions
  • 22 companies tracked
  • 11 weeks of data
  • 30+ OWL classes
  • 85% top signal confidence

The Pipeline

Two data flows feed the system. Both run without human intervention after setup.

Ecosystem Graph Flow

Session Records → claude -p → InfraNodus API → Knowledge Graph

Headless Claude agents read session records, extract entity relationship statements, and post them to the InfraNodus knowledge graph. 27 sessions processed in 16 minutes. 96% success rate. 150 nodes, 849 edges, 12 clusters.

Semantic Layer Flow

SBPI State JSON → sbpi_to_rdf.py → SHACL Validation → Oxigraph Store → SPARQL Queries → Insight Digest

Weekly competitive intelligence data converts to RDF triples, validates against SHACL shapes, loads into an Oxigraph SPARQL endpoint, and gets queried by the nightly insights runner. 7 query types. 11 SPARQL files. Scheduled daily at 6:13 AM.
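The conversion step can be sketched in a few lines. This is a simplified, stdlib-only sketch rather than the real sbpi_to_rdf.py (which uses rdflib); the sbpi: namespace URI and property names are assumptions for illustration.

```python
import json

# Assumed namespace URI; the real sbpi.ttl prefix may differ.
PREFIX = "@prefix sbpi: <https://example.org/sbpi#> ."

def record_to_turtle(rec):
    """Serialize one weekly score record (JSON dict) as Turtle triples."""
    subj = f"sbpi:{rec['slug']}-{rec['week']}"
    lines = [
        f"{subj} a sbpi:ScoreRecord ;",
        f"    sbpi:company sbpi:{rec['slug']} ;",
        f"    sbpi:week sbpi:{rec['week']} ;",
        f"    sbpi:composite {rec['composite']} ;",
    ]
    for code, value in rec["dimensions"].items():
        lines.append(f"    sbpi:{code} {value} ;")
    lines[-1] = lines[-1][:-1] + "."   # close the final statement
    return "\n".join(lines)

raw = ('{"slug": "reelshort", "week": "W11-2026", "composite": 84.0,'
       ' "dimensions": {"dp": 88, "cs": 85}}')
print(PREFIX)
print(record_to_turtle(json.loads(raw)))
```

The output of a run like this is what gets handed to SHACL validation before it can enter the store.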

OTK Lineage

This architecture has a name. In 2001, a European research consortium built OTK — the Ontology-based Knowledge Management Toolkit. Their pipeline: Extract → Structure → Store → Query → Present. Our pipeline in 2026 uses the same architecture with different tools. The difference that matters: AI agents perform extraction, and the extraction improves with each cycle because the ontology constrains what counts as a valid fact.

Production Stack

| Component | Tool | Purpose |
|---|---|---|
| Ontology | OWL 2 (Turtle) | Domain model: 30+ classes, 50+ properties |
| Validation | SHACL (pySHACL) | Schema enforcement on every data load |
| Triple Store | Oxigraph | SPARQL endpoint (RocksDB backend) |
| ETL | Python (rdflib) | JSON → RDF conversion + store loading |
| Queries | SPARQL 1.1 | 11 query files, 8 analysis patterns |
| Predictions | Python | Momentum + anomaly detection + confidence |
| Scheduling | launchd | Daily nightly insights at 6:13 AM |
| Knowledge Graph | InfraNodus | Entity extraction, clusters, gaps |
| Batch Processing | claude -p | Headless session → entity extraction |
| Deployment | Cloudflare Pages | Live dashboards and editorial sites |

The Ontology as IP

The SBPI ontology (sbpi.ttl) is not a generic schema. It is a domain-specific model of how competitive brand power works in entertainment verticals. The dimension weights encode expert judgment that took months of client engagements to calibrate.

Five Scoring Dimensions

| Dimension | Code | Weight | What It Measures |
|---|---|---|---|
| Content Strength | cs | 20% | Volume, quality, and exclusivity of content produced |
| Narrative Ownership | no | 20% | Control over press coverage, thought leadership, recognition |
| Distribution Power | dp | 25% | App store rankings, global availability, partnerships |
| Community Strength | cm | 20% | Size, engagement intensity, and loyalty of audience |
| Monetization Infrastructure | mi | 15% | Revenue generation: payments, ads, subscriptions, coin systems |

Distribution Power carries the highest weight (25%) because in a mobile-first vertical, app store presence and partnership reach determine whether content reaches audiences. Monetization Infrastructure carries the lowest (15%) because several dominant players subsidize the vertical from existing revenue streams.
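The five weights combine into the composite Brand Power Score as a straight weighted sum. A minimal sketch; the dimension scores in the example are hypothetical, not real SBPI data.

```python
WEIGHTS = {"cs": 0.20, "no": 0.20, "dp": 0.25, "cm": 0.20, "mi": 0.15}

def composite(dim_scores):
    """Weighted composite Brand Power Score from the five dimension scores (0-100)."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9  # weights must sum to 1
    return round(sum(WEIGHTS[d] * dim_scores[d] for d in WEIGHTS), 1)

# Hypothetical dimension scores, not real SBPI data:
print(composite({"cs": 85, "no": 80, "dp": 90, "cm": 82, "mi": 78}))   # 83.6
```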

Performance Tiers

Dominant (85-100)

Market leaders with category-defining presence. Currently: no companies in micro-drama reach this tier.

Strong (70-84)

Established competitors with sustainable advantages. ReelShort (84.0), DramaBox (77.3).

Emerging (55-69)

Growing players gaining structural power. Netflix (70.3), Disney (67.0), Amazon (63.5).

Niche (40-54)

Specialized players with narrow but defensible positions.

Limited (<40)

Pre-launch, resource-constrained, or strategically irrelevant.

Validation: SHACL Shapes

Every data load runs through SHACL shape validation. Bad data fails before entering the store. This is schema enforcement, not error handling.

Enforced Constraints

| Shape | Validates | Constraints |
|---|---|---|
| CompanyShape | Company instances | Exactly 1 name, slug matches ^[a-z][a-z0-9-]*$, isPlatformGiant required |
| ScoreRecordShape | Score records | Exactly 1 company + week, composite 0-100, exactly 5 dimension scores |
| DimensionScoreShape | Dimension scores | Exactly 1 dimension, value 0-100 integer |
| WeekShape | Week instances | Label matches ^W[0-9]{1,2}-[0-9]{4}$ |
| SignalShape | Market signals | signalText required |
| AttestationShape | Provenance records | Confidence 0.0-1.0, sourceType required |
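Production validation runs through pySHACL against the shapes themselves; the flavor of the constraints can be illustrated in plain Python. A sketch covering four of the checks above, with field names assumed to mirror the JSON state file:

```python
import re

SLUG_RE = re.compile(r"^[a-z][a-z0-9-]*$")        # CompanyShape slug pattern
WEEK_RE = re.compile(r"^W[0-9]{1,2}-[0-9]{4}$")   # WeekShape label pattern

def violations(record):
    """Return constraint violations for one score record (empty list = valid)."""
    errs = []
    if not SLUG_RE.match(record.get("slug", "")):
        errs.append("slug must match ^[a-z][a-z0-9-]*$")
    if not WEEK_RE.match(record.get("week", "")):
        errs.append("week label must match ^W[0-9]{1,2}-[0-9]{4}$")
    if not 0 <= record.get("composite", -1) <= 100:
        errs.append("composite must be 0-100")
    if len(record.get("dimensions", {})) != 5:
        errs.append("exactly 5 dimension scores required")
    return errs

good = {"slug": "reelshort", "week": "W11-2026", "composite": 84.0,
        "dimensions": {"cs": 85, "no": 80, "dp": 88, "cm": 82, "mi": 78}}
print(violations(good))                                  # []
print(violations({"slug": "ReelShort!", "week": "w11"}))
```

As with SHACL, a non-empty violation list means the record never reaches the store.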

Attestation Layer

Every score carries provenance metadata tracking confidence and source quality:

Attestation:
    confidence: 0.85         # 0.0-1.0 scale
    sourceType: "expert_judgment"
    # Options: primary_data, secondary_analysis, expert_judgment, automated_inference

The attestation upgrade engine progressively improves scores as evidence quality increases. Signal URLs upgrade confidence from 0.85 to 0.90. Scoring rationales add a second attestation at 0.95. The system tracks not just what it knows, but how well it knows it.
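The upgrade rules reduce to a small function. The confidence values come from the text; the function name and dict layout are assumptions for illustration.

```python
def upgrade_attestation(att, has_signal_url=False, rationale=None):
    """Apply the two upgrade rules described above; returns the attestation list."""
    atts = [dict(att)]
    if has_signal_url:
        # A signal URL upgrades confidence from 0.85 to 0.90.
        atts[0]["confidence"] = max(atts[0]["confidence"], 0.90)
    if rationale is not None:
        # A scoring rationale adds a second, higher-confidence attestation.
        atts.append({"confidence": 0.95, "sourceType": "expert_judgment",
                     "rationale": rationale})
    return atts

base = {"confidence": 0.85, "sourceType": "expert_judgment"}
print(upgrade_attestation(base, has_signal_url=True, rationale="partnership confirmed"))
```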

The Karpathy Auto-Research Pattern

Andrej Karpathy described the future of AI as "auto-research" — headless AI agents that continuously process information, extract structured knowledge, and produce insights without human prompting. We implemented this literally.

Headless Batch Processing

The batch-mapupdate.sh script uses claude -p (Claude's headless CLI mode) to process session records. Each session takes ~20 seconds:

1. claude -p reads session file + extracts entity statements (text only)
2. curl POSTs entities to InfraNodus REST API
3. curl GETs full graph stats (nodes, edges, clusters, gaps)
4. Shell appends MOC entry with parsed results
5. Dashboard regenerated and deployed to Cloudflare Pages

27 sessions processed in a single batch run. 26 succeeded, 1 failed (entity extraction produced raw output instead of parsed entities). 96% success rate. No human interaction required.

Nightly Insights Runner

The nightly-insights.py scheduler fires daily at 6:13 AM and runs seven SPARQL query types (three on every run, four on a weekly cadence):

| Query | Schedule | What It Detects |
|---|---|---|
| Weekly Movers | Nightly | Biggest week-over-week delta changes |
| Dimension Anomalies | Nightly | Dimension-composite gaps >20 points |
| Predictive Signals | Nightly | Momentum patterns predicting next-week movements |
| Tier Transitions | Weekly | Companies crossing tier boundaries |
| Distribution-Community Gap | Weekly | High distribution but low community |
| Attestation Coverage | Weekly | Source backing quality per score record |
| Platform vs. Pure-Play | Weekly | Structural scoring differences by company type |
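The Dimension Anomalies check (a gap of more than 20 points between any dimension and the composite) reduces to a one-line filter. A sketch with hypothetical scores:

```python
def dimension_anomalies(composite, dims, threshold=20):
    """Return dimensions whose score diverges from the composite by > threshold points."""
    return {d: round(v - composite, 1) for d, v in dims.items()
            if abs(v - composite) > threshold}

# Hypothetical scores: distribution far above composite, community far below.
print(dimension_anomalies(63.5, {"cs": 60, "no": 58, "dp": 85, "cm": 40, "mi": 70}))
```

A positive gap flags a dimension propping up the composite; a negative gap flags a structural weakness the headline score hides.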

W12-2026 Forecast

9 momentum predictions generated by the SBPI Prediction Engine. Confidence scores derived from a documented, reproducible formula.

Momentum Signals

| Company | Direction | Momentum | Confidence | Signal |
|---|---|---|---|---|
| JioHotstar | ▲ UP | +11.0 | 85% | Strongest mover in the vertical — JV between Reliance and Disney rapidly gaining structural power |
| COL Group / BeLive | ▲ UP | +8.2 | 75% | Southeast Asian pure-play accelerating distribution partnerships |
| Disney | ▲ UP | +6.5 | 75% | Platform giant gaining in micro-drama through Hotstar JV leverage |
| Amazon | ▼ DOWN | -6.4 | 75% | Declining relative position despite broader entertainment dominance |
| Netflix | ▼ DOWN | -6.0 | 65% | Losing micro-drama ground — vertical-specific weakness, not company-wide |
| GoodShort | ▲ UP | +5.6 | 65% | Pure-play newcomer gaining content strength and community traction |
| Lifetime / A+E | ▲ UP | +5.6 | 65% | Legacy broadcaster adapting to short-form with existing content library |
| DramaBox | ▲ UP | +2.5 | 60% | Strong tier player maintaining upward trajectory |
| ReelShort | ▼ DOWN | -1.1 | 60% | Market leader deceleration — classic signal for competitive entry |

Key Signal

JioHotstar is the breakout mover at +11.0 momentum and 85% confidence. Platform giants (Netflix, Amazon) are losing relative position to pure-play micro-drama companies (ReelShort, DramaBox, GoodShort). The structural shift favors vertical specialists over horizontal aggregators in this market.

Confidence Scoring

Every prediction carries a confidence score computed from a documented formula. No black box.

base = 0.60  # minimum: 2 consecutive weeks, same direction

Adjustments:
  +0.10 if avg |delta| > 3.0    # strong movement
  +0.10 if avg |delta| > 5.0    # very strong (cumulative with above)
  +0.05 if both |delta| > 2.0   # consistent magnitude

Cap: 0.95  # never assert certainty from 2 weeks of data
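The formula is small enough to reproduce directly. The two-week deltas in the example are assumed (JioHotstar's +11.0 momentum split evenly across two weeks), chosen to show the formula reproducing the published 85% figure.

```python
def prediction_confidence(delta_prev, delta_last):
    """Confidence for a two-week momentum prediction, per the formula above."""
    if delta_prev * delta_last <= 0:
        return None                    # requires 2 consecutive weeks, same direction
    avg = (abs(delta_prev) + abs(delta_last)) / 2
    conf = 0.60                        # base
    if avg > 3.0:
        conf += 0.10                   # strong movement
    if avg > 5.0:
        conf += 0.10                   # very strong (cumulative with above)
    if abs(delta_prev) > 2.0 and abs(delta_last) > 2.0:
        conf += 0.05                   # consistent magnitude
    return round(min(conf, 0.95), 2)   # cap: never assert certainty from 2 weeks

print(prediction_confidence(5.5, 5.5))   # 0.85
print(prediction_confidence(3.0, 3.0))   # 0.65
```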

Stagnation Detection

15 companies show zero movement for 2+ consecutive weeks — the stagnation signal that identifies non-competitive tail entries. These companies are either pre-launch, resource-constrained, or have exited the competitive frame.
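Stagnation detection is a trailing-window check. A sketch, assuming each company's weekly composite scores are available as a list:

```python
def is_stagnant(weekly_scores, min_weeks=2):
    """True if the trailing scores show zero movement for min_weeks consecutive weeks."""
    if len(weekly_scores) < min_weeks + 1:
        return False
    tail = weekly_scores[-(min_weeks + 1):]
    return all(a == b for a, b in zip(tail, tail[1:]))

print(is_stagnant([41.0, 41.0, 41.0]))   # True: two consecutive zero-delta weeks
print(is_stagnant([41.0, 41.5, 41.5]))   # False: only one flat week
```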

The Accuracy Experiment

A controlled comparison framework is built and awaiting the first evaluation cycle:

| Method | Description | Expected Accuracy |
|---|---|---|
| Persistence | Predict no change (delta = 0) | ~33% directional |
| Naive Momentum | Same delta as last week | ~40-50% |
| Mean Reversion | Regression toward tier midpoint | ~40-50% |
| KG-Augmented | SBPI engine (momentum + anomaly + confidence) | Target: >60% |
| LLM Zero-Shot | LLM with no KG context | TBD |
| LLM + KG | LLM with full semantic layer context | TBD |

The hypothesis: KG-augmented predictions outperform statistical baselines because the ontology encodes structural information (dimension weights, tier boundaries, company categories) that pure statistical methods cannot access.
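Directional accuracy, the metric behind the "~33% directional" persistence baseline, can be scored as follows. The 0.5-point dead zone for "flat" and the example deltas are assumptions, not the experiment's actual parameters.

```python
def direction(delta, eps=0.5):
    """Bucket a weekly delta into UP / DOWN / FLAT (eps is an assumed dead zone)."""
    return "UP" if delta > eps else ("DOWN" if delta < -eps else "FLAT")

def directional_accuracy(predicted, actual):
    """Share of companies whose predicted direction matched the realized one."""
    hits = sum(direction(p) == direction(a) for p, a in zip(predicted, actual))
    return hits / len(predicted)

# Hypothetical deltas for five companies; not the W12 results.
pred   = [11.0, 6.5, -6.4, 2.5, -1.1]
actual = [ 8.0, 1.2, -3.0, 0.1, -2.2]
print(directional_accuracy(pred, actual))   # 0.8
```

With three possible directions, a random guesser lands near 1/3, which is why persistence is the floor every other method must beat.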

The Scaling Path

Knowledge graphs follow a power-law value curve. Below 1,000 facts, a graph is a reference document. At 10,000-100,000 facts, it becomes a reasoning engine. At 1,000,000+ facts, it becomes a prediction platform.

| Scale | Triples | What Unlocks | Timeline |
|---|---|---|---|
| Current | 1,672 | Single-vertical weekly intelligence | Now |
| 10K | 10,000 | Cross-vertical comparison (micro-drama vs K-drama vs anime) | Q2 2026 |
| 100K | 100,000 | Temporal pattern library across all verticals + historical data | Q3 2026 |
| 1M | 1,000,000 | Full entertainment landscape with investor signal detection | Q4 2026 |
| 100M | 100,000,000 | Multi-industry competitive intelligence platform | 2027 |
| 1B | 1,000,000,000 | General-purpose business intelligence ontology as a service | 2028+ |

Each row is architecturally supported today. The ETL pipeline ingests any JSON state file. The ontology extends to any vertical by adding instances, not changing the schema. The SPARQL queries generalize — weekly-movers.rq works identically whether tracking 22 companies or 22,000.

Why the IP Is Defensible

1. The Ontology Itself

The dimension weights, tier boundaries, and scoring methodology encode years of domain expertise. Copying the code is trivial. Replicating the judgment embedded in sbpi:DistributionPower weight 0.25 requires understanding why distribution matters more than monetization in a mobile-first vertical. That understanding comes from 7+ client engagements and dozens of competitive analysis cycles.

2. The Attestation Chain

Every fact has a confidence score and source type. As the graph grows, the attestation layer creates a trust gradient. Facts backed by primary data and expert judgment outrank automated inference. This trust gradient is itself queryable: "Show me all ReelShort scores where confidence > 0.90" returns only the most reliable data.
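That attestation query can be written directly in SPARQL. This is a sketch only: the prefix URI and property names are assumptions, not the real sbpi.ttl vocabulary.

```sparql
# Sketch: prefix URI and property names are illustrative assumptions.
PREFIX sbpi: <https://example.org/sbpi#>

SELECT ?score ?confidence WHERE {
  ?record sbpi:company     sbpi:reelshort ;
          sbpi:composite   ?score ;
          sbpi:attestation ?att .
  ?att    sbpi:confidence  ?confidence .
  FILTER (?confidence > 0.90)
}
ORDER BY DESC(?confidence)
```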

3. The Prediction Track Record

Each prediction cycle writes RDF instances to the store with timestamps and provenance. Over time, this creates an auditable prediction history. A prediction engine with a proven track record is worth orders of magnitude more than one without. The accuracy experiment quantifies this value every cycle.

Capital Deployment Map

| Investment | Amount | Return |
|---|---|---|
| Additional verticals (3-5) | $15K-25K | 5x data volume, cross-vertical correlation signals |
| Historical data backfill (2+ years) | $10K-15K | Temporal pattern library, seasonal signal detection |
| LLM integration for prediction | $5K-10K | Zero-shot vs KG-augmented accuracy comparison |
| Automated source ingestion | $10K-20K | News feeds, app store data, social signals → RDF |
| Platform productization | $30K-50K | Self-service dashboard, API access, white-label reports |

Total: $70K-120K to reach 100K+ triples and prove the prediction accuracy thesis.

Revenue Model Already Validated

Each client engagement (FrameBright, Fiserv, Long Zhu, AHA) produces intelligence briefs worth $2K-10K using this infrastructure. The marginal cost of each additional engagement decreases as the ontology and tooling improve. The infrastructure is the product.

The Self-Improving Cycle

The system is autopoietic — it improves itself through use.

1. Client Engagement: entity extraction produces KG data.
2. Gap Analysis: reveals structural blind spots in the ontology.
3. Ontology Refinement: blind spots inform dimension weight adjustments and new classes.
4. Next Engagement: starts with a better system; extraction is more precise, gaps are narrower.
5. Compounding Intelligence: each cycle makes the next one faster, cheaper, and more accurate.

Evidence From Production

  • Long Zhu → Produced the Layered Ontology Architecture (K1-K3 + O1-O3). Didn't exist before we needed it.
  • FrameBright → Validated the two-site editorial pattern. Now deploys in hours instead of days.
  • Fiserv → Proved Brand Power Score scales to corporate brands without schema changes.
  • MicroCo (SBPI) → First production semantic layer with SPARQL-queryable predictions.
  • Batch Pipeline → 27 sessions processed autonomously. System writes its own history into structured knowledge.

Parametric vs. Non-Parametric Knowledge

This is where the billion-node thesis comes from. The distinction between what an LLM "knows" and what a knowledge graph knows is the foundation of the entire business model.

Parametric Knowledge (LLM Weights)

When an LLM "knows" that Paris is the capital of France, that knowledge is embedded in its billions of numerical weights. It is static, hard to update without retraining, and prone to hallucination because every answer is a probabilistic guess at what comes next.

  • Frozen at training time
  • Cannot cite its sources
  • Expensive to update (retraining costs millions)
  • Broad but shallow — knows a little about everything

Non-Parametric Knowledge (Our KGs)

Knowledge stored in an external structured format — RDF triples, SPARQL-queryable, with provenance and confidence scores. It is explicit, verifiable, and can be updated instantly without retraining any model.

  • Updated in real-time (weekly ETL cycles)
  • Every fact has an attestation chain
  • Update cost: near zero (append new triples)
  • Narrow but deep — expert-level in specific domains

The ATLAS Argument (Soldaini et al., 2024)

The ATLAS paper argues that for a knowledge graph to be a "peer" to an LLM, it needs to reach billion-node critical mass to match the sheer density of facts an LLM has memorized in its parameters. That's the general case. Our hypothesis is that expert-designed, domain-dense knowledge graphs outperform LLM parametric knowledge in concept-specific work at 1/100th the total scale.

Why Domain Density Changes the Math

A general-purpose knowledge graph like Wikidata holds billions of triples but knows approximately nothing about micro-drama competitive dynamics. GPT-4 has seen the Wikipedia page for ReelShort but cannot tell you that ReelShort's Distribution Power score dropped 0.55 points last week, or that JioHotstar's momentum is +11.0 at 85% confidence.

Our 1,672 triples contain more actionable intelligence about the micro-drama vertical than the entire parametric knowledge of any LLM. That's the domain density argument: a small graph with expert-designed ontology beats a large model with generic training in any task that requires structured reasoning over specific facts.

The IP Engine Model

Every intelligence briefing is not just a deliverable. It is a CapEx investment into a proprietary asset. The revenue pays for the construction of a knowledge graph that will eventually be worth millions in recurring licensing.

The Variables

| Variable | Symbol | Definition | Current Value |
|---|---|---|---|
| Node Density | Nd | Unique entities and relationships extracted per briefing | ~50-150 per engagement |
| Ontological Alpha | α | Uniqueness of schema vs. public sets (Wikidata, DBpedia). This is the moat. | High: SBPI dimensions, tier logic, attestation chains have no public equivalent |
| Extraction Efficiency | Ec | Cost (compute + agentic labor) to move a fact from unstructured briefing to structured KG node | ~$0.02/triple (claude -p + InfraNodus API) |
| Decay Rate | λ | How fast domain information becomes obsolete | Low for structural data (ontology), high for scores (weekly refresh) |

The Core Hypothesis

"By increasing Ontological Alpha through expert-driven schema design, ShurIQ creates a domain-dense KG that outperforms GPT-4's parametric knowledge in specific expert tasks, even at 1/100th the total scale."

Dual Revenue: Service + Asset Accumulation

Every $2K-10K engagement produces two things simultaneously:

Revenue (Deliverable)

Intelligence brief, editorial site, competitive analysis, Brand Power Score

Value: $2K-10K per engagement
Lifecycle: Consumed by client
Compounding: No — each deliverable is standalone

IP Asset (Knowledge Graph)

50-150 new triples, dimension calibrations, pattern library entries, prediction history

Value: Compounds with every engagement
Lifecycle: Permanent (ontology is additive)
Compounding: Yes — every fact makes the next extraction better

The investor pitch: you're not paying for consultants' time. You're funding the construction of a proprietary database that will eventually be worth millions in recurring licensing. The consulting revenue subsidizes the R&D.

The Karpathy Auto-Research Loop

Andrej Karpathy's "Software 2.0" thesis applied to ontology discovery. Agents don't just extract — they propose, critique, rank, and refine the knowledge graph structure itself.

1. Seed: give an agent a Seed Ontology (sbpi.ttl) and a corpus of briefing data.
2. Extract & Propose: the agent extracts facts and proposes new ontological nodes it thinks are missing.
3. Cross-Reference: a critic agent checks proposed nodes against the existing KG for redundancy and contradictions.
4. Stack Rank: the agent ranks each proposed node by how often it bridges gaps between disparate data points.
5. Finalize: humans approve high-value nodes, which are permanently baked into the KG.

What We've Built So Far

Steps 1-2 are operational today. The batch-mapupdate pipeline runs headless claude -p agents that read session records and extract entity statements. The InfraNodus API processes these into graph nodes with cluster analysis and gap detection.

Steps 3-4 are partially implemented through SHACL validation (catches schema violations) and the prediction experiment (ranks signal value by accuracy over time).

Step 5 is operational through the wrapup skill and session capture pipeline — human review of extracted entities before they become permanent graph entries.

The Use of Funds Flywheel

Each dollar invested in ShurIQ doesn't pay for a consultant's time. It builds a flywheel where service revenue funds IP accumulation, and the IP makes each subsequent service faster and more valuable.

| Investment Area | Activity | Scalable Asset (IP) |
|---|---|---|
| Agentic Extraction | Automating the Dolma-style pipeline for private briefings. Headless claude -p batch processing. | A proprietary, locally-hosted "ATLAS" of your client's industry. Updated weekly, queryable via SPARQL. |
| Ontology Design | Mapping dimension weights, tier boundaries, and scoring logic for specific verticals. | A "Schema Library" that can be licensed to other firms. Each vertical adds a new ontology module. |
| Post-Training / Fine-Tuning | Fine-tuning small language models (SLMs) on the specific KG data. | Models that don't hallucinate and "think" like your best analysts. Domain-specific, not general-purpose. |
| Prediction Engine | Running accuracy experiments comparing KG-augmented vs statistical baselines. | A proven prediction track record with documented methodology. The accuracy data IS the moat. |
| Attestation Infrastructure | Building the confidence scoring and source verification pipeline. | Trust gradient across all data. Clients can query "show me only facts backed by primary data." |

The Compounding Math

  • $33 cost per triple (current)
  • $0.02 cost per triple (at scale)
  • 1,650x extraction efficiency gain

At current scale (1,672 triples from ~$55K in engagement work), each triple costs ~$33. At the automated batch processing rate (20 seconds per session, ~$0.02 per extracted triple), the marginal cost drops 1,650x. The fixed cost is ontology design. The variable cost approaches zero.
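The arithmetic behind those headline figures, using the ~$55K engagement total from the text:

```python
engagement_spend = 55_000          # ~$55K of engagement work to date (from the text)
triples = 1_672
automated_cost = 0.02              # $/triple in headless batch mode

cost_now = round(engagement_spend / triples)   # dollars per triple today
gain = round(cost_now / automated_cost)        # efficiency multiple at scale
print(cost_now, gain)   # 33 1650
```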

From Service to Platform

Phase 1: Service Revenue. $2K-10K intelligence briefs; revenue pays for KG construction. Current phase.
Phase 2: Schema Licensing. Ontology modules licensed to other CI firms; recurring revenue from the schema itself.
Phase 3: Platform API. SPARQL endpoint as a service; clients query the KG directly; usage-based pricing.
Phase 4: Private Brain. SLMs fine-tuned on client-specific KG data. "Your analyst, always on."

The Strategic Pivot

This is not service-based reporting. This is IP-based infrastructure. The $100K reports are subsidized R&D that pays for the construction of a database that will eventually be worth millions in recurring licensing. Every engagement makes the next one cheaper, faster, and more accurate — because the knowledge graph grows, and the ontology sharpens.

The Experiment That Proves It

The prediction accuracy experiment is the mechanism that converts this thesis from an argument into evidence.

Experimental Design

For each weekly scoring cycle, we record predictions from 4+ methods against the same companies, then evaluate accuracy when new data loads.

| Method | What It Uses | What It Proves If It Wins |
|---|---|---|
| Persistence | Nothing (predict no change) | Market is random — no method beats chance |
| Naive Momentum | Last week's delta only | Simple statistics are sufficient |
| Mean Reversion | Tier midpoints only | Markets self-correct — no structural model needed |
| KG-Augmented | Full ontology (dimensions, tiers, attestation, signals) | Expert-designed ontology adds predictive value beyond statistics |
| LLM Zero-Shot | Raw question, no KG context | Parametric knowledge alone is sufficient |
| LLM + KG | Full SPARQL context injected into prompt | Non-parametric KG amplifies LLM reasoning |

The hypothesis chain:

  1. KG-Augmented > Statistical Baselines → Proves ontology design has predictive value
  2. LLM + KG > LLM Zero-Shot → Proves non-parametric knowledge amplifies LLM reasoning
  3. Accuracy improves as graph grows → Proves the compounding thesis

If all three hold, the investment thesis is proven: every dollar into the knowledge graph produces exponentially more predictive value over time.

Current Status

68 predictions recorded for W12-2026 across all methods. First evaluation cycle runs when W12 data loads. The framework is built (prediction_experiment.py), the predictions are stored, and the accuracy comparison will produce the first quantitative evidence for the thesis.