Data owners — company one-pagers

CME Group CME ○ operator

mkt cap ~$93B | ~ EV est | EV/Sales ~15x | YoY growth +6% | price ~web price · premium

Data

Nature of the data

Data 6 · Neutral

Derivatives pricing & trade data
High-margin byproduct of the exchange

Data trajectory (stock vs flow)

Growing

Derivatives data flow grows with record volumes

Position on the AI-unlock curve

AI 5 · Neutral

Sells valuable data, but it's not the thesis

Current AI contracts & counterparties

~ desk note

Sells market data conventionally

Possibilities for additional contracts

Derivatives data into quant/agent stacks

AI risks — what stands to lose

Minimal — clearing/execution moat unaffected

Assessment

Valuation & discrepancy

Disc 3 · Low

Premium, well-understood
Owner-ish, but data isn't the re-rate

Convexity & why

Low

Priced, data not the driver

Other endogenous concerns

Volume cyclicality; FMX (BGC) attacking rates franchise

Hype factor (market awareness)

Low

Not a data-AI story

Catalysts

Volume cycles; data pricing

FactSet FDS ◆ owner

mkt cap ~$9.0B | ✓ FMP | EV/Sales ~4.3x | YoY growth +5% | price ~web price · de-rated

Data

Nature of the data

Data 6 · Neutral

Entity-linked financial data: fundamentals, estimates, ownership, transcripts
'Symbology' deep ticker-linking is the connective tissue agents need
But much content is aggregated/licensed, not owned — caps the moat
Workflow terminals for buy/sell-side

Data trajectory (stock vs flow)

Steady flow

Coverage expands steadily; much content aggregated, not originated

Position on the AI-unlock curve

AI 7 · High

Conversational FactSet Mercury shipped; 48/50 top clients on AI tools
Clean, entity-linked data is ideal RAG fuel for finance copilots
Mid/high on the curve
Aggregated data limits licensing leverage

Current AI contracts & counterparties

~ desk note

FactSet Mercury + transcript AI; aggregated content limits licensing

Possibilities for additional contracts

Symbology/entity-linking as agent infrastructure

AI risks — what stands to lose

The terminal seat is the product — agents directly substitute analyst workflows
Aggregated (non-owned) content gives least pricing defense

Assessment

Valuation & discrepancy

Disc 6 · Neutral

~4.3x EV/Sales on +5% growth — quality at a modest multiple
Aggregated (non-owned) data caps the moat
Modest favorable gap on metrics

Convexity & why

Moderate

Quality franchise at a modest multiple — some re-rate optionality
Aggregated content caps the upside
Balanced

Other endogenous concerns

Content-licensing input costs (incl. CUSIP) squeeze margins
CEO transition; retention metrics softening

Hype factor (market awareness)

Med — as threat

De-rated with the info-services group in Feb 2026

Catalysts

Retention metrics; Mercury adoption

Intercontinental Exch. ICE ◆ owner

mkt cap ~$80B | ~ EV est | EV/Sales ~10x | YoY growth +6% | price ~web price · premium

Data

Nature of the data

Data 8 · High

Dominant US mortgage data (Black Knight/Ellie Mae) — origination/servicing graph
Pricing & fixed-income reference data
Hard-to-replicate corpus inside an 'exchange' wrapper

Data trajectory (stock vs flow)

Cyclical flow

Mortgage data flows with origination cycle; pricing data steady

Position on the AI-unlock curve

AI 6 · Neutral

Steadily productizing pricing/reference data
Mortgage data graph is AI-relevant
Mid on the curve

Current AI contracts & counterparties

~ desk note

In-product mortgage-AI; data feeds sold conventionally

Possibilities for additional contracts

Mortgage-graph grounding for housing/credit agents

AI risks — what stands to lose

Minimal — transaction infrastructure; some data products commoditized

Assessment

Valuation & discrepancy

Disc 4 · Neutral

A real data owner that screens miss, because it files as an exchange
Mostly priced

Convexity & why

Low

Quality priced
Limited asymmetry

Other endogenous concerns

Mortgage tech is deeply cyclical — bought at the top
Black Knight deal debt still being digested

Hype factor (market awareness)

Low

Read as an exchange, never as a data-AI play

Catalysts

Mortgage cycle; IMB platform wins

Moody's MCO ◆ owner

mkt cap ~$79B | ✓ FMP | EV/Sales ~11x | YoY growth +9% | price ~web price · ~40x P/E

Data

Nature of the data

Data 9 · High

Credit ratings (MIS) + Moody's Analytics
Orbis: largest private-company database (~500M entities)
Default histories + ownership graph — decision-grade
Essential grounding for credit agents, KYC, supply-chain AI

Data trajectory (stock vs flow)

Growing

Orbis entity graph keeps expanding (~500M+ entities)
Ratings/transcript flow continuous; issuance cyclical

Position on the AI-unlock curve

AI 9 · High

Early OpenAI partnership; Research Assistant copilot
MCP distribution into Claude/ChatGPT/Copilot
Packaging data for agentic workflows — furthest on distribution
High — arguably best-executed, hence richly priced

Current AI contracts & counterparties

✓ deep dive

Early OpenAI partnership; Research Assistant copilot
MCP distribution into Claude/ChatGPT/Copilot
No raw licensing — productized access only

Possibilities for additional contracts

Agentic KYC/credit-memo workflows priced per seat
Orbis private-company graph as agent grounding

AI risks — what stands to lose

Analytics research/tools face AI commoditization; ratings are regulatorily protected
KYC/compliance products meet AI-native challengers

Assessment

Valuation & discrepancy

Disc 2 · Low

Best business + furthest-along AI
~11x sales / ~40x earnings to match
Thinnest discount; DCFs flag it rich

Convexity & why

Low

Best business, thinnest discount, DCF flags it rich
Limited upside → low convexity

Other endogenous concerns

Ratings revenue rides the debt-issuance cycle
Duopoly position invites periodic antitrust/regulatory attention

Hype factor (market awareness)

High

Best-executed AI strategy is consensus; it's in the ~11x

Catalysts

Agentic product attach rates
Ratings issuance cycle
Orbis monetization moves

Morningstar MORN ◆ owner

mkt cap ~$7.0B | ✓ FMP | EV/Sales ~3.4x | YoY growth +8% | price ~web price

Data

Nature of the data

Data 7 · High

Fund/ETF data, star & analyst ratings; DBRS credit ratings
PitchBook private-markets/VC dataset is the scarce crown jewel
Fund data feeds advisor copilots

Data trajectory (stock vs flow)

Growing

PitchBook's private-company universe compounds with VC/PE activity

Position on the AI-unlock curve

AI 5 · Neutral

Mo chatbot + PitchBook AI features
Monetization mostly stays in-product
Mid on the curve

Current AI contracts & counterparties

~ desk note

Mo assistant; PitchBook AI features; in-product only

Possibilities for additional contracts

PitchBook private-market data licensing to AI deal tools

AI risks — what stands to lose

Fund research commoditized by AI summarization; ratings brand defensible
PitchBook data scraping/inference by AI tools

Assessment

Valuation & discrepancy

Disc 6 · Neutral

~$7.0B cap, ~3.4x EV/Sales on +8% growth
Cheap for a PitchBook-owning franchise
Market under-paying for the private-markets data

Convexity & why

Moderate–High

PitchBook AI-deal-sourcing optionality, cheaply priced
No hard catalyst
Cheap enough to tilt positive

Other endogenous concerns

Founder (Mansueto) voting control
PitchBook decelerated with the VC downturn; DBRS is issuance-cyclical

Hype factor (market awareness)

Low

PitchBook's AI value ~absent from the narrative

Catalysts

PitchBook growth; advisor-AI launches

MSCI MSCI ◆ owner

mkt cap ~$44B | ✓ FMP | EV/Sales ~16x | YoY growth +10% | price ~web price · premium

Data

Nature of the data

Data 8 · High

Indices (World, EM) that portfolios are built on and measured against
ESG/climate ratings, Barra factor/risk models, Burgiss private-asset data
Benchmarks + factor models are chokepoints
Index licensing is a recurring toll-road

Data trajectory (stock vs flow)

Growing

Index/factor data grows with markets; private-asset (Burgiss) expanding fast

Position on the AI-unlock curve

AI 7 · High

IndexAI connector; "train clients' LLMs" roadmap
Solid enterprise APIs
Less aggressive than S&P/Moody's
Mid/high — capable but measured

Current AI contracts & counterparties

~ desk note

IndexAI connector; "train clients' LLMs" roadmap; no licensing $ disclosed

Possibilities for additional contracts

Benchmark/factor licensing to agent platforms

AI risks — what stands to lose

ESG/analytics tools commoditized by AI; index licensing protected

Assessment

Valuation & discrepancy

Disc 3 · Low

~16x EV/Sales, one of the richest here
Priced as the premium compounder it is
No discount to the quality

Convexity & why

Low

Richest multiple here
Least convex — priced for the quality

Other endogenous concerns

Client concentration in fee-pressured asset managers
US political backlash against ESG products

Hype factor (market awareness)

Med

AI seen as feature, not thesis

Catalysts

Index flows; ESG/private-asset data attach

Nasdaq NDAQ ◆ owner

mkt cap ~$49B | ~ EV est | EV/Sales ~12x | YoY growth +8% | price ~web price · premium

Data

Nature of the data

Data 7 · High

100+ proprietary market-data feeds
Index & analytics data products
A licensing toll-road like S&P benchmarks

Data trajectory (stock vs flow)

Growing

Market data grows with volumes; Verafin fraud signals compound

Position on the AI-unlock curve

AI 6 · Neutral

Feeds quant/agent workflows
Productized data
Up the curve on data productization

Current AI contracts & counterparties

~ desk note

Verafin AI (fraud), market-data feeds; no LLM licensing line

Possibilities for additional contracts

Surveillance/fraud agents; index licensing

AI risks — what stands to lose

Minimal core risk; market-data products face some AI substitution

Assessment

Valuation & discrepancy

Disc 3 · Low

Premium valuation reflects the toll-road
Quality owner, little discount

Convexity & why

Low

Priced toll-road
Limited asymmetry

Other endogenous concerns

Adenza acquisition debt + integration
Crypto-listings exposure adds volatility

Hype factor (market awareness)

Low

AI in products, not in the multiple

Catalysts

Fin-crime AI growth; data ARR

S&P Global SPGI ◆ owner

mkt cap ~$126B | ✓ FMP | EV/Sales ~8.8x | YoY growth +8% | price ● $424.82 live · −18.5% YTD · 52wk $381–$576

Data

Nature of the data

Data 9 · High

Credit ratings, Capital IQ fundamentals/transcripts, Platts benchmarks
S&P Dow Jones Indices + Mobility (CARFAX)
Benchmarks are licensing toll-roads AI can't route around
The grounding layer any financial LLM/agent needs

Data trajectory (stock vs flow)

Growing

Daily benchmark prints, transcripts, fundamentals — relentless flow
CARFAX events + Mobility add new streams

Position on the AI-unlock curve

AI 9 · High

Kensho LLM-ready API live since Nov 2024; 300+ customers
Anthropic MCP connector + Claude Cowork plugin (Feb 2026)
Cohere North partnership (Jun 8, 2026) — sovereign/regulated AI
Distribution into Claude, ChatGPT, Gemini, Copilot
The most aggressive everywhere-the-agents-are strategy

Current AI contracts & counterparties

✓ deep dive

Kensho LLM-ready API (Nov 2024), 300+ customers (launch)
Claude Cowork plugin + Anthropic MCP (Kensho)
Cohere North partnership, Jun 8 2026 (PR)

Possibilities for additional contracts

Per-seat / usage pricing for agentic data access
Benchmark licensing to agent platforms (toll-road extension)
Private-markets data into AI workflows

AI risks — what stands to lose

Capital IQ desktop seats at risk as agents answer directly (why it sells the data INTO agents)
Ratings & indices largely insulated

Assessment

Valuation & discrepancy

Disc 3 · Low

~8.8x EV/Sales on +8% growth — premium largely intact
Top-tier AI execution already recognized in the multiple
Quality fully priced; no metric discrepancy

Convexity & why

Low

Quality + best-in-class AI execution already in the multiple
Limited discrepancy on metrics
Modest two-way payoff

Other endogenous concerns

IHS Markit integration legacy; Mobility (CARFAX) is auto-cyclical
Index fee compression a slow structural drag

Hype factor (market awareness)

Med-High

AI execution is consensus among analysts; the multiple carries only a modest sector AI-threat discount

Catalysts

AI-access revenue disclosure (none yet)
More agent-platform embeds
Ratings cycle + index flows
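
Worked example (a rough sketch, not part of the source card): the live price line above gives a price, a YTD return and a 52-week range, which is enough to back out an implied start-of-year price and where the stock sits in its range. The function name `price_context` is hypothetical, and the calculation assumes the YTD figure is a simple price return with no dividend adjustment.

```python
def price_context(price, ytd_return, low_52wk, high_52wk):
    """Return (implied start-of-year price, position in the 52-week range).

    Assumes ytd_return is a simple price return (e.g. -0.185 for -18.5%).
    Position is 0.0 at the 52-week low and 1.0 at the 52-week high.
    """
    start_of_year = price / (1.0 + ytd_return)
    range_position = (price - low_52wk) / (high_52wk - low_52wk)
    return start_of_year, range_position


# Figures taken from the S&P Global price line on this card.
start, pos = price_context(price=424.82, ytd_return=-0.185,
                           low_52wk=381.0, high_52wk=576.0)
print(f"implied start-of-year price: ${start:.2f}")
print(f"position in 52-week range: {pos:.1%}")
```

On these inputs the stock sits in the lower quarter of its 52-week range, which squares with the card's read that the multiple carries only a modest discount rather than a collapse.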

Gartner IT ◆ owner

mkt cap ~$10.5B | ✓ FMP | EV/Sales ~2.1x | YoY growth +4% | price ● $157.40 live · −37.6% YTD · 52wk $141–$422

Data

Nature of the data

Data 8 · High

45+ yrs of proprietary syndicated IT/business research from ~2,000 analysts
Magic Quadrants & Hype Cycles are de-facto standards CIOs buy on
Price, salary & contract benchmarks from thousands of engagements
Behind a hard paywall — not on the open web, not freely scrapeable
>75% of contract value multi-year recurring, embedded in workflows

Data trajectory (stock vs flow)

Steady — watch the flow

Analyst output paced by headcount; inquiry/benchmark data grows with clients
Contract-value (CV) slowdown is the inflow risk: fewer clients → less peer data

Position on the AI-unlock curve

AI 5 · Neutral

Two-sided: AI could commoditize 'advice' or make its data the grounding layer
Rolling out AskGartner inside client licenses
Has NOT licensed its corpus to labs — keeps it walled
Contract-value growth slowed to ~1–5% — the market's disruption tell
Early on the curve; data-as-grounding thesis unproven

Current AI contracts & counterparties

✓ deep dive

None — AskGartner ships inside existing client licenses
AskGartner live across research portal (example)

Possibilities for additional contracts

Corpus-grounded agent for enterprises (license upsell)
Selective API access to benchmarks/peer data
Price/SLA tiers for AI-assisted research

AI risks — what stands to lose

The core product IS advice — generalist AI is a direct substitute
Seat-based research licenses are the exposed surface
Conferences/consulting more defensible

Assessment

Valuation & discrepancy

Disc 8 · High

~2.1x EV/Sales for a 77%-gross-margin, mostly-recurring franchise
The multiple embeds a full AI-disruption outcome; CV growth ~1–5% is the operational tell
Cheapest quality owner on the board on metrics
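
Cross-check (a minimal sketch under stated assumptions): the "cheapest quality owner" claim can be tested against the header figures on this board. The `Card` type and `cheapest_quality_owners` function are hypothetical names, and "quality" is assumed here to mean an owner badge plus a Data score of 8 or higher; the source does not define the cut-off. EV/Sales values are the approximate header figures.

```python
from typing import NamedTuple


class Card(NamedTuple):
    ticker: str
    is_owner: bool    # True for the "owner" badge, False for "operator"
    data_score: int   # the "Data N" score on each card
    ev_sales: float   # the approximate EV/Sales header figure


# Values transcribed from the card headers on this board.
CARDS = [
    Card("CME", False, 6, 15.0), Card("FDS", True, 6, 4.3),
    Card("ICE", True, 8, 10.0), Card("MCO", True, 9, 11.0),
    Card("MORN", True, 7, 3.4), Card("MSCI", True, 8, 16.0),
    Card("NDAQ", True, 7, 12.0), Card("SPGI", True, 9, 8.8),
    Card("IT", True, 8, 2.1), Card("TRI", True, 9, 9.0),
    Card("EFX", True, 8, 4.0), Card("EXPN.L", True, 8, 6.5),
    Card("FICO", True, 6, 14.0), Card("RAMP", True, 7, 3.0),
    Card("TRU", True, 7, 3.9), Card("VRSK", True, 9, 9.0),
    Card("COR", False, 5, 0.1), Card("DH", True, 7, 2.0),
    Card("DOCS", True, 6, 5.6), Card("ELV", False, 6, 0.4),
    Card("GDRX", False, 5, 1.3), Card("GH", True, 9, 17.0),
]


def cheapest_quality_owners(cards, min_data_score=8):
    """Owners at or above the data-score threshold, cheapest EV/Sales first."""
    quality = [c for c in cards if c.is_owner and c.data_score >= min_data_score]
    return sorted(quality, key=lambda c: c.ev_sales)


ranked = cheapest_quality_owners(CARDS)
for card in ranked[:3]:
    print(f"{card.ticker}: ~{card.ev_sales}x EV/Sales, Data {card.data_score}")
```

With the threshold at 8, Gartner (IT) ranks first. Lowering it to 7 would put Definitive Healthcare (~2.0x) ahead, so the claim depends on where the quality line is drawn.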

Convexity & why

High · quality-convex

Profitable recurring base at ~2x sales bounds the downside
Large upside if AI proves additive to the franchise
Cheap quality + two-sided AI = positive convexity

Other endogenous concerns

Conference/consulting segments are macro-cyclical
EPS growth leans on buybacks; sales-force productivity in question

Hype factor (market awareness)

High — as threat

Narrative casts Gartner as an AI casualty; AskGartner and the paywalled corpus get little credit

Catalysts

Contract-value growth stabilization (the single tell)
AskGartner engagement disclosures
Buyback pace

Thomson Reuters TRI ◆ owner

mkt cap ~$78B* | ✓ FMP | EV/Sales ~9.0x | YoY growth +7% | price ~web price · premium

Data

Nature of the data

Data 9 · High

Westlaw: case law, statutes, annotations built over a century
Editorial headnotes/KeyCite are irreplicable human layers
Practical Law, Checkpoint (tax), Reuters News
Legal/tax = highest-value, lowest-hallucination-tolerance use cases

Data trajectory (stock vs flow)

Steady compounding

Case law grows with the courts — slow, perpetual accretion
Editorial annotations (headnotes/KeyCite) compound on top

Position on the AI-unlock curve

AI 8 · High

CoCounsel scaling fast — ~1M AI users
AI-native Westlaw does grounded retrieval over its corpus
Monetizes the data itself
High — clear legal-AI leader

Current AI contracts & counterparties

✓ deep dive

No corpus licensing — deliberate walled strategy
CoCounsel: 1M professionals, 107 countries (Feb 2026) (PR)
Building proprietary LLM for regulated use cases

Possibilities for additional contracts

Selective agent-platform access to Westlaw (MCP-style)
CoCounsel 10x user target = the in-product unlock
Tax/audit agentic suites later in 2026

AI risks — what stands to lose

Legal research workflow is the AI battleground — Harvey, Legora, generalist agents
Westlaw seat pricing under pressure if agents do the research
Reuters news commoditized by AI summarization

Assessment

Valuation & discrepancy

Disc 3 · Low

~9x EV/Sales on +7% growth — a modest AI-threat discount against its quality
CoCounsel at 1M users is distribution the multiple under-credits
Premium franchise; the discount is partial, not deep

Convexity & why

Low

Priced quality; AI leadership reflected
Limited convexity

Other endogenous concerns

Woodbridge (Thomson family) controls ~70% — governance is theirs
Print/legacy declines largely done; tax season concentration

Hype factor (market awareness)

High — as threat

Market narrative treats agentic AI as a threat to legal-research seats; CoCounsel distribution under-credited

Catalysts

CoCounsel next-gen GA + adoption metrics
ACV growth reacceleration (the proof point)
Competitive data vs Harvey/Legora/Claude Cowork

Equifax EFX ◆ owner

mkt cap ~$20B | ✓ FMP | EV/Sales ~4.0x | YoY growth +7% | price ~web price · cyclical trough

Data

Nature of the data

Data 8 · High

The Work Number — unique employer-sourced income/employment records
Verified income/employment ground-truth no LLM can infer
Utility/telecom payment data extends the picture
Gating data for lending, hiring, benefits
Contributory — employers feed it (network effects)

Data trajectory (stock vs flow)

Compounding

The Work Number records keep growing via payroll integrations
Every paycheck is a new record — true flow asset

Position on the AI-unlock curve

AI 6 · Neutral

EFX.AI built into new product models
FCRA permissible-purpose rules cap AI exposure
Monetization stays inside regulated rails
Mid — gated by regulation, not capability
Re-rate is cyclical more than AI-driven

Current AI contracts & counterparties

~ desk note

EFX.AI in-product; FCRA limits external exposure

Possibilities for additional contracts

Verified-income rails for lending/hiring agents (permissioned)

AI risks — what stands to lose

AI cash-flow underwriting could route around bureau scores at the margin
AI-driven synthetic-identity fraud raises cost of trust

Assessment

Valuation & discrepancy

Disc 7 · High

~$20B cap, ~4.0x EV/Sales on +7% growth
Cheap for the owner of The Work Number
Re-rates on the lending/hiring cycle + verified-income AI demand

Convexity & why

High

Unique income/employment data at a low multiple
FCRA caps direct licensing, but the asset is irreplaceable
Cheap + cyclical-recovery optionality = convex

Other endogenous concerns

2017 breach legacy = elevated security/regulatory burden
Mortgage + hiring volumes are the real earnings driver near-term
CFPB / FCRA scrutiny is permanent

Hype factor (market awareness)

Low

AI angle absent; mortgage cycle dominates the narrative

Catalysts

Mortgage/hiring recovery; TWN records growth; any agent-rail pilots

Experian EXPN.L ◆ owner

mkt cap ~$45B* | ~ EV est | EV/Sales ~6.5x | YoY growth +7% | price ~web price · UK-listed

Data

Nature of the data

Data 8 · High

One of the three global credit bureaus, plus marketing/identity/fraud data
Best organic growth of the three bureaus
Verified credit/identity data with network effects

Data trajectory (stock vs flow)

Growing

Same bureau flow; strongest organic data investment of the three

Position on the AI-unlock curve

AI 6 · Neutral

AI products across credit & fraud
FCRA-style rules cap ecosystem exposure
Mid on the curve

Current AI contracts & counterparties

~ desk note

Ascend platform AI; in-product

Possibilities for additional contracts

Same permissioned-rails option as EFX/TRU

AI risks — what stands to lose

Same as the other bureaus; strongest product diversification of the three

Assessment

Valuation & discrepancy

Disc 6 · Neutral

Reasonable bureau multiple
Only friction is access (London listing)
Quality peer to EFX/TRU

Convexity & why

Moderate

Quality + reasonable price
Regulation caps the convex upside
Balanced

Other endogenous concerns

UK listing discount; Brazil FX exposure

Hype factor (market awareness)

Low

UK listing keeps it out of the AI conversation

Catalysts

Cycle; NA mortgage volumes

FICO FICO ◆ owner

mkt cap ~$28B | ✓ FMP | EV/Sales ~14x | YoY growth +15% | price ~web price · −42% from peak

Data

Nature of the data

Data 6 · Neutral

The FICO score — decisioning standard embedded in US credit
More algorithm/standard than raw corpus
But the score is a data product with monopoly economics

Data trajectory (stock vs flow)

Derived flow

Scores recompute on bureau flow; FICO originates little raw data

Position on the AI-unlock curve

AI 6 · Neutral

Own FFM foundation model
AI lending agents still need an accepted standard
Mortgage-pricing change is a catalyst
Not a corpus play

Current AI contracts & counterparties

~ desk note

FICO Foundation Model (FFM) announced; platform AI

Possibilities for additional contracts

Score-as-API inside lending agents

AI risks — what stands to lose

The central AI risk case: AI-native underwriting bypassing the Score
Lenders' in-house models + FHFA score competition (VantageScore 4.0)

Assessment

Valuation & discrepancy

Disc 5 · Neutral

~14x EV/Sales on +15% growth — still premium on metrics
Moat contested (VantageScore push, AI underwriting)
Two-sided

Convexity & why

Moderate

De-rated standard with mortgage-pricing optionality
But expensive on sales (~14x)
Two-sided

Other endogenous concerns

Pricing-power backlash: FHFA pushing VantageScore competition in mortgages
Revenue concentrated in B2B scores; software segment unloved

Hype factor (market awareness)

Med

Debate is pricing power, not AI

Catalysts

Mortgage-score pricing; platform ARR

LiveRamp RAMP ◆ owner

mkt cap ~$2.3B | ~ EV est | EV/Sales ~3.0x | YoY growth +10% | price: being acquired at ~$38.50 by Publicis

Data

Nature of the data

Data 7 · High

Identity graph & data-collaboration network (25k+ publishers)
Clean-room identity for the post-cookie/AI-data era

Data trajectory (stock vs flow)

Maintained

Identity graph is refresh-maintenance, not accumulation

Position on the AI-unlock curve

AI 6 · Neutral

Well-placed for AI-data era
But the story is now M&A

Current AI contracts & counterparties

~ desk note

Identity/clean-room infra relevant to AI data flows

Possibilities for additional contracts

—

AI risks — what stands to lose

Acquisition pending — risk transfers to Publicis

Assessment

Valuation & discrepancy

Disc 2 · Low

Being acquired for ~$2.5B by Publicis
Off the board as a standalone bet
Signal: ad-holdcos paying up for identity data

Convexity & why

Low

Taken out — payoff capped by the deal price

Other endogenous concerns

Deal-close risk is the only variable left (~$38.50 cash)

Hype factor (market awareness)

Low

Story is now the Publicis acquisition

Catalysts

Deal close (~$38.50)

TransUnion TRU ◆ owner

mkt cap ~$13.5B | ✓ FMP | EV/Sales ~3.9x | YoY growth +8% | price ~web price

Data

Nature of the data

Data 7 · High

Credit bureau + identity resolution (Neustar)
Links offline identity to digital identifiers
Identity graphs matter more as AI agents transact
Contributory bureau data with network effects

Data trajectory (stock vs flow)

Growing

Credit + identity events flow with economic activity

Position on the AI-unlock curve

AI 5 · Neutral

OneTru platform, TruIQ agents
Identity products quietly AI-relevant
FCRA-capped exposure like Equifax
Mid on the curve

Current AI contracts & counterparties

~ desk note

OneTru platform, TruIQ agents; in-product

Possibilities for additional contracts

Identity verification for AI-agent transactions

AI risks — what stands to lose

Same bypass risk as EFX; identity products partly hedge it

Assessment

Valuation & discrepancy

Disc 6 · Neutral

Cheapest of the three bureaus
Modest favorable gap
Same regulatory ceiling

Convexity & why

Moderate

Cheapest bureau + cycle/identity optionality
FCRA caps the convex upside
Balanced

Other endogenous concerns

Neustar deal leverage; UK consumer business weak
Same CFPB overhang

Hype factor (market awareness)

Low

Same as EFX — cycle story, not AI story

Catalysts

Cycle turn; Neustar identity products

Verisk VRSK ◆ owner

mkt cap ~$24B | ✓ FMP | EV/Sales ~9.0x | YoY growth +7% | price ~web price · premium

Data

Nature of the data

Data 9 · High

Decades of contributory claims, loss & property/peril data
Nearly all US P&C insurers both feed and buy it back
Catastrophe models built on the loss history
Near-monopoly; no AI lab can rebuild it

Data trajectory (stock vs flow)

Steady compounding

Contributory model: every insurer claim feeds it, by contract
Cat-event data grows with each season

Position on the AI-unlock curve

AI 5 · Neutral

Generative/agentic AI in underwriting/claims products
Consortium-locked — not licensed to the open ecosystem
Value unlock in-product, not via licensing
Mid — deepest moat, deliberately walled

Current AI contracts & counterparties

~ desk note

Consortium AI in underwriting/claims products

Possibilities for additional contracts

Walled option: claims-history grounding for insurance agents

AI risks — what stands to lose

Insurers building AI on their own claims data could weaken the consortium pull

Assessment

Valuation & discrepancy

Disc 5 · Neutral

Deep moat, but ~9x EV/Sales on ~7% growth
Fully paid for
Quality High, value Low-ish

Convexity & why

Low–Moderate

Near-monopoly data, but premium & walled
Bounded downside, limited upside
Low asymmetry

Other endogenous concerns

Consortium members push back on pricing; class actions over contributory data use

Hype factor (market awareness)

Low-Med

Quality priced; AI not separately valued

Catalysts

Product attach; pricing renewals

Cencora COR ○ operator

mkt cap ~$54B | ~ EV est | EV/Sales ~0.1x | YoY growth +10% | price ~web price · defensive

Data

Nature of the data

Data 5 · Neutral

Pharmacy/dispensing & distribution data
Optimizes thin-margin logistics

Data trajectory (stock vs flow)

Steady flow

Distribution data tracks volumes

Position on the AI-unlock curve

AI 3 · Low

Logistics input, not sold

Current AI contracts & counterparties

~ desk note

Logistics AI internal

Possibilities for additional contracts

—

AI risks — what stands to lose

Low — physical distribution

Assessment

Valuation & discrepancy

Disc 4 · Neutral

Fair defensive distributor
Data-rich, not a data owner

Convexity & why

Low

Defensive, data not a driver

Other endogenous concerns

Drug-pricing policy; thin-margin model

Hype factor (market awareness)

Low

Not an AI story

Catalysts

Distribution volumes

Definitive Health. DH ◆ owner

mkt cap ~$0.1B | ~ EV est | EV/Sales ~2.0x | YoY growth −8% | price ~web price · micro-cap

Data

Nature of the data

Data 7 · High

Healthcare commercial intel: providers, claims, affiliations, install-base
'The ZoomInfo of healthcare' — sells intelligence to life-sciences/med-tech
A pure data owner, not a marketplace
Continuously refreshed healthcare-entity graph

Data trajectory (stock vs flow)

Slowing

Refresh continues but shrinking revenue funds less data collection

Position on the AI-unlock curve

AI 6 · Neutral

Real owner, but AI is as much threat as tailwind
Limited AI productization so far
Mid/behind — business being repriced
Erosion risk from AI-generated provider signal

Current AI contracts & counterparties

~ desk note

None disclosed

Possibilities for additional contracts

Healthcare-commercial grounding data for pharma AI

AI risks — what stands to lose

AI-generated provider intelligence directly substitutes the core product — erosion already visible

Assessment

Valuation & discrepancy

Disc 6 · Neutral

~$0.1B cap, ~2x EV/Sales on declining revenue
Distressed micro-cap; the data is better than the equity
Cheap for existential reasons

Convexity & why

High · distressed

Distressed micro-cap → option on stabilization or M&A
Declining revenue is the live left tail
Cheap healthcare-commercial data if it survives

Other endogenous concerns

PE overhang (Advent), serial goodwill writedowns, micro-cap liquidity

Hype factor (market awareness)

Low

Micro-cap; no AI narrative attaches

Catalysts

Revenue stabilization; strategic review odds

Doximity DOCS ◆ owner*

mkt cap ~$3.8B | ✓ FMP | EV/Sales ~5.6x | YoY growth +13% | price ~web price

Data

Nature of the data

Data 6 · Neutral

Verified network of most US physicians
The asset is the audience/engagement, not a corpus
Workflow tools for doctors

Data trajectory (stock vs flow)

Saturated graph

Most US physicians already on it — the graph is mature
Engagement/newsfeed data still grows; the asset is breadth, not flow

Position on the AI-unlock curve

AI 6 · Neutral

Strong AI tools (Doximity GPT), huge engagement
But no AI revenue in guidance
Data asset is the audience, not a corpus

Current AI contracts & counterparties

~ desk note

Doximity GPT free for physicians; ad AI in-product

Possibilities for additional contracts

Clinician-verified channel for healthcare AI distribution

AI risks — what stands to lose

Physician attention shifting to AI clinical tools (OpenEvidence et al.)
Pharma ad budgets could follow attention into AI channels

Assessment

Valuation & discrepancy

Disc 5 · Neutral

~$3.8B cap, ~5.6x EV/Sales on +13% growth
Far from the ~18x I'd assumed — reasonable now
Verified clinician graph; audience-not-corpus caps licensing

Convexity & why

Moderate

Verified clinician graph + AI tools, now at a fair multiple
Audience-not-corpus caps the data-licensing upside
Balanced after the de-rate

Other endogenous concerns

Pharma ad-budget concentration; engagement metrics are the whole story

Hype factor (market awareness)

Med

Was priced for AI hopes; now reset to fair

Catalysts

Ad market; AI tool engagement

Elevance ELV ○ operator

mkt cap ~$92B | ~ EV est | EV/Sales ~0.4x | YoY growth +5% | price ~web price · de-rated insurer

Data

Nature of the data

Data 6 · Neutral

Claims/care-management data via Carelon
Latent separable data asset
Used to lower its own medical costs

Data trajectory (stock vs flow)

Steady flow

Claims flow with membership; flat membership = flat flow

Position on the AI-unlock curve

AI 4 · Neutral

AI care-management lowers internal costs
Closest operator to a separable data asset
Still not pure-play

Current AI contracts & counterparties

~ desk note

Carelon internal AI

Possibilities for additional contracts

Separable claims-data asset (never signaled)

AI risks — what stands to lose

Low direct risk; AI mostly a cost lever

Assessment

Valuation & discrepancy

Disc 5 · Neutral

Cheap, but on insurer fundamentals
Latent data optionality (Carelon)
Cyclical

Convexity & why

Moderate

De-rated insurer with latent data optionality
Cyclical, not a data re-rate
Mildly positive

Other endogenous concerns

Medical-cost trend + Medicaid redeterminations; ACA subsidy politics

Hype factor (market awareness)

Low

Insurer story

Catalysts

Medical-cost trend; Carelon growth

GoodRx GDRX ○ operator

mkt cap ~$0.9B | ~ EV est | EV/Sales ~1.3x | YoY growth ~flat | price ~web price

Data

Nature of the data

Data 5 · Neutral

Rx-pricing & consumer prescription-behavior data
Unique data, but an input to a discount platform
Platform under structural pressure

Data trajectory (stock vs flow)

Steady flow

Pricing data flows; nothing accumulating in value

Position on the AI-unlock curve

AI 4 · Neutral

Data feeds the platform; not licensed as a corpus
Limited AI productization

Current AI contracts & counterparties

~ desk note

None disclosed

Possibilities for additional contracts

Rx-pricing data into consumer-health agents

AI risks — what stands to lose

AI agents compare drug prices directly, disintermediating the front end

Assessment

Valuation & discrepancy

Disc 4 · Neutral

Cheap, but pressured core
Marginal owner with hard-to-monetize data

Convexity & why

Moderate · binary

Cheap with stabilization optionality
But structural pressure on the core
Binary-ish

Other endogenous concerns

PBM dependence — a single partner change (Kroger '22) cratered it once

Hype factor (market awareness)

Low

No AI narrative

Catalysts

Platform stabilization

Guardant Health GH ◆ owner

IR / presentations ↗

mkt cap~$17B ✓ FMP EV/Sales~17x YoY growth+33% price~$105 · target ~$129 (web)

Data

Nature of the data

Data 9 · High

Liquid-biopsy genomic + clinical-outcomes data in oncology
Proprietary, scarce — a direct Tempus peer
Longitudinal molecular profiles track tumor evolution
Cannot be assembled from public sources

Data trajectory (stock vs flow)

Compounding fast

Test volumes +25–35%/yr; each test extends longitudinal profiles

Position on the AI-unlock curve

AI 6 · Neutral

Pharma data partnerships + co-development, earlier-stage
Smart Platform multiomic insights
Building the 'co-develop on our data' motion
Mid — monetization layer still forming

Current AI contracts & counterparties

~ desk note

Pharma data partnerships (earlier-stage than Tempus); Smart Platform

Possibilities for additional contracts

Tempus-style co-builds on liquid-biopsy data

AI risks — what stands to lose

Interpretation commoditizes; raw assay + outcomes data is the defensible part

Assessment

Valuation & discrepancy

Disc 5 · Neutral

Scarce data, but ~17x sales and unprofitable
Analyst upside exists
Expensive growth, not cheap

Convexity & why

Moderate

Scarce-data optionality, but ~17x sales + unprofitable cap it
More a growth bet than an option
Balanced, positive tilt

Other endogenous concerns

Cash burn continues; patent litigation history with Natera
Screening (Shield) economics still unproven at scale

Hype factor (market awareness)

Med

Priced as diagnostics growth; data angle secondary

Catalysts

MRD reimbursement; pharma deal announcements

IQVIA IQV ◆ owner

IR / presentations ↗

mkt cap~$31B ✓ FMP EV/Sales~2.7x YoY growth+6% price● $186.25 live · −17.4% YTD · 52wk $153–$247

Data

Nature of the data

Data 9 · High

World's largest pharmacy-claims & prescription dataset (ex-IMS Health)
Population-scale real-world evidence across global Rx
Clinical-trial operational data as the largest CRO (ex-Quintiles)
De-identified, compliance-grade — built under HIPAA/GDPR, unscrapeable
Sold to virtually every major pharma

Data trajectory (stock vs flow)

Compounding

Rx/claims flow is continuous and population-scale
Trial operational data compounds with every study run

Position on the AI-unlock curve

AI 7 · High

IQVIA.ai unified agentic platform (Mar 2026): 150+ agents deployed
NVIDIA partnership since Jan 2025 — custom foundation models on its data
19 of top 20 pharma already using IQVIA agents; 100+ AI patents
Builds agents ON the data rather than licensing it out
No longer latent — monetization architecture is live

Current AI contracts & counterparties

✓ deep dive

NVIDIA partnership (Jan 2025) → IQVIA.ai platform, Mar 2026 (PR)
150+ agents live; 19 of top-20 pharma using them (report)
100+ AI patents; agents built ON proprietary data, not licensed out

Possibilities for additional contracts

Agent subscriptions as a separate revenue line
RWE feeds for medical LLMs (compliance-wrapped)
Trial-design agents priced on outcomes

AI risks — what stands to lose

CRO services half is labor-heavy — AI compresses what pharma will pay for it
Pharma in-housing analytics with AI tools

Assessment

Valuation & discrepancy

Disc 7 · High

$16.3B FY25 rev, +5.9% (~7% TTM)
Low-single-digit sales multiple for unique data
~$14B net debt is the caveat

Convexity & why

High

~2.7x EV/Sales on +6% growth — cheap for the scarcest Rx data
Locked in compliance contracts; AI revenue not yet broken out
Cheap + unlock optionality not yet in the multiple = convex

Other endogenous concerns

~$14B net debt limits flexibility
CRO bookings cyclical; pharma R&D budgets squeezed (IRA effects)

Hype factor (market awareness)

Low → rising

Cheapest scarce-data name; IQVIA.ai barely registers in the multiple yet

Catalysts

Next earnings: ~late July 2026 (Q1 reported May 5 — beat; EPS guide raised)
IQVIA.ai adoption: now 192 agents / 64 use cases; watch for monetization disclosure (Q1 call)
R&DS backlog $32.7B (+5.3%); Q4 book-to-bill 1.18x — bookings reacceleration is the proof point
$1.2B buyback remaining ($552M done in Q1)
Duke obesity-trials collaboration (Feb 2026) — fastest-growing trial category
De-leveraging from 3.62x / $13.9B net debt frees the multiple
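
As a rough cross-check of the IQVIA figures above, the sketch below rebuilds the ~2.7x EV/Sales from the quoted market cap, net debt and FY25 revenue, and backs out the EBITDA implied by the 3.62x leverage figure. It assumes EV is simply market cap plus net debt and that 3.62x means net debt / EBITDA. Neither definition is stated in the notes, and the variable names are illustrative.

```python
# Figures quoted in the IQVIA card above, in $B. Names are illustrative.
market_cap = 31.0     # "mkt cap ~$31B"
net_debt = 13.9       # "$13.9B net debt"
fy25_revenue = 16.3   # "$16.3B FY25 rev"
leverage = 3.62       # "3.62x", ASSUMED to be net debt / EBITDA

# ASSUMPTION: EV = market cap + net debt (ignores minorities, leases, etc.).
enterprise_value = market_cap + net_debt
ev_to_sales = enterprise_value / fy25_revenue

# Under the assumed leverage definition, EBITDA follows directly.
implied_ebitda = net_debt / leverage

print(f"EV ~${enterprise_value:.1f}B, EV/Sales ~{ev_to_sales:.2f}x")
print(f"Implied EBITDA ~${implied_ebitda:.2f}B")
```

On these inputs the multiple lands close to the ~2.7x in the header, which suggests the $13.9B net-debt figure is the one behind it.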

Natera NTRA ◆ owner

IR / presentations ↗

mkt cap~$32B ✓ FMP EV/Sales~12x YoY growth+36% price~web price · target ~$262

Data

Nature of the data

Data 9 · High

Genetic-testing / cfDNA data (MRD, prenatal, transplant)
Large, fast-growing proprietary genomic dataset
Outcome-linked longitudinal data is the durable asset
Same scarce-data position as Guardant/Tempus

Data trajectory (stock vs flow)

Compounding fast

Fastest test-volume growth in the group; outcome links accrue with time

Position on the AI-unlock curve

AI 6 · Neutral

Owns the data; data-layer monetization still maturing
Strong clinical-validation pipeline feeds the dataset
Files as diagnostics, so screens miss it
Mid on the curve

Current AI contracts & counterparties

~ desk note

Data feeds pharma trials; in-product AI

Possibilities for additional contracts

Outcome-linked genomic licensing

AI risks — what stands to lose

Same as GH — value migrates from interpretation to the longitudinal data

Assessment

Valuation & discrepancy

Disc 5 · Neutral

Irreplaceable data, ~12x sales on ~36% growth
Quality High; multiple says priced, not discounted
Volatile equity

Convexity & why

Moderate

Data optionality vs a rich ~12x multiple
Roughly balanced, slight positive tilt

Other endogenous concerns

Reimbursement concentration (Medicare MRD decisions)
Billing-practice scrutiny; GH litigation

Hype factor (market awareness)

Med

Same as GH: growth story, data unpriced

Catalysts

MRD adoption; new indications

Tempus AI TEM ◆ owner

IR / presentations ↗

mkt cap~$8.5B ✓ FMP EV/Sales~6.5x YoY growth+83% price~web price · just adj-EBITDA positive

Data

Nature of the data

Data 9 · High

Multimodal clinical + genomic data (~300PB) pairing sequencing with clinical records
Scarcest, most valuable category for biomedical AI — unscrapeable
Built explicitly as an AI data company
140% net revenue retention on Insights/data
Linked outcomes data is what makes it irreplaceable

Data trajectory (stock vs flow)

Compounding fast

~300PB and growing; every test adds linked clinical+genomic data (Q1 letter)
Sequencing volumes growing ~25–30%/yr — the corpus is the byproduct of revenue

Position on the AI-unlock curve

AI 8 · High

$200M AstraZeneca/Pathos deal (Apr 2025): largest oncology foundation model
Total remaining contract value >$1B; non-exclusive — can resell the motion
Data customers: AZ, Novartis, Merck KGaA, Takeda, Boehringer, United Therapeutics
Illumina collaboration trains genomic algorithms on its multimodal data
Insights (data licensing) growing ~58%

Current AI contracts & counterparties

✓ deep dive

$200M AstraZeneca/Pathos data+model deal over 3 yrs (PR)
Total remaining contract value >$1B (Q1 letter)
Data customers: Novartis, Merck KGaA, Takeda, Boehringer, United Therapeutics
Illumina algorithm-training collaboration

Possibilities for additional contracts

Non-exclusive foundation-model co-builds with other pharma
Expansion beyond oncology (cardio, neuro)
Open-source pathology consortium as a funnel

AI risks — what stands to lose

Pharma could in-house modeling after learning from co-builds
Interpretation layer could commoditize; the data itself is the hedge

Assessment

Valuation & discrepancy

Disc 6 · Neutral

~$8.5B cap, ~6.5x EV/Sales on +83% growth
Strikingly cheap for the growth + scarcest biomedical data
Priced like a normal growth co, not the data monopoly it's building

Convexity & why

High · growth optionality

Foundation-model + licensing optionality could make it the oncology-AI data layer
Rich multiple + cash burn are the downside
Large, real optionality = convex growth bet

Other endogenous concerns

Founder super-voting control; Pathos is Lefkofsky-affiliated (related-party optics on the $200M deal)
Convertible debt + only just adj-EBITDA positive
Short-seller scrutiny history (data-quality claims)

Hype factor (market awareness)

High

AI is in the name and the multiple — but >$1B RCV arguably still under-modeled

Catalysts

Next earnings: ~early Aug 2026 (Q1 reported May 5 — guidance raised) (Q1 8-K)
2026 guide raised to $1.59–1.60B revenue / ~$65M adj EBITDA — the leverage inflection
MRD volume ~6,500 tests in Q1, +500% YoY — reimbursement decisions are the swing
TCV >$1.1B; 70+ pharma data customers — watch new (non-exclusive) co-builds
Insights (data licensing) +44% in Q1 — the annuity compounding

Veeva Systems VEEV ◆ owner

IR / presentations ↗

mkt cap~$27B ✓ FMP EV/Sales~7.7x YoY growth+16% price~web price · premium SaaS

Data

Nature of the data

Data 7 · High

Life-sciences CRM + proprietary OpenData/Link (HCP & reference data)
Pharma depends on its reference data
A separable corpus inside the SaaS

Data trajectory (stock vs flow)

Growing

OpenData/Link refreshed continuously; usage data grows with seats

Position on the AI-unlock curve

AI 7 · High

AI embedded in pharma workflows
Up the curve
Vertical-SaaS leader

Current AI contracts & counterparties

✓ deep dive

AI agents shipping across CRM/Vault (Dec 2025 wave)
OpenData/Link reference data feeds its own AI

Possibilities for additional contracts

Agent pricing on top of seats
Link data into pharma AI pipelines

AI risks — what stands to lose

Vertical-SaaS pricing under the same agentic pressure as all seats ('SaaS-pocalypse')
AI app-builders lower barriers to bespoke pharma tools

Assessment

Valuation & discrepancy

Disc 5 · Neutral

Premium SaaS multiple
Data is a real, under-discussed asset
Equity priced for quality

Convexity & why

Low

Premium SaaS; data underrated but equity priced
Limited asymmetry

Other endogenous concerns

Salesforce→own-platform CRM migration is a multi-year execution risk
Core TAM maturing; growth depends on new apps

Hype factor (market awareness)

Med

Read as a quality SaaS with AI features, not a data owner

Catalysts

Agent adoption metrics
Vault CRM migration completion

Carvana CVNA ○ operator

IR / presentations ↗

mkt cap~$76B ~ EV est EV/Sales~3.5x YoY growth+30% price~web price · volatile

Data

Nature of the data

Data 4 · Neutral

Transactional used-car e-commerce & trade data
Tunes its own pricing/inventory

Data trajectory (stock vs flow)

Growing

Transaction/pricing data grows with units; internal

Position on the AI-unlock curve

AI 3 · Low

Input, not the product

Current AI contracts & counterparties

~ desk note

Internal pricing AI

Possibilities for additional contracts

—

AI risks — what stands to lose

Low direct AI risk

Assessment

Valuation & discrepancy

Disc 3 · Low

Volatile, richly valued
Weak fit for the screen

Convexity & why

Low

High beta but valued on retail growth, not data

Other endogenous concerns

Garcia family control + related-party history; leverage rebuilt the equity once already

Hype factor (market awareness)

Low

Retail story

Catalysts

Unit economics

CoStar Group CSGP ◆ owner

IR / presentations ↗

mkt cap~$14B ✓ FMP EV/Sales~4.0x YoY growth+19% price~web price · heavy spend

Data

Nature of the data

Data 8 · High

Verified CRE comps/property data, 35-yr research army
LoopNet, Apartments.com, Homes.com
Unscrapeable, walled inside terminals

Data trajectory (stock vs flow)

Compounding

Research army keeps verifying; comps accumulate permanently
Zonda adds a housing-data stream

Position on the AI-unlock curve

AI 3 · Low

Walled, litigious; data locked in terminals — minimal AI surface
Heavy Homes.com ad spend
Strategic data, low AI surface area

Current AI contracts & counterparties

✓ deep dive

None — deliberately walled; litigious vs scrapers
Zonda acquisition ($800M) extends housing data

Possibilities for additional contracts

The big withheld option: licensed CRE grounding for real-estate AI
Homes.com AI search features

AI risks — what stands to lose

AI aggregation/scraping pressure on listings; Google entering for-sale listings (BTIG flag)
Verified CRE comps hardest to substitute

Assessment

Valuation & discrepancy

Disc 6 · Neutral

~$14B cap, ~4.0x EV/Sales on +19% growth
Much cheaper than I'd shown; Homes.com spend masks margins
Unscrapeable CRE data at a reasonable price

Convexity & why

Moderate–High

Unscrapeable CRE data, now cheap
Low AI surface + heavy ad spend cap near-term
Re-rate optionality as Homes.com spend rolls off

Other endogenous concerns

Homes.com spend is an act of will (founder-CEO); activist pressure has surfaced
Serial litigation posture cuts both ways

Hype factor (market awareness)

Low

AI never part of the story; the data optionality is free at ~4x

Catalysts

Homes.com spend roll-off (margin catalyst)
Zonda integration
Any posture change on data access

Duolingo DUOL ◆ owner*

IR / presentations ↗

mkt cap~$5.5B ✓ FMP EV/Sales~4.0x YoY growth+39% price~$98 · −79% over 1yr (web)

Data

Nature of the data

Data 6 · Neutral

One of the largest learning-interaction datasets (50M+ DAU)
Granular data on how people learn, err & retain across 100+ courses
Used in-product to tune pedagogy — not licensed
Value captured as engagement, not a sellable corpus

Data trajectory (stock vs flow)

Compounding

Learning interactions scale with DAUs (50M+, growing)
Every exercise answered is new pedagogy data

Position on the AI-unlock curve

AI 6 · Neutral

AI-first (Gen-AI 'Max', AI video calls)
Shipped 148 courses in a year via generative AI
Unlock shows up as engagement/ARPU, not a licensing line
Mid — AI deepens the product moat

Current AI contracts & counterparties

✓ deep dive

No outbound licensing; heavy OpenAI/GenAI consumer (Max, AI courses)
148 AI-generated courses shipped in a year

Possibilities for additional contracts

Learning-data licensing (never signaled)
AI-tutor pricing tiers

AI risks — what stands to lose

ChatGPT as a free language tutor — the central substitution threat
Defense: gamification + structure, not content

Assessment

Valuation & discrepancy

Disc 6 · Neutral

~4.0x EV/Sales on +39% growth — cheap on growth metrics
AI-disruption fear embedded in the multiple
A growth franchise at a non-growth multiple

Convexity & why

Moderate–High

~4x on +39% growth bounds the downside if growth holds
Upside if AI features lift engagement/ARPU
Positive convexity on metrics

Other endogenous concerns

Founder control; monetization-vs-engagement tension
Still expensive on earnings even after the crash

Hype factor (market awareness)

High — as threat

Narrative says ChatGPT kills language learning; the AI-first operating model is ignored

Catalysts

DAU/booking growth stabilization
Max attach rate
Energy/engagement metrics

MercadoLibre MELI ○ operator

IR / presentations ↗

mkt cap~$83B ~ EV est EV/Sales~3.5x YoY growth+35% price~web price · premium growth

Data

Nature of the data

Data 6 · Neutral

LatAm marketplace purchase + fintech/credit data
Powers its own ads/lending (input)

Data trajectory (stock vs flow)

Compounding

Purchase + credit data compounds with GMV growth

Position on the AI-unlock curve

AI 3 · Low

AI for marketplace/credit optimization
Not a sold corpus

Current AI contracts & counterparties

~ desk note

Internal AI for ads/credit

Possibilities for additional contracts

—

AI risks — what stands to lose

Low; AI mostly an internal lever

Assessment

Valuation & discrepancy

Disc 3 · Low

Premium growth stock
Data doesn't re-rate it

Convexity & why

Moderate

High growth, but valued on the business, not the data

Other endogenous concerns

LatAm FX/political risk; credit-book quality through cycles

Hype factor (market awareness)

Low

Not a data play

Catalysts

LatAm growth; fintech credit

Netflix NFLX ○ operator

IR / presentations ↗

mkt cap~$343B ~ EV est EV/Sales~8.0x YoY growth+14% price~web price · mega-cap

Data

Nature of the data

Data 7 · High

Viewing/interaction data across ~300M members
Real moat for recs/greenlighting
Strictly internal — never licensed

Data trajectory (stock vs flow)

Growing

Viewing data grows with engagement; internal-only

Position on the AI-unlock curve

AI 2 · Low

Never licensed; AI = better curation only
Internal-use data

Current AI contracts & counterparties

~ desk note

Internal only — never licensed

Possibilities for additional contracts

—

AI risks — what stands to lose

GenAI lowers content-production barriers for rivals (long-term)

Assessment

Valuation & discrepancy

Disc 2 · Low

Premium mega-cap on subscriber economics
n/a as a data play

Convexity & why

Low

Priced mega-cap, data internal

Other endogenous concerns

Content-spend discipline vs growth; live/sports costs

Hype factor (market awareness)

Low

Recs AI assumed, not valued separately

Catalysts

Sub growth; ads tier

Reddit RDDT ◆ owner

IR / presentations ↗

mkt cap~$34B ✓ FMP EV/Sales~13x YoY growth+69% price● $177.00 live · −23% YTD · 52wk $111–$283

Data

Nature of the data

Data 9 · High

100k+ communities, two decades of upvote-ranked human conversation
Largest archive of authentic opinion, troubleshooting, niche expertise
Exactly what LLMs lack: recommendations, lived experience, long-tail Q&A
Surfaces disproportionately in AI answers
Classified as social media, not 'data services'

Data trajectory (stock vs flow)

Compounding

DAU still growing; posts/comments compound the archive daily
Two decades of vote-ranked history can't be replicated retroactively

Position on the AI-unlock curve

AI 9 · High

$203M aggregate contract value disclosed at IPO (Google + OpenAI)
~$130M/yr run-rate ≈ 10% of revenue; Google ~$60M/yr, OpenAI ~$70M/yr
#1 most-cited source across AI models (~3x Wikipedia)
Google renewal under negotiation — pushing usage-based pricing
Litigates unlicensed scrapers (incl. Perplexity suit)
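
The run-rate arithmetic in the lines above can be sketched as follows. The only inputs are the two per-year contract figures and the "≈ 10% of revenue" share quoted in the notes. The implied total revenue is a back-of-envelope derivation, not a reported number, and the variable names are illustrative.

```python
# Figures quoted in the Reddit card above, in $M per year. Names are illustrative.
google_per_year = 60.0
openai_per_year = 70.0
licensing_share_of_revenue = 0.10  # "≈ 10% of revenue"

# Combined licensing run-rate from the two disclosed counterparties.
licensing_run_rate = google_per_year + openai_per_year

# Back out the total revenue implied by the ~10% share (derived, not reported).
implied_total_revenue = licensing_run_rate / licensing_share_of_revenue
implied_non_licensing = implied_total_revenue - licensing_run_rate

print(f"Licensing run-rate: ${licensing_run_rate:.0f}M/yr")
print(f"Implied total revenue: ${implied_total_revenue:.0f}M/yr")
print(f"Implied non-licensing revenue: ${implied_non_licensing:.0f}M/yr")
```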

Current AI contracts & counterparties

✓ deep dive

$203M aggregate disclosed at IPO (TechCrunch)
Google ~$60M/yr; OpenAI ~$70M/yr ≈ 10% of revenue (SEL)
2–3 yr terms struck Jan 2024 — now in renewal window

Possibilities for additional contracts

Google renewal at usage-based rates (mgmt: 'open for business')
Anthropic / Meta / xAI remain unlicensed
Dynamic per-citation pricing models
Int'l + vertical (commerce intent) licensing

AI risks — what stands to lose

Google AI Overviews already cut logged-out traffic (the 2025 user-growth scare)
AI-generated content pollution threatens corpus authenticity
Meta forums app targets the community moat

Assessment

Valuation & discrepancy

Disc 5 · Neutral

Best corpus + fastest unlock, but ~13x sales
~69% growth supports it — priced FOR growth
Quality off the charts; valuation not a gap

Convexity & why

Moderate

Big growth/licensing optionality = upside call
But ~13x sales means real drawdown if growth slows
Net mildly positive from the licensing option

Other endogenous concerns

Community/moderator revolt risk is structural (2023 API blackout precedent)
Altman's stake = governance optics
Ad business still ~90% of revenue and competitive

Hype factor (market awareness)

High

The AI-data story IS the stock; renewal terms are the swing

Catalysts

Google contract renewal & structure (report)
Scraper litigation incl. Perplexity suit
Meta forums app traction (the bear case)
Data-licensing line in quarterly prints

TripAdvisor TRIP ◆ owner

IR / presentations ↗

mkt cap~$1.4B ~ EV est EV/Sales~0.7x YoY growth+3% price~web price

Data

Nature of the data

Data 4 · Neutral

~1B travel reviews; Viator experiences marketplace
Widely scraped & substitutable
Reviews feed AI trip-planning agents

Data trajectory (stock vs flow)

Slowing risk

~1B cumulative, but contributions follow visits — and AI answers divert visits
The corpus ages if the flywheel slows

Position on the AI-unlock curve

AI 4 · Neutral

Perplexity partnership (Jan 2025) now a measurable booking channel
ChatGPT app launch partner (Oct 2025) for trip planning
Distribution-into-AI strategy, not paid corpus licensing
Viator + TheFork now >50% of revenue — the real value

Current AI contracts & counterparties

✓ deep dive

Perplexity partnership, Jan 2025 — hotels customer-acquisition channel (PR)
ChatGPT app launch partner, Oct 2025 (report)

Possibilities for additional contracts

Paid licensing of the review corpus (currently given for distribution)
Viator inventory as the bookable layer inside AI agents

AI risks — what stands to lose

AI trip planners bypass the site entirely — the core metasearch business is the casualty
Viator/TheFork partially insulated (fulfillment, not discovery)

Assessment

Valuation & discrepancy

Disc 5 · Neutral

~$1.4B cap, ~0.7x EV/Sales on +3% growth
Very cheap, but reviews are being disintermediated
Value is Viator/TheFork, not the review corpus

Convexity & why

Low–Moderate

Cheap but melting
Weak optionality

Other endogenous concerns

Viator faces GetYourGuide/Klook competition; legacy metasearch declines
Post-Liberty structure leaves strategic questions

Hype factor (market awareness)

Med

AI read as existential threat; partnerships seen as defensive, not monetizing

Catalysts

AI-channel booking disclosures
Membership program launch
Viator/TheFork growth (the real value)

Yelp YELP ◆ owner

IR / presentations ↗

mkt cap~$1.3B ✓ FMP EV/Sales~0.9x YoY growth+3% price~web price

Data

Nature of the data

Data 6 · Neutral

~330M geocoded local-business reviews
Structured local sentiment for 'best X near me'
Classified as internet content, not data services

Data trajectory (stock vs flow)

Steady flow — not melting

22M new reviews in 2025 (vs 21M in '24); corpus 330M, +7% YoY (FY25 PR)
What's melting is consumption (app engagement), not contribution — yet
Risk: contribution follows traffic with a lag
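
The corpus-growth claim above can be checked with a few lines. This is a minimal sketch that assumes the 330M corpus figure is a year-end 2025 count and that no reviews were removed during the year. Both assumptions are simplifications and are not stated in the notes.

```python
# Figures quoted in the Yelp card above, in millions of reviews.
corpus_end_2025 = 330.0
new_reviews_2025 = 22.0
new_reviews_2024 = 21.0

# ASSUMPTION: no removals, so the starting corpus is end minus additions.
corpus_start_2025 = corpus_end_2025 - new_reviews_2025
corpus_growth = new_reviews_2025 / corpus_start_2025

# Growth in the contribution flow itself (new reviews, year over year).
contribution_growth = new_reviews_2025 / new_reviews_2024 - 1

print(f"Corpus growth: {corpus_growth:.1%}")
print(f"Contribution growth: {contribution_growth:.1%}")
```

Under those assumptions the corpus grows by roughly the +7% quoted, while the flow of new reviews grows more slowly, which fits the "steady flow, not melting" label.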

Position on the AI-unlock curve

AI 6 · Neutral

Signed OpenAI agreement (disclosed Feb 2026)
Perplexity has used Yelp local data since Mar 2024
'Other revenue' +17% on data licensing & transactions
Expanding Yelp Assistant; Hatch acquisition (AI front-desk)
Core local-ad business still the eroding center

Current AI contracts & counterparties

✓ deep dive

OpenAI agreement signed (Feb 2026, terms undisclosed) (FY25 PR)
Perplexity has integrated Yelp local data since Mar 2024
Data licensing inside 'Other revenue' (+17%)

Possibilities for additional contracts

More assistant integrations (Gemini, Claude, Alexa-class)
Usage-priced local-data API
Transactional referrals from AI answers

AI risks — what stands to lose

AI assistants answer 'best X near me' without a Yelp visit — ad impressions leak
Google's AI search squeezes the top of funnel

Assessment

Valuation & discrepancy

Disc 6 · Neutral

~$1.3B cap, ~0.9x EV/Sales on +3% growth
<1x sales — cheap, but the ad core is eroding
AI-distribution optionality vs value-trap risk

Convexity & why

Moderate · binary

Cheap with AI-distribution optionality
vs an eroding core
Binary-ish

Other endogenous concerns

Own antitrust fight with Google (plaintiff) — outcome cuts both ways
SMB advertiser churn; restaurant/retail ads already shrinking

Hype factor (market awareness)

Low-Med

OpenAI deal is new and barely in the price; story still read as 'Google victim'

Catalysts

OpenAI deal revenue contribution
'Other revenue' growth each quarter
Services-ads resilience vs AI search

Zillow Z/ZG ◆ owner*

IR / presentations ↗

mkt cap~$8.6B ✓ FMP EV/Sales~3.1x YoY growth+16% price~web price

Data

Nature of the data

Data 5 · Neutral

Zestimate + listing data + largest US housing audience
Much listing data is MLS-shared, not fully proprietary
Consumer housing intent data

Data trajectory (stock vs flow)

Churning flow

Listings turn over rather than accumulate; Zestimate history compounds quietly

Position on the AI-unlock curve

AI 5 · Neutral

Strong in-app AI; partial data moat
Real-estate AI agents could use it
Mid on the curve

Current AI contracts & counterparties

~ desk note

In-app AI (natural-language search); MLS data shared

Possibilities for additional contracts

Housing-intent data for real-estate AI agents

AI risks — what stands to lose

AI agents could search listings directly; Zillow's audience moat = the defense
Low risk to Zestimate itself

Assessment

Valuation & discrepancy

Disc 6 · Neutral

~$8.6B cap, ~3.1x EV/Sales on +16% growth
Cheaper than I'd shown; partial (MLS-shared) moat
Housing-cycle leverage on top

Convexity & why

Moderate

Housing-cycle optionality
Partial moat caps the data upside
Balanced

Other endogenous concerns

NAR commission-settlement reshapes agent economics — its customers' wallets
Housing-cycle beta; Showcase/mortgage execution

Hype factor (market awareness)

Med

AI features noted, not a data thesis

Catalysts

Housing cycle; Showcase attach

RELX RELX ◆ owner

IR / presentations ↗

mkt cap~$92B* ✓ FMP EV/Sales~9.0x YoY growth+7% price~web price · premium compounder

Data

Nature of the data

Data 10 · High

Elsevier science (The Lancet, Cell, Scopus) — peer-reviewed at scale
LexisNexis legal + LexisNexis Risk Solutions (identity/fraud)
Three of the most defensible corpora on earth in one company
Scientific literature is critical for frontier capability

Data trajectory (stock vs flow)

Growing

Global science output grows mid-single-digit %/yr; submissions rising
Caveat: AI-generated paper flood is a quality-control burden

Position on the AI-unlock curve

AI 8 · High

Lexis+ AI, Scopus AI, ClinicalKey AI, Protégé all live
Embeds data in grounded retrieval vs raw training access
Among the best-positioned grounded-AI owners
High — productization mature & shipping

Current AI contracts & counterparties

✓ deep dive

No raw licensing; grounded products only
Lexis+ AI, Scopus AI, ClinicalKey AI, Protégé all shipped

Possibilities for additional contracts

Elsevier corpus licensing remains a withheld option (big if ever)
Agent-access tiers to Scopus/Lexis
Risk-data feeds into KYC agents

AI risks — what stands to lose

Lexis faces the same legal-AI insurgency as Westlaw
Elsevier: AI summarization + open access erode subscription rationale
Risk division most insulated

Assessment

Valuation & discrepancy

Disc 4 · Neutral

~9x EV/Sales on +7% growth — a small AI-threat discount embedded
Grounded AI products shipping across all three corpora
Durable compounder; thesis is durability, not deep value

Convexity & why

Low

Fully-valued premium compounder
Durable, but limited asymmetry either way

Other endogenous concerns

Open-access mandates (Plan S) pressure Elsevier's model
Exhibitions segment is cyclical

Hype factor (market awareness)

High — as threat

Same legal-AI threat narrative as TRI; grounded-product execution under-credited

Catalysts

Lexis+ AI penetration disclosures
Any Elsevier AI-licensing posture change
FY guide post-crash

Wiley WLY ◆ owner

IR / presentations ↗

mkt cap~$2.3B ✓ FMP EV/Sales~1.9x YoY growth~flat price● $43.96 live · +43.5% YTD · near 52wk high $44.58

Data

Nature of the data

Data 7 · High

Peer-reviewed STM journals/books; Cochrane co-publishing
Vetted scientific text — what labs pay for to lift capability
A 'smaller Elsevier' — quality corpus, narrower than RELX
Editorial vetting + citation links add provenance
Proprietary, not freely on the open web

Data trajectory (stock vs flow)

Growing

Submissions +25%, output +13% — the journal flow is accelerating (Q1 PR)
Caveat: some of that surge is AI-assisted writing — vetting is the product

Position on the AI-unlock curve

AI 7 · High

$92M lifetime AI-licensing revenue; $29M in Q1 FY26 alone
Anthropic strategic partnership (Sep 2025) + projects with 3 top tech cos
Recurring inference pilots: pharma, chemical, space-exploration cos
One of the few names with disclosed, recurring AI revenue
That recurring line is proven monetization most peers cannot show

Current AI contracts & counterparties

✓ deep dive

$92M lifetime AI revenue; $29M in Q1 FY26 (PR)
Anthropic strategic partnership (Sep 2025)
Projects with 3 of the largest tech cos (unnamed)
Recurring inference pilots: pharma, chemical, space

Possibilities for additional contracts

Convert pilots → recurring corporate R&D subscriptions
License on behalf of partner publishers (agency model)
Agent-citation / RAG licensing beyond training

AI risks — what stands to lose

AI summarization reduces per-article reading; open access erodes paywalls
AI-written paper flood strains (and ironically validates) peer review

Assessment

Valuation & discrepancy

Disc 7 · High

~1.9x EV/Sales with disclosed, recurring AI-licensing revenue ($92M lifetime)
Flat underlying top line is the offset
Cheap on metrics for the rare proven AI licensor

Convexity & why

Moderate

Low multiple + proven licensing = bounded downside with optional upside
Flat core growth caps the slope
Asymmetry modest but positive

Other endogenous concerns

Library budget pressure + consolidation of academic spend
Post-divestiture portfolio still re-finding growth

Hype factor (market awareness)

High

AI-licensing story is prominent in coverage; expectations now elevated

Catalysts

Next earnings: Tue June 16, 2026, pre-market — FY26 Q4 + FY27 guide (notice)
AI recurring revenue <10% of AI revenue today; mgmt expects the proportion to triple next year (Q3 call)
OpenEvidence partnership: 5-yr multimillion licensing + Wiley equity stake
Nexus licensing service at 36 publishing partners — the agency model scaling
Emerald Publishing acquisition (Jun 2, 2026) adds proprietary research corpus
Q3 raised margin/EPS guidance to high end; ~4.5% dividend while you wait
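
To put the Wiley AI-licensing figures above in proportion, the sketch below computes how much of the $92M lifetime total landed in Q1 FY26 and what "triple" means for the recurring share. The "<10%" figure is treated as an upper bound, so the tripled value is also only an upper bound. Variable names are illustrative.

```python
# Figures quoted in the Wiley card above, in $M.
lifetime_ai_revenue = 92.0
q1_fy26_ai_revenue = 29.0

# Share of all AI-licensing revenue to date that was booked in Q1 FY26.
q1_share_of_lifetime = q1_fy26_ai_revenue / lifetime_ai_revenue
ai_revenue_before_q1 = lifetime_ai_revenue - q1_fy26_ai_revenue

# "Recurring <10% of AI revenue today", expected to triple next year.
recurring_share_upper_bound = 0.10
tripled_upper_bound = 3 * recurring_share_upper_bound

print(f"Q1 FY26 share of lifetime AI revenue: {q1_share_of_lifetime:.1%}")
print(f"AI revenue before Q1 FY26: ${ai_revenue_before_q1:.0f}M")
print(f"Recurring share after tripling: under {tripled_upper_bound:.0%}")
```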

Wolters Kluwer WTKWY ◆ owner

IR / presentations ↗

mkt cap~$38B* ✓ FMP EV/Sales~6.0x YoY growth+6% price~web price · premium

Data

Nature of the data

Data 9 · High

Legal, tax, health & regulatory information + workflow (CCH, UpToDate)
UpToDate is a premier point-of-care clinical reference
Authoritative corpora like RELX/Thomson Reuters
Subscription, deeply embedded in workflows

Data trajectory (stock vs flow)

Steady flow

Regulatory/tax/clinical updates are a built-in perpetual flow

Position on the AI-unlock curve

AI 7 · High

AI workflow tools shipping across segments
Same grounded-AI position as RELX/TRI
Up the curve, productizing its corpus

Current AI contracts & counterparties

~ desk note

AI embedded in UpToDate/CCH; no corpus licensing

Possibilities for additional contracts

Clinical-grounding deals for medical AI (UpToDate is the prize)

AI risks — what stands to lose

UpToDate's clinical-reference franchise faces AI-native rivals (e.g. OpenEvidence)
Tax/legal workflow seats exposed like TRI/RELX

Assessment

Valuation & discrepancy

Disc 3 · Low

Premium compounder
AI quality understood & paid for
Durability, not discount

Convexity & why

Low

Durable but fully valued
Limited asymmetry

Other endogenous concerns

CEO transition (long-tenured McKinstry era ended)
Health segment competition intensifying

Hype factor (market awareness)

Med

Quality understood; AI optionality not separately priced

Catalysts

UpToDate AI products; FY guide

Clarivate CLVT ◆ owner

IR / presentations ↗

mkt cap~$1.5B ✓ FMP EV/Sales~2.3x YoY growth−4% price~web price · heavily levered

Data

Nature of the data

Data 7 · High

Web of Science — citation graph linking ~2B scientific citations
Derwent (patents) + Cortellis (drug-pipeline intelligence)
ProQuest academic content: dissertations, archives, ebooks
Valuable for research/IP agents — 'a poor man's Elsevier'
Data quality seen as better than the company's execution

Data trajectory (stock vs flow)

Steady flow

Citations/patents grow with global publishing — steady, not accelerating

Position on the AI-unlock curve

AI 5 · Neutral

Signed access deals (Anthropic) + MCP exposure
AI research assistants in pipeline, slow to ship
Citation + patent networks useful for IP/research AI
~$4.5B net debt constrains reinvestment
Behind on the curve — data ready before the company

Current AI contracts & counterparties

✓ deep dive

Anthropic access agreement + MCP exposure for Web of Science
No disclosed $; debt limits investment

Possibilities for additional contracts

Patent/citation grounding for research agents
ProQuest licensing to labs

AI risks — what stands to lose

AI literature tools (Elicit, Semantic Scholar) bypass Web of Science discovery
Patent search AI-commoditized

Assessment

Valuation & discrepancy

Disc 7 · High

~2.3x EV/Sales (EV ~$5.7B, mostly debt) on $2.46B rev
Equity (~$1.5B) is a small levered stub
Cheap on sales, but the debt is the risk
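
The multiple above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the rounded figures quoted in this card (EV ~$5.7B, revenue $2.46B, equity ~$1.5B); the function and variable names are illustrative and not taken from any data vendor's API.

```python
def ev_to_sales(enterprise_value_bn: float, revenue_bn: float) -> float:
    """EV/Sales multiple; both inputs must be in the same units."""
    return enterprise_value_bn / revenue_bn


# Rounded inputs as quoted in the card (assumed, not re-sourced).
ev_bn = 5.7
revenue_bn = 2.46
equity_bn = 1.5

multiple = ev_to_sales(ev_bn, revenue_bn)
equity_share = equity_bn / ev_bn  # fraction of enterprise value held by equity

print(f"EV/Sales: {multiple:.1f}x")
print(f"Equity share of EV: {equity_share:.0%}")
```

On these inputs the multiple comes out at about 2.3x and equity is roughly a quarter of enterprise value, which is the sense in which the card calls it a levered stub.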

Convexity & why

High · distressed option

Small equity stub over ~$4.5B debt ≈ a call option on the enterprise
Bounded loss, multi-bagger upside if it de-levers/monetizes
Convex but a high-probability left tail — size accordingly
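
The call-option framing can be made concrete with a simple residual-claim payoff. The sketch below is illustrative only: it assumes a base enterprise value of $6.0B (the card's ~$1.5B equity plus ~$4.5B net debt, slightly above the ~$5.7B EV quoted earlier), ignores time value, refinancing and restructuring costs, and uses a hypothetical helper name.

```python
def equity_value(enterprise_value_bn: float, net_debt_bn: float) -> float:
    """Residual equity claim: what is left after debt, floored at zero."""
    return max(enterprise_value_bn - net_debt_bn, 0.0)


NET_DEBT_BN = 4.5  # from the card
BASE_EV_BN = 6.0   # assumption: ~1.5 equity + ~4.5 net debt

base_equity = equity_value(BASE_EV_BN, NET_DEBT_BN)

# Shock enterprise value up and down and see what reaches the equity.
for ev_change in (-0.50, -0.25, 0.0, 0.25, 0.50):
    ev = BASE_EV_BN * (1 + ev_change)
    eq = equity_value(ev, NET_DEBT_BN)
    print(f"EV {ev_change:+.0%} -> equity ${eq:.2f}B ({eq / base_equity - 1:+.0%})")
```

Under these assumptions a 25% rise in enterprise value doubles the equity, while a 25% fall wipes it out. The loss is bounded at the stake, but the wipe-out scenario is not remote, which is the left tail the card warns about.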

Other endogenous concerns

~$4.5B debt wall dominates everything
PE overhang; serial restructurings and writedowns

Hype factor (market awareness)

Low

Debt story drowns the data story entirely

Catalysts

De-leveraging milestones
Any AI-licensing disclosure
Segment divestitures

Getty Images GETY ◆ owner

IR / presentations ↗

mkt cap~$0.3B ✓ FMP EV/Sales~1.5x YoY growth+4% price~web price · distressed

Data

Nature of the data

Data 8 · High

~500M licensed, rights-cleared, caption-annotated images & video
Exclusive editorial archives spanning a century
iStock + Unsplash extend the catalog across tiers
Rights-cleared image–text pairs = ideal multimodal training data
Legal indemnification is the product AI builders need

Data trajectory (stock vs flow)

Strong flow + archive

160k+ events covered/yr; ~600k creators; thousands of assets ingested daily (Q2 PR)
Editorial is a daily flow machine, not just a vault — FY25 grew both segments
Risk is creative-side inflow: genAI erodes contributor economics

Position on the AI-unlock curve

AI 6 · Neutral

Perplexity multi-yr display deal (Oct 2025)
Generative tools with NVIDIA; licensed-data posture vs scrapers
Shutterstock merger (UK-cleared May 2026) adds its lab licensing deals
Litigation (Stability AI) continues to define the rights frontier
Licensing not yet replacing what AI takes from stock demand

Current AI contracts & counterparties

✓ deep dive

Perplexity multi-yr display deal, Oct 2025 — undisclosed $ (PR)
NVIDIA-powered licensed generative tools (Getty/iStock)
Shutterstock brings lab deals (OpenAI, Meta, Apple, Amazon) post-merger

Possibilities for additional contracts

Post-merger: consolidated licensed-visual-data vendor to every lab
Display/attribution deals with other AI search products
Indemnified training data as a product line

AI risks — what stands to lose

GenAI image substitution is already in the creative numbers
Editorial (real events) is the un-generatable refuge

Assessment

Valuation & discrepancy

Disc 7 · High

~$0.3B cap — a deep-distress equity stub over ~$1.3B+ debt
~1.5x EV/Sales on ~$0.9B revenue
Cheap + levered = a lottery ticket on the data

Convexity & why

High · lottery

Distressed, levered equity on ideal data — near-binary
Multiplies on a licensing/M&A catalyst, or drifts to zero
Steeply convex, lowest-conviction high-convexity name

Other endogenous concerns

~$1.3B+ debt; controlled company (Getty family + Koch)
Shutterstock integration risk; CMA found UK editorial concerns (remedies)

Hype factor (market awareness)

Med-High

Every AI headline attaches to it; the balance sheet, not awareness, is the constraint

Catalysts

Shutterstock merger close (UK-cleared May 2026)
Combined AI-licensing revenue line
Stability AI litigation outcomes
Debt refinancing

Pearson PSO ◆ owner

IR / presentations ↗

mkt cap~$9.3B ~ EV est EV/Sales~2.0x YoY growth+3% price~web price

Data

Nature of the data

Data 6 · Neutral

Education content, assessment & learning-outcome data
Proprietary curriculum + testing content
Education is an AI-disruption epicenter

Data trajectory (stock vs flow)

Steady flow

Assessment/courseware data flows with enrollment

Position on the AI-unlock curve

AI 5 · Neutral

AI partnerships to license/embed content
Two-sided disruption: tutoring threat + licensing optionality
Mid on the curve

Current AI contracts & counterparties

~ desk note

AI partnerships announced 2025 with Microsoft, Google Cloud & AWS for learning products

Possibilities for additional contracts

Curriculum licensing into AI tutors; assessment data moats

AI risks — what stands to lose

AI tutors substitute courseware — the existential half of the two-sided story
Assessment/credentialing more defensible

Assessment

Valuation & discrepancy

Disc 5 · Neutral

Disruption discount
Optionality + threat both real
Owner with genuine two-sidedness

Convexity & why

Moderate

Content-licensing/AI-tutoring optionality
vs a real disruption threat
Two-sided convexity

Other endogenous concerns

Enrollment cliffs + OPM decline in higher ed
Multi-year strategic rebuild under newer CEO

Hype factor (market awareness)

Med

Two-sided: tutoring threat vs licensing option

Catalysts

Enrollment trends; AI products; partnership revenue

BlackSky BKSY ◆ owner

IR / presentations ↗

mkt cap~$1.2B ✓ FMP EV/Sales~12x YoY growth+4% price~web price · high-beta

Data

Nature of the data

Data 8 · High

High-frequency satellite imagery + Spectra AI geospatial intelligence
Rapid-revisit imagery over own and third-party sensors
Growing multi-year defense backlog
Same theme as Planet, earlier in scaling

Data trajectory (stock vs flow)

Compounding

Constellation growth (Gen-3) raises capture rate; archive accrues

Position on the AI-unlock curve

AI 7 · High

$100M+, 7-yr international defense contract (Jan 2025)
$30M+ multi-year Gen-3 tactical ISR deal (Q3 2025)
Backlog $323M, 91% international
Spectra AI analytics layer over own + third-party sensors
Same curve as Planet, at an earlier and smaller-cap stage

Current AI contracts & counterparties

✓ deep dive

$100M+, 7-yr int'l defense contract, Jan 2025 (PR)
$30M+ Gen-3 tactical ISR deal (Q3 25); backlog $323M, 91% int'l

Possibilities for additional contracts

Gen-3 constellation upsells
US budget normalization
Spectra analytics licensing to allied gov'ts

AI risks — what stands to lose

Low, as for Planet: AI raises the value of the sensor flow

Assessment

Valuation & discrepancy

Disc 4 · Neutral

~$1.2B cap, EV ~$1.25B on ~$107M revenue
~12x EV/Sales on ~flat revenue — richly valued, not cheap
Defense backlog is the story; the price is not a discount

Convexity & why

Moderate

Unique imagery + defense backlog = real optionality
But ~12x sales on flat revenue means you pay up for it
Not the bounded-downside cheap option it first looked like

Other endogenous concerns

Dilution history; international customer concentration
Gen-3 execution timeline risk

Hype factor (market awareness)

Med-High

Defense-AI story increasingly recognized; ~12x sales already pays for it

Catalysts

Gen-3 launch & tasking milestones
US budget resolution
New int'l capacity commitments

Leidos LDOS △ borderline

IR / presentations ↗

mkt cap~$16B ~ EV est EV/Sales~1.3x YoY growth+6% price~web price · services

Data

Nature of the data

Data 3 · Low

Works on gov geospatial/intel data it doesn't own
Palantir-type: analytics layer on others' data

Data trajectory (stock vs flow)

n/a

Doesn't own the data it works on

Position on the AI-unlock curve

AI 4 · Neutral

AI analysis agents on others' data
Services, not a data owner

Current AI contracts & counterparties

~ desk note

AI services on government data it doesn't own

Possibilities for additional contracts

—

AI risks — what stands to lose

AI compresses services labor pricing — the classic services squeeze

Assessment

Valuation & discrepancy

Disc 4 · Neutral

Cheap services multiple
Not a data-owner screen fit

Convexity & why

Low

Services multiple, no data optionality

Other endogenous concerns

Recompete cycles; budget continuing-resolution exposure

Hype factor (market awareness)

Low

Services multiple, services story

Catalysts

Award cycles

Planet Labs PL ◆ owner

IR / presentations ↗

mkt cap~$10.4B ✓ FMP EV/Sales~28x YoY growth+26% price~$31 · 52wk $4.90–$51.76 (web)

Data

Nature of the data

Data 9 · High

Images Earth's entire landmass daily (~3.5 m resolution), plus high-res SkySat/Pelican (~50 cm)
A unique multi-year temporal archive no competitor has
Change-over-time is the moat — can't retroactively collect history
Increasingly delivered as AI-ready analytics
Defense & intelligence is the fastest-growing buyer

Data trajectory (stock vs flow)

Compounding by design

Whole-Earth scan daily — the archive grows every 24h by construction
New satellites add resolution/cadence; history can't be re-collected

Position on the AI-unlock curve

AI 7 · High

Anthropic partnership (Mar 2025): Claude applied to satellite imagery
First prime win on NGA Luno ($12.8M, maritime AI analytics)
MDA SHIELD IDIQ prime — eligible for Golden Dome task orders
Backlog ~$900M (+79% YoY); Q4 revenue +41%
AI analytics is the product; defense is the buyer

Current AI contracts & counterparties

✓ deep dive

Anthropic partnership (Mar 2025): Claude on satellite imagery (report)
NGA Luno prime win $12.8M (SpaceNews)
MDA SHIELD IDIQ prime (Golden Dome-eligible); backlog ~$900M

Possibilities for additional contracts

Golden Dome task orders
AI-analytics subscriptions over the archive (insurance, ag)
More foundation-model partnerships on temporal imagery

AI risks — what stands to lose

Low: AI is the accelerant, not the threat; the risk is capex and competition

Assessment

Valuation & discrepancy

Disc 4 · Neutral

~28x EV/Sales, pre-profit — the data and backlog are the appeal, not the multiple
Backlog ~$900M anchors forward revenue
Rich on every metric

Convexity & why

High · optionality

Unique archive + ramping $906M defense backlog = large 'if it scales' upside
Pre-profit/capital intensity is the downside
Strong positive convexity

Other endogenous concerns

SPAC-era dilution legacy; Pelican capex cycle
Government contract concentration & timing lumps

Hype factor (market awareness)

High

AI + defense premium fully in the ~28x; expectations are the risk

Catalysts

Next earnings: ~early Sept 2026 (FQ1'27 reported Jun 4 — record print) (Q1 8-K)
FY27 guide raised to $425–441M (+41% mid); Q2 guide $102–107M with adj-EBITDA breakeven-to-positive
Backlog $906M (+72%), RPO $816M (+81%); ~40% of backlog converts within 12 months
Pelican cadence: 3 launched in Q1 incl Sweden's first sovereign recon satellite
$731M cash funds the capex cycle; NGA $22M extension; Golden Dome task orders the option
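
A rough cross-check of the guidance and backlog figures above, as a minimal sketch. It uses only the numbers quoted in this card; comparing 12-month backlog conversion with the fiscal-year guide is an approximation, since the two windows do not line up exactly.

```python
def midpoint(low: float, high: float) -> float:
    """Midpoint of a guidance range."""
    return (low + high) / 2


# Figures quoted in the card, in $M.
fy27_guide_mid_m = midpoint(425, 441)
implied_prior_year_m = fy27_guide_mid_m / 1.41  # backs out the +41% growth at midpoint
backlog_next_12m_m = 0.40 * 906                 # ~40% of backlog converts within 12 months

# Rough share of the guide midpoint already covered by backlog conversion.
coverage = backlog_next_12m_m / fy27_guide_mid_m

print(f"FY27 guide midpoint: ${fy27_guide_mid_m:.0f}M")
print(f"Implied prior-year revenue: ~${implied_prior_year_m:.0f}M")
print(f"Backlog converting in 12 months: ~${backlog_next_12m_m:.0f}M")
print(f"Approximate coverage of guide midpoint: {coverage:.0%}")
```

On these inputs the midpoint is $433M, the implied prior-year base is about $307M, and roughly $362M of backlog is expected to convert within a year.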

Spire / Satellogic SPIR/SATL ◆ owner

IR / presentations ↗

mkt cap~$0.5B ~ EV est EV/Sales~3.0x YoY growth+20% price~web price · micro-cap

Data

Nature of the data

Data 6 · Neutral

Weather/maritime/RF data (Spire); hyperspectral imagery (Satellogic)
Niche proprietary sensor data
Early and capital-intensive

Data trajectory (stock vs flow)

Compounding

Continuous sensor flow (weather/RF/hyperspectral); small base

Position on the AI-unlock curve

AI 5 · Neutral

Real but early sensor datasets
On the curve but small

Current AI contracts & counterparties

~ desk note

Niche gov/defense sensor contracts

Possibilities for additional contracts

Weather/RF data into forecasting AI

AI risks — what stands to lose

Low AI risk; survival risk is capital, not AI

Assessment

Valuation & discrepancy

Disc 5 · Neutral

Speculative micro-caps
Watchlist-only owners
High risk, thin coverage

Convexity & why

High · lottery

Micro-cap sensor data — binary
Large upside if a dataset scales, fat left tail
High-variance convexity

Other endogenous concerns

Cash runway and listing-compliance history — survival-grade risks

Hype factor (market awareness)

Low

Below the radar entirely

Catalysts

Contract wins; cash runway

Genius Sports GENI ◆ owner

IR / presentations ↗

mkt cap~$1.7B ✓ FMP EV/Sales~3.6x YoY growth+31% price~web price · growth

Data

Nature of the data

Data 8 · High

Exclusive official league-data rights (NFL, NCAA, EPL)
Now the NCAA's official data provider
The other half of the official-sports-data duopoly with Sportradar
Growing media/ad data layer (post-Legend acquisition)
Multi-year rights = a hard moat

Data trajectory (stock vs flow)

Growing

More leagues, deeper tracking (player-level optical) each season

Position on the AI-unlock curve

AI 7 · High

AI for fan engagement and betting integrity products
Media/ad data layer monetizes the rights twice
Growing ~31% YoY; up the curve like Sportradar
Owner actively monetizing, not just holding

Current AI contracts & counterparties

✓ deep dive

No corpus licensing; exclusive NFL/NCAA/EPL rights in-product
BetVision + media/ad data layer (Legend acq.)

Possibilities for additional contracts

Second monetization of rights via media/ads
AI integrity & fan-engagement products

AI risks — what stands to lose

Same as Sportradar: rights moat holds, services layer is competitive

Assessment

Valuation & discrepancy

Disc 6 · Neutral

~$1.7B cap, ~3.6x EV/Sales on +31% growth
Cheap for the growth + the official-data rights duopoly
Media/ad layer monetizes the rights twice

Convexity & why

High

Rights moat + media-data optionality = asymmetric upside
Growth-priced, so not deeply cheap
Convex if the media layer scales

Other endogenous concerns

NFL warrant dilution; rights renewals can reset economics
Only recently profitable

Hype factor (market awareness)

Low

Same as Sportradar — the duopoly's AI angle is unpriced

Catalysts

Next earnings: ~early Aug 2026 (Q1 reported May 8)
Legend closed May 1 → FY26 guide ~$990M–$1.01B rev / $270–280M EBITDA (~28% margin) (Q1 call)
NFL rights locked through Super Bowl 2030; GeniusIQ to automate the full rights portfolio by end-2027
Prediction markets: market makers onboarded in Q1 on low-latency feeds
Targets: positive GAAP net income 2027; ≥60% uFCF conversion by 2028; ~$100M H2'26 cash flow
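
The "~28% margin" in the FY26 guide can be sanity-checked from the two ranges. A minimal sketch using only the guide figures quoted above; pairing the range ends to bound the margin is an assumption for illustration, not something management stated.

```python
# FY26 guide as quoted in the card, in $M.
rev_low_m, rev_high_m = 990, 1010
ebitda_low_m, ebitda_high_m = 270, 280

# Midpoint margin, plus the widest bounds the two ranges allow.
margin_mid = ((ebitda_low_m + ebitda_high_m) / 2) / ((rev_low_m + rev_high_m) / 2)
margin_min = ebitda_low_m / rev_high_m   # weakest EBITDA on strongest revenue
margin_max = ebitda_high_m / rev_low_m   # strongest EBITDA on weakest revenue

print(f"Midpoint margin: {margin_mid:.1%}")
print(f"Possible range: {margin_min:.1%} to {margin_max:.1%}")
```

The midpoint works out to 27.5%, with bounds of roughly 26.7% to 28.3%, so the quoted ~28% sits at the upper end of what the ranges imply.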

Sportradar SRAD ◆ owner

IR / presentations ↗

mkt cap~$4.9B ✓ FMP EV/Sales~3.0x YoY growth+12% price~$15 · −50% from $32 ATH (web)

Data

Nature of the data

Data 8 · High

Official, licensed sports-data rights — 900k+ events, 80+ sports
Real-time play-by-play feeds, pre-match & live odds, streaming
Multi-year exclusive league contracts = hard-to-replicate moat
Half of a duopoly with Genius for official betting data
The data backbone of the global betting industry

Data trajectory (stock vs flow)

Growing

Event coverage (900k+/yr) and in-play depth keep expanding

Position on the AI-unlock curve

AI 7 · High

AI for in-play personalization, risk/trading, content generation
Higher-margin products (MTS, 4Sight) lift take-rates
A genuine owner monetizing its corpus
Up the curve; AI deepens products vs a new licensing line
Recent Kalshi deal extends into prediction markets

Current AI contracts & counterparties

✓ deep dive

No corpus licensing — official-data rights monetized in-product
Kalshi deal extends feeds into prediction markets

Possibilities for additional contracts

AI in-play products lift take-rates (4Sight, MTS)
Prediction-market data feeds scale

AI risks — what stands to lose

Betting operators in-housing AI models could squeeze value-add services
Official rights protect the raw feed itself

Assessment

Valuation & discrepancy

Disc 6 · Neutral

~3.0x EV/Sales on ~12% growth for a rights-duopoly owner
Reasonable on metrics for the moat
Fair-to-slightly-cheap

Convexity & why

Moderate

~3x sales on duopoly rights gives a floor
Upside from take-rate growth on new products
Balanced, slight positive tilt

Other endogenous concerns

Rights-cost inflation: leagues extract more each renewal
Founder (Koerl) control; bookmaker customer concentration

Hype factor (market awareness)

Low

Priced as a betting vendor; data-rights duopoly rarely framed as AI

Catalysts

Next earnings: ~early Aug 2026 (Q1 reported early May)
NOW: FIFA World Cup (Jun–Jul 2026) — major in-play/MTS volume event (Q1 call)
FY26 reaffirmed: 23–25% cc revenue growth / 34–37% EBITDA growth
Prediction markets 'imminent, potentially material' — H2 ramp
IMG Arena synergies above 25% target; >700k streamed matches in 2026; H2 restructuring for leverage
Short-seller reports — CEO pushed back on call; monitor, don't ignore

DoubleVerify / Comscore DV/SCOR ◆ owner

IR / presentations ↗

mkt cap~$1.6B ✓ FMP EV/Sales~2.0x YoY growth+14% price~web price

Data

Nature of the data

Data 6 · Neutral

Ad-verification/fraud data (DV, healthier franchise)
Cross-platform audience measurement (Comscore, distressed)
Proprietary measurement data

Data trajectory (stock vs flow)

Flow with ad spend

Verification events track media volumes

Position on the AI-unlock curve

AI 6 · Neutral

Measurement owners; AI + walled gardens pressure the moat
DV is the credible franchise; SCOR a broken business

Current AI contracts & counterparties

~ desk note

AI-content verification products (DV)

Possibilities for additional contracts

Verification layer for AI-generated ad content

AI risks — what stands to lose

AI-generated content/MFA sites flood verification (volume up, value contested)
Walled gardens self-verify

Assessment

Valuation & discrepancy

Disc 7 · High

~$1.6B cap, ~2.0x EV/Sales on +14% growth
Cheap for an ad-verification data owner (DV)
DV the franchise; SCOR the distressed lottery leg

Convexity & why

High

~2x sales for a profitable measurement owner
AI + walled gardens pressure the moat
Cheap enough to be convex

Other endogenous concerns

Ad-budget cyclicality; IAS rivalry compresses pricing; SCOR is balance-sheet-fragile

Hype factor (market awareness)

Low-Med

De-rated with adtech; AI angle minor

Catalysts

DV growth; SCOR restructuring

Similarweb SMWB ◆ owner

IR / presentations ↗

mkt cap~$0.36B ✓ FMP EV/Sales~0.7x YoY growth+15% price~web price · small-cap

Data

Nature of the data

Data 5 · Neutral

Panel/clickstream traffic, keyword, conversion estimates for nearly every site
The dataset everyone uses to track digital behavior — incl. AI-search traffic
Broad coverage, but modeled/estimated, not a first-party record
Continuously updated digital-intelligence feeds

Data trajectory (stock vs flow)

Continuous panel

Clickstream flow is constant but panel-based — quality needs constant defense
Privacy/cookie shifts are structural headwinds to collection

Position on the AI-unlock curve

AI 8 · High

Sells data feeds/APIs + MCP integrations into AI workflows
Uniquely positioned to measure (and feed) the AI-search era
Ahead for its size — high AI exposure per dollar
Catch: modeled data less defensible than owned

Current AI contracts & counterparties

✓ deep dive

Sells AI/clickstream datasets + MCP integrations into AI workflows
The standard source for tracking ChatGPT/Gemini traffic share

Possibilities for additional contracts

AI-data ARR as a disclosed line
Agent-platform data feeds
Strategic acquirer interest (data fits many buyers)

AI risks — what stands to lose

AI search shrinks open-web traffic — shrinking the thing it measures
Collection (panels/extensions) gets harder as browsing shifts to agents

Assessment

Valuation & discrepancy

Disc 7 · High

~$0.36B cap, EV ~$0.21B on ~$283M revenue
~0.7x EV/Sales — strikingly cheap, even for modeled data
Deep-value + AI-licensing optionality; small & illiquid
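
The gap between market cap and EV above implies a net cash position, which is part of why the multiple looks so low. A minimal sketch on the card's rounded figures; it treats EV as market cap minus net cash and ignores leases, minority interests and other adjustments a full EV bridge would include.

```python
# Rounded figures quoted in the card, in $B.
market_cap_bn = 0.36
enterprise_value_bn = 0.21
revenue_bn = 0.283

# Simplified bridge (assumption): EV = market cap - net cash.
net_cash_bn = market_cap_bn - enterprise_value_bn

ev_sales = enterprise_value_bn / revenue_bn
cap_sales = market_cap_bn / revenue_bn  # the multiple if net cash is not credited

print(f"Implied net cash: ~${net_cash_bn:.2f}B")
print(f"EV/Sales: {ev_sales:.1f}x")
print(f"Market cap/Sales: {cap_sales:.1f}x")
```

On these inputs about $0.15B of the market cap is net cash, so the operating business is priced at about 0.7x sales versus about 1.3x on market cap alone.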

Convexity & why

High · deep-value

<1x EV/Sales with AI-licensing pull = asymmetric
Small, illiquid, modeled (non-owned) data = the risk
Cheap enough that convexity tilts positive

Other endogenous concerns

Small-cap liquidity; SBC heavy; privacy rules threaten collection methods

Hype factor (market awareness)

Med

Its datasets are quoted everywhere; the equity is ignored at ~0.7x EV/S

Catalysts

Next earnings: ~mid-Aug 2026 (Q1 reported May 13)
Second large LLM training contract expected 'over the coming quarters' (Q1 6-K)
AI revenue trajectory: 11% of Q4 revenue, ~3x YoY — does it keep compounding?
RPO $297.7M (+18%); multi-year ARR at 64% — contract-quality migration
FY26 guide $307–315M; low end already raised once

The Trade Desk TTD ○ operator

IR / presentations ↗

mkt cap~$9.4B ~ EV est EV/Sales~3.0x YoY growth+18% price~web price · de-rated (verify)

Data

Nature of the data

Data 6 · Neutral

Ad-bidding/bidstream data + UID2 identity framework
Powers its own bidding (demand-side platform)
Vast behavioral data, but an input rather than a sold corpus

Data trajectory (stock vs flow)

High flow

Bidstream data scales with ad volume; ephemeral by nature

Position on the AI-unlock curve

AI 6 · Neutral

Stewards the UID2 identity standard
Identity-data optionality, not a corpus sale
De-rated; the case rests on ad-platform fundamentals

Current AI contracts & counterparties

~ desk note

Kokai AI in-platform; UID2 stewardship

Possibilities for additional contracts

UID2 as identity layer for agentic commerce

AI risks — what stands to lose

AI walled-garden answers shrink open-web inventory — the de-rate driver
Agentic ad-buying could compress DSP take rates

Assessment

Valuation & discrepancy

Disc 6 · Neutral

~3.0x EV/Sales on +18% growth — value territory for profitable adtech
UID2 identity optionality on top
Open-web AI fears embedded in the multiple

Convexity & why

Moderate–High

Modest multiple + identity-standard optionality
Data is an input, not a sold corpus
Positive tilt on metrics

Other endogenous concerns

Founder super-voting; Amazon DSP is the real competitive event
SBC and the credibility hit from the '25 stumble

Hype factor (market awareness)

Med — as threat

AI read as open-web risk; de-rate reflects it

Catalysts

CTV share; UID2 adoption; growth re-accel

ZoomInfo GTM ◆ owner

IR / presentations ↗

mkt cap~$0.8B ✓ FMP EV/Sales~1.7x YoY growth~−3% price● $2.77 live · −72.8% YTD · at 52wk low $2.74

Data

Nature of the data

Data 6 · Neutral

B2B contact + company intelligence: emails, dials, org charts, technographics
Buying-intent signals across millions of companies
A live 'who's-who' graph of decision-makers
Real, but increasingly replicable as AI shifts buyer behavior
Renamed platform around 'GTM AI'

Data trajectory (stock vs flow)

Decay treadmill

B2B contact data decays ~25–30%/yr — must be rebuilt constantly
Customer churn weakens the contributory refresh loop
The clearest decaying-asset risk in the table
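
To see what a 25–30% annual decay rate means for the dataset, here is a minimal sketch. It assumes a constant compounding decay rate and no refresh at all, which is a deliberate simplification; the helper names are illustrative.

```python
import math


def fraction_still_valid(annual_decay: float, years: float) -> float:
    """Share of records still accurate after `years` with no refresh."""
    return (1 - annual_decay) ** years


def half_life_years(annual_decay: float) -> float:
    """Years until half of the records have gone stale."""
    return math.log(0.5) / math.log(1 - annual_decay)


for decay in (0.25, 0.30):
    after_two = fraction_still_valid(decay, 2)
    after_three = fraction_still_valid(decay, 3)
    print(
        f"decay {decay:.0%}/yr: {after_two:.0%} valid after 2y, "
        f"{after_three:.0%} after 3y, half-life {half_life_years(decay):.1f}y"
    )
```

Under these assumptions roughly half the records go stale within about two to two and a half years, which is why the refresh loop, and the customer base that feeds it, matters so much.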

Position on the AI-unlock curve

AI 7 · High

GTM Context Graph native in OpenAI's Codex for Work — agent context layer
AI is both distribution and disruptor
Cut 2026 guidance and ~20% of staff, citing AI-driven shifts
Ahead on plumbing, behind on the seat-based model
Clearest live case of 'data doesn't protect the equity'

Current AI contracts & counterparties

✓ deep dive

GTM Context Graph natively in OpenAI's Codex for Work
No disclosed licensing $; positioning as agent context layer

Possibilities for additional contracts

Per-call context pricing for sales agents
More agent-platform embeds (Claude, Gemini)
Data-only tier decoupled from seats

AI risks — what stands to lose

Customers replace SDR seats with AI — seat-based model directly hit (guidance cut said so)
Agents can increasingly infer contact data without a vendor

Assessment

Valuation & discrepancy

Disc 6 · Neutral

~1.7x EV/Sales: among the lowest multiples on the board
But revenue is declining; the cheapness reflects decay risk
Statistically cheap; operationally a falling knife

Convexity & why

High · binary

~1.7x sales embeds heavy pessimism — small asymmetric base
Re-rates hard if revenue stabilizes as the agent-context layer
Declining revenue is the live left tail

Other endogenous concerns

Debt on a shrinking base; SBC dilution; churn is the whole story

Hype factor (market awareness)

High — as threat

The market's AI-victim poster child; the Codex embed is ignored

Catalysts

Next earnings: ~early-mid Aug 2026 (Q1 reported May 11)
The trough test: FY26 guide cut to $1.185–1.205B (−4% mid); Q2 $300–303M — does it hold? (Q1 call)
Agent embeds: Salesforce prospecting agent ships with ZoomInfo as first/primary external data source (150k+ customers); HubSpot native; ChatGPT/Claude/Copilot/Perplexity connectors live
Pricing pivot: Copilot moving from seats to prepackaged credits/consumption
Mgmt points to growth returning H2 2027; 35% AOI margin + cost cuts fund the wait

ACV / OPENLANE ACVA/KAR ○ operator

IR / presentations ↗

mkt cap~$1.0B ~ EV est EV/Sales~5.0x YoY growth+25% price~web price

Data

Nature of the data

Data 6 · Neutral

Wholesale used-car condition & transaction data (ACV inspection corpus)
Granular vehicle-condition/pricing data
Still primarily marketplaces

Data trajectory (stock vs flow)

Growing

Inspection corpus grows with every vehicle listed (ACV)

Position on the AI-unlock curve

AI 5 · Neutral

Feeds AI pricing
ACV more data-distinctive
Operators, not data-unlock plays

Current AI contracts & counterparties

~ desk note

ACV inspection-AI in-product

Possibilities for additional contracts

Condition-data licensing to pricing AIs

AI risks — what stands to lose

Low-moderate; inspection AI is ACV's own product

Assessment

Valuation & discrepancy

Disc 5 · Neutral

ACV the more data-distinctive
Both operators
Corpus enhances the platform

Convexity & why

Moderate

ACV growth + condition-data optionality
Valued on the marketplace
Mildly positive

Other endogenous concerns

ACV not yet sustainably profitable; OPENLANE balance sheet

Hype factor (market awareness)

Low

Marketplace story

Catalysts

GMV growth; take rates

CarGurus / Cars.com CARG/CARS ○ operator

IR / presentations ↗

mkt cap~$2.7B ~ EV est EV/Sales~3.0x YoY growth+5% price~web price

Data

Nature of the data

Data 5 · Neutral

Auto listing, pricing & shopper-intent data
Largely audience/marketplace
Listings not fully proprietary

Data trajectory (stock vs flow)

Churning flow

Listings churn; intent data flows with traffic

Position on the AI-unlock curve

AI 4 · Neutral

Useful intent data, Zillow-like
Not a data-unlock play

Current AI contracts & counterparties

~ desk note

In-product pricing AI

Possibilities for additional contracts

—

AI risks — what stands to lose

AI shopping agents could bypass listing sites

Assessment

Valuation & discrepancy

Disc 5 · Neutral

Reasonable valuations
Operators in the Zillow mold

Convexity & why

Low

Operator, limited data asymmetry
Balanced-to-low

Other endogenous concerns

Dealer-count churn; marketing-spend treadmill

Hype factor (market awareness)

Low

Marketplace story

Catalysts

Dealer counts

Copart CPRT ○ operator

IR / presentations ↗

mkt cap~$29B ~ EV est EV/Sales~10x YoY growth+10% price~web price · premium

Data

Nature of the data

Data 7 · High

Salvage-auto auction & vehicle-history data (IntelliSeller)
Decades of auction-outcome data
Serves its dominant auction marketplace

Data trajectory (stock vs flow)

Growing

Salvage auction outcomes accumulate with volume

Position on the AI-unlock curve

AI 5 · Neutral

AI tools in-product, not licensed
Data deepens the moat, isn't the product

Current AI contracts & counterparties

~ desk note

Internal auction AI (IntelliSeller)

Possibilities for additional contracts

—

AI risks — what stands to lose

Low; AI assists damage assessment

Assessment

Valuation & discrepancy

Disc 3 · Low

Premium, high-quality operator
Data deepens the moat, isn't the product

Convexity & why

Low

Premium, data not the re-rate driver

Other endogenous concerns

Leadership transition from founder era; totals cycle depends on used-car values

Hype factor (market awareness)

Low

Operator story

Catalysts

Volume cycles

Instacart CART ○ operator

IR / presentations ↗

mkt cap~$9.9B ~ EV est EV/Sales~3.5x YoY growth+10% price~web price

Data

Nature of the data

Data 6 · Neutral

Grocery-purchase + fast-growing retail-media ad data
Rich first-party purchase data
Powers its own high-margin ads (input)

Data trajectory (stock vs flow)

Compounding

Purchase graph deepens with order history

Position on the AI-unlock curve

AI 5 · Neutral

Strong data-driven ad engine
AI-relevant, but feeds its ads, not sold
Operator class

Current AI contracts & counterparties

~ desk note

Retail-media AI in-product

Possibilities for additional contracts

Purchase-data into commerce agents (never signaled)

AI risks — what stands to lose

AI shopping agents could disintermediate the storefront layer

Assessment

Valuation & discrepancy

Disc 5 · Neutral

Reasonable on ads + delivery
Strong ad engine
Data is an input

Convexity & why

Moderate

Retail-media optionality
Valued on the business, not the data
Balanced

Other endogenous concerns

DoorDash/Uber entering grocery; ad growth must outrun fee pressure

Hype factor (market awareness)

Low

Grocery/ads story

Catalysts

Ad revenue growth

FIS FIS ○ operator

IR / presentations ↗

mkt cap~$21B ~ EV est EV/Sales~4.0x YoY growth+4% price~web price

Data

Nature of the data

Data 5 · Neutral

Merchant transaction flows & fraud signals (banking/payments processing)
Real data, but serves its processing

Data trajectory (stock vs flow)

Steady flow

Transaction flow tracks processing volumes

Position on the AI-unlock curve

AI 4 · Neutral

In-product fraud/upsell, not a corpus

Current AI contracts & counterparties

~ desk note

Fraud AI in-product

Possibilities for additional contracts

—

AI risks — what stands to lose

AI-native fintech infrastructure competition

Assessment

Valuation & discrepancy

Disc 4 · Neutral

Cheap-ish fintech
But not a data re-rate

Convexity & why

Low

Value fintech, data not the driver

Other endogenous concerns

Worldpay separation aftermath; bank IT spending cycles

Hype factor (market awareness)

Low

Fintech story

Catalysts

Banking IT spend

Visa / Mastercard / Amex V/MA/AXP ○ operator

IR / presentations ↗

mkt cap~$623B (V) / $438B (MA) ~ EV est EV/Sales~16x YoY growth+10% price~web price · payment giants

Data

Nature of the data

Data 8 · High

Among the largest transaction datasets on earth
Regulated, privacy-bound byproduct
Not licensed as a corpus

Data trajectory (stock vs flow)

Compounding

Payment volumes grow ~10%/yr — among the largest data flows on earth

Position on the AI-unlock curve

AI 3 · Low

Increasingly productized
But privacy-bound; not a corpus sale
The ultimate data-advantaged operators

Current AI contracts & counterparties

~ desk note

Internal fraud/credit AI at vast scale; agentic-commerce pilots

Possibilities for additional contracts

Agentic payments standards (who authorizes an AI's purchase?)

AI risks — what stands to lose

Agentic payments could reshape authorization economics — also an opportunity
Stablecoin/alternative rails the bigger structural worry

Assessment

Valuation & discrepancy

Disc 2 · Low

Valued as payment giants
n/a as a data re-rate

Convexity & why

Low

Priced payment networks; data is internal

Other endogenous concerns

Interchange regulation (CCCA) and DOJ debit suit (V)
Stablecoin rails as long-term routing threat

Hype factor (market awareness)

Med

Agentic commerce chatter rising; data never the thesis

Catalysts

Agentic-payment standards; volume growth
