Data owners — company one-pagers
Sequential dossier view of the master table: one card per company with identical content — data nature, trajectory, AI-unlock, contracts, possibilities, AI risks, FMP financials, valuation/discrepancy, convexity, endogenous concerns, hype, catalysts. Same ratings, same sources, same date (FMP, Jun 9 2026). One analyst's framework — verify before acting; not investment advice.
Financial-market data
Data
Nature of the data
Data 6 · Neutral
- Derivatives pricing & trade data
- High-margin byproduct of the exchange
Data trajectory (stock vs flow)
Growing
- Derivatives data flow grows with record volumes
Position on the AI-unlock curve
AI 5 · Neutral
- Sells valuable data, but it's not the thesis
Current AI contracts & counterparties
~ desk note
- Sells market data conventionally
Possibilities for additional contracts
- Derivatives data into quant/agent stacks
AI risks — what stands to lose
- Minimal — clearing/execution moat unaffected
Assessment
Valuation & discrepancy
Disc 3 · Low
- Premium, well-understood
- Owner-ish, but data isn't the re-rate
Convexity & why
Low
- Priced, data not the driver
Other endogenous concerns
- Volume cyclicality; FMX (BGC) attacking rates franchise
Hype factor (market awareness)
Low
- Not a data-AI story
Catalysts
- Volume cycles; data pricing
Data
Nature of the data
Data 6 · Neutral
- Entity-linked financial data: fundamentals, estimates, ownership, transcripts
- 'Symbology' deep ticker-linking is the connective tissue agents need
- But much content is aggregated/licensed, not owned — caps the moat
- Workflow terminals for buy/sell-side
Data trajectory (stock vs flow)
Steady flow
- Coverage expands steadily; much content aggregated, not originated
Position on the AI-unlock curve
AI 7 · High
- Conversational FactSet Mercury shipped; 48/50 top clients on AI tools
- Clean, entity-linked data is ideal RAG fuel for finance copilots
- Up-ish the curve
- Aggregated data limits licensing leverage
Current AI contracts & counterparties
~ desk note
- FactSet Mercury + transcript AI; aggregated content limits licensing
Possibilities for additional contracts
- Symbology/entity-linking as agent infrastructure
AI risks — what stands to lose
- The terminal seat is the product — agents directly substitute analyst workflows
- Aggregated (non-owned) content gives least pricing defense
Assessment
Valuation & discrepancy
Disc 6 · Neutral
- ~4.3x EV/Sales on +5% growth — quality at a modest multiple
- Aggregated (non-owned) data caps the moat
- Modest favorable gap on metrics
Convexity & why
Moderate
- Quality franchise at a modest multiple — some re-rate optionality
- Aggregated content caps the upside
- Balanced
Other endogenous concerns
- Content-licensing input costs (incl. CUSIP) squeeze margins
- CEO transition; retention metrics softening
Hype factor (market awareness)
Med — as threat
- De-rated with the info-services group in Feb 2026
Catalysts
- Retention metrics; Mercury adoption
Data
Nature of the data
Data 8 · High
- Dominant US mortgage data (Black Knight/Ellie Mae) — origination/servicing graph
- Pricing & fixed-income reference data
- Hard-to-replicate corpus inside an 'exchange' wrapper
Data trajectory (stock vs flow)
Cyclical flow
- Mortgage data flows with origination cycle; pricing data steady
Position on the AI-unlock curve
AI 6 · Neutral
- Steadily productizing pricing/reference data
- Mortgage data graph is AI-relevant
- Mid on the curve
Current AI contracts & counterparties
~ desk note
- In-product mortgage-AI; data feeds sold conventionally
Possibilities for additional contracts
- Mortgage-graph grounding for housing/credit agents
AI risks — what stands to lose
- Minimal — transaction infrastructure; some data products commoditized
Assessment
Valuation & discrepancy
Disc 4 · Neutral
- A real data owner that screens miss (it files as an exchange)
- Mostly priced
Convexity & why
Low
- Quality priced
- Limited asymmetry
Other endogenous concerns
- Mortgage tech is deeply cyclical — bought at the top
- Black Knight deal debt still being digested
Hype factor (market awareness)
Low
- Read as an exchange, never as a data-AI play
Catalysts
- Mortgage cycle; IMB platform wins
Data
Nature of the data
Data 9 · High
- Credit ratings (MIS) + Moody's Analytics
- Orbis: largest private-company database (~500M entities)
- Default histories + ownership graph — decision-grade
- Essential grounding for credit agents, KYC, supply-chain AI
Data trajectory (stock vs flow)
Growing
- Orbis entity graph keeps expanding (~500M+ entities)
- Ratings/transcript flow continuous; issuance cyclical
Position on the AI-unlock curve
AI 9 · High
- Early OpenAI partnership; Research Assistant copilot
- MCP distribution into Claude/ChatGPT/Copilot
- Packaging data for agentic workflows — furthest on distribution
- High — arguably best-executed, hence richly priced
Current AI contracts & counterparties
✓ deep dive
- Early OpenAI partnership; Research Assistant copilot
- MCP distribution into Claude/ChatGPT/Copilot
- No raw licensing — productized access only
Possibilities for additional contracts
- Agentic KYC/credit-memo workflows priced per seat
- Orbis private-company graph as agent grounding
AI risks — what stands to lose
- Analytics research/tools face AI commoditization; ratings are regulatorily protected
- KYC/compliance products meet AI-native challengers
Assessment
Valuation & discrepancy
Disc 2 · Low
- Best business + furthest-along AI
- ~11x sales / ~40x earnings to match
- Thinnest discount; DCFs flag it rich
Convexity & why
Low
- Best business, thinnest discount, DCF flags it rich
- Limited upside → low convexity
Other endogenous concerns
- Ratings revenue rides the debt-issuance cycle
- Duopoly position invites periodic antitrust/regulatory attention
Hype factor (market awareness)
High
- Best-executed AI strategy is consensus; it's already in the ~11x sales multiple
Catalysts
- Agentic product attach rates
- Ratings issuance cycle
- Orbis monetization moves
Data
Nature of the data
Data 7 · High
- Fund/ETF data, star & analyst ratings; DBRS credit ratings
- PitchBook private-markets/VC dataset is the scarce crown jewel
- Fund data feeds advisor copilots
Data trajectory (stock vs flow)
Growing
- PitchBook's private-company universe compounds with VC/PE activity
Position on the AI-unlock curve
AI 5 · Neutral
- Mo chatbot + PitchBook AI features
- Monetization mostly stays in-product
- Mid on the curve
Current AI contracts & counterparties
~ desk note
- Mo assistant; PitchBook AI features; in-product only
Possibilities for additional contracts
- PitchBook private-market data licensing to AI deal tools
AI risks — what stands to lose
- Fund research commoditized by AI summarization; ratings brand defensible
- PitchBook data scraping/inference by AI tools
Assessment
Valuation & discrepancy
Disc 6 · Neutral
- ~$7.0B cap, ~3.4x EV/Sales on +8% growth
- Cheap for a PitchBook-owning franchise
- Market under-paying for the private-markets data
Convexity & why
Moderate–High
- PitchBook AI-deal-sourcing optionality, cheaply priced
- No hard catalyst
- Cheap enough to tilt positive
Other endogenous concerns
- Founder (Mansueto) voting control
- PitchBook decelerated with the VC downturn; DBRS is issuance-cyclical
Hype factor (market awareness)
Low
- PitchBook's AI value ~absent from the narrative
Catalysts
- PitchBook growth; advisor-AI launches
Data
Nature of the data
Data 8 · High
- Indices (World, EM) that portfolios are built and measured against
- ESG/climate ratings, Barra factor/risk models, Burgiss private-asset data
- Benchmarks + factor models are chokepoints
- Index licensing is a recurring toll-road
Data trajectory (stock vs flow)
Growing
- Index/factor data grows with markets; private-asset (Burgiss) expanding fast
Position on the AI-unlock curve
AI 7 · High
- IndexAI connector; 'train clients' LLMs' roadmap
- Solid enterprise APIs
- Less aggressive than S&P/Moody's
- Mid/high — capable but measured
Current AI contracts & counterparties
~ desk note
- IndexAI connector; 'train clients' LLMs' roadmap — no licensing $ disclosed
Possibilities for additional contracts
- Benchmark/factor licensing to agent platforms
AI risks — what stands to lose
- ESG/analytics tools commoditized by AI; index licensing protected
Assessment
Valuation & discrepancy
Disc 3 · Low
- ~14x sales — one of the richest here
- Priced as the premium compounder it is
- No discount to the quality
Convexity & why
Low
- Richest multiple here
- Least convex — priced for the quality
Other endogenous concerns
- Client concentration in fee-pressured asset managers
- US political backlash against ESG products
Hype factor (market awareness)
Med
- AI seen as feature, not thesis
Catalysts
- Index flows; ESG/private-asset data attach
Data
Nature of the data
Data 7 · High
- 100+ proprietary market-data feeds
- Index & analytics data products
- A licensing toll-road like S&P benchmarks
Data trajectory (stock vs flow)
Growing
- Market data grows with volumes; Verafin fraud signals compound
Position on the AI-unlock curve
AI 6 · Neutral
- Feeds quant/agent workflows
- Productized data
- Up the curve on data productization
Current AI contracts & counterparties
~ desk note
- Verafin AI (fraud), market-data feeds; no LLM licensing line
Possibilities for additional contracts
- Surveillance/fraud agents; index licensing
AI risks — what stands to lose
- Minimal core risk; market-data products face some AI substitution
Assessment
Valuation & discrepancy
Disc 3 · Low
- Premium valuation reflects the toll-road
- Quality owner, little discount
Convexity & why
Low
- Priced toll-road
- Limited asymmetry
Other endogenous concerns
- Adenza acquisition debt + integration
- Crypto-listings exposure adds volatility
Hype factor (market awareness)
Low
- AI in products, not in the multiple
Catalysts
- Fin-crime AI growth; data ARR
Data
Nature of the data
Data 9 · High
- Credit ratings, Capital IQ fundamentals/transcripts, Platts benchmarks
- S&P Dow Jones Indices + Mobility (CARFAX)
- Benchmarks are licensing toll-roads AI can't route around
- The grounding layer any financial LLM/agent needs
Data trajectory (stock vs flow)
Growing
- Daily benchmark prints, transcripts, fundamentals — relentless flow
- CARFAX events + Mobility add new streams
Position on the AI-unlock curve
AI 9 · High
- Kensho LLM-ready API live since Nov 2024; 300+ customers
- Anthropic MCP connector + Claude Cowork plugin (Feb 2026)
- Cohere North partnership (Jun 8, 2026) — sovereign/regulated AI
- Distribution into Claude, ChatGPT, Gemini, Copilot
- The most aggressive everywhere-the-agents-are strategy
Current AI contracts & counterparties
✓ deep dive
- Kensho LLM-ready API (Nov 2024), 300+ customers (launch)
- Claude Cowork plugin + Anthropic MCP (Kensho)
- Cohere North partnership, Jun 8 2026 (PR)
Possibilities for additional contracts
- Per-seat / usage pricing for agentic data access
- Benchmark licensing to agent platforms (toll-road extension)
- Private-markets data into AI workflows
AI risks — what stands to lose
- Capital IQ desktop seats at risk as agents answer directly (why it sells the data INTO agents)
- Ratings & indices largely insulated
Assessment
Valuation & discrepancy
Disc 3 · Low
- ~8.8x EV/Sales on +8% growth — premium largely intact
- Top-tier AI execution already recognized in the multiple
- Quality fully priced; no metric discrepancy
Convexity & why
Low
- Quality + best-in-class AI execution already in the multiple
- Limited discrepancy on metrics
- Modest two-way payoff
Other endogenous concerns
- IHS Markit integration legacy; Mobility (CARFAX) is auto-cyclical
- Index fee compression a slow structural drag
Hype factor (market awareness)
Med-High
- AI execution is consensus among analysts; the multiple carries only a modest sector AI-threat discount
Catalysts
- AI-access revenue disclosure (none yet)
- More agent-platform embeds
- Ratings cycle + index flows
Professional-information data (legal · tax · IT advisory)
Data
Nature of the data
Data 8 · High
- 45+ yrs of proprietary syndicated IT/business research from ~2,000 analysts
- Magic Quadrants & Hype Cycles are de-facto standards CIOs buy on
- Price, salary & contract benchmarks from thousands of engagements
- Behind a hard paywall — not on the open web, not freely scrapeable
- >75% of contract value multi-year recurring, embedded in workflows
Data trajectory (stock vs flow)
Steady — watch the flow
- Analyst output paced by headcount; inquiry/benchmark data grows with clients
- CV slowdown = the inflow risk: fewer clients → less peer data
Position on the AI-unlock curve
AI 5 · Neutral
- Two-sided: AI could commoditize 'advice' or make its data the grounding layer
- Rolling out AskGartner inside client licenses
- Has NOT licensed its corpus to labs — keeps it walled
- Contract-value growth slowed to ~1–5% — the market's disruption tell
- Early on the curve; data-as-grounding thesis unproven
Current AI contracts & counterparties
✓ deep dive
- None — AskGartner ships inside existing client licenses
- AskGartner live across research portal (example)
Possibilities for additional contracts
- Corpus-grounded agent for enterprises (license upsell)
- Selective API access to benchmarks/peer data
- Price/SLA tiers for AI-assisted research
AI risks — what stands to lose
- The core product IS advice — generalist AI is a direct substitute
- Seat-based research licenses are the exposed surface
- Conferences/consulting more defensible
Assessment
Valuation & discrepancy
Disc 8 · High
- ~2.1x EV/Sales for a 77%-gross-margin, mostly-recurring franchise
- The multiple embeds a full AI-disruption outcome; CV growth ~1–5% is the operational tell
- Cheapest quality owner on the board on metrics
Convexity & why
High · quality-convex
- Profitable recurring base at ~2x sales bounds the downside
- Large upside if AI proves additive to the franchise
- Cheap quality + two-sided AI = positive convexity
Other endogenous concerns
- Conference/consulting segments are macro-cyclical
- EPS growth leans on buybacks; sales-force productivity in question
Hype factor (market awareness)
High — as threat
- Narrative casts Gartner as an AI casualty; AskGartner and the paywalled corpus get little credit
Catalysts
- Contract-value growth stabilization (the single tell)
- AskGartner engagement disclosures
- Buyback pace
Data
Nature of the data
Data 9 · High
- Westlaw: case law, statutes, annotations built over a century
- Editorial headnotes/KeyCite are irreplicable human layers
- Practical Law, Checkpoint (tax), Reuters News
- Legal/tax = highest-value, lowest-hallucination-tolerance use cases
Data trajectory (stock vs flow)
Steady compounding
- Case law grows with the courts — slow, perpetual accretion
- Editorial annotations (headnotes/KeyCite) compound on top
Position on the AI-unlock curve
AI 8 · High
- CoCounsel scaling fast — ~1M AI users
- AI-native Westlaw does grounded retrieval over its corpus
- Monetizes the data itself
- High — clear legal-AI leader
Current AI contracts & counterparties
✓ deep dive
- No corpus licensing — deliberate walled strategy
- CoCounsel: 1M professionals, 107 countries (Feb 2026) (PR)
- Building proprietary LLM for regulated use cases
Possibilities for additional contracts
- Selective agent-platform access to Westlaw (MCP-style)
- CoCounsel 10x user target = the in-product unlock
- Tax/audit agentic suites later in 2026
AI risks — what stands to lose
- Legal research workflow is the AI battleground — Harvey, Legora, generalist agents
- Westlaw seat pricing under pressure if agents do the research
- Reuters news commoditized by AI summarization
Assessment
Valuation & discrepancy
Disc 3 · Low
- ~9x EV/Sales on +7% growth — a modest AI-threat discount against its quality
- CoCounsel at 1M users is distribution the multiple under-credits
- Premium franchise; the discount is partial, not deep
Convexity & why
Low
- Priced quality; AI leadership reflected
- Limited convexity
Other endogenous concerns
- Woodbridge (Thomson family) controls ~70% — governance is theirs
- Print/legacy declines largely done; tax season concentration
Hype factor (market awareness)
High — as threat
- Market narrative treats agentic AI as a threat to legal-research seats; CoCounsel distribution under-credited
Catalysts
- CoCounsel next-gen GA + adoption metrics
- ACV growth reacceleration (the proof point)
- Competitive data vs Harvey/Legora/Claude Cowork
Credit · identity · risk data
Data
Nature of the data
Data 8 · High
- The Work Number — unique employer-sourced income/employment records
- Verified income/employment ground-truth no LLM can infer
- Utility/telecom payment data extends the picture
- Gating data for lending, hiring, benefits
- Contributory — employers feed it (network effects)
Data trajectory (stock vs flow)
Compounding
- The Work Number records keep growing via payroll integrations
- Every paycheck is a new record — true flow asset
Position on the AI-unlock curve
AI 6 · Neutral
- EFX.AI built into new product models
- FCRA permissible-purpose rules cap AI exposure
- Monetization stays inside regulated rails
- Mid — gated by regulation, not capability
- Re-rate is cyclical more than AI-driven
Current AI contracts & counterparties
~ desk note
- EFX.AI in-product; FCRA limits external exposure
Possibilities for additional contracts
- Verified-income rails for lending/hiring agents (permissioned)
AI risks — what stands to lose
- AI cash-flow underwriting could route around bureau scores at the margin
- AI-driven synthetic-identity fraud raises cost of trust
Assessment
Valuation & discrepancy
Disc 7 · High
- ~$20B cap, ~4.0x EV/Sales on +7% growth
- Cheap for the owner of The Work Number
- Re-rates on the lending/hiring cycle + verified-income AI demand
Convexity & why
High
- Unique income/employment data at a low multiple
- FCRA caps direct licensing, but the asset is irreplaceable
- Cheap + cyclical-recovery optionality = convex
Other endogenous concerns
- 2017 breach legacy = elevated security/regulatory burden
- Mortgage + hiring volumes are the real earnings driver near-term
- CFPB / FCRA scrutiny is permanent
Hype factor (market awareness)
Low
- AI angle absent; mortgage cycle dominates the narrative
Catalysts
- Mortgage/hiring recovery; TWN records growth; any agent-rail pilots
Data
Nature of the data
Data 8 · High
- Third global credit bureau + marketing/identity/fraud data
- Best organic growth of the three bureaus
- Verified credit/identity data with network effects
Data trajectory (stock vs flow)
Growing
- Same bureau flow; strongest organic data investment of the three
Position on the AI-unlock curve
AI 6 · Neutral
- AI products across credit & fraud
- FCRA-style rules cap ecosystem exposure
- Mid on the curve
Current AI contracts & counterparties
~ desk note
- Ascend platform AI; in-product
Possibilities for additional contracts
- Same permissioned-rails option as EFX/TRU
AI risks — what stands to lose
- Same as the other bureaus; strongest product diversification of the three
Assessment
Valuation & discrepancy
Disc 6 · Neutral
- Reasonable bureau multiple
- Only friction is access (London listing)
- Quality peer to EFX/TRU
Convexity & why
Moderate
- Quality + reasonable price
- Regulation caps the convex upside
- Balanced
Other endogenous concerns
- UK listing discount; Brazil FX exposure
Hype factor (market awareness)
Low
- UK listing keeps it out of the AI conversation
Catalysts
- Cycle; NA mortgage volumes
Data
Nature of the data
Data 6 · Neutral
- The FICO score — decisioning standard embedded in US credit
- More algorithm/standard than raw corpus
- But the score is a data product with monopoly economics
Data trajectory (stock vs flow)
Derived flow
- Scores recompute on bureau flow; FICO originates little raw data
Position on the AI-unlock curve
AI 6 · Neutral
- Own FFM foundation model
- AI lending agents still need an accepted standard
- Mortgage-pricing change is a catalyst
- Not a corpus play
Current AI contracts & counterparties
~ desk note
- FICO Foundation Model (FFM) announced; platform AI
Possibilities for additional contracts
- Score-as-API inside lending agents
AI risks — what stands to lose
- The central AI risk case: AI-native underwriting bypassing the Score
- Lenders' in-house models + FHFA score competition (VantageScore 4.0)
Assessment
Valuation & discrepancy
Disc 5 · Neutral
- ~14x EV/Sales on +15% growth — still premium on metrics
- Moat contested (VantageScore push, AI underwriting)
- Two-sided
Convexity & why
Moderate
- De-rated standard with mortgage-pricing optionality
- But expensive on sales (~14x)
- Two-sided
Other endogenous concerns
- Pricing-power backlash: FHFA pushing VantageScore competition in mortgages
- Revenue concentrated in B2B scores; software segment unloved
Hype factor (market awareness)
Med
- Debate is pricing power, not AI
Catalysts
- Mortgage-score pricing; platform ARR
Data
Nature of the data
Data 7 · High
- Identity graph & data-collaboration network (25k+ publishers)
- Clean-room identity for the post-cookie/AI-data era
Data trajectory (stock vs flow)
Maintained
- Identity graph is refresh-maintenance, not accumulation
Position on the AI-unlock curve
AI 6 · Neutral
- Well-placed for AI-data era
- But the story is now M&A
Current AI contracts & counterparties
~ desk note
- Identity/clean-room infra relevant to AI data flows
Possibilities for additional contracts
AI risks — what stands to lose
- Acquisition pending — risk transfers to Publicis
Assessment
Valuation & discrepancy
Disc 2 · Low
- Being acquired by Publicis for ~$2.5B
- Off the board as a standalone bet
- Signal: ad-holdcos paying up for identity data
Convexity & why
Low
- Taken out — payoff capped by the deal price
Other endogenous concerns
- Deal-close risk is the only variable left (~$38.50 cash)
Hype factor (market awareness)
Low
- Story is now the Publicis acquisition
Data
Nature of the data
Data 7 · High
- Credit bureau + identity resolution (Neustar)
- Links offline identity to digital identifiers
- Identity graphs matter more as AI agents transact
- Contributory bureau data with network effects
Data trajectory (stock vs flow)
Growing
- Credit + identity events flow with economic activity
Position on the AI-unlock curve
AI 5 · Neutral
- OneTru platform, TruIQ agents
- Identity products quietly AI-relevant
- FCRA-capped exposure like Equifax
- Mid on the curve
Current AI contracts & counterparties
~ desk note
- OneTru platform, TruIQ agents; in-product
Possibilities for additional contracts
- Identity verification for AI-agent transactions
AI risks — what stands to lose
- Same bypass risk as EFX; identity products partly hedge it
Assessment
Valuation & discrepancy
Disc 6 · Neutral
- Cheapest of the three bureaus
- Modest favorable gap
- Same regulatory ceiling
Convexity & why
Moderate
- Cheapest bureau + cycle/identity optionality
- FCRA caps the convex upside
- Balanced
Other endogenous concerns
- Neustar deal leverage; UK consumer business weak
- Same CFPB overhang
Hype factor (market awareness)
Low
- Same as EFX — cycle story, not AI story
Catalysts
- Cycle turn; Neustar identity products
Data
Nature of the data
Data 9 · High
- Decades of contributory claims, loss & property/peril data
- Nearly all US P&C insurers both feed and buy it back
- Catastrophe models built on the loss history
- Near-monopoly; no AI lab can rebuild it
Data trajectory (stock vs flow)
Steady compounding
- Contributory model: every insurer claim feeds it, by contract
- Cat-event data grows with each season
Position on the AI-unlock curve
AI 5 · Neutral
- Generative/agentic AI in underwriting/claims products
- Consortium-locked — not licensed to the open ecosystem
- Value unlock in-product, not via licensing
- Mid — deepest moat, deliberately walled
Current AI contracts & counterparties
~ desk note
- Consortium AI in underwriting/claims products
Possibilities for additional contracts
- Walled option: claims-history grounding for insurance agents
AI risks — what stands to lose
- Insurers building AI on their own claims data could weaken the consortium pull
Assessment
Valuation & discrepancy
Disc 5 · Neutral
- Deep moat, but ~12x sales / ~7% growth
- Fully paid for
- Quality High, value Low-ish
Convexity & why
Low–Moderate
- Near-monopoly data, but premium & walled
- Bounded downside, limited upside
- Low asymmetry
Other endogenous concerns
- Consortium members push back on pricing; class actions over contributory data use
Hype factor (market awareness)
Low-Med
- Quality priced; AI not separately valued
Catalysts
- Product attach; pricing renewals
Healthcare · life-sciences data
Data
Nature of the data
Data 5 · Neutral
- Pharmacy/dispensing & distribution data
- Optimizes thin-margin logistics
Data trajectory (stock vs flow)
Steady flow
- Distribution data tracks volumes
Position on the AI-unlock curve
AI 3 · Low
- Logistics input, not sold
Current AI contracts & counterparties
~ desk note
Possibilities for additional contracts
AI risks — what stands to lose
- Low — physical distribution
Assessment
Valuation & discrepancy
Disc 4 · Neutral
- Fair defensive distributor
- Data-rich, not a data owner
Convexity & why
Low
- Defensive, data not a driver
Other endogenous concerns
- Drug-pricing policy; thin-margin model
Hype factor (market awareness)
Low
- Not an AI story
Data
Nature of the data
Data 7 · High
- Healthcare commercial intel: providers, claims, affiliations, install-base
- 'The ZoomInfo of healthcare' — sells intelligence to life-sciences/med-tech
- A pure data owner, not a marketplace
- Continuously refreshed healthcare-entity graph
Data trajectory (stock vs flow)
Slowing
- Refresh continues but shrinking revenue funds less data collection
Position on the AI-unlock curve
AI 6 · Neutral
- Real owner, but AI is as much threat as tailwind
- Limited AI productization so far
- Mid/behind — business being repriced
- Erosion risk from AI-generated provider signal
Current AI contracts & counterparties
~ desk note
Possibilities for additional contracts
- Healthcare-commercial grounding data for pharma AI
AI risks — what stands to lose
- AI-generated provider intelligence directly substitutes the core product — erosion already visible
Assessment
Valuation & discrepancy
Disc 6 · Neutral
- ~$0.1B cap, ~2x EV/Sales on declining revenue
- Distressed micro-cap; the data is better than the equity
- Cheap for existential reasons
Convexity & why
High · distressed
- Distressed micro-cap → option on stabilization or M&A
- Declining revenue is the live left tail
- Cheap healthcare-commercial data if it survives
Other endogenous concerns
- PE overhang (Advent), serial goodwill writedowns, micro-cap liquidity
Hype factor (market awareness)
Low
- Micro-cap; no AI narrative attaches
Catalysts
- Revenue stabilization; strategic review odds
Data
Nature of the data
Data 6 · Neutral
- Verified network of most US physicians
- The asset is the audience/engagement, not a corpus
- Workflow tools for doctors
Data trajectory (stock vs flow)
Saturated graph
- Most US physicians already on it — the graph is mature
- Engagement/newsfeed data still grows; the asset is breadth, not flow
Position on the AI-unlock curve
AI 6 · Neutral
- Strong AI tools (Doximity GPT), huge engagement
- But no AI revenue in guidance
- Data asset is the audience, not a corpus
Current AI contracts & counterparties
~ desk note
- Doximity GPT free for physicians; ad AI in-product
Possibilities for additional contracts
- Clinician-verified channel for healthcare AI distribution
AI risks — what stands to lose
- Physician attention shifting to AI clinical tools (OpenEvidence et al.)
- Pharma ad budgets could follow attention into AI channels
Assessment
Valuation & discrepancy
Disc 5 · Neutral
- ~$3.8B cap, ~5.6x EV/Sales on +13% growth
- Far from the ~18x I'd assumed — reasonable now
- Verified clinician graph; audience-not-corpus caps licensing
Convexity & why
Moderate
- Verified clinician graph + AI tools, now at a fair multiple
- Audience-not-corpus caps the data-licensing upside
- Balanced after the de-rate
Other endogenous concerns
- Pharma ad-budget concentration; engagement metrics are the whole story
Hype factor (market awareness)
Med
- Was priced for AI hopes; now reset to fair
Catalysts
- Ad market; AI tool engagement
Data
Nature of the data
Data 6 · Neutral
- Claims/care-management data via Carelon
- Latent separable data asset
- Used to lower its own medical costs
Data trajectory (stock vs flow)
Steady flow
- Claims flow with membership; flat membership = flat flow
Position on the AI-unlock curve
AI 4 · Neutral
- AI care-management lowers internal costs
- Closest operator to a separable data asset
- Still not pure-play
Current AI contracts & counterparties
~ desk note
Possibilities for additional contracts
- Separable claims-data asset (never signaled)
AI risks — what stands to lose
- Low direct risk; AI mostly a cost lever
Assessment
Valuation & discrepancy
Disc 5 · Neutral
- Cheap, but on insurer fundamentals
- Latent data optionality (Carelon)
- Cyclical
Convexity & why
Moderate
- De-rated insurer with latent data optionality
- Cyclical, not a data re-rate
- Mildly positive
Other endogenous concerns
- Medical-cost trend + Medicaid redeterminations; ACA subsidy politics
Hype factor (market awareness)
Low
- Insurer story
Catalysts
- Medical-cost trend; Carelon growth
Data
Nature of the data
Data 5 · Neutral
- Rx-pricing & consumer prescription-behavior data
- Unique data, but an input to a discount platform
- Platform under structural pressure
Data trajectory (stock vs flow)
Steady flow
- Pricing data flows; nothing accumulating in value
Position on the AI-unlock curve
AI 4 · Neutral
- Data feeds the platform; not licensed as a corpus
- Limited AI productization
Current AI contracts & counterparties
~ desk note
Possibilities for additional contracts
- Rx-pricing data into consumer-health agents
AI risks — what stands to lose
- AI agents compare drug prices directly, disintermediating the front end
Assessment
Valuation & discrepancy
Disc 4 · Neutral
- Cheap, but pressured core
- Marginal owner with hard-to-monetize data
Convexity & why
Moderate · binary
- Cheap with stabilization optionality
- But structural pressure on the core
- Binary-ish
Other endogenous concerns
- PBM dependence — a single partner change (Kroger '22) cratered it once
Hype factor (market awareness)
Low
- No AI narrative
Data
Nature of the data
Data 9 · High
- Liquid-biopsy genomic + clinical-outcomes data in oncology
- Proprietary, scarce — a direct Tempus peer
- Longitudinal molecular profiles track tumor evolution
- Cannot be assembled from public sources
Data trajectory (stock vs flow)
Compounding fast
- Test volumes +25–35%/yr; each test extends longitudinal profiles
Position on the AI-unlock curve
AI 6 · Neutral
- Pharma data partnerships + co-development, earlier-stage
- Smart Platform multiomic insights
- Building the 'co-develop on our data' motion
- Mid — monetization layer still forming
Current AI contracts & counterparties
~ desk note
- Pharma data partnerships (earlier-stage than Tempus); Smart Platform
Possibilities for additional contracts
- Tempus-style co-builds on liquid-biopsy data
AI risks — what stands to lose
- Interpretation commoditizes; raw assay + outcomes data is the defensible part
Assessment
Valuation & discrepancy
Disc 5 · Neutral
- Scarce data, but ~12x sales and unprofitable
- Analyst upside exists
- Expensive growth, not cheap
Convexity & why
Moderate
- Scarce-data optionality, but ~12x sales + unprofitable cap it
- More a growth bet than an option
- Balanced, positive tilt
Other endogenous concerns
- Cash burn continues; patent litigation history with Natera
- Screening (Shield) economics still unproven at scale
Hype factor (market awareness)
Med
- Priced as diagnostics growth; data angle secondary
Catalysts
- MRD reimbursement; pharma deal announcements
Data
Nature of the data
Data 9 · High
- World's largest pharmacy-claims & prescription dataset (ex-IMS Health)
- Population-scale real-world evidence across global Rx
- Clinical-trial operational data as the largest CRO (ex-Quintiles)
- De-identified, compliance-grade — built under HIPAA/GDPR, unscrapeable
- Sold to virtually every major pharma
Data trajectory (stock vs flow)
Compounding
- Rx/claims flow is continuous and population-scale
- Trial operational data compounds with every study run
Position on the AI-unlock curve
AI 7 · High
- IQVIA.ai unified agentic platform (Mar 2026): 150+ agents deployed
- NVIDIA partnership since Jan 2025 — custom foundation models on its data
- 19 of top 20 pharma already using IQVIA agents; 100+ AI patents
- Builds agents ON the data rather than licensing it out
- No longer latent — monetization architecture is live
Current AI contracts & counterparties
✓ deep dive
- NVIDIA partnership (Jan 2025) → IQVIA.ai platform, Mar 2026 (PR)
- 150+ agents live; 19 of top-20 pharma using them (report)
- 100+ AI patents; agents built ON proprietary data, not licensed out
Possibilities for additional contracts
- Agent subscriptions as a separate revenue line
- RWE feeds for medical LLMs (compliance-wrapped)
- Trial-design agents priced on outcomes
AI risks — what stands to lose
- CRO services half is labor-heavy — AI compresses what pharma will pay for it
- Pharma in-housing analytics with AI tools
Assessment
Valuation & discrepancy
Disc 7 · High- $16.3B FY25 rev, +5.9% (~7% TTM)
- Low-single-digit sales multiple for unique data
- ~$13B net debt is the caveat
Convexity & why
High- ~2.7x EV/Sales on +6% growth — cheap for the scarcest Rx data
- Locked in compliance contracts; low AI surface today
- Cheap + latent-unlock optionality = convex
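The multiple arithmetic on this card can be checked in a few lines. The sketch below is a back-of-envelope illustration in Python, using only the rounded figures quoted above (~2.7x EV/Sales, $16.3B FY25 revenue, ~$13B net debt). The function names are hypothetical, and the implied equity value is derived from those roundings rather than taken from a market quote.

```python
def implied_ev(ev_to_sales: float, revenue_bn: float) -> float:
    """Enterprise value ($B) implied by an EV/Sales multiple."""
    return ev_to_sales * revenue_bn


def implied_equity(ev_bn: float, net_debt_bn: float) -> float:
    """Equity value ($B) left after subtracting net debt from EV."""
    return ev_bn - net_debt_bn


# Rounded inputs taken from the card (assumptions, not precise filings data).
EV_TO_SALES = 2.7
REVENUE_BN = 16.3
NET_DEBT_BN = 13.0

ev = implied_ev(EV_TO_SALES, REVENUE_BN)
equity = implied_equity(ev, NET_DEBT_BN)
debt_share_of_ev = NET_DEBT_BN / ev

print(f"Implied EV: ~${ev:.1f}B")
print(f"Implied equity: ~${equity:.1f}B")
print(f"Net debt as share of EV: {debt_share_of_ev:.0%}")
```

On these inputs roughly three tenths of the enterprise value is debt, which is why the card treats leverage as the main caveat to the low multiple.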
Other endogenous concerns
- ~$13B net debt limits flexibility
- CRO bookings cyclical; pharma R&D budgets squeezed (IRA effects)
Hype factor (market awareness)
Low → rising- Cheapest scarce-data name; IQVIA.ai barely registers in the multiple yet
Catalysts
- Next earnings: ~late July 2026 (Q1 reported May 5 — beat; EPS guide raised)
- IQVIA.ai adoption: now 192 agents / 64 use cases; watch for monetization disclosure (Q1 call)
- R&DS backlog $32.7B (+5.3%); Q4 book-to-bill 1.18x — bookings reacceleration is the proof point
- $1.2B buyback remaining ($552M done in Q1)
- Duke obesity-trials collaboration (Feb 2026) — fastest-growing trial category
- De-leveraging from 3.62x / $13.9B net debt frees the multiple
Data
Nature of the data
Data 9 · High- Genetic-testing / cfDNA data (MRD, prenatal, transplant)
- Large, fast-growing proprietary genomic dataset
- Outcome-linked longitudinal data is the durable asset
- Same scarce-data position as Guardant/Tempus
Data trajectory (stock vs flow)
Compounding fast- Fastest test-volume growth in the group; outcome links accrue with time
Position on the AI-unlock curve
AI 6 · Neutral- Owns the data; data-layer monetization still maturing
- Strong clinical-validation pipeline feeds the dataset
- Files as diagnostics, so screens miss it
- Mid on the curve
Current AI contracts & counterparties
~ desk note- Data feeds pharma trials; in-product AI
Possibilities for additional contracts
- Outcome-linked genomic licensing
AI risks — what stands to lose
- Same as GH — value migrates from interpretation to the longitudinal data
Assessment
Valuation & discrepancy
Disc 5 · Neutral- Irreplaceable data, ~12x sales on ~30% growth
- Quality High; multiple says priced, not discounted
- Volatile equity
Convexity & why
Moderate- Data optionality vs a rich ~12x multiple
- Roughly balanced, slight positive tilt
Other endogenous concerns
- Reimbursement concentration (Medicare MRD decisions)
- Billing-practice scrutiny; GH litigation
Hype factor (market awareness)
Med- Same as GH: growth story, data unpriced
Catalysts
- MRD adoption; new indications
Data
Nature of the data
Data 9 · High- Multimodal clinical + genomic data (~300PB) pairing sequencing with clinical records
- Scarcest, most valuable category for biomedical AI — unscrapeable
- Built explicitly as an AI data company
- 140% net revenue retention on Insights/data
- Linked outcomes data is what makes it irreplaceable
Data trajectory (stock vs flow)
Compounding fast- ~300PB and growing; every test adds linked clinical+genomic data (Q1 letter)
- Sequencing volumes growing ~25–30%/yr — the corpus is the byproduct of revenue
Position on the AI-unlock curve
AI 8 · High- $200M AstraZeneca/Pathos deal (Apr 2025): largest oncology foundation model
- Total remaining contract value >$1B; non-exclusive — can resell the motion
- Data customers: AZ, Novartis, Merck KGaA, Takeda, Boehringer, United Therap.
- Illumina collaboration trains genomic algorithms on its multimodal data
- Insights (data licensing) growing ~58%
Current AI contracts & counterparties
✓ deep dive- $200M AstraZeneca/Pathos data+model deal over 3 yrs (PR)
- Total remaining contract value >$1B (Q1 letter)
- Data customers: Novartis, Merck KGaA, Takeda, Boehringer, United Therap.
- Illumina algorithm-training collaboration
Possibilities for additional contracts
- Non-exclusive foundation-model co-builds with other pharma
- Expansion beyond oncology (cardio, neuro)
- Open-source pathology consortium as a funnel
AI risks — what stands to lose
- Pharma could in-house modeling after learning from co-builds
- Interpretation layer could commoditize; the data itself is the hedge
Assessment
Valuation & discrepancy
Disc 6 · Neutral- ~$8.5B cap, ~6.5x EV/Sales on +83% growth
- Strikingly cheap for the growth + scarcest biomedical data
- Priced like a normal growth co, not the data monopoly it's building
Convexity & why
High · growth optionality- Foundation-model + licensing optionality could make it the oncology-AI data layer
- Rich multiple + cash burn are the downside
- Large, real optionality = convex growth bet
Other endogenous concerns
- Founder super-voting control; Pathos is Lefkofsky-affiliated (related-party optics on the $200M deal)
- Convertible debt + only just adj-EBITDA positive
- Short-seller scrutiny history (data-quality claims)
Hype factor (market awareness)
High- AI is in the name and the multiple, but >$1B RCV arguably still under-modeled
Catalysts
- Next earnings: ~early Aug 2026 (Q1 reported May 5 — guidance raised) (Q1 8-K)
- 2026 guide raised to $1.59–1.60B revenue / ~$65M adj EBITDA — the leverage inflection
- MRD volume ~6,500 tests in Q1, +500% YoY — reimbursement decisions are the swing
- TCV >$1.1B; 70+ pharma data customers — watch new (non-exclusive) co-builds
- Insights (data licensing) +44% in Q1 — the annuity compounding
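Two of the catalyst figures above imply numbers the card does not spell out: the adjusted EBITDA margin in the raised guide, and the prior-year MRD volume behind "+500% YoY". A minimal Python sketch, assuming the rounded inputs quoted on this card; the variable names are illustrative.

```python
# Raised 2026 guide, in $M (rounded figures from the card).
guide_revenue_low = 1590.0
guide_revenue_high = 1600.0
guide_adj_ebitda = 65.0

# The margin range uses the opposite ends of the revenue range.
margin_low = guide_adj_ebitda / guide_revenue_high
margin_high = guide_adj_ebitda / guide_revenue_low

# "+500% YoY" means the current figure is 6x the prior-year figure.
mrd_tests_q1 = 6500
yoy_growth = 5.0
mrd_tests_prior_q1 = mrd_tests_q1 / (1 + yoy_growth)

print(f"Implied adj. EBITDA margin: {margin_low:.1%} to {margin_high:.1%}")
print(f"Implied prior-year Q1 MRD tests: ~{mrd_tests_prior_q1:.0f}")
```

A margin of about 4% is thin in absolute terms, which fits the card's description of the company as only just adjusted-EBITDA positive.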
Data
Nature of the data
Data 7 · High- Life-sciences CRM + proprietary OpenData/Link (HCP & reference data)
- Pharma depends on its reference data
- A separable corpus inside the SaaS
Data trajectory (stock vs flow)
Growing- OpenData/Link refreshed continuously; usage data grows with seats
Position on the AI-unlock curve
AI 7 · High- AI embedded in pharma workflows
- Up the curve
- Vertical-SaaS leader
Current AI contracts & counterparties
✓ deep dive- AI agents shipping across CRM/Vault (Dec 2025 wave)
- OpenData/Link reference data feeds its own AI
Possibilities for additional contracts
- Agent pricing on top of seats
- Link data into pharma AI pipelines
AI risks — what stands to lose
- Vertical-SaaS pricing under the same agentic pressure as all seats ('SaaS-pocalypse')
- AI app-builders lower barriers to bespoke pharma tools
Assessment
Valuation & discrepancy
Disc 5 · Neutral- Premium SaaS multiple
- Data is a real, under-discussed asset
- Equity priced for quality
Convexity & why
Low- Premium SaaS; data underrated but equity priced
- Limited asymmetry
Other endogenous concerns
- Salesforce→own-platform CRM migration is a multi-year execution risk
- Core TAM maturing; growth depends on new apps
Hype factor (market awareness)
Med- Read as a quality SaaS with AI features, not a data owner
Catalysts
- Agent adoption metrics
- Vault CRM migration completion
Consumer · user-generated · marketplace data
Data
Nature of the data
Data 4 · Neutral- Transactional used-car e-commerce & trade data
- Tunes its own pricing/inventory
Data trajectory (stock vs flow)
Growing- Transaction/pricing data grows with units; internal
Position on the AI-unlock curve
AI 3 · Low
Current AI contracts & counterparties
~ desk note
Possibilities for additional contracts
AI risks — what stands to lose
Assessment
Valuation & discrepancy
Disc 3 · Low- Volatile, richly valued
- Weak fit for the screen
Convexity & why
Low- High beta but valued on retail growth, not data
Other endogenous concerns
- Garcia family control + related-party history; leverage rebuilt the equity once already
Hype factor (market awareness)
Low- Retail story
Data
Nature of the data
Data 8 · High- Verified CRE comps/property data, 35-yr research army
- LoopNet, Apartments.com, Homes.com
- Unscrapeable, walled inside terminals
Data trajectory (stock vs flow)
Compounding- Research army keeps verifying; comps accumulate permanently
- Zonda adds a housing-data stream
Position on the AI-unlock curve
AI 3 · Low- Walled, litigious; data locked in terminals — minimal AI surface
- Heavy Homes.com ad spend
- Strategic data, low AI surface area
Current AI contracts & counterparties
✓ deep dive- None — deliberately walled; litigious vs scrapers
- Zonda acquisition ($800M) extends housing data
Possibilities for additional contracts
- The big withheld option: licensed CRE grounding for real-estate AI
- Homes.com AI search features
AI risks — what stands to lose
- AI aggregation/scraping pressure on listings; Google entering for-sale listings (BTIG flag)
- Verified CRE comps hardest to substitute
Assessment
Valuation & discrepancy
Disc 6 · Neutral- ~$14B cap, ~4.0x EV/Sales on +19% growth
- Much cheaper than I'd shown; Homes.com spend masks margins
- Unscrapeable CRE data at a reasonable price
Convexity & why
Moderate–High- Unscrapeable CRE data, now cheap
- Low AI surface + heavy ad spend cap near-term
- Re-rate optionality as Homes.com spend rolls off
Other endogenous concerns
- Homes.com spend is an act of will (founder-CEO); activist pressure has surfaced
- Serial litigation posture cuts both ways
Hype factor (market awareness)
Low- AI never part of the story; the data optionality is free at ~4x
Catalysts
- Homes.com spend roll-off (margin catalyst)
- Zonda integration
- Any posture change on data access
Data
Nature of the data
Data 6 · Neutral- One of the largest learning-interaction datasets (50M+ DAU)
- Granular data on how people learn, err & retain across 100+ courses
- Used in-product to tune pedagogy — not licensed
- Value captured as engagement, not a sellable corpus
Data trajectory (stock vs flow)
Compounding- Learning interactions scale with DAUs (50M+, growing)
- Every exercise answered is new pedagogy data
Position on the AI-unlock curve
AI 6 · Neutral- AI-first (Gen-AI 'Max', AI video calls)
- Shipped 148 courses in a year via generative AI
- Unlock shows up as engagement/ARPU, not a licensing line
- Mid — AI deepens the product moat
Current AI contracts & counterparties
✓ deep dive- No outbound data licensing; heavy consumer of OpenAI/GenAI (Max, AI courses)
- 148 AI-generated courses shipped in a year
Possibilities for additional contracts
- Learning-data licensing (never signaled)
- AI-tutor pricing tiers
AI risks — what stands to lose
- ChatGPT as a free language tutor — the central substitution threat
- Defense: gamification + structure, not content
Assessment
Valuation & discrepancy
Disc 6 · Neutral- ~4.0x EV/Sales on +39% growth — cheap on growth metrics
- AI-disruption fear embedded in the multiple
- A growth franchise at a non-growth multiple
Convexity & why
Moderate–High- ~4x on +39% growth bounds the downside if growth holds
- Upside if AI features lift engagement/ARPU
- Positive convexity on metrics
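One way to read "bounds the downside if growth holds" is that revenue growth compresses the multiple even if enterprise value does not move. The Python sketch below illustrates that reading with the card's rounded inputs (~4.0x EV/Sales, +39% growth). It assumes a flat enterprise value and growth sustained at the quoted rate, both simplifications, and the function name is hypothetical.

```python
def forward_ev_to_sales(multiple: float, growth: float, years: int = 1) -> float:
    """EV/Sales after `years` of revenue growth, holding EV constant."""
    return multiple / (1 + growth) ** years


EV_TO_SALES = 4.0   # rounded, from the card
GROWTH = 0.39       # rounded, from the card

one_year = forward_ev_to_sales(EV_TO_SALES, GROWTH)
two_year = forward_ev_to_sales(EV_TO_SALES, GROWTH, years=2)

print(f"EV/Sales today:       {EV_TO_SALES:.2f}x")
print(f"After 1 year, flat EV: {one_year:.2f}x")
print(f"After 2 years, flat EV: {two_year:.2f}x")
```

If growth slows, the compression is smaller, which is the condition the card attaches to the bounded-downside claim.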
Other endogenous concerns
- Founder control; monetization-vs-engagement tension
- Still expensive on earnings even after the crash
Hype factor (market awareness)
High — as threat- Narrative says ChatGPT kills language learning; the AI-first operating model is ignored
Catalysts
- DAU/booking growth stabilization
- Max attach rate
- Energy/engagement metrics
Data
Nature of the data
Data 6 · Neutral- LatAm marketplace purchase + fintech/credit data
- Powers its own ads/lending (input)
Data trajectory (stock vs flow)
Compounding- Purchase + credit data compounds with GMV growth
Position on the AI-unlock curve
AI 3 · Low- AI for marketplace/credit optimization
- Not a sold corpus
Current AI contracts & counterparties
~ desk note- Internal AI for ads/credit
Possibilities for additional contracts
AI risks — what stands to lose
- Low; AI mostly an internal lever
Assessment
Valuation & discrepancy
Disc 3 · Low- Premium growth stock
- Data doesn't re-rate it
Convexity & why
Moderate- High growth, but valued on the business, not the data
Other endogenous concerns
- LatAm FX/political risk; credit-book quality through cycles
Hype factor (market awareness)
Low- Not a data play
Catalysts
- LatAm growth; fintech credit
Data
Nature of the data
Data 7 · High- Viewing/interaction data across ~300M members
- Real moat for recs/greenlighting
- Strictly internal — never licensed
Data trajectory (stock vs flow)
Growing- Viewing data grows with engagement; internal-only
Position on the AI-unlock curve
AI 2 · Low- Never licensed; AI = better curation only
- Internal-use data
Current AI contracts & counterparties
~ desk note- Internal only — never licensed
Possibilities for additional contracts
AI risks — what stands to lose
- GenAI lowers content-production barriers for rivals (long-term)
Assessment
Valuation & discrepancy
Disc 2 · Low- Premium mega-cap on subscriber economics
- n/a as a data play
Convexity & why
Low- Priced mega-cap, data internal
Other endogenous concerns
- Content-spend discipline vs growth; live/sports costs
Hype factor (market awareness)
Low- Recs AI assumed, not valued separately
Data
Nature of the data
Data 9 · High- 100k+ communities, two decades of upvote-ranked human conversation
- Largest archive of authentic opinion, troubleshooting, niche expertise
- Exactly what LLMs lack: recommendations, lived experience, long-tail Q&A
- Surfaces disproportionately in AI answers
- Classified as social media, not 'data services'
Data trajectory (stock vs flow)
Compounding- DAU still growing; posts/comments compound the archive daily
- Two decades of vote-ranked history can't be replicated retroactively
Position on the AI-unlock curve
AI 9 · High- $203M aggregate contract value disclosed at IPO (Google + OpenAI)
- ~$130M/yr run-rate ≈ 10% of revenue; Google ~$60M/yr, OpenAI ~$70M/yr
- #1 most-cited source across AI models (~3x Wikipedia)
- Google renewal under negotiation — pushing usage-based pricing
- Litigates unlicensed scrapers (incl. Perplexity suit)
Current AI contracts & counterparties
✓ deep dive- $203M aggregate disclosed at IPO (TechCrunch)
- Google ~$60M/yr; OpenAI ~$70M/yr ≈ 10% of revenue (SEL)
- 2–3 yr terms struck Jan 2024 — now in renewal window
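The run-rate figures on this card reduce to simple arithmetic. A minimal Python sketch, using only the rounded numbers quoted above (Google ~$60M/yr, OpenAI ~$70M/yr, ~10% of revenue, $203M aggregate over 2 to 3 year terms). The variable names are illustrative, and the implied total revenue is a derived estimate, not a reported figure.

```python
# Rounded annual licensing figures from the card, in $M.
google_per_year = 60.0
openai_per_year = 70.0
licensing_share_of_revenue = 0.10

run_rate = google_per_year + openai_per_year
implied_total_revenue = run_rate / licensing_share_of_revenue

# Aggregate contract value disclosed at IPO, spread over the quoted terms.
aggregate_contract_value = 203.0
avg_if_three_years = aggregate_contract_value / 3
avg_if_two_years = aggregate_contract_value / 2

print(f"Licensing run-rate: ~${run_rate:.0f}M/yr")
print(f"Implied total revenue: ~${implied_total_revenue:.0f}M")
print(f"IPO aggregate per year: ~${avg_if_three_years:.0f}M to ~${avg_if_two_years:.0f}M")
```

On these roundings the $203M aggregate averages roughly $68M to $102M a year, below the ~$130M run-rate, so the two figures are probably measured on different bases or dates. That reading is an inference from the card's own numbers.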
Possibilities for additional contracts
- Google renewal at usage-based rates (mgmt: 'open for business')
- Anthropic / Meta / xAI remain unlicensed
- Dynamic per-citation pricing models
- Int'l + vertical (commerce intent) licensing
AI risks — what stands to lose
- Google AI Overviews already cut logged-out traffic (the 2025 user-growth scare)
- AI-generated content pollution threatens corpus authenticity
- Meta forums app targets the community moat
Assessment
Valuation & discrepancy
Disc 5 · Neutral- Best corpus + fastest unlock, but ~13x sales
- ~65% growth supports it — priced FOR growth
- Quality off the charts; valuation not a gap
Convexity & why
Moderate- Big growth/licensing optionality = upside call
- But ~13x sales means real drawdown if growth slows
- Net mildly positive from the licensing option
Other endogenous concerns
- Community/moderator revolt risk is structural (2023 API blackout precedent)
- Altman's stake = governance optics
- Ad business still ~90% of revenue and competitive
Hype factor (market awareness)
High- The AI-data story IS the stock; renewal terms are the swing
Catalysts
- Google contract renewal & structure (report)
- Scraper litigation incl. Perplexity suit
- Meta forums app traction (the bear case)
- Data-licensing line in quarterly prints
Data
Nature of the data
Data 4 · Neutral- ~1B travel reviews; Viator experiences marketplace
- Widely scraped & substitutable
- Reviews feed AI trip-planning agents
Data trajectory (stock vs flow)
Slowing risk- ~1B cumulative, but contributions follow visits — and AI answers divert visits
- The corpus ages if the flywheel slows
Position on the AI-unlock curve
AI 4 · Neutral- Perplexity partnership (Jan 2025) now a measurable booking channel
- ChatGPT app launch partner (Oct 2025) for trip planning
- Distribution-into-AI strategy, not paid corpus licensing
- Viator + TheFork now >50% of revenue — the real value
Current AI contracts & counterparties
✓ deep dive- Perplexity partnership, Jan 2025 — hotels customer-acquisition channel (PR)
- ChatGPT app launch partner, Oct 2025 (report)
Possibilities for additional contracts
- Paid licensing of the review corpus (currently given for distribution)
- Viator inventory as the bookable layer inside AI agents
AI risks — what stands to lose
- AI trip planners bypass the site entirely; the core metasearch business is the casualty
- Viator/TheFork partially insulated (fulfillment, not discovery)
Assessment
Valuation & discrepancy
Disc 5 · Neutral- ~$1.4B cap, ~0.7x EV/Sales on +3% growth
- Very cheap, but reviews are being disintermediated
- Value is Viator/TheFork, not the review corpus
Convexity & why
Low–Moderate- Cheap but melting
- Weak optionality
Other endogenous concerns
- Viator faces GetYourGuide/Klook competition; legacy metasearch declines
- Post-Liberty structure leaves strategic questions
Hype factor (market awareness)
Med- AI read as existential threat; partnerships seen as defensive, not monetizing
Catalysts
- AI-channel booking disclosures
- Membership program launch
- Viator/TheFork growth (the real value)
Data
Nature of the data
Data 6 · Neutral- ~330M geocoded local-business reviews
- Structured local sentiment for 'best X near me'
- Classified as internet content, not data services
Data trajectory (stock vs flow)
Steady flow — not melting- 22M new reviews in 2025 (vs 21M in '24); corpus 330M, +7% YoY (FY25 PR)
- What's melting is consumption (app engagement), not contribution — yet
- Risk: contribution follows traffic with a lag
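The "steady flow" claim rests on a small calculation: new reviews relative to the existing corpus. A minimal Python sketch with the card's rounded figures (22M new reviews in 2025, 21M in 2024, 330M corpus). It assumes no reviews were removed during the year, which is a simplification, and the variable names are illustrative.

```python
# Rounded figures from the card, in millions of reviews.
corpus_end_2025 = 330.0
new_reviews_2025 = 22.0
new_reviews_2024 = 21.0

# Assumption: corpus growth equals new contributions (removals ignored).
corpus_start_2025 = corpus_end_2025 - new_reviews_2025
corpus_growth = new_reviews_2025 / corpus_start_2025

# Change in the contribution rate itself, year over year.
contribution_change = new_reviews_2025 / new_reviews_2024 - 1

print(f"Corpus growth in 2025: {corpus_growth:.1%}")
print(f"Change in new reviews vs 2024: {contribution_change:.1%}")
```

Both numbers are positive, which is the basis for describing contribution as steady while consumption erodes.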
Position on the AI-unlock curve
AI 6 · Neutral- Signed OpenAI agreement (disclosed Feb 2026)
- Perplexity has used Yelp local data since Mar 2024
- 'Other revenue' +17% on data licensing & transactions
- Expanding Yelp Assistant; Hatch acquisition (AI front-desk)
- Core local-ad business still the eroding center
Current AI contracts & counterparties
✓ deep dive- OpenAI agreement signed (Feb 2026, terms undisclosed) (FY25 PR)
- Perplexity has integrated Yelp local data since Mar 2024
- Data licensing inside 'Other revenue' (+17%)
Possibilities for additional contracts
- More assistant integrations (Gemini, Claude, Alexa-class)
- Usage-priced local-data API
- Transactional referrals from AI answers
AI risks — what stands to lose
- AI assistants answer 'best X near me' without a Yelp visit — ad impressions leak
- Google's AI search squeezes the top of funnel
Assessment
Valuation & discrepancy
Disc 6 · Neutral- ~$1.3B cap, ~0.9x EV/Sales on +3% growth
- <1x sales — cheap, but the ad core is eroding
- AI-distribution optionality vs value-trap risk
Convexity & why
Moderate · binary- Cheap with AI-distribution optionality
- vs an eroding core
- Binary-ish
Other endogenous concerns
- Own antitrust fight with Google (plaintiff) — outcome cuts both ways
- SMB advertiser churn; restaurant/retail ads already shrinking
Hype factor (market awareness)
Low-Med- OpenAI deal is new and barely in the price; story still read as 'Google victim'
Catalysts
- OpenAI deal revenue contribution
- 'Other revenue' growth each quarter
- Services-ads resilience vs AI search
Data
Nature of the data
Data 5 · Neutral- Zestimate + listing data + largest US housing audience
- Much listing data is MLS-shared, not fully proprietary
- Consumer housing intent data
Data trajectory (stock vs flow)
Churning flow- Listings turn over rather than accumulate; Zestimate history compounds quietly
Position on the AI-unlock curve
AI 5 · Neutral- Strong in-app AI; partial data moat
- Real-estate AI agents could use it
- Mid on the curve
Current AI contracts & counterparties
~ desk note- In-app AI (natural-language search); MLS data shared
Possibilities for additional contracts
- Housing-intent data for real-estate AI agents
AI risks — what stands to lose
- AI agents could search listings directly; Zillow's audience moat = the defense
- Low risk to Zestimate itself
Assessment
Valuation & discrepancy
Disc 6 · Neutral- ~$8.6B cap, ~3.1x EV/Sales on +16% growth
- Cheaper than I'd shown; partial (MLS-shared) moat
- Housing-cycle leverage on top
Convexity & why
Moderate- Housing-cycle optionality
- Partial moat caps the data upside
- Balanced
Other endogenous concerns
- NAR commission-settlement reshapes agent economics — its customers' wallets
- Housing-cycle beta; Showcase/mortgage execution
Hype factor (market awareness)
Med- AI features noted, not a data thesis
Catalysts
- Housing cycle; Showcase attach
Peer-reviewed journal publishing data
Data
Nature of the data
Data 10 · High- Elsevier science (The Lancet, Cell, Scopus) — peer-reviewed at scale
- LexisNexis legal + LexisNexis Risk Solutions (identity/fraud)
- Three of the most defensible corpora on earth in one company
- Scientific literature is critical for frontier capability
Data trajectory (stock vs flow)
Growing- Global science output grows mid-single-digit %/yr; submissions rising
- Caveat: AI-generated paper flood is a quality-control burden
Position on the AI-unlock curve
AI 8 · High- Lexis+ AI, Scopus AI, ClinicalKey AI, Protégé all live
- Embeds data in grounded retrieval vs raw training access
- Among the best-positioned grounded-AI owners
- High — productization mature & shipping
Current AI contracts & counterparties
✓ deep dive- No raw licensing; grounded products only
- Lexis+ AI, Scopus AI, ClinicalKey AI, Protégé all shipped
Possibilities for additional contracts
- Elsevier corpus licensing remains a withheld option (big if ever)
- Agent-access tiers to Scopus/Lexis
- Risk-data feeds into KYC agents
AI risks — what stands to lose
- Lexis faces the same legal-AI insurgency as Westlaw
- Elsevier: AI summarization + open access erode subscription rationale
- Risk division most insulated
Assessment
Valuation & discrepancy
Disc 4 · Neutral- ~9x EV/Sales on +7% growth — a small AI-threat discount embedded
- Grounded AI products shipping across all three corpora
- Durable compounder; thesis is durability, not deep value
Convexity & why
Low- Fully valued premium compounder
- Durable, but limited asymmetry either way
Other endogenous concerns
- Open-access mandates (Plan S) pressure Elsevier's model
- Exhibitions segment is cyclical
Hype factor (market awareness)
High — as threat- Same legal-AI threat narrative as TRI; grounded-product execution under-credited
Catalysts
- Lexis+ AI penetration disclosures
- Any Elsevier AI-licensing posture change
- FY guide post-crash
Data
Nature of the data
Data 7 · High- Peer-reviewed STM journals/books; Cochrane co-publishing
- Vetted scientific text — what labs pay for to lift capability
- A 'smaller Elsevier' — quality corpus, narrower than RELX
- Editorial vetting + citation links add provenance
- Proprietary, not freely on the open web
Data trajectory (stock vs flow)
Growing- Submissions +25%, output +13% — the journal flow is accelerating (Q1 PR)
- Caveat: some of that surge is AI-assisted writing — vetting is the product
Position on the AI-unlock curve
AI 7 · High- $92M lifetime AI-licensing revenue; $29M in Q1 FY26 alone
- Anthropic strategic partnership (Sep 2025) + projects with 3 top tech cos
- Recurring inference pilots: pharma, chemical, space-exploration cos
- One of the only names with disclosed, recurring AI revenue
- Recurring AI line gives it proven monetization few peers can show
Current AI contracts & counterparties
✓ deep dive- $92M lifetime AI revenue; $29M in Q1 FY26 (PR)
- Anthropic strategic partnership (Sep 2025)
- Projects with 3 of the largest tech cos (unnamed)
- Recurring inference pilots: pharma, chemical, space
Possibilities for additional contracts
- Convert pilots → recurring corporate R&D subscriptions
- License on behalf of partner publishers (agency model)
- Agent-citation / RAG licensing beyond training
AI risks — what stands to lose
- AI summarization reduces per-article reading; open access erodes paywalls
- AI-written paper flood strains (and ironically validates) peer review
Assessment
Valuation & discrepancy
Disc 7 · High- ~1.9x EV/Sales with disclosed, recurring AI-licensing revenue ($92M lifetime)
- Flat underlying top line is the offset
- Cheap on metrics for the rare proven AI licensor
Convexity & why
Moderate- Low multiple + proven licensing = bounded downside with optional upside
- Flat core growth caps the slope
- Asymmetry modest but positive
Other endogenous concerns
- Library budget pressure + consolidation of academic spend
- Post-divestiture portfolio still re-finding growth
Hype factor (market awareness)
High- AI-licensing story is prominent in coverage; expectations now elevated
Catalysts
- Next earnings: Tue June 16, 2026, pre-market — FY26 Q4 + FY27 guide (notice)
- AI recurring revenue <10% of AI revenue today; mgmt expects the proportion to triple next year (Q3 call)
- OpenEvidence partnership: 5-yr multimillion licensing + Wiley equity stake
- Nexus licensing service at 36 publishing partners — the agency model scaling
- Emerald Publishing acquisition (Jun 2, 2026) adds proprietary research corpus
- Q3 raised margin/EPS guidance to high end; ~4.5% dividend while you wait
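The AI-revenue figures on this card support two quick checks: how much of the lifetime total arrived in the latest quarter, and what "triple" implies for the recurring share. A minimal Python sketch using the card's rounded numbers ($92M lifetime, $29M in Q1 FY26, recurring below 10% of AI revenue). The variable names are illustrative, and because the card gives the recurring share only as an upper bound, the result is an upper bound too.

```python
# Rounded figures from the card, in $M.
lifetime_ai_revenue = 92.0
q1_fy26_ai_revenue = 29.0

# Share of all AI revenue to date that was booked in the latest quarter.
q1_share_of_lifetime = q1_fy26_ai_revenue / lifetime_ai_revenue

# The card says recurring is "<10%" today and expected to triple.
recurring_share_upper_bound_today = 0.10
recurring_share_upper_bound_next_year = 3 * recurring_share_upper_bound_today

print(f"Q1 FY26 share of lifetime AI revenue: {q1_share_of_lifetime:.1%}")
print(f"Recurring share next year, upper bound: {recurring_share_upper_bound_next_year:.0%}")
```

Nearly a third of lifetime AI revenue landing in one quarter shows how lumpy the line still is, which is why the recurring proportion is the figure to watch.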
Data
Nature of the data
Data 9 · High- Legal, tax, health & regulatory information + workflow (CCH, UpToDate)
- UpToDate is a premier point-of-care clinical reference
- Authoritative corpora like RELX/Thomson Reuters
- Subscription, deeply embedded in workflows
Data trajectory (stock vs flow)
Steady flow- Regulatory/tax/clinical updates are a built-in perpetual flow
Position on the AI-unlock curve
AI 7 · High- AI workflow tools shipping across segments
- Same grounded-AI position as RELX/TRI
- Up the curve, productizing its corpus
Current AI contracts & counterparties
~ desk note- AI embedded in UpToDate/CCH; no corpus licensing
Possibilities for additional contracts
- Clinical-grounding deals for medical AI (UpToDate is the prize)
AI risks — what stands to lose
- UpToDate's clinical-reference franchise faces AI-native rivals (e.g. OpenEvidence)
- Tax/legal workflow seats exposed like TRI/RELX
Assessment
Valuation & discrepancy
Disc 3 · Low- Premium compounder
- AI quality understood & paid for
- Durability, not discount
Convexity & why
Low- Durable but fully valued
- Limited asymmetry
Other endogenous concerns
- CEO transition (long-tenured McKinstry era ended)
- Health segment competition intensifying
Hype factor (market awareness)
Med- Quality understood; AI optionality not separately priced
Catalysts
- UpToDate AI products; FY guide
Research analytics · IP · content data
Data
Nature of the data
Data 7 · High- Web of Science — citation graph linking ~2B scientific citations
- Derwent (patents) + Cortellis (drug-pipeline intelligence)
- ProQuest academic content: dissertations, archives, ebooks
- Valuable for research/IP agents — 'a poor man's Elsevier'
- Data quality seen as better than the company's execution
Data trajectory (stock vs flow)
Steady flow- Citations/patents grow with global publishing — steady, not accelerating
Position on the AI-unlock curve
AI 5 · Neutral- Signed access deals (Anthropic) + MCP exposure
- AI research assistants in pipeline, slow to ship
- Citation + patent networks useful for IP/research AI
- ~$4.5B net debt constrains reinvestment
- Behind on the curve — data ready before the company
Current AI contracts & counterparties
✓ deep dive- Anthropic access agreement + MCP exposure for Web of Science
- No disclosed $; debt limits investment
Possibilities for additional contracts
- Patent/citation grounding for research agents
- ProQuest licensing to labs
AI risks — what stands to lose
- AI literature tools (Elicit, Semantic Scholar) bypass Web of Science discovery
- Patent search AI-commoditized
Assessment
Valuation & discrepancy
Disc 7 · High- ~2.3x EV/Sales (EV ~$5.7B, mostly debt) on $2.46B rev
- Equity (~$1.5B) is a small levered stub
- Cheap on sales, but the debt is the risk
Convexity & why
High · distressed option- Small equity stub over ~$4.5B debt ≈ a call option on the enterprise
- Bounded loss, multi-bagger upside if it de-levers/monetizes
- Convex but a high-probability left tail — size accordingly
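The "call option on the enterprise" framing can be made concrete by treating equity as the residual after debt and shocking enterprise value. The Python sketch below uses the card's rounded figures (~$1.5B equity, ~$4.5B debt). Note that these sum to $6.0B, slightly above the ~$5.7B EV quoted above, so the roundings do not reconcile exactly. The function names and the shock sizes are illustrative assumptions.

```python
def equity_value(ev_bn: float, debt_bn: float) -> float:
    """Residual equity ($B), floored at zero because losses are bounded."""
    return max(ev_bn - debt_bn, 0.0)


# Rounded inputs from the card (assumptions, not precise filings data).
DEBT_BN = 4.5
BASE_EQUITY_BN = 1.5
BASE_EV_BN = BASE_EQUITY_BN + DEBT_BN


def equity_change(ev_shock: float) -> float:
    """Fractional change in equity for a fractional change in EV."""
    shocked = equity_value(BASE_EV_BN * (1 + ev_shock), DEBT_BN)
    return shocked / BASE_EQUITY_BN - 1


for shock in (-0.30, -0.10, 0.10, 0.30):
    print(f"EV {shock:+.0%} -> equity {equity_change(shock):+.0%}")
```

On these inputs each 10% move in enterprise value moves the equity by about 40%, and a 30% decline wipes it out. That is the shape the card describes: bounded loss, levered upside, and a left tail that is far from remote.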
Other endogenous concerns
- ~$4.5B debt wall dominates everything
- PE overhang; serial restructurings and writedowns
Hype factor (market awareness)
Low- Debt story drowns the data story entirely
Catalysts
- De-leveraging milestones
- Any AI-licensing disclosure
- Segment divestitures
Data
Nature of the data
Data 8 · High- ~500M licensed, rights-cleared, caption-annotated images & video
- Exclusive editorial archives spanning a century
- iStock + Unsplash extend the catalog across tiers
- Rights-cleared image–text pairs = ideal multimodal training data
- Legal indemnification is the product AI builders need
Data trajectory (stock vs flow)
Strong flow + archive- 160k+ events covered/yr; ~600k creators; thousands of assets ingested daily (Q2 PR)
- Editorial is a daily flow machine, not just a vault — FY25 grew both segments
- Risk is creative-side inflow: genAI erodes contributor economics
Position on the AI-unlock curve
AI 6 · Neutral- Perplexity multi-yr display deal (Oct 2025)
- Generative tools with NVIDIA; licensed-data posture vs scrapers
- Shutterstock merger (UK-cleared May 2026) adds its lab licensing deals
- Litigation (Stability AI) continues to define the rights frontier
- Licensing not yet replacing what AI takes from stock demand
Current AI contracts & counterparties
✓ deep dive- Perplexity multi-yr display deal, Oct 2025 — undisclosed $ (PR)
- NVIDIA-powered licensed generative tools (Getty/iStock)
- Shutterstock brings lab deals (OpenAI, Meta, Apple, Amazon) post-merger
Possibilities for additional contracts
- Post-merger: consolidated licensed-visual-data vendor to every lab
- Display/attribution deals with other AI search products
- Indemnified training data as a product line
AI risks — what stands to lose
- GenAI image substitution is already in the creative numbers
- Editorial (real events) is the un-generatable refuge
Assessment
Valuation & discrepancy
Disc 7 · High- ~$0.3B cap — a deep-distress equity stub over ~$1.3B+ debt
- ~1.5x EV/Sales on ~$0.9B revenue
- Cheap + levered = a lottery ticket on the data
Convexity & why
High · lottery- Distressed, levered equity on ideal data — near-binary
- Multiplies on a licensing/M&A catalyst, or drifts to zero
- Steeply convex, lowest-conviction high-convexity name
Other endogenous concerns
- ~$1.3B+ debt; controlled company (Getty family + Koch)
- Shutterstock integration risk; CMA found UK editorial concerns (remedies)
Hype factor (market awareness)
Med-High- Every AI headline attaches to it; the balance sheet, not awareness, is the constraint
Catalysts
- Shutterstock merger close (UK-cleared May 2026)
- Combined AI-licensing revenue line
- Stability AI litigation outcomes
- Debt refinancing
Data
Nature of the data
Data 6 · Neutral- Education content, assessment & learning-outcome data
- Proprietary curriculum + testing content
- Education is an AI-disruption epicenter
Data trajectory (stock vs flow)
Steady flow- Assessment/courseware data flows with enrollment
Position on the AI-unlock curve
AI 5 · Neutral- AI partnerships to license/embed content
- Two-sided disruption: tutoring threat + licensing optionality
- Mid on the curve
Current AI contracts & counterparties
~ desk note- AI partnerships announced 2025 with Microsoft, Google Cloud & AWS for learning products
Possibilities for additional contracts
- Curriculum licensing into AI tutors; assessment data moats
AI risks — what stands to lose
- AI tutors substitute courseware — the existential half of the two-sided story
- Assessment/credentialing more defensible
Assessment
Valuation & discrepancy
Disc 5 · Neutral- Disruption discount
- Optionality + threat both real
- Owner with genuine two-sidedness
Convexity & why
Moderate- Content-licensing/AI-tutoring optionality
- vs a real disruption threat
- Two-sided convexity
Other endogenous concerns
- Enrollment cliffs + OPM decline in higher ed
- Multi-year strategic rebuild under newer CEO
Hype factor (market awareness)
Med- Two-sided: tutoring threat vs licensing option
Catalysts
- Enrollment; AI products; partnership revenue
Geospatial · sensor data
Data
Nature of the data
Data 8 · High- High-frequency satellite imagery + Spectra AI geospatial intelligence
- Rapid-revisit imagery over own and third-party sensors
- Growing multi-year defense backlog
- Same theme as Planet, earlier in scaling
Data trajectory (stock vs flow)
Compounding- Constellation growth (Gen-3) raises capture rate; archive accrues
Position on the AI-unlock curve
AI 7 · High- $100M+, 7-yr international defense contract (Jan 2025)
- $30M+ multi-year Gen-3 tactical ISR deal (Q3 2025)
- Backlog $323M, 91% international
- Spectra AI analytics layer over own + third-party sensors
- Same curve as Planet, earlier and cheaper-cap stage
Current AI contracts & counterparties
✓ deep dive- $100M+, 7-yr int'l defense contract, Jan 2025 (PR)
- $30M+ Gen-3 tactical ISR deal (Q3 25); backlog $323M, 91% int'l
Possibilities for additional contracts
- Gen-3 constellation upsells
- US budget normalization
- Spectra analytics licensing to allied gov'ts
AI risks — what stands to lose
- Same as Planet: AI raises the value of the sensor flow
Assessment
Valuation & discrepancy
Disc 4 · Neutral- ~$1.2B cap, EV ~$1.25B on ~$107M revenue
- ~12x EV/Sales on ~flat revenue — richly valued, not cheap
- Defense backlog is the story; the price is not a discount
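The ~12x figure can be reproduced from the card's own rounded inputs. A small sketch, where the helper name `ev_to_sales` is hypothetical and the inputs are the approximate EV and revenue quoted above:

```python
def ev_to_sales(enterprise_value_m: float, revenue_m: float) -> float:
    """EV/Sales multiple; both inputs in $ millions."""
    if revenue_m <= 0:
        raise ValueError("revenue must be positive")
    return enterprise_value_m / revenue_m


# Rounded figures from the card: EV ~$1.25B, revenue ~$107M.
multiple = ev_to_sales(1250.0, 107.0)
print(f"EV/Sales ~{multiple:.1f}x")
```

Because both inputs are rounded, the result should be read as "about 12x", not as a precise multiple.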
Convexity & why
Moderate- Unique imagery + defense backlog = real optionality
- But ~12x sales on flat revenue means you pay up for it
- Not the bounded-downside cheap option it first looked like
Other endogenous concerns
- Dilution history; international customer concentration
- Gen-3 execution timeline risk
Hype factor (market awareness)
Med-High- Defense-AI story increasingly recognized; ~12x sales already pays for it
Catalysts
- Gen-3 launch & tasking milestones
- US budget resolution
- New int'l capacity commitments
Data
Nature of the data
Data 3 · Low- Works on gov geospatial/intel data it doesn't own
- Palantir-type: analytics layer on others' data
Data trajectory (stock vs flow)
n-a- Doesn't own the data it works on
Position on the AI-unlock curve
AI 4 · Neutral- AI analysis agents on others' data
- Services, not a data owner
Current AI contracts & counterparties
~ desk note- AI services on government data it doesn't own
Possibilities for additional contracts
AI risks — what stands to lose
- AI compresses services labor pricing — the classic services squeeze
Assessment
Valuation & discrepancy
Disc 4 · Neutral- Cheap services multiple
- Not a data-owner screen fit
Convexity & why
Low- Services multiple, no data optionality
Other endogenous concerns
- Recompete cycles; budget continuing-resolution exposure
Hype factor (market awareness)
Low- Services multiple, services story
Data
Nature of the data
Data 9 · High- Images Earth's entire landmass daily (~3.5m), plus high-res SkySat/Pelican (~50cm)
- A unique multi-year temporal archive no competitor has
- Change-over-time is the moat — can't retroactively collect history
- Increasingly delivered as AI-ready analytics
- Defense & intelligence is the fastest-growing buyer
Data trajectory (stock vs flow)
Compounding by design- Whole-Earth scan daily — the archive grows every 24h by construction
- New satellites add resolution/cadence; history can't be re-collected
Position on the AI-unlock curve
AI 7 · High- Anthropic partnership (Mar 2025): Claude applied to satellite imagery
- First prime win on NGA Luno ($12.8M, maritime AI analytics)
- MDA SHIELD IDIQ prime — eligible for Golden Dome task orders
- Backlog ~$900M (+79% YoY); Q4 revenue +41%
- AI analytics is the product; defense is the buyer
Current AI contracts & counterparties
✓ deep dive- Anthropic partnership (Mar 2025): Claude on satellite imagery (report)
- NGA Luno prime win $12.8M (SpaceNews)
- MDA SHIELD IDIQ prime (Golden Dome-eligible); backlog ~$900M
Possibilities for additional contracts
- Golden Dome task orders
- AI-analytics subscriptions over the archive (insurance, ag)
- More foundation-model partnerships on temporal imagery
AI risks — what stands to lose
- Low — AI is the accelerant, not the threat; risk is capex/competition not AI
Assessment
Valuation & discrepancy
Disc 4 · Neutral- ~28x EV/Sales, pre-profit — the data and backlog are the appeal, not the multiple
- Backlog ~$900M anchors forward revenue
- Rich on every metric
Convexity & why
High · optionality- Unique archive + ramping $906M defense backlog = large 'if it scales' upside
- Pre-profit/capital intensity is the downside
- Strong positive convexity
Other endogenous concerns
- SPAC-era dilution legacy; Pelican capex cycle
- Government contract concentration & timing lumps
Hype factor (market awareness)
High- AI + defense premium fully in the ~28x; expectations are the risk
Catalysts
- Next earnings: ~early Sept 2026 (FQ1'27 reported Jun 4 — record print) (Q1 8-K)
- FY27 guide raised to $425–441M (+41% mid); Q2 guide $102–107M with adj-EBITDA breakeven-to-positive
- Backlog $906M (+72%), RPO $816M (+81%); ~40% of backlog converts within 12 months
- Pelican cadence: 3 launched in Q1 incl Sweden's first sovereign recon satellite
- $731M cash funds the capex cycle; NGA $22M extension; Golden Dome task orders the option
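One way to read the backlog bullet against the guide is to ask how much of the guided revenue the converting backlog could cover. A rough sketch using only the figures in the list above; it assumes the "within 12 months" window lines up with the FY27 guide period, which the card does not say, and the variable names are illustrative:

```python
backlog_m = 906.0                          # $M backlog, from the card
convert_share = 0.40                       # "~40% of backlog converts within 12 months"
guide_low_m, guide_high_m = 425.0, 441.0   # FY27 revenue guide, $M

guide_mid_m = (guide_low_m + guide_high_m) / 2
next_12m_from_backlog_m = backlog_m * convert_share
coverage = next_12m_from_backlog_m / guide_mid_m   # share of guide midpoint (assumes aligned periods)

print(f"guide midpoint: ${guide_mid_m:.0f}M")
print(f"backlog converting in 12 months: ~${next_12m_from_backlog_m:.0f}M")
print(f"coverage of midpoint: ~{coverage:.0%}")
```

If the periods do not align, the coverage ratio shifts accordingly; treat it as an order-of-magnitude check on how backlog-anchored the guide is.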
Data
Nature of the data
Data 6 · Neutral- Weather/maritime/RF data (Spire); hyperspectral imagery (Satellogic)
- Niche proprietary sensor data
- Early and capital-intensive
Data trajectory (stock vs flow)
Compounding- Continuous sensor flow (weather/RF/hyperspectral); small base
Position on the AI-unlock curve
AI 5 · Neutral- Real but early sensor datasets
- On the curve but small
Current AI contracts & counterparties
~ desk note- Niche gov/defense sensor contracts
Possibilities for additional contracts
- Weather/RF data into forecasting AI
AI risks — what stands to lose
- Low AI risk; survival risk is capital, not AI
Assessment
Valuation & discrepancy
Disc 5 · Neutral- Speculative micro-caps
- Watchlist-only owners
- High risk, thin coverage
Convexity & why
High · lottery- Micro-cap sensor data — binary
- Large upside if a dataset scales, fat left tail
- High-variance convexity
Other endogenous concerns
- Cash runway and listing-compliance history — survival-grade risks
Hype factor (market awareness)
Low- Below the radar entirely
Catalysts
- Contract wins; cash runway
Sports data
Data
Nature of the data
Data 8 · High- Exclusive official league-data rights (NFL, NCAA, EPL)
- Now the NCAA's official data provider
- The other half of the official-sports-data duopoly with Sportradar
- Growing media/ad data layer (post-Legend acquisition)
- Multi-year rights = a hard moat
Data trajectory (stock vs flow)
Growing- More leagues, deeper tracking (player-level optical) each season
Position on the AI-unlock curve
AI 7 · High- AI for fan engagement and betting integrity products
- Media/ad data layer monetizes the rights twice
- Growing ~25%; up the curve like Sportradar
- Owner actively monetizing, not just holding
Current AI contracts & counterparties
✓ deep dive- No corpus licensing; exclusive NFL/NCAA/EPL rights in-product
- BetVision + media/ad data layer (Legend acq.)
Possibilities for additional contracts
- Second monetization of rights via media/ads
- AI integrity & fan-engagement products
AI risks — what stands to lose
- Same as Sportradar: rights moat holds, services layer competitive
Assessment
Valuation & discrepancy
Disc 6 · Neutral- ~$1.7B cap, ~3.6x EV/Sales on +31% growth
- Cheap for the growth + the official-data rights duopoly
- Media/ad layer monetizes the rights twice
Convexity & why
High- Rights moat + media-data optionality = asymmetric upside
- Growth-priced, so not deeply cheap
- Convex if the media layer scales
Other endogenous concerns
- NFL warrant dilution; rights renewals can reset economics
- Only recently profitable
Hype factor (market awareness)
Low- Same as Sportradar — the duopoly's AI angle is unpriced
Catalysts
- Next earnings: ~early Aug 2026 (Q1 reported May 8)
- Legend closed May 1 → FY26 guide ~$990M–$1.01B rev / $270–280M EBITDA (~28% margin) (Q1 call)
- NFL rights locked through Super Bowl 2030; GeniusIQ to automate the full rights portfolio by end-2027
- Prediction markets: market makers onboarded in Q1 on low-latency feeds
- Targets: positive GAAP net income 2027; ≥60% uFCF conversion by 2028; ~$100M H2'26 cash flow
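The "~28% margin" in the guide bullet is a rounded midpoint; the guided ranges imply a slightly wider band. A quick sketch from the quoted ranges, with a hypothetical helper name and the assumption that any revenue and EBITDA combination within the ranges is possible:

```python
def margin_band(rev_low: float, rev_high: float,
                ebitda_low: float, ebitda_high: float) -> tuple[float, float, float]:
    """Return (lowest, midpoint, highest) EBITDA margin implied by two guide ranges."""
    lowest = ebitda_low / rev_high     # weakest EBITDA on strongest revenue
    highest = ebitda_high / rev_low    # strongest EBITDA on weakest revenue
    midpoint = ((ebitda_low + ebitda_high) / 2) / ((rev_low + rev_high) / 2)
    return lowest, midpoint, highest


# FY26 guide from the card, in $M: revenue ~990-1,010, EBITDA 270-280.
low, mid, high = margin_band(990.0, 1010.0, 270.0, 280.0)
print(f"implied EBITDA margin: {low:.1%} to {high:.1%}, midpoint {mid:.1%}")
```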
Data
Nature of the data
Data 8 · High- Official, licensed sports-data rights — 900k+ events, 80+ sports
- Real-time play-by-play feeds, pre-match & live odds, streaming
- Multi-year exclusive league contracts = hard-to-replicate moat
- Half of a duopoly with Genius for official betting data
- The data backbone of the global betting industry
Data trajectory (stock vs flow)
Growing- Event coverage (900k+/yr) and in-play depth keep expanding
Position on the AI-unlock curve
AI 7 · High- AI for in-play personalization, risk/trading, content generation
- Higher-margin products (MTS, 4Sight) lift take-rates
- A genuine owner monetizing its corpus
- Up the curve; AI deepens products vs a new licensing line
- Recent Kalshi deal extends into prediction markets
Current AI contracts & counterparties
✓ deep dive- No corpus licensing — official-data rights monetized in-product
- Kalshi deal extends feeds into prediction markets
Possibilities for additional contracts
- AI in-play products lift take-rates (4Sight, MTS)
- Prediction-market data feeds scale
AI risks — what stands to lose
- Betting operators in-housing AI models could squeeze value-add services
- Official rights protect the raw feed itself
Assessment
Valuation & discrepancy
Disc 6 · Neutral- ~3.0x EV/Sales on ~12% growth for a rights-duopoly owner
- Reasonable on metrics for the moat
- Fair-to-slightly-cheap
Convexity & why
Moderate- ~3x sales on duopoly rights gives a floor
- Upside from take-rate growth on new products
- Balanced, slight positive tilt
Other endogenous concerns
- Rights-cost inflation: leagues extract more each renewal
- Founder (Koerl) control; bookmaker customer concentration
Hype factor (market awareness)
Low- Priced as a betting vendor; data-rights duopoly rarely framed as AI
Catalysts
- Next earnings: ~early Aug 2026 (Q1 reported early May)
- NOW: FIFA World Cup (Jun–Jul 2026) — major in-play/MTS volume event (Q1 call)
- FY26 reaffirmed: 23–25% cc revenue growth / 34–37% EBITDA growth
- Prediction markets 'imminent, potentially material' — H2 ramp
- IMG Arena synergies above 25% target; >700k streamed matches in 2026; H2 restructuring for leverage
- Short-seller reports — CEO pushed back on call; monitor, don't ignore
Ad · measurement · web data
Data
Nature of the data
Data 6 · Neutral- Ad-verification/fraud data (DV, healthier franchise)
- Cross-platform audience measurement (Comscore, distressed)
- Proprietary measurement data
Data trajectory (stock vs flow)
Flow with ad spend- Verification events track media volumes
Position on the AI-unlock curve
AI 6 · Neutral- Measurement owners; AI + walled gardens pressure the moat
- DV is the credible franchise; SCOR a broken business
Current AI contracts & counterparties
~ desk note- AI-content verification products (DV)
Possibilities for additional contracts
- Verification layer for AI-generated ad content
AI risks — what stands to lose
- AI-generated content/MFA sites flood verification (volume up, value contested)
- Walled gardens self-verify
Assessment
Valuation & discrepancy
Disc 7 · High- ~$1.6B cap, ~2.0x EV/Sales on +14% growth
- Cheap for an ad-verification data owner (DV)
- DV the franchise; SCOR the distressed lottery leg
Convexity & why
High- ~2x sales for a profitable measurement owner
- AI + walled gardens pressure the moat
- Cheap enough to be convex
Other endogenous concerns
- Ad-budget cyclicality; IAS rivalry compresses pricing; SCOR is balance-sheet-fragile
Hype factor (market awareness)
Low-Med- De-rated with adtech; AI angle minor
Catalysts
- DV growth; SCOR restructuring
Data
Nature of the data
Data 5 · Neutral- Panel/clickstream traffic, keyword, conversion estimates for nearly every site
- The dataset everyone uses to track digital behavior — incl. AI-search traffic
- Broad coverage, but modeled/estimated, not a first-party record
- Continuously updated digital-intelligence feeds
Data trajectory (stock vs flow)
Continuous panel- Clickstream flow is constant but panel-based — quality needs constant defense
- Privacy/cookie shifts are structural headwinds to collection
Position on the AI-unlock curve
AI 8 · High- Sells data feeds/APIs + MCP integrations into AI workflows
- Uniquely positioned to measure (and feed) the AI-search era
- Ahead for its size — high AI exposure per dollar
- Catch: modeled data less defensible than owned
Current AI contracts & counterparties
✓ deep dive- Sells AI/clickstream datasets + MCP integrations into AI workflows
- The standard source for tracking ChatGPT/Gemini traffic share
Possibilities for additional contracts
- AI-data ARR as a disclosed line
- Agent-platform data feeds
- Strategic acquirer interest (data fits many buyers)
AI risks — what stands to lose
- AI search shrinks open-web traffic — shrinking the thing it measures
- Collection (panels/extensions) gets harder as browsing shifts to agents
Assessment
Valuation & discrepancy
Disc 7 · High- ~$0.36B cap, EV ~$0.21B on ~$283M revenue
- ~0.7x EV/Sales — strikingly cheap, even for modeled data
- Deep-value + AI-licensing optionality; small & illiquid
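The gap between market cap and EV above says something on its own: with EV below cap, the balance sheet carries net cash. A minimal sketch from the card's rounded figures, assuming the simplified identity EV = market cap minus net cash (leases, minorities, and other adjustments ignored):

```python
cap_m, ev_m, revenue_m = 360.0, 210.0, 283.0   # $M, rounded figures from the card

net_cash_m = cap_m - ev_m        # simplified: EV = cap - net cash (assumption)
ev_sales = ev_m / revenue_m

print(f"implied net cash: ~${net_cash_m:.0f}M ({net_cash_m / cap_m:.0%} of market cap)")
print(f"EV/Sales: ~{ev_sales:.2f}x")
```

On these inputs a large share of the market cap is backed by cash, which is part of why the multiple on the operating business comes out below 1x.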
Convexity & why
High · deep-value- <1x EV/Sales with AI-licensing pull = asymmetric
- Small, illiquid, modeled (non-owned) data = the risk
- Cheap enough that convexity tilts positive
Other endogenous concerns
- Thin small-cap liquidity; SBC heavy; privacy rules threaten collection methods
Hype factor (market awareness)
Med- Its datasets are quoted everywhere; the equity is ignored at ~0.7x EV/S
Catalysts
- Next earnings: ~mid-Aug 2026 (Q1 reported May 13)
- Second large LLM training contract expected 'over the coming quarters' (Q1 6-K)
- AI revenue trajectory: 11% of Q4 revenue, ~3x YoY — does it keep compounding?
- RPO $297.7M (+18%); multi-year ARR at 64% — contract-quality migration
- FY26 guide $307–315M; low end already raised once
Data
Nature of the data
Data 6 · Neutral- Ad-bidding/bidstream data + UID2 identity framework
- Powers its own bidding (demand-side platform)
- Vast behavioral data, but an input
Data trajectory (stock vs flow)
High flow- Bidstream data scales with ad volume; ephemeral by nature
Position on the AI-unlock curve
AI 6 · Neutral- Stewards the UID2 identity standard
- Identity-data optionality, not a corpus sale
- De-rated; case on ad-platform fundamentals
Current AI contracts & counterparties
~ desk note- Kokai AI in-platform; UID2 stewardship
Possibilities for additional contracts
- UID2 as identity layer for agentic commerce
AI risks — what stands to lose
- AI walled-garden answers shrink open-web inventory — the de-rate driver
- Agentic ad-buying could compress DSP take rates
Assessment
Valuation & discrepancy
Disc 6 · Neutral- ~3.0x EV/Sales on +18% growth — value territory for profitable adtech
- UID2 identity optionality on top
- Open-web AI fears embedded in the multiple
Convexity & why
Moderate–High- Modest multiple + identity-standard optionality
- Data is an input, not a sold corpus
- Positive tilt on metrics
Other endogenous concerns
- Founder super-voting; Amazon DSP is the real competitive event
- SBC and the credibility hit from the '25 stumble
Hype factor (market awareness)
Med — as threat- AI read as open-web risk; de-rate reflects it
Catalysts
- CTV share; UID2 adoption; growth re-accel
Data
Nature of the data
Data 6 · Neutral- B2B contact + company intelligence: emails, dials, org charts, technographics
- Buying-intent signals across millions of companies
- A live 'who's-who' graph of decision-makers
- Real, but increasingly replicable as AI shifts buyer behavior
- Renamed platform around 'GTM AI'
Data trajectory (stock vs flow)
Decay treadmill- B2B contact data decays ~25–30%/yr — must be rebuilt constantly
- Customer churn weakens the contributory refresh loop
- The clearest decaying-asset risk in the table
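To make the treadmill concrete, the quoted ~25–30%/yr decay can be compounded to see how quickly an unrefreshed dataset goes stale. A sketch under a simplifying assumption the card does not state: decay is a constant annual rate and no records are refreshed; the function name is illustrative:

```python
def surviving_fraction(annual_decay: float, years: float) -> float:
    """Share of records still valid after `years` at a constant annual decay rate."""
    if not 0.0 <= annual_decay <= 1.0:
        raise ValueError("annual_decay must be between 0 and 1")
    return (1.0 - annual_decay) ** years


for rate in (0.25, 0.30):   # the card's quoted range
    still_valid = surviving_fraction(rate, 2)
    print(f"{rate:.0%}/yr decay: {still_valid:.0%} of records still valid after 2 years")
```

At either end of the range, roughly half the records are stale within two years, so the refresh loop has to replace a large share of the corpus every year just to stand still.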
Position on the AI-unlock curve
AI 7 · High- GTM Context Graph native in OpenAI's Codex for Work — agent context layer
- AI is both distribution and disruptor
- Cut 2026 guidance + ~20% of staff on AI-driven shifts
- Ahead on plumbing, behind on the seat-based model
- Clearest live case of 'data doesn't protect the equity'
Current AI contracts & counterparties
✓ deep dive- GTM Context Graph natively in OpenAI's Codex for Work
- No disclosed licensing $; positioning as agent context layer
Possibilities for additional contracts
- Per-call context pricing for sales agents
- More agent-platform embeds (Claude, Gemini)
- Data-only tier decoupled from seats
AI risks — what stands to lose
- Customers replace SDR seats with AI — seat-based model directly hit (guidance cut said so)
- Agents can increasingly infer contact data without a vendor
Assessment
Valuation & discrepancy
Disc 6 · Neutral- ~1.7x EV/Sales — among the lower multiples on the board
- But revenue is declining; the cheapness reflects decay risk
- Statistically cheap; operationally a falling knife
Convexity & why
High · binary- ~1.7x sales embeds heavy pessimism — small asymmetric base
- Re-rates hard if revenue stabilizes as the agent-context layer
- Declining revenue is the live left tail
Other endogenous concerns
- Debt on a shrinking base; SBC dilution; churn is the whole story
Hype factor (market awareness)
High — as threat- The market's AI-victim poster child; the Codex embed is ignored
Catalysts
- Next earnings: ~early-mid Aug 2026 (Q1 reported May 11)
- The trough test: FY26 guide cut to $1.185–1.205B (−4% mid); Q2 $300–303M — does it hold? (Q1 call)
- Agent embeds: Salesforce prospecting agent ships with ZoomInfo as first/primary external data source (150k+ customers); HubSpot native; ChatGPT/Claude/Copilot/Perplexity connectors live
- Pricing pivot: Copilot moving from seats to prepackaged credits/consumption
- Mgmt points to growth returning H2 2027; 35% AOI margin + cost cuts fund the wait
Auto data
Data
Nature of the data
Data 6 · Neutral- Wholesale used-car condition & transaction data (ACV inspection corpus)
- Granular vehicle-condition/pricing data
- Still primarily marketplaces
Data trajectory (stock vs flow)
Growing- Inspection corpus grows with every vehicle listed (ACV)
Position on the AI-unlock curve
AI 5 · Neutral- Feeds AI pricing
- ACV more data-distinctive
- Operators, not data-unlock plays
Current AI contracts & counterparties
~ desk note- ACV inspection-AI in-product
Possibilities for additional contracts
- Condition-data licensing to pricing AIs
AI risks — what stands to lose
- Low-moderate; inspection AI is ACV's own product
Assessment
Valuation & discrepancy
Disc 5 · Neutral- ACV the more data-distinctive
- Both operators
- Corpus enhances the platform
Convexity & why
Moderate- ACV growth + condition-data optionality
- Valued on the marketplace
- Mildly positive
Other endogenous concerns
- ACV not yet sustainably profitable; OPENLANE balance sheet
Hype factor (market awareness)
Low- Marketplace story
Data
Nature of the data
Data 5 · Neutral- Auto listing, pricing & shopper-intent data
- Largely audience/marketplace
- Listings not fully proprietary
Data trajectory (stock vs flow)
Churning flow- Listings churn; intent data flows with traffic
Position on the AI-unlock curve
AI 4 · Neutral- Useful intent data, Zillow-like
- Not a data-unlock play
Current AI contracts & counterparties
~ desk note
Possibilities for additional contracts
AI risks — what stands to lose
- AI shopping agents could bypass listing sites
Assessment
Valuation & discrepancy
Disc 5 · Neutral- Reasonable valuations
- Operators in the Zillow mold
Convexity & why
Low- Operator, limited data asymmetry
- Balanced-to-low
Other endogenous concerns
- Dealer-count churn; marketing-spend treadmill
Hype factor (market awareness)
Low- Marketplace story
Data
Nature of the data
Data 7 · High- Salvage-auto auction & vehicle-history data (IntelliSeller)
- Decades of auction-outcome data
- Serves its dominant auction marketplace
Data trajectory (stock vs flow)
Growing- Salvage auction outcomes accumulate with volume
Position on the AI-unlock curve
AI 5 · Neutral- AI tools in-product, not licensed
- Data deepens the moat, isn't the product
Current AI contracts & counterparties
~ desk note- Internal auction AI (IntelliSeller)
Possibilities for additional contracts
AI risks — what stands to lose
- Low; AI assists damage assessment
Assessment
Valuation & discrepancy
Disc 3 · Low- Premium, high-quality operator
- Data deepens the moat, isn't the product
Convexity & why
Low- Premium, data not the re-rate driver
Other endogenous concerns
- Leadership transition from founder era; totals cycle depends on used-car values
Hype factor (market awareness)
Low- Operator story
Retail · e-commerce data
Data
Nature of the data
Data 6 · Neutral- Grocery-purchase + fast-growing retail-media ad data
- Rich first-party purchase data
- Powers its own high-margin ads (input)
Data trajectory (stock vs flow)
Compounding- Purchase graph deepens with order history
Position on the AI-unlock curve
AI 5 · Neutral- Strong data-driven ad engine
- AI-relevant, but feeds its ads, not sold
- Operator class
Current AI contracts & counterparties
~ desk note- Retail-media AI in-product
Possibilities for additional contracts
- Purchase-data into commerce agents (never signaled)
AI risks — what stands to lose
- AI shopping agents could disintermediate the storefront layer
Assessment
Valuation & discrepancy
Disc 5 · Neutral- Reasonable on ads + delivery
- Strong ad engine
- Data is an input
Convexity & why
Moderate- Retail-media optionality
- Valued on the business, not the data
- Balanced
Other endogenous concerns
- DoorDash/Uber entering grocery; ad growth must outrun fee pressure
Hype factor (market awareness)
Low- Grocery/ads story
Transaction · payments data
Data
Nature of the data
Data 5 · Neutral- Merchant transaction flows & fraud signals (banking/payments processing)
- Real data, but serves its processing
Data trajectory (stock vs flow)
Steady flow- Transaction flow tracks processing volumes
Position on the AI-unlock curve
AI 4 · Neutral- In-product fraud/upsell, not a corpus
Current AI contracts & counterparties
~ desk note
Possibilities for additional contracts
AI risks — what stands to lose
- AI-native fintech infrastructure competition
Assessment
Valuation & discrepancy
Disc 4 · Neutral- Cheap-ish fintech
- But not a data re-rate
Convexity & why
Low- Value fintech, data not the driver
Other endogenous concerns
- Worldpay separation aftermath; bank IT spending cycles
Hype factor (market awareness)
Low- Fintech story
Data
Nature of the data
Data 8 · High- Among the largest transaction datasets on earth
- Regulated, privacy-bound byproduct
- Not licensed as a corpus
Data trajectory (stock vs flow)
Compounding- Payment volumes grow ~10%/yr — among the largest data flows on earth
Position on the AI-unlock curve
AI 3 · Low- Increasingly productized
- But privacy-bound; not a corpus sale
- The ultimate data-advantaged operators
Current AI contracts & counterparties
~ desk note- Internal fraud/credit AI at vast scale; agentic-commerce pilots
Possibilities for additional contracts
- Agentic payments standards (who authorizes an AI's purchase?)
AI risks — what stands to lose
- Agentic payments could reshape authorization economics — also an opportunity
- Stablecoin/alternative rails the bigger structural worry
Assessment
Valuation & discrepancy
Disc 2 · Low- Valued as payment giants
- n/a as a data re-rate
Convexity & why
Low- Priced payment networks; data is internal
Other endogenous concerns
- Interchange regulation (CCCA) and DOJ debit suit (V)
- Stablecoin rails as long-term routing threat
Hype factor (market awareness)
Med- Agentic commerce chatter rising; data never the thesis
Catalysts
- Agentic-payment standards; volume growth
Companion to data-owners-financials-grounded.html (the master table, which also holds the ratings summary and takeaways). ✓ FMP = SEC-sourced figures; ~ EV est = estimated. ✓ deep dive = filings/PRs reviewed with linked sources; ~ desk note = knowledge-based fill pending deep dive. * on market caps = ADR/foreign listing hand-adjusted.