Full Detail · Traceable Logic
◆ owner (corpus is the asset) · ○ operator (data is an input).
Column groups: Data (Nature of the data through AI risks) · Financials, FMP, Jun 9 2026 (Market cap through YoY rev growth) · Assessment (Valuation & discrepancy through Catalysts).

| Company · IR | Nature of the data · tier | Data trajectory (stock vs flow) | Position on the AI-unlock curve · tier | Current AI contracts & counterparties | Possibilities for additional contracts | AI risks: what stands to lose | Market cap | EV / Sales | YoY rev growth | Valuation & discrepancy | Convexity & why | Other endogenous concerns | Hype factor (market awareness) | Catalysts |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Financial-market data | | | | | | | | | | | | | | |
| CME Group (CME) ○ operator · IR / presentations ↗ | Data 6 · Neutral | Growing | AI 5 · Neutral | ~ desk note | | | ~$93B ~ EV est | ~15x | +6% | Disc 3 · Low | Low | | Low: Not a data-AI story | |
| FactSet (FDS) ◆ owner · IR / presentations ↗ | Data 6 · Neutral | Steady flow | AI 7 · High | ~ desk note | | | ~$9.0B ✓ FMP | ~4.3x | +5% | Disc 6 · Neutral | Moderate | | Med (as threat): De-rated with the info-services group in Feb 2026 | |
| Intercontinental Exch. (ICE) ◆ owner · IR / presentations ↗ | Data 8 · High | Cyclical flow | AI 6 · Neutral | ~ desk note | | | ~$80B ~ EV est | ~10x | +6% | Disc 4 · Neutral | Low | | Low: Read as an exchange, never as a data-AI play | |
| Moody's (MCO) ◆ owner · IR / presentations ↗ | Data 9 · High | Growing | AI 9 · High | ✓ deep dive | | | ~$79B ✓ FMP | ~11x | +9% | Disc 2 · Low | Low | | High: Best-executed AI strategy is consensus; it's in the ~11x | |
| Morningstar (MORN) ◆ owner · IR / presentations ↗ | Data 7 · High | Growing | AI 5 · Neutral | ~ desk note | | | ~$7.0B ✓ FMP | ~3.4x | +8% | Disc 6 · Neutral | Moderate–High | | Low: PitchBook's AI value ~absent from the narrative | |
| MSCI (MSCI) ◆ owner · IR / presentations ↗ | Data 8 · High | Growing | AI 7 · High | ~ desk note | | | ~$44B ✓ FMP | ~16x | +10% | Disc 3 · Low | Low | | Med: AI seen as feature, not thesis | |
| Nasdaq (NDAQ) ◆ owner · IR / presentations ↗ | Data 7 · High | Growing | AI 6 · Neutral | ~ desk note | | | ~$49B ~ EV est | ~12x | +8% | Disc 3 · Low | Low | | Low: AI in products, not in the multiple | |
| S&P Global (SPGI) ◆ owner · IR / presentations ↗ | Data 9 · High | Growing | AI 9 · High | ✓ deep dive | | | ~$126B ✓ FMP | ~8.8x | +8% | Disc 3 · Low | Low | | Med-High: AI execution is consensus among analysts; the multiple carries only a modest sector AI-threat discount | |
| Professional-information data (legal · tax · IT advisory) | | | | | | | | | | | | | | |
| Gartner (IT) ◆ owner · IR / presentations ↗ | Data 8 · High | Steady, watch the flow | AI 5 · Neutral | ✓ deep dive | | | ~$10.5B ✓ FMP | ~2.1x | +4% | Disc 8 · High | High · quality-convex | | High (as threat): Narrative casts Gartner as an AI casualty; AskGartner and the paywalled corpus get little credit | |
| Thomson Reuters (TRI) ◆ owner · IR / presentations ↗ | Data 9 · High | Steady compounding | AI 8 · High | ✓ deep dive | | | ~$78B* ✓ FMP | ~9.0x | +7% | Disc 3 · Low | Low | | High (as threat): Market narrative treats agentic AI as a threat to legal-research seats; CoCounsel distribution under-credited | |
| Credit · identity · risk data | | | | | | | | | | | | | | |
| Equifax (EFX) ◆ owner · IR / presentations ↗ | Data 8 · High | Compounding | AI 6 · Neutral | ~ desk note | | | ~$20B ✓ FMP | ~4.0x | +7% | Disc 7 · High | High | | Low: AI angle absent; mortgage cycle dominates the narrative | |
| Experian (EXPN.L) ◆ owner · IR / presentations ↗ | Data 8 · High | Growing | AI 6 · Neutral | ~ desk note | | | ~$45B* ~ EV est | ~6.5x | +7% | Disc 6 · Neutral | Moderate | | Low: UK listing keeps it out of the AI conversation | |
| FICO (FICO) ◆ owner · IR / presentations ↗ | Data 6 · Neutral | Derived flow | AI 6 · Neutral | ~ desk note | | | ~$28B ✓ FMP | ~14x | +15% | Disc 5 · Neutral | Moderate | | Med: Debate is pricing power, not AI | |
| LiveRamp (RAMP) ◆ owner · IR / presentations ↗ | Data 7 · High | Maintained | AI 6 · Neutral | ~ desk note | | | ~$2.3B ~ EV est | ~3.0x | +10% | Disc 2 · Low | Low | | Low: Story is now the Publicis acquisition | |
| TransUnion (TRU) ◆ owner · IR / presentations ↗ | Data 7 · High | Growing | AI 5 · Neutral | ~ desk note | | | ~$13.5B ✓ FMP | ~3.9x | +8% | Disc 6 · Neutral | Moderate | | Low: Same as EFX (cycle story, not AI story) | |
| Verisk (VRSK) ◆ owner · IR / presentations ↗ | Data 9 · High | Steady compounding | AI 5 · Neutral | ~ desk note | | | ~$24B ✓ FMP | ~9.0x | +7% | Disc 5 · Neutral | Low–Moderate | | Low-Med: Quality priced; AI not separately valued | |
| Healthcare · life-sciences data | | | | | | | | | | | | | | |
| Cencora (COR) ○ operator · IR / presentations ↗ | Data 5 · Neutral | Steady flow | AI 3 · Low | ~ desk note | | | ~$54B ~ EV est | ~0.1x | +10% | Disc 4 · Neutral | Low | | Low: Not an AI story | |
| Definitive Health. (DH) ◆ owner · IR / presentations ↗ | Data 7 · High | Slowing | AI 6 · Neutral | ~ desk note | | | ~$0.1B ~ EV est | ~2.0x | −8% | Disc 6 · Neutral | High · distressed | | Low: Micro-cap; no AI narrative attaches | |
| Doximity (DOCS) ◆ owner* · IR / presentations ↗ | Data 6 · Neutral | Saturated graph | AI 6 · Neutral | ~ desk note | | | ~$3.8B ✓ FMP | ~5.6x | +13% | Disc 5 · Neutral | Moderate | | Med: Was priced for AI hopes; now reset to fair | |
| Elevance (ELV) ○ operator · IR / presentations ↗ | Data 6 · Neutral | Steady flow | AI 4 · Neutral | ~ desk note | | | ~$92B ~ EV est | ~0.4x | +5% | Disc 5 · Neutral | Moderate | | Low: Insurer story | |
| GoodRx (GDRX) ○ operator · IR / presentations ↗ | Data 5 · Neutral | Steady flow | AI 4 · Neutral | ~ desk note | | | ~$0.9B ~ EV est | ~1.3x | ~flat | Disc 4 · Neutral | Moderate · binary | | Low: No AI narrative | |
| Guardant Health (GH) ◆ owner · IR / presentations ↗ | Data 9 · High | Compounding fast | AI 6 · Neutral | ~ desk note | | | ~$17B ✓ FMP | ~17x | +33% | Disc 5 · Neutral | Moderate | | Med: Priced as diagnostics growth; data angle secondary | |
| IQVIA (IQV) ◆ owner · IR / presentations ↗ | Data 9 · High | Compounding | AI 7 · High | ✓ deep dive | | | ~$31B ✓ FMP | ~2.7x | +6% | Disc 7 · High | High | | Low → rising: Cheapest scarce-data name; IQVIA.ai barely registers in the multiple yet | |
| Natera (NTRA) ◆ owner · IR / presentations ↗ | Data 9 · High | Compounding fast | AI 6 · Neutral | ~ desk note | | | ~$32B ✓ FMP | ~12x | +36% | Disc 5 · Neutral | Moderate | | Med: Same; growth story, data unpriced | |
| Tempus AI (TEM) ◆ owner · IR / presentations ↗ | Data 9 · High | Compounding fast | AI 8 · High | ✓ deep dive | | | ~$8.5B ✓ FMP | ~6.5x | +83% | Disc 6 · Neutral | High · growth optionality | | High: AI is in the name and the multiple, but >$1B RCV arguably still under-modeled | |
| Veeva Systems (VEEV) ◆ owner · IR / presentations ↗ | Data 7 · High | Growing | AI 7 · High | ✓ deep dive | | | ~$27B ✓ FMP | ~7.7x | +16% | Disc 5 · Neutral | Low | | Med: Read as a quality SaaS with AI features, not a data owner | |
| Consumer · user-generated · marketplace data | | | | | | | | | | | | | | |
| Carvana (CVNA) ○ operator · IR / presentations ↗ | Data 4 · Neutral | Growing | AI 3 · Low | ~ desk note | | | ~$76B ~ EV est | ~3.5x | +30% | Disc 3 · Low | Low | | Low: Retail story | |
| CoStar Group (CSGP) ◆ owner · IR / presentations ↗ | Data 8 · High | Compounding | AI 3 · Low | ✓ deep dive | | | ~$14B ✓ FMP | ~4.0x | +19% | Disc 6 · Neutral | Moderate–High | | Low: AI never part of the story; the data optionality is free at ~4x | |
| Duolingo (DUOL) ◆ owner* · IR / presentations ↗ | Data 6 · Neutral | Compounding | AI 6 · Neutral | ✓ deep dive | | | ~$5.5B ✓ FMP | ~4.0x | +39% | Disc 6 · Neutral | Moderate–High | | High (as threat): Narrative says ChatGPT kills language learning; the AI-first operating model is ignored | |
| MercadoLibre (MELI) ○ operator · IR / presentations ↗ | Data 6 · Neutral | Compounding | AI 3 · Low | ~ desk note | | | ~$83B ~ EV est | ~3.5x | +35% | Disc 3 · Low | Moderate | | Low: Not a data play | |
| Netflix (NFLX) ○ operator · IR / presentations ↗ | Data 7 · High | Growing | AI 2 · Low | ~ desk note | | | ~$343B ~ EV est | ~8.0x | +14% | Disc 2 · Low | Low | | Low: Recs AI assumed, not valued separately | |
| Reddit (RDDT) ◆ owner · IR / presentations ↗ | Data 9 · High | Compounding | AI 9 · High | ✓ deep dive | | | ~$34B ✓ FMP | ~13x | +69% | Disc 5 · Neutral | Moderate | | High: The AI-data story IS the stock; renewal terms are the swing | |
| TripAdvisor (TRIP) ◆ owner · IR / presentations ↗ | Data 4 · Neutral | Slowing risk | AI 4 · Neutral | ✓ deep dive | | | ~$1.4B ~ EV est | ~0.7x | +3% | Disc 5 · Neutral | Low–Moderate | | Med: AI read as existential threat; partnerships seen as defensive, not monetizing | |
| Yelp (YELP) ◆ owner · IR / presentations ↗ | Data 6 · Neutral | Steady flow, not melting | AI 6 · Neutral | ✓ deep dive | | | ~$1.3B ✓ FMP | ~0.9x | +3% | Disc 6 · Neutral | Moderate · binary | | Low-Med: OpenAI deal is new and barely in the price; story still read as 'Google victim' | |
| Zillow (Z/ZG) ◆ owner* · IR / presentations ↗ | Data 5 · Neutral | Churning flow | AI 5 · Neutral | ~ desk note | | | ~$8.6B ✓ FMP | ~3.1x | +16% | Disc 6 · Neutral | Moderate | | Med: AI features noted, not a data thesis | |
| Peer-reviewed journal publishing data | | | | | | | | | | | | | | |
| RELX (RELX) ◆ owner · IR / presentations ↗ | Data 10 · High | Growing | AI 8 · High | ✓ deep dive | | | ~$92B* ✓ FMP | ~9.0x | +7% | Disc 4 · Neutral | Low | | High (as threat): Same legal-AI threat narrative as TRI; grounded-product execution under-credited | |
| Wiley (WLY) ◆ owner · IR / presentations ↗ | Data 7 · High | Growing | AI 7 · High | ✓ deep dive | | | ~$2.3B ✓ FMP | ~1.9x | ~flat | Disc 7 · High | Moderate | | High: AI-licensing story is prominent in coverage; expectations now elevated | |
| Wolters Kluwer (WTKWY) ◆ owner · IR / presentations ↗ | Data 9 · High | Steady flow | AI 7 · High | ~ desk note | | | ~$38B* ✓ FMP | ~6.0x | +6% | Disc 3 · Low | Low | | Med: Quality understood; AI optionality not separately priced | |
| Research analytics · IP · content data | | | | | | | | | | | | | | |
| Clarivate (CLVT) ◆ owner · IR / presentations ↗ | Data 7 · High | Steady flow | AI 5 · Neutral | ✓ deep dive | | | ~$1.5B ✓ FMP | ~2.3x | −4% | Disc 7 · High | High · distressed option | | Low: Debt story drowns the data story entirely | |
| Getty Images (GETY) ◆ owner · IR / presentations ↗ | Data 8 · High | Strong flow + archive | AI 6 · Neutral | ✓ deep dive | | | ~$0.3B ✓ FMP | ~1.5x | +4% | Disc 7 · High | High · lottery | | Med-High: Every AI headline attaches to it; the balance sheet, not awareness, is the constraint | |
| Pearson (PSO) ◆ owner · IR / presentations ↗ | Data 6 · Neutral | Steady flow | AI 5 · Neutral | ~ desk note | | | ~$9.3B ~ EV est | ~2.0x | +3% | Disc 5 · Neutral | Moderate | | Med: Two-sided, tutoring threat vs licensing option | |
| Geospatial · sensor data | | | | | | | | | | | | | | |
| BlackSky (BKSY) ◆ owner · IR / presentations ↗ | Data 8 · High | Compounding | AI 7 · High | ✓ deep dive | | | ~$1.2B ✓ FMP | ~12x | +4% | Disc 4 · Neutral | Moderate | | Med-High: Defense-AI story increasingly recognized; ~12x sales already pays for it | |
| Leidos (LDOS) △ borderline · IR / presentations ↗ | Data 3 · Low | n/a | AI 4 · Neutral | ~ desk note | | | ~$16B ~ EV est | ~1.3x | +6% | Disc 4 · Neutral | Low | | Low: Services multiple, services story | |
| Planet Labs (PL) ◆ owner · IR / presentations ↗ | Data 9 · High | Compounding by design | AI 7 · High | ✓ deep dive | | | ~$10.4B ✓ FMP | ~28x | +26% | Disc 4 · Neutral | High · optionality | | High: AI + defense premium fully in the ~28x; expectations are the risk | |
| Spire / Satellogic (SPIR/SATL) ◆ owner · IR / presentations ↗ | Data 6 · Neutral | Compounding | AI 5 · Neutral | ~ desk note | | | ~$0.5B ~ EV est | ~3.0x | +20% | Disc 5 · Neutral | High · lottery | | Low: Below the radar entirely | |
| Sports data | | | | | | | | | | | | | | |
| Genius Sports (GENI) ◆ owner · IR / presentations ↗ | Data 8 · High | Growing | AI 7 · High | ✓ deep dive | | | ~$1.7B ✓ FMP | ~3.6x | +31% | Disc 6 · Neutral | High | | Low: Same as Sportradar; the duopoly's AI angle is unpriced | |
| Sportradar (SRAD) ◆ owner · IR / presentations ↗ | Data 8 · High | Growing | AI 7 · High | ✓ deep dive | | | ~$4.9B ✓ FMP | ~3.0x | +12% | Disc 6 · Neutral | Moderate | | Low: Priced as a betting vendor; data-rights duopoly rarely framed as AI | |
| Ad · measurement · web data | | | | | | | | | | | | | | |
| DoubleVerify / Comscore (DV/SCOR) ◆ owner · IR / presentations ↗ | Data 6 · Neutral | Flow with ad spend | AI 6 · Neutral | ~ desk note | | | ~$1.6B ✓ FMP | ~2.0x | +14% | Disc 7 · High | High | | Low-Med: De-rated with adtech; AI angle minor | |
| Similarweb (SMWB) ◆ owner · IR / presentations ↗ | Data 5 · Neutral | Continuous panel | AI 8 · High | ✓ deep dive | | | ~$0.36B ✓ FMP | ~0.7x | +15% | Disc 7 · High | High · deep-value | | Med: Its datasets are quoted everywhere; the equity is ignored at ~0.7x EV/S | |
| The Trade Desk (TTD) ○ operator · IR / presentations ↗ | Data 6 · Neutral | High flow | AI 6 · Neutral | ~ desk note | | | ~$9.4B ~ EV est | ~3.0x | +18% | Disc 6 · Neutral | Moderate–High | | Med (as threat): AI read as open-web risk; de-rate reflects it | |
| ZoomInfo (GTM) ◆ owner · IR / presentations ↗ | Data 6 · Neutral | Decay treadmill | AI 7 · High | ✓ deep dive | | | ~$0.8B ✓ FMP | ~1.7x | ~−3% | Disc 6 · Neutral | High · binary | | High (as threat): The market's AI-victim poster child; the Codex embed is ignored | |
| Auto data | | | | | | | | | | | | | | |
| ACV / OPENLANE (ACVA/KAR) ○ operator · IR / presentations ↗ | Data 6 · Neutral | Growing | AI 5 · Neutral | ~ desk note | | | ~$1.0B ~ EV est | ~5.0x | +25% | Disc 5 · Neutral | Moderate | | Low: Marketplace story | |
| CarGurus / Cars.com (CARG/CARS) ○ operator · IR / presentations ↗ | Data 5 · Neutral | Churning flow | AI 4 · Neutral | ~ desk note | | | ~$2.7B ~ EV est | ~3.0x | +5% | Disc 5 · Neutral | Low | | Low: Marketplace story | |
| Copart (CPRT) ○ operator · IR / presentations ↗ | Data 7 · High | Growing | AI 5 · Neutral | ~ desk note | | | ~$29B ~ EV est | ~10x | +10% | Disc 3 · Low | Low | | Low: Operator story | |
| Retail · e-commerce data | | | | | | | | | | | | | | |
| Instacart (CART) ○ operator · IR / presentations ↗ | Data 6 · Neutral | Compounding | AI 5 · Neutral | ~ desk note | | | ~$9.9B ~ EV est | ~3.5x | +10% | Disc 5 · Neutral | Moderate | | Low: Grocery/ads story | |
| Transaction · payments data | | | | | | | | | | | | | | |
| FIS (FIS) ○ operator · IR / presentations ↗ | Data 5 · Neutral | Steady flow | AI 4 · Neutral | ~ desk note | | | ~$21B ~ EV est | ~4.0x | +4% | Disc 4 · Neutral | Low | | Low: Fintech story | |
| Visa / Mastercard / Amex (V/MA/AXP) ○ operator · IR / presentations ↗ | Data 8 · High | Compounding | AI 3 · Low | ~ desk note | | | ~$623B / $438B ~ EV est | ~16x | +10% | Disc 2 · Low | Low | | Med: Agentic commerce chatter rising; data never the thesis | |
The core tension as a 2×2: data quality (rows) vs AI-unlock (columns). Top-left — elite data, slow unlock — is the latent re-rate watchlist. Note IQVIA's migration to the right column after the deep dive (IQVIA.ai, 150+ agents, 19 of top-20 pharma). Key: * = High valuation discrepancy · † = High convexity (both symbols = both).
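Top-left, Latent (High data, AI-unlock below High): IT*†, CLVT*†, GETY*†, EFX*†, TRU, EXPN.L, GH, NTRA, DH†, VRSK, MORN, NDAQ, ICE, CSGP, RAMP, NFLX, CPRT, V/MA/AXP
Top-right, Monetizing the moat (High data, High AI-unlock): IQV*†, WLY*, SRAD, GENI†, RDDT, TEM†, PL†, BKSY, RELX, SPGI, TRI, MSCI, MCO, WTKWY, VEEV
Bottom-left (data below High, AI-unlock below High): FICO, DUOL, GDRX, SPIR/SATL†, PSO, TTD, DV/SCOR*†, YELP, Z/ZG, ACVA/KAR, CART, ELV, CARG/CARS, CME, DOCS, TRIP, MELI, CVNA, FIS, COR, LDOS
Bottom-right, Punching above their data (data below High, High AI-unlock): GTM†, SMWB*†, FDS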
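The quadrant placement can be sketched in a few lines of Python. This is a minimal illustration, not the author's method: the tier cutoffs are inferred from the labels printed in the table (scores of 2 and 3 appear as Low, 4 through 6 as Neutral, 7 and up as High), and the label "Neither" for the fourth quadrant is a placeholder, since the source does not name it.

```python
def tier(score: int) -> str:
    """Map a 1-10 score to a tier label (cutoffs inferred from the table)."""
    if score >= 7:
        return "High"
    if score >= 4:
        return "Neutral"
    return "Low"


def quadrant(data_score: int, ai_score: int) -> str:
    """Place a name in the 2x2: data quality (rows) vs AI-unlock (columns)."""
    data_high = tier(data_score) == "High"
    ai_high = tier(ai_score) == "High"
    if data_high and ai_high:
        return "Monetizing the moat"
    if data_high:
        return "Latent"
    if ai_high:
        return "Punching above their data"
    return "Neither"  # placeholder: the source does not name this quadrant


# (Data score, AI score) pairs copied from four table rows.
examples = {"IQV": (9, 7), "IT": (8, 5), "SMWB": (5, 8), "FICO": (6, 6)}
for ticker, (data_score, ai_score) in examples.items():
    print(ticker, "->", quadrant(data_score, ai_score))
```

Each of the four example names lands in the quadrant where the ticker lists place it.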
And the grouped view — High and Low ends of each rating.
The screen's pick: names from the Monetizing the moat and Punching above their data quadrants that also carry a High valuation discrepancy (*) or High convexity (†) — i.e. the unlock is already happening and the payoff is mispriced or asymmetric.
| Name | Marks |
|---|---|
| IQVIA (IQV) | *† |
| Similarweb (SMWB) | *† |
| Wiley (WLY) | * |
| Tempus AI (TEM) | † |
| Sports-data duopoly (GENI + SRAD) | † |
| Planet Labs (PL) | † |
| ZoomInfo (GTM) | † |
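The selection rule can be checked mechanically against the quadrant ticker lists. The sketch below is illustrative only: the two lists are copied from the 2×2 as printed, and the function name `screen` is hypothetical.

```python
# Quadrant membership copied from the 2x2 ticker lists.
# "*" marks High valuation discrepancy, "†" marks High convexity.
monetizing_the_moat = [
    "IQV*†", "WLY*", "SRAD", "GENI†", "RDDT", "TEM†", "PL†", "BKSY",
    "RELX", "SPGI", "TRI", "MSCI", "MCO", "WTKWY", "VEEV",
]
punching_above_their_data = ["GTM†", "SMWB*†", "FDS"]


def screen(*quadrants):
    """Keep names from the given quadrants that carry at least one mark."""
    picks = []
    for names in quadrants:
        for entry in names:
            ticker = entry.rstrip("*†")    # strip trailing marks
            marks = entry[len(ticker):]    # whatever was stripped
            if marks:
                picks.append((ticker, marks))
    return picks


picks = screen(monetizing_the_moat, punching_above_their_data)
for ticker, marks in picks:
    print(ticker, marks)
```

Applied literally, the rule selects seven tickers. SRAD carries no mark, so the rule alone does not select it; it appears in the picks as the other half of the sports-data duopoly alongside GENI.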
IQVIA (IQV) *†
The cleanest setup on the board: the scarcest healthcare data at ~2.7x sales, and the unlock just went live. IQVIA.ai (Mar 2026) launched with 150+ agents, and 19 of the top-20 pharma companies already use them, while the market still prices the stock as a sleepy CRO. High discrepancy, high convexity, low hype.
Similarweb (SMWB) *†
The deep-value outlier: ~0.7x EV/Sales on +15% growth while selling the datasets everyone uses to measure the AI-search era, with feeds and MCP integrations into AI workflows. Modeled (non-owned) data and nano-cap liquidity are the risks, but at this price the asymmetry is real.
Wiley (WLY) *
The proven licensor at a value multiple: ~1.9x EV/Sales with $92M of lifetime AI-licensing revenue already disclosed ($29M in a single quarter), an Anthropic partnership, and recurring inference pilots converting corporate R&D demand. Flat underlying growth is the offset and caps the convexity, but on pure metrics, paying under 2x sales for the rare publisher with demonstrated, repeatable AI revenue is a discrepancy. The flow is healthy too: submissions +25%. Watch item: whether the licensing line proves recurring rather than episodic.
Tempus AI (TEM) †
The growth-convexity pick: >$1B remaining contract value, the $200M AstraZeneca/Pathos foundation-model deal (non-exclusive, so the motion can be resold), and ~6.5x sales on +83% growth. Not cheap, but the foundation-model optionality is large and real.
Sports-data duopoly (GENI + SRAD) †
The promising thing here is the duopoly's position, which both names share: legally exclusive rights to real-world events, the one category of data AI cannot generate, whose consumption AI multiplies (more priced micro-markets per game, CV-deepened datasets, AI media on the long tail, settlement-oracle demand from agents and prediction markets). Neither is framed as an AI story by the market, and at ~3.0–3.6x EV/Sales the multiples charge little for it. The choice between them is an expression preference, not a separate thesis. GENI (†) is the torque expression: ~31% growth at ~3.6x, the media/ad layer monetizing the rights a second time, and the Second Spectrum CV stack creating proprietary data beyond the feed, with the NFL renewal as the concentrated left tail that comes with the convexity. SRAD is the quality expression: twice the size, diversified across 80+ sports, profitable, founder-led. It offers a steadier, more symmetric payoff that misses the † by one notch of judgment. Because their biggest idiosyncratic risks are mirror images (league concentration for GENI, operator in-housing of services for SRAD), the pair diversifies the single-contract risk while keeping full exposure to the shared thesis. Watch item for both: rights-cost inflation at renewal, since the leagues are the residual claimants on the AI surplus.
Planet Labs (PL) †
Compounding-by-design data (the Earth, daily) with AI as pure accelerant, a ~$900M backlog, and Golden Dome optionality. The caveat is the price: ~28x sales means you're paying for the story. Convex, not cheap.
ZoomInfo (GTM) †
The contrarian lottery ticket: ~1.7x sales with its Context Graph natively inside OpenAI's Codex for Work, agent-context positioning the market ignores while pricing it as the AI victim. The decay-treadmill data and shrinking revenue are why it's a small, binary position, not a core one.
Honorable mentions just outside the rule: Gartner (IT) *† and Equifax (EFX) *† carry both marks but sit in the Latent quadrant — the data is elite and cheap, but the unlock hasn't shipped yet; they're the re-rate watchlist rather than the active list. As throughout: one analyst's qualitative framework, not investment advice.
Deeper discussion behind the picks above, organized by name — questions raised against each thesis, answered in full. Read the winners first; come here for the weeds.
Q: I thought I heard somewhere that a lot of the IQV data is not monetizable?
You heard something real — and it's the most important caveat on the IQVIA thesis, so worth unpacking properly. The claim has three solid legs:
First, IQVIA doesn't own most of its data the way Reddit or Tempus own theirs — it licenses it in. The prescription and claims data comes from pharmacies, PBMs, payers, and distributors under supply contracts that govern what IQVIA can do with it. That's a different property position from Reddit (owns its UGC outright under its terms of service) or Tempus (generates the data in its own labs as a byproduct of its own tests). Practical consequences: the supply contracts may restrict or simply not contemplate AI-training uses, meaning new monetization can require renegotiation; and once suppliers see AI value in their feeds, they can demand a bigger cut — the data-acquisition cost line can inflate to absorb part of any new revenue. IQVIA's own 10-K risk factors flag dependence on data suppliers for exactly this reason.
Second, a big chunk of the data people attribute to IQVIA isn't theirs to sell at all. The clinical-trial data from the CRO half (the old Quintiles) belongs to the pharma sponsors who ran the trials. IQVIA gains operational expertise and metadata from running them, but the underlying trial datasets are the clients' property. So "world's largest holder of clinical trial data" overstates what's actually licensable.
Third, privacy law boxes in the rest. The patient-level asset is valuable precisely because it's longitudinal and granular — and that's exactly what HIPAA de-identification standards, GDPR, and the newer state health-privacy laws make hard to hand over for model training, where re-identification risk and downstream-use control are unresolved questions. Compliance-wrapped enterprise analytics: fine. Shipping the corpus to a lab: mostly not.
Now the counterpoint, because the bull case survives in modified form: these constraints are exactly why IQVIA's unlock took the shape it did. It isn't licensing data out — it's building agents on the data (IQVIA.ai, the NVIDIA-built foundation models) and selling workflow outcomes, which keeps everything inside the compliance and contract perimeter. That's the monetization path that's actually available to them, and the 19-of-top-20-pharma adoption suggests it's working. The honest restatement of the thesis: you're not buying a Reddit-style licensing annuity; you're buying cheap (~2.7x sales) exposure to whether agents-on-regulated-data becomes a product category — with the risks being supplier economics, execution, and the CRO half's labor-pricing exposure to AI.
The caveat does suggest that the Data 9 rating conflates two things the table mostly treats as one: data scarcity (genuinely elite) versus data ownership cleanliness (middling: licensed-in, use-restricted). Tempus, Reddit, Verisk, and Equifax score much better on the second dimension.
One flag on the specifics: the supplier-contract and sponsor-ownership structure described here is well-established, but current contract terms and the state of IQVIA's data-access disputes (including long-running litigation with Veeva over data access) haven't been re-verified in this pass.
Q: I am wary of companies that perpetually operate at a loss, where the stock continuously declines for many years. To me this suggests a fundamental problem with the business model, or management simply not aligned with shareholder value — more often than not these go to zero rather than turning around. Is there any indication Similarweb is actually doing something transformative right now? Is there any justification for their current losses?
The heuristic is a good one — companies that lose money for years while the stock grinds down do usually die rather than turn — but on Similarweb the factual premise has gone stale, and the recent filings answer both questions concretely.
On "perpetually operating at a loss": no longer quite true. In Q1 2026 the GAAP operating loss narrowed to $4.4M and GAAP net loss to $6.4M, while the company posted non-GAAP operating profit of $2.4M and normalized free cash flow of $6.6M, its tenth consecutive quarter of positive normalized free cash flow. 2025 was its second consecutive year of non-GAAP operating profit (~$13M FCF for the year), and it carries ~$65–72M cash, no debt, and an untouched $75M credit line. The remaining GAAP loss is small and consists mostly of stock-based comp and amortization: a dilution cost worth counting, but a very different animal from a cash-burning melting ice cube. Net cash, no debt, self-funding: the "goes to zero" mechanics (refinancing walls, forced raises) aren't present.
On "doing something transformative right now": real, quantified evidence, not just narrative. AI-related sales reached 11% of Q4 2025 revenue, up from 8% in mid-2025, with AI revenue roughly tripling year over year. In Q1 2026 it signed a seven-digit LLM data-training contract with an existing big-tech customer, with a second large LLM contract expected in coming quarters. Contract quality is improving in ways that suggest durability: remaining performance obligations grew 18% to $297.7M, 64% of ARR is now multi-year (up from 49% a year ago), and $100k+ customers grew 12% to 461. The transformation thesis, plainly: the same clickstream dataset that sold as marketing-analytics seats is being re-sold as LLM training/grounding data to big tech, and that line went from experimental to 11% of revenue in about a year.
The parts that support the wariness, because they're real: growth has decelerated — FY26 guidance implies ~10%, down from ~15% — and overall net revenue retention is 98%, meaning the core seat business churns slightly faster than it expands; the AI line grows on top of a flattish base. Large deals are lumpy (Q4 missed guidance on contract-timing slippage). And note the word "normalized" before the cash-flow figure: unadjusted free cash flow in Q1 was $(0.3)M — the cash generation is real but thinner than the headline. The long stock decline also reflects genuine history: a 2021 IPO at growth-stock pricing followed by years of guidance resets before the discipline arrived (a new CFO joined this year with a mandate explicitly framed around monetizing the data asset).
Verdict: this doesn't fit the classic to-zero profile — those companies have debt, burn, and no buyer for what they make; Similarweb has net cash, positive cash generation, and a dataset big tech just started writing seven-figure checks for. But it also isn't a proven transformation: a slightly-churning core with a fast-growing AI line that's still only ~11% of revenue, priced at under 1x EV/Sales. The bet, stated plainly: the AI mix shift outruns the core's decay before panel-collection methods get harder. That's exactly why it's sized as option-like rather than core. The single best thing to watch: whether the second large LLM contract lands and whether AI revenue keeps compounding off 11% — if that line stalls, the base-rate skepticism wins.
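The "mix shift outruns the core's decay" bet can be made concrete with a share-weighted growth calculation. The sketch below is a toy model under loud assumptions: only the ~11% AI share comes from the text. The core-decay rate and the AI growth rate are hypothetical inputs, and net revenue retention of 98% is not the same thing as core revenue growth, because it excludes new customers.

```python
def blended_growth(ai_share: float, core_growth: float, ai_growth: float) -> float:
    """Total revenue growth as the share-weighted sum of the two lines."""
    return (1.0 - ai_share) * core_growth + ai_share * ai_growth


def breakeven_ai_growth(ai_share: float, core_growth: float) -> float:
    """AI-line growth at which total revenue is exactly flat."""
    return -(1.0 - ai_share) * core_growth / ai_share


AI_SHARE = 0.11       # from the text: AI-related sales at ~11% of revenue
CORE_GROWTH = -0.02   # ASSUMPTION: core shrinks 2% a year (not a company figure)
AI_GROWTH = 1.00      # ASSUMPTION: AI line doubles (not a company figure)

total = blended_growth(AI_SHARE, CORE_GROWTH, AI_GROWTH)
needed = breakeven_ai_growth(AI_SHARE, CORE_GROWTH)
print(f"blended growth: {total:.1%}")
print(f"AI growth needed to hold revenue flat: {needed:.1%}")
```

Because the AI line is a small share of the base, it has to grow several times faster than the core shrinks just to hold total revenue flat, which is why the write-up watches whether that line keeps compounding.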
Q: My assumption is every major AI is going to need to ultimately contract with every major publisher. I don't have anything hard to go on when saying this — it just seems logical. What is your take on this matter?
The logic is sound as a directional bet, but it needs amending in two important ways — because the version as stated sits on top of a legal question that's actively being decided, and the form of the contracting matters more than the fact of it.
The case for the assumption is real, and it has four legs. First, scarcity economics: frontier labs are compute-rich and quality-data-constrained, and peer-reviewed scientific text is among the highest-value-per-token corpora that exists — concentrated in a handful of publishers (Elsevier, Springer Nature, Wiley, Taylor & Francis, Oxford/Cambridge). Second, the checks are rounding errors: the observed deals run roughly $10–40M a year per publisher against labs spending tens of billions on compute — the cost-benefit of just paying is lopsided. Third, enterprise customers increasingly demand provenance and indemnification, which only licensed data provides. Fourth, there's a cascade dynamic: once two or three labs license a corpus, the others face both a capability gap and a worse litigation posture for not licensing — and publishers are improving their coordination (Wiley's Nexus, licensing on behalf of 36 publishers, is exactly a move to bundle the long tail and raise the table stakes).
The amendment the logic needs: the courts have partially undercut the training version of it. In 2025, US federal rulings (the Anthropic books case and Meta's authors case) held that training on lawfully acquired copies is fair use — transformative — while Anthropic's ~$1.5B settlement was about the piracy of acquisition, not training itself. If that line of jurisprudence holds (the NYT–OpenAI case and appeals are the ones to watch), then for pure model training a lab arguably needs to buy one legitimate copy of everything, not sign a license with anyone. Add the supply-side erosion: open access now covers a large and growing share of new science (mandated by funders), preprint servers carry much of the frontier, and synthetic data reduces marginal dependence on any single corpus. So "every lab must license every publisher for training" is the weakest form of the thesis — training licenses may prove episodic, one-and-done archive purchases rather than annuities.
But the inference version of the assumption is much stronger — and that's the one that matters for Wiley. Fair use is a training-time doctrine; it does not cover an agent retrieving, reproducing, and serving current copyrighted articles at query time. A clinical AI citing this month's literature, a corporate R&D copilot grounding on vetted chemistry, a research agent that must point to the authoritative version of record — those need live, licensed, recurring access, and there's no fair-use route around it. That's a per-seat or per-query annuity, not a one-time archive sale. And this is precisely the pivot Wiley's own disclosures describe: recurring inference pilots with pharma/chemical/space companies, the OpenEvidence deal (five-year licensing into clinical AI, with equity), and management guiding the recurring share of AI revenue to triple. The vetting layer compounds the case — as AI floods the world with plausible text, the peer-review stamp becomes more valuable at retrieval time, not less. Jurisdiction helps too: the EU's text-and-data-mining opt-out regime pushes toward licensing more firmly than US fair use does.
Two sizing cautions so the logic doesn't overrun the numbers. Even if the assumption fully plays out, the per-publisher checks observed so far ($20–40M/yr range) would be transformative for Wiley's multiple (high-margin recurring revenue on a ~$1.7B base at ~1.9x sales) without being transformative for its P&L — the re-rate comes from the market capitalizing a data-annuity line, not from revenue doubling. And "every major AI" overstates the buyer pool: it's really the handful of frontier labs with consumer/enterprise products and US legal exposure; open-weight and non-US players may never pay, and a buyer pool of five-ish labs versus many publishers is an oligopsony where the labs hold pricing power — another reason the publisher-coalition motion (Nexus) matters strategically.
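The first sizing caution is simple arithmetic, sketched here using only figures quoted in the paragraph above (the $20–40M/yr check range and the ~$1.7B revenue base). The helper name `share_of_revenue` is hypothetical.

```python
REVENUE_BASE_M = 1700.0            # ~$1.7B revenue base, from the text
LICENSING_RANGE_M = (20.0, 40.0)   # $20-40M/yr per-publisher checks, from the text


def share_of_revenue(licensing_m: float, revenue_m: float = REVENUE_BASE_M) -> float:
    """Licensing revenue as a fraction of the total revenue base."""
    return licensing_m / revenue_m


for check in LICENSING_RANGE_M:
    print(f"${check:.0f}M/yr is {share_of_revenue(check):.1%} of revenue")
```

Both ends of the range come out to a low single-digit percentage of revenue, which is the sense in which the licensing line can move the multiple without moving the P&L much.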
Compressed take: right in direction, wrong in mechanism. Restated: "most frontier labs will end up paying the few publishers whose vetted corpora can't be substituted — but the durable money is inference-time grounding, not training, and the training-license wave may prove episodic." The June 16 print is a direct test: the recurring-vs-episodic split in the AI line is exactly the number that adjudicates between the two versions. Tail risk to flag honestly: if appellate courts extend fair use further than expected — or open-access coverage gets good enough — the licensing imperative weakens at the margin, which is why the NYT–OpenAI outcome is a thesis input for the whole publisher group, not just Wiley.
Q: You suggest the journals are under fire from preprint services and open access. My take is that the peer-review system is an essential check on avoiding the 'human centipede' problem — where information on the open web is increasingly generated by AI, and so there are risks when AI in turn ingests that information. Does the review process for WLY journals differ from that of preprints and open access? And if so, is there substance to this take?
There's real substance to the take — and it actually sharpens the thesis — but it needs one correction first, because two different things are getting bundled under "open access."
The correction: open access is a payment model, not a review standard. Most open-access journals — including Wiley's own large OA portfolio, PLOS, Nature's OA titles — are fully peer-reviewed; "open" refers to who pays (the author/funder, via article processing charges, instead of the reading library). So "under fire from open access" is a pricing threat — it erodes the paywall/subscription model and compresses revenue per article — not a vetting threat. The category the human-centipede argument applies to is preprints (arXiv, bioRxiv, SSRN), which get only light screening: scope checks, plagiarism filters, basic moderation — no expert evaluation of methods, statistics, or claims before posting.
So yes, the process differs materially. A Wiley journal article passes through editorial triage by a domain editor, evaluation by typically two-plus independent expert referees who challenge methodology and claims, mandated revisions, increasingly statistical and image-integrity screening plus AI-generated-text detection, named-author accountability tied to institutions, and — crucially — post-publication infrastructure: corrections, retractions, and the citable version of record. A preprint has none of that. The difference isn't that preprints are wrong (most eventually pass review somewhere); it's that the journal version carries a verified provenance chain and a maintained error-correction mechanism.
And the underlying mechanism is documented, not just intuitive. The "model collapse" literature — most prominently a 2024 Nature paper — showed that models trained recursively on AI-generated data degrade, losing the tails of the distribution first. As AI-generated text floods the open web (and the preprint servers — they're getting hit too), the share of verifiably human-originated, expert-checked text shrinks as a fraction of available training and grounding material. That makes the peer-review stamp exactly what the question says it is: a provenance filter whose scarcity value rises with the pollution level. It's the cleanest version of the Wiley moat argument: they don't just own content, they own a certification process — and certification is the thing AI can't synthesize, because its value comes from accountable humans staking reputations on it. This is even stronger at inference time than training time: an agent citing a retracted or fabricated paper is a liability event, and only the publishers maintain the retraction/version-of-record signal that prevents it.
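A toy illustration of the tail-loss mechanism, not a reproduction of the Nature experiment: each "generation" fits a Gaussian to samples drawn from the previous generation's fit, then becomes the sole source for the next. The sample size, generation count, and seed are arbitrary assumptions.

```python
import random
import statistics


def recursive_fit(generations, sample_size, seed=0):
    """Refit a Gaussian to its own samples each generation; return the std per generation."""
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0 stands in for the human-originated distribution
    history = [std]
    for _ in range(generations):
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)  # maximum-likelihood fit to the synthetic sample
        history.append(std)
    return history


history = recursive_fit(generations=200, sample_size=20)
print(f"std at generation 0:   {history[0]:.3f}")
print(f"std at generation 200: {history[-1]:.3g}")
```

With small samples the fitted spread drifts downward generation after generation, so the tails disappear first. Mixing fresh, verified data into each generation is what stops the drift, which is the role the text assigns to the vetted corpus.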
The honest caveat — it's a big one and it's Wiley-specific: peer review is a leaky filter, and Wiley owns the cautionary tale. Its Hindawi acquisition collapsed into the largest paper-mill scandal in publishing history — on the order of 10,000+ retractions, dozens of journals closed, the Hindawi brand itself shut down. Paper mills and AI-assisted manuscripts are now hammering journals, not just preprint servers; reviewers don't rerun experiments or audit raw data; and the submission surge Wiley reports (+25%) is partly the AI-writing flood arriving at their own front door. So the right framing isn't "peer review = clean, everything else = contaminated." It's that peer review is a costly, maintained, accountable filter — imperfect, but the only one with institutional machinery behind it — and its economic value depends on publishers actually defending it. Wiley's post-Hindawi integrity investment is, in that sense, capex on the moat.
Net for the thesis: the take upgrades the Wiley argument from "they own good text" to "they operate the trust layer of the scientific record in an era when trust is the scarce input." It implies the durable revenue isn't selling the archive once — it's selling certified, current, retraction-aware access continuously, which is the inference-annuity case from the question above arriving by a different road. Two things to watch: whether labs start paying explicit premiums for verified-human/vetted corpora over scraped web (early signs yes — that's what every one of these licensing deals implicitly is), and whether Wiley keeps its integrity record clean post-Hindawi — because in this framing a second paper-mill scandal isn't an embarrassment, it's impairment of the core asset.
Q: Does AI open the door for more automated verification and/or replication of research? It seems like it should be one of the tiers of peer review baked into every process.
Yes — and this might be the most underexplored part of the whole publisher thesis, because AI changes the cost curve of verification, which has always been the binding constraint. Peer review checks what's cheap to check (plausibility, novelty, methodology-as-described) and skips what's expensive (does the code run, are the numbers internally consistent, does the data support the claims, does it replicate). AI attacks exactly the expensive part. But the tiers are arriving at very different speeds.
Tier 1 — automated integrity screening: already here, and Wiley is deploying it. Image-manipulation and duplication detection (tools like Proofig and ImageTwin), statistical consistency checks (recomputing p-values from reported test statistics, checking whether reported means are even arithmetically possible given sample sizes), plagiarism and tortured-phrase detection, and paper-mill signature screening. Post-Hindawi, Wiley built and launched its own AI-powered paper-mill detection service, and the industry runs a shared STM Integrity Hub. This tier is becoming standard intake screening — it happens before human review, exactly the "baked-in tier" the question describes. It exists because the Hindawi-class scandals made the cost of not having it explicit.
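The two statistical checks named above are simple enough to sketch. The first is in the spirit of what is often called a GRIM test (can a reported mean arise from integer-valued responses at the stated sample size); the second recomputes a two-sided p-value from a reported z statistic. Function names and tolerances are illustrative assumptions, and nothing here is claimed to be any publisher's actual screening code.

```python
import math


def grim_consistent(reported_mean, n, decimals=2):
    """True if some integer total over n integer responses rounds to reported_mean."""
    half_unit = 0.5 * 10 ** (-decimals)
    approx_total = reported_mean * n
    # The nearest integer totals are the only candidates that can match.
    for total in (math.floor(approx_total), math.ceil(approx_total)):
        if abs(total / n - reported_mean) <= half_unit + 1e-9:
            return True
    return False


def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def p_value_matches(reported_p, z, tolerance=0.005):
    """True if the reported p-value agrees with the one implied by z."""
    return abs(two_sided_p_from_z(z) - reported_p) <= tolerance


print(grim_consistent(5.18, 28))     # True: a total of 145 over 28 gives 5.1786
print(grim_consistent(5.19, 28))     # False: no integer total over 28 gives 5.19
print(p_value_matches(0.05, 1.96))   # True
print(p_value_matches(0.001, 2.50))  # False: z = 2.50 implies p of about 0.012
```

Checks like these are cheap because they need only the numbers printed in the manuscript, which is why they fit at intake rather than in human review.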
Tier 2 — computational reproduction: technically feasible now, not yet standard. With data- and code-availability mandates spreading, an AI agent can literally re-execute the analysis pipeline: pull the deposited data, run the deposited code, check that the figures and tables regenerate, flag where they don't. This is the genuinely transformative one, because it converts "reproducibility" from a years-later social process into a pre-publication compile check. Nothing about it is science fiction — it's agentic code execution, which is mature. The blockers are economic and social, not technical: review labor is currently free (referees are unpaid volunteers), so any machine tier adds real cost per paper that someone must absorb; data/code deposits are still incomplete; and — the uncomfortable one — open-access publishing runs on volume economics (revenue per article published), so journals' financial incentive is to reduce friction at acceptance, not add it. The honest answer to "why isn't this baked in everywhere already" is mostly that one sentence.
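The "compile check" idea reduces to comparing what the manuscript reports against what the deposited pipeline regenerates. A minimal sketch of that comparison step follows; running the deposited code in a sandbox is left out, and the result keys and values are hypothetical.

```python
import math


def reproduction_report(reported, regenerated, rel_tol=1e-6):
    """Compare reported results with regenerated ones; return a list of discrepancies."""
    issues = []
    for key, reported_value in reported.items():
        if key not in regenerated:
            issues.append(f"{key}: not regenerated")
        elif not math.isclose(reported_value, regenerated[key], rel_tol=rel_tol):
            issues.append(
                f"{key}: reported {reported_value}, regenerated {regenerated[key]}"
            )
    return issues


# Hypothetical values: what the manuscript states vs. what the rerun produced.
reported = {"table2.effect_size": 0.42, "table2.n": 180, "fig3.slope": 1.07}
regenerated = {"table2.effect_size": 0.42, "table2.n": 176}

issues = reproduction_report(reported, regenerated)
for issue in issues:
    print(issue)
```

An empty list is the pass condition. Anything else goes back to the authors before a referee spends time on the paper.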
Tier 3 — empirical replication: AI helps at the margins, robots are the long game. Wet-lab replication can't be done by a language model; it requires self-driving labs and cloud-lab infrastructure, which exist in narrow domains (chemistry, materials science) but are nowhere near general or cheap. What AI can do today at this tier is triage: models trained to predict which findings will replicate (DARPA ran a program on this) perform respectably, so a "replication-risk score" attached to papers could direct scarce replication resources. Full automated replication-as-review is a decade-scale story, not a product-cycle one.
The investment-relevant part cuts both ways for Wiley. The bullish reading: machine verification strengthens the certification product (a stamp meaning "human-reviewed AND machine-verified" is worth more than either alone, especially to AI labs buying training/grounding data — "verified corpus" becomes a premium SKU); and it favors scale, since integrity-tech stacks amortize across millions of submissions, squeezing small publishers toward consolidation or toward agency arrangements like Wiley's Nexus — the big five get bigger. The bearish reading to hold alongside it: if automated verification becomes good, cheap, and independent, it commoditizes part of what journals sell. The overlay-journal model — preprint servers plus automated verification plus lightweight expert curation (eLife has experimented in this direction) — is the disintermediation scenario, where the verification tier exists but publishers don't own it. The likely equilibrium: accountability, liability, retraction infrastructure, and the career-credentialing function keep the journals at the center for a long time, but the moat narrows from "we are the only filter" to "we are the accountable filter."
The deepest version of the point: AI won't just verify papers — AI will increasingly write and even conduct the science, generating a flood of machine-assisted findings. In that world the certification layer becomes more central, not less, but it has to evolve from "two human referees" to "human judgment supervising a machine-verification stack." Whoever operates that stack credibly owns the trust layer of the scientific record. The publishers are the incumbents with the brand, the workflow position, and (post-Hindawi) the motivation — but it's an asset they have to build, and the capex shows up before the pricing power does. For the June 16 call and beyond, the tell would be Wiley talking about verification as a product (verified-corpus licensing, integrity-as-a-service) rather than only as a cost line.
One adversarial caveat to keep honest: any automated check becomes a target — papers will be optimized to pass the machines, the way text is already optimized to evade AI detectors. So the equilibrium isn't "verification solves fraud"; it's an arms race in which the side with more data about fraud patterns wins — which, conveniently, is another argument for scale.
Q: I don't understand the WLY business all that well. Some of it seems to be related to journals. Nexus and clinical seem different. Can you break down these segments, providing a concise summary of each one, and a rough estimate of the percent they are contributing to overall revenue?
The map, with FY2025 actuals (fiscal year ended April 30, 2025; total continuing revenue ~$1.66B; segment figures from SEC filings via FMP):
1. Research — $1,075M, ~65% of revenue. The heart of the company and the entire AI thesis. Two sub-pieces:
Everything stress-tested in the questions above lives in or attaches to Research: the AI licensing deals (journal corpus → labs), OpenEvidence (journal content → clinical AI), Nexus, and the Emerald acquisition (more journals).
2. Learning — $585M, ~35% of revenue. The other, slower half:
The two flagged items, properly placed — neither is a segment:
Nexus is a service inside Research, launched ~2025: Wiley acts as licensing agent for other publishers' content into AI developers — 36 partner publishers signed. Revenue today: not disclosed, almost certainly immaterial (low single-digit millions at most). Its significance is structural, not financial yet: a take-rate marketplace that scales without Wiley creating content, and it aggregates the long tail of publishers into a single negotiating bloc against the labs.
"Clinical" = COA (Clinical Outcome Assessments), also inside Research: Wiley licenses validated patient questionnaires and assessment instruments — the standardized forms used as endpoints in drug trials — to pharma, with an IQVIA distribution agreement. Per the Q3 call it grew from $0.8M in 2021 to ~$7M — i.e., under 0.5% of revenue. It's a proof-of-concept for "Wiley content as regulated-workflow infrastructure," not a needle-mover.
And the line that cuts across both segments: AI licensing was ~$40M in FY25 (~2.4% of revenue), $29M in Q1 FY26 alone — booked in Research when it's journal content, Learning when it's books.
The concise mental model: ~57% journals (the moat and the AI story), ~8% publishing infrastructure, ~35% books and courseware (the slow-melt legacy that pays the ~4.5% dividend), with AI licensing at ~2–3% and growing fast, and Nexus/COA as currently-tiny options on the agency and clinical-infrastructure models. The valuation question is whether the 65% earns a data-asset multiple before the 35% erodes — which is why the recurring-AI-revenue disclosure on June 16 matters more than any segment line.
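For anyone who wants to re-derive the percentages, the arithmetic is a few lines. The dollar figures are the ones quoted above; the variable names are illustrative.

```python
TOTAL_REVENUE_M = 1660.0  # ~$1.66B total continuing revenue, FY2025

line_items_m = {
    "Research": 1075.0,
    "Learning": 585.0,
    "AI licensing (cuts across both)": 40.0,
    "COA": 7.0,
}

# Share of total revenue for each line item.
shares = {name: value / TOTAL_REVENUE_M for name, value in line_items_m.items()}

for name, share in shares.items():
    print(f"{name:32s} ${line_items_m[name]:>7,.0f}M  {share:6.1%}")

# The two segments should account for all of revenue.
segment_total = line_items_m["Research"] + line_items_m["Learning"]
print(f"Research + Learning = ${segment_total:,.0f}M of ${TOTAL_REVENUE_M:,.0f}M")
```

AI licensing and COA are not additive to the segments, since both are booked inside them. They are listed only to show how small they are against the total today.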
Q: My concern with TEM is that nearly every high-profile bio IPO comes to market at an astronomical valuation, and as the reality and complexity of what they intend to achieve becomes evident, it's a continual erosion in price. Are there pathways where TEM could have 5x more revenue? What would various forward-growth scenarios look like?
The concern is well-founded: the "IPO at an astronomical multiple, then erode for years as reality intrudes" pattern is the base rate for high-profile bio and diagnostics listings (Schrödinger, Recursion, 23andMe, and, for long stretches, Guardant and Exact Sciences). The mechanism is real: TAM slides assume flawless execution, reimbursement arrives slower than modeled, and the multiple de-rates from "platform" to "lab services" as growth normalizes. So the right question isn't whether TEM can 5x in the abstract; it's whether there are specific, fundable pathways to ~$8B and what has to be true for each, held against that gravitational pull.
The math that disciplines everything. 5x the FY26 guide (~$1.6B) is ~$8B. The street already models TEM to ~$3.1B by 2030 — roughly 2x, an ~18% CAGR. So 5x is on no current analyst sheet: it needs ~26% revenue CAGR sustained for seven years, or a step-change from new modalities/M&A the consensus isn't underwriting. That gap is where both the upside and the erosion risk live. Today's business is ~$955M Diagnostics + ~$300M+ Data & Applications, and the Data line — not the lab — is what makes the 5x debate interesting, because it carries software economics the erosion-pattern names usually lack.
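The growth rates quoted here follow from the standard compound-growth formula. A quick sketch, assuming the FY26 guide as the starting point, seven years for the 5x case, and four years (FY26 to 2030) for the street case:

```python
def cagr(start, end, years):
    """Compound annual growth rate needed to go from start to end in the given years."""
    return (end / start) ** (1.0 / years) - 1.0


FY26_GUIDE_B = 1.6  # ~$1.6B, from the text

five_x = cagr(FY26_GUIDE_B, 5 * FY26_GUIDE_B, years=7)
street = cagr(FY26_GUIDE_B, 3.1, years=4)  # assumes 2030 is four years out

print(f"5x in seven years needs {five_x:.1%} a year")
print(f"Street ~$3.1B by 2030 implies {street:.1%} a year")
```

The gap between the two rates, held for nearly twice as long, is the distance between the consensus sheet and the 5x case.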
The five pathways to 5x — what each contributes and what must be true:
Scenario table — forward revenue, with the levers each requires:
| Scenario | ~Revenue (7y) / multiple of today | Implied CAGR | What has to go right | Read |
|---|---|---|---|---|
| Erosion case | ~$2.5–3B · ~2x | ~8–10% | Genomics decelerates, MRD reimbursement stalls, Data growth normalizes; no new modality scales. Execution fine, story de-rates. | Your base-rate fear — revenue still grows but the multiple compresses faster; the stock erodes even as the company "works." |
| Street case | ~$3.1B (2030) · ~2x | ~18% (to 2030) | Levers 1–2 deliver, Data compounds steadily, no heroic adjacency. The consensus model. | Consensus — solid, already in estimates; ~2x, not 5x. The market is paying ~6.5x sales for this. |
| Bull case (5x) | ~$8B · ~5x | ~26% | Levers 1–3 all deliver and at least one adjacency (lever 4) reaches scale; Data line hits $1.5–2.5B. Seven years of mid-20s growth with no balance-sheet break. | Fundable, not forecastable: requires 4 of 5 levers firing; each is plausible, the conjunction is demanding. |
| Moonshot | ~$12B+ · ~8x | ~33%+ | All five levers, including transformative M&A funded by a holding multiple, plus Data becoming a true foundation-model data platform. Tempus becomes the clinical-data layer of medical AI. | Lottery upside — the reason to hold a small position; do not underwrite to it. |
The honest synthesis. Yes, there are real pathways to 5x — and unlike a pure therapeutics binary, they're incremental and observable: you watch MRD reimbursement, Data TCV, and adjacency scaling quarter by quarter, so you're not betting on a single trial readout. That's what makes TEM less of a classic erosion candidate than the pattern suggests. But the 5x case requires four of five levers to fire over seven years, and the erosion pattern you describe is precisely what happens when two or three fire and the multiple — today ~6.5x sales — does the rest of the work downward. The single most important variable is lever 3 (Data): if the licensing line compounds into a genuine multi-billion data platform, TEM re-rates as a data owner and the 5x is reachable; if Data stalls and TEM stays a fast-growing lab, the street case caps it at ~2x and your erosion thesis likely wins on the multiple even if revenue grows. The tell to watch each quarter: is Data & Applications growing faster than Diagnostics? As long as it is, the data-platform thesis is intact; the quarter that flips is the quarter the erosion case gains the upper hand. Usual caveat: scenario framework, not a forecast — and explicitly not investment advice.
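Why "each is plausible, the conjunction is demanding" holds can be shown with a binomial sketch. It assumes the five levers are independent and equally likely to deliver, which is a simplification, and the per-lever probabilities are hypothetical inputs, not estimates.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability that at least k of n independent levers fire, each with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical per-lever odds; the text gives no such probabilities.
for p in (0.5, 0.6, 0.7):
    print(f"per-lever odds {p:.0%}: at least 4 of 5 fire with probability "
          f"{prob_at_least(4, 5, p):.1%}")
```

Even at 70% per lever, four of five is close to a coin flip under these assumptions. Positive correlation between levers (for example, Data success feeding adjacencies) would raise the odds, and shared risks such as reimbursement would cut the other way.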
Q: Regarding SRAD and GENI, how does AI change the landscape for sports data? I understand the bull case relating to a lot more prediction markets increasing demand for sports betting. But how does AI — if at all — increase the value of this data?
The prediction-market bull case is really a demand story; the more interesting question is whether AI changes the unit value of the data itself. Five mechanisms where AI genuinely increases value, and two where it leaks away.
1. AI multiplies the number of sellable markets per game. The binding constraint on in-play betting has never been bettor appetite — it's pricing capacity. A human-supervised trading desk can only run so many live markets; AI models can price thousands of micro-markets simultaneously (next pitch, next possession, player props that re-price every few seconds). Every additional market is additional consumption of the underlying feed, and it's exactly what Sportradar's managed trading service (MTS) sells: operators outsource the model because building it in-house is hard. AI raises the ceiling on markets-per-event, and the official feed is the raw input for all of them.
2. Computer vision turns each game into orders of magnitude more data. This is the most underappreciated piece. Traditional play-by-play is hundreds of events per game; optical tracking is millions of positional coordinates. Genius owns this capability outright: it acquired Second Spectrum, which does the Premier League's player tracking and ran the NBA's optical tracking for years. Sportradar's equivalent is its 4Sight/computer-vision stack. AI is simultaneously the collection technology (video → structured data, collapsing the cost of capturing depth) and the demand driver (those coordinates feed augmented broadcasts like BetVision, automated officiating support, coaching analytics, and richer bet types). The same rights now yield a much bigger dataset.
3. AI-generated media makes the long tail monetizable. Automated commentary, recaps in any language, and synthetic broadcast layers mean a third-tier table-tennis match or a lower-division soccer game — events where human production never penciled — can now carry a produced, bettable, watchable product. The data is the script for all of it. Both companies cover hundreds of thousands of events a year; AI raises the revenue per event at the bottom of that pyramid from near-zero to something.
4. Agents and prediction markets need a settlement-grade truth oracle. If AI agents are placing bets or trading event contracts, they need machine-readable, licensed, low-latency, legally safe ground truth — both to act on and to settle against. Scraped data doesn't work for settlement; official data is the oracle. The Kalshi–Sportradar deal is the early template. In an agentic world, the API call to the official feed is the toll booth.
5. Integrity services grow with the attack surface. Thousands of AI-priced micro-markets are also thousands of manipulation targets, and AI lowers the cost of coordinated fixing. Both companies sell integrity monitoring to leagues and regulators — that business scales with exactly the complexity AI creates.
Now the two leaks. First, the value-add layer can commoditize even as the feed doesn't. A Flutter or DraftKings can build its own AI trading models — they still must buy the raw rights-protected feed, but the services margin faces in-housing pressure from sophisticated operators, even as smaller ones outsource more. Second — and this is the big one — the leagues are the residual claimants. Every mechanism above raises the value of official data, but the rights are re-auctioned, and the leagues know what AI is doing to that value. AI also lowers the leagues' cost of collecting their own data (cameras plus computer vision), strengthening their threat to go direct. So a large share of the AI-created surplus gets transferred upstream at each renewal — that's why rights-cost inflation is flagged as the key endogenous concern for both. The NFL taking warrants in Genius is exactly this dynamic made explicit.
One marginal erosion worth knowing: AI makes unofficial data reconstruction cheaper — computer vision on a broadcast or pirate stream can rebuild play-by-play seconds behind real time ("courtsiding 2.0"). The official feeds keep the latency edge (in-venue collection) that matters for live betting, but enforcement against synthetic scraping becomes part of the moat maintenance.
Net read: AI is unambiguously positive for the value of the data — it multiplies markets, depth, monetizable events, and machine consumption, while the asset itself (real-world events under legal exclusivity) is the one thing AI can't generate. The investment question is the split: the duopoly's economics depend on whether market expansion and CV-driven product depth outrun what the leagues claw back at renewals. The line to watch in both names' filings: take-rate and gross margin on one hand, rights amortization and renewal terms on the other. The duopoly structure helps here too — with only two credible bidders for league rights, the auctions are less ruinous than they'd be with five.
Q: What is the distinction between GENI and SRAD on a high level?
At a high level they're the diversified incumbent versus the concentrated challenger — same business model, opposite portfolio construction.
Sportradar is the global utility. Founded in 2001, still run by founder Carsten Koerl (who remains a major holder), it's roughly twice Genius's size, profitable, and built on breadth: coverage across 80+ sports and on the order of 900k events a year, with marquee exclusive rights like the NBA, MLB, NHL, ATP tennis, and UEFA. Its strategy is full-stack: not just selling feeds but running the betting plumbing — live odds, managed trading services (MTS, where it effectively operates the risk book for bookmakers), streaming (bolstered by acquiring IMG Arena's portfolio), advertising (ad:s), and integrity. Because of the breadth, no single league renewal can break it, and the model behaves like an infrastructure compounder: take-rate expansion on a diversified rights base.
Genius is the concentrated marquee-rights play. It came public via SPAC in 2021 and is built on a handful of premium Anglo-American exclusives — most importantly the NFL official data rights (won in a 2021 bidding war against Sportradar, paying up and giving the NFL equity warrants), plus the NCAA and the Premier League (via Football DataCo). Two structural tilts distinguish it: first, the technology angle — its Second Spectrum acquisition made optical/skeletal tracking a core asset (it ran the NBA's tracking for years and does the Premier League's), which feeds products like BetVision, the in-stream NFL betting broadcast; second, the media layer — programmatic ads and fan-engagement products that monetize the same rights a second time. Higher growth (~31% vs ~12% on the FMP pull), but with concentration to match: the NFL relationship is both its crown jewel and its single point of failure, and the warrant structure means the league literally participates in the upside.
The duopoly framing is real but asymmetric: they've largely partitioned the rights map (Sportradar = global basketball/baseball/tennis/soccer breadth; Genius = NFL/NCAA/EPL depth), and the 2021 NFL auction is the cautionary tale of what happens when they do collide — rights inflation that took Genius years to digest. That's also why the leagues-as-residual-claimants point from the question above bites differently for each: a bad renewal is a margin headwind for Sportradar but a thesis event for Genius.
As investments: SRAD is the quality/steadiness expression — diversified, founder-led, profitable, the way to own the category with the least single-contract risk. GENI is the torque expression — more growth, more optionality (CV tech, BetVision, the media layer scaling against fixed rights costs), but the returns hinge on NFL economics at each renewal and on the media layer actually scaling. Roughly similar multiples (~3.0x vs ~3.6x EV/Sales), so the market isn't charging much for Genius's higher growth — arguably because it's pricing the concentration risk.
One caveat: exact current rights terms and renewal dates (especially the NFL deal's latest extension structure) should be verified in filings before leaning on them — the contours above are solid, but contract specifics move.
Q: Do SRAD and GENI differ substantially in how they are adopting AI technology or making their data more valuable in an AI era?
Substantially in emphasis, yes — even though the buzzword surface looks similar. The cleanest way to put it: Genius is applying AI to the data-capture and presentation layer; Sportradar is applying it to the pricing, risk, and personalization layer. One is making the data itself richer; the other is making each unit of data earn more.
Genius's AI center of gravity is computer vision. Second Spectrum gave it arguably the best sports-CV team in the business — years of running the NBA's official optical tracking, plus the Premier League's — and Genius has organized its product strategy around that: the GeniusIQ platform unifies the tracking layer, and the flagship expression is BetVision, the NFL stream with live odds and bet placement embedded directly in the broadcast, built on real-time CV understanding of the game. The same capability feeds its media/ads layer (audience products like FANHub targeting fans off sports data). So Genius's version of "making data more valuable in the AI era" is generative of new data: turning video into millions of skeletal coordinates per game, then turning those coordinates into augmented experiences and ad inventory. It's expanding what the rights yield.
Sportradar's AI center of gravity is the betting economics stack. Its signature AI products sit downstream of the feed: Alpha Odds (AI-driven, dynamically personalized odds that let each operator differentiate pricing), the MTS managed-trading business (ML risk management run as a service — effectively the outsourced quant desk for hundreds of bookmakers), the Vaix acquisition (deep-learning personalization and recommendations inside sportsbooks), and AI-driven integrity monitoring across its market surveillance. It has CV too — 4Sight and the Synergy Sports acquisition — so the capability gap is narrower than the positioning gap, but Sportradar's AI shows up as better decisioning per event: more markets priced, sharper risk, higher take per bet across an enormous diversified event base. Notably, it's also the one with the early agent/prediction-market posture — the Kalshi relationship makes it the settlement-oracle precedent.
Mapped to the five mechanisms above: Sportradar is strongest on mechanism 1 (AI multiplying priced markets per game) and 4 (the truth-oracle role), Genius on mechanism 2 (CV multiplying data per game) and 3 (AI-generated/augmented media). Both sell mechanism 5 (integrity). Which orientation wins depends on where the surplus lands: if the AI era's prize is decisioning — odds, risk, personalization at scale — Sportradar's infrastructure position compounds; if the prize is experience and attention — augmented broadcasts, in-stream betting, fan-data advertising — Genius's CV-first stack has the sharper edge. There's also a defensiveness asymmetry: Genius's CV layer creates proprietary data beyond the official feed (tracking data the league rights alone don't give you), while Sportradar's decisioning layer is the part big operators could most plausibly in-house — which loops back to the services-commoditization leak.
Caveats in the usual spirit: the product specifics here (Alpha Odds, Vaix, GeniusIQ, BetVision, Synergy, the Kalshi structure) are from desk knowledge plus this session's lighter pass — current scope hasn't been verified against the latest filings, and both roadmaps move fast. The thing to verify in a deep-dive batch is each company's disclosed revenue mix by product line — how much of Sportradar is now MTS/managed services and how much of Genius is BetVision/media — because that's where the AI orientation shows up in numbers rather than press releases.
The lens here is not who leverages journals more; it is who is embracing the AI era in a way the market under-recognizes, judged by the trajectory or transformation underway relative to company size. Financials from the main table (FMP, June 9 2026), as market cap, EV/Sales, and YoY revenue growth: RELX ~$92B, ~9.0x, +7%; Wolters Kluwer ~$38B, ~6.0x, +6%; Wiley ~$2.3B, ~1.9x, ~flat. The multiples are the recognition gap stated numerically.
| | RELX | Wiley (WLY) | Wolters Kluwer (WTKWY) |
|---|---|---|---|
| (a) Journal holdings — rank & domain | Elsevier is the #1 journal publisher by articles and citations (~2,900 journals; The Lancet, Cell Press), dominant in life sciences and medicine. On top sits a layer no one else has: Scopus indexes ~29,000 journals from ~7,000 publishers — RELX owns the citation graph of everyone's journals, not just its own. Journals are roughly a third of company revenue; Risk and Legal are bigger. | #3 commercial publisher (~2,000 journals), broad-spectrum with genuine franchise strength in materials science and chemistry (the Advanced family, >$70M and growing double-digit) plus exclusive publishing partnerships with ~900 scholarly societies — content it monetizes without owning outright. No citation-graph layer. Journals are ~57% of revenue — the only one of the three where the journal thesis is the stock thesis. | Smallest journal estate: Lippincott's ~300 medical/nursing titles, largely society-owned, narrow clinical domain. But the crown jewel sits above journals: UpToDate — 7,600+ expert contributors continuously synthesizing the literature into care recommendations. A derived, perpetually-current expert layer, arguably more AI-grounding-relevant than raw journals. |
| (b) Journal-value rank | 1 — largest, highest-impact corpus + the cross-publisher Scopus layer | 2 — top-tier breadth, society leverage, no meta-layer | 3 on journals per se — but UpToDate is a different, highly defensible asset class |
| (c) Brokering data beyond own journals (Nexus / clinical-style motions) | Deliberate non-broker. Refuses corpus licensing to labs; everything embeds in its own grounded products — Scopus AI, ScienceDirect AI, Reaxys/Embase/ClinicalKey AI, Lexis+ AI and Protégé — and it is buying AI-natives (Doctrine, Apr 2026, second legal-AI deal in 24 months). The one quasi-brokering position: Scopus AI monetizes 7,000 other publishers' abstracts inside RELX's product — aggregation at the metadata layer. | The only true broker of the three. $92M lifetime external AI licensing (three big-tech clients incl. AWS; Perplexity; Anthropic partnership); recurring inference pilots with pharma/chemical/space corporates; OpenEvidence licensing plus equity stake; Nexus — licensing agent for 36 partner publishers' content (brokering others' data on a take rate); COA clinical instruments distributed via IQVIA; Emerald acquired (Jun 2026) as more corpus to license. | Middle path: distribution-brokering, not corpus-brokering. No licensing to labs, but pushing curated data into third-party surfaces: UpToDate inside Microsoft Dragon Copilot, M365 Copilot, Teams; an Epic pilot pairing GPT-4 with UpToDate content; an expanded OpenAI enterprise collaboration; and Medi-Span Expert AI shipping an MCP server so third-party agents can consume its medication data — genuine agent-layer exposure. |
| (d) Brokering rank | 3 — by explicit choice; highest in-product monetization instead | 1 — corpus licensing + agency model + clinical infrastructure, all live | 2 — partner distribution + MCP/agent exposure, no corpus sales |
| (e) Transformation summary — embracing the AI era | The most advanced AI operator — and the least transformed. Every division has shipped grounded-AI SKUs and it is consolidating AI-native challengers by acquisition. But this is sustaining innovation: AI features defending and extending an already-premium subscription machine. The business model is unchanged; the strategy is to make sure nothing changes. | Business-model transformation, not feature addition. From subscription publisher to data licensor + licensing agent + clinical-infrastructure provider, with equity stakes in AI distribution (OpenEvidence). AI revenue went 0 → $40M (FY25) → $29M in Q1 FY26 alone, with the recurring share guided to triple. The company is becoming something it wasn't. | Fast, urgent — and defensive. Under direct attack: OpenEvidence (free, ad-funded, $12B valuation, Jan 2026) is aimed squarely at UpToDate's ~$595M seat-license business. The response is real and rapid — Expert AI signed by >half of US hospital enterprise customers (~2,000 hospitals) within months — but the transformation protects existing dollars more than it creates new lines. |
| (f) Transformation magnitude vs company size | Low–moderate — large absolute AI investment, immaterial against ~$12B revenue; fully recognized at ~9x sales | High — a new revenue category already ~2.5% of sales and compounding, on a $2.3B company at ~1.9x sales; the multiple says the market sees a books company | Moderate — product architecture genuinely rebuilt, but on ~€5.9B revenue the motion is value-protective; the market nets the OpenEvidence threat against the progress at ~6x sales |
Verdict through the stated lens. If the question is which company is transforming most to embrace the AI era while remaining under-recognized by the market, the answer is Wiley, and it isn't close. It is the only one of the three where AI is creating a genuinely new business (external licensing, the Nexus agency, clinical infrastructure, equity in AI distribution) that is material relative to the company's size and priced at a multiple that embeds none of it. RELX is the best AI operator of the three, but operating excellence on a recognized premium franchise is the opposite of an under-recognized transformation: at ~9x sales the market has already written the AI-winner thesis into the price. The interesting second-order case is Wolters Kluwer. The >50% enterprise adoption of Expert AI within months is the single most impressive execution statistic among the three, and if OpenEvidence stalls at the hospital-procurement gate, where governance, liability, and expert-in-the-loop validation matter more than free access, the market's threat discount itself becomes the under-recognition. Wiley is the transformation bet; Wolters Kluwer is the survival-mispriced bet; RELX is the quality compounder where the AI question is already answered in the multiple. Usual caveat: this is a qualitative framework, not advice, and the June 16 Wiley print is the nearest falsification test for the first claim.
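
The materiality claim for Wiley rests on simple arithmetic, which can be sketched as follows. This is a rough illustration, not a model: it uses only the figures quoted in the comparison above ($40M FY25 AI revenue, $29M in Q1 FY26, "~2.5% of sales"). It assumes the 2.5% share refers to FY25, and it uses a hypothetical flat run-rate that simply repeats Q1 four times. The function names are illustrative.

```python
# Figures quoted in the comparison above, in USD millions.
AI_REVENUE_FY25 = 40.0      # Wiley AI revenue, FY25
AI_REVENUE_Q1_FY26 = 29.0   # Wiley AI revenue, Q1 FY26 alone
AI_SHARE_OF_SALES = 0.025   # "~2.5% of sales"; ASSUMED to refer to FY25


def implied_sales(ai_revenue: float, share_of_sales: float) -> float:
    """Back out total sales from AI revenue and its share of sales."""
    if share_of_sales <= 0:
        raise ValueError("share_of_sales must be positive")
    return ai_revenue / share_of_sales


def flat_annualized(quarterly_revenue: float, quarters: int = 4) -> float:
    """HYPOTHETICAL flat run-rate: repeat one quarter for a full year."""
    return quarterly_revenue * quarters


sales = implied_sales(AI_REVENUE_FY25, AI_SHARE_OF_SALES)
run_rate = flat_annualized(AI_REVENUE_Q1_FY26)

print(f"Implied total sales: ~${sales:,.0f}M")
print(f"Flat-annualized Q1 FY26 AI revenue: ~${run_rate:,.0f}M")
print(f"Share of implied sales at that rate: ~{run_rate / sales:.1%}")
print(f"Multiple of FY25 AI revenue: ~{run_rate / AI_REVENUE_FY25:.1f}x")
```

Under these assumptions the implied sales base is about $1.6B and the flat run-rate is about $116M, or roughly 7% of sales. Licensing revenue is lumpy and only part of it is recurring, so the flat run-rate is best read as a sensitivity check rather than a forecast.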