AILEENA MACHINA
MARKET STRUCTURE · 2026.05.21 · CEX · DEX · MEV · Dune

The Darkest Trade

CEX-DEX arbitrage is the largest single MEV category on Ethereum, the quietest game on Solana, and the only profitable strategy where one leg of the trade is invisible to the chain you are trading on. A 2025 paper measured 7.2 million trades, $233.8M in extracted value, and 19 searchers — three of whom captured 75% of it. Here is how the mechanism actually works, where to read the data on Dune, and what makes Solana's DEX leg different.

01 — The Five-Cent Spread

SOL is quoted at $172.41 on Binance. The same SOL is quoted at $172.46 on a Solana DEX twenty milliseconds later. The two prices have drifted apart because the two markets are not the same machine. Binance updates its order book maybe a thousand times a second over a private matching engine. The DEX updates whenever someone submits a transaction that lands in the next block, which on Solana means once every 400 milliseconds and on Ethereum once every twelve seconds.

That five-cent gap is a free option. If you can buy on Binance and simultaneously sell on the DEX before either price moves, you have made five cents on a position you did not have to hold. Do it a thousand times in a single block and you have made fifty dollars on a single 400-millisecond slot. Do it across a year and you have built a profit pool that the public chain data alone cannot explain — because half of the trade never touched the chain at all.

This is CEX-DEX arbitrage, the largest extracted-value category on Ethereum and the most under-measured form of MEV on Solana. Until 2025 nobody had a clean public number for how big it was. That changed when a group of researchers — Sui414, William, soispoke, and malleshpai — released a custom Dune dataset that finally let outsiders see it.

02 — Two Flavors of the Trade

There are two distinct CEX-DEX strategies, and they trade different instruments: spot on one side, perpetuals on the other.

PRICE ARB

Spot price on a CEX drifts away from spot price on a DEX. Buy the cheap one, sell the expensive one, hold no inventory at the end of the slot.

The classic two-legged trade. Capacity bounded by DEX liquidity at depth.

FUNDING RATE ARB

Perpetual funding rate differs between a CEX perp and a DEX perp. Long the cheaper funding, short the more expensive funding, collect the spread per hour.

Carry trade, not directional. Capacity bounded by venue OI caps.

Drift's educational write-up gives the canonical retail framing: SOL spot on Binance at $13.179 versus SOL spot on Orca at $13.21 is a price-arb opportunity worth $0.031 per unit; SOL-PERP funding rate of 0.003% hourly on Binance versus -0.00511% hourly on Drift is a funding-arb opportunity worth the spread, every hour, for as long as it holds. The mechanics differ but the engineering problem is the same: you must move capital between two venues fast enough that the spread does not close while you are mid-trade.
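The arithmetic behind both flavors fits in a few lines. The sketch below reuses the figures quoted from the Drift write-up; the function names and the $10,000 notional are hypothetical, chosen only for illustration, and fees, slippage, and transfer costs are ignored.

```python
from decimal import Decimal


def price_arb_edge(cex_price: Decimal, dex_price: Decimal) -> Decimal:
    """Gross edge per unit: buy the cheaper venue, sell the dearer one."""
    return abs(dex_price - cex_price)


def funding_arb_edge_pct(rate_a_pct: Decimal, rate_b_pct: Decimal) -> Decimal:
    """Hourly funding spread, in percent, between two perp venues."""
    return abs(rate_a_pct - rate_b_pct)


# Figures quoted from the Drift example above.
price_edge = price_arb_edge(Decimal("13.179"), Decimal("13.21"))
funding_edge_pct = funding_arb_edge_pct(Decimal("0.003"), Decimal("-0.00511"))

# Hypothetical position size, for illustration only.
notional_usd = Decimal("10000")
hourly_carry_usd = notional_usd * funding_edge_pct / Decimal("100")

print(f"price edge per SOL: ${price_edge}")
print(f"funding spread per hour: {funding_edge_pct}%")
print(f"hourly carry on ${notional_usd}: ${hourly_carry_usd:.3f}")
```

The funding leg looks tiny per hour, which is why it is a carry trade: the edge only matters if the spread persists and the position is large relative to the costs of holding it.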

03 — Why It Is Called "Dark"

Sandwich attacks, JIT liquidity, and on-chain DEX-to-DEX arbitrage all leave a complete fingerprint on the blockchain. The bot's transaction is there. The pool it touched is there. The user it extracted from is there. You can write a Dune query, count the swaps, sum the profit, and rank the searchers — every step of the proof is on a public ledger.

CEX-DEX arbitrage is different. One leg of the trade is off-chain. You can see a searcher swap a million USDC for SOL on Uniswap or Orca, but the hedging short on Binance that locked in the profit is invisible — it lives inside a private matching engine you cannot subpoena. The chain only shows you half the picture.

The 2025 paper by Sui414 et al., titled Measuring CEX-DEX Extracted Value and Searcher Profitability: The Darkest of the MEV Dark Forest, is the first systematic attempt to reconstruct that hidden leg. The method is indirect: identify the on-chain leg of the arbitrage by its structure (atomic, large, followed by no offsetting transaction), then check the CEX price at the same block time to estimate the spread the searcher captured. Aggregate that over millions of trades and you finally have a number for the size of the market.

04 — The Numbers

From August 2023 to March 2025, a 19-month window, the researchers identified 7,203,560 CEX-DEX arbitrage trades and estimated $233.8 million in extracted value across them. That is an average of about $32 per trade, which sounds small until you remember how little risk each trade carries: the on-chain leg either fills in full or reverts, and by the time it confirms, the CEX leg is typically already done. The exposure on any single trade is bounded, not open-ended.

| Metric | Value | Notes |
|---|---|---|
| Period studied | Aug 2023 – Mar 2025 | 19 months |
| Trades identified | 7,203,560 | On-chain leg only |
| Extracted value | $233.8M | Estimated from spread × size |
| Average per trade | ~$32 | Risk-bounded |
| Identified searchers | 19 | Top 3 captured 75% of volume and value |

Source: Sui414 et al., arxiv 2507.13023v2 (2025).
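The headline figures can be cross-checked with simple division. The sketch below uses only the numbers in the table; the even split within the top three and within the remaining sixteen searchers is an assumption made purely to show the scale of the gap, not something the paper reports.

```python
EXTRACTED_USD = 233_800_000
TRADES = 7_203_560
SEARCHERS = 19
TOP3_SHARE = 0.75

avg_per_trade = EXTRACTED_USD / TRADES

top3_total = EXTRACTED_USD * TOP3_SHARE
tail_total = EXTRACTED_USD - top3_total

# Assumption: value is split evenly inside each group (illustrative only).
avg_top3 = top3_total / 3
avg_tail = tail_total / (SEARCHERS - 3)

print(f"average per trade: ${avg_per_trade:.2f}")
print(f"average per top-3 searcher: ${avg_top3:,.0f}")
print(f"average per remaining searcher: ${avg_tail:,.0f}")
print(f"ratio: {avg_top3 / avg_tail:.0f}x")
```

Under that even-split assumption, a top-three searcher extracts on the order of sixteen times what a searcher in the tail does over the same window.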

The concentration result is the most interesting one. Three searchers controlled three quarters of the extracted value over the entire study period. CEX-DEX is not a long-tail competitive market like DEX-to-DEX arbitrage; it is an oligopoly. The barriers to entry are not technical (the code is straightforward) but operational: you need real-time, low-latency feeds from multiple CEXes, inventory on both sides, a settlement loop that nets out exposures continuously, and enough volume on each venue to avoid signalling your own trades to the market. A solo searcher does not have any of those things.

05 — The Dashboard

The paper's authors open-sourced the Dune queries that drive the analysis. You can fork them and watch the market live. Two are worth knowing by name.

Dashboard · dune.com/rig_ef/cex-dex-dash

CEX-DEX Arbitrage 💰

The full dashboard that accompanies the arxiv paper. Volume, value, leaderboard, pool breakdown, and per-searcher profitability across the entire 19-month window.

Query · dune.com/queries/3999754

Arbitrage profit per block, DEX ↔ CEX

Forkable single query. The atomic building block: a block-by-block view of how much CEX-DEX profit is being captured, by whom, and on which pool.

The interesting trick in the methodology is the markout. Because the CEX leg is invisible, you can never know exactly when the searcher hedged. So the queries don't try. Instead they check the CEX price at the block time, and at several offsets after (1 second, 5 seconds, 30 seconds), and assume the searcher captured something close to the average. The estimate is necessarily noisy at the trade level but converges quickly when you aggregate across millions of trades.
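A minimal sketch of a markout estimate for a single on-chain leg, assuming you already have the CEX mid price at the block time and at each offset. The function name, the `side` convention, and the sample prices are hypothetical; the published queries are written in SQL on Dune and may weight or filter offsets differently.

```python
from statistics import mean


def markout_usd(side: str, size: float, dex_price: float,
                cex_prices: dict[int, float]) -> tuple[dict[int, float], float]:
    """Mark one DEX fill to CEX prices sampled at several offsets (seconds).

    side is "buy" if the searcher bought the base asset on the DEX (and so
    presumably sold it on the CEX), or "sell" for the reverse.
    Returns the per-offset markout and the simple average across offsets.
    """
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    sign = 1.0 if side == "buy" else -1.0
    per_offset = {
        offset: sign * (cex_price - dex_price) * size
        for offset, cex_price in cex_prices.items()
    }
    return per_offset, mean(list(per_offset.values()))


# Hypothetical fill: 1,000 SOL bought on the DEX at 172.41.
# Hypothetical CEX prices at block time (0 s) and 1 s, 5 s, 30 s after.
per_offset, estimate = markout_usd(
    side="buy",
    size=1_000.0,
    dex_price=172.41,
    cex_prices={0: 172.46, 1: 172.45, 5: 172.47, 30: 172.44},
)
print(per_offset)
print(f"estimated extracted value: ${estimate:.2f}")
```

Any single estimate depends heavily on which offset happens to match the real hedge time, which is why the figure only becomes trustworthy once it is summed over many trades.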

A Python script for pulling the same dataset through the Dune API is published alongside the queries, which means you don't need a Dune subscription to reproduce the headline numbers — you can pull the raw CSV and run your own analysis offline.
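What follows is not the authors' script, only a sketch of what pulling a query's latest results as CSV might look like with the standard library. It assumes Dune's `results/csv` endpoint and `X-Dune-API-Key` header, so an API key is still needed; check the current Dune API reference before relying on either. The column names in the usage comment are hypothetical.

```python
import csv
import io
import urllib.request

DUNE_API = "https://api.dune.com/api/v1"  # assumed base URL


def build_csv_request(query_id: int, api_key: str) -> urllib.request.Request:
    """Build a GET request for the latest stored results of a query, as CSV."""
    url = f"{DUNE_API}/query/{query_id}/results/csv"
    return urllib.request.Request(url, headers={"X-Dune-API-Key": api_key})


def parse_rows(csv_text: str) -> list[dict[str, str]]:
    """Turn CSV text into one dict per row, keyed by the header line."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def fetch_latest_rows(query_id: int, api_key: str) -> list[dict[str, str]]:
    """Download and parse the results. Performs a network call."""
    request = build_csv_request(query_id, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_rows(response.read().decode("utf-8"))


# Usage (needs a real key; the query ID is the per-block query named above):
#   rows = fetch_latest_rows(3999754, api_key="...")
#   rows[0] might look like {"block_number": "...", "profit_usd": "..."}
```

Keeping the request builder and the parser separate from the network call makes it easy to test the parsing offline against a saved CSV.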

06 — Why Solana Changes the DEX Leg

Almost all of the public research on CEX-DEX arbitrage uses Ethereum data. The DEX leg in that data is Uniswap V3 or Curve, and the on-chain economics are dominated by Ethereum's 12-second blocks and high gas fees. Solana flips both inputs.

A Solana block lands every 400 milliseconds, not every 12 seconds. The base fee on a swap is $0.00025, not $5. A bot can attempt thirty times as many CEX-DEX trades per minute, at one-twenty-thousandth the cost per attempt. That changes which arbitrage opportunities are profitable: spreads that would never clear gas on Ethereum clear easily on Solana. It is part of why Solana's DEX prices have converged so tightly with CEX prices in 2025 despite a 90% drop in DEX volume since 2024: the arbitrage pressure is doing more work per unit of liquidity.

| | Ethereum | Solana |
|---|---|---|
| Block time | 12 s | ~400 ms |
| Min fee per swap | $1 – $10 typical | $0.00025 base |
| Atomicity model | Bundle via Flashbots / MEV-Boost | Bundle via Jito / Samba; flash loans within one tx |
| Read-the-mempool latency | Public mempool, hundreds of ms | No public mempool; ShredStream / Jito relay, sub-100 ms |
| Min profitable spread | ~$30 to clear gas + bid | ~$0.50 with priority fee |

Ranges illustrative, based on Helius MEV Report and public protocol fee schedules.
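The block-time and fee inputs translate directly into how often a bot can try and how small a trade can be before the fee eats the edge. The sketch below uses the article's $5 and $0.00025 fee figures; the 1 basis point spread is a hypothetical input, and builder bids, priority fees, and pool fees are deliberately left out, so the break-even sizes are a floor, not a forecast.

```python
ETH_BLOCK_MS, SOL_SLOT_MS = 12_000, 400
ETH_FEE_USD, SOL_FEE_USD = 5.0, 0.00025

# How many more DEX-leg attempts fit in the same wall-clock time.
attempts_ratio = ETH_BLOCK_MS / SOL_SLOT_MS

# How much cheaper each attempt is at the base fee.
cost_ratio = ETH_FEE_USD / SOL_FEE_USD


def breakeven_notional_usd(fee_usd: float, spread_bps: float) -> float:
    """Trade size at which a spread (in basis points) just covers a flat fee."""
    return fee_usd / (spread_bps / 10_000)


SPREAD_BPS = 1.0  # hypothetical spread, for illustration only
eth_breakeven = breakeven_notional_usd(ETH_FEE_USD, SPREAD_BPS)
sol_breakeven = breakeven_notional_usd(SOL_FEE_USD, SPREAD_BPS)

print(f"attempts per unit time: {attempts_ratio:.0f}x")
print(f"cost per attempt: 1/{cost_ratio:,.0f}")
print(f"break-even size at {SPREAD_BPS} bp: "
      f"${eth_breakeven:,.0f} on Ethereum vs ${sol_breakeven:,.2f} on Solana")
```

At a 1 bp spread the flat fee alone demands a five-figure trade on Ethereum and a few dollars on Solana, which is the sense in which small spreads become tradable.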

The atomicity model is the one nuance that matters most. On Ethereum, the DEX leg of the trade can be a multi-step transaction bundle. On Solana, the same effect can be achieved with a flash loan inside a single transaction: borrow a million USDC, do the swap, hedge the CEX leg out of band, repay the loan — all within the 400ms slot. If the spread closed mid-flight, the transaction reverts and the only cost is the priority fee. That asymmetry is what makes Solana's DEX leg attractive for high-frequency CEX-DEX strategies even when the spread is small.
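The "reverts if the spread closed" property comes down to a minimum-output guard on the swap. The sketch below simulates that guard in Python against a toy constant-product pool; it is not Solana program code, it does not model the flash loan or the CEX hedge, and the reserves, the 1 bp pool fee, and the $1 required edge are all hypothetical.

```python
class Reverted(Exception):
    """Stands in for a failed transaction that leaves pool state untouched."""


class ConstantProductPool:
    def __init__(self, base_reserve: float, quote_reserve: float, fee_bps: float):
        self.base = base_reserve
        self.quote = quote_reserve
        self.fee_bps = fee_bps

    def sell_base(self, base_in: float, min_quote_out: float) -> float:
        """Swap base for quote; raise before mutating state if output is too low."""
        effective_in = base_in * (1 - self.fee_bps / 10_000)
        quote_out = self.quote * effective_in / (self.base + effective_in)
        if quote_out < min_quote_out:
            raise Reverted(f"got {quote_out:.2f}, needed {min_quote_out:.2f}")
        self.base += base_in
        self.quote -= quote_out
        return quote_out


def try_sell(pool: ConstantProductPool, base_in: float, min_quote_out: float):
    """Return the fill, or None if the guard tripped."""
    try:
        return pool.sell_base(base_in, min_quote_out)
    except Reverted:
        return None


# The searcher buys 100 SOL on the CEX at 172.41 and wants at least $1 of edge.
MIN_OUT = 100 * 172.41 + 1.00

# Case 1: the pool still quotes about 172.46, so the guard passes.
fresh = ConstantProductPool(1_000_000.0, 172_460_000.0, fee_bps=1.0)
filled = try_sell(fresh, 100.0, MIN_OUT)

# Case 2: a competitor sold 5,000 SOL first, so the spread is gone.
moved = ConstantProductPool(1_000_000.0, 172_460_000.0, fee_bps=1.0)
moved.sell_base(5_000.0, min_quote_out=0.0)
state_before = (moved.base, moved.quote)
reverted = try_sell(moved, 100.0, MIN_OUT)

print("fresh pool fill:", filled)
print("moved pool fill:", reverted)
```

The check runs before any reserve is updated, so a tripped guard costs the searcher only the fee for the attempt. That is the asymmetry the paragraph above describes.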

WHAT IS NOT IN PUBLIC DATA

The arxiv paper's 7.2M trades are dominated by Ethereum activity, because that is where the DEX leg is most legible. The equivalent Solana number is much harder to estimate publicly: Solana has no public mempool, the on-chain leg can be deeply composed with non-arbitrage activity in the same transaction, and the relevant Dune tables are still maturing. Treat any public Solana CEX-DEX size estimate with the same scepticism the paper applies to its Ethereum estimates.

07 — The Builder's Map

Two pieces of work are worth reading if you want to build instead of just measure.

CEX-DEX-ARB-RESEARCH

Solid Quant's open-source research template. Real-time CEX and DEX feeds, spread detection, and a hookable execution layer. The starting point most builders use.

github.com/solidquant

WHACK-A-MOLE

Solid Quant's public write-up on building the first version of the bot. Useful as a pedagogical walk-through — what works, what doesn't, and which assumptions about latency turn out to be wrong in production.

medium.com/@solidquant

Neither of these will make you competitive with the top three searchers in the paper. They are starting points for understanding the engineering shape of the problem, not a production stack. The real edge is in the things that cost money: colocated CEX connections, sub-millisecond market data feeds, real-time inventory netting, and — increasingly on Solana — a relationship with a stake-weighted relay so your DEX-leg transactions don't sit behind public traffic.

08 — Where to Read the Spread on Solana

If you want to watch the same kind of trade happen on Solana rather than Ethereum, the place to start is the layer below the DEX swap — the mempool-equivalent. Solana doesn't have a public mempool the way Ethereum does, but it does have ShredStream and the Jito relay, both of which expose in-flight transactions to subscribers before block inclusion. A CEX-DEX searcher running on Solana is reading from one of these (or both), comparing the implied DEX-leg price against a live CEX quote, and submitting the arbitrage transaction with a priority fee tuned to land in the next slot.
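The comparison step can be sketched as a pure function: given a CEX bid and ask and an effective DEX execution price, decide whether either direction clears fees and the priority fee. The names, fee levels, and quotes below are hypothetical, and the DEX price is assumed to already include price impact for the chosen size.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signal:
    direction: str           # "buy_cex_sell_dex" or "buy_dex_sell_cex"
    expected_pnl_usd: float  # net of venue fees and the priority fee


def evaluate(cex_bid: float, cex_ask: float, dex_price: float, size: float,
             cex_fee_bps: float, dex_fee_bps: float,
             priority_fee_usd: float, min_pnl_usd: float = 0.0) -> Optional[Signal]:
    """Return a Signal if one direction is profitable after costs, else None."""
    if dex_price > cex_ask:
        direction, buy_px, sell_px, cex_px = "buy_cex_sell_dex", cex_ask, dex_price, cex_ask
    elif dex_price < cex_bid:
        direction, buy_px, sell_px, cex_px = "buy_dex_sell_cex", dex_price, cex_bid, cex_bid
    else:
        return None  # DEX price sits inside the CEX spread: nothing to do

    gross = (sell_px - buy_px) * size
    fees = size * (cex_px * cex_fee_bps + dex_price * dex_fee_bps) / 10_000
    pnl = gross - fees - priority_fee_usd
    return Signal(direction, pnl) if pnl > min_pnl_usd else None


# Hypothetical quotes matching the opening example, 100 SOL per attempt.
signal = evaluate(cex_bid=172.40, cex_ask=172.41, dex_price=172.46, size=100.0,
                  cex_fee_bps=1.0, dex_fee_bps=1.0, priority_fee_usd=0.01)
print(signal)
```

With these inputs a five-cent spread only survives because both venue fees are set to 1 bp; at a typical 30 bp pool fee the same quote returns no signal, which is why fee tier matters as much as latency.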

The companion piece The Wire — How Solana Actually Moves Bytes covers ShredStream, the leader schedule, and commitment levels in detail; the Wire Speed piece covers the validator architecture that makes Solana's tight slot timing feasible in the first place. CEX-DEX arb is the trade that pays for that infrastructure.

09 — The Mental Model

A CEX-DEX arbitrage is a single trade with two execution clocks: a 1-millisecond clock for the CEX leg, and a 400-millisecond (or 12-second) clock for the DEX leg. The job is to keep both clocks synchronised long enough to extract the gap, then unwind without leaving inventory on either side.

Every engineering choice in this game collapses to that one tension. Why is co-location worth paying for? Because it shrinks the CEX clock toward zero. Why is Solana's 400ms slot attractive? Because it shrinks the DEX clock toward the CEX clock. Why are flash loans useful? Because they let the DEX-leg execution model behave atomically — collapse if either side fails — which removes the asymmetric risk of being left holding the bag on a half-completed trade.

The reason CEX-DEX is "the darkest of the dark forest" is not that it is hidden by design. It is hidden because half of the relevant data lives on a private matching engine that no one has access to. The Dune queries above are not measuring the trade directly — they are measuring its on-chain shadow. That shadow is enough to estimate $233.8M of extracted value over 19 months, identify the oligopoly structure of the searcher market, and watch the spreads on a per-block basis as they open and close. It is not enough to copy what the top three searchers are doing, because the relevant code path is the one you cannot see.

References

  1. Sui414, William, soispoke, malleshpai — Measuring CEX-DEX Extracted Value and Searcher Profitability (arxiv 2507.13023v2)
  2. CEX-DEX Arbitrage 💰 — main Dune dashboard for the paper
  3. Arbitrage profit per block, DEX ↔ CEX — forkable Dune query
  4. Dune — official X thread summarising the dataset
  5. Drift Learn — How To Arbitrage between CEXs & DEXs (price arb vs funding rate arb)
  6. Helius — Solana MEV Report
  7. Analysis of CEX-DEX Arbitrage Opportunities with Hidden Markov Models — ACM Web Conference 2026
  8. DexAnalytics TLDR — Analysis of CEX/DEX Arbitrage
  9. Solid Quant — Whack-A-Mole: how I built my first MEV arbitrage bot
  10. Solid Quant — cex-dex-arb-research (GitHub template)
  11. crypto.news — Solana DEXs match CEX pricing as on-chain liquidity evolves
  12. The Wire — How Solana Actually Moves Bytes (companion piece)
  13. Solana at Wire Speed — validator architecture (companion piece)
  14. The RPC Layer That Cut the Cord — RPC provider landscape (companion piece)