
Research 2026-05-14 14 min read

Cross-chain settlement latency distribution on ETH-ARB CCTP, year 2025, showing the directional asymmetry of substrate stress propagation

Cross-chain settlement latency under substrate stress: empirical study of ETH-ARB-CCTP 2025

This study applies the Invarians substrate observability framework to the Ethereum L1 and Arbitrum L2 chains, together with the CCTP bridge connecting them, over the full year 2025. The corpus covers 8,760 hourly observations on each substrate chain and 17,016 hourly states on the bridge layer, derived from 374,980 reconstructed CCTP messages.

The principal finding is a directional asymmetry. When Arbitrum sits in a non-nominal structural regime (S2+D1), the median attestation latency of outbound ARB to ETH transfers is 1.50 times the nominal baseline, with a p-value far below conventional alpha levels (Mann-Whitney U, p = 1.6 × 10⁻¹⁴, n = 6,171 hours after outlier removal). The equivalent test in the opposite direction (ETH to ARB) yields no operationally meaningful effect: p = 0.045 with a median ratio of 1.04, far below the 2.71 times minimum effect that this corpus could reliably detect. The asymmetry is consistent with the protocol architecture of CCTP, where total transfer latency is dominated by source-chain finality and the response of the Circle attester to it.

The tail behavior of the latency distribution exhibits a different pattern. Tail amplification appears only in compound regimes where structural and demand stress occur simultaneously (S2+D2+: 1.81 times the nominal upper-decile latency; S2+D2±: 2.14 times). Pure structural stress shifts the median but not the tail. A direct test of the proposed mechanism via sequencer publish latency reveals a weak Spearman correlation (rho = 0.039) at the hourly granularity, partially explaining the tail behavior but not the median shift. A precursor analysis controlling for the base rate of substrate non-nominal periods shows a lift of 1.18 times, a weak predictive signal that should not be presented as a deterministic anticipation tool with the current six-hour lead window.

For institutional cross-chain flows on EVM-to-EVM CCTP lanes, the operational signal lives on the source chain of the intended transfer, more so in the median than in the tail, more so under compound stress than under pure structural stress. The reverse direction does not benefit measurably from substrate observation at the precision of this study.

1. Context

The Invarians framework was built to characterize the state of a blockchain substrate, its physical operating regime, not its market prices or its smart contract economics, so that processes acting on top of it can choose their execution windows. The target users are not traders chasing five basis points of slippage. The target users are RWA treasuries rebalancing positions across chains, intent solvers honoring fill commitments, keeper networks executing scheduled rebalances, and the workflow layers being deployed inside Chainlink CRE. These are operations where the cost of an unexpected three-hour delay on a settlement leg exceeds the cost of any reasonable gas optimization by orders of magnitude.

The product question this study addresses is narrow and operationally framed. When a substrate chain enters a non-nominal state under the Invarians classification, does the cross-chain settlement layer change measurably? If yes, in which direction, by how much, and with what statistical confidence? If no, what magnitude of effect could have been ruled out?

We answer these questions for one specific configuration: the ETH and ARB CCTP lane over the year 2025. This is not the most active lane on CCTP, but it is the most documented and the one for which public blockchain datasets expose observations of both substrates at sufficient depth to allow a self-contained reconstruction. Larger studies covering other lanes and chains will follow.

2. Vocabulary and the substrate classification

Eight terms carry the framework. A reader who keeps them straight will read the rest of the article without friction. Definitions are aligned with the canonical Invarians glossary.

Nominal: The current expected behavior of a chain or a bridge under no exceptional condition, measured along structural rhythm, demand pressure, and posting continuity. The nominal is not a fixed number. It evolves with the chain itself: protocol upgrades, application mix changes, and rising agentic load all reshape the underlying distribution. Each chain and each bridge has its own nominal, qualified through continuous calibration on a rolling baseline.
Regime: The operating state of a blockchain over a structural time window of approximately one hour. A regime is not an event. It is a certified state of the infrastructure substrate, derived from two independent axes: structural rhythm and demand pressure. The Invarians notation for regimes is the SxDx grid described below.
State: The operational state of a cross-chain bridge, measured independently from the L1 or L2 structural regime. Bridges can transition from healthy to congested faster than a structural regime change. The state taxonomy for bridges is BS1 (nominal, latency and backlog within range) and BS2 (degraded, latency or backlog above the calibrated threshold). The terminology is deliberate: regime applies to substrate chains, state applies to bridges.
Drift: The change of a chain's nominal over time. Distinct from a short-term deviation, where the chain temporarily falls outside its window and returns. Drift is structural: the window itself moves. The deformation is plastic, intrinsic to evolving protocols and irreversible at the timescale of observation. Ethereum's nominal post-Dencun is statistically different from its nominal pre-Dencun. Drift must be tracked, not assumed away.
Shift: The signed difference between the short EMA ratio and the long EMA ratio for a given measurement axis: shift = ratio_short - ratio_long. It captures the elastic divergence of recent behavior from the rolling baseline. A positive shift means the chain is currently above its 30-day baseline; a negative shift means it has dropped below or has converged toward the long baseline. Shifts can snap back; when they accumulate and amplify across observables, drift is what they produce at the chain level. Where the regime classification answers "is the chain stressed right now?", shift answers "is the chain pulling away from its 30-day window?".
Divergence: A measurement showing the current state of a chain or a bridge falls outside its calibrated nominal window. Divergence may be statistical, expressing how unlikely the present state is under the rolling distribution, or event-based, matching the present state to a documented historical incident. Invarians qualifies divergence; the agent owns the decision on what to do with it.
Calibration: The continuous process by which the nominal window is defined for each chain, each layer and each bridge, on a rolling 30-day baseline updated hour by hour. Distinct from a static threshold. The calibration tracks the nominal as it drifts. Reproducible methodology at github.com/agentnorthstar/calibration.
Threshold: The numerical bound, per chain and per measurement axis, above or below which a signal is considered to have left its nominal window. Thresholds are the output of the continuous calibration process, derived statistically from the rolling 30-day distribution of each observable. Each chain has its own thresholds because each chain has its own nominal. Both upper and lower bounds may be calibrated for two-sided signals, enabling the signed regime classification.
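
As an illustration of the Shift definition, here is a minimal sketch in Python. It assumes that each "ratio" is an exponential moving average divided by a fixed reference level, and that the short and long windows are 6 hours and 720 hours (30 days). The function names, the spans, and the `baseline` parameter are all assumptions made for this example; the production normalization lives in the calibration repository and may differ.

```python
def ema(values, span):
    """Final value of an exponential moving average over `values`."""
    alpha = 2.0 / (span + 1)  # conventional span-to-alpha mapping
    level = values[0]
    for v in values[1:]:
        level = alpha * v + (1.0 - alpha) * level
    return level


def shift(values, baseline, short_span=6, long_span=720):
    """shift = ratio_short - ratio_long, following the glossary formula.

    Assumptions (not from the article): each ratio is an EMA divided by a
    fixed reference level `baseline`, and spans are expressed in hours.
    """
    ratio_short = ema(values, short_span) / baseline
    ratio_long = ema(values, long_span) / baseline
    return ratio_short - ratio_long


steady = [1.0] * 200                 # hourly metric sitting on its baseline
surge = [1.0] * 200 + [2.0] * 6      # six recent hours at double the baseline

print(shift(steady, baseline=1.0))   # approximately zero
print(shift(surge, baseline=1.0))    # positive: the short EMA reacts first
```

The short EMA moves quickly toward the new level while the long EMA barely reacts, which is why a recent surge produces a positive shift that can later snap back.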

The substrate classification matrix

With the vocabulary in place, the classification is straightforward.

The structural axis (S) captures how the chain itself behaves as infrastructure. Are blocks produced on rhythm? Are batches posted on schedule? Is consensus participation steady? S1 means nominal. S2+ means structurally stressed in the slowdown direction (sequencer slow, validator delays). S2- means structurally stressed in the opposite direction (consensus distress, beacon participation drops).

The demand axis (D) captures the activity layer on top of that infrastructure. Are transactions of the expected mix coming in? Is gas usage tracking baseline? Is the composition of transactions normal? D1 means nominal. D2+ means a surge above the upper threshold. D2- means anomalous quiet below the lower threshold. D2± means at least one demand axis is high and another is low simultaneously (a composition signature).

The combination yields the regime codes used throughout the article. S1D1 is nominal on both axes. S2+D1 is pure structural stress with normal demand. S1D2+ is a demand surge with no structural problem. S2+D2+ is compound stress on both axes. The full grid has twelve possible codes once polarity is applied, of which seven appear non-trivially in the 2025 ETH and ARB corpus.
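
The grid can be enumerated in a few lines. The sketch below builds the twelve codes from the two axes and adds a hypothetical `classify_demand` helper that follows the D-axis rules described above. The metric names and threshold values passed to it are placeholders, not calibrated Invarians values.

```python
from itertools import product

STRUCTURAL = ["S1", "S2+", "S2-"]
DEMAND = ["D1", "D2+", "D2-", "D2±"]

# Three structural states times four demand states.
REGIME_CODES = [s + d for s, d in product(STRUCTURAL, DEMAND)]


def classify_demand(ratios, lower, upper):
    """Map several demand-axis ratios to a D code.

    `ratios`, `lower`, and `upper` are dicts keyed by metric name.
    Thresholds are illustrative placeholders.
    """
    high = any(ratios[m] > upper[m] for m in ratios)
    low = any(ratios[m] < lower[m] for m in ratios)
    if high and low:
        return "D2±"  # one axis high, another low: composition signature
    if high:
        return "D2+"
    if low:
        return "D2-"
    return "D1"


print(len(REGIME_CODES))
print(REGIME_CODES)
```

A single metric cannot be both above its upper bound and below its lower bound, so D2± can only arise when two different demand metrics disagree.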

One important caveat affects this study specifically. The S2- regime requires beacon participation data for Ethereum, sourced from the Beacon Chain via an external API. That input is not yet wired into version 0.1 of the data pipeline used here. As a consequence, no S2- regime is observed on Ethereum in this corpus. The historical episodes where ETH consensus distress would have led to an S2- classification (Geth client bugs, Lido outages, Prysm bug periods) are invisible to the dataset behind this study. They are visible to live Invarians production observations once beacon participation is part of the operational pipeline. The S2- gap is explicit and acknowledged.

3. Corpus and methodology

The corpus covers calendar year 2025, hourly granularity, for three observation layers.

Ethereum L1 substrate. Hourly aggregates of block-level metrics, transformed into the six L1 metrics defined by the Invarians framework (rhythm, continuity, sigma_demand, size_demand, tx_demand, complexity_value). Total: 8,760 hourly observations.

Arbitrum L2 substrate. Hourly aggregates of Arbitrum One blocks and transactions, plus L1 batch posting events for the Arbitrum SequencerInbox contract, yielding the L2 metric set including sequencer_publish_latency. Total: 8,760 hourly observations.

CCTP bridge layer. Per-message reconstruction of DepositForBurn events on the source chain TokenMessenger and MessageReceived events on the destination chain MessageTransmitter, matched by source domain and nonce. 374,980 messages matched across the two routes (ETH to ARB and ARB to ETH). Hourly aggregation produces 17,016 bridge state observations, each with attestation latency percentiles and a calibrated BS1 or BS2 classification.
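
The matching step can be pictured as a join on the pair (source domain, nonce). The sketch below is a simplified illustration and not the repository's implementation. The event dictionaries and their field names are assumptions, and timestamps are plain Unix seconds.

```python
def match_messages(burns, receives):
    """Join burn and receive events on (source_domain, nonce).

    Each event is a dict with keys `source_domain`, `nonce`, and
    `timestamp` (Unix seconds). Field names are illustrative.
    Returns a list of (source_domain, nonce, latency_seconds) tuples.
    """
    received_at = {
        (r["source_domain"], r["nonce"]): r["timestamp"] for r in receives
    }
    matched = []
    for b in burns:
        key = (b["source_domain"], b["nonce"])
        if key in received_at:  # unmatched burns are simply skipped
            latency = received_at[key] - b["timestamp"]
            matched.append((key[0], key[1], latency))
    return matched


# Domain 3 is Arbitrum and domain 0 is Ethereum in Circle's CCTP numbering.
burns = [
    {"source_domain": 3, "nonce": 101, "timestamp": 1_735_700_000},
    {"source_domain": 3, "nonce": 102, "timestamp": 1_735_700_600},  # never received
]
receives = [
    {"source_domain": 3, "nonce": 101, "timestamp": 1_735_701_380},
]

print(match_messages(burns, receives))
```

In this toy example the first message is matched with a latency of 1,380 seconds (23 minutes) and the second is left unmatched, as a lost or unclaimed message would be.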

Calibration is quarterly. For each calendar quarter, the threshold separating BS1 from BS2 is calibrated from the 30 days preceding the quarter. Substrate thresholds follow the production calibration logic documented in the Invarians calibration log. Per-quarter calibration accepts that incidents falling inside the calibration window influence the threshold; this is the production methodology and the trade-off is documented openly in the methodology file of the analysis repository.

All percentile computations on bridge latency apply a 72-hour outlier cutoff. Above this threshold, messages are treated as lost, never claimed, or contaminated by upstream pipeline issues unrelated to substrate state. 95 hours on ETH to ARB (1.11 percent) and 31 hours on ARB to ETH (0.37 percent) exceed this threshold. They are excluded from inferential tests and reported separately as a count.
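
A minimal sketch of the cutoff and the percentile summary for a single hour follows. It assumes the cutoff is applied to individual message latencies before percentiles are taken, and it uses a nearest-rank percentile; the article does not specify either choice, so both are assumptions.

```python
import math

CUTOFF_SECONDS = 72 * 3600  # the 72-hour outlier cutoff described above


def percentile(sorted_values, q):
    """Nearest-rank percentile (q in 0..100) of an ascending, non-empty list."""
    rank = max(1, math.ceil(q * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def hourly_summary(latencies_seconds):
    """Median and upper decile of one hour's latencies, after the cutoff."""
    kept = sorted(x for x in latencies_seconds if x <= CUTOFF_SECONDS)
    excluded = len(latencies_seconds) - len(kept)
    if not kept:
        return {"median": None, "p90": None, "excluded": excluded}
    return {
        "median": percentile(kept, 50),
        "p90": percentile(kept, 90),
        "excluded": excluded,
    }


hour = [600, 900, 1200, 1380, 1500, 1800, 2400, 3600, 5400, 7200, 300_000]
print(hourly_summary(hour))
```

The last value in the example (300,000 seconds, about 83 hours) exceeds the cutoff and is counted as excluded instead of distorting the upper decile.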

The pipeline that produced this corpus is open source and reproducible from public blockchain datasets exposed by Google Cloud, with a standard account. Full reproduction details are in the companion calibration repository linked at the end of the article.

4. What 2025 looked like on the substrate

The annual distribution of regime codes provides the baseline against which everything else is read.

On Ethereum, 70.5 percent of hours were S1D1 nominal. The dominant non-nominal class was S1D2+ (cost surge) at 18.1 percent. Pure structural slowdown without demand stress (S2+D1) appeared in 3.8 percent of hours, while compound regimes (S2+D2x family) accumulated to roughly 1.4 percent. The absence of S2- is structural to the methodology, not empirical.

On Arbitrum, 68.9 percent of hours were nominal. The non-nominal classes were more evenly distributed than on Ethereum, with S1D2+ at 14.5 percent, S1D2- (operational anomaly) at 6.2 percent, and pure structural S2+D1 at 4.0 percent. Compound regimes were more frequent on ARB than on ETH (2.5 percent versus 1.4 percent), consistent with the L2 being more responsive to combined infrastructure and demand fluctuations.

The temporal distribution of non-nominal hours is not uniform. Ethereum showed a concentrated cluster of S2+ events during a single week in May 2025. Arbitrum showed a long composition-anomaly cluster across the second half of September 2025, with four sustained-to-prolonged S1D2± events in six days. There is no temporal synchronization between the May 2025 ETH cluster and any equivalent ARB event, nor between the September 2025 ARB cluster and an ETH event. The two chains stress in different windows.

5. What 2025 looked like on the CCTP bridge

The bridge layer between Ethereum and Arbitrum carried 374,980 matched messages over the year. The flow was strongly asymmetric. The ARB to ETH direction carried 302,893 messages, roughly four times the volume of ETH to ARB (72,087 messages). The asymmetry is itself a known feature of stablecoin flow on this lane in 2025.

In terms of bridge state classification, the bridge spent 94.3 percent of hours in BS1 (nominal) on the ETH to ARB route and 92.5 percent on ARB to ETH. The BS2 (degraded) classification appeared in 268 hours on ETH to ARB and 387 hours on ARB to ETH. BS2 episodes were short: median episode duration was one hour on both routes, maximum two hours on ETH to ARB and five hours on ARB to ETH.

A different threshold is informative for institutional flows. Hours where the bridge upper-decile latency exceeded one hour appeared on 18.2 percent of ETH to ARB hours and 21.7 percent of ARB to ETH hours. This is much more frequent than BS2, because BS2 is a relative threshold (above the recent 30-day calibrated bound) while one hour is an absolute reference relevant to flow planning. Most of these "stuck fund hours" occurred during substrate S1D1 periods: 73 percent of stuck ETH to ARB hours and 69 percent of stuck ARB to ETH hours had nominal substrate. The bridge has failure modes that are not driven by substrate state. The substrate observability framework does not, and is not designed to, capture them all.

6. The central finding: source chain structural stress shifts bridge latency median

For each pair of substrate state and bridge direction, the study compares the distribution of per-hour upper-decile attestation latency of CCTP messages, conditional on the substrate being in S2+D1 (pure structural stress, no concurrent demand anomaly) versus S1D1 (nominal).

On the ARB to ETH direction, in hours where Arbitrum was in its nominal regime (n = 5,827), the median bridge attestation latency was 23 minutes and the upper decile was 1h 47 min. In hours where Arbitrum was in pure structural stress (n = 344), the median rose to 35 minutes and the upper decile was 1h 40 min. The median ratio is 1.50 times, the upper-decile ratio is 0.93 times. A one-sided Mann-Whitney U test on the alternative that S2+D1 latencies exceed S1D1 latencies returned U = 1,245,850 and p = 1.6 × 10⁻¹⁴.

On the ETH to ARB direction, in nominal hours (n = 5,998) the median was 23 minutes and the upper decile was 2h 17 min. In pure structural stress hours (n = 326) the median was 24 minutes and the upper decile was 1h 52 min. Median ratio 1.04 times, upper-decile ratio 0.82 times (lower under stress). The Mann-Whitney U test returned p = 0.045. The p-value falls just under the conventional 0.05 threshold, but the median shift is four percent, small enough that any reasonable interpretation would dismiss it as operationally meaningless. The honest reading is that no operationally significant effect was detected.

The shape of these two findings is the central observation of the study. The source-chain structural state matters for the bridge latency median on the source-to-destination flow, by a margin of fifty percent on Arbitrum, and it matters statistically with extreme confidence. The reverse direction shows no comparable effect.
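
For readers who want to see what the one-sided Mann-Whitney U test computes, here is a dependency-free sketch using the large-sample normal approximation with tie and continuity corrections. It is an illustration under those assumptions and not the analysis script from the repository, which may rely on a statistics library. The sample data at the bottom is synthetic.

```python
import math
from collections import Counter
from statistics import NormalDist


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_greater(stressed, nominal):
    """One-sided test of the alternative that `stressed` exceeds `nominal`.

    Returns (U, p), where U is the statistic of the first sample and p
    comes from the normal approximation (large samples only).
    """
    n1, n2 = len(stressed), len(nominal)
    combined = list(stressed) + list(nominal)
    ranks = average_ranks(combined)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    ties = sum(t**3 - t for t in Counter(combined).values())
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1))))
    z = (u - n1 * n2 / 2 - 0.5) / sigma  # 0.5 is the continuity correction
    return u, 1.0 - NormalDist().cdf(z)


# Synthetic hourly latencies in minutes, for illustration only.
stressed_hours = [35 + i % 7 for i in range(50)]
nominal_hours = [23 + i % 7 for i in range(200)]
u, p = mann_whitney_greater(stressed_hours, nominal_hours)
print(u, p)
```

Because the test works on ranks, it is insensitive to the heavy upper tail of bridge latency, which makes it a natural choice for comparing these distributions.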

7. The tail is a different story: compound stress amplifies the upper decile

The median effect documented above does not extend to the upper tail of the bridge latency distribution. The upper-decile ratio in S2+D1 is at or below the S1D1 upper decile on both directions. The tail is unmoved by pure structural stress alone.

The picture changes when the substrate enters compound regimes, where structural and demand stress occur simultaneously. On the ARB to ETH direction:

ARB regime	n	Median	Upper decile	Upper-decile ratio vs S1D1
S1D1	5,827	23 min	1h 47 min	1.00x
S2+D1	344	35 min	1h 40 min	0.93x
S2+D2-	71	28 min	2h 11 min	1.22x
S2+D2+	74	36 min	3h 14 min	1.81x
S2+D2±	65	40 min	3h 49 min	2.14x
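
The ratio column can be recomputed directly from the upper-decile values shown in the table, which are rounded to the minute. This is a simple arithmetic check and not part of the study's pipeline.

```python
# Upper-decile latencies from the table above, converted to minutes.
upper_decile_min = {
    "S1D1": 1 * 60 + 47,
    "S2+D1": 1 * 60 + 40,
    "S2+D2-": 2 * 60 + 11,
    "S2+D2+": 3 * 60 + 14,
    "S2+D2±": 3 * 60 + 49,
}

baseline = upper_decile_min["S1D1"]
ratios = {code: round(v / baseline, 2) for code, v in upper_decile_min.items()}

for code, ratio in ratios.items():
    print(f"{code}: {ratio:.2f}x")
```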

The tail amplification is driven by combinations of structural stress with demand surge (D2+), with operational anomaly (D2-) to a lesser extent, or with composition signature (D2±). These are situations where the chain is structurally slow at the same time as the transaction mix is unusual.

The two findings combined describe a layered effect. Pure structural stress shifts where most transactions sit. Compound stress also stretches the tail, exposing more transactions to extreme outcomes. Both are statistically real, but they speak to different operational concerns. The median shift matters for expected settlement time. The tail amplification matters for worst-case planning and circuit breakers.

8. The null finding on ETH to ARB, and what it rules out

The absence of a measurable effect in the ETH to ARB direction is not the same as a proof that no effect exists. Sample size and inherent variance impose a floor on the magnitudes that any given dataset can detect.

In this corpus, with 5,998 nominal hours of ETH substrate and 437 S2+ hours of varying composition, the minimum detectable effect on the mean latency difference (at conventional significance and power thresholds of 0.05 and 0.80) is approximately 2,322 seconds. Expressed as a ratio against the median nominal latency, this corresponds to a detectable median ratio of approximately 2.71 times.

The honest reading is therefore narrow. This study can rule out effects of magnitude greater than 2.71 times on the median of ETH to ARB latency under ETH stress. Effects smaller than that may exist but would not have been detectable in this sample. A more powered study, either with a longer time window or with finer sub-regime splits including data from additional chains, would tighten this bound.
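
The shape of a minimum detectable effect calculation can be sketched as follows. This uses the standard two-sample z-test formula with a two-sided alpha, which is an assumption: the article does not state the exact formula, the sidedness, or the latency standard deviation used, so `sigma` here is a free parameter. The conversion to a median ratio is likewise an assumed form.

```python
import math
from statistics import NormalDist


def minimum_detectable_effect(sigma, n_nominal, n_stressed, alpha=0.05, power=0.80):
    """Smallest mean difference detectable by a two-sided two-sample z-test.

    `sigma` is the common standard deviation of hourly latency in seconds.
    The article does not report it, so any value passed in is an assumption.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sigma * math.sqrt(1 / n_nominal + 1 / n_stressed)


def as_median_ratio(mde_seconds, nominal_median_seconds):
    """Express an MDE as a multiple of the nominal median (assumed conversion)."""
    return (nominal_median_seconds + mde_seconds) / nominal_median_seconds


# Sample sizes from the text; sigma = 1000 s is purely illustrative.
print(minimum_detectable_effect(1000.0, n_nominal=5998, n_stressed=437))

# The reported 2,322 s against the rounded 23-minute nominal median.
print(as_median_ratio(2322, 23 * 60))
```

With the rounded 23-minute median, this conversion gives about 2.68, close to the reported 2.71. The small gap is what one would expect if the study used the unrounded median, but the exact conversion is not given in this article. The formula also makes the bottleneck visible: with 5,998 nominal hours, the detection floor is governed almost entirely by the 437 stressed hours.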

The 2.71 times detection floor is consistent with the proposed mechanism. If the dominant source of ETH to ARB latency is the fixed cost of Ethereum source-chain finality (around 13 minutes by protocol design) plus the response time of the Circle attester to that finality (a few minutes), then the variability driven by substrate state is bounded by the variability of those components. A two-and-a-half-times multiplier would require either a roughly 30-minute slowdown in ETH finality or a parallel slowdown in the attester pipeline, neither of which is plausible under the structural stress patterns observed on ETH in 2025.

9. Looking for the mechanism: sequencer publish latency

A correlation between substrate state and bridge latency is one finding; a measurement of the causal pathway between the two is another. The study tested the most natural candidate explanation: that Arbitrum structural stress (S2+D1 in particular) corresponds to delayed batch posting from the Arbitrum sequencer to Ethereum, which in turn delays the moment at which the Circle attester acknowledges the source-chain finality of a burn, which in turn extends the time until the destination mint can complete.

The Invarians L2 metric set includes sequencer_publish_latency, computed hourly from the L1 batch posting events of the Arbitrum SequencerInbox contract. Correlating this metric directly with the bridge latency on the ARB to ETH direction at the hourly granularity used here yields a Spearman correlation of 0.039 (p = 0.0004 on n = 8,455 hours). The correlation is statistically distinguishable from zero only because the sample is large. The effect size is small.
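
Spearman's rho is the Pearson correlation computed on ranks, which is why it captures monotonic association without assuming linearity. The sketch below is a dependency-free illustration with made-up data; the study's own script may use a statistics library.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed series."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Made-up hourly values: publish latency (s) and bridge latency (min).
publish_latency = [40, 55, 60, 90, 120, 300]
bridge_latency = [22, 25, 21, 24, 23, 30]
print(spearman_rho(publish_latency, bridge_latency))
```

A rho near zero over thousands of hours, as reported above, means the hour-by-hour ordering of the two series is almost unrelated, even though the p-value is small.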

A coarser view, binning the sequencer publish latency into quartiles, shows a more revealing pattern:

Sequencer publish latency quartile	n	Bridge median	Bridge upper decile
Q1 (lowest)	2,484	29 min	1h 28 min
Q2	1,783	24 min	1h 32 min
Q3	2,463	23 min	2h 12 min
Q4 (highest)	1,725	28 min	2h 46 min

The median bridge latency is essentially flat across the four quartiles, ranging from 23 to 29 minutes. The upper decile rises monotonically across the quartiles, from 1h 28 min in quartile one to 2h 46 min in quartile four, a near-doubling, with most of the increase occurring between quartile two and quartile four. Sequencer publish latency appears to influence the tail behavior of bridge latency but not its median.

This is partial mechanistic support. The substrate-driven median shift documented in Section 6 cannot be attributed solely to sequencer publish latency variation. Other contributors to the source-chain finality timeline must be at work. Candidate paths include the time between block validation and inclusion in a posted batch, the Circle attester's own queueing dynamics, and ETH-side mint timing, none of which the study measures. The article does not propose a definitive causal chain; it documents one mechanism that contributes to the tail and acknowledges that the median shift is not fully explained by the measured variables.

10. The precursor question: substrate stress as a leading signal

A common reading of the Invarians framework is that it provides anticipation. If substrate stress precedes bridge degradation, an operator could reschedule a flow before the degradation hits. The headline number is attractive: 79 percent of ARB to ETH BS2 hours had a non-nominal ARB substrate state somewhere in the preceding six hours.

This number, on its own, is not a finding. The fact that 79 percent of BS2 hours are preceded by substrate non-nominal periods is only meaningful relative to the base rate at which any hour is preceded by such periods. In a year where 31 percent of substrate hours are non-nominal, the average chance of seeing a non-nominal state somewhere in a random six-hour window is high.

The relevant comparison is the lift:

Direction	P(prior non-nominal | BS2)	P(prior non-nominal | BS1)	Lift
ETH to ARB	0.649	0.608	1.07x
ARB to ETH	0.791	0.672	1.18x

The lift values, 1.07 and 1.18, are weak. Substrate non-nominal periods are common enough that seeing one in the preceding six hours does not strongly differentiate BS2 from BS1 hours. As an anticipation signal evaluated on this corpus with a six-hour lead window and the current drift formulation, the result is modest at best.
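
The base-rate-controlled lift can be sketched as below. The function names and the boolean hourly series are assumptions for illustration; the lookback excludes the current hour, matching the phrase "in the preceding six hours". The last two lines recompute the lift column from the conditional probabilities in the table.

```python
def prior_flags(non_nominal, lead=6):
    """For each hour t, True if any of the `lead` preceding hours was non-nominal."""
    return [any(non_nominal[max(0, t - lead):t]) for t in range(len(non_nominal))]


def lift(non_nominal, bs2, lead=6):
    """P(prior non-nominal | BS2) divided by P(prior non-nominal | BS1).

    `non_nominal` and `bs2` are equal-length lists of hourly booleans.
    Both BS1 and BS2 hours must be present, and at least one BS1 hour
    must have a non-nominal prior, or the division fails.
    """
    prior = prior_flags(non_nominal, lead)
    bs2_prior = [p for p, degraded in zip(prior, bs2) if degraded]
    bs1_prior = [p for p, degraded in zip(prior, bs2) if not degraded]
    p_given_bs2 = sum(bs2_prior) / len(bs2_prior)
    p_given_bs1 = sum(bs1_prior) / len(bs1_prior)
    return p_given_bs2 / p_given_bs1


# Lift column recomputed from the table's conditional probabilities.
print(round(0.649 / 0.608, 2))  # 1.07
print(round(0.791 / 0.672, 2))  # 1.18
```

Dividing by the BS1 conditional probability is what removes the base rate: a 79 percent hit rate looks far less impressive once 67 percent of ordinary hours show the same precursor.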

The study does not present this as a definitive null. Longer lead windows might amplify the signal. A finer substrate state granularity (regime severity, drift acceleration) might also. The current Invarians Drift Signal primitive, as defined and implemented at the time of this study, does not deliver a strong anticipation signal for bridge BS2 in the way the framework's positioning sometimes suggests. The framework anticipates regime transitions on the substrate itself (a different and demonstrable function); it does not yet provide a strong anticipation of bridge state transitions through substrate observation alone.

11. What this means for institutional cross-chain flows

The operational reading, distilled to its essentials, is as follows.

For an institutional flow on the EVM-to-EVM CCTP lane (RWA treasury rebalance, intent solver fill, scheduled keeper operation), the substrate state of the source chain of the intended transfer is a measurable predictor of the median settlement time on that flow. A flow leaving Arbitrum during an S2+D1 hour is, at the median, fifty percent slower than the same flow during an S1D1 hour. The effect is direction-specific. A flow leaving Ethereum during ETH S2+D1 hours shows no comparable median shift in this corpus.

For worst-case planning, the substrate state combined across both axes is the relevant variable. Compound stress (S2+D2+ or S2+D2±) amplifies the bridge upper-decile latency by a factor of 1.8 to 2.1 on the ARB to ETH direction. Flows that must complete within tight time bounds should treat compound substrate stress as a circuit breaker condition.
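
As a purely hypothetical illustration of such a circuit breaker, the sketch below scales the nominal ARB to ETH upper decile (1h 47 min) by the tail ratios measured in Section 7 and compares the result to a flow's deadline. The function names and the decision rule are assumptions made for this example; they are not part of the Invarians API, and the multipliers describe the 2025 corpus only.

```python
# ARB to ETH figures from Section 7 of this article.
NOMINAL_UPPER_DECILE_MIN = 107  # 1h 47 min under S1D1
TAIL_RATIO = {
    "S1D1": 1.00,
    "S2+D1": 0.93,
    "S2+D2-": 1.22,
    "S2+D2+": 1.81,
    "S2+D2±": 2.14,
}


def estimated_upper_decile(regime):
    """Upper-decile latency in minutes for a regime, or None if not measured."""
    ratio = TAIL_RATIO.get(regime)
    return None if ratio is None else NOMINAL_UPPER_DECILE_MIN * ratio


def should_defer(regime, deadline_minutes):
    """Hypothetical rule: defer when the estimated upper decile exceeds the deadline.

    Regimes without a measured ratio are treated conservatively as 'defer'.
    """
    estimate = estimated_upper_decile(regime)
    return True if estimate is None else estimate > deadline_minutes


for code in TAIL_RATIO:
    print(code, should_defer(code, deadline_minutes=180))
```

With a three-hour deadline, only the two compound regimes S2+D2+ and S2+D2± trip the breaker in this example, which mirrors the reading that pure structural stress moves the median but not the tail.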

For anticipation, the substrate state in the previous six hours adds limited predictive value beyond the base rate. An institutional flow operator looking for a strong "do not bridge right now" signal should rely on the current bridge state and the current source substrate state rather than on a six-hour lookback over substrate history with the current drift formulation.

For both routes, a non-negligible fraction of hours (roughly twenty percent) carry upper-decile latencies above one hour. The majority of these stuck-fund hours occur during nominal substrate states. They reflect failure modes of the bridge layer itself: Circle attester delays, Iris API instability, and occasional unclaimed messages. The substrate framework does not detect these and is not designed to. They constitute an irreducible operational floor that cross-chain operators need to absorb through their own redundancy and timeout policies.

The institutional value of the substrate observability framework, as documented by this study, sits in three discrete contributions: a reliable shift in expected settlement time on outbound flows from a stressed source chain, on the order of 1.5 times the nominal median; a reliable amplification of worst-case settlement time on outbound flows from a source chain in compound stress, on the order of 1.8 to 2.1 times the nominal upper decile; and a direction-specific scope, with the framework adding operational signal on the source side and not on the destination side.

This is narrower than some general framings of the framework would suggest. It is also more solid, in the sense that each contribution is a statistical claim with a measurable effect size, a known sample, a stated significance level, and a documented limit.

12. What this study does not prove

In keeping with the working position adopted at the start of the study, the following claims are not made and are not supportable from the data presented here.

That trades or transfers executed in non-nominal substrate regimes carry higher financial losses, slippage, or gas costs. Such claims require per-transaction outcome data not present in this corpus.
That agents using the Invarians framework outperform agents that do not, in any measurable economic sense. Such claims require controlled benchmarking experiments.
That non-nominal substrate states are causally responsible for protocol incidents, exploits, or settlement failures. Such claims require a curated ground-truth list of incidents and a formal time-aligned causal analysis, which is the subject of a planned follow-up study.
That the framework provides a strong anticipation signal for bridge state transitions with the current drift formulation and the six-hour window tested here. The data does not support this, though longer windows and richer drift definitions remain open territory.
That the S2- regime, which represents structural consensus distress, contributes to the patterns documented. This regime is absent from the corpus because the beacon participation data has not been integrated into the pipeline at the time of this study.

The boundaries of what is shown are deliberately tight. The expectation is that future studies, on broader corpora and with additional measurement layers, will progressively close the gaps enumerated here.

13. Reproducibility

The full pipeline that produces this corpus and these findings is open source. The repository contains:

The SQL templates for substrate block extraction and CCTP event extraction (DepositForBurn on the source chain TokenMessenger contract, MessageReceived on the destination chain MessageTransmitter contract).
The Python library implementing the substrate metric computation, the EMA-based ratio normalization, the quarterly threshold calibration, the regime classification, the stress event detection lifecycle, the severity derivation, and the CCTP message matching by source domain and nonce.
The orchestrator scripts that execute the full pipeline for one year per chain, plus the dedicated CCTP orchestrator for one bridge route.
The two analysis scripts that produced the article's figures and statistics, including the outlier handling, sub-regime granularity, mechanism test via sequencer publish latency, base-rate-controlled lift computation, and minimum detectable effect estimation.
The methodology file documenting calibration choices, post-Dencun temporal floor, EMA contamination consideration, and the explicit acknowledgment of the S2- gap.

The article's findings are committed to the repository as a JSON file with the full set of summary statistics, so that downstream readers can verify the numerical claims without rerunning the pipeline. Any future re-run that produces different numbers, on a different corpus or with revised methodology, can be compared line by line to this version.

Invarians

Invarians provides on-chain execution context for autonomous agents. API v2.0 exposes three primitives in a single signed payload: Attestation (HMAC), Regime (12 signed codes per chain), Drift Signal (per-metric shift, per-axis composite). Built for institutional cross-chain flows where settlement timing is contractual, audit-grade, and SLA-bound. Live since 2026-04-30 across Ethereum, Polygon, Arbitrum, Base, Optimism.
