Originally reported by @ArseniiPetrovich
Summary
Explicit calls (eth_call, EthEstimateGas, StateCall) pinned to a specific tipset are rejected with ErrExpensiveFork ("refusing explicit call due to state fork at epoch") not only at an expensive upgrade epoch U, but also at U+1, the first epoch after the migration, whose state is already fully materialised. The error is also returned with the generic JSON-RPC code 1, so downstream tooling cannot distinguish it from any other application error.
Two independent problems, both worth fixing:
- The refusal window is one epoch too wide (U+1 should be served).
- The error has no stable, registered code.
Symptom
Reproduced on mainnet across the nv28 / FireHorse upgrade (UpgradeFireHorseHeight = 6052800). Same eth_call (feeGrowthGlobal0X128() on a Uniswap v3 pool), varying only the pinned block:
epoch U-1 (6052799): OK -> 0x...15675c877f4119b6014c3ff7346ceae74
epoch U (6052800): ERR -> {"code":1,"message":"refusing explicit call due to state fork at epoch"} <-- legit
epoch U+1 (6052801): ERR -> {"code":1,"message":"refusing explicit call due to state fork at epoch"} <-- the problem
epoch U+2 (6052802): OK -> 0x...15675c877f4119b6014c3ff7346ceae74
The state at U+1 is plainly available: eth_getStorageAt and eth_getCode both succeed there and return the correct values. Only the explicit-call path refuses.
Downstream impact
The Graph's graph-node indexes FEVM contracts by replaying eth_call at the block where each event was emitted. When an event lands in block U or U+1 the call is refused, and graph-node does not recognise the message as a deterministic error, so it treats it as a possible reorg and retries indefinitely. The subgraph wedges permanently at the upgrade epoch. This gets more likely every upgrade as FEVM activity grows.
Why it happens
The guard (in node/impl/eth/gas.go and chain/stmgr/call.go) refuses a call when an expensive migration sits anywhere between the parent epoch and the called epoch. Including the parent makes a call at U+1 trip on the migration at U, even though that migration is already baked into the state U+1 runs on (it is recoverable without re-execution, and nothing runs the migration on demand). The epoch U itself genuinely must stay refused, because serving it would run the migration on demand against the wrong state.
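A minimal sketch of the off-by-one, not the actual Lotus code: the function names, the fork table, and the `parent = called - 1` shortcut (no null rounds) are assumptions made for illustration. It models the guard as described above, a scan over the inclusive range from the parent epoch to the called epoch, next to a narrowed variant that starts one epoch after the parent.

```go
package main

import "fmt"

// expensiveForks is a hypothetical stand-in for the set of upgrade epochs
// that carry an expensive migration (here only UpgradeFireHorseHeight).
var expensiveForks = map[int64]bool{6052800: true}

// refusesInclusive models the guard as described in this issue: refuse when
// an expensive migration sits anywhere in [parent, called].
func refusesInclusive(parent, called int64) bool {
	for h := parent; h <= called; h++ {
		if expensiveForks[h] {
			return true
		}
	}
	return false
}

// refusesNarrowed is the proposed narrowing: the parent epoch is excluded,
// so only migrations in (parent, called] cause a refusal.
func refusesNarrowed(parent, called int64) bool {
	for h := parent + 1; h <= called; h++ {
		if expensiveForks[h] {
			return true
		}
	}
	return false
}

func main() {
	const u = int64(6052800)
	for _, called := range []int64{u - 1, u, u + 1, u + 2} {
		parent := called - 1 // assumption: no null rounds around the upgrade
		fmt.Printf("epoch %d: inclusive refuses=%v, narrowed refuses=%v\n",
			called, refusesInclusive(parent, called), refusesNarrowed(parent, called))
	}
}
```

Under these assumptions the inclusive variant refuses both U and U+1, matching the symptom, while the narrowed variant refuses only U.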
Separately, ErrExpensiveFork is a bare sentinel that is never converted to a typed RPC error, so go-jsonrpc falls back to code 1. Lotus already registers typed errors with stable codes (api/api_errors.go); ErrExpensiveFork is the natural sibling of ErrNullRound (both mean "this epoch can't be served as requested").
Workarounds today
- graph-node operators: set GRAPH_GETH_ETH_CALL_ERRORS="refusing explicit call due to state fork" so the message is treated as deterministic, stopping the retry loop. This relies on the exact message string. Until a proper fix is applied, it also means that some indexable content may be missed.
- callers: avoid pinning explicit calls to the upgrade epoch or the one immediately after; U+2 onward is fine.
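For the caller-side workaround, a small predicate along these lines could flag the epochs to avoid. The function name and the upgrade list are hypothetical; a caller would need to supply the expensive upgrade heights for its network, and the sketch assumes the same U / U+1 window observed in the reproduction above.

```go
package main

import "fmt"

// avoidPinning reports whether an explicit call pinned to epoch would land
// in the refusal window described above: an expensive upgrade epoch U or U+1.
func avoidPinning(epoch int64, expensiveUpgrades []int64) bool {
	for _, u := range expensiveUpgrades {
		if epoch == u || epoch == u+1 {
			return true
		}
	}
	return false
}

func main() {
	upgrades := []int64{6052800} // UpgradeFireHorseHeight, from the reproduction
	for _, e := range []int64{6052799, 6052800, 6052801, 6052802} {
		fmt.Printf("epoch %d: avoid=%v\n", e, avoidPinning(e, upgrades))
	}
}
```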
Possible fixes
- Serve U+1: narrow the guard so it no longer refuses on a migration at the parent epoch, keeping U refused. This is what actually unblocks indexers, since events almost always land at U+1 rather than exactly at U.
- Signal it the way the ecosystem already does: there's no widely-recognised code for this, but the de-facto convention for "state not servable at this block" (pruned/archive nodes) is code -32000 plus a recognisable message phrase such as required historical state unavailable or state ... is not available. Lotus emits 1 with a bespoke message that no tooling recognises. Moving to -32000 and growing the message to carry that phrase (e.g. required historical state unavailable: refusing explicit call due to state fork at epoch N) aligns with geth-oriented tooling. We should not borrow revert-style phrasing (e.g. -32015 / "execution reverted"): graph-node would then treat the call as a successful revert and index null data, which is worse than the retry loop.
The serve-U+1 fix is the one that resolves this in practice. The residual case (U itself, genuinely unservable) needs graph-node to grow a "state unavailable, advance past this block" path; it currently only knows retry-forever or treat-as-revert. The code/message change is the precondition for such an upstream fix, and less in our control.
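To make the three outcomes concrete, here is a hypothetical consumer-side classifier. It is not graph-node's logic (graph-node is not written in Go and has no such path today); the `Action` type, `classify`, and the matching rules are assumptions that only show why a distinct code plus phrase gives downstream tooling something to branch on.

```go
package main

import (
	"fmt"
	"strings"
)

// Action is a hypothetical set of reactions an indexer could take.
type Action int

const (
	Retry         Action = iota // unknown error: possibly a reorg, try again
	TreatAsRevert               // deterministic revert: index the call as failed
	SkipBlock                   // state not servable here: advance past the block
)

// classify maps a JSON-RPC error to an action. The rules are illustrative.
func classify(code int, msg string) Action {
	switch {
	case code == -32000 && strings.Contains(msg, "required historical state unavailable"):
		return SkipBlock
	case strings.Contains(msg, "execution reverted"):
		return TreatAsRevert
	default:
		return Retry
	}
}

func main() {
	// Today's Lotus response: generic code, bespoke message.
	fmt.Println(classify(1, "refusing explicit call due to state fork at epoch") == Retry)
	// Proposed response: -32000 plus the recognisable phrase.
	fmt.Println(classify(-32000,
		"required historical state unavailable: refusing explicit call due to state fork at epoch 6052800") == SkipBlock)
}
```

With today's response the only safe reaction is to retry, which is the wedge described above; the proposed code and phrase are what would let a consumer choose the skip path without misreading the refusal as a revert.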