Reading Substrate-style and Erigon Sync Logs in Practice

The confusing part of a fresh node is that “syncing” is not one state. The same word can describe peer discovery, block download, execution, snapshot handling, trie work, or a consensus layer catching up. The logs have to be read in that order.

In practice, similar nodes can look very different during catch-up. One parachain node had peers but moved slowly. Another parachain node moved much faster. An Erigon archive node reached a high block number surprisingly quickly, because Erigon relies on snapshots and staged sync instead of replaying every block in order from genesis.

Terms used here

Term	Meaning
Parachain	A chain connected to a relay chain in the Polkadot ecosystem. Substrate-style parachain nodes print the two-stream logs discussed below.
Relaychain	The relay network that provides shared security and consensus for parachains.
Bootnode	A known node address used for peer discovery when a client starts.
Peer	Another node connected through P2P. Peer count affects whether the node can receive data, but it is not the only sync bottleneck.
Snapshot	Prebuilt chain data that lets a client avoid replaying every old block from scratch.
Staged sync	Erigon’s sync model, where download, execution, hashing, trie, and indexing work advance in separate stages.
Archive mode	A node mode that keeps historical state needed for old balance, trace, and state queries.

Reading the parachain lines

These logs contain two streams:

[Parachain] Syncing 5.6 bps, target=#13762785, best: #839591, finalized #0
[Relaychain] Syncing 515.8 bps, target=#31798844, best: #6372700, finalized #6372352

The important fields are:

Field	How to read it
`Parachain`	The parachain itself. This is the height application RPC users usually care about.
`Relaychain`	The connected relay chain sync. It can move at a very different speed.
`bps`	Blocks per second in that part of the sync loop.
`target`	The height the node believes it needs to reach.
`best`	The best block the node currently has.
`finalized`	The highest finalized block known to the node. During fast catch-up, this can lag far behind `best` or still show an early value such as #0.
`peers`	Connected peers for that component. It does not appear in the trimmed sample lines above.
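
To pull these fields out of a log programmatically, a regular expression over the trimmed line shape is enough. This is a minimal sketch, not a general Substrate log parser: the pattern matches only the sample lines shown above, and `parse_sync_line` and the derived `lag` key are names made up for illustration.

```python
import re
from typing import Optional

# Pattern for the trimmed sample lines only. Lines that carry extra fields
# between these tokens (peer counts, hashes) will not match without
# adjusting the pattern.
LINE_RE = re.compile(
    r"\[(?P<component>\w+)\]\s+Syncing\s+(?P<bps>[\d.]+)\s*bps,"
    r"\s*target=#(?P<target>\d+),"
    r"\s*best:\s*#(?P<best>\d+),"
    r"\s*finalized\s+#(?P<finalized>\d+)"
)


def parse_sync_line(line: str) -> Optional[dict]:
    """Return the sync fields of one log line, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    target = int(match.group("target"))
    best = int(match.group("best"))
    return {
        "component": match.group("component"),
        "bps": float(match.group("bps")),
        "target": target,
        "best": best,
        "finalized": int(match.group("finalized")),
        "lag": target - best,  # blocks still to import
    }
```

Keeping `lag` next to `bps` makes the next question, how long the catch-up will take, a one-line calculation.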

One node had enough peers, but the parachain side was moving at roughly single-digit blocks per second. At that speed, millions of blocks of lag translate into many days: at the 5.6 bps in the sample line above, roughly 12.9 million blocks of lag would take about 27 days. The relaychain side was much faster, which showed the node was not simply offline.
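
The arithmetic behind that estimate can be sketched as follows, using the figures from the sample lines earlier. The function name `eta_days` is hypothetical, and the calculation assumes a constant rate and a fixed target, which is optimistic because the target keeps moving while the node syncs.

```python
SECONDS_PER_DAY = 86_400


def eta_days(target: int, best: int, bps: float) -> float:
    """Days needed to close the gap between best and target at a steady rate."""
    if bps <= 0:
        raise ValueError("bps must be positive")
    remaining = max(target - best, 0)
    return remaining / bps / SECONDS_PER_DAY


# Figures copied from the two sample log lines.
parachain = eta_days(13_762_785, 839_591, 5.6)
relaychain = eta_days(31_798_844, 6_372_700, 515.8)
print(f"parachain: {parachain:.1f} days, relaychain: {relaychain:.1f} days")
```

With those inputs the parachain side comes out at about 26.7 days and the relaychain side at about 0.6 days, which is the gap between "slow" and "offline" in numbers.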

The other parachain node looked different. Its parachain side moved at tens to hundreds of blocks per second while the relaychain also advanced. That is still catch-up work, but it is a healthier shape.

Bootnodes help discovery, not every sync bottleneck

Bootnodes are easy to overestimate. They help a node find the network. Once the node already has peers, bootnodes are less likely to change execution throughput by themselves.

A userdata file can also carry a bootnode entry with a syntax problem or a stale-looking address. Here that needed to be fixed for future reproducibility. But the running node was already progressing quickly, so restarting it just to change bootnodes would have traded a working sync for a theoretical improvement.

A slower node was a better candidate for applying official bootnodes live. The container was recreated while preserving /data. That kept the database and changed only the client command. After the restart, the node still had to rebuild peer state and continue sync. The slow parachain rate did not disappear immediately, which matched the expectation: bootnodes can improve discovery, but they do not make expensive block import work free.

Why an Erigon node reached a high block quickly

The Erigon log looked surprising because it reached a high block number early. That does not mean it had replayed every block and fully built all archive state.

Erigon uses snapshots and staged sync. The log can show different stages making progress:

snapshots:blocks:retire
Execution
BuildFilesInBackground
computing trie

Those names matter. Block availability, execution progress, trie computation, and background file building are separate work. Seeing a high block number in one part of the log can be real and still not mean the node is ready for every historical query at the tip.
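
One rough way to see which kinds of work a log is reporting is to count how often each stage name appears. This is a sketch under a stated assumption: the markers are copied from the list above and matched as plain substrings, which may not line up with the exact layout of every Erigon log line. `tally_stages` is a made-up helper name.

```python
from collections import Counter
from typing import Iterable

# Stage markers taken from the log excerpts above. Substring matching is an
# assumption about how they appear in real lines.
STAGE_MARKERS = (
    "snapshots:blocks:retire",
    "Execution",
    "BuildFilesInBackground",
    "computing trie",
)


def tally_stages(lines: Iterable[str]) -> Counter:
    """Count how many lines mention each known stage marker."""
    counts: Counter = Counter()
    for line in lines:
        for marker in STAGE_MARKERS:
            if marker in line:
                counts[marker] += 1
    return counts
```

Passing an open log file to `tally_stages` gives a quick picture of where the node is spending its time. A marker that never shows up is a hint that the stage has not started or has already finished, not proof of either.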

The better question is not “why did it get there so fast?” The better question is “which stages are complete, and which historical queries work?”

Checking archive behavior

For the Erigon archive node, the configuration check came first. The running command included:

--prune.mode=archive

That confirms the node was started with archive intent. It is not enough by itself. I also sampled historical RPC behavior.

Useful checks include:

{"jsonrpc":"2.0","method":"eth_getBlockByNumber","params":["0xf4240",false],"id":1}
{"jsonrpc":"2.0","method":"eth_getBalance","params":["0x0000000000000000000000000000000000000000","0x989680"],"id":1}
{"jsonrpc":"2.0","method":"trace_block","params":["0xf4240"],"id":1}

The sampled checks returned historical block data, a historical balance, and trace data for an old block. That is the practical signal I wanted: archive mode is configured, and old data paths are already answering for sampled ranges.
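
The same samples can be scripted so they are easy to repeat as the sync advances. This is a minimal sketch using only the Python standard library. The endpoint URL is a placeholder assumption, and `build_payload`, `call`, `answered`, and `run_checks` are illustrative names, not part of any client.

```python
import json
import urllib.request

# Assumption: a local HTTP JSON-RPC endpoint. Replace with the real address.
RPC_URL = "http://127.0.0.1:8545"

# The three sampled requests: block 1,000,000 and balance at block 10,000,000.
CHECKS = [
    ("eth_getBlockByNumber", [hex(1_000_000), False]),
    ("eth_getBalance", ["0x" + "00" * 20, hex(10_000_000)]),
    ("trace_block", [hex(1_000_000)]),
]


def build_payload(method: str, params: list, request_id: int = 1) -> bytes:
    """Encode one JSON-RPC 2.0 request body."""
    body = {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    return json.dumps(body).encode("utf-8")


def call(method: str, params: list, url: str = RPC_URL, timeout: float = 30.0) -> dict:
    """POST one request and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=build_payload(method, params),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def answered(response: dict) -> bool:
    """True when the node returned a non-null result and no error."""
    return "error" not in response and response.get("result") is not None


def run_checks(url: str = RPC_URL) -> dict:
    """Map each sampled method to whether the node answered it."""
    return {method: answered(call(method, params, url)) for method, params in CHECKS}
```

A passing `run_checks()` only says the sampled heights answer. It does not prove coverage of every historical range.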

There is still a boundary. While eth_syncing returns a sync-status object instead of false, the node is not fully ready as a current archive endpoint. Historical samples can pass before all stages and current-head behavior are done.
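
Interpreting the eth_syncing result takes a little care, because the method returns either false or a status object. The sketch below assumes the status object carries hex-encoded `currentBlock` and `highestBlock` fields, as the standard Ethereum JSON-RPC description has it. Clients may add or omit fields, so the helper falls back to None. Both function names are hypothetical.

```python
from typing import Optional, Union


def is_syncing(result: Union[bool, dict]) -> bool:
    """eth_syncing yields False when idle; anything else means still syncing."""
    return result is not False


def blocks_behind(result: Union[bool, dict]) -> Optional[int]:
    """Gap between highestBlock and currentBlock, or None if not reported."""
    if result is False:
        return 0
    current = result.get("currentBlock")
    highest = result.get("highestBlock")
    if current is None or highest is None:
        return None  # this client did not report the assumed fields
    return int(highest, 16) - int(current, 16)
```

Treating every non-false result as "still syncing" is the conservative choice: the endpoint is only promoted once the node itself says it is done.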

How I read these logs now

I treat the first hour of a new node as a sorting exercise:

Question	Signal
Is the process alive?	Container is up and logs keep moving.
Is it connected?	Peer count is non-zero and P2P logs are active.
Which side is slow?	Parachain, relaychain, execution, snapshot, or trie stage.
Is RPC usable?	Local RPC answers basic calls.
Is archive mode plausible?	Command flags plus historical query samples.
Is it done?	The node's sync status (eth_syncing on Erigon) returns false and current height tracks the network.

A slow node is not always a broken node. A node with peers is not always a fast node. A high Erigon block number is not always a fully ready archive endpoint. Those distinctions save time.