Running an Ethereum Node for Software Development

Alexander Zammit

Software Development Consultant. Involved in the development of various Enterprise software solutions. Today focused on Blockchain and DLT technologies.

My Machine Specs

To put everything into context, here are the basic specs of the machine running the node.

Dell OptiPlex 3070 MT Core i5-9500 8GB
SSD: Samsung 860 EVO SATA 2.5" SSD 1TB
OS: Windows 10
Geth Version: 1.9.11
Download Speed: 75Mbps

Geth Command-line

Geth can be downloaded from:
https://geth.ethereum.org/downloads/

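Geth is being run with these command-line parameters (listed one per line for readability; on the actual command line they are entered on a single line):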

geth
--syncmode fast
--datadir <Data Path>
--rpc
--rpcapi "debug,eth,net,web3,personal,admin,miner,txpool"
--rpcport 8545
--cache 2048
--etherbase <address>

For conciseness I won't discuss the parameters in detail. Just a few points to note:

--syncmode fast - Fast syncing is the default. I am including it anyway to make the mode in use explicit. Light mode is not enough for my needs and full mode is too slow. Later, we will get an indication of how slow full sync mode is.

--datadir <Data Path> - Identifies the node storage location. We will look at this directory size to measure storage consumption.

--rpcapi "debug,eth,..." - Lists the interfaces to be exposed by this node. As a developer, I want to play with many interfaces, hence the long list. However, be careful about exposing many interfaces in rpcapi. Each one opens access to our node, including interfaces we probably don't want others to reach.
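As a rough illustration of the rpcapi caution above, here is a minimal Python sketch that assembles the same Geth 1.9.x command line from a list of interfaces, so that a narrower list can be swapped in when the full developer set is not needed. The function name, the narrow_apis selection and the /path/to/data placeholder are my own assumptions for illustration; only the flags themselves come from the command shown earlier.

```python
import shlex

def build_geth_args(datadir, apis, etherbase=None, cache_mb=2048, rpc_port=8545):
    """Assemble the argument list for the Geth command used in this article."""
    args = [
        "geth",
        "--syncmode", "fast",
        "--datadir", datadir,
        "--rpc",
        "--rpcapi", ",".join(apis),   # every entry here is reachable over RPC
        "--rpcport", str(rpc_port),
        "--cache", str(cache_mb),
    ]
    if etherbase:
        args += ["--etherbase", etherbase]
    return args

# The full developer list from the article, and a hypothetical narrower one.
dev_apis = ["debug", "eth", "net", "web3", "personal", "admin", "miner", "txpool"]
narrow_apis = ["eth", "net", "web3"]

# "/path/to/data" is a placeholder, not a real location.
args = build_geth_args("/path/to/data", narrow_apis)
print(" ".join(shlex.quote(a) for a in args))
```

The resulting list could be handed to subprocess.run(args) to launch the node, though I simply print it here.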

Initial Sync

Running the above command, we start the sync process.

The first obvious question is: how far behind are we? If you go to an Ethereum blockchain explorer you can get the highest block number, which at the time of writing was 9986931.

Once the node connects to a few peers and starts pulling the first blocks, look for logs starting with "Imported new block receipts". Here are my filtered logs, showing the first few entries of this type:

Imported new block receipts count=2   elapsed=20.000ms  number=2    hash=b495a1…4698c9 age=4y9mo3w size=1.69KiB
Imported new block receipts count=4   elapsed=19.049ms  number=6    hash=1f1aed…6b326e age=4y9mo3w size=3.30KiB
Imported new block receipts count=570 elapsed=125.964ms number=576  hash=41a746…6a8b38 age=4y9mo3w size=407.94KiB
Imported new block receipts count=1   elapsed=88.031ms  number=577  hash=c4cee3…93da3f age=4y9mo3w size=578.00B
Imported new block receipts count=481 elapsed=81.999ms  number=1058 hash=2af79b…35d557 age=4y9mo3w size=349.01KiB

Number shows the block number our node is at. We can see this getting incremented with every newly received batch of blocks.

Age shows how far behind our node is, in terms of time. Starting a new node now gives me 4 years 9 months and 3 weeks. That's how much time has passed since the first Ethereum block was mined. As the sync progresses we will observe number going up and age going down.
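To track number and age without reading the log by eye, the fields can be pulled out with a small script. The following is only a sketch: the function names are mine, and the unit sizes used to turn an age such as 4y9mo3w into days (a 360-day year and a 30-day month) are an assumption about how Geth abbreviates ages, so treat the result as approximate.

```python
import re

# Days per unit. ASSUMPTION: a year counts as 12 x 30 days and a month as
# 30 days; adjust these if your Geth version formats ages differently.
AGE_UNIT_DAYS = {
    "y": 360, "mo": 30, "w": 7, "d": 1,
    "h": 1 / 24, "m": 1 / 1440, "s": 1 / 86400,
}

def parse_log_fields(line):
    """Return the key=value pairs of a Geth log line as a dict of strings."""
    return dict(re.findall(r"(\w+)=(\S+)", line))

def age_to_days(age):
    """Convert an age such as '4y9mo3w' into an approximate number of days."""
    parts = re.findall(r"(\d+)(mo|y|w|d|h|m|s)", age)
    return sum(int(count) * AGE_UNIT_DAYS[unit] for count, unit in parts)

# One of the log lines shown above.
line = ("Imported new block receipts count=570 elapsed=125.964ms number=576  "
        "hash=41a746…6a8b38 age=4y9mo3w size=407.94KiB")

fields = parse_log_fields(line)
print("block number:", fields["number"])
print("approx. days behind:", round(age_to_days(fields["age"])))
```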

In addition to the blocks, our node will also sync state and generate log entries of the type "Imported new state entries". Here is a snippet from my node:

Imported new state entries count=1920 elapsed=8.998ms processed=116260 pending=24675 retry=0 duplicate=0 unexpected=0
Imported new state entries count=1530 elapsed=4.000ms processed=117790 pending=24419 retry=0 duplicate=0 unexpected=0
Imported new state entries count=1920 elapsed=5.998ms processed=119710 pending=24828 retry=0 duplicate=0 unexpected=0
Imported new state entries count=2304 elapsed=8.033ms processed=122014 pending=22966 retry=0 duplicate=0 unexpected=0
Imported new state entries count=1536 elapsed=4.999ms processed=123550 pending=21957 retry=0 duplicate=0 unexpected=0

Again, processed will go up as the sync progresses. Indeed, state syncing is what consumes most of the time in a fast sync. At some point our node will look as if it has almost reached the chain head, but it will continue syncing state entries for many hours.

Monitoring the Sync Progress

Instead of looking at fast-scrolling logs, we can grab the salient syncing counters by attaching a second Geth instance to the node. From a second console run:
geth attach http://localhost:8545

From here query for the syncing state using:
eth.syncing

currentBlock and pulledStates conveniently show the two counters of interest for the number of blocks and the number of state entries our node has retrieved.

highestBlock is the current highest block number showing us how far behind the node is, in terms of blocks.

knownStates shows the highest state count our node is aware of so far. This is not very useful, since it keeps growing as the sync discovers more state and so gives no real indication of how many state entries are outstanding.

startingBlock shows the block number from which our sync has started. Of course a fresh sync starts from zero. We can stop Geth anytime by pressing CTRL-C. When we re-run the same Geth command line, syncing continues from where it left off, and startingBlock then shows the block number at which the resumed sync started.
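The same counters can also be polled from a script over the RPC port opened with --rpc. Below is a minimal sketch, assuming the node answers the standard eth_syncing JSON-RPC method at http://localhost:8545 and that, unlike the console, the raw response carries the counters as hex strings. The function names and the sample values are mine and purely illustrative, not readings from my node.

```python
import json
import urllib.request

COUNTERS = ("startingBlock", "currentBlock", "highestBlock",
            "pulledStates", "knownStates")

def fetch_sync_status(url="http://localhost:8545"):
    """Call eth_syncing over JSON-RPC; returns False or a dict of hex strings."""
    payload = json.dumps({"jsonrpc": "2.0", "id": 1,
                          "method": "eth_syncing", "params": []}).encode()
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)["result"]

def summarize_sync(status):
    """Convert an eth_syncing result to integers plus simple progress figures."""
    if status is False:
        return None  # the node reports that it is fully synced
    values = {key: int(status[key], 16) for key in COUNTERS if key in status}
    values["blocksRemaining"] = values["highestBlock"] - values["currentBlock"]
    values["blockPercent"] = 100 * values["currentBlock"] / values["highestBlock"]
    return values

# Illustrative values only, encoded the way the JSON-RPC response would be.
sample = {key: hex(number) for key, number in {
    "startingBlock": 0, "currentBlock": 9900000, "highestBlock": 9986931,
    "pulledStates": 123550, "knownStates": 145507}.items()}

summary = summarize_sync(sample)
print(summary["blocksRemaining"], "blocks to go,",
      f"{summary['blockPercent']:.2f}% of blocks retrieved")
```

Against a live node, summarize_sync(fetch_sync_status()) would replace the sample. I deliberately compute no percentage for state entries, since knownStates is not a reliable total.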

State Syncing is the Key

The highest block number can be easily determined from a blockchain explorer or the highestBlock value. But what about the highest state entry count? This is not readily available from any blockchain explorer. Instead, there is a thread on GitHub where people regularly post the highest observed state count:
https://github.com/ethereum/go-ethereum/issues/14647

Looking at the last post in the thread, we know that we definitely have to reach and exceed the count reported there. At the time of writing, the last reported state count is dated 14th April 2020 and has the value of 487,040,102.

Without getting into any complicated maths, we can pick a couple of reported values and get a better approximation. Assuming the state count grows at a constant rate:
17 Mar 2020 - 466184476
14 Apr 2020 - 487040102 (+28 days @ 744,843/day)

So at the time of writing (2nd May 2020) I know that I would at least need to wait for:
487040102 + 18 * 744843 = 500,447,276
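The same extrapolation can be written as a few lines of Python, which makes it easy to redo with newer figures from the thread. This is just a sketch of the arithmetic above, with the constant growth rate as its only assumption; the function name is mine, and the integer division mirrors the rounding used in the figures above.

```python
from datetime import date

def extrapolate_state_count(first, last, target_day):
    """Linearly extrapolate the state count from two dated observations."""
    (first_day, first_count), (last_day, last_count) = first, last
    # Whole state entries per day, rounded down as in the figures above.
    per_day = (last_count - first_count) // (last_day - first_day).days
    estimate = last_count + (target_day - last_day).days * per_day
    return estimate, per_day

estimate, per_day = extrapolate_state_count(
    (date(2020, 3, 17), 466184476),
    (date(2020, 4, 14), 487040102),
    date(2020, 5, 2),
)
print(f"{per_day:,} entries/day, target of at least {estimate:,} entries")
```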

Of course this is a very rough estimate. However, at some point our block count will start looking as if it is perpetually stuck, just short of reaching the chain head. At that point we might as well ignore the block count and look exclusively at the state count. Having an estimate of the target state count helps us stay calm during this phase.

With patience we should finally reach the synced state. Running eth.syncing will now simply return false.

Blockchain Resyncing

My first sync operation completed on 2nd March 2020 after syncing for 2 days and 18 hours (all the stats are summarized at the end).

However, the most interesting data was collected in the weeks that followed. As already pointed out, I am very interested in the resyncing time. So, following the initial sync, I resynced the node approximately every 7 days and timed the process.

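An important fact to observe is that once the first sync is completed, fast sync mode is no longer available. Resync will now run in full sync mode. The log will mostly show "Imported new chain segment" entries. The resync rate is more than an order of magnitude slower than the first fast sync: my resyncs averaged about 2 blocks per second, whereas the first sync covered 9591728 blocks in 2 days 18 hours, or roughly 40 blocks per second.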

Initial Sync Data

First sync total time	2 days 18 hours
First sync completion date	2nd March 2020
First sync block height	9591728
First sync blockchain age (approx.)	4 years 7 months 3 weeks

Last sync completion date	1st May 2020
Last sync block height	9982207
Last sync storage size	268GB

Resyncing Data

Sync Date	Time Start	Time End	Time Taken (h:mm:ss)	Time Taken (sec)	Start Block	End Block	Total Blocks	Block Rate (blocks/sec)	Age at Sync Start
2-Apr	14:31:04	20:41:57	6:10:53	22253	9742655	9794210	51555	2.32	1w15h51m
11-Apr	13:42:22	21:21:54	7:39:32	27572	9795215	9852805	57590	2.09	1w1d13h
16-Apr	12:33:59	16:22:53	3:48:54	13734	9853639	9884032	30393	2.21	4d12h6m
24-Apr	13:02:32	20:20:15	7:17:43	26263	9885743	9936790	51047	1.94	1w14h18m
1-May	14:19:11	21:31:11	7:12:00	25920	9937598	9982200	44602	1.72	6d15h1m

From the resyncing data, it looks like we have to allocate approximately 1 hour of syncing for every day our node is lagging behind. Note the huge difference between the fast sync mode used in the initial sync and the full sync mode used on resyncing. Remember, the initial fast sync caught up with over 4 years 7 months of data in under 3 days.
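The one-hour-per-day rule of thumb can be checked directly against the resyncing table. The sketch below recomputes the block rate and derives the hours of syncing per day of lag for each row, taking the lag from the age column (the last one). The helper names are mine, and reading the age abbreviations as weeks, days, hours and minutes is an assumption.

```python
import re

# Days per unit in the age strings (only the units that occur in the table).
UNIT_DAYS = {"w": 7, "d": 1, "h": 1 / 24, "m": 1 / 1440}

# (sync date, seconds taken, start block, end block, age when the sync began)
RESYNCS = [
    ("2-Apr", 22253, 9742655, 9794210, "1w15h51m"),
    ("11-Apr", 27572, 9795215, 9852805, "1w1d13h"),
    ("16-Apr", 13734, 9853639, 9884032, "4d12h6m"),
    ("24-Apr", 26263, 9885743, 9936790, "1w14h18m"),
    ("1-May", 25920, 9937598, 9982200, "6d15h1m"),
]

def lag_days(age):
    """Convert an age such as '1w15h51m' into days."""
    return sum(int(count) * UNIT_DAYS[unit]
               for count, unit in re.findall(r"(\d+)([wdhm])", age))

def analyse(row):
    """Return (blocks per second, hours of syncing per day of lag) for a row."""
    _, seconds, start_block, end_block, age = row
    blocks_per_sec = (end_block - start_block) / seconds
    hours_per_lag_day = (seconds / 3600) / lag_days(age)
    return round(blocks_per_sec, 2), round(hours_per_lag_day, 2)

results = [analyse(row) for row in RESYNCS]
for row, (rate, hours) in zip(RESYNCS, results):
    print(f"{row[0]:>6}: {rate:.2f} blocks/sec, "
          f"{hours:.2f} h of sync per day of lag")
```

Every row lands somewhat below or just above one hour per day of lag, which is where the rule of thumb comes from.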

Concluding Remarks

My Ethereum node syncing went fairly smoothly. I consider the results to be more than reasonable. With their regular updates, the Ethereum core team is clearly delivering many improvements. If your experience in the past was less pleasant, you may want to give it another shot.

Keep yourself on the Edge!