
Running an Ethereum Node for Software Development

Alexander Zammit


Software Development Consultant. Involved in the development of various Enterprise software solutions. Today focused on Blockchain and DLT technologies.

  • Published: Jul 01, 2020
  • Category: Ethereum

Running nodes is often synonymous with mining and earning rewards. However, to developers the blockchain is a platform for testing, deploying and running code. Here is how I am running an Ethereum node, synced to the mainnet for development purposes.

Anyone following Ethereum has come across horror stories of how difficult it is to sync a node to the mainnet. We have all heard of nodes stuck in the syncing process for days, of problems finding peers, and of the horrible consumption of limited SSD space!

You never know how accurate these stories are. Still, for a long time this was enough for me to stay away from trying. My development needs were largely satisfied by running private Ethereum blockchains using Geth. When it came to live testing and deployment, Infura would bridge the gap.

Earlier this year I finally set out to discover for myself what it takes to run a node. I wanted to answer 3 simple questions:

  1. How long would an initial sync take?
  2. How much storage space would it consume?
  3. How long would it take for me to resync a node that has been idle for some time?

The third question is in fact the most important of the three. Working on various projects, for different blockchains, means that I cannot really keep nodes running all the time. A more realistic scenario for me is to sync to the blockchain on a fixed schedule, so that when the node is needed I only have to wait a "short while" for the sync to complete.

 

My Machine Specs

To put everything into context here are the basic node machine specs.

Dell OptiPlex 3070 MT Core i5-9500 8GB
SSD: Samsung 860 EVO SATA 2.5" SSD 1TB
OS: Windows 10
Geth Version: 1.9.11
Download Speed: 75Mbps

 

Geth Command-line

Geth can be downloaded from:
https://geth.ethereum.org/downloads/

Geth is being run with these command-line parameters:

geth
--syncmode fast
--datadir <Data Path>
--rpc
--rpcapi "debug,eth,net,web3,personal,admin,miner,txpool"
--rpcport 8545
--cache 2048
--etherbase <address>

For conciseness I won't discuss the parameters in detail. Just a few points to note:

--syncmode fast - Fast syncing is the default. I am including it anyway to make the mode in use explicit. Light mode is not enough for my needs and full is too slow. Later, we will get an indication of how slow full sync mode is.

--datadir <Data Path> - Identifies the node storage location. We will look at this directory size to measure storage consumption.

--rpcapi "debug,eth,..." - Lists the interfaces to be exposed by this node. As a developer, I want to play with many interfaces, hence the long list. However, be careful about exposing many interfaces in rpcapi. Each one opens up access to our node, and these are interfaces we probably don't want others to reach.

 

Initial Sync

Running the above command, we start the sync process.


The first obvious question is: how far behind are we? If you go to an Ethereum blockchain explorer you can get the highest block number, which at the time of writing was 9986931.

Once the node connects to a few peers and starts pulling the first blocks, look for logs starting with "Imported new block receipts". Here are my filtered logs, showing the first few entries of this type:

Imported new block receipts count=2   elapsed=20.000ms  number=2    hash=b495a1…4698c9 age=4y9mo3w size=1.69KiB
Imported new block receipts count=4   elapsed=19.049ms  number=6    hash=1f1aed…6b326e age=4y9mo3w size=3.30KiB
Imported new block receipts count=570 elapsed=125.964ms number=576  hash=41a746…6a8b38 age=4y9mo3w size=407.94KiB
Imported new block receipts count=1   elapsed=88.031ms  number=577  hash=c4cee3…93da3f age=4y9mo3w size=578.00B
Imported new block receipts count=481 elapsed=81.999ms  number=1058 hash=2af79b…35d557 age=4y9mo3w size=349.01KiB

Number shows the block number our node is at. We can see this getting incremented with every newly received batch of blocks.

Age shows how far behind our node is, in terms of time. Starting a new node now gives me 4 years, 9 months and 3 weeks. That's how much time has passed since the first Ethereum block was mined. As the sync progresses we will observe number going up and age going down.
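To keep an eye on number and age without reading the raw log, the key=value pairs can be pulled out of each line. The following is a minimal Python sketch, assuming the log format shown above. The helper names parse_log_fields and blocks_behind are hypothetical and not part of Geth; the highest block number is the explorer value quoted earlier.

```python
import re

# Matches key=value pairs such as number=576 or age=4y9mo3w in a Geth log line.
PAIR = re.compile(r"(\w+)=(\S+)")

def parse_log_fields(line):
    """Return the key=value fields of one log line as a dict of strings."""
    return dict(PAIR.findall(line))

def blocks_behind(line, highest_block):
    """Blocks still missing, given the chain height read from a block explorer."""
    return highest_block - int(parse_log_fields(line)["number"])

# Sample line copied from the log excerpt above.
sample = ("Imported new block receipts count=570 elapsed=125.964ms "
          "number=576  hash=41a746…6a8b38 age=4y9mo3w size=407.94KiB")

fields = parse_log_fields(sample)
print(fields["number"], fields["age"])   # 576 4y9mo3w
print(blocks_behind(sample, 9986931))    # 9986355
```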

In addition to the blocks, our node will also be syncing state and generating log entries of the type "Imported new state entries". Here is a snippet from my node:

Imported new state entries count=1920 elapsed=8.998ms processed=116260 pending=24675 retry=0 duplicate=0 unexpected=0
Imported new state entries count=1530 elapsed=4.000ms processed=117790 pending=24419 retry=0 duplicate=0 unexpected=0
Imported new state entries count=1920 elapsed=5.998ms processed=119710 pending=24828 retry=0 duplicate=0 unexpected=0
Imported new state entries count=2304 elapsed=8.033ms processed=122014 pending=22966 retry=0 duplicate=0 unexpected=0
Imported new state entries count=1536 elapsed=4.999ms processed=123550 pending=21957 retry=0 duplicate=0 unexpected=0

Again processed will go up as the sync progresses. Indeed state syncing is what consumes most time in a fast sync. At some point our node will look as if it has almost reached the chain tail, but will continue for many hours syncing state entries.

 

Monitoring the Sync Progress

Instead of looking at fast scrolling logs, we can grab the salient syncing counters by attaching a second Geth instance to the node. From a second console run:
geth attach http://localhost:8545

From here query for the syncing state using:
eth.syncing


currentBlock and pulledStates conveniently show the two counters of interest for the number of blocks and the number of state entries our node has retrieved.

highestBlock is the current highest block number showing us how far behind the node is, in terms of blocks.

knownStates shows the highest state count our node is aware of. This is not very useful since it doesn't really give us an indication of how many state entries are outstanding.

startingBlock shows the block number from which our sync has started. Of course a fresh sync starts from zero. We can stop Geth anytime by pressing CTRL-C. If we re-run the same Geth command-line, syncing continues from where it left off, and startingBlock will show the block number at which the new sync started.
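The counters described above can be condensed into a single progress line. This is only a sketch in Python: it assumes the result of eth.syncing has already been obtained (for example through the eth_syncing JSON-RPC method, which returns hex strings, whereas the Geth console shows plain numbers). The function names and the snapshot values are hypothetical and were not captured from my node.

```python
def to_int(value):
    """Accept plain integers (Geth console) or hex strings (raw JSON-RPC)."""
    if isinstance(value, str):
        return int(value, 16)
    return int(value)

def summarize(syncing):
    """Summarize an eth.syncing result; False means the node is fully synced."""
    if syncing is False:
        return "synced"
    current = to_int(syncing["currentBlock"])
    highest = to_int(syncing["highestBlock"])
    pulled = to_int(syncing["pulledStates"])
    remaining = highest - current
    percent = 100.0 * current / highest if highest else 0.0
    return (f"blocks {current}/{highest} ({percent:.1f}%, "
            f"{remaining} behind), states pulled {pulled}")

# Hypothetical snapshot for illustration only.
snapshot = {
    "startingBlock": 0,
    "currentBlock": 4993465,
    "highestBlock": 9986931,
    "pulledStates": 123550,
    "knownStates": 145507,
}
print(summarize(snapshot))
print(summarize(False))
```

Note that knownStates is deliberately left out of the summary since, as explained above, it does not tell us how many state entries are still outstanding.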

 

State Syncing is the Key

The highest block number can be easily determined from a blockchain explorer or the highestBlock value. But what about the highest state entry count? This is not readily available from any blockchain explorer. Instead, on GitHub there is a thread where people regularly post the highest observed state count:
https://github.com/ethereum/go-ethereum/issues/14647

Looking at the last post in the thread, we know that we definitely have to reach and exceed the count reported there. At the time of writing, the last reported state count is dated 14th April 2020 and has the value of 487,040,102.

Without getting into any complicated Maths we can pick a couple of values and get a better approximation. Assuming a constant state count increase rate:
17 Mar 2020 - 466184476
14 Apr 2020 - 487040102 (+28 days @ 744,843/day)

So at the time of writing (2nd May 2020, 18 days after the last report) I know that I would need to reach a state count of at least:
487040102 + 18 * 744843 = 500,447,276

Of course this is a very rough estimate. However at some point our block count will start looking as if it is perpetually stuck, just short of reaching the chain tail. At that point we might as well ignore the block count and look exclusively at the state count. Estimating our target state count is helpful for us to stay cool during this phase.
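The estimate above is a plain linear extrapolation, so it is easy to redo whenever a newer count is posted to the thread. Here is a small Python sketch using the two data points quoted above. The function names are hypothetical, and the constant daily rate is the same simplifying assumption made in the text.

```python
from datetime import date

def daily_rate(day0, count0, day1, count1):
    """Average state entries added per day between two observations."""
    return (count1 - count0) // (day1 - day0).days

def estimate_count(last_day, last_count, rate, target_day):
    """Extrapolate the state count to target_day at a constant daily rate."""
    return last_count + (target_day - last_day).days * rate

rate = daily_rate(date(2020, 3, 17), 466184476,
                  date(2020, 4, 14), 487040102)
target = estimate_count(date(2020, 4, 14), 487040102, rate, date(2020, 5, 2))

print(rate)    # 744843
print(target)  # 500447276
```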

With patience we should finally reach the synced state. Running eth.syncing will now simply return false.


 

Blockchain Resyncing

My first sync operation completed on 2nd March 2020 after syncing for 2 days and 18 hours (all the stats are summarized at the end).

However the most interesting data was collected in the weeks that followed. As already pointed out I am very interested in the resyncing time. So following the initial sync I have resynced the node approximately every 7 days and timed this process.


An important fact to observe is that once the first sync is completed, fast sync mode is no longer available. Resync will now run in full sync mode. The log will mostly show "Imported new chain segment" entries. The resync rate is far slower than that of the first fast sync: about 2 blocks per second in my measurements, against an average of roughly 40 blocks per second for the initial sync (9591728 blocks in 2 days 18 hours).

 

Initial Sync Data

First sync total time:                2 days 18 hours
First sync completion date:           2nd March 2020
First sync block height:              9591728
First sync blockchain age (approx.):  4 years 7 months 3 weeks

Last sync completion date:            1st May 2020
Last sync block height:               9982207
Last sync storage size:               268GB

 

Resyncing Data

Sync    Time      Time      Time Taken  Time Taken  Start    End      Total   Block Rate  Initial
Date    Start     End       (h:mm:ss)   (sec)       Block    Block    Blocks  (Blk/sec)   Sync Age

2-Apr   14:31:04  20:41:57  6:10:53     22253       9742655  9794210  51555   2.32        1w15h51m
11-Apr  13:42:22  21:21:54  7:39:32     27572       9795215  9852805  57590   2.09        1w1d13h
16-Apr  12:33:59  16:22:53  3:48:54     13734       9853639  9884032  30393   2.21        4d12h6m
24-Apr  13:02:32  20:20:15  7:17:43     26263       9885743  9936790  51047   1.94        1w14h18m
1-May   14:19:11  21:31:11  7:12:00     25920       9937598  9982200  44602   1.72        6d15h1m

 

From the resyncing data it looks like we have to allocate approximately 1 hour of syncing for every day our node is lagging behind. Note the huge difference between the fast sync mode used in the initial sync and the full sync mode used on resyncing. Remember, fast sync caught up with over 4 years 7 months of data in under 3 days.
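The rule of thumb can be checked directly against the resyncing data. The Python sketch below recomputes the block rate of each resync and the hours of syncing needed per day of lag, then compares this with the average rate of the initial fast sync. All figures are copied from the tables above; the variable names are my own and the lag is simply the Initial Sync Age converted to hours.

```python
# (sync date, total blocks, seconds taken, lag at sync start in hours)
RESYNCS = [
    ("2-Apr",  51555, 22253, 7 * 24 + 15 + 51 / 60),
    ("11-Apr", 57590, 27572, 8 * 24 + 13),
    ("16-Apr", 30393, 13734, 4 * 24 + 12 + 6 / 60),
    ("24-Apr", 51047, 26263, 7 * 24 + 14 + 18 / 60),
    ("1-May",  44602, 25920, 6 * 24 + 15 + 1 / 60),
]

rates = []
hours_per_lag_day = []
for label, blocks, seconds, lag_hours in RESYNCS:
    rate = blocks / seconds                            # blocks per second
    ratio = (seconds / 3600) / (lag_hours / 24)        # sync hours per day of lag
    rates.append(rate)
    hours_per_lag_day.append(ratio)
    print(f"{label}: {rate:.2f} blk/sec, {ratio:.2f} h of sync per day of lag")

# Initial fast sync: 9591728 blocks in 2 days 18 hours.
fast_rate = 9591728 / ((2 * 24 + 18) * 3600)
mean_full_rate = sum(rates) / len(rates)
print(f"fast sync average: {fast_rate:.1f} blk/sec")
print(f"fast sync was about {fast_rate / mean_full_rate:.0f}x faster")
```

The fast sync figure is an average over the whole chain, including the early blocks that import very quickly, so the comparison is only indicative.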

 

Concluding Remarks

My Ethereum node syncing went fairly smoothly. I consider the results to be more than reasonable. With their regular updates the Ethereum core team are clearly delivering many improvements. If in the past your experience was less pleasant, you may want to give it another shot.

 

Useful Links

Go Ethereum

What is the upper bound of "imported new state entries"?

Geth progress when switching to trie download

 

Copyright 2024 All rights reserved. BlockchainThings.io