Up until now, I have been relying on legacy values for configuring the P2P 2.0 package I have been working on. These values are:
- Outgoing: 32
- Incoming: 150
- Fanout: 16
- Rounds: 6
As far as I know, these values have been selected arbitrarily with the primary goal of ensuring that messages reach as many targets as possible. The drawback is that the more reliability you choose, the more the network will be flooded with duplicate messages. I wanted to find out if these settings make sense for the network and if it is possible to optimize them.
Since I am a programmer, not a mathematician, I opted to do this through an empirical process.
Note: All data I used is available on the github wiki. There is much more available than I ended up illustrating below.
To start, I wrote a simulator in Node.js that models the sending of packets in a network. The model represents a network with no latency and constant time for accepting and resending messages. The specific variables that I want to investigate: (a) number of connections, (b) fanout, and to a lesser extent (c) rounds. Rounds are logarithmic in nature and increasing rounds beyond a certain value has only limited to no effect.
All messages will be sent by
Node 0 with a random selection for fanout. There are two-and-a-half metrics to measure:
- Reach: The likelihood that a node in the system will receive a message
- Waste: The amount of wasted traffic of sending messages to nodes that have already seen the message
- Overload: The number of times the same message arrives at a node
We want to maximize reach while minimizing waste. Overload is another way of counting waste.
I am also going to frequently use the
fanout/rounds notation. For example,
16/6 would be 16 fanout, 6 rounds.
To illustrate what’s going on, I tested a small network with 8 nodes, an outgoing of 4, fanout of 2, and 3 rounds. I used a random number to generate these until I found an example that contained all the points I wanted to show. Here is what it looks like:
Each node has at least 4 bi-directional connections but some nodes ended up with more.
#0 sends the message to
#0 has received and sent the message. Nodes
#2 have received the message.
#1 sends the message to
#2 sends the message to
#6. This is the first wasted message.
#2 have received and sent. Nodes
#6 have received the message.
#4 sends the message to
#5. Both are waste.
#5 sends the message to
#4. Both are waste.
#6 sends the message to
#4. One is wasted.
#6 have received and sent. Node
#3 has received the message.
The message reached 7 of 8 nodes with 6 of 12 messages sent being waste. If there was another round, the likelihood of
#7 receiving the last message is 66%. In this example, the ratio between fanout and connections is .421 (there are an average of 4.75 connections), which resulted in 50% message waste.
This is just a sample size of one. After running 100,000 iterations (with randomized connections between samples), I ended up with a reach of 91% and an average of 46% wasted traffic. The worst case reach was 50%. Generally, the following holds true:
- A higher fanout/connections ratio results in a higher reach and higher waste and when fanout = connections, messages are guaranteed to reach every node but will arrive
avg connections * (total nodes-2)times. (This is actually done in some instances in the Factom protocol.)
- Conversely, a fanout of 1 with an infinite round limit messages there is almost no waste (0 or 1 messages wasted) but the likelihood of reaching all nodes is slim.
Very early on, I found that the important factors in a randomly-connected node cluster are the total node count and fanout. The number of connections has little to no impact on a random network as long as it’s higher than the fanout but it does have a large impact on hub networks. The rounds only have an impact on the lower end. For the live network, there are ~125 nodes, so I’m using 50, 125, 200, 1000, and 5000 for testing.
The actual Factom network is also not a purely random network but one with a hub (the seed nodes). This, unfortunately, has a negative effect on message propagation. On MainNet, there are ten seed nodes and around 125 nodes with 32 connections each. That means the seed nodes will receive a disproportional amount of traffic compared to other nodes. For the rest of the blog, I’m going to differentiate between random networks and hub networks.
The full data is available on the wiki in the form of tables that were generated using the app. The value of rounds past 6 is only marginal, and for the rest of this blog, I decided to focus on the 5-7 range.
Node Count vs Reach
Does the size of the network matter? Yes, absolutely. The more nodes in the network, the higher the fanout should be to reach all nodes.
The Effect of Hub vs Random
Let’s compare the reach using 200 nodes with Rounds=7:
And with 1000 nodes, 5 rounds:
The reach of the random order grows much faster because the hub order is being bottlenecked by the hub. Nodes won’t send the same message twice, so hub nodes receive a disproportionate amount of messages but don’t send out more messages, acting as a sink.
Waste and Fanout
In both the random and hub order, the waste % grows the same with Fanout but the difference in Reach is very apparent here and there is a crossover for the hub network.
The hub network structure had a much bigger impact than I originally anticipated. In the random network, you can reach 200 nodes in as little as 5/5, but with the hub network, 5/5 is only 70% reach. The current values used on both mainnet and testnet (16/6) are not as unreasonable as they first seemed under the circumstances.
In the random network, we can save around 8 percentage points of waste by going from 16/6 to 8/6 while still having a near 100% reach but in the hub network, reach and waste very closely aligned. For a random network, it is actually possible to approximate the optimal fanout by just taking the log of the node count:
ceiling(ln(#nodes) + 1).
Example: For 200 nodes,
~5.2983, with the formula giving us a fanout of 7. In the table, a fanout of seven at five rounds has a reach of 99.9% with 85.7% waste.
Can you increase performance by allowing more connections?
Yes. The bottleneck can be reduced by just allowing more connections to the point where the 10 seed nodes are a less overall percentage. Since the fanout is a random selection of the connection count, the closer the connections are to the total node count, the more the “randomness” increases. If you connected every node to every other node, you could also achieve an “optimal” structure in terms of message propagation. That, however, comes with massively increased resource consumption and other barriers such as NAT. It will also mean the network won’t be able to scale up. With a limited connection set, the size of the network can grow without generating more load per node.
There is no effective difference between taking 32 mathematically random connections of N nodes and complete connectivity.
Note: You’re gonna have to look close to tell the difference between orange and red here.
Are there really that many duplicate messages?
Yes. This matches up with real data collected by running a node on the testnet. During the June 3rd, 2019 stress test, the node reported the following stats:
# Total number of duplicate messages filtered out factomd_p2p_messages_duplicate 6.695623e+06 # Total number of application messages received factomd_p2p_messages_received 7.028926e+06
That is 95.2% waste, slightly worse than the simulation of the hub network. Please note that while this sample is from the stress test, it holds true for normal operations as well. Seeing 94%+ duplicate messages on a regular basis is what prompted me to write this blog in the first place.
In retrospect, this isn’t too surprising. With a fanout of 16, it’s likely that every message is received 16 times. That means 15/16 messages arriving are wasted and
15/16 = 0.9375 = 93.7%.
Benefit of Reducing Waste
Using optimal settings, it would be possible to drop waste from 94% down to ~86%, an eight percent reduction. A node running on the testnet on a normal day sends over a million application messages (mainnet would be higher due to usage), which accounts for around 78% of total traffic. Overall, an 8% reduction would be a nice improvement in terms of bandwidth but I believe the real benefits come in from the reduced Overloading. Going from a Fanout of 16 to a Fanout of 7-8 would reduce the average amount of application messages arriving at a node by half, resulting in a CPU usage reduction.
Optimization Strategy #1: Changing the Network Structure
At the moment, the Gossip network uses a reactive strategy for peers. We can get a more random network structure by switching to a cyclic strategy. The TL;DR is that instead of maintaining the current set of connections until something happens (reactive), peers will periodically drop some active connections in favor of new ones. This constant re-ordering introduces randomness and allows us to break up the bottleneck hub.
In addition to a more efficient network structure, a cyclic strategy would free up resources on the seed nodes, ensuring they are less likely to reach their capacity. How exactly the cyclic strategy would behave is up in the air and likely a topic for a future blog.
Optimization Strategy #2: The Cost of Anonymity
The Factom network is anonymous, meaning that messages traveling around the network don’t contain identifying information. It is impossible to tell if a message arriving from a node was created by that node or by another one. There is no history of nodes the message has traveled through. It is also not possible for a node to determine accurately how many nodes are in the network.
If the message hops were to be saved in the message, it would be possible to dramatically reduce waste. Below is a chart using 200 nodes in a random network where the message contains the hops it has been at. The raw data is also available. A message won’t be sent to a node in the history but it is still possible for duplicate messages to arrive at a node through separate routes, for example
As you can see, we can reach 85+ percentage point improvements in waste reduction and the average node won’t receive the same message twice. This kind of optimization could increase the overall bandwidth capacity of the network by a substantial amount. It does, however, sacrifice some of the anonymity.
There is an ongoing optimization effort (thanks Matt York for pointing this out to me) on the application side that capitalizes on a factor I did not take into account for my model. In a Factom node, there is a delay between messages arriving and messages being sent out again. During that delay, a number of messages arrive. If the node records which peers sent them a specific message, it can ensure that it doesn’t try to send that specific message back to them.
This strategy would eliminate some of the waste but the efficacy depends on how the delay is between messages arriving and messages being sent out. It would see the largest gain during times when the node is very busy.
It is similar to strategy #2 but uses only data available to the node itself, maintaining anonymity.
I’m glad that I finally have some hard numbers for the thing I’ve suspected for a while, ever since I added a duplicate-filter to the new P2P rework to optimize CPU usage. The numbers make sense but I am disappointed that tackling this issue is not as simple as changing the fanout values.
I believe that we should absolutely try to align the network into a more efficient shape and I think that Strategy #2 also has merits. The Factom network is built on principles of preventing censorship. Reducing the anonymity of messages this way would likely violate those principles but I think it’s a discussion worth having. The benefits of strategy #2 are immense and the opposition is an ideological one, not a technical one. I am very much looking forward to hearing input on this.
Thanks for reading!