Factom Protocol P2P: A TCP Gossip Network

Introduction

Let’s take a closer look at Factom’s p2p protocol.

Factom uses a Gossip protocol for all messages, which is a very simple protocol with a wide variety of applications. One of Factom’s main priorities is creating a censorship-resistant environment and Gossip Protocols have characteristics that aid in that regard. More on that later.

The p2p code is implemented entirely inside factomd with no external libraries. If you’re already familiar with Gossip Protocols and just want a brief overview, these are the specifications of the network:

Overview

  • TCP
  • Seeded with a list of 10 softcoded ips
  • Fanout of 16 (with optional max fanout for some messages)
  • Outgoing count of 32
  • Incoming count of 150 (no NAT traversal)
  • Push approach (mostly eager push)
  • 6 round limit
  • Partial peer view with a reactive strategy
  • Share 128 quality peers with neighbors

What is a Gossip Protocol

If you’re unfamiliar with Gossip protocols, the basic principle is this: a node is connected to a number of other nodes. When a node receives (or creates) a message, it shares it with a number of random nodes it is connected to. Those nodes then also send that message to a number of random nodes they are connected to, and so on. It is possible for a node to receive a message twice, in which case the message is dropped. Each time a message is sent out, a counter is increased and if that reaches a target (the round limit), it is also dropped.

It is not guaranteed for a message to reach all nodes in the system, depending on the topography of the network and chance. By changing the parameters, however, you can achieve a very high probability that a message will reach all nodes.

Factom’s Implementation

Peer Discovery

Initially, a starting node* will download a list of softcoded seeds and connect to them. Upon successful connection, they request a peer list from that node. They receive a list of up to 128 peers and attempt to connect to those peers, repeating the cycle until the “outgoing” count has been met.

The peers shared with a neighbor are ordered by an internal quality score but the outgoing ips are selected via a “diverse” metric, preferring to connect to addresses that are distant to each other.

Nodes maintain a view of all peers they have ever heard about (inactive or unused peers are eventually filtered out) but have no knowledge of the layout of the network. (“Partial peer view”) The shape of the network only changes when peers enter or leave, otherwise resting. (“Reactive strategy”)

Unlike other gossip networks, Factom nodes do not need an identity or public key to connect to the network. Every node creates a random 64-bit unsigned integer ID during startup, which at the moment is only used for identifying when a node tries to add itself as a peer.

* Any node connecting to the existing network. The authority set is bootstrapped via its own configuration file and maintain a healthy set of connections to each other.

Message Propagation

The default behavior of nodes is to treat all messages as anonymous using the same TCP connection for direct messages and Gossip messages. Messages arrive from the network and are sent to the application layer, where they are validated. I have already made an effort to document the messages themselves, so you can use that as reference. Messages are first validated and invalid messages (including duplicates) are filtered out. Valid messages are then sent from the application layer back to the p2p layer, where they are sent to specific peers or broadcasted. Once a message has already been sent out 6 times, it is dropped. Some messages (with the validity of 0) are held back and only relayed at a future point.

There is no preference made for picking which peers to use for the fanout, using a completely random selection of 16 peers. The application can override this and transmit important messages (election messages) to all peers a node is connected to.

The Actual Network

So what does the Factom network actually look like?

That’s pretty hard to answer. There’s no tracker or anything of the sort that is aware of the entire network. It’s not possible to reconstruct a complete view with data currently available. I gave it my best shot:

The Graph

This is a graph of the view that nodes have of each other. Bigger purple nodes are the ones from the seed file. Turquoise nodes have branched out further from the seed cluster. The single nodes flying around the outside (including the orange neighborhood) are nodes that were once part of the network but are most likely not connected anymore. It does not show which nodes are currently connected to each other but due to the way that the quality score is calculated at the moment (connected peers have a dramatically higher score), this view happens to coincide with connectivity.

There are 400 (exactly) nodes in total but of those only 102 and were online (and not behind a NAT). The remaining 298 nodes are still in the pool of potential peers (the partial view). Some of them are most likely developer nodes or nodes with dynamic addresses. This matches up (roughly) with Factoshi’s network node count, which also counts nodes behind a NAT, though not the partial view.

That said, this graph is more of a proof of concept than an accurate modeling of the network. I am not a data scientist.

Methodology

I wrote a small script that connects to a factomd node and requests a peer list, then disconnects. From there, it crawls every ip found until done. That netted me a list of 400 (exactly) unique ip addresses and the peers they deemed to share with me (up to 128 peers per ip, with many duplicates). In total, I ended up with 8,643 directional edges, though the graph has been reduced to non-directional edges.

Originally I intended to plot it in something like halfviz but it wasn’t powerful enough for that data size, so I ended up using Gephi and following a tutorial.

If anyone is interested in, I can share the code I used to crawl the network but for the sake of the network, I’m holding off on publishing it.

Acknowledgements

I used a variety of sources to collect all of the information above but I would like to give a special mention to João Leitão’s master thesis, “Gossip-based broadcast protocols” which is a great read if you want to delve deeper into gossip protocols.

And thanks to Alex from Factoshi for providing invaluable feedback regarding Factoshi’s node counting method and my own graph methodology.