Decentralized ANO Management

Currently, authority nodes are onboarded and deboarded by whoever has access to the network skeleton key. The public half of that key is hardcoded in the factomd codebase, so whoever holds the matching private key essentially controls the authority set. This approach has a number of drawbacks:

  • The private key can be held by multiple parties without any accounting of who uses it. Party A signing a message is identical to Party B signing the message, with no way to know who sent it. The only publicly known party to hold the private key at the moment is Factom Inc.
  • The key can’t be revoked. Should an employee with access to the private key part ways with the company, or should the company suffer a digital theft, there would be no easy way to revoke the key.
  • No consensus is required. Any one party with the key can make decisions without checks or balances. (Of course, there can be repercussions, though they’d have to go through off-chain channels.)

All three issues can be mitigated by switching to a multi-sig style process. Rather than requiring one master key, n (the number of signatures required) of m (the number of possible signatures) ANOs will have to sign the message for it to be valid.

The drawbacks mentioned above are no longer an issue. ANOs coordinate in order to sign the message, making the process public and accountable. ANOs only sign for themselves and cannot speak for others. If an ANO loses control of their key, it’s not a big deal and their old identity can be replaced with a new one. If an ANO leaves, their identity is removed from the authority set and thereby revoked.

The Process

So what would the process look like? The AddServerMsg and RemoveServerMsg messages require a timestamp, a server identity chain and a type (1=fed or 0=audit).

To begin, a timestamp is chosen that determines when the message is added. The action takes effect in the block after the one in which the message is entered, so this amounts to picking the onboarding/deboarding date and time. The timestamp is combined with the server’s identity chain and server type to create the message itself. This unsigned message is then published somewhere (Discord, Factomize, etc.) in binary format, and ANOs have the opportunity to review and sign it.

The message payload format is: message type, factom timestamp, identity chain bytes, server type (0=audit, 1=fed)
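The payload layout can be sketched in Go as follows. This is a minimal illustration, not factomd code: the helper names BuildPayload and PayloadHex are hypothetical, and the field widths (1-byte message type, 6-byte millisecond timestamp, 32-byte chain ID, 1-byte server type) are assumptions read off the worked example later in this document.

```go
package main

import "encoding/hex"

// Field widths are read off the worked example in this document; they are
// assumptions for illustration, not taken from the factomd source.
const (
	timestampLen = 6  // milliseconds since the Unix epoch, big-endian
	chainIDLen   = 32 // identity chain ID
)

// BuildPayload (hypothetical helper) serializes the unsigned payload:
// message type, timestamp, identity chain ID, server type (0=audit, 1=fed).
func BuildPayload(msgType byte, unixMillis uint64, chainID [chainIDLen]byte, serverType byte) []byte {
	out := make([]byte, 0, 1+timestampLen+chainIDLen+1)
	out = append(out, msgType)
	// Keep only the low six bytes of the millisecond timestamp.
	for shift := 8 * (timestampLen - 1); shift >= 0; shift -= 8 {
		out = append(out, byte(unixMillis>>uint(shift)))
	}
	out = append(out, chainID[:]...)
	return append(out, serverType)
}

// PayloadHex renders the payload as lowercase hex for publishing.
func PayloadHex(payload []byte) string {
	return hex.EncodeToString(payload)
}
```

With message type 22 and the timestamp 1585742400000 (April 1st, 2020 at noon UTC, in milliseconds), the encoded payload begins with 160171359cca00 and is 40 bytes long.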

Once a sufficient number of signatures has been collected, the signature block is appended to the payload in the format: number of signatures, public key of first signature (32 bytes), first signature (64 bytes), … , public key of nth signature, nth signature. The order of the key/signature pairs is arbitrary. The public key/signature pairs can also be published on public channels.
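A rough Go sketch of building and checking such a signature block is shown below. It assumes Ed25519 keys (consistent with the 32-byte key and 64-byte signature sizes above) and a one-byte signature count; all function and type names are hypothetical, and the "authorized" set stands in for however the node looks up the current authority set.

```go
package main

import (
	"crypto/ed25519"
	"errors"
)

// KeySig pairs an Ed25519 public key (32 bytes) with its signature (64 bytes).
type KeySig struct {
	Pub ed25519.PublicKey
	Sig []byte
}

// KeyFromSeedByte builds a demo key from a seed of 32 identical bytes.
// Demo only: real keys must come from a secure random source.
func KeyFromSeedByte(b byte) ed25519.PrivateKey {
	seed := make([]byte, ed25519.SeedSize)
	for i := range seed {
		seed[i] = b
	}
	return ed25519.NewKeyFromSeed(seed)
}

// KeyID returns the public key as a fixed-size array usable as a map key.
func KeyID(priv ed25519.PrivateKey) [32]byte {
	var id [32]byte
	copy(id[:], priv.Public().(ed25519.PublicKey))
	return id
}

// SignPayload signs the payload and returns the key/signature pair.
func SignPayload(priv ed25519.PrivateKey, payload []byte) KeySig {
	return KeySig{Pub: priv.Public().(ed25519.PublicKey), Sig: ed25519.Sign(priv, payload)}
}

// AppendSignatures appends a one-byte count (an assumption) and the pairs.
func AppendSignatures(payload []byte, sigs []KeySig) ([]byte, error) {
	if len(sigs) > 255 {
		return nil, errors.New("too many signatures for a one-byte count")
	}
	out := append([]byte{}, payload...)
	out = append(out, byte(len(sigs)))
	for _, ks := range sigs {
		if len(ks.Pub) != ed25519.PublicKeySize || len(ks.Sig) != ed25519.SignatureSize {
			return nil, errors.New("malformed key/signature pair")
		}
		out = append(out, ks.Pub...)
		out = append(out, ks.Sig...)
	}
	return out, nil
}

// CountValid returns how many distinct authorized keys produced a valid
// signature over the first payloadLen bytes of msg.
func CountValid(msg []byte, payloadLen int, authorized map[[32]byte]bool) (int, error) {
	if payloadLen < 0 || len(msg) < payloadLen+1 {
		return 0, errors.New("message too short")
	}
	const pairLen = ed25519.PublicKeySize + ed25519.SignatureSize
	payload, rest := msg[:payloadLen], msg[payloadLen+1:]
	n := int(msg[payloadLen])
	if len(rest) < n*pairLen {
		return 0, errors.New("signature block truncated")
	}
	seen := map[[32]byte]bool{}
	valid := 0
	for i := 0; i < n; i++ {
		pair := rest[i*pairLen : (i+1)*pairLen]
		var id [32]byte
		copy(id[:], pair[:ed25519.PublicKeySize])
		if !authorized[id] || seen[id] {
			continue // unknown signer, or the same signer listed twice
		}
		if ed25519.Verify(ed25519.PublicKey(pair[:ed25519.PublicKeySize]), payload, pair[ed25519.PublicKeySize:]) {
			seen[id] = true
			valid++
		}
	}
	return valid, nil
}
```

A node would accept the message only if CountValid returns at least n. Counting each key once keeps a single signer from reaching the threshold by repeating their own pair.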

The final message can be submitted by anyone, meaning multiple ANOs can submit it if they like. If a valid message is submitted before the time specified, it is kept in holding until the block time surpasses the time specified. Messages submitted later than L hours after the specified time are invalid.
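The timing rule can be pictured with a small Go sketch. It is only an illustration under assumptions: the names are hypothetical, times are Unix milliseconds, the cutoff L is passed in as limitMs because its value is not fixed here, and a single "current block time" is used both for releasing held messages and for judging lateness.

```go
package main

// Disposition describes what a node does with a fully signed message.
type Disposition int

const (
	Hold    Disposition = iota // keep in holding, target time not yet passed
	Accept                     // eligible for the process list
	Expired                    // arrived more than L after the target time
)

// Classify compares the current block time against the target time in the
// message. limitMs stands in for the cutoff L, converted to milliseconds.
func Classify(blockTimeMs, targetMs, limitMs int64) Disposition {
	switch {
	case blockTimeMs <= targetMs:
		return Hold
	case blockTimeMs > targetMs+limitMs:
		return Expired
	default:
		return Accept
	}
}
```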

Picking n and m

The values of n and m aren’t set in stone, but from a technical perspective it makes sense to have m = the number of servers in the authority set (leaders and audits) and n = floor(m/2)+1. At the time of writing there are 55 servers, making n = 28. This requires only a minor change to the core code. The downside is that not all ANOs run two servers, so ANOs with only one server would be disadvantaged.

The other option that presents itself is to have m = 27, equal to the number of ANOs. At least 14 ANOs would have to sign to make it official. The downside is that factomd would have to start tracking which identity chains belong to which ANO, so that only one node per ANO is allowed to sign. This is not a trivial problem to solve and likely requires altering the blockchain structure to get the information on-chain, such as an extension of the identity-chain format.

Other schemes are possible but would require more involved technical work. The threshold of votes required can also be raised (or lowered) as necessary, for example to require 60% consensus for onboarding but 80% consensus for deboarding.
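The threshold arithmetic is simple enough to write down directly. In the Go sketch below the function names are hypothetical, and rounding a percentage threshold up to the next whole signature is an assumption, since the text does not say how fractional results are handled.

```go
package main

// MajorityThreshold returns n = floor(m/2) + 1, the simple majority of m.
func MajorityThreshold(m int) int {
	return m/2 + 1
}

// PercentThreshold returns the smallest n such that n signatures are at
// least percent% of m. Rounding up is an assumption made for this sketch.
func PercentThreshold(m, percent int) int {
	return (m*percent + 99) / 100
}
```

For the figures above, MajorityThreshold(55) is 28 and MajorityThreshold(27) is 14. With 55 servers, a 60% threshold works out to 33 signatures and an 80% threshold to 44.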

Drawbacks

The main drawback is the speed of the process. Organizing all the ANOs to sign a message is going to take some time. A timeframe of several days is acceptable for things that happen infrequently, like onboarding/deboarding of ANOs.

Another drawback is that this soft-forks the Factom chain. Nodes running outdated software would be unable to validate the new message, so they would neither propagate it around the network nor add it to their process list, and would therefore be unable to build the block themselves. Once the block has been built by the ANOs, an outdated node would understand the format again and continue progressing, but it would no longer be able to keep up with the leaders. There’s also the issue of outdated nodes not rebroadcasting the message because they consider it invalid. If nodes in the authority set have enough connections to each other, they should still be able to communicate.

There is a potential solution to this, with the cooperation of the current network skeleton key holders. At the moment, when factomd converts a raw message from the p2p network to a Factom message, it reads what it can and then ignores the rest. This means we can sign the message in the normal manner first, then tack on the ANO signature block. Old nodes will recognize the skeleton key signature and ignore what follows, while new nodes will ignore the skeleton key signature and verify the ANO signature block instead. Federated nodes all run the new code, so a message signed by only the old key would never make it into the process list.

Once that system is in place, the private half of the old skeleton key can be made public, since it’s now “useless” apart from informing old nodes. The downside is that old nodes won’t propagate the full message, only the part they recognize without the additional signatures, so the network propagation issue described above remains.

Technical Challenges

In the above spec, a message of <Payload> 2 <Key/Sig A> <Key/Sig B> is functionally the same as a message of <Payload> 2 <Key/Sig B> <Key/Sig A>, though the message hash itself is different, so a filter keyed on the message hash would not recognize the two as duplicates. The solution is to a) reject all messages without enough signatures before filtering, and b) only allow a single message with that payload through the filter.

Rule a) prevents anyone from sending the message too early (with no signatures or too few) and thereby blocking subsequent valid messages. Rule b) will then accept exactly one valid message. The entry in the process list and the admin chain does not store the actual signatures (the assumption is that it is only added to the process list if the signatures were valid), so one node receiving the order A/B and another node the order B/A will make no difference. The ACK sent out by the node responsible for writing to the admin chain is what determines the position and entry into the admin list.
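Rules a) and b) can be sketched as a filter keyed on the payload rather than on the whole message. This is a minimal illustration with hypothetical names, not the factomd replay filter; it assumes the caller has already counted the valid, distinct signatures, and it uses SHA-256 of the payload purely as a convenient map key.

```go
package main

import "crypto/sha256"

// PayloadFilter lets exactly one sufficiently signed message per payload
// through, regardless of the order of its key/signature pairs.
type PayloadFilter struct {
	seen map[[32]byte]bool
}

// NewPayloadFilter returns an empty filter.
func NewPayloadFilter() *PayloadFilter {
	return &PayloadFilter{seen: make(map[[32]byte]bool)}
}

// Admit reports whether the message should proceed. validSigs is the number
// of valid, distinct signatures already verified; required is n.
func (f *PayloadFilter) Admit(payload []byte, validSigs, required int) bool {
	// Rule a): reject under-signed messages before they touch the filter,
	// so an early or unsigned copy cannot block the real one.
	if validSigs < required {
		return false
	}
	// Rule b): key on the payload only, so A/B and B/A orderings collide.
	key := sha256.Sum256(payload)
	if f.seen[key] {
		return false
	}
	f.seen[key] = true
	return true
}
```

Because the key ignores the signature block, two copies of the same payload with differently ordered signatures map to the same filter entry, and only the first is admitted.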

Example

Let’s say we want to onboard (message type 22) the ANO “AAA” on April 1st, 2020 at noon (UTC) as an audit node (type=0), and their identity chain ID is 888888aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa. The message to sign would be:

160171359cca00888888aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa00

We need at least two ANOs to sign, which is done by ANO Zero and ANO One, with the private keys 0000000000000000000000000000000000000000000000000000000000000000 and 1111111111111111111111111111111111111111111111111111111111111111 respectively.

  • ANO Zero Public Key: 3b6a27bcceb6a42d62a3a8d02a6f0d73653215771de243a63ac048a18b59da29
  • ANO Zero Signature: ad29f521164251f073f1a0587bf1e7a9bcf659cf9263748d1921d57ddff9fb0d06a81182068e21b358d21428d15502d35b9fe247ca981d4285cb264f5ffeb30a
  • ANO One Public Key: d04ab232742bb4ab3a1368bd4615e4e6d0224ab71a016baf8520a332c9778737
  • ANO One Signature: a32428a9bec1f5ed694375a508f071d8b04f51c36048e9ff6eb143c59f67b6113bf0ba671d445eed335c0d2333f936a982d1a5b6c86a3caf6bff780c4e2d5308

The final constructed message is:

160171359cca00888888aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa00023b6a27bcceb6a42d62a3a8d02a6f0d73653215771de243a63ac048a18b59da29ad29f521164251f073f1a0587bf1e7a9bcf659cf9263748d1921d57ddff9fb0d06a81182068e21b358d21428d15502d35b9fe247ca981d4285cb264f5ffeb30ad04ab232742bb4ab3a1368bd4615e4e6d0224ab71a016baf8520a332c9778737a32428a9bec1f5ed694375a508f071d8b04f51c36048e9ff6eb143c59f67b6113bf0ba671d445eed335c0d2333f936a982d1a5b6c86a3caf6bff780c4e2d5308

(Verify this example on the golang playground)

Conclusion

The ability of ANOs to determine their own ANO pool via a decentralized, democratic process is an important step for the future of the protocol. Implementing it involves technical challenges that make a soft fork difficult, but other features are already waiting for the next hard fork to come around. A radical shift in governance like this might be reason enough to consider doing one.