PegNet Part Two: Probabilities

In my last blog, I described how the Proof of Work is calculated and how the mining process works, but now I want to dive a little deeper into the concept of proof of work. Most importantly, I want to show the answer to the question: Why does it work?

I’m only going to address the Bitcoin Proof of Work system superficially and unless explicitly stated, all the math and formulas apply to PegNet only.

Mining

While I am going to use the term “mining” throughout this blog, it is somewhat of a misnomer because when mining for ore or other substances, there is incremental progress. When you dig a hundred foot long tunnel one day, the next day you can continue where you left of. That is not the case when “mining” a proof of work system. Each attempt — or in our case hash — is independent.

A more accurate term would be rolling dice. In our case, the die has over 18 quintillion (2^64) sides. Every hash produces a distinct, pseudorandom Difficulty from 0 and 2^64 – 1. What you “rolled” in the previous attempt is not relevant in the next attempt.

That means that it is possible to gain the highest Difficulty (2^64-1) in the very first attempt. It is also possible to roll the lowest Difficulty (0) in the first attempt. The likelihood of that happening is the same as rolling any other Difficulty:

probability

This is the basis for all of the math in this blog. It comes with a big caveat, however: We are only calculating the likelihood of events occuring. When you consider rolling a six-sided die, the likelihood of rolling a six is 1/6. You are expected to roll one six in every six throws. That doesn’t mean you will roll a six when you throw dice yourself. It might take eight or ten rolls. It could take dozens.

However, if you throw a million dice and tally everything up, the real value will be very close to the expected value. The same is true for mining. If you look at just a handful of attempts, your expectations are not likely to come true. In PegNet or other PoW systems like Bitcoin, we are dealing with billions or more of attempts.

Hashrate

Every computer has a certain amount of Hashrate, which is the amount of hashes you can produce over a certain period of time. This is something that is objectively measurable but will be different for every machine. It also depends on which hash function you’re using. For PegNet, you can use a LXRHash Benchmark Utility to calculate your PegNet hashrate.

For the rest of this blog, when I use Hashrate I’m referring to the Hashrate over the mining duration in a single Factom directory block, which is eight minutes (480 seconds). When I want to refer to the Hashrate per second, I will use H/s.

Winning

In PegNet, winning Proof of Work means having one of the top 50 highest Difficulty entries. This is a complex issue that can be broken down in multiple parts:

  1. How many times am I expected to roll a value higher than the threshold?
  2. What is that threshold?
  3. What is the total network hashrate?

Part 1: Expected Entries

The answer to the first question requires an as-of-yet unknown factor X, which is the lowest Difficulty of the top 50 that we have to surpass in order to make it in. The likelihood of randomly rolling higher than X is the amount of desirable events divided by the amount of possible events:

Just as with dice rolls, the expected number of results we get after amount rolls is amount * probability. We know (or at least we can measure) our own Hashrate. That makes the formula:

That means if we know what X is, we can calculate the amount of records we can expect to enter into the top 50, which brings us to…

Part 2: Minimum Difficulty

One way to determine the threshold is to just look at a number of recent blocks, calculate the top 50, and there you have it. This is what the software will be able to do from the available information, but at the time of writing, this data is not yet available.

So let’s introduce a new unknown: THR or Total Hashrate, which is the sum of the Hashrate of every individual miner. Now we can do something tricky: we take the formula from Part 1 and assume that the threshold is a fixed place and that 50 entries managed to win.

Now we just need to solve for X and we get:

This gives us a generalized formula for the threshold that only depends on the network’s total hashrate. Even better, we can easily modify it to be not just the top 50. One thing is clear though: we need to know the network’s total hashrate in order for this to work. That brings us to…

Part 3: Total Hashrate

We have a formula to calculate the minimum difficulty based on the hashrate. Since it is possible to retrieve information about the top 50 from the last block, does that mean we can use that to calculate the total hashrate? Yes!

Using the equation for minimum difficulty, we solve for THR and get:


(Steps)

Putting it all together

We now have the tools to calculate everything we want to. The data we need is our own Hashrate, as well as the minimum difficulty of a block. This lets us determine the total hashrate of the network and how often we are expected to get an entry in the top 50.

If we now assume that our Hashrate is a percentage of the network’s total Hashrate, ie: EHR = F * Hashrate, we can substitute that into the formula and get the following:


(Steps)

This backs up the obvious assumption. If F=1, or EHR is equal to Hashrate, we expect to get 50 of the top 50. If our Hashrate is 50% of the network (F=2), we expect to get half of the top 50. If our Hashrate is a thousandth of the network, we expect to get one record into the top 50 every 20 blocks. And best of all, we can plot it.

Two more generalized formulas are

  • If we mined last block: 50 * Hashrate / THR
  • If we did not mine last block: 50 * Hashrate / (THR + Hashrate)

Why does Proof of Work work?

It’s obvious that hashing takes time but since hashing is like rolling dice, it is possible someone can get lucky and roll the perfect hash on the first try. The key is understanding the probability of that event happening. This is where the difference between PegNet and Bitcoin is substantial.

In Bitcoin, the minimum difficulty is predetermined and the first person to cross the threshold wins.
In PegNet, the duration of mining is predetermined and the whoever rolls the best during that duration wins.

This is why the difficulty in Bitcoin is periodically adjusted to maintain an approximate ten minute block duration. If the Total Hashrate goes up, the difficulty has to be adjusted in turn or the likelihood of someone winning happens too soon. In PegNet, the duration is determined by the Factom Protocol’s 10 minute block time, which is why it can’t have a predetermined minimum difficulty. It is not guaranteed that a certain treshold will be met during the mining time.

Conclusion

The math involved in PegNet’s Proof of Work is straightforward probability math with no hidden surprises, which in turn makes it easy to have formulas. These formulas ended up being integrated into PegNet’s codebase, allowing miners to save money and only submitting records that are expected to make it at least into the top 175 based on the previous block. They are also used to track the network’s Total Hashrate in the control panel.

This is probably one of the most fascinating aspect of PegNet and Proof of Work in general. In the latter’s case, billions of dollars are entrusted to these probabilities.

Simple but powerful.