Boy oh boy are you ready to dive deep into the complicated world of Cryptographic Hashing Functions? No? Good, neither are we. That’s why here at Whiteboard Crypto we break very complicated and confusing cryptocurrency topics down to their very basic levels and use analogies and stories to help explain them… so the average person like you (who’s not ready to fully grasp what the heck a cryptographic hashing algorithm is) will be able to understand it so well you could explain it to your grandpa… or rather, you could show him this video and he’d understand it.
So think of a cryptographic hashing algorithm like a magical black box where you can give it something and it poops something out.
A hashing function is a system where you can put something into it, and it’ll output a hash. We will get into what a hash is in a second, but just know that we need it.
There’s a ton of math happening inside this “magical black box”, but essentially you give it something and it does the pooping. In this case, bitcoin uses the SHA-256 hashing function. SHA stands for “Secure Hash Algorithm” and 256 refers to the number of 0s and 1s in whatever it puts out. Whether you put in your name or the entire dictionary, what it poops out will always be 256 1s and 0s. Our computers are smart, so they convert those 0s and 1s to letters and numbers (each letter or number stands for four 1s and 0s), which comes out to 64 numbers and letters. There are many different hashing functions, but bitcoin uses SHA-256, so we will use it for the examples in this video. Essentially you need to know 5 main things about a hashing function.
- You always get the same output if you give it the same input
- No matter how much data you give it, you always get the same size of output
- Quick to compute
- You can’t reverse or predict the hash
- It’s infeasible to find two inputs that give the same hash
- You always get the same output if you give it the same input.
So if you give it “Would you please click the like button below, it gives us warm fuzzies”, it would give you “290a96e911a3a77b524231b8c9afe9d75055eeb3d0926fe5038ebbbc32a228f8”
And if you did it again, it would give you the same seemingly random numbers and letters. Those numbers and letters are what we call the hash. Like I said, it’s actually a bunch of 1s and 0s, but we turn it into numbers and letters so it’s easier to look at and easier to copy/paste.
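If you want to try this at home, here is a minimal sketch using Python’s built-in hashlib module. The helper name sha256_hex is made up for this example, and the exact hash you see depends on the exact characters you type (for example straight vs. curly quotes), so it may not match the one on screen.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as 64 letters and numbers (hexadecimal)."""
    # Assumption: the text is turned into bytes using UTF-8 before hashing.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

message = "Would you please click the like button below, it gives us warm fuzzies"

first = sha256_hex(message)
second = sha256_hex(message)

print(first)
print(second)
print("Same both times?", first == second)
```

Run it as many times as you like: the two printed hashes are always identical to each other.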
- No matter how much data you give it, it’ll always give you an output of the same size.
So if you give it A, it’ll give you 64 numbers and letters.
Then you give it AA, and it’ll give you 64 numbers and letters.
Then you give it AAA, and it’ll give you 64 numbers and letters.
If you give it the entire encyclopedia, it’ll give you 64 numbers and letters.
And if you give it my social security number, it’ll give you 64 numbers and letters.
The black box is magical in the sense that whether you give it one letter or a million characters, it’ll output 64 numbers and letters.
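Here is a small sketch of that idea, again assuming Python’s hashlib. The list of inputs is just for illustration, with a million A’s standing in for “the entire encyclopedia”.

```python
import hashlib

# Inputs of very different sizes: 1, 2, 3 and 1,000,000 characters.
inputs = ["A", "AA", "AAA", "A" * 1_000_000]

lengths = []
for text in inputs:
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    lengths.append(len(digest))
    print(f"{len(text):>9} characters in -> {len(digest)} characters out")
```

Every line of output ends the same way, because SHA-256 always produces 32 bytes, which is 256 1s and 0s, or 64 numbers and letters.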
- It needs to be quick to compute.
This would be a good place to show you exactly what the hashing black box actually does.
However, it is very technical and not everyone who watches this video needs to understand it. If you do though, here’s an intro: https://qvault.io/cryptography/how-sha-2-works-step-by-step-sha-256/
See, it first converts your message to binary, then it adds a single 1, then it adds 0s until the length is 64 short of a multiple of 512 1s and 0s, then it appends 64 bits that record how long the original message was, then it initializes hash values and this is where most people have to go look up what initializing hash values means and that’s why I’m not going to explain the whole thing, because just look – Message Schedule? Chunk loop? You don’t need to know these to know what a hash function does… You just need to know it does a bunch of math with the 1s and 0s essentially. Computers are really good at math, so that sums up our 3rd point, that it needs to be quick to compute. Most computers can do a few million of these each second.
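For the curious, the very first step (the padding) can be sketched in a few lines of Python. This is only the padding step and not the whole algorithm, and the function name sha256_pad is made up for this example. It works in bytes (groups of 8 1s and 0s), so “a single 1” followed by seven 0s becomes the byte 0x80.

```python
def sha256_pad(message: bytes) -> bytes:
    """Pad a message the way SHA-256 does before the real math starts."""
    bit_length = len(message) * 8

    # Add a single 1 bit (followed by seven 0 bits, since we work in whole bytes).
    padded = message + b"\x80"

    # Add 0s until we are 64 bits (8 bytes) short of a multiple of 512 bits (64 bytes).
    while len(padded) % 64 != 56:
        padded += b"\x00"

    # Append 64 bits that store the original message length in bits.
    padded += bit_length.to_bytes(8, "big")
    return padded

padded = sha256_pad(b"abc")
print(len(padded) * 8, "bits after padding")
```

After padding, the total length is always a multiple of 512 bits, which is what lets the rest of the algorithm chew through the message one 512-bit chunk at a time.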
- Changing the input just a tiny bit, changes the output a lot.
So basically if you do SHA-256 of this “subscribe to our channel”… you get this as the output:
However, if you do “please subscribe to our channel”… you get this:
And, if you change it just a tiny bit “please subscribe to our channel!” with an exclamation mark, you get this:
We want a small change of the message to greatly change the hash, because otherwise it would be predictable, and that would break the security of it.
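You can check this yourself with a short sketch like the one below, assuming Python’s hashlib. It prints the hash of each of the three messages and then counts how many of the 256 1s and 0s differ between the last two. The helper names are made up for this example.

```python
import hashlib

def sha256_as_int(text: str) -> int:
    """Return the 256-bit SHA-256 hash of text as one big integer."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")

def differing_bits(a: str, b: str) -> int:
    """Count how many of the 256 output bits differ between the hashes of a and b."""
    return bin(sha256_as_int(a) ^ sha256_as_int(b)).count("1")

messages = [
    "subscribe to our channel",
    "please subscribe to our channel",
    "please subscribe to our channel!",
]

for message in messages:
    print(hashlib.sha256(message.encode("utf-8")).hexdigest(), "<-", repr(message))

changed = differing_bits(messages[1], messages[2])
print(changed, "of 256 bits changed by adding one exclamation mark")
```

For a good hashing function you would expect roughly half of the bits to flip, even though only one character changed.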
- And lastly, it must be nearly impossible to find two inputs that give the same hash.
In fact, there are currently no known pairs of inputs that generate the same output in SHA-256. However, MD5, which is a different hashing function with a different magical black box that does different math, does have a few.
For example, this input: (refer to https://crypto.stackexchange.com/questions/1434/are-there-two-known-strings-which-have-the-same-md5-hash-value)
And this input: [refer to page]
Both give us this:
These are called “collisions”, and for MD5 people can now create them on purpose, which is why MD5 is no longer considered secure. SHA-256 is much, much, much stronger than MD5. People have been mining for years, creating billions of billions of SHA-256 hashes, and no collision has been found. Maybe you’ll find one yourself and win an award!
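To get a feel for why all 256 bits matter, here is a toy sketch that throws away most of the hash and keeps only its first 4 characters. This is a made-up experiment for illustration, not something SHA-256 or bitcoin actually does, and the inputs "message0", "message1" and so on are hypothetical.

```python
import hashlib

def find_truncated_collision(prefix_chars: int = 4):
    """Find two different inputs whose SHA-256 hashes share the same first characters."""
    seen = {}  # shortened hash -> the input that produced it
    i = 0
    while True:
        text = f"message{i}"
        short_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()[:prefix_chars]
        if short_hash in seen:
            return seen[short_hash], text, short_hash
        seen[short_hash] = text
        i += 1

first, second, shared = find_truncated_collision(4)
print(repr(first), "and", repr(second), "both start with", shared)
```

With only 4 characters there are just 65,536 possible outputs, so a collision has to show up within 65,537 tries and usually appears much sooner. With the full 64 characters there are 2 to the power of 256 possible outputs, which is why nobody has stumbled onto a SHA-256 collision.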
What does it have to do with crypto?
It really has to do with something called “proof of work” which is the exact method we use to mine bitcoin. You can watch my “proof of work” video about it, but essentially here is what we are doing:
- We are taking a list of transactions that people want added to the blockchain.
- We are then adding a random set of numbers and letters to it
- Finally, we are calculating the SHA-256 of that, over and over with different numbers and letters, until we get an output that starts with a certain number of zeros.
In this case, because all I have right now is my personal computer, we want to find out how to get “Subscribe” with 7 zeros.
Well, as we mentioned earlier, we can’t predict what we need to add to it, we will just have to guess and check.
So we start with Subscribe1 and get
Then, using some simple python code, we keep adding a new number at the end until we get a zero at the end.
At Subscribe9 – we get
At Subscribe45 – we get
At Subscribe2864 – we get 3 zeros
At Subscribe38245- we get 4 zeros
At Subscribe1292748 – we get 5 zeros
At Subscribe59174387 – we get 6 zeros
At Subscribe326032489 – we get this
Which is 7 zeros.
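The guess-and-check loop can be sketched in Python roughly like this. It is a minimal sketch and not the exact code used for the video: it assumes the zeros are counted at the start of the hash (as bitcoin does), and the function name mine is made up. It only looks for 3 zeros so it finishes almost instantly; going for 7 takes far longer.

```python
import hashlib

def mine(prefix: str, zeros: int, start: int = 1):
    """Try prefix1, prefix2, prefix3... until the hash starts with enough zeros."""
    target = "0" * zeros
    number = start
    while True:
        candidate = f"{prefix}{number}"
        digest = hashlib.sha256(candidate.encode("utf-8")).hexdigest()
        if digest.startswith(target):
            return candidate, digest
        number += 1  # wrong guess, so try the next number

candidate, digest = mine("Subscribe", 3)
print(candidate, "->", digest)
```

Notice that there is no clever shortcut in the loop. Because the hash can’t be predicted, the only strategy is to keep trying the next number.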
This is essentially what mining bitcoin is. Except instead of “Subscribe”, the data is literally a list of people exchanging bitcoin like “John pays Bill 5. Erik pays Robin 8.” and so on.
If you take a look at this bitcoin block, you see that someone had to find a hash with 19 zeros…
My computer took 10 minutes to find 7, and as you saw, each extra zero took many times more tries than the one before it (about 16 times more on average, since each character of the hash has 16 possible values). There are hundreds of thousands of computers around the world guessing these random numbers, and once one of them finds an answer, because we can be sure it had to guess and check to get there, we can be reasonably assured they didn’t make a fake transaction.
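As a rough back-of-the-envelope check, here is a tiny sketch of how the number of guesses grows. It assumes each zero is one hexadecimal character at the start of the hash, so each guess has a 1 in 16 chance per zero required. The function name expected_guesses is made up for this example.

```python
def expected_guesses(zeros: int) -> int:
    """Average number of guesses needed to get a hash starting with this many zeros."""
    # Each hash character has 16 possible values, and only one of them is "0".
    return 16 ** zeros

for zeros in (1, 3, 7, 19):
    print(f"{zeros:>2} zeros: about {expected_guesses(zeros):,} guesses on average")
```

These are averages, so any single run can get lucky or unlucky, but they show why going from 7 zeros to 19 zeros is far out of reach for one personal computer.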
Thanks for watching this video, I hope you learned something about hashing functions and algorithms and their purposes, if you did… please consider mining the like button and subscribing for future videos and to reward our hard work on these videos. We hope to see you in the next video!