To access historical data on Ethereum, a smart contract must first know the block hash of the block containing the data. This may sound easy, but the EVM only gives smart contracts access to the previous 256 block hashes on chain. Thus, in order to access older block data, Relic needs a way to query block hashes that are beyond these limits.
Unfortunately, Ethereum’s high storage costs make storing every historical block hash on-chain infeasible. At the time of writing, the set of historical block hashes is about 500MB, and storing that much data on-chain would cost about 6,240 ETH at a gas price of 20 gwei. As a result, Relic must compress the block hash history in order to scale.
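The figures above can be reproduced with a rough back-of-envelope calculation. The sketch below assumes a chain height of about 15.6 million blocks and a cost of 20,000 gas to write one fresh 32-byte storage slot; both constants are assumptions chosen for illustration, not values stated by Relic.

```python
# Back-of-envelope cost of storing every historical block hash on-chain.
# BLOCK_COUNT and GAS_PER_NEW_SLOT are illustrative assumptions.
BLOCK_COUNT = 15_600_000      # assumed chain height at the time of writing
BYTES_PER_HASH = 32           # a block hash fills one 32-byte storage slot
GAS_PER_NEW_SLOT = 20_000     # assumed gas to write one fresh storage slot
GAS_PRICE_GWEI = 20           # gas price quoted in the text

total_bytes = BLOCK_COUNT * BYTES_PER_HASH
total_gas = BLOCK_COUNT * GAS_PER_NEW_SLOT
total_wei = total_gas * GAS_PRICE_GWEI * 10**9   # 1 gwei = 10**9 wei
total_eth = total_wei // 10**18                  # 1 ETH = 10**18 wei

print(f"{total_bytes / 1e6:.1f} MB, {total_gas:,} gas, {total_eth:,} ETH")
# prints "499.2 MB, 312,000,000,000 gas, 6,240 ETH"
```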
Thankfully, we have more efficient options for storing block hashes. Rather than storing them individually, we can use the ubiquitous Merkle Tree data structure. This structure relies on cryptographic hash functions: functions that take arbitrary data as input and produce a fixed-size output, known as a hash. Cryptographic hash functions have a few special properties:
- First pre-image resistance: given a hash, it is difficult to find data that results in that specific hash
- Collision resistance: it is difficult to create any two different pieces of data which result in the same hash
- Second pre-image resistance: given a piece of data, it is difficult to find other data which hashes to the same value (this is a weaker property than collision resistance)
We can combine these hashes into a tree: an arrangement of data where each node has two child nodes, and the value of a node is the hash of its two children concatenated. Building this up, each level of the tree has half as many nodes as the level beneath it, until we reach a single root node at the top.
This lets us use only a single cryptographic hash to commit to as much data as we like. Once the top root is fixed, the bottom data leaves cannot be changed. (Because if they could, it would mean someone could bypass one of the stated cryptographic properties of hash functions.)
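The construction can be sketched in a few lines. This is a minimal illustration, not Relic's implementation: it uses SHA-256 from the standard library as a stand-in for Ethereum's Keccak-256, assumes the number of leaves is a power of two, and the names `h` and `merkle_root` are made up for this example.

```python
import hashlib

def h(data: bytes) -> bytes:
    # SHA-256 is a stand-in here; Ethereum itself uses Keccak-256.
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a binary Merkle tree; the leaf count must be a power of two."""
    if not leaves or len(leaves) & (len(leaves) - 1):
        raise ValueError("leaf count must be a non-zero power of two")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        # Each parent is the hash of its two children concatenated.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

leaves = [f"block hash {i}".encode() for i in range(8)]
root = merkle_root(leaves)

# Changing any single leaf changes the root.
tampered = list(leaves)
tampered[3] = b"forged block hash"
assert merkle_root(tampered) != root
```

Eight leaves collapse into one 32-byte root, and swapping a single leaf produces a different root.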
To prove that a particular piece of data belongs to a Merkle Tree, a user presents the original data together with a witness: the sequence of sibling hashes that were combined with it at each level as we walk up the tree. A piece of code such as a smart contract can recompute the hashes along this path and check that the result is the correct Merkle root, meaning the data presented must have been present when the Merkle Tree was originally created. There are a few practical constraints, but this effectively solves the storage question: we can take each of our block headers and place them in a Merkle Tree. By storing Merkle roots instead of the entire sequence of data, the storage required drops significantly.
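A Merkle inclusion proof and its verification might look like the following sketch. It again assumes SHA-256 in place of Keccak-256 and a power-of-two leaf count; `build_levels`, `make_proof`, and `verify_proof` are hypothetical helper names, not part of any Relic API.

```python
import hashlib

def h(data: bytes) -> bytes:
    # SHA-256 stands in for Keccak-256 to stay within the standard library.
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """All levels of the tree, leaf hashes first, root level last."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def make_proof(levels, index):
    """Collect the sibling hash at every level below the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # sibling differs only in the lowest bit
        index //= 2
    return proof

def verify_proof(leaf, index, proof, root):
    """Recompute the path to the root using only the leaf and its siblings."""
    node = h(leaf)
    for sibling in proof:
        # An even index is a left child, an odd index is a right child.
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

leaves = [f"header {i}".encode() for i in range(8)]
levels = build_levels(leaves)
root = levels[-1][0]
proof = make_proof(levels, 5)

print(len(proof))                                   # 3 hashes for 8 leaves
print(verify_proof(leaves[5], 5, proof, root))      # True
print(verify_proof(b"forged", 5, proof, root))      # False
```

The verifier never sees the other seven leaves. It needs only the leaf, its position, and one sibling hash per level.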
It is easy to push a hash on-chain and claim it is correct, but this approach is reliant on trust. The question remains: How can users be confident that Relic Protocol has pushed the correct Merkle Tree of block hashes?
Fortunately, the entire purpose of blockchains is to make it possible to verify the authenticity of previous data. Each block’s header includes the Keccak hash of the previous block’s header, so every block hash depends on the block before it. Similar to a Merkle Tree, this means the current block hash is a commitment to all previous blocks. Therefore, as long as we can link up all of our historic blocks using successive hashing and have the final block match up with the current Ethereum state, we know that all the historic data is correct.
Basic example of a blockchain
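The hash-linking idea can be illustrated with a toy chain. In this sketch the `Header` class, its three fields, and its byte encoding are simplifications invented for the example (real Ethereum headers have many more fields and are RLP-encoded), and SHA-256 replaces Keccak-256.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Header:
    number: int
    parent_hash: bytes
    payload: bytes  # stands in for all other header fields

    def hash(self) -> bytes:
        encoded = self.number.to_bytes(8, "big") + self.parent_hash + self.payload
        return hashlib.sha256(encoded).digest()

def build_chain(payloads):
    """Create headers where each one commits to the hash of its parent."""
    headers, parent = [], bytes(32)
    for number, payload in enumerate(payloads):
        header = Header(number, parent, payload)
        headers.append(header)
        parent = header.hash()
    return headers

def chain_links_to(headers, trusted_tip_hash):
    """Check every parent link, then check the tip against a trusted hash."""
    for parent, child in zip(headers, headers[1:]):
        if child.parent_hash != parent.hash():
            return False
    return headers[-1].hash() == trusted_tip_hash

chain = build_chain([b"a", b"b", b"c", b"d"])
tip = chain[-1].hash()
print(chain_links_to(chain, tip))   # True

# Rewriting an old block breaks the link to its successor.
forged = list(chain)
forged[1] = Header(1, chain[1].parent_hash, b"forged")
print(chain_links_to(forged, tip))  # False
```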
That all sounds great, but how can we take our Merkle Tree and verify successive hashing of all the included data? Unfortunately the answer is: we cannot.
Rather than accept this limitation, we can accompany the Merkle Tree with additional information demonstrating that the original data conforms to the blockchain property. This turns out to be feasible using zk-SNARKs.
At a basic level, a SNARK is a way to use some data to produce a short associated proof that can convince a verifier that the data has certain properties. Intuitively, a SNARK is a way to quickly prove to a third party that a difficult computation was done correctly. The magic is that the verification process is very fast and the associated proof is very small, which is perfect for the limited resources of the blockchain.
With this, we have all the basic pieces we need to create our historical data. We can “SNARKify” the construction of our Merkle Tree to guarantee that the original data has the blockchain property and results in the stated Merkle root. Then we can “connect” the last block in our proof to a recent block that is still accessible from within the Ethereum Virtual Machine (EVM), by showing that the recent block was created by hashing the data from our block. This proves that the Merkle Tree was created from the blocks that actually precede it on chain.
For practical reasons, we do not create a single Merkle Tree for the entire Ethereum history. Instead, we make several trees of fixed size. This makes Merkle proofs shorter and makes the trees easier to work with inside a zk-SNARK, both of which keep gas costs under control for users.
Now that we have addressed the matter of historical block hashes, the question arises as to how an individual can obtain a block hash from a few hours ago.
First off, Relic Protocol keeps the ZK Prover running 24/7, building zk-SNARK proofs and submitting them on-chain once every 8,192 blocks, a little under one per day.
As previously mentioned, the EVM can access only the last 256 block hashes, just shy of one hour of history. But what if the required data is older than one hour but less than one day?
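The numbers involved can be checked with some quick arithmetic. The sketch below assumes a 12-second block time and assumes that trees are aligned to multiples of 8,192 blocks starting from block 0. Neither detail is stated in the text, and `locate` is a hypothetical helper.

```python
TREE_SIZE = 8_192          # blocks per Merkle tree, from the text
BLOCKHASH_WINDOW = 256     # block hashes visible to the EVM, from the text
SECONDS_PER_BLOCK = 12     # assumed average block time

def locate(block_number: int) -> tuple[int, int]:
    """Return (tree index, leaf index), assuming trees start at block 0."""
    return divmod(block_number, TREE_SIZE)

# A tree with 2**13 leaves needs 13 sibling hashes per Merkle proof.
proof_length = TREE_SIZE.bit_length() - 1

hours_per_tree = TREE_SIZE * SECONDS_PER_BLOCK / 3600
minutes_in_window = BLOCKHASH_WINDOW * SECONDS_PER_BLOCK / 60

print(locate(15_600_000))            # (1904, 2432)
print(proof_length)                  # 13
print(round(hours_per_tree, 1))      # 27.3
print(round(minutes_in_window, 1))   # 51.2
```

Under these assumptions one tree covers a little over 27 hours, and the EVM's native window covers about 51 minutes.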
Fortunately, this issue is similar to the one we previously addressed with historical blocks. Our aim is to establish a hash chain from the desired block to a recent enough block that the EVM can access. Instead of attempting to submit and hash hundreds or thousands of blocks, the easiest approach is to use a SNARK.
Therefore, while it continuously builds zk-SNARK proofs of Merkle Trees of block hashes, Relic Protocol also sets some proofs aside that are not pushed on-chain automatically. If someone wants to use Relic Protocol to prove a fact from 2,000 blocks ago, the Relic Protocol Web2 API will issue their proof as a zk-SNARK of a small Merkle Tree of block headers together with a proof of inclusion in that tree, rather than as an inclusion proof in one of the already published Merkle Trees.
At this stage, we have presented a high-level overview of how to prove any historical block hash on Ethereum. The final step, going from a proven block hash to the data inside that block, is relatively straightforward.
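The three ways of reaching a block hash can be summarized as a simple decision rule. The sketch below is an illustration only: `choose_path`, `ProofPath`, and the `published_trees` counter are hypothetical names, and it assumes trees are aligned to multiples of 8,192 blocks and are published in order.

```python
from enum import Enum

class ProofPath(Enum):
    BLOCKHASH = "read the hash directly inside the EVM"
    MERKLE = "Merkle proof against an already published tree root"
    SNARK = "on-demand zk-SNARK linking the block to a recent one"

BLOCKHASH_WINDOW = 256
TREE_SIZE = 8_192

def choose_path(target: int, current: int, published_trees: int) -> ProofPath:
    """Pick how to prove the hash of block `target` at block `current`."""
    if not 0 <= target < current:
        raise ValueError("target must be an existing block before the current one")
    if current - target <= BLOCKHASH_WINDOW:
        return ProofPath.BLOCKHASH           # still visible to the EVM
    if target // TREE_SIZE < published_trees:
        return ProofPath.MERKLE              # covered by a published root
    return ProofPath.SNARK                   # too old for the EVM, too new for a tree

current = 16_005_000
published = current // TREE_SIZE             # assume every complete tree is published

print(choose_path(current - 100, current, published).name)    # BLOCKHASH
print(choose_path(current - 2_000, current, published).name)  # SNARK
print(choose_path(1_000_000, current, published).name)        # MERKLE
```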
Ethereum block diagram from Weber et al.
If you have been following the discussion thus far, you may notice some familiar names. Those fields with labels such as transactionsRoot, stateRoot, and storageRoot? Those are all roots of Merkle Trees (technically Merkle Patricia Tries, but it’s the same idea).
If you want to prove a transaction occurred in an Ethereum block: take the block header, extract the transactionsRoot, and then create a Merkle proof of the transaction. The same approach works for other basics in Ethereum (storage, receipts, etc.).
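The two-step check (header matches a trusted block hash, then transaction matches the header's root) can be sketched with a toy model. This is a simplification under stated assumptions: the header is reduced to three fields, a plain binary Merkle tree stands in for Ethereum's Merkle Patricia Trie, SHA-256 replaces Keccak-256, and all function names are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    # SHA-256 stands in for Keccak-256 to stay within the standard library.
    return hashlib.sha256(data).digest()

def merkle_levels(leaves):
    """All levels of a binary Merkle tree (power-of-two leaf count)."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def branch_for(levels, index):
    branch = []
    for level in levels[:-1]:
        branch.append(level[index ^ 1])
        index //= 2
    return branch

def root_from_branch(leaf, index, branch):
    node = h(leaf)
    for sibling in branch:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node

def toy_block_hash(parent_hash, transactions_root, state_root):
    # Toy header encoding: just the three fields concatenated.
    return h(parent_hash + transactions_root + state_root)

def tx_in_block(trusted_block_hash, header, tx, index, branch):
    parent_hash, transactions_root, state_root = header
    # Step 1: the supplied header must hash to the already proven block hash.
    if toy_block_hash(parent_hash, transactions_root, state_root) != trusted_block_hash:
        return False
    # Step 2: the transaction must be included under the header's root.
    return root_from_branch(tx, index, branch) == transactions_root

txs = [b"tx0", b"tx1", b"tx2", b"tx3"]
levels = merkle_levels(txs)
header = (bytes(32), levels[-1][0], h(b"toy state"))
trusted = toy_block_hash(*header)
branch = branch_for(levels, 2)

print(tx_in_block(trusted, header, b"tx2", 2, branch))     # True
print(tx_in_block(trusted, header, b"forged", 2, branch))  # False
```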
This section serves as an introduction, providing you with the fundamental concept of accessing historical data from any point in Ethereum's history on-chain and in a provably secure way.
To demonstrate how Relic Protocol can prove a basic fact about historical data, such as "my Ethereum account has existed since 2017", let's walk through an example.
High level architecture of Relic
- Smart contracts that verify zk-SNARKs for block history are deployed by the Relic Protocol team.
- The Relic Protocol team runs the ZK Prover off-chain to generate proofs of block headers. To ensure the block headers are accurate, every proof is verified entirely on-chain by the zk-SNARK verifier contracts.
- Users who want to prove their fact (e.g. account existed in 2017) can talk to the ZK Prover off-chain to help construct two proofs: one proving their birth block exists in the Ethereum chain (by showing it exists in the Merkle Tree of proven block hashes, or showing it is in the last 256 blocks, or using a SNARK), and another showing the Ethereum stateRoot Merkle Patricia Trie has an entry for their account in that block.
- These two proofs are then submitted to the State Verifier which ensures both proofs are correct.
- The State Verifier then stores the proven fact in the Reliquary, to commit the fact to the blockchain for future use (without requiring proofs each time).
- For things like Birth Certificates, a Soul Bound Token may be issued, so users get to show off their proven facts on OpenSea or similar places.
Once a fact is proven, it can easily be queried fully on-chain by any dApp by interacting with the Reliquary. Any smart contract can now easily verify facts about Ethereum state at any point in the history of the entire chain.
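The "prove once, query many times" flow can be modeled off-chain with a toy example. Everything here is hypothetical: `ToyReliquary` and `verify_and_store` are invented for illustration and do not reflect the interface of the real Reliquary contract, and the two proof checks are stubbed out as plain callables.

```python
from typing import Callable, Optional

class ToyReliquary:
    """In-memory stand-in for an on-chain store of proven facts."""

    def __init__(self) -> None:
        self._facts: dict[tuple[str, str], int] = {}

    def store(self, account: str, fact_name: str, value: int) -> None:
        self._facts[(account, fact_name)] = value

    def query(self, account: str, fact_name: str) -> Optional[int]:
        # Cheap lookup: no proof is needed once the fact has been stored.
        return self._facts.get((account, fact_name))

def verify_and_store(
    reliquary: ToyReliquary,
    account: str,
    fact_name: str,
    value: int,
    check_block_proof: Callable[[], bool],
    check_state_proof: Callable[[], bool],
) -> bool:
    """Store the fact only if both the block proof and the state proof pass."""
    if not (check_block_proof() and check_state_proof()):
        return False
    reliquary.store(account, fact_name, value)
    return True

reliquary = ToyReliquary()
account = "0xExampleAccount"  # placeholder, not a real address

# Both checks are stubbed to succeed; real checks would verify actual proofs.
ok = verify_and_store(reliquary, account, "birth_block", 4_000_000,
                      lambda: True, lambda: True)
rejected = verify_and_store(reliquary, "0xOther", "birth_block", 1,
                            lambda: True, lambda: False)

print(ok, reliquary.query(account, "birth_block"))          # True 4000000
print(rejected, reliquary.query("0xOther", "birth_block"))  # False None
```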