At a Glance

Block Hashes

To access historical data on Ethereum, a smart contract must first know the hash of the block containing the data. This may sound easy, but the EVM only gives smart contracts access to the previous 256 block hashes on chain. Thus, in order to access older block data, Relic needs a way to query block hashes that are beyond this limit. Unfortunately, Ethereum’s high storage costs make storing every historical block hash on-chain completely infeasible. At the time of writing, the set of historical block hashes is about 500MB, and storing that much data on-chain would cost about 6,240 ETH at a gas price of 20 gwei. As a result, Relic must somehow compress the block hash history in order to be scalable.
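The cost estimate above can be reproduced with a rough calculation. This is only a back-of-the-envelope sketch: the 20,000 gas per newly written storage slot and the one-hash-per-slot layout are our assumptions, not figures from the text, and the result lands close to the quoted number rather than exactly on it.

```python
# Back-of-the-envelope cost of storing every block hash in contract storage.
# Assumptions (not from the text): one 32-byte hash per storage slot and
# roughly 20,000 gas to write a previously empty slot.
HISTORY_BYTES = 500 * 10**6   # "about 500MB" of block hashes
BYTES_PER_HASH = 32           # one 32-byte hash per storage slot
GAS_PER_NEW_SLOT = 20_000     # assumed cost of writing a fresh slot
GAS_PRICE_GWEI = 20
GWEI_PER_ETH = 10**9


def storage_cost_eth(total_bytes: int) -> float:
    """Estimated ETH cost of writing total_bytes of hashes to storage."""
    slots = total_bytes // BYTES_PER_HASH
    gas = slots * GAS_PER_NEW_SLOT
    return gas * GAS_PRICE_GWEI / GWEI_PER_ETH


cost = storage_cost_eth(HISTORY_BYTES)
print(f"{cost:,.0f} ETH")
```

The exact figure shifts with the precise data size and gas schedule, but the order of magnitude (thousands of ETH) is what makes naive storage impractical.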

Thankfully, we have more efficient options for storing block hashes. Rather than storing them individually, we can use the ubiquitous Merkle Tree data structure. This structure relies on cryptographic hash functions: functions that take arbitrary data as input and produce a fixed-size output, known as a hash. Cryptographic hash functions have a few special properties:

  • First pre-image resistance: given a hash, it is difficult to find data that results in that specific hash
  • Collision resistance: it is difficult to create any two different pieces of data which result in the same hash
  • Second pre-image resistance: given a piece of data, it is difficult to find other data which hashes to the same value (this is a weaker property than collision resistance)

We can combine these hashes into a tree: an arrangement of data where each non-leaf node has two child nodes, and the value of a node is the hash of its two children combined. Building this up, each level of the tree has half as many nodes as the level underneath it, until we get to a single root node at the top.

This lets us use only a single cryptographic hash to commit to as much data as we like. Once the root is fixed, the leaves at the bottom cannot be changed: changing a leaf without changing the root would require breaking one of the cryptographic properties of hash functions listed above.
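The construction described above can be sketched in a few lines. This is a minimal illustration, not Relic's implementation: it uses SHA-256 from the Python standard library as a stand-in for Ethereum's Keccak-256, and it assumes the number of leaves is a power of two.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Stand-in hash: SHA-256 from the standard library. Ethereum uses
    # Keccak-256, which hashlib does not provide.
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a binary Merkle tree; len(leaves) must be a power of two."""
    if not leaves or len(leaves) & (len(leaves) - 1):
        raise ValueError("leaf count must be a power of two")
    level = list(leaves)
    while len(level) > 1:
        # Each parent is the hash of its two children concatenated.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Eight placeholder "block hashes" collapse into a single 32-byte commitment.
block_hashes = [h(f"block {n}".encode()) for n in range(8)]
root = merkle_root(block_hashes)

# Changing any single leaf changes the root.
tampered = list(block_hashes)
tampered[3] = h(b"forged block")
print(merkle_root(tampered) != root)
```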

To prove a particular piece of data is associated with a Merkle Tree, users can present the original data together with a witness: the sequence of sibling hashes that were hashed alongside that data at each step as we walk up the tree. A piece of code like a smart contract can verify that this process results in the correct Merkle root at the end, meaning the data presented must have been present when the Merkle Tree was originally created. There are a few practical constraints, but this effectively solves the storage question: we can take each of our block hashes and place them in a Merkle Tree. By storing Merkle roots instead of the entire sequence of data, the storage required drops significantly.
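As a rough sketch of how such a witness is produced and checked, the example below builds a small tree, extracts the sibling hashes for one leaf, and recomputes the root from them. The function names are ours, SHA-256 again stands in for Keccak-256, and the leaf count is assumed to be a power of two.

```python
import hashlib


def h(data: bytes) -> bytes:
    # SHA-256 stands in for Ethereum's Keccak-256.
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """All levels of the tree, from the leaves up to the single root."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def make_proof(levels, index):
    """Collect the sibling hash at every level below the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # the partner of node `index`
        index //= 2
    return proof


def verify_proof(root, leaf, index, proof):
    """Recompute the path to the root using only the leaf and its siblings."""
    node = leaf
    for sibling in proof:
        if index % 2 == 0:
            node = h(node + sibling)   # we are the left child
        else:
            node = h(sibling + node)   # we are the right child
        index //= 2
    return node == root


leaves = [h(f"block {n}".encode()) for n in range(8)]
levels = build_levels(leaves)
root = levels[-1][0]
proof = make_proof(levels, 5)
print(len(proof), verify_proof(root, leaves[5], 5, proof))
```

The verifier only needs the root, the leaf, and one hash per level, so the witness grows with the logarithm of the tree size rather than with the amount of data committed.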

Trusting Block Hashes

It is easy to push a hash on-chain and claim it is correct, but this approach is reliant on trust. The question remains: How can users be confident that Relic Protocol has pushed the correct Merkle Tree of block hashes?

Fortunately, the entire purpose of blockchains is to make it possible to verify the authenticity of previous data. Each block’s header data is included in the Keccak hash calculation that describes the next block. Similar to a Merkle Tree, this means the current block hash is a commitment to all previous blocks. Therefore, as long as we can link up all of our historic blocks using successive hashing and have the final block match up with the current Ethereum state, we know that all the historic data is correct.

Figure: Basic example of a blockchain, a linked list of blocks where each block contains the hash of the previous one.
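The linking check can be illustrated with a toy model. The `Header` class below is a simplification we made up for this sketch: a real Ethereum block hash is the Keccak-256 hash of the RLP-encoded header, whereas here we hash a parent hash plus an opaque payload with SHA-256.

```python
import hashlib
from dataclasses import dataclass


def h(data: bytes) -> bytes:
    # SHA-256 stands in for Ethereum's Keccak-256.
    return hashlib.sha256(data).digest()


@dataclass(frozen=True)
class Header:
    parent_hash: bytes
    payload: bytes  # stands in for all the other header fields

    def block_hash(self) -> bytes:
        return h(self.parent_hash + self.payload)


def build_chain(length: int) -> list[Header]:
    """Create a toy chain where each header commits to its predecessor."""
    headers, parent = [], bytes(32)
    for i in range(length):
        header = Header(parent, f"block {i}".encode())
        headers.append(header)
        parent = header.block_hash()
    return headers


def chain_is_valid(headers: list[Header], trusted_tip_hash: bytes) -> bool:
    """Check successive hashing and that the last block matches a trusted hash."""
    if not headers:
        return False
    for prev, curr in zip(headers, headers[1:]):
        if curr.parent_hash != prev.block_hash():
            return False
    return headers[-1].block_hash() == trusted_tip_hash


chain = build_chain(5)
tip = chain[-1].block_hash()

# Replacing a historic header breaks the link to the block after it.
forged = list(chain)
forged[2] = Header(chain[2].parent_hash, b"forged")
print(chain_is_valid(chain, tip), chain_is_valid(forged, tip))
```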

That all sounds great, but how can we take our Merkle Tree and verify successive hashing of all the included data? Unfortunately the answer is: we cannot.

zk-SNARKs

Rather than simply accepting this undesirable situation, we can supply additional information alongside the Merkle Tree that demonstrates the original data conforms to the blockchain property. This turns out to be feasible using zk-SNARKs.

At a basic level, a SNARK is a way to use some data to produce a short associated witness that can be used to convince a verifier that the data has certain properties. Intuitively: a SNARK is a way to quickly prove to a third party that a difficult computation was done correctly. The magic is that the verification process is very fast, and the associated witness is very small — perfect for the limited resources of the blockchain.

With this, we have all the basic pieces we need to create our historical data. We can “SNARKify” the construction of our Merkle Tree to guarantee that the original data has the blockchain property and results in the stated Merkle root. Then we can “connect” the last block in our proof to a recent block that is still accessible from within the Ethereum Virtual Machine (EVM), by showing that the recent block was created by hashing the data from our block. This proves that the Merkle Tree was created from the block data that actually precedes it.

For practical reasons, we do not create a single Merkle Tree for the entire Ethereum history. Instead, we make several trees of fixed size. This makes Merkle proofs shorter and makes the trees easier to handle inside a zk-SNARK, both of which keep gas costs under control for users.
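To see why fixed-size trees keep proofs short, the sketch below maps a block number to a tree and a leaf position, and compares proof lengths. The tree size of 8,192 leaves is a hypothetical value chosen for illustration; the text does not state the size Relic actually uses.

```python
TREE_SIZE = 8192  # hypothetical leaves per tree; must be a power of two


def locate(block_number: int) -> tuple[int, int]:
    """Return (tree index, leaf index within that tree) for a block."""
    return divmod(block_number, TREE_SIZE)


def proof_length(tree_size: int) -> int:
    """Number of sibling hashes in a proof for a power-of-two sized tree."""
    return tree_size.bit_length() - 1


print(locate(17_000_000))        # which tree and leaf hold block 17,000,000
print(proof_length(TREE_SIZE))   # siblings needed for a fixed-size tree
print(proof_length(2**24))       # siblings for one tree over ~16.7M blocks
```

Under these assumptions a proof needs 13 sibling hashes instead of 24, and each SNARK only has to cover a bounded number of blocks.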

Post-Dencun

As part of the Dencun hard fork, the Ethereum execution layer received a trustless beacon block oracle for the last ~1 day of beacon blocks. Since beacon blocks contain the execution block hash, Relic utilizes this oracle as one source of trustless block hash access. Most notably, this oracle enables very cheap access to recent blocks (< 1 day old) for which the block hash Merkle root is not yet cached on-chain.
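The "~1 day" window follows from how the oracle stores its data. The toy model below assumes the ring-buffer layout described in EIP-4788 (8,191 entries keyed by block timestamp) and 12-second slots; the class and method names are ours, and this is not the real contract.

```python
HISTORY_BUFFER_LENGTH = 8191  # ring-buffer size specified by EIP-4788
SECONDS_PER_SLOT = 12


class BeaconRootsBuffer:
    """Toy model of the EIP-4788 storage layout (not the real contract)."""

    def __init__(self) -> None:
        self.timestamps = [0] * HISTORY_BUFFER_LENGTH
        self.roots = [b""] * HISTORY_BUFFER_LENGTH

    def set(self, timestamp: int, parent_beacon_root: bytes) -> None:
        i = timestamp % HISTORY_BUFFER_LENGTH
        self.timestamps[i] = timestamp   # overwrites whatever was there
        self.roots[i] = parent_beacon_root

    def get(self, timestamp: int) -> bytes:
        i = timestamp % HISTORY_BUFFER_LENGTH
        if timestamp == 0 or self.timestamps[i] != timestamp:
            raise LookupError("root not available (too old or never stored)")
        return self.roots[i]


# How much history the buffer covers if every slot produces a block.
window_hours = HISTORY_BUFFER_LENGTH * SECONDS_PER_SLOT / 3600
print(f"{window_hours:.1f} hours")
```

Older entries are silently overwritten as new blocks arrive, which is why blocks beyond this window still need the Merkle roots and SNARKs described earlier.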

For more details, check out our BeaconBlockHistory contract.

Ethereum State

At this stage, we have presented a high-level overview of how to prove any historical block hash on Ethereum, which is of great benefit to developers. The final step, getting from a proven block hash to the Ethereum state itself, is relatively straightforward.

Figure: Ethereum block header broken down into its different fields (Ethereum block diagram from Weber et al.)

If you have been following the discussion thus far, you may notice some familiar names. Those things labeled stateRoot, transactionsRoot, receiptsRoot, storageRoot? Those are all roots from Merkle Trees — technically Merkle Patricia Tries, but it’s the same idea.

If you want to prove a transaction occurred in an Ethereum block: take the block header, extract the transactionsRoot, and then create a Merkle proof of the transaction against that root. The same approach works for other basic Ethereum data (storage, receipts, etc.) using the corresponding root.
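The two-step check (header matches a trusted block hash, then data matches a root inside that header) can be sketched as follows. This is a deliberately simplified model: `ToyHeader` carries only four fields, a plain binary Merkle tree stands in for the Merkle Patricia Trie, and SHA-256 stands in for Keccak-256.

```python
import hashlib
from dataclasses import dataclass


def h(data: bytes) -> bytes:
    # SHA-256 stands in for Ethereum's Keccak-256.
    return hashlib.sha256(data).digest()


@dataclass(frozen=True)
class ToyHeader:
    parent_hash: bytes
    state_root: bytes
    transactions_root: bytes
    receipts_root: bytes

    def block_hash(self) -> bytes:
        return h(self.parent_hash + self.state_root
                 + self.transactions_root + self.receipts_root)


def fold_proof(leaf: bytes, index: int, siblings: list[bytes]) -> bytes:
    """Recompute a binary Merkle root from a leaf and its sibling hashes."""
    node = leaf
    for sibling in siblings:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node


def transaction_in_block(trusted_block_hash, header, tx, index, siblings):
    # Step 1: the presented header must hash to the already-proven block hash.
    if header.block_hash() != trusted_block_hash:
        return False
    # Step 2: the transaction must be committed to by the header's root.
    return fold_proof(h(tx), index, siblings) == header.transactions_root


# A toy block with four transactions.
txs = [b"tx0", b"tx1", b"tx2", b"tx3"]
leaf = [h(t) for t in txs]
n01, n23 = h(leaf[0] + leaf[1]), h(leaf[2] + leaf[3])
header = ToyHeader(bytes(32), h(b"state"), h(n01 + n23), h(b"receipts"))
trusted_hash = header.block_hash()

print(transaction_in_block(trusted_hash, header, b"tx2", 2, [leaf[3], n01]))
```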

Tying It All Together

This overview introduces the fundamental concept of accessing historical data from any point in Ethereum's history, on-chain and in a provably secure way.

To demonstrate how Relic Protocol can prove a basic fact about historical data, such as "my Ethereum account has existed since 2017", let's walk through an example.

Figure: High-level architecture of Relic

  1. Smart contracts which verify zk-SNARKs for block history are deployed by the Relic Protocol team.
  2. The Relic Protocol team runs the ZK Prover off-chain to generate proofs of block headers. To make sure the block headers are accurate, every proof is verified fully on-chain by the zk-SNARK verifier contracts from step 1.
  3. Users who want to prove a fact (e.g. their account existed in 2017) can work with the off-chain ZK Prover to construct two proofs. The first proves that their birth block exists in the Ethereum chain (by showing it is in the Merkle Tree of proven block hashes, by showing it is in the last 256 blocks, or by using a SNARK). The second shows that the stateRoot Merkle Patricia Trie of that block has an entry for their account.
  4. These two proofs are then submitted to the State Verifier which ensures both proofs are correct.
  5. The State Verifier then stores the proven fact in the Reliquary, to commit the fact to the blockchain for future use (without requiring proofs each time).
  6. For things like Birth Certificates, a Soul Bound Token may be issued, so users get to show off their proven facts on OpenSea or similar places.
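Steps 3 to 5 above amount to "verify two proofs once, then cache the result". The sketch below models only that flow: `ToyReliquary`, `verify_and_store`, the account string, and the block number are all hypothetical names and values invented for illustration, and the proof checks are passed in as plain callables rather than real SNARK or trie verifiers.

```python
from typing import Callable, Optional


class ToyReliquary:
    """Hypothetical in-memory stand-in for the on-chain fact store."""

    def __init__(self) -> None:
        self._facts: dict[tuple[str, str], int] = {}

    def store(self, account: str, fact_name: str, value: int) -> None:
        self._facts[(account, fact_name)] = value

    def get_fact(self, account: str, fact_name: str) -> Optional[int]:
        # Later queries read the cached fact; no proof is needed again.
        return self._facts.get((account, fact_name))


def verify_and_store(
    reliquary: ToyReliquary,
    account: str,
    birth_block: int,
    check_block_proof: Callable[[], bool],
    check_account_proof: Callable[[], bool],
) -> bool:
    """Store the fact only if both proofs check out (steps 4 and 5)."""
    if not (check_block_proof() and check_account_proof()):
        return False
    reliquary.store(account, "birth_block", birth_block)
    return True


reliquary = ToyReliquary()
accepted = verify_and_store(
    reliquary, "0xA11CE", 4_000_000,   # hypothetical account and block number
    check_block_proof=lambda: True,    # placeholder for the block hash proof
    check_account_proof=lambda: True,  # placeholder for the account proof
)
print(accepted, reliquary.get_fact("0xA11CE", "birth_block"))
```

The design point is that the expensive verification happens once, while every later consumer pays only for a simple lookup.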

Once a fact is proven, any dApp can query it fully on-chain by interacting with the Reliquary. Any smart contract can now easily verify facts about Ethereum state at any point in the history of the entire chain.