If you have been keeping an eye on the blockchain industry for the past year or so, I am sure you have heard of the Ethereum “Merge” or Layer 2 solutions at one time or another. However, these are rather technical concepts, and you may have turned a blind eye to them as a result. In this piece, I hope to change that by providing an easier-to-understand overview of what these things actually are and how they relate to each other moving forward (spoiler alert: it’s about scaling).
Ethereum’s Current Structure
For context, a little background on Ethereum’s current structure and its existing scaling issues will be helpful. For those less familiar, the Ethereum blockchain at its simplest is a distributed, public ledger. Think of it as a large, open-source operating system (or global computer) where people build and transact publicly without the need for a centralized authority.
Ethereum Blocks (and Congestion) Explained
Any transaction that occurs on the Ethereum network is recorded via a “block.” For example, if I wanted to send a counterparty a token over the Ethereum network, this transaction would be included in a block for approval before it goes through. Each block is typically composed of numerous transactions like this one from different people. Once a block is accepted by the network, it is linked to the previously accepted block (in a chain of data), and all transactions within that block go through. Blocks confirm the exact time and sequence of transactions, and are linked securely together to prevent any block from being altered or a new block from being inserted between two existing blocks, hence the tamper-evident and immutable nature of a ‘blockchain.’
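To make the “linked securely together” idea concrete, here is a minimal sketch in Python. It is an illustration only: the block layout, the function names, and the use of SHA-256 are my own simplifying assumptions, not how Ethereum actually encodes or hashes blocks.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's full contents, including the previous block's hash."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def make_block(transactions: list, prev_hash: str) -> dict:
    """Hypothetical block layout: a list of transactions plus a link backwards."""
    return {"transactions": transactions, "prev_hash": prev_hash}


def chain_is_valid(chain: list) -> bool:
    """Every block must reference the hash of the block right before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


genesis = make_block(["alice pays bob 1 token"], prev_hash="0" * 64)
second = make_block(["bob pays carol 1 token"], prev_hash=block_hash(genesis))
third = make_block(["carol pays dave 1 token"], prev_hash=block_hash(second))
chain = [genesis, second, third]

print(chain_is_valid(chain))  # True

# Altering an earlier block changes its hash, so the next block's link breaks.
genesis["transactions"][0] = "alice pays mallory 1 token"
print(chain_is_valid(chain))  # False
```

Because each block stores the hash of its predecessor, changing any old transaction would require recomputing every block after it, which is what makes tampering evident.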
Things get a bit more complicated when we start exploring the approval process for blocks. Each block is put together (or mined) by “miners” on the Ethereum network, who are paid a certain amount of “gas” by those transacting. Think of gas as a transaction fee, or the cost of computing a transaction. For purposes of this article, I will not get into the details of why gas is currently required on the Ethereum network, but for those curious, the references at the end of this piece include an easy-to-understand overview.
The size of each block, which ultimately determines how many transactions can be processed at once, is capped by the network’s block gas limit. As of writing, that limit is 30 million units of gas. In other words, a block is only valid if the total gas consumed by all the transactions in it does not exceed 30 million units. This means that when the network is busy, not all pending transactions can end up in the next block (since each transaction requires gas, and including all of them would exceed the limit), so the network becomes congested and backed up.
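A rough sketch of how a gas limit leaves transactions waiting is shown below. The 30 million figure comes from the paragraph above; the transaction labels, their gas amounts, and the simple first-come packing rule are made up for illustration (real block producers typically also weigh the fees offered).

```python
BLOCK_GAS_LIMIT = 30_000_000  # limit quoted in the article


def fill_block(pending, gas_limit=BLOCK_GAS_LIMIT):
    """Add each pending transaction that still fits under the gas limit."""
    included, left_over, gas_used = [], [], 0
    for label, gas in pending:
        if gas_used + gas <= gas_limit:
            included.append(label)
            gas_used += gas
        else:
            left_over.append(label)  # must wait for a later block
    return included, left_over, gas_used


# Hypothetical pending transactions: (label, gas required).
pending = [
    ("payment A", 21_000),
    ("contract deployment", 12_000_000),
    ("large batch B", 15_000_000),
    ("large batch C", 9_000_000),
    ("payment D", 21_000),
]

included, left_over, gas_used = fill_block(pending)
print(included)   # ['payment A', 'contract deployment', 'large batch B', 'payment D']
print(left_over)  # ['large batch C']
print(gas_used)   # 27042000
```

Here “large batch C” does not fit in the remaining space, so it stays pending. When many users are in that position at once, the result is the congestion described above.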
This congestion, in a nutshell, is an example of how the current Ethereum network struggles to scale. Ethereum 2.0, of which “the Merge” is a part, aims to enhance the speed, efficiency, and scalability of the network so that it can avoid bottlenecks and process more transactions simultaneously. Whereas the current network can process about 30 transactions per second, Ethereum 2.0 promises up to 100,000 transactions per second. See the next section on the Merge and Ethereum 2.0 for how this improvement is possible.
The Merge / Ethereum 2.0
The Ethereum 2.0 upgrade is a multi-phased upgrade to the Ethereum blockchain that seeks to improve the network’s scalability and security through several changes to the network’s infrastructure — most notably, through a switch from a proof-of-work (PoW) consensus mechanism to a proof-of-stake (PoS) model. More details concerning the distinction between models will come shortly.
First, it is important to note that the Merge is not the only event driving the Ethereum 2.0 upgrade. Rather, the upgrade is composed of several parts, of which the Merge is one (albeit a significant one). See below for an outline of the three primary components: (i) the Beacon Chain upgrade, (ii) the Merge, and (iii) sharding. Technically there are more parts involved as well, but these are relatively smaller upgrades meant to ensure the network runs smoothly, and will be implemented in parallel with the others.
Each of these components is a distinct part of the process of transitioning the Ethereum network to a PoS model. The Beacon Chain launched first on December 1, 2020 and introduced proof-of-stake to the Ethereum ecosystem. This chain exists as a separate blockchain and does not currently process transactions on Ethereum Mainnet (short for “main network”). However, soon the Beacon Chain will merge with Mainnet and become the consensus engine for all Ethereum network data, including execution layer transactions. This official switch to using the Beacon Chain (proof-of-stake) as the engine of block production on Ethereum is what the Merge represents. See the above graphic for a good visual representation here.
“Let’s consider an analogy. Imagine Ethereum is a spaceship that isn’t quite ready for an interstellar voyage. With the Beacon Chain, the community has built a new engine and a hardened hull. After significant testing, it’s almost time to hot-swap the new engine for the old mid-flight. This will merge the new, more efficient engine into the existing ship, ready to put in some serious lightyears and take on the universe.” — Ethereum.org analogy on the Merge, where the new engine is the Beacon Chain (proof-of-stake), the old engine is proof-of-work, and the spaceship is Ethereum
The third primary component of the Ethereum 2.0 upgrade is shard chains, which will play a key role in scaling the Ethereum network. A common misconception is that the Ethereum network will scale and become much more efficient immediately after the Merge. However, it is not until the introduction of sharding, which is expected to be implemented sometime in 2023, that many of these scaling effects kick in.
Sharding essentially splits Ethereum’s entire network into smaller pieces, known as “shards,” in an effort to increase the network’s scalability. Instead of settling all operations on a single blockchain, shard chains spread these operations horizontally across 64 new mini-blockchains. By dividing the load, each validator will no longer be required to process every transaction across the network. This should both reduce network congestion and increase transactions per second on the network when compared to the existing proof-of-work model.
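As a toy illustration of “dividing the load,” the sketch below spreads a set of accounts across 64 shards. The routing rule (hashing an account name and taking the remainder) and the account names are assumptions of mine for the sake of the example; the article does not describe how Ethereum would actually map activity to shards.

```python
import hashlib
from collections import Counter

SHARD_COUNT = 64  # number of shard chains mentioned in the article


def shard_for(account: str, shard_count: int = SHARD_COUNT) -> int:
    """Hypothetical routing rule: hash the account name, then take a remainder."""
    digest = hashlib.sha256(account.encode()).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# 6,400 made-up accounts, each sending one transaction.
accounts = [f"account-{i}" for i in range(6400)]
load = Counter(shard_for(account) for account in accounts)

print(sum(load.values()))  # 6400 transactions in total
print(max(load.values()))  # the busiest shard handles only a small share of them
```

The point is simply that no single shard has to see all 6,400 transactions; each one handles a slice of the total.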
Proof-of-Work (PoW) v. Proof-of-Stake (PoS)
To date, the Ethereum network has been secured by proof-of-work. This model was introduced earlier and, among other features, includes miners for purposes of validating blocks. However, after the Merge, mining (proof-of-work) will no longer be the means of producing valid blocks on the Ethereum network. Instead, the Beacon Chain’s proof-of-stake validators assume this role and will be responsible for processing the validity of all transactions and proposing blocks moving forward.
How is a PoS model different from a PoW model? In short, instead of having miners validate blocks, owners of ether stake it into a smart contract on Ethereum. It is these stakers (also called validators) that are then responsible for checking that new blocks propagated over the network are valid.
Why make the shift to a PoS model? There are a few reasons. Chief among them is that the model is more secure for sharding, which as noted above will play a key role in scaling the Ethereum network.
As you may recall, sharding breaks a network down into pieces. Although this structure may increase scalability, it also makes any particular shard within a network more susceptible to being attacked (see 51% attacks for more context). Attacking any particular shard (which is like a mini-blockchain) within a network requires accumulating only a fraction of the hash rate otherwise needed to control an entire blockchain that has not implemented sharding. Furthermore, with PoW, miners can choose specifically which shards to attack (more technically, they can choose which shards to contribute their hash power to). As a result, PoW miners can potentially collude and concentrate on a single shard in an effort to take control. If colluding miners came to make up the majority of block producers in a shard, they could then manipulate it and risk destroying the network. With PoS, much of this risk is mitigated by not allowing validators to choose which shards they work on. Instead, validators are randomly assigned to individual shards, which decreases the chances of collusion among stakers.
“It’s really important to mention that validators are super-frequently reshuffled between shards (possibly even once per block), so it’s actually quite hard to “target” one specific shard for an attack. This is a large part of where sharding’s at least theoretical success in breaking the trilemma comes from.” — Vitalik Buterin in 2018 on how proof-of-stake helps mitigate the security vulnerability that comes with sharding
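The random assignment and frequent reshuffling described above can be sketched as follows. This is only an illustration: the validator names, the committee layout, and the use of Python’s `random` module are my assumptions, and the protocol’s actual source of randomness is not covered in this article.

```python
import random


def assign_validators(validators, shard_count, rng):
    """Shuffle the validators, then deal them out to shards like a deck of cards."""
    shuffled = list(validators)
    rng.shuffle(shuffled)
    committees = {shard: [] for shard in range(shard_count)}
    for position, validator in enumerate(shuffled):
        committees[position % shard_count].append(validator)
    return committees


validators = [f"validator-{i}" for i in range(640)]  # made-up validator set
rng = random.Random(2018)  # seeded only so the sketch is reproducible

first_round = assign_validators(validators, 64, rng)
second_round = assign_validators(validators, 64, rng)  # a later reshuffle

print(len(first_round[0]))          # 10 validators on shard 0
print(first_round == second_round)  # False
```

Since validators do not pick their shard and the assignment changes from round to round, a group of colluding validators cannot simply agree to gather on one shard.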
A proof-of-stake model is also intended to be less energy-intensive, more decentralized, and more secure than a proof-of-work consensus mechanism.
- Energy Consumption: PoS effectively substitutes staking for computational power. There is no longer a need to spend large amounts of energy on proof-of-work mining computations, which leave a much larger energy footprint.
- Decentralization: A PoS model generally requires more computers and participants (i.e., nodes) across the network to review and approve transactions, making it more decentralized than a PoW model.
- Security: In addition to PoS better protecting against 51% attacks with sharding (explained earlier), the model also provides greater disincentives against validator attacks. This is largely because validators who secure the PoS network must stake significant amounts of ETH (32 ETH as of writing) into the protocol. If these validators then try to attack the network or behave dishonestly, the protocol can automatically destroy their staked ETH.
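The security incentive in the last bullet can be sketched in a few lines. The 32 ETH figure comes from the text above; the `Validator` class, the function names, and the `fraction` parameter are hypothetical, and the real protocol’s penalty rules are more detailed than “destroy everything.”

```python
from dataclasses import dataclass

MIN_STAKE_ETH = 32.0  # stake requirement quoted in the article


@dataclass
class Validator:
    name: str
    stake_eth: float
    active: bool = False


def activate(validator: Validator) -> None:
    """A validator may only join after locking up the minimum stake."""
    if validator.stake_eth < MIN_STAKE_ETH:
        raise ValueError(f"{validator.name} must stake at least {MIN_STAKE_ETH} ETH")
    validator.active = True


def slash(validator: Validator, fraction: float = 1.0) -> float:
    """Destroy a share of the stake (hypothetical fraction) and eject the validator."""
    penalty = validator.stake_eth * fraction
    validator.stake_eth -= penalty
    validator.active = False
    return penalty


attacker = Validator("attacker", stake_eth=32.0)
activate(attacker)
print(attacker.active)     # True

penalty = slash(attacker)  # the validator is caught behaving dishonestly
print(penalty)             # 32.0
print(attacker.stake_eth)  # 0.0
print(attacker.active)     # False
```

A would-be attacker has to put their own money on the line first, and stands to lose it if caught.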
Ethereum 2.0 has lofty goals when it comes to scalability. However, we are still some time away from these upgrades taking full effect. Are there ways for the Ethereum network to scale in the meantime? Enter Layer 2 solutions, which will be discussed next.
An Introduction to Layer 2 Solutions
The main goal of scaling a blockchain is to increase transaction speed (faster finality) and transaction throughput (more transactions per second) without sacrificing decentralization or security. At a high level, it is helpful to think about scaling in two ways: on-chain scaling and off-chain scaling.
On-Chain Scaling (“Layer 1s”)
On-chain scaling relates to changes or upgrades to a Layer 1 network. In the case of Ethereum, this would refer to changes to Ethereum Mainnet, which is where transactions are processed and finalized on the network. There are numerous other Layer 1s out there as well, including Solana, Cardano, and Avalanche, with each having its own distinct characteristics and purpose.
Making improvements to a Layer 1 can be an extremely complex process and generally requires significant work. Due to technological constraints, certain changes or upgrades even verge on the impossible. Take Ethereum’s upgrade to proof-of-stake as an example, which has taken years of development and has yet to be completed.
Given these complexities, off-chain scaling has emerged as an alternative way to scale a network.
Off-Chain Scaling (“Layer 2s”)
In contrast to on-chain scaling, scaling that occurs off-chain does not require any changes to a Layer 1 network. Rather, these solutions are built on top of or alongside an existing Layer 1, like Ethereum, and serve as a place where transactions and processes can occur independently of the main network. By shifting the bulk of a Layer 1’s processing burden to these off-chain networks, the main network can then focus on other matters such as security without sacrificing scalability. For simplicity, the term “Layer 2” will be used to refer to all off-chain scaling solutions moving forward in this article.
However, be forewarned, as there are various types of Layer 2s that exist, and it gets complicated quickly. To understand the nuances of each solution is a challenge that is recommended only for the most avid blockchain devotees out there, and will not be discussed in detail here.
For example, in the Ethereum context, a defining feature of Layer 2s is how these solutions derive their security. Some Layer 2s, like optimistic rollups, zero-knowledge rollups, and state channels, derive their security directly from Ethereum Mainnet (i.e., the Layer 1). In this category, rollups in particular have emerged as the dominant scaling solution, with nearly $2 billion currently secured by Arbitrum and Optimism combined (both of which are optimistic rollups).
There are also what are known as sidechains, validiums, and plasma chains. These Layer 2 solutions involve the creation of new chains that derive their security separately from Ethereum Mainnet. Polygon is an example of a popular Layer 2 in this category. As of writing, Polygon has nearly $2 billion in total value locked up and supports popular Web3 games and applications such as Zed Run and Quickswap. It varies, but Layer 2s can reduce Ethereum gas fees by 10–100x and process thousands of transactions per second (in contrast to the current 15–45 transactions per second processed on Ethereum’s base layer).
Although complicated, having a variety of Layer 2 solutions helps reduce the overall congestion on any one part of the Ethereum network, and also prevents single points of failure. Layer 2s can also be optimized for different use cases (DeFi, NFTs, gaming, etc.), so the ecosystem as a whole can benefit from a diversity of design. For a deeper dive into the various Layer 2 solutions, I recommend reading the Layer 2 piece from Ethereum.org listed in the references.
Can Ethereum 2.0 and Layer 2 Solutions Co-Exist?
Let’s take a moment to reflect on the purpose of the Ethereum 2.0 upgrade and existing Layer 2 solutions built on the Ethereum network. At their core, each of these is focused on trying to solve the same issue: scalability.
Layer 2 solutions were built first primarily to solve for the bottlenecks occurring on Ethereum’s base layer. However, once Ethereum 2.0 is in motion, what purpose will Layer 2s serve if the base layer they were built to scale is now exponentially more efficient itself? Will Layer 2s become obsolete as a result?
To answer this question, it is helpful to reflect on Ethereum’s vision, which is to bring Ethereum into the mainstream and serve all of humanity. This is a grand vision and one that requires scaling for a future that contemplates much more than 4% of the total population owning crypto. Even if the Ethereum 2.0 upgrade is able to significantly expand the network’s throughput, it is still questionable whether it would be able to support a future world with billions of potential daily crypto users without becoming congested.
The relationship between Ethereum 2.0 and its existing Layer 2 solutions is therefore likely to be symbiotic. This implies a world with different scaling solutions working in harmony, allowing for an exponential effect on future transaction speeds and throughput. It is these combined effects, and not one or the other, that Ethereum is planning for in order to achieve its vision.
“[I]t’s not ‘rollups instead of sharding’, it’s ‘rollups on top of sharding’. That said, rollups are already here or coming soon even before sharding, and rollups without sharding still offer that 100x increase in throughput. So get on a rollup today!” — An excerpt from a Twitter thread by Vitalik Buterin as it concerns Ethereum scaling
The dramatic increase from ~15–45 transactions per second (TPS) to 100,000 TPS therefore contemplates Layer 2 solutions in addition to sharding, as these are multiplicative effects. For example, if Layer 2s offer a ~100x increase in throughput, and sharding offers a ~64x increase, Layer 2s on top of sharding offer a ~6400x increase in throughput. These compounded effects are what may allow the network to scale and meet much higher demand in the future.
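The arithmetic behind that multiplicative effect is easy to check. The sketch below uses only the figures quoted in this article (a ~100x gain from Layer 2s, a ~64x gain from sharding, and a base of roughly 15–45 TPS); the variable names are mine, and the multipliers are rough estimates rather than guarantees.

```python
BASE_TPS_RANGE = (15, 45)  # current base-layer throughput quoted above
LAYER2_MULTIPLIER = 100    # approximate gain attributed to Layer 2s
SHARDING_MULTIPLIER = 64   # approximate gain attributed to sharding

# The gains compound because Layer 2s sit on top of the sharded base layer.
combined = LAYER2_MULTIPLIER * SHARDING_MULTIPLIER
low, high = (tps * combined for tps in BASE_TPS_RANGE)

print(combined)   # 6400
print(low, high)  # 96000 288000
```

Applied to the current 15–45 TPS, a ~6400x increase lands at roughly 96,000 to 288,000 TPS, which is consistent with the 100,000 TPS figure.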
Others within the Ethereum community have expressed similar thoughts concerning the future relationship between Ethereum 2.0 and existing Layer 2 solutions.
“As Ethereum L1 becomes more efficient, L2’s will simply become that much more efficient right alongside, all while maintaining their current added benefits.” — Alan Chiu, CEO/founder of Boba Network (a Layer 2 optimistic rollup)
“[E]ven after the merge, to really get to mainstream adoption, we will need as many scaling solutions as possible.” — Ahmed Al-Balaghi, co-founder of Biconomy (a multichain relayer protocol)
In the end, the Ethereum 2.0 upgrade will undoubtedly have an effect on the perception and overall utility of Layer 2 solutions moving forward. However, Ethereum is contemplating and scaling for a future that has significantly more demand than what exists today. If this vision plays out, there is a strong case for Layer 2 scaling solutions to continue to play a major role in the ecosystem long-term.
References
- The Merge (Ethereum.org)
- The Ethereum Vision (Ethereum.org)
- Proof-of-Stake (Ethereum.org)
- Scaling (Ethereum.org)
- The Beacon Chain (Ethereum.org)
- Layer 2 (Ethereum.org)
- What Are Ethereum Rollups? A Scaling Solution to Cut Transaction Costs
- What Are Crypto Gas Wars?
- Ethereum: Gas and How it Works (Explained in Simple Terms)
- Ethereum’s New 1 MB Block Size Limit Does More Harm Than Good
- Ethereum Gas Limit Explained
- What Is Ethereum 2.0? Ethereum’s Consensus Layer and Merge Explained
- Ethereum 2.0 Goes Live With Launch of Beacon Chain
- Ethereum Foundation Rebrands ETH 2.0 to Consensus Layer
- Even with Ethereum 2.0 Underway, L2 Scaling is Still Key to DeFi’s Future
- Will Ethereum Layer-2 Chains Survive After the Merge?
- Why Sharding is Great: Demystifying the Technical Properties
- Ethereum Sharding Explained
- Understanding Ethereum Sharding — A Simple Explanation (Vitalik Buterin on Reddit)
- Understanding DeFi: Layer 2 Explained
- What Are Layer 2s?
- Polygon (MATIC): The Swiss Army Knife of Ethereum Scaling