Attacknet: Chaos engineering on Ethereum

March 18, 2024 · 4 min read

pk910

barnabasbusa

parithosh

#mainnet #tool #deneb #dencun #cancun #chaos

Introduction

Chaos testing is a disciplined approach to testing a system by proactively simulating and identifying failures. Ethereum networks in the wild are subject to many real-world variations that have historically been difficult to capture in local tests. We've worked with Trail of Bits to create a tool that brings the two worlds together, allowing us to create local networks that simulate real-world chaos and failures. Examples include adding network latency between nodes, killing nodes at random, and returning filesystem errors.

Such a tool has two major components:

1. A tool to deploy Ethereum networks
2. A tool to orchestrate the chaos tests

We have already solved 1. with Kurtosis, which allows us to reliably deploy Ethereum networks (irrespective of which fork we're testing) for local Docker-based tests or remote Kubernetes-based tests. To solve 2., we used a Kubernetes testing framework called Chaos Mesh, which allows us to configure a host of failures that are then executed by the orchestrator. While these pieces of software give us the functionality we need, they don't provide ease of use on their own.

Introducing Attacknet. The missing piece was software that can ingest a test definition and execute it for us. This allows us to configure lists of cases we would like to see and have the execution of the tests taken care of for us. The software can also perform health checks and attempt to ascertain whether a network was negatively affected by the chaos.
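To make the health-check idea concrete, here is a minimal Python sketch of one possible recovery check. It is not Attacknet's actual implementation: the `NodeStatus` fields, the `network_recovered` function, and the two-slot tolerance are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class NodeStatus:
    """Hypothetical snapshot of one consensus node."""
    name: str
    head_slot: int
    finalized_epoch: int


def network_recovered(
    before: Sequence[NodeStatus],
    after: Sequence[NodeStatus],
    max_head_lag_slots: int = 2,  # assumed tolerance, not an Attacknet default
) -> bool:
    """Return True if finality advanced on every node and heads are close together."""
    baseline = {status.name: status.finalized_epoch for status in before}
    # An empty or mismatched node set cannot be judged healthy.
    if not after or any(status.name not in baseline for status in after):
        return False

    finality_advanced = all(
        status.finalized_epoch > baseline[status.name] for status in after
    )
    heads = [status.head_slot for status in after]
    heads_close = max(heads) - min(heads) <= max_head_lag_slots
    return finality_advanced and heads_close
```

A check like this would take one snapshot before the fault is injected and another after it has been removed and the network has had time to settle. How long to wait, and which signals to compare, is a design choice that the sketch leaves open.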

Chaos types

Attacknet allows us to configure the following types of chaos for any duration we like. Some also allow additional randomness (e.g., applying the fault only x% of the time):

- Network:
  - Network latency
  - Jitter
  - Packet drop
  - Packet corruption
  - Bandwidth limits
- Network split and re-joining
- Clock skew (i.e., NTP faults)
- CPU load
- RAM load
- Input/Output:
  - Disk faults
  - Disk access latency
  - I/O mistakes
- Service crashes (i.e., kills)
- WIP: Kernel faults
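The "only x% of the time" behaviour can be illustrated with a small Python simulation. This is only a sketch of the idea, not how Attacknet or Chaos Mesh apply faults internally: the function name `count_affected` and the per-event random draw are assumptions.

```python
import random


def count_affected(events: int, percent: float, seed: int = 0) -> int:
    """Simulate a fault that hits each event independently with the given probability.

    `percent` is the share of events (0 to 100) that should be affected.
    The seed makes the simulation repeatable.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    rng = random.Random(seed)
    threshold = percent / 100
    # rng.random() is in [0, 1), so 0% never fires and 100% always fires.
    return sum(1 for _ in range(events) if rng.random() < threshold)
```

For example, `count_affected(1000, 10)` simulates a 10% packet drop over 1000 packets and returns how many were dropped for that seed.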

Test Scenarios

Currently there are three factors we are varying: the client combination in the network, the type of chaos, and the parameters of the chaos. Here are some initial tests we planned:

| Test Number | Description | Chaos Type | Parameters | Analysis Objective | Notes |
| --- | --- | --- | --- | --- | --- |
| 1 | Single EL type, one of each CL. | Network latency | Latency: 100ms, 500ms, 1000ms, 2000ms with 10% jitter | Analyze the number of missed slots and correlate with latency | Repeat with every EL |
| 2 | Single EL type, one of each CL. | Network packet drop | Packet drop: 5%, 10%, 50%, 100% | Analyze the number of missed slots and correlate with packet drop | Repeat with every EL |
| 3 | Single EL type, one of each CL. | Clock skew | Clock skew: 100ms, 500ms, 1000ms, 2000ms | Analyze the number of missed slots and correlate with clock skew | Repeat with every EL (EL shouldn’t matter in this test) |
| 4 | Single EL type, one of each CL. Apply partition to one CL at a time. | Network partition | N/A | Analyze if the CL can rejoin the network after the split is healed | Potentially use static peers to avoid downscoring issues. Restart of pods might be necessary. Repeat for every EL. |
| 5 | Single EL type, one of each CL. Split network into two halves. | Network partition | N/A | Analyze if the network can self-heal | Use static peers to avoid downscoring issues. Repeat for every EL. |
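The parameter sweeps in tests 1 to 3, repeated for every EL, form a small test matrix. The Python sketch below enumerates it. The client names in `EXECUTION_CLIENTS` are illustrative assumptions (the table does not list which ELs were used), and `ChaosCase` and `build_matrix` are hypothetical names, not part of Attacknet.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Optional

# Assumed, illustrative list of execution clients.
EXECUTION_CLIENTS = ("geth", "nethermind", "besu")

# Parameter values taken from tests 1 to 3 in the table above.
SWEEPS = {
    "network-latency": (100, 500, 1000, 2000),  # milliseconds
    "packet-drop": (5, 10, 50, 100),            # percent
    "clock-skew": (100, 500, 1000, 2000),       # milliseconds
}

JITTER_PERCENT = 10  # test 1 applies 10% jitter to each latency value


@dataclass(frozen=True)
class ChaosCase:
    execution_client: str
    chaos_type: str
    value: int
    jitter_ms: Optional[float] = None


def build_matrix(clients=EXECUTION_CLIENTS, sweeps=SWEEPS) -> List[ChaosCase]:
    """Enumerate one case per (EL client, chaos type, parameter value)."""
    cases = []
    for client, (chaos_type, values) in product(clients, sweeps.items()):
        for value in values:
            # Only the latency test carries jitter.
            jitter = (
                value * JITTER_PERCENT / 100
                if chaos_type == "network-latency"
                else None
            )
            cases.append(ChaosCase(client, chaos_type, value, jitter))
    return cases
```

With three clients and twelve parameter values this yields 36 cases, which shows how quickly the matrix grows as more clients or fault parameters are added.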

We're trying to be wary of adding too many variables into the mix. We don't yet have good ways of knowing when a run "fails", so we're erring on the side of caution and building networks with more controlled changesets.
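One simple way to approach the "correlate missed slots with the fault parameter" objective from the tests above is a Pearson correlation coefficient. The sketch below is an assumption about how such an analysis could be done, not a description of the team's actual tooling. The `pearson` helper is written out by hand so that it only depends on the standard library.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between fault severity (xs) and missed slots (ys)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least two points")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)

    # A constant series has no defined correlation.
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(spread_x * spread_y)
```

Here `xs` would be the injected values (for example the latency steps) and `ys` the missed-slot counts observed in each run. A coefficient near 1 would suggest that missed slots rise with the fault severity, although with only four parameter values per test the result should be read cautiously.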

Conclusion

Attacknet allowed us to simulate many edge-case test scenarios for the Dencun fork. We're going to continue using the tool for research into PeerDAS as well as for hardening Ethereum clients. The team's next focus will be to set up automated testing with Attacknet, so that we can run it without manual oversight.

Find more information about Attacknet here: https://github.com/crytic/attacknet
