We've since released a new tool called contributoor that makes it easier to contribute to the Xatu dataset. Check out the post for more information.
We're excited to announce that we are opening up the Xatu data collection pipeline to the Ethereum community! This initiative enables community members to contribute valuable data to the Xatu dataset.
As discussions about a potential increase in the maximum blob count continue, we hope to shed light on the perspective of Ethereum's most crucial participants: home stakers.
Summary:
- Privacy focused. Multiple privacy levels let contributors disclose only the data they're comfortable with
- Initially restricted to known community members
- Data is published daily by the EthPandaOps team
Data Collection
Overview
Data is collected by running a beacon node and the xatu sentry sidecar. The data is then sent to a pipeline that we run, which further anonymizes and redacts it.
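The event names listed in the next section mirror the topics of the standard Beacon API event stream (`GET /eth/v1/events`). As a rough illustration of what a sidecar has to do, and not a description of how xatu sentry is actually implemented, the following Python sketch builds a subscription URL and parses server-sent event frames. The helper names `events_url` and `parse_sse` are hypothetical, the comma-separated `topics` form is an assumption about what the beacon node accepts, and the sample frame reuses values from the block example.

```python
import json
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

# Beacon API topics matching the five events collected.
TOPICS = ["head", "block", "blob_sidecar", "chain_reorg", "finalized_checkpoint"]


def events_url(base_url: str, topics: Iterable[str] = TOPICS) -> str:
    """Build the event-stream URL for the given topics."""
    return f"{base_url.rstrip('/')}/eth/v1/events?topics={','.join(topics)}"


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Dict]]:
    """Yield (topic, payload) pairs from server-sent event lines.

    A frame is an `event:` line plus one or more `data:` lines,
    terminated by a blank line.
    """
    topic: Optional[str] = None
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            topic = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "":
            # Blank line ends the frame; emit it only if it is complete.
            if topic is not None and data:
                yield topic, json.loads("\n".join(data))
            topic, data = None, []


sample = [
    "event: block\n",
    'data: {"slot": "10098966", "block": '
    '"0x7bb7f9e703896d516a0ee56d273dbe8fd71fd994a2f36cc489b8e1b825d74d44"}\n',
    "\n",
]
for name, payload in parse_sse(sample):
    print(name, payload["slot"])  # block 10098966
```

In a real deployment the lines would come from a streaming HTTP response rather than a list, but the framing logic stays the same.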
Events Collected
The following events will be collected:
- beacon_api_eth_v1_events_head
  - When the beacon node has a new head block

    data:
      block: "0x43d85cfa70181f60971dbc59d60c0e82e2ff8aea995bc942dc9c27bb16a055ca"
      current_duty_dependent_root: "0xc59a164bf477f138363db57e34f5b0e561c8bb1d30a0526f195b5575b2137513"
      previous_duty_dependent_root: "0xbdbad239bcde3aa281edb7067a86ddba41f7f0a2e55b7ca61d628e57b6f1695f"
      slot: "10098904"
      state: "0xbcf7bbd9f5da8b88d09e3876834e93945edd98a258091339caedad2ec6764576"
    event:
      date_time: "2024-10-04T03:01:13.245589039Z"
      id: "b6b13f23-6412-4e74-aa62-8639fc2fa04e"
      name: "BEACON_API_ETH_V1_EVENTS_HEAD_V2"
    additional_data:
      epoch:
        number: "315590"
        start_date_time: "2024-10-04T02:56:23Z"
      propagation:
        slot_start_diff: "2245"
      slot:
        start_date_time: "2024-10-04T03:01:11Z"
- beacon_api_eth_v1_events_block
  - When the beacon node has a new block

    data:
      block: "0x7bb7f9e703896d516a0ee56d273dbe8fd71fd994a2f36cc489b8e1b825d74d44"
      slot: "10098966"
    event:
      date_time: "2024-10-04T03:13:37.703055591Z"
      id: "58ccd540-81c2-44ce-820d-e73b5af0bea7"
      name: "BEACON_API_ETH_V1_EVENTS_BLOCK_V2"
    additional_data:
      epoch:
        number: "315592"
        start_date_time: "2024-10-04T03:09:11Z"
      propagation:
        slot_start_diff: "2703"
      slot:
        number: "10098966"
        start_date_time: "2024-10-04T03:13:35Z"
- beacon_api_eth_v1_events_blob_sidecar
  - When the beacon node has received a blob sidecar that passes gossip validation

    data:
      block_root: '0xc78adbc7ce7ab828bed85fedc6429989b4f4451d41aac8dc0c40b9f57839a3d7'
      index: '0'
      kzg_commitment: '0xa8de65da8d07703217d6879c75165a36973ff3ddace933907e7d400662b90e575812bb1302bfd4bb24691a550a0dc02a'
      slot: '10099003'
      versioned_hash: '0x0196e5bc26c289ff58a37c75f72b6824507d67ab0e43577495d1ad7b74716601'
    event:
      date_time: '2024-10-04T03:21:00.752889196Z'
      id: adbf1ecb-4e52-404f-b3ba-6f83f6ffc4db
      name: BEACON_API_ETH_V1_EVENTS_BLOB_SIDECAR
    additional_data:
      epoch:
        number: '315593'
        start_date_time: '2024-10-04T03:15:35Z'
      propagation:
        slot_start_diff: '1752'
      slot:
        number: '10099003'
        start_date_time: '2024-10-04T03:20:59Z'
- beacon_api_eth_v1_events_chain_reorg
  - When the beacon node has reorganized its chain

    data:
      depth: '3'
      epoch: '83615'
      new_head_block: '0x4a99bc2dbb2c5640cf0798102588dcbc3c02d15989c7652bbcf4647e24a14881'
      new_head_state: '0x3e5af57c5c3bd8fa394c21edd8ac5b07378ef1e143ed18a9ff695090c970b23f'
      old_head_block: '0x28e85b3e33721ad20b86c671f35686c8c91b5a29c6fd0cb41698872048d1b8ed'
      old_head_state: '0x00f61794f1da3817bb8ae4591bbc0bc9cc0c72f4a422d5fdda5cd584ee147cd3'
      slot: '2675702'
    event:
      date_time: '2024-10-04T03:00:36.161478913Z'
      id: b0db9607-a862-4dd2-b7e6-4faf77e3a949
      name: BEACON_API_ETH_V1_EVENTS_CHAIN_REORG_V2
    additional_data:
      epoch:
        number: '83615'
        start_date_time: '2024-10-04T02:56:00Z'
      propagation:
        slot_start_diff: '12161'
      slot:
        start_date_time: '2024-10-04T03:00:24Z'
- beacon_api_eth_v1_events_finalized_checkpoint
  - When the beacon node's finalized checkpoint has been updated

    data:
      block: '0x418645de30f82a71b7470dfc9831602f750a3b8e14e507e112791d53b3d3842e'
      epoch: '188220'
      state: '0x195dcdf004596c7afd999c39ff6f718f5bb631f3c8838b445fe87ea8f4f6de52'
    event:
      date_time: '2024-10-04T03:00:47.506914227Z'
      id: 57e595a9-c79a-458c-be83-0d6dd58ee81c
      name: BEACON_API_ETH_V1_EVENTS_FINALIZED_CHECKPOINT_V2
    additional_data:
      epoch:
        number: '188220'
        start_date_time: '2024-10-04T02:48:00Z'
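The timing fields under additional_data can be reproduced from the slot number and the event timestamp. The Python sketch below shows one way to do that, assuming mainnet consensus parameters (genesis at 2020-12-01T12:00:23Z, 12-second slots, 32 slots per epoch). Those parameters are consistent with the head, block, and blob sidecar examples above; the chain_reorg and finalized_checkpoint examples appear to come from other networks with a different genesis time. The function names are hypothetical and this is not Xatu's own code.

```python
from datetime import datetime, timedelta, timezone

# Assumed mainnet parameters; other networks use different values.
GENESIS = datetime(2020, 12, 1, 12, 0, 23, tzinfo=timezone.utc)
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32


def parse_ts(ts: str) -> datetime:
    """Parse a UTC timestamp, truncating nanoseconds to microseconds."""
    base, _, frac = ts.rstrip("Z").partition(".")
    dt = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    return dt + timedelta(microseconds=int(frac[:6].ljust(6, "0")))


def slot_start(slot: int) -> datetime:
    """Wall-clock time at which the given slot begins."""
    return GENESIS + timedelta(seconds=slot * SECONDS_PER_SLOT)


def epoch_of(slot: int) -> int:
    """Epoch that contains the given slot."""
    return slot // SLOTS_PER_EPOCH


def slot_start_diff_ms(event_time: str, slot: int) -> int:
    """Milliseconds between the start of the slot and the event timestamp."""
    return (parse_ts(event_time) - slot_start(slot)) // timedelta(milliseconds=1)


slot = 10098904  # from the head event example
print(epoch_of(slot))                                              # 315590
print(slot_start(slot).isoformat())                                # 2024-10-04T03:01:11+00:00
print(slot_start_diff_ms("2024-10-04T03:01:13.245589039Z", slot))  # 2245
```

Read this way, slot_start_diff is how far into the slot the node observed the event, which is what makes it useful for comparing propagation across contributors.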
Metadata
The following additional metadata is sent with every event:
Client Metadata
clock_drift: '2' # Clock drift of the host machine
ethereum:
  consensus:
    implementation: lighthouse # Beacon node implementation
    version: Lighthouse/v5.3.0-d6ba8c3/x86_64-linux # Beacon node version
  network:
    id: '11155111' # Ethereum network ID
    name: sepolia # Ethereum network name
id: 98df53c0-3de0-477c-a7c9-4ea9b17981c3 # Session ID. Resets on restart
implementation: Xatu
module_name: SENTRY
name: b538bfd92sdv3 # Name of the sentry. Hash of the beacon node's node ID.
os: linux # Operating system of the host running sentry
version: v0.0.202-3645eb8 # Xatu version
Server Metadata
Once we receive the event, we do some additional processing to add the server metadata. The data added to the event is configurable per user, so users only disclose data they're comfortable with. Geolocation data is very useful for understanding how data propagates through the network, but is not required.
server:
  client:
    geo:
      # OPTIONAL FIELDS
      ## Data about ISP
      autonomous_system_number: 24940 # Autonomous system number of the client
      autonomous_system_organization: "Hetzner Online GmbH" # Organization associated with the autonomous system
      ## Data about location
      city: "Helsinki" # City where the client is located
      continent_code: "EU" # Continent code of the client's location
      country: "Finland" # Country where the client is located
      country_code: "FI" # Country code of the client's location
      ### ALWAYS REDACTED
      latitude: REDACTED # Latitude coordinate of the client's location
      longitude: REDACTED # Longitude coordinate of the client's location
    group: "asn-city" # Group the client belongs to
    user: "simplefrog47" # Pseudo username that sent the event
    # ALWAYS REDACTED
    ip: "REDACTED" # IP address of the client that sent the event
  event:
    received_date_time: "2024-10-04T03:00:48.533351629Z" # Timestamp when the event was received
Note:
- The client.name field is re-hashed with a salt that only the EthPandaOps team has access to. This means that the original name of the client is not disclosed, and there is no way to map events back to a specific node ID.
- The client.ip, client.geo.latitude, and client.geo.longitude fields are ALWAYS redacted.
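To illustrate why a secret salt prevents mapping a published name back to a node, here is a minimal Python sketch of salted re-hashing. The post does not say which hash function, salt format, or output length the pipeline uses, so HMAC-SHA256, the 12-character truncation, the salt values, and the function name `rehash_name` are all assumptions made for the example.

```python
import hashlib
import hmac


def rehash_name(name: str, salt: bytes, length: int = 12) -> str:
    """Return a salted, truncated digest of a sentry name.

    HMAC-SHA256 is an assumption; the real pipeline may use a
    different construction.
    """
    digest = hmac.new(salt, name.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


secret_salt = b"known-only-to-the-pipeline"  # hypothetical salt value
original = "b538bfd92sdv3"  # sentry name from the client metadata example

published = rehash_name(original, secret_salt)
print(published != original)                             # True
print(published == rehash_name(original, secret_salt))   # True: stable for the same node
print(published == rehash_name(original, b"other-salt")) # False: depends on the salt
```

Because the output is stable for a given name and salt, events from the same node can still be grouped together, while anyone without the salt cannot recompute the published value from a known node ID.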
Privacy groups
Privacy is a top priority for us. We have created privacy groups to allow users to only disclose data they're comfortable with.
No additional Geo/ASN data
autonomous_system_number: REDACTED # REDACTED
autonomous_system_organization: REDACTED # REDACTED
city: "REDACTED" # REDACTED
country: "REDACTED" # REDACTED
country_code: "REDACTED" # REDACTED
continent_code: "REDACTED"
With ASN data

City, country, and continent disclosed:
autonomous_system_number: 24940
autonomous_system_organization: "Hetzner Online GmbH"
city: "Helsinki"
continent_code: "EU"
country: "Finland"
country_code: "FI"

Country and continent disclosed:
autonomous_system_number: 24940
autonomous_system_organization: "Hetzner Online GmbH"
continent_code: "EU"
country: "Finland"
country_code: "FI"
city: "REDACTED" # REDACTED

Continent only:
autonomous_system_number: 24940
autonomous_system_organization: "Hetzner Online GmbH"
continent_code: "EU"
city: "REDACTED" # REDACTED
country: "REDACTED" # REDACTED
country_code: "REDACTED" # REDACTED
Without ASN data

City, country, and continent disclosed:
city: "Helsinki"
continent_code: "EU"
country: "Finland"
country_code: "FI"
autonomous_system_number: REDACTED # REDACTED
autonomous_system_organization: REDACTED # REDACTED

Country and continent disclosed:
continent_code: "EU"
country: "Finland"
country_code: "FI"
autonomous_system_number: REDACTED # REDACTED
autonomous_system_organization: REDACTED # REDACTED
city: "REDACTED" # REDACTED

Continent only:
continent_code: "EU"
autonomous_system_number: REDACTED # REDACTED
autonomous_system_organization: REDACTED # REDACTED
city: "REDACTED" # REDACTED
country: "REDACTED" # REDACTED
country_code: "REDACTED" # REDACTED
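The privacy groups above boil down to an allow-list of geo fields per group. A minimal Python sketch of that idea follows. The level names ("city", "country", "continent", "none"), the `with_asn` flag, and the function `redact_geo` are hypothetical and chosen only for illustration; the real pipeline's group identifiers and logic may differ.

```python
from typing import Dict

REDACTED = "REDACTED"
ASN_FIELDS = {"autonomous_system_number", "autonomous_system_organization"}

# Hypothetical level names; each maps to the location fields it discloses.
LOCATION_LEVELS = {
    "city": {"city", "country", "country_code", "continent_code"},
    "country": {"country", "country_code", "continent_code"},
    "continent": {"continent_code"},
    "none": set(),
}


def redact_geo(geo: Dict[str, object], level: str, with_asn: bool) -> Dict[str, object]:
    """Return a copy of `geo` with every non-disclosed field redacted.

    Latitude and longitude are in no allow-list, so they are always
    redacted regardless of the chosen level.
    """
    allowed = set(LOCATION_LEVELS[level])
    if with_asn:
        allowed |= ASN_FIELDS
    return {key: (value if key in allowed else REDACTED) for key, value in geo.items()}


geo = {
    "autonomous_system_number": 24940,
    "autonomous_system_organization": "Hetzner Online GmbH",
    "city": "Helsinki",
    "continent_code": "EU",
    "country": "Finland",
    "country_code": "FI",
    "latitude": 60.17,   # illustrative coordinates, never published
    "longitude": 24.94,
}

for key, value in redact_geo(geo, "country", with_asn=False).items():
    print(f"{key}: {value}")
```

Using an allow-list rather than a deny-list means any field added to the geo record later is redacted by default until a level explicitly discloses it.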
Get Started
If you're already running a beacon node, running xatu sentry is as simple as running a Docker container on your node. For example:
docker run -d --name xatu-sentry ethpandaops/xatu:latest \
--preset ethpandaops \
--beacon-node-url=http://localhost:5052 \
--output-authorization=REDACTED
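Before starting the container, it can help to confirm that the beacon node API is reachable at the URL passed to --beacon-node-url. The Python sketch below queries the standard Beacon API endpoint `/eth/v1/node/version` and splits the response into an implementation name and version string, similar to the values shown in the client metadata. The helper names are hypothetical, and deriving the implementation from the version prefix is an assumption, not necessarily how xatu sentry does it.

```python
import json
from typing import Tuple
from urllib.request import urlopen


def version_url(beacon_node_url: str) -> str:
    """URL of the Beacon API node version endpoint."""
    return beacon_node_url.rstrip("/") + "/eth/v1/node/version"


def parse_version(body: str) -> Tuple[str, str]:
    """Return (implementation, version) from a node version response body."""
    version = json.loads(body)["data"]["version"]
    return version.split("/", 1)[0].lower(), version


def check_beacon_node(beacon_node_url: str, timeout: float = 5.0) -> Tuple[str, str]:
    """Fetch and parse the node version; raises if the node is unreachable."""
    with urlopen(version_url(beacon_node_url), timeout=timeout) as resp:
        return parse_version(resp.read().decode("utf-8"))


# Offline example using the version string from the client metadata above.
sample_body = '{"data": {"version": "Lighthouse/v5.3.0-d6ba8c3/x86_64-linux"}}'
print(parse_version(sample_body)[0])  # lighthouse
```

Calling `check_beacon_node("http://localhost:5052")` on the host should return without raising if the node is up; if it fails, the sentry container is unlikely to be able to connect either.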
Contributing to the Xatu dataset is currently restricted to known community members. We plan to open this up to the public in the future, but for now we want to ensure that the data remains high quality and relevant to the home staker community (read: we need to make sure our pipeline can handle the increased load 😂).
If you'd like to contribute to the Xatu dataset, please apply for access here
Once you've been granted access, you'll receive instructions on how exactly to run xatu sentry and start contributing to the dataset. Thank you!
Wrapping Up
We believe that by opening up the Xatu data collection pipeline we can empower the community to gain valuable insights and drive meaningful improvements to the Ethereum network. If you have any questions or feedback, please reach out to us on Twitter or join the Xatu Telegram Group.
Love,
EthPandaOps Team ❤️