We’re excited to announce that we are opening up the Xatu data collection pipeline to the Ethereum community! This initiative enables community members to contribute valuable data to the Xatu dataset.
As discussions about a potential increase in the maximum blob count continue, we hope to shed light on the perspective of Ethereum’s most crucial participants: home stakers.
Summary:
- Privacy-focused: multiple privacy levels let contributors disclose only the data they’re comfortable with
- Initially restricted to known community members
- Data is published daily by the EthPandaOps team
Data Collection #
Overview #
Data is collected by running a beacon node alongside the xatu sentry sidecar. The sidecar sends the data to a pipeline that we run, which further anonymizes and redacts it.
Events Collected #
The following events will be collected:
- beacon_api_eth_v1_events_head
- When the beacon node has a new head block
- Example payload:

    data:
      block: "0x43d85cfa70181f60971dbc59d60c0e82e2ff8aea995bc942dc9c27bb16a055ca"
      current_duty_dependent_root: "0xc59a164bf477f138363db57e34f5b0e561c8bb1d30a0526f195b5575b2137513"
      previous_duty_dependent_root: "0xbdbad239bcde3aa281edb7067a86ddba41f7f0a2e55b7ca61d628e57b6f1695f"
      slot: "10098904"
      state: "0xbcf7bbd9f5da8b88d09e3876834e93945edd98a258091339caedad2ec6764576"
    event:
      date_time: "2024-10-04T03:01:13.245589039Z"
      id: "b6b13f23-6412-4e74-aa62-8639fc2fa04e"
      name: "BEACON_API_ETH_V1_EVENTS_HEAD_V2"
    additional_data:
      epoch:
        number: "315590"
        start_date_time: "2024-10-04T02:56:23Z"
      propagation:
        slot_start_diff: "2245"
      slot:
        start_date_time: "2024-10-04T03:01:11Z"
- beacon_api_eth_v1_events_block
- When the beacon node has a new block
- Example payload:

    data:
      block: "0x7bb7f9e703896d516a0ee56d273dbe8fd71fd994a2f36cc489b8e1b825d74d44"
      slot: "10098966"
    event:
      date_time: "2024-10-04T03:13:37.703055591Z"
      id: "58ccd540-81c2-44ce-820d-e73b5af0bea7"
      name: "BEACON_API_ETH_V1_EVENTS_BLOCK_V2"
    additional_data:
      epoch:
        number: "315592"
        start_date_time: "2024-10-04T03:09:11Z"
      propagation:
        slot_start_diff: "2703"
      slot:
        number: "10098966"
        start_date_time: "2024-10-04T03:13:35Z"
- beacon_api_eth_v1_events_blob_sidecar
- When the beacon node has received a blob sidecar that passes gossip validation
- Example payload:

    data:
      block_root: '0xc78adbc7ce7ab828bed85fedc6429989b4f4451d41aac8dc0c40b9f57839a3d7'
      index: '0'
      kzg_commitment: '0xa8de65da8d07703217d6879c75165a36973ff3ddace933907e7d400662b90e575812bb1302bfd4bb24691a550a0dc02a'
      slot: '10099003'
      versioned_hash: '0x0196e5bc26c289ff58a37c75f72b6824507d67ab0e43577495d1ad7b74716601'
    event:
      date_time: '2024-10-04T03:21:00.752889196Z'
      id: adbf1ecb-4e52-404f-b3ba-6f83f6ffc4db
      name: BEACON_API_ETH_V1_EVENTS_BLOB_SIDECAR
    additional_data:
      epoch:
        number: '315593'
        start_date_time: '2024-10-04T03:15:35Z'
      propagation:
        slot_start_diff: '1752'
      slot:
        number: '10099003'
        start_date_time: '2024-10-04T03:20:59Z'
- beacon_api_eth_v1_events_chain_reorg
- When the beacon node has reorganized its chain
- Example payload:

    data:
      depth: '3'
      epoch: '83615'
      new_head_block: '0x4a99bc2dbb2c5640cf0798102588dcbc3c02d15989c7652bbcf4647e24a14881'
      new_head_state: '0x3e5af57c5c3bd8fa394c21edd8ac5b07378ef1e143ed18a9ff695090c970b23f'
      old_head_block: '0x28e85b3e33721ad20b86c671f35686c8c91b5a29c6fd0cb41698872048d1b8ed'
      old_head_state: '0x00f61794f1da3817bb8ae4591bbc0bc9cc0c72f4a422d5fdda5cd584ee147cd3'
      slot: '2675702'
    event:
      date_time: '2024-10-04T03:00:36.161478913Z'
      id: b0db9607-a862-4dd2-b7e6-4faf77e3a949
      name: BEACON_API_ETH_V1_EVENTS_CHAIN_REORG_V2
    additional_data:
      epoch:
        number: '83615'
        start_date_time: '2024-10-04T02:56:00Z'
      propagation:
        slot_start_diff: '12161'
      slot:
        start_date_time: '2024-10-04T03:00:24Z'
- beacon_api_eth_v1_events_finalized_checkpoint
- When the beacon node’s finalized checkpoint has been updated
- Example payload:

    data:
      block: '0x418645de30f82a71b7470dfc9831602f750a3b8e14e507e112791d53b3d3842e'
      epoch: '188220'
      state: '0x195dcdf004596c7afd999c39ff6f718f5bb631f3c8838b445fe87ea8f4f6de52'
    event:
      date_time: '2024-10-04T03:00:47.506914227Z'
      id: 57e595a9-c79a-458c-be83-0d6dd58ee81c
      name: BEACON_API_ETH_V1_EVENTS_FINALIZED_CHECKPOINT_V2
    additional_data:
      epoch:
        number: '188220'
        start_date_time: '2024-10-04T02:48:00Z'
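The propagation.slot_start_diff values in the example payloads appear to be the gap, in milliseconds, between the event's date_time and the start of its slot. The Python sketch below assumes that interpretation and reproduces the figures from the head and blob sidecar examples. The helper names are illustrative and not part of Xatu. The epoch helper relies on Ethereum's 32 slots per epoch.

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 1_000_000_000
NANOS_PER_MILLI = 1_000_000
SLOTS_PER_EPOCH = 32


def to_unix_nanos(timestamp: str) -> int:
    """Convert a UTC timestamp such as 2024-10-04T03:01:13.245589039Z to nanoseconds."""
    whole, _, fraction = timestamp.rstrip("Z").partition(".")
    parsed = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    seconds = int(parsed.timestamp())
    # Pad or trim the fractional part to exactly nine digits (nanoseconds).
    nanos = int(fraction.ljust(9, "0")[:9]) if fraction else 0
    return seconds * NANOS_PER_SECOND + nanos


def slot_start_diff_ms(event_time: str, slot_start: str) -> int:
    """Whole milliseconds between the slot start and the moment the event was seen."""
    return (to_unix_nanos(event_time) - to_unix_nanos(slot_start)) // NANOS_PER_MILLI


def epoch_for_slot(slot: int) -> int:
    """Epoch number that contains the given slot."""
    return slot // SLOTS_PER_EPOCH


# Head example: seen 2.245 s into the slot.
print(slot_start_diff_ms("2024-10-04T03:01:13.245589039Z", "2024-10-04T03:01:11Z"))  # 2245
# Blob sidecar example: seen 1.752 s into the slot.
print(slot_start_diff_ms("2024-10-04T03:21:00.752889196Z", "2024-10-04T03:20:59Z"))  # 1752
# Block example: slot 10098966 falls in epoch 315592.
print(epoch_for_slot(10098966))  # 315592
```

Integer nanoseconds are used instead of floats so that the nine-digit fractional seconds in the payloads are not rounded before the subtraction.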
Metadata #
The following additional metadata is sent with every event:
Client Metadata #
clock_drift: '2' # Clock drift of the host machine
ethereum:
  consensus:
    implementation: lighthouse # Beacon node implementation
    version: Lighthouse/v5.3.0-d6ba8c3/x86_64-linux # Beacon node version
  network:
    id: '11155111' # Ethereum network ID
    name: sepolia # Ethereum network name
id: 98df53c0-3de0-477c-a7c9-4ea9b17981c3 # Session ID. Resets on restart
implementation: Xatu
module_name: SENTRY
name: b538bfd92sdv3 # Name of the sentry. Hash of the Beacon Node's node ID.
os: linux # Operating system of the host running sentry
version: v0.0.202-3645eb8 # Xatu version
Server Metadata #
Once we receive the event, we do some additional processing to add the server metadata. The data added to the event is configurable per user, so contributors only disclose what they’re comfortable with. Geolocation data is very useful for understanding how data propagates through the network, but it is not required.
server:
  client:
    geo:
      # OPTIONAL FIELDS
      ## Data about ISP
      autonomous_system_number: 24940 # Autonomous system number of the client
      autonomous_system_organization: "Hetzner Online GmbH" # Organization associated with the autonomous system
      ## Data about location
      city: "Helsinki" # City where the client is located
      continent_code: "EU" # Continent code of the client's location
      country: "Finland" # Country where the client is located
      country_code: "FI" # Country code of the client's location
      ### ALWAYS REDACTED
      latitude: REDACTED # Latitude coordinate of the client's location
      longitude: REDACTED # Longitude coordinate of the client's location
    group: "asn-city" # Group the client belongs to
    user: "simplefrog47" # Pseudo username that sent the event
    # ALWAYS REDACTED
    ip: "REDACTED" # IP address of the client that sent the event
  event:
    received_date_time: "2024-10-04T03:00:48.533351629Z" # Timestamp when the event was received
- The client.name field is re-hashed with a salt that only the EthPandaOps team has access to. This means that the original name of the client is not disclosed, and there is no way to map events back to a specific node ID.
- The client.ip, client.geo.latitude, and client.geo.longitude fields are ALWAYS redacted.
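As a rough illustration of these two rules, the Python sketch below re-hashes a client name with a secret salt and blanks the fields that are always redacted. The pipeline's real hash function, salt handling, and record layout are not described in this post, so SHA-256, the scrub_client helper, and the flattened client dictionary are all assumptions made for the example.

```python
import copy
import hashlib

REDACTED = "REDACTED"


def rehash_name(name: str, salt: bytes) -> str:
    """Hash the client name together with a secret salt (SHA-256 is an assumption)."""
    return hashlib.sha256(salt + name.encode("utf-8")).hexdigest()


def scrub_client(client: dict, salt: bytes) -> dict:
    """Return a copy with the name re-hashed and the always-redacted fields blanked."""
    scrubbed = copy.deepcopy(client)
    scrubbed["name"] = rehash_name(client["name"], salt)
    scrubbed["ip"] = REDACTED
    geo = scrubbed.setdefault("geo", {})
    geo["latitude"] = REDACTED
    geo["longitude"] = REDACTED
    return scrubbed


# Hypothetical input record; the IP is from a range reserved for documentation.
client = {
    "name": "b538bfd92sdv3",
    "ip": "203.0.113.7",
    "geo": {"city": "Helsinki", "latitude": 60.17, "longitude": 24.94},
}

scrubbed = scrub_client(client, salt=b"known-only-to-the-operator")
print(scrubbed["ip"], scrubbed["geo"]["latitude"], scrubbed["geo"]["longitude"])
```

Without the salt, someone who knows a node's original name cannot recompute the published value, which is the property the first rule relies on.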
Privacy groups #
Privacy is a top priority for us. We have created privacy groups to allow users to only disclose data they’re comfortable with.
No additional Geo/ASN data #
autonomous_system_number: REDACTED # REDACTED
autonomous_system_organization: REDACTED # REDACTED
city: "REDACTED" # REDACTED
country: "REDACTED" # REDACTED
country_code: "REDACTED" # REDACTED
continent_code: "REDACTED"
With ASN data #
Share geo location down to the city level
autonomous_system_number: 24940
autonomous_system_organization: "Hetzner Online GmbH"
city: "Helsinki"
continent_code: "EU"
country: "Finland"
country_code: "FI"
Share geo location down to the country level
autonomous_system_number: 24940
autonomous_system_organization: "Hetzner Online GmbH"
continent_code: "EU"
country: "Finland"
country_code: "FI"
city: "REDACTED" # REDACTED
Share geo location down to the continent level
autonomous_system_number: 24940
autonomous_system_organization: "Hetzner Online GmbH"
continent_code: "EU"
city: "REDACTED" # REDACTED
country: "REDACTED" # REDACTED
country_code: "REDACTED" # REDACTED
Share no geo location data
autonomous_system_number: 24940
autonomous_system_organization: "Hetzner Online GmbH"
continent_code: "REDACTED" # REDACTED
city: "REDACTED" # REDACTED
country: "REDACTED" # REDACTED
country_code: "REDACTED" # REDACTED
Without ASN data #
Share geo location down to the city level without ASN
city: "Helsinki"
continent_code: "EU"
country: "Finland"
country_code: "FI"
autonomous_system_number: REDACTED # REDACTED
autonomous_system_organization: REDACTED # REDACTED
Share geo location down to the country level without ASN
continent_code: "EU"
country: "Finland"
country_code: "FI"
autonomous_system_number: REDACTED # REDACTED
autonomous_system_organization: REDACTED # REDACTED
city: "REDACTED" # REDACTED
Share geo location down to the continent level without ASN
continent_code: "EU"
autonomous_system_number: REDACTED # REDACTED
autonomous_system_organization: REDACTED # REDACTED
city: "REDACTED" # REDACTED
country: "REDACTED" # REDACTED
country_code: "REDACTED" # REDACTED
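The privacy groups combine two independent choices: whether ASN data is shared, and how precise the shared location is. The Python sketch below models that. The function name, the level names, and the assumption that the "none" level redacts every location field are illustrative and not taken from the Xatu pipeline.

```python
REDACTED = "REDACTED"

ASN_FIELDS = ("autonomous_system_number", "autonomous_system_organization")

# Location fields left visible at each level; every level includes the coarser ones.
GEO_FIELDS_BY_LEVEL = {
    "city": ("city", "country", "country_code", "continent_code"),
    "country": ("country", "country_code", "continent_code"),
    "continent": ("continent_code",),
    "none": (),
}
ALL_FIELDS = ASN_FIELDS + GEO_FIELDS_BY_LEVEL["city"]


def apply_privacy_group(geo: dict, share_asn: bool, geo_level: str) -> dict:
    """Redact every field that the chosen privacy group does not share."""
    visible = set(GEO_FIELDS_BY_LEVEL[geo_level])
    if share_asn:
        visible.update(ASN_FIELDS)
    return {
        field: geo[field] if field in visible else REDACTED
        for field in ALL_FIELDS
    }


geo = {
    "autonomous_system_number": 24940,
    "autonomous_system_organization": "Hetzner Online GmbH",
    "city": "Helsinki",
    "continent_code": "EU",
    "country": "Finland",
    "country_code": "FI",
}

# Country level without ASN: the city and both ASN fields are redacted.
for field, value in apply_privacy_group(geo, share_asn=False, geo_level="country").items():
    print(f"{field}: {value}")
```

Keeping the ASN switch separate from the location level keeps the lookup table small while still covering both the "With ASN data" and "Without ASN data" variants.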
Get Started #
If you’re already running a beacon node, running xatu sentry is as simple as running a Docker container on your node. For example:
docker run -d --name xatu-sentry ethpandaops/xatu:latest \
--preset ethpandaops \
--beacon-node-url=http://localhost:5052 \
--output-authorization=REDACTED
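Before starting the container, it can help to confirm that the beacon node's HTTP API answers at the address passed to --beacon-node-url. The Python sketch below queries the standard Beacon API endpoint /eth/v1/node/version. The beacon_node_version helper and its fetch parameter are hypothetical names introduced for this example and are not part of Xatu.

```python
import json
from urllib.request import urlopen


def beacon_node_version(base_url: str, fetch=urlopen) -> str:
    """Return the version string reported by the beacon node's REST API.

    `fetch` defaults to urllib's urlopen and can be swapped out for testing.
    """
    url = f"{base_url.rstrip('/')}/eth/v1/node/version"
    with fetch(url, timeout=5) as response:
        payload = json.load(response)
    # The Beacon API wraps results in a top-level "data" object.
    return payload["data"]["version"]
```

Calling beacon_node_version("http://localhost:5052") against a running node should return a client version string similar to the one shown in the client metadata above. A connection error suggests the URL or port needs adjusting before the sentry is started.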
Contributing to the Xatu dataset is currently restricted to known community members. We plan to open this up to the public in the future, but for now we want to ensure that the data remains high quality and relevant to the home staker community (read: we need to make sure our pipeline can handle the increased load 😂).
If you’d like to contribute to the Xatu dataset, please apply for access here
Once you’ve been granted access, you’ll receive instructions on how exactly to run xatu sentry and start contributing to the dataset. Thank you!
Wrapping Up #
We believe that by opening up the Xatu data collection pipeline we can empower the community to gain valuable insights and drive meaningful improvements to the Ethereum network. If you have any questions or feedback, please reach out to us on Twitter or join the Xatu Telegram Group.
Love,
EthPandaOps Team ❤️