The dataset hosted on this website contains the information our crawler gathered on the Bitcoin peer-to-peer network from February 7 to March 3, 2020, as well as on two additional days, June 7 and July 9, 2020. All IP addresses have been anonymized using a salted SHA256 hash, and the last byte of each address is set to 0 (the last two bytes for IPv6 addresses).
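The anonymization described above can be illustrated with a minimal sketch. It is not the crawler's actual code: the function name `anonymize_ip`, the order of operations (zeroing the trailing bytes first, then hashing), the way the salt is prepended to the packed address, and the hexadecimal output are all assumptions made for illustration.

```python
import hashlib
import ipaddress


def anonymize_ip(address: str, salt: bytes) -> str:
    """Zero the trailing byte(s) of an address, then hash it with a salt.

    Assumption: masking happens before hashing, and the salt is simply
    prepended to the packed address bytes. The real pipeline may differ.
    """
    ip = ipaddress.ip_address(address)
    raw = bytearray(ip.packed)
    # Last byte for IPv4, last two bytes for IPv6, as stated in the text.
    n_zero = 1 if ip.version == 4 else 2
    raw[-n_zero:] = bytes(n_zero)
    return hashlib.sha256(salt + bytes(raw)).hexdigest()
```

Under these assumptions, two addresses that differ only in the zeroed bytes map to the same identifier, so an identifier in the dataset may stand for several hosts in the same block.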
The dataset is divided into two parts, contained in the crawl and export directories.
The crawl directory contains the information gathered periodically by the crawler. There are three types of files:
- "date.json" files contain the IPv4 nodes discovered by the crawler (format "ip, port, services, height")
- "nodes_per_getADDR_date.csv" files contain the number of nodes returned in response to a GETADDR request during a crawl (format "ip, number of nodes in response")
- "up_nodes_per_seconds_date.csv" files contain the number of new nodes discovered throughout a crawl
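As an illustration of the "ip, number of nodes in response" format, the following sketch reads such a CSV into a dictionary. The function name `read_getaddr_counts` is hypothetical, and the layout (comma-separated, two columns, possibly a header row) is an assumption; rows whose second column is not an integer are skipped.

```python
import csv
import io
from typing import Dict, TextIO


def read_getaddr_counts(handle: TextIO) -> Dict[str, int]:
    """Map each (anonymized) ip to the number of nodes in its response.

    Assumption: two comma-separated columns, "ip, count". A header row,
    if present, is skipped because its second column is not an integer.
    """
    counts: Dict[str, int] = {}
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # blank or incomplete line
        try:
            counts[row[0].strip()] = int(row[1])
        except ValueError:
            continue  # header or malformed row
    return counts


# Usage with made-up in-memory data; a real file would be opened with
# open(path, newline="") instead.
sample = io.StringIO("ip,count\nabc123,12\ndef456, 3\n")
counts = read_getaddr_counts(sample)
```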
There is a fourth type of file for the crawls performed on June 7 and July 9, 2020, named "nodes_per_getADDR_date.pickle.bz2". These are JSON files encoded in the pickle format and compressed with bzip2. They contain the information (ip, port, services) about the nodes gathered from the GETADDR requests during a crawl.
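A file with the ".pickle.bz2" extension can typically be opened with Python's standard bz2 and pickle modules, as in the sketch below. The function name `load_getaddr_nodes` is hypothetical, and the structure of the returned object is not specified here, so it should be inspected after loading. Since unpickling can execute arbitrary code, this should only be done with files obtained from a trusted source.

```python
import bz2
import pickle


def load_getaddr_nodes(path):
    """Decompress a .pickle.bz2 file and return the unpickled object.

    Assumption: the file holds a single pickled object. Only unpickle
    files from a source you trust.
    """
    with bz2.open(path, "rb") as handle:
        return pickle.load(handle)
```

A quick look at `type(obj)` and, for containers, a few of its elements is usually enough to see how the (ip, port, services) entries are organized.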
The export directory contains JSON files with information about the nodes tagged as reachable (public) by the crawler. The exported fields are: IP address, port, protocol version, user agent, connected since, services, height, hostname, city, country code, latitude, longitude, timezone, ASN, organization name.
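When working with these fields, it can help to attach names to them. The sketch below assumes, purely for illustration, that a node is stored as a flat list of 15 values in the order given above; the actual layout of the JSON files may differ, and both `EXPORT_FIELDS` and `label_record` are hypothetical names.

```python
# Field names in the order listed in the text (the key spellings are ours).
EXPORT_FIELDS = [
    "ip_address", "port", "protocol_version", "user_agent",
    "connected_since", "services", "height", "hostname", "city",
    "country_code", "latitude", "longitude", "timezone", "asn",
    "organization_name",
]


def label_record(record):
    """Turn a flat list of 15 values into a dict keyed by field name.

    Assumption: values appear in the same order as EXPORT_FIELDS.
    """
    if len(record) != len(EXPORT_FIELDS):
        raise ValueError(
            f"expected {len(EXPORT_FIELDS)} values, got {len(record)}"
        )
    return dict(zip(EXPORT_FIELDS, record))
```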
This dataset is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 830927.
For more information, you can contact the author of this work: