Cryptoassets such as cryptocurrencies and tokens are increasingly traded on centralized and decentralized exchanges.

Token Price Prediction Dataset: EthereumCurves

We provide Ethereum token networks in which each transaction records two participating Ethereum addresses and the token amount exchanged between them on a given date. The dataset covers all Ethereum data from 07/2015 to 05/2018, a total of 5.5 million blocks. On average, each token has a history of 297 days, with a minimum of 151 days and a maximum of 576 days. The goal is to predict the prices of 31 tokens by using signals from the token transaction networks.
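As a minimal sketch of how such transaction records could be grouped into daily network snapshots, the following assumes each record is a whitespace-separated line of the form `from_address to_address unix_timestamp amount`. That layout and the function name `daily_snapshots` are assumptions for illustration only; check the files in the repository for the actual format.

```python
from collections import defaultdict
from datetime import datetime, timezone


def daily_snapshots(lines):
    """Group transaction records into per-day weighted edge lists.

    Assumed (hypothetical) record layout: "from to unix_timestamp amount".
    Returns {date: {(from, to): total_amount}}.
    """
    snapshots = defaultdict(lambda: defaultdict(float))
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed records
        src, dst, ts, amount = parts
        # Bucket by UTC calendar day so results do not depend on local time.
        day = datetime.fromtimestamp(int(ts), tz=timezone.utc).date()
        snapshots[day][(src, dst)] += float(amount)
    return {day: dict(edges) for day, edges in snapshots.items()}


# Toy records, invented for illustration.
records = [
    "0xa 0xb 1514764800 10.0",  # 2018-01-01 00:00:00 UTC
    "0xa 0xb 1514768400 2.5",   # same day, same edge
    "0xb 0xc 1514851200 1.0",   # 2018-01-02 00:00:00 UTC
]
snaps = daily_snapshots(records)
for day in sorted(snaps):
    print(day, snaps[day])
```

Repeated transfers between the same pair of addresses on the same day are summed into one weighted, directed edge. Keeping them as parallel edges instead would be an equally reasonable choice, depending on the model.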

                
Data Set Characteristics: graph files
Task: Price prediction. Given a token network as it evolves over time, predict the price of the associated token.

Prediction target attribute: price

Data start date: 07/2015
Data end date: 05/2018
Number of tokens: 31
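One possible way to frame the task as supervised learning is sketched below: each day's token network is reduced to a few summary features and paired with the token price some days later. The feature choice (node count, edge count, total transferred amount), the `horizon` parameter, and both function names are assumptions for illustration, not part of the dataset.

```python
def network_features(edges):
    """Summarize one daily snapshot given as {(src, dst): amount}."""
    nodes = {addr for pair in edges for addr in pair}
    return [len(nodes), len(edges), sum(edges.values())]


def make_supervised_pairs(snapshots, prices, horizon=1):
    """Pair each day's network features with the price `horizon` days later.

    snapshots: list of daily edge dicts, in chronological order.
    prices: list of daily prices aligned day by day with `snapshots`.
    """
    if len(snapshots) != len(prices):
        raise ValueError("snapshots and prices must be aligned day by day")
    X, y = [], []
    for t in range(len(snapshots) - horizon):
        X.append(network_features(snapshots[t]))
        y.append(prices[t + horizon])
    return X, y


# Toy data, invented for illustration only.
snapshots = [
    {("a", "b"): 5.0},
    {("a", "b"): 1.0, ("b", "c"): 2.0},
    {("c", "a"): 4.0},
]
prices = [1.00, 1.10, 1.05]
X, y = make_supervised_pairs(snapshots, prices, horizon=1)
print(X)
print(y)
```

The last `horizon` days have no future price to pair with, so they are dropped from the training pairs.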

Full Dataset 

All network data files (2 GB) are available on GitHub.

Cite Our Dataset:

@inproceedings{chartalistNeurips2022,
  author    = {Kiarash Shamsi and Yulia R. Gel and Murat Kantarcioglu and Cuneyt G. Akcora},
  title     = {Chartalist: Labeled Graph Datasets for UTXO and Account-based Blockchains},
  booktitle = {Advances in Neural Information Processing Systems 35: Annual Conference
               on Neural Information Processing Systems 2022, NeurIPS 2022, November 29-December
               1, 2022, New Orleans, LA, USA},
  pages     = {1--14},
  year      = {2022},
  url       = {https://openreview.net/pdf?id=10iA3OowAV3}
}

Baseline (Persistent homology)

This baseline applies persistent homology and functional data depth to the analysis of Ethereum crypto-tokens.
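To give a feel for what persistent homology measures on a graph, here is a minimal sketch of its simplest case: 0-dimensional persistence, which tracks how connected components merge as edges are added in order of increasing weight. This is not the reference paper's pipeline. The use of edge weights as filtration values and the function names are assumptions for illustration.

```python
def zero_dim_persistence(num_nodes, weighted_edges):
    """Death times of connected components in an edge-weight filtration.

    All nodes are born at filtration value 0. Edges enter in order of
    increasing weight; each edge that merges two components kills one of
    them at that weight. Components that never merge persist forever.
    Returns (sorted death times, number of surviving components).
    """
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    deaths = []
    for u, v, w in sorted(weighted_edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two distinct components
            parent[ru] = rv
            deaths.append(w)
    return deaths, num_nodes - len(deaths)


def betti0(num_nodes, deaths, threshold):
    """Number of connected components once all edges <= threshold are added."""
    return num_nodes - sum(1 for d in deaths if d <= threshold)


# Toy weighted graph, invented for illustration: (node, node, weight).
edges = [(0, 1, 0.2), (1, 2, 0.5), (0, 2, 0.9), (2, 3, 1.5)]
deaths, survivors = zero_dim_persistence(4, edges)
print(deaths, survivors)
print(betti0(4, deaths, 0.6))
```

The edge with weight 0.9 closes a cycle rather than merging components, so it contributes no death time. Evaluating `betti0` over a grid of thresholds gives a curve that summarizes the network at that point in time.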

Baseline Reference Paper

@inproceedings{li2020dissecting,
  title        = {Dissecting ethereum blockchain analytics: What we learn from topology and geometry of the ethereum graph?},
  author       = {Li, Yitao and Islambekov, Umar and Akcora, Cuneyt and Smirnova, Ekaterina and Gel, Yulia R and Kantarcioglu, Murat},
  booktitle    = {Proceedings of the 2020 SIAM International Conference on Data Mining},
  pages        = {523--531},
  year         = {2020},
  organization = {SIAM}
}

Baseline (GNN TAMP-S2GCNets)

The TAMP-S2GCNets model has been shown to yield highly competitive forecasting performance on a wide range of datasets at much lower computational cost. The following table reports its forecasting performance (MAPE, in %) on Ethereum token networks.

Baseline

Baseline Reference Paper

Baseline (GNN Z-GCNETs)

Z-GCNETs introduces a new time-aware zigzag topological layer for time-conditioned GCNs. The idea builds on zigzag persistence, whose utility remains unexplored not only in conjunction with time-aware GCNs but in deep learning (DL) in general. The Z-GCNETs layer tracks the salient time-aware topological characteristics of the data that persist over time. Results on spatio-temporal graph-structured data indicate that integrating the zigzag topological layer into GCNs improves both forecasting performance and robustness. The following are the forecasting results (MAPE) on Ethereum token networks.

Baseline

Baseline Reference Paper
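Both GNN baselines above report forecasting error as MAPE (mean absolute percentage error). For reference, a minimal sketch of the metric follows. The function name and the choice to reject zero-valued actual prices are assumptions, and the numbers are toy values, not results from either paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    MAPE = 100 / n * sum(|actual_i - predicted_i| / |actual_i|)
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Toy prices, invented for illustration.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
score = mape(actual, predicted)
print(round(score, 2))
```

Because each error is divided by the actual price, MAPE is comparable across tokens whose prices differ by orders of magnitude, but it becomes unstable when actual prices are close to zero.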