Cryptoassets such as cryptocurrencies and tokens are increasingly traded on centralized and decentralized exchanges.
Token Price Prediction Dataset: EthereumCurves
We provide Ethereum token networks in which each transaction records two participating Ethereum addresses, the token amount exchanged, and the date of the exchange. The dataset covers all Ethereum data from 07/2015 to 05/2018, a total of 5.5 million blocks. On average, each token has a history of 297 days, with a minimum of 151 days and a maximum of 576 days. The goal is to predict the prices of 31 tokens using signals from the token transaction networks.
Data Set Characteristics: graph files
Task: Price prediction - Given a token network observed over time, predict the price of the associated token.
Prediction target attribute: price
Data start date: 07/2015
Data end date: 05/2018
Number of tokens: 31
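As a rough illustration of how such transaction records might be turned into per-day network signals, the sketch below groups transactions by date and computes a few simple graph statistics. The record layout `(sender, receiver, day, amount)`, the function names, and the example addresses are assumptions made for this example; the actual file format of the dataset may differ.

```python
from collections import defaultdict
from datetime import date


def daily_snapshots(transactions):
    """Group (sender, receiver, day, amount) records into one edge list per day.

    The tuple layout is a hypothetical one chosen for this sketch.
    """
    snapshots = defaultdict(list)
    for sender, receiver, day, amount in transactions:
        snapshots[day].append((sender, receiver, amount))
    # Return the days in chronological order.
    return dict(sorted(snapshots.items()))


def snapshot_features(edges):
    """Compute simple per-day signals from a list of (sender, receiver, amount)."""
    nodes = {address for s, r, _ in edges for address in (s, r)}
    return {
        "num_nodes": len(nodes),
        "num_edges": len(edges),
        "volume": sum(amount for _, _, amount in edges),
    }


# Toy records with made-up addresses, for illustration only.
transactions = [
    ("0xaaa", "0xbbb", date(2017, 1, 1), 10.0),
    ("0xbbb", "0xccc", date(2017, 1, 1), 2.5),
    ("0xaaa", "0xccc", date(2017, 1, 2), 1.0),
]

features = {
    day: snapshot_features(edges)
    for day, edges in daily_snapshots(transactions).items()
}
```

Features of this kind could serve as inputs to a price model; the baselines listed on this page use richer topological and graph-neural representations.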
Full Dataset
All network data files (2 GB) are available on GitHub.
Cite Our Dataset:
@inproceedings{chartalistNeurips2022,
author = {Kiarash Shamsi and Yulia R. Gel and Murat Kantarcioglu and Cuneyt G. Akcora},
title = {Chartalist: Labeled Graph Datasets for UTXO and Account-based Blockchains},
booktitle = {Advances in Neural Information Processing Systems 36: Annual Conference
on Neural Information Processing Systems 2022, NeurIPS 2022, November 29-December
1, 2022, New Orleans, LA, USA},
pages = {1--14},
year = {2022},
url = {https://openreview.net/pdf?id=10iA3OowAV3}
}
Baseline (Persistent homology)
Persistent homology and functional data depth applied to the analysis of Ethereum crypto-tokens.
@inproceedings{li2020dissecting,
title={Dissecting ethereum blockchain analytics: What we learn from topology and geometry of the ethereum graph?},
author={Li, Yitao and Islambekov, Umar and Akcora, Cuneyt and Smirnova, Ekaterina and Gel, Yulia R and Kantarcioglu, Murat},
booktitle={Proceedings of the 2020 SIAM International Conference on Data Mining},
pages={523--531},
year={2020},
organization={SIAM}
}
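To give a feel for the kind of topological summary that persistent homology produces, the sketch below computes a Betti-0 curve: the number of connected components of a weighted graph as edges are added in order of increasing weight. This is a simplified illustration, not the method of the cited paper; the function name, the integer node labels, and the use of edge weights as the filtration value are assumptions made for this example.

```python
def betti0_curve(num_nodes, weighted_edges, thresholds):
    """Count connected components of the subgraph keeping edges with weight <= t.

    num_nodes: nodes are labelled 0 .. num_nodes - 1 (assumption of this sketch).
    weighted_edges: iterable of (u, v, weight) tuples.
    thresholds: filtration values at which to evaluate the curve.
    """
    edges = list(weighted_edges)
    curve = []
    for t in thresholds:
        # Fresh union-find structure for each threshold.
        parent = list(range(num_nodes))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        components = num_nodes
        for u, v, w in edges:
            if w <= t:
                root_u, root_v = find(u), find(v)
                if root_u != root_v:
                    parent[root_u] = root_v
                    components -= 1
        curve.append(components)
    return curve


# Toy path graph on four nodes with increasing edge weights.
toy_edges = [(0, 1, 0.2), (1, 2, 0.5), (2, 3, 0.9)]
curve = betti0_curve(4, toy_edges, [0.0, 0.3, 0.6, 1.0])
```

Components that merge only at large thresholds correspond to long-lived features in the 0-dimensional persistence diagram; dedicated libraries would be needed for higher-dimensional features such as loops.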
Baseline (GNN TAMP-S2GCNets)
The TAMP-S2GCNets model is shown to yield highly competitive forecasting performance on a wide range of datasets, with much lower computational costs. The following table reports the forecasting performance (MAPE in %) on Ethereum networks.
Baseline (GNN Z-GCNETs)
We have proposed a new time-aware zigzag topological layer (Z-GCNETs) for time-conditioned GCNs. Our idea is based on the concept of zigzag persistence, whose utility remains unexplored not only in conjunction with time-aware GCNs but in deep learning (DL) in general. The new Z-GCNETs layer allows us to track the salient time-aware topological characterizations of the data that persist over time. Our results on spatio-temporal graph-structured data indicate that integrating the new time-aware zigzag topological layer into GCNs yields both better forecasting performance and gains in robustness. The following are the forecasting results (MAPE) on Ethereum token networks.
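For reference, the MAPE (mean absolute percentage error) metric used for the GNN baselines can be computed as in the minimal sketch below. The function name and the toy price values are assumptions made for illustration; they are not results from the dataset.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up prices, for illustration only.
actual_prices = [100.0, 200.0, 50.0]
predicted_prices = [110.0, 180.0, 50.0]
error = mape(actual_prices, predicted_prices)
```

Because each error is divided by the actual value, MAPE is scale-free, which makes it convenient for comparing forecasts across tokens whose prices differ by orders of magnitude.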