Abstract:
Blockchain is more prominent in the finance sector than ever. Stablecoins build a bridge between traditional finance ecosystems and the blockchain ecosystem. Major payment processors adopt cryptocurrency solutions and integrate them into their systems. Blockchain transaction analysis is needed to enforce cryptocurrency regulations, trace fraudulent activities, and create business intelligence solutions. The transaction throughput of blockchains is expected to rise with the transition to the proof-of-stake (PoS) consensus mechanism, sharding, and the use of zero-knowledge proofs. New tooling is needed to handle the resulting massive transaction graphs. In this thesis, we propose a parallel system for constructing and analyzing blockchain transaction graphs. The system utilizes distributed data structures and graph algorithms and is implemented in C++ using the Message Passing Interface (MPI). The system constructs the transaction graph from blockchain data using our proposed parallel graph construction algorithm. The transaction graph is then analyzed using our distributed and parallel transaction trace and trace forest algorithms. In addition, we implemented PageRank, connected component, and degree distribution algorithms. To test our system, we collected 12 years of Bitcoin and 5 years of Ethereum blockchain transaction data, as well as some blacklisted blockchain addresses from various websites. The system is benchmarked on a 16-node high performance computing (HPC) cluster on Amazon Cloud. We report the timings obtained in our tests and analysis results such as the top 10 addresses by PageRank, the degree distribution of addresses, and trace visualizations. We were able to construct the transaction graphs for our Ethereum and Bitcoin transaction data on our cluster in less than 4 minutes and 32 minutes, respectively.
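To make the idea of a distributed degree distribution calculation concrete, the following is a minimal single-process sketch in C++. It is not the algorithm of this thesis: the Edge record, the FNV-1a hash partitioning of addresses, and the function names owner_rank, local_degrees, and degree_distribution are all assumptions made for illustration. Ranks are simulated by a vector of per-rank maps; in an MPI program the final merge would plausibly be done with a reduction across processes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified transfer record: one directed edge between addresses.
struct Edge {
    std::string from;
    std::string to;
};

// Deterministic FNV-1a hash so that every rank maps an address to the same owner.
std::uint64_t fnv1a(const std::string& s) {
    std::uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// Owner rank of an address under hash partitioning (an assumption of this sketch).
int owner_rank(const std::string& address, int num_ranks) {
    assert(num_ranks > 0);
    return static_cast<int>(fnv1a(address) % static_cast<std::uint64_t>(num_ranks));
}

// Simulated per-rank state: each rank counts the total degree (in + out)
// of the addresses it owns.
std::vector<std::map<std::string, std::size_t>>
local_degrees(const std::vector<Edge>& edges, int num_ranks) {
    assert(num_ranks > 0);
    std::vector<std::map<std::string, std::size_t>> per_rank(
        static_cast<std::size_t>(num_ranks));
    for (const Edge& e : edges) {
        ++per_rank[static_cast<std::size_t>(owner_rank(e.from, num_ranks))][e.from];
        ++per_rank[static_cast<std::size_t>(owner_rank(e.to, num_ranks))][e.to];
    }
    return per_rank;
}

// Merge the local counts into a global histogram: degree -> number of addresses.
std::map<std::size_t, std::size_t>
degree_distribution(const std::vector<std::map<std::string, std::size_t>>& per_rank) {
    std::map<std::size_t, std::size_t> histogram;
    for (const auto& local : per_rank) {
        for (const auto& entry : local) {
            ++histogram[entry.second];
        }
    }
    return histogram;
}
```

Because every address has exactly one owner, each address is counted once in the merged histogram regardless of how many ranks are used. For example, calling local_degrees on the three edges a to b, a to c, and a to d, and then degree_distribution on the result, yields three addresses of degree 1 and one address of degree 3.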