Monero monitoring toolchain
This toolchain was developed as part of my second Master’s thesis, the results of which can be found in the PDF folder in this repository.
Due to pending peer review, only the poster was published at first (update: now available, see links below).
The toolchain consists of various pgsql scripts to monitor the Monero blockchain.
It is mostly useful if you plan to create a blackball database (more on that below) or want to calculate metrics on Monero transactions using (PG)SQL.
If you only want a blackball database, follow this link.
Monero transactions use other inputs to hide the real inputs spent in transactions. Over the years several researchers have found that some transaction outputs are definitely spent and therefore can be removed from rings, reducing the anonymity set of the affected transaction.
For this purpose, the Monero community developed a tool which has been distributed with Monero since v0.12.
Using this tool, one can prevent known spent outputs from being sampled as mixins for newly created transactions.
More in a related question on StackExchange
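To make the effect on the anonymity set concrete, here is a minimal Python sketch. It is not the blackball tool itself: the function names, the output ids and the toy data are made up for illustration. It shows how known spent outputs shrink a ring, and how a wallet could skip such outputs when picking mixins.

```python
import random


def effective_ring(ring, known_spent):
    """Return the ring members that are not known to be spent elsewhere."""
    return set(ring) - set(known_spent)


def pick_mixins(candidates, blackball, count, rng):
    """Sample `count` mixins while skipping every blackballed output."""
    usable = sorted(set(candidates) - set(blackball))
    return rng.sample(usable, count)


# Toy data: all ids are hypothetical.
ring = {"o1", "o2", "o3", "o4"}
blackball = {"o2", "o3"}

reduced = effective_ring(ring, blackball)
print("ring size", len(ring), "-> effective ring size", len(reduced))

rng = random.Random(0)  # fixed seed only to keep the demo repeatable
mixins = pick_mixins(["o1", "o2", "o3", "o4", "o5", "o6"], blackball, 3, rng)
print("mixins without blackballed outputs:", sorted(mixins))
```

A ring of four members with two known spent outputs offers no more protection than a ring of two, which is why avoiding those outputs as mixins matters.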
Basically there are three steps to it:
For this step, you first need to sync the Monero blockchain. Then head to the tx_exporter folder, where you’ll find a few binaries and a few bash scripts.
The binaries are compiled for Ubuntu 16.04. If you have another OS or don’t trust me, compile them yourself (see further below).
Also note that the scripts do not work as is: you have to provide the correct path to your Monero (or MoneroV/Monero Original) blockchain (its lmdb folder, to be exact).
After you’ve done this, you call the export script as follows:
# Insert correct values for <*> here:
./ <StartBlock> <EndBlock>
# E.g.
./ 0 9999999
This creates a folder in ./data/ that contains two .csv files with the transaction data exported from the raw Monero blockchain, starting from block
If you would like to apply cross-chain analysis, you also need to sync the XMO and XMV blockchains (and/or other forks) and provide the correct paths in the respective export scripts.
These scripts do not take arguments: the forks are assumed to contain far less data, so they are always exported from the start (fork height) to the most recent block.
In paths.sql you have to provide the paths to where things are located on your PC (they must be absolute paths, as postgres prefers those over relative paths).
After you’ve done this, you can start the whole procedure by calling psql -f 0_init.sql. This could take some time, as it calls various scripts in the data_insertion[/init] folder (table creation etc.), inserts the data and normalizes it.
If you already have a database in place that you would like to extend with a new export (e.g. you have blocks 0-1000 in the DB and your new export, covering blocks 1000-2000, is in ./data/1000-2000), set the variables in paths.sql accordingly and run 0_update.sql instead of 0_init.sql.
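If you prefer to drive this step from a script, the choice between the two entry points could be wrapped as follows. This is only a sketch: the helper names are hypothetical, and it assumes that psql is on your PATH, that the connection settings come from your environment, and that paths.sql has already been edited.

```python
import subprocess


def build_psql_command(first_import):
    """Pick 0_init.sql for a fresh database, 0_update.sql to extend one."""
    script = "0_init.sql" if first_import else "0_update.sql"
    return ["psql", "-f", script]


def run_import(first_import, dry_run=True):
    """Print the command (dry run) or actually hand it to psql."""
    cmd = build_psql_command(first_import)
    if dry_run:
        print(" ".join(cmd))
    else:
        # check=True raises CalledProcessError if psql exits with an error.
        subprocess.run(cmd, check=True)
    return cmd


# Dry run for extending an existing database with a new export.
run_import(first_import=False)
```

Keeping the dry run as the default avoids starting a long import by accident.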
After this step the database is more or less a copy of the blockchain, which you usually want to analyse now.
For this
Assuming that you want to find out which transaction outputs you should avoid as mixins, you have to determine which outputs are spent where.
For this purpose, several methods exist, see e.g. Möser et al., 2017 and Kumar et al., 2017.
Some of them and a few additional ones are implemented here.
Use them as follows:
Head to ./sql/matching/zero_mixin_removal and run ./Matching-Algorithm.ps1. This is a PowerShell script; if someone wants to translate it to a bash/Python script or implement it in SQL, feel free to open a pull request.
What it does is the following:
instead of removing it)psql -f <File>
to log.txt
and reads the last few lines after each step to check for UPDATE 0
or similar. I wasn’t aiming for a Turing award with this (though I would still accept it, if you’re offering). If you want to analyse MoneroV and Monero Original, scripts are already provided.
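Coming back to the zero mixin removal loop described above: the general idea, as known from the literature cited earlier, can be sketched in a few lines of Python. This is an in-memory illustration on toy data, not a translation of Matching-Algorithm.ps1. The function name and data layout are assumptions, and it simply drops ring members from a copy, whereas the SQL scripts may mark them instead.

```python
def zero_mixin_removal(rings):
    """Iteratively deduce real spends from rings with one candidate left.

    rings maps an input id to the set of output ids in its ring.
    Returns (spent, remaining, rounds), where spent maps each deduced
    output id to the input id that spends it.
    """
    remaining = {inp: set(members) for inp, members in rings.items()}
    spent = {}
    rounds = 0
    while True:
        # An input with a single candidate must spend exactly that output.
        newly = {}
        for inp, members in remaining.items():
            if len(members) == 1:
                (out,) = members
                if out not in spent:
                    newly[out] = inp
        if not newly:
            break  # nothing changed, comparable to seeing UPDATE 0
        spent.update(newly)
        rounds += 1
        # Outputs spent elsewhere cannot be the real input of other rings.
        for members in remaining.values():
            if len(members) > 1:
                members -= set(spent)
    return spent, remaining, rounds


# Toy data: all ids are hypothetical.
rings = {
    "in1": {"o1"},
    "in2": {"o1", "o2"},
    "in3": {"o1", "o2", "o3"},
    "in4": {"o3", "o4", "o5"},
}
spent, remaining, rounds = zero_mixin_removal(rings)
print(rounds, "rounds,", len(spent), "spends deduced")
for inp in sorted(remaining):
    print(inp, sorted(remaining[inp]))
```

Each round can turn further rings into single-candidate rings, which is why the procedure has to be repeated until a round deduces nothing new.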
If you are interested in other forks, the scripts can be easily adapted.
For this purpose you should have the <fork>_data folders in ./data, each with the inputs/outputs CSV file, where <fork> is the abbreviation of the fork, e.g. xmv or xmo.
Then head to the ./sql/matching/fork_analysis folder and check whether the file defs_<fork>.sql already exists. If not, copy one of the existing defs files and follow the guidelines in its first few lines on how to adapt it for your fork of choice.
Open 0_run_fork_analysis.sql and add the fork height of your fork, following the pattern of the existing entries, e.g. WHEN lower($1) = 'xmo' THEN height := 1546000;
Then run the script. For this, you have to provide the correct defs_<fork>.sql file as an argument, i.e.:
# For <fork>
psql -f 0_run_fork_analysis.sql -v currency=defs_<fork>.sql
# Concrete example:
psql -f 0_run_fork_analysis.sql -v currency=defs_xmv.sql
Wait until it’s done. Afterwards, you can run the Zero Mixin Removal algorithm again and see whether some new inputs can be deduced.
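The idea behind this kind of fork analysis can be sketched as follows. An output that existed before the fork and is spent on both chains produces the same key image on both, but usually with different rings, so the real spend has to lie in the intersection of those rings. This is a generic illustration of that idea, not necessarily what the SQL scripts do in detail; the function names, the data layout and the ids are hypothetical.

```python
def intersect_rings(main_chain, fork_chain, pre_fork_outputs):
    """Intersect rings that share a key image on both chains.

    main_chain and fork_chain map a key image to the set of output ids
    in its ring. Only outputs created before the fork exist on both
    chains, so the result is restricted to pre_fork_outputs.
    """
    shared = main_chain.keys() & fork_chain.keys()
    return {
        key_image: main_chain[key_image] & fork_chain[key_image] & pre_fork_outputs
        for key_image in shared
    }


def deduced_spends(intersections):
    """Keep only key images whose intersection has exactly one candidate."""
    return {
        key_image: next(iter(members))
        for key_image, members in intersections.items()
        if len(members) == 1
    }


# Toy data: all ids are hypothetical.
main_chain = {"ki1": {"o1", "o2", "o3"}, "ki2": {"o4", "o5"}}
fork_chain = {"ki1": {"o3", "o7", "o8"}, "ki3": {"o1", "o2"}}
pre_fork_outputs = {"o1", "o2", "o3", "o4", "o5"}

intersections = intersect_rings(main_chain, fork_chain, pre_fork_outputs)
print(deduced_spends(intersections))
```

Every spend deduced this way is a newly known spent output, which is what makes another round of zero mixin removal worthwhile.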
I would not recommend using this heuristic. If you want to use it, figure out how to do it.
(It will most likely lead to false positives.)
You can run any queries that interest you on the database. I won’t even judge your rusty SQL skills.
In the ./sql/queries folder you can find a few queries which I found interesting at some point. You can look there for some inspiration.
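As an illustration of the kind of metric you might compute, the sketch below builds a ring size histogram. Everything in it is an assumption made for the example: the table ring_member and its columns are hypothetical and do not reflect the actual schema, and sqlite3 only stands in for PostgreSQL so that the snippet runs without a database server.

```python
import sqlite3

# In-memory stand-in database with a hypothetical table and toy rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ring_member (input_id INTEGER, output_id INTEGER);
    INSERT INTO ring_member VALUES
        (1, 10), (1, 11), (1, 12),
        (2, 10),
        (3, 13), (3, 14);
    """
)


def ring_size_histogram(connection):
    """Count how many inputs exist for each ring size."""
    query = """
        SELECT ring_size, COUNT(*) AS inputs
        FROM (
            SELECT input_id, COUNT(*) AS ring_size
            FROM ring_member
            GROUP BY input_id
        ) AS per_input
        GROUP BY ring_size
        ORDER BY ring_size
    """
    return connection.execute(query).fetchall()


for ring_size, inputs in ring_size_histogram(conn):
    print("ring size", ring_size, ":", inputs, "input(s)")
```

The query uses only standard SQL, so the same pattern should carry over to psql once the table and column names are replaced with the real ones.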
The toolchain consists of several more or less independent parts, organized in the following folders:
Contains some binaries of transactions-export (compiled for Ubuntu 16.04 using WSL) and a few bash scripts that use the binaries to export blockchain data.
Additional details are provided in the
in the folder.
If you don’t trust me (and why should you?) you can compile these binaries yourself. A guide on how to achieve this is provided in the folder.
The raw blockchain data is first converted to csv files which are put in this directory.
The SQL scripts expect data at this location.
You can look at what is going on here, though I recommend not changing much.
Various SQL files for transforming the Monero data from CSV to some useful schema. The two most important ones are paths.sql and 0_init.sql.
Some subdirectories of interest:
: contains several queries that create tables containing some data of interest. Usually these query tables are also exported to CSV files in the ./csv/ directory. Some of these may be of interest to some people.
meta: Queries used for DB development, e.g. listing all indices or references.
algorithm: Here some methods used to find spent outputs are implemented.
If some scripts collect some data for later analysis, the resulting CSV files will be put in this folder.
Contains various files with statistics for various things of interest.
Some Python (3.6) scripts that generate plots or similar.
These are mostly not in the repo right now as I’ve changed my plot-generation-pipeline.
You can contact me or open an issue. If you find some bug/oversight before my thesis is finished you will not only help Science™, but you may even get a mention in my acknowledgements!
If you want to use this toolchain in scientific work, please cite the DOI provided in the header for the time being.