The solution to the Summer Of Bitcoin Challenge 2021
Bitcoin miners construct blocks by selecting a set of transactions from their mempool. Each transaction in the mempool:
The miner selects an ordered list of transactions which have a combined weight below the maximum block weight.
Transactions with parent transactions in the mempool may be included in the list, but only if all of their parents appear before them in the list.
Naturally, the miner would like to include the transactions that maximize the total fee.
Your task is to write a program which reads a file mempool.csv, with the format:<txid>
,<fee>
,<weight>
,<parent_txids>
txid
is the transaction identifierfee
is the transaction feeweight
is the transaction weightparent_txids
is a list of the txids of the transaction’s unconfirmed parent transactions (confirmed parent transactions are not included in this list). It is of<txid1>
;<txid2>
;…The OUTPUT from the program should be txids, separated by newlines, which make a valid block, maximizing the fee to the miner. Transactions MUST appear in order
(no transaction should appear before one of its parents).
The project requires Python 3.6
, pip3
and virtualenv
installed on the system.
virtualenv .venv --python=python3
source .venv/bin/activate
pip install -r requirements.txt
python ppo.py
. Creates Fees_optimized_PPO.txtpython dqn.py
. Creates Fees_optimized_DQN.txteach model produces a text file with optimized selection of hashes
If you want to test Environment you can run python test_env.py
There are two major components in the Project:
4,000,000
.Also the action the agent can take is either to select a transaction or not depending on the rewards condition.
Reinforcement Learning Algorithms: Here I employs an Proximal Policy Optimization (PPO) to navigate the environment searching for the best optimization at a given state. It is implemented in PPO.py. Also i have used Deep Q Learning (DQN) to create a comparitive study with the PPO algorithm. In my experiments PPO out performed DQN with a Good margin.
For more details on the inner-workings of the framework, see in this article on PPO and in this documentation on DQN
Before going to the results here are some important statistics to compare the solutions:
1456
with average weight of 2000
so if we do basic calculation then we can get 2000
transaction in a weight of 4,000,000
so if we have 2000
transactions with 1456
fee per transaction then total fee would be 2,912,000
3,444,175
fees with weight around 4,000,538
.3,096,998
fees with weight of 4,011,052
.532,175
as compared to 184,998
of DQN networkLastly I also created a baseline pure Dynamic Programming solution but the complexity of O(len(csv) * Max_weight) just couldn’t able to produce results in meaning fulltime(takes hrs) compared to other approaches which are fairly faster(complete in mins).