Experimental in-memory MVCC bitcoin database
Uses cmake. Requires libbitcoin-system.
mkdir build && cd build
PKG_CONFIG_PATH=path/to/libbitcoin-system/lib/pkgconfig/ cmake -DCMAKE_BUILD_TYPE=Release|Debug ../
make && make test
We need c++17 to enable alignas
for use in raw memory block pool. I
couldn’t make class alignas(1 << 20) raw_block
work on c++11 or
c++14. Using jemalloc didn’t help either.
libbitcoin-database uses mmap and that has lead to some problems in
case of a hard shutdown. Also, depending on the VM subsystem to keep
UTXOs in memory has run into problems. This is highlighted by our need
to build a utxo cache.
The design of this database is motivated by in-memory MVCC databases
like Microsoft Hackathon, MemSQL, HyPer and NuoDB.
Most users of libbitcoin first try it on their smaller machines with
less RAM than required to hold all the blockchain data and indexes in
memory.
This database periodically runs garbage collection to move stale data
like spent transaction outputs and blocks with no UTXOs left to spend,
out to the disk.
The RAM then holds the most relevant data possible, which further
takes away the need for a cache in front of this database. After all,
this is an in memory database.
The database management system will provide access to records using
Multi Version Concurrency Control. The ideal is to avoid depending on
mutexes for thread synchronization.
At a high level, the following design decisions motivate the rest of
the architecture. These key decisions are informed by the “the best
paper ever on in-memory multi version concurrency control”, Y. Wu, et
al., An Empirical Evaluation of In-Memory Multi-Version Concurrency
Control, in VLDB,
2017
First writer wins. If a new writer can’t update a record, then the
update operation returns an error. The blockchain layer will need to
retry the transaction. This might already be handled but needs to be
checked. The hunch is that the blockchain layer handles this
indirectly by depending on the sync protocol fetching the entire block
again.
This cost of retrying a block accept protocol seems expensive, but a
transaction rollback will happen very rarely and thus should be
amortised. Need to measure this once we have a working implementation.
Index points to master record which in turn points to a delta
record. For example, for block database, the delta record only has to
store the status field.
The master record always has the latest record, so index doesn’t need
to be updated on each record update.
Garbage collector will delete the stale delta versions that no running or
future transactions will need.
Periodically run garbage collector, or when a preset fraction of the
UTXO store is full.
Garbage collector will move spent UTXOs and blocks older than a preset
number of blocks to disk.
Depending on the memory available and configured to be used by the
various in memory tables, the database layer can be tweaked to work
for small laptops to big multi core servers. That is one of the goals.
Use libcuckoo for providing
a fast, concurrent hash table as the index manager.
All indexes are limited to hash tables, so there is no support for
sequential scans over keys. This can be changed later to Bw-Tree to
support sequential scans.