项目作者: dgawlik

项目描述 :
Exotic data structures in Java
高级语言: Java
项目地址: git://github.com/dgawlik/collections.git
创建时间: 2021-03-05T21:06:23Z
项目社区:https://github.com/dgawlik/collections

开源协议:Apache License 2.0

下载


collections

status
version
Quality Gate Status

Exotic data structures in Java!

photo

The project is collection of less known data structures. The focus has been made
on balancing performance with simplicity and ease
of understanding. I tried to document all major
parts in implementation.

Currently there are:

  • Skip list
  • Mutable BTree
  • Immutable Persistent Vector

Skip List

Skip lists in performance are much like binary trees,
but they offer better locality of data. They are frequently
used as substitute for binary trees when it comes to concurrency.
They are randomized datastructures.

Implementation is different from typical skip list as
each node has some number of level nodes and they point to
nodes some hops ahead. This way some space requirements has been
reduced. It is a multiset holding sorted values, implementing
java Collection interface and resembling closely interface of
TreeSet.

Operation Performance
Insert O(logN)
Delete O(logN)
Search O(logN)

BTree

BTrees are frequently used to implement functional
data structures because they strike good balance between
tree (which is immutable) and array (which is mutable).

They are also a good way to sort large arrays. You can
find benchmarks here.
What is really interesting that similar to TimSort they are
adaptive (make use of natural order in data) so inserting ordered
values is much faster than average.

Implementation is multiset implementing Collection java interface,
and resembling SortedSet interface. The branching factor is configurable.
I found that branches=64 is optimal for ~4 000 000 elements.

Further optimization is exponential search rather than plain binary
search for intra-node searches, which makes it more robust for partially
sorted data (for which leaves are hot on borders)

Operation Performance
Insert O(logN)
Delete O(logN)
Search O(logN)

Persistent Vector

Immutable data structure that is more performant than
CopyOnWriteArray. It’s very similar to BTree except that
data is not sorted. Constant operations in table are in fact
always limited by O(depth*branching factor).

Performance is depending on branching factor. For small
branching modifications are fast at expense of reads. For big
branching factors modifications are slow.

Operation Performance
Insert ~O(1)
Delete ~O(1)
Search O(N)
Index ~O(1)