Project author: Qihoo360

Project description:
Pika is a NoSQL database compatible with Redis, developed by Qihoo's DBA and infrastructure team.
Language: C++
Repository: git://github.com/Qihoo360/pika.git
Created: 2014-11-03T16:36:53Z
Community: https://github.com/Qihoo360/pika

License: BSD 3-Clause "New" or "Revised" License


Introduction

PikiwiDB is a high-performance, large-capacity, multi-tenant, data-persistent, elastic KV storage system that uses RocksDB as its storage engine. It is fully compatible with the Redis protocol and supports Redis's commonly used data structures, such as string, hash, list, zset, set, geo, hyperloglog, pubsub, bitmap, and stream.

When Redis's in-memory usage exceeds 16 GiB, it faces problems such as limited memory capacity, single-threaded blocking, long startup recovery times, high memory hardware costs, easily saturated buffers, and high failover costs in a one-master, multi-replica setup. PikiwiDB is not meant to replace Redis but to complement it. It strives to fully comply with the Redis protocol, inherits Redis's convenient operation and maintenance design, and, by using persistent storage, removes the memory-capacity bottleneck Redis hits once the data volume becomes huge. PikiwiDB also supports master-slave mode via the slaveof command, with both full and incremental data synchronization.

PikiwiDB can be deployed in a single-machine master-slave mode (slaveof) or in a Codis cluster mode, allowing for simple scaling and shrinking. Migration from Redis to PikiwiDB can be performed smoothly with migration tools.
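Because PikiwiDB speaks the Redis wire protocol (RESP), any standard Redis client or tool can talk to it unchanged. As a minimal sketch of what that protocol looks like on the wire (illustrative code, not part of PikiwiDB):

```python
def encode_resp_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings, the framing
    any Redis-compatible server (PikiwiDB included) accepts."""
    out = [f"*{len(args)}\r\n".encode()]  # array header: element count
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))  # bulk string: length, then payload
    return b"".join(out)

# A SET command exactly as a Redis client would frame it:
print(encode_resp_command("SET", "key", "value"))
# -> b'*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n'
```

Since the framing is identical on both sides, migrating an application from Redis to PikiwiDB requires no client-side changes.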

PikiwiDB Features

  • Protocol Compatibility: Fully compatible with the Redis protocol, emphasizing high performance, large capacity, low cost, and scalability.
  • Data Structures: Supports Redis’s common data structures, including String, Hash, List, Zset, Set, Geo, Hyperloglog, Pubsub, Bitmap, Stream, ACL, etc.
  • Cold and Hot Data: Caches hot data and persistently stores the full data in RocksDB, implementing a hierarchical storage of cold and hot data.
  • High Capacity: Compared to Redis’s in-memory storage, PikiwiDB supports data volumes in the hundreds of gigabytes, significantly reducing server resource consumption and enhancing data reliability.
  • Deployment Modes: Supports single-machine master-slave mode (slaveof) and Codis cluster mode, making scaling and shrinking simple.
  • Easy Migration: Smooth migration from Redis to PikiwiDB without modifying code.
  • Convenient Operation and Maintenance: Comprehensive operation and maintenance command documentation.

PikiwiDB Storage Engine Architecture

  • Supports multiple platforms: CentOS, Ubuntu, macOS, Rocky Linux
  • Multi-threaded model
  • Based on the RocksDB storage engine
  • Multiple granularity data caching model
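The hot/cold separation above can be pictured as a bounded in-memory cache in front of a persistent full copy. The sketch below is a conceptual model only (a plain dict stands in for RocksDB); it is not PikiwiDB's actual caching code:

```python
from collections import OrderedDict

class TieredStore:
    """Toy hot/cold store: a bounded LRU cache (hot tier) in front of
    a persistent full data set (cold tier)."""

    def __init__(self, hot_capacity: int):
        self.hot = OrderedDict()   # hot tier: recently accessed keys, in memory
        self.cold = {}             # cold tier: full data set (stand-in for RocksDB)
        self.hot_capacity = hot_capacity

    def set(self, key, value):
        self.cold[key] = value     # the full data set is always persisted
        self._touch(key, value)

    def get(self, key):
        if key in self.hot:        # hot hit: served without touching the cold tier
            self.hot.move_to_end(key)
            return self.hot[key]
        value = self.cold[key]     # cold read, then promote to the hot tier
        self._touch(key, value)
        return value

    def _touch(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        if len(self.hot) > self.hot_capacity:
            self.hot.popitem(last=False)  # evict the least recently used key
```

The point of the hierarchy is that hot keys are served at memory speed while capacity is bounded only by disk, which is how PikiwiDB trades a little latency on cold reads for far larger data volumes.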

Deployment Modes

1. Master-Slave Mode

  • Architecture similar to Redis
  • Good compatibility with Redis protocol and data structures
  • Each data structure uses a separate RocksDB instance
  • Master-slave adopts binlog asynchronous replication
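The binlog-based asynchronous replication in the last bullet can be modeled as an append-only log on the master that each replica replays from its own offset. A toy sketch under those assumptions, not PikiwiDB's replication implementation:

```python
class Master:
    """Every write is applied locally, then appended to an append-only
    binlog; the master never waits for replicas (asynchronous)."""

    def __init__(self):
        self.data = {}
        self.binlog = []  # append-only log of write operations

    def set(self, key, value):
        self.data[key] = value                   # apply locally first ...
        self.binlog.append(("set", key, value))  # ... then record for replicas

class Replica:
    def __init__(self):
        self.data = {}
        self.offset = 0  # position already replayed in the master's binlog

    def sync(self, master: Master):
        """Incremental sync: replay only the entries past our offset."""
        for op, key, value in master.binlog[self.offset:]:
            if op == "set":
                self.data[key] = value
        self.offset = len(master.binlog)
```

Because the master does not block on replicas, writes stay fast; a replica that reconnects catches up incrementally from its last offset instead of needing a full resync.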

[Diagram: PikiwiDB master-slave architecture]

2. Distributed Cluster Mode

  • Adopts Codis architecture, supports multiple groups
  • Each group forms a master-slave set
  • Elastic scaling based on groups
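In the Codis model, keys are hashed into a fixed number of slots (1024, using CRC32 by default), and each slot is assigned to one group (a master-slave set); rebalancing moves slots between groups. A sketch of that mapping (the even/odd slot assignment below is purely hypothetical):

```python
import zlib

NUM_SLOTS = 1024  # Codis partitions the keyspace into 1024 slots

def slot_of(key: str) -> int:
    """Map a key to a slot as Codis does by default: crc32(key) % 1024."""
    return zlib.crc32(key.encode()) % NUM_SLOTS

def group_of(key: str, slot_to_group: dict) -> int:
    """Each slot belongs to exactly one group; scaling out just
    reassigns slots to new groups and migrates their data."""
    return slot_to_group[slot_of(key)]

# Hypothetical assignment: even-numbered slots to group 1, odd to group 2.
mapping = {s: 1 if s % 2 == 0 else 2 for s in range(NUM_SLOTS)}
```

Because the slot count is fixed, adding a group never rehashes the whole keyspace; only the slots handed to the new group move.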

[Diagram: PikiwiDB cluster (Codis) architecture]

PikiwiDB User Showcase


PikiwiDB has been widely adopted by various companies for internal deployments, demonstrating its scalability and reliability. Some notable usage instances include:

  • 360 Company: Internal deployment with a scale of 10,000+ instances, each having a data volume of 1.8TB.
  • Weibo: Internal deployment with 10,000+ instances.
  • Ximalaya (Xcache): 6,000+ instances with a massive data volume exceeding 120TB.
  • Getui (个推) Company: Internal deployment involving 300+ instances, with a cumulative data volume surpassing 30TB.

Additionally, PikiwiDB is used by companies such as Xunlei, Xiaomi, Zhihu, New Oriental Education & Technology Group, TAL Education Group (好未来), Kuaishou, Sohu, Meituan, Maimai, and more. For the full list, see the official user list in the PikiwiDB project.

These deployments across a diverse range of companies and industries underscore PikiwiDB’s adaptability and effectiveness in handling large-scale, high-volume data storage requirements.

More

Getting Started with PikiwiDB

1. Binary Package Installation

Users can directly download the latest binary version package from releases.

2. Compilation from Source

  • 2.1 Supported Platforms

    • Linux - CentOS
    • Linux - Ubuntu
    • macOS(Darwin)
  • 2.2 Required Library Software

    • gcc g++ supporting C++17 (version >= 9)
    • make
    • cmake (version >= 3.18)
    • autoconf
    • tar
  • 2.3 Compilation Process

    • 2.3.1. Get the source code

      ```bash
      git clone https://github.com/OpenAtomFoundation/pikiwidb.git
      ```
    • 2.3.2. Switch to the latest release version

      ```bash
      git tag           # list release tags (e.g., v3.4.1)
      git checkout TAG  # switch to the latest release (e.g., git checkout v3.4.1)
      ```
    • 2.3.3. Execute compilation

      If the machine's gcc version is lower than 9 (notably on CentOS 6 or CentOS 7), upgrade gcc first by executing the following commands:

      ```bash
      sudo yum -y install centos-release-scl
      sudo yum -y install devtoolset-9-gcc devtoolset-9-gcc-c++
      scl enable devtoolset-9 bash
      ```

      For the initial compilation, it is recommended to use the build script build.sh, which checks if the required software is available on the local machine.

      ```bash
      ./build.sh
      ```

      Note: The compiled files will be saved in the output directory.

      PikiwiDB is compiled by default in release mode, which does not support debugging. If debugging is needed, compile in debug mode.

      ```bash
      rm -rf output/
      cmake -B output -DCMAKE_BUILD_TYPE=Debug
      cd output && make
      ```

      Other components, such as codis, can also be compiled using build.sh.

      ```bash
      # Compile codis (default target: build-all)
      ./build.sh codis
      # Compile codis, but only build codis-proxy
      ./build.sh codis codis-proxy
      ```
    • 2.3.4. (Supplementary) Manual compilation based on Docker images

      • Centos7
        Reference link

        ```bash
        # 1. Start a CentOS 7 container locally
        sudo docker run -v /your/path/pikiwidb:/pikiwidb --privileged=true -it centos:centos7

        # 2. Install the build dependencies
        # (a freshly started container needs these installed)
        yum install -y wget git autoconf centos-release-scl gcc
        yum install -y devtoolset-10-gcc devtoolset-10-gcc-c++ devtoolset-10-make devtoolset-10-binutils
        yum install -y llvm-toolset-7 llvm-toolset-7-clang tcl which
        wget https://github.com/Kitware/CMake/releases/download/v3.26.4/cmake-3.26.4-linux-x86_64.sh
        bash ./cmake-3.26.4-linux-x86_64.sh --skip-license --prefix=/usr
        export PATH=/opt/rh/devtoolset-10/root/usr/bin/:$PATH
        cd pikiwidb

        # 3. Start the compilation
        # Set -DUSE_PIKA_TOOLS to ON or OFF depending on whether you need to build the tools
        cmake -B build -DCMAKE_BUILD_TYPE=Release -DUSE_PIKA_TOOLS=OFF
        cmake --build build --config Release -j8
        ```
      • Ubuntu
        Taking Debug Mode as an Example.
        ```bash
        # 1. Start an Ubuntu container locally
        sudo docker run -v /your/path/pikiwidb:/pikiwidb --privileged=true -it ubuntu:latest /bin/bash

        # 2. Install the build dependencies
        apt-get update
        apt-get install -y autoconf libprotobuf-dev protobuf-compiler
        apt-get install -y clang-tidy-12
        apt-get install -y gcc-9 g++-9
        apt-get install -y build-essential

        # 3. Compile in debug mode
        cmake -B debug -DCMAKE_BUILD_TYPE=Debug -DUSE_PIKA_TOOLS=OFF -DCMAKE_CXX_FLAGS_DEBUG=-fsanitize=address
        cmake --build debug --config Debug -j8
        ```
  • 2.4 Start PikiwiDB

    ```bash
    ./output/pika -c ./conf/pika.conf
    ```
  • 2.5 Clear Compiled Results

    If you need to clear the compilation content, you can choose one of the following methods based on the situation:

    Method 1: Clean only the current compilation content

    ```bash
    cd output && make clean
    ```

    Method 2: Completely recompile

    ```bash
    rm -rf output  # cmake will regenerate everything on the next build
    ```
  • 2.6 PikiwiDB Development Debugging

    Setting up PikiwiDB Development Environment with CLion

3. Containerization

  • 3.1 Running with Docker

    Modify the following configuration items of conf/pika.conf file:

    ```
    log-path : /data/log/
    db-path : /data/db/
    db-sync-path : /data/dbsync/
    dump-path : /data/dump/
    ```

    And then execute the following statement to start pika in docker:

    ```bash
    docker run -d \
      --restart=always \
      -p 9221:9221 \
      -v "$(pwd)/conf":"/pika/conf" \
      -v "/tmp/pika-data":"/data" \
      pikadb/pika:v3.3.6

    redis-cli -p 9221 "info"
    ```
  • 3.2 Build Custom Image

    If you want to build your own image, we provide a script build_docker.sh to simplify the process.

    This script accepts several optional parameters:

    • -t tag: Specify the Docker tag for the image. By default, the tag is pikadb/pika:.
    • -p platform: Specify the platform for the Docker image. Options include all, linux/amd64, linux/arm, linux/arm64. By default, it uses the current docker platform setting.
    • --proxy: Use a proxy to download packages to speed up the build process. The build will use Alibaba Cloud’s image source.
    • --help: Display help information.

    Here is an example usage:

    ```bash
    ./build_docker.sh -p linux/amd64 -t private_registry/pika:latest
    ```
  • 3.3 Running with docker-compose

docker-compose.yaml

  ```yaml
  pikadb:
    image: pikadb/pika:latest
    container_name: pikadb
    ports:
      - "6379:9221"
    volumes:
      - ./data/pika:/pika/log
      # Specify the configuration file path here if you need a custom configuration.
      # Note: pika.conf should be in the ./deploy/pika directory
      #- ./deploy/pika:/pika/conf
      - ./data/pika/db:/pika/db
      - ./data/pika/dump:/pika/dump
      - ./data/pika/dbsync:/pika/dbsync
    privileged: true
    restart: always
  ```

Performance test

  • Thanks to deep011 for providing the performance test results.

Note: The test results were obtained under specific conditions and scenarios, and may not represent the performance in all environments and scenarios. They are for reference only.

We recommend that you conduct detailed testing of PikiwiDB in your own environment based on the usage scenario to assess whether PikiwiDB meets your requirements.

1. Test environment

  • CPU Model: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
  • CPU Threads: 56
  • Memory: 256GB
  • Disk: 3TB Flash
  • Network: 10GBase-T/Full * 2
  • Operating System: CentOS 6.6
  • PikiwiDB Version: 2.2.4

2. Benchmarking Tool

vire-benchmark

3. Test Cases

3.1 Case 1

  • Test Objective

Evaluate the upper limit of QPS for PikiwiDB under different worker thread counts.

  • Test Conditions
    • PikiwiDB Data Size: 800GB
    • Value: 128 bytes
    • CPU not bound
  • Test Results

    [Chart: QPS by PikiwiDB worker thread count]

    Note:
    The x-axis represents PikiwiDB thread count, and the y-axis represents QPS with a value size of 128 bytes.
    “set3/get7” indicates 30% set and 70% get operations.

  • Case One Conclusion

    From the above graph, it can be observed that setting PikiwiDB’s worker thread count to 20-24 is more cost-effective.

3.2 Case 2

  • Test Objective

    Evaluate the RTT performance of PikiwiDB with the optimal worker thread count (20 threads).

  • Test Conditions
    • PikiwiDB Data Size: 800GB
    • Value: 128 bytes
  • Test Results
    ```
    ====== GET ======
      10000000 requests completed in 23.10 seconds
      200 parallel clients
      3 bytes payload
      keep alive: 1
    99.89% <= 1 milliseconds
    100.00% <= 2 milliseconds
    100.00% <= 3 milliseconds
    100.00% <= 5 milliseconds
    100.00% <= 6 milliseconds
    100.00% <= 7 milliseconds
    432862.97 requests per second
    ```
    ```
    ====== SET ======
      10000000 requests completed in 36.15 seconds
      200 parallel clients
      3 bytes payload
      keep alive: 1
    91.97% <= 1 milliseconds
    99.98% <= 2 milliseconds
    99.98% <= 3 milliseconds
    99.98% <= 4 milliseconds
    99.98% <= 5 milliseconds
    99.98% <= 6 milliseconds
    99.98% <= 7 milliseconds
    99.98% <= 9 milliseconds
    99.98% <= 10 milliseconds
    99.98% <= 11 milliseconds
    99.98% <= 12 milliseconds
    99.98% <= 13 milliseconds
    99.98% <= 16 milliseconds
    99.98% <= 18 milliseconds
    99.99% <= 19 milliseconds
    99.99% <= 23 milliseconds
    99.99% <= 24 milliseconds
    99.99% <= 25 milliseconds
    99.99% <= 27 milliseconds
    99.99% <= 28 milliseconds
    99.99% <= 34 milliseconds
    99.99% <= 37 milliseconds
    99.99% <= 39 milliseconds
    99.99% <= 40 milliseconds
    99.99% <= 46 milliseconds
    99.99% <= 48 milliseconds
    99.99% <= 49 milliseconds
    99.99% <= 50 milliseconds
    99.99% <= 51 milliseconds
    99.99% <= 52 milliseconds
    99.99% <= 61 milliseconds
    99.99% <= 63 milliseconds
    99.99% <= 72 milliseconds
    99.99% <= 73 milliseconds
    99.99% <= 74 milliseconds
    99.99% <= 76 milliseconds
    99.99% <= 83 milliseconds
    99.99% <= 84 milliseconds
    99.99% <= 88 milliseconds
    99.99% <= 89 milliseconds
    99.99% <= 133 milliseconds
    99.99% <= 134 milliseconds
    99.99% <= 146 milliseconds
    99.99% <= 147 milliseconds
    100.00% <= 203 milliseconds
    100.00% <= 204 milliseconds
    100.00% <= 208 milliseconds
    100.00% <= 217 milliseconds
    100.00% <= 218 milliseconds
    100.00% <= 219 milliseconds
    100.00% <= 220 milliseconds
    100.00% <= 229 milliseconds
    276617.50 requests per second
    ```
  • Case 2 Conclusion

    The response time for 99.9% of get/set operations is within 2ms.
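As a sanity check, the reported throughput figures match total requests divided by elapsed time:

```python
# Throughput implied by the benchmark output: requests / elapsed seconds.
get_rps = 10_000_000 / 23.10  # GET run, ~432,900 requests per second
set_rps = 10_000_000 / 36.15  # SET run, ~276,625 requests per second
# Both agree with the reported 432862.97 and 276617.50 requests per second
# to within the rounding of the printed elapsed times.
```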

3.3 Case 3

  • Test Objective

    Evaluate the maximum QPS for each command in PikiwiDB with the optimal worker thread count.

  • Test Conditions
    • PikiwiDB Worker Thread Count: 20
    • Number of Keys: 10,000
    • Number of Fields: 100 (excluding lists)
    • Value: 128 bytes
    • Number of Command Executions: 10 million (except for lrange)
  • Test Results
    ```
    PING_INLINE: 548606.50 requests per second
    PING_BULK: 544573.31 requests per second
    SET: 231830.31 requests per second
    GET: 512163.91 requests per second
    INCR: 230861.56 requests per second
    MSET (10 keys): 94991.12 requests per second
    LPUSH: 196093.81 requests per second
    RPUSH: 195186.69 requests per second
    LPOP: 131156.14 requests per second
    RPOP: 152292.77 requests per second
    LPUSH (needed to benchmark LRANGE): 196734.20 requests per second
    LRANGE_10 (first 10 elements): 334448.16 requests per second
    LRANGE_100 (first 100 elements): 50705.12 requests per second
    LRANGE_300 (first 300 elements): 16745.16 requests per second
    LRANGE_450 (first 450 elements): 6787.94 requests per second
    LRANGE_600 (first 600 elements): 3170.38 requests per second
    SADD: 160885.52 requests per second
    SPOP: 128920.80 requests per second
    HSET: 180209.41 requests per second
    HINCRBY: 153364.81 requests per second
    HINCRBYFLOAT: 141095.47 requests per second
    HGET: 506791.00 requests per second
    HMSET (10 fields): 27777.31 requests per second
    HMGET (10 fields): 38998.52 requests per second
    HGETALL: 109059.58 requests per second
    ZADD: 120583.62 requests per second
    ZREM: 161689.33 requests per second
    PFADD: 6153.47 requests per second
    PFCOUNT: 28312.57 requests per second
    PFADD (needed to benchmark PFMERGE): 6166.37 requests per second
    PFMERGE: 6007.09 requests per second
    ```
  • Conclusion

    Overall performance is excellent, but some commands exhibit weaker performance (LRANGE, PFADD, PFMERGE).

3.4 Case 4

  • Test Objective

    Compare the maximum QPS between PikiwiDB and Redis.

  • Test Conditions
    • PikiwiDB Worker Thread Count: 20
    • Number of Keys: 10,000
    • Number of Fields: 100 (excluding lists)
    • Value: 128 bytes
    • Number of Command Executions: 10 million (except for LRANGE)
    • Redis Version: 3.2.0
  • Test Result

[Chart: maximum QPS, PikiwiDB vs. Redis]

Observability

Metrics

  1. PikiwiDB Server Info: system, ip, port, run_id, config file etc.
  2. PikiwiDB Data Info: db size, log size, memory usage etc.
  3. PikiwiDB Client Info: The number of connected clients.
  4. PikiwiDB Stats Info: status information of compact, slot, etc.
  5. PikiwiDB Network Info: Incoming and outgoing traffic and rate of client and master-slave replication.
  6. PikiwiDB CPU Info: cpu usage.
  7. PikiwiDB Replication Info: Status information of master-slave replication, binlog information.
  8. PikiwiDB Keyspace Info: key information of five data types.
  9. PikiwiDB Command Exec Count Info: command execution count.
  10. PikiwiDB Command Execution Time: Time-consuming command execution.
  11. RocksDB Metrics: RocksDB information of five data types, includes Memtable, Block Cache, Compaction, SST File, Blob File etc.
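Most of these metrics are surfaced through the Redis-style INFO command, whose "section header plus key:value" output is easy to scrape. A small parser sketch follows; the sample fields below are illustrative, not a guaranteed PikiwiDB schema:

```python
def parse_info(raw: str) -> dict:
    """Parse Redis/PikiwiDB-style INFO output: '# Section' headers
    followed by 'key:value' lines, into a nested dict."""
    sections, current = {}, None
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):                 # section header
            current = line.lstrip("# ").strip()
            sections[current] = {}
        elif ":" in line and current is not None:  # key:value metric
            key, _, value = line.partition(":")
            sections[current][key] = value
    return sections

# Hypothetical sample resembling the sections listed above:
sample = """# Server
run_id:abc123
tcp_port:9221

# Clients
connected_clients:42
"""
info = parse_info(sample)
print(info["Clients"]["connected_clients"])  # -> 42
```

A scraper built this way is what exporters typically do to feed these fields into a monitoring system.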

More details on Metrics.

Documents

Contact Us