Gitee — Enterprise-level DevOps R&D management platform
Join Gitee
此仓库是为了提升国内下载速度的镜像仓库,每日同步一次。 原始仓库: https://github.com/KuiBaDB/KuiBaDB
Clone or download
Cancel
Notice: Creating folder will generate an empty file .keep, because not support in Git
Loading...
README.md

KuiBaDB is another PostgreSQL rewritten with Asynchronous Rust, and KuiBaDB focus on OLAP analysis. KuiBaDB is also an open source implementation of Hologres: A Cloud-Native Service for Hybrid Serving/Analytical Processing.

KuiBaDB is built on kbio and tokio. We only use the 'rt-multi-thread', 'rt' and 'io-util' features of tokio. All IO, including file IO and network IO, and asynchronous syscall are powered by kbio.

KuiBaDB contains only the basic features necessary for implementing an OLAP Database, such as supporting transactions but not sub-transactions. It is hoped that as an experimental field, researchers can quickly implement their ideas based on the infrastructure provided by KuiBaDB.

KuiBaDB uses vectorization engine and is also catalog-driven. KuiBaDB uses columnar storage introduced in Hologres. But I removed the Delete Map and added xmin, xmax for each row, xmin/xmax is saved in row storage.

Roadmap

KuiBaDB is only developed in my free time, so the progress could be very slow.

  • Add GlobalState, SessionState, WorkState. See KuiBaDB: State for more details.

  • Add guc

  • Support select expr1, expr2:

    $ psql -h 127.0.0.1 -p 1218 kuiba
    psql (13.1, server 0.0.1)
    Type "help" for help.
    
    kuiba=# select 2020 - 2 as hello, 1207 + 11 as world;
    hello  | world
    -------+-------
    2018   | 1218
    (1 row)
  • Add slru and clog. The clog supports two-levels cache and vectorization.

  • Add wal. We have moved all the IO operations out of the lock!

  • Add crash recovery.

  • Add xact system

    2021-04-10T10:35:19.424402+08:00 - INFO - start redo. ctl=Ctl { time: 2021-04-03T22:55:14+08:00, ckpt: 20181218, ckptcpy: Ckpt { redo: 20181218, curtli: 1, prevtli: 1, nextxid: 2, nextoid: 65536, time: 2021-04-03T22:55:14+08:00 } }
    2021-04-10T10:35:19.554540+08:00 - INFO - end redo because of failed read. endlsn=20711748 endtli=1 err=No such file or directory (os error 2)
    2021-04-10T10:35:19.554693+08:00 - INFO - End of redo. nextxid: 15604, nextoid: 65536
    2021-04-10T10:35:19.555442+08:00 - INFO - listen. port=1218
    
    kuiba=# begin;
    BEGIN
    kuiba=*# select 1;
    1
    kuiba=*# commit;
    COMMIT
    kuiba=# begin;
    BEGIN
    kuiba=*# select 1;
    1
    kuiba=*# select x;
    ERROR:  parse query failed: Parse Error. UnrecognizedToken { token: (7, Token(16, "x"), 8), expected: ["\"(\"", "\"+\"", "\"-\"", "\";\"", "DECIMAL", "INTEGER", "XB"] }
    kuiba=!# select 1;
    ERROR:  current transaction is aborted, commands ignored until end of transaction block:
    kuiba=!# commit;
    ROLLBACK
  • Implement PG-style shared buffer: SharedBuf<K: Copy, V, E: EvictPolicy>.

    SharedBuf<TableId, SuperVersion, LRUPolicy> will be used to save the mapping between the table and its SuperVersion. In RocksDB, SuperVersion of ColumnFamily is memory resident. but OLAP system may have many tables, we should support swapping the SuperVersion of some infrequently used tables out to disk.

    SharedBuf<TableId, SharedBuf<PageId, Page, FIFOPolicy>, LRUPolicy> will be used to save the xmin/xmax/hints page for table file.

  • Add lock manager

  • Add CREATE TABLE, LOCK TABLE

    kuiba=# create table t( i int, j int );
    CREATE TABLE
    
    kuiba=# begin;
    BEGIN
    kuiba=*# lock table t1 in access exclusive mode;
    LOCK TABLE
  • Refactoring Expression module, supporting expression result reuse, 1 + 3 will only be calculated once, and memory reuse, See KuiBaDB: Expression for more details.

    kuiba=# select (1+3) + (1+3), 1 + 3;
     ?column? | ?column?
    ----------+----------
            8 |        4
    (1 row)
  • Add columnar storage.

  • Add Parallel Copy

    -- KuiBaDB
    kuiba=# create table t(col0 int,col1 int,col2 int,col3 int,col4 int,col5 int,col6 int,col7 int,col8 int,col9 int,col10 int,col11 int,col12 int,col13 int,col14 int,col15 int);
    CREATE TABLE
    Time: 92.237 ms
    -- Use one thread to parse the input and 4 threads to write data.
    -- We need to do more profiling to explain the results.
    kuiba=# copy t from '/Users/zhanyi/NOTtmp/col16row33y.csv' DELIMITERS '|' (parallel 4);
    COPY 10545903
    Time: 9142.299 ms (00:09.142)
    -- PostgreSQL 14beta1
    pg14beta1=# copy t from '/Users/zhanyi/NOTtmp/col16row33y.csv' DELIMITERS '|';
    COPY 10545903
    Time: 13483.658 ms (00:13.484)
  • implement the HOS introduced in Hologres with rust async/await.

  • Add SeqScan

  • Add Parallel SeqScan

  • Add checkpointer

  • [ ] Rewrite Greenplum based on KuiBaDB

Run Test

export KUIBADB_DATADIR=/tmp/kuibadir4test
./target/debug/initdb  $KUIBADB_DATADIR
echo 'clog_l2cache_size: 1' >> $KUIBADB_DATADIR/kuiba.conf
cargo test

Repository Comments ( 0 )

Sign in for post a comment

About

KuiBaDB 是使用 Asynchronous Rust 重写的 PostgreSQL,专注于 OLAP 分析 expand collapse
Rust and 2 more languages
Apache-2.0
Cancel

Releases

No release

Contributors

All

Activities

load more
can not load any more