VanillaDB is a collection of simple-to-read, fast, and extensible database system components that aims to lower the barrier to prototyping new systems and learning database internals.
Most relational database systems today are too complicated for practitioners, especially newcomers, to leverage and build creative systems or components upon. One main problem is that these systems have been optimized for decades, so their source code is highly sophisticated and hard to understand. VanillaDB rewrites some key components of a distributed relational database system with the following goals in mind:
The source code of VanillaDB is released under the Apache 2.0 license. We are happy to hear your feedback or feature requests at vanilladb@datalab.cs.nthu.edu.tw.
VanillaDB is ideal for:
For instructors, we offer extra coding labs that give students hands-on experience with some important modules (e.g., query planning and transaction processing). Please contact vanilladb@datalab.cs.nthu.edu.tw for more details.
VanillaDB has been used as a testbed in research work (e.g., T-Part in Proc. of SIGMOD’16) and as teaching material in DB courses (e.g., “Cloud Database Systems” offered by National Tsing Hua University, Taiwan). It also serves as the core engine in some advanced systems (e.g., ElaSQL, a deterministic, distributed relational database system for OLTP workloads).
Currently, VanillaDB consists of two sub-projects, VanillaCore and VanillaComm. The former is a single-node relational database engine, and the latter provides group communication primitives for distributed database systems.
A new sub-project called VanillaBench is on the way.
To cite VanillaDB, please add the following to your BibTeX:
@inproceedings{shwu2016tpart,
  title={T-Part: Partitioning of Transactions for Forward-Pushing in Deterministic Database Systems},
  author={Shan-Hung Wu and Tsai-Yu Feng and Meng-Kai Liao and Shao-Kan Pi and Yu-Shan Lin},
  booktitle={Proceedings of the 2016 ACM SIGMOD International Conference on Management of Data (SIGMOD)},
  year={2016},
  organization={ACM}
}
VanillaCore is a single-node, multi-threaded relational database engine that partially supports the SQL-92 standard and offers connectivity via JDBC, embedding, or (Java-based) stored procedures.
Threads vs. connections vs. transactions, thread-local vs. thread-safe components, etc.
SQL parsing and validation, planning, algebra, plan/scan trees, etc.
Block-level vs. file-level access, O_DIRECT on Linux, etc.
Buffering user data, write-ahead logging (WAL), log caching, etc.
Physical schema design, efficient buffer utilization, etc.
Strict Two-Phase Locking (S2PL), deadlock detection/avoidance, lock granularity, phantoms, isolation levels, etc.
Physical logging, transaction rollback, UNDO-only recovery, UNDO-REDO recovery, logical logging, physiological logging, ARIES, checkpointing, etc.
Hash and B-tree indexing, index locking, etc.
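As an illustration of the Strict Two-Phase Locking (S2PL) topic listed above, the following is a minimal sketch of a lock table, not VanillaCore's actual lock manager. The class name S2plLockTable and its methods are hypothetical. To stay short, it assumes a non-blocking design: a conflicting request returns false instead of waiting, so deadlock handling is left out. The "strict" part is that a transaction gives up its locks only in releaseAll, which would be called at commit or rollback.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of S2PL; names are not taken from VanillaCore.
class S2plLockTable {
    // Lock state of one item: any number of shared holders, at most one exclusive holder.
    private static final class Entry {
        final Set<Long> sharedHolders = new HashSet<>();
        Long exclusiveHolder = null;
    }

    private final Map<String, Entry> locks = new HashMap<>();

    // Shared (read) lock: refused only if another transaction holds the item exclusively.
    synchronized boolean tryShared(long txId, String item) {
        Entry e = locks.computeIfAbsent(item, k -> new Entry());
        if (e.exclusiveHolder != null && e.exclusiveHolder != txId) {
            return false;
        }
        e.sharedHolders.add(txId);
        return true;
    }

    // Exclusive (write) lock: refused if any other transaction holds any lock on the item.
    // A transaction that is the sole shared holder may upgrade.
    synchronized boolean tryExclusive(long txId, String item) {
        Entry e = locks.computeIfAbsent(item, k -> new Entry());
        if (e.exclusiveHolder != null) {
            return e.exclusiveHolder == txId;
        }
        for (long holder : e.sharedHolders) {
            if (holder != txId) {
                return false;
            }
        }
        e.exclusiveHolder = txId;
        return true;
    }

    // Strictness: every lock of the transaction is released here, at commit or rollback,
    // and nowhere earlier.
    synchronized void releaseAll(long txId) {
        for (Entry e : locks.values()) {
            e.sharedHolders.remove(txId);
            if (e.exclusiveHolder != null && e.exclusiveHolder == txId) {
                e.exclusiveHolder = null;
            }
        }
        // Drop entries that no transaction holds any more.
        locks.values().removeIf(e -> e.sharedHolders.isEmpty() && e.exclusiveHolder == null);
    }
}
```

A real lock manager would typically block or time out on conflicts and pair that with deadlock detection or avoidance; the sketch only shows the compatibility rules and the release-at-end discipline.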
VanillaComm is a collection of reliable group communication primitives (e.g., total ordering) that benefit distributed database systems (e.g., eager-replication or NewSQL database systems). It is based on the Appia framework and handles node/machine failures transparently.
VanillaBench eases database system benchmarking by partially implementing some common benchmarks (e.g., TPC-C, TPC-E, and YCSB). Coming soon.
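To make the idea of total ordering concrete, here is a minimal sketch of one common way to obtain it: a single sequencer stamps each message with a global sequence number, and every receiver delivers messages strictly in stamp order. This is an assumption chosen for illustration, not a description of how VanillaComm or Appia implements it; the classes Sequencer and OrderedReceiver are hypothetical, and failure handling is omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch of sequencer-based total ordering; not VanillaComm's API.
class Sequencer {
    private long next = 0;

    // Hand out the next global sequence number.
    synchronized long assign() {
        return next++;
    }
}

class OrderedReceiver {
    // Messages that arrived ahead of their turn, keyed by sequence number.
    private final TreeMap<Long, String> pending = new TreeMap<>();
    private final List<String> delivered = new ArrayList<>();
    private long expected = 0;

    // Accept a message that may arrive out of order; deliver it only once
    // every message with a smaller sequence number has been delivered.
    void receive(long seq, String msg) {
        if (seq < expected) {
            return; // already delivered, ignore the duplicate
        }
        pending.putIfAbsent(seq, msg);
        while (!pending.isEmpty() && pending.firstKey() == expected) {
            delivered.add(pending.pollFirstEntry().getValue());
            expected++;
        }
    }

    List<String> delivered() {
        return delivered;
    }
}
```

Because all receivers deliver in sequence-number order, two replicas that see the same messages in different network orders still apply them in the same order, which is the property eager replication relies on.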
Copyright © by vanilladb.org. All rights reserved.
Please contact vanilladb@datalab.cs.nthu.edu.tw if you have any questions.