Database Building 101: The cost of graph storage

Aug 18 2016

So now that we know how to store the data in a way that allows efficient graph traversal, let’s do some back-of-the-envelope calculations for storage costs.

Like any storage system, Voron needs to store some metadata about our data, and sometimes this can be very surprising to people.

Let’s look at each of the items that we store in turn.

Node data is stored in a table.
Edge data is stored in a table.
The edge itself is stored in a B+Tree containing fixed size trees.

A table does a bunch of stuff, including reserving some space on the disk, but we don’t have dynamic tables here, so those costs are relatively fixed.

The cost per item, however, depends on the size of the data. If the data is smaller than 2036 bytes, then storing it costs 4 bytes of overhead. If, however, the data item is larger than 2036 bytes, we round its size up to a 4KB section.

In other words, storing ten million nodes of 1KB each will cost us about 40 MB of overhead (compared to roughly 10 GB of data). However, if each node is 2KB in size, we need to store each one in a full page. The reason for this, by the way, is that we need to balance the cost of inserts and the cost of updates, so we only modify things on a page boundary (in this case, 4KB). If the value is small, we pack several of them into a single page, but beyond a certain size, we just allocate dedicated pages for them, and live with a bit of free space at the end.
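The rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Voron's code: the names `stored_size` and `total_overhead` are made up for this example, a value of exactly 2036 bytes is assumed to take the large path, and any per-page headers for large values are ignored.

```python
PAGE_SIZE = 4096          # 4KB page
SMALL_VALUE_LIMIT = 2036  # values below this are packed several to a page
SMALL_VALUE_OVERHEAD = 4  # bytes of overhead per small value


def stored_size(data_size: int) -> int:
    """Approximate bytes consumed by a single table entry (hypothetical helper)."""
    if data_size < SMALL_VALUE_LIMIT:
        return data_size + SMALL_VALUE_OVERHEAD
    # Assumption: larger values get dedicated pages, rounded up to whole pages.
    pages = -(-data_size // PAGE_SIZE)  # ceiling division
    return pages * PAGE_SIZE


def total_overhead(count: int, data_size: int) -> int:
    """Bytes spent beyond the raw data for `count` entries of `data_size` bytes."""
    return count * (stored_size(data_size) - data_size)
```

With these definitions, `total_overhead(10_000_000, 1024)` comes to 40,000,000 bytes, which is the 40 MB mentioned above, while a 2KB value occupies a full 4096-byte page.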

More interesting is the storage of the edge data, actually. A B+Tree costs a minimum of 4KB, and we have one of these for each of the edge types that we have. In practice, we don’t expect there to be a high number of edge types, and we can readily ignore this as a fixed size cost. In most cases, I would be stunned to hear that there is more than a single 4KB page for all your edge types (should be enough for a hundred or so).

What isn’t fixed size is the number of fixed size trees (one per source node) and the number of entries in the fixed size trees (one per connection). The whole reason we have fixed size trees is that they allow us to make efficient use of storage by making assumptions. You can see this in their usage. A fixed size tree has an int64 as the key, and you need to specify upfront how much space you need for the values. That makes it very simple to write.

Fixed size trees actually have two forms: they can be embedded or they can be free floating. That mostly depends on their size. If they are embedded, they are stored inside the parent tree, but if they are too large, we store them in their own page. Embedded usage takes 6 bytes per fixed size tree. On top of that, each entry takes 8 bytes for the key, plus the entry size itself (which in our case is also 8 bytes). So a total of 16 bytes per entry inside the fixed size tree.

What this means, in practice, is that up until the moment a node has more than 254 connections, it can be stored as an embedded value. When it goes beyond that, we’ll spill over to a full page and start taking space in 4KB increments.
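To make the embedded numbers concrete, here is a small Python sketch of the arithmetic. The helper names are hypothetical and only the embedded case is modeled; the layout of a fixed size tree that has spilled to its own pages is not described here, so the sketch refuses to guess at it.

```python
FST_HEADER = 6              # bytes per embedded fixed size tree
KEY_SIZE = 8                # int64 key
VALUE_SIZE = 8              # value size used for edges in this series
MAX_EMBEDDED_ENTRIES = 254  # beyond this the tree moves to its own page


def is_embedded(connections: int) -> bool:
    """True while the tree is small enough to live inside the parent tree."""
    return connections <= MAX_EMBEDDED_ENTRIES


def embedded_tree_size(connections: int) -> int:
    """Bytes used by an embedded fixed size tree (hypothetical helper)."""
    if not is_embedded(connections):
        raise ValueError("tree is no longer embedded; it uses whole 4KB pages")
    return FST_HEADER + connections * (KEY_SIZE + VALUE_SIZE)
```

A node with 10 connections needs 166 bytes for its embedded tree, and a node with 254 connections needs 4070 bytes, which still fits under the 4KB page size.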

One thing that I didn’t mention is that we store the fixed size trees (keyed by their source node ID), inside a parent B+Tree. Here we need to track a lot more information (keys and values have different sizes, etc). The overhead cost per entry in a B+Tree is 14 bytes. Add to that the 8 bytes for the source node id, and it comes to 22 bytes per source node.

Given all of those numbers, if we had a graph with 10 million nodes, each node was connected to 10 other nodes on average, and each node/edge was 0.5KB in size, we would have:

5 GB – Nodes data – 10,000,000
50 GB – Edges data – 100,000,000
440 MB – overhead data for nodes & edges (4 bytes for each of the 110,000,000 items).
1.75 GB – edges information about all nodes.
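The estimate can be reproduced with a short Python sketch. It is a back-of-the-envelope model only: the function name `estimate` and the dictionary keys are invented for this example, 0.5KB is taken as 512 bytes, every node is assumed to have few enough connections to keep its fixed size tree embedded, and the fixed costs of the tables and B+Trees are ignored. The totals it returns come straight from the per-item costs quoted earlier, so they may differ from the rounded figures in the list.

```python
TABLE_ENTRY_OVERHEAD = 4   # bytes per table entry under 2036 bytes
BTREE_ENTRY_OVERHEAD = 14  # bytes per entry in the parent B+Tree
SOURCE_ID_SIZE = 8         # bytes for the source node id key
FST_HEADER = 6             # bytes per embedded fixed size tree
FST_ENTRY_SIZE = 16        # 8 byte key + 8 byte value per connection


def estimate(nodes: int, edges_per_node: int, item_size: int) -> dict:
    """Rough storage estimate in bytes (hypothetical helper)."""
    edges = nodes * edges_per_node
    per_source_node = BTREE_ENTRY_OVERHEAD + SOURCE_ID_SIZE + FST_HEADER
    return {
        "node_data": nodes * item_size,
        "edge_data": edges * item_size,
        "table_overhead": (nodes + edges) * TABLE_ENTRY_OVERHEAD,
        "edge_index": nodes * per_source_node + edges * FST_ENTRY_SIZE,
    }


if __name__ == "__main__":
    for name, size in estimate(10_000_000, 10, 512).items():
        print(f"{name}: {size / 2**30:.2f} GiB")
```

The `edge_index` entry comes to 1,880,000,000 bytes, which is about 1.75 when expressed in units of 2^30 bytes.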

Note that in such a graph, we have 10 million nodes, but a hundred million edges. All of that can fit comfortably into RAM on a good server, and give you blazingly good performance.

More posts in "Database Building 101" series:

(25 Aug 2016) Graph querying over large datasets
(24 Aug 2016) Stable node ids
(22 Aug 2016) High level graph operations
(19 Aug 2016) Graphs aren’t toys
(18 Aug 2016) The cost of graph storage
(17 Aug 2016) The cost of graphing
(16 Aug 2016) The storage layer
(15 Aug 2016) Let us graph this for real

Comments

19 Aug 2016
12:14 PM

Tyler Jensen

Love this post. Thanks for sharing. While at Ancestry.com, I wrote a highly specialized in-memory graph using C# and unmanaged blocks of memory in which we stored 800 million nodes and 2.8 billion edges. This allowed traversals of tens of thousands of nodes in just milliseconds. It was not really a "database" per se, as it was static for read with an entirely different "copy" in memory for writes with a regular swap between the two (rather a swap between actions being taken on the memory pages allocated in the unmanaged "unsafe" zone). We did not need to know anything about each node or vertex other than an ID and we only needed the barest of information about each edge, so we were able to get all those data in duplicate into one machine with just 128GB of RAM. The real trick was replication to a pool of servers, each with their own copy of the graph. But that is another story.

Oren Eini

CEO of RavenDB