A Real-Time Database Survey: The Architecture of Meteor, RethinkDB, Parse & Firebase

Wolfram Wingerath
Published in Speed Kit Blog
Jul 16, 2017 · 26 min read

Real-time databases make it easy to implement reactive applications, because they keep your critical information up-to-date. But how do they work and how do they scale? In this article, we dissect the real-time query features of Meteor, RethinkDB, Parse and Firebase to uncover scaling limitations inherent to their respective designs. We then go on to discuss and categorize related real-time systems and share our lessons learned in providing real-time queries without any bottlenecks in Baqend. (video / paper / thesis)

Recently, we held a talk (slides, paper, video) about the merits of push-based data access and the currently available databases that support it. Since we are not aware of any comprehensive overview of the current state of the art in push-oriented databases, we thought it was time for another written survey. This article is a slightly altered transcript of our talk. We hope you enjoy it :-)

Reactive applications vs. pull-based databases

In recent years, people have come to expect reactivity from their applications, i.e. they assume that changes made by other users are immediately reflected in the interfaces they are using. However, building such applications with traditional database technology is difficult, because systems for data storage and retrieval have been developed around a purely pull-based request-response access pattern for decades. If you want to brush up your knowledge on this kind of database system, have a look at our database survey:

Responding to the need for reactivity on the database side, a new class of database systems has emerged that is natively push-oriented and thus promises to ease development of reactive (web) applications. These systems are often referred to as real-time databases, because they keep data at the client in sync with the current database state “in realtime”, i.e. immediately after a change. (Please note that the term real-time database is overloaded; this article explicitly does not refer to databases that produce an output within strict timing constraints.)

Outline

Since this article is rather in-depth, let us briefly outline what it is going to be about:

  1. Real-time queries: First, we will explore the idea of self-maintaining queries for which the database proactively pushes relevant updates to the client. We will illustrate their usefulness and explain why it is hard to provide them.
  2. Meteor, RethinkDB, Parse and Firebase: We then describe how the few systems that do support real-time queries implement them and where the problems in the individual architectures lie.
  3. Related systems: After the in-depth discussion of real-time query implementations, we survey and categorize other system classes that account for data evolving over time.
  4. Discussion: We then sum up our findings and discuss why no current system provides scalable and expressive real-time queries.
  5. A scalable architecture for real-time queries: Finally, we outline the Baqend Real-Time Query architecture and how it avoids the bottlenecks that Meteor, RethinkDB, Parse and Firebase are constrained by.

Real-Time Queries

To illustrate the merits of push-based data access, let’s have a look at a simple motion tracking application we implemented as a showcase for Baqend Real-Time Queries a while back. As shown below, we tracked different people as they wandered through a building (bottom). We also had a server update their respective x/y coordinates as well as their current room using simple CRUD statements, several times a second. A website (top right) displaying info on the people in Room B reflected these changes immediately.

A simple application should be simple to implement.

In a perfect world, you could just tell the database what you are looking for and have it keep the answer up-to-date.

The natural way to implement this is a query that is updated by the database as soon as the result changes — a real-time query. In principle, there are two variants of real-time queries: With a self-maintaining query (top left), the database will deliver the initial result upfront and then deliver a new result whenever there is a change to the requested information. With an event stream query, the database delivers low-level change events, so that the application has to take care of maintaining the result. With either real-time query flavor, the application is trivial to implement, because it only has to display the data that is pushed from the database.
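
To make the distinction concrete, here is a minimal TypeScript sketch against a hypothetical client SDK; the db handle and the watchResult/watchEvents methods are made up for illustration and do not belong to any particular system discussed below:

```typescript
interface Person { id: string; name: string; room: string; x: number; y: number }

// The db handle and the watchResult/watchEvents methods are hypothetical.
declare const db: any;
declare function render(people: Person[]): void;
declare function updateLocalList(event: { type: 'add' | 'change' | 'remove'; person: Person }): void;

// Self-maintaining query: every callback invocation carries the complete, current result.
db.collection('people')
  .filter({ room: 'B' })
  .watchResult((people: Person[]) => render(people));

// Event stream query: the callback receives low-level change events;
// keeping the displayed list up-to-date is the application's job.
db.collection('people')
  .filter({ room: 'B' })
  .watchEvents((event: { type: 'add' | 'change' | 'remove'; person: Person }) => updateLocalList(event));
```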

On top of a purely pull-based data store, though, this application is a tough one, because you can only request snapshots of data and cannot possibly know how long a given query result is going to be up-to-date. Of course, you can reevaluate a query every few seconds. Or you might do something clever to detect relevant updates in the database log, given you have access. But whatever you end up doing, the code will be complex, inefficient and error-prone in comparison to the push-oriented counterpart using real-time queries.

Real-time databases take view maintenance out of the application layer.

Query Maintenance: Problem Statement

In simple words, a real-time database keeps an eye on the result of every active real-time query and sends new, updated and removed items to the client. This seems straightforward, but is actually difficult, because a result is not just a list of identifiers that you can easily track: A query result is the entirety of all objects satisfying the query’s matching condition. And whenever an object is written, the object’s matching status with respect to any active query may be affected. Therefore, to find out whether or not the result has changed in response to a write operation, a real-time database has to inspect every written object.

Each database update has to be analyzed w.r.t. each real-time query.

As illustrated in the decision tree, there are two questions to answer: (1) whether the object just written is a match and (2) whether it was a match before. And these questions have to be answered in the context of every single real-time query and every single write operation. Any object that is no match and was no match for a given query has obviously no effect and can thus be disregarded. In all other cases, though, the result needs to be updated and the client should receive corresponding information.
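
Expressed in code, this boils down to a tiny classification routine that has to run for every combination of query and write operation; a minimal sketch, assuming queries are represented by a matching predicate:

```typescript
type ChangeKind = 'add' | 'change' | 'remove' | 'none';

interface Query<T> { matches(obj: T): boolean }

// Classify one write operation with respect to one real-time query.
// `before` is the object's state prior to the write (undefined for inserts),
// `after` is its state afterwards (undefined for deletes).
function classify<T>(query: Query<T>, before?: T, after?: T): ChangeKind {
  const wasMatch = before !== undefined && query.matches(before);
  const isMatch = after !== undefined && query.matches(after);
  if (!wasMatch && isMatch) return 'add';    // object enters the result
  if (wasMatch && !isMatch) return 'remove'; // object leaves the result
  if (wasMatch && isMatch) return 'change';  // object stays in the result, possibly altered
  return 'none';                             // no match before or after: irrelevant
}
```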

A real-time database has to analyze the effect of every single write operation for every single real-time query.

Let’s make this concrete: Let’s say you have an app with 1,000 concurrent users and an average throughput of 1,000 operations per second. Given each user has only one active real-time query, the real-time database already has to perform 1 million matching operations — every single second. And this is just for plain filtering and does not account for more complex queries with even higher overhead. (Let’s save sorting, joining and aggregation for another article.) You could apply “optimizations” (e.g. batching) that trade throughput for increased latency, but if you want immediate change notifications, there are no other options.

The process of query maintenance is wildly expensive and therefore has to be scalable, first and foremost.

Wrap-Up

Real-time databases acknowledge the inadequacy of pull-based interfaces for reactive applications. By proactively delivering changes, they reduce complexity and cater for increased ease of development. We distinguish two different types of real-time queries: Self-maintaining queries basically follow the same semantics as common ad hoc queries, but provide you with a complete and updated result whenever there is a change. Event stream queries provide you with fine-grained information on events as they happen. Consequently, they require some understanding of the underlying mechanics, e.g. what types of events there are and what properties they have.

In the following section, we will have a look at the real-time capabilities of the most prominent representatives, namely Meteor, RethinkDB, Parse and Firebase.

Meteor

Meteor is a JavaScript app development framework that targets reactive apps and websites. It uses MongoDB under the hood and therefore inherits its query expressiveness, but adds self-maintaining and event stream queries on top. Interestingly, Meteor offers two different implementations to detect relevant changes: The original one (change monitoring + poll-and-diff) combines monitoring local write operations with periodic query reevaluation; it is only used as a fallback nowadays. The more recent (and current default) implementation (oplog tailing) relies on monitoring MongoDB’s replication log.

Next, we will discuss both real-time query variants and their respective limitations.

Change Monitoring + Poll-and-Diff

The original approach towards real-time queries is a combination of two things: First, every Meteor app server monitors the write operations it receives to detect and handle state changes that are relevant for currently active real-time queries. If, for example, a newly inserted object matches a real-time query, this object is pushed to all clients listening to this particular query. As illustrated below, however, change monitoring is not sufficient in multi-server deployments: Inserting an object on one server (right) is not captured by change monitoring done on the other server (left). To compensate for this fact, Meteor employs a second strategy called poll-and-diff which boils down to periodically reevaluating a real-time query (“poll”) and sending only relevant changes (“diff”) to the client. When these two strategies are combined, the client will receive a relevant update either immediately through change monitoring or after a short delay upon discovery through poll-and-diff.

Poll-and-Diff: Meteor executes a real-time query again and again to discover changes.

From the client perspective, this approach has the obvious disadvantage of potential staleness windows bounded by the polling interval (default: 10 seconds). But even if an application can tolerate lagging behind a few seconds occasionally, poll-and-diff becomes infeasible when many real-time queries are active concurrently. This is because each one of them induces processing overhead on the database system through periodic polling. In numbers, 1,000 query subscriptions polled at the default 10-second interval result in an average of 100 queries per second whose results have to be (1) assembled and (2) serialized, then (3) sent to the app server where they are (4) deserialized again, so that they can finally (5) be analyzed for relevant changes. This is not cheap in itself, but gets downright expensive when results are large. And this is just what happens in an otherwise idle system, i.e. when no data is being written whatsoever. Things get worse when write throughput increases, because the database has fewer spare resources to serve real-time maintenance queries and because each app server has to spend more CPU time on change monitoring — which, of course, becomes more demanding with an increasing number of active real-time queries as well.
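
The following sketch illustrates the principle (a simplification, not Meteor’s actual code): the query is reevaluated on a timer and only the difference to the previous result is published.

```typescript
// Simplified poll-and-diff loop (a sketch, not Meteor's actual code): reevaluate the
// query on a timer and publish only the difference to the previous result.
// Note: changed-but-still-matching documents are omitted for brevity.
function pollAndDiff<T extends { _id: string }>(
  runQuery: () => Promise<T[]>,                 // executes the query against MongoDB
  publish: (added: T[], removed: T[]) => void,  // pushes the diff to subscribed clients
  intervalMs = 10_000                           // Meteor's default polling interval
): void {
  let previous = new Map<string, T>();
  setInterval(async () => {
    const current = new Map((await runQuery()).map(doc => [doc._id, doc] as [string, T]));
    const added = [...current.values()].filter(doc => !previous.has(doc._id));
    const removed = [...previous.values()].filter(doc => !current.has(doc._id));
    if (added.length || removed.length) publish(added, removed);
    previous = current;
  }, intervalMs);
}
```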

Seeing the considerable downsides of poll-and-diff, the Meteor developers came up with an alternative approach that has a different set of trade-offs.

Oplog Tailing

As described above, a Meteor app server is able to maintain a query result by itself — given it receives all relevant write operations. However, since a write operation is only visible to the app server that receives it, multi-server deployments require a mechanism that informs all app servers about write operations received by the others. The poll-and-diff approach effectively does that, but introduces high baseload by executing the same query over and over. Acknowledging this problem, oplog tailing was introduced as an alternative solution that uses MongoDB’s replication protocol to feed the full change stream to every app server.

So what is the oplog and how do you tail it? MongoDB achieves write scalability by distributing data across different partitions (a.k.a. shards). In a production deployment, data within a partition is kept redundantly in a so-called replica set: Write operations within a partition are first applied by the primary and then delivered to the secondaries. Internally, the primary logs all writes in a capped collection — a ring buffer called the oplog — and the secondaries follow along using tailable cursors.
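
To give an impression of what tailing looks like, here is a simplified sketch using the Node.js MongoDB driver (not Meteor’s implementation); a production version would resume from the last processed timestamp and interpret the oplog entry format:

```typescript
import { MongoClient } from 'mongodb';

// Simplified oplog tailing (a sketch, not Meteor's implementation): open a tailable
// cursor on the primary's oplog and process every write operation in the cluster.
async function tailOplog(handleEntry: (entry: unknown) => void): Promise<void> {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const oplog = client.db('local').collection('oplog.rs'); // the capped oplog collection
  // A real implementation would filter on the ts field to resume where it left off.
  const cursor = oplog.find({}, { tailable: true, awaitData: true });
  // The bottleneck: every app server has to keep up with the combined write
  // throughput of all shards, not just with a single replica set.
  for await (const entry of cursor) handleEntry(entry);
}
```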

The basic idea of oplog tailing is to have the Meteor app servers hook into MongoDB replication as though they were secondaries.

Oplog tailing: Each Meteor app server receives all MongoDB writes — the MongoDB cluster will scale with write throughput, but your app will not.

As illustrated above, each Meteor app server taps into each primary’s oplog and thus will never miss a single update operation. This setup makes periodic polling obsolete and eliminates the staleness inherent to poll-and-diff. But in doing so, it introduces the app server as a bottleneck for write throughput, because this is where Meteor deviates from the way MongoDB itself uses the oplog: While each MongoDB secondary has to keep up with only one primary, each Meteor server has to keep up with the combined throughput of the entire MongoDB cluster.

As a consequence, MongoDB can scale with write throughput, but Meteor can’t.

And this is not just some theoretical limit; it’s a known issue that load spikes can saturate and take down a Meteor app server when oplog tailing is enabled. To address this particular issue, poll-and-diff is used as a fallback strategy whenever an app server falls behind on the oplog. But this obviously comes at the cost of potential staleness and does not help if there are more than a few active real-time queries. To get maximum performance, it is therefore officially recommended to carefully consider each single query and decide whether to enable or disable oplog tailing.

What game does Bobby play? To find out, Meteor has to ask MongoDB.

As a side note, oplog tailing does not completely eliminate the need to contact the database, either. As an example, consider a real-time query for the top-3 baccarat players as shown in the illustration. The maintaining server receives a partial update to a previously unknown player: Bobby now has a higher score than even the best baccarat player. However, it is unknown whether Bobby is a baccarat player, because the oplog does not tell: It only contains information on Bobby’s new score (500), but not the associated game. So to decide whether the query is affected by Bobby’s new high score, Meteor has to query MongoDB for Bobby’s game first.

Wrap-Up

In summary, Meteor offers two implementations of real-time queries:

  • Poll-and-diff can be laggy and only works when few real-time queries are active.
  • Oplog tailing is only feasible for low write throughput as it effectively circumvents the database’s sharding mechanism.
  • Neither approach works for many users and moderate or even high throughput.

RethinkDB

RethinkDB is a JSON document store with query expressiveness comparable to MongoDB. However, it is a completely independent project that seems more ambitious than MongoDB in some areas, offering advanced features like join queries and linearizable reads and writes. Among the more interesting specialties of RethinkDB are changefeeds — event stream filter queries — which we are going to detail next.

Changefeeds

RethinkDB’s take on change discovery is actually pretty similar to Meteor’s oplog tailing: The recommended approach for employing changefeeds is to colocate each application server with a special RethinkDB process that does the real-time matching, as illustrated below.

Each RethinkDB proxy has to handle all write operations hitting the cluster. High throughput is therefore infeasible.

This process is called RethinkDB proxy and is meant to reduce messaging overhead within the cluster and reduce processing load on the primary storage nodes. Just like in the Meteor architecture, there is sharding on the database level for collections, but not for real-time workloads. In other words, every RethinkDB proxy will receive the change stream of all RethinkDB shards, thus bounding overall system throughput. Due to this write bottleneck, of course, RethinkDB is subject to the same performance limitations as oplog tailing: Changefeeds cannot scale beyond the capacity of a single node and will impair app usability under high load. In contrast to Meteor, RethinkDB does not rely on external technical artifacts such as the oplog. Thus, internal change propagation is more elegant and does not require additional roundtrips to fill informational gaps — as is the case with oplog tailing. But this does not change the fact that the overall architecture does not scale.
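
From the client’s perspective, a changefeed subscription looks roughly as follows; this sketch uses the RethinkDB JavaScript driver, and the players table with its score index is made up for the example:

```typescript
import * as r from 'rethinkdb';

// Changefeed sketch: subscribe to the top-3 players by score.
// Assumes a players table with a secondary index on score.
async function watchTopPlayers(): Promise<void> {
  const conn = await r.connect({ host: 'localhost', port: 28015 });
  const cursor = await r
    .table('players')
    .orderBy({ index: r.desc('score') })
    .limit(3)
    .changes({ includeInitial: true }) // also emit the current top-3 upfront
    .run(conn);
  cursor.each((err, change) => {
    if (err) throw err;
    // Each change carries the old and the new state of the affected document;
    // maintaining the ranked list itself is left to the application.
    console.log(change.old_val, change.new_val);
  });
}
```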

There are few reports from RethinkDB users on their experience with the scalability of changefeeds. However, SageMathCloud, one of the most enthusiastic RethinkDB adopters, moved from RethinkDB to PostgreSQL for a reactive web application, because a makeshift SQL solution outperformed RethinkDB changefeeds by an order of magnitude.

Wrap-Up

RethinkDB offers a real-time query implementation similar to Meteor’s oplog tailing:

  • Write throughput for RethinkDB’s real-time queries does not scale as it is bottlenecked by single-node capacity.
  • Since there is no poll-and-diff-like approach to emulate real-time query behavior, some query features are not available for real-time queries. For example, changefeeds for sorted queries support a limit, but no offset.
  • The client API has no self-maintaining queries. In order to keep a query result up-to-date, the application logic has to implement result maintenance on top of an event stream query, as sketched below.
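
To illustrate what that means in practice, here is a minimal sketch of such application-side bookkeeping; the event shape follows RethinkDB’s old_val/new_val convention:

```typescript
interface ChangeEvent<T> { old_val: T | null; new_val: T | null }

// Application-side result maintenance on top of an event stream query (sketch):
// the client fetches the initial result itself and then folds change events into it,
// bookkeeping that a self-maintaining query would hide inside the database.
class MaintainedResult<T extends { id: string }> {
  private result = new Map<string, T>();

  constructor(initial: T[]) {
    for (const doc of initial) this.result.set(doc.id, doc);
  }

  apply(event: ChangeEvent<T>): void {
    if (event.new_val === null && event.old_val !== null) {
      this.result.delete(event.old_val.id);              // document left the result
    } else if (event.new_val !== null) {
      this.result.set(event.new_val.id, event.new_val);  // document entered or changed
    }
  }

  current(): T[] { return [...this.result.values()]; }
}
```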

Parse

Similar to Meteor, Parse is an app development framework that uses MongoDB as its backing store. It was immensely popular and had one of the largest MongoDB deployments worldwide a few years ago, so there is much to be said about Parse and its infrastructure. While Meteor and RethinkDB have provided real-time queries since 2012 and 2014, respectively, Parse announced the feature in March 2016 — after announcing its own shutdown. In this section, we highlight the Live Query feature that brings event stream queries to the Parse platform.

Live Queries

Without any support for sorting, Parse’s event stream queries are less powerful than those of the competition we saw earlier, even though the architecture itself is pretty similar:

The Parse LiveQuery architecture also does not scale with write throughput, since updates are broadcast. (Illustration taken from the project’s documentation.)

Just like in Meteor (with oplog tailing) and RethinkDB, each real-time query is maintained by a single process, the LiveQuery Server. Likewise, single-server matching performance is the hard limit for write throughput. In contrast to Meteor and RethinkDB, though, Parse does not hand updated objects directly from the application server to the LiveQuery server. Instead, a Redis message queue acts as broker in the middle, which is another theoretical bottleneck.
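
In client code, a Live Query subscription looks roughly like this; the sketch uses the Parse JavaScript SDK, and the Player class with its fields is made up:

```typescript
import Parse from 'parse/node';

// Live Query sketch: subscribe to all players with a score above 400.
// The Player class and its fields are made up; SDK initialization (app ID,
// server URL, LiveQuery server URL) is omitted for brevity.
async function subscribeToHighScores(): Promise<void> {
  const query = new Parse.Query('Player');
  query.greaterThan('score', 400);

  // There is no initial result and no ordering, just a stream of change events.
  const subscription = await query.subscribe();
  subscription.on('create', player => console.log('new match:', player.get('name')));
  subscription.on('enter', player => console.log('entered result:', player.get('name')));
  subscription.on('leave', player => console.log('left result:', player.get('name')));
  subscription.on('delete', player => console.log('deleted:', player.get('name')));
}
```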

We are not aware of any major projects using Parse Live Queries and consequently can’t quote first-hand reports on scalability.

Wrap-Up

The Parse LiveQuery architecture bears resemblance to Meteor’s oplog tailing and RethinkDB changefeeds:

  • Parse’s real-time queries do not scale with write throughput as all updates have to go through each LiveQuery server.
  • Real-time query expressiveness is limited in comparison to static queries: Sorted real-time queries are not supported at all.
  • Like RethinkDB, Parse has no self-maintaining queries. Live-updating results therefore have to be built with the event stream query interface.

Firebase

Even though it is advertised as a real-time database, we think Firebase is best described as a service for cross-device state synchronization. Unlike other JSON-based datastores, a Firebase instance is not a collection of JSON documents, but one single JSON document: a cloud-hosted tree structure of nested objects and lists. In order to access data, you essentially navigate through the hierarchy and request specific child nodes for which you then receive immediate updates when other users modify the data. Thus, Firebase is natively real-time. Another plus is that, in contrast to its competition, there is no indication that Firebase scales poorly, if — and here is the catch — you can break your application requirements down to the simplistic access model Firebase exposes.

Your Firebase application model is basically one big cloud-hosted JSON document. (Illustration taken from the Firebase Blog.)

The problem is not so much the tree-structured data model, but rather the lack of support for non-trivial queries. The only way to deviate from access by key is to apply a simple filter (no logical AND/OR) or to enforce order on a single attribute. If your use case requires more sophistication, the Firebase team recommends workarounds like introducing artificial attributes (e.g. the composite of two other attributes) or retrieving a superset of all relevant data and doing the actual query processing in the client. Likewise, the Firebase team recommends avoiding nesting, because it makes fine-granular access control impractical and data retrieval more expensive: Since Firebase only allows fetching data nodes as a whole (i.e. including their child nodes), high-level nodes in a deeply nested structure become more bloated as their child count grows. However, flattening out your data structures takes away even more expressiveness, since one-to-one and one-to-many relationships are naturally expressed through nesting in the document data model.
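
For illustration, here is a sketch of typical data access with the Firebase Web SDK; the database URL and the data layout are made up:

```typescript
import firebase from 'firebase/app';
import 'firebase/database';

// Sketch using the namespaced Firebase Web SDK; URL and data layout are made up.
firebase.initializeApp({ databaseURL: 'https://my-app.firebaseio.com' });
const players = firebase.database().ref('players');

// One order-by on a single attribute plus a simple range filter is the ceiling of
// query expressiveness; there is no logical AND/OR across attributes.
players
  .orderByChild('score')
  .limitToLast(3)
  .on('value', snapshot => {
    snapshot.forEach(child => { console.log(child.key, child.val()); });
  });

// Typical workaround for a compound condition ("game is baccarat, ordered by score"):
// maintain an artificial composite attribute such as game_score and query on that.
```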

And this is where we think Firebase does not scale: complexity. You may denormalize data in a few instances or evaluate one particular query in the client, but this becomes inefficient quickly. If you have non-trivial application semantics, you might be better off just using a feature-packed and battle-tested SQL database and shoehorning reactive behavior on top of it with trigger magic. In both cases, you end up wrangling with your tool rather than your problem.

Wrap-Up

While the architecture of Firebase is not disclosed, the trade-offs behind it are:

  • Firebase is scalable and supports many users and high throughput.
  • On the downside, it allows no complex queries.
  • The API is purely event-based; there are no self-maintaining queries.

Summary

The table below sums up the respective capabilities of each system detailed in this article:

A direct comparison of the different real-time query implementations covered in this article.

Meteor, RethinkDB and Parse provide complex real-time queries, but present scale-prohibitive bottlenecks in their respective architectures. Firebase may well be scalable up to many users and many write operations per second, but avoids complexity and thus primarily lends itself to simple use cases.

Baqend, on the other hand, provides complex real-time queries with a scalable architecture and thus combines the benefits of all competing real-time databases. The architecture behind Baqend Real-Time Queries is detailed in the last section of this article.

Other Systems

We think the systems discussed so far are basically as close as you get to push-based database queries. But nonetheless, there are systems that do provide other forms of push-based access to data and therefore shouldn’t be left out. If you don’t want to read about SQL databases, datastream management and stream processing, skip ahead to the discussion. Otherwise, read on to find out what exactly differentiates real-time databases from the rest.

Different classes of information systems and their underlying notions of data.

As illustrated above, we think these systems can be classified by the way they facilitate access to data. At one extreme, only static collections are considered, persistent and strongly consistent; this is basically the SQL way that database management systems follow. At the other extreme, the stream processing end, data is in flux and mostly not persisted (ephemeral streams) unless the application specifies otherwise. Real-time databases and datastream management systems stand in the middle, but can be distinguished by the semantics they apply: Real-time databases (the systems detailed above) work on evolving collections that are distinguished from their static counterparts through continuous integration of updates over time, exposed through real-time queries. Datastream management systems, as the name implies, provide APIs to query data streams (as in “not collections”), for example by filtering specific data or computing rolling aggregations and joins over application-defined time windows.

In the following, we’ll go into more detail on the individual classes.

Database Management

Around the 1960s, databases were designed as passive repositories that accept, modify or retrieve structured data as a direct response to an explicit request. In other words, they were basically meant to represent the current snapshot of application data and answer queries with respect to it. Active mechanisms like triggers or event condition action (ECA) rules were added in order to allow modeling behavioral aspects of a domain in addition to structural ones (see Active Databases). Some systems even considered hard timing constraints in addition (see Real-Time Active Databases). But the core idea was and still is to separate users from one another and not have them interact through side effects. Rather than send change notifications to a client, triggers and ECA rules are typically used to enforce consistency constraints or apply business rules to database state through database programming alone (i.e. without an application server).

Implementing demanding reactive applications on top of these mechanisms, however, seems infeasible. For one thing, there is no interface by which a client could subscribe to a query result, so tinkering is mandatory. More severely, though, checking event conditions, evaluating queries and all other processing is done on the same machine that handles every other request. Given the limited scalability of these systems, high throughput or a great number of triggers and rules will quickly exhaust system resources.

The closest thing to a self-maintaining result in the realm of traditional databases is probably a materialized view, but this really cannot be used to give the client — say, a mobile phone user — an up-to-date view of her critical data. The idea of a materialized view is rather to eagerly maintain one specific query that is requested very often, so that it doesn’t have to be reevaluated over and over again. It’s essentially a way to improve performance for pull-based queries. To further boost efficiency, a materialized view is often allowed to lag behind and is periodically updated with throughput-optimized batch operations. Consequently, materialized views are not suitable to provide low-latency updates and it is also infeasible for traditional database systems to maintain a great number of them in parallel, because computational overhead quickly gets out of hand.

Using mechanisms related to change detection in materialized views, some relational database systems are able to send change notifications when previously requested data might have been altered (see for example Oracle, Microsoft, PostgreSQL). At first glance, this may appear similar to what event stream queries provide, but change notification messages in RDBMSs do not carry the changed data itself: They only transmit identifiers (e.g. row IDs, table names or query identifiers) and some information on what happened (e.g. the type of operation or a text message). Therefore, supposedly changed data items or queries have to be requested again after a notification has been received. Further, most implementations exhibit occasional false positives, so you never know whether revalidation is actually required or just a waste of resources. Lastly, the same performance caveats as for materialized view maintenance apply to change notifications: In scenarios with high write throughput or many unique queries, some vendors explicitly discourage using their notification feature and instead recommend workarounds, for example using triggers or implementing sophisticated middleware for change detection.
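
To illustrate the difference to real event stream queries, here is a sketch of this pattern using PostgreSQL’s LISTEN/NOTIFY via the node-postgres client; the channel name, table and trigger are assumptions made for this example:

```typescript
import { Client } from 'pg';

// Sketch of PostgreSQL's LISTEN/NOTIFY pattern with node-postgres. A trigger on the
// watched table is assumed to call pg_notify('player_changes', NEW.id::text).
async function listenForChanges(): Promise<void> {
  const client = new Client(); // connection settings are taken from environment variables
  await client.connect();
  await client.query('LISTEN player_changes');

  client.on('notification', async msg => {
    // The notification carries only an identifier, not the changed data itself,
    // so the application has to re-read the row (or reevaluate its query).
    const { rows } = await client.query('SELECT * FROM players WHERE id = $1', [msg.payload]);
    console.log('changed row:', rows[0]);
  });
}
```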

Real-Time Databases

These are the systems at the center of this article. While traditional databases are targeted at providing a consistent snapshot of the application domain, real-time databases acknowledge that data may evolve. Both the architectures and the client APIs of real-time databases reflect that facts can change over time and that the system may have to enhance or correct previously issued information.

A real-time database may allow issuing snapshot (one-time) queries on a collection or it may provide an interface to directly access the stream of update operations. But the defining property is that real-time queries are formulated as though they were evaluated on static data collections, even though their response — the self-maintaining result or the event stream — reflects ongoing data alterations.

The most prominent real-time databases are covered in this article, but there are a few peer systems that should not go unmentioned:

  • Realm is an embedded database often compared with SQLite that provides cross-device synchronization and collection-based real-time queries. Write operations are executed locally and synchronized with the Realm Object Server which broadcasts them to all clients. Since every client replicates the entire database, reads are always executed locally as well and return very fast. At the same time, however, this scheme introduces the client — e.g. a notebook, tablet or Android phone — as a bottleneck for processing as well as storage, and the Object Server as a bottleneck for change propagation. In consequence, Realm real-time queries are only feasible in domains with small data sets and low update throughput. (See this podcast for details.)
  • OrientDB allows you to filter newly written objects by a query predicate through so-called Live Queries. Semantically, they appear somewhat collection-based, since they handle inserts, updates and deletes. But in contrast to systems like Meteor, OrientDB’s Live Queries only react to ongoing write operations and do not deliver initially matching items. To maintain an up-to-date view of your query result, you have to combine the pull-based query mechanism (result) with the push-based mechanism (updates).
  • CouchDB is another contender that has an API for continuous changes. Like in OrientDB, the initial result of your query has to be requested separately. In contrast to many systems discussed in this article, though, CouchDB only pushes matching items to the client (i.e. you won’t get delete notifications). In consequence, self-maintaining queries are complex to implement and require the application to maintain query state.
  • Similar to OrientDB’s Live Queries, Graphcool subscriptions let you filter the change stream by custom criteria. As with OrientDB’s Live Queries, the matching process does not consider a query result, but only the data item that is being written. As another similarity, Graphcool subscriptions also do not provide the initially matching items.
  • Lastly, real-time APIs for MongoDB and Stitch, a cloud backend by the MongoDB creators, are rumored to become available later this year. As far as we are aware, their real-time features are still in the conception phase and it is not yet clear whether they will provide full-fledged collection-based real-time query semantics or mere stream-based filtering.

Data Stream Management

In contrast to real-time databases, datastream management systems are designed to filter out and analyze relevant information from a continuous flow of data like sensor feeds or the Twitter firehose. Depending on the system, the focus may be SQL-like analytics on data properties (cf. PipelineDB) or temporal, local and even causal relationships between events (cf. CEP engines like Esper). But the stream in itself is ephemeral and only specified information is persisted, if any at all. Real-time databases, in contrast, always apply the incoming update stream to a collection.

When the underlying data stream is a database change log, a datastream management system can provide functionality similar to self-maintaining queries. But since stream history is not accessible per se, there are significant differences: For one thing, the system cannot produce an initial result for the lack of historical data and therefore result semantics diverge. But what’s more, sorting by an arbitrary attribute is impossible for the same reason and, consequently, order-by queries on a stream are usually only permitted when sorting by a monotonic attribute (think: timestamp). For example, it’s impossible to maintain an up-to-date user list that is sorted by last name, because the system lacks knowledge of every user.

Stream Processing

Stream processing is the latency-oriented counterpart to batch processing and should rather be considered a programming paradigm than an alternative to real-time databases. However, you can access information in streams in the most generic of ways (through writing an application) and you could implement a real-time database on top — which is what we at Baqend did. In the last section of this article, you can find more details about how we use stream processing to provide scalable real-time queries.

For more details, see our stream processing survey:

Discussion

The table below sums up our view of the current landscape of systems that are capable of providing push-based data access in one form or another.

An overview of systems providing push-based data access.

Traditional (SQL) database management systems provide an abundance of features for applications based on request-response interaction, but maintaining query results on a per-user basis is not what they have been designed for. Few SQL systems provide active features beyond simple triggers and, to the best of our knowledge, none have an interface for self-maintaining or event stream queries similar to what real-time databases provide. Data stream management systems or stream processing engines are expressive and scalable at the same time, but do not directly provide you with collection-based real-time queries, either; instead, they work on streams and require you to adapt to specific semantics.

Meteor, RethinkDB, Parse and Firebase deserve credit for the way they acknowledge evolving data and the requirement to keep the client up-to-date. However, there is not a single system that carries non-trivial pull-based query features over to the push-based paradigm without severe compromises; developers always have to weigh a lack of expressiveness against the presence of hard scalability bottlenecks. While Firebase avoids complexity altogether, the others make an effort to provide rich push-based queries — and ultimately fail to scale because of their design. State-of-the-art real-time queries are bounded by single-node capacity, and there will never be a fix for these systems other than going back to the drawing board.

It does not make sense that Meteor, RethinkDB and Parse partition their data collections, but not their data streams.

But here’s the good news: As you will see next, these bottlenecks are only implementation artifacts and not inherent to the challenge of providing real-time queries. In the final part of our article, we describe how Baqend dodges the scalability problems all other available systems fall prey to.

Baqend: Scalable Real-Time Queries

The core problem in the design of Meteor, RethinkDB as well as Parse is the lack of object partitioning. (Firebase avoids this bottleneck by simply not offering complex queries to begin with.) At Baqend, we developed a scalable design for real-time databases that combines query and object partitioning to provide unparalleled scalability and extremely high performance, with respect to both latency and throughput. Our real-time architecture is called InvaliDB — because we also use it to invalidate database queries for query caching. (If you’re eager to learn more about our approach towards caching dynamic data, see our paper or have a chat with us at VLDB 2017 next month!)

One of InvaliDB’s main design goals is isolating the Baqend app servers from any possible side effects of the resource-intensive query maintenance process. To this end, the real-time component is deployed as a separate system that communicates asynchronously with the Baqend app servers as depicted below.

Baqend isolates real-time workload from the application server.

In a nutshell, information is exchanged between the app servers and the real-time component on the following occasions:

  • query subscription (red arrows): Whenever a user subscribes to a real-time query, a Baqend app server executes the query in MongoDB to get the initial result. This result and the query are asynchronously handed to InvaliDB which maintains the result. The client gets the initial result as well as updates (which will be detected by InvaliDB). Likewise, when a real-time query is canceled, the cancellation request is asynchronously passed to the real-time component, so that the query can be deactivated and does not consume further resources.
  • write operation (green arrows): For every insert, update or delete operation, the full after-image of the written object is handed to InvaliDB which then matches the after-image against all active real-time queries. Using full after-images, InvaliDB can maintain query results without additional database queries (as opposed to oplog tailing).
  • query matching (blue arrows): Whenever a write operation changes the result of any currently active real-time query, the real-time component asynchronously sends a notification to the registered app servers, which is then forwarded to the listening clients. To make this possible, the real-time component matches every single incoming write operation against each currently active real-time query.

Matching all updates against all active real-time queries demands extreme scalability from the real-time query processing engine. InvaliDB achieves it through a highly flexible workload distribution scheme as illustrated below.

Baqend’s real-time queries are scalable, because they partition workload both by query and by object.

In order to remove individual servers as bottlenecks and avoid hot key-space partitions, InvaliDB employs a two-dimensional hash-partitioning scheme that distributes not only queries, but also written objects evenly across all matching nodes. The example illustration uses three object partitions and three query partitions, so that the entire InvaliDB cluster comprises nine matching nodes. Each of these nine nodes is responsible for one third of the queries and only one third of all updates. To scale with the number of simultaneous real-time queries (i.e. users), we can simply add query partitions. In similar fashion, we can add object partitions to increase sustainable write throughput. The bottom line: partitioning by query and by object makes query matching feasible even for demanding workloads that combine high throughput with many concurrent real-time queries.

Our implementation is based on Apache Storm and provides latency in the realm of single-digit milliseconds. The query engine is pluggable and supports MongoDB as the default query language. Baqend currently supports sorted and unsorted filter queries with limit and offset. Streaming joins and streaming aggregations are available through our RxJS client interface. Both self-maintaining queries and event stream queries are natively supported.
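
To make the scheme more tangible, here is a simplified sketch of the routing idea; it is not InvaliDB’s actual code, and the hash function and node layout are made up for illustration:

```typescript
// Simplified sketch of two-dimensional workload partitioning (not InvaliDB's actual
// code; the hash function and node layout are made up). With Q query partitions and
// O object partitions, the cluster comprises Q x O matching nodes.
const QUERY_PARTITIONS = 3;
const OBJECT_PARTITIONS = 3;

// Stable toy hash; a real system would use a proper hash function.
const hash = (s: string): number =>
  [...s].reduce((h, c) => (h * 31 + c.charCodeAt(0)) >>> 0, 17);

// A query is registered on the nodes of its query partition (one node per object
// partition), so each node is responsible for only 1/Q of all queries.
function nodesForQuery(queryId: string): Array<[number, number]> {
  const q = hash(queryId) % QUERY_PARTITIONS;
  return Array.from({ length: OBJECT_PARTITIONS }, (_, o) => [q, o] as [number, number]);
}

// A written object is sent to the nodes of its object partition (one node per query
// partition), so each node only sees 1/O of all writes. Every (query, write) pair
// still meets on exactly one node: the one at (query partition, object partition).
function nodesForWrite(objectId: string): Array<[number, number]> {
  const o = hash(objectId) % OBJECT_PARTITIONS;
  return Array.from({ length: QUERY_PARTITIONS }, (_, q) => [q, o] as [number, number]);
}
```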

The beauty of this approach lies not only in the benefit of high scalability, but also in a separation of concerns between request-response and real-time workloads: Meteor, for example, always colocates real-time query maintenance with the application server and ultimately crashes when the real-time subsystem becomes a bottleneck. Baqend, on the other hand, decouples your application server from real-time matching.

Real-time queries in Baqend are maintained in a separate system to minimize overhead within the app server.

To provide fault-tolerance, InvaliDB is deployed as a separate system next to the app server and consequently will not take down your app in an overload scenario. Queries and updates are passed asynchronously to InvaliDB and events are sent back in asynchronous fashion as well. Seeing that a single query subscription devours considerable resources, we consider it paramount for service reliability to keep critical functionality like the web server and primary storage online, even if the real-time feature is under pressure.

Try It!

If you are curious about the Baqend Real-Time Query feature, read how to get started with our public beta:

Don’t want to miss our next post on real-time databases and Big Data topics? Get it conveniently delivered to your inbox by joining our newsletter.

Distributed systems engineer at Baqend, a serverless backend for faster websites. Background in database research & developing Baqend’s real-time query engine.