review of MongoDB storaage MMAPv1 and WiredTigerIn this post, we’ll take a look at the differences between the MMAP and WiredTiger engines in MongoDB®. I’ve been asked this question by customers many times, and this blog is for you! We’ll tell you about the key features of these engines, then you can choose the right engine based on your requirement.

In MongoDB, we mainly use the MMAPV1 and WiredTiger engines. We could use other engines like in-Memory, rocks db with Percona Server for MongoDB (PSMDB), and in-memory engine with MongoDB Enterprise version. When MongoDB was introduced, MMAPV1 was the default engine and it’s still a part of the MongoDB releases, though it will not be seen from 4.2 as per MongoDB’s plan. Those who remember the days working with version 1.8 might miss this, even though they don’t use MMAP currently! MongoDB acquired wiredTiger Inc (see here) and from version 3.2 made it the default engine of MongoDB. This engine enabled the introduction of transactions with multi-documents, and is mainly used for features such as compression and document level locking. Here we’ll see the key features of wiredTiger and MMAPV1, and also present them in a tabular column at the end – who doesn’t love a table to check quickly the differences! It reminds me my school days :-)). My co-author, and friend – Aayushi feels the same?! 🙂

Some differences in detail

Storage Engines

The MongoDB storage engines manage BSON data in memory and on disk to support read and write operations.

MMAPV1:  This is the original storage engine for MongoDB, introduced in the first release, but from version 4.0 it is deprecated

WiredTiger:  This is the pluggable engine introduced by MongoDB in version 3.0 and it became the default storage engine from version 3.2

Data compression

MMAPV1: does not support data compression and it is based on memory mapped files. So it works well when you can keep your writeset in memory. It excels at workloads with high volume inserts, reads, and in-place updates.

WiredTiger: supports snappy and zlib compression. Consequently, MongoDB with WiredTiger takes very little space comparing with MMAP. It has its own write-cache and a filesystem cache.

  • Snappy: This is the default algorithm,  efficient computation with reasonable compression. See here.
  • Zlib: higher compression rate at the cost of CPU. See here.

Data Directory

Let’s take a look at the file system supporting the same data and replica set member for each of the engines. 

MMAPV1:

WiredTiger:

Journaling

MMAPV1: Ensures that writes are atomic.  If MongoDB goes down or terminates before committing changes to the data files, MongoDB can use the journal files to apply the write operation to the data files and maintain a consistent state.

WiredTiger: This uses checkpoints between writes and the journal persists all data modifications between checkpoints. So for any recovery from database crash or abrupt termination, it uses journal entries since the last checkpoint. In most cases, journal is not necessary for this engine and you enable it only if you need to be sure to recover until the last successful write before the crash from the journal. Otherwise, usually MongoDB can recover from the last valid checkpoint. Checkpoint occurs every minute by default. 

Journal directory

This is how journal files appear in the data directory for the different engines:

MMAPV1:

WiredTiger:

Locks and concurrency

MMAPV1

  • Up until version 2.6: uses a readers-writer [1] lock that allows concurrent reads access to a database, but gives exclusive access to a single write operation. When a read lock exists, many read operations may use this lock. However, when a write lock exists, a single write operation holds the lock exclusively, and no other read or write operations may share the lock.
  • From 3.0: The MMAPv1 storage engine uses collection level locking as of the 3.0 release series, an improvement on earlier versions in which the database lock was the finest-grain lock.

WiredTiger: supports document level locking. For most read and write operations, WiredTiger uses optimistic concurrency control. WiredTiger uses only intent locks at the global, database, and collection levels.

For example: deleting documents from the collection “testData” for a value of {x:1}, will acquire write “LOCK” at collection level differently for each of the storage engines.

MMAPV1:

where w = Represents Exclusive (X) lock

WiredTiger:

where w = Represents Intent Exclusive (IX) lock

Memory

MMAPv1: MongoDB automatically uses all free memory on the machine as its cache. System resource monitors show that MongoDB uses a lot of memory, but its usage is dynamic. If another process suddenly needs half the server’s RAM, MongoDB will yield cached memory to the other process.

Technically, the operating system’s virtual memory subsystem manages MongoDB’s memory. This means that MongoDB will use as much free memory as it can, swapping to disk as needed. Deployments with enough memory to fit the application’s working data set in RAM will achieve the best performance.

WiredTiger: with wiredTiger, MongoDB utilizes both the WiredTiger internal cache and the filesystem cache. Via the filesystem cache, MongoDB automatically uses all free memory that is not used by the WiredTiger cache or by other processes. Starting in 3.4, the WiredTiger internal cache, by default, will use the larger of either:

  • 50% of (RAM – 1 GB), or
  • 256 MB.

Quick reference: MMAPV1 vs WiredTiger

Use this table for a quick reference to the differences between MMAPv1 and WiredTiger

Key FeatureMMAPV1wiredTiger
Introduction & Default EngineIntroduced with MongoDB from scratch and default engine till 3.0 version. Deprecated in 4.0 and will be removed in futureIntroduced in 3.0 version and default from 3.2 version
Data CompressionDoesn’t support compressionCompression with default snappy compression method and zlib compression method. So occupy less space than MMAPV1 engine
JournalingMongoDB writes the in-memory changes first to on-disk journal files. If MongoDB goes down/terminates before committing the changes to the data files, MongoDB can use the journal files to apply the write operation to the data files and maintain a consistent state.The WiredTiger journal persists all data modifications between checkpoints. If MongoDB exits between checkpoints, it uses the journal to replay all data modified since the last checkpoint.
Locks & ConcurrencyTill 2.6, MongoDB uses a readers-writer [1] lock that allows concurrent reads access to a database but gives exclusive access to a single write operation. From 3.0, uses collection level lockIt supports document level locking.
TransactionOperation on a single document is atomicMulti-document transactions are only available for deployments from version 4.0
CPU Performanceadding CPU cores does not improve performance muchperforms better on multicore systems
EncryptionEncryption is not possibleEncryption at rest is available with MongoDB enterprise and as BETA in PSMDB 3.6.8
Memoryautomatically uses all free memory on the machine as its cacheUses internal cache and filesystem cache
UpdatesIt excels at workloads with high volume inserts, reads, and in-place updates.Does not support in place updates. It causes the whole document to rewrite
TuningLess chance to tune itAllows more tuning with this engine through different variables. Eg: cache size, read / write tickets, checkpoint interval etc

Conclusion

The above information does not cover every difference between MMAPV1 and WiredTiger, but it lists the key differences. If you have any key features to add, please feel free to add in the comments! Let’s share and let everyone know about them 🙂


Photo by Mathew Schwartz on Unsplash

1 Comment
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
codediscovery

Great breakdown on the differences between MMAP1 and WiredTiger!