- Pluggable Storage Engine
- Document-level Locking
- Compression
For a complete list of features, refer to MongoDB's release notes.
Pluggable Storage Engine
Prior to v3.0, MongoDB ran only on the MMAPv1 storage engine. After acquiring WiredTiger, MongoDB developed a pluggable storage engine API, which lets it run on different storage engines.
List of storage engines:
Storage Engines | Status | Developed By |
---|---|---|
MMAPv1 | Supported | MongoDB |
WiredTiger | Supported | MongoDB |
In-Memory | In development | MongoDB |
RocksDB | In development | RocksDB |
InnoDB | In development | InnoDB |
FusionIO | In development | FusionIO |
HDFS | In development | Hadoop |
... | ... | ... |
Pluggable storage engines open up new possibilities in replica set distribution. Each member of a replica set can run on a different storage engine while sharing the same JSON data model. In an example replica set, different members can run on:
- WiredTiger for write-heavy workloads
- In-memory for extremely high throughput
- HDFS for integration with a Hadoop cluster
- FusionIO, backup engine, etc.
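As a rough sketch of what a mixed-engine replica set could look like at startup, the snippet below builds one `mongod` command line per member. The replica set name, ports, and data paths are made up for illustration, and it sticks to the two engines listed as supported above (`wiredTiger` and `mmapv1`).

```python
import shlex

# Hypothetical replica set layout: ports and data paths are illustrative only.
MEMBERS = [
    {"port": 27017, "dbpath": "/data/rs0-a", "engine": "wiredTiger"},
    {"port": 27018, "dbpath": "/data/rs0-b", "engine": "wiredTiger"},
    {"port": 27019, "dbpath": "/data/rs0-c", "engine": "mmapv1"},
]


def mongod_args(member, repl_set="rs0"):
    """Build the mongod argument list for one replica set member."""
    return [
        "mongod",
        "--replSet", repl_set,
        "--port", str(member["port"]),
        "--dbpath", member["dbpath"],
        "--storageEngine", member["engine"],
    ]


def command_lines(members=MEMBERS):
    """Return one shell-quoted command line per member."""
    return [" ".join(shlex.quote(arg) for arg in mongod_args(m)) for m in members]


if __name__ == "__main__":
    for line in command_lines():
        print(line)
```

Each member picks its own engine with `--storageEngine`; the replica set name is the only thing they have to share here.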
Document-level Locking
MongoDB was notorious for locking at the database level for all write activity, so throughput suffered under write-heavy workloads. The usual workarounds were to distribute writes across multiple databases or across a sharded cluster. With the WiredTiger engine in v3.0, MongoDB is able to lock at the document level, which improves write-heavy applications.
In addition, MongoDB v3.0 with the default MMAPv1 engine is able to lock at the collection level. It is also an improvement over the previous database-level lock.
WiredTiger ships with the Btree algorithm as its default; the LSM algorithm is available as a configurable option.
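To see why lock granularity matters, here is a toy model (not MongoDB's actual lock manager) that counts how many writes end up queued behind the same lock at each level. The batch of writes is made up for illustration.

```python
from collections import Counter

# Hypothetical batch of concurrent writes: (database, collection, document id).
WRITES = [
    ("shop", "orders", 1),
    ("shop", "orders", 2),
    ("shop", "users", 7),
    ("shop", "users", 7),
    ("logs", "events", 3),
]


def lock_key(write, granularity):
    """Return the resource a write has to lock at the given granularity."""
    db, coll, doc_id = write
    if granularity == "database":
        return (db,)
    if granularity == "collection":
        return (db, coll)
    if granularity == "document":
        return (db, coll, doc_id)
    raise ValueError(f"unknown granularity: {granularity}")


def contention(writes, granularity):
    """Return (number of independent locks, longest queue behind one lock)."""
    queues = Counter(lock_key(w, granularity) for w in writes)
    return len(queues), max(queues.values())


if __name__ == "__main__":
    for level in ("database", "collection", "document"):
        print(level, contention(WRITES, level))
```

With this batch, four of the five writes serialize behind the `shop` database lock, while at the document level only the two writes to the same document have to wait for each other.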
- Read heavy use case: Btree > LSM
- Write heavy use case: LSM > Btree
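One way this choice might be expressed per collection is through the `storageEngine` option of the `createCollection` command, which hands a configuration string to WiredTiger. The sketch below only builds the command documents. The collection names are hypothetical, and whether a given MongoDB build accepts WiredTiger's `type=lsm` option is an assumption worth checking before relying on it.

```python
def create_collection_command(name, wired_tiger_config=None):
    """Build a createCollection command document.

    With no config string the collection uses the engine default (Btree).
    """
    command = {"create": name}
    if wired_tiger_config is not None:
        command["storageEngine"] = {
            "wiredTiger": {"configString": wired_tiger_config}
        }
    return command


# Hypothetical collections: one read-heavy, one write-heavy.
read_heavy = create_collection_command("catalog")
# Assumption: "type=lsm" is WiredTiger's LSM table option.
write_heavy = create_collection_command("events", "type=lsm")

if __name__ == "__main__":
    print(read_heavy)
    print(write_heavy)
```

A driver would send these documents as database commands; no connection is made here.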
Compression
Compression did not exist prior to v3.0. MongoDB v3.0 with the WiredTiger engine can compress data in two flavors: Snappy or Zlib.
- Snappy - 70% compression ratio, low CPU overhead, default option
- Zlib - 80% compression ratio, higher CPU overhead, non-default option
Zlib is suitable for archival purposes: it costs more CPU but compresses at a higher ratio.
Snappy and Zlib compression apply to documents and the journal file, while indexes use prefix compression, which compresses them at a ~50% ratio.
Note: compression ratios vary by use case; on average, a 70% compression ratio is observed.
Here's a comparison of compressed data size across different storage configurations:
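The three compression settings described above (documents, journal, and index prefix) map to `mongod` startup options. The sketch below builds an argument list for them; the data path and the "archival node" scenario are hypothetical.

```python
ALLOWED_COMPRESSORS = ("none", "snappy", "zlib")


def wiredtiger_args(dbpath, block_compressor="snappy",
                    journal_compressor="snappy", index_prefix=True):
    """Build mongod arguments that set WiredTiger compression options."""
    for name in (block_compressor, journal_compressor):
        if name not in ALLOWED_COMPRESSORS:
            raise ValueError(f"unsupported compressor: {name}")
    return [
        "mongod",
        "--storageEngine", "wiredTiger",
        "--dbpath", dbpath,
        "--wiredTigerCollectionBlockCompressor", block_compressor,
        "--wiredTigerJournalCompressor", journal_compressor,
        "--wiredTigerIndexPrefixCompression", "true" if index_prefix else "false",
    ]


# Hypothetical archival node: trade CPU for the higher Zlib ratio.
archive_args = wiredtiger_args("/data/archive", block_compressor="zlib",
                               journal_compressor="zlib")

if __name__ == "__main__":
    print(" ".join(archive_args))
```

Leaving the arguments at their defaults reproduces the Snappy setup, so only nodes that want Zlib need to say so.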
Test load: 1 collection, 500,000 docs, 20KB/doc
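For a sense of scale, here is a back-of-the-envelope estimate for this test load using the nominal ratios quoted earlier (70% for Snappy, 80% for Zlib). It reads "compression ratio" as the fraction of space saved, which is an assumption about how the term is used in this post, and it uses decimal units. These are estimates, not measured results.

```python
DOCS = 500_000
DOC_SIZE_KB = 20


def raw_size_gb(docs=DOCS, doc_size_kb=DOC_SIZE_KB):
    """Uncompressed data size in decimal gigabytes (1 GB = 1,000,000 KB)."""
    return docs * doc_size_kb / 1_000_000


def compressed_size_gb(raw_gb, space_saved):
    """Size left after compression; space_saved is the fraction removed."""
    return raw_gb * (1 - space_saved)


if __name__ == "__main__":
    raw = raw_size_gb()
    print(f"raw:    {raw:.1f} GB")
    print(f"snappy: {compressed_size_gb(raw, 0.70):.1f} GB (nominal 70%)")
    print(f"zlib:   {compressed_size_gb(raw, 0.80):.1f} GB (nominal 80%)")
```

So the uncompressed load is about 10 GB, and the nominal ratios would put it at roughly 3 GB with Snappy and 2 GB with Zlib.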
Compared to MMAPv1, which has no compression option, WiredTiger with Snappy and Zlib compression does a good job, shrinking the data at a ratio of about 84%. However, my question is: does compression affect performance? See the next blog post, where we'll benchmark MongoDB with these storage configurations.