Evaluating MongoDB Under Python TPCC 1000W Workload
Following my blog post Evaluating the Python TPCC MongoDB Benchmark, I wanted to evaluate how MongoDB performs under a workload with a bigger dataset. This time I will load a 1000 Warehouses dataset, which in raw format should amount to about 100GB of data.
For the comparison, I will use the same hardware and the same MongoDB versions as in the blog post mentioned above. To reiterate:
Hardware Specs
For the client and server, I will use identical bare metal servers, connected via a 10Gb network.
The node specification:
# Percona Toolkit System Summary Report ######################
    Hostname | beast-node4-ubuntu
      System | Supermicro; SYS-F619P2-RTN; v0123456789 (Other)
    Platform | Linux
     Release | Ubuntu 18.04.4 LTS (bionic)
      Kernel | 5.3.0-42-generic
Architecture | CPU = 64-bit, OS = 64-bit
# Processor ##################################################
  Processors | physical = 2, cores = 40, virtual = 80, hyperthreading = yes
      Models | 80xIntel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
      Caches | 80x28160 KB
# Memory #####################################################
       Total | 187.6G
  Swappiness | 0
MongoDB Topology
For MongoDB I used:
- Single-node instance without limiting the cache size. As the bare metal server has about 180GB of RAM, MongoDB should allocate ~90GB of memory for the WiredTiger cache, and the rest will be used for the OS cache. This should produce a more CPU-bound workload.
- Single-node instance with a limited cache size. I will cap the WiredTiger cache at 25GB, and to limit the OS cache, I will restrict the memory available to the mongodb instance to 50GB, as described in Using Cgroups to Limit MySQL and MongoDB memory usage (see the sketch after this list).
- Replica set setup with 3 nodes and a limited cache, as described above.
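A minimal sketch of the cache-limited setup, assuming cgroup v1 with cgroup-tools installed (the cgroup name and dbpath here are illustrative, not the exact ones used for these runs):

# Create a memory cgroup and cap the mongod process at 50GB of RAM
sudo cgcreate -g memory:mongodb_limited
echo $((50 * 1024 * 1024 * 1024)) | sudo tee /sys/fs/cgroup/memory/mongodb_limited/memory.limit_in_bytes

# Start mongod inside the cgroup with the WiredTiger cache capped at 25GB
sudo cgexec -g memory:mongodb_limited mongod --dbpath /mnt/data/mongo --wiredTigerCacheSizeGB 25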
MongoDB Versions:
- Percona Server for MongoDB 4.0.18-11
- Percona Server for MongoDB 4.2.7-7
- MongoDB Community 4.4-rc8 (the latest 4.4 version available as of the time of testing)
Loading Data
I will load the data using PyPy with 100 clients, timing the run:
time /mnt/data/vadim/bench/pypy2.7-v7.3.1-linux64/bin/pypy tpcc.py --config mconfig --warehouses 1000 --clients=100 --no-execute mongodb
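The mconfig file referenced above carries the MongoDB driver settings. If you need a starting point, py-tpcc can dump the driver's default configuration for editing (a sketch; adjust the output to match your setup):

/mnt/data/vadim/bench/pypy2.7-v7.3.1-linux64/bin/pypy tpcc.py --print-config mongodb > mconfig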
The results:
4.0
time /mnt/data/vadim/bench/pypy2.7-v7.3.1-linux64/bin/pypy tpcc.py --config mconfig --warehouses 1000 --clients 100 --no-execute mongodb
2020-06-17 13:21:35,159 [<module>:245] INFO : Initializing TPC-C benchmark using MongodbDriver
2020-06-17 13:21:35,159 [<module>:255] INFO : Loading TPC-C benchmark data using MongodbDriver

real    19m43.605s
user    100m19.637s
sys     26m27.597s
4.2
time /mnt/data/vadim/bench/pypy2.7-v7.3.1-linux64/bin/pypy tpcc.py --config mconfig --warehouses 1000 --clients=100 --no-execute mongodb
2020-06-17 13:28:48,325 [<module>:245] INFO : Initializing TPC-C benchmark using MongodbDriver
2020-06-17 13:28:48,325 [<module>:255] INFO : Loading TPC-C benchmark data using MongodbDriver

real    13m34.238s
user    87m30.806s
sys     34m20.460s
4.4
time /mnt/data/vadim/servers/pypy2.7-v7.3.1-linux64/bin/pypy tpcc.py --config mconfig --warehouses 1000 --no-execute --clients=100 mongodb
2020-06-17 14:02:26,426 [<module>:245] INFO : Initializing TPC-C benchmark using MongodbDriver
2020-06-17 14:02:26,426 [<module>:255] INFO : Loading TPC-C benchmark data using MongodbDriver

real    259m40.658s
user    83m36.256s
sys     14m11.330s
To Highlight:
4.2 loaded data a little faster than 4.0, while 4.4 performed extremely poorly, coming in about 20 times slower than 4.2. I hope this is a Release Candidate bug that will be fixed in the GA release.
The size of the MongoDB datadir is 165GB; there is clearly an overhead compared to the raw 100GB data size.
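To see where that overhead lands, one can compare the on-disk size with what MongoDB reports as logical data, storage, and index sizes (a sketch; the dbpath and the tpcc database name are assumptions):

du -sh /mnt/data/mongo
mongo --quiet --eval 's = db.getSiblingDB("tpcc").stats(1024*1024*1024); print("data GB:", s.dataSize, "storage GB:", s.storageSize, "index GB:", s.indexSize)'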
Benchmark Results
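All the runs below execute the transaction phase against the already loaded dataset. A sketch of the command, mirroring the load invocation above (the --duration value here is illustrative, not the one used for these results):

time /mnt/data/vadim/bench/pypy2.7-v7.3.1-linux64/bin/pypy tpcc.py --config mconfig --warehouses 1000 --clients=100 --no-load --duration 1800 mongodb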
Results With an Unlimited Cache
The results are in NEW ORDER transactions per minute; more is better.
Results With a Limited Cache
In this case, I allocate only 25GB for WiredTiger and 50GB for the mongodb process in total.
The results are in NEW ORDER transactions per minute; more is better.
Results with 3 Nodes ReplicaSet and Limited Cache
In this case, I only compare 4.0 and 4.2, as the previous results show something going on with 4.4, and I want to wait until the GA release to measure it in a ReplicaSet setup.
This benchmark runs with ‘write_concern’: 1.
The results are in NEW ORDER transactions per minute, and more is better.
Now we can compare how much overhead there is from ReplicaSets:
With ‘write_concern’: 1 there really should not be much overhead from the replica set, which is confirmed for version 4.0. However, 4.2 shows a noticeable difference, which is a point for further investigation.
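A couple of server-side checks are a reasonable starting point for that investigation (a sketch; these are the 4.0/4.2-era mongo shell helpers):

mongo --quiet --eval 'rs.printSlaveReplicationInfo()'            # replication lag per secondary
mongo --quiet --eval 'printjson(db.serverStatus().opLatencies)'  # server-side read/write/command latencies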
Conclusion
What is obvious from the collective results is that the 4.2 version took a noticeable performance hit, sometimes showing as much as a 2x throughput decline compared to 4.0.
Version 4.4, in its current RC status, showed long load times and variation in the performance results under highly concurrent load. I want to wait for the GA release for the final evaluation.
by Vadim Tkachenko via Percona Database Performance Blog