New MongoDB Exporter Released with Percona Monitoring and Management 2.10.0

With Percona Monitoring and Management (PMM) 2.10.0, Percona is releasing a new MongoDB exporter for Prometheus. It is a complete rewrite from scratch, with a totally new approach to collecting and exposing metrics from MongoDB diagnostic commands.

The MongoDB exporter in the 0.11.x branch exposes only a static list of handpicked metrics with custom names and labels. The new exporter uses a totally different approach: it exposes ALL available metrics returned by MongoDB internal diagnostic commands, and the metric naming (or renaming) follows concrete rules that apply uniformly to all metrics.

For example, if we run

db.getSiblingDB('admin').runCommand({"getDiagnosticData": 1});

the command returns a structure that looks like this:

{
     "data" : {
         "start" : ISODate("2020-08-23T22:25:26Z"),
         "serverStatus" : {
             "start" : ISODate("2020-08-23T22:25:26Z"),
             "host" : "f9cd25606ada",
             "version" : "4.2.8",
             "process" : "mongod",
             "pid" : NumberLong(1),
             "uptime" : 186327,
             "uptimeMillis" : NumberLong(186327655),
             "uptimeEstimate" : NumberLong(186327),
             "localTime" : ISODate("2020-08-23T22:25:26Z"),
             "asserts" : {
                 "regular" : 0,
                 "warning" : 0,
                 "msg" : 0,
                 "user" : 62,
                 "rollovers" : 0
             },
             "connections" : {
                 "current" : 25,
                 "available" : 838835,
                 "totalCreated" : 231,
                 "active" : 1
             },
             "electionMetrics" : {
                 "stepUpCmd" : {
                     "called" : NumberLong(0),
                     "successful" : NumberLong(0)
                 },
                 "priorityTakeover" : {
                     "called" : NumberLong(0),
                     "successful" : NumberLong(0)
                 },
                 "catchUpTakeover" : {
                     "called" : NumberLong(0),
                     "successful" : NumberLong(0)
                 },

In the new exporter, the approach to exposing all metrics is to traverse the result of diagnostic commands such as getDiagnosticData, looking for values to expose. In this case, we have serverStatus; inside it we find asserts, and inside asserts there are metrics to expose (because they are numbers): regular, warning, msg, user, and rollovers. With this method of metric gathering, the metric name is the composition of the keys we had to follow to reach the value. For example, it produces a metric like serverStatus_asserts_user: 62.
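
The traversal described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the exporter's actual code (the exporter itself is not written in Python): the function name flatten_metrics and the decision to skip booleans are assumptions made for this sketch, and the sample document is a hand-copied fragment of the output shown earlier.

```python
from numbers import Number


def flatten_metrics(doc, path=()):
    """Yield (metric_name, value) for every numeric leaf in a nested dict."""
    for key, value in doc.items():
        current = path + (key,)
        if isinstance(value, dict):
            # Descend into sub-documents, remembering the keys followed so far.
            yield from flatten_metrics(value, current)
        elif isinstance(value, Number) and not isinstance(value, bool):
            # Only numbers become metrics; strings and other types are skipped.
            # Skipping booleans is an assumption of this sketch.
            yield "_".join(current), value


# Hand-copied fragment of the getDiagnosticData output.
sample = {
    "serverStatus": {
        "host": "f9cd25606ada",
        "version": "4.2.8",
        "asserts": {"regular": 0, "warning": 0, "msg": 0, "user": 62, "rollovers": 0},
        "connections": {"current": 25, "available": 838835, "totalCreated": 231, "active": 1},
    }
}

metrics = dict(flatten_metrics(sample))
print(metrics["serverStatus_asserts_user"])  # prints 62
```

Joining the keys with an underscore is what yields names such as serverStatus_asserts_user before any renaming takes place.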

If we open the web interface for the exporter at http://localhost:9216, we won’t find the metric just mentioned. Why? To make metric names shorter and to group related metrics under the same name, the new exporter renames metric name prefixes and converts some metric suffixes to labels.

Prefix Renaming Table

The string mongodb_ is prepended to all metrics as the Prometheus job name. Unlike the mongodb_exporter before v2.0, there won’t be mongod vs. mongos in the job name.

Metric Prefix                          New Prefix
serverStatus.wiredTiger.transaction    ss_wt_txn
serverStatus.wiredTiger                ss_wt
serverStatus                           ss
replSetGetStatus                       rs
systemMetrics                          sys
local.oplog.rs.stats.wiredTiger        oplog_stats_wt
local.oplog.rs.stats                   oplog_stats
collstats_storage.wiredTiger           collstats_storage_wt
collstats_storage.indexDetails         collstats_storage_idx
collStats.storageStats                 collstats_storage
collStats.latencyStats                 collstats_latency

Prefix Labeling Table

Metric Prefix                                     Label
collStats.storageStats.indexDetails.              index_name
globalLock.activeQueue.                           count_type
globalLock.locks.                                 lock_type
serverStatus.asserts.                             assert_type
serverStatus.connections.                         conn_type
serverStatus.globalLock.currentQueue.             count_type
serverStatus.metrics.commands.                    cmd_name
serverStatus.metrics.cursor.open.                 csr_type
serverStatus.metrics.document.                    doc_op_type
serverStatus.opLatencies.                         op_type
serverStatus.opReadConcernCounters.               concern_type
serverStatus.opcounters.                          legacy_op_type
serverStatus.opcountersRepl.                      legacy_op_type
serverStatus.transactions.commitTypes.            commit_type
serverStatus.wiredTiger.concurrentTransactions.   txn_rw_type
serverStatus.wiredTiger.perf.                     perf_bucket
systemMetrics.disks.                              device_name

Because of the metric renaming and labeling, the metric serverStatus.asserts.<metric> becomes ss_asserts, and the last part of the original name (<metric>) is used as a label:

# HELP mongodb_ss_asserts serverStatus.asserts.
# TYPE mongodb_ss_asserts untyped
mongodb_ss_asserts{assert_type="msg"} 0
mongodb_ss_asserts{assert_type="regular"} 0
mongodb_ss_asserts{assert_type="rollovers"} 0
mongodb_ss_asserts{assert_type="user"} 62
mongodb_ss_asserts{assert_type="warning"} 0
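
To see how the two tables combine, here is a minimal Python sketch that turns a dotted key path into the output line shown above. It is an illustration under stated assumptions rather than the exporter's implementation: to_prometheus is a hypothetical helper, only a few rows of each table are included, and the order in which the two rules are applied is a guess that happens to reproduce the example.

```python
# Longest prefixes first, so the most specific rule wins (subset of the renaming table).
PREFIX_RENAMES = [
    ("serverStatus.wiredTiger.transaction", "ss_wt_txn"),
    ("serverStatus.wiredTiger", "ss_wt"),
    ("serverStatus", "ss"),
    ("replSetGetStatus", "rs"),
]

# Prefixes whose trailing key becomes a label value (subset of the labeling table).
LABEL_PREFIXES = {
    "serverStatus.asserts.": "assert_type",
    "serverStatus.connections.": "conn_type",
}


def to_prometheus(path, value):
    """Turn a dotted key path such as 'serverStatus.asserts.user' into one output line."""
    labels = ""
    for prefix, label in LABEL_PREFIXES.items():
        if path.startswith(prefix):
            # Move the trailing key into a label and keep only the shared prefix.
            labels = '{%s="%s"}' % (label, path[len(prefix):])
            path = prefix.rstrip(".")
            break
    for old, new in PREFIX_RENAMES:
        if path == old or path.startswith(old + "."):
            path = new + path[len(old):]
            break
    # Every metric gets the mongodb_ prefix; remaining dots become underscores.
    return "mongodb_" + path.replace(".", "_") + labels + " " + str(value)


print(to_prometheus("serverStatus.asserts.user", 62))
# prints: mongodb_ss_asserts{assert_type="user"} 62
```

Keeping the rename list ordered from longest to shortest prefix matters: otherwise serverStatus would match before serverStatus.wiredTiger and the shorter ss_wt names would never be produced.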

Advantages Of The New Exporter

Since the new exporter automatically collects all available metrics, new metrics can now be used in the PMM dashboards. As new MongoDB versions expose new metrics, they automatically become available, with no need to add them manually and upgrade the exporter. Also, since there are clear rules for how metrics are renamed and how labels are created, metric names stay consistent even when new metrics are added.

How It Works

As mentioned previously, this new exporter exposes all metrics by traversing the JSON output of each MongoDB diagnostic command. Those commands are:

  • {"getDiagnosticData": 1}, which includes:
      • serverStatus
      • replSetGetStatus (will be fetched separately if MongoDB <= v3.6)
      • Oplog collection stats
      • OS system metrics: memory, CPU, disk usage, netstat, vmstat
  • {"replSetGetStatus": 1}
  • {"serverStatus": 1}

It is also possible to specify a list of database.collection pairs to get collection usage and index stats; the exporter then runs these commands for each collection:

{"$collStats": {"latencyStats": {"histograms": true}}}

{"$indexStats": {}}
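
As a rough illustration of the per-collection step, the following Python sketch splits each database.collection pair and attaches the two stages to it. Everything here is an assumption made for illustration: build_collection_commands is a hypothetical helper, the second stage is written in its aggregation-stage spelling ($indexStats), and no MongoDB driver is involved, so nothing is actually executed against a server.

```python
def build_collection_commands(namespaces):
    """Map each 'database.collection' string to the two stages to run for it."""
    plan = {}
    for ns in namespaces:
        # Split on the first dot only: collection names may contain dots themselves.
        database, _, collection = ns.partition(".")
        if not database or not collection:
            raise ValueError(f"expected database.collection, got {ns!r}")
        plan[(database, collection)] = [
            {"$collStats": {"latencyStats": {"histograms": True}}},
            {"$indexStats": {}},
        ]
    return plan


plan = build_collection_commands(["local.oplog.rs", "test.users"])
for (database, collection), stages in plan.items():
    print(database, collection, len(stages))
```

Splitting on the first dot keeps a namespace such as local.oplog.rs intact as database local and collection oplog.rs.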

Enabling Compatibility Mode

The new exporter has a parameter, --compatible-mode, which enables a special compatibility mode. In this mode, the old exporter metrics are also exposed along with the new metrics. This way, existing dashboards should work without requiring any change, and it is the default mode in PMM 2.10.0.

Example: in compatibility mode, all metrics that have the mongodb_ss_wt_txn_transaction_checkpoint prefix and the min_time_msecs or max_time_msecs suffix, like

# HELP mongodb_ss_wt_txn_transaction_checkpoint_min_time_msecs serverStatus.wiredTiger.transaction.
# TYPE mongodb_ss_wt_txn_transaction_checkpoint_min_time_msecs untyped
mongodb_ss_wt_txn_transaction_checkpoint_min_time_msecs 14

will also be exposed using the old naming convention as

# HELP mongodb_mongod_wiredtiger_transactions_checkpoint_milliseconds mongodb_mongod_wiredtiger_transactions_checkpoint_milliseconds
# TYPE mongodb_mongod_wiredtiger_transactions_checkpoint_milliseconds untyped
mongodb_mongod_wiredtiger_transactions_checkpoint_milliseconds{type="max"} 71
mongodb_mongod_wiredtiger_transactions_checkpoint_milliseconds{type="min"} 14

with the suffix (min or max) used as the value of the type label.
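
One way to picture this mapping is as a small rule table from new-style names to old-style names. The Python sketch below is an assumption-laden illustration, not how the exporter stores its compatibility rules: COMPAT_RULES and compat_alias are hypothetical names, and the table contains only the single example from this section.

```python
# Hypothetical rule table: new-style prefix -> (old-style name, {suffix: type label}).
COMPAT_RULES = {
    "mongodb_ss_wt_txn_transaction_checkpoint": (
        "mongodb_mongod_wiredtiger_transactions_checkpoint_milliseconds",
        {"min_time_msecs": "min", "max_time_msecs": "max"},
    ),
}


def compat_alias(new_name, value):
    """Return the old-style output line for a new-style metric, or None if no rule applies."""
    for prefix, (old_name, suffixes) in COMPAT_RULES.items():
        for suffix, type_label in suffixes.items():
            if new_name == f"{prefix}_{suffix}":
                # The suffix turns into the value of the "type" label.
                return f'{old_name}{{type="{type_label}"}} {value}'
    return None


print(compat_alias("mongodb_ss_wt_txn_transaction_checkpoint_min_time_msecs", 14))
# prints: mongodb_mongod_wiredtiger_transactions_checkpoint_milliseconds{type="min"} 14
```

Metrics without a matching rule return None here, which mirrors the idea that only metrics with an old-style counterpart are exposed twice.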

Debugging

When started with --debug, the exporter writes the result of each diagnostic command to standard error. This makes it easier to check the values returned by each command and to verify the metric renaming and values.

Releases

This exporter is going to be released as part of PMM starting with version 2.10.0 and will also be released as an independent exporter on the repository’s releases page.

Currently, the new exporter resides in the v0.20.0 branch and the old exporter is in the master branch. However, exporter v0.11 will be moved to the main branch, and the master branch will be used for the new exporter code.

How to Contribute

Using the Makefile

In the main directory, there is a Makefile to help you with development and testing tasks. Use make without parameters to get help. These are the available options:

Command              Description
init                 Install linters
build                Build the binaries
format               Format source code
check                Run checks/linters
check-license        Check license in headers.
help                 Display this help message.
test                 Run all tests (need to start the sandbox first)
test-cluster         Starts MongoDB test cluster. Use env var TEST_MONGODB_IMAGE to set flavor and version.
                     Example: TEST_MONGODB_IMAGE=mongo:3.6 make test-cluster
test-cluster-clean   Stops MongoDB test cluster

Initializing the Development Environment

First, you need to have Go and Docker installed on your system. Then, to install the tools used to format, test, and build the exporter, run this command:

make init

It will install goimports, goreleaser, golangci-lint, and reviewdog.

Testing

Starting the Sandbox

The testing sandbox starts the following MongoDB instances:

  • 3 instances for shard 1 at ports 17001, 17002, 17003
  • 3 instances for shard 2 at ports 17004, 17005, 17006
  • 3 config servers at ports 17007, 17008, 17009
  • 1 mongos server at port 17000
  • 1 stand-alone instance at port 27017

All instances currently run without a user and password, so, for example, to connect to the mongos, you can just use:

mongo mongodb://127.0.0.1:17000/admin

The sandbox can be started with the provided Makefile using make test-cluster, and it can be stopped using make test-cluster-clean.
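
For scripts that need to reach individual sandbox members, the port layout listed above can be captured in a small lookup. This is just a convenience sketch: the role names (mongos, shard1, and so on) and the helper sandbox_uris are labels invented here, and only the ports come from the list above.

```python
# Ports as listed for the sandbox; the role names are labels chosen for this sketch.
SANDBOX = {
    "mongos": [17000],
    "shard1": [17001, 17002, 17003],
    "shard2": [17004, 17005, 17006],
    "config": [17007, 17008, 17009],
    "standalone": [27017],
}


def sandbox_uris(role, host="127.0.0.1", database="admin"):
    """Return one connection URI per sandbox instance of the given role."""
    return [f"mongodb://{host}:{port}/{database}" for port in SANDBOX[role]]


print(sandbox_uris("mongos"))
# prints: ['mongodb://127.0.0.1:17000/admin']
```

No credentials appear in the URIs because the sandbox instances run without a user and password.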

Running Tests

To run the unit tests, just run make test.

Formatting Code

Before submitting code, please run make format to format the code according to the standards.

Known Issues

  • Replicaset lag sometimes shows strange values.
  • Elements that use the following metrics have been removed from dashboards:
    mongodb_mongod_rocksdb_*
    mongodb_mongod_locks_time_locked_global_microseconds_total
    mongodb_mongod_durability_time_milliseconds_sum
    mongodb_mongod_durability_time_milliseconds_count

So, these dashboards have been updated:

  • dashboard “MongoDB RocksDB Details” –> removed the dashboard completely
  • dashboard “MongoDB MMAPv1 Details”, element “MMAPv1 Journaling Time” –> removed the element from the dashboard
  • dashboard “MongoDB MMAPv1 Details”, element “MMAPv1 Lock Ratios”, parameter “Lock (pre-3.2 only)” –> removed the chart from the element on the dashboard

Final Thoughts

This new exporter shouldn’t affect any existing dashboard, since the compatibility mode exposes all old-style metrics along with the new ones. We deprecated only a few metrics, such as mongodb_mongod_global_lock_ratio and mongodb_version_info, that were already meaningless because they are only valid and exposed for old MongoDB versions.

At Percona, we built this new MongoDB exporter to be capable of exposing all available metrics, with no hard-coded metrics and no ties to any particular MongoDB version. We would like to encourage users to help us by using this version and providing feedback. We also accept (and encourage) code fixes and improvements.


by Carlos Salguero via Percona Database Performance Blog
