Alert Conditions

Introduction

This page describes the conditions for which you can trigger alerts related to your clusters. You specify conditions and thresholds when configuring alerts.

Note

Groups and projects are synonymous terms. Your {GROUP-ID} is the same as your project ID. For existing groups, your group/project ID remains the same. This page uses the more familiar term group. The endpoints are as stated on the page.

Host Alerts

The conditions in this section apply if you select Host as the alert target when configuring the alert. You can apply the condition to all hosts or to a specific type of host, such as primaries or config servers.

Asserts

The following alert conditions measure the rate of asserts for a MongoDB process, as collected from the MongoDB serverStatus command’s asserts document. You can view asserts through cluster monitoring.

Asserts: Regular is

Sends an alert if the rate of regular asserts meets the specified threshold.

Asserts: Warning is

Sends an alert if the rate of warnings meets the specified threshold.

Asserts: Msg is

Sends an alert if the rate of message asserts meets the specified threshold. Message asserts are internal server errors. Stack traces are logged for these.

Asserts: User is

Sends an alert if the rate of errors generated by users meets the specified threshold.

Opcounter

The following alert conditions measure the rate of database operations on a MongoDB process since the process last started, as collected from the MongoDB serverStatus command’s opcounters document. You can view opcounters through cluster monitoring.

Opcounter: Cmd is

Sends an alert if the rate of commands performed meets the specified threshold.

Opcounter: Query is

Sends an alert if the rate of queries meets the specified threshold.

Opcounter: Update is

Sends an alert if the rate of updates meets the specified threshold.

Opcounter: Delete is

Sends an alert if the rate of deletes meets the specified threshold.

Opcounter: Insert is

Sends an alert if the rate of inserts meets the specified threshold.

Opcounter: Getmores is

Sends an alert if the rate of getmore operations to retrieve the next cursor batch meets the specified threshold. See Cursor Batches in the MongoDB manual.
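The opcounters document holds cumulative counts, so a rate has to be derived from two samples. The following is a minimal sketch of that calculation, not a description of how Atlas computes it internally. The function name, the sample values, and the choice to treat a negative delta as a process restart are all assumptions made for illustration.

```python
from typing import Dict


def opcounter_rates(prev: Dict[str, int], curr: Dict[str, int],
                    elapsed_seconds: float) -> Dict[str, float]:
    """Return operations per second for each counter between two samples.

    `prev` and `curr` stand in for the `opcounters` document taken from two
    consecutive serverStatus results (hypothetical sample data below).
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    rates = {}
    for name, value in curr.items():
        delta = value - prev.get(name, 0)
        # Counters restart from zero when the process restarts. Assumption:
        # on a negative delta, count only what accumulated since the restart.
        if delta < 0:
            delta = value
        rates[name] = delta / elapsed_seconds
    return rates


prev_sample = {"insert": 1000, "query": 5000, "update": 200,
               "delete": 50, "getmore": 300, "command": 9000}
curr_sample = {"insert": 1600, "query": 5300, "update": 260,
               "delete": 50, "getmore": 330, "command": 9600}

rates = opcounter_rates(prev_sample, curr_sample, elapsed_seconds=60)
```

The same delta-over-time approach applies to any cumulative serverStatus counter, such as the asserts document.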

Opcounter - Repl

The following alert conditions measure the rate of database operations on MongoDB secondaries, as collected from the MongoDB serverStatus command’s opcountersRepl document. You can view these metrics on the Opcounters - Repl chart, accessed through cluster monitoring.

Opcounter: Repl Cmd is

Sends an alert if the rate of replicated commands meets the specified threshold.

Opcounter: Repl Update is

Sends an alert if the rate of replicated updates meets the specified threshold.

Opcounter: Repl Delete is

Sends an alert if the rate of replicated deletes meets the specified threshold.

Opcounter: Repl Insert is

Sends an alert if the rate of replicated inserts meets the specified threshold.

Atlas Free Tier

Logical Size is

Sends an alert if the total size of the data and indexes is outside the specified threshold.

Applicable for Atlas Free Tier Only

Memory

The following alert conditions measure memory for a MongoDB process, as collected from the MongoDB serverStatus command’s mem document. You can view these metrics on the Atlas Memory and Non-Mapped Virtual Memory charts, accessed through cluster monitoring.

Memory: Resident is

Sends an alert if the size of the resident memory meets the specified threshold. On a dedicated database server, resident memory typically approaches the amount of physical RAM on the host over time.

Memory: Virtual is

Sends an alert if the size of virtual memory for the mongod process meets the specified threshold. You can use this alert to flag excessive memory outside of memory mapping. For more information, click the Memory chart’s i icon.

Memory: Computed is

Sends an alert if the size of virtual memory that is not accounted for by memory-mapping meets the specified threshold. If this number is very high (multiple gigabytes), it indicates that excessive memory is being used outside of memory mapping. For more information on how to use this metric, view the Non-Mapped Virtual Memory chart and click the chart’s i icon.
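As a rough illustration of the computed value, the sketch below subtracts mapped memory from virtual memory using a sample mem document. The formula, the preference for mappedWithJournal when it is present, and the sample numbers are assumptions for illustration. The exact calculation Atlas uses is not stated on this page.

```python
def non_mapped_virtual_mb(mem: dict) -> int:
    """Estimate virtual memory not accounted for by memory mapping.

    `mem` stands in for the `mem` document of a serverStatus result.
    Assumption: subtract `mappedWithJournal` when the field is present,
    otherwise `mapped`, otherwise nothing.
    """
    mapped = mem.get("mappedWithJournal", mem.get("mapped", 0))
    # Clamp at zero so an inconsistent sample never yields a negative size.
    return max(mem["virtual"] - mapped, 0)


sample_mem = {"resident": 1200, "virtual": 9000,
              "mapped": 3000, "mappedWithJournal": 6000}
non_mapped = non_mapped_virtual_mb(sample_mem)
```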

Connections

The following alert condition measures connections to a MongoDB process, as collected from the MongoDB serverStatus command’s connections document. You can view this metric on the Atlas Connections chart, accessed through cluster monitoring.

Connections is

Sends an alert if the number of active connections to the host meets the specified average.

Queues

The following alert conditions measure operations waiting on locks, as collected from the MongoDB serverStatus command’s globalLock document. You can view these metrics on the Atlas Queues chart, accessed through cluster monitoring.

Queues: Total is

Sends an alert if the number of operations waiting on a lock of any type meets the specified average.

Queues: Readers is

Sends an alert if the number of operations waiting on a read lock meets the specified average.

Queues: Writers is

Sends an alert if the number of operations waiting on a write lock meets the specified average.
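The queue conditions compare against an average rather than a single reading. The sketch below shows one way to average several samples before comparing with a threshold. It assumes each sample is the globalLock document from a serverStatus result and that the condition is configured as "above". The helper name and the sample data are hypothetical.

```python
from statistics import mean
from typing import List


def queue_average_exceeds(samples: List[dict], field: str,
                          threshold: float) -> bool:
    """Return True if the average queue length for `field` is above `threshold`.

    `field` is one of "total", "readers", or "writers" under
    `globalLock.currentQueue`.
    """
    values = [sample["currentQueue"][field] for sample in samples]
    return mean(values) > threshold


# Three consecutive globalLock samples (made-up numbers).
queue_samples = [
    {"currentQueue": {"total": 4, "readers": 1, "writers": 3}},
    {"currentQueue": {"total": 12, "readers": 2, "writers": 10}},
    {"currentQueue": {"total": 20, "readers": 5, "writers": 15}},
]

total_alert = queue_average_exceeds(queue_samples, "total", threshold=10)
readers_alert = queue_average_exceeds(queue_samples, "readers", threshold=10)
```

Averaging smooths out a single spike: in the sample data the last reading of 20 is well above 10, but the alert depends on the mean of 12.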

Page Faults

The following alert condition measures the rate of page faults for a MongoDB process, as collected from the MongoDB serverStatus command’s extra_info.page_faults field.

Page Faults is

Sends an alert if the rate of page faults (whether or not an exception is thrown) meets the specified threshold. You can view this metric on the Atlas Page Faults chart, accessed through cluster monitoring.

Cursors

The following alert conditions measure the number of cursors for a MongoDB process, as collected from the MongoDB serverStatus command’s metrics.cursor document. You can view these metrics on the Atlas Cursors chart, accessed through cluster monitoring.

Cursors: Open is

Sends an alert if the number of cursors the server is maintaining for clients meets the specified average.

Cursors: Timed Out is

Sends an alert if the number of timed-out cursors the server is maintaining for clients meets the specified average.

Network

The following alert conditions measure throughput for a MongoDB process, as collected from the MongoDB serverStatus command’s network document. You can view these metrics on a host’s Network chart, accessed through cluster monitoring.

Network: Bytes In is

Sends an alert if the number of bytes sent to MongoDB meets the specified threshold.

Network: Bytes Out is

Sends an alert if the number of bytes sent from MongoDB meets the specified threshold.

Network: Num Requests is

Sends an alert if the number of requests sent to MongoDB meets the specified average.

Replication Oplog

The following alert conditions apply to the MongoDB process’s oplog. You can view these metrics on the following charts, accessed through cluster monitoring:

  • Replication Oplog Window
  • Replication Lag
  • Replication Headroom
  • Oplog GB/Hour

The following alert conditions apply to the oplog:

Replication Oplog Window is

Sends an alert if the approximate amount of time available in the primary’s replication oplog meets the specified threshold.

Replication Lag is

Sends an alert if the approximate amount of time that the secondary is behind the primary meets the specified threshold. Atlas calculates replication lag using the approach described in Check the Replication Lag in the MongoDB manual.

Replication Headroom is

Sends an alert when the difference between the sync source member’s oplog window and the replication lag time on the secondary meets the specified threshold.

Oplog Data Per Hour is

Sends an alert when the amount of data per hour being written to a primary’s oplog meets the specified threshold.
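The oplog window, replication lag, and replication headroom described above are related by simple time arithmetic. The sketch below works through one example. The function names and timestamps are hypothetical, and the calculations are a simplified reading of the definitions on this page rather than the exact method Atlas uses.

```python
from datetime import datetime, timedelta


def oplog_window(oldest_entry: datetime, newest_entry: datetime) -> timedelta:
    """Time span covered by the entries currently held in an oplog."""
    return newest_entry - oldest_entry


def replication_lag(primary_optime: datetime,
                    secondary_optime: datetime) -> timedelta:
    """How far the secondary's last applied operation trails the primary's."""
    return max(primary_optime - secondary_optime, timedelta(0))


def replication_headroom(source_window: timedelta,
                         lag: timedelta) -> timedelta:
    """Sync source oplog window minus the secondary's replication lag."""
    return source_window - lag


# The sync source holds 24 hours of oplog; the secondary is 2 minutes behind.
window = oplog_window(datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 2, 0, 0))
lag = replication_lag(datetime(2024, 1, 2, 0, 0), datetime(2024, 1, 1, 23, 58))
headroom = replication_headroom(window, lag)
```

In this example the headroom is 23 hours and 58 minutes. A headroom that approaches zero means the secondary risks falling off the end of its sync source's oplog.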

DB Storage

The following alert conditions apply to database storage, as collected for a MongoDB process by the MongoDB dbStats command. The conditions are based on the summed total of all databases on the MongoDB process:

DB Storage is

Sends an alert if the allocated storage meets the specified threshold. You can view this metric on a host’s DB Storage chart, accessed through cluster monitoring.

DB Data Size is

Sends an alert if the approximate size of all documents (and their padding) meets the specified threshold.

WiredTiger Storage Engine

The following alert conditions apply to the MongoDB process’s WiredTiger storage engine, as collected from the MongoDB serverStatus command’s wiredTiger.cache and wiredTiger.concurrentTransactions documents.

You can view these metrics on the following charts, accessed through cluster monitoring:

  • Tickets Available
  • Cache Activity
  • Cache Usage

The following are the alert conditions that apply to WiredTiger:

Tickets Available: Reads is

Sends an alert if the number of read tickets available to the WiredTiger storage engine meets the specified threshold.

Tickets Available: Writes is

Sends an alert if the number of write tickets available to the WiredTiger storage engine meets the specified threshold.

Cache: Dirty Bytes is

Sends an alert when the number of dirty bytes in the WiredTiger cache meets the specified threshold.

Cache: Used Bytes is

Sends an alert when the number of used bytes in the WiredTiger cache meets the specified threshold.

Cache: Bytes Read Into Cache is

Sends an alert when the number of bytes read into the WiredTiger cache meets the specified threshold.

Cache: Bytes Written From Cache is

Sends an alert when the number of bytes written from the WiredTiger cache meets the specified threshold.
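To show where these values live, the sketch below pulls the relevant fields out of a sample serverStatus document. The field names under wiredTiger.cache and wiredTiger.concurrentTransactions are the ones serverStatus reports. The mapping of each alert condition to a field, the output key names, and the sample numbers are assumptions for illustration.

```python
def wiredtiger_alert_metrics(server_status: dict) -> dict:
    """Extract the WiredTiger values that the conditions above refer to."""
    wt = server_status["wiredTiger"]
    tickets = wt["concurrentTransactions"]
    cache = wt["cache"]
    return {
        "tickets_available_reads": tickets["read"]["available"],
        "tickets_available_writes": tickets["write"]["available"],
        "cache_dirty_bytes": cache["tracked dirty bytes in the cache"],
        "cache_used_bytes": cache["bytes currently in the cache"],
        # The next two are cumulative counters, so a monitoring tool would
        # normally compare deltas between samples rather than raw values.
        "cache_bytes_read_into": cache["bytes read into cache"],
        "cache_bytes_written_from": cache["bytes written from cache"],
    }


sample_status = {
    "wiredTiger": {
        "concurrentTransactions": {
            "read": {"out": 3, "available": 125, "totalTickets": 128},
            "write": {"out": 28, "available": 100, "totalTickets": 128},
        },
        "cache": {
            "tracked dirty bytes in the cache": 52_428_800,
            "bytes currently in the cache": 1_073_741_824,
            "bytes read into cache": 9_000_000_000,
            "bytes written from cache": 4_000_000_000,
        },
    }
}

metrics = wiredtiger_alert_metrics(sample_status)
```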

System and Disk Alerts

The following alert conditions measure usage on your Atlas server instances:

System: CPU (Steal) % is

Applicable when the EC2 instance credit balance is exhausted.

The percentage by which the CPU usage exceeds the guaranteed baseline CPU credit accumulation rate. CPU credits are units of CPU utilization that you accumulate. The credits accumulate at a constant rate to provide a guaranteed level of performance. These credits can be used for additional CPU performance. When the credit balance is exhausted, only the guaranteed baseline of CPU performance is provided, and the amount of excess is shown as steal percent.

System: CPU (User) % is

The normalized CPU usage of the MongoDB process, which is scaled to a range of 0-100%.

Disk space % used on Data Partition is

The percentage of disk space used on any partition that contains the MongoDB collection data.

Disk space % used on Index Partition is

The percentage of disk space used on any partition that contains the MongoDB index data.

Disk space % used on Journal Partition is

The percentage of disk space used on the partition that contains the MongoDB journal, if journaling is enabled.

Disk I/O % utilization on Data Partition is

The percentage of time during which requests are being issued to any partition that contains the MongoDB collection data. This includes requests from any process, not just MongoDB processes.

Disk I/O % utilization on Index Partition is

The percentage of time during which requests are being issued to any partition that contains the MongoDB index data. This includes requests from any process, not just MongoDB processes.

Disk I/O % utilization on Journal Partition is

The percentage of time during which requests are being issued to the partition that contains the MongoDB journal, if journaling is enabled. This includes requests from any process, not just MongoDB processes.

Inapplicable Host Conditions

The following host conditions do not apply to Atlas. Atlas will not generate alerts for the following conditions:

  • Memory: Mapped is
  • B-tree: accesses is
  • B-tree: hits is
  • B-tree: misses is
  • B-tree: miss ratio is
  • Effective Lock % is
  • Background Flush Average is
  • Accesses Not In Memory: Total is
  • Page Fault Exceptions Thrown: Total is
  • Cursors: Client Cursors Size is
  • Journaling Commits in Write Lock is
  • Journaling MB is
  • Journaling Write Data Files MB is

Query Targeting Alerts

The following alerts may indicate a need for indexes to improve the efficiency of your read operations. For more information on indexing, refer to Indexing Strategies.

Query Targeting: Scanned / Returned

Sends an alert if the ratio of documents scanned to documents returned meets the specified threshold.

Query Targeting: Scanned Objects / Returned

Sends an alert if the ratio of index items scanned to documents returned meets the specified threshold.
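Both query targeting conditions are ratios against the number of documents returned. The sketch below computes such a ratio from made-up counts. The function name and the handling of zero returned documents are assumptions, since this page does not say how that case is treated.

```python
import math


def targeting_ratio(scanned: int, returned: int) -> float:
    """Ratio of items scanned to documents returned.

    Assumption: when nothing is returned, the ratio is 0.0 if nothing was
    scanned either, and infinity otherwise.
    """
    if returned == 0:
        return 0.0 if scanned == 0 else math.inf
    return scanned / returned


# A query that scans 10,000 documents to return 10 has a ratio of 1000.
collection_scan_ratio = targeting_ratio(scanned=10_000, returned=10)

# A well-indexed query scans about as many items as it returns.
indexed_ratio = targeting_ratio(scanned=10, returned=10)
```

A ratio close to 1 suggests the query is well targeted. A large ratio suggests the query examines far more items than it returns and may benefit from an index.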

Replica Set Alerts

The following alert condition applies to replica sets:

Replica set has no primary

Sends an alert when a replica set does not have a primary. Specifically, the alert triggers when none of the members of a replica set have a status of PRIMARY. For example, this condition may arise when a set has an even number of voting members, resulting in a tie.

If Atlas collects data during an election for primary, this alert might send a false positive. To prevent such false positives, set the alert configuration’s after waiting interval (in the configuration’s Send to section).
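The effect of a waiting interval can be sketched as follows: the condition must hold continuously for the whole interval before an alert is sent, so a short election does not trigger it. This is an illustration only. The class and its behavior are hypothetical, and the member documents are modeled on the stateStr field that replSetGetStatus reports.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def has_primary(members: List[dict]) -> bool:
    """Return True if any member reports a state of PRIMARY."""
    return any(member.get("stateStr") == "PRIMARY" for member in members)


class NoPrimaryAlert:
    """Fires only after the set has lacked a primary for `wait` or longer."""

    def __init__(self, wait: timedelta) -> None:
        self.wait = wait
        self.no_primary_since: Optional[datetime] = None

    def observe(self, members: List[dict], now: datetime) -> bool:
        """Record one observation and return True if the alert should fire."""
        if has_primary(members):
            # A primary is back, so reset the timer.
            self.no_primary_since = None
            return False
        if self.no_primary_since is None:
            self.no_primary_since = now
        return now - self.no_primary_since >= self.wait


in_election = [{"stateStr": "SECONDARY"}, {"stateStr": "SECONDARY"},
               {"stateStr": "ARBITER"}]
healthy = [{"stateStr": "PRIMARY"}, {"stateStr": "SECONDARY"},
           {"stateStr": "ARBITER"}]

alert = NoPrimaryAlert(wait=timedelta(minutes=5))
start = datetime(2024, 1, 1, 12, 0)
fired_immediately = alert.observe(in_election, start)
fired_after_wait = alert.observe(in_election, start + timedelta(minutes=6))
```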

Sharded Cluster Alerts

The following alert condition applies to sharded clusters:

Cluster is missing an active mongos

Sends an alert if Atlas cannot reach a mongos for the cluster.

Backup Alerts

The following alert conditions apply to your cluster backups, if enabled:

Backup oplog is behind

Sends an alert if the most recent oplog data received by Atlas is more than 75 minutes old.
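The 75-minute rule amounts to a timestamp comparison. A minimal sketch, assuming you know when the most recent oplog data was received (the function and constant names are hypothetical):

```python
from datetime import datetime, timedelta

# Limit stated in the condition above.
OPLOG_BEHIND_LIMIT = timedelta(minutes=75)


def backup_oplog_is_behind(last_oplog_received: datetime,
                           now: datetime) -> bool:
    """Return True if the newest received oplog data is over 75 minutes old."""
    return now - last_oplog_received > OPLOG_BEHIND_LIMIT


now = datetime(2024, 1, 1, 12, 0)
behind = backup_oplog_is_behind(now - timedelta(minutes=90), now)
current = backup_oplog_is_behind(now - timedelta(minutes=10), now)
```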

Backup requires a resync

Sends an alert if the replication process for a backup falls too far behind the oplog to catch up. This occurs when the host overwrites oplog entries that the backup has not yet replicated. When this happens, you must resync the backup.

User Alerts

The following alert conditions apply to Atlas users.

User joined the project

Sends an alert when a new user joins the Atlas project.

User left the project

Sends an alert when a user leaves the Atlas project.

User had their role changed

Sends an alert when an Atlas user’s roles have changed.

Project Alerts

The following alert conditions apply to your Atlas project.

Users awaiting approval to join project

Sends an alert if there are users who have asked to join the project. A user can ask to join a project when first registering for Atlas.

Users do not have two-factor authentication enabled

Sends an alert if the project has users who have not set up two-factor authentication.

Billing Alerts

The following alert conditions apply to Atlas billing.

Credit card is about to expire

Sends an alert if the credit card on file is about to expire. The alert is triggered at the beginning of the month that the card expires. Atlas enables this alert when a credit card is added for the first time.

Daily amount billed ($) is above threshold

Sends an alert if the project’s last daily amount billed exceeds your configured threshold. Atlas does not account for any credits applied for the previous day when calculating the billed amount.

Note

The amount billed is in USD.

Project pending invoice ($) total is above threshold

Sends an alert if the project’s pending monthly invoice exceeds your configured threshold. When the current pending invoice closes, this alert resets.

This alert resolves automatically if you pay enough to bring the pending invoice below the threshold. It may take up to 24 hours for the payment to clear. This alert can repeat if the pending invoice exceeds the threshold again before the invoice closes.

Note

The amount billed is in USD.