
Atlas Production Best Practices

With MongoDB Atlas as your data platform, you can shift your focus away from the routine tasks and workflows required to build and maintain database infrastructure and toward helping engineers add value to the business. Instead of maintaining hardware and keeping up with operating system-level software patches, engineers can devote their time and energy to developing data models that meet the current and future requirements of your enterprise.

This document outlines some best practices for establishing and maintaining a successful MongoDB production deployment on MongoDB Atlas.

Roles and Responsibilities

MongoDB manages and operates the infrastructure required to provide a MongoDB Database Service to the customer. MongoDB’s responsibilities include the following:

  • Manage the database clusters and underlying infrastructure, ensuring availability, stability, and performance of MongoDB, backed by a 99.995% Uptime Service Level Agreement (SLA) for clusters of size M30 and larger.
  • Ensure the health of the underlying compute nodes. Make sure they are running, have network connectivity, and have all recommended OS-level patches to maintain the Uptime SLA.
  • Manage the MongoDB database configuration based on the customer’s specific design choices made via the Atlas user interface or REST API.
  • Apply all MongoDB maintenance upgrades automatically to ensure the latest bug fixes to the product are in use.
  • Manage the security profile, including Role-Based Access Control, IP Whitelisting, and peering to maximize cluster security per the customer’s direction.
  • Provide backup and restore services.

The customer continues to develop and deploy applications which access MongoDB, without having to directly manage the underlying database resources and/or infrastructure.

Organization and Project-Level Management

MongoDB Atlas abstracts away database operations so that you can focus on high-value, high-level management decisions.

Creating a well-designed hierarchy of organizations and projects within Atlas allows for maximum enterprise efficiency with minimum operational friction.

The Organization Level

At the Organization level, you can implement security controls and create users which work across one or more Projects. Atlas billing occurs at the Organization level.

To efficiently control user access and privileges, you can group users into teams at the Organization level.

The Project Level

Projects offer a security isolation and authorization boundary, so they are typically allocated by application team and application environment. For example, with two application teams there might be six projects: one for each team in each of the Development, Staging, and Production environments.
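The team-by-environment layout described above can be sketched in a few lines of Python. This is only an illustration: the team names and the naming scheme are hypothetical, and nothing here calls Atlas.

```python
from itertools import product

# Hypothetical team names; the environments come from the example above.
TEAMS = ["payments", "search"]
ENVIRONMENTS = ["Development", "Staging", "Production"]


def project_names(teams, environments):
    """Return one project name per (team, environment) pair."""
    return [f"{team}-{env.lower()}" for team, env in product(teams, environments)]


projects = project_names(TEAMS, ENVIRONMENTS)
for name in projects:
    print(name)
```

Two teams and three environments yield six projects, each of which can then carry its own users, roles, and network access rules.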

You can create project-level Atlas users and roles with appropriate access to the different production and development application environments.

  • Users with the Project Read Only role can access project-level monitoring and system health metadata without having access to any collection data or administrative operations.
  • Users with the Project Cluster Manager role can scale clusters and perform other administrative operations, but have no data-level access.

Other project-level responsibilities include:

Application Management

Application-level responsibilities include:

  • Schema design, including query and index optimization.
  • Cluster tier and topology selection. Choosing the appropriate cluster size and topology (replica set or sharded cluster), along with storage capacity and IOPS is crucial for optimal database performance.
  • Provisioning of non-production clusters. Production backups can be restored into non-production clusters with the Atlas UI or the API.
  • Capacity planning. Determining when additional computational capacity is needed, typically using the monitoring telemetry that Atlas provides. Additional capacity can be added with no application downtime, and you can optionally enable auto-scaling to respond automatically to spikes in usage.
  • Deciding when to implement a major database version upgrade.
  • Implementing and testing a backup and restoration plan.
  • Ensuring that applications gracefully handle cluster failover through testing.
  • Configuring data analytics services with tools such as BI Connector and Charts.
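As one illustration of the failover item in the list above, the following sketch retries an operation with exponential backoff while a new primary is being elected. It assumes a hypothetical exception type, TransientClusterError, and a simulated write; in a real application the error would come from your driver, and MongoDB drivers also offer built-in retryable reads and writes that this sketch does not replace.

```python
import time


class TransientClusterError(Exception):
    """Hypothetical stand-in for a driver error raised during a failover."""


def with_retry(operation, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Run operation, retrying on TransientClusterError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientClusterError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * 2 ** attempt)


def make_flaky_write(failures):
    """Simulate a write that fails `failures` times before succeeding."""
    state = {"remaining": failures}

    def write():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise TransientClusterError("no primary available")
        return "acknowledged"

    return write


# The sleep function is replaced so the example runs instantly.
result = with_retry(make_flaky_write(2), sleep=lambda _: None)
print(result)
```

Injecting the sleep function keeps the helper easy to exercise in tests, which is useful when failover handling is verified as part of a regular test suite.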

Scaling

MongoDB Atlas offers two methods for scaling: vertical and horizontal.

Vertical scaling involves increasing a cluster’s storage capacity, computing power, and/or IOPS rate. Vertical scaling can be accomplished quickly and is useful for peak usage periods.

When scaling vertically, M30 and higher clusters are recommended for production environments. You can use M10 and M20 clusters as production environments for low-traffic applications, but these tiers are recommended for development environments.

Horizontal scaling involves implementing sharding or adding additional shards to an existing sharded cluster. Horizontal scaling requires careful planning and execution, and is part of a long-term growth strategy.

Vertical and horizontal scaling can be combined in Atlas. For example, a sharded cluster can be vertically scaled up for a peak period, increasing the storage capacity and computing power of the individual sharded cluster members.

By default, Atlas auto-scales cluster storage up to your configured cluster tier size limit.

You can configure Atlas to automatically scale your cluster tier and cluster storage capacity in response to increased cluster usage, allowing for a rapid, automated response to a need for greater storage capacity or computing power.

Single Region and Multi-Region Clusters

High availability and cluster durability depend on a cluster’s geographical deployment configuration. Clusters which are deployed within a single region are spread across availability zones within that region, so they can withstand partial region outages without an interruption of read or write availability.

You can optionally choose to spread your clusters across two or more regions for greater resiliency and workload isolation.

The order of regions determines the priority order for the location of the primary node. Therefore, if you wish to direct database write operations to a particular region when that region is available, you should list that region first. The second region on the list should be the second choice for where writes should go if the first region is unavailable.

The following example from the Atlas Create a Cluster UI shows a multi-region cluster with electable nodes in three different regions, arranged by priority from highest to lowest:

Screenshot of electable nodes across three regions

If the us-east-1 region becomes unavailable, a new primary will be elected in the us-west-1 region.
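The effect of region ordering can be modeled with a short sketch. It is a simplification that only picks the highest-priority region still available; it does not model the election protocol or the majority requirement. The regions us-east-1 and us-west-1 come from the example above, while eu-west-1 is an assumed third region.

```python
# Regions in priority order, highest first. eu-west-1 is an assumption.
REGION_PRIORITY = ["us-east-1", "us-west-1", "eu-west-1"]


def preferred_primary_region(priority, unavailable):
    """Return the highest-priority region that is still available, or None."""
    for region in priority:
        if region not in unavailable:
            return region
    return None


print(preferred_primary_region(REGION_PRIORITY, unavailable=set()))
print(preferred_primary_region(REGION_PRIORITY, unavailable={"us-east-1"}))
```

With all regions available the function returns the first region in the list; with us-east-1 removed it falls through to the second.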

Note

Clusters must have an odd number of nodes to ensure primary electability. To learn more, see Replica Set Elections.

Deployment in Two Regions

Deploying a cluster to two regions ensures that a copy of your data will always be maintained in more than one region. However, a loss of the region which contains a majority of the nodes in the cluster will leave the second region in a read-only state until an administrator intervenes or the original region becomes available.

Deployment in Three or More Regions

Deploying a cluster to three or more regions ensures that the cluster can withstand a full region-level outage while maintaining read and write availability, provided the application layer is fault-tolerant.
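The difference between the two-region and three-region cases comes down to whether a majority of electable nodes survives the loss of a region. The sketch below checks that condition for hypothetical node layouts; the region names and node counts are placeholders.

```python
def can_elect_primary(nodes_per_region, lost_region):
    """Return True if the surviving electable nodes form a strict majority."""
    total = sum(nodes_per_region.values())
    surviving = total - nodes_per_region.get(lost_region, 0)
    return surviving > total // 2


# Hypothetical layouts of three electable nodes.
two_regions = {"region-a": 2, "region-b": 1}
three_regions = {"region-a": 1, "region-b": 1, "region-c": 1}

# Losing the majority region of a two-region cluster leaves it read-only.
print(can_elect_primary(two_regions, "region-a"))
# Losing the minority region does not affect write availability.
print(can_elect_primary(two_regions, "region-b"))
# A three-region cluster with one node per region survives any single loss.
print(all(can_elect_primary(three_regions, r) for r in three_regions))
```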

If keeping write operations in your preferred region at all times is a high priority, deploy the cluster so that at least two electable members are in at least two data centers within your preferred region.

Global Clusters

For the best database performance in a worldwide deployment, users can configure a global cluster which uses location-aware sharding to minimize read and write latency. Users with geographical storage requirements can also ensure that data is stored in a particular geographical area.
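Conceptually, location-aware sharding routes each document to a zone based on a location value stored in the document. The sketch below models that routing in plain Python. The field name location, the country-to-zone mapping, the zone names, and the default zone are all assumptions for illustration and do not describe the Atlas configuration itself.

```python
# Hypothetical mapping from country codes to zones.
ZONE_BY_COUNTRY = {
    "US": "americas",
    "CA": "americas",
    "DE": "europe",
    "FR": "europe",
    "JP": "asia-pacific",
}
DEFAULT_ZONE = "americas"  # assumed fallback for unmapped locations


def zone_for(document, location_field="location"):
    """Return the zone that should store the document, based on its location."""
    return ZONE_BY_COUNTRY.get(document.get(location_field), DEFAULT_ZONE)


orders = [
    {"_id": 1, "location": "DE"},
    {"_id": 2, "location": "JP"},
    {"_id": 3, "location": "BR"},  # unmapped, so it falls back to the default
]
for order in orders:
    print(order["_id"], zone_for(order))
```

Keeping documents in the zone nearest their users reduces latency, and a fixed mapping like this is also how a geographical storage requirement could be expressed.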

Support

Different tiers of support are available, including options for customers in development and for enterprise customers.

Possible support areas include:

  • Issues and concerns with the MongoDB clusters under management.
  • Performance-related inquiries.
  • Application-side and driver consultation.