Configure Online Archive

On this page

  • Overview
  • Required Access
  • Configure Online Archive Through the Atlas CLI
  • Configure Online Archive Through the API
  • Configure Online Archive Through the User Interface
  • Limitations

Important

Feature unavailable in Serverless Instances

Serverless instances don't support this feature at this time. To learn more, see Serverless Instance Limitations.

You can configure data in a collection to be archived by specifying an archiving rule. The archiving rule for a:

  • Time series collection is a combination of a time that is used to determine when to archive data and a numeric value representing the number of days that the Atlas cluster stores the data.

  • Standard collection can be one of the following:

    • A combination of a date that is used to determine when to archive data and a numeric value representing the number of days that the Atlas cluster stores the data.

    • A custom query that is used to select the documents to archive.

To configure your Atlas cluster for online archive:

  1. Create an archiving rule by providing the collection namespace and the criteria for selecting data to archive in the collection.

  2. (Optional) Specify commonly queried fields to partition archived data.

When you configure an Online Archive on your cluster, Atlas creates two federated database instances: one for your archive only, and one for your cluster and archive together.

To create an Online Archive, you must have Project Data Access Admin access or higher to the project.

To watch for an archive to be available, you must have Project Read Only access or higher to the project.

Note

For the first 7 days after Atlas creates an archive, Atlas archives all data that matches the archiving rule. After 7 days, Atlas archives data only when the size of the data to archive reaches 5 MiB.
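The size rule in this note can be sketched as a small predicate. This is only an illustration of the rule as worded here, not Atlas code: the function name, the parameter names, and the treatment of day 7 itself as still inside the initial window are assumptions.

```python
MIB = 1024 * 1024
INITIAL_WINDOW_DAYS = 7        # from the note: all data is archived during this window
MIN_ARCHIVE_BYTES = 5 * MIB    # from the note: threshold that applies after the window


def would_archive(days_since_archive_created: float, eligible_bytes: int) -> bool:
    """Return True if data of this size would be archived under the rule above.

    Hypothetical helper for illustration only.
    """
    if days_since_archive_created <= INITIAL_WINDOW_DAYS:
        # Assumption: day 7 itself still counts as part of the initial window.
        return True
    return eligible_bytes >= MIN_ARCHIVE_BYTES
```

For example, 1 KiB of eligible data would be archived on day 3 but not on day 10, while 5 MiB would be archived on either day.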

To create an online archive for a cluster using the Atlas CLI, run the following command:

atlas clusters onlineArchives create [options]

To watch for a specific online archive to become available using the Atlas CLI, run the following command:

atlas clusters onlineArchives watch <archiveId> [options]

To learn more about the syntax and parameters for the previous commands, see the Atlas CLI documentation for atlas clusters onlineArchives create and atlas clusters onlineArchives watch.

To configure an online archive from the API, send a POST request to the onlineArchives endpoint.

Note

If you use the DATE criteria, you must specify the date field as part of the partition keys.

If the cluster already has an Active online archive with the same archiving rule for the same database and collection, the operation will fail. However, if the existing online archive is in Paused or Deleted state, the new online archive is created and its status is set to Active. To learn more about the syntax and options, see API.
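As a rough sketch of the DATE criteria rule above, the following builds a request body locally and rejects it when the date field is missing from the partition keys. The key names (`dbName`, `collName`, `criteria`, `dateField`, `expireAfterDays`, `partitionFields`, `fieldName`, `order`) and the helper name are assumptions made for illustration; confirm the actual schema in the API reference before sending anything. No request is sent here.

```python
import json


def build_archive_request(db_name, coll_name, date_field,
                          expire_after_days, partition_fields):
    """Build a request body for a date-based archiving rule.

    Hypothetical helper: every key name below is an assumption,
    not a confirmed part of the onlineArchives schema.
    """
    if date_field not in partition_fields:
        # Rule from the note above: with DATE criteria, the date field
        # must be one of the partition keys.
        raise ValueError(
            f"date field {date_field!r} must be one of the partition fields"
        )
    return {
        "dbName": db_name,
        "collName": coll_name,
        "criteria": {
            "type": "DATE",
            "dateField": date_field,
            "expireAfterDays": expire_after_days,
        },
        # Position in the list becomes the partition order.
        "partitionFields": [
            {"fieldName": name, "order": position}
            for position, name in enumerate(partition_fields)
        ],
    }


if __name__ == "__main__":
    body = build_archive_request(
        "sales", "orders", "createdAt", 90, ["createdAt", "customerId"]
    )
    print(json.dumps(body, indent=2))
```

Validating the payload before the POST keeps the failure local and easy to read, instead of relying on the error returned by the endpoint.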

To configure an Online Archive, in your Atlas UI:

1
  1. If it is not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.

  2. If it is not already displayed, select your desired project from the Projects menu in the navigation bar.

  3. If the Database Deployments page is not already displayed, click Database in the sidebar.

2
  1. Click the name of the cluster.

  2. Click the Online Archive tab to view the list of online archives, if any, for the cluster.

3

To configure an online archive for your collection, click:

  • The Configure Online Archive button the first time you configure an archive for the cluster.

  • The Add Archive button for each subsequent archive.

4
5
  1. Specify the collection namespace, which includes the database name, the dot (.) separator, and the collection name (that is, <database>.<collection>), in the Namespace field.

    You can't modify the namespace once the online archive is created.

  2. Select the cloud provider region where you want to store your archived data.

    Tip

    We recommend that you select the same region as your cluster if possible because you might incur higher data transfer cost if you choose a different region.

    Atlas displays the cloud provider regions based on the cloud provider where your cluster is deployed. For multi-cloud clusters, Atlas displays the cloud provider regions of the highest priority provider. Atlas marks the region that closely or exactly matches the region where your cluster is deployed.

    Note

    Once Atlas creates the online archive, you can't modify the storage region.

  3. Specify the criteria for selecting documents to archive for the type of collection you want to archive.

    Note

    Atlas runs an index sufficiency query to determine the efficiency of the archival process. If the ratio of documents scanned to documents returned is 10 or more, the query result triggers an Index Sufficiency Warning. This warning indicates that you have insufficient indexes for an efficient archival process. For date-based archives, you must index the date field. For custom criteria that use an expression, Atlas might first convert a value before it evaluates it against the query.
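The ratio check described in the note can be sketched as follows. This is not the query that Atlas runs; it only illustrates the threshold. The function name is hypothetical, and the handling of a query that returns no documents is an assumption. The two counts could come, for instance, from the `totalDocsExamined` and `nReturned` values in explain output with execution statistics.

```python
WARNING_RATIO = 10  # threshold stated in the note above


def index_sufficiency_warning(docs_scanned: int, docs_returned: int) -> bool:
    """Return True if the scanned-to-returned ratio is 10 or more.

    Hypothetical helper for illustration only.
    """
    if docs_scanned < 0 or docs_returned < 0:
        raise ValueError("document counts must be non-negative")
    if docs_returned == 0:
        # Assumption: scanning documents without returning any is treated
        # as inefficient; scanning nothing is not.
        return docs_scanned > 0
    return docs_scanned / docs_returned >= WARNING_RATIO
```

A query that scans 1,000 documents to return 100 sits exactly at the threshold and would trigger the warning under this sketch; one that scans 500 to return 100 would not.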

6
  1. (Optional) Specify a Deletion Age Limit.

    By default, Atlas doesn't delete archived data. However, if you specify the Deletion Age Limit, you can keep archived data for between 7 and 9125 days (25 years). Atlas deletes archived data after the number of days you specify here. This data expiration rule takes effect 24 hours after you set the Deletion Age Limit.

    Warning

    Once Atlas deletes the data, you can't recover the data.

  2. (Optional) Specify a Schedule Archiving Window.

    By default, Atlas periodically runs a query to archive data. However, you can toggle the Schedule Archiving Window to explicitly schedule the time window during which you want Atlas to archive data. You can specify the following:

    • Frequency. You can choose to run the job every day, on a specific day of the week, or on a specific date every month. If you schedule the data archiving job on the 29th, 30th, or 31st of every month, Atlas skips the job in months that don't have that date (for example, February).

    • Time window, in hours. Select the period of time during which you want Atlas to run the data archiving job. You must specify a minimum of two hours. If a running job doesn't complete during the specified time window, Atlas continues to run the job until it completes.
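To see which months a monthly schedule would skip, the following sketch lists the months of a given year that contain the chosen day. It only mirrors the calendar rule described above and says nothing about how Atlas schedules jobs internally; the function name is hypothetical.

```python
import calendar


def months_job_runs(year: int, day_of_month: int) -> list[int]:
    """Return the month numbers (1-12) in `year` that contain `day_of_month`.

    Hypothetical helper for illustration only.
    """
    if not 1 <= day_of_month <= 31:
        raise ValueError("day_of_month must be between 1 and 31")
    return [
        month
        for month in range(1, 13)
        # monthrange returns (weekday of first day, number of days in month).
        if calendar.monthrange(year, month)[1] >= day_of_month
    ]


if __name__ == "__main__":
    print(months_job_runs(2023, 31))  # prints [1, 3, 5, 7, 8, 10, 12]
```

A job scheduled for the 29th would skip February in 2023 but run in every month of 2024, which is a leap year.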

7
8

Note

An archive must have at least one partition field.

  • Choose fields that contain only characters supported on AWS. To learn more about the characters to avoid, see Creating object key names. Atlas skips and doesn't archive documents that contain unsupported characters.

  • Choose fields that do not contain polymorphic data. Atlas determines the data type of a partition field by sampling 10 documents from the collection. Atlas will not archive a document if the data type of the specified field in that document does not match the type found in other documents in the same collection.

  • Choose fields that you query frequently and order them from the most frequently queried in the first position to the least queried field in the last position. For example, if you frequently query on the date field, then leave the date field in the first position. But if you frequently query on another field, then that field should be in the first position.
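The last two guidelines can be checked before you pick partition fields. The sketch below works on plain Python dictionaries standing in for sampled documents; it is not how Atlas samples or types fields, and all function names and the query counts are hypothetical.

```python
from collections import Counter


def field_types(sample_docs, field):
    """Return the set of Python type names seen for `field` in the sample."""
    return {type(doc[field]).__name__ for doc in sample_docs if field in doc}


def is_polymorphic(sample_docs, field):
    """Return True if `field` holds more than one type across the sample."""
    return len(field_types(sample_docs, field)) > 1


def order_partition_fields(query_counts):
    """Order field names from most to least frequently queried.

    `query_counts` maps a field name to how often you query on it
    (a number you would have to collect yourself).
    """
    return [name for name, _ in Counter(query_counts).most_common()]


if __name__ == "__main__":
    docs = [{"sku": "A-1"}, {"sku": 7}]
    print(is_polymorphic(docs, "sku"))
    print(order_partition_fields({"createdAt": 5, "customerId": 40, "region": 12}))
```

A field that mixes strings and numbers, like `sku` in the usage above, is a poor partition field under the second guideline, and the ordering helper places the most frequently queried field in the first position.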

Note

For Online Archives created before June 2023, MongoDB doesn't recommend string type fields with high cardinality as a query field for Online Archives. For fields of type string with high cardinality, Atlas creates a large number of partitions. This doesn't apply to Online Archives created after June 2023. To learn more, read the MongoDB blog post.

Atlas supports the following partition attribute types:

  • date

  • double

  • int

  • long

  • objectId

  • string

  • uuid

    Note

    Partition fields of type UUID must be of binary subtype 4. Atlas skips partition fields of type UUID with subtype 3.

To learn more about the supported partition attribute types, see Partition Attribute Types.

Note

You can use the explain command to return information about the data partitions used to satisfy a query. To learn more, see explain.

While partitions improve query performance, queries that don't contain these fields require a full collection scan of all archived documents, which will take longer and increase your costs. To learn more about how partitions improve your query performance in Atlas Data Federation, see Data Structure in S3.

9

You can review the following archiving rule settings:

  • The name of the database and collection

  • The name of the cloud provider and the cloud provider region

  • The name of the date field (for Date Match only)

  • The number of days to keep data on the Atlas cluster (for Date Match only)

  • The number of days after which to delete archived data

  • The frequency and time window for archiving data

  • The custom query to use to identify data to archive (for Custom Criteria only)

  • The partition fields

Click Back to edit these settings if needed.

10

You can run explain on the query to check whether it uses an index. If the fields are not indexed, proceed to the next step to create the index. If the fields are already indexed, skip to step 12.

11
12
  1. Click Begin Archiving in the Confirm an online archive tab.

  2. Click Confirm in the Begin Archiving window.

Note

Once your document is queued for archiving, you can no longer edit the document. See Restore Archived Data to move archived data back into the live Atlas cluster.

You can create up to 50 online archives per cluster and up to 20 can be active per cluster. The following limitations apply:

  • You can configure multiple online archives in the same namespace, but only one can be active at any given time.

  • You can't create multiple online archives on the same fields in the same collection.

  • You can't access your online archive during the following scenarios:

    • A full outage of the primary region of your cluster.

    • An outage of AWS S3 where your archived data is stored.

  • You can't use an archiving rule for more than one collection.

    Note

    If your goal is to archive data from several collections, you must create an archiving rule for each collection.

  • After the first 7 days, you can't archive data smaller than 5 MiB. For 7 days immediately after Atlas creates an archive, Atlas archives all data. After 7 days, Atlas archives data only when your data size reaches 5 MiB.
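The count limits above can be checked with a small helper before you add an archive. This is a sketch under stated assumptions: the state string "ACTIVE", the tuple layout, and the choice to count archives in every state toward the 50-archive limit are illustrative and not taken from the Atlas API.

```python
MAX_ARCHIVES_PER_CLUSTER = 50  # from the limitations above
MAX_ACTIVE_PER_CLUSTER = 20    # from the limitations above


def violations_for_new_archive(existing, namespace):
    """List the limits that a new active archive on `namespace` would break.

    `existing` is a list of (namespace, state) tuples for one cluster.
    Hypothetical helper; the "ACTIVE" state name is an assumption.
    """
    problems = []
    # Assumption: archives in any state count toward the per-cluster total.
    if len(existing) >= MAX_ARCHIVES_PER_CLUSTER:
        problems.append("cluster already has 50 online archives")
    active_namespaces = [ns for ns, state in existing if state == "ACTIVE"]
    if len(active_namespaces) >= MAX_ACTIVE_PER_CLUSTER:
        problems.append("cluster already has 20 active online archives")
    if namespace in active_namespaces:
        problems.append("namespace already has an active online archive")
    return problems
```

An empty list means none of the three count limits would be exceeded; the other limitations in this list, such as outages, are outside what this check covers.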
