Configure Online Archive

You can configure data in a collection to be archived by specifying an archiving rule. The archiving rule can be a:

  • Combination of a date field, which determines when data is archived, and a number of days for which the Atlas cluster keeps the data.
  • Custom query that is used to select the documents to archive.

To configure your Atlas cluster for online archive:

  1. Create an archiving rule by providing the collection namespace and the criteria for selecting data to archive in the collection.
  2. (Optional) Specify commonly queried fields to partition archived data.

To configure an online archive in the Atlas UI:

1
  1. If it is not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.
  2. If it is not already displayed, select your desired project from the Projects menu in the navigation bar.
  3. If the Clusters page is not already displayed, click Clusters in the sidebar.
2
  1. Click the name of the cluster.
  2. Click the Online Archive tab to view the list of online archives, if any, for the cluster.
3
4
5
  1. Specify the collection namespace, which includes the database name, the dot (.) separator, and the collection name (that is, <database>.<collection>), in the Namespace field.

    You can't modify the namespace once the online archive is created.

  2. Specify the criteria for selecting documents to archive under the Date Match or Custom Query tab.

    Date Match

    To select documents from the collection using a combination of a date field and number of days:

    • Specify an already indexed date field from the documents in the collection. To specify a nested field, use dot notation.
    • Specify the number of days to keep the data in the Atlas cluster.
    • Choose the date format of the specified date field.

    You can't modify the date field once the online archive is created.
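As a rough illustration of the Date Match rule, the following sketch marks a document as eligible for archiving when its date field is older than the configured number of days. The helper names (`get_nested`, `is_eligible`), the sample documents, and the `meta.released` field are hypothetical. Atlas evaluates the rule on the server, and its exact boundary handling is not described here, so treat the comparison as an assumption.

```python
from datetime import datetime, timedelta, timezone

def get_nested(doc, dotted_path):
    """Resolve a dot-notation path such as 'meta.released' in a dict."""
    value = doc
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value

def is_eligible(doc, date_field, keep_days, now):
    """Return True if the document's date is more than keep_days before now."""
    value = get_nested(doc, date_field)
    if not isinstance(value, datetime):
        return False  # assumption: documents without a date value are skipped
    return value < now - timedelta(days=keep_days)

# Hypothetical sample data with a nested date field.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
docs = [
    {"title": "old", "meta": {"released": datetime(2023, 1, 1, tzinfo=timezone.utc)}},
    {"title": "new", "meta": {"released": datetime(2024, 1, 30, tzinfo=timezone.utc)}},
]
eligible = [d["title"] for d in docs if is_eligible(d, "meta.released", 30, now)]
print(eligible)
```

With a 30-day window, only the document released in 2023 is selected.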

    Custom Query

    To select documents from the collection using a custom query, specify the JSON query to run. Atlas uses the specified query with the db.collection.find(query) command. The empty document ({}), which would return all documents, is not supported.
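Before pasting a custom query into the UI, it can help to confirm locally that it is valid JSON and not the empty document. The sketch below does only that; `validate_custom_query` and the sample query fields (`status`, `year`) are hypothetical, and it does not check whether Atlas accepts every operator in the query.

```python
import json

def validate_custom_query(text):
    """Parse a JSON query string and reject the unsupported empty document."""
    query = json.loads(text)  # raises json.JSONDecodeError (a ValueError) on invalid JSON
    if not isinstance(query, dict):
        raise ValueError("the query must be a JSON document (object)")
    if not query:
        raise ValueError("the empty document ({}) is not supported")
    return query

query = validate_custom_query('{"status": "closed", "year": {"$lt": 2015}}')
print(query)
```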
6
7

To specify nested fields, use the dot notation.

The specified fields are used to partition your archived data. Partitions are similar to folders. The date field is in the first position of the partition by default. You can move another field to the first position of the partition if you frequently query by that field.

For example, suppose you are configuring the online archive for the movies collection in the sample_mflix database. If you archive on the released date field and move it to the third position, with title as the first queried field and plot as the second, your partition looks similar to the following:

/title/plot/released
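The ordering in this example can be sketched as follows. The `partition_path` helper is hypothetical and only mirrors the folder-like layout shown above; it is not how Atlas builds partitions internally.

```python
def partition_path(ordered_fields):
    """Join partition fields, in order, into a folder-like path."""
    return "/" + "/".join(ordered_fields)

# The date field is in the first position by default.
fields = ["released", "title", "plot"]

# Move the date field to the last (third) position.
fields.remove("released")
fields.append("released")

path = partition_path(fields)
print(path)  # /title/plot/released
```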

The value of a partition field can be at most 700 characters. Documents with values exceeding 700 characters are not archived.

  • Choose fields that do not contain polymorphic data. Atlas determines the data type of a partition field by sampling 10 documents from the collection. Atlas will not archive a document if the specified field value in a document does not match values in other documents in the same collection.
  • Choose query fields that do not have a large number of values unless you always use those fields in your queries. Query fields with a potentially large number of values, such as _id, can cause operations such as count to touch all documents, resulting in high latency.
  • Choose fields that you query frequently and order them from the most frequently queried in the first position to the least queried field in the last position. For example, if you frequently query on the date field, then leave the date field in the first position. But if you frequently query on another field, then that field should be in the first position.
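To screen a candidate partition field against the guidance above, a local check along the following lines may help. It is a minimal sketch, assuming documents are plain Python dicts and the field is top-level; `check_partition_field` is a hypothetical helper, and Python type names stand in for BSON types.

```python
MAX_PARTITION_VALUE_CHARS = 700  # limit stated in the text above

def check_partition_field(docs, field):
    """Return a list of problems found for one top-level field."""
    problems = []
    types_seen = set()
    for i, doc in enumerate(docs):
        if field not in doc:
            continue
        value = doc[field]
        types_seen.add(type(value).__name__)
        if isinstance(value, str) and len(value) > MAX_PARTITION_VALUE_CHARS:
            problems.append(
                f"document {i}: value longer than {MAX_PARTITION_VALUE_CHARS} characters"
            )
    if len(types_seen) > 1:
        problems.append("polymorphic field: " + ", ".join(sorted(types_seen)))
    return problems

# Hypothetical sample: one integer title and one over-long title.
docs = [{"title": "A"}, {"title": 42}, {"title": "x" * 701}]
problems = check_partition_field(docs, "title")
print(problems)
```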
Note

Data partitions based on the date field from the document are truncated to the day even if there are timestamps with seconds in the date field value.
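The effect of truncating to the day can be shown with two timestamps from the same date. The `partition_day` helper and its YYYY-MM-DD string format are hypothetical; the point is only that both values map to the same day.

```python
from datetime import datetime, timezone

def partition_day(value):
    """Truncate a datetime to its calendar day (illustrative format)."""
    return value.date().isoformat()

a = datetime(2020, 5, 17, 8, 15, 42, tzinfo=timezone.utc)
b = datetime(2020, 5, 17, 23, 59, 59, tzinfo=timezone.utc)

# Both timestamps fall on the same day, so they share a partition value.
same = partition_day(a) == partition_day(b)
print(partition_day(a), same)
```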

Partition fields of type UUID must be of binary subtype 4. Atlas skips partition fields of type UUID with subtype 3. To learn more about the supported partition attribute types, see Partition Attribute Types.

While partitions improve query performance, queries that don't contain these fields require a full collection scan of all archived documents, which will take longer and increase your costs. To learn more about how partitions improve your query performance in Data Lake, see Data Structure in S3.

8
9

You can run explain on the query to check whether it uses an index. Proceed to the next step to create the index if the fields are not indexed. If the fields are already indexed, skip to step 11.
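One way to read the explain output is to walk the winning plan and look for an index scan stage. The sketch below assumes the usual `queryPlanner.winningPlan` shape, with `stage`, `inputStage`, and `inputStages` keys, and that newer server versions may nest the plan under `queryPlan`. The sample plans are hand-written, not real output, and `uses_index` is a hypothetical helper.

```python
def uses_index(plan):
    """Return True if any stage in an explain winning plan is an IXSCAN."""
    plan = plan.get("queryPlan", plan)  # assumption: some versions nest the plan here
    if plan.get("stage") == "IXSCAN":
        return True
    children = []
    if "inputStage" in plan:
        children.append(plan["inputStage"])
    children.extend(plan.get("inputStages", []))
    return any(uses_index(child) for child in children)

# Hand-written samples shaped like explain()["queryPlanner"]["winningPlan"].
indexed_plan = {
    "stage": "FETCH",
    "inputStage": {"stage": "IXSCAN", "indexName": "released_1"},
}
scan_plan = {"stage": "COLLSCAN"}

print(uses_index(indexed_plan), uses_index(scan_plan))
```

With PyMongo, a plan like this could be obtained from `collection.find(query).explain()["queryPlanner"]["winningPlan"]`.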

10
11
  1. Click Begin Archiving in the Confirm an online archive tab.
  2. Click Confirm in the Begin Archiving window.
Note

Once your document is queued for archiving, you can no longer edit the document. You must use mongodump or mongoexport to move archived data back into the live Atlas cluster, and delete the archive to remove the data from the archive.

To configure an online archive from the API, send a POST request to the onlineArchives endpoint. If the cluster already has an Active online archive with the same archiving rule for the same database and collection, the operation fails. However, if the existing online archive is in the Paused or Deleted state, the new online archive is created and its status is set to Active. To learn more about the API syntax and options, see Create an Online Archive.
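The request for a date-based rule might be assembled as in the sketch below. The base URL, the path layout, and the body field names (`dbName`, `collName`, `criteria`, `partitionFields`) are written from memory of the v1.0 Atlas Admin API and are assumptions here; verify them against Create an Online Archive before use. The function only builds the URL and JSON body. Sending the request and authenticating with API keys are omitted.

```python
import json

BASE_URL = "https://cloud.mongodb.com/api/atlas/v1.0"  # assumed API base URL

def build_archive_request(group_id, cluster_name, db, coll,
                          date_field, expire_after_days, partition_fields):
    """Return (url, json_body) for a date-based online archive (assumed schema)."""
    url = f"{BASE_URL}/groups/{group_id}/clusters/{cluster_name}/onlineArchives"
    body = {
        "dbName": db,
        "collName": coll,
        "criteria": {
            "type": "DATE",
            "dateField": date_field,
            "dateFormat": "ISODATE",
            "expireAfterDays": expire_after_days,
        },
        # Partition order follows the list order, starting at 0.
        "partitionFields": [
            {"fieldName": name, "order": order}
            for order, name in enumerate(partition_fields)
        ],
    }
    return url, json.dumps(body)

# Placeholder project ID and cluster name.
url, body = build_archive_request(
    "PROJECT-ID", "Cluster0", "sample_mflix", "movies",
    "released", 30, ["title", "plot", "released"],
)
print(url)
print(body)
```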

You can create up to 50 online archives per cluster, of which up to 20 can be active. The following limitations apply:

  • You can configure multiple online archives in the same namespace, but only one can be active at any given time.
  • You cannot create multiple online archives on the same fields in the same collection.
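The numeric limits and the one-active-archive-per-namespace rule can be expressed as a simple pre-check. This is a sketch only: `can_create_active_archive` is hypothetical, archives are modeled as (namespace, state) pairs, and it assumes every listed archive counts toward the limit of 50 regardless of state, which the text does not confirm.

```python
MAX_ARCHIVES_PER_CLUSTER = 50
MAX_ACTIVE_PER_CLUSTER = 20

def can_create_active_archive(existing, namespace):
    """existing: list of (namespace, state) pairs. Return (ok, reason)."""
    if len(existing) >= MAX_ARCHIVES_PER_CLUSTER:
        return False, "cluster already has 50 online archives"
    active = [ns for ns, state in existing if state == "Active"]
    if len(active) >= MAX_ACTIVE_PER_CLUSTER:
        return False, "cluster already has 20 active online archives"
    if namespace in active:
        return False, "namespace already has an active online archive"
    return True, "ok"

existing = [
    ("sample_mflix.movies", "Active"),
    ("sample_mflix.comments", "Paused"),
]
print(can_create_active_archive(existing, "sample_mflix.movies"))
print(can_create_active_archive(existing, "sample_mflix.comments"))
```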