mongomirror

A utility for migrating data from an existing MongoDB replica set to a MongoDB Atlas replica set. mongomirror does not require you to shut down your existing replica set or applications.

Important

mongomirror does not import user/role data.

Syntax

To run mongomirror, you must specify the source replica set and the target Atlas replica set. For Atlas, you must specify a user in the Atlas cluster with appropriate privileges and the corresponding password. If the source replica set requires authentication, you must specify a user with appropriate privileges.

mongomirror --host <sourceReplSet> \
   --destination <atlasCluster> \
   --destinationUsername <atlasAdminUser> \
   --destinationPassword <atlasPassword> \
   [Additional options]

For details on the options, see Options.
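The syntax above can be assembled programmatically. The following is a minimal Python sketch, not part of mongomirror itself: the function name `build_mongomirror_args` and its parameters are hypothetical, and it emits only flags documented on this page.

```python
import shlex


def build_mongomirror_args(source, destination, dest_user, dest_password,
                           source_user=None, source_password=None,
                           auth_db=None, extra=()):
    """Assemble a mongomirror argument list (hypothetical helper)."""
    args = [
        "mongomirror",
        "--host", source,
        "--destination", destination,
        "--destinationUsername", dest_user,
        "--destinationPassword", dest_password,
    ]
    # Source credentials are needed only if the source requires authentication.
    if source_user is not None:
        args += ["--username", source_user]
    if source_password is not None:
        args += ["--password", source_password]
    if auth_db is not None:
        args += ["--authenticationDatabase", auth_db]
    # Any additional documented options, passed through unchanged.
    args += list(extra)
    return args


args = build_mongomirror_args(
    "sourceRS/source-host1:27017,source-host2:27017",
    "myAtlasRS/atlas-host1:27017,atlas-host2:27017",
    "myAtlasUser",
    "myAtlasPwd",
)
# Print a shell-quoted form of the command for review.
print(shlex.join(args))
```

Passing the list to `subprocess.run(args)` would start the migration, assuming the mongomirror binary is on the PATH.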

Migration Process

When you start mongomirror:

  1. First, mongomirror performs an initial sync, copying collections from the existing MongoDB replica set to the target cluster in Atlas.
  2. After the initial sync, mongomirror continuously tails the replica set’s oplog for incoming changes and replays them on the target cluster in MongoDB Atlas. See oplog Entries.

Once started, mongomirror runs continuously until you shut down the process.

  • If you shut down mongomirror during the initial sync stage, either ensure that the target cluster is empty before you restart mongomirror or run mongomirror with the --drop option.
  • If you shut down mongomirror during the oplog tailing stage, you can restart mongomirror to continue from the last oplog record processed. See the --bookmarkFile option.
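The two restart rules above can be expressed as a small decision helper. This is a sketch only: `restart_args` is a hypothetical function, and it assumes the caller already knows in which stage mongomirror stopped and whether the target cluster has been emptied by hand.

```python
def restart_args(base_args, stopped_during, target_is_empty=False):
    """Return the argument list to use when restarting mongomirror.

    stopped_during is "initial sync" or "oplog sync" (names chosen here
    for illustration).
    """
    args = list(base_args)
    if stopped_during == "initial sync":
        # The target must be empty before the initial sync runs again;
        # otherwise ask mongomirror to drop the target collections.
        if not target_is_empty and "--drop" not in args:
            args.append("--drop")
        return args
    if stopped_during == "oplog sync":
        # Restart with the same options; mongomirror continues from the
        # last oplog record processed (see --bookmarkFile).
        return args
    raise ValueError(f"unknown stage: {stopped_during!r}")
```

For a restart during oplog tailing, this sketch assumes the bookmark file from the earlier run is still where mongomirror expects it.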

Progress Measurement

mongomirror logs its progress to the standard output in the terminal:

During the initial sync, mongomirror logs a progress bar for each collection it copies. For example:

2016-12-16T17:54:53.638-0800  [#....................]  park.events  2179/34184    (6.4%)
2016-12-16T17:54:53.638-0800  [#############........]  zoo.animals  29000/49778  (58.3%)
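If you capture this output, the progress lines can be parsed for monitoring. The sketch below assumes the log format is exactly as in the two sample lines above; the function name `parse_progress_line` is hypothetical, and other mongomirror versions may format these lines differently.

```python
import re

# Matches lines such as:
# 2016-12-16T17:54:53.638-0800  [#....]  park.events  2179/34184    (6.4%)
PROGRESS_LINE = re.compile(
    r"^(?P<ts>\S+)\s+\[(?P<bar>[#.]+)\]\s+(?P<ns>.+?)\s+"
    r"(?P<copied>\d+)/(?P<total>\d+)\s+\((?P<pct>\d+(?:\.\d+)?)%\)\s*$"
)


def parse_progress_line(line):
    """Return a dict describing one progress line, or None if no match."""
    match = PROGRESS_LINE.match(line)
    if match is None:
        return None
    return {
        "timestamp": match.group("ts"),
        "namespace": match.group("ns"),
        "copied": int(match.group("copied")),
        "total": int(match.group("total")),
        "percent": float(match.group("pct")),
    }
```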

When tailing the oplog, mongomirror logs the lag time, in seconds, between the most recent oplog entry on the source and the last processed oplog entry on the target. For example:

2016-12-12T16:22:17.027-0800 Current lag from source: 6s

A lag time of 6 seconds means that the last oplog entry mongomirror processed was 6 seconds behind the most recent one available on the source.

Note

The time mongomirror needs to catch up completely may be more or less than 6 seconds, depending on how many entries arrive per second.

A lag time of 0 seconds indicates that mongomirror is processing entries that arrived less than one second before the latest oplog entry.
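The lag value can be extracted from captured output in the same way. This is a minimal sketch that assumes the lag is always logged as whole seconds in the form shown above (`Current lag from source: 6s`); the helper names are hypothetical.

```python
import re

LAG_LINE = re.compile(r"Current lag from source: (\d+)s\s*$")


def parse_lag_seconds(line):
    """Return the lag in seconds from a log line, or None if absent."""
    match = LAG_LINE.search(line)
    return int(match.group(1)) if match else None


def latest_lag(lines):
    """Return the most recent lag value found in an iterable of log lines."""
    lag = None
    for line in lines:
        value = parse_lag_seconds(line)
        if value is not None:
            lag = value
    return lag
```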

Performance

To avoid contention for network and CPU resources, do not run mongomirror on the same hosts that provide your replica set’s mongod instances.

  • mongomirror must have network access to the source replica set.
  • mongomirror must have network access to the target cluster.
  • mongomirror has approximately the same performance impact on your source replica set as a secondary:
    • For the initial sync stage, the load scales with the size of your data set.
    • Once an initial sync completes, the load scales with oplog gigabytes used per hour.

Considerations

Source MongoDB Deployment

  • The source MongoDB deployment must be a replica set. If the source is a standalone MongoDB deployment, convert to a replica set first to use mongomirror. See Convert a Standalone to a Replica Set.
  • The source replica set must be version 3.0 or greater.

Target Atlas Cluster

  • The target Atlas cluster must be a replica set, but cannot be an M0 (Free Tier) deployment.
  • For the initial sync stage of mongomirror, the target Atlas cluster should be empty or you should run mongomirror with the --drop option.
  • The target Atlas cluster must include in its whitelist either:
    • The public IP address of the server on which mongomirror is running, or
    • If set up for VPC peering, either the peer’s VPC CIDR block (or a subset) or the peer VPC’s Security Group.
  • mongomirror automatically connects to MongoDB Atlas over TLS/SSL.
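Before starting a migration, you can sanity-check that the address of the mongomirror host falls inside one of the whitelist entries you intend to configure. The sketch below uses Python's standard `ipaddress` module; the function name `is_covered` and the sample addresses are illustrative, and Security Group entries are outside its scope.

```python
import ipaddress


def is_covered(host_ip, whitelist_entries):
    """Return True if host_ip matches any IP address or CIDR block entry."""
    address = ipaddress.ip_address(host_ip)
    return any(
        # strict=False accepts entries whose host bits are set;
        # a bare IP address is treated as a single-address network.
        address in ipaddress.ip_network(entry, strict=False)
        for entry in whitelist_entries
    )
```

For example, `is_covered("10.0.1.5", ["10.0.0.0/16"])` checks an address against a peer VPC CIDR block.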

Required Access on Source Replica Set

If the source replica set requires authentication, you must include user credentials when running mongomirror. You must specify a MongoDB user that has the following privileges in the source replica set:

  • Read any database, including the local database.

A MongoDB user with the backup role in the source provides these privileges.

Required Access on Destination Cluster

You must include user credentials for the Atlas cluster when running mongomirror. You must specify a MongoDB user that has the following privileges in the Atlas cluster:

  • Read, write, and administer any database.

A MongoDB user with the Atlas admin role in the Atlas cluster provides these privileges.

oplog Entries

Because mongomirror tails the source oplog and applies the entries to the destination cluster, the destination oplog is not an exact duplicate of the source’s oplog. Instead, the tailed entries from the source oplog become part of an applyOps entry in the destination oplog.

For example, after mongomirror has performed the initial sync, the source replica set receives three insert operations and has the following oplog entries for these operations:

{"ts":<ts1>,"t":<t1>,"h":<h1>,"v":2,"op":"i","ns":"test.foo","o":{"_id":0,"a":0}}
{"ts":<ts2>,"t":<t2>,"h":<h2>,"v":2,"op":"i","ns":"test.foo","o":{"_id":1,"a":1}}
{"ts":<ts3>,"t":<t3>,"h":<h3>,"v":2,"op":"i","ns":"test.foo","o":{"_id":2,"a":2}}

As mongomirror tails the source oplog and applies these operations to the destination cluster, the three entries become part of a single entry in the destination cluster’s oplog:

{"ts":<ts>,"t":<t>,"h":<h>,"v":2,"op":"c","ns":"admin.$cmd","o":{"applyOps":
  [
    {"ts":<ts1>,"t":<t1>,"h":<h1>,"v":2,"op":"i","ns":"test.foo","o":{"_id":0,"a":0},"o2":{ }},
    {"ts":<ts2>,"t":<t2>,"h":<h2>,"v":2,"op":"i","ns":"test.foo","o":{"_id":1,"a":1},"o2":{ }},
    {"ts":<ts3>,"t":<t3>,"h":<h3>,"v":2,"op":"i","ns":"test.foo","o":{"_id":2,"a":2},"o2":{ }}
  ]
} }

For applications that tail or parse the oplog, if you switch these applications to read from the destination cluster’s oplog, you may need to modify the applications, depending on how you wish to handle these applyOps entries.

If you have not switched over to writing to the destination cluster, you can continue to read from the source without modifying these applications.
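One way an oplog-reading application might handle these entries is to unwrap each applyOps entry back into its individual operations. The following Python sketch assumes oplog entries have already been decoded into dictionaries shaped like the examples above; `iter_operations` is a hypothetical helper, not part of any MongoDB driver.

```python
def iter_operations(entry):
    """Yield individual operations, unwrapping applyOps command entries."""
    payload = entry.get("o")
    if entry.get("op") == "c" and isinstance(payload, dict) and "applyOps" in payload:
        for inner in payload["applyOps"]:
            # Unwrap recursively in case an inner entry is itself an applyOps.
            yield from iter_operations(inner)
    else:
        yield entry


# Example entry modeled on the destination oplog entry shown above,
# with the placeholder fields (ts, t, h) omitted.
wrapped = {
    "v": 2, "op": "c", "ns": "admin.$cmd",
    "o": {"applyOps": [
        {"v": 2, "op": "i", "ns": "test.foo", "o": {"_id": 0, "a": 0}, "o2": {}},
        {"v": 2, "op": "i", "ns": "test.foo", "o": {"_id": 1, "a": 1}, "o2": {}},
        {"v": 2, "op": "i", "ns": "test.foo", "o": {"_id": 2, "a": 2}, "o2": {}},
    ]},
}
inserted_ids = [op["o"]["_id"] for op in iter_operations(wrapped)]
```

Entries that are not applyOps commands pass through unchanged, so the same code can read either the source or the destination oplog.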

Options

--host <host>

The host information for the source replica set. Specify the replica set name and a seed list of the members, as in the following:

<RSname>/<host1>:<port1>,<host2>:<port2>,<host3>:<port3>

--username <username>

If the source replica set requires authentication, the name of a user in the source replica set with privileges to read any database, including the local database. A user with the backup role provides the appropriate privileges. For details on the specific privileges required, see Required Access on Source Replica Set.

--password <password>

Password for the user specified in --username.

--authenticationDatabase <authenticationDatabase>

The database in the source replica set where the user specified in --username was created.

--authenticationMechanism <authenticationMechanism>

The authentication mechanism to use to authenticate the user to the source replica set.

Value                Description
SCRAM-SHA-1          RFC 5802 standard Salted Challenge Response Authentication Mechanism using the SHA1 hash function.
MONGODB-CR           MongoDB challenge/response authentication.
MONGODB-X509         MongoDB TLS/SSL certificate authentication.
GSSAPI (Kerberos)    External authentication using Kerberos. This mechanism is available only in MongoDB Enterprise.
PLAIN (LDAP SASL)    External authentication using LDAP. You can also use PLAIN for authenticating in-database users. PLAIN transmits passwords in plain text. This mechanism is available only in MongoDB Enterprise.

--destination <destination>

The host information for the target Atlas replica set.

Specify the replica set name and a seed list of the members, as in the following:

<RSname>/<host1>:<port1>,<host2>:<port2>,<host3>:<port3>

--destinationUsername <Atlas user name>

Name of a MongoDB user in the Atlas cluster with privileges to read, write, and admin any database. A user with the Atlas admin role provides the appropriate privileges. For details on the specific privileges required, see Required Access on Destination Cluster.

--destinationPassword <password>

Password of the MongoDB user specified in --destinationUsername.

--drop

Flag that indicates that mongomirror should drop all but the system.* collections in the target cluster.

--ssl

Enables TLS/SSL encrypted connections to the source replica set.

--sslPEMKeyFile <file>

The .pem file if the source replica set requires clients to present a certificate. The .pem file contains both the TLS/SSL certificate and key. Specify the file using relative or absolute paths.

--sslPEMKeyPassword <value>

Password to decrypt the certificate-key file specified in --sslPEMKeyFile. Use if the --sslPEMKeyFile file is encrypted.

--sslCAFile <file>

The .pem file that contains the root certificate chain from the Certificate Authority (CA) for the source replica set. Specify the file using relative or absolute paths.

--sslCRLFile <filename>

The .pem file that contains the Certificate Revocation List for the source replica set. Specify the file using relative or absolute paths.

--sslAllowInvalidHostnames

Disables the validation of the hostnames in TLS/SSL certificates presented by the source replica set. Allows mongomirror to connect to the source replica set if the hostname in the certificates does not match the specified hostname.

--sslAllowInvalidCertificates

Bypasses the validation checks for certificates presented by the source replica set and allows the use of invalid certificates. When using the --sslAllowInvalidCertificates setting, MongoDB logs the use of the invalid certificate as a warning.

--gssapiServiceName <name>

If the source replica set uses Kerberos authentication, the name of the service using GSSAPI/Kerberos. Only required if the service does not use the default name of mongodb.

This option is available only in MongoDB Enterprise.

--gssapiHostName <host>

If the source replica set uses Kerberos authentication, the hostname of a service using GSSAPI/Kerberos. Only required if the hostname of a machine does not match the hostname resolved by DNS.

This option is available only in MongoDB Enterprise.

--readPreference <read preference>

Read preference that mongomirror uses to read from the source replica set. You can specify the read preference by name, for example:

mongomirror --readPreference primary  ...

or as a JSON object, for example:

mongomirror --readPreference '{mode: primary}' ...

--writeConcern <write concern>

Deprecated since version 0.2.3: mongomirror always uses majority write concern.

--numParallelCollections <num>, -j <num>

Default: 4

The number of collections to copy and restore in parallel.

--bypassDocumentValidation

Deprecated since version 0.2.3: mongomirror always bypasses document validation.

--tailOnly

Flag that indicates that mongomirror only tails the oplog; i.e. skips the initial sync phase.

--bookmarkFile <file>

Default: mongomirror.timestamp

Name of the oplog timestamp bookmark file.

--forceDump

Flag that indicates that mongomirror should resync all source collections, even if a nonempty bookmark file exists.

--httpStatusPort

Directs mongomirror to start an HTTP server on port 8080. You can retrieve the current status of mongomirror by issuing an HTTP GET request to http://localhost:8080.

When running with --httpStatusPort, mongomirror does not exit when it encounters an error. Instead, it logs the error as normal and reports the error over HTTP.

mongomirror returns a document in response to the HTTP request. The following example syntax shows all possible output fields; the actual response may return only a subset of them. See the subsequent table for a description of the fields and when to expect them.

{
   "stage" : "<stage Name>",
   "phase" : "<phase Name>",
   "details" : {
      "currentTimestamp" : "<BSON timestamp>",
      "latestTimestamp" : "<BSON timestamp>",
      "<namespace>" : {
         "complete" : <boolean>,
         "copiedBytes" : <integer>,
         "totalBytes" : <integer>,
         "createIndexes" : <integer>
      },
      ...
   },
   "errorMessage" : "<error message>"
}

The following table describes each field and its possible values:

Field Description
stage

The name of the stage in progress. Possible values are:

  • initializing

    mongomirror has started but is not yet copying any data.

  • initial sync

    mongomirror is copying documents and indexes that already exist on the source deployment. mongomirror also tails and applies entries from the oplog.

  • oplog sync

    mongomirror is tailing and applying entries from the oplog.

phase

The name of the phase. Provides more specific details about what part of the stage is in progress.

details

A document providing a detailed description of the progress of the current phase.

During the initial sync stage, each subdocument in details represents a single collection being copied by mongomirror.

Depending on the stage or phase, mongomirror may not include this field in the response.

details.<namespace>

The full namespace of the collection being copied, displayed as <database>.<collection>.

Only displays during the initial sync stage when copying documents or indexes.

details.<namespace>.complete

Displays true or false depending on whether or not mongomirror has copied all documents or indexes from the collection to the target Atlas cluster.

Only displays during the initial sync stage when copying documents or indexes.

details.<namespace>.copiedBytes

The number of bytes copied so far. Note that this is a different measurement from the mongomirror logs, which report the current/total number of documents copied.

Only displays during the initial sync stage when copying non-index data.

details.<namespace>.totalBytes

The total size (in bytes) of the collection.

Only displays during the initial sync stage when copying non-index data.

details.<namespace>.createIndexes

The number of indexes that have been or will be created.

Only displays during the initial sync stage when copying indexes.

details.currentTimestamp

The BSON timestamp value of the oplog entry most recently processed. mongomirror only refreshes this data point every 10 seconds, so mongomirror may be slightly further ahead of the reported time.

Only displays during the initial sync or oplog sync stages when tailing or applying oplog entries.

details.latestTimestamp

During the initial sync stage, this represents the BSON timestamp value of the latest oplog entry available after the initial data was copied during initial sync.

During the oplog sync stage, this represents the BSON timestamp value of the latest oplog entry available on the source deployment.

Only displays during the initial sync or oplog sync stages when tailing or applying oplog entries.

errorMessage

A string that describes any error encountered by mongomirror.
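A monitoring script can poll the status endpoint and derive per-collection progress from the copiedBytes and totalBytes fields. The sketch below is an illustration under stated assumptions: it takes the response to be a JSON document shaped like the syntax above, and `fetch_status` and `collection_progress` are hypothetical helper names.

```python
import json
from urllib.request import urlopen


def fetch_status(url="http://localhost:8080", timeout=5):
    """Issue an HTTP GET to the status port and decode the JSON response."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def collection_progress(status):
    """Map each namespace in details to its percentage of bytes copied."""
    progress = {}
    for key, value in status.get("details", {}).items():
        # Skip currentTimestamp and latestTimestamp, which are not
        # per-collection subdocuments.
        if key in ("currentTimestamp", "latestTimestamp"):
            continue
        if not isinstance(value, dict):
            continue
        copied = value.get("copiedBytes")
        total = value.get("totalBytes")
        # The byte counters appear only while non-index data is copied.
        if copied is None or not total:
            continue
        progress[key] = 100.0 * copied / total
    return progress
```

With mongomirror running with --httpStatusPort, `collection_progress(fetch_status())` would return the percentages for the collections currently being copied.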

Examples

Migrate a Replica Set into Atlas

The following example migrates from a source replica set that does not require authentication:

mongomirror --host  sourceRS/source-host1:27017,source-host2:27017 \
         --destination myAtlasRS/atlas-host1:27017,atlas-host2:27017 \
         --destinationUsername myAtlasUser \
         --destinationPassword myAtlasPwd

Notes on the options in the preceding command:

For the destination, specify the replica set name followed by a seed list of members in the following format:

<replicaSetName>/<host1>:<port1>,<host2>:<port2>,<host3>:<port3>,...

The specified Atlas user must have the required privileges for Atlas. The Atlas admin role provides the required privileges.
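A script that generates these commands can validate the seed list format before invoking mongomirror. This is a minimal sketch, assuming every member is given as `<host>:<port>` as in the format above; `parse_seed_list` is a hypothetical helper.

```python
def parse_seed_list(spec):
    """Split '<replicaSetName>/<host1>:<port1>,...' into (name, [(host, port)])."""
    name, slash, members = spec.partition("/")
    if not slash or not name or not members:
        raise ValueError(f"expected <replicaSetName>/<members>, got {spec!r}")
    hosts = []
    for member in members.split(","):
        host, colon, port = member.rpartition(":")
        if not colon or not host or not port.isdigit():
            raise ValueError(f"expected <host>:<port>, got {member!r}")
        hosts.append((host, int(port)))
    return name, hosts


name, hosts = parse_seed_list("myAtlasRS/atlas-host1:27017,atlas-host2:27017")
```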

Source Replica Set Uses SCRAM-SHA-1 Authentication

The following example migrates from a source replica set that uses SCRAM-SHA-1 authentication:

mongomirror --host sourceRS/source-host1:27017,source-host2:27017,source-host3:27017 \
   --username mySourceUser \
   --password mySourcePassword \
   --authenticationDatabase admin \
   --destination myAtlasRS/atlas-host1:27017,atlas-host2:27017 \
   --destinationUsername myAtlasUser \
   --destinationPassword atlasPassw0Rd

Notes on the options in the preceding command:

The source replica set user must have the required access on the source replica set. The backup role provides the appropriate privileges.

For the destination, specify the replica set name followed by a seed list of members in the following format:

<replicaSetName>/<replicaMember>,<replicaMember>,<replicaMember>,...

The specified Atlas user must have the required access on Atlas. The Atlas admin role provides the required privileges.

Source Replica Set Requires X509 Client Authentication

The following example migrates from a source replica set that uses X.509 authentication:

mongomirror --host sourceRS/source-host1:27017,source-host2:27017,source-host3:27017 \
   --username "CN=myName,OU=myOrgUnit,O=myOrg,L=myLocality,ST=myState,C=myCountry" \
   --authenticationDatabase '$external' \
   --authenticationMechanism MONGODB-X509 \
   --ssl \
   --sslPEMKeyFile <path-to-my-client-certificate.pem> \
   --sslCAFile <path-to-my-certificate-authority-certificate.pem> \
   --destination myAtlasRS/atlas-host1:27017,atlas-host2:27017 \
   --destinationUsername myAtlasUser \
   --destinationPassword atlasPassw0Rd

Notes on the options in the preceding command:

The source replica set user must have the required access on the source replica set. The backup role provides the appropriate privileges.

For the destination, specify the replica set name followed by a seed list of members in the following format:

<replicaSetName>/<replicaMember>,<replicaMember>,<replicaMember>,...

The specified Atlas user must have the required access on Atlas. The Atlas admin role provides the required privileges.

Source Replica Set Requires Kerberos/GSSAPI Authentication

The following example migrates from a source replica set that uses Kerberos authentication:

mongomirror --host sourceRS/source-host1:27017,source-host2:27017,source-host3:27017 \
   --username sourceUser/administrator@MYREALM.COM \
   --authenticationDatabase '$external' \
   --authenticationMechanism GSSAPI \
   --destination myAtlasRS/atlas-host1:27017,atlas-host2:27017,atlas-host3:27017 \
   --destinationUsername atlasUser \
   --destinationPassword atlasPass

Notes on the options in the preceding command, which migrates from a source replica set that uses Kerberos authentication:

The source replica set user must have the required access on the source replica set. The backup role provides the appropriate privileges.

For the destination, specify the replica set name followed by a seed list of members in the following format:

<replicaSetName>/<replicaMember>,<replicaMember>,<replicaMember>,...

The specified Atlas user must have the required access on Atlas. The Atlas admin role provides the required privileges.