Atlas Data Lake Changelog

2020 Releases

26 March 2020 Release

  • Includes various performance and stability improvements.
  • Supports filename field references for $out.
  • Supports $toString in $out to S3.

09 March 2020 Release

  • Supports optionally granting Atlas Data Lake write access to S3 buckets, enabling use of $out semantics to write directly to those buckets.
  • Adds incremental store, database, collection, and view commands for storage configuration management.
  • Limits collections returned for wildcard collections to 1,000.
  • Updates the storage configuration format.

11 February 2020 Release

  • Supports cross-database $lookup queries.
  • Supports lowercase and uppercase file extensions.
  • Template segments now support dot-separated attribute names that correspond to nested fields.

21 January 2020 Release

  • Allows the defaultFormat to be specified without a leading dot.
  • Supports filtering based on stripes for files in ORC format.
  • Allows query attributes to be extracted after the first stage.

2019 Releases

10 December 2019 Release

  • Includes several performance and stability improvements.
  • Supports partition definition for the following:
    • epoch_secs, which is seconds since the Unix Epoch
    • epoch_millis, which is milliseconds since the Unix Epoch
    • UUID, which is binary subtype 4
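
The difference between the two epoch-based partition types above is only the unit. The following Python sketch illustrates how the same instant is represented in each; the helper names from_epoch_secs and from_epoch_millis are hypothetical and are not part of Atlas Data Lake.

```python
from datetime import datetime, timedelta, timezone

# The Unix Epoch: 1970-01-01T00:00:00Z.
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_secs(value: int) -> datetime:
    """Interpret value as whole seconds since the Unix Epoch (hypothetical helper)."""
    return UNIX_EPOCH + timedelta(seconds=value)


def from_epoch_millis(value: int) -> datetime:
    """Interpret value as milliseconds since the Unix Epoch (hypothetical helper)."""
    return UNIX_EPOCH + timedelta(milliseconds=value)


# Both values name the same instant, midnight UTC on 10 December 2019.
secs = from_epoch_secs(1575936000)
millis = from_epoch_millis(1575936000000)
print(secs.isoformat(), millis.isoformat())
```

A value written in one unit and read in the other is off by a factor of 1,000, so the partition type should match the unit actually used in the file paths.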

11 November 2019 Release

  • Includes several performance and stability improvements.
  • Adds support for reading Apache ORC files.

29 October 2019 Release

08 October 2019 Release

  • Returns an error if a query produces a document larger than 16 MiB.
  • The $indexStats stage now produces an empty list of indexes instead of an error.
  • Supports writing $out results to S3 in JSON format.
  • $match now implicitly treats all terms as conjunctions.
  • No longer parses empty files.
  • Fixes an issue that caused the {$match: {$expr: {$and: []}}} expression to terminate the connection.
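
As an illustration of the implicit-conjunction behavior of $match noted above, the following Python sketch uses a toy, equality-only matcher. It is an assumption-laden model of the semantics, not the Atlas Data Lake query engine, and the matches function is hypothetical. A filter with several terms selects the same documents as the equivalent explicit $and.

```python
def matches(doc: dict, query: dict) -> bool:
    """Toy matcher: every term must hold. Supports only equality and "$and"."""
    for key, expected in query.items():
        if key == "$and":
            # Explicit conjunction: every sub-query must match.
            if not all(matches(doc, sub) for sub in expected):
                return False
        elif doc.get(key) != expected:
            # Implicit conjunction: any failing term rejects the document.
            return False
    return True


docs = [{"a": 1, "b": 2}, {"a": 1, "b": 3}]

implicit = [d for d in docs if matches(d, {"a": 1, "b": 2})]
explicit = [d for d in docs if matches(d, {"$and": [{"a": 1}, {"b": 2}]})]
print(implicit == explicit)
```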

17 September 2019 Release

  • Allows nested fields in partition definitions.
  • No longer enumerates directories on S3 when a single subdirectory containing all the partitions matching the query is identified.
  • Fixes an issue where the new storage configuration did not appear on the issuing connection after running setStorageConfig.

21 August 2019 Release

  • Adds support for the getLastError database command.
  • Fixes a bug with how union types are handled in Avro.
  • Supports the $out aggregation pipeline stage for writing to S3.
  • listIndexes now always returns an empty list.
  • Translates dot-delimited CSV and TSV keys into subdocuments.
  • Storage configuration error messages now include a link to the documentation.
  • Supports the XLSX file format.
  • Includes the correlation ID in query execution error messages.
  • Returns an error to the client when the cursor storage limit is reached.
  • Returns an error to the client on the last getMore if the cursor storage limit is exceeded.
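
The translation of dot-delimited CSV and TSV keys into subdocuments, listed above, can be pictured with the following Python sketch. It only illustrates the idea and makes no claim about how Atlas Data Lake implements it; the nest_dotted_keys function and the sample column names are hypothetical.

```python
def nest_dotted_keys(row: dict) -> dict:
    """Turn keys such as "address.city" into nested dictionaries.

    Conflicting keys (for example both "a" and "a.b") are not handled
    in this sketch.
    """
    doc: dict = {}
    for key, value in row.items():
        parts = key.split(".")
        target = doc
        # Walk or create one nested dictionary per leading path segment.
        for part in parts[:-1]:
            target = target.setdefault(part, {})
        target[parts[-1]] = value
    return doc


# A CSV row with headers "name", "address.city", and "address.zip".
row = {"name": "Ada", "address.city": "Oslo", "address.zip": "0150"}
print(nest_dotted_keys(row))
```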

30 July 2019 Release

  • Supports listCommands. For example: db.runCommand({"listCommands": 1})
  • Includes partition size information in the output of explain().

08 July 2019 Release

  • Returns the first batch of cursor results more quickly.
  • Improves performance of $lookup when combined with $unwind.
  • Automatically supports SCRAM-SHA-1 credentials without requiring drivers to specify this authentication mechanism.
  • Provides a descriptive error message when the file format is unknown.
  • Provides additional validation on setStorageConfig.

18 June 2019 Release

Initial public beta release of Atlas Data Lake.