Prepare Your S3 Bucket

Beta

The Atlas Data Lake is available as a Beta feature. The product and the corresponding documentation may change at any time during the Beta stage. For support, see Atlas Support.

Estimated completion time: 10 minutes

Before creating your first Data Lake, you must create an S3 bucket and load it with sample data. This part of the tutorial walks you through preparing your S3 bucket.

Prerequisites

To complete this part of the tutorial, you will need an AWS account.

Procedure

1

Download the AirBnB sample dataset.

2

Download the weather sample dataset.

3

Log in to the AWS Console and navigate to the S3 service.

Once you’re logged in to AWS:

  1. Click the Services dropdown menu on the upper left-hand side of the console.
  2. Under Storage, select S3.

4

Create a new bucket for the sample data.

To ensure that you query only the provided sample data, create a new S3 bucket:

  1. Click Create Bucket.
  2. Fill in a Bucket name.
  3. Select the desired Region.
  4. Click Create.
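
The console steps above can also be scripted. The following is a minimal sketch, assuming the third-party boto3 SDK is installed and AWS credentials are already configured. The function names and the example bucket name are hypothetical and are not part of this tutorial. The sketch leaves out the location constraint for us-east-1, because that Region is the S3 default and is not passed explicitly.

```python
def create_bucket_kwargs(bucket_name, region):
    """Build the keyword arguments for the S3 create_bucket call."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and is not passed as a
    # LocationConstraint; any other Region is stated explicitly.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_bucket(bucket_name, region):
    """Create the bucket with boto3 (assumes credentials are configured)."""
    import boto3  # third-party SDK, imported lazily so the helper above has no dependencies

    client = boto3.client("s3", region_name=region)
    return client.create_bucket(**create_bucket_kwargs(bucket_name, region))


# Example call (placeholder name; bucket names must be globally unique):
# create_bucket("my-datalake-sample-bucket", "us-west-2")
```

Keeping the argument construction in its own function makes the Region handling easy to check without contacting AWS.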

5

Prepare the directory structure of the new bucket.

  1. Click the newly created bucket’s name to navigate to the bucket.

  2. Click Create Folder and name the new folder json:

    json
    
  3. Open the json folder, and create two new folders named airbnb and weather:

    airbnb
    
    weather
    
  4. Verify that your directory structure resembles the following:

    |--json
       |--airbnb
       |--weather
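
S3 has no real directories. A folder created in the console is a zero-byte object whose key ends with a slash. The same structure can be sketched in code as follows, assuming `client` is a boto3 S3 client such as `boto3.client("s3")`. The function names are hypothetical.

```python
def folder_requests(bucket_name):
    """Return put_object keyword arguments for each folder placeholder."""
    # Keys end with "/" so the console displays them as folders.
    keys = ["json/", "json/airbnb/", "json/weather/"]
    return [{"Bucket": bucket_name, "Key": key} for key in keys]


def create_folders(client, bucket_name):
    """Create the empty placeholder objects (client is a boto3 S3 client)."""
    for request in folder_requests(bucket_name):
        # put_object without a Body creates a zero-byte object.
        client.put_object(**request)
```

Creating these placeholders is optional when scripting: uploading an object under `json/airbnb/` makes the folder appear in the console on its own.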
    
6

Upload the AirBnB dataset.

  1. Navigate to the /json/airbnb/ directory in your S3 bucket.
  2. Click Upload.
  3. Drag and drop the listingsAndReviews.json file into the modal. Alternatively, click Add files and use your file explorer to locate the listingsAndReviews.json file.
  4. Click Upload.

7

Upload the weather dataset.

  1. Navigate to the /json/weather/ directory in your S3 bucket.
  2. Click Upload.
  3. Drag and drop the data.json file into the modal. Alternatively, click Add files and use your file explorer to locate the data.json file.
  4. Click Upload.
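
Both uploads can be sketched as a script. This assumes `client` is a boto3 S3 client and that the two downloaded files sit in one local directory. The function names and the `local_dir` parameter are hypothetical. Object keys do not start with a slash, so the console path /json/airbnb/ corresponds to the key prefix `json/airbnb/`.

```python
import os

# Local file name -> destination object key in the bucket.
UPLOADS = {
    "listingsAndReviews.json": "json/airbnb/listingsAndReviews.json",
    "data.json": "json/weather/data.json",
}


def upload_plan(bucket_name, local_dir="."):
    """Return (local path, bucket, key) tuples for both sample datasets."""
    return [
        (os.path.join(local_dir, name), bucket_name, key)
        for name, key in UPLOADS.items()
    ]


def upload_datasets(client, bucket_name, local_dir="."):
    """Upload both files (client is a boto3 S3 client)."""
    for filename, bucket, key in upload_plan(bucket_name, local_dir):
        client.upload_file(filename, bucket, key)
```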

8

Verify the bucket.

Verify that the bucket's directory structure is as follows:

|--json
   |--airbnb
      |--listingsAndReviews.json
   |--weather
      |--data.json
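
The same check can be sketched in code by listing the object keys under `json/` and comparing them with the two expected files. This assumes `client` is a boto3 S3 client. The function names are hypothetical.

```python
EXPECTED_KEYS = {
    "json/airbnb/listingsAndReviews.json",
    "json/weather/data.json",
}


def missing_keys(listed_keys):
    """Return the expected object keys that are absent from listed_keys."""
    return sorted(EXPECTED_KEYS - set(listed_keys))


def list_keys(client, bucket_name):
    """List object keys under json/ (client is a boto3 S3 client)."""
    response = client.list_objects_v2(Bucket=bucket_name, Prefix="json/")
    # "Contents" is absent from the response when no objects match the prefix.
    return [item["Key"] for item in response.get("Contents", [])]
```

An empty result from `missing_keys(list_keys(client, bucket_name))` would indicate that both sample files are in place. Folder placeholder keys such as `json/airbnb/` may also appear in the listing and are ignored by the comparison.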

Next Steps

Now that your S3 bucket is loaded with sample data, proceed to Deploy a Data Lake.