
Tutorial: Create and Query an Atlas Search Index

This tutorial takes you through the steps of setting up and querying an Atlas Search index. You will use a collection with movie data from the Atlas sample data set.

To complete this tutorial you will need:

  • An Atlas cluster.
  • MongoDB version 4.2 or higher.
  • The mongo shell on your local machine.
1

In the Atlas UI, navigate to the Clusters page for your project.

2

Locate the ... button next to the Collections button and click to reveal the dropdown menu.

3

The sample dataset takes a few minutes to load. When it finishes, proceed to the next step.

4
5
6
7
  • For a guided experience, select Visual Editor.
  • To edit the raw index definition, select JSON Editor.
8
  1. In the Database field, specify sample_mflix.
  2. In the Collection field, specify movies.
  3. In the Index Name field, specify default.
Note

If you name your index default, you do not need to specify the index name when querying against it. If you use a custom name, you must add the "index": <index-name> parameter to every $search query.
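As a sketch of what the note describes, the following pipeline names a custom index explicitly. The index name myCustomIndex is hypothetical and used only for illustration; the text query mirrors the kind of query run later in this tutorial. The guard on db lets the snippet run in the mongo shell (where db is defined) as well as in plain JavaScript.

```javascript
// Hypothetical custom index name, used only for illustration.
const indexName = "myCustomIndex";

const pipeline = [
  {
    $search: {
      // Required only when the index is not named "default".
      "index": indexName,
      "text": {
        "query": "baseball",
        "path": "plot"
      }
    }
  },
  { $limit: 5 },
  { $project: { "_id": 0, "title": 1, "plot": 1 } }
];

// In the mongo shell, `db` is defined once you are connected.
if (typeof db !== "undefined") {
  db.movies.aggregate(pipeline).forEach(printjson);
}
```

If the index is named default, the "index" line can be omitted entirely.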

9

The movies collection is large, so to save space this tutorial indexes only the title, genres, and plot fields.

  1. Click Next.
  2. Click Refine Your Index.
  3. Change Dynamic Mapping to Off.
  4. Add the following fields:

    Field Name | Dynamic Mapping                       | Data Type Configuration
    genres     | Change Enable Dynamic Mapping to Off. | Click Add Data Type, and select String.
    plot       | Change Enable Dynamic Mapping to Off. | Click Add Data Type, and select String.
    title      | Change Enable Dynamic Mapping to Off. | 1. Click Add Data Type, and select String.
               |                                       | 2. Click Add Data Type, and select Multi.
               |                                       | 3. Specify keywordAnalyzer as the name of the Multi analyzer.
               |                                       | 4. Change Index Analyzer to lucene.keyword.
  5. Click Save Changes.

The above index definition specifies the standard analyzer as the default analyzer for all three indexed fields: title, genres, and plot. It also specifies the keyword analyzer as an alternate analyzer for the title field, under the name keywordAnalyzer. The keyword analyzer indexes the entire field as a single term, so a query that uses it returns results only if the search term matches the field value exactly.

The genres field is an array of strings. To index an array, Atlas Search requires only the data type of the array elements. You don't have to specify in the index definition that the data is contained in an array.

For more information about static and dynamic field mappings, see index definitions. For more information about multi analyzer designations, see Path Construction.
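For readers who chose the JSON Editor in step 7, the settings above can be sketched as a raw index definition. This is a minimal sketch, assuming the field-mapping syntax documented for Atlas Search index definitions (mappings, dynamic, fields, multi); the variable name indexDefinition is illustrative, and the snippet is plain JavaScript (for example, Node.js) that only prints the JSON to paste into the editor.

```javascript
// Sketch of the index definition built in step 9.
// Fields without an explicit "analyzer" fall back to the standard analyzer.
const indexDefinition = {
  "mappings": {
    "dynamic": false, // Dynamic Mapping: Off
    "fields": {
      "genres": { "type": "string" },
      "plot": { "type": "string" },
      "title": {
        "type": "string",
        // Alternate analyzer, selected at query time by its name.
        "multi": {
          "keywordAnalyzer": {
            "type": "string",
            "analyzer": "lucene.keyword"
          }
        }
      }
    }
  }
};

console.log(JSON.stringify(indexDefinition, null, 2));
```

Note that genres is declared as a plain string even though it holds an array, in line with the array-indexing behavior described above.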

10
11

A modal window appears to let you know your index is building. Click the Close button.

12

The index should take about one minute to build. While it is building, the Status column reads Build in Progress. When it is finished building, the Status column reads Active.

13

Open the mongo shell in a terminal window and connect to your cluster. For detailed instructions on connecting, see Connect to a Cluster.

14

Run the following command at the mongo shell prompt:

use sample_mflix
15

The following query searches for the word baseball in the plot field. It includes a $limit stage to limit the output to 5 results and a $project stage to exclude all fields except title and plot.

db.movies.aggregate([
  {
    $search: {
      "text": {
        "query": "baseball",
        "path": "plot"
      }
    }
  },
  {
    $limit: 5
  },
  {
    $project: {
      "_id": 0,
      "title": 1,
      "plot": 1
    }
  }
])

The above query returns the following results:

{
"plot" : "A trio of guys try and make up for missed opportunities in childhood by forming a three-player baseball team to compete against standard children baseball squads.",
"title" : "The Benchwarmers"
}
{
"plot" : "A young boy is bequeathed the ownership of a professional baseball team.",
"title" : "Little Big League"
}
{
"plot" : "A trained chimpanzee plays third base for a minor-league baseball team.",
"title" : "Ed"
}
{
"plot" : "The story of the life and career of the famed baseball player, Lou Gehrig.",
"title" : "The Pride of the Yankees"
}
{
"plot" : "Babe Ruth becomes a baseball legend but is unheroic to those who know him.",
"title" : "The Babe"
}

For more information about the $search pipeline stage, see its reference page. For complete aggregation pipeline documentation, see the MongoDB Server Manual.
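To experiment with other search terms or fields, the three stages of the query above can be wrapped in a small helper. This is only a sketch: buildTextSearchPipeline is a hypothetical function written for this example, not part of any MongoDB API. It keeps $search as the first stage, which the $search stage requires.

```javascript
// Hypothetical helper: builds a $search / $limit / $project pipeline.
function buildTextSearchPipeline(query, path, limit) {
  return [
    // $search must be the first stage of the pipeline.
    { $search: { "text": { "query": query, "path": path } } },
    { $limit: limit },
    // Return only the title and the field that was searched.
    { $project: { "_id": 0, "title": 1, [path]: 1 } }
  ];
}

const pipeline = buildTextSearchPipeline("baseball", "plot", 5);

// In the mongo shell, `db` is defined after `use sample_mflix`.
if (typeof db !== "undefined") {
  db.movies.aggregate(pipeline).forEach(printjson);
}
```

Only fields covered by the index (title, genres, and plot in this tutorial) are useful values for path.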

16

$search has several operators for constructing different types of queries. The following query uses the compound operator to combine several operators into a single query. It has the following search criteria:

  • The plot field must contain either Hawaii or Alaska.
  • The plot field must contain a four-digit number, such as a year.
  • The genres field must not contain either Comedy or Romance.
  • The title field must not contain Beach or Snow.

db.movies.aggregate([
  {
    $search: {
      "compound": {
        "must": [
          {
            "text": {
              "query": ["Hawaii", "Alaska"],
              "path": "plot"
            }
          },
          {
            "regex": {
              "query": "([0-9]{4})",
              "path": "plot",
              "allowAnalyzedField": true
            }
          }
        ],
        "mustNot": [
          {
            "text": {
              "query": ["Comedy", "Romance"],
              "path": "genres"
            }
          },
          {
            "text": {
              "query": ["Beach", "Snow"],
              "path": "title"
            }
          }
        ]
      }
    }
  },
  {
    $project: {
      "title": 1,
      "plot": 1,
      "genres": 1,
      "_id": 0
    }
  }
])

The above query returns the following results:

{
"plot" : "A modern aircraft carrier is thrown back in time to 1941 near Hawaii, just hours before the Japanese attack on Pearl Harbor.",
"genres" : [ "Action", "Sci-Fi" ],
"title" : "The Final Countdown"
}
{
"plot" : "Follows John McCain's 2008 presidential campaign, from his selection of Alaska Governor Sarah Palin as his running mate to their ultimate defeat in the general election.",
"genres" : [ "Biography", "Drama", "History" ],
"title" : "Game Change"
}
{
"plot" : "A devastating and heartrending take on grizzly bear activists Timothy Treadwell and Amie Huguenard, who were killed in October of 2003 while living among grizzlies in Alaska.",
"genres" : [ "Documentary", "Biography" ],
"title" : "Grizzly Man"
}
{
"plot" : "Truman Korovin is a lonely, sharp-witted cab driver in Fairbanks, Alaska, 1980. The usual routine of picking up fares and spending his nights at his favorite bar, the Boatel, is disrupted ...",
"genres" : [ "Drama" ],
"title" : "Chronic Town"
}
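
To see how the must and mustNot clauses combine, the four criteria can be approximated in plain JavaScript and applied to a returned document. This is a rough sketch, not how Atlas Search evaluates queries: the tokens function is a simplified stand-in for the standard analyzer (lowercase, split on non-alphanumeric characters), and all function names are hypothetical.

```javascript
// Simplified stand-in for the standard analyzer.
function tokens(value) {
  const strings = Array.isArray(value) ? value : [value];
  const result = [];
  strings.forEach(function (s) {
    s.toLowerCase().split(/[^a-z0-9]+/).forEach(function (t) {
      if (t) result.push(t);
    });
  });
  return result;
}

// True if any of the words appears as a token in the field value.
function hasAny(value, words) {
  const t = tokens(value);
  return words.some(function (w) { return t.indexOf(w.toLowerCase()) !== -1; });
}

function matchesCriteria(doc) {
  // must: every clause has to match.
  const must = hasAny(doc.plot, ["Hawaii", "Alaska"]) &&
    tokens(doc.plot).some(function (t) { return /^[0-9]{4}$/.test(t); });
  // mustNot: no clause may match.
  const mustNot = hasAny(doc.genres, ["Comedy", "Romance"]) ||
    hasAny(doc.title, ["Beach", "Snow"]);
  return must && !mustNot;
}

// Document taken from the results above.
const finalCountdown = {
  plot: "A modern aircraft carrier is thrown back in time to 1941 near Hawaii, just hours before the Japanese attack on Pearl Harbor.",
  genres: ["Action", "Sci-Fi"],
  title: "The Final Countdown"
};
// Hypothetical variant with a different genre, for contrast only.
const comedyVariant = Object.assign({}, finalCountdown, { genres: ["Comedy"] });

console.log(matchesCriteria(finalCountdown)); // true
console.log(matchesCriteria(comedyVariant));  // false
```
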
17

In the index definition you created in step 9, you specified that the title field should be able to use either the standard analyzer or the keyword analyzer for queries. The following query uses the alternate analyzer, named keywordAnalyzer, to search for exact matches on the string The Count of Monte Cristo.

db.movies.aggregate([
  {
    $search: {
      "text": {
        "query": "The Count of Monte Cristo",
        "path": { "value": "title", "multi": "keywordAnalyzer" }
      }
    }
  },
  {
    $project: {
      "title": 1,
      "year": 1,
      "_id": 0
    }
  }
])

The above query returns the following results:

{ "title" : "The Count of Monte Cristo", "year" : 1934 }
{ "title" : "The Count of Monte Cristo", "year" : 1954 }
{ "title" : "The Count of Monte Cristo", "year" : 1998 }

By contrast, the same query using the standard analyzer would find all the movies with the word Count or Monte or Cristo in the title.
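
For comparison, the standard-analyzer version of the query can be sketched by giving path as a plain field name instead of the multi form. The $limit stage and its value of 10 are additions for this example, since the looser match can return many more documents; no output is shown here because it depends on the loaded data.

```javascript
const standardAnalyzerPipeline = [
  {
    $search: {
      "text": {
        "query": "The Count of Monte Cristo",
        // A plain path uses the field's default (standard) analyzer.
        "path": "title"
      }
    }
  },
  // Added for this sketch to keep the output short.
  { $limit: 10 },
  { $project: { "title": 1, "year": 1, "_id": 0 } }
];

// In the mongo shell, `db` is defined after `use sample_mflix`.
if (typeof db !== "undefined") {
  db.movies.aggregate(standardAnalyzerPipeline).forEach(printjson);
}
```
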

In this tutorial, you loaded a sample dataset into your Atlas cluster, created an Atlas Search index, and ran some example queries against it. More examples can be found throughout the Atlas Search documentation.
