autocomplete

Definition

autocomplete

The autocomplete operator performs a search for a word or phrase that contains a sequence of characters from an incomplete input string. You can use the autocomplete operator with search-as-you-type applications to predict words with increasing accuracy as characters are entered in your application’s search field. autocomplete returns results that contain predicted words based on the tokenization strategy specified in the index definition for autocompletion. The fields that you intend to query with the autocomplete operator must be indexed with the autocomplete data type in the collection’s index definition.

Syntax

autocomplete has the following syntax:

{
  $search: {
    "autocomplete": {
      "query": "<search-string>",
      "path": "<field-to-search>",
      "tokenOrder": "any|sequential",
      "fuzzy": <options>,
      "score": <options>
    }
  }
}

Options

Each option is listed with its type, description, necessity, and default (where one exists).

Field: query
Type: string or array of strings
Description: String or strings to search for. If there are multiple terms in a string, Atlas Search also looks for a match for each term in the string separately.
Necessity: yes

Field: path
Type: string
Description: Indexed autocomplete type of field to search.
Note: The autocomplete operator does not support multi in the field path.
Necessity: yes

Field: fuzzy
Type: object
Description: Enable fuzzy search. Find strings which are similar to the search term or terms.
Necessity: no

Field: fuzzy.maxEdits
Type: integer
Description: Maximum number of single-character edits required to match the specified search term. Value can be 1 or 2.
Necessity: no
Default: 2

Field: fuzzy.prefixLength
Type: integer
Description: Number of characters at the beginning of each term in the result that must exactly match.
Necessity: no
Default: 0

Field: fuzzy.maxExpansions
Type: integer
Description: Maximum number of variations to generate and search for. This limit applies on a per-token basis.
Necessity: no
Default: 50

Field: score
Type: object
Description: Score assigned to matching search term results. Use one of the following options to modify the score:
  • boost: Multiply the result score by the given number.
  • constant: Replace the result score with the given number.
Note: autocomplete offers less fidelity in score in exchange for faster query execution.
Necessity: no

Field: tokenOrder
Type: string
Description: Order in which to search for tokens. Value can be one of the following:
  • any: Indicates tokens in the query can appear in any order in the documents. Results contain documents where the tokens appear sequentially and non-sequentially. However, results where the tokens appear sequentially score higher than other, non-sequential values.
  • sequential: Indicates tokens in the query must appear adjacent to each other or in the order specified in the query in the documents. Results contain only documents where the tokens appear sequentially.
Necessity: no
Default: any
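
As a minimal sketch of how these options fit together, the following JavaScript builds an aggregation pipeline around the autocomplete operator and rejects values that fall outside the documented ranges. The function name buildAutocompletePipeline and its limit parameter are hypothetical and introduced only for illustration; they are not part of Atlas Search or any driver.

```javascript
// Hypothetical helper (not part of Atlas Search or any driver): builds an
// aggregation pipeline around the autocomplete operator and checks option
// values against the constraints documented for this operator.
function buildAutocompletePipeline(query, path, options = {}) {
  const { tokenOrder, fuzzy, score, limit = 10 } = options;

  if (tokenOrder !== undefined && !["any", "sequential"].includes(tokenOrder)) {
    throw new Error('tokenOrder must be "any" or "sequential"');
  }
  if (fuzzy && fuzzy.maxEdits !== undefined && ![1, 2].includes(fuzzy.maxEdits)) {
    throw new Error("fuzzy.maxEdits must be 1 or 2");
  }

  // Include only the optional fields the caller set, so that the
  // documented defaults apply to everything else.
  const autocomplete = { query, path };
  if (tokenOrder !== undefined) autocomplete.tokenOrder = tokenOrder;
  if (fuzzy !== undefined) autocomplete.fuzzy = fuzzy;
  if (score !== undefined) autocomplete.score = score;

  return [
    { $search: { autocomplete } },
    { $limit: limit },
    { $project: { _id: 0, [path]: 1 } }
  ];
}

const pipeline = buildAutocompletePipeline("off", "title", {
  tokenOrder: "any",
  fuzzy: { maxEdits: 1 }
});
console.log(JSON.stringify(pipeline, null, 2));
```

In mongosh, the returned array could be passed directly to db.movies.aggregate(pipeline). Validating on the client is a design choice of this sketch, not a requirement of the operator.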

Examples

The following examples use the movies collection in the sample_mflix database. If you loaded the sample dataset on your cluster, you can create the static index for autocompletion and run the queries on your cluster.

The following sample index definitions, one for each tokenization strategy, can be used for the queries in the following examples:

Note

To learn more about edgeGram and nGram, see autocomplete.

edgeGram
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "title": [
        {
          "type": "autocomplete",
          "tokenization": "edgeGram",
          "minGrams": 3,
          "maxGrams": 7,
          "foldDiacritics": false
        }
      ]
    }
  }
}

nGram
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "title": [
        {
          "type": "autocomplete",
          "tokenization": "nGram",
          "minGrams": 3,
          "maxGrams": 7,
          "foldDiacritics": false
        }
      ]
    }
  }
}

You can follow the steps in the Tutorial: Create and Query an Atlas Search Index to load the sample dataset, create an index definition, and run Atlas Search queries.

Basic Examples

The following query searches for movies with the characters off in the title field. The query includes a:

  • $limit stage to limit the output to 10 results.
  • $project stage to exclude all fields except title.
db.movies.aggregate([
  {
    $search: {
      "autocomplete": {
        "path": "title",
        "query": "off"
      }
    }
  },
  {
    $limit: 10
  },
  {
    $project: {
      "_id": 0,
      "title": 1
    }
  }
])

The results depend on the tokenization strategy:

edgeGram
{ "title" : "Off the Map" }
{ "title" : "Off and Running" }
{ "title" : "Benji: Off the Leash!" }
{ "title" : "An Officer and a Gentleman" }
{ "title" : "A Spell to Ward Off the Darkness" }
{ "title" : "Office Romance" }
{ "title" : "Office Killer" }
{ "title" : "Office Space" }
{ "title" : "Off Beat" }
{ "title" : "Official Rejection" }

In the above results, the characters off appear at the beginning of a word in every title.

nGram
{ "title" : "Come Have Coffee with Us" }
{ "title" : "A Spell to Ward Off the Darkness" }
{ "title" : "Remake, Remix, Rip-Off: About Copy Culture & Turkish Pop Cinema" }
{ "title" : "Benji: Off the Leash!" }
{ "title" : "A Coffee in Berlin" }
{ "title" : "An Officer and a Gentleman" }
{ "title" : "The Official Story" }
{ "title" : "The Officer's Ward" }
{ "title" : "Hands off Mississippi" }
{ "title" : "Romanoff and Juliet" }

In the above results, the characters off appear at different positions in the titles.
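
The difference between the two result sets can be illustrated with a simplified, per-word sketch of the two tokenization strategies, using the minGrams and maxGrams values from the sample index definitions. The functions edgeGrams and nGrams are hypothetical; the real tokenizer's handling of case, whitespace, and multi-word input is not modeled here.

```javascript
// Simplified illustration only: these hypothetical functions approximate the
// idea behind each strategy for a single lowercase word.

// edgeGram: substrings that all start at the left edge of the word.
function edgeGrams(word, minGrams, maxGrams) {
  const grams = [];
  const longest = Math.min(maxGrams, word.length);
  for (let len = minGrams; len <= longest; len++) {
    grams.push(word.slice(0, len));
  }
  return grams;
}

// nGram: substrings of every allowed length starting at every position.
function nGrams(word, minGrams, maxGrams) {
  const grams = [];
  for (let start = 0; start < word.length; start++) {
    for (let len = minGrams; len <= maxGrams && start + len <= word.length; len++) {
      grams.push(word.slice(start, start + len));
    }
  }
  return grams;
}

const word = "coffee";
console.log(edgeGrams(word, 3, 7));                 // [ 'cof', 'coff', 'coffe', 'coffee' ]
console.log(edgeGrams(word, 3, 7).includes("off")); // false
console.log(nGrams(word, 3, 7).includes("off"));    // true
```

Under this simplified model, off is among the grams of coffee only with nGram, which is consistent with titles such as "A Coffee in Berlin" appearing only in the nGram results.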

Fuzzy Example

The following query searches for movies with the characters pre in the title field. The query uses:

Field Description
maxEdits Indicates that only one character variation is allowed in the query string pre to match the query to a word in the documents.
prefixLength Indicates that the first character in the query string pre can’t change when matching the query to a word in the documents.
maxExpansions Indicates that up to 256 similar terms for pre can be considered when matching the query string to a word in the documents.

The query also includes a:

  • $limit stage to limit the output to 10 results.
  • $project stage to exclude all fields except title.
db.movies.aggregate([
  {
    $search: {
      "autocomplete": {
        "path": "title",
        "query": "pre",
        "fuzzy": {
          "maxEdits": 1,
          "prefixLength": 1,
          "maxExpansions": 256
        }
      }
    }
  },
  {
    $limit: 10
  },
  {
    $project: {
      "_id": 0,
      "title": 1
    }
  }
])

The results depend on the tokenization strategy:

edgeGram
{ "title" : "Prelude to War" }
{ "title" : "Sitting Pretty" }
{ "title" : "Gentlemen Prefer Blondes" }
{ "title" : "The Parent Trap" }
{ "title" : "Premature Burial" }
{ "title" : "The President's Analyst" }
{ "title" : "Pretty Poison" }
{ "title" : "El castillo de la pureza" }
{ "title" : "Premiya" }
{ "title" : "All the President's Men" }

These results show the words predicted for the query string when one character may differ and the first character must stay the same. The matching characters appear at the beginning of a word in every title.

nGram
{ "title" : "Les vampires" }
{ "title" : "He Who Gets Slapped" }
{ "title" : "Shanghai Express" }
{ "title" : "It Happened One Night" }
{ "title" : "The Scarlet Empress" }
{ "title" : "David Copperfield" }
{ "title" : "Prelude to War" }
{ "title" : "It Happened on Fifth Avenue" }
{ "title" : "Berlin Express" }
{ "title" : "Sitting Pretty" }

These results show the words predicted for the query string when one character may differ. The matching characters appear at different positions within the words in the titles.
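
The effect of maxEdits and prefixLength in this example can be sketched with a plain Levenshtein distance check. This is an approximation under stated assumptions: the functions levenshtein and fuzzyMatches are hypothetical, the exact distance measure used by the search engine is not specified here, and maxExpansions is not modeled.

```javascript
// Classic Levenshtein distance (insertions, deletions, substitutions),
// computed row by row. Assumed here as a stand-in for the engine's measure.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical check: the first prefixLength characters must match exactly,
// and the whole term must be within maxEdits single-character edits.
function fuzzyMatches(query, candidate, { maxEdits = 2, prefixLength = 0 } = {}) {
  if (candidate.slice(0, prefixLength) !== query.slice(0, prefixLength)) {
    return false;
  }
  return levenshtein(query, candidate) <= maxEdits;
}

const options = { maxEdits: 1, prefixLength: 1 };
console.log(fuzzyMatches("pre", "pure", options)); // true: one insertion, same first character
console.log(fuzzyMatches("pre", "are", options));  // false: first character differs
console.log(fuzzyMatches("pre", "par", options));  // false: two edits needed
```

The defaults in fuzzyMatches mirror the documented defaults for maxEdits (2) and prefixLength (0).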

Token Order Example

The following queries search for movies with the characters men with in the title field, first with tokenOrder set to any and then with tokenOrder set to sequential. Each query includes a:

  • $limit stage to limit the output to 4 results.
  • $project stage to exclude all fields except title.
db.movies.aggregate([
  {
    $search: {
      "autocomplete": {
        "path": "title",
        "query": "men with",
        "tokenOrder": "any"
      }
    }
  },
  {
    $limit: 4
  },
  {
    $project: {
      "_id": 0,
      "title": 1
    }
  }
])

This query returns the following results for the edgeGram tokenization strategy:

{ "title" : "Men Without Women" }
{ "title" : "Men with Guns" }
{ "title" : "Men with Brooms" }
{ "title" : "Without Men" }

This query returns the following results for the nGram tokenization strategy:

{ "title" : "Men Without Women" }
{ "title" : "Men with Guns" }
{ "title" : "Men with Brooms" }
{ "title" : "Women Without Men" }

The following query sets tokenOrder to sequential:

db.movies.aggregate([
  {
    $search: {
      "autocomplete": {
        "path": "title",
        "query": "men with",
        "tokenOrder": "sequential"
      }
    }
  },
  {
    $limit: 4
  },
  {
    $project: {
      "_id": 0,
      "title": 1
    }
  }
])

This query returns the following results for the edgeGram tokenization strategy:

{ "title" : "Men Without Women" }
{ "title" : "Men with Guns" }
{ "title" : "Men with Brooms" }

This query returns the following results for the nGram tokenization strategy:

{ "title" : "Men Without Women" }
{ "title" : "Men with Guns" }
{ "title" : "Men with Brooms" }
{ "title" : "Women Without Men" }
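
The difference between any and sequential can be sketched with a simplified model in which each query term must be the prefix of a word in the title, which roughly corresponds to the edgeGram strategy. The functions words, matchesAnyOrder, and matchesSequential are hypothetical, and neither scoring nor nGram matching is modeled.

```javascript
// Split text into lowercase words (a simplification of real tokenization).
function words(text) {
  return text.toLowerCase().split(/\s+/).filter(Boolean);
}

// any: every query term is the prefix of some word, in any position.
function matchesAnyOrder(title, query) {
  const titleWords = words(title);
  return words(query).every(term => titleWords.some(w => w.startsWith(term)));
}

// sequential: the query terms are prefixes of adjacent words, in query order.
function matchesSequential(title, query) {
  const titleWords = words(title);
  const terms = words(query);
  for (let start = 0; start + terms.length <= titleWords.length; start++) {
    if (terms.every((term, i) => titleWords[start + i].startsWith(term))) {
      return true;
    }
  }
  return false;
}

console.log(matchesAnyOrder("Without Men", "men with"));         // true
console.log(matchesSequential("Without Men", "men with"));       // false
console.log(matchesSequential("Men Without Women", "men with")); // true
```

Under this model, "Without Men" qualifies only when tokenOrder is any, which is consistent with the edgeGram results above.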