autocomplete¶
Definition¶
autocomplete
The autocomplete operator performs a search for a word or phrase that contains a sequence of characters from an incomplete input string. You can use the autocomplete operator with search-as-you-type applications to predict words with increasing accuracy as characters are entered in your application's search field. autocomplete returns results that contain predicted words based on the tokenization strategy specified in the index definition for autocompletion. The fields that you intend to query with the autocomplete operator must be indexed with the autocomplete data type in the collection's index definition.
Syntax¶
autocomplete has the following syntax:
{
  $search: {
    "autocomplete": {
      "query": "<search-string>",
      "path": "<field-to-search>",
      "tokenOrder": "any|sequential",
      "fuzzy": <options>,
      "score": <options>
    }
  }
}
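As a minimal sketch of how an application might assemble this stage from user input, the helper below builds a $search stage that follows the syntax above. The function name buildAutocompleteStage is hypothetical and is not part of Atlas Search; it only copies the optional fields named in the syntax.

```javascript
// Hypothetical helper (not an Atlas Search API): builds a $search stage
// that follows the autocomplete syntax shown above.
function buildAutocompleteStage(query, path, options = {}) {
  const autocomplete = { query, path };
  // Copy only the optional fields named in the syntax block.
  for (const key of ["tokenOrder", "fuzzy", "score"]) {
    if (options[key] !== undefined) {
      autocomplete[key] = options[key];
    }
  }
  return { $search: { autocomplete } };
}

const stage = buildAutocompleteStage("off", "title", { tokenOrder: "any" });
console.log(JSON.stringify(stage, null, 2));
// In mongosh, the stage could then be used in an aggregation, for example:
// db.movies.aggregate([stage, { $limit: 10 }]);
```

Keeping the stage construction in one place makes it easier to pass the latest contents of a search field to the query on each keystroke.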
Options¶
Field | Type | Description | Necessity | Default
---|---|---|---|---
query | string or array of strings | String or strings to search for. If a string contains multiple terms, Atlas Search also looks for a match for each term in the string separately. | yes |
path | string | Field to search. The field must be indexed with the autocomplete type. | yes |
fuzzy | object | Enable fuzzy search. Find strings that are similar to the search term or terms. | no |
fuzzy.maxEdits | integer | Maximum number of single-character edits required to match the specified search term. Value can be 1 or 2. | no | 2
fuzzy.prefixLength | integer | Number of characters at the beginning of each term in the result that must exactly match. | no | 0
fuzzy.maxExpansions | integer | Maximum number of variations to generate and search for. This limit applies on a per-token basis. | no | 50
score | object | Score to assign to matching search term results. Use this option to modify the default score. | no |
tokenOrder | string | Order in which to search for tokens. Value can be any (tokens can appear in any order) or sequential (tokens must appear in the order given in the query). | no | any
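To make the defaults in the table concrete, the following sketch merges caller-supplied fuzzy options with the documented defaults and checks the allowed range of maxEdits. The names FUZZY_DEFAULTS and withFuzzyDefaults are hypothetical; Atlas Search applies its defaults on the server, so a helper like this is only useful for client-side validation or logging.

```javascript
// Hypothetical client-side helper: mirrors the defaults listed in the
// options table. Atlas Search itself applies these defaults server-side.
const FUZZY_DEFAULTS = { maxEdits: 2, prefixLength: 0, maxExpansions: 50 };

function withFuzzyDefaults(fuzzy = {}) {
  const merged = { ...FUZZY_DEFAULTS, ...fuzzy };
  // The table states that maxEdits can only be 1 or 2.
  if (merged.maxEdits !== 1 && merged.maxEdits !== 2) {
    throw new RangeError("fuzzy.maxEdits must be 1 or 2");
  }
  return merged;
}

console.log(withFuzzyDefaults({ maxEdits: 1 }));
```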
Examples¶
The following examples use the movies collection in the sample_mflix database. If you loaded the sample dataset on your cluster, you can create the static index for autocompletion and run the queries on your cluster.
The following sample index definition uses the edgeGram tokenization strategy, and you can use it for the queries in the following examples. To learn more about edgeGram and nGram, see the autocomplete data type.
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "title": [
        {
          "type": "autocomplete",
          "tokenization": "edgeGram",
          "minGrams": 3,
          "maxGrams": 7,
          "foldDiacritics": false
        }
      ]
    }
  }
}
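To build intuition for how minGrams and maxGrams shape the indexed tokens, the sketch below generates character sequences for a single lowercase word. It is a simplified model under stated assumptions, not the analyzer that Atlas Search runs: the functions edgeGrams and nGrams are hypothetical, and real tokenization also deals with case, word boundaries, and diacritics.

```javascript
// Simplified model only; Atlas Search's actual tokenization may differ.
// edgeGrams: sequences that start at the beginning of the word.
function edgeGrams(word, minGrams, maxGrams) {
  const grams = [];
  const longest = Math.min(maxGrams, word.length);
  for (let len = minGrams; len <= longest; len++) {
    grams.push(word.slice(0, len));
  }
  return grams;
}

// nGrams: sequences that start at every position in the word.
function nGrams(word, minGrams, maxGrams) {
  const grams = [];
  for (let start = 0; start < word.length; start++) {
    for (let len = minGrams; len <= maxGrams && start + len <= word.length; len++) {
      grams.push(word.slice(start, start + len));
    }
  }
  return grams;
}

console.log(edgeGrams("office", 3, 7)); // [ 'off', 'offi', 'offic', 'office' ]
console.log(nGrams("office", 3, 4));
```

In this model, a query such as off can only match an edgeGram token at the start of a word, while nGram tokens also allow matches in the middle of a word.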
You can follow the steps in the Tutorial: Create and Query an Atlas Search Index to load the sample dataset, create an index definition, and run Atlas Search queries.
Basic Examples¶
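The following query searches for movies with the characters off in the title field. The query includes a $limit stage to limit the output to 10 results and a $project stage to exclude all fields except title.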
db.movies.aggregate([
  {
    $search: {
      "autocomplete": {
        "path": "title",
        "query": "off"
      }
    }
  },
  {
    $limit: 10
  },
  {
    $project: {
      "_id": 0,
      "title": 1
    }
  }
])
With the edgeGram index definition above, the query returns the following results:
{ "title" : "Off the Map" }
{ "title" : "Off and Running" }
{ "title" : "Benji: Off the Leash!" }
{ "title" : "An Officer and a Gentleman" }
{ "title" : "A Spell to Ward Off the Darkness" }
{ "title" : "Office Romance" }
{ "title" : "Office Killer" }
{ "title" : "Office Space" }
{ "title" : "Off Beat" }
{ "title" : "Official Rejection" }
In the above results, the characters off appear at the beginning of a word in all the titles.
Fuzzy Example¶
The following query searches for movies with the characters pre in the title field. The query uses:
Field | Description
---|---
maxEdits | Indicates that only one character variation is allowed in the query string pre to match the query to a word in the documents.
prefixLength | Indicates that the first character in the query string pre can't change when matching the query to a word in the documents.
maxExpansions | Indicates that up to 256 similar terms for pre can be considered when matching the query string to a word in the documents.
The query also includes a $project stage to exclude all fields except title.
db.movies.aggregate([
  {
    $search: {
      "autocomplete": {
        "path": "title",
        "query": "pre",
        "fuzzy": {
          "maxEdits": 1,
          "prefixLength": 1,
          "maxExpansions": 256
        }
      }
    }
  },
  {
    $project: {
      "_id": 0,
      "title": 1
    }
  }
])
With the edgeGram index definition above, the first ten results are:
{ "title" : "Prelude to War" }
{ "title" : "Sitting Pretty" }
{ "title" : "Gentlemen Prefer Blondes" }
{ "title" : "The Parent Trap" }
{ "title" : "Premature Burial" }
{ "title" : "The President's Analyst" }
{ "title" : "Pretty Poison" }
{ "title" : "El castillo de la pureza" }
{ "title" : "Premiya" }
{ "title" : "All the President's Men" }
These results show the words predicted for the query string when one character modification is allowed and the first character is held constant. In every title, the match occurs at the beginning of a word.
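The effect of maxEdits and prefixLength on titles such as "The Parent Trap" and "El castillo de la pureza" can be illustrated with a small edit-distance check. This is a sketch under assumptions: it uses plain Levenshtein distance, which may not be the exact measure Atlas Search applies, and the function names editDistance and fuzzyMatches as well as the candidate tokens pare and pure are chosen here for illustration.

```javascript
// Plain Levenshtein distance (insertions, deletions, substitutions).
// Assumption: used only to illustrate maxEdits; Atlas Search may count
// edits differently.
function editDistance(a, b) {
  // prev[j] is the distance between the first i-1 chars of a and first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical check: the first prefixLength characters must match exactly,
// and the whole token must be within maxEdits of the query.
function fuzzyMatches(query, token, { maxEdits, prefixLength }) {
  const samePrefix = query.slice(0, prefixLength) === token.slice(0, prefixLength);
  return samePrefix && editDistance(query, token) <= maxEdits;
}

const options = { maxEdits: 1, prefixLength: 1 };
console.log(fuzzyMatches("pre", "pare", options)); // true: one inserted character
console.log(fuzzyMatches("pre", "pure", options)); // true: one inserted character
console.log(fuzzyMatches("pre", "bre", options));  // false: first character differs
```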
Token Order Example¶
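The following queries search for movies with the characters men with in the title field, using the any and sequential values of tokenOrder. The query includes a $limit stage to limit the output to 4 results and a $project stage to exclude all fields except title.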
db.movies.aggregate([
  {
    $search: {
      "autocomplete": {
        "path": "title",
        "query": "men with",
        "tokenOrder": "any"
      }
    }
  },
  {
    $limit: 4
  },
  {
    $project: {
      "_id": 0,
      "title": 1
    }
  }
])
This query returns the following results for the edgeGram tokenization strategy:
{ "title" : "Men Without Women" }
{ "title" : "Men with Guns" }
{ "title" : "Men with Brooms" }
{ "title" : "Without Men" }
This query returns the following results for the nGram tokenization strategy:
{ "title" : "Men Without Women" }
{ "title" : "Men with Guns" }
{ "title" : "Men with Brooms" }
{ "title" : "Women Without Men" }
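To compare the two tokenOrder values side by side, the same pipeline can be generated for either value. The helper below is a sketch; the name tokenOrderPipeline is hypothetical, and no results are shown for the sequential variant because they depend on the index and dataset.

```javascript
// Hypothetical helper: builds the token order example pipeline for a given
// tokenOrder value ("any" or "sequential").
function tokenOrderPipeline(tokenOrder) {
  if (tokenOrder !== "any" && tokenOrder !== "sequential") {
    throw new RangeError('tokenOrder must be "any" or "sequential"');
  }
  return [
    { $search: { autocomplete: { path: "title", query: "men with", tokenOrder } } },
    { $limit: 4 },
    { $project: { _id: 0, title: 1 } },
  ];
}

console.log(JSON.stringify(tokenOrderPipeline("sequential"), null, 2));
// In mongosh, for example:
// db.movies.aggregate(tokenOrderPipeline("sequential"));
```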