Index Definitions¶
When you configure an Atlas Search index, you can specify that certain fields should be indexed with a particular analyzer or with multiple analyzers. You can also specify that certain fields should be indexed while others are left unindexed, or you can dynamically index all the fields in a collection.
Static and Dynamic Mappings¶
Individual field mappings that you configure when you create the index are called static mappings.
Mappings that are automatically assigned when new data is inserted into a field are called dynamic mappings. Dynamic mappings are useful if you have a dynamic schema and you don't know ahead of time all the fields that a collection may contain. You can configure an entire index to use dynamic mappings, or specify individual fields to be dynamically mapped.
Indexes that use dynamic mappings occupy more disk space than indexes that use static mappings and may be less performant.
See the index configuration example.
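As a rough sketch of the difference, the two definitions below are built as Python dictionaries and serialized to JSON. The `mappings.dynamic` flag and the `fields` layout follow the index definitions shown on this page; the field name `title` is hypothetical.

```python
import json

# Fully dynamic index: fields are mapped automatically as data arrives.
dynamic_index = {"mappings": {"dynamic": True}}

# Static index: only the fields listed under "fields" are indexed.
# "title" is a hypothetical field name used for illustration.
static_index = {
    "mappings": {
        "dynamic": False,
        "fields": {"title": {"type": "string"}},
    }
}

print(json.dumps(dynamic_index, indent=2))
print(json.dumps(static_index, indent=2))
```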
Defining an Index¶
You must have the:
- Project Data Access Read Only or higher role to view Atlas Search analyzers and indexes using the Atlas UI or API.
- Project Data Access Admin or higher role to create and manage Atlas Search analyzers and indexes using the Atlas UI or API.
When you create a new Atlas Search index, choose a configuration method.

You can use either the default index definition or specify a custom definition for the index. The default index definition works with any collection. In a custom index definition, you can specify which fields are indexed, with which analyzer, and as which data type.
Unlike compound indexes, the order of fields in the Atlas Search index definition is not important. Fields can be defined in any order.
The index name defaults to default. You can leave the default name in place or choose one of your own.
If you name your index default, you don't need to specify an index parameter when using the $search pipeline stage. Otherwise, you must specify the index name using the index parameter.
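The rule can be sketched as a small helper that adds the `index` parameter only when the index is not named default. This is a minimal sketch: the helper name `search_stage`, the `text` operator, and the query values are assumptions chosen for illustration.

```python
from typing import Any, Dict


def search_stage(operator: Dict[str, Any], index_name: str = "default") -> Dict[str, Any]:
    """Build a $search stage, adding "index" only for non-default index names."""
    stage: Dict[str, Any] = dict(operator)
    if index_name != "default":
        stage["index"] = index_name
    return {"$search": stage}


# The "text" operator and its query/path values are illustrative assumptions.
text_query = {"text": {"query": "mongodb", "path": "company"}}

print(search_stage(text_query))
print(search_stage(text_query, index_name="companies"))
```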
Index names must be unique within their namespace.
See Create an Atlas Search Index for complete instructions on creating a new Atlas Search index.
Field Mapping Examples¶
The following example index definition uses static mappings.
- The default index analyzer is lucene.standard.
- The default search analyzer is lucene.standard. You can change the search analyzer if you want the query term to be parsed differently than how it is stored in your Atlas Search index.
The index specifies static field mappings (dynamic: false), which means fields that are not explicitly mentioned are not indexed. So, the index definition includes:
- The address field, which is of type document. It has two embedded sub-fields, city and state.
  - The city sub-field uses the lucene.simple analyzer by default for queries. It uses the ignoreAbove option to ignore any string of more than 255 bytes in length.
  - The state sub-field uses the lucene.english analyzer by default for queries.
- The company field, which is of type string. It uses the lucene.whitespace analyzer by default for queries. It has a multi analyzer named mySecondaryAnalyzer which uses the lucene.french analyzer by default for queries. For more information on multi analyzers, see Path Construction.
- The employees field, which is an array of strings. It uses the lucene.standard analyzer by default for queries. For indexing arrays, Atlas Search only requires the data type of the array elements. You don't have to specify that the data is contained in an array in the index definition.
{
  "analyzer": "lucene.standard",
  "searchAnalyzer": "lucene.standard",
  "mappings": {
    "dynamic": false,
    "fields": {
      "address": {
        "type": "document",
        "fields": {
          "city": {
            "type": "string",
            "analyzer": "lucene.simple",
            "ignoreAbove": 255
          },
          "state": {
            "type": "string",
            "analyzer": "lucene.english"
          }
        }
      },
      "company": {
        "type": "string",
        "analyzer": "lucene.whitespace",
        "multi": {
          "mySecondaryAnalyzer": {
            "type": "string",
            "analyzer": "lucene.french"
          }
        }
      },
      "employees": {
        "type": "string",
        "analyzer": "lucene.standard"
      }
    }
  }
}
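To query the company field through its alternate analyzer rather than its primary one, the query path names both the field and the multi analyzer. The following is a minimal sketch, assuming the `text` operator and a made-up French query string; the stage is built as a Python dictionary that could be placed in an aggregation pipeline.

```python
import json

# Path object that selects the "mySecondaryAnalyzer" multi analyzer
# defined on the "company" field in the index definition above.
multi_path = {"value": "company", "multi": "mySecondaryAnalyzer"}

# The "text" operator and the query string are assumptions for illustration.
stage = {"$search": {"text": {"query": "société", "path": multi_path}}}

print(json.dumps(stage, ensure_ascii=False, indent=2))
```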
Custom Analyzers¶
You can also define a custom analyzer within an index definition. Custom analyzers allow you to create an Atlas Search mechanism tailored to your specific needs.
BSON Data Types¶
The table below enumerates all the BSON data types and whether they are included in an Atlas Search index with dynamic mappings.
Atlas Search Data Types¶
autocomplete¶
The autocomplete type is for indexing text values for autocompletion. The indexed fields can only be queried with the autocomplete operator.
The autocomplete type can't be used to index fields whose value is an array of strings.
The autocomplete type takes the following options:
Option | Type | Purpose | Necessity | Default |
---|---|---|---|---|
type | string | The type of field. Value must be autocomplete. | required | |
maxGrams | int | The maximum number of characters per indexed sequence. The value limits the character length of indexed tokens. When you search for terms longer than the maxGrams value, tokens are truncated to the maxGrams length. | optional | 15 |
minGrams | int | The minimum number of characters per indexed sequence. The recommended minimum value is 4. A value that is less than 4 could impact performance because the size of the index can become very large. The default value of 2 is only recommended for edgeGram. | optional | 2 |
tokenization | enum | The tokenization strategy to use when indexing the field for autocompletion. Value can be edgeGram or nGram. | optional | edgeGram |
foldDiacritics | boolean | The setting to indicate whether diacritics should be included or removed from the indexed text. Value can be true or false. | optional | true |
{
  "mappings": {
    "dynamic": true|false,
    "fields": {
      "<field-name>": [
        {
          "type": "autocomplete",
          "tokenization": "edgeGram|nGram",
          "minGrams": <2>,
          "maxGrams": <15>,
          "foldDiacritics": true|false
        }
      ]
    }
  }
}
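The effect of tokenization, minGrams, and maxGrams can be illustrated on a single term. The sketch below is a simplified illustration of the general edge n-gram and n-gram techniques, not Atlas Search's tokenizer, which may handle word boundaries, case, and diacritics differently. The function names are hypothetical.

```python
from typing import List


def edge_grams(term: str, min_grams: int, max_grams: int) -> List[str]:
    """Prefixes of `term` whose length is between min_grams and max_grams."""
    longest = min(max_grams, len(term))
    return [term[:n] for n in range(min_grams, longest + 1)]


def n_grams(term: str, min_grams: int, max_grams: int) -> List[str]:
    """Every substring of `term` whose length is between min_grams and max_grams."""
    grams = []
    for start in range(len(term)):
        for n in range(min_grams, max_grams + 1):
            if start + n <= len(term):
                grams.append(term[start:start + n])
    return grams


print(edge_grams("quick", 2, 4))    # ['qu', 'qui', 'quic']
print(len(n_grams("quick", 2, 4)))  # 9
```

With the same bounds, the n-gram variant produces three times as many tokens for this five-letter term, which is consistent with the guidance that a small minGrams value is only recommended for edgeGram.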
boolean¶
The boolean data type is for indexing true and false values. It works in conjunction with the equals operator.
Fields of type boolean cannot be dynamically indexed. They must be specifically indexed as part of a static mapping.
The following example index definition maps a field named verified_user with the boolean data type and a field named teammates with the objectId data type.
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "verified_user": {
        "type": "boolean"
      },
      "teammates": {
        "type": "objectId"
      }
    }
  }
}
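A query against the boolean mapping above might look like the following sketch, which builds an equals stage as a Python dictionary. The `$limit` stage and its value are assumptions added only to make the pipeline complete.

```python
# Match documents whose "verified_user" field is true, using the equals
# operator against the boolean mapping defined above.
equals_stage = {"$search": {"equals": {"path": "verified_user", "value": True}}}

# The $limit stage and its value are illustrative assumptions.
pipeline = [equals_stage, {"$limit": 5}]

print(pipeline)
```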
objectId¶
The objectId data type is for indexing ObjectId fields. It works in conjunction with the equals operator.
string¶
The string data type takes the following parameters:
Option | Type | Purpose | Default |
---|---|---|---|
analyzer | string | Name of a built-in or overridden analyzer to use for indexing the field. | lucene.standard |
searchAnalyzer | string | Analyzer to use when querying the field. | lucene.standard |
indexOptions | string | Specifies the amount of information to store for the indexed field. | offsets |
store | boolean | Specifies whether or not to store the exact document text as well as the analyzed values in the index. Value can be true or false. The value for this must be true for Highlighting. | true |
ignoreAbove | int | Do not index if the field value is greater than the specified number of characters. | None |
multi | string | The string field to index with the name of the alternate analyzer specified in the multi object. | None |
norms | string | Specifies whether to include or omit the field length in the result when scoring. The length of the field is determined by the number of tokens produced by the analyzer for the field. Value can be include or omit. | include |
document¶
The document data type is for fields with embedded documents. It takes the following parameters:
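Option | Type | Purpose | Default |
---|---|---|---|
type | string | Must be document. | none |
dynamic | boolean | If set to true, the fields of the embedded document are dynamically mapped. If set to false, only the fields explicitly specified in fields are indexed. | true |
fields | document | Maps document field names to field definitions. See the example on this page. | none |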
date¶
The date type is for indexing date values. A date cannot be indexed if it is part of an array. It takes the type option. The value of type must be date.
number¶
The number type is for fields with numeric values of int32, int64, and double data types. The number type has the following options:
Option | Purpose | Default |
---|---|---|
type | The type of field. Value must be number. | |
representation | The data type of the field to index. | double |
indexIntegers | Specifies whether to index or omit integer values. | true |
indexDoubles | Specifies whether to index or omit double values. | true |
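As an illustration of these options, the sketch below builds a static index definition as a Python dictionary and prints it as JSON. The field names `account_id` and `balance` are hypothetical, and the representation value `int64` is assumed from the numeric types named above.

```python
import json

# "account_id" and "balance" are hypothetical field names.
number_index = {
    "mappings": {
        "dynamic": False,
        "fields": {
            # Assumed representation value: index as 64-bit integers
            # and omit double values.
            "account_id": {
                "type": "number",
                "representation": "int64",
                "indexDoubles": False,
            },
            # Rely on the defaults listed in the table above.
            "balance": {"type": "number"},
        },
    }
}

print(json.dumps(number_index, indent=2))
```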
geo¶
The geo type is for indexing geographic point and shape coordinates. For this type, the indexed field must be a GeoJSON object.
Option | Purpose | Default |
---|---|---|
type | The type of field. Value must be geo. | |
indexShapes | Specifies whether or not to index shapes. Value can be true or false. | false |
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "type": "document",
      "<field-name>": {
        "indexShapes": true|false,
        "type": "geo"
      }
    }
  }
}
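A filled-in version of the template might look like the following sketch, paired with a sample document whose field holds a GeoJSON Point. The field name `location` and the coordinates are hypothetical.

```python
import json

# "location" is a hypothetical field name.
geo_index = {
    "mappings": {
        "dynamic": False,
        "fields": {"location": {"type": "geo", "indexShapes": True}},
    }
}

# A GeoJSON Point that such a field could hold; GeoJSON positions are
# ordered [longitude, latitude].
sample_document = {"location": {"type": "Point", "coordinates": [-73.98, 40.76]}}

print(json.dumps(geo_index, indent=2))
print(json.dumps(sample_document))
```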
Limitation¶
Atlas Search cannot index numeric, date, or boolean values if they are part of an array.