regex (Operator)

Definition

regex

regex interprets the query field as a regular expression. regex is a term-level operator, meaning that the query field is not analyzed. For more information about analyzed and non-analyzed fields, see Analyzers. For an example of querying against analyzed vs. non-analyzed fields, see the analyzed field example.

Note

The regular expression language available to the regex operator is a limited subset of the PCRE library. For detailed information, see the Class RegExp documentation.

regex has the following syntax:

{
  $searchBeta: {
    "index": <index name>, // optional, defaults to "default"
    "regex": {
      "query": "<search-string>",
      "path": "<field-to-search>",
      "allowAnalyzedField": <boolean>,
      "score": <options>
    }
  }
}

Field | Type | Description | Required? | Default
query | string or array of strings | The string or strings to search for. | yes |
path | string or array of strings | The indexed field or fields to search. See path construction for more information. | yes |
allowAnalyzedField | boolean | Must be set to true if the query is run against an analyzed field. | no | false
score | object | Modify the score assigned to matching search term results. Options are: boost (multiply the result score by the given number); constant (replace the result score with the given number). | no |
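
As a rough illustration of how the fields above combine, the following sketch assembles a regex stage in plain JavaScript. The helper name buildRegexStage is hypothetical, the second pattern is made up, and the shape of the score object ({ boost: 2 }) is an assumption based on the option descriptions in the table. Optional fields are left out when they are not supplied, so the documented defaults apply.

```javascript
// Hypothetical helper (not part of any MongoDB API): assembles a
// $searchBeta stage for the regex operator from the documented fields.
function buildRegexStage({ query, path, index, allowAnalyzedField, score }) {
  // query and path are the two required fields.
  if (query === undefined || path === undefined) {
    throw new Error("regex requires both query and path");
  }
  const regex = { query, path };
  // Optional fields are only emitted when the caller supplies them.
  if (allowAnalyzedField !== undefined) regex.allowAnalyzedField = allowAnalyzedField;
  if (score !== undefined) regex.score = score;

  const stage = {};
  if (index !== undefined) stage.index = index; // omitted: "default" is used
  stage.regex = regex;
  return { $searchBeta: stage };
}

// Two patterns against one field, with matches boosted.
// The { boost: 2 } shape is an assumption, not confirmed syntax.
const stage = buildRegexStage({
  query: ["(.*) Seattle", "(.*) Chicago"],
  path: "title",
  score: { boost: 2 },
});
console.log(JSON.stringify(stage, null, 2));
```

The resulting object could be placed as the first stage of an aggregation pipeline, in the same position as the $searchBeta stages in the examples on this page.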

Behavior

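regex is a term-level operator, meaning that the query field is not analyzed. Regular expression searches work well with the keyword analyzer, because it indexes each field value as a single term, so the expression is matched against the whole string.

It is possible to use the regex operator to perform searches on an analyzed field by setting the allowAnalyzedField option to true, but you may get unexpected results, because the expression is matched against the individual terms that the analyzer produced rather than against the original string.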

Example

Searching for .*Star Trek.* on a field indexed with the keyword analyzer finds all documents in which the field contains the string Star Trek in any context. Searching for .*Star Trek.* on a field indexed with the standard analyzer finds nothing, because the standard analyzer splits the text into separate terms at the space between Star and Trek, so no indexed term contains a space.
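
The difference can be sketched locally in plain JavaScript. The two functions below are simplified stand-ins for the keyword and standard analyzers, not the actual Atlas Search implementation, and the sample title is a made-up value. The sketch writes "Star Trek in any context" as the pattern .*Star Trek.* and assumes that a term-level regex has to match an entire indexed term.

```javascript
// Simplified stand-in for the keyword analyzer: the whole value is one term.
function keywordTerms(value) {
  return [value];
}

// Simplified stand-in for the standard analyzer: lowercase the value and
// split it on anything that is not a letter or digit.
function standardTerms(value) {
  return value.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Term-level matching (assumption): the pattern must match a whole term,
// so it is anchored at both ends before testing.
function termLevelRegexMatch(pattern, terms) {
  const whole = new RegExp("^(?:" + pattern + ")$");
  return terms.some((term) => whole.test(term));
}

const sampleTitle = "Star Trek Beyond"; // made-up sample value
const trekPattern = ".*Star Trek.*";

console.log(keywordTerms(sampleTitle));  // [ 'Star Trek Beyond' ]
console.log(standardTerms(sampleTitle)); // [ 'star', 'trek', 'beyond' ]
console.log(termLevelRegexMatch(trekPattern, keywordTerms(sampleTitle)));  // true
console.log(termLevelRegexMatch(trekPattern, standardTerms(sampleTitle))); // false
```

With the keyword-style terms the pattern sees the full title, space included. With the standard-style terms it only ever sees one word at a time, so a pattern that contains a space cannot match.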

Examples

The following examples use the movies collection in the sample_mflix database with a custom index definition that uses the keyword analyzer. If you have the sample dataset on your cluster, you can create an Atlas Search index on the movies collection and run the queries on your cluster. The Atlas Search Index Tutorial contains instructions for loading the sample dataset, creating an index definition, and running Atlas Search queries.

Index Definition

The following index definition indexes the title field in the movies collection with the keyword analyzer:

{
  "mappings": {
    "fields": {
      "title": {
        "analyzer": "lucene.keyword",
        "type": "string"
      }
    }
  }
}

The following example searches the title field for movie titles that end with the word Seattle. In the query, (.*) matches any sequence of characters, and it must be followed by a space and the word Seattle.

db.movies.aggregate([
   {
      "$searchBeta": {
         "regex": {
            "path": "title",
            "query": "(.*) Seattle"
         }
      }
   },
   {
      $project: {
         "_id": 0,
         "title": 1
      }
   }
])

The above query returns the following results:

{ "title" : "Sleepless in Seattle" }
{ "title" : "Battle in Seattle" }

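The following example uses the regular expression [0-9]{2} (.){4}s to find movie titles that begin with a 2-digit number followed by a space and end with a 5-character word ending in s. Because (.) matches any character, the four characters before the final s are not restricted to letters.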

db.movies.aggregate([
   {
      "$searchBeta": {
         "regex": {
            "path": "title",
            "query": "[0-9]{2} (.){4}s"
         }
      }
   },
   {
      $project: {
         "_id": 0,
         "title": 1
      }
   }
])

The above query returns the following results:

{ "title" : "20 Dates" }
{ "title" : "25 Watts" }
{ "title" : "21 Grams" }
{ "title" : "13 Lakes" }
{ "title" : "18 Meals" }
{ "title" : "17 Girls" }
{ "title" : "16 Acres" }
{ "title" : "26 Years" }
{ "title" : "99 Homes" }
{ "title" : "45 Years" }
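
As a quick local sanity check, the two patterns used in the examples can be tried against the returned titles in plain JavaScript. This is only a sketch: it assumes that, with the keyword analyzer, the expression has to match the entire title, and it relies on these particular patterns meaning the same thing in JavaScript's RegExp as in the regex operator. The helper name matchesWholeTitle and the non-matching titles are made up for illustration.

```javascript
// Hypothetical helper: true when the pattern matches the entire title.
function matchesWholeTitle(pattern, title) {
  return new RegExp("^(?:" + pattern + ")$").test(title);
}

const seattlePattern = "(.*) Seattle";
const seattleTitles = ["Sleepless in Seattle", "Battle in Seattle"];

const numberPattern = "[0-9]{2} (.){4}s";
const numberTitles = [
  "20 Dates", "25 Watts", "21 Grams", "13 Lakes", "18 Meals",
  "17 Girls", "16 Acres", "26 Years", "99 Homes", "45 Years",
];

// Every title listed in the results above matches its pattern in full.
console.log(seattleTitles.every((t) => matchesWholeTitle(seattlePattern, t))); // true
console.log(numberTitles.every((t) => matchesWholeTitle(numberPattern, t)));   // true

// Made-up titles that do not fit the patterns.
console.log(matchesWholeTitle(seattlePattern, "Seattle Nights")); // false
console.log(matchesWholeTitle(numberPattern, "12 Angry Men"));    // false
```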