Language Analyzers
Language-specific analyzers provide a convenient way to create indexes tailored to a particular language. Each language analyzer has built-in stop words and word divisions based on that language's usage patterns.
Atlas Search offers the following language analyzers:
lucene.arabic | lucene.armenian | lucene.basque | lucene.bengali | lucene.brazilian |
lucene.bulgarian | lucene.catalan | lucene.cjk [1] | lucene.czech | lucene.danish |
lucene.dutch | lucene.english | lucene.finnish | lucene.french | lucene.galician |
lucene.german | lucene.greek | lucene.hindi | lucene.hungarian | lucene.indonesian |
lucene.irish | lucene.italian | lucene.latvian | lucene.lithuanian | lucene.norwegian |
lucene.persian | lucene.portuguese | lucene.romanian | lucene.russian | lucene.sorani |
lucene.spanish | lucene.swedish | lucene.turkish | lucene.thai |
[1] cjk means Chinese, Japanese, and Korean.
Example
The following example index definition specifies an index on the sujet field using the french analyzer:
{
  "mappings": {
    "fields": {
      "sujet": {
        "type": "string",
        "analyzer": "lucene.french"
      }
    }
  }
}
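The same definition can be produced for other fields and analyzers. The following is a minimal sketch, not an Atlas Search API: buildLanguageIndex is a hypothetical helper introduced here for illustration, and it assumes you only want the built-in lucene.* analyzers listed above applied to a single string field.

```javascript
// Hypothetical helper (not part of Atlas Search): builds an index definition
// that applies one built-in analyzer to one string field.
function buildLanguageIndex(fieldName, analyzerName) {
  // Assumption of this sketch: only the built-in "lucene.*" names are accepted.
  if (!analyzerName.startsWith("lucene.")) {
    throw new Error(
      `Expected an analyzer name like "lucene.french", got "${analyzerName}"`
    );
  }
  return {
    mappings: {
      fields: {
        [fieldName]: { type: "string", analyzer: analyzerName },
      },
    },
  };
}

// Reproduces the definition shown above for the sujet field.
const frenchIndex = buildLanguageIndex("sujet", "lucene.french");
console.log(JSON.stringify(frenchIndex, null, 2));
```

The resulting object has the same shape as the index definition above, so it can be pasted into whatever tool you use to create the index.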
Consider a collection named voitures with the following documents:
{ "_id": 1, "sujet": "Mieux équiper nos voitures pour comprendre les causes d'un accident." }
{ "_id": 2, "sujet": "Le meilleur moment pour le faire c'est immédiatement après que vous aurez fait le plein de carburant." }
The following query uses the index on the sujet field:
db.voitures.aggregate([
  {
    $search: {
      "text": {
        "query": "pour",
        "path": "sujet"
      }
    }
  }
])
The above query returns no results when the field is indexed with the french analyzer, because pour is a built-in stop word and is therefore removed during analysis. With the standard analyzer, the same query would return both documents, since both contain pour.
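The effect of the stop word can be illustrated with a toy model. The sketch below is not how Lucene or Atlas Search is implemented: analyze and matches are hypothetical functions, the tokenizer is a plain split on non-letter characters, and FRENCH_STOP_WORDS is a tiny illustrative set in which only pour is confirmed by the text above (le and de are assumed for the example).

```javascript
// Toy model of analysis, for illustration only; NOT Lucene's implementation.
// Only "pour" is confirmed as a stop word by the text; the rest are assumed.
const FRENCH_STOP_WORDS = new Set(["pour", "le", "de"]);

// Lowercase the text, split on anything that is not a letter, combining mark,
// or digit, then drop empty tokens and stop words.
function analyze(text, stopWords) {
  return text
    .toLowerCase()
    .split(/[^\p{L}\p{M}\p{N}]+/u)
    .filter((token) => token.length > 0 && !stopWords.has(token));
}

// A document matches when at least one analyzed query term is among the
// analyzed terms of the field value.
function matches(query, fieldValue, stopWords) {
  const docTerms = new Set(analyze(fieldValue, stopWords));
  return analyze(query, stopWords).some((term) => docTerms.has(term));
}

const docs = [
  { _id: 1, sujet: "Mieux équiper nos voitures pour comprendre les causes d'un accident." },
  { _id: 2, sujet: "Le meilleur moment pour le faire c'est immédiatement après que vous aurez fait le plein de carburant." },
];

// Returns the _id values of the documents that match under a given stop list.
const search = (query, stopWords) =>
  docs.filter((d) => matches(query, d.sujet, stopWords)).map((d) => d._id);

const pourWithFrench = search("pour", FRENCH_STOP_WORDS); // query term is removed
const pourWithoutStopWords = search("pour", new Set());   // no stop list applied
const carburantWithFrench = search("carburant", FRENCH_STOP_WORDS);

console.log(pourWithFrench, pourWithoutStopWords, carburantWithFrench);
// prints [] [ 1, 2 ] [ 2 ]
```

In this model the query pour is reduced to an empty list of terms when the stop list is applied, so nothing can match, while a term such as carburant passes through untouched.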
The following query searches for the string carburant in the sujet field:
db.voitures.aggregate([
  {
    $search: {
      "text": {
        "query": "carburant",
        "path": "sujet"
      }
    }
  }
])
The above query returns the document with "_id": 2 from the collection:
{ "_id": 2, "sujet": "Le meilleur moment pour le faire c'est immédiatement après que vous aurez fait le plein de carburant." }