How it works...

There are several ways to provide suggestions in Elasticsearch. You can use a simple term suggester, or queries with wildcards or prefixes, but completion fields are much faster and more powerful because they use natively optimized structures.

Internally, Elasticsearch builds a finite state transducer (FST) structure for suggesting terms. (This topic is described in great detail at the following Wikipedia page: http://en.wikipedia.org/wiki/Finite_state_transducer.)

The most important properties that can be configured on a completion field are as follows:

  • analyzer: This defines the analyzer to be used for indexing the field. The simple analyzer is used by default because it keeps stopwords such as at, the, of, and so in the suggested terms (default: simple).
  • search_analyzer: This defines the analyzer to be used for searching (default: simple).
  • preserve_separators: This controls how the separators between tokens are handled. If disabled, spaces are ignored in suggestions, which makes it possible to match fightc to fight club (default: true).
  • max_input_length: This limits the number of characters of the input string that are indexed, which reduces the number of suggested terms. Suggesting on very long text makes little sense, because no one writes a long string of text and expects a suggestion for it (default: 50).
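  • payloads: This allows you to store payloads, that is, additional item values to be returned with each suggestion (default: false). For example, when searching for a book, it is useful to get back not only the book title but also its ISBN. Note that payloads, like the output value, are supported only in Elasticsearch versions earlier than 5.0; later versions return the _source of the matching document instead. Storing a payload is shown in the following example: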
PUT test/_doc/1
{
  "name": "Elasticsearch Cookbook",
  "suggest": {
    "input": ["ES", "Elasticsearch", "Elastic Search", "ElasticSearch Cookbook"],
    "output": "Elasticsearch Cookbook",
    "payload": {"isbn": "1782166629"},
    "weight": 34
  }
}

In the preceding example, we can see some of the features that are available at indexing time for the completion field, which are as follows:

  • input: This is the list of values that can be used for suggesting. Enriching this list with alternative forms of your data can improve the quality of your suggester.
  • output: This is the string to be shown as a result, mainly used to present a readable text representation to the user (optional).
  • payload: This includes some extra data to be returned with the suggestion (optional).
  • weight: This is a weight boost used to score the suggestions (optional).
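
To see how the indexed input values are used, the following is a minimal sketch of a suggest request against the same test index. It assumes the request syntax of Elasticsearch 5.0 or later; the suggestion name name-suggest and the prefix elas are arbitrary values chosen for illustration:

```
GET test/_search
{
  "suggest": {
    "name-suggest": {
      "prefix": "elas",
      "completion": {
        "field": "suggest"
      }
    }
  }
}
```

Every document that has an input starting with the given prefix is a candidate, and candidates with a higher weight should be ranked first.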

At the start of this recipe, I used a shortcut, the copy_to field property, to populate the completion field from several fields. The copy_to property simply copies the content of one field into one or more other fields.
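
As a rough sketch of how such a mapping might look, the following request creates an index in which two keyword fields are copied into a completion field that sets the properties described earlier to their default values. The index name suggest-demo and the alias field are hypothetical, and the typeless mapping format assumes Elasticsearch 7.0 or later:

```
PUT suggest-demo
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword",
        "copy_to": ["suggest"]
      },
      "alias": {
        "type": "keyword",
        "copy_to": ["suggest"]
      },
      "suggest": {
        "type": "completion",
        "analyzer": "simple",
        "search_analyzer": "simple",
        "preserve_separators": true,
        "max_input_length": 50
      }
    }
  }
}
```

With this approach, the values of name and alias become suggestion inputs without being listed explicitly, at the cost of not being able to set a per-document weight.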