How it works...

The Elasticsearch GET API doesn't require a refresh: GET calls on a document are served in real time.

This call is very fast because Elasticsearch routes the request only to the shard that contains the document, with no additional overhead, and the document IDs are often cached in memory for fast lookup.
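The single-shard lookup described above can be pictured with a minimal sketch. It assumes the simplified rule "hash of the routing value, modulo the number of primary shards", where the routing value defaults to the document ID. The function name pick_shard is hypothetical, and zlib.crc32 is only a stand-in: Elasticsearch uses its own Murmur3-based hash, so the shard numbers computed here will not match a real cluster.

```python
import zlib


def pick_shard(routing_value: str, number_of_primary_shards: int) -> int:
    """Map a routing value to a shard number (illustrative only).

    Assumption: crc32 replaces the Murmur3 hash that Elasticsearch
    really uses, so only the idea (hash, then modulo) carries over.
    """
    if number_of_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    digest = zlib.crc32(routing_value.encode("utf-8"))
    return digest % number_of_primary_shards


if __name__ == "__main__":
    doc_id = "2qLrAfPVQvCRMe7Ku8r0Tw"
    # With no explicit routing, the document ID is the routing value,
    # so the same ID always resolves to the same shard.
    print(pick_shard(doc_id, 5))
    print(pick_shard(doc_id, 5) == pick_shard(doc_id, 5))
```

Because the result depends only on the routing value and the shard count, the node that receives the GET can forward it to one shard instead of asking all of them.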

The source of the document is only available if the _source field is enabled in the mapping (which is the default setting in Elasticsearch).

There are several additional parameters that can be used to control the GET call:

  • _source allows us to retrieve only a subset of fields. This is very useful for reducing bandwidth or for retrieving calculated fields such as the attachment-mapping ones:
GET /myindex/_doc/2qLrAfPVQvCRMe7Ku8r0Tw?_source=date,sent
  • stored_fields, similar to _source, allows us to retrieve only a subset of fields, in this case the ones that are marked as stored in the mapping. Stored fields are kept in a separate part of the index, so they can be retrieved without parsing the JSON source:
GET /myindex/_doc/2qLrAfPVQvCRMe7Ku8r0Tw?stored_fields=date,sent
  • routing allows us to specify the routing value that selects the shard to be used for the GET operation. To retrieve a document, the routing value used at index time must also be supplied at retrieval time:
GET /myindex/_doc/2qLrAfPVQvCRMe7Ku8r0Tw?routing=customer_id
  • refresh allows us to refresh the current shard before performing the get operation (it must be used with care because it slows down indexing and introduces some overhead):
GET /myindex/_doc/2qLrAfPVQvCRMe7Ku8r0Tw?refresh=true
  • preference allows us to control which shard copy (primary or replica) is chosen to execute the GET call. By default, Elasticsearch chooses a random copy of the shard. The possible values are as follows:
    • _primary for the primary shard (this value was deprecated in Elasticsearch 6.x and removed in 7.0).
    • _local, which first tries a shard copy on the local node and then falls back to a random choice. Using the local copy reduces bandwidth usage and is generally a good fit for indices with auto-expanding replicas (index.auto_expand_replicas set to 0-all).
    • a custom string value, such as a customer_id or a username. Requests that pass the same value are served by the same shard copy.
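The parameters listed above can also be combined in a single call. The following is a minimal sketch of how a client might assemble such a request URL. The helper name build_get_url and the base address http://localhost:9200 are assumptions made for illustration; the code only builds the URL and does not contact a cluster.

```python
from urllib.parse import quote, urlencode


def build_get_url(base_url, index, doc_id, **params):
    """Return the GET URL for one document (hypothetical helper).

    Parameters set to None are skipped, booleans become true/false,
    and lists are joined with commas (for example, _source fields).
    """
    path = "{}/{}/_doc/{}".format(
        base_url.rstrip("/"), quote(index, safe=""), quote(doc_id, safe="")
    )
    query = {}
    for name, value in params.items():
        if value is None:
            continue
        if isinstance(value, bool):
            value = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            value = ",".join(value)
        query[name] = value
    # Keep commas readable instead of percent-encoding them.
    return path + "?" + urlencode(query, safe=",") if query else path


if __name__ == "__main__":
    base = "http://localhost:9200"  # assumed local node address
    doc_id = "2qLrAfPVQvCRMe7Ku8r0Tw"
    print(build_get_url(base, "myindex", doc_id, _source=["date", "sent"]))
    print(build_get_url(base, "myindex", doc_id,
                        routing="customer_id", preference="_local"))
    print(build_get_url(base, "myindex", doc_id, refresh=True))
```

The first call prints http://localhost:9200/myindex/_doc/2qLrAfPVQvCRMe7Ku8r0Tw?_source=date,sent, which matches the _source example shown earlier.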