
Near-Real-Time Search using Elasticsearch

Lucene allows new segments to be written and opened, making the documents they contain visible to search, without performing a full commit. The contents of the in-memory buffer are written to a new segment that is searchable but not yet committed. This is a much lighter process than a commit and can be done frequently without a significant performance cost.

In Elasticsearch, this lightweight process of writing and opening a new segment is called a refresh. By default, every shard is refreshed automatically once every second. Hence, Elasticsearch has near real-time search: document changes are not visible to search immediately, but will become visible within 1 second.
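
A refresh can also be requested explicitly through the refresh API (POST /<index>/_refresh), which is useful in tests that need a just-indexed document to be searchable right away. The following is a minimal sketch using only the Python standard library. The node address http://localhost:9200, the index name my_index, and the helper names are assumptions made for illustration, and an unsecured node is assumed.

```python
import json
import urllib.request

# Assumption: a local, unsecured Elasticsearch node.
ES_URL = "http://localhost:9200"


def build_refresh_request(index, base_url=ES_URL):
    """Build (but do not send) a POST /<index>/_refresh request."""
    return urllib.request.Request(f"{base_url}/{index}/_refresh", method="POST")


def refresh(index, base_url=ES_URL):
    """Send the refresh request and return the parsed JSON response."""
    with urllib.request.urlopen(build_refresh_request(index, base_url)) as resp:
        return json.load(resp)


# Usage (needs a running node; "my_index" is a hypothetical index name):
# refresh("my_index")
```

A manual refresh carries the same cost as an automatic one, so it is usually better to rely on the automatic refresh in production and to reserve explicit refreshes for tests.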

Not all use cases require a refresh every second. Perhaps you are using Elasticsearch to index millions of log files, and you would prefer to optimize for indexing speed rather than near real-time search. You can reduce the frequency of refreshes on a per-index basis by setting the refresh_interval of that index.
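
As an illustration, the sketch below builds a create-index request that sets refresh_interval to 30 seconds. The index name my_logs, the 30s value, the node address, and the helper name are all assumptions chosen for the example; the request is only constructed here, and sending it requires a running, unsecured node.

```python
import json
import urllib.request

# Assumption: a local, unsecured Elasticsearch node.
ES_URL = "http://localhost:9200"


def build_create_index_request(index, refresh_interval="30s", base_url=ES_URL):
    """Build a PUT /<index> request that creates the index with a custom
    refresh_interval instead of the default of one second."""
    body = {"settings": {"index": {"refresh_interval": refresh_interval}}}
    return urllib.request.Request(
        f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Usage (needs a running node; "my_logs" is a hypothetical index name):
# urllib.request.urlopen(build_create_index_request("my_logs"))
```

The value is a duration with a unit, such as 30s or 1m. With a 30-second interval, newly indexed documents may take up to 30 seconds to appear in search results.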

The refresh_interval can be updated dynamically on an existing index. You can turn off automatic refreshes while you are building a big new index, and then turn them back on when you start using the index in production.
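
The pattern described above could be sketched as two calls to the update-settings API (PUT /<index>/_settings): one that sets refresh_interval to -1 to disable automatic refreshes before the bulk load, and one that restores it afterwards. The index name my_logs, the node address, and the helper names are assumptions made for illustration, and an unsecured node is assumed.

```python
import json
import urllib.request

# Assumption: a local, unsecured Elasticsearch node.
ES_URL = "http://localhost:9200"


def build_refresh_interval_update(index, refresh_interval, base_url=ES_URL):
    """Build a PUT /<index>/_settings request that changes refresh_interval
    on an existing index. A value of "-1" disables automatic refreshes."""
    body = {"index": {"refresh_interval": refresh_interval}}
    return urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send(request):
    """Send a prepared request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


# Usage (needs a running node; "my_logs" is a hypothetical index name):
# send(build_refresh_interval_update("my_logs", "-1"))  # disable before bulk load
# ... bulk-index the documents here ...
# send(build_refresh_interval_update("my_logs", "1s"))  # restore for production
```

While automatic refreshes are disabled, newly indexed documents do not become visible to search until the interval is restored or a refresh is triggered some other way.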

More detailed information is available in the Elasticsearch documentation.

This page was last modified on May 18, 2018, at 07:01.