Near-Real-Time Search using Elasticsearch

Lucene allows new segments to be written and opened, making the documents they contain visible to search, without performing a full commit. The contents of the in-memory buffer are written to a new segment that is searchable but not yet committed. This is a much lighter process than a commit and can be done frequently without hurting performance.

In Elasticsearch, this lightweight process of writing and opening a new segment is called a refresh. By default, every shard is refreshed automatically once every second. Hence, Elasticsearch has near real-time search: document changes are not visible to search immediately, but will become visible within 1 second.
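The visibility rule described above can be illustrated with a toy model. This is only a sketch of the idea, not how Lucene or Elasticsearch is implemented; the `ToyShard` class and its method names are hypothetical and exist only for this illustration.

```python
class ToyShard:
    """Toy model of refresh visibility; not the real Lucene data structures."""

    def __init__(self):
        self.buffer = []    # indexed documents that are not yet searchable
        self.segments = []  # opened segments, each a list of documents

    def index(self, doc):
        # New documents land in the in-memory buffer first.
        self.buffer.append(doc)

    def refresh(self):
        # Write the buffer out as a new segment and open it for search.
        if self.buffer:
            self.segments.append(self.buffer)
            self.buffer = []

    def search(self, term):
        # Only documents in opened segments are visible to search.
        return [doc for seg in self.segments for doc in seg if term in doc]


shard = ToyShard()
shard.index("error: disk full")
before = shard.search("error")  # the buffer is not searched yet
shard.refresh()
after = shard.search("error")   # the new segment is now visible
print(before, after)
```

In the real system the refresh happens automatically on a timer, so the gap between `before` and `after` is what makes search "near" real-time rather than immediate.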

Not all use cases require a refresh every second. Perhaps you are using Elasticsearch to index millions of log files, and you would prefer to optimize for indexing speed rather than near real-time search. You can reduce the frequency of refreshes on a per-index basis by setting that index's refresh_interval.
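As a rough sketch, the setting can be changed through the index settings endpoint (`PUT <index>/_settings`). The example below only builds the HTTP request with the Python standard library and does not send it. The host `http://localhost:9200`, the index name `my_logs`, the `30s` value, and the helper `refresh_interval_request` are assumptions made for illustration.

```python
import json
import urllib.request


def refresh_interval_request(base_url, index, interval):
    """Build (but do not send) a PUT <index>/_settings request."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host and index name; refresh this index every 30 seconds.
req = refresh_interval_request("http://localhost:9200", "my_logs", "30s")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To apply it against a running cluster: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the exact URL and body before touching a live cluster.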

The refresh_interval can be updated dynamically on an existing index. You can turn off automatic refreshes while you are building a big new index, and then turn them back on when you start using the index in production.
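The two steps of that workflow might look like the following sketch, which prints the settings bodies rather than sending them. A value of `-1` disables automatic refreshes and `1s` restores the one-second default. The index name `my_logs` and the helper `refresh_settings` are hypothetical.

```python
import json


def refresh_settings(interval):
    """Return the JSON body for PUT <index>/_settings with the given interval."""
    return json.dumps({"index": {"refresh_interval": interval}})


# Step 1: before the bulk load, disable automatic refreshes (-1 means never).
disable_body = refresh_settings(-1)

# Step 2: index the documents (not shown here).

# Step 3: before going to production, restore the one-second default.
enable_body = refresh_settings("1s")

print("PUT /my_logs/_settings", disable_body)
print("PUT /my_logs/_settings", enable_body)
```

Note that while refreshes are disabled, newly indexed documents stay invisible to search until the interval is restored or a refresh is triggered.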

More detailed information is available in the Elasticsearch documentation.

This page was last modified on May 18, 2018, at 08:01.
