Elasticsearch architecture and vocabulary

Before learning any new technology, we need to get familiar with its specific vocabulary. In the case of Elasticsearch, this vocabulary mostly consists of architecture terms.


Because Elasticsearch is a layer built on top of the Lucene search engine, we'll start by recalling some of the terms related to it. Then we'll move on to Elasticsearch-specific definitions, beginning with architecture terms. The last part presents the terms that concern Elasticsearch documents. All terms are listed in a logical order, so that it's easy to pass from one definition to the next.

Lucene terms in Elasticsearch

Below you can find a list of terms present in Lucene and, consequently, in Elasticsearch:

Architecture terms in Elasticsearch

One of the powerful features of Elasticsearch is its architecture oriented towards horizontal scalability. It means that we can improve searching and indexing performance simply by adding new servers to the cluster. That's why a large part of the architecture terms relate to this aspect.
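To make the scalability idea more concrete, here is a minimal sketch in Python. It assumes an index created with the `number_of_shards` and `number_of_replicas` settings (both are real Elasticsearch index settings; the values, the helper names and the perfectly even spread are illustrative assumptions). The real shard allocator takes more factors into account, so treat the numbers as a rough picture only.

```python
import json

# Illustrative index settings: 3 primary shards, each with 1 replica.
# The setting names exist in Elasticsearch; the values are made up.
index_settings = {
    "settings": {
        "number_of_shards": 3,
        "number_of_replicas": 1,
    }
}


def total_shard_copies(settings: dict) -> int:
    """Count primaries plus all their replicas."""
    s = settings["settings"]
    return s["number_of_shards"] * (1 + s["number_of_replicas"])


def max_copies_per_node(settings: dict, node_count: int) -> int:
    """Hypothetical helper: upper bound of shard copies per node,
    assuming a perfectly even spread (a simplification)."""
    total = total_shard_copies(settings)
    return -(-total // node_count)  # ceiling division


if __name__ == "__main__":
    print(json.dumps(index_settings, indent=2))
    for nodes in (2, 3, 6):
        print(nodes, "nodes ->", max_copies_per_node(index_settings, nodes),
              "shard copies per node at most")
```

Under these assumptions, each added server lowers the number of shard copies a single node has to serve, which is the intuition behind scaling by adding nodes.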

Document terms in Elasticsearch

Another family of terms useful when discovering Elasticsearch concerns indexing and searching. This time, an analogy with relational databases helps to grasp some of the concepts more quickly.
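The analogy can be illustrated with a small Python sketch that turns a relational-style row into the shape of an Elasticsearch document. The `_index`, `_id` and `_source` keys are real document metadata fields; the `row_to_document` helper, the `articles` index name and the sample row are hypothetical, and the row/document comparison is a loose analogy rather than an exact equivalence.

```python
import json


def row_to_document(index_name: str, row: dict, id_column: str = "id") -> dict:
    """Hypothetical helper: map a table row to a document-like structure.

    The primary key becomes the document id, and the remaining columns
    become fields of the JSON source.
    """
    source = {col: value for col, value in row.items() if col != id_column}
    return {
        "_index": index_name,       # where the document lives
        "_id": str(row[id_column]),  # unique identifier inside the index
        "_source": source,           # the original JSON body
    }


if __name__ == "__main__":
    # Sample row, as it could come from a relational table (made-up data).
    row = {"id": 7, "title": "Elasticsearch vocabulary", "views": 120}
    print(json.dumps(row_to_document("articles", row), indent=2))
```

Unlike a row, the document source is free-form JSON, so nested objects and arrays fit without an extra join table.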

This article introduced some basic but very important concepts needed to work well with Elasticsearch. The first part presented the ideas that come from Lucene, essentially related to index construction. The next part described the architecture concepts specific to Elasticsearch. The last part presented, through analogies with relational databases, the main components of the Elasticsearch indexing process. The last two parts also explained the difference between two similar concepts: shards and indexes. A shard holds data and is either a primary or a replica. An index only links to its shards, doesn't store any data itself, and is what data consumer applications interact with.

