Scoring and boosting in Elasticsearch on waitingforcode.com

A subtle difference between filtering and full-text search lies in scoring. A filter only tells whether a result matches or not, while the score tells how well a result matches the query.

To work well with Elasticsearch, it's important to understand scoring. Without that, it's difficult to explain why some documents are returned in a higher position than others. That's why the first part of this article explains the scoring algorithm. After that, we'll explore the boosting feature, which consists in changing the scores computed by Elasticsearch.

Scoring in Elasticsearch

Scoring in Elasticsearch consists in associating relevance values with the documents found by a search. It's very useful in multi-word full-text search, where a document can match only one of the searched words (for example: "house") as well as all of them ("my house"). Scores are exposed in a field called _score. But according to which criteria are they computed?

To understand the ingredients of the score calculation, we'll rely on the formula used to compute the score. This formula is based on concepts taken from term frequency/inverse document frequency (TF/IDF) and the vector space model. Let's begin with the components of the first one:

term frequency - the more often a given term occurs in a document, the higher the score.
inverse document frequency - its role is to weight the importance of the searched term. For example, some very common words such as prepositions (at, since, in, on...) can appear in a large number of documents. They count little toward document relevance, while rarer words, such as "Magnoliidae", count more. So if one document matches "on" and another one matches "Magnoliidae", and the "on" term is present in almost all indexed documents (unlike "Magnoliidae", present only in a few), the score of the "Magnoliidae" document will be much higher.
field-length criteria - the length of the field containing the searched term(s) also influences scoring. The shorter the field, the better the score. To understand that, let's take an analogy with newspaper articles. If we see the words we're interested in in an article's title, we're more likely to read it than if they only appear somewhere in a half-page text.
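The three ingredients above can be sketched in a few lines of Python. The formulas used here (square root for term frequency, 1 + ln(maxDocs / (docFreq + 1)) for inverse document frequency, 1 / sqrt(number of terms) for the field-length norm) are the ones I assume from Lucene's classic TF/IDF similarity; they aren't given in this article. The function names and the document frequencies chosen for "on" and "Magnoliidae" are made up for illustration.

```python
import math


def tf(freq):
    # Term frequency: grows with the number of occurrences, but slower than linearly.
    return math.sqrt(freq)


def idf(doc_freq, max_docs):
    # Inverse document frequency: terms present in few documents weigh more.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_norm(num_terms):
    # Field-length norm: the shorter the field, the higher the weight.
    return 1.0 / math.sqrt(num_terms)


max_docs = 1263
common = idf(doc_freq=1200, max_docs=max_docs)  # hypothetical: "on" is almost everywhere
rare = idf(doc_freq=3, max_docs=max_docs)       # hypothetical: "Magnoliidae" is in a few docs

print(f"idf(on)          = {common:.3f}")
print(f"idf(Magnoliidae) = {rare:.3f}")
print(f"norm(4-word title)     = {field_norm(4):.3f}")
print(f"norm(400-word content) = {field_norm(400):.3f}")
```

With these assumptions, the rare term gets a much higher idf than the common one, and the short field gets a higher norm than the long one, which is the behavior described in the list.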

The vector space model is used to check how well a document matches a multi-term query. Elasticsearch builds a vector for each indexed document matching the search query. The vector contains the weights of all terms defined in the search and present in the given document. For example, if we search for "and Magnoliidae", a document containing only the "and" term will have a vector looking like [1, 0], where 1 is the weight of "and" and 0 is the weight of the "Magnoliidae" term, missing from the document. On the other hand, a document matching both terms will have a vector like [1, 5]. Then, Elasticsearch measures the angle between the query vector and the document vector. If the angle between them is big, the relevance is low.
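The angle comparison can be illustrated with a cosine computation. This is only a sketch of the idea, not Elasticsearch's actual implementation; the query vector [1, 5] is an assumption (the query is given the same weights as the document matching both terms), and the cosine helper is a hypothetical name.

```python
import math


def cosine(u, v):
    # Cosine of the angle between two vectors: 1.0 means same direction.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


query = [1, 5]          # assumed weights of "and" and "Magnoliidae" in the query
doc_only_and = [1, 0]   # document containing only "and"
doc_both = [1, 5]       # document containing both terms

for name, doc in (("only 'and'", doc_only_and), ("both terms", doc_both)):
    cos = cosine(query, doc)
    # min() guards against a cosine slightly above 1.0 due to rounding.
    angle = math.degrees(math.acos(min(1.0, cos)))
    print(f"{name}: cosine={cos:.3f}, angle={angle:.1f} degrees")
```

The document containing only "and" ends up far from the query direction, while the document containing both terms is aligned with it.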

Debugging Elasticsearch score

Each search query lets us debug scoring and understand score differences between documents. To achieve that, we can pass the explain parameter in the query string, as in http://localhost:9200/waitingforcode/teams/_search?q=name:rc%20roubaix&pretty=true&explain. Elasticsearch will return a hits part containing an _explanation field:

"_source":{  
  "name":"Excelsior Roubaix"
},
"_explanation":{  
  "value":0.34794572,
  "description":"product of:",
  "details":[  
    {  
      "value":0.69589144,
      "description":"sum of:",
      "details":[  
        {  
          "value":0.69589144,
          "description":"weight(_all:roubaix in 15) [PerFieldSimilarity], result of:",
          "details":[  
            {  
              "value":0.69589144,
              "description":"score(doc=15,freq=1.0), product of:",
              "details":[  
                {  
                  "value":0.31664962,
                  "description":"queryWeight, product of:",
                  "details":[  
                    {  
                      "value":3.5162723,
                      "description":"idf(docFreq=101, maxDocs=1263)"
                    },
                    {  
                      "value":0.09005264,
                      "description":"queryNorm"
                    }
                  ]
                },
                {  
                  "value":2.1976702,
                  "description":"fieldWeight in 15, product of:",
                  "details":[  
                    {  
                      "value":1.0,
                      "description":"tf(freq=1.0), with freq of:",
                      "details":[  
                        {  
                          "value":1.0,
                          "description":"termFreq=1.0"
                        }
                      ]
                    },
                    {  
                      "value":3.5162723,
                      "description":"idf(docFreq=101, maxDocs=1263)"
                    },
                    {  
                      "value":0.625,
                      "description":"fieldNorm(doc=15)"
                    }
                  ]
                }
              ]
            }
          ]
        }
      ]
    },
    {  
      "value":0.5,
      "description":"coord(1/2)"
    }
  ]
}

As you can see, we find here the concepts defined in the first part of this article: tf, idf and fieldNorm. Each of them has an associated value, used afterwards to compute the final score. Our search contains two terms, "rc" and "roubaix", but this document matches only "roubaix", so the explanation details only that term.
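Since _explanation is a tree of nodes with value, description and optional details, it can be easier to read once flattened. The helper below is a hypothetical sketch (flatten_explanation is not part of any Elasticsearch client) and it works on a trimmed-down copy of the output shown above.

```python
def flatten_explanation(node, depth=0):
    # Yield (depth, value, description) for a node and all of its nested details.
    yield depth, node["value"], node["description"]
    for child in node.get("details", []):
        yield from flatten_explanation(child, depth + 1)


# Trimmed-down copy of the _explanation shown above (inner details omitted).
explanation = {
    "value": 0.34794572,
    "description": "product of:",
    "details": [
        {
            "value": 0.69589144,
            "description": "sum of:",
            "details": [
                {
                    "value": 0.69589144,
                    "description": "weight(_all:roubaix in 15) [PerFieldSimilarity], result of:",
                }
            ],
        },
        {"value": 0.5, "description": "coord(1/2)"},
    ],
}

for depth, value, description in flatten_explanation(explanation):
    print("  " * depth + f"{value} {description}")
```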

The formula used to compute the final score value is called the practical scoring function. It looks like this:

score(q,d)  =  queryNorm(q) * coord(q,d) * ∑ (tf(t in d) * idf(t)² * t.getBoost() * norm(t,d)) (t in q)

Some new functions appeared:

queryNorm - query normalization factor, used to make the results of different queries easier to compare.
coord - coordination factor, used to favor documents containing more of the searched terms. In our example, the coordination factor is 0.5 because the matching document contains only one ("roubaix") of the two searched terms.
boost - boosting value, which gives more importance to some fields containing the terms. For example, we can associate a boost of 2 with the title field and 0.5 with the other fields containing the searched terms.
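To check how these factors combine, we can recompute the score of "Excelsior Roubaix" from the values of the _explanation output shown earlier. This is just arithmetic on the numbers already printed by Elasticsearch; the variable names are mine and the boost of 1.0 is the neutral value, assumed because the query doesn't define any boost. Note that idf enters the product twice, once through queryWeight and once through fieldWeight.

```python
# Values copied from the _explanation output shown earlier.
idf = 3.5162723         # idf(docFreq=101, maxDocs=1263)
query_norm = 0.09005264  # queryNorm
tf = 1.0                # tf(freq=1.0)
field_norm = 0.625      # fieldNorm(doc=15)
coord = 0.5             # coord(1/2): one of the two searched terms matched
boost = 1.0             # neutral boost, assumed since the query sets none

query_weight = idf * query_norm        # "queryWeight" in the output
field_weight = tf * idf * field_norm   # "fieldWeight" in the output

# Only "roubaix" matched, so the sum over query terms has a single element.
term_score = query_weight * boost * field_weight
score = coord * term_score

print(f"queryWeight = {query_weight:.8f}")
print(f"fieldWeight = {field_weight:.8f}")
print(f"term score  = {term_score:.8f}")
print(f"final score = {score:.8f}")
```

Up to floating-point rounding, the final value agrees with the 0.34794572 found at the root of the explanation.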

Boosting in Elasticsearch

The last parameter mentioned in the previous part was boost. It helps to modify, a posteriori, the scores computed by Elasticsearch. It can be applied at index time or at query time. According to the Elasticsearch index boost documentation, boosting at query time should be preferred over boosting at index time for several reasons:

Field-length norm precision is lost because the index boost value is combined with it and everything is stored in a single byte. As a consequence, Elasticsearch can no longer distinguish between fields with different numbers of words.
Changing an index-time boost requires reindexing all documents, while a query-time boost can be set to a different value for every query.
Field weight can be distorted when the boosted field has multiple values. In this situation, the boost is multiplied by itself for every value. As a result, the weight of the field grows and may no longer reflect the real match.

To sum up, if a boost is needed, it's better to apply it at query time. But how do we do that? If we want to boost a single field, we need to define a new attribute in the query DSL, boost. A query without this attribute takes a neutral boost equal to 1. We can also boost one or multiple indexes. To do that, we need to define the indices_boost attribute at the root level of the query DSL.
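As a rough sketch, a query body using both attributes could be built like this. The boosted_match_query helper is hypothetical, the boost values are arbitrary, and the shape of indices_boost (an object mapping an index name to its boost) is the one I assume for the Elasticsearch version used in this article; other versions may expect a different shape, so it should be checked against the documentation.

```python
import json


def boosted_match_query(term, field_boosts, indices_boost=None):
    # Build a bool/should query with one match clause per field.
    # field_boosts maps a field name to its boost; 1 is the neutral value.
    should = [
        {"match": {field: {"query": term, "boost": boost}}}
        for field, boost in field_boosts.items()
    ]
    body = {"query": {"bool": {"should": should}}}
    if indices_boost:
        # Assumed shape: a root-level object mapping index name to boost.
        body["indices_boost"] = indices_boost
    return body


body = boosted_match_query(
    "myteam",
    {"title": 2, "content": 1},
    indices_boost={"waitingforcode": 1.5},  # illustrative value
)
print(json.dumps(body, indent=2))
```

Because the boosts are plain arguments, every query can use different values without touching the index.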

For our example we'll reuse the newspaper analogy mentioned at the beginning of this article. The type will contain 2 fields, title and content. The title field will be boosted while content won't. Let's first create a new type in the index (http://localhost:9200/waitingforcode/_mapping/newspaper):

{"newspaper" : { "properties" : { 
  "title" : {"type" : "string", "store" : true },
  "content" : {"type" : "string", "store" : true }
}}}

And save some data (http://localhost:9200/waitingforcode/_bulk):

{"index": {"_index": "waitingforcode", "_type": "newspaper"}}
{"title": "MyTeam won Champions League", "content": "MyTeam won prestigious Champions League"}
{"index": {"_index": "waitingforcode", "_type": "newspaper"}}
{"title": "New Champions League winner", "content": "Prestigious Champions League was won by MyTeam"}
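The bulk body above is newline-delimited JSON: one action line followed by one document line, and the body is expected to end with a newline. A small hypothetical helper (bulk_body is my name, not a client API) can generate it from a list of documents:

```python
import json


def bulk_body(index, doc_type, docs):
    # The bulk API expects newline-delimited JSON: an action line, then the document.
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    # The body has to end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    {"title": "MyTeam won Champions League",
     "content": "MyTeam won prestigious Champions League"},
    {"title": "New Champions League winner",
     "content": "Prestigious Champions League was won by MyTeam"},
]
body = bulk_body("waitingforcode", "newspaper", docs)
print(body, end="")
```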

Now, let's compare the results given by a query without boost with those of a query containing a negative boost for the title field:

{"query": { "bool": { "should": [
  {"match": {"title": {"query": "myteam"}}},
  {"match": {"content": {"query": "myteam"}}}
]}}}

The query returned two hits, one for the document identified by AU74SjgfXaJ2i7zl3W3a, the other for the document with id AU74SjgfXaJ2i7zl3W3b. The score was 4.95146 for the first one ("MyTeam won Champions League" title) and 0.2565833 for the second ("New Champions League winner"). Now, we'll try to drastically decrease the importance of the title field by boosting it with a negative value:

{"query": { "bool": { "should": [
  {"match": {"title": {"query": "myteam", "boost": -1000000}}},
  {"match": {"content": {"query": "myteam", "boost": 2}}}
]}}}

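The influence on the returned scores is visible immediately. The "MyTeam won Champions League" document went from 4.95146 to -3.7346187, while "New Champions League winner" went from 0.2565833 to 6.460153e-7. That second value is written in scientific notation, so it is close to zero rather than close to 6. It is still higher than the negative score of the first document, so the order of the two results is reversed.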
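One plausible explanation for a score as small as 6.460153e-7, offered here as an assumption rather than something verified against this index, is query normalization. In Lucene's classic similarity, as far as I know, queryNorm is 1 / sqrt(sumOfSquaredWeights), where every query clause contributes (idf * boost)². A boost with a huge magnitude therefore shrinks queryNorm, and with it every score of the query. The idf values below are invented for illustration and query_norm is a hypothetical helper.

```python
import math


def query_norm(weighted_terms):
    # weighted_terms: iterable of (idf, boost) pairs, one per query clause.
    sum_of_squared_weights = sum((idf * boost) ** 2 for idf, boost in weighted_terms)
    return 1.0 / math.sqrt(sum_of_squared_weights)


IDF_TITLE = 1.0    # hypothetical idf of "myteam" in the title field
IDF_CONTENT = 1.0  # hypothetical idf of "myteam" in the content field

without_boost = query_norm([(IDF_TITLE, 1), (IDF_CONTENT, 1)])
with_boost = query_norm([(IDF_TITLE, -1000000), (IDF_CONTENT, 2)])

print(f"queryNorm without boost: {without_boost:.6f}")
print(f"queryNorm with boosts:   {with_boost:.3e}")
```

Under these assumptions, the normalization factor drops by about six orders of magnitude once the -1000000 boost is added, which is the same order of magnitude as the score observed for the second document.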

This article showed some basic concepts behind the Elasticsearch scoring feature. At the beginning we saw which ideas are used to define how well a given document matches the search criteria. After that, we learned that, thanks to the explain parameter, we can see how many points are attributed to each term matched in a document. At the end, we looked at query boosting, i.e. changing score values at query time by defining the boost attribute in the query DSL.
