Creating a time-based index in Elasticsearch using NEST (the .NET client for Elasticsearch)

Introduction

One of the most common use cases in Elasticsearch is creating time-based indexes for logs. In this blog, we will see how to create a time-based index at run time using NEST (the .NET client for Elasticsearch).

When it comes to logging, we usually create a new log file every day, so that the logs stay isolated and only the relevant ones need to be pulled up for analysis. If we store the logs in a relational database, we commonly have one table. Over time the entries in this table grow, and to keep the number of records in check we usually delete the old records at a regular interval.

In Elasticsearch, the same thing can be achieved by creating a new index per day; old logs can then be removed by deleting a whole index instead of individual records.

Prerequisites

  • Basic knowledge of Elasticsearch database and NEST

A time-based data storage strategy is mostly used in two situations:

  • For logging
  • For creating indices for social network activity (for example, if a new index is created automatically every 3 months, the index names will be data-01, data-04, data-07).
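
As a rough illustration of the quarterly naming above, the index name can be derived from the month in which each 3-month window starts. This is only a minimal sketch: the QuarterlyIndexName class and its For method are hypothetical names, and naming the index after the first month of the quarter is an assumption read from the data-01, data-04, data-07 example.

```csharp
using System;

// Hypothetical helper, not part of NEST or of the original post.
public static class QuarterlyIndexName
{
    // Builds "<prefix>-<first month of the quarter>", e.g. "data-04" for any date in April-June.
    public static string For(string prefix, DateTime date)
    {
        // Months 1-3 map to 1, 4-6 to 4, 7-9 to 7, 10-12 to 10.
        int quarterStartMonth = ((date.Month - 1) / 3) * 3 + 1;
        return $"{prefix}-{quarterStartMonth:00}";
    }
}
```

With this rule, a record dated in May lands in data-04 and one dated in December lands in data-10.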

Creating the new index at the right moment is the challenging part in most cases. Here we look at one strategy for dealing with it. To demonstrate, we will create a Company entry in Elasticsearch through a Web API and store the application log in an index named all_log. (Gist URL: https://gist.github.com/DeepeshSaheb/b7a1b9cc3fa24d4c13807593f962178e)

So our entities are:

  • Company: Index name in Elasticsearch DB is company. This index is not time-based.
  • Log: Index name in Elasticsearch DB is all_log. This index is time-based, so the date is appended to the index name, for example all_log_23_02_2018.

Here we have defined a custom attribute, ElasticIndexDetailsAttribute, which tells us whether the index for an entity is time-based or not.

[Image: definition of the ElasticIndexDetailsAttribute class]

So our Company entity can be defined as:

[Image: Company entity class]

And the Log entity can be defined as:

[Image: Log entity class]

In our repository layer, we will define a method GetIndex which returns the index name based on the ElasticIndexDetailsAttribute of the entity. If the entity is time-based, the method appends the current date to the end of the index name:

[Image: GetIndex method]
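
The attribute, the two entities and GetIndex described above can be sketched together as follows. The original code lives in the linked gist; this is an independent, minimal sketch under stated assumptions, not the author's implementation. The property names IndexName and IsTimeBased, the entity properties, and the IndexNameResolver class are hypothetical. The dd_MM_yyyy suffix is taken from the all_log_23_02_2018 example, and the date is passed in as a parameter here only so that the method is easy to test.

```csharp
using System;
using System.Globalization;
using System.Reflection;

// Assumed shape of the custom attribute; property names are illustrative.
[AttributeUsage(AttributeTargets.Class)]
public sealed class ElasticIndexDetailsAttribute : Attribute
{
    public ElasticIndexDetailsAttribute(string indexName, bool isTimeBased = false)
    {
        IndexName = indexName;
        IsTimeBased = isTimeBased;
    }

    public string IndexName { get; }
    public bool IsTimeBased { get; }
}

// Not time-based: always stored in the "company" index.
[ElasticIndexDetails("company")]
public class Company
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Time-based: stored in "all_log_<dd_MM_yyyy>".
[ElasticIndexDetails("all_log", true)]
public class Log
{
    public string Message { get; set; }
    public DateTime CreatedOn { get; set; }
}

public static class IndexNameResolver
{
    // Reads the attribute from T and appends the date for time-based entities.
    public static string GetIndex<T>(DateTime date)
    {
        var details = typeof(T).GetCustomAttribute<ElasticIndexDetailsAttribute>();
        if (details == null)
            throw new InvalidOperationException(
                typeof(T).Name + " is not decorated with ElasticIndexDetailsAttribute.");

        if (!details.IsTimeBased)
            return details.IndexName;

        string suffix = date.ToString("dd_MM_yyyy", CultureInfo.InvariantCulture);
        return details.IndexName + "_" + suffix;
    }
}
```

A caller would typically pass DateTime.Now (or DateTime.UtcNow), for example IndexNameResolver.GetIndex<Log>(DateTime.Now), while IndexNameResolver.GetIndex<Company>(DateTime.Now) ignores the date and returns the plain index name.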

While inserting the record, call the GetIndex method to get the index name:

[Image: call to GetIndex while inserting a record]

The Insert method inserts the record into the given index (the name returned by the GetIndex method) if the index exists. If the index is not present, it first creates the index and then adds the record to it.
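
The check-then-create behaviour of Insert might look roughly like the sketch below. It assumes NEST 6.x, where IndexExists and CreateIndex are methods on the client itself (in NEST 7.x the equivalents are client.Indices.Exists and client.Indices.Create). The ElasticRepository class name and the bool return value are illustrative assumptions, not taken from the original code.

```csharp
using System;
using Nest;

// Hypothetical repository; assumes the NEST 6.x client API.
public class ElasticRepository
{
    private readonly IElasticClient _client;

    public ElasticRepository(IElasticClient client)
    {
        _client = client;
    }

    // Inserts the document into indexName, creating the index first if it is missing.
    public bool Insert<T>(T document, string indexName) where T : class
    {
        if (!_client.IndexExists(indexName).Exists)
        {
            // Created with default settings; mappings could be supplied via the optional selector.
            var created = _client.CreateIndex(indexName);
            if (!created.IsValid)
                return false;
        }

        var response = _client.Index(document, i => i.Index(indexName));
        return response.IsValid;
    }
}
```

The repository would be constructed with a client such as new ElasticClient(new ConnectionSettings(new Uri("http://localhost:9200"))), and Insert would be called with the name returned by GetIndex.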

In CompanyController, we will call the Insert method of the repository:

[Image: CompanyController create action]

When we call the create method from Swagger UI, it inserts the record into the company index. Once that insertion succeeds, it inserts a log record into the all_log index:

[Image: create call from Swagger UI]

On the very first call, it creates both indexes (company and all_log_23_02_2018).

To check whether the indexes were created, we can open the Elasticsearch URL http://localhost:9200/_cat/indices?v in a browser.

[Image: results of http://localhost:9200/_cat/indices?v]

Conclusion

We have created the index at run time. An alternative approach is to define a template for a rollover index with a time-based condition and trigger the rollover from a CRON task at a fixed interval. The rollover approach works well only when you want to partition data by time frame. For example, if a system is tenant-based and you want to store data per tenant, it would not be suitable.


 
