Logstash introduction and tips

Camilo Matajira Avatar

Why Logstash?

Logs are text. Logstash converts that text into useful data and then helps you ingest it into Elasticsearch or stream it somewhere else (e.g. Kafka).

Logs from Filebeat are sent in JSON format with additional fields and tags added by Filebeat. Nevertheless, the “message” (i.e. the log content) sent by Filebeat is still not parsed into useful fields. This is where Logstash comes in.

The configuration file

The configuration file controls Logstash. In the container that I am using right now (and which I recommend: https://hub.docker.com/r/sebp/elk/), the configuration files live in the directory /etc/logstash/conf.d.

Where to put the Volumes

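If you are still in the development phase and need to change Logstash’s configuration files often, add a bind mount to your docker run command like this, next to whatever other flags you already use (Docker expects an absolute path on the host side):

$ docker run -v /where/you/want:/etc/logstash/conf.d sebp/elk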

This way you can comfortably edit the Logstash configuration file from outside the container. Beware: whatever is in the bind-mounted directory replaces the content of that directory inside the container. So for Logstash to work (i.e. not die), you need to put a configuration file in the bind-mounted directory on your local machine.

When I am in PROD, I don’t use bind mounts for the configuration file, nor do I persist the configuration files in a named volume; I prefer to use COPY in the Dockerfile to copy the configuration file when I build the container image.

How to apply a change in the config file

To apply the changes you need to restart Logstash. You can do so with:

$ /etc/init.d/logstash restart

You will probably need to run this command twice, because Logstash puts up a fight when restarting. I always do it twice.

Contents of the configuration file

In Logstash the configuration file has three major components: the definition of the inputs, the filter, and the output. In my opinion, the fastest way to learn how to configure Logstash is to mimic a working Logstash.

Hence, below is a commented excerpt of one of my config files.

input {
  beats {
    port => 5044 # Logstash listens on this port (you can add more inputs if you need them).
  }
}

filter {
  # A lot of code for other types of logs
  if [attrs][my_tagl][from_filebeat] == "my-application" { # This is where Filebeat tags become very useful!
    grok {
      match => { # This is where we break the log line into fields that we can use.
        "message" => "(?<log_timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME})\|%{LOGLEVEL:log_level}\|%{GREEDYDATA:field1}\|%{GREEDYDATA:field2}\|%{NUMBER:field3}\|%{GREEDYDATA:field4}\|%{GREEDYDATA:field5}\|%{GREEDYDATA:field6}\|%{NUMBER:field7}"
      }
    }
    date { # Here we tell Logstash how to interpret the date.
      match => ["[log_timestamp]", "YYYY-MM-dd HH:mm:ss"]
      target => "log_timestamp"
    }
    fingerprint { # This handles duplicates: it creates an ID from what I specify as "source", in this case the whole message.
      source => "message"
      target => "[@metadata][fingerprint]"
      method => "MURMUR3"
    }
  }
  # A lot of code for other types of logs
}

output {
  if [attrs][my_tagl][from_filebeat] == "my-application" { # This specifies where to send this type of log.
    elasticsearch { # I send it to Elasticsearch.
      hosts => "localhost:9200"
      index => "my-application-%{+YYYY.MM.dd}" # The daily index the log is written to.
      document_id => "%{[@metadata][fingerprint]}" # The fingerprint becomes the ID of the document.
    }
  }
}
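
One detail of the output block that is easy to miss is the index name. As far as I know, the %{+YYYY.MM.dd} part is filled from the event’s @timestamp field in UTC, not from log_timestamp (the date block above writes its result to log_timestamp because of its target). Below is a rough Python analogy of that naming rule. It is only a sketch: the function name daily_index and the sample timestamp are made up for illustration, and this is not how Logstash is implemented.

```python
from datetime import datetime, timezone


def daily_index(prefix, event_timestamp):
    """Build a per-day index name from an event timestamp, expressed in UTC."""
    # Python spelling of the YYYY.MM.dd layout used in the index option.
    day = event_timestamp.astimezone(timezone.utc).strftime("%Y.%m.%d")
    return prefix + day


# Hypothetical @timestamp of one event.
sample_timestamp = datetime(2020, 1, 31, 23, 59, 59, tzinfo=timezone.utc)
index_name = daily_index("my-application-", sample_timestamp)
print(index_name)
```

One index per day makes it easy to delete old logs later by dropping whole indices.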

How to do the GROK

I have a specific blog post about it: here

What happens if I make a mistake?

Volume:

If you bind-mount the directory for the configuration file and you do it incorrectly, Logstash is not going to start.

Typo in configuration file:

Logstash won’t start.

Mistake in GROK

Your logs are still going to be sent to the output you specified, but with the tag “_grokparsefailure” and without the parsed fields.
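
To make this behaviour concrete, here is a rough Python analogy of what grok does with a pipe-delimited line. It is only a sketch under my own assumptions: the shortened three-field layout, the regular expression, the sample lines and the function name parse_line are invented for illustration and are not Logstash internals.

```python
import re

# Hypothetical, shortened layout: timestamp|level|free text.
# A grok pattern such as %{LOGLEVEL:log_level} plays roughly the role of a named group.
LINE_PATTERN = re.compile(
    r"(?P<log_timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
    r"\|(?P<log_level>[A-Za-z]+)"
    r"\|(?P<field1>.*)"
)


def parse_line(message):
    """Return an event dict; tag it when the pattern does not match."""
    event = {"message": message}
    # search() is used because grok patterns are unanchored unless you add ^ and $.
    match = LINE_PATTERN.search(message)
    if match:
        event.update(match.groupdict())
    else:
        # Same tag name Logstash uses; the event is kept, not dropped.
        event["tags"] = ["_grokparsefailure"]
    return event


good = parse_line("2020-01-31 10:15:00|INFO|user logged in")
bad = parse_line("this line does not follow the format")
print(good["log_level"], "/", good["field1"])
print(bad["tags"])
```

The point to notice is that the badly formatted line is not lost: it arrives at the output with only its raw message and the failure tag, so you can search for that tag to find the lines your pattern misses.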

You don’t add the “date” block inside the filter

Your date will be sent as text.
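
The difference between a date stored as text and a parsed date can be sketched in Python. This is an analogy, not Logstash code: the sample value is hypothetical, the format string is the Python spelling of the “YYYY-MM-dd HH:mm:ss” layout, and I assume for the sketch that the application logs in UTC.

```python
from datetime import datetime, timezone

raw = "2020-01-31 10:15:00"  # hypothetical value captured by grok; still plain text

# Python spelling of the "YYYY-MM-dd HH:mm:ss" layout used in the date block.
parsed = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
# Assumption for this sketch: the application writes its timestamps in UTC.
parsed = parsed.replace(tzinfo=timezone.utc)

print(type(raw).__name__, "->", type(parsed).__name__)
print(parsed.isoformat())

# Only the parsed value supports real time arithmetic and comparisons.
one_hour_earlier = datetime(2020, 1, 31, 9, 15, 0, tzinfo=timezone.utc)
print((parsed - one_hour_earlier).total_seconds())
```

With a real timestamp instead of a string, time-range filters and sorting by time behave as you would expect.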

You don’t add the “fingerprint” block in the filter and the document_id in the output part

You are going to have duplicates.
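
Why the fingerprint prevents duplicates can be shown with a small Python simulation. It is a sketch under stated assumptions: a plain dict stands in for an Elasticsearch index, SHA-1 from the standard library is swapped in for MURMUR3, and the names send and fingerprint as well as the “auto-” IDs are made up for illustration.

```python
import hashlib

index = {}  # stand-in for an Elasticsearch index: document ID -> document


def fingerprint(message):
    # SHA-1 is swapped in here for MURMUR3, which is not in the standard library.
    return hashlib.sha1(message.encode("utf-8")).hexdigest()


def send(message, use_fingerprint):
    if use_fingerprint:
        doc_id = fingerprint(message)  # same message -> same ID
    else:
        doc_id = f"auto-{len(index)}"  # hypothetical auto-generated ID
    index[doc_id] = {"message": message}  # writing an existing ID overwrites it


line = "2020-01-31 10:15:00|INFO|user logged in"

send(line, use_fingerprint=True)
send(line, use_fingerprint=True)  # re-sent line overwrites itself
count_with_fingerprint = len(index)
print(count_with_fingerprint)

send(line, use_fingerprint=False)
send(line, use_fingerprint=False)  # re-sent line is stored again each time
print(len(index))
```

The same log line sent twice with a fingerprint-based ID ends up as one document, while without it every resend adds a new document.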

You don’t specify your output correctly

Your logs will not be sent.

Too many logs and Elasticsearch has low memory

The Elasticsearch node status will turn red and the node will probably shut down.

How to debug Logstash

If something breaks after you modify the grok or the outputs, the mistake is probably there. If you don’t see the error, check the logs in:

$ tail /var/log/logstash/logstash-plain.log

 
