Logstash’s GROK is the heart of the ELK stack, and you need to master it to be proficient with ELK. To improve your GROK skills, I recommend the following:
Know the tools at your disposal
Tools make your life easier. I keep the following tools open while I am grokking:
- Kibana’s GROK debugger
- Grok Constructor: http://grokconstructor.appspot.com/do/match#result
- Logstash’s available GROK patterns: https://github.com/elastic/logstash/blob/v1.4.2/patterns/grok-patterns
Also keep in mind that if you want to grok the logs of a popular tool such as Apache, MySQL, or Nginx, someone else may already have written the patterns for its log format.
Create your own tools
It is very helpful to create a spreadsheet that lets you lay out the different fields and give each one an appropriate name. If you are working with KPIs, some of them might be integers and others floats, so you need a tool that lets you specify what type of number each field represents. A spreadsheet program is a good fit for this during development.
I made a very simple spreadsheet that has helped me a lot. You can find it here: https://gitlab.com/ca.matajira966/grok
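If you prefer a script to a spreadsheet, the same idea can be sketched in a few lines of Python. This is only an illustration, not what my spreadsheet does internally: the field names, the `FIELDS` table, and the `build_grok` helper are all made up for the example. The only GROK feature it relies on is the optional `:int` or `:float` suffix that converts a captured value to a number.

```python
# Hypothetical field table, as it might be exported from a spreadsheet:
# (field name, grok pattern, optional type conversion)
FIELDS = [
    ("timestamp", "TIMESTAMP_ISO8601", None),
    ("host", "DATA", None),
    ("requests", "NUMBER", "int"),
    ("cpu_load", "NUMBER", "float"),
]


def build_grok(fields, separator=r"\|"):
    """Join one %{PATTERN:name[:type]} capture per field with the separator."""
    parts = []
    for name, pattern, cast in fields:
        capture = f"%{{{pattern}:{name}"
        if cast:  # grok accepts :int and :float as type conversions
            capture += f":{cast}"
        parts.append(capture + "}")
    return separator.join(parts)


print(build_grok(FIELDS))
```

Keeping the names and types in one table means that renaming a field or changing an integer to a float is a one-cell change instead of an edit inside a long pattern.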
Match one pattern at a time
When you create your spreadsheet, you will be tempted to copy and paste the resulting GROK in one go to see if it works. The problem with this approach is that some parts of your pattern may work while others do not, and tools like Grok Constructor or Kibana’s GROK debugger are not always very helpful in telling you which part failed.
I recommend the following iterative procedure (assuming the fields are separated by pipes):
Try:
%{DATA:first_field}\|%{GREEDYDATA:rest}
If it works, try:
%{DATA:first_field}\|%{DATA:second_field}\|%{GREEDYDATA:rest}
If it works, try:
%{DATA:first_field}\|%{DATA:second_field}\|%{DATA:third_field}\|%{GREEDYDATA:rest}
...
With this method you solve the grokking for one field, then move on to the next. The good thing is that you are actually matching the whole log line every time, because %{GREEDYDATA:rest} captures whatever remains of the line at every single step. So there are no surprises at the end: you are playing it safe.
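To watch the procedure run outside Logstash, here is a rough Python sketch. It is a stand-in, not a GROK engine: it only understands DATA and GREEDYDATA, which the stock grok-patterns file defines as `.*?` and `.*`, and it uses Python's `re` module instead of the regex library Logstash uses. The log line and the helper names are invented for the example.

```python
import re

# Regex equivalents of the two stock grok patterns used in this post.
GROK_TO_REGEX = {"DATA": ".*?", "GREEDYDATA": ".*"}


def grok_to_regex(grok):
    """Expand %{PATTERN:name} into a named regex group (DATA/GREEDYDATA only)."""
    return re.sub(
        r"%\{(\w+):(\w+)\}",
        lambda m: f"(?P<{m.group(2)}>{GROK_TO_REGEX[m.group(1)]})",
        grok,
    )


def incremental_patterns(field_names):
    """Yield one grok pattern per step, each ending in %{GREEDYDATA:rest}."""
    for step in range(1, len(field_names) + 1):
        captures = [f"%{{DATA:{name}}}" for name in field_names[:step]]
        yield r"\|".join(captures + ["%{GREEDYDATA:rest}"])


# Hypothetical pipe-separated log line, made up for illustration.
line = "2024-01-01 12:00:00|web01|GET /index.html|200"

for grok in incremental_patterns(["first_field", "second_field", "third_field"]):
    match = re.match(grok_to_regex(grok), line)
    print(grok)
    print("  ->", match.groupdict() if match else "NO MATCH")
```

At each step the `rest` group holds everything you have not named yet, so a step that prints NO MATCH points straight at the field you just added.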
I like this method because it is safe: you will get to a working pattern. Several times I have tried to guess the whole GROK in one shot and ended up with a lot of debugging problems. Avoid this.