This is the eleventh part of my syslog-ng tutorial. Last time, we learned about message parsing using syslog-ng. Today, we learn about enriching log messages.


Enriching log messages

You can also enrich log messages using syslog-ng. Enriching in this case means that you can create additional name-value pairs based on message content. There are several ways to enrich log messages using syslog-ng.

The PatternDB parser can not only parse out interesting information from log messages, but can also create additional name-value pairs based on message content. You can add fields in the XML database that describe the content of the message. For example, you can mark any login-related event with “action=login” and, if the message is about an unsuccessful login, with “status=failure”.
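As a rough illustration of the idea, here is a minimal Python sketch that tags login messages. It is not how PatternDB works internally: PatternDB keeps its rules in an XML database with its own pattern syntax, while this sketch swaps in regular expressions to stay short. The rules and the sample message are made up for this example.

```python
import re

# Hypothetical rules: (pattern, name-value pairs to add when it matches).
# PatternDB stores its rules in XML and does not use regular expressions;
# regexes are swapped in here only to keep the sketch short.
RULES = [
    (re.compile(r"^Accepted password for (?P<user>\S+)"),
     {"action": "login", "status": "success"}),
    (re.compile(r"^Failed password for (?P<user>\S+)"),
     {"action": "login", "status": "failure"}),
]


def enrich(message):
    """Return name-value pairs parsed from the message plus the added ones."""
    for pattern, extra in RULES:
        match = pattern.search(message)
        if match:
            pairs = dict(match.groupdict())  # parsed from the message
            pairs.update(extra)              # added by the matching rule
            return pairs
    return {}  # unknown message: nothing to add


print(enrich("Failed password for root from 192.0.2.10 port 4242 ssh2"))
```

The point is that “action” and “status” never appear in the message text. They come from the rule that matched.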

The GeoIP parser can find the geolocation of an IP address. The software itself is freely available, but the database it uses requires registration. The database is no longer distributed as part of Linux distributions. The original implementation returned only the country and the longitude / latitude information. The current implementation returns much more information, and in multiple languages.

Geographical information can help to find anomalies, like a user logging in from two distant locations at once. A probably less useful but much more popular use of GeoIP is displaying the location of IP addresses on a map. It is mostly eye candy for C-level executives, but a spectacular map can help you get extra funding for security 🙂
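To make the anomaly idea more concrete, here is a minimal Python sketch that compares two logins of the same user. It assumes you already have a timestamp and the latitude / longitude for each login. The function names and the 900 km/h speed limit are arbitrary choices for this example and have nothing to do with syslog-ng itself.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def is_suspicious(login_a, login_b, max_speed_kmh=900.0):
    """Flag two logins of one user that would need faster-than-plane travel.

    Each login is a (unix_time_seconds, latitude, longitude) tuple.
    The 900 km/h threshold is an arbitrary assumption for this sketch.
    """
    (t1, lat1, lon1), (t2, lat2, lon2) = login_a, login_b
    hours = abs(t2 - t1) / 3600.0
    km = distance_km(lat1, lon1, lat2, lon2)
    if hours == 0:
        return km > 0  # same moment, different places
    return km / hours > max_speed_kmh


# Ten minutes apart, roughly Budapest and Singapore (approximate coordinates).
print(is_suspicious((0, 47.50, 19.04), (600, 1.35, 103.82)))
```

GeoIP locations are often approximate, so in practice you would probably want a generous threshold and treat a hit as a hint rather than as proof.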

The add-contextual-data() parser can add metadata to log messages from CSV files. For example, you can add a host role or a contact person to a log message. This way you can see the extra information while browsing your log messages, without needing an additional lookup. The additional information can also enable more accurate alerts and dashboards.
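The lookup itself is easy to picture. The Python sketch below is only an illustration of the concept, not syslog-ng code. It assumes a three-column CSV layout (selector, name, value), a host name as the selector and a “context.” prefix. The host names and values are made up.

```python
import csv
import io

# Hypothetical CSV content; each row is: selector,name,value
CSV_TEXT = """\
web01,role,webserver
web01,contact,alice@example.com
db01,role,database
"""


def load_context(text):
    """Build {selector: {name: value}} from three-column CSV rows."""
    context = {}
    for selector, name, value in csv.reader(io.StringIO(text)):
        context.setdefault(selector, {})[name] = value
    return context


def add_context(message, context, selector_field="HOST", prefix="context."):
    """Return a copy of the message with the matching context pairs added."""
    enriched = dict(message)
    for name, value in context.get(message.get(selector_field), {}).items():
        enriched[prefix + name] = value
    return enriched


context = load_context(CSV_TEXT)
print(add_context({"HOST": "web01", "MESSAGE": "disk almost full"}, context))
```

A message from web01 now carries its role and contact person, while a message from a host missing from the CSV file passes through unchanged.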


Using loggen to read a file

We have already seen earlier how to use loggen to send synthetic log messages to a network source. Here we extend the command line we used previously by adding file reading to the mix.

The options used here are:

  • -i: use an Internet (IP) socket

  • -S: use a stream socket: TCP (and unix-stream)

  • -d: do not parse the lines read from the file

  • -R /path/to/file: read log messages from a file

  • Host and port of the destination

Why do we use the do not parse option here? For the sample configuration we want to send logs without the original date, so just the message part. The original date is not a real problem here, but will be a problem when we send the logs to Elasticsearch.
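Putting the options above together, the command line can be assembled like this. It is a small Python sketch that only builds and prints the command, it does not run loggen. /path/to/file is the placeholder from the list above, port 514 matches the TCP source used later in this part, and localhost is an assumption.

```python
import shlex


def loggen_command(log_file, host, port):
    """Assemble the loggen command line from the options listed above."""
    return [
        "loggen",
        "-i",            # Internet (IP) socket
        "-S",            # stream socket, that is, TCP
        "-d",            # do not parse the lines read from the file
        "-R", log_file,  # read log messages from this file
        host, str(port),
    ]


# /path/to/file is a placeholder; localhost is assumed for this sketch.
command = loggen_command("/path/to/file", "localhost", 514)
print(shlex.join(command))  # loggen -i -S -d -R /path/to/file localhost 514
```

If you want to run it from Python, you can pass the same list to subprocess.run(). Typing the printed line into a shell works just as well.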


Iptables sample logs

Working with iptables logs is a nice way to get started with message parsing. They follow the key=value format and can easily be parsed by syslog-ng. Then we can use the results of the parsing to find the geographical location of the source IP address.

To avoid duplicating the date part in the logs, we remove it before sending the logs to syslog-ng using loggen. Loggen generates proper message headers based on the current date.
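How you remove the date is up to you. One possible way is the Python sketch below, which assumes that every line starts with a classic “Mmm dd hh:mm:ss” syslog timestamp. The sample line is made up and uses documentation-only IP addresses, it is not taken from a real firewall.

```python
import re

# Classic BSD syslog timestamp at the start of a line, e.g. "Feb  7 14:31:01 "
TIMESTAMP = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} ")


def strip_date(line):
    """Remove a leading 'Mmm dd hh:mm:ss ' timestamp, if there is one."""
    return TIMESTAMP.sub("", line, count=1)


# Made-up sample line, using documentation-only IP addresses.
sample = ("Feb 27 14:31:01 myhost kernel: IN=eth0 OUT=eth1 "
          "SRC=192.0.2.10 DST=198.51.100.20 PROTO=TCP SPT=4319 DPT=22 SYN")
print(strip_date(sample))
```

Lines without a leading timestamp are returned unchanged, so running the function twice on the same file does no harm.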


This example configuration collects iptables logs over the network, parses them, and adds geographical information to source IP addresses using the GeoIP parser. Finally, it writes the resulting name-value pairs into a JSON-formatted file.

Support for GeoIP is usually a separate subpackage on Linux systems. On FreeBSD it is not part of the default package configuration, which means that you cannot use the package but have to compile syslog-ng from ports yourself.

Note that downloading the database for the GeoIP parser is outside the scope of this tutorial.

As usual, the first few lines of the configuration deal with local log messages. The interesting part comes afterwards. Let’s follow the log statement at the end, as this is what connects all the building blocks together.

The first line opens a TCP source on port 514. The next line is where things start to get interesting. It calls a key=value parser on incoming log messages. Here prefix("kv.") means that the names of all resulting name-value pairs will start with “kv.”.
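To show what the prefix does to the names, here is a minimal Python sketch of key=value parsing. It is a simplification for illustration only: the real parser in syslog-ng handles more cases (quoted values, for example), and the sample message is made up.

```python
import re

# A key is a run of word characters, a value is a run of non-space characters.
PAIR = re.compile(r"(\w+)=(\S*)")


def kv_parse(message, prefix="kv."):
    """Collect key=value pairs from the message, prefixing every name."""
    return {prefix + key: value for key, value in PAIR.findall(message)}


# Made-up iptables-style message, using documentation-only IP addresses.
message = "IN=eth0 OUT=eth1 SRC=192.0.2.10 DST=198.51.100.20 PROTO=TCP DPT=22 SYN"
pairs = kv_parse(message)
print(pairs["kv.SRC"])  # 192.0.2.10
```

In this sketch, words without an equals sign, like SYN, are simply skipped. The source address ends up under the name kv.SRC, which is the name the next step of the configuration refers to.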

Next, the GeoIP parser is called. It looks for the IP address in the kv.SRC name-value pair and stores the resulting information in name-value pairs under the geoip2 prefix.

Finally, log messages are written to a file using JSON formatting. You can find more information about template functions in the documentation. Here I want to point you to the --rekey option, which removes the leading dot from the names of name-value pairs. syslog-ng normally replaces the leading dot with an underscore, but a leading underscore has a special meaning in Elasticsearch. We also remove the DATE macro and include ISODATE under the name expected by Elasticsearch. Finally, we add two line breaks for better readability. Of course, we will remove those when we use the same template to send logs to Elasticsearch.
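The renaming steps described above can be sketched in a few lines of Python. This only mimics the effect on a plain dictionary and is not the syslog-ng template itself. The field name “@timestamp” is my assumption for “the name expected by Elasticsearch”, the sample name-value pairs are made up, and the sketch keeps dotted names flat, while the JSON output of syslog-ng may turn them into nested objects.

```python
import json


def to_json_line(pairs):
    """Mimic the renaming steps described above on a dict of name-value pairs."""
    doc = {}
    for name, value in pairs.items():
        if name in ("DATE", "ISODATE"):
            continue  # DATE is dropped, ISODATE is re-added below
        # Remove the leading dot from the name, if there is one.
        doc[name[1:] if name.startswith(".") else name] = value
    if "ISODATE" in pairs:
        doc["@timestamp"] = pairs["ISODATE"]  # assumed Elasticsearch name
    return json.dumps(doc) + "\n\n"  # two line breaks for readability


# Made-up name-value pairs for illustration.
sample_pairs = {
    ".app.name": "iptables",
    "kv.SRC": "192.0.2.10",
    "DATE": "Feb 27 14:31:01",
    "ISODATE": "2024-02-27T14:31:01+00:00",
}
print(to_json_line(sample_pairs), end="")
```

When the output goes to Elasticsearch instead of a file, the trailing line breaks would be the only part to drop.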



