I've been thinking about logging in the context of OpenStack and containerized service deployments. I'd like to lay out some of my thoughts on this topic and see if people think I am talking crazy or not.

There are effectively three different mechanisms that an application can use to emit log messages:

  • Via some logging-specific API, such as the legacy syslog API
  • By writing a byte stream to stdout/stderr
  • By writing a byte stream to a file

A substantial advantage to the first mechanism (using a logging API) is that the application is logging messages rather than bytes. This means that if you log a message containing embedded newlines (e.g., python or java tracebacks), you can collect that as a single message rather than having to impose some sort of structure on the byte stream after the fact in order to reconstruct those message.

Another advantage to the use of a logging API is that whatever is receiving logs from your application may be able to annotate the message in various interesting ways.


We're probably going to need to support all three of the above mechanisms. Some applications (such as haproxy) will only log to syslog. Others may only log to files (such as mariadb), and still others may only log to stdout.

Comparing different log mechanisms

Logging via syslog

In RHEL, the journald process is what listens to /dev/log. If you bind mount journald's /dev/log inside a container and then run the following Python code inside that container...

import logging
import logging.handlers

handler = logging.handlers.SysLogHandler(address='/dev/log')
log = logging.getLogger(__name__)

log.warning('This is a test')

...you will find that your simple log message has been annotated with a variety of useful metadata (the output below is the result of running journalctl -o verbose ...):

Wed 2017-06-14 12:35:57.577061 EDT [s=dc1dd9d61cf045e991f265aa17c5af03;i=6eb6e;b=0d9dc78871c34f43a4a6c27f43cf4167;m=a171206ec6;t=551ee258c6492;x=4e3c71faa52ba9d8]
    MESSAGE=This is a test
    _CMDLINE=python3 logtest.py

There are several items of interest there:

  • A high resolution timestamp
  • The kernel cgroup, which corresponds to the docker container id and thus uniquely identifies the container that originated the message
  • The executable path inside the container that generated the message
  • The machine id, which uniquely identifies the host

By logging via syslog you have removed the necessity of either (a) handling log rotation in your application or (b) handling log rotation in your container or (c) having to communicate log rotation configuration from the container to the host. Additionally, you can rely on journald to take care of rate limiting and log size management to prevent a broken application from performing a local DOS of the server.

Logging via stdout/stderr

Applications that write a byte stream to stdout/stderr will have their output handled by the Docker log driver. If we run Docker with the journald log driver (using the --log-driver=journald option to the Docker server), then Docker will add metadata lines read from stdout/stderr. For example, if we run...

docker run fedora echo This is a test.

...then our journal will contain:

Wed 2017-06-14 12:46:45.511515 EDT [s=dc1dd9d61cf045e991f265aa17c5af03;i=6ee72;b=0d9dc78871c34f43a4a6c27f43cf4167;m=a197bf222b;t=551ee4c2b17f7;x=e7c1a220c93ef3cf]
    _CMDLINE=/usr/bin/dockerd -G docker --dns --log-driver journald -s overlay2
    MESSAGE=This is a test.

Like the messages logged via syslog, this also containers information that identifies the source container. It does not identify the particular binary responsible for emitting the message.

Logging to a file

When logging to a file, the system is unable to add any metadata for us automatically. We can derive similar information by logging to a container-specific location (/var/log/CONTAINERNAME/..., for example), or by configuring our application to include specific information in the log messages, but ultimately this is the least information-rich mechanism available to us. Furthermore, it necessitates that we configure some sort of container-aware log rotation strategy to avoid eating up all the available disk space over time.

Log collection

Our goal is not simply to make log messages available locally. In general, we also want to aggregate log messages from several machines into a single location where we can perform various sorts of queries, analysis, and visualization. There are a number of solutions in place for getting logs off a local server to a central collector, including both fluentd and rsyslog.

In the context of the above discussion, it turns out that rsyslog has some very desirable features. In particular, the imjournal input module has support for reading structured messages from journald and exporting those to a remote collector (such as ElasticSearch) with their structure intact. Fluentd does not ship with journald support as a core plugin.

Rsyslog version 8.x and later have a rich language for filtering, annotating, and otherwise modifying log messages that would allow us to do things such as add host-specific tags to messages, normalize log messages from applications with poorly designed log messages, and perform other transformations before sending them on to a remote collector.

For example, we would ensure that messages from containerized services logged via syslog or via stdout/stderr have a CONTAINER_ID_FULL field with something like the following:

if re_match($!_SYSTEMD_CGROUP, "^/docker/") then {
        set $!CONTAINER_ID_FULL = re_extract($!_SYSTEMD_CGROUP, "^/docker/(.*)", 0, 1, "unknown");

This matches the _SYSTEMD_CGROUP field of the message, extracts the container id, and uses that to set the CONTAINER_ID_FULL property on the message.


  1. Provide a consistent logging environment to containerized services. Provide every container both with /dev/log and a container-specific host directory mounted on /var/log.
  2. For applications that support logging to syslog (such as all consumers of oslo.log), configure them to log exclusively via syslog.
  3. For applications that are unable to log via syslog but are able to log to stdout/stderr, ensure that Docker is using the journald log driver.
  4. For applications that can only log to files, configure rsyslog on the host to read those files using the imfile input plugin.
  5. Use rsyslog on the host to forward structured messages to a remote collector.

FAA Cannot Require Drone Registration

Thu 25 May 2017 by Lars Kellogg-Stedman Tags uav faa

This is now old news if you're already following the drone industry, but if you're not, I'd like to highlight a recent decision made by the US Court of Appeals regarding the FAA's drone registration requirements.

To place this in context, back in 2015 the FAA established a new set …

read more

Making sure your Gerrit changes aren't broken

Sun 22 January 2017 by Lars Kellogg-Stedman Tags openstack gerrit git

It's a bit of an embarrassment when you submit a review to Gerrit only to have it fail CI checks immediately because of something as simple as a syntax error or pep8 failure that you should have caught yourself before submitting...but you forgot to run your validations before submitting …

read more

Exploring YAQL Expressions

Thu 11 August 2016 by Lars Kellogg-Stedman Tags openstack heat hot yaql

The Newton release of Heat adds support for a yaql intrinsic function, which allows you to evaluate yaql expressions in your Heat templates. Unfortunately, the existing yaql documentation is somewhat limited, and does not offer examples of many of yaql's more advanced features.

I am working on a Fluentd composable …

read more

Connecting another vm to your tripleo-quickstart deployment

Let's say that you have set up an environment using tripleo-quickstart and you would like to add another virtual machine to the mix that has both "external" connectivity ("external" in quotes because I am using it in the same way as the quickstart does w/r/t the undercloud) and …

read more