log4j2 + rsyslog for the client side of centralized logging
So right now, the absolute coolest setup is the ELK (Elasticsearch, Logstash, Kibana) stack, and as far as I can tell, that community is growing and doing very cool things. However, you may not feel like installing Logstash on all your servers, especially if they are older or short on memory. Luckily, rsyslog is pretty common, and, so long as you are running at least version 5, you can use a relatively sane setup shown by the wonderful men and women of Loggly. They know a ton about logging!
This will get logs off the box. I prefer to buffer logs on the server so that I can minimize the pain of losing logs if my receiver goes down. For this reason, I also really think using TCP all the way is worth the overhead. I also like to use a localhost relay instead of firing logs directly from applications.
I also write a lot of Java. As a Java guy in my day job, I am wildly impressed at how well rsyslog handles log load.
The one logging format that rsyslog supports out of the box and that is still structured turned out to be the one specified by RFC5424. In version 5 of rsyslog, using this format looks like this (version 6+ is described here). There is a whole spec available here as well.
$template Rfc5424Format,"<%PRI%>1 %TIMESTAMP:::date-rfc3339% %HOSTNAME% %APP-NAME% %PROCID% %MSGID% %STRUCTURED-DATA% %msg%"
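To make the template concrete, here is a minimal Java sketch of the line it describes. It is an illustration only, not how rsyslog builds the message: the class name `Rfc5424Line`, the helper names, and the sample host, process id, and message are all made up for the example. The parts taken from RFC5424 itself are the PRI calculation (facility times 8 plus severity) and the use of `-` for an empty field.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of rsyslog or log4j2.
final class Rfc5424Line {
    // PRI is facility * 8 + severity (RFC 5424, section 6.2.1).
    static int pri(int facility, int severity) {
        return facility * 8 + severity;
    }

    // Empty header fields are written as "-", the RFC 5424 NILVALUE.
    static String nil(String s) {
        return (s == null || s.isEmpty()) ? "-" : s;
    }

    // Mirrors the field order of the Rfc5424Format template.
    static String format(int facility, int severity, OffsetDateTime ts, String host,
                         String app, String procId, String msgId, String sd, String msg) {
        return "<" + pri(facility, severity) + ">1 "
                + ts.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME) + " "
                + nil(host) + " " + nil(app) + " " + nil(procId) + " "
                + nil(msgId) + " " + nil(sd) + " " + msg;
    }

    public static void main(String[] args) {
        OffsetDateTime ts = OffsetDateTime.of(2014, 3, 1, 12, 0, 0, 0, ZoneOffset.UTC);
        // Facility 1 is "user", severity 6 is "informational".
        System.out.println(format(1, 6, ts, "web01", "MyApp", "1234", "Audit", null, "user logged in"));
        // prints: <14>1 2014-03-01T12:00:00Z web01 MyApp 1234 Audit - user logged in
    }
}
```

The leading `1` after the PRI is the syslog protocol version, which is why the template hard-codes it.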
We need an endpoint that can handle huge messages, is TCP all the way, buffers messages if it can’t write out, and can fail over.
My friends at Loggly have put together a great base configuration built around this actionQueue. It helps buffer logs through any intermittent network issues that may occur. It will not completely mitigate them, however. ActionQueue (via loggly)
We want a TCP receiver as well, so we can use this: IMTCP.
Now that we have TCP, we want to fail over: failover. I don’t, however, want to dump logs to a local file if every failover target is down.
Combining this together with our rfc5424 format gives the following:
# Thanks, Loggly, this is excellent
$WorkDirectory /var/spool/rsyslog
$ActionQueueFileName fwdRule1
$ActionQueueMaxDiskSpace 1g
$ActionQueueSaveOnShutdown on
$ActionQueueType LinkedList
$ActionResumeRetryCount -1

# Start up the tcp relay
# $MaxMessageSize must be set BEFORE imtcp is loaded
$MaxMessageSize 64k
$ModLoad imtcp
$InputTCPMaxSession 200
$InputTCPServerRun 514

# RFC5424! Well-structured! If I had more guts, I'd drop %msg%
$template Rfc5424Format,"<%PRI%>1 %TIMESTAMP:::date-rfc3339% %HOSTNAME% %APP-NAME% %PROCID% %MSGID% %STRUCTURED-DATA% %msg%"

# A template only takes effect when it is named on the action (";Rfc5424Format")
*.* @@primary-syslog.example.com;Rfc5424Format
$ActionExecOnlyWhenPreviousIsSuspended on
& @@secondary-1-syslog.example.com;Rfc5424Format
& @@secondary-2-syslog.example.com;Rfc5424Format
$ActionExecOnlyWhenPreviousIsSuspended off
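The failover part of that config can be modeled roughly as "offer the message to each target in order and stop at the first one that takes it". The Java sketch below shows only that ordering. It is not rsyslog's implementation, and `FailoverChain`, `Target`, and `forward` are names invented for the example. The real rsyslog also retries and queues to disk according to the `$ActionQueue*` and `$ActionResumeRetryCount` settings, which this sketch leaves out.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a primary/secondary forwarding chain.
final class FailoverChain {
    /** One forwarding target; send throws IOException when the target is unreachable. */
    interface Target {
        void send(String line) throws IOException;
    }

    /**
     * Offers the line to each target in order and stops at the first success.
     * Returns the index of the target that accepted it, or -1 if all failed.
     */
    static int forward(List<Target> targets, String line) {
        for (int i = 0; i < targets.size(); i++) {
            try {
                targets.get(i).send(line);
                return i;
            } catch (IOException unreachable) {
                // This target counts as suspended; move on to the next one.
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Target down = line -> { throw new IOException("connection refused"); };
        Target up = line -> System.out.println("accepted: " + line);
        // Primary is down, so the first secondary takes the message.
        System.out.println(forward(Arrays.asList(down, up, up), "hello"));
    }
}
```

A return value of -1 is the case where the relay has nowhere to send the message, which is where the disk-backed queue earns its keep.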
After digging into log4j configurations, it became apparent that log4j2 made a lot more sense. I have ripped off the following configuration by combining some of this doc with StackOverflow.
Also, reading up on what RFC5424 requires will encourage you to go make a company id with IANA here.
If you aren’t already, PLEASE make sure you’re using slf4j-api bindings in your application. It makes migrating between logging solutions a breeze. Today, log4j2. Tomorrow, logback.
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="warn" name="MyApp" packages="">
  <Appenders>
    <Syslog name="RFC5424" format="RFC5424" host="localhost" port="514"
            protocol="TCP" appName="MyApp" includeMDC="true" facility="USER"
            enterpriseNumber="18060" newLine="true" messageId="Audit"
            mdcId="mdc" id="App" connectTimeoutMillis="1000"
            reconnectionDelayMillis="5000">
      <LoggerFields>
        <KeyValuePair key="thread" value="%t"/>
        <KeyValuePair key="priority" value="%p"/>
        <KeyValuePair key="category" value="%c"/>
        <KeyValuePair key="exception" value="%ex"/>
        <KeyValuePair key="message" value="%m"/>
      </LoggerFields>
    </Syslog>
  </Appenders>
  <Loggers>
    <Logger name="com.mycorp" level="info" />
    <Root level="error">
      <AppenderRef ref="RFC5424"/>
    </Root>
  </Loggers>
</Configuration>
With that, you have log4j2-enabled syslog all firing into localhost, which can then forward to any endpoint you like (such as Logstash, Apache Flume, or Loggly).
ERROR Syslog contains an invalid element or attribute "reconnectDelayMillis"
I believe you meant reconnectionDelayMillis.
Thanks for this, it was helpful 🙂
I have a question on the rsyslog part I was unable to find an answer for (certainly not thanks to rsyslog’s “documentation”): how do you add custom SDATA fields from rsyslog.conf?
Hey Fabian! Thanks for reading. If you check the RFC5424 spec, they claim the following:
This means that, at least for me, I will manually add the SD info in the message:
This is sort of a lie, in that, in all of my use cases at least, either the sender sets its own structured data or I need to set the structured data myself (e.g. for a file watcher). Given that, I only have one structured data element in practice.
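For anyone building that structured data by hand, here is a small Java sketch of a single SD-ELEMENT as RFC5424 describes it. It is only an illustration: the class and method names are invented, and the `source` and `path` parameters are hypothetical. The `App` id and the enterprise number 18060 are borrowed from the log4j2 config in the post. The one rule that is easy to miss is that `"`, `\`, and `]` inside a value must be escaped with a backslash.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for hand-built RFC 5424 structured data.
final class StructuredData {
    // RFC 5424 section 6.3.3: '"', '\' and ']' in a PARAM-VALUE must be backslash-escaped.
    static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            if (c == '"' || c == '\\' || c == ']') {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    // Builds one SD-ELEMENT of the form [sdId@enterpriseNumber key="value" ...].
    static String element(String sdId, int enterpriseNumber, Map<String, String> params) {
        StringBuilder sb = new StringBuilder("[").append(sdId).append('@').append(enterpriseNumber);
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(' ').append(e.getKey()).append("=\"").append(escape(e.getValue())).append('"');
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("source", "file-watcher");
        params.put("path", "/var/log/app[1].log");
        System.out.println(element("App", 18060, params));
        // prints: [App@18060 source="file-watcher" path="/var/log/app[1\].log"]
    }
}
```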