Parsing logback: attribute line break and CDATA

Notes, tips, and other useful things on how to use LogMX

constructor
Posts: 4
Joined: Wed Jun 05, 2019 5:41 pm

Parsing logback: attribute line break and CDATA

Post by constructor »

LogMX version: 5.4.4

We are running into two issues.

#1. It fails to parse a logback message because the timestamp attribute starts on a new line:
<log4j:event logger="com.acme"
timestamp="1559658306788" level="INFO" thread="[ACTIVE] ExecuteThread: '9'">
<log4j:message>Hello World</log4j:message>
</log4j:event>
#2. Even if the timestamp is merged with the previous line, the record is displayed but the message column is empty, because LogMX expects "<![CDATA[Hello World]]>" instead.

Is there an out-of-the-box solution for these issues, or do we have to implement our own parser?
admin
Site Admin
Posts: 555
Joined: Sun Dec 17, 2006 10:30 pm

Re: Parsing logback: attribute line break and CDATA

Post by admin »

Hello,

These logs look like they were actually produced by Log4j, not Logback (see https://howtodoinjava.com/log4j/how-to- ... sing-log4j). But even then, the new line character right before the timestamp prevents LogMX from recognizing this Log4j syntax (and, as you said, the lack of CDATA forces LogMX to consider the whole "<log4j:message>Hello World</log4j:message>" as the log message instead of just "Hello World").

So since it's not a valid Logback or Log4j standard format, you can use the following Log4jPattern Parser:

Code: Select all

<log4j:event logger="%c"%ntimestamp="%d{S}" level="%p" thread="%t">%n<log4j:message>%m</log4j:message>%n</log4j:event>
To create such a Parser: go to menu "Tools" > "Options" > tab "Parsers" > green "+" button at the right > tab "Log4j/Logback Pattern Parser".
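
For readers who want to sanity-check the shape of this pattern outside LogMX, here is a rough Python regular-expression equivalent. It is only a sketch: the names EVENT_RE and parse_event are made up for illustration, it assumes the timestamp is written as a run of digits (epoch milliseconds), and it says nothing about how LogMX itself implements the pattern.

```python
import re

# Hypothetical regex mirroring the pattern above: each %n becomes \s*\n\s*
# (a line break with optional indentation), and %c, %p, %t and %m become
# named capture groups.
EVENT_RE = re.compile(
    r'<log4j:event logger="(?P<logger>[^"]*)"\s*\n\s*'
    r'timestamp="(?P<timestamp>\d+)" level="(?P<level>[^"]*)" '
    r'thread="(?P<thread>[^"]*)">\s*\n\s*'
    r'<log4j:message>(?P<message>.*?)</log4j:message>\s*\n\s*'
    r'</log4j:event>',
    re.DOTALL,
)

def parse_event(text):
    """Return the captured fields of the first event in text, or None."""
    match = EVENT_RE.search(text)
    return match.groupdict() if match else None

# The sample event from the first post, with the line break before timestamp.
SAMPLE = """<log4j:event logger="com.acme"
timestamp="1559658306788" level="INFO" thread="[ACTIVE] ExecuteThread: '9'">
<log4j:message>Hello World</log4j:message>
</log4j:event>"""

print(parse_event(SAMPLE))
```

Because the line breaks are matched with optional surrounding whitespace, the same expression should also accept the indented variant of the event shown later in this thread.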

Also, LogMX v5.4.4 was released more than 3 years ago, so I would recommend upgrading to a more recent version: a few major security issues have been fixed since then, as well as a great number of bugs. New features are also available, such as automatically creating a Log4j/Logback Pattern Parser from your log4j/logback config file, which would help with the issue you're having. (Also, since v5 is not supported anymore, I wasn't even able to test the above pattern with LogMX v5.)

Please let me know if you have any questions.

Xavier
constructor

Re: Parsing logback: attribute line break and CDATA

Post by constructor »

Thanks Xavier.

The log I showed was produced by Logback's XMLLayout:
https://github.com/qos-ch/logback/blob/ ... #L111-L113

I don't know why they start a new line at the timestamp attribute, but it produces something LogMX doesn't recognize.

Does LogMX have any plans to support this format in future releases?
constructor

Re: Parsing logback: attribute line break and CDATA

Post by constructor »

Another issue I didn't show, for simplicity, is that LogMX doesn't understand the XML entity for a single quote (&#39;): it is displayed as is in the viewer.

Code: Select all

<log4j:event logger="com.acme"
             timestamp="1559658306788" level="INFO" thread="[ACTIVE] ExecuteThread: &#39;9&#39; for queue: &#39;weblogic.kernel.Default (self-tuning)&#39;">
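
One possible workaround for the entity issue, if the attribute values are post-processed outside LogMX (for example in a preprocessing script), is to decode the character references explicitly. The sketch below uses Python's standard html.unescape; decode_thread_name is a hypothetical helper name and is not part of LogMX.

```python
from html import unescape

def decode_thread_name(raw):
    """Decode character references such as &#39; in an attribute value."""
    # html.unescape handles numeric references (&#39;) as well as named ones
    # (&amp;, &lt;, ...). It covers HTML names, a superset of XML's five.
    return unescape(raw)

raw_thread = (
    "[ACTIVE] ExecuteThread: &#39;9&#39; for queue: "
    "&#39;weblogic.kernel.Default (self-tuning)&#39;"
)
print(decode_thread_name(raw_thread))
```

Decoding is only safe on a value that has already been extracted from the markup: unescaping the raw XML text first could turn &quot; or &lt; into characters that break the attribute or element they sit in.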
admin

Re: Parsing logback: attribute line break and CDATA

Post by admin »

Wow, I was not aware of this weird Layout offered by Logback :shock: . It's basically a Layout that tries (but clearly fails) to mimic the Log4j XML Layout. I personally think it's wrong that Logback provides this. If it at least produced the same format, why not, but since it doesn't, I wonder what the point is. And even if the 2 formats were identical at some point, they could easily have diverged since these are 2 different projects (maybe that's what happened). Ironically, this Logback layout seems to have been written by Ceki Gülcü, the author of both Log4j and Logback...

Anyway, I don't think that LogMX will ever have a built-in Parser for this weird layout, because we are trying to ship LogMX with only the most common log formats: the more Parsers you have, the longer it takes LogMX to detect the format used in your logs.

You made a good point concerning the XML entity &#39;, but LogMX deliberately doesn't decode any kind of XML entity, for performance reasons. It is also for performance reasons that adding a new line between 2 XML attributes prevents LogMX from recognizing this format: it doesn't use a proper XML parser. When we built the first XML Parsers, our own code performed far better than the fastest available XML parsers. Most XML parsers also don't work well with invalid XML content, which is sometimes the case with logs: a file that is still being written has no closing root element yet, and the same can happen when a log file is rolled.
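
To illustrate the point about strict XML parsers and log fragments: a log4j-style event on its own is not a well-formed XML document, because the log4j: prefix is never declared and a log file holds many events with no root element. The Python sketch below shows the standard library parser rejecting the raw fragment and accepting it once it is wrapped in a synthetic root. The wrapper element name and the namespace URI are assumptions chosen for illustration; this is not how LogMX parses logs.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI, used only so that the log4j: prefix is declared.
LOG4J_NS = "http://jakarta.apache.org/log4j/"

FRAGMENT = '''<log4j:event logger="com.acme"
timestamp="1559658306788" level="INFO" thread="ExecuteThread: &#39;9&#39;">
<log4j:message>Hello World</log4j:message>
</log4j:event>'''

def parse_strict(fragment):
    """Parse the fragment as-is; raises ET.ParseError (undeclared prefix)."""
    return ET.fromstring(fragment)

def parse_wrapped(fragment):
    """Wrap the fragment in a root that declares the prefix, then parse."""
    wrapped = '<root xmlns:log4j="%s">%s</root>' % (LOG4J_NS, fragment)
    return ET.fromstring(wrapped)

try:
    parse_strict(FRAGMENT)
    strict_ok = True
except ET.ParseError:
    strict_ok = False

root = parse_wrapped(FRAGMENT)
event = root.find("{%s}event" % LOG4J_NS)
message = event.find("{%s}message" % LOG4J_NS).text
print(strict_ok, event.get("thread"), message)
```

A real XML parser also decodes &#39; and tolerates the line break between attributes, but an event that is cut off mid-write still fails even with the wrapper, which is the situation described above.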

But in general, I would not recommend using XML to write logs in the first place: the log files are huge, not easily readable by humans, and incur a serious performance hit when analyzed or parsed.

Anyway, if you still want to use this layout, the Parser I gave you in my last message should work just fine. Did you have any trouble with it? (well, except for XML entities ;))

Xavier
constructor

Re: Parsing logback: attribute line break and CDATA

Post by constructor »

Thanks Xavier for the detailed explanation. I haven't tried your parser yet. I ended up writing my own XMLLayout that fixed the issues I listed.