please help with a simple parser

Here developers can talk about how to write a Parser for LogMX

shemesh
Posts: 4
Joined: Thu Jun 10, 2010 2:36 pm

Post by shemesh »

Hi,
I need some help with writing a parser.

This is the format of a log entry:
yyyy-MM-dd HH:mm:ss,SSS | level | emitter | user name | user | class name | method name | message

And this is the code of my parser:

Code:

private ParsedEntry entry = null;

private final static SimpleDateFormat DATE_FORMAT = new SimpleDateFormat(
	"yyyy-MM-dd HH:mm:ss,SSS");

private final static Pattern ENTRY_BEGIN_PATTERN = Pattern
	.compile("^\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}.*$");

private StringBuilder entryMsgBuffer = null;

/** Key of user-defined fields */
private static final String USER_NAME = "User Name";
private static final String USER = "User";
private static final String CALLING_CLASS = "Calling Class";
private static final String CALLING_METHOD = "Calling Method";

private static final List<String> EXTRA_FIELDS_KEYS = Arrays
	.asList(USER_NAME, USER, CALLING_CLASS, CALLING_METHOD);

protected void parseLine(String line) throws Exception {
	
	// Expected format: yyyy-MM-dd HH:mm:ss,SSS | level | emitter | user name | user | class name | method name | message
	// If end of file, records last entry if necessary, and exits
	if (line == null) {
		recordPreviousEntryIfExists();
		return;
	}
	String trimmedLine = line;//.replaceAll("\\s+", "");
	Matcher matcher = ENTRY_BEGIN_PATTERN.matcher(trimmedLine);
	if (matcher.matches()) {
		// Record previous found entry if exists, then create a new one
		prepareNewEntry();

		// Limit the split to 8 fields so that any '|' inside the message is kept
		String[] fields = trimmedLine.split("\\|", 8);

		entry.setDate(fields[0].trim());
		entry.setLevel(fields[1].trim());
		entry.setEmitter(fields[2].trim());
		entryMsgBuffer.append(fields[7].trim());
		entry.getUserDefinedFields().put(USER_NAME, fields[3].trim());
		entry.getUserDefinedFields().put(USER, fields[4].trim());
		entry.getUserDefinedFields().put(CALLING_CLASS, fields[5].trim());
		entry.getUserDefinedFields().put(CALLING_METHOD, fields[6].trim());
	} else if (entry != null) {
		entryMsgBuffer.append('\n').append(trimmedLine); // appends this line to previous entry's text
	}
}

public List<String> getUserDefinedFields() {
	return EXTRA_FIELDS_KEYS;
}

public Date getRelativeEntryDate(ParsedEntry pEntry) throws Exception {
	return DATE_FORMAT.parse(pEntry.getDate());
}

public Date getAbsoluteEntryDate(ParsedEntry pEntry) throws Exception {
	return DATE_FORMAT.parse(pEntry.getDate());
}

private void recordPreviousEntryIfExists() throws Exception {
	if (entry != null) {
		entry.setMessage(entryMsgBuffer.toString());
		addEntry(entry);
	}
}

private void prepareNewEntry() throws Exception {
	recordPreviousEntryIfExists();
	entry = createNewEntry();
	entryMsgBuffer = new StringBuilder(80);
	entry.setUserDefinedFields(new HashMap<String, Object>(1)); // Create an empty Map with only one element allocated
}

So far so good...
BUT!!
Sometimes a log entry starts with some whitespace at the beginning of the line (I cannot control this).
This is where it all gets messy.
I tried line = line.trim() and also .replaceAll("\\s+", ""), but I only get errors from LogMX.

Please help with a solution.
admin
Site Admin
Posts: 557
Joined: Sun Dec 17, 2006 10:30 pm

Re: please help with a simple parser

Post by admin »

Hello,

You were close to the solution ;) Just replace

Code:

String trimmedLine = line;//.replaceAll("\\s+", "");
with:

Code:

String trimmedLine = line.replaceAll("^\\s*(.*)$", "$1");
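It works for any number of leading whitespace characters, from 0 to N, thanks to the '*' in "\\s*" (instead of '+', which means from 1 to N). The '^' anchors the match at the beginning of the line, so only leading whitespace is removed.
Then "$1" copies back the text captured by "(.*)".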
To learn more about regular expressions: http://java.sun.com/javase/6/docs/api/j ... ttern.html
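
To see the difference between the variants, here is a minimal standalone sketch. The class name, the helper methods and the sample log line are made up for illustration and are not part of LogMX; only the two regular expressions come from the code above. It checks a line with leading spaces against the entry pattern as-is, after removing all whitespace with "\\s+", and after removing only the leading whitespace.

```java
import java.util.regex.Pattern;

public class LeadingWhitespaceDemo {

    // Same pattern as in the parser: an entry must begin with the timestamp
    private static final Pattern ENTRY_BEGIN_PATTERN = Pattern
            .compile("^\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}.*$");

    /** Hypothetical helper: removes leading whitespace only, keeps the rest of the line. */
    public static String stripLeading(String line) {
        return line.replaceAll("^\\s*(.*)$", "$1");
    }

    /** Hypothetical helper: true if the line starts a new log entry. */
    public static boolean isEntryBegin(String line) {
        return ENTRY_BEGIN_PATTERN.matcher(line).matches();
    }

    public static void main(String[] args) {
        // Made-up sample line with three leading spaces
        String line = "   2010-06-10 14:36:00,123 | INFO | app | John Doe | jdoe | MyClass | myMethod | started";

        // Leading spaces prevent the match
        System.out.println(isEntryBegin(line)); // false

        // "\\s+" removes ALL whitespace, including the space between date and time
        System.out.println(isEntryBegin(line.replaceAll("\\s+", ""))); // false

        // Only the leading whitespace is removed, so the timestamp stays intact
        System.out.println(isEntryBegin(stripLeading(line))); // true
    }
}
```

With this sample, removing all whitespace also deletes the space inside the timestamp, so the line cannot match the entry pattern anymore, while the anchored expression leaves the rest of the line untouched.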

Feel free to let me know if you have any other problem or question.

Xavier.