
Extracting Log Messages by Category

A nifty script for automation of log analysis for errors.

Posted by Anadi Misra on March 30, 2008 · 3 mins read

I was recently faced with the question of how to make log analysis easier for our production system. We run a cluster of sorts, with applications running on four Tomcat front ends. The site in question receives millions of requests per day, so it is no surprise that the logs flood in. When you end up logging 400 MB of log statements per hour, it gets really tough to monitor the logs and see, for example, which WARN messages occur most often. So I came up with a script that might help you as well if you are looking for some quick filtering of logs. We have the following log format:

{DATE} {TIMESTAMP} {LOG_LEVEL} {MESSAGE} | {LOGGING COMPONENT} {TOMCAT-HTTP-PROCESSOR}

Here’s the script:

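#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;

# Command-line options: -file <log file> -level <log level>
my ($opt_file, $opt_level);
GetOptions("file=s" => \$opt_file, "level=s" => \$opt_level);

# Non-greedy capture group, reused for every field of the log format
my $w = "(.+?)";

if ($opt_file && $opt_level) {
  my $DISTILLED_LOGFILE = "distilled-" . $opt_level . "-" . $opt_file;
  open(INPUTFILE, "<", $opt_file) or die("Could not open log file: $!");
  open(OUTPUTFILE, ">", $DISTILLED_LOGFILE) or die("Could not create filtered log file: $!");

  # Read the log line by line
  while (my $line = <INPUTFILE>) {
    # Skip lines that do not fit the format (stack traces, for example);
    # otherwise $1..$6 would still hold the values of the previous match.
    next unless $line =~ m/^$w $w $w $w \| $w \[$w\]/;
    my ($date, $timeStamp, $logLevel, $message, $classLocation, $httpProcessor) = ($1, $2, $3, $4, $5, $6);

    # Keep only the lines logged at the requested level
    if ($logLevel eq $opt_level) {
      print OUTPUTFILE "$logLevel\t$message\t$classLocation\n";
    }
  }

  close(INPUTFILE);
  close(OUTPUTFILE);
} else {
  print STDOUT "Usage: filter_logfile.pl -file someTomcatlogFile -level logLevel\n";
}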
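To see how the pattern splits a line into its six fields, here is a small sketch that applies the same regex to a single line. The log line, its values and the parse_line helper are made up for this example; the square brackets around the HTTP processor simply follow what the pattern in the script expects.

```perl
use strict;
use warnings;

# Same building block as in the script: a non-greedy capture group.
my $w = "(.+?)";

# Hypothetical log line, invented for this example (not taken from a real log).
my $line = "2008-03-30 10:15:42,123 WARN Session cache is almost full | com.example.SessionCache [http-8080-Processor12]";

# Hypothetical helper: returns the six fields of a line in list context,
# or an empty list if the line does not fit the format.
sub parse_line {
    my ($text) = @_;
    return $text =~ m/^$w $w $w $w \| $w \[$w\]/;
}

my ($date, $timeStamp, $logLevel, $message, $classLocation, $httpProcessor) = parse_line($line);

print "level=$logLevel\n";
print "message=$message\n";
print "component=$classLocation\n";
print "processor=$httpProcessor\n";
```

Because every group is non-greedy, the message field grows only as far as the first " | " separator, which is why a message containing spaces still ends up in a single field.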

This is how to use it:

filter_logfile.pl -file someTomcatlogFile -level logLevel

What it does is pretty simple. It matches each line against a regex pattern and checks whether the line has the same log level as the one you provided on the command line. If it does, the script copies the log level, the message and the component logging the message to a new file named

distilled-LOGLEVEL-[YOUR-INPUT-LOG-FILE-NAME].
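Once you have the distilled file, finding the highest occurring messages is a matter of counting identical lines. The following is only a rough sketch of one way to do that, assuming the tab-separated output written by the filter script; count_messages and the script name used below are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: counts how often each "MESSAGE<TAB>COMPONENT" pair
# occurs in a list of distilled lines (LEVEL<TAB>MESSAGE<TAB>COMPONENT).
sub count_messages {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        chomp $line;
        # Split off the level; the rest (message and component) is the key.
        my ($level, $rest) = split /\t/, $line, 2;
        next unless defined $rest;    # skip lines without a tab
        $count{$rest}++;
    }
    return %count;
}

# Only run when a distilled file is named on the command line.
if (@ARGV) {
    my %count = count_messages(<>);

    # Print the most frequent messages first; ties are sorted alphabetically.
    for my $key (sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count) {
        print "$count{$key}\t$key\n";
    }
}
```

Saved as, say, count_messages.pl (a made-up name), it could be run as perl count_messages.pl distilled-WARN-someTomcatlogFile to list each message with its count, most frequent first.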

Happy Log filtering!