List: kde-devel
Subject: Re: KLogTool
From: Jan Kneschke <jan () kneschke ! de>
Date: 2002-10-24 9:39:01
On Thu, Oct 24, 2002 at 09:49:53AM +0200, Michael Goffioul wrote:
> > > - each log entry is a single line
> >
> > For very long lines this is NOT the case for apache 1.3.x.
>
> I would appreciate some example files (in private of course).
Don't you have a webserver that gets hit by Nimda and friends?
p3EE1DBEA.dip.t-dialin.net - - [25/Oct/2001:14:24:51 +0000] "GET \
/html/admin.php?name%5B0%5D=Pasta+-+programs+-+window.php&filename%5B0%5D=% \
2Fpages%2Fprojects%2Fpasta%2Fwindow.php&edit%5B0%5D=edit&name%5B1%5D=Pasta+-+programs+-+index.php&filename%5B1%5D=%2Fpages%2Fprojects%2Fpast
a%2Findex.php&name%5B2%5D=Pasta+-+programs+-+wm.php&filename%5B2%5D=%2Fpages%2Fprojects%2Fpasta%2Fwm.php&name%5B3%5D=Pasta+-+programs+-+kons
ole.php&filename%5B3%5D=%2Fpages%2Fprojects%2Fpasta%2Fkonsole.php&name%5B4%5D=Phpezant+-+listcreate.php&filename%5B4%5D=%2Fpages%2Fprojects%
2Fphpezant%2Flistcreate.php&name%5B5%5D=Modlogan+-+mlaconfiggen.php&filename%5B5%5D=%2Fpages%2Fprojects%2Fmodlogan%2Fmlaconfiggen.php&name%5
B6%5D=Pasta+-+lib+-+wm.inc&filename%5B6%5D=%2Flib%2Fphp%2Fpasta%2Fwm.inc&name%5B7%5D=Pasta+-+lib+-+themedwm.inc&filename%5B7%5D=%2Flib%2Fphp
%2Fpasta%2Fthemedwm.inc&name%5B8%5D=Pasta+-+lib+-+window.inc&filename%5B8%5D=%2Flib%2Fphp%2Fpasta%2Fwindow.inc&name%5B9%5D=Pasta+-+lib+-+obj
ect.inc&filename%5B9%5D=%2Flib%2Fphp%2Fpasta%2Fobject.i
nc&name%5B10%5D=Pasta+-+lib+-+view.inc&filename%5B10%5D=%2Flib%2Fphp%2Fpasta%2Fview.inc&name%5B11%5D=Pasta+-+lib+-+viewcollection.inc&filena
me%5B11%5D=%2Flib%2Fphp%2Fpasta%2Fviewcollection.inc&name%5B12%5D=Pasta+-+lib+-+box.inc&filename%5B12%5D=%2Flib%2Fphp%2Fpasta%2Fbox.inc \
HTTP /1.1" 200 15675 "http://jan.kneschke.de/html/admin.php?op=showsource" \
"Mozilla/5.0 (compatible; Konqueror/2.2.1; Linux)"
The real line break is after the "object.i". All the other line breaks are just virtual.
p3EE1DBEA.dip.t-dialin.net - - [25/Oct/2001:14:25:20 +0000] "GET \
/html/admin.php?op=showsource HTTP/1.1" 200 18793 "http://jan.kneschke.de/h \
tml/admin.php?name%5B0%5D=Pasta+-+programs+-+window.php&filename%5B0%5D=%2Fpages%2Fprojects%2Fpasta%2Fwindow.php&edit%5B0%5D=edit&name%5B1%5
D=Pasta+-+programs+-+index.php&filename%5B1%5D=%2Fpages%2Fprojects%2Fpasta%2Findex.php&name%5B2%5D=Pasta+-+programs+-+wm.php&filename%5B2%5D
=%2Fpages%2Fprojects%2Fpasta%2Fwm.php&name%5B3%5D=Pasta+-+programs+-+konsole.php&filename%5B3%5D=%2Fpages%2Fprojects%2Fpasta%2Fkonsole.php&n
ame%5B4%5D=Phpezant+-+listcreate.php&filename%5B4%5D=%2Fpages%2Fprojects%2Fphpezant%2Flistcreate.php&name%5B5%5D=Modlogan+-+mlaconfiggen.php
&filename%5B5%5D=%2Fpages%2Fprojects%2Fmodlogan%2Fmlaconfiggen.php&name%5B6%5D=Pasta+-+lib+-+wm.inc&filename%5B6%5D=%2Flib%2Fphp%2Fpasta%2Fw
m.inc&name%5B7%5D=Pasta+-+lib+-+themedwm.inc&filename%5B7%5D=%2Flib%2Fphp%2Fpasta%2Fthemedwm.inc&name%5B8%5D=Pasta+-+lib+-+window.inc&filena
me%5B8%5D=%2Flib%2Fphp%2Fpasta%2Fwindow.inc&name%5B9%5D
=Pasta+-+lib+-+object.inc&filename%5B9%5D=%2Flib%2Fphp%2Fpasta%2Fobject.inc&name%5B10%5D=Pasta+-+lib+-+view.inc&filename%5B10%5D=%2Flib%2Fph
p%2Fpasta%2Fview.inc&name%5B11%5D=Pasta+-+lib+-+viewcollection.inc&filename%5B11%5D=%2Flib%2Fphp%2Fpasta%2Fviewcollection.inc&name%5B12%5D=P
asta+-+lib+-+box.inc&filename%5B12%5D=%2Flib%2Fphp%2Fpasta%2Fbox.inc" "Mozilla/5.0 \
(compatible; Konqueror/2.2.1; Linux)"
Here it is after the "%5B9%5D".
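A minimal sketch of how a parser could glue such split entries back together before matching them. It assumes that every real entry starts with "host ident user [timestamp]" and that any physical line not starting that way is a continuation of the previous one. ENTRY_START and join_split_entries() are made-up names for illustration, not code from klogtool or modlogan.

```python
import re

# Assumption: a real entry always begins with "host ident user [timestamp] ".
ENTRY_START = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] ')

def join_split_entries(lines):
    """Yield complete log entries, appending continuation lines to the entry before them."""
    current = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if current is not None and not ENTRY_START.match(line):
            # Does not look like the start of an entry: treat it as the
            # remainder of an entry that was broken across physical lines.
            current += line
            continue
        if current is not None:
            yield current
        current = line
    if current is not None:
        yield current

sample = [
    'h1 - - [25/Oct/2001:14:24:51 +0000] "GET /a?x=object.i\n',
    'nc&y=1 HTTP/1.1" 200 15675\n',
    'h2 - - [25/Oct/2001:14:25:20 +0000] "GET /b HTTP/1.1" 200 1\n',
]
entries = list(join_split_entries(sample))
```

The heuristic fails if a continuation line happens to look like the start of an entry, so it is only a starting point.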
> > > - fields with spaces are quoted (otherwise it seems impossible
> > > to parse)
> >
> > You should use regexes to parse them. If you want you should take a look at
> > modlogan (http://jan.kneschke.de/projects/modlogan/) which is a very
> > flexible log-file parser for multiple logfile-types (webserver, ftp-server,
> > mail-server, streaming-server, ...)
>
> klogtool is already based on regexp's. However, even with regexp's, I
> don't see how you can parse reliably a line that contains successively
> 2 unquoted fields including spaces. Imagine a log format like:
> %r %r. Or each of these fields must have a known format.
This is a user problem. That's why %r is always surrounded by '"'.
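To illustrate the point about "%r %r": a small sketch comparing two unquoted fields with two quoted ones. The patterns and the helper possible_splits() are invented for this example, and the quoted pattern ignores escaped quotes inside a field.

```python
import re

# Two unquoted fields that may contain spaces: the regex has to guess the boundary.
UNQUOTED = re.compile(r'^(.+) (.+)$')
# Two quoted fields: the quotes mark the boundary unambiguously.
QUOTED = re.compile(r'^"([^"]*)" "([^"]*)"$')

def possible_splits(line):
    """Return every way to cut `line` into two non-empty fields at a space."""
    return [(line[:i], line[i + 1:])
            for i, ch in enumerate(line)
            if ch == " " and 0 < i < len(line) - 1]

unquoted_line = 'GET /a HTTP/1.1 GET /b HTTP/1.1'
quoted_line = '"GET /a HTTP/1.1" "GET /b HTTP/1.1"'

candidates = possible_splits(unquoted_line)             # every space is a candidate
greedy_guess = UNQUOTED.match(unquoted_line).groups()   # greedy match cuts at the last space
quoted_fields = QUOTED.match(quoted_line).groups()      # exactly one valid reading
```

The unquoted line has five candidate boundaries and the greedy regex simply takes the last one, which is wrong here. The quoted line can only be read one way.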
In the latest release we have regex generators for CustomLog (Apache), MS IIS,
Netscape, Squid and RealServer, as their logfile formats are configurable.
You should take a look at the source to get an impression of what has to be
done:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/modlogan/modlogan/src/input/clf/plugin_config.c?rev=1.34&content-type=text/vnd.viewcvs-markup
parse_clf_field_info() is the parser for the apache CustomLog directive.
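As a rough idea of what such a regex generator does, here is a much simplified sketch. It is not a translation of parse_clf_field_info(). It assumes only a handful of Apache format directives, each used at most once, and the names FIELD_PATTERNS, TOKEN and build_regex() are invented for the example.

```python
import re

# Assumed mapping from a few Apache LogFormat directives to regex fragments.
# Quoted fields are matched as "anything but a double quote".
FIELD_PATTERNS = {
    "%h": r"(?P<host>\S+)",
    "%l": r"(?P<ident>\S+)",
    "%u": r"(?P<user>\S+)",
    "%t": r"\[(?P<time>[^\]]+)\]",
    "%r": r'(?P<request>[^"]*)',
    "%>s": r"(?P<status>\d{3})",
    "%b": r"(?P<bytes>\d+|-)",
    "%{Referer}i": r'(?P<referer>[^"]*)',
    "%{User-Agent}i": r'(?P<agent>[^"]*)',
}

# Recognizes "%x", "%>x" and "%{Name}x"; real format strings allow more than this.
TOKEN = re.compile(r"%(?:\{[^}]*\}|>)?[A-Za-z]")

def build_regex(log_format):
    """Turn a LogFormat string into a compiled regex with one named group per directive."""
    parts = []
    pos = 0
    for m in TOKEN.finditer(log_format):
        # Literal text between directives (spaces, quotes) must match exactly.
        parts.append(re.escape(log_format[pos:m.start()]))
        token = m.group(0)
        if token not in FIELD_PATTERNS:
            raise ValueError(f"unsupported directive: {token}")
        parts.append(FIELD_PATTERNS[token])
        pos = m.end()
    parts.append(re.escape(log_format[pos:]))
    return re.compile("^" + "".join(parts) + "$")

combined = '%h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"'
line = ('p3EE1DBEA.dip.t-dialin.net - - [25/Oct/2001:14:25:20 +0000] '
        '"GET /html/admin.php?op=showsource HTTP/1.1" 200 18793 '
        '"http://jan.kneschke.de/html/admin.php" '
        '"Mozilla/5.0 (compatible; Konqueror/2.2.1; Linux)"')
fields = build_regex(combined).match(line).groupdict()
```

Because the regex is built from the user's own format string, the quotes around fields like %r end up in the pattern as literal delimiters.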
> Michael.
Jan
--
http://jan.kneschke.de - localizer, modlogan, pxtools
mailto:jan@kneschke.de - Jan Kneschke