On Thu, Oct 24, 2002 at 09:49:53AM +0200, Michael Goffioul wrote:
> > > - each log entry is a single line
> >
> > For very long lines this is NOT the case for apache 1.3.x.
>
> I would appreciate some example files (in private of course).

Don't you have a webserver that gets hit by nimda and friends?

p3EE1DBEA.dip.t-dialin.net - - [25/Oct/2001:14:24:51 +0000] "GET /html/admin.php?name%5B0%5D=Pasta+-+programs+-+window.php&filename%5B0%5D=%
2Fpages%2Fprojects%2Fpasta%2Fwindow.php&edit%5B0%5D=edit&name%5B1%5D=Pasta+-+programs+-+index.php&filename%5B1%5D=%2Fpages%2Fprojects%2Fpast
a%2Findex.php&name%5B2%5D=Pasta+-+programs+-+wm.php&filename%5B2%5D=%2Fpages%2Fprojects%2Fpasta%2Fwm.php&name%5B3%5D=Pasta+-+programs+-+kons
ole.php&filename%5B3%5D=%2Fpages%2Fprojects%2Fpasta%2Fkonsole.php&name%5B4%5D=Phpezant+-+listcreate.php&filename%5B4%5D=%2Fpages%2Fprojects%
2Fphpezant%2Flistcreate.php&name%5B5%5D=Modlogan+-+mlaconfiggen.php&filename%5B5%5D=%2Fpages%2Fprojects%2Fmodlogan%2Fmlaconfiggen.php&name%5
B6%5D=Pasta+-+lib+-+wm.inc&filename%5B6%5D=%2Flib%2Fphp%2Fpasta%2Fwm.inc&name%5B7%5D=Pasta+-+lib+-+themedwm.inc&filename%5B7%5D=%2Flib%2Fphp
%2Fpasta%2Fthemedwm.inc&name%5B8%5D=Pasta+-+lib+-+window.inc&filename%5B8%5D=%2Flib%2Fphp%2Fpasta%2Fwindow.inc&name%5B9%5D=Pasta+-+lib+-+obj
ect.inc&filename%5B9%5D=%2Flib%2Fphp%2Fpasta%2Fobject.i
nc&name%5B10%5D=Pasta+-+lib+-+view.inc&filename%5B10%5D=%2Flib%2Fphp%2Fpasta%2Fview.inc&name%5B11%5D=Pasta+-+lib+-+viewcollection.inc&filena
me%5B11%5D=%2Flib%2Fphp%2Fpasta%2Fviewcollection.inc&name%5B12%5D=Pasta+-+lib+-+box.inc&filename%5B12%5D=%2Flib%2Fphp%2Fpasta%2Fbox.inc HTTP
/1.1" 200 15675 "http://jan.kneschke.de/html/admin.php?op=showsource" "Mozilla/5.0 (compatible; Konqueror/2.2.1; Linux)"

The line break is after the "object.i". All the other line breaks are just
virtual.
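To make the effect concrete, here is a minimal Python sketch (not modlogan
code) of how a reader could glue such a split entry back together. It assumes
that every real entry starts with `host ident user [timestamp] "` and that the
stray line break only inserts a newline without dropping characters. The names
ENTRY_START and rejoin() are made up for this illustration.

```python
import re

# Assumed shape of the start of an entry: host ident user [timestamp] "
ENTRY_START = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "')

def rejoin(lines):
    """Yield complete entries, gluing continuation lines onto their predecessor."""
    current = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if current is None or ENTRY_START.match(line):
            # A new entry begins: hand out the one collected so far.
            if current is not None:
                yield current
            current = line
        else:
            # Does not look like the start of an entry: treat it as a continuation.
            current += line
    if current is not None:
        yield current
```

Something like `for entry in rejoin(open("access_log")): ...` would then see one
string per entry. The heuristic fails if a continuation happens to look like
the start of an entry, so treat it as a starting point only.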
p3EE1DBEA.dip.t-dialin.net - - [25/Oct/2001:14:25:20 +0000] "GET /html/admin.php?op=showsource HTTP/1.1" 200 18793 "http://jan.kneschke.de/h
tml/admin.php?name%5B0%5D=Pasta+-+programs+-+window.php&filename%5B0%5D=%2Fpages%2Fprojects%2Fpasta%2Fwindow.php&edit%5B0%5D=edit&name%5B1%5
D=Pasta+-+programs+-+index.php&filename%5B1%5D=%2Fpages%2Fprojects%2Fpasta%2Findex.php&name%5B2%5D=Pasta+-+programs+-+wm.php&filename%5B2%5D
=%2Fpages%2Fprojects%2Fpasta%2Fwm.php&name%5B3%5D=Pasta+-+programs+-+konsole.php&filename%5B3%5D=%2Fpages%2Fprojects%2Fpasta%2Fkonsole.php&n
ame%5B4%5D=Phpezant+-+listcreate.php&filename%5B4%5D=%2Fpages%2Fprojects%2Fphpezant%2Flistcreate.php&name%5B5%5D=Modlogan+-+mlaconfiggen.php
&filename%5B5%5D=%2Fpages%2Fprojects%2Fmodlogan%2Fmlaconfiggen.php&name%5B6%5D=Pasta+-+lib+-+wm.inc&filename%5B6%5D=%2Flib%2Fphp%2Fpasta%2Fw
m.inc&name%5B7%5D=Pasta+-+lib+-+themedwm.inc&filename%5B7%5D=%2Flib%2Fphp%2Fpasta%2Fthemedwm.inc&name%5B8%5D=Pasta+-+lib+-+window.inc&filena
me%5B8%5D=%2Flib%2Fphp%2Fpasta%2Fwindow.inc&name%5B9%5D
=Pasta+-+lib+-+object.inc&filename%5B9%5D=%2Flib%2Fphp%2Fpasta%2Fobject.inc&name%5B10%5D=Pasta+-+lib+-+view.inc&filename%5B10%5D=%2Flib%2Fph
p%2Fpasta%2Fview.inc&name%5B11%5D=Pasta+-+lib+-+viewcollection.inc&filename%5B11%5D=%2Flib%2Fphp%2Fpasta%2Fviewcollection.inc&name%5B12%5D=P
asta+-+lib+-+box.inc&filename%5B12%5D=%2Flib%2Fphp%2Fpasta%2Fbox.inc" "Mozilla/5.0 (compatible; Konqueror/2.2.1; Linux)"

Here it is after the "%5B9%5D".

> > > - fields with spaces are quoted (otherwise it seems impossible
> > > to parse)
> >
> > You should use regexes to parse them. If you want you should take a look at
> > modlogan (http://jan.kneschke.de/projects/modlogan/) which is a very
> > flexible log-file parser for multiple logfile-types (webserver, ftp-server,
> > mail-server, streaming-server, ...)
>
> klogtool is already based on regexp's.
> However, even with regexp's, I
> don't see how you can parse reliably a line that contains successively
> 2 unquoted fields including spaces. Imagine a log format like:
> %r %r. Or each of these fields must have a known format.

That is the user's problem. That's why %r is always surrounded by '"'.

In the latest release we have regex generators for CustomLog (apache),
MSIIS, Netscape, Squid and Realserver, as their logfile formats are
configurable.

You should take a look at the source to get an impression of what has to be
done:

http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/modlogan/modlogan/src/input/clf/plugin_config.c?rev=1.34&content-type=text/vnd.viewcvs-markup

parse_clf_field_info() is the parser for the apache CustomLog directive.

> Michael.

Jan

--
http://jan.kneschke.de - localizer, modlogan, pxtools
mailto:jan@kneschke.de - Jan Kneschke
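
PS: To illustrate the quoting point above, here is a small Python sketch. It
is not taken from modlogan; the pattern assumes the usual "combined" layout
(host ident user [time] "request" status size "referer" "agent") and that
quoted fields contain no embedded '"'. COMBINED and parse_combined() are names
invented for this example. The last lines show the "%r %r" case: the regex
still matches, but where it splits depends only on greediness.

```python
import re

# Assumed "combined" layout; quoted fields must not contain an embedded '"'.
COMBINED = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def parse_combined(line):
    """Return the fields of one entry as a dict, or None if the line does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None

# Two unquoted fields that may contain spaces, as in "%r %r".
TWO_REQUESTS = "GET /a HTTP/1.1 GET /b HTTP/1.1"
greedy = re.match(r'^(.+) (.+)$', TWO_REQUESTS).groups()
lazy = re.match(r'^(.+?) (.+)$', TWO_REQUESTS).groups()
# Both patterns match, and neither split is the intended one.
```

With the quotes in place the request, referer and user agent can each contain
spaces and still come out as separate fields. Without them the parser can only
guess.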