From kde-pim Wed Feb 22 18:33:45 2012 From: =?ISO-8859-1?Q?Sebastian_Tr=FCg?= Date: Wed, 22 Feb 2012 18:33:45 +0000 To: kde-pim Subject: Re: [Kde-pim] Review Request: Add indexing throttling and fixed endless indexing problems Message-Id: <4F453509.1080908 () trueg ! de> X-MARC-Message: https://marc.info/?l=kde-pim&m=132993566819830 Thanks, that helps a lot. :) On 02/22/2012 05:48 PM, Volker Krause wrote: > Here we go: > = > Connected to OpenLink Virtuoso > Driver: 06.01.3127 OpenLink Virtuoso ODBC Driver > OpenLink Interactive SQL (Virtuoso), version 0.9849b. > Type HELP; for help and EXIT; to exit. > SQL> status(); > REPORT > VARCHAR > _________________________________________________________________________= ______ > = > OpenLink Virtuoso Server > Version 06.01.3127-pthreads for Linux as of ??? ?? ???? = > Started on: 2012/02/20 14:34 GMT+60 > Lite Mode > = > Database Status: > File size 1344274432, 164096 pages, 53194 free. > 400 buffers, 400 used, 1 dirty 1 wired down, repl age 125 0 w. io 1 w/c= rsr. > Disk Usage: 1114387496 reads avg 0 msec, 0% r 0% w last 0 s, 8976288 = > writes, > 537880 read ahead, batch =3D 6. Autocompact 64288 in 40567 out, 36% = saved. > Gate: 770368 2nd in reads, 0 gate write waits, 0 in while read 0 busy sc= rap. = > Log =3D = > /home/vkrause/.kde/share/apps/nepomuk/repository/main/data/virtuosobacken= d/soprano- > virtuoso.trx, 87 bytes > 110879 pages have been changed since last backup (in checkpoint state) > Current backup timestamp: 0x0000-0x00-0x00 > Last backup date: unknown > Clients: 610 connects, max 38 concurrent > RPC: 23159320 calls, -630 pending, 4 max until now, 0 queued, 2790 burst = reads = > (0%), 15 second brk=3D18624512 > Checkpoint Remap 0 pages, 0 mapped back. 2383 s atomic time. > DB master 164096 total 53194 free 0 remap 0 mapped back > temp 256 total 249 free > = > Lock Status: 0 deadlocks of which 0 2r1w, 71 waits, > Currently 2 threads running 0 threads waiting 0 threads in vdb. > Pending: > = > Client 1111:1:-48: Account: dba, 415702328 bytes in, 240297225 bytes out= , 0 = > stmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-43: Account: dba, 296436892 bytes in, 173661376 bytes out= , 1 = > stmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 1 threads. > Locks: = > = > Client 1111:1:-46: Account: dba, 292329482 bytes in, 179137713 bytes out= , 0 = > stmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-17: Account: dba, 335589080 bytes in, 192090303 bytes out= , 0 = > stmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-240: Account: dba, 10656 bytes in, 189180 bytes out, 0 st= mts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-41: Account: dba, 1552 bytes in, 938 bytes out, 0 stmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-579: Account: dba, 120819 bytes in, 273415 bytes out, 0 s= tmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-13: Account: dba, 419207763 bytes in, 243086156 bytes out= , 0 = > stmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-2: Account: dba, 78053 bytes in, 3394883 bytes out, 0 stm= ts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-69: Account: dba, 2564 bytes in, 80545 bytes out, 0 stmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:610: Account: dba, 204 bytes in, 255 bytes out, 1 stmts. > PID: 14404, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 1 threads. > Locks: = > = > Client 1111:1:-594: Account: dba, 339712 bytes in, 584087 bytes out, 0 s= tmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-44: Account: dba, 424311940 bytes in, 250861715 bytes out= , 0 = > stmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-65: Account: dba, 5613 bytes in, 687096 bytes out, 0 stmt= s. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-29: Account: dba, 7329 bytes in, 108643 bytes out, 0 stmt= s. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-22: Account: dba, 28081 bytes in, 9393 bytes out, 0 stmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-470: Account: dba, 3257767 bytes in, 6410521 bytes out, 0 = > stmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-12: Account: dba, 323006521 bytes in, 181981840 bytes out= , 0 = > stmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-81: Account: dba, 30386 bytes in, 401979 bytes out, 0 stm= ts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-47: Account: dba, 398031874 bytes in, 233688412 bytes out= , 0 = > stmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-45: Account: dba, 254603133 bytes in, 144263381 bytes out= , 0 = > stmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-23: Account: dba, 372153873 bytes in, 224998663 bytes out= , 0 = > stmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-68: Account: dba, 10462 bytes in, 149481 bytes out, 0 stm= ts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-253: Account: dba, 14802 bytes in, 1564724 bytes out, 0 s= tmts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > Client 1111:1:-67: Account: dba, 10881 bytes in, 169263 bytes out, 0 stm= ts. > PID: 24734, OS: unix, Application: unknown, IP#: 127.0.0.1 > Transaction status: PENDING, 0 threads. > Locks: = > = > = > Running Statements: > Time (msec) Text > 96 status() > 2177514 sparql select distinct ?r count(?p) as ?cnt where { ?r ?p ?o= . = > filter( ?p in ( = > = > Hash indexes > = > = > 159 Rows. -- 97 msec. > = > = > = > On Monday 20 February 2012 17:34:27 Sebastian Tr=FCg wrote: >> Please reproduce the high load, then start >> isql localhost:1111 dba dba >> and run "status();" >> and tell me the running queries. >> >> Thanks, >> Sebastian >> >> On 02/20/2012 11:44 AM, Christian Mollekopf wrote: >>> On Friday 17 February 2012 19.53:42 Volker Krause wrote: >>>> On Friday 17 February 2012 09:00:19 Sebastian Trueg wrote: >>>>>> On Feb. 16, 2012, 7:45 p.m., Christian Mollekopf wrote: >>>>>>> The HighPrio Queue shouldn't ever be throttled ideally, but in view= of >>>>>>> the current problems it's definitely a reasonable approach. I didn't >>>>>>> give a close look yet, but you can ship it from my side. Thanks for >>>>>>> the >>>>>>> patch. >>>>> >>>>> Currently there is no way around throttling the high prio queue. As >>>>> stated >>>>> above (and as you confirmed in private email) adding a new email acco= unt >>>>> will result in newItem events for all the emails. That in turn will p= ut >>>>> them into the high prio queue. >>>> >>>> I'm currently testing this, and it indeed seems to improve indexing >>>> considerably. Without throttling in effect my system is now reliably >>>> indexing hundreds of mails per minute, without getting stuck with >>>> Virtuoso >>>> going crazy. >>> >>> It also goes for me a lot faster, it seems it's not the merging which w= as >>> the bottleneck after all. Not sure why it goes so much faster now thoug= h.> = >>>> I (locally) reduced the idle time limit a bit though, with the new two >>>> minute setting it rarely switches to full speed here, maybe something = we >>>> still want to tweak. >>> >>> Feel free to do so once that is committed. >>> >>>> I get a lot of these errors now though: "nepomukservicestub(24734)" >>>> Soprano: "Invalid argument (1)": >>>> "http://www.semanticdesktop.org/ontologies/2007/03/22/nmo#messageHeader >>>> has >>>> a max cardinality of 1. Provided 2 values - >>>> , >>>> . Existing - " >>>> >>>> This seems to be caused by emails with more than one extra header we >>>> index >>>> (such as List-Id), and thus triggers on basically everything in my >>>> mailinglist folders. Affected emails are skipped and re-indexed at an >>>> agent >>>> restart (which of course fails again). Is nmo:messageHeader the right >>>> property for these headers, and if yes, why does it have cardinality o= ne? >>> >>> I'm running into those too, I'll have a look at it. This might be relat= ed >>> to a bug in the merger (will have to check again). I reduced the batch >>> size to 1, in order to minimize this problem. Once that is working we >>> probably want to increase the size again for performance reasons, which >>> will of course affect the system load again (nepomuk will i.e. get 100 >>> items at a time instead of just one.) >>> >>> Otherwise I think we can commit that to master. I can backport it later >>> when I have my other fixes in. >>> >>> Sebastian, can you commit this to master? Or do you want me to do it? >>> >>> I think I still have to store the already indexed collections in the >>> config, because right now, if the initial indexing hasn't finished and >>> the computer is rebooted it will still start over, meaning it will query >>> for the existence of all items again. If that is fast enough that won't >>> be a problem of course. >>> >>> The removal of items seems to be broken right now too, that's not relat= ed >>> to the patch though. >>> >>> Also I'm still not sure that this is the actual cause the virtuoso going >>> crazy bug, but in either case a very good improvement. >>> >>> Cheers, >>> Christian >>> >>>> regards, >>>> Volker _______________________________________________ KDE PIM mailing list kde-pim@kde.org https://mail.kde.org/mailman/listinfo/kde-pim KDE PIM home page at http://pim.kde.org/