From kolab-devel Tue May 22 14:08:00 2012
From: Martin Konold
Date: Tue, 22 May 2012 14:08:00 +0000
To: kolab-devel
Subject: Re: [Kolab-devel] 10.000 events in a Resource Calendar
Message-Id: <7042580.mHD12xvLWt () linux-78uc ! site>
X-MARC-Message: https://marc.info/?l=kolab-devel&m=133769604626898

On Tuesday, 22 May 2012, 13:26:23, Jeroen van Meeuwen wrote:

Hi Jeroen,

> > No this is not a flaw in any way. A delete operation is handled
> > exactly like and together with write operations. (E.g. EVERY modify
> > is actually a delete+write operation by nature of how Kolab storage
> > works)
>
> Let's take a step back, because we're confusing the issue in OP.

What does 'OP' mean?

> The following *actually* happens when an event is deleted (whether "the
> idea behind O(1)" design or not);
>
> - Adding or editing an event to a calendar obviously adds a new object
>   to IMAP.

Correct.

> - To remove an event from a calendar, the message could be flagged
>   \Deleted in IMAP, and (possibly) the folder is expunged (doesn't
>   matter),

Yes. A remove is mapped to a \Deleted flag. Why do you consider the
obvious worthwhile to mention? How a delete is syntactically implemented
in IMAP does not really matter here. The IMAP4 spec uses this
implementation in order to make the very common delete operation
extremely fast. (It is fast because a delete does not actually change
anything in the store except setting a flag. Setting a flag avoids extra
seeks and filesystem overhead. To delete an IMAP message not only
semantically but physically, a potentially very expensive EXPUNGE
command is required.)

> - This is *not* a write operation that adds a new object to IMAP.

Yes, it is not a write operation but a delete operation. Delete is
technically implemented by setting a delete flag in IMAP. Why does this
matter with regard to the scalability of a resource calendar?

> It does bump UIDVALIDITY

No, this is plain wrong.
Please reread the IMAP4rev1 RFC 3501,
http://tools.ietf.org/html/rfc3501#section-2.3.1.1. It explains that
UIDVALIDITY has nothing to do with either adding or removing items from
an IMAP folder.

UIDVALIDITY is a property of a folder, not of a message. Historically,
UIDVALIDITY was introduced in order to make the following uncommon
procedure safe:

1. Folder "foldername" created
2. Folder "foldername" populated with messages, e.g. UID 1,2,3,4,5
3. Client A synchronises with "foldername"
4. Client B deletes messages with UID 1,2,3,4,5
5. Client B removes folder "foldername"
6. Client B creates folder "foldername"
7. Client B populates folder "foldername" with messages, e.g. UID 1,2,3,4,5
8. Client A checks folder "foldername" and does not detect that the
   messages behind the previously existing UIDs did change. (By the
   rules of the protocol it is entitled to assume that nothing was
   modified.)

Solution:

Whenever a folder is created it gets not only a foldername but also a
unique UIDVALIDITY, such that the tuple (foldername, UIDVALIDITY) is
unique for every installation.

In other words, UIDVALIDITY ensures that the triple (foldername,
UIDVALIDITY, UID) is immutable for any IMAP installation!

Such an immutability guarantee is the foundation for correctness and
scalability.

> , but... see below.
>
> - The *client* is to trigger the Free/Busy update,

Yes, this is implemented this way in order to keep the patchset small
and make the Kolab solution work with any unmodified standards-compliant
IMAP4 server.

(An alternative would be to extend either IMAP4 syntax or IMAP4
semantics.)

> - CONDSTORE (required for UIDVALIDITY) is not enabled on Kolab 2.3
>   (Cyrus IMAP 2.3) mailboxes by default,

Sorry, this is technically plain wrong. CONDSTORE is not a prerequisite
of UIDVALIDITY.
CONDSTORE is defined in RFC 4551 (June 2006, years after Kolab was
designed), which is much younger than UIDVALIDITY, which was already
defined in the IMAP4rev1 specification RFC 2060 (December 1996).

> - The Free/Busy mechanism has little to hold on to, to see what has
>   changed, unless it maintains a local cache of at least the UIDs of the
>   message it used when it last generated the (partial) Free/Busy,

Keeping such a cache for optimisation purposes is trivial and common
practice. Actually it is not required for a scalable solution, but that
is a minor detail which could be discussed separately. The size of the
cache is negligible: a simple list of 32-bit integers, e.g. 40K in the
case of 10,000 events.

> - Retrieval of relevant events to the relevant period in time could be
>   made faster using sorting and retrieving the newest objects first,

This is common practice and trivial, but sorting is plain wrong and
slow.

A sorting approach is a typical relational database approach. There is
NO need to do any sorting if you leverage the IMAP protocol.

IMAP guarantees strictly monotonically increasing UID values. Because
IMAP does NOT know a modify, every modified or new event results in a
new IMAP message, which has a UID > LASTSEENUID. (For brevity I will not
get into the details of removal.)

Therefore the simple rule "UID FETCH LASTSEENUID+1:*" is sufficient.

> - The client triggering Free/Busy does not simply HEAD a URL and
>   disconnects

No, this claim is wrong. Of course the client does exactly that, to this
day.

> , as this would impede the slice of time any web server code
> has available to do what it needs to do. Therefore, a client keeps open
> the connection (and uses GET/POST) until the web server performing the
> Free/Busy updating is done. This is considered a blocking operation for
> clients that cannot do this in the background.

This is wrong. Please look at the code.
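
To make the UID-based rule from above concrete, here is a minimal sketch
of how a Free/Busy generator could find changed events without sorting
or querying. It is not taken from the Kolab sources: FakeFolder,
FolderCache and sync are hypothetical names, and the folder is an
in-memory stand-in rather than a real IMAP connection. The only IMAP
behaviour it assumes is what RFC 3501 specifies: UIDs within a folder
only ever grow, a changed UIDVALIDITY invalidates all cached UIDs, and a
range ending in "*" always includes the message with the highest UID.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeFolder:
    """In-memory stand-in for an IMAP folder (illustration only)."""
    uidvalidity: int
    uids: List[int]  # ascending, as IMAP guarantees

    def uid_fetch_from(self, first_uid: int) -> List[int]:
        """Mimic 'UID FETCH first_uid:*'.

        '*' stands for the highest UID, so a non-empty folder always
        returns at least its last message, even if nothing is new.
        """
        if not self.uids:
            return []
        hits = [u for u in self.uids if u >= first_uid]
        return hits or [self.uids[-1]]


@dataclass
class FolderCache:
    """Hypothetical client-side state: one integer pair per folder."""
    uidvalidity: int
    last_seen_uid: int = 0


def sync(cache: FolderCache, folder: FakeFolder) -> List[int]:
    """Return the UIDs that appeared since the last sync; update cache."""
    if folder.uidvalidity != cache.uidvalidity:
        # The folder was recreated: cached UIDs are meaningless.
        cache.uidvalidity = folder.uidvalidity
        cache.last_seen_uid = 0
    fetched = folder.uid_fetch_from(cache.last_seen_uid + 1)
    # Drop the 'always returned' last message if it is not actually new.
    fresh = [u for u in fetched if u > cache.last_seen_uid]
    if fresh:
        cache.last_seen_uid = max(fresh)
    return fresh
```

With a folder holding UIDs 1, 2 and 3, the first sync returns all three,
a second sync returns nothing, and after one more event is appended only
that one UID comes back. Removal is left out here, just as in the text
above.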
> >> Euh, as far as I know, it is the client software that triggers an
> >> update of the free/busy, and not the Kolab server itself, and unless
> >> the client is multi-threaded like Kontact it is also a blocking
> >> operation.
> >
> > Sorry this is non-sense.
>
> Thank you for your balanced and well-formulated opinion.

I am sorry, but what else should I call it? This is not an opinion but a
trivially provable fact: for every Kolab client the trigger is by its
very nature non-blocking. After all, it is a trigger.

> As I've illustrated before, it's not like Kolab uses FPM or any other
> FastCGI-like implementation,

Don't think in terms of a web developer. Kolab does not require any of
those implementations in order to have non-blocking fb generation. (The
current implementation uses a daemon approach in order to avoid extra
patching of upstream resources, though this is an implementation
detail.)

> and it's not like the client can simply
> HEAD a URI and be done with it (close the connection).

But this is exactly what happens. Therefore I call your assumptions and
claims nonsense.

> > The main point here is that the Client trigger the update of the
> > partial freebusy data but they never wait nor block. (A trigger is
> > simply an http call which immediately returns and hints the server
> > that the freebusy needs to be updated)
>
> If only that were true. It sounds very good in theory, but theory is a
> place up north in Narnia. In the real world, there is nothing that
> "hints" the server and nothing to follow up on such "hint". It is the
> client that is actively involved and waiting for the Free/Busy
> information to be updated as part of the trigger URI it is hitting.

I will stop arguing now. Please check the code.

> > Sorry, but you really got things wrong. The basic idea behind Kolab
> > is NOT to think in terms of a relational database including terms of
> > doing queries all the time.
> >
> > This is the essential clue behind Kolab that it is so extremely
> > scalable.
> >
> > Introducing all these "query" concepts will lead to losing this
> > unique property.
>
> Well, unique != good and most certainly unique != best. At most, unique
> <> common.

In this case unique == good, and I consider it insulting that you claim
that the existing scalable solution is inferior to your "query" approach
while denying all evidence as seen in source and existing binaries.

IMHO query is slow, has scalability issues and should be avoided when
possible.

Leveraging guaranteed protocol semantics is good practice and upward
compatible. On the other hand, mapping everything onto a relational
database even though the underlying problem does not have relational
properties is abuse and leads at least to scalability issues.

> To be honest, the "extremely scalable" argument is starting to get to
> be completely wasted on me.

I accept that you do not care about scalability, but then please don't
ask for answers to scalability questions like having 10,000 events in a
single calendar.

From my experience both scalability and security MUST be designed into a
solution right from the beginning. Adding either later is extremely
cumbersome and most often not really solvable in a satisfactory manner.

> Every time it is used, it is used as the ultimate argument against
> something

Most of the time it is used, well founded, as an argument against
abusing traditional web technology. (E.g. large scalable web solutions
like Facebook, Google or Twitter moved away from traditional relational
databases long ago.)

> , but it misses merit in that the scalability parameter to a
> Kolab deployment is never removed nor reduced by any of the developments
> or ideas to move forward. While you may disagree with that, I have to
> conclude "no-SQL storage" is being confused and arbitrarily substituted
> with "caches, possibly in SQL".
This is plain wrong. There seems to be a fundamental misunderstanding.

I hope that I could provide some insight anyway. As I lack both time and
funding for actually working on Kolab 3, I hereby stop contributing to
this thread.

Maybe sometime we can meet at some conference and have a beer together
after first spending about an hour in front of a blackboard. I am
confident that you would then better understand what this fuss is all
about.

Yours,
-- martin

--
e r f r a k o n
Erlewein, Frank, Konold & Partner - Beratende Ingenieure und Physiker
Sitz: Adolfstraße 23, 70469 Stuttgart, Partnerschaftsregister Stuttgart PR 126
http://www.erfrakon.com/
_______________________________________________
Kolab-devel mailing list
Kolab-devel@kolab.org
https://www.intevation.de/mailman/listinfo/kolab-devel