From kolab-devel  Tue May 22 14:08:00 2012
From: Martin Konold <martin.konold () erfrakon ! de>
Date: Tue, 22 May 2012 14:08:00 +0000
To: kolab-devel
Subject: Re: [Kolab-devel] 10.000 events in a Resource Calendar
Message-Id: <7042580.mHD12xvLWt () linux-78uc ! site>
X-MARC-Message: https://marc.info/?l=kolab-devel&m=133769604626898

On Tuesday, 22 May 2012, 13:26:23, Jeroen van Meeuwen wrote:

Hi Jeroen,

> > No this is not a flaw in any way. A delete operation is handled
> > exactly like
> > and together with write operations. (E.g. EVERY modify is actually a
> > delete+write operation by nature of how Kolab storage works)
>
> Let's take a step back, because we're confusing the issue in OP.

What does 'OP' mean?

> The following *actually* happens when an event is deleted (whether "the
> idea behind 0(1)" design or not);
>
> - Adding or editing an event to a calendar obviously adds a new object
> to IMAP.

Correct.

> - To remove an event from a calendar, the message could be flagged
> \Deleted in IMAP, and (possibly) the folder is expunged (doesn't
> matter),

Yes. A remove is mapped to a \Deleted flag. Why do you consider the
obvious worthwhile to mention? How a delete is syntactically implemented
in IMAP does not really matter here. The IMAP4 spec uses this
implementation in order to make the very common delete operation
extremely fast. (It is fast because a delete does not actually change
anything in the store except setting a flag. Setting a flag avoids extra
seeks and filesystem overhead. In order to delete an IMAP message not
only semantically but for real, a potentially very expensive EXPUNGE
command is required.)
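
To illustrate the difference in cost, here is a minimal Python sketch
using the standard imaplib module. The helper names mark_deleted and
purge_deleted are made up for this example and are not Kolab code; conn
is assumed to be an authenticated connection with the folder already
selected.

```python
import imaplib


def mark_deleted(conn: imaplib.IMAP4, uid: int) -> None:
    """Cheap delete: only sets the \\Deleted flag on one message."""
    typ, _ = conn.uid("STORE", str(uid), "+FLAGS.SILENT", r"(\Deleted)")
    if typ != "OK":
        raise RuntimeError(f"UID STORE failed for UID {uid}")


def purge_deleted(conn: imaplib.IMAP4) -> None:
    """Potentially expensive: physically removes all flagged messages."""
    typ, _ = conn.expunge()
    if typ != "OK":
        raise RuntimeError("EXPUNGE failed")
```

A client would call mark_deleted for every removed event and leave
purge_deleted to an occasional maintenance run, if it is called at all.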

>    - This is *not* a write operation that adds a new object to IMAP.

Yes, it is not a write operation but a delete operation. Delete is
technically implemented via setting a delete flag in IMAP. Why does this
matter with regard to the scalability of a resource calendar?

>    It
> does bump UIDVALIDITY

No, this is plain wrong. Please reread the IMAP4rev1 RFC 3501,
http://tools.ietf.org/html/rfc3501#section-2.3.1.1. There it is explained
that the UIDVALIDITY has nothing to do with either adding or removing
items from an IMAP folder.

The UIDVALIDITY is a property of a folder, not of a message. Historically
the UIDVALIDITY was implemented in order to make the following uncommon
procedure safe:

1. Folder "foldername" created
2. Folder "foldername" populated with messages e.g. UID 1,2,3,4,5
3. Client A synchronises with "foldername"
4. Client B deletes messages with UID 1,2,3,4,5
5. Client B removes folder "foldername"
6. Client B creates folder "foldername"
7. Client B populates folder "foldername" with messages e.g. UID 1,2,3,4,5
8. Client A checks folder "foldername" and does not detect that the
messages with the previously existing UIDs actually did change. (It
correctly assumes that there is no modify.)

Solution:
Whenever a folder is created it gets not only a unique foldername but
also a unique UIDVALIDITY, in such a manner that the tuple (foldername,
UIDVALIDITY) is unique for every installation.

In other words, UIDVALIDITY ensures that the triple (foldername,
UIDVALIDITY, UID) is immutable for any IMAP installation!

Such an immutability guarantee is the foundation for correctness and
scalability.
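
As a sketch of what this guarantee buys a client: a cache can be kept
per folder and trusted only while the UIDVALIDITY reported by the server
matches the one stored with it. FolderCache and open_folder are
hypothetical names for this illustration and are not taken from any
Kolab client.

```python
from dataclasses import dataclass, field


@dataclass
class FolderCache:
    """Client-side state for one folder (illustrative, not Kolab code)."""
    uidvalidity: int
    last_seen_uid: int = 0
    objects: dict = field(default_factory=dict)  # UID -> cached payload


def open_folder(caches: dict, foldername: str,
                server_uidvalidity: int) -> FolderCache:
    """Return a usable cache, discarding it if the folder was recreated."""
    cache = caches.get(foldername)
    if cache is None or cache.uidvalidity != server_uidvalidity:
        # Same name but different UIDVALIDITY: the old UIDs mean nothing now.
        cache = FolderCache(uidvalidity=server_uidvalidity)
        caches[foldername] = cache
    return cache
```

In the eight-step scenario, Client A would see a new UIDVALIDITY in
step 8 and start from an empty cache instead of trusting UIDs 1 to 5.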


> , but... see below.
>
> - The *client* is to trigger the Free/Busy update,

Yes, this is implemented this way in order to keep the patchset small and
make the Kolab solution work with any unmodified standards-compliant
IMAP4 server.

(An alternative would be to extend either IMAP4 syntax or IMAP4 semantics.)

> - CONDSTORE (required for UIDVALIDITY) is not enabled on Kolab 2.3
> (Cyrus IMAP 2.3) mailboxes by default,

Sorry, this is technically plain wrong. CONDSTORE is no prerequisite of
UIDVALIDITY.

CONDSTORE is defined in RFC 4551 (June 2006, years after Kolab was
designed), which is much younger than UIDVALIDITY: the latter was already
part of IMAP4rev1 in RFC 2060 (December 1996) and is covered in RFC 2683
(September 1999).

> - The Free/Busy mechanism has little to hold on to, to see what has
> changed, unless it maintains a local cache of at least the UIDs of the
> message it used when it last generated the (partial) Free/Busy,

Keeping such a cache for optimisation purposes is trivial and common
practice. Actually it is not required for a scalable solution, but this
fact is a minor detail which could be discussed separately. The cache is
a negligibly small, simple list of 32-bit integers, e.g. 40K in the case
of 10.000 events.
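
The arithmetic can be checked quickly in Python. The UID values 1..10000
are an assumption for illustration; only the count matters.

```python
import struct

EVENTS = 10_000
uids = range(1, EVENTS + 1)                 # hypothetical UIDs 1..10000
packed = struct.pack(f"<{EVENTS}I", *uids)  # "<I" is a 4-byte unsigned int
print(len(packed))                          # 40000 bytes, i.e. roughly 40K
```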

> - Retrieval of relevant events to the relevant period in time could be
> made faster using sorting and retrieving the newest objects first,

This is common practice and trivial, but doing sorting is plain wrong and
slow.

A sorting approach is a typical relational database approach. There is NO
need to do any sorting if you leverage the IMAP protocol.

IMAP guarantees strictly monotonically increasing UID values. Due to the
fact that IMAP does NOT know a modify, every modified or new event
results in a new IMAP message which happens to have a UID > LASTSEENUID.
(For brevity I will not get into the details of removal.)

Therefore a simple "UID FETCH LASTSEENUID+1:*" is sufficient.
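
A minimal Python sketch of this rule with the standard imaplib module
follows. It uses UID SEARCH rather than FETCH to list the new UIDs, only
to keep the response parsing short; the function name new_uids is
hypothetical, and conn is assumed to be an authenticated connection with
the calendar folder selected.

```python
import imaplib
from typing import List


def new_uids(conn: imaplib.IMAP4, last_seen_uid: int) -> List[int]:
    """Return UIDs greater than last_seen_uid in the selected folder."""
    typ, data = conn.uid("SEARCH", "UID", f"{last_seen_uid + 1}:*")
    if typ != "OK":
        raise RuntimeError("UID SEARCH failed")
    found = [int(tok) for tok in (data[0] or b"").split()]
    # "n:*" always matches the highest existing UID, even when it is below n
    # (RFC 3501), so drop anything that is not actually new.
    return [uid for uid in found if uid > last_seen_uid]
```

No sorting and no date comparison is involved: the server-side UID order
alone tells the client which objects it has not seen yet.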

> - The client triggering Free/Busy does not simply HEAD a URL and
> disconnects

No, this claim is wrong. Of course this is exactly what the client does,
up to today.

> , as this would impede the slice of time any web server code
> has available to do what it needs to do. Therefore, a client keeps open
> the connection (and uses GET/POST) until the web server performing the
> Free/Busy updating is done. This is considered a blocking operation for
> clients that cannot do this in the background.

This is wrong. Please look at the code.

> >> Euh, as far as I know, it is the client software that triggers an
> >> update of the free/busy, and not the Kolab server itself, and unless
> >> the
> >> client is multi-threaded like Kontact it is also a blocking
> >> operation.
> >
> > Sorry this is non-sense.
>
> Thank you for your balanced and well-formulated opinion.

I am sorry, but what else should I call it? This is not an opinion but a
trivially provable fact: for every Kolab client the trigger is by its
very nature non-blocking. After all, it is a trigger.

> As I've illustrated before, it's not like Kolab uses FPM or any other
> FastCGI-like implementation,

Don't think in terms of a web developer. Kolab does not require any of
those implementations in order to have non-blocking fb generation. (The
current implementation uses a daemon approach in order to avoid extra
patching of upstream resources, though this is an implementation detail.)
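
The general shape of such a daemon approach can be sketched in Python as
below. This is an illustration only and not the Kolab implementation:
the names trigger, worker, pending and regenerated are all assumptions,
and the expensive free/busy regeneration is replaced by a placeholder.

```python
import queue
import threading

pending = queue.Queue()   # folders whose partial free/busy is stale
regenerated = []          # stands in for the real free/busy store


def trigger(folder: str) -> str:
    """What a trigger handler could do: record the hint and return at once."""
    pending.put(folder)
    return "accepted"


def worker() -> None:
    """Background daemon: regenerates free/busy outside any client request."""
    while True:
        folder = pending.get()
        regenerated.append(folder)   # placeholder for the expensive work
        pending.task_done()


threading.Thread(target=worker, daemon=True).start()
```

The point of the split is that trigger never waits for worker, so the
caller is not blocked no matter how long the regeneration takes.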

> and it's not like the client can simple
> HEAD a URI and be done with it (close the connection).

But this is exactly what happens. Therefore I call your assumptions and
claims nonsense.
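
On the client side, "HEAD a URI and be done with it" amounts to roughly
the following Python sketch using the standard http.client module. The
function name fire_trigger and any host or path passed to it are
hypothetical; the real trigger URL is whatever the Kolab server
publishes.

```python
import http.client


def fire_trigger(host: str, path: str,
                 connection_cls=http.client.HTTPSConnection,
                 timeout: float = 5.0) -> int:
    """Send a HEAD request to the trigger URL and close the connection."""
    conn = connection_cls(host, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status   # HEAD has no body to wait for
    finally:
        conn.close()
```

The connection_cls parameter exists only so that the sketch can be
exercised without a real server.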

> > The main point here is that the Client trigger the update of the
> > partial
> > freebusy data but they never wait nor block. (A trigger is simply an
> > http call which immediately returns and hints the server that the
> > freebusy needs to be
> > updated)
>
> If only that were true. It sounds very good in theory, but theory is a
> place up north in Narnia. In the real world, there is nothing that
> "hints" the server and nothing to follow up on such "hint". It is the
> client that is actively involved and waiting for the Free/Busy
> information to be updated as part of the trigger URI it is hitting.

I will stop arguing now. Please check the code.

> > Sorry, but you really got things wrong. The basic idea behind Kolab
> > is NOT to think in terms of a relational database including terms of
> > doing queries all
> > the time.
> >
> > This is the essential clue behind Kolab that it is so extremely
> > scalable.
> >
> > Introducing all these "query" concepts will lead to losing this
> > unique
> > property.
>
> Well, unique != good and most certainly unique != best. At most, unique
> <> common.

In this case unique == good, and I consider it insulting that you claim
that the existing scalable solution is inferior to your "query" approach
while denying all evidence as seen in source and existing binaries.

IMHO query is slow, has scalability issues and should be avoided when
possible.

Leveraging guaranteed protocol semantics is good practice and upwards
compatible. On the other hand, mapping everything onto a relational
database even though the underlying problem does not have relational
properties is abuse and leads at least to scalability issues.

> To be honest, the "extremely scalable" argument is starting to get to
> be completely wasted on me.

I accept that you do not care about scalability, but then please don't
ask for answers to scalability questions like having 10.000 events in a
single calendar.

From my experience both scalability and security MUST be designed into a
solution right from the beginning. Adding either later is extremely
cumbersome and most often not really solvable in a satisfactory manner.

> Every time it is used, it is used as the ultimate argument against
> something

Most of the time it is used, well founded, as an argument against abusing
traditional web technology. (E.g. large scalable web solutions like
Facebook, Google or Twitter moved away from traditional relational
databases long ago.)

> , but it misses merit in that the scalability parameter to a
> Kolab deployment is never removed nor reduced by any of the developments
> or ideas to move forward. While you may disagree with that, I have to
> conclude "no-SQL storage" is being confused and arbitrarily substituted
> with "caches, possibly in SQL".

This is plain wrong. There seems to be a fundamental misunderstanding.

I hope that I could provide some insight anyway. As I lack both time and
funding for actually working on Kolab 3, I hereby stop contributing to
this thread.

Maybe sometime we can meet at some conference and have a beer together,
after first spending about an hour in front of a blackboard. I am
confident that you would then understand better what this fuss is all
about.

Yours,
-- martin

-- 
e r f r a k o n
Erlewein, Frank, Konold & Partner - Beratende Ingenieure und Physiker
Sitz: Adolfstraße 23, 70469 Stuttgart, Partnerschaftsregister Stuttgart PR 126
http://www.erfrakon.com/

_______________________________________________
Kolab-devel mailing list
Kolab-devel@kolab.org
https://www.intevation.de/mailman/listinfo/kolab-devel