List:       imap
Subject:    Re: FYI: Cyrus IMAP server design
From:       Mark Crispin <MRC () Panda ! COM>
Date:       1993-07-26 18:28:54

Hi John -

> I'm not sure what I'd gain by this approach.  IMAP protocol handling
> isn't such a big deal and I can steal what I want from c-client's
> RFC-822 parsing code either way.

It seemed to me that a lot of what you are going to do is folder format
specifics, and that you might have something quicker if you weren't distracted
by other issues too quickly.  But, it's your choice.  I agree that IMAP
protocol handling isn't much -- I wrote imapd in about 2 days -- compared to
the much more complex issues of message parsing and folder handling.

> To permit "/", I would have to map it to something else to get the
> internal folder pathname.  I can't map it to "." because I use "." to
> keep from having conflicts between the names of files in a folder and
> the names of the directories for subfolders.

Yes.  But this is an internal implementation detail, and I think you're going
to need a mapping mechanism anyway.

> The "no non-ascii or shell metacharacter" rule is really a site policy
> restriction.  I should eventually make the folder name restrictions
> site-policy configurable.  8-bit characters are a problem on most unix
> filesystems--I'd have to map them.

It's probably alright to permit 8-bit characters only if the local filesystem
handles them, and then only in the local character set.  In other words, you
can't have a Japanese filename in Sweden, or a Swedish filename in Japan
(although Japanese e-mail is 7-bit ISO-2022-JP, Japanese filenames are 8-bit
EUC).

So, at CMU, you probably wouldn't have 8-bit characters on a campus-wide
facility, but maybe the Japanese studies department machine might.

> Consider the following exchange:
> >>> tag FETCH 7 FLAGS
> <<< * 3 EXPUNGE
> <<< * 7 FETCH (FLAGS (/SEEN))
> <<< tag OK Fetch complete

This sounds like a bug in the IMAP specification; obviously this is an
ambiguous sequence.  Perhaps it needs to be made explicit that you cannot
give unsolicited EXPUNGE results until after completing the current request;
that is, the proper order is:
	tag FETCH 7 FLAGS
	* 7 FETCH (FLAGS (/SEEN))
	* 3 EXPUNGE
	tag OK Fetch complete
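
To make the ambiguity concrete, here is a minimal sketch in Python.  The
helper name apply_expunge and the eight-message folder are made up for
illustration; they are not taken from any server or client.  It only shows
how sequence numbers shift when an EXPUNGE is reported before the FETCH data:

```python
# Hypothetical client-side view of a folder: position i holds the
# message that currently has sequence number i + 1.
def apply_expunge(messages, seq):
    """Return the folder as it looks after '* seq EXPUNGE'."""
    return messages[:seq - 1] + messages[seq:]

folder = ["m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8"]

# The client asked about sequence number 7 before it knew of the expunge.
asked_about = folder[7 - 1]

# The server reports '* 3 EXPUNGE' first; everything above 3 moves down.
after = apply_expunge(folder, 3)

# '* 7 FETCH' read against the renumbered folder is a different message.
answered_about = after[7 - 1]

print(asked_about, answered_about)
```

The client cannot tell from the wire alone whether the 7 in the FETCH result
refers to the old numbering or the new one, which is why the order of the two
untagged responses matters.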

There's a more serious matter, which is that this makes it unfeasible to do
streamed FETCH commands; that is, doing multiple FETCH commands without
waiting for the results.  I remember that Stanford was really big on streaming
commands, even though flow control greatly limits its usefulness.  I don't
know if it was ever actually implemented in any clients.  Perhaps the whole
streaming concept should be just tossed?  If you really think streaming is
important, IMAP should be UDP-based instead of TCP-based.

> > make the message numbers be unique on a system-wide basis,
> > instead of unique per folder.
> Again, I'm not sure what this buys me.  It adds the cost of having to
> lock a global file in order to assign a new number.

I don't understand why a global file needs to be locked, as long as foo++ is
an atomic operation and there is only one instance of foo in the system.  You
could do this by implementing a system call saying ``give me a foo''.  Some
flavors of UNIX already have such an operation, I think.
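
A minimal sketch of the ``give me a foo'' idea in Python, assuming every
writer goes through one allocator.  The class UidAllocator and the deliver
function are hypothetical, and the in-process lock merely stands in for
whatever atomic increment the system would actually provide:

```python
import threading

class UidAllocator:
    """Hypothetical single system-wide counter ("give me a foo")."""

    def __init__(self, start=1):
        self._next = start
        self._lock = threading.Lock()  # stands in for an atomic foo++

    def allocate(self):
        # Read-and-increment as one indivisible step.
        with self._lock:
            uid = self._next
            self._next += 1
            return uid

allocator = UidAllocator()
issued = []

def deliver(count):
    # Each simulated delivery asks the allocator for one new UID.
    for _ in range(count):
        issued.append(allocator.allocate())

threads = [threading.Thread(target=deliver, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(issued), len(set(issued)))
```

No file is locked here; the only shared state is the counter itself, and
concurrent callers still never receive the same number twice.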

The reason for doing this is that then you'll have a UID that is valid for all
instances of the same message.  That'd be rather neat for disconnected-use
operation, since then you have UID uniqueness across multiple folders instead
of just within a single folder, and you can build a copy of the linked model
on the client.

It also makes cross-post handling trivial: if message 12345 is read, then it
is read, and you don't have to deal with cross references to eliminate
duplicates (I am *not* looking forward to writing xref and threading code in
the c-client based server).
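
As a rough illustration, here is how read state could look if it were keyed
by a globally unique UID.  The folder names, the UIDs other than 12345, and
the helpers mark_read and unread are all invented for the example:

```python
# Hypothetical store: folders hold UIDs only, and "read" state is
# keyed by UID rather than by (folder, message number).
folders = {
    "folder.a": [12344, 12345],
    "folder.b": [12345, 12346],  # 12345 is cross-posted to both
}
seen = set()

def mark_read(uid):
    """Record that the message with this UID has been read."""
    seen.add(uid)

def unread(folder):
    """List the UIDs in a folder that have not been read yet."""
    return [uid for uid in folders[folder] if uid not in seen]

mark_read(12345)  # read once, in either folder
print(unread("folder.a"), unread("folder.b"))
```

Reading the message once clears it from both folders, with no cross-reference
lookup to find the duplicate.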

I guess the bottom line is, there's a chance to do it at this point with
fairly small cost; at worst, it will turn out to be unnecessary.  The cost of
not doing it is losing a characteristic that may turn out to be important.  I
can't think of any advantages of not having the message UIDs be globally
unique, other than perhaps a space size issue.  But 2^32 should be enough for
today's needs, and moving to 2^64 should not be rocket science should it ever
happen that a single repository ends up having processed more than 2^32
messages!!!  By the time that happens, we can hope that ANSI will give us
``double long'' in C like I've been wanting for years....

-- Mark --
