
List:       cyrus-info
Subject:    Cyrus 2.5 status
From:       Bron Gondwana <brong () fastmail ! fm>
Date:       2014-12-25 0:25:49
Message-ID: 1419467149.1596940.206579429.16C6C7C0 () webmail ! messagingengine ! com

And so it's Christmas, and 2.5 isn't out yet.

In great news, the code running at FastMail is now fully rebased on top of 2.5.  I'm
really happy with the state of almost everything.

What's still to do is fixing the replication code, the same thing that has been an
issue since forever.  Maybe the best thing is to just revert to the 2.4 protocol and
ignore the new fields totally for the initial release.  They will still replicate,
just not be protected by the sync_crc.

There are also fixes to pick back from the FastMail branch.  For the past few weeks
I've been focused on getting things ready for the carddav release, and not so much on
making those changes maintainable upstream.

I am really sorry to everyone about the state of the unix hierarchy separator and alt
namespace stuff.  Well-meaning but misguided fixes have just made it worse.  It's
exactly the same problem that every web programmer deals with - you need to "entity
encode" exactly once.  I have the correct fix for this in progress... basically it's
this:

1) on disk/in database format changes so that the separator is a control character
   (less than space, so there's no need for improved_mboxlist_sort)
2) in-memory format is ALWAYS a 'struct mboxname_parts' (short name: mbname_t).
   This format is all individual strings, with the mailbox name being a strarray_t,
   so no separators encoded in it.
3) the external format is the only thing that depends on the configuration.
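
The three formats above can be sketched roughly as follows. This is only an illustration in Python, not the Cyrus C code: the `MboxName` class, its method names, and the choice of `\x1f` as the control character are assumptions made for the example, and rejecting a part that contains the external separator is a placeholder for whatever escaping rule the real fix uses.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed separator: any control character below space (0x20) would do.
INTERNAL_SEP = "\x1f"


@dataclass(frozen=True)
class MboxName:
    """In-memory form: individual strings, no separator encoded anywhere."""
    parts: Tuple[str, ...]

    def to_internal(self) -> str:
        """On-disk/database form; never depends on configuration."""
        return INTERNAL_SEP.join(self.parts)

    @classmethod
    def from_internal(cls, key: str) -> "MboxName":
        return cls(tuple(key.split(INTERNAL_SEP)))

    def to_external(self, sep: str) -> str:
        """Client-facing form; the only place the configured separator appears."""
        for part in self.parts:
            if sep in part:
                # Placeholder policy: a real server needs an escaping rule here.
                raise ValueError(f"{part!r} contains the separator {sep!r}")
        return sep.join(self.parts)

    @classmethod
    def from_external(cls, name: str, sep: str) -> "MboxName":
        return cls(tuple(name.split(sep)))


name = MboxName.from_external("INBOX/Sub/Folder", "/")
unix_style = name.to_external("/")      # unix hierarchy separator configured
dotted_style = name.to_external(".")    # traditional separator configured

# With a separator below space, plain sorting keeps children next to the parent.
control_sorted = sorted(["foo bar", "foo", "foo" + INTERNAL_SEP + "baz"])
dot_sorted = sorted(["foo bar", "foo", "foo.baz"])
```

In this sketch `control_sorted` places the child of `foo` directly after `foo`, while `dot_sorted` lets the sibling `foo bar` sort in between, which is the ordering problem a control-character separator avoids.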

Along with this are major changes to how LIST works (yes, again) - this time with a
serious eye to passing all of imaptest.org's tests.

Rob M and I sat down the other day and created a giant whiteboard full of things that
we want to see in Cyrus for the future.  We are planning to employ somebody to work
full time on this:

https://www.fastmail.com/about/jobs/2015-01-cyrus.html

Here's a typed-up version of the list:

* Unix HS and Alt Namespace => make consistent (see above)
* mailboxes.db format:
  * U[]foo.bar[]Sub[]Folder (for user namespace)
  * S[]shared[]folder (for shared namespace) - so that user NS isn't a sub-part of
    shared NS, speeds up listing.
  * domains as part of user: U[]foo.bar@domain.com[]Trash
  * $ => version key for tracking contents of mailboxes.db - always read at startup
    (we use the same trick in conversations.db)
* FAST reverse ACL map:
  * U:$userid => folders with ACLs
  * G:$groupname => folders with ACLs
  * combine those folders, eliminate common prefixes, search just those prefixes.
  - Makes LIST fast, even on big servers/giant murders.
* Mailbox on-disk paths == folder uniqueid
  * fast, atomic rename - including multiple folders
  * fix delayed_delete to just keep old uniqueid in mailboxes.db => no DELETED.
    prefix
  * fast undelete of entire folders
  * store current mailbox name inside cyrus.header for reconstruct
  * only works now that we store uniqueid in mailboxes.db (DLIST format)
* Sieve standards support => vacation time period, etc.  Also check other features
  for latest standard compatibility, e.g. imap4flags
* per-message annotations: change format to be more like cyrus.cache: offset based,
  MVCC updatable such that QRESYNC and QUOTA are reliable.
* UNIFIED MURDER + sync:

**** THIS IS THE BIG ONE ****

I have dreamed of this forever.  It's a giant job.  Basically store multiple
locations in mailboxes.db for a folder.  This combines replication with murder, and
sync_client needs a manager so that you can create arbitrary sync patterns.
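
To make the idea of several locations per folder concrete, here is a small Python sketch that builds keys in the mailboxes.db format listed earlier and attaches a list of locations to each one. It makes several assumptions: that "[]" in the key examples stands for the control-character separator, that `\x1f` is that character, and that a record holds a `locations` list. The record layout, server names and uniqueid values are all invented for illustration.

```python
# Assumption: "[]" in the key examples stands for the control-character separator.
SEP = "\x1f"


def user_key(userid, *folders):
    """Key for a folder in the user namespace, e.g. U[]foo.bar[]Sub[]Folder."""
    return SEP.join(("U", userid) + folders)


def shared_key(*folders):
    """Key for a folder in the shared namespace, e.g. S[]shared[]folder."""
    return SEP.join(("S",) + folders)


# Hypothetical record layout: each folder lists every server holding a copy.
mailboxes_db = {
    user_key("foo.bar@domain.com", "Trash"): {
        "uniqueid": "example-uniqueid-1",
        "locations": ["backend1", "backend2"],
    },
    shared_key("shared", "folder"): {
        "uniqueid": "example-uniqueid-2",
        "locations": ["backend2"],
    },
}


def locations(db, key):
    """All servers that hold the folder (empty list if it is unknown)."""
    return db.get(key, {}).get("locations", [])


def list_namespace(db, prefix):
    """Keys under one namespace prefix.

    In a sorted database this would be a range scan over that prefix; a dict
    needs a full pass, so this only shows which keys match.
    """
    return sorted(k for k in db if k.startswith(prefix + SEP))


trash_locations = locations(mailboxes_db, user_key("foo.bar@domain.com", "Trash"))
shared_only = list_namespace(mailboxes_db, "S")
```

A sync manager could then treat any pair of entries in `locations` as a replication edge, which is one way to read "arbitrary sync patterns".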

Sub parts:
  * sync_server in imapd (Ken's XFER-sync work ported from 2.4)
  * generic change-log system (sync_log, squatter log, etc from current FastMail
    code, plus extras)
  * sync_client manager that reads.

* central cleanup task:
  * instead of running repack/cleanup/etc at mailbox_close, we log that it's needed
    and let the current task continue.
  * a background daemon tries (non-blocking lock) to pick up the exclusive lock to do
    the repack, meaning that clients never pay the delay themselves.
    Also fits with:
* short-locks for unlink
  * at the moment, we take an exclusive lock for the ENTIRE time that we're unlinking
    deleted messages from a folder.  That can be quite slow, because unlink is slow on
    most filesystems.  We need the exclusive lock to ensure no other task still expects
    to be able to read the file... BUT, we only need the exclusive lock for a moment to
    ensure nobody else held the lock over this time.  We can release it straight away and
    know that the files which were seen with FLAG_UNLINKED during the lock can be safely
    deleted, because nobody can remember them as existing any more.
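
The two ideas above (a background task that only tries for the lock, and a lock held just long enough to note which files are already unlinked) can be sketched together. This is a Python illustration using `flock`, not the Cyrus locking code. The `cleanup_once` function and the `index` dictionary (file path -> set of flags) are hypothetical stand-ins for the mailbox index.

```python
import fcntl
import os

FLAG_UNLINKED = "FLAG_UNLINKED"


def cleanup_once(lock_path, index):
    """Try to clean one mailbox; return files removed, or None if it was busy.

    `index` is a hypothetical dict mapping message file path -> set of flags.
    """
    with open(lock_path, "a") as lock:
        try:
            # Non-blocking: the daemon backs off instead of delaying a client.
            fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        try:
            # Short critical section: only snapshot what is already unlinked.
            doomed = [p for p, flags in index.items() if FLAG_UNLINKED in flags]
            for path in doomed:
                del index[path]
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)

    # The slow unlink() calls happen with no lock held.
    removed = 0
    for path in doomed:
        try:
            os.unlink(path)
            removed += 1
        except FileNotFoundError:
            pass
    return removed
```

A daemon built this way would call `cleanup_once` for each mailbox that was logged as needing cleanup and retry later whenever it returns `None`.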

* sync-state cache
  * right now, we always query the replica for the current mailbox state before
    sending a SYNC APPLY.  In the general case, the replica won't have changed since the
    last sync.  We could cache the remote state in a local database, and send an
    optimistic apply.  If the old state hasn't changed, the apply could happen
    immediately.  Along with optimistic reserve, we can apply changes in a single round
    trip, instead of the current 3.
  * change sync_client to do partial user sync rather than grouping mailboxes across
    users - means a single lock for user-level database updates (calendar sync-token,
    conversations, etc)
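
The optimistic apply can be illustrated with a toy model. The Python sketch below is not the sync protocol: the `Replica` class, the opaque state tokens and the `sync_mailbox` function are assumptions, and the reserve step is left out. It only shows the compare-and-apply shape, where the sender states what it believes the replica holds and falls back to a query when that belief is stale.

```python
class Replica:
    """Stand-in for the remote end; `state` maps mailbox -> opaque state token."""

    def __init__(self):
        self.state = {}
        self.round_trips = 0

    def get_state(self, mailbox):
        self.round_trips += 1
        return self.state.get(mailbox)

    def apply(self, mailbox, expected, new_state):
        """Apply only if the replica is still in the state the sender expects."""
        self.round_trips += 1
        if self.state.get(mailbox) != expected:
            return False
        self.state[mailbox] = new_state
        return True


def sync_mailbox(replica, cache, mailbox, new_state):
    """Return 'optimistic' or 'fallback' depending on which path succeeded."""
    if mailbox in cache and replica.apply(mailbox, cache[mailbox], new_state):
        cache[mailbox] = new_state
        return "optimistic"
    # Cache missing or stale: ask the replica, then apply against its answer.
    current = replica.get_state(mailbox)
    if not replica.apply(mailbox, current, new_state):
        raise RuntimeError("replica changed between query and apply")
    cache[mailbox] = new_state
    return "fallback"


replica, cache = Replica(), {}
first = sync_mailbox(replica, cache, "user.foo", "state-1")    # no cache yet
trips_after_first = replica.round_trips
second = sync_mailbox(replica, cache, "user.foo", "state-2")   # cache is warm
trips_for_second = replica.round_trips - trips_after_first
```

In this model the warm-cache sync costs one round trip, and a stale cache costs one wasted apply before the usual query-then-apply path.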

* Conversations mark 2 - FastMail have plans to fix our conversations implementation
  to be better, then push that upstream.  There's work underway to standardise THRID
  and MSGID the way that Gmail do it, and our conversations would be compatible.

* Search:
  - get the existing Xapian stuff upstreamed.
  - external provider support: e.g. elastic search.

* Archive:
  - FastMail supports archiving parts of the mailbox to a different disk.  It's how
    we keep the first week's email on SSD while storing older emails on
    big slow SATA.
  - Make this more general and allow storing old email to a central object store, so
    indexes are replicated and emails are stored in a separate replicated system.
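
As a rough picture of the age-based split described above, the following Python sketch picks a storage root from a message's age. The seven-day cutoff comes from "the first week" in the text; the `storage_root` function and both mount points are made-up examples, not Cyrus configuration.

```python
from datetime import datetime, timedelta

ARCHIVE_AFTER = timedelta(days=7)   # "the first week" from the text


def storage_root(internaldate, now, fast="/mnt/ssd", slow="/mnt/sata"):
    """Pick the disk for a message; both mount points are made-up examples."""
    return slow if now - internaldate > ARCHIVE_AFTER else fast


now = datetime(2014, 12, 25)
recent = storage_root(datetime(2014, 12, 24), now)
old = storage_root(datetime(2014, 11, 1), now)
```

Generalising this to an object store would mean returning something other than a local path for old messages, while the index stays where it is.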

* Backups
 - backup format based on replication protocol
 - optional inline blobs for the rfc822 messages or index them separately

* JMAP (http://jmap.io/) support directly in Cyrus

* Sane Restart/Failover process.

* Nginx authentication backend

This is actually really awesome with the unified murder above.  You could run an
nginx non-blocking proxy on every frontend, which uses the mailboxes.db to find the
correct backend for the user, then proxies their connection to the right server.
This then means that you don't have tons of processes running on the frontends that
are just proxying to another full-weight imapd, but you get the advantages of murder
too - since it's unified, the backends have the full mailboxes.db and can connect
through to other backends directly for shared folders which aren't on the same
machine.
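
For context, nginx's mail proxy asks an HTTP service where to send each connection (the `auth_http` directive). It passes request headers such as `Auth-User` and `Auth-Pass` and expects `Auth-Status`, `Auth-Server` and `Auth-Port` in the reply. Below is a minimal Python sketch of the decision such a service would make. The `backend_for_user` and `check_password` callables are hypothetical stand-ins for a mailboxes.db lookup and an authentication check, and the addresses are documentation examples.

```python
def auth_http_response(headers, backend_for_user, check_password):
    """Build the reply headers for one nginx auth_http request.

    `backend_for_user(user)` returns (ip, port) or None, and
    `check_password(user, password)` returns a bool; both are assumed helpers.
    """
    user = headers.get("Auth-User", "")
    password = headers.get("Auth-Pass", "")
    if not check_password(user, password):
        # Any Auth-Status other than "OK" is reported to the client as an error.
        return {"Auth-Status": "Invalid login or password", "Auth-Wait": "3"}
    backend = backend_for_user(user)
    if backend is None:
        return {"Auth-Status": "Temporary server problem, try again later"}
    ip, port = backend
    # Auth-Server is given as an IP address, not a hostname.
    return {"Auth-Status": "OK", "Auth-Server": ip, "Auth-Port": str(port)}


# Toy lookup table standing in for "find the correct backend for the user".
backends = {"foo.bar@domain.com": ("192.0.2.10", 143)}
reply = auth_http_response(
    {"Auth-User": "foo.bar@domain.com", "Auth-Pass": "secret",
     "Auth-Protocol": "imap"},
    backends.get,
    lambda user, password: password == "secret",
)
```

The frontend then only runs nginx plus this small lookup service, and the full imapd processes stay on the backends.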

I have ideas around backend failover and handover through nginx as well, but they are
longer term dreams...


So there's tons of work to go on with :)

Bron. 





-- 
  Bron Gondwana
  brong@fastmail.fm
----
Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe:
https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus

