[prev in list] [next in list] [prev in thread] [next in thread]
List: sqlite-users
Subject: Re: [sqlite] About clustering (GUIDs, PK offsets, etc.)
From: "Jay A. Kreibich" <jay () kreibi ! ch>
Date: 2009-01-02 5:47:26
Message-ID: 20090102054726.GA81750 () uiuc ! edu
[Download RAW message or body]
On Mon, Dec 29, 2008 at 04:11:53PM +0300, Alexey Pechnikov scratched on the wall:
> I did find interesting article about this popular question
> http://blog.maxindelicato.com/2008/12/how-to-organize-a-database-tables-keys-for-scalability.html
It isn't specifically called out, but an underlying theme in this article
is that none of the suggested techniques (with the exception of
"Auto-Increment Offset and Increment") require database-specific features
or extensions.
In fact, in regards to GUIDs and how they're generated, the article
is pretty specific that this is usually the developer's problem. To
quote: "This is often accomplished by generating a GUID within the
application layer...".
> I did create ticket "GUID primary keys" and think it's may be very good feature:
> http://www.sqlite.org/cvstrac/tktview?tn=3557
As the article mentions, the most generic way to do GUIDs is within
the application, not the database shard instances that make up the
global meta-database.
The difficulty with this is that you need the ability to generate
GUIDs without a lot of context, since there can be multiple
application clients generating GUIDs-- even if you're using a
middle-ware layer, like a webserver. As a result (and the fact that
GUIDs really, truly, are "global" and not just "cluster-wide"), GUIDs
tend to be (physically) huge. See the article they link to for more
details.
SQLite doesn't support the "AUTO_INCREMENT_INCREMENT" configuration
talked about in the article, but SQLite's ROWIDs are big enough it
might not matter.
If you use their described technique of AUTO_INCREMENT offset and
increment values, you're essentially storing the shard-id in the
lower few bits of a normal ROWID. For example, in their example they
have four shard databases, essentially making the lower two bits of
each "global" ROWID a shard-id. You can use any size increment in their
description, but the application code and math is easier if you use a
power of two, allowing you to use bit masks rather than mods to extract
the shard-id. The advantage of the offset/increment system is that
it is flat-out impossible for the ROWIDs to collide if everything is
working correctly.
You can do something similar with SQLite, you just need to use the top
end (and a little faith), rather than the bottom end of the ROWID. For
example, mask off the top 8 or 16 bits of a ROWID as a shard-id. If
you use SQLite's INTEGER PRIMARY KEY AUTOINCREMENT feature, you can
then "seed" the table with the first ROWID value by inserting (and
then deleting) a fake row**. Seeding also allows you to trivially
use negative ROWID values, getting full use out of the 64 bit ROWID
domain.
The AUTOINCREMENT keyword changes the way SQLite allocates ROWIDs
(see http://www.sqlite.org/autoinc.html). You'll burn through
ROWIDs faster (since they're never re-used) but you'll also be
assured that the automatic assignment of ROWIDs won't drop back to 1
if all the rows in a table are deleted. The "never re-used" limit is
rarely an issue anyways, since you usually use a shard architecture
for tables you never (or rarely) delete data from.
** or, you can use the undocumented/unsupported method of issuing
this command *immediately* following the CREATE TABLE command:
INSERT INTO sqlite_sequence VALUES (<table_name>, <initial_value>);
Where <initial_value> is something like (<shard_id> << 48).
There is, in theory, some danger, however. Since each database is
independently allocating ROWIDs with no inherent restrictions, it is
possible for one shard to cross over into the next shard's pool of
ROWID values. In practice, that's unlikely, however.... If you use
a 16 bit field for shard-ids, that leaves 48 bits for "actual" per-shard
ROWIDs. That's "only" 281,474,976,710,656 rows per shard-- or about
892K inserts per second for 10 years. Not likely a problem.
-j
--
Jay A. Kreibich < J A Y @ K R E I B I.C H >
"Our opponent is an alien starship packed with atomic bombs. We have
a protractor." "I'll go home and see if I can scrounge up a ruler
and a piece of string." --from Anathem by Neal Stephenson
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic