List: mysql-internals
Subject: Re: Dynamic record sizes for HEAP engine
From: Igor Chernyshev <igor_cc75 () yahoo ! com>
Date: 2008-07-31 19:36:36
Message-ID: 227216.78645.qm () web65711 ! mail ! ac4 ! yahoo ! com
Update for those interested in this patch --
Larry Zhou from Google and I have made a few bug fixes.
The diff and zip files have been updated.
Original diff can be found in "Deprecated" downloads.
It's still based on MySQL 5.0.45.
http://code.google.com/p/mysql-heap-dynamic-rows
Thanks,
Igor
--- On Thu, 4/17/08, Igor Chernyshev <igor_cc75@yahoo.com> wrote:
> From: Igor Chernyshev <igor_cc75@yahoo.com>
> Subject: Re: Dynamic record sizes for HEAP engine
> To: "Igor Chernyshev" <igor_cc75@yahoo.com>, "Sergei Golubchik" <serg@mysql.com>
> Cc: internals@lists.mysql.com
> Date: Thursday, April 17, 2008, 3:55 PM
> Forgot to mention in my previous post - as indicated
> on http://code.google.com/p/mysql-heap-dynamic-rows ,
> this patch has been contributed by eBay, Inc via GPL
> v2. (and, yes, I wrote the code : ))
>
> Thanks,
> Igor
>
> --- Igor Chernyshev <igor_cc75@yahoo.com> wrote:
>
> > I've uploaded the patch to code.google.com. It is
> > based on 5.0.45. See project description, as well as
> > DesignDetails, Usage and PatchFormat Wiki pages for
> > more information.
> >
> > Note that BLOB support should now be easy to add. This patch
> > provides a new HP_DATASPACE structure, which could be instantiated
> > for the table's BLOB area. As another option, BLOB data could be
> > embedded into the records themselves.
> >
> > http://code.google.com/p/mysql-heap-dynamic-rows
> >
> > Below is a copy of design notes from dspace.c (same as
> > DesignDetails Wiki).
> >
> > Thanks,
> > Igor
> >
> > ================
> > MySQL Heap tables keep data in arrays of fixed-size chunks.
> > These chunks are organized into two groups of HP_BLOCK structures:
> > - group1 contains indexes, with one HP_BLOCK per key
> >   (part of HP_KEYDEF)
> > - group2 contains record data, with a single HP_BLOCK for all
> >   records, referenced by HP_SHARE.recordspace.block
> >
> > While columns used in indexes are usually small, other columns
> > in the table may need to accommodate larger data. Typically,
> > larger data is placed into VARCHAR or BLOB columns. With actual
> > sizes varying, Heap Engine has to support variable-sized records
> > in memory. Heap Engine implements the concept of dataspace
> > (HP_DATASPACE), which incorporates HP_BLOCK for the record data,
> > and adds more information for managing variable-sized records.
> >
> > Variable-size records are stored in multiple "chunks",
> > which means that a single record of data (database "row") can
> > consist of multiple chunks organized into one "set". HP_BLOCK
> > contains chunks. In variable-size format, one record
> > is represented as one or many chunks, depending on the actual
> > data, while in fixed-size mode, one record is always represented
> > as one chunk. The index structures always point to the first
> > chunk in the chunkset.
> >
> > At the time of table creation, Heap Engine attempts to find out
> > if variable-size records are desired. A user can request
> > variable-size records by providing either row_type=dynamic or
> > block_size=NNN table create option. Heap Engine will check
> > whether block_size provides enough space in the first chunk
> > to keep all null bits and columns that are used in indexes.
> > If block_size is too small, table creation will be aborted
> > with an error. Heap Engine will revert to fixed-size allocation
> > mode if block_size provides no memory benefits (no VARCHAR
> > fields extending past first chunk).
> >
> > In order to improve index search performance, Heap Engine needs
> > to keep all null flags and all columns used as keys inside
> > the first chunk of a chunkset. In particular, this means that
> > all columns used as keys should be defined first in the table
> > creation SQL. The length of data used by null bits and key columns
> > is stored as fixed_data_length inside HP_SHARE. fixed_data_length
> > will extend past the last key column if more fixed-length fields
> > can fit into the first chunk.
> >
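The create-time checks described in the two paragraphs above could be sketched roughly as follows. This is only an illustration: choose_format, the create_result enum and the full_record_length parameter are hypothetical names, not taken from the patch, and "no memory benefit" is approximated here as "the whole record already fits into one chunk".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome of the create-time check (not the patch's API). */
enum create_result { CREATE_ERROR = -1, CREATE_FIXED = 0, CREATE_VARIABLE = 1 };

/* block_size:         requested chunk data size (table create option)
   fixed_data_length:  bytes used by null bits plus all key columns
   full_record_length: size of a complete fixed-format record */
static enum create_result choose_format(size_t block_size,
                                        size_t fixed_data_length,
                                        size_t full_record_length)
{
    assert(fixed_data_length <= full_record_length);

    /* Null bits and key columns must fit into the first chunk. */
    if (block_size < fixed_data_length)
        return CREATE_ERROR;

    /* Assumption: nothing extends past the first chunk, so chunking
       saves no memory and fixed-size mode is used instead. */
    if (block_size >= full_record_length)
        return CREATE_FIXED;

    return CREATE_VARIABLE;
}
```

For example, with 20 bytes of null bits and key columns and a 300-byte full record, a block_size of 16 would be rejected, 64 would select variable-size mode, and 512 would fall back to fixed-size mode.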
> > Variable-size records are necessary only in the presence
> > of variable-size columns. Heap Engine will be looking for VARCHAR
> > columns, which declare length of 32 or more. If no such columns
> > are found, table will be switched to fixed-size format. You should
> > always try to put such columns at the end of the table definition.
> >
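The VARCHAR rule above amounts to a simple scan over the column list. A minimal sketch, assuming a hypothetical column_info descriptor (the real patch works on MySQL's own column definitions, which are not shown in these notes); only the threshold of 32 comes from the text:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified column descriptor. */
typedef struct {
    int      is_varchar;       /* nonzero for VARCHAR columns */
    unsigned declared_length;  /* length declared in the table definition */
} column_info;

/* Threshold quoted in the design notes. */
#define MIN_VARCHAR_LENGTH 32u

/* Returns nonzero if at least one column justifies variable-size records. */
static int has_large_varchar(const column_info *cols, size_t ncols)
{
    size_t i;
    assert(cols != NULL || ncols == 0);
    for (i = 0; i < ncols; i++) {
        if (cols[i].is_varchar && cols[i].declared_length >= MIN_VARCHAR_LENGTH)
            return 1;
    }
    return 0;   /* table would be switched to fixed-size format */
}
```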
> > Whenever data is being inserted or updated in the table,
> > Heap Engine will calculate how many chunks are necessary.
> > For insert operations, Heap Engine allocates a new chunkset in
> > the recordspace. For update operations it will modify the length
> > of the existing chunkset, unlinking unnecessary chunks at the end,
> > or allocating and adding more if a larger length is necessary.
> >
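The chunk-count arithmetic is not spelled out in the notes. A plausible sketch is a ceiling division plus a signed difference for updates; chunks_needed and chunk_delta are hypothetical helper names, and the rule that an empty record still occupies one chunk is an assumption made here so that the index always has a first chunk to point to:

```c
#include <assert.h>
#include <stddef.h>

/* Number of chunks needed to hold record_len bytes when every chunk
   carries chunk_data_len bytes of payload. */
static size_t chunks_needed(size_t record_len, size_t chunk_data_len)
{
    assert(chunk_data_len > 0);
    if (record_len == 0)
        return 1;   /* assumption: a row always owns its first chunk */
    return (record_len + chunk_data_len - 1) / chunk_data_len;
}

/* For an update: positive means chunks to allocate and link at the end,
   negative means chunks to unlink from the end, zero means no change. */
static long chunk_delta(size_t old_len, size_t new_len, size_t chunk_data_len)
{
    return (long) chunks_needed(new_len, chunk_data_len)
         - (long) chunks_needed(old_len, chunk_data_len);
}
```

With 32-byte chunks, a 100-byte row would take 4 chunks, and shrinking it to 40 bytes would unlink 2 of them.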
> > When writing data to chunks or copying data back to record,
> > Heap Engine will first copy fixed_data_length of data using single
> > memcpy call. The rest of the columns are processed one-by-one.
> > Non-VARCHAR columns are copied in their full format. VARCHAR's
> > are copied based on their actual length. Any NULL values after
> > fixed_data_length are skipped.
> >
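The copy order described above can be illustrated with a simplified packer that writes into one flat buffer rather than walking a chunkset. Everything here is an assumption made for the sketch: col_desc and pack_record are hypothetical, NULL-ness is passed in as a flag instead of being read from the record's null bits, and VARCHAR values are assumed to carry a 1-byte length prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char byte;

/* Hypothetical column descriptor; not the patch's real structure. */
typedef struct {
    size_t offset;      /* offset of the column inside the full record */
    size_t length;      /* full in-record length, including length prefix */
    int    is_varchar;  /* nonzero: first byte is a 1-byte length prefix */
    int    is_null;     /* nonzero: value is NULL in this row */
} col_desc;

/* Packs rec into dst and returns the number of bytes written.
   dst must be at least as large as the full record. */
static size_t pack_record(byte *dst, const byte *rec, size_t fixed_data_length,
                          const col_desc *cols, size_t ncols)
{
    size_t out = fixed_data_length;
    size_t i;

    /* Null bits, key columns and any other leading fixed fields: one memcpy. */
    memcpy(dst, rec, fixed_data_length);

    for (i = 0; i < ncols; i++) {
        const col_desc *c = &cols[i];
        size_t n;

        if (c->offset < fixed_data_length)
            continue;               /* already covered by the memcpy above */
        if (c->is_null)
            continue;               /* NULL values take no space */

        /* VARCHAR: prefix plus actual bytes; other types: full length. */
        n = c->is_varchar ? 1 + (size_t) rec[c->offset] : c->length;
        assert(n <= c->length);
        memcpy(dst + out, rec + c->offset, n);
        out += n;
    }
    return out;
}
```

Unpacking back into a record would walk the same column list in the same order, which is why the skip rules have to match exactly on both sides.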
> > The allocation and contents of the actual chunks vary between
> > fixed-size and variable-size modes. Total chunk length is always
> > aligned to the next sizeof(byte*). Here is the format of a
> > fixed-size chunk:
> >   byte[] - sizeof=chunk_dataspace_length, but at least
> >            sizeof(byte*) bytes. Keeps actual data or pointer
> >            to the next deleted chunk.
> >            chunk_dataspace_length equals the full record length
> >   byte   - status field (1 means "in use", 0 means "deleted")
> > Variable-size uses a different format:
> >   byte[] - sizeof=chunk_dataspace_length, but at least
> >            sizeof(byte*) bytes. Keeps actual data or pointer
> >            to the next deleted chunk.
> >            chunk_dataspace_length is set according to table
> >            setup (block_size)
> >   byte*  - pointer to the next chunk in this chunkset,
> >            or NULL for the last chunk
> >   byte   - status field (1 means "first", 0 means "deleted",
> >            2 means "linked")
> >
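The two layouts above can be expressed as offsets computed at table-creation time. A rough sketch: chunk_layout, make_layout and align_up are hypothetical names, not identifiers from the patch, and the status enum merges both modes (value 1 reads as "in use" in fixed-size mode and "first" in variable-size mode).

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char byte;

/* Status byte values as listed in the design notes. */
enum chunk_status { CHUNK_DELETED = 0, CHUNK_FIRST = 1, CHUNK_LINKED = 2 };

typedef struct {
    size_t data_len;      /* chunk_dataspace_length, at least sizeof(byte*) */
    size_t next_offset;   /* next-chunk pointer (meaningful in variable mode) */
    size_t status_offset; /* offset of the status byte */
    size_t total_len;     /* whole chunk, rounded up to sizeof(byte*) */
} chunk_layout;

/* Rounds n up to the next multiple of a. */
static size_t align_up(size_t n, size_t a)
{
    assert(a > 0);
    return (n + a - 1) / a * a;
}

static chunk_layout make_layout(size_t requested_data_len, int variable_size)
{
    chunk_layout l;

    /* The data area doubles as the free-list link, so it must be able
       to hold a pointer to the next deleted chunk. */
    l.data_len = requested_data_len < sizeof(byte *) ? sizeof(byte *)
                                                     : requested_data_len;
    l.next_offset = l.data_len;
    l.status_offset = l.data_len + (variable_size ? sizeof(byte *) : 0);
    l.total_len = align_up(l.status_offset + 1, sizeof(byte *));
    return l;
}
```

Because next_offset is not necessarily pointer-aligned, code reading or writing that field would typically go through memcpy rather than a direct pointer dereference.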
> > When allocating a new chunkset of N chunks, Heap Engine will try
> > to allocate chunks one-by-one, linking them as they become
> > allocated. Allocation of a single chunk will attempt to reuse
> > a deleted (freed) chunk. If no free chunks are available,
> > it will attempt to allocate a new area inside
> >
> === message truncated ===