List:       xerces-c-dev
Subject:    OT: Data alignment (was: How do I use Xerces strings?)
From:       Axel Weiß <aweiss () informatik ! hu-berlin ! de>
Date:       2006-03-13 11:41:28
Message-ID: 200603131241.28216.aweiss () informatik ! hu-berlin ! de

Steven T. Hatton wrote:
> On Sunday 12 March 2006 10:33, Axel Weiß wrote:
> > Steven T. Hatton wrote:
> > > i386 (32-bit version), i486, P, PII, PIII, P4...
> > > #include <iostream>
> > > int main() {
> > >   char c('c');
> > >   std::cout<<c<<std::endl;
> > > }
> > >
> > > Assume char is 8-bits.  The smallest retrievable unit of storage
> > > is a 32-bit word.  That means the CPU puts c in a 32-bit word.
> > >  What will occupy the other 24 bits of the word?
> >
> > This statement is not correct. The 32-bit property, regarding i386
> > architectures, means that 32 bit units _can_ be accessed (because
> > the physical bus interface has 32 bits width). However, i386
> > _addresses_ 8-bit units and stores them byte-by-byte. You can easily
> > store four bytes into a 32 bit integer variable.
>
> See the last diagram on this page:
> http://tinyurl.com/qc8wq
> http://www.intel.com/software/products/compilers/flin/docs/main_for/mergedprojects/optaps_for/fortran/optaps_prg_algn_f.htm

Hi Steven,

the page you pointed to is about how data are aligned in mixed 
structures. These architectures share alignment requirements that 
depend on the size of the data being accessed:
8 bit - any address
16 bit - only even addresses
32 bit - only addresses that are divisible by 4
and so on.
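
To see these rules at work, here is a minimal C++11 sketch. The struct 
Mixed and both helper functions are hypothetical names made up for this 
illustration, and the amount of padding depends on compiler and target, 
so take it as something to try rather than a definitive statement:

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstdint>   // std::uint16_t, std::uint32_t

// Hypothetical mixed structure, for illustration only.
struct Mixed {
    char          c;  // 8 bit: may sit at any address
    std::uint32_t w;  // 32 bit: the compiler normally inserts padding before it
    std::uint16_t h;  // 16 bit: wants an even offset
};

// True if every member starts at a multiple of its own alignment.
constexpr bool members_naturally_aligned() {
    return offsetof(Mixed, c) % alignof(char) == 0
        && offsetof(Mixed, w) % alignof(std::uint32_t) == 0
        && offsetof(Mixed, h) % alignof(std::uint16_t) == 0;
}

// Bytes the compiler added on top of the raw member sizes.
constexpr std::size_t padding_bytes() {
    return sizeof(Mixed)
         - (sizeof(char) + sizeof(std::uint32_t) + sizeof(std::uint16_t));
}
```

Printing sizeof(Mixed) and padding_bytes() on your own machine shows how 
much room the alignment costs there; the members themselves need only 
7 bytes on a target with 8-bit chars.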

> You /can/ tell the compiler to go ahead and pack the data as tightly
> as possible by ignoring natural alignment boundaries, but doing so
> will probably have a significant negative impact on performance.

Again, this is only true if you want to pack misaligned data into 
structures. Then the compiler must shift-mask-pack the large misaligned 
entities. On some architectures, like i386, this reduces performance 
significantly.
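
As a rough picture of what such shift-and-mask work amounts to, here is a 
hand-written C++11 sketch. The function name load_u32_le is hypothetical, 
and the code assumes 8-bit bytes and a little-endian byte order; it is 
not what any particular compiler emits:

```cpp
#include <cstdint>  // std::uint32_t

// Assemble a 32-bit value from four consecutive bytes at an arbitrary
// (possibly odd) address. Assumes 8-bit bytes, least significant first.
constexpr std::uint32_t load_u32_le(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

One aligned 32-bit access becomes four byte accesses plus shifts and ORs, 
which is where the extra cost comes from.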

When you are talking about strings, you are talking about arrays, which 
may be understood as naturally aligned aggregates of data. The only 
relevant factor that restricts their alignment is the addressability of 
the smallest data unit (called a byte). This is 8 bit on any i386 
processor, but may be 16, 32, 64 or even 128 bit on other processors. 
Check what the following give:
- sizeof(char): _always_ 1, on any architecture
- sizeof(int): 4 on i386 processors (int is 32 bit), usually still 4 on 
64-bit processors (most compilers keep int at 32 bit there, though some 
data models make it 64 bit, giving 8), but 1 on a TMS320C3x (since int 
is 32 bit, but char is also 32 bit)
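
A small C++11 sketch for checking this on your own platform follows. The 
helper bits_in is a hypothetical name chosen for this example; CHAR_BIT 
is the standard macro from <climits>:

```cpp
#include <climits>  // CHAR_BIT: number of bits in a byte on this platform
#include <cstddef>  // std::size_t

// sizeof counts in units of char, so multiply by CHAR_BIT to get the
// number of storage bits a type occupies.
template <typename T>
constexpr std::size_t bits_in() {
    return sizeof(T) * CHAR_BIT;
}

// Holds on every architecture, by definition of sizeof.
static_assert(sizeof(char) == 1, "sizeof(char) is always 1");
```

Printing CHAR_BIT, sizeof(int) and bits_in<int>() shows which of the 
cases above applies to your compiler.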

Cheers,
			Axel


