Staying within binary files, here are some additional thoughts:

Roland Kaufmann wrote:
> Both the picture in the text and in the spreadsheet are named with
> the ordinal 0 because they are in different scopes. The diagram 

I thought of a naming convention like "ordinal[-label].part", which
encodes all of the information in the filename.
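
To make the convention concrete, here is a rough sketch in Python. The
exact syntax (a decimal ordinal, an optional "-label", then ".part")
is my reading of the pattern above, and the function names are made up
for illustration only.

```python
import re
from typing import Optional, Tuple

# Assumed syntax for "ordinal[-label].part": decimal ordinal, optional
# "-label", literal ".part" suffix. This is an illustration, not a spec.
_PART_NAME = re.compile(r"^(\d+)(?:-(.+))?\.part$")


def format_part_name(ordinal: int, label: Optional[str] = None) -> str:
    """Build a filename from an ordinal and an optional user label."""
    if label:
        return f"{ordinal}-{label}.part"
    return f"{ordinal}.part"


def parse_part_name(name: str) -> Tuple[int, Optional[str]]:
    """Split a filename back into (ordinal, label); label may be None."""
    match = _PART_NAME.match(name)
    if match is None:
        raise ValueError(f"not a part filename: {name!r}")
    return int(match.group(1)), match.group(2)
```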

But when I came to think of it, this could be done better if
compatibility with .tar files is not an issue (I don't know how
important that is to you).

Each part in a document has the following information:

 * Unique ordinal within the document
 * Editor for this part; this is passed to the naming service
 * An optional label, given by the user
 * Contents stream
 * Attributes for this part; these could include the size within the
   container etc. In the future perhaps also an XSL and possibly a DTD
 * References to other parts (by their ordinal)

Some may argue that all of these are really attributes (with the
attributes in the list being "extended" attributes) and I see their
point. (see below)
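
As a sketch only, the per-part information listed above could be held
in a record like the following (Python). The field names and types are
my assumptions; nothing here is a settled design.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Part:
    """One part of a document; field names are hypothetical."""
    ordinal: int                      # unique within the document
    editor: str                       # passed to the naming service
    label: Optional[str] = None       # optional, given by the user
    content: bytes = b""              # the contents stream
    # Free-form attributes, e.g. size within the container.
    attributes: Dict[str, str] = field(default_factory=dict)
    # Ordinals of the other parts this part refers to.
    references: List[int] = field(default_factory=list)
```

If one takes the view that everything is an attribute, the first four
fields would simply move into the attributes mapping.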

When a part is referenced, it should always be through an indirect,
relative reference, e.g. "#1.2", meaning the third sub-part of the
second sub-part of this part (the sequence is zero-based...). The
storage system would consult each part's list of references to
resolve this into an ordinal within that document.
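
A minimal sketch of that resolution step, in Python. I assume here
that each part's reference list is available as a mapping from ordinal
to a list of ordinals, and that each number in the path is a
zero-based index into the current part's list; the function name is
made up.

```python
from typing import Dict, List


def resolve(references: Dict[int, List[int]], start: int, ref: str) -> int:
    """Resolve a relative reference such as "#1.2" to an ordinal.

    `references` maps each part's ordinal to its list of referenced
    ordinals. Each path step is a zero-based index into the list of
    the part reached so far, beginning at `start`.
    """
    if not ref.startswith("#"):
        raise ValueError(f"not a relative reference: {ref!r}")
    current = start
    path = ref[1:]
    if path:
        for step in path.split("."):
            # An unknown ordinal or index raises a LookupError subclass.
            current = references[current][int(step)]
    return current
```

With part 0 referring to parts [1, 2] and part 2 referring to parts
[3, 4, 5], resolving "#1.2" from part 0 goes to part 2 and then to
part 5.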

This could very simply be implemented using a flat layout; in fact,
one doesn't need any further directories, because the parts themselves
contain the references necessary for navigation.

Hence, a file would have to consist of:

 * TOC, containing information about:
   --> the part that should be started as a frame
   --> internal structure information
 * Sequence of part entries, as described above
 * Sequence of data streams containing data
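
Purely to illustrate the three sections above, here is a toy
serialisation in Python. The concrete encoding (a 4-byte length, a
JSON header holding the TOC and the part entries, then the raw
streams back to back) is an assumption made up for this example and is
not the file format I have in mind.

```python
import json
import struct
from typing import Dict, List, Tuple


def pack(frame: int, parts: List[dict], streams: Dict[int, bytes]) -> bytes:
    """Write TOC + part entries as a JSON header, then the data streams.

    `parts` is a list of dicts that each carry at least an "ordinal";
    `streams` maps ordinal to that part's data. Hypothetical layout.
    """
    entries, body, offset = [], b"", 0
    for part in parts:
        data = streams[part["ordinal"]]
        # Record where this part's stream lives in the stream section.
        entries.append({**part, "offset": offset, "size": len(data)})
        body += data
        offset += len(data)
    header = json.dumps({"frame": frame, "parts": entries}).encode("utf-8")
    return struct.pack(">I", len(header)) + header + body


def unpack(blob: bytes) -> Tuple[dict, Dict[int, bytes]]:
    """Read the header back and slice out each part's stream."""
    (header_len,) = struct.unpack(">I", blob[:4])
    header = json.loads(blob[4:4 + header_len].decode("utf-8"))
    body = blob[4 + header_len:]
    streams = {
        e["ordinal"]: body[e["offset"]:e["offset"] + e["size"]]
        for e in header["parts"]
    }
    return header, streams
```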

I also have some more concrete ideas for a file format, if you're
interested. It all depends on how much it matters to you to follow
the .tar format.

> An extension to this model is also a special branch that holds the
> DTD for any parts that is stored in XML format (or a reference to
> a system-wide installed such -- that is a tradeoff between space
> and portability to systems without that part installed)

I got the idea about the DTD wrong. That is, I still think the vision
that the user should be able to view a thumbnail of the part without
having the component installed is not a bad one, but the solution I
sketched was not good. I also mixed up type definitions and
stylesheets.

The *real* solution would be to embed the component itself together
with the document, giving a true object! This, however, leads to
various problems: system administrators are given a headache when
users install parts themselves, you have a cross-platform problem,
there might be viruses in the components that are downloaded that
way, etc.

A stylesheet is really an interpreter/viewer program for your data
(based on the type definition), which is why I was so keen on having
a reference to it embedded.

However, there is a problem when a component constructs a DTD on the
fly, so that there is no clear one-to-one mapping from the component
to the type definition and stylesheet in a system repository.

Perhaps in those cases, one should always store the DTD in the
document itself, leaving the repository for components with standard
types? With such a solution, it would be easy to slip a reference
to a data stream holding the DTD & XSL into the part attributes (see
above).

By default, type definitions and stylesheets should not be stored in
the document, but rather kept in a system repository. Embedding them
is only needed when you want other users to be able to view the file
without having the component installed.
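
The lookup order I have in mind could be sketched like this (Python).
The attribute key "dtd-stream", the shape of the repository (a mapping
keyed by editor name) and the function name are all assumptions for
the sake of the example.

```python
from typing import Dict, Optional


def find_type_definition(
    attributes: Dict[str, str],
    streams: Dict[int, bytes],
    repository: Dict[str, bytes],
    editor: str,
) -> Optional[bytes]:
    """Return the DTD for a part, or None if none can be found.

    An embedded copy (referenced from the part attributes) wins,
    since it covers DTDs built on the fly; otherwise fall back to the
    system repository entry for the part's editor.
    """
    stream_ref = attributes.get("dtd-stream")  # hypothetical key
    if stream_ref is not None:
        return streams[int(stream_ref)]
    return repository.get(editor)
```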

Sincerely,
	R.

---
No, I'm not an (e)mail-chauvinist.