On Monday 29 September 2003 12:36, Holger Schroeder wrote:
> Hi all,
>
> On Saturday 27 September 2003 23:48, you wrote:
> > On Saturday 27 September 2003 20:25, Nicolas Goutte wrote:
> > > It is part of KDE. KZip 3.1 will simply generate "fat"-based data, KZip
> > > 3.2 will give "unx" ones.
> >
> > Thanks guys for this investigation. I wasn't aware of that change.
> > (Holger: it seems that this change broke the magic-recognition of KOffice
> > files, where the uncompressed file called "mimetype" would have its
> > contents at position 38 in the ZIP file).
>
> the structure of a zip file local header in a zip archive is:
>
> local file header signature 4 bytes (0x04034b50) (...)
>
> (from http://www.pkware.com/products/enterprise/white_papers/appnote.html)
>
> as you understand from the patch contained in thomas zander's first mail in
> this thread, the mime type recognition is done by detecting the string
> application/x-kword in the file at a fixed offset.
>
> when you look at "old" kzip files as i coded them, there was no use for an
> extra field. so in the file there is the string
> mimetypeapplication/x-kword. so the filename and the beginning of the
> content are "concatenated" when there is no extra field. in this case (we
> know how long the filename is, and we know that there is no extra field)
> the beginning of the content/mimetype string is fixed.

Yes, that is the plain old fat format without extension. That is what we
based KOffice on.

> the extra field as it is introduced now gives us the following advantages
> over the old way:
> (...)
>
> so in the "not-koffice" case we should by default write the new fileformat,
> as these values are kind of useful there.
>
> so how to fix this for koffice ?
>
> i see two possibilities:
>
> 1.) add an option to allow writing of zip files without this extra info and
> use it in koffice, as the permissions and Xtimes are not needed in the
> files.

That is the one we want.

> 2.)
> as we only want to have "application/x-kword" at a fixed offset in the
> zip-archive, it would also be possible to not create a first file with the
> filename mimetype and the _content_ application/x-kword, but a first file
> with the _name_ mimetypeapplication/x-kword and any content after that.
> this would have the advantage, that our mimetype is _always_ at this
> offset, no matter which different extra fields with which lengths will be
> ever introduced, as the file name is saved in the zip file before the extra
> fields. as long as nobody creates a "zip format version 2", which will then
> be a whole new format, we would have solved this issue.

No! A file name containing application/ means an extra directory. Also
vnd.kde.kword has two dots and is therefore an invalid FAT name. However, the
idea of the common packaging format was to be cross-filesystem.

As for ZIP "version 2", it exists. It is named ZIP64. (However it seems to fit
more or less in the old format.)

> the only thing that should be checked is, how openoffice would handle these
> files. ok, i looked at an example file from openwriter. they have no first
> mimetype file in their format, they directly start with the file
> content.xml.

OO is still not using the common packaging format. (I have not checked in OO
1.1 RC.)

> so i guess they neither care about a file named "mimetype" nor about a file
> called "mimetypeapplication/x-kword".

We do care, as we have agreed with OO's people about the new format.

> so somebody could change the code in koffice, that writes the mimetype to
> this and it should work. that would have the advantage, that we don't have
> to introduce a new function in kzip to not write the extra stuff, which
> would be a little bit ugly, if i understood it right with all these
> virtual_hooks. and we wouldn't have to always check that nobody breaks kzip
> in the future.

No, we cannot. This is the last version of the KOffice-own file format (as we
switch to OO in KOffice 1.4).
It would be ridiculous to have such a change just for the last format.

> while we are at it, i would not only call the first file
> mimetypeapplication/x-kword, but i would suffix it with the version it was
> created with, perhaps we can use it for something in the future, and these
> few bytes do not hurt anybody. so it would be called for example
> application/x-kword-1.3.0 or so.

No, that is not what was agreed. That is a detail for the file format reader
to handle. We currently have the syntaxversion attribute for that. And OO's
formats allow extensions anyway.

> > > No, sure. I cannot remember if KOffice 1.2.x had its own KZip (named
> > > KoZip) or not. So I do not know if the change has to be done in KDE
> > > 3.1.x or in KOffice 1.2.x.
> >
> > CVS says that KoZip was part of KOffice-1.2.x indeed. But:
> >
> > > But in any case, I am really starting to ask myself whether for the last
> > > KOffice-own file format it is useful to have a subtle change again.
> > > However this would mean forcing KZip 3.2 to be able to write in the
> > > "fat" mode, either on command or simply for uncompressed files.
> >
> > Yes, I think we shouldn't do something that changes our 'magic'
> > recognition:
> > * other projects/tools/etc. might use the magic we had
> >   previously, this change will break it

Yes, sure, that is why I wrote that we are stuck with it.

> by using solution 2 it would be unbroken again
>
> > * are we sure that the new offset is
> > always going to be 55? What's between position 30 and position 55? This
> > looks more fragile to me.
>
> the extra field can be of a variable length, and the file content starts
> directly after the filename and the extra field. so there is no general
> solution, when we want the detection string to be in the content and on the
> other hand allow this extra field. in the code of kzip.cpp there is already
> a possibility to parseInfoZipUnixNew and for sure it will at some point
> introduce another length for the extra field...
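
For reference, the offsets discussed here (38 for the "fat" entries, 55 for
the "unx" ones) follow from the local file header layout in the PKWARE
appnote: 30 fixed bytes, then the file name, then the extra field, then the
content. A minimal Python sketch, not KZip code: the helper names are made up
for illustration, and the 17-byte extra field is only inferred from 55 - 38,
not taken from kzip.cpp.

```python
import struct
import zlib

# Local file header: signature, version needed, flags, compression method,
# mod time, mod date, CRC-32, compressed size, uncompressed size,
# file name length, extra field length (30 bytes, little-endian).
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")
SIGNATURE = b"PK\x03\x04"  # 0x04034b50 stored little-endian


def build_stored_entry(name, content, extra=b""):
    """Hypothetical helper: local header + name + extra + uncompressed data."""
    header = LOCAL_HEADER.pack(
        SIGNATURE, 10, 0, 0, 0, 0,
        zlib.crc32(content), len(content), len(content),
        len(name), len(extra),
    )
    return header + name + extra + content


def content_offset(data):
    """Offset of the first entry's content, read from its local header."""
    fields = LOCAL_HEADER.unpack_from(data, 0)
    if fields[0] != SIGNATURE:
        raise ValueError("not a ZIP local file header")
    name_len, extra_len = fields[9], fields[10]
    return LOCAL_HEADER.size + name_len + extra_len


mime = b"application/x-kword"
fat = build_stored_entry(b"mimetype", mime)
# Extra field length is an assumption (55 - 38); the bytes are placeholders.
unx = build_stored_entry(b"mimetype", mime, extra=bytes(17))

print(content_offset(fat))  # 30 + 8
print(content_offset(unx))  # 30 + 8 + 17
print(fat[30:57])           # name and content are adjacent without extra field
```

A reader that looks for the string at a fixed offset of 38 only works for the
first kind of entry. One that reads the two length fields would find the
content in both cases, but that is not how the existing magic recognition
works.
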
Yes, that is why we need a plain old fat entry without any extension. That is
what we agreed on with OO's people. (We had not thought that it would be so
hard. Sigh!)

> > * OpenOffice.org uses the "fat" format in ZIP files, and the whole point
> > of switching to ZIP was to use the same thing as they do, and
> > particularly having the same kind of magic mimetype recognition.
>
> i have no idea how they are doing mimetype detection. iirc their "weak"
> detection is solely based on the filename extension, and their "strong"
> detection they use when loading a file parses their manifest.xml, a kind of
> "table of contents" for their archive. but i may be wrong here...

The problem is not what OO does *now*. (It does not really need the manifest
either.) The problem is what we agreed with OO's people.

> > So we need to fix KZip to give us "fat" format again. Holger: do you know
> > if that's easily doable? Is there any benefit in the "unx" stuff? Should
> > we just move "back" to fat, or should we add a method to choose the format?
> >
> > (In case you missed the rest of the thread: zipinfo shows "fat" or
> > "unx"; "fat" on OOo and kdelibs-3.1-generated files, and "unx" in
> > kdelibs-cvs-generated files)
>
> unfortunately i am quite busy with university, so i can't hack on that
> right now, but i will follow this discussion, so feel free to ask, if i
> did not explain something well enough.

That probably means that I have to look at it. Well, I do not mind, if
somebody helps me to check it on KDE 3.2. (I have only KDE 3.1.4.)

> Holger

Have a nice day!

> ____________________________________
> koffice mailing list
> koffice@mail.kde.org
> To unsubscribe please visit:
> http://mail.kde.org/mailman/listinfo/koffice