
List:       kfm-devel
Subject:    UDSEntry compression ideas
From:       Mark <markg85 () gmail ! com>
Date:       2013-09-16 18:56:13
Message-ID: CAPd6JnFFAL8nBeu-7WkMrevXchbi7sjHFYFsV=nq+e+gsBT+EA () mail ! gmail ! com

Hi All,

Please read this as a brainstorm post with some ideas. I have no code
working in this area yet.

Lazy loading is awesome, but if you can compress the data enough that you
still have all details without lazy loading, that is even better, I think.
For example, a directory listing of any given folder gives at least the
following UDSEntry values:
- Device ID
- Inode
- User
- Group

In the "experimental" case, where you list a folder that contains just a
bunch of files created by one user, at least the above 4 properties are the
same for all files. However, that experimental situation certainly doesn't
hold for "real data", where you can have files from multiple users, and
folders as well. Folders will likely have a different inode number (not
tested). Still, this is an area where, in most cases, half of the data (by
default we only have 8 UDSEntry properties) is redundant. Among the 4
others are two timestamps, which are very often the same as well.
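
To get a feel for how redundant a listing really is, one could count the
distinct combinations of the shared fields. The following is only a minimal
sketch: `ListedEntry` and `countDistinctOwnerships` are hypothetical names
made up for illustration, not part of the real UDSEntry API. The inode is
deliberately left out of the comparison, because inode numbers are normally
unique per file within a filesystem.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, simplified stand-in for the fields discussed above.
// This is NOT the real KIO::UDSEntry API.
struct ListedEntry {
    std::string name;
    std::uint64_t deviceId;
    std::uint64_t inode;
    std::string user;
    std::string group;
};

// Counts how many distinct (device, user, group) combinations occur in a
// listing. A result of 1 means those three fields are fully redundant.
std::size_t countDistinctOwnerships(const std::vector<ListedEntry>& entries) {
    std::set<std::tuple<std::uint64_t, std::string, std::string>> seen;
    for (const ListedEntry& e : entries) {
        seen.emplace(e.deviceId, e.user, e.group);
    }
    return seen.size();
}
```

If the count stays small compared to the number of entries, storing those
fields once per combination instead of once per entry looks worthwhile.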

I'd like to play with compression here, but I'm not sure how to follow up
on that. One idea I have is a multi-key -> single-value storage structure,
which neither C++ nor Qt provides by itself. Perhaps Boost.MultiIndex
(boost::multi_index) would be a possible solution here? The index would
probably be the filename, which you can then also remove from the UDSEntry
since you can get it from the keys themselves.

Other than that, I don't know a sane way of doing this, or even if this
will work at all. The above idea would already require a singleton
"UDSEntryDataContainer" that all UDSEntry objects would use. The UDSEntry
API would remain the same, but the implementation would differ quite a lot.
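
A rough sketch of what such a shared container could look like, using only
the standard library rather than Boost. Everything here is an assumption
for illustration: `OwnerRecord`, `CompactEntry`, and the `intern()` and
`record()` members are invented names, and the real UDSEntry stores its
fields differently. Each entry keeps only a small index into the shared
container, while its accessors still return the full values.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical bundle of fields that tend to repeat across entries.
struct OwnerRecord {
    std::uint64_t deviceId;
    std::string user;
    std::string group;
};

// Hypothetical singleton that stores each distinct OwnerRecord once.
// Note: intern() is not thread-safe in this sketch.
class UDSEntryDataContainer {
public:
    static UDSEntryDataContainer& instance() {
        static UDSEntryDataContainer self; // constructed on first use
        return self;
    }

    // Returns the index of an identical stored record, adding it if new.
    std::size_t intern(const OwnerRecord& r) {
        auto key = std::make_tuple(r.deviceId, r.user, r.group);
        auto it = m_index.find(key);
        if (it != m_index.end()) {
            return it->second;
        }
        m_records.push_back(r);
        m_index.emplace(std::move(key), m_records.size() - 1);
        return m_records.size() - 1;
    }

    const OwnerRecord& record(std::size_t id) const { return m_records.at(id); }
    std::size_t recordCount() const { return m_records.size(); }

private:
    UDSEntryDataContainer() = default;
    std::vector<OwnerRecord> m_records;
    // Lookup index; it holds a second copy of each key, which is part of
    // the bookkeeping cost of this approach.
    std::map<std::tuple<std::uint64_t, std::string, std::string>, std::size_t> m_index;
};

// Hypothetical entry: per-file data plus a handle to the shared record.
class CompactEntry {
public:
    CompactEntry(std::uint64_t inode, const OwnerRecord& owner)
        : m_inode(inode),
          m_ownerId(UDSEntryDataContainer::instance().intern(owner)) {}

    std::uint64_t inode() const { return m_inode; }
    std::size_t ownerId() const { return m_ownerId; }
    const std::string& user() const {
        return UDSEntryDataContainer::instance().record(m_ownerId).user;
    }
    const std::string& group() const {
        return UDSEntryDataContainer::instance().record(m_ownerId).group;
    }

private:
    std::uint64_t m_inode;
    std::size_t m_ownerId;
};
```

Callers would keep using accessors such as `user()` and `group()` and never
see the container, which matches the "same API, different implementation"
idea. Records are never removed in this sketch, so a real version would
need some form of reference counting or cleanup.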

I also don't know how much data will be saved at all by doing this. Sure,
it will save "some", but I imagine you will get some more pointers with a
multi-index structure, and perhaps some other container bookkeeping
overhead?
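
The trade-off can be put into a back-of-the-envelope formula. This is only
a sketch with invented function and parameter names; the byte counts you
feed in would have to be measured, none of them come from real UDSEntry
data.

```cpp
#include <cstddef>

// Bytes used when every entry stores its own copy of the shared fields.
constexpr std::size_t plainBytes(std::size_t entries,
                                 std::size_t sharedFieldBytes) {
    return entries * sharedFieldBytes;
}

// Bytes used when each entry stores only a handle, and each distinct
// record is stored once together with its container bookkeeping.
constexpr std::size_t pooledBytes(std::size_t entries,
                                  std::size_t distinctRecords,
                                  std::size_t sharedFieldBytes,
                                  std::size_t handleBytes,
                                  std::size_t bookkeepingBytesPerRecord) {
    return entries * handleBytes
         + distinctRecords * (sharedFieldBytes + bookkeepingBytesPerRecord);
}
```

With purely illustrative numbers (1000 entries, 32 bytes of shared fields,
an 8-byte handle, 48 bytes of bookkeeping per record), 2 distinct records
give 8160 bytes against 32000 for the plain layout, while 1000 distinct
records give 88000 bytes. So the approach only pays off when the number of
distinct records is small relative to the number of entries.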

What's your idea here? If you have a better/other idea, please do tell :)

I'm looking forward to your input and other far better ideas!

Kind regards,
Mark
