
List:       kde-devel
Subject:    Re: [Moving/Copying Files] Overwriting identical files
From:       Matthias Fuchs <mat69 () gmx ! net>
Date:       2011-02-25 15:12:09
Message-ID: 201102251612.09631.mat69 () gmx ! net

On Thursday, 24 February 2011, 16:43:09, todd rme wrote:
> On Wed, Feb 23, 2011 at 12:10 PM, Matthias Fuchs <mat69@gmx.net> wrote:
> > On Tuesday, 08 February 2011, 23:22:49, todd rme wrote:
> >> On Sun, Oct 3, 2010 at 3:58 AM, todd rme <toddrme2178@gmail.com> wrote:
> >> > On Thu, Sep 30, 2010 at 7:30 AM, Matthias Fuchs <mat69@gmx.net> wrote:
> >> >> Hi,
> >> >> 
> >> >> When moving and copying multiple files it can be quite tedious to
> >> >> make out if there are differences for all these files.
> >> >> 
> >> >> 
> >> >> 
> >> >> ====Use Case:====
> >> >> You copy hundreds of text files, knowing that most are the same, but
> >> >> not all. Now you are greeted with multiple "Do you want to overwrite
> >> >> XY size Z with XY size W" dialogs.
> >> >> 
> >> >> ====Proposal====
> >> >> What I propose is to not show these dialogs if both files are
> >> >> identical. In the case of copying, nothing should happen; in the
> >> >> case of moving, the source file should be deleted.
> >> >> 
> >> >> Checking whether two files are identical should happen in a
> >> >> two-step process:
> >> >> 1. Check that both file sizes are equal and smaller than a fixed
> >> >>    size.
> >> >> 2. Calculate the checksums for both files. The fixed-size check
> >> >>    above avoids long-lasting calculations.
> >> >> 
> >> >> If 1. turns out to be false, a dialog should be shown.
> >> >> 
> >> >> 
> >> >> This could be either opt-in (via a checkbox) or always on with just
> >> >> an information text in the dialog. The hash function should be one
> >> >> that is very fast to calculate and if the file system supports and
> >> >> stores checksums for files those should be used.
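
A minimal sketch of the two-step check described above, in Python for illustration only (an actual implementation would presumably live in the KDE C++ code). The size limit, the function names and the choice of SHA-1 are assumptions made for this example; the proposal only asks for "a fixed size" and a fast hash.

```python
import hashlib
import os

# Hypothetical cutoff; the proposal does not name a value.
MAX_AUTO_CHECK_BYTES = 10 * 1024 * 1024


def file_digest(path, chunk_size=64 * 1024):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def should_skip_dialog(source, destination, limit=MAX_AUTO_CHECK_BYTES):
    """Return True if the overwrite dialog can be skipped."""
    source_size = os.path.getsize(source)
    # Step 1: sizes must be equal and below the limit, else show the dialog.
    if source_size != os.path.getsize(destination) or source_size >= limit:
        return False
    # Step 2: only now pay for the checksum calculation.
    return file_digest(source) == file_digest(destination)
```

When `should_skip_dialog` returns True, a copy would do nothing and a move would delete the source file; in every other case the usual dialog is shown.
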
> >> >> 
> >> >> ====Open Questions + Discussion====
> >> >> What do you think of this idea, should something like that be
> >> >> implemented? Also what do you think of the Nepomuk Resources
> >> >> associated with the files? Imagine both files have a different
> >> >> rating, what should happen then?
> >> >> 
> >> >>>> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to
> >> >>>> unsubscribe <<
> >> > 
> >> > I posted a brainstorm forum idea about this last year:
> >> > 
> >> > http://forum.kde.org/brainstorm.php#idea39563_page1
> >> > 
> >> > I proposed a three-stage process, similar to yours but with an extra
> >> > stage, an optional byte-by-byte check.
> >> > 
> >> > Checksums are fast for small files, but they can take longer on large
> >> > files and on older systems.  They also, as I understand it, are not
> >> > perfect.  So I think that a better approach is that, for files under a
> >> > certain size, an automatic three-stage approach is used.  First the
> >> > file size check, then checksum, then byte-by-byte.  If all of those
> >> > pass, then the file is just deleted.
> >> > 
> >> > For slightly bigger files, where the checksum is fast enough but the
> >> > byte-by-byte is not, only the first two stages are used.  If they both
> >> > pass, the "File Already Exists" dialog box should be changed to tell
> >> > the user that the files are "probably" the same, and gives them the
> >> > additional option (on top of renaming, overwriting, and skipping) of
> >> > doing an "Exact check" (or something along those lines), which then
> >> > does the byte-by-byte check.  If that passes, then the file is
> >> > deleted.
> >> > 
> >> > If the file is really big, then even the checksum is not done
> >> > automatically.  If the files have the same size, the user is told the
> >> > files have the same size, and the user has the additional options of
> >> > doing a "Quick check" and "Exact check" (checksum and byte-by-byte,
> >> > respectively).  If the checksums match, you are back to the
> >> > previous situation where the user is given the option to do the exact
> >> > check or do one of the standard actions.  If the detailed check
> >> > passes, then the file is deleted.
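
A rough sketch of the size-based tiers described above, again in Python purely for illustration. The two thresholds, the result strings and the function names are hypothetical; the mail only speaks of small, "slightly bigger" and "really big" files.

```python
import filecmp
import hashlib
import os

# Hypothetical thresholds; the mail does not give concrete numbers.
SMALL_LIMIT = 1 * 1024 * 1024
MEDIUM_LIMIT = 100 * 1024 * 1024


def sha1_of(path, chunk_size=64 * 1024):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify(source, destination,
             small_limit=SMALL_LIMIT, medium_limit=MEDIUM_LIMIT):
    """Decide what the overwrite dialog should do for a name conflict."""
    size = os.path.getsize(source)
    if size != os.path.getsize(destination):
        return "different"
    if size < small_limit:
        # Small files: checksum, then byte-by-byte, all automatic.
        if (sha1_of(source) == sha1_of(destination)
                and filecmp.cmp(source, destination, shallow=False)):
            return "identical"
        return "different"
    if size < medium_limit:
        # Medium files: checksum only; offer an "Exact check" on a match.
        if sha1_of(source) == sha1_of(destination):
            return "probably-identical"
        return "different"
    # Big files: nothing automatic; offer "Quick check" and "Exact check".
    return "same-size"
```

Only "identical" would delete the file without asking; "probably-identical" and "same-size" would still show the dialog with the extra check options.
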
> >> > 
> >> > The issue with the nepomuk data is an issue even without this.  It
> >> > also arises when you are moving files and decide to overwrite
> >> > conflicting files, even if they aren't the same.  A simple check box
> >> > for "merge nepomuk data" or "merge tags" or something like that (if
> >> > they both have data, of course) would be very useful independent of
> >> > this.
> >> 
> >> Sorry for dredging up such an old topic, but I was wondering if this
> >> might make a good GSOC project.
> >> 
> >> -Todd
> >> 
> > 
> > I just saw your reply now.
> > Personally I don't really think that this should be a GSOC since I
> > believe it would be quite easy to realise.
> 
> What about as part of a larger duplicate file-finding tool?  There is
> no good KDE GUI that I am aware of.  Someone could make a general
> duplicate file check library for kdelibs, include this in the file
> overwrite dialog but also make a GUI to allow scanning directories for
> duplicate files.
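
One way such a directory scan could work, sketched in Python for illustration only: group files by size first, then hash only the files that share a size. The function names and the use of SHA-1 are assumptions; nothing like this exists in kdelibs as far as this thread says.

```python
import hashlib
import os
from collections import defaultdict


def sha1_of(path, chunk_size=64 * 1024):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return sorted groups of paths under root with equal size and digest."""
    by_size = defaultdict(list)
    for directory, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(directory, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[sha1_of(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return sorted(groups)
```

The size grouping plays the same role as the file size check in the overwrite dialog: most files are ruled out before any checksum is calculated.
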
> 
> -Todd
> 

This sounds interesting, imo.
I am not sure, though, who you should contact to add this as a GSOC idea.
 