List:       gpfsug-discuss
Subject:    Re: [gpfsug-discuss] Quick delete of huge tree
From:       Alec <anacreo () gmail ! com>
Date:       2021-04-20 16:44:53
Message-ID: CAGhSTwhiLq_uk1f1NKMyfL_RHjGeHfjhM5eLz2HeXU9fkOmujA () mail ! gmail ! com

I would start with the mv to hide it, and then allow the delete to progress
in the background.  I would separate the delete of files from the delete of
directories... And I would try using mmxargs with an rm command in one policy
to get parallelism and reduce the number of execs, followed by a
simple rm -r of the directory tree.
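
A minimal sketch of that sequence, for illustration only. It swaps in plain find and xargs for mmxargs and the policy engine, so it shows the shape of the approach rather than the GPFS-specific tooling. The function name quick_delete, the hidden-name scheme, the batch size of 1000 and the default of 8 parallel jobs are all assumptions, not anything from this thread.

```shell
#!/bin/sh
# Sketch only: generic find/xargs stand-in for the mmxargs-based approach.
# Function name, hidden-name scheme, batch size and job count are hypothetical.
# Relies on find -print0 and xargs -0/-P, which GNU and BSD tools provide
# but POSIX does not require.
set -eu

quick_delete() {
    target=$1          # tree to remove
    jobs=${2:-8}       # number of parallel rm processes (assumed default)

    parent=$(dirname "$target")
    hidden="$parent/.deleting.$(basename "$target").$$"

    # 1. Hide the tree. A rename inside one file system only touches metadata.
    mv "$target" "$hidden"

    # 2. Delete everything that is not a directory, in parallel,
    #    passing many paths to each rm to keep the number of execs down.
    find "$hidden" ! -type d -print0 | xargs -0 -n 1000 -P "$jobs" rm -f --

    # 3. Remove the remaining, now empty, directory skeleton.
    rm -r "$hidden"
}
```

To let the delete progress in the background, the call can be put in the background from a script that defines the function, for example `quick_delete <path to directory> 16 &`.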

Maybe it's cheaper to make a new filesystem and just retain the data you
want though...

Alec

On Tue, Apr 20, 2021, 6:51 AM Jonathan Buzzard <
jonathan.buzzard@strath.ac.uk> wrote:

> On 20/04/2021 13:09, Ulrich Sibiller wrote:
>
> >>
> >> Consider using mv to move it out the way or hide it while the delete is
> >> in progress. If you do that think carefully about backups, you don't
> >> want to back it all up again while it is being deleted :-)
> >
> > ;-) Yeah, that's why I did not do the mv in the first place ;-)
> >
>
> I would estimate (based on my experience) that you should be able to
> delete that amount of data/files in under 24 hours anyway with a simple
> rm -rf. Which is why I question trying to find faster methods. You have
> already wasted a significant amount of that time :-)
>
> If you're using TSM for the backup then just exclude it from the backup in
> your dsm.opts file:
>
> exclude.dir  <path to directory>
>
> We have a NOBACK option that allows users to select if they don't want
> something backing up. Helpful if your job generates lots of temporary
> files or data that are junked as soon as the job finishes. Anything
> under a directory called NOBACK does not get backed up.
>
> exclude.dir  /.../NOBACK/
>
>
> JAB.
>
> --
> Jonathan A. Buzzard                         Tel: +44141-5483420
> HPC System Administrator, ARCHIE-WeSt.
> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
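
For the NOBACK convention quoted above, a rough way to preview which directories such a rule would cover is to look for directories with that name. This is only an approximation with find and does not consult the TSM client or its include/exclude processing; the function name list_noback_dirs and its argument are hypothetical.

```shell
# Sketch only: approximates the NOBACK rule with find.
# It does not query the TSM client, and the function name is hypothetical.
list_noback_dirs() {
    root=$1   # starting directory to search under (hypothetical argument)
    # Print each directory named NOBACK and do not descend into it,
    # since everything beneath it falls under the same exclude.
    find "$root" -type d -name NOBACK -prune -print
}
```

Running it against a file system root prints one line per top-most NOBACK directory found.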





