List: hadoop-user
Subject: Re: HDFS du Utility Inconsistencies?
From: David M <mcginnisda () outlook ! com>
Date: 2019-11-08 18:03:56
Message-ID: CY4PR1201MB01185B5B546BF0AAFA776696C37B0 () CY4PR1201MB0118 ! namprd12 ! prod ! outlook ! com
We use snapshots in the cluster, but I've not seen any snapshot folders underneath the folder in question. I'd need to verify with the application team if snapshots for this folder are available anywhere.
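One caveat on "not seen any snapshot folders": HDFS exposes snapshots under a reserved .snapshot path that does not show up in a normal directory listing, and a snapshot taken on a parent directory also covers everything below it. A minimal Python sketch of which paths would be worth listing explicitly is below. The helper name snapshot_dirs_to_check is made up for illustration, /randomFolder is the placeholder path from this thread, and the script only prints the commands rather than running them. The stock "hdfs lsSnapshottableDir" command is another way to see which directories allow snapshots.

```python
import posixpath


def snapshot_dirs_to_check(path):
    """Return the .snapshot path for `path` and for each of its ancestors.

    Hypothetical helper: a snapshot may have been taken on any ancestor
    directory, so every level up to the root is a candidate.
    """
    path = posixpath.normpath(path)
    candidates = []
    while True:
        candidates.append(posixpath.join(path, ".snapshot"))
        if path == "/":
            break
        path = posixpath.dirname(path)
    return candidates


# Print (not run) one listing command per candidate directory.
for candidate in snapshot_dirs_to_check("/randomFolder"):
    print("hdfs dfs -ls " + candidate)
```

A listing that fails for a given path only suggests that the directory at that level is not snapshottable or has no snapshots; it says nothing about the other levels, which is why each one is checked separately.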
________________________________
From: Arpit Agarwal <aagarwal@cloudera.com>
Sent: Friday, November 8, 2019 11:41:31 AM
To: David M <mcginnisda@outlook.com>
Cc: user@hadoop.apache.org <user@hadoop.apache.org>
Subject: Re: HDFS du Utility Inconsistencies?
Got any snapshots?
On Fri, Nov 8, 2019, 09:38 David M <mcginnisda@outlook.com> wrote:
All,
I'm working on a cluster that is running Hadoop 2.7.3. I have one folder in particular where the command hdfs dfs -du is giving me strange results. If I query the folder and ask for a summary, it tells me 10 GB. If I don't ask for a summary, all of the folders underneath don't even add up to 1 GB, much less 10 GB.
I've verified this is true over time and is true using the hdfs user or any other user. We are on an HDP cluster, so we are using Ranger for HDFS security, and Kerberos for authentication. We see similar results in -count, where the size and counts are both different. We have not seen this behavior in any other folders.
See below for a sample output we are seeing. I've replaced the full path with a fake path to protect the data we have on the cluster. Does anyone know anything that would cause this behavior? Thanks!
$ hdfs dfs -du -h /randomFolder
119.9 M /randomFolder/bug
1.0 M /randomFolder/commitment
86.8 K /randomFolder/customfield
31.3 M /randomFolder/epic
10.3 M /randomFolder/feature
4.0 M /randomFolder/insprintbug
372.9 K /randomFolder/project
15.1 K /randomFolder/projectstatus
330.9 M /randomFolder/story
256.3 M /randomFolder/subtask
74.7 K /randomFolder/subtemplate
89.6 M /randomFolder/task
7.4 M /randomFolder/techdebt
117.7 K /randomFolder/template
617.9 K /randomFolder/tempomember
8.2 K /randomFolder/tempoteam
1.4 M /randomFolder/tempoworklog
$ hdfs dfs -du -h -s /randomFolder
10.6 G /randomFolder
David McGinnis
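To put a number on the gap, the per-folder figures from the listing above can be summed and compared with the -s total. This is a small Python sketch, assuming the K/M/G suffixes printed by -h are binary multiples (1024-based); the constant and function names are made up for illustration, and the inputs are already rounded to one decimal place, so the result is approximate.

```python
# Assumption: -h prints sizes in binary units (1 K = 1024 bytes).
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

# Per-folder output copied from the message above.
DU_OUTPUT = """\
119.9 M /randomFolder/bug
1.0 M /randomFolder/commitment
86.8 K /randomFolder/customfield
31.3 M /randomFolder/epic
10.3 M /randomFolder/feature
4.0 M /randomFolder/insprintbug
372.9 K /randomFolder/project
15.1 K /randomFolder/projectstatus
330.9 M /randomFolder/story
256.3 M /randomFolder/subtask
74.7 K /randomFolder/subtemplate
89.6 M /randomFolder/task
7.4 M /randomFolder/techdebt
117.7 K /randomFolder/template
617.9 K /randomFolder/tempomember
8.2 K /randomFolder/tempoteam
1.4 M /randomFolder/tempoworklog
"""


def to_bytes(size, unit):
    """Convert a human-readable size such as ('119.9', 'M') to bytes."""
    return float(size) * UNITS[unit]


def total_bytes(du_text):
    """Sum the sizes of all 'SIZE UNIT PATH' lines in du -h style text."""
    total = 0.0
    for line in du_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that is not 'SIZE UNIT PATH'
        size, unit, _path = parts
        total += to_bytes(size, unit)
    return total


children = total_bytes(DU_OUTPUT)
summary = to_bytes("10.6", "G")

print("sum of subfolders: %.1f M" % (children / UNITS["M"]))
print("summary (-s):      %.1f M" % (summary / UNITS["M"]))
print("ratio:             %.1fx" % (summary / children))
```

Under that assumption the subfolders add up to roughly 853 M, so the -s figure is more than twelve times larger than what the visible children account for.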