
List:       hadoop-user
Subject:    Re: diagnosing the difference between dfs 'du' and 'df'
From:       Martin Serrano <martin () attivio ! com>
Date:       2015-12-23 13:07:32
Message-ID: CY1PR0501MB1644D9FE710A5826B1FDCCFCC0E60 () CY1PR0501MB1644 ! namprd05 ! prod ! outlook ! com

I was able to resolve this issue.  Looking at hdfs-audit.log, we
noticed a large number of appends to the same file within a very
short time frame.  My guess is that each append reserves a full block
(128 MB in our configuration), leading to temporary disk "utilization"
until the appends are resolved into a single file.  We eliminated the
issue by turning these appends into one continuous write.
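
For anyone who wants to run the same check, here is a rough Python
sketch of the two steps: counting appends per file in the audit log,
and estimating how much space would be held if the guess above is
right.  The key=value layout (cmd=..., src=...) matches the audit
lines I have seen, but treat it as an assumption about your log
format.  The helper names, the sample lines, and the replication
factor of 3 (the HDFS default, not a value confirmed for this cluster)
are illustrative only.

```python
import re
from collections import Counter

# Assumed audit-line layout: whitespace-separated key=value fields,
# including cmd=<operation> and src=<path>.
_FIELD = re.compile(r"\b(cmd|src)=(\S+)")


def count_appends(lines):
    """Count cmd=append audit entries per source path."""
    counts = Counter()
    for line in lines:
        fields = dict(_FIELD.findall(line))
        if fields.get("cmd") == "append" and "src" in fields:
            counts[fields["src"]] += 1
    return counts


def reserved_bytes(open_blocks, block_size=128 * 1024**2, replication=3):
    """Upper bound on held space IF every open block reserves a full
    block on each replica (the guess above, not a confirmed mechanism)."""
    return open_blocks * block_size * replication


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from the real log.
    sample = [
        "allowed=true\tugi=app\tip=/10.0.0.1\tcmd=append\tsrc=/data/events.log\tdst=null",
        "allowed=true\tugi=app\tip=/10.0.0.1\tcmd=append\tsrc=/data/events.log\tdst=null",
        "allowed=true\tugi=app\tip=/10.0.0.1\tcmd=create\tsrc=/data/other.log\tdst=null",
    ]
    for path, n in count_appends(sample).most_common():
        print(n, path)
    print(reserved_bytes(1000) / 1024**3, "GiB")
```

Under those assumptions, 1000 blocks open at once would account for
375 GiB, which is the same order of magnitude as the gap we saw
between df and du.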

-Martin

On 12/22/2015 12:59 PM, Anu Engineer wrote:
> Just a guess, but could you please check what your dfs.replication is set to?
>
> You should be able to find that setting in hdfs-site.xml or in core-site.xml
>
> Thanks
> Anu
>  
>
> On 12/21/15, 6:21 PM, "Martin Serrano" <martin@attivio.com> wrote:
>
>> Hi,
>>
>> I have an application that is writing data rapidly directly to HDFS
>> (creates and appends) as well as to HBase (10-15 tables).  The disk free
>> for the filesystem will report that a large percentage of the system is
>> in use:
>>
>> $ hdfs dfs -df -h /
>> Filesystem     Size     Used  Available  Use%
>> hdfs://ha   882.6 G  472.6 G    409.9 G   54%
>>
>> Yet when I try to figure out where the disk space is being used,
>> dfs -du reports:
>>
>> $ hdfs dfs -du -h /
>> 0        /app-logs
>> 7.6 G    /apps
>> 382.2 M  /hdp
>> 0        /mapred
>> 0        /mr-history
>> 8.5 K    /tmp
>> 3.8 G    /user
>>
>> A dfsadmin -report from the same time frame is below.  I'm trying to
>> figure out where all of this space is going.  When my application is
>> killed or quiescent, the df and dfsadmin reports fall in line with what
>> I would expect.  I'm running HDP 2.3 with a default configuration as set
>> up by Ambari.  I'm looking for hints or suggestions on how I can
>> investigate this issue.  It seems crazy that ingesting 12 GB or so of
>> data can temporarily consume (reserve?) ~300 GB of HDFS.
>>
>> Thanks,
>> Martin
>>
>> Configured Capacity: 947644268544 (882.56 GB)
>> Present Capacity: 947064596261 (882.02 GB)
>> DFS Remaining: 490046627240 (456.39 GB)
>> DFS Used: 457017969021 (425.63 GB)
>> DFS Used%: 48.26%
>> Under replicated blocks: 0
>> Blocks with corrupt replicas: 0
>> Missing blocks: 0
>> Missing blocks (with replication factor 1): 0
>>
>> -------------------------------------------------
>> Live datanodes (3):
>>
>> Name: *.*.*.*:50010 (**********.com)
>> Hostname: **********.com
>> Decommission Status : Normal
>> Configured Capacity: 315881422848 (294.19 GB)
>> DFS Used: 218955099179 (203.92 GB)
>> Non DFS Used: 168255175 (160.46 MB)
>> DFS Remaining: 96758068494 (90.11 GB)
>> DFS Used%: 69.32%
>> DFS Remaining%: 30.63%
>> Configured Cache Capacity: 0 (0 B)
>> Cache Used: 0 (0 B)
>> Cache Remaining: 0 (0 B)
>> Cache Used%: 100.00%
>> Cache Remaining%: 0.00%
>> Xceivers: 15
>> Last contact: Mon Dec 21 17:17:38 EST 2015
>>
>>
>> Name: *.*.*.*:50010 (**********.com)
>> Hostname: **********.com
>> Decommission Status : Normal
>> Configured Capacity: 315881422848 (294.19 GB)
>> DFS Used: 218873337575 (203.84 GB)
>> Non DFS Used: 151608508 (144.59 MB)
>> DFS Remaining: 96856476765 (90.20 GB)
>> DFS Used%: 69.29%
>> DFS Remaining%: 30.66%
>> Configured Cache Capacity: 0 (0 B)
>> Cache Used: 0 (0 B)
>> Cache Remaining: 0 (0 B)
>> Cache Used%: 100.00%
>> Cache Remaining%: 0.00%
>> Xceivers: 16
>> Last contact: Mon Dec 21 17:17:38 EST 2015
>>
>>
>> Name: *.*.*.*:50010 (*************.com)
>> Hostname: ***********.com
>> Decommission Status : Normal
>> Configured Capacity: 315881422848 (294.19 GB)
>> DFS Used: 19189532267 (17.87 GB)
>> Non DFS Used: 259808600 (247.77 MB)
>> DFS Remaining: 296432081981 (276.07 GB)
>> DFS Used%: 6.07%
>> DFS Remaining%: 93.84%
>> Configured Cache Capacity: 0 (0 B)
>> Cache Used: 0 (0 B)
>> Cache Remaining: 0 (0 B)
>> Cache Used%: 100.00%
>> Cache Remaining%: 0.00%
>> Xceivers: 16
>> Last contact: Mon Dec 21 17:17:39 EST 2015
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@hadoop.apache.org
>> For additional commands, e-mail: user-help@hadoop.apache.org
>>
>>


