List:       ceph-users
Subject:    [ceph-users] OSD on XFS ENOSPC at 84% data / 5% inode and inode64?
From:       laurent () guerby ! net (Laurent GUERBY)
Date:       2015-11-27 9:00:39
Message-ID: 1448614839.1905.415.camel () guerby ! net

On Thu, 2015-11-26 at 22:13 +0300, Andrey Korolyov wrote:
> On Thu, Nov 26, 2015 at 1:29 AM, Laurent GUERBY <laurent at guerby.net> wrote:
> > Hi,
> > 
> > After our trouble with the ext4/xattr soft lockup kernel bug, we started
> > moving some of our OSDs to XFS. We're using Ubuntu 14.04 with the 3.19
> > kernel and Ceph 0.94.5.
> > 
> > Two of our 28 rotational OSDs run XFS, and they both get restarted
> > regularly because they terminate with "ENOSPC":
> > 
> > 2015-11-25 16:51:08.015820 7f6135153700  0 filestore(/var/lib/ceph/osd/ceph-11)  error (28) No space left on device not handled on operation 0xa0f4d520 (12849173.0.4, or op 4, counting from 0)
> > 2015-11-25 16:51:08.015837 7f6135153700  0 filestore(/var/lib/ceph/osd/ceph-11) ENOSPC handling not implemented
> > 2015-11-25 16:51:08.015838 7f6135153700  0 filestore(/var/lib/ceph/osd/ceph-11)  transaction dump:
> > ...
> > {
> > "op_num": 4,
> > "op_name": "write",
> > "collection": "58.2d5_head",
> > "oid": "53e4fed5\/rbd_data.11f20f75aac8266.00000000000a79eb\/head\/\/58",
> > "length": 73728,
> > "offset": 4120576,
> > "bufferlist length": 73728
> > },
> > 
> > (Writing the last 73728 bytes = 72 KiB of a 4 MiB object, if I'm reading
> > this correctly: offset 4120576 + length 73728 = 4194304.)
> > 
> > Mount options:
> > 
> > /dev/sdb1 /var/lib/ceph/osd/ceph-11 xfs rw,noatime,attr2,inode64,noquota
> > 
> > Space and Inodes:
> > 
> > Filesystem     Type      1K-blocks       Used Available Use% Mounted on
> > /dev/sdb1      xfs      1947319356 1624460408 322858948  84% /var/lib/ceph/osd/ceph-11
> > 
> > Filesystem     Type        Inodes   IUsed     IFree IUse% Mounted on
> > /dev/sdb1      xfs       48706752 1985587  46721165    5% /var/lib/ceph/osd/ceph-11
> > 
> > We're only using rbd devices, so writes are at most 4 MB per object. How
> > can we get ENOSPC for a 4 MB operation with over 300 GB of free space?
> > 
> > The most surprising thing is that after the automatic restart,
> > disk usage keeps increasing and we no longer get ENOSPC for a while.
> > 
> > Did we miss a needed XFS mount option? Have other Ceph users
> > encountered this issue with XFS?
> > 
> > We have no such issue with ~96% full ext4 OSDs (after setting the right
> > values for the various ceph "fill" options).
> > 
> > Thanks in advance,
> > 
> > Laurent
> > 
> 
> Hi, from the given numbers one can conclude that you are facing some
> kind of XFS preallocation bug, because the raw space divided by the
> number of files is about four times lower than the 4 MB object size.
> At a glance, it could be avoided by specifying a relatively small
> allocsize= mount option, at some cost to overall performance;
> appropriate benchmarks can be found through ceph-users/ceph-devel.
> Also, do you plan to keep the overcommit ratio that high forever?
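
The arithmetic behind the two observations above can be checked with a
short sketch. It uses only the figures quoted in this thread (the df
output and the transaction dump); the function names are illustrative
and not part of any Ceph or XFS tooling. It also assumes that every used
inode is an object file, which overcounts slightly because directories
consume inodes too.

```python
# Figures copied from the df output quoted in this thread.
TOTAL_KIB = 1947319356      # "1K-blocks" column
INODES_USED = 1985587       # "IUsed" column

# 4 MiB rbd object size, as stated in the thread.
OBJECT_BYTES = 4 * 1024 * 1024
OBJECT_KIB = OBJECT_BYTES // 1024

# The failing write from the transaction dump.
WRITE_OFFSET = 4120576
WRITE_LENGTH = 73728


def write_end(offset: int, length: int) -> int:
    """Byte position just past the end of the write."""
    return offset + length


def raw_kib_per_file(total_kib: int, inodes_used: int) -> float:
    """Raw space divided by the number of files (approximated by used inodes)."""
    return total_kib / inodes_used


def object_to_file_ratio(total_kib: int, inodes_used: int) -> float:
    """How many times smaller the per-file share is than one 4 MiB object."""
    return OBJECT_KIB / raw_kib_per_file(total_kib, inodes_used)


if __name__ == "__main__":
    end = write_end(WRITE_OFFSET, WRITE_LENGTH)
    print("write ends at byte", end, "(object size is", OBJECT_BYTES, ")")
    print("raw KiB per file:", round(raw_kib_per_file(TOTAL_KIB, INODES_USED), 1))
    print("ratio to 4 MiB:", round(object_to_file_ratio(TOTAL_KIB, INODES_USED), 2))
```

The write ends exactly at the 4 MiB boundary, which matches the reading
that it is the tail of a full-size object, and the per-file share of raw
space comes out a little over four times smaller than one object.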

Hi again,

Looks like we hit a bug in image deletion leaving objects undeleted on
disk:

http://tracker.ceph.com/issues/13894

I assume we'll get a lot more free space when it's fixed :).

Laurent

