[prev in list] [next in list] [prev in thread] [next in thread] 

List:       qemu-devel
Subject:    Re: [Qemu-devel] [RFC PATCH 14/17] block: support FALLOC_FL_PUNCH_HOLE trimming
From:       Paolo Bonzini <pbonzini () redhat ! com>
Date:       2012-03-12 9:34:01
Message-ID: 4F5DC309.7050709 () redhat ! com
[Download RAW message or body]

Il 09/03/2012 21:36, Richard Laager ha scritto:
> fallocate() only supports files. In my patch, I was fstat()ing the fd
> just after opening it and caching an is_device boolean. Then, when doing
> discards, if !is_device, call fallocate(), else (i.e. for devices) do
> the following (note: untested code):
> 
>         __uint64_t range[2];
> 
>         range[0] = sector_num << 9;
>         range[1] = (__uint64_t)nb_sectors << 9;
> 
>         if (ioctl(s->fd, BLKDISCARD, &range) && errno != EOPNOTSUPP) {
>              return -errno;
>         }
>         return 0;

You can instead put this code in a separate function hdev_discard.  hdev
device is used instead of file when the backend is a special file (block
or character).

> -------- sync requirements --------
> 
> I'm not sure if fallocate() and/or BLKDISCARD always guarantee that the
> discard has made it to stable storage. If they don't, does O_DIRECT or
> O_DSYNC on open() cause them to make any such guarantee?

O_DIRECT shouldn't have any effect; for O_DSYNC I don't think so.

> -------- Thin vs. Thick Provisioning --------
> 
> On Fri, 2012-03-09 at 00:35 -0800, Chris Wedgwood wrote:
>> Simplest still compare the blocks allocated by the file
>> to it's length (ie. stat.st_blocks != stat.st_size>>9).
> 
> I thought of this as well. It covers "99%" of the cases, but there's one
> case where it breaks down. Imagine I have a sparse file backing my
> virtual disk. In the guest, I fill the virtual disk 100%. Then, I
> restart QEMU. Now it thinks that sparse file is non-sparse and stops
> issuing hole punches.

I would not really have any problem with that, it seems like a
relatively minor case (also because newer guests will allocate the first
partition at 1 MB, so you might still have a small hole at the beginning).

> To be completely correct, I suggest the following behavior:
>      1. Add a discard boolean option to the disk layer.
>      2. If discard is not specified:
>               * For files, detect a true/false value by comparing
>                 stat.st_blocks != stat.st_size>>9.
>               * For devices, assume a fixed value (true?).
>      3. If discard is true, issue discards.
>      4. If discard is false, do not issue discards.

The problem is, who will use this interface?

Paolo

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic