
List:       qemu-ppc
Subject:    Re: [Qemu-ppc] Proper usage of the SPAPR vIOMMU with VFIO?
From:       Shawn Anastasio <shawn@anastas.io>
Date:       2019-04-30 23:28:33
Message-ID: f03e3253-a2df-fe49-c68a-ee60ffb8a51d@anastas.io

Thanks for the explanation of the difference between the two IOMMU
modes. Thankfully the limitations imposed by VFIO_SPAPR_TCE_IOMMU
aren't an issue for my use case.

On 4/29/19 9:12 PM, Alexey Kardashevskiy wrote:
> 
> 
> On 30/04/2019 07:42, Shawn Anastasio wrote:
>> Hello David,
>>
>> Thank you very much! Following your advice I created a separate
>> spapr-pci-host-bridge device and attached the ivshmem device to that.
>> Now I'm able to access the device through VFIO as expected!
> 
> Cool. Hosts and guests differ quite a bit in how they group PCI
> devices for VFIO.
> 
>> As an aside, I had to use VFIO_SPAPR_TCE_IOMMU rather than
>> VFIO_SPAPR_TCE_v2_IOMMU (I'm still not clear on the difference).
> 
> 
> VFIO_SPAPR_TCE_IOMMU allows DMA to the first 1 or 2GB of the PCI address
> space, mapped dynamically via the IOMMU. Linux in the guest does
> frequent map/unmap cycles - say, one per network packet - but you could
> also just allocate some memory up to that window size, map it once, and
> keep using it.
> 
> VFIO_SPAPR_TCE_v2_IOMMU maps the entire guest RAM into the PCI address
> space (at some high offset == 1<<59), so the guest does not need to keep
> talking to the IOMMU once such a mapping is set up.
> 
> At the moment VFIO_SPAPR_TCE_v2_IOMMU is only supported on bare metal
> (we are planning to add support for it in guests). Since you are working
> with a guest, VFIO_SPAPR_TCE_IOMMU is your only choice.
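
For anyone who finds this thread later, here's a minimal sketch of the
userspace flow this implies with VFIO_SPAPR_TCE_IOMMU. Error handling is
elided, and the group number, buffer size, and IOVA choice are just from
my test setup, not anything mandated:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <linux/vfio.h>

    int main(void)
    {
        /* Container first, then the group (group number varies). */
        int container = open("/dev/vfio/vfio", O_RDWR);
        int group = open("/dev/vfio/0", O_RDWR);

        /* The group must be attached to the container before an
         * IOMMU model can be selected. */
        ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
        ioctl(container, VFIO_SET_IOMMU, VFIO_SPAPR_TCE_IOMMU);

        /* Query the 32-bit DMA window described above. */
        struct vfio_iommu_spapr_tce_info info = { .argsz = sizeof(info) };
        ioctl(container, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
        printf("DMA window: start=0x%x size=0x%x\n",
               info.dma32_window_start, info.dma32_window_size);

        /* The v1 SPAPR IOMMU must be enabled before mapping; this is
         * where the locked-memory accounting happens. */
        ioctl(container, VFIO_IOMMU_ENABLE);

        /* Allocate one buffer, map it once, and keep reusing it,
         * rather than mapping/unmapping per transfer. */
        size_t len = 1 << 20;
        void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        struct vfio_iommu_type1_dma_map map = {
            .argsz = sizeof(map),
            .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
            .vaddr = (__u64)(unsigned long)buf,
            .iova  = info.dma32_window_start,
            .size  = len,
        };
        ioctl(container, VFIO_IOMMU_MAP_DMA, &map);
        return 0;
    }
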
> 
> 
>> After making this change it works as expected.
>>
>> I have yet to test hotplugging of additional ivshmem devices (via QMP),
>> but as long as I specify the correct spapr-pci-host-bridge bus I see
>> no reason why that wouldn't work too.
>>
>> Thanks again!
>> Shawn
>>
>> On 4/16/19 10:05 PM, David Gibson wrote:
>>> On Thu, Apr 04, 2019 at 05:15:50PM -0500, Shawn Anastasio wrote:
>>>> Hello all,
>>>
>>> Sorry I've taken so long to reply. I didn't spot this for a while (I
>>> only read the qemu-ppc list irregularly) and then was busy for a while
>>> more.
>>>
>>>> I'm attempting to write a VFIO driver for QEMU's ivshmem shared memory
>>>> device on a ppc64 guest. Unfortunately, without using VFIO's unsafe
>>>> No-IOMMU mode, I'm unable to properly interface with the device.
>>>
>>> So, you want to write a driver in guest userspace, accessing the
>>> device emulated by ivshmem via VFIO. Is that right?
>>>
>>> I'm assuming your guest is under KVM/qemu rather than being an LPAR
>>> under PowerVM.
>>>
>>>> When booting the guest with the iommu=on kernel parameter,
>>>
>>> The iommu=on parameter shouldn't make a difference. PAPR guests
>>> *always* have a guest-visible IOMMU.
>>>
>>>> the ivshmem
>>>> device can be bound to the vfio_pci kernel module and a group at
>>>> /dev/vfio/0 appears. When opening the group and checking its flags
>>>> with VFIO_GROUP_GET_STATUS, though, VFIO_GROUP_FLAGS_VIABLE is not
>>>> set. Ignoring this and attempting to set the VFIO container's IOMMU
>>>> mode to VFIO_SPAPR_TCE_v2_IOMMU fails with EPERM, though I'm not
>>>> sure if that's related.
>>>
>>> Yeah, the group will need to be viable before you can attach it to a
>>> container.
>>>
>>> I'm guessing the reason it's not is that some devices in the guest
>>> side IOMMU group are still bound to kernel drivers, rather than VFIO
>>> (or simply being unbound).
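
For reference, checking viability from userspace is just a couple of
ioctls. A minimal sketch (the group number 0 is from my setup and will
vary):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/vfio.h>

    int main(void)
    {
        int group = open("/dev/vfio/0", O_RDWR);
        struct vfio_group_status status = { .argsz = sizeof(status) };

        ioctl(group, VFIO_GROUP_GET_STATUS, &status);
        if (!(status.flags & VFIO_GROUP_FLAGS_VIABLE))
            fprintf(stderr, "group not viable: some device in it is "
                            "still bound to a kernel driver\n");
        return 0;
    }
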
>>>
>>> Under PAPR, an IOMMU group generally consists of everything under the
>>> same virtual PCI Host Bridge (vPHB) - i.e. an entire (guest-side) PCI
>>> domain. Or at least, that's how the qemu implementation of PAPR does
>>> it. It's not strictly required by PAPR, but it's pretty awkward to do
>>> otherwise.
>>>
>>> So, chances are you have your guest's disk and network on the same,
>>> default vPHB, meaning it's in the same (guest) IOMMU group as the
>>> ivshmem, which means it can't be safely used by userspace VFIO.
>>>
>>> However, unlike on x86, making extra vPHBs is very straightforward.
>>> Use something like:
>>>             -device spapr-pci-host-bridge,index=1,id=phb
>>>
>>> Then add bus=phb.0 to your ivshmem to put it on the secondary PHB. It
>>> will then be in its own IOMMU group and you should be able to use it
>>> in guest userspace.
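
For completeness, the invocation I ended up with looks roughly like the
following (the memory backend name, path, and size are placeholders from
my test setup, not something this thread prescribes):

    qemu-system-ppc64 [...] \
        -device spapr-pci-host-bridge,index=1,id=phb \
        -object memory-backend-file,id=shmem0,share=on,mem-path=/dev/shm/ivshmem,size=4M \
        -device ivshmem-plain,memdev=shmem0,bus=phb.0
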
>>>
>>> Note that before you're able to map user memory into the IOMMU, you'll
>>> also need to "preregister" it with the ioctl
>>> VFIO_IOMMU_SPAPR_REGISTER_MEMORY. [This is because, when passing a
>>> device through to a guest - which always has a vIOMMU, remember -
>>> doing accounting on every VFIO_IOMMU_MAP_DMA can be pretty expensive
>>> in a hot path. The preregistration step lets us register all guest
>>> memory up front, handle the accounting then, and let the actual maps
>>> and unmaps go faster.]
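
In case it saves someone else a trip through the headers, the
preregistration call itself is small. A sketch; the helper name and
buffer are mine, and as noted above this applies to the v2 IOMMU:

    #include <stddef.h>
    #include <sys/ioctl.h>
    #include <linux/vfio.h>

    /* Preregister a userspace buffer with a VFIO_SPAPR_TCE_v2_IOMMU
     * container so that later VFIO_IOMMU_MAP_DMA calls on it are cheap.
     * buf and len should be page-aligned. The counterpart for teardown
     * is VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY. */
    static int preregister(int container, void *buf, size_t len)
    {
        struct vfio_iommu_spapr_register_memory reg = {
            .argsz = sizeof(reg),
            .flags = 0,
            .vaddr = (__u64)(unsigned long)buf,
            .size  = len,
        };
        return ioctl(container, VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
    }
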
>>>
>>
> 

