
[v1] vhost-vdpa: Set discarding of RAM broken when initializing the backend

Message ID 20210302162129.52912-1-david@redhat.com (mailing list archive)
State New, archived
Series [v1] vhost-vdpa: Set discarding of RAM broken when initializing the backend

Commit Message

David Hildenbrand March 2, 2021, 4:21 p.m. UTC
Similar to VFIO, vDPA will go ahead and map+pin all guest memory. Memory
that used to be discarded will get re-populated, and if we
discard+re-access memory after mapping+pinning, the pages mapped into the
vDPA IOMMU will go out of sync with the actual pages mapped into the user
space page tables.

Set discarding of RAM broken such that:
- virtio-mem and vhost-vdpa are mutually exclusive
- virtio-balloon is inhibited and no memory discards will get issued

In the future, we might be able to support coordinated discarding of RAM
as used by virtio-mem and as planned for VFIO.

Cc: Jason Wang <jasowang@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Cindy Lu <lulu@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---

Note: I was not actually able to reproduce/test, as I fail to get the
vdpa_sim/vdpa_sim_net running on upstream Linux (whatever vdpa, vhost_vdpa,
vdpa_sim, vdpa_sim_net modules I probe, and in whichever order, no vdpa
devices appear under /sys/bus/vdpa/devices/ or /dev/).

---
 hw/virtio/vhost-vdpa.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Comments

Jason Wang March 3, 2021, 2:53 a.m. UTC | #1
On 2021/3/3 12:21 AM, David Hildenbrand wrote:
> [...]


Acked-by: Jason Wang <jasowang@redhat.com>


> ---
>
> Note: I was not actually able to reproduce/test, as I fail to get the
> vdpa_sim/vdpa_sim_net running on upstream Linux (whatever vdpa, vhost_vdpa,
> vdpa_sim, vdpa_sim_net modules I probe, and in whichever order, no vdpa
> devices appear under /sys/bus/vdpa/devices/ or /dev/).


The device creation was switched to use the vdpa tool that is integrated 
with iproute2 [1].

[1] 
https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=143610383da51e1f868c6d5a2a5e2fb552293d18
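A minimal sequence with that tool looks roughly like this (module and
management-device names are those used by the upstream vdpa_sim_net
simulator; exact output varies by kernel version, and the commands need
root and a kernel built with the vdpa simulator):

```shell
# Load the net simulator and the vhost-vdpa bus driver.
modprobe vdpa_sim_net
modprobe vhost_vdpa

# List management devices; vdpa_sim_net should register "vdpasim_net".
vdpa mgmtdev show

# Create a vdpa device named vdpa0 on top of the management device.
vdpa dev add name vdpa0 mgmtdev vdpasim_net

# The device now shows up on the vdpa bus, and vhost_vdpa exposes a
# /dev/vhost-vdpa-* character device for it.
vdpa dev show vdpa0
ls /sys/bus/vdpa/devices/
ls /dev/vhost-vdpa-*
```

The resulting /dev/vhost-vdpa-0 is what the QEMU `-netdev type=vhost-vdpa,vhostdev=...` option expects.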


>
> ---
>   hw/virtio/vhost-vdpa.c | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
>
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 01d2101d09..86058d4041 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -278,6 +278,17 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque)
>       uint64_t features;
>       assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
>       trace_vhost_vdpa_init(dev, opaque);
> +    int ret;
> +
> +    /*
> +     * Similar to VFIO, we end up pinning all guest memory and have to
> +     * disable discarding of RAM.
> +     */
> +    ret = ram_block_discard_disable(true);
> +    if (ret) {
> +        error_report("Cannot set discarding of RAM broken");
> +        return ret;
> +    }


vDPA will support a non-pinning (shared VM) backend soon [2]. So I guess we 
need a flag to be advertised to userspace; then we can conditionally enable 
the discard here.

[2] https://www.spinics.net/lists/netdev/msg723944.html

Thanks


> [...]
David Hildenbrand March 3, 2021, 10:26 a.m. UTC | #2
On 03.03.21 03:53, Jason Wang wrote:
> 
> On 2021/3/3 12:21 AM, David Hildenbrand wrote:
>> [...]
> 
> 
> Acked-by: Jason Wang <jasowang@redhat.com>
> 
> 
>> ---
>>
>> Note: I was not actually able to reproduce/test, as I fail to get the
>> vdpa_sim/vdpa_sim_net running on upstream Linux (whatever vdpa, vhost_vdpa,
>> vdpa_sim, vdpa_sim_net modules I probe, and in whichever order, no vdpa
>> devices appear under /sys/bus/vdpa/devices/ or /dev/).
> 
> 
> The device creation was switched to use the vdpa tool that is integrated
> with iproute2 [1].
> 
> [1]
> https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=143610383da51e1f868c6d5a2a5e2fb552293d18

It would be great to document that somewhere, if not already done. I only 
found older RH documentation that was not aware of that. I'll give it 
a try - thanks!

> 
> 
>>
>> ---
>>    hw/virtio/vhost-vdpa.c | 13 +++++++++++++
>>    1 file changed, 13 insertions(+)
>>
>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>> index 01d2101d09..86058d4041 100644
>> --- a/hw/virtio/vhost-vdpa.c
>> +++ b/hw/virtio/vhost-vdpa.c
>> @@ -278,6 +278,17 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque)
>>        uint64_t features;
>>        assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
>>        trace_vhost_vdpa_init(dev, opaque);
>> +    int ret;
>> +
>> +    /*
>> +     * Similar to VFIO, we end up pinning all guest memory and have to
>> +     * disable discarding of RAM.
>> +     */
>> +    ret = ram_block_discard_disable(true);
>> +    if (ret) {
>> +        error_report("Cannot set discarding of RAM broken");
>> +        return ret;
>> +    }
> 
> 
> vDPA will support a non-pinning (shared VM) backend soon [2]. So I guess we
> need a flag to be advertised to userspace; then we can conditionally enable
> the discard here.

I thought that was already the default (because I stumbled over 
enforcing guest IOMMU) but was surprised when I had a look at the 
implementation.

Having a flag sounds good.

BTW: I assume IOMMU support is not fully working yet, right? I don't see 
special casing for IOMMU regions, including registering the listener and 
updating the mappings.
David Hildenbrand March 3, 2021, 10:37 a.m. UTC | #3
On 03.03.21 11:26, David Hildenbrand wrote:
> On 03.03.21 03:53, Jason Wang wrote:
>>
>> On 2021/3/3 12:21 AM, David Hildenbrand wrote:
>>> [...]
>>
>>
>> The device creation was switched to use the vdpa tool that is integrated
>> with iproute2 [1].
>>
>> [1]
>> https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=143610383da51e1f868c6d5a2a5e2fb552293d18
> 
> It would be great to document that somewhere, if not already done. I only
> found older RH documentation that was not aware of that. I'll give it
> a try - thanks!

Seems to work just fine:


$ sudo ./build/qemu-system-x86_64 -m 2G,maxmem=4G --enable-kvm -object memory-backend-ram,id=mem0,size=2G  -device virtio-mem-pci,id=vmem0,memdev=mem0,node=0,requested-size=0G  -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa1 -device virtio-net-pci,netdev=vhost-vdpa1,mac=00:e8:ca:33:ba:05,disable-modern=off,page-per-vq=on -nographic
qemu-system-x86_64: -device virtio-mem-pci,id=vmem0,memdev=mem0,node=0,requested-size=0G: Discarding RAM is disabled

I think the -netdev is always processed/initialized before the
"-device virtio-mem-pci", which is why we always fail from virtio-mem code
right now and not from vhost-vdpa code.
Jason Wang March 4, 2021, 9:32 a.m. UTC | #4
On 2021/3/3 6:26 PM, David Hildenbrand wrote:
> On 03.03.21 03:53, Jason Wang wrote:
>>
>> On 2021/3/3 12:21 AM, David Hildenbrand wrote:
>>> [...]
>>
>>
>> The device creation was switched to use the vdpa tool that is integrated
>> with iproute2 [1].
>>
>> [1]
>> https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=143610383da51e1f868c6d5a2a5e2fb552293d18 
>>
>
> It would be great to document that somewhere, if not already done. I 
> only found older RH documentation that was not aware of that. I'll 
> give it a try - thanks!


Will think about this. Which RH doc do you refer to here? Is it the 
Red Hat blog?


>
>>
>>
>>> [...]
>>
>>
>> vDPA will support a non-pinning (shared VM) backend soon [2]. So I guess we
>> need a flag to be advertised to userspace; then we can conditionally enable
>> the discard here.
>
> I thought that was already the default (because I stumbled over 
> enforcing guest IOMMU) but was surprised when I had a look at the 
> implementation.
>
> Having a flag sounds good.
>
> BTW: I assume iommu support is not fully working yet, right? I don't 
> see special casing for iommu regions, including registering the 
> listener and updating the mapping.


It's not yet implemented. Yes, it will be something like what VFIO does 
right now, e.g. using IOMMU notifiers.

Thanks
David Hildenbrand March 4, 2021, 9:34 a.m. UTC | #5
On 04.03.21 10:32, Jason Wang wrote:
> 
> On 2021/3/3 6:26 PM, David Hildenbrand wrote:
>> On 03.03.21 03:53, Jason Wang wrote:
>>>
>>> On 2021/3/3 12:21 AM, David Hildenbrand wrote:
>>>> [...]
>>>
>>>
>>> The device creation was switched to use the vdpa tool that is integrated
>>> with iproute2 [1].
>>>
>>> [1]
>>> https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=143610383da51e1f868c6d5a2a5e2fb552293d18
>>>
>>
>> It would be great to document that somewhere, if not already done. I
>> only found older RH documentation that was not aware of that. I'll
>> give it a try - thanks!
> 
> 
> Will think about this. Which RH doc do you refer here? Is this the
> redhat blog?

https://www.redhat.com/en/blog/hands-vdpa-what-do-you-do-when-you-aint-got-hardware

As it's supposedly from October 14, 2020, I was surprised not to get 
it running (even with older kernels, IIRC).
Jason Wang March 4, 2021, 9:37 a.m. UTC | #6
On 2021/3/4 5:34 PM, David Hildenbrand wrote:
> On 04.03.21 10:32, Jason Wang wrote:
>>
>> On 2021/3/3 6:26 PM, David Hildenbrand wrote:
>>> On 03.03.21 03:53, Jason Wang wrote:
>>>>
>>>> On 2021/3/3 12:21 AM, David Hildenbrand wrote:
>>>>> [...]
>>>>
>>>>
>>>> The device creation was switched to use the vdpa tool that is integrated
>>>> with iproute2 [1].
>>>>
>>>> [1]
>>>> https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=143610383da51e1f868c6d5a2a5e2fb552293d18 
>>>>
>>>>
>>>
>>> It would be great to document that somewhere, if not already done. I
>>> only found older RH documentation that was not aware of that. I'll
>>> give it a try - thanks!
>>
>>
>> Will think about this. Which RH doc do you refer here? Is this the
>> redhat blog?
>
> https://www.redhat.com/en/blog/hands-vdpa-what-do-you-do-when-you-aint-got-hardware 
>
>
> As it's supposed to be from October 14, 2020 I was surprised to not 
> get it running (even with older kernels IIRC).


Right, the mgmt API was only just merged. Will try to see if we can fix the blog.

Thanks
David Hildenbrand April 29, 2021, 7:51 a.m. UTC | #7
On 02.03.21 17:21, David Hildenbrand wrote:
> Similar to VFIO, vDPA will go ahead and map+pin all guest memory. Memory
> that used to be discarded will get re-populated, and if we
> discard+re-access memory after mapping+pinning, the pages mapped into the
> vDPA IOMMU will go out of sync with the actual pages mapped into the user
> space page tables.
> 
> Set discarding of RAM broken such that:
> - virtio-mem and vhost-vdpa are mutually exclusive
> - virtio-balloon is inhibited and no memory discards will get issued
> 
> In the future, we might be able to support coordinated discarding of RAM
> as used by virtio-mem and as planned for VFIO.
> 
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Cindy Lu <lulu@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> 
> Note: I was not actually able to reproduce/test, as I fail to get the
> vdpa_sim/vdpa_sim_net running on upstream Linux (whatever vdpa, vhost_vdpa,
> vdpa_sim, vdpa_sim_net modules I probe, and in whichever order, no vdpa
> devices appear under /sys/bus/vdpa/devices/ or /dev/).
> 
> [...]
> 

@MST, do you have this on your radar? thanks

Patch

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 01d2101d09..86058d4041 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -278,6 +278,17 @@  static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque)
     uint64_t features;
     assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
     trace_vhost_vdpa_init(dev, opaque);
+    int ret;
+
+    /*
+     * Similar to VFIO, we end up pinning all guest memory and have to
+     * disable discarding of RAM.
+     */
+    ret = ram_block_discard_disable(true);
+    if (ret) {
+        error_report("Cannot set discarding of RAM broken");
+        return ret;
+    }
 
     v = opaque;
     v->dev = dev;
@@ -302,6 +313,8 @@  static int vhost_vdpa_cleanup(struct vhost_dev *dev)
     memory_listener_unregister(&v->listener);
 
     dev->opaque = NULL;
+    ram_block_discard_disable(false);
+
     return 0;
 }