Message ID: 20241205203430.76251-1-sahilcdq@proton.me (mailing list archive)
Series: Add packed virtqueue to shadow virtqueue
On Thu, Dec 5, 2024 at 9:34 PM Sahil Siddiq <icegambit91@gmail.com> wrote: > > Hi, > > There are two issues that I found while trying to test > my changes. I thought I would send the patch series > as well in case that helps in troubleshooting. I haven't > been able to find an issue in the implementation yet. > Maybe I am missing something. > > I have been following the "Hands on vDPA: what do you do > when you ain't got the hardware v2 (Part 2)" [1] blog to > test my changes. To boot the L1 VM, I ran: > > sudo ./qemu/build/qemu-system-x86_64 \ > -enable-kvm \ > -drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \ > -net nic,model=virtio \ > -net user,hostfwd=tcp::2222-:22 \ > -device intel-iommu,snoop-control=on \ > -device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,packed=on,event_idx=off,bus=pcie.0,addr=0x4 \ > -netdev tap,id=net0,script=no,downscript=no \ > -nographic \ > -m 8G \ > -smp 4 \ > -M q35 \ > -cpu host 2>&1 | tee vm.log > > Without "guest_uso4=off,guest_uso6=off,host_uso=off, > guest_announce=off" in "-device virtio-net-pci", QEMU > throws "vdpa svq does not work with features" [2] when > trying to boot L2. > > The enums added in commit #2 in this series is new and > wasn't in the earlier versions of the series. Without > this change, x-svq=true throws "SVQ invalid device feature > flags" [3] and x-svq is consequently disabled. > > The first issue is related to running traffic in L2 > with vhost-vdpa. > > In L0: > > $ ip addr add 111.1.1.1/24 dev tap0 > $ ip link set tap0 up > $ ip addr show tap0 > 4: tap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000 > link/ether d2:6d:b9:61:e1:9a brd ff:ff:ff:ff:ff:ff > inet 111.1.1.1/24 scope global tap0 > valid_lft forever preferred_lft forever > inet6 fe80::d06d:b9ff:fe61:e19a/64 scope link proto kernel_ll > valid_lft forever preferred_lft forever > > I am able to run traffic in L2 when booting without > x-svq. > > In L1: > > $ ./qemu/build/qemu-system-x86_64 \ > -nographic \ > -m 4G \ > -enable-kvm \ > -M q35 \ > -drive file=//root/L2.qcow2,media=disk,if=virtio \ > -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0 \ > -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ > -smp 4 \ > -cpu host \ > 2>&1 | tee vm.log > > In L2: > > # ip addr add 111.1.1.2/24 dev eth0 > # ip addr show eth0 > 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 > link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff > altname enp0s7 > inet 111.1.1.2/24 scope global eth0 > valid_lft forever preferred_lft forever > inet6 fe80::9877:de30:5f17:35f9/64 scope link noprefixroute > valid_lft forever preferred_lft forever > > # ip route > 111.1.1.0/24 dev eth0 proto kernel scope link src 111.1.1.2 > > # ping 111.1.1.1 -w3 > PING 111.1.1.1 (111.1.1.1) 56(84) bytes of data. > 64 bytes from 111.1.1.1: icmp_seq=1 ttl=64 time=0.407 ms > 64 bytes from 111.1.1.1: icmp_seq=2 ttl=64 time=0.671 ms > 64 bytes from 111.1.1.1: icmp_seq=3 ttl=64 time=0.291 ms > > --- 111.1.1.1 ping statistics --- > 3 packets transmitted, 3 received, 0% packet loss, time 2034ms > rtt min/avg/max/mdev = 0.291/0.456/0.671/0.159 ms > > > But if I boot L2 with x-svq=true as shown below, I am unable > to ping the host machine. 
> > $ ./qemu/build/qemu-system-x86_64 \ > -nographic \ > -m 4G \ > -enable-kvm \ > -M q35 \ > -drive file=//root/L2.qcow2,media=disk,if=virtio \ > -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=true,id=vhost-vdpa0 \ > -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ > -smp 4 \ > -cpu host \ > 2>&1 | tee vm.log > > In L2: > > # ip addr add 111.1.1.2/24 dev eth0 > # ip addr show eth0 > 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 > link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff > altname enp0s7 > inet 111.1.1.2/24 scope global eth0 > valid_lft forever preferred_lft forever > inet6 fe80::9877:de30:5f17:35f9/64 scope link noprefixroute > valid_lft forever preferred_lft forever > > # ip route > 111.1.1.0/24 dev eth0 proto kernel scope link src 111.1.1.2 > > # ping 111.1.1.1 -w10 > PING 111.1.1.1 (111.1.1.1) 56(84) bytes of data. > From 111.1.1.2 icmp_seq=1 Destination Host Unreachable > ping: sendmsg: No route to host > From 111.1.1.2 icmp_seq=2 Destination Host Unreachable > From 111.1.1.2 icmp_seq=3 Destination Host Unreachable > > --- 111.1.1.1 ping statistics --- > 3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2076ms > pipe 3 > > The other issue is related to booting L2 with "x-svq=true" > and "packed=on". > > In L1: > > $ ./qemu/build/qemu-system-x86_64 \ > -nographic \ > -m 4G \ > -enable-kvm \ > -M q35 \ > -drive file=//root/L2.qcow2,media=disk,if=virtio \ > -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0,x-svq=true \ > -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,packed=on,bus=pcie.0,addr=0x7 \ > -smp 4 \ > -cpu host \ > 2>&1 | tee vm.log > > The kernel throws "virtio_net virtio1: output.0:id 0 is not > a head!" [4]. > So this series implements the descriptor forwarding from the guest to the device in packed vq. We also need to forward the descriptors from the device to the guest. The device writes them in the SVQ ring. The functions responsible for that in QEMU are hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_flush, which is called by the device when used descriptors are written to the SVQ, which calls hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_get_buf. We need to do modifications similar to vhost_svq_add: Make them conditional if we're in split or packed vq, and "copy" the code from Linux's drivers/virtio/virtio_ring.c:virtqueue_get_buf. After these modifications you should be able to ping and forward traffic. As always, It is totally ok if it needs more than one iteration, and feel free to ask any question you have :). > Here's part of the trace: > > [...] > [ 945.370085] watchdog: BUG: soft lockup - CPU#2 stuck for 863s! 
[NetworkManager:795] > [ 945.372467] Modules linked in: rfkill intel_rapl_msr intel_rapl_common intel_uncore_frequency_common intel_pmc_core intel_vsec pmt_g > [ 945.387413] CPU: 2 PID: 795 Comm: NetworkManager Tainted: G L 6.8.7-200.fc39.x86_64 #1 > [ 945.390685] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 > [ 945.394256] RIP: 0010:virtnet_poll+0xd8/0x5c0 [virtio_net] > [ 945.395998] Code: c0 74 5c 65 8b 05 24 37 8b 3f 41 89 86 c4 00 00 00 80 bb 40 04 00 00 00 75 32 48 8b 3b e8 00 00 28 c7 48 89 df be8 > [ 945.401465] RSP: 0018:ffffabaec0134e48 EFLAGS: 00000246 > [ 945.403362] RAX: ffff9bf904432000 RBX: ffff9bf9085b1800 RCX: 00000000ffff0001 > [ 945.405447] RDX: 0000000000008080 RSI: 0000000000000001 RDI: ffff9bf9085b1800 > [ 945.408361] RBP: ffff9bf9085b0808 R08: 0000000000000001 R09: ffffabaec0134ba8 > [ 945.410828] R10: ffffabaec0134ba0 R11: 0000000000000003 R12: ffff9bf905a34ac0 > [ 945.413272] R13: 0000000000000040 R14: ffff9bf905a34a00 R15: ffff9bf9085b0800 > [ 945.415180] FS: 00007fa81f0f1540(0000) GS:ffff9bf97bd00000(0000) knlGS:0000000000000000 > [ 945.418177] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 945.419415] CR2: 000055614ba8dc48 CR3: 0000000102b42006 CR4: 0000000000770ef0 > [ 945.423312] PKRU: 55555554 > [ 945.424238] Call Trace: > [ 945.424238] <IRQ> > [ 945.426236] ? watchdog_timer_fn+0x1e6/0x270 > [ 945.427304] ? __pfx_watchdog_timer_fn+0x10/0x10 > [ 945.428239] ? __hrtimer_run_queues+0x10f/0x2b0 > [ 945.431304] ? hrtimer_interrupt+0xf8/0x230 > [ 945.432236] ? __sysvec_apic_timer_interrupt+0x4d/0x140 > [ 945.434187] ? sysvec_apic_timer_interrupt+0x39/0x90 > [ 945.436306] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 > [ 945.438199] ? virtnet_poll+0xd8/0x5c0 [virtio_net] > [ 945.438199] ? virtnet_poll+0xd0/0x5c0 [virtio_net] > [ 945.440197] ? handle_irq_event+0x50/0x80 > [ 945.442415] ? sched_clock_cpu+0x5e/0x190 > [ 945.444563] ? irqtime_account_irq+0x40/0xc0 > [ 945.446191] __napi_poll+0x28/0x1c0 > [ 945.446191] net_rx_action+0x2a4/0x380 > [ 945.448851] ? _raw_spin_unlock_irqrestore+0xe/0x40 > [ 945.450209] ? note_gp_changes+0x6c/0x80 > [ 945.452252] __do_softirq+0xc9/0x2c8 > [ 945.453579] do_softirq.part.0+0x3d/0x60 > [ 945.454188] </IRQ> > [ 945.454188] <TASK> > [ 945.456175] __local_bh_enable_ip+0x68/0x70 > [ 945.458373] virtnet_open+0xdc/0x310 [virtio_net] > [ 945.460005] __dev_open+0xfa/0x1b0 > [ 945.461310] __dev_change_flags+0x1dc/0x250 > [ 945.462800] dev_change_flags+0x26/0x70 > [ 945.464190] do_setlink+0x375/0x12d0 > [...] > > I am not sure if this issue is similar to the one > described in this patch (race between channels > setting and refill) [5]. As described in the patch, > I see drivers/net/virtio_net:virtnet_open invoke > try_fill_recv() and schedule_delayed_work() [6]. I > am unfamiliar with this and so I am not sure how to > progress. > > Maybe I can try disabling napi and checking it out > if that is possible. Would this be a good next step > to troubleshoot the kernel crash? > > Thanks, > Sahil > > Changes v3 -> v4: > - Split commit #1 of v3 into commit #1 and #2 in > this series [7]. > - Commit #3 is commit #2 of v3. > - Commit #4 is based on commit #3 of v3. > - Commit #5 was sent as an individual patch [8]. > - vhost-shadow-virtqueue.c > (vhost_svq_valid_features): Add enums. > (vhost_svq_memory_packed): Remove function. > (vhost_svq_driver_area_size,vhost_svq_descriptor_area_size): Decouple functions. > (vhost_svq_device_area_size): Rewrite function. 
> (vhost_svq_start): Simplify implementation. > (vhost_svq_stop): Unconditionally munmap(). > - vhost-shadow-virtqueue.h: New function declaration. > - vhost-vdpa.c > (vhost_vdpa_svq_unmap_rings): Call vhost_vdpa_svq_unmap_ring(). > (vhost_vdpa_svq_map_rings): New mappings. > (vhost_vdpa_svq_setup): Add comment. > > [1] https://www.redhat.com/en/blog/hands-vdpa-what-do-you-do-when-you-aint-got-hardware-part-2 > [2] https://gitlab.com/qemu-project/qemu/-/blob/master/net/vhost-vdpa.c#L167 > [3] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-shadow-virtqueue.c#L58 > [4] https://github.com/torvalds/linux/blob/master/drivers/virtio/virtio_ring.c#L1763 > [5] https://lkml.iu.edu/hypermail/linux/kernel/1307.0/01455.html > [6] https://github.com/torvalds/linux/blob/master/drivers/net/virtio_net.c#L3104 > [7] https://lists.nongnu.org/archive/html/qemu-devel/2024-08/msg01148.html > [8] https://lists.nongnu.org/archive/html/qemu-devel/2024-11/msg00598.html > > Sahil Siddiq (5): > vhost: Refactor vhost_svq_add_split > vhost: Write descriptors to packed svq > vhost: Data structure changes to support packed vqs > vdpa: Allocate memory for svq and map them to vdpa > vdpa: Support setting vring_base for packed svq > > hw/virtio/vhost-shadow-virtqueue.c | 222 +++++++++++++++++++---------- > hw/virtio/vhost-shadow-virtqueue.h | 70 ++++++--- > hw/virtio/vhost-vdpa.c | 47 +++++- > 3 files changed, 237 insertions(+), 102 deletions(-) > > -- > 2.47.0 >
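For reference, the packed-ring logic Eugenio points to in drivers/virtio/virtio_ring.c:virtqueue_get_buf boils down to the used-descriptor check sketched below. This is a minimal, self-contained sketch only, not QEMU or Linux code: the struct layouts are simplified and the names (svq_packed, svq_get_buf_packed) are illustrative, but the flag/wrap-counter rule follows the virtio 1.1 packed ring layout.

#include <stdbool.h>
#include <stdint.h>

#define VRING_PACKED_DESC_F_AVAIL  7
#define VRING_PACKED_DESC_F_USED   15

struct vring_packed_desc {
    uint64_t addr;
    uint32_t len;   /* on a used descriptor: bytes the device wrote */
    uint16_t id;    /* buffer id of the completed descriptor chain  */
    uint16_t flags;
};

/* Simplified stand-in for the shadow virtqueue's packed-ring state. */
struct svq_packed {
    struct vring_packed_desc *desc; /* the packed descriptor ring      */
    uint16_t num;                   /* ring size                       */
    uint16_t last_used_idx;         /* next slot to check for "used"   */
    bool used_wrap_counter;         /* starts at true, flips on wrap   */
};

/* A packed descriptor is "used" when its AVAIL and USED flag bits are
 * equal to each other and to the current used wrap counter. */
bool svq_desc_is_used(const struct svq_packed *svq, uint16_t idx)
{
    uint16_t flags = svq->desc[idx].flags;
    bool avail = flags & (1 << VRING_PACKED_DESC_F_AVAIL);
    bool used  = flags & (1 << VRING_PACKED_DESC_F_USED);

    return avail == used && used == svq->used_wrap_counter;
}

/* Returns the buffer id the device completed, or -1 if nothing new.
 * *len is set to the number of bytes the device wrote.  A real
 * implementation also needs a read barrier between checking the flags
 * and reading id/len, and must advance last_used_idx by the number of
 * descriptors in the chain rather than by 1 as done here. */
int svq_get_buf_packed(struct svq_packed *svq, uint32_t *len)
{
    uint16_t idx = svq->last_used_idx;

    if (!svq_desc_is_used(svq, idx)) {
        return -1;
    }

    *len = svq->desc[idx].len;
    int id = svq->desc[idx].id;

    if (++svq->last_used_idx >= svq->num) {
        svq->last_used_idx = 0;
        svq->used_wrap_counter = !svq->used_wrap_counter;
    }
    return id;
}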
Hi, Thank you for your reply. On 12/10/24 2:57 PM, Eugenio Perez Martin wrote: > On Thu, Dec 5, 2024 at 9:34 PM Sahil Siddiq <icegambit91@gmail.com> wrote: >> >> Hi, >> >> There are two issues that I found while trying to test >> my changes. I thought I would send the patch series >> as well in case that helps in troubleshooting. I haven't >> been able to find an issue in the implementation yet. >> Maybe I am missing something. >> >> I have been following the "Hands on vDPA: what do you do >> when you ain't got the hardware v2 (Part 2)" [1] blog to >> test my changes. To boot the L1 VM, I ran: >> >> [...] >> >> But if I boot L2 with x-svq=true as shown below, I am unable >> to ping the host machine. >> >> $ ./qemu/build/qemu-system-x86_64 \ >> -nographic \ >> -m 4G \ >> -enable-kvm \ >> -M q35 \ >> -drive file=//root/L2.qcow2,media=disk,if=virtio \ >> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=true,id=vhost-vdpa0 \ >> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ >> -smp 4 \ >> -cpu host \ >> 2>&1 | tee vm.log >> >> In L2: >> >> # ip addr add 111.1.1.2/24 dev eth0 >> # ip addr show eth0 >> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 >> link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff >> altname enp0s7 >> inet 111.1.1.2/24 scope global eth0 >> valid_lft forever preferred_lft forever >> inet6 fe80::9877:de30:5f17:35f9/64 scope link noprefixroute >> valid_lft forever preferred_lft forever >> >> # ip route >> 111.1.1.0/24 dev eth0 proto kernel scope link src 111.1.1.2 >> >> # ping 111.1.1.1 -w10 >> PING 111.1.1.1 (111.1.1.1) 56(84) bytes of data. >> From 111.1.1.2 icmp_seq=1 Destination Host Unreachable >> ping: sendmsg: No route to host >> From 111.1.1.2 icmp_seq=2 Destination Host Unreachable >> From 111.1.1.2 icmp_seq=3 Destination Host Unreachable >> >> --- 111.1.1.1 ping statistics --- >> 3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2076ms >> pipe 3 >> >> The other issue is related to booting L2 with "x-svq=true" >> and "packed=on". >> >> In L1: >> >> $ ./qemu/build/qemu-system-x86_64 \ >> -nographic \ >> -m 4G \ >> -enable-kvm \ >> -M q35 \ >> -drive file=//root/L2.qcow2,media=disk,if=virtio \ >> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0,x-svq=true \ >> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,packed=on,bus=pcie.0,addr=0x7 \ >> -smp 4 \ >> -cpu host \ >> 2>&1 | tee vm.log >> >> The kernel throws "virtio_net virtio1: output.0:id 0 is not >> a head!" [4]. >> > > So this series implements the descriptor forwarding from the guest to > the device in packed vq. We also need to forward the descriptors from > the device to the guest. The device writes them in the SVQ ring. > > The functions responsible for that in QEMU are > hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_flush, which is called by > the device when used descriptors are written to the SVQ, which calls > hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_get_buf. We need to do > modifications similar to vhost_svq_add: Make them conditional if we're > in split or packed vq, and "copy" the code from Linux's > drivers/virtio/virtio_ring.c:virtqueue_get_buf. > > After these modifications you should be able to ping and forward > traffic. 
As always, It is totally ok if it needs more than one > iteration, and feel free to ask any question you have :). > Understood, I'll make these changes and will test it again. Thanks, Sahil
Hi, On 12/10/24 2:57 PM, Eugenio Perez Martin wrote: > On Thu, Dec 5, 2024 at 9:34 PM Sahil Siddiq <icegambit91@gmail.com> wrote: >> >> Hi, >> >> There are two issues that I found while trying to test >> my changes. I thought I would send the patch series >> as well in case that helps in troubleshooting. I haven't >> been able to find an issue in the implementation yet. >> Maybe I am missing something. >> >> I have been following the "Hands on vDPA: what do you do >> when you ain't got the hardware v2 (Part 2)" [1] blog to >> test my changes. To boot the L1 VM, I ran: >> >> sudo ./qemu/build/qemu-system-x86_64 \ >> -enable-kvm \ >> -drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \ >> -net nic,model=virtio \ >> -net user,hostfwd=tcp::2222-:22 \ >> -device intel-iommu,snoop-control=on \ >> -device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,packed=on,event_idx=off,bus=pcie.0,addr=0x4 \ >> -netdev tap,id=net0,script=no,downscript=no \ >> -nographic \ >> -m 8G \ >> -smp 4 \ >> -M q35 \ >> -cpu host 2>&1 | tee vm.log >> >> Without "guest_uso4=off,guest_uso6=off,host_uso=off, >> guest_announce=off" in "-device virtio-net-pci", QEMU >> throws "vdpa svq does not work with features" [2] when >> trying to boot L2. >> >> The enums added in commit #2 in this series is new and >> wasn't in the earlier versions of the series. Without >> this change, x-svq=true throws "SVQ invalid device feature >> flags" [3] and x-svq is consequently disabled. >> >> The first issue is related to running traffic in L2 >> with vhost-vdpa. >> >> In L0: >> >> $ ip addr add 111.1.1.1/24 dev tap0 >> $ ip link set tap0 up >> $ ip addr show tap0 >> 4: tap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000 >> link/ether d2:6d:b9:61:e1:9a brd ff:ff:ff:ff:ff:ff >> inet 111.1.1.1/24 scope global tap0 >> valid_lft forever preferred_lft forever >> inet6 fe80::d06d:b9ff:fe61:e19a/64 scope link proto kernel_ll >> valid_lft forever preferred_lft forever >> >> I am able to run traffic in L2 when booting without >> x-svq. >> >> In L1: >> >> $ ./qemu/build/qemu-system-x86_64 \ >> -nographic \ >> -m 4G \ >> -enable-kvm \ >> -M q35 \ >> -drive file=//root/L2.qcow2,media=disk,if=virtio \ >> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0 \ >> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ >> -smp 4 \ >> -cpu host \ >> 2>&1 | tee vm.log >> >> In L2: >> >> # ip addr add 111.1.1.2/24 dev eth0 >> # ip addr show eth0 >> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 >> link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff >> altname enp0s7 >> inet 111.1.1.2/24 scope global eth0 >> valid_lft forever preferred_lft forever >> inet6 fe80::9877:de30:5f17:35f9/64 scope link noprefixroute >> valid_lft forever preferred_lft forever >> >> # ip route >> 111.1.1.0/24 dev eth0 proto kernel scope link src 111.1.1.2 >> >> # ping 111.1.1.1 -w3 >> PING 111.1.1.1 (111.1.1.1) 56(84) bytes of data. 
>> 64 bytes from 111.1.1.1: icmp_seq=1 ttl=64 time=0.407 ms >> 64 bytes from 111.1.1.1: icmp_seq=2 ttl=64 time=0.671 ms >> 64 bytes from 111.1.1.1: icmp_seq=3 ttl=64 time=0.291 ms >> >> --- 111.1.1.1 ping statistics --- >> 3 packets transmitted, 3 received, 0% packet loss, time 2034ms >> rtt min/avg/max/mdev = 0.291/0.456/0.671/0.159 ms >> >> >> But if I boot L2 with x-svq=true as shown below, I am unable >> to ping the host machine. >> >> $ ./qemu/build/qemu-system-x86_64 \ >> -nographic \ >> -m 4G \ >> -enable-kvm \ >> -M q35 \ >> -drive file=//root/L2.qcow2,media=disk,if=virtio \ >> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=true,id=vhost-vdpa0 \ >> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ >> -smp 4 \ >> -cpu host \ >> 2>&1 | tee vm.log >> >> In L2: >> >> # ip addr add 111.1.1.2/24 dev eth0 >> # ip addr show eth0 >> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 >> link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff >> altname enp0s7 >> inet 111.1.1.2/24 scope global eth0 >> valid_lft forever preferred_lft forever >> inet6 fe80::9877:de30:5f17:35f9/64 scope link noprefixroute >> valid_lft forever preferred_lft forever >> >> # ip route >> 111.1.1.0/24 dev eth0 proto kernel scope link src 111.1.1.2 >> >> # ping 111.1.1.1 -w10 >> PING 111.1.1.1 (111.1.1.1) 56(84) bytes of data. >> From 111.1.1.2 icmp_seq=1 Destination Host Unreachable >> ping: sendmsg: No route to host >> From 111.1.1.2 icmp_seq=2 Destination Host Unreachable >> From 111.1.1.2 icmp_seq=3 Destination Host Unreachable >> >> --- 111.1.1.1 ping statistics --- >> 3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2076ms >> pipe 3 >> >> The other issue is related to booting L2 with "x-svq=true" >> and "packed=on". >> >> In L1: >> >> $ ./qemu/build/qemu-system-x86_64 \ >> -nographic \ >> -m 4G \ >> -enable-kvm \ >> -M q35 \ >> -drive file=//root/L2.qcow2,media=disk,if=virtio \ >> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0,x-svq=true \ >> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,packed=on,bus=pcie.0,addr=0x7 \ >> -smp 4 \ >> -cpu host \ >> 2>&1 | tee vm.log >> >> The kernel throws "virtio_net virtio1: output.0:id 0 is not >> a head!" [4]. >> > > So this series implements the descriptor forwarding from the guest to > the device in packed vq. We also need to forward the descriptors from > the device to the guest. The device writes them in the SVQ ring. > > The functions responsible for that in QEMU are > hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_flush, which is called by > the device when used descriptors are written to the SVQ, which calls > hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_get_buf. We need to do > modifications similar to vhost_svq_add: Make them conditional if we're > in split or packed vq, and "copy" the code from Linux's > drivers/virtio/virtio_ring.c:virtqueue_get_buf. > > After these modifications you should be able to ping and forward > traffic. As always, It is totally ok if it needs more than one > iteration, and feel free to ask any question you have :). > I misunderstood this part. While working on extending hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_get_buf() [1] for packed vqs, I realized that this function and vhost_svq_flush() already support split vqs. 
However, I am unable to ping L0 when booting L2 with "x-svq=true" and "packed=off" or when the "packed" option is not specified in QEMU's command line. I tried debugging these functions for split vqs after running the following QEMU commands while following the blog [2]. Booting L1: $ sudo ./qemu/build/qemu-system-x86_64 \ -enable-kvm \ -drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \ -net nic,model=virtio \ -net user,hostfwd=tcp::2222-:22 \ -device intel-iommu,snoop-control=on \ -device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,packed=off,event_idx=off,bus=pcie.0,addr=0x4 \ -netdev tap,id=net0,script=no,downscript=no \ -nographic \ -m 8G \ -smp 4 \ -M q35 \ -cpu host 2>&1 | tee vm.log Booting L2: # ./qemu/build/qemu-system-x86_64 \ -nographic \ -m 4G \ -enable-kvm \ -M q35 \ -drive file=//root/L2.qcow2,media=disk,if=virtio \ -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=true,id=vhost-vdpa0 \ -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ -smp 4 \ -cpu host \ 2>&1 | tee vm.log I printed out the contents of VirtQueueElement returned by vhost_svq_get_buf() in vhost_svq_flush() [3]. I noticed that "len" which is set by "vhost_svq_get_buf" is always set to 0 while VirtQueueElement.len is non-zero. I haven't understood the difference between these two "len"s. The "len" that is set to 0 is used in "virtqueue_fill()" in virtio.c [4]. Could this point to why I am not able to ping L0 from L2? Thanks, Sahil [1] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-shadow-virtqueue.c#L418 [2] https://www.redhat.com/en/blog/hands-vdpa-what-do-you-do-when-you-aint-got-hardware-part-2 [3] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-shadow-virtqueue.c#L488 [4] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-shadow-virtqueue.c#L501
On Sun, Dec 15, 2024 at 6:27 PM Sahil Siddiq <icegambit91@gmail.com> wrote: > > Hi, > > On 12/10/24 2:57 PM, Eugenio Perez Martin wrote: > > On Thu, Dec 5, 2024 at 9:34 PM Sahil Siddiq <icegambit91@gmail.com> wrote: > >> > >> Hi, > >> > >> There are two issues that I found while trying to test > >> my changes. I thought I would send the patch series > >> as well in case that helps in troubleshooting. I haven't > >> been able to find an issue in the implementation yet. > >> Maybe I am missing something. > >> > >> I have been following the "Hands on vDPA: what do you do > >> when you ain't got the hardware v2 (Part 2)" [1] blog to > >> test my changes. To boot the L1 VM, I ran: > >> > >> sudo ./qemu/build/qemu-system-x86_64 \ > >> -enable-kvm \ > >> -drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \ > >> -net nic,model=virtio \ > >> -net user,hostfwd=tcp::2222-:22 \ > >> -device intel-iommu,snoop-control=on \ > >> -device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,packed=on,event_idx=off,bus=pcie.0,addr=0x4 \ > >> -netdev tap,id=net0,script=no,downscript=no \ > >> -nographic \ > >> -m 8G \ > >> -smp 4 \ > >> -M q35 \ > >> -cpu host 2>&1 | tee vm.log > >> > >> Without "guest_uso4=off,guest_uso6=off,host_uso=off, > >> guest_announce=off" in "-device virtio-net-pci", QEMU > >> throws "vdpa svq does not work with features" [2] when > >> trying to boot L2. > >> > >> The enums added in commit #2 in this series is new and > >> wasn't in the earlier versions of the series. Without > >> this change, x-svq=true throws "SVQ invalid device feature > >> flags" [3] and x-svq is consequently disabled. > >> > >> The first issue is related to running traffic in L2 > >> with vhost-vdpa. > >> > >> In L0: > >> > >> $ ip addr add 111.1.1.1/24 dev tap0 > >> $ ip link set tap0 up > >> $ ip addr show tap0 > >> 4: tap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000 > >> link/ether d2:6d:b9:61:e1:9a brd ff:ff:ff:ff:ff:ff > >> inet 111.1.1.1/24 scope global tap0 > >> valid_lft forever preferred_lft forever > >> inet6 fe80::d06d:b9ff:fe61:e19a/64 scope link proto kernel_ll > >> valid_lft forever preferred_lft forever > >> > >> I am able to run traffic in L2 when booting without > >> x-svq. > >> > >> In L1: > >> > >> $ ./qemu/build/qemu-system-x86_64 \ > >> -nographic \ > >> -m 4G \ > >> -enable-kvm \ > >> -M q35 \ > >> -drive file=//root/L2.qcow2,media=disk,if=virtio \ > >> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0 \ > >> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ > >> -smp 4 \ > >> -cpu host \ > >> 2>&1 | tee vm.log > >> > >> In L2: > >> > >> # ip addr add 111.1.1.2/24 dev eth0 > >> # ip addr show eth0 > >> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 > >> link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff > >> altname enp0s7 > >> inet 111.1.1.2/24 scope global eth0 > >> valid_lft forever preferred_lft forever > >> inet6 fe80::9877:de30:5f17:35f9/64 scope link noprefixroute > >> valid_lft forever preferred_lft forever > >> > >> # ip route > >> 111.1.1.0/24 dev eth0 proto kernel scope link src 111.1.1.2 > >> > >> # ping 111.1.1.1 -w3 > >> PING 111.1.1.1 (111.1.1.1) 56(84) bytes of data. 
> >> 64 bytes from 111.1.1.1: icmp_seq=1 ttl=64 time=0.407 ms > >> 64 bytes from 111.1.1.1: icmp_seq=2 ttl=64 time=0.671 ms > >> 64 bytes from 111.1.1.1: icmp_seq=3 ttl=64 time=0.291 ms > >> > >> --- 111.1.1.1 ping statistics --- > >> 3 packets transmitted, 3 received, 0% packet loss, time 2034ms > >> rtt min/avg/max/mdev = 0.291/0.456/0.671/0.159 ms > >> > >> > >> But if I boot L2 with x-svq=true as shown below, I am unable > >> to ping the host machine. > >> > >> $ ./qemu/build/qemu-system-x86_64 \ > >> -nographic \ > >> -m 4G \ > >> -enable-kvm \ > >> -M q35 \ > >> -drive file=//root/L2.qcow2,media=disk,if=virtio \ > >> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=true,id=vhost-vdpa0 \ > >> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ > >> -smp 4 \ > >> -cpu host \ > >> 2>&1 | tee vm.log > >> > >> In L2: > >> > >> # ip addr add 111.1.1.2/24 dev eth0 > >> # ip addr show eth0 > >> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 > >> link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff > >> altname enp0s7 > >> inet 111.1.1.2/24 scope global eth0 > >> valid_lft forever preferred_lft forever > >> inet6 fe80::9877:de30:5f17:35f9/64 scope link noprefixroute > >> valid_lft forever preferred_lft forever > >> > >> # ip route > >> 111.1.1.0/24 dev eth0 proto kernel scope link src 111.1.1.2 > >> > >> # ping 111.1.1.1 -w10 > >> PING 111.1.1.1 (111.1.1.1) 56(84) bytes of data. > >> From 111.1.1.2 icmp_seq=1 Destination Host Unreachable > >> ping: sendmsg: No route to host > >> From 111.1.1.2 icmp_seq=2 Destination Host Unreachable > >> From 111.1.1.2 icmp_seq=3 Destination Host Unreachable > >> > >> --- 111.1.1.1 ping statistics --- > >> 3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2076ms > >> pipe 3 > >> > >> The other issue is related to booting L2 with "x-svq=true" > >> and "packed=on". > >> > >> In L1: > >> > >> $ ./qemu/build/qemu-system-x86_64 \ > >> -nographic \ > >> -m 4G \ > >> -enable-kvm \ > >> -M q35 \ > >> -drive file=//root/L2.qcow2,media=disk,if=virtio \ > >> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0,x-svq=true \ > >> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,packed=on,bus=pcie.0,addr=0x7 \ > >> -smp 4 \ > >> -cpu host \ > >> 2>&1 | tee vm.log > >> > >> The kernel throws "virtio_net virtio1: output.0:id 0 is not > >> a head!" [4]. > >> > > > > So this series implements the descriptor forwarding from the guest to > > the device in packed vq. We also need to forward the descriptors from > > the device to the guest. The device writes them in the SVQ ring. > > > > The functions responsible for that in QEMU are > > hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_flush, which is called by > > the device when used descriptors are written to the SVQ, which calls > > hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_get_buf. We need to do > > modifications similar to vhost_svq_add: Make them conditional if we're > > in split or packed vq, and "copy" the code from Linux's > > drivers/virtio/virtio_ring.c:virtqueue_get_buf. > > > > After these modifications you should be able to ping and forward > > traffic. As always, It is totally ok if it needs more than one > > iteration, and feel free to ask any question you have :). > > > > I misunderstood this part. 
While working on extending > hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_get_buf() [1] > for packed vqs, I realized that this function and > vhost_svq_flush() already support split vqs. However, I am > unable to ping L0 when booting L2 with "x-svq=true" and > "packed=off" or when the "packed" option is not specified > in QEMU's command line. > > I tried debugging these functions for split vqs after running > the following QEMU commands while following the blog [2]. > > Booting L1: > > $ sudo ./qemu/build/qemu-system-x86_64 \ > -enable-kvm \ > -drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \ > -net nic,model=virtio \ > -net user,hostfwd=tcp::2222-:22 \ > -device intel-iommu,snoop-control=on \ > -device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,packed=off,event_idx=off,bus=pcie.0,addr=0x4 \ > -netdev tap,id=net0,script=no,downscript=no \ > -nographic \ > -m 8G \ > -smp 4 \ > -M q35 \ > -cpu host 2>&1 | tee vm.log > > Booting L2: > > # ./qemu/build/qemu-system-x86_64 \ > -nographic \ > -m 4G \ > -enable-kvm \ > -M q35 \ > -drive file=//root/L2.qcow2,media=disk,if=virtio \ > -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=true,id=vhost-vdpa0 \ > -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ > -smp 4 \ > -cpu host \ > 2>&1 | tee vm.log > > I printed out the contents of VirtQueueElement returned > by vhost_svq_get_buf() in vhost_svq_flush() [3]. > I noticed that "len" which is set by "vhost_svq_get_buf" > is always set to 0 while VirtQueueElement.len is non-zero. > I haven't understood the difference between these two "len"s. > VirtQueueElement.len is the length of the buffer, while the len of vhost_svq_get_buf is the bytes written by the device. In the case of the tx queue, VirtQueuelen is the length of the tx packet, and the vhost_svq_get_buf is always 0 as the device does not write. In the case of rx, VirtQueueElem.len is the available length for a rx frame, and the vhost_svq_get_buf len is the actual length written by the device. To be 100% accurate a rx packet can span over multiple buffers, but SVQ does not need special code to handle this. So vhost_svq_get_buf should return > 0 for rx queue (svq->vq->index == 0), and 0 for tx queue (svq->vq->index % 2 == 1). Take into account that vhost_svq_get_buf only handles split vq at the moment! It should be renamed or splitted into vhost_svq_get_buf_split. > The "len" that is set to 0 is used in "virtqueue_fill()" in > virtio.c [4]. Could this point to why I am not able to ping > L0 from L2? > It depends :). Let me know in what vq you find that. > Thanks, > Sahil > > [1] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-shadow-virtqueue.c#L418 > [2] https://www.redhat.com/en/blog/hands-vdpa-what-do-you-do-when-you-aint-got-hardware-part-2 > [3] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-shadow-virtqueue.c#L488 > [4] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-shadow-virtqueue.c#L501 >
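As a rough illustration of the "make them conditional" suggestion above, the eventual dispatcher in hw/virtio/vhost-shadow-virtqueue.c could take a shape like the sketch below. This is only a sketch under assumptions: vhost_svq_get_buf_packed() does not exist yet in this series, vhost_svq_get_buf_split() stands for the renamed current split-ring code, and the exact signatures may differ in the final patches.

/* Sketch only: possible split/packed dispatch once a packed variant
 * exists.  vhost_svq_get_buf_packed() is hypothetical here;
 * vhost_svq_get_buf_split() is the renamed existing split-ring code. */
static VirtQueueElement *vhost_svq_get_buf(VhostShadowVirtqueue *svq,
                                           uint32_t *len)
{
    if (virtio_vdev_has_feature(svq->vdev, VIRTIO_F_RING_PACKED)) {
        return vhost_svq_get_buf_packed(svq, len);
    }
    return vhost_svq_get_buf_split(svq, len);
}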
Hi, Thank you for your reply. On 12/16/24 2:09 PM, Eugenio Perez Martin wrote: > On Sun, Dec 15, 2024 at 6:27 PM Sahil Siddiq <icegambit91@gmail.com> wrote: >> On 12/10/24 2:57 PM, Eugenio Perez Martin wrote: >>> On Thu, Dec 5, 2024 at 9:34 PM Sahil Siddiq <icegambit91@gmail.com> wrote: >>>> [...] >>>> I have been following the "Hands on vDPA: what do you do >>>> when you ain't got the hardware v2 (Part 2)" [1] blog to >>>> test my changes. To boot the L1 VM, I ran: >>>> >>>> sudo ./qemu/build/qemu-system-x86_64 \ >>>> -enable-kvm \ >>>> -drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \ >>>> -net nic,model=virtio \ >>>> -net user,hostfwd=tcp::2222-:22 \ >>>> -device intel-iommu,snoop-control=on \ >>>> -device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,packed=on,event_idx=off,bus=pcie.0,addr=0x4 \ >>>> -netdev tap,id=net0,script=no,downscript=no \ >>>> -nographic \ >>>> -m 8G \ >>>> -smp 4 \ >>>> -M q35 \ >>>> -cpu host 2>&1 | tee vm.log >>>> >>>> Without "guest_uso4=off,guest_uso6=off,host_uso=off, >>>> guest_announce=off" in "-device virtio-net-pci", QEMU >>>> throws "vdpa svq does not work with features" [2] when >>>> trying to boot L2. >>>> >>>> The enums added in commit #2 in this series is new and >>>> wasn't in the earlier versions of the series. Without >>>> this change, x-svq=true throws "SVQ invalid device feature >>>> flags" [3] and x-svq is consequently disabled. >>>> >>>> The first issue is related to running traffic in L2 >>>> with vhost-vdpa. >>>> >>>> In L0: >>>> >>>> $ ip addr add 111.1.1.1/24 dev tap0 >>>> $ ip link set tap0 up >>>> $ ip addr show tap0 >>>> 4: tap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000 >>>> link/ether d2:6d:b9:61:e1:9a brd ff:ff:ff:ff:ff:ff >>>> inet 111.1.1.1/24 scope global tap0 >>>> valid_lft forever preferred_lft forever >>>> inet6 fe80::d06d:b9ff:fe61:e19a/64 scope link proto kernel_ll >>>> valid_lft forever preferred_lft forever >>>> >>>> I am able to run traffic in L2 when booting without >>>> x-svq. >>>> >>>> In L1: >>>> >>>> $ ./qemu/build/qemu-system-x86_64 \ >>>> -nographic \ >>>> -m 4G \ >>>> -enable-kvm \ >>>> -M q35 \ >>>> -drive file=//root/L2.qcow2,media=disk,if=virtio \ >>>> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0 \ >>>> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ >>>> -smp 4 \ >>>> -cpu host \ >>>> 2>&1 | tee vm.log >>>> >>>> In L2: >>>> >>>> # ip addr add 111.1.1.2/24 dev eth0 >>>> # ip addr show eth0 >>>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 >>>> link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff >>>> altname enp0s7 >>>> inet 111.1.1.2/24 scope global eth0 >>>> valid_lft forever preferred_lft forever >>>> inet6 fe80::9877:de30:5f17:35f9/64 scope link noprefixroute >>>> valid_lft forever preferred_lft forever >>>> >>>> # ip route >>>> 111.1.1.0/24 dev eth0 proto kernel scope link src 111.1.1.2 >>>> >>>> # ping 111.1.1.1 -w3 >>>> PING 111.1.1.1 (111.1.1.1) 56(84) bytes of data. 
>>>> 64 bytes from 111.1.1.1: icmp_seq=1 ttl=64 time=0.407 ms >>>> 64 bytes from 111.1.1.1: icmp_seq=2 ttl=64 time=0.671 ms >>>> 64 bytes from 111.1.1.1: icmp_seq=3 ttl=64 time=0.291 ms >>>> >>>> --- 111.1.1.1 ping statistics --- >>>> 3 packets transmitted, 3 received, 0% packet loss, time 2034ms >>>> rtt min/avg/max/mdev = 0.291/0.456/0.671/0.159 ms >>>> >>>> >>>> But if I boot L2 with x-svq=true as shown below, I am unable >>>> to ping the host machine. >>>> >>>> $ ./qemu/build/qemu-system-x86_64 \ >>>> -nographic \ >>>> -m 4G \ >>>> -enable-kvm \ >>>> -M q35 \ >>>> -drive file=//root/L2.qcow2,media=disk,if=virtio \ >>>> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=true,id=vhost-vdpa0 \ >>>> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ >>>> -smp 4 \ >>>> -cpu host \ >>>> 2>&1 | tee vm.log >>>> >>>> In L2: >>>> >>>> # ip addr add 111.1.1.2/24 dev eth0 >>>> # ip addr show eth0 >>>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 >>>> link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff >>>> altname enp0s7 >>>> inet 111.1.1.2/24 scope global eth0 >>>> valid_lft forever preferred_lft forever >>>> inet6 fe80::9877:de30:5f17:35f9/64 scope link noprefixroute >>>> valid_lft forever preferred_lft forever >>>> >>>> # ip route >>>> 111.1.1.0/24 dev eth0 proto kernel scope link src 111.1.1.2 >>>> >>>> # ping 111.1.1.1 -w10 >>>> PING 111.1.1.1 (111.1.1.1) 56(84) bytes of data. >>>> From 111.1.1.2 icmp_seq=1 Destination Host Unreachable >>>> ping: sendmsg: No route to host >>>> From 111.1.1.2 icmp_seq=2 Destination Host Unreachable >>>> From 111.1.1.2 icmp_seq=3 Destination Host Unreachable >>>> >>>> --- 111.1.1.1 ping statistics --- >>>> 3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2076ms >>>> pipe 3 >>>> >>>> The other issue is related to booting L2 with "x-svq=true" >>>> and "packed=on". >>>> >>>> In L1: >>>> >>>> $ ./qemu/build/qemu-system-x86_64 \ >>>> -nographic \ >>>> -m 4G \ >>>> -enable-kvm \ >>>> -M q35 \ >>>> -drive file=//root/L2.qcow2,media=disk,if=virtio \ >>>> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0,x-svq=true \ >>>> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,packed=on,bus=pcie.0,addr=0x7 \ >>>> -smp 4 \ >>>> -cpu host \ >>>> 2>&1 | tee vm.log >>>> >>>> The kernel throws "virtio_net virtio1: output.0:id 0 is not >>>> a head!" [4]. >>>> >>> >>> So this series implements the descriptor forwarding from the guest to >>> the device in packed vq. We also need to forward the descriptors from >>> the device to the guest. The device writes them in the SVQ ring. >>> >>> The functions responsible for that in QEMU are >>> hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_flush, which is called by >>> the device when used descriptors are written to the SVQ, which calls >>> hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_get_buf. We need to do >>> modifications similar to vhost_svq_add: Make them conditional if we're >>> in split or packed vq, and "copy" the code from Linux's >>> drivers/virtio/virtio_ring.c:virtqueue_get_buf. >>> >>> After these modifications you should be able to ping and forward >>> traffic. As always, It is totally ok if it needs more than one >>> iteration, and feel free to ask any question you have :). >>> >> >> I misunderstood this part. 
While working on extending >> hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_get_buf() [1] >> for packed vqs, I realized that this function and >> vhost_svq_flush() already support split vqs. However, I am >> unable to ping L0 when booting L2 with "x-svq=true" and >> "packed=off" or when the "packed" option is not specified >> in QEMU's command line. >> >> I tried debugging these functions for split vqs after running >> the following QEMU commands while following the blog [2]. >> >> Booting L1: >> >> $ sudo ./qemu/build/qemu-system-x86_64 \ >> -enable-kvm \ >> -drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \ >> -net nic,model=virtio \ >> -net user,hostfwd=tcp::2222-:22 \ >> -device intel-iommu,snoop-control=on \ >> -device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,packed=off,event_idx=off,bus=pcie.0,addr=0x4 \ >> -netdev tap,id=net0,script=no,downscript=no \ >> -nographic \ >> -m 8G \ >> -smp 4 \ >> -M q35 \ >> -cpu host 2>&1 | tee vm.log >> >> Booting L2: >> >> # ./qemu/build/qemu-system-x86_64 \ >> -nographic \ >> -m 4G \ >> -enable-kvm \ >> -M q35 \ >> -drive file=//root/L2.qcow2,media=disk,if=virtio \ >> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=true,id=vhost-vdpa0 \ >> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ >> -smp 4 \ >> -cpu host \ >> 2>&1 | tee vm.log >> >> I printed out the contents of VirtQueueElement returned >> by vhost_svq_get_buf() in vhost_svq_flush() [3]. >> I noticed that "len" which is set by "vhost_svq_get_buf" >> is always set to 0 while VirtQueueElement.len is non-zero. >> I haven't understood the difference between these two "len"s. >> > > VirtQueueElement.len is the length of the buffer, while the len of > vhost_svq_get_buf is the bytes written by the device. In the case of > the tx queue, VirtQueuelen is the length of the tx packet, and the > vhost_svq_get_buf is always 0 as the device does not write. In the > case of rx, VirtQueueElem.len is the available length for a rx frame, > and the vhost_svq_get_buf len is the actual length written by the > device. > > To be 100% accurate a rx packet can span over multiple buffers, but > SVQ does not need special code to handle this. > > So vhost_svq_get_buf should return > 0 for rx queue (svq->vq->index == > 0), and 0 for tx queue (svq->vq->index % 2 == 1). > > Take into account that vhost_svq_get_buf only handles split vq at the > moment! It should be renamed or splitted into vhost_svq_get_buf_split. In L1, there are 2 virtio network devices. # lspci -nn | grep -i net 00:02.0 Ethernet controller [0200]: Red Hat, Inc. Virtio network device [1af4:1000] 00:04.0 Ethernet controller [0200]: Red Hat, Inc. Virtio 1.0 network device [1af4:1041] (rev 01) I am using the second one (1af4:1041) for testing my changes and have bound this device to the vp_vdpa driver. # vdpa dev show -jp { "dev": { "vdpa0": { "type": "network", "mgmtdev": "pci/0000:00:04.0", "vendor_id": 6900, "max_vqs": 3, "max_vq_size": 256 } } } The max number of vqs is 3 with the max size being 256. Since, there are 2 virtio net devices, vhost_vdpa_svqs_start [1] is called twice. For each of them. it calls vhost_svq_start [2] v->shadow_vqs->len number of times. 
Printing the values of dev->vdev->name, v->shadow_vqs->len and svq->vring.num in vhost_vdpa_svqs_start gives: name: virtio-net len: 2 num: 256 num: 256 name: virtio-net len: 1 num: 64 I am not sure how to match the above log lines to the right virtio-net device since the actual value of num can be less than "max_vq_size" in the output of "vdpa dev show". I think the first 3 log lines correspond to the virtio net device that I am using for testing since it has 2 vqs (rx and tx) while the other virtio-net device only has one vq. When printing out the values of svq->vring.num, used_elem.len and used_elem.id in vhost_svq_get_buf, there are two sets of output. One set corresponds to svq->vring.num = 64 and the other corresponds to svq->vring.num = 256. For svq->vring.num = 64, only the following line is printed repeatedly: size: 64, len: 1, i: 0 For svq->vring.num = 256, the following line is printed 20 times, size: 256, len: 0, i: 0 followed by: size: 256, len: 0, i: 1 size: 256, len: 0, i: 1 used_elem.len is used to set the value of len that is returned by vhost_svq_get_buf, and it's always 0. So the value of "len" returned by vhost_svq_get_buf when called in vhost_svq_flush is also 0. Thanks, Sahil [1] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-vdpa.c#L1243 [2] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-vdpa.c#L1265
On Tue, Dec 17, 2024 at 6:45 AM Sahil Siddiq <icegambit91@gmail.com> wrote: > > Hi, > > Thank you for your reply. > > On 12/16/24 2:09 PM, Eugenio Perez Martin wrote: > > On Sun, Dec 15, 2024 at 6:27 PM Sahil Siddiq <icegambit91@gmail.com> wrote: > >> On 12/10/24 2:57 PM, Eugenio Perez Martin wrote: > >>> On Thu, Dec 5, 2024 at 9:34 PM Sahil Siddiq <icegambit91@gmail.com> wrote: > >>>> [...] > >>>> I have been following the "Hands on vDPA: what do you do > >>>> when you ain't got the hardware v2 (Part 2)" [1] blog to > >>>> test my changes. To boot the L1 VM, I ran: > >>>> > >>>> sudo ./qemu/build/qemu-system-x86_64 \ > >>>> -enable-kvm \ > >>>> -drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \ > >>>> -net nic,model=virtio \ > >>>> -net user,hostfwd=tcp::2222-:22 \ > >>>> -device intel-iommu,snoop-control=on \ > >>>> -device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,packed=on,event_idx=off,bus=pcie.0,addr=0x4 \ > >>>> -netdev tap,id=net0,script=no,downscript=no \ > >>>> -nographic \ > >>>> -m 8G \ > >>>> -smp 4 \ > >>>> -M q35 \ > >>>> -cpu host 2>&1 | tee vm.log > >>>> > >>>> Without "guest_uso4=off,guest_uso6=off,host_uso=off, > >>>> guest_announce=off" in "-device virtio-net-pci", QEMU > >>>> throws "vdpa svq does not work with features" [2] when > >>>> trying to boot L2. > >>>> > >>>> The enums added in commit #2 in this series is new and > >>>> wasn't in the earlier versions of the series. Without > >>>> this change, x-svq=true throws "SVQ invalid device feature > >>>> flags" [3] and x-svq is consequently disabled. > >>>> > >>>> The first issue is related to running traffic in L2 > >>>> with vhost-vdpa. > >>>> > >>>> In L0: > >>>> > >>>> $ ip addr add 111.1.1.1/24 dev tap0 > >>>> $ ip link set tap0 up > >>>> $ ip addr show tap0 > >>>> 4: tap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000 > >>>> link/ether d2:6d:b9:61:e1:9a brd ff:ff:ff:ff:ff:ff > >>>> inet 111.1.1.1/24 scope global tap0 > >>>> valid_lft forever preferred_lft forever > >>>> inet6 fe80::d06d:b9ff:fe61:e19a/64 scope link proto kernel_ll > >>>> valid_lft forever preferred_lft forever > >>>> > >>>> I am able to run traffic in L2 when booting without > >>>> x-svq. > >>>> > >>>> In L1: > >>>> > >>>> $ ./qemu/build/qemu-system-x86_64 \ > >>>> -nographic \ > >>>> -m 4G \ > >>>> -enable-kvm \ > >>>> -M q35 \ > >>>> -drive file=//root/L2.qcow2,media=disk,if=virtio \ > >>>> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0 \ > >>>> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ > >>>> -smp 4 \ > >>>> -cpu host \ > >>>> 2>&1 | tee vm.log > >>>> > >>>> In L2: > >>>> > >>>> # ip addr add 111.1.1.2/24 dev eth0 > >>>> # ip addr show eth0 > >>>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 > >>>> link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff > >>>> altname enp0s7 > >>>> inet 111.1.1.2/24 scope global eth0 > >>>> valid_lft forever preferred_lft forever > >>>> inet6 fe80::9877:de30:5f17:35f9/64 scope link noprefixroute > >>>> valid_lft forever preferred_lft forever > >>>> > >>>> # ip route > >>>> 111.1.1.0/24 dev eth0 proto kernel scope link src 111.1.1.2 > >>>> > >>>> # ping 111.1.1.1 -w3 > >>>> PING 111.1.1.1 (111.1.1.1) 56(84) bytes of data. 
> >>>> 64 bytes from 111.1.1.1: icmp_seq=1 ttl=64 time=0.407 ms > >>>> 64 bytes from 111.1.1.1: icmp_seq=2 ttl=64 time=0.671 ms > >>>> 64 bytes from 111.1.1.1: icmp_seq=3 ttl=64 time=0.291 ms > >>>> > >>>> --- 111.1.1.1 ping statistics --- > >>>> 3 packets transmitted, 3 received, 0% packet loss, time 2034ms > >>>> rtt min/avg/max/mdev = 0.291/0.456/0.671/0.159 ms > >>>> > >>>> > >>>> But if I boot L2 with x-svq=true as shown below, I am unable > >>>> to ping the host machine. > >>>> > >>>> $ ./qemu/build/qemu-system-x86_64 \ > >>>> -nographic \ > >>>> -m 4G \ > >>>> -enable-kvm \ > >>>> -M q35 \ > >>>> -drive file=//root/L2.qcow2,media=disk,if=virtio \ > >>>> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=true,id=vhost-vdpa0 \ > >>>> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ > >>>> -smp 4 \ > >>>> -cpu host \ > >>>> 2>&1 | tee vm.log > >>>> > >>>> In L2: > >>>> > >>>> # ip addr add 111.1.1.2/24 dev eth0 > >>>> # ip addr show eth0 > >>>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 > >>>> link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff > >>>> altname enp0s7 > >>>> inet 111.1.1.2/24 scope global eth0 > >>>> valid_lft forever preferred_lft forever > >>>> inet6 fe80::9877:de30:5f17:35f9/64 scope link noprefixroute > >>>> valid_lft forever preferred_lft forever > >>>> > >>>> # ip route > >>>> 111.1.1.0/24 dev eth0 proto kernel scope link src 111.1.1.2 > >>>> > >>>> # ping 111.1.1.1 -w10 > >>>> PING 111.1.1.1 (111.1.1.1) 56(84) bytes of data. > >>>> From 111.1.1.2 icmp_seq=1 Destination Host Unreachable > >>>> ping: sendmsg: No route to host > >>>> From 111.1.1.2 icmp_seq=2 Destination Host Unreachable > >>>> From 111.1.1.2 icmp_seq=3 Destination Host Unreachable > >>>> > >>>> --- 111.1.1.1 ping statistics --- > >>>> 3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2076ms > >>>> pipe 3 > >>>> > >>>> The other issue is related to booting L2 with "x-svq=true" > >>>> and "packed=on". > >>>> > >>>> In L1: > >>>> > >>>> $ ./qemu/build/qemu-system-x86_64 \ > >>>> -nographic \ > >>>> -m 4G \ > >>>> -enable-kvm \ > >>>> -M q35 \ > >>>> -drive file=//root/L2.qcow2,media=disk,if=virtio \ > >>>> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0,x-svq=true \ > >>>> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,packed=on,bus=pcie.0,addr=0x7 \ > >>>> -smp 4 \ > >>>> -cpu host \ > >>>> 2>&1 | tee vm.log > >>>> > >>>> The kernel throws "virtio_net virtio1: output.0:id 0 is not > >>>> a head!" [4]. > >>>> > >>> > >>> So this series implements the descriptor forwarding from the guest to > >>> the device in packed vq. We also need to forward the descriptors from > >>> the device to the guest. The device writes them in the SVQ ring. > >>> > >>> The functions responsible for that in QEMU are > >>> hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_flush, which is called by > >>> the device when used descriptors are written to the SVQ, which calls > >>> hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_get_buf. We need to do > >>> modifications similar to vhost_svq_add: Make them conditional if we're > >>> in split or packed vq, and "copy" the code from Linux's > >>> drivers/virtio/virtio_ring.c:virtqueue_get_buf. 
> >>> > >>> After these modifications you should be able to ping and forward > >>> traffic. As always, It is totally ok if it needs more than one > >>> iteration, and feel free to ask any question you have :). > >>> > >> > >> I misunderstood this part. While working on extending > >> hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_get_buf() [1] > >> for packed vqs, I realized that this function and > >> vhost_svq_flush() already support split vqs. However, I am > >> unable to ping L0 when booting L2 with "x-svq=true" and > >> "packed=off" or when the "packed" option is not specified > >> in QEMU's command line. > >> > >> I tried debugging these functions for split vqs after running > >> the following QEMU commands while following the blog [2]. > >> > >> Booting L1: > >> > >> $ sudo ./qemu/build/qemu-system-x86_64 \ > >> -enable-kvm \ > >> -drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \ > >> -net nic,model=virtio \ > >> -net user,hostfwd=tcp::2222-:22 \ > >> -device intel-iommu,snoop-control=on \ > >> -device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,packed=off,event_idx=off,bus=pcie.0,addr=0x4 \ > >> -netdev tap,id=net0,script=no,downscript=no \ > >> -nographic \ > >> -m 8G \ > >> -smp 4 \ > >> -M q35 \ > >> -cpu host 2>&1 | tee vm.log > >> > >> Booting L2: > >> > >> # ./qemu/build/qemu-system-x86_64 \ > >> -nographic \ > >> -m 4G \ > >> -enable-kvm \ > >> -M q35 \ > >> -drive file=//root/L2.qcow2,media=disk,if=virtio \ > >> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=true,id=vhost-vdpa0 \ > >> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ > >> -smp 4 \ > >> -cpu host \ > >> 2>&1 | tee vm.log > >> > >> I printed out the contents of VirtQueueElement returned > >> by vhost_svq_get_buf() in vhost_svq_flush() [3]. > >> I noticed that "len" which is set by "vhost_svq_get_buf" > >> is always set to 0 while VirtQueueElement.len is non-zero. > >> I haven't understood the difference between these two "len"s. > >> > > > > VirtQueueElement.len is the length of the buffer, while the len of > > vhost_svq_get_buf is the bytes written by the device. In the case of > > the tx queue, VirtQueuelen is the length of the tx packet, and the > > vhost_svq_get_buf is always 0 as the device does not write. In the > > case of rx, VirtQueueElem.len is the available length for a rx frame, > > and the vhost_svq_get_buf len is the actual length written by the > > device. > > > > To be 100% accurate a rx packet can span over multiple buffers, but > > SVQ does not need special code to handle this. > > > > So vhost_svq_get_buf should return > 0 for rx queue (svq->vq->index == > > 0), and 0 for tx queue (svq->vq->index % 2 == 1). > > > > Take into account that vhost_svq_get_buf only handles split vq at the > > moment! It should be renamed or splitted into vhost_svq_get_buf_split. > > In L1, there are 2 virtio network devices. > > # lspci -nn | grep -i net > 00:02.0 Ethernet controller [0200]: Red Hat, Inc. Virtio network device [1af4:1000] > 00:04.0 Ethernet controller [0200]: Red Hat, Inc. Virtio 1.0 network device [1af4:1041] (rev 01) > > I am using the second one (1af4:1041) for testing my changes and have > bound this device to the vp_vdpa driver. 
> > # vdpa dev show -jp > { > "dev": { > "vdpa0": { > "type": "network", > "mgmtdev": "pci/0000:00:04.0", > "vendor_id": 6900, > "max_vqs": 3, How is max_vqs=3? For this to happen L0 QEMU should have virtio-net-pci,...,queues=3 cmdline argument. It's clear the guest is not using them, we can add mq=off to simplify the scenario. > "max_vq_size": 256 > } > } > } > > The max number of vqs is 3 with the max size being 256. > > Since, there are 2 virtio net devices, vhost_vdpa_svqs_start [1] > is called twice. For each of them. it calls vhost_svq_start [2] > v->shadow_vqs->len number of times. > Ok I understand this confusion, as the code is not intuitive :). Take into account you can only have svq in vdpa devices, so both vhost_vdpa_svqs_start are acting on the vdpa device. You are seeing two calls to vhost_vdpa_svqs_start because virtio (and vdpa) devices are modelled internally as two devices in QEMU: One for the dataplane vq, and other for the control vq. There are historical reasons for this, but we use it in vdpa to always shadow the CVQ while leaving dataplane passthrough if x-svq=off and the virtio & virtio-net feature set is understood by SVQ. If you break at vhost_vdpa_svqs_start with gdb and go higher in the stack you should reach vhost_net_start, that starts each vhost_net device individually. To be 100% honest, each dataplain *queue pair* (rx+tx) is modelled with a different vhost_net device in QEMU, but you don't need to take that into account implementing the packed vq :). > Printing the values of dev->vdev->name, v->shadow_vqs->len and > svq->vring.num in vhost_vdpa_svqs_start gives: > > name: virtio-net > len: 2 > num: 256 > num: 256 First QEMU's vhost_net device, the dataplane. > name: virtio-net > len: 1 > num: 64 > Second QEMU's vhost_net device, the control virtqueue. > I am not sure how to match the above log lines to the > right virtio-net device since the actual value of num > can be less than "max_vq_size" in the output of "vdpa > dev show". > Yes, the device can set a different vq max per vq, and the driver can negotiate a lower vq size per vq too. > I think the first 3 log lines correspond to the virtio > net device that I am using for testing since it has > 2 vqs (rx and tx) while the other virtio-net device > only has one vq. > > When printing out the values of svq->vring.num, > used_elem.len and used_elem.id in vhost_svq_get_buf, > there are two sets of output. One set corresponds to > svq->vring.num = 64 and the other corresponds to > svq->vring.num = 256. > > For svq->vring.num = 64, only the following line > is printed repeatedly: > > size: 64, len: 1, i: 0 > This is with packed=off, right? If this is testing with packed, you need to change the code to accommodate it. Let me know if you need more help with this. In the CVQ the only reply is a byte, indicating if the command was applied or not. This seems ok to me. The queue can also recycle ids as long as they are not available, so that part seems correct to me too. > For svq->vring.num = 256, the following line is > printed 20 times, > > size: 256, len: 0, i: 0 > > followed by: > > size: 256, len: 0, i: 1 > size: 256, len: 0, i: 1 > This makes sense for the tx queue too. Can you print the VirtQueue index? > used_elem.len is used to set the value of len that is > returned by vhost_svq_get_buf, and it's always 0. > > So the value of "len" returned by vhost_svq_get_buf > when called in vhost_svq_flush is also 0. 
> > Thanks, > Sahil > > [1] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-vdpa.c#L1243 > [2] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-vdpa.c#L1265 >
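A note for anyone following the thread: extending vhost_svq_get_buf() for packed rings essentially means detecting used descriptors through the AVAIL/USED flag bits instead of a used index, as virtqueue_get_buf() does in Linux's drivers/virtio/virtio_ring.c. The sketch below only illustrates that idea; the fields vring_packed, last_used_idx, used_wrap_counter and desc_state[].ndescs are assumed names for the SVQ packed-ring state, not code from this series.

/*
 * Sketch only: a packed descriptor is used once its AVAIL and USED flag
 * bits match each other and the driver's used wrap counter.
 */
static bool vhost_svq_more_used_packed(const VhostShadowVirtqueue *svq,
                                       uint16_t idx, bool used_wrap_counter)
{
    uint16_t flags = le16_to_cpu(svq->vring_packed.vring.desc[idx].flags);
    bool avail = flags & (1 << VRING_PACKED_DESC_F_AVAIL);
    bool used = flags & (1 << VRING_PACKED_DESC_F_USED);

    return avail == used && used == used_wrap_counter;
}

static VirtQueueElement *vhost_svq_get_buf_packed(VhostShadowVirtqueue *svq,
                                                  uint32_t *len)
{
    uint16_t last = svq->last_used_idx;

    if (!vhost_svq_more_used_packed(svq, last, svq->used_wrap_counter)) {
        return NULL;
    }

    /* Read id/len only after observing the used flags. */
    smp_rmb();
    uint16_t id = le16_to_cpu(svq->vring_packed.vring.desc[last].id);
    *len = le32_to_cpu(svq->vring_packed.vring.desc[last].len);

    /* A chain takes one ring slot per descriptor; step over all of them. */
    svq->last_used_idx += svq->desc_state[id].ndescs;
    if (svq->last_used_idx >= svq->vring_packed.vring.num) {
        svq->last_used_idx -= svq->vring_packed.vring.num;
        svq->used_wrap_counter = !svq->used_wrap_counter;
    }

    return g_steal_pointer(&svq->desc_state[id].elem);
}

vhost_svq_flush() could then choose between the split variant and a function like this depending on whether VIRTIO_F_RING_PACKED was negotiated, mirroring what was already done for vhost_svq_add().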
Hi, On 12/17/24 1:20 PM, Eugenio Perez Martin wrote: > On Tue, Dec 17, 2024 at 6:45 AM Sahil Siddiq <icegambit91@gmail.com> wrote: >> On 12/16/24 2:09 PM, Eugenio Perez Martin wrote: >>> On Sun, Dec 15, 2024 at 6:27 PM Sahil Siddiq <icegambit91@gmail.com> wrote: >>>> On 12/10/24 2:57 PM, Eugenio Perez Martin wrote: >>>>> On Thu, Dec 5, 2024 at 9:34 PM Sahil Siddiq <icegambit91@gmail.com> wrote: >>>>>> [...] >>>>>> I have been following the "Hands on vDPA: what do you do >>>>>> when you ain't got the hardware v2 (Part 2)" [1] blog to >>>>>> test my changes. To boot the L1 VM, I ran: >>>>>> >>>>>> sudo ./qemu/build/qemu-system-x86_64 \ >>>>>> -enable-kvm \ >>>>>> -drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \ >>>>>> -net nic,model=virtio \ >>>>>> -net user,hostfwd=tcp::2222-:22 \ >>>>>> -device intel-iommu,snoop-control=on \ >>>>>> -device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,packed=on,event_idx=off,bus=pcie.0,addr=0x4 \ >>>>>> -netdev tap,id=net0,script=no,downscript=no \ >>>>>> -nographic \ >>>>>> -m 8G \ >>>>>> -smp 4 \ >>>>>> -M q35 \ >>>>>> -cpu host 2>&1 | tee vm.log >>>>>> >>>>>> Without "guest_uso4=off,guest_uso6=off,host_uso=off, >>>>>> guest_announce=off" in "-device virtio-net-pci", QEMU >>>>>> throws "vdpa svq does not work with features" [2] when >>>>>> trying to boot L2. >>>>>> >>>>>> The enums added in commit #2 in this series is new and >>>>>> wasn't in the earlier versions of the series. Without >>>>>> this change, x-svq=true throws "SVQ invalid device feature >>>>>> flags" [3] and x-svq is consequently disabled. >>>>>> >>>>>> The first issue is related to running traffic in L2 >>>>>> with vhost-vdpa. >>>>>> >>>>>> In L0: >>>>>> >>>>>> $ ip addr add 111.1.1.1/24 dev tap0 >>>>>> $ ip link set tap0 up >>>>>> $ ip addr show tap0 >>>>>> 4: tap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000 >>>>>> link/ether d2:6d:b9:61:e1:9a brd ff:ff:ff:ff:ff:ff >>>>>> inet 111.1.1.1/24 scope global tap0 >>>>>> valid_lft forever preferred_lft forever >>>>>> inet6 fe80::d06d:b9ff:fe61:e19a/64 scope link proto kernel_ll >>>>>> valid_lft forever preferred_lft forever >>>>>> >>>>>> I am able to run traffic in L2 when booting without >>>>>> x-svq. >>>>>> >>>>>> In L1: >>>>>> >>>>>> $ ./qemu/build/qemu-system-x86_64 \ >>>>>> -nographic \ >>>>>> -m 4G \ >>>>>> -enable-kvm \ >>>>>> -M q35 \ >>>>>> -drive file=//root/L2.qcow2,media=disk,if=virtio \ >>>>>> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0 \ >>>>>> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ >>>>>> -smp 4 \ >>>>>> -cpu host \ >>>>>> 2>&1 | tee vm.log >>>>>> >>>>>> In L2: >>>>>> >>>>>> # ip addr add 111.1.1.2/24 dev eth0 >>>>>> # ip addr show eth0 >>>>>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 >>>>>> link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff >>>>>> altname enp0s7 >>>>>> inet 111.1.1.2/24 scope global eth0 >>>>>> valid_lft forever preferred_lft forever >>>>>> inet6 fe80::9877:de30:5f17:35f9/64 scope link noprefixroute >>>>>> valid_lft forever preferred_lft forever >>>>>> >>>>>> # ip route >>>>>> 111.1.1.0/24 dev eth0 proto kernel scope link src 111.1.1.2 >>>>>> >>>>>> # ping 111.1.1.1 -w3 >>>>>> PING 111.1.1.1 (111.1.1.1) 56(84) bytes of data. 
>>>>>> 64 bytes from 111.1.1.1: icmp_seq=1 ttl=64 time=0.407 ms >>>>>> 64 bytes from 111.1.1.1: icmp_seq=2 ttl=64 time=0.671 ms >>>>>> 64 bytes from 111.1.1.1: icmp_seq=3 ttl=64 time=0.291 ms >>>>>> >>>>>> --- 111.1.1.1 ping statistics --- >>>>>> 3 packets transmitted, 3 received, 0% packet loss, time 2034ms >>>>>> rtt min/avg/max/mdev = 0.291/0.456/0.671/0.159 ms >>>>>> >>>>>> >>>>>> But if I boot L2 with x-svq=true as shown below, I am unable >>>>>> to ping the host machine. >>>>>> >>>>>> $ ./qemu/build/qemu-system-x86_64 \ >>>>>> -nographic \ >>>>>> -m 4G \ >>>>>> -enable-kvm \ >>>>>> -M q35 \ >>>>>> -drive file=//root/L2.qcow2,media=disk,if=virtio \ >>>>>> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=true,id=vhost-vdpa0 \ >>>>>> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ >>>>>> -smp 4 \ >>>>>> -cpu host \ >>>>>> 2>&1 | tee vm.log >>>>>> >>>>>> In L2: >>>>>> >>>>>> # ip addr add 111.1.1.2/24 dev eth0 >>>>>> # ip addr show eth0 >>>>>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 >>>>>> link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff >>>>>> altname enp0s7 >>>>>> inet 111.1.1.2/24 scope global eth0 >>>>>> valid_lft forever preferred_lft forever >>>>>> inet6 fe80::9877:de30:5f17:35f9/64 scope link noprefixroute >>>>>> valid_lft forever preferred_lft forever >>>>>> >>>>>> # ip route >>>>>> 111.1.1.0/24 dev eth0 proto kernel scope link src 111.1.1.2 >>>>>> >>>>>> # ping 111.1.1.1 -w10 >>>>>> PING 111.1.1.1 (111.1.1.1) 56(84) bytes of data. >>>>>> From 111.1.1.2 icmp_seq=1 Destination Host Unreachable >>>>>> ping: sendmsg: No route to host >>>>>> From 111.1.1.2 icmp_seq=2 Destination Host Unreachable >>>>>> From 111.1.1.2 icmp_seq=3 Destination Host Unreachable >>>>>> >>>>>> --- 111.1.1.1 ping statistics --- >>>>>> 3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2076ms >>>>>> pipe 3 >>>>>> >>>>>> The other issue is related to booting L2 with "x-svq=true" >>>>>> and "packed=on". >>>>>> >>>>>> In L1: >>>>>> >>>>>> $ ./qemu/build/qemu-system-x86_64 \ >>>>>> -nographic \ >>>>>> -m 4G \ >>>>>> -enable-kvm \ >>>>>> -M q35 \ >>>>>> -drive file=//root/L2.qcow2,media=disk,if=virtio \ >>>>>> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0,x-svq=true \ >>>>>> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,packed=on,bus=pcie.0,addr=0x7 \ >>>>>> -smp 4 \ >>>>>> -cpu host \ >>>>>> 2>&1 | tee vm.log >>>>>> >>>>>> The kernel throws "virtio_net virtio1: output.0:id 0 is not >>>>>> a head!" [4]. >>>>>> >>>>> >>>>> So this series implements the descriptor forwarding from the guest to >>>>> the device in packed vq. We also need to forward the descriptors from >>>>> the device to the guest. The device writes them in the SVQ ring. >>>>> >>>>> The functions responsible for that in QEMU are >>>>> hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_flush, which is called by >>>>> the device when used descriptors are written to the SVQ, which calls >>>>> hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_get_buf. We need to do >>>>> modifications similar to vhost_svq_add: Make them conditional if we're >>>>> in split or packed vq, and "copy" the code from Linux's >>>>> drivers/virtio/virtio_ring.c:virtqueue_get_buf. 
>>>>> >>>>> After these modifications you should be able to ping and forward >>>>> traffic. As always, It is totally ok if it needs more than one >>>>> iteration, and feel free to ask any question you have :). >>>>> >>>> >>>> I misunderstood this part. While working on extending >>>> hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_get_buf() [1] >>>> for packed vqs, I realized that this function and >>>> vhost_svq_flush() already support split vqs. However, I am >>>> unable to ping L0 when booting L2 with "x-svq=true" and >>>> "packed=off" or when the "packed" option is not specified >>>> in QEMU's command line. >>>> >>>> I tried debugging these functions for split vqs after running >>>> the following QEMU commands while following the blog [2]. >>>> >>>> Booting L1: >>>> >>>> $ sudo ./qemu/build/qemu-system-x86_64 \ >>>> -enable-kvm \ >>>> -drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \ >>>> -net nic,model=virtio \ >>>> -net user,hostfwd=tcp::2222-:22 \ >>>> -device intel-iommu,snoop-control=on \ >>>> -device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,packed=off,event_idx=off,bus=pcie.0,addr=0x4 \ >>>> -netdev tap,id=net0,script=no,downscript=no \ >>>> -nographic \ >>>> -m 8G \ >>>> -smp 4 \ >>>> -M q35 \ >>>> -cpu host 2>&1 | tee vm.log >>>> >>>> Booting L2: >>>> >>>> # ./qemu/build/qemu-system-x86_64 \ >>>> -nographic \ >>>> -m 4G \ >>>> -enable-kvm \ >>>> -M q35 \ >>>> -drive file=//root/L2.qcow2,media=disk,if=virtio \ >>>> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=true,id=vhost-vdpa0 \ >>>> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ >>>> -smp 4 \ >>>> -cpu host \ >>>> 2>&1 | tee vm.log >>>> >>>> I printed out the contents of VirtQueueElement returned >>>> by vhost_svq_get_buf() in vhost_svq_flush() [3]. >>>> I noticed that "len" which is set by "vhost_svq_get_buf" >>>> is always set to 0 while VirtQueueElement.len is non-zero. >>>> I haven't understood the difference between these two "len"s. >>>> >>> >>> VirtQueueElement.len is the length of the buffer, while the len of >>> vhost_svq_get_buf is the bytes written by the device. In the case of >>> the tx queue, VirtQueuelen is the length of the tx packet, and the >>> vhost_svq_get_buf is always 0 as the device does not write. In the >>> case of rx, VirtQueueElem.len is the available length for a rx frame, >>> and the vhost_svq_get_buf len is the actual length written by the >>> device. >>> >>> To be 100% accurate a rx packet can span over multiple buffers, but >>> SVQ does not need special code to handle this. >>> >>> So vhost_svq_get_buf should return > 0 for rx queue (svq->vq->index == >>> 0), and 0 for tx queue (svq->vq->index % 2 == 1). >>> >>> Take into account that vhost_svq_get_buf only handles split vq at the >>> moment! It should be renamed or splitted into vhost_svq_get_buf_split. >> >> In L1, there are 2 virtio network devices. >> >> # lspci -nn | grep -i net >> 00:02.0 Ethernet controller [0200]: Red Hat, Inc. Virtio network device [1af4:1000] >> 00:04.0 Ethernet controller [0200]: Red Hat, Inc. Virtio 1.0 network device [1af4:1041] (rev 01) >> >> I am using the second one (1af4:1041) for testing my changes and have >> bound this device to the vp_vdpa driver. 
>> >> # vdpa dev show -jp >> { >> "dev": { >> "vdpa0": { >> "type": "network", >> "mgmtdev": "pci/0000:00:04.0", >> "vendor_id": 6900, >> "max_vqs": 3, > > How is max_vqs=3? For this to happen L0 QEMU should have > virtio-net-pci,...,queues=3 cmdline argument. I am not sure why max_vqs is 3. I haven't set the value of queues to 3 in the cmdline argument. Is max_vqs expected to have a default value other than 3? In the blog [1] as well, max_vqs is 3 even though there's no queues=3 argument. > It's clear the guest is not using them, we can add mq=off > to simplify the scenario. The value of max_vqs is still 3 after adding mq=off. The whole command that I run to boot L0 is: $ sudo ./qemu/build/qemu-system-x86_64 \ -enable-kvm \ -drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \ -net nic,model=virtio \ -net user,hostfwd=tcp::2222-:22 \ -device intel-iommu,snoop-control=on \ -device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,mq=off,ctrl_vq=on,ctrl_rx=on,packed=off,event_idx=off,bus=pcie.0,addr=0x4 \ -netdev tap,id=net0,script=no,downscript=no \ -nographic \ -m 8G \ -smp 4 \ -M q35 \ -cpu host 2>&1 | tee vm.log Could it be that 2 of the 3 vqs are used for the dataplane and the third vq is the control vq? >> "max_vq_size": 256 >> } >> } >> } >> >> The max number of vqs is 3 with the max size being 256. >> >> Since, there are 2 virtio net devices, vhost_vdpa_svqs_start [1] >> is called twice. For each of them. it calls vhost_svq_start [2] >> v->shadow_vqs->len number of times. >> > > Ok I understand this confusion, as the code is not intuitive :). Take > into account you can only have svq in vdpa devices, so both > vhost_vdpa_svqs_start are acting on the vdpa device. > > You are seeing two calls to vhost_vdpa_svqs_start because virtio (and > vdpa) devices are modelled internally as two devices in QEMU: One for > the dataplane vq, and other for the control vq. There are historical > reasons for this, but we use it in vdpa to always shadow the CVQ while > leaving dataplane passthrough if x-svq=off and the virtio & virtio-net > feature set is understood by SVQ. > > If you break at vhost_vdpa_svqs_start with gdb and go higher in the > stack you should reach vhost_net_start, that starts each vhost_net > device individually. > > To be 100% honest, each dataplain *queue pair* (rx+tx) is modelled > with a different vhost_net device in QEMU, but you don't need to take > that into account implementing the packed vq :). Got it, this makes sense now. >> Printing the values of dev->vdev->name, v->shadow_vqs->len and >> svq->vring.num in vhost_vdpa_svqs_start gives: >> >> name: virtio-net >> len: 2 >> num: 256 >> num: 256 > > First QEMU's vhost_net device, the dataplane. > >> name: virtio-net >> len: 1 >> num: 64 >> > > Second QEMU's vhost_net device, the control virtqueue. Ok, if I understand this correctly, the control vq doesn't need separate queues for rx and tx. >> I am not sure how to match the above log lines to the >> right virtio-net device since the actual value of num >> can be less than "max_vq_size" in the output of "vdpa >> dev show". >> > > Yes, the device can set a different vq max per vq, and the driver can > negotiate a lower vq size per vq too. > >> I think the first 3 log lines correspond to the virtio >> net device that I am using for testing since it has >> 2 vqs (rx and tx) while the other virtio-net device >> only has one vq. 
>> >> When printing out the values of svq->vring.num, >> used_elem.len and used_elem.id in vhost_svq_get_buf, >> there are two sets of output. One set corresponds to >> svq->vring.num = 64 and the other corresponds to >> svq->vring.num = 256. >> >> For svq->vring.num = 64, only the following line >> is printed repeatedly: >> >> size: 64, len: 1, i: 0 >> > > This is with packed=off, right? If this is testing with packed, you > need to change the code to accommodate it. Let me know if you need > more help with this. Yes, this is for packed=off. For the time being, I am trying to get L2 to communicate with L0 using split virtqueues and x-svq=true. > In the CVQ the only reply is a byte, indicating if the command was > applied or not. This seems ok to me. Understood. > The queue can also recycle ids as long as they are not available, so > that part seems correct to me too. I am a little confused here. The ids are recycled when they are available (i.e., the id is not already in use), right? >> For svq->vring.num = 256, the following line is >> printed 20 times, >> >> size: 256, len: 0, i: 0 >> >> followed by: >> >> size: 256, len: 0, i: 1 >> size: 256, len: 0, i: 1 >> > > This makes sense for the tx queue too. Can you print the VirtQueue index? For svq->vring.num = 64, the vq index is 2. So the following line (svq->vring.num, used_elem.len, used_elem.id, svq->vq->queue_index) is printed repeatedly: size: 64, len: 1, i: 0, vq idx: 2 For svq->vring.num = 256, the following line is repeated several times: size: 256, len: 0, i: 0, vq idx: 1 This is followed by: size: 256, len: 0, i: 1, vq idx: 1 In both cases, queue_index is 1. To get the value of queue_index, I used "virtio_get_queue_index(svq->vq)" [2]. Since the queue_index is 1, I guess this means this is the tx queue and the value of len (0) is correct. However, nothing with queue_index % 2 == 0 is printed by vhost_svq_get_buf() which means the device is not sending anything to the guest. Is this correct? >> used_elem.len is used to set the value of len that is >> returned by vhost_svq_get_buf, and it's always 0. >> >> So the value of "len" returned by vhost_svq_get_buf >> when called in vhost_svq_flush is also 0. >> >> Thanks, >> Sahil >> >> [1] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-vdpa.c#L1243 >> [2] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-vdpa.c#L1265 >> > Thanks, Sahil [1] https://www.redhat.com/en/blog/hands-vdpa-what-do-you-do-when-you-aint-got-hardware-part-2 [2] https://gitlab.com/qemu-project/qemu/-/blob/99d6a32469debf1a48921125879b614d15acfb7a/hw/virtio/virtio.c#L3454
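For anyone reproducing the experiment, the debug output quoted above can be produced with a print at the end of vhost_svq_get_buf(), once used_elem has been read from the used ring; the exact placement and format string here are just an illustration:

    /* Debug sketch: dump what the device reported for each used element. */
    fprintf(stderr, "size: %u, len: %u, i: %u, vq idx: %d\n",
            svq->vring.num, used_elem.len, used_elem.id,
            virtio_get_queue_index(svq->vq));

virtio_get_queue_index() is the helper linked in [2] above.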
On Thu, Dec 19, 2024 at 8:37 PM Sahil Siddiq <icegambit91@gmail.com> wrote: > > Hi, > > On 12/17/24 1:20 PM, Eugenio Perez Martin wrote: > > On Tue, Dec 17, 2024 at 6:45 AM Sahil Siddiq <icegambit91@gmail.com> wrote: > >> On 12/16/24 2:09 PM, Eugenio Perez Martin wrote: > >>> On Sun, Dec 15, 2024 at 6:27 PM Sahil Siddiq <icegambit91@gmail.com> wrote: > >>>> On 12/10/24 2:57 PM, Eugenio Perez Martin wrote: > >>>>> On Thu, Dec 5, 2024 at 9:34 PM Sahil Siddiq <icegambit91@gmail.com> wrote: > >>>>>> [...] > >>>>>> I have been following the "Hands on vDPA: what do you do > >>>>>> when you ain't got the hardware v2 (Part 2)" [1] blog to > >>>>>> test my changes. To boot the L1 VM, I ran: > >>>>>> > >>>>>> sudo ./qemu/build/qemu-system-x86_64 \ > >>>>>> -enable-kvm \ > >>>>>> -drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \ > >>>>>> -net nic,model=virtio \ > >>>>>> -net user,hostfwd=tcp::2222-:22 \ > >>>>>> -device intel-iommu,snoop-control=on \ > >>>>>> -device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,packed=on,event_idx=off,bus=pcie.0,addr=0x4 \ > >>>>>> -netdev tap,id=net0,script=no,downscript=no \ > >>>>>> -nographic \ > >>>>>> -m 8G \ > >>>>>> -smp 4 \ > >>>>>> -M q35 \ > >>>>>> -cpu host 2>&1 | tee vm.log > >>>>>> > >>>>>> Without "guest_uso4=off,guest_uso6=off,host_uso=off, > >>>>>> guest_announce=off" in "-device virtio-net-pci", QEMU > >>>>>> throws "vdpa svq does not work with features" [2] when > >>>>>> trying to boot L2. > >>>>>> > >>>>>> The enums added in commit #2 in this series is new and > >>>>>> wasn't in the earlier versions of the series. Without > >>>>>> this change, x-svq=true throws "SVQ invalid device feature > >>>>>> flags" [3] and x-svq is consequently disabled. > >>>>>> > >>>>>> The first issue is related to running traffic in L2 > >>>>>> with vhost-vdpa. > >>>>>> > >>>>>> In L0: > >>>>>> > >>>>>> $ ip addr add 111.1.1.1/24 dev tap0 > >>>>>> $ ip link set tap0 up > >>>>>> $ ip addr show tap0 > >>>>>> 4: tap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000 > >>>>>> link/ether d2:6d:b9:61:e1:9a brd ff:ff:ff:ff:ff:ff > >>>>>> inet 111.1.1.1/24 scope global tap0 > >>>>>> valid_lft forever preferred_lft forever > >>>>>> inet6 fe80::d06d:b9ff:fe61:e19a/64 scope link proto kernel_ll > >>>>>> valid_lft forever preferred_lft forever > >>>>>> > >>>>>> I am able to run traffic in L2 when booting without > >>>>>> x-svq. 
> >>>>>> > >>>>>> In L1: > >>>>>> > >>>>>> $ ./qemu/build/qemu-system-x86_64 \ > >>>>>> -nographic \ > >>>>>> -m 4G \ > >>>>>> -enable-kvm \ > >>>>>> -M q35 \ > >>>>>> -drive file=//root/L2.qcow2,media=disk,if=virtio \ > >>>>>> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0 \ > >>>>>> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ > >>>>>> -smp 4 \ > >>>>>> -cpu host \ > >>>>>> 2>&1 | tee vm.log > >>>>>> > >>>>>> In L2: > >>>>>> > >>>>>> # ip addr add 111.1.1.2/24 dev eth0 > >>>>>> # ip addr show eth0 > >>>>>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 > >>>>>> link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff > >>>>>> altname enp0s7 > >>>>>> inet 111.1.1.2/24 scope global eth0 > >>>>>> valid_lft forever preferred_lft forever > >>>>>> inet6 fe80::9877:de30:5f17:35f9/64 scope link noprefixroute > >>>>>> valid_lft forever preferred_lft forever > >>>>>> > >>>>>> # ip route > >>>>>> 111.1.1.0/24 dev eth0 proto kernel scope link src 111.1.1.2 > >>>>>> > >>>>>> # ping 111.1.1.1 -w3 > >>>>>> PING 111.1.1.1 (111.1.1.1) 56(84) bytes of data. > >>>>>> 64 bytes from 111.1.1.1: icmp_seq=1 ttl=64 time=0.407 ms > >>>>>> 64 bytes from 111.1.1.1: icmp_seq=2 ttl=64 time=0.671 ms > >>>>>> 64 bytes from 111.1.1.1: icmp_seq=3 ttl=64 time=0.291 ms > >>>>>> > >>>>>> --- 111.1.1.1 ping statistics --- > >>>>>> 3 packets transmitted, 3 received, 0% packet loss, time 2034ms > >>>>>> rtt min/avg/max/mdev = 0.291/0.456/0.671/0.159 ms > >>>>>> > >>>>>> > >>>>>> But if I boot L2 with x-svq=true as shown below, I am unable > >>>>>> to ping the host machine. > >>>>>> > >>>>>> $ ./qemu/build/qemu-system-x86_64 \ > >>>>>> -nographic \ > >>>>>> -m 4G \ > >>>>>> -enable-kvm \ > >>>>>> -M q35 \ > >>>>>> -drive file=//root/L2.qcow2,media=disk,if=virtio \ > >>>>>> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=true,id=vhost-vdpa0 \ > >>>>>> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ > >>>>>> -smp 4 \ > >>>>>> -cpu host \ > >>>>>> 2>&1 | tee vm.log > >>>>>> > >>>>>> In L2: > >>>>>> > >>>>>> # ip addr add 111.1.1.2/24 dev eth0 > >>>>>> # ip addr show eth0 > >>>>>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 > >>>>>> link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff > >>>>>> altname enp0s7 > >>>>>> inet 111.1.1.2/24 scope global eth0 > >>>>>> valid_lft forever preferred_lft forever > >>>>>> inet6 fe80::9877:de30:5f17:35f9/64 scope link noprefixroute > >>>>>> valid_lft forever preferred_lft forever > >>>>>> > >>>>>> # ip route > >>>>>> 111.1.1.0/24 dev eth0 proto kernel scope link src 111.1.1.2 > >>>>>> > >>>>>> # ping 111.1.1.1 -w10 > >>>>>> PING 111.1.1.1 (111.1.1.1) 56(84) bytes of data. > >>>>>> From 111.1.1.2 icmp_seq=1 Destination Host Unreachable > >>>>>> ping: sendmsg: No route to host > >>>>>> From 111.1.1.2 icmp_seq=2 Destination Host Unreachable > >>>>>> From 111.1.1.2 icmp_seq=3 Destination Host Unreachable > >>>>>> > >>>>>> --- 111.1.1.1 ping statistics --- > >>>>>> 3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 2076ms > >>>>>> pipe 3 > >>>>>> > >>>>>> The other issue is related to booting L2 with "x-svq=true" > >>>>>> and "packed=on". 
> >>>>>> > >>>>>> In L1: > >>>>>> > >>>>>> $ ./qemu/build/qemu-system-x86_64 \ > >>>>>> -nographic \ > >>>>>> -m 4G \ > >>>>>> -enable-kvm \ > >>>>>> -M q35 \ > >>>>>> -drive file=//root/L2.qcow2,media=disk,if=virtio \ > >>>>>> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhost-vdpa0,x-svq=true \ > >>>>>> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,packed=on,bus=pcie.0,addr=0x7 \ > >>>>>> -smp 4 \ > >>>>>> -cpu host \ > >>>>>> 2>&1 | tee vm.log > >>>>>> > >>>>>> The kernel throws "virtio_net virtio1: output.0:id 0 is not > >>>>>> a head!" [4]. > >>>>>> > >>>>> > >>>>> So this series implements the descriptor forwarding from the guest to > >>>>> the device in packed vq. We also need to forward the descriptors from > >>>>> the device to the guest. The device writes them in the SVQ ring. > >>>>> > >>>>> The functions responsible for that in QEMU are > >>>>> hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_flush, which is called by > >>>>> the device when used descriptors are written to the SVQ, which calls > >>>>> hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_get_buf. We need to do > >>>>> modifications similar to vhost_svq_add: Make them conditional if we're > >>>>> in split or packed vq, and "copy" the code from Linux's > >>>>> drivers/virtio/virtio_ring.c:virtqueue_get_buf. > >>>>> > >>>>> After these modifications you should be able to ping and forward > >>>>> traffic. As always, It is totally ok if it needs more than one > >>>>> iteration, and feel free to ask any question you have :). > >>>>> > >>>> > >>>> I misunderstood this part. While working on extending > >>>> hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_get_buf() [1] > >>>> for packed vqs, I realized that this function and > >>>> vhost_svq_flush() already support split vqs. However, I am > >>>> unable to ping L0 when booting L2 with "x-svq=true" and > >>>> "packed=off" or when the "packed" option is not specified > >>>> in QEMU's command line. > >>>> > >>>> I tried debugging these functions for split vqs after running > >>>> the following QEMU commands while following the blog [2]. > >>>> > >>>> Booting L1: > >>>> > >>>> $ sudo ./qemu/build/qemu-system-x86_64 \ > >>>> -enable-kvm \ > >>>> -drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \ > >>>> -net nic,model=virtio \ > >>>> -net user,hostfwd=tcp::2222-:22 \ > >>>> -device intel-iommu,snoop-control=on \ > >>>> -device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,ctrl_vq=on,ctrl_rx=on,packed=off,event_idx=off,bus=pcie.0,addr=0x4 \ > >>>> -netdev tap,id=net0,script=no,downscript=no \ > >>>> -nographic \ > >>>> -m 8G \ > >>>> -smp 4 \ > >>>> -M q35 \ > >>>> -cpu host 2>&1 | tee vm.log > >>>> > >>>> Booting L2: > >>>> > >>>> # ./qemu/build/qemu-system-x86_64 \ > >>>> -nographic \ > >>>> -m 4G \ > >>>> -enable-kvm \ > >>>> -M q35 \ > >>>> -drive file=//root/L2.qcow2,media=disk,if=virtio \ > >>>> -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,x-svq=true,id=vhost-vdpa0 \ > >>>> -device virtio-net-pci,netdev=vhost-vdpa0,disable-legacy=on,disable-modern=off,ctrl_vq=on,ctrl_rx=on,event_idx=off,bus=pcie.0,addr=0x7 \ > >>>> -smp 4 \ > >>>> -cpu host \ > >>>> 2>&1 | tee vm.log > >>>> > >>>> I printed out the contents of VirtQueueElement returned > >>>> by vhost_svq_get_buf() in vhost_svq_flush() [3]. 
> >>>> I noticed that "len" which is set by "vhost_svq_get_buf" > >>>> is always set to 0 while VirtQueueElement.len is non-zero. > >>>> I haven't understood the difference between these two "len"s. > >>>> > >>> > >>> VirtQueueElement.len is the length of the buffer, while the len of > >>> vhost_svq_get_buf is the bytes written by the device. In the case of > >>> the tx queue, VirtQueuelen is the length of the tx packet, and the > >>> vhost_svq_get_buf is always 0 as the device does not write. In the > >>> case of rx, VirtQueueElem.len is the available length for a rx frame, > >>> and the vhost_svq_get_buf len is the actual length written by the > >>> device. > >>> > >>> To be 100% accurate a rx packet can span over multiple buffers, but > >>> SVQ does not need special code to handle this. > >>> > >>> So vhost_svq_get_buf should return > 0 for rx queue (svq->vq->index == > >>> 0), and 0 for tx queue (svq->vq->index % 2 == 1). > >>> > >>> Take into account that vhost_svq_get_buf only handles split vq at the > >>> moment! It should be renamed or splitted into vhost_svq_get_buf_split. > >> > >> In L1, there are 2 virtio network devices. > >> > >> # lspci -nn | grep -i net > >> 00:02.0 Ethernet controller [0200]: Red Hat, Inc. Virtio network device [1af4:1000] > >> 00:04.0 Ethernet controller [0200]: Red Hat, Inc. Virtio 1.0 network device [1af4:1041] (rev 01) > >> > >> I am using the second one (1af4:1041) for testing my changes and have > >> bound this device to the vp_vdpa driver. > >> > >> # vdpa dev show -jp > >> { > >> "dev": { > >> "vdpa0": { > >> "type": "network", > >> "mgmtdev": "pci/0000:00:04.0", > >> "vendor_id": 6900, > >> "max_vqs": 3, > > > > How is max_vqs=3? For this to happen L0 QEMU should have > > virtio-net-pci,...,queues=3 cmdline argument. Ouch! I totally misread it :(. Everything is correct, max_vqs should be 3. I read it as the virtio_net queues, which means queue *pairs*, as it includes rx and tx queue. > > I am not sure why max_vqs is 3. I haven't set the value of queues to 3 > in the cmdline argument. Is max_vqs expected to have a default value > other than 3? > > In the blog [1] as well, max_vqs is 3 even though there's no queues=3 > argument. > > > It's clear the guest is not using them, we can add mq=off > > to simplify the scenario. > > The value of max_vqs is still 3 after adding mq=off. The whole > command that I run to boot L0 is: > > $ sudo ./qemu/build/qemu-system-x86_64 \ > -enable-kvm \ > -drive file=//home/valdaarhun/valdaarhun/qcow2_img/L1.qcow2,media=disk,if=virtio \ > -net nic,model=virtio \ > -net user,hostfwd=tcp::2222-:22 \ > -device intel-iommu,snoop-control=on \ > -device virtio-net-pci,netdev=net0,disable-legacy=on,disable-modern=off,iommu_platform=on,guest_uso4=off,guest_uso6=off,host_uso=off,guest_announce=off,mq=off,ctrl_vq=on,ctrl_rx=on,packed=off,event_idx=off,bus=pcie.0,addr=0x4 \ > -netdev tap,id=net0,script=no,downscript=no \ > -nographic \ > -m 8G \ > -smp 4 \ > -M q35 \ > -cpu host 2>&1 | tee vm.log > > Could it be that 2 of the 3 vqs are used for the dataplane and > the third vq is the control vq? > > >> "max_vq_size": 256 > >> } > >> } > >> } > >> > >> The max number of vqs is 3 with the max size being 256. > >> > >> Since, there are 2 virtio net devices, vhost_vdpa_svqs_start [1] > >> is called twice. For each of them. it calls vhost_svq_start [2] > >> v->shadow_vqs->len number of times. > >> > > > > Ok I understand this confusion, as the code is not intuitive :). 
Take > > into account you can only have svq in vdpa devices, so both > > vhost_vdpa_svqs_start are acting on the vdpa device. > > > > You are seeing two calls to vhost_vdpa_svqs_start because virtio (and > > vdpa) devices are modelled internally as two devices in QEMU: One for > > the dataplane vq, and other for the control vq. There are historical > > reasons for this, but we use it in vdpa to always shadow the CVQ while > > leaving dataplane passthrough if x-svq=off and the virtio & virtio-net > > feature set is understood by SVQ. > > > > If you break at vhost_vdpa_svqs_start with gdb and go higher in the > > stack you should reach vhost_net_start, that starts each vhost_net > > device individually. > > > > To be 100% honest, each dataplain *queue pair* (rx+tx) is modelled > > with a different vhost_net device in QEMU, but you don't need to take > > that into account implementing the packed vq :). > > Got it, this makes sense now. > > >> Printing the values of dev->vdev->name, v->shadow_vqs->len and > >> svq->vring.num in vhost_vdpa_svqs_start gives: > >> > >> name: virtio-net > >> len: 2 > >> num: 256 > >> num: 256 > > > > First QEMU's vhost_net device, the dataplane. > > > >> name: virtio-net > >> len: 1 > >> num: 64 > >> > > > > Second QEMU's vhost_net device, the control virtqueue. > > Ok, if I understand this correctly, the control vq doesn't > need separate queues for rx and tx. > That's right. Since CVQ has one reply per command, the driver can just send ro+rw descriptors to the device. In the case of RX, the device needs a queue with only-writable descriptors, as neither the device or the driver knows how many packets will arrive. > >> I am not sure how to match the above log lines to the > >> right virtio-net device since the actual value of num > >> can be less than "max_vq_size" in the output of "vdpa > >> dev show". > >> > > > > Yes, the device can set a different vq max per vq, and the driver can > > negotiate a lower vq size per vq too. > > > >> I think the first 3 log lines correspond to the virtio > >> net device that I am using for testing since it has > >> 2 vqs (rx and tx) while the other virtio-net device > >> only has one vq. > >> > >> When printing out the values of svq->vring.num, > >> used_elem.len and used_elem.id in vhost_svq_get_buf, > >> there are two sets of output. One set corresponds to > >> svq->vring.num = 64 and the other corresponds to > >> svq->vring.num = 256. > >> > >> For svq->vring.num = 64, only the following line > >> is printed repeatedly: > >> > >> size: 64, len: 1, i: 0 > >> > > > > This is with packed=off, right? If this is testing with packed, you > > need to change the code to accommodate it. Let me know if you need > > more help with this. > > Yes, this is for packed=off. For the time being, I am trying to > get L2 to communicate with L0 using split virtqueues and x-svq=true. > Got it. > > In the CVQ the only reply is a byte, indicating if the command was > > applied or not. This seems ok to me. > > Understood. > > > The queue can also recycle ids as long as they are not available, so > > that part seems correct to me too. > > I am a little confused here. The ids are recycled when they are > available (i.e., the id is not already in use), right? > In virtio, available is that the device can use them. And used is that the device returned to the driver. I think you're aligned it's just it is better to follow the virtio nomenclature :). 
> >> For svq->vring.num = 256, the following line is > >> printed 20 times, > >> > >> size: 256, len: 0, i: 0 > >> > >> followed by: > >> > >> size: 256, len: 0, i: 1 > >> size: 256, len: 0, i: 1 > >> > > > > This makes sense for the tx queue too. Can you print the VirtQueue index? > > For svq->vring.num = 64, the vq index is 2. So the following line > (svq->vring.num, used_elem.len, used_elem.id, svq->vq->queue_index) > is printed repeatedly: > > size: 64, len: 1, i: 0, vq idx: 2 > > For svq->vring.num = 256, the following line is repeated several > times: > > size: 256, len: 0, i: 0, vq idx: 1 > > This is followed by: > > size: 256, len: 0, i: 1, vq idx: 1 > > In both cases, queue_index is 1. To get the value of queue_index, > I used "virtio_get_queue_index(svq->vq)" [2]. > > Since the queue_index is 1, I guess this means this is the tx queue > and the value of len (0) is correct. However, nothing with > queue_index % 2 == 0 is printed by vhost_svq_get_buf() which means > the device is not sending anything to the guest. Is this correct? > Yes, that's totally correct. You can set -netdev tap,...,vhost=off in L0 qemu and trace (or debug with gdb) it to check what is receiving. You should see calls to hw/net/virtio-net.c:virtio_net_flush_tx. The corresponding function to receive is virtio_net_receive_rcu, I recommend you trace too just it in case you see any strange call to it. > >> used_elem.len is used to set the value of len that is > >> returned by vhost_svq_get_buf, and it's always 0. > >> > >> So the value of "len" returned by vhost_svq_get_buf > >> when called in vhost_svq_flush is also 0. > >> > >> Thanks, > >> Sahil > >> > >> [1] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-vdpa.c#L1243 > >> [2] https://gitlab.com/qemu-project/qemu/-/blob/master/hw/virtio/vhost-vdpa.c#L1265 > >> > > > > Thanks, > Sahil > > [1] https://www.redhat.com/en/blog/hands-vdpa-what-do-you-do-when-you-aint-got-hardware-part-2 > [2] https://gitlab.com/qemu-project/qemu/-/blob/99d6a32469debf1a48921125879b614d15acfb7a/hw/virtio/virtio.c#L3454 >
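One possible way to follow the suggestion above: start the L0 QEMU with vhost=off on the tap netdev and run it under gdb with breakpoints on the two functions mentioned. The rest of the L0 command line is the same as before and is abbreviated here:

$ gdb --args ./qemu/build/qemu-system-x86_64 \
      -netdev tap,id=net0,script=no,downscript=no,vhost=off \
      ...
(gdb) break virtio_net_flush_tx
(gdb) break virtio_net_receive_rcu
(gdb) run

If the tx path works end to end, the virtio_net_flush_tx breakpoint should trigger while L2 is sending traffic; virtio_net_receive_rcu covers the opposite direction, so hits (or the absence of them) there show whether anything is coming back towards the guest.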