Message ID | 20190717113030.163499-1-sgarzare@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | vsock/virtio: optimizations to increase the throughput | expand |
On Wed, Jul 17, 2019 at 01:30:25PM +0200, Stefano Garzarella wrote: > This series tries to increase the throughput of virtio-vsock with slight > changes. > While I was testing the v2 of this series I discovered an huge use of memory, > so I added patch 1 to mitigate this issue. I put it in this series in order > to better track the performance trends. > > v4: > - rebased all patches on current master (conflicts is Patch 4) > - Patch 1: added Stefan's R-b > - Patch 3: removed lock when buf_alloc is written [David]; > moved this patch after "vsock/virtio: reduce credit update messages" > to make it clearer > - Patch 4: vhost_exceeds_weight() is recently introduced, so I've solved some > conflicts Stefano: Do you want to continue experimenting before we merge this patch series? The code looks functionally correct and the performance increases, so I'm happy for it to be merged.
On Mon, Jul 22, 2019 at 10:08:35AM +0100, Stefan Hajnoczi wrote: > On Wed, Jul 17, 2019 at 01:30:25PM +0200, Stefano Garzarella wrote: > > This series tries to increase the throughput of virtio-vsock with slight > > changes. > > While I was testing the v2 of this series I discovered an huge use of memory, > > so I added patch 1 to mitigate this issue. I put it in this series in order > > to better track the performance trends. > > > > v4: > > - rebased all patches on current master (conflicts is Patch 4) > > - Patch 1: added Stefan's R-b > > - Patch 3: removed lock when buf_alloc is written [David]; > > moved this patch after "vsock/virtio: reduce credit update messages" > > to make it clearer > > - Patch 4: vhost_exceeds_weight() is recently introduced, so I've solved some > > conflicts > > Stefano: Do you want to continue experimenting before we merge this > patch series? The code looks functionally correct and the performance > increases, so I'm happy for it to be merged. I think we can merge this series. I'll continue to do other experiments (e.g. removing TX workers, allocating pages, etc.) but I think these changes are prerequisites for the other patches, so we can merge them. Thank you very much for the reviews! Stefano
On Mon, Jul 22, 2019 at 11:14:34AM +0200, Stefano Garzarella wrote: > On Mon, Jul 22, 2019 at 10:08:35AM +0100, Stefan Hajnoczi wrote: > > On Wed, Jul 17, 2019 at 01:30:25PM +0200, Stefano Garzarella wrote: > > > This series tries to increase the throughput of virtio-vsock with slight > > > changes. > > > While I was testing the v2 of this series I discovered an huge use of memory, > > > so I added patch 1 to mitigate this issue. I put it in this series in order > > > to better track the performance trends. > > > > > > v4: > > > - rebased all patches on current master (conflicts is Patch 4) > > > - Patch 1: added Stefan's R-b > > > - Patch 3: removed lock when buf_alloc is written [David]; > > > moved this patch after "vsock/virtio: reduce credit update messages" > > > to make it clearer > > > - Patch 4: vhost_exceeds_weight() is recently introduced, so I've solved some > > > conflicts > > > > Stefano: Do you want to continue experimenting before we merge this > > patch series? The code looks functionally correct and the performance > > increases, so I'm happy for it to be merged. > > I think we can merge this series. > > I'll continue to do other experiments (e.g. removing TX workers, allocating > pages, etc.) but I think these changes are prerequisites for the other patches, > so we can merge them. > > Thank you very much for the reviews! All patches have been reviewed by here. Have an Ack for good measure: Acked-by: Stefan Hajnoczi <stefanha@redhat.com> The topics discussed in sub-threads relate to longer-term optimization work that doesn't block this series. Please merge.
On Wed, Jul 17, 2019 at 01:30:25PM +0200, Stefano Garzarella wrote: > This series tries to increase the throughput of virtio-vsock with slight > changes. > While I was testing the v2 of this series I discovered an huge use of memory, > so I added patch 1 to mitigate this issue. I put it in this series in order > to better track the performance trends. Series: Acked-by: Michael S. Tsirkin <mst@redhat.com> Can this go into net-next? > v4: > - rebased all patches on current master (conflicts is Patch 4) > - Patch 1: added Stefan's R-b > - Patch 3: removed lock when buf_alloc is written [David]; > moved this patch after "vsock/virtio: reduce credit update messages" > to make it clearer > - Patch 4: vhost_exceeds_weight() is recently introduced, so I've solved some > conflicts > > v3: https://patchwork.kernel.org/cover/10970145 > > v2: https://patchwork.kernel.org/cover/10938743 > > v1: https://patchwork.kernel.org/cover/10885431 > > Below are the benchmarks step by step. I used iperf3 [1] modified with VSOCK > support. As Micheal suggested in the v1, I booted host and guest with 'nosmap'. > > A brief description of patches: > - Patches 1: limit the memory usage with an extra copy for small packets > - Patches 2+3: reduce the number of credit update messages sent to the > transmitter > - Patches 4+5: allow the host to split packets on multiple buffers and use > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE as the max packet size allowed > > host -> guest [Gbps] > pkt_size before opt p 1 p 2+3 p 4+5 > > 32 0.032 0.030 0.048 0.051 > 64 0.061 0.059 0.108 0.117 > 128 0.122 0.112 0.227 0.234 > 256 0.244 0.241 0.418 0.415 > 512 0.459 0.466 0.847 0.865 > 1K 0.927 0.919 1.657 1.641 > 2K 1.884 1.813 3.262 3.269 > 4K 3.378 3.326 6.044 6.195 > 8K 5.637 5.676 10.141 11.287 > 16K 8.250 8.402 15.976 16.736 > 32K 13.327 13.204 19.013 20.515 > 64K 21.241 21.341 20.973 21.879 > 128K 21.851 22.354 21.816 23.203 > 256K 21.408 21.693 21.846 24.088 > 512K 21.600 21.899 21.921 24.106 > > guest -> host [Gbps] > pkt_size before opt p 1 p 2+3 p 4+5 > > 32 0.045 0.046 0.057 0.057 > 64 0.089 0.091 0.103 0.104 > 128 0.170 0.179 0.192 0.200 > 256 0.364 0.351 0.361 0.379 > 512 0.709 0.699 0.731 0.790 > 1K 1.399 1.407 1.395 1.427 > 2K 2.670 2.684 2.745 2.835 > 4K 5.171 5.199 5.305 5.451 > 8K 8.442 8.500 10.083 9.941 > 16K 12.305 12.259 13.519 15.385 > 32K 11.418 11.150 11.988 24.680 > 64K 10.778 10.659 11.589 35.273 > 128K 10.421 10.339 10.939 40.338 > 256K 10.300 9.719 10.508 36.562 > 512K 9.833 9.808 10.612 35.979 > > As Stefan suggested in the v1, I measured also the efficiency in this way: > efficiency = Mbps / (%CPU_Host + %CPU_Guest) > > The '%CPU_Guest' is taken inside the VM. I know that it is not the best way, > but it's provided for free from iperf3 and could be an indication. > > host -> guest efficiency [Mbps / (%CPU_Host + %CPU_Guest)] > pkt_size before opt p 1 p 2+3 p 4+5 > > 32 0.35 0.45 0.79 1.02 > 64 0.56 0.80 1.41 1.54 > 128 1.11 1.52 3.03 3.12 > 256 2.20 2.16 5.44 5.58 > 512 4.17 4.18 10.96 11.46 > 1K 8.30 8.26 20.99 20.89 > 2K 16.82 16.31 39.76 39.73 > 4K 30.89 30.79 74.07 75.73 > 8K 53.74 54.49 124.24 148.91 > 16K 80.68 83.63 200.21 232.79 > 32K 132.27 132.52 260.81 357.07 > 64K 229.82 230.40 300.19 444.18 > 128K 332.60 329.78 331.51 492.28 > 256K 331.06 337.22 339.59 511.59 > 512K 335.58 328.50 331.56 504.56 > > guest -> host efficiency [Mbps / (%CPU_Host + %CPU_Guest)] > pkt_size before opt p 1 p 2+3 p 4+5 > > 32 0.43 0.43 0.53 0.56 > 64 0.85 0.86 1.04 1.10 > 128 1.63 1.71 2.07 2.13 > 256 3.48 3.35 4.02 4.22 > 512 6.80 6.67 7.97 8.63 > 1K 13.32 13.31 15.72 15.94 > 2K 25.79 25.92 30.84 30.98 > 4K 50.37 50.48 58.79 59.69 > 8K 95.90 96.15 107.04 110.33 > 16K 145.80 145.43 143.97 174.70 > 32K 147.06 144.74 146.02 282.48 > 64K 145.25 143.99 141.62 406.40 > 128K 149.34 146.96 147.49 489.34 > 256K 156.35 149.81 152.21 536.37 > 512K 151.65 150.74 151.52 519.93 > > [1] https://github.com/stefano-garzarella/iperf/ > > Stefano Garzarella (5): > vsock/virtio: limit the memory used per-socket > vsock/virtio: reduce credit update messages > vsock/virtio: fix locking in virtio_transport_inc_tx_pkt() > vhost/vsock: split packets to send using multiple buffers > vsock/virtio: change the maximum packet size allowed > > drivers/vhost/vsock.c | 68 ++++++++++++----- > include/linux/virtio_vsock.h | 4 +- > net/vmw_vsock/virtio_transport.c | 1 + > net/vmw_vsock/virtio_transport_common.c | 99 ++++++++++++++++++++----- > 4 files changed, 134 insertions(+), 38 deletions(-) > > -- > 2.20.1
On Mon, Jul 29, 2019 at 09:59:23AM -0400, Michael S. Tsirkin wrote: > On Wed, Jul 17, 2019 at 01:30:25PM +0200, Stefano Garzarella wrote: > > This series tries to increase the throughput of virtio-vsock with slight > > changes. > > While I was testing the v2 of this series I discovered an huge use of memory, > > so I added patch 1 to mitigate this issue. I put it in this series in order > > to better track the performance trends. > > Series: > > Acked-by: Michael S. Tsirkin <mst@redhat.com> > > Can this go into net-next? > I think so. Michael, Stefan thanks to ack the series! Should I resend it with net-next tag? Thanks, Stefano
On 2019/7/30 下午5:40, Stefano Garzarella wrote: > On Mon, Jul 29, 2019 at 09:59:23AM -0400, Michael S. Tsirkin wrote: >> On Wed, Jul 17, 2019 at 01:30:25PM +0200, Stefano Garzarella wrote: >>> This series tries to increase the throughput of virtio-vsock with slight >>> changes. >>> While I was testing the v2 of this series I discovered an huge use of memory, >>> so I added patch 1 to mitigate this issue. I put it in this series in order >>> to better track the performance trends. >> Series: >> >> Acked-by: Michael S. Tsirkin <mst@redhat.com> >> >> Can this go into net-next? >> > I think so. > Michael, Stefan thanks to ack the series! > > Should I resend it with net-next tag? > > Thanks, > Stefano I think so. Thanks
On Tue, Jul 30, 2019 at 06:03:24PM +0800, Jason Wang wrote: > > On 2019/7/30 下午5:40, Stefano Garzarella wrote: > > On Mon, Jul 29, 2019 at 09:59:23AM -0400, Michael S. Tsirkin wrote: > > > On Wed, Jul 17, 2019 at 01:30:25PM +0200, Stefano Garzarella wrote: > > > > This series tries to increase the throughput of virtio-vsock with slight > > > > changes. > > > > While I was testing the v2 of this series I discovered an huge use of memory, > > > > so I added patch 1 to mitigate this issue. I put it in this series in order > > > > to better track the performance trends. > > > Series: > > > > > > Acked-by: Michael S. Tsirkin <mst@redhat.com> > > > > > > Can this go into net-next? > > > > > I think so. > > Michael, Stefan thanks to ack the series! > > > > Should I resend it with net-next tag? > > > > Thanks, > > Stefano > > > I think so. Okay, I'll resend it! Thanks, Stefano
Patches 1,3 and 4 are needed for stable. Dave, could you queue them there please? On Wed, Jul 17, 2019 at 01:30:25PM +0200, Stefano Garzarella wrote: > This series tries to increase the throughput of virtio-vsock with slight > changes. > While I was testing the v2 of this series I discovered an huge use of memory, > so I added patch 1 to mitigate this issue. I put it in this series in order > to better track the performance trends. > > v4: > - rebased all patches on current master (conflicts is Patch 4) > - Patch 1: added Stefan's R-b > - Patch 3: removed lock when buf_alloc is written [David]; > moved this patch after "vsock/virtio: reduce credit update messages" > to make it clearer > - Patch 4: vhost_exceeds_weight() is recently introduced, so I've solved some > conflicts > > v3: https://patchwork.kernel.org/cover/10970145 > > v2: https://patchwork.kernel.org/cover/10938743 > > v1: https://patchwork.kernel.org/cover/10885431 > > Below are the benchmarks step by step. I used iperf3 [1] modified with VSOCK > support. As Micheal suggested in the v1, I booted host and guest with 'nosmap'. > > A brief description of patches: > - Patches 1: limit the memory usage with an extra copy for small packets > - Patches 2+3: reduce the number of credit update messages sent to the > transmitter > - Patches 4+5: allow the host to split packets on multiple buffers and use > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE as the max packet size allowed > > host -> guest [Gbps] > pkt_size before opt p 1 p 2+3 p 4+5 > > 32 0.032 0.030 0.048 0.051 > 64 0.061 0.059 0.108 0.117 > 128 0.122 0.112 0.227 0.234 > 256 0.244 0.241 0.418 0.415 > 512 0.459 0.466 0.847 0.865 > 1K 0.927 0.919 1.657 1.641 > 2K 1.884 1.813 3.262 3.269 > 4K 3.378 3.326 6.044 6.195 > 8K 5.637 5.676 10.141 11.287 > 16K 8.250 8.402 15.976 16.736 > 32K 13.327 13.204 19.013 20.515 > 64K 21.241 21.341 20.973 21.879 > 128K 21.851 22.354 21.816 23.203 > 256K 21.408 21.693 21.846 24.088 > 512K 21.600 21.899 21.921 24.106 > > guest -> host [Gbps] > pkt_size before opt p 1 p 2+3 p 4+5 > > 32 0.045 0.046 0.057 0.057 > 64 0.089 0.091 0.103 0.104 > 128 0.170 0.179 0.192 0.200 > 256 0.364 0.351 0.361 0.379 > 512 0.709 0.699 0.731 0.790 > 1K 1.399 1.407 1.395 1.427 > 2K 2.670 2.684 2.745 2.835 > 4K 5.171 5.199 5.305 5.451 > 8K 8.442 8.500 10.083 9.941 > 16K 12.305 12.259 13.519 15.385 > 32K 11.418 11.150 11.988 24.680 > 64K 10.778 10.659 11.589 35.273 > 128K 10.421 10.339 10.939 40.338 > 256K 10.300 9.719 10.508 36.562 > 512K 9.833 9.808 10.612 35.979 > > As Stefan suggested in the v1, I measured also the efficiency in this way: > efficiency = Mbps / (%CPU_Host + %CPU_Guest) > > The '%CPU_Guest' is taken inside the VM. I know that it is not the best way, > but it's provided for free from iperf3 and could be an indication. > > host -> guest efficiency [Mbps / (%CPU_Host + %CPU_Guest)] > pkt_size before opt p 1 p 2+3 p 4+5 > > 32 0.35 0.45 0.79 1.02 > 64 0.56 0.80 1.41 1.54 > 128 1.11 1.52 3.03 3.12 > 256 2.20 2.16 5.44 5.58 > 512 4.17 4.18 10.96 11.46 > 1K 8.30 8.26 20.99 20.89 > 2K 16.82 16.31 39.76 39.73 > 4K 30.89 30.79 74.07 75.73 > 8K 53.74 54.49 124.24 148.91 > 16K 80.68 83.63 200.21 232.79 > 32K 132.27 132.52 260.81 357.07 > 64K 229.82 230.40 300.19 444.18 > 128K 332.60 329.78 331.51 492.28 > 256K 331.06 337.22 339.59 511.59 > 512K 335.58 328.50 331.56 504.56 > > guest -> host efficiency [Mbps / (%CPU_Host + %CPU_Guest)] > pkt_size before opt p 1 p 2+3 p 4+5 > > 32 0.43 0.43 0.53 0.56 > 64 0.85 0.86 1.04 1.10 > 128 1.63 1.71 2.07 2.13 > 256 3.48 3.35 4.02 4.22 > 512 6.80 6.67 7.97 8.63 > 1K 13.32 13.31 15.72 15.94 > 2K 25.79 25.92 30.84 30.98 > 4K 50.37 50.48 58.79 59.69 > 8K 95.90 96.15 107.04 110.33 > 16K 145.80 145.43 143.97 174.70 > 32K 147.06 144.74 146.02 282.48 > 64K 145.25 143.99 141.62 406.40 > 128K 149.34 146.96 147.49 489.34 > 256K 156.35 149.81 152.21 536.37 > 512K 151.65 150.74 151.52 519.93 > > [1] https://github.com/stefano-garzarella/iperf/ > > Stefano Garzarella (5): > vsock/virtio: limit the memory used per-socket > vsock/virtio: reduce credit update messages > vsock/virtio: fix locking in virtio_transport_inc_tx_pkt() > vhost/vsock: split packets to send using multiple buffers > vsock/virtio: change the maximum packet size allowed > > drivers/vhost/vsock.c | 68 ++++++++++++----- > include/linux/virtio_vsock.h | 4 +- > net/vmw_vsock/virtio_transport.c | 1 + > net/vmw_vsock/virtio_transport_common.c | 99 ++++++++++++++++++++----- > 4 files changed, 134 insertions(+), 38 deletions(-) > > -- > 2.20.1