
[net-next,v2,0/3] net: A lightweight zero-copy notification

Message ID 20240419214819.671536-1-zijianzhang@bytedance.com (mailing list archive)

Message

Zijian Zhang April 19, 2024, 9:48 p.m. UTC
From: Zijian Zhang <zijianzhang@bytedance.com>

Original title is "net: socket sendmsg MSG_ZEROCOPY_UARG"
https://lore.kernel.org/all/20240409205300.1346681-2-zijianzhang@bytedance.com/

The original notification mechanism requires poll + recvmmsg, which is
not easy for applications to accommodate. It also incurs non-negligible
overhead, including extra system calls and usage of optmem.

While maximizing reuse of the existing MSG_ZEROCOPY code, this patch set
introduces a new zerocopy socket notification mechanism. Users of sendmsg
pass a control message as a placeholder for the incoming notifications.
Upon return, the kernel embeds the notifications directly into the user
arguments passed in. This significantly reduces the complexity and
overhead of managing notifications. In the ideal pattern, the user keeps
calling sendmsg with SO_ZC_NOTIFICATION msg_control, and notifications
are delivered as soon as possible.
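
From the caller's side, attaching such a placeholder might look roughly
like the sketch below. Everything specific in it is an assumption made for
illustration: the numeric value of SO_ZC_NOTIFICATION is a stand-in for
whatever the patched uapi headers define, the payload is assumed to be
just the 64-bit address of a user buffer (the changelog only says the user
address is handled as a u64), and attach_zc_placeholder() is a made-up
helper name. The real payload layout is defined by patch 2/3.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* ASSUMPTION: placeholder value so the sketch compiles on unpatched
 * headers. Use the definition from the patched uapi headers instead. */
#ifndef SO_ZC_NOTIFICATION
#define SO_ZC_NOTIFICATION 78
#endif

/* Attach a SOL_SOCKET/SO_ZC_NOTIFICATION control message to msg.
 * ASSUMPTION: the payload is only the 64-bit address of notify_buf,
 * the user memory the kernel would fill with notifications.
 * Returns 0 on success, -1 if the control buffer is too small. */
static int attach_zc_placeholder(struct msghdr *msg, void *control,
				 size_t control_len, void *notify_buf)
{
	uint64_t addr = (uint64_t)(uintptr_t)notify_buf;
	struct cmsghdr *cm;

	if (control_len < CMSG_SPACE(sizeof(addr)))
		return -1;

	memset(control, 0, control_len);
	msg->msg_control = control;
	msg->msg_controllen = CMSG_SPACE(sizeof(addr));

	cm = CMSG_FIRSTHDR(msg);
	cm->cmsg_level = SOL_SOCKET;
	cm->cmsg_type = SO_ZC_NOTIFICATION;
	cm->cmsg_len = CMSG_LEN(sizeof(addr));
	memcpy(CMSG_DATA(cm), &addr, sizeof(addr));
	return 0;
}
```

The caller would then pass msg to sendmsg() with MSG_ZEROCOPY and inspect
notify_buf after the call returns, with no separate error-queue read.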

Changelog:
  v1 -> v2:
    - Reuse msg_errqueue in the new notification mechanism, as suggested
      by Willem de Bruijn. Users can combine the two mechanisms in a
      hybrid way if they want to.
    - Update case SO_ZC_NOTIFICATION in __sock_cmsg_send
      1. Regardless of whether the program is 32-bit or 64-bit, we always
      handle the user address as a u64.
      2. The size of the data passed to copy_to_user is calculated
      precisely to avoid a kernel stack leak.
    - fix (kbuild-bot)
      1. Add SO_ZC_NOTIFICATION to arch-specific header files.
      2. Include header file types.h in include/uapi/linux/socket.h

* Performance

We extend selftests/msg_zerocopy.c to accommodate the new mechanism. The
test results are as follows.

With cfg_notification_limit = 1, the original method approximately matches
the semantics of the new one. In this case, the new flag yields around
13% CPU savings in TCP and 18% CPU savings in UDP.

+---------------------+---------+---------+---------+---------+
| Test Type / Protocol| TCP v4  | TCP v6  | UDP v4  | UDP v6  |
+---------------------+---------+---------+---------+---------+
| ZCopy (MB)          | 5147    | 4885    | 7489    | 7854    |
+---------------------+---------+---------+---------+---------+
| New ZCopy (MB)      | 5859    | 5505    | 9053    | 9236    |
+---------------------+---------+---------+---------+---------+
| New ZCopy / ZCopy   | 113.83% | 112.69% | 120.88% | 117.59% |
+---------------------+---------+---------+---------+---------+


With cfg_notification_limit = 32, the new mechanism performs 8% better in
TCP. For UDP, no obvious performance gain is observed, and the new
mechanism may sometimes lead to degradation. Thus, if users don't need to
retrieve notifications ASAP in UDP, the original mechanism is preferred.

+---------------------+---------+---------+---------+---------+
| Test Type / Protocol| TCP v4  | TCP v6  | UDP v4  | UDP v6  |
+---------------------+---------+---------+---------+---------+
| ZCopy (MB)          | 6272    | 6138    | 12138   | 10055   |
+---------------------+---------+---------+---------+---------+
| New ZCopy (MB)      | 6774    | 6620    | 11504   | 10355   |
+---------------------+---------+---------+---------+---------+
| New ZCopy / ZCopy   | 108.00% | 107.85% | 94.78%  | 102.98% |
+---------------------+---------+---------+---------+---------+
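
The "New ZCopy / ZCopy" rows in the tables above are the new throughput
divided by the original throughput, expressed as a percentage. A small
helper to recompute them (the function name is made up for illustration):

```c
/* Throughput of the new mechanism relative to the original, in percent.
 * Both arguments are the MB figures from the same table column. */
static double zc_ratio_percent(double new_mb, double old_mb)
{
	return 100.0 * new_mb / old_mb;
}
```

For example, zc_ratio_percent(5859, 5147) reproduces the TCP v4 entry for
cfg_notification_limit = 1, and zc_ratio_percent(11504, 12138) reproduces
the UDP v4 entry for cfg_notification_limit = 32, where the result below
100 indicates the regression mentioned above.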

Zijian Zhang (3):
  selftests: fix OOM problem in msg_zerocopy selftest
  sock: add MSG_ZEROCOPY notification mechanism based on msg_control
  selftests: add MSG_ZEROCOPY msg_control notification test

 arch/alpha/include/uapi/asm/socket.h        |   2 +
 arch/mips/include/uapi/asm/socket.h         |   2 +
 arch/parisc/include/uapi/asm/socket.h       |   2 +
 arch/sparc/include/uapi/asm/socket.h        |   2 +
 include/uapi/asm-generic/socket.h           |   2 +
 include/uapi/linux/socket.h                 |  16 +++
 net/core/sock.c                             |  70 +++++++++++++
 tools/testing/selftests/net/msg_zerocopy.c  | 105 ++++++++++++++++++--
 tools/testing/selftests/net/msg_zerocopy.sh |   1 +
 9 files changed, 195 insertions(+), 7 deletions(-)

Comments

Jakub Kicinski April 20, 2024, 3:47 a.m. UTC | #1
On Fri, 19 Apr 2024 21:48:16 +0000 zijianzhang@bytedance.com wrote:
> Original title is "net: socket sendmsg MSG_ZEROCOPY_UARG"
> https://lore.kernel.org/all/
> 20240409205300.1346681-2-zijianzhang@bytedance.com/

AFAICT sparse reports this new warning:

net/core/sock.c:2864:26: warning: incorrect type in assignment (different address spaces)
net/core/sock.c:2864:26:    expected void [noderef] __user *usr_addr
net/core/sock.c:2864:26:    got void *

Zijian Zhang April 23, 2024, 7:29 p.m. UTC | #2
Thanks, will update in the next version.

On 4/19/24 8:47 PM, Jakub Kicinski wrote:
> On Fri, 19 Apr 2024 21:48:16 +0000 zijianzhang@bytedance.com wrote:
>> Original title is "net: socket sendmsg MSG_ZEROCOPY_UARG"
>> https://lore.kernel.org/all/
>> 20240409205300.1346681-2-zijianzhang@bytedance.com/
> 
> AFAICT sparse reports this new warning:
> 
> net/core/sock.c:2864:26: warning: incorrect type in assignment (different address spaces)
> net/core/sock.c:2864:26:    expected void [noderef] __user *usr_addr
> net/core/sock.c:2864:26:    got void *
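
For readers unfamiliar with this class of sparse warning: it fires when a
plain kernel pointer is assigned to a pointer annotated __user. One common
way to get from a u64 to a __user pointer is a cast through an integer
type, which is what the kernel's u64_to_user_ptr() helper does. The sketch
below is a standalone illustration of that pattern, not the fix that was
actually applied in the next version. The __user fallback definition and
the example_u64_to_user_ptr() name exist only so it compiles outside the
kernel tree.

```c
#include <stdint.h>

/* Outside the kernel tree __user is not defined. Under sparse
 * (__CHECKER__) it marks a separate, non-dereferenceable address space;
 * otherwise it expands to nothing. */
#ifndef __user
#ifdef __CHECKER__
#define __user __attribute__((noderef, address_space(__user)))
#else
#define __user
#endif
#endif

/* Illustrative stand-in for the kernel's u64_to_user_ptr(): converting
 * via an integer avoids a pointer-to-pointer address-space mismatch. */
static inline void __user *example_u64_to_user_ptr(uint64_t addr)
{
	return (void __user *)(uintptr_t)addr;
}
```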