Message ID | 20231122013629.28554-5-kuniyu@amazon.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | af_unix: Random improvements for GC. |
Hi Kuniyuki,
kernel test robot noticed the following build errors:
[auto build test ERROR on net-next/main]
url: https://github.com/intel-lab-lkp/linux/commits/Kuniyuki-Iwashima/af_unix-Do-not-use-atomic-ops-for-unix_sk-sk-inflight/20231122-094214
base: net-next/main
patch link: https://lore.kernel.org/r/20231122013629.28554-5-kuniyu%40amazon.com
patch subject: [PATCH v1 net-next 4/4] af_unix: Try to run GC async.
config: sh-defconfig (https://download.01.org/0day-ci/archive/20231122/202311222204.jSne0FwB-lkp@intel.com/config)
compiler: sh4-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231122/202311222204.jSne0FwB-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202311222204.jSne0FwB-lkp@intel.com/
All errors (new ones prefixed by >>):
sh4-linux-ld: net/core/scm.o: in function `__scm_send':
>> scm.c:(.text+0x6e8): undefined reference to `unix_get_socket'
Hi Kuniyuki,
kernel test robot noticed the following build errors:
[auto build test ERROR on net-next/main]
url: https://github.com/intel-lab-lkp/linux/commits/Kuniyuki-Iwashima/af_unix-Do-not-use-atomic-ops-for-unix_sk-sk-inflight/20231122-094214
base: net-next/main
patch link: https://lore.kernel.org/r/20231122013629.28554-5-kuniyu%40amazon.com
patch subject: [PATCH v1 net-next 4/4] af_unix: Try to run GC async.
config: x86_64-buildonly-randconfig-006-20231122 (https://download.01.org/0day-ci/archive/20231123/202311230220.WfMchMxF-lkp@intel.com/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231123/202311230220.WfMchMxF-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202311230220.WfMchMxF-lkp@intel.com/
All errors (new ones prefixed by >>):
>> ld.lld: error: undefined symbol: unix_get_socket
>>> referenced by usercopy_64.c
>>> vmlinux.o:(__scm_send)
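Both reports show the same symptom: __scm_send() in net/core/scm.c now references unix_get_socket(), and in these configurations nothing linked into vmlinux defines it. One common kernel idiom for this kind of failure is to provide a static inline stub in the header when the feature is compiled out. The following is only a user-space sketch of that idiom, not necessarily the fix that was applied in a later version; MODEL_HAVE_AF_UNIX and every model_* name are hypothetical stand-ins.

```c
#include <stddef.h>

/* Hypothetical stand-in for a Kconfig symbol.
 * 0 models a build where AF_UNIX support is compiled out.
 */
#ifndef MODEL_HAVE_AF_UNIX
#define MODEL_HAVE_AF_UNIX 0
#endif

/* Hypothetical stand-ins for struct file and struct unix_sock. */
struct model_file { int is_unix_socket; };
struct model_unix_sock { int unused; };

#if MODEL_HAVE_AF_UNIX
/* The real definition would live in another object file. */
struct model_unix_sock *model_unix_get_socket(struct model_file *filp);
#else
/* Stub: with AF_UNIX compiled out, no file can be an AF_UNIX socket,
 * so callers still compile and link without the other object file.
 */
static inline struct model_unix_sock *
model_unix_get_socket(struct model_file *filp)
{
	(void)filp;
	return NULL;
}
#endif

/* Caller modelled on the lines the patch adds to scm_fp_copy(). */
static short model_count_unix(struct model_file **files, short n)
{
	short count_unix = 0;

	for (short i = 0; i < n; i++)
		if (model_unix_get_socket(files[i]))
			count_unix++;

	return count_unix;
}
```

With the stub in effect the caller needs no #ifdef of its own, and count_unix simply stays at zero.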
diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index c628d30ceb19..f8e654d418e6 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -13,7 +13,7 @@ void unix_notinflight(struct user_struct *user, struct file *fp);
 void unix_destruct_scm(struct sk_buff *skb);
 void io_uring_destruct_scm(struct sk_buff *skb);
 void unix_gc(void);
-void wait_for_unix_gc(void);
+void wait_for_unix_gc(struct scm_fp_list *fpl);
 struct unix_sock *unix_get_socket(struct file *filp);
 struct sock *unix_peer_get(struct sock *sk);
 
diff --git a/include/net/scm.h b/include/net/scm.h
index e8c76b4be2fe..1ff6a2855064 100644
--- a/include/net/scm.h
+++ b/include/net/scm.h
@@ -24,6 +24,7 @@ struct scm_creds {
 
 struct scm_fp_list {
 	short			count;
+	short			count_unix;
 	short			max;
 	struct user_struct	*user;
 	struct file		*fp[SCM_MAX_FD];
diff --git a/net/core/scm.c b/net/core/scm.c
index 880027ecf516..13d0e7e88be5 100644
--- a/net/core/scm.c
+++ b/net/core/scm.c
@@ -35,6 +35,7 @@
 #include <net/compat.h>
 #include <net/scm.h>
 #include <net/cls_cgroup.h>
+#include <net/af_unix.h>
 
 
 /*
@@ -105,6 +106,8 @@ static int scm_fp_copy(struct cmsghdr *cmsg, struct scm_fp_list **fplp)
 			return -EBADF;
 		*fpp++ = file;
 		fpl->count++;
+		if (unix_get_socket(file))
+			fpl->count_unix++;
 	}
 
 	if (!fpl->user)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 0ba7fb09c1bd..d1a54e65f2cf 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1925,11 +1925,12 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
 	long timeo;
 	int err;
 
-	wait_for_unix_gc();
 	err = scm_send(sock, msg, &scm, false);
 	if (err < 0)
 		return err;
 
+	wait_for_unix_gc(scm.fp);
+
 	err = -EOPNOTSUPP;
 	if (msg->msg_flags&MSG_OOB)
 		goto out;
@@ -2201,11 +2202,12 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 	bool fds_sent = false;
 	int data_len;
 
-	wait_for_unix_gc();
 	err = scm_send(sock, msg, &scm, false);
 	if (err < 0)
 		return err;
 
+	wait_for_unix_gc(scm.fp);
+
 	err = -EOPNOTSUPP;
 	if (msg->msg_flags & MSG_OOB) {
 #if IS_ENABLED(CONFIG_AF_UNIX_OOB)
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 8bc93a7e745f..73091d6b7fc4 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -184,8 +184,9 @@ static void inc_inflight_move_tail(struct unix_sock *u)
 }
 
 #define UNIX_INFLIGHT_TRIGGER_GC 16000
+#define UNIX_INFLIGHT_SANE_USER 32
 
-void wait_for_unix_gc(void)
+void wait_for_unix_gc(struct scm_fp_list *fpl)
 {
 	/* If number of inflight sockets is insane, kick a
 	 * garbage collect right now.
@@ -195,7 +196,12 @@ void wait_for_unix_gc(void)
 	if (READ_ONCE(unix_tot_inflight) > UNIX_INFLIGHT_TRIGGER_GC)
 		queue_work(system_unbound_wq, &unix_gc_work);
 
-	flush_work(&unix_gc_work);
+	/* Penalise users who want to send AF_UNIX sockets
+	 * but whose sockets have not been received yet.
+	 */
+	if (fpl && fpl->count_unix &&
+	    READ_ONCE(fpl->user->unix_inflight) > UNIX_INFLIGHT_SANE_USER)
+		flush_work(&unix_gc_work);
 }
 
 static void __unix_gc(struct work_struct *work)
If more than 16000 inflight AF_UNIX sockets exist and the garbage
collector is not running, unix_(dgram|stream)_sendmsg() call unix_gc().
Also, they wait for unix_gc() to complete.

In unix_gc(), all inflight AF_UNIX sockets are traversed at least once,
and more than once if they are GC candidates.  Thus, sendmsg() slows
down significantly when there are too many inflight AF_UNIX sockets.

However, if a process sends data with no AF_UNIX FD, the sendmsg() call
does not need to wait for GC.  After this change, only a process that
meets both conditions below will be blocked in such a situation.

  1) cmsg contains an AF_UNIX socket
  2) more than 32 AF_UNIX sockets sent by the same user are still inflight

Note that even a sendmsg() call that meets only condition 1) will be
blocked later in unix_scm_to_skb() by the spinlock, but we allow that
as a bonus for sane users.

The results below show the time spent in unix_dgram_sendmsg() when
sending 1 byte of data with no FD 4096 times on a host where 32K
inflight AF_UNIX sockets exist.

Without series: the sane sendmsg() waits for GC unnecessarily.

  $ sudo /usr/share/bcc/tools/funclatency -p 11165 unix_dgram_sendmsg
  Tracing 1 functions for "unix_dgram_sendmsg"... Hit Ctrl-C to end.
  ^C
       nsecs               : count     distribution
  [...]
      524288 -> 1048575    : 0        |                                        |
     1048576 -> 2097151    : 3881     |****************************************|
     2097152 -> 4194303    : 214      |**                                      |
     4194304 -> 8388607    : 1        |                                        |

  avg = 1825567 nsecs, total: 7477526027 nsecs, count: 4096

With series: the sane sendmsg() can finish much faster.

  $ sudo /usr/share/bcc/tools/funclatency -p 8702 unix_dgram_sendmsg
  Tracing 1 functions for "unix_dgram_sendmsg"... Hit Ctrl-C to end.
  ^C
       nsecs               : count     distribution
  [...]
         128 -> 255        : 0        |                                        |
         256 -> 511        : 4092     |****************************************|
         512 -> 1023       : 2        |                                        |
        1024 -> 2047       : 0        |                                        |
        2048 -> 4095       : 0        |                                        |
        4096 -> 8191       : 1        |                                        |
        8192 -> 16383      : 1        |                                        |

  avg = 410 nsecs, total: 1680510 nsecs, count: 4096

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 include/net/af_unix.h |  2 +-
 include/net/scm.h     |  1 +
 net/core/scm.c        |  3 +++
 net/unix/af_unix.c    |  6 ++++--
 net/unix/garbage.c    | 10 ++++++++--
 5 files changed, 17 insertions(+), 5 deletions(-)
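
For readers who want to check the two conditions in isolation, here is a minimal user-space sketch of the decision that the patched wait_for_unix_gc() makes. It models only the lines visible in the garbage.c hunk; the model_* types and functions are hypothetical stand-ins for struct user_struct and struct scm_fp_list, and READ_ONCE() and the workqueue calls are deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Thresholds copied from the patch. */
#define UNIX_INFLIGHT_TRIGGER_GC 16000
#define UNIX_INFLIGHT_SANE_USER 32

/* Hypothetical stand-in for struct user_struct. */
struct model_user {
	unsigned long unix_inflight;	/* AF_UNIX sockets this user has inflight */
};

/* Hypothetical stand-in for struct scm_fp_list. */
struct model_fp_list {
	short count;			/* all FDs in the cmsg */
	short count_unix;		/* FDs that are AF_UNIX sockets */
	struct model_user *user;
};

/* Models the queue_work() condition: GC is kicked when the global
 * number of inflight sockets exceeds the trigger threshold.
 */
static bool model_should_queue_gc(unsigned long tot_inflight)
{
	return tot_inflight > UNIX_INFLIGHT_TRIGGER_GC;
}

/* Models the flush_work() condition: the sender waits only if it
 * passes at least one AF_UNIX socket and its user already has more
 * than UNIX_INFLIGHT_SANE_USER sockets inflight.
 */
static bool model_should_wait_for_gc(const struct model_fp_list *fpl)
{
	return fpl && fpl->count_unix &&
	       fpl->user->unix_inflight > UNIX_INFLIGHT_SANE_USER;
}
```

In this model a sender with no FDs at all (fpl == NULL) never waits, which corresponds to the "sane sendmsg()" case measured above.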