From patchwork Sun Nov 14 01:24:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Iwashima, Kuniyuki" X-Patchwork-Id: 12617751 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE208C433EF for ; Sun, 14 Nov 2021 01:25:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8B3AC60EC0 for ; Sun, 14 Nov 2021 01:25:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230388AbhKNB1y (ORCPT ); Sat, 13 Nov 2021 20:27:54 -0500 Received: from smtp-fw-9103.amazon.com ([207.171.188.200]:58813 "EHLO smtp-fw-9103.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229988AbhKNB1x (ORCPT ); Sat, 13 Nov 2021 20:27:53 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.co.jp; i=@amazon.co.jp; q=dns/txt; s=amazon201209; t=1636853100; x=1668389100; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0F7mhTkggN4Aci28q7G1AHhOrM1/oe9uGG6Si3UbprI=; b=YVcPHB6OlE47WL9LzkxUj+rSBQokRpsvz5HgOXEZR/yDSdESacyqxNBN PnsoY0YYxmNEIRzEbFFJ0XNfIoUWlZqOLmomByeU1uPEshN43apiIKCxV pNdDwsWxr3tX0ZojpmUFKKMlN/uhBQAnAJIMGKqDixtGnnouqFAs6IyTk M=; X-IronPort-AV: E=Sophos;i="5.87,233,1631577600"; d="scan'208";a="971465804" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO email-inbound-relay-pdx-2c-7d0c7241.us-west-2.amazon.com) ([10.25.36.210]) by smtp-border-fw-9103.sea19.amazon.com with ESMTP; 14 Nov 2021 01:25:00 +0000 Received: from EX13MTAUWB001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan2.pdx.amazon.com [10.236.137.194]) by email-inbound-relay-pdx-2c-7d0c7241.us-west-2.amazon.com (Postfix) with ESMTPS id 1620E41295; Sun, 14 Nov 2021 01:25:00 +0000 (UTC) Received: from EX13D04ANC001.ant.amazon.com (10.43.157.89) by EX13MTAUWB001.ant.amazon.com (10.43.161.249) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:24:59 +0000 Received: from 88665a182662.ant.amazon.com (10.43.160.241) by EX13D04ANC001.ant.amazon.com (10.43.157.89) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:24:56 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Jakub Kicinski CC: Eric Dumazet , Kuniyuki Iwashima , Kuniyuki Iwashima , "Benjamin Herrenschmidt" , Subject: [PATCH v2 net-next 01/13] af_unix: Use offsetof() instead of sizeof(). Date: Sun, 14 Nov 2021 10:24:16 +0900 Message-ID: <20211114012428.81743-2-kuniyu@amazon.co.jp> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211114012428.81743-1-kuniyu@amazon.co.jp> References: <20211114012428.81743-1-kuniyu@amazon.co.jp> MIME-Version: 1.0 X-Originating-IP: [10.43.160.241] X-ClientProxiedBy: EX13D40UWC003.ant.amazon.com (10.43.162.246) To EX13D04ANC001.ant.amazon.com (10.43.157.89) Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org The length of the AF_UNIX socket address contains an offset to the member sun_path of struct sockaddr_un. Currently, the preceding member is just sun_family, and its type is sa_family_t and resolved to short. Therefore, the offset is represented by sizeof(short). However, it is not clear and fragile to changes in struct sockaddr_storage or sockaddr_un. This commit makes it clear and robust by rewriting sizeof() with offsetof(). Signed-off-by: Kuniyuki Iwashima --- net/unix/af_unix.c | 15 ++++++++------- net/unix/diag.c | 3 ++- 2 files changed, 10 insertions(+), 8 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 78e08e82c08c..b0ef27062489 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -231,7 +231,7 @@ static int unix_mkname(struct sockaddr_un *sunaddr, int len, unsigned int *hashp { *hashp = 0; - if (len <= sizeof(short) || len > sizeof(*sunaddr)) + if (len <= offsetof(struct sockaddr_un, sun_path) || len > sizeof(*sunaddr)) return -EINVAL; if (!sunaddr || sunaddr->sun_family != AF_UNIX) return -EINVAL; @@ -244,7 +244,7 @@ static int unix_mkname(struct sockaddr_un *sunaddr, int len, unsigned int *hashp * kernel address buffer. */ ((char *)sunaddr)[len] = 0; - len = strlen(sunaddr->sun_path)+1+sizeof(short); + len = strlen(sunaddr->sun_path) + offsetof(struct sockaddr_un, sun_path) + 1; return len; } @@ -970,7 +970,7 @@ static int unix_autobind(struct socket *sock) goto out; err = -ENOMEM; - addr = kzalloc(sizeof(*addr) + sizeof(short) + 16, GFP_KERNEL); + addr = kzalloc(sizeof(*addr) + offsetof(struct sockaddr_un, sun_path) + 16, GFP_KERNEL); if (!addr) goto out; @@ -978,7 +978,8 @@ static int unix_autobind(struct socket *sock) refcount_set(&addr->refcnt, 1); retry: - addr->len = sprintf(addr->name->sun_path+1, "%05x", ordernum) + 1 + sizeof(short); + addr->len = sprintf(addr->name->sun_path + 1, "%05x", ordernum) + + offsetof(struct sockaddr_un, sun_path) + 1; addr->hash = unix_hash_fold(csum_partial(addr->name, addr->len, 0)); addr->hash ^= sk->sk_type; @@ -1160,7 +1161,7 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) sunaddr->sun_family != AF_UNIX) return -EINVAL; - if (addr_len == sizeof(short)) + if (addr_len == offsetof(struct sockaddr_un, sun_path)) return unix_autobind(sock); err = unix_mkname(sunaddr, addr_len, &hash); @@ -1604,7 +1605,7 @@ static int unix_getname(struct socket *sock, struct sockaddr *uaddr, int peer) if (!addr) { sunaddr->sun_family = AF_UNIX; sunaddr->sun_path[0] = 0; - err = sizeof(short); + err = offsetof(struct sockaddr_un, sun_path); } else { err = addr->len; memcpy(sunaddr, addr->name, addr->len); @@ -3235,7 +3236,7 @@ static int unix_seq_show(struct seq_file *seq, void *v) seq_putc(seq, ' '); i = 0; - len = u->addr->len - sizeof(short); + len = u->addr->len - offsetof(struct sockaddr_un, sun_path); if (!UNIX_ABSTRACT(s)) len--; else { diff --git a/net/unix/diag.c b/net/unix/diag.c index 7e7d7f45685a..db555f267407 100644 --- a/net/unix/diag.c +++ b/net/unix/diag.c @@ -19,7 +19,8 @@ static int sk_diag_dump_name(struct sock *sk, struct sk_buff *nlskb) if (!addr) return 0; - return nla_put(nlskb, UNIX_DIAG_NAME, addr->len - sizeof(short), + return nla_put(nlskb, UNIX_DIAG_NAME, + addr->len - offsetof(struct sockaddr_un, sun_path), addr->name->sun_path); } From patchwork Sun Nov 14 01:24:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Iwashima, Kuniyuki" X-Patchwork-Id: 12617753 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8AD41C433EF for ; Sun, 14 Nov 2021 01:25:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5952E60EC0 for ; Sun, 14 Nov 2021 01:25:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235571AbhKNB2H (ORCPT ); Sat, 13 Nov 2021 20:28:07 -0500 Received: from smtp-fw-80006.amazon.com ([99.78.197.217]:26084 "EHLO smtp-fw-80006.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229988AbhKNB2H (ORCPT ); Sat, 13 Nov 2021 20:28:07 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.co.jp; i=@amazon.co.jp; q=dns/txt; s=amazon201209; t=1636853113; x=1668389113; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8pgTkI/dcpfRL3ICipt+9mUS2f8qrVgObFiPJ4phffk=; b=k7CtoI1qNWqMOonUDyb4dYccEbMz9F7go4XBK+kk2PQSjXW8NszxMuPD /iqCLqMuom2CKqBsW/zcdMJO1Tgdsk/3E3tKQG3LSPlC1OWRzKR18ePqO H+T+gbOUJk3N6VtJFCJbDcl4CjF7KWIv46DbfnYDnojKyo54pXa9Mdhkc c=; X-IronPort-AV: E=Sophos;i="5.87,233,1631577600"; d="scan'208";a="41264180" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO email-inbound-relay-pdx-2a-39fdda15.us-west-2.amazon.com) ([10.25.36.214]) by smtp-border-fw-80006.pdx80.corp.amazon.com with ESMTP; 14 Nov 2021 01:25:12 +0000 Received: from EX13MTAUWB001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan3.pdx.amazon.com [10.236.137.198]) by email-inbound-relay-pdx-2a-39fdda15.us-west-2.amazon.com (Postfix) with ESMTPS id 23ADF41B08; Sun, 14 Nov 2021 01:25:14 +0000 (UTC) Received: from EX13D04ANC001.ant.amazon.com (10.43.157.89) by EX13MTAUWB001.ant.amazon.com (10.43.161.249) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:25:13 +0000 Received: from 88665a182662.ant.amazon.com (10.43.160.241) by EX13D04ANC001.ant.amazon.com (10.43.157.89) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:25:10 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Jakub Kicinski CC: Eric Dumazet , Kuniyuki Iwashima , Kuniyuki Iwashima , "Benjamin Herrenschmidt" , Subject: [PATCH v2 net-next 02/13] af_unix: Pass struct sock to unix_autobind(). Date: Sun, 14 Nov 2021 10:24:17 +0900 Message-ID: <20211114012428.81743-3-kuniyu@amazon.co.jp> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211114012428.81743-1-kuniyu@amazon.co.jp> References: <20211114012428.81743-1-kuniyu@amazon.co.jp> MIME-Version: 1.0 X-Originating-IP: [10.43.160.241] X-ClientProxiedBy: EX13D40UWC003.ant.amazon.com (10.43.162.246) To EX13D04ANC001.ant.amazon.com (10.43.157.89) Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org We do not use struct socket in unix_autobind() and pass struct sock to unix_bind_bsd() and unix_bind_abstract(). Let's pass it to unix_autobind() as well. Also, this patch fixes these errors by checkpatch.pl. ERROR: do not use assignment in if condition #1795: FILE: net/unix/af_unix.c:1795: + if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr CHECK: Logical continuations should be on the previous line #1796: FILE: net/unix/af_unix.c:1796: + if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr + && (err = unix_autobind(sock)) != 0) Signed-off-by: Kuniyuki Iwashima --- net/unix/af_unix.c | 36 ++++++++++++++++++++---------------- 1 file changed, 20 insertions(+), 16 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index b0ef27062489..e6056275f488 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -952,15 +952,13 @@ static int unix_release(struct socket *sock) return 0; } -static int unix_autobind(struct socket *sock) +static int unix_autobind(struct sock *sk) { - struct sock *sk = sock->sk; - struct net *net = sock_net(sk); struct unix_sock *u = unix_sk(sk); - static u32 ordernum = 1; struct unix_address *addr; - int err; unsigned int retries = 0; + static u32 ordernum = 1; + int err; err = mutex_lock_interruptible(&u->bindlock); if (err) @@ -986,7 +984,7 @@ static int unix_autobind(struct socket *sock) spin_lock(&unix_table_lock); ordernum = (ordernum+1)&0xFFFFF; - if (__unix_find_socket_byname(net, addr->name, addr->len, addr->hash)) { + if (__unix_find_socket_byname(sock_net(sk), addr->name, addr->len, addr->hash)) { spin_unlock(&unix_table_lock); /* * __unix_find_socket_byname() may take long time if many names @@ -1162,7 +1160,7 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) return -EINVAL; if (addr_len == offsetof(struct sockaddr_un, sun_path)) - return unix_autobind(sock); + return unix_autobind(sk); err = unix_mkname(sunaddr, addr_len, &hash); if (err < 0) @@ -1231,9 +1229,11 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr, goto out; alen = err; - if (test_bit(SOCK_PASSCRED, &sock->flags) && - !unix_sk(sk)->addr && (err = unix_autobind(sock)) != 0) - goto out; + if (test_bit(SOCK_PASSCRED, &sock->flags) && !unix_sk(sk)->addr) { + err = unix_autobind(sk); + if (err) + goto out; + } restart: other = unix_find_other(net, sunaddr, alen, sock->type, hash, &err); @@ -1338,9 +1338,11 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, goto out; addr_len = err; - if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr && - (err = unix_autobind(sock)) != 0) - goto out; + if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr) { + err = unix_autobind(sk); + if (err) + goto out; + } timeo = sock_sndtimeo(sk, flags & O_NONBLOCK); @@ -1792,9 +1794,11 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, goto out; } - if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr - && (err = unix_autobind(sock)) != 0) - goto out; + if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr) { + err = unix_autobind(sk); + if (err) + goto out; + } err = -EMSGSIZE; if (len > sk->sk_sndbuf - 32) From patchwork Sun Nov 14 01:24:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Iwashima, Kuniyuki" X-Patchwork-Id: 12617755 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A446C433F5 for ; Sun, 14 Nov 2021 01:25:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1A39460FBF for ; Sun, 14 Nov 2021 01:25:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231909AbhKNB2X (ORCPT ); Sat, 13 Nov 2021 20:28:23 -0500 Received: from smtp-fw-2101.amazon.com ([72.21.196.25]:7303 "EHLO smtp-fw-2101.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229988AbhKNB2X (ORCPT ); Sat, 13 Nov 2021 20:28:23 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.co.jp; i=@amazon.co.jp; q=dns/txt; s=amazon201209; t=1636853130; x=1668389130; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=tdXngxWLde0iRAACEZZ4tOfUIOewVsfuzVHgefZQEAU=; b=oVlQhwQFmxE85BpOegwQgeXWsVnRU5go08IPCITw2CN+cyyfmuhZC9+O ymUFqjTgJENTyX2vhJ1zdQ+ZIDqnmMS/Cf3RZjKFN48OfOVUSiw9m1jjH FJR3xW5myulQKIHuCAokRCeq0mc9N8X7CHjm4iHdRzMT4eC10vWxIHiuN U=; X-IronPort-AV: E=Sophos;i="5.87,233,1631577600"; d="scan'208";a="151765528" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO email-inbound-relay-pdx-2c-7d0c7241.us-west-2.amazon.com) ([10.43.8.2]) by smtp-border-fw-2101.iad2.amazon.com with ESMTP; 14 Nov 2021 01:25:29 +0000 Received: from EX13MTAUWB001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan2.pdx.amazon.com [10.236.137.194]) by email-inbound-relay-pdx-2c-7d0c7241.us-west-2.amazon.com (Postfix) with ESMTPS id 0CED041295; Sun, 14 Nov 2021 01:25:29 +0000 (UTC) Received: from EX13D04ANC001.ant.amazon.com (10.43.157.89) by EX13MTAUWB001.ant.amazon.com (10.43.161.249) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:25:28 +0000 Received: from 88665a182662.ant.amazon.com (10.43.160.241) by EX13D04ANC001.ant.amazon.com (10.43.157.89) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:25:25 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Jakub Kicinski CC: Eric Dumazet , Kuniyuki Iwashima , Kuniyuki Iwashima , "Benjamin Herrenschmidt" , Subject: [PATCH v2 net-next 03/13] af_unix: Factorise unix_find_other() based on address types. Date: Sun, 14 Nov 2021 10:24:18 +0900 Message-ID: <20211114012428.81743-4-kuniyu@amazon.co.jp> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211114012428.81743-1-kuniyu@amazon.co.jp> References: <20211114012428.81743-1-kuniyu@amazon.co.jp> MIME-Version: 1.0 X-Originating-IP: [10.43.160.241] X-ClientProxiedBy: EX13D40UWC003.ant.amazon.com (10.43.162.246) To EX13D04ANC001.ant.amazon.com (10.43.157.89) Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org As done in the commit fa42d910a38e ("unix_bind(): take BSD and abstract address cases into new helpers"), this patch moves BSD and abstract address cases from unix_find_other() into unix_find_bsd() and unix_find_abstract(). Signed-off-by: Kuniyuki Iwashima --- net/unix/af_unix.c | 133 ++++++++++++++++++++++++++------------------- 1 file changed, 78 insertions(+), 55 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index e6056275f488..076d6a2db965 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -952,6 +952,84 @@ static int unix_release(struct socket *sock) return 0; } +static struct sock *unix_find_bsd(struct net *net, struct sockaddr_un *sunaddr, + int type, int *error) +{ + struct inode *inode; + struct path path; + struct sock *sk; + int err; + + err = kern_path(sunaddr->sun_path, LOOKUP_FOLLOW, &path); + if (err) + goto fail; + + err = path_permission(&path, MAY_WRITE); + if (err) + goto path_put; + + err = -ECONNREFUSED; + inode = d_backing_inode(path.dentry); + if (!S_ISSOCK(inode->i_mode)) + goto path_put; + + sk = unix_find_socket_byinode(inode); + if (!sk) + goto path_put; + + err = -EPROTOTYPE; + if (sk->sk_type == type) + touch_atime(&path); + else + goto sock_put; + + path_put(&path); + + return sk; + +sock_put: + sock_put(sk); +path_put: + path_put(&path); +fail: + *error = err; + return NULL; +} + +static struct sock *unix_find_abstract(struct net *net, struct sockaddr_un *sunaddr, + int addr_len, int type, unsigned int hash, + int *error) +{ + struct dentry *dentry; + struct sock *sk; + + sk = unix_find_socket_byname(net, sunaddr, addr_len, type ^ hash); + if (!sk) { + *error = -ECONNREFUSED; + return NULL; + } + + dentry = unix_sk(sk)->path.dentry; + if (dentry) + touch_atime(&unix_sk(sk)->path); + + return sk; +} + +static struct sock *unix_find_other(struct net *net, struct sockaddr_un *sunaddr, + int addr_len, int type, unsigned int hash, + int *error) +{ + struct sock *sk; + + if (sunaddr->sun_path[0]) + sk = unix_find_bsd(net, sunaddr, type, error); + else + sk = unix_find_abstract(net, sunaddr, addr_len, type, hash, error); + + return sk; +} + static int unix_autobind(struct sock *sk) { struct unix_sock *u = unix_sk(sk); @@ -1008,61 +1086,6 @@ out: mutex_unlock(&u->bindlock); return err; } -static struct sock *unix_find_other(struct net *net, - struct sockaddr_un *sunname, int len, - int type, unsigned int hash, int *error) -{ - struct sock *u; - struct path path; - int err = 0; - - if (sunname->sun_path[0]) { - struct inode *inode; - err = kern_path(sunname->sun_path, LOOKUP_FOLLOW, &path); - if (err) - goto fail; - inode = d_backing_inode(path.dentry); - err = path_permission(&path, MAY_WRITE); - if (err) - goto put_fail; - - err = -ECONNREFUSED; - if (!S_ISSOCK(inode->i_mode)) - goto put_fail; - u = unix_find_socket_byinode(inode); - if (!u) - goto put_fail; - - if (u->sk_type == type) - touch_atime(&path); - - path_put(&path); - - err = -EPROTOTYPE; - if (u->sk_type != type) { - sock_put(u); - goto fail; - } - } else { - err = -ECONNREFUSED; - u = unix_find_socket_byname(net, sunname, len, type ^ hash); - if (u) { - struct dentry *dentry; - dentry = unix_sk(u)->path.dentry; - if (dentry) - touch_atime(&unix_sk(u)->path); - } else - goto fail; - } - return u; - -put_fail: - path_put(&path); -fail: - *error = err; - return NULL; -} - static int unix_bind_bsd(struct sock *sk, struct unix_address *addr) { struct unix_sock *u = unix_sk(sk); From patchwork Sun Nov 14 01:24:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Iwashima, Kuniyuki" X-Patchwork-Id: 12617757 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 23CD4C433F5 for ; Sun, 14 Nov 2021 01:25:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 04A8260C49 for ; Sun, 14 Nov 2021 01:25:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235272AbhKNB2i (ORCPT ); Sat, 13 Nov 2021 20:28:38 -0500 Received: from smtp-fw-9103.amazon.com ([207.171.188.200]:59774 "EHLO smtp-fw-9103.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233306AbhKNB2h (ORCPT ); Sat, 13 Nov 2021 20:28:37 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.co.jp; i=@amazon.co.jp; q=dns/txt; s=amazon201209; t=1636853144; x=1668389144; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=FkThqrnN8EgRRp8rAQKD9dm8yZeZICxiThJ4pC+BpzE=; b=swKdNMUH7hB53ouB4agq9O6D724R3qLiaU8E/66xjV4E1VVa7ZjM1qLs EMxKIerM17uNRf/4KhpT5xKxX+Xf73ArF33NZ5wcowhjUSrdAUe2ffRnW UR8h9pRpXjO6j1TZtsyCG8O/t7vM1IBiznUps29HPe1zBxywV8MUnffGe w=; X-IronPort-AV: E=Sophos;i="5.87,233,1631577600"; d="scan'208";a="971465831" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO email-inbound-relay-pdx-2b-1f9d5b26.us-west-2.amazon.com) ([10.25.36.210]) by smtp-border-fw-9103.sea19.amazon.com with ESMTP; 14 Nov 2021 01:25:44 +0000 Received: from EX13MTAUWB001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan2.pdx.amazon.com [10.236.137.194]) by email-inbound-relay-pdx-2b-1f9d5b26.us-west-2.amazon.com (Postfix) with ESMTPS id 8EB9F418F8; Sun, 14 Nov 2021 01:25:44 +0000 (UTC) Received: from EX13D04ANC001.ant.amazon.com (10.43.157.89) by EX13MTAUWB001.ant.amazon.com (10.43.161.249) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:25:43 +0000 Received: from 88665a182662.ant.amazon.com (10.43.160.241) by EX13D04ANC001.ant.amazon.com (10.43.157.89) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:25:40 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Jakub Kicinski CC: Eric Dumazet , Kuniyuki Iwashima , Kuniyuki Iwashima , "Benjamin Herrenschmidt" , Subject: [PATCH v2 net-next 04/13] af_unix: Return an error as a pointer in unix_find_other(). Date: Sun, 14 Nov 2021 10:24:19 +0900 Message-ID: <20211114012428.81743-5-kuniyu@amazon.co.jp> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211114012428.81743-1-kuniyu@amazon.co.jp> References: <20211114012428.81743-1-kuniyu@amazon.co.jp> MIME-Version: 1.0 X-Originating-IP: [10.43.160.241] X-ClientProxiedBy: EX13D40UWC003.ant.amazon.com (10.43.162.246) To EX13D04ANC001.ant.amazon.com (10.43.157.89) Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org We can return an error as a pointer and need not pass an additional argument to unix_find_other(). Signed-off-by: Kuniyuki Iwashima --- net/unix/af_unix.c | 42 ++++++++++++++++++++++-------------------- 1 file changed, 22 insertions(+), 20 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 076d6a2db965..cb4e8cfa3958 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -953,7 +953,7 @@ static int unix_release(struct socket *sock) } static struct sock *unix_find_bsd(struct net *net, struct sockaddr_un *sunaddr, - int type, int *error) + int type) { struct inode *inode; struct path path; @@ -992,22 +992,18 @@ static struct sock *unix_find_bsd(struct net *net, struct sockaddr_un *sunaddr, path_put: path_put(&path); fail: - *error = err; - return NULL; + return ERR_PTR(err); } static struct sock *unix_find_abstract(struct net *net, struct sockaddr_un *sunaddr, - int addr_len, int type, unsigned int hash, - int *error) + int addr_len, int type, unsigned int hash) { struct dentry *dentry; struct sock *sk; sk = unix_find_socket_byname(net, sunaddr, addr_len, type ^ hash); - if (!sk) { - *error = -ECONNREFUSED; - return NULL; - } + if (!sk) + return ERR_PTR(-ECONNREFUSED); dentry = unix_sk(sk)->path.dentry; if (dentry) @@ -1017,15 +1013,14 @@ static struct sock *unix_find_abstract(struct net *net, struct sockaddr_un *suna } static struct sock *unix_find_other(struct net *net, struct sockaddr_un *sunaddr, - int addr_len, int type, unsigned int hash, - int *error) + int addr_len, int type, unsigned int hash) { struct sock *sk; if (sunaddr->sun_path[0]) - sk = unix_find_bsd(net, sunaddr, type, error); + sk = unix_find_bsd(net, sunaddr, type); else - sk = unix_find_abstract(net, sunaddr, addr_len, type, hash, error); + sk = unix_find_abstract(net, sunaddr, addr_len, type, hash); return sk; } @@ -1259,9 +1254,11 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr, } restart: - other = unix_find_other(net, sunaddr, alen, sock->type, hash, &err); - if (!other) + other = unix_find_other(net, sunaddr, alen, sock->type, hash); + if (IS_ERR(other)) { + err = PTR_ERR(other); goto out; + } unix_state_double_lock(sk, other); @@ -1391,9 +1388,12 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, restart: /* Find listening sock. */ - other = unix_find_other(net, sunaddr, addr_len, sk->sk_type, hash, &err); - if (!other) + other = unix_find_other(net, sunaddr, addr_len, sk->sk_type, hash); + if (IS_ERR(other)) { + err = PTR_ERR(other); + other = NULL; goto out; + } /* Latch state of peer */ unix_state_lock(other); @@ -1861,10 +1861,12 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, if (sunaddr == NULL) goto out_free; - other = unix_find_other(net, sunaddr, namelen, sk->sk_type, - hash, &err); - if (other == NULL) + other = unix_find_other(net, sunaddr, namelen, sk->sk_type, hash); + if (IS_ERR(other)) { + err = PTR_ERR(other); + other = NULL; goto out_free; + } } if (sk_filter(other, skb) < 0) { From patchwork Sun Nov 14 01:24:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Iwashima, Kuniyuki" X-Patchwork-Id: 12617759 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CB952C433EF for ; Sun, 14 Nov 2021 01:26:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B226B60F36 for ; Sun, 14 Nov 2021 01:26:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235884AbhKNB2x (ORCPT ); Sat, 13 Nov 2021 20:28:53 -0500 Received: from smtp-fw-6001.amazon.com ([52.95.48.154]:2255 "EHLO smtp-fw-6001.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235756AbhKNB2w (ORCPT ); Sat, 13 Nov 2021 20:28:52 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.co.jp; i=@amazon.co.jp; q=dns/txt; s=amazon201209; t=1636853160; x=1668389160; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=vx/NeQq/vzfHfGtrJ3QvbFPFDv8TXourg5JX+ZZ0mv4=; b=Tpe9Y6+l5axPhdDU0h6QhwxTBhuNoGtjb5e/Nyj3C8d9db0r/36T/Sck ylTUuPjYe3+P1BlSNoJL2i9+WNaDMpF54XJHdvwSGvlhOm8BCzMJ6EhgO /iXgyUq7fVqpOQEZAdSGjQ5rkO1lMDhUXB2lbc9vtbXckvBtPb6cw2Hqf g=; X-IronPort-AV: E=Sophos;i="5.87,233,1631577600"; d="scan'208";a="156316724" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-pdx-2a-6435a935.us-west-2.amazon.com) ([10.43.8.6]) by smtp-border-fw-6001.iad6.amazon.com with ESMTP; 14 Nov 2021 01:25:59 +0000 Received: from EX13MTAUWB001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan3.pdx.amazon.com [10.236.137.198]) by email-inbound-relay-pdx-2a-6435a935.us-west-2.amazon.com (Postfix) with ESMTPS id C560641976; Sun, 14 Nov 2021 01:25:58 +0000 (UTC) Received: from EX13D04ANC001.ant.amazon.com (10.43.157.89) by EX13MTAUWB001.ant.amazon.com (10.43.161.249) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:25:57 +0000 Received: from 88665a182662.ant.amazon.com (10.43.160.241) by EX13D04ANC001.ant.amazon.com (10.43.157.89) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:25:55 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Jakub Kicinski CC: Eric Dumazet , Kuniyuki Iwashima , Kuniyuki Iwashima , "Benjamin Herrenschmidt" , Subject: [PATCH v2 net-next 05/13] af_unix: Cut unix_validate_addr() out of unix_mkname(). Date: Sun, 14 Nov 2021 10:24:20 +0900 Message-ID: <20211114012428.81743-6-kuniyu@amazon.co.jp> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211114012428.81743-1-kuniyu@amazon.co.jp> References: <20211114012428.81743-1-kuniyu@amazon.co.jp> MIME-Version: 1.0 X-Originating-IP: [10.43.160.241] X-ClientProxiedBy: EX13D40UWC003.ant.amazon.com (10.43.162.246) To EX13D04ANC001.ant.amazon.com (10.43.157.89) Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org unix_mkname() tests socket address length and family and does some processing based on the address type. It is called in the early stage, and therefore some instructions are redundant and can end up in vain. The address length/family tests are done twice in unix_bind(). Also, the address type is rechecked later in unix_bind() and unix_find_other(), where we can do the same processing. Moreover, in the BSD address case, the hash is set to 0 but never used and confusing. This patch moves the address tests out of unix_mkname(), and the following patches move the other part into appropriate places and remove unix_mkname() finally. Signed-off-by: Kuniyuki Iwashima --- net/unix/af_unix.c | 39 ++++++++++++++++++++++++++++++--------- 1 file changed, 30 insertions(+), 9 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index cb4e8cfa3958..32164a5d8c40 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -227,14 +227,22 @@ static inline void unix_release_addr(struct unix_address *addr) * - if started by zero, it is abstract name. */ +static int unix_validate_addr(struct sockaddr_un *sunaddr, int addr_len) +{ + if (addr_len <= offsetof(struct sockaddr_un, sun_path) || + addr_len > sizeof(*sunaddr)) + return -EINVAL; + + if (sunaddr->sun_family != AF_UNIX) + return -EINVAL; + + return 0; +} + static int unix_mkname(struct sockaddr_un *sunaddr, int len, unsigned int *hashp) { *hashp = 0; - if (len <= offsetof(struct sockaddr_un, sun_path) || len > sizeof(*sunaddr)) - return -EINVAL; - if (!sunaddr || sunaddr->sun_family != AF_UNIX) - return -EINVAL; if (sunaddr->sun_path[0]) { /* * This may look like an off by one error but it is a bit more @@ -1173,13 +1181,14 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) unsigned int hash; struct unix_address *addr; - if (addr_len < offsetofend(struct sockaddr_un, sun_family) || - sunaddr->sun_family != AF_UNIX) - return -EINVAL; - - if (addr_len == offsetof(struct sockaddr_un, sun_path)) + if (addr_len == offsetof(struct sockaddr_un, sun_path) && + sunaddr->sun_family == AF_UNIX) return unix_autobind(sk); + err = unix_validate_addr(sunaddr, addr_len); + if (err) + return err; + err = unix_mkname(sunaddr, addr_len, &hash); if (err < 0) return err; @@ -1242,6 +1251,10 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr, goto out; if (addr->sa_family != AF_UNSPEC) { + err = unix_validate_addr(sunaddr, alen); + if (err) + goto out; + err = unix_mkname(sunaddr, alen, &hash); if (err < 0) goto out; @@ -1353,6 +1366,10 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, int err; long timeo; + err = unix_validate_addr(sunaddr, addr_len); + if (err) + goto out; + err = unix_mkname(sunaddr, addr_len, &hash); if (err < 0) goto out; @@ -1805,6 +1822,10 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, goto out; if (msg->msg_namelen) { + err = unix_validate_addr(sunaddr, msg->msg_namelen); + if (err) + goto out; + err = unix_mkname(sunaddr, msg->msg_namelen, &hash); if (err < 0) goto out; From patchwork Sun Nov 14 01:24:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Iwashima, Kuniyuki" X-Patchwork-Id: 12617761 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 05994C433EF for ; Sun, 14 Nov 2021 01:26:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D5EED60EC0 for ; Sun, 14 Nov 2021 01:26:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235756AbhKNB3H (ORCPT ); Sat, 13 Nov 2021 20:29:07 -0500 Received: from smtp-fw-80006.amazon.com ([99.78.197.217]:27591 "EHLO smtp-fw-80006.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233306AbhKNB3H (ORCPT ); Sat, 13 Nov 2021 20:29:07 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.co.jp; i=@amazon.co.jp; q=dns/txt; s=amazon201209; t=1636853173; x=1668389173; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zSQVsDb8GSkOGnc+40hgCnJuduskPkGcJx4/UKl+2/0=; b=tClJbPOkHODGOvUk4h8ZgJOoetAtAWUBzBATkCd0QTAWEJkapg9IhzA2 AWrkqece7ArsERz7UMu3ErwkUGaFFwar5VUfccs9vWtEcBEfO9dJjVRO5 +mCcIbnOhapahw953XQEtlWpXJrhpflqAb9v1WSpsPkyxnxdNepNjT7Ib 8=; X-IronPort-AV: E=Sophos;i="5.87,233,1631577600"; d="scan'208";a="41264219" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO email-inbound-relay-pdx-2a-b168f70e.us-west-2.amazon.com) ([10.25.36.214]) by smtp-border-fw-80006.pdx80.corp.amazon.com with ESMTP; 14 Nov 2021 01:26:13 +0000 Received: from EX13MTAUWB001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan2.pdx.amazon.com [10.236.137.194]) by email-inbound-relay-pdx-2a-b168f70e.us-west-2.amazon.com (Postfix) with ESMTPS id 50E4341925; Sun, 14 Nov 2021 01:26:14 +0000 (UTC) Received: from EX13D04ANC001.ant.amazon.com (10.43.157.89) by EX13MTAUWB001.ant.amazon.com (10.43.161.249) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:26:13 +0000 Received: from 88665a182662.ant.amazon.com (10.43.160.241) by EX13D04ANC001.ant.amazon.com (10.43.157.89) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:26:10 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Jakub Kicinski CC: Eric Dumazet , Kuniyuki Iwashima , Kuniyuki Iwashima , "Benjamin Herrenschmidt" , Subject: [PATCH v2 net-next 06/13] af_unix: Copy unix_mkname() into unix_find_(bsd|abstract)(). Date: Sun, 14 Nov 2021 10:24:21 +0900 Message-ID: <20211114012428.81743-7-kuniyu@amazon.co.jp> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211114012428.81743-1-kuniyu@amazon.co.jp> References: <20211114012428.81743-1-kuniyu@amazon.co.jp> MIME-Version: 1.0 X-Originating-IP: [10.43.160.241] X-ClientProxiedBy: EX13D40UWC003.ant.amazon.com (10.43.162.246) To EX13D04ANC001.ant.amazon.com (10.43.157.89) Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org We should not call unix_mkname() before unix_find_other() and instead do the same thing where necessary based on the address type: - terminating the address with '\0' in unix_find_bsd() - calculating the hash in unix_find_abstract(). Signed-off-by: Kuniyuki Iwashima --- net/unix/af_unix.c | 59 +++++++++++++++++++--------------------------- 1 file changed, 24 insertions(+), 35 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 32164a5d8c40..d0172d5d208f 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -239,19 +239,25 @@ static int unix_validate_addr(struct sockaddr_un *sunaddr, int addr_len) return 0; } +static void unix_mkname_bsd(struct sockaddr_un *sunaddr, int addr_len) +{ + /* This may look like an off by one error but it is a bit more + * subtle. 108 is the longest valid AF_UNIX path for a binding. + * sun_path[108] doesn't as such exist. However in kernel space + * we are guaranteed that it is a valid memory location in our + * kernel address buffer because syscall functions always pass + * a pointer of struct sockaddr_storage which has a bigger buffer + * than 108. + */ + ((char *)sunaddr)[addr_len] = 0; +} + static int unix_mkname(struct sockaddr_un *sunaddr, int len, unsigned int *hashp) { *hashp = 0; if (sunaddr->sun_path[0]) { - /* - * This may look like an off by one error but it is a bit more - * subtle. 108 is the longest valid AF_UNIX path for a binding. - * sun_path[108] doesn't as such exist. However in kernel space - * we are guaranteed that it is a valid memory location in our - * kernel address buffer. - */ - ((char *)sunaddr)[len] = 0; + unix_mkname_bsd(sunaddr, len); len = strlen(sunaddr->sun_path) + offsetof(struct sockaddr_un, sun_path) + 1; return len; } @@ -961,13 +967,14 @@ static int unix_release(struct socket *sock) } static struct sock *unix_find_bsd(struct net *net, struct sockaddr_un *sunaddr, - int type) + int addr_len, int type) { struct inode *inode; struct path path; struct sock *sk; int err; + unix_mkname_bsd(sunaddr, addr_len); err = kern_path(sunaddr->sun_path, LOOKUP_FOLLOW, &path); if (err) goto fail; @@ -1004,8 +1011,9 @@ static struct sock *unix_find_bsd(struct net *net, struct sockaddr_un *sunaddr, } static struct sock *unix_find_abstract(struct net *net, struct sockaddr_un *sunaddr, - int addr_len, int type, unsigned int hash) + int addr_len, int type) { + unsigned int hash = unix_hash_fold(csum_partial(sunaddr, addr_len, 0)); struct dentry *dentry; struct sock *sk; @@ -1021,14 +1029,14 @@ static struct sock *unix_find_abstract(struct net *net, struct sockaddr_un *suna } static struct sock *unix_find_other(struct net *net, struct sockaddr_un *sunaddr, - int addr_len, int type, unsigned int hash) + int addr_len, int type) { struct sock *sk; if (sunaddr->sun_path[0]) - sk = unix_find_bsd(net, sunaddr, type); + sk = unix_find_bsd(net, sunaddr, addr_len, type); else - sk = unix_find_abstract(net, sunaddr, addr_len, type, hash); + sk = unix_find_abstract(net, sunaddr, addr_len, type); return sk; } @@ -1243,7 +1251,6 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr, struct net *net = sock_net(sk); struct sockaddr_un *sunaddr = (struct sockaddr_un *)addr; struct sock *other; - unsigned int hash; int err; err = -EINVAL; @@ -1255,11 +1262,6 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr, if (err) goto out; - err = unix_mkname(sunaddr, alen, &hash); - if (err < 0) - goto out; - alen = err; - if (test_bit(SOCK_PASSCRED, &sock->flags) && !unix_sk(sk)->addr) { err = unix_autobind(sk); if (err) @@ -1267,7 +1269,7 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr, } restart: - other = unix_find_other(net, sunaddr, alen, sock->type, hash); + other = unix_find_other(net, sunaddr, alen, sock->type); if (IS_ERR(other)) { err = PTR_ERR(other); goto out; @@ -1361,7 +1363,6 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, struct sock *newsk = NULL; struct sock *other = NULL; struct sk_buff *skb = NULL; - unsigned int hash; int st; int err; long timeo; @@ -1370,11 +1371,6 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, if (err) goto out; - err = unix_mkname(sunaddr, addr_len, &hash); - if (err < 0) - goto out; - addr_len = err; - if (test_bit(SOCK_PASSCRED, &sock->flags) && !u->addr) { err = unix_autobind(sk); if (err) @@ -1405,7 +1401,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, restart: /* Find listening sock. */ - other = unix_find_other(net, sunaddr, addr_len, sk->sk_type, hash); + other = unix_find_other(net, sunaddr, addr_len, sk->sk_type); if (IS_ERR(other)) { err = PTR_ERR(other); other = NULL; @@ -1803,9 +1799,7 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, struct unix_sock *u = unix_sk(sk); DECLARE_SOCKADDR(struct sockaddr_un *, sunaddr, msg->msg_name); struct sock *other = NULL; - int namelen = 0; /* fake GCC */ int err; - unsigned int hash; struct sk_buff *skb; long timeo; struct scm_cookie scm; @@ -1825,11 +1819,6 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, err = unix_validate_addr(sunaddr, msg->msg_namelen); if (err) goto out; - - err = unix_mkname(sunaddr, msg->msg_namelen, &hash); - if (err < 0) - goto out; - namelen = err; } else { sunaddr = NULL; err = -ENOTCONN; @@ -1882,7 +1871,7 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg, if (sunaddr == NULL) goto out_free; - other = unix_find_other(net, sunaddr, namelen, sk->sk_type, hash); + other = unix_find_other(net, sunaddr, msg->msg_namelen, sk->sk_type); if (IS_ERR(other)) { err = PTR_ERR(other); other = NULL; From patchwork Sun Nov 14 01:24:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Iwashima, Kuniyuki" X-Patchwork-Id: 12617763 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C0E22C433F5 for ; Sun, 14 Nov 2021 01:26:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A517260C49 for ; Sun, 14 Nov 2021 01:26:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235896AbhKNB3Z (ORCPT ); Sat, 13 Nov 2021 20:29:25 -0500 Received: from smtp-fw-6002.amazon.com ([52.95.49.90]:56204 "EHLO smtp-fw-6002.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235893AbhKNB3Y (ORCPT ); Sat, 13 Nov 2021 20:29:24 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.co.jp; i=@amazon.co.jp; q=dns/txt; s=amazon201209; t=1636853192; x=1668389192; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=vD6xtrmUq8lQfQrQgGSokdyQzLQ1hPMD1Wb7Ebs20hA=; b=YqSKn+ILDT3hr3Gq3kjTd7afM71/Ag3A4OPTmlwC/+cLfMuDRRw/dB6L 7HhrguJLG1iEXpdiGWjLP8J7QDIga2J8isH4QxR8gmAijwUTh42pgAieq TWQOaFCKaP2FVZoZSaJb7a/9KkksQB5z/E5zw/NbbRJxGn0K7YCWS/xm6 M=; X-IronPort-AV: E=Sophos;i="5.87,233,1631577600"; d="scan'208";a="154894548" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO email-inbound-relay-pdx-2c-34cb9e7b.us-west-2.amazon.com) ([10.43.8.2]) by smtp-border-fw-6002.iad6.amazon.com with ESMTP; 14 Nov 2021 01:26:30 +0000 Received: from EX13MTAUWB001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan3.pdx.amazon.com [10.236.137.198]) by email-inbound-relay-pdx-2c-34cb9e7b.us-west-2.amazon.com (Postfix) with ESMTPS id D490A41B73; Sun, 14 Nov 2021 01:26:29 +0000 (UTC) Received: from EX13D04ANC001.ant.amazon.com (10.43.157.89) by EX13MTAUWB001.ant.amazon.com (10.43.161.249) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:26:28 +0000 Received: from 88665a182662.ant.amazon.com (10.43.160.241) by EX13D04ANC001.ant.amazon.com (10.43.157.89) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:26:25 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Jakub Kicinski CC: Eric Dumazet , Kuniyuki Iwashima , Kuniyuki Iwashima , "Benjamin Herrenschmidt" , Subject: [PATCH v2 net-next 07/13] af_unix: Remove unix_mkname(). Date: Sun, 14 Nov 2021 10:24:22 +0900 Message-ID: <20211114012428.81743-8-kuniyu@amazon.co.jp> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211114012428.81743-1-kuniyu@amazon.co.jp> References: <20211114012428.81743-1-kuniyu@amazon.co.jp> MIME-Version: 1.0 X-Originating-IP: [10.43.160.241] X-ClientProxiedBy: EX13D40UWC003.ant.amazon.com (10.43.162.246) To EX13D04ANC001.ant.amazon.com (10.43.157.89) Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org This patch removes unix_mkname() and postpones calculating a hash to unix_bind_abstract(). Some BSD stuffs still remain in unix_bind() though, the next patch packs them into unix_bind_bsd(). Signed-off-by: Kuniyuki Iwashima --- net/unix/af_unix.c | 32 ++++++++++---------------------- 1 file changed, 10 insertions(+), 22 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index d0172d5d208f..00f9b83e8966 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -252,20 +252,6 @@ static void unix_mkname_bsd(struct sockaddr_un *sunaddr, int addr_len) ((char *)sunaddr)[addr_len] = 0; } -static int unix_mkname(struct sockaddr_un *sunaddr, int len, unsigned int *hashp) -{ - *hashp = 0; - - if (sunaddr->sun_path[0]) { - unix_mkname_bsd(sunaddr, len); - len = strlen(sunaddr->sun_path) + offsetof(struct sockaddr_un, sun_path) + 1; - return len; - } - - *hashp = unix_hash_fold(csum_partial(sunaddr, len, 0)); - return len; -} - static void __unix_remove_socket(struct sock *sk) { sk_del_node_init(sk); @@ -1167,6 +1153,9 @@ static int unix_bind_abstract(struct sock *sk, struct unix_address *addr) return -EINVAL; } + addr->hash = unix_hash_fold(csum_partial(addr->name, addr->len, 0)); + addr->hash ^= sk->sk_type; + spin_lock(&unix_table_lock); if (__unix_find_socket_byname(sock_net(sk), addr->name, addr->len, addr->hash)) { @@ -1182,12 +1171,11 @@ static int unix_bind_abstract(struct sock *sk, struct unix_address *addr) static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) { - struct sock *sk = sock->sk; struct sockaddr_un *sunaddr = (struct sockaddr_un *)uaddr; char *sun_path = sunaddr->sun_path; - int err; - unsigned int hash; + struct sock *sk = sock->sk; struct unix_address *addr; + int err; if (addr_len == offsetof(struct sockaddr_un, sun_path) && sunaddr->sun_family == AF_UNIX) @@ -1197,17 +1185,17 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) if (err) return err; - err = unix_mkname(sunaddr, addr_len, &hash); - if (err < 0) - return err; - addr_len = err; + if (sun_path[0]) { + unix_mkname_bsd(sunaddr, addr_len); + addr_len = strlen(sunaddr->sun_path) + offsetof(struct sockaddr_un, sun_path) + 1; + } + addr = kmalloc(sizeof(*addr)+addr_len, GFP_KERNEL); if (!addr) return -ENOMEM; memcpy(addr->name, sunaddr, addr_len); addr->len = addr_len; - addr->hash = hash ^ sk->sk_type; refcount_set(&addr->refcnt, 1); if (sun_path[0]) From patchwork Sun Nov 14 01:24:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Iwashima, Kuniyuki" X-Patchwork-Id: 12617765 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2986DC433EF for ; Sun, 14 Nov 2021 01:26:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F407360EC0 for ; Sun, 14 Nov 2021 01:26:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236034AbhKNB3i (ORCPT ); Sat, 13 Nov 2021 20:29:38 -0500 Received: from smtp-fw-80007.amazon.com ([99.78.197.218]:26433 "EHLO smtp-fw-80007.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233306AbhKNB3h (ORCPT ); Sat, 13 Nov 2021 20:29:37 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.co.jp; i=@amazon.co.jp; q=dns/txt; s=amazon201209; t=1636853204; x=1668389204; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TsIvw6xfJwKSOt9g7urUFeVunt9zmZwiZLj7JZuNUv0=; b=Eh5UinSR+m/XrsQS5Aw5hBQbUEgCF9QQlSjL6ae6sBJhQ+Xt5JseIH/U uYibslXZeFm4zvx9eCHK+DnEYaN0jTPFNWe7SQMZOLkyZfSrOdw2ZmPXZ 5ElnPaVvT8Kf7ta9Gau+CHIqblq2x/XWM+k/S5s+FL8t9kgm7qzj0yjpl E=; X-IronPort-AV: E=Sophos;i="5.87,233,1631577600"; d="scan'208";a="41275713" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO email-inbound-relay-pdx-2b-5a09360d.us-west-2.amazon.com) ([10.25.36.210]) by smtp-border-fw-80007.pdx80.corp.amazon.com with ESMTP; 14 Nov 2021 01:26:44 +0000 Received: from EX13MTAUWB001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan2.pdx.amazon.com [10.236.137.194]) by email-inbound-relay-pdx-2b-5a09360d.us-west-2.amazon.com (Postfix) with ESMTPS id 554FA4156F; Sun, 14 Nov 2021 01:26:44 +0000 (UTC) Received: from EX13D04ANC001.ant.amazon.com (10.43.157.89) by EX13MTAUWB001.ant.amazon.com (10.43.161.249) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:26:43 +0000 Received: from 88665a182662.ant.amazon.com (10.43.160.241) by EX13D04ANC001.ant.amazon.com (10.43.157.89) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:26:40 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Jakub Kicinski CC: Eric Dumazet , Kuniyuki Iwashima , Kuniyuki Iwashima , "Benjamin Herrenschmidt" , Subject: [PATCH v2 net-next 08/13] af_unix: Allocate unix_address in unix_bind_(bsd|abstract)(). Date: Sun, 14 Nov 2021 10:24:23 +0900 Message-ID: <20211114012428.81743-9-kuniyu@amazon.co.jp> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211114012428.81743-1-kuniyu@amazon.co.jp> References: <20211114012428.81743-1-kuniyu@amazon.co.jp> MIME-Version: 1.0 X-Originating-IP: [10.43.160.241] X-ClientProxiedBy: EX13D40UWC003.ant.amazon.com (10.43.162.246) To EX13D04ANC001.ant.amazon.com (10.43.157.89) Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org To terminate address with '\0' in unix_bind_bsd(), we add unix_create_addr() and call it in unix_bind_bsd() and unix_bind_abstract(). Also, unix_bind_abstract() does not return -EEXIST. Only kern_path_create() and vfs_mknod() in unix_bind_bsd() can return it, so we move the last error check in unix_bind() to unix_bind_bsd(). Signed-off-by: Kuniyuki Iwashima --- net/unix/af_unix.c | 103 +++++++++++++++++++++++++++------------------ 1 file changed, 63 insertions(+), 40 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 00f9b83e8966..75ba642dbcac 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -214,6 +214,21 @@ struct sock *unix_peer_get(struct sock *s) } EXPORT_SYMBOL_GPL(unix_peer_get); +static struct unix_address *unix_create_addr(struct sockaddr_un *sunaddr, int addr_len) +{ + struct unix_address *addr; + + addr = kmalloc(sizeof(*addr) + addr_len, GFP_KERNEL); + if (!addr) + return NULL; + + refcount_set(&addr->refcnt, 1); + addr->len = addr_len; + memcpy(addr->name, sunaddr, addr_len); + + return addr; +} + static inline void unix_release_addr(struct unix_address *addr) { if (refcount_dec_and_test(&addr->refcnt)) @@ -1083,34 +1098,44 @@ out: mutex_unlock(&u->bindlock); return err; } -static int unix_bind_bsd(struct sock *sk, struct unix_address *addr) +static int unix_bind_bsd(struct sock *sk, struct sockaddr_un *sunaddr, int addr_len) { - struct unix_sock *u = unix_sk(sk); umode_t mode = S_IFSOCK | (SOCK_INODE(sk->sk_socket)->i_mode & ~current_umask()); + struct unix_sock *u = unix_sk(sk); struct user_namespace *ns; // barf... - struct path parent; + struct unix_address *addr; struct dentry *dentry; + struct path parent; unsigned int hash; int err; + unix_mkname_bsd(sunaddr, addr_len); + addr_len = strlen(sunaddr->sun_path) + offsetof(struct sockaddr_un, sun_path) + 1; + + addr = unix_create_addr(sunaddr, addr_len); + if (!addr) + return -ENOMEM; + /* * Get the parent directory, calculate the hash for last * component. */ dentry = kern_path_create(AT_FDCWD, addr->name->sun_path, &parent, 0); - if (IS_ERR(dentry)) - return PTR_ERR(dentry); - ns = mnt_user_ns(parent.mnt); + if (IS_ERR(dentry)) { + err = PTR_ERR(dentry); + goto out; + } /* * All right, let's create it. */ + ns = mnt_user_ns(parent.mnt); err = security_path_mknod(&parent, dentry, mode, 0); if (!err) err = vfs_mknod(ns, d_inode(parent.dentry), dentry, mode, 0); if (err) - goto out; + goto out_path; err = mutex_lock_interruptible(&u->bindlock); if (err) goto out_unlink; @@ -1134,47 +1159,59 @@ static int unix_bind_bsd(struct sock *sk, struct unix_address *addr) out_unlink: /* failed after successful mknod? unlink what we'd created... */ vfs_unlink(ns, d_inode(parent.dentry), dentry, NULL); -out: +out_path: done_path_create(&parent, dentry); - return err; +out: + unix_release_addr(addr); + return err == -EEXIST ? -EADDRINUSE : err; } -static int unix_bind_abstract(struct sock *sk, struct unix_address *addr) +static int unix_bind_abstract(struct sock *sk, struct sockaddr_un *sunaddr, int addr_len) { struct unix_sock *u = unix_sk(sk); + struct unix_address *addr; int err; + addr = unix_create_addr(sunaddr, addr_len); + if (!addr) + return -ENOMEM; + err = mutex_lock_interruptible(&u->bindlock); if (err) - return err; + goto out; if (u->addr) { - mutex_unlock(&u->bindlock); - return -EINVAL; + err = -EINVAL; + goto out_mutex; } addr->hash = unix_hash_fold(csum_partial(addr->name, addr->len, 0)); addr->hash ^= sk->sk_type; spin_lock(&unix_table_lock); - if (__unix_find_socket_byname(sock_net(sk), addr->name, addr->len, - addr->hash)) { - spin_unlock(&unix_table_lock); - mutex_unlock(&u->bindlock); - return -EADDRINUSE; - } + + if (__unix_find_socket_byname(sock_net(sk), addr->name, addr->len, addr->hash)) + goto out_spin; + __unix_set_addr(sk, addr, addr->hash); spin_unlock(&unix_table_lock); mutex_unlock(&u->bindlock); return 0; + +out_spin: + spin_unlock(&unix_table_lock); + err = -EADDRINUSE; +out_mutex: + mutex_unlock(&u->bindlock); +out: + unix_release_addr(addr); + return err; } static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) { struct sockaddr_un *sunaddr = (struct sockaddr_un *)uaddr; - char *sun_path = sunaddr->sun_path; struct sock *sk = sock->sk; - struct unix_address *addr; int err; if (addr_len == offsetof(struct sockaddr_un, sun_path) && @@ -1185,26 +1222,12 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) if (err) return err; - if (sun_path[0]) { - unix_mkname_bsd(sunaddr, addr_len); - addr_len = strlen(sunaddr->sun_path) + offsetof(struct sockaddr_un, sun_path) + 1; - } - - addr = kmalloc(sizeof(*addr)+addr_len, GFP_KERNEL); - if (!addr) - return -ENOMEM; - - memcpy(addr->name, sunaddr, addr_len); - addr->len = addr_len; - refcount_set(&addr->refcnt, 1); - - if (sun_path[0]) - err = unix_bind_bsd(sk, addr); + if (sunaddr->sun_path[0]) + err = unix_bind_bsd(sk, sunaddr, addr_len); else - err = unix_bind_abstract(sk, addr); - if (err) - unix_release_addr(addr); - return err == -EEXIST ? -EADDRINUSE : err; + err = unix_bind_abstract(sk, sunaddr, addr_len); + + return err; } static void unix_state_double_lock(struct sock *sk1, struct sock *sk2) From patchwork Sun Nov 14 01:24:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Iwashima, Kuniyuki" X-Patchwork-Id: 12617767 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A7B2C433EF for ; Sun, 14 Nov 2021 01:27:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1BCAB60FBF for ; Sun, 14 Nov 2021 01:27:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234774AbhKNB3z (ORCPT ); Sat, 13 Nov 2021 20:29:55 -0500 Received: from smtp-fw-6001.amazon.com ([52.95.48.154]:2489 "EHLO smtp-fw-6001.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233306AbhKNB3y (ORCPT ); Sat, 13 Nov 2021 20:29:54 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.co.jp; i=@amazon.co.jp; q=dns/txt; s=amazon201209; t=1636853222; x=1668389222; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=C4ZC/TpwqI5mkXoU4a9XKNQWJsce3rsMz1wDptle6k4=; b=tEKOJQN3MjWzr/RNLutstx/xoUGF8slGENktgs3st1xeA62r+salsMvK pXflQWPYBXchNbqe9mEeVlDdanB/bHbgS0t8EYBC9o/EfzMGsajREBVZR zUKR3ptMswhti/JltbFR4yIC0MsUq6ISNEJsRW191XLVKG2LSpAi3ORCj g=; X-IronPort-AV: E=Sophos;i="5.87,233,1631577600"; d="scan'208";a="156316759" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-pdx-2c-34cb9e7b.us-west-2.amazon.com) ([10.43.8.6]) by smtp-border-fw-6001.iad6.amazon.com with ESMTP; 14 Nov 2021 01:27:01 +0000 Received: from EX13MTAUWB001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan3.pdx.amazon.com [10.236.137.198]) by email-inbound-relay-pdx-2c-34cb9e7b.us-west-2.amazon.com (Postfix) with ESMTPS id 8F10E41B6E; Sun, 14 Nov 2021 01:26:59 +0000 (UTC) Received: from EX13D04ANC001.ant.amazon.com (10.43.157.89) by EX13MTAUWB001.ant.amazon.com (10.43.161.249) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:26:58 +0000 Received: from 88665a182662.ant.amazon.com (10.43.160.241) by EX13D04ANC001.ant.amazon.com (10.43.157.89) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:26:55 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Jakub Kicinski CC: Eric Dumazet , Kuniyuki Iwashima , Kuniyuki Iwashima , "Benjamin Herrenschmidt" , Subject: [PATCH v2 net-next 09/13] af_unix: Remove UNIX_ABSTRACT() macro and test sun_path[0] instead. Date: Sun, 14 Nov 2021 10:24:24 +0900 Message-ID: <20211114012428.81743-10-kuniyu@amazon.co.jp> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211114012428.81743-1-kuniyu@amazon.co.jp> References: <20211114012428.81743-1-kuniyu@amazon.co.jp> MIME-Version: 1.0 X-Originating-IP: [10.43.160.241] X-ClientProxiedBy: EX13D40UWC003.ant.amazon.com (10.43.162.246) To EX13D04ANC001.ant.amazon.com (10.43.157.89) Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org In BSD and abstract address cases, we store sockets in the hash table with keys between 0 and UNIX_HASH_SIZE - 1. However, the hash saved in a socket varies depending on its address type; sockets with BSD addresses always have UNIX_HASH_SIZE in their unix_sk(sk)->addr->hash. This is just for the UNIX_ABSTRACT() macro used to check the address type. The difference of the saved hashes comes from the first byte of the address in the first place. So, we can test it directly. Then we can keep a real hash in each socket and replace unix_table_lock with per-hash locks in the later patch. Signed-off-by: Kuniyuki Iwashima --- net/unix/af_unix.c | 4 +--- tools/testing/selftests/bpf/progs/bpf_iter_unix.c | 2 +- tools/testing/selftests/bpf/progs/bpf_tracing_net.h | 2 -- tools/testing/selftests/bpf/progs/test_skc_to_unix_sock.c | 2 +- 4 files changed, 3 insertions(+), 7 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 75ba642dbcac..ae20f6621239 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -134,8 +134,6 @@ static struct hlist_head *unix_sockets_unbound(void *addr) return &unix_socket_table[UNIX_HASH_SIZE + hash]; } -#define UNIX_ABSTRACT(sk) (unix_sk(sk)->addr->hash < UNIX_HASH_SIZE) - #ifdef CONFIG_SECURITY_NETWORK static void unix_get_secdata(struct scm_cookie *scm, struct sk_buff *skb) { @@ -3287,7 +3285,7 @@ static int unix_seq_show(struct seq_file *seq, void *v) i = 0; len = u->addr->len - offsetof(struct sockaddr_un, sun_path); - if (!UNIX_ABSTRACT(s)) + if (u->addr->name->sun_path[0]) len--; else { seq_putc(seq, '@'); diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_unix.c b/tools/testing/selftests/bpf/progs/bpf_iter_unix.c index 94423902685d..c21e3f545371 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_unix.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_unix.c @@ -49,7 +49,7 @@ int dump_unix(struct bpf_iter__unix *ctx) sock_i_ino(sk)); if (unix_sk->addr) { - if (!UNIX_ABSTRACT(unix_sk)) { + if (unix_sk->addr->name->sun_path[0]) { BPF_SEQ_PRINTF(seq, " %s", unix_sk->addr->name->sun_path); } else { /* The name of the abstract UNIX domain socket starts diff --git a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h index eef5646ddb19..e0f42601be9b 100644 --- a/tools/testing/selftests/bpf/progs/bpf_tracing_net.h +++ b/tools/testing/selftests/bpf/progs/bpf_tracing_net.h @@ -6,8 +6,6 @@ #define AF_INET6 10 #define __SO_ACCEPTCON (1 << 16) -#define UNIX_HASH_SIZE 256 -#define UNIX_ABSTRACT(unix_sk) (unix_sk->addr->hash < UNIX_HASH_SIZE) #define SOL_TCP 6 #define TCP_CONGESTION 13 diff --git a/tools/testing/selftests/bpf/progs/test_skc_to_unix_sock.c b/tools/testing/selftests/bpf/progs/test_skc_to_unix_sock.c index a408ec95cba4..eacda9fe07eb 100644 --- a/tools/testing/selftests/bpf/progs/test_skc_to_unix_sock.c +++ b/tools/testing/selftests/bpf/progs/test_skc_to_unix_sock.c @@ -23,7 +23,7 @@ int BPF_PROG(unix_listen, struct socket *sock, int backlog) if (!unix_sk) return 0; - if (!UNIX_ABSTRACT(unix_sk)) + if (unix_sk->addr->name->sun_path[0]) return 0; len = unix_sk->addr->len - sizeof(short); From patchwork Sun Nov 14 01:24:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Iwashima, Kuniyuki" X-Patchwork-Id: 12617769 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A181CC433F5 for ; Sun, 14 Nov 2021 01:27:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7485660EC0 for ; Sun, 14 Nov 2021 01:27:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234332AbhKNBaH (ORCPT ); Sat, 13 Nov 2021 20:30:07 -0500 Received: from smtp-fw-9102.amazon.com ([207.171.184.29]:39521 "EHLO smtp-fw-9102.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231736AbhKNBaH (ORCPT ); Sat, 13 Nov 2021 20:30:07 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.co.jp; i=@amazon.co.jp; q=dns/txt; s=amazon201209; t=1636853233; x=1668389233; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=OPrtWVw39HAtPmQ3YxANO2PhFNqDl5JB36apLanB9N4=; b=sNx9uEvJo2o0oKS0bE+UDVP9pcLOyB4//6lepQZw3ixsT3jFqYXTAne3 RVKvuiqaTu4NUdEYFzT1kylbwo0zm9ftuJPZDHBv5U/I1K0WbrWFSzNds Nt/TSq1stTGpAtLRIkhWcCGcjdIf4UU5TEJpBpSXQjTgpJUI5KachH8cf 4=; X-IronPort-AV: E=Sophos;i="5.87,233,1631577600"; d="scan'208";a="173835547" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO email-inbound-relay-pdx-2c-72dc3927.us-west-2.amazon.com) ([10.25.36.214]) by smtp-border-fw-9102.sea19.amazon.com with ESMTP; 14 Nov 2021 01:27:12 +0000 Received: from EX13MTAUWB001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan3.pdx.amazon.com [10.236.137.198]) by email-inbound-relay-pdx-2c-72dc3927.us-west-2.amazon.com (Postfix) with ESMTPS id C744F41B54; Sun, 14 Nov 2021 01:27:13 +0000 (UTC) Received: from EX13D04ANC001.ant.amazon.com (10.43.157.89) by EX13MTAUWB001.ant.amazon.com (10.43.161.249) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:27:12 +0000 Received: from 88665a182662.ant.amazon.com (10.43.160.241) by EX13D04ANC001.ant.amazon.com (10.43.157.89) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:27:09 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Jakub Kicinski CC: Eric Dumazet , Kuniyuki Iwashima , Kuniyuki Iwashima , "Benjamin Herrenschmidt" , Subject: [PATCH v2 net-next 10/13] af_unix: Add helpers to calculate hashes. Date: Sun, 14 Nov 2021 10:24:25 +0900 Message-ID: <20211114012428.81743-11-kuniyu@amazon.co.jp> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211114012428.81743-1-kuniyu@amazon.co.jp> References: <20211114012428.81743-1-kuniyu@amazon.co.jp> MIME-Version: 1.0 X-Originating-IP: [10.43.160.241] X-ClientProxiedBy: EX13D40UWC003.ant.amazon.com (10.43.162.246) To EX13D04ANC001.ant.amazon.com (10.43.157.89) Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org This patch adds three helper functions that calculate hashes for unbound sockets and bound sockets with BSD/abstract addresses. Signed-off-by: Kuniyuki Iwashima --- net/unix/af_unix.c | 63 +++++++++++++++++++++++++--------------------- 1 file changed, 34 insertions(+), 29 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index ae20f6621239..f95786e00723 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -123,15 +123,37 @@ DEFINE_SPINLOCK(unix_table_lock); EXPORT_SYMBOL_GPL(unix_table_lock); static atomic_long_t unix_nr_socks; +/* SMP locking strategy: + * hash table is protected with spinlock unix_table_lock + * each socket state is protected by separate spin lock. + */ -static struct hlist_head *unix_sockets_unbound(void *addr) +static unsigned int unix_unbound_hash(struct sock *sk) { - unsigned long hash = (unsigned long)addr; + unsigned long hash = (unsigned long)sk; hash ^= hash >> 16; hash ^= hash >> 8; - hash %= UNIX_HASH_SIZE; - return &unix_socket_table[UNIX_HASH_SIZE + hash]; + hash ^= sk->sk_type; + + return UNIX_HASH_SIZE + (hash & (UNIX_HASH_SIZE - 1)); +} + +static unsigned int unix_bsd_hash(struct inode *i) +{ + return i->i_ino & (UNIX_HASH_SIZE - 1); +} + +static unsigned int unix_abstract_hash(struct sockaddr_un *sunaddr, + int addr_len, int type) +{ + unsigned int hash; + + hash = (__force unsigned int)csum_fold(csum_partial(sunaddr, addr_len, 0)); + hash ^= hash >> 8; + hash ^= type; + + return hash & (UNIX_HASH_SIZE - 1); } #ifdef CONFIG_SECURITY_NETWORK @@ -162,20 +184,6 @@ static inline bool unix_secdata_eq(struct scm_cookie *scm, struct sk_buff *skb) } #endif /* CONFIG_SECURITY_NETWORK */ -/* - * SMP locking strategy: - * hash table is protected with spinlock unix_table_lock - * each socket state is protected by separate spin lock. - */ - -static inline unsigned int unix_hash_fold(__wsum n) -{ - unsigned int hash = (__force unsigned int)csum_fold(n); - - hash ^= hash>>8; - return hash&(UNIX_HASH_SIZE-1); -} - #define unix_peer(sk) (unix_sk(sk)->peer) static inline int unix_our_peer(struct sock *sk, struct sock *osk) @@ -333,11 +341,11 @@ static inline struct sock *unix_find_socket_byname(struct net *net, static struct sock *unix_find_socket_byinode(struct inode *i) { + unsigned int hash = unix_bsd_hash(i); struct sock *s; spin_lock(&unix_table_lock); - sk_for_each(s, - &unix_socket_table[i->i_ino & (UNIX_HASH_SIZE - 1)]) { + sk_for_each(s, &unix_socket_table[hash]) { struct dentry *dentry = unix_sk(s)->path.dentry; if (dentry && d_backing_inode(dentry) == i) { @@ -900,7 +908,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern, init_waitqueue_head(&u->peer_wait); init_waitqueue_func_entry(&u->peer_wake, unix_dgram_peer_wake_relay); memset(&u->scm_stat, 0, sizeof(struct scm_stat)); - unix_insert_socket(unix_sockets_unbound(sk), sk); + unix_insert_socket(&unix_socket_table[unix_unbound_hash(sk)], sk); local_bh_disable(); sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1); @@ -1012,11 +1020,11 @@ static struct sock *unix_find_bsd(struct net *net, struct sockaddr_un *sunaddr, static struct sock *unix_find_abstract(struct net *net, struct sockaddr_un *sunaddr, int addr_len, int type) { - unsigned int hash = unix_hash_fold(csum_partial(sunaddr, addr_len, 0)); + unsigned int hash = unix_abstract_hash(sunaddr, addr_len, type); struct dentry *dentry; struct sock *sk; - sk = unix_find_socket_byname(net, sunaddr, addr_len, type ^ hash); + sk = unix_find_socket_byname(net, sunaddr, addr_len, hash); if (!sk) return ERR_PTR(-ECONNREFUSED); @@ -1066,8 +1074,7 @@ static int unix_autobind(struct sock *sk) retry: addr->len = sprintf(addr->name->sun_path + 1, "%05x", ordernum) + offsetof(struct sockaddr_un, sun_path) + 1; - addr->hash = unix_hash_fold(csum_partial(addr->name, addr->len, 0)); - addr->hash ^= sk->sk_type; + addr->hash = unix_abstract_hash(addr->name, addr->len, sk->sk_type); spin_lock(&unix_table_lock); ordernum = (ordernum+1)&0xFFFFF; @@ -1141,7 +1148,7 @@ static int unix_bind_bsd(struct sock *sk, struct sockaddr_un *sunaddr, int addr_ goto out_unlock; addr->hash = UNIX_HASH_SIZE; - hash = d_backing_inode(dentry)->i_ino & (UNIX_HASH_SIZE - 1); + hash = unix_bsd_hash(d_backing_inode(dentry)); spin_lock(&unix_table_lock); u->path.mnt = mntget(parent.mnt); u->path.dentry = dget(dentry); @@ -1183,9 +1190,7 @@ static int unix_bind_abstract(struct sock *sk, struct sockaddr_un *sunaddr, int goto out_mutex; } - addr->hash = unix_hash_fold(csum_partial(addr->name, addr->len, 0)); - addr->hash ^= sk->sk_type; - + addr->hash = unix_abstract_hash(addr->name, addr->len, sk->sk_type); spin_lock(&unix_table_lock); if (__unix_find_socket_byname(sock_net(sk), addr->name, addr->len, addr->hash)) From patchwork Sun Nov 14 01:24:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Iwashima, Kuniyuki" X-Patchwork-Id: 12617771 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 63E31C433EF for ; Sun, 14 Nov 2021 01:27:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4190C60EC0 for ; Sun, 14 Nov 2021 01:27:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235836AbhKNBaZ (ORCPT ); Sat, 13 Nov 2021 20:30:25 -0500 Received: from smtp-fw-2101.amazon.com ([72.21.196.25]:8240 "EHLO smtp-fw-2101.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231736AbhKNBaY (ORCPT ); Sat, 13 Nov 2021 20:30:24 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.co.jp; i=@amazon.co.jp; q=dns/txt; s=amazon201209; t=1636853251; x=1668389251; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=rsIXV+Csmvx2oNKncgAOc2NU8t4VakB62QcKo4YIq+4=; b=gs0imkICUMvkVVr5I86hg4pTUHODbTPLVOYcFEl18QkAFowW1+xpoAQH 8vUcj4jjCS28/+ar1JBixkJJ8c0ALHHCLe8uOO8zo3t6lamt42lEb8+TG Mj+8H5X2p20ZRltuc5Zsd40rc/LT7+U9MBO1vgznsbXozTmxm1D4T8ow7 M=; X-IronPort-AV: E=Sophos;i="5.87,233,1631577600"; d="scan'208";a="151765595" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO email-inbound-relay-pdx-2b-31df91b1.us-west-2.amazon.com) ([10.43.8.2]) by smtp-border-fw-2101.iad2.amazon.com with ESMTP; 14 Nov 2021 01:27:30 +0000 Received: from EX13MTAUWB001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan3.pdx.amazon.com [10.236.137.198]) by email-inbound-relay-pdx-2b-31df91b1.us-west-2.amazon.com (Postfix) with ESMTPS id 4145041881; Sun, 14 Nov 2021 01:27:30 +0000 (UTC) Received: from EX13D04ANC001.ant.amazon.com (10.43.157.89) by EX13MTAUWB001.ant.amazon.com (10.43.161.249) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:27:28 +0000 Received: from 88665a182662.ant.amazon.com (10.43.160.241) by EX13D04ANC001.ant.amazon.com (10.43.157.89) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:27:25 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Jakub Kicinski CC: Eric Dumazet , Kuniyuki Iwashima , Kuniyuki Iwashima , "Benjamin Herrenschmidt" , Subject: [PATCH v2 net-next 11/13] af_unix: Save hash in sk_hash. Date: Sun, 14 Nov 2021 10:24:26 +0900 Message-ID: <20211114012428.81743-12-kuniyu@amazon.co.jp> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211114012428.81743-1-kuniyu@amazon.co.jp> References: <20211114012428.81743-1-kuniyu@amazon.co.jp> MIME-Version: 1.0 X-Originating-IP: [10.43.160.241] X-ClientProxiedBy: EX13D40UWC003.ant.amazon.com (10.43.162.246) To EX13D04ANC001.ant.amazon.com (10.43.157.89) Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org To replace unix_table_lock with per-hash locks in the next patch, we need to save a hash in each socket because /proc/net/unix or BPF prog iterate sockets while holding a hash table lock and release it later in a different function. Currently, we store a real/pseudo hash in struct unix_address. However, we do not allocate it to unbound sockets, nor should we do just for that. For this purpose, we can use sk_hash. Then, we no longer use the hash field in struct unix_address and can remove it. Also, this patch does - rename unix_insert_socket() to unix_insert_unbound_socket() - remove the redundant list argument from __unix_insert_socket() and unix_insert_unbound_socket() - use 'unsigned int' instead of 'unsigned' in __unix_set_addr_hash() - remove 'inline' from unix_remove_socket() and unix_insert_unbound_socket(). Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 1 - net/unix/af_unix.c | 41 ++++++++++++++++++++++------------------- 2 files changed, 22 insertions(+), 20 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index 7d142e8a0550..89049cc6c066 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -26,7 +26,6 @@ extern struct hlist_head unix_socket_table[2 * UNIX_HASH_SIZE]; struct unix_address { refcount_t refcnt; int len; - unsigned int hash; struct sockaddr_un name[]; }; diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index f95786e00723..db04dda07c39 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -278,31 +278,32 @@ static void __unix_remove_socket(struct sock *sk) sk_del_node_init(sk); } -static void __unix_insert_socket(struct hlist_head *list, struct sock *sk) +static void __unix_insert_socket(struct sock *sk) { WARN_ON(!sk_unhashed(sk)); - sk_add_node(sk, list); + sk_add_node(sk, &unix_socket_table[sk->sk_hash]); } -static void __unix_set_addr(struct sock *sk, struct unix_address *addr, - unsigned hash) +static void __unix_set_addr_hash(struct sock *sk, struct unix_address *addr, + unsigned int hash) { __unix_remove_socket(sk); + sk->sk_hash = hash; smp_store_release(&unix_sk(sk)->addr, addr); - __unix_insert_socket(&unix_socket_table[hash], sk); + __unix_insert_socket(sk); } -static inline void unix_remove_socket(struct sock *sk) +static void unix_remove_socket(struct sock *sk) { spin_lock(&unix_table_lock); __unix_remove_socket(sk); spin_unlock(&unix_table_lock); } -static inline void unix_insert_socket(struct hlist_head *list, struct sock *sk) +static void unix_insert_unbound_socket(struct sock *sk) { spin_lock(&unix_table_lock); - __unix_insert_socket(list, sk); + __unix_insert_socket(sk); spin_unlock(&unix_table_lock); } @@ -893,6 +894,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern, sock_init_data(sock, sk); + sk->sk_hash = unix_unbound_hash(sk); sk->sk_allocation = GFP_KERNEL_ACCOUNT; sk->sk_write_space = unix_write_space; sk->sk_max_ack_backlog = net->unx.sysctl_max_dgram_qlen; @@ -908,7 +910,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern, init_waitqueue_head(&u->peer_wait); init_waitqueue_func_entry(&u->peer_wake, unix_dgram_peer_wake_relay); memset(&u->scm_stat, 0, sizeof(struct scm_stat)); - unix_insert_socket(&unix_socket_table[unix_unbound_hash(sk)], sk); + unix_insert_unbound_socket(sk); local_bh_disable(); sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1); @@ -1054,6 +1056,7 @@ static int unix_autobind(struct sock *sk) struct unix_address *addr; unsigned int retries = 0; static u32 ordernum = 1; + unsigned int new_hash; int err; err = mutex_lock_interruptible(&u->bindlock); @@ -1074,12 +1077,12 @@ static int unix_autobind(struct sock *sk) retry: addr->len = sprintf(addr->name->sun_path + 1, "%05x", ordernum) + offsetof(struct sockaddr_un, sun_path) + 1; - addr->hash = unix_abstract_hash(addr->name, addr->len, sk->sk_type); + new_hash = unix_abstract_hash(addr->name, addr->len, sk->sk_type); spin_lock(&unix_table_lock); ordernum = (ordernum+1)&0xFFFFF; - if (__unix_find_socket_byname(sock_net(sk), addr->name, addr->len, addr->hash)) { + if (__unix_find_socket_byname(sock_net(sk), addr->name, addr->len, new_hash)) { spin_unlock(&unix_table_lock); /* * __unix_find_socket_byname() may take long time if many names @@ -1095,7 +1098,7 @@ static int unix_autobind(struct sock *sk) goto retry; } - __unix_set_addr(sk, addr, addr->hash); + __unix_set_addr_hash(sk, addr, new_hash); spin_unlock(&unix_table_lock); err = 0; @@ -1110,9 +1113,9 @@ static int unix_bind_bsd(struct sock *sk, struct sockaddr_un *sunaddr, int addr_ struct unix_sock *u = unix_sk(sk); struct user_namespace *ns; // barf... struct unix_address *addr; + unsigned int new_hash; struct dentry *dentry; struct path parent; - unsigned int hash; int err; unix_mkname_bsd(sunaddr, addr_len); @@ -1147,12 +1150,11 @@ static int unix_bind_bsd(struct sock *sk, struct sockaddr_un *sunaddr, int addr_ if (u->addr) goto out_unlock; - addr->hash = UNIX_HASH_SIZE; - hash = unix_bsd_hash(d_backing_inode(dentry)); + new_hash = unix_bsd_hash(d_backing_inode(dentry)); spin_lock(&unix_table_lock); u->path.mnt = mntget(parent.mnt); u->path.dentry = dget(dentry); - __unix_set_addr(sk, addr, hash); + __unix_set_addr_hash(sk, addr, new_hash); spin_unlock(&unix_table_lock); mutex_unlock(&u->bindlock); done_path_create(&parent, dentry); @@ -1175,6 +1177,7 @@ static int unix_bind_abstract(struct sock *sk, struct sockaddr_un *sunaddr, int { struct unix_sock *u = unix_sk(sk); struct unix_address *addr; + unsigned int new_hash; int err; addr = unix_create_addr(sunaddr, addr_len); @@ -1190,13 +1193,13 @@ static int unix_bind_abstract(struct sock *sk, struct sockaddr_un *sunaddr, int goto out_mutex; } - addr->hash = unix_abstract_hash(addr->name, addr->len, sk->sk_type); + new_hash = unix_abstract_hash(addr->name, addr->len, sk->sk_type); spin_lock(&unix_table_lock); - if (__unix_find_socket_byname(sock_net(sk), addr->name, addr->len, addr->hash)) + if (__unix_find_socket_byname(sock_net(sk), addr->name, addr->len, new_hash)) goto out_spin; - __unix_set_addr(sk, addr, addr->hash); + __unix_set_addr_hash(sk, addr, new_hash); spin_unlock(&unix_table_lock); mutex_unlock(&u->bindlock); return 0; From patchwork Sun Nov 14 01:24:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Iwashima, Kuniyuki" X-Patchwork-Id: 12617773 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6DB6EC433F5 for ; Sun, 14 Nov 2021 01:27:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5733760F22 for ; Sun, 14 Nov 2021 01:27:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234736AbhKNBah (ORCPT ); Sat, 13 Nov 2021 20:30:37 -0500 Received: from smtp-fw-80006.amazon.com ([99.78.197.217]:29329 "EHLO smtp-fw-80006.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229988AbhKNBag (ORCPT ); Sat, 13 Nov 2021 20:30:36 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.co.jp; i=@amazon.co.jp; q=dns/txt; s=amazon201209; t=1636853262; x=1668389262; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=vYAh+SgP3NkyIS+Coxb9hrQT6/o+Tud5ko7vjCWpY9s=; b=swapd0bpmEPIXOM4RWQPJStuIYL8PE0r6O/tqHj8gQz+eBYxa8XPp99M lMDaU6RQ4dNxyV1hQPNe2GnP+IKufiYTmF7uz1++QkMAMJthAL4WZNZqX cGS20VPbJxjtoFVIST48dT4WkC17Y3cOf3TYZNHKOww95sWk7ivR08pgD c=; X-IronPort-AV: E=Sophos;i="5.87,233,1631577600"; d="scan'208";a="41264261" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO email-inbound-relay-pdx-2a-ff3df2fe.us-west-2.amazon.com) ([10.25.36.214]) by smtp-border-fw-80006.pdx80.corp.amazon.com with ESMTP; 14 Nov 2021 01:27:42 +0000 Received: from EX13MTAUWB001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan3.pdx.amazon.com [10.236.137.198]) by email-inbound-relay-pdx-2a-ff3df2fe.us-west-2.amazon.com (Postfix) with ESMTPS id 97AC941AA1; Sun, 14 Nov 2021 01:27:43 +0000 (UTC) Received: from EX13D04ANC001.ant.amazon.com (10.43.157.89) by EX13MTAUWB001.ant.amazon.com (10.43.161.249) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:27:42 +0000 Received: from 88665a182662.ant.amazon.com (10.43.160.241) by EX13D04ANC001.ant.amazon.com (10.43.157.89) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:27:39 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Jakub Kicinski CC: Eric Dumazet , Kuniyuki Iwashima , Kuniyuki Iwashima , "Benjamin Herrenschmidt" , Subject: [PATCH v2 net-next 12/13] af_unix: Replace the big lock with small locks. Date: Sun, 14 Nov 2021 10:24:27 +0900 Message-ID: <20211114012428.81743-13-kuniyu@amazon.co.jp> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211114012428.81743-1-kuniyu@amazon.co.jp> References: <20211114012428.81743-1-kuniyu@amazon.co.jp> MIME-Version: 1.0 X-Originating-IP: [10.43.160.241] X-ClientProxiedBy: EX13D40UWC003.ant.amazon.com (10.43.162.246) To EX13D04ANC001.ant.amazon.com (10.43.157.89) Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org The hash table of AF_UNIX sockets is protected by the single lock. This patch replaces it with per-hash locks. The effect is noticeable when we handle multiple sockets simultaneously. Here is a test result on an EC2 c5.24xlarge instance. It shows latency (under 10us only) in unix_insert_unbound_socket() while 64 CPUs creating 1024 sockets for each in parallel. Without this patch: nsec : count distribution 0 : 179 | | 500 : 3021 |********* | 1000 : 6271 |******************* | 1500 : 6318 |******************* | 2000 : 5828 |***************** | 2500 : 5124 |*************** | 3000 : 4426 |************* | 3500 : 3672 |*********** | 4000 : 3138 |********* | 4500 : 2811 |******** | 5000 : 2384 |******* | 5500 : 2023 |****** | 6000 : 1954 |***** | 6500 : 1737 |***** | 7000 : 1749 |***** | 7500 : 1520 |**** | 8000 : 1469 |**** | 8500 : 1394 |**** | 9000 : 1232 |*** | 9500 : 1138 |*** | 10000 : 994 |*** | With this patch: nsec : count distribution 0 : 1634 |**** | 500 : 13170 |****************************************| 1000 : 13156 |*************************************** | 1500 : 9010 |*************************** | 2000 : 6363 |******************* | 2500 : 4443 |************* | 3000 : 3240 |********* | 3500 : 2549 |******* | 4000 : 1872 |***** | 4500 : 1504 |**** | 5000 : 1247 |*** | 5500 : 1035 |*** | 6000 : 889 |** | 6500 : 744 |** | 7000 : 634 |* | 7500 : 498 |* | 8000 : 433 |* | 8500 : 355 |* | 9000 : 336 |* | 9500 : 284 | | 10000 : 243 | | Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 2 +- net/unix/af_unix.c | 98 ++++++++++++++++++++++++++----------------- net/unix/diag.c | 20 ++++----- 3 files changed, 71 insertions(+), 49 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index 89049cc6c066..a7ef624ed726 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -20,7 +20,7 @@ struct sock *unix_peer_get(struct sock *sk); #define UNIX_HASH_BITS 8 extern unsigned int unix_tot_inflight; -extern spinlock_t unix_table_lock; +extern spinlock_t unix_table_locks[2 * UNIX_HASH_SIZE]; extern struct hlist_head unix_socket_table[2 * UNIX_HASH_SIZE]; struct unix_address { diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index db04dda07c39..360f8dd901df 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -117,14 +117,14 @@ #include "scm.h" +spinlock_t unix_table_locks[2 * UNIX_HASH_SIZE]; +EXPORT_SYMBOL_GPL(unix_table_locks); struct hlist_head unix_socket_table[2 * UNIX_HASH_SIZE]; EXPORT_SYMBOL_GPL(unix_socket_table); -DEFINE_SPINLOCK(unix_table_lock); -EXPORT_SYMBOL_GPL(unix_table_lock); static atomic_long_t unix_nr_socks; /* SMP locking strategy: - * hash table is protected with spinlock unix_table_lock + * hash table is protected with spinlock unix_table_locks * each socket state is protected by separate spin lock. */ @@ -156,6 +156,25 @@ static unsigned int unix_abstract_hash(struct sockaddr_un *sunaddr, return hash & (UNIX_HASH_SIZE - 1); } +static void unix_table_double_lock(unsigned int hash1, unsigned int hash2) +{ + /* hash1 and hash2 is never the same because + * one is between 0 and UNIX_HASH_SIZE - 1, and + * another is between UNIX_HASH_SIZE and UNIX_HASH_SIZE * 2. + */ + if (hash1 > hash2) + swap(hash1, hash2); + + spin_lock(&unix_table_locks[hash1]); + spin_lock_nested(&unix_table_locks[hash2], SINGLE_DEPTH_NESTING); +} + +static void unix_table_double_unlock(unsigned int hash1, unsigned int hash2) +{ + spin_unlock(&unix_table_locks[hash1]); + spin_unlock(&unix_table_locks[hash2]); +} + #ifdef CONFIG_SECURITY_NETWORK static void unix_get_secdata(struct scm_cookie *scm, struct sk_buff *skb) { @@ -295,16 +314,16 @@ static void __unix_set_addr_hash(struct sock *sk, struct unix_address *addr, static void unix_remove_socket(struct sock *sk) { - spin_lock(&unix_table_lock); + spin_lock(&unix_table_locks[sk->sk_hash]); __unix_remove_socket(sk); - spin_unlock(&unix_table_lock); + spin_unlock(&unix_table_locks[sk->sk_hash]); } static void unix_insert_unbound_socket(struct sock *sk) { - spin_lock(&unix_table_lock); + spin_lock(&unix_table_locks[sk->sk_hash]); __unix_insert_socket(sk); - spin_unlock(&unix_table_lock); + spin_unlock(&unix_table_locks[sk->sk_hash]); } static struct sock *__unix_find_socket_byname(struct net *net, @@ -332,11 +351,11 @@ static inline struct sock *unix_find_socket_byname(struct net *net, { struct sock *s; - spin_lock(&unix_table_lock); + spin_lock(&unix_table_locks[hash]); s = __unix_find_socket_byname(net, sunname, len, hash); if (s) sock_hold(s); - spin_unlock(&unix_table_lock); + spin_unlock(&unix_table_locks[hash]); return s; } @@ -345,19 +364,18 @@ static struct sock *unix_find_socket_byinode(struct inode *i) unsigned int hash = unix_bsd_hash(i); struct sock *s; - spin_lock(&unix_table_lock); + spin_lock(&unix_table_locks[hash]); sk_for_each(s, &unix_socket_table[hash]) { struct dentry *dentry = unix_sk(s)->path.dentry; if (dentry && d_backing_inode(dentry) == i) { sock_hold(s); - goto found; + spin_unlock(&unix_table_locks[hash]); + return s; } } - s = NULL; -found: - spin_unlock(&unix_table_lock); - return s; + spin_unlock(&unix_table_locks[hash]); + return NULL; } /* Support code for asymmetrically connected dgram sockets @@ -1052,11 +1070,11 @@ static struct sock *unix_find_other(struct net *net, struct sockaddr_un *sunaddr static int unix_autobind(struct sock *sk) { + unsigned int new_hash, old_hash = sk->sk_hash; struct unix_sock *u = unix_sk(sk); struct unix_address *addr; unsigned int retries = 0; static u32 ordernum = 1; - unsigned int new_hash; int err; err = mutex_lock_interruptible(&u->bindlock); @@ -1079,11 +1097,12 @@ static int unix_autobind(struct sock *sk) offsetof(struct sockaddr_un, sun_path) + 1; new_hash = unix_abstract_hash(addr->name, addr->len, sk->sk_type); - spin_lock(&unix_table_lock); + unix_table_double_lock(old_hash, new_hash); ordernum = (ordernum+1)&0xFFFFF; if (__unix_find_socket_byname(sock_net(sk), addr->name, addr->len, new_hash)) { - spin_unlock(&unix_table_lock); + unix_table_double_unlock(old_hash, new_hash); + /* * __unix_find_socket_byname() may take long time if many names * are already in use. @@ -1099,7 +1118,7 @@ static int unix_autobind(struct sock *sk) } __unix_set_addr_hash(sk, addr, new_hash); - spin_unlock(&unix_table_lock); + unix_table_double_unlock(old_hash, new_hash); err = 0; out: mutex_unlock(&u->bindlock); @@ -1110,10 +1129,10 @@ static int unix_bind_bsd(struct sock *sk, struct sockaddr_un *sunaddr, int addr_ { umode_t mode = S_IFSOCK | (SOCK_INODE(sk->sk_socket)->i_mode & ~current_umask()); + unsigned int new_hash, old_hash = sk->sk_hash; struct unix_sock *u = unix_sk(sk); struct user_namespace *ns; // barf... struct unix_address *addr; - unsigned int new_hash; struct dentry *dentry; struct path parent; int err; @@ -1151,11 +1170,11 @@ static int unix_bind_bsd(struct sock *sk, struct sockaddr_un *sunaddr, int addr_ goto out_unlock; new_hash = unix_bsd_hash(d_backing_inode(dentry)); - spin_lock(&unix_table_lock); + unix_table_double_lock(old_hash, new_hash); u->path.mnt = mntget(parent.mnt); u->path.dentry = dget(dentry); __unix_set_addr_hash(sk, addr, new_hash); - spin_unlock(&unix_table_lock); + unix_table_double_unlock(old_hash, new_hash); mutex_unlock(&u->bindlock); done_path_create(&parent, dentry); return 0; @@ -1175,9 +1194,9 @@ static int unix_bind_bsd(struct sock *sk, struct sockaddr_un *sunaddr, int addr_ static int unix_bind_abstract(struct sock *sk, struct sockaddr_un *sunaddr, int addr_len) { + unsigned int new_hash, old_hash = sk->sk_hash; struct unix_sock *u = unix_sk(sk); struct unix_address *addr; - unsigned int new_hash; int err; addr = unix_create_addr(sunaddr, addr_len); @@ -1194,18 +1213,18 @@ static int unix_bind_abstract(struct sock *sk, struct sockaddr_un *sunaddr, int } new_hash = unix_abstract_hash(addr->name, addr->len, sk->sk_type); - spin_lock(&unix_table_lock); + unix_table_double_lock(old_hash, new_hash); if (__unix_find_socket_byname(sock_net(sk), addr->name, addr->len, new_hash)) goto out_spin; __unix_set_addr_hash(sk, addr, new_hash); - spin_unlock(&unix_table_lock); + unix_table_double_unlock(old_hash, new_hash); mutex_unlock(&u->bindlock); return 0; out_spin: - spin_unlock(&unix_table_lock); + unix_table_double_unlock(old_hash, new_hash); err = -EADDRINUSE; out_mutex: mutex_unlock(&u->bindlock); @@ -1511,9 +1530,9 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, * * The contents of *(otheru->addr) and otheru->path * are seen fully set up here, since we have found - * otheru in hash under unix_table_lock. Insertion + * otheru in hash under unix_table_locks. Insertion * into the hash chain we'd found it in had been done - * in an earlier critical area protected by unix_table_lock, + * in an earlier critical area protected by unix_table_locks, * the same one where we'd set *(otheru->addr) contents, * as well as otheru->path and otheru->addr itself. * @@ -3192,7 +3211,7 @@ static __poll_t unix_dgram_poll(struct file *file, struct socket *sock, #define BUCKET_SPACE (BITS_PER_LONG - (UNIX_HASH_BITS + 1) - 1) #define get_bucket(x) ((x) >> BUCKET_SPACE) -#define get_offset(x) ((x) & ((1L << BUCKET_SPACE) - 1)) +#define get_offset(x) ((x) & ((1UL << BUCKET_SPACE) - 1)) #define set_bucket_offset(b, o) ((b) << BUCKET_SPACE | (o)) static struct sock *unix_from_bucket(struct seq_file *seq, loff_t *pos) @@ -3216,7 +3235,7 @@ static struct sock *unix_next_socket(struct seq_file *seq, struct sock *sk, loff_t *pos) { - unsigned long bucket; + unsigned long bucket = get_bucket(*pos); while (sk > (struct sock *)SEQ_START_TOKEN) { sk = sk_next(sk); @@ -3227,12 +3246,13 @@ static struct sock *unix_next_socket(struct seq_file *seq, } do { + spin_lock(&unix_table_locks[bucket]); sk = unix_from_bucket(seq, pos); if (sk) return sk; next_bucket: - bucket = get_bucket(*pos) + 1; + spin_unlock(&unix_table_locks[bucket++]); *pos = set_bucket_offset(bucket, 1); } while (bucket < ARRAY_SIZE(unix_socket_table)); @@ -3240,10 +3260,7 @@ static struct sock *unix_next_socket(struct seq_file *seq, } static void *unix_seq_start(struct seq_file *seq, loff_t *pos) - __acquires(unix_table_lock) { - spin_lock(&unix_table_lock); - if (!*pos) return SEQ_START_TOKEN; @@ -3260,9 +3277,11 @@ static void *unix_seq_next(struct seq_file *seq, void *v, loff_t *pos) } static void unix_seq_stop(struct seq_file *seq, void *v) - __releases(unix_table_lock) { - spin_unlock(&unix_table_lock); + struct sock *sk = v; + + if (sk) + spin_unlock(&unix_table_locks[sk->sk_hash]); } static int unix_seq_show(struct seq_file *seq, void *v) @@ -3287,7 +3306,7 @@ static int unix_seq_show(struct seq_file *seq, void *v) (s->sk_state == TCP_ESTABLISHED ? SS_CONNECTING : SS_DISCONNECTING), sock_i_ino(s)); - if (u->addr) { // under unix_table_lock here + if (u->addr) { // under unix_table_locks here int i, len; seq_putc(seq, ' '); @@ -3445,10 +3464,13 @@ static void __init bpf_iter_register(void) static int __init af_unix_init(void) { - int rc = -1; + int i, rc = -1; BUILD_BUG_ON(sizeof(struct unix_skb_parms) > sizeof_field(struct sk_buff, cb)); + for (i = 0; i < 2 * UNIX_HASH_SIZE; i++) + spin_lock_init(&unix_table_locks[i]); + rc = proto_register(&unix_dgram_proto, 1); if (rc != 0) { pr_crit("%s: Cannot create unix_sock SLAB cache!\n", __func__); diff --git a/net/unix/diag.c b/net/unix/diag.c index db555f267407..bb0b5ea1655f 100644 --- a/net/unix/diag.c +++ b/net/unix/diag.c @@ -13,7 +13,7 @@ static int sk_diag_dump_name(struct sock *sk, struct sk_buff *nlskb) { - /* might or might not have unix_table_lock */ + /* might or might not have unix_table_locks */ struct unix_address *addr = smp_load_acquire(&unix_sk(sk)->addr); if (!addr) @@ -204,13 +204,13 @@ static int unix_diag_dump(struct sk_buff *skb, struct netlink_callback *cb) s_slot = cb->args[0]; num = s_num = cb->args[1]; - spin_lock(&unix_table_lock); for (slot = s_slot; slot < ARRAY_SIZE(unix_socket_table); s_num = 0, slot++) { struct sock *sk; num = 0; + spin_lock(&unix_table_locks[slot]); sk_for_each(sk, &unix_socket_table[slot]) { if (!net_eq(sock_net(sk), net)) continue; @@ -221,14 +221,16 @@ static int unix_diag_dump(struct sk_buff *skb, struct netlink_callback *cb) if (sk_diag_dump(sk, skb, req, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, - NLM_F_MULTI) < 0) + NLM_F_MULTI) < 0) { + spin_unlock(&unix_table_locks[slot]); goto done; + } next: num++; } + spin_unlock(&unix_table_locks[slot]); } done: - spin_unlock(&unix_table_lock); cb->args[0] = slot; cb->args[1] = num; @@ -237,21 +239,19 @@ static int unix_diag_dump(struct sk_buff *skb, struct netlink_callback *cb) static struct sock *unix_lookup_by_ino(unsigned int ino) { - int i; struct sock *sk; + int i; - spin_lock(&unix_table_lock); for (i = 0; i < ARRAY_SIZE(unix_socket_table); i++) { + spin_lock(&unix_table_locks[i]); sk_for_each(sk, &unix_socket_table[i]) if (ino == sock_i_ino(sk)) { sock_hold(sk); - spin_unlock(&unix_table_lock); - + spin_unlock(&unix_table_locks[i]); return sk; } + spin_unlock(&unix_table_locks[i]); } - - spin_unlock(&unix_table_lock); return NULL; } From patchwork Sun Nov 14 01:24:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Iwashima, Kuniyuki" X-Patchwork-Id: 12617775 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8383EC433EF for ; Sun, 14 Nov 2021 01:28:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5984A61152 for ; Sun, 14 Nov 2021 01:28:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236065AbhKNBa6 (ORCPT ); Sat, 13 Nov 2021 20:30:58 -0500 Received: from smtp-fw-9102.amazon.com ([207.171.184.29]:40632 "EHLO smtp-fw-9102.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235402AbhKNBa5 (ORCPT ); Sat, 13 Nov 2021 20:30:57 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.co.jp; i=@amazon.co.jp; q=dns/txt; s=amazon201209; t=1636853283; x=1668389283; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=5pdCFsQjQMhKEEm2bcF9nCXaBrS7UoI3ZIOqmQlanqI=; b=BkiaQO12zUQttb6khZtMOyLeoV2b4vmn4HjN4ZYI15gRHcO3HfWIWk8P FUxXcdWYl1A/AflDPFqa1yKR7OnKAGqXIuW86AOG+XduZ8yP6v57Wyv4r 4ZrX6CHPsoKx2l4gkJCNRrGqa2CcejmZVWlajf2tRktoPf7EKmVwnYzMV 0=; X-IronPort-AV: E=Sophos;i="5.87,233,1631577600"; d="scan'208";a="173835573" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO email-inbound-relay-pdx-2a-2dbf0206.us-west-2.amazon.com) ([10.25.36.214]) by smtp-border-fw-9102.sea19.amazon.com with ESMTP; 14 Nov 2021 01:28:03 +0000 Received: from EX13MTAUWB001.ant.amazon.com (pdx1-ws-svc-p6-lb9-vlan3.pdx.amazon.com [10.236.137.198]) by email-inbound-relay-pdx-2a-2dbf0206.us-west-2.amazon.com (Postfix) with ESMTPS id 9591CA284E; Sun, 14 Nov 2021 01:28:04 +0000 (UTC) Received: from EX13D04ANC001.ant.amazon.com (10.43.157.89) by EX13MTAUWB001.ant.amazon.com (10.43.161.249) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:28:03 +0000 Received: from 88665a182662.ant.amazon.com (10.43.160.241) by EX13D04ANC001.ant.amazon.com (10.43.157.89) with Microsoft SMTP Server (TLS) id 15.0.1497.26; Sun, 14 Nov 2021 01:27:54 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Jakub Kicinski CC: Eric Dumazet , Kuniyuki Iwashima , Kuniyuki Iwashima , "Benjamin Herrenschmidt" , Subject: [PATCH v2 net-next 13/13] af_unix: Relax race in unix_autobind(). Date: Sun, 14 Nov 2021 10:24:28 +0900 Message-ID: <20211114012428.81743-14-kuniyu@amazon.co.jp> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211114012428.81743-1-kuniyu@amazon.co.jp> References: <20211114012428.81743-1-kuniyu@amazon.co.jp> MIME-Version: 1.0 X-Originating-IP: [10.43.160.241] X-ClientProxiedBy: EX13D40UWC003.ant.amazon.com (10.43.162.246) To EX13D04ANC001.ant.amazon.com (10.43.157.89) Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org When we bind an AF_UNIX socket without a name specified, the kernel selects an available one from 0x00000 to 0xFFFFF. unix_autobind() starts searching from a number in the 'static' variable and increments it after acquiring two locks. If multiple processes try autobind, they obtain the same lock and check if a socket in the hash list has the same name. If not, one process uses it, and all except one end up retrying the _next_ number (actually not, it may be incremented by the other processes). The more we autobind sockets in parallel, the longer the latency gets. We can avoid such a race by searching for a name from a random number. These show latency in unix_autobind() while 64 CPUs are simultaneously autobind-ing 1024 sockets for each. Without this patch: usec : count distribution 0 : 1176 |*** | 2 : 3655 |*********** | 4 : 4094 |************* | 6 : 3831 |************ | 8 : 3829 |************ | 10 : 3844 |************ | 12 : 3638 |*********** | 14 : 2992 |********* | 16 : 2485 |******* | 18 : 2230 |******* | 20 : 2095 |****** | 22 : 1853 |***** | 24 : 1827 |***** | 26 : 1677 |***** | 28 : 1473 |**** | 30 : 1573 |***** | 32 : 1417 |**** | 34 : 1385 |**** | 36 : 1345 |**** | 38 : 1344 |**** | 40 : 1200 |*** | With this patch: usec : count distribution 0 : 1855 |****** | 2 : 6464 |********************* | 4 : 9936 |******************************** | 6 : 12107 |****************************************| 8 : 10441 |********************************** | 10 : 7264 |*********************** | 12 : 4254 |************** | 14 : 2538 |******** | 16 : 1596 |***** | 18 : 1088 |*** | 20 : 800 |** | 22 : 670 |** | 24 : 601 |* | 26 : 562 |* | 28 : 525 |* | 30 : 446 |* | 32 : 378 |* | 34 : 337 |* | 36 : 317 |* | 38 : 314 |* | 40 : 298 | | Signed-off-by: Kuniyuki Iwashima --- net/unix/af_unix.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 360f8dd901df..89a844e7141b 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1073,8 +1073,7 @@ static int unix_autobind(struct sock *sk) unsigned int new_hash, old_hash = sk->sk_hash; struct unix_sock *u = unix_sk(sk); struct unix_address *addr; - unsigned int retries = 0; - static u32 ordernum = 1; + u32 lastnum, ordernum; int err; err = mutex_lock_interruptible(&u->bindlock); @@ -1089,31 +1088,34 @@ static int unix_autobind(struct sock *sk) if (!addr) goto out; + addr->len = offsetof(struct sockaddr_un, sun_path) + 6; addr->name->sun_family = AF_UNIX; refcount_set(&addr->refcnt, 1); + lastnum = ordernum = prandom_u32(); + lastnum &= 0xFFFFF; retry: - addr->len = sprintf(addr->name->sun_path + 1, "%05x", ordernum) + - offsetof(struct sockaddr_un, sun_path) + 1; + ordernum = (ordernum + 1) & 0xFFFFF; + sprintf(addr->name->sun_path + 1, "%05x", ordernum); new_hash = unix_abstract_hash(addr->name, addr->len, sk->sk_type); unix_table_double_lock(old_hash, new_hash); - ordernum = (ordernum+1)&0xFFFFF; if (__unix_find_socket_byname(sock_net(sk), addr->name, addr->len, new_hash)) { unix_table_double_unlock(old_hash, new_hash); - /* - * __unix_find_socket_byname() may take long time if many names + /* __unix_find_socket_byname() may take long time if many names * are already in use. */ cond_resched(); - /* Give up if all names seems to be in use. */ - if (retries++ == 0xFFFFF) { + + if (ordernum == lastnum) { + /* Give up if all names seems to be in use. */ err = -ENOSPC; - kfree(addr); + unix_release_addr(addr); goto out; } + goto retry; }