From patchwork Fri Feb 23 21:39:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13570074 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-2101.amazon.com (smtp-fw-2101.amazon.com [72.21.196.25]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4A066EAD2 for ; Fri, 23 Feb 2024 21:40:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=72.21.196.25 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724454; cv=none; b=cqk+N4WV+B49woqlH3GUlY3QqY5nS5Mq6hTOT4Ufh+dPnGOJuVXylYrQMvnsYygtyaUbI/v8N/DU7t/GTlMRHGUOm6DhrcueQTmG70XEw6IOeLYy/VtBGoYRH2TbJ8XYfdNLSv+9kHrAYCCjaemERllz3PEZChI5Go5bWZxXt/s= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724454; c=relaxed/simple; bh=7p09WJiSGsXQo7PM/MWCnQsbBhKl3oZQNHb6ZPosCB0=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=GxyfpFklLcA659gTisdXbyaINNTVW42W+7vLNS38bP9QlplrZbsq+DPmijOrdKJvwp45IBu3fi5PX+o6Q7MB7lq79VpDamXIJvGWu9PF02R8kqZEhoreqkm7WmamqOkQNua2VxoGX0jWKk5kI91LKSMh0GZxuyHbC1kOu4VIN0w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=ltlEDx0s; arc=none smtp.client-ip=72.21.196.25 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="ltlEDx0s" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1708724453; x=1740260453; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Asgho1UWYenYA2GErEKKF8xNZYxLWHCdH/1HGwxHS7Y=; b=ltlEDx0sBl4aZ6xujnxLVICF7Pw8jy+l5ZArYqHhlvsNHXrHK/BQG2VP pzYJ4oCXFvp4qbSXm7umVLtXCOfXiilou8cjXxgBaVwqgwHc5N/uFeYur Ejx4lltqKAynanlG28UBqWust/OJy8Q2rXOStzYGyCv4aXwTamDucPwKf s=; X-IronPort-AV: E=Sophos;i="6.06,180,1705363200"; d="scan'208";a="383301843" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-2101.iad2.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Feb 2024 21:40:52 +0000 Received: from EX19MTAUWA002.ant.amazon.com [10.0.21.151:53400] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.9.241:2525] with esmtp (Farcaster) id f35c56ba-b17e-42f3-9159-d07836d89244; Fri, 23 Feb 2024 21:40:50 +0000 (UTC) X-Farcaster-Flow-ID: f35c56ba-b17e-42f3-9159-d07836d89244 Received: from EX19D004ANA003.ant.amazon.com (10.37.240.184) by EX19MTAUWA002.ant.amazon.com (10.250.64.202) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:40:48 +0000 Received: from 88665a182662.ant.amazon.com.com (10.106.100.9) by EX19D004ANA003.ant.amazon.com (10.37.240.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:40:46 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v3 net-next 01/14] af_unix: Allocate struct unix_vertex for each inflight AF_UNIX fd. Date: Fri, 23 Feb 2024 13:39:50 -0800 Message-ID: <20240223214003.17369-2-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240223214003.17369-1-kuniyu@amazon.com> References: <20240223214003.17369-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D039UWA002.ant.amazon.com (10.13.139.32) To EX19D004ANA003.ant.amazon.com (10.37.240.184) X-Patchwork-Delegate: kuba@kernel.org We will replace the garbage collection algorithm for AF_UNIX, where we will consider each inflight AF_UNIX socket as a vertex and its file descriptor as an edge in a directed graph. This patch introduces a new struct unix_vertex representing a vertex in the graph and adds its pointer to struct unix_sock. When we send a fd using the SCM_RIGHTS message, we allocate struct scm_fp_list to struct scm_cookie in scm_fp_copy(). Then, we bump each refcount of the inflight fds' struct file and save them in scm_fp_list.fp. After that, unix_attach_fds() inexplicably clones scm_fp_list of scm_cookie and sets it to skb. (We will remove this part after replacing GC.) Here, we add a new function call in unix_attach_fds() to preallocate struct unix_vertex per inflight AF_UNIX fd and link each vertex to skb's scm_fp_list.vertices. When sendmsg() succeeds later, if the socket of the inflight fd is still not inflight yet, we will set the preallocated vertex to struct unix_sock.vertex and link it to a global list unix_unvisited_vertices under spin_lock(&unix_gc_lock). If the socket is already inflight, we free the preallocated vertex. This is to avoid taking the lock unnecessarily when sendmsg() could fail later. In the following patch, we will similarly allocate another struct per edge, which will finally be linked to the inflight socket's unix_vertex.edges. And then, we will count the number of edges as unix_vertex.out_degree. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 9 +++++++++ include/net/scm.h | 3 +++ net/core/scm.c | 7 +++++++ net/unix/af_unix.c | 6 ++++++ net/unix/garbage.c | 38 ++++++++++++++++++++++++++++++++++++++ 5 files changed, 63 insertions(+) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index 627ea8e2d915..c270877a5256 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -22,9 +22,17 @@ extern unsigned int unix_tot_inflight; void unix_inflight(struct user_struct *user, struct file *fp); void unix_notinflight(struct user_struct *user, struct file *fp); +int unix_prepare_fpl(struct scm_fp_list *fpl); +void unix_destroy_fpl(struct scm_fp_list *fpl); void unix_gc(void); void wait_for_unix_gc(struct scm_fp_list *fpl); +struct unix_vertex { + struct list_head edges; + struct list_head entry; + unsigned long out_degree; +}; + struct sock *unix_peer_get(struct sock *sk); #define UNIX_HASH_MOD (256 - 1) @@ -62,6 +70,7 @@ struct unix_sock { struct path path; struct mutex iolock, bindlock; struct sock *peer; + struct unix_vertex *vertex; struct list_head link; unsigned long inflight; spinlock_t lock; diff --git a/include/net/scm.h b/include/net/scm.h index 92276a2c5543..e34321b6e204 100644 --- a/include/net/scm.h +++ b/include/net/scm.h @@ -27,6 +27,9 @@ struct scm_fp_list { short count; short count_unix; short max; +#ifdef CONFIG_UNIX + struct list_head vertices; +#endif struct user_struct *user; struct file *fp[SCM_MAX_FD]; }; diff --git a/net/core/scm.c b/net/core/scm.c index 9cd4b0a01cd6..87dfec1c3378 100644 --- a/net/core/scm.c +++ b/net/core/scm.c @@ -89,6 +89,9 @@ static int scm_fp_copy(struct cmsghdr *cmsg, struct scm_fp_list **fplp) fpl->count_unix = 0; fpl->max = SCM_MAX_FD; fpl->user = NULL; +#if IS_ENABLED(CONFIG_UNIX) + INIT_LIST_HEAD(&fpl->vertices); +#endif } fpp = &fpl->fp[fpl->count]; @@ -376,8 +379,12 @@ struct scm_fp_list *scm_fp_dup(struct scm_fp_list *fpl) if (new_fpl) { for (i = 0; i < fpl->count; i++) get_file(fpl->fp[i]); + new_fpl->max = new_fpl->count; new_fpl->user = get_uid(fpl->user); +#if IS_ENABLED(CONFIG_UNIX) + INIT_LIST_HEAD(&new_fpl->vertices); +#endif } return new_fpl; } diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 5b41e2321209..a3b25d311560 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -980,6 +980,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern, sk->sk_destruct = unix_sock_destructor; u = unix_sk(sk); u->inflight = 0; + u->vertex = NULL; u->path.dentry = NULL; u->path.mnt = NULL; spin_lock_init(&u->lock); @@ -1805,6 +1806,9 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb) for (i = scm->fp->count - 1; i >= 0; i--) unix_inflight(scm->fp->user, scm->fp->fp[i]); + if (unix_prepare_fpl(UNIXCB(skb).fp)) + return -ENOMEM; + return 0; } @@ -1815,6 +1819,8 @@ static void unix_detach_fds(struct scm_cookie *scm, struct sk_buff *skb) scm->fp = UNIXCB(skb).fp; UNIXCB(skb).fp = NULL; + unix_destroy_fpl(scm->fp); + for (i = scm->fp->count - 1; i >= 0; i--) unix_notinflight(scm->fp->user, scm->fp->fp[i]); } diff --git a/net/unix/garbage.c b/net/unix/garbage.c index fa39b6265238..75bdf66b81df 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -101,6 +101,44 @@ struct unix_sock *unix_get_socket(struct file *filp) return NULL; } +static void unix_free_vertices(struct scm_fp_list *fpl) +{ + struct unix_vertex *vertex, *next_vertex; + + list_for_each_entry_safe(vertex, next_vertex, &fpl->vertices, entry) { + list_del(&vertex->entry); + kfree(vertex); + } +} + +int unix_prepare_fpl(struct scm_fp_list *fpl) +{ + struct unix_vertex *vertex; + int i; + + if (!fpl->count_unix) + return 0; + + for (i = 0; i < fpl->count_unix; i++) { + vertex = kmalloc(sizeof(*vertex), GFP_KERNEL); + if (!vertex) + goto err; + + list_add(&vertex->entry, &fpl->vertices); + } + + return 0; + +err: + unix_free_vertices(fpl); + return -ENOMEM; +} + +void unix_destroy_fpl(struct scm_fp_list *fpl) +{ + unix_free_vertices(fpl); +} + DEFINE_SPINLOCK(unix_gc_lock); unsigned int unix_tot_inflight; static LIST_HEAD(gc_candidates); From patchwork Fri Feb 23 21:39:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13570075 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-52004.amazon.com (smtp-fw-52004.amazon.com [52.119.213.154]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 24CE9148FF5 for ; Fri, 23 Feb 2024 21:41:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.119.213.154 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724478; cv=none; b=c8dxTtzxueR8jyTEEpzwmAiakPJigKCGh9IsDfZcq8qryULzkGqQCLna+n/OzOo3EyyjDE5Jqk4DZ7/oyjkgO2C4EVXScsVXNjtoEbwG0FvUJMQvO+fsqAcIJOiuhm0898I9MEkDcveBXVz6PYewJr1Jy6e7qu96bkyJN8MzQB0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724478; c=relaxed/simple; bh=cvAFTB5Fl4ABRm6q8gAQa76ykQI5qyDDvhFnu3Nlluc=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=kzuw6m8n5WceNTvGXSxOuogGVn9RKz3J44QBOnAqhWjG1+gop/nDkJ0117GcpakvHz0jKeZUydAYA412YwvBjvbeaiFu38+O8CdvI9CzLz1gTLVUY+cFlz3K/7ujMVkypAFvmtSutVlljT2Q7hUKCo2tilBYj+Ei4nd90tNYNUU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=BOZwBzlr; arc=none smtp.client-ip=52.119.213.154 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="BOZwBzlr" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1708724478; x=1740260478; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=G1uch6QHxqxni8gJ+VBi5uykeMpQSVHkOLmEZAqaNWs=; b=BOZwBzlrh6sQi5NCd1QDrx8A1O5QdpWgKyuuWItjrRcCZsrv20IV9i2s IzxWiSRPowkRLSwRbxdFv1NGBzsfdWJNAHD0dAl44L5yQz1Ev0W8+VYFk uRAqMTXASys+AQyERzJOnWSUT4Ny1Z+mNO75Wah78hhVxD+ymJb2OuzLg c=; X-IronPort-AV: E=Sophos;i="6.06,180,1705363200"; d="scan'208";a="187028505" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.2]) by smtp-border-fw-52004.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Feb 2024 21:41:15 +0000 Received: from EX19MTAUWB001.ant.amazon.com [10.0.21.151:54755] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.16.177:2525] with esmtp (Farcaster) id 9e919f32-1b96-47f8-a703-b156e3e26cbe; Fri, 23 Feb 2024 21:41:13 +0000 (UTC) X-Farcaster-Flow-ID: 9e919f32-1b96-47f8-a703-b156e3e26cbe Received: from EX19D004ANA003.ant.amazon.com (10.37.240.184) by EX19MTAUWB001.ant.amazon.com (10.250.64.248) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:41:13 +0000 Received: from 88665a182662.ant.amazon.com.com (10.106.100.9) by EX19D004ANA003.ant.amazon.com (10.37.240.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:41:10 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v3 net-next 02/14] af_unix: Allocate struct unix_edge for each inflight AF_UNIX fd. Date: Fri, 23 Feb 2024 13:39:51 -0800 Message-ID: <20240223214003.17369-3-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240223214003.17369-1-kuniyu@amazon.com> References: <20240223214003.17369-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D040UWB001.ant.amazon.com (10.13.138.82) To EX19D004ANA003.ant.amazon.com (10.37.240.184) X-Patchwork-Delegate: kuba@kernel.org As with the previous patch, we preallocate to skb's scm_fp_list an array of struct unix_edge in the number of inflight AF_UNIX fds. There we just preallocate memory and do not use immediately because sendmsg() could fail after this point. The actual use will be in the next patch. When we queue skb with inflight edges, we will set the inflight socket's unix_sock as unix_edge->predecessor and the receiver's unix_sock as successor, and then we will link the edge to the inflight socket's unix_vertex.edges. Note that we set NULL to cloned scm_fp_list.edges in scm_fp_dup() so that MSG_PEEK does not change the shape of the directed graph. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 6 ++++++ include/net/scm.h | 5 +++++ net/core/scm.c | 2 ++ net/unix/garbage.c | 6 ++++++ 4 files changed, 19 insertions(+) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index c270877a5256..55c4abc26a71 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -33,6 +33,12 @@ struct unix_vertex { unsigned long out_degree; }; +struct unix_edge { + struct unix_sock *predecessor; + struct unix_sock *successor; + struct list_head vertex_entry; +}; + struct sock *unix_peer_get(struct sock *sk); #define UNIX_HASH_MOD (256 - 1) diff --git a/include/net/scm.h b/include/net/scm.h index e34321b6e204..5f5154e5096d 100644 --- a/include/net/scm.h +++ b/include/net/scm.h @@ -23,12 +23,17 @@ struct scm_creds { kgid_t gid; }; +#ifdef CONFIG_UNIX +struct unix_edge; +#endif + struct scm_fp_list { short count; short count_unix; short max; #ifdef CONFIG_UNIX struct list_head vertices; + struct unix_edge *edges; #endif struct user_struct *user; struct file *fp[SCM_MAX_FD]; diff --git a/net/core/scm.c b/net/core/scm.c index 87dfec1c3378..1bcc8a2d65e3 100644 --- a/net/core/scm.c +++ b/net/core/scm.c @@ -90,6 +90,7 @@ static int scm_fp_copy(struct cmsghdr *cmsg, struct scm_fp_list **fplp) fpl->max = SCM_MAX_FD; fpl->user = NULL; #if IS_ENABLED(CONFIG_UNIX) + fpl->edges = NULL; INIT_LIST_HEAD(&fpl->vertices); #endif } @@ -383,6 +384,7 @@ struct scm_fp_list *scm_fp_dup(struct scm_fp_list *fpl) new_fpl->max = new_fpl->count; new_fpl->user = get_uid(fpl->user); #if IS_ENABLED(CONFIG_UNIX) + new_fpl->edges = NULL; INIT_LIST_HEAD(&new_fpl->vertices); #endif } diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 75bdf66b81df..f31917683288 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -127,6 +127,11 @@ int unix_prepare_fpl(struct scm_fp_list *fpl) list_add(&vertex->entry, &fpl->vertices); } + fpl->edges = kvmalloc_array(fpl->count_unix, sizeof(*fpl->edges), + GFP_KERNEL_ACCOUNT); + if (!fpl->edges) + goto err; + return 0; err: @@ -136,6 +141,7 @@ int unix_prepare_fpl(struct scm_fp_list *fpl) void unix_destroy_fpl(struct scm_fp_list *fpl) { + kvfree(fpl->edges); unix_free_vertices(fpl); } From patchwork Fri Feb 23 21:39:52 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13570076 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-80007.amazon.com (smtp-fw-80007.amazon.com [99.78.197.218]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A2FE81482FF for ; Fri, 23 Feb 2024 21:41:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=99.78.197.218 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724511; cv=none; b=nuSsEf3r+Nny6F5ekrkq9e3Rks5xGBNS7dyjCPDzRPhlsxuhN6Yz9O+cRyIEeJXizhh3XKhTvfx7RueXrrvc9bR3PlFL1dO0h4+L3q/z76XKReBZFioccKyJOdBqFX+D1kbJhzGD0wA/pEBbKrYxwvDD2nTygTxYTAdaIXQHVVs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724511; c=relaxed/simple; bh=1dS5+q7o16YsyA8Ujzw2IuOZ5C+k31h2No3aMPMLgtk=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=dh9zHBriYiTwp/Crs1KnLxr5X9I+hg85CZN0pzVQaa0lpubcMUu0+rqvCHdQ2ZbiMQl//WQ6t+lBpqV0kcGeAU9wXk3vh+7WoCOA4UEUb0ekp+pA1NsTz2tAe42XE/OJt+GaiyL7J/XuDUqo55IKRPvHFp/rjk1gTmVeRItMcOk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=WIYGlJ7W; arc=none smtp.client-ip=99.78.197.218 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="WIYGlJ7W" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1708724506; x=1740260506; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=gxji3HwRYE4HfwiUnJqa3MedZdSXiM7XP8yeq70qusU=; b=WIYGlJ7WO/+6DYEI0qQM/y/82dryAOc7oZceax1HvevisYnZd7gwF+5N 4Rx8Z/5AhK9Ujkupl8YUX5JSXQupezm35t3ksbDqKDORlESQLeJ3Z2Up8 nr1vLQLOlhK6l7C+yRbk6R1uC1kbe7jVXHXgu0Sjpni1S0RZJpwAZWMsQ I=; X-IronPort-AV: E=Sophos;i="6.06,180,1705363200"; d="scan'208";a="276374121" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.210]) by smtp-border-fw-80007.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Feb 2024 21:41:44 +0000 Received: from EX19MTAUWC002.ant.amazon.com [10.0.38.20:26121] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.23.221:2525] with esmtp (Farcaster) id 610684b4-59b7-4ba8-bdde-b6be79c4cb09; Fri, 23 Feb 2024 21:41:43 +0000 (UTC) X-Farcaster-Flow-ID: 610684b4-59b7-4ba8-bdde-b6be79c4cb09 Received: from EX19D004ANA003.ant.amazon.com (10.37.240.184) by EX19MTAUWC002.ant.amazon.com (10.250.64.143) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:41:38 +0000 Received: from 88665a182662.ant.amazon.com.com (10.106.100.9) by EX19D004ANA003.ant.amazon.com (10.37.240.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:41:35 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v3 net-next 03/14] af_unix: Link struct unix_edge when queuing skb. Date: Fri, 23 Feb 2024 13:39:52 -0800 Message-ID: <20240223214003.17369-4-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240223214003.17369-1-kuniyu@amazon.com> References: <20240223214003.17369-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D033UWA003.ant.amazon.com (10.13.139.42) To EX19D004ANA003.ant.amazon.com (10.37.240.184) X-Patchwork-Delegate: kuba@kernel.org Just before queuing skb with inflight fds, we call scm_stat_add(), which is a good place to set up the preallocated struct unix_vertex and struct unix_edge in UNIXCB(skb).fp. Then, we call unix_add_edges() and construct the directed graph as follows: 1. Set the inflight socket's unix_sock to unix_edge.predecessor. 2. Set the receiver's unix_sock to unix_edge.successor. 3. Set the preallocated vertex to inflight socket's unix_sock.vertex. 4. Link inflight socket's unix_vertex.entry to unix_unvisited_vertices. 5. Link unix_edge.vertex_entry to the inflight socket's unix_vertex.edges. Let's say we pass the fd of AF_UNIX socket A to B and the fd of B to C. The graph looks like this: +-------------------------+ | unix_unvisited_vertices | <-------------------------. +-------------------------+ | + | | +--------------+ +--------------+ | +--------------+ | | unix_sock A | <---. .---> | unix_sock B | <-|-. .---> | unix_sock C | | +--------------+ | | +--------------+ | | | +--------------+ | .-+ | vertex | | | .-+ | vertex | | | | | vertex | | | +--------------+ | | | +--------------+ | | | +--------------+ | | | | | | | | | | +--------------+ | | | +--------------+ | | | | '-> | unix_vertex | | | '-> | unix_vertex | | | | | +--------------+ | | +--------------+ | | | `---> | entry | +---------> | entry | +-' | | |--------------| | | |--------------| | | | edges | <-. | | | edges | <-. | | +--------------+ | | | +--------------+ | | | | | | | | | .----------------------' | | .----------------------' | | | | | | | | | +--------------+ | | | +--------------+ | | | | unix_edge | | | | | unix_edge | | | | +--------------+ | | | +--------------+ | | `-> | vertex_entry | | | `-> | vertex_entry | | | |--------------| | | |--------------| | | | predecessor | +---' | | predecessor | +---' | |--------------| | |--------------| | | successor | +-----' | successor | +-----' +--------------+ +--------------+ Henceforth, we denote such a graph as A -> B (-> C). Now, we can express all inflight fd graphs that do not contain embryo sockets. We will support the particular case later. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 2 + include/net/scm.h | 1 + net/core/scm.c | 2 + net/unix/af_unix.c | 8 +++- net/unix/garbage.c | 94 ++++++++++++++++++++++++++++++++++++++++++- 5 files changed, 104 insertions(+), 3 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index 55c4abc26a71..f31ad1166346 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -22,6 +22,8 @@ extern unsigned int unix_tot_inflight; void unix_inflight(struct user_struct *user, struct file *fp); void unix_notinflight(struct user_struct *user, struct file *fp); +void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver); +void unix_del_edges(struct scm_fp_list *fpl); int unix_prepare_fpl(struct scm_fp_list *fpl); void unix_destroy_fpl(struct scm_fp_list *fpl); void unix_gc(void); diff --git a/include/net/scm.h b/include/net/scm.h index 5f5154e5096d..bbc5527809d1 100644 --- a/include/net/scm.h +++ b/include/net/scm.h @@ -32,6 +32,7 @@ struct scm_fp_list { short count_unix; short max; #ifdef CONFIG_UNIX + bool inflight; struct list_head vertices; struct unix_edge *edges; #endif diff --git a/net/core/scm.c b/net/core/scm.c index 1bcc8a2d65e3..5763f3320358 100644 --- a/net/core/scm.c +++ b/net/core/scm.c @@ -90,6 +90,7 @@ static int scm_fp_copy(struct cmsghdr *cmsg, struct scm_fp_list **fplp) fpl->max = SCM_MAX_FD; fpl->user = NULL; #if IS_ENABLED(CONFIG_UNIX) + fpl->inflight = false; fpl->edges = NULL; INIT_LIST_HEAD(&fpl->vertices); #endif @@ -384,6 +385,7 @@ struct scm_fp_list *scm_fp_dup(struct scm_fp_list *fpl) new_fpl->max = new_fpl->count; new_fpl->user = get_uid(fpl->user); #if IS_ENABLED(CONFIG_UNIX) + new_fpl->inflight = false; new_fpl->edges = NULL; INIT_LIST_HEAD(&new_fpl->vertices); #endif diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index a3b25d311560..24adbc4d5188 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1943,8 +1943,10 @@ static void scm_stat_add(struct sock *sk, struct sk_buff *skb) struct scm_fp_list *fp = UNIXCB(skb).fp; struct unix_sock *u = unix_sk(sk); - if (unlikely(fp && fp->count)) + if (unlikely(fp && fp->count)) { atomic_add(fp->count, &u->scm_stat.nr_fds); + unix_add_edges(fp, u); + } } static void scm_stat_del(struct sock *sk, struct sk_buff *skb) @@ -1952,8 +1954,10 @@ static void scm_stat_del(struct sock *sk, struct sk_buff *skb) struct scm_fp_list *fp = UNIXCB(skb).fp; struct unix_sock *u = unix_sk(sk); - if (unlikely(fp && fp->count)) + if (unlikely(fp && fp->count)) { atomic_sub(fp->count, &u->scm_stat.nr_fds); + unix_del_edges(fp); + } } /* diff --git a/net/unix/garbage.c b/net/unix/garbage.c index f31917683288..96d0b1db3638 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -101,6 +101,42 @@ struct unix_sock *unix_get_socket(struct file *filp) return NULL; } +static LIST_HEAD(unix_unvisited_vertices); + +static void unix_add_edge(struct scm_fp_list *fpl, struct unix_edge *edge) +{ + struct unix_vertex *vertex = edge->predecessor->vertex; + + if (!vertex) { + vertex = list_first_entry(&fpl->vertices, typeof(*vertex), entry); + vertex->out_degree = 0; + INIT_LIST_HEAD(&vertex->edges); + + list_move_tail(&vertex->entry, &unix_unvisited_vertices); + + edge->predecessor->vertex = vertex; + } + + vertex->out_degree++; + + list_add_tail(&edge->vertex_entry, &vertex->edges); +} + +static void unix_del_edge(struct scm_fp_list *fpl, struct unix_edge *edge) +{ + struct unix_vertex *vertex = edge->predecessor->vertex; + + list_del(&edge->vertex_entry); + + vertex->out_degree--; + + if (!vertex->out_degree) { + edge->predecessor->vertex = NULL; + + list_move_tail(&vertex->entry, &fpl->vertices); + } +} + static void unix_free_vertices(struct scm_fp_list *fpl) { struct unix_vertex *vertex, *next_vertex; @@ -111,6 +147,60 @@ static void unix_free_vertices(struct scm_fp_list *fpl) } } +DEFINE_SPINLOCK(unix_gc_lock); + +void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver) +{ + int i = 0, j = 0; + + spin_lock(&unix_gc_lock); + + if (!fpl->count_unix) + goto out; + + do { + struct unix_sock *inflight = unix_get_socket(fpl->fp[j++]); + struct unix_edge *edge; + + if (!inflight) + continue; + + edge = fpl->edges + i++; + edge->predecessor = inflight; + edge->successor = receiver; + + unix_add_edge(fpl, edge); + } while (i < fpl->count_unix); + +out: + spin_unlock(&unix_gc_lock); + + fpl->inflight = true; + + unix_free_vertices(fpl); +} + +void unix_del_edges(struct scm_fp_list *fpl) +{ + int i = 0; + + spin_lock(&unix_gc_lock); + + if (!fpl->count_unix) + goto out; + + do { + struct unix_edge *edge = fpl->edges + i++; + + unix_del_edge(fpl, edge); + } while (i < fpl->count_unix); + +out: + spin_unlock(&unix_gc_lock); + + fpl->inflight = false; +} + int unix_prepare_fpl(struct scm_fp_list *fpl) { struct unix_vertex *vertex; @@ -141,11 +231,13 @@ int unix_prepare_fpl(struct scm_fp_list *fpl) void unix_destroy_fpl(struct scm_fp_list *fpl) { + if (fpl->inflight) + unix_del_edges(fpl); + kvfree(fpl->edges); unix_free_vertices(fpl); } -DEFINE_SPINLOCK(unix_gc_lock); unsigned int unix_tot_inflight; static LIST_HEAD(gc_candidates); static LIST_HEAD(gc_inflight_list); From patchwork Fri Feb 23 21:39:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13570077 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-52005.amazon.com (smtp-fw-52005.amazon.com [52.119.213.156]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DE2D91482FF for ; Fri, 23 Feb 2024 21:42:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.119.213.156 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724529; cv=none; b=X9lHiLZBXoQaVSpAcejyn4UYXbCrYbnir1XM7GVguz6TtKbTdHyYVcTXQbgmsUVPTwIHWKorSJNq5th1BfJVXCaBxnkaB7KmeXx9pCWQakw2DkrsuS5mQ+VQRifKEvd8uZMocFUwfTem2f0Xd37nPPfXw06liHArpMsEgBd6lJs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724529; c=relaxed/simple; bh=Ld/5sYAYRw1Zw8XG5+YHCANJ8kpVak7SmeZa89lKADI=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=nwCEgRrwnHIlmUbG9VatjOFE/nEtUPYtFByBaWvv9ySekFw4QUYmA0GKDMvMjRzWJRukR7l8+x57BME3YsCuJmE6znJDERwGphQxia859H213oTmA3TRhQr8VWEHIom89cCdNTMkZzoOQHv6718sSuDK4oha7P+G7ylxE6jxoEA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=HbSD9B0F; arc=none smtp.client-ip=52.119.213.156 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="HbSD9B0F" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1708724528; x=1740260528; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=mG4DeBi49S1sIjflof/dHW+o8jtrJoR5nuFVVmk7y7o=; b=HbSD9B0F4itN+RTNcVfhuvjerxKXaeZV/DHa/8xDEiBOvCWkOhbZmYWP txwhTmkSRY7/enomKhCxii3hf6kYbZpQdVjYkHtge+5h25K6qghuIaNf6 fQv7Pz8LoDUwcfyqPf/QX1Xit0mezoLG5TbvscObxLrj6O7DE/MAr/rwW E=; X-IronPort-AV: E=Sophos;i="6.06,180,1705363200"; d="scan'208";a="636425581" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-52005.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Feb 2024 21:42:05 +0000 Received: from EX19MTAUWA001.ant.amazon.com [10.0.21.151:58470] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.6.154:2525] with esmtp (Farcaster) id 1c119320-54de-4949-a169-305254924cfb; Fri, 23 Feb 2024 21:42:03 +0000 (UTC) X-Farcaster-Flow-ID: 1c119320-54de-4949-a169-305254924cfb Received: from EX19D004ANA003.ant.amazon.com (10.37.240.184) by EX19MTAUWA001.ant.amazon.com (10.250.64.204) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:42:03 +0000 Received: from 88665a182662.ant.amazon.com.com (10.106.100.9) by EX19D004ANA003.ant.amazon.com (10.37.240.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:42:00 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v3 net-next 04/14] af_unix: Bulk update unix_tot_inflight/unix_inflight when queuing skb. Date: Fri, 23 Feb 2024 13:39:53 -0800 Message-ID: <20240223214003.17369-5-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240223214003.17369-1-kuniyu@amazon.com> References: <20240223214003.17369-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D044UWB004.ant.amazon.com (10.13.139.134) To EX19D004ANA003.ant.amazon.com (10.37.240.184) X-Patchwork-Delegate: kuba@kernel.org Currently, we track the number of inflight sockets in two variables. unix_tot_inflight is the total number of inflight AF_UNIX sockets on the host, and user->unix_inflight is the number of inflight fds per user. We update them one by one in unix_inflight(), which can be done once in batch. Also, sendmsg() could fail even after unix_inflight(), then we need to acquire unix_gc_lock only to decrement the counters. Let's bulk update the counters in unix_add_edges() and unix_del_edges(), which is called only for successfully passed fds. Signed-off-by: Kuniyuki Iwashima --- net/unix/garbage.c | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 96d0b1db3638..e8fe08796d02 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -148,6 +148,7 @@ static void unix_free_vertices(struct scm_fp_list *fpl) } DEFINE_SPINLOCK(unix_gc_lock); +unsigned int unix_tot_inflight; void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver) { @@ -172,7 +173,10 @@ void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver) unix_add_edge(fpl, edge); } while (i < fpl->count_unix); + WRITE_ONCE(unix_tot_inflight, unix_tot_inflight + fpl->count_unix); out: + WRITE_ONCE(fpl->user->unix_inflight, fpl->user->unix_inflight + fpl->count); + spin_unlock(&unix_gc_lock); fpl->inflight = true; @@ -195,7 +199,10 @@ void unix_del_edges(struct scm_fp_list *fpl) unix_del_edge(fpl, edge); } while (i < fpl->count_unix); + WRITE_ONCE(unix_tot_inflight, unix_tot_inflight - fpl->count_unix); out: + WRITE_ONCE(fpl->user->unix_inflight, fpl->user->unix_inflight - fpl->count); + spin_unlock(&unix_gc_lock); fpl->inflight = false; @@ -238,7 +245,6 @@ void unix_destroy_fpl(struct scm_fp_list *fpl) unix_free_vertices(fpl); } -unsigned int unix_tot_inflight; static LIST_HEAD(gc_candidates); static LIST_HEAD(gc_inflight_list); @@ -259,13 +265,8 @@ void unix_inflight(struct user_struct *user, struct file *filp) WARN_ON_ONCE(list_empty(&u->link)); } u->inflight++; - - /* Paired with READ_ONCE() in wait_for_unix_gc() */ - WRITE_ONCE(unix_tot_inflight, unix_tot_inflight + 1); } - WRITE_ONCE(user->unix_inflight, user->unix_inflight + 1); - spin_unlock(&unix_gc_lock); } @@ -282,13 +283,8 @@ void unix_notinflight(struct user_struct *user, struct file *filp) u->inflight--; if (!u->inflight) list_del_init(&u->link); - - /* Paired with READ_ONCE() in wait_for_unix_gc() */ - WRITE_ONCE(unix_tot_inflight, unix_tot_inflight - 1); } - WRITE_ONCE(user->unix_inflight, user->unix_inflight - 1); - spin_unlock(&unix_gc_lock); } From patchwork Fri Feb 23 21:39:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13570078 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-9102.amazon.com (smtp-fw-9102.amazon.com [207.171.184.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0A7A1149381 for ; Fri, 23 Feb 2024 21:42:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=207.171.184.29 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724556; cv=none; b=onShzb/+I3Q/yuAqHBE8+oA0L3ecZ9nQj2B/TYvmFJKiPFWE/ordfZ5DwbWaIUvVC56KCrUa/c3U+J0KaDdLxfaIskmgz10p1N4G8k6KqYRpz58mwvdz3y/fzQ3qdPprSC0sFDk5oL1RaY6RFq78GBPbDVkRU/yfDYIUOuIBT3k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724556; c=relaxed/simple; bh=aASPhid2LXOKaHzGvFhPZoMc1fQ0ybWirfd9mtapSdc=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=UOjKVTxIeK7kDE4Jil+9xnuQykWoxddRu+iW0tEQyFYQ/Iic6ynHLuj1fTkv/CTWqg7S4FdPPB2gREX/oZFQr1VuZKuvIAyPDGPVsVmgw57XrD/67d4KdtueKd0+lvC0S4gZMmyTce7CDTsvqvPZzsRfuqraCDn/e30K/WjN4K0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=vOVmRBCU; arc=none smtp.client-ip=207.171.184.29 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="vOVmRBCU" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1708724555; x=1740260555; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=BnG5PsxcueZutSFH2g3HBzD2OJzM6nxVNAzArujreVI=; b=vOVmRBCUQRdYN7Ws2T36Hp7pbs2Bw4hWUb8eQSe9+nqj/cLw1ZGxbeH/ pnWpTr4dggJlntXI8Ye9EL4WtPDzSnICPU4qRDxpsfEe3r+EhegOeWrl7 SUaJdvrpNMyQLxnmNV0vZvFhEX2vpqpUy6AD1GREzfdImxusLuaX17BPa Y=; X-IronPort-AV: E=Sophos;i="6.06,180,1705363200"; d="scan'208";a="399310788" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.214]) by smtp-border-fw-9102.sea19.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Feb 2024 21:42:28 +0000 Received: from EX19MTAUWC001.ant.amazon.com [10.0.21.151:55520] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.52.236:2525] with esmtp (Farcaster) id 6559d1f8-9ff2-470a-bc5e-1629ea711940; Fri, 23 Feb 2024 21:42:28 +0000 (UTC) X-Farcaster-Flow-ID: 6559d1f8-9ff2-470a-bc5e-1629ea711940 Received: from EX19D004ANA003.ant.amazon.com (10.37.240.184) by EX19MTAUWC001.ant.amazon.com (10.250.64.174) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:42:27 +0000 Received: from 88665a182662.ant.amazon.com.com (10.106.100.9) by EX19D004ANA003.ant.amazon.com (10.37.240.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:42:25 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v3 net-next 05/14] af_unix: Detect Strongly Connected Components. Date: Fri, 23 Feb 2024 13:39:54 -0800 Message-ID: <20240223214003.17369-6-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240223214003.17369-1-kuniyu@amazon.com> References: <20240223214003.17369-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D041UWB003.ant.amazon.com (10.13.139.176) To EX19D004ANA003.ant.amazon.com (10.37.240.184) X-Patchwork-Delegate: kuba@kernel.org In the new GC, we use a simple graph algorithm, Tarjan's Strongly Connected Components (SCC) algorithm, to find cyclic references. The algorithm visits every vertex exactly once using depth-first search (DFS). We implement it without recursion so that no one can abuse it. There could be multiple graphs, so we iterate unix_unvisited_vertices in unix_walk_scc() and do DFS in __unix_walk_scc(), where we move visited vertices to another list, unix_visited_vertices, not to restart DFS twice on a visited vertex later in unix_walk_scc(). DFS starts by pushing an input vertex to a stack and assigning it a unique number. Two fields, index and lowlink, are initialised with the number, but lowlink could be updated later during DFS. If a vertex has an edge to an unvisited inflight vertex, we visit it and do the same processing. So, we will have vertices in the stack in the order they appear and number them consecutively in the same order. If a vertex has a back-edge to a visited vertex in the stack, we update the predecessor's lowlink with the successor's index. After iterating edges from the vertex, we check if its index equals its lowlink. If the lowlink is different from the index, it shows there was a back-edge. Then, we propagate the lowlink to its predecessor and go back to the predecessor to resume checking from the next edge of the back-edge. If the lowlink is the same as the index, we pop vertices before and including the vertex from the stack. Then, the set of vertices is SCC, possibly forming a cycle. At the same time, we move the vertices to unix_visited_vertices. When we finish the algorithm, all vertices in each SCC will be linked via unix_vertex.scc_entry. Let's take an example. We have a graph including five inflight vertices (F is not inflight): A -> B -> C -> D -> E (-> F) ^ | `---------' Suppose that we start DFS from C. We will visit C, D, and B first and initialise their index and lowlink. Then, the stack looks like this: > B = (3, 3) (index, lowlink) D = (2, 2) C = (1, 1) When checking B's edge to C, we update B's lowlink with C's index and propagate it to D. B = (3, 1) (index, lowlink) > D = (2, 1) C = (1, 1) Next, we visit E, which has no edge to an inflight vertex. > E = (4, 4) (index, lowlink) B = (3, 1) D = (2, 1) C = (1, 1) When we leave from E, its index and lowlink are the same, so we pop E from the stack as single-vertex SCC. Next, we leave from D but do nothing because its lowlink is different from its index. B = (3, 1) (index, lowlink) D = (2, 1) > C = (1, 1) Then, we leave from C, whose index and lowlink are the same, so we pop B, D and C as SCC. Last, we do DFS for the rest of vertices, A, which is also a single-vertex SCC. Finally, each unix_vertex.scc_entry is linked as follows: A -. B -> C -> D E -. ^ | ^ | ^ | `--' `---------' `--' We use SCC later to decide whether we can garbage-collect the sockets. Note that we still cannot detect SCC properly if an edge points to an embryo socket. The following two patches will sort it out. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 5 +++ net/unix/garbage.c | 82 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 87 insertions(+) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index f31ad1166346..67736767b616 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -32,13 +32,18 @@ void wait_for_unix_gc(struct scm_fp_list *fpl); struct unix_vertex { struct list_head edges; struct list_head entry; + struct list_head scc_entry; unsigned long out_degree; + unsigned long index; + unsigned long lowlink; + bool on_stack; }; struct unix_edge { struct unix_sock *predecessor; struct unix_sock *successor; struct list_head vertex_entry; + struct list_head stack_entry; }; struct sock *unix_peer_get(struct sock *sk); diff --git a/net/unix/garbage.c b/net/unix/garbage.c index e8fe08796d02..7e90663513f9 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -103,6 +103,11 @@ struct unix_sock *unix_get_socket(struct file *filp) static LIST_HEAD(unix_unvisited_vertices); +enum unix_vertex_index { + UNIX_VERTEX_INDEX_UNVISITED, + UNIX_VERTEX_INDEX_START, +}; + static void unix_add_edge(struct scm_fp_list *fpl, struct unix_edge *edge) { struct unix_vertex *vertex = edge->predecessor->vertex; @@ -245,6 +250,81 @@ void unix_destroy_fpl(struct scm_fp_list *fpl) unix_free_vertices(fpl); } +static LIST_HEAD(unix_visited_vertices); + +static void __unix_walk_scc(struct unix_vertex *vertex) +{ + unsigned long index = UNIX_VERTEX_INDEX_START; + LIST_HEAD(vertex_stack); + struct unix_edge *edge; + LIST_HEAD(edge_stack); + +next_vertex: + vertex->index = index; + vertex->lowlink = index; + index++; + + vertex->on_stack = true; + list_add(&vertex->scc_entry, &vertex_stack); + + list_for_each_entry(edge, &vertex->edges, vertex_entry) { + struct unix_vertex *next_vertex = edge->successor->vertex; + + if (!next_vertex) + continue; + + if (next_vertex->index == UNIX_VERTEX_INDEX_UNVISITED) { + list_add(&edge->stack_entry, &edge_stack); + + vertex = next_vertex; + goto next_vertex; +prev_vertex: + next_vertex = vertex; + + edge = list_first_entry(&edge_stack, typeof(*edge), stack_entry); + list_del_init(&edge->stack_entry); + + vertex = edge->predecessor->vertex; + + vertex->lowlink = min(vertex->lowlink, next_vertex->lowlink); + } else if (edge->successor->vertex->on_stack) { + vertex->lowlink = min(vertex->lowlink, next_vertex->index); + } + } + + if (vertex->index == vertex->lowlink) { + struct list_head scc; + + __list_cut_position(&scc, &vertex_stack, &vertex->scc_entry); + + list_for_each_entry_reverse(vertex, &scc, scc_entry) { + list_move_tail(&vertex->entry, &unix_visited_vertices); + + vertex->on_stack = false; + } + + list_del(&scc); + } + + if (!list_empty(&edge_stack)) + goto prev_vertex; +} + +static void unix_walk_scc(void) +{ + struct unix_vertex *vertex; + + list_for_each_entry(vertex, &unix_unvisited_vertices, entry) + vertex->index = UNIX_VERTEX_INDEX_UNVISITED; + + while (!list_empty(&unix_unvisited_vertices)) { + vertex = list_first_entry(&unix_unvisited_vertices, typeof(*vertex), entry); + __unix_walk_scc(vertex); + } + + list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices); +} + static LIST_HEAD(gc_candidates); static LIST_HEAD(gc_inflight_list); @@ -392,6 +472,8 @@ static void __unix_gc(struct work_struct *work) spin_lock(&unix_gc_lock); + unix_walk_scc(); + /* First, select candidates for garbage collection. Only * in-flight sockets are considered, and from those only ones * which don't have any external reference. From patchwork Fri Feb 23 21:39:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13570079 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-80007.amazon.com (smtp-fw-80007.amazon.com [99.78.197.218]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B1E5714601D for ; Fri, 23 Feb 2024 21:42:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=99.78.197.218 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724575; cv=none; b=VlPBZjLr3v3J1BHZvqFXW8VfCp7MM0rzOghQVzuDfVqd+2fO5vTRFJyhZVuqQOFmNxopIMuAf/6CAdKUVDPVMSFthzQcPvUuXLoGjUsqVKDX/dQUUVwEsNi81TNXG6ndWYmG1XVrEHEJ+V/87oMGg8qn0MyJPshFs535g7lPyYo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724575; c=relaxed/simple; bh=RZWS7glsf3FVPy/SJ7WMtTihIZZowfEc/e54NcYbMfY=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=QPOV3aaMOxYQgR8ycagQY/ADoLhMhl9nohoX2dO1UaP0dVSOL48EJrk3ptdm4/KV8XcPlSemuzU7kGlx9XTUnhnwT+2rILe1zMnyvUD8v/oC7RoM8kM5lw5SxpSRdkstWNJXsm6pdUN795vJ3Ra6wSrqfogQZSOWKHVsecDDN1I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=dNoy2udJ; arc=none smtp.client-ip=99.78.197.218 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="dNoy2udJ" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1708724573; x=1740260573; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=RClNWoP4mmxHjeK91vF3Qzm5FnqmxwlH+sv+IDMvaUA=; b=dNoy2udJPOd4XL/hu3kcEQFemNxaFYGq1W/6Mvwb34G357BaUVBs3Eb4 C7PqSWPzcuAsVPFK1VHbIoo1lGB8/9MPJaLsI2NCFwCJ5226lFXo0R06b 0W2gkojOPpKl94T1k+79xI1rhN9i+gqBgqZcKnUs2QC0zH8+uE/1ZwxNW g=; X-IronPort-AV: E=Sophos;i="6.06,180,1705363200"; d="scan'208";a="276374340" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.210]) by smtp-border-fw-80007.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Feb 2024 21:42:53 +0000 Received: from EX19MTAUWA001.ant.amazon.com [10.0.21.151:20533] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.16.229:2525] with esmtp (Farcaster) id 267d6ebe-0d19-4c61-82b7-d240ff128a3c; Fri, 23 Feb 2024 21:42:53 +0000 (UTC) X-Farcaster-Flow-ID: 267d6ebe-0d19-4c61-82b7-d240ff128a3c Received: from EX19D004ANA003.ant.amazon.com (10.37.240.184) by EX19MTAUWA001.ant.amazon.com (10.250.64.204) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:42:52 +0000 Received: from 88665a182662.ant.amazon.com.com (10.106.100.9) by EX19D004ANA003.ant.amazon.com (10.37.240.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:42:50 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v3 net-next 06/14] af_unix: Save listener for embryo socket. Date: Fri, 23 Feb 2024 13:39:55 -0800 Message-ID: <20240223214003.17369-7-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240223214003.17369-1-kuniyu@amazon.com> References: <20240223214003.17369-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D042UWA002.ant.amazon.com (10.13.139.17) To EX19D004ANA003.ant.amazon.com (10.37.240.184) X-Patchwork-Delegate: kuba@kernel.org This is a prep patch for the following change, where we need to fetch the listening socket from the successor embryo socket during GC. We add a new field to struct unix_sock to save a pointer to a listening socket. We set it when connect() creates a new socket, and clear it when accept() is called. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 1 + net/unix/af_unix.c | 5 ++++- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index 67736767b616..dc7469191195 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -83,6 +83,7 @@ struct unix_sock { struct path path; struct mutex iolock, bindlock; struct sock *peer; + struct sock *listener; struct unix_vertex *vertex; struct list_head link; unsigned long inflight; diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 24adbc4d5188..af74e7ebc35a 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -979,6 +979,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern, sk->sk_max_ack_backlog = net->unx.sysctl_max_dgram_qlen; sk->sk_destruct = unix_sock_destructor; u = unix_sk(sk); + u->listener = NULL; u->inflight = 0; u->vertex = NULL; u->path.dentry = NULL; @@ -1598,6 +1599,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr, newsk->sk_type = sk->sk_type; init_peercred(newsk); newu = unix_sk(newsk); + newu->listener = other; RCU_INIT_POINTER(newsk->sk_wq, &newu->peer_wq); otheru = unix_sk(other); @@ -1693,8 +1695,8 @@ static int unix_accept(struct socket *sock, struct socket *newsock, int flags, bool kern) { struct sock *sk = sock->sk; - struct sock *tsk; struct sk_buff *skb; + struct sock *tsk; int err; err = -EOPNOTSUPP; @@ -1719,6 +1721,7 @@ static int unix_accept(struct socket *sock, struct socket *newsock, int flags, } tsk = skb->sk; + unix_sk(tsk)->listener = NULL; skb_free_datagram(sk, skb); wake_up_interruptible(&unix_sk(sk)->peer_wait); From patchwork Fri Feb 23 21:39:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13570080 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-33001.amazon.com (smtp-fw-33001.amazon.com [207.171.190.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 770B21493B1 for ; Fri, 23 Feb 2024 21:43:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=207.171.190.10 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724607; cv=none; b=EvPN3sHEgYRUdiz0y6VApGyxLR0MDmW/4a8Z1Z3RfjYhwa4+HjhNL4zlmEMKWq/DSQsCF07cPnO7hsNLBh2rqUjpnkAGzfgUMUXJRMknwErGRYUMGDHBq6URw4/ExBcW4kGl5sZNWCGB4aHtmzznvjBWZ1NIO0M+cjwCRexpcmQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724607; c=relaxed/simple; bh=6ZhfoUmnvONz2boA5kBsE7JjsODJes7pPYRWexiPZ1g=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=fNMhgJf87TbOwTFMFvi+ykYmKaCYYV06YjDhldWrTALN2ojgW7zVUxfd+EjqQOqNcNkgJ5D6hN77HxUhBnFBS/C7TNA9lMntIeJ2jnLc8Brz5JtL/in6atCOb7SOGZMOSC1JD00TXOc5RMeN0Xwxr5NGYGn9NedJIbQXhnnM8gk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=kz9LrX3h; arc=none smtp.client-ip=207.171.190.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="kz9LrX3h" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1708724606; x=1740260606; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=C56CLXSbZlJO7ok3iUUrZ2GmehLgNNNpw8l68Iz+sY4=; b=kz9LrX3hvft1nPR010sLNlBrrTpy4krFHLnUlxmabNfUbUJMmSvGn+jy NFqeNzIYIx9hL/OA/nVP+p92k4k6KXlGp/qCDXb0ZyyfIzK/k8V0rmeiO C6OiN99CdsiQtsx8wIABX3h0wDrM4SxLFDeS0o18pT8tKJYCM560RV1Gp 4=; X-IronPort-AV: E=Sophos;i="6.06,180,1705363200"; d="scan'208";a="328926918" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-33001.sea14.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Feb 2024 21:43:18 +0000 Received: from EX19MTAUWC001.ant.amazon.com [10.0.21.151:8181] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.40.213:2525] with esmtp (Farcaster) id bc2e215d-5f54-4738-bc35-f9f8b38e086e; Fri, 23 Feb 2024 21:43:17 +0000 (UTC) X-Farcaster-Flow-ID: bc2e215d-5f54-4738-bc35-f9f8b38e086e Received: from EX19D004ANA003.ant.amazon.com (10.37.240.184) by EX19MTAUWC001.ant.amazon.com (10.250.64.174) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:43:17 +0000 Received: from 88665a182662.ant.amazon.com.com (10.106.100.9) by EX19D004ANA003.ant.amazon.com (10.37.240.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1118.40; Fri, 23 Feb 2024 21:43:14 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v3 net-next 07/14] af_unix: Fix up unix_edge.successor for embryo socket. Date: Fri, 23 Feb 2024 13:39:56 -0800 Message-ID: <20240223214003.17369-8-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240223214003.17369-1-kuniyu@amazon.com> References: <20240223214003.17369-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D036UWC004.ant.amazon.com (10.13.139.205) To EX19D004ANA003.ant.amazon.com (10.37.240.184) X-Patchwork-Delegate: kuba@kernel.org To garbage collect inflight AF_UNIX sockets, we must define the cyclic reference appropriately. This is a bit tricky if the loop consists of embryo sockets. Suppose that the fd of AF_UNIX socket A is passed to D and the fd B to C and that C and D are embryo sockets of A and B, respectively. It may appear that there are two separate graphs, A (-> D) and B (-> C), but this is not correct. A --. .-- B X C <-' `-> D Now, D holds A's refcount, and C has B's refcount, so unix_release() will never be called for A and B when we close() them. However, no one can call close() for D and C to free skbs holding refcounts of A and B because C/D is in A/B's receive queue, which should have been purged by unix_release() for A and B. So, here's another type of cyclic reference. When a fd of an AF_UNIX socket is passed to an embryo socket, the reference is indirectly held by its parent listening socket. .-> A .-> B | `- sk_receive_queue | `- sk_receive_queue | `- skb | `- skb | `- sk == C | `- sk == D | `- sk_receive_queue | `- sk_receive_queue | `- skb +---------' `- skb +--. | | `----------------------------------------------------------' Technically, the graph must be denoted as A <-> B instead of A (-> D) and B (-> C) to find such a cyclic reference without touching each socket's receive queue. .-> A --. .-- B <-. | X | == A <-> B `-- C <-' `-> D --' We apply this fixup during GC by fetching the real successor by unix_edge_successor(). When we call accept(), we clear unix_sock.listener under unix_gc_lock not to confuse GC. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 1 + net/unix/af_unix.c | 2 +- net/unix/garbage.c | 17 ++++++++++++++++- 3 files changed, 18 insertions(+), 2 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index dc7469191195..414463803b7e 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -24,6 +24,7 @@ void unix_inflight(struct user_struct *user, struct file *fp); void unix_notinflight(struct user_struct *user, struct file *fp); void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver); void unix_del_edges(struct scm_fp_list *fpl); +void unix_update_edges(struct unix_sock *receiver); int unix_prepare_fpl(struct scm_fp_list *fpl); void unix_destroy_fpl(struct scm_fp_list *fpl); void unix_gc(void); diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index af74e7ebc35a..ae77e2dc0dae 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1721,7 +1721,7 @@ static int unix_accept(struct socket *sock, struct socket *newsock, int flags, } tsk = skb->sk; - unix_sk(tsk)->listener = NULL; + unix_update_edges(unix_sk(tsk)); skb_free_datagram(sk, skb); wake_up_interruptible(&unix_sk(sk)->peer_wait); diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 7e90663513f9..a21dbbbd9bc7 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -101,6 +101,14 @@ struct unix_sock *unix_get_socket(struct file *filp) return NULL; } +static struct unix_vertex *unix_edge_successor(struct unix_edge *edge) +{ + if (edge->successor->listener) + return unix_sk(edge->successor->listener)->vertex; + + return edge->successor->vertex; +} + static LIST_HEAD(unix_unvisited_vertices); enum unix_vertex_index { @@ -213,6 +221,13 @@ void unix_del_edges(struct scm_fp_list *fpl) fpl->inflight = false; } +void unix_update_edges(struct unix_sock *receiver) +{ + spin_lock(&unix_gc_lock); + receiver->listener = NULL; + spin_unlock(&unix_gc_lock); +} + int unix_prepare_fpl(struct scm_fp_list *fpl) { struct unix_vertex *vertex; @@ -268,7 +283,7 @@ static void __unix_walk_scc(struct unix_vertex *vertex) list_add(&vertex->scc_entry, &vertex_stack); list_for_each_entry(edge, &vertex->edges, vertex_entry) { - struct unix_vertex *next_vertex = edge->successor->vertex; + struct unix_vertex *next_vertex = unix_edge_successor(edge); if (!next_vertex) continue; From patchwork Fri Feb 23 21:39:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13570081 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-52003.amazon.com (smtp-fw-52003.amazon.com [52.119.213.152]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5BD1A149388 for ; Fri, 23 Feb 2024 21:43:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.119.213.152 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724630; cv=none; b=G2wE/dcCqzCzULUCUCsHjOtwoHYbKAUClbA1Bqu+PGYyGpO3tUfmAICoScHK+vCiHu0xSZTs0jXvRcWAadOKK7IgSlRk29Cuhwl+0eZpx/4ipk6AauwIV2g5Jmy2lFBksGcieafUpHzqZnNIVpgdsd8/KK6WsQispTD6eTzTi84= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724630; c=relaxed/simple; bh=foGhaZ4RKL53QyewcUle6fezrl8/29cd9NXAJpMITjw=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=ErUuYmFcG4yNmqqCEGsaP/xZaaovpdzUWGIaOc4uNMWRB4cmRjF9JZR+epYvwMMeDzxrcowl9Lx0kU+lgyMDYppwtZLNHKGooRFCcd6eZZT46nQ10SXTjtwVtH/BE4OuYUzxHaxTfKMleFmsxm8nwuOAVcxnHWAor5obeBwN4mk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=HJ8IBVN9; arc=none smtp.client-ip=52.119.213.152 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="HJ8IBVN9" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1708724628; x=1740260628; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=oMudsZ6SlzLVP8C3qThUZJlSPRoeY/TNX/CcDjR/xVc=; b=HJ8IBVN9nFBo/DS6ikG2khHnHZSp6GdanDQx0xXCgGbJwHzH/3r/lHTR nMXF32v3rpOT+Xf3q+MwLVNINLDXmujcCrNl4BKFk+Pt3+XjUjwIP8Fd0 OXvKLVyClwmlZd2NVOMcp1yzwZeHPz85qdQ/c18As/QDvhNG7rjaq4nml w=; X-IronPort-AV: E=Sophos;i="6.06,180,1705363200"; d="scan'208";a="640117864" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-52003.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Feb 2024 21:43:45 +0000 Received: from EX19MTAUWB002.ant.amazon.com [10.0.21.151:2576] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.6.154:2525] with esmtp (Farcaster) id ddfabf47-698e-48c7-be88-ebc2a3bbdafe; Fri, 23 Feb 2024 21:43:44 +0000 (UTC) X-Farcaster-Flow-ID: ddfabf47-698e-48c7-be88-ebc2a3bbdafe Received: from EX19D004ANA003.ant.amazon.com (10.37.240.184) by EX19MTAUWB002.ant.amazon.com (10.250.64.231) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:43:42 +0000 Received: from 88665a182662.ant.amazon.com.com (10.106.100.9) by EX19D004ANA003.ant.amazon.com (10.37.240.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1118.40; Fri, 23 Feb 2024 21:43:39 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v3 net-next 08/14] af_unix: Save O(n) setup of Tarjan's algo. Date: Fri, 23 Feb 2024 13:39:57 -0800 Message-ID: <20240223214003.17369-9-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240223214003.17369-1-kuniyu@amazon.com> References: <20240223214003.17369-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D045UWA002.ant.amazon.com (10.13.139.12) To EX19D004ANA003.ant.amazon.com (10.37.240.184) X-Patchwork-Delegate: kuba@kernel.org Before starting Tarjan's algorithm, we need to mark all vertices as unvisited. We can save this O(n) setup by reserving two special indices (0, 1) and using two variables. The first time we link a vertex to unix_unvisited_vertices, we set unix_vertex_unvisited_index to index. During DFS, we can see that the index of unvisited vertices is the same as unix_vertex_unvisited_index. When we finalise SCC later, we set unix_vertex_grouped_index to each vertex's index. Then, we can know (i) that the vertex is on the stack if the index of a visited vertex is >= 2 and (ii) that it is not on the stack and belongs to a different SCC if the index is unix_vertex_grouped_index. After the whole algorithm, all indices of vertices are set as unix_vertex_grouped_index. Next time we start DFS, we know that all unvisited vertices have unix_vertex_grouped_index, and we can use unix_vertex_unvisited_index as the not-on-stack marker. To use the same variable in __unix_walk_scc(), we can swap unix_vertex_(grouped|unvisited)_index at the end of Tarjan's algorithm. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 1 - net/unix/garbage.c | 22 ++++++++++++---------- 2 files changed, 12 insertions(+), 11 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index 414463803b7e..ec040caaa4b5 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -37,7 +37,6 @@ struct unix_vertex { unsigned long out_degree; unsigned long index; unsigned long lowlink; - bool on_stack; }; struct unix_edge { diff --git a/net/unix/garbage.c b/net/unix/garbage.c index a21dbbbd9bc7..a2cd24ee953b 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -112,16 +112,20 @@ static struct unix_vertex *unix_edge_successor(struct unix_edge *edge) static LIST_HEAD(unix_unvisited_vertices); enum unix_vertex_index { - UNIX_VERTEX_INDEX_UNVISITED, + UNIX_VERTEX_INDEX_MARK1, + UNIX_VERTEX_INDEX_MARK2, UNIX_VERTEX_INDEX_START, }; +static unsigned long unix_vertex_unvisited_index = UNIX_VERTEX_INDEX_MARK1; + static void unix_add_edge(struct scm_fp_list *fpl, struct unix_edge *edge) { struct unix_vertex *vertex = edge->predecessor->vertex; if (!vertex) { vertex = list_first_entry(&fpl->vertices, typeof(*vertex), entry); + vertex->index = unix_vertex_unvisited_index; vertex->out_degree = 0; INIT_LIST_HEAD(&vertex->edges); @@ -266,6 +270,7 @@ void unix_destroy_fpl(struct scm_fp_list *fpl) } static LIST_HEAD(unix_visited_vertices); +static unsigned long unix_vertex_grouped_index = UNIX_VERTEX_INDEX_MARK2; static void __unix_walk_scc(struct unix_vertex *vertex) { @@ -279,7 +284,6 @@ static void __unix_walk_scc(struct unix_vertex *vertex) vertex->lowlink = index; index++; - vertex->on_stack = true; list_add(&vertex->scc_entry, &vertex_stack); list_for_each_entry(edge, &vertex->edges, vertex_entry) { @@ -288,7 +292,7 @@ static void __unix_walk_scc(struct unix_vertex *vertex) if (!next_vertex) continue; - if (next_vertex->index == UNIX_VERTEX_INDEX_UNVISITED) { + if (next_vertex->index == unix_vertex_unvisited_index) { list_add(&edge->stack_entry, &edge_stack); vertex = next_vertex; @@ -302,7 +306,7 @@ static void __unix_walk_scc(struct unix_vertex *vertex) vertex = edge->predecessor->vertex; vertex->lowlink = min(vertex->lowlink, next_vertex->lowlink); - } else if (edge->successor->vertex->on_stack) { + } else if (next_vertex->index != unix_vertex_grouped_index) { vertex->lowlink = min(vertex->lowlink, next_vertex->index); } } @@ -315,7 +319,7 @@ static void __unix_walk_scc(struct unix_vertex *vertex) list_for_each_entry_reverse(vertex, &scc, scc_entry) { list_move_tail(&vertex->entry, &unix_visited_vertices); - vertex->on_stack = false; + vertex->index = unix_vertex_grouped_index; } list_del(&scc); @@ -327,17 +331,15 @@ static void __unix_walk_scc(struct unix_vertex *vertex) static void unix_walk_scc(void) { - struct unix_vertex *vertex; - - list_for_each_entry(vertex, &unix_unvisited_vertices, entry) - vertex->index = UNIX_VERTEX_INDEX_UNVISITED; - while (!list_empty(&unix_unvisited_vertices)) { + struct unix_vertex *vertex; + vertex = list_first_entry(&unix_unvisited_vertices, typeof(*vertex), entry); __unix_walk_scc(vertex); } list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices); + swap(unix_vertex_unvisited_index, unix_vertex_grouped_index); } static LIST_HEAD(gc_candidates); From patchwork Fri Feb 23 21:39:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13570085 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-80006.amazon.com (smtp-fw-80006.amazon.com [99.78.197.217]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D0A6C1493A5 for ; Fri, 23 Feb 2024 21:44:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=99.78.197.217 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724652; cv=none; b=MDoBz8IXLdqH0E4IxMud3YpVkkPjZ5HkQFtCwHNa6pl8mW3jmYyKAcc2yxGMwepCXmJ/bqvcMIbOH/oTKfDF1nfo1WeG14T5mFI3SaQF0OFHXRLxlgwsPpBDrdrKQGe6k/WCvmH+fVFqwlv/J+iEXP5gj/68yf8+k57NgzIBEu8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724652; c=relaxed/simple; bh=vyPHFsYy+8ltNjBmfIne5QhgppWWdroRFbjJhO4zyo4=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=iuQpdVG1gWGK3W8S0OKGb+O1VHJZCBGiDYMADVpcwj7kTxCNkpRVzcK8dGVkFpXkuRPKVnQT1Yabe3bXXClTBmNSOVy1Kt8pMYlCd5jMye5IIX+7wEaDDSWfd06oeATaDJIavOgsfssmezfIQTiyWec4qRbAE5ocIBjb9mxoXLY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=vnDikkAs; arc=none smtp.client-ip=99.78.197.217 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="vnDikkAs" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1708724651; x=1740260651; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=vRTgDQ9UOFexuvV8bRclfeaMJUgz63h5oJNEuEp8ya8=; b=vnDikkAsh/G4J9SKek5jn8LHkdDXN0QGxAyxNI1mRG8zdpHoxa59rBuO l90VM1pYUuMmRL6WJY/X7GczEB9OHeDeWmJVKdV4f27PtpxHnJKYJf/Ax pAFC/xkp9/LPMtVWsnOHlErC8YS4VAJddBEA/l9rR7ok3mp7kpM/n1hu/ U=; X-IronPort-AV: E=Sophos;i="6.06,180,1705363200"; d="scan'208";a="275499484" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.214]) by smtp-border-fw-80006.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Feb 2024 21:44:08 +0000 Received: from EX19MTAUWA002.ant.amazon.com [10.0.21.151:34406] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.52.236:2525] with esmtp (Farcaster) id f3b6900e-48ff-4edd-bc34-558666b41b11; Fri, 23 Feb 2024 21:44:07 +0000 (UTC) X-Farcaster-Flow-ID: f3b6900e-48ff-4edd-bc34-558666b41b11 Received: from EX19D004ANA003.ant.amazon.com (10.37.240.184) by EX19MTAUWA002.ant.amazon.com (10.250.64.202) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:44:07 +0000 Received: from 88665a182662.ant.amazon.com.com (10.106.100.9) by EX19D004ANA003.ant.amazon.com (10.37.240.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:44:04 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v3 net-next 09/14] af_unix: Skip GC if no cycle exists. Date: Fri, 23 Feb 2024 13:39:58 -0800 Message-ID: <20240223214003.17369-10-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240223214003.17369-1-kuniyu@amazon.com> References: <20240223214003.17369-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D038UWC003.ant.amazon.com (10.13.139.209) To EX19D004ANA003.ant.amazon.com (10.37.240.184) X-Patchwork-Delegate: kuba@kernel.org We do not need to run GC if there is no possible cyclic reference. We use unix_graph_maybe_cyclic to decide if we should run GC. If a fd of an AF_UNIX socket is passed to an already inflight AF_UNIX socket, they could form a cyclic reference. Then, we set true to unix_graph_maybe_cyclic and later run Tarjan's algorithm to group them into SCC. Once we run Tarjan's algorithm, we are 100% sure whether cyclic references exist or not. If there is no cycle, we set false to unix_graph_maybe_cyclic and can skip the entire garbage collection next time. When finalising SCC, we set true to unix_graph_maybe_cyclic if SCC consists of multiple vertices. Even if SCC is a single vertex, a cycle might exist as self-fd passing. Given the corner case is rare, we detect it by checking all edges of the vertex and set true to unix_graph_maybe_cyclic. With this change, __unix_gc() is just a spin_lock() dance in the normal usage. Signed-off-by: Kuniyuki Iwashima --- net/unix/garbage.c | 44 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 43 insertions(+), 1 deletion(-) diff --git a/net/unix/garbage.c b/net/unix/garbage.c index a2cd24ee953b..28506b32dcb2 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -109,6 +109,19 @@ static struct unix_vertex *unix_edge_successor(struct unix_edge *edge) return edge->successor->vertex; } +static bool unix_graph_maybe_cyclic; + +static void unix_graph_update(struct unix_vertex *vertex) +{ + if (unix_graph_maybe_cyclic) + return; + + if (!vertex) + return; + + unix_graph_maybe_cyclic = true; +} + static LIST_HEAD(unix_unvisited_vertices); enum unix_vertex_index { @@ -137,12 +150,14 @@ static void unix_add_edge(struct scm_fp_list *fpl, struct unix_edge *edge) vertex->out_degree++; list_add_tail(&edge->vertex_entry, &vertex->edges); + unix_graph_update(unix_edge_successor(edge)); } static void unix_del_edge(struct scm_fp_list *fpl, struct unix_edge *edge) { struct unix_vertex *vertex = edge->predecessor->vertex; + unix_graph_update(unix_edge_successor(edge)); list_del(&edge->vertex_entry); vertex->out_degree--; @@ -228,6 +243,7 @@ void unix_del_edges(struct scm_fp_list *fpl) void unix_update_edges(struct unix_sock *receiver) { spin_lock(&unix_gc_lock); + unix_graph_update(unix_sk(receiver->listener)->vertex); receiver->listener = NULL; spin_unlock(&unix_gc_lock); } @@ -269,6 +285,24 @@ void unix_destroy_fpl(struct scm_fp_list *fpl) unix_free_vertices(fpl); } +static bool unix_scc_cyclic(struct list_head *scc) +{ + struct unix_vertex *vertex; + struct unix_edge *edge; + + if (!list_is_singular(scc)) + return true; + + vertex = list_first_entry(scc, typeof(*vertex), scc_entry); + + list_for_each_entry(edge, &vertex->edges, vertex_entry) { + if (unix_edge_successor(edge) == vertex) + return true; + } + + return false; +} + static LIST_HEAD(unix_visited_vertices); static unsigned long unix_vertex_grouped_index = UNIX_VERTEX_INDEX_MARK2; @@ -322,6 +356,9 @@ static void __unix_walk_scc(struct unix_vertex *vertex) vertex->index = unix_vertex_grouped_index; } + if (!unix_graph_maybe_cyclic) + unix_graph_maybe_cyclic = unix_scc_cyclic(&scc); + list_del(&scc); } @@ -331,6 +368,8 @@ static void __unix_walk_scc(struct unix_vertex *vertex) static void unix_walk_scc(void) { + unix_graph_maybe_cyclic = false; + while (!list_empty(&unix_unvisited_vertices)) { struct unix_vertex *vertex; @@ -489,6 +528,9 @@ static void __unix_gc(struct work_struct *work) spin_lock(&unix_gc_lock); + if (!unix_graph_maybe_cyclic) + goto skip_gc; + unix_walk_scc(); /* First, select candidates for garbage collection. Only @@ -582,7 +624,7 @@ static void __unix_gc(struct work_struct *work) /* All candidates should have been detached by now. */ WARN_ON_ONCE(!list_empty(&gc_candidates)); - +skip_gc: /* Paired with READ_ONCE() in wait_for_unix_gc(). */ WRITE_ONCE(gc_in_progress, false); From patchwork Fri Feb 23 21:39:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13570086 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-9102.amazon.com (smtp-fw-9102.amazon.com [207.171.184.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 11EDB1448DE for ; Fri, 23 Feb 2024 21:44:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=207.171.184.29 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724677; cv=none; b=uDeoPPvE3y41n/7npN1MvUMvXxC7KzSbPQqyyHVyy5qM0b3zluDLoDh1ps6nte/T/1FqzKcO4sgsCThlR2YkKCLS2LI3Y7X4KmjB8KueOxEsajcoHYZu13ssSTGJ50fBCha38seGYqj01lVZ9fSW/syJcYqP9H/jpS7fud1S8Pk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724677; c=relaxed/simple; bh=imzb3zUJGxxEWMmGstf3QcPI1kuidExhng1v1b6JrsU=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=MBbDviTcHYA5SK0iDV+IZJM8Xfm0mwvTKXb4x162iit7BgGd5AYuqztGURrUhBlKfJIQU4pYaEQsms/M+Fp88Wt2R5JViXNWpS5jYIBx/ckpRcoDJXlqVX3V0iYNieF6tNIqBY5pFRbBl0NgmzPivzumKKGqKszKrHYyeKXdytg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=kLYayLFC; arc=none smtp.client-ip=207.171.184.29 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="kLYayLFC" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1708724676; x=1740260676; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1S2JQ/gn3W3vOTfgwJ3e6j5VSJK8WIP/HshmbFfYxlQ=; b=kLYayLFCSit4b7fRSXwdubmO2gPUDadc964NzaVpSzkYJomx0ULJlrjD KNEXsddeZ3ecZs6prqjmW4FXJxOEN1IUx5PDy+pKY+io9exHCa6uEtXmV PDdue1MyD2P3bANcnZxYip/vcEgtm7GbFvSGfH9SdahP2J2ooDKcYqwjD Q=; X-IronPort-AV: E=Sophos;i="6.06,180,1705363200"; d="scan'208";a="399311118" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.214]) by smtp-border-fw-9102.sea19.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Feb 2024 21:44:35 +0000 Received: from EX19MTAUWB001.ant.amazon.com [10.0.21.151:7346] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.39.43:2525] with esmtp (Farcaster) id 35f94793-bae6-424f-901e-963415162733; Fri, 23 Feb 2024 21:44:35 +0000 (UTC) X-Farcaster-Flow-ID: 35f94793-bae6-424f-901e-963415162733 Received: from EX19D004ANA003.ant.amazon.com (10.37.240.184) by EX19MTAUWB001.ant.amazon.com (10.250.64.248) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:44:32 +0000 Received: from 88665a182662.ant.amazon.com.com (10.106.100.9) by EX19D004ANA003.ant.amazon.com (10.37.240.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:44:29 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v3 net-next 10/14] af_unix: Avoid Tarjan's algorithm if unnecessary. Date: Fri, 23 Feb 2024 13:39:59 -0800 Message-ID: <20240223214003.17369-11-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240223214003.17369-1-kuniyu@amazon.com> References: <20240223214003.17369-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D031UWA002.ant.amazon.com (10.13.139.96) To EX19D004ANA003.ant.amazon.com (10.37.240.184) X-Patchwork-Delegate: kuba@kernel.org Once a cyclic reference is formed, we need to run GC to check if there is dead SCC. However, we do not need to run Tarjan's algorithm if we know that the shape of the inflight graph has not been changed. If an edge is added/updated/deleted and the edge's successor is inflight, we set false to unix_graph_grouped, which means we need to re-classify SCC. Once we finalise SCC, we set false to unix_graph_grouped. While unix_graph_grouped is false, we can iterate the grouped SCC using vertex->scc_entry in unix_walk_scc_fast(). list_add() and list_for_each_entry_reverse() uses seem weird, but they are to keep the vertex order consistent and make writing test easier. Signed-off-by: Kuniyuki Iwashima --- net/unix/garbage.c | 28 +++++++++++++++++++++++++++- 1 file changed, 27 insertions(+), 1 deletion(-) diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 28506b32dcb2..1d9a0498dec5 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -110,6 +110,7 @@ static struct unix_vertex *unix_edge_successor(struct unix_edge *edge) } static bool unix_graph_maybe_cyclic; +static bool unix_graph_grouped; static void unix_graph_update(struct unix_vertex *vertex) { @@ -120,6 +121,7 @@ static void unix_graph_update(struct unix_vertex *vertex) return; unix_graph_maybe_cyclic = true; + unix_graph_grouped = false; } static LIST_HEAD(unix_unvisited_vertices); @@ -141,6 +143,7 @@ static void unix_add_edge(struct scm_fp_list *fpl, struct unix_edge *edge) vertex->index = unix_vertex_unvisited_index; vertex->out_degree = 0; INIT_LIST_HEAD(&vertex->edges); + INIT_LIST_HEAD(&vertex->scc_entry); list_move_tail(&vertex->entry, &unix_unvisited_vertices); @@ -379,6 +382,26 @@ static void unix_walk_scc(void) list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices); swap(unix_vertex_unvisited_index, unix_vertex_grouped_index); + + unix_graph_grouped = true; +} + +static void unix_walk_scc_fast(void) +{ + while (!list_empty(&unix_unvisited_vertices)) { + struct unix_vertex *vertex; + struct list_head scc; + + vertex = list_first_entry(&unix_unvisited_vertices, typeof(*vertex), entry); + list_add(&scc, &vertex->scc_entry); + + list_for_each_entry_reverse(vertex, &scc, scc_entry) + list_move_tail(&vertex->entry, &unix_visited_vertices); + + list_del(&scc); + } + + list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices); } static LIST_HEAD(gc_candidates); @@ -531,7 +554,10 @@ static void __unix_gc(struct work_struct *work) if (!unix_graph_maybe_cyclic) goto skip_gc; - unix_walk_scc(); + if (unix_graph_grouped) + unix_walk_scc_fast(); + else + unix_walk_scc(); /* First, select candidates for garbage collection. Only * in-flight sockets are considered, and from those only ones From patchwork Fri Feb 23 21:40:00 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13570087 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-80009.amazon.com (smtp-fw-80009.amazon.com [99.78.197.220]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 93DFD1448F3 for ; Fri, 23 Feb 2024 21:44:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=99.78.197.220 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724701; cv=none; b=cieFN0e8mk38iZ+NYv9LguJHgaqRn2tywMsh7IVuc368SU0rcKefw8kLwPjJMDWJwUEg5aQyB2Z2qtqNKbMkzZ96n72fyglNh4/KAmJKy2XsfVb+6TFmiQ+SOLnFje7PtaqdjsJWDoMruVwa3OoEvgpW/+ST8PUpbn15kP1UzW0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724701; c=relaxed/simple; bh=WJqWRX17pkq66Q1C8sdJBOKrOz1tcB+JaAdkM+48D4g=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=myWuYXG9fgz4yPIKK/iIoOdhretkGGa8w8Sb4eK8TaRChWVSjJiLYZRhaMc806BlbKkVbkzxK7H9098J/3UAaHsn3R71nXcrP9MuQkHvxL6mSLSTCuHa0Wib6PAFKllc35vNG0Hq9EBhBbbxMjXYjt7kO62x6mc1S6cvMCQ8gLI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=lnac7HLH; arc=none smtp.client-ip=99.78.197.220 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="lnac7HLH" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1708724699; x=1740260699; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=BRV4ajxR2mvRcP07lg4HMiBd+jkIMx2JZ2jRuLqqjKU=; b=lnac7HLHGeiEjcwb4xPlstXdnZpsBPqRMWMVt+aSURcrsJOuwTfqilrW tUFbcvUXGGrq8sZTPuKw9+7nZbY/5Nz9wgTLjLocYjL/vMMwjSpEO8nta r+EslTtx8z7OidtDOcgPI5h4IcElfjqLdVV0ExVclYkBUoKtxxCX+7lpM U=; X-IronPort-AV: E=Sophos;i="6.06,180,1705363200"; d="scan'208";a="68370591" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.210]) by smtp-border-fw-80009.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Feb 2024 21:44:57 +0000 Received: from EX19MTAUWC002.ant.amazon.com [10.0.21.151:6357] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.52.236:2525] with esmtp (Farcaster) id efa6df7d-3d6a-4217-9a28-acf66c908f2d; Fri, 23 Feb 2024 21:44:57 +0000 (UTC) X-Farcaster-Flow-ID: efa6df7d-3d6a-4217-9a28-acf66c908f2d Received: from EX19D004ANA003.ant.amazon.com (10.37.240.184) by EX19MTAUWC002.ant.amazon.com (10.250.64.143) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:44:56 +0000 Received: from 88665a182662.ant.amazon.com.com (10.106.100.9) by EX19D004ANA003.ant.amazon.com (10.37.240.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1118.40; Fri, 23 Feb 2024 21:44:54 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v3 net-next 11/14] af_unix: Assign a unique index to SCC. Date: Fri, 23 Feb 2024 13:40:00 -0800 Message-ID: <20240223214003.17369-12-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240223214003.17369-1-kuniyu@amazon.com> References: <20240223214003.17369-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D045UWA003.ant.amazon.com (10.13.139.46) To EX19D004ANA003.ant.amazon.com (10.37.240.184) X-Patchwork-Delegate: kuba@kernel.org The definition of the lowlink in Tarjan's algorithm is the smallest index of a vertex that is reachable with at most one back-edge in SCC. This is not useful for a cross-edge. If we start traversing from A in the following graph, the final lowlink of D is 3. The cross-edge here is one between D and C. A -> B -> D D = (4, 3) (index, lowlink) ^ | | C = (3, 1) | V | B = (2, 1) `--- C <--' A = (1, 1) This is because the lowlink of D is updated with the index of C. In the following patch, we detect a dead SCC by checking two conditions for each vertex. 1) vertex has no edge directed to another SCC (no bridge) 2) vertex's out_degree is the same as the refcount of its file If 1) is false, there is a receiver of all fds of the SCC and its ancestor SCC. To evaluate 1), we need to assign a unique index to each SCC and assign it to all vertices in the SCC. This patch changes the lowlink update logic for cross-edge so that in the example above, the lowlink of D is updated with the lowlink of C. A -> B -> D D = (4, 1) (index, lowlink) ^ | | C = (3, 1) | V | B = (2, 1) `--- C <--' A = (1, 1) Then, all vertices in the same SCC have the same lowlink, and we can quickly find the bridge connecting to different SCC if exists. However, it is no longer called lowlink, so we rename it to scc_index. (It's sometimes called lowpoint.) Also, we add a global variable to hold the last index used in DFS so that we do not reset the initial index in each DFS. This patch can be squashed to the SCC detection patch but is split deliberately for anyone wondering why lowlink is not used as used in the original Tarjan's algorithm and many reference implementations. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 2 +- net/unix/garbage.c | 15 ++++++++------- 2 files changed, 9 insertions(+), 8 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index ec040caaa4b5..696d997a5ac9 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -36,7 +36,7 @@ struct unix_vertex { struct list_head scc_entry; unsigned long out_degree; unsigned long index; - unsigned long lowlink; + unsigned long scc_index; }; struct unix_edge { diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 1d9a0498dec5..0eb1610c96d7 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -308,18 +308,18 @@ static bool unix_scc_cyclic(struct list_head *scc) static LIST_HEAD(unix_visited_vertices); static unsigned long unix_vertex_grouped_index = UNIX_VERTEX_INDEX_MARK2; +static unsigned long unix_vertex_last_index = UNIX_VERTEX_INDEX_START; static void __unix_walk_scc(struct unix_vertex *vertex) { - unsigned long index = UNIX_VERTEX_INDEX_START; LIST_HEAD(vertex_stack); struct unix_edge *edge; LIST_HEAD(edge_stack); next_vertex: - vertex->index = index; - vertex->lowlink = index; - index++; + vertex->index = unix_vertex_last_index; + vertex->scc_index = unix_vertex_last_index; + unix_vertex_last_index++; list_add(&vertex->scc_entry, &vertex_stack); @@ -342,13 +342,13 @@ static void __unix_walk_scc(struct unix_vertex *vertex) vertex = edge->predecessor->vertex; - vertex->lowlink = min(vertex->lowlink, next_vertex->lowlink); + vertex->scc_index = min(vertex->scc_index, next_vertex->scc_index); } else if (next_vertex->index != unix_vertex_grouped_index) { - vertex->lowlink = min(vertex->lowlink, next_vertex->index); + vertex->scc_index = min(vertex->scc_index, next_vertex->scc_index); } } - if (vertex->index == vertex->lowlink) { + if (vertex->index == vertex->scc_index) { struct list_head scc; __list_cut_position(&scc, &vertex_stack, &vertex->scc_entry); @@ -371,6 +371,7 @@ static void __unix_walk_scc(struct unix_vertex *vertex) static void unix_walk_scc(void) { + unix_vertex_last_index = UNIX_VERTEX_INDEX_START; unix_graph_maybe_cyclic = false; while (!list_empty(&unix_unvisited_vertices)) { From patchwork Fri Feb 23 21:40:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13570088 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-9106.amazon.com (smtp-fw-9106.amazon.com [207.171.188.206]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 059A8146E71 for ; Fri, 23 Feb 2024 21:45:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=207.171.188.206 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724729; cv=none; b=XkAqadrGv7VPOV7BH40KdjoXITFtESO7sLZ+BU4jPWSPYFfBZqQzAtdLFoC6i7EJA+c6Bt7hy8LrynVJjexjE6LboZm2fSUwpMdXZHE8+y23t3IT352cadEMy0rvcEUt1LyHYTwYXMRCGDeoB/P0akYym16Xtjzv3mHS9UGKaYg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724729; c=relaxed/simple; bh=UrnhIari++KSIifVWaqi+lr3ut7FAafiBel5RP941W0=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Q9idaOyVmo6Lf2pZOC7IHdGeHOQFgyRqp02rI+DA62Hfd+X7QoVMGaa7lT1AL0u8qlRPrTChJeWf1jYNIhQ5fG4yJlnHRyPS9JuHnbfSJ17Y7y1Xd7K8suFYcQhJyTkOnGPoU4byI2N0AmpouC5tIVKurbav7iJzRkUrzj5hETk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=LxobkT2s; arc=none smtp.client-ip=207.171.188.206 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="LxobkT2s" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1708724728; x=1740260728; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=YYoX8Rk6ojmSlrOJAsk3YtaYCneWU9KEhYNUz4Ws2o8=; b=LxobkT2sn7kx2LA8nHN7V00bOcwV/AQ1yriemH22c/rXOKHfG5/ZhrOD eoMWx6dZXaJaNSY+sbkrMBV/SNTu66uGixkxcHQpVyBhGbfvHB5aKEXK2 9q3PKEAVCCuLRcAPHRcQN2Rkt6MhhfamYwQHy8X3u9XGUcSP23b3RUfYM w=; X-IronPort-AV: E=Sophos;i="6.06,180,1705363200"; d="scan'208";a="706491968" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.210]) by smtp-border-fw-9106.sea19.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Feb 2024 21:45:23 +0000 Received: from EX19MTAUWA001.ant.amazon.com [10.0.38.20:51364] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.6.154:2525] with esmtp (Farcaster) id 70e236fb-1277-4169-9a07-0147965d2ebe; Fri, 23 Feb 2024 21:45:22 +0000 (UTC) X-Farcaster-Flow-ID: 70e236fb-1277-4169-9a07-0147965d2ebe Received: from EX19D004ANA003.ant.amazon.com (10.37.240.184) by EX19MTAUWA001.ant.amazon.com (10.250.64.204) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:45:22 +0000 Received: from 88665a182662.ant.amazon.com.com (10.106.100.9) by EX19D004ANA003.ant.amazon.com (10.37.240.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1118.40; Fri, 23 Feb 2024 21:45:19 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v3 net-next 12/14] af_unix: Detect dead SCC. Date: Fri, 23 Feb 2024 13:40:01 -0800 Message-ID: <20240223214003.17369-13-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240223214003.17369-1-kuniyu@amazon.com> References: <20240223214003.17369-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D035UWB003.ant.amazon.com (10.13.138.85) To EX19D004ANA003.ant.amazon.com (10.37.240.184) X-Patchwork-Delegate: kuba@kernel.org When iterating SCC, we call unix_vertex_dead() for each vertex to check if the vertex is close()d and has no bridge to another SCC. If both conditions are true for every vertex in SCC, we can execute garbage collection for all skb in the SCC. The actual garbage collection is done in the following patch, replacing the old implementation. Signed-off-by: Kuniyuki Iwashima --- net/unix/garbage.c | 37 ++++++++++++++++++++++++++++++++++++- 1 file changed, 36 insertions(+), 1 deletion(-) diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 0eb1610c96d7..060e81be3614 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -288,6 +288,32 @@ void unix_destroy_fpl(struct scm_fp_list *fpl) unix_free_vertices(fpl); } +static bool unix_vertex_dead(struct unix_vertex *vertex) +{ + struct unix_edge *edge; + struct unix_sock *u; + long total_ref; + + list_for_each_entry(edge, &vertex->edges, vertex_entry) { + struct unix_vertex *next_vertex = unix_edge_successor(edge); + + if (!next_vertex) + return false; + + if (next_vertex->scc_index != vertex->scc_index) + return false; + } + + edge = list_first_entry(&vertex->edges, typeof(*edge), vertex_entry); + u = edge->predecessor; + total_ref = file_count(u->sk.sk_socket->file); + + if (total_ref != vertex->out_degree) + return false; + + return true; +} + static bool unix_scc_cyclic(struct list_head *scc) { struct unix_vertex *vertex; @@ -350,6 +376,7 @@ static void __unix_walk_scc(struct unix_vertex *vertex) if (vertex->index == vertex->scc_index) { struct list_head scc; + bool dead = true; __list_cut_position(&scc, &vertex_stack, &vertex->scc_entry); @@ -357,6 +384,9 @@ static void __unix_walk_scc(struct unix_vertex *vertex) list_move_tail(&vertex->entry, &unix_visited_vertices); vertex->index = unix_vertex_grouped_index; + + if (dead) + dead = unix_vertex_dead(vertex); } if (!unix_graph_maybe_cyclic) @@ -392,13 +422,18 @@ static void unix_walk_scc_fast(void) while (!list_empty(&unix_unvisited_vertices)) { struct unix_vertex *vertex; struct list_head scc; + bool dead = true; vertex = list_first_entry(&unix_unvisited_vertices, typeof(*vertex), entry); list_add(&scc, &vertex->scc_entry); - list_for_each_entry_reverse(vertex, &scc, scc_entry) + list_for_each_entry_reverse(vertex, &scc, scc_entry) { list_move_tail(&vertex->entry, &unix_visited_vertices); + if (dead) + dead = unix_vertex_dead(vertex); + } + list_del(&scc); } From patchwork Fri Feb 23 21:40:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13570089 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-80008.amazon.com (smtp-fw-80008.amazon.com [99.78.197.219]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 95B7114A0A9 for ; Fri, 23 Feb 2024 21:45:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=99.78.197.219 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724752; cv=none; b=QYazamWD+Duia4YrUP7GFGXiMmaTTiovl8pBd5cuiGheEsU4sKpoygA3gGVjgNMPLk9W57CDCAiHwiWTukmc8iJ9Aj633Q4RTm0zLiL0MHeyEVOSAs/vlybkt3IRkO6HABFlL2mPMoEf7LgYbYw/UnbJUYQeutOv2rGKSJdC5+8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724752; c=relaxed/simple; bh=uy0/FFFEQzwTFA8z8PMrnWayGANgDIO8CCXsWOysVPM=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=J6B9yCMPowf6JKjhGxBW4Kflf+Ws8WU+NMkaXbAMharibHxB/EiQPxjor9EuZOygfRhPwv04EV+e+WcPmOy5cX/ib54hximKO2sGjzXje5hX451OZHth3y42Fr8zcdbJZrHD/AS04Zf9DastJ5JyexxgWEO3XAoUjXisll2R/Tk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=A3HUxkKe; arc=none smtp.client-ip=99.78.197.219 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="A3HUxkKe" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1708724750; x=1740260750; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/ZmP9nsj0p1GJ+z2/6UT2ETzuYRlnl6j7Hx3rNliUKg=; b=A3HUxkKe0aF04lVqlCiBQaWQFak/l407S33xmtr/1hlKoeXv7tJQyo4x Hv9Gg7mtbsJP/UhInWyCp2Wbt0THilAPIh4c2FN6gijCcgBr0NNQlDUpa x1O/MDF84FfLGnElPLOpKMC+NYT9xOvbkDWjCkdCYNPlT5mgFyv+nmDtv M=; X-IronPort-AV: E=Sophos;i="6.06,180,1705363200"; d="scan'208";a="68377732" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.214]) by smtp-border-fw-80008.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Feb 2024 21:45:48 +0000 Received: from EX19MTAUWC001.ant.amazon.com [10.0.21.151:34506] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.16.229:2525] with esmtp (Farcaster) id 1029e676-b185-45d5-891b-c8a063a35d51; Fri, 23 Feb 2024 21:45:47 +0000 (UTC) X-Farcaster-Flow-ID: 1029e676-b185-45d5-891b-c8a063a35d51 Received: from EX19D004ANA003.ant.amazon.com (10.37.240.184) by EX19MTAUWC001.ant.amazon.com (10.250.64.174) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:45:47 +0000 Received: from 88665a182662.ant.amazon.com.com (10.106.100.9) by EX19D004ANA003.ant.amazon.com (10.37.240.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:45:44 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v3 net-next 13/14] af_unix: Replace garbage collection algorithm. Date: Fri, 23 Feb 2024 13:40:02 -0800 Message-ID: <20240223214003.17369-14-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240223214003.17369-1-kuniyu@amazon.com> References: <20240223214003.17369-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D042UWA003.ant.amazon.com (10.13.139.44) To EX19D004ANA003.ant.amazon.com (10.37.240.184) X-Patchwork-Delegate: kuba@kernel.org If we find a dead SCC during iteration, we call unix_collect_skb() to splice all skb in the SCC to the global sk_buff_head, hitlist. After iterating all SCC, we unlock unix_gc_lock and purge the queue. Signed-off-by: Kuniyuki Iwashima --- include/net/af_unix.h | 8 -- net/unix/af_unix.c | 12 -- net/unix/garbage.c | 287 ++++++++---------------------------------- 3 files changed, 53 insertions(+), 254 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index 696d997a5ac9..226a8da2cbe3 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -19,9 +19,6 @@ static inline struct unix_sock *unix_get_socket(struct file *filp) extern spinlock_t unix_gc_lock; extern unsigned int unix_tot_inflight; - -void unix_inflight(struct user_struct *user, struct file *fp); -void unix_notinflight(struct user_struct *user, struct file *fp); void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver); void unix_del_edges(struct scm_fp_list *fpl); void unix_update_edges(struct unix_sock *receiver); @@ -85,12 +82,7 @@ struct unix_sock { struct sock *peer; struct sock *listener; struct unix_vertex *vertex; - struct list_head link; - unsigned long inflight; spinlock_t lock; - unsigned long gc_flags; -#define UNIX_GC_CANDIDATE 0 -#define UNIX_GC_MAYBE_CYCLE 1 struct socket_wq peer_wq; wait_queue_entry_t peer_wake; struct scm_stat scm_stat; diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index ae77e2dc0dae..27ca50ab1cd1 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -980,12 +980,10 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern, sk->sk_destruct = unix_sock_destructor; u = unix_sk(sk); u->listener = NULL; - u->inflight = 0; u->vertex = NULL; u->path.dentry = NULL; u->path.mnt = NULL; spin_lock_init(&u->lock); - INIT_LIST_HEAD(&u->link); mutex_init(&u->iolock); /* single task reading lock */ mutex_init(&u->bindlock); /* single task binding lock */ init_waitqueue_head(&u->peer_wait); @@ -1793,8 +1791,6 @@ static inline bool too_many_unix_fds(struct task_struct *p) static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb) { - int i; - if (too_many_unix_fds(current)) return -ETOOMANYREFS; @@ -1806,9 +1802,6 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb) if (!UNIXCB(skb).fp) return -ENOMEM; - for (i = scm->fp->count - 1; i >= 0; i--) - unix_inflight(scm->fp->user, scm->fp->fp[i]); - if (unix_prepare_fpl(UNIXCB(skb).fp)) return -ENOMEM; @@ -1817,15 +1810,10 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb) static void unix_detach_fds(struct scm_cookie *scm, struct sk_buff *skb) { - int i; - scm->fp = UNIXCB(skb).fp; UNIXCB(skb).fp = NULL; unix_destroy_fpl(scm->fp); - - for (i = scm->fp->count - 1; i >= 0; i--) - unix_notinflight(scm->fp->user, scm->fp->fp[i]); } static void unix_peek_fds(struct scm_cookie *scm, struct sk_buff *skb) diff --git a/net/unix/garbage.c b/net/unix/garbage.c index 060e81be3614..59a87a997a4d 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -314,6 +314,48 @@ static bool unix_vertex_dead(struct unix_vertex *vertex) return true; } +static struct sk_buff_head hitlist; + +static void unix_collect_skb(struct list_head *scc) +{ + struct unix_vertex *vertex; + + list_for_each_entry_reverse(vertex, scc, scc_entry) { + struct sk_buff_head *queue; + struct unix_edge *edge; + struct unix_sock *u; + + edge = list_first_entry(&vertex->edges, typeof(*edge), vertex_entry); + u = edge->predecessor; + queue = &u->sk.sk_receive_queue; + + spin_lock(&queue->lock); + + if (u->sk.sk_state == TCP_LISTEN) { + struct sk_buff *skb; + + skb_queue_walk(queue, skb) { + struct sk_buff_head *embryo_queue = &skb->sk->sk_receive_queue; + + spin_lock(&embryo_queue->lock); + skb_queue_splice_init(embryo_queue, &hitlist); + spin_unlock(&embryo_queue->lock); + } + } else { + skb_queue_splice_init(queue, &hitlist); + +#if IS_ENABLED(CONFIG_AF_UNIX_OOB) + if (u->oob_skb) { + kfree_skb(u->oob_skb); + u->oob_skb = NULL; + } +#endif + } + + spin_unlock(&queue->lock); + } +} + static bool unix_scc_cyclic(struct list_head *scc) { struct unix_vertex *vertex; @@ -389,7 +431,9 @@ static void __unix_walk_scc(struct unix_vertex *vertex) dead = unix_vertex_dead(vertex); } - if (!unix_graph_maybe_cyclic) + if (dead) + unix_collect_skb(&scc); + else if (!unix_graph_maybe_cyclic) unix_graph_maybe_cyclic = unix_scc_cyclic(&scc); list_del(&scc); @@ -434,263 +478,38 @@ static void unix_walk_scc_fast(void) dead = unix_vertex_dead(vertex); } + if (dead) + unix_collect_skb(&scc); + list_del(&scc); } list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices); } -static LIST_HEAD(gc_candidates); -static LIST_HEAD(gc_inflight_list); - -/* Keep the number of times in flight count for the file - * descriptor if it is for an AF_UNIX socket. - */ -void unix_inflight(struct user_struct *user, struct file *filp) -{ - struct unix_sock *u = unix_get_socket(filp); - - spin_lock(&unix_gc_lock); - - if (u) { - if (!u->inflight) { - WARN_ON_ONCE(!list_empty(&u->link)); - list_add_tail(&u->link, &gc_inflight_list); - } else { - WARN_ON_ONCE(list_empty(&u->link)); - } - u->inflight++; - } - - spin_unlock(&unix_gc_lock); -} - -void unix_notinflight(struct user_struct *user, struct file *filp) -{ - struct unix_sock *u = unix_get_socket(filp); - - spin_lock(&unix_gc_lock); - - if (u) { - WARN_ON_ONCE(!u->inflight); - WARN_ON_ONCE(list_empty(&u->link)); - - u->inflight--; - if (!u->inflight) - list_del_init(&u->link); - } - - spin_unlock(&unix_gc_lock); -} - -static void scan_inflight(struct sock *x, void (*func)(struct unix_sock *), - struct sk_buff_head *hitlist) -{ - struct sk_buff *skb; - struct sk_buff *next; - - spin_lock(&x->sk_receive_queue.lock); - skb_queue_walk_safe(&x->sk_receive_queue, skb, next) { - /* Do we have file descriptors ? */ - if (UNIXCB(skb).fp) { - bool hit = false; - /* Process the descriptors of this socket */ - int nfd = UNIXCB(skb).fp->count; - struct file **fp = UNIXCB(skb).fp->fp; - - while (nfd--) { - /* Get the socket the fd matches if it indeed does so */ - struct unix_sock *u = unix_get_socket(*fp++); - - /* Ignore non-candidates, they could have been added - * to the queues after starting the garbage collection - */ - if (u && test_bit(UNIX_GC_CANDIDATE, &u->gc_flags)) { - hit = true; - - func(u); - } - } - if (hit && hitlist != NULL) { - __skb_unlink(skb, &x->sk_receive_queue); - __skb_queue_tail(hitlist, skb); - } - } - } - spin_unlock(&x->sk_receive_queue.lock); -} - -static void scan_children(struct sock *x, void (*func)(struct unix_sock *), - struct sk_buff_head *hitlist) -{ - if (x->sk_state != TCP_LISTEN) { - scan_inflight(x, func, hitlist); - } else { - struct sk_buff *skb; - struct sk_buff *next; - struct unix_sock *u; - LIST_HEAD(embryos); - - /* For a listening socket collect the queued embryos - * and perform a scan on them as well. - */ - spin_lock(&x->sk_receive_queue.lock); - skb_queue_walk_safe(&x->sk_receive_queue, skb, next) { - u = unix_sk(skb->sk); - - /* An embryo cannot be in-flight, so it's safe - * to use the list link. - */ - WARN_ON_ONCE(!list_empty(&u->link)); - list_add_tail(&u->link, &embryos); - } - spin_unlock(&x->sk_receive_queue.lock); - - while (!list_empty(&embryos)) { - u = list_entry(embryos.next, struct unix_sock, link); - scan_inflight(&u->sk, func, hitlist); - list_del_init(&u->link); - } - } -} - -static void dec_inflight(struct unix_sock *usk) -{ - usk->inflight--; -} - -static void inc_inflight(struct unix_sock *usk) -{ - usk->inflight++; -} - -static void inc_inflight_move_tail(struct unix_sock *u) -{ - u->inflight++; - - /* If this still might be part of a cycle, move it to the end - * of the list, so that it's checked even if it was already - * passed over - */ - if (test_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags)) - list_move_tail(&u->link, &gc_candidates); -} - static bool gc_in_progress; static void __unix_gc(struct work_struct *work) { - struct sk_buff_head hitlist; - struct unix_sock *u, *next; - LIST_HEAD(not_cycle_list); - struct list_head cursor; - spin_lock(&unix_gc_lock); - if (!unix_graph_maybe_cyclic) + if (!unix_graph_maybe_cyclic) { + spin_unlock(&unix_gc_lock); goto skip_gc; + } + + __skb_queue_head_init(&hitlist); if (unix_graph_grouped) unix_walk_scc_fast(); else unix_walk_scc(); - /* First, select candidates for garbage collection. Only - * in-flight sockets are considered, and from those only ones - * which don't have any external reference. - * - * Holding unix_gc_lock will protect these candidates from - * being detached, and hence from gaining an external - * reference. Since there are no possible receivers, all - * buffers currently on the candidates' queues stay there - * during the garbage collection. - * - * We also know that no new candidate can be added onto the - * receive queues. Other, non candidate sockets _can_ be - * added to queue, so we must make sure only to touch - * candidates. - */ - list_for_each_entry_safe(u, next, &gc_inflight_list, link) { - long total_refs; - - total_refs = file_count(u->sk.sk_socket->file); - - WARN_ON_ONCE(!u->inflight); - WARN_ON_ONCE(total_refs < u->inflight); - if (total_refs == u->inflight) { - list_move_tail(&u->link, &gc_candidates); - __set_bit(UNIX_GC_CANDIDATE, &u->gc_flags); - __set_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags); - } - } - - /* Now remove all internal in-flight reference to children of - * the candidates. - */ - list_for_each_entry(u, &gc_candidates, link) - scan_children(&u->sk, dec_inflight, NULL); - - /* Restore the references for children of all candidates, - * which have remaining references. Do this recursively, so - * only those remain, which form cyclic references. - * - * Use a "cursor" link, to make the list traversal safe, even - * though elements might be moved about. - */ - list_add(&cursor, &gc_candidates); - while (cursor.next != &gc_candidates) { - u = list_entry(cursor.next, struct unix_sock, link); - - /* Move cursor to after the current position. */ - list_move(&cursor, &u->link); - - if (u->inflight) { - list_move_tail(&u->link, ¬_cycle_list); - __clear_bit(UNIX_GC_MAYBE_CYCLE, &u->gc_flags); - scan_children(&u->sk, inc_inflight_move_tail, NULL); - } - } - list_del(&cursor); - - /* Now gc_candidates contains only garbage. Restore original - * inflight counters for these as well, and remove the skbuffs - * which are creating the cycle(s). - */ - skb_queue_head_init(&hitlist); - list_for_each_entry(u, &gc_candidates, link) { - scan_children(&u->sk, inc_inflight, &hitlist); - -#if IS_ENABLED(CONFIG_AF_UNIX_OOB) - if (u->oob_skb) { - kfree_skb(u->oob_skb); - u->oob_skb = NULL; - } -#endif - } - - /* not_cycle_list contains those sockets which do not make up a - * cycle. Restore these to the inflight list. - */ - while (!list_empty(¬_cycle_list)) { - u = list_entry(not_cycle_list.next, struct unix_sock, link); - __clear_bit(UNIX_GC_CANDIDATE, &u->gc_flags); - list_move_tail(&u->link, &gc_inflight_list); - } - spin_unlock(&unix_gc_lock); - /* Here we are. Hitlist is filled. Die. */ __skb_queue_purge(&hitlist); - - spin_lock(&unix_gc_lock); - - /* All candidates should have been detached by now. */ - WARN_ON_ONCE(!list_empty(&gc_candidates)); skip_gc: - /* Paired with READ_ONCE() in wait_for_unix_gc(). */ WRITE_ONCE(gc_in_progress, false); - - spin_unlock(&unix_gc_lock); } static DECLARE_WORK(unix_gc_work, __unix_gc); From patchwork Fri Feb 23 21:40:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kuniyuki Iwashima X-Patchwork-Id: 13570090 X-Patchwork-Delegate: kuba@kernel.org Received: from smtp-fw-9106.amazon.com (smtp-fw-9106.amazon.com [207.171.188.206]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9A65814CAA3 for ; Fri, 23 Feb 2024 21:46:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=207.171.188.206 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724777; cv=none; b=p1JjT5270j+K4JLVtR6Vbjfkvl/g+5DZbtjiInb8q1rWp1BS11Xal9hIqHOVWfZe3w3pTDQ4HwfmXpGL8eaiaxDe1Pjei1fpzt9rGZ2JzhPa4ciTsLo1J6c1Xud9d2Pzqs8L0a4M8EiZjtGQEroTZfn1WdSK/u+KYdEbMDPiBVU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708724777; c=relaxed/simple; bh=ZuRdZHLz50uFNXbgNZaAm/C6KqlpRhYt39V0cu7ic4E=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=GUyqPtxqcaYAvQEoUYwE+MUYAPxCwX+6i8BSnGjKFDq+kBmCQZWPsQGDfMG2YLMv9Ta6MyL2UtmDq6VtDDxP8ci2aial4D/nU0HvSv1+tg7v4XFlS0eXTbRojN+564kDTzImPSvW+O/MxPrheny4KHZOIBFVIFRDqnq7AoqCZcE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.co.jp; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=p6acRRlL; arc=none smtp.client-ip=207.171.188.206 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.co.jp Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="p6acRRlL" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1708724775; x=1740260775; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Iq/zFFFJ8V14KK9QW8av9KdokGOaG8yduusDkCXibtE=; b=p6acRRlLx/6Z3UdThJW2xivPSth3nIMNkVsC+GYJyEnqTBsyOIGdyx/w MbYKm98SesVtVchBgfKe9Tv86574GDPPgyhpAb5aEQwTqKbFO5cn+isRv atg1muGimfgxwis54Op0vhVrfm0lKg+AHColLP+/e7b3qYpGvnEogLdFA 4=; X-IronPort-AV: E=Sophos;i="6.06,180,1705363200"; d="scan'208";a="706492111" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.210]) by smtp-border-fw-9106.sea19.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Feb 2024 21:46:14 +0000 Received: from EX19MTAUWB002.ant.amazon.com [10.0.21.151:14694] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.16.177:2525] with esmtp (Farcaster) id 406e1a1f-ab32-4d66-8afd-3f91cc4e97a5; Fri, 23 Feb 2024 21:46:13 +0000 (UTC) X-Farcaster-Flow-ID: 406e1a1f-ab32-4d66-8afd-3f91cc4e97a5 Received: from EX19D004ANA003.ant.amazon.com (10.37.240.184) by EX19MTAUWB002.ant.amazon.com (10.250.64.231) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:46:12 +0000 Received: from 88665a182662.ant.amazon.com.com (10.106.100.9) by EX19D004ANA003.ant.amazon.com (10.37.240.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 23 Feb 2024 21:46:09 +0000 From: Kuniyuki Iwashima To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni CC: Kuniyuki Iwashima , Kuniyuki Iwashima , Subject: [PATCH v3 net-next 14/14] selftest: af_unix: Test GC for SCM_RIGHTS. Date: Fri, 23 Feb 2024 13:40:03 -0800 Message-ID: <20240223214003.17369-15-kuniyu@amazon.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20240223214003.17369-1-kuniyu@amazon.com> References: <20240223214003.17369-1-kuniyu@amazon.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: EX19D032UWB001.ant.amazon.com (10.13.139.152) To EX19D004ANA003.ant.amazon.com (10.37.240.184) X-Patchwork-Delegate: kuba@kernel.org This patch adds test cases to verify the new GC. We run each test for the following cases: * SOCK_DGRAM * SOCK_STREAM without embryo socket * SOCK_STREAM without embryo socket + MSG_OOB * SOCK_STREAM with embryo sockets * SOCK_STREAM with embryo sockets + MSG_OOB Before and after running each test case, we ensure that there is no AF_UNIX socket left in the netns by reading /proc/net/protocols. We cannot use /proc/net/unix and UNIX_DIAG because the embryo socket does not show up there. Each test creates multiple sockets in an array. We pass sockets in the even index using the peer sockets in the odd index. So, send_fd(0, 1) actually sends fd[0] to fd[2] via fd[0 + 1]. Test 1 : A <-> A Test 2 : A <-> B Test 3 : A -> B -> C <- D ^.___|___.' ^ `---------' Signed-off-by: Kuniyuki Iwashima --- tools/testing/selftests/net/.gitignore | 1 + tools/testing/selftests/net/af_unix/Makefile | 2 +- .../selftests/net/af_unix/scm_rights.c | 286 ++++++++++++++++++ 3 files changed, 288 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/net/af_unix/scm_rights.c diff --git a/tools/testing/selftests/net/.gitignore b/tools/testing/selftests/net/.gitignore index 2f9d378edec3..d996a0ab0765 100644 --- a/tools/testing/selftests/net/.gitignore +++ b/tools/testing/selftests/net/.gitignore @@ -31,6 +31,7 @@ reuseport_dualstack rxtimestamp sctp_hello scm_pidfd +scm_rights sk_bind_sendto_listen sk_connect_zero_addr socket diff --git a/tools/testing/selftests/net/af_unix/Makefile b/tools/testing/selftests/net/af_unix/Makefile index 221c387a7d7f..3b83c797650d 100644 --- a/tools/testing/selftests/net/af_unix/Makefile +++ b/tools/testing/selftests/net/af_unix/Makefile @@ -1,4 +1,4 @@ CFLAGS += $(KHDR_INCLUDES) -TEST_GEN_PROGS := diag_uid test_unix_oob unix_connect scm_pidfd +TEST_GEN_PROGS := diag_uid test_unix_oob unix_connect scm_pidfd scm_rights include ../../lib.mk diff --git a/tools/testing/selftests/net/af_unix/scm_rights.c b/tools/testing/selftests/net/af_unix/scm_rights.c new file mode 100644 index 000000000000..bab606c9f1eb --- /dev/null +++ b/tools/testing/selftests/net/af_unix/scm_rights.c @@ -0,0 +1,286 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright Amazon.com Inc. or its affiliates. */ +#define _GNU_SOURCE +#include + +#include +#include +#include +#include +#include +#include + +#include "../../kselftest_harness.h" + +FIXTURE(scm_rights) +{ + int fd[16]; +}; + +FIXTURE_VARIANT(scm_rights) +{ + char name[16]; + int type; + int flags; + bool test_listener; +}; + +FIXTURE_VARIANT_ADD(scm_rights, dgram) +{ + .name = "UNIX ", + .type = SOCK_DGRAM, + .flags = 0, + .test_listener = false, +}; + +FIXTURE_VARIANT_ADD(scm_rights, stream) +{ + .name = "UNIX-STREAM ", + .type = SOCK_STREAM, + .flags = 0, + .test_listener = false, +}; + +FIXTURE_VARIANT_ADD(scm_rights, stream_oob) +{ + .name = "UNIX-STREAM ", + .type = SOCK_STREAM, + .flags = MSG_OOB, + .test_listener = false, +}; + +FIXTURE_VARIANT_ADD(scm_rights, stream_listener) +{ + .name = "UNIX-STREAM ", + .type = SOCK_STREAM, + .flags = 0, + .test_listener = true, +}; + +FIXTURE_VARIANT_ADD(scm_rights, stream_listener_oob) +{ + .name = "UNIX-STREAM ", + .type = SOCK_STREAM, + .flags = MSG_OOB, + .test_listener = true, +}; + +static int count_sockets(struct __test_metadata *_metadata, + const FIXTURE_VARIANT(scm_rights) *variant) +{ + int sockets = -1, len, ret; + char *line = NULL; + size_t unused; + FILE *f; + + f = fopen("/proc/net/protocols", "r"); + ASSERT_NE(NULL, f); + + len = strlen(variant->name); + + while (getline(&line, &unused, f) != -1) { + int unused2; + + if (strncmp(line, variant->name, len)) + continue; + + ret = sscanf(line + len, "%d %d", &unused2, &sockets); + ASSERT_EQ(2, ret); + + break; + } + + free(line); + + ret = fclose(f); + ASSERT_EQ(0, ret); + + return sockets; +} + +FIXTURE_SETUP(scm_rights) +{ + int ret; + + ret = unshare(CLONE_NEWNET); + ASSERT_EQ(0, ret); + + ret = count_sockets(_metadata, variant); + ASSERT_EQ(0, ret); +} + +FIXTURE_TEARDOWN(scm_rights) +{ + int ret; + + sleep(1); + + ret = count_sockets(_metadata, variant); + ASSERT_EQ(0, ret); +} + +static void create_listeners(struct __test_metadata *_metadata, + FIXTURE_DATA(scm_rights) *self, + int n) +{ + struct sockaddr_un addr = { + .sun_family = AF_UNIX, + }; + socklen_t addrlen; + int i, ret; + + for (i = 0; i < n * 2; i += 2) { + self->fd[i] = socket(AF_UNIX, SOCK_STREAM, 0); + ASSERT_LE(0, self->fd[i]); + + addrlen = sizeof(addr.sun_family); + ret = bind(self->fd[i], (struct sockaddr *)&addr, addrlen); + ASSERT_EQ(0, ret); + + ret = listen(self->fd[i], -1); + ASSERT_EQ(0, ret); + + addrlen = sizeof(addr); + ret = getsockname(self->fd[i], (struct sockaddr *)&addr, &addrlen); + ASSERT_EQ(0, ret); + + self->fd[i + 1] = socket(AF_UNIX, SOCK_STREAM, 0); + ASSERT_LE(0, self->fd[i + 1]); + + ret = connect(self->fd[i + 1], (struct sockaddr *)&addr, addrlen); + ASSERT_EQ(0, ret); + } +} + +static void create_socketpairs(struct __test_metadata *_metadata, + FIXTURE_DATA(scm_rights) *self, + const FIXTURE_VARIANT(scm_rights) *variant, + int n) +{ + int i, ret; + + ASSERT_GE(sizeof(self->fd) / sizeof(int), n); + + for (i = 0; i < n * 2; i += 2) { + ret = socketpair(AF_UNIX, variant->type, 0, self->fd + i); + ASSERT_EQ(0, ret); + } +} + +static void __create_sockets(struct __test_metadata *_metadata, + FIXTURE_DATA(scm_rights) *self, + const FIXTURE_VARIANT(scm_rights) *variant, + int n) +{ + if (variant->test_listener) + create_listeners(_metadata, self, n); + else + create_socketpairs(_metadata, self, variant, n); +} + +static void __close_sockets(struct __test_metadata *_metadata, + FIXTURE_DATA(scm_rights) *self, + int n) +{ + int i, ret; + + ASSERT_GE(sizeof(self->fd) / sizeof(int), n); + + for (i = 0; i < n * 2; i++) { + ret = close(self->fd[i]); + ASSERT_EQ(0, ret); + } +} + +void __send_fd(struct __test_metadata *_metadata, + const FIXTURE_DATA(scm_rights) *self, + const FIXTURE_VARIANT(scm_rights) *variant, + int inflight, int receiver) +{ +#define MSG "nop" +#define MSGLEN 3 + struct { + struct cmsghdr cmsghdr; + int fd[2]; + } cmsg = { + .cmsghdr = { + .cmsg_len = CMSG_LEN(sizeof(cmsg.fd)), + .cmsg_level = SOL_SOCKET, + .cmsg_type = SCM_RIGHTS, + }, + .fd = { + self->fd[inflight * 2], + self->fd[inflight * 2], + }, + }; + struct iovec iov = { + .iov_base = MSG, + .iov_len = MSGLEN, + }; + struct msghdr msg = { + .msg_name = NULL, + .msg_namelen = 0, + .msg_iov = &iov, + .msg_iovlen = 1, + .msg_control = &cmsg, + .msg_controllen = CMSG_SPACE(sizeof(cmsg.fd)), + }; + int ret; + + ret = sendmsg(self->fd[receiver * 2 + 1], &msg, variant->flags); + ASSERT_EQ(MSGLEN, ret); +} + +#define create_sockets(n) \ + __create_sockets(_metadata, self, variant, n) +#define close_sockets(n) \ + __close_sockets(_metadata, self, n) +#define send_fd(inflight, receiver) \ + __send_fd(_metadata, self, variant, inflight, receiver) + +TEST_F(scm_rights, self_ref) +{ + create_sockets(2); + + send_fd(0, 0); + + send_fd(1, 1); + + close_sockets(2); +} + +TEST_F(scm_rights, triangle) +{ + create_sockets(6); + + send_fd(0, 1); + send_fd(1, 2); + send_fd(2, 0); + + send_fd(3, 4); + send_fd(4, 5); + send_fd(5, 3); + + close_sockets(6); +} + +TEST_F(scm_rights, cross_edge) +{ + create_sockets(8); + + send_fd(0, 1); + send_fd(1, 2); + send_fd(2, 0); + send_fd(1, 3); + send_fd(3, 2); + + send_fd(4, 5); + send_fd(5, 6); + send_fd(6, 4); + send_fd(5, 7); + send_fd(7, 6); + + close_sockets(8); +} + +TEST_HARNESS_MAIN