From patchwork Mon Mar 10 09:41:54 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 14009478
From: David Howells
To: Marc Dionne
Cc: David Howells, Christian Brauner, linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Alexander Viro
Subject: [PATCH v4 01/11] afs: Fix afs_atcell_get_link() to handle RCU pathwalk
Date: Mon, 10 Mar 2025 09:41:54 +0000
Message-ID: <20250310094206.801057-2-dhowells@redhat.com>
In-Reply-To: <20250310094206.801057-1-dhowells@redhat.com>
References: <20250310094206.801057-1-dhowells@redhat.com>
Precedence: bulk
X-Mailing-List:
linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.111 The ->get_link() method may be entered under RCU pathwalk conditions (in which case, the dentry pointer is NULL). This is not taken account of by afs_atcell_get_link() and lockdep will complain when it tries to lock an rwsem. Fix this by marking net->ws_cell as __rcu and using RCU access macros on it and by making afs_atcell_get_link() just return a pointer to the name in RCU pathwalk without taking net->cells_lock or a ref on the cell as RCU will protect the name storage (the cell is already freed via call_rcu()). Fixes: 30bca65bbbae ("afs: Make /afs/@cell and /afs/.@cell symlinks") Reported-by: Alexander Viro Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org --- fs/afs/cell.c | 11 ++++++----- fs/afs/dynroot.c | 15 +++++++++++++-- fs/afs/internal.h | 2 +- fs/afs/proc.c | 4 ++-- 4 files changed, 22 insertions(+), 10 deletions(-) diff --git a/fs/afs/cell.c b/fs/afs/cell.c index cee42646736c..96a6781f3653 100644 --- a/fs/afs/cell.c +++ b/fs/afs/cell.c @@ -64,7 +64,8 @@ static struct afs_cell *afs_find_cell_locked(struct afs_net *net, return ERR_PTR(-ENAMETOOLONG); if (!name) { - cell = net->ws_cell; + cell = rcu_dereference_protected(net->ws_cell, + lockdep_is_held(&net->cells_lock)); if (!cell) return ERR_PTR(-EDESTADDRREQ); goto found; @@ -388,8 +389,8 @@ int afs_cell_init(struct afs_net *net, const char *rootcell) /* install the new cell */ down_write(&net->cells_lock); afs_see_cell(new_root, afs_cell_trace_see_ws); - old_root = net->ws_cell; - net->ws_cell = new_root; + old_root = rcu_replace_pointer(net->ws_cell, new_root, + lockdep_is_held(&net->cells_lock)); up_write(&net->cells_lock); afs_unuse_cell(net, old_root, afs_cell_trace_unuse_ws); @@ -945,8 +946,8 @@ void afs_cell_purge(struct afs_net *net) _enter(""); down_write(&net->cells_lock); - ws = net->ws_cell; 
- net->ws_cell = NULL; + ws = rcu_replace_pointer(net->ws_cell, NULL, + lockdep_is_held(&net->cells_lock)); up_write(&net->cells_lock); afs_unuse_cell(net, ws, afs_cell_trace_unuse_ws); diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c index d8bf52f77d93..008698d706ca 100644 --- a/fs/afs/dynroot.c +++ b/fs/afs/dynroot.c @@ -314,12 +314,23 @@ static const char *afs_atcell_get_link(struct dentry *dentry, struct inode *inod const char *name; bool dotted = vnode->fid.vnode == 3; - if (!net->ws_cell) + if (!dentry) { + /* We're in RCU-pathwalk. */ + cell = rcu_dereference(net->ws_cell); + if (dotted) + name = cell->name - 1; + else + name = cell->name; + /* Shouldn't need to set a delayed call. */ + return name; + } + + if (!rcu_access_pointer(net->ws_cell)) return ERR_PTR(-ENOENT); down_read(&net->cells_lock); - cell = net->ws_cell; + cell = rcu_dereference_protected(net->ws_cell, lockdep_is_held(&net->cells_lock)); if (dotted) name = cell->name - 1; else diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 90f407774a9a..df30bd62da79 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -287,7 +287,7 @@ struct afs_net { /* Cell database */ struct rb_root cells; - struct afs_cell *ws_cell; + struct afs_cell __rcu *ws_cell; struct work_struct cells_manager; struct timer_list cells_timer; atomic_t cells_outstanding; diff --git a/fs/afs/proc.c b/fs/afs/proc.c index e7614f4f30c2..12c88d8be3fe 100644 --- a/fs/afs/proc.c +++ b/fs/afs/proc.c @@ -206,7 +206,7 @@ static int afs_proc_rootcell_show(struct seq_file *m, void *v) net = afs_seq2net_single(m); down_read(&net->cells_lock); - cell = net->ws_cell; + cell = rcu_dereference_protected(net->ws_cell, lockdep_is_held(&net->cells_lock)); if (cell) seq_printf(m, "%s\n", cell->name); up_read(&net->cells_lock); @@ -242,7 +242,7 @@ static int afs_proc_rootcell_write(struct file *file, char *buf, size_t size) ret = -EEXIST; inode_lock(file_inode(file)); - if (!net->ws_cell) + if (!rcu_access_pointer(net->ws_cell)) ret = 
afs_cell_init(net, buf); else printk("busy\n");
From patchwork Mon Mar 10 09:41:55 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 14009479
From: David Howells
To: Marc Dionne
Cc: David Howells, Christian Brauner, linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 02/11] afs: Remove the "autocell" mount option
Date: Mon, 10 Mar 2025 09:41:55 +0000
Message-ID: <20250310094206.801057-3-dhowells@redhat.com>
In-Reply-To: <20250310094206.801057-1-dhowells@redhat.com>
References: <20250310094206.801057-1-dhowells@redhat.com>
Precedence: bulk
X-Mailing-List:
linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4 Remove the "autocell" mount option. It was an attempt to do automounting of arbitrary cells based on what the user looked up but within the root directory of a mounted volume. This isn't really the right thing to do, and using the "dyn" mount option to get the dynamic root is the right way to do it. The kafs-client package uses "-o dyn" when mounting /afs, so it should be safe to drop "-o autocell". Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250224234154.2014840-7-dhowells@redhat.com/ # v1 --- fs/afs/dir.c | 5 ++--- fs/afs/dynroot.c | 5 +---- fs/afs/internal.h | 2 -- fs/afs/super.c | 5 ----- 4 files changed, 3 insertions(+), 14 deletions(-) diff --git a/fs/afs/dir.c b/fs/afs/dir.c index 02cbf38e1a77..9f62b8938350 100644 --- a/fs/afs/dir.c +++ b/fs/afs/dir.c @@ -1004,9 +1004,8 @@ static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry, afs_stat_v(dvnode, n_lookup); inode = afs_do_lookup(dir, dentry); if (inode == ERR_PTR(-ENOENT)) - inode = afs_try_auto_mntpt(dentry, dir); - - if (!IS_ERR_OR_NULL(inode)) + inode = NULL; + else if (!IS_ERR_OR_NULL(inode)) fid = AFS_FS_I(inode)->fid; _debug("splice %p", dentry->d_inode); diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c index 008698d706ca..0b4cc291c65e 100644 --- a/fs/afs/dynroot.c +++ b/fs/afs/dynroot.c @@ -155,7 +155,7 @@ static int afs_probe_cell_name(struct dentry *dentry) * Try to auto mount the mountpoint with pseudo directory, if the autocell * operation is setted. 
*/ -struct inode *afs_try_auto_mntpt(struct dentry *dentry, struct inode *dir) +static struct inode *afs_try_auto_mntpt(struct dentry *dentry, struct inode *dir) { struct afs_vnode *vnode = AFS_FS_I(dir); struct inode *inode; @@ -164,9 +164,6 @@ struct inode *afs_try_auto_mntpt(struct dentry *dentry, struct inode *dir) _enter("%p{%pd}, {%llx:%llu}", dentry, dentry, vnode->fid.vid, vnode->fid.vnode); - if (!test_bit(AFS_VNODE_AUTOCELL, &vnode->flags)) - goto out; - ret = afs_probe_cell_name(dentry); if (ret < 0) goto out; diff --git a/fs/afs/internal.h b/fs/afs/internal.h index df30bd62da79..0e00e061f0d9 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -700,7 +700,6 @@ struct afs_vnode { #define AFS_VNODE_ZAP_DATA 3 /* set if vnode's data should be invalidated */ #define AFS_VNODE_DELETED 4 /* set if vnode deleted on server */ #define AFS_VNODE_MOUNTPOINT 5 /* set if vnode is a mountpoint symlink */ -#define AFS_VNODE_AUTOCELL 6 /* set if Vnode is an auto mount point */ #define AFS_VNODE_PSEUDODIR 7 /* set if Vnode is a pseudo directory */ #define AFS_VNODE_NEW_CONTENT 8 /* Set if file has new content (create/trunc-0) */ #define AFS_VNODE_SILLY_DELETED 9 /* Set if file has been silly-deleted */ @@ -1111,7 +1110,6 @@ extern int afs_silly_iput(struct dentry *, struct inode *); extern const struct inode_operations afs_dynroot_inode_operations; extern const struct dentry_operations afs_dynroot_dentry_operations; -extern struct inode *afs_try_auto_mntpt(struct dentry *, struct inode *); extern int afs_dynroot_mkdir(struct afs_net *, struct afs_cell *); extern void afs_dynroot_rmdir(struct afs_net *, struct afs_cell *); extern int afs_dynroot_populate(struct super_block *); diff --git a/fs/afs/super.c b/fs/afs/super.c index a9bee610674e..2f18aa8e2806 100644 --- a/fs/afs/super.c +++ b/fs/afs/super.c @@ -194,8 +194,6 @@ static int afs_show_options(struct seq_file *m, struct dentry *root) if (as->dyn_root) seq_puts(m, ",dyn"); - if (test_bit(AFS_VNODE_AUTOCELL, 
&AFS_FS_I(d_inode(root))->flags)) - seq_puts(m, ",autocell"); switch (as->flock_mode) { case afs_flock_mode_unset: break; case afs_flock_mode_local: p = "local"; break; @@ -478,9 +476,6 @@ static int afs_fill_super(struct super_block *sb, struct afs_fs_context *ctx) if (IS_ERR(inode)) return PTR_ERR(inode); - if (ctx->autocell || as->dyn_root) - set_bit(AFS_VNODE_AUTOCELL, &AFS_FS_I(inode)->flags); - ret = -ENOMEM; sb->s_root = d_make_root(inode); if (!sb->s_root)
From patchwork Mon Mar 10 09:41:56 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 14009480
From: David Howells
To: Marc Dionne
Cc: David Howells,
Christian Brauner, linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 03/11] afs: Change dynroot to create contents on demand
Date: Mon, 10 Mar 2025 09:41:56 +0000
Message-ID: <20250310094206.801057-4-dhowells@redhat.com>
In-Reply-To: <20250310094206.801057-1-dhowells@redhat.com>
References: <20250310094206.801057-1-dhowells@redhat.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93

Change the AFS dynamic root to do things differently:

 (1) Rather than having the creation of cell records create inodes and
     dentries for cell mountpoints, create them on demand during lookup.

     This simplifies cell management and locking as we no longer have to
     create these objects in advance *and* on speculative lookup by the
     user for a cell that isn't precreated.

 (2) Rather than using the libfs dentry-based readdir (the dentries now no
     longer exist until accessed from (1)), have readdir generate the
     contents by reading the list of cells. The @cell symlinks get pushed
     in positions 2 and 3 if rootcell has been configured.

 (3) Make the @cell symlink dentries persist for the life of the superblock
     or until reclaimed, but make cell mountpoints disappear immediately if
     unused.

     It's not perfect as someone doing an "ls -l /afs" may create a whole
     bunch of dentries which will be garbage collected immediately. But
     any dentry that gets automounted will be pinned by the mount, so it
     shouldn't be too bad.

 (4) Allocate the inode numbers for the cell mountpoints from an IDR to
     prevent duplicates appearing in the event it cycles round. The number
     allocated from the IDR is doubled to provide two inode numbers - one
     for the normal cell name (RO) and one for the dotted cell name (RW).
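
For illustration only, the numbering scheme in (4) can be sketched in plain userspace C. This is a minimal sketch, assuming the inode number is derived as "slot * 2 + dotted", which is how afs_iget_pseudo_dir() is called from the lookup path in this patch. The helper names dynroot_cell_ino(), dynroot_ino_to_slot() and dynroot_ino_is_dotted() are hypothetical and do not exist in fs/afs/.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not in fs/afs/): turn the slot number handed out by
 * the IDR into one of the two inode numbers reserved for a cell.  The even
 * number is the plain cell name (RO), the odd one the dotted name (RW).
 */
static unsigned long dynroot_cell_ino(unsigned int idr_slot, bool dotted)
{
	return (unsigned long)idr_slot * 2 + (dotted ? 1UL : 0UL);
}

/* Hypothetical inverse: recover the IDR slot from an inode number. */
static unsigned int dynroot_ino_to_slot(unsigned long ino)
{
	return (unsigned int)(ino / 2);
}

/* Hypothetical predicate: does this inode number name the dotted (RW) entry? */
static bool dynroot_ino_is_dotted(unsigned long ino)
{
	return (ino & 1UL) != 0;
}
```

Since the patch asks the IDR for slots starting at 2, the first cell would get inode numbers 4 and 5 under this scheme, which leaves the numbers below 4 for ".", "..", "@cell" and ".@cell", as the AFS_MIN_DYNROOT_CELL_INO comment in the patch puts it.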
Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250224234154.2014840-8-dhowells@redhat.com/ # v1 --- fs/afs/cell.c | 9 +- fs/afs/dynroot.c | 482 +++++++++++++++---------------------- fs/afs/internal.h | 8 +- fs/afs/main.c | 3 + fs/afs/super.c | 8 +- include/trace/events/afs.h | 2 + 6 files changed, 213 insertions(+), 299 deletions(-) diff --git a/fs/afs/cell.c b/fs/afs/cell.c index 96a6781f3653..c2e44cd2eb96 100644 --- a/fs/afs/cell.c +++ b/fs/afs/cell.c @@ -204,7 +204,13 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net, cell->dns_status = vllist->status; smp_store_release(&cell->dns_lookup_count, 1); /* vs source/status */ atomic_inc(&net->cells_outstanding); + ret = idr_alloc_cyclic(&net->cells_dyn_ino, cell, + 2, INT_MAX / 2, GFP_KERNEL); + if (ret < 0) + goto error; + cell->dynroot_ino = ret; cell->debug_id = atomic_inc_return(&cell_debug_id); + trace_afs_cell(cell->debug_id, 1, 0, afs_cell_trace_alloc); _leave(" = %p", cell); @@ -513,6 +519,7 @@ static void afs_cell_destroy(struct rcu_head *rcu) afs_put_vlserverlist(net, rcu_access_pointer(cell->vl_servers)); afs_unuse_cell(net, cell->alias_of, afs_cell_trace_unuse_alias); key_put(cell->anonymous_key); + idr_remove(&net->cells_dyn_ino, cell->dynroot_ino); kfree(cell->name - 1); kfree(cell); @@ -706,7 +713,6 @@ static int afs_activate_cell(struct afs_net *net, struct afs_cell *cell) if (cell->proc_link.next) cell->proc_link.next->pprev = &cell->proc_link.next; - afs_dynroot_mkdir(net, cell); mutex_unlock(&net->proc_cells_lock); return 0; } @@ -723,7 +729,6 @@ static void afs_deactivate_cell(struct afs_net *net, struct afs_cell *cell) mutex_lock(&net->proc_cells_lock); if (!hlist_unhashed(&cell->proc_link)) hlist_del_rcu(&cell->proc_link); - afs_dynroot_rmdir(net, cell); mutex_unlock(&net->proc_cells_lock); _leave(""); diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c index 0b4cc291c65e..eb20e231d7ac 
100644 --- a/fs/afs/dynroot.c +++ b/fs/afs/dynroot.c @@ -10,16 +10,19 @@ #include #include "internal.h" -static atomic_t afs_autocell_ino; +#define AFS_MIN_DYNROOT_CELL_INO 4 /* Allow for ., .., @cell, .@cell */ +#define AFS_MAX_DYNROOT_CELL_INO ((unsigned int)INT_MAX) + +static struct dentry *afs_lookup_atcell(struct inode *dir, struct dentry *dentry, ino_t ino); /* * iget5() comparator for inode created by autocell operations - * - * These pseudo inodes don't match anything. */ static int afs_iget5_pseudo_test(struct inode *inode, void *opaque) { - return 0; + struct afs_fid *fid = opaque; + + return inode->i_ino == fid->vnode; } /* @@ -39,28 +42,16 @@ static int afs_iget5_pseudo_set(struct inode *inode, void *opaque) } /* - * Create an inode for a dynamic root directory or an autocell dynamic - * automount dir. + * Create an inode for an autocell dynamic automount dir. */ -struct inode *afs_iget_pseudo_dir(struct super_block *sb, bool root) +static struct inode *afs_iget_pseudo_dir(struct super_block *sb, ino_t ino) { - struct afs_super_info *as = AFS_FS_S(sb); struct afs_vnode *vnode; struct inode *inode; - struct afs_fid fid = {}; + struct afs_fid fid = { .vnode = ino, .unique = 1, }; _enter(""); - if (as->volume) - fid.vid = as->volume->vid; - if (root) { - fid.vnode = 1; - fid.unique = 1; - } else { - fid.vnode = atomic_inc_return(&afs_autocell_ino); - fid.unique = 0; - } - inode = iget5_locked(sb, fid.vnode, afs_iget5_pseudo_test, afs_iget5_pseudo_set, &fid); if (!inode) { @@ -73,112 +64,70 @@ struct inode *afs_iget_pseudo_dir(struct super_block *sb, bool root) vnode = AFS_FS_I(inode); - /* there shouldn't be an existing inode */ - BUG_ON(!(inode->i_state & I_NEW)); - - netfs_inode_init(&vnode->netfs, NULL, false); - inode->i_size = 0; - inode->i_mode = S_IFDIR | S_IRUGO | S_IXUGO; - if (root) { - inode->i_op = &afs_dynroot_inode_operations; - inode->i_fop = &simple_dir_operations; - } else { - inode->i_op = &afs_autocell_inode_operations; - } - 
set_nlink(inode, 2); - inode->i_uid = GLOBAL_ROOT_UID; - inode->i_gid = GLOBAL_ROOT_GID; - simple_inode_init_ts(inode); - inode->i_blocks = 0; - inode->i_generation = 0; - - set_bit(AFS_VNODE_PSEUDODIR, &vnode->flags); - if (!root) { + if (inode->i_state & I_NEW) { + netfs_inode_init(&vnode->netfs, NULL, false); + simple_inode_init_ts(inode); + set_nlink(inode, 2); + inode->i_size = 0; + inode->i_mode = S_IFDIR | 0555; + inode->i_op = &afs_autocell_inode_operations; + inode->i_uid = GLOBAL_ROOT_UID; + inode->i_gid = GLOBAL_ROOT_GID; + inode->i_blocks = 0; + inode->i_generation = 0; + inode->i_flags |= S_AUTOMOUNT | S_NOATIME; + + set_bit(AFS_VNODE_PSEUDODIR, &vnode->flags); set_bit(AFS_VNODE_MOUNTPOINT, &vnode->flags); - inode->i_flags |= S_AUTOMOUNT; - } - inode->i_flags |= S_NOATIME; - unlock_new_inode(inode); + unlock_new_inode(inode); + } _leave(" = %p", inode); return inode; } /* - * Probe to see if a cell may exist. This prevents positive dentries from - * being created unnecessarily. + * Try to automount the mountpoint with pseudo directory, if the autocell + * option is set. */ -static int afs_probe_cell_name(struct dentry *dentry) +static struct dentry *afs_dynroot_lookup_cell(struct inode *dir, struct dentry *dentry, + unsigned int flags) { - struct afs_cell *cell; + struct afs_cell *cell = NULL; struct afs_net *net = afs_d2net(dentry); + struct inode *inode = NULL; const char *name = dentry->d_name.name; size_t len = dentry->d_name.len; - char *result = NULL; - int ret; + bool dotted = false; + int ret = -ENOENT; /* Names prefixed with a dot are R/W mounts. 
*/ if (name[0] == '.') { - if (len == 1) - return -EINVAL; name++; len--; + dotted = true; } - cell = afs_find_cell(net, name, len, afs_cell_trace_use_probe); - if (!IS_ERR(cell)) { - afs_unuse_cell(net, cell, afs_cell_trace_unuse_probe); - return 0; - } - - ret = dns_query(net->net, "afsdb", name, len, "srv=1", - &result, NULL, false); - if (ret == -ENODATA || ret == -ENOKEY || ret == 0) - ret = -ENOENT; - if (ret > 0 && ret >= sizeof(struct dns_server_list_v1_header)) { - struct dns_server_list_v1_header *v1 = (void *)result; - - if (v1->hdr.zero == 0 && - v1->hdr.content == DNS_PAYLOAD_IS_SERVER_LIST && - v1->hdr.version == 1 && - (v1->status != DNS_LOOKUP_GOOD && - v1->status != DNS_LOOKUP_GOOD_WITH_BAD)) - return -ENOENT; - + cell = afs_lookup_cell(net, name, len, NULL, false); + if (IS_ERR(cell)) { + ret = PTR_ERR(cell); + goto out_no_cell; } - kfree(result); - return ret; -} - -/* - * Try to auto mount the mountpoint with pseudo directory, if the autocell - * operation is setted. - */ -static struct inode *afs_try_auto_mntpt(struct dentry *dentry, struct inode *dir) -{ - struct afs_vnode *vnode = AFS_FS_I(dir); - struct inode *inode; - int ret = -ENOENT; - - _enter("%p{%pd}, {%llx:%llu}", - dentry, dentry, vnode->fid.vid, vnode->fid.vnode); - - ret = afs_probe_cell_name(dentry); - if (ret < 0) - goto out; - - inode = afs_iget_pseudo_dir(dir->i_sb, false); + inode = afs_iget_pseudo_dir(dir->i_sb, cell->dynroot_ino * 2 + dotted); if (IS_ERR(inode)) { ret = PTR_ERR(inode); goto out; } - _leave("= %p", inode); - return inode; + dentry->d_fsdata = cell; + return d_splice_alias(inode, dentry); out: - _leave("= %d", ret); + afs_unuse_cell(cell->net, cell, afs_cell_trace_unuse_lookup_dynroot); +out_no_cell: + if (!inode) + return d_splice_alias(inode, dentry); return ret == -ENOENT ? 
NULL : ERR_PTR(ret); } @@ -190,8 +139,6 @@ static struct dentry *afs_dynroot_lookup(struct inode *dir, struct dentry *dentr { _enter("%pd", dentry); - ASSERTCMP(d_inode(dentry), ==, NULL); - if (flags & LOOKUP_CREATE) return ERR_PTR(-EOPNOTSUPP); @@ -200,98 +147,49 @@ static struct dentry *afs_dynroot_lookup(struct inode *dir, struct dentry *dentr return ERR_PTR(-ENAMETOOLONG); } - return d_splice_alias(afs_try_auto_mntpt(dentry, dir), dentry); + if (dentry->d_name.len == 5 && + memcmp(dentry->d_name.name, "@cell", 5) == 0) + return afs_lookup_atcell(dir, dentry, 2); + + if (dentry->d_name.len == 6 && + memcmp(dentry->d_name.name, ".@cell", 6) == 0) + return afs_lookup_atcell(dir, dentry, 3); + + return afs_dynroot_lookup_cell(dir, dentry, flags); } const struct inode_operations afs_dynroot_inode_operations = { .lookup = afs_dynroot_lookup, }; -const struct dentry_operations afs_dynroot_dentry_operations = { - .d_delete = always_delete_dentry, - .d_release = afs_d_release, - .d_automount = afs_d_automount, -}; - -/* - * Create a manually added cell mount directory. - * - The caller must hold net->proc_cells_lock - */ -int afs_dynroot_mkdir(struct afs_net *net, struct afs_cell *cell) -{ - struct super_block *sb = net->dynroot_sb; - struct dentry *root, *subdir, *dsubdir; - char *dotname = cell->name - 1; - int ret; - - if (!sb || atomic_read(&sb->s_active) == 0) - return 0; - - /* Let the ->lookup op do the creation */ - root = sb->s_root; - inode_lock(root->d_inode); - subdir = lookup_one_len(cell->name, root, cell->name_len); - if (IS_ERR(subdir)) { - ret = PTR_ERR(subdir); - goto unlock; - } - - dsubdir = lookup_one_len(dotname, root, cell->name_len + 1); - if (IS_ERR(dsubdir)) { - ret = PTR_ERR(dsubdir); - dput(subdir); - goto unlock; - } - - /* Note that we're retaining extra refs on the dentries. 
*/ - subdir->d_fsdata = (void *)1UL; - dsubdir->d_fsdata = (void *)1UL; - ret = 0; -unlock: - inode_unlock(root->d_inode); - return ret; -} - -static void afs_dynroot_rm_one_dir(struct dentry *root, const char *name, size_t name_len) +static void afs_dynroot_d_release(struct dentry *dentry) { - struct dentry *subdir; - - /* Don't want to trigger a lookup call, which will re-add the cell */ - subdir = try_lookup_one_len(name, root, name_len); - if (IS_ERR_OR_NULL(subdir)) { - _debug("lookup %ld", PTR_ERR(subdir)); - return; - } - - _debug("rmdir %pd %u", subdir, d_count(subdir)); + struct afs_cell *cell = dentry->d_fsdata; - if (subdir->d_fsdata) { - _debug("unpin %u", d_count(subdir)); - subdir->d_fsdata = NULL; - dput(subdir); - } - dput(subdir); + afs_unuse_cell(cell->net, cell, afs_cell_trace_unuse_dynroot_mntpt); } /* - * Remove a manually added cell mount directory. - * - The caller must hold net->proc_cells_lock + * Keep @cell symlink dentries around, but only keep cell autodirs when they're + * being used. 
*/ -void afs_dynroot_rmdir(struct afs_net *net, struct afs_cell *cell) +static int afs_dynroot_delete_dentry(const struct dentry *dentry) { - struct super_block *sb = net->dynroot_sb; - char *dotname = cell->name - 1; - - if (!sb || atomic_read(&sb->s_active) == 0) - return; + const struct qstr *name = &dentry->d_name; - inode_lock(sb->s_root->d_inode); - afs_dynroot_rm_one_dir(sb->s_root, cell->name, cell->name_len); - afs_dynroot_rm_one_dir(sb->s_root, dotname, cell->name_len + 1); - inode_unlock(sb->s_root->d_inode); - _leave(""); + if (name->len == 5 && memcmp(name->name, "@cell", 5) == 0) + return 0; + if (name->len == 6 && memcmp(name->name, ".@cell", 6) == 0) + return 0; + return 1; } +const struct dentry_operations afs_dynroot_dentry_operations = { + .d_delete = afs_dynroot_delete_dentry, + .d_release = afs_dynroot_d_release, + .d_automount = afs_d_automount, +}; + static void afs_atcell_delayed_put_cell(void *arg) { struct afs_cell *cell = arg; @@ -344,149 +242,163 @@ static const struct inode_operations afs_atcell_inode_operations = { }; /* - * Look up @cell or .@cell in a dynroot directory. This is a substitution for - * the local cell name for the net namespace. + * Create an inode for the @cell or .@cell symlinks. 
*/ -static struct dentry *afs_dynroot_create_symlink(struct dentry *root, const char *name) +static struct dentry *afs_lookup_atcell(struct inode *dir, struct dentry *dentry, ino_t ino) { struct afs_vnode *vnode; - struct afs_fid fid = { .vnode = 2, .unique = 1, }; - struct dentry *dentry; struct inode *inode; + struct afs_fid fid = { .vnode = ino, .unique = 1, }; - if (name[0] == '.') - fid.vnode = 3; - - dentry = d_alloc_name(root, name); - if (!dentry) - return ERR_PTR(-ENOMEM); - - inode = iget5_locked(dentry->d_sb, fid.vnode, + inode = iget5_locked(dir->i_sb, fid.vnode, afs_iget5_pseudo_test, afs_iget5_pseudo_set, &fid); - if (!inode) { - dput(dentry); + if (!inode) return ERR_PTR(-ENOMEM); - } vnode = AFS_FS_I(inode); - /* there shouldn't be an existing inode */ - if (WARN_ON_ONCE(!(inode->i_state & I_NEW))) { - iput(inode); - dput(dentry); - return ERR_PTR(-EIO); + if (inode->i_state & I_NEW) { + netfs_inode_init(&vnode->netfs, NULL, false); + simple_inode_init_ts(inode); + set_nlink(inode, 1); + inode->i_size = 0; + inode->i_mode = S_IFLNK | 0555; + inode->i_op = &afs_atcell_inode_operations; + inode->i_uid = GLOBAL_ROOT_UID; + inode->i_gid = GLOBAL_ROOT_GID; + inode->i_blocks = 0; + inode->i_generation = 0; + inode->i_flags |= S_NOATIME; + + unlock_new_inode(inode); } - - netfs_inode_init(&vnode->netfs, NULL, false); - simple_inode_init_ts(inode); - set_nlink(inode, 1); - inode->i_size = 0; - inode->i_mode = S_IFLNK | 0555; - inode->i_op = &afs_atcell_inode_operations; - inode->i_uid = GLOBAL_ROOT_UID; - inode->i_gid = GLOBAL_ROOT_GID; - inode->i_blocks = 0; - inode->i_generation = 0; - inode->i_flags |= S_NOATIME; - - unlock_new_inode(inode); - d_splice_alias(inode, dentry); - return dentry; + return d_splice_alias(inode, dentry); } /* - * Create @cell and .@cell symlinks. + * Transcribe the cell database into readdir content under the RCU read lock. + * Each cell produces two entries, one prefixed with a dot and one not. 
*/ -static int afs_dynroot_symlink(struct afs_net *net) +static int afs_dynroot_readdir_cells(struct afs_net *net, struct dir_context *ctx) { - struct super_block *sb = net->dynroot_sb; - struct dentry *root, *symlink, *dsymlink; - int ret; - - /* Let the ->lookup op do the creation */ - root = sb->s_root; - inode_lock(root->d_inode); - symlink = afs_dynroot_create_symlink(root, "@cell"); - if (IS_ERR(symlink)) { - ret = PTR_ERR(symlink); - goto unlock; - } + const struct afs_cell *cell; + loff_t newpos; + + _enter("%llu", ctx->pos); + + for (;;) { + unsigned int ix = ctx->pos >> 1; + + cell = idr_get_next(&net->cells_dyn_ino, &ix); + if (!cell) + return 0; + if (READ_ONCE(cell->state) == AFS_CELL_FAILED || + READ_ONCE(cell->state) == AFS_CELL_REMOVED) { + ctx->pos += 2; + ctx->pos &= ~1; + continue; + } - dsymlink = afs_dynroot_create_symlink(root, ".@cell"); - if (IS_ERR(dsymlink)) { - ret = PTR_ERR(dsymlink); - dput(symlink); - goto unlock; - } + newpos = ix << 1; + if (newpos > ctx->pos) + ctx->pos = newpos; - /* Note that we're retaining extra refs on the dentries. */ - symlink->d_fsdata = (void *)1UL; - dsymlink->d_fsdata = (void *)1UL; - ret = 0; -unlock: - inode_unlock(root->d_inode); - return ret; + _debug("pos %llu -> cell %u", ctx->pos, cell->dynroot_ino); + + if ((ctx->pos & 1) == 0) { + if (!dir_emit(ctx, cell->name, cell->name_len, + cell->dynroot_ino, DT_DIR)) + return 0; + ctx->pos++; + } + if ((ctx->pos & 1) == 1) { + if (!dir_emit(ctx, cell->name - 1, cell->name_len + 1, + cell->dynroot_ino + 1, DT_DIR)) + return 0; + ctx->pos++; + } + } + return 0; } /* - * Populate a newly created dynamic root with cell names. + * Read the AFS dynamic root directory. This produces a list of cellnames, + * dotted and undotted, along with @cell and .@cell links if configured. 
*/ -int afs_dynroot_populate(struct super_block *sb) +static int afs_dynroot_readdir(struct file *file, struct dir_context *ctx) { - struct afs_cell *cell; - struct afs_net *net = afs_sb2net(sb); - int ret; + struct afs_net *net = afs_d2net(file->f_path.dentry); + int ret = 0; - mutex_lock(&net->proc_cells_lock); - - net->dynroot_sb = sb; - ret = afs_dynroot_symlink(net); - if (ret < 0) - goto error; + if (!dir_emit_dots(file, ctx)) + return 0; - hlist_for_each_entry(cell, &net->proc_cells, proc_link) { - ret = afs_dynroot_mkdir(net, cell); - if (ret < 0) - goto error; + if (ctx->pos == 2) { + if (rcu_access_pointer(net->ws_cell) && + !dir_emit(ctx, "@cell", 5, 2, DT_LNK)) + return 0; + ctx->pos = 3; + } + if (ctx->pos == 3) { + if (rcu_access_pointer(net->ws_cell) && + !dir_emit(ctx, ".@cell", 6, 3, DT_LNK)) + return 0; + ctx->pos = 4; } - ret = 0; -out: - mutex_unlock(&net->proc_cells_lock); + if ((unsigned long long)ctx->pos <= AFS_MAX_DYNROOT_CELL_INO) { + rcu_read_lock(); + ret = afs_dynroot_readdir_cells(net, ctx); + rcu_read_unlock(); + } return ret; - -error: - net->dynroot_sb = NULL; - goto out; } +static const struct file_operations afs_dynroot_file_operations = { + .llseek = generic_file_llseek, + .read = generic_read_dir, + .iterate_shared = afs_dynroot_readdir, + .fsync = noop_fsync, +}; + /* - * When a dynamic root that's in the process of being destroyed, depopulate it - * of pinned directories. + * Create an inode for a dynamic root directory. 
*/ -void afs_dynroot_depopulate(struct super_block *sb) +struct inode *afs_dynroot_iget_root(struct super_block *sb) { - struct afs_net *net = afs_sb2net(sb); - struct dentry *root = sb->s_root, *subdir; - - /* Prevent more subdirs from being created */ - mutex_lock(&net->proc_cells_lock); - if (net->dynroot_sb == sb) - net->dynroot_sb = NULL; - mutex_unlock(&net->proc_cells_lock); - - if (root) { - struct hlist_node *n; - inode_lock(root->d_inode); - - /* Remove all the pins for dirs created for manually added cells */ - hlist_for_each_entry_safe(subdir, n, &root->d_children, d_sib) { - if (subdir->d_fsdata) { - subdir->d_fsdata = NULL; - dput(subdir); - } - } + struct afs_super_info *as = AFS_FS_S(sb); + struct afs_vnode *vnode; + struct inode *inode; + struct afs_fid fid = { .vid = 0, .vnode = 1, .unique = 1,}; + + if (as->volume) + fid.vid = as->volume->vid; - inode_unlock(root->d_inode); + inode = iget5_locked(sb, fid.vnode, + afs_iget5_pseudo_test, afs_iget5_pseudo_set, &fid); + if (!inode) + return ERR_PTR(-ENOMEM); + + vnode = AFS_FS_I(inode); + + /* there shouldn't be an existing inode */ + if (inode->i_state & I_NEW) { + netfs_inode_init(&vnode->netfs, NULL, false); + simple_inode_init_ts(inode); + set_nlink(inode, 2); + inode->i_size = 0; + inode->i_mode = S_IFDIR | 0555; + inode->i_op = &afs_dynroot_inode_operations; + inode->i_fop = &afs_dynroot_file_operations; + inode->i_uid = GLOBAL_ROOT_UID; + inode->i_gid = GLOBAL_ROOT_GID; + inode->i_blocks = 0; + inode->i_generation = 0; + inode->i_flags |= S_NOATIME; + + set_bit(AFS_VNODE_PSEUDODIR, &vnode->flags); + unlock_new_inode(inode); } + _leave(" = %p", inode); + return inode; } diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 0e00e061f0d9..47e98a78f59f 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -287,6 +287,7 @@ struct afs_net { /* Cell database */ struct rb_root cells; + struct idr cells_dyn_ino; /* cell->dynroot_ino mapping */ struct afs_cell __rcu *ws_cell; struct work_struct 
cells_manager; struct timer_list cells_timer; @@ -398,6 +399,7 @@ struct afs_cell { enum dns_lookup_status dns_status:8; /* Latest status of data from lookup */ unsigned int dns_lookup_count; /* Counter of DNS lookups */ unsigned int debug_id; + unsigned int dynroot_ino; /* Inode numbers for dynroot (a pair) */ /* The volumes belonging to this cell */ struct rw_semaphore vs_lock; /* Lock for server->volumes */ @@ -1110,10 +1112,7 @@ extern int afs_silly_iput(struct dentry *, struct inode *); extern const struct inode_operations afs_dynroot_inode_operations; extern const struct dentry_operations afs_dynroot_dentry_operations; -extern int afs_dynroot_mkdir(struct afs_net *, struct afs_cell *); -extern void afs_dynroot_rmdir(struct afs_net *, struct afs_cell *); -extern int afs_dynroot_populate(struct super_block *); -extern void afs_dynroot_depopulate(struct super_block *); +struct inode *afs_dynroot_iget_root(struct super_block *sb); /* * file.c @@ -1226,7 +1225,6 @@ int afs_readlink(struct dentry *dentry, char __user *buffer, int buflen); extern void afs_vnode_commit_status(struct afs_operation *, struct afs_vnode_param *); extern int afs_fetch_status(struct afs_vnode *, struct key *, bool, afs_access_t *); extern int afs_ilookup5_test_by_fid(struct inode *, void *); -extern struct inode *afs_iget_pseudo_dir(struct super_block *, bool); extern struct inode *afs_iget(struct afs_operation *, struct afs_vnode_param *); extern struct inode *afs_root_iget(struct super_block *, struct key *); extern int afs_getattr(struct mnt_idmap *idmap, const struct path *, diff --git a/fs/afs/main.c b/fs/afs/main.c index 1ae0067f772d..a7c7dc268302 100644 --- a/fs/afs/main.c +++ b/fs/afs/main.c @@ -76,6 +76,7 @@ static int __net_init afs_net_init(struct net *net_ns) mutex_init(&net->socket_mutex); net->cells = RB_ROOT; + idr_init(&net->cells_dyn_ino); init_rwsem(&net->cells_lock); INIT_WORK(&net->cells_manager, afs_manage_cells); timer_setup(&net->cells_timer, afs_cells_timer, 0); @@ 
-137,6 +138,7 @@ static int __net_init afs_net_init(struct net *net_ns) error_proc: afs_put_sysnames(net->sysnames); error_sysnames: + idr_destroy(&net->cells_dyn_ino); net->live = false; return ret; } @@ -155,6 +157,7 @@ static void __net_exit afs_net_exit(struct net *net_ns) afs_close_socket(net); afs_proc_cleanup(net); afs_put_sysnames(net->sysnames); + idr_destroy(&net->cells_dyn_ino); kfree_rcu(rcu_access_pointer(net->address_prefs), rcu); } diff --git a/fs/afs/super.c b/fs/afs/super.c index 2f18aa8e2806..dfc109f48ad5 100644 --- a/fs/afs/super.c +++ b/fs/afs/super.c @@ -466,7 +466,7 @@ static int afs_fill_super(struct super_block *sb, struct afs_fs_context *ctx) /* allocate the root inode and dentry */ if (as->dyn_root) { - inode = afs_iget_pseudo_dir(sb, true); + inode = afs_dynroot_iget_root(sb); } else { sprintf(sb->s_id, "%llu", as->volume->vid); afs_activate_volume(as->volume); @@ -483,9 +483,6 @@ static int afs_fill_super(struct super_block *sb, struct afs_fs_context *ctx) if (as->dyn_root) { sb->s_d_op = &afs_dynroot_dentry_operations; - ret = afs_dynroot_populate(sb); - if (ret < 0) - goto error; } else { sb->s_d_op = &afs_fs_dentry_operations; rcu_assign_pointer(as->volume->sb, sb); @@ -534,9 +531,6 @@ static void afs_kill_super(struct super_block *sb) { struct afs_super_info *as = AFS_FS_S(sb); - if (as->dyn_root) - afs_dynroot_depopulate(sb); - /* Clear the callback interests (which will do ilookup5) before * deactivating the superblock. 
*/ diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index 958a2460330c..c19132605f41 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -190,8 +190,10 @@ enum yfs_cm_operation { EM(afs_cell_trace_unuse_alias, "UNU alias ") \ EM(afs_cell_trace_unuse_check_alias, "UNU chk-al") \ EM(afs_cell_trace_unuse_delete, "UNU delete") \ + EM(afs_cell_trace_unuse_dynroot_mntpt, "UNU dyn-mp") \ EM(afs_cell_trace_unuse_fc, "UNU fc ") \ EM(afs_cell_trace_unuse_lookup, "UNU lookup") \ + EM(afs_cell_trace_unuse_lookup_dynroot, "UNU lu-dyn") \ EM(afs_cell_trace_unuse_mntpt, "UNU mntpt ") \ EM(afs_cell_trace_unuse_no_pin, "UNU no-pin") \ EM(afs_cell_trace_unuse_parse, "UNU parse ") \ From patchwork Mon Mar 10 09:41:57 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 14009481 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B973E226885 for ; Mon, 10 Mar 2025 09:42:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599760; cv=none; b=Qq2UXFPQ3t8YNsh5hWyBK8h+UDOdBa8hghqPJWbbf8gjb7bfnK9rpT7dLwULdZJ8XHcz0xJT+zJ4lzj+QrH4AVCvEUgd4d1ANtLQtHuVon84XHG9he5FSk6srXkQFQc4PfANKk56b97COFlV9ioS8FpL0y47PcdmnZUmbBXv66s= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599760; c=relaxed/simple; bh=/htZgUzt8knAr+hiRacdqz5GeDl5/4xYQD4/ZC7eRD8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=KJKDsleZNkm1CqwjVVPLd7OjCjCKxbc1SLK5legcefhq9T8tPpH+sZneLrGMq82vrS9as+A1cYyRHxa3bjDoGhK9gLI6c/ugcM1IuK3YpH42SuCeOUdhofR1NREtvJUSGMocp1V94OX5rdDpu4Uge7ZX4Pyt54DPlCPbF1hOsoc= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=UPUFwYCt; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="UPUFwYCt" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741599756; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=16o31y5LryjLyHYuo1OhSgUhMY7/YxydWgak/ianHBU=; b=UPUFwYCtphDk8OC4BeXoTZni7xiIeVuwH/K6LOJl7Z/IwjaRKiAjAsXNr2n+ZsLWtIssn8 Z+srzmnXuvOJ2jNUM4JAnqlKaPydEMIrsuBWGGX/SJ0TcYtX0lK5QUECbkU/fcAf0gut71 KK1ziEmoHPsLDEZH0/xVOQt49URKZ1A= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-602-4W0z7l4xPNuYI2Te3AP8GA-1; Mon, 10 Mar 2025 05:42:33 -0400 X-MC-Unique: 4W0z7l4xPNuYI2Te3AP8GA-1 X-Mimecast-MFC-AGG-ID: 4W0z7l4xPNuYI2Te3AP8GA_1741599752 Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 23822180025A; Mon, 10 Mar 2025 09:42:31 +0000 (UTC) Received: from 
warthog.procyon.org.com (unknown [10.42.28.61]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 7555719560AB; Mon, 10 Mar 2025 09:42:29 +0000 (UTC) From: David Howells To: Marc Dionne Cc: David Howells , Christian Brauner , linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v4 04/11] afs: Improve afs_volume tracing to display a debug ID Date: Mon, 10 Mar 2025 09:41:57 +0000 Message-ID: <20250310094206.801057-5-dhowells@redhat.com> In-Reply-To: <20250310094206.801057-1-dhowells@redhat.com> References: <20250310094206.801057-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 Improve the tracing of afs_volume objects to include displaying a debug ID so that different instances of volumes with the same "vid" can be distinguished. Also be consistent about displaying the volume's refcount (and not the cell's). 
Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250224234154.2014840-9-dhowells@redhat.com/ # v1 --- fs/afs/internal.h | 1 + fs/afs/volume.c | 15 +++++++++------ include/trace/events/afs.h | 18 +++++++++++------- 3 files changed, 21 insertions(+), 13 deletions(-) diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 47e98a78f59f..97045e2a455d 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -623,6 +623,7 @@ struct afs_volume { afs_volid_t vid; /* The volume ID of this volume */ afs_volid_t vids[AFS_MAXTYPES]; /* All associated volume IDs */ refcount_t ref; + unsigned int debug_id; /* Debugging ID for traces */ time64_t update_at; /* Time at which to next update */ struct afs_cell *cell; /* Cell to which belongs (pins ref) */ struct rb_node cell_node; /* Link in cell->volumes */ diff --git a/fs/afs/volume.c b/fs/afs/volume.c index af3a3f57c1b3..0efff3d25133 100644 --- a/fs/afs/volume.c +++ b/fs/afs/volume.c @@ -10,6 +10,7 @@ #include "internal.h" static unsigned __read_mostly afs_volume_record_life = 60 * 60; +static atomic_t afs_volume_debug_id; static void afs_destroy_volume(struct work_struct *work); @@ -59,7 +60,7 @@ static void afs_remove_volume_from_cell(struct afs_volume *volume) struct afs_cell *cell = volume->cell; if (!hlist_unhashed(&volume->proc_link)) { - trace_afs_volume(volume->vid, refcount_read(&cell->ref), + trace_afs_volume(volume->debug_id, volume->vid, refcount_read(&volume->ref), afs_volume_trace_remove); write_seqlock(&cell->volume_lock); hlist_del_rcu(&volume->proc_link); @@ -84,6 +85,7 @@ static struct afs_volume *afs_alloc_volume(struct afs_fs_context *params, if (!volume) goto error_0; + volume->debug_id = atomic_inc_return(&afs_volume_debug_id); volume->vid = vldb->vid[params->type]; volume->update_at = ktime_get_real_seconds() + afs_volume_record_life; volume->cell = afs_get_cell(params->cell, afs_cell_trace_get_vol); @@ -115,7 
+117,7 @@ static struct afs_volume *afs_alloc_volume(struct afs_fs_context *params, *_slist = slist; rcu_assign_pointer(volume->servers, slist); - trace_afs_volume(volume->vid, 1, afs_volume_trace_alloc); + trace_afs_volume(volume->debug_id, volume->vid, 1, afs_volume_trace_alloc); return volume; error_1: @@ -247,7 +249,7 @@ static void afs_destroy_volume(struct work_struct *work) afs_remove_volume_from_cell(volume); afs_put_serverlist(volume->cell->net, slist); afs_put_cell(volume->cell, afs_cell_trace_put_vol); - trace_afs_volume(volume->vid, refcount_read(&volume->ref), + trace_afs_volume(volume->debug_id, volume->vid, refcount_read(&volume->ref), afs_volume_trace_free); kfree_rcu(volume, rcu); @@ -262,7 +264,7 @@ bool afs_try_get_volume(struct afs_volume *volume, enum afs_volume_trace reason) int r; if (__refcount_inc_not_zero(&volume->ref, &r)) { - trace_afs_volume(volume->vid, r + 1, reason); + trace_afs_volume(volume->debug_id, volume->vid, r + 1, reason); return true; } return false; @@ -278,7 +280,7 @@ struct afs_volume *afs_get_volume(struct afs_volume *volume, int r; __refcount_inc(&volume->ref, &r); - trace_afs_volume(volume->vid, r + 1, reason); + trace_afs_volume(volume->debug_id, volume->vid, r + 1, reason); } return volume; } @@ -290,12 +292,13 @@ struct afs_volume *afs_get_volume(struct afs_volume *volume, void afs_put_volume(struct afs_volume *volume, enum afs_volume_trace reason) { if (volume) { + unsigned int debug_id = volume->debug_id; afs_volid_t vid = volume->vid; bool zero; int r; zero = __refcount_dec_and_test(&volume->ref, &r); - trace_afs_volume(vid, r - 1, reason); + trace_afs_volume(debug_id, vid, r - 1, reason); if (zero) schedule_work(&volume->destructor); } diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index c19132605f41..cf94bf1e8286 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -1539,25 +1539,29 @@ TRACE_EVENT(afs_server, ); TRACE_EVENT(afs_volume, - TP_PROTO(afs_volid_t vid, 
int ref, enum afs_volume_trace reason), + TP_PROTO(unsigned int debug_id, afs_volid_t vid, int ref, + enum afs_volume_trace reason), - TP_ARGS(vid, ref, reason), + TP_ARGS(debug_id, vid, ref, reason), TP_STRUCT__entry( + __field(unsigned int, debug_id) __field(afs_volid_t, vid) __field(int, ref) __field(enum afs_volume_trace, reason) ), TP_fast_assign( - __entry->vid = vid; - __entry->ref = ref; - __entry->reason = reason; + __entry->debug_id = debug_id; + __entry->vid = vid; + __entry->ref = ref; + __entry->reason = reason; ), - TP_printk("V=%llx %s ur=%d", - __entry->vid, + TP_printk("V=%08x %s vid=%llx r=%d", + __entry->debug_id, __print_symbolic(__entry->reason, afs_volume_traces), + __entry->vid, __entry->ref) ); From patchwork Mon Mar 10 09:41:58 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 14009482 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A8EFC226CEE for ; Mon, 10 Mar 2025 09:42:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599761; cv=none; b=Cvg24EOTpi6nNu0p3ZZgehqw8asZiyVTDuvmeA4+10O4wJ+kPfLgBiWIzocDNWsJq7OjanaAu1tkxM20lKUqy+DkobY3iCXOG2pDt9mbg75s9Y8dqYEvl7BwTU5k3mZpwyNCm6jtupziMsjIYoy15SLHAsg3TADDpBinbeWXrLk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599761; c=relaxed/simple; bh=WsSlObPfD4aENaB3Lb+zEttK3URw04/ulYNruPEVpX8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=l5qo0vSmc+GPLmIb7hkpuKRJKePKdFtqL/W6vMbDosCnFJDWRDVlzFrNxAn/Xb6xfmJ8pA7XMZ2hedeZtqT6K2l7OOlfJZs848xi7mVmiqCsEQBDP5vqyIYls8zyzQkqrkttFS3qqR/tu0cOAQ3UmQGcGVRP4n0kQtg2L6YP4yc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=WPnSxMQ4; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="WPnSxMQ4" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741599758; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=cK53t25WsPqUvTA/J/SQdQUAcMOGh2IVjnO7sToojgk=; b=WPnSxMQ4gIVYiKyWvZqgf65Niq4+ebwRHOh78grQNoQEINRJKh3nSB7qoccPMy/XFzdNHH t/y4G7OkSjnCK2DxdJctGmZ9XQi7OzuiTdHJm4VZ410ztlTINo2MjJTWAmzkOr1YwCdk2k WaZayyQ7YaFEu8OdltMU/gk9PVMCv5k= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-319-dtOGABG9NU2i8Da6UDZLgA-1; Mon, 10 Mar 2025 05:42:35 -0400 X-MC-Unique: dtOGABG9NU2i8Da6UDZLgA-1 X-Mimecast-MFC-AGG-ID: dtOGABG9NU2i8Da6UDZLgA_1741599754 Received: from mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client 
certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 024511955BC5; Mon, 10 Mar 2025 09:42:34 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.61]) by mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 5396F1956096; Mon, 10 Mar 2025 09:42:32 +0000 (UTC) From: David Howells To: Marc Dionne Cc: David Howells , Christian Brauner , linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v4 05/11] afs: Improve server refcount/active count tracing Date: Mon, 10 Mar 2025 09:41:58 +0000 Message-ID: <20250310094206.801057-6-dhowells@redhat.com> In-Reply-To: <20250310094206.801057-1-dhowells@redhat.com> References: <20250310094206.801057-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17 Improve server refcount/active count tracing to distinguish between simply getting/putting a ref and using/unusing the server record (which changes the activity count as well as the refcount). This makes it a bit easier to work out what's going on. 
Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250224234154.2014840-10-dhowells@redhat.com/ # v1 --- fs/afs/fsclient.c | 4 ++-- fs/afs/rxrpc.c | 2 +- fs/afs/server.c | 11 ++++++----- fs/afs/server_list.c | 4 ++-- include/trace/events/afs.h | 27 +++++++++++++++------------ 5 files changed, 26 insertions(+), 22 deletions(-) diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c index 1d9ecd5418d8..9f46d9aebc33 100644 --- a/fs/afs/fsclient.c +++ b/fs/afs/fsclient.c @@ -1653,7 +1653,7 @@ int afs_fs_give_up_all_callbacks(struct afs_net *net, struct afs_server *server, bp = call->request; *bp++ = htonl(FSGIVEUPALLCALLBACKS); - call->server = afs_use_server(server, afs_server_trace_give_up_cb); + call->server = afs_use_server(server, afs_server_trace_use_give_up_cb); afs_make_call(call, GFP_NOFS); afs_wait_for_call_to_complete(call); ret = call->error; @@ -1760,7 +1760,7 @@ bool afs_fs_get_capabilities(struct afs_net *net, struct afs_server *server, return false; call->key = key; - call->server = afs_use_server(server, afs_server_trace_get_caps); + call->server = afs_use_server(server, afs_server_trace_use_get_caps); call->peer = rxrpc_kernel_get_peer(estate->addresses->addrs[addr_index].peer); call->probe = afs_get_endpoint_state(estate, afs_estate_trace_get_getcaps); call->probe_index = addr_index; diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c index 886416ea1d96..de9e10575bdd 100644 --- a/fs/afs/rxrpc.c +++ b/fs/afs/rxrpc.c @@ -179,7 +179,7 @@ static void afs_free_call(struct afs_call *call) if (call->type->destructor) call->type->destructor(call); - afs_unuse_server_notime(call->net, call->server, afs_server_trace_put_call); + afs_unuse_server_notime(call->net, call->server, afs_server_trace_unuse_call); kfree(call->request); o = atomic_read(&net->nr_outstanding_calls); diff --git a/fs/afs/server.c b/fs/afs/server.c index 4504e16b458c..923e07c37032 100644 --- 
a/fs/afs/server.c +++ b/fs/afs/server.c @@ -33,7 +33,7 @@ struct afs_server *afs_find_server(struct afs_net *net, const struct rxrpc_peer do { if (server) - afs_unuse_server_notime(net, server, afs_server_trace_put_find_rsq); + afs_unuse_server_notime(net, server, afs_server_trace_unuse_find_rsq); server = NULL; seq++; /* 2 on the 1st/lockless path, otherwise odd */ read_seqbegin_or_lock(&net->fs_addr_lock, &seq); @@ -49,7 +49,7 @@ struct afs_server *afs_find_server(struct afs_net *net, const struct rxrpc_peer server = NULL; continue; found: - server = afs_maybe_use_server(server, afs_server_trace_get_by_addr); + server = afs_maybe_use_server(server, afs_server_trace_use_by_addr); } while (need_seqretry(&net->fs_addr_lock, seq)); @@ -76,7 +76,7 @@ struct afs_server *afs_find_server_by_uuid(struct afs_net *net, const uuid_t *uu * changes. */ if (server) - afs_unuse_server(net, server, afs_server_trace_put_uuid_rsq); + afs_unuse_server(net, server, afs_server_trace_unuse_uuid_rsq); server = NULL; seq++; /* 2 on the 1st/lockless path, otherwise odd */ read_seqbegin_or_lock(&net->fs_lock, &seq); @@ -91,7 +91,7 @@ struct afs_server *afs_find_server_by_uuid(struct afs_net *net, const uuid_t *uu } else if (diff > 0) { p = p->rb_right; } else { - afs_use_server(server, afs_server_trace_get_by_uuid); + afs_use_server(server, afs_server_trace_use_by_uuid); break; } @@ -273,7 +273,8 @@ static struct afs_addr_list *afs_vl_lookup_addrs(struct afs_cell *cell, } /* - * Get or create a fileserver record. + * Get or create a fileserver record and return it with an active-use count on + * it. 
*/ struct afs_server *afs_lookup_server(struct afs_cell *cell, struct key *key, const uuid_t *uuid, u32 addr_version) diff --git a/fs/afs/server_list.c b/fs/afs/server_list.c index d20cd902ef94..784236b9b2a9 100644 --- a/fs/afs/server_list.c +++ b/fs/afs/server_list.c @@ -16,7 +16,7 @@ void afs_put_serverlist(struct afs_net *net, struct afs_server_list *slist) if (slist && refcount_dec_and_test(&slist->usage)) { for (i = 0; i < slist->nr_servers; i++) afs_unuse_server(net, slist->servers[i].server, - afs_server_trace_put_slist); + afs_server_trace_unuse_slist); kfree_rcu(slist, rcu); } } @@ -98,7 +98,7 @@ struct afs_server_list *afs_alloc_server_list(struct afs_volume *volume, if (j < slist->nr_servers) { if (slist->servers[j].server == server) { afs_unuse_server(volume->cell->net, server, - afs_server_trace_put_slist_isort); + afs_server_trace_unuse_slist_isort); continue; } diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index cf94bf1e8286..24d99fbc298f 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -132,22 +132,25 @@ enum yfs_cm_operation { EM(afs_server_trace_destroy, "DESTROY ") \ EM(afs_server_trace_free, "FREE ") \ EM(afs_server_trace_gc, "GC ") \ - EM(afs_server_trace_get_by_addr, "GET addr ") \ - EM(afs_server_trace_get_by_uuid, "GET uuid ") \ - EM(afs_server_trace_get_caps, "GET caps ") \ EM(afs_server_trace_get_install, "GET inst ") \ - EM(afs_server_trace_get_new_cbi, "GET cbi ") \ EM(afs_server_trace_get_probe, "GET probe") \ - EM(afs_server_trace_give_up_cb, "giveup-cb") \ EM(afs_server_trace_purging, "PURGE ") \ - EM(afs_server_trace_put_call, "PUT call ") \ EM(afs_server_trace_put_cbi, "PUT cbi ") \ - EM(afs_server_trace_put_find_rsq, "PUT f-rsq") \ EM(afs_server_trace_put_probe, "PUT probe") \ - EM(afs_server_trace_put_slist, "PUT slist") \ - EM(afs_server_trace_put_slist_isort, "PUT isort") \ - EM(afs_server_trace_put_uuid_rsq, "PUT u-req") \ - E_(afs_server_trace_update, "UPDATE") + 
EM(afs_server_trace_see_expired, "SEE expd ") \ + EM(afs_server_trace_unuse_call, "UNU call ") \ + EM(afs_server_trace_unuse_create_fail, "UNU cfail") \ + EM(afs_server_trace_unuse_find_rsq, "UNU f-rsq") \ + EM(afs_server_trace_unuse_slist, "UNU slist") \ + EM(afs_server_trace_unuse_slist_isort, "UNU isort") \ + EM(afs_server_trace_unuse_uuid_rsq, "PUT u-req") \ + EM(afs_server_trace_update, "UPDATE ") \ + EM(afs_server_trace_use_by_addr, "USE addr ") \ + EM(afs_server_trace_use_by_uuid, "USE uuid ") \ + EM(afs_server_trace_use_cm_call, "USE cm-cl") \ + EM(afs_server_trace_use_get_caps, "USE gcaps") \ + EM(afs_server_trace_use_give_up_cb, "USE gvupc") \ + E_(afs_server_trace_wait_create, "WAIT crt ") #define afs_volume_traces \ EM(afs_volume_trace_alloc, "ALLOC ") \ @@ -1531,7 +1534,7 @@ TRACE_EVENT(afs_server, __entry->reason = reason; ), - TP_printk("s=%08x %s u=%d a=%d", + TP_printk("s=%08x %s r=%d a=%d", __entry->server, __print_symbolic(__entry->reason, afs_server_traces), __entry->ref, From patchwork Mon Mar 10 09:41:59 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 14009483 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0DF40227563 for ; Mon, 10 Mar 2025 09:42:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599764; cv=none; b=IfudIcS/7d9bN68SKo004ZLvn6E/axmMjXGAQleGixod2rD45NHwAmU8TAc5BQKqJHQ52UBpKTbPnqEtQC/VHjyR3888WtlXPQCn2/TN8hM1hKX+4yKrQilJ4zC3FERPP7R4moke9lzsvF6oW6B58ewFGtibqsl5jbuP2mnJi5c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599764; c=relaxed/simple; 
bh=zDjfWjbFdEQYv+MZY4KF1MUOtn2hxJSXYeXayEikkUo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=rZnKfuSbwttv2U7YFzwtjXbL4hgZ9JNwnjHS2YkIXzA+E48k5E+0yImJNfyFDgZx3Wr8FqrofLMwiUXjeXTs4M8DlQTAK4VYXePM/D0c7pKRuQM7modrNQJpos85Pyw4xUB8FLBSLTMasQ54Z1Ht5rk4RQz8LsFK6G946RZs524= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=VOQqyqF2; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="VOQqyqF2" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741599762; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Ykl9ssSJxhfI2t++Zcd6/6YCjhY2pZQu3LkERuE7eRE=; b=VOQqyqF2NJAeQiooxW4ybXN/nNW79Lfxrq9gzgGmU1JAqNQM8fbLc2V361WK2CauqhSimB JMpVJHCBArNdjZOJEq1D7kDxR/aPgkTjIMFVnflsKcIo+aiBkLd8DSdAJ+Yu+gCUllsnWc HuCTbtuUTiLCAb4uoZINuPWX2xvpazg= Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-120-VUmd8nfCMpqAZDYpzKfSdg-1; Mon, 10 Mar 2025 05:42:37 -0400 X-MC-Unique: VUmd8nfCMpqAZDYpzKfSdg-1 X-Mimecast-MFC-AGG-ID: VUmd8nfCMpqAZDYpzKfSdg_1741599756 Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher 
TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id C097719560B4; Mon, 10 Mar 2025 09:42:36 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.61]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 45B6A19560AB; Mon, 10 Mar 2025 09:42:35 +0000 (UTC) From: David Howells To: Marc Dionne Cc: David Howells , Christian Brauner , linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v4 06/11] afs: Make afs_lookup_cell() take a trace note Date: Mon, 10 Mar 2025 09:41:59 +0000 Message-ID: <20250310094206.801057-7-dhowells@redhat.com> In-Reply-To: <20250310094206.801057-1-dhowells@redhat.com> References: <20250310094206.801057-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 Pass a note to be added to the afs_cell tracepoint to afs_lookup_cell() so that different callers can be distinguished. Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250224234154.2014840-11-dhowells@redhat.com/ # v1 --- fs/afs/cell.c | 13 ++++++++----- fs/afs/dynroot.c | 3 ++- fs/afs/internal.h | 6 ++++-- fs/afs/mntpt.c | 3 ++- fs/afs/proc.c | 3 ++- fs/afs/super.c | 3 ++- fs/afs/vl_alias.c | 3 ++- include/trace/events/afs.h | 7 ++++++- 8 files changed, 28 insertions(+), 13 deletions(-) diff --git a/fs/afs/cell.c b/fs/afs/cell.c index c2e44cd2eb96..73894180f653 100644 --- a/fs/afs/cell.c +++ b/fs/afs/cell.c @@ -233,6 +233,7 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net, * @namesz: The strlen of the cell name. 
* @vllist: A colon/comma separated list of numeric IP addresses or NULL. * @excl: T if an error should be given if the cell name already exists. + * @trace: The reason to be logged if the lookup is successful. * * Look up a cell record by name and query the DNS for VL server addresses if * needed. Note that that actual DNS query is punted off to the manager thread @@ -241,7 +242,8 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net, */ struct afs_cell *afs_lookup_cell(struct afs_net *net, const char *name, unsigned int namesz, - const char *vllist, bool excl) + const char *vllist, bool excl, + enum afs_cell_trace trace) { struct afs_cell *cell, *candidate, *cursor; struct rb_node *parent, **pp; @@ -251,7 +253,7 @@ struct afs_cell *afs_lookup_cell(struct afs_net *net, _enter("%s,%s", name, vllist); if (!excl) { - cell = afs_find_cell(net, name, namesz, afs_cell_trace_use_lookup); + cell = afs_find_cell(net, name, namesz, trace); if (!IS_ERR(cell)) goto wait_for_cell; } @@ -327,7 +329,7 @@ struct afs_cell *afs_lookup_cell(struct afs_net *net, if (excl) { ret = -EEXIST; } else { - afs_use_cell(cursor, afs_cell_trace_use_lookup); + afs_use_cell(cursor, trace); ret = 0; } up_write(&net->cells_lock); @@ -382,8 +384,9 @@ int afs_cell_init(struct afs_net *net, const char *rootcell) if (cp && cp < rootcell + len) return -EINVAL; - /* allocate a cell record for the root cell */ - new_root = afs_lookup_cell(net, rootcell, len, vllist, false); + /* allocate a cell record for the root/workstation cell */ + new_root = afs_lookup_cell(net, rootcell, len, vllist, false, + afs_cell_trace_use_lookup_ws); if (IS_ERR(new_root)) { _leave(" = %ld", PTR_ERR(new_root)); return PTR_ERR(new_root); diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c index eb20e231d7ac..4ff2a396dbd4 100644 --- a/fs/afs/dynroot.c +++ b/fs/afs/dynroot.c @@ -108,7 +108,8 @@ static struct dentry *afs_dynroot_lookup_cell(struct inode *dir, struct dentry * dotted = true; } - cell = afs_lookup_cell(net, name, 
len, NULL, false); + cell = afs_lookup_cell(net, name, len, NULL, false, + afs_cell_trace_use_lookup_dynroot); if (IS_ERR(cell)) { ret = PTR_ERR(cell); goto out_no_cell; diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 97045e2a455d..24b87ae11524 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -1046,8 +1046,10 @@ static inline bool afs_cb_is_broken(unsigned int cb_break, extern int afs_cell_init(struct afs_net *, const char *); extern struct afs_cell *afs_find_cell(struct afs_net *, const char *, unsigned, enum afs_cell_trace); -extern struct afs_cell *afs_lookup_cell(struct afs_net *, const char *, unsigned, - const char *, bool); +struct afs_cell *afs_lookup_cell(struct afs_net *net, + const char *name, unsigned int namesz, + const char *vllist, bool excl, + enum afs_cell_trace trace); extern struct afs_cell *afs_use_cell(struct afs_cell *, enum afs_cell_trace); extern void afs_unuse_cell(struct afs_net *, struct afs_cell *, enum afs_cell_trace); extern struct afs_cell *afs_get_cell(struct afs_cell *, enum afs_cell_trace); diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c index 507c25a5b2cb..4a3edb9990b0 100644 --- a/fs/afs/mntpt.c +++ b/fs/afs/mntpt.c @@ -107,7 +107,8 @@ static int afs_mntpt_set_params(struct fs_context *fc, struct dentry *mntpt) if (size > AFS_MAXCELLNAME) return -ENAMETOOLONG; - cell = afs_lookup_cell(ctx->net, p, size, NULL, false); + cell = afs_lookup_cell(ctx->net, p, size, NULL, false, + afs_cell_trace_use_lookup_mntpt); if (IS_ERR(cell)) { pr_err("kAFS: unable to lookup cell '%pd'\n", mntpt); return PTR_ERR(cell); diff --git a/fs/afs/proc.c b/fs/afs/proc.c index 12c88d8be3fe..fc7027fc3084 100644 --- a/fs/afs/proc.c +++ b/fs/afs/proc.c @@ -122,7 +122,8 @@ static int afs_proc_cells_write(struct file *file, char *buf, size_t size) if (strcmp(buf, "add") == 0) { struct afs_cell *cell; - cell = afs_lookup_cell(net, name, strlen(name), args, true); + cell = afs_lookup_cell(net, name, strlen(name), args, true, + 
afs_cell_trace_use_lookup_add); if (IS_ERR(cell)) { ret = PTR_ERR(cell); goto done; diff --git a/fs/afs/super.c b/fs/afs/super.c index dfc109f48ad5..aa6a3ccf39b5 100644 --- a/fs/afs/super.c +++ b/fs/afs/super.c @@ -290,7 +290,8 @@ static int afs_parse_source(struct fs_context *fc, struct fs_parameter *param) /* lookup the cell record */ if (cellname) { cell = afs_lookup_cell(ctx->net, cellname, cellnamesz, - NULL, false); + NULL, false, + afs_cell_trace_use_lookup_mount); if (IS_ERR(cell)) { pr_err("kAFS: unable to lookup cell '%*.*s'\n", cellnamesz, cellnamesz, cellname ?: ""); diff --git a/fs/afs/vl_alias.c b/fs/afs/vl_alias.c index f9e76b604f31..ffcfba1725e6 100644 --- a/fs/afs/vl_alias.c +++ b/fs/afs/vl_alias.c @@ -269,7 +269,8 @@ static int yfs_check_canonical_cell_name(struct afs_cell *cell, struct key *key) if (!name_len || name_len > AFS_MAXCELLNAME) master = ERR_PTR(-EOPNOTSUPP); else - master = afs_lookup_cell(cell->net, cell_name, name_len, NULL, false); + master = afs_lookup_cell(cell->net, cell_name, name_len, NULL, false, + afs_cell_trace_use_lookup_canonical); kfree(cell_name); if (IS_ERR(master)) return PTR_ERR(master); diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index 24d99fbc298f..42c3a51db72b 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -208,7 +208,12 @@ enum yfs_cm_operation { EM(afs_cell_trace_use_check_alias, "USE chk-al") \ EM(afs_cell_trace_use_fc, "USE fc ") \ EM(afs_cell_trace_use_fc_alias, "USE fc-al ") \ - EM(afs_cell_trace_use_lookup, "USE lookup") \ + EM(afs_cell_trace_use_lookup_add, "USE lu-add") \ + EM(afs_cell_trace_use_lookup_canonical, "USE lu-can") \ + EM(afs_cell_trace_use_lookup_dynroot, "USE lu-dyn") \ + EM(afs_cell_trace_use_lookup_mntpt, "USE lu-mpt") \ + EM(afs_cell_trace_use_lookup_mount, "USE lu-mnt") \ + EM(afs_cell_trace_use_lookup_ws, "USE lu-ws ") \ EM(afs_cell_trace_use_mntpt, "USE mntpt ") \ EM(afs_cell_trace_use_pin, "USE pin ") \ EM(afs_cell_trace_use_probe, 
"USE probe ") \ From patchwork Mon Mar 10 09:42:00 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 14009484 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E038E227B8E for ; Mon, 10 Mar 2025 09:42:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599767; cv=none; b=K4ZOyG9XrXDW5G2fU7YffZPhl7ZwamjJdutMduiSaXG2+FJrsB+CG29m0GgwfcNsaiddAvcioVEz8SjJT2flLMACeolf5OFmlDsrEXfX6DutEzG4bF3rZJMXpozGmnWG7mOwao+rTucsWmZSLdColvyxqMKllIu+r7yf32+jl+A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599767; c=relaxed/simple; bh=mnwr0qJt2k8Iix3QGp5xkW79KQcIIw/eqMdZgRCKhHQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=KFiipbd7p7qyJ90R2fH0FKTs0anXJ+0rxDEVGrX90TE/+ylNOjxMhvo598VnH9ZCpm5gfYlDcO0LKbN5x62OPr1r/esFhsAz0PbC9z2fYIw2bNTzdCnlZrznvYqWtOXLxYJtBQFuGya2Ce8w1mgnLBZ4q2DuRqQVD6Q1ERTbNmE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=HiEbenx0; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="HiEbenx0" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741599763; 
h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4mrpYWalYw7EJxA/nUrbhkuC60QhcJ0mNDk/XtiF2k4=; b=HiEbenx0XapvfjnLUfcquGOTeXfPwDDOleOJ7qdfLBruvESOoXRbV1sDupxPaUWKhbVodI RXJ/JY85YqdxUgKRcyHVgJc/jQgT4pymSRkjcPqBpUbspTimOXz5eONUixE84BkRRoVNh/ SM6i6O1Nc6l2zE+ep34sRBMk/YjfzLY= Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-674-G7KWNf_hOZSMNOdZ4au04g-1; Mon, 10 Mar 2025 05:42:40 -0400 X-MC-Unique: G7KWNf_hOZSMNOdZ4au04g-1 X-Mimecast-MFC-AGG-ID: G7KWNf_hOZSMNOdZ4au04g_1741599759 Received: from mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.12]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 6662E19560A3; Mon, 10 Mar 2025 09:42:39 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.61]) by mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id DC55D19560AD; Mon, 10 Mar 2025 09:42:37 +0000 (UTC) From: David Howells To: Marc Dionne Cc: David Howells , Christian Brauner , linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v4 07/11] afs: Drop the net parameter from afs_unuse_cell() Date: Mon, 10 Mar 2025 09:42:00 +0000 Message-ID: <20250310094206.801057-8-dhowells@redhat.com> In-Reply-To: <20250310094206.801057-1-dhowells@redhat.com> References: <20250310094206.801057-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org 
List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.12 Remove the redundant net parameter to afs_unuse_cell() as cell->net can be used instead. Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250224234154.2014840-12-dhowells@redhat.com/ # v1 --- fs/afs/cell.c | 12 ++++++------ fs/afs/dynroot.c | 4 ++-- fs/afs/internal.h | 2 +- fs/afs/mntpt.c | 2 +- fs/afs/proc.c | 2 +- fs/afs/super.c | 9 ++++----- fs/afs/vl_alias.c | 4 ++-- include/trace/events/afs.h | 1 + 8 files changed, 18 insertions(+), 18 deletions(-) diff --git a/fs/afs/cell.c b/fs/afs/cell.c index 73894180f653..acbf35b4c9ed 100644 --- a/fs/afs/cell.c +++ b/fs/afs/cell.c @@ -339,7 +339,7 @@ struct afs_cell *afs_lookup_cell(struct afs_net *net, goto wait_for_cell; goto error_noput; error: - afs_unuse_cell(net, cell, afs_cell_trace_unuse_lookup); + afs_unuse_cell(cell, afs_cell_trace_unuse_lookup_error); error_noput: _leave(" = %d [error]", ret); return ERR_PTR(ret); @@ -402,7 +402,7 @@ int afs_cell_init(struct afs_net *net, const char *rootcell) lockdep_is_held(&net->cells_lock)); up_write(&net->cells_lock); - afs_unuse_cell(net, old_root, afs_cell_trace_unuse_ws); + afs_unuse_cell(old_root, afs_cell_trace_unuse_ws); _leave(" = 0"); return 0; } @@ -520,7 +520,7 @@ static void afs_cell_destroy(struct rcu_head *rcu) trace_afs_cell(cell->debug_id, r, atomic_read(&cell->active), afs_cell_trace_free); afs_put_vlserverlist(net, rcu_access_pointer(cell->vl_servers)); - afs_unuse_cell(net, cell->alias_of, afs_cell_trace_unuse_alias); + afs_unuse_cell(cell->alias_of, afs_cell_trace_unuse_alias); key_put(cell->anonymous_key); idr_remove(&net->cells_dyn_ino, cell->dynroot_ino); kfree(cell->name - 1); @@ -608,7 +608,7 @@ struct afs_cell *afs_use_cell(struct afs_cell *cell, enum afs_cell_trace reason) * Record a cell becoming less active. 
When the active counter reaches 1, it * is scheduled for destruction, but may get reactivated. */ -void afs_unuse_cell(struct afs_net *net, struct afs_cell *cell, enum afs_cell_trace reason) +void afs_unuse_cell(struct afs_cell *cell, enum afs_cell_trace reason) { unsigned int debug_id; time64_t now, expire_delay; @@ -632,7 +632,7 @@ void afs_unuse_cell(struct afs_net *net, struct afs_cell *cell, enum afs_cell_tr WARN_ON(a == 0); if (a == 1) /* 'cell' may now be garbage collected. */ - afs_set_cell_timer(net, expire_delay); + afs_set_cell_timer(cell->net, expire_delay); } /* @@ -957,7 +957,7 @@ void afs_cell_purge(struct afs_net *net) ws = rcu_replace_pointer(net->ws_cell, NULL, lockdep_is_held(&net->cells_lock)); up_write(&net->cells_lock); - afs_unuse_cell(net, ws, afs_cell_trace_unuse_ws); + afs_unuse_cell(ws, afs_cell_trace_unuse_ws); _debug("del timer"); if (del_timer_sync(&net->cells_timer)) diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c index 4ff2a396dbd4..011c63350df1 100644 --- a/fs/afs/dynroot.c +++ b/fs/afs/dynroot.c @@ -125,7 +125,7 @@ static struct dentry *afs_dynroot_lookup_cell(struct inode *dir, struct dentry * return d_splice_alias(inode, dentry); out: - afs_unuse_cell(cell->net, cell, afs_cell_trace_unuse_lookup_dynroot); + afs_unuse_cell(cell, afs_cell_trace_unuse_lookup_dynroot); out_no_cell: if (!inode) return d_splice_alias(inode, dentry); @@ -167,7 +167,7 @@ static void afs_dynroot_d_release(struct dentry *dentry) { struct afs_cell *cell = dentry->d_fsdata; - afs_unuse_cell(cell->net, cell, afs_cell_trace_unuse_dynroot_mntpt); + afs_unuse_cell(cell, afs_cell_trace_unuse_dynroot_mntpt); } /* diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 24b87ae11524..9c8dfde758c3 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -1051,7 +1051,7 @@ struct afs_cell *afs_lookup_cell(struct afs_net *net, const char *vllist, bool excl, enum afs_cell_trace trace); extern struct afs_cell *afs_use_cell(struct afs_cell *, enum afs_cell_trace); 
-extern void afs_unuse_cell(struct afs_net *, struct afs_cell *, enum afs_cell_trace); +void afs_unuse_cell(struct afs_cell *cell, enum afs_cell_trace reason); extern struct afs_cell *afs_get_cell(struct afs_cell *, enum afs_cell_trace); extern void afs_see_cell(struct afs_cell *, enum afs_cell_trace); extern void afs_put_cell(struct afs_cell *, enum afs_cell_trace); diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c index 4a3edb9990b0..45cee6534122 100644 --- a/fs/afs/mntpt.c +++ b/fs/afs/mntpt.c @@ -87,7 +87,7 @@ static int afs_mntpt_set_params(struct fs_context *fc, struct dentry *mntpt) ctx->force = true; } if (ctx->cell) { - afs_unuse_cell(ctx->net, ctx->cell, afs_cell_trace_unuse_mntpt); + afs_unuse_cell(ctx->cell, afs_cell_trace_unuse_mntpt); ctx->cell = NULL; } if (test_bit(AFS_VNODE_PSEUDODIR, &vnode->flags)) { diff --git a/fs/afs/proc.c b/fs/afs/proc.c index fc7027fc3084..9a3d8eb5da43 100644 --- a/fs/afs/proc.c +++ b/fs/afs/proc.c @@ -130,7 +130,7 @@ static int afs_proc_cells_write(struct file *file, char *buf, size_t size) } if (test_and_set_bit(AFS_CELL_FL_NO_GC, &cell->flags)) - afs_unuse_cell(net, cell, afs_cell_trace_unuse_no_pin); + afs_unuse_cell(cell, afs_cell_trace_unuse_no_pin); } else { goto inval; } diff --git a/fs/afs/super.c b/fs/afs/super.c index aa6a3ccf39b5..25b306db6992 100644 --- a/fs/afs/super.c +++ b/fs/afs/super.c @@ -297,7 +297,7 @@ static int afs_parse_source(struct fs_context *fc, struct fs_parameter *param) cellnamesz, cellnamesz, cellname ?: ""); return PTR_ERR(cell); } - afs_unuse_cell(ctx->net, ctx->cell, afs_cell_trace_unuse_parse); + afs_unuse_cell(ctx->cell, afs_cell_trace_unuse_parse); afs_see_cell(cell, afs_cell_trace_see_source); ctx->cell = cell; } @@ -394,7 +394,7 @@ static int afs_validate_fc(struct fs_context *fc) ctx->key = NULL; cell = afs_use_cell(ctx->cell->alias_of, afs_cell_trace_use_fc_alias); - afs_unuse_cell(ctx->net, ctx->cell, afs_cell_trace_unuse_fc); + afs_unuse_cell(ctx->cell, afs_cell_trace_unuse_fc); 
ctx->cell = cell; goto reget_key; } @@ -520,9 +520,8 @@ static struct afs_super_info *afs_alloc_sbi(struct fs_context *fc) static void afs_destroy_sbi(struct afs_super_info *as) { if (as) { - struct afs_net *net = afs_net(as->net_ns); afs_put_volume(as->volume, afs_volume_trace_put_destroy_sbi); - afs_unuse_cell(net, as->cell, afs_cell_trace_unuse_sbi); + afs_unuse_cell(as->cell, afs_cell_trace_unuse_sbi); put_net(as->net_ns); kfree(as); } @@ -605,7 +604,7 @@ static void afs_free_fc(struct fs_context *fc) afs_destroy_sbi(fc->s_fs_info); afs_put_volume(ctx->volume, afs_volume_trace_put_free_fc); - afs_unuse_cell(ctx->net, ctx->cell, afs_cell_trace_unuse_fc); + afs_unuse_cell(ctx->cell, afs_cell_trace_unuse_fc); key_put(ctx->key); kfree(ctx); } diff --git a/fs/afs/vl_alias.c b/fs/afs/vl_alias.c index ffcfba1725e6..709b4cdb723e 100644 --- a/fs/afs/vl_alias.c +++ b/fs/afs/vl_alias.c @@ -205,11 +205,11 @@ static int afs_query_for_alias(struct afs_cell *cell, struct key *key) goto is_alias; if (mutex_lock_interruptible(&cell->net->proc_cells_lock) < 0) { - afs_unuse_cell(cell->net, p, afs_cell_trace_unuse_check_alias); + afs_unuse_cell(p, afs_cell_trace_unuse_check_alias); return -ERESTARTSYS; } - afs_unuse_cell(cell->net, p, afs_cell_trace_unuse_check_alias); + afs_unuse_cell(p, afs_cell_trace_unuse_check_alias); } mutex_unlock(&cell->net->proc_cells_lock); diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index 42c3a51db72b..82d20c28dc0d 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -197,6 +197,7 @@ enum yfs_cm_operation { EM(afs_cell_trace_unuse_fc, "UNU fc ") \ EM(afs_cell_trace_unuse_lookup, "UNU lookup") \ EM(afs_cell_trace_unuse_lookup_dynroot, "UNU lu-dyn") \ + EM(afs_cell_trace_unuse_lookup_error, "UNU lu-err") \ EM(afs_cell_trace_unuse_mntpt, "UNU mntpt ") \ EM(afs_cell_trace_unuse_no_pin, "UNU no-pin") \ EM(afs_cell_trace_unuse_parse, "UNU parse ") \ From patchwork Mon Mar 10 09:42:01 2025 Content-Type: 
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 14009486 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 76223228CB8 for ; Mon, 10 Mar 2025 09:42:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599773; cv=none; b=bJT2iZeCSNqQo04x+vjl6m/qZ4iHSO1HR7f/W9ecS8sSLNAp08D5PG2Bx4XVF0Ro6NyCdtIeZaf6+h3QhatajQ6rAsEBaLx1aZ9KJf0MgvR8Rm9yayXYreQmjDlM5snoEHWgrq3ByPmvSA8pwOtaCQ/8GP3mOKtRq3CST7eBdqY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599773; c=relaxed/simple; bh=SJGN4O7YH0QuTjNuzVlal9hvaFlJqlKQBSGp7rwJSGs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=GLJcZ1Z+iLJJkwUNAPmk9lPvcQfhvaRtmlTL4+EPxE29JojRGglNOA6NufV5Obm9W+nOdCMeZaIv/nLreF54hT+O4eQ319kxSGNjyskXrGUFQ8avSLgI/XsHZEzlWry4HcRKll9dCtUI5OZC1KVVT1K4YdFS4S8ZJSJwNFkCEow= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=VKle2jW6; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="VKle2jW6" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741599770; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: 
to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=T4eKUlqiuyz9nzriNgRZ9GgKX0nrdSDX3KCKc9kJycA=; b=VKle2jW69FokL7WspuDxMLYboTd66UhG27Mmj4gk+Q6ZLwmROt3Nffjr5dTFCm1TLT+XvI qI71McZ777IcJNQxj0a6OijpPtzLPoJpEFKarMrbMMdzG16zmEuQJ7m3bZmDhq9g3tC1yp Z6hSMRLcQfmNIoDE/eiPduRc4mF6oIc= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-121-LhWmFJw2MnW_O70Mc3jHHA-1; Mon, 10 Mar 2025 05:42:45 -0400 X-MC-Unique: LhWmFJw2MnW_O70Mc3jHHA-1 X-Mimecast-MFC-AGG-ID: LhWmFJw2MnW_O70Mc3jHHA_1741599763 Received: from mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.15]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 79E9F1801A00; Mon, 10 Mar 2025 09:42:43 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.61]) by mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 9CB791956094; Mon, 10 Mar 2025 09:42:40 +0000 (UTC) From: David Howells To: Marc Dionne Cc: David Howells , Christian Brauner , linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jakub Kicinski , "David S. 
Miller" , Eric Dumazet , Paolo Abeni , Simon Horman , netdev@vger.kernel.org Subject: [PATCH v4 08/11] rxrpc: Allow the app to store private data on peer structs Date: Mon, 10 Mar 2025 09:42:01 +0000 Message-ID: <20250310094206.801057-9-dhowells@redhat.com> In-Reply-To: <20250310094206.801057-1-dhowells@redhat.com> References: <20250310094206.801057-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.15 Provide a way for the application (e.g. the afs filesystem) to store private data on the rxrpc_peer structs for later retrieval via the call object. This will allow afs to store a pointer to the afs_server object on the rxrpc_peer struct, thereby obviating the need for afs to keep lookup tables by which it can associate an incoming call with the server that transmitted it. Signed-off-by: David Howells cc: Marc Dionne cc: Jakub Kicinski cc: "David S. Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Simon Horman cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20250224234154.2014840-13-dhowells@redhat.com/ # v1 --- include/net/af_rxrpc.h | 2 ++ net/rxrpc/ar-internal.h | 1 + net/rxrpc/peer_object.c | 26 ++++++++++++++++++++++++++ 3 files changed, 29 insertions(+) diff --git a/include/net/af_rxrpc.h b/include/net/af_rxrpc.h index 0754c463224a..cf793d18e5df 100644 --- a/include/net/af_rxrpc.h +++ b/include/net/af_rxrpc.h @@ -69,6 +69,8 @@ struct rxrpc_peer *rxrpc_kernel_get_peer(struct rxrpc_peer *peer); struct rxrpc_peer *rxrpc_kernel_get_call_peer(struct socket *sock, struct rxrpc_call *call); const struct sockaddr_rxrpc *rxrpc_kernel_remote_srx(const struct rxrpc_peer *peer); const struct sockaddr *rxrpc_kernel_remote_addr(const struct rxrpc_peer *peer); +unsigned long rxrpc_kernel_set_peer_data(struct rxrpc_peer *peer, unsigned long app_data); +unsigned long 
rxrpc_kernel_get_peer_data(const struct rxrpc_peer *peer); unsigned int rxrpc_kernel_get_srtt(const struct rxrpc_peer *); int rxrpc_kernel_charge_accept(struct socket *, rxrpc_notify_rx_t, rxrpc_user_attach_call_t, unsigned long, gfp_t, diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h index a64a0cab1bf7..3cc3af15086f 100644 --- a/net/rxrpc/ar-internal.h +++ b/net/rxrpc/ar-internal.h @@ -344,6 +344,7 @@ struct rxrpc_peer { struct hlist_head error_targets; /* targets for net error distribution */ struct rb_root service_conns; /* Service connections */ struct list_head keepalive_link; /* Link in net->peer_keepalive[] */ + unsigned long app_data; /* Application data (e.g. afs_server) */ time64_t last_tx_at; /* Last time packet sent here */ seqlock_t service_conn_lock; spinlock_t lock; /* access lock */ diff --git a/net/rxrpc/peer_object.c b/net/rxrpc/peer_object.c index 56e09d161a97..a0c0e4d590f5 100644 --- a/net/rxrpc/peer_object.c +++ b/net/rxrpc/peer_object.c @@ -520,3 +520,29 @@ const struct sockaddr *rxrpc_kernel_remote_addr(const struct rxrpc_peer *peer) (peer ? &peer->srx.transport : &rxrpc_null_addr.transport); } EXPORT_SYMBOL(rxrpc_kernel_remote_addr); + +/** + * rxrpc_kernel_set_peer_data - Set app-specific data on a peer. + * @peer: The peer to alter + * @app_data: The data to set + * + * Set the app-specific data on a peer. AF_RXRPC makes no effort to retain + * anything the data might refer to. The previous app_data is returned. + */ +unsigned long rxrpc_kernel_set_peer_data(struct rxrpc_peer *peer, unsigned long app_data) +{ + return xchg(&peer->app_data, app_data); +} +EXPORT_SYMBOL(rxrpc_kernel_set_peer_data); + +/** + * rxrpc_kernel_get_peer_data - Get app-specific data from a peer. + * @peer: The peer to query + * + * Retrieve the app-specific data from a peer. 
+ */ +unsigned long rxrpc_kernel_get_peer_data(const struct rxrpc_peer *peer) +{ + return peer->app_data; +} +EXPORT_SYMBOL(rxrpc_kernel_get_peer_data); From patchwork Mon Mar 10 09:42:02 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 14009485 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 37E822288E3 for ; Mon, 10 Mar 2025 09:42:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599772; cv=none; b=kfvI4lPI9oU7juL4gzTBIQFf0d8wCi6JCD8gqAGYNQgekaHWl86GH/Lt7shL0Y3Q9pB5PMZjHAvxcDcexTWtcVuiGc9HpfvEFmpy/hOihpgOcBvmyO1HCYQv1Rba5xv15e/5eSw1/F4oArhuftEqsTXMXgGpx63R3oh5D6VGBsk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599772; c=relaxed/simple; bh=o20DsNKrxwfd4uUOeuyFaFfER8RAfFI0KRxcGbfOG2I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=rrou+dRHOJC4sYbKMHRMkNxFIIfSTDEA5eY85EXmQA/0DG5PRwWyH3TX9psclk9LzGDnca+WwD1v2wmx3BSE2KtKfssaf5GFAXiEra3Y17drpELk4pEVcTq7WMBHM3xT4IWOQPvoPDbcLolSpiCy3EMoPZ220oRiKskC3E0uJpA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=igIUfmYS; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com 
header.i=@redhat.com header.b="igIUfmYS" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741599769; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pNjcis3bmlUYBwnkEqvwUlaKP/e/GzRt3p3JQQKZgDs=; b=igIUfmYSjj6wEcN8jwsiC6M3zbuhQWdTXFaEHmvqeB5r3dfLlt8mXLq0DHrqlFkQ/zuXA4 D7Bq7H1KhM3SYVr2IqQrbvm7Hzd3tyl0p7CFJmkrUChr7ElGjfobm5ohAkHoV2IqllptOH pj6RaEp5og1QVjr8AKYemo/6bxU0C4o= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-611--mMJy9LHNsKUsPexzeT7GQ-1; Mon, 10 Mar 2025 05:42:47 -0400 X-MC-Unique: -mMJy9LHNsKUsPexzeT7GQ-1 X-Mimecast-MFC-AGG-ID: -mMJy9LHNsKUsPexzeT7GQ_1741599766 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 802F919560B5; Mon, 10 Mar 2025 09:42:46 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.61]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id B0BBB1800366; Mon, 10 Mar 2025 09:42:44 +0000 (UTC) From: David Howells To: Marc Dionne Cc: David Howells , Christian Brauner , linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v4 09/11] afs: Use the per-peer app data provided by rxrpc Date: Mon, 10 Mar 2025 09:42:02 +0000 Message-ID: <20250310094206.801057-10-dhowells@redhat.com> In-Reply-To: 
<20250310094206.801057-1-dhowells@redhat.com> References: <20250310094206.801057-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Make use of the per-peer application data that rxrpc now allows the application to store on the rxrpc_peer struct to hold a back pointer to the afs_server record that peer represents an endpoint for. Then, when a call comes in to the AFS cache manager, this can be used to map it to the correct server record rather than having to use a UUID-to-server mapping table and having to do an additional lookup. Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250224234154.2014840-14-dhowells@redhat.com/ # v1 --- fs/afs/addr_list.c | 50 +++++++++++++++++++++++ fs/afs/cmservice.c | 82 ++++++-------------------------------- fs/afs/fs_probe.c | 32 ++++++++++----- fs/afs/internal.h | 9 +++-- fs/afs/proc.c | 10 ++++- fs/afs/rxrpc.c | 6 +++ fs/afs/server.c | 46 ++++++--------------- include/trace/events/afs.h | 4 +- net/rxrpc/peer_object.c | 4 +- 9 files changed, 120 insertions(+), 123 deletions(-) diff --git a/fs/afs/addr_list.c b/fs/afs/addr_list.c index 6d42f85c6be5..e941da5b6dd9 100644 --- a/fs/afs/addr_list.c +++ b/fs/afs/addr_list.c @@ -362,3 +362,53 @@ int afs_merge_fs_addr6(struct afs_net *net, struct afs_addr_list *alist, alist->nr_addrs++; return 0; } + +/* + * Set the app data on the rxrpc peers an address list points to + */ +void afs_set_peer_appdata(struct afs_server *server, + struct afs_addr_list *old_alist, + struct afs_addr_list *new_alist) +{ + unsigned long data = (unsigned long)server; + int n = 0, o = 0; + + if (!old_alist) { + /* New server. Just set all. 
*/ + for (; n < new_alist->nr_addrs; n++) + rxrpc_kernel_set_peer_data(new_alist->addrs[n].peer, data); + return; + } + if (!new_alist) { + /* Dead server. Just remove all. */ + for (; o < old_alist->nr_addrs; o++) + rxrpc_kernel_set_peer_data(old_alist->addrs[o].peer, 0); + return; + } + + /* Walk through the two lists simultaneously, setting new peers and + * clearing old ones. The two lists are ordered by pointer to peer + * record. + */ + while (n < new_alist->nr_addrs && o < old_alist->nr_addrs) { + struct rxrpc_peer *pn = new_alist->addrs[n].peer; + struct rxrpc_peer *po = old_alist->addrs[o].peer; + + if (pn == po) + continue; + if (pn < po) { + rxrpc_kernel_set_peer_data(pn, data); + n++; + } else { + rxrpc_kernel_set_peer_data(po, 0); + o++; + } + } + + if (n < new_alist->nr_addrs) + for (; n < new_alist->nr_addrs; n++) + rxrpc_kernel_set_peer_data(new_alist->addrs[n].peer, data); + if (o < old_alist->nr_addrs) + for (; o < old_alist->nr_addrs; o++) + rxrpc_kernel_set_peer_data(old_alist->addrs[o].peer, 0); +} diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c index 99a3f20bc786..1a906805a9e3 100644 --- a/fs/afs/cmservice.c +++ b/fs/afs/cmservice.c @@ -138,49 +138,6 @@ bool afs_cm_incoming_call(struct afs_call *call) } } -/* - * Find the server record by peer address and record a probe to the cache - * manager from a server. - */ -static int afs_find_cm_server_by_peer(struct afs_call *call) -{ - struct sockaddr_rxrpc srx; - struct afs_server *server; - struct rxrpc_peer *peer; - - peer = rxrpc_kernel_get_call_peer(call->net->socket, call->rxcall); - - server = afs_find_server(call->net, peer); - if (!server) { - trace_afs_cm_no_server(call, &srx); - return 0; - } - - call->server = server; - return 0; -} - -/* - * Find the server record by server UUID and record a probe to the cache - * manager from a server. 
- */ -static int afs_find_cm_server_by_uuid(struct afs_call *call, - struct afs_uuid *uuid) -{ - struct afs_server *server; - - rcu_read_lock(); - server = afs_find_server_by_uuid(call->net, call->request); - rcu_read_unlock(); - if (!server) { - trace_afs_cm_no_server_u(call, call->request); - return 0; - } - - call->server = server; - return 0; -} - /* * Clean up a cache manager call. */ @@ -322,10 +279,7 @@ static int afs_deliver_cb_callback(struct afs_call *call) if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING)) return afs_io_error(call, afs_io_error_cm_reply); - - /* we'll need the file server record as that tells us which set of - * vnodes to operate upon */ - return afs_find_cm_server_by_peer(call); + return 0; } /* @@ -349,18 +303,10 @@ static void SRXAFSCB_InitCallBackState(struct work_struct *work) */ static int afs_deliver_cb_init_call_back_state(struct afs_call *call) { - int ret; - _enter(""); afs_extract_discard(call, 0); - ret = afs_extract_data(call, false); - if (ret < 0) - return ret; - - /* we'll need the file server record as that tells us which set of - * vnodes to operate upon */ - return afs_find_cm_server_by_peer(call); + return afs_extract_data(call, false); } /* @@ -373,8 +319,6 @@ static int afs_deliver_cb_init_call_back_state3(struct afs_call *call) __be32 *b; int ret; - _enter(""); - _enter("{%u}", call->unmarshall); switch (call->unmarshall) { @@ -421,9 +365,13 @@ static int afs_deliver_cb_init_call_back_state3(struct afs_call *call) if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING)) return afs_io_error(call, afs_io_error_cm_reply); - /* we'll need the file server record as that tells us which set of - * vnodes to operate upon */ - return afs_find_cm_server_by_uuid(call, call->request); + if (memcmp(call->request, &call->server->_uuid, sizeof(call->server->_uuid)) != 0) { + pr_notice("Callback UUID does not match fileserver UUID\n"); + trace_afs_cm_no_server_u(call, call->request); + return 0; + } + + return 0; } /* @@ -455,7 
+403,7 @@ static int afs_deliver_cb_probe(struct afs_call *call) if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING)) return afs_io_error(call, afs_io_error_cm_reply); - return afs_find_cm_server_by_peer(call); + return 0; } /* @@ -533,7 +481,7 @@ static int afs_deliver_cb_probe_uuid(struct afs_call *call) if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING)) return afs_io_error(call, afs_io_error_cm_reply); - return afs_find_cm_server_by_peer(call); + return 0; } /* @@ -593,7 +541,7 @@ static int afs_deliver_cb_tell_me_about_yourself(struct afs_call *call) if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING)) return afs_io_error(call, afs_io_error_cm_reply); - return afs_find_cm_server_by_peer(call); + return 0; } /* @@ -667,9 +615,5 @@ static int afs_deliver_yfs_cb_callback(struct afs_call *call) if (!afs_check_call_state(call, AFS_CALL_SV_REPLYING)) return afs_io_error(call, afs_io_error_cm_reply); - - /* We'll need the file server record as that tells us which set of - * vnodes to operate upon. - */ - return afs_find_cm_server_by_peer(call); + return 0; } diff --git a/fs/afs/fs_probe.c b/fs/afs/fs_probe.c index b516d05b0fef..07a8bfbdd9b9 100644 --- a/fs/afs/fs_probe.c +++ b/fs/afs/fs_probe.c @@ -235,20 +235,20 @@ void afs_fileserver_probe_result(struct afs_call *call) * Probe all of a fileserver's addresses to find out the best route and to * query its capabilities. 
*/ -void afs_fs_probe_fileserver(struct afs_net *net, struct afs_server *server, - struct afs_addr_list *new_alist, struct key *key) +int afs_fs_probe_fileserver(struct afs_net *net, struct afs_server *server, + struct afs_addr_list *new_alist, struct key *key) { struct afs_endpoint_state *estate, *old; - struct afs_addr_list *alist; + struct afs_addr_list *old_alist = NULL, *alist; unsigned long unprobed; _enter("%pU", &server->uuid); estate = kzalloc(sizeof(*estate), GFP_KERNEL); if (!estate) - return; + return -ENOMEM; - refcount_set(&estate->ref, 1); + refcount_set(&estate->ref, 2); estate->server_id = server->debug_id; estate->rtt = UINT_MAX; @@ -256,21 +256,31 @@ void afs_fs_probe_fileserver(struct afs_net *net, struct afs_server *server, old = rcu_dereference_protected(server->endpoint_state, lockdep_is_held(&server->fs_lock)); - estate->responsive_set = old->responsive_set; - estate->addresses = afs_get_addrlist(new_alist ?: old->addresses, - afs_alist_trace_get_estate); + if (old) { + estate->responsive_set = old->responsive_set; + if (!new_alist) + new_alist = old->addresses; + } + + if (old_alist != new_alist) + afs_set_peer_appdata(server, old_alist, new_alist); + + estate->addresses = afs_get_addrlist(new_alist, afs_alist_trace_get_estate); alist = estate->addresses; estate->probe_seq = ++server->probe_counter; atomic_set(&estate->nr_probing, alist->nr_addrs); + if (new_alist) + server->addr_version = new_alist->version; rcu_assign_pointer(server->endpoint_state, estate); - set_bit(AFS_ESTATE_SUPERSEDED, &old->flags); write_unlock(&server->fs_lock); + if (old) + set_bit(AFS_ESTATE_SUPERSEDED, &old->flags); trace_afs_estate(estate->server_id, estate->probe_seq, refcount_read(&estate->ref), afs_estate_trace_alloc_probe); - afs_get_address_preferences(net, alist); + afs_get_address_preferences(net, new_alist); server->probed_at = jiffies; unprobed = (1UL << alist->nr_addrs) - 1; @@ -293,6 +303,8 @@ void afs_fs_probe_fileserver(struct afs_net *net, struct 
afs_server *server, } afs_put_endpoint_state(old, afs_estate_trace_put_probe); + afs_put_endpoint_state(estate, afs_estate_trace_put_probe); + return 0; } /* diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 9c8dfde758c3..3321fdafb3c7 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -1010,6 +1010,9 @@ extern int afs_merge_fs_addr4(struct afs_net *net, struct afs_addr_list *addr, __be32 xdr, u16 port); extern int afs_merge_fs_addr6(struct afs_net *net, struct afs_addr_list *addr, __be32 *xdr, u16 port); +void afs_set_peer_appdata(struct afs_server *server, + struct afs_addr_list *old_alist, + struct afs_addr_list *new_alist); /* * addr_prefs.c @@ -1207,8 +1210,8 @@ struct afs_endpoint_state *afs_get_endpoint_state(struct afs_endpoint_state *est enum afs_estate_trace where); void afs_put_endpoint_state(struct afs_endpoint_state *estate, enum afs_estate_trace where); extern void afs_fileserver_probe_result(struct afs_call *); -void afs_fs_probe_fileserver(struct afs_net *net, struct afs_server *server, - struct afs_addr_list *new_addrs, struct key *key); +int afs_fs_probe_fileserver(struct afs_net *net, struct afs_server *server, + struct afs_addr_list *new_alist, struct key *key); int afs_wait_for_fs_probes(struct afs_operation *op, struct afs_server_state *states, bool intr); extern void afs_probe_fileserver(struct afs_net *, struct afs_server *); extern void afs_fs_probe_dispatcher(struct work_struct *); @@ -1509,7 +1512,7 @@ extern void __exit afs_clean_up_permit_cache(void); */ extern spinlock_t afs_server_peer_lock; -extern struct afs_server *afs_find_server(struct afs_net *, const struct rxrpc_peer *); +struct afs_server *afs_find_server(const struct rxrpc_peer *peer); extern struct afs_server *afs_find_server_by_uuid(struct afs_net *, const uuid_t *); extern struct afs_server *afs_lookup_server(struct afs_cell *, struct key *, const uuid_t *, u32); extern struct afs_server *afs_get_server(struct afs_server *, enum afs_server_trace); diff --git 
a/fs/afs/proc.c b/fs/afs/proc.c index 9a3d8eb5da43..40e879c8ca77 100644 --- a/fs/afs/proc.c +++ b/fs/afs/proc.c @@ -444,8 +444,6 @@ static int afs_proc_servers_show(struct seq_file *m, void *v) } server = list_entry(v, struct afs_server, proc_link); - estate = rcu_dereference(server->endpoint_state); - alist = estate->addresses; seq_printf(m, "%pU %3d %3d %s\n", &server->uuid, refcount_read(&server->ref), @@ -455,10 +453,16 @@ static int afs_proc_servers_show(struct seq_file *m, void *v) server->flags, server->rtt); seq_printf(m, " - probe: last=%d\n", (int)(jiffies - server->probed_at) / HZ); + + estate = rcu_dereference(server->endpoint_state); + if (!estate) + goto out; failed = estate->failed_set; seq_printf(m, " - ESTATE pq=%x np=%u rsp=%lx f=%lx\n", estate->probe_seq, atomic_read(&estate->nr_probing), estate->responsive_set, estate->failed_set); + + alist = estate->addresses; seq_printf(m, " - ALIST v=%u ap=%u\n", alist->version, alist->addr_pref_version); for (i = 0; i < alist->nr_addrs; i++) { @@ -471,6 +475,8 @@ static int afs_proc_servers_show(struct seq_file *m, void *v) rxrpc_kernel_get_srtt(addr->peer), addr->last_error, addr->prio); } + +out: return 0; } diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c index de9e10575bdd..d5e480a33859 100644 --- a/fs/afs/rxrpc.c +++ b/fs/afs/rxrpc.c @@ -766,8 +766,14 @@ static void afs_rx_discard_new_call(struct rxrpc_call *rxcall, static void afs_rx_new_call(struct sock *sk, struct rxrpc_call *rxcall, unsigned long user_call_ID) { + struct afs_call *call = (struct afs_call *)user_call_ID; struct afs_net *net = afs_sock2net(sk); + call->peer = rxrpc_kernel_get_call_peer(sk->sk_socket, call->rxcall); + call->server = afs_find_server(call->peer); + if (!call->server) + trace_afs_cm_no_server(call, rxrpc_kernel_remote_srx(call->peer)); + queue_work(afs_wq, &net->charge_preallocation_work); } diff --git a/fs/afs/server.c b/fs/afs/server.c index 923e07c37032..1140773f7aed 100644 --- a/fs/afs/server.c +++ b/fs/afs/server.c @@ 
-21,42 +21,13 @@ static void __afs_put_server(struct afs_net *, struct afs_server *); /* * Find a server by one of its addresses. */ -struct afs_server *afs_find_server(struct afs_net *net, const struct rxrpc_peer *peer) +struct afs_server *afs_find_server(const struct rxrpc_peer *peer) { - const struct afs_endpoint_state *estate; - const struct afs_addr_list *alist; - struct afs_server *server = NULL; - unsigned int i; - int seq = 1; - - rcu_read_lock(); - - do { - if (server) - afs_unuse_server_notime(net, server, afs_server_trace_unuse_find_rsq); - server = NULL; - seq++; /* 2 on the 1st/lockless path, otherwise odd */ - read_seqbegin_or_lock(&net->fs_addr_lock, &seq); - - hlist_for_each_entry_rcu(server, &net->fs_addresses, addr_link) { - estate = rcu_dereference(server->endpoint_state); - alist = estate->addresses; - for (i = 0; i < alist->nr_addrs; i++) - if (alist->addrs[i].peer == peer) - goto found; - } + struct afs_server *server = (struct afs_server *)rxrpc_kernel_get_peer_data(peer); - server = NULL; - continue; - found: - server = afs_maybe_use_server(server, afs_server_trace_use_by_addr); - - } while (need_seqretry(&net->fs_addr_lock, seq)); - - done_seqretry(&net->fs_addr_lock, seq); - - rcu_read_unlock(); - return server; + if (!server) + return NULL; + return afs_maybe_use_server(server, afs_server_trace_use_cm_call); } /* @@ -468,9 +439,16 @@ static void afs_give_up_callbacks(struct afs_net *net, struct afs_server *server */ static void afs_destroy_server(struct afs_net *net, struct afs_server *server) { + struct afs_endpoint_state *estate; + if (test_bit(AFS_SERVER_FL_MAY_HAVE_CB, &server->flags)) afs_give_up_callbacks(net, server); + /* Unbind the rxrpc_peer records from the server. 
*/ + estate = rcu_access_pointer(server->endpoint_state); + if (estate) + afs_set_peer_appdata(server, estate->addresses, NULL); + afs_put_server(net, server, afs_server_trace_destroy); } diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index 82d20c28dc0d..4d798b9e43bf 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -140,12 +140,10 @@ enum yfs_cm_operation { EM(afs_server_trace_see_expired, "SEE expd ") \ EM(afs_server_trace_unuse_call, "UNU call ") \ EM(afs_server_trace_unuse_create_fail, "UNU cfail") \ - EM(afs_server_trace_unuse_find_rsq, "UNU f-rsq") \ EM(afs_server_trace_unuse_slist, "UNU slist") \ EM(afs_server_trace_unuse_slist_isort, "UNU isort") \ EM(afs_server_trace_unuse_uuid_rsq, "PUT u-req") \ EM(afs_server_trace_update, "UPDATE ") \ - EM(afs_server_trace_use_by_addr, "USE addr ") \ EM(afs_server_trace_use_by_uuid, "USE uuid ") \ EM(afs_server_trace_use_cm_call, "USE cm-cl") \ EM(afs_server_trace_use_get_caps, "USE gcaps") \ @@ -1281,7 +1279,7 @@ TRACE_EVENT(afs_bulkstat_error, ); TRACE_EVENT(afs_cm_no_server, - TP_PROTO(struct afs_call *call, struct sockaddr_rxrpc *srx), + TP_PROTO(struct afs_call *call, const struct sockaddr_rxrpc *srx), TP_ARGS(call, srx), diff --git a/net/rxrpc/peer_object.c b/net/rxrpc/peer_object.c index a0c0e4d590f5..71b6e07bf161 100644 --- a/net/rxrpc/peer_object.c +++ b/net/rxrpc/peer_object.c @@ -461,7 +461,7 @@ void rxrpc_destroy_all_peers(struct rxrpc_net *rxnet) continue; hlist_for_each_entry(peer, &rxnet->peer_hash[i], hash_link) { - pr_err("Leaked peer %u {%u} %pISp\n", + pr_err("Leaked peer %x {%u} %pISp\n", peer->debug_id, refcount_read(&peer->ref), &peer->srx.transport); @@ -478,7 +478,7 @@ void rxrpc_destroy_all_peers(struct rxrpc_net *rxnet) */ struct rxrpc_peer *rxrpc_kernel_get_call_peer(struct socket *sock, struct rxrpc_call *call) { - return call->peer; + return rxrpc_get_peer(call->peer, rxrpc_peer_get_application); } EXPORT_SYMBOL(rxrpc_kernel_get_call_peer); 
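
The simultaneous walk over the old and new address lists in afs_set_peer_appdata() can be modelled in userspace roughly as follows. This is a minimal sketch under stated assumptions, not the kernel code: `struct peer`, its `app_data` field and `sync_peer_data()` are hypothetical stand-ins, and both arrays are assumed to be sorted by pointer value with no duplicates. The sketch advances both indices when a peer appears in both lists, since the loop would otherwise make no progress on a match.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rxrpc_peer: only the app data slot. */
struct peer {
	unsigned long app_data;
};

/*
 * Walk two peer-pointer arrays, each sorted ascending by pointer value and
 * free of duplicates.  Peers found only in new_list are given 'data', peers
 * found only in old_list are cleared, and peers in both are left untouched.
 * Returns the number of peers whose app data was written.
 */
size_t sync_peer_data(struct peer **old_list, size_t nr_old,
		      struct peer **new_list, size_t nr_new,
		      unsigned long data)
{
	size_t n = 0, o = 0, changed = 0;

	while (n < nr_new && o < nr_old) {
		struct peer *pn = new_list[n];
		struct peer *po = old_list[o];

		if (pn == po) {
			/* In both lists: keep it and step past it in each. */
			n++;
			o++;
		} else if ((uintptr_t)pn < (uintptr_t)po) {
			/* Only in the new list: bind it. */
			pn->app_data = data;
			n++;
			changed++;
		} else {
			/* Only in the old list: unbind it. */
			po->app_data = 0;
			o++;
			changed++;
		}
	}

	/* At most one of the two tails is non-empty here. */
	for (; n < nr_new; n++, changed++)
		new_list[n]->app_data = data;
	for (; o < nr_old; o++, changed++)
		old_list[o]->app_data = 0;
	return changed;
}
```

Because both inputs are ordered, the walk touches each entry once and never has to search one list for a member of the other.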
From patchwork Mon Mar 10 09:42:03 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 14009487 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9E49D228CB8 for ; Mon, 10 Mar 2025 09:42:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599777; cv=none; b=f2JSusxRlOv4HguvnHrQgzYtvYzU/q64R5+CVpsMiEpM8eCoB0RjvWOq4iIb0YdNtc95a6y0uZM245ICRaTWYkYjTRHwgLTs931u1I/gpjmAU1DCfVsjYPFWmpDV5wq/i2sSCrHpLlfoIU6j7tzsUStpg89+RwH2H1TyG0C4RQw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599777; c=relaxed/simple; bh=de5b07SyeJXWBFnz5pkuEolbSF3AtYzsHuthpOZLmwM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=LdagkGEDMfWdHKfIQx0Zzjm4GEyA9Y2wVJDFtvkJeP59rGmoSAXgbDqnqUwUjMh6WR/myRjrHO9MGUoiCsW9WUbB+AsFV7dWI39+U8GelXQZwKHEC28NADWzOViMVHF7fgzB2raUlQBUm+i9OKLvspZhGzX/u9S1bXqWVhOuBUg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=XJO0mHc7; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="XJO0mHc7" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741599773; 
h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9NE4ZegdWhoJRTRrTystV+TjVp4mmZw8U3KNhjr9Mok=; b=XJO0mHc7C+csYKlfTqF35sSztWBNgphnXu6/dZeSF4AoJFdvCDnYLd0eXlx1Pkb5gR7kDu ixvl88Rc6KW79SUWr7VwaSHEOE230JRn+C+sFcLVHCMIwNWJb1wLVAmMd39lduXz9jV4P2 +O98+4QPuAdTFqTK+kIO+74dFUftIyQ= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-374-uxutYvR4NM-vs9A0MoeX2Q-1; Mon, 10 Mar 2025 05:42:52 -0400 X-MC-Unique: uxutYvR4NM-vs9A0MoeX2Q-1 X-Mimecast-MFC-AGG-ID: uxutYvR4NM-vs9A0MoeX2Q_1741599771 Received: from mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id C57E91809CA6; Mon, 10 Mar 2025 09:42:50 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.61]) by mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 069F83000197; Mon, 10 Mar 2025 09:42:47 +0000 (UTC) From: David Howells To: Marc Dionne Cc: David Howells , Christian Brauner , linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v4 10/11] afs: Fix afs_server ref accounting Date: Mon, 10 Mar 2025 09:42:03 +0000 Message-ID: <20250310094206.801057-11-dhowells@redhat.com> In-Reply-To: <20250310094206.801057-1-dhowells@redhat.com> References: <20250310094206.801057-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: 
List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4

The current way that afs_server refs are accounted and cleaned up sometimes
causes rmmod to hang when it is waiting for cell records to be removed.  The
problem is that the cell cleanup might occasionally happen before the server
cleanup and then there's nothing that causes the cell to garbage-collect the
remaining servers as they become inactive.

Partially fix this with the following changes:

 (1) Give each afs_server record its own management timer rather than
     relying on the cell manager's central timer to drive each individual
     cell's maintenance work item to garbage-collect servers.

     This timer is set when afs_unuse_server() reduces a server's activity
     count to zero and will schedule the server's destroyer work item upon
     firing.

 (2) Give each afs_server record its own destroyer work item that removes
     the record from the cell's database, shuts down the timer, cancels any
     pending work for itself, and sends an RPC to the server to cancel
     outstanding callbacks.

     This change, in combination with the timer, obviates the need to
     coordinate so closely between the cell record and a bunch of other
     server records to tear everything down in a coordinated fashion.  With
     this, the cell record is pinned until the server RCU is complete and
     namespace/module removal will wait until all the cell records are
     removed.

 (3) Now that incoming calls are mapped to servers (and thus cells) using
     data attached to an rxrpc_peer, the UUID-to-server mapping tree is
     moved from the namespace to the cell (cell->fs_servers).  This means
     there can no longer be duplicates therein - and that allows the
     mapping tree to be simpler as there doesn't need to be a chain of
     same-UUID servers that are in different cells.

 (4) The lock protecting the UUID mapping tree is switched to an
     rw_semaphore on the cell rather than a seqlock on the namespace as
     it's now only used during mounting in contexts in which we're allowed
     to sleep.

 (5) When it comes time for a cell that is being removed to purge its set
     of servers, it just needs to iterate over them and wake them up.  Once
     a server becomes inactive, its destroyer work item will observe the
     state of the cell and immediately remove that record.

 (6) When a server record is removed, it is marked AFS_SERVER_FL_EXPIRED to
     prevent reattempts at removal.  The record will be dispatched to RCU
     for destruction once its refcount reaches 0.

 (7) The AFS_SERVER_FL_UNCREATED/CREATING flags are used to synchronise
     simultaneous creation attempts.  If one attempt fails, it will abandon
     the attempt and allow another to try again.

Note that the record can't just be abandoned when dead as it's bound into a
server list attached to a volume and only subject to replacement if the
server list obtained for the volume from the VLDB changes.
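
The interaction between the activity count, the per-server timer and the EXPIRED flag described in points (1), (2) and (6) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct server_model` and the `model_*()` functions are hypothetical names, the timer is reduced to a stored deadline, and the model is meant to be stepped through single-threaded rather than to mirror the kernel's locking.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of a server record; not struct afs_server. */
struct server_model {
	atomic_int  active;	/* Active user count */
	atomic_bool expired;	/* Set once the record has been removed */
	long        expire_at;	/* Deadline armed when active hits zero; 0 = none */
};

/* Take an active use of the record. */
void model_use(struct server_model *s)
{
	atomic_fetch_add(&s->active, 1);
}

/* Drop an active use; arm the deadline when the last user goes away. */
void model_unuse(struct server_model *s, long now, long gc_delay)
{
	/* atomic_fetch_sub() returns the value held before the decrement. */
	if (atomic_fetch_sub(&s->active, 1) == 1)
		s->expire_at = now + gc_delay;
}

/*
 * Model of the destroyer running after the timer fires.  Returns true only
 * for the one call that actually removes the record.
 */
bool model_try_destroy(struct server_model *s, long now)
{
	if (atomic_load(&s->active) != 0)
		return false;	/* Reactivated since the deadline was armed */
	if (s->expire_at == 0 || now < s->expire_at)
		return false;	/* No deadline armed, or not yet reached */

	/* atomic_exchange() returns the old value, so only the first caller
	 * sees false here and goes on to do the removal.
	 */
	return !atomic_exchange(&s->expired, true);
}
```

The test-and-set on the expired flag is what keeps a second pass of the destroyer from attempting the removal again.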
Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250224234154.2014840-15-dhowells@redhat.com/ # v1 --- fs/afs/cell.c | 3 +- fs/afs/fsclient.c | 4 +- fs/afs/internal.h | 54 ++-- fs/afs/main.c | 10 +- fs/afs/server.c | 564 ++++++++++++++++--------------------- fs/afs/server_list.c | 4 +- include/trace/events/afs.h | 7 +- 7 files changed, 289 insertions(+), 357 deletions(-) diff --git a/fs/afs/cell.c b/fs/afs/cell.c index acbf35b4c9ed..694714d296ba 100644 --- a/fs/afs/cell.c +++ b/fs/afs/cell.c @@ -169,7 +169,7 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net, INIT_HLIST_HEAD(&cell->proc_volumes); seqlock_init(&cell->volume_lock); cell->fs_servers = RB_ROOT; - seqlock_init(&cell->fs_lock); + init_rwsem(&cell->fs_lock); rwlock_init(&cell->vl_servers_lock); cell->flags = (1 << AFS_CELL_FL_CHECK_ALIAS); @@ -838,6 +838,7 @@ static void afs_manage_cell(struct afs_cell *cell) /* The root volume is pinning the cell */ afs_put_volume(cell->root_volume, afs_volume_trace_put_cell_root); cell->root_volume = NULL; + afs_purge_servers(cell); afs_put_cell(cell, afs_cell_trace_put_destroy); } diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c index 9f46d9aebc33..bc9556991d7c 100644 --- a/fs/afs/fsclient.c +++ b/fs/afs/fsclient.c @@ -1653,7 +1653,7 @@ int afs_fs_give_up_all_callbacks(struct afs_net *net, struct afs_server *server, bp = call->request; *bp++ = htonl(FSGIVEUPALLCALLBACKS); - call->server = afs_use_server(server, afs_server_trace_use_give_up_cb); + call->server = afs_use_server(server, false, afs_server_trace_use_give_up_cb); afs_make_call(call, GFP_NOFS); afs_wait_for_call_to_complete(call); ret = call->error; @@ -1760,7 +1760,7 @@ bool afs_fs_get_capabilities(struct afs_net *net, struct afs_server *server, return false; call->key = key; - call->server = afs_use_server(server, afs_server_trace_use_get_caps); + call->server = afs_use_server(server, false, 
afs_server_trace_use_get_caps); call->peer = rxrpc_kernel_get_peer(estate->addresses->addrs[addr_index].peer); call->probe = afs_get_endpoint_state(estate, afs_estate_trace_get_getcaps); call->probe_index = addr_index; diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 3321fdafb3c7..1e0ab5e7fc88 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -302,18 +302,11 @@ struct afs_net { * cell, but in practice, people create aliases and subsets and there's * no easy way to distinguish them. */ - seqlock_t fs_lock; /* For fs_servers, fs_probe_*, fs_proc */ - struct rb_root fs_servers; /* afs_server (by server UUID or address) */ + seqlock_t fs_lock; /* For fs_probe_*, fs_proc */ struct list_head fs_probe_fast; /* List of afs_server to probe at 30s intervals */ struct list_head fs_probe_slow; /* List of afs_server to probe at 5m intervals */ struct hlist_head fs_proc; /* procfs servers list */ - struct hlist_head fs_addresses; /* afs_server (by lowest IPv6 addr) */ - seqlock_t fs_addr_lock; /* For fs_addresses[46] */ - - struct work_struct fs_manager; - struct timer_list fs_timer; - struct work_struct fs_prober; struct timer_list fs_probe_timer; atomic_t servers_outstanding; @@ -409,7 +402,7 @@ struct afs_cell { /* Active fileserver interaction state. */ struct rb_root fs_servers; /* afs_server (by server UUID) */ - seqlock_t fs_lock; /* For fs_servers */ + struct rw_semaphore fs_lock; /* For fs_servers */ /* VL server list. 
*/ rwlock_t vl_servers_lock; /* Lock on vl_servers */ @@ -544,22 +537,22 @@ struct afs_server { }; struct afs_cell *cell; /* Cell to which belongs (pins ref) */ - struct rb_node uuid_rb; /* Link in net->fs_servers */ - struct afs_server __rcu *uuid_next; /* Next server with same UUID */ - struct afs_server *uuid_prev; /* Previous server with same UUID */ - struct list_head probe_link; /* Link in net->fs_probe_list */ - struct hlist_node addr_link; /* Link in net->fs_addresses6 */ + struct rb_node uuid_rb; /* Link in cell->fs_servers */ + struct list_head probe_link; /* Link in net->fs_probe_* */ struct hlist_node proc_link; /* Link in net->fs_proc */ struct list_head volumes; /* RCU list of afs_server_entry objects */ - struct afs_server *gc_next; /* Next server in manager's list */ + struct work_struct destroyer; /* Work item to try and destroy a server */ + struct timer_list timer; /* Management timer */ time64_t unuse_time; /* Time at which last unused */ unsigned long flags; #define AFS_SERVER_FL_RESPONDING 0 /* The server is responding */ #define AFS_SERVER_FL_UPDATING 1 #define AFS_SERVER_FL_NEEDS_UPDATE 2 /* Fileserver address list is out of date */ -#define AFS_SERVER_FL_NOT_READY 4 /* The record is not ready for use */ -#define AFS_SERVER_FL_NOT_FOUND 5 /* VL server says no such server */ -#define AFS_SERVER_FL_VL_FAIL 6 /* Failed to access VL server */ +#define AFS_SERVER_FL_UNCREATED 3 /* The record needs creating */ +#define AFS_SERVER_FL_CREATING 4 /* The record is being created */ +#define AFS_SERVER_FL_EXPIRED 5 /* The record has expired */ +#define AFS_SERVER_FL_NOT_FOUND 6 /* VL server says no such server */ +#define AFS_SERVER_FL_VL_FAIL 7 /* Failed to access VL server */ #define AFS_SERVER_FL_MAY_HAVE_CB 8 /* May have callbacks on this fileserver */ #define AFS_SERVER_FL_IS_YFS 16 /* Server is YFS not AFS */ #define AFS_SERVER_FL_NO_IBULK 17 /* Fileserver doesn't support FS.InlineBulkStatus */ @@ -569,6 +562,7 @@ struct afs_server { atomic_t 
active; /* Active user count */ u32 addr_version; /* Address list version */ u16 service_id; /* Service ID we're using. */ + short create_error; /* Creation error */ unsigned int rtt; /* Server's current RTT in uS */ unsigned int debug_id; /* Debugging ID for traces */ @@ -1513,19 +1507,29 @@ extern void __exit afs_clean_up_permit_cache(void); extern spinlock_t afs_server_peer_lock; struct afs_server *afs_find_server(const struct rxrpc_peer *peer); -extern struct afs_server *afs_find_server_by_uuid(struct afs_net *, const uuid_t *); extern struct afs_server *afs_lookup_server(struct afs_cell *, struct key *, const uuid_t *, u32); extern struct afs_server *afs_get_server(struct afs_server *, enum afs_server_trace); -extern struct afs_server *afs_use_server(struct afs_server *, enum afs_server_trace); -extern void afs_unuse_server(struct afs_net *, struct afs_server *, enum afs_server_trace); -extern void afs_unuse_server_notime(struct afs_net *, struct afs_server *, enum afs_server_trace); +struct afs_server *afs_use_server(struct afs_server *server, bool activate, + enum afs_server_trace reason); +void afs_unuse_server(struct afs_net *net, struct afs_server *server, + enum afs_server_trace reason); +void afs_unuse_server_notime(struct afs_net *net, struct afs_server *server, + enum afs_server_trace reason); extern void afs_put_server(struct afs_net *, struct afs_server *, enum afs_server_trace); -extern void afs_manage_servers(struct work_struct *); -extern void afs_servers_timer(struct timer_list *); +void afs_purge_servers(struct afs_cell *cell); extern void afs_fs_probe_timer(struct timer_list *); -extern void __net_exit afs_purge_servers(struct afs_net *); +void __net_exit afs_wait_for_servers(struct afs_net *net); bool afs_check_server_record(struct afs_operation *op, struct afs_server *server, struct key *key); +static inline void afs_see_server(struct afs_server *server, enum afs_server_trace trace) +{ + int r = refcount_read(&server->ref); + int a = 
atomic_read(&server->active); + + trace_afs_server(server->debug_id, r, a, trace); + +} + static inline void afs_inc_servers_outstanding(struct afs_net *net) { atomic_inc(&net->servers_outstanding); diff --git a/fs/afs/main.c b/fs/afs/main.c index a7c7dc268302..bff0363286b0 100644 --- a/fs/afs/main.c +++ b/fs/afs/main.c @@ -86,16 +86,10 @@ static int __net_init afs_net_init(struct net *net_ns) INIT_HLIST_HEAD(&net->proc_cells); seqlock_init(&net->fs_lock); - net->fs_servers = RB_ROOT; INIT_LIST_HEAD(&net->fs_probe_fast); INIT_LIST_HEAD(&net->fs_probe_slow); INIT_HLIST_HEAD(&net->fs_proc); - INIT_HLIST_HEAD(&net->fs_addresses); - seqlock_init(&net->fs_addr_lock); - - INIT_WORK(&net->fs_manager, afs_manage_servers); - timer_setup(&net->fs_timer, afs_servers_timer, 0); INIT_WORK(&net->fs_prober, afs_fs_probe_dispatcher); timer_setup(&net->fs_probe_timer, afs_fs_probe_timer, 0); atomic_set(&net->servers_outstanding, 1); @@ -131,7 +125,7 @@ static int __net_init afs_net_init(struct net *net_ns) net->live = false; afs_fs_probe_cleanup(net); afs_cell_purge(net); - afs_purge_servers(net); + afs_wait_for_servers(net); error_cell_init: net->live = false; afs_proc_cleanup(net); @@ -153,7 +147,7 @@ static void __net_exit afs_net_exit(struct net *net_ns) net->live = false; afs_fs_probe_cleanup(net); afs_cell_purge(net); - afs_purge_servers(net); + afs_wait_for_servers(net); afs_close_socket(net); afs_proc_cleanup(net); afs_put_sysnames(net->sysnames); diff --git a/fs/afs/server.c b/fs/afs/server.c index 1140773f7aed..487e2134aea4 100644 --- a/fs/afs/server.c +++ b/fs/afs/server.c @@ -14,9 +14,9 @@ static unsigned afs_server_gc_delay = 10; /* Server record timeout in seconds */ static atomic_t afs_server_debug_id; -static struct afs_server *afs_maybe_use_server(struct afs_server *, - enum afs_server_trace); static void __afs_put_server(struct afs_net *, struct afs_server *); +static void afs_server_timer(struct timer_list *timer); +static void afs_server_destroyer(struct 
work_struct *work); /* * Find a server by one of its addresses. @@ -27,148 +27,91 @@ struct afs_server *afs_find_server(const struct rxrpc_peer *peer) if (!server) return NULL; - return afs_maybe_use_server(server, afs_server_trace_use_cm_call); + return afs_use_server(server, false, afs_server_trace_use_cm_call); } /* - * Look up a server by its UUID and mark it active. + * Look up a server by its UUID and mark it active. The caller must hold + * cell->fs_lock. */ -struct afs_server *afs_find_server_by_uuid(struct afs_net *net, const uuid_t *uuid) +static struct afs_server *afs_find_server_by_uuid(struct afs_cell *cell, const uuid_t *uuid) { - struct afs_server *server = NULL; + struct afs_server *server; struct rb_node *p; - int diff, seq = 1; + int diff; _enter("%pU", uuid); - do { - /* Unfortunately, rbtree walking doesn't give reliable results - * under just the RCU read lock, so we have to check for - * changes. - */ - if (server) - afs_unuse_server(net, server, afs_server_trace_unuse_uuid_rsq); - server = NULL; - seq++; /* 2 on the 1st/lockless path, otherwise odd */ - read_seqbegin_or_lock(&net->fs_lock, &seq); - - p = net->fs_servers.rb_node; - while (p) { - server = rb_entry(p, struct afs_server, uuid_rb); - - diff = memcmp(uuid, &server->uuid, sizeof(*uuid)); - if (diff < 0) { - p = p->rb_left; - } else if (diff > 0) { - p = p->rb_right; - } else { - afs_use_server(server, afs_server_trace_use_by_uuid); - break; - } - - server = NULL; - } - } while (need_seqretry(&net->fs_lock, seq)); + p = cell->fs_servers.rb_node; + while (p) { + server = rb_entry(p, struct afs_server, uuid_rb); - done_seqretry(&net->fs_lock, seq); + diff = memcmp(uuid, &server->uuid, sizeof(*uuid)); + if (diff < 0) { + p = p->rb_left; + } else if (diff > 0) { + p = p->rb_right; + } else { + if (test_bit(AFS_SERVER_FL_UNCREATED, &server->flags)) + return NULL; /* Need a write lock */ + afs_use_server(server, true, afs_server_trace_use_by_uuid); + return server; + } + } - _leave(" = 
%p", server); - return server; + return NULL; } /* - * Install a server record in the namespace tree. If there's a clash, we stick - * it into a list anchored on whichever afs_server struct is actually in the - * tree. + * Install a server record in the cell tree. The caller must hold an exclusive + * lock on cell->fs_lock. */ static struct afs_server *afs_install_server(struct afs_cell *cell, - struct afs_server *candidate) + struct afs_server **candidate) { - const struct afs_endpoint_state *estate; - const struct afs_addr_list *alist; - struct afs_server *server, *next; + struct afs_server *server; struct afs_net *net = cell->net; struct rb_node **pp, *p; int diff; _enter("%p", candidate); - write_seqlock(&net->fs_lock); - /* Firstly install the server in the UUID lookup tree */ - pp = &net->fs_servers.rb_node; + pp = &cell->fs_servers.rb_node; p = NULL; while (*pp) { p = *pp; _debug("- consider %p", p); server = rb_entry(p, struct afs_server, uuid_rb); - diff = memcmp(&candidate->uuid, &server->uuid, sizeof(uuid_t)); - if (diff < 0) { + diff = memcmp(&(*candidate)->uuid, &server->uuid, sizeof(uuid_t)); + if (diff < 0) pp = &(*pp)->rb_left; - } else if (diff > 0) { + else if (diff > 0) pp = &(*pp)->rb_right; - } else { - if (server->cell == cell) - goto exists; - - /* We have the same UUID representing servers in - * different cells. Append the new server to the list. 
- */ - for (;;) { - next = rcu_dereference_protected( - server->uuid_next, - lockdep_is_held(&net->fs_lock.lock)); - if (!next) - break; - server = next; - } - rcu_assign_pointer(server->uuid_next, candidate); - candidate->uuid_prev = server; - server = candidate; - goto added_dup; - } + else + goto exists; } - server = candidate; + server = *candidate; + *candidate = NULL; rb_link_node(&server->uuid_rb, p, pp); - rb_insert_color(&server->uuid_rb, &net->fs_servers); + rb_insert_color(&server->uuid_rb, &cell->fs_servers); + write_seqlock(&net->fs_lock); hlist_add_head_rcu(&server->proc_link, &net->fs_proc); + write_sequnlock(&net->fs_lock); afs_get_cell(cell, afs_cell_trace_get_server); -added_dup: - write_seqlock(&net->fs_addr_lock); - estate = rcu_dereference_protected(server->endpoint_state, - lockdep_is_held(&net->fs_addr_lock.lock)); - alist = estate->addresses; - - /* Secondly, if the server has any IPv4 and/or IPv6 addresses, install - * it in the IPv4 and/or IPv6 reverse-map lists. - * - * TODO: For speed we want to use something other than a flat list - * here; even sorting the list in terms of lowest address would help a - * bit, but anything we might want to do gets messy and memory - * intensive. - */ - if (alist->nr_addrs > 0) - hlist_add_head_rcu(&server->addr_link, &net->fs_addresses); - - write_sequnlock(&net->fs_addr_lock); - exists: - afs_get_server(server, afs_server_trace_get_install); - write_sequnlock(&net->fs_lock); + afs_use_server(server, true, afs_server_trace_get_install); return server; } /* - * Allocate a new server record and mark it active. + * Allocate a new server record and mark it as active but uncreated. 
*/ -static struct afs_server *afs_alloc_server(struct afs_cell *cell, - const uuid_t *uuid, - struct afs_addr_list *alist) +static struct afs_server *afs_alloc_server(struct afs_cell *cell, const uuid_t *uuid) { - struct afs_endpoint_state *estate; struct afs_server *server; struct afs_net *net = cell->net; @@ -176,65 +119,49 @@ static struct afs_server *afs_alloc_server(struct afs_cell *cell, server = kzalloc(sizeof(struct afs_server), GFP_KERNEL); if (!server) - goto enomem; - - estate = kzalloc(sizeof(struct afs_endpoint_state), GFP_KERNEL); - if (!estate) - goto enomem_server; + return NULL; refcount_set(&server->ref, 1); - atomic_set(&server->active, 1); + atomic_set(&server->active, 0); + __set_bit(AFS_SERVER_FL_UNCREATED, &server->flags); server->debug_id = atomic_inc_return(&afs_server_debug_id); - server->addr_version = alist->version; server->uuid = *uuid; rwlock_init(&server->fs_lock); + INIT_WORK(&server->destroyer, &afs_server_destroyer); + timer_setup(&server->timer, afs_server_timer, 0); INIT_LIST_HEAD(&server->volumes); init_waitqueue_head(&server->probe_wq); INIT_LIST_HEAD(&server->probe_link); + INIT_HLIST_NODE(&server->proc_link); spin_lock_init(&server->probe_lock); server->cell = cell; server->rtt = UINT_MAX; server->service_id = FS_SERVICE; - server->probe_counter = 1; server->probed_at = jiffies - LONG_MAX / 2; - refcount_set(&estate->ref, 1); - estate->addresses = alist; - estate->server_id = server->debug_id; - estate->probe_seq = 1; - rcu_assign_pointer(server->endpoint_state, estate); afs_inc_servers_outstanding(net); - trace_afs_server(server->debug_id, 1, 1, afs_server_trace_alloc); - trace_afs_estate(estate->server_id, estate->probe_seq, refcount_read(&estate->ref), - afs_estate_trace_alloc_server); _leave(" = %p", server); return server; - -enomem_server: - kfree(server); -enomem: - _leave(" = NULL [nomem]"); - return NULL; } /* * Look up an address record for a server */ -static struct afs_addr_list *afs_vl_lookup_addrs(struct 
afs_cell *cell, - struct key *key, const uuid_t *uuid) +static struct afs_addr_list *afs_vl_lookup_addrs(struct afs_server *server, + struct key *key) { struct afs_vl_cursor vc; struct afs_addr_list *alist = NULL; int ret; ret = -ERESTARTSYS; - if (afs_begin_vlserver_operation(&vc, cell, key)) { + if (afs_begin_vlserver_operation(&vc, server->cell, key)) { while (afs_select_vlserver(&vc)) { if (test_bit(AFS_VLSERVER_FL_IS_YFS, &vc.server->flags)) - alist = afs_yfsvl_get_endpoints(&vc, uuid); + alist = afs_yfsvl_get_endpoints(&vc, &server->uuid); else - alist = afs_vl_get_addrs_u(&vc, uuid); + alist = afs_vl_get_addrs_u(&vc, &server->uuid); } ret = afs_end_vlserver_operation(&vc); @@ -250,67 +177,116 @@ static struct afs_addr_list *afs_vl_lookup_addrs(struct afs_cell *cell, struct afs_server *afs_lookup_server(struct afs_cell *cell, struct key *key, const uuid_t *uuid, u32 addr_version) { - struct afs_addr_list *alist; - struct afs_server *server, *candidate; + struct afs_addr_list *alist = NULL; + struct afs_server *server, *candidate = NULL; + bool creating = false; + int ret; _enter("%p,%pU", cell->net, uuid); - server = afs_find_server_by_uuid(cell->net, uuid); + down_read(&cell->fs_lock); + server = afs_find_server_by_uuid(cell, uuid); + /* Won't see servers marked uncreated. 
*/ + up_read(&cell->fs_lock); + if (server) { + timer_delete_sync(&server->timer); + if (test_bit(AFS_SERVER_FL_CREATING, &server->flags)) + goto wait_for_creation; if (server->addr_version != addr_version) set_bit(AFS_SERVER_FL_NEEDS_UPDATE, &server->flags); return server; } - alist = afs_vl_lookup_addrs(cell, key, uuid); - if (IS_ERR(alist)) - return ERR_CAST(alist); - - candidate = afs_alloc_server(cell, uuid, alist); + candidate = afs_alloc_server(cell, uuid); if (!candidate) { afs_put_addrlist(alist, afs_alist_trace_put_server_oom); return ERR_PTR(-ENOMEM); } - server = afs_install_server(cell, candidate); - if (server != candidate) { - afs_put_addrlist(alist, afs_alist_trace_put_server_dup); + down_write(&cell->fs_lock); + server = afs_install_server(cell, &candidate); + if (test_bit(AFS_SERVER_FL_CREATING, &server->flags)) { + /* We need to wait for creation to complete. */ + up_write(&cell->fs_lock); + goto wait_for_creation; + } + if (test_bit(AFS_SERVER_FL_UNCREATED, &server->flags)) { + set_bit(AFS_SERVER_FL_CREATING, &server->flags); + clear_bit(AFS_SERVER_FL_UNCREATED, &server->flags); + creating = true; + } + up_write(&cell->fs_lock); + timer_delete_sync(&server->timer); + + /* If we get to create the server, we look up the addresses and then + * immediately dispatch an asynchronous probe to each interface on the + * fileserver. This will make sure the repeat-probing service is + * started. + */ + if (creating) { + alist = afs_vl_lookup_addrs(server, key); + if (IS_ERR(alist)) { + ret = PTR_ERR(alist); + goto create_failed; + } + + ret = afs_fs_probe_fileserver(cell->net, server, alist, key); + if (ret) + goto create_failed; + + clear_and_wake_up_bit(AFS_SERVER_FL_CREATING, &server->flags); + } + +out: + afs_put_addrlist(alist, afs_alist_trace_put_server_create); + if (candidate) { + kfree(rcu_access_pointer(server->endpoint_state)); kfree(candidate); - } else { - /* Immediately dispatch an asynchronous probe to each interface - * on the fileserver. 
This will make sure the repeat-probing - * service is started. - */ - afs_fs_probe_fileserver(cell->net, server, alist, key); + afs_dec_servers_outstanding(cell->net); + } + return server ?: ERR_PTR(ret); + +wait_for_creation: + afs_see_server(server, afs_server_trace_wait_create); + wait_on_bit(&server->flags, AFS_SERVER_FL_CREATING, TASK_UNINTERRUPTIBLE); + if (test_bit_acquire(AFS_SERVER_FL_UNCREATED, &server->flags)) { + /* Barrier: read flag before error */ + ret = READ_ONCE(server->create_error); + afs_put_server(cell->net, server, afs_server_trace_unuse_create_fail); + server = NULL; + goto out; } - return server; -} + ret = 0; + goto out; -/* - * Set the server timer to fire after a given delay, assuming it's not already - * set for an earlier time. - */ -static void afs_set_server_timer(struct afs_net *net, time64_t delay) -{ - if (net->live) { - afs_inc_servers_outstanding(net); - if (timer_reduce(&net->fs_timer, jiffies + delay * HZ)) - afs_dec_servers_outstanding(net); +create_failed: + down_write(&cell->fs_lock); + + WRITE_ONCE(server->create_error, ret); + smp_wmb(); /* Barrier: set error before flag. */ + set_bit(AFS_SERVER_FL_UNCREATED, &server->flags); + + clear_and_wake_up_bit(AFS_SERVER_FL_CREATING, &server->flags); + + if (test_bit(AFS_SERVER_FL_UNCREATED, &server->flags)) { + clear_bit(AFS_SERVER_FL_UNCREATED, &server->flags); + creating = true; } + afs_unuse_server(cell->net, server, afs_server_trace_unuse_create_fail); + server = NULL; + + up_write(&cell->fs_lock); + goto out; } /* - * Server management timer. We have an increment on fs_outstanding that we - * need to pass along to the work item. + * Set/reduce a server's timer. 
*/ -void afs_servers_timer(struct timer_list *timer) +static void afs_set_server_timer(struct afs_server *server, unsigned int delay_secs) { - struct afs_net *net = container_of(timer, struct afs_net, fs_timer); - - _enter(""); - if (!queue_work(afs_wq, &net->fs_manager)) - afs_dec_servers_outstanding(net); + mod_timer(&server->timer, jiffies + delay_secs * HZ); } /* @@ -329,32 +305,20 @@ struct afs_server *afs_get_server(struct afs_server *server, } /* - * Try to get a reference on a server object. - */ -static struct afs_server *afs_maybe_use_server(struct afs_server *server, - enum afs_server_trace reason) -{ - unsigned int a; - int r; - - if (!__refcount_inc_not_zero(&server->ref, &r)) - return NULL; - - a = atomic_inc_return(&server->active); - trace_afs_server(server->debug_id, r + 1, a, reason); - return server; -} - -/* - * Get an active count on a server object. + * Get an active count on a server object and maybe remove from the inactive + * list. */ -struct afs_server *afs_use_server(struct afs_server *server, enum afs_server_trace reason) +struct afs_server *afs_use_server(struct afs_server *server, bool activate, + enum afs_server_trace reason) { unsigned int a; int r; __refcount_inc(&server->ref, &r); a = atomic_inc_return(&server->active); + if (a == 1 && activate && + !test_bit(AFS_SERVER_FL_EXPIRED, &server->flags)) + del_timer(&server->timer); trace_afs_server(server->debug_id, r + 1, a, reason); return server; @@ -387,13 +351,16 @@ void afs_put_server(struct afs_net *net, struct afs_server *server, void afs_unuse_server_notime(struct afs_net *net, struct afs_server *server, enum afs_server_trace reason) { - if (server) { - unsigned int active = atomic_dec_return(&server->active); + if (!server) + return; - if (active == 0) - afs_set_server_timer(net, afs_server_gc_delay); - afs_put_server(net, server, reason); + if (atomic_dec_and_test(&server->active)) { + if (test_bit(AFS_SERVER_FL_EXPIRED, &server->flags) || + READ_ONCE(server->cell->state) >= 
AFS_CELL_FAILED) + schedule_work(&server->destroyer); } + + afs_put_server(net, server, reason); } /* @@ -402,10 +369,22 @@ void afs_unuse_server_notime(struct afs_net *net, struct afs_server *server, void afs_unuse_server(struct afs_net *net, struct afs_server *server, enum afs_server_trace reason) { - if (server) { - server->unuse_time = ktime_get_real_seconds(); - afs_unuse_server_notime(net, server, reason); + if (!server) + return; + + if (atomic_dec_and_test(&server->active)) { + if (!test_bit(AFS_SERVER_FL_EXPIRED, &server->flags) && + READ_ONCE(server->cell->state) < AFS_CELL_FAILED) { + time64_t unuse_time = ktime_get_real_seconds(); + + server->unuse_time = unuse_time; + afs_set_server_timer(server, afs_server_gc_delay); + } else { + schedule_work(&server->destroyer); + } } + + afs_put_server(net, server, reason); } static void afs_server_rcu(struct rcu_head *rcu) @@ -435,166 +414,119 @@ static void afs_give_up_callbacks(struct afs_net *net, struct afs_server *server } /* - * destroy a dead server + * Check to see if the server record has expired. */ -static void afs_destroy_server(struct afs_net *net, struct afs_server *server) +static bool afs_has_server_expired(const struct afs_server *server) { - struct afs_endpoint_state *estate; + time64_t expires_at; - if (test_bit(AFS_SERVER_FL_MAY_HAVE_CB, &server->flags)) - afs_give_up_callbacks(net, server); + if (atomic_read(&server->active)) + return false; - /* Unbind the rxrpc_peer records from the server. 
*/ - estate = rcu_access_pointer(server->endpoint_state); - if (estate) - afs_set_peer_appdata(server, estate->addresses, NULL); + if (server->cell->net->live || + server->cell->state >= AFS_CELL_FAILED) { + trace_afs_server(server->debug_id, refcount_read(&server->ref), + 0, afs_server_trace_purging); + return true; + } - afs_put_server(net, server, afs_server_trace_destroy); + expires_at = server->unuse_time; + if (!test_bit(AFS_SERVER_FL_VL_FAIL, &server->flags) && + !test_bit(AFS_SERVER_FL_NOT_FOUND, &server->flags)) + expires_at += afs_server_gc_delay; + + return ktime_get_real_seconds() > expires_at; } /* - * Garbage collect any expired servers. + * Remove a server record from it's parent cell's database. */ -static void afs_gc_servers(struct afs_net *net, struct afs_server *gc_list) +static bool afs_remove_server_from_cell(struct afs_server *server) { - struct afs_server *server, *next, *prev; - int active; - - while ((server = gc_list)) { - gc_list = server->gc_next; - - write_seqlock(&net->fs_lock); - - active = atomic_read(&server->active); - if (active == 0) { - trace_afs_server(server->debug_id, refcount_read(&server->ref), - active, afs_server_trace_gc); - next = rcu_dereference_protected( - server->uuid_next, lockdep_is_held(&net->fs_lock.lock)); - prev = server->uuid_prev; - if (!prev) { - /* The one at the front is in the tree */ - if (!next) { - rb_erase(&server->uuid_rb, &net->fs_servers); - } else { - rb_replace_node_rcu(&server->uuid_rb, - &next->uuid_rb, - &net->fs_servers); - next->uuid_prev = NULL; - } - } else { - /* This server is not at the front */ - rcu_assign_pointer(prev->uuid_next, next); - if (next) - next->uuid_prev = prev; - } - - list_del(&server->probe_link); - hlist_del_rcu(&server->proc_link); - if (!hlist_unhashed(&server->addr_link)) - hlist_del_rcu(&server->addr_link); - } - write_sequnlock(&net->fs_lock); + struct afs_cell *cell = server->cell; - if (active == 0) - afs_destroy_server(net, server); + 
down_write(&cell->fs_lock); + + if (!afs_has_server_expired(server)) { + up_write(&cell->fs_lock); + return false; } + + set_bit(AFS_SERVER_FL_EXPIRED, &server->flags); + _debug("expire %pU %u", &server->uuid, atomic_read(&server->active)); + afs_see_server(server, afs_server_trace_see_expired); + rb_erase(&server->uuid_rb, &cell->fs_servers); + up_write(&cell->fs_lock); + return true; } -/* - * Manage the records of servers known to be within a network namespace. This - * includes garbage collecting unused servers. - * - * Note also that we were given an increment on net->servers_outstanding by - * whoever queued us that we need to deal with before returning. - */ -void afs_manage_servers(struct work_struct *work) +static void afs_server_destroyer(struct work_struct *work) { - struct afs_net *net = container_of(work, struct afs_net, fs_manager); - struct afs_server *gc_list = NULL; - struct rb_node *cursor; - time64_t now = ktime_get_real_seconds(), next_manage = TIME64_MAX; - bool purging = !net->live; - - _enter(""); + struct afs_endpoint_state *estate; + struct afs_server *server = container_of(work, struct afs_server, destroyer); + struct afs_net *net = server->cell->net; - /* Trawl the server list looking for servers that have expired from - * lack of use. 
- */ - read_seqlock_excl(&net->fs_lock); + afs_see_server(server, afs_server_trace_see_destroyer); - for (cursor = rb_first(&net->fs_servers); cursor; cursor = rb_next(cursor)) { - struct afs_server *server = - rb_entry(cursor, struct afs_server, uuid_rb); - int active = atomic_read(&server->active); + if (test_bit(AFS_SERVER_FL_EXPIRED, &server->flags)) + return; - _debug("manage %pU %u", &server->uuid, active); + if (!afs_remove_server_from_cell(server)) + return; - if (purging) { - trace_afs_server(server->debug_id, refcount_read(&server->ref), - active, afs_server_trace_purging); - if (active != 0) - pr_notice("Can't purge s=%08x\n", server->debug_id); - } + timer_shutdown_sync(&server->timer); + cancel_work(&server->destroyer); - if (active == 0) { - time64_t expire_at = server->unuse_time; - - if (!test_bit(AFS_SERVER_FL_VL_FAIL, &server->flags) && - !test_bit(AFS_SERVER_FL_NOT_FOUND, &server->flags)) - expire_at += afs_server_gc_delay; - if (purging || expire_at <= now) { - server->gc_next = gc_list; - gc_list = server; - } else if (expire_at < next_manage) { - next_manage = expire_at; - } - } - } + if (test_bit(AFS_SERVER_FL_MAY_HAVE_CB, &server->flags)) + afs_give_up_callbacks(net, server); - read_sequnlock_excl(&net->fs_lock); + /* Unbind the rxrpc_peer records from the server. */ + estate = rcu_access_pointer(server->endpoint_state); + if (estate) + afs_set_peer_appdata(server, estate->addresses, NULL); - /* Update the timer on the way out. We have to pass an increment on - * servers_outstanding in the namespace that we are in to the timer or - * the work scheduler. 
- */ - if (!purging && next_manage < TIME64_MAX) { - now = ktime_get_real_seconds(); + write_seqlock(&net->fs_lock); + list_del_init(&server->probe_link); + if (!hlist_unhashed(&server->proc_link)) + hlist_del_rcu(&server->proc_link); + write_sequnlock(&net->fs_lock); - if (next_manage - now <= 0) { - if (queue_work(afs_wq, &net->fs_manager)) - afs_inc_servers_outstanding(net); - } else { - afs_set_server_timer(net, next_manage - now); - } - } + afs_put_server(net, server, afs_server_trace_destroy); +} - afs_gc_servers(net, gc_list); +static void afs_server_timer(struct timer_list *timer) +{ + struct afs_server *server = container_of(timer, struct afs_server, timer); - afs_dec_servers_outstanding(net); - _leave(" [%d]", atomic_read(&net->servers_outstanding)); + afs_see_server(server, afs_server_trace_see_timer); + if (!test_bit(AFS_SERVER_FL_EXPIRED, &server->flags)) + schedule_work(&server->destroyer); } -static void afs_queue_server_manager(struct afs_net *net) +/* + * Wake up all the servers in a cell so that they can purge themselves. + */ +void afs_purge_servers(struct afs_cell *cell) { - afs_inc_servers_outstanding(net); - if (!queue_work(afs_wq, &net->fs_manager)) - afs_dec_servers_outstanding(net); + struct afs_server *server; + struct rb_node *rb; + + down_read(&cell->fs_lock); + for (rb = rb_first(&cell->fs_servers); rb; rb = rb_next(rb)) { + server = rb_entry(rb, struct afs_server, uuid_rb); + afs_see_server(server, afs_server_trace_see_purge); + schedule_work(&server->destroyer); + } + up_read(&cell->fs_lock); } /* - * Purge list of servers. + * Wait for outstanding servers. 
*/ -void afs_purge_servers(struct afs_net *net) +void afs_wait_for_servers(struct afs_net *net) { _enter(""); - if (del_timer_sync(&net->fs_timer)) - afs_dec_servers_outstanding(net); - - afs_queue_server_manager(net); - - _debug("wait"); atomic_dec(&net->servers_outstanding); wait_var_event(&net->servers_outstanding, !atomic_read(&net->servers_outstanding)); @@ -618,7 +550,7 @@ static noinline bool afs_update_server_record(struct afs_operation *op, atomic_read(&server->active), afs_server_trace_update); - alist = afs_vl_lookup_addrs(op->volume->cell, op->key, &server->uuid); + alist = afs_vl_lookup_addrs(server, op->key); if (IS_ERR(alist)) { rcu_read_lock(); estate = rcu_dereference(server->endpoint_state); diff --git a/fs/afs/server_list.c b/fs/afs/server_list.c index 784236b9b2a9..20d5474837df 100644 --- a/fs/afs/server_list.c +++ b/fs/afs/server_list.c @@ -97,8 +97,8 @@ struct afs_server_list *afs_alloc_server_list(struct afs_volume *volume, break; if (j < slist->nr_servers) { if (slist->servers[j].server == server) { - afs_unuse_server(volume->cell->net, server, - afs_server_trace_unuse_slist_isort); + afs_unuse_server_notime(volume->cell->net, server, + afs_server_trace_unuse_slist_isort); continue; } diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index 4d798b9e43bf..02f8b2a6977c 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -127,7 +127,6 @@ enum yfs_cm_operation { E_(afs_call_trace_work, "QUEUE") #define afs_server_traces \ - EM(afs_server_trace_alloc, "ALLOC ") \ EM(afs_server_trace_callback, "CALLBACK ") \ EM(afs_server_trace_destroy, "DESTROY ") \ EM(afs_server_trace_free, "FREE ") \ @@ -137,12 +136,14 @@ enum yfs_cm_operation { EM(afs_server_trace_purging, "PURGE ") \ EM(afs_server_trace_put_cbi, "PUT cbi ") \ EM(afs_server_trace_put_probe, "PUT probe") \ + EM(afs_server_trace_see_destroyer, "SEE destr") \ EM(afs_server_trace_see_expired, "SEE expd ") \ + EM(afs_server_trace_see_purge, "SEE purge") \ + 
EM(afs_server_trace_see_timer, "SEE timer") \ EM(afs_server_trace_unuse_call, "UNU call ") \ EM(afs_server_trace_unuse_create_fail, "UNU cfail") \ EM(afs_server_trace_unuse_slist, "UNU slist") \ EM(afs_server_trace_unuse_slist_isort, "UNU isort") \ - EM(afs_server_trace_unuse_uuid_rsq, "PUT u-req") \ EM(afs_server_trace_update, "UPDATE ") \ EM(afs_server_trace_use_by_uuid, "USE uuid ") \ EM(afs_server_trace_use_cm_call, "USE cm-cl") \ @@ -229,7 +230,7 @@ enum yfs_cm_operation { EM(afs_alist_trace_put_getaddru, "PUT GtAdrU") \ EM(afs_alist_trace_put_parse_empty, "PUT p-empt") \ EM(afs_alist_trace_put_parse_error, "PUT p-err ") \ - EM(afs_alist_trace_put_server_dup, "PUT sv-dup") \ + EM(afs_alist_trace_put_server_create, "PUT sv-crt") \ EM(afs_alist_trace_put_server_oom, "PUT sv-oom") \ EM(afs_alist_trace_put_server_update, "PUT sv-upd") \ EM(afs_alist_trace_put_vlgetcaps, "PUT vgtcap") \ From patchwork Mon Mar 10 09:42:04 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 14009488 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AEAEA229B38 for ; Mon, 10 Mar 2025 09:42:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599782; cv=none; b=r8/BqXa2T/6FZKmY6X17JQObMO/HazEZaeMl5BNtzpbFBjU/bVCVx+zIjcw1D8MyPB4GHJG1oo9kVDsqsr3sf0HaW/bb1P71aTHMOPaJE9zsFxsV8A5D1udEKaAYUNGSzFOc0u07LreOar5gfJlRjCxgG2TS1X8us9qjxPhaH/0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741599782; c=relaxed/simple; bh=aumxTOPjrUnUuW6eC/ryM/5CyTHCARkXtAow5GOwo0U=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: 
MIME-Version; b=E4Ruuk1W4+OnVdQ+mb7QBfIh0cFCevepLIUk9Gz0AM15GoQ7EhYYDz6hyZA0C/4pLjO3tJBFRNuSZZm7EKeBi2CeZ+hmLLlwwc+vBJRmrvnMHbGk4Zxb7ZejqZsTDUmkkTFKcUjR9nmUdIz6rqJPUoxPFLCl0UyiXPs2PlDs220= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=QN2VF5Wa; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="QN2VF5Wa" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1741599778; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vAvciIykKJLMNxGHQf8kaQjS1DwchiuzyQPyLasKTWg=; b=QN2VF5WaQqZC3VT6fJW+y/QfUnFuxyKImVc95t6GCr0/stxrpqwZNBojUt2sGSK98xRqm8 842KS2bMk1A6O9YQ3cx2oelGZa2b64N6nIQg1ji+eicDVSKAiJLWGBofMF2Qru9F8VTC6F xa79y64H50EpO4H7Mx/1MQ0hCK10iEA= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-209-AQK5usQaND2px50PiZcj6w-1; Mon, 10 Mar 2025 05:42:55 -0400 X-MC-Unique: AQK5usQaND2px50PiZcj6w-1 X-Mimecast-MFC-AGG-ID: AQK5usQaND2px50PiZcj6w_1741599774 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest 
SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 6AE6B1800349; Mon, 10 Mar 2025 09:42:54 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.61]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 744151800366; Mon, 10 Mar 2025 09:42:52 +0000 (UTC) From: David Howells To: Marc Dionne Cc: David Howells , Christian Brauner , linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v4 11/11] afs: Simplify cell record handling Date: Mon, 10 Mar 2025 09:42:04 +0000 Message-ID: <20250310094206.801057-12-dhowells@redhat.com> In-Reply-To: <20250310094206.801057-1-dhowells@redhat.com> References: <20250310094206.801057-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93

Simplify afs_cell record handling to avoid very occasional races that cause module removal to hang (it waits for all cell records to be removed).

There are two things that particularly contribute to the difficulty: firstly, the code tries to pass a ref on the cell to the cell's maintenance work item (which gets awkward if the work item is already queued); and, secondly, there's an overall cell manager that tries to use just one timer for the entire cell collection (to avoid having loads of timers). However, both of these are probably unnecessarily restrictive.

To simplify this, the following changes are made:

(1) The cell record collection manager is removed. Each cell record manages itself individually.

(2) Each afs_cell is given a second work item (cell->destroyer) that is queued when its refcount reaches zero. This is not done in the context of the putting thread as it might be in an inconvenient place to sleep.

(3) Each afs_cell is given its own timer.
The timer is used to expire the cell record after a period of unuse if not otherwise pinned and can also be used for other maintenance tasks if necessary (of which there are currently none as DNS refresh is triggered by filesystem operations).

(4) The afs_cell manager work item (cell->manager) is no longer given a ref on the cell when queued; rather, the manager must be deleted. This does away with the need to deal with the consequences of losing a race to queue cell->manager. Clean up of extra queuing is deferred to the destroyer.

(5) The cell destroyer work item makes sure the cell timer is removed and that the normal cell work is cancelled before farming the actual destruction off to RCU.

(6) When a network namespace is destroyed or the kafs module is unloaded, it's now a simple matter of marking the namespace as dead then just waking up all the cell work items. They will then remove and destroy themselves once all remaining activity counts and/or ref counts are dropped. This makes sure that all server records are dropped first.

(7) The cell record state set is reduced to just four states: SETTING_UP, ACTIVE, REMOVING and DEAD. The record persists in the active state even when it's not being used until the time comes to remove it rather than downgrading it to an inactive state from whence it can be restored. This means that the cell still appears in /proc and /afs when not in use until it switches to the REMOVING state - at which point it is removed.

Note that the REMOVING state is included so that someone wanting to resurrect the cell record is forced to wait whilst the cell is torn down in that state. Once it's in the DEAD state, it has been removed from the net->cells tree and is no longer findable and can be replaced.
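
The four-state lifecycle in (7) can be modelled in a few lines of userspace C. This is only an illustrative sketch and not code from the patch: the enum and function names are hypothetical, and the set of allowed transitions (in particular SETTING_UP going straight to REMOVING on a failed setup) is an assumption. The lookup behaviour mirrors the wait in afs_lookup_cell(), which sleeps until the record is either ACTIVE or DEAD and returns the cell's error in the DEAD case.

```c
#include <stdbool.h>

/* Hypothetical userspace model; these names are not the kernel's. */
enum cell_state { CELL_SETTING_UP, CELL_ACTIVE, CELL_REMOVING, CELL_DEAD };

enum lookup_action { LOOKUP_WAIT, LOOKUP_USE, LOOKUP_FAIL };

/* What a lookup that found the record should do, given its state. */
static enum lookup_action lookup_action_for(enum cell_state s)
{
	switch (s) {
	case CELL_ACTIVE:
		return LOOKUP_USE;	/* ready to use */
	case CELL_DEAD:
		return LOOKUP_FAIL;	/* report the record's error */
	default:
		return LOOKUP_WAIT;	/* SETTING_UP or REMOVING: sleep */
	}
}

/* A record stays findable in the tree until it reaches DEAD. */
static bool cell_is_findable(enum cell_state s)
{
	return s != CELL_DEAD;
}

/* Assumed forward-only transitions; nothing ever leaves DEAD. */
static bool cell_transition_ok(enum cell_state from, enum cell_state to)
{
	switch (from) {
	case CELL_SETTING_UP:
		return to == CELL_ACTIVE || to == CELL_REMOVING;
	case CELL_ACTIVE:
		return to == CELL_REMOVING;
	case CELL_REMOVING:
		return to == CELL_DEAD;
	default:
		return false;
	}
}
```

In this model there is no way back from REMOVING to ACTIVE, so a caller that wants the cell again has to wait for DEAD and then install a fresh record.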
Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250224234154.2014840-16-dhowells@redhat.com/ # v1 --- fs/afs/cell.c | 404 +++++++++++++++---------------------- fs/afs/dynroot.c | 4 +- fs/afs/internal.h | 16 +- fs/afs/main.c | 3 - fs/afs/server.c | 8 +- fs/afs/vl_rotate.c | 2 +- include/trace/events/afs.h | 23 +-- 7 files changed, 187 insertions(+), 273 deletions(-) diff --git a/fs/afs/cell.c b/fs/afs/cell.c index 694714d296ba..0168bbf53fe0 100644 --- a/fs/afs/cell.c +++ b/fs/afs/cell.c @@ -20,8 +20,9 @@ static unsigned __read_mostly afs_cell_min_ttl = 10 * 60; static unsigned __read_mostly afs_cell_max_ttl = 24 * 60 * 60; static atomic_t cell_debug_id; -static void afs_queue_cell_manager(struct afs_net *); -static void afs_manage_cell_work(struct work_struct *); +static void afs_cell_timer(struct timer_list *timer); +static void afs_destroy_cell_work(struct work_struct *work); +static void afs_manage_cell_work(struct work_struct *work); static void afs_dec_cells_outstanding(struct afs_net *net) { @@ -29,19 +30,11 @@ static void afs_dec_cells_outstanding(struct afs_net *net) wake_up_var(&net->cells_outstanding); } -/* - * Set the cell timer to fire after a given delay, assuming it's not already - * set for an earlier time. 
- */ -static void afs_set_cell_timer(struct afs_net *net, time64_t delay) +static void afs_set_cell_state(struct afs_cell *cell, enum afs_cell_state state) { - if (net->live) { - atomic_inc(&net->cells_outstanding); - if (timer_reduce(&net->cells_timer, jiffies + delay * HZ)) - afs_dec_cells_outstanding(net); - } else { - afs_queue_cell_manager(net); - } + smp_store_release(&cell->state, state); /* Commit cell changes before state */ + smp_wmb(); /* Set cell state before task state */ + wake_up_var(&cell->state); } /* @@ -116,7 +109,7 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net, const char *name, unsigned int namelen, const char *addresses) { - struct afs_vlserver_list *vllist; + struct afs_vlserver_list *vllist = NULL; struct afs_cell *cell; int i, ret; @@ -163,7 +156,9 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net, cell->net = net; refcount_set(&cell->ref, 1); atomic_set(&cell->active, 0); + INIT_WORK(&cell->destroyer, afs_destroy_cell_work); INIT_WORK(&cell->manager, afs_manage_cell_work); + timer_setup(&cell->management_timer, afs_cell_timer, 0); init_rwsem(&cell->vs_lock); cell->volumes = RB_ROOT; INIT_HLIST_HEAD(&cell->proc_volumes); @@ -220,6 +215,7 @@ static struct afs_cell *afs_alloc_cell(struct afs_net *net, if (ret == -EINVAL) printk(KERN_ERR "kAFS: bad VL server IP address\n"); error: + afs_put_vlserverlist(cell->net, vllist); kfree(cell->name - 1); kfree(cell); _leave(" = %d", ret); @@ -296,26 +292,28 @@ struct afs_cell *afs_lookup_cell(struct afs_net *net, cell = candidate; candidate = NULL; - atomic_set(&cell->active, 2); - trace_afs_cell(cell->debug_id, refcount_read(&cell->ref), 2, afs_cell_trace_insert); + afs_use_cell(cell, trace); rb_link_node_rcu(&cell->net_node, parent, pp); rb_insert_color(&cell->net_node, &net->cells); up_write(&net->cells_lock); - afs_queue_cell(cell, afs_cell_trace_get_queue_new); + afs_queue_cell(cell, afs_cell_trace_queue_new); wait_for_cell: - trace_afs_cell(cell->debug_id, 
refcount_read(&cell->ref), atomic_read(&cell->active), - afs_cell_trace_wait); _debug("wait_for_cell"); - wait_var_event(&cell->state, - ({ - state = smp_load_acquire(&cell->state); /* vs error */ - state == AFS_CELL_ACTIVE || state == AFS_CELL_REMOVED; - })); + state = smp_load_acquire(&cell->state); /* vs error */ + if (state != AFS_CELL_ACTIVE && + state != AFS_CELL_DEAD) { + afs_see_cell(cell, afs_cell_trace_wait); + wait_var_event(&cell->state, + ({ + state = smp_load_acquire(&cell->state); /* vs error */ + state == AFS_CELL_ACTIVE || state == AFS_CELL_DEAD; + })); + } /* Check the state obtained from the wait check. */ - if (state == AFS_CELL_REMOVED) { + if (state == AFS_CELL_DEAD) { ret = cell->error; goto error; } @@ -397,7 +395,6 @@ int afs_cell_init(struct afs_net *net, const char *rootcell) /* install the new cell */ down_write(&net->cells_lock); - afs_see_cell(new_root, afs_cell_trace_see_ws); old_root = rcu_replace_pointer(net->ws_cell, new_root, lockdep_is_held(&net->cells_lock)); up_write(&net->cells_lock); @@ -530,30 +527,14 @@ static void afs_cell_destroy(struct rcu_head *rcu) _leave(" [destroyed]"); } -/* - * Queue the cell manager. - */ -static void afs_queue_cell_manager(struct afs_net *net) -{ - int outstanding = atomic_inc_return(&net->cells_outstanding); - - _enter("%d", outstanding); - - if (!queue_work(afs_wq, &net->cells_manager)) - afs_dec_cells_outstanding(net); -} - -/* - * Cell management timer. We have an increment on cells_outstanding that we - * need to pass along to the work item. 
- */ -void afs_cells_timer(struct timer_list *timer) +static void afs_destroy_cell_work(struct work_struct *work) { - struct afs_net *net = container_of(timer, struct afs_net, cells_timer); + struct afs_cell *cell = container_of(work, struct afs_cell, destroyer); - _enter(""); - if (!queue_work(afs_wq, &net->cells_manager)) - afs_dec_cells_outstanding(net); + afs_see_cell(cell, afs_cell_trace_destroy); + timer_delete_sync(&cell->management_timer); + cancel_work_sync(&cell->manager); + call_rcu(&cell->rcu, afs_cell_destroy); } /* @@ -585,7 +566,7 @@ void afs_put_cell(struct afs_cell *cell, enum afs_cell_trace reason) if (zero) { a = atomic_read(&cell->active); WARN(a != 0, "Cell active count %u > 0\n", a); - call_rcu(&cell->rcu, afs_cell_destroy); + WARN_ON(!queue_work(afs_wq, &cell->destroyer)); } } } @@ -597,10 +578,9 @@ struct afs_cell *afs_use_cell(struct afs_cell *cell, enum afs_cell_trace reason) { int r, a; - r = refcount_read(&cell->ref); - WARN_ON(r == 0); + __refcount_inc(&cell->ref, &r); a = atomic_inc_return(&cell->active); - trace_afs_cell(cell->debug_id, r, a, reason); + trace_afs_cell(cell->debug_id, r + 1, a, reason); return cell; } @@ -612,6 +592,7 @@ void afs_unuse_cell(struct afs_cell *cell, enum afs_cell_trace reason) { unsigned int debug_id; time64_t now, expire_delay; + bool zero; int r, a; if (!cell) @@ -626,13 +607,15 @@ void afs_unuse_cell(struct afs_cell *cell, enum afs_cell_trace reason) expire_delay = afs_cell_gc_delay; debug_id = cell->debug_id; - r = refcount_read(&cell->ref); a = atomic_dec_return(&cell->active); - trace_afs_cell(debug_id, r, a, reason); - WARN_ON(a == 0); - if (a == 1) + if (!a) /* 'cell' may now be garbage collected. 
*/ - afs_set_cell_timer(cell->net, expire_delay); + afs_set_cell_timer(cell, expire_delay); + + zero = __refcount_dec_and_test(&cell->ref, &r); + trace_afs_cell(debug_id, r - 1, a, reason); + if (zero) + WARN_ON(!queue_work(afs_wq, &cell->destroyer)); } /* @@ -652,9 +635,27 @@ void afs_see_cell(struct afs_cell *cell, enum afs_cell_trace reason) */ void afs_queue_cell(struct afs_cell *cell, enum afs_cell_trace reason) { - afs_get_cell(cell, reason); - if (!queue_work(afs_wq, &cell->manager)) - afs_put_cell(cell, afs_cell_trace_put_queue_fail); + queue_work(afs_wq, &cell->manager); +} + +/* + * Cell-specific management timer. + */ +static void afs_cell_timer(struct timer_list *timer) +{ + struct afs_cell *cell = container_of(timer, struct afs_cell, management_timer); + + afs_see_cell(cell, afs_cell_trace_see_mgmt_timer); + if (refcount_read(&cell->ref) > 0 && cell->net->live) + queue_work(afs_wq, &cell->manager); +} + +/* + * Set/reduce the cell timer. + */ +void afs_set_cell_timer(struct afs_cell *cell, unsigned int delay_secs) +{ + timer_reduce(&cell->management_timer, jiffies + delay_secs * HZ); } /* @@ -737,212 +738,125 @@ static void afs_deactivate_cell(struct afs_net *net, struct afs_cell *cell) _leave(""); } +static bool afs_has_cell_expired(struct afs_cell *cell, time64_t *_next_manage) +{ + const struct afs_vlserver_list *vllist; + time64_t expire_at = cell->last_inactive; + time64_t now = ktime_get_real_seconds(); + + if (atomic_read(&cell->active)) + return false; + if (!cell->net->live) + return true; + + vllist = rcu_dereference_protected(cell->vl_servers, true); + if (vllist && vllist->nr_servers > 0) + expire_at += afs_cell_gc_delay; + + if (expire_at <= now) + return true; + if (expire_at < *_next_manage) + *_next_manage = expire_at; + return false; +} + /* * Manage a cell record, initialising and destroying it, maintaining its DNS * records. 
*/ -static void afs_manage_cell(struct afs_cell *cell) +static bool afs_manage_cell(struct afs_cell *cell) { struct afs_net *net = cell->net; - int ret, active; + time64_t next_manage = TIME64_MAX; + int ret; _enter("%s", cell->name); -again: _debug("state %u", cell->state); switch (cell->state) { - case AFS_CELL_INACTIVE: - case AFS_CELL_FAILED: - down_write(&net->cells_lock); - active = 1; - if (atomic_try_cmpxchg_relaxed(&cell->active, &active, 0)) { - rb_erase(&cell->net_node, &net->cells); - trace_afs_cell(cell->debug_id, refcount_read(&cell->ref), 0, - afs_cell_trace_unuse_delete); - smp_store_release(&cell->state, AFS_CELL_REMOVED); - } - up_write(&net->cells_lock); - if (cell->state == AFS_CELL_REMOVED) { - wake_up_var(&cell->state); - goto final_destruction; - } - if (cell->state == AFS_CELL_FAILED) - goto done; - smp_store_release(&cell->state, AFS_CELL_UNSET); - wake_up_var(&cell->state); - goto again; - - case AFS_CELL_UNSET: - smp_store_release(&cell->state, AFS_CELL_ACTIVATING); - wake_up_var(&cell->state); - goto again; - - case AFS_CELL_ACTIVATING: - ret = afs_activate_cell(net, cell); - if (ret < 0) - goto activation_failed; + case AFS_CELL_SETTING_UP: + goto set_up_cell; + case AFS_CELL_ACTIVE: + goto cell_is_active; + case AFS_CELL_REMOVING: + WARN_ON_ONCE(1); + return false; + case AFS_CELL_DEAD: + return false; + default: + _debug("bad state %u", cell->state); + WARN_ON_ONCE(1); /* Unhandled state */ + return false; + } - smp_store_release(&cell->state, AFS_CELL_ACTIVE); - wake_up_var(&cell->state); - goto again; +set_up_cell: + ret = afs_activate_cell(net, cell); + if (ret < 0) { + cell->error = ret; + goto remove_cell; + } - case AFS_CELL_ACTIVE: - if (atomic_read(&cell->active) > 1) { - if (test_and_clear_bit(AFS_CELL_FL_DO_LOOKUP, &cell->flags)) { - ret = afs_update_cell(cell); - if (ret < 0) - cell->error = ret; - } - goto done; - } - smp_store_release(&cell->state, AFS_CELL_DEACTIVATING); - wake_up_var(&cell->state); - goto again; + 
afs_set_cell_state(cell, AFS_CELL_ACTIVE); - case AFS_CELL_DEACTIVATING: - if (atomic_read(&cell->active) > 1) - goto reverse_deactivation; - afs_deactivate_cell(net, cell); - smp_store_release(&cell->state, AFS_CELL_INACTIVE); - wake_up_var(&cell->state); - goto again; +cell_is_active: + if (afs_has_cell_expired(cell, &next_manage)) + goto remove_cell; - case AFS_CELL_REMOVED: - goto done; + if (test_and_clear_bit(AFS_CELL_FL_DO_LOOKUP, &cell->flags)) { + ret = afs_update_cell(cell); + if (ret < 0) + cell->error = ret; + } - default: - break; + if (next_manage < TIME64_MAX && cell->net->live) { + time64_t now = ktime_get_real_seconds(); + + if (next_manage - now <= 0) + afs_queue_cell(cell, afs_cell_trace_queue_again); + else + afs_set_cell_timer(cell, next_manage - now); } - _debug("bad state %u", cell->state); - BUG(); /* Unhandled state */ + _leave(" [done %u]", cell->state); + return false; -activation_failed: - cell->error = ret; - afs_deactivate_cell(net, cell); +remove_cell: + down_write(&net->cells_lock); - smp_store_release(&cell->state, AFS_CELL_FAILED); /* vs error */ - wake_up_var(&cell->state); - goto again; + if (atomic_read(&cell->active)) { + up_write(&net->cells_lock); + goto cell_is_active; + } -reverse_deactivation: - smp_store_release(&cell->state, AFS_CELL_ACTIVE); - wake_up_var(&cell->state); - _leave(" [deact->act]"); - return; + /* Make sure that the expiring server records are going to see the fact + * that the cell is caput. 
+ */ + afs_set_cell_state(cell, AFS_CELL_REMOVING); -done: - _leave(" [done %u]", cell->state); - return; + afs_deactivate_cell(net, cell); + afs_purge_servers(cell); + + rb_erase(&cell->net_node, &net->cells); + afs_see_cell(cell, afs_cell_trace_unuse_delete); + up_write(&net->cells_lock); -final_destruction: /* The root volume is pinning the cell */ afs_put_volume(cell->root_volume, afs_volume_trace_put_cell_root); cell->root_volume = NULL; - afs_purge_servers(cell); - afs_put_cell(cell, afs_cell_trace_put_destroy); + + afs_set_cell_state(cell, AFS_CELL_DEAD); + return true; } static void afs_manage_cell_work(struct work_struct *work) { struct afs_cell *cell = container_of(work, struct afs_cell, manager); + bool final_put; - afs_manage_cell(cell); - afs_put_cell(cell, afs_cell_trace_put_queue_work); -} - -/* - * Manage the records of cells known to a network namespace. This includes - * updating the DNS records and garbage collecting unused cells that were - * automatically added. - * - * Note that constructed cell records may only be removed from net->cells by - * this work item, so it is safe for this work item to stash a cursor pointing - * into the tree and then return to caller (provided it skips cells that are - * still under construction). - * - * Note also that we were given an increment on net->cells_outstanding by - * whoever queued us that we need to deal with before returning. - */ -void afs_manage_cells(struct work_struct *work) -{ - struct afs_net *net = container_of(work, struct afs_net, cells_manager); - struct rb_node *cursor; - time64_t now = ktime_get_real_seconds(), next_manage = TIME64_MAX; - bool purging = !net->live; - - _enter(""); - - /* Trawl the cell database looking for cells that have expired from - * lack of use and cells whose DNS results have expired and dispatch - * their managers. 
- */ - down_read(&net->cells_lock); - - for (cursor = rb_first(&net->cells); cursor; cursor = rb_next(cursor)) { - struct afs_cell *cell = - rb_entry(cursor, struct afs_cell, net_node); - unsigned active; - bool sched_cell = false; - - active = atomic_read(&cell->active); - trace_afs_cell(cell->debug_id, refcount_read(&cell->ref), - active, afs_cell_trace_manage); - - ASSERTCMP(active, >=, 1); - - if (purging) { - if (test_and_clear_bit(AFS_CELL_FL_NO_GC, &cell->flags)) { - active = atomic_dec_return(&cell->active); - trace_afs_cell(cell->debug_id, refcount_read(&cell->ref), - active, afs_cell_trace_unuse_pin); - } - } - - if (active == 1) { - struct afs_vlserver_list *vllist; - time64_t expire_at = cell->last_inactive; - - read_lock(&cell->vl_servers_lock); - vllist = rcu_dereference_protected( - cell->vl_servers, - lockdep_is_held(&cell->vl_servers_lock)); - if (vllist->nr_servers > 0) - expire_at += afs_cell_gc_delay; - read_unlock(&cell->vl_servers_lock); - if (purging || expire_at <= now) - sched_cell = true; - else if (expire_at < next_manage) - next_manage = expire_at; - } - - if (!purging) { - if (test_bit(AFS_CELL_FL_DO_LOOKUP, &cell->flags)) - sched_cell = true; - } - - if (sched_cell) - afs_queue_cell(cell, afs_cell_trace_get_queue_manage); - } - - up_read(&net->cells_lock); - - /* Update the timer on the way out. We have to pass an increment on - * cells_outstanding in the namespace that we are in to the timer or - * the work scheduler. 
- */ - if (!purging && next_manage < TIME64_MAX) { - now = ktime_get_real_seconds(); - - if (next_manage - now <= 0) { - if (queue_work(afs_wq, &net->cells_manager)) - atomic_inc(&net->cells_outstanding); - } else { - afs_set_cell_timer(net, next_manage - now); - } - } - - afs_dec_cells_outstanding(net); - _leave(" [%d]", atomic_read(&net->cells_outstanding)); + afs_see_cell(cell, afs_cell_trace_manage); + final_put = afs_manage_cell(cell); + afs_see_cell(cell, afs_cell_trace_managed); + if (final_put) + afs_put_cell(cell, afs_cell_trace_put_final); } /* @@ -951,6 +865,7 @@ void afs_manage_cells(struct work_struct *work) void afs_cell_purge(struct afs_net *net) { struct afs_cell *ws; + struct rb_node *cursor; _enter(""); @@ -960,12 +875,19 @@ void afs_cell_purge(struct afs_net *net) up_write(&net->cells_lock); afs_unuse_cell(ws, afs_cell_trace_unuse_ws); - _debug("del timer"); - if (del_timer_sync(&net->cells_timer)) - atomic_dec(&net->cells_outstanding); + _debug("kick cells"); + down_read(&net->cells_lock); + for (cursor = rb_first(&net->cells); cursor; cursor = rb_next(cursor)) { + struct afs_cell *cell = rb_entry(cursor, struct afs_cell, net_node); + + afs_see_cell(cell, afs_cell_trace_purge); - _debug("kick mgr"); - afs_queue_cell_manager(net); + if (test_and_clear_bit(AFS_CELL_FL_NO_GC, &cell->flags)) + afs_unuse_cell(cell, afs_cell_trace_unuse_pin); + + afs_queue_cell(cell, afs_cell_trace_queue_purge); + } + up_read(&net->cells_lock); _debug("wait"); wait_var_event(&net->cells_outstanding, diff --git a/fs/afs/dynroot.c b/fs/afs/dynroot.c index 011c63350df1..9732a1e17db3 100644 --- a/fs/afs/dynroot.c +++ b/fs/afs/dynroot.c @@ -293,8 +293,8 @@ static int afs_dynroot_readdir_cells(struct afs_net *net, struct dir_context *ct cell = idr_get_next(&net->cells_dyn_ino, &ix); if (!cell) return 0; - if (READ_ONCE(cell->state) == AFS_CELL_FAILED || - READ_ONCE(cell->state) == AFS_CELL_REMOVED) { + if (READ_ONCE(cell->state) == AFS_CELL_REMOVING || + 
READ_ONCE(cell->state) == AFS_CELL_DEAD) { ctx->pos += 2; ctx->pos &= ~1; continue; diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 1e0ab5e7fc88..440b0e731093 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -289,8 +289,6 @@ struct afs_net { struct rb_root cells; struct idr cells_dyn_ino; /* cell->dynroot_ino mapping */ struct afs_cell __rcu *ws_cell; - struct work_struct cells_manager; - struct timer_list cells_timer; atomic_t cells_outstanding; struct rw_semaphore cells_lock; struct mutex cells_alias_lock; @@ -339,13 +337,10 @@ struct afs_net { extern const char afs_init_sysname[]; enum afs_cell_state { - AFS_CELL_UNSET, - AFS_CELL_ACTIVATING, + AFS_CELL_SETTING_UP, AFS_CELL_ACTIVE, - AFS_CELL_DEACTIVATING, - AFS_CELL_INACTIVE, - AFS_CELL_FAILED, - AFS_CELL_REMOVED, + AFS_CELL_REMOVING, + AFS_CELL_DEAD, }; /* @@ -376,7 +371,9 @@ struct afs_cell { struct afs_cell *alias_of; /* The cell this is an alias of */ struct afs_volume *root_volume; /* The root.cell volume if there is one */ struct key *anonymous_key; /* anonymous user key for this cell */ + struct work_struct destroyer; /* Destroyer for cell */ struct work_struct manager; /* Manager for init/deinit/dns */ + struct timer_list management_timer; /* General management timer */ struct hlist_node proc_link; /* /proc cell list link */ time64_t dns_expiry; /* Time AFSDB/SRV record expires */ time64_t last_inactive; /* Time of last drop of usage count */ @@ -1053,8 +1050,7 @@ extern struct afs_cell *afs_get_cell(struct afs_cell *, enum afs_cell_trace); extern void afs_see_cell(struct afs_cell *, enum afs_cell_trace); extern void afs_put_cell(struct afs_cell *, enum afs_cell_trace); extern void afs_queue_cell(struct afs_cell *, enum afs_cell_trace); -extern void afs_manage_cells(struct work_struct *); -extern void afs_cells_timer(struct timer_list *); +void afs_set_cell_timer(struct afs_cell *cell, unsigned int delay_secs); extern void __net_exit afs_cell_purge(struct afs_net *); /* diff --git 
a/fs/afs/main.c b/fs/afs/main.c index bff0363286b0..c845c5daaeba 100644 --- a/fs/afs/main.c +++ b/fs/afs/main.c @@ -78,9 +78,6 @@ static int __net_init afs_net_init(struct net *net_ns) net->cells = RB_ROOT; idr_init(&net->cells_dyn_ino); init_rwsem(&net->cells_lock); - INIT_WORK(&net->cells_manager, afs_manage_cells); - timer_setup(&net->cells_timer, afs_cells_timer, 0); - mutex_init(&net->cells_alias_lock); mutex_init(&net->proc_cells_lock); INIT_HLIST_HEAD(&net->proc_cells); diff --git a/fs/afs/server.c b/fs/afs/server.c index 487e2134aea4..c530d1ca15df 100644 --- a/fs/afs/server.c +++ b/fs/afs/server.c @@ -103,7 +103,7 @@ static struct afs_server *afs_install_server(struct afs_cell *cell, afs_get_cell(cell, afs_cell_trace_get_server); exists: - afs_use_server(server, true, afs_server_trace_get_install); + afs_use_server(server, true, afs_server_trace_use_install); return server; } @@ -356,7 +356,7 @@ void afs_unuse_server_notime(struct afs_net *net, struct afs_server *server, if (atomic_dec_and_test(&server->active)) { if (test_bit(AFS_SERVER_FL_EXPIRED, &server->flags) || - READ_ONCE(server->cell->state) >= AFS_CELL_FAILED) + READ_ONCE(server->cell->state) >= AFS_CELL_REMOVING) schedule_work(&server->destroyer); } @@ -374,7 +374,7 @@ void afs_unuse_server(struct afs_net *net, struct afs_server *server, if (atomic_dec_and_test(&server->active)) { if (!test_bit(AFS_SERVER_FL_EXPIRED, &server->flags) && - READ_ONCE(server->cell->state) < AFS_CELL_FAILED) { + READ_ONCE(server->cell->state) < AFS_CELL_REMOVING) { time64_t unuse_time = ktime_get_real_seconds(); server->unuse_time = unuse_time; @@ -424,7 +424,7 @@ static bool afs_has_server_expired(const struct afs_server *server) return false; if (server->cell->net->live || - server->cell->state >= AFS_CELL_FAILED) { + server->cell->state >= AFS_CELL_REMOVING) { trace_afs_server(server->debug_id, refcount_read(&server->ref), 0, afs_server_trace_purging); return true; diff --git a/fs/afs/vl_rotate.c 
b/fs/afs/vl_rotate.c index d8f79f6ada3d..6ad9688d8f4b 100644 --- a/fs/afs/vl_rotate.c +++ b/fs/afs/vl_rotate.c @@ -48,7 +48,7 @@ static bool afs_start_vl_iteration(struct afs_vl_cursor *vc) cell->dns_expiry <= ktime_get_real_seconds()) { dns_lookup_count = smp_load_acquire(&cell->dns_lookup_count); set_bit(AFS_CELL_FL_DO_LOOKUP, &cell->flags); - afs_queue_cell(cell, afs_cell_trace_get_queue_dns); + afs_queue_cell(cell, afs_cell_trace_queue_dns); if (cell->dns_source == DNS_RECORD_UNAVAILABLE) { if (wait_var_event_interruptible( diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index 02f8b2a6977c..8857f5ea77d4 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -131,7 +131,6 @@ enum yfs_cm_operation { EM(afs_server_trace_destroy, "DESTROY ") \ EM(afs_server_trace_free, "FREE ") \ EM(afs_server_trace_gc, "GC ") \ - EM(afs_server_trace_get_install, "GET inst ") \ EM(afs_server_trace_get_probe, "GET probe") \ EM(afs_server_trace_purging, "PURGE ") \ EM(afs_server_trace_put_cbi, "PUT cbi ") \ @@ -149,6 +148,7 @@ enum yfs_cm_operation { EM(afs_server_trace_use_cm_call, "USE cm-cl") \ EM(afs_server_trace_use_get_caps, "USE gcaps") \ EM(afs_server_trace_use_give_up_cb, "USE gvupc") \ + EM(afs_server_trace_use_install, "USE inst ") \ E_(afs_server_trace_wait_create, "WAIT crt ") #define afs_volume_traces \ @@ -171,37 +171,36 @@ enum yfs_cm_operation { #define afs_cell_traces \ EM(afs_cell_trace_alloc, "ALLOC ") \ + EM(afs_cell_trace_destroy, "DESTROY ") \ EM(afs_cell_trace_free, "FREE ") \ EM(afs_cell_trace_get_atcell, "GET atcell") \ - EM(afs_cell_trace_get_queue_dns, "GET q-dns ") \ - EM(afs_cell_trace_get_queue_manage, "GET q-mng ") \ - EM(afs_cell_trace_get_queue_new, "GET q-new ") \ EM(afs_cell_trace_get_server, "GET server") \ EM(afs_cell_trace_get_vol, "GET vol ") \ - EM(afs_cell_trace_insert, "INSERT ") \ - EM(afs_cell_trace_manage, "MANAGE ") \ + EM(afs_cell_trace_purge, "PURGE ") \ EM(afs_cell_trace_put_atcell, "PUT 
atcell") \ EM(afs_cell_trace_put_candidate, "PUT candid") \ - EM(afs_cell_trace_put_destroy, "PUT destry") \ - EM(afs_cell_trace_put_queue_work, "PUT q-work") \ - EM(afs_cell_trace_put_queue_fail, "PUT q-fail") \ + EM(afs_cell_trace_put_final, "PUT final ") \ EM(afs_cell_trace_put_server, "PUT server") \ EM(afs_cell_trace_put_vol, "PUT vol ") \ + EM(afs_cell_trace_queue_again, "QUE again ") \ + EM(afs_cell_trace_queue_dns, "QUE dns ") \ + EM(afs_cell_trace_queue_new, "QUE new ") \ + EM(afs_cell_trace_queue_purge, "QUE purge ") \ + EM(afs_cell_trace_manage, "MANAGE ") \ + EM(afs_cell_trace_managed, "MANAGED ") \ EM(afs_cell_trace_see_source, "SEE source") \ - EM(afs_cell_trace_see_ws, "SEE ws ") \ + EM(afs_cell_trace_see_mgmt_timer, "SEE mtimer") \ EM(afs_cell_trace_unuse_alias, "UNU alias ") \ EM(afs_cell_trace_unuse_check_alias, "UNU chk-al") \ EM(afs_cell_trace_unuse_delete, "UNU delete") \ EM(afs_cell_trace_unuse_dynroot_mntpt, "UNU dyn-mp") \ EM(afs_cell_trace_unuse_fc, "UNU fc ") \ - EM(afs_cell_trace_unuse_lookup, "UNU lookup") \ EM(afs_cell_trace_unuse_lookup_dynroot, "UNU lu-dyn") \ EM(afs_cell_trace_unuse_lookup_error, "UNU lu-err") \ EM(afs_cell_trace_unuse_mntpt, "UNU mntpt ") \ EM(afs_cell_trace_unuse_no_pin, "UNU no-pin") \ EM(afs_cell_trace_unuse_parse, "UNU parse ") \ EM(afs_cell_trace_unuse_pin, "UNU pin ") \ - EM(afs_cell_trace_unuse_probe, "UNU probe ") \ EM(afs_cell_trace_unuse_sbi, "UNU sbi ") \ EM(afs_cell_trace_unuse_ws, "UNU ws ") \ EM(afs_cell_trace_use_alias, "USE alias ") \