Message ID | 20240830023531.29421-1-neilb@suse.de (mailing list archive) |
---|---|
Headers | show |
Series | Address netns refcount issues for localio | expand |
On Fri, Aug 30, 2024 at 12:20:13PM +1000, NeilBrown wrote: > Following are revised versions of 6 patches from the v14 localio series. > > The issue addressed is net namespace refcounting. > > We don't want to keep a long-term counted reference in the client > because that prevents a server container from completely shutting down. > > So we avoid taking a reference at all and rely on the per-cpu reference > to the server being sufficient to keep the net-ns active. This involves > allowing the net-ns exit code to iterate all active clients and clear > their ->net pointers (which they need to find the per-cpu-refcount for > the nfs_serv). > > So: > - embed nfs_uuid_t in nfs_client. This provides a list_head that can > be used to find the client. It does add the actual uuid to nfs_client > so it is bigger than needed. If that is really a problem we can find > a fix. > > - When the nfs server confirms that the uuid is local, it moves the > nfs_uuid_t onto a per-net-ns list. > > - When the net-ns is shutting down - in a "pre_exit" handler, all these > nfS_uuid_t have their ->net cleared. There is an rcu_synchronize() > call between pre_exit() handlers and exit() handlers so and caller > that sees ->net as not NULL can safely check the ->counter > > - We now pass the nfs_uuid_t to nfsd_open_local_fh() so it can safely > look at ->net in a private rcu_read_lock() section. > > I have compile tested this code but nothing more. Wow, nicely done. I will go over it in much more detail and test. For the benefit of others, here is a single incremental diff relative to v14: diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 6a4b605cc943..4903f5ff5758 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -181,8 +181,7 @@ struct nfs_client *nfs_alloc_client(const struct nfs_client_initdata *cl_init) #if IS_ENABLED(CONFIG_NFS_LOCALIO) seqlock_init(&clp->cl_boot_lock); ktime_get_real_ts64(&clp->cl_nfssvc_boot); - clp->cl_nfssvc_net = NULL; - clp->cl_nfssvc_dom = NULL; + clp->cl_uuid.net = NULL; spin_lock_init(&clp->cl_localio_lock); #endif /* CONFIG_NFS_LOCALIO */ diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c index 40521da422f7..55622084d5c2 100644 --- a/fs/nfs/localio.c +++ b/fs/nfs/localio.c @@ -120,15 +120,13 @@ const struct rpc_program nfslocalio_program = { /* * nfs_local_enable - enable local i/o for an nfs_client */ -static void nfs_local_enable(struct nfs_client *clp, nfs_uuid_t *nfs_uuid) +static void nfs_local_enable(struct nfs_client *clp) { spin_lock(&clp->cl_localio_lock); if (unlikely(!get_nfs_to_nfsd_symbols())) goto out; set_bit(NFS_CS_LOCAL_IO, &clp->cl_flags); - rcu_assign_pointer(clp->cl_nfssvc_net, nfs_uuid->net); - rcu_assign_pointer(clp->cl_nfssvc_dom, nfs_uuid->dom); trace_nfs_local_enable(clp); out: spin_unlock(&clp->cl_localio_lock); @@ -139,25 +137,12 @@ static void nfs_local_enable(struct nfs_client *clp, nfs_uuid_t *nfs_uuid) */ void nfs_local_disable(struct nfs_client *clp) { - struct net *cl_nfssvc_net; - struct auth_domain *cl_nfssvc_dom; - spin_lock(&clp->cl_localio_lock); if (test_and_clear_bit(NFS_CS_LOCAL_IO, &clp->cl_flags)) { trace_nfs_local_disable(clp); put_nfs_to_nfsd_symbols(); - cl_nfssvc_net = rcu_dereference(clp->cl_nfssvc_net); - if (cl_nfssvc_net) { - put_net(cl_nfssvc_net); - RCU_INIT_POINTER(clp->cl_nfssvc_net, NULL); - } - - cl_nfssvc_dom = rcu_dereference(clp->cl_nfssvc_dom); - if (cl_nfssvc_dom) { - auth_domain_put(cl_nfssvc_dom); - RCU_INIT_POINTER(clp->cl_nfssvc_dom, NULL); - } + nfs_uuid_invalidate_one_client(&clp->cl_uuid); } spin_unlock(&clp->cl_localio_lock); } @@ -179,8 +164,7 @@ static struct rpc_clnt *nfs_init_localioclient(struct nfs_client *clp) return rpcclient_localio; } -static bool nfs_server_uuid_is_local(struct nfs_client *clp, - nfs_uuid_t *nfs_uuid) +static bool nfs_server_uuid_is_local(struct nfs_client *clp) { u8 uuid[UUID_SIZE]; struct rpc_message msg = { @@ -193,7 +177,7 @@ static bool nfs_server_uuid_is_local(struct nfs_client *clp, if (IS_ERR(rpcclient_localio)) return false; - export_uuid(uuid, &nfs_uuid->uuid); + export_uuid(uuid, &clp->cl_uuid.uuid); msg.rpc_proc = &nfs_localio_procedures[LOCALIOPROC_UUID_IS_LOCAL]; status = rpc_call_sync(rpcclient_localio, &msg, 0); @@ -202,7 +186,7 @@ static bool nfs_server_uuid_is_local(struct nfs_client *clp, rpc_shutdown_client(rpcclient_localio); /* Server is only local if it initialized required struct members */ - if (status || !nfs_uuid->net || !nfs_uuid->dom) + if (status || !clp->cl_uuid.net || !clp->cl_uuid.dom) return false; return true; @@ -215,8 +199,6 @@ static bool nfs_server_uuid_is_local(struct nfs_client *clp, */ void nfs_local_probe(struct nfs_client *clp) { - nfs_uuid_t nfs_uuid; - /* Disallow localio if disabled via sysfs or AUTH_SYS isn't used */ if (!localio_enabled || clp->cl_rpcclient->cl_auth->au_flavor != RPC_AUTH_UNIX) { @@ -229,10 +211,10 @@ void nfs_local_probe(struct nfs_client *clp) nfs_local_disable(clp); } - nfs_uuid_begin(&nfs_uuid); - if (nfs_server_uuid_is_local(clp, &nfs_uuid)) - nfs_local_enable(clp, &nfs_uuid); - nfs_uuid_end(&nfs_uuid); + nfs_uuid_begin(&clp->cl_uuid); + if (nfs_server_uuid_is_local(clp)) + nfs_local_enable(clp); + nfs_uuid_end(&clp->cl_uuid); } EXPORT_SYMBOL_GPL(nfs_local_probe); @@ -245,8 +227,6 @@ struct nfs_localio_ctx * nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred, struct nfs_fh *fh, const fmode_t mode) { - struct net *cl_nfssvc_net; - struct auth_domain *cl_nfssvc_dom; struct nfs_localio_ctx *localio; int status; @@ -255,15 +235,8 @@ nfs_local_open_fh(struct nfs_client *clp, const struct cred *cred, if (mode & ~(FMODE_READ | FMODE_WRITE)) return NULL; - rcu_read_lock(); - cl_nfssvc_net = rcu_dereference(clp->cl_nfssvc_net); - cl_nfssvc_dom = rcu_dereference(clp->cl_nfssvc_dom); - if (unlikely(!cl_nfssvc_net || !cl_nfssvc_dom)) - localio = ERR_PTR(-ENXIO); - else - localio = nfs_to.nfsd_open_local_fh(cl_nfssvc_net, cl_nfssvc_dom, - clp->cl_rpcclient, cred, fh, mode); - rcu_read_unlock(); + localio = nfs_to.nfsd_open_local_fh(&clp->cl_uuid, + clp->cl_rpcclient, cred, fh, mode); if (IS_ERR(localio)) { status = PTR_ERR(localio); trace_nfs_local_open_fh(fh, mode, status); diff --git a/fs/nfs_common/nfslocalio.c b/fs/nfs_common/nfslocalio.c index cc30fdb0cb46..8545ee75f756 100644 --- a/fs/nfs_common/nfslocalio.c +++ b/fs/nfs_common/nfslocalio.c @@ -11,11 +11,11 @@ MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("NFS localio protocol bypass support"); -DEFINE_MUTEX(nfs_uuid_mutex); +static DEFINE_SPINLOCK(nfs_uuid_lock); /* * Global list of nfs_uuid_t instances, add/remove - * is protected by nfs_uuid_mutex. + * is protected by nfs_uuid_lock. * Reads are protected by RCU read lock (see below). */ LIST_HEAD(nfs_uuids); @@ -26,17 +26,19 @@ void nfs_uuid_begin(nfs_uuid_t *nfs_uuid) nfs_uuid->dom = NULL; uuid_gen(&nfs_uuid->uuid); - mutex_lock(&nfs_uuid_mutex); + spin_lock(&nfs_uuid_lock); list_add_tail_rcu(&nfs_uuid->list, &nfs_uuids); - mutex_unlock(&nfs_uuid_mutex); + spin_unlock(&nfs_uuid_lock); } EXPORT_SYMBOL_GPL(nfs_uuid_begin); void nfs_uuid_end(nfs_uuid_t *nfs_uuid) { - mutex_lock(&nfs_uuid_mutex); - list_del_rcu(&nfs_uuid->list); - mutex_unlock(&nfs_uuid_mutex); + if (nfs_uuid->net == NULL) { + spin_lock(&nfs_uuid_lock); + list_del_init(&nfs_uuid->list); + spin_unlock(&nfs_uuid_lock); + } } EXPORT_SYMBOL_GPL(nfs_uuid_end); @@ -45,34 +47,66 @@ static nfs_uuid_t * nfs_uuid_lookup(const uuid_t *uuid) { nfs_uuid_t *nfs_uuid; - list_for_each_entry_rcu(nfs_uuid, &nfs_uuids, list) + list_for_each_entry(nfs_uuid, &nfs_uuids, list) if (uuid_equal(&nfs_uuid->uuid, uuid)) return nfs_uuid; return NULL; } -bool nfs_uuid_is_local(const uuid_t *uuid, struct net *net, struct auth_domain *dom) +void nfs_uuid_is_local(const uuid_t *uuid, struct list_head *list, + struct net *net, struct auth_domain *dom) { - bool is_local = false; nfs_uuid_t *nfs_uuid; - rcu_read_lock(); + spin_lock(&nfs_uuid_lock); nfs_uuid = nfs_uuid_lookup(uuid); if (nfs_uuid) { - nfs_uuid->net = maybe_get_net(net); - if (nfs_uuid->net) { - is_local = true; - kref_get(&dom->ref); - nfs_uuid->dom = dom; - } + kref_get(&dom->ref); + nfs_uuid->dom = dom; + /* We don't hold a ref on the net, but instead put + * ourselves on a list so the net pointer can be + * invalidated. + */ + list_move(&nfs_uuid->list, list); + nfs_uuid->net = net; } - rcu_read_unlock(); - - return is_local; + spin_unlock(&nfs_uuid_lock); } EXPORT_SYMBOL_GPL(nfs_uuid_is_local); +static void nfs_uuid_put_locked(nfs_uuid_t *nfs_uuid) +{ + if (nfs_uuid->net) + put_net(nfs_uuid->net); + nfs_uuid->net = NULL; + if (nfs_uuid->dom) + auth_domain_put(nfs_uuid->dom); + nfs_uuid->dom = NULL; + list_del_init(&nfs_uuid->list); +} + +void nfs_uuid_invalidate_clients(struct list_head *list) +{ + nfs_uuid_t *nfs_uuid, *tmp; + + spin_lock(&nfs_uuid_lock); + list_for_each_entry_safe(nfs_uuid, tmp, list, list) + nfs_uuid_put_locked(nfs_uuid); + spin_unlock(&nfs_uuid_lock); +} +EXPORT_SYMBOL_GPL(nfs_uuid_invalidate_clients); + +void nfs_uuid_invalidate_one_client(nfs_uuid_t *nfs_uuid) +{ + if (nfs_uuid->net) { + spin_lock(&nfs_uuid_lock); + nfs_uuid_put_locked(nfs_uuid); + spin_unlock(&nfs_uuid_lock); + } +} +EXPORT_SYMBOL_GPL(nfs_uuid_invalidate_one_client); + /* * The nfs localio code needs to call into nfsd using various symbols (below), * but cannot be statically linked, because that will make the nfs module @@ -100,7 +134,7 @@ static void put_##NFSD_SYMBOL(void) \ /* The nfs localio code needs to call into nfsd to map filehandle -> struct nfsd_file */ extern struct nfs_localio_ctx * -nfsd_open_local_fh(struct net *, struct auth_domain *, struct rpc_clnt *, +nfsd_open_local_fh(nfs_uuid_t *, struct rpc_clnt *, const struct cred *, const struct nfs_fh *, const fmode_t); DEFINE_NFS_TO_NFSD_SYMBOL(nfsd_open_local_fh); diff --git a/fs/nfsd/localio.c b/fs/nfsd/localio.c index a192bbe308df..491bf5017d34 100644 --- a/fs/nfsd/localio.c +++ b/fs/nfsd/localio.c @@ -42,13 +42,14 @@ * nfsd_serv_put (via nfs_localio_ctx_free). */ struct nfs_localio_ctx * -nfsd_open_local_fh(struct net *cl_nfssvc_net, struct auth_domain *cl_nfssvc_dom, +nfsd_open_local_fh(nfs_uuid_t *uuid, struct rpc_clnt *rpc_clnt, const struct cred *cred, const struct nfs_fh *nfs_fh, const fmode_t fmode) { int mayflags = NFSD_MAY_LOCALIO; int status = 0; - struct nfsd_net *nn; + struct nfsd_net *nn = NULL; + struct net *net; struct svc_cred rq_cred; struct svc_fh fh; struct nfs_localio_ctx *localio; @@ -64,12 +65,20 @@ nfsd_open_local_fh(struct net *cl_nfssvc_net, struct auth_domain *cl_nfssvc_dom, /* * Not running in nfsd context, so must safely get reference on nfsd_serv. * But the server may already be shutting down, if so disallow new localio. + * uuid->net is NOT a counted reference, but rcu_read_lock() ensures that if + * uuid->net is not NULL, then nfsd_serv_try_get() is safe and if that succeeds + * we will have an implied reference to the net. */ - nn = net_generic(cl_nfssvc_net, nfsd_net_id); - if (unlikely(!nfsd_serv_try_get(nn))) { + rcu_read_lock(); + net = READ_ONCE(uuid->net); + if (net) + nn = net_generic(net, nfsd_net_id); + if (unlikely(!nn || !nfsd_serv_try_get(nn))) { + rcu_read_unlock(); status = -ENXIO; goto out_nfsd_serv; } + rcu_read_unlock(); /* nfs_fh -> svc_fh */ fh_init(&fh, NFS4_FHSIZE); @@ -83,7 +92,7 @@ nfsd_open_local_fh(struct net *cl_nfssvc_net, struct auth_domain *cl_nfssvc_dom, svcauth_map_clnt_to_svc_cred_local(rpc_clnt, cred, &rq_cred); - beres = nfsd_file_acquire_local(cl_nfssvc_net, &rq_cred, cl_nfssvc_dom, + beres = nfsd_file_acquire_local(uuid->net, &rq_cred, uuid->dom, &fh, mayflags, &localio->nf); if (beres) { status = nfs_stat_to_errno(be32_to_cpu(beres)); @@ -123,9 +132,11 @@ struct localio_uuidarg { static __be32 localio_proc_uuid_is_local(struct svc_rqst *rqstp) { struct localio_uuidarg *argp = rqstp->rq_argp; + struct net *net = SVC_NET(rqstp); + struct nfsd_net *nn = net_generic(net, nfsd_net_id); - (void) nfs_uuid_is_local(&argp->uuid, SVC_NET(rqstp), - rqstp->rq_client); + nfs_uuid_is_local(&argp->uuid, &nn->local_clients, + net, rqstp->rq_client); return rpc_success; } diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h index e2d953f21dde..9c65db8d3f44 100644 --- a/fs/nfsd/netns.h +++ b/fs/nfsd/netns.h @@ -216,6 +216,8 @@ struct nfsd_net { /* last time an admin-revoke happened for NFSv4.0 */ time64_t nfs40_last_revoke; + /* Local clients to be invalidated when net is shut down */ + struct list_head local_clients; }; /* Simple check to find out if a given net was properly initialized */ diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index 64c1b4d649bc..01e383d692ab 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -18,6 +18,7 @@ #include <linux/sunrpc/svc.h> #include <linux/module.h> #include <linux/fsnotify.h> +#include <linux/nfslocalio.h> #include "idmap.h" #include "nfsd.h" @@ -2257,6 +2258,7 @@ static __net_init int nfsd_net_init(struct net *net) get_random_bytes(&nn->siphash_key, sizeof(nn->siphash_key)); seqlock_init(&nn->writeverf_lock); nfsd_proc_stat_init(net); + INIT_LIST_HEAD(&nn->local_clients); return 0; @@ -2268,6 +2270,19 @@ static __net_init int nfsd_net_init(struct net *net) return retval; } +/** + * nfsd_net_pre_exit - Disconnect localio clients from net namespace + * @net: a network namespace that is about to be destroyed + * + * This invalidated ->net pointers held by localio clients + * while they can still safely access nn->counter. + */ +static __net_exit void nfsd_net_pre_exit(struct net *net) +{ + struct nfsd_net *nn = net_generic(net, nfsd_net_id); + + nfs_uuid_invalidate_clients(&nn->local_clients); +} /** * nfsd_net_exit - Release the nfsd_net portion of a net namespace * @net: a network namespace that is about to be destroyed @@ -2285,6 +2300,7 @@ static __net_exit void nfsd_net_exit(struct net *net) static struct pernet_operations nfsd_net_ops = { .init = nfsd_net_init, + .pre_exit = nfsd_net_pre_exit, .exit = nfsd_net_exit, .id = &nfsd_net_id, .size = sizeof(struct nfsd_net), diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h index e12310dd5f4c..2ecceb8b9d3d 100644 --- a/fs/nfsd/vfs.h +++ b/fs/nfsd/vfs.h @@ -8,6 +8,7 @@ #include <linux/fs.h> #include <linux/posix_acl.h> +#include <linux/nfslocalio.h> #include "nfsfh.h" #include "nfsd.h" @@ -161,7 +162,7 @@ __be32 nfsd_permission(struct svc_cred *cred, struct svc_export *exp, void nfsd_filp_close(struct file *fp); struct nfs_localio_ctx * -nfsd_open_local_fh(struct net *, struct auth_domain *, +nfsd_open_local_fh(nfs_uuid_t *, struct rpc_clnt *, const struct cred *, const struct nfs_fh *, const fmode_t); diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index fc7982fc218c..b43e3e067e44 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -131,8 +131,7 @@ struct nfs_client { #if IS_ENABLED(CONFIG_NFS_LOCALIO) struct timespec64 cl_nfssvc_boot; seqlock_t cl_boot_lock; - struct net __rcu * cl_nfssvc_net; - struct auth_domain __rcu * cl_nfssvc_dom; + nfs_uuid_t cl_uuid; spinlock_t cl_localio_lock; #endif /* CONFIG_NFS_LOCALIO */ }; diff --git a/include/linux/nfslocalio.h b/include/linux/nfslocalio.h index 68f5b39f1940..e196f716a2f5 100644 --- a/include/linux/nfslocalio.h +++ b/include/linux/nfslocalio.h @@ -28,7 +28,10 @@ typedef struct { void nfs_uuid_begin(nfs_uuid_t *); void nfs_uuid_end(nfs_uuid_t *); -bool nfs_uuid_is_local(const uuid_t *, struct net *, struct auth_domain *); +void nfs_uuid_is_local(const uuid_t *, struct list_head *, + struct net *, struct auth_domain *); +void nfs_uuid_invalidate_clients(struct list_head *list); +void nfs_uuid_invalidate_one_client(nfs_uuid_t *nfs_uuid); struct nfsd_file; struct nfsd_net; @@ -39,7 +42,7 @@ struct nfs_localio_ctx { }; typedef struct nfs_localio_ctx * -(*nfs_to_nfsd_open_local_fh_t)(struct net *, struct auth_domain *, +(*nfs_to_nfsd_open_local_fh_t)(nfs_uuid_t *, struct rpc_clnt *, const struct cred *, const struct nfs_fh *, const fmode_t); typedef struct nfsd_file * (*nfs_to_nfsd_file_get_t)(struct nfsd_file *);
On Fri, Aug 30, 2024 at 12:20:13PM +1000, NeilBrown wrote: > Following are revised versions of 6 patches from the v14 localio series. > > The issue addressed is net namespace refcounting. > > We don't want to keep a long-term counted reference in the client > because that prevents a server container from completely shutting down. > > So we avoid taking a reference at all and rely on the per-cpu reference > to the server being sufficient to keep the net-ns active. This involves > allowing the net-ns exit code to iterate all active clients and clear > their ->net pointers (which they need to find the per-cpu-refcount for > the nfs_serv). > > So: > - embed nfs_uuid_t in nfs_client. This provides a list_head that can > be used to find the client. It does add the actual uuid to nfs_client > so it is bigger than needed. If that is really a problem we can find > a fix. > > - When the nfs server confirms that the uuid is local, it moves the > nfs_uuid_t onto a per-net-ns list. > > - When the net-ns is shutting down - in a "pre_exit" handler, all these > nfS_uuid_t have their ->net cleared. There is an rcu_synchronize() > call between pre_exit() handlers and exit() handlers so and caller > that sees ->net as not NULL can safely check the ->counter > > - We now pass the nfs_uuid_t to nfsd_open_local_fh() so it can safely > look at ->net in a private rcu_read_lock() section. > > I have compile tested this code but nothing more. > > Thanks, > NeilBrown > > [PATCH 14/25] nfs_common: add NFS LOCALIO auxiliary protocol > [PATCH 15/25] nfs_common: introduce nfs_localio_ctx struct and > [PATCH 16/25] nfsd: add localio support > [PATCH 17/25] nfsd: implement server support for NFS_LOCALIO_PROGRAM > [PATCH 19/25] nfs: add localio support > [PATCH 23/25] nfs: implement client support for NFS_LOCALIO_PROGRAM Hey Neil, I attempted to test the kernel with your changes but it crashed with: [ 55.422564] list_add corruption. next is NULL. [ 55.423523] ------------[ cut here ]------------ [ 55.424423] kernel BUG at lib/list_debug.c:27! [ 55.425291] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 55.426367] CPU: 29 UID: 0 PID: 5251 Comm: nfsd Kdump: loaded Not tainted 6.11.0-rc4.snitm+ #147 [ 55.427991] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-4.module+el8.9.0+19570+14a90618 04/01/2014 [ 55.429697] RIP: 0010:__list_add_valid_or_report+0x55/0xa0 [ 55.430741] Code: 4c 39 cf 74 4f b8 01 00 00 00 5d c3 cc cc cc cc 48 c7 c7 98 1d a5 82 e8 d9 6d 93 ff 0f 0b 48 c7 c7 c0 1d a5 82 e8 cb 6d 93 ff <0f> 0b 4c 89 c1 48 c7 c7 e8 1d a5 82 e8 ba 6d 93 ff 0f 0b 48 89 d1 [ 55.434167] RSP: 0018:ffff8881441a7d50 EFLAGS: 00010296 [ 55.435141] RAX: 0000000000000022 RBX: ffff888107b50370 RCX: 0000000000000000 [ 55.436455] RDX: ffff888473caf800 RSI: ffff888473ca18c0 RDI: ffff888473ca18c0 [ 55.437770] RBP: ffff8881441a7d50 R08: 0000000000000022 R09: ffff8881441a7be8 [ 55.439098] R10: ffff8881441a7be0 R11: ffffffff8333f328 R12: ffff888107b50380 [ 55.440419] R13: ffff888103b15080 R14: ffff888107bb5d00 R15: 0000000000000000 [ 55.441737] FS: 0000000000000000(0000) GS:ffff888473c80000(0000) knlGS:0000000000000000 [ 55.443228] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 55.444297] CR2: 000055958fab3488 CR3: 000000010615e000 CR4: 0000000000350ef0 [ 55.445615] Call Trace: [ 55.446090] <TASK> [ 55.446498] ? show_regs+0x6d/0x80 [ 55.447149] ? die+0x3c/0xa0 [ 55.447698] ? do_trap+0xcf/0xf0 [ 55.448316] ? do_error_trap+0x75/0xa0 [ 55.449026] ? __list_add_valid_or_report+0x55/0xa0 [ 55.449938] ? exc_invalid_op+0x57/0x80 [ 55.450660] ? __list_add_valid_or_report+0x55/0xa0 [ 55.451646] ? asm_exc_invalid_op+0x1f/0x30 [ 55.452438] ? __list_add_valid_or_report+0x55/0xa0 [ 55.453355] nfs_uuid_is_local+0xba/0x110 [ 55.454115] localio_proc_uuid_is_local+0x64/0x80 [nfsd] [ 55.455145] nfsd_dispatch+0xc2/0x210 [nfsd] [ 55.455977] svc_process_common+0x2e6/0x6e0 [ 55.456761] ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd] [ 55.457697] svc_process+0x13e/0x1e0 [ 55.458377] svc_recv+0x89e/0xa70 [ 55.459012] ? __pfx_nfsd+0x10/0x10 [nfsd] [ 55.459806] nfsd+0xa5/0x100 [nfsd] [ 55.460486] kthread+0xe5/0x120 [ 55.461090] ? __pfx_kthread+0x10/0x10 [ 55.461801] ret_from_fork+0x3d/0x60 [ 55.462476] ? __pfx_kthread+0x10/0x10 [ 55.463184] ret_from_fork_asm+0x1a/0x30 [ 55.463923] </TASK> I'll triple check my melding of your changes and mine in ~7 hours.. I may have missed something. Note this is _not_ with your other incremental patch (that uses __module_get) -- only because I didn't get to that yet. Mike
On Fri, Aug 30, 2024 at 03:47:31AM -0400, Mike Snitzer wrote: > On Fri, Aug 30, 2024 at 12:20:13PM +1000, NeilBrown wrote: > > Following are revised versions of 6 patches from the v14 localio series. > > > > The issue addressed is net namespace refcounting. > > > > We don't want to keep a long-term counted reference in the client > > because that prevents a server container from completely shutting down. > > > > So we avoid taking a reference at all and rely on the per-cpu reference > > to the server being sufficient to keep the net-ns active. This involves > > allowing the net-ns exit code to iterate all active clients and clear > > their ->net pointers (which they need to find the per-cpu-refcount for > > the nfs_serv). > > > > So: > > - embed nfs_uuid_t in nfs_client. This provides a list_head that can > > be used to find the client. It does add the actual uuid to nfs_client > > so it is bigger than needed. If that is really a problem we can find > > a fix. > > > > - When the nfs server confirms that the uuid is local, it moves the > > nfs_uuid_t onto a per-net-ns list. > > > > - When the net-ns is shutting down - in a "pre_exit" handler, all these > > nfS_uuid_t have their ->net cleared. There is an rcu_synchronize() > > call between pre_exit() handlers and exit() handlers so and caller > > that sees ->net as not NULL can safely check the ->counter > > > > - We now pass the nfs_uuid_t to nfsd_open_local_fh() so it can safely > > look at ->net in a private rcu_read_lock() section. > > > > I have compile tested this code but nothing more. > > > > Thanks, > > NeilBrown > > > > [PATCH 14/25] nfs_common: add NFS LOCALIO auxiliary protocol > > [PATCH 15/25] nfs_common: introduce nfs_localio_ctx struct and > > [PATCH 16/25] nfsd: add localio support > > [PATCH 17/25] nfsd: implement server support for NFS_LOCALIO_PROGRAM > > [PATCH 19/25] nfs: add localio support > > [PATCH 23/25] nfs: implement client support for NFS_LOCALIO_PROGRAM > > Hey Neil, > > I attempted to test the kernel with your changes but it crashed with: > > [ 55.422564] list_add corruption. next is NULL. > [ 55.423523] ------------[ cut here ]------------ > [ 55.424423] kernel BUG at lib/list_debug.c:27! ... > I'll triple check my melding of your changes and mine in ~7 hours.. I > may have missed something. Good news, I was just missing your fs/nfsd/nfsctl.c changes.. works well with those ;) Seeing a very slight drop in performance (with my well-worn smoke test that does 20 secs of aio directio 128K reads with 24 threads on my testbed). But I mean slight (~2-3%). Not worried about it, correctness trumps performance, but I think reclaiming that performance will be easy (by avoiding the need for KMEM_CACHE nfs_localio_ctx_{alloc,free}). > Note this is _not_ with your other incremental patch (that uses > __module_get) -- only because I didn't get to that yet. Moving on to incorporating your nfsd __module_get now. Then on to the work of switching from nfs_localio_ctx back to returning nfsd_file. Mike