From patchwork Fri Mar 21 15:21:01 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Trond Myklebust X-Patchwork-Id: 14025651 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 09E451DE4CE for ; Fri, 21 Mar 2025 15:21:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742570467; cv=none; b=LWHsPj+6z8f76wlnPGTK6cV+jbZyyZTkTgr6F9TY6yCQAjTuK+F4a5x8+gMzhVZenSBf1ekWGSPKv3JPxnsE5tDt6ezggELTAFSfIptSwaOG5D3Or0VuLto9LQGmSoaiOGqE/7EtXSc7LYmsJW8eqAMKAffG1U5SUaNBv9mf6YU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742570467; c=relaxed/simple; bh=A7pnfRBRxM0CrgzFYSRmbFQfzwXWvbO1UEqtLJzuziM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=gt/5A40O5rpWKOToP3A0LstNNG4dAo5E7f5LZWXx8ZtF3PYC1PKRIL7gUlBlH6c8FRT5Ymt5nnw5PvSe+phm1v3h00F4lnWGAydV8+C1sU44Liv02yvSyb5spU8YEE/VEWo9vhvvn6vKTmrupjFDEbqwk1HvQLqm/qvDt2MXPXw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Svh3M8o8; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Svh3M8o8" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 449D5C4CEE9; Fri, 21 Mar 2025 15:21:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1742570466; bh=A7pnfRBRxM0CrgzFYSRmbFQfzwXWvbO1UEqtLJzuziM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Svh3M8o8QPcZgJ491e4NJeEZr6vPROOlmWLfOJIjf61jWcAgdi1PjwH0eLsVlAG61 NLnF/ZtUy86THX4slFiUleb6Tx3B4x8GRaUjbCLJE27uao51ZbcpewXWxlzFxzaxy8 281okKg1G4sKtwg5XCCvPjW9UyIaknOC40LUS9c/QWUiaB3TVoe6UwhqkW9+R7BV12 cOPFW312FfXChIf/U7bfUEPHvJbrAT+0i9yHCItgjxO4NlFCcTrk5pOSXfsaGBEwuI I9INDmDDE0jK/XY9wTQhXKGoARd5wMP7KllJmlsds4eF3zpCo9iIwC7eNf4QwZLFpN 5iP76w8+ElN3Q== From: trondmy@kernel.org To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Josef Bacik Subject: [PATCH v3 1/4] NFS: Add a mount option to make ENETUNREACH errors fatal Date: Fri, 21 Mar 2025 11:21:01 -0400 Message-ID: X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-nfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Trond Myklebust If the NFS client was initially created in a container, and that container is torn down, there is usually no possibity to go back and destroy any NFS clients that are hung because their virtual network devices have been unlinked. Add a flag that tells the NFS client that in these circumstances, it should treat ENETDOWN and ENETUNREACH errors as fatal to the NFS client. The option defaults to being on when the mount happens from inside a net namespace that is not "init_net". Signed-off-by: Trond Myklebust Reviewed-by: Jeff Layton Tested-by: Jeff Layton --- fs/nfs/fs_context.c | 39 +++++++++++++++++++++++++++++++++++++++ fs/nfs/super.c | 3 +++ include/linux/nfs_fs_sb.h | 1 + 3 files changed, 43 insertions(+) diff --git a/fs/nfs/fs_context.c b/fs/nfs/fs_context.c index 1cabba1231d6..13f71ca8c974 100644 --- a/fs/nfs/fs_context.c +++ b/fs/nfs/fs_context.c @@ -50,6 +50,7 @@ enum nfs_param { Opt_clientaddr, Opt_cto, Opt_alignwrite, + Opt_fatal_neterrors, Opt_fg, Opt_fscache, Opt_fscache_flag, @@ -97,6 +98,20 @@ enum nfs_param { Opt_xprtsec, }; +enum { + Opt_fatal_neterrors_default, + Opt_fatal_neterrors_enetunreach, + Opt_fatal_neterrors_none, +}; + +static const struct constant_table nfs_param_enums_fatal_neterrors[] = { + { "default", Opt_fatal_neterrors_default }, + { "ENETDOWN:ENETUNREACH", Opt_fatal_neterrors_enetunreach }, + { "ENETUNREACH:ENETDOWN", Opt_fatal_neterrors_enetunreach }, + { "none", Opt_fatal_neterrors_none }, + {} +}; + enum { Opt_local_lock_all, Opt_local_lock_flock, @@ -153,6 +168,8 @@ static const struct fs_parameter_spec nfs_fs_parameters[] = { fsparam_string("clientaddr", Opt_clientaddr), fsparam_flag_no("cto", Opt_cto), fsparam_flag_no("alignwrite", Opt_alignwrite), + fsparam_enum("fatal_neterrors", Opt_fatal_neterrors, + nfs_param_enums_fatal_neterrors), fsparam_flag ("fg", Opt_fg), fsparam_flag_no("fsc", Opt_fscache_flag), fsparam_string("fsc", Opt_fscache), @@ -896,6 +913,25 @@ static int nfs_fs_context_parse_param(struct fs_context *fc, goto out_of_bounds; ctx->nfs_server.max_connect = result.uint_32; break; + case Opt_fatal_neterrors: + trace_nfs_mount_assign(param->key, param->string); + switch (result.uint_32) { + case Opt_fatal_neterrors_default: + if (fc->net_ns != &init_net) + ctx->flags |= NFS_MOUNT_NETUNREACH_FATAL; + else + ctx->flags &= ~NFS_MOUNT_NETUNREACH_FATAL; + break; + case Opt_fatal_neterrors_enetunreach: + ctx->flags |= NFS_MOUNT_NETUNREACH_FATAL; + break; + case Opt_fatal_neterrors_none: + ctx->flags &= ~NFS_MOUNT_NETUNREACH_FATAL; + break; + default: + goto out_invalid_value; + } + break; case Opt_lookupcache: trace_nfs_mount_assign(param->key, param->string); switch (result.uint_32) { @@ -1675,6 +1711,9 @@ static int nfs_init_fs_context(struct fs_context *fc) ctx->xprtsec.cert_serial = TLS_NO_CERT; ctx->xprtsec.privkey_serial = TLS_NO_PRIVKEY; + if (fc->net_ns != &init_net) + ctx->flags |= NFS_MOUNT_NETUNREACH_FATAL; + fc->s_iflags |= SB_I_STABLE_WRITES; } fc->fs_private = ctx; diff --git a/fs/nfs/super.c b/fs/nfs/super.c index 96de658a7886..9eea9e62afc9 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -457,6 +457,9 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss, { NFS_MOUNT_FORCE_RDIRPLUS, ",rdirplus=force", "" }, { NFS_MOUNT_UNSHARED, ",nosharecache", "" }, { NFS_MOUNT_NORESVPORT, ",noresvport", "" }, + { NFS_MOUNT_NETUNREACH_FATAL, + ",fatal_neterrors=ENETDOWN:ENETUNREACH", + ",fatal_neterrors=none" }, { 0, NULL, NULL } }; const struct proc_nfs_info *nfs_infop; diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index b83d16a42afc..a6ce8590eaaf 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -168,6 +168,7 @@ struct nfs_server { #define NFS_MOUNT_SHUTDOWN 0x08000000 #define NFS_MOUNT_NO_ALIGNWRITE 0x10000000 #define NFS_MOUNT_FORCE_RDIRPLUS 0x20000000 +#define NFS_MOUNT_NETUNREACH_FATAL 0x40000000 unsigned int fattr_valid; /* Valid attributes */ unsigned int caps; /* server capabilities */ From patchwork Fri Mar 21 15:21:02 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Trond Myklebust X-Patchwork-Id: 14025652 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 813F11DED5C for ; Fri, 21 Mar 2025 15:21:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742570467; cv=none; b=SzqOzJrnbVLD9w61lOj/P3zoUEgQwsrocsu6fHNO2+HAngATwRPCt/QGx07nai2T1TNwnkMJEdiZKHB8Y3P6jkn4OIVkdkjXpzki06WPZxUFpawGA1cmo0NKBNvi0IvMSk1RA1QvaS41Y+ahz6nzHd7CjLp0Ck6zytlCjuMLKK8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742570467; c=relaxed/simple; bh=th+01ITaYCLvctnpnKKry+HF5686Ov8FKHn+qVhQ6jI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=E9W6g2DzRthc/j1GmUp7zU1hhZApAzIFxjEXZU2YWXvElpYL0oHwg6SeFmtim/9Xco5n8DU338s3PY3xH+WqcWfKnuqkJNlN77SkBfFn8enppT9V0V1ZrYixKmqZEYmKDO2qJPoDtjqoi1XtCf2M0OlGkDvMRf9kGHTgG/ubZUs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Lj60dYmd; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Lj60dYmd" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C9352C4CEE3; Fri, 21 Mar 2025 15:21:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1742570467; bh=th+01ITaYCLvctnpnKKry+HF5686Ov8FKHn+qVhQ6jI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Lj60dYmdw9/WS/kDAZY07VR0U02fZx1GdYvhRKyC77F150ncz/+mS0dWf2CQqUqBP kniaoGrXPEKkHmML04ZsP5EhtueTEQItkdm6Y9Ryj/e+shF3ntjOXNNgi3vDL7rCO2 8W2uqIvIRCPumz5Y9RNUNu2gN7u6t/KbTzUE0EPS5EAK8FLMls/KDS8OoW2EfIy++s oOXLFIjTAhBt2jLhex+34gWey0PpJWUQgN9NZLTd8WPqjIAXwVpuC1z5Z4VdgyKC5K 082Pl3ZIenIkb8V2Ji11X3XKbt6hU+KtX+w/RCzMEKTBJYC9Adoe56FrJ8nGv+Zsy7 Z2eAE8faiE3yQ== From: trondmy@kernel.org To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Josef Bacik Subject: [PATCH v3 2/4] NFS: Treat ENETUNREACH errors as fatal in containers Date: Fri, 21 Mar 2025 11:21:02 -0400 Message-ID: <0ad321d6ea5003cbca74149d25ec66d73b41aefa.1742570192.git.trond.myklebust@hammerspace.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-nfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Trond Myklebust Propagate the NFS_MOUNT_NETUNREACH_FATAL flag to work with the generic NFS client. If the flag is set, the client will receive ENETDOWN and ENETUNREACH errors from the RPC layer, and is expected to treat them as being fatal. Signed-off-by: Trond Myklebust Reviewed-by: Jeff Layton Tested-by: Jeff Layton --- fs/nfs/client.c | 5 +++++ fs/nfs/nfs4client.c | 2 ++ fs/nfs/nfs4proc.c | 3 +++ include/linux/nfs_fs_sb.h | 1 + include/linux/sunrpc/clnt.h | 5 ++++- include/linux/sunrpc/sched.h | 1 + include/trace/events/sunrpc.h | 1 + net/sunrpc/clnt.c | 30 ++++++++++++++++++++++-------- 8 files changed, 39 insertions(+), 9 deletions(-) diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 3b0918ade53c..02c916a55020 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -546,6 +546,8 @@ int nfs_create_rpc_client(struct nfs_client *clp, args.flags |= RPC_CLNT_CREATE_NOPING; if (test_bit(NFS_CS_REUSEPORT, &clp->cl_flags)) args.flags |= RPC_CLNT_CREATE_REUSEPORT; + if (test_bit(NFS_CS_NETUNREACH_FATAL, &clp->cl_flags)) + args.flags |= RPC_CLNT_CREATE_NETUNREACH_FATAL; if (!IS_ERR(clp->cl_rpcclient)) return 0; @@ -709,6 +711,9 @@ static int nfs_init_server(struct nfs_server *server, if (ctx->flags & NFS_MOUNT_NORESVPORT) set_bit(NFS_CS_NORESVPORT, &cl_init.init_flags); + if (ctx->flags & NFS_MOUNT_NETUNREACH_FATAL) + __set_bit(NFS_CS_NETUNREACH_FATAL, &cl_init.init_flags); + /* Allocate or find a client reference we can use */ clp = nfs_get_client(&cl_init); if (IS_ERR(clp)) diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c index 83378f69b35e..8f7d40844cdc 100644 --- a/fs/nfs/nfs4client.c +++ b/fs/nfs/nfs4client.c @@ -233,6 +233,8 @@ struct nfs_client *nfs4_alloc_client(const struct nfs_client_initdata *cl_init) __set_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags); if (test_bit(NFS_CS_PNFS, &cl_init->init_flags)) __set_bit(NFS_CS_PNFS, &clp->cl_flags); + if (test_bit(NFS_CS_NETUNREACH_FATAL, &cl_init->init_flags)) + __set_bit(NFS_CS_NETUNREACH_FATAL, &clp->cl_flags); /* * Set up the connection to the server before we add add to the * global list. diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 24406fc1352d..889511650ceb 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -195,6 +195,9 @@ static int nfs4_map_errors(int err) return -EBUSY; case -NFS4ERR_NOT_SAME: return -ENOTSYNC; + case -ENETDOWN: + case -ENETUNREACH: + break; default: dprintk("%s could not handle NFSv4 error %d\n", __func__, -err); diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index a6ce8590eaaf..71319637a84e 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -50,6 +50,7 @@ struct nfs_client { #define NFS_CS_DS 7 /* - Server is a DS */ #define NFS_CS_REUSEPORT 8 /* - reuse src port on reconnect */ #define NFS_CS_PNFS 9 /* - Server used for pnfs */ +#define NFS_CS_NETUNREACH_FATAL 10 /* - ENETUNREACH errors are fatal */ struct sockaddr_storage cl_addr; /* server identifier */ size_t cl_addrlen; char * cl_hostname; /* hostname of server */ diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h index fec976e58174..f8b406b0a1af 100644 --- a/include/linux/sunrpc/clnt.h +++ b/include/linux/sunrpc/clnt.h @@ -64,7 +64,9 @@ struct rpc_clnt { cl_noretranstimeo: 1,/* No retransmit timeouts */ cl_autobind : 1,/* use getport() */ cl_chatty : 1,/* be verbose */ - cl_shutdown : 1;/* rpc immediate -EIO */ + cl_shutdown : 1,/* rpc immediate -EIO */ + cl_netunreach_fatal : 1; + /* Treat ENETUNREACH errors as fatal */ struct xprtsec_parms cl_xprtsec; /* transport security policy */ struct rpc_rtt * cl_rtt; /* RTO estimator data */ @@ -175,6 +177,7 @@ struct rpc_add_xprt_test { #define RPC_CLNT_CREATE_SOFTERR (1UL << 10) #define RPC_CLNT_CREATE_REUSEPORT (1UL << 11) #define RPC_CLNT_CREATE_CONNECTED (1UL << 12) +#define RPC_CLNT_CREATE_NETUNREACH_FATAL (1UL << 13) struct rpc_clnt *rpc_create(struct rpc_create_args *args); struct rpc_clnt *rpc_bind_new_program(struct rpc_clnt *, diff --git a/include/linux/sunrpc/sched.h b/include/linux/sunrpc/sched.h index eac57914dcf3..ccba79ebf893 100644 --- a/include/linux/sunrpc/sched.h +++ b/include/linux/sunrpc/sched.h @@ -134,6 +134,7 @@ struct rpc_task_setup { #define RPC_TASK_MOVEABLE 0x0004 /* nfs4.1+ rpc tasks */ #define RPC_TASK_NULLCREDS 0x0010 /* Use AUTH_NULL credential */ #define RPC_CALL_MAJORSEEN 0x0020 /* major timeout seen */ +#define RPC_TASK_NETUNREACH_FATAL 0x0040 /* ENETUNREACH is fatal */ #define RPC_TASK_DYNAMIC 0x0080 /* task was kmalloc'ed */ #define RPC_TASK_NO_ROUND_ROBIN 0x0100 /* send requests on "main" xprt */ #define RPC_TASK_SOFT 0x0200 /* Use soft timeouts */ diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h index 851841336ee6..5d331383047b 100644 --- a/include/trace/events/sunrpc.h +++ b/include/trace/events/sunrpc.h @@ -343,6 +343,7 @@ TRACE_EVENT(rpc_request, { RPC_TASK_MOVEABLE, "MOVEABLE" }, \ { RPC_TASK_NULLCREDS, "NULLCREDS" }, \ { RPC_CALL_MAJORSEEN, "MAJORSEEN" }, \ + { RPC_TASK_NETUNREACH_FATAL, "NETUNREACH_FATAL"}, \ { RPC_TASK_DYNAMIC, "DYNAMIC" }, \ { RPC_TASK_NO_ROUND_ROBIN, "NO_ROUND_ROBIN" }, \ { RPC_TASK_SOFT, "SOFT" }, \ diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 2fe88ea79a70..1cfaf93ceec1 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -512,6 +512,8 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args, clnt->cl_discrtry = 1; if (!(args->flags & RPC_CLNT_CREATE_QUIET)) clnt->cl_chatty = 1; + if (args->flags & RPC_CLNT_CREATE_NETUNREACH_FATAL) + clnt->cl_netunreach_fatal = 1; return clnt; } @@ -662,6 +664,7 @@ static struct rpc_clnt *__rpc_clone_client(struct rpc_create_args *args, new->cl_noretranstimeo = clnt->cl_noretranstimeo; new->cl_discrtry = clnt->cl_discrtry; new->cl_chatty = clnt->cl_chatty; + new->cl_netunreach_fatal = clnt->cl_netunreach_fatal; new->cl_principal = clnt->cl_principal; new->cl_max_connect = clnt->cl_max_connect; return new; @@ -1195,6 +1198,8 @@ void rpc_task_set_client(struct rpc_task *task, struct rpc_clnt *clnt) task->tk_flags |= RPC_TASK_TIMEOUT; if (clnt->cl_noretranstimeo) task->tk_flags |= RPC_TASK_NO_RETRANS_TIMEOUT; + if (clnt->cl_netunreach_fatal) + task->tk_flags |= RPC_TASK_NETUNREACH_FATAL; atomic_inc(&clnt->cl_task_count); } @@ -2102,14 +2107,17 @@ call_bind_status(struct rpc_task *task) case -EPROTONOSUPPORT: trace_rpcb_bind_version_err(task); goto retry_timeout; + case -ENETDOWN: + case -ENETUNREACH: + if (task->tk_flags & RPC_TASK_NETUNREACH_FATAL) + break; + fallthrough; case -ECONNREFUSED: /* connection problems */ case -ECONNRESET: case -ECONNABORTED: case -ENOTCONN: case -EHOSTDOWN: - case -ENETDOWN: case -EHOSTUNREACH: - case -ENETUNREACH: case -EPIPE: trace_rpcb_unreachable_err(task); if (!RPC_IS_SOFTCONN(task)) { @@ -2191,19 +2199,22 @@ call_connect_status(struct rpc_task *task) task->tk_status = 0; switch (status) { + case -ENETDOWN: + case -ENETUNREACH: + if (task->tk_flags & RPC_TASK_NETUNREACH_FATAL) + break; + fallthrough; case -ECONNREFUSED: case -ECONNRESET: /* A positive refusal suggests a rebind is needed. */ - if (RPC_IS_SOFTCONN(task)) - break; if (clnt->cl_autobind) { rpc_force_rebind(clnt); + if (RPC_IS_SOFTCONN(task)) + break; goto out_retry; } fallthrough; case -ECONNABORTED: - case -ENETDOWN: - case -ENETUNREACH: case -EHOSTUNREACH: case -EPIPE: case -EPROTO: @@ -2455,10 +2466,13 @@ call_status(struct rpc_task *task) trace_rpc_call_status(task); task->tk_status = 0; switch(status) { - case -EHOSTDOWN: case -ENETDOWN: - case -EHOSTUNREACH: case -ENETUNREACH: + if (task->tk_flags & RPC_TASK_NETUNREACH_FATAL) + goto out_exit; + fallthrough; + case -EHOSTDOWN: + case -EHOSTUNREACH: case -EPERM: if (RPC_IS_SOFTCONN(task)) goto out_exit; From patchwork Fri Mar 21 15:21:03 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Trond Myklebust X-Patchwork-Id: 14025653 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 35B411DED5C for ; Fri, 21 Mar 2025 15:21:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742570468; cv=none; b=SkaI9ubZH8XG/RypSBOhtVqpSBfZ9OOhZO89MyF8pofOGivjJ0K/TK7RXOJIYqMZmPbGHDWvmh8S5ycd1DRhyektmITgXF5jH6ORGIF0ijBamdQuTS7IAKLHw0L8x3f2T5TAjMdeVpUnO6dlIAHD8aZezUZSJqo7hSFI/2pbAQE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742570468; c=relaxed/simple; bh=6++rWQnadN5feCwhuPK4SSw4TMAIgpzETCs1IJzOxGA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=N3/FcdEw0BfUnmX2xULGVobEOi7R5q1epMItCEyGi+5ThfP+exA1Zt14YQyHDv+pNX1O22dl0E4ujLxs+OSenBhJqP2qn4Eafz3TqfSfJ/sk0ZecCGUc21JTvmz3Z3BE4FIyjenewJ/cHNQh6/VMZ25JQRbdl87XQskFag4Np70= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=NuPu+i33; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="NuPu+i33" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5CD3EC4CEED; Fri, 21 Mar 2025 15:21:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1742570467; bh=6++rWQnadN5feCwhuPK4SSw4TMAIgpzETCs1IJzOxGA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=NuPu+i33WssV2bU5AwN7cXqUWtXy0TPUMigSyQ3XODjWcy5nGX3TwpuN971d7EvhT 6Q6ADeUEbG+40lpacbmnfr8UBFYAXYPTTDG8FCmXsw+LNQXeBevcn7p7cBLPpJmskY QrsRFNCUc9aH93XjRpFhZIHCRCMEG4S1fB0fDJm9WUm4pqSTN3UR0iKEnI2yFgx+3Y 4R3FMgQLkGQGtKsw7SF4cXfJ4HdMFA8nnbbtdcBSg/lMSTsUH5AwzPGXq/v2jMaYhO 24lOkRziSj3MADyHEnP5PFeP/MwHUSH28TChuYW+namk3zl+pULIG2+S/BLOXzzdAR P0Vq8PJJdDvEw== From: trondmy@kernel.org To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Josef Bacik Subject: [PATCH v3 3/4] pNFS/flexfiles: Treat ENETUNREACH errors as fatal in containers Date: Fri, 21 Mar 2025 11:21:03 -0400 Message-ID: <0abd073b82ed88f0cce348ce70cc06c0d0e3c9d4.1742570192.git.trond.myklebust@hammerspace.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-nfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Trond Myklebust Propagate the NFS_MOUNT_NETUNREACH_FATAL flag to work with the pNFS flexfiles client. In these circumstances, the client needs to treat the ENETDOWN and ENETUNREACH errors as fatal, and should abandon the attempted I/O. Signed-off-by: Trond Myklebust Reviewed-by: Jeff Layton Tested-by: Jeff Layton --- fs/nfs/flexfilelayout/flexfilelayout.c | 23 +++++++++++++++++++++-- fs/nfs/nfs3client.c | 2 ++ fs/nfs/nfs4client.c | 5 +++++ include/linux/nfs4.h | 1 + 4 files changed, 29 insertions(+), 2 deletions(-) diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c index 98b45b636be3..f89fdba7289d 100644 --- a/fs/nfs/flexfilelayout/flexfilelayout.c +++ b/fs/nfs/flexfilelayout/flexfilelayout.c @@ -1154,10 +1154,14 @@ static int ff_layout_async_handle_error_v4(struct rpc_task *task, rpc_wake_up(&tbl->slot_tbl_waitq); goto reset; /* RPC connection errors */ + case -ENETDOWN: + case -ENETUNREACH: + if (test_bit(NFS_CS_NETUNREACH_FATAL, &clp->cl_flags)) + return -NFS4ERR_FATAL_IOERROR; + fallthrough; case -ECONNREFUSED: case -EHOSTDOWN: case -EHOSTUNREACH: - case -ENETUNREACH: case -EIO: case -ETIMEDOUT: case -EPIPE: @@ -1183,6 +1187,7 @@ static int ff_layout_async_handle_error_v4(struct rpc_task *task, /* Retry all errors through either pNFS or MDS except for -EJUKEBOX */ static int ff_layout_async_handle_error_v3(struct rpc_task *task, + struct nfs_client *clp, struct pnfs_layout_segment *lseg, u32 idx) { @@ -1200,6 +1205,11 @@ static int ff_layout_async_handle_error_v3(struct rpc_task *task, case -EJUKEBOX: nfs_inc_stats(lseg->pls_layout->plh_inode, NFSIOS_DELAY); goto out_retry; + case -ENETDOWN: + case -ENETUNREACH: + if (test_bit(NFS_CS_NETUNREACH_FATAL, &clp->cl_flags)) + return -NFS4ERR_FATAL_IOERROR; + fallthrough; default: dprintk("%s DS connection error %d\n", __func__, task->tk_status); @@ -1234,7 +1244,7 @@ static int ff_layout_async_handle_error(struct rpc_task *task, switch (vers) { case 3: - return ff_layout_async_handle_error_v3(task, lseg, idx); + return ff_layout_async_handle_error_v3(task, clp, lseg, idx); case 4: return ff_layout_async_handle_error_v4(task, state, clp, lseg, idx); @@ -1337,6 +1347,9 @@ static int ff_layout_read_done_cb(struct rpc_task *task, return task->tk_status; case -EAGAIN: goto out_eagain; + case -NFS4ERR_FATAL_IOERROR: + task->tk_status = -EIO; + return 0; } return 0; @@ -1507,6 +1520,9 @@ static int ff_layout_write_done_cb(struct rpc_task *task, return task->tk_status; case -EAGAIN: return -EAGAIN; + case -NFS4ERR_FATAL_IOERROR: + task->tk_status = -EIO; + return 0; } if (hdr->res.verf->committed == NFS_FILE_SYNC || @@ -1551,6 +1567,9 @@ static int ff_layout_commit_done_cb(struct rpc_task *task, case -EAGAIN: rpc_restart_call_prepare(task); return -EAGAIN; + case -NFS4ERR_FATAL_IOERROR: + task->tk_status = -EIO; + return 0; } ff_layout_set_layoutcommit(data->inode, data->lseg, data->lwb); diff --git a/fs/nfs/nfs3client.c b/fs/nfs/nfs3client.c index b0c8a39c2bbd..0d7310c1ee0c 100644 --- a/fs/nfs/nfs3client.c +++ b/fs/nfs/nfs3client.c @@ -120,6 +120,8 @@ struct nfs_client *nfs3_set_ds_client(struct nfs_server *mds_srv, if (mds_srv->flags & NFS_MOUNT_NORESVPORT) __set_bit(NFS_CS_NORESVPORT, &cl_init.init_flags); + if (test_bit(NFS_CS_NETUNREACH_FATAL, &mds_clp->cl_flags)) + __set_bit(NFS_CS_NETUNREACH_FATAL, &cl_init.init_flags); __set_bit(NFS_CS_DS, &cl_init.init_flags); diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c index 8f7d40844cdc..162c85a83a14 100644 --- a/fs/nfs/nfs4client.c +++ b/fs/nfs/nfs4client.c @@ -939,6 +939,9 @@ static int nfs4_set_client(struct nfs_server *server, __set_bit(NFS_CS_TSM_POSSIBLE, &cl_init.init_flags); server->port = rpc_get_port((struct sockaddr *)addr); + if (server->flags & NFS_MOUNT_NETUNREACH_FATAL) + __set_bit(NFS_CS_NETUNREACH_FATAL, &cl_init.init_flags); + /* Allocate or find a client reference we can use */ clp = nfs_get_client(&cl_init); if (IS_ERR(clp)) @@ -1013,6 +1016,8 @@ struct nfs_client *nfs4_set_ds_client(struct nfs_server *mds_srv, if (mds_srv->flags & NFS_MOUNT_NORESVPORT) __set_bit(NFS_CS_NORESVPORT, &cl_init.init_flags); + if (test_bit(NFS_CS_NETUNREACH_FATAL, &mds_clp->cl_flags)) + __set_bit(NFS_CS_NETUNREACH_FATAL, &cl_init.init_flags); __set_bit(NFS_CS_PNFS, &cl_init.init_flags); cl_init.max_connect = NFS_MAX_TRANSPORTS; diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h index 5fa60fe441b5..d8cad844870a 100644 --- a/include/linux/nfs4.h +++ b/include/linux/nfs4.h @@ -300,6 +300,7 @@ enum nfsstat4 { /* error codes for internal client use */ #define NFS4ERR_RESET_TO_MDS 12001 #define NFS4ERR_RESET_TO_PNFS 12002 +#define NFS4ERR_FATAL_IOERROR 12003 static inline bool seqid_mutating_err(u32 err) { From patchwork Fri Mar 21 15:21:04 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Trond Myklebust X-Patchwork-Id: 14025654 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5C6DC1DF721 for ; Fri, 21 Mar 2025 15:21:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742570468; cv=none; b=Obo1DCqvk6Rr26BUI/gRVaskyJUWSZNfO/HaOuiRRPokQkLFu1rHdp4cS9c6qoABmOFmjRsFsbk6DGQzOa3ud3Lbgs+qdJ/fh3+aiigRRrWeqdj+Zka704412WpjXFQFLbxNA8sxQtVjxgTHZiXLnkKtK94V10Nnl0APId81kYM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742570468; c=relaxed/simple; bh=VT8B0H3UBNpAVD6YZld9Ygkj9dKbicilQ4bPR/E3oJ4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=FpOPSnGNjtJE3gPKTt2ZJYq0ERfo0a+sBxgu7u1JFqN/H7CeJvQ1mKqXHZKECLbDQ1H0XpFRXGaf16GGBcVHCv87/t31h5ypgYurF4Fj8WIKSaHzf4cEVo2u536nQ5FM21+o8mGqNF3wQAOsq4UJBqBQ93XYooLdTKoqV+/iBig= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=MLmE4U+z; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="MLmE4U+z" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E3C7FC4CEEE; Fri, 21 Mar 2025 15:21:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1742570468; bh=VT8B0H3UBNpAVD6YZld9Ygkj9dKbicilQ4bPR/E3oJ4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MLmE4U+zZRD8TNQk5fEdYaLz+uX1QrKywNSWs2MqYDz6a001rDmohYWok21UiuFk5 RkmqzgSD8j6oZkcy7j31+6BDnZoSvQ/jBpOHm6Pw2997RNY2l7PN+nMQpDYu69eP4A DDcH4S+kZy7USgJIaryM0zIIFuOpGjADn/YhTlo7XBL1fYwXeGLa0hct3osNEaa4OB HqnOWSj9/CzRq2xvdoVDIPQJ/iAPVLleor70hKYNVOKLmmP4JJYUtyRSiI4l/w9Ap8 1VzQrhgLej10MK1xdWUYAyj2hnN4kO+yVGO0IQUavSZmUFbjbMoXB2T6R/DMPZfv9T Berytb1T2LYqw== From: trondmy@kernel.org To: linux-nfs@vger.kernel.org Cc: Jeff Layton , Josef Bacik Subject: [PATCH v3 4/4] pNFS/flexfiles: Report ENETDOWN as a connection error Date: Fri, 21 Mar 2025 11:21:04 -0400 Message-ID: X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-nfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Trond Myklebust If the client should see an ENETDOWN when trying to connect to the data server, it might still be able to talk to the metadata server through another NIC. If so, report the error. Signed-off-by: Trond Myklebust Reviewed-by: Jeff Layton Tested-by: Jeff Layton --- fs/nfs/flexfilelayout/flexfilelayout.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c index f89fdba7289d..61ad269c825f 100644 --- a/fs/nfs/flexfilelayout/flexfilelayout.c +++ b/fs/nfs/flexfilelayout/flexfilelayout.c @@ -1274,6 +1274,7 @@ static void ff_layout_io_track_ds_error(struct pnfs_layout_segment *lseg, case -ECONNRESET: case -EHOSTDOWN: case -EHOSTUNREACH: + case -ENETDOWN: case -ENETUNREACH: case -EADDRINUSE: case -ENOBUFS: