From patchwork Tue Jun 4 19:45:23 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13685820
From: cel@kernel.org
To: Trond Myklebust, Anna Schumaker
Cc: Chuck Lever
Subject: [PATCH 1/5] xprtrdma: Fix rpcrdma_reqs_reset()
Date: Tue, 4 Jun 2024 15:45:23 -0400
Message-ID: <20240604194522.10390-6-cel@kernel.org>
X-Mailer: git-send-email 2.45.1
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

From: Chuck Lever

Avoid FastReg operations getting MW_BIND_ERR after a reconnect.

rpcrdma_reqs_reset() is called on transport tear-down to get each
rpcrdma_req back into a clean state.

MRs on req->rl_registered are waiting for a FastReg, are already
registered, or are waiting for invalidation.
If the transport is being torn down when reqs_reset() is called,
the matching LocalInv might never be posted. That leaves these MRs
registered /and/ on req->rl_free_mrs, where they can be re-used for
the next connection.

Since xprtrdma does not keep specific track of the MR state, it's
not possible to know what state these MRs are in, so the only safe
thing to do is release them immediately.

Fixes: 5de55ce951a1 ("xprtrdma: Release in-flight MRs on disconnect")
Signed-off-by: Chuck Lever
---
 net/sunrpc/xprtrdma/frwr_ops.c |  3 ++-
 net/sunrpc/xprtrdma/verbs.c    | 16 +++++++++++++++-
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index ffbf99894970..47f33bb7bff8 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -92,7 +92,8 @@ static void frwr_mr_put(struct rpcrdma_mr *mr)
 	rpcrdma_mr_push(mr, &mr->mr_req->rl_free_mrs);
 }

-/* frwr_reset - Place MRs back on the free list
+/**
+ * frwr_reset - Place MRs back on @req's free list
  * @req: request to reset
  *
  * Used after a failed marshal. For FRWR, this means the MRs
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 432557a553e7..a0b071089e15 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -897,6 +897,8 @@ static int rpcrdma_reqs_setup(struct rpcrdma_xprt *r_xprt)

 static void rpcrdma_req_reset(struct rpcrdma_req *req)
 {
+	struct rpcrdma_mr *mr;
+
 	/* Credits are valid for only one connection */
 	req->rl_slot.rq_cong = 0;

@@ -906,7 +908,19 @@ static void rpcrdma_req_reset(struct rpcrdma_req *req)
 	rpcrdma_regbuf_dma_unmap(req->rl_sendbuf);
 	rpcrdma_regbuf_dma_unmap(req->rl_recvbuf);

-	frwr_reset(req);
+	/* The verbs consumer can't know the state of an MR on the
+	 * req->rl_registered list unless a successful completion
+	 * has occurred, so they cannot be re-used.
+	 */
+	while ((mr = rpcrdma_mr_pop(&req->rl_registered))) {
+		struct rpcrdma_buffer *buf = &mr->mr_xprt->rx_buf;
+
+		spin_lock(&buf->rb_lock);
+		list_del(&mr->mr_all);
+		spin_unlock(&buf->rb_lock);
+
+		frwr_mr_release(mr);
+	}
 }

 /* ASSUMPTION: the rb_allreqs list is stable for the duration,

From patchwork Tue Jun 4 19:45:24 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13685821
From: cel@kernel.org
To: Trond Myklebust, Anna Schumaker
Cc: Chuck Lever, Sagi Grimberg
Subject: [PATCH 2/5] rpcrdma: Implement generic device removal
Date: Tue, 4 Jun 2024 15:45:24 -0400
Message-ID: <20240604194522.10390-7-cel@kernel.org>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240604194522.10390-6-cel@kernel.org>
References: <20240604194522.10390-6-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

From: Chuck Lever

Commit e87a911fed07 ("nvme-rdma: use ib_client API to detect device
removal") explains the benefits of handling device removal outside
of the CM event handler.

Sketch in an IB device removal notification mechanism that can be
used by both the client and server side RPC-over-RDMA transport
implementations.

Suggested-by: Sagi Grimberg
Signed-off-by: Chuck Lever
Reviewed-by: Sagi Grimberg
---
 include/linux/sunrpc/rdma_rn.h  |  27 +++++
 include/trace/events/rpcrdma.h  |  34 ++++++
 net/sunrpc/xprtrdma/Makefile    |   2 +-
 net/sunrpc/xprtrdma/ib_client.c | 181 ++++++++++++++++++++++++++++++++
 net/sunrpc/xprtrdma/module.c    |  18 +++-
 5 files changed, 258 insertions(+), 4 deletions(-)
 create mode 100644 include/linux/sunrpc/rdma_rn.h
 create mode 100644 net/sunrpc/xprtrdma/ib_client.c

diff --git a/include/linux/sunrpc/rdma_rn.h b/include/linux/sunrpc/rdma_rn.h
new file mode 100644
index 000000000000..7d032ca057af
--- /dev/null
+++ b/include/linux/sunrpc/rdma_rn.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * * Copyright (c) 2024, Oracle and/or its affiliates.
+ */
+
+#ifndef _LINUX_SUNRPC_RDMA_RN_H
+#define _LINUX_SUNRPC_RDMA_RN_H
+
+#include
+
+/**
+ * rpcrdma_notification - request removal notification
+ */
+struct rpcrdma_notification {
+	void (*rn_done)(struct rpcrdma_notification *rn);
+	u32 rn_index;
+};
+
+int rpcrdma_rn_register(struct ib_device *device,
+			struct rpcrdma_notification *rn,
+			void (*done)(struct rpcrdma_notification *rn));
+void rpcrdma_rn_unregister(struct ib_device *device,
+			   struct rpcrdma_notification *rn);
+int rpcrdma_ib_client_register(void);
+void rpcrdma_ib_client_unregister(void);
+
+#endif /* _LINUX_SUNRPC_RDMA_RN_H */
diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
index 14392652273a..ecdaf088219d 100644
--- a/include/trace/events/rpcrdma.h
+++ b/include/trace/events/rpcrdma.h
@@ -2220,6 +2220,40 @@ TRACE_EVENT(svcrdma_sq_post_err,
 	)
 );

+DECLARE_EVENT_CLASS(rpcrdma_client_device_class,
+	TP_PROTO(
+		const struct ib_device *device
+	),
+
+	TP_ARGS(device),
+
+	TP_STRUCT__entry(
+		__string(name, device->name)
+	),
+
+	TP_fast_assign(
+		__assign_str(name);
+	),
+
+	TP_printk("device=%s",
+		__get_str(name)
+	)
+);
+
+#define DEFINE_CLIENT_DEVICE_EVENT(name) \
+	DEFINE_EVENT(rpcrdma_client_device_class, name, \
+		TP_PROTO( \
+			const struct ib_device *device \
+		), \
+		TP_ARGS(device) \
+	)
+
+DEFINE_CLIENT_DEVICE_EVENT(rpcrdma_client_completion);
+DEFINE_CLIENT_DEVICE_EVENT(rpcrdma_client_add_one);
+DEFINE_CLIENT_DEVICE_EVENT(rpcrdma_client_remove_one);
+DEFINE_CLIENT_DEVICE_EVENT(rpcrdma_client_wait_on);
+DEFINE_CLIENT_DEVICE_EVENT(rpcrdma_client_remove_one_done);
+
 #endif /* _TRACE_RPCRDMA_H */

 #include
diff --git a/net/sunrpc/xprtrdma/Makefile b/net/sunrpc/xprtrdma/Makefile
index 55b21bae866d..3232aa23cdb4 100644
--- a/net/sunrpc/xprtrdma/Makefile
+++ b/net/sunrpc/xprtrdma/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_SUNRPC_XPRT_RDMA) += rpcrdma.o

-rpcrdma-y := transport.o rpc_rdma.o verbs.o frwr_ops.o \
+rpcrdma-y := transport.o rpc_rdma.o verbs.o frwr_ops.o ib_client.o \
 	svc_rdma.o svc_rdma_backchannel.o svc_rdma_transport.o \
 	svc_rdma_sendto.o svc_rdma_recvfrom.o svc_rdma_rw.o \
 	svc_rdma_pcl.o module.o
diff --git a/net/sunrpc/xprtrdma/ib_client.c b/net/sunrpc/xprtrdma/ib_client.c
new file mode 100644
index 000000000000..a938c19c3490
--- /dev/null
+++ b/net/sunrpc/xprtrdma/ib_client.c
@@ -0,0 +1,181 @@
+// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
+/*
+ * Copyright (c) 2024 Oracle. All rights reserved.
+ */
+
+/* #include
+#include */
+#include
+#include
+#include
+#include
+
+#include
+#include
+
+#include "xprt_rdma.h"
+#include
+
+/* Per-ib_device private data for rpcrdma */
+struct rpcrdma_device {
+	struct kref rd_kref;
+	unsigned long rd_flags;
+	struct ib_device *rd_device;
+	struct xarray rd_xa;
+	struct completion rd_done;
+};
+
+#define RPCRDMA_RD_F_REMOVING (0)
+
+static struct ib_client rpcrdma_ib_client;
+
+/*
+ * Listeners have no associated device, so we never register them.
+ * Note that ib_get_client_data() does not check if @device is
+ * NULL for us.
+ */
+static struct rpcrdma_device *rpcrdma_get_client_data(struct ib_device *device)
+{
+	if (!device)
+		return NULL;
+	return ib_get_client_data(device, &rpcrdma_ib_client);
+}
+
+/**
+ * rpcrdma_rn_register - register to get device removal notifications
+ * @device: device to monitor
+ * @rn: notification object that wishes to be notified
+ * @done: callback to notify caller of device removal
+ *
+ * Returns zero on success. The callback in rn_done is guaranteed
+ * to be invoked when the device is removed, unless this notification
+ * is unregistered first.
+ *
+ * On failure, a negative errno is returned.
+ */
+int rpcrdma_rn_register(struct ib_device *device,
+			struct rpcrdma_notification *rn,
+			void (*done)(struct rpcrdma_notification *rn))
+{
+	struct rpcrdma_device *rd = rpcrdma_get_client_data(device);
+
+	if (!rd || test_bit(RPCRDMA_RD_F_REMOVING, &rd->rd_flags))
+		return -ENETUNREACH;
+
+	kref_get(&rd->rd_kref);
+	if (xa_alloc(&rd->rd_xa, &rn->rn_index, rn, xa_limit_32b, GFP_KERNEL) < 0)
+		return -ENOMEM;
+	rn->rn_done = done;
+	return 0;
+}
+
+static void rpcrdma_rn_release(struct kref *kref)
+{
+	struct rpcrdma_device *rd = container_of(kref, struct rpcrdma_device,
+						 rd_kref);
+
+	trace_rpcrdma_client_completion(rd->rd_device);
+	complete(&rd->rd_done);
+}
+
+/**
+ * rpcrdma_rn_unregister - stop device removal notifications
+ * @device: monitored device
+ * @rn: notification object that no longer wishes to be notified
+ */
+void rpcrdma_rn_unregister(struct ib_device *device,
+			   struct rpcrdma_notification *rn)
+{
+	struct rpcrdma_device *rd = rpcrdma_get_client_data(device);
+
+	if (!rd)
+		return;
+
+	xa_erase(&rd->rd_xa, rn->rn_index);
+	kref_put(&rd->rd_kref, rpcrdma_rn_release);
+}
+
+/**
+ * rpcrdma_add_one - ib_client device insertion callback
+ * @device: device about to be inserted
+ *
+ * Returns zero on success. xprtrdma private data has been allocated
+ * for this device. On failure, a negative errno is returned.
+ */
+static int rpcrdma_add_one(struct ib_device *device)
+{
+	struct rpcrdma_device *rd;
+
+	rd = kzalloc(sizeof(*rd), GFP_KERNEL);
+	if (!rd)
+		return -ENOMEM;
+
+	kref_init(&rd->rd_kref);
+	xa_init_flags(&rd->rd_xa, XA_FLAGS_ALLOC1);
+	rd->rd_device = device;
+	init_completion(&rd->rd_done);
+	ib_set_client_data(device, &rpcrdma_ib_client, rd);
+
+	trace_rpcrdma_client_add_one(device);
+	return 0;
+}
+
+/**
+ * rpcrdma_remove_one - ib_client device removal callback
+ * @device: device about to be removed
+ * @client_data: this module's private per-device data
+ *
+ * Upon return, all transports associated with @device have divested
+ * themselves from IB hardware resources.
+ */
+static void rpcrdma_remove_one(struct ib_device *device,
+			       void *client_data)
+{
+	struct rpcrdma_device *rd = client_data;
+	struct rpcrdma_notification *rn;
+	unsigned long index;
+
+	trace_rpcrdma_client_remove_one(device);
+
+	set_bit(RPCRDMA_RD_F_REMOVING, &rd->rd_flags);
+	xa_for_each(&rd->rd_xa, index, rn)
+		rn->rn_done(rn);
+
+	/*
+	 * Wait only if there are still outstanding notification
+	 * registrants for this device.
+	 */
+	if (!refcount_dec_and_test(&rd->rd_kref.refcount)) {
+		trace_rpcrdma_client_wait_on(device);
+		wait_for_completion(&rd->rd_done);
+	}
+
+	trace_rpcrdma_client_remove_one_done(device);
+	kfree(rd);
+}
+
+static struct ib_client rpcrdma_ib_client = {
+	.name = "rpcrdma",
+	.add = rpcrdma_add_one,
+	.remove = rpcrdma_remove_one,
+};
+
+/**
+ * rpcrdma_ib_client_unregister - unregister ib_client for xprtrdma
+ *
+ * cel: watch for orphaned rpcrdma_device objects on module unload
+ */
+void rpcrdma_ib_client_unregister(void)
+{
+	ib_unregister_client(&rpcrdma_ib_client);
+}
+
+/**
+ * rpcrdma_ib_client_register - register ib_client for rpcrdma
+ *
+ * Returns zero on success, or a negative errno.
+ */
+int rpcrdma_ib_client_register(void)
+{
+	return ib_register_client(&rpcrdma_ib_client);
+}
diff --git a/net/sunrpc/xprtrdma/module.c b/net/sunrpc/xprtrdma/module.c
index 45c5b41ac8dc..697f571d4c01 100644
--- a/net/sunrpc/xprtrdma/module.c
+++ b/net/sunrpc/xprtrdma/module.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include

 #include

@@ -30,21 +31,32 @@ static void __exit rpc_rdma_cleanup(void)
 {
 	xprt_rdma_cleanup();
 	svc_rdma_cleanup();
+	rpcrdma_ib_client_unregister();
 }

 static int __init rpc_rdma_init(void)
 {
 	int rc;

+	rc = rpcrdma_ib_client_register();
+	if (rc)
+		goto out_rc;
+
 	rc = svc_rdma_init();
 	if (rc)
-		goto out;
+		goto out_ib_client;

 	rc = xprt_rdma_init();
 	if (rc)
-		svc_rdma_cleanup();
+		goto out_svc_rdma;

-out:
+	return 0;
+
+out_svc_rdma:
+	svc_rdma_cleanup();
+out_ib_client:
+	rpcrdma_ib_client_unregister();
+out_rc:
 	return rc;
 }


From patchwork Tue Jun 4 19:45:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13685822
From: cel@kernel.org
To: Trond Myklebust, Anna Schumaker
Cc: Chuck Lever
Subject: [PATCH 3/5] xprtrdma: Handle device removal outside of the CM event handler
Date: Tue, 4 Jun 2024 15:45:25 -0400
Message-ID: <20240604194522.10390-8-cel@kernel.org>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240604194522.10390-6-cel@kernel.org>
References: <20240604194522.10390-6-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

From: Chuck Lever

Wait for all disconnects to complete to ensure the transport has
divested all of its hardware resources before the underlying RDMA
device can be removed.

Signed-off-by: Chuck Lever
Reviewed-by: Sagi Grimberg
---
 include/trace/events/rpcrdma.h  | 23 +++++++++++++++++++++++
 net/sunrpc/xprtrdma/verbs.c     | 23 ++++++++++++++---------
 net/sunrpc/xprtrdma/xprt_rdma.h |  2 ++
 3 files changed, 39 insertions(+), 9 deletions(-)

diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h
index ecdaf088219d..ba2d6a0e41cc 100644
--- a/include/trace/events/rpcrdma.h
+++ b/include/trace/events/rpcrdma.h
@@ -669,6 +669,29 @@ TRACE_EVENT(xprtrdma_inline_thresh,
 DEFINE_CONN_EVENT(connect);
 DEFINE_CONN_EVENT(disconnect);

+TRACE_EVENT(xprtrdma_device_removal,
+	TP_PROTO(
+		const struct rdma_cm_id *id
+	),
+
+	TP_ARGS(id),
+
+	TP_STRUCT__entry(
+		__string(name, id->device->name)
+		__array(unsigned char, addr, sizeof(struct sockaddr_in6))
+	),
+
+	TP_fast_assign(
+		__assign_str(name);
+		memcpy(__entry->addr, &id->route.addr.dst_addr,
+		       sizeof(struct sockaddr_in6));
+	),
+
+	TP_printk("device %s to be removed, disconnecting %pISpc\n",
+		__get_str(name), __entry->addr
+	)
+);
+
 DEFINE_RXPRT_EVENT(xprtrdma_op_inject_dsc);

 TRACE_EVENT(xprtrdma_op_connect,
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index a0b071089e15..04558c99e9f4 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -222,7 +222,6 @@ static void rpcrdma_update_cm_private(struct rpcrdma_ep *ep,
 static int
 rpcrdma_cm_event_handler(struct rdma_cm_id *id, struct rdma_cm_event *event)
 {
-	struct sockaddr *sap = (struct sockaddr *)&id->route.addr.dst_addr;
 	struct rpcrdma_ep *ep = id->context;

 	might_sleep();
@@ -241,14 +240,6 @@ rpcrdma_cm_event_handler(struct rdma_cm_id *id, struct rdma_cm_event *event)
 		ep->re_async_rc = -ENETUNREACH;
 		complete(&ep->re_done);
 		return 0;
-	case RDMA_CM_EVENT_DEVICE_REMOVAL:
-		pr_info("rpcrdma: removing device %s for %pISpc\n",
-			ep->re_id->device->name, sap);
-		switch (xchg(&ep->re_connect_status, -ENODEV)) {
-		case 0: goto wake_connect_worker;
-		case 1: goto disconnected;
-		}
-		return 0;
 	case RDMA_CM_EVENT_ADDR_CHANGE:
 		ep->re_connect_status = -ENODEV;
 		goto disconnected;
@@ -284,6 +275,14 @@ rpcrdma_cm_event_handler(struct rdma_cm_id *id, struct rdma_cm_event *event)
 	return 0;
 }

+static void rpcrdma_ep_removal_done(struct rpcrdma_notification *rn)
+{
+	struct rpcrdma_ep *ep = container_of(rn, struct rpcrdma_ep, re_rn);
+
+	trace_xprtrdma_device_removal(ep->re_id);
+	xprt_force_disconnect(ep->re_xprt);
+}
+
 static struct rdma_cm_id *rpcrdma_create_id(struct rpcrdma_xprt *r_xprt,
 					    struct rpcrdma_ep *ep)
 {
@@ -323,6 +322,10 @@ static struct rdma_cm_id *rpcrdma_create_id(struct rpcrdma_xprt *r_xprt,
 	if (rc)
 		goto out;

+	rc = rpcrdma_rn_register(id->device, &ep->re_rn, rpcrdma_ep_removal_done);
+	if (rc)
+		goto out;
+
 	return id;

 out:
@@ -350,6 +353,8 @@ static void rpcrdma_ep_destroy(struct kref *kref)
 	ib_dealloc_pd(ep->re_pd);
 	ep->re_pd = NULL;

+	rpcrdma_rn_unregister(ep->re_id->device, &ep->re_rn);
+
 	kfree(ep);
 	module_put(THIS_MODULE);
 }
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index da409450dfc0..341725c66ec8 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -56,6 +56,7 @@
 #include /* completion IDs */
 #include /* RPC/RDMA protocol */
 #include /* xprt parameters */
+#include /* removal notifications */

 #define RDMA_RESOLVE_TIMEOUT (5000) /* 5 seconds */
 #define RDMA_CONNECT_RETRY_MAX (2) /* retries if no listener backlog */
@@ -92,6 +93,7 @@ struct rpcrdma_ep {
 	struct rpcrdma_connect_private
 				re_cm_private;
 	struct rdma_conn_param re_remote_cma;
+	struct rpcrdma_notification re_rn;
 	int re_receive_count;
 	unsigned int re_max_requests; /* depends on device */
 	unsigned int re_inline_send; /* negotiated */

From patchwork Tue Jun 4 19:45:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13685823
From: cel@kernel.org
To: Trond Myklebust, Anna Schumaker
Cc: Chuck Lever
Subject: [PATCH 4/5] xprtrdma: Clean up synopsis of frwr_mr_unmap()
Date: Tue, 4 Jun 2024 15:45:26 -0400
Message-ID: <20240604194522.10390-9-cel@kernel.org>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240604194522.10390-6-cel@kernel.org>
References: <20240604194522.10390-6-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

From: Chuck Lever

Commit 7a03aeb66c41 ("xprtrdma: Micro-optimize MR DMA-unmapping")
removed the last use of the @r_xprt parameter in this function, but
neglected to remove the parameter itself.

Signed-off-by: Chuck Lever
Reviewed-by: Sagi Grimberg
---
 net/sunrpc/xprtrdma/frwr_ops.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
index 47f33bb7bff8..31434aeb8e29 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -54,7 +54,7 @@ static void frwr_cid_init(struct rpcrdma_ep *ep,
 	cid->ci_completion_id = mr->mr_ibmr->res.id;
 }

-static void frwr_mr_unmap(struct rpcrdma_xprt *r_xprt, struct rpcrdma_mr *mr)
+static void frwr_mr_unmap(struct rpcrdma_mr *mr)
 {
 	if (mr->mr_device) {
 		trace_xprtrdma_mr_unmap(mr);
@@ -73,7 +73,7 @@ void frwr_mr_release(struct rpcrdma_mr *mr)
 {
 	int rc;

-	frwr_mr_unmap(mr->mr_xprt, mr);
+	frwr_mr_unmap(mr);

 	rc = ib_dereg_mr(mr->mr_ibmr);
 	if (rc)
@@ -84,7 +84,7 @@ void frwr_mr_release(struct rpcrdma_mr *mr)

 static void frwr_mr_put(struct rpcrdma_mr *mr)
 {
-	frwr_mr_unmap(mr->mr_xprt, mr);
+	frwr_mr_unmap(mr);

 	/* The MR is returned to the req's MR free list instead
 	 * of to the xprt's MR free list. No spinlock is needed.

From patchwork Tue Jun 4 19:45:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13685824
From: cel@kernel.org
To: Trond Myklebust, Anna Schumaker
Cc: Chuck Lever
Subject: [PATCH 5/5] xprtrdma: Remove temp allocation of rpcrdma_rep objects
Date: Tue, 4 Jun 2024 15:45:27 -0400
Message-ID: <20240604194522.10390-10-cel@kernel.org>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240604194522.10390-6-cel@kernel.org>
References: <20240604194522.10390-6-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

From: Chuck Lever

The original code was designed so that most calls to
rpcrdma_rep_create() would occur on the NUMA node that the device
preferred. There are a few cases where that's not possible, so
those reps are marked as temporary.

However, we have the device (and its preferred node) already in
rpcrdma_rep_create(), so let's use that to guarantee the memory
is allocated from the correct node.

Signed-off-by: Chuck Lever
Reviewed-by: Sagi Grimberg
---
 net/sunrpc/xprtrdma/rpc_rdma.c  |  3 +-
 net/sunrpc/xprtrdma/verbs.c     | 57 ++++++++++++++-------------------
 net/sunrpc/xprtrdma/xprt_rdma.h |  3 +-
 3 files changed, 26 insertions(+), 37 deletions(-)

diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index 190a4de239c8..1478c41c7e9d 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -1471,8 +1471,7 @@ void rpcrdma_reply_handler(struct rpcrdma_rep *rep)
 		credits = 1; /* don't deadlock */
 	else if (credits > r_xprt->rx_ep->re_max_requests)
 		credits = r_xprt->rx_ep->re_max_requests;
-	rpcrdma_post_recvs(r_xprt, credits + (buf->rb_bc_srv_max_requests << 1),
-			   false);
+	rpcrdma_post_recvs(r_xprt, credits + (buf->rb_bc_srv_max_requests << 1));
 	if (buf->rb_credits != credits)
 		rpcrdma_update_cwnd(r_xprt, credits);

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 04558c99e9f4..110933745e5d 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -69,13 +69,15 @@ static void rpcrdma_sendctx_put_locked(struct rpcrdma_xprt *r_xprt,
 				       struct rpcrdma_sendctx *sc);
 static int rpcrdma_reqs_setup(struct rpcrdma_xprt *r_xprt);
 static void rpcrdma_reqs_reset(struct rpcrdma_xprt *r_xprt);
-static void rpcrdma_rep_destroy(struct rpcrdma_rep *rep);
 static void rpcrdma_reps_unmap(struct rpcrdma_xprt *r_xprt);
 static void rpcrdma_mrs_create(struct rpcrdma_xprt *r_xprt);
 static void rpcrdma_mrs_destroy(struct rpcrdma_xprt *r_xprt);
 static void rpcrdma_ep_get(struct rpcrdma_ep *ep);
 static int rpcrdma_ep_put(struct rpcrdma_ep *ep);
 static struct rpcrdma_regbuf *
+rpcrdma_regbuf_alloc_node(size_t size, enum
dma_data_direction direction, + int node); +static struct rpcrdma_regbuf * rpcrdma_regbuf_alloc(size_t size, enum dma_data_direction direction); static void rpcrdma_regbuf_dma_unmap(struct rpcrdma_regbuf *rb); static void rpcrdma_regbuf_free(struct rpcrdma_regbuf *rb); @@ -510,7 +512,7 @@ int rpcrdma_xprt_connect(struct rpcrdma_xprt *r_xprt) * outstanding Receives. */ rpcrdma_ep_get(ep); - rpcrdma_post_recvs(r_xprt, 1, true); + rpcrdma_post_recvs(r_xprt, 1); rc = rdma_connect(ep->re_id, &ep->re_remote_cma); if (rc) @@ -943,18 +945,20 @@ static void rpcrdma_reqs_reset(struct rpcrdma_xprt *r_xprt) } static noinline -struct rpcrdma_rep *rpcrdma_rep_create(struct rpcrdma_xprt *r_xprt, - bool temp) +struct rpcrdma_rep *rpcrdma_rep_create(struct rpcrdma_xprt *r_xprt) { struct rpcrdma_buffer *buf = &r_xprt->rx_buf; + struct rpcrdma_ep *ep = r_xprt->rx_ep; + struct ib_device *device = ep->re_id->device; struct rpcrdma_rep *rep; rep = kzalloc(sizeof(*rep), XPRTRDMA_GFP_FLAGS); if (rep == NULL) goto out; - rep->rr_rdmabuf = rpcrdma_regbuf_alloc(r_xprt->rx_ep->re_inline_recv, - DMA_FROM_DEVICE); + rep->rr_rdmabuf = rpcrdma_regbuf_alloc_node(ep->re_inline_recv, + DMA_FROM_DEVICE, + ibdev_to_node(device)); if (!rep->rr_rdmabuf) goto out_free; @@ -969,7 +973,6 @@ struct rpcrdma_rep *rpcrdma_rep_create(struct rpcrdma_xprt *r_xprt, rep->rr_recv_wr.wr_cqe = &rep->rr_cqe; rep->rr_recv_wr.sg_list = &rep->rr_rdmabuf->rg_iov; rep->rr_recv_wr.num_sge = 1; - rep->rr_temp = temp; spin_lock(&buf->rb_lock); list_add(&rep->rr_all, &buf->rb_all_reps); @@ -988,17 +991,6 @@ static void rpcrdma_rep_free(struct rpcrdma_rep *rep) kfree(rep); } -static void rpcrdma_rep_destroy(struct rpcrdma_rep *rep) -{ - struct rpcrdma_buffer *buf = &rep->rr_rxprt->rx_buf; - - spin_lock(&buf->rb_lock); - list_del(&rep->rr_all); - spin_unlock(&buf->rb_lock); - - rpcrdma_rep_free(rep); -} - static struct rpcrdma_rep *rpcrdma_rep_get_locked(struct rpcrdma_buffer *buf) { struct llist_node *node; @@ -1030,10 +1022,8 
@@ static void rpcrdma_reps_unmap(struct rpcrdma_xprt *r_xprt) struct rpcrdma_buffer *buf = &r_xprt->rx_buf; struct rpcrdma_rep *rep; - list_for_each_entry(rep, &buf->rb_all_reps, rr_all) { + list_for_each_entry(rep, &buf->rb_all_reps, rr_all) rpcrdma_regbuf_dma_unmap(rep->rr_rdmabuf); - rep->rr_temp = true; /* Mark this rep for destruction */ - } } static void rpcrdma_reps_destroy(struct rpcrdma_buffer *buf) @@ -1250,14 +1240,15 @@ void rpcrdma_buffer_put(struct rpcrdma_buffer *buffers, struct rpcrdma_req *req) * or Replies they may be registered externally via frwr_map. */ static struct rpcrdma_regbuf * -rpcrdma_regbuf_alloc(size_t size, enum dma_data_direction direction) +rpcrdma_regbuf_alloc_node(size_t size, enum dma_data_direction direction, + int node) { struct rpcrdma_regbuf *rb; - rb = kmalloc(sizeof(*rb), XPRTRDMA_GFP_FLAGS); + rb = kmalloc_node(sizeof(*rb), XPRTRDMA_GFP_FLAGS, node); if (!rb) return NULL; - rb->rg_data = kmalloc(size, XPRTRDMA_GFP_FLAGS); + rb->rg_data = kmalloc_node(size, XPRTRDMA_GFP_FLAGS, node); if (!rb->rg_data) { kfree(rb); return NULL; @@ -1269,6 +1260,12 @@ rpcrdma_regbuf_alloc(size_t size, enum dma_data_direction direction) return rb; } +static struct rpcrdma_regbuf * +rpcrdma_regbuf_alloc(size_t size, enum dma_data_direction direction) +{ + return rpcrdma_regbuf_alloc_node(size, direction, NUMA_NO_NODE); +} + /** * rpcrdma_regbuf_realloc - re-allocate a SEND/RECV buffer * @rb: regbuf to reallocate @@ -1346,10 +1343,9 @@ static void rpcrdma_regbuf_free(struct rpcrdma_regbuf *rb) * rpcrdma_post_recvs - Refill the Receive Queue * @r_xprt: controlling transport instance * @needed: current credit grant - * @temp: mark Receive buffers to be deleted after one use * */ -void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed, bool temp) +void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed) { struct rpcrdma_buffer *buf = &r_xprt->rx_buf; struct rpcrdma_ep *ep = r_xprt->rx_ep; @@ -1363,8 +1359,7 @@ void 
rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed, bool temp) if (likely(ep->re_receive_count > needed)) goto out; needed -= ep->re_receive_count; - if (!temp) - needed += RPCRDMA_MAX_RECV_BATCH; + needed += RPCRDMA_MAX_RECV_BATCH; if (atomic_inc_return(&ep->re_receiving) > 1) goto out; @@ -1373,12 +1368,8 @@ void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed, bool temp) wr = NULL; while (needed) { rep = rpcrdma_rep_get_locked(buf); - if (rep && rep->rr_temp) { - rpcrdma_rep_destroy(rep); - continue; - } if (!rep) - rep = rpcrdma_rep_create(r_xprt, temp); + rep = rpcrdma_rep_create(r_xprt); if (!rep) break; if (!rpcrdma_regbuf_dma_map(r_xprt, rep->rr_rdmabuf)) { diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h index 341725c66ec8..8147d2b41494 100644 --- a/net/sunrpc/xprtrdma/xprt_rdma.h +++ b/net/sunrpc/xprtrdma/xprt_rdma.h @@ -200,7 +200,6 @@ struct rpcrdma_rep { __be32 rr_proc; int rr_wc_flags; u32 rr_inv_rkey; - bool rr_temp; struct rpcrdma_regbuf *rr_rdmabuf; struct rpcrdma_xprt *rr_rxprt; struct rpc_rqst *rr_rqst; @@ -468,7 +467,7 @@ void rpcrdma_flush_disconnect(struct rpcrdma_xprt *r_xprt, struct ib_wc *wc); int rpcrdma_xprt_connect(struct rpcrdma_xprt *r_xprt); void rpcrdma_xprt_disconnect(struct rpcrdma_xprt *r_xprt); -void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed, bool temp); +void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed); /* * Buffer calls - xprtrdma/verbs.c