From patchwork Wed Mar 20 22:55:16 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13598250
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 01/17] io_uring/net: switch io_send() and io_send_zc() to using io_async_msghdr
Date: Wed, 20 Mar 2024 16:55:16 -0600
Message-ID: <20240320225750.1769647-2-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk>
References: <20240320225750.1769647-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

No functional changes in this patch, just in preparation for carrying
more state than we have now, if necessary. While unifying some of this
code, add a generic send setup prep handler that they can both use.

This gets rid of some manual msghdr and sockaddr on the stack, and makes
it look a bit more like the sendmsg/recvmsg variants. We can probably
unify a bit more on top of this going forward.

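As a rough illustration of the shared setup handler described above, here is a minimal userspace sketch. It is not the kernel code: `demo_req`, `demo_msg`, `demo_send_setup()` and the lowercase `err_ptr()`/`is_err()`/`ptr_err()` helpers are hypothetical stand-ins (the kernel provides `ERR_PTR()`, `IS_ERR()` and `PTR_ERR()`), and the `-EFAULT` check merely stands in for a failed buffer import. The sketch shows one helper that either reuses state saved for a retry or fills caller-provided stack storage, and reports failure through an encoded pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). */
#define MAX_ERRNO 4095

static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Hypothetical, heavily simplified per-request message state. */
struct demo_msg {
	const void *buf;
	size_t len;
};

/* Hypothetical request: async_data is set once state was saved for a retry. */
struct demo_req {
	struct demo_msg *async_data;
	const void *buf;
	size_t len;
};

/*
 * Shared setup helper: prefer previously saved state, otherwise fill the
 * caller's stack storage. Errors come back as an encoded pointer.
 */
static struct demo_msg *demo_send_setup(struct demo_req *req,
					struct demo_msg *stack_msg)
{
	assert(stack_msg != NULL);

	if (req->async_data)
		return req->async_data;

	/* Stand-in for a failed buffer import. */
	if (!req->buf && req->len)
		return err_ptr(-EFAULT);

	memset(stack_msg, 0, sizeof(*stack_msg));
	stack_msg->buf = req->buf;
	stack_msg->len = req->len;
	return stack_msg;
}
```

A caller would test the result with `is_err()` and return `ptr_err()` on failure, which mirrors how the issue paths in the diff below consume the return value of the setup helper.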
Signed-off-by: Jens Axboe --- io_uring/net.c | 196 ++++++++++++++++++++++++----------------------- io_uring/opdef.c | 1 + 2 files changed, 103 insertions(+), 94 deletions(-) diff --git a/io_uring/net.c b/io_uring/net.c index ed798e185bbf..a16838c0c837 100644 --- a/io_uring/net.c +++ b/io_uring/net.c @@ -322,36 +322,25 @@ static int io_sendmsg_copy_hdr(struct io_kiocb *req, int io_send_prep_async(struct io_kiocb *req) { - struct io_sr_msg *zc = io_kiocb_to_cmd(req, struct io_sr_msg); + struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); struct io_async_msghdr *io; int ret; if (req_has_async_data(req)) return 0; - zc->done_io = 0; - if (!zc->addr) + sr->done_io = 0; + if (!sr->addr) return 0; io = io_msg_alloc_async_prep(req); if (!io) return -ENOMEM; - ret = move_addr_to_kernel(zc->addr, zc->addr_len, &io->addr); - return ret; -} - -static int io_setup_async_addr(struct io_kiocb *req, - struct sockaddr_storage *addr_storage, - unsigned int issue_flags) -{ - struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr *io; - - if (!sr->addr || req_has_async_data(req)) - return -EAGAIN; - io = io_msg_alloc_async(req, issue_flags); - if (!io) - return -ENOMEM; - memcpy(&io->addr, addr_storage, sizeof(io->addr)); - return -EAGAIN; + memset(&io->msg, 0, sizeof(io->msg)); + ret = import_ubuf(ITER_SOURCE, sr->buf, sr->len, &io->msg.msg_iter); + if (unlikely(ret)) + return ret; + io->msg.msg_name = &io->addr; + io->msg.msg_namelen = sr->addr_len; + return move_addr_to_kernel(sr->addr, sr->addr_len, &io->addr); } int io_sendmsg_prep_async(struct io_kiocb *req) @@ -475,45 +464,68 @@ int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags) return IOU_OK; } -int io_send(struct io_kiocb *req, unsigned int issue_flags) +static struct io_async_msghdr *io_send_setup(struct io_kiocb *req, + struct io_async_msghdr *stack_msg, + unsigned int issue_flags) { - struct sockaddr_storage __address; struct io_sr_msg *sr = io_kiocb_to_cmd(req, 
struct io_sr_msg); - struct msghdr msg; - struct socket *sock; - unsigned flags; - int min_ret = 0; + struct io_async_msghdr *kmsg; int ret; - msg.msg_name = NULL; - msg.msg_control = NULL; - msg.msg_controllen = 0; - msg.msg_namelen = 0; - msg.msg_ubuf = NULL; - - if (sr->addr) { - if (req_has_async_data(req)) { - struct io_async_msghdr *io = req->async_data; - - msg.msg_name = &io->addr; - } else { - ret = move_addr_to_kernel(sr->addr, sr->addr_len, &__address); + if (req_has_async_data(req)) { + kmsg = req->async_data; + } else { + kmsg = stack_msg; + kmsg->free_iov = NULL; + kmsg->msg.msg_name = NULL; + kmsg->msg.msg_namelen = 0; + kmsg->msg.msg_control = NULL; + kmsg->msg.msg_controllen = 0; + kmsg->msg.msg_ubuf = NULL; + + if (sr->addr) { + ret = move_addr_to_kernel(sr->addr, sr->addr_len, + &kmsg->addr); if (unlikely(ret < 0)) - return ret; - msg.msg_name = (struct sockaddr *)&__address; + return ERR_PTR(ret); + kmsg->msg.msg_name = &kmsg->addr; + kmsg->msg.msg_namelen = sr->addr_len; + } + + if (!io_do_buffer_select(req)) { + ret = import_ubuf(ITER_SOURCE, sr->buf, sr->len, + &kmsg->msg.msg_iter); + if (unlikely(ret)) + return ERR_PTR(ret); } - msg.msg_namelen = sr->addr_len; } if (!(req->flags & REQ_F_POLLED) && (sr->flags & IORING_RECVSEND_POLL_FIRST)) - return io_setup_async_addr(req, &__address, issue_flags); + return ERR_PTR(io_setup_async_msg(req, kmsg, issue_flags)); + + return kmsg; +} + +int io_send(struct io_kiocb *req, unsigned int issue_flags) +{ + struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); + struct io_async_msghdr iomsg, *kmsg; + size_t len = sr->len; + struct socket *sock; + unsigned flags; + int min_ret = 0; + int ret; sock = sock_from_file(req->file); if (unlikely(!sock)) return -ENOTSOCK; - ret = import_ubuf(ITER_SOURCE, sr->buf, sr->len, &msg.msg_iter); + kmsg = io_send_setup(req, &iomsg, issue_flags); + if (IS_ERR(kmsg)) + return PTR_ERR(kmsg); + + ret = import_ubuf(ITER_SOURCE, sr->buf, len, &kmsg->msg.msg_iter); if 
(unlikely(ret)) return ret; @@ -521,21 +533,21 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags) if (issue_flags & IO_URING_F_NONBLOCK) flags |= MSG_DONTWAIT; if (flags & MSG_WAITALL) - min_ret = iov_iter_count(&msg.msg_iter); + min_ret = iov_iter_count(&kmsg->msg.msg_iter); flags &= ~MSG_INTERNAL_SENDMSG_FLAGS; - msg.msg_flags = flags; - ret = sock_sendmsg(sock, &msg); + kmsg->msg.msg_flags = flags; + ret = sock_sendmsg(sock, &kmsg->msg); if (ret < min_ret) { if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK)) - return io_setup_async_addr(req, &__address, issue_flags); + return io_setup_async_msg(req, kmsg, issue_flags); if (ret > 0 && io_net_retry(sock, flags)) { sr->len -= ret; sr->buf += ret; sr->done_io += ret; req->flags |= REQ_F_BL_NO_RECYCLE; - return io_setup_async_addr(req, &__address, issue_flags); + return io_setup_async_msg(req, kmsg, issue_flags); } if (ret == -ERESTARTSYS) ret = -EINTR; @@ -545,6 +557,7 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags) ret += sr->done_io; else if (sr->done_io) ret = sr->done_io; + io_req_msg_cleanup(req, kmsg, issue_flags); io_req_set_res(req, ret, 0); return IOU_OK; } @@ -1158,11 +1171,35 @@ static int io_sg_from_iter(struct sock *sk, struct sk_buff *skb, return ret; } +static int io_send_zc_import(struct io_kiocb *req, struct io_async_msghdr *kmsg) +{ + struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); + int ret; + + if (sr->flags & IORING_RECVSEND_FIXED_BUF) { + ret = io_import_fixed(ITER_SOURCE, &kmsg->msg.msg_iter, req->imu, + (u64)(uintptr_t)sr->buf, sr->len); + if (unlikely(ret)) + return ret; + kmsg->msg.sg_from_iter = io_sg_from_iter; + } else { + io_notif_set_extended(sr->notif); + ret = import_ubuf(ITER_SOURCE, sr->buf, sr->len, &kmsg->msg.msg_iter); + if (unlikely(ret)) + return ret; + ret = io_notif_account_mem(sr->notif, sr->len); + if (unlikely(ret)) + return ret; + kmsg->msg.sg_from_iter = io_sg_from_iter_iovec; + } + + return ret; +} + int 
io_send_zc(struct io_kiocb *req, unsigned int issue_flags) { - struct sockaddr_storage __address; struct io_sr_msg *zc = io_kiocb_to_cmd(req, struct io_sr_msg); - struct msghdr msg; + struct io_async_msghdr iomsg, *kmsg; struct socket *sock; unsigned msg_flags; int ret, min_ret = 0; @@ -1173,67 +1210,37 @@ int io_send_zc(struct io_kiocb *req, unsigned int issue_flags) if (!test_bit(SOCK_SUPPORT_ZC, &sock->flags)) return -EOPNOTSUPP; - msg.msg_name = NULL; - msg.msg_control = NULL; - msg.msg_controllen = 0; - msg.msg_namelen = 0; - - if (zc->addr) { - if (req_has_async_data(req)) { - struct io_async_msghdr *io = req->async_data; - - msg.msg_name = &io->addr; - } else { - ret = move_addr_to_kernel(zc->addr, zc->addr_len, &__address); - if (unlikely(ret < 0)) - return ret; - msg.msg_name = (struct sockaddr *)&__address; - } - msg.msg_namelen = zc->addr_len; - } - - if (!(req->flags & REQ_F_POLLED) && - (zc->flags & IORING_RECVSEND_POLL_FIRST)) - return io_setup_async_addr(req, &__address, issue_flags); + kmsg = io_send_setup(req, &iomsg, issue_flags); + if (IS_ERR(kmsg)) + return PTR_ERR(kmsg); - if (zc->flags & IORING_RECVSEND_FIXED_BUF) { - ret = io_import_fixed(ITER_SOURCE, &msg.msg_iter, req->imu, - (u64)(uintptr_t)zc->buf, zc->len); - if (unlikely(ret)) - return ret; - msg.sg_from_iter = io_sg_from_iter; - } else { - io_notif_set_extended(zc->notif); - ret = import_ubuf(ITER_SOURCE, zc->buf, zc->len, &msg.msg_iter); + if (!zc->done_io) { + ret = io_send_zc_import(req, kmsg); if (unlikely(ret)) return ret; - ret = io_notif_account_mem(zc->notif, zc->len); - if (unlikely(ret)) - return ret; - msg.sg_from_iter = io_sg_from_iter_iovec; } msg_flags = zc->msg_flags | MSG_ZEROCOPY; if (issue_flags & IO_URING_F_NONBLOCK) msg_flags |= MSG_DONTWAIT; if (msg_flags & MSG_WAITALL) - min_ret = iov_iter_count(&msg.msg_iter); + min_ret = iov_iter_count(&kmsg->msg.msg_iter); msg_flags &= ~MSG_INTERNAL_SENDMSG_FLAGS; - msg.msg_flags = msg_flags; - msg.msg_ubuf = 
&io_notif_to_data(zc->notif)->uarg; - ret = sock_sendmsg(sock, &msg); + kmsg->msg.msg_flags = msg_flags; + kmsg->msg.msg_ubuf = &io_notif_to_data(zc->notif)->uarg; + ret = sock_sendmsg(sock, &kmsg->msg); if (unlikely(ret < min_ret)) { if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK)) - return io_setup_async_addr(req, &__address, issue_flags); + return io_setup_async_msg(req, kmsg, issue_flags); - if (ret > 0 && io_net_retry(sock, msg.msg_flags)) { + if (ret > 0 && io_net_retry(sock, kmsg->msg.msg_flags)) { zc->len -= ret; zc->buf += ret; zc->done_io += ret; req->flags |= REQ_F_BL_NO_RECYCLE; - return io_setup_async_addr(req, &__address, issue_flags); + return io_setup_async_msg(req, kmsg, issue_flags); } if (ret == -ERESTARTSYS) ret = -EINTR; @@ -1251,6 +1258,7 @@ int io_send_zc(struct io_kiocb *req, unsigned int issue_flags) */ if (!(issue_flags & IO_URING_F_UNLOCKED)) { io_notif_flush(zc->notif); + io_netmsg_recycle(req, issue_flags); req->flags &= ~REQ_F_NEED_CLEANUP; } io_req_set_res(req, ret, IORING_CQE_F_MORE); diff --git a/io_uring/opdef.c b/io_uring/opdef.c index 9c080aadc5a6..b0a990c6bbff 100644 --- a/io_uring/opdef.c +++ b/io_uring/opdef.c @@ -602,6 +602,7 @@ const struct io_cold_def io_cold_defs[] = { .name = "SEND", #if defined(CONFIG_NET) .async_size = sizeof(struct io_async_msghdr), + .cleanup = io_sendmsg_recvmsg_cleanup, .fail = io_sendrecv_fail, .prep_async = io_send_prep_async, #endif

From patchwork Wed Mar 20 22:55:17 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13598251
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 02/17] io_uring/net: switch io_recv() to using io_async_msghdr
Date: Wed, 20 Mar 2024 16:55:17 -0600
Message-ID: <20240320225750.1769647-3-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk>
References: <20240320225750.1769647-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

No functional changes in this patch, just in preparation for carrying
more state than we have now, if necessary.

Signed-off-by: Jens Axboe --- io_uring/net.c | 75 ++++++++++++++++++++++++++++++------------------ io_uring/net.h | 2 +- io_uring/opdef.c | 7 +++-- 3 files changed, 53 insertions(+), 31 deletions(-) diff --git a/io_uring/net.c b/io_uring/net.c index a16838c0c837..d571115f4909 100644 --- a/io_uring/net.c +++ b/io_uring/net.c @@ -320,7 +320,7 @@ static int io_sendmsg_copy_hdr(struct io_kiocb *req, return ret; } -int io_send_prep_async(struct io_kiocb *req) +int io_sendrecv_prep_async(struct io_kiocb *req) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); struct io_async_msghdr *io; @@ -705,13 +705,13 @@ static inline void io_recv_prep_retry(struct io_kiocb *req) * again (for multishot). */ static inline bool io_recv_finish(struct io_kiocb *req, int *ret, - struct msghdr *msg, bool mshot_finished, - unsigned issue_flags) + struct io_async_msghdr *kmsg, + bool mshot_finished, unsigned issue_flags) { unsigned int cflags; cflags = io_put_kbuf(req, issue_flags); - if (msg->msg_inq > 0) + if (kmsg->msg.msg_inq > 0) cflags |= IORING_CQE_F_SOCK_NONEMPTY; /* @@ -725,7 +725,7 @@ static inline bool io_recv_finish(struct io_kiocb *req, int *ret, io_recv_prep_retry(req); /* Known not-empty or unknown state, retry */ - if (cflags & IORING_CQE_F_SOCK_NONEMPTY || msg->msg_inq < 0) { + if (cflags & IORING_CQE_F_SOCK_NONEMPTY || kmsg->msg.msg_inq < 0) { if (sr->nr_multishot_loops++ < MULTISHOT_MAX_RETRY) return false; /* mshot retries exceeded, force a requeue */ @@ -926,7 +926,7 @@ int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags) else io_kbuf_recycle(req, issue_flags); - if (!io_recv_finish(req, &ret, &kmsg->msg, mshot_finished, issue_flags)) + if (!io_recv_finish(req, &ret, kmsg, mshot_finished, issue_flags)) goto retry_multishot; if (mshot_finished) @@
-940,29 +940,42 @@ int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags) int io_recv(struct io_kiocb *req, unsigned int issue_flags) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - struct msghdr msg; + struct io_async_msghdr iomsg, *kmsg; struct socket *sock; unsigned flags; int ret, min_ret = 0; bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK; size_t len = sr->len; + if (req_has_async_data(req)) { + kmsg = req->async_data; + } else { + kmsg = &iomsg; + kmsg->free_iov = NULL; + kmsg->msg.msg_name = NULL; + kmsg->msg.msg_namelen = 0; + kmsg->msg.msg_control = NULL; + kmsg->msg.msg_get_inq = 1; + kmsg->msg.msg_controllen = 0; + kmsg->msg.msg_iocb = NULL; + kmsg->msg.msg_ubuf = NULL; + + if (!io_do_buffer_select(req)) { + ret = import_ubuf(ITER_DEST, sr->buf, sr->len, + &kmsg->msg.msg_iter); + if (unlikely(ret)) + return ret; + } + } + if (!(req->flags & REQ_F_POLLED) && (sr->flags & IORING_RECVSEND_POLL_FIRST)) - return -EAGAIN; + return io_setup_async_msg(req, kmsg, issue_flags); sock = sock_from_file(req->file); if (unlikely(!sock)) return -ENOTSOCK; - msg.msg_name = NULL; - msg.msg_namelen = 0; - msg.msg_control = NULL; - msg.msg_get_inq = 1; - msg.msg_controllen = 0; - msg.msg_iocb = NULL; - msg.msg_ubuf = NULL; - flags = sr->msg_flags; if (force_nonblock) flags |= MSG_DONTWAIT; @@ -976,22 +989,23 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags) return -ENOBUFS; sr->buf = buf; sr->len = len; + ret = import_ubuf(ITER_DEST, sr->buf, sr->len, + &kmsg->msg.msg_iter); + if (unlikely(ret)) + goto out_free; } - ret = import_ubuf(ITER_DEST, sr->buf, len, &msg.msg_iter); - if (unlikely(ret)) - goto out_free; - - msg.msg_inq = -1; - msg.msg_flags = 0; + kmsg->msg.msg_inq = -1; + kmsg->msg.msg_flags = 0; if (flags & MSG_WAITALL) - min_ret = iov_iter_count(&msg.msg_iter); + min_ret = iov_iter_count(&kmsg->msg.msg_iter); - ret = sock_recvmsg(sock, &msg, flags); + ret = sock_recvmsg(sock, &kmsg->msg, flags); if (ret < min_ret) 
{ if (ret == -EAGAIN && force_nonblock) { - if (issue_flags & IO_URING_F_MULTISHOT) { + ret = io_setup_async_msg(req, kmsg, issue_flags); + if (ret == -EAGAIN && issue_flags & IO_URING_F_MULTISHOT) { io_kbuf_recycle(req, issue_flags); return IOU_ISSUE_SKIP_COMPLETE; } @@ -1003,12 +1017,12 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags) sr->buf += ret; sr->done_io += ret; req->flags |= REQ_F_BL_NO_RECYCLE; - return -EAGAIN; + return io_setup_async_msg(req, kmsg, issue_flags); } if (ret == -ERESTARTSYS) ret = -EINTR; req_set_fail(req); - } else if ((flags & MSG_WAITALL) && (msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) { + } else if ((flags & MSG_WAITALL) && (kmsg->msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) { out_free: req_set_fail(req); } @@ -1020,9 +1034,14 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags) else io_kbuf_recycle(req, issue_flags); - if (!io_recv_finish(req, &ret, &msg, ret <= 0, issue_flags)) + if (!io_recv_finish(req, &ret, kmsg, ret <= 0, issue_flags)) goto retry_multishot; + if (ret == -EAGAIN) + return io_setup_async_msg(req, kmsg, issue_flags); + else if (ret != IOU_OK && ret != IOU_STOP_MULTISHOT) + io_req_msg_cleanup(req, kmsg, issue_flags); + return ret; } diff --git a/io_uring/net.h b/io_uring/net.h index 191009979bcb..5c1230f1aaf9 100644 --- a/io_uring/net.h +++ b/io_uring/net.h @@ -40,7 +40,7 @@ int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags); int io_send(struct io_kiocb *req, unsigned int issue_flags); -int io_send_prep_async(struct io_kiocb *req); +int io_sendrecv_prep_async(struct io_kiocb *req); int io_recvmsg_prep_async(struct io_kiocb *req); int io_recvmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); diff --git a/io_uring/opdef.c b/io_uring/opdef.c index b0a990c6bbff..77131826d603 100644 --- a/io_uring/opdef.c +++ b/io_uring/opdef.c @@ -604,13 +604,16 @@ const struct io_cold_def io_cold_defs[] = { 
.async_size = sizeof(struct io_async_msghdr), .cleanup = io_sendmsg_recvmsg_cleanup, .fail = io_sendrecv_fail, - .prep_async = io_send_prep_async, + .prep_async = io_sendrecv_prep_async, #endif }, [IORING_OP_RECV] = { .name = "RECV", #if defined(CONFIG_NET) + .async_size = sizeof(struct io_async_msghdr), + .cleanup = io_sendmsg_recvmsg_cleanup, .fail = io_sendrecv_fail, + .prep_async = io_sendrecv_prep_async, #endif }, [IORING_OP_OPENAT2] = { @@ -687,7 +690,7 @@ const struct io_cold_def io_cold_defs[] = { .name = "SEND_ZC", #if defined(CONFIG_NET) .async_size = sizeof(struct io_async_msghdr), - .prep_async = io_send_prep_async, + .prep_async = io_sendrecv_prep_async, .cleanup = io_send_zc_cleanup, .fail = io_sendrecv_fail, #endif

From patchwork Wed Mar 20 22:55:18 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13598252
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 03/17] io_uring/net: unify cleanup handling
Date: Wed, 20 Mar 2024 16:55:18 -0600
Message-ID: <20240320225750.1769647-4-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk>
References: <20240320225750.1769647-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

Now that recv/recvmsg both do the same cleanup, put it in the retry and
finish handlers.

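The idea of moving cleanup into the retry and finish handlers can be sketched in plain userspace C. This is an illustration only, under simplifying assumptions: `demo_kmsg`, `demo_prep_retry()`, `demo_cleanup()` and `demo_recv_finish()` are hypothetical names, `free()` stands in for the kernel's `kfree()`, and the real finish handler also deals with multishot state and completion flags, which are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-request message state. */
struct demo_kmsg {
	void *free_iov;	/* heap-allocated iovec array, or NULL */
	int cleanups;	/* counts full cleanups, for illustration only */
};

/* Retry path: drop the old iovec so the next attempt starts clean. */
static void demo_prep_retry(struct demo_kmsg *kmsg)
{
	free(kmsg->free_iov);	/* free(NULL) is a no-op */
	kmsg->free_iov = NULL;
}

/* Final path: release everything the request still owns. */
static void demo_cleanup(struct demo_kmsg *kmsg)
{
	free(kmsg->free_iov);
	kmsg->free_iov = NULL;
	kmsg->cleanups++;
}

/*
 * One finish helper shared by both receive variants. Returns true when
 * the request is complete, false when the caller should retry.
 */
static bool demo_recv_finish(struct demo_kmsg *kmsg, bool finished)
{
	assert(kmsg != NULL);

	if (!finished) {
		demo_prep_retry(kmsg);
		return false;
	}
	demo_cleanup(kmsg);
	return true;
}
```

With the cleanup owned by the shared helper, the two callers no longer need their own trailing cleanup branches, which is roughly what the removed hunks at the end of the diff below reflect.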
Signed-off-by: Jens Axboe --- io_uring/net.c | 26 +++++++++++--------------- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/io_uring/net.c b/io_uring/net.c index d571115f4909..2df59fb19a15 100644 --- a/io_uring/net.c +++ b/io_uring/net.c @@ -688,10 +688,16 @@ int io_recvmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) return 0; } -static inline void io_recv_prep_retry(struct io_kiocb *req) +static inline void io_recv_prep_retry(struct io_kiocb *req, + struct io_async_msghdr *kmsg) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); + if (kmsg->free_iov) { + kfree(kmsg->free_iov); + kmsg->free_iov = NULL; + } + req->flags &= ~REQ_F_BL_EMPTY; sr->done_io = 0; sr->len = 0; /* get from the provided buffer */ @@ -723,7 +729,7 @@ static inline bool io_recv_finish(struct io_kiocb *req, int *ret, struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); int mshot_retry_ret = IOU_ISSUE_SKIP_COMPLETE; - io_recv_prep_retry(req); + io_recv_prep_retry(req, kmsg); /* Known not-empty or unknown state, retry */ if (cflags & IORING_CQE_F_SOCK_NONEMPTY || kmsg->msg.msg_inq < 0) { if (sr->nr_multishot_loops++ < MULTISHOT_MAX_RETRY) @@ -732,10 +738,9 @@ static inline bool io_recv_finish(struct io_kiocb *req, int *ret, sr->nr_multishot_loops = 0; mshot_retry_ret = IOU_REQUEUE; } - if (issue_flags & IO_URING_F_MULTISHOT) + *ret = io_setup_async_msg(req, kmsg, issue_flags); + if (*ret == -EAGAIN && issue_flags & IO_URING_F_MULTISHOT) *ret = mshot_retry_ret; - else - *ret = -EAGAIN; return true; } @@ -746,6 +751,7 @@ static inline bool io_recv_finish(struct io_kiocb *req, int *ret, *ret = IOU_STOP_MULTISHOT; else *ret = IOU_OK; + io_req_msg_cleanup(req, kmsg, issue_flags); return true; } @@ -929,11 +935,6 @@ int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags) if (!io_recv_finish(req, &ret, kmsg, mshot_finished, issue_flags)) goto retry_multishot; - if (mshot_finished) - io_req_msg_cleanup(req, kmsg, issue_flags); - else if (ret == 
-EAGAIN) - return io_setup_async_msg(req, kmsg, issue_flags); - return ret; } @@ -1037,11 +1038,6 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags) if (!io_recv_finish(req, &ret, kmsg, ret <= 0, issue_flags)) goto retry_multishot; - if (ret == -EAGAIN) - return io_setup_async_msg(req, kmsg, issue_flags); - else if (ret != IOU_OK && ret != IOU_STOP_MULTISHOT) - io_req_msg_cleanup(req, kmsg, issue_flags); - return ret; } From patchwork Wed Mar 20 22:55:19 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13598253 Received: from mail-il1-f176.google.com (mail-il1-f176.google.com [209.85.166.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8BF0385C65 for ; Wed, 20 Mar 2024 22:58:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.166.176 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975486; cv=none; b=hHTQm7I4hA7DRiIFMjHxZrvHIjnSLbokWa5rZb+AIdFn/XZmLAw2LGmTtYCbLF6Wlc1Ox1oxUPCOX4CKzgj5rCImtFrym1WCl5CyvDKma39aA5RS65G0YZZQkP6Wi5rxY9+pd240rYK58AlxVU3et5qflEUaj33J4s329LdDrAw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975486; c=relaxed/simple; bh=HwbhZHPeos+lFR/ZyqIYdS0mIbBpKLUmTtkyrXBeJ6E=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=CdL4eU+hy2hJpZvI9mhPUvAy99P7Ge6ma+c5FPwfid24Vk5LDJH/WX8a6m5LAZKBpEXJBhBt9sOTw1wuDu2ld594zOQY2TXQGS3EiUPnO6j4OU3uj4e9iP43kBOn7SMkEXI+tLjrXSLODg5PE5RDaonNt5S28NZDkn+yBxgZ1ag= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=dqn3xYOS; arc=none 
smtp.client-ip=209.85.166.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="dqn3xYOS" Received: by mail-il1-f176.google.com with SMTP id e9e14a558f8ab-367c7daa395so273115ab.1 for ; Wed, 20 Mar 2024 15:58:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1710975482; x=1711580282; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=7YrHWAoNOnRsYzp8SwgXG4DXgNTDASp6CZ/nb6mbJro=; b=dqn3xYOSsPtydSCCVqioWp8oDEO6SG5Zzx/XmWNJoes1RZ57X+gdBSqULriL/tqm7l m4E6YBQWwJVa0IB9ij3SGz2npcW6LFNkqZiSA1PABU2F4rSV65TgbCnR37n5z3HgJK5t oGOt0WFzF9ypszEG2lPnyejsmQSw2FxCPCW/dgG7m6Utz3IZaklhABbh8yOQIAz7sToJ o2JZlgLUQXPG4uykCtyFLzo8FxlIBYihCE15kLSU7Nuersb9zQ6Ps6IppwvxFhslFMzY pkbSZGRlc5TzwxI3dpyNm+B3nQnA7nGeIQ+/dXyozXpS6exaxoQ+21pcK1AJAJe45cJX WqZw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710975482; x=1711580282; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=7YrHWAoNOnRsYzp8SwgXG4DXgNTDASp6CZ/nb6mbJro=; b=hgafr8UCy3bRGu5F/BLs+1kXNyq+Nw9EPLASGGOPZR6Eb8KVjlrcVjV/GKMzWge7aE VG3cksvGQUJTpiPe3+gI6UzSyajP2j/LmctXa2mG+GJxF9g3cTJUfHuUUihNG/oUnxhC C09GSzYy2x6L09GorSjaxyr5wKOZGTBbcNGyAagXQ6mRZlzCi3JG6X60pUyVbpqbdEJN R1y+YC3LV2BwhKuVzUDRvqZmfxP6R3En78VzhtpLLdA7gcuQRy8okmJQRToBG2av+jG+ Kj3KSx6ZX9HLJpa7qjcdew8y0sAW2iy3muQBe1yS4YmjIF2bG6il5PPH0LAnrzIrobuY hVNA== X-Gm-Message-State: AOJu0Yyaun6l9YFVHfkrDIWvnxOBOhIvmrN1gqiRBBic9Qb8UBCp2s8g 
ZcT5L8khbFmI8XpCS8Cd/oqihjcKPlpqTSQGTFvHdfL3GqhZTzamUNS2SK4xoDXIeYwMCBaLDhb T X-Google-Smtp-Source: AGHT+IEwTcG97u/QpWvP8cmfVpjqt8CFWI7T82GrFzx1ZKQD3zs9oqKZSbi80ivVesC0FTBSA8oRUQ== X-Received: by 2002:a05:6602:5c2:b0:7ce:f921:6a42 with SMTP id w2-20020a05660205c200b007cef9216a42mr6207869iox.0.1710975482095; Wed, 20 Mar 2024 15:58:02 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id z19-20020a6b0a13000000b007cf23a498dcsm434384ioi.38.2024.03.20.15.57.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:58:00 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 04/17] io_uring/net: always setup an io_async_msghdr Date: Wed, 20 Mar 2024 16:55:19 -0600 Message-ID: <20240320225750.1769647-5-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk> References: <20240320225750.1769647-1-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Rather than use an on-stack one and then need to allocate and copy if we have to go async, always grab one upfront. This should be very cheap, and potentially even have cache hotness benefits for back-to-back send/recv requests. For any recv type of request, this is probably a good choice in general, as it's expected that no data is available initially. For send this is not necessarily the case, as we expect space to be available. However, getting a cached io_async_msghdr is very cheap, and as it should be cache hot, the difference here is probably negligible, if any. A nice side benefit is that we can kill io_setup_async_msg completely, which has some nasty iovec manipulation code.
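As a rough userspace illustration of the point above (not kernel code, and not part of this patch), the sketch below contrasts the two patterns. The names struct msg_state, copy_to_heap() and alloc_upfront() are hypothetical stand-ins chosen for this example; only the idea of re-basing a pointer into an inline fast_iov array is taken from the removed io_setup_async_msg().

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FAST_IOV 8

/*
 * Hypothetical stand-in for io_async_msghdr: 'iov' may point into the
 * inline 'fast_iov' array, which makes the struct self-referential.
 */
struct msg_state {
	int fast_iov[FAST_IOV];
	int *iov;	/* points into fast_iov, or at a heap array */
	int *free_iov;	/* heap array to free later, or NULL */
};

/*
 * Old pattern: state is built on the stack and copied to the heap when
 * the request has to be retried. A plain memcpy() leaves 'iov' pointing
 * at the stack copy, so it must be re-based onto the new object.
 */
struct msg_state *copy_to_heap(const struct msg_state *stack_msg)
{
	struct msg_state *m = malloc(sizeof(*m));

	if (!m)
		return NULL;
	memcpy(m, stack_msg, sizeof(*m));
	if (!stack_msg->free_iov) {
		size_t idx = (size_t)(stack_msg->iov - stack_msg->fast_iov);

		m->iov = &m->fast_iov[idx];
	}
	return m;
}

/*
 * New pattern: allocate the long-lived object first and build the state
 * in place, so nothing needs to be copied or fixed up on retry.
 */
struct msg_state *alloc_upfront(void)
{
	struct msg_state *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	m->iov = m->fast_iov;
	return m;
}
```

With the second pattern a retry simply returns -EAGAIN and reuses the same object, which is why the callers in the diff below no longer need a helper on that path.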
Signed-off-by: Jens Axboe --- io_uring/net.c | 117 ++++++++++++++++++++----------------------------- 1 file changed, 47 insertions(+), 70 deletions(-) diff --git a/io_uring/net.c b/io_uring/net.c index 2df59fb19a15..14491fab6d59 100644 --- a/io_uring/net.c +++ b/io_uring/net.c @@ -161,36 +161,6 @@ static inline struct io_async_msghdr *io_msg_alloc_async_prep(struct io_kiocb *r return io_msg_alloc_async(req, 0); } -static int io_setup_async_msg(struct io_kiocb *req, - struct io_async_msghdr *kmsg, - unsigned int issue_flags) -{ - struct io_async_msghdr *async_msg; - - if (req_has_async_data(req)) - return -EAGAIN; - async_msg = io_msg_alloc_async(req, issue_flags); - if (!async_msg) { - kfree(kmsg->free_iov); - return -ENOMEM; - } - req->flags |= REQ_F_NEED_CLEANUP; - memcpy(async_msg, kmsg, sizeof(*kmsg)); - if (async_msg->msg.msg_name) - async_msg->msg.msg_name = &async_msg->addr; - - if ((req->flags & REQ_F_BUFFER_SELECT) && !async_msg->msg.msg_iter.nr_segs) - return -EAGAIN; - - /* if were using fast_iov, set it to the new one */ - if (iter_is_iovec(&kmsg->msg.msg_iter) && !kmsg->free_iov) { - size_t fast_idx = iter_iov(&kmsg->msg.msg_iter) - kmsg->fast_iov; - async_msg->msg.msg_iter.__iov = &async_msg->fast_iov[fast_idx]; - } - - return -EAGAIN; -} - #ifdef CONFIG_COMPAT static int io_compat_msg_copy_hdr(struct io_kiocb *req, struct io_async_msghdr *iomsg, @@ -409,7 +379,7 @@ static void io_req_msg_cleanup(struct io_kiocb *req, int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr iomsg, *kmsg; + struct io_async_msghdr *kmsg; struct socket *sock; unsigned flags; int min_ret = 0; @@ -423,15 +393,17 @@ int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags) kmsg = req->async_data; kmsg->msg.msg_control_user = sr->msg_control; } else { - ret = io_sendmsg_copy_hdr(req, &iomsg); + kmsg = io_msg_alloc_async(req, issue_flags); + if (unlikely(!kmsg)) + return 
-ENOMEM; + ret = io_sendmsg_copy_hdr(req, kmsg); if (ret) return ret; - kmsg = &iomsg; } if (!(req->flags & REQ_F_POLLED) && (sr->flags & IORING_RECVSEND_POLL_FIRST)) - return io_setup_async_msg(req, kmsg, issue_flags); + return -EAGAIN; flags = sr->msg_flags; if (issue_flags & IO_URING_F_NONBLOCK) @@ -443,13 +415,13 @@ int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags) if (ret < min_ret) { if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK)) - return io_setup_async_msg(req, kmsg, issue_flags); + return -EAGAIN; if (ret > 0 && io_net_retry(sock, flags)) { kmsg->msg.msg_controllen = 0; kmsg->msg.msg_control = NULL; sr->done_io += ret; req->flags |= REQ_F_BL_NO_RECYCLE; - return io_setup_async_msg(req, kmsg, issue_flags); + return -EAGAIN; } if (ret == -ERESTARTSYS) ret = -EINTR; @@ -465,7 +437,6 @@ int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags) } static struct io_async_msghdr *io_send_setup(struct io_kiocb *req, - struct io_async_msghdr *stack_msg, unsigned int issue_flags) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); @@ -475,8 +446,9 @@ static struct io_async_msghdr *io_send_setup(struct io_kiocb *req, if (req_has_async_data(req)) { kmsg = req->async_data; } else { - kmsg = stack_msg; - kmsg->free_iov = NULL; + kmsg = io_msg_alloc_async(req, issue_flags); + if (unlikely(!kmsg)) + return ERR_PTR(-ENOMEM); kmsg->msg.msg_name = NULL; kmsg->msg.msg_namelen = 0; kmsg->msg.msg_control = NULL; @@ -502,7 +474,7 @@ static struct io_async_msghdr *io_send_setup(struct io_kiocb *req, if (!(req->flags & REQ_F_POLLED) && (sr->flags & IORING_RECVSEND_POLL_FIRST)) - return ERR_PTR(io_setup_async_msg(req, kmsg, issue_flags)); + return ERR_PTR(-EAGAIN); return kmsg; } @@ -510,7 +482,7 @@ static struct io_async_msghdr *io_send_setup(struct io_kiocb *req, int io_send(struct io_kiocb *req, unsigned int issue_flags) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr iomsg, *kmsg; + struct 
io_async_msghdr *kmsg; size_t len = sr->len; struct socket *sock; unsigned flags; @@ -521,7 +493,7 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags) if (unlikely(!sock)) return -ENOTSOCK; - kmsg = io_send_setup(req, &iomsg, issue_flags); + kmsg = io_send_setup(req, issue_flags); if (IS_ERR(kmsg)) return PTR_ERR(kmsg); @@ -540,14 +512,14 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags) ret = sock_sendmsg(sock, &kmsg->msg); if (ret < min_ret) { if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK)) - return io_setup_async_msg(req, kmsg, issue_flags); + return -EAGAIN; if (ret > 0 && io_net_retry(sock, flags)) { sr->len -= ret; sr->buf += ret; sr->done_io += ret; req->flags |= REQ_F_BL_NO_RECYCLE; - return io_setup_async_msg(req, kmsg, issue_flags); + return -EAGAIN; } if (ret == -ERESTARTSYS) ret = -EINTR; @@ -738,9 +710,10 @@ static inline bool io_recv_finish(struct io_kiocb *req, int *ret, sr->nr_multishot_loops = 0; mshot_retry_ret = IOU_REQUEUE; } - *ret = io_setup_async_msg(req, kmsg, issue_flags); - if (*ret == -EAGAIN && issue_flags & IO_URING_F_MULTISHOT) + if (issue_flags & IO_URING_F_MULTISHOT) *ret = mshot_retry_ret; + else + *ret = -EAGAIN; return true; } @@ -842,7 +815,7 @@ static int io_recvmsg_multishot(struct socket *sock, struct io_sr_msg *io, int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr iomsg, *kmsg; + struct io_async_msghdr *kmsg; struct socket *sock; unsigned flags; int ret, min_ret = 0; @@ -856,15 +829,17 @@ int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags) if (req_has_async_data(req)) { kmsg = req->async_data; } else { - ret = io_recvmsg_copy_hdr(req, &iomsg); + kmsg = io_msg_alloc_async(req, issue_flags); + if (unlikely(!kmsg)) + return -ENOMEM; + ret = io_recvmsg_copy_hdr(req, kmsg); if (ret) return ret; - kmsg = &iomsg; } if (!(req->flags & REQ_F_POLLED) && (sr->flags & 
IORING_RECVSEND_POLL_FIRST)) - return io_setup_async_msg(req, kmsg, issue_flags); + return -EAGAIN; flags = sr->msg_flags; if (force_nonblock) @@ -906,17 +881,16 @@ int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags) if (ret < min_ret) { if (ret == -EAGAIN && force_nonblock) { - ret = io_setup_async_msg(req, kmsg, issue_flags); - if (ret == -EAGAIN && (issue_flags & IO_URING_F_MULTISHOT)) { + if (issue_flags & IO_URING_F_MULTISHOT) { io_kbuf_recycle(req, issue_flags); return IOU_ISSUE_SKIP_COMPLETE; } - return ret; + return -EAGAIN; } if (ret > 0 && io_net_retry(sock, flags)) { sr->done_io += ret; req->flags |= REQ_F_BL_NO_RECYCLE; - return io_setup_async_msg(req, kmsg, issue_flags); + return -EAGAIN; } if (ret == -ERESTARTSYS) ret = -EINTR; @@ -941,7 +915,7 @@ int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags) int io_recv(struct io_kiocb *req, unsigned int issue_flags) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr iomsg, *kmsg; + struct io_async_msghdr *kmsg; struct socket *sock; unsigned flags; int ret, min_ret = 0; @@ -951,7 +925,9 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags) if (req_has_async_data(req)) { kmsg = req->async_data; } else { - kmsg = &iomsg; + kmsg = io_msg_alloc_async(req, issue_flags); + if (unlikely(!kmsg)) + return -ENOMEM; kmsg->free_iov = NULL; kmsg->msg.msg_name = NULL; kmsg->msg.msg_namelen = 0; @@ -971,7 +947,7 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags) if (!(req->flags & REQ_F_POLLED) && (sr->flags & IORING_RECVSEND_POLL_FIRST)) - return io_setup_async_msg(req, kmsg, issue_flags); + return -EAGAIN; sock = sock_from_file(req->file); if (unlikely(!sock)) @@ -1005,8 +981,7 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags) ret = sock_recvmsg(sock, &kmsg->msg, flags); if (ret < min_ret) { if (ret == -EAGAIN && force_nonblock) { - ret = io_setup_async_msg(req, kmsg, issue_flags); - if (ret == -EAGAIN && issue_flags & 
IO_URING_F_MULTISHOT) { + if (issue_flags & IO_URING_F_MULTISHOT) { io_kbuf_recycle(req, issue_flags); return IOU_ISSUE_SKIP_COMPLETE; } @@ -1018,7 +993,7 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags) sr->buf += ret; sr->done_io += ret; req->flags |= REQ_F_BL_NO_RECYCLE; - return io_setup_async_msg(req, kmsg, issue_flags); + return -EAGAIN; } if (ret == -ERESTARTSYS) ret = -EINTR; @@ -1214,7 +1189,7 @@ static int io_send_zc_import(struct io_kiocb *req, struct io_async_msghdr *kmsg) int io_send_zc(struct io_kiocb *req, unsigned int issue_flags) { struct io_sr_msg *zc = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr iomsg, *kmsg; + struct io_async_msghdr *kmsg; struct socket *sock; unsigned msg_flags; int ret, min_ret = 0; @@ -1225,7 +1200,7 @@ int io_send_zc(struct io_kiocb *req, unsigned int issue_flags) if (!test_bit(SOCK_SUPPORT_ZC, &sock->flags)) return -EOPNOTSUPP; - kmsg = io_send_setup(req, &iomsg, issue_flags); + kmsg = io_send_setup(req, issue_flags); if (IS_ERR(kmsg)) return PTR_ERR(kmsg); @@ -1248,14 +1223,14 @@ int io_send_zc(struct io_kiocb *req, unsigned int issue_flags) if (unlikely(ret < min_ret)) { if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK)) - return io_setup_async_msg(req, kmsg, issue_flags); + return -EAGAIN; if (ret > 0 && io_net_retry(sock, kmsg->msg.msg_flags)) { zc->len -= ret; zc->buf += ret; zc->done_io += ret; req->flags |= REQ_F_BL_NO_RECYCLE; - return io_setup_async_msg(req, kmsg, issue_flags); + return -EAGAIN; } if (ret == -ERESTARTSYS) ret = -EINTR; @@ -1283,7 +1258,7 @@ int io_send_zc(struct io_kiocb *req, unsigned int issue_flags) int io_sendmsg_zc(struct io_kiocb *req, unsigned int issue_flags) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr iomsg, *kmsg; + struct io_async_msghdr *kmsg; struct socket *sock; unsigned flags; int ret, min_ret = 0; @@ -1299,15 +1274,17 @@ int io_sendmsg_zc(struct io_kiocb *req, unsigned int issue_flags) if 
(req_has_async_data(req)) { kmsg = req->async_data; } else { - ret = io_sendmsg_copy_hdr(req, &iomsg); + kmsg = io_msg_alloc_async(req, issue_flags); + if (unlikely(!kmsg)) + return -ENOMEM; + ret = io_sendmsg_copy_hdr(req, kmsg); if (ret) return ret; - kmsg = &iomsg; } if (!(req->flags & REQ_F_POLLED) && (sr->flags & IORING_RECVSEND_POLL_FIRST)) - return io_setup_async_msg(req, kmsg, issue_flags); + return -EAGAIN; flags = sr->msg_flags | MSG_ZEROCOPY; if (issue_flags & IO_URING_F_NONBLOCK) @@ -1321,12 +1298,12 @@ int io_sendmsg_zc(struct io_kiocb *req, unsigned int issue_flags) if (unlikely(ret < min_ret)) { if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK)) - return io_setup_async_msg(req, kmsg, issue_flags); + return -EAGAIN; if (ret > 0 && io_net_retry(sock, flags)) { sr->done_io += ret; req->flags |= REQ_F_BL_NO_RECYCLE; - return io_setup_async_msg(req, kmsg, issue_flags); + return -EAGAIN; } if (ret == -ERESTARTSYS) ret = -EINTR; From patchwork Wed Mar 20 22:55:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13598254 Received: from mail-io1-f47.google.com (mail-io1-f47.google.com [209.85.166.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7399785C48 for ; Wed, 20 Mar 2024 22:58:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.166.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975487; cv=none; b=sTzUKzCcgR06l1rHL/sjUq97MnJrk+FSY79gxxzwdfCcwvUfb8p9CRFzW/P3KxzARqFpbMZIsUmgboIeryGk6/4qNrO376UwshyIWh6myG1DL38KlvOwGk82IW3JrNTsca+x82tusDbSKM0z2OMO/gYXS2CVjP+NiulZGGOU2+w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975487; c=relaxed/simple; bh=XqMqcyuqekdt834sGtCRWAzM8r53ZNfzjfCh7hnp80I=; 
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=fPgQ5AZPMQL9WlS5t66H9WLMGz9vzQ/76sHCcvlFpUkbtJV3SXa2GqjfOEqRqz6ujRxUzoH5DNYRUClBD0/nCNA4+oq9nid8G1LXToNZatbb6Xa0n1ejebG7LHC5lQbmitPL5eYPfUV+Y4hFUtM27rbFxCkIFfAyA4GtmYpxh7E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=v8CGqNEq; arc=none smtp.client-ip=209.85.166.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="v8CGqNEq" Received: by mail-io1-f47.google.com with SMTP id ca18e2360f4ac-7cc5e664d52so6265439f.0 for ; Wed, 20 Mar 2024 15:58:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1710975484; x=1711580284; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=0yqRqSadDrLycgB7gWD/2gAacwJUJVYMH2PNZjdZcSg=; b=v8CGqNEqHqos12ZisjuVK8E5bhxTPRxvaakG6mmSZfmxKa2YOKspD+Rpi3ol7QtZFd ieQusYOvgIB2kE01k8P61kGbElXZ1sK1htnEm14e2b2zjIaOgFcK6lRb+IZIbJSf2904 kEmyvC4nUIhpx+rEVAo2TUo1KEMoKdb/U03QNGzvmC3kRDODIfR8JrdLiqbROw3CIb1b 2prKH+wRQrxzkrG8V7MbC+o430OZLVymVFDygj5z4fcwc0JFfKRwN40hamRmeM2U54tG clTwTf5gWbr49niqEOfJGS00+MD5e+NihhAgFl43iacDEauM3GrvTGpS31q59PE7vs0B fMiQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710975484; x=1711580284; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=0yqRqSadDrLycgB7gWD/2gAacwJUJVYMH2PNZjdZcSg=; b=qTk1B88p5dQL7G4SJTZ1B4IBI81exASF5PRijBz3RWJWwwyl3MBMIx9BK29lRQPGOs 9MoyKMDS51Jn8MVTqdTndau0JjBak8iOofOHIVhjoATGFo56yflZVO9o3QUXvDVHx4Y4 Ii1Z+KzQlZydyBhAIz0/fc5vYKHmFPfBmBGPF0O2LtX6fc0yeRNP0gTELEhBh1EUJmwM 4ZSnijA9TQfM5XaRQkCIglf0vZMSTrV6pEi0iPWzknlXbI6n5M3nhLLVX0sEee7+QYZ2 pTR2ZYVyMKG2s4A+rX9frr0Hnj2GFw+VIxqtKwm0NJ3oH8qt1kT/yvWT+GGPDqDYHHW7 fW+A== X-Gm-Message-State: AOJu0YzTZwqC+/vnC/8sP1E+6qILY0mHtmsELnuFsKJ00EsOjRhOLf8I 3ZOsOFo78jyScOtlRibKzwmf7xMkZa3RJdhv+AHiboq9h/ydhYub+p/d95LTz5N8nj3hjxiFF4G v X-Google-Smtp-Source: AGHT+IEH0KJnHk7/b/mCXlm++UHWiQFNvqhzjTNLCpevwtjnAYYYOsp88neya7Rk5cYmy83HTcttvA== X-Received: by 2002:a6b:5108:0:b0:7ce:f407:1edf with SMTP id f8-20020a6b5108000000b007cef4071edfmr6851307iob.0.1710975484215; Wed, 20 Mar 2024 15:58:04 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id z19-20020a6b0a13000000b007cf23a498dcsm434384ioi.38.2024.03.20.15.58.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:58:02 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 05/17] io_uring/net: get rid of ->prep_async() for receive side Date: Wed, 20 Mar 2024 16:55:20 -0600 Message-ID: <20240320225750.1769647-6-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk> References: <20240320225750.1769647-1-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Move the io_async_msghdr out of the issue path and into prep handling, since it's now done unconditionally and hence does not need to be part of the issue path. 
This reduces the footprint of the multishot fast path of multiple invocations of ->issue() per prep, and also means that we can drop using ->prep_async() for recvmsg as we now do this setup on the prep side. Signed-off-by: Jens Axboe --- io_uring/net.c | 71 +++++++++++++++++++----------------------------- io_uring/net.h | 1 - io_uring/opdef.c | 2 -- 3 files changed, 28 insertions(+), 46 deletions(-) diff --git a/io_uring/net.c b/io_uring/net.c index 14491fab6d59..e438b1ac2420 100644 --- a/io_uring/net.c +++ b/io_uring/net.c @@ -596,17 +596,36 @@ static int io_recvmsg_copy_hdr(struct io_kiocb *req, msg.msg_controllen); } -int io_recvmsg_prep_async(struct io_kiocb *req) +static int io_recvmsg_prep_setup(struct io_kiocb *req) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr *iomsg; + struct io_async_msghdr *kmsg; int ret; - sr->done_io = 0; - if (!io_msg_alloc_async_prep(req)) + /* always locked for prep */ + kmsg = io_msg_alloc_async(req, 0); + if (unlikely(!kmsg)) return -ENOMEM; - iomsg = req->async_data; - ret = io_recvmsg_copy_hdr(req, iomsg); + + if (req->opcode == IORING_OP_RECV) { + kmsg->msg.msg_name = NULL; + kmsg->msg.msg_namelen = 0; + kmsg->msg.msg_control = NULL; + kmsg->msg.msg_get_inq = 1; + kmsg->msg.msg_controllen = 0; + kmsg->msg.msg_iocb = NULL; + kmsg->msg.msg_ubuf = NULL; + + if (!io_do_buffer_select(req)) { + ret = import_ubuf(ITER_DEST, sr->buf, sr->len, + &kmsg->msg.msg_iter); + if (unlikely(ret)) + return ret; + } + return 0; + } + + ret = io_recvmsg_copy_hdr(req, kmsg); if (!ret) req->flags |= REQ_F_NEED_CLEANUP; return ret; @@ -657,7 +676,7 @@ int io_recvmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) sr->msg_flags |= MSG_CMSG_COMPAT; #endif sr->nr_multishot_loops = 0; - return 0; + return io_recvmsg_prep_setup(req); } static inline void io_recv_prep_retry(struct io_kiocb *req, @@ -815,7 +834,7 @@ static int io_recvmsg_multishot(struct socket *sock, struct io_sr_msg *io, int 
io_recvmsg(struct io_kiocb *req, unsigned int issue_flags) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr *kmsg; + struct io_async_msghdr *kmsg = req->async_data; struct socket *sock; unsigned flags; int ret, min_ret = 0; @@ -826,17 +845,6 @@ int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags) if (unlikely(!sock)) return -ENOTSOCK; - if (req_has_async_data(req)) { - kmsg = req->async_data; - } else { - kmsg = io_msg_alloc_async(req, issue_flags); - if (unlikely(!kmsg)) - return -ENOMEM; - ret = io_recvmsg_copy_hdr(req, kmsg); - if (ret) - return ret; - } - if (!(req->flags & REQ_F_POLLED) && (sr->flags & IORING_RECVSEND_POLL_FIRST)) return -EAGAIN; @@ -915,36 +923,13 @@ int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags) int io_recv(struct io_kiocb *req, unsigned int issue_flags) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr *kmsg; + struct io_async_msghdr *kmsg = req->async_data; struct socket *sock; unsigned flags; int ret, min_ret = 0; bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK; size_t len = sr->len; - if (req_has_async_data(req)) { - kmsg = req->async_data; - } else { - kmsg = io_msg_alloc_async(req, issue_flags); - if (unlikely(!kmsg)) - return -ENOMEM; - kmsg->free_iov = NULL; - kmsg->msg.msg_name = NULL; - kmsg->msg.msg_namelen = 0; - kmsg->msg.msg_control = NULL; - kmsg->msg.msg_get_inq = 1; - kmsg->msg.msg_controllen = 0; - kmsg->msg.msg_iocb = NULL; - kmsg->msg.msg_ubuf = NULL; - - if (!io_do_buffer_select(req)) { - ret = import_ubuf(ITER_DEST, sr->buf, sr->len, - &kmsg->msg.msg_iter); - if (unlikely(ret)) - return ret; - } - } - if (!(req->flags & REQ_F_POLLED) && (sr->flags & IORING_RECVSEND_POLL_FIRST)) return -EAGAIN; diff --git a/io_uring/net.h b/io_uring/net.h index 5c1230f1aaf9..4b4fd9b1b7b4 100644 --- a/io_uring/net.h +++ b/io_uring/net.h @@ -42,7 +42,6 @@ int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags); int 
io_send(struct io_kiocb *req, unsigned int issue_flags); int io_sendrecv_prep_async(struct io_kiocb *req); -int io_recvmsg_prep_async(struct io_kiocb *req); int io_recvmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags); int io_recv(struct io_kiocb *req, unsigned int issue_flags); diff --git a/io_uring/opdef.c b/io_uring/opdef.c index 77131826d603..1368193edc57 100644 --- a/io_uring/opdef.c +++ b/io_uring/opdef.c @@ -536,7 +536,6 @@ const struct io_cold_def io_cold_defs[] = { .name = "RECVMSG", #if defined(CONFIG_NET) .async_size = sizeof(struct io_async_msghdr), - .prep_async = io_recvmsg_prep_async, .cleanup = io_sendmsg_recvmsg_cleanup, .fail = io_sendrecv_fail, #endif @@ -613,7 +612,6 @@ const struct io_cold_def io_cold_defs[] = { .async_size = sizeof(struct io_async_msghdr), .cleanup = io_sendmsg_recvmsg_cleanup, .fail = io_sendrecv_fail, - .prep_async = io_sendrecv_prep_async, #endif }, [IORING_OP_OPENAT2] = { From patchwork Wed Mar 20 22:55:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13598255 Received: from mail-io1-f52.google.com (mail-io1-f52.google.com [209.85.166.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 703EF85C65 for ; Wed, 20 Mar 2024 22:58:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.166.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975489; cv=none; b=PigQiidIOAWj9RzKtYpbPnrHGMXNbeSw/gY5PjvSNamEWP3Nihwsn496mmIgRn5C8k0RJF7TpnJRL4XbzI8snfgY56VvyORoUcS4M9DHhoVnKwhXUufoDjwS6ff8AtG8DnQwJgXdvv+hxzXsGUtVQQPTb+990DC0t1To5NXoHE4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975489; c=relaxed/simple; 
bh=caeNFUSSfWAnpz4lcI+eODh8GgEovyM3kf6YT1l0Lt4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=g6ZATwGZP3Uz7Kg2C73omrAGXgJMwx/CwSUKlt8dG+0iYRZkzNuOmR6SLbKIkZ2nX1wmMdnx+c15iAoWCn1x/6RUEnGi2XCDsaFC0LWpicYXnUxCIxL7RZlElt8s9T5maSALzyktYDOF0YgOOstmBeeVZEfIGAWOnj3lLLckkhw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=fccc3jvy; arc=none smtp.client-ip=209.85.166.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="fccc3jvy" Received: by mail-io1-f52.google.com with SMTP id ca18e2360f4ac-7c8e4c0412dso3588639f.1 for ; Wed, 20 Mar 2024 15:58:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1710975486; x=1711580286; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=BpX2WhgS90Wr7HkWYuE7Ck1OBz3+CeOQjZCkZS5G5D0=; b=fccc3jvy5JVgm2gsaMMkPqp0uks0LFqBDl4Wo756xIAQsxyz0Q0ADYRxKNIUbzE1as lQ3gqGxXI/xE2p8EGN5t8iakWA5mVurAKWOc1b84+fmbxzOM1WH35GKEpP/LwPd9YQAN ZIrXC4bRfPYEwMJh3r+4RZsSA9oEVt+2lR74CzSVsvQKiUQTQ4ZRx9thMk1OA1j0vhUA nUD0tbVLBFO1OPi5t61zIqShzN2pdLRmykxz/BBXK6JGYPt3FcXiMcrLUkfEcffJVXpk 3g2pbcJ+kOjC1vZFqioYT2UNk0oNsQVPi45Q7ZanzIeC1pueoX26jurE+DTIJC8nnPpG A2lg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710975486; x=1711580286; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=BpX2WhgS90Wr7HkWYuE7Ck1OBz3+CeOQjZCkZS5G5D0=; b=N9ORqJsdX4B1Ky5vNc5mXwzLskM0EFwGuhi2AIRhItdhgn39bFcC/Q1l8JzaQqTsU3 CMn3FmgGusYxgU/5YzPrab9/Iw8dP+VFjMFEKB5irD+l5ec+t2mXtDOVEhgrY4eO32w4 rA3UFULCyPeB5kvChFvLf84/885tZO5iaf6OmqgUkdXbxR/TPjEcsk7pOgQAkX/VbXP1 j8NI4YWR52Dr8/TTCjlFEcei7r9mn2yz3+Wtsiz7kW0Z173hDpo1wEVciyIx1XT+9D/x 0/F8dc8WYHP8oX2xt+WhrlZ2hXWo57BgRmK2BEFfb+SECw+GRsOx4TkV4CtXkz3f7Qfb OqxQ== X-Gm-Message-State: AOJu0YyRjYbJ6naBBkyb3pr4Ah4dp0+Ly01qIjsXK9T8XDVVO8HzEY1i jf5+ognfXD2vJXhm2Vip5pUNpeOcDSOnb2lhzMn0l/UhmrwURrsNY8VAlRMbRU8juSznrtXujq8 M X-Google-Smtp-Source: AGHT+IHCwHNwb/eSAh6rbZmq5VoOnIhiugIFAhJwo7D4PNfLuJa/BMjZD9C3dXZfYzFjDW8ASC7VjQ== X-Received: by 2002:a6b:6303:0:b0:7c8:d514:9555 with SMTP id p3-20020a6b6303000000b007c8d5149555mr16278044iog.1.1710975486258; Wed, 20 Mar 2024 15:58:06 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id z19-20020a6b0a13000000b007cf23a498dcsm434384ioi.38.2024.03.20.15.58.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:58:04 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 06/17] io_uring/net: get rid of ->prep_async() for send side Date: Wed, 20 Mar 2024 16:55:21 -0600 Message-ID: <20240320225750.1769647-7-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk> References: <20240320225750.1769647-1-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Move the io_async_msghdr out of the issue path and into prep handling, since it's now done unconditionally and hence does not need to be part of the issue path.
This means we can drop any usage of io_sendrecv_prep_async() and io_sendmsg_prep_async(), and hence the forced async setup path is now unified with the normal prep setup. Signed-off-by: Jens Axboe --- io_uring/net.c | 162 +++++++++++++++-------------------------------- io_uring/net.h | 2 - io_uring/opdef.c | 4 -- 3 files changed, 50 insertions(+), 118 deletions(-) diff --git a/io_uring/net.c b/io_uring/net.c index e438b1ac2420..dc6cda076a93 100644 --- a/io_uring/net.c +++ b/io_uring/net.c @@ -290,50 +290,59 @@ static int io_sendmsg_copy_hdr(struct io_kiocb *req, return ret; } -int io_sendrecv_prep_async(struct io_kiocb *req) +void io_sendmsg_recvmsg_cleanup(struct io_kiocb *req) +{ + struct io_async_msghdr *io = req->async_data; + + kfree(io->free_iov); +} + +static int io_send_setup(struct io_kiocb *req) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr *io; + struct io_async_msghdr *kmsg = req->async_data; int ret; - if (req_has_async_data(req)) - return 0; - sr->done_io = 0; - if (!sr->addr) - return 0; - io = io_msg_alloc_async_prep(req); - if (!io) - return -ENOMEM; - memset(&io->msg, 0, sizeof(io->msg)); - ret = import_ubuf(ITER_SOURCE, sr->buf, sr->len, &io->msg.msg_iter); - if (unlikely(ret)) - return ret; - io->msg.msg_name = &io->addr; - io->msg.msg_namelen = sr->addr_len; - return move_addr_to_kernel(sr->addr, sr->addr_len, &io->addr); + kmsg->msg.msg_name = NULL; + kmsg->msg.msg_namelen = 0; + kmsg->msg.msg_control = NULL; + kmsg->msg.msg_controllen = 0; + kmsg->msg.msg_ubuf = NULL; + + if (sr->addr) { + ret = move_addr_to_kernel(sr->addr, sr->addr_len, &kmsg->addr); + if (unlikely(ret < 0)) + return ret; + kmsg->msg.msg_name = &kmsg->addr; + kmsg->msg.msg_namelen = sr->addr_len; + } + if (!io_do_buffer_select(req)) { + ret = import_ubuf(ITER_SOURCE, sr->buf, sr->len, + &kmsg->msg.msg_iter); + if (unlikely(ret < 0)) + return ret; + } + + return 0; } -int io_sendmsg_prep_async(struct io_kiocb *req) +static int 
io_sendmsg_prep_setup(struct io_kiocb *req, int is_msg) { - struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); + struct io_async_msghdr *kmsg; int ret; - sr->done_io = 0; - if (!io_msg_alloc_async_prep(req)) + /* always locked for prep */ + kmsg = io_msg_alloc_async(req, 0); + if (unlikely(!kmsg)) return -ENOMEM; - ret = io_sendmsg_copy_hdr(req, req->async_data); + if (!is_msg) + return io_send_setup(req); + ret = io_sendmsg_copy_hdr(req, kmsg); if (!ret) req->flags |= REQ_F_NEED_CLEANUP; return ret; } -void io_sendmsg_recvmsg_cleanup(struct io_kiocb *req) -{ - struct io_async_msghdr *io = req->async_data; - - kfree(io->free_iov); -} - int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); @@ -362,7 +371,7 @@ int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) if (req->ctx->compat) sr->msg_flags |= MSG_CMSG_COMPAT; #endif - return 0; + return io_sendmsg_prep_setup(req, req->opcode == IORING_OP_SENDMSG); } static void io_req_msg_cleanup(struct io_kiocb *req, @@ -379,7 +388,7 @@ static void io_req_msg_cleanup(struct io_kiocb *req, int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr *kmsg; + struct io_async_msghdr *kmsg = req->async_data; struct socket *sock; unsigned flags; int min_ret = 0; @@ -389,18 +398,6 @@ int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags) if (unlikely(!sock)) return -ENOTSOCK; - if (req_has_async_data(req)) { - kmsg = req->async_data; - kmsg->msg.msg_control_user = sr->msg_control; - } else { - kmsg = io_msg_alloc_async(req, issue_flags); - if (unlikely(!kmsg)) - return -ENOMEM; - ret = io_sendmsg_copy_hdr(req, kmsg); - if (ret) - return ret; - } - if (!(req->flags & REQ_F_POLLED) && (sr->flags & IORING_RECVSEND_POLL_FIRST)) return -EAGAIN; @@ -436,54 +433,10 @@ int io_sendmsg(struct io_kiocb *req, unsigned int 
issue_flags) return IOU_OK; } -static struct io_async_msghdr *io_send_setup(struct io_kiocb *req, - unsigned int issue_flags) -{ - struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr *kmsg; - int ret; - - if (req_has_async_data(req)) { - kmsg = req->async_data; - } else { - kmsg = io_msg_alloc_async(req, issue_flags); - if (unlikely(!kmsg)) - return ERR_PTR(-ENOMEM); - kmsg->msg.msg_name = NULL; - kmsg->msg.msg_namelen = 0; - kmsg->msg.msg_control = NULL; - kmsg->msg.msg_controllen = 0; - kmsg->msg.msg_ubuf = NULL; - - if (sr->addr) { - ret = move_addr_to_kernel(sr->addr, sr->addr_len, - &kmsg->addr); - if (unlikely(ret < 0)) - return ERR_PTR(ret); - kmsg->msg.msg_name = &kmsg->addr; - kmsg->msg.msg_namelen = sr->addr_len; - } - - if (!io_do_buffer_select(req)) { - ret = import_ubuf(ITER_SOURCE, sr->buf, sr->len, - &kmsg->msg.msg_iter); - if (unlikely(ret)) - return ERR_PTR(ret); - } - } - - if (!(req->flags & REQ_F_POLLED) && - (sr->flags & IORING_RECVSEND_POLL_FIRST)) - return ERR_PTR(-EAGAIN); - - return kmsg; -} - int io_send(struct io_kiocb *req, unsigned int issue_flags) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr *kmsg; - size_t len = sr->len; + struct io_async_msghdr *kmsg = req->async_data; struct socket *sock; unsigned flags; int min_ret = 0; @@ -493,13 +446,9 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags) if (unlikely(!sock)) return -ENOTSOCK; - kmsg = io_send_setup(req, issue_flags); - if (IS_ERR(kmsg)) - return PTR_ERR(kmsg); - - ret = import_ubuf(ITER_SOURCE, sr->buf, len, &kmsg->msg.msg_iter); - if (unlikely(ret)) - return ret; + if (!(req->flags & REQ_F_POLLED) && + (sr->flags & IORING_RECVSEND_POLL_FIRST)) + return -EAGAIN; flags = sr->msg_flags; if (issue_flags & IO_URING_F_NONBLOCK) @@ -1085,7 +1034,7 @@ int io_send_zc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) if (req->ctx->compat) zc->msg_flags |= MSG_CMSG_COMPAT; #endif - return 0; 
+ return io_sendmsg_prep_setup(req, req->opcode == IORING_OP_SENDMSG_ZC); } static int io_sg_from_iter_iovec(struct sock *sk, struct sk_buff *skb, @@ -1174,7 +1123,7 @@ static int io_send_zc_import(struct io_kiocb *req, struct io_async_msghdr *kmsg) int io_send_zc(struct io_kiocb *req, unsigned int issue_flags) { struct io_sr_msg *zc = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr *kmsg; + struct io_async_msghdr *kmsg = req->async_data; struct socket *sock; unsigned msg_flags; int ret, min_ret = 0; @@ -1185,9 +1134,9 @@ int io_send_zc(struct io_kiocb *req, unsigned int issue_flags) if (!test_bit(SOCK_SUPPORT_ZC, &sock->flags)) return -EOPNOTSUPP; - kmsg = io_send_setup(req, issue_flags); - if (IS_ERR(kmsg)) - return PTR_ERR(kmsg); + if (!(req->flags & REQ_F_POLLED) && + (zc->flags & IORING_RECVSEND_POLL_FIRST)) + return -EAGAIN; if (!zc->done_io) { ret = io_send_zc_import(req, kmsg); @@ -1243,7 +1192,7 @@ int io_send_zc(struct io_kiocb *req, unsigned int issue_flags) int io_sendmsg_zc(struct io_kiocb *req, unsigned int issue_flags) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr *kmsg; + struct io_async_msghdr *kmsg = req->async_data; struct socket *sock; unsigned flags; int ret, min_ret = 0; @@ -1256,17 +1205,6 @@ int io_sendmsg_zc(struct io_kiocb *req, unsigned int issue_flags) if (!test_bit(SOCK_SUPPORT_ZC, &sock->flags)) return -EOPNOTSUPP; - if (req_has_async_data(req)) { - kmsg = req->async_data; - } else { - kmsg = io_msg_alloc_async(req, issue_flags); - if (unlikely(!kmsg)) - return -ENOMEM; - ret = io_sendmsg_copy_hdr(req, kmsg); - if (ret) - return ret; - } - if (!(req->flags & REQ_F_POLLED) && (sr->flags & IORING_RECVSEND_POLL_FIRST)) return -EAGAIN; diff --git a/io_uring/net.h b/io_uring/net.h index 4b4fd9b1b7b4..f99ebb9dc0bb 100644 --- a/io_uring/net.h +++ b/io_uring/net.h @@ -34,13 +34,11 @@ struct io_async_connect { int io_shutdown_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); 
int io_shutdown(struct io_kiocb *req, unsigned int issue_flags); -int io_sendmsg_prep_async(struct io_kiocb *req); void io_sendmsg_recvmsg_cleanup(struct io_kiocb *req); int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags); int io_send(struct io_kiocb *req, unsigned int issue_flags); -int io_sendrecv_prep_async(struct io_kiocb *req); int io_recvmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags); diff --git a/io_uring/opdef.c b/io_uring/opdef.c index 1368193edc57..dd4a1e1425e1 100644 --- a/io_uring/opdef.c +++ b/io_uring/opdef.c @@ -527,7 +527,6 @@ const struct io_cold_def io_cold_defs[] = { .name = "SENDMSG", #if defined(CONFIG_NET) .async_size = sizeof(struct io_async_msghdr), - .prep_async = io_sendmsg_prep_async, .cleanup = io_sendmsg_recvmsg_cleanup, .fail = io_sendrecv_fail, #endif @@ -603,7 +602,6 @@ const struct io_cold_def io_cold_defs[] = { .async_size = sizeof(struct io_async_msghdr), .cleanup = io_sendmsg_recvmsg_cleanup, .fail = io_sendrecv_fail, - .prep_async = io_sendrecv_prep_async, #endif }, [IORING_OP_RECV] = { @@ -688,7 +686,6 @@ const struct io_cold_def io_cold_defs[] = { .name = "SEND_ZC", #if defined(CONFIG_NET) .async_size = sizeof(struct io_async_msghdr), - .prep_async = io_sendrecv_prep_async, .cleanup = io_send_zc_cleanup, .fail = io_sendrecv_fail, #endif @@ -697,7 +694,6 @@ const struct io_cold_def io_cold_defs[] = { .name = "SENDMSG_ZC", #if defined(CONFIG_NET) .async_size = sizeof(struct io_async_msghdr), - .prep_async = io_sendmsg_prep_async, .cleanup = io_send_zc_cleanup, .fail = io_sendrecv_fail, #endif From patchwork Wed Mar 20 22:55:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13598256 Received: from mail-io1-f42.google.com (mail-io1-f42.google.com [209.85.166.42]) (using 
TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 380B485C7F for ; Wed, 20 Mar 2024 22:58:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.166.42 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975490; cv=none; b=cO3nnxKuz7ijMtNUozAjmVXeEh7+pTpzDg9KmMLurmCqZa6JooZrmpRtJDxRH+fliuOMOzGHBUjqFZU3xPD+mBb/GFwao0dQe7MDzsnSkJJjhqcxkzLczN92cPPh8TFTVBYV+lc9hPUr7xgsusqb1Aj4UeO1o9oQn14aOH2QVBk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975490; c=relaxed/simple; bh=Ve6uHhsYaizqlgH/NaC3fWbKK6mM5yEmkwoIG+qdOcw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=EmmiJ70CbMKcn1YFP29m9E/mzYaRlE1C+1tjj8l8GQxHvlUuzcaK0bra9IhvacJs5fbgqdDcbTDcg7xMTgDLtBUPHmUsaCfi7bwLlgjcC+TcVLvsclIrquFC+DIsXb2mXaml6Y2sJIzBDGwImjog/ca6CdlBZTBig5GmbfEvNvU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=RfSwDzoW; arc=none smtp.client-ip=209.85.166.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="RfSwDzoW" Received: by mail-io1-f42.google.com with SMTP id ca18e2360f4ac-7cc5e664d52so6266439f.0 for ; Wed, 20 Mar 2024 15:58:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1710975488; x=1711580288; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=P2UrDyqNAoZmjKjSbgMjdB+JTKkJl5VhiIsoG9G4uPM=; b=RfSwDzoWws/aZa5vm00zf93OLMJVW2NibjiyQCNWFWvMDgXGHV6tj0yG2YEugcYMvc 2jWX3fqkhmwAI/pqLy/5jxa6C3UfqWzt03rSlKfsJBctGJfpAUlTRi4II3qcEbHt17Hv isNCa5z98nOZ2ulHV2i8cmqxv3pMQxhU7alvMbndmpRJJOENHwBxyvXdV1m4yPd1beyC O0EtGub9VPWeH/93CvVuFqtx2fv+/OfENkMcj2RqK+QinLm0HJDSTXLc6DodpVulMIuP L5Y9gMHyYfSMsuatkzYT09w1UdVBVHnH/9AP6gSjb9hQg2bB8nG1WyEV3smbmo1YS1GN 05RA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710975488; x=1711580288; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=P2UrDyqNAoZmjKjSbgMjdB+JTKkJl5VhiIsoG9G4uPM=; b=jdwOx4rad/7y2fcF2swYt5Nr+AwTK+6wDsJ0ZzJHmoHkHz5BmcEI+y+MkoHHCySZFg DW2CVkUW7E1nnVCejeibiNBHrQfrnG+eDjEODMNOoYSwGZUPrqULCpSqHtMVD2Kz1RWP jhkq4KRgz26nLNcbPW6cxwmFJtkVO3a4AM27rc8aqsPTUxXP+KLmt4u3C1D376D5sY13 d3Y7cPhCPXTdwrdL6Yd6eAIT/eZXyo69nNo55Gauol7M39AXs5eNzknVa8wnzWDUJ1Na L5u/U9n0dJXXjy+GKBVFnqGQaaPthqxmtCJFl+kPziUX6wuL5OLCcqT+WdF8PzUEL21f LnNA== X-Gm-Message-State: AOJu0YxUeZYFh3GQSNFms0VGF/zrRuhu735awZrhjNzpa35m4u6h/qB9 G4n68EsFKV2pSkx1tMB64UK2Wq16dl3pm5Ojue7Q4hgKf3xtRQ6qTu0dEdgjDSyAGWNpKGZ5rsK l X-Google-Smtp-Source: AGHT+IHuA/aCDq3yA3LeB0Fsdl4KErlzaTfwK1tsKB6FtOvHvGnjbGPNUfO+4+l8bPuKO2s6wVXRzQ== X-Received: by 2002:a6b:6303:0:b0:7c8:d514:9555 with SMTP id p3-20020a6b6303000000b007c8d5149555mr16278103iog.1.1710975487975; Wed, 20 Mar 2024 15:58:07 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id z19-20020a6b0a13000000b007cf23a498dcsm434384ioi.38.2024.03.20.15.58.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:58:07 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 07/17] io_uring: kill io_msg_alloc_async_prep() Date: Wed, 20 Mar 2024 
16:55:22 -0600 Message-ID: <20240320225750.1769647-8-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk> References: <20240320225750.1769647-1-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 We now ONLY call io_msg_alloc_async() from inside prep handling, which is always locked. No need for this helper anymore, or the check in io_msg_alloc_async() on whether the ring is locked or not. Signed-off-by: Jens Axboe --- io_uring/net.c | 31 ++++++++++--------------------- 1 file changed, 10 insertions(+), 21 deletions(-) diff --git a/io_uring/net.c b/io_uring/net.c index dc6cda076a93..6b45311dcc08 100644 --- a/io_uring/net.c +++ b/io_uring/net.c @@ -129,22 +129,19 @@ static void io_netmsg_recycle(struct io_kiocb *req, unsigned int issue_flags) } } -static struct io_async_msghdr *io_msg_alloc_async(struct io_kiocb *req, - unsigned int issue_flags) +static struct io_async_msghdr *io_msg_alloc_async(struct io_kiocb *req) { struct io_ring_ctx *ctx = req->ctx; struct io_cache_entry *entry; struct io_async_msghdr *hdr; - if (!(issue_flags & IO_URING_F_UNLOCKED)) { - entry = io_alloc_cache_get(&ctx->netmsg_cache); - if (entry) { - hdr = container_of(entry, struct io_async_msghdr, cache); - hdr->free_iov = NULL; - req->flags |= REQ_F_ASYNC_DATA; - req->async_data = hdr; - return hdr; - } + entry = io_alloc_cache_get(&ctx->netmsg_cache); + if (entry) { + hdr = container_of(entry, struct io_async_msghdr, cache); + hdr->free_iov = NULL; + req->flags |= REQ_F_ASYNC_DATA; + req->async_data = hdr; + return hdr; } if (!io_alloc_async_data(req)) { @@ -155,12 +152,6 @@ static struct io_async_msghdr *io_msg_alloc_async(struct io_kiocb *req, return NULL; } -static inline struct io_async_msghdr *io_msg_alloc_async_prep(struct io_kiocb *req) -{ - /* ->prep_async is always called from the submission context */ - return io_msg_alloc_async(req, 0); -} - 
#ifdef CONFIG_COMPAT static int io_compat_msg_copy_hdr(struct io_kiocb *req, struct io_async_msghdr *iomsg, @@ -331,8 +322,7 @@ static int io_sendmsg_prep_setup(struct io_kiocb *req, int is_msg) struct io_async_msghdr *kmsg; int ret; - /* always locked for prep */ - kmsg = io_msg_alloc_async(req, 0); + kmsg = io_msg_alloc_async(req); if (unlikely(!kmsg)) return -ENOMEM; if (!is_msg) @@ -551,8 +541,7 @@ static int io_recvmsg_prep_setup(struct io_kiocb *req) struct io_async_msghdr *kmsg; int ret; - /* always locked for prep */ - kmsg = io_msg_alloc_async(req, 0); + kmsg = io_msg_alloc_async(req); if (unlikely(!kmsg)) return -ENOMEM; From patchwork Wed Mar 20 22:55:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13598257 Received: from mail-il1-f173.google.com (mail-il1-f173.google.com [209.85.166.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5FDC885C65 for ; Wed, 20 Mar 2024 22:58:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.166.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975492; cv=none; b=mvTnTQ3a/KFj/uQK1tgY7dxQmJGPwwGmN4m+1alQyWxHfDa3vSOh0gJm0PYrTbPFYRn4bnoS1SaccZS3jiBZOILUda9ziDhVnSomaHypbt8AoB4Fox/25RCvN9Q6KH++wnUtMJErbKLXjolrRQJTGK5Tn6fg+McbEfYjpExfnXE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975492; c=relaxed/simple; bh=10ADhEGQYzZGxcBm42itPVIEZFSx7zgsjWVrLVrInvE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=kKjPSeI2aPHxcq2MAeCXhCPVzhTRJoedrjMkN3XdabnsMcZDgHbwI4HSw3NnPt7UUkF8xD9DADLbUcZ24tTBfZ6xXRtoEaKqYSMEpwPFvERb/ZDGov2OkOAMyVExOvEblK8609C4/0sC3ie2Hb2/dA2AbejTe42V2hkipaUWw1I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) 
header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=MmsxXidU; arc=none smtp.client-ip=209.85.166.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="MmsxXidU" Received: by mail-il1-f173.google.com with SMTP id e9e14a558f8ab-3667b0bb83eso574555ab.0 for ; Wed, 20 Mar 2024 15:58:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1710975489; x=1711580289; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=gqhXeok8GHYy4naqWFY6Hw4mYOo6VHfgqsHImgSaso0=; b=MmsxXidUlkolYTsoAlTZjQDfIpMDqT7PzAo86ugsbefXauMTSh29N+BRBxEZSZV0rg PTn6oWhwJ0XW2LS69wM8cHAEfjlRYjH7uZRydoy9aGpigrvqIQFTk2exA7kNlNMU2EI4 5dBivVMmWcTJp7FKJSGBxeCqHvm9N2OWAgd1ae8rESzmNXj/Rr2ino71CkUNsieYCZCl qXVjQs1sATcT4ppbsZBMCu4QgxnAuwQ3HCIEPn6VoYP2g7Y4DJxbOucmAYaY5yVzpbSr dFUguc0VCGtdBkTcFxD5TGMRz/yzJAGmjcfnsjf498wt9iGh7B2ipA6TDtUb/RjPrt1L c3TA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710975489; x=1711580289; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=gqhXeok8GHYy4naqWFY6Hw4mYOo6VHfgqsHImgSaso0=; b=LDP0P0WEC0SXlka4v6T2U5c5ADEuavAv2LYGFzfi7Nt+s0DSMNXbAvStYMiZYz/BaZ HKiyavuZdIIaRuiRjB84BKIkctx0DbKl9sDGZf3l5VCTGXbAGRa+2X++fev5k3bEXK3L OlTvas/5h6+KGDNzXOESvlDZGSI1mYjr8hBCvGX3F/pbgdoyJa25TGjvLrW0JOz4qr8y 
I60qHT8cwT/i0SZeHKT0mz+kjtmfcnuElx+1YIWkd6tHpUi0Kk1eMtcgc+oLOIsecFPL 1aOPxj67+ygcO6BihEDeGA4jZ6OyYnSz+osVURMbC5FFtwUVt4JRFzQbcsSAriG1zGI3 eiUg== X-Gm-Message-State: AOJu0YygV8lxAEug31QXOih8IN09Jpi0833fHSIL/JiRqo557la47aF4 rLMF4R+NwUjW8d4c9SHoHldedOhJlS8BGgLKkQq/ogrBSRpGl02GpGS9Z621RUBedX3FrNrLdIr A X-Google-Smtp-Source: AGHT+IFpQAYyYUIdWYTx7wRkWEuOyXOBCIQQwx28RlGLvAHSxR5HuQmm0nEPeqybPmZ1Zx+j1eELVw== X-Received: by 2002:a5e:a80c:0:b0:7cf:24de:c5f with SMTP id c12-20020a5ea80c000000b007cf24de0c5fmr1835747ioa.1.1710975489139; Wed, 20 Mar 2024 15:58:09 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id z19-20020a6b0a13000000b007cf23a498dcsm434384ioi.38.2024.03.20.15.58.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:58:08 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 08/17] io_uring/net: add iovec recycling Date: Wed, 20 Mar 2024 16:55:23 -0600 Message-ID: <20240320225750.1769647-9-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk> References: <20240320225750.1769647-1-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Right now the io_async_msghdr is recycled to avoid the overhead of allocating+freeing it for every request. But the iovec is not included, hence that will be allocated and freed for each transfer regardless. This commit enables recycling of the iovec between io_async_msghdr recycles. This avoids alloc+free for each one if we use it, and on top of that, it extends the cache hot nature of msg to the iovec as well. Also enables KASAN for the iovec entries, so that reuse can be detected even while they are in the cache. The io_async_msghdr also shrinks from 376 -> 288 bytes, an 88 byte saving (or ~23% smaller), as the fast_iovec entry is dropped from 8 entries to a single entry.
There's no point keeping a big fast iovec entry, if we're not allocating and freeing new iovecs all the time. Signed-off-by: Jens Axboe --- io_uring/net.c | 133 ++++++++++++++++++++++++++++++++----------------- io_uring/net.h | 13 ++--- 2 files changed, 93 insertions(+), 53 deletions(-) diff --git a/io_uring/net.c b/io_uring/net.c index 6b45311dcc08..20d6427f4250 100644 --- a/io_uring/net.c +++ b/io_uring/net.c @@ -115,15 +115,33 @@ static bool io_net_retry(struct socket *sock, int flags) return sock->type == SOCK_STREAM || sock->type == SOCK_SEQPACKET; } +static void io_netmsg_iovec_free(struct io_async_msghdr *kmsg) +{ + if (kmsg->free_iov) { + kfree(kmsg->free_iov); + kmsg->free_iov_nr = 0; + kmsg->free_iov = NULL; + } +} + static void io_netmsg_recycle(struct io_kiocb *req, unsigned int issue_flags) { struct io_async_msghdr *hdr = req->async_data; + struct iovec *iov; - if (!req_has_async_data(req) || issue_flags & IO_URING_F_UNLOCKED) + if (!req_has_async_data(req)) + return; + /* can't recycle, ensure we free the iovec if we have one */ + if (issue_flags & IO_URING_F_UNLOCKED) { + io_netmsg_iovec_free(hdr); return; + } /* Let normal cleanup path reap it if we fail adding to the cache */ + iov = hdr->free_iov; if (io_alloc_cache_put(&req->ctx->netmsg_cache, &hdr->cache)) { + if (iov) + kasan_mempool_poison_object(iov); req->async_data = NULL; req->flags &= ~REQ_F_ASYNC_DATA; } @@ -138,7 +156,11 @@ static struct io_async_msghdr *io_msg_alloc_async(struct io_kiocb *req) entry = io_alloc_cache_get(&ctx->netmsg_cache); if (entry) { hdr = container_of(entry, struct io_async_msghdr, cache); - hdr->free_iov = NULL; + if (hdr->free_iov) { + kasan_mempool_unpoison_object(hdr->free_iov, + hdr->free_iov_nr * sizeof(struct iovec)); + req->flags |= REQ_F_NEED_CLEANUP; + } req->flags |= REQ_F_ASYNC_DATA; req->async_data = hdr; return hdr; @@ -146,12 +168,27 @@ static struct io_async_msghdr *io_msg_alloc_async(struct io_kiocb *req) if (!io_alloc_async_data(req)) { hdr = 
req->async_data; + hdr->free_iov_nr = 0; hdr->free_iov = NULL; return hdr; } return NULL; } +/* assign new iovec to kmsg, if we need to */ +static int io_net_vec_assign(struct io_kiocb *req, struct io_async_msghdr *kmsg, + struct iovec *iov) +{ + if (iov) { + req->flags |= REQ_F_NEED_CLEANUP; + kmsg->free_iov_nr = kmsg->msg.msg_iter.nr_segs; + if (kmsg->free_iov) + kfree(kmsg->free_iov); + kmsg->free_iov = iov; + } + return 0; +} + #ifdef CONFIG_COMPAT static int io_compat_msg_copy_hdr(struct io_kiocb *req, struct io_async_msghdr *iomsg, @@ -159,7 +196,16 @@ static int io_compat_msg_copy_hdr(struct io_kiocb *req, { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); struct compat_iovec __user *uiov; - int ret; + struct iovec *iov; + int ret, nr_segs; + + if (iomsg->free_iov) { + nr_segs = iomsg->free_iov_nr; + iov = iomsg->free_iov; + } else { + iov = &iomsg->fast_iov; + nr_segs = 1; + } if (copy_from_user(msg, sr->umsg_compat, sizeof(*msg))) return -EFAULT; @@ -168,9 +214,9 @@ static int io_compat_msg_copy_hdr(struct io_kiocb *req, if (req->flags & REQ_F_BUFFER_SELECT) { compat_ssize_t clen; - iomsg->free_iov = NULL; if (msg->msg_iovlen == 0) { - sr->len = 0; + sr->len = iov->iov_len = 0; + iov->iov_base = NULL; } else if (msg->msg_iovlen > 1) { return -EINVAL; } else { @@ -186,14 +232,12 @@ static int io_compat_msg_copy_hdr(struct io_kiocb *req, return 0; } - iomsg->free_iov = iomsg->fast_iov; ret = __import_iovec(ddir, (struct iovec __user *)uiov, msg->msg_iovlen, - UIO_FASTIOV, &iomsg->free_iov, - &iomsg->msg.msg_iter, true); + nr_segs, &iov, &iomsg->msg.msg_iter, true); if (unlikely(ret < 0)) return ret; - return 0; + return io_net_vec_assign(req, iomsg, iov); } #endif @@ -201,7 +245,16 @@ static int io_msg_copy_hdr(struct io_kiocb *req, struct io_async_msghdr *iomsg, struct user_msghdr *msg, int ddir) { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - int ret; + struct iovec *iov; + int ret, nr_segs; + + if (iomsg->free_iov) { + 
nr_segs = iomsg->free_iov_nr; + iov = iomsg->free_iov; + } else { + iov = &iomsg->fast_iov; + nr_segs = 1; + } if (!user_access_begin(sr->umsg, sizeof(*sr->umsg))) return -EFAULT; @@ -217,9 +270,8 @@ static int io_msg_copy_hdr(struct io_kiocb *req, struct io_async_msghdr *iomsg, if (req->flags & REQ_F_BUFFER_SELECT) { if (msg->msg_iovlen == 0) { - sr->len = iomsg->fast_iov[0].iov_len = 0; - iomsg->fast_iov[0].iov_base = NULL; - iomsg->free_iov = NULL; + sr->len = iov->iov_len = 0; + iov->iov_base = NULL; } else if (msg->msg_iovlen > 1) { ret = -EINVAL; goto ua_end; @@ -227,10 +279,9 @@ static int io_msg_copy_hdr(struct io_kiocb *req, struct io_async_msghdr *iomsg, /* we only need the length for provided buffers */ if (!access_ok(&msg->msg_iov[0].iov_len, sizeof(__kernel_size_t))) goto ua_end; - unsafe_get_user(iomsg->fast_iov[0].iov_len, - &msg->msg_iov[0].iov_len, ua_end); - sr->len = iomsg->fast_iov[0].iov_len; - iomsg->free_iov = NULL; + unsafe_get_user(iov->iov_len, &msg->msg_iov[0].iov_len, + ua_end); + sr->len = iov->iov_len; } ret = 0; ua_end: @@ -239,13 +290,12 @@ static int io_msg_copy_hdr(struct io_kiocb *req, struct io_async_msghdr *iomsg, } user_access_end(); - iomsg->free_iov = iomsg->fast_iov; - ret = __import_iovec(ddir, msg->msg_iov, msg->msg_iovlen, UIO_FASTIOV, - &iomsg->free_iov, &iomsg->msg.msg_iter, false); + ret = __import_iovec(ddir, msg->msg_iov, msg->msg_iovlen, nr_segs, + &iov, &iomsg->msg.msg_iter, false); if (unlikely(ret < 0)) return ret; - return 0; + return io_net_vec_assign(req, iomsg, iov); } static int io_sendmsg_copy_hdr(struct io_kiocb *req, @@ -285,7 +335,7 @@ void io_sendmsg_recvmsg_cleanup(struct io_kiocb *req) { struct io_async_msghdr *io = req->async_data; - kfree(io->free_iov); + io_netmsg_iovec_free(io); } static int io_send_setup(struct io_kiocb *req) @@ -369,9 +419,6 @@ static void io_req_msg_cleanup(struct io_kiocb *req, unsigned int issue_flags) { req->flags &= ~REQ_F_NEED_CLEANUP; - /* fast path, check for non-NULL to 
avoid function call */ - if (kmsg->free_iov) - kfree(kmsg->free_iov); io_netmsg_recycle(req, issue_flags); } @@ -622,11 +669,6 @@ static inline void io_recv_prep_retry(struct io_kiocb *req, { struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg); - if (kmsg->free_iov) { - kfree(kmsg->free_iov); - kmsg->free_iov = NULL; - } - req->flags &= ~REQ_F_BL_EMPTY; sr->done_io = 0; sr->len = 0; /* get from the provided buffer */ @@ -942,14 +984,10 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags) void io_send_zc_cleanup(struct io_kiocb *req) { struct io_sr_msg *zc = io_kiocb_to_cmd(req, struct io_sr_msg); - struct io_async_msghdr *io; + struct io_async_msghdr *io = req->async_data; - if (req_has_async_data(req)) { - io = req->async_data; - /* might be ->fast_iov if *msg_copy_hdr failed */ - if (io->free_iov != io->fast_iov) - kfree(io->free_iov); - } + if (req_has_async_data(req)) + io_netmsg_iovec_free(io); if (zc->notif) { io_notif_flush(zc->notif); zc->notif = NULL; @@ -1171,8 +1209,7 @@ int io_send_zc(struct io_kiocb *req, unsigned int issue_flags) */ if (!(issue_flags & IO_URING_F_UNLOCKED)) { io_notif_flush(zc->notif); - io_netmsg_recycle(req, issue_flags); - req->flags &= ~REQ_F_NEED_CLEANUP; + io_req_msg_cleanup(req, kmsg, 0); } io_req_set_res(req, ret, IORING_CQE_F_MORE); return IOU_OK; @@ -1221,13 +1258,7 @@ int io_sendmsg_zc(struct io_kiocb *req, unsigned int issue_flags) ret = -EINTR; req_set_fail(req); } - /* fast path, check for non-NULL to avoid function call */ - if (kmsg->free_iov) { - kfree(kmsg->free_iov); - kmsg->free_iov = NULL; - } - io_netmsg_recycle(req, issue_flags); if (ret >= 0) ret += sr->done_io; else if (sr->done_io) @@ -1239,7 +1270,7 @@ int io_sendmsg_zc(struct io_kiocb *req, unsigned int issue_flags) */ if (!(issue_flags & IO_URING_F_UNLOCKED)) { io_notif_flush(sr->notif); - req->flags &= ~REQ_F_NEED_CLEANUP; + io_req_msg_cleanup(req, kmsg, 0); } io_req_set_res(req, ret, IORING_CQE_F_MORE); return IOU_OK; @@ -1483,6 
+1514,14 @@ int io_connect(struct io_kiocb *req, unsigned int issue_flags) void io_netmsg_cache_free(struct io_cache_entry *entry) { - kfree(container_of(entry, struct io_async_msghdr, cache)); + struct io_async_msghdr *kmsg; + + kmsg = container_of(entry, struct io_async_msghdr, cache); + if (kmsg->free_iov) { + kasan_mempool_unpoison_object(kmsg->free_iov, + kmsg->free_iov_nr * sizeof(struct iovec)); + io_netmsg_iovec_free(kmsg); + } + kfree(kmsg); } #endif diff --git a/io_uring/net.h b/io_uring/net.h index f99ebb9dc0bb..0aef1c992aee 100644 --- a/io_uring/net.h +++ b/io_uring/net.h @@ -8,17 +8,18 @@ struct io_async_msghdr { #if defined(CONFIG_NET) union { - struct iovec fast_iov[UIO_FASTIOV]; + struct iovec fast_iov; struct { - struct iovec fast_iov_one; - __kernel_size_t controllen; - int namelen; - __kernel_size_t payloadlen; + struct io_cache_entry cache; + /* entry size of ->free_iov, if valid */ + int free_iov_nr; }; - struct io_cache_entry cache; }; /* points to an allocated iov, if NULL we use fast_iov instead */ struct iovec *free_iov; + __kernel_size_t controllen; + __kernel_size_t payloadlen; + int namelen; struct sockaddr __user *uaddr; struct msghdr msg; struct sockaddr_storage addr; From patchwork Wed Mar 20 22:55:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13598258 Received: from mail-il1-f173.google.com (mail-il1-f173.google.com [209.85.166.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A3CD485C74 for ; Wed, 20 Mar 2024 22:58:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.166.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975493; cv=none; 
b=Yn6+uDLZuZSOldoTn5YSXZWRYIRvJyWvhcYJwJatJBHdK7dj5QsScqxJiWLupuHYFkrpn/IsGrSSkI6oPCkQAEl54yM8H2EvVCS++TMJ6nJoRotqLef/mYsGcnSfR2+tQYNKMO207LZAsqNMGBIwN2m24w+1mQpcLJp/1K78cJo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975493; c=relaxed/simple; bh=avWACbZY8rQBC3TVNGyl7ZB+G5kSSJrbeH+QRlrTXgo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=gNTc/ih5qwgRNdzH9xIWrQ3KMYO5f2J7ZA7a6VQkzQN7W8auyBZDBKKtKe3mkxWo2aumiarssuXu3zNzafAnf8wJ2FNcQJtLhxaH50o4hlU+tB4XE0yGle+rbGI2F5FiOiBpfMiN5ZyoWMcBzhSFOxCIsD8lJQbeLYRUxQiw29A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=cvGB6TXt; arc=none smtp.client-ip=209.85.166.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="cvGB6TXt" Received: by mail-il1-f173.google.com with SMTP id e9e14a558f8ab-367c7daa395so273255ab.1 for ; Wed, 20 Mar 2024 15:58:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1710975490; x=1711580290; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=R/6uV6v08bWDnGXGmh/MSUfDQcxo0+hKEOxpX5v6xvo=; b=cvGB6TXtx1hN7ovY+QvUoAYc3tqImBNcOnSNUJgOH12gNAgtN3asTyQcYD508/0JBK DQO+Gq9oH6ZkUDoNnpvuu0fhQNcBzkxWzatHtqYK3UeZOITqA00vrFEljzBIvntIn9bv WTzsIs7ZnBqSIsODl2//ZF3otgPF+WrSf3Bo+XxQN/HPNQukl8tBnTLjS/7QZDbbgsjd 
4Od8tug9WWTffKY/xB4quAUeb/e3H51neH27IsA8N8jIfYF3OyfEvET+bGyjHu7H5IvM 7q1rBRuyMR/AMnNOw3KKbM5bXjYLJdVzLzDWr1zXy7P11QhVvUHa9RZ9iBUq+VObEsUp y1Rg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710975490; x=1711580290; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=R/6uV6v08bWDnGXGmh/MSUfDQcxo0+hKEOxpX5v6xvo=; b=Gs1ZmpmD2GtpYIdwuTTVNR2/HW6CLFbEHj0hKYBHRWksuwoEazUlav1xlt4N7IuPQY 9ceFs8xDe/OXCdEU5Uyeo+1xBTm9kre6AEimGbF6+0bk61mfLat2fD6wlDg4oF2Yauka 8LHr7dGm/X5eXpQvwkMkGZew0XXFHBefBplO6pmOXdgfMyW8hXQBxpJRj47MoW+1IuMZ T0dFFiMDvx1IZLN7BkxUzHHUuk/BcHFVENsgnqnFGDit17oqK4/seWXV+wNyQj1KdWB7 z6oAKJBvNqWm0YdY+Z8kwplpQvzeey/EMgLQ5vu6faJ2Eiodxr0kqnxFAbCApgki+6BP O4yQ== X-Gm-Message-State: AOJu0YydlFZVNzKbJT2Rg24m+i42C2P3A/FuNaGpb+MaMg9BVKU4J1eR fhdmhD81WD6wXVBY3ZNgFEOT6EpdjDaTLwLc9camrrP5Nt+qgi+nONx7Gd1LKuQaYBvDUUKpmek Y X-Google-Smtp-Source: AGHT+IF6TfnFjXXkSU6yYtKW3lsUbB9fcVsbfWwMw5xC5SW9pWf/nbnS7T1SwOcWSSygDXobFLK+3A== X-Received: by 2002:a6b:c9d2:0:b0:7cc:6b9:a59c with SMTP id z201-20020a6bc9d2000000b007cc06b9a59cmr7464638iof.1.1710975490405; Wed, 20 Mar 2024 15:58:10 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id z19-20020a6b0a13000000b007cf23a498dcsm434384ioi.38.2024.03.20.15.58.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:58:09 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 09/17] io_uring/net: drop 'kmsg' parameter from io_req_msg_cleanup() Date: Wed, 20 Mar 2024 16:55:24 -0600 Message-ID: <20240320225750.1769647-10-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk> References: <20240320225750.1769647-1-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: 
MIME-Version: 1.0 Now that iovec recycling is being done, the iovec is no longer being freed in there. Hence the kmsg parameter is now useless. Signed-off-by: Jens Axboe --- io_uring/net.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/io_uring/net.c b/io_uring/net.c index 20d6427f4250..9472a66e035c 100644 --- a/io_uring/net.c +++ b/io_uring/net.c @@ -415,7 +415,6 @@ int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) } static void io_req_msg_cleanup(struct io_kiocb *req, - struct io_async_msghdr *kmsg, unsigned int issue_flags) { req->flags &= ~REQ_F_NEED_CLEANUP; @@ -461,7 +460,7 @@ int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags) ret = -EINTR; req_set_fail(req); } - io_req_msg_cleanup(req, kmsg, issue_flags); + io_req_msg_cleanup(req, issue_flags); if (ret >= 0) ret += sr->done_io; else if (sr->done_io) @@ -515,7 +514,7 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags) ret += sr->done_io; else if (sr->done_io) ret = sr->done_io; - io_req_msg_cleanup(req, kmsg, issue_flags); + io_req_msg_cleanup(req, issue_flags); io_req_set_res(req, ret, 0); return IOU_OK; } @@ -723,7 +722,7 @@ static inline bool io_recv_finish(struct io_kiocb *req, int *ret, *ret = IOU_STOP_MULTISHOT; else *ret = IOU_OK; - io_req_msg_cleanup(req, kmsg, issue_flags); + io_req_msg_cleanup(req, issue_flags); return true; } @@ -1209,7 +1208,7 @@ int io_send_zc(struct io_kiocb *req, unsigned int issue_flags) */ if (!(issue_flags & IO_URING_F_UNLOCKED)) { io_notif_flush(zc->notif); - io_req_msg_cleanup(req, kmsg, 0); + io_req_msg_cleanup(req, 0); } io_req_set_res(req, ret, IORING_CQE_F_MORE); return IOU_OK; @@ -1270,7 +1269,7 @@ int io_sendmsg_zc(struct io_kiocb *req, unsigned int issue_flags) */ if (!(issue_flags & IO_URING_F_UNLOCKED)) { io_notif_flush(sr->notif); - io_req_msg_cleanup(req, kmsg, 0); + io_req_msg_cleanup(req, 0); } io_req_set_res(req, ret, IORING_CQE_F_MORE); return IOU_OK; From patchwork Wed Mar 20 
22:55:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13598260 From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 10/17] io_uring/rw: always setup io_async_rw for read/write requests Date: Wed, 20 Mar 2024 16:55:25 -0600 Message-ID: <20240320225750.1769647-11-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk> References: <20240320225750.1769647-1-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 read/write requests try to put everything on the stack, and then alloc and copy if we need to retry. This necessitates a bunch of nasty code that deals with intermediate state. Get rid of this, and have the prep side setup everything we need upfront, which greatly simplifies the opcode handlers. This includes adding an alloc cache for io_async_rw, to make it cheap to handle. In terms of cost, this should be basically free and transparent. For the worst case of {READ,WRITE}_FIXED which didn't need it before, performance is unaffected in the normal peak workload that is being used to test that. Still runs at 122M IOPS. 
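For readers unfamiliar with the alloc cache idea mentioned above, here is a rough user-space sketch of the general pattern. It is not the kernel's io_alloc_cache code: every name in it (async_obj, obj_cache, obj_cache_get, obj_cache_put, obj_cache_free, CACHE_MAX) is hypothetical and chosen only for this illustration, and it ignores locking entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_MAX 4	/* hypothetical cap on cached entries */

/*
 * Hypothetical stand-in for the async data. The free-list link shares
 * storage with per-request state that is dead while the object is cached.
 */
struct async_obj {
	union {
		size_t bytes_done;		/* valid while a request owns it */
		struct async_obj *next;		/* valid while it sits in the cache */
	};
	char payload[64];
};

struct obj_cache {
	struct async_obj *head;
	unsigned int nr_cached;
};

/* Pop a cached object, or fall back to malloc() when the cache is empty. */
static struct async_obj *obj_cache_get(struct obj_cache *c)
{
	struct async_obj *obj = c->head;

	if (obj) {
		c->head = obj->next;
		c->nr_cached--;
	} else {
		obj = malloc(sizeof(*obj));
		if (!obj)
			return NULL;
	}
	obj->bytes_done = 0;	/* reset state clobbered by the list link */
	return obj;
}

/* Return an object to the cache; free it (and return 0) if the cache is full. */
static int obj_cache_put(struct obj_cache *c, struct async_obj *obj)
{
	if (c->nr_cached >= CACHE_MAX) {
		free(obj);
		return 0;
	}
	obj->next = c->head;
	c->head = obj;
	c->nr_cached++;
	return 1;
}

/* Drop everything still held by the cache. */
static void obj_cache_free(struct obj_cache *c)
{
	while (c->head) {
		struct async_obj *obj = c->head;

		c->head = obj->next;
		free(obj);
	}
	c->nr_cached = 0;
}

/* Small self-check: a recycled object is handed back on the next get. */
static int obj_cache_selftest(void)
{
	struct obj_cache c = { NULL, 0 };
	struct async_obj *a = obj_cache_get(&c);
	struct async_obj *b;

	if (!a)
		return -1;
	a->bytes_done = 123;
	obj_cache_put(&c, a);		/* cached, not freed */
	b = obj_cache_get(&c);		/* same object comes back, state reset */
	assert(b == a && b->bytes_done == 0);
	obj_cache_put(&c, b);
	obj_cache_free(&c);
	return 0;
}
```

The union in the sketch mirrors the layout change to struct io_async_rw in this patch, where bytes_done and the cache entry share storage; that is why the get path has to reset the field before handing the object out.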
Signed-off-by: Jens Axboe --- include/linux/io_uring_types.h | 1 + io_uring/io_uring.c | 3 + io_uring/opdef.c | 15 +- io_uring/rw.c | 538 ++++++++++++++++----------------- io_uring/rw.h | 19 +- 5 files changed, 278 insertions(+), 298 deletions(-) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index f37caff64d05..2ba8676f83cc 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -300,6 +300,7 @@ struct io_ring_ctx { struct io_hash_table cancel_table_locked; struct io_alloc_cache apoll_cache; struct io_alloc_cache netmsg_cache; + struct io_alloc_cache rw_cache; /* * Any cancelable uring_cmd is added to this list in diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index ff0e233ce3c9..cc8ce830ff4b 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -308,6 +308,8 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) sizeof(struct async_poll)); io_alloc_cache_init(&ctx->netmsg_cache, IO_ALLOC_CACHE_MAX, sizeof(struct io_async_msghdr)); + io_alloc_cache_init(&ctx->rw_cache, IO_ALLOC_CACHE_MAX, + sizeof(struct io_async_rw)); io_futex_cache_init(ctx); init_completion(&ctx->ref_comp); xa_init_flags(&ctx->personalities, XA_FLAGS_ALLOC1); @@ -2898,6 +2900,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx) io_eventfd_unregister(ctx); io_alloc_cache_free(&ctx->apoll_cache, io_apoll_cache_free); io_alloc_cache_free(&ctx->netmsg_cache, io_netmsg_cache_free); + io_alloc_cache_free(&ctx->rw_cache, io_rw_cache_free); io_futex_cache_free(ctx); io_destroy_buffers(ctx); mutex_unlock(&ctx->uring_lock); diff --git a/io_uring/opdef.c b/io_uring/opdef.c index dd4a1e1425e1..fcae75a08f2c 100644 --- a/io_uring/opdef.c +++ b/io_uring/opdef.c @@ -67,7 +67,7 @@ const struct io_issue_def io_issue_defs[] = { .iopoll = 1, .iopoll_queue = 1, .vectored = 1, - .prep = io_prep_rwv, + .prep = io_prep_readv, .issue = io_read, }, [IORING_OP_WRITEV] = { @@ -81,7 +81,7 @@ const struct 
io_issue_def io_issue_defs[] = { .iopoll = 1, .iopoll_queue = 1, .vectored = 1, - .prep = io_prep_rwv, + .prep = io_prep_writev, .issue = io_write, }, [IORING_OP_FSYNC] = { @@ -99,7 +99,7 @@ const struct io_issue_def io_issue_defs[] = { .ioprio = 1, .iopoll = 1, .iopoll_queue = 1, - .prep = io_prep_rw_fixed, + .prep = io_prep_read_fixed, .issue = io_read, }, [IORING_OP_WRITE_FIXED] = { @@ -112,7 +112,7 @@ const struct io_issue_def io_issue_defs[] = { .ioprio = 1, .iopoll = 1, .iopoll_queue = 1, - .prep = io_prep_rw_fixed, + .prep = io_prep_write_fixed, .issue = io_write, }, [IORING_OP_POLL_ADD] = { @@ -239,7 +239,7 @@ const struct io_issue_def io_issue_defs[] = { .ioprio = 1, .iopoll = 1, .iopoll_queue = 1, - .prep = io_prep_rw, + .prep = io_prep_read, .issue = io_read, }, [IORING_OP_WRITE] = { @@ -252,7 +252,7 @@ const struct io_issue_def io_issue_defs[] = { .ioprio = 1, .iopoll = 1, .iopoll_queue = 1, - .prep = io_prep_rw, + .prep = io_prep_write, .issue = io_write, }, [IORING_OP_FADVISE] = { @@ -490,14 +490,12 @@ const struct io_cold_def io_cold_defs[] = { [IORING_OP_READV] = { .async_size = sizeof(struct io_async_rw), .name = "READV", - .prep_async = io_readv_prep_async, .cleanup = io_readv_writev_cleanup, .fail = io_rw_fail, }, [IORING_OP_WRITEV] = { .async_size = sizeof(struct io_async_rw), .name = "WRITEV", - .prep_async = io_writev_prep_async, .cleanup = io_readv_writev_cleanup, .fail = io_rw_fail, }, @@ -699,6 +697,7 @@ const struct io_cold_def io_cold_defs[] = { #endif }, [IORING_OP_READ_MULTISHOT] = { + .async_size = sizeof(struct io_async_rw), .name = "READ_MULTISHOT", }, [IORING_OP_WAITID] = { diff --git a/io_uring/rw.c b/io_uring/rw.c index 35216e8adc29..583fe61a0acb 100644 --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -75,7 +75,153 @@ static int io_iov_buffer_select_prep(struct io_kiocb *req) return 0; } -int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe) +static int __io_import_iovec(int ddir, struct io_kiocb *req, + struct 
io_async_rw *io, + unsigned int issue_flags) +{ + const struct io_issue_def *def = &io_issue_defs[req->opcode]; + struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw); + void __user *buf; + size_t sqe_len; + + buf = u64_to_user_ptr(rw->addr); + sqe_len = rw->len; + + if (!def->vectored || req->flags & REQ_F_BUFFER_SELECT) { + if (io_do_buffer_select(req)) { + buf = io_buffer_select(req, &sqe_len, issue_flags); + if (!buf) + return -ENOBUFS; + rw->addr = (unsigned long) buf; + rw->len = sqe_len; + } + + return import_ubuf(ddir, buf, sqe_len, &io->s.iter); + } + + io->free_iovec = io->s.fast_iov; + return __import_iovec(ddir, buf, sqe_len, UIO_FASTIOV, &io->free_iovec, + &io->s.iter, req->ctx->compat); +} + +static inline int io_import_iovec(int rw, struct io_kiocb *req, + struct io_async_rw *io, + unsigned int issue_flags) +{ + int ret; + + ret = __io_import_iovec(rw, req, io, issue_flags); + if (unlikely(ret < 0)) + return ret; + + iov_iter_save_state(&io->s.iter, &io->s.iter_state); + return 0; +} + +static void io_rw_iovec_free(struct io_async_rw *rw) +{ + if (rw->free_iovec) { + kfree(rw->free_iovec); + rw->free_iovec = NULL; + } +} + +static void io_rw_recycle(struct io_kiocb *req, unsigned int issue_flags) +{ + struct io_async_rw *rw = req->async_data; + + if (unlikely(issue_flags & IO_URING_F_UNLOCKED)) { + io_rw_iovec_free(rw); + return; + } + if (io_alloc_cache_put(&req->ctx->rw_cache, &rw->cache)) { + req->async_data = NULL; + req->flags &= ~REQ_F_ASYNC_DATA; + } +} + +static void io_req_rw_cleanup(struct io_kiocb *req, unsigned int issue_flags) +{ + /* + * Disable quick recycling for anything that's gone through io-wq. + * In theory, this should be fine to cleanup. However, some read or + * write iter handling touches the iovec AFTER having called into the + * handler, eg to reexpand or revert. 
This means we can have: + * + * task io-wq + * issue + * punt to io-wq + * issue + * blkdev_write_iter() + * ->ki_complete() + * io_complete_rw() + * queue tw complete + * run tw + * req_rw_cleanup + * iov_iter_count() <- look at iov_iter again + * + * which can lead to a UAF. This is only possible for io-wq offload + * as the cleanup can run in parallel. As io-wq is not the fast path, + * just leave cleanup to the end. + * + * This is really a bug in the core code that does this, any issue + * path should assume that a successful (or -EIOCBQUEUED) return can + * mean that the underlying data can be gone at any time. But that + * should be fixed seperately, and then this check could be killed. + */ + if (!(req->flags & REQ_F_REFCOUNT)) { + req->flags &= ~REQ_F_NEED_CLEANUP; + io_rw_recycle(req, issue_flags); + } +} + +static int io_rw_alloc_async(struct io_kiocb *req) +{ + struct io_ring_ctx *ctx = req->ctx; + struct io_cache_entry *entry; + struct io_async_rw *rw; + + entry = io_alloc_cache_get(&ctx->rw_cache); + if (entry) { + rw = container_of(entry, struct io_async_rw, cache); + req->flags |= REQ_F_ASYNC_DATA; + req->async_data = rw; + goto done; + } + + if (!io_alloc_async_data(req)) { + rw = req->async_data; +done: + rw->free_iovec = NULL; + rw->bytes_done = 0; + return 0; + } + + return -ENOMEM; +} + +static int io_prep_rw_setup(struct io_kiocb *req, int ddir, bool do_import) +{ + struct io_async_rw *rw; + int ret; + + if (io_rw_alloc_async(req)) + return -ENOMEM; + + if (!do_import || io_do_buffer_select(req)) + return 0; + + rw = req->async_data; + ret = io_import_iovec(ddir, req, rw, 0); + if (unlikely(ret < 0)) + return ret; + + iov_iter_save_state(&rw->s.iter, &rw->s.iter_state); + return 0; +} + +static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe, + int ddir, bool do_import) { struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw); unsigned ioprio; @@ -100,34 +246,58 @@ int io_prep_rw(struct io_kiocb *req, const struct 
io_uring_sqe *sqe) rw->addr = READ_ONCE(sqe->addr); rw->len = READ_ONCE(sqe->len); rw->flags = READ_ONCE(sqe->rw_flags); - return 0; + return io_prep_rw_setup(req, ddir, do_import); } -int io_prep_rwv(struct io_kiocb *req, const struct io_uring_sqe *sqe) +int io_prep_read(struct io_kiocb *req, const struct io_uring_sqe *sqe) { + return io_prep_rw(req, sqe, ITER_DEST, true); +} + +int io_prep_write(struct io_kiocb *req, const struct io_uring_sqe *sqe) +{ + return io_prep_rw(req, sqe, ITER_SOURCE, true); +} + +static int io_prep_rwv(struct io_kiocb *req, const struct io_uring_sqe *sqe, + int ddir) +{ + const bool do_import = !(req->flags & REQ_F_BUFFER_SELECT); int ret; - ret = io_prep_rw(req, sqe); + ret = io_prep_rw(req, sqe, ddir, do_import); if (unlikely(ret)) return ret; + if (do_import) + return 0; /* * Have to do this validation here, as this is in io_read() rw->len * might have chanaged due to buffer selection */ - if (req->flags & REQ_F_BUFFER_SELECT) - return io_iov_buffer_select_prep(req); + return io_iov_buffer_select_prep(req); +} - return 0; +int io_prep_readv(struct io_kiocb *req, const struct io_uring_sqe *sqe) +{ + return io_prep_rwv(req, sqe, ITER_DEST); +} + +int io_prep_writev(struct io_kiocb *req, const struct io_uring_sqe *sqe) +{ + return io_prep_rwv(req, sqe, ITER_SOURCE); } -int io_prep_rw_fixed(struct io_kiocb *req, const struct io_uring_sqe *sqe) +static int io_prep_rw_fixed(struct io_kiocb *req, const struct io_uring_sqe *sqe, + int ddir) { + struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw); struct io_ring_ctx *ctx = req->ctx; + struct io_async_rw *io; u16 index; int ret; - ret = io_prep_rw(req, sqe); + ret = io_prep_rw(req, sqe, ddir, false); if (unlikely(ret)) return ret; @@ -136,7 +306,21 @@ int io_prep_rw_fixed(struct io_kiocb *req, const struct io_uring_sqe *sqe) index = array_index_nospec(req->buf_index, ctx->nr_user_bufs); req->imu = ctx->user_bufs[index]; io_req_set_rsrc_node(req, ctx, 0); - return 0; + + io = req->async_data; 
+ ret = io_import_fixed(ddir, &io->s.iter, req->imu, rw->addr, rw->len); + iov_iter_save_state(&io->s.iter, &io->s.iter_state); + return ret; +} + +int io_prep_read_fixed(struct io_kiocb *req, const struct io_uring_sqe *sqe) +{ + return io_prep_rw_fixed(req, sqe, ITER_DEST); +} + +int io_prep_write_fixed(struct io_kiocb *req, const struct io_uring_sqe *sqe) +{ + return io_prep_rw_fixed(req, sqe, ITER_SOURCE); } /* @@ -152,7 +336,7 @@ int io_read_mshot_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) if (!(req->flags & REQ_F_BUFFER_SELECT)) return -EINVAL; - ret = io_prep_rw(req, sqe); + ret = io_prep_rw(req, sqe, ITER_DEST, false); if (unlikely(ret)) return ret; @@ -165,9 +349,7 @@ int io_read_mshot_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) void io_readv_writev_cleanup(struct io_kiocb *req) { - struct io_async_rw *io = req->async_data; - - kfree(io->free_iovec); + io_rw_iovec_free(req->async_data); } static inline loff_t *io_kiocb_update_pos(struct io_kiocb *req) @@ -188,14 +370,11 @@ static inline loff_t *io_kiocb_update_pos(struct io_kiocb *req) } #ifdef CONFIG_BLOCK -static bool io_resubmit_prep(struct io_kiocb *req) +static void io_resubmit_prep(struct io_kiocb *req) { struct io_async_rw *io = req->async_data; - if (!req_has_async_data(req)) - return !io_req_prep_async(req); iov_iter_restore(&io->s.iter, &io->s.iter_state); - return true; } static bool io_rw_should_reissue(struct io_kiocb *req) @@ -224,9 +403,8 @@ static bool io_rw_should_reissue(struct io_kiocb *req) return true; } #else -static bool io_resubmit_prep(struct io_kiocb *req) +static void io_resubmit_prep(struct io_kiocb *req) { - return false; } static bool io_rw_should_reissue(struct io_kiocb *req) { @@ -308,6 +486,7 @@ void io_req_rw_complete(struct io_kiocb *req, struct io_tw_state *ts) if (req->flags & (REQ_F_BUFFER_SELECTED|REQ_F_BUFFER_RING)) req->cqe.flags |= io_put_kbuf(req, 0); + io_req_rw_cleanup(req, 0); io_req_task_complete(req, ts); } @@ -388,6 +567,7 @@ 
static int kiocb_done(struct io_kiocb *req, ssize_t ret, io_req_io_end(req); io_req_set_res(req, final_ret, io_put_kbuf(req, issue_flags)); + io_req_rw_cleanup(req, issue_flags); return IOU_OK; } } else { @@ -396,71 +576,12 @@ static int kiocb_done(struct io_kiocb *req, ssize_t ret, if (req->flags & REQ_F_REISSUE) { req->flags &= ~REQ_F_REISSUE; - if (io_resubmit_prep(req)) - return -EAGAIN; - else - io_req_task_queue_fail(req, final_ret); + io_resubmit_prep(req); + return -EAGAIN; } return IOU_ISSUE_SKIP_COMPLETE; } -static struct iovec *__io_import_iovec(int ddir, struct io_kiocb *req, - struct io_rw_state *s, - unsigned int issue_flags) -{ - struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw); - struct iov_iter *iter = &s->iter; - u8 opcode = req->opcode; - struct iovec *iovec; - void __user *buf; - size_t sqe_len; - ssize_t ret; - - if (opcode == IORING_OP_READ_FIXED || opcode == IORING_OP_WRITE_FIXED) { - ret = io_import_fixed(ddir, iter, req->imu, rw->addr, rw->len); - if (ret) - return ERR_PTR(ret); - return NULL; - } - - buf = u64_to_user_ptr(rw->addr); - sqe_len = rw->len; - - if (!io_issue_defs[opcode].vectored || req->flags & REQ_F_BUFFER_SELECT) { - if (io_do_buffer_select(req)) { - buf = io_buffer_select(req, &sqe_len, issue_flags); - if (!buf) - return ERR_PTR(-ENOBUFS); - rw->addr = (unsigned long) buf; - rw->len = sqe_len; - } - - ret = import_ubuf(ddir, buf, sqe_len, iter); - if (ret) - return ERR_PTR(ret); - return NULL; - } - - iovec = s->fast_iov; - ret = __import_iovec(ddir, buf, sqe_len, UIO_FASTIOV, &iovec, iter, - req->ctx->compat); - if (unlikely(ret < 0)) - return ERR_PTR(ret); - return iovec; -} - -static inline int io_import_iovec(int rw, struct io_kiocb *req, - struct iovec **iovec, struct io_rw_state *s, - unsigned int issue_flags) -{ - *iovec = __io_import_iovec(rw, req, s, issue_flags); - if (IS_ERR(*iovec)) - return PTR_ERR(*iovec); - - iov_iter_save_state(&s->iter, &s->iter_state); - return 0; -} - static inline loff_t 
*io_kiocb_ppos(struct kiocb *kiocb) { return (kiocb->ki_filp->f_mode & FMODE_STREAM) ? NULL : &kiocb->ki_pos; @@ -532,89 +653,6 @@ static ssize_t loop_rw_iter(int ddir, struct io_rw *rw, struct iov_iter *iter) return ret; } -static void io_req_map_rw(struct io_kiocb *req, const struct iovec *iovec, - const struct iovec *fast_iov, struct iov_iter *iter) -{ - struct io_async_rw *io = req->async_data; - - memcpy(&io->s.iter, iter, sizeof(*iter)); - io->free_iovec = iovec; - io->bytes_done = 0; - /* can only be fixed buffers, no need to do anything */ - if (iov_iter_is_bvec(iter) || iter_is_ubuf(iter)) - return; - if (!iovec) { - unsigned iov_off = 0; - - io->s.iter.__iov = io->s.fast_iov; - if (iter->__iov != fast_iov) { - iov_off = iter_iov(iter) - fast_iov; - io->s.iter.__iov += iov_off; - } - if (io->s.fast_iov != fast_iov) - memcpy(io->s.fast_iov + iov_off, fast_iov + iov_off, - sizeof(struct iovec) * iter->nr_segs); - } else { - req->flags |= REQ_F_NEED_CLEANUP; - } -} - -static int io_setup_async_rw(struct io_kiocb *req, const struct iovec *iovec, - struct io_rw_state *s, bool force) -{ - if (!force && !io_cold_defs[req->opcode].prep_async) - return 0; - /* opcode type doesn't need async data */ - if (!io_cold_defs[req->opcode].async_size) - return 0; - if (!req_has_async_data(req)) { - struct io_async_rw *iorw; - - if (io_alloc_async_data(req)) { - kfree(iovec); - return -ENOMEM; - } - - io_req_map_rw(req, iovec, s->fast_iov, &s->iter); - iorw = req->async_data; - /* we've copied and mapped the iter, ensure state is saved */ - iov_iter_save_state(&iorw->s.iter, &iorw->s.iter_state); - } - return 0; -} - -static inline int io_rw_prep_async(struct io_kiocb *req, int rw) -{ - struct io_async_rw *iorw = req->async_data; - struct iovec *iov; - int ret; - - iorw->bytes_done = 0; - iorw->free_iovec = NULL; - - /* submission path, ->uring_lock should already be taken */ - ret = io_import_iovec(rw, req, &iov, &iorw->s, 0); - if (unlikely(ret < 0)) - return ret; - - if 
(iov) { - iorw->free_iovec = iov; - req->flags |= REQ_F_NEED_CLEANUP; - } - - return 0; -} - -int io_readv_prep_async(struct io_kiocb *req) -{ - return io_rw_prep_async(req, ITER_DEST); -} - -int io_writev_prep_async(struct io_kiocb *req) -{ - return io_rw_prep_async(req, ITER_SOURCE); -} - /* * This is our waitqueue callback handler, registered through __folio_lock_async() * when we initially tried to do the IO with the iocb armed our waitqueue. @@ -754,54 +792,28 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode) static int __io_read(struct io_kiocb *req, unsigned int issue_flags) { + bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK; struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw); - struct io_rw_state __s, *s = &__s; - struct iovec *iovec; + struct io_async_rw *io = req->async_data; struct kiocb *kiocb = &rw->kiocb; - bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK; - struct io_async_rw *io; - ssize_t ret, ret2; + ssize_t ret; loff_t *ppos; - if (!req_has_async_data(req)) { - ret = io_import_iovec(ITER_DEST, req, &iovec, s, issue_flags); + if (io_do_buffer_select(req)) { + ret = io_import_iovec(ITER_DEST, req, io, issue_flags); if (unlikely(ret < 0)) return ret; - } else { - io = req->async_data; - s = &io->s; - - /* - * Safe and required to re-import if we're using provided - * buffers, as we dropped the selected one before retry. - */ - if (io_do_buffer_select(req)) { - ret = io_import_iovec(ITER_DEST, req, &iovec, s, issue_flags); - if (unlikely(ret < 0)) - return ret; - } - - /* - * We come here from an earlier attempt, restore our state to - * match in case it doesn't. It's cheap enough that we don't - * need to make this conditional. 
- */ - iov_iter_restore(&s->iter, &s->iter_state); - iovec = NULL; } + ret = io_rw_init_file(req, FMODE_READ); - if (unlikely(ret)) { - kfree(iovec); + if (unlikely(ret)) return ret; - } - req->cqe.res = iov_iter_count(&s->iter); + req->cqe.res = iov_iter_count(&io->s.iter); if (force_nonblock) { /* If the file doesn't support async, just async punt */ - if (unlikely(!io_file_supports_nowait(req))) { - ret = io_setup_async_rw(req, iovec, s, true); - return ret ?: -EAGAIN; - } + if (unlikely(!io_file_supports_nowait(req))) + return -EAGAIN; kiocb->ki_flags |= IOCB_NOWAIT; } else { /* Ensure we clear previously set non-block flag */ @@ -811,20 +823,15 @@ static int __io_read(struct io_kiocb *req, unsigned int issue_flags) ppos = io_kiocb_update_pos(req); ret = rw_verify_area(READ, req->file, ppos, req->cqe.res); - if (unlikely(ret)) { - kfree(iovec); + if (unlikely(ret)) return ret; - } - ret = io_iter_do_read(rw, &s->iter); + ret = io_iter_do_read(rw, &io->s.iter); if (ret == -EAGAIN || (req->flags & REQ_F_REISSUE)) { req->flags &= ~REQ_F_REISSUE; - /* - * If we can poll, just do that. For a vectored read, we'll - * need to copy state first. - */ - if (io_file_can_poll(req) && !io_issue_defs[req->opcode].vectored) + /* If we can poll, just do that. */ + if (io_file_can_poll(req)) return -EAGAIN; /* IOPOLL retry should happen for io-wq threads */ if (!force_nonblock && !(req->ctx->flags & IORING_SETUP_IOPOLL)) @@ -834,8 +841,6 @@ static int __io_read(struct io_kiocb *req, unsigned int issue_flags) goto done; ret = 0; } else if (ret == -EIOCBQUEUED) { - if (iovec) - kfree(iovec); return IOU_ISSUE_SKIP_COMPLETE; } else if (ret == req->cqe.res || ret <= 0 || !force_nonblock || (req->flags & REQ_F_NOWAIT) || !need_complete_io(req)) { @@ -848,21 +853,7 @@ static int __io_read(struct io_kiocb *req, unsigned int issue_flags) * untouched in case of error. Restore it and we'll advance it * manually if we need to. 
*/ - iov_iter_restore(&s->iter, &s->iter_state); - - ret2 = io_setup_async_rw(req, iovec, s, true); - iovec = NULL; - if (ret2) { - ret = ret > 0 ? ret : ret2; - goto done; - } - - io = req->async_data; - s = &io->s; - /* - * Now use our persistent iterator and state, if we aren't already. - * We've restored and mapped the iter to match. - */ + iov_iter_restore(&io->s.iter, &io->s.iter_state); do { /* @@ -870,11 +861,11 @@ static int __io_read(struct io_kiocb *req, unsigned int issue_flags) * above or inside this loop. Advance the iter by the bytes * that were consumed. */ - iov_iter_advance(&s->iter, ret); - if (!iov_iter_count(&s->iter)) + iov_iter_advance(&io->s.iter, ret); + if (!iov_iter_count(&io->s.iter)) break; io->bytes_done += ret; - iov_iter_save_state(&s->iter, &s->iter_state); + iov_iter_save_state(&io->s.iter, &io->s.iter_state); /* if we can retry, do so with the callbacks armed */ if (!io_rw_should_retry(req)) { @@ -882,24 +873,22 @@ static int __io_read(struct io_kiocb *req, unsigned int issue_flags) return -EAGAIN; } - req->cqe.res = iov_iter_count(&s->iter); + req->cqe.res = iov_iter_count(&io->s.iter); /* * Now retry read with the IOCB_WAITQ parts set in the iocb. If * we get -EIOCBQUEUED, then we'll get a notification when the * desired page gets unlocked. We can also get a partial read * here, and if we do, then just retry at the new offset. */ - ret = io_iter_do_read(rw, &s->iter); + ret = io_iter_do_read(rw, &io->s.iter); if (ret == -EIOCBQUEUED) return IOU_ISSUE_SKIP_COMPLETE; /* we got some bytes, but not all. retry. 
*/
 		kiocb->ki_flags &= ~IOCB_WAITQ;
-		iov_iter_restore(&s->iter, &s->iter_state);
+		iov_iter_restore(&io->s.iter, &io->s.iter_state);
 	} while (ret > 0);
 done:
 	/* it's faster to check here then delegate to kfree */
-	if (iovec)
-		kfree(iovec);
 	return ret;
 }
@@ -908,8 +897,9 @@ int io_read(struct io_kiocb *req, unsigned int issue_flags)
 	int ret;
 
 	ret = __io_read(req, issue_flags);
-	if (ret >= 0)
-		return kiocb_done(req, ret, issue_flags);
+	if (ret >= 0) {
+		ret = kiocb_done(req, ret, issue_flags);
+	}
 
 	return ret;
 }
@@ -974,6 +964,7 @@ int io_read_mshot(struct io_kiocb *req, unsigned int issue_flags)
 	 * multishot request, hitting overflow will terminate it.
 	 */
 	io_req_set_res(req, ret, cflags);
+	io_req_rw_cleanup(req, issue_flags);
 	if (issue_flags & IO_URING_F_MULTISHOT)
 		return IOU_STOP_MULTISHOT;
 	return IOU_OK;
@@ -981,42 +972,28 @@ int io_read_mshot(struct io_kiocb *req, unsigned int issue_flags)
 
 int io_write(struct io_kiocb *req, unsigned int issue_flags)
 {
+	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
 	struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
-	struct io_rw_state __s, *s = &__s;
-	struct iovec *iovec;
+	struct io_async_rw *io = req->async_data;
 	struct kiocb *kiocb = &rw->kiocb;
-	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
 	ssize_t ret, ret2;
 	loff_t *ppos;
 
-	if (!req_has_async_data(req)) {
-		ret = io_import_iovec(ITER_SOURCE, req, &iovec, s, issue_flags);
-		if (unlikely(ret < 0))
-			return ret;
-	} else {
-		struct io_async_rw *io = req->async_data;
-
-		s = &io->s;
-		iov_iter_restore(&s->iter, &s->iter_state);
-		iovec = NULL;
-	}
 	ret = io_rw_init_file(req, FMODE_WRITE);
-	if (unlikely(ret)) {
-		kfree(iovec);
+	if (unlikely(ret))
 		return ret;
-	}
-	req->cqe.res = iov_iter_count(&s->iter);
+	req->cqe.res = iov_iter_count(&io->s.iter);
 
 	if (force_nonblock) {
 		/* If the file doesn't support async, just async punt */
 		if (unlikely(!io_file_supports_nowait(req)))
-			goto copy_iov;
+			goto ret_eagain;
 
 		/* File path supports NOWAIT for non-direct_IO only for block devices. */
 		if (!(kiocb->ki_flags & IOCB_DIRECT) &&
 		    !(kiocb->ki_filp->f_mode & FMODE_BUF_WASYNC) &&
 		    (req->flags & REQ_F_ISREG))
-			goto copy_iov;
+			goto ret_eagain;
 
 		kiocb->ki_flags |= IOCB_NOWAIT;
 	} else {
@@ -1027,19 +1004,17 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	ppos = io_kiocb_update_pos(req);
 
 	ret = rw_verify_area(WRITE, req->file, ppos, req->cqe.res);
-	if (unlikely(ret)) {
-		kfree(iovec);
+	if (unlikely(ret))
 		return ret;
-	}
 
 	if (req->flags & REQ_F_ISREG)
 		kiocb_start_write(kiocb);
 	kiocb->ki_flags |= IOCB_WRITE;
 
 	if (likely(req->file->f_op->write_iter))
-		ret2 = call_write_iter(req->file, kiocb, &s->iter);
+		ret2 = call_write_iter(req->file, kiocb, &io->s.iter);
 	else if (req->file->f_op->write)
-		ret2 = loop_rw_iter(WRITE, rw, &s->iter);
+		ret2 = loop_rw_iter(WRITE, rw, &io->s.iter);
 	else
 		ret2 = -EINVAL;
 
@@ -1060,11 +1035,9 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	if (!force_nonblock || ret2 != -EAGAIN) {
 		/* IOPOLL retry should happen for io-wq threads */
 		if (ret2 == -EAGAIN && (req->ctx->flags & IORING_SETUP_IOPOLL))
-			goto copy_iov;
+			goto ret_eagain;
 
 		if (ret2 != req->cqe.res && ret2 >= 0 && need_complete_io(req)) {
-			struct io_async_rw *io;
-
 			trace_io_uring_short_write(req->ctx, kiocb->ki_pos - ret2,
 						req->cqe.res, ret2);
 
@@ -1073,33 +1046,22 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
 			 * in the worker. Also update bytes_done to account for
 			 * the bytes already written.
 			 */
-			iov_iter_save_state(&s->iter, &s->iter_state);
-			ret = io_setup_async_rw(req, iovec, s, true);
-
-			io = req->async_data;
-			if (io)
-				io->bytes_done += ret2;
+			iov_iter_save_state(&io->s.iter, &io->s.iter_state);
+			io->bytes_done += ret2;
 
 			if (kiocb->ki_flags & IOCB_WRITE)
 				io_req_end_write(req);
-			return ret ? ret : -EAGAIN;
+			return -EAGAIN;
 		}
 done:
 		ret = kiocb_done(req, ret2, issue_flags);
 	} else {
-copy_iov:
-		iov_iter_restore(&s->iter, &s->iter_state);
-		ret = io_setup_async_rw(req, iovec, s, false);
-		if (!ret) {
-			if (kiocb->ki_flags & IOCB_WRITE)
-				io_req_end_write(req);
-			return -EAGAIN;
-		}
-		return ret;
+ret_eagain:
+		iov_iter_restore(&io->s.iter, &io->s.iter_state);
+		if (kiocb->ki_flags & IOCB_WRITE)
+			io_req_end_write(req);
+		return -EAGAIN;
 	}
-	/* it's reportedly faster than delegating the null check to kfree() */
-	if (iovec)
-		kfree(iovec);
 	return ret;
 }
@@ -1174,6 +1136,8 @@ int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
 			break;
 		nr_events++;
 		req->cqe.flags = io_put_kbuf(req, 0);
+		if (req->opcode != IORING_OP_URING_CMD)
+			io_req_rw_cleanup(req, 0);
 	}
 	if (unlikely(!nr_events))
 		return 0;
@@ -1187,3 +1151,11 @@ int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
 	__io_submit_flush_completions(ctx);
 	return nr_events;
 }
+
+void io_rw_cache_free(struct io_cache_entry *entry)
+{
+	struct io_async_rw *rw;
+
+	rw = container_of(entry, struct io_async_rw, cache);
+	kfree(rw);
+}
diff --git a/io_uring/rw.h b/io_uring/rw.h
index f9e89b4fe4da..f7905070d10b 100644
--- a/io_uring/rw.h
+++ b/io_uring/rw.h
@@ -9,21 +9,26 @@ struct io_rw_state {
 };
 
 struct io_async_rw {
+	union {
+		size_t bytes_done;
+		struct io_cache_entry cache;
+	};
 	struct io_rw_state s;
-	const struct iovec *free_iovec;
-	size_t bytes_done;
+	struct iovec *free_iovec;
 	struct wait_page_queue wpq;
 };
 
-int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe);
-int io_prep_rwv(struct io_kiocb *req, const struct io_uring_sqe *sqe);
-int io_prep_rw_fixed(struct io_kiocb *req, const struct io_uring_sqe *sqe);
+int io_prep_read_fixed(struct io_kiocb *req, const struct io_uring_sqe *sqe);
+int io_prep_write_fixed(struct io_kiocb *req, const struct io_uring_sqe *sqe);
+int io_prep_readv(struct io_kiocb *req, const struct io_uring_sqe *sqe);
+int io_prep_writev(struct io_kiocb *req, const struct io_uring_sqe *sqe);
+int io_prep_read(struct io_kiocb *req, const struct io_uring_sqe *sqe);
+int io_prep_write(struct io_kiocb *req, const struct io_uring_sqe *sqe);
 int io_read(struct io_kiocb *req, unsigned int issue_flags);
-int io_readv_prep_async(struct io_kiocb *req);
 int io_write(struct io_kiocb *req, unsigned int issue_flags);
-int io_writev_prep_async(struct io_kiocb *req);
 void io_readv_writev_cleanup(struct io_kiocb *req);
 void io_rw_fail(struct io_kiocb *req);
 void io_req_rw_complete(struct io_kiocb *req, struct io_tw_state *ts);
 int io_read_mshot_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
 int io_read_mshot(struct io_kiocb *req, unsigned int issue_flags);
+void io_rw_cache_free(struct io_cache_entry *entry);

From patchwork Wed Mar 20 22:55:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13598259
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 11/17] io_uring: get rid of struct io_rw_state
Date: Wed, 20 Mar 2024 16:55:26 -0600
Message-ID: <20240320225750.1769647-12-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk>
References: <20240320225750.1769647-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

A separate state struct is not needed anymore, just fold it in with io_async_rw.
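
As a rough illustration of the fold described above, here is a minimal userspace sketch. All demo_* names are hypothetical stand-ins and not the kernel definitions; the real structs carry struct iov_iter, struct iov_iter_state and a fast iovec. It only shows that hoisting the nested members into the parent keeps the layout and turns every io->s.X access into io->X.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the iterator and its saved state. */
struct demo_iter { size_t count; };
struct demo_iter_state { size_t count; };

/* Before: iterator and saved state live in a separate nested struct. */
struct demo_rw_state {
	struct demo_iter iter;
	struct demo_iter_state iter_state;
};

struct demo_async_rw_old {
	size_t bytes_done;
	struct demo_rw_state s;
};

/* After: the same members folded directly into the parent struct. */
struct demo_async_rw_new {
	size_t bytes_done;
	struct demo_iter iter;
	struct demo_iter_state iter_state;
};

/* Old access pattern goes through the extra ->s. hop. */
static void demo_save_state_old(struct demo_async_rw_old *io)
{
	io->s.iter_state.count = io->s.iter.count;
}

/* New access pattern reaches the members directly. */
static void demo_save_state_new(struct demo_async_rw_new *io)
{
	io->iter_state.count = io->iter.count;
}
```

Under these assumptions the two layouts have the same size and member offsets, so the change is purely a naming simplification at the call sites.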
Signed-off-by: Jens Axboe
---
 io_uring/rw.c | 45 +++++++++++++++++++++++----------------------
 io_uring/rw.h | 10 +++-------
 2 files changed, 26 insertions(+), 29 deletions(-)

diff --git a/io_uring/rw.c b/io_uring/rw.c
index 583fe61a0acb..19e866929cd3 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -96,12 +96,12 @@ static int __io_import_iovec(int ddir, struct io_kiocb *req,
 			rw->len = sqe_len;
 		}
 
-		return import_ubuf(ddir, buf, sqe_len, &io->s.iter);
+		return import_ubuf(ddir, buf, sqe_len, &io->iter);
 	}
 
-	io->free_iovec = io->s.fast_iov;
+	io->free_iovec = io->fast_iov;
 	return __import_iovec(ddir, buf, sqe_len, UIO_FASTIOV, &io->free_iovec,
-				&io->s.iter, req->ctx->compat);
+				&io->iter, req->ctx->compat);
 }
 
 static inline int io_import_iovec(int rw, struct io_kiocb *req,
@@ -114,7 +114,7 @@ static inline int io_import_iovec(int rw, struct io_kiocb *req,
 	if (unlikely(ret < 0))
 		return ret;
 
-	iov_iter_save_state(&io->s.iter, &io->s.iter_state);
+	iov_iter_save_state(&io->iter, &io->iter_state);
 	return 0;
 }
 
@@ -216,7 +216,7 @@ static int io_prep_rw_setup(struct io_kiocb *req, int ddir, bool do_import)
 	if (unlikely(ret < 0))
 		return ret;
 
-	iov_iter_save_state(&rw->s.iter, &rw->s.iter_state);
+	iov_iter_save_state(&rw->iter, &rw->iter_state);
 	return 0;
 }
 
@@ -308,8 +308,8 @@ static int io_prep_rw_fixed(struct io_kiocb *req, const struct io_uring_sqe *sqe
 	io_req_set_rsrc_node(req, ctx, 0);
 
 	io = req->async_data;
-	ret = io_import_fixed(ddir, &io->s.iter, req->imu, rw->addr, rw->len);
-	iov_iter_save_state(&io->s.iter, &io->s.iter_state);
+	ret = io_import_fixed(ddir, &io->iter, req->imu, rw->addr, rw->len);
+	iov_iter_save_state(&io->iter, &io->iter_state);
 	return ret;
 }
 
@@ -374,7 +374,7 @@ static void io_resubmit_prep(struct io_kiocb *req)
 {
 	struct io_async_rw *io = req->async_data;
 
-	iov_iter_restore(&io->s.iter, &io->s.iter_state);
+	iov_iter_restore(&io->iter, &io->iter_state);
 }
 
 static bool io_rw_should_reissue(struct io_kiocb *req)
@@ -808,7 +808,7 @@ static int __io_read(struct io_kiocb *req, unsigned int issue_flags)
 	ret = io_rw_init_file(req, FMODE_READ);
 	if (unlikely(ret))
 		return ret;
-	req->cqe.res = iov_iter_count(&io->s.iter);
+	req->cqe.res = iov_iter_count(&io->iter);
 
 	if (force_nonblock) {
 		/* If the file doesn't support async, just async punt */
@@ -826,7 +826,7 @@ static int __io_read(struct io_kiocb *req, unsigned int issue_flags)
 	if (unlikely(ret))
 		return ret;
 
-	ret = io_iter_do_read(rw, &io->s.iter);
+	ret = io_iter_do_read(rw, &io->iter);
 
 	if (ret == -EAGAIN || (req->flags & REQ_F_REISSUE)) {
 		req->flags &= ~REQ_F_REISSUE;
@@ -853,7 +853,7 @@ static int __io_read(struct io_kiocb *req, unsigned int issue_flags)
 	 * untouched in case of error. Restore it and we'll advance it
 	 * manually if we need to.
 	 */
-	iov_iter_restore(&io->s.iter, &io->s.iter_state);
+	iov_iter_restore(&io->iter, &io->iter_state);
 
 	do {
 		/*
@@ -861,11 +861,11 @@ static int __io_read(struct io_kiocb *req, unsigned int issue_flags)
 		 * above or inside this loop. Advance the iter by the bytes
 		 * that were consumed.
 		 */
-		iov_iter_advance(&io->s.iter, ret);
-		if (!iov_iter_count(&io->s.iter))
+		iov_iter_advance(&io->iter, ret);
+		if (!iov_iter_count(&io->iter))
 			break;
 		io->bytes_done += ret;
-		iov_iter_save_state(&io->s.iter, &io->s.iter_state);
+		iov_iter_save_state(&io->iter, &io->iter_state);
 
 		/* if we can retry, do so with the callbacks armed */
 		if (!io_rw_should_retry(req)) {
@@ -873,19 +873,19 @@ static int __io_read(struct io_kiocb *req, unsigned int issue_flags)
 			return -EAGAIN;
 		}
 
-		req->cqe.res = iov_iter_count(&io->s.iter);
+		req->cqe.res = iov_iter_count(&io->iter);
 		/*
 		 * Now retry read with the IOCB_WAITQ parts set in the iocb. If
 		 * we get -EIOCBQUEUED, then we'll get a notification when the
 		 * desired page gets unlocked. We can also get a partial read
 		 * here, and if we do, then just retry at the new offset.
 		 */
-		ret = io_iter_do_read(rw, &io->s.iter);
+		ret = io_iter_do_read(rw, &io->iter);
 		if (ret == -EIOCBQUEUED)
 			return IOU_ISSUE_SKIP_COMPLETE;
 		/* we got some bytes, but not all. retry. */
 		kiocb->ki_flags &= ~IOCB_WAITQ;
-		iov_iter_restore(&io->s.iter, &io->s.iter_state);
+		iov_iter_restore(&io->iter, &io->iter_state);
 	} while (ret > 0);
 done:
 	/* it's faster to check here then delegate to kfree */
@@ -982,7 +982,7 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	ret = io_rw_init_file(req, FMODE_WRITE);
 	if (unlikely(ret))
 		return ret;
-	req->cqe.res = iov_iter_count(&io->s.iter);
+	req->cqe.res = iov_iter_count(&io->iter);
 
 	if (force_nonblock) {
 		/* If the file doesn't support async, just async punt */
@@ -1012,9 +1012,9 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	kiocb->ki_flags |= IOCB_WRITE;
 
 	if (likely(req->file->f_op->write_iter))
-		ret2 = call_write_iter(req->file, kiocb, &io->s.iter);
+		ret2 = call_write_iter(req->file, kiocb, &io->iter);
 	else if (req->file->f_op->write)
-		ret2 = loop_rw_iter(WRITE, rw, &io->s.iter);
+		ret2 = loop_rw_iter(WRITE, rw, &io->iter);
 	else
 		ret2 = -EINVAL;
 
@@ -1046,7 +1046,7 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
 			 * in the worker. Also update bytes_done to account for
 			 * the bytes already written.
 			 */
-			iov_iter_save_state(&io->s.iter, &io->s.iter_state);
+			iov_iter_save_state(&io->iter, &io->iter_state);
 			io->bytes_done += ret2;
 
 			if (kiocb->ki_flags & IOCB_WRITE)
@@ -1057,7 +1057,7 @@ int io_write(struct io_kiocb *req, unsigned int issue_flags)
 		ret = kiocb_done(req, ret2, issue_flags);
 	} else {
 ret_eagain:
-		iov_iter_restore(&io->s.iter, &io->s.iter_state);
+		iov_iter_restore(&io->iter, &io->iter_state);
 		if (kiocb->ki_flags & IOCB_WRITE)
 			io_req_end_write(req);
 		return -EAGAIN;
@@ -1157,5 +1157,6 @@ void io_rw_cache_free(struct io_cache_entry *entry)
 	struct io_async_rw *rw;
 
 	rw = container_of(entry, struct io_async_rw, cache);
+	kfree(rw->free_iovec);
 	kfree(rw);
 }
diff --git a/io_uring/rw.h b/io_uring/rw.h
index f7905070d10b..7824896dc52d 100644
--- a/io_uring/rw.h
+++ b/io_uring/rw.h
@@ -2,18 +2,14 @@
 
 #include
 
-struct io_rw_state {
-	struct iov_iter iter;
-	struct iov_iter_state iter_state;
-	struct iovec fast_iov[UIO_FASTIOV];
-};
-
 struct io_async_rw {
 	union {
 		size_t bytes_done;
 		struct io_cache_entry cache;
 	};
-	struct io_rw_state s;
+	struct iov_iter iter;
+	struct iov_iter_state iter_state;
+	struct iovec fast_iov[UIO_FASTIOV];
 	struct iovec *free_iovec;
 	struct wait_page_queue wpq;
 };

From patchwork Wed Mar 20 22:55:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13598261
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 12/17] io_uring/rw: add iovec recycling
Date: Wed, 20 Mar 2024 16:55:27 -0600
Message-ID: <20240320225750.1769647-13-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk>
References: <20240320225750.1769647-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

Let the io_async_rw
hold on to the iovec and reuse it, rather than always allocate and free them. Also enables KASAN for the iovec entries, so that reuse can be detected even while they are in the cache.

While doing so, shrink io_async_rw by getting rid of the bigger embedded fast iovec. Since iovecs are being recycled now, shrink it from 8 to 1. This reduces the io_async_rw size from 264 to 160 bytes, a 40% reduction.

Signed-off-by: Jens Axboe
---
 io_uring/rw.c | 42 +++++++++++++++++++++++++++++++++++++-----
 io_uring/rw.h |  3 ++-
 2 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/io_uring/rw.c b/io_uring/rw.c
index 19e866929cd3..57f2d315a620 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -81,7 +81,9 @@ static int __io_import_iovec(int ddir, struct io_kiocb *req,
 {
 	const struct io_issue_def *def = &io_issue_defs[req->opcode];
 	struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
+	struct iovec *iov;
 	void __user *buf;
+	int nr_segs, ret;
 	size_t sqe_len;
 
 	buf = u64_to_user_ptr(rw->addr);
@@ -99,9 +101,24 @@ static int __io_import_iovec(int ddir, struct io_kiocb *req,
 		return import_ubuf(ddir, buf, sqe_len, &io->iter);
 	}
 
-	io->free_iovec = io->fast_iov;
-	return __import_iovec(ddir, buf, sqe_len, UIO_FASTIOV, &io->free_iovec,
-				&io->iter, req->ctx->compat);
+	if (io->free_iovec) {
+		nr_segs = io->free_iov_nr;
+		iov = io->free_iovec;
+	} else {
+		iov = &io->fast_iov;
+		nr_segs = 1;
+	}
+	ret = __import_iovec(ddir, buf, sqe_len, nr_segs, &iov, &io->iter,
+				req->ctx->compat);
+	if (unlikely(ret < 0))
+		return ret;
+	if (iov) {
+		req->flags |= REQ_F_NEED_CLEANUP;
+		io->free_iov_nr = io->iter.nr_segs;
+		kfree(io->free_iovec);
+		io->free_iovec = iov;
+	}
+	return 0;
 }
 
 static inline int io_import_iovec(int rw, struct io_kiocb *req,
@@ -122,6 +139,7 @@ static void io_rw_iovec_free(struct io_async_rw *rw)
 {
 	if (rw->free_iovec) {
 		kfree(rw->free_iovec);
+		rw->free_iov_nr = 0;
 		rw->free_iovec = NULL;
 	}
 }
@@ -129,12 +147,16 @@ static void io_rw_iovec_free(struct io_async_rw *rw)
 static void io_rw_recycle(struct io_kiocb *req, unsigned int issue_flags)
 {
 	struct io_async_rw *rw = req->async_data;
+	struct iovec *iov;
 
 	if (unlikely(issue_flags & IO_URING_F_UNLOCKED)) {
 		io_rw_iovec_free(rw);
 		return;
 	}
+	iov = rw->free_iovec;
 	if (io_alloc_cache_put(&req->ctx->rw_cache, &rw->cache)) {
+		if (iov)
+			kasan_mempool_poison_object(iov);
 		req->async_data = NULL;
 		req->flags &= ~REQ_F_ASYNC_DATA;
 	}
@@ -184,6 +206,11 @@ static int io_rw_alloc_async(struct io_kiocb *req)
 	entry = io_alloc_cache_get(&ctx->rw_cache);
 	if (entry) {
 		rw = container_of(entry, struct io_async_rw, cache);
+		if (rw->free_iovec) {
+			kasan_mempool_unpoison_object(rw->free_iovec,
+				rw->free_iov_nr * sizeof(struct iovec));
+			req->flags |= REQ_F_NEED_CLEANUP;
+		}
 		req->flags |= REQ_F_ASYNC_DATA;
 		req->async_data = rw;
 		goto done;
@@ -191,8 +218,9 @@ static int io_rw_alloc_async(struct io_kiocb *req)
 
 	if (!io_alloc_async_data(req)) {
 		rw = req->async_data;
-done:
 		rw->free_iovec = NULL;
+		rw->free_iov_nr = 0;
+done:
 		rw->bytes_done = 0;
 		return 0;
 	}
@@ -1157,6 +1185,10 @@ void io_rw_cache_free(struct io_cache_entry *entry)
 	struct io_async_rw *rw;
 
 	rw = container_of(entry, struct io_async_rw, cache);
-	kfree(rw->free_iovec);
+	if (rw->free_iovec) {
+		kasan_mempool_unpoison_object(rw->free_iovec,
+				rw->free_iov_nr * sizeof(struct iovec));
+		io_rw_iovec_free(rw);
+	}
 	kfree(rw);
 }
diff --git a/io_uring/rw.h b/io_uring/rw.h
index 7824896dc52d..cf51d0eb407a 100644
--- a/io_uring/rw.h
+++ b/io_uring/rw.h
@@ -9,8 +9,9 @@ struct io_async_rw {
 	};
 	struct iov_iter iter;
 	struct iov_iter_state iter_state;
-	struct iovec fast_iov[UIO_FASTIOV];
+	struct iovec fast_iov;
 	struct iovec *free_iovec;
+	int free_iov_nr;
 	struct wait_page_queue wpq;
 };
 

From patchwork Wed Mar 20 22:55:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13598262
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 13/17] io_uring/net: move connect to always using async data
Date: Wed, 20 Mar
2024 16:55:28 -0600 Message-ID: <20240320225750.1769647-14-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk> References: <20240320225750.1769647-1-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 While doing that, get rid of io_async_connect and just use the generic io_async_msghdr. Both of them have a struct sockaddr_storage in there, and while io_async_msghdr is bigger, if the same type can be used then we get recycling for free. Signed-off-by: Jens Axboe --- io_uring/net.c | 41 +++++++++++------------------------------ io_uring/net.h | 5 ----- io_uring/opdef.c | 3 +-- 3 files changed, 12 insertions(+), 37 deletions(-) diff --git a/io_uring/net.c b/io_uring/net.c index 9472a66e035c..5794b941254c 100644 --- a/io_uring/net.c +++ b/io_uring/net.c @@ -1430,17 +1430,10 @@ int io_socket(struct io_kiocb *req, unsigned int issue_flags) return IOU_OK; } -int io_connect_prep_async(struct io_kiocb *req) -{ - struct io_async_connect *io = req->async_data; - struct io_connect *conn = io_kiocb_to_cmd(req, struct io_connect); - - return move_addr_to_kernel(conn->addr, conn->addr_len, &io->address); -} - int io_connect_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) { struct io_connect *conn = io_kiocb_to_cmd(req, struct io_connect); + struct io_async_msghdr *io; if (sqe->len || sqe->buf_index || sqe->rw_flags || sqe->splice_fd_in) return -EINVAL; @@ -1448,32 +1441,26 @@ int io_connect_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) conn->addr = u64_to_user_ptr(READ_ONCE(sqe->addr)); conn->addr_len = READ_ONCE(sqe->addr2); conn->in_progress = conn->seen_econnaborted = false; - return 0; + + io = io_msg_alloc_async(req); + if (unlikely(!io)) + return -ENOMEM; + + return move_addr_to_kernel(conn->addr, conn->addr_len, &io->addr); } int io_connect(struct io_kiocb *req, unsigned int issue_flags) { struct io_connect 
*connect = io_kiocb_to_cmd(req, struct io_connect); - struct io_async_connect __io, *io; + struct io_async_msghdr *io = req->async_data; unsigned file_flags; int ret; bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK; - if (req_has_async_data(req)) { - io = req->async_data; - } else { - ret = move_addr_to_kernel(connect->addr, - connect->addr_len, - &__io.address); - if (ret) - goto out; - io = &__io; - } - file_flags = force_nonblock ? O_NONBLOCK : 0; - ret = __sys_connect_file(req->file, &io->address, - connect->addr_len, file_flags); + ret = __sys_connect_file(req->file, &io->addr, connect->addr_len, + file_flags); if ((ret == -EAGAIN || ret == -EINPROGRESS || ret == -ECONNABORTED) && force_nonblock) { if (ret == -EINPROGRESS) { @@ -1483,13 +1470,6 @@ int io_connect(struct io_kiocb *req, unsigned int issue_flags) goto out; connect->seen_econnaborted = true; } - if (req_has_async_data(req)) - return -EAGAIN; - if (io_alloc_async_data(req)) { - ret = -ENOMEM; - goto out; - } - memcpy(req->async_data, &__io, sizeof(__io)); return -EAGAIN; } if (connect->in_progress) { @@ -1507,6 +1487,7 @@ int io_connect(struct io_kiocb *req, unsigned int issue_flags) out: if (ret < 0) req_set_fail(req); + io_req_msg_cleanup(req, issue_flags); io_req_set_res(req, ret, 0); return IOU_OK; } diff --git a/io_uring/net.h b/io_uring/net.h index 0aef1c992aee..b47b43ec6459 100644 --- a/io_uring/net.h +++ b/io_uring/net.h @@ -28,10 +28,6 @@ struct io_async_msghdr { #if defined(CONFIG_NET) -struct io_async_connect { - struct sockaddr_storage address; -}; - int io_shutdown_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); int io_shutdown(struct io_kiocb *req, unsigned int issue_flags); @@ -53,7 +49,6 @@ int io_accept(struct io_kiocb *req, unsigned int issue_flags); int io_socket_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); int io_socket(struct io_kiocb *req, unsigned int issue_flags); -int io_connect_prep_async(struct io_kiocb *req); int io_connect_prep(struct 
io_kiocb *req, const struct io_uring_sqe *sqe); int io_connect(struct io_kiocb *req, unsigned int issue_flags); diff --git a/io_uring/opdef.c b/io_uring/opdef.c index fcae75a08f2c..1951107210d4 100644 --- a/io_uring/opdef.c +++ b/io_uring/opdef.c @@ -557,8 +557,7 @@ const struct io_cold_def io_cold_defs[] = { [IORING_OP_CONNECT] = { .name = "CONNECT", #if defined(CONFIG_NET) - .async_size = sizeof(struct io_async_connect), - .prep_async = io_connect_prep_async, + .async_size = sizeof(struct io_async_msghdr), #endif }, [IORING_OP_FALLOCATE] = { From patchwork Wed Mar 20 22:55:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13598263 Received: from mail-io1-f48.google.com (mail-io1-f48.google.com [209.85.166.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D4A798595C for ; Wed, 20 Mar 2024 22:58:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.166.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975501; cv=none; b=bCMyHOn1oRUntHw1m64CXmHZsX648/vgba2z40OcTTo/emP9XGk2yRPVOaheNdeTY8at1cKeosORXr3iUoiESoRM6YNWx1Tpn3+nlR8RuBIF/1IKUqkeOV28dD2qwXTcd4UIQaf+1z6q8pd9qPDnaaPqFF8J3lAQp7CJSsw+TaU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975501; c=relaxed/simple; bh=imKW6bzR+X9Wmqgjzn5fcP0ioCp4iaCKMIBfhG+5UBc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ZPENfQVjn++d7Zi6hEQ7BlC/SoB0vYmS2GjPNMGyyAil/pEap0OstcJW02IBMECh/09LCBRUepvOVMGTVrmhMiRyLHf4VP1hKXBFQqCAcQggK8u9DFSDp9uR/Yl4IDeOSoTOKOY+68H3/wK86AX/tZTymKQKwPL67Y4eHd0Z8/Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) 
header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=evb9yOPy; arc=none smtp.client-ip=209.85.166.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="evb9yOPy" Received: by mail-io1-f48.google.com with SMTP id ca18e2360f4ac-7cc0e831e11so2542239f.1 for ; Wed, 20 Mar 2024 15:58:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1710975498; x=1711580298; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=81BPzzmBgQzVbYIwH3OxkEDfqVr96cHlM67nUhExOn0=; b=evb9yOPyVPOpfc3wdmvXiBYBFx2NamRF1e0GrICJLGyHY8CKG9z8XkYMO2fHjX/voC 6acyE03ble4GZyQpQYe3r09s0m0JJJNp2pWEBlQlMx6hvfK3y5OGkLHkz2wz45DBO6FI 6dHSkyrqi9diahGqegDlAolN10QJI+YsmG5NvMI8urJSVqcXVun0QY5cgrzuumreEoNP L8bwTNI6cpIQ8cfge7lcwHV6faBgNNv8mLLmMrHhIJF6HGiX1e4poLYdLiw4pwkObkpu YY837nOgd2QAHjqKI9p90fYy40kdEigvujD1EFU1pIki4kKmVqXk7sm3ZTwdcIPpa7xZ w6Yw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710975498; x=1711580298; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=81BPzzmBgQzVbYIwH3OxkEDfqVr96cHlM67nUhExOn0=; b=Y2lyqFVNNSm4eQRnVr+ue8REV8/R7pjev/2DjhQ+sDDzJRaWFsoWJR4LA3jI7snyps fAamGEIhaNo3sYah6Q2jZEzUCajQa1q/qG10GD5QyKGfFcmHMb4JMGfJxHnpM2lQ34FK r3DENmd1zR+yGG08Lffm/HvS9roJgBsgriMbtm7ycJyKfaE/HnSp8G8vE6yTkJTleSOq jl8mKxGXgz6cu9VsA+QW8pgFDxqzUxli00d3UFDa+DtsSDtpvSAgGldTDSiNTY3tsIsY 
SG6yDURlOvtz9g0VUPjpECwqYdb86cI01pOXKIDFi1z4F3CPWUrF5yhUy+Bkhd02Ix41 2eGg== X-Gm-Message-State: AOJu0YwPfyi3QVVEKPh6zvIyEGHpL4riFFLssYAU7huepvb4uLiK5aqZ OZqWFEpy0o/OFGgALyyaMV9VlOQs4MeoiGMiOaaugTCIuUtXAcqH1q7MZ0Q3w9aljPV2fKK9LCY 6 X-Google-Smtp-Source: AGHT+IGBi51SgQKuDNVaIXwZUDTuaXBOuD6tFsdn38krvVzbUTarP9creIDGauOJkjzq410giq2Gfg== X-Received: by 2002:a5e:c10d:0:b0:7cf:28df:79e2 with SMTP id v13-20020a5ec10d000000b007cf28df79e2mr701194iol.1.1710975498480; Wed, 20 Mar 2024 15:58:18 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id z19-20020a6b0a13000000b007cf23a498dcsm434384ioi.38.2024.03.20.15.58.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:58:17 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 14/17] io_uring/uring_cmd: switch to always allocating async data Date: Wed, 20 Mar 2024 16:55:29 -0600 Message-ID: <20240320225750.1769647-15-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk> References: <20240320225750.1769647-1-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Basic conversion ensuring async_data is allocated off the prep path. Adds a basic alloc cache as well, as passthrough IO can be quite high in rate. 
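For readers unfamiliar with the alloc cache idea, here is a minimal userspace sketch of it. It is not the kernel's io_alloc_cache API: cache_get(), cache_put(), struct cmd_cache and CACHE_MAX are hypothetical names, and malloc()/free() stand in for the kernel allocator. It only illustrates the two points above: the free-list link shares storage with the payload, and a request takes a cached object before falling back to a fresh allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_MAX 4	/* hypothetical cap on cached objects */

/* Free-list link; it overlays the payload while the object is cached. */
struct cache_entry {
	struct cache_entry *next;
};

/* Payload sized like two 64-byte SQEs; the link shares its storage (C11). */
struct cmd_cache {
	union {
		struct cache_entry entry;
		unsigned char sqes[2 * 64];
	};
};

struct alloc_cache {
	struct cache_entry *head;
	unsigned int nr_cached;
};

/* Reuse a cached object if one is available, else fall back to malloc(). */
static struct cmd_cache *cache_get(struct alloc_cache *ac)
{
	struct cache_entry *e = ac->head;

	if (e) {
		ac->head = e->next;
		ac->nr_cached--;
		/* same idea as container_of(): step back to the enclosing object */
		return (struct cmd_cache *)((char *)e -
					    offsetof(struct cmd_cache, entry));
	}
	return malloc(sizeof(struct cmd_cache));
}

/*
 * Return an object to the cache. Returns 1 if it was cached. If the cache
 * is full this sketch frees the object and returns 0; that is a
 * simplification, the patch leaves the free to the generic request path.
 */
static int cache_put(struct alloc_cache *ac, struct cmd_cache *obj)
{
	if (ac->nr_cached >= CACHE_MAX) {
		free(obj);
		return 0;
	}
	obj->entry.next = ac->head;
	ac->head = &obj->entry;
	ac->nr_cached++;
	return 1;
}
```

Because the link overlays the payload, a cached object costs no extra memory, but its contents are undefined once it is put back.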
Signed-off-by: Jens Axboe
---
 include/linux/io_uring_types.h | 1 +
 io_uring/io_uring.c | 3 ++
 io_uring/opdef.c | 1 -
 io_uring/uring_cmd.c | 77 ++++++++++++++++++++++++----------
 io_uring/uring_cmd.h | 10 ++++-
 5 files changed, 69 insertions(+), 23 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 2ba8676f83cc..e3ec84c43f1a 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -301,6 +301,7 @@ struct io_ring_ctx {
 	struct io_alloc_cache apoll_cache;
 	struct io_alloc_cache netmsg_cache;
 	struct io_alloc_cache rw_cache;
+	struct io_alloc_cache uring_cache;
 
 	/*
 	 * Any cancelable uring_cmd is added to this list in
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index cc8ce830ff4b..e2b9b00eedef 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -310,6 +310,8 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 			    sizeof(struct io_async_msghdr));
 	io_alloc_cache_init(&ctx->rw_cache, IO_ALLOC_CACHE_MAX,
 			    sizeof(struct io_async_rw));
+	io_alloc_cache_init(&ctx->uring_cache, IO_ALLOC_CACHE_MAX,
+			    sizeof(struct uring_cache));
 	io_futex_cache_init(ctx);
 	init_completion(&ctx->ref_comp);
 	xa_init_flags(&ctx->personalities, XA_FLAGS_ALLOC1);
@@ -2901,6 +2903,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
 	io_alloc_cache_free(&ctx->apoll_cache, io_apoll_cache_free);
 	io_alloc_cache_free(&ctx->netmsg_cache, io_netmsg_cache_free);
 	io_alloc_cache_free(&ctx->rw_cache, io_rw_cache_free);
+	io_alloc_cache_free(&ctx->uring_cache, io_uring_cache_free);
 	io_futex_cache_free(ctx);
 	io_destroy_buffers(ctx);
 	mutex_unlock(&ctx->uring_lock);
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index 1951107210d4..745246086c23 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -677,7 +677,6 @@ const struct io_cold_def io_cold_defs[] = {
 	[IORING_OP_URING_CMD] = {
 		.name = "URING_CMD",
 		.async_size = 2 * sizeof(struct io_uring_sqe),
-		.prep_async = io_uring_cmd_prep_async,
 	},
 	[IORING_OP_SEND_ZC] = {
 		.name = "SEND_ZC",
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 4614ce734fee..9bd0ba87553f 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -14,6 +14,38 @@
 #include "rsrc.h"
 #include "uring_cmd.h"
 
+static struct uring_cache *io_uring_async_get(struct io_kiocb *req)
+{
+	struct io_ring_ctx *ctx = req->ctx;
+	struct io_cache_entry *entry;
+	struct uring_cache *cache;
+
+	entry = io_alloc_cache_get(&ctx->uring_cache);
+	if (entry) {
+		cache = container_of(entry, struct uring_cache, cache);
+		req->flags |= REQ_F_ASYNC_DATA;
+		req->async_data = cache;
+		return cache;
+	}
+	if (!io_alloc_async_data(req))
+		return req->async_data;
+	return NULL;
+}
+
+static void io_req_uring_cleanup(struct io_kiocb *req, unsigned int issue_flags)
+{
+	struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
+	struct uring_cache *cache = req->async_data;
+
+	if (issue_flags & IO_URING_F_UNLOCKED)
+		return;
+	if (io_alloc_cache_put(&req->ctx->uring_cache, &cache->cache)) {
+		ioucmd->sqe = NULL;
+		req->async_data = NULL;
+		req->flags &= ~REQ_F_ASYNC_DATA;
+	}
+}
+
 bool io_uring_try_cancel_uring_cmd(struct io_ring_ctx *ctx,
 				   struct task_struct *task, bool cancel_all)
 {
@@ -128,6 +160,7 @@ void io_uring_cmd_done(struct io_uring_cmd *ioucmd, ssize_t ret, ssize_t res2,
 	io_req_set_res(req, ret, 0);
 	if (req->ctx->flags & IORING_SETUP_CQE32)
 		io_req_set_cqe32_extra(req, res2, 0);
+	io_req_uring_cleanup(req, issue_flags);
 	if (req->ctx->flags & IORING_SETUP_IOPOLL) {
 		/* order with io_iopoll_req_issued() checking ->iopoll_complete */
 		smp_store_release(&req->iopoll_completed, 1);
@@ -142,13 +175,19 @@ void io_uring_cmd_done(struct io_uring_cmd *ioucmd, ssize_t ret, ssize_t res2,
 }
 EXPORT_SYMBOL_GPL(io_uring_cmd_done);
 
-int io_uring_cmd_prep_async(struct io_kiocb *req)
+static int io_uring_cmd_prep_setup(struct io_kiocb *req,
+				   const struct io_uring_sqe *sqe)
 {
 	struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
+	struct uring_cache *cache;
 
-	memcpy(req->async_data, ioucmd->sqe, uring_sqe_size(req->ctx));
-	ioucmd->sqe = req->async_data;
-	return 0;
+	cache = io_uring_async_get(req);
+	if (cache) {
+		memcpy(cache->sqes, sqe, uring_sqe_size(req->ctx));
+		ioucmd->sqe = req->async_data;
+		return 0;
+	}
+	return -ENOMEM;
 }
 
 int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
@@ -173,9 +212,9 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 		req->imu = ctx->user_bufs[index];
 		io_req_set_rsrc_node(req, ctx, 0);
 	}
-	ioucmd->sqe = sqe;
 	ioucmd->cmd_op = READ_ONCE(sqe->cmd_op);
-	return 0;
+
+	return io_uring_cmd_prep_setup(req, sqe);
 }
 
 int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
@@ -206,23 +245,14 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
 	}
 
 	ret = file->f_op->uring_cmd(ioucmd, issue_flags);
-	if (ret == -EAGAIN) {
-		if (!req_has_async_data(req)) {
-			if (io_alloc_async_data(req))
-				return -ENOMEM;
-			io_uring_cmd_prep_async(req);
-		}
-		return -EAGAIN;
-	}
-
-	if (ret != -EIOCBQUEUED) {
-		if (ret < 0)
-			req_set_fail(req);
-		io_req_set_res(req, ret, 0);
+	if (ret == -EAGAIN || ret == -EIOCBQUEUED)
 		return ret;
-	}
 
-	return IOU_ISSUE_SKIP_COMPLETE;
+	if (ret < 0)
+		req_set_fail(req);
+	io_req_uring_cleanup(req, issue_flags);
+	io_req_set_res(req, ret, 0);
+	return ret;
 }
 
 int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
@@ -311,3 +341,8 @@ int io_uring_cmd_sock(struct io_uring_cmd *cmd, unsigned int issue_flags)
 }
 EXPORT_SYMBOL_GPL(io_uring_cmd_sock);
 #endif
+
+void io_uring_cache_free(struct io_cache_entry *entry)
+{
+	kfree(container_of(entry, struct uring_cache, cache));
+}
diff --git a/io_uring/uring_cmd.h b/io_uring/uring_cmd.h
index 7356bf9aa655..b0ccff7091ee 100644
--- a/io_uring/uring_cmd.h
+++ b/io_uring/uring_cmd.h
@@ -1,8 +1,16 @@
 // SPDX-License-Identifier: GPL-2.0
 
+struct uring_cache {
+	union {
+		struct io_cache_entry cache;
+		struct io_uring_sqe sqes[2];
+	};
+};
+
 int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags);
 int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
 int io_uring_cmd_prep_async(struct io_kiocb *req);
+void io_uring_cache_free(struct io_cache_entry *entry);
 
 bool io_uring_try_cancel_uring_cmd(struct io_ring_ctx *ctx,
-				   struct task_struct *task, bool cancel_all);
\ No newline at end of file
+				   struct task_struct *task, bool cancel_all);

From patchwork Wed Mar 20 22:55:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13598264 Received: from mail-io1-f48.google.com (mail-io1-f48.google.com [209.85.166.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 902AC85C65 for ; Wed, 20 Mar 2024 22:58:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.166.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975503; cv=none; b=QZWs72/Iw7iWgWlscbrshFH3rO9xSuakqhIe5MaowQmIMREl/alSIv+1PfoIFvVCIgIXvQ9xVDOHGRqplSg/saj6cVj9A48sLjOAXYRRseNvIrM7KN1zT75pqDcbfBYvbSFfokxDJUykiYNWWMvgulwyzWUHOalz9+vlyAkrhEg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975503; c=relaxed/simple; bh=4wN+eMVoL9e72mNvWWYFnGOyQ5/owDLs69We+rw9488=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Y9956007Y9QN0Gez9bdccyev6qJjNILs8ut3udG5CrsyFUqLDbW/CvpwHSYHQqvU7WMQ/o3awO/rlUw02U22V37tZ3WS+JBmfzSVLFa2vGrAne1Kz1M2AQqaZYIArVnm1pecxRQ3/SkdT7mS2M7jp6ycCU+jjbr6iJ6P0aNBhQg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=gAmIKYNP; arc=none
smtp.client-ip=209.85.166.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="gAmIKYNP" Received: by mail-io1-f48.google.com with SMTP id ca18e2360f4ac-7cc5e664d52so6269039f.0 for ; Wed, 20 Mar 2024 15:58:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1710975500; x=1711580300; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=N6Dq8Aq42rQFuVjQCYhDK27TFaOTjwnZrM1PePD0aC0=; b=gAmIKYNPtyB6ZhqPH58MX7GXl7DczXgNDme87rw79PxiPSgsGiUV7nK8MzH7GLgn7H Fje3QBeEkvir+ASWnK111UxCr/sHQgCoiMGydipD+2ihhdL1wgPQZre+PnYpSdqPkq3t cB7LeisROFR/ZHJcUB2CsValsaY4pAvr2DjEFdqybXluL6Il4lz7lYFd5lAJUn8IIR1s 37T3xf8+gAMYx5/rf/DdwID+tZ1/W23cg15v3PZAfn7/EETPJ2o1dZ0brqo9vwL1bygN iZswDz4iafrOJGJ/j2Pgz9yxYMW2xH1sncmeGWOkcNt3Pot5kARuJzWlQ5XVuECyzwIZ tVug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710975500; x=1711580300; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=N6Dq8Aq42rQFuVjQCYhDK27TFaOTjwnZrM1PePD0aC0=; b=nxxd3d0Ul2de7jYy0SzCIC/0ca0uYxgaYBwyQ99vpzxobNCBt5uj3N9BJOTQxxEbxR t43mjT/vyYvMsxdh0eOI0l1X7XSHsgUcwI9YD+00Ej7MCj/M/qPgKNMeR0FCUyHL8oJP 5zG6wW5EhudE5kYVsgot+1i+vp630yA+ZHRumMR3M8iw9MhlzStaB+cjwqtelzKLCtTd Zyn7948rw5Jw4QPNEptCxIPWm67l6RQkKlf/4yZqRIai4YZ7SkZJ1e0CCpJfUhe3Ku/2 3qvPFeoAbnt+VGcnBsDhnMpWtw+UKzg1KM4nIdTRiOyNrFqNX7OXNBusxC6PsHo0pGvU 6hrw== X-Gm-Message-State: AOJu0Yy1IXjCtE5+g7USvcyo3re/j9Ftj0IYD+MxiokGGGcbFw2ZhJVs 
XgQfCa9BmfRLYe1G5EeqJm+SARnnEPBeaWOCKjVoS6wz+/GcTppspfXTXz9WhIPdzg/cuSEbQsH q X-Google-Smtp-Source: AGHT+IGXCl6RHxETjLp+uYT75eZfU5J2KIkCfP/DhdsSVfDaVw32lOLhX+1QNGQubHNqNy+nEUezog== X-Received: by 2002:a6b:5108:0:b0:7ce:f407:1edf with SMTP id f8-20020a6b5108000000b007cef4071edfmr6851938iob.0.1710975500449; Wed, 20 Mar 2024 15:58:20 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id z19-20020a6b0a13000000b007cf23a498dcsm434384ioi.38.2024.03.20.15.58.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:58:18 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 15/17] io_uring/uring_cmd: defer SQE copying until we need it Date: Wed, 20 Mar 2024 16:55:30 -0600 Message-ID: <20240320225750.1769647-16-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk> References: <20240320225750.1769647-1-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The previous commit turned on async data for uring_cmd, and did the basic conversion of setting everything up on the prep side. However, for a lot of use cases, we'll get -EIOCBQUEUED on issue, which means we do not need a persistent big SQE copied. Unless we're going async immediately, defer copying the double SQE until we know we have to. This greatly reduces the overhead of such commands, as evidenced by a perf diff from before and after this change: 10.60% -8.58% [kernel.vmlinux] [k] io_uring_cmd_prep where the prep side drops from 10.60% to ~2%, which is more expected. Performance also rises from ~113M IOPS to ~122M IOPS, bringing us back to where it was before the async command prep. 
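The deferral described above can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not the patch itself: struct cmd_req, cmd_prep(), cmd_issue() and SQE_SIZE are hypothetical, the force_async field stands in for REQ_F_FORCE_ASYNC, and the handler's return value is passed in rather than produced by a real ->uring_cmd() call. The sketch also repoints sqe at the stable copy after copying, which is a choice made here for clarity.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#ifndef EIOCBQUEUED
#define EIOCBQUEUED 529	/* kernel-internal errno, absent from userspace <errno.h> */
#endif

#define SQE_SIZE 64	/* illustrative size only */

struct cmd_req {
	const void *sqe;		/* where the handler reads the SQE from */
	unsigned char stable[SQE_SIZE];	/* request-owned copy */
	int force_async;		/* models REQ_F_FORCE_ASYNC */
};

/* Prep: copy only if the request is known to go async right away. */
static void cmd_prep(struct cmd_req *req, const void *ring_sqe)
{
	if (!req->force_async) {
		req->sqe = ring_sqe;	/* defer the memcpy */
		return;
	}
	memcpy(req->stable, ring_sqe, SQE_SIZE);
	req->sqe = req->stable;
}

/*
 * Issue: -EIOCBQUEUED means the command was queued and needs no persistent
 * copy. -EAGAIN means a retry will happen later, after the ring slot may
 * have been reused, so the copy is made now if it was deferred.
 */
static int cmd_issue(struct cmd_req *req, int handler_ret)
{
	if (handler_ret == -EAGAIN && req->sqe != (const void *)req->stable) {
		memcpy(req->stable, req->sqe, SQE_SIZE);
		req->sqe = req->stable;
	}
	return handler_ret;
}
```

In the common -EIOCBQUEUED case no bytes are copied at all, which is where the saving in io_uring_cmd_prep comes from.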
Signed-off-by: Jens Axboe
Tested-by: Anuj Gupta
Reviewed-by: Anuj Gupta
---
 io_uring/uring_cmd.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 9bd0ba87553f..92346b5d9f5b 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -182,12 +182,18 @@ static int io_uring_cmd_prep_setup(struct io_kiocb *req,
 	struct uring_cache *cache;
 
 	cache = io_uring_async_get(req);
-	if (cache) {
-		memcpy(cache->sqes, sqe, uring_sqe_size(req->ctx));
-		ioucmd->sqe = req->async_data;
+	if (unlikely(!cache))
+		return -ENOMEM;
+
+	if (!(req->flags & REQ_F_FORCE_ASYNC)) {
+		/* defer memcpy until we need it */
+		ioucmd->sqe = sqe;
 		return 0;
 	}
-	return -ENOMEM;
+
+	memcpy(req->async_data, sqe, uring_sqe_size(req->ctx));
+	ioucmd->sqe = req->async_data;
+	return 0;
 }
 
 int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
@@ -245,8 +251,15 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
 	}
 
 	ret = file->f_op->uring_cmd(ioucmd, issue_flags);
-	if (ret == -EAGAIN || ret == -EIOCBQUEUED)
-		return ret;
+	if (ret == -EAGAIN) {
+		struct uring_cache *cache = req->async_data;
+
+		if (ioucmd->sqe != (void *) cache)
+			memcpy(cache, ioucmd->sqe, uring_sqe_size(req->ctx));
+		return -EAGAIN;
+	} else if (ret == -EIOCBQUEUED) {
+		return -EIOCBQUEUED;
+	}
 
 	if (ret < 0)
 		req_set_fail(req);

From patchwork Wed Mar 20 22:55:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13598265 Received: from mail-io1-f54.google.com (mail-io1-f54.google.com [209.85.166.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AA42185C65 for ; Wed, 20 Mar 2024 22:58:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none
smtp.client-ip=209.85.166.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975505; cv=none; b=PGaevBJ4IYWScXVY0k00G2dgjtjDyKc9kJlzvhdjIW6jo1QZhqSpIxJzwp7yMSpdaFELOEjO2AZn3n55Pc7Uuwtnr98bZKh8HzIwT5+oi6dAtTPeNlWXTvYfBmLFX07WPllIPf7cvi8TeOeEyWIr4ySluetbejQA9i8AsgQebxw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975505; c=relaxed/simple; bh=ykZAf3Q8DBafxceYbZ4oIWwEr1AaVUCtxIWXCkoSqK8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=G45zFrHQ9vPePhJbYiq1t7rMbLYBkuVji+qfufEYCpD0c1cqOUgYqqVwtzZoYPeU/IKIEjqSO0wJ0bEWAB7CO7YL1SBv9+vbwRN9RvxTlHpNeTEoBgQEzrl858OCMmA2+a+jmTNO3C18pCCppjLtXOQvlCVfiNgVUyBTwg1pr3Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=wYthfjIZ; arc=none smtp.client-ip=209.85.166.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="wYthfjIZ" Received: by mail-io1-f54.google.com with SMTP id ca18e2360f4ac-7cc0e831e11so2542839f.1 for ; Wed, 20 Mar 2024 15:58:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1710975502; x=1711580302; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=CU0lTkJ8eZZ/oUI5/c9z1/jTv3vXx4pfsrXqhBT1eCI=; b=wYthfjIZ8AAmSUpMwwD0GG6NlQAO3uwojXyaSJvXoIDRLs9pgvxCtRScj1PZmX1+0I GnyFnjNEUlLoB3zysHN1Jgc9ys9+1j/SUUfLJw8c6FZ2QvyoUXLvIB60DGhxtm+3vS4S 
BTF8RuFjwJmXF7t8HFBrdEpAy0PGEtcocow66W5djc3IgZisa/bZzC9nUlJMt/SgQS9X x+QzQDIORV0LIpBlzLQ1EFd+GR0c0ma68siVHjWDYa3BcMQlBExHTvtNqrld5cuQ6bsc pRWkL5zVppKyi5FZ4tm7yTuCmW/+7cnM4flHPJGnpBMSw46JPAFiMDrftv7gIni0tuEu L0qQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710975502; x=1711580302; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=CU0lTkJ8eZZ/oUI5/c9z1/jTv3vXx4pfsrXqhBT1eCI=; b=llWavvyOox21Wmak6F6NHW7nPXUOlyr2BQg9y+i8LcGUs2e9U67Y2FNUshvLMjUMTj +LFRRLD8/uQBUUqeyMZKisIFtvflEcQjfaL3at/cCxKyBB/axRGfvdGyzDB4iJWhz1M8 KHokoTcP++vp0QKfjcjItHAT9j2V8VRw1NUNeZfJXaJhbtnMna3WCCHYa2zqFwz+OJhn f9hs3BEsx41DUqkQSHu8U+rlxKr8O3twQR10lbzu3VKVhcLlQb47ZjcPsXv29gTwMo4m amj8nzRRLvK9j3q5wEKgYP80c1fhE6g4qioxlIHykNJHa53rzAqIPHjYI4meDQErteik EjAA== X-Gm-Message-State: AOJu0YwY2Tv9/t4JlcKoLkIsO6BPngO5A+wJcwxdSK1JFI3Ml2ZBx578 4F8W++fXeWMkctWfIRaIdFg/rwVA7ln5lfQY9NIO2niIuZ5A00YEmjOnnxKdiyDX20Uj8UMw1oT 4 X-Google-Smtp-Source: AGHT+IG2IWqUFRMJSX+L6O7af1bBSAT7u3K7tOr5pCZlpjJ42wf5sXJQj5shIsxvXqwCHO+RjAhFWw== X-Received: by 2002:a05:6602:59:b0:7c8:ad73:a702 with SMTP id z25-20020a056602005900b007c8ad73a702mr7016324ioz.0.1710975502352; Wed, 20 Mar 2024 15:58:22 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id z19-20020a6b0a13000000b007cf23a498dcsm434384ioi.38.2024.03.20.15.58.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:58:20 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 16/17] io_uring: drop ->prep_async() Date: Wed, 20 Mar 2024 16:55:31 -0600 Message-ID: <20240320225750.1769647-17-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk> References: <20240320225750.1769647-1-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: 
List-Subscribe: List-Unsubscribe: MIME-Version: 1.0

It's now unused, drop the code related to it. This includes the
io_issue_defs->manual_alloc field.

While in there, and since ->async_size is now being used a bit more
frequently and in the issue path, move it to io_issue_defs[].

Signed-off-by: Jens Axboe
---
 io_uring/io_uring.c | 36 ++++--------------------------------
 io_uring/io_uring.h | 1 -
 io_uring/opdef.c | 44 +++++++++++++++++++-------------------------
 io_uring/opdef.h | 9 +++------
 io_uring/uring_cmd.h | 1 -
 5 files changed, 26 insertions(+), 65 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index e2b9b00eedef..5eee07563079 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1709,8 +1709,10 @@ io_req_flags_t io_file_get_flags(struct file *file)
 
 bool io_alloc_async_data(struct io_kiocb *req)
 {
-	WARN_ON_ONCE(!io_cold_defs[req->opcode].async_size);
-	req->async_data = kmalloc(io_cold_defs[req->opcode].async_size, GFP_KERNEL);
+	const struct io_issue_def *def = &io_issue_defs[req->opcode];
+
+	WARN_ON_ONCE(!def->async_size);
+	req->async_data = kmalloc(def->async_size, GFP_KERNEL);
 	if (req->async_data) {
 		req->flags |= REQ_F_ASYNC_DATA;
 		return false;
@@ -1718,25 +1720,6 @@ bool io_alloc_async_data(struct io_kiocb *req)
 	return true;
 }
 
-int io_req_prep_async(struct io_kiocb *req)
-{
-	const struct io_cold_def *cdef = &io_cold_defs[req->opcode];
-	const struct io_issue_def *def = &io_issue_defs[req->opcode];
-
-	/* assign early for deferred execution for non-fixed file */
-	if (def->needs_file && !(req->flags & REQ_F_FIXED_FILE) && !req->file)
-		req->file = io_file_get_normal(req, req->cqe.fd);
-	if (!cdef->prep_async)
-		return 0;
-	if (WARN_ON_ONCE(req_has_async_data(req)))
-		return -EFAULT;
-	if (!def->manual_alloc) {
-		if (io_alloc_async_data(req))
-			return -EAGAIN;
-	}
-	return cdef->prep_async(req);
-}
-
 static u32 io_get_sequence(struct io_kiocb *req)
 {
 	u32 seq = req->ctx->cached_sq_head;
@@ -2049,13 +2032,6 @@ static void io_queue_sqe_fallback(struct io_kiocb *req)
 			req->flags |= REQ_F_LINK;
 		io_req_defer_failed(req, req->cqe.res);
 	} else {
-		int ret = io_req_prep_async(req);
-
-		if (unlikely(ret)) {
-			io_req_defer_failed(req, ret);
-			return;
-		}
-
 		if (unlikely(req->ctx->drain_active))
 			io_drain_req(req);
 		else
@@ -2265,10 +2241,6 @@ static inline int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
 	 * conditions are true (normal request), then just queue it.
 	 */
 	if (unlikely(link->head)) {
-		ret = io_req_prep_async(req);
-		if (unlikely(ret))
-			return io_submit_fail_init(sqe, req, ret);
-
 		trace_io_uring_link(req, link->head);
 		link->last->link = req;
 		link->last = req;
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index ef9bf610734c..caf1f573bb87 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -101,7 +101,6 @@ int io_poll_issue(struct io_kiocb *req, struct io_tw_state *ts);
 int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr);
 int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin);
 void __io_submit_flush_completions(struct io_ring_ctx *ctx);
-int io_req_prep_async(struct io_kiocb *req);
 
 struct io_wq_work *io_wq_free_work(struct io_wq_work *work);
 void io_wq_submit_work(struct io_wq_work *work);
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index 745246086c23..2de5cca9504e 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -67,6 +67,7 @@ const struct io_issue_def io_issue_defs[] = {
 		.iopoll = 1,
 		.iopoll_queue = 1,
 		.vectored = 1,
+		.async_size = sizeof(struct io_async_rw),
 		.prep = io_prep_readv,
 		.issue = io_read,
 	},
@@ -81,6 +82,7 @@ const struct io_issue_def io_issue_defs[] = {
 		.iopoll = 1,
 		.iopoll_queue = 1,
 		.vectored = 1,
+		.async_size = sizeof(struct io_async_rw),
 		.prep = io_prep_writev,
 		.issue = io_write,
 	},
@@ -99,6 +101,7 @@ const struct io_issue_def io_issue_defs[] = {
 		.ioprio = 1,
 		.iopoll = 1,
 		.iopoll_queue = 1,
+		.async_size = sizeof(struct io_async_rw),
 		.prep = io_prep_read_fixed,
 		.issue = io_read,
 	},
@@ -112,6 +115,7 @@ const struct io_issue_def io_issue_defs[] = {
 		.ioprio = 1,
 		.iopoll = 1,
 		.iopoll_queue = 1,
+		.async_size = sizeof(struct io_async_rw),
 		.prep = io_prep_write_fixed,
 		.issue = io_write,
 	},
@@ -138,8 +142,8 @@ const struct io_issue_def io_issue_defs[] = {
 		.unbound_nonreg_file = 1,
 		.pollout = 1,
 		.ioprio = 1,
-		.manual_alloc = 1,
 #if defined(CONFIG_NET)
+		.async_size = sizeof(struct io_async_msghdr),
 		.prep = io_sendmsg_prep,
 		.issue = io_sendmsg,
 #else
@@ -152,8 +156,8 @@ const struct io_issue_def io_issue_defs[] = {
 		.pollin = 1,
 		.buffer_select = 1,
 		.ioprio = 1,
-		.manual_alloc = 1,
 #if defined(CONFIG_NET)
+		.async_size = sizeof(struct io_async_msghdr),
 		.prep = io_recvmsg_prep,
 		.issue = io_recvmsg,
 #else
@@ -162,6 +166,7 @@ const struct io_issue_def io_issue_defs[] = {
 	},
 	[IORING_OP_TIMEOUT] = {
 		.audit_skip = 1,
+		.async_size = sizeof(struct io_timeout_data),
 		.prep = io_timeout_prep,
 		.issue = io_timeout,
 	},
@@ -191,6 +196,7 @@ const struct io_issue_def io_issue_defs[] = {
 	},
 	[IORING_OP_LINK_TIMEOUT] = {
 		.audit_skip = 1,
+		.async_size = sizeof(struct io_timeout_data),
 		.prep = io_link_timeout_prep,
 		.issue = io_no_issue,
 	},
@@ -199,6 +205,7 @@ const struct io_issue_def io_issue_defs[] = {
 		.unbound_nonreg_file = 1,
 		.pollout = 1,
 #if defined(CONFIG_NET)
+		.async_size = sizeof(struct io_async_msghdr),
 		.prep = io_connect_prep,
 		.issue = io_connect,
 #else
@@ -239,6 +246,7 @@ const struct io_issue_def io_issue_defs[] = {
 		.ioprio = 1,
 		.iopoll = 1,
 		.iopoll_queue = 1,
+		.async_size = sizeof(struct io_async_rw),
 		.prep = io_prep_read,
 		.issue = io_read,
 	},
@@ -252,6 +260,7 @@ const struct io_issue_def io_issue_defs[] = {
 		.ioprio = 1,
 		.iopoll = 1,
 		.iopoll_queue = 1,
+		.async_size = sizeof(struct io_async_rw),
 		.prep = io_prep_write,
 		.issue = io_write,
 	},
@@ -272,8 +281,9 @@ const struct io_issue_def io_issue_defs[] = {
 		.pollout = 1,
 		.audit_skip = 1,
 		.ioprio = 1,
-		.manual_alloc = 1,
+		.buffer_select = 1,
 #if defined(CONFIG_NET)
+		.async_size = sizeof(struct io_async_msghdr),
 		.prep = io_sendmsg_prep,
 		.issue = io_send,
 #else
@@ -288,6 +298,7 @@ const struct io_issue_def io_issue_defs[] = {
 		.audit_skip = 1,
 		.ioprio = 1,
 #if defined(CONFIG_NET)
+		.async_size = sizeof(struct io_async_msghdr),
 		.prep = io_recvmsg_prep,
 		.issue = io_recv,
 #else
@@ -403,6 +414,7 @@ const struct io_issue_def io_issue_defs[] = {
 		.plug = 1,
 		.iopoll = 1,
 		.iopoll_queue = 1,
+		.async_size = 2 * sizeof(struct io_uring_sqe),
 		.prep = io_uring_cmd_prep,
 		.issue = io_uring_cmd,
 	},
@@ -412,8 +424,8 @@ const struct io_issue_def io_issue_defs[] = {
 		.pollout = 1,
 		.audit_skip = 1,
 		.ioprio = 1,
-		.manual_alloc = 1,
 #if defined(CONFIG_NET)
+		.async_size = sizeof(struct io_async_msghdr),
 		.prep = io_send_zc_prep,
 		.issue = io_send_zc,
 #else
@@ -425,8 +437,8 @@ const struct io_issue_def io_issue_defs[] = {
 		.unbound_nonreg_file = 1,
 		.pollout = 1,
 		.ioprio = 1,
-		.manual_alloc = 1,
 #if defined(CONFIG_NET)
+		.async_size = sizeof(struct io_async_msghdr),
 		.prep = io_send_zc_prep,
 		.issue = io_sendmsg_zc,
 #else
@@ -439,10 +451,12 @@ const struct io_issue_def io_issue_defs[] = {
 		.pollin = 1,
 		.buffer_select = 1,
 		.audit_skip = 1,
+		.async_size = sizeof(struct io_async_rw),
 		.prep = io_read_mshot_prep,
 		.issue = io_read_mshot,
 	},
 	[IORING_OP_WAITID] = {
+		.async_size = sizeof(struct io_waitid_async),
 		.prep = io_waitid_prep,
 		.issue = io_waitid,
 	},
@@ -488,13 +502,11 @@ const struct io_cold_def io_cold_defs[] = {
 		.name = "NOP",
 	},
 	[IORING_OP_READV] = {
-		.async_size = sizeof(struct io_async_rw),
 		.name = "READV",
 		.cleanup = io_readv_writev_cleanup,
 		.fail = io_rw_fail,
 	},
 	[IORING_OP_WRITEV] = {
-		.async_size = sizeof(struct io_async_rw),
 		.name = "WRITEV",
 		.cleanup = io_readv_writev_cleanup,
 		.fail = io_rw_fail,
@@ -503,12 +515,10 @@ const struct io_cold_def io_cold_defs[] = {
 		.name = "FSYNC",
 	},
 	[IORING_OP_READ_FIXED] = {
-		.async_size = sizeof(struct io_async_rw),
 		.name = "READ_FIXED",
 		.fail = io_rw_fail,
 	},
 	[IORING_OP_WRITE_FIXED] = {
-		.async_size = sizeof(struct io_async_rw),
 		.name = "WRITE_FIXED",
 		.fail = io_rw_fail,
 	},
@@ -524,7 +534,6 @@ const struct io_cold_def io_cold_defs[] = {
 	[IORING_OP_SENDMSG] = {
 		.name = "SENDMSG",
 #if defined(CONFIG_NET)
-		.async_size = sizeof(struct io_async_msghdr),
 		.cleanup = io_sendmsg_recvmsg_cleanup,
 		.fail = io_sendrecv_fail,
 #endif
@@ -532,13 +541,11 @@ const struct io_cold_def io_cold_defs[] = {
 	[IORING_OP_RECVMSG] = {
 		.name = "RECVMSG",
 #if defined(CONFIG_NET)
-		.async_size = sizeof(struct io_async_msghdr),
 		.cleanup = io_sendmsg_recvmsg_cleanup,
 		.fail = io_sendrecv_fail,
 #endif
 	},
 	[IORING_OP_TIMEOUT] = {
-		.async_size = sizeof(struct io_timeout_data),
 		.name = "TIMEOUT",
 	},
 	[IORING_OP_TIMEOUT_REMOVE] = {
@@ -551,14 +558,10 @@ const struct io_cold_def io_cold_defs[] = {
 		.name = "ASYNC_CANCEL",
 	},
 	[IORING_OP_LINK_TIMEOUT] = {
-		.async_size = sizeof(struct io_timeout_data),
 		.name = "LINK_TIMEOUT",
 	},
 	[IORING_OP_CONNECT] = {
 		.name = "CONNECT",
-#if defined(CONFIG_NET)
-		.async_size = sizeof(struct io_async_msghdr),
-#endif
 	},
 	[IORING_OP_FALLOCATE] = {
 		.name = "FALLOCATE",
@@ -578,12 +581,10 @@ const struct io_cold_def io_cold_defs[] = {
 		.cleanup = io_statx_cleanup,
 	},
 	[IORING_OP_READ] = {
-		.async_size = sizeof(struct io_async_rw),
 		.name = "READ",
 		.fail = io_rw_fail,
 	},
 	[IORING_OP_WRITE] = {
-		.async_size = sizeof(struct io_async_rw),
 		.name = "WRITE",
 		.fail = io_rw_fail,
 	},
@@ -596,7 +597,6 @@ const struct io_cold_def io_cold_defs[] = {
 	[IORING_OP_SEND] = {
 		.name = "SEND",
 #if defined(CONFIG_NET)
-		.async_size = sizeof(struct io_async_msghdr),
 		.cleanup = io_sendmsg_recvmsg_cleanup,
 		.fail = io_sendrecv_fail,
 #endif
@@ -604,7 +604,6 @@ const struct io_cold_def io_cold_defs[] = {
 	[IORING_OP_RECV] = {
 		.name = "RECV",
 #if defined(CONFIG_NET)
-		.async_size = sizeof(struct io_async_msghdr),
 		.cleanup = io_sendmsg_recvmsg_cleanup,
 		.fail = io_sendrecv_fail,
 #endif
@@ -676,12 +675,10 @@ const struct io_cold_def io_cold_defs[] = {
 	},
 	[IORING_OP_URING_CMD] = {
 		.name = "URING_CMD",
-		.async_size = 2 * sizeof(struct io_uring_sqe),
 	},
 	[IORING_OP_SEND_ZC] = {
 		.name = "SEND_ZC",
 #if defined(CONFIG_NET)
-		.async_size = sizeof(struct io_async_msghdr),
 		.cleanup = io_send_zc_cleanup,
 		.fail = io_sendrecv_fail,
 #endif
@@ -689,18 +686,15 @@ const struct io_cold_def io_cold_defs[] = {
 	[IORING_OP_SENDMSG_ZC] = {
 		.name = "SENDMSG_ZC",
 #if defined(CONFIG_NET)
-		.async_size = sizeof(struct io_async_msghdr),
 		.cleanup = io_send_zc_cleanup,
 		.fail = io_sendrecv_fail,
 #endif
 	},
 	[IORING_OP_READ_MULTISHOT] = {
-		.async_size = sizeof(struct io_async_rw),
 		.name = "READ_MULTISHOT",
 	},
 	[IORING_OP_WAITID] = {
 		.name = "WAITID",
-		.async_size = sizeof(struct io_waitid_async),
 	},
 	[IORING_OP_FUTEX_WAIT] = {
 		.name = "FUTEX_WAIT",
diff --git a/io_uring/opdef.h b/io_uring/opdef.h
index 9e5435ec27d0..7ee6f5aa90aa 100644
--- a/io_uring/opdef.h
+++ b/io_uring/opdef.h
@@ -27,22 +27,19 @@ struct io_issue_def {
 	unsigned iopoll : 1;
 	/* have to be put into the iopoll list */
 	unsigned iopoll_queue : 1;
-	/* opcode specific path will handle ->async_data allocation if needed */
-	unsigned manual_alloc : 1;
 	/* vectored opcode, set if 1) vectored, and 2) handler needs to know */
 	unsigned vectored : 1;
 
+	/* size of async data needed, if any */
+	unsigned short async_size;
+
 	int (*issue)(struct io_kiocb *, unsigned int);
 	int (*prep)(struct io_kiocb *, const struct io_uring_sqe *);
 };
 
 struct io_cold_def {
-	/* size of async data needed, if any */
-	unsigned short async_size;
-
 	const char *name;
 
-	int (*prep_async)(struct io_kiocb *);
 	void (*cleanup)(struct io_kiocb *);
 	void (*fail)(struct io_kiocb *);
 };
diff --git a/io_uring/uring_cmd.h b/io_uring/uring_cmd.h
index b0ccff7091ee..477ea8865639 100644
--- a/io_uring/uring_cmd.h
+++ b/io_uring/uring_cmd.h
@@ -9,7 +9,6 @@ struct uring_cache {
 
 int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags);
 int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
-int io_uring_cmd_prep_async(struct io_kiocb *req);
 void io_uring_cache_free(struct io_cache_entry *entry);
 
 bool io_uring_try_cancel_uring_cmd(struct io_ring_ctx *ctx,

From patchwork Wed Mar 20 22:55:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13598266 Received: from mail-io1-f54.google.com (mail-io1-f54.google.com [209.85.166.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C814085C73 for ; Wed, 20 Mar 2024 22:58:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.166.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975509; cv=none; b=WeTVx5MhvQDohbVRnCPjsDvpLqeAMh8E8IAOHnMXR8EdciVKm7nVNF9ni1ppRnv4k7YYf3AwCAl1UydkKnRnCSiRisaBAI52mjBoB4QBcQSCG4GaDiLLyz6W4VTEH93hCJ41Eqf883p28gwv59KxHUq9BmqHa3MALeKanYqItcM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710975509; c=relaxed/simple; bh=caViakkOg5kJXZMP9AVtl4ebLqQ2aXaJBYp6zawekwI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=j5WgLVSWhHVO65Qf4VYlp8E42ybRQAp+zIbiwLi/Yam/N733obcTpqiVV0w7ab8D4X/qKxjWxTPDAk1YTMLVH9lGfjSD5zEsgpzIuUP7YH3VDV1hAplYpPWtTGcpoSZcRRFAdOZx4/9FPpK+V1BvHlR9r6NgiInovDNyaeD5hbQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=eRxsdA7g; arc=none smtp.client-ip=209.85.166.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="eRxsdA7g" Received: by
mail-io1-f54.google.com with SMTP id ca18e2360f4ac-7cc5e664d52so6269939f.0 for ; Wed, 20 Mar 2024 15:58:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1710975504; x=1711580304; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=VPSzS6txPEkTrfLLkD9zuSt6VxfHBiyyTyoq7ztEnDo=; b=eRxsdA7glgdIAJUzQZOFNXcck9loPF2drew/JE89J0Og0kHF2bNxbEXQMT3RmO7PX8 MMk6kKTRzmw8aAz/HSIo1jQhOForJUnRTAk9N/PVUWHIyfxaBRaeuN/NlS1+QvQpTSg5 eB3HYOk8Yeyh0ExWomIeL+hxpErT/vVCQxMck8gDW61b05D6lmYVy57b/AXiue3zp3PY h5yQqC5wEjQMjnWeIV3LyR5ECR74xK/POjBeECQI0NAoDvPVsB2zJctedHQPzotChKbW d7A2wctiW3faXZM8UAI/pzlprkWeSozyUcey92Mwyk0JiVz0rzQXBSoT9ZBOtnLvPBat HIIA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710975504; x=1711580304; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=VPSzS6txPEkTrfLLkD9zuSt6VxfHBiyyTyoq7ztEnDo=; b=exkVMEYW1lJzMvYjwsqIFW85cJ2vFnXZWVKoL6zE9iaju4acPRw9ya5PJzU4TInUPU 9he4oY15a1E4IE0FAOOfBrTOq57vgsOonBzgCxrRZYPmUqkSWWhFUfn0cELm+rdLs+QK 2zrWkxUeLRqpFMlSYrOrCCiFfxYVHAHES2AYN29I0hPBSFapnLHc3Il6k3ru6b3p0ykL IQV9Kx3cc4G0G5DeIJ701zXNnJKfadQBF9F7eHePz/iPOcnMPpuDb241vxHb0h2rR0IQ uXhTeO52WjLgB3DEJX96YeI3EjhEEZwFxU3orAGUrm7Z4baJgFYCuRa42pQjy561ej3L naMQ== X-Gm-Message-State: AOJu0Yydj1PNdA2prTwYiWHzF8C3nyqhU+zpQNe/qr8POvLdAytbPsrg yEYFMVTFqNJO30nxawkK4PLLXkFLcgtfvwlN88HkyJnwehV+rv/krk39rptMtJBuabpFIHJpJKg w X-Google-Smtp-Source: AGHT+IE9jSuNIf2Gxg+UyXEzVCZ6VPlwhQLCHtemOraVrc0gGulMTttW8Qjz2EG42FgM+sthlHz/+Q== X-Received: by 2002:a6b:5108:0:b0:7ce:f407:1edf with SMTP id f8-20020a6b5108000000b007cef4071edfmr6852065iob.0.1710975504349; Wed, 20 Mar 2024 15:58:24 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with 
ESMTPSA id z19-20020a6b0a13000000b007cf23a498dcsm434384ioi.38.2024.03.20.15.58.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Mar 2024 15:58:22 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 17/17] io_uring/alloc_cache: switch to array based caching Date: Wed, 20 Mar 2024 16:55:32 -0600 Message-ID: <20240320225750.1769647-18-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240320225750.1769647-1-axboe@kernel.dk> References: <20240320225750.1769647-1-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0

Currently lists are being used to manage this, but a list isn't a very good choice, as extracting the current entry necessitates touching the next entry as well, to update the list head. Outside of that detail, games are also played with KASAN as the list is inside the cached entry itself. Finally, all users of this need a struct io_cache_entry embedded in their struct, which is union'ized with something else in there that isn't used across the free -> realloc cycle. Get rid of all of that, and simply have it be an array. This will not change the memory used, as we're just trading an 8-byte member entry for the per-elem array size. This reduces the overhead of the recycled allocations, and it reduces the code we have to support recycling.
Signed-off-by: Jens Axboe --- include/linux/io_uring_types.h | 2 +- io_uring/alloc_cache.h | 51 +++++++++++++++------------------- io_uring/futex.c | 26 ++++++----------- io_uring/futex.h | 5 ++-- io_uring/io_uring.c | 35 ++++++++++++----------- io_uring/net.c | 13 ++++----- io_uring/net.h | 16 ++++------- io_uring/poll.c | 11 ++------ io_uring/poll.h | 7 +---- io_uring/rsrc.c | 9 ++---- io_uring/rsrc.h | 5 +--- io_uring/rw.c | 13 ++++----- io_uring/rw.h | 7 ++--- io_uring/uring_cmd.c | 13 ++------- io_uring/uring_cmd.h | 6 +--- 15 files changed, 82 insertions(+), 137 deletions(-) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index e3ec84c43f1a..aeb4639785b5 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -220,7 +220,7 @@ struct io_ev_fd { }; struct io_alloc_cache { - struct io_wq_work_node list; + void **entries; unsigned int nr_cached; unsigned int max_cached; size_t elem_size; diff --git a/io_uring/alloc_cache.h b/io_uring/alloc_cache.h index 138ad14b0b12..4349d3519563 100644 --- a/io_uring/alloc_cache.h +++ b/io_uring/alloc_cache.h @@ -6,61 +6,54 @@ */ #define IO_ALLOC_CACHE_MAX 128 -struct io_cache_entry { - struct io_wq_work_node node; -}; - static inline bool io_alloc_cache_put(struct io_alloc_cache *cache, - struct io_cache_entry *entry) + void *entry) { if (cache->nr_cached < cache->max_cached) { - cache->nr_cached++; - wq_stack_add_head(&entry->node, &cache->list); - kasan_mempool_poison_object(entry); + if (!kasan_mempool_poison_object(entry)) + return false; + cache->entries[cache->nr_cached++] = entry; return true; } return false; } -static inline bool io_alloc_cache_empty(struct io_alloc_cache *cache) -{ - return !cache->list.next; -} - -static inline struct io_cache_entry *io_alloc_cache_get(struct io_alloc_cache *cache) +static inline void *io_alloc_cache_get(struct io_alloc_cache *cache) { - if (cache->list.next) { - struct io_cache_entry *entry; + if (cache->nr_cached) { + void 
*entry = cache->entries[--cache->nr_cached]; - entry = container_of(cache->list.next, struct io_cache_entry, node); kasan_mempool_unpoison_object(entry, cache->elem_size); - cache->list.next = cache->list.next->next; - cache->nr_cached--; return entry; } return NULL; } -static inline void io_alloc_cache_init(struct io_alloc_cache *cache, - unsigned max_nr, size_t size) +static inline int io_alloc_cache_init(struct io_alloc_cache *cache, + unsigned max_nr, size_t size) { - cache->list.next = NULL; + cache->entries = kvmalloc_array(max_nr, sizeof(void *), GFP_KERNEL); + if (!cache->entries) + return -ENOMEM; cache->nr_cached = 0; cache->max_cached = max_nr; cache->elem_size = size; + return 0; } static inline void io_alloc_cache_free(struct io_alloc_cache *cache, - void (*free)(struct io_cache_entry *)) + void (*free)(const void *)) { - while (1) { - struct io_cache_entry *entry = io_alloc_cache_get(cache); + void *entry; - if (!entry) - break; + if (!cache->entries) + return; + + while ((entry = io_alloc_cache_get(cache)) != NULL) free(entry); - } - cache->nr_cached = 0; + + kvfree(cache->entries); + cache->entries = NULL; } #endif diff --git a/io_uring/futex.c b/io_uring/futex.c index 792a03df58de..3dd6d394ca88 100644 --- a/io_uring/futex.c +++ b/io_uring/futex.c @@ -27,27 +27,19 @@ struct io_futex { }; struct io_futex_data { - union { - struct futex_q q; - struct io_cache_entry cache; - }; + struct futex_q q; struct io_kiocb *req; }; -void io_futex_cache_init(struct io_ring_ctx *ctx) +int io_futex_cache_init(struct io_ring_ctx *ctx) { - io_alloc_cache_init(&ctx->futex_cache, IO_NODE_ALLOC_CACHE_MAX, + return io_alloc_cache_init(&ctx->futex_cache, IO_NODE_ALLOC_CACHE_MAX, sizeof(struct io_futex_data)); } -static void io_futex_cache_entry_free(struct io_cache_entry *entry) -{ - kfree(container_of(entry, struct io_futex_data, cache)); -} - void io_futex_cache_free(struct io_ring_ctx *ctx) { - io_alloc_cache_free(&ctx->futex_cache, io_futex_cache_entry_free); + 
io_alloc_cache_free(&ctx->futex_cache, kfree); } static void __io_futex_complete(struct io_kiocb *req, struct io_tw_state *ts) @@ -63,7 +55,7 @@ static void io_futex_complete(struct io_kiocb *req, struct io_tw_state *ts) struct io_ring_ctx *ctx = req->ctx; io_tw_lock(ctx, ts); - if (!io_alloc_cache_put(&ctx->futex_cache, &ifd->cache)) + if (!io_alloc_cache_put(&ctx->futex_cache, ifd)) kfree(ifd); __io_futex_complete(req, ts); } @@ -259,11 +251,11 @@ static void io_futex_wake_fn(struct wake_q_head *wake_q, struct futex_q *q) static struct io_futex_data *io_alloc_ifd(struct io_ring_ctx *ctx) { - struct io_cache_entry *entry; + struct io_futex_data *ifd; - entry = io_alloc_cache_get(&ctx->futex_cache); - if (entry) - return container_of(entry, struct io_futex_data, cache); + ifd = io_alloc_cache_get(&ctx->futex_cache); + if (ifd) + return ifd; return kmalloc(sizeof(struct io_futex_data), GFP_NOWAIT); } diff --git a/io_uring/futex.h b/io_uring/futex.h index 0847e9e8a127..75ea753240ba 100644 --- a/io_uring/futex.h +++ b/io_uring/futex.h @@ -13,7 +13,7 @@ int io_futex_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd, unsigned int issue_flags); bool io_futex_remove_all(struct io_ring_ctx *ctx, struct task_struct *task, bool cancel_all); -void io_futex_cache_init(struct io_ring_ctx *ctx); +int io_futex_cache_init(struct io_ring_ctx *ctx); void io_futex_cache_free(struct io_ring_ctx *ctx); #else static inline int io_futex_cancel(struct io_ring_ctx *ctx, @@ -27,8 +27,9 @@ static inline bool io_futex_remove_all(struct io_ring_ctx *ctx, { return false; } -static inline void io_futex_cache_init(struct io_ring_ctx *ctx) +static inline int io_futex_cache_init(struct io_ring_ctx *ctx) { + return 0; } static inline void io_futex_cache_free(struct io_ring_ctx *ctx) { diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 5eee07563079..2aa3f223739a 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -273,7 +273,7 @@ static int io_alloc_hash_table(struct 
io_hash_table *table, unsigned bits) static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) { struct io_ring_ctx *ctx; - int hash_bits; + int ret, hash_bits; ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); if (!ctx) @@ -302,17 +302,19 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) INIT_LIST_HEAD(&ctx->sqd_list); INIT_LIST_HEAD(&ctx->cq_overflow_list); INIT_LIST_HEAD(&ctx->io_buffers_cache); - io_alloc_cache_init(&ctx->rsrc_node_cache, IO_NODE_ALLOC_CACHE_MAX, + ret = io_alloc_cache_init(&ctx->rsrc_node_cache, IO_NODE_ALLOC_CACHE_MAX, sizeof(struct io_rsrc_node)); - io_alloc_cache_init(&ctx->apoll_cache, IO_ALLOC_CACHE_MAX, + ret |= io_alloc_cache_init(&ctx->apoll_cache, IO_ALLOC_CACHE_MAX, sizeof(struct async_poll)); - io_alloc_cache_init(&ctx->netmsg_cache, IO_ALLOC_CACHE_MAX, + ret |= io_alloc_cache_init(&ctx->netmsg_cache, IO_ALLOC_CACHE_MAX, sizeof(struct io_async_msghdr)); - io_alloc_cache_init(&ctx->rw_cache, IO_ALLOC_CACHE_MAX, + ret |= io_alloc_cache_init(&ctx->rw_cache, IO_ALLOC_CACHE_MAX, sizeof(struct io_async_rw)); - io_alloc_cache_init(&ctx->uring_cache, IO_ALLOC_CACHE_MAX, + ret |= io_alloc_cache_init(&ctx->uring_cache, IO_ALLOC_CACHE_MAX, sizeof(struct uring_cache)); - io_futex_cache_init(ctx); + ret |= io_futex_cache_init(ctx); + if (ret) + goto err; init_completion(&ctx->ref_comp); xa_init_flags(&ctx->personalities, XA_FLAGS_ALLOC1); mutex_init(&ctx->uring_lock); @@ -342,6 +344,12 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) return ctx; err: + io_alloc_cache_free(&ctx->rsrc_node_cache, kfree); + io_alloc_cache_free(&ctx->apoll_cache, kfree); + io_alloc_cache_free(&ctx->netmsg_cache, io_netmsg_cache_free); + io_alloc_cache_free(&ctx->rw_cache, io_rw_cache_free); + io_alloc_cache_free(&ctx->uring_cache, kfree); + io_futex_cache_free(ctx); kfree(ctx->cancel_table.hbs); kfree(ctx->cancel_table_locked.hbs); xa_destroy(&ctx->io_bl_xa); @@ -1479,7 +1487,7 @@ static 
void io_free_batch_list(struct io_ring_ctx *ctx, if (apoll->double_poll) kfree(apoll->double_poll); - if (!io_alloc_cache_put(&ctx->apoll_cache, &apoll->cache)) + if (!io_alloc_cache_put(&ctx->apoll_cache, apoll)) kfree(apoll); req->flags &= ~REQ_F_POLLED; } @@ -2853,11 +2861,6 @@ static void io_req_caches_free(struct io_ring_ctx *ctx) mutex_unlock(&ctx->uring_lock); } -static void io_rsrc_node_cache_free(struct io_cache_entry *entry) -{ - kfree(container_of(entry, struct io_rsrc_node, cache)); -} - static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx) { io_sq_thread_finish(ctx); @@ -2872,10 +2875,10 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx) __io_sqe_files_unregister(ctx); io_cqring_overflow_kill(ctx); io_eventfd_unregister(ctx); - io_alloc_cache_free(&ctx->apoll_cache, io_apoll_cache_free); + io_alloc_cache_free(&ctx->apoll_cache, kfree); io_alloc_cache_free(&ctx->netmsg_cache, io_netmsg_cache_free); io_alloc_cache_free(&ctx->rw_cache, io_rw_cache_free); - io_alloc_cache_free(&ctx->uring_cache, io_uring_cache_free); + io_alloc_cache_free(&ctx->uring_cache, kfree); io_futex_cache_free(ctx); io_destroy_buffers(ctx); mutex_unlock(&ctx->uring_lock); @@ -2891,7 +2894,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx) WARN_ON_ONCE(!list_empty(&ctx->rsrc_ref_list)); WARN_ON_ONCE(!list_empty(&ctx->ltimeout_list)); - io_alloc_cache_free(&ctx->rsrc_node_cache, io_rsrc_node_cache_free); + io_alloc_cache_free(&ctx->rsrc_node_cache, kfree); if (ctx->mm_account) { mmdrop(ctx->mm_account); ctx->mm_account = NULL; diff --git a/io_uring/net.c b/io_uring/net.c index 5794b941254c..6485c50493ac 100644 --- a/io_uring/net.c +++ b/io_uring/net.c @@ -139,7 +139,7 @@ static void io_netmsg_recycle(struct io_kiocb *req, unsigned int issue_flags) /* Let normal cleanup path reap it if we fail adding to the cache */ iov = hdr->free_iov; - if (io_alloc_cache_put(&req->ctx->netmsg_cache, &hdr->cache)) { + if (io_alloc_cache_put(&req->ctx->netmsg_cache, 
hdr)) { if (iov) kasan_mempool_poison_object(iov); req->async_data = NULL; @@ -150,12 +150,10 @@ static void io_netmsg_recycle(struct io_kiocb *req, unsigned int issue_flags) static struct io_async_msghdr *io_msg_alloc_async(struct io_kiocb *req) { struct io_ring_ctx *ctx = req->ctx; - struct io_cache_entry *entry; struct io_async_msghdr *hdr; - entry = io_alloc_cache_get(&ctx->netmsg_cache); - if (entry) { - hdr = container_of(entry, struct io_async_msghdr, cache); + hdr = io_alloc_cache_get(&ctx->netmsg_cache); + if (hdr) { if (hdr->free_iov) { kasan_mempool_unpoison_object(hdr->free_iov, hdr->free_iov_nr * sizeof(struct iovec)); @@ -1492,11 +1490,10 @@ int io_connect(struct io_kiocb *req, unsigned int issue_flags) return IOU_OK; } -void io_netmsg_cache_free(struct io_cache_entry *entry) +void io_netmsg_cache_free(const void *entry) { - struct io_async_msghdr *kmsg; + struct io_async_msghdr *kmsg = (struct io_async_msghdr *) entry; - kmsg = container_of(entry, struct io_async_msghdr, cache); if (kmsg->free_iov) { kasan_mempool_unpoison_object(kmsg->free_iov, kmsg->free_iov_nr * sizeof(struct iovec)); diff --git a/io_uring/net.h b/io_uring/net.h index b47b43ec6459..c48c44a81850 100644 --- a/io_uring/net.h +++ b/io_uring/net.h @@ -7,19 +7,13 @@ struct io_async_msghdr { #if defined(CONFIG_NET) - union { - struct iovec fast_iov; - struct { - struct io_cache_entry cache; - /* entry size of ->free_iov, if valid */ - int free_iov_nr; - }; - }; + struct iovec fast_iov; /* points to an allocated iov, if NULL we use fast_iov instead */ struct iovec *free_iov; + int free_iov_nr; + int namelen; __kernel_size_t controllen; __kernel_size_t payloadlen; - int namelen; struct sockaddr __user *uaddr; struct msghdr msg; struct sockaddr_storage addr; @@ -57,9 +51,9 @@ int io_sendmsg_zc(struct io_kiocb *req, unsigned int issue_flags); int io_send_zc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); void io_send_zc_cleanup(struct io_kiocb *req); -void 
io_netmsg_cache_free(struct io_cache_entry *entry); +void io_netmsg_cache_free(const void *entry); #else -static inline void io_netmsg_cache_free(struct io_cache_entry *entry) +static inline void io_netmsg_cache_free(const void *entry) { } #endif diff --git a/io_uring/poll.c b/io_uring/poll.c index 5d55bbf1de15..536c4eda7c26 100644 --- a/io_uring/poll.c +++ b/io_uring/poll.c @@ -686,17 +686,15 @@ static struct async_poll *io_req_alloc_apoll(struct io_kiocb *req, unsigned issue_flags) { struct io_ring_ctx *ctx = req->ctx; - struct io_cache_entry *entry; struct async_poll *apoll; if (req->flags & REQ_F_POLLED) { apoll = req->apoll; kfree(apoll->double_poll); } else if (!(issue_flags & IO_URING_F_UNLOCKED)) { - entry = io_alloc_cache_get(&ctx->apoll_cache); - if (entry == NULL) + apoll = io_alloc_cache_get(&ctx->apoll_cache); + if (!apoll) goto alloc_apoll; - apoll = container_of(entry, struct async_poll, cache); apoll->poll.retries = APOLL_MAX_RETRY; } else { alloc_apoll: @@ -1055,8 +1053,3 @@ int io_poll_remove(struct io_kiocb *req, unsigned int issue_flags) io_req_set_res(req, ret, 0); return IOU_OK; } - -void io_apoll_cache_free(struct io_cache_entry *entry) -{ - kfree(container_of(entry, struct async_poll, cache)); -} diff --git a/io_uring/poll.h b/io_uring/poll.h index 1dacae9e816c..f67c5aeabb63 100644 --- a/io_uring/poll.h +++ b/io_uring/poll.h @@ -17,10 +17,7 @@ struct io_poll { }; struct async_poll { - union { - struct io_poll poll; - struct io_cache_entry cache; - }; + struct io_poll poll; struct io_poll *double_poll; }; @@ -46,6 +43,4 @@ int io_arm_poll_handler(struct io_kiocb *req, unsigned issue_flags); bool io_poll_remove_all(struct io_ring_ctx *ctx, struct task_struct *tsk, bool cancel_all); -void io_apoll_cache_free(struct io_cache_entry *entry); - void io_poll_task_func(struct io_kiocb *req, struct io_tw_state *ts); diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c index 7195c01e675a..2def86427a5e 100644 --- a/io_uring/rsrc.c +++ b/io_uring/rsrc.c @@ 
-169,7 +169,7 @@ static void io_rsrc_put_work(struct io_rsrc_node *node) void io_rsrc_node_destroy(struct io_ring_ctx *ctx, struct io_rsrc_node *node) { - if (!io_alloc_cache_put(&ctx->rsrc_node_cache, &node->cache)) + if (!io_alloc_cache_put(&ctx->rsrc_node_cache, node)) kfree(node); } @@ -197,12 +197,9 @@ void io_rsrc_node_ref_zero(struct io_rsrc_node *node) struct io_rsrc_node *io_rsrc_node_alloc(struct io_ring_ctx *ctx) { struct io_rsrc_node *ref_node; - struct io_cache_entry *entry; - entry = io_alloc_cache_get(&ctx->rsrc_node_cache); - if (entry) { - ref_node = container_of(entry, struct io_rsrc_node, cache); - } else { + ref_node = io_alloc_cache_get(&ctx->rsrc_node_cache); + if (!ref_node) { ref_node = kzalloc(sizeof(*ref_node), GFP_KERNEL); if (!ref_node) return NULL; diff --git a/io_uring/rsrc.h b/io_uring/rsrc.h index e21000238954..b4cec653100d 100644 --- a/io_uring/rsrc.h +++ b/io_uring/rsrc.h @@ -36,10 +36,7 @@ struct io_rsrc_data { }; struct io_rsrc_node { - union { - struct io_cache_entry cache; - struct io_ring_ctx *ctx; - }; + struct io_ring_ctx *ctx; int refs; bool empty; u16 type; diff --git a/io_uring/rw.c b/io_uring/rw.c index 57f2d315a620..6849795532ab 100644 --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -154,7 +154,7 @@ static void io_rw_recycle(struct io_kiocb *req, unsigned int issue_flags) return; } iov = rw->free_iovec; - if (io_alloc_cache_put(&req->ctx->rw_cache, &rw->cache)) { + if (io_alloc_cache_put(&req->ctx->rw_cache, rw)) { if (iov) kasan_mempool_poison_object(iov); req->async_data = NULL; @@ -200,12 +200,10 @@ static void io_req_rw_cleanup(struct io_kiocb *req, unsigned int issue_flags) static int io_rw_alloc_async(struct io_kiocb *req) { struct io_ring_ctx *ctx = req->ctx; - struct io_cache_entry *entry; struct io_async_rw *rw; - entry = io_alloc_cache_get(&ctx->rw_cache); - if (entry) { - rw = container_of(entry, struct io_async_rw, cache); + rw = io_alloc_cache_get(&ctx->rw_cache); + if (rw) { if (rw->free_iovec) { 
kasan_mempool_unpoison_object(rw->free_iovec, rw->free_iov_nr * sizeof(struct iovec)); @@ -1180,11 +1178,10 @@ int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin) return nr_events; } -void io_rw_cache_free(struct io_cache_entry *entry) +void io_rw_cache_free(const void *entry) { - struct io_async_rw *rw; + struct io_async_rw *rw = (struct io_async_rw *) entry; - rw = container_of(entry, struct io_async_rw, cache); if (rw->free_iovec) { kasan_mempool_unpoison_object(rw->free_iovec, rw->free_iov_nr * sizeof(struct iovec)); diff --git a/io_uring/rw.h b/io_uring/rw.h index cf51d0eb407a..3f432dc75441 100644 --- a/io_uring/rw.h +++ b/io_uring/rw.h @@ -3,10 +3,7 @@ #include struct io_async_rw { - union { - size_t bytes_done; - struct io_cache_entry cache; - }; + size_t bytes_done; struct iov_iter iter; struct iov_iter_state iter_state; struct iovec fast_iov; @@ -28,4 +25,4 @@ void io_rw_fail(struct io_kiocb *req); void io_req_rw_complete(struct io_kiocb *req, struct io_tw_state *ts); int io_read_mshot_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); int io_read_mshot(struct io_kiocb *req, unsigned int issue_flags); -void io_rw_cache_free(struct io_cache_entry *entry); +void io_rw_cache_free(const void *entry); diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c index 92346b5d9f5b..509cfd56726c 100644 --- a/io_uring/uring_cmd.c +++ b/io_uring/uring_cmd.c @@ -17,12 +17,10 @@ static struct uring_cache *io_uring_async_get(struct io_kiocb *req) { struct io_ring_ctx *ctx = req->ctx; - struct io_cache_entry *entry; struct uring_cache *cache; - entry = io_alloc_cache_get(&ctx->uring_cache); - if (entry) { - cache = container_of(entry, struct uring_cache, cache); + cache = io_alloc_cache_get(&ctx->uring_cache); + if (cache) { req->flags |= REQ_F_ASYNC_DATA; req->async_data = cache; return cache; @@ -39,7 +37,7 @@ static void io_req_uring_cleanup(struct io_kiocb *req, unsigned int issue_flags) if (issue_flags & IO_URING_F_UNLOCKED) return; - if 
(io_alloc_cache_put(&req->ctx->uring_cache, &cache->cache)) { + if (io_alloc_cache_put(&req->ctx->uring_cache, cache)) { ioucmd->sqe = NULL; req->async_data = NULL; req->flags &= ~REQ_F_ASYNC_DATA; @@ -354,8 +352,3 @@ int io_uring_cmd_sock(struct io_uring_cmd *cmd, unsigned int issue_flags) } EXPORT_SYMBOL_GPL(io_uring_cmd_sock); #endif - -void io_uring_cache_free(struct io_cache_entry *entry) -{ - kfree(container_of(entry, struct uring_cache, cache)); -} diff --git a/io_uring/uring_cmd.h b/io_uring/uring_cmd.h index 477ea8865639..a361f98664d2 100644 --- a/io_uring/uring_cmd.h +++ b/io_uring/uring_cmd.h @@ -1,15 +1,11 @@ // SPDX-License-Identifier: GPL-2.0 struct uring_cache { - union { - struct io_cache_entry cache; - struct io_uring_sqe sqes[2]; - }; + struct io_uring_sqe sqes[2]; }; int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags); int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); -void io_uring_cache_free(struct io_cache_entry *entry); bool io_uring_try_cancel_uring_cmd(struct io_ring_ctx *ctx, struct task_struct *task, bool cancel_all);
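For readers who want to try the idea outside the kernel, here is a minimal userspace sketch of the array based cache that the alloc_cache.h hunk above introduces. It is an illustration only, not the patch's code: the names struct ptr_cache and ptr_cache_*() are hypothetical, calloc()/free() stand in for kvmalloc_array()/kvfree(), and the KASAN poison/unpoison calls are left out. The shape is the same as in the patch: put stores a pointer in the next free slot, get pops the most recently stored one, and neither has to touch the cached object itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace analogue of struct io_alloc_cache. */
struct ptr_cache {
	void **entries;          /* array of cached object pointers */
	unsigned int nr_cached;  /* number of valid slots in entries[] */
	unsigned int max_cached; /* capacity of entries[] */
};

/* Allocate the slot array; returns 0 on success, -1 on allocation failure. */
static int ptr_cache_init(struct ptr_cache *cache, unsigned int max_nr)
{
	cache->entries = calloc(max_nr, sizeof(void *));
	if (!cache->entries)
		return -1;
	cache->nr_cached = 0;
	cache->max_cached = max_nr;
	return 0;
}

/* Stash an object for reuse; returns false if the cache is full. */
static bool ptr_cache_put(struct ptr_cache *cache, void *entry)
{
	if (cache->nr_cached < cache->max_cached) {
		cache->entries[cache->nr_cached++] = entry;
		return true;
	}
	return false;
}

/* Pop the most recently cached object, or NULL if the cache is empty. */
static void *ptr_cache_get(struct ptr_cache *cache)
{
	if (cache->nr_cached)
		return cache->entries[--cache->nr_cached];
	return NULL;
}

/* Release every cached object with free_fn, then the slot array itself. */
static void ptr_cache_free(struct ptr_cache *cache, void (*free_fn)(void *))
{
	void *entry;

	if (!cache->entries)
		return;

	while ((entry = ptr_cache_get(cache)) != NULL)
		free_fn(entry);

	free(cache->entries);
	cache->entries = NULL;
}
```

As in the patch, a caller that gets false back from the put helper still owns the object and has to free it itself, and because entries are plain void pointers, a cache whose objects need no extra teardown can pass free() directly as the callback.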