From patchwork Wed Apr 5 16:53:20 2023
From: David Howells <dhowells@redhat.com>
To: netdev@vger.kernel.org
Cc: David Howells, "David S. Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni, Willem de Bruijn, Matthew Wilcox, Al Viro,
    Christoph Hellwig, Jens Axboe, Jeff Layton, Christian Brauner,
    Chuck Lever III, Linus Torvalds, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, Boris Pismenny,
    John Fastabend, Herbert Xu
Subject: [PATCH net-next v4 01/20] net: Add samples for network I/O and splicing
Date: Wed, 5 Apr 2023 17:53:20 +0100
Message-Id: <20230405165339.3468808-2-dhowells@redhat.com>
In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com>
References: <20230405165339.3468808-1-dhowells@redhat.com>

Add some small sample programs for doing network I/O including splicing.

There are three IPv4/IPv6 servers: tcp-sink, tls-sink and udp-sink.
They can be given a port number by passing "-p " and will listen on an
IPv6 socket unless given a "-4" flag, in which case they'll listen for
IPv4 only.

There are three IPv4/IPv6 clients: tcp-send, tls-send and udp-send.  They
are given a file to get data from (or "-" for stdin) and the name of a
server to talk to.  They can also be given a port number by passing
"-p ", "-4" or "-6" to force the use of IPv4 or IPv6, "-s" to indicate
they should use splice/sendfile to transfer the data and "-z" to specify
how much data to copy.  If "-s" is given, the input will be spliced if
it's a pipe and sendfiled otherwise.

A driver program, splice-out, is provided to splice data from a
file/stdin to stdout and can be used to pipe into the aforementioned
clients for testing splice.  This takes the name of the file to splice
from (or "-" for stdin).  It can also be given "-w " to indicate the
maximum size of each splice, "-k " if a chunk of the input should be
skipped between splices to prevent coalescence and "-s" if sendfile
should be used instead of splice.

Additionally, there is an AF_UNIX client and server.  These are similar
to the IPv[46] programs, except both take a socket path and there is no
option to change the port number.

And then there are two AF_ALG clients (there is no server).  These are
similar to the other clients, except no destination is specified.  One
exercises skcipher encryption and the other hashing.

Examples include:

	./splice-out -w0x400 /foo/16K 4K | ./alg-encrypt -s -
	./splice-out -w0x400 /foo/1M | ./unix-send -s - /tmp/foo
	./splice-out -w0x400 /foo/16K 16K -w1 | ./tls-send -s6 -z16K - servbox
	./tcp-send /bin/ls 192.168.6.1
	./udp-send -4 -p5555 /foo/4K localhost

where, for example, /foo/16K is a 16KiB file.

Signed-off-by: David Howells
cc: Willem de Bruijn
cc: Boris Pismenny
cc: John Fastabend
cc: Herbert Xu
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: netdev@vger.kernel.org
---
 samples/Kconfig           |   6 ++
 samples/Makefile          |   1 +
 samples/net/Makefile      |  13 +++
 samples/net/alg-encrypt.c | 201 ++++++++++++++++++++++++++++++++++++++
 samples/net/alg-hash.c    | 143 +++++++++++++++++++++++++++
 samples/net/splice-out.c  | 142 +++++++++++++++++++++++++++
 samples/net/tcp-send.c    | 154 +++++++++++++++++++++++++++++
 samples/net/tcp-sink.c    |  76 ++++++++++++++
 samples/net/tls-send.c    | 176 +++++++++++++++++++++++++++++++++
 samples/net/tls-sink.c    |  98 +++++++++++++++++++
 samples/net/udp-send.c    | 151 ++++++++++++++++++++++++++++
 samples/net/udp-sink.c    |  82 ++++++++++++++++
 samples/net/unix-send.c   | 147 ++++++++++++++++++++++++++++
 samples/net/unix-sink.c   |  51 ++++++++++
 14 files changed, 1441 insertions(+)
 create mode 100644 samples/net/Makefile
 create mode 100644 samples/net/alg-encrypt.c
 create mode 100644 samples/net/alg-hash.c
 create mode 100644 samples/net/splice-out.c
 create mode 100644 samples/net/tcp-send.c
 create mode 100644 samples/net/tcp-sink.c
 create mode 100644 samples/net/tls-send.c
 create mode 100644 samples/net/tls-sink.c
 create mode 100644 samples/net/udp-send.c
 create mode 100644 samples/net/udp-sink.c
 create mode 100644 samples/net/unix-send.c
 create mode 100644 samples/net/unix-sink.c

diff --git a/samples/Kconfig b/samples/Kconfig
index 30ef8bd48ba3..14051e9f7532 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -273,6 +273,12 @@ config SAMPLE_CORESIGHT_SYSCFG
	  This demonstrates how a user may create their own CoreSight
	  configurations and easily load them into the system at runtime.
+config SAMPLE_NET + bool "Build example programs that drive network protocols" + depends on NET + help + Build example userspace programs that drive network protocols. + source "samples/rust/Kconfig" endif # SAMPLES diff --git a/samples/Makefile b/samples/Makefile index 7cb632ef88ee..22c1d6244eaf 100644 --- a/samples/Makefile +++ b/samples/Makefile @@ -37,3 +37,4 @@ obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak/ obj-$(CONFIG_SAMPLE_CORESIGHT_SYSCFG) += coresight/ obj-$(CONFIG_SAMPLE_FPROBE) += fprobe/ obj-$(CONFIG_SAMPLES_RUST) += rust/ +obj-$(CONFIG_SAMPLE_NET) += net/ diff --git a/samples/net/Makefile b/samples/net/Makefile new file mode 100644 index 000000000000..0ccd68a36edf --- /dev/null +++ b/samples/net/Makefile @@ -0,0 +1,13 @@ +# SPDX-License-Identifier: GPL-2.0-only +userprogs-always-y += \ + alg-hash \ + alg-encrypt \ + splice-out \ + tcp-send \ + tcp-sink \ + tls-send \ + tls-sink \ + udp-send \ + udp-sink \ + unix-send \ + unix-sink diff --git a/samples/net/alg-encrypt.c b/samples/net/alg-encrypt.c new file mode 100644 index 000000000000..34a62a9c480a --- /dev/null +++ b/samples/net/alg-encrypt.c @@ -0,0 +1,201 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* AF_ALG hash test + * + * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + */ + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0) +#define min(x, y) ((x) < (y) ? (x) : (y)) + +static unsigned char buffer[4096 * 32] __attribute__((aligned(4096))); +static unsigned char iv[16]; +static unsigned char key[16]; + +static const struct sockaddr_alg sa = { + .salg_family = AF_ALG, + .salg_type = "skcipher", + .salg_name = "cbc(aes)", +}; + +static __attribute__((noreturn)) +void format(void) +{ + fprintf(stderr, "alg-send [-s] [-z] |-\n"); + exit(2); +} + +static void algif_add_set_op(struct msghdr *msg, unsigned int op) +{ + struct cmsghdr *__cmsg; + + __cmsg = msg->msg_control + msg->msg_controllen; + __cmsg->cmsg_len = CMSG_LEN(sizeof(unsigned int)); + __cmsg->cmsg_level = SOL_ALG; + __cmsg->cmsg_type = ALG_SET_OP; + *(unsigned int *)CMSG_DATA(__cmsg) = op; + msg->msg_controllen += CMSG_ALIGN(__cmsg->cmsg_len); +} + +static void algif_add_set_iv(struct msghdr *msg, const void *iv, size_t ivlen) +{ + struct af_alg_iv *ivbuf; + struct cmsghdr *__cmsg; + + printf("%zx\n", msg->msg_controllen); + __cmsg = msg->msg_control + msg->msg_controllen; + __cmsg->cmsg_len = CMSG_LEN(sizeof(*ivbuf) + ivlen); + __cmsg->cmsg_level = SOL_ALG; + __cmsg->cmsg_type = ALG_SET_IV; + ivbuf = (struct af_alg_iv *)CMSG_DATA(__cmsg); + ivbuf->ivlen = ivlen; + memcpy(ivbuf->iv, iv, ivlen); + msg->msg_controllen += CMSG_ALIGN(__cmsg->cmsg_len); +} + +int main(int argc, char *argv[]) +{ + struct msghdr msg; + struct stat st; + const char *filename; + unsigned char ctrl[4096]; + ssize_t r, w, o, ret; + size_t size = LONG_MAX, total = 0, i, out = 160; + char *end; + bool use_sendfile = false, all = true; + int opt, alg, sock, fd = 0; + + while ((opt = getopt(argc, argv, "sz:")) != EOF) { + switch (opt) { + case 's': + use_sendfile = true; + break; + case 'z': + size = strtoul(optarg, &end, 0); + switch (*end) { + case 'K': + case 'k': + size *= 1024; + break; + case 'M': + case 'm': + size *= 1024 * 1024; + break; + } + all = false; + break; + default: + format(); + } + } + + argc -= optind; + argv += optind; + if 
(argc != 1) + format(); + filename = argv[0]; + + alg = socket(AF_ALG, SOCK_SEQPACKET, 0); + OSERROR(alg, "AF_ALG"); + OSERROR(bind(alg, (struct sockaddr *)&sa, sizeof(sa)), "bind"); + OSERROR(setsockopt(alg, SOL_ALG, ALG_SET_KEY, key, sizeof(key)), "ALG_SET_KEY"); + sock = accept(alg, NULL, 0); + OSERROR(sock, "accept"); + + if (strcmp(filename, "-") != 0) { + fd = open(filename, O_RDONLY); + OSERROR(fd, filename); + OSERROR(fstat(fd, &st), filename); + size = st.st_size; + } else { + OSERROR(fstat(fd, &st), argv[2]); + } + + memset(&msg, 0, sizeof(msg)); + msg.msg_control = ctrl; + algif_add_set_op(&msg, ALG_OP_ENCRYPT); + algif_add_set_iv(&msg, iv, sizeof(iv)); + + OSERROR(sendmsg(sock, &msg, MSG_MORE), "sock/sendmsg"); + + if (!use_sendfile) { + bool more = false; + + while (size) { + r = read(fd, buffer, sizeof(buffer)); + OSERROR(r, filename); + if (r == 0) + break; + size -= r; + + o = 0; + do { + more = size > 0; + w = send(sock, buffer + o, r - o, + more ? MSG_MORE : 0); + OSERROR(w, "sock/send"); + total += w; + o += w; + } while (o < r); + } + + if (more) + send(sock, NULL, 0, 0); + } else if (S_ISFIFO(st.st_mode)) { + do { + r = splice(fd, NULL, sock, NULL, size, + size > 0 ? SPLICE_F_MORE : 0); + OSERROR(r, "sock/splice"); + size -= r; + total += r; + } while (r > 0 && size > 0); + if (size && !all) { + fprintf(stderr, "Short splice\n"); + exit(1); + } + } else { + r = sendfile(sock, fd, NULL, size); + OSERROR(r, "sock/sendfile"); + if (r != size) { + fprintf(stderr, "Short sendfile\n"); + exit(1); + } + total = r; + } + + while (total > 0) { + ret = read(sock, buffer, min(sizeof(buffer), total)); + OSERROR(ret, "sock/read"); + if (ret == 0) + break; + total -= ret; + + if (out > 0) { + ret = min(out, ret); + out -= ret; + for (i = 0; i < ret; i++) + printf("%02x", (unsigned char)buffer[i]); + } + printf("...\n"); + } + + OSERROR(close(sock), "sock/close"); + OSERROR(close(alg), "alg/close"); + OSERROR(close(fd), "close"); + return 0; +} diff --git a/samples/net/alg-hash.c b/samples/net/alg-hash.c new file mode 100644 index 000000000000..842a8016acb3 --- /dev/null +++ b/samples/net/alg-hash.c @@ -0,0 +1,143 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* AF_ALG hash test + * + * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved. 
+ * Written by David Howells (dhowells@redhat.com) + */ + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0) + +static unsigned char buffer[4096 * 32] __attribute__((aligned(4096))); + +static const struct sockaddr_alg sa = { + .salg_family = AF_ALG, + .salg_type = "hash", + .salg_name = "sha1", +}; + +static __attribute__((noreturn)) +void format(void) +{ + fprintf(stderr, "alg-send [-s] [-z] |-\n"); + exit(2); +} + +int main(int argc, char *argv[]) +{ + struct stat st; + const char *filename; + ssize_t r, w, o, ret; + size_t size = LONG_MAX, i; + char *end; + int use_sendfile = 0; + int opt, alg, sock, fd = 0; + + while ((opt = getopt(argc, argv, "sz:")) != EOF) { + switch (opt) { + case 's': + use_sendfile = true; + break; + case 'z': + size = strtoul(optarg, &end, 0); + switch (*end) { + case 'K': + case 'k': + size *= 1024; + break; + case 'M': + case 'm': + size *= 1024 * 1024; + break; + } + break; + default: + format(); + } + } + + argc -= optind; + argv += optind; + if (argc != 1) + format(); + filename = argv[0]; + + alg = socket(AF_ALG, SOCK_SEQPACKET, 0); + OSERROR(alg, "AF_ALG"); + OSERROR(bind(alg, (struct sockaddr *)&sa, sizeof(sa)), "bind"); + sock = accept(alg, NULL, 0); + OSERROR(sock, "accept"); + + if (strcmp(filename, "-") != 0) { + fd = open(filename, O_RDONLY); + OSERROR(fd, filename); + OSERROR(fstat(fd, &st), filename); + size = st.st_size; + } else { + OSERROR(fstat(fd, &st), argv[2]); + } + + if (!use_sendfile) { + bool more = false; + + while (size) { + r = read(fd, buffer, sizeof(buffer)); + OSERROR(r, filename); + if (r == 0) + break; + size -= r; + + o = 0; + do { + more = size > 0; + w = send(sock, buffer + o, r - o, + more ? MSG_MORE : 0); + OSERROR(w, "sock/send"); + o += w; + } while (o < r); + } + + if (more) + send(sock, NULL, 0, 0); + } else if (S_ISFIFO(st.st_mode)) { + r = splice(fd, NULL, sock, NULL, size, 0); + OSERROR(r, "sock/splice"); + if (r != size) { + fprintf(stderr, "Short splice\n"); + exit(1); + } + } else { + r = sendfile(sock, fd, NULL, size); + OSERROR(r, "sock/sendfile"); + if (r != size) { + fprintf(stderr, "Short sendfile\n"); + exit(1); + } + } + + ret = read(sock, buffer, sizeof(buffer)); + OSERROR(ret, "sock/read"); + + for (i = 0; i < ret; i++) + printf("%02x", (unsigned char)buffer[i]); + printf("\n"); + + OSERROR(close(sock), "sock/close"); + OSERROR(close(alg), "alg/close"); + OSERROR(close(fd), "close"); + return 0; +} diff --git a/samples/net/splice-out.c b/samples/net/splice-out.c new file mode 100644 index 000000000000..07bc0d774779 --- /dev/null +++ b/samples/net/splice-out.c @@ -0,0 +1,142 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* Splice or sendfile from the given file/stdin to stdout. + * + * Format: splice-out [-s] |- [] + * + * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + */ + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0) +#define min(x, y) ((x) < (y) ? 
(x) : (y)) + +static unsigned char buffer[4096] __attribute__((aligned(4096))); + +static __attribute__((noreturn)) +void format(void) +{ + fprintf(stderr, "splice-out [-kN][-s][-wN] |- []\n"); + exit(2); +} + +int main(int argc, char *argv[]) +{ + const char *filename; + struct stat st; + ssize_t r; + size_t size = 1024 * 1024, skip = 0, unit = 0, part; + char *end; + bool use_sendfile = false, all = true; + int opt, fd = 0; + + while ((opt = getopt(argc, argv, "k:sw:")), + opt != -1) { + switch (opt) { + case 'k': + /* Skip size - prevent coalescence. */ + skip = strtoul(optarg, &end, 0); + if (skip < 1 || skip >= 4096) { + fprintf(stderr, "-kN must be 00\n"); + exit(2); + } + switch (*end) { + case 'K': + case 'k': + unit *= 1024; + break; + case 'M': + case 'm': + unit *= 1024 * 1024; + break; + } + break; + default: + format(); + } + } + + argc -= optind; + argv += optind; + + if (argc != 1 && argc != 2) + format(); + + filename = argv[0]; + if (argc == 2) { + size = strtoul(argv[1], &end, 0); + switch (*end) { + case 'K': + case 'k': + size *= 1024; + break; + case 'M': + case 'm': + size *= 1024 * 1024; + break; + } + all = false; + } + + OSERROR(fstat(1, &st), "stdout"); + if (!S_ISFIFO(st.st_mode)) { + fprintf(stderr, "stdout must be a pipe\n"); + exit(3); + } + + if (strcmp(filename, "-") != 0) { + fd = open(filename, O_RDONLY); + OSERROR(fd, filename); + OSERROR(fstat(fd, &st), filename); + if (!all && size > st.st_size) { + fprintf(stderr, "%s: Specified size larger than file\n", filename); + exit(3); + } + } + + do { + if (skip) { + part = skip; + do { + r = read(fd, buffer, skip); + OSERROR(r, filename); + part -= r; + } while (part > 0 && r > 0); + } + + part = unit ? min(size, unit) : size; + if (use_sendfile) { + r = sendfile(1, fd, NULL, part); + OSERROR(r, "sendfile"); + } else { + r = splice(fd, NULL, 1, NULL, part, 0); + OSERROR(r, "splice"); + } + if (!all) + size -= r; + } while (r > 0 && size > 0); + + OSERROR(close(fd), "close"); + return 0; +} diff --git a/samples/net/tcp-send.c b/samples/net/tcp-send.c new file mode 100644 index 000000000000..153105f6a30a --- /dev/null +++ b/samples/net/tcp-send.c @@ -0,0 +1,154 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * TCP send client. Pass -s to splice. + * + * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved. 
+ * Written by David Howells (dhowells@redhat.com) + */ + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0) + +static unsigned char buffer[4096] __attribute__((aligned(4096))); + +static __attribute__((noreturn)) +void format(void) +{ + fprintf(stderr, "tcp-send [-46s][-p][-z] |- \n"); + exit(2); +} + +int main(int argc, char *argv[]) +{ + struct addrinfo *addrs = NULL, hints = {}; + struct stat st; + const char *filename, *sockname, *service = "5555"; + ssize_t r, w, o; + size_t size = LONG_MAX; + char *end; + bool use_sendfile = false; + int opt, sock, fd = 0, gai; + + hints.ai_family = AF_UNSPEC; + hints.ai_socktype = SOCK_STREAM; + + while ((opt = getopt(argc, argv, "46p:sz:")) != EOF) { + switch (opt) { + case '4': + hints.ai_family = AF_INET; + break; + case '6': + hints.ai_family = AF_INET6; + break; + case 'p': + service = optarg; + break; + case 's': + use_sendfile = true; + break; + case 'z': + size = strtoul(optarg, &end, 0); + switch (*end) { + case 'K': + case 'k': + size *= 1024; + break; + case 'M': + case 'm': + size *= 1024 * 1024; + break; + } + break; + default: + format(); + } + } + + argc -= optind; + argv += optind; + if (argc != 2) + format(); + filename = argv[0]; + sockname = argv[1]; + + gai = getaddrinfo(sockname, service, &hints, &addrs); + if (gai) { + fprintf(stderr, "%s: %s\n", sockname, gai_strerror(gai)); + exit(3); + } + + if (!addrs) { + fprintf(stderr, "%s: No addresses\n", sockname); + exit(3); + } + + sockname = addrs->ai_canonname; + sock = socket(addrs->ai_family, addrs->ai_socktype, addrs->ai_protocol); + OSERROR(sock, "socket"); + OSERROR(connect(sock, addrs->ai_addr, addrs->ai_addrlen), "connect"); + + if (strcmp(filename, "-") != 0) { + fd = open(filename, O_RDONLY); + OSERROR(fd, filename); + OSERROR(fstat(fd, &st), filename); + if (size > st.st_size) + size = st.st_size; + } else { + OSERROR(fstat(fd, &st), filename); + } + + if (!use_sendfile) { + bool more = false; + + while (size) { + r = read(fd, buffer, sizeof(buffer)); + OSERROR(r, filename); + if (r == 0) + break; + size -= r; + + o = 0; + do { + more = size > 0; + w = send(sock, buffer + o, r - o, + more ? MSG_MORE : 0); + OSERROR(w, "sock/send"); + o += w; + } while (o < r); + } + + if (more) + send(sock, NULL, 0, 0); + } else if (S_ISFIFO(st.st_mode)) { + r = splice(fd, NULL, sock, NULL, size, 0); + OSERROR(r, "sock/splice"); + if (r != size) { + fprintf(stderr, "Short splice\n"); + exit(1); + } + } else { + r = sendfile(sock, fd, NULL, size); + OSERROR(r, "sock/sendfile"); + if (r != size) { + fprintf(stderr, "Short sendfile\n"); + exit(1); + } + } + + OSERROR(close(sock), "sock/close"); + OSERROR(close(fd), "close"); + return 0; +} diff --git a/samples/net/tcp-sink.c b/samples/net/tcp-sink.c new file mode 100644 index 000000000000..33d949d0e9aa --- /dev/null +++ b/samples/net/tcp-sink.c @@ -0,0 +1,76 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * TCP sink server + * + * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved. 
+ * Written by David Howells (dhowells@redhat.com) + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0) + +static unsigned char buffer[512 * 1024] __attribute__((aligned(4096))); + +static __attribute__((noreturn)) +void format(void) +{ + fprintf(stderr, "tcp-sink [-4][-p]\n"); + exit(2); +} + +int main(int argc, char *argv[]) +{ + unsigned int port = 5555; + bool ipv6 = true; + int opt, server_sock, sock; + + + while ((opt = getopt(argc, argv, "4p:")) != EOF) { + switch (opt) { + case '4': + ipv6 = false; + break; + case 'p': + port = atoi(optarg); + break; + default: + format(); + } + } + + if (!ipv6) { + struct sockaddr_in sin = { + .sin_family = AF_INET, + .sin_port = htons(port), + }; + server_sock = socket(AF_INET, SOCK_STREAM, 0); + OSERROR(server_sock, "socket"); + OSERROR(bind(server_sock, (struct sockaddr *)&sin, sizeof(sin)), "bind"); + OSERROR(listen(server_sock, 1), "listen"); + } else { + struct sockaddr_in6 sin6 = { + .sin6_family = AF_INET6, + .sin6_port = htons(port), + }; + server_sock = socket(AF_INET6, SOCK_STREAM, 0); + OSERROR(server_sock, "socket"); + OSERROR(bind(server_sock, (struct sockaddr *)&sin6, sizeof(sin6)), "bind"); + OSERROR(listen(server_sock, 1), "listen"); + } + + for (;;) { + sock = accept(server_sock, NULL, NULL); + if (sock != -1) { + while (read(sock, buffer, sizeof(buffer)) > 0) {} + close(sock); + } + } +} diff --git a/samples/net/tls-send.c b/samples/net/tls-send.c new file mode 100644 index 000000000000..b3b8a0a3b41f --- /dev/null +++ b/samples/net/tls-send.c @@ -0,0 +1,176 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * TLS-over-TCP send client. Pass -s to splice. + * + * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved. 
+ * Written by David Howells (dhowells@redhat.com) + */ + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0) + +static unsigned char buffer[4096] __attribute__((aligned(4096))); + +static __attribute__((noreturn)) +void format(void) +{ + fprintf(stderr, "tls-send [-46s][-p][-z] |- \n"); + exit(2); +} + +static void set_tls(int sock) +{ + struct tls12_crypto_info_aes_gcm_128 crypto_info; + + crypto_info.info.version = TLS_1_2_VERSION; + crypto_info.info.cipher_type = TLS_CIPHER_AES_GCM_128; + memset(crypto_info.iv, 0, TLS_CIPHER_AES_GCM_128_IV_SIZE); + memset(crypto_info.rec_seq, 0, TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE); + memset(crypto_info.key, 0, TLS_CIPHER_AES_GCM_128_KEY_SIZE); + memset(crypto_info.salt, 0, TLS_CIPHER_AES_GCM_128_SALT_SIZE); + + OSERROR(setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls")), + "TCP_ULP"); + OSERROR(setsockopt(sock, SOL_TLS, TLS_TX, &crypto_info, sizeof(crypto_info)), + "TLS_TX"); + OSERROR(setsockopt(sock, SOL_TLS, TLS_RX, &crypto_info, sizeof(crypto_info)), + "TLS_RX"); +} + +int main(int argc, char *argv[]) +{ + struct addrinfo *addrs = NULL, hints = {}; + struct stat st; + const char *filename, *sockname, *service = "5556"; + ssize_t r, w, o; + size_t size = LONG_MAX; + char *end; + bool use_sendfile = false; + int opt, sock, fd = 0, gai; + + hints.ai_family = AF_UNSPEC; + hints.ai_socktype = SOCK_STREAM; + + while ((opt = getopt(argc, argv, "46p:sz:")) != EOF) { + switch (opt) { + case '4': + hints.ai_family = AF_INET; + break; + case '6': + hints.ai_family = AF_INET6; + break; + case 'p': + service = optarg; + break; + case 's': + use_sendfile = true; + break; + case 'z': + size = strtoul(optarg, &end, 0); + switch (*end) { + case 'K': + case 'k': + size *= 1024; + break; + case 'M': + case 'm': + size *= 1024 * 1024; + break; + } + break; + default: + format(); + } + } + + argc -= optind; + argv += optind; + if (argc != 2) + format(); + filename = argv[0]; + sockname = argv[1]; + + gai = getaddrinfo(sockname, service, &hints, &addrs); + if (gai) { + fprintf(stderr, "%s: %s\n", sockname, gai_strerror(gai)); + exit(3); + } + + if (!addrs) { + fprintf(stderr, "%s: No addresses\n", sockname); + exit(3); + } + + sockname = addrs->ai_canonname; + sock = socket(addrs->ai_family, addrs->ai_socktype, addrs->ai_protocol); + OSERROR(sock, "socket"); + OSERROR(connect(sock, addrs->ai_addr, addrs->ai_addrlen), "connect"); + set_tls(sock); + + if (strcmp(filename, "-") != 0) { + fd = open(filename, O_RDONLY); + OSERROR(fd, filename); + OSERROR(fstat(fd, &st), filename); + if (size > st.st_size) + size = st.st_size; + } else { + OSERROR(fstat(fd, &st), filename); + } + + if (!use_sendfile) { + bool more = false; + + while (size) { + r = read(fd, buffer, sizeof(buffer)); + OSERROR(r, filename); + if (r == 0) + break; + size -= r; + + o = 0; + do { + more = size > 0; + w = send(sock, buffer + o, r - o, + more ? 
MSG_MORE : 0); + OSERROR(w, "sock/send"); + o += w; + } while (o < r); + } + + if (more) + send(sock, NULL, 0, 0); + } else if (S_ISFIFO(st.st_mode)) { + r = splice(fd, NULL, sock, NULL, size, 0); + OSERROR(r, "sock/splice"); + if (r != size) { + fprintf(stderr, "Short splice\n"); + exit(1); + } + } else { + r = sendfile(sock, fd, NULL, size); + OSERROR(r, "sock/sendfile"); + if (r != size) { + fprintf(stderr, "Short sendfile\n"); + exit(1); + } + } + + OSERROR(close(sock), "sock/close"); + OSERROR(close(fd), "close"); + return 0; +} diff --git a/samples/net/tls-sink.c b/samples/net/tls-sink.c new file mode 100644 index 000000000000..1d6d4ed07101 --- /dev/null +++ b/samples/net/tls-sink.c @@ -0,0 +1,98 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * TLS-over-TCP sink server + * + * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0) + +static unsigned char buffer[512 * 1024] __attribute__((aligned(4096))); + +static __attribute__((noreturn)) +void format(void) +{ + fprintf(stderr, "tls-sink [-4][-p]\n"); + exit(2); +} + +static void set_tls(int sock) +{ + struct tls12_crypto_info_aes_gcm_128 crypto_info; + + crypto_info.info.version = TLS_1_2_VERSION; + crypto_info.info.cipher_type = TLS_CIPHER_AES_GCM_128; + memset(crypto_info.iv, 0, TLS_CIPHER_AES_GCM_128_IV_SIZE); + memset(crypto_info.rec_seq, 0, TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE); + memset(crypto_info.key, 0, TLS_CIPHER_AES_GCM_128_KEY_SIZE); + memset(crypto_info.salt, 0, TLS_CIPHER_AES_GCM_128_SALT_SIZE); + + OSERROR(setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls")), + "TCP_ULP"); + OSERROR(setsockopt(sock, SOL_TLS, TLS_TX, &crypto_info, sizeof(crypto_info)), + "TLS_TX"); + OSERROR(setsockopt(sock, SOL_TLS, TLS_RX, &crypto_info, sizeof(crypto_info)), + "TLS_RX"); +} + +int main(int argc, char *argv[]) +{ + unsigned int port = 5556; + bool ipv6 = true; + int opt, server_sock, sock; + + + while ((opt = getopt(argc, argv, "4p:")) != EOF) { + switch (opt) { + case '4': + ipv6 = false; + break; + case 'p': + port = atoi(optarg); + break; + default: + format(); + } + } + + if (!ipv6) { + struct sockaddr_in sin = { + .sin_family = AF_INET, + .sin_port = htons(port), + }; + server_sock = socket(AF_INET, SOCK_STREAM, 0); + OSERROR(server_sock, "socket"); + OSERROR(bind(server_sock, (struct sockaddr *)&sin, sizeof(sin)), "bind"); + OSERROR(listen(server_sock, 1), "listen"); + } else { + struct sockaddr_in6 sin6 = { + .sin6_family = AF_INET6, + .sin6_port = htons(port), + }; + server_sock = socket(AF_INET6, SOCK_STREAM, 0); + OSERROR(server_sock, "socket"); + OSERROR(bind(server_sock, (struct sockaddr *)&sin6, sizeof(sin6)), "bind"); + OSERROR(listen(server_sock, 1), "listen"); + } + + for (;;) { + sock = accept(server_sock, NULL, NULL); + if (sock != -1) { + set_tls(sock); + while (read(sock, buffer, sizeof(buffer)) > 0) {} + close(sock); + } + } +} diff --git a/samples/net/udp-send.c b/samples/net/udp-send.c new file mode 100644 index 000000000000..31abd6b2d9fd --- /dev/null +++ b/samples/net/udp-send.c @@ -0,0 +1,151 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * UDP send client. Pass -s to splice. + * + * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved. 
+ * Written by David Howells (dhowells@redhat.com) + */ + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0) +#define min(x, y) ((x) < (y) ? (x) : (y)) + +static unsigned char buffer[65536] __attribute__((aligned(4096))); + +static __attribute__((noreturn)) +void format(void) +{ + fprintf(stderr, "udp-send [-46s][-p][-z] |- \n"); + exit(2); +} + +int main(int argc, char *argv[]) +{ + struct addrinfo *addrs = NULL, hints = {}; + struct stat st; + const char *filename, *sockname, *service = "5555"; + unsigned int len; + ssize_t r, o, size = 65535; + char *end; + bool use_sendfile = false; + int opt, sock, fd = 0, gai; + + hints.ai_family = AF_UNSPEC; + hints.ai_socktype = SOCK_DGRAM; + + while ((opt = getopt(argc, argv, "46p:sz:")) != EOF) { + switch (opt) { + case '4': + hints.ai_family = AF_INET; + break; + case '6': + hints.ai_family = AF_INET6; + break; + case 'p': + service = optarg; + break; + case 's': + use_sendfile = true; + break; + case 'z': + size = strtoul(optarg, &end, 0); + switch (*end) { + case 'K': + case 'k': + size *= 1024; + break; + } + if (size > 65535) { + fprintf(stderr, "Too much data for UDP packet\n"); + exit(2); + } + break; + default: + format(); + } + } + + argc -= optind; + argv += optind; + if (argc != 2) + format(); + filename = argv[0]; + sockname = argv[1]; + + gai = getaddrinfo(sockname, service, &hints, &addrs); + if (gai) { + fprintf(stderr, "%s: %s\n", sockname, gai_strerror(gai)); + exit(3); + } + + if (!addrs) { + fprintf(stderr, "%s: No addresses\n", sockname); + exit(3); + } + + sockname = addrs->ai_canonname; + sock = socket(addrs->ai_family, addrs->ai_socktype, addrs->ai_protocol); + OSERROR(sock, "socket"); + OSERROR(connect(sock, addrs->ai_addr, addrs->ai_addrlen), "connect"); + + if (strcmp(filename, "-") != 0) { + fd = open(filename, O_RDONLY); + OSERROR(fd, filename); + OSERROR(fstat(fd, &st), filename); + if (size > st.st_size) + size = st.st_size; + } else { + OSERROR(fstat(fd, &st), filename); + } + + len = htonl(size); + OSERROR(send(sock, &len, 4, MSG_MORE), "sock/send"); + + if (!use_sendfile) { + while (size) { + r = read(fd, buffer, sizeof(buffer)); + OSERROR(r, filename); + if (r == 0) + break; + size -= r; + + o = 0; + do { + ssize_t w = send(sock, buffer + o, r - o, + size > 0 ? MSG_MORE : 0); + OSERROR(w, "sock/send"); + o += w; + } while (o < r); + } + } else if (S_ISFIFO(st.st_mode)) { + r = splice(fd, NULL, sock, NULL, size, 0); + OSERROR(r, "sock/splice"); + if (r != size) { + fprintf(stderr, "Short splice\n"); + exit(1); + } + } else { + r = sendfile(sock, fd, NULL, size); + OSERROR(r, "sock/sendfile"); + if (r != size) { + fprintf(stderr, "Short sendfile\n"); + exit(1); + } + } + + OSERROR(close(sock), "sock/close"); + OSERROR(close(fd), "close"); + return 0; +} diff --git a/samples/net/udp-sink.c b/samples/net/udp-sink.c new file mode 100644 index 000000000000..b98f45b64296 --- /dev/null +++ b/samples/net/udp-sink.c @@ -0,0 +1,82 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * UDP sink server + * + * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved. 
+ * Written by David Howells (dhowells@redhat.com) + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0) + +static unsigned char buffer[512 * 1024] __attribute__((aligned(4096))); + +static __attribute__((noreturn)) +void format(void) +{ + fprintf(stderr, "udp-sink [-4][-p]\n"); + exit(2); +} + +int main(int argc, char *argv[]) +{ + struct iovec iov[1] = { + [0] = { + .iov_base = buffer, + .iov_len = sizeof(buffer), + }, + }; + struct msghdr msg = { + .msg_iov = iov, + .msg_iovlen = 1, + }; + unsigned int port = 5555; + bool ipv6 = true; + int opt, sock; + + while ((opt = getopt(argc, argv, "4p:")) != EOF) { + switch (opt) { + case '4': + ipv6 = false; + break; + case 'p': + port = atoi(optarg); + break; + default: + format(); + } + } + + if (!ipv6) { + struct sockaddr_in sin = { + .sin_family = AF_INET, + .sin_port = htons(port), + }; + sock = socket(AF_INET, SOCK_DGRAM, 0); + OSERROR(sock, "socket"); + OSERROR(bind(sock, (struct sockaddr *)&sin, sizeof(sin)), "bind"); + } else { + struct sockaddr_in6 sin6 = { + .sin6_family = AF_INET6, + .sin6_port = htons(port), + }; + sock = socket(AF_INET6, SOCK_DGRAM, 0); + OSERROR(sock, "socket"); + OSERROR(bind(sock, (struct sockaddr *)&sin6, sizeof(sin6)), "bind"); + } + + for (;;) { + ssize_t r; + + r = recvmsg(sock, &msg, 0); + printf("rx %zd\n", r); + } +} diff --git a/samples/net/unix-send.c b/samples/net/unix-send.c new file mode 100644 index 000000000000..88fae776985c --- /dev/null +++ b/samples/net/unix-send.c @@ -0,0 +1,147 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * AF_UNIX stream send client. Pass -s to splice. + * + * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + */ + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0) +#define min(x, y) ((x) < (y) ? 
(x) : (y)) + +static unsigned char buffer[4096] __attribute__((aligned(4096))); + +static __attribute__((noreturn)) +void format(void) +{ + fprintf(stderr, "unix-send [-s] [-z] |- \n"); + exit(2); +} + +int main(int argc, char *argv[]) +{ + struct sockaddr_un sun = { .sun_family = AF_UNIX, }; + struct stat st; + const char *filename, *sockname; + ssize_t r, w, o, size = LONG_MAX; + size_t plen, total = 0; + char *end; + bool use_sendfile = false, all = true; + int opt, sock, fd = 0; + + while ((opt = getopt(argc, argv, "sz:")) != EOF) { + switch (opt) { + case 's': + use_sendfile = true; + break; + case 'z': + size = strtoul(optarg, &end, 0); + switch (*end) { + case 'K': + case 'k': + size *= 1024; + break; + case 'M': + case 'm': + size *= 1024 * 1024; + break; + } + all = false; + break; + default: + format(); + } + } + + argc -= optind; + argv += optind; + if (argc != 2) + format(); + filename = argv[0]; + sockname = argv[1]; + + plen = strlen(sockname); + if (plen == 0 || plen > sizeof(sun.sun_path) - 1) { + fprintf(stderr, "socket filename too short or too long\n"); + exit(2); + } + memcpy(sun.sun_path, sockname, plen + 1); + + sock = socket(AF_UNIX, SOCK_STREAM, 0); + OSERROR(sock, "socket"); + OSERROR(connect(sock, (struct sockaddr *)&sun, sizeof(sun)), "connect"); + + if (strcmp(filename, "-") != 0) { + fd = open(filename, O_RDONLY); + OSERROR(fd, filename); + OSERROR(fstat(fd, &st), filename); + if (size > st.st_size) + size = st.st_size; + } else { + OSERROR(fstat(fd, &st), argv[2]); + } + + if (!use_sendfile) { + bool more = false; + + while (size) { + r = read(fd, buffer, min(sizeof(buffer), size)); + OSERROR(r, filename); + if (r == 0) + break; + size -= r; + + o = 0; + do { + more = size > 0; + w = send(sock, buffer + o, r - o, + more ? MSG_MORE : 0); + OSERROR(w, "sock/send"); + o += w; + total += w; + } while (o < r); + } + + if (more) + send(sock, NULL, 0, 0); + } else if (S_ISFIFO(st.st_mode)) { + do { + r = splice(fd, NULL, sock, NULL, size, + size > 0 ? SPLICE_F_MORE : 0); + OSERROR(r, "sock/splice"); + size -= r; + total += r; + } while (r > 0 && size > 0); + if (size && !all) { + fprintf(stderr, "Short splice\n"); + exit(1); + } + } else { + r = sendfile(sock, fd, NULL, size); + OSERROR(r, "sock/sendfile"); + if (r != size) { + fprintf(stderr, "Short sendfile\n"); + exit(1); + } + total += r; + } + + printf("Sent %zu bytes\n", total); + OSERROR(close(sock), "sock/close"); + OSERROR(close(fd), "close"); + return 0; +} diff --git a/samples/net/unix-sink.c b/samples/net/unix-sink.c new file mode 100644 index 000000000000..3c75979dc52a --- /dev/null +++ b/samples/net/unix-sink.c @@ -0,0 +1,51 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * UNIX stream sink server + * + * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved. 
+ * Written by David Howells (dhowells@redhat.com)
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0)
+
+static unsigned char buffer[512 * 1024] __attribute__((aligned(4096)));
+
+int main(int argc, char *argv[])
+{
+	struct sockaddr_un sun = { .sun_family = AF_UNIX, };
+	size_t plen;
+	int server_sock, sock;
+
+	if (argc != 2) {
+		fprintf(stderr, "unix-sink \n");
+		exit(2);
+	}
+
+	plen = strlen(argv[1]);
+	if (plen == 0 || plen > sizeof(sun.sun_path) - 1) {
+		fprintf(stderr, "socket filename too short or too long\n");
+		exit(2);
+	}
+	memcpy(sun.sun_path, argv[1], plen + 1);
+
+	server_sock = socket(AF_UNIX, SOCK_STREAM, 0);
+	OSERROR(server_sock, "socket");
+	OSERROR(bind(server_sock, (struct sockaddr *)&sun, sizeof(sun)), "bind");
+	OSERROR(listen(server_sock, 1), "listen");
+
+	for (;;) {
+		sock = accept(server_sock, NULL, NULL);
+		if (sock != -1) {
+			while (read(sock, buffer, sizeof(buffer)) > 0) {}
+			close(sock);
+		}
+	}
+}

From patchwork Wed Apr 5 16:53:21 2023
From: David Howells <dhowells@redhat.com>
To: netdev@vger.kernel.org
Cc: David Howells, "David S. Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni, Willem de Bruijn, Matthew Wilcox, Al Viro,
    Christoph Hellwig, Jens Axboe, Jeff Layton, Christian Brauner,
    Chuck Lever III, Linus Torvalds, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, Willem de Bruijn
Subject: [PATCH net-next v4 02/20] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag
Date: Wed, 5 Apr 2023 17:53:21 +0100
Message-Id: <20230405165339.3468808-3-dhowells@redhat.com>
In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com>
References: <20230405165339.3468808-1-dhowells@redhat.com>

Declare MSG_SPLICE_PAGES, an internal sendmsg() flag, that hints to a
network protocol that it should splice pages from the source iterator
rather than copying the data if it can.  This flag is added to a list
that is cleared by sendmsg syscalls on entry.

This is intended as a replacement for the ->sendpage() op, allowing a way
to splice in several multipage folios in one go.

Signed-off-by: David Howells
Reviewed-by: Willem de Bruijn
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: netdev@vger.kernel.org
Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- include/linux/socket.h | 3 +++ net/socket.c | 2 ++ 2 files changed, 5 insertions(+) diff --git a/include/linux/socket.h b/include/linux/socket.h index 13c3a237b9c9..bd1cc3238851 100644 --- a/include/linux/socket.h +++ b/include/linux/socket.h @@ -327,6 +327,7 @@ struct ucred { */ #define MSG_ZEROCOPY 0x4000000 /* Use user data in kernel path */ +#define MSG_SPLICE_PAGES 0x8000000 /* Splice the pages from the iterator in sendmsg() */ #define MSG_FASTOPEN 0x20000000 /* Send data in TCP SYN */ #define MSG_CMSG_CLOEXEC 0x40000000 /* Set close_on_exec for file descriptor received through @@ -337,6 +338,8 @@ struct ucred { #define MSG_CMSG_COMPAT 0 /* We never have 32 bit fixups */ #endif +/* Flags to be cleared on entry by sendmsg and sendmmsg syscalls */ +#define MSG_INTERNAL_SENDMSG_FLAGS (MSG_SPLICE_PAGES) /* Setsockoptions(2) level. Thanks to BSD these must match IPPROTO_xxx */ #define SOL_IP 0 diff --git a/net/socket.c b/net/socket.c index 73e493da4589..b3fd3f7f7e03 100644 --- a/net/socket.c +++ b/net/socket.c @@ -2136,6 +2136,7 @@ int __sys_sendto(int fd, void __user *buff, size_t len, unsigned int flags, msg.msg_name = (struct sockaddr *)&address; msg.msg_namelen = addr_len; } + flags &= ~MSG_INTERNAL_SENDMSG_FLAGS; if (sock->file->f_flags & O_NONBLOCK) flags |= MSG_DONTWAIT; msg.msg_flags = flags; @@ -2483,6 +2484,7 @@ static int ____sys_sendmsg(struct socket *sock, struct msghdr *msg_sys, } msg_sys->msg_flags = flags; + flags &= ~MSG_INTERNAL_SENDMSG_FLAGS; if (sock->file->f_flags & O_NONBLOCK) msg_sys->msg_flags |= MSG_DONTWAIT; /* From patchwork Wed Apr 5 16:53:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13202257 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 57D08C76188 for ; Wed, 5 Apr 2023 16:54:06 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E76616B007B; Wed, 5 Apr 2023 12:54:05 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E256A6B007D; Wed, 5 Apr 2023 12:54:05 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C9E796B007E; Wed, 5 Apr 2023 12:54:05 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id B98036B007B for ; Wed, 5 Apr 2023 12:54:05 -0400 (EDT) Received: from smtpin05.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 4CBB9AC5F6 for ; Wed, 5 Apr 2023 16:54:05 +0000 (UTC) X-FDA: 80647934850.05.0397F4B Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf17.hostedemail.com (Postfix) with ESMTP id 7ABEB4000B for ; Wed, 5 Apr 2023 16:54:02 +0000 (UTC) Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=Dla8QCvM; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf17.hostedemail.com: domain of dhowells@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=dhowells@redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; 
s=arc-20220608; t=1680713642; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=6CZTaviJSB+HI6SS+UVazISaMVPKFDisRAz1iEkYVHU=; b=Pkjnl3Q20VH/A18ST+w4jDpvAF+wFhnMLqgL5fSLcdAr3uXezvQYHlpukUVyYzNEkvW9n+ hrjKVsoQpWangOlVT/1iMli0SI/4yfP/D+7HgeuRd42nbVJ8Bq0q8iN93fwhdw0ZULd1ls Ttc3fRYJMkiOzzxtDUCfMv/utzEBD+o= ARC-Authentication-Results: i=1; imf17.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=Dla8QCvM; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf17.hostedemail.com: domain of dhowells@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=dhowells@redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1680713642; a=rsa-sha256; cv=none; b=0hgnHcu2a/uJfWG5sKMqrpCOVPSD2Ff5BYddYExjtjE7Ri9VGNsQY1HCDsdck1d1NUrhM1 78wkDOH1NfwrfcEmCaaTQfrChmNmtga3qNk9OR5slSEAU4gueiHTrTV0o/jEUk4NEAab3H 2ghEtIHy4o5BAz2l/6rZmQUB2Qo5Mmo= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680713641; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6CZTaviJSB+HI6SS+UVazISaMVPKFDisRAz1iEkYVHU=; b=Dla8QCvM8hpGX73m10zK5/Eaxfv81ugGF38q8WQW9UZ8vUB9XdluHhy0EIqqjOeE9jy2Y/ AbCuZCFsk7ZhBM91R95Zxjtxg60ufunfeB6hDKEcGum1ukrfVatFPRe3iSeZuAussB/DZr ZNbsJD9FZNHIDfpOQLF93PxcY0FzOB4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-347-uZn3wgunNsa9-TdfFJYYyQ-1; Wed, 05 Apr 2023 12:53:54 -0400 X-MC-Unique: uZn3wgunNsa9-TdfFJYYyQ-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 9442A8996E6; Wed, 5 Apr 2023 16:53:53 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 23187140EBF4; Wed, 5 Apr 2023 16:53:51 +0000 (UTC) From: David Howells To: netdev@vger.kernel.org Cc: David Howells , "David S. 
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , Matthew Wilcox , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Bernard Metzler , Tom Talpey , linux-rdma@vger.kernel.org Subject: [PATCH net-next v4 03/20] mm: Move the page fragment allocator from page_alloc.c into its own file Date: Wed, 5 Apr 2023 17:53:22 +0100 Message-Id: <20230405165339.3468808-4-dhowells@redhat.com> In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com> References: <20230405165339.3468808-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Rspamd-Queue-Id: 7ABEB4000B X-Rspamd-Server: rspam09 X-Rspam-User: X-Stat-Signature: 4j38xkr8ug1jnmcp76yj31kgnppnqefk X-HE-Tag: 1680713642-892057 X-HE-Meta: U2FsdGVkX194jdXrcC2FoVmxyKs7+BJfppqW9HiBu2OLnyQxqyZmVPQXO2kRGnLfd4sdMPthJS6JKBmSO/vXr7fQXafPGq/9X/FKJBGByh1Rb8l4Bv+wclLLXMbdiOF5sBwz8Gbpt4ei4TGZIP48exesmwbrvRGioneK4hUod7/B6Rw1iZW+4FfD8KWGorxYzh5o02EWpUk3zrkri/069Y3MKvfVm1b4yK+lRnWJcCHpT8cDKmKMQVt2j9ok2CxxGCcNZirXp9ktLVyBlIXedUTdpRGEAgS9XvbMItcqFaHVcZRHd0reCULytTpAGpJDKGuOfR5s7bKk50osf4ltkFdMVq/6rX37dKn4mEJ+NlDLmrTi/DgXOuRn93ldTqWItEqpd7GetJ8DuwaJY5RLK4ZDjUC9iNW/lOr3VBAgFvqo10pTaWyVunBy2n3fADiU3qjiI44mH40NvPaiDiYx7QswKLO1cFxq2BOL9mDgvOYs3wyi36dZz+E8oOud38ISgUyVRTWv3n0Fqn8NzQc+Gu7/RU5cRwt5OmmoRu1oqEAnVoxfkPTaaCfNYkFMPxD2rhz3c0nzffmdrNCEtynxS1RdkZ4GowFCOxj5vEAcCMgwWn8Su82lcORmBbaCWQD6YmLJpGP1Is2ycDKCdyGYO8OFv6gzfs6AmeC61KCmrDxAVXWgZHVOPAlfJ8+Ly6WxtEqnYIgveuqAapLYiLWHLSSQpDMwKaZvRXLwcFxLUnnhdPUfQ0wswRuwbtUw5nHbIbV3YmTS4jnKmCoEtchYlwRNp0Ir1bgT7xG+YaYjepOlzx9CXJh9SOwl4So93yVWcdjdPkoJe1E9e/wvAOF120PskOuuPTSiDlHFWrfO80TV+aFDeH6OWdYSHGqzi3bJuQtdFLeJLANMAxJTduTwVsu2n2GeSVT/++Jgum7zxP9s3k++KBKZ34eMAOZeOKDRfuUHX1PzjJyhzLN2u0K zfc7kC4g gTAb0EugEb5w8EPoYgb1H6ainsS7GnjLXMvvvPFG/8BHz3YCIRn8kvaQTRwGw4mT5tDunqE1STd5kdpFP8WaOZCo7Uz1Bs03XNPTQ7rtmMaPK3FcSjtYzIVwdLHJ3Gcs2wyRUkey/+Lj0cKa3e7sAUqN82pzRSawUITfyRU0J3PQq1bliBxi1aHLiuRzpSvji9G1bSQ4AfZk2MNZLNZk/HNgo437l4fQSihw56PA3dl6FJ10AmcwrciOPWwvaSyhhtT7ey/jX/RPFNYh9TCTKS7EEVAad+un+17x8UmEUFYyRfSvkrLwhQKqSlWu2c7FzOAUo9bjmuOL14d/MW1vTGgXvJPEUs77zCR3S/8sZSa9aTEFZnTEUWi1asKL6pNw4Jq3NTUAYA0nWIJ/Mto5lsvCOK9wnTc/6COhBAV8ezWxS9dIvaCRJ6T7QXAhbH4OjlEkPso8rYGdT56tNUdj7nh0nfO0Pe+611Z+zRmzCWwet3fU87FKg+D/0k02VrceZjak5my/BwVE6yu/FMcLBcNLKuEX4Mf3r3E2TB0OmTUOZ0hq9NGteuqTpB3jbO4hYPW5RLi8jNactzSH1k0LLaf8q7g== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Move the page fragment allocator from page_alloc.c into its own file preparatory to changing it. Signed-off-by: David Howells cc: Bernard Metzler cc: Tom Talpey cc: "David S. 
Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: linux-rdma@vger.kernel.org cc: netdev@vger.kernel.org --- mm/Makefile | 2 +- mm/page_alloc.c | 126 ----------------------------------------- mm/page_frag_alloc.c | 131 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 132 insertions(+), 127 deletions(-) create mode 100644 mm/page_frag_alloc.c diff --git a/mm/Makefile b/mm/Makefile index 8e105e5b3e29..4e6dc12b4cbd 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -52,7 +52,7 @@ obj-y := filemap.o mempool.o oom_kill.o fadvise.o \ readahead.o swap.o truncate.o vmscan.o shmem.o \ util.o mmzone.o vmstat.o backing-dev.o \ mm_init.o percpu.o slab_common.o \ - compaction.o \ + compaction.o page_frag_alloc.o \ interval_tree.o list_lru.o workingset.o \ debug.o gup.o mmap_lock.o $(mmu-y) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 7136c36c5d01..d751e750c14b 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5695,132 +5695,6 @@ void free_pages(unsigned long addr, unsigned int order) EXPORT_SYMBOL(free_pages); -/* - * Page Fragment: - * An arbitrary-length arbitrary-offset area of memory which resides - * within a 0 or higher order page. Multiple fragments within that page - * are individually refcounted, in the page's reference counter. - * - * The page_frag functions below provide a simple allocation framework for - * page fragments. This is used by the network stack and network device - * drivers to provide a backing region of memory for use as either an - * sk_buff->head, or to be used in the "frags" portion of skb_shared_info. - */ -static struct page *__page_frag_cache_refill(struct page_frag_cache *nc, - gfp_t gfp_mask) -{ - struct page *page = NULL; - gfp_t gfp = gfp_mask; - -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY | - __GFP_NOMEMALLOC; - page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, - PAGE_FRAG_CACHE_MAX_ORDER); - nc->size = page ? PAGE_FRAG_CACHE_MAX_SIZE : PAGE_SIZE; -#endif - if (unlikely(!page)) - page = alloc_pages_node(NUMA_NO_NODE, gfp, 0); - - nc->va = page ? page_address(page) : NULL; - - return page; -} - -void __page_frag_cache_drain(struct page *page, unsigned int count) -{ - VM_BUG_ON_PAGE(page_ref_count(page) == 0, page); - - if (page_ref_sub_and_test(page, count)) - free_the_page(page, compound_order(page)); -} -EXPORT_SYMBOL(__page_frag_cache_drain); - -void *page_frag_alloc_align(struct page_frag_cache *nc, - unsigned int fragsz, gfp_t gfp_mask, - unsigned int align_mask) -{ - unsigned int size = PAGE_SIZE; - struct page *page; - int offset; - - if (unlikely(!nc->va)) { -refill: - page = __page_frag_cache_refill(nc, gfp_mask); - if (!page) - return NULL; - -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - /* if size can vary use size else just use PAGE_SIZE */ - size = nc->size; -#endif - /* Even if we own the page, we do not use atomic_set(). - * This would break get_page_unless_zero() users. 
- */ - page_ref_add(page, PAGE_FRAG_CACHE_MAX_SIZE); - - /* reset page count bias and offset to start of new frag */ - nc->pfmemalloc = page_is_pfmemalloc(page); - nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; - nc->offset = size; - } - - offset = nc->offset - fragsz; - if (unlikely(offset < 0)) { - page = virt_to_page(nc->va); - - if (!page_ref_sub_and_test(page, nc->pagecnt_bias)) - goto refill; - - if (unlikely(nc->pfmemalloc)) { - free_the_page(page, compound_order(page)); - goto refill; - } - -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - /* if size can vary use size else just use PAGE_SIZE */ - size = nc->size; -#endif - /* OK, page count is 0, we can safely set it */ - set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1); - - /* reset page count bias and offset to start of new frag */ - nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; - offset = size - fragsz; - if (unlikely(offset < 0)) { - /* - * The caller is trying to allocate a fragment - * with fragsz > PAGE_SIZE but the cache isn't big - * enough to satisfy the request, this may - * happen in low memory conditions. - * We don't release the cache page because - * it could make memory pressure worse - * so we simply return NULL here. - */ - return NULL; - } - } - - nc->pagecnt_bias--; - offset &= align_mask; - nc->offset = offset; - - return nc->va + offset; -} -EXPORT_SYMBOL(page_frag_alloc_align); - -/* - * Frees a page fragment allocated out of either a compound or order 0 page. - */ -void page_frag_free(void *addr) -{ - struct page *page = virt_to_head_page(addr); - - if (unlikely(put_page_testzero(page))) - free_the_page(page, compound_order(page)); -} -EXPORT_SYMBOL(page_frag_free); - static void *make_alloc_exact(unsigned long addr, unsigned int order, size_t size) { diff --git a/mm/page_frag_alloc.c b/mm/page_frag_alloc.c new file mode 100644 index 000000000000..bee95824ef8f --- /dev/null +++ b/mm/page_frag_alloc.c @@ -0,0 +1,131 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Page fragment allocator + * + * Page Fragment: + * An arbitrary-length arbitrary-offset area of memory which resides within a + * 0 or higher order page. Multiple fragments within that page are + * individually refcounted, in the page's reference counter. + * + * The page_frag functions provide a simple allocation framework for page + * fragments. This is used by the network stack and network device drivers to + * provide a backing region of memory for use as either an sk_buff->head, or to + * be used in the "frags" portion of skb_shared_info. + */ + +#include +#include +#include + +static struct page *__page_frag_cache_refill(struct page_frag_cache *nc, + gfp_t gfp_mask) +{ + struct page *page = NULL; + gfp_t gfp = gfp_mask; + +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) + gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY | + __GFP_NOMEMALLOC; + page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, + PAGE_FRAG_CACHE_MAX_ORDER); + nc->size = page ? PAGE_FRAG_CACHE_MAX_SIZE : PAGE_SIZE; +#endif + if (unlikely(!page)) + page = alloc_pages_node(NUMA_NO_NODE, gfp, 0); + + nc->va = page ? 
page_address(page) : NULL; + + return page; +} + +void __page_frag_cache_drain(struct page *page, unsigned int count) +{ + VM_BUG_ON_PAGE(page_ref_count(page) == 0, page); + + if (page_ref_sub_and_test(page, count - 1)) + __free_pages(page, compound_order(page)); +} +EXPORT_SYMBOL(__page_frag_cache_drain); + +void *page_frag_alloc_align(struct page_frag_cache *nc, + unsigned int fragsz, gfp_t gfp_mask, + unsigned int align_mask) +{ + unsigned int size = PAGE_SIZE; + struct page *page; + int offset; + + if (unlikely(!nc->va)) { +refill: + page = __page_frag_cache_refill(nc, gfp_mask); + if (!page) + return NULL; + +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) + /* if size can vary use size else just use PAGE_SIZE */ + size = nc->size; +#endif + /* Even if we own the page, we do not use atomic_set(). + * This would break get_page_unless_zero() users. + */ + page_ref_add(page, PAGE_FRAG_CACHE_MAX_SIZE); + + /* reset page count bias and offset to start of new frag */ + nc->pfmemalloc = page_is_pfmemalloc(page); + nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; + nc->offset = size; + } + + offset = nc->offset - fragsz; + if (unlikely(offset < 0)) { + page = virt_to_page(nc->va); + + if (page_ref_count(page) != nc->pagecnt_bias) + goto refill; + if (unlikely(nc->pfmemalloc)) { + page_ref_sub(page, nc->pagecnt_bias - 1); + __free_pages(page, compound_order(page)); + goto refill; + } + +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) + /* if size can vary use size else just use PAGE_SIZE */ + size = nc->size; +#endif + /* OK, page count is 0, we can safely set it */ + set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1); + + /* reset page count bias and offset to start of new frag */ + nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; + offset = size - fragsz; + if (unlikely(offset < 0)) { + /* + * The caller is trying to allocate a fragment + * with fragsz > PAGE_SIZE but the cache isn't big + * enough to satisfy the request, this may + * happen in low memory conditions. + * We don't release the cache page because + * it could make memory pressure worse + * so we simply return NULL here. + */ + return NULL; + } + } + + nc->pagecnt_bias--; + offset &= align_mask; + nc->offset = offset; + + return nc->va + offset; +} +EXPORT_SYMBOL(page_frag_alloc_align); + +/* + * Frees a page fragment allocated out of either a compound or order 0 page. 
+ */ +void page_frag_free(void *addr) +{ + struct page *page = virt_to_head_page(addr); + + __free_pages(page, compound_order(page)); +} +EXPORT_SYMBOL(page_frag_free);
From patchwork Wed Apr 5 16:53:23 2023
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13202256
From: David Howells
To: netdev@vger.kernel.org
Subject: [PATCH net-next v4 04/20] mm: Make the page_frag_cache allocator use multipage folios
Date: Wed, 5 Apr 2023 17:53:23 +0100
Message-Id: <20230405165339.3468808-5-dhowells@redhat.com>
In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com>
References: <20230405165339.3468808-1-dhowells@redhat.com>

Change the page_frag_cache allocator to use multipage folios rather than
groups of pages.
This reduces page_frag_free to just a folio_put() or put_page(). Signed-off-by: David Howells cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- drivers/net/ethernet/mediatek/mtk_wed_wo.c | 15 +-- drivers/nvme/host/tcp.c | 7 +- drivers/nvme/target/tcp.c | 5 +- include/linux/gfp.h | 1 + include/linux/mm_types.h | 13 +-- mm/page_frag_alloc.c | 101 +++++++++++---------- 6 files changed, 63 insertions(+), 79 deletions(-) diff --git a/drivers/net/ethernet/mediatek/mtk_wed_wo.c b/drivers/net/ethernet/mediatek/mtk_wed_wo.c index 69fba29055e9..6ce532217777 100644 --- a/drivers/net/ethernet/mediatek/mtk_wed_wo.c +++ b/drivers/net/ethernet/mediatek/mtk_wed_wo.c @@ -286,7 +286,6 @@ mtk_wed_wo_queue_free(struct mtk_wed_wo *wo, struct mtk_wed_wo_queue *q) static void mtk_wed_wo_queue_tx_clean(struct mtk_wed_wo *wo, struct mtk_wed_wo_queue *q) { - struct page *page; int i; for (i = 0; i < q->n_desc; i++) { @@ -298,12 +297,7 @@ mtk_wed_wo_queue_tx_clean(struct mtk_wed_wo *wo, struct mtk_wed_wo_queue *q) entry->buf = NULL; } - if (!q->cache.va) - return; - - page = virt_to_page(q->cache.va); - __page_frag_cache_drain(page, q->cache.pagecnt_bias); - memset(&q->cache, 0, sizeof(q->cache)); + page_frag_cache_clear(&q->cache); } static void @@ -320,12 +314,7 @@ mtk_wed_wo_queue_rx_clean(struct mtk_wed_wo *wo, struct mtk_wed_wo_queue *q) skb_free_frag(buf); } - if (!q->cache.va) - return; - - page = virt_to_page(q->cache.va); - __page_frag_cache_drain(page, q->cache.pagecnt_bias); - memset(&q->cache, 0, sizeof(q->cache)); + page_frag_cache_clear(&q->cache); } static void diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 42c0598c31f2..76f12ac714b0 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -1323,12 +1323,7 @@ static void nvme_tcp_free_queue(struct nvme_ctrl *nctrl, int qid) if (queue->hdr_digest || queue->data_digest) nvme_tcp_free_crypto(queue); - if (queue->pf_cache.va) { - page = virt_to_head_page(queue->pf_cache.va); - __page_frag_cache_drain(page, queue->pf_cache.pagecnt_bias); - queue->pf_cache.va = NULL; - } - + page_frag_cache_clear(&queue->pf_cache); noreclaim_flag = memalloc_noreclaim_save(); sock_release(queue->sock); memalloc_noreclaim_restore(noreclaim_flag); diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index 66e8f9fd0ca7..ae871c31cf00 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -1438,7 +1438,6 @@ static void nvmet_tcp_free_cmd_data_in_buffers(struct nvmet_tcp_queue *queue) static void nvmet_tcp_release_queue_work(struct work_struct *w) { - struct page *page; struct nvmet_tcp_queue *queue = container_of(w, struct nvmet_tcp_queue, release_work); @@ -1460,9 +1459,7 @@ static void nvmet_tcp_release_queue_work(struct work_struct *w) if (queue->hdr_digest || queue->data_digest) nvmet_tcp_free_crypto(queue); ida_free(&nvmet_tcp_queue_ida, queue->idx); - - page = virt_to_head_page(queue->pf_cache.va); - __page_frag_cache_drain(page, queue->pf_cache.pagecnt_bias); + page_frag_cache_clear(&queue->pf_cache); kfree(queue); } diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 65a78773dcca..5e15384798eb 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -313,6 +313,7 @@ static inline void *page_frag_alloc(struct page_frag_cache *nc, { return page_frag_alloc_align(nc, fragsz, gfp_mask, ~0u); } +void page_frag_cache_clear(struct page_frag_cache *nc); extern void page_frag_free(void *addr); diff --git 
a/include/linux/mm_types.h b/include/linux/mm_types.h index 0722859c3647..49a70b3f44a9 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -420,18 +420,13 @@ static inline void *folio_get_private(struct folio *folio) } struct page_frag_cache { - void * va; -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - __u16 offset; - __u16 size; -#else - __u32 offset; -#endif + struct folio *folio; + unsigned int offset; /* we maintain a pagecount bias, so that we dont dirty cache line * containing page->_refcount every time we allocate a fragment. */ - unsigned int pagecnt_bias; - bool pfmemalloc; + unsigned int pagecnt_bias; + bool pfmemalloc; }; typedef unsigned long vm_flags_t; diff --git a/mm/page_frag_alloc.c b/mm/page_frag_alloc.c index bee95824ef8f..9b138cb0e3a4 100644 --- a/mm/page_frag_alloc.c +++ b/mm/page_frag_alloc.c @@ -16,88 +16,95 @@ #include #include -static struct page *__page_frag_cache_refill(struct page_frag_cache *nc, - gfp_t gfp_mask) +/* + * Allocate a new folio for the frag cache. + */ +static struct folio *page_frag_cache_refill(struct page_frag_cache *nc, + gfp_t gfp_mask) { - struct page *page = NULL; + struct folio *folio = NULL; gfp_t gfp = gfp_mask; #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY | - __GFP_NOMEMALLOC; - page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, - PAGE_FRAG_CACHE_MAX_ORDER); - nc->size = page ? PAGE_FRAG_CACHE_MAX_SIZE : PAGE_SIZE; + gfp_mask |= __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC; + folio = folio_alloc(gfp_mask, PAGE_FRAG_CACHE_MAX_ORDER); #endif - if (unlikely(!page)) - page = alloc_pages_node(NUMA_NO_NODE, gfp, 0); - - nc->va = page ? page_address(page) : NULL; + if (unlikely(!folio)) + folio = folio_alloc(gfp, 0); - return page; + if (folio) + nc->folio = folio; + return folio; } void __page_frag_cache_drain(struct page *page, unsigned int count) { - VM_BUG_ON_PAGE(page_ref_count(page) == 0, page); + struct folio *folio = page_folio(page); - if (page_ref_sub_and_test(page, count - 1)) - __free_pages(page, compound_order(page)); + VM_BUG_ON_FOLIO(folio_ref_count(folio) == 0, folio); + + folio_put_refs(folio, count); } EXPORT_SYMBOL(__page_frag_cache_drain); +void page_frag_cache_clear(struct page_frag_cache *nc) +{ + struct folio *folio = nc->folio; + + if (folio) { + VM_BUG_ON_FOLIO(folio_ref_count(folio) == 0, folio); + folio_put_refs(folio, nc->pagecnt_bias); + nc->folio = NULL; + } + +} +EXPORT_SYMBOL(page_frag_cache_clear); + void *page_frag_alloc_align(struct page_frag_cache *nc, unsigned int fragsz, gfp_t gfp_mask, unsigned int align_mask) { - unsigned int size = PAGE_SIZE; - struct page *page; - int offset; + struct folio *folio = nc->folio; + size_t offset; - if (unlikely(!nc->va)) { + if (unlikely(!folio)) { refill: - page = __page_frag_cache_refill(nc, gfp_mask); - if (!page) + folio = page_frag_cache_refill(nc, gfp_mask); + if (!folio) return NULL; -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - /* if size can vary use size else just use PAGE_SIZE */ - size = nc->size; -#endif /* Even if we own the page, we do not use atomic_set(). * This would break get_page_unless_zero() users. 
*/ - page_ref_add(page, PAGE_FRAG_CACHE_MAX_SIZE); + folio_ref_add(folio, PAGE_FRAG_CACHE_MAX_SIZE); /* reset page count bias and offset to start of new frag */ - nc->pfmemalloc = page_is_pfmemalloc(page); + nc->pfmemalloc = folio_is_pfmemalloc(folio); nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; - nc->offset = size; + nc->offset = folio_size(folio); } - offset = nc->offset - fragsz; - if (unlikely(offset < 0)) { - page = virt_to_page(nc->va); - - if (page_ref_count(page) != nc->pagecnt_bias) + offset = nc->offset; + if (unlikely(fragsz > offset)) { + /* Reuse the folio if everyone we gave it to has finished with it. */ + if (!folio_ref_sub_and_test(folio, nc->pagecnt_bias)) { + nc->folio = NULL; goto refill; + } + if (unlikely(nc->pfmemalloc)) { - page_ref_sub(page, nc->pagecnt_bias - 1); - __free_pages(page, compound_order(page)); + __folio_put(folio); + nc->folio = NULL; goto refill; } -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - /* if size can vary use size else just use PAGE_SIZE */ - size = nc->size; -#endif /* OK, page count is 0, we can safely set it */ - set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1); + folio_set_count(folio, PAGE_FRAG_CACHE_MAX_SIZE + 1); /* reset page count bias and offset to start of new frag */ nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; - offset = size - fragsz; - if (unlikely(offset < 0)) { + offset = folio_size(folio); + if (unlikely(fragsz > offset)) { /* * The caller is trying to allocate a fragment * with fragsz > PAGE_SIZE but the cache isn't big @@ -107,15 +114,17 @@ void *page_frag_alloc_align(struct page_frag_cache *nc, * it could make memory pressure worse * so we simply return NULL here. */ + nc->offset = offset; return NULL; } } nc->pagecnt_bias--; + offset -= fragsz; offset &= align_mask; nc->offset = offset; - return nc->va + offset; + return folio_address(folio) + offset; } EXPORT_SYMBOL(page_frag_alloc_align); @@ -124,8 +133,6 @@ EXPORT_SYMBOL(page_frag_alloc_align); */ void page_frag_free(void *addr) { - struct page *page = virt_to_head_page(addr); - - __free_pages(page, compound_order(page)); + folio_put(virt_to_folio(addr)); } EXPORT_SYMBOL(page_frag_free); From patchwork Wed Apr 5 16:53:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13202258 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 712E3C7619A for ; Wed, 5 Apr 2023 16:54:07 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id ACC666B007D; Wed, 5 Apr 2023 12:54:06 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A56456B007E; Wed, 5 Apr 2023 12:54:06 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7E4346B0080; Wed, 5 Apr 2023 12:54:06 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 68D4A6B007D for ; Wed, 5 Apr 2023 12:54:06 -0400 (EDT) Received: from smtpin25.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 0F22841155 for ; Wed, 5 Apr 2023 16:54:06 +0000 (UTC) X-FDA: 80647934892.25.719CDE0 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf05.hostedemail.com 
(Postfix) with ESMTP id 3DB21100010; Wed, 5 Apr 2023 16:54:04 +0000 (UTC)
From: David Howells
To: netdev@vger.kernel.org
Subject: [PATCH net-next v4 05/20] mm: Make the page_frag_cache allocator use per-cpu
Date: Wed, 5 Apr 2023 17:53:24 +0100
Message-Id: <20230405165339.3468808-6-dhowells@redhat.com>
In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com>
References: <20230405165339.3468808-1-dhowells@redhat.com>

Make the page_frag_cache allocator have a separate allocation bucket for
each cpu to avoid racing.  This means that no lock is required, other than
preempt disablement, to allocate from it, though if a softirq wants to
access it, then softirq disablement will need to be added.

Make the NVMe and mediatek drivers pass in NULL to page_frag_cache() and
use the default allocation buckets rather than defining their own.
Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Lorenzo Bianconi cc: Felix Fietkau cc: John Crispin cc: Sean Wang cc: Mark Lee cc: Keith Busch cc: Christoph Hellwig cc: Sagi Grimberg cc: Chaitanya Kulkarni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org cc: linux-nvme@lists.infradead.org cc: linux-mediatek@lists.infradead.org --- drivers/net/ethernet/mediatek/mtk_wed_wo.c | 8 +- drivers/net/ethernet/mediatek/mtk_wed_wo.h | 2 - drivers/nvme/host/tcp.c | 14 +- drivers/nvme/target/tcp.c | 19 +- include/linux/gfp.h | 18 +- mm/page_frag_alloc.c | 195 ++++++++++++++------- net/core/skbuff.c | 32 ++-- 7 files changed, 164 insertions(+), 124 deletions(-) diff --git a/drivers/net/ethernet/mediatek/mtk_wed_wo.c b/drivers/net/ethernet/mediatek/mtk_wed_wo.c index 6ce532217777..859f34447f2f 100644 --- a/drivers/net/ethernet/mediatek/mtk_wed_wo.c +++ b/drivers/net/ethernet/mediatek/mtk_wed_wo.c @@ -143,7 +143,7 @@ mtk_wed_wo_queue_refill(struct mtk_wed_wo *wo, struct mtk_wed_wo_queue *q, dma_addr_t addr; void *buf; - buf = page_frag_alloc(&q->cache, q->buf_size, GFP_ATOMIC); + buf = page_frag_alloc(NULL, q->buf_size, GFP_ATOMIC); if (!buf) break; @@ -296,15 +296,11 @@ mtk_wed_wo_queue_tx_clean(struct mtk_wed_wo *wo, struct mtk_wed_wo_queue *q) skb_free_frag(entry->buf); entry->buf = NULL; } - - page_frag_cache_clear(&q->cache); } static void mtk_wed_wo_queue_rx_clean(struct mtk_wed_wo *wo, struct mtk_wed_wo_queue *q) { - struct page *page; - for (;;) { void *buf = mtk_wed_wo_dequeue(wo, q, NULL, true); @@ -313,8 +309,6 @@ mtk_wed_wo_queue_rx_clean(struct mtk_wed_wo *wo, struct mtk_wed_wo_queue *q) skb_free_frag(buf); } - - page_frag_cache_clear(&q->cache); } static void diff --git a/drivers/net/ethernet/mediatek/mtk_wed_wo.h b/drivers/net/ethernet/mediatek/mtk_wed_wo.h index dbcf42ce9173..6f940db67fb8 100644 --- a/drivers/net/ethernet/mediatek/mtk_wed_wo.h +++ b/drivers/net/ethernet/mediatek/mtk_wed_wo.h @@ -210,8 +210,6 @@ struct mtk_wed_wo_queue_entry { struct mtk_wed_wo_queue { struct mtk_wed_wo_queue_regs regs; - struct page_frag_cache cache; - struct mtk_wed_wo_queue_desc *desc; dma_addr_t desc_dma; diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 76f12ac714b0..5a92236db92a 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -147,8 +147,6 @@ struct nvme_tcp_queue { __le32 exp_ddgst; __le32 recv_ddgst; - struct page_frag_cache pf_cache; - void (*state_change)(struct sock *); void (*data_ready)(struct sock *); void (*write_space)(struct sock *); @@ -482,9 +480,8 @@ static int nvme_tcp_init_request(struct blk_mq_tag_set *set, struct nvme_tcp_queue *queue = &ctrl->queues[queue_idx]; u8 hdgst = nvme_tcp_hdgst_len(queue); - req->pdu = page_frag_alloc(&queue->pf_cache, - sizeof(struct nvme_tcp_cmd_pdu) + hdgst, - GFP_KERNEL | __GFP_ZERO); + req->pdu = page_frag_alloc(NULL, sizeof(struct nvme_tcp_cmd_pdu) + hdgst, + GFP_KERNEL | __GFP_ZERO); if (!req->pdu) return -ENOMEM; @@ -1300,9 +1297,8 @@ static int nvme_tcp_alloc_async_req(struct nvme_tcp_ctrl *ctrl) struct nvme_tcp_request *async = &ctrl->async_req; u8 hdgst = nvme_tcp_hdgst_len(queue); - async->pdu = page_frag_alloc(&queue->pf_cache, - sizeof(struct nvme_tcp_cmd_pdu) + hdgst, - GFP_KERNEL | __GFP_ZERO); + async->pdu = page_frag_alloc(NULL, sizeof(struct nvme_tcp_cmd_pdu) + hdgst, + GFP_KERNEL | __GFP_ZERO); if (!async->pdu) return -ENOMEM; @@ -1312,7 +1308,6 @@ static int nvme_tcp_alloc_async_req(struct nvme_tcp_ctrl *ctrl) static void nvme_tcp_free_queue(struct nvme_ctrl *nctrl, 
int qid) { - struct page *page; struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl); struct nvme_tcp_queue *queue = &ctrl->queues[qid]; unsigned int noreclaim_flag; @@ -1323,7 +1318,6 @@ static void nvme_tcp_free_queue(struct nvme_ctrl *nctrl, int qid) if (queue->hdr_digest || queue->data_digest) nvme_tcp_free_crypto(queue); - page_frag_cache_clear(&queue->pf_cache); noreclaim_flag = memalloc_noreclaim_save(); sock_release(queue->sock); memalloc_noreclaim_restore(noreclaim_flag); diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index ae871c31cf00..d6cc557cc539 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -143,8 +143,6 @@ struct nvmet_tcp_queue { struct nvmet_tcp_cmd connect; - struct page_frag_cache pf_cache; - void (*data_ready)(struct sock *); void (*state_change)(struct sock *); void (*write_space)(struct sock *); @@ -1312,25 +1310,25 @@ static int nvmet_tcp_alloc_cmd(struct nvmet_tcp_queue *queue, c->queue = queue; c->req.port = queue->port->nport; - c->cmd_pdu = page_frag_alloc(&queue->pf_cache, - sizeof(*c->cmd_pdu) + hdgst, GFP_KERNEL | __GFP_ZERO); + c->cmd_pdu = page_frag_alloc(NULL, sizeof(*c->cmd_pdu) + hdgst, + GFP_KERNEL | __GFP_ZERO); if (!c->cmd_pdu) return -ENOMEM; c->req.cmd = &c->cmd_pdu->cmd; - c->rsp_pdu = page_frag_alloc(&queue->pf_cache, - sizeof(*c->rsp_pdu) + hdgst, GFP_KERNEL | __GFP_ZERO); + c->rsp_pdu = page_frag_alloc(NULL, sizeof(*c->rsp_pdu) + hdgst, + GFP_KERNEL | __GFP_ZERO); if (!c->rsp_pdu) goto out_free_cmd; c->req.cqe = &c->rsp_pdu->cqe; - c->data_pdu = page_frag_alloc(&queue->pf_cache, - sizeof(*c->data_pdu) + hdgst, GFP_KERNEL | __GFP_ZERO); + c->data_pdu = page_frag_alloc(NULL, sizeof(*c->data_pdu) + hdgst, + GFP_KERNEL | __GFP_ZERO); if (!c->data_pdu) goto out_free_rsp; - c->r2t_pdu = page_frag_alloc(&queue->pf_cache, - sizeof(*c->r2t_pdu) + hdgst, GFP_KERNEL | __GFP_ZERO); + c->r2t_pdu = page_frag_alloc(NULL, sizeof(*c->r2t_pdu) + hdgst, + GFP_KERNEL | __GFP_ZERO); if (!c->r2t_pdu) goto out_free_data; @@ -1459,7 +1457,6 @@ static void nvmet_tcp_release_queue_work(struct work_struct *w) if (queue->hdr_digest || queue->data_digest) nvmet_tcp_free_crypto(queue); ida_free(&nvmet_tcp_queue_ida, queue->idx); - page_frag_cache_clear(&queue->pf_cache); kfree(queue); } diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 5e15384798eb..b208ca315882 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -304,16 +304,18 @@ extern void free_pages(unsigned long addr, unsigned int order); struct page_frag_cache; extern void __page_frag_cache_drain(struct page *page, unsigned int count); -extern void *page_frag_alloc_align(struct page_frag_cache *nc, - unsigned int fragsz, gfp_t gfp_mask, - unsigned int align_mask); - -static inline void *page_frag_alloc(struct page_frag_cache *nc, - unsigned int fragsz, gfp_t gfp_mask) +extern void *page_frag_alloc_align(struct page_frag_cache __percpu *frag_cache, + size_t fragsz, gfp_t gfp, + unsigned long align_mask); +extern void *page_frag_memdup(struct page_frag_cache __percpu *frag_cache, + const void *p, size_t fragsz, gfp_t gfp, + unsigned long align_mask); + +static inline void *page_frag_alloc(struct page_frag_cache __percpu *frag_cache, + size_t fragsz, gfp_t gfp) { - return page_frag_alloc_align(nc, fragsz, gfp_mask, ~0u); + return page_frag_alloc_align(frag_cache, fragsz, gfp, ULONG_MAX); } -void page_frag_cache_clear(struct page_frag_cache *nc); extern void page_frag_free(void *addr); diff --git a/mm/page_frag_alloc.c b/mm/page_frag_alloc.c index 
9b138cb0e3a4..7844398afe26 100644 --- a/mm/page_frag_alloc.c +++ b/mm/page_frag_alloc.c @@ -16,25 +16,23 @@ #include #include +static DEFINE_PER_CPU(struct page_frag_cache, page_frag_default_allocator); + /* * Allocate a new folio for the frag cache. */ -static struct folio *page_frag_cache_refill(struct page_frag_cache *nc, - gfp_t gfp_mask) +static struct folio *page_frag_cache_refill(gfp_t gfp) { - struct folio *folio = NULL; - gfp_t gfp = gfp_mask; + struct folio *folio; #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - gfp_mask |= __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC; - folio = folio_alloc(gfp_mask, PAGE_FRAG_CACHE_MAX_ORDER); + folio = folio_alloc(gfp | __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC, + PAGE_FRAG_CACHE_MAX_ORDER); + if (folio) + return folio; #endif - if (unlikely(!folio)) - folio = folio_alloc(gfp, 0); - if (folio) - nc->folio = folio; - return folio; + return folio_alloc(gfp, 0); } void __page_frag_cache_drain(struct page *page, unsigned int count) @@ -47,54 +45,68 @@ void __page_frag_cache_drain(struct page *page, unsigned int count) } EXPORT_SYMBOL(__page_frag_cache_drain); -void page_frag_cache_clear(struct page_frag_cache *nc) -{ - struct folio *folio = nc->folio; - - if (folio) { - VM_BUG_ON_FOLIO(folio_ref_count(folio) == 0, folio); - folio_put_refs(folio, nc->pagecnt_bias); - nc->folio = NULL; - } - -} -EXPORT_SYMBOL(page_frag_cache_clear); - -void *page_frag_alloc_align(struct page_frag_cache *nc, - unsigned int fragsz, gfp_t gfp_mask, - unsigned int align_mask) +/** + * page_frag_alloc_align - Allocate some memory for use in zerocopy + * @frag_cache: The frag cache to use (or NULL for the default) + * @fragsz: The size of the fragment desired + * @gfp: Allocation flags under which to make an allocation + * @align_mask: The required alignment + * + * Allocate some memory for use with zerocopy where protocol bits have to be + * mixed in with spliced/zerocopied data. Unlike memory allocated from the + * slab, this memory's lifetime is purely dependent on the folio's refcount. + * + * The way it works is that a folio is allocated and fragments are broken off + * sequentially and returned to the caller with a ref until the folio no longer + * has enough spare space - at which point the allocator's ref is dropped and a + * new folio is allocated. The folio remains in existence until the last ref + * held by, say, an sk_buff is discarded and then the page is returned to the + * page allocator. + * + * Returns a pointer to the memory on success and -ENOMEM on allocation + * failure. + * + * The allocated memory should be disposed of with folio_put(). + */ +void *page_frag_alloc_align(struct page_frag_cache __percpu *frag_cache, + size_t fragsz, gfp_t gfp, unsigned long align_mask) { - struct folio *folio = nc->folio; + struct page_frag_cache *nc; + struct folio *folio, *spare = NULL; size_t offset; + void *p; - if (unlikely(!folio)) { -refill: - folio = page_frag_cache_refill(nc, gfp_mask); - if (!folio) - return NULL; + if (!frag_cache) + frag_cache = &page_frag_default_allocator; + if (WARN_ON_ONCE(fragsz == 0)) + fragsz = 1; + align_mask &= ~3UL; - /* Even if we own the page, we do not use atomic_set(). - * This would break get_page_unless_zero() users. 
- */ - folio_ref_add(folio, PAGE_FRAG_CACHE_MAX_SIZE); - - /* reset page count bias and offset to start of new frag */ - nc->pfmemalloc = folio_is_pfmemalloc(folio); - nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; - nc->offset = folio_size(folio); + nc = get_cpu_ptr(frag_cache); +reload: + folio = nc->folio; + offset = nc->offset; +try_again: + + /* Make the allocation if there's sufficient space. */ + if (fragsz <= offset) { + nc->pagecnt_bias--; + offset = (offset - fragsz) & align_mask; + nc->offset = offset; + p = folio_address(folio) + offset; + put_cpu_ptr(frag_cache); + if (spare) + folio_put(spare); + return p; } - offset = nc->offset; - if (unlikely(fragsz > offset)) { - /* Reuse the folio if everyone we gave it to has finished with it. */ - if (!folio_ref_sub_and_test(folio, nc->pagecnt_bias)) { - nc->folio = NULL; + /* Insufficient space - see if we can refurbish the current folio. */ + if (folio) { + if (!folio_ref_sub_and_test(folio, nc->pagecnt_bias)) goto refill; - } if (unlikely(nc->pfmemalloc)) { __folio_put(folio); - nc->folio = NULL; goto refill; } @@ -104,27 +116,56 @@ void *page_frag_alloc_align(struct page_frag_cache *nc, /* reset page count bias and offset to start of new frag */ nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; offset = folio_size(folio); - if (unlikely(fragsz > offset)) { - /* - * The caller is trying to allocate a fragment - * with fragsz > PAGE_SIZE but the cache isn't big - * enough to satisfy the request, this may - * happen in low memory conditions. - * We don't release the cache page because - * it could make memory pressure worse - * so we simply return NULL here. - */ - nc->offset = offset; + if (unlikely(fragsz > offset)) + goto frag_too_big; + goto try_again; + } + +refill: + if (!spare) { + nc->folio = NULL; + put_cpu_ptr(frag_cache); + + spare = page_frag_cache_refill(gfp); + if (!spare) return NULL; - } + + nc = get_cpu_ptr(frag_cache); + /* We may now be on a different cpu and/or someone else may + * have refilled it + */ + nc->pfmemalloc = folio_is_pfmemalloc(spare); + if (nc->folio) + goto reload; } - nc->pagecnt_bias--; - offset -= fragsz; - offset &= align_mask; + nc->folio = spare; + folio = spare; + spare = NULL; + + /* Even if we own the page, we do not use atomic_set(). This would + * break get_page_unless_zero() users. + */ + folio_ref_add(folio, PAGE_FRAG_CACHE_MAX_SIZE); + + /* Reset page count bias and offset to start of new frag */ + nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; + offset = folio_size(folio); + goto try_again; + +frag_too_big: + /* + * The caller is trying to allocate a fragment with fragsz > PAGE_SIZE + * but the cache isn't big enough to satisfy the request, this may + * happen in low memory conditions. We don't release the cache page + * because it could make memory pressure worse so we simply return NULL + * here. + */ nc->offset = offset; - - return folio_address(folio) + offset; + put_cpu_ptr(frag_cache); + if (spare) + folio_put(spare); + return NULL; } EXPORT_SYMBOL(page_frag_alloc_align); @@ -136,3 +177,25 @@ void page_frag_free(void *addr) folio_put(virt_to_folio(addr)); } EXPORT_SYMBOL(page_frag_free); + +/** + * page_frag_memdup - Allocate a page fragment and duplicate some data into it + * @frag_cache: The frag cache to use (or NULL for the default) + * @fragsz: The amount of memory to copy (maximum 1/2 page). 
+ * @p: The source data to copy + * @gfp: Allocation flags under which to make an allocation + * @align_mask: The required alignment + */ +void *page_frag_memdup(struct page_frag_cache __percpu *frag_cache, + const void *p, size_t fragsz, gfp_t gfp, + unsigned long align_mask) +{ + void *q; + + q = page_frag_alloc_align(frag_cache, fragsz, gfp, align_mask); + if (!q) + return q; + + return memcpy(q, p, fragsz); +} +EXPORT_SYMBOL(page_frag_memdup); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 050a875d09c5..3d05ed64b606 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -222,13 +222,13 @@ static void *page_frag_alloc_1k(struct page_frag_1k *nc, gfp_t gfp_mask) #endif struct napi_alloc_cache { - struct page_frag_cache page; struct page_frag_1k page_small; unsigned int skb_count; void *skb_cache[NAPI_SKB_CACHE_SIZE]; }; static DEFINE_PER_CPU(struct page_frag_cache, netdev_alloc_cache); +static DEFINE_PER_CPU(struct page_frag_cache, napi_frag_cache); static DEFINE_PER_CPU(struct napi_alloc_cache, napi_alloc_cache); /* Double check that napi_get_frags() allocates skbs with @@ -250,11 +250,9 @@ void napi_get_frags_check(struct napi_struct *napi) void *__napi_alloc_frag_align(unsigned int fragsz, unsigned int align_mask) { - struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); - fragsz = SKB_DATA_ALIGN(fragsz); - return page_frag_alloc_align(&nc->page, fragsz, GFP_ATOMIC, align_mask); + return page_frag_alloc_align(&napi_frag_cache, fragsz, GFP_ATOMIC, align_mask); } EXPORT_SYMBOL(__napi_alloc_frag_align); @@ -264,15 +262,12 @@ void *__netdev_alloc_frag_align(unsigned int fragsz, unsigned int align_mask) fragsz = SKB_DATA_ALIGN(fragsz); if (in_hardirq() || irqs_disabled()) { - struct page_frag_cache *nc = this_cpu_ptr(&netdev_alloc_cache); - - data = page_frag_alloc_align(nc, fragsz, GFP_ATOMIC, align_mask); + data = page_frag_alloc_align(&netdev_alloc_cache, + fragsz, GFP_ATOMIC, align_mask); } else { - struct napi_alloc_cache *nc; - local_bh_disable(); - nc = this_cpu_ptr(&napi_alloc_cache); - data = page_frag_alloc_align(&nc->page, fragsz, GFP_ATOMIC, align_mask); + data = page_frag_alloc_align(&napi_frag_cache, + fragsz, GFP_ATOMIC, align_mask); local_bh_enable(); } return data; @@ -652,7 +647,6 @@ EXPORT_SYMBOL(__alloc_skb); struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int len, gfp_t gfp_mask) { - struct page_frag_cache *nc; struct sk_buff *skb; bool pfmemalloc; void *data; @@ -677,14 +671,12 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int len, gfp_mask |= __GFP_MEMALLOC; if (in_hardirq() || irqs_disabled()) { - nc = this_cpu_ptr(&netdev_alloc_cache); - data = page_frag_alloc(nc, len, gfp_mask); - pfmemalloc = nc->pfmemalloc; + data = page_frag_alloc(&netdev_alloc_cache, len, gfp_mask); + pfmemalloc = folio_is_pfmemalloc(virt_to_folio(data)); } else { local_bh_disable(); - nc = this_cpu_ptr(&napi_alloc_cache.page); - data = page_frag_alloc(nc, len, gfp_mask); - pfmemalloc = nc->pfmemalloc; + data = page_frag_alloc(&napi_frag_cache, len, gfp_mask); + pfmemalloc = folio_is_pfmemalloc(virt_to_folio(data)); local_bh_enable(); } @@ -772,8 +764,8 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len, } else { len = SKB_HEAD_ALIGN(len); - data = page_frag_alloc(&nc->page, len, gfp_mask); - pfmemalloc = nc->page.pfmemalloc; + data = page_frag_alloc(&napi_frag_cache, len, gfp_mask); + pfmemalloc = folio_is_pfmemalloc(virt_to_folio(data)); } if (unlikely(!data)) From patchwork Wed Apr 5 16:53:25 2023 
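The page_frag_memdup() helper added above combines such an allocation with
a copy; a hypothetical use (the hdr/local_hdr variables and GFP flags are
invented for the example) might look like this:

	/* Copy a locally built header into a page fragment whose lifetime is
	 * then governed purely by the folio refcount, e.g. so that it can be
	 * attached to an sk_buff as a zerocopy fragment.
	 */
	hdr = page_frag_memdup(NULL, &local_hdr, sizeof(local_hdr),
			       GFP_KERNEL, ULONG_MAX);
	if (!hdr)
		return -ENOMEM;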
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13202259
From: David Howells
To: netdev@vger.kernel.org
Subject: [PATCH net-next v4 06/20] tcp: Support MSG_SPLICE_PAGES
Date: Wed, 5 Apr 2023 17:53:25 +0100
Message-Id: <20230405165339.3468808-7-dhowells@redhat.com>
In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com>
References: <20230405165339.3468808-1-dhowells@redhat.com>

Make TCP's sendmsg() support MSG_SPLICE_PAGES.  This causes pages to be
spliced from the source iterator.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.
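As an illustration of the calling convention this enables (a hypothetical
in-kernel sketch, not part of this patch; the function name and parameters
are invented for the example), a kernel service holding a reference on a
page can splice it into a TCP socket like this:

	#include <linux/bvec.h>
	#include <linux/net.h>
	#include <linux/socket.h>
	#include <linux/uio.h>

	static int splice_page_to_tcp(struct socket *sock, struct page *page,
				      unsigned int offset, unsigned int len)
	{
		struct bio_vec bvec;
		struct msghdr msg = { .msg_flags = MSG_SPLICE_PAGES };

		/* Point the iterator at the caller's page; TCP then keeps the
		 * page via the skb frags instead of copying the data.
		 */
		bvec_set_page(&bvec, page, len, offset);
		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, len);
		return sock_sendmsg(sock, &msg);
	}

Signed-off-by: David Howells cc: Eric Dumazet cc: "David S.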
Miller" cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/ipv4/tcp.c | 67 ++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 60 insertions(+), 7 deletions(-) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index fd68d49490f2..510bacc7ce7b 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1221,7 +1221,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) int flags, err, copied = 0; int mss_now = 0, size_goal, copied_syn = 0; int process_backlog = 0; - bool zc = false; + int zc = 0; long timeo; flags = msg->msg_flags; @@ -1232,17 +1232,22 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) if (msg->msg_ubuf) { uarg = msg->msg_ubuf; net_zcopy_get(uarg); - zc = sk->sk_route_caps & NETIF_F_SG; + if (sk->sk_route_caps & NETIF_F_SG) + zc = 1; } else if (sock_flag(sk, SOCK_ZEROCOPY)) { uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb)); if (!uarg) { err = -ENOBUFS; goto out_err; } - zc = sk->sk_route_caps & NETIF_F_SG; - if (!zc) + if (sk->sk_route_caps & NETIF_F_SG) + zc = 1; + else uarg_to_msgzc(uarg)->zerocopy = 0; } + } else if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES) && size) { + if (sk->sk_route_caps & NETIF_F_SG) + zc = 2; } if (unlikely(flags & MSG_FASTOPEN || inet_sk(sk)->defer_connect) && @@ -1305,7 +1310,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) goto do_error; while (msg_data_left(msg)) { - int copy = 0; + ssize_t copy = 0; skb = tcp_write_queue_tail(sk); if (skb) @@ -1346,7 +1351,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) if (copy > msg_data_left(msg)) copy = msg_data_left(msg); - if (!zc) { + if (zc == 0) { bool merge = true; int i = skb_shinfo(skb)->nr_frags; struct page_frag *pfrag = sk_page_frag(sk); @@ -1391,7 +1396,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) page_ref_inc(pfrag->page); } pfrag->offset += copy; - } else { + } else if (zc == 1) { /* First append to a fragless skb builds initial * pure zerocopy skb */ @@ -1412,6 +1417,54 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) if (err < 0) goto do_error; copy = err; + } else if (zc == 2) { + /* Splice in data. 
 */
+            struct page *page = NULL, **pages = &page;
+            size_t off = 0, part;
+            bool can_coalesce;
+            int i = skb_shinfo(skb)->nr_frags;
+
+            copy = iov_iter_extract_pages(&msg->msg_iter, &pages,
+                                          copy, 1, 0, &off);
+            if (copy <= 0) {
+                err = copy ?: -EIO;
+                goto do_error;
+            }
+
+            can_coalesce = skb_can_coalesce(skb, i, page, off);
+            if (!can_coalesce && i >= READ_ONCE(sysctl_max_skb_frags)) {
+                tcp_mark_push(tp, skb);
+                iov_iter_revert(&msg->msg_iter, copy);
+                goto new_segment;
+            }
+            if (tcp_downgrade_zcopy_pure(sk, skb)) {
+                iov_iter_revert(&msg->msg_iter, copy);
+                goto wait_for_space;
+            }
+
+            part = tcp_wmem_schedule(sk, copy);
+            iov_iter_revert(&msg->msg_iter, copy - part);
+            if (!part)
+                goto wait_for_space;
+            copy = part;
+
+            if (can_coalesce) {
+                skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
+            } else {
+                get_page(page);
+                skb_fill_page_desc_noacc(skb, i, page, off, copy);
+            }
+            page = NULL;
+
+            if (!(flags & MSG_NO_SHARED_FRAGS))
+                skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG;
+
+            skb->len += copy;
+            skb->data_len += copy;
+            skb->truesize += copy;
+            sk_wmem_queued_add(sk, copy);
+            sk_mem_charge(sk, copy);
+        }
 
         if (!copied)
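As an illustration of the interface added above, here is a minimal sketch (not
part of the patch) of how an in-kernel caller might splice one page it owns
into a TCP socket.  The helper name splice_page_to_tcp() is hypothetical;
bvec_set_page(), iov_iter_bvec() and tcp_sendmsg_locked() are the same calls
the later patches in this series use, and the page is assumed to be one that
sendpage_ok() would accept and that stays referenced until transmission.

/* Hypothetical illustration only: splice a single page into a TCP socket
 * using the MSG_SPLICE_PAGES support added above.
 */
static int splice_page_to_tcp(struct sock *sk, struct page *page,
                              size_t offset, size_t len, bool more)
{
        struct bio_vec bvec;
        struct msghdr msg = { .msg_flags = MSG_SPLICE_PAGES };
        int ret;

        if (more)
                msg.msg_flags |= MSG_MORE;

        /* Describe the page fragment and point the iterator at it. */
        bvec_set_page(&bvec, page, len, offset);
        iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, len);

        lock_sock(sk);
        ret = tcp_sendmsg_locked(sk, &msg, len);  /* bytes queued or -ve error */
        release_sock(sk);
        return ret;
}

MSG_SPLICE_PAGES is an internal flag, so a sketch like this only makes sense
for kernel callers that keep the pages alive for the duration of the send.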
From patchwork Wed Apr 5 16:53:26 2023
From: David Howells
To: netdev@vger.kernel.org
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , Matthew Wilcox , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH net-next v4 07/20] tcp: Make sendmsg(MSG_SPLICE_PAGES) copy unspliceable data Date: Wed, 5 Apr 2023 17:53:26 +0100 Message-Id: <20230405165339.3468808-8-dhowells@redhat.com> In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com> References: <20230405165339.3468808-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Stat-Signature: zpodqrgcm6skz8di8ihdeafwmd5stx7o X-Rspam-User: X-Rspamd-Queue-Id: 7A1E1C0017 X-Rspamd-Server: rspam06 X-HE-Tag: 1680713654-864008 X-HE-Meta: U2FsdGVkX1/mzx8YLWTX+u4sOB3gvJx5ZUrS3lotJsPz3oiQoR6KCvp9MCDh92vPtyq1CRd0FaQ4DaScRqTRYB5lVbD753gJLNj9T3SCAkhA841XbyROdy8a2MIhk/Dew9CX9d5VpfsMWYy/lN/gPXTGva6Z8qzR+AcUcMGFxQFIfvn6whJ5l87E+q7WT6i/z/TLm5yzFhqvTHUzOjTGnXZBFgND0TRxagC8m8yR9xNJFr5lJks+heAuvDFuSwOIxyC45+FzADVOLZqXMuqKUc7vOWHjgOvlHhqtmh3jmTgZWs+S6cz+8Q5AGDloRF1G4GymK813YMWIqcTqpjYAiVkq+Ink2c7zVaD0l7ZymqlPYtDauf1m2mZJidX1Yjztm8/zJ/xXkEWUQQWqxdA+npgIoCLjdC4gimHvRMETbGr8mygR5z/p3C/RAOdXEaKqS/aV+PZTyrvlzV+ID+x6l1uFAjRbfVpAisA3kb6g/RGxKqWY/2n427G76tiFgnyoAUOXiC+l9dPuhxvPvGidszmIlikkms+D8Kx19a4fB2i6/2VxG6ytXIxAnM0Zp2o7Paz3TdMsYR7hS9jdvMfzSKBH3RSVf2uvcpqfgFYqYCVOGJQtvDNKRydVMPCH133OeiwXs6l7T2CG16cMRmPgmno/2LRRu/iZcpLd3Z4LSTfNNZmv3j66fQIi16sS3w5nsZXU0cUTVvqgivsz03Sap26uuhs+uxiWihn3LUknK5qaU8Uvu/QNwA9gF/00yWfBu0jrsbGNX91eZNnurd/AcgbBnout4UwcbtA2IvEJ4JlquL0jCtBoExFyIcykA+scynwQoBWpnaGVJQOk79NUf04frifO7eKa2yrg60h86kD/HNMe7lSPg7WBS6wcsmW//WICA0Yp+14INdX5S0PhCTduN4rq7AXCXJZ+6gzCqswgFuAYj6OAWgTXkMIhLXlTFnQmsRNniIn1ZGTPU3i 0/QiX1b2 MjWQb5qrpuvMSXMlrzdbGg5uXxy0/jo/V4vI/4zIWu7exJab4COSxeQvdFhK+8+q5GwU7AUrNKBMIutKV/gdGraXKKEAr/B3FCz2fwcuvPYfW5nrPe28t8OGpBO/pBVDRwLO/RcDSJ7K3sCkokvBrarSw8pJqSzMkybvRQpU48P/g4UF6CFTtjnjA0SLR9+cpUH2Liv96A6ltinEJry5NLNzd2KJ/5J6uVEuR9dQvmHehh7F67HTzk9FvO365qm7Gzhgp1/4EatmgV/7Ql2Bw3dOzTSA4LVIMNrzOI7N4UGnw/09NtMZIKIpDH6+Qjiedj1SxJ4zZG5iqUQG+GQJ/HTuKqxsy6zagmbsg4Jfoz1A4gveDLXf6OZqdDfSqCzJGQw5UuavMBU6HSkiULfo2tjxtd2bDUQf2ka1Umz/7RByMv5eeiDub/J9DjbZJfbjkmjhhXt6szqwX7tPnMlfua7SJk/GgFPG+W+WuThTAFG3pdA9lqugbJbR8VRyeaZwBcMAh X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: If sendmsg() with MSG_SPLICE_PAGES encounters a page that shouldn't be spliced - a slab page, for instance, or one with a zero count - make tcp_sendmsg() copy it. Signed-off-by: David Howells cc: Eric Dumazet cc: "David S. Miller" cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/ipv4/tcp.c | 28 +++++++++++++++++++++++++--- 1 file changed, 25 insertions(+), 3 deletions(-) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 510bacc7ce7b..238a8ad6527c 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1418,10 +1418,10 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) goto do_error; copy = err; } else if (zc == 2) { - /* Splice in data. */ + /* Splice in data if we can; copy if we can't. 
             struct page *page = NULL, **pages = &page;
             size_t off = 0, part;
-            bool can_coalesce;
+            bool can_coalesce, put = false;
             int i = skb_shinfo(skb)->nr_frags;
 
             copy = iov_iter_extract_pages(&msg->msg_iter, &pages,
@@ -1448,12 +1448,34 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
                 goto wait_for_space;
             copy = part;
 
+            if (!sendpage_ok(page)) {
+                const void *p = kmap_local_page(page);
+                void *q;
+
+                q = page_frag_memdup(NULL, p + off, copy,
+                                     sk->sk_allocation, ULONG_MAX);
+                kunmap_local(p);
+                if (!q) {
+                    iov_iter_revert(&msg->msg_iter, copy);
+                    err = copy ?: -ENOMEM;
+                    goto do_error;
+                }
+                page = virt_to_page(q);
+                off = offset_in_page(q);
+                put = true;
+                can_coalesce = false;
+            }
+
             if (can_coalesce) {
                 skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
             } else {
-                get_page(page);
+                if (!put)
+                    get_page(page);
+                put = false;
                 skb_fill_page_desc_noacc(skb, i, page, off, copy);
             }
+            if (put)
+                put_page(page);
             page = NULL;
 
             if (!(flags & MSG_NO_SHARED_FRAGS))
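The copy fallback above can be summarised by the following sketch
(illustrative only, ignoring the frag-coalescing and memory-accounting details
of the real code): a page that fails sendpage_ok() - a slab allocation or a
page with a zero refcount - has its bytes duplicated via page_frag_memdup(),
as used by the patch above, and the duplicate is attached instead.  The helper
name attach_spliced_page() is hypothetical.

/* Illustrative sketch: attach one extracted page to an skb, copying it
 * first if it cannot safely be pinned.  Caller updates skb length and
 * accounting; coalescing with an existing frag is not shown.
 */
static ssize_t attach_spliced_page(struct sock *sk, struct sk_buff *skb,
                                   struct page *page, size_t off, size_t len)
{
        if (!sendpage_ok(page)) {
                /* Slab page or zero refcount: duplicate the bytes. */
                const void *src = kmap_local_page(page);
                void *copy = page_frag_memdup(NULL, src + off, len,
                                              sk->sk_allocation, ULONG_MAX);

                kunmap_local(src);
                if (!copy)
                        return -ENOMEM;
                /* The new fragment already carries the reference we hand over. */
                page = virt_to_page(copy);
                off = offset_in_page(copy);
        } else {
                get_page(page);         /* take a ref for the frag we add */
        }

        skb_fill_page_desc_noacc(skb, skb_shinfo(skb)->nr_frags, page, off, len);
        return len;
}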
From patchwork Wed Apr 5 16:53:27 2023
From: David Howells
To: netdev@vger.kernel.org
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , Matthew Wilcox , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH net-next v4 08/20] tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES Date: Wed, 5 Apr 2023 17:53:27 +0100 Message-Id: <20230405165339.3468808-9-dhowells@redhat.com> In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com> References: <20230405165339.3468808-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Rspam-User: X-Rspamd-Server: rspam03 X-Stat-Signature: hizfxehmp6cor9jrfi4bne65cghmfacy X-Rspamd-Queue-Id: B9A6214002A X-HE-Tag: 1680713656-305095 X-HE-Meta: U2FsdGVkX1/zYPsZsrlxWNpFcFuqzg6JieXiL2WXA6oELVqB+DLHXDVoad3KQr1i4ZwqZlRu2ruDDYjonT7H8chFRDNuqQCH1jJMZz4t5USTOPSpMqY4qUwFjLfxMvcOvmEQmIf57aUktOPPcNCyTUzn7LlTfPkN6FoAYO0caG2mSb9ncSt2HOB8agALGT1u9IdCCJ0KHXcNnrg0ZEIYUmkFo9ebbNtc/MenDI22XAnrtMAqjBz76Vk9TEdIr26v7PvvMvJ0/rXN9T83kb26NAKWDNVcts1hx6GMAL4kc1BuBaTRAzoChMxaJhPSbpnvOMSmoIh5HoWF1itiyIV6ts98yMPCfQWSQUKetNvyTxN/O3JPHs+Xh4rfIhUe88HSfvj60Ixt5nNGJ5UMkxDGDlFDzGWw9z8jhNHteqAJ4ErDV7tFGDl6tS/gpBIFlNslau6Tfiv51vidJ+HF0yNP6YAQaP8eQ08itD+ShdmFHcxmJB5F4R/NCMII5lCNWynqyldi4CzIteB45aJg4CVFRU3eaHMOpHkRTEmPlxlpxiF0uVzb/T/uqGrVxT4eE0m81Ns3ns2qQOA3awrlLRgdLKor3eEBhqqsMft/remLmjnvfeRjxk1Vj/ZX+8Yoqv0UmGOWtO0DZFwgaXBk4/mG+xkXkMtiv2mJiqOL20NFz+sDVC4vUn3HHOZ8xa8ClH0RFDP0ZqnIzrX2+roiWiAhfbLsTPm37Wymp6curfyFihALNokGovvBooAn/3BM0CWo5wAGKu/oi++iXKv5mgL52M1tVvpn3I7eIQX8doGUmBYWkcE4HiVA05rnbWX3u38Kn0V/qh9XiD5zk9LX+dXnZLrgpr+K0fJko7xfjqo3mY5p8wW8K1thM4ZjgD0J/wxoqULuOJ03RH3Q+uCV5I0fvRrrv53NkUWoK43HKxinw/X8vmmxXFpLIlFOKgXJsz+ZYXjWlwSrAjFBYqLfhVW +jX4aC/w AAqSdJ47RQtzQGBfuuthYILfbr1D9uU3AGYMOydkjNZt3LvaSLdbfZzfk+7cKbFb4SrN5dEg8H18IV4DHkyhjLSsA8ID5N2aLk4578osVdTLf6JgWJlPipbepLybUBWniKjoSKrXOtUO+/Pnx43lNLuEVK7G52KBszaqZFMIlXhTxhPfQ5vMQjOSvcFrgYzrnrl8EJNFNyB71OIiDsUM7WY9Y9rC/yGVfff7XporY7mLP4hzeakRdCGmsoQxyqEZ2TsrIWjrqvcymionDkO9UPp2s280Y622MoudJfpV1X312kEc+Wb3fvjTTj9S7A3P/0cFoEBMEf6lD7Hr0rqDsRtKeaNZ3sQdu1Uc5v+lGfRH36QzWL2OZqM67Ysl+hL9pYNL+aCFKuhfm4jTgyidfQQ95jWrFXAGCJPBexkKjmtLjLf85NosDVLEqmUoe2bemCvVjovIIX3D17fPMZ7EEiT1jTw82hXVvr4I+4e2gR4oiGxB1jbbpvP7j7ItqqFIrVr3r X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Convert do_tcp_sendpages() to use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the pages itself. do_tcp_sendpages() can then be inlined in subsequent patches into its callers. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Eric Dumazet cc: "David S. 
Miller" cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/ipv4/tcp.c | 158 +++---------------------------------------------- 1 file changed, 7 insertions(+), 151 deletions(-) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 238a8ad6527c..a8a4ace8b3da 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -972,163 +972,19 @@ static int tcp_wmem_schedule(struct sock *sk, int copy) return min(copy, sk->sk_forward_alloc); } -static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags, - struct page *page, int offset, size_t *size) -{ - struct sk_buff *skb = tcp_write_queue_tail(sk); - struct tcp_sock *tp = tcp_sk(sk); - bool can_coalesce; - int copy, i; - - if (!skb || (copy = size_goal - skb->len) <= 0 || - !tcp_skb_can_collapse_to(skb)) { -new_segment: - if (!sk_stream_memory_free(sk)) - return NULL; - - skb = tcp_stream_alloc_skb(sk, 0, sk->sk_allocation, - tcp_rtx_and_write_queues_empty(sk)); - if (!skb) - return NULL; - -#ifdef CONFIG_TLS_DEVICE - skb->decrypted = !!(flags & MSG_SENDPAGE_DECRYPTED); -#endif - tcp_skb_entail(sk, skb); - copy = size_goal; - } - - if (copy > *size) - copy = *size; - - i = skb_shinfo(skb)->nr_frags; - can_coalesce = skb_can_coalesce(skb, i, page, offset); - if (!can_coalesce && i >= READ_ONCE(sysctl_max_skb_frags)) { - tcp_mark_push(tp, skb); - goto new_segment; - } - if (tcp_downgrade_zcopy_pure(sk, skb)) - return NULL; - - copy = tcp_wmem_schedule(sk, copy); - if (!copy) - return NULL; - - if (can_coalesce) { - skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy); - } else { - get_page(page); - skb_fill_page_desc_noacc(skb, i, page, offset, copy); - } - - if (!(flags & MSG_NO_SHARED_FRAGS)) - skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG; - - skb->len += copy; - skb->data_len += copy; - skb->truesize += copy; - sk_wmem_queued_add(sk, copy); - sk_mem_charge(sk, copy); - WRITE_ONCE(tp->write_seq, tp->write_seq + copy); - TCP_SKB_CB(skb)->end_seq += copy; - tcp_skb_pcount_set(skb, 0); - - *size = copy; - return skb; -} - ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset, size_t size, int flags) { - struct tcp_sock *tp = tcp_sk(sk); - int mss_now, size_goal; - int err; - ssize_t copied; - long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT); - - if (IS_ENABLED(CONFIG_DEBUG_VM) && - WARN_ONCE(!sendpage_ok(page), - "page must not be a Slab one and have page_count > 0")) - return -EINVAL; - - /* Wait for a connection to finish. One exception is TCP Fast Open - * (passive side) where data is allowed to be sent before a connection - * is fully established. 
- */ - if (((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) && - !tcp_passive_fastopen(sk)) { - err = sk_stream_wait_connect(sk, &timeo); - if (err != 0) - goto out_err; - } + struct bio_vec bvec; + struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; - sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk); + bvec_set_page(&bvec, page, size, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - mss_now = tcp_send_mss(sk, &size_goal, flags); - copied = 0; + if (flags & MSG_SENDPAGE_NOTLAST) + msg.msg_flags |= MSG_MORE; - err = -EPIPE; - if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)) - goto out_err; - - while (size > 0) { - struct sk_buff *skb; - size_t copy = size; - - skb = tcp_build_frag(sk, size_goal, flags, page, offset, ©); - if (!skb) - goto wait_for_space; - - if (!copied) - TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_PSH; - - copied += copy; - offset += copy; - size -= copy; - if (!size) - goto out; - - if (skb->len < size_goal || (flags & MSG_OOB)) - continue; - - if (forced_push(tp)) { - tcp_mark_push(tp, skb); - __tcp_push_pending_frames(sk, mss_now, TCP_NAGLE_PUSH); - } else if (skb == tcp_send_head(sk)) - tcp_push_one(sk, mss_now); - continue; - -wait_for_space: - set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); - tcp_push(sk, flags & ~MSG_MORE, mss_now, - TCP_NAGLE_PUSH, size_goal); - - err = sk_stream_wait_memory(sk, &timeo); - if (err != 0) - goto do_error; - - mss_now = tcp_send_mss(sk, &size_goal, flags); - } - -out: - if (copied) { - tcp_tx_timestamp(sk, sk->sk_tsflags); - if (!(flags & MSG_SENDPAGE_NOTLAST)) - tcp_push(sk, flags, mss_now, tp->nonagle, size_goal); - } - return copied; - -do_error: - tcp_remove_empty_skb(sk); - if (copied) - goto out; -out_err: - /* make sure we wake any epoll edge trigger waiter */ - if (unlikely(tcp_rtx_and_write_queues_empty(sk) && err == -EAGAIN)) { - sk->sk_write_space(sk); - tcp_chrono_stop(sk, TCP_CHRONO_SNDBUF_LIMITED); - } - return sk_stream_error(sk, flags, err); + return tcp_sendmsg_locked(sk, &msg, size); } EXPORT_SYMBOL_GPL(do_tcp_sendpages); From patchwork Wed Apr 5 16:53:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13202261 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2FB96C76188 for ; Wed, 5 Apr 2023 16:54:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B681E6B0081; Wed, 5 Apr 2023 12:54:18 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B3E126B0080; Wed, 5 Apr 2023 12:54:18 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 91C266B0082; Wed, 5 Apr 2023 12:54:18 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 7E0C26B0080 for ; Wed, 5 Apr 2023 12:54:18 -0400 (EDT) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 4FB3580316 for ; Wed, 5 Apr 2023 16:54:18 +0000 (UTC) X-FDA: 80647935396.15.B69EE38 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf27.hostedemail.com (Postfix) with ESMTP id 78DEF4000B for ; Wed, 5 Apr 2023 16:54:16 +0000 (UTC) 
From: David Howells
To: netdev@vger.kernel.org
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , Matthew Wilcox , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, John Fastabend , Jakub Sitnicki , bpf@vger.kernel.org Subject: [PATCH net-next v4 09/20] tcp_bpf: Inline do_tcp_sendpages as it's now a wrapper around tcp_sendmsg Date: Wed, 5 Apr 2023 17:53:28 +0100 Message-Id: <20230405165339.3468808-10-dhowells@redhat.com> In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com> References: <20230405165339.3468808-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Rspamd-Queue-Id: 78DEF4000B X-Stat-Signature: mah9ypciw7ksxako6q3mj875y4rdnncw X-Rspam-User: X-Rspamd-Server: rspam08 X-HE-Tag: 1680713656-326000 X-HE-Meta: U2FsdGVkX1+SM9zNHckQBb00hM0AtC1gtonpN/i3KEZTX0KAii7n9ZGTRD0+Mc7mVJiermkOAY9qR9jofC0j7hpqE4j5Ec1MzDppIy0W76sMbalwtYvLbN/mo4jaSD8V2OfDenGIXEw3GYEEzB3s399KG4QKiv8TmOrKbivxiWZf8WZIN2jro+iPnjiaDx7ESdJyPL5qRstkWRkMb6lwlXqxHuKZb4GnCqJHDvyvD+KII7Yaz41akr5WGdJBZdZGexbJG8eoLeOH+uBDKMfl0/yh4Lb6iZ1OiEyohW27oaxMEljP/iyvNXh6VfvKf/lXBgHjI76MGwd6h7zxOHMckltuSrQws8nQObDQRTI0PJ24YC+mNHwxz8avFst7JFxUNzyFSpFSDE1mUEr3IxvV6x0waCtzZS4YMaWlT8xIezP+wo74ENhS0yh3+wC7Pe/9uXxieV1MeTxBL8tLeBc9odRE/W2EV8r+EsmA/8qwUVxIYcpUyRa45qPnCsWUEuSfFc1AdId20KokS3DrymvlE9HGSxCLgVx+5nL8j6UGHTOMOHPSnx0EWfln4VVR/NKdm3Axp3+nubmapRIVd0KdwjIab7kXxkmw4xuIcZUg4QRzss8BlXRO0RV+N1uVtC3J0mREp4MyCeFLDhJXxdhmqJN7D7TjjY9szg5Q3oqY5fom8ggs0ukRHeCTQw39BK+QWUuAjA4VJAtioQd1fT0tFb2LiIbVAbH1Q67j65+ivVGyrWRAfBptS9/75YCNwFULKFA5iXEo5zjidW//RFhr3/qcTCWfVO5m0eGRoDAFv7zbX9PGiB+x7rNtjUPHBrIVeDgwR0QQrX+iq0QFibYVpRqmV4L0I2mENgQ7jhLUg9ZU1Z9y3IgwQU3mIk5BEVutYukCUGwPZ+E/+GPRUM4mtnGkLhbD6+mxbFr1DuabPZkllmTBtQGzdrR0OOZnxA41kXdpxhffXCOPdop7htJ ZFfAXYWA lJWSfxER75fVAB6v9GDfPqYRrp5f0JdfNZR4JOjpFSRG+5Ra5oeXwJ1TYksUkqxjxoRmA05PKjckvlPgufdXHJyfBKX5BowSbRW+O1reZFEYKnDJ9KN/ohzSiMVRpJUUSSo+LMhwS0kaZ34wlD1ePdV+HpMOHL7RMTfbT/8jI6QtDIhSXDAnMdFKSSHXadc+/3uTvaIwoml7iJrI6q0SbwSfDT/8SzA3X6StPUebHBoYcx1GoZEKmBwOSFk/OudPPKv61pSRmSgfKwD/ug+6XEMS2paa/vI5MWjp7Foiya7xEZe688aMghKvNxwyqgHxLxXwxCPTFMW0pT/M2zFdtBLo2eWZZ/Ul+a8mwy8/Pg+GS3162hqxiLJ52Yw9wl+YwPyVP6HLnoEW1N+J/hUB1KxhvxzZ6nadcsTi/ovTFabMQUBpOnTmmfVpyJMH2zvzEcH4UQNUBVwQj7GnYnWvFVf7V+bQfePZurIbOV0aNPax9g4jKxE8LTUICUTHtQTDyl4vbfCP35voRkRDOcXKS25qg/6qK9zhXlL9EQs3sT8FfJ/TMjjm8GYC1lw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(), so inline it. This is part of replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set. Signed-off-by: David Howells cc: John Fastabend cc: Jakub Sitnicki cc: "David S. 
Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org cc: bpf@vger.kernel.org --- net/ipv4/tcp_bpf.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index ebf917511937..24bfb885777e 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -72,11 +72,13 @@ static int tcp_bpf_push(struct sock *sk, struct sk_msg *msg, u32 apply_bytes, { bool apply = apply_bytes; struct scatterlist *sge; + struct msghdr msghdr = { .msg_flags = flags | MSG_SPLICE_PAGES, }; struct page *page; int size, ret = 0; u32 off; while (1) { + struct bio_vec bvec; bool has_tx_ulp; sge = sk_msg_elem(msg, msg->sg.start); @@ -88,16 +90,18 @@ static int tcp_bpf_push(struct sock *sk, struct sk_msg *msg, u32 apply_bytes, tcp_rate_check_app_limited(sk); retry: has_tx_ulp = tls_sw_has_ctx_tx(sk); - if (has_tx_ulp) { - flags |= MSG_SENDPAGE_NOPOLICY; - ret = kernel_sendpage_locked(sk, - page, off, size, flags); - } else { - ret = do_tcp_sendpages(sk, page, off, size, flags); - } + if (has_tx_ulp) + msghdr.msg_flags |= MSG_SENDPAGE_NOPOLICY; + if (flags & MSG_SENDPAGE_NOTLAST) + msghdr.msg_flags |= MSG_MORE; + + bvec_set_page(&bvec, page, size, off); + iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, size); + ret = tcp_sendmsg_locked(sk, &msghdr, size); if (ret <= 0) return ret; + if (apply) apply_bytes -= ret; msg->sg.size -= ret; @@ -404,7 +408,7 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) long timeo; int flags; - /* Don't let internal do_tcp_sendpages() flags through */ + /* Don't let internal sendpage flags through */ flags = (msg->msg_flags & ~MSG_SENDPAGE_DECRYPTED); flags |= MSG_NO_SHARED_FRAGS; From patchwork Wed Apr 5 16:53:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13202264 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2120C7619A for ; Wed, 5 Apr 2023 16:54:28 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4D82D6B0083; Wed, 5 Apr 2023 12:54:28 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4AEB66B0085; Wed, 5 Apr 2023 12:54:28 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 377436B0087; Wed, 5 Apr 2023 12:54:28 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 2708C6B0083 for ; Wed, 5 Apr 2023 12:54:28 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 090D8141139 for ; Wed, 5 Apr 2023 16:54:27 +0000 (UTC) X-FDA: 80647935816.14.1199AC2 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf06.hostedemail.com (Postfix) with ESMTP id 10FB8180010 for ; Wed, 5 Apr 2023 16:54:25 +0000 (UTC) Authentication-Results: imf06.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=ajuQZvkL; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf06.hostedemail.com: domain of dhowells@redhat.com designates 170.10.129.124 as permitted sender) 
From patchwork Wed Apr 5 16:53:29 2023
From: David Howells
To: netdev@vger.kernel.org
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , Matthew Wilcox , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Steffen Klassert , Herbert Xu Subject: [PATCH net-next v4 10/20] espintcp: Inline do_tcp_sendpages() Date: Wed, 5 Apr 2023 17:53:29 +0100 Message-Id: <20230405165339.3468808-11-dhowells@redhat.com> In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com> References: <20230405165339.3468808-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.5 X-Rspamd-Queue-Id: 10FB8180010 X-Rspamd-Server: rspam09 X-Rspam-User: X-Stat-Signature: 9ma8setd3ie5hdcskutb8eqo7b7cybj8 X-HE-Tag: 1680713665-794599 X-HE-Meta: U2FsdGVkX1+apW6fWZ/mrvQe9kiNQLA1PfaowjsFwESzZaTXe6kHOuXs5r8JK1/HoxgKiGXSOZyWxjb/FXcAndmrV/9mIxMWjVKjY9//uNjTcjkSiJ+7GcDbtRUv4sYc9TRwTpnhahID543RtAYJNJwxtsZ4y+QFbCNhy2/ZZNnWzQJgvk/KRsC+VwZM27ltj9DHFY6KqKOqZE5nt1i1zzMbZpMkGTnAFC3/vta8853iua2vBEnxwBkIlI/kLZtxp9eS894GW3LMTFdf8RlKw25KAN9qNSBRJjpp79zbADdYUreVCVsjxuLd2BLC6GLoxe/L3mx97e5v56hqK8W7j7yM01U3E/86pClfDNYZJhpNYHXCIJBWehu4u9ixrAak0vMqGvougH5ZKhfCv2Gd94R/PZqB08/0K1Bvh2cGHPZOWTy6AM5Jp2pLWX133ubNd84uGqNxTFJAwIspOx5YuPpp0H15W3zcLGh1pcUkBi+5uefhfrifjuG19/4fzX6QXldKieO2atmXg2n8my/MXdzXxsKYNniPCuT2amjFd9Bhuf+/kORtd5/EIKFeIjgSorUQrzBWuVMub1/nOYJFuB0KTPvzEUJf1/l9s4U30gdiKxoik+4b6ukXoZZEhG+hWtcAx0Cyti4BgDhYo4tKneejrWbzRod+ibLsy7EEP7l5E7cJc/QhR0lC3q6hzJIfHL6aMxNIcX1p62cPhvTffSba1tlbyQUNY16/LuL0xtIr0mrgQuhzJMs2V4pGI61itrUk1kgNLMnrKJ9WcLWQRzVGOSbJ4Qh7ZDHugStv3s+3ASrME/zrDINfXzK5zGL0x0unYl0OA4aGPok+kfaCsWxgtH1YuoyRzIFxkUSGCLiwh8t+RM+iwGyErcfjchNMev/JU7ynRil9OUV0yodmDSq6Wa6OUIIo6cRVxMU5IcvGE8iTgbwCB1WTd5KAFr9RZOsxa+7SOqKfsTUEwCW 96FxYLS2 PLx0Kg2so/MmrO7dySUOLKPR1Dxh2jgm0nbWTBMlEcLFIcjkElo6zHkDxblrTzQ25mmIcDXHX24kptZNtosBZk2Qdng7ZlejfIwqTbiZPUS8qkuYYjRmmTGMGw22m7nZavHrYVp0C+PRiJlrXZKBvxCmO2yTZAyLi+IboZSX54r5LZf7pyHZEufYDhO/Ra/TVFvjOh8nUWrrn0PPqGFazfhDmeRJTbDmrC809DUWFuwtYzfZKJTG+1W5RAoyFKSqsJV/Dzwt40Wcen/JRQwFdXtnsw2ZCt1qfUFZVd8I/i3Gqn10v2bTUZoPeyM/cFx43T5AuoU1461Bmy94vGC71ddGEOfQXbjof65v1AtbXoYVTvYrb1j6g9NphOhISLN5DhIS41/axAE4Udh4cLqXaaMKf8V10bZJSRJtkD8le5XLPWNdQQqNjcLQokiDUrjpsiq0qpn9NUE83R/jUCoXSlT7RC7M0lAyUTK1DgjI8zWcBGJyv7V847Hn72zCF3Rv61moLVfuzm2NabeQbYAiIMVRNk5kAbqNiswz52llfCtfT4xikhDUkHnn0GajFCBNgbxBIoEQmEH8Ks9/pNRTtu1B0eQo8To56ahLfLPpcwZgAB56WQvScR41Hxw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(), so inline it, allowing do_tcp_sendpages() to be removed. This is part of replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set. Signed-off-by: David Howells cc: Steffen Klassert cc: Herbert Xu cc: Eric Dumazet cc: "David S. 
Miller" cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/xfrm/espintcp.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c index 872b80188e83..3504925babdb 100644 --- a/net/xfrm/espintcp.c +++ b/net/xfrm/espintcp.c @@ -205,14 +205,16 @@ static int espintcp_sendskb_locked(struct sock *sk, struct espintcp_msg *emsg, static int espintcp_sendskmsg_locked(struct sock *sk, struct espintcp_msg *emsg, int flags) { + struct msghdr msghdr = { .msg_flags = flags | MSG_SPLICE_PAGES, }; struct sk_msg *skmsg = &emsg->skmsg; struct scatterlist *sg; int done = 0; int ret; - flags |= MSG_SENDPAGE_NOTLAST; + msghdr.msg_flags |= MSG_SENDPAGE_NOTLAST; sg = &skmsg->sg.data[skmsg->sg.start]; do { + struct bio_vec bvec; size_t size = sg->length - emsg->offset; int offset = sg->offset + emsg->offset; struct page *p; @@ -220,11 +222,13 @@ static int espintcp_sendskmsg_locked(struct sock *sk, emsg->offset = 0; if (sg_is_last(sg)) - flags &= ~MSG_SENDPAGE_NOTLAST; + msghdr.msg_flags &= ~MSG_SENDPAGE_NOTLAST; p = sg_page(sg); retry: - ret = do_tcp_sendpages(sk, p, offset, size, flags); + bvec_set_page(&bvec, p, size, offset); + iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, size); + ret = tcp_sendmsg_locked(sk, &msghdr, size); if (ret < 0) { emsg->offset = offset - sg->offset; skmsg->sg.start += done; From patchwork Wed Apr 5 16:53:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13202263 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28B6EC7619A for ; Wed, 5 Apr 2023 16:54:24 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B40916B0078; Wed, 5 Apr 2023 12:54:23 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id AF0D36B0080; Wed, 5 Apr 2023 12:54:23 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9B8F96B0083; Wed, 5 Apr 2023 12:54:23 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 8DD916B0078 for ; Wed, 5 Apr 2023 12:54:23 -0400 (EDT) Received: from smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 6CC88ABFDF for ; Wed, 5 Apr 2023 16:54:23 +0000 (UTC) X-FDA: 80647935606.26.B844CAF Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf07.hostedemail.com (Postfix) with ESMTP id A3B8140010 for ; Wed, 5 Apr 2023 16:54:21 +0000 (UTC) Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=iF7LZ0fI; spf=pass (imf07.hostedemail.com: domain of dhowells@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=dhowells@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680713661; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: 
From patchwork Wed Apr 5 16:53:30 2023
From: David Howells
To: netdev@vger.kernel.org
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , Matthew Wilcox , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Boris Pismenny , John Fastabend Subject: [PATCH net-next v4 11/20] tls: Inline do_tcp_sendpages() Date: Wed, 5 Apr 2023 17:53:30 +0100 Message-Id: <20230405165339.3468808-12-dhowells@redhat.com> In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com> References: <20230405165339.3468808-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Stat-Signature: garm8xo3izcuuhyizspntaxpoaim6kje X-Rspam-User: X-Rspamd-Queue-Id: A3B8140010 X-Rspamd-Server: rspam06 X-HE-Tag: 1680713661-77225 X-HE-Meta: U2FsdGVkX18A0b01HvsMXwhC9MRD72rNlzXm+dMk+94zcPk3ZYHVcIC7YFl4FhCWRzmBoOYLd/ZknGBg24HAV/GMeQdh6kso9WiV49CXhC/BLGUUNqZBy34q6/u0iwkZDuyPHNtaAJxX2pjd9RAIbnMkYsRd2DGAVmIA0wWbqvgzN4+1CE/qhMvEFR1XCwG91vdEYGcZ+22cBaxDlEU8LMxhbOsA65fY0AxQ1dzjQO9rWlATP4cmGI+X4bIhlEN9772MjPngu32FsR2FL/TAb44aV60tEEfoh6pK7sNcpNnlC/9wKmOEdKIGq2MTAmOgtjoHhHgmLZPX83FfSBIcYviJOOd3qy4rGGacyYhxsHSjGxAZ6bmdDQDIwHoXWFRZA/GbZQAjWTTZRX+rT9AZbrWv1ztgNcXAOmjudffN89X7l6qMxf7AclIcgrIXw88tRPLtPGHDHWIUVEZRs5PLqtkEUXwt56bW9Qgif37vVQ/FOcEqjPpmUIEN1MHV47OvsRsbpipxx8+zLlGknOLPuKLWlG7LP/qB6Lu+9hT0lSDY0TV+xp/Z2uf6WvKY3CIDfPizM+4tIaXXQM5i/LvIgGY8juB/sPEI510iCAYppul85jU8D0ysGWnO7/S8Q+jZ4dmE5QaHngkv0wvQSng1m3hluOXVfPg0ylU80oIjP0KSjME9jtaCQJcKJIr+C1xPSiqmqVP8E/0lNIpa51doMySpiJsr66X2IiHPO6xXnKcCr8PcMKOFMJTAwI8p2M4O5K3cOucl0DKLOQTXKLJrbBMOZ7yhpqc8Nz5IiiGbraMhpr8b2XmMANTRS2k9G5Wy1iwq0Th5lOwwZcM1NCbStucPKpIFUXHAJKOXoQy+6+vv0jq4+hh+BkSm03hQHHOJXIG7b2PBk1n0hUI2JNfn/1D7yMcWLTtY733s38TddWfBNaj6cH11P/aFEsY8uahl4wDoANZax8r7w62cCXH x8np1o6/ KviNqeOAqmmsBsK36D98niiMMc63e3yki49RX++cUMV2627wjI23JF4ztecJLElJObyj2C1YWUZRz6D0ujEmW/n2L02Nu6C0WjFyhBJxa1thLyYL09ww2KppigCTVcA6Tc1LG5PUhhrEjNefCUWTVB5MZ6a6wCjGCyL/JRZetkouumxw3nf0j1isKVPs/LpDQ7TopUC08tTesVROAYPyWnS8LCRwVX1bBKCEUEd4hmXc+KczIkXPfNiD/MWC8D5cuq4T4VsozJAgWwpk1osqFiBm5GYKjRFYisNPPNE+71bmHP8TdhqliGpK4jHPkPHDwzueUpgyTrSJGozICVnpr0Nk3AiTvXeOgixJOuT1ehgKWJvMyej8LA/vY4yz/SMaIcURd5m+GYLY1FjHHzTw9kzVfx3TS08r1QIn6j0aOAQaaVpcyTxGdc/LaTJFXbZO8s1n5xV940SUDMfsafnIkDI/sow8/yomuyiczR5JAL4KIAcc8H9j5UW1vulkLf+xihnwORIb93dHIEnEAvIqgqjhGG+zWJIv7Oa03oJHW6A92/M6SSZC+QjmwAA== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(), so inline it, allowing do_tcp_sendpages() to be removed. This is part of replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set. Signed-off-by: David Howells cc: Boris Pismenny cc: John Fastabend cc: Jakub Kicinski cc: "David S. 
Miller" cc: Eric Dumazet cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- include/net/tls.h | 2 +- net/tls/tls_main.c | 24 +++++++++++++++--------- 2 files changed, 16 insertions(+), 10 deletions(-) diff --git a/include/net/tls.h b/include/net/tls.h index 154949c7b0c8..d31521c36a84 100644 --- a/include/net/tls.h +++ b/include/net/tls.h @@ -256,7 +256,7 @@ struct tls_context { struct scatterlist *partially_sent_record; u16 partially_sent_offset; - bool in_tcp_sendpages; + bool splicing_pages; bool pending_open_record_frags; struct mutex tx_lock; /* protects partially_sent_* fields and diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c index b32c112984dd..1d0e318d7977 100644 --- a/net/tls/tls_main.c +++ b/net/tls/tls_main.c @@ -124,7 +124,10 @@ int tls_push_sg(struct sock *sk, u16 first_offset, int flags) { - int sendpage_flags = flags | MSG_SENDPAGE_NOTLAST; + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = MSG_SENDPAGE_NOTLAST | MSG_SPLICE_PAGES | flags, + }; int ret = 0; struct page *p; size_t size; @@ -133,16 +136,19 @@ int tls_push_sg(struct sock *sk, size = sg->length - offset; offset += sg->offset; - ctx->in_tcp_sendpages = true; + ctx->splicing_pages = true; while (1) { if (sg_is_last(sg)) - sendpage_flags = flags; + msg.msg_flags = flags; /* is sending application-limited? */ tcp_rate_check_app_limited(sk); p = sg_page(sg); retry: - ret = do_tcp_sendpages(sk, p, offset, size, sendpage_flags); + bvec_set_page(&bvec, p, size, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); + + ret = tcp_sendmsg_locked(sk, &msg, size); if (ret != size) { if (ret > 0) { @@ -154,7 +160,7 @@ int tls_push_sg(struct sock *sk, offset -= sg->offset; ctx->partially_sent_offset = offset; ctx->partially_sent_record = (void *)sg; - ctx->in_tcp_sendpages = false; + ctx->splicing_pages = false; return ret; } @@ -168,7 +174,7 @@ int tls_push_sg(struct sock *sk, size = sg->length; } - ctx->in_tcp_sendpages = false; + ctx->splicing_pages = false; return 0; } @@ -246,11 +252,11 @@ static void tls_write_space(struct sock *sk) { struct tls_context *ctx = tls_get_ctx(sk); - /* If in_tcp_sendpages call lower protocol write space handler + /* If splicing_pages call lower protocol write space handler * to ensure we wake up any waiting operations there. For example - * if do_tcp_sendpages where to call sk_wait_event. + * if splicing pages where to call sk_wait_event. 
 	 */
-    if (ctx->in_tcp_sendpages) {
+    if (ctx->splicing_pages) {
         ctx->sk_write_space(sk);
         return;
     }
From patchwork Wed Apr 5 16:53:31 2023
From: David Howells
To: netdev@vger.kernel.org
Cc: David Howells, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    Willem de Bruijn, Matthew Wilcox, Al Viro, Christoph Hellwig, Jens Axboe,
    Jeff Layton, Christian Brauner, Chuck Lever III, Linus Torvalds,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    Bernard Metzler, Tom Talpey, linux-rdma@vger.kernel.org
Subject: [PATCH net-next v4 12/20] siw: Inline do_tcp_sendpages()
Date: Wed, 5 Apr 2023 17:53:31 +0100
Message-Id: <20230405165339.3468808-13-dhowells@redhat.com>
In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com>

do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(), so
inline it,
allowing do_tcp_sendpages() to be removed.  This is part of replacing
->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set.

Signed-off-by: David Howells
cc: Bernard Metzler
cc: Tom Talpey
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: linux-rdma@vger.kernel.org
cc: netdev@vger.kernel.org
---
 drivers/infiniband/sw/siw/siw_qp_tx.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
index 05052b49107f..fa5de40d85d5 100644
--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
@@ -313,7 +313,7 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx, struct socket *s,
 }
 
 /*
- * 0copy TCP transmit interface: Use do_tcp_sendpages.
+ * 0copy TCP transmit interface: Use MSG_SPLICE_PAGES.
  *
  * Using sendpage to push page by page appears to be less efficient
  * than using sendmsg, even if data are copied.
@@ -324,20 +324,27 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx, struct socket *s,
 static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
                              size_t size)
 {
+    struct bio_vec bvec;
+    struct msghdr msg = {
+        .msg_flags = (MSG_MORE | MSG_DONTWAIT | MSG_SENDPAGE_NOTLAST |
+                      MSG_SPLICE_PAGES),
+    };
     struct sock *sk = s->sk;
-    int i = 0, rv = 0, sent = 0,
-        flags = MSG_MORE | MSG_DONTWAIT | MSG_SENDPAGE_NOTLAST;
+    int i = 0, rv = 0, sent = 0;
 
     while (size) {
         size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
 
         if (size + offset <= PAGE_SIZE)
-            flags = MSG_MORE | MSG_DONTWAIT;
+            msg.msg_flags = MSG_MORE | MSG_DONTWAIT;
 
         tcp_rate_check_app_limited(sk);
 
+        bvec_set_page(&bvec, page[i], bytes, offset);
+        iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
+
 try_page_again:
         lock_sock(sk);
-        rv = do_tcp_sendpages(sk, page[i], offset, bytes, flags);
+        rv = tcp_sendmsg_locked(sk, &msg, size);
         release_sock(sk);
 
         if (rv > 0) {
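A simplified sketch of the pattern siw ends up with - sending from an array of
pages, at most one page's worth per call, with MSG_MORE held until the final
fragment - might look like the following.  The helper name is hypothetical and
the real siw_tcp_sendpages() additionally tracks the total number of bytes
sent and retries partially-sent pages.

/* Illustrative sketch: walk an array of pages and splice each fragment
 * into the TCP socket, clearing MSG_MORE only for the last fragment.
 */
static int send_page_array(struct socket *s, struct page **pages,
                           int offset, size_t size)
{
        struct sock *sk = s->sk;
        int i = 0;

        while (size) {
                size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
                struct bio_vec bvec;
                struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_SPLICE_PAGES };
                int rv;

                if (size > bytes)
                        msg.msg_flags |= MSG_MORE;      /* another fragment follows */

                bvec_set_page(&bvec, pages[i], bytes, offset);
                iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);

                lock_sock(sk);
                rv = tcp_sendmsg_locked(sk, &msg, bytes);
                release_sock(sk);
                if (rv <= 0)
                        return rv ?: -EAGAIN;

                size -= rv;
                offset = (offset + rv) & (PAGE_SIZE - 1);
                if (offset == 0)
                        i++;                            /* move to the next page */
        }
        return 0;
}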
From: David Howells To: netdev@vger.kernel.org Cc: David Howells , "David S.
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , Matthew Wilcox , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH net-next v4 13/20] tcp: Fold do_tcp_sendpages() into tcp_sendpage_locked() Date: Wed, 5 Apr 2023 17:53:32 +0100 Message-Id: <20230405165339.3468808-14-dhowells@redhat.com> In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com> References: <20230405165339.3468808-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Rspamd-Queue-Id: A520714001B X-Stat-Signature: d6dwj8sirxkghbwuaadbg85heeqsyjyq X-Rspam-User: X-Rspamd-Server: rspam08 X-HE-Tag: 1680713667-755128 X-HE-Meta: U2FsdGVkX1/0lEtg4wLK7f+1mI6sFvAQgMzEzYyCRPGzn0LN28/GR1GHS9SNgNoF8HHNxjWM7JldoAWEdgmS2e/1nIK2SfYsB4nlscJvv+SoGIbJVA5LHl4QTEi925vFEkAUhXsa/CeL6UejtnTCivYnuaXaewunxcFRswHnrBxzxIdvGto+w9vKMbxa7phIF52gnMCyOdycwyFOiBKxNhGOyVpMcIdRaYE512tbBMfWycOdiedebwKYSuKnwjKJAF5fITvM1G9sst4aa3ATZQRQh5d7zsOji5MC5fcaQoPzsD3NIixOc1cJDe4Zb9eURNdhcQtugwmHWIcDkRsJuTzEGJyGJT4Yckktoqr8xGvMBid+OxKCAXJGVRGIk6t0B5lXu+25YZw7fQfpfBX8dXOD8/Ttg6vhjNrY/GdBvbmxGuOZZCBKnYTL3QxZvz7PuctUxEXLHTJs7iH4NOiTLe9DDHSYfB6Ow4BUmim9jpiaRTQT3KFvPxJXJsEZ3gLvyLr4hyiqTdc+3KDkyUC1DJrNLTPaT/eT4xHt+0NiEx9wpmWbuoUfhOUGgbRxrKwwN2fKXoj2+tgY4ka0bJQ4WqLQyLJbgNNXWTNlNMA5ZGR67xToB42anYPMq4Uo5fHjhur10jUHmVfKIwOhkr0GZNdZUWw6jgnxIouDBywrGALbnZ3vxBGDV5mdjs65+82OWzRGkho1OuA+Gy76cXUpFXkCJig/2CPDZCT8q7yEqhdQWCCZx1AZPp+VUcXuWFu06/10krXN6tYFQB8ynUAndqkBwxMdXZw/o26H/k0PDkJja/fBNDQvG0oS00aXx4iRsNcgahHz/a6+fegUfrkUUzrvS2/bEIh1oNAzlHon8hgdGEWThcDQ0WQwliQ3YNZUmji4VckCJB0tTtnyEPoYpDzEInRtA/Hempd8HQXrvqzAZbecbSxJf6ZwkuMCETW9tUADQ0hMu03nol/rGzu 4KmgiRV/ pdlmtBU0sw3h/6RLQ/8svszL6Lo0BU8JcxusC4WY2oHgWSsd6SLCX1jUT2p45RoY6cYQTTnALWCNU+ovF/KYLzz3bcBoM6hX47q/aZv4Gc0GW6TQtcBBmFRF+tST0cS/kqpPS96k6T50kCfKStJOHt4agnKGeBx8PSQ0swRNJ8bohA6j5JiaLkW2T1EbndPQGSfhpEMfz7/pnkDWeZ9te/YvzpltFbpipvIUkU48jv9vXshiqCFbfKJTEHBCoqc5VmKjBXNEPP/DV2+Kn2O0/+rRwn27TdQhLYt2876fP3bVqmx2RtYx+PUqea4P2rm+4HyoEAYHE0TLedjiYJHjtARHDtEcq3a9eH8iAmVe82lQ3U80Rh7Wzeta+8FDq2A9i4P5w0eCra9FPJLDM/vbvhsgHbE82ET/IFt0KAQQPMBiFEsib9P9UVlCHkW16G/Foo/sCubZYXuPIIethcBesW+MlpUNGL7R5yo70AtCT2vJkjlo6U9BDre4n5fsCrYBkfKLm X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Fold do_tcp_sendpages() into its last remaining caller, tcp_sendpage_locked(). Signed-off-by: David Howells cc: Eric Dumazet cc: "David S. 
Miller" cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- include/net/tcp.h | 2 -- net/ipv4/tcp.c | 21 +++++++-------------- 2 files changed, 7 insertions(+), 16 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index a0a91a988272..11c62d37f3d5 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -333,8 +333,6 @@ int tcp_sendpage(struct sock *sk, struct page *page, int offset, size_t size, int flags); int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset, size_t size, int flags); -ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset, - size_t size, int flags); int tcp_send_mss(struct sock *sk, int *size_goal, int flags); void tcp_push(struct sock *sk, int flags, int mss_now, int nonagle, int size_goal); diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index a8a4ace8b3da..c7240beedfed 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -972,12 +972,17 @@ static int tcp_wmem_schedule(struct sock *sk, int copy) return min(copy, sk->sk_forward_alloc); } -ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset, - size_t size, int flags) +int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset, + size_t size, int flags) { struct bio_vec bvec; struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, }; + if (!(sk->sk_route_caps & NETIF_F_SG)) + return sock_no_sendpage_locked(sk, page, offset, size, flags); + + tcp_rate_check_app_limited(sk); /* is sending application-limited? */ + bvec_set_page(&bvec, page, size, offset); iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); @@ -986,18 +991,6 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset, return tcp_sendmsg_locked(sk, &msg, size); } -EXPORT_SYMBOL_GPL(do_tcp_sendpages); - -int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset, - size_t size, int flags) -{ - if (!(sk->sk_route_caps & NETIF_F_SG)) - return sock_no_sendpage_locked(sk, page, offset, size, flags); - - tcp_rate_check_app_limited(sk); /* is sending application-limited? 
*/ - - return do_tcp_sendpages(sk, page, offset, size, flags); -} EXPORT_SYMBOL_GPL(tcp_sendpage_locked); int tcp_sendpage(struct sock *sk, struct page *page, int offset,
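With the fold above, the sendpage-style entry point is just a thin wrapper, so the old and new forms are equivalent. As a hedged sketch of that mapping, condensed from the hunk above (the caller is assumed to hold the socket lock; page, offset, size and flags are the usual sendpage arguments):

/* Before this patch:
 *	sent = do_tcp_sendpages(sk, page, offset, size, flags);
 *
 * After: the same request expressed as a one-bvec, spliced sendmsg.
 */
struct bio_vec bvec;
struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES };
ssize_t sent;

bvec_set_page(&bvec, page, size, offset);
iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
sent = tcp_sendmsg_locked(sk, &msg, size);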
From patchwork Wed Apr 5 16:53:33 2023 X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13202267
From: David Howells To: netdev@vger.kernel.org Cc: David Howells , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , Matthew Wilcox , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH net-next v4 14/20] udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES Date: Wed, 5 Apr 2023 17:53:33 +0100 Message-Id: <20230405165339.3468808-15-dhowells@redhat.com> In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com> References: <20230405165339.3468808-1-dhowells@redhat.com> MIME-Version: 1.0

Convert udp_sendpage() to use sendmsg() with
MSG_SPLICE_PAGES rather than directly splicing in the pages itself. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Willem de Bruijn cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/ipv4/udp.c | 50 +++++++++----------------------------------------- 1 file changed, 9 insertions(+), 41 deletions(-) diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index aa32afd871ee..0d3e78a65f51 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1332,52 +1332,20 @@ EXPORT_SYMBOL(udp_sendmsg); int udp_sendpage(struct sock *sk, struct page *page, int offset, size_t size, int flags) { - struct inet_sock *inet = inet_sk(sk); - struct udp_sock *up = udp_sk(sk); + struct bio_vec bvec; + struct msghdr msg = { + .msg_flags = flags | MSG_SPLICE_PAGES | MSG_MORE + }; int ret; - if (flags & MSG_SENDPAGE_NOTLAST) - flags |= MSG_MORE; + bvec_set_page(&bvec, page, size, offset); + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size); - if (!up->pending) { - struct msghdr msg = { .msg_flags = flags|MSG_MORE }; - - /* Call udp_sendmsg to specify destination address which - * sendpage interface can't pass. - * This will succeed only when the socket is connected. - */ - ret = udp_sendmsg(sk, &msg, 0); - if (ret < 0) - return ret; - } + if (flags & MSG_SENDPAGE_NOTLAST) + msg.msg_flags |= MSG_MORE; lock_sock(sk); - - if (unlikely(!up->pending)) { - release_sock(sk); - - net_dbg_ratelimited("cork failed\n"); - return -EINVAL; - } - - ret = ip_append_page(sk, &inet->cork.fl.u.ip4, - page, offset, size, flags); - if (ret == -EOPNOTSUPP) { - release_sock(sk); - return sock_no_sendpage(sk->sk_socket, page, offset, - size, flags); - } - if (ret < 0) { - udp_flush_pending_frames(sk); - goto out; - } - - up->len += size; - if (!(READ_ONCE(up->corkflag) || (flags&MSG_MORE))) - ret = udp_push_pending_frames(sk); - if (!ret) - ret = size; -out: + ret = udp_sendmsg(sk, &msg, size); release_sock(sk); return ret; } From patchwork Wed Apr 5 16:53:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13202268 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B4EF1C7619A for ; Wed, 5 Apr 2023 16:54:34 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5C3646B0089; Wed, 5 Apr 2023 12:54:33 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 573276B008A; Wed, 5 Apr 2023 12:54:33 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3EC816B008C; Wed, 5 Apr 2023 12:54:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 2B7F36B0089 for ; Wed, 5 Apr 2023 12:54:33 -0400 (EDT) Received: from smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 14A9C1A0D0A for ; Wed, 5 Apr 2023 16:54:33 +0000 (UTC) X-FDA: 80647936026.26.BC182ED Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf14.hostedemail.com (Postfix) with ESMTP id 43FF4100024 for ; Wed, 5 Apr 2023 
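To recap the udp_sendpage() conversion above as a hedged sketch, condensed from the new function body (sk, page, offset, size and flags are the existing sendpage arguments; MSG_MORE keeps the old corking behaviour):

/* Sketch of the converted udp_sendpage(): express the page as a spliced
 * bvec and let udp_sendmsg() do the work that ip_append_page() used to do.
 */
struct bio_vec bvec;
struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES | MSG_MORE };
int ret;

bvec_set_page(&bvec, page, size, offset);
iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);

lock_sock(sk);
ret = udp_sendmsg(sk, &msg, size);
release_sock(sk);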
From: David Howells To: netdev@vger.kernel.org Cc: David Howells , "David S.
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , Matthew Wilcox , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH net-next v4 15/20] ip: Remove ip_append_page() Date: Wed, 5 Apr 2023 17:53:34 +0100 Message-Id: <20230405165339.3468808-16-dhowells@redhat.com> In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com> References: <20230405165339.3468808-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Rspam-User: X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 43FF4100024 X-Stat-Signature: y9x9nyi4wnpk834usc83h5ns1nhdh8te X-HE-Tag: 1680713671-647649 X-HE-Meta: U2FsdGVkX18NxgJK/ORdnc59+g8FoxU7Gr5nwYGPhcDgvE4fpgXXxzB3Ip+E47KGQRr7qoR5Uw5/6BUKlPd8+3AZKz2l8p2o8+gmqCe/i0W6NP2hZYkhvJDPB9MhX7E4njzmcUHSomwopyLAxzI4Nv/Rn7CoTkWCmpYbLuR66UNFYs+EFlyy1FD9C97MsztkKkAqmTRGnqGTUgaja1BnKOJ5xM55MwWTsI2zcICrquMiK0AJP57GgOPOiqW9dJofKt6+WGCBItZJNEqleGLso0dP+4u3DVCkcVhN7WJoldkpw2jzYDENeZB5f8cgGl6a9Cmw8347sW/pc1bMFXlT3fnOqsvM51fylyEcf6Z4uUx14jfwoKh7eMqWMISTXe2ofX52E+5QFKq+RrcXcRblVwLWqbWZlhGa6bCo9hpoCrXdhD6w8U3y6t1PAP8/MwnL1qRI4pCkmYzT8anwYBxZUfgAZlvvxaGqBuJMAXepPfQoj4l4Ungd4cIV97s8NfJlF1KYXsN67oIbGHYeobcSTMZ0J6j0cm9E7PI1jPnc/rg9rnoTXJAfhOI6e5PnNjv98s3/PgPSD+ojRL1NVdoHY4klH/JSGTWfQ8OesChzKVfLRY5Kg0hZ5WbGCXO8PffGvLAiUDHQ2+G/sLkVhdfGYeYAsDNQHx/jb23Rp4TuDrdBuzrKkz2VhhTErc8nhxsPSy3D8qdT3MG5+sVcTCAmbvwzpD3QCkD2kaznUp3ZqE3r42/A2YSpIUd4AuhtHDBwNXFkupLXNpJTktGXu7kH4/YIX6B5PCBM47kTlGjQOjkP5EoXwoMus4NoAO2uhIqg0LQpJgfZjYQgmrbqhVfi5+zFCr+ZvwD5baLbneGZI7RWwjyyfxlH/fsbsKCXpF/tCXtXvcVpKgfAKfZefzTGghyXv6uuxbeTYumJSLkwdPTULQ9DL0UIuhR4z3BxPQw56mWISDBeaL9dF3GDqPY C5V9fgam niGCnkoYbPACVywroTIYCLnSuTXAQMDf3diUK5AZylnZe98aPpYqSBFWneK+BGfpCcXKyEN371yMKEvEPKd35Q3Lw+S0EM6LEzrrFMItR7LhgCurgoTyoNMontPIFe3BrD1J8ZrlbZhfylGQFndgvSF7CgF+OPdQBjQ+D0CBAOkWfB35SrxSeqVDWO3NrAtDB5vFSEzA44nPLguxOqi1Vf1A3Aj6Xat3q9FZexj8fE2Hg1Ux0HyDd0Gs1S1mtNIuc4SaBtfAh0+oU2GNaZsi1fKuLPp74/bW2OcrdZlBsvjDOjx+fFMRbxKZ+4OE78LC3/z/yqCbAI5NcmeST6Lg0Z+5+VMKlW+molADEtb6bfSyEW0yZdyySeTHxY8YjWMUkpl9WC3gFvIcOcu2sZttM4IZQWjsJV508+0GwUJ82maIwwlyE56AI758yOb5dABPEXLML6FFcbktQVWVsdiaCEJui83VA4aHVX/LKB4anOGHw6Y6EXESZDsQraT2uXpYzJ4glvxrl63tdkDLBRMSonWbDIw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: ip_append_page() is no longer used with the removal of udp_sendpage(), so remove it. Signed-off-by: David Howells cc: Willem de Bruijn cc: "David S. 
Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- include/net/ip.h | 2 - net/ipv4/ip_output.c | 136 ++----------------------------------------- 2 files changed, 4 insertions(+), 134 deletions(-) diff --git a/include/net/ip.h b/include/net/ip.h index c3fffaa92d6e..7627a4df893b 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -220,8 +220,6 @@ int ip_append_data(struct sock *sk, struct flowi4 *fl4, unsigned int flags); int ip_generic_getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb); -ssize_t ip_append_page(struct sock *sk, struct flowi4 *fl4, struct page *page, - int offset, size_t size, int flags); struct sk_buff *__ip_make_skb(struct sock *sk, struct flowi4 *fl4, struct sk_buff_head *queue, struct inet_cork *cork); diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 22a90a9392eb..2dacee1a1ed4 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -1310,10 +1310,10 @@ static int ip_setup_cork(struct sock *sk, struct inet_cork *cork, } /* - * ip_append_data() and ip_append_page() can make one large IP datagram - * from many pieces of data. Each pieces will be holded on the socket - * until ip_push_pending_frames() is called. Each piece can be a page - * or non-page data. + * ip_append_data() can make one large IP datagram from many pieces of + * data. Each piece will be held on the socket until + * ip_push_pending_frames() is called. Each piece can be a page or + * non-page data. * * Not only UDP, other transport protocols - e.g. raw sockets - can use * this interface potentially. @@ -1346,134 +1346,6 @@ int ip_append_data(struct sock *sk, struct flowi4 *fl4, from, length, transhdrlen, flags); } -ssize_t ip_append_page(struct sock *sk, struct flowi4 *fl4, struct page *page, - int offset, size_t size, int flags) -{ - struct inet_sock *inet = inet_sk(sk); - struct sk_buff *skb; - struct rtable *rt; - struct ip_options *opt = NULL; - struct inet_cork *cork; - int hh_len; - int mtu; - int len; - int err; - unsigned int maxfraglen, fragheaderlen, fraggap, maxnonfragsize; - - if (inet->hdrincl) - return -EPERM; - - if (flags&MSG_PROBE) - return 0; - - if (skb_queue_empty(&sk->sk_write_queue)) - return -EINVAL; - - cork = &inet->cork.base; - rt = (struct rtable *)cork->dst; - if (cork->flags & IPCORK_OPT) - opt = cork->opt; - - if (!(rt->dst.dev->features & NETIF_F_SG)) - return -EOPNOTSUPP; - - hh_len = LL_RESERVED_SPACE(rt->dst.dev); - mtu = cork->gso_size ? IP_MAX_MTU : cork->fragsize; - - fragheaderlen = sizeof(struct iphdr) + (opt ? opt->optlen : 0); - maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen; - maxnonfragsize = ip_sk_ignore_df(sk) ? 0xFFFF : mtu; - - if (cork->length + size > maxnonfragsize - fragheaderlen) { - ip_local_error(sk, EMSGSIZE, fl4->daddr, inet->inet_dport, - mtu - (opt ? opt->optlen : 0)); - return -EMSGSIZE; - } - - skb = skb_peek_tail(&sk->sk_write_queue); - if (!skb) - return -EINVAL; - - cork->length += size; - - while (size > 0) { - /* Check if the remaining data fits into current packet. 
*/ - len = mtu - skb->len; - if (len < size) - len = maxfraglen - skb->len; - - if (len <= 0) { - struct sk_buff *skb_prev; - int alloclen; - - skb_prev = skb; - fraggap = skb_prev->len - maxfraglen; - - alloclen = fragheaderlen + hh_len + fraggap + 15; - skb = sock_wmalloc(sk, alloclen, 1, sk->sk_allocation); - if (unlikely(!skb)) { - err = -ENOBUFS; - goto error; - } - - /* - * Fill in the control structures - */ - skb->ip_summed = CHECKSUM_NONE; - skb->csum = 0; - skb_reserve(skb, hh_len); - - /* - * Find where to start putting bytes. - */ - skb_put(skb, fragheaderlen + fraggap); - skb_reset_network_header(skb); - skb->transport_header = (skb->network_header + - fragheaderlen); - if (fraggap) { - skb->csum = skb_copy_and_csum_bits(skb_prev, - maxfraglen, - skb_transport_header(skb), - fraggap); - skb_prev->csum = csum_sub(skb_prev->csum, - skb->csum); - pskb_trim_unique(skb_prev, maxfraglen); - } - - /* - * Put the packet on the pending queue. - */ - __skb_queue_tail(&sk->sk_write_queue, skb); - continue; - } - - if (len > size) - len = size; - - if (skb_append_pagefrags(skb, page, offset, len)) { - err = -EMSGSIZE; - goto error; - } - - if (skb->ip_summed == CHECKSUM_NONE) { - __wsum csum; - csum = csum_page(page, offset, len); - skb->csum = csum_block_add(skb->csum, csum, skb->len); - } - - skb_len_add(skb, len); - refcount_add(len, &sk->sk_wmem_alloc); - offset += len; - size -= len; - } - return 0; - -error: - cork->length -= size; - IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTDISCARDS); - return err; -} - static void ip_cork_release(struct inet_cork *cork) { cork->flags &= ~IPCORK_OPT; From patchwork Wed Apr 5 16:53:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13202269 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B027FC761AF for ; Wed, 5 Apr 2023 16:54:39 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4D9616B0071; Wed, 5 Apr 2023 12:54:39 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4AE9C6B008A; Wed, 5 Apr 2023 12:54:39 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 376E36B008C; Wed, 5 Apr 2023 12:54:39 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 29AD76B0071 for ; Wed, 5 Apr 2023 12:54:39 -0400 (EDT) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 0E2661A0C54 for ; Wed, 5 Apr 2023 16:54:39 +0000 (UTC) X-FDA: 80647936278.02.6D1C618 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf14.hostedemail.com (Postfix) with ESMTP id 63510100002 for ; Wed, 5 Apr 2023 16:54:37 +0000 (UTC) Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=UagTwg0I; spf=pass (imf14.hostedemail.com: domain of dhowells@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=dhowells@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680713677; 
From: David Howells To: netdev@vger.kernel.org Cc: David Howells , "David S.
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , Matthew Wilcox , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH net-next v4 16/20] ip, udp: Support MSG_SPLICE_PAGES Date: Wed, 5 Apr 2023 17:53:35 +0100 Message-Id: <20230405165339.3468808-17-dhowells@redhat.com> In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com> References: <20230405165339.3468808-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 X-Rspam-User: X-Rspamd-Server: rspam03 X-Stat-Signature: 3pi7uorjhdegapzfmnsz6sqjkakot3ch X-Rspamd-Queue-Id: 63510100002 X-HE-Tag: 1680713677-44759 X-HE-Meta: U2FsdGVkX1/SrfJE0eC0gqm2MYCi6jKGhK7iXH8f3Wn0AFFoDfgvZAFDYH6Ffqft+KvNLP9nKzklbnyMgUaDJxyimoOsNtVSLq4odNHCnC+vOBHUGYczuqw8V49/x7fYeYlsk9eEaCIZmoj53hHeQkaPn1xWF8W4Vj7fI2bWrnnB1LcnQyeKGbYWxNBCyiraHep2rlNa/bnSUx5wU5orvxcf/1w70SzIQg0NfohpwOUWWYlpewlBqKdUO8yTYCFtn6bYOVnH+WF+r1KRyHhtf7tvj02PYoJ9WAwm5CTGCok04oaetcRBZfiFCwknDNpbHP3pI9sXDiDFXQXUUx+HVhY4mH0dKy1qCrplTbq9clVbLBJt/qnCnEvTM4YHwkAEStQ4mkkxPGe0NendtnvUeAgOIDzZhrudOwMOK2YmRrsUnaVYtcHPGyCIv3nnCYPfAQgZxskbV1KEukvnVwCbcsSVByQqwiGbBAh2Q1XmEogV495c4C2LFpr7fsfZdsyawcxPYGDUuRNb7qpegRwM5Ekp10PIi1h3E1FLNN2UxX7Lqg2ATDdF/hl3pbqFm3pg6fTQyy7+sLZSFIJViDrimjitfsT7I6g96n7YEJvQkPda2QUY8DBDAG/3XKg+ICZit2jGUrtLyECgnJuWFzJwBaYfUEHYNUIi9DqLxD2xubzA29ZzrY1zHXafvpoyT8R3y6w8c46zKF1QaVUuzTHNZz5U8r+ybK9WQw5F/J6GCeqOyhKgByB7sXyz9XNHcrgyHvYkGE87CYboiiN7HqhTnDIRxPV6QoBzTBAPKvEx4/+x32fEbjx0ObqwL4zOQkKQ6ezp8QZWclhItKgiqZcu9nh9QEI4MsXu+DmUUl2kL+JBGta0/Rf5FCRS6kEhHeFWYrJ5DV9MdPWMl1doi98sESFsLLfLxs08Va4uLnMws06lgJFNubXVtMesI62QMtYh9icPAf5rkfnT9trubZS 0XBNb3HF 0s2CJtvVLJU8vpDSuFRvOVI4WuR4YYC6jVpesRM5eFvSozdGtNhfTJFC/mw2oajLkhSCx/tYm0UrXnzADZMBLtKZO6yLcNGw0Ld69MluiPu2/EzhsXRN5lfeBgekrSKa2l67tLSbd+LFzBev429691vhlghCy0V0798JROFPSkIhm9PnnEFZBaYZOmdbGrUCsHqWcLOq2OaLhrsfSQ5jF343e4SVIdsbHXWR6Easc5kTLv+thX4liha/Tbhw2GCfKAIOYofnKlyHDt2SvOG/Uj8IAZWTxMDvwZmzVIK3BZWP0QXLtsYozZwO9lN1bHNDV6Pw3Z+V6Gne09uxbdRvq5Cxk3q7CAqYXjs+ChZ3Y7pZujA4dFJe3MQOqArZwDbcMQ+/3lo17b0nkJbHwUjfiakvt3/bunVNehR5yGqSPVOzEV0ExsW8h/R6g7H80pXwXQTa+hI6XV1ORhrwr5gl7x5KCjogERhp53I+TYzSuRzOI20xtf/sMc2904NQYvlGd3CxgSuSn5w2ix4nooWqItUYFng== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Make IP/UDP sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Willem de Bruijn cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/ipv4/ip_output.c | 47 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 47 insertions(+) diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 2dacee1a1ed4..13d19867ffd3 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -957,6 +957,41 @@ csum_page(struct page *page, int offset, int copy) return csum; } +/* + * Add (or copy) data pages for MSG_SPLICE_PAGES. 
+ */ +static int __ip_splice_pages(struct sock *sk, struct sk_buff *skb, + void *from, int *pcopy) +{ + struct msghdr *msg = from; + struct page *page = NULL, **pages = &page; + ssize_t copy = *pcopy; + size_t off; + int err; + + copy = iov_iter_extract_pages(&msg->msg_iter, &pages, copy, 1, 0, &off); + if (copy <= 0) + return copy ?: -EIO; + + err = skb_append_pagefrags(skb, page, off, copy); + if (err < 0) { + iov_iter_revert(&msg->msg_iter, copy); + return err; + } + + if (skb->ip_summed == CHECKSUM_NONE) { + __wsum csum; + + csum = csum_page(page, off, copy); + skb->csum = csum_block_add(skb->csum, csum, skb->len); + } + + skb_len_add(skb, copy); + refcount_add(copy, &sk->sk_wmem_alloc); + *pcopy = copy; + return 0; +} + static int __ip_append_data(struct sock *sk, struct flowi4 *fl4, struct sk_buff_head *queue, @@ -1048,6 +1083,14 @@ static int __ip_append_data(struct sock *sk, skb_zcopy_set(skb, uarg, &extra_uref); } } + } else if ((flags & MSG_SPLICE_PAGES) && length) { + if (inet->hdrincl) + return -EPERM; + if (rt->dst.dev->features & NETIF_F_SG) + /* We need an empty buffer to attach stuff to */ + paged = true; + else + flags &= ~MSG_SPLICE_PAGES; } cork->length += length; @@ -1207,6 +1250,10 @@ static int __ip_append_data(struct sock *sk, err = -EFAULT; goto error; } + } else if (flags & MSG_SPLICE_PAGES) { + err = __ip_splice_pages(sk, skb, from, ©); + if (err < 0) + goto error; } else if (!zc) { int i = skb_shinfo(skb)->nr_frags; From patchwork Wed Apr 5 16:53:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13202270 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B7F45C77B62 for ; Wed, 5 Apr 2023 16:54:45 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 78DE86B0083; Wed, 5 Apr 2023 12:54:44 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 684B46B0095; Wed, 5 Apr 2023 12:54:44 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3EBC06B0093; Wed, 5 Apr 2023 12:54:44 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 2C6286B0092 for ; Wed, 5 Apr 2023 12:54:44 -0400 (EDT) Received: from smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 0887F1C6A06 for ; Wed, 5 Apr 2023 16:54:44 +0000 (UTC) X-FDA: 80647936488.26.CA20589 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf13.hostedemail.com (Postfix) with ESMTP id 5572920004 for ; Wed, 5 Apr 2023 16:54:42 +0000 (UTC) Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=GyC+EHaT; spf=pass (imf13.hostedemail.com: domain of dhowells@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=dhowells@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1680713682; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: 
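With __ip_append_data() handling MSG_SPLICE_PAGES as in the patch above, an in-kernel caller can hand whole page fragments to UDP without copying them. A minimal, hedged caller-side sketch (the helper name is illustrative; it assumes a connected socket and a kernel with this series applied):

#include <linux/bvec.h>
#include <linux/net.h>
#include <linux/uio.h>

/* Illustrative only: splice an array of page fragments into one datagram.
 * If the route's device lacks NETIF_F_SG, the IP layer falls back to copying.
 */
static int example_udp_splice(struct socket *sock, struct bio_vec *bv,
			      unsigned int nr_bvecs, size_t total)
{
	struct msghdr msg = { .msg_flags = MSG_SPLICE_PAGES };

	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, bv, nr_bvecs, total);
	return sock_sendmsg(sock, &msg);
}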
From: David Howells To: netdev@vger.kernel.org Cc: David Howells , "David S.
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , Matthew Wilcox , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH net-next v4 17/20] ip, udp: Make sendmsg(MSG_SPLICE_PAGES) copy unspliceable data Date: Wed, 5 Apr 2023 17:53:36 +0100 Message-Id: <20230405165339.3468808-18-dhowells@redhat.com> In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com> References: <20230405165339.3468808-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 5572920004 X-Rspam-User: X-Stat-Signature: 5ygsx4cebsbi8ruadfnbbspm6xhgqoeo X-HE-Tag: 1680713682-334846 X-HE-Meta: U2FsdGVkX1/0Ti7jgAvr7WFDzHncZt6cIJXTNdamA6xrrzgggL2Jn+tGOLWrs/Uoa+e3wf8Eery9F+kqDq70ChVM1sNjIqROjhohYC8mPa+CUmMrPi6AGrxe/+hfwQUq1Q5lYMHCe0xbMFrvrg8yrsyDYZaJ0PweBShh+qR7CSWJqJQg44pujl0r7VK/g888IsYpRQk8y/RMKnMqGGRQYcfH5iub1ViXZK9kxH15JWg2/XpOXJouZJMTlG6JSM9LgGAyq2QBVlYAyKsAr5/UmFxdcN+34bq3bDI9Fc0mSa95whuXt1bEjTQLWmlPYNdqbzF9+cr7S98QNAz6sNHAdbzUryk462+prttxSHZf7YrzEJXSB8RF6P+UC1Wm+060cIRJBtSeO5OG3P2s4oufHbE6VChZxzmbM95AgnkonweZXWW2hHpPesxWfewQ+VZupb23RUG2H+NWPALRRBhccS0K4m6L9QnObfH73OdSAjA7qgT5z7q/Nz9gn8cshK/lVV+ArYPqrR4y0RQI1s5xigjCuGlfWTs8Kqqss5Vp6wUf8dC3v5jIDr5G3O0Kq00vcuwbS45qB8WJJywWnU9Evg9XUCOg70sawArDu0QBsm3ef1ReVXO5eUzCLyNS1YpEeAtRKvjKEuGwYvnWnqPEVYf/c4dKU0XyGGnpHXighqXGSV6h0F1xGu+JchPpwmeSmgvgF94YoIleCMTq3DinXithBjl8LnlWu4/srhDW+bZhgrNRct28vEyQk5sENRCvoz+naLjL2fs+hpTf5zdHp6ri4IwaBOt1dyXMktV7BmPbJPoLVRsvh8KbOLF32gMMM3QF8wUX827E8ei71jmlwGOKk9tISQ1Fxzt7cM65PTzT4G7y24cYdZxndjXpRY8pBSpeTCYM1Vkcgz6sBBamPacfJ6DZgwTYWwfi8qM2Ak5C3LWbV2r2LgD/rnBDSeKOC4X/BQFd/guCq1+48sr W6mNcuW1 LhPcTlukqntSkYDvGKPK4kBA4TX8s/E1fGOfSRbsjy3a7ICkOLxn3MLf7gBLIYuSzSl0kF/Jkj3StdHWlqZsOh9UC2w+N3lzNXrs/s3NQLD8XXbZeUC/4fYqPAcjMDcyGMAp8oAqXDKFjmj1vATInTjm7JZWEREI1VEJYy5K9xBmcWM5dWyLuL3J7s+S1g7+3zofXL5uEGU3Ei/44pOnnVqpWO7NcwqhvYtBnOiAzRprVd4ryQxGiZI9H/la9HGjv0g63ymH78JkQtmc4adgatkxRIZ+Gb+hbqVa7rLnLsHhPsP5H+nIliIzhKpL2Z90rYxddJtEcysEJ1RKJmqQu1uKncC6zShS687VurAVN+qzGCy1+rdfr2iGP5ku8CujDraD1XZtKVr2xfk0HI1coYRyDpVePNY27gAGhd+ECk9yK8a+t443JisaoG2fpNZ89s16s6wBc8ut8wKoSfojjSFXwtkRpKxoDiAX3d95k1rG7asmCgppEYg5W9D1nhA0OpkpjbuuZHVws94ZXDsbZrkny6Q== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: If sendmsg() with MSG_SPLICE_PAGES encounters a page that shouldn't be spliced - a slab page, for instance, or one with a zero count - make __ip_append_data() copy it. Signed-off-by: David Howells cc: Willem de Bruijn cc: "David S. 
Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/ipv4/ip_output.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 13d19867ffd3..e34c86b1b59a 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -967,13 +967,32 @@ static int __ip_splice_pages(struct sock *sk, struct sk_buff *skb, struct page *page = NULL, **pages = &page; ssize_t copy = *pcopy; size_t off; + bool put = false; int err; copy = iov_iter_extract_pages(&msg->msg_iter, &pages, copy, 1, 0, &off); if (copy <= 0) return copy ?: -EIO; + if (!sendpage_ok(page)) { + const void *p = kmap_local_page(page); + void *q; + + q = page_frag_memdup(NULL, p + off, copy, + sk->sk_allocation, ULONG_MAX); + kunmap_local(p); + if (!q) { + iov_iter_revert(&msg->msg_iter, copy); + return -ENOMEM; + } + page = virt_to_page(q); + off = offset_in_page(q); + put = true; + } + err = skb_append_pagefrags(skb, page, off, copy); + if (put) + put_page(page); if (err < 0) { iov_iter_revert(&msg->msg_iter, copy); return err; From patchwork Wed Apr 5 16:53:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13202272 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7684CC761AF for ; Wed, 5 Apr 2023 16:54:47 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DB13B6B0092; Wed, 5 Apr 2023 12:54:44 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D60E36B0093; Wed, 5 Apr 2023 12:54:44 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C022F6B0095; Wed, 5 Apr 2023 12:54:44 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id AF9C56B0092 for ; Wed, 5 Apr 2023 12:54:44 -0400 (EDT) Received: from smtpin16.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 9147980399 for ; Wed, 5 Apr 2023 16:54:44 +0000 (UTC) X-FDA: 80647936488.16.8544C04 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf24.hostedemail.com (Postfix) with ESMTP id A3338180011 for ; Wed, 5 Apr 2023 16:54:42 +0000 (UTC) Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=gbMHLpA1; spf=pass (imf24.hostedemail.com: domain of dhowells@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=dhowells@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1680713682; a=rsa-sha256; cv=none; b=nIBPbKiKLGUdaxDqaQLRXGKL2iNIjacEfzBX2EPk8fVKHr2tbUXlvaqUgFvvYb0o7YDM7T pLQsqgpHRFu0bX2NJx/w5vfbFGAQxFQlJHSRUe7xz6OVlQ6hl73f4mwqB8hllaQRi8tPtV xNNk5j9NRcGiYcBfKdISM5ZPp/5vXCU= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=gbMHLpA1; spf=pass (imf24.hostedemail.com: domain of dhowells@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=dhowells@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; 
From: David Howells To: netdev@vger.kernel.org Cc: David Howells , "David S.
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , Matthew Wilcox , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH net-next v4 18/20] ip6, udp6: Support MSG_SPLICE_PAGES Date: Wed, 5 Apr 2023 17:53:37 +0100 Message-Id: <20230405165339.3468808-19-dhowells@redhat.com> In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com> References: <20230405165339.3468808-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Rspam-User: X-Rspamd-Queue-Id: A3338180011 X-Rspamd-Server: rspam01 X-Stat-Signature: 8oi55c55g56qgmcco14c759stokcjcx9 X-HE-Tag: 1680713682-240293 X-HE-Meta: U2FsdGVkX1+dQCo0CWSrYEPcUopLtgr2f4Sdzn5ffNasN+NvOiZvpt3fStNNjXwpT3awH7/cGPkZzixAcoCgf6GuO7g3r28sqLkyQeC14Auflt3i6Opa1ouolenMlgWVEeuH2+2wCU4vz0ZKkO1ojQ73cjItrW08UbuC/Z3u9u/j+nJvAYF7jTt8kRsFPzA7dCI+4OkxzvP0X/K8stB6eH2l/V+Munxw+KSi7CI2wf5mhrIR0daF6KsRW7wPg0N/RuzjqSx8O3N4OsM090tayeyq1tqAKJ9ZxAcZe33ypEygU4j5Qu5zlgKDjoCUfr5za2GiqFRRBqUTr1vabVl8tisneyhz99ArEQP2KZZ6rGF60C9GlZ6HgKtvygXswf+ZRe3O/pvOYwflzeXMvBNH9HnSl90yZFGSgEw0+HOzADb+LdagKH1+Kp+nwENbyUMm+8EumZmmoKDAJQumY1rZ0L68+Qg6XpCbLJQ+9mS1zdnj2EEXE004NXQrGJxNwLZ7FTp4kCFpvspNGcOCAhaDN8BVT4PY/deDhk6mi9FjulANdcv8A1YDgooRw/uGsaeUvJJA8oDtfovPeeZHRUpGwxDLQdNsTiHXuX0WqcAdwcH0CNtwo4fHaLOV7id6Lhu+jfQsLKM+ux9/nhHgECOo0AsHRiqSe9Qyv62z4/lfev0NLvbPFmvpLVRc1Um3ukz3mw13gevy6wsmEjKtf6FDaxCpChQMdx8cMQYkNFTm5cfolclaMqclcZWme6J5yquK7x0a969MWuqiWHFe0X4fuaXkGilJB3vj8n7I8bydHUivhduCPZRVypIZF+ao7UD7juQY3o3SxhAZGSq0WawnTMilh0toZcJZ/TGOgYvDGkoo8yZ+63Mc1xDxKVwgeZUvLzqjr2O6YDpfp/XSK5g2HSMcynG9ky+0g6vJgy50aMKF9ir6Zarnp4PndPb8lI2OdXJD8fzdy53W5yeVDSP FiLImA+t BGThS9fKdtpNXZ2AUnnRaDbz/e7jtt4DL4syYkZOJd+/MEGmPFYDB1Wd0M73HwH1Idlu5v1YBZIhM1Xge/ABJTLTbiiRRGFQRhh4m/x85M+1QsgExQoE8AbScQFBZ8HjWsqzZGCAWCaF+8zp8cdqFOA4vWrpbeZyLZagkNxCgY1MheWaiRn7nfoqFa13Jh23zGjMa6fp43sWEpzOHP4n7h+Jb0UPZYGQnxEROn+60+309AdQICI7ndRmXCS/b2YbfRIG2qaj2vlxKvZerpoc7hchsXKy9vVfNe4kRwejG7a6Rq8wjQTvRs9DAx7kXqfNw2i9fCEFzCAbAGDY0nn6IEi1K6vIQadrvJsE6WHHsQMuORAlBkZAn3DT7Odg049LM6m7hPFLE2rUymee/QMRksPaGKi0XmTS0RYFU8T1In/zXFfOAoF5Ljf3lX8nno2F58FyCgU8YQvUcI6tcMoxhD7Y490o8Ev4qdxhgJOzz4SRlwCWEz/rVi/DaFXxH47o8kBEyN1kRVK1lNs0XZZ7NoeMiIQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Make IP6/UDP6 sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator if possible, copying the data if not. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: Willem de Bruijn cc: "David S. 
Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- include/net/ip.h | 1 + net/ipv4/ip_output.c | 4 ++-- net/ipv6/ip6_output.c | 12 ++++++++++++ 3 files changed, 15 insertions(+), 2 deletions(-) diff --git a/include/net/ip.h b/include/net/ip.h index 7627a4df893b..8a50341007bf 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -211,6 +211,7 @@ int ip_local_out(struct net *net, struct sock *sk, struct sk_buff *skb); int __ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl, __u8 tos); void ip_init(void); +int __ip_splice_pages(struct sock *sk, struct sk_buff *skb, void *from, int *pcopy); int ip_append_data(struct sock *sk, struct flowi4 *fl4, int getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb), diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index e34c86b1b59a..241a78d82766 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -960,8 +960,7 @@ csum_page(struct page *page, int offset, int copy) /* * Add (or copy) data pages for MSG_SPLICE_PAGES. */ -static int __ip_splice_pages(struct sock *sk, struct sk_buff *skb, - void *from, int *pcopy) +int __ip_splice_pages(struct sock *sk, struct sk_buff *skb, void *from, int *pcopy) { struct msghdr *msg = from; struct page *page = NULL, **pages = &page; @@ -1010,6 +1009,7 @@ static int __ip_splice_pages(struct sock *sk, struct sk_buff *skb, *pcopy = copy; return 0; } +EXPORT_SYMBOL_GPL(__ip_splice_pages); static int __ip_append_data(struct sock *sk, struct flowi4 *fl4, diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 0b6140f0179d..82846d18cf22 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1589,6 +1589,14 @@ static int __ip6_append_data(struct sock *sk, skb_zcopy_set(skb, uarg, &extra_uref); } } + } else if ((flags & MSG_SPLICE_PAGES) && length) { + if (inet_sk(sk)->hdrincl) + return -EPERM; + if (rt->dst.dev->features & NETIF_F_SG) + /* We need an empty buffer to attach stuff to */ + paged = true; + else + flags &= ~MSG_SPLICE_PAGES; } /* @@ -1778,6 +1786,10 @@ static int __ip6_append_data(struct sock *sk, err = -EFAULT; goto error; } + } else if (flags & MSG_SPLICE_PAGES) { + err = __ip_splice_pages(sk, skb, from, ©); + if (err < 0) + goto error; } else if (!zc) { int i = skb_shinfo(skb)->nr_frags; From patchwork Wed Apr 5 16:53:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 13202271 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BDBB8C7619A for ; Wed, 5 Apr 2023 16:54:44 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 575FA6B008C; Wed, 5 Apr 2023 12:54:44 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 4FC8D6B0083; Wed, 5 Apr 2023 12:54:44 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 350C56B0095; Wed, 5 Apr 2023 12:54:44 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 24A0F6B008C for ; Wed, 5 Apr 2023 12:54:44 -0400 (EDT) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id C6E2F1C68E1 
From: David Howells To: netdev@vger.kernel.org Cc: David Howells , "David S.
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , Matthew Wilcox , Al Viro , Christoph Hellwig , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH net-next v4 19/20] af_unix: Support MSG_SPLICE_PAGES Date: Wed, 5 Apr 2023 17:53:38 +0100 Message-Id: <20230405165339.3468808-20-dhowells@redhat.com> In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com> References: <20230405165339.3468808-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 01BB1C001E X-Stat-Signature: pzwpw3uefezzxr73b3ucjsaum3arf447 X-HE-Tag: 1680713681-651721 X-HE-Meta: U2FsdGVkX19KaOUH5xgIOTW7RPft7Z0EknrhL0SOc+uUaP3Mm1XnzPvawN35b4jAred7RBTUF5LIGEs0KyN/qYl6E4l4kFdVE4YEZWmwMfKH/XK2TXBJ2/f+u96l1QdGYibk2Ur/UuMOUL2gBln5r6uPm/uXXy2MKOZ64x5U/k1aWGVJyXz/RdQHTUsCIK2ybZSP8ni9CF7IiJf3EGKlLBrJFv57qpiTAGaGC5LmrbZmZhQYDrHdV/UPB5G1u97lgOlb16hDjot5jcCKjW09F2/Yh3GeZSTcl4v3rl/1ZQErDS5gvpnqgg4b0APTSK24Oa0K4weHBNzyLIC+JZH1De3S2rKHA+98kWjPwQ3rKpStmI2sZpzuh+9JqOifGZ4ds2Nuv1bTZTvEXDuJNnEaAolEsn+GIk0DRT75EZoiR8B6VQH3s3Yt5h+0AEj4C4+QqUm/tLfKTrB6IafImFwsTrCQfuliEYl6rK2AFiif2KCBy1fesTRwkbiGn7K6sL+cuLYHKTFGMekks2zKoxuVb4PNrm82kqqYNXV5hG5yPMd37UTtnkb/3lxhIhqTo1jCC3cUaVj+36JONUNKfdbrserVAbvmc2nJwn61SKXk59B8CtI7GIYhZCLGeNulssVuCkEqEI1dWmTS01zuJrfyNjOIpDyP9sj4imn9WkF8oNVYjtzH2NedmxaeFJmGClKJ266LkaKvk6m71SzyxfAS1MfiqFCxo/hWHR8dRtHjydq4UfH35q6KNxYfgWqW1bHAI/nWuwLWA79An+F7dXTog7U1MqNq8/5Lxox1uD2y3Ez/++88pbFkqRO0r7h+bolKyDKAIrP6tSSm5XpxjeSbOVaO2l7q1ZuCyrCtGjATugGUi0CfI4UpuP38YGrEnt7JoHoGmnY2JNFNWbo7cTCJ3NPnesdvLaH1JwHZp7va+79gmimJP9JV+owf7yhObO1eFutjXxukbmBMh4vXIvO Al5j1EVz y5X5+CmM8h/yFS0scFO8QxncY2NEG42UkCUX/aAFTv0M+OpwkJr0+klZZdbC6DK029mvs+ddURnJEzqph+TzyK2r4JY4SUppCtKN64pfuE8HVT+OQXBV+hv+XUwoH/8iJD5SuC2XmfEnQp5XRgXLpxhtUuG/M12LyK0aGb2X5auyN63t0ybOh3NZ8w/b+mQY5cRUA/vAromwk3QV+w0GoM/Ws7RcyBsnheCx3TOWS1r1bNibjj5c9ytTE2W9XPp8JUCV9Z/XWGdc1fTHb7ZO65DK2WaUqYpcM+wRhQ24AITSCb3BTE040c6sxnqzMaDj8fhFkgUTluavjDTrcs6rkEdKPqBk/UW5c6ENzWSKJ7amF5RICTSFHGjHmms61l8ZCdllRXj/zMD8JPdFHZs9I1V+VMfSr4+gfRnQfL+h2B+XNcZC2qrKo6/IE7gkuCCb0Z6YLewluhLLoxAfEfBzL2n3TnJtqnC/Y5XJn+VjXNPIUY0DntEmcVyz6kQcphMyYHhBj X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Make AF_UNIX sendmsg() support MSG_SPLICE_PAGES, splicing in pages from the source iterator if possible and copying the data in otherwise. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Matthew Wilcox cc: netdev@vger.kernel.org --- net/unix/af_unix.c | 93 ++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 77 insertions(+), 16 deletions(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index fb31e8a4409e..fee431a089d3 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2157,6 +2157,53 @@ static int queue_oob(struct socket *sock, struct msghdr *msg, struct sock *other } #endif +/* + * Extract pages from an iterator and add them to the socket buffer. 
+ */
+static ssize_t unix_extract_bvec_to_skb(struct sk_buff *skb,
+					struct iov_iter *iter, ssize_t maxsize)
+{
+	struct page *pages[8], **ppages = pages;
+	unsigned int i, nr;
+	ssize_t ret = 0;
+
+	while (iter->count > 0) {
+		size_t off, len;
+
+		nr = min_t(size_t, MAX_SKB_FRAGS - skb_shinfo(skb)->nr_frags,
+			   ARRAY_SIZE(pages));
+		if (nr == 0)
+			break;
+
+		len = iov_iter_extract_pages(iter, &ppages, maxsize, nr, 0, &off);
+		if (len <= 0) {
+			if (!ret)
+				ret = len ?: -EIO;
+			break;
+		}
+
+		i = 0;
+		do {
+			size_t part = min_t(size_t, PAGE_SIZE - off, len);
+
+			if (skb_append_pagefrags(skb, pages[i++], off, part) < 0) {
+				if (!ret)
+					ret = -EMSGSIZE;
+				goto out;
+			}
+			off = 0;
+			ret += part;
+			maxsize -= part;
+			len -= part;
+		} while (len > 0);
+		if (maxsize <= 0)
+			break;
+	}
+
+out:
+	return ret;
+}
+
 static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 			       size_t len)
 {
@@ -2200,19 +2247,25 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 	while (sent < len) {
 		size = len - sent;
 
-		/* Keep two messages in the pipe so it schedules better */
-		size = min_t(int, size, (sk->sk_sndbuf >> 1) - 64);
+		if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES)) {
+			skb = sock_alloc_send_pskb(sk, 0, 0,
+						   msg->msg_flags & MSG_DONTWAIT,
+						   &err, 0);
+		} else {
+			/* Keep two messages in the pipe so it schedules better */
+			size = min_t(int, size, (sk->sk_sndbuf >> 1) - 64);
 
-		/* allow fallback to order-0 allocations */
-		size = min_t(int, size, SKB_MAX_HEAD(0) + UNIX_SKB_FRAGS_SZ);
+			/* allow fallback to order-0 allocations */
+			size = min_t(int, size, SKB_MAX_HEAD(0) + UNIX_SKB_FRAGS_SZ);
 
-		data_len = max_t(int, 0, size - SKB_MAX_HEAD(0));
+			data_len = max_t(int, 0, size - SKB_MAX_HEAD(0));
 
-		data_len = min_t(size_t, size, PAGE_ALIGN(data_len));
+			data_len = min_t(size_t, size, PAGE_ALIGN(data_len));
 
-		skb = sock_alloc_send_pskb(sk, size - data_len, data_len,
-					   msg->msg_flags & MSG_DONTWAIT, &err,
-					   get_order(UNIX_SKB_FRAGS_SZ));
+			skb = sock_alloc_send_pskb(sk, size - data_len, data_len,
+						   msg->msg_flags & MSG_DONTWAIT, &err,
+						   get_order(UNIX_SKB_FRAGS_SZ));
+		}
 		if (!skb)
 			goto out_err;
 
@@ -2224,13 +2277,21 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 		}
 		fds_sent = true;
 
-		skb_put(skb, size - data_len);
-		skb->data_len = data_len;
-		skb->len = size;
-		err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, size);
-		if (err) {
-			kfree_skb(skb);
-			goto out_err;
+		if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES)) {
+			size = unix_extract_bvec_to_skb(skb, &msg->msg_iter, size);
+			skb->data_len += size;
+			skb->len += size;
+			skb->truesize += size;
+			refcount_add(size, &sk->sk_wmem_alloc);
+		} else {
+			skb_put(skb, size - data_len);
+			skb->data_len = data_len;
+			skb->len = size;
+			err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, size);
+			if (err) {
+				kfree_skb(skb);
+				goto out_err;
+			}
 		}
 
 		unix_state_lock(other);
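The most direct way to exercise this path from userspace is splice(2) through
a pipe into one end of an AF_UNIX stream socketpair; that is the kind of
pipe-to-socket traffic that ->sendpage(), and with this series
sendmsg(MSG_SPLICE_PAGES), ends up servicing on the socket side.  A minimal
test program along those lines is sketched below; it is not part of the patch,
and the 64KiB chunk size and the absence of a receive loop are arbitrary
simplifications for the example.

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/socket.h>

int main(int argc, char *argv[])
{
	int sv[2], pfd[2], fd;
	ssize_t n;

	if (argc != 2) {
		fprintf(stderr, "Usage: %s <file>\n", argv[0]);
		exit(2);
	}

	fd = open(argv[1], O_RDONLY);
	if (fd < 0 ||
	    socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0 ||
	    pipe(pfd) < 0) {
		perror("setup");
		exit(1);
	}

	/* file -> pipe: the pipe now holds references to page-cache pages */
	n = splice(fd, NULL, pfd[1], NULL, 65536, 0);
	if (n < 0) {
		perror("splice to pipe");
		exit(1);
	}

	/* pipe -> AF_UNIX socket: the spliced-pages send path on the socket */
	n = splice(pfd[0], NULL, sv[0], NULL, n, 0);
	if (n < 0) {
		perror("splice to socket");
		exit(1);
	}

	printf("spliced %zd bytes\n", n);
	return 0;
}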
From patchwork Wed Apr 5 16:53:39 2023
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13202273
From: David Howells
To: netdev@vger.kernel.org
Cc: David Howells, "David S. Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni, Willem de Bruijn, Matthew Wilcox, Al Viro,
    Christoph Hellwig, Jens Axboe, Jeff Layton, Christian Brauner,
    Chuck Lever III, Linus Torvalds, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH net-next v4 20/20] af_unix: Make sendmsg(MSG_SPLICE_PAGES) copy unspliceable data
Date: Wed, 5 Apr 2023 17:53:39 +0100
Message-Id: <20230405165339.3468808-21-dhowells@redhat.com>
In-Reply-To: <20230405165339.3468808-1-dhowells@redhat.com>
References: <20230405165339.3468808-1-dhowells@redhat.com>
MIME-Version: 1.0

If sendmsg() with MSG_SPLICE_PAGES encounters a page that shouldn't be
spliced - a slab page, for instance, or one with a zero count - make
unix_extract_bvec_to_skb() copy it.

Signed-off-by: David Howells
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: netdev@vger.kernel.org
---
 net/unix/af_unix.c | 44 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 33 insertions(+), 11 deletions(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index fee431a089d3..6941be8dae7e 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2160,12 +2160,12 @@ static int queue_oob(struct socket *sock, struct msghdr *msg, struct sock *other
 /*
  * Extract pages from an iterator and add them to the socket buffer.
  */
-static ssize_t unix_extract_bvec_to_skb(struct sk_buff *skb,
-					struct iov_iter *iter, ssize_t maxsize)
+static ssize_t unix_extract_bvec_to_skb(struct sk_buff *skb, struct iov_iter *iter,
+					ssize_t maxsize, gfp_t gfp)
 {
 	struct page *pages[8], **ppages = pages;
 	unsigned int i, nr;
-	ssize_t ret = 0;
+	ssize_t spliced = 0, ret = 0;
 
 	while (iter->count > 0) {
 		size_t off, len;
@@ -2177,31 +2177,52 @@ static ssize_t unix_extract_bvec_to_skb(struct sk_buff *skb,
 
 		len = iov_iter_extract_pages(iter, &ppages, maxsize, nr, 0, &off);
 		if (len <= 0) {
-			if (!ret)
-				ret = len ?: -EIO;
+			ret = len ?: -EIO;
 			break;
 		}
 
 		i = 0;
 		do {
+			struct page *page = pages[i++];
 			size_t part = min_t(size_t, PAGE_SIZE - off, len);
+			bool put = false;
+
+			if (!sendpage_ok(page)) {
+				const void *p = kmap_local_page(page);
+				void *q;
+
+				q = page_frag_memdup(NULL, p + off, part, gfp,
+						     ULONG_MAX);
+				kunmap_local(p);
+				if (!q) {
+					iov_iter_revert(iter, len);
+					ret = -ENOMEM;
+					goto out;
+				}
+				page = virt_to_page(q);
+				off = offset_in_page(q);
+				put = true;
+			}
 
-			if (skb_append_pagefrags(skb, pages[i++], off, part) < 0) {
-				if (!ret)
-					ret = -EMSGSIZE;
+			ret = skb_append_pagefrags(skb, page, off, part);
+			if (put)
+				put_page(page);
+			if (ret < 0) {
+				iov_iter_revert(iter, len);
 				goto out;
 			}
 			off = 0;
-			ret += part;
+			spliced += part;
 			maxsize -= part;
 			len -= part;
 		} while (len > 0);
+
 		if (maxsize <= 0)
 			break;
 	}
 
 out:
-	return ret;
+	return spliced ?: ret;
 }
 
 static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
@@ -2278,7 +2299,8 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 		fds_sent = true;
 
 		if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES)) {
-			size = unix_extract_bvec_to_skb(skb, &msg->msg_iter, size);
+			size = unix_extract_bvec_to_skb(skb, &msg->msg_iter, size,
+							sk->sk_allocation);
 			skb->data_len += size;
 			skb->len += size;
 			skb->truesize += size;
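The test that decides between splicing and copying here is the existing
sendpage_ok() helper, which the ->sendpage() implementations already rely on.
For reference, it lives in include/linux/net.h and, lightly paraphrased,
amounts to rejecting exactly the two cases named in the commit message:
slab-allocated pages and pages whose refcount has dropped to zero.

/* Roughly what include/linux/net.h defines (paraphrased): a page may be
 * handed to the network stack by reference only if it is not backed by
 * the slab allocator and its refcount is at least one; anything else
 * must be copied, as unix_extract_bvec_to_skb() now does.
 */
static inline bool sendpage_ok(struct page *page)
{
	return !PageSlab(page) && page_count(page) >= 1;
}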