From patchwork Tue Jun 20 14:53:33 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13286023
From: David Howells
To: netdev@vger.kernel.org
Cc: David Howells, Alexander Duyck, "David S. Miller", Eric Dumazet,
 Jakub Kicinski, Paolo Abeni, Willem de Bruijn, David Ahern,
 Matthew Wilcox, Jens Axboe, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Philipp Reisner, Lars Ellenberg,
 Christoph Böhmwalder, drbd-dev@lists.linbit.com,
 linux-block@vger.kernel.org
Subject: [PATCH net-next v3 14/18] drbd: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()
Date: Tue, 20 Jun 2023 15:53:33 +0100
Message-ID: <20230620145338.1300897-15-dhowells@redhat.com>
In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com>
References: <20230620145338.1300897-1-dhowells@redhat.com>
X-Mailing-List: linux-block@vger.kernel.org

Use sendmsg() conditionally with MSG_SPLICE_PAGES in _drbd_send_page() rather
than calling sendpage() or _drbd_no_send_page().

Signed-off-by: David Howells
cc: Philipp Reisner
cc: Lars Ellenberg
cc: "Christoph Böhmwalder"
cc: Jens Axboe
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: drbd-dev@lists.linbit.com
cc: linux-block@vger.kernel.org
cc: netdev@vger.kernel.org
---

Notes:
    ver #2)
     - Wrap lines at 80.

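Illustration (not part of the patch): the reworked loop no longer keeps its
own len/offset counters, because sock_sendmsg() advances msg.msg_iter and
msg_data_left() reports what is still unsent. The userspace sketch below
models only that control flow. struct model_msg, model_sendmsg() and
model_send_all() are hypothetical stand-ins invented for this note, not
kernel APIs, and the -EAGAIN retry path is left out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for msghdr + iov_iter: it tracks only the number
 * of bytes that have not been sent yet. */
struct model_msg {
	size_t data_left;
};

/* Hypothetical transport that accepts at most `chunk` bytes per call and
 * advances the iterator itself, as sock_sendmsg() does with msg_iter. */
static int model_sendmsg(struct model_msg *msg, size_t chunk)
{
	size_t n = msg->data_left < chunk ? msg->data_left : chunk;

	if (n == 0)
		return -EIO;
	msg->data_left -= n;
	return (int)n;
}

/* Same shape as the reworked loop: the remaining byte count is read back
 * from the message instead of being tracked in local variables. */
static int model_send_all(size_t size, size_t chunk, int *calls)
{
	struct model_msg msg = { .data_left = size };
	int err = -EIO;

	*calls = 0;
	do {
		int sent = model_sendmsg(&msg, chunk);

		(*calls)++;
		if (sent <= 0) {
			if (sent < 0)
				err = sent;
			break;
		}
	} while (msg.data_left);

	if (!msg.data_left)
		err = 0;
	return err;
}
```

In this model, a 4096-byte payload pushed through a transport that takes
1500 bytes per call completes in three calls, and the caller never adjusts
an offset by hand.
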
 drivers/block/drbd/drbd_main.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 83987e7a5ef2..8a01a18a2550 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -1540,7 +1540,8 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa
 		    int offset, size_t size, unsigned msg_flags)
 {
 	struct socket *socket = peer_device->connection->data.socket;
-	int len = size;
+	struct bio_vec bvec;
+	struct msghdr msg = { .msg_flags = msg_flags, };
 	int err = -EIO;
 
 	/* e.g. XFS meta- & log-data is in slab pages, which have a
@@ -1549,33 +1550,35 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa
 	 * put_page(); and would cause either a VM_BUG directly, or
 	 * __page_cache_release a page that would actually still be referenced
 	 * by someone, leading to some obscure delayed Oops somewhere else. */
-	if (drbd_disable_sendpage || !sendpage_ok(page))
-		return _drbd_no_send_page(peer_device, page, offset, size, msg_flags);
+	if (!drbd_disable_sendpage && sendpage_ok(page))
+		msg.msg_flags |= MSG_NOSIGNAL | MSG_SPLICE_PAGES;
+
+	bvec_set_page(&bvec, page, offset, size);
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
 
-	msg_flags |= MSG_NOSIGNAL;
 	drbd_update_congested(peer_device->connection);
 	do {
 		int sent;
 
-		sent = socket->ops->sendpage(socket, page, offset, len, msg_flags);
+		sent = sock_sendmsg(socket, &msg);
 		if (sent <= 0) {
 			if (sent == -EAGAIN) {
 				if (we_should_drop_the_connection(peer_device->connection, socket))
 					break;
 				continue;
 			}
-			drbd_warn(peer_device->device, "%s: size=%d len=%d sent=%d\n",
-			     __func__, (int)size, len, sent);
+			drbd_warn(peer_device->device, "%s: size=%d len=%zu sent=%d\n",
+				  __func__, (int)size, msg_data_left(&msg),
+				  sent);
 			if (sent < 0)
 				err = sent;
 			break;
 		}
-		len -= sent;
-		offset += sent;
-	} while (len > 0 /* THINK && device->cstate >= C_CONNECTED*/);
+	} while (msg_data_left(&msg)
+		 /* THINK && device->cstate >= C_CONNECTED*/);
 	clear_bit(NET_CONGESTED, &peer_device->connection->flags);
 
-	if (len == 0) {
+	if (!msg_data_left(&msg)) {
 		err = 0;
 		peer_device->device->send_cnt += size >> 9;
 	}

From patchwork Tue Jun 20 14:53:34 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13286022
From: David Howells
To: netdev@vger.kernel.org
Cc: David Howells, Alexander Duyck, "David S. Miller", Eric Dumazet,
 Jakub Kicinski, Paolo Abeni, Willem de Bruijn, David Ahern,
 Matthew Wilcox, Jens Axboe, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Philipp Reisner, Lars Ellenberg,
 Christoph Böhmwalder, drbd-dev@lists.linbit.com,
 linux-block@vger.kernel.org
Subject: [PATCH net-next v3 15/18] drbd: Send an entire bio in a single sendmsg
Date: Tue, 20 Jun 2023 15:53:34 +0100
Message-ID: <20230620145338.1300897-16-dhowells@redhat.com>
In-Reply-To: <20230620145338.1300897-1-dhowells@redhat.com>
References: <20230620145338.1300897-1-dhowells@redhat.com>
X-Mailing-List: linux-block@vger.kernel.org

Since _drbd_send_page() is now using sendmsg() to send the pages rather than
sendpage(), pass the entire bio in one go using a bvec iterator instead of
doing it piecemeal.

Signed-off-by: David Howells
cc: Philipp Reisner
cc: Lars Ellenberg
cc: "Christoph Böhmwalder"
cc: Jens Axboe
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: drbd-dev@lists.linbit.com
cc: linux-block@vger.kernel.org
cc: netdev@vger.kernel.org
---

Notes:
    ver #2)
     - Use "unsigned int" rather than "unsigned".

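Illustration (not part of the patch): handing all segments to a single
sendmsg() call, rather than issuing one call per segment, has a close
userspace analogue in scatter-gather I/O. The sketch below assumes a local
AF_UNIX socket pair, with a struct iovec array playing the role that
bio->bi_io_vec plays in the kernel. send_segments_once() and
demo_single_sendmsg() are hypothetical helper names used only here.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Queue every segment with one sendmsg() call instead of one call per
 * segment. Returns the number of bytes queued, or -1 on error. */
static ssize_t send_segments_once(int fd, struct iovec *segs, size_t nsegs)
{
	struct msghdr msg;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = segs;
	msg.msg_iovlen = nsegs;
	return sendmsg(fd, &msg, MSG_NOSIGNAL);
}

/* Round trip over a socket pair. Returns 0 if the receiver sees the three
 * segments concatenated in order, -1 otherwise. */
static int demo_single_sendmsg(void)
{
	char a[] = "head-", b[] = "body-", c[] = "tail";
	struct iovec segs[3] = {
		{ .iov_base = a, .iov_len = strlen(a) },
		{ .iov_base = b, .iov_len = strlen(b) },
		{ .iov_base = c, .iov_len = strlen(c) },
	};
	size_t total = strlen(a) + strlen(b) + strlen(c);
	char rx[32] = { 0 };
	ssize_t sent, got = -1;
	int sv[2];

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	sent = send_segments_once(sv[0], segs, 3);
	/* Only wait for data that was actually queued. */
	if (sent == (ssize_t)total)
		got = recv(sv[1], rx, total, MSG_WAITALL);

	close(sv[0]);
	close(sv[1]);

	if (got != (ssize_t)total)
		return -1;
	return strcmp(rx, "head-body-tail") == 0 ? 0 : -1;
}
```

The analogy is loose: userspace sendmsg() does not advance the iovec on a
short write, whereas the kernel loop relies on sock_sendmsg() advancing
msg_iter.
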
 drivers/block/drbd/drbd_main.c | 77 +++++++++++-----------------------
 1 file changed, 25 insertions(+), 52 deletions(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 8a01a18a2550..beba74ae093b 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -1520,28 +1520,15 @@ static void drbd_update_congested(struct drbd_connection *connection)
  * As a workaround, we disable sendpage on pages
  * with page_count == 0 or PageSlab.
  */
-static int _drbd_no_send_page(struct drbd_peer_device *peer_device, struct page *page,
-			      int offset, size_t size, unsigned msg_flags)
-{
-	struct socket *socket;
-	void *addr;
-	int err;
-
-	socket = peer_device->connection->data.socket;
-	addr = kmap(page) + offset;
-	err = drbd_send_all(peer_device->connection, socket, addr, size, msg_flags);
-	kunmap(page);
-	if (!err)
-		peer_device->device->send_cnt += size >> 9;
-	return err;
-}
-
-static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *page,
-		    int offset, size_t size, unsigned msg_flags)
+static int _drbd_send_pages(struct drbd_peer_device *peer_device,
+			    struct iov_iter *iter, unsigned int msg_flags)
 {
 	struct socket *socket = peer_device->connection->data.socket;
-	struct bio_vec bvec;
-	struct msghdr msg = { .msg_flags = msg_flags, };
+	struct msghdr msg = {
+		.msg_flags = msg_flags | MSG_NOSIGNAL,
+		.msg_iter = *iter,
+	};
+	size_t size = iov_iter_count(iter);
 	int err = -EIO;
 
 	/* e.g. XFS meta- & log-data is in slab pages, which have a
@@ -1550,11 +1537,8 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa
 	 * put_page(); and would cause either a VM_BUG directly, or
 	 * __page_cache_release a page that would actually still be referenced
 	 * by someone, leading to some obscure delayed Oops somewhere else. */
-	if (!drbd_disable_sendpage && sendpage_ok(page))
-		msg.msg_flags |= MSG_NOSIGNAL | MSG_SPLICE_PAGES;
-
-	bvec_set_page(&bvec, page, offset, size);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
+	if (drbd_disable_sendpage)
+		msg.msg_flags &= ~(MSG_NOSIGNAL | MSG_SPLICE_PAGES);
 
 	drbd_update_congested(peer_device->connection);
 	do {
@@ -1587,39 +1571,22 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa
 
 static int _drbd_send_bio(struct drbd_peer_device *peer_device, struct bio *bio)
 {
-	struct bio_vec bvec;
-	struct bvec_iter iter;
+	struct iov_iter iter;
 
-	/* hint all but last page with MSG_MORE */
-	bio_for_each_segment(bvec, bio, iter) {
-		int err;
+	iov_iter_bvec(&iter, ITER_SOURCE, bio->bi_io_vec, bio->bi_vcnt,
+		      bio->bi_iter.bi_size);
 
-		err = _drbd_no_send_page(peer_device, bvec.bv_page,
-					 bvec.bv_offset, bvec.bv_len,
-					 bio_iter_last(bvec, iter)
-					 ? 0 : MSG_MORE);
-		if (err)
-			return err;
-	}
-	return 0;
+	return _drbd_send_pages(peer_device, &iter, 0);
 }
 
 static int _drbd_send_zc_bio(struct drbd_peer_device *peer_device, struct bio *bio)
 {
-	struct bio_vec bvec;
-	struct bvec_iter iter;
+	struct iov_iter iter;
 
-	/* hint all but last page with MSG_MORE */
-	bio_for_each_segment(bvec, bio, iter) {
-		int err;
+	iov_iter_bvec(&iter, ITER_SOURCE, bio->bi_io_vec, bio->bi_vcnt,
+		      bio->bi_iter.bi_size);
 
-		err = _drbd_send_page(peer_device, bvec.bv_page,
-				      bvec.bv_offset, bvec.bv_len,
-				      bio_iter_last(bvec, iter) ? 0 : MSG_MORE);
-		if (err)
-			return err;
-	}
-	return 0;
+	return _drbd_send_pages(peer_device, &iter, MSG_SPLICE_PAGES);
 }
 
 static int _drbd_send_zc_ee(struct drbd_peer_device *peer_device,
@@ -1631,10 +1598,16 @@ static int _drbd_send_zc_ee(struct drbd_peer_device *peer_device,
 
 	/* hint all but last page with MSG_MORE */
 	page_chain_for_each(page) {
+		struct iov_iter iter;
+		struct bio_vec bvec;
 		unsigned l = min_t(unsigned, len, PAGE_SIZE);
 
-		err = _drbd_send_page(peer_device, page, 0, l,
-				      page_chain_next(page) ? MSG_MORE : 0);
+		bvec_set_page(&bvec, page, 0, l);
+		iov_iter_bvec(&iter, ITER_SOURCE, &bvec, 1, l);
+
+		err = _drbd_send_pages(peer_device, &iter,
+				       MSG_SPLICE_PAGES |
+				       (page_chain_next(page) ? MSG_MORE : 0));
 		if (err)
 			return err;
 		len -= l;