From patchwork Fri Sep 24 17:18:54 2021
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 12516207
Subject: [PATCH v3 6/9] mm: Make __swap_writepage() do async DIO if asked for it
From: David Howells
To: willy@infradead.org, hch@lst.de, trond.myklebust@primarydata.com
Cc: "Darrick J. Wong", Trond Myklebust,
    linux-nfs@vger.kernel.org, linux-block@vger.kernel.org,
    linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, dhowells@redhat.com, dhowells@redhat.com,
    darrick.wong@oracle.com, viro@zeniv.linux.org.uk, jlayton@kernel.org,
    torvalds@linux-foundation.org, linux-nfs@vger.kernel.org,
    linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org
Date: Fri, 24 Sep 2021 18:18:54 +0100
Message-ID: <163250393435.2330363.12822795853508093546.stgit@warthog.procyon.org.uk>
In-Reply-To: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk>
References: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk>
User-Agent: StGit/0.23

Make __swap_writepage()'s DIO path do sync DIO if the writeback control's
sync mode is WB_SYNC_ALL and async DIO if not.

Note that this causes hanging processes in sunrpc if the swapfile is on
NFS.  I'm not sure whether it's due to misscheduling or something else.

Suggested-by: Matthew Wilcox (Oracle)
Signed-off-by: David Howells
cc: Matthew Wilcox (Oracle)
cc: Christoph Hellwig
cc: Darrick J. Wong
cc: Trond Myklebust
cc: linux-nfs@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-xfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
---
 mm/page_io.c | 133 ++++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 92 insertions(+), 41 deletions(-)

diff --git a/mm/page_io.c b/mm/page_io.c
index 6b1465699c72..8f1199d59162 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -298,6 +298,96 @@ static void bio_associate_blkg_from_page(struct bio *bio, struct page *page)
 #define bio_associate_blkg_from_page(bio, page)		do { } while (0)
 #endif /* CONFIG_MEMCG && CONFIG_BLK_CGROUP */
 
+static void swapfile_write_complete(struct page *page, long ret)
+{
+	if (ret == thp_size(page)) {
+		count_swpout_vm_event(page);
+	} else {
+		/*
+		 * In the case of swap-over-nfs, this can be a
+		 * temporary failure if the system has limited memory
+		 * for allocating transmit buffers.  Mark the page
+		 * dirty and avoid rotate_reclaimable_page but
+		 * rate-limit the messages but do not flag PageError
+		 * like the normal direct-to-bio case as it could be
+		 * temporary.
+		 */
+		set_page_dirty(page);
+		ClearPageReclaim(page);
+		pr_err_ratelimited("Write error (%ld) on dio swapfile (%llu)\n",
+				   ret, page_file_offset(page));
+	}
+	end_page_writeback(page);
+}
+
+static void __swapfile_write_complete(struct kiocb *iocb, long ret, long ret2)
+{
+	struct swapfile_kiocb *ki = container_of(iocb, struct swapfile_kiocb, iocb);
+
+	swapfile_write_complete(iocb->ki_swap_page, ret);
+	swapfile_put_kiocb(ki);
+}
+
+static int swapfile_write_sync(struct swap_info_struct *sis,
+			       struct page *page, struct writeback_control *wbc,
+			       struct iov_iter *from)
+{
+	struct kiocb kiocb;
+	struct file *swap_file = sis->swap_file;
+	int ret;
+
+	init_sync_kiocb(&kiocb, swap_file);
+	kiocb.ki_swap_page = page;
+	kiocb.ki_pos	= page_file_offset(page);
+	kiocb.ki_flags	= IOCB_DIRECT | IOCB_WRITE | IOCB_SWAP;
+
+	set_page_writeback(page);
+	unlock_page(page);
+
+	ret = swap_file->f_mapping->a_ops->swap_rw(&kiocb, from);
+	swapfile_write_complete(page, ret);
+	return ret == page_size(page) ? 0 : ret >= 0 ? -ENODATA : ret;
+}
+
+static int swapfile_write(struct swap_info_struct *sis,
+			  struct page *page, struct writeback_control *wbc)
+{
+	struct swapfile_kiocb *ki;
+	struct file *swap_file = sis->swap_file;
+	struct bio_vec bv = {
+		.bv_page = page,
+		.bv_len  = page_size(page),
+		.bv_offset = 0
+	};
+	struct iov_iter from;
+	int ret;
+
+	iov_iter_bvec(&from, WRITE, &bv, 1, PAGE_SIZE);
+
+	if (wbc->sync_mode == WB_SYNC_ALL)
+		return swapfile_write_sync(sis, page, wbc, &from);
+
+	ki = kzalloc(sizeof(*ki), GFP_KERNEL);
+	if (!ki)
+		return -ENOMEM;
+
+	refcount_set(&ki->ref, 2);
+	init_sync_kiocb(&ki->iocb, swap_file);
+	ki->iocb.ki_swap_page = page;
+	ki->iocb.ki_pos	= page_file_offset(page);
+	ki->iocb.ki_flags = IOCB_DIRECT | IOCB_WRITE | IOCB_SWAP;
+	ki->iocb.ki_complete = __swapfile_write_complete;
+
+	set_page_writeback(page);
+	unlock_page(page);
+	ret = swap_file->f_mapping->a_ops->swap_rw(&ki->iocb, &from);
+
+	if (ret != -EIOCBQUEUED)
+		__swapfile_write_complete(&ki->iocb, ret, 0);
+	swapfile_put_kiocb(ki);
+	return ret == page_size(page) ? 0 : ret >= 0 ? -ENODATA : ret;
+}
+
 int __swap_writepage(struct page *page, struct writeback_control *wbc)
 {
 	struct bio *bio;
@@ -305,47 +395,8 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc)
 	struct swap_info_struct *sis = page_swap_info(page);
 
 	VM_BUG_ON_PAGE(!PageSwapCache(page), page);
-	if (data_race(sis->flags & SWP_FS_OPS)) {
-		struct kiocb kiocb;
-		struct file *swap_file = sis->swap_file;
-		struct address_space *mapping = swap_file->f_mapping;
-		struct bio_vec bv = {
-			.bv_page = page,
-			.bv_len  = PAGE_SIZE,
-			.bv_offset = 0
-		};
-		struct iov_iter from;
-
-		iov_iter_bvec(&from, WRITE, &bv, 1, PAGE_SIZE);
-		init_sync_kiocb(&kiocb, swap_file);
-		kiocb.ki_pos = page_file_offset(page);
-		kiocb.ki_flags = IOCB_DIRECT | IOCB_WRITE | IOCB_SWAP;
-
-		set_page_writeback(page);
-		unlock_page(page);
-		ret = mapping->a_ops->swap_rw(&kiocb, &from);
-		if (ret == PAGE_SIZE) {
-			count_vm_event(PSWPOUT);
-			ret = 0;
-		} else {
-			/*
-			 * In the case of swap-over-nfs, this can be a
-			 * temporary failure if the system has limited
-			 * memory for allocating transmit buffers.
-			 * Mark the page dirty and avoid
-			 * rotate_reclaimable_page but rate-limit the
-			 * messages but do not flag PageError like
-			 * the normal direct-to-bio case as it could
-			 * be temporary.
-			 */
-			set_page_dirty(page);
-			ClearPageReclaim(page);
-			pr_err_ratelimited("Write error (%d) on dio swapfile (%llu)\n",
-					   ret, page_file_offset(page));
-		}
-		end_page_writeback(page);
-		return ret;
-	}
+	if (data_race(sis->flags & SWP_FS_OPS))
+		return swapfile_write(sis, page, wbc);
 
 	ret = bdev_write_page(sis->bdev, swap_page_sector(page), page, wbc);
 	if (!ret) {
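
Editorial note, not part of the patch: the async path in swapfile_write() sets the kiocb reference count to 2, runs the completion handler itself when swap_rw() returns anything other than -EIOCBQUEUED, and then drops the submitter's reference. The user-space sketch below models only that bookkeeping, plus the "ret == page_size(page) ? 0 : ret >= 0 ? -ENODATA : ret" mapping. Every name in it (sketch_req, sketch_put, sketch_complete, sketch_submit_async, sketch_map_result, SKETCH_QUEUED) is hypothetical, "freed" is a flag standing in for whatever swapfile_put_kiocb() does when the count reaches zero (that helper is not shown in this patch), and nothing here is kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for -EIOCBQUEUED, which user-space errno.h lacks. */
#define SKETCH_QUEUED (-529L)

/* Hypothetical request object; models only the reference bookkeeping. */
struct sketch_req {
	int refs;         /* outstanding references */
	int completions;  /* how many times completion ran */
	bool freed;       /* set when the last reference is dropped */
};

/* Map a backend result to 0, -ENODATA (short write) or the error itself. */
static int sketch_map_result(long ret, long expected)
{
	if (ret == expected)
		return 0;
	return ret >= 0 ? -ENODATA : (int)ret;
}

/* Drop one reference; the last put marks the request as freed. */
static void sketch_put(struct sketch_req *req)
{
	assert(req->refs > 0);
	if (--req->refs == 0)
		req->freed = true;
}

/* Completion handler: record the completion, then drop its reference. */
static void sketch_complete(struct sketch_req *req)
{
	req->completions++;
	sketch_put(req);
}

/*
 * Submit with two references: one for the completion handler and one
 * for the submitter. If the backend did not queue the request, the
 * submitter runs the completion itself before dropping its own ref.
 */
static void sketch_submit_async(struct sketch_req *req, long backend_ret)
{
	req->refs = 2;
	req->completions = 0;
	req->freed = false;

	if (backend_ret != SKETCH_QUEUED)
		sketch_complete(req);
	sketch_put(req);
}
```

In this model a request that completes inline is released before sketch_submit_async() returns, while a queued request keeps one reference until the backend later calls sketch_complete(). Either way the completion runs once and the object is released once.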