From patchwork Fri Sep 24 17:18:05 2021
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 12516187
Subject: [PATCH v3 1/9] mm: Remove the callback func argument from __swap_writepage()
From: David Howells
To: willy@infradead.org, hch@lst.de, trond.myklebust@primarydata.com
Cc: "Darrick J. Wong", Seth Jennings, Bob Liu, Minchan Kim, Dan Magenheimer, linux-block@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, dhowells@redhat.com, dhowells@redhat.com, darrick.wong@oracle.com, viro@zeniv.linux.org.uk, jlayton@kernel.org, torvalds@linux-foundation.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Fri, 24 Sep 2021 18:18:05 +0100
Message-ID: <163250388519.2330363.14896768040342703526.stgit@warthog.procyon.org.uk>
In-Reply-To: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk>
References: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk>
X-Mailing-List: linux-nfs@vger.kernel.org

Remove the callback func argument from __swap_writepage() as it's end_swap_bio_write() in both places that call it.

This reverts:

commit 1eec6702a80e04416d528846a5ff2122484d95ec
mm: allow for outstanding swap writeback accounting

Signed-off-by: David Howells Reviewed-by: Christoph Hellwig cc: Matthew Wilcox (Oracle) cc: Darrick J.
Wong cc: Seth Jennings cc: Bob Liu cc: Minchan Kim cc: Dan Magenheimer cc: linux-block@vger.kernel.org cc: linux-xfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org --- include/linux/swap.h | 4 +--- mm/page_io.c | 9 ++++----- mm/zswap.c | 2 +- 3 files changed, 6 insertions(+), 9 deletions(-) diff --git a/include/linux/swap.h b/include/linux/swap.h index ba52f3a3478e..576d40e33b1f 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -418,9 +418,7 @@ extern void kswapd_stop(int nid); /* linux/mm/page_io.c */ extern int swap_readpage(struct page *page, bool do_poll); extern int swap_writepage(struct page *page, struct writeback_control *wbc); -extern void end_swap_bio_write(struct bio *bio); -extern int __swap_writepage(struct page *page, struct writeback_control *wbc, - bio_end_io_t end_write_func); +int __swap_writepage(struct page *page, struct writeback_control *wbc); extern int swap_set_page_dirty(struct page *page); int add_swap_extent(struct swap_info_struct *sis, unsigned long start_page, diff --git a/mm/page_io.c b/mm/page_io.c index c493ce9ebcf5..afd18f6ec09e 100644 --- a/mm/page_io.c +++ b/mm/page_io.c @@ -26,7 +26,7 @@ #include #include -void end_swap_bio_write(struct bio *bio) +static void end_swap_bio_write(struct bio *bio) { struct page *page = bio_first_page_all(bio); @@ -249,7 +249,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc) end_page_writeback(page); goto out; } - ret = __swap_writepage(page, wbc, end_swap_bio_write); + ret = __swap_writepage(page, wbc); out: return ret; } @@ -282,8 +282,7 @@ static void bio_associate_blkg_from_page(struct bio *bio, struct page *page) #define bio_associate_blkg_from_page(bio, page) do { } while (0) #endif /* CONFIG_MEMCG && CONFIG_BLK_CGROUP */ -int __swap_writepage(struct page *page, struct writeback_control *wbc, - bio_end_io_t end_write_func) +int __swap_writepage(struct page *page, struct writeback_control *wbc) { struct bio *bio; int ret; @@ -341,7 
+340,7 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc, bio_set_dev(bio, sis->bdev); bio->bi_iter.bi_sector = swap_page_sector(page); bio->bi_opf = REQ_OP_WRITE | REQ_SWAP | wbc_to_write_flags(wbc); - bio->bi_end_io = end_write_func; + bio->bi_end_io = end_swap_bio_write; bio_add_page(bio, page, thp_size(page), 0); bio_associate_blkg_from_page(bio, page); diff --git a/mm/zswap.c b/mm/zswap.c index 7944e3e57e78..f38e34917aa3 100644 --- a/mm/zswap.c +++ b/mm/zswap.c @@ -1011,7 +1011,7 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle) SetPageReclaim(page); /* start writeback */ - __swap_writepage(page, &wbc, end_swap_bio_write); + __swap_writepage(page, &wbc); put_page(page); zswap_written_back_pages++;

From patchwork Fri Sep 24 17:18:14 2021
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 12516239
Subject: [PATCH v3 2/9] mm: Add 'supports' field to the address_space_operations to list features
From: David Howells
To: willy@infradead.org, hch@lst.de, trond.myklebust@primarydata.com
Cc: "Darrick J. Wong", Ilya Dryomov, Jeff Layton, ceph-devel@vger.kernel.org, Steve French, linux-cifs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, dhowells@redhat.com, dhowells@redhat.com, darrick.wong@oracle.com, viro@zeniv.linux.org.uk, jlayton@kernel.org, torvalds@linux-foundation.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Fri, 24 Sep 2021 18:18:14 +0100
Message-ID: <163250389458.2330363.17234460134406104577.stgit@warthog.procyon.org.uk>
In-Reply-To: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk>
References: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk>
X-Mailing-List: linux-nfs@vger.kernel.org

Rather than depending on .direct_IO to point to something to indicate that direct I/O is supported, add a 'supports' bitmask that we can test, since we only need one bit. We can then remove noop_direct_IO, ceph_direct_io and cifs_direct_io.

[Question: Some filesystems support read DIO but not write DIO - should I split the flag?]

Signed-off-by: David Howells cc: Matthew Wilcox cc: Christoph Hellwig cc: Darrick J.
Wong cc: Ilya Dryomov cc: Jeff Layton cc: ceph-devel@vger.kernel.org cc: Steve French cc: linux-cifs@vger.kernel.org cc: linux-xfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org --- Documentation/filesystems/vfs.rst | 8 ++++++++ block/fops.c | 1 + drivers/block/loop.c | 6 +++--- fs/9p/vfs_addr.c | 1 + fs/affs/file.c | 1 + fs/btrfs/inode.c | 2 +- fs/ceph/addr.c | 13 +------------ fs/cifs/file.c | 21 +-------------------- fs/erofs/data.c | 2 +- fs/exfat/inode.c | 1 + fs/ext2/inode.c | 4 +++- fs/ext4/inode.c | 8 ++++---- fs/f2fs/data.c | 1 + fs/fat/inode.c | 1 + fs/fcntl.c | 2 +- fs/fuse/dax.c | 2 +- fs/fuse/file.c | 1 + fs/gfs2/aops.c | 2 +- fs/hfs/inode.c | 1 + fs/hfsplus/inode.c | 1 + fs/jfs/inode.c | 1 + fs/libfs.c | 12 ------------ fs/nfs/file.c | 1 + fs/nilfs2/inode.c | 1 + fs/ntfs3/inode.c | 1 + fs/ocfs2/aops.c | 1 + fs/open.c | 3 ++- fs/orangefs/inode.c | 1 + fs/overlayfs/file.c | 2 +- fs/overlayfs/inode.c | 3 +-- fs/reiserfs/inode.c | 1 + fs/udf/file.c | 1 + fs/udf/inode.c | 1 + fs/xfs/xfs_aops.c | 4 ++-- fs/zonefs/super.c | 2 +- include/linux/fs.h | 4 +++- 36 files changed, 53 insertions(+), 65 deletions(-) diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst index bf5c48066fac..abb844792d6a 100644 --- a/Documentation/filesystems/vfs.rst +++ b/Documentation/filesystems/vfs.rst @@ -721,6 +721,7 @@ cache in your filesystem. The following members are defined: .. code-block:: c struct address_space_operations { + unsigned int supports; int (*writepage)(struct page *page, struct writeback_control *wbc); int (*readpage)(struct file *, struct page *); int (*writepages)(struct address_space *, struct writeback_control *); @@ -755,6 +756,13 @@ cache in your filesystem. The following members are defined: int (*swap_deactivate)(struct file *); }; +``supports`` + provides a list of features supported by address_spaces using this + operations set. 
The following feature support flags are provided: + + ``AS_SUPPORTS_DIRECT_IO`` + Direct I/O is supported. + ``writepage`` called by the VM to write a dirty page to backing store. This may happen for data integrity reasons (i.e. 'sync'), or to free diff --git a/block/fops.c b/block/fops.c index ffce6f6c68dd..84c64d814d0d 100644 --- a/block/fops.c +++ b/block/fops.c @@ -384,6 +384,7 @@ const struct address_space_operations def_blk_aops = { .direct_IO = blkdev_direct_IO, .migratepage = buffer_migrate_page_norefs, .is_dirty_writeback = buffer_check_dirty_writeback, + .supports = AS_SUPPORTS_DIRECT_IO, }; /* diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 7bf4686af774..76f7a6d85815 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -237,9 +237,9 @@ static void __loop_update_dio(struct loop_device *lo, bool dio) */ if (dio) { if (queue_logical_block_size(lo->lo_queue) >= sb_bsize && - !(lo->lo_offset & dio_align) && - mapping->a_ops->direct_IO && - !lo->transfer) + !(lo->lo_offset & dio_align) && + (mapping->a_ops->supports & AS_SUPPORTS_DIRECT_IO) && + !lo->transfer) use_dio = true; else use_dio = false; diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c index cce9ace651a2..4910898af0d7 100644 --- a/fs/9p/vfs_addr.c +++ b/fs/9p/vfs_addr.c @@ -333,4 +333,5 @@ const struct address_space_operations v9fs_addr_operations = { .invalidatepage = v9fs_invalidate_page, .launder_page = v9fs_launder_page, .direct_IO = v9fs_direct_IO, + .supports = AS_SUPPORTS_DIRECT_IO, }; diff --git a/fs/affs/file.c b/fs/affs/file.c index 75ebd2b576ca..7488bd7d3e0c 100644 --- a/fs/affs/file.c +++ b/fs/affs/file.c @@ -460,6 +460,7 @@ const struct address_space_operations affs_aops = { .write_end = affs_write_end, .direct_IO = affs_direct_IO, + .supports = AS_SUPPORTS_DIRECT_IO, .bmap = _affs_bmap }; static inline struct buffer_head * diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 487533c35ddb..b479c97e42fc 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@
-10937,7 +10937,6 @@ static const struct address_space_operations btrfs_aops = { .writepage = btrfs_writepage, .writepages = btrfs_writepages, .readahead = btrfs_readahead, - .direct_IO = noop_direct_IO, .invalidatepage = btrfs_invalidatepage, .releasepage = btrfs_releasepage, #ifdef CONFIG_MIGRATION @@ -10947,6 +10946,7 @@ static const struct address_space_operations btrfs_aops = { .error_remove_page = generic_error_remove_page, .swap_activate = btrfs_swap_activate, .swap_deactivate = btrfs_swap_deactivate, + .supports = AS_SUPPORTS_DIRECT_IO, }; static const struct inode_operations btrfs_file_inode_operations = { diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 99b80b5c7a93..086d4745b99e 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -1306,17 +1306,6 @@ static int ceph_write_end(struct file *file, struct address_space *mapping, return copied; } -/* - * we set .direct_IO to indicate direct io is supported, but since we - * intercept O_DIRECT reads and writes early, this function should - * never get called. - */ -static ssize_t ceph_direct_io(struct kiocb *iocb, struct iov_iter *iter) -{ - WARN_ON(1); - return -EINVAL; -} - const struct address_space_operations ceph_aops = { .readpage = ceph_readpage, .readahead = ceph_readahead, @@ -1327,7 +1316,7 @@ const struct address_space_operations ceph_aops = { .set_page_dirty = ceph_set_page_dirty, .invalidatepage = ceph_invalidatepage, .releasepage = ceph_releasepage, - .direct_IO = ceph_direct_io, + .supports = AS_SUPPORTS_DIRECT_IO, }; static void ceph_block_sigs(sigset_t *oldset) diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 6796fc73b304..a5787cf3d836 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -4891,25 +4891,6 @@ void cifs_oplock_break(struct work_struct *work) cifs_done_oplock_break(cinode); } -/* - * The presence of cifs_direct_io() in the address space ops vector - * allowes open() O_DIRECT flags which would have failed otherwise. 
- * - * In the non-cached mode (mount with cache=none), we shunt off direct read and write requests - * so this method should never be called. - * - * Direct IO is not yet supported in the cached mode. - */ -static ssize_t -cifs_direct_io(struct kiocb *iocb, struct iov_iter *iter) -{ - /* - * FIXME - * Eventually need to support direct IO for non forcedirectio mounts - */ - return -EINVAL; -} - static int cifs_swap_activate(struct swap_info_struct *sis, struct file *swap_file, sector_t *span) { @@ -4974,7 +4955,6 @@ const struct address_space_operations cifs_addr_ops = { .write_end = cifs_write_end, .set_page_dirty = __set_page_dirty_nobuffers, .releasepage = cifs_release_page, - .direct_IO = cifs_direct_io, .invalidatepage = cifs_invalidate_page, .launder_page = cifs_launder_page, /* @@ -4984,6 +4964,7 @@ const struct address_space_operations cifs_addr_ops = { */ .swap_activate = cifs_swap_activate, .swap_deactivate = cifs_swap_deactivate, + .supports = AS_SUPPORTS_DIRECT_IO, }; /* diff --git a/fs/erofs/data.c b/fs/erofs/data.c index 9db829715652..30f19296b268 100644 --- a/fs/erofs/data.c +++ b/fs/erofs/data.c @@ -299,7 +299,7 @@ const struct address_space_operations erofs_raw_access_aops = { .readpage = erofs_readpage, .readahead = erofs_readahead, .bmap = erofs_bmap, - .direct_IO = noop_direct_IO, + .supports = AS_SUPPORTS_DIRECT_IO, }; #ifdef CONFIG_FS_DAX diff --git a/fs/exfat/inode.c b/fs/exfat/inode.c index ca37d4344361..f38f42282f54 100644 --- a/fs/exfat/inode.c +++ b/fs/exfat/inode.c @@ -500,6 +500,7 @@ static const struct address_space_operations exfat_aops = { .write_end = exfat_write_end, .direct_IO = exfat_direct_IO, + .supports = AS_SUPPORTS_DIRECT_IO, .bmap = exfat_aop_bmap }; static inline unsigned long exfat_hash(loff_t i_pos) diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c index 333fa62661d5..4ad3655defd9 100644 --- a/fs/ext2/inode.c +++ b/fs/ext2/inode.c @@ -974,6 +974,7 @@ const struct address_space_operations ext2_aops = { .migratepage =
buffer_migrate_page, .is_partially_uptodate = block_is_partially_uptodate, .error_remove_page = generic_error_remove_page, + .supports = AS_SUPPORTS_DIRECT_IO, }; const struct address_space_operations ext2_nobh_aops = { @@ -988,13 +989,14 @@ const struct address_space_operations ext2_nobh_aops = { .writepages = ext2_writepages, .migratepage = buffer_migrate_page, .error_remove_page = generic_error_remove_page, + .supports = AS_SUPPORTS_DIRECT_IO, }; static const struct address_space_operations ext2_dax_aops = { .writepages = ext2_dax_writepages, - .direct_IO = noop_direct_IO, .set_page_dirty = __set_page_dirty_no_writeback, .invalidatepage = noop_invalidatepage, + .supports = AS_SUPPORTS_DIRECT_IO, }; /* diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index d18852d6029c..08d3541d8daa 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3662,11 +3662,11 @@ static const struct address_space_operations ext4_aops = { .bmap = ext4_bmap, .invalidatepage = ext4_invalidatepage, .releasepage = ext4_releasepage, - .direct_IO = noop_direct_IO, .migratepage = buffer_migrate_page, .is_partially_uptodate = block_is_partially_uptodate, .error_remove_page = generic_error_remove_page, .swap_activate = ext4_iomap_swap_activate, + .supports = AS_SUPPORTS_DIRECT_IO, }; static const struct address_space_operations ext4_journalled_aops = { @@ -3680,10 +3680,10 @@ static const struct address_space_operations ext4_journalled_aops = { .bmap = ext4_bmap, .invalidatepage = ext4_journalled_invalidatepage, .releasepage = ext4_releasepage, - .direct_IO = noop_direct_IO, .is_partially_uptodate = block_is_partially_uptodate, .error_remove_page = generic_error_remove_page, .swap_activate = ext4_iomap_swap_activate, + .supports = AS_SUPPORTS_DIRECT_IO, }; static const struct address_space_operations ext4_da_aops = { @@ -3697,20 +3697,20 @@ static const struct address_space_operations ext4_da_aops = { .bmap = ext4_bmap, .invalidatepage = ext4_invalidatepage, .releasepage = ext4_releasepage, - 
.direct_IO = noop_direct_IO, .migratepage = buffer_migrate_page, .is_partially_uptodate = block_is_partially_uptodate, .error_remove_page = generic_error_remove_page, .swap_activate = ext4_iomap_swap_activate, + .supports = AS_SUPPORTS_DIRECT_IO, }; static const struct address_space_operations ext4_dax_aops = { .writepages = ext4_dax_writepages, - .direct_IO = noop_direct_IO, .set_page_dirty = __set_page_dirty_no_writeback, .bmap = ext4_bmap, .invalidatepage = noop_invalidatepage, .swap_activate = ext4_iomap_swap_activate, + .supports = AS_SUPPORTS_DIRECT_IO, }; void ext4_set_aops(struct inode *inode) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index f4fd6c246c9a..4c3643969b69 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -4156,6 +4156,7 @@ const struct address_space_operations f2fs_dblock_aops = { #ifdef CONFIG_MIGRATION .migratepage = f2fs_migrate_page, #endif + .supports = AS_SUPPORTS_DIRECT_IO, }; void f2fs_clear_page_cache_dirty_tag(struct page *page) diff --git a/fs/fat/inode.c b/fs/fat/inode.c index de0c9b013a85..4352981dfb82 100644 --- a/fs/fat/inode.c +++ b/fs/fat/inode.c @@ -351,6 +351,7 @@ static const struct address_space_operations fat_aops = { .write_end = fat_write_end, .direct_IO = fat_direct_IO, + .supports = AS_SUPPORTS_DIRECT_IO, .bmap = _fat_bmap }; /* diff --git a/fs/fcntl.c b/fs/fcntl.c index 9c6c6a3e2de5..7308e8274ff9 100644 --- a/fs/fcntl.c +++ b/fs/fcntl.c @@ -58,7 +58,7 @@ static int setfl(int fd, struct file * filp, unsigned long arg) /* Pipe packetized mode is controlled by O_DIRECT flag */ if (!S_ISFIFO(inode->i_mode) && (arg & O_DIRECT)) { if (!filp->f_mapping || !filp->f_mapping->a_ops || - !filp->f_mapping->a_ops->direct_IO) + !(filp->f_mapping->a_ops->supports & AS_SUPPORTS_DIRECT_IO)) return -EINVAL; } diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c index 281d79f8b3d3..e39468fd7177 100644 --- a/fs/fuse/dax.c +++ b/fs/fuse/dax.c @@ -1325,9 +1325,9 @@ bool fuse_dax_inode_alloc(struct super_block *sb, struct fuse_inode *fi) static
const struct address_space_operations fuse_dax_file_aops = { .writepages = fuse_dax_writepages, - .direct_IO = noop_direct_IO, .set_page_dirty = __set_page_dirty_no_writeback, .invalidatepage = noop_invalidatepage, + .supports = AS_SUPPORTS_DIRECT_IO, }; void fuse_dax_inode_init(struct inode *inode) diff --git a/fs/fuse/file.c b/fs/fuse/file.c index 11404f8c21c7..3db64194d346 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -3161,6 +3161,7 @@ static const struct address_space_operations fuse_file_aops = { .direct_IO = fuse_direct_IO, .write_begin = fuse_write_begin, .write_end = fuse_write_end, + .supports = AS_SUPPORTS_DIRECT_IO, }; void fuse_init_file_inode(struct inode *inode) diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c index 005e920f5d4a..dc50b53d6abd 100644 --- a/fs/gfs2/aops.c +++ b/fs/gfs2/aops.c @@ -783,10 +783,10 @@ static const struct address_space_operations gfs2_aops = { .releasepage = iomap_releasepage, .invalidatepage = iomap_invalidatepage, .bmap = gfs2_bmap, - .direct_IO = noop_direct_IO, .migratepage = iomap_migrate_page, .is_partially_uptodate = iomap_is_partially_uptodate, .error_remove_page = generic_error_remove_page, + .supports = AS_SUPPORTS_DIRECT_IO, }; static const struct address_space_operations gfs2_jdata_aops = { diff --git a/fs/hfs/inode.c b/fs/hfs/inode.c index 4a95a92546a0..5f9e5464a5bf 100644 --- a/fs/hfs/inode.c +++ b/fs/hfs/inode.c @@ -177,6 +177,7 @@ const struct address_space_operations hfs_aops = { .bmap = hfs_bmap, .direct_IO = hfs_direct_IO, .writepages = hfs_writepages, + .supports = AS_SUPPORTS_DIRECT_IO, }; /* diff --git a/fs/hfsplus/inode.c b/fs/hfsplus/inode.c index 6fef67c2a9f0..9f0c27e5e115 100644 --- a/fs/hfsplus/inode.c +++ b/fs/hfsplus/inode.c @@ -174,6 +174,7 @@ const struct address_space_operations hfsplus_aops = { .bmap = hfsplus_bmap, .direct_IO = hfsplus_direct_IO, .writepages = hfsplus_writepages, + .supports = AS_SUPPORTS_DIRECT_IO, }; const struct dentry_operations hfsplus_dentry_operations = { diff 
--git a/fs/jfs/inode.c b/fs/jfs/inode.c index 57ab424c05ff..a477267471a4 100644 --- a/fs/jfs/inode.c +++ b/fs/jfs/inode.c @@ -366,6 +366,7 @@ const struct address_space_operations jfs_aops = { .write_end = nobh_write_end, .bmap = jfs_bmap, .direct_IO = jfs_direct_IO, + .supports = AS_SUPPORTS_DIRECT_IO, }; /* diff --git a/fs/libfs.c b/fs/libfs.c index 51b4de3b3447..c27f681291e5 100644 --- a/fs/libfs.c +++ b/fs/libfs.c @@ -1182,18 +1182,6 @@ void noop_invalidatepage(struct page *page, unsigned int offset, } EXPORT_SYMBOL_GPL(noop_invalidatepage); -ssize_t noop_direct_IO(struct kiocb *iocb, struct iov_iter *iter) -{ - /* - * iomap based filesystems support direct I/O without need for - * this callback. However, it still needs to be set in - * inode->a_ops so that open/fcntl know that direct I/O is - * generally supported. - */ - return -EINVAL; -} -EXPORT_SYMBOL_GPL(noop_direct_IO); - /* Because kfree isn't assignment-compatible with void(void*) ;-/ */ void kfree_link(void *p) { diff --git a/fs/nfs/file.c b/fs/nfs/file.c index aa353fd58240..7403ec6317cb 100644 --- a/fs/nfs/file.c +++ b/fs/nfs/file.c @@ -532,6 +532,7 @@ const struct address_space_operations nfs_file_aops = { .error_remove_page = generic_error_remove_page, .swap_activate = nfs_swap_activate, .swap_deactivate = nfs_swap_deactivate, + .supports = AS_SUPPORTS_DIRECT_IO, }; /* diff --git a/fs/nilfs2/inode.c b/fs/nilfs2/inode.c index 2e8eb263cf0f..c57395c01817 100644 --- a/fs/nilfs2/inode.c +++ b/fs/nilfs2/inode.c @@ -307,6 +307,7 @@ const struct address_space_operations nilfs_aops = { .invalidatepage = block_invalidatepage, .direct_IO = nilfs_direct_IO, .is_partially_uptodate = block_is_partially_uptodate, + .supports = AS_SUPPORTS_DIRECT_IO, }; static int nilfs_insert_inode_locked(struct inode *inode, diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c index db2a5a4c38e4..7b3ac1ab5d04 100644 --- a/fs/ntfs3/inode.c +++ b/fs/ntfs3/inode.c @@ -1948,6 +1948,7 @@ const struct address_space_operations ntfs_aops = 
{ .direct_IO = ntfs_direct_IO, .bmap = ntfs_bmap, .set_page_dirty = __set_page_dirty_buffers, + .supports = AS_SUPPORTS_DIRECT_IO, }; const struct address_space_operations ntfs_aops_cmpr = { diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c index 68d11c295dd3..5a158975a4ff 100644 --- a/fs/ocfs2/aops.c +++ b/fs/ocfs2/aops.c @@ -2466,4 +2466,5 @@ const struct address_space_operations ocfs2_aops = { .migratepage = buffer_migrate_page, .is_partially_uptodate = block_is_partially_uptodate, .error_remove_page = generic_error_remove_page, + .supports = AS_SUPPORTS_DIRECT_IO, }; diff --git a/fs/open.c b/fs/open.c index daa324606a41..d679dc0c1801 100644 --- a/fs/open.c +++ b/fs/open.c @@ -840,7 +840,8 @@ static int do_dentry_open(struct file *f, /* NB: we're sure to have correct a_ops only after f_op->open */ if (f->f_flags & O_DIRECT) { - if (!f->f_mapping->a_ops || !f->f_mapping->a_ops->direct_IO) + if (!f->f_mapping->a_ops || + !(f->f_mapping->a_ops->supports & AS_SUPPORTS_DIRECT_IO)) return -EINVAL; } diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c index c1bb4c4b5d67..c5bad94dfbd0 100644 --- a/fs/orangefs/inode.c +++ b/fs/orangefs/inode.c @@ -641,6 +641,7 @@ static const struct address_space_operations orangefs_address_operations = { .freepage = orangefs_freepage, .launder_page = orangefs_launder_page, .direct_IO = orangefs_direct_IO, + .supports = AS_SUPPORTS_DIRECT_IO, }; vm_fault_t orangefs_page_mkwrite(struct vm_fault *vmf) diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c index d081faa55e83..87d05f1d718a 100644 --- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -83,7 +83,7 @@ static int ovl_change_flags(struct file *file, unsigned int flags) if (flags & O_DIRECT) { if (!file->f_mapping->a_ops || - !file->f_mapping->a_ops->direct_IO) + !(file->f_mapping->a_ops->supports & AS_SUPPORTS_DIRECT_IO)) return -EINVAL; } diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c index 832b17589733..9902608b1715 100644 --- a/fs/overlayfs/inode.c +++ 
b/fs/overlayfs/inode.c @@ -660,8 +660,7 @@ static const struct inode_operations ovl_special_inode_operations = { }; static const struct address_space_operations ovl_aops = { - /* For O_DIRECT dentry_open() checks f_mapping->a_ops->direct_IO */ - .direct_IO = noop_direct_IO, + .supports = AS_SUPPORTS_DIRECT_IO, }; /* diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c index f49b72ccac4c..890d91847d58 100644 --- a/fs/reiserfs/inode.c +++ b/fs/reiserfs/inode.c @@ -3436,4 +3436,5 @@ const struct address_space_operations reiserfs_address_space_operations = { .bmap = reiserfs_aop_bmap, .direct_IO = reiserfs_direct_IO, .set_page_dirty = reiserfs_set_page_dirty, + .supports = AS_SUPPORTS_DIRECT_IO, }; diff --git a/fs/udf/file.c b/fs/udf/file.c index 1baff8ddb754..2cb1b499e5c7 100644 --- a/fs/udf/file.c +++ b/fs/udf/file.c @@ -131,6 +131,7 @@ const struct address_space_operations udf_adinicb_aops = { .write_begin = udf_adinicb_write_begin, .write_end = udf_adinicb_write_end, .direct_IO = udf_adinicb_direct_IO, + .supports = AS_SUPPORTS_DIRECT_IO, }; static ssize_t udf_file_write_iter(struct kiocb *iocb, struct iov_iter *from) diff --git a/fs/udf/inode.c b/fs/udf/inode.c index 1d6b7a50736b..38b799b457d5 100644 --- a/fs/udf/inode.c +++ b/fs/udf/inode.c @@ -244,6 +244,7 @@ const struct address_space_operations udf_aops = { .write_end = generic_write_end, .direct_IO = udf_direct_IO, .bmap = udf_bmap, + .supports = AS_SUPPORTS_DIRECT_IO, }; /* diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 34fc6148032a..2a4570516591 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -548,17 +548,17 @@ const struct address_space_operations xfs_address_space_operations = { .releasepage = iomap_releasepage, .invalidatepage = iomap_invalidatepage, .bmap = xfs_vm_bmap, - .direct_IO = noop_direct_IO, .migratepage = iomap_migrate_page, .is_partially_uptodate = iomap_is_partially_uptodate, .error_remove_page = generic_error_remove_page, .swap_activate = xfs_iomap_swapfile_activate, + 
.supports = AS_SUPPORTS_DIRECT_IO, }; const struct address_space_operations xfs_dax_aops = { .writepages = xfs_dax_writepages, - .direct_IO = noop_direct_IO, .set_page_dirty = __set_page_dirty_no_writeback, .invalidatepage = noop_invalidatepage, .swap_activate = xfs_iomap_swapfile_activate, + .supports = AS_SUPPORTS_DIRECT_IO, }; diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c index ddc346a9df9b..37ff541467e8 100644 --- a/fs/zonefs/super.c +++ b/fs/zonefs/super.c @@ -191,8 +191,8 @@ static const struct address_space_operations zonefs_file_aops = { .migratepage = iomap_migrate_page, .is_partially_uptodate = iomap_is_partially_uptodate, .error_remove_page = generic_error_remove_page, - .direct_IO = noop_direct_IO, .swap_activate = zonefs_swap_activate, + .supports = AS_SUPPORTS_DIRECT_IO, }; static void zonefs_update_stats(struct inode *inode, loff_t new_isize) diff --git a/include/linux/fs.h b/include/linux/fs.h index e7a633353fd2..c909ca6c0eb6 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -369,7 +369,10 @@ typedef struct { typedef int (*read_actor_t)(read_descriptor_t *, struct page *, unsigned long, unsigned long); +#define AS_SUPPORTS_DIRECT_IO 0x00000001 + struct address_space_operations { + unsigned int supports; /* Bitmask of AS_SUPPORTS_* flags */ int (*writepage)(struct page *page, struct writeback_control *wbc); int (*readpage)(struct file *, struct page *); @@ -3391,7 +3394,6 @@ extern void simple_recursive_removal(struct dentry *, extern int noop_fsync(struct file *, loff_t, loff_t, int); extern void noop_invalidatepage(struct page *page, unsigned int offset, unsigned int length); -extern ssize_t noop_direct_IO(struct kiocb *iocb, struct iov_iter *iter); extern int simple_empty(struct dentry *); extern int simple_write_begin(struct file *file, struct address_space *mapping, loff_t pos, unsigned len, unsigned flags, From patchwork Fri Sep 24 17:18:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 12516241
Received: from warthog.procyon.org.uk (unknown [10.33.36.44]) by smtp.corp.redhat.com (Postfix) with ESMTP id EDCBC19D9F; Fri, 24 Sep 2021 17:18:24 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH v3 3/9] mm: Make swap_readpage() void From: David Howells To: willy@infradead.org, hch@lst.de, trond.myklebust@primarydata.com Cc: Jens Axboe , "Darrick J. Wong" , linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, dhowells@redhat.com, dhowells@redhat.com, darrick.wong@oracle.com, viro@zeniv.linux.org.uk, jlayton@kernel.org, torvalds@linux-foundation.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Fri, 24 Sep 2021 18:18:24 +0100 Message-ID: <163250390413.2330363.3248359518033939175.stgit@warthog.procyon.org.uk> In-Reply-To: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk> References: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk> User-Agent: StGit/0.23 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org None of the callers of swap_readpage() actually check its return value and, indeed, the operation may still be in progress, so remove the return value. Signed-off-by: David Howells cc: Matthew Wilcox cc: Christoph Hellwig cc: Jens Axboe cc: Darrick J. 
Wong cc: linux-xfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Reviewed-by: Matthew Wilcox (Oracle) --- include/linux/swap.h | 2 +- mm/page_io.c | 11 +++-------- 2 files changed, 4 insertions(+), 9 deletions(-) diff --git a/include/linux/swap.h b/include/linux/swap.h index 576d40e33b1f..293eba012d4f 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -416,7 +416,7 @@ extern void kswapd_stop(int nid); #include /* for bio_end_io_t */ /* linux/mm/page_io.c */ -extern int swap_readpage(struct page *page, bool do_poll); +void swap_readpage(struct page *page, bool synchronous); extern int swap_writepage(struct page *page, struct writeback_control *wbc); int __swap_writepage(struct page *page, struct writeback_control *wbc); extern int swap_set_page_dirty(struct page *page); diff --git a/mm/page_io.c b/mm/page_io.c index afd18f6ec09e..b9fe25101a39 100644 --- a/mm/page_io.c +++ b/mm/page_io.c @@ -352,10 +352,9 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc) return 0; } -int swap_readpage(struct page *page, bool synchronous) +void swap_readpage(struct page *page, bool synchronous) { struct bio *bio; - int ret = 0; struct swap_info_struct *sis = page_swap_info(page); blk_qc_t qc; struct gendisk *disk; @@ -382,15 +381,13 @@ int swap_readpage(struct page *page, bool synchronous) struct file *swap_file = sis->swap_file; struct address_space *mapping = swap_file->f_mapping; - ret = mapping->a_ops->readpage(swap_file, page); - if (!ret) + if (!mapping->a_ops->readpage(swap_file, page)) count_vm_event(PSWPIN); goto out; } if (sis->flags & SWP_SYNCHRONOUS_IO) { - ret = bdev_read_page(sis->bdev, swap_page_sector(page), page); - if (!ret) { + if (!bdev_read_page(sis->bdev, swap_page_sector(page), page)) { if (trylock_page(page)) { swap_slot_free_notify(page); unlock_page(page); @@ -401,7 +398,6 @@ int swap_readpage(struct page *page, bool synchronous) } } - ret = 0; bio = bio_alloc(GFP_KERNEL, 1); bio_set_dev(bio, 
sis->bdev); bio->bi_opf = REQ_OP_READ; @@ -435,7 +431,6 @@ int swap_readpage(struct page *page, bool synchronous) out: psi_memstall_leave(&pflags); - return ret; } int swap_set_page_dirty(struct page *page) From patchwork Fri Sep 24 17:18:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 12516243 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3B07EC433F5 for ; Fri, 24 Sep 2021 17:20:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2402C6103B for ; Fri, 24 Sep 2021 17:20:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344001AbhIXRVb (ORCPT ); Fri, 24 Sep 2021 13:21:31 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:53894 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1347784AbhIXRUO (ORCPT ); Fri, 24 Sep 2021 13:20:14 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632503921; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=X6DQnexwniQ/NKSS0cOEbgPfeH+oSvyPP6EWb64AW6A=; b=R+Qz6DzESzCmHmT7LjoR0HZIqQTwldEVjGD79M9/5kJ5fl2EUvqB9bZDtTQnnsxsDJYT3q jHeiPjfKqxBwXaOOvsuskGH/fNOOW6aEHcs9a5+Ulfox0jRiAp1uvxRUGadCntCXbXL697 whqxTNw0hxkO0QZHLgepWEt69ONyeD0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-542-x1QTI5CYO5S4aviDytbwsg-1; Fri, 24 Sep 2021 13:18:37 -0400 X-MC-Unique: 
x1QTI5CYO5S4aviDytbwsg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 1EBCA84A5E1; Fri, 24 Sep 2021 17:18:36 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.44]) by smtp.corp.redhat.com (Postfix) with ESMTP id 90BDA5D9DD; Fri, 24 Sep 2021 17:18:33 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH v3 4/9] Introduce IOCB_SWAP kiocb flag to trigger REQ_SWAP From: David Howells To: willy@infradead.org, hch@lst.de, trond.myklebust@primarydata.com Cc: "Darrick J. Wong" , linux-xfs@vger.kernel.org, linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, dhowells@redhat.com, dhowells@redhat.com, darrick.wong@oracle.com, viro@zeniv.linux.org.uk, jlayton@kernel.org, torvalds@linux-foundation.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Fri, 24 Sep 2021 18:18:32 +0100 Message-ID: <163250391274.2330363.16176856646027970865.stgit@warthog.procyon.org.uk> In-Reply-To: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk> References: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk> User-Agent: StGit/0.23 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org Introduce an IOCB_SWAP flag for the kiocb struct so that REQ_SWAP gets set on the lower-level operation structures in generic code. Signed-off-by: David Howells cc: Matthew Wilcox cc: Christoph Hellwig cc: Darrick J.
Wong cc: linux-xfs@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org --- fs/direct-io.c | 2 ++ include/linux/bio.h | 2 ++ include/linux/fs.h | 1 + 3 files changed, 5 insertions(+) diff --git a/fs/direct-io.c b/fs/direct-io.c index b2e86e739d7a..76eec0a68fa4 100644 --- a/fs/direct-io.c +++ b/fs/direct-io.c @@ -1216,6 +1216,8 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode, } if (iocb->ki_flags & IOCB_HIPRI) dio->op_flags |= REQ_HIPRI; + if (iocb->ki_flags & IOCB_SWAP) + dio->op_flags |= REQ_SWAP; /* * For AIO O_(D)SYNC writes we need to defer completions to a workqueue diff --git a/include/linux/bio.h b/include/linux/bio.h index 00952e92eae1..b01133727494 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -787,6 +787,8 @@ static inline void bio_set_polled(struct bio *bio, struct kiocb *kiocb) bio->bi_opf |= REQ_HIPRI; if (!is_sync_kiocb(kiocb)) bio->bi_opf |= REQ_NOWAIT; + if (kiocb->ki_flags & IOCB_SWAP) + bio->bi_opf |= REQ_SWAP; } struct bio *blk_next_bio(struct bio *bio, unsigned int nr_pages, gfp_t gfp); diff --git a/include/linux/fs.h b/include/linux/fs.h index c909ca6c0eb6..c20f4423e2f1 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -321,6 +321,7 @@ enum rw_hint { #define IOCB_NOIO (1 << 20) /* can use bio alloc cache */ #define IOCB_ALLOC_CACHE (1 << 21) +#define IOCB_SWAP (1 << 22) /* Operation on a swapfile */ struct kiocb { struct file *ki_filp; From patchwork Fri Sep 24 17:18:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 12516245 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 41FB2C433FE for ; Fri, 24 Sep 2021 17:20:10 +0000 (UTC) Received: from vger.kernel.org 
(vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2C7BA60F21 for ; Fri, 24 Sep 2021 17:20:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344400AbhIXRVm (ORCPT ); Fri, 24 Sep 2021 13:21:42 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:43863 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1347808AbhIXRU1 (ORCPT ); Fri, 24 Sep 2021 13:20:27 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632503933; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Wh92/GIQSWuimSHEUrh/KLNHrbQS0s/D/rLAazMZI+U=; b=f8uJXzBxwn5QtAtLpTvkorUshsPcz2+00rm0FhyFRPW+TTC5DSGa9hoqc5nvykJO2rt9xL 4bGnNfHZB8KnHaT9t1QnkldOptN3gahbkHnAikdrSh54HY49vujhz0+/xeNvIhGRrDElnh HBj9xWvhVyKRdQzkvjd3KCq3seoUYhY= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-400-gHqa0HBcMl20u-8s2KXwMA-1; Fri, 24 Sep 2021 13:18:51 -0400 X-MC-Unique: gHqa0HBcMl20u-8s2KXwMA-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 2220E1922023; Fri, 24 Sep 2021 17:18:49 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.44]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2CBB21017E36; Fri, 24 Sep 2021 17:18:42 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [PATCH v3 5/9] mm: Make swap_readpage() for SWP_FS_OPS use ->swap_rw() not ->readpage() From: David Howells To: willy@infradead.org, hch@lst.de, trond.myklebust@primarydata.com Cc: Jens Axboe , "Darrick J. Wong" , linux-block@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, dhowells@redhat.com, dhowells@redhat.com, darrick.wong@oracle.com, viro@zeniv.linux.org.uk, jlayton@kernel.org, torvalds@linux-foundation.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Fri, 24 Sep 2021 18:18:41 +0100 Message-ID: <163250392134.2330363.2715808422502485629.stgit@warthog.procyon.org.uk> In-Reply-To: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk> References: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk> User-Agent: StGit/0.23 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org Make swap_readpage() use the ->swap_rw() method on the filesystem to do direct I/O rather than ->readpage() when accessing a swap file (SWP_FS_OPS). Make swap_writepage() similarly use ->swap_rw() rather than the ->direct_IO() method. Suggested-by: Matthew Wilcox (Oracle) Signed-off-by: David Howells cc: Matthew Wilcox cc: Christoph Hellwig cc: Jens Axboe cc: Darrick J.
Wong cc: linux-block@vger.kernel.org cc: linux-xfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org --- include/linux/fs.h | 2 + mm/page_io.c | 106 +++++++++++++++++++++++++++++++++++++++++++++++----- 2 files changed, 98 insertions(+), 10 deletions(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index c20f4423e2f1..c8f7724ecded 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -338,6 +338,7 @@ struct kiocb { union { unsigned int ki_cookie; /* for ->iopoll */ struct wait_page_queue *ki_waitq; /* for async buffered IO */ + struct page *ki_swap_page; /* For swapfile_read/write */ }; randomized_struct_fields_end @@ -404,6 +405,7 @@ struct address_space_operations { int (*releasepage) (struct page *, gfp_t); void (*freepage)(struct page *); ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter); + ssize_t (*swap_rw)(struct kiocb *, struct iov_iter *); /* * migrate the contents of a page to the specified target. If * migrate_mode is MIGRATE_ASYNC, it must not block. diff --git a/mm/page_io.c b/mm/page_io.c index b9fe25101a39..6b1465699c72 100644 --- a/mm/page_io.c +++ b/mm/page_io.c @@ -4,7 +4,7 @@ * * Copyright (C) 1991, 1992, 1993, 1994 Linus Torvalds * - * Swap reorganised 29.12.95, + * Swap reorganised 29.12.95, * Asynchronous swapping added 30.12.95. Stephen Tweedie * Removed race in async swapping. 14.4.1996. Bruno Haible * Add swap of shared pages through the page cache. 20.2.1998. Stephen Tweedie @@ -26,6 +26,22 @@ #include #include +/* + * Keep track of the kiocb we're using to do async DIO. We have to + * refcount it until various things stop looking at the kiocb *after* + * calling ->ki_complete(). 
+ */ +struct swapfile_kiocb { + struct kiocb iocb; + refcount_t ref; +}; + +static void swapfile_put_kiocb(struct swapfile_kiocb *ki) +{ + if (refcount_dec_and_test(&ki->ref)) + kfree(ki); +} + static void end_swap_bio_write(struct bio *bio) { struct page *page = bio_first_page_all(bio); @@ -302,11 +318,12 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc) iov_iter_bvec(&from, WRITE, &bv, 1, PAGE_SIZE); init_sync_kiocb(&kiocb, swap_file); - kiocb.ki_pos = page_file_offset(page); + kiocb.ki_pos = page_file_offset(page); + kiocb.ki_flags = IOCB_DIRECT | IOCB_WRITE | IOCB_SWAP; set_page_writeback(page); unlock_page(page); - ret = mapping->a_ops->direct_IO(&kiocb, &from); + ret = mapping->a_ops->swap_rw(&kiocb, &from); if (ret == PAGE_SIZE) { count_vm_event(PSWPOUT); ret = 0; @@ -323,8 +340,8 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc) */ set_page_dirty(page); ClearPageReclaim(page); - pr_err_ratelimited("Write error on dio swapfile (%llu)\n", - page_file_offset(page)); + pr_err_ratelimited("Write error (%d) on dio swapfile (%llu)\n", + ret, page_file_offset(page)); } end_page_writeback(page); return ret; @@ -352,6 +369,79 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc) return 0; } +static void swapfile_read_complete(struct page *page, long ret) +{ + if (ret == page_size(page)) { + count_vm_event(PSWPIN); + SetPageUptodate(page); + } else { + SetPageError(page); + pr_err_ratelimited("Read error (%ld) on dio swapfile (%llu)\n", + ret, page_file_offset(page)); + } + + unlock_page(page); +} + +static void __swapfile_read_complete(struct kiocb *iocb, long ret, long ret2) +{ + struct swapfile_kiocb *ki = container_of(iocb, struct swapfile_kiocb, iocb); + + swapfile_read_complete(iocb->ki_swap_page, ret); + swapfile_put_kiocb(ki); +} + +static void swapfile_read_sync(struct swap_info_struct *sis, struct page *page, + struct iov_iter *to) +{ + struct kiocb kiocb; + struct file *swap_file = 
sis->swap_file; + int ret; + + init_sync_kiocb(&kiocb, swap_file); + kiocb.ki_swap_page = page; + kiocb.ki_pos = page_file_offset(page); + kiocb.ki_flags = IOCB_DIRECT | IOCB_SWAP; + ret = swap_file->f_mapping->a_ops->swap_rw(&kiocb, to); + + swapfile_read_complete(page, ret); +} + +static void swapfile_read(struct swap_info_struct *sis, struct page *page, + bool synchronous) +{ + struct swapfile_kiocb *ki; + struct file *swap_file = sis->swap_file; + struct bio_vec bv = { + .bv_page = page, + .bv_len = thp_size(page), + .bv_offset = 0 + }; + struct iov_iter to; + int ret; + + iov_iter_bvec(&to, READ, &bv, 1, thp_size(page)); + + if (synchronous) + return swapfile_read_sync(sis, page, &to); + + ki = kzalloc(sizeof(*ki), GFP_KERNEL); + if (!ki) + return; + + refcount_set(&ki->ref, 2); + init_sync_kiocb(&ki->iocb, swap_file); + ki->iocb.ki_swap_page = page; + ki->iocb.ki_flags = IOCB_DIRECT | IOCB_SWAP; + ki->iocb.ki_pos = page_file_offset(page); + ki->iocb.ki_complete = __swapfile_read_complete; + + ret = swap_file->f_mapping->a_ops->swap_rw(&ki->iocb, &to); + if (ret != -EIOCBQUEUED) + __swapfile_read_complete(&ki->iocb, ret, 0); + swapfile_put_kiocb(ki); +} + void swap_readpage(struct page *page, bool synchronous) { struct bio *bio; @@ -378,11 +468,7 @@ void swap_readpage(struct page *page, bool synchronous) } if (data_race(sis->flags & SWP_FS_OPS)) { - struct file *swap_file = sis->swap_file; - struct address_space *mapping = swap_file->f_mapping; - - if (!mapping->a_ops->readpage(swap_file, page)) - count_vm_event(PSWPIN); + swapfile_read(sis, page, synchronous); goto out; } From patchwork Fri Sep 24 17:18:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 12516247 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by 
smtp.lore.kernel.org (Postfix) with ESMTP id 6063FC433FE for ; Fri, 24 Sep 2021 17:20:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4CE3060F21 for ; Fri, 24 Sep 2021 17:20:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1347837AbhIXRVu (ORCPT ); Fri, 24 Sep 2021 13:21:50 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:21454 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344070AbhIXRUh (ORCPT ); Fri, 24 Sep 2021 13:20:37 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632503943; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=b+XD7Zi/YFKEa2yF0urj/0T7v2kKhbU5Vfw3G+ykgV0=; b=Mg1BkYY1e5VIfz7Uhtel/xxgNtbhLLsEJVq+UrKyC9wrukeUGiiME71r50etXgUiJJpbzj JN9hagzYmloZtnf1Lvub8hrDy4KPHEFM3Q8LRbXYwphPEYIqKnYpbvEcXJAacikl2LiYx7 QCQxOF0j8eK1SihUGlZqaMe4xXRqAF0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-301-4a3NA8WpOZiVovrffGCEEQ-1; Fri, 24 Sep 2021 13:19:00 -0400 X-MC-Unique: 4a3NA8WpOZiVovrffGCEEQ-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 2708E84A5E9; Fri, 24 Sep 2021 17:18:58 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.44]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4C30D69320; Fri, 24 Sep 2021 17:18:55 +0000 (UTC) Organization: Red Hat UK Ltd. 
Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH v3 6/9] mm: Make __swap_writepage() do async DIO if asked for it From: David Howells To: willy@infradead.org, hch@lst.de, trond.myklebust@primarydata.com Cc: "Darrick J. Wong" , Trond Myklebust , linux-nfs@vger.kernel.org, linux-block@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, dhowells@redhat.com, dhowells@redhat.com, darrick.wong@oracle.com, viro@zeniv.linux.org.uk, jlayton@kernel.org, torvalds@linux-foundation.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Fri, 24 Sep 2021 18:18:54 +0100 Message-ID: <163250393435.2330363.12822795853508093546.stgit@warthog.procyon.org.uk> In-Reply-To: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk> References: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk> User-Agent: StGit/0.23 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org Make __swap_writepage()'s DIO path do sync DIO if the writeback control's sync mode is WB_SYNC_ALL and async DIO if not. Note that this causes hanging processes in sunrpc if the swapfile is on NFS. I'm not sure whether it's due to misscheduling or something else. Suggested-by: Matthew Wilcox (Oracle) Signed-off-by: David Howells cc: Matthew Wilcox (Oracle) cc: Christoph Hellwig cc: Darrick J. 
Wong cc: Trond Myklebust cc: linux-nfs@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-xfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org --- mm/page_io.c | 133 ++++++++++++++++++++++++++++++++++++++++------------------ 1 file changed, 92 insertions(+), 41 deletions(-) diff --git a/mm/page_io.c b/mm/page_io.c index 6b1465699c72..8f1199d59162 100644 --- a/mm/page_io.c +++ b/mm/page_io.c @@ -298,6 +298,96 @@ static void bio_associate_blkg_from_page(struct bio *bio, struct page *page) #define bio_associate_blkg_from_page(bio, page) do { } while (0) #endif /* CONFIG_MEMCG && CONFIG_BLK_CGROUP */ +static void swapfile_write_complete(struct page *page, long ret) +{ + if (ret == thp_size(page)) { + count_swpout_vm_event(page); + } else { + /* + * In the case of swap-over-nfs, this can be a + * temporary failure if the system has limited memory + * for allocating transmit buffers. Mark the page + * dirty and avoid rotate_reclaimable_page but + * rate-limit the messages but do not flag PageError + * like the normal direct-to-bio case as it could be + * temporary. 
+ */ + set_page_dirty(page); + ClearPageReclaim(page); + pr_err_ratelimited("Write error (%ld) on dio swapfile (%llu)\n", + ret, page_file_offset(page)); + } + end_page_writeback(page); +} + +static void __swapfile_write_complete(struct kiocb *iocb, long ret, long ret2) +{ + struct swapfile_kiocb *ki = container_of(iocb, struct swapfile_kiocb, iocb); + + swapfile_write_complete(iocb->ki_swap_page, ret); + swapfile_put_kiocb(ki); +} + +static int swapfile_write_sync(struct swap_info_struct *sis, + struct page *page, struct writeback_control *wbc, + struct iov_iter *from) +{ + struct kiocb kiocb; + struct file *swap_file = sis->swap_file; + int ret; + + init_sync_kiocb(&kiocb, swap_file); + kiocb.ki_swap_page = page; + kiocb.ki_pos = page_file_offset(page); + kiocb.ki_flags = IOCB_DIRECT | IOCB_WRITE | IOCB_SWAP; + + set_page_writeback(page); + unlock_page(page); + + ret = swap_file->f_mapping->a_ops->swap_rw(&kiocb, from); + swapfile_write_complete(page, ret); + return ret == page_size(page) ? 0 : ret >= 0 ? 
-ENODATA : ret; +} + +static int swapfile_write(struct swap_info_struct *sis, + struct page *page, struct writeback_control *wbc) +{ + struct swapfile_kiocb *ki; + struct file *swap_file = sis->swap_file; + struct bio_vec bv = { + .bv_page = page, + .bv_len = page_size(page), + .bv_offset = 0 + }; + struct iov_iter from; + int ret; + + iov_iter_bvec(&from, WRITE, &bv, 1, PAGE_SIZE); + + if (wbc->sync_mode == WB_SYNC_ALL) + return swapfile_write_sync(sis, page, wbc, &from); + + ki = kzalloc(sizeof(*ki), GFP_KERNEL); + if (!ki) + return -ENOMEM; + + refcount_set(&ki->ref, 2); + init_sync_kiocb(&ki->iocb, swap_file); + ki->iocb.ki_swap_page = page; + ki->iocb.ki_pos = page_file_offset(page); + ki->iocb.ki_flags = IOCB_DIRECT | IOCB_WRITE | IOCB_SWAP; + ki->iocb.ki_complete = __swapfile_write_complete; + + set_page_writeback(page); + unlock_page(page); + ret = swap_file->f_mapping->a_ops->swap_rw(&ki->iocb, &from); + + if (ret != -EIOCBQUEUED) + __swapfile_write_complete(&ki->iocb, ret, 0); + swapfile_put_kiocb(ki); + return ret == page_size(page) ? 0 : ret >= 0 ? 
-ENODATA : ret; +} + int __swap_writepage(struct page *page, struct writeback_control *wbc) { struct bio *bio; @@ -305,47 +395,8 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc) struct swap_info_struct *sis = page_swap_info(page); VM_BUG_ON_PAGE(!PageSwapCache(page), page); - if (data_race(sis->flags & SWP_FS_OPS)) { - struct kiocb kiocb; - struct file *swap_file = sis->swap_file; - struct address_space *mapping = swap_file->f_mapping; - struct bio_vec bv = { - .bv_page = page, - .bv_len = PAGE_SIZE, - .bv_offset = 0 - }; - struct iov_iter from; - - iov_iter_bvec(&from, WRITE, &bv, 1, PAGE_SIZE); - init_sync_kiocb(&kiocb, swap_file); - kiocb.ki_pos = page_file_offset(page); - kiocb.ki_flags = IOCB_DIRECT | IOCB_WRITE | IOCB_SWAP; - - set_page_writeback(page); - unlock_page(page); - ret = mapping->a_ops->swap_rw(&kiocb, &from); - if (ret == PAGE_SIZE) { - count_vm_event(PSWPOUT); - ret = 0; - } else { - /* - * In the case of swap-over-nfs, this can be a - * temporary failure if the system has limited - * memory for allocating transmit buffers. - * Mark the page dirty and avoid - * rotate_reclaimable_page but rate-limit the - * messages but do not flag PageError like - * the normal direct-to-bio case as it could - * be temporary. 
- */ - set_page_dirty(page); - ClearPageReclaim(page); - pr_err_ratelimited("Write error (%d) on dio swapfile (%llu)\n", - ret, page_file_offset(page)); - } - end_page_writeback(page); - return ret; - } + if (data_race(sis->flags & SWP_FS_OPS)) + return swapfile_write(sis, page, wbc); ret = bdev_write_page(sis->bdev, swap_page_sector(page), page, wbc); if (!ret) { From patchwork Fri Sep 24 17:19:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 12516249 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B631BC43217 for ; Fri, 24 Sep 2021 17:20:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9ED7D60F21 for ; Fri, 24 Sep 2021 17:20:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1347874AbhIXRVv (ORCPT ); Fri, 24 Sep 2021 13:21:51 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:53304 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1347754AbhIXRUo (ORCPT ); Fri, 24 Sep 2021 13:20:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632503950; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Q8VagtUlhT5Uy1W6HF6FVflMAOHtehIQM8x4I7fA4ic=; b=IJ5f3lsPulje4zUj3tiVMOB0dkci36DO7vJv7Q57HxQKDgRx2D0ujpNkHuzVyG6OfA52to kQLMj2IJZMOvDypq/wBMU47JmqjEFUk3/W9qKtotvTP1QBECwwwd4Pu6whbEhRfO2vPflQ 8+oem9opidFrYk14QG+Wq0tYqjfG3ns= Received: from mimecast-mx01.redhat.com 
(mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-577-9HXdbiU9P266rFCe3CjhAw-1; Fri, 24 Sep 2021 13:19:08 -0400 X-MC-Unique: 9HXdbiU9P266rFCe3CjhAw-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id AAE951006AA2; Fri, 24 Sep 2021 17:19:06 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.44]) by smtp.corp.redhat.com (Postfix) with ESMTP id 288C26A908; Fri, 24 Sep 2021 17:19:04 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH v3 7/9] nfs: Fix write to swapfile failure due to generic_write_checks() From: David Howells To: willy@infradead.org, hch@lst.de, trond.myklebust@primarydata.com Cc: Anna Schumaker , NeilBrown , "Darrick J. 
Wong" , linux-nfs@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, dhowells@redhat.com, dhowells@redhat.com, darrick.wong@oracle.com, viro@zeniv.linux.org.uk, jlayton@kernel.org, torvalds@linux-foundation.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Fri, 24 Sep 2021 18:19:03 +0100 Message-ID: <163250394337.2330363.10000329002686277942.stgit@warthog.procyon.org.uk> In-Reply-To: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk> References: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk> User-Agent: StGit/0.23 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org Trying to use a swapfile on NFS results in every DIO write failing with ETXTBSY because generic_write_checks(), as called by nfs_file_direct_write() from nfs_direct_IO(), forbids writes to swapfiles. Fix this by implementing the ->swap_rw() method for NFS and using that to bypass the checks in generic_write_checks(). [I'm not sure if we still need to do some of the checks] Without this patch, the following is seen: Write error on dio swapfile (3800334336) Altering __swap_writepage() to show the error shows: Write error (-26) on dio swapfile (3800334336) Tested by swapping off all swap partitions and then swapping on a prepared NFS file (CONFIG_NFS_SWAP=y is also needed).
Enough copies of the following program then need to be run to force swapping to occur (at least one per gigabyte of RAM):

	#include <stdbool.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>

	int main()
	{
		unsigned int pid = getpid(), iterations = 0;
		size_t i, j, size = 1024 * 1024 * 1024;
		char *p;
		bool mismatch;

		p = malloc(size);
		if (!p) {
			perror("malloc");
			exit(1);
		}

		/* Fill the buffer with a pseudo-random sequence seeded by the pid. */
		srand(pid);
		for (i = 0; i < size; i += 4)
			*(unsigned int *)(p + i) = rand();

		do {
			/* Dirty the first word of every 4096-byte page, 16 passes per check. */
			for (j = 0; j < 16; j++) {
				for (i = 0; i < size; i += 4096)
					*(unsigned int *)(p + i) += 1;
				iterations++;
			}

			/* Replay the sequence and compare, undoing the increments. */
			mismatch = false;
			srand(pid);
			for (i = 0; i < size; i += 4) {
				unsigned int r = rand();
				unsigned int v = *(unsigned int *)(p + i);

				if (i % 4096 == 0)
					v -= iterations;
				if (v != r) {
					fprintf(stderr, "mismatch %zx: %x != %x (diff %x)\n",
						i, v, r, v - r);
					mismatch = true;
				}
			}
		} while (!mismatch);
		exit(1);
	}

Fixes: dc617f29dbe5 ("vfs: don't allow writes to swap files") Signed-off-by: David Howells cc: Trond Myklebust cc: Anna Schumaker cc: "NeilBrown" cc: Matthew Wilcox cc: Darrick J. Wong cc: Christoph Hellwig cc: linux-nfs@vger.kernel.org cc: linux-mm@kvack.org cc: linux-fsdevel@vger.kernel.org --- fs/nfs/direct.c | 28 +++++++--------------------- fs/nfs/file.c | 14 ++++++-------- include/linux/nfs_fs.h | 2 +- 3 files changed, 14 insertions(+), 30 deletions(-) diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c index 2e894fec036b..71da8054df7e 100644 --- a/fs/nfs/direct.c +++ b/fs/nfs/direct.c @@ -152,28 +152,18 @@ nfs_direct_count_bytes(struct nfs_direct_req *dreq, } /** - * nfs_direct_IO - NFS address space operation for direct I/O + * nfs_swap_rw - Do direct I/O to a swapfile on NFS * @iocb: target I/O control block * @iter: I/O buffer * * The presence of this routine in the address space ops vector means - * the NFS client supports direct I/O. However, for most direct IO, we - * shunt off direct read and write requests before the VFS gets them, - * so this method is only ever called for swap. + * the NFS client supports direct I/O for swap. 
*/ -ssize_t nfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) +ssize_t nfs_swap_rw(struct kiocb *iocb, struct iov_iter *iter) { - struct inode *inode = iocb->ki_filp->f_mapping->host; - - /* we only support swap file calling nfs_direct_IO */ - if (!IS_SWAPFILE(inode)) - return 0; - - VM_BUG_ON(iov_iter_count(iter) != PAGE_SIZE); - - if (iov_iter_rw(iter) == READ) - return nfs_file_direct_read(iocb, iter); - return nfs_file_direct_write(iocb, iter); + if (iocb->ki_flags & IOCB_WRITE) + return nfs_file_direct_write(iocb, iter); + return nfs_file_direct_read(iocb, iter); } static void nfs_direct_release_pages(struct page **pages, unsigned int npages) @@ -894,7 +884,7 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq, ssize_t nfs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter) { ssize_t result, requested; - size_t count; + size_t count = iov_iter_count(iter); struct file *file = iocb->ki_filp; struct address_space *mapping = file->f_mapping; struct inode *inode = mapping->host; @@ -905,10 +895,6 @@ ssize_t nfs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter) dfprintk(FILE, "NFS: direct write(%pD2, %zd@%Ld)\n", file, iov_iter_count(iter), (long long) iocb->ki_pos); - result = generic_write_checks(iocb, iter); - if (result <= 0) - return result; - count = result; nfs_add_stats(mapping->host, NFSIOS_DIRECTWRITTENBYTES, count); pos = iocb->ki_pos; diff --git a/fs/nfs/file.c b/fs/nfs/file.c index 7403ec6317cb..70dd49994751 100644 --- a/fs/nfs/file.c +++ b/fs/nfs/file.c @@ -523,7 +523,7 @@ const struct address_space_operations nfs_file_aops = { .write_end = nfs_write_end, .invalidatepage = nfs_invalidate_page, .releasepage = nfs_release_page, - .direct_IO = nfs_direct_IO, + .swap_rw = nfs_swap_rw, #ifdef CONFIG_MIGRATION .migratepage = nfs_migrate_page, #endif @@ -616,14 +616,16 @@ ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from) if (result) return result; - if (iocb->ki_flags & IOCB_DIRECT) + if 
(iocb->ki_flags & IOCB_DIRECT) { + result = generic_write_checks(iocb, from); + if (result <= 0) + return result; return nfs_file_direct_write(iocb, from); + } dprintk("NFS: write(%pD2, %zu@%Ld)\n", file, iov_iter_count(from), (long long) iocb->ki_pos); - if (IS_SWAPFILE(inode)) - goto out_swapfile; /* * O_APPEND implies that we must revalidate the file length. */ @@ -678,10 +680,6 @@ ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from) nfs_add_stats(inode, NFSIOS_NORMALWRITTENBYTES, written); out: return result; - -out_swapfile: - printk(KERN_INFO "NFS: attempt to write to active swap file!\n"); - return -ETXTBSY; } EXPORT_SYMBOL_GPL(nfs_file_write); diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index b9a8b925db43..4a8bd9e48237 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -493,7 +493,7 @@ static inline const struct cred *nfs_file_cred(struct file *file) /* * linux/fs/nfs/direct.c */ -extern ssize_t nfs_direct_IO(struct kiocb *, struct iov_iter *); +extern ssize_t nfs_swap_rw(struct kiocb *, struct iov_iter *); extern ssize_t nfs_file_direct_read(struct kiocb *iocb, struct iov_iter *iter); extern ssize_t nfs_file_direct_write(struct kiocb *iocb, From patchwork Fri Sep 24 17:19:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 12516251 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D495DC433F5 for ; Fri, 24 Sep 2021 17:20:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C31C561250 for ; Fri, 24 Sep 2021 17:20:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1347782AbhIXRV6 (ORCPT ); Fri, 24 Sep 2021 13:21:58 -0400 
Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:22191 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1347851AbhIXRU5 (ORCPT ); Fri, 24 Sep 2021 13:20:57 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632503964; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=x7BpkThd4W7Otj77d326ZQ439zIBR66OsnpLRS8EUC4=; b=T1l07n8bWH7pYgykqlnFUvKGGYhO7Wiw4/7ewh4fVQ3Y0I8wzAafhgt+mPu8659E5mrj9b kkXuXKQDWhk4iekFf1MTD5xdXOGaDkihxkjB79TCSmg3ERQRZnhlHlLo39lmeM7T6ofzMl 83vUVS1dvPAKW5G6vIyp10BEC5Yxzn8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-384-FXe5ABoKNvWZ-0xXSP1U8g-1; Fri, 24 Sep 2021 13:19:21 -0400 X-MC-Unique: FXe5ABoKNvWZ-0xXSP1U8g-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4F5DC802B9F; Fri, 24 Sep 2021 17:19:18 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.44]) by smtp.corp.redhat.com (Postfix) with ESMTP id D7B3D5F707; Fri, 24 Sep 2021 17:19:12 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH v3 8/9] block, btrfs, ext4, xfs: Implement swap_rw From: David Howells To: willy@infradead.org, hch@lst.de, trond.myklebust@primarydata.com Cc: Jens Axboe , Chris Mason , Josef Bacik , David Sterba , Theodore Ts'o , Andreas Dilger , "Darrick J. 
Wong" , linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, dhowells@redhat.com, dhowells@redhat.com, darrick.wong@oracle.com, viro@zeniv.linux.org.uk, jlayton@kernel.org, torvalds@linux-foundation.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Fri, 24 Sep 2021 18:19:11 +0100 Message-ID: <163250395192.2330363.9101664122191208351.stgit@warthog.procyon.org.uk> In-Reply-To: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk> References: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk> User-Agent: StGit/0.23 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org

Implement swap_rw for block devices, btrfs, ext4 and xfs. This allows the page swapping code to use direct-IO rather than direct bio submission, whilst skipping the checks going via read/write_iter would entail.

Signed-off-by: David Howells cc: Matthew Wilcox cc: Christoph Hellwig cc: Jens Axboe cc: Chris Mason cc: Josef Bacik cc: David Sterba cc: "Theodore Ts'o" cc: Andreas Dilger cc: Darrick J. 
Wong cc: linux-block@vger.kernel.org cc: linux-btrfs@vger.kernel.org cc: linux-ext4@vger.kernel.org cc: linux-xfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org --- block/fops.c | 1 + fs/btrfs/inode.c | 12 +++++------- fs/ext4/inode.c | 9 +++++++++ fs/xfs/xfs_aops.c | 9 +++++++++ 4 files changed, 24 insertions(+), 7 deletions(-) diff --git a/block/fops.c b/block/fops.c index 84c64d814d0d..7ba37dfafae2 100644 --- a/block/fops.c +++ b/block/fops.c @@ -382,6 +382,7 @@ const struct address_space_operations def_blk_aops = { .write_end = blkdev_write_end, .writepages = blkdev_writepages, .direct_IO = blkdev_direct_IO, + .swap_rw = blkdev_direct_IO, .migratepage = buffer_migrate_page_norefs, .is_dirty_writeback = buffer_check_dirty_writeback, .supports = AS_SUPPORTS_DIRECT_IO, diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index b479c97e42fc..9ffcefecb3bb 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -10852,15 +10852,10 @@ static int btrfs_swap_activate(struct swap_info_struct *sis, struct file *file, sis->highest_bit = bsi.nr_pages - 1; return bsi.nr_extents; } -#else -static void btrfs_swap_deactivate(struct file *file) -{ -} -static int btrfs_swap_activate(struct swap_info_struct *sis, struct file *file, - sector_t *span) +static ssize_t btrfs_swap_rw(struct kiocb *iocb, struct iov_iter *iter) { - return -EOPNOTSUPP; + return iomap_dio_rw(iocb, iter, &btrfs_dio_iomap_ops, NULL, 0); } #endif @@ -10944,8 +10939,11 @@ static const struct address_space_operations btrfs_aops = { #endif .set_page_dirty = btrfs_set_page_dirty, .error_remove_page = generic_error_remove_page, +#ifdef CONFIG_SWAP .swap_activate = btrfs_swap_activate, .swap_deactivate = btrfs_swap_deactivate, + .swap_rw = btrfs_swap_rw, +#endif .supports = AS_SUPPORTS_DIRECT_IO, }; diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 08d3541d8daa..3c14724d58a8 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3651,6 +3651,11 @@ static int ext4_iomap_swap_activate(struct 
swap_info_struct *sis, &ext4_iomap_report_ops); } +static ssize_t ext4_swap_rw(struct kiocb *iocb, struct iov_iter *iter) +{ + return iomap_dio_rw(iocb, iter, &ext4_iomap_ops, NULL, 0); +} + static const struct address_space_operations ext4_aops = { .readpage = ext4_readpage, .readahead = ext4_readahead, @@ -3666,6 +3671,7 @@ static const struct address_space_operations ext4_aops = { .is_partially_uptodate = block_is_partially_uptodate, .error_remove_page = generic_error_remove_page, .swap_activate = ext4_iomap_swap_activate, + .swap_rw = ext4_swap_rw, .supports = AS_SUPPORTS_DIRECT_IO, }; @@ -3683,6 +3689,7 @@ static const struct address_space_operations ext4_journalled_aops = { .is_partially_uptodate = block_is_partially_uptodate, .error_remove_page = generic_error_remove_page, .swap_activate = ext4_iomap_swap_activate, + .swap_rw = ext4_swap_rw, .supports = AS_SUPPORTS_DIRECT_IO, }; @@ -3701,6 +3708,7 @@ static const struct address_space_operations ext4_da_aops = { .is_partially_uptodate = block_is_partially_uptodate, .error_remove_page = generic_error_remove_page, .swap_activate = ext4_iomap_swap_activate, + .swap_rw = ext4_swap_rw, .supports = AS_SUPPORTS_DIRECT_IO, }; @@ -3710,6 +3718,7 @@ static const struct address_space_operations ext4_dax_aops = { .bmap = ext4_bmap, .invalidatepage = noop_invalidatepage, .swap_activate = ext4_iomap_swap_activate, + .swap_rw = ext4_swap_rw, .supports = AS_SUPPORTS_DIRECT_IO, }; diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 2a4570516591..23ade2cc8241 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -540,6 +540,13 @@ xfs_iomap_swapfile_activate( &xfs_read_iomap_ops); } +static ssize_t xfs_swap_rw(struct kiocb *iocb, struct iov_iter *iter) +{ + if (iocb->ki_flags & IOCB_WRITE) + return iomap_dio_rw(iocb, iter, &xfs_direct_write_iomap_ops, NULL, 0); + return iomap_dio_rw(iocb, iter, &xfs_read_iomap_ops, NULL, 0); +} + const struct address_space_operations xfs_address_space_operations = { .readpage = 
xfs_vm_readpage, .readahead = xfs_vm_readahead, @@ -552,6 +559,7 @@ const struct address_space_operations xfs_address_space_operations = { .is_partially_uptodate = iomap_is_partially_uptodate, .error_remove_page = generic_error_remove_page, .swap_activate = xfs_iomap_swapfile_activate, + .swap_rw = xfs_swap_rw, .supports = AS_SUPPORTS_DIRECT_IO, }; @@ -560,5 +568,6 @@ const struct address_space_operations xfs_dax_aops = { .set_page_dirty = __set_page_dirty_no_writeback, .invalidatepage = noop_invalidatepage, .swap_activate = xfs_iomap_swapfile_activate, + .swap_rw = xfs_swap_rw, .supports = AS_SUPPORTS_DIRECT_IO, }; From patchwork Fri Sep 24 17:19:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 12516253 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91194C433FE for ; Fri, 24 Sep 2021 17:20:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 730526127C for ; Fri, 24 Sep 2021 17:20:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1347777AbhIXRWB (ORCPT ); Fri, 24 Sep 2021 13:22:01 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:41004 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1347863AbhIXRVF (ORCPT ); Fri, 24 Sep 2021 13:21:05 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1632503972; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=Arocsm2H++jgTSxW2Ds/xKG2F2CIUBb6boQqvt8idwM=; b=RgWxWpZDiCLFEVRUoWcgfX6X7mPuRHf/A4zJ2CbptAjM6XnK5ltu8nL8rX56dxfQhmHHuO 6PbQbPTvhYQrZ5eYRImW74flWV4XYxWW66Z7VFtJqWbaQ50g4svt0wmjuaOFm6ce5cV6xn kk7QH1pLbOMvFOyQw6Y3XHHCMcGoDH0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-66-TYnDqK9rMcyp5PU33PijFw-1; Fri, 24 Sep 2021 13:19:28 -0400 X-MC-Unique: TYnDqK9rMcyp5PU33PijFw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 106EC5075D; Fri, 24 Sep 2021 17:19:27 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.44]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0991260BF1; Fri, 24 Sep 2021 17:19:23 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH v3 9/9] mm: Remove swap BIO paths and only use DIO paths From: David Howells To: willy@infradead.org, hch@lst.de, trond.myklebust@primarydata.com Cc: Jens Axboe , "Darrick J. 
Wong" , linux-block@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, dhowells@redhat.com, dhowells@redhat.com, darrick.wong@oracle.com, viro@zeniv.linux.org.uk, jlayton@kernel.org, torvalds@linux-foundation.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Fri, 24 Sep 2021 18:19:23 +0100 Message-ID: <163250396319.2330363.10564506508011638258.stgit@warthog.procyon.org.uk> In-Reply-To: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk> References: <163250387273.2330363.13240781819520072222.stgit@warthog.procyon.org.uk> User-Agent: StGit/0.23 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org Delete the BIO-generating swap read/write paths and always use ->swap_rw(). This puts the mapping layer in the filesystem. [!] ALSO: Add a compile-time knob to disable swap by asynchronous DIO, only using synchronous DIO. Async DIO doesn't seem to work, with ATA errors being chucked out by the swap-on-blockdev and swapfile-on-XFS. It also misbehaves on NFS. I have tested this with sync DIO on ext4-swapfile, xfs-swapfile, a raw blockdev and NFS. The first three work; NFS works for a while then grinds to a halt, chucking out lists of blocked sunrpc operations (I suspect it can't allocate memory somewhere). Suggested-by: Matthew Wilcox (Oracle) Signed-off-by: David Howells cc: Matthew Wilcox cc: Christoph Hellwig cc: Jens Axboe cc: Darrick J. 
Wong cc: linux-block@vger.kernel.org cc: linux-xfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org --- mm/page_io.c | 156 +++------------------------------------------------------ mm/swapfile.c | 4 + 2 files changed, 10 insertions(+), 150 deletions(-) diff --git a/mm/page_io.c b/mm/page_io.c index 8f1199d59162..b48318951380 100644 --- a/mm/page_io.c +++ b/mm/page_io.c @@ -26,6 +26,8 @@ #include #include +#define ONLY_USE_SYNC_DIO 1 + /* * Keep track of the kiocb we're using to do async DIO. We have to * refcount it until various things stop looking at the kiocb *after* @@ -42,30 +44,6 @@ static void swapfile_put_kiocb(struct swapfile_kiocb *ki) kfree(ki); } -static void end_swap_bio_write(struct bio *bio) -{ - struct page *page = bio_first_page_all(bio); - - if (bio->bi_status) { - SetPageError(page); - /* - * We failed to write the page out to swap-space. - * Re-dirty the page in order to avoid it being reclaimed. - * Also print a dire warning that things will go BAD (tm) - * very quickly. 
- * - * Also clear PG_reclaim to avoid rotate_reclaimable_page() - */ - set_page_dirty(page); - pr_alert_ratelimited("Write-error on swap-device (%u:%u:%llu)\n", - MAJOR(bio_dev(bio)), MINOR(bio_dev(bio)), - (unsigned long long)bio->bi_iter.bi_sector); - ClearPageReclaim(page); - } - end_page_writeback(page); - bio_put(bio); -} - static void swap_slot_free_notify(struct page *page) { struct swap_info_struct *sis; @@ -114,32 +92,6 @@ static void swap_slot_free_notify(struct page *page) } } -static void end_swap_bio_read(struct bio *bio) -{ - struct page *page = bio_first_page_all(bio); - struct task_struct *waiter = bio->bi_private; - - if (bio->bi_status) { - SetPageError(page); - ClearPageUptodate(page); - pr_alert_ratelimited("Read-error on swap-device (%u:%u:%llu)\n", - MAJOR(bio_dev(bio)), MINOR(bio_dev(bio)), - (unsigned long long)bio->bi_iter.bi_sector); - goto out; - } - - SetPageUptodate(page); - swap_slot_free_notify(page); -out: - unlock_page(page); - WRITE_ONCE(bio->bi_private, NULL); - bio_put(bio); - if (waiter) { - blk_wake_io_task(waiter); - put_task_struct(waiter); - } -} - int generic_swapfile_activate(struct swap_info_struct *sis, struct file *swap_file, sector_t *span) @@ -279,25 +231,6 @@ static inline void count_swpout_vm_event(struct page *page) count_vm_events(PSWPOUT, thp_nr_pages(page)); } -#if defined(CONFIG_MEMCG) && defined(CONFIG_BLK_CGROUP) -static void bio_associate_blkg_from_page(struct bio *bio, struct page *page) -{ - struct cgroup_subsys_state *css; - struct mem_cgroup *memcg; - - memcg = page_memcg(page); - if (!memcg) - return; - - rcu_read_lock(); - css = cgroup_e_css(memcg->css.cgroup, &io_cgrp_subsys); - bio_associate_blkg_from_css(bio, css); - rcu_read_unlock(); -} -#else -#define bio_associate_blkg_from_page(bio, page) do { } while (0) -#endif /* CONFIG_MEMCG && CONFIG_BLK_CGROUP */ - static void swapfile_write_complete(struct page *page, long ret) { if (ret == thp_size(page)) { @@ -364,7 +297,7 @@ static int 
swapfile_write(struct swap_info_struct *sis, iov_iter_bvec(&from, WRITE, &bv, 1, PAGE_SIZE); - if (wbc->sync_mode == WB_SYNC_ALL) + if (ONLY_USE_SYNC_DIO || wbc->sync_mode == WB_SYNC_ALL) return swapfile_write_sync(sis, page, wbc, &from); ki = kzalloc(sizeof(*ki), GFP_KERNEL); @@ -390,40 +323,17 @@ static int swapfile_write(struct swap_info_struct *sis, int __swap_writepage(struct page *page, struct writeback_control *wbc) { - struct bio *bio; - int ret; struct swap_info_struct *sis = page_swap_info(page); VM_BUG_ON_PAGE(!PageSwapCache(page), page); - if (data_race(sis->flags & SWP_FS_OPS)) - return swapfile_write(sis, page, wbc); - - ret = bdev_write_page(sis->bdev, swap_page_sector(page), page, wbc); - if (!ret) { - count_swpout_vm_event(page); - return 0; - } - - bio = bio_alloc(GFP_NOIO, 1); - bio_set_dev(bio, sis->bdev); - bio->bi_iter.bi_sector = swap_page_sector(page); - bio->bi_opf = REQ_OP_WRITE | REQ_SWAP | wbc_to_write_flags(wbc); - bio->bi_end_io = end_swap_bio_write; - bio_add_page(bio, page, thp_size(page), 0); - - bio_associate_blkg_from_page(bio, page); - count_swpout_vm_event(page); - set_page_writeback(page); - unlock_page(page); - submit_bio(bio); - - return 0; + return swapfile_write(sis, page, wbc); } static void swapfile_read_complete(struct page *page, long ret) { if (ret == page_size(page)) { count_vm_event(PSWPIN); + swap_slot_free_notify(page); SetPageUptodate(page); } else { SetPageError(page); @@ -473,7 +383,7 @@ static void swapfile_read(struct swap_info_struct *sis, struct page *page, iov_iter_bvec(&to, READ, &bv, 1, thp_size(page)); - if (synchronous) + if (ONLY_USE_SYNC_DIO || synchronous) return swapfile_read_sync(sis, page, &to); ki = kzalloc(sizeof(*ki), GFP_KERNEL); @@ -495,10 +405,7 @@ static void swapfile_read(struct swap_info_struct *sis, struct page *page, void swap_readpage(struct page *page, bool synchronous) { - struct bio *bio; struct swap_info_struct *sis = page_swap_info(page); - blk_qc_t qc; - struct gendisk *disk; 
unsigned long pflags; VM_BUG_ON_PAGE(!PageSwapCache(page) && !synchronous, page); @@ -515,58 +422,9 @@ void swap_readpage(struct page *page, bool synchronous) if (frontswap_load(page) == 0) { SetPageUptodate(page); unlock_page(page); - goto out; - } - - if (data_race(sis->flags & SWP_FS_OPS)) { + } else { swapfile_read(sis, page, synchronous); - goto out; } - - if (sis->flags & SWP_SYNCHRONOUS_IO) { - if (!bdev_read_page(sis->bdev, swap_page_sector(page), page)) { - if (trylock_page(page)) { - swap_slot_free_notify(page); - unlock_page(page); - } - - count_vm_event(PSWPIN); - goto out; - } - } - - bio = bio_alloc(GFP_KERNEL, 1); - bio_set_dev(bio, sis->bdev); - bio->bi_opf = REQ_OP_READ; - bio->bi_iter.bi_sector = swap_page_sector(page); - bio->bi_end_io = end_swap_bio_read; - bio_add_page(bio, page, thp_size(page), 0); - - disk = bio->bi_bdev->bd_disk; - /* - * Keep this task valid during swap readpage because the oom killer may - * attempt to access it in the page fault retry time check. 
- */ - if (synchronous) { - bio->bi_opf |= REQ_HIPRI; - get_task_struct(current); - bio->bi_private = current; - } - count_vm_event(PSWPIN); - bio_get(bio); - qc = submit_bio(bio); - while (synchronous) { - set_current_state(TASK_UNINTERRUPTIBLE); - if (!READ_ONCE(bio->bi_private)) - break; - - if (!blk_poll(disk->queue, qc, true)) - blk_io_schedule(); - } - __set_current_state(TASK_RUNNING); - bio_put(bio); - -out: psi_memstall_leave(&pflags); } diff --git a/mm/swapfile.c b/mm/swapfile.c index 22d10f713848..95d2571e3727 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -2918,6 +2918,8 @@ static int claim_swapfile(struct swap_info_struct *p, struct inode *inode) return -EINVAL; p->flags |= SWP_BLKDEV; } else if (S_ISREG(inode->i_mode)) { + if (!inode->i_mapping->a_ops->swap_rw) + return -EINVAL; p->bdev = inode->i_sb->s_bdev; } @@ -3165,7 +3167,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) name = NULL; goto bad_swap; } - swap_file = file_open_name(name, O_RDWR|O_LARGEFILE, 0); + swap_file = file_open_name(name, O_RDWR | O_LARGEFILE | O_DIRECT, 0); if (IS_ERR(swap_file)) { error = PTR_ERR(swap_file); swap_file = NULL;