From patchwork Fri Sep 15 21:32:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13387719 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EB889CD37AE for ; Fri, 15 Sep 2023 21:33:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237683AbjIOVd1 (ORCPT ); Fri, 15 Sep 2023 17:33:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55348 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237721AbjIOVdM (ORCPT ); Fri, 15 Sep 2023 17:33:12 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 394E8F7; Fri, 15 Sep 2023 14:33:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=fY/ZSJcX0nGtNhorp279tvkX1snuX5Kki9NEKyyEvZ0=; b=pMZOMSoplkznMz23hCcyCuvW+1 Q5X0DNXRoWSaQm50XrYed2ERGsB62yIGhxRFVJ5yWO7/WHTncq80QWO6bWPAxmnhgXa2arperZvms NTxp8MdrjSQkYV48LcOR50FiLW93j9OS1kLLdbjLQ8sW2W7WPR5rqgWAM7bEnXWkxT4QnkGneUGO3 E8rh9r/+rao7kSNdq1f6/C0v4BHWzLlSlNz46HMKDM6azvHhAKSm/tyv230E9q1BPCMDWUATuo9/Y IL5mKOlshmRBOtbl2P/dz7P55FbzsqPlB8QGk16lhaOPwcNLq1sYSSDisbfbiHzu6BE1WTaycpYL6 3hzMI4aQ==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.96 #2 (Red Hat Linux)) id 1qhGQq-00BQnH-0K; Fri, 15 Sep 2023 21:32:56 +0000 From: Luis Chamberlain To: hch@infradead.org, djwong@kernel.org, dchinner@redhat.com, kbusch@kernel.org, sagi@grimberg.me, axboe@fb.com Cc: willy@infradead.org, 
brauner@kernel.org, hare@suse.de, ritesh.list@gmail.com, rgoldwyn@suse.com, jack@suse.cz, ziy@nvidia.com, ryan.roberts@arm.com, patches@lists.linux.dev, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, p.raghav@samsung.com, da.gomez@samsung.com, dan.helmick@samsung.com, mcgrof@kernel.org Subject: [RFC v2 01/10] bdev: rename iomap aops Date: Fri, 15 Sep 2023 14:32:45 -0700 Message-Id: <20230915213254.2724586-2-mcgrof@kernel.org> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20230915213254.2724586-1-mcgrof@kernel.org> References: <20230915213254.2724586-1-mcgrof@kernel.org> MIME-Version: 1.0 Sender: Luis Chamberlain Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org

Allow buffer-head and iomap aops to co-exist on a build. Right now the iomap aops can only be used if you disable buffer-heads. In the near future we should be able to dynamically select at runtime the intended aops based on the nature of the filesystem and device requirements. So rename the iomap aops, and use the new name if buffer-heads is disabled. This introduces no functional changes.
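As a rough illustration only (not part of the patch), the build-time selection described above can be sketched in user space. The struct aops_sketch type and the bdev_default_aops() helper are hypothetical stand-ins; only the def_blk_aops / def_blk_aops_iomap names and the CONFIG_BUFFER_HEAD switch come from the diff below.

```c
#include <assert.h>

/* Hypothetical stand-in for struct address_space_operations. */
struct aops_sketch {
	const char *name;
};

#ifdef CONFIG_BUFFER_HEAD
/* Buffer-head table, only built when buffer-heads are enabled. */
static const struct aops_sketch def_blk_aops = { "buffer-head" };
#endif
/* With the rename, the iomap table is built in both configurations. */
static const struct aops_sketch def_blk_aops_iomap = { "iomap" };

/* Hypothetical helper mirroring the #ifdef added to bdev_alloc(). */
const struct aops_sketch *bdev_default_aops(void)
{
#ifdef CONFIG_BUFFER_HEAD
	return &def_blk_aops;
#else
	return &def_blk_aops_iomap;
#endif
}

/* Usage check for a build without CONFIG_BUFFER_HEAD. */
void bdev_default_aops_example(void)
{
#ifndef CONFIG_BUFFER_HEAD
	assert(bdev_default_aops() == &def_blk_aops_iomap);
#endif
}
```

Giving the two tables distinct names is what lets both exist in one build, so a later change can pick between them at runtime instead of at compile time.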
Signed-off-by: Luis Chamberlain --- block/bdev.c | 4 ++++ block/blk.h | 1 + block/fops.c | 14 +++++++------- 3 files changed, 12 insertions(+), 7 deletions(-) diff --git a/block/bdev.c b/block/bdev.c index f3b13aa1b7d4..6e62d8a992e6 100644 --- a/block/bdev.c +++ b/block/bdev.c @@ -392,7 +392,11 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno) return NULL; inode->i_mode = S_IFBLK; inode->i_rdev = 0; +#ifdef CONFIG_BUFFER_HEAD inode->i_data.a_ops = &def_blk_aops; +#else + inode->i_data.a_ops = &def_blk_aops_iomap; +#endif mapping_set_gfp_mask(&inode->i_data, GFP_USER); bdev = I_BDEV(inode); diff --git a/block/blk.h b/block/blk.h index 08a358bc0919..75e8deb9f458 100644 --- a/block/blk.h +++ b/block/blk.h @@ -473,6 +473,7 @@ long blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg); long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg); extern const struct address_space_operations def_blk_aops; +extern const struct address_space_operations def_blk_aops_iomap; int disk_register_independent_access_ranges(struct gendisk *disk); void disk_unregister_independent_access_ranges(struct gendisk *disk); diff --git a/block/fops.c b/block/fops.c index acff3d5d22d4..80a8430bcd69 100644 --- a/block/fops.c +++ b/block/fops.c @@ -455,13 +455,14 @@ const struct address_space_operations def_blk_aops = { .migrate_folio = buffer_migrate_folio_norefs, .is_dirty_writeback = buffer_check_dirty_writeback, }; -#else /* CONFIG_BUFFER_HEAD */ -static int blkdev_read_folio(struct file *file, struct folio *folio) + +#endif /* CONFIG_BUFFER_HEAD */ +static int blkdev_read_folio_iomap(struct file *file, struct folio *folio) { return iomap_read_folio(folio, &blkdev_iomap_ops); } -static void blkdev_readahead(struct readahead_control *rac) +static void blkdev_readahead_iomap(struct readahead_control *rac) { iomap_readahead(rac, &blkdev_iomap_ops); } @@ -492,18 +493,17 @@ static int blkdev_writepages(struct address_space *mapping, return 
iomap_writepages(mapping, wbc, &wpc, &blkdev_writeback_ops); } -const struct address_space_operations def_blk_aops = { +const struct address_space_operations def_blk_aops_iomap = { .dirty_folio = filemap_dirty_folio, .release_folio = iomap_release_folio, .invalidate_folio = iomap_invalidate_folio, - .read_folio = blkdev_read_folio, - .readahead = blkdev_readahead, + .read_folio = blkdev_read_folio_iomap, + .readahead = blkdev_readahead_iomap, .writepages = blkdev_writepages, .is_partially_uptodate = iomap_is_partially_uptodate, .error_remove_page = generic_error_remove_page, .migrate_folio = filemap_migrate_folio, }; -#endif /* CONFIG_BUFFER_HEAD */ /* * for a block special file file_inode(file)->i_size is zero From patchwork Fri Sep 15 21:32:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13387722 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2C772CD37BA for ; Fri, 15 Sep 2023 21:33:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237780AbjIOVd3 (ORCPT ); Fri, 15 Sep 2023 17:33:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39954 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237741AbjIOVdO (ORCPT ); Fri, 15 Sep 2023 17:33:14 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 57DFC196; Fri, 15 Sep 2023 14:33:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From: 
Reply-To:Content-Type:Content-ID:Content-Description; bh=Xh7vw1sTfFAIIC7IdEzQuwsZlBUzUqtsvV+NboGq3ok=; b=0ZL4HQvXbqYv52GWetOWj1OK8M qehjk4nrJDZukU81QHh6aQ1MpjX0zuIEdK/YETV8xtJeSmbP+fLIjxkHQr3NUETmJkeDDsxRq95tC gTXX0uy1loUeReAIlfUj/l1GoMohmYcpCBZs6l5RKmd8IymdTF4A4OGGE0RV+busvY7J+BbWHOi1h Ncpwxsfc/OZNdgKT07yd7fGALHiScD5AU2AFoMkxWwGI2hxxBROmEXf/Z3zFm9/d0xMXCoJg6/AgS ZCDcRbUYLeYCuz69jWzTUkc8rxWS9JErmsumGG0/ip2aQXTV9gS2PwV+plIt72OMKHM2Q1ekB0DKF XeYU/3sw==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.96 #2 (Red Hat Linux)) id 1qhGQq-00BQnM-0U; Fri, 15 Sep 2023 21:32:56 +0000 From: Luis Chamberlain To: hch@infradead.org, djwong@kernel.org, dchinner@redhat.com, kbusch@kernel.org, sagi@grimberg.me, axboe@fb.com Cc: willy@infradead.org, brauner@kernel.org, hare@suse.de, ritesh.list@gmail.com, rgoldwyn@suse.com, jack@suse.cz, ziy@nvidia.com, ryan.roberts@arm.com, patches@lists.linux.dev, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, p.raghav@samsung.com, da.gomez@samsung.com, dan.helmick@samsung.com, mcgrof@kernel.org Subject: [RFC v2 02/10] bdev: dynamically set aops to enable LBS support Date: Fri, 15 Sep 2023 14:32:46 -0700 Message-Id: <20230915213254.2724586-3-mcgrof@kernel.org> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20230915213254.2724586-1-mcgrof@kernel.org> References: <20230915213254.2724586-1-mcgrof@kernel.org> MIME-Version: 1.0 Sender: Luis Chamberlain Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org

In order to support large block devices where block size > page size, we need an aops which supports bs > ps, and the block layer needs this on its address space operations. We have two sets of aops, and only one of them supports bs > ps right now: the iomap aops used for the block device cache.
If the min order has not yet been set and the target filesystem does require bs > ps allow for the inode for the block device cache to use the iomap aops. Signed-off-by: Luis Chamberlain --- block/bdev.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/block/bdev.c b/block/bdev.c index 6e62d8a992e6..63b4d7dd8075 100644 --- a/block/bdev.c +++ b/block/bdev.c @@ -126,6 +126,7 @@ static void set_init_blocksize(struct block_device *bdev) { unsigned int bsize = bdev_logical_block_size(bdev); loff_t size = i_size_read(bdev->bd_inode); + int order, folio_order; while (bsize < PAGE_SIZE) { if (size & bsize) @@ -133,6 +134,13 @@ static void set_init_blocksize(struct block_device *bdev) bsize <<= 1; } bdev->bd_inode->i_blkbits = blksize_bits(bsize); + order = bdev->bd_inode->i_blkbits - PAGE_SHIFT; + folio_order = mapping_min_folio_order(bdev->bd_inode->i_mapping); + if (order > 0 && folio_order == 0) { + mapping_set_folio_orders(bdev->bd_inode->i_mapping, order, + MAX_PAGECACHE_ORDER); + bdev->bd_inode->i_data.a_ops = &def_blk_aops_iomap; + } } int set_blocksize(struct block_device *bdev, int size) From patchwork Fri Sep 15 21:32:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13387724 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C598CD37BF for ; Fri, 15 Sep 2023 21:33:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237805AbjIOVda (ORCPT ); Fri, 15 Sep 2023 17:33:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39940 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237735AbjIOVdN (ORCPT ); Fri, 15 Sep 2023 17:33:13 -0400 Received: from bombadil.infradead.org 
(bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 260CC195; Fri, 15 Sep 2023 14:33:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=2mSGek1i5EgJL9SrMaIAYJlLO6wzLTnTk7T6Som6xA4=; b=Ycrwpwu7ypuRwzODX1BDsAiH+x MWK9lId7QT9gWn+XlmAbxOB4bSVNOmm4G/+yV0lLPwfmgADJUAF6zlOeJAt9lnoexWN2CxXYWQlSK AFIPn/AhNXs/9E5DN2ZWTvC5XwMj56lS3cMbuYNaZRmtyOhukh50OezpfX13JwzSICvSlO4aVnBXE eJLz3FPaIhB6+mAVhKDpVdr0MWAKqnMCLIqfiPqd81Z2y5GmnlWO8EH6OpJlqSfMhlhfxniLs3CnJ PsDD68MzF3vYxWyk2ML392iUBym8SeaLa+cefz1Y+LbLyBi8HuhrEDG9tTGTdQ2SuIwFl3yhvSnVs cW26v5qg==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.96 #2 (Red Hat Linux)) id 1qhGQq-00BQnO-0d; Fri, 15 Sep 2023 21:32:56 +0000 From: Luis Chamberlain To: hch@infradead.org, djwong@kernel.org, dchinner@redhat.com, kbusch@kernel.org, sagi@grimberg.me, axboe@fb.com Cc: willy@infradead.org, brauner@kernel.org, hare@suse.de, ritesh.list@gmail.com, rgoldwyn@suse.com, jack@suse.cz, ziy@nvidia.com, ryan.roberts@arm.com, patches@lists.linux.dev, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, p.raghav@samsung.com, da.gomez@samsung.com, dan.helmick@samsung.com, mcgrof@kernel.org Subject: [RFC v2 03/10] bdev: increase bdev max blocksize depending on the aops used Date: Fri, 15 Sep 2023 14:32:47 -0700 Message-Id: <20230915213254.2724586-4-mcgrof@kernel.org> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20230915213254.2724586-1-mcgrof@kernel.org> References: <20230915213254.2724586-1-mcgrof@kernel.org> MIME-Version: 1.0 Sender: Luis Chamberlain Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org While buffer-heads is stuck at order 0, so PAGE_SIZE, iomap support is currently capped at 
MAX_PAGECACHE_ORDER as that is the cap also set on readahead. We match parity for the max allowed block size when using iomap. Signed-off-by: Luis Chamberlain --- block/bdev.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/block/bdev.c b/block/bdev.c index 63b4d7dd8075..0d685270cd34 100644 --- a/block/bdev.c +++ b/block/bdev.c @@ -143,10 +143,17 @@ static void set_init_blocksize(struct block_device *bdev) } } +static int bdev_bsize_limit(struct block_device *bdev) +{ + if (bdev->bd_inode->i_data.a_ops == &def_blk_aops_iomap) + return 1 << (PAGE_SHIFT + MAX_PAGECACHE_ORDER); + return PAGE_SIZE; +} + int set_blocksize(struct block_device *bdev, int size) { - /* Size must be a power of two, and between 512 and PAGE_SIZE */ - if (size > PAGE_SIZE || size < 512 || !is_power_of_2(size)) + /* Size must be a power of two, and between 512 and supported order */ + if (size > bdev_bsize_limit(bdev) || size < 512 || !is_power_of_2(size)) return -EINVAL; /* Size cannot be smaller than the size supported by the device */ From patchwork Fri Sep 15 21:32:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13387729 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E4388CD13D2 for ; Fri, 15 Sep 2023 21:33:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237834AbjIOVdc (ORCPT ); Fri, 15 Sep 2023 17:33:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39994 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237756AbjIOVdP (ORCPT ); Fri, 15 Sep 2023 17:33:15 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net 
(Postfix) with ESMTPS id 7F1FF197; Fri, 15 Sep 2023 14:33:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=tAHuZOTRnJkuAieR6OQU11hxZjIemhP5K5oO05h360E=; b=k1J5RsdsGwRvjm0rSSOFEnoVKZ uAidGm/Nh4/fnxXdDaj9BuBFL+fyFphAjuYXfUD3YUn9NM4jzz6HSyBmITkl2PaCf/7OJq0AfbvUS Sp2ErZ0Fe6eErMLGuvRZF7wbAW/XQXIE5LDJGdDrqbNy91/LUocM18NcAZHQJ2f2MVHq+3o7D2Djr zJulmaozeTK++xfokkoo4a63a+WYesGZ8ANM4E7sBvazvoSDDsPaBG8NRgYD+5+d4hBSqTJfuuMZ0 jYjkuwyatKrpmMMs01Xafjr6yadQZOhIW3LNIJVmV9r7cUUpfVE3Vr4T3Bc39VIQElJgmUzGGveOz W3uPNhIg==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.96 #2 (Red Hat Linux)) id 1qhGQq-00BQnQ-0o; Fri, 15 Sep 2023 21:32:56 +0000 From: Luis Chamberlain To: hch@infradead.org, djwong@kernel.org, dchinner@redhat.com, kbusch@kernel.org, sagi@grimberg.me, axboe@fb.com Cc: willy@infradead.org, brauner@kernel.org, hare@suse.de, ritesh.list@gmail.com, rgoldwyn@suse.com, jack@suse.cz, ziy@nvidia.com, ryan.roberts@arm.com, patches@lists.linux.dev, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, p.raghav@samsung.com, da.gomez@samsung.com, dan.helmick@samsung.com, mcgrof@kernel.org Subject: [RFC v2 04/10] filesystems: add filesystem buffer-head flag Date: Fri, 15 Sep 2023 14:32:48 -0700 Message-Id: <20230915213254.2724586-5-mcgrof@kernel.org> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20230915213254.2724586-1-mcgrof@kernel.org> References: <20230915213254.2724586-1-mcgrof@kernel.org> MIME-Version: 1.0 Sender: Luis Chamberlain Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Signed-off-by: Luis Chamberlain --- fs/adfs/super.c | 2 +- fs/affs/super.c | 2 +- fs/befs/linuxvfs.c | 2 +- fs/bfs/inode.c | 2 +- fs/efs/super.c | 2 +- fs/exfat/super.c | 2 +- fs/ext2/super.c | 2 +- 
fs/ext4/super.c | 7 ++++--- fs/f2fs/super.c | 2 +- fs/fat/namei_msdos.c | 2 +- fs/fat/namei_vfat.c | 2 +- fs/freevxfs/vxfs_super.c | 2 +- fs/gfs2/ops_fstype.c | 4 ++-- fs/hfs/super.c | 2 +- fs/hfsplus/super.c | 2 +- fs/isofs/inode.c | 2 +- fs/jfs/super.c | 2 +- fs/minix/inode.c | 2 +- fs/nilfs2/super.c | 2 +- fs/ntfs/super.c | 2 +- fs/ntfs3/super.c | 2 +- fs/ocfs2/super.c | 2 +- fs/omfs/inode.c | 2 +- fs/qnx4/inode.c | 2 +- fs/qnx6/inode.c | 2 +- fs/reiserfs/super.c | 2 +- fs/sysv/super.c | 4 ++-- fs/udf/super.c | 2 +- fs/ufs/super.c | 2 +- include/linux/fs.h | 1 + 30 files changed, 35 insertions(+), 33 deletions(-) diff --git a/fs/adfs/super.c b/fs/adfs/super.c index e8bfc38239cd..7c57fff29bb4 100644 --- a/fs/adfs/super.c +++ b/fs/adfs/super.c @@ -464,7 +464,7 @@ static struct file_system_type adfs_fs_type = { .name = "adfs", .mount = adfs_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("adfs"); diff --git a/fs/affs/super.c b/fs/affs/super.c index 58b391446ae1..2dc200010740 100644 --- a/fs/affs/super.c +++ b/fs/affs/super.c @@ -649,7 +649,7 @@ static struct file_system_type affs_fs_type = { .name = "affs", .mount = affs_mount, .kill_sb = affs_kill_sb, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("affs"); diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c index 9a16a51fbb88..64715f554034 100644 --- a/fs/befs/linuxvfs.c +++ b/fs/befs/linuxvfs.c @@ -982,7 +982,7 @@ static struct file_system_type befs_fs_type = { .name = "befs", .mount = befs_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("befs"); diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c index e6a76ae9eb44..e94d7c92b591 100644 --- a/fs/bfs/inode.c +++ b/fs/bfs/inode.c @@ -459,7 +459,7 @@ static struct file_system_type bfs_fs_type = { .name = "bfs", .mount = bfs_mount, .kill_sb = 
kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("bfs"); diff --git a/fs/efs/super.c b/fs/efs/super.c index b287f47c165b..35891a596267 100644 --- a/fs/efs/super.c +++ b/fs/efs/super.c @@ -40,7 +40,7 @@ static struct file_system_type efs_fs_type = { .name = "efs", .mount = efs_mount, .kill_sb = efs_kill_sb, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("efs"); diff --git a/fs/exfat/super.c b/fs/exfat/super.c index 17100b13dcdc..fbc2aa45d291 100644 --- a/fs/exfat/super.c +++ b/fs/exfat/super.c @@ -786,7 +786,7 @@ static struct file_system_type exfat_fs_type = { .init_fs_context = exfat_init_fs_context, .parameters = exfat_parameters, .kill_sb = exfat_kill_sb, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; static void exfat_inode_init_once(void *foo) diff --git a/fs/ext2/super.c b/fs/ext2/super.c index aaf3e3e88cb2..ca2c44ca2c29 100644 --- a/fs/ext2/super.c +++ b/fs/ext2/super.c @@ -1629,7 +1629,7 @@ static struct file_system_type ext2_fs_type = { .name = "ext2", .mount = ext2_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("ext2"); diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 38217422f938..b7fab39cd1ca 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -137,7 +137,7 @@ static struct file_system_type ext2_fs_type = { .init_fs_context = ext4_init_fs_context, .parameters = ext4_param_specs, .kill_sb = ext4_kill_sb, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("ext2"); MODULE_ALIAS("ext2"); @@ -153,7 +153,7 @@ static struct file_system_type ext3_fs_type = { .init_fs_context = ext4_init_fs_context, .parameters = ext4_param_specs, .kill_sb = ext4_kill_sb, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("ext3"); 
MODULE_ALIAS("ext3"); @@ -7314,7 +7314,8 @@ static struct file_system_type ext4_fs_type = { .init_fs_context = ext4_init_fs_context, .parameters = ext4_param_specs, .kill_sb = ext4_kill_sb, - .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_MGTIME, + .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_MGTIME | + FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("ext4"); diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c index fe25ff9cebbe..6ffc7a4d57d8 100644 --- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -4903,7 +4903,7 @@ static struct file_system_type f2fs_fs_type = { .name = "f2fs", .mount = f2fs_mount, .kill_sb = kill_f2fs_super, - .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP, + .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("f2fs"); diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c index 2116c486843b..785ee85cf77d 100644 --- a/fs/fat/namei_msdos.c +++ b/fs/fat/namei_msdos.c @@ -667,7 +667,7 @@ static struct file_system_type msdos_fs_type = { .name = "msdos", .mount = msdos_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP, + .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("msdos"); diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c index c4d00999a433..4fe85c569543 100644 --- a/fs/fat/namei_vfat.c +++ b/fs/fat/namei_vfat.c @@ -1212,7 +1212,7 @@ static struct file_system_type vfat_fs_type = { .name = "vfat", .mount = vfat_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP, + .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("vfat"); diff --git a/fs/freevxfs/vxfs_super.c b/fs/freevxfs/vxfs_super.c index 310d73e254df..d1e042784694 100644 --- a/fs/freevxfs/vxfs_super.c +++ b/fs/freevxfs/vxfs_super.c @@ -293,7 +293,7 @@ static struct file_system_type vxfs_fs_type = { .name = "vxfs", .mount = vxfs_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | 
FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("vxfs"); /* makes mount -t vxfs autoload the module */ MODULE_ALIAS("vxfs"); diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c index ecf789b7168c..f97d8480e665 100644 --- a/fs/gfs2/ops_fstype.c +++ b/fs/gfs2/ops_fstype.c @@ -1810,7 +1810,7 @@ static void gfs2_kill_sb(struct super_block *sb) struct file_system_type gfs2_fs_type = { .name = "gfs2", - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, .init_fs_context = gfs2_init_fs_context, .parameters = gfs2_fs_parameters, .kill_sb = gfs2_kill_sb, @@ -1820,7 +1820,7 @@ MODULE_ALIAS_FS("gfs2"); struct file_system_type gfs2meta_fs_type = { .name = "gfs2meta", - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, .init_fs_context = gfs2_meta_init_fs_context, .owner = THIS_MODULE, }; diff --git a/fs/hfs/super.c b/fs/hfs/super.c index 6764afa98a6f..1b31b8e2b7ef 100644 --- a/fs/hfs/super.c +++ b/fs/hfs/super.c @@ -461,7 +461,7 @@ static struct file_system_type hfs_fs_type = { .name = "hfs", .mount = hfs_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("hfs"); diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c index 1986b4f18a90..efc12ff05e0f 100644 --- a/fs/hfsplus/super.c +++ b/fs/hfsplus/super.c @@ -646,7 +646,7 @@ static struct file_system_type hfsplus_fs_type = { .name = "hfsplus", .mount = hfsplus_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("hfsplus"); diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c index 2ee21286ac8f..96d5b4dfec12 100644 --- a/fs/isofs/inode.c +++ b/fs/isofs/inode.c @@ -1564,7 +1564,7 @@ static struct file_system_type iso9660_fs_type = { .name = "iso9660", .mount = isofs_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("iso9660"); 
MODULE_ALIAS("iso9660"); diff --git a/fs/jfs/super.c b/fs/jfs/super.c index 2e2f7f6d36a0..052c277eab9f 100644 --- a/fs/jfs/super.c +++ b/fs/jfs/super.c @@ -906,7 +906,7 @@ static struct file_system_type jfs_fs_type = { .name = "jfs", .mount = jfs_do_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("jfs"); diff --git a/fs/minix/inode.c b/fs/minix/inode.c index df575473c1cc..ae2bbb610603 100644 --- a/fs/minix/inode.c +++ b/fs/minix/inode.c @@ -689,7 +689,7 @@ static struct file_system_type minix_fs_type = { .name = "minix", .mount = minix_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("minix"); diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c index a5d1fa4e7552..8e3737a3302a 100644 --- a/fs/nilfs2/super.c +++ b/fs/nilfs2/super.c @@ -1371,7 +1371,7 @@ struct file_system_type nilfs_fs_type = { .name = "nilfs2", .mount = nilfs_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("nilfs2"); diff --git a/fs/ntfs/super.c b/fs/ntfs/super.c index 56a7d5bd33e4..eb9e434b6541 100644 --- a/fs/ntfs/super.c +++ b/fs/ntfs/super.c @@ -3062,7 +3062,7 @@ static struct file_system_type ntfs_fs_type = { .name = "ntfs", .mount = ntfs_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("ntfs"); diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c index cfec5e0c7f66..40f09d5159d2 100644 --- a/fs/ntfs3/super.c +++ b/fs/ntfs3/super.c @@ -1738,7 +1738,7 @@ static struct file_system_type ntfs_fs_type = { .init_fs_context = ntfs_init_fs_context, .parameters = ntfs_fs_parameters, .kill_sb = ntfs3_kill_sb, - .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP, + .fs_flags = FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_BUFFER_HEADS, }; // clang-format on diff --git 
a/fs/ocfs2/super.c b/fs/ocfs2/super.c index 6b906424902b..50d5be9fb28f 100644 --- a/fs/ocfs2/super.c +++ b/fs/ocfs2/super.c @@ -1191,7 +1191,7 @@ static struct file_system_type ocfs2_fs_type = { .name = "ocfs2", .mount = ocfs2_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV|FS_RENAME_DOES_D_MOVE, + .fs_flags = FS_REQUIRES_DEV|FS_RENAME_DOES_D_MOVE | FS_BUFFER_HEADS, .next = NULL }; MODULE_ALIAS_FS("ocfs2"); diff --git a/fs/omfs/inode.c b/fs/omfs/inode.c index 2f8c1882f45c..e95b7d91fe35 100644 --- a/fs/omfs/inode.c +++ b/fs/omfs/inode.c @@ -607,7 +607,7 @@ static struct file_system_type omfs_fs_type = { .name = "omfs", .mount = omfs_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("omfs"); diff --git a/fs/qnx4/inode.c b/fs/qnx4/inode.c index a7171f5532a1..4f66b93397d7 100644 --- a/fs/qnx4/inode.c +++ b/fs/qnx4/inode.c @@ -389,7 +389,7 @@ static struct file_system_type qnx4_fs_type = { .name = "qnx4", .mount = qnx4_mount, .kill_sb = qnx4_kill_sb, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("qnx4"); diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c index 21f90d519f1a..68f650d67d3b 100644 --- a/fs/qnx6/inode.c +++ b/fs/qnx6/inode.c @@ -645,7 +645,7 @@ static struct file_system_type qnx6_fs_type = { .name = "qnx6", .mount = qnx6_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("qnx6"); diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c index 7eaf36b3de12..838046f43d11 100644 --- a/fs/reiserfs/super.c +++ b/fs/reiserfs/super.c @@ -2635,7 +2635,7 @@ struct file_system_type reiserfs_fs_type = { .name = "reiserfs", .mount = get_super_block, .kill_sb = reiserfs_kill_sb, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("reiserfs"); diff --git a/fs/sysv/super.c b/fs/sysv/super.c index 
3365a30dc1e0..92f08cb24a28 100644 --- a/fs/sysv/super.c +++ b/fs/sysv/super.c @@ -545,7 +545,7 @@ static struct file_system_type sysv_fs_type = { .name = "sysv", .mount = sysv_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("sysv"); @@ -554,7 +554,7 @@ static struct file_system_type v7_fs_type = { .name = "v7", .mount = v7_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("v7"); MODULE_ALIAS("v7"); diff --git a/fs/udf/super.c b/fs/udf/super.c index 928a04d9d9e0..9f160973cbe8 100644 --- a/fs/udf/super.c +++ b/fs/udf/super.c @@ -130,7 +130,7 @@ static struct file_system_type udf_fstype = { .name = "udf", .mount = udf_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("udf"); diff --git a/fs/ufs/super.c b/fs/ufs/super.c index 23377c1baed9..7829c325d011 100644 --- a/fs/ufs/super.c +++ b/fs/ufs/super.c @@ -1513,7 +1513,7 @@ static struct file_system_type ufs_fs_type = { .name = "ufs", .mount = ufs_mount, .kill_sb = kill_block_super, - .fs_flags = FS_REQUIRES_DEV, + .fs_flags = FS_REQUIRES_DEV | FS_BUFFER_HEADS, }; MODULE_ALIAS_FS("ufs"); diff --git a/include/linux/fs.h b/include/linux/fs.h index ebc7b8ac5008..35b661b48a49 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2336,6 +2336,7 @@ struct file_system_type { #define FS_DISALLOW_NOTIFY_PERM 16 /* Disable fanotify permission events */ #define FS_ALLOW_IDMAP 32 /* FS has been updated to handle vfs idmappings. */ #define FS_MGTIME 64 /* FS uses multigrain timestamps */ +#define FS_BUFFER_HEADS 128 /* filesystem requires buffer-heads */ #define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. 
*/ int (*init_fs_context)(struct fs_context *); const struct fs_parameter_spec *parameters; From patchwork Fri Sep 15 21:32:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13387723 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CE9C8CD37B6 for ; Fri, 15 Sep 2023 21:33:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237794AbjIOVda (ORCPT ); Fri, 15 Sep 2023 17:33:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39968 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237742AbjIOVdO (ORCPT ); Fri, 15 Sep 2023 17:33:14 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 06A57B8; Fri, 15 Sep 2023 14:33:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=ywg9faqG2BCnT1wKSQTUuxdInS3gshCAShEIiU6RHjQ=; b=qZS2ofRnqhN2ypoQAYEMnwcIRF HZUE+++FdwsDen9M6IhyoAQ+Y7IzzG+N6V5wdQXwdMhZwhUZOOOkXUyJn5dH5RZELYsDhvqfIyCnA ZjYUrf9zEzLClRiJF4F8RlTgboxEj7+SeYzajsKRRaiu9LdY0ic7q1ZAKdUj1flkfYBD5mf2ru396 o6BEKaGXoMw51wobZ87Zc80mRgP73fRjgEVpPf3KoqmioNBUm5UeaqAtE4ddC30ZIYqkwxU6RlDOC 28YjopC1+/y6tkHKPQpwssGCKPPuBy05mqO5WZzTekd+HM/K74fIDkiSrHBRW+fDUWOH85cjqgyBi gUde0Spg==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.96 #2 (Red Hat Linux)) id 1qhGQq-00BQnS-0y; Fri, 15 Sep 2023 21:32:56 +0000 From: Luis Chamberlain To: hch@infradead.org, djwong@kernel.org, 
dchinner@redhat.com, kbusch@kernel.org, sagi@grimberg.me, axboe@fb.com Cc: willy@infradead.org, brauner@kernel.org, hare@suse.de, ritesh.list@gmail.com, rgoldwyn@suse.com, jack@suse.cz, ziy@nvidia.com, ryan.roberts@arm.com, patches@lists.linux.dev, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, p.raghav@samsung.com, da.gomez@samsung.com, dan.helmick@samsung.com, mcgrof@kernel.org Subject: [RFC v2 05/10] bdev: allow to switch between bdev aops Date: Fri, 15 Sep 2023 14:32:49 -0700 Message-Id: <20230915213254.2724586-6-mcgrof@kernel.org> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20230915213254.2724586-1-mcgrof@kernel.org> References: <20230915213254.2724586-1-mcgrof@kernel.org> MIME-Version: 1.0 Sender: Luis Chamberlain Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Now that we have annotations for filesystems which require buffer-heads, we can use that flag to verify whether a filesystem can be used on a target block device which requires higher order folios. A filesystem which requires buffer-heads cannot be used on block devices which have a logical block size greater than PAGE_SIZE. We also want to allow using a buffer-head filesystem on a block device and, at a later time, unmounting it and switching to a filesystem which supports bs > PAGE_SIZE, even if the logical block size of the block device is PAGE_SIZE; this requires iomap. Provide helpers to do all these checks and reset the aops to iomap when needed. Leaving iomap in place after an umount would leave such block devices unusable for buffer-head filesystems, so we must also reset the aops to buffer-heads on unmount.
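The decision described above can be sketched in plain userspace C. This is only an illustration of the rule in the commit message, not the kernel code: the enum, the helper names and the boolean parameters are all hypothetical stand-ins for the bdev mapping's minimum folio order and the FS_BUFFER_HEADS flag.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for def_blk_aops and def_blk_aops_iomap. */
enum sketch_aops {
	SKETCH_AOPS_BUFFER_HEAD,
	SKETCH_AOPS_IOMAP,
};

/*
 * Pick the aops for a mount. min_folio_order != 0 models a block device
 * whose logical block size is greater than PAGE_SIZE; fs_needs_bh models
 * a filesystem flagged as requiring buffer-heads.
 * Returns 0 and stores the choice in *out, or -EINVAL if the pairing
 * cannot work.
 */
static int sketch_select_aops(unsigned int min_folio_order, bool fs_needs_bh,
			      enum sketch_aops *out)
{
	if (min_folio_order != 0 && fs_needs_bh)
		return -EINVAL;

	*out = fs_needs_bh ? SKETCH_AOPS_BUFFER_HEAD : SKETCH_AOPS_IOMAP;
	return 0;
}

/* On unmount the device goes back to buffer-heads, whatever was in use. */
static enum sketch_aops sketch_aops_after_unmount(void)
{
	return SKETCH_AOPS_BUFFER_HEAD;
}
```

Resetting on unmount is what lets a buffer-head filesystem be detected on the next mount of the same device.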
Signed-off-by: Luis Chamberlain --- block/bdev.c | 55 ++++++++++++++++++++++++++++++++++++++++++ fs/super.c | 3 ++- include/linux/blkdev.h | 7 ++++++ 3 files changed, 64 insertions(+), 1 deletion(-) diff --git a/block/bdev.c b/block/bdev.c index 0d685270cd34..bf3cfc02aaf9 100644 --- a/block/bdev.c +++ b/block/bdev.c @@ -150,6 +150,59 @@ static int bdev_bsize_limit(struct block_device *bdev) return PAGE_SIZE; } +#ifdef CONFIG_BUFFER_HEAD +static void bdev_aops_set(struct block_device *bdev, + const struct address_space_operations *aops) +{ + kill_bdev(bdev); + bdev->bd_inode->i_data.a_ops = aops; +} + +static void bdev_aops_sync(struct super_block *sb, struct block_device *bdev, + const struct address_space_operations *aops) +{ + sync_blockdev(bdev); + bdev_aops_set(bdev, aops); + kill_bdev(bdev); + bdev->bd_inode->i_data.a_ops = aops; +} + +void bdev_aops_reset(struct block_device *bdev) +{ + bdev_aops_set(bdev, &def_blk_aops); +} + +static int sb_bdev_aops_set(struct super_block *sb) +{ + struct block_device *bdev = sb->s_bdev; + + if (mapping_min_folio_order(bdev->bd_inode->i_mapping) != 0 && + sb->s_type->fs_flags & FS_BUFFER_HEADS) { + pr_warn_ratelimited( +"block device logical block size > PAGE_SIZE, buffer-head filesystem cannot be used.\n"); + return -EINVAL; + } + + /* + * We can switch back and forth, but we need to use buffer-heads + * first, otherwise a filesystem created which only uses iomap + * will have it sticky and we can't detect buffer-head filesystems + * on mount. 
+ */ + bdev_aops_sync(sb, bdev, &def_blk_aops); + if (sb->s_type->fs_flags & FS_BUFFER_HEADS) + return 0; + + bdev_aops_sync(sb, bdev, &def_blk_aops_iomap); + return 0; +} +#else +static int sb_bdev_aops_set(struct super_block *sb) +{ + return 0; +} +#endif + int set_blocksize(struct block_device *bdev, int size) { /* Size must be a power of two, and between 512 and supported order */ @@ -173,6 +226,8 @@ EXPORT_SYMBOL(set_blocksize); int sb_set_blocksize(struct super_block *sb, int size) { + if (sb_bdev_aops_set(sb)) + return 0; if (set_blocksize(sb->s_bdev, size)) return 0; /* If we get here, we know size is power of two diff --git a/fs/super.c b/fs/super.c index 816a22a5cad1..eb269c9489cb 100644 --- a/fs/super.c +++ b/fs/super.c @@ -1649,12 +1649,13 @@ void kill_block_super(struct super_block *sb) generic_shutdown_super(sb); if (bdev) { sync_blockdev(bdev); + bdev_aops_reset(bdev); blkdev_put(bdev, sb); } } EXPORT_SYMBOL(kill_block_super); -#endif +#endif /* CONFIG_BLOCK */ struct dentry *mount_nodev(struct file_system_type *fs_type, int flags, void *data, diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index eef450f25982..738a879a0786 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -1503,6 +1503,13 @@ void sync_bdevs(bool wait); void bdev_statx_dioalign(struct inode *inode, struct kstat *stat); void printk_all_partitions(void); int __init early_lookup_bdev(const char *pathname, dev_t *dev); +#ifdef CONFIG_BUFFER_HEAD +void bdev_aops_reset(struct block_device *bdev); +#else +static inline void bdev_aops_reset(struct block_device *bdev) +{ +} +#endif #else static inline void invalidate_bdev(struct block_device *bdev) { From patchwork Fri Sep 15 21:32:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13387726 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org 
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6DB89CD37AE for ; Fri, 15 Sep 2023 21:33:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237831AbjIOVdc (ORCPT ); Fri, 15 Sep 2023 17:33:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39924 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237728AbjIOVdN (ORCPT ); Fri, 15 Sep 2023 17:33:13 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DFB0B18D; Fri, 15 Sep 2023 14:33:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=thUa+GI8LogBnk2itZ6HfrgnIy9fcg1//Qc9u4UGwh8=; b=FylAZt6NRWKeXs49zjVqb5xpZE MQ1ezlENpkCtY1xG8O9m+S6v3geuBLsuoU/jZZ6rOdB1Uri4HZNCRooRQVJkI8iiVZvRZb0SVlvUC t1N1i7YlV9BJnxC7pbJi/AeGw6Bauh6rpOThcOvQpHro8p9gSJzt6bWPCox4T/DKPjbTEgztYEG0Q ovruUkG34tjzemunuH6MjFwxtgishAiMLG7Goa84fzGJSWMlbuBL5yGFkPBdNAeKX3PSQM9khO7BM qlgx9+sV2oGyt3gWnXEvOMlapwuJvm0PZ3sj/e2AoKG7wERS0hYHBEOe0M2SD/9LoJn7Djncb7jHi MW5A3D3A==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.96 #2 (Red Hat Linux)) id 1qhGQq-00BQnU-17; Fri, 15 Sep 2023 21:32:56 +0000 From: Luis Chamberlain To: hch@infradead.org, djwong@kernel.org, dchinner@redhat.com, kbusch@kernel.org, sagi@grimberg.me, axboe@fb.com Cc: willy@infradead.org, brauner@kernel.org, hare@suse.de, ritesh.list@gmail.com, rgoldwyn@suse.com, jack@suse.cz, ziy@nvidia.com, ryan.roberts@arm.com, patches@lists.linux.dev, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, p.raghav@samsung.com, da.gomez@samsung.com, dan.helmick@samsung.com, 
mcgrof@kernel.org Subject: [RFC v2 06/10] bdev: simplify coexistance Date: Fri, 15 Sep 2023 14:32:50 -0700 Message-Id: <20230915213254.2724586-7-mcgrof@kernel.org> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20230915213254.2724586-1-mcgrof@kernel.org> References: <20230915213254.2724586-1-mcgrof@kernel.org> MIME-Version: 1.0 Sender: Luis Chamberlain Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Now that the bdev inode aops can easily switch between buffer-heads and iomap, we can simplify our requirement. This also enables usage of xfs on devices which, for example, have a physical block size greater than the page size but a logical block size of only 4k or lower. Signed-off-by: Luis Chamberlain --- block/bdev.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/block/bdev.c b/block/bdev.c index bf3cfc02aaf9..d9236eb149a4 100644 --- a/block/bdev.c +++ b/block/bdev.c @@ -143,13 +143,6 @@ static void set_init_blocksize(struct block_device *bdev) } } -static int bdev_bsize_limit(struct block_device *bdev) -{ - if (bdev->bd_inode->i_data.a_ops == &def_blk_aops_iomap) - return 1 << (PAGE_SHIFT + MAX_PAGECACHE_ORDER); - return PAGE_SIZE; -} - #ifdef CONFIG_BUFFER_HEAD static void bdev_aops_set(struct block_device *bdev, const struct address_space_operations *aops) @@ -205,8 +198,10 @@ static int sb_bdev_aops_set(struct super_block *sb) int set_blocksize(struct block_device *bdev, int size) { + unsigned int bdev_size_limit = 1 << (PAGE_SHIFT + MAX_PAGECACHE_ORDER); + /* Size must be a power of two, and between 512 and supported order */ - if (size > bdev_size_limit(bdev) || size < 512 || !is_power_of_2(size)) + if (size > bdev_size_limit || size < 512 || !is_power_of_2(size)) return -EINVAL; /* Size cannot be smaller than the size supported by the device */ From patchwork Fri Sep 15 21:32:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis
Chamberlain X-Patchwork-Id: 13387728 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D40D0CD13CF for ; Fri, 15 Sep 2023 21:33:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237842AbjIOVdd (ORCPT ); Fri, 15 Sep 2023 17:33:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39998 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237762AbjIOVdQ (ORCPT ); Fri, 15 Sep 2023 17:33:16 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D37D5B8; Fri, 15 Sep 2023 14:33:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=hsCbEbpQePNsNlaxz4E9IonlUtvi5CZw112iATZuYCc=; b=aZkcYzqyB48MzWjwPzdWMDhnaS 36unGgvgZVmx44ucFkaeDQD30sty+H/nNJyNc4PVX0IXLz6eD9/cB7me475sIol2vyN7fYQx1l1Am mSvqthZp/qFLxkk+BeosmOvQUZ8A1E8FGSjBLsIf9RsQC7plzVR1SbO6qZNYLP5CwXOfod9R+IjDE +Rge5ByL8PoOTza2K1k8kHSwNczsNsE42mHkUBxS0CMU3Hk/sFrhiJjzpe40voK+YXIe9YMVPzELz b+Po4D0c5miORXLMmY9pFMtmWJwfnOpvAt3A7dOo8k2lyjC8WDy0QyFTJDCWMuLJCj85gIOPDzRoI SHnevtWQ==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.96 #2 (Red Hat Linux)) id 1qhGQq-00BQnW-1G; Fri, 15 Sep 2023 21:32:56 +0000 From: Luis Chamberlain To: hch@infradead.org, djwong@kernel.org, dchinner@redhat.com, kbusch@kernel.org, sagi@grimberg.me, axboe@fb.com Cc: willy@infradead.org, brauner@kernel.org, hare@suse.de, ritesh.list@gmail.com, rgoldwyn@suse.com, jack@suse.cz, ziy@nvidia.com, ryan.roberts@arm.com, patches@lists.linux.dev, 
linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, p.raghav@samsung.com, da.gomez@samsung.com, dan.helmick@samsung.com, mcgrof@kernel.org Subject: [RFC v2 07/10] nvme: enhance max supported LBA format check Date: Fri, 15 Sep 2023 14:32:51 -0700 Message-Id: <20230915213254.2724586-8-mcgrof@kernel.org> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20230915213254.2724586-1-mcgrof@kernel.org> References: <20230915213254.2724586-1-mcgrof@kernel.org> MIME-Version: 1.0 Sender: Luis Chamberlain Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Only pure-iomap configurations, that is, systems where CONFIG_BUFFER_HEAD is disabled, can enable NVMe devices with LBA formats with a block size larger than PAGE_SIZE. Systems with buffer-heads enabled cannot currently make use of these devices, but this will eventually get fixed. We cap the maximum supported LBA shift at 19 (512 KiB), as support for the 1 MiB LBA format still needs some work. Also, add a debug module parameter, nvme_core.debug_large_lbas, to let folks shoot themselves in the foot if they want to test and expand support beyond what is supported; it is only to be used on pure-iomap configurations.
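The check described above can be modelled in userspace as follows. This is a minimal sketch, assuming 4 KiB pages; the SKETCH_* constants and the function name are hypothetical, and the two booleans stand in for IS_ENABLED(CONFIG_BUFFER_HEAD) and the debug_large_lbas module parameter.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT		12	/* assumption: 4 KiB pages */
#define SKETCH_MAX_SHIFT_SUPPORTED	19	/* 512 KiB cap from the text above */

/*
 * Return true if an LBA format with the given shift would be accepted.
 * buffer_heads models a kernel built with buffer-head support;
 * debug_large_lbas models the debug module parameter.
 */
static bool sketch_lba_shift_supported(unsigned int lba_shift,
				       bool buffer_heads,
				       bool debug_large_lbas)
{
	/* Anything up to the page size is always fine. */
	if (lba_shift <= SKETCH_PAGE_SHIFT)
		return true;

	/* Larger LBAs need a pure-iomap configuration. */
	if (buffer_heads)
		return false;

	if (lba_shift <= SKETCH_MAX_SHIFT_SUPPORTED)
		return true;

	/* Beyond the cap only when explicitly forced for testing. */
	return debug_large_lbas;
}
```

With these assumptions a 16 KiB LBA format (shift 14) is accepted only without buffer-heads, and a 1 MiB format (shift 20) only when the debug knob is set as well.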
Signed-off-by: Luis Chamberlain Signed-off-by: Pankaj Raghav --- drivers/nvme/host/core.c | 34 +++++++++++++++++++++++++++++----- 1 file changed, 29 insertions(+), 5 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index f3a01b79148c..0365f260c514 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -88,6 +88,10 @@ module_param(apst_secondary_latency_tol_us, ulong, 0644); MODULE_PARM_DESC(apst_secondary_latency_tol_us, "secondary APST latency tolerance in us"); +static bool debug_large_lbas; +module_param(debug_large_lbas, bool, 0644); +MODULE_PARM_DESC(debug_large_lbas, "allow LBAs > PAGE_SIZE"); + /* * nvme_wq - hosts nvme related works that are not reset or delete * nvme_reset_wq - hosts nvme reset works @@ -1878,6 +1882,29 @@ static void nvme_set_queue_limits(struct nvme_ctrl *ctrl, blk_queue_write_cache(q, vwc, vwc); } +/* XXX: shift 20 (1 MiB LBA) crashes on pure-iomap */ +#define NVME_MAX_SHIFT_SUPPORTED 19 + +static bool nvme_lba_shift_supported(struct nvme_ns *ns) +{ + if (ns->lba_shift <= PAGE_SHIFT) + return true; + + if (IS_ENABLED(CONFIG_BUFFER_HEAD)) + return false; + + if (ns->lba_shift <= NVME_MAX_SHIFT_SUPPORTED) + return true; + + if (debug_large_lbas) { + dev_warn(ns->ctrl->device, + "forcibly allowing LBAS > 1 MiB due to nvme_core.debug_large_lbas -- use at your own risk\n"); + return true; + } + + return false; +} + static void nvme_update_disk_info(struct gendisk *disk, struct nvme_ns *ns, struct nvme_id_ns *id) { @@ -1885,13 +1912,10 @@ static void nvme_update_disk_info(struct gendisk *disk, u32 bs = 1U << ns->lba_shift; u32 atomic_bs, phys_bs, io_opt = 0; - /* - * The block layer can't support LBA sizes larger than the page size - * yet, so catch this early and don't allow block I/O. 
- */ - if (ns->lba_shift > PAGE_SHIFT) { + if (!nvme_lba_shift_supported(ns)) { capacity = 0; bs = (1 << 9); + dev_warn(ns->ctrl->device, "I'm sorry dave, I'm afraid I can't do that\n"); } blk_integrity_unregister(disk); From patchwork Fri Sep 15 21:32:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13387721 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6EE63CD37B3 for ; Fri, 15 Sep 2023 21:33:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237769AbjIOVd2 (ORCPT ); Fri, 15 Sep 2023 17:33:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39910 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237722AbjIOVdM (ORCPT ); Fri, 15 Sep 2023 17:33:12 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D1479B8; Fri, 15 Sep 2023 14:33:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=R2D5ERHgA50rjnGDJd3UTlmr28nE94IPRsx6Q30NTNM=; b=IioY1+MgNEocbawTB+OnCHuQPq m334jTZ/rCCDiPzly1hKTAnORYCPXPfXiFu35XxAtsJh70qoLVkXLWWdO79+mD7WR432rlQebZrTL cYwrdeEgZUkSN3sqaVas6b/eMzrthfiXWQnmOA1+sT7ytv5z33SyWfO59607OFtHrFeJ2lZSGVkxH +MEHOarKFO7fXw36ZP5Fi2MZv/HiawQB8OuB1KcPEsdCTHz3XDjrAVwB0n8rNubNFaQx8l88wsHcF lQqTYpYp1EkQybTcS+lydA1PawBqxqbzMb72Itu7Xlcc6pLwY6e97lZ4gggq007lKBQE0yY/f5xQo gyCQ8WmQ==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.96 #2 (Red Hat Linux)) id 
1qhGQq-00BQnY-1P; Fri, 15 Sep 2023 21:32:56 +0000 From: Luis Chamberlain To: hch@infradead.org, djwong@kernel.org, dchinner@redhat.com, kbusch@kernel.org, sagi@grimberg.me, axboe@fb.com Cc: willy@infradead.org, brauner@kernel.org, hare@suse.de, ritesh.list@gmail.com, rgoldwyn@suse.com, jack@suse.cz, ziy@nvidia.com, ryan.roberts@arm.com, patches@lists.linux.dev, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, p.raghav@samsung.com, da.gomez@samsung.com, dan.helmick@samsung.com, mcgrof@kernel.org Subject: [RFC v2 08/10] nvme: add awun / nawun sanity check Date: Fri, 15 Sep 2023 14:32:52 -0700 Message-Id: <20230915213254.2724586-9-mcgrof@kernel.org> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20230915213254.2724586-1-mcgrof@kernel.org> References: <20230915213254.2724586-1-mcgrof@kernel.org> MIME-Version: 1.0 Sender: Luis Chamberlain Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org AWUN/NAWUN control the atomicity of command execution in relation to other commands. They impose inter-command serialization of writing of blocks of data to the NVM and prevent blocks of data ending up on the NVM containing partial data from one new command and partial data from one or more other new commands. Parse awun / nawun to verify at least the physical block size exposed is not greater than this value. The special case of awun / nawun == 0xffff tells us we can ramp up to mdts. 
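As a rough illustration of the arithmetic described above, the sketch below converts a raw awun / nawun field into bytes and clamps the physical block size to it. It is one reading of the commit message, not the patch itself: the helper names and parameters are hypothetical, the field is treated as a 0's based count of logical blocks, and the raw value is compared against 0xffff before one is added, whereas the diff in this patch compares after adding one.

```c
#include <stdint.h>

/*
 * Atomic write unit in bytes.
 * raw_awun:   awun or nawun as read from the device, 0's based.
 * lba_size:   logical block size in bytes.
 * mdts_bytes: maximum data transfer size in bytes.
 */
static uint32_t sketch_awun_bytes(uint16_t raw_awun, uint32_t lba_size,
				  uint32_t mdts_bytes)
{
	/* Special case: 0xffff means we can ramp up to mdts. */
	if (raw_awun == 0xffff)
		return mdts_bytes;

	return ((uint32_t)raw_awun + 1) * lba_size;
}

/* The exposed physical block size must not exceed the atomic write unit. */
static uint32_t sketch_clamp_phys_bs(uint32_t phys_bs, uint32_t awun_bs)
{
	return awun_bs < phys_bs ? awun_bs : phys_bs;
}
```

For example, a raw value of 7 with 512-byte logical blocks gives a 4096-byte atomic write unit, so a 16 KiB physical block size would be clamped to 4 KiB.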
Suggested-by: Dan Helmick Signed-off-by: Luis Chamberlain --- drivers/nvme/host/core.c | 21 +++++++++++++++++++++ drivers/nvme/host/nvme.h | 1 + 2 files changed, 22 insertions(+) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 0365f260c514..7a3c51ac13bd 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1911,6 +1911,7 @@ static void nvme_update_disk_info(struct gendisk *disk, sector_t capacity = nvme_lba_to_sect(ns, le64_to_cpu(id->nsze)); u32 bs = 1U << ns->lba_shift; u32 atomic_bs, phys_bs, io_opt = 0; + u32 awun = 0, awun_bs = 0; if (!nvme_lba_shift_supported(ns)) { capacity = 0; @@ -1931,6 +1932,15 @@ static void nvme_update_disk_info(struct gendisk *disk, atomic_bs = (1 + le16_to_cpu(id->nawupf)) * bs; else atomic_bs = (1 + ns->ctrl->subsys->awupf) * bs; + if (id->nsfeat & NVME_NS_FEAT_ATOMICS && id->nawun) + awun = (1 + le16_to_cpu(id->nawun)); + else + awun = (1 + ns->ctrl->subsys->awun); + /* Indicates MDTS can be used */ + if (awun == 0xffff) + awun_bs = ns->ctrl->max_hw_sectors << SECTOR_SHIFT; + else + awun_bs = awun * bs; } if (id->nsfeat & NVME_NS_FEAT_IO_OPT) { @@ -1940,6 +1950,16 @@ static void nvme_update_disk_info(struct gendisk *disk, io_opt = bs * (1 + le16_to_cpu(id->nows)); } + if (awun) { + phys_bs = min(awun_bs, phys_bs); + + /* + * npwg and nows could be > awun, in such cases users should + * be aware of out of order reads/writes as npwg and nows + * are purely performance optimizations. 
+ */ + } + blk_queue_logical_block_size(disk->queue, bs); /* * Linux filesystems assume writing a single physical block is @@ -2785,6 +2805,7 @@ static int nvme_init_subsystem(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id) kfree(subsys); return -EINVAL; } + subsys->awun = le16_to_cpu(id->awun); subsys->awupf = le16_to_cpu(id->awupf); nvme_mpath_default_iopolicy(subsys); diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index f35647c470af..071ec52d83ea 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -410,6 +410,7 @@ struct nvme_subsystem { u8 cmic; enum nvme_subsys_type subtype; u16 vendor_id; + u16 awun; u16 awupf; /* 0's based awupf value. */ struct ida ns_ida; #ifdef CONFIG_NVME_MULTIPATH From patchwork Fri Sep 15 21:32:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13387725 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DBA6AC2BA18 for ; Fri, 15 Sep 2023 21:33:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237800AbjIOVda (ORCPT ); Fri, 15 Sep 2023 17:33:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39938 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237734AbjIOVdN (ORCPT ); Fri, 15 Sep 2023 17:33:13 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 81F6E193; Fri, 15 Sep 2023 14:33:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From: 
Reply-To:Content-Type:Content-ID:Content-Description; bh=klgxghukoLB077BcGmD37gwuwN5TiuVyvr8uJe/6qJU=; b=xTeTtvE1PB+sCk5jYN9CsRNDcC oq2rbg1f2LevVKTppTb9rvS7a6d9knnti9jQdDdUKPFipV/7xRYRi9bXw1loPY9YRQLYncMi0eTH0 eSA7OvBPVYDjb/pTBadQaMypikILS+6vmRakqpcosKTLfq6mA5A9nT5TFdX/6+BWkToM79mGvZ7C/ tfJjEHFIAWrCbPalQvlsnGYZpiYB2i4a7P6av3qMHhcJxJ08e6/CKkcCSA5aVBYA0JrWxd6Hx1R0v HvwVgB8hOjoe2H7UeTLiZRP+Ab6e+CWQp+OvTodpFSnvw6ER3Zk057IRQNPe4YstyGZWoX/Y8EDtM kClVBvPA==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.96 #2 (Red Hat Linux)) id 1qhGQq-00BQna-1Y; Fri, 15 Sep 2023 21:32:56 +0000 From: Luis Chamberlain To: hch@infradead.org, djwong@kernel.org, dchinner@redhat.com, kbusch@kernel.org, sagi@grimberg.me, axboe@fb.com Cc: willy@infradead.org, brauner@kernel.org, hare@suse.de, ritesh.list@gmail.com, rgoldwyn@suse.com, jack@suse.cz, ziy@nvidia.com, ryan.roberts@arm.com, patches@lists.linux.dev, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, p.raghav@samsung.com, da.gomez@samsung.com, dan.helmick@samsung.com, mcgrof@kernel.org Subject: [RFC v2 09/10] nvme: add nvme_core.debug_large_atomics to force high awun as phys_bs Date: Fri, 15 Sep 2023 14:32:53 -0700 Message-Id: <20230915213254.2724586-10-mcgrof@kernel.org> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20230915213254.2724586-1-mcgrof@kernel.org> References: <20230915213254.2724586-1-mcgrof@kernel.org> MIME-Version: 1.0 Sender: Luis Chamberlain Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org A drive with atomic write support should have awun / nawun defined. For these drives it should be possible to experiment safely with LBS support up to the awun / nawun settings, provided you are completely ignoring power failure situations. Add support to experiment with this. The rationale for limiting to awun / nawun is to avoid races with other in-flight commands which otherwise could cause unexpected results.
This also means this debug module parameter feature is not supported if your drive does not support atomics / awun / nawun. Suggested-by: Dan Helmick Signed-off-by: Luis Chamberlain --- drivers/nvme/host/core.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 7a3c51ac13bd..c1f9d8e3ea93 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -92,6 +92,10 @@ static bool debug_large_lbas; module_param(debug_large_lbas, bool, 0644); MODULE_PARM_DESC(debug_large_lbas, "allow LBAs > PAGE_SIZE"); +static unsigned int debug_large_atomics; +module_param(debug_large_atomics, uint, 0644); +MODULE_PARM_DESC(debug_large_atomics, "allow large atomics <= awun or nawun <= mdts"); + /* * nvme_wq - hosts nvme related works that are not reset or delete * nvme_reset_wq - hosts nvme reset works @@ -1958,6 +1962,20 @@ static void nvme_update_disk_info(struct gendisk *disk, * be aware of out of order reads/writes as npwg and nows * are purely performance optimizations. */ + + /* + * If you're not concerned about power failure, in theory, + * you should be able to experiment up to awun rather safely. + * + * Ignore qemu awun value of 1. 
+ */ + if (debug_large_atomics && awun != 1) { + debug_large_atomics = min(awun_bs, debug_large_atomics); + phys_bs = atomic_bs = debug_large_atomics; + dev_info(ns->ctrl->device, + "Forcing large atomic: %u (awun_bs: %u awun: %u)\n", + debug_large_atomics, awun_bs, awun); + } } blk_queue_logical_block_size(disk->queue, bs); From patchwork Fri Sep 15 21:32:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13387727 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BE853C2BA1B for ; Fri, 15 Sep 2023 21:33:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237813AbjIOVdb (ORCPT ); Fri, 15 Sep 2023 17:33:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39934 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237730AbjIOVdN (ORCPT ); Fri, 15 Sep 2023 17:33:13 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 61F7B18E; Fri, 15 Sep 2023 14:33:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=YAt8eqD7AwRW7noJ7ot53OuTQiWk3/Uzs5ZyXdvJRXo=; b=ChjtTsEwNBGrcoqmRLmGJI9MF/ Znds+peolza0YOyDfbzK7sYjlaorMElTJeuVXXemAGZ/77XsGsWLy4LiwZ3aNSzf7NUgvVJJBGbA1 hmCGMflxN19Oe4EI3C7aKcRcio6PaX5Na8PK1imIjJ8kOMelI0FTQZ7NT+1XpQNn+Vgw+fnPaFQte QnhQwpAvQ4HFE39VSr0ZaN5+ixCaC6SXpcaZFxK1+sKdZo2gZoZCYrf8yjGAxoQS9F3QaCqG3cV0Q 
6sPUZ/gg1WftlzkFMKrXlPvvm0+qh0R4kHfYP7ZAlWL+jp2OCGX2QSrXPUFhcQQXub2SKy5eQNaol SQVAqgWg==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.96 #2 (Red Hat Linux)) id 1qhGQq-00BQnc-1h; Fri, 15 Sep 2023 21:32:56 +0000 From: Luis Chamberlain To: hch@infradead.org, djwong@kernel.org, dchinner@redhat.com, kbusch@kernel.org, sagi@grimberg.me, axboe@fb.com Cc: willy@infradead.org, brauner@kernel.org, hare@suse.de, ritesh.list@gmail.com, rgoldwyn@suse.com, jack@suse.cz, ziy@nvidia.com, ryan.roberts@arm.com, patches@lists.linux.dev, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, p.raghav@samsung.com, da.gomez@samsung.com, dan.helmick@samsung.com, mcgrof@kernel.org Subject: [RFC v2 10/10] nvme: enable LBS support Date: Fri, 15 Sep 2023 14:32:54 -0700 Message-Id: <20230915213254.2724586-11-mcgrof@kernel.org> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20230915213254.2724586-1-mcgrof@kernel.org> References: <20230915213254.2724586-1-mcgrof@kernel.org> MIME-Version: 1.0 Sender: Luis Chamberlain Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Now that the block device cache allows LBS, and we have buffered-io support, and we support both buffer-heads and iomap aops on the block device cache through the inode dynamically, enable support for LBA formats > PAGE_SIZE. Signed-off-by: Luis Chamberlain --- drivers/nvme/host/core.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index c1f9d8e3ea93..724d3c342344 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -1894,9 +1894,6 @@ static bool nvme_lba_shift_supported(struct nvme_ns *ns) if (ns->lba_shift <= PAGE_SHIFT) return true; - if (IS_ENABLED(CONFIG_BUFFER_HEAD)) - return false; - if (ns->lba_shift <= NVME_MAX_SHIFT_SUPPORTED) return true;