[1/1] btrfs-progs: mkfs: Enforce 4k sectorsize by default

Message ID 20230322221714.2702819-2-neal@gompa.dev (mailing list archive)
State New, archived
Series Enforce 4k sectorize by default for mkfs

Commit Message

Neal Gompa March 22, 2023, 10:17 p.m. UTC
We have had working subpage support in Btrfs for many cycles now.
Generally, we do not want people creating filesystems by default
with non-4k sectorsizes since it creates portability problems.

Signed-off-by: Neal Gompa <neal@gompa.dev>

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>

---
 Documentation/Subpage.rst    | 15 ++++++++-------
 Documentation/mkfs.btrfs.rst | 13 +++++++++----
 mkfs/main.c                  |  2 +-
 3 files changed, 18 insertions(+), 12 deletions(-)
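
The behavioral change in mkfs/main.c can be sketched as follows. This is a minimal illustration rather than the btrfs-progs code itself: the helper names legacy_default_sectorsize, new_default_sectorsize and pick_sectorsize are hypothetical, and the literal 4096 stands in for SZ_4K.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: the pre-patch default, which follows the page size
 * of the machine running mkfs (e.g. 65536 on a 64K-page aarch64 kernel). */
static uint32_t legacy_default_sectorsize(void)
{
	return (uint32_t)sysconf(_SC_PAGESIZE);
}

/* Hypothetical helper: the post-patch default, fixed at 4 KiB. */
static uint32_t new_default_sectorsize(void)
{
	return 4096;
}

/* A requested value of 0 means "-s was not given", as in the patched code. */
static uint32_t pick_sectorsize(uint32_t requested)
{
	return requested ? requested : new_default_sectorsize();
}
```

With this sketch, pick_sectorsize(0) yields 4096 on every architecture, while an explicit request such as pick_sectorsize(65536) is passed through unchanged, which mirrors the "-s 64k" escape hatch described in the documentation update.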

Comments

Anand Jain April 1, 2023, 5:19 a.m. UTC | #1
Comparing btrfs sectorsize=64K with sectorsize=4K on an aarch64
virtual host with PAGESIZE=64K shows that switching the default to
sectorsize=4K has little impact on buffered IO, while direct IO
throughput improves by roughly 21% to 152% across the fio block
sizes shown below.


aarch64 PAGESIZE=64K
====================

Buffered IO:
============

FIO bs  sectorsize=64k  sectorsize=4k  diff
K       MB/s            MB/s           %
4        752             755           +0
8        783             832           +6
64      1066            1173           +10
128     1120            1098           -2
256     1112            1079           -3


Direct IO:
==========

FIO bs  sectorsize=64k  sectorsize=4k  diff
K       MB/s            MB/s           %
4         54             106           +96
8        107             270           +152
64       737             894           +21
128      862            1130           +31
256      846            1103           +30



FIO config file:

[global]
directory=/mnt/scratch
allrandrepeat=1
readwrite=readwrite
ioengine=io_uring
iodepth=256
end_fsync=1
fallocate=none
group_reporting
gtod_reduce=1
numjobs=8
size=10G
stonewall

[fio-direct-4k]
; direct= and bs= were changed as appropriate for each run
direct=1
bs=4k



Anand Jain April 1, 2023, 5:31 a.m. UTC | #2
[ fixed table format ]

Comparing btrfs sectorsize (ss) = 64K with ss = 4K on an aarch64
virtual host with PAGESIZE=64K shows that switching the default to
ss = 4K has little impact on buffered IO, while direct IO
throughput improves by roughly 21% to 152% across the fio block
sizes shown below.


aarch64 PAGESIZE=64K
====================

Buffered IO:
===========

FIO bs  ss=64k ss=4k   diff
K       MB/s   MB/s    %
4        752    755    +0
8        783    832    +6
64      1066   1173    +10
128     1120   1098    -2
256     1112   1079    -3


Direct IO:
==========

FIO bs ss=64k  ss=4k  diff
K      MB/s    MB/s   %
4       54      106   +96
8      107      270   +152
64     737      894   +21
128    862     1130   +31
256    846     1103   +30
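
The diff column is the relative change from the ss=64k throughput to the ss=4k throughput, rounded to a whole percent. A minimal sketch of that arithmetic follows; the function names pct_change and print_row are hypothetical, and the rounding rule (nearest integer) is an assumption about how the column was produced.

```c
#include <stdio.h>

/* Percentage change from base to value, rounded to the nearest integer
 * (halves round away from zero). */
static int pct_change(double base, double value)
{
	double pct = (value - base) * 100.0 / base;

	return (int)(pct >= 0 ? pct + 0.5 : pct - 0.5);
}

/* Print one table row: block size in KiB, both throughputs, and the diff. */
static void print_row(int bs_kib, double ss64k_mbs, double ss4k_mbs)
{
	printf("%-6d %4.0f    %4.0f   %+d\n", bs_kib, ss64k_mbs, ss4k_mbs,
	       pct_change(ss64k_mbs, ss4k_mbs));
}
```

For example, pct_change(107, 270) gives 152 for the 8K direct IO row, and pct_change(54, 106) gives 96 for the 4K row.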


FIO config file:

[global]
directory=/mnt/scratch
allrandrepeat=1
readwrite=readwrite
ioengine=io_uring
iodepth=256
end_fsync=1
fallocate=none
group_reporting
gtod_reduce=1
numjobs=8
size=10G
stonewall

[fio-direct-4k]
; direct= and bs= were changed as appropriate for each run
direct=1
bs=4k



Neal Gompa April 6, 2023, 10:51 p.m. UTC | #3
Well, it's nice to see that there are performance advantages to using
4k block sizes everywhere. :)

Can we get this committed to btrfs-progs then?



--
真実はいつも一つ!/ Always, there's only one truth!

Patch

diff --git a/Documentation/Subpage.rst b/Documentation/Subpage.rst
index 21a495d5..39ef7d6d 100644
--- a/Documentation/Subpage.rst
+++ b/Documentation/Subpage.rst
@@ -9,17 +9,18 @@  to the exactly same size of the block and page. On x86_64 this is typically
 pages, like 64KiB on 64bit ARM or PowerPC. This means filesystems created
 with 64KiB sector size cannot be mounted on a system with 4KiB page size.
 
-While with subpage support, systems with 64KiB page size can create (still needs
-"-s 4k" option for mkfs.btrfs) and mount filesystems with 4KiB sectorsize,
-allowing us to push 4KiB sectorsize as default sectorsize for all platforms in the
-near future.
+Since v6.3, filesystems are created with a 4KiB sectorsize by default,
+though it remains possible to create filesystems with other sector sizes
+(such as 64KiB with the "-s 64k" option for mkfs.btrfs). This ensures that
+new filesystems are compatible across other architecture variants using
+larger page sizes.
 
 Requirements, limitations
 -------------------------
 
-The initial subpage support has been added in v5.15, although it's still
-considered as experimental at the time of writing (v5.18), most features are
-already working without problems.
+The initial subpage support has been added in v5.15. Most features are
+already working without problems. Subpage support is used by default
+for systems with a non-4KiB page size since v6.3.
 
 End users can mount filesystems with 4KiB sectorsize and do their usual
 workload, while should not notice any obvious change, as long as the initial
diff --git a/Documentation/mkfs.btrfs.rst b/Documentation/mkfs.btrfs.rst
index ba7227b3..16abf0ca 100644
--- a/Documentation/mkfs.btrfs.rst
+++ b/Documentation/mkfs.btrfs.rst
@@ -116,10 +116,15 @@  OPTIONS
 -s|--sectorsize <size>
         Specify the sectorsize, the minimum data block allocation unit.
 
-        The default value is the page size and is autodetected. If the sectorsize
-        differs from the page size, the created filesystem may not be mountable by the
-        running kernel. Therefore it is not recommended to use this option unless you
-        are going to mount it on a system with the appropriate page size.
+        By default, the value is 4KiB, but it can be manually set to match the
+        system page size. However, if the sector size is different from the page
+        size, the resulting filesystem may not be mountable by the current
+        kernel, apart from the default 4KiB. Hence, using this option is not
+        advised unless you intend to mount it on a system with the suitable
+        page size.
+
+        .. note::
+                Versions prior to 6.3 set the sectorsize to match the page size.
 
 -L|--label <string>
         Specify a label for the filesystem. The *string* should be less than 256
diff --git a/mkfs/main.c b/mkfs/main.c
index f5e34cbd..5e1834d7 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -1207,7 +1207,7 @@  int BOX_MAIN(mkfs)(int argc, char **argv)
 	}
 
 	if (!sectorsize)
-		sectorsize = (u32)sysconf(_SC_PAGESIZE);
+		sectorsize = (u32)SZ_4K;
 	if (btrfs_check_sectorsize(sectorsize))
 		goto error;
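
The mkfs.btrfs.rst text in this patch notes that a filesystem may not be mountable when its sector size differs from the page size of the running kernel. One way to reason about this is to compare the chosen sector size against the sizes the kernel reports as supported. The sketch below only handles the list parsing; it assumes the list is a whitespace-separated string of decimal byte counts such as "4096 65536", which is what /sys/fs/btrfs/features/supported_sectorsizes is understood to contain on kernels with subpage support. The function name sectorsize_listed is hypothetical and is not part of btrfs-progs.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Return true if `want` appears in a whitespace-separated list of decimal
 * sizes such as "4096 65536\n". The list format is an assumption about
 * what /sys/fs/btrfs/features/supported_sectorsizes contains.
 */
static bool sectorsize_listed(const char *list, uint32_t want)
{
	const char *p = list;

	for (;;) {
		char *end;
		unsigned long val = strtoul(p, &end, 10);

		if (end == p)	/* no more numbers to parse */
			return false;
		if (val == want)
			return true;
		p = end;
	}
}
```

Under these assumptions, a list of "4096 65536" accepts both the new 4KiB default and an explicit "-s 64k", while a list containing only "4096" rejects a 64KiB sector size.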