Message ID | 55b1841a271b69b8047f1195eeb26fb23f893f71.1686738215.git.wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs-progs: fix accessors for big endian systems | expand |
On Wed, Jun 14, 2023 at 06:23:43PM +0800, Qu Wenruo wrote: > [BUG] > There is a bug report from the github issues, that on s390x big endian > systems, mkfs.btrfs just fails: > > $ mkfs.btrfs -f ~/test.img > btrfs-progs v6.3.1 > Invalid mapping for 1081344-1097728, got 17592186044416-17592190238720 > Couldn't map the block 1081344 > ERROR: cannot read chunk root > ERROR: open ctree failed > > [CAUSE] > The error is caused by wrong endian conversion. > > The system where Fedora guys reported errors are using big endian: > > $ lscpu > Byte Order: Big Endian > > While checking the offending @disk_key and @key inside > btrfs_read_sys_array(), we got: > > 2301 while (cur_offset < array_size) { > (gdb) > 2304 if (cur_offset + len > array_size) > (gdb) > 2307 btrfs_disk_key_to_cpu(&key, disk_key); > (gdb) > 2310 sb_array_offset += len; > (gdb) print *disk_key > $2 = {objectid = 281474976710656, type = 228 '\344', offset = 17592186044416} > (gdb) print key > $3 = {objectid = 281474976710656, type = 228 '\344', offset = 17592186044416} > (gdb) > > Now we can see, @disk_key is indeed in the little endian, but @key is > not converted to the CPU native endian. > > Furthermore, if we step into the help btrfs_disk_key_to_cpu(), it shows > we're using little endian version: > > (gdb) step > btrfs_disk_key_to_cpu (disk_key=0x109fcdb, cpu_key=0x3ffffff847f) > at ./kernel-shared/accessors.h:592 > 592 memcpy(cpu_key, disk_key, sizeof(struct btrfs_key)); > > [FIX] > The kernel accessors.h checks if __LITTLE_ENDIAN is defined or not, but > that only works inside kernel. > > In user space, __LITTLE_ENDIAN and __BIG_ENDIAN are both defined inside > <bit/endian.h> header. > > Instead we should check __BYTE_ORDER against __LITTLE_ENDIAN to > determine our endianness. > > With this change, s390x build works as expected now. > > Issue: #639 > Signed-off-by: Qu Wenruo <wqu@suse.com> Added to devel, thanks.
On 14.06.23 12:26, Qu Wenruo wrote: > --- > kernel-shared/accessors.h | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/kernel-shared/accessors.h b/kernel-shared/accessors.h > index 625acfbe8ca7..06ab6e7e9f12 100644 > --- a/kernel-shared/accessors.h > +++ b/kernel-shared/accessors.h > @@ -7,6 +7,8 @@ > #define _static_assert(expr) _Static_assert(expr, #expr) > #endif > > +#include <bits/endian.h> > + > struct btrfs_map_token { > struct extent_buffer *eb; > char *kaddr; > @@ -579,7 +581,7 @@ BTRFS_SETGET_STACK_FUNCS(disk_key_objectid, struct btrfs_disk_key, objectid, 64) > BTRFS_SETGET_STACK_FUNCS(disk_key_offset, struct btrfs_disk_key, offset, 64); > BTRFS_SETGET_STACK_FUNCS(disk_key_type, struct btrfs_disk_key, type, 8); > > -#ifdef __LITTLE_ENDIAN > +#if __BYTE_ORDER == __LITTLE_ENDIAN > > /* > * Optimized helpers for little-endian architectures where CPU and on-disk Hmm but with a change like that we can't just copy over the kernel files into user-space. Just something we need to keep in mind.
On Wed, Jun 14, 2023 at 05:50:32PM +0000, Johannes Thumshirn wrote: > On 14.06.23 12:26, Qu Wenruo wrote: > > --- > > kernel-shared/accessors.h | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/kernel-shared/accessors.h b/kernel-shared/accessors.h > > index 625acfbe8ca7..06ab6e7e9f12 100644 > > --- a/kernel-shared/accessors.h > > +++ b/kernel-shared/accessors.h > > @@ -7,6 +7,8 @@ > > #define _static_assert(expr) _Static_assert(expr, #expr) > > #endif > > > > +#include <bits/endian.h> > > + > > struct btrfs_map_token { > > struct extent_buffer *eb; > > char *kaddr; > > @@ -579,7 +581,7 @@ BTRFS_SETGET_STACK_FUNCS(disk_key_objectid, struct btrfs_disk_key, objectid, 64) > > BTRFS_SETGET_STACK_FUNCS(disk_key_offset, struct btrfs_disk_key, offset, 64); > > BTRFS_SETGET_STACK_FUNCS(disk_key_type, struct btrfs_disk_key, type, 8); > > > > -#ifdef __LITTLE_ENDIAN > > +#if __BYTE_ORDER == __LITTLE_ENDIAN > > > > /* > > * Optimized helpers for little-endian architectures where CPU and on-disk > > Hmm but with a change like that we can't just copy over the > kernel files into user-space. > > Just something we need to keep in mind. There are more changes in all synced files, it can't be a direct copy yet.
On Wed, Jun 14, 2023 at 06:23:43PM +0800, Qu Wenruo wrote: > --- a/kernel-shared/accessors.h > +++ b/kernel-shared/accessors.h > @@ -7,6 +7,8 @@ > #define _static_assert(expr) _Static_assert(expr, #expr) > #endif > > +#include <bits/endian.h> Files from bits/ should not be included directly and it's glibc-specific, also breaks build on musl. Fixed. I'm going to enable quick build tests for pull request, right now the tests are ran once I push devel which usually happens after I reply with the 'applied' mail. The test results can be found at https://github.com/kdave/btrfs-progs/actions .
On 2023/6/15 17:28, David Sterba wrote: > On Wed, Jun 14, 2023 at 06:23:43PM +0800, Qu Wenruo wrote: >> --- a/kernel-shared/accessors.h >> +++ b/kernel-shared/accessors.h >> @@ -7,6 +7,8 @@ >> #define _static_assert(expr) _Static_assert(expr, #expr) >> #endif >> >> +#include <bits/endian.h> > > Files from bits/ should not be included directly and it's > glibc-specific, also breaks build on musl. Fixed. Weird, as for those things which should not be included, normally they have something like this from endianness.h: #ifndef _BITS_ENDIAN_H # error "Never use XXXXX directly; include <whatever.h> instead." #endif Another concern removing this line is, we're relying on the final .c file to include needed headers. For this particular case it's not a problem at all as standard library headers would be included anyway. But I'm still not sure what should be the proper way to go. > > I'm going to enable quick build tests for pull request, right now the > tests are ran once I push devel which usually happens after I reply with > the 'applied' mail. The test results can be found at > https://github.com/kdave/btrfs-progs/actions . That would be very appreciated. However this is better with github pull hooks, I'm not sure if this means it's better to use github pull as the primary method, and just keep the emails as backup methods for old school guys like me. Thanks, Qu
On Thu, Jun 15, 2023 at 05:56:31PM +0800, Qu Wenruo wrote: > > > On 2023/6/15 17:28, David Sterba wrote: > > On Wed, Jun 14, 2023 at 06:23:43PM +0800, Qu Wenruo wrote: > >> --- a/kernel-shared/accessors.h > >> +++ b/kernel-shared/accessors.h > >> @@ -7,6 +7,8 @@ > >> #define _static_assert(expr) _Static_assert(expr, #expr) > >> #endif > >> > >> +#include <bits/endian.h> > > > > Files from bits/ should not be included directly and it's > > glibc-specific, also breaks build on musl. Fixed. > > Weird, as for those things which should not be included, normally they > have something like this from endianness.h: > > #ifndef _BITS_ENDIAN_H > # error "Never use XXXXX directly; include <whatever.h> instead." > #endif Yeah not all of them have it, they probably should. > Another concern removing this line is, we're relying on the final .c > file to include needed headers. > For this particular case it's not a problem at all as standard library > headers would be included anyway. > > But I'm still not sure what should be the proper way to go. Actually we have the proper way, to include kerncompat.h from all kernel-shared files. There's endian.h included and there are also __BYTE_ORDER checks done the right way. The accessors.h currently has no includes at all. Ideally all includes are self contained when compile tested, ie. all types/macros/... are either properly defined by other includes or have a forward definition. I had done a pass with include-what-you-use tool but that was before the kernel source sync. Doing compile tests of headers is simpler, using only compiler. Planned to be done eventually. > > I'm going to enable quick build tests for pull request, right now the > > tests are ran once I push devel which usually happens after I reply with > > the 'applied' mail. The test results can be found at > > https://github.com/kdave/btrfs-progs/actions . > > That would be very appreciated. > > However this is better with github pull hooks, I'm not sure if this > means it's better to use github pull as the primary method, and just > keep the emails as backup methods for old school guys like me. Currently we do both and you can open a pull request with the patches you send as mails to get the build tests. Or, you can do local build test when you set up docker (how to do that is described in ci/README.md), then it's one script away.
diff --git a/kernel-shared/accessors.h b/kernel-shared/accessors.h index 625acfbe8ca7..06ab6e7e9f12 100644 --- a/kernel-shared/accessors.h +++ b/kernel-shared/accessors.h @@ -7,6 +7,8 @@ #define _static_assert(expr) _Static_assert(expr, #expr) #endif +#include <bits/endian.h> + struct btrfs_map_token { struct extent_buffer *eb; char *kaddr; @@ -579,7 +581,7 @@ BTRFS_SETGET_STACK_FUNCS(disk_key_objectid, struct btrfs_disk_key, objectid, 64) BTRFS_SETGET_STACK_FUNCS(disk_key_offset, struct btrfs_disk_key, offset, 64); BTRFS_SETGET_STACK_FUNCS(disk_key_type, struct btrfs_disk_key, type, 8); -#ifdef __LITTLE_ENDIAN +#if __BYTE_ORDER == __LITTLE_ENDIAN /* * Optimized helpers for little-endian architectures where CPU and on-disk
[BUG] There is a bug report from the github issues, that on s390x big endian systems, mkfs.btrfs just fails: $ mkfs.btrfs -f ~/test.img btrfs-progs v6.3.1 Invalid mapping for 1081344-1097728, got 17592186044416-17592190238720 Couldn't map the block 1081344 ERROR: cannot read chunk root ERROR: open ctree failed [CAUSE] The error is caused by wrong endian conversion. The system where Fedora guys reported errors are using big endian: $ lscpu Byte Order: Big Endian While checking the offending @disk_key and @key inside btrfs_read_sys_array(), we got: 2301 while (cur_offset < array_size) { (gdb) 2304 if (cur_offset + len > array_size) (gdb) 2307 btrfs_disk_key_to_cpu(&key, disk_key); (gdb) 2310 sb_array_offset += len; (gdb) print *disk_key $2 = {objectid = 281474976710656, type = 228 '\344', offset = 17592186044416} (gdb) print key $3 = {objectid = 281474976710656, type = 228 '\344', offset = 17592186044416} (gdb) Now we can see, @disk_key is indeed in the little endian, but @key is not converted to the CPU native endian. Furthermore, if we step into the help btrfs_disk_key_to_cpu(), it shows we're using little endian version: (gdb) step btrfs_disk_key_to_cpu (disk_key=0x109fcdb, cpu_key=0x3ffffff847f) at ./kernel-shared/accessors.h:592 592 memcpy(cpu_key, disk_key, sizeof(struct btrfs_key)); [FIX] The kernel accessors.h checks if __LITTLE_ENDIAN is defined or not, but that only works inside kernel. In user space, __LITTLE_ENDIAN and __BIG_ENDIAN are both defined inside <bit/endian.h> header. Instead we should check __BYTE_ORDER against __LITTLE_ENDIAN to determine our endianness. With this change, s390x build works as expected now. Issue: #639 Signed-off-by: Qu Wenruo <wqu@suse.com> --- kernel-shared/accessors.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)