Message ID | 20201102073114.66750-1-wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: fix build warning due to u64 devided by u32 for 32bit arch | expand |
On Mon, Nov 02, 2020 at 03:31:14PM +0800, Qu Wenruo wrote: > [BUG] > When building the kernel with subpage preparation patches, 32bit arches > will complain about the following linking error: > > ld: fs/btrfs/extent_io.o: in function `release_extent_buffer': > fs/btrfs/extent_io.c:5340: undefined reference to `__udivdi3' > > [CAUSE] > For 32bits, dividing u64 with u32 need to call div_u64(), not directly > call u64 / u32. > > [FIX] > Instead of calling the div_u64() macros, here we introduce a helper, > btrfs_sector_shift(), to calculate the sector shift, and we just do bit > shift to avoid executing the expensive division instruction. Division is expensive but ilog2 does not come without a cost either. It's implemented as bsrl+cmov, which can be also considered expensive for frequent use. > The sector_shift may be better cached in btrfs_fs_info, but so far there > are only very limited callers for that, thus the fs_info::sector_shift > can be there for further cleanup. > > David, can this patch be folded into the offending commit? > The patch is small enough, and doesn't change btrfs_fs_info. > Thus should be OK to fold. I have sent my series cleaning up the simple shifts, for the sectorsize shift in particular see https://lore.kernel.org/linux-btrfs/b38721840b8d703a29807b71460464134b9ca7e1.1603981453.git.dsterba@suse.com/ > Fixes: ef57afc454fb ("btrfs: extent_io: make btrfs_fs_info::buffer_radix to take sector size devided values") > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > fs/btrfs/ctree.h | 5 +++++ > fs/btrfs/extent_io.c | 14 +++++++++----- > 2 files changed, 14 insertions(+), 5 deletions(-) > > diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h > index 8a83bce3225c..eb282af985f5 100644 > --- a/fs/btrfs/ctree.h > +++ b/fs/btrfs/ctree.h > @@ -3489,6 +3489,11 @@ static inline int __btrfs_fs_compat_ro(struct btrfs_fs_info *fs_info, u64 flag) > return !!(btrfs_super_compat_ro_flags(disk_super) & flag); > } > > +static inline u8 btrfs_sector_shift(struct btrfs_fs_info *fs_info) > +{ > + return ilog2(fs_info->sectorsize); This has a runtime cost of calculating the the ilog2 each time we use it. > +} > + > /* acl.c */ > #ifdef CONFIG_BTRFS_FS_POSIX_ACL > struct posix_acl *btrfs_get_acl(struct inode *inode, int type); > diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c > index 80b35885004a..3452019aef79 100644 > --- a/fs/btrfs/extent_io.c > +++ b/fs/btrfs/extent_io.c > @@ -5129,10 +5129,10 @@ struct extent_buffer *find_extent_buffer(struct btrfs_fs_info *fs_info, > u64 start) > { > struct extent_buffer *eb; > + u8 sector_shift = btrfs_sector_shift(fs_info); And each use needs a temporary variable, where u8 generates worse assembly and also potentially needs stack space.
On 2020/11/2 下午9:54, David Sterba wrote: > On Mon, Nov 02, 2020 at 03:31:14PM +0800, Qu Wenruo wrote: >> [BUG] >> When building the kernel with subpage preparation patches, 32bit arches >> will complain about the following linking error: >> >> ld: fs/btrfs/extent_io.o: in function `release_extent_buffer': >> fs/btrfs/extent_io.c:5340: undefined reference to `__udivdi3' >> >> [CAUSE] >> For 32bits, dividing u64 with u32 need to call div_u64(), not directly >> call u64 / u32. >> >> [FIX] >> Instead of calling the div_u64() macros, here we introduce a helper, >> btrfs_sector_shift(), to calculate the sector shift, and we just do bit >> shift to avoid executing the expensive division instruction. > > Division is expensive but ilog2 does not come without a cost either. > It's implemented as bsrl+cmov, which can be also considered expensive > for frequent use. You're right, ilog2() is also expensive. For our usage, what we really want is ffs(), which can be done with hardware support to reduce the cost. Thanks, Qu > >> The sector_shift may be better cached in btrfs_fs_info, but so far there >> are only very limited callers for that, thus the fs_info::sector_shift >> can be there for further cleanup. >> >> David, can this patch be folded into the offending commit? >> The patch is small enough, and doesn't change btrfs_fs_info. >> Thus should be OK to fold. > > I have sent my series cleaning up the simple shifts, for the sectorsize > shift in particular see > > https://lore.kernel.org/linux-btrfs/b38721840b8d703a29807b71460464134b9ca7e1.1603981453.git.dsterba@suse.com/ > >> Fixes: ef57afc454fb ("btrfs: extent_io: make btrfs_fs_info::buffer_radix to take sector size devided values") >> Signed-off-by: Qu Wenruo <wqu@suse.com> >> --- >> fs/btrfs/ctree.h | 5 +++++ >> fs/btrfs/extent_io.c | 14 +++++++++----- >> 2 files changed, 14 insertions(+), 5 deletions(-) >> >> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h >> index 8a83bce3225c..eb282af985f5 100644 >> --- a/fs/btrfs/ctree.h >> +++ b/fs/btrfs/ctree.h >> @@ -3489,6 +3489,11 @@ static inline int __btrfs_fs_compat_ro(struct btrfs_fs_info *fs_info, u64 flag) >> return !!(btrfs_super_compat_ro_flags(disk_super) & flag); >> } >> >> +static inline u8 btrfs_sector_shift(struct btrfs_fs_info *fs_info) >> +{ >> + return ilog2(fs_info->sectorsize); > > This has a runtime cost of calculating the the ilog2 each time we use > it. > >> +} >> + >> /* acl.c */ >> #ifdef CONFIG_BTRFS_FS_POSIX_ACL >> struct posix_acl *btrfs_get_acl(struct inode *inode, int type); >> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c >> index 80b35885004a..3452019aef79 100644 >> --- a/fs/btrfs/extent_io.c >> +++ b/fs/btrfs/extent_io.c >> @@ -5129,10 +5129,10 @@ struct extent_buffer *find_extent_buffer(struct btrfs_fs_info *fs_info, >> u64 start) >> { >> struct extent_buffer *eb; >> + u8 sector_shift = btrfs_sector_shift(fs_info); > > And each use needs a temporary variable, where u8 generates worse > assembly and also potentially needs stack space. >
On Mon, Nov 02, 2020 at 03:31:14PM +0800, Qu Wenruo wrote: > [BUG] > When building the kernel with subpage preparation patches, 32bit arches > will complain about the following linking error: > > ld: fs/btrfs/extent_io.o: in function `release_extent_buffer': > fs/btrfs/extent_io.c:5340: undefined reference to `__udivdi3' > > [CAUSE] > For 32bits, dividing u64 with u32 need to call div_u64(), not directly > call u64 / u32. > > [FIX] > Instead of calling the div_u64() macros, here we introduce a helper, > btrfs_sector_shift(), to calculate the sector shift, and we just do bit > shift to avoid executing the expensive division instruction. I've refreshed and pushed the series adding fs_info::sectorsize_bits so you can use it for the subpage patches.
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 8a83bce3225c..eb282af985f5 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3489,6 +3489,11 @@ static inline int __btrfs_fs_compat_ro(struct btrfs_fs_info *fs_info, u64 flag) return !!(btrfs_super_compat_ro_flags(disk_super) & flag); } +static inline u8 btrfs_sector_shift(struct btrfs_fs_info *fs_info) +{ + return ilog2(fs_info->sectorsize); +} + /* acl.c */ #ifdef CONFIG_BTRFS_FS_POSIX_ACL struct posix_acl *btrfs_get_acl(struct inode *inode, int type); diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 80b35885004a..3452019aef79 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5129,10 +5129,10 @@ struct extent_buffer *find_extent_buffer(struct btrfs_fs_info *fs_info, u64 start) { struct extent_buffer *eb; + u8 sector_shift = btrfs_sector_shift(fs_info); rcu_read_lock(); - eb = radix_tree_lookup(&fs_info->buffer_radix, - start / fs_info->sectorsize); + eb = radix_tree_lookup(&fs_info->buffer_radix, start >> sector_shift); if (eb && atomic_inc_not_zero(&eb->refs)) { rcu_read_unlock(); /* @@ -5167,6 +5167,7 @@ struct extent_buffer *alloc_test_extent_buffer(struct btrfs_fs_info *fs_info, u64 start) { struct extent_buffer *eb, *exists = NULL; + u8 sector_shift = btrfs_sector_shift(fs_info); int ret; eb = find_extent_buffer(fs_info, start); @@ -5184,7 +5185,7 @@ struct extent_buffer *alloc_test_extent_buffer(struct btrfs_fs_info *fs_info, } spin_lock(&fs_info->buffer_lock); ret = radix_tree_insert(&fs_info->buffer_radix, - start / fs_info->sectorsize, eb); + start >> sector_shift, eb); spin_unlock(&fs_info->buffer_lock); radix_tree_preload_end(); if (ret == -EEXIST) { @@ -5215,6 +5216,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, struct extent_buffer *exists = NULL; struct page *p; struct address_space *mapping = fs_info->btree_inode->i_mapping; + u8 sector_shift = btrfs_sector_shift(fs_info); int uptodate = 1; int ret; @@ -5292,7 +5294,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, spin_lock(&fs_info->buffer_lock); ret = radix_tree_insert(&fs_info->buffer_radix, - start / fs_info->sectorsize, eb); + start >> sector_shift, eb); spin_unlock(&fs_info->buffer_lock); radix_tree_preload_end(); if (ret == -EEXIST) { @@ -5337,6 +5339,8 @@ static inline void btrfs_release_extent_buffer_rcu(struct rcu_head *head) static int release_extent_buffer(struct extent_buffer *eb) __releases(&eb->refs_lock) { + u8 sector_shift = btrfs_sector_shift(eb->fs_info); + lockdep_assert_held(&eb->refs_lock); WARN_ON(atomic_read(&eb->refs) == 0); @@ -5348,7 +5352,7 @@ static int release_extent_buffer(struct extent_buffer *eb) spin_lock(&fs_info->buffer_lock); radix_tree_delete(&fs_info->buffer_radix, - eb->start / fs_info->sectorsize); + eb->start >> sector_shift); spin_unlock(&fs_info->buffer_lock); } else { spin_unlock(&eb->refs_lock);
[BUG] When building the kernel with subpage preparation patches, 32bit arches will complain about the following linking error: ld: fs/btrfs/extent_io.o: in function `release_extent_buffer': fs/btrfs/extent_io.c:5340: undefined reference to `__udivdi3' [CAUSE] For 32bits, dividing u64 with u32 need to call div_u64(), not directly call u64 / u32. [FIX] Instead of calling the div_u64() macros, here we introduce a helper, btrfs_sector_shift(), to calculate the sector shift, and we just do bit shift to avoid executing the expensive division instruction. The sector_shift may be better cached in btrfs_fs_info, but so far there are only very limited callers for that, thus the fs_info::sector_shift can be there for further cleanup. David, can this patch be folded into the offending commit? The patch is small enough, and doesn't change btrfs_fs_info. Thus should be OK to fold. Fixes: ef57afc454fb ("btrfs: extent_io: make btrfs_fs_info::buffer_radix to take sector size devided values") Signed-off-by: Qu Wenruo <wqu@suse.com> --- fs/btrfs/ctree.h | 5 +++++ fs/btrfs/extent_io.c | 14 +++++++++----- 2 files changed, 14 insertions(+), 5 deletions(-)