Message ID | 20231009151026.66145-10-aleksander.lobakin@intel.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | ip_tunnel: convert __be16 tunnel flags to bitmaps | expand |
+ Alexander Potapenko <glider@google.com> On Mon, Oct 09, 2023 at 05:10:21PM +0200, Alexander Lobakin wrote: > Sometimes there's need to get a 8/16/...-bit piece of a bitmap at a > particular offset. Currently, there are only bitmap_{get,set}_value8() > to do that for 8 bits and that's it. And also a series from Alexander Potapenko, which I really hope will get into the -next really soon. It introduces bitmap_read/write which can set up to BITS_PER_LONG at once, with no limitations on alignment of position and length: https://lore.kernel.org/linux-arm-kernel/ZRXbOoKHHafCWQCW@yury-ThinkPad/T/#mc311037494229647088b3a84b9f0d9b50bf227cb Can you consider building your series on top of it? > Instead of introducing a separate pair for u16 and so on, which doesn't > scale well, extend the existing functions to be able to pass the wanted > value width. Make both offset and width arbitrary, but in order to not > over complicate the current logic and keep the helpers as optimized as > the current ones, require the width to be a pow-2 value and the offset > to be a multiple of the width, while the target piece should not cross > a %BITS_PER_LONG boundary and stay within one long. > Avoid adjusting all the already existing callsites by defining oneliner > wrapper macros named after the former functions. bloat-o-meter shows > almost no difference (+1-2 bytes in a couple of places), meaning the > new helpers get optimized just nicely. > > Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> > Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> > --- > include/linux/bitmap.h | 51 ++++++++++++++++++++++++++++++------------ > 1 file changed, 37 insertions(+), 14 deletions(-) > > diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h > index 63e422f8ba3d..9c010a7fa331 100644 > --- a/include/linux/bitmap.h > +++ b/include/linux/bitmap.h > @@ -6,8 +6,10 @@ > > #include <linux/align.h> > #include <linux/bitops.h> > +#include <linux/bug.h> > #include <linux/find.h> > #include <linux/limits.h> > +#include <linux/log2.h> > #include <linux/string.h> > #include <linux/types.h> > > @@ -569,38 +571,59 @@ static inline void bitmap_from_u64(unsigned long *dst, u64 mask) > } > > /** > - * bitmap_get_value8 - get an 8-bit value within a memory region > + * bitmap_get_bits - get a 8/16/32/64-bit value within a memory region > * @map: address to the bitmap memory region > - * @start: bit offset of the 8-bit value; must be a multiple of 8 > + * @start: bit offset of the value; must be a multiple of @len > + * @len: bit width of the value; must be a power of two > * > - * Returns the 8-bit value located at the @start bit offset within the @src > - * memory region. > + * Return: the 8/16/32/64-bit value located at the @start bit offset within > + * the @src memory region. Its position (@start + @len) can't cross > + * a ``BITS_PER_LONG`` boundary. > */ > -static inline unsigned long bitmap_get_value8(const unsigned long *map, > - unsigned long start) > +static inline unsigned long bitmap_get_bits(const unsigned long *map, > + unsigned long start, size_t len) > { > const size_t index = BIT_WORD(start); > const unsigned long offset = start % BITS_PER_LONG; > > - return (map[index] >> offset) & 0xFF; > + if (WARN_ON_ONCE(!is_power_of_2(len) || offset % len || > + offset + len > BITS_PER_LONG)) > + return 0; > + > + return (map[index] >> offset) & GENMASK(len - 1, 0); > } > > /** > - * bitmap_set_value8 - set an 8-bit value within a memory region > + * bitmap_set_bits - set a 8/16/32/64-bit value within a memory region > * @map: address to the bitmap memory region > - * @value: the 8-bit value; values wider than 8 bits may clobber bitmap > - * @start: bit offset of the 8-bit value; must be a multiple of 8 > + * @start: bit offset of the value; must be a multiple of @len > + * @value: new value to set > + * @len: bit width of the value; must be a power of two > + * > + * Replaces the 8/16/32/64-bit value located at the @start bit offset within > + * the @src memory region with the new @value. Its position (@start + @len) > + * can't cross a ``BITS_PER_LONG`` boundary. > */ > -static inline void bitmap_set_value8(unsigned long *map, unsigned long value, > - unsigned long start) > +static inline void bitmap_set_bits(unsigned long *map, unsigned long start, > + unsigned long value, size_t len) > { > const size_t index = BIT_WORD(start); > const unsigned long offset = start % BITS_PER_LONG; > + unsigned long mask = GENMASK(len - 1, 0); > > - map[index] &= ~(0xFFUL << offset); > - map[index] |= value << offset; > + if (WARN_ON_ONCE(!is_power_of_2(len) || offset % len || > + offset + len > BITS_PER_LONG)) > + return; > + > + map[index] &= ~(mask << offset); > + map[index] |= (value & mask) << offset; > } > > +#define bitmap_get_value8(map, start) \ > + bitmap_get_bits(map, start, BITS_PER_BYTE) > +#define bitmap_set_value8(map, value, start) \ > + bitmap_set_bits(map, start, value, BITS_PER_BYTE) > + > #endif /* __ASSEMBLY__ */ > > #endif /* __LINUX_BITMAP_H */ > -- > 2.41.0
From: Yury Norov <yury.norov@gmail.com> Date: Mon, 9 Oct 2023 09:31:15 -0700 > + Alexander Potapenko <glider@google.com> > > On Mon, Oct 09, 2023 at 05:10:21PM +0200, Alexander Lobakin wrote: >> Sometimes there's need to get a 8/16/...-bit piece of a bitmap at a >> particular offset. Currently, there are only bitmap_{get,set}_value8() >> to do that for 8 bits and that's it. > > And also a series from Alexander Potapenko, which I really hope will > get into the -next really soon. It introduces bitmap_read/write which > can set up to BITS_PER_LONG at once, with no limitations on alignment > of position and length: > > https://lore.kernel.org/linux-arm-kernel/ZRXbOoKHHafCWQCW@yury-ThinkPad/T/#mc311037494229647088b3a84b9f0d9b50bf227cb > > Can you consider building your series on top of it? Yeah, I mentioned in the cover letter that I'm aware of it and in fact it doesn't conflict much, as the functions I'm adding here get optimized as much as the original bitmap_{get,set}_value8(), while Alexander's generic helpers are heavier. I realize lots of calls will be optimized as well due to the offset and the width being compile-time constants, but not all of them. The idea of keeping two pairs of helpers initially came from Andy if I understood him correctly. What do you think? I can provide some bloat-o-meter stats after rebasing. And either way, I see no issue in basing this series on top of Alex' one. > >> Instead of introducing a separate pair for u16 and so on, which doesn't >> scale well, extend the existing functions to be able to pass the wanted >> value width. Make both offset and width arbitrary, but in order to not >> over complicate the current logic and keep the helpers as optimized as >> the current ones, require the width to be a pow-2 value and the offset >> to be a multiple of the width, while the target piece should not cross >> a %BITS_PER_LONG boundary and stay within one long. >> Avoid adjusting all the already existing callsites by defining oneliner >> wrapper macros named after the former functions. bloat-o-meter shows >> almost no difference (+1-2 bytes in a couple of places), meaning the [...] Thanks, Olek
On Wed, Oct 11, 2023 at 11:33:25AM +0200, Alexander Lobakin wrote: > From: Yury Norov <yury.norov@gmail.com> > Date: Mon, 9 Oct 2023 09:31:15 -0700 > > > + Alexander Potapenko <glider@google.com> > > > > On Mon, Oct 09, 2023 at 05:10:21PM +0200, Alexander Lobakin wrote: > >> Sometimes there's need to get a 8/16/...-bit piece of a bitmap at a > >> particular offset. Currently, there are only bitmap_{get,set}_value8() > >> to do that for 8 bits and that's it. > > > > And also a series from Alexander Potapenko, which I really hope will > > get into the -next really soon. It introduces bitmap_read/write which > > can set up to BITS_PER_LONG at once, with no limitations on alignment > > of position and length: > > > > https://lore.kernel.org/linux-arm-kernel/ZRXbOoKHHafCWQCW@yury-ThinkPad/T/#mc311037494229647088b3a84b9f0d9b50bf227cb > > > > Can you consider building your series on top of it? > > Yeah, I mentioned in the cover letter that I'm aware of it and in fact > it doesn't conflict much, as the functions I'm adding here get optimized > as much as the original bitmap_{get,set}_value8(), while Alexander's > generic helpers are heavier. > I realize lots of calls will be optimized as well due to the offset and > the width being compile-time constants, but not all of them. The idea of > keeping two pairs of helpers initially came from Andy if I understood > him correctly. Just a disclaimer: The idea came before I saw the series by Alexander Potapenko. > What do you think? I can provide some bloat-o-meter stats after > rebasing. And either way, I see no issue in basing this series on top of > Alex' one.
On Wed, Oct 11, 2023 at 11:33:25AM +0200, Alexander Lobakin wrote: > From: Yury Norov <yury.norov@gmail.com> > Date: Mon, 9 Oct 2023 09:31:15 -0700 > > > + Alexander Potapenko <glider@google.com> > > > > On Mon, Oct 09, 2023 at 05:10:21PM +0200, Alexander Lobakin wrote: > >> Sometimes there's need to get a 8/16/...-bit piece of a bitmap at a > >> particular offset. Currently, there are only bitmap_{get,set}_value8() > >> to do that for 8 bits and that's it. > > > > And also a series from Alexander Potapenko, which I really hope will > > get into the -next really soon. It introduces bitmap_read/write which > > can set up to BITS_PER_LONG at once, with no limitations on alignment > > of position and length: > > > > https://lore.kernel.org/linux-arm-kernel/ZRXbOoKHHafCWQCW@yury-ThinkPad/T/#mc311037494229647088b3a84b9f0d9b50bf227cb > > > > Can you consider building your series on top of it? > > Yeah, I mentioned in the cover letter that I'm aware of it and in fact > it doesn't conflict much, as the functions I'm adding here get optimized > as much as the original bitmap_{get,set}_value8(), while Alexander's > generic helpers are heavier. > I realize lots of calls will be optimized as well due to the offset and > the width being compile-time constants, but not all of them. The idea of > keeping two pairs of helpers initially came from Andy if I understood > him correctly. > What do you think? I can provide some bloat-o-meter stats after > rebasing. And either way, I see no issue in basing this series on top of > Alex' one. You're right, let's try both and see what how worse is one comparing to another wrt bloat-o-meter and overall code generation. If the difference is not that terrible, I'd stick to universal and simpler for users version. If the difference is significant, we'd have to keep both. Maybe it's worth to try merge the aligned case into generic one, but it's not the purpose of your series, of course. Thanks, Yury
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h index 63e422f8ba3d..9c010a7fa331 100644 --- a/include/linux/bitmap.h +++ b/include/linux/bitmap.h @@ -6,8 +6,10 @@ #include <linux/align.h> #include <linux/bitops.h> +#include <linux/bug.h> #include <linux/find.h> #include <linux/limits.h> +#include <linux/log2.h> #include <linux/string.h> #include <linux/types.h> @@ -569,38 +571,59 @@ static inline void bitmap_from_u64(unsigned long *dst, u64 mask) } /** - * bitmap_get_value8 - get an 8-bit value within a memory region + * bitmap_get_bits - get a 8/16/32/64-bit value within a memory region * @map: address to the bitmap memory region - * @start: bit offset of the 8-bit value; must be a multiple of 8 + * @start: bit offset of the value; must be a multiple of @len + * @len: bit width of the value; must be a power of two * - * Returns the 8-bit value located at the @start bit offset within the @src - * memory region. + * Return: the 8/16/32/64-bit value located at the @start bit offset within + * the @src memory region. Its position (@start + @len) can't cross + * a ``BITS_PER_LONG`` boundary. */ -static inline unsigned long bitmap_get_value8(const unsigned long *map, - unsigned long start) +static inline unsigned long bitmap_get_bits(const unsigned long *map, + unsigned long start, size_t len) { const size_t index = BIT_WORD(start); const unsigned long offset = start % BITS_PER_LONG; - return (map[index] >> offset) & 0xFF; + if (WARN_ON_ONCE(!is_power_of_2(len) || offset % len || + offset + len > BITS_PER_LONG)) + return 0; + + return (map[index] >> offset) & GENMASK(len - 1, 0); } /** - * bitmap_set_value8 - set an 8-bit value within a memory region + * bitmap_set_bits - set a 8/16/32/64-bit value within a memory region * @map: address to the bitmap memory region - * @value: the 8-bit value; values wider than 8 bits may clobber bitmap - * @start: bit offset of the 8-bit value; must be a multiple of 8 + * @start: bit offset of the value; must be a multiple of @len + * @value: new value to set + * @len: bit width of the value; must be a power of two + * + * Replaces the 8/16/32/64-bit value located at the @start bit offset within + * the @src memory region with the new @value. Its position (@start + @len) + * can't cross a ``BITS_PER_LONG`` boundary. */ -static inline void bitmap_set_value8(unsigned long *map, unsigned long value, - unsigned long start) +static inline void bitmap_set_bits(unsigned long *map, unsigned long start, + unsigned long value, size_t len) { const size_t index = BIT_WORD(start); const unsigned long offset = start % BITS_PER_LONG; + unsigned long mask = GENMASK(len - 1, 0); - map[index] &= ~(0xFFUL << offset); - map[index] |= value << offset; + if (WARN_ON_ONCE(!is_power_of_2(len) || offset % len || + offset + len > BITS_PER_LONG)) + return; + + map[index] &= ~(mask << offset); + map[index] |= (value & mask) << offset; } +#define bitmap_get_value8(map, start) \ + bitmap_get_bits(map, start, BITS_PER_BYTE) +#define bitmap_set_value8(map, value, start) \ + bitmap_set_bits(map, start, value, BITS_PER_BYTE) + #endif /* __ASSEMBLY__ */ #endif /* __LINUX_BITMAP_H */