[v6,00/12] lib/find_bit: fast path for small bitmaps

Message ID 20210401003153.97325-1-yury.norov@gmail.com

Message

Yury Norov April 1, 2021, 12:31 a.m. UTC
Bitmap operations are much simpler and faster for small bitmaps that
fit into a single word. In linux/bitmap.h we have machinery that allows
the compiler to replace an actual function call with a few instructions
if the bitmaps passed into the function are small and their size is known
at compile time.
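
To illustrate the machinery described above, here is a minimal userspace
sketch, not the kernel source itself. small_const_nbits() and
BITMAP_LAST_WORD_MASK() are modeled on the kernel macros of the same names;
sketch_bitmap_and() and slow_bitmap_and() are hypothetical names used only
for this example. It assumes GCC or Clang (for __builtin_constant_p), and
the single-word branch is only selected at compile time when the call is
inlined with optimization enabled; otherwise the generic path runs and
gives the same result.

```c
#include <limits.h>

#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* True only when nbits is a compile-time constant that fits in one word. */
#define small_const_nbits(nbits) \
	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0)

/* Mask selecting the valid bits of the last (here: only) word. */
#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))

/* Generic out-of-line path (hypothetical name): walks every word. */
static int slow_bitmap_and(unsigned long *dst, const unsigned long *a,
			   const unsigned long *b, unsigned int nbits)
{
	unsigned int k, full = nbits / BITS_PER_LONG;
	unsigned long result = 0;

	for (k = 0; k < full; k++)
		result |= (dst[k] = a[k] & b[k]);
	if (nbits % BITS_PER_LONG)
		result |= (dst[k] = a[k] & b[k] & BITMAP_LAST_WORD_MASK(nbits));
	return result != 0;
}

/* Inline wrapper: one AND plus one mask when the size is small and constant. */
static inline int sketch_bitmap_and(unsigned long *dst, const unsigned long *a,
				    const unsigned long *b, unsigned int nbits)
{
	if (small_const_nbits(nbits))
		return (*dst = *a & *b & BITMAP_LAST_WORD_MASK(nbits)) != 0;
	return slow_bitmap_and(dst, a, b, nbits);
}
```

Both branches return nonzero if the resulting bitmap has any bit set, so a
caller cannot tell which one was taken except by looking at the generated
code.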

The find_*_bit() API lacks this functionality, but its users would benefit
from it a lot. One important example is the cpumask subsystem when
NR_CPUS <= BITS_PER_LONG.
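
As a rough sketch of what such a fast path could look like for a
cpumask-like caller (an illustration under stated assumptions, not the
code from the patches): sketch_find_next_bit(), slow_find_next_bit(),
sketch_next_cpu() and SKETCH_NR_CPUS are hypothetical names, GENMASK() is
modeled on the kernel macro of the same name, and __builtin_ctzl() (GCC or
Clang) stands in for the kernel's __ffs(). As with any
__builtin_constant_p() trick, the single-word branch is only chosen when
the wrapper is inlined with optimization enabled.

```c
#include <limits.h>

#define BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

#define small_const_nbits(nbits) \
	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0)

/* Word with bits l..h set (inclusive), 0 <= l <= h < BITS_PER_LONG. */
#define GENMASK(h, l) \
	((~0UL << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))

#define SKETCH_NR_CPUS 8UL	/* hypothetical small compile-time CPU count */

/* Generic path (hypothetical): scans bit by bit across all words. */
static unsigned long slow_find_next_bit(const unsigned long *addr,
					unsigned long size,
					unsigned long offset)
{
	for (; offset < size; offset++)
		if (addr[offset / BITS_PER_LONG] &
		    (1UL << (offset % BITS_PER_LONG)))
			return offset;
	return size;
}

/* Inline wrapper: mask, test and count trailing zeros for one-word bitmaps. */
static inline unsigned long sketch_find_next_bit(const unsigned long *addr,
						 unsigned long size,
						 unsigned long offset)
{
	if (small_const_nbits(size)) {
		unsigned long val;

		if (offset >= size)
			return size;
		val = *addr & GENMASK(size - 1, offset);
		return val ? (unsigned long)__builtin_ctzl(val) : size;
	}
	return slow_find_next_bit(addr, size, offset);
}

/* cpumask-like caller: the size is a constant that fits in one word. */
static unsigned long sketch_next_cpu(const unsigned long *mask,
				     unsigned long cpu)
{
	return sketch_find_next_bit(mask, SKETCH_NR_CPUS, cpu);
}
```

Both paths return the bitmap size when no set bit is found at or after the
given offset, which is the convention callers of find_next_bit() rely on.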

v6 is mostly a resend. The only change compared to v5 is a fix to the
small_const_nbits() synchronization patch.

v1: https://www.spinics.net/lists/kernel/msg3804727.html
v2: https://www.spinics.net/lists/linux-m68k/msg16945.html
v3: https://www.spinics.net/lists/kernel/msg3837020.html
v4: https://patchwork.kernel.org/project/linux-sh/cover/20210316015424.1999082-1-yury.norov@gmail.com/
v5: https://lore.kernel.org/linux-arch/20210321215457.588554-1-yury.norov@gmail.com/T/
v6: - sync small_const_nbits() properly (patch 6).
    - Rasmus' ack added.

Yury Norov (12):
  tools: disable -Wno-type-limits
  tools: bitmap: sync function declarations with the kernel
  tools: sync BITMAP_LAST_WORD_MASK() macro with the kernel
  arch: rearrange headers inclusion order in asm/bitops for m68k and sh
  lib: extend the scope of small_const_nbits() macro
  tools: sync small_const_nbits() macro with the kernel
  lib: inline _find_next_bit() wrappers
  tools: sync find_next_bit implementation
  lib: add fast path for find_next_*_bit()
  lib: add fast path for find_first_*_bit() and find_last_bit()
  tools: sync lib/find_bit implementation
  MAINTAINERS: Add entry for the bitmap API

 MAINTAINERS                             |  16 ++++
 arch/m68k/include/asm/bitops.h          |   6 +-
 arch/sh/include/asm/bitops.h            |   5 +-
 include/asm-generic/bitops/find.h       | 108 +++++++++++++++++++++---
 include/asm-generic/bitops/le.h         |  38 ++++++++-
 include/asm-generic/bitsperlong.h       |  12 +++
 include/linux/bitmap.h                  |   8 --
 include/linux/bitops.h                  |  12 ---
 lib/find_bit.c                          |  68 ++-------------
 tools/include/asm-generic/bitops/find.h |  85 +++++++++++++++++--
 tools/include/asm-generic/bitsperlong.h |   3 +
 tools/include/linux/bitmap.h            |  18 ++--
 tools/lib/bitmap.c                      |   4 +-
 tools/lib/find_bit.c                    |  56 +++++-------
 tools/scripts/Makefile.include          |   1 +
 15 files changed, 284 insertions(+), 156 deletions(-)

Comments

Andy Shevchenko April 1, 2021, 9:14 a.m. UTC | #1
On Thu, Apr 1, 2021 at 3:36 AM Yury Norov <yury.norov@gmail.com> wrote:
>
> Bitmap operations are much simpler and faster in case of small bitmaps
> which fit into a single word. In linux/bitmap.c we have a machinery that
> allows compiler to replace actual function call with a few instructions
> if bitmaps passed into the function are small and their size is known at
> compile time.
>
> find_*_bit() API lacks this functionality; but users will benefit from it
> a lot. One important example is cpumask subsystem when
> NR_CPUS <= BITS_PER_LONG.

Cool, thanks!

I guess it's assumed to go via Andrew's tree.

But after that, since you are about to become the maintainer of this, I
think it would make sense to send PRs directly to Linus. I would recommend
creating an official tree (followed by an update to MAINTAINERS) and
connecting it to linux-next (usually done by emailing Stephen).


--
With Best Regards,
Andy Shevchenko

Arnd Bergmann April 1, 2021, 9:28 a.m. UTC | #2
On Thu, Apr 1, 2021 at 11:16 AM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
>
> On Thu, Apr 1, 2021 at 3:36 AM Yury Norov <yury.norov@gmail.com> wrote:
> >
> > Bitmap operations are much simpler and faster in case of small bitmaps
> > which fit into a single word. In linux/bitmap.c we have a machinery that
> > allows compiler to replace actual function call with a few instructions
> > if bitmaps passed into the function are small and their size is known at
> > compile time.
> >
> > find_*_bit() API lacks this functionality; but users will benefit from it
> > a lot. One important example is cpumask subsystem when
> > NR_CPUS <= BITS_PER_LONG.
>
> Cool, thanks!
>
> I guess it's assumed to go via Andrew's tree.
>
> But after that since you are about to be a maintainer of this, I think
> it would make sense to send PRs directly to Linus. I would recommend
> creating an official tree (followed by an update in the MAINTAINERS)
> and connecting it to Linux next (usually done by email to Stephen).

It depends on how often we expect to see updates to this. I have not
followed the changes as closely as I should have, but I can also
merge them through the asm-generic tree this time so that Andrew
has to carry fewer patches.

I normally don't have a lot of material for asm-generic either; half
the time there are no pull requests at all for a given release. I would
expect future changes to the bitmap implementation to only need
an occasional bugfix, which could go through either the asm-generic
tree or through mm and doesn't need another separate pull request.

If it turns out to be a tree that needs regular updates every release,
then having a top-level repository in linux-next would be appropriate.

        Arnd

Andy Shevchenko April 1, 2021, 9:50 a.m. UTC | #3
On Thu, Apr 1, 2021 at 12:29 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Thu, Apr 1, 2021 at 11:16 AM Andy Shevchenko
> <andy.shevchenko@gmail.com> wrote:
> >
> > On Thu, Apr 1, 2021 at 3:36 AM Yury Norov <yury.norov@gmail.com> wrote:
> > >
> > > Bitmap operations are much simpler and faster in case of small bitmaps
> > > which fit into a single word. In linux/bitmap.c we have a machinery that
> > > allows compiler to replace actual function call with a few instructions
> > > if bitmaps passed into the function are small and their size is known at
> > > compile time.
> > >
> > > find_*_bit() API lacks this functionality; but users will benefit from it
> > > a lot. One important example is cpumask subsystem when
> > > NR_CPUS <= BITS_PER_LONG.
> >
> > Cool, thanks!
> >
> > I guess it's assumed to go via Andrew's tree.
> >
> > But after that since you are about to be a maintainer of this, I think
> > it would make sense to send PRs directly to Linus. I would recommend
> > creating an official tree (followed by an update in the MAINTAINERS)
> > and connecting it to Linux next (usually done by email to Stephen).
>
> It depends on how often we expect to see updates to this. I have not
> followed the changes as closely as I should have, but I can also
> merge them through the asm-generic tree for this time so Andrew
> has to carry fewer patches for this.
>
> I normally don't have a lot of material for asm-generic either, half
> the time there are no pull requests at all for a given release. I would
> expect future changes to the bitmap implementation to only need
> an occasional bugfix, which could go through either the asm-generic
> tree or through mm and doesn't need another separate pull request.
>
> If it turns out to be a tree that needs regular updates every time,
> then having a top level repository in linux-next would be appropriate.

Agreed. asm-generic may serve for this. My only worry is how much
burden we put on Andrew's shoulders.

Andrew Morton April 2, 2021, 12:32 a.m. UTC | #4
On Thu, 1 Apr 2021 12:50:31 +0300 Andy Shevchenko <andy.shevchenko@gmail.com> wrote:

> > I normally don't have a lot of material for asm-generic either, half
> > the time there are no pull requests at all for a given release. I would
> > expect future changes to the bitmap implementation to only need
> > an occasional bugfix, which could go through either the asm-generic
> > tree or through mm and doesn't need another separate pull request.
> >
> > If it turns out to be a tree that needs regular updates every time,
> > then having a top level repository in linux-next would be appropriate.
> 
> Agree. asm-generic may serve for this. My worries are solely about how
> much burden we add on Andrew's shoulders.

Is fine.  Saving other developers from having to maintain tiny trees is
a thing I do.