
bitmap: fix memset optimization on big-endian systems

Message ID 817147544aa3ecc2b78d6cadeab713869d8805e6.1522709616.git.osandov@fb.com (mailing list archive)
State New, archived

Commit Message

Omar Sandoval April 2, 2018, 10:58 p.m. UTC
From: Omar Sandoval <osandov@fb.com>

Commit 2a98dc028f91 introduced an optimization to bitmap_{set,clear}()
which uses memset() when the start and length are constants aligned to a
byte. This is wrong on big-endian systems; our bitmaps are arrays of
unsigned long, so bit n is not at byte n / 8 in memory. This was caught
by the Btrfs selftests, but the bitmap selftests also fail when run on a
big-endian machine.

We can still use memset if the start and length are aligned to an
unsigned long, so do that on big-endian. The same problem applies to the
memcmp in bitmap_equal(), so fix it there, too.

Fixes: 2a98dc028f91 ("include/linux/bitmap.h: turn bitmap_set and bitmap_clear into memset when possible")
Fixes: 2c6deb01525a ("bitmap: use memcmp optimisation in more situations")
Cc: stable@kernel.org
Reported-by: "Erhard F." <erhard_f@mailbox.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
---
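Not part of the commit message, but as a quick illustration of the layout
issue (a minimal userspace sketch; the byte indices in the comment assume
64-bit longs):

  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
          unsigned long word = 1UL;       /* "bit 0 set", as the bitmap API sees it */
          unsigned char bytes[sizeof(word)];

          memcpy(bytes, &word, sizeof(word));
          /*
           * Little-endian: bytes[0] == 0x01, so addressing byte n / 8 happens
           * to work. 64-bit big-endian: bytes[7] == 0x01, i.e. bit 0 lives in
           * the *last* byte of the word, so byte-granular memset() math picks
           * the wrong byte.
           */
          for (size_t i = 0; i < sizeof(word); i++)
                  printf("byte %zu: 0x%02x\n", i, bytes[i]);
          return 0;
  }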
 include/linux/bitmap.h | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

Comments

Omar Sandoval April 2, 2018, 11 p.m. UTC | #1
On Mon, Apr 02, 2018 at 03:58:31PM -0700, Omar Sandoval wrote:
> [...]
> Cc: stable@kernel.org

This should be stable@vger.kernel.org, of course
Linus Torvalds April 2, 2018, 11:37 p.m. UTC | #2
On Mon, Apr 2, 2018 at 3:58 PM, Omar Sandoval <osandov@osandov.com> wrote:
>
> Commit 2a98dc028f91 introduced an optimization to bitmap_{set,clear}()
> which uses memset() when the start and length are constants aligned to a
> byte. This is wrong on big-endian systems;

Ugh, yes.

In retrospect, I do wish I had just made the bitmap types be
byte-based, but we had strong reasons for those "unsigned long" array
semantics.

The traditional problem - and the reason why it is byte-order
dependent - was that we often mix bitmap operations with "unsigned
long flags" style operations.

We should probably have at least switched it to "unsigned long int"
with the whole 64-bit transition, but never did even that, so the
bitmap format is not just byte order dependent, but dependent on the
size of "long".

I guess the "unsigned long flag" issue still exists in several places,
and we're stuck with it, probably forever. Think page flags, but also
various networking flags etc.

You'd *think* they use bitmap operations consistently, but they
definitely mix them with direct accesses to the flags field. E.g. the page
flags are usually manipulated via the PG_xyz bit numbers, but occasionally
you have code that checks multiple independent bits at once, doing
things like

  #define PAGE_FLAGS_PRIVATE                              \
          (1UL << PG_private | 1UL << PG_private_2)

  static inline int page_has_private(struct page *page)
  {
          return !!(page->flags & PAGE_FLAGS_PRIVATE);
  }

so the bits really are _exposed_ as being encoded as bits in an unsigned long.
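Concretely, the mixing looks something like this (a sketch, not code from the
tree; set_bit(), page->flags and PG_private are the real interfaces, the
pr_debug() call is just illustrative):

  /* Same page->flags word, touched by bit number and by mask. */
  static void mark_and_check(struct page *page)
  {
          set_bit(PG_private, &page->flags);         /* bitmap-style: bit number */

          if (page->flags & (1UL << PG_private))     /* direct: mask on the word */
                  pr_debug("page has private data\n");
  }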

Your patch is obviously correct, and we just need to make sure people
*really* understand that bitmaps are arrays of unsigned long, and byte
(and bit) order is a real thing.

                 Linus
Linus Torvalds April 2, 2018, 11:49 p.m. UTC | #3
On Mon, Apr 2, 2018 at 4:37 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> We should probably have at least switched it to "unsigned long int"

I meant just "unsigned int", of course.

Right now we occasionally have a silly 64-bit field just for a couple of flags.

Of course, we do have cases where 64-bit architectures happily end up
using more than 32 bits too, so the "unsigned long" is occasionally
useful. But it's rare enough that it probably wasn't the right thing
to do.

Water under the bridge. Part of it is due to another historical
accident: the alpha port was one of the early ports, and it didn't do
atomic byte accesses at all.

So we had multiple issues that all conspired to "unsigned long arrays
are the natural type for atomic bit operations" even though they have this
really annoying byte order issue.

                  Linus
Omar Sandoval April 3, 2018, 6:08 p.m. UTC | #4
On Mon, Apr 02, 2018 at 04:49:39PM -0700, Linus Torvalds wrote:
> On Mon, Apr 2, 2018 at 4:37 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > We should probably have at least switched it to "unsigned long int"
> 
> I meant just "unsigned int", of course.
> 
> Right now we occasionally have a silly 64-bit field just for a couple of flags.

Not to mention the mix of bit fields, macros, and enums, some of which
are bit masks, some of which are the bit number :)

> Of course, we do have cases where 64-bit architectures happily end up
> using more than 32 bits too, so the "unsigned long" is occasionally
> useful. But it's rare enough that it probably wasn't the right thing
> to do.
> 
> Water under the bridge. Part of it is due to another historical
> accident: the alpha port was one of the early ports, and it didn't do
> atomic byte accesses at all.
> 
> So we had multiple issues that all conspired to "unsigned long arrays
> are the natural type for atomic bit operations" even though they have this
> really annoying byte order issue.

Thanks for the historical context, this is exactly what I was wondering
when I spotted this bug and fixed a similar one in Btrfs a couple of
years back.
Omar Sandoval April 3, 2018, 6:14 p.m. UTC | #5
Just wanted to make sure this doesn't get missed because I misspelled
the stable email address in the patch. It applies to v4.13+.
Linus Torvalds April 3, 2018, 6:57 p.m. UTC | #6
On Tue, Apr 3, 2018 at 11:14 AM, Omar Sandoval <osandov@osandov.com> wrote:
> Just wanted to make sure this doesn't get missed because I misspelled
> the stable email address in the patch. It applies to v4.13+.

The "stable@kernel.org" address for a stable cc is actually better
than "stable@vger" imho, because afaik it still matches Greg's
automation (that also picks up on "Fixes:" tags), and it does *not*
cause extra email flurries when people use git-send-email scripts.

iirc, we had some vendor leak a security bug early because they
actually just cc'd everybody - including the stable list - on the
not-yet-publicly-committed change.

Greg - if your automation has changed, and you actually really want
the "vger", let me know. Because I tend to just use
"stable@kernel.org"

               Linus
Greg KH April 4, 2018, 6:12 a.m. UTC | #7
On Tue, Apr 03, 2018 at 11:57:26AM -0700, Linus Torvalds wrote:
> 
> Greg - if your automation has changed, and you actually really want
> the "vger", let me know. Because I tend to just use
> "stable@kernel.org"

Either is fine, my scripts pick up both variants.

thanks,

greg k-h
Sasha Levin April 6, 2018, 8:05 p.m. UTC | #8
Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: 2a98dc028f91 include/linux/bitmap.h: turn bitmap_set and bitmap_clear into memset when possible.

The bot has also determined it's probably a bug fixing patch. (score: 65.4067)

The bot has tested the following trees: v4.16, v4.15.15, v4.14.32.

v4.16: Build OK!
v4.15.15: Build OK!
v4.14.32: Build OK!

--
Thanks,
Sasha

Patch

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 5f11fbdc27f8..1ee46f492267 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -302,12 +302,20 @@  static inline void bitmap_complement(unsigned long *dst, const unsigned long *sr
 		__bitmap_complement(dst, src, nbits);
 }
 
+#ifdef __LITTLE_ENDIAN
+#define BITMAP_MEM_ALIGNMENT 8
+#else
+#define BITMAP_MEM_ALIGNMENT (8 * sizeof(unsigned long))
+#endif
+#define BITMAP_MEM_MASK (BITMAP_MEM_ALIGNMENT - 1)
+
 static inline int bitmap_equal(const unsigned long *src1,
 			const unsigned long *src2, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
 		return !((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
-	if (__builtin_constant_p(nbits & 7) && IS_ALIGNED(nbits, 8))
+	if (__builtin_constant_p(nbits & BITMAP_MEM_MASK) &&
+	    IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT))
 		return !memcmp(src1, src2, nbits / 8);
 	return __bitmap_equal(src1, src2, nbits);
 }
@@ -358,8 +366,10 @@  static __always_inline void bitmap_set(unsigned long *map, unsigned int start,
 {
 	if (__builtin_constant_p(nbits) && nbits == 1)
 		__set_bit(start, map);
-	else if (__builtin_constant_p(start & 7) && IS_ALIGNED(start, 8) &&
-		 __builtin_constant_p(nbits & 7) && IS_ALIGNED(nbits, 8))
+	else if (__builtin_constant_p(start & BITMAP_MEM_MASK) &&
+		 IS_ALIGNED(start, BITMAP_MEM_ALIGNMENT) &&
+		 __builtin_constant_p(nbits & BITMAP_MEM_MASK) &&
+		 IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT))
 		memset((char *)map + start / 8, 0xff, nbits / 8);
 	else
 		__bitmap_set(map, start, nbits);
@@ -370,8 +380,10 @@  static __always_inline void bitmap_clear(unsigned long *map, unsigned int start,
 {
 	if (__builtin_constant_p(nbits) && nbits == 1)
 		__clear_bit(start, map);
-	else if (__builtin_constant_p(start & 7) && IS_ALIGNED(start, 8) &&
-		 __builtin_constant_p(nbits & 7) && IS_ALIGNED(nbits, 8))
+	else if (__builtin_constant_p(start & BITMAP_MEM_MASK) &&
+		 IS_ALIGNED(start, BITMAP_MEM_ALIGNMENT) &&
+		 __builtin_constant_p(nbits & BITMAP_MEM_MASK) &&
+		 IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT))
 		memset((char *)map + start / 8, 0, nbits / 8);
 	else
 		__bitmap_clear(map, start, nbits);
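For reference (not part of the patch), roughly how the new alignment check
plays out for constant arguments, assuming BITS_PER_LONG == 64:

  #include <linux/bitmap.h>

  static void alignment_example(void)
  {
          DECLARE_BITMAP(map, 128);

          bitmap_zero(map, 128);
          bitmap_set(map, 0, 64);   /* word-aligned: memset() path on either endianness */
          bitmap_set(map, 8, 16);   /* byte-aligned only: memset() on little-endian,
                                     * __bitmap_set() fallback on big-endian */
          bitmap_set(map, 3, 5);    /* unaligned: always __bitmap_set() */

          /* bitmap_equal() gets the same treatment for its memcmp() shortcut. */
  }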