diff mbox series

[3/9] simplify ((A & M) | B ) & M' even when (M & M') != 0

Message ID 20180808143528.82880-4-luc.vanoostenryck@gmail.com (mailing list archive)
State Rejected, archived
Headers show
Series more simplifications of bitfiled accesses | expand

Commit Message

Luc Van Oostenryck Aug. 8, 2018, 2:35 p.m. UTC
Currently, in simplify_and_or_mask(), the simplification
is only done when (M & M') == 0.

However it can be simplified anytime when (M & M') != M,
resulting in a smaller mask.

Allow the mask simplification when (M & M') != M.
---
 simplify.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Luc Van Oostenryck Aug. 9, 2018, 9:20 a.m. UTC | #1
On Wed, Aug 08, 2018 at 04:35:22PM +0200, Luc Van Oostenryck wrote:
> Currently, in simplify_and_or_mask(), the simplification
> is only done when (M & M') == 0.
> 
> However it can be simplified anytime when (M & M') != M,
> resulting in a smaller mask.
> 
> Allow the mask simplification when (M & M') != M.
> ---
>  simplify.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/simplify.c b/simplify.c
> index 6b813d4f9..6a68ee5b4 100644
> --- a/simplify.c
> +++ b/simplify.c
> @@ -833,9 +833,12 @@ static int simplify_and_or_mask(struct instruction *insn, pseudo_t and, pseudo_t
>  		return 0;
>  	omask = def->src2->value;
>  	nmask = omask & mask;
> +	if (nmask == omask)
> +		return 0;
>  	if (nmask == 0)
>  		return replace_pseudo(insn, &insn->src1, other);
> -	return 0;

There is a
	if (nbr_users(insn->src1) > 1)
		return 0;
missing here.

This make the whole series incorrect.
V2 will be rolled out later.

> +	def->src2 = value_pseudo(nmask);
> +	return REPEAT_CSE;
>  }
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linus Torvalds Aug. 9, 2018, 7:05 p.m. UTC | #2
On Thu, Aug 9, 2018 at 2:20 AM Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> >       if (nmask == 0)
> >               return replace_pseudo(insn, &insn->src1, other);
> > -     return 0;
>
> There is a
>         if (nbr_users(insn->src1) > 1)
>                 return 0;
> missing here.

Doin't you mean

        if (multi_users(insn->src1))
                return 0;

in the new world order?

Side note, and independently of that comment, and the real reason I'm
asking: it would really be lovely for all these "simplify" commits to
have example code generation changes.

I'm betting you have some simple bitfield code examples that you are
using to guide these things, wouldn't it be nice to have those
examples as part of the explanation for why you do particular
simplifications?

I realize that this isn't what has been normally done, and it
obviously depends on the particular simplification. Some
simplifications are so obvious that they don't really even merit an
example. But this one strikes me as pretty involved from the
description, even though I suspect that the actual bitfield operation
that triggers this is very simple.

                Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luc Van Oostenryck Aug. 9, 2018, 9:24 p.m. UTC | #3
On Thu, Aug 09, 2018 at 12:05:56PM -0700, Linus Torvalds wrote:
> On Thu, Aug 9, 2018 at 2:20 AM Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> >
> > >       if (nmask == 0)
> > >               return replace_pseudo(insn, &insn->src1, other);
> > > -     return 0;
> >
> > There is a
> >         if (nbr_users(insn->src1) > 1)
> >                 return 0;
> > missing here.
> 
> Doin't you mean
> 
>         if (multi_users(insn->src1))
>                 return 0;
> 
> in the new world order?

Well, yes and no.
Because it's the last part of an old branch (and I often avoid to
rebase in this case), it's still in the old world nbr_users().
I'll either rebase it later or add a path doing the conversion
once it's merged with the master branch.

> Side note, and independently of that comment, and the real reason I'm
> asking: it would really be lovely for all these "simplify" commits to
> have example code generation changes.
> 
> I'm betting you have some simple bitfield code examples that you are
> using to guide these things, wouldn't it be nice to have those
> examples as part of the explanation for why you do particular
> simplifications?

Yes, absolutely.
Months ago I did this more often but now I tend to focus more on
the tests (so often now, a series begin with a testcase with the
goal to reach and set as known-to-fail, then at some stages the
known-to-fail can be removed). I also try to exaplain a bit the
simplification as a comment in the code itself. But yes adding an
example in the description would be very good.

Now, it seems that yesterday, I must have been tired because I made
this error here, I didn't add the testcase(s) and I forgot the signoffs.

> I realize that this isn't what has been normally done, and it
> obviously depends on the particular simplification. Some
> simplifications are so obvious that they don't really even merit an
> example. But this one strikes me as pretty involved from the
> description, even though I suspect that the actual bitfield operation
> that triggers this is very simple.

Yes, indeed. In this case, I just would like to simplify storing
a bitfield and reading it back. For example:
	struct s {
		unsigned int :2;
		unsigned int f:3;
	};
	
	int foo(struct s s, int a)
	{
		s.f = a;
		return s.f;
	}

which now give:
	foo:
		and.32      %r11 <- %arg2, $7
		ret.32      %r11

instead of:
	foo:
		and.32      %r4 <- %arg2, $7
		shl.32      %r5 <- %r4, $2
		and.32      %r6 <- %arg1, $0xffffffe3
		or.32       %r7 <- %r6, %r5
		lsr.32      %r9 <- %r7, $2
		and.32      %r11 <- %r9, $7
		ret.32      %r11

I'll add all this in the patch description.

This is in fact the last step before I integrate the SSA rewrite
(which I hope to do real soon now, next week I hope) because there
was some regression there with the introduction of PSEUDO_UNDEFs
(for the tracking of undefined pseudo which, among other things,
allows to avoid loops of self-defined pseudos, what's I call
"cyclic 'DAGs'" and is blocking a lot of simplifications).

I must say that I'm struggling a bit lately with the simplification
of bitfield accesses and I'm more and more inclined to add bitfield
insert & bit field extract instruction. It would help a lot but it
would also force to duplicate or redo a lot of the simplifications
already done for ASR/LSR/SH/AND/OR.

-- Luc
Linus Torvalds Aug. 9, 2018, 9:42 p.m. UTC | #4
On Thu, Aug 9, 2018 at 2:24 PM Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> I must say that I'm struggling a bit lately with the simplification
> of bitfield accesses and I'm more and more inclined to add bitfield
> insert & bit field extract instruction. It would help a lot but it
> would also force to duplicate or redo a lot of the simplifications
> already done for ASR/LSR/SH/AND/OR.

I would really suggest against it.

Turning bitfields into just plain memory accesses and shift-and-mask
operations as quickly as possible is almost certainly the right thing
to do.

gcc special-cases bitfields way too much, and it has caused some very
nasty issues. See for example my ancient bug-report at

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696

which is also a strong argument for the whole "keep the base type
around, and do *not* think of it as just bits X-Y access", because it
turns out that the base type matters for both access size and for
access rules (ie volatile).

Now, I'm assuming you were thinking of just an "arithmetic"
insert/extract model (so separate from the memory access), and while
that at least avoids the access issues, I can pretty much guarantee
that it just makes for even more special cases and combinations than
just turning them into shift-and-mask operations.

              Linus
Luc Van Oostenryck Aug. 9, 2018, 11:15 p.m. UTC | #5
On Thu, Aug 09, 2018 at 02:42:24PM -0700, Linus Torvalds wrote:
> On Thu, Aug 9, 2018 at 2:24 PM Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> >
> > I must say that I'm struggling a bit lately with the simplification
> > of bitfield accesses and I'm more and more inclined to add bitfield
> > insert & bit field extract instruction. It would help a lot but it
> > would also force to duplicate or redo a lot of the simplifications
> > already done for ASR/LSR/SH/AND/OR.
> 
> I would really suggest against it.
> 
> Turning bitfields into just plain memory accesses and shift-and-mask
> operations as quickly as possible is almost certainly the right thing
> to do.
> 
> gcc special-cases bitfields way too much, and it has caused some very
> nasty issues. See for example my ancient bug-report at
> 
>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696
> 
> which is also a strong argument for the whole "keep the base type
> around, and do *not* think of it as just bits X-Y access", because it
> turns out that the base type matters for both access size and for
> access rules (ie volatile).

Hehe, I'm in fact just busy to work on a bug that sparse has with
volatile bitfields (the base type lost the volatile modifiers).

> Now, I'm assuming you were thinking of just an "arithmetic"
> insert/extract model (so separate from the memory access), and while
> that at least avoids the access issues, I can pretty much guarantee
> that it just makes for even more special cases and combinations than
> just turning them into shift-and-mask operations.

Yes, I didn't meant the memory access, only the shift/mask/combine
part, like ARM64's BFI, BFXU & BFXS. They would indeed create
new combinations that would need to simplify but they would have
the big advantage to be single-intruction operationss while now
a bitfield extract is 3 instructions (LSR + TRUNC + [SZ]EXT)
and an insert is 4 instructions (SHL + 2 AND + OR) and it's the
simplifications of these multi-instruction operations that is
really painful, especially the insert.

But since the current multi-instructions simplifications need anyway
to be kept as they're also useful for non-bitfields, it would just add
more work and more combinations and more code with very few in return
(less instructions and pseudos at linearization time, faster
simplification of bitfields).

-- Luc
Linus Torvalds Aug. 9, 2018, 11:25 p.m. UTC | #6
On Thu, Aug 9, 2018 at 4:15 PM Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> But since the current multi-instructions simplifications need anyway
> to be kept as they're also useful for non-bitfields, it would just add
> more work and more combinations and more code with very few in return
> (less instructions and pseudos at linearization time, faster
> simplification of bitfields).

Exactly. And you often have combinations of shifts _and_ the bitfield
insert/extract cases, so it really explodes.

So having insert/extract instructions might simplify the simple cases,
but it would make the complex cases even worse.

          Linus
diff mbox series

Patch

diff --git a/simplify.c b/simplify.c
index 6b813d4f9..6a68ee5b4 100644
--- a/simplify.c
+++ b/simplify.c
@@ -833,9 +833,12 @@  static int simplify_and_or_mask(struct instruction *insn, pseudo_t and, pseudo_t
 		return 0;
 	omask = def->src2->value;
 	nmask = omask & mask;
+	if (nmask == omask)
+		return 0;
 	if (nmask == 0)
 		return replace_pseudo(insn, &insn->src1, other);
-	return 0;
+	def->src2 = value_pseudo(nmask);
+	return REPEAT_CSE;
 }
 
 static int simplify_constant_mask(struct instruction *insn, unsigned long long mask)
@@ -854,7 +857,6 @@  static int simplify_constant_mask(struct instruction *insn, unsigned long long m
 	case OP_OR:
 		// Let's handle ((A & M') | B ) & M
 		// or           (B | (A & M')) & M
-		// when M' & M == 0
 		src1 = def->src1;
 		src2 = def->src2;
 		if (def_opcode(src1) == OP_AND)