Message ID | 20180808143528.82880-4-luc.vanoostenryck@gmail.com (mailing list archive) |
---|---|
State | Rejected, archived |
Headers | show |
Series | more simplifications of bitfiled accesses | expand |
On Wed, Aug 08, 2018 at 04:35:22PM +0200, Luc Van Oostenryck wrote: > Currently, in simplify_and_or_mask(), the simplification > is only done when (M & M') == 0. > > However it can be simplified anytime when (M & M') != M, > resulting in a smaller mask. > > Allow the mask simplification when (M & M') != M. > --- > simplify.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/simplify.c b/simplify.c > index 6b813d4f9..6a68ee5b4 100644 > --- a/simplify.c > +++ b/simplify.c > @@ -833,9 +833,12 @@ static int simplify_and_or_mask(struct instruction *insn, pseudo_t and, pseudo_t > return 0; > omask = def->src2->value; > nmask = omask & mask; > + if (nmask == omask) > + return 0; > if (nmask == 0) > return replace_pseudo(insn, &insn->src1, other); > - return 0; There is a if (nbr_users(insn->src1) > 1) return 0; missing here. This make the whole series incorrect. V2 will be rolled out later. > + def->src2 = value_pseudo(nmask); > + return REPEAT_CSE; > } -- To unsubscribe from this list: send the line "unsubscribe linux-sparse" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 9, 2018 at 2:20 AM Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > > > if (nmask == 0) > > return replace_pseudo(insn, &insn->src1, other); > > - return 0; > > There is a > if (nbr_users(insn->src1) > 1) > return 0; > missing here. Doin't you mean if (multi_users(insn->src1)) return 0; in the new world order? Side note, and independently of that comment, and the real reason I'm asking: it would really be lovely for all these "simplify" commits to have example code generation changes. I'm betting you have some simple bitfield code examples that you are using to guide these things, wouldn't it be nice to have those examples as part of the explanation for why you do particular simplifications? I realize that this isn't what has been normally done, and it obviously depends on the particular simplification. Some simplifications are so obvious that they don't really even merit an example. But this one strikes me as pretty involved from the description, even though I suspect that the actual bitfield operation that triggers this is very simple. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-sparse" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 09, 2018 at 12:05:56PM -0700, Linus Torvalds wrote: > On Thu, Aug 9, 2018 at 2:20 AM Luc Van Oostenryck > <luc.vanoostenryck@gmail.com> wrote: > > > > > if (nmask == 0) > > > return replace_pseudo(insn, &insn->src1, other); > > > - return 0; > > > > There is a > > if (nbr_users(insn->src1) > 1) > > return 0; > > missing here. > > Doin't you mean > > if (multi_users(insn->src1)) > return 0; > > in the new world order? Well, yes and no. Because it's the last part of an old branch (and I often avoid to rebase in this case), it's still in the old world nbr_users(). I'll either rebase it later or add a path doing the conversion once it's merged with the master branch. > Side note, and independently of that comment, and the real reason I'm > asking: it would really be lovely for all these "simplify" commits to > have example code generation changes. > > I'm betting you have some simple bitfield code examples that you are > using to guide these things, wouldn't it be nice to have those > examples as part of the explanation for why you do particular > simplifications? Yes, absolutely. Months ago I did this more often but now I tend to focus more on the tests (so often now, a series begin with a testcase with the goal to reach and set as known-to-fail, then at some stages the known-to-fail can be removed). I also try to exaplain a bit the simplification as a comment in the code itself. But yes adding an example in the description would be very good. Now, it seems that yesterday, I must have been tired because I made this error here, I didn't add the testcase(s) and I forgot the signoffs. > I realize that this isn't what has been normally done, and it > obviously depends on the particular simplification. Some > simplifications are so obvious that they don't really even merit an > example. But this one strikes me as pretty involved from the > description, even though I suspect that the actual bitfield operation > that triggers this is very simple. Yes, indeed. In this case, I just would like to simplify storing a bitfield and reading it back. For example: struct s { unsigned int :2; unsigned int f:3; }; int foo(struct s s, int a) { s.f = a; return s.f; } which now give: foo: and.32 %r11 <- %arg2, $7 ret.32 %r11 instead of: foo: and.32 %r4 <- %arg2, $7 shl.32 %r5 <- %r4, $2 and.32 %r6 <- %arg1, $0xffffffe3 or.32 %r7 <- %r6, %r5 lsr.32 %r9 <- %r7, $2 and.32 %r11 <- %r9, $7 ret.32 %r11 I'll add all this in the patch description. This is in fact the last step before I integrate the SSA rewrite (which I hope to do real soon now, next week I hope) because there was some regression there with the introduction of PSEUDO_UNDEFs (for the tracking of undefined pseudo which, among other things, allows to avoid loops of self-defined pseudos, what's I call "cyclic 'DAGs'" and is blocking a lot of simplifications). I must say that I'm struggling a bit lately with the simplification of bitfield accesses and I'm more and more inclined to add bitfield insert & bit field extract instruction. It would help a lot but it would also force to duplicate or redo a lot of the simplifications already done for ASR/LSR/SH/AND/OR. -- Luc
On Thu, Aug 9, 2018 at 2:24 PM Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > > I must say that I'm struggling a bit lately with the simplification > of bitfield accesses and I'm more and more inclined to add bitfield > insert & bit field extract instruction. It would help a lot but it > would also force to duplicate or redo a lot of the simplifications > already done for ASR/LSR/SH/AND/OR. I would really suggest against it. Turning bitfields into just plain memory accesses and shift-and-mask operations as quickly as possible is almost certainly the right thing to do. gcc special-cases bitfields way too much, and it has caused some very nasty issues. See for example my ancient bug-report at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696 which is also a strong argument for the whole "keep the base type around, and do *not* think of it as just bits X-Y access", because it turns out that the base type matters for both access size and for access rules (ie volatile). Now, I'm assuming you were thinking of just an "arithmetic" insert/extract model (so separate from the memory access), and while that at least avoids the access issues, I can pretty much guarantee that it just makes for even more special cases and combinations than just turning them into shift-and-mask operations. Linus
On Thu, Aug 09, 2018 at 02:42:24PM -0700, Linus Torvalds wrote: > On Thu, Aug 9, 2018 at 2:24 PM Luc Van Oostenryck > <luc.vanoostenryck@gmail.com> wrote: > > > > I must say that I'm struggling a bit lately with the simplification > > of bitfield accesses and I'm more and more inclined to add bitfield > > insert & bit field extract instruction. It would help a lot but it > > would also force to duplicate or redo a lot of the simplifications > > already done for ASR/LSR/SH/AND/OR. > > I would really suggest against it. > > Turning bitfields into just plain memory accesses and shift-and-mask > operations as quickly as possible is almost certainly the right thing > to do. > > gcc special-cases bitfields way too much, and it has caused some very > nasty issues. See for example my ancient bug-report at > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696 > > which is also a strong argument for the whole "keep the base type > around, and do *not* think of it as just bits X-Y access", because it > turns out that the base type matters for both access size and for > access rules (ie volatile). Hehe, I'm in fact just busy to work on a bug that sparse has with volatile bitfields (the base type lost the volatile modifiers). > Now, I'm assuming you were thinking of just an "arithmetic" > insert/extract model (so separate from the memory access), and while > that at least avoids the access issues, I can pretty much guarantee > that it just makes for even more special cases and combinations than > just turning them into shift-and-mask operations. Yes, I didn't meant the memory access, only the shift/mask/combine part, like ARM64's BFI, BFXU & BFXS. They would indeed create new combinations that would need to simplify but they would have the big advantage to be single-intruction operationss while now a bitfield extract is 3 instructions (LSR + TRUNC + [SZ]EXT) and an insert is 4 instructions (SHL + 2 AND + OR) and it's the simplifications of these multi-instruction operations that is really painful, especially the insert. But since the current multi-instructions simplifications need anyway to be kept as they're also useful for non-bitfields, it would just add more work and more combinations and more code with very few in return (less instructions and pseudos at linearization time, faster simplification of bitfields). -- Luc
On Thu, Aug 9, 2018 at 4:15 PM Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > > But since the current multi-instructions simplifications need anyway > to be kept as they're also useful for non-bitfields, it would just add > more work and more combinations and more code with very few in return > (less instructions and pseudos at linearization time, faster > simplification of bitfields). Exactly. And you often have combinations of shifts _and_ the bitfield insert/extract cases, so it really explodes. So having insert/extract instructions might simplify the simple cases, but it would make the complex cases even worse. Linus
diff --git a/simplify.c b/simplify.c index 6b813d4f9..6a68ee5b4 100644 --- a/simplify.c +++ b/simplify.c @@ -833,9 +833,12 @@ static int simplify_and_or_mask(struct instruction *insn, pseudo_t and, pseudo_t return 0; omask = def->src2->value; nmask = omask & mask; + if (nmask == omask) + return 0; if (nmask == 0) return replace_pseudo(insn, &insn->src1, other); - return 0; + def->src2 = value_pseudo(nmask); + return REPEAT_CSE; } static int simplify_constant_mask(struct instruction *insn, unsigned long long mask) @@ -854,7 +857,6 @@ static int simplify_constant_mask(struct instruction *insn, unsigned long long m case OP_OR: // Let's handle ((A & M') | B ) & M // or (B | (A & M')) & M - // when M' & M == 0 src1 = def->src1; src2 = def->src2; if (def_opcode(src1) == OP_AND)