Message ID | 20170508185710.22021-1-luc.vanoostenryck@gmail.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
On Mon, May 8, 2017 at 11:57 AM, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > Fix this by calling kill_unreachable_bbs() after having > simplified the switch into a branch. This will avoid to > create a cycle with because of the removed phisrc in the > header and as an added benefit will avoid to waste time > trying to simplify BBs that are unreachable. > > In addition, it's now useless to call kill_bb() for each > removed switch's children as kill_unreachable_bbs() will > do that too. I thin the fix is correct. Have some very minor comment. > > Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> > --- > linearize.c | 3 +-- > validation/crazy02-not-so.c | 22 ++++++++++++++++++++++ > 2 files changed, 23 insertions(+), 2 deletions(-) > create mode 100644 validation/crazy02-not-so.c > > diff --git a/linearize.c b/linearize.c > index a9f36b823..ee9591897 100644 > --- a/linearize.c > +++ b/linearize.c > @@ -642,8 +642,6 @@ static void set_activeblock(struct entrypoint *ep, struct basic_block *bb) > static void remove_parent(struct basic_block *child, struct basic_block *parent) > { > remove_bb_from_list(&child->parents, parent, 1); > - if (!child->parents) > - kill_bb(child); This makes every caller of remove_parent() need to clean up the unreachable basic blocks. Currently there is only one caller "insert_branch()" which is fine. But if developer write a new code call this function, he might forget to clean up the unreachable basic blocks. Kind of like a trap. > } > > /* Change a "switch" into a branch */ > @@ -670,6 +668,7 @@ void insert_branch(struct basic_block *bb, struct instruction *jmp, struct basic > remove_parent(child, bb); > } END_FOR_EACH_PTR(child); > PACK_PTR_LIST(&bb->children); > + kill_unreachable_bbs(bb->ep); It is correct to do so. The kill unreachable is relative expensive. Need to go through all the basic block in a function. If there function has more than one switch statement get simplified, then each simplification will go through each basic block again. Preferably kill_unreachable() only run once at the finial stage. I don't know how feasible to have remove_parent() mark the "ep" needs to clean. Then at some later point kill off the dead basic block all together. Currently simplify a switch statement should be relative rare. Have two switch statement in same function get simplify at the same time should be even rarer. Might not worth the effort to optimize for that. As it is, the patch is acceptable. Chris -- To unsubscribe from this list: send the line "unsubscribe linux-sparse" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 09, 2017 at 03:38:43AM -0700, Christopher Li wrote: > On Mon, May 8, 2017 at 11:57 AM, Luc Van Oostenryck > <luc.vanoostenryck@gmail.com> wrote: > > Fix this by calling kill_unreachable_bbs() after having > > simplified the switch into a branch. This will avoid to > > create a cycle with because of the removed phisrc in the > > header and as an added benefit will avoid to waste time > > trying to simplify BBs that are unreachable. > > > > In addition, it's now useless to call kill_bb() for each > > removed switch's children as kill_unreachable_bbs() will > > do that too. > > I thin the fix is correct. Have some very minor comment. > > > diff --git a/linearize.c b/linearize.c > > index a9f36b823..ee9591897 100644 > > --- a/linearize.c > > +++ b/linearize.c > > @@ -642,8 +642,6 @@ static void set_activeblock(struct entrypoint *ep, struct basic_block *bb) > > static void remove_parent(struct basic_block *child, struct basic_block *parent) > > { > > remove_bb_from_list(&child->parents, parent, 1); > > - if (!child->parents) > > - kill_bb(child); > > This makes every caller of remove_parent() need to clean up the > unreachable basic blocks. Currently there is only one caller "insert_branch()" > which is fine. But if developer write a new code call this function, he might > forget to clean up the unreachable basic blocks. Kind of like a trap. Yes, true. I didn't liked it very much myself and hesitated to remove the helper and directly calling remove_bb_from_list() from insert_branch(). I'll do this now. but note that in 'normal' situation, it's doesn't really matter if the BB is killed now or not because: 1) it's only the direct descendant that is killed, the indirect ones have to wait anyway. 2) kill_unreachable_bbs() is called after each cleanup loops > > > > /* Change a "switch" into a branch */ > > @@ -670,6 +668,7 @@ void insert_branch(struct basic_block *bb, struct instruction *jmp, struct basic > > remove_parent(child, bb); > > } END_FOR_EACH_PTR(child); > > PACK_PTR_LIST(&bb->children); > > + kill_unreachable_bbs(bb->ep); > > It is correct to do so. The kill unreachable is relative expensive. > Need to go through > all the basic block in a function. I don't really agree with this. When doing the patch I also at first thought the same but then I changed my mind because: 1) it *only* need to run through all the BB (in fact the cost is O(nbr of BB edges) while there is a lot of places where we loop through all *instructions* 2) the fact that the unneeded BB are killed as soon as possible means that the normal CSE+simplification won't wastly run anymore on the indirect descendant blocks (I can redo the measurements but I think I saw a measurable speedup because of this). > If there function has more than one > switch statement > get simplified, then each simplification will go through each basic block again. > Preferably kill_unreachable() only run once at the finial stage. > > I don't know how feasible to have remove_parent() mark the "ep" needs to clean. > Then at some later point kill off the dead basic block all together. Don't forget that the only reason for this patch is a *correctness* issue when the removed BB is followed by a loop (if it has no other entry point (which is generaly the case) and which thus become unreachable). We can mark the ep (the easiest is the reuse repeat_phase and add REPEAT_FLOW (which I'll will need to do anyway for something else, I'm polishing the patch)) but it won't change anything as we *must not* allow simplifications to be done once there is such a dead cycle and the only way to detect such cycles is via something like the marker algorithm used by kill_unreachable_bbs(). > Currently simplify a switch statement should be relative rare. Like a lot of things, it depends. For example, the kernel use a lot inline functions or macro with code like: ... switch (sizeof(x)) { case 1: ... case 2: ... case 4: ... case 8: ... default: issue a build error } Also, the simplification concerned here is 'insert_branch()' which, like I explained in the commit message, is used by the switch simplification but is also used by other branch simplifications. > Have two switch > statement in same function get simplify at the same time should be > even rarer. > Might not worth the effort to optimize for that. > > As it is, the patch is acceptable. I really also would prefer that we would not have to walk the BBs but I really think we don't have good others choices. If we would really really want to avoid to call kill_unreachable_bss() we could in simplify_one_memop(), once we detect a possible loop in the addresses calculation (which is the core of the problem here: avoid to detect such a false loop), instead of issuing a message is to first call kill_unreachable_bbs() and then redo the memop simplification if some BBs were killed and the concerned instruction still exist). But: - it's more complicated than needed - the advanatge would only be based on the principle that kill_unreachable_bbs() is costly (which is not really) - calling kill_unreachable_bbs() early has the advantage to avoid doing memop and instruction simplification and CSE on code that is in fact unreacheble (and those are more costly because they are done on every instructions) I don't have the number anymore, but when testing what I saw (and I was a bit surprised at first) was that with the patch, compiling the kernel with sparse and calling test-linearize on GCC's testsuite and such was in fact slightly faster (but admitingly, the difference was small and it could have been within the accuracy of the measurementi. I can redo the measurement if needed). Also, when talking about performance, if you do some profiling you very quickly realize: - CSE is quite costly (it's not new, cfr. http://marc.info/?l=linux-sparse&m=111616763219436&w=4 and commit b5a8032aaeaf2121a642ead32653d4288fa2983d) I have looked at this more than once but I don't see what we can do but: - avoid to call CSE on dead code - call CSE less often, possibly to not trying to reach the fix point, of course this impact the quality of the code - memory allocation eats a lot of memory cycles too (about 16-20%, not especially in userland but also by things like kernel's clear_page()) But, see how you can easily win a speed up of 5%: http://marc.info/?l=linux-sparse&m=149198627609282&w=4 There is a lot that can be done here. I have a few things in preparation but it won't be for soon. - __add_ptr_list() is also surprinsingly high but I think that it's simply because it's really called a lot. -- Luc -- To unsubscribe from this list: send the line "unsubscribe linux-sparse" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 9, 2017 at 8:27 AM, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > Like a lot of things, it depends. > For example, the kernel use a lot inline functions or macro with > code like: > ... > switch (sizeof(x)) { > case 1: ... > case 2: ... > case 4: ... > case 8: ... > default: issue a build error > } Ha, that is good to know. > Also, the simplification concerned here is 'insert_branch()' > which, like I explained in the commit message, is used by the > switch simplification but is also used by other branch > simplifications. > >> Have two switch >> statement in same function get simplify at the same time should be >> even rarer. >> Might not worth the effort to optimize for that. >> >> As it is, the patch is acceptable. > > I really also would prefer that we would not have to walk the BBs > but I really think we don't have good others choices. Yes, that is what I am afraid as well, it might not worth the complexity try to optimist for it. I have not problem walk the BBs, I just don't want to walk it more than once if easy avoidable. Looks like there is no easy way to get it. Let's leave it as it is. You convinced me. ACK. Chris -- To unsubscribe from this list: send the line "unsubscribe linux-sparse" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
The goal of this series is to get rid of false "crazy programmer" warnings. This warning is issued when a cycle is detected in the address calculation of memory operations. Such cycle can happen when simplifying phi-nodes dealing with uninitialized variables. However the very same diagnostic can also be emitted if we try to simplify phi-nodes inside an unreachable loop (because localy it's exactly the same situation: since the loop is unreachable it's very possible that the induction variables have no more an initial value). Of course, since this situation is only the consequence of trying to simplify code that is not reachable while the original may very well not contain any unitialized var, this diagnostic should not be emitted in this situation. The solution is to kill potential unreachable BBs before trying to simplify them, each time a parent is removed from a BB. --- Changes since v1: - split the patch in 3, the first 2 being small preparatory patches, the third being the meat. - try to avoid to have to call kill_unreachable_bbs() in the rare situations where the BB having a parent removed still contains another parent and thus can't be unreachable. Luc Van Oostenryck (3): introduce REPEAT_CFG_CLEANUP let kill_unreachable_bbs() clear REPEAT_CFG_CLEANUP fix: kill unreachable BBs after killing a child flow.c | 2 ++ flow.h | 1 + linearize.c | 5 ++++- validation/crazy02-not-so.c | 22 ++++++++++++++++++++++ 4 files changed, 29 insertions(+), 1 deletion(-) create mode 100644 validation/crazy02-not-so.c
diff --git a/linearize.c b/linearize.c index a9f36b823..ee9591897 100644 --- a/linearize.c +++ b/linearize.c @@ -642,8 +642,6 @@ static void set_activeblock(struct entrypoint *ep, struct basic_block *bb) static void remove_parent(struct basic_block *child, struct basic_block *parent) { remove_bb_from_list(&child->parents, parent, 1); - if (!child->parents) - kill_bb(child); } /* Change a "switch" into a branch */ @@ -670,6 +668,7 @@ void insert_branch(struct basic_block *bb, struct instruction *jmp, struct basic remove_parent(child, bb); } END_FOR_EACH_PTR(child); PACK_PTR_LIST(&bb->children); + kill_unreachable_bbs(bb->ep); } diff --git a/validation/crazy02-not-so.c b/validation/crazy02-not-so.c new file mode 100644 index 000000000..fe7133587 --- /dev/null +++ b/validation/crazy02-not-so.c @@ -0,0 +1,22 @@ +int foo(int *ptr, int i) +{ + int *p; + + switch (i - i) { // will be optimized to 0 + case 0: + return 0; + case 1: // will be optimized away + p = ptr; + do { // will be an unreachable loop + *p++ = 123; + } while (--i); + break; + } + + return 1; +} + +/* + * check-name: crazy02-not-so.c + * check-command: sparse -Wno-decl $file + */
When simplifying a switch into a simple branch all the now unused children of the current BB must be removed. If one of these children become now orphaned, it is directly killed (it will need to be killed soon or later since it is unreachable). However, if one of the killed children is the header of a loop where some variables are updated this may cause problems. Indeed, by killing the header (which contains the phisrc of the entry value of the variable) the whole loop may become unreachable but is not killed yet, OTOH simplification of the associated OP_PHI may create a cycle which may then be detected later by simplify_one_memop() which will issue a "crazy programmer" warning while the programmer was innocent. This situation can be seen in code like: int *p; switch (i - i) { // will be optimized to 0 case 0: // will be the simple branch return 0; case 1: // will be optimized away p = ptr; do { // will be an unreachable loop *p++ = 123; } while (--i); } Fix this by calling kill_unreachable_bbs() after having simplified the switch into a branch. This will avoid to create a cycle with because of the removed phisrc in the header and as an added benefit will avoid to waste time trying to simplify BBs that are unreachable. In addition, it's now useless to call kill_bb() for each removed switch's children as kill_unreachable_bbs() will do that too. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> --- linearize.c | 3 +-- validation/crazy02-not-so.c | 22 ++++++++++++++++++++++ 2 files changed, 23 insertions(+), 2 deletions(-) create mode 100644 validation/crazy02-not-so.c