Message ID | 20180905223828.73478-9-luc.vanoostenryck@gmail.com (mailing list archive) |
---|---|
State | Mainlined, archived |
Headers | show |
Series | Fix linearization of missing return | expand |
On 05/09/18 23:38, Luc Van Oostenryck wrote: > This will allow to reuse this code to generate valid IR > in the case of a missing return statement. > > Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> > --- > linearize.c | 27 ++++++++++++++++----------- > 1 file changed, 16 insertions(+), 11 deletions(-) > > diff --git a/linearize.c b/linearize.c > index d6647e443..53d66ffa0 100644 > --- a/linearize.c > +++ b/linearize.c > @@ -1980,6 +1980,21 @@ static pseudo_t linearize_compound_statement(struct entrypoint *ep, struct state > return pseudo; > } > > +static void add_return(struct entrypoint *ep, struct basic_block *bb, struct symbol *ctype, pseudo_t src) > +{ > + struct instruction *phi_node = first_instruction(bb->insns); > + pseudo_t phi; > + if (!phi_node) { > + phi_node = alloc_typed_instruction(OP_PHI, ctype); > + phi_node->target = alloc_pseudo(phi_node); > + phi_node->bb = bb; > + add_instruction(&bb->insns, phi_node); > + } > + phi = alloc_phi(ep->active, src, ctype); > + phi->ident = &return_ident; > + use_pseudo(phi_node, phi, add_pseudo(&phi_node->phi_list, phi)); > +} > + > static pseudo_t linearize_fn_statement(struct entrypoint *ep, struct statement *stmt) > { > struct instruction *phi_node; > @@ -2157,17 +2172,7 @@ static pseudo_t linearize_return(struct entrypoint *ep, struct statement *stmt) > pseudo_t src = linearize_expression(ep, expr); > active = ep->active; > if (active && src != VOID) { > - struct instruction *phi_node = first_instruction(bb_return->insns); > - pseudo_t phi; > - if (!phi_node) { > - phi_node = alloc_typed_instruction(OP_PHI, expr->ctype); > - phi_node->target = alloc_pseudo(phi_node); > - phi_node->bb = bb_return; > - add_instruction(&bb_return->insns, phi_node); > - } > - phi = alloc_phi(active, src, expr->ctype); > - phi->ident = &return_ident; > - use_pseudo(phi_node, phi, add_pseudo(&phi_node->phi_list, phi)); > + add_return(ep, bb_return, expr->ctype, src); It does not matter here, but as a matter of interest, why pass the entrypoint here, rather than the active bb? Future plans? ATB, Ramsay Jones > } > add_goto(ep, bb_return); > return VOID; >
On Thu, Sep 06, 2018 at 01:00:11AM +0100, Ramsay Jones wrote: > On 05/09/18 23:38, Luc Van Oostenryck wrote: > > This will allow to reuse this code to generate valid IR > > in the case of a missing return statement. > > > > + add_return(ep, bb_return, expr->ctype, src); > > It does not matter here, but as a matter of interest, why pass > the entrypoint here, rather than the active bb? Indeed, ep in itself is not needed. I didn't noticed. It's mostly a matter of consistency: functions that will add an instruction generaly need the entrypoint because they may change the active BB. > Future plans? Hehe :) Well ... 3 years ago I had a bit free time and a patch that was waiting since 12 or 13 years. I said to myself "wouldn't it be nice if I would clean it up a little bit and upstream it?" And then when cleaning it I found a few bugs in the official code, then there was some patches from other people that was waiting a review since months (_Static_assert() & the constness rework) and voilĂ . Now I have about 180 topics/branches that are pending. My short term plan is to fix all problems related to linearization & SSA (the classic SSA is now in (my) master, I'm busy with the current series to fix some problems with the phi-nodes from linearization), the next step will be to make simplify_loads() generate valid SSA. A longer term will be to write an simulator for the IR (which won't be hard but need stuff put in place first, like correct SSA). I also have this series 'codegen' that can be used to generate code for ARM, ARM64 & RISC-V (it's still crude and lack register allocation but it's quite cool). And there is tons of stuff to do on optimization. There is also an interesting series to allow __bitwise for enums (which is not active for the moment because it impacts quite a bit of the current kernel code and there is a few choices that need to be done). Having good doc is also a goal ... It would also be good to be a bit more attentive to the needs on the kernel side. Oh yes, it it would be good if the patch 551b85c8a ("sparse: add -Wpointer-arith flag to toggle sizeof(void) warnings") could be in the official tree *long sigh*. -- Luc
On 06/09/18 01:52, Luc Van Oostenryck wrote: > On Thu, Sep 06, 2018 at 01:00:11AM +0100, Ramsay Jones wrote: >> On 05/09/18 23:38, Luc Van Oostenryck wrote: >>> This will allow to reuse this code to generate valid IR >>> in the case of a missing return statement. >>> >>> + add_return(ep, bb_return, expr->ctype, src); >> >> It does not matter here, but as a matter of interest, why pass >> the entrypoint here, rather than the active bb? > > Indeed, ep in itself is not needed. I didn't noticed. > It's mostly a matter of consistency: functions that will add > an instruction generaly need the entrypoint because they may > change the active BB. > >> Future plans? > > Hehe :) ;-) I was only enquiring about the add_return() function! (Any future plans to use anything other than ep->active). However, ... > Well ... 3 years ago I had a bit free time and a patch that > was waiting since 12 or 13 years. I said to myself "wouldn't > it be nice if I would clean it up a little bit and upstream it?" > And then when cleaning it I found a few bugs in the official > code, then there was some patches from other people that was > waiting a review since months (_Static_assert() & the constness > rework) and voilĂ . Now I have about 180 topics/branches that > are pending. > > My short term plan is to fix all problems related to linearization > & SSA (the classic SSA is now in (my) master, I'm busy with the > current series to fix some problems with the phi-nodes from > linearization), the next step will be to make simplify_loads() > generate valid SSA. A longer term will be to write an simulator > for the IR (which won't be hard but need stuff put in place first, > like correct SSA). > > I also have this series 'codegen' that can be used to generate > code for ARM, ARM64 & RISC-V (it's still crude and lack > register allocation but it's quite cool). And there is tons > of stuff to do on optimization. > > There is also an interesting series to allow __bitwise > for enums (which is not active for the moment because > it impacts quite a bit of the current kernel code and there is > a few choices that need to be done). > Having good doc is also a goal ... > It would also be good to be a bit more attentive to the needs > on the kernel side. ... good to know. > Oh yes, it it would be good if the patch 551b85c8a > ("sparse: add -Wpointer-arith flag to toggle sizeof(void) warnings") > could be in the official tree *long sigh*. :-D ATB, Ramsay Jones
On Thu, Sep 06, 2018 at 02:46:28AM +0100, Ramsay Jones wrote: > > > On 06/09/18 01:52, Luc Van Oostenryck wrote: > > On Thu, Sep 06, 2018 at 01:00:11AM +0100, Ramsay Jones wrote: > >> On 05/09/18 23:38, Luc Van Oostenryck wrote: > >>> This will allow to reuse this code to generate valid IR > >>> in the case of a missing return statement. > >>> > >>> + add_return(ep, bb_return, expr->ctype, src); > >> > >> It does not matter here, but as a matter of interest, why pass > >> the entrypoint here, rather than the active bb? > > > > Indeed, ep in itself is not needed. I didn't noticed. > > It's mostly a matter of consistency: functions that will add > > an instruction generaly need the entrypoint because they may > > change the active BB. > > > >> Future plans? > > > > Hehe :) > > ;-) I was only enquiring about the add_return() function! > (Any future plans to use anything other than ep->active). I sort of supposed it was but it was more interesting so ;) To answer your real question: no, not that I can imagine. Unlike other parts, the linearization is essentially complete: * C won't have new language constructions * if new instructions will be added they won't impact linearization (I'm thinking about things like OP_ROL/OP_ROR or OP_BSWAP8/16/32). There is one expection though, it's for varags. I can't see other uses for add_return(). I prefer to leave 'ep' as argument since, to me, seeing this as the first argument means "this function need the current context/ will add instructions to the current BB", while seeing an explicit BB means "this function will add instructions to an arbitrary BB". -- Luc
On 06/09/18 21:51, Luc Van Oostenryck wrote: > On Thu, Sep 06, 2018 at 02:46:28AM +0100, Ramsay Jones wrote: >> >> >> On 06/09/18 01:52, Luc Van Oostenryck wrote: >>> On Thu, Sep 06, 2018 at 01:00:11AM +0100, Ramsay Jones wrote: >>>> On 05/09/18 23:38, Luc Van Oostenryck wrote: >>>>> This will allow to reuse this code to generate valid IR >>>>> in the case of a missing return statement. >>>>> >>>>> + add_return(ep, bb_return, expr->ctype, src); >>>> >>>> It does not matter here, but as a matter of interest, why pass >>>> the entrypoint here, rather than the active bb? >>> >>> Indeed, ep in itself is not needed. I didn't noticed. >>> It's mostly a matter of consistency: functions that will add >>> an instruction generaly need the entrypoint because they may >>> change the active BB. >>> >>>> Future plans? >>> >>> Hehe :) >> >> ;-) I was only enquiring about the add_return() function! >> (Any future plans to use anything other than ep->active). > > I sort of supposed it was but it was more interesting so ;) Indeed! > To answer your real question: no, not that I can imagine. > Unlike other parts, the linearization is essentially complete: > * C won't have new language constructions > * if new instructions will be added they won't impact linearization > (I'm thinking about things like OP_ROL/OP_ROR or OP_BSWAP8/16/32). > There is one expection though, it's for varags. > I can't see other uses for add_return(). > I prefer to leave 'ep' as argument since, to me, seeing this as > the first argument means "this function need the current context/ > will add instructions to the current BB", while seeing an explicit > BB means "this function will add instructions to an arbitrary BB". Yep, I suspected it would be something like that. Sounds good to me ... :-D ATB, Ramsay Jones
diff --git a/linearize.c b/linearize.c index d6647e443..53d66ffa0 100644 --- a/linearize.c +++ b/linearize.c @@ -1980,6 +1980,21 @@ static pseudo_t linearize_compound_statement(struct entrypoint *ep, struct state return pseudo; } +static void add_return(struct entrypoint *ep, struct basic_block *bb, struct symbol *ctype, pseudo_t src) +{ + struct instruction *phi_node = first_instruction(bb->insns); + pseudo_t phi; + if (!phi_node) { + phi_node = alloc_typed_instruction(OP_PHI, ctype); + phi_node->target = alloc_pseudo(phi_node); + phi_node->bb = bb; + add_instruction(&bb->insns, phi_node); + } + phi = alloc_phi(ep->active, src, ctype); + phi->ident = &return_ident; + use_pseudo(phi_node, phi, add_pseudo(&phi_node->phi_list, phi)); +} + static pseudo_t linearize_fn_statement(struct entrypoint *ep, struct statement *stmt) { struct instruction *phi_node; @@ -2157,17 +2172,7 @@ static pseudo_t linearize_return(struct entrypoint *ep, struct statement *stmt) pseudo_t src = linearize_expression(ep, expr); active = ep->active; if (active && src != VOID) { - struct instruction *phi_node = first_instruction(bb_return->insns); - pseudo_t phi; - if (!phi_node) { - phi_node = alloc_typed_instruction(OP_PHI, expr->ctype); - phi_node->target = alloc_pseudo(phi_node); - phi_node->bb = bb_return; - add_instruction(&bb_return->insns, phi_node); - } - phi = alloc_phi(active, src, expr->ctype); - phi->ident = &return_ident; - use_pseudo(phi_node, phi, add_pseudo(&phi_node->phi_list, phi)); + add_return(ep, bb_return, expr->ctype, src); } add_goto(ep, bb_return); return VOID;
This will allow to reuse this code to generate valid IR in the case of a missing return statement. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> --- linearize.c | 27 ++++++++++++++++----------- 1 file changed, 16 insertions(+), 11 deletions(-)