[08/11] extract add_return() from linearize_return()

Message ID	20180905223828.73478-9-luc.vanoostenryck@gmail.com (mailing list archive)
State	Mainlined, archived
Headers	show Return-Path: <linux-sparse-owner@kernel.org> From: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> To: linux-sparse@vger.kernel.org Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Subject: [PATCH 08/11] extract add_return() from linearize_return() Date: Thu, 6 Sep 2018 00:38:25 +0200 Message-Id: <20180905223828.73478-9-luc.vanoostenryck@gmail.com> In-Reply-To: <20180905223828.73478-1-luc.vanoostenryck@gmail.com> References: <20180905223828.73478-1-luc.vanoostenryck@gmail.com> Sender: linux-sparse-owner@vger.kernel.org Precedence: bulk
Series	Fix linearization of missing return \| expand [00/11] Fix linearization of missing return [01/11] add linearization as a pass [02/11] add testcases for missing return in last block [03/11] use a temp var for function's upper-level statement [04/11] topasm: top-level asm is special [05/11] specialize linearize_compound_statement() [06/11] there is always an active BB after linearize_fn_statement() [07/11] the return BB is never terminated [08/11] extract add_return() from linearize_return() [09/11] use UNDEF for missing returns [10/11] use a temp var for the return type/symbol [11/11] return nothing only in void functions

Luc Van Oostenryck Sept. 5, 2018, 10:38 p.m. UTC

This will allow to reuse this code to generate valid IR
in the case of a missing return statement.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 linearize.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

Ramsay Jones Sept. 6, 2018, midnight UTC | #1

On 05/09/18 23:38, Luc Van Oostenryck wrote:
> This will allow to reuse this code to generate valid IR
> in the case of a missing return statement.
> 
> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
> ---
>  linearize.c | 27 ++++++++++++++++-----------
>  1 file changed, 16 insertions(+), 11 deletions(-)
> 
> diff --git a/linearize.c b/linearize.c
> index d6647e443..53d66ffa0 100644
> --- a/linearize.c
> +++ b/linearize.c
> @@ -1980,6 +1980,21 @@ static pseudo_t linearize_compound_statement(struct entrypoint *ep, struct state
>  	return pseudo;
>  }
>  
> +static void add_return(struct entrypoint *ep, struct basic_block *bb, struct symbol *ctype, pseudo_t src)
> +{
> +	struct instruction *phi_node = first_instruction(bb->insns);
> +	pseudo_t phi;
> +	if (!phi_node) {
> +		phi_node = alloc_typed_instruction(OP_PHI, ctype);
> +		phi_node->target = alloc_pseudo(phi_node);
> +		phi_node->bb = bb;
> +		add_instruction(&bb->insns, phi_node);
> +	}
> +	phi = alloc_phi(ep->active, src, ctype);
> +	phi->ident = &return_ident;
> +	use_pseudo(phi_node, phi, add_pseudo(&phi_node->phi_list, phi));
> +}
> +
>  static pseudo_t linearize_fn_statement(struct entrypoint *ep, struct statement *stmt)
>  {
>  	struct instruction *phi_node;
> @@ -2157,17 +2172,7 @@ static pseudo_t linearize_return(struct entrypoint *ep, struct statement *stmt)
>  	pseudo_t src = linearize_expression(ep, expr);
>  	active = ep->active;
>  	if (active && src != VOID) {
> -		struct instruction *phi_node = first_instruction(bb_return->insns);
> -		pseudo_t phi;
> -		if (!phi_node) {
> -			phi_node = alloc_typed_instruction(OP_PHI, expr->ctype);
> -			phi_node->target = alloc_pseudo(phi_node);
> -			phi_node->bb = bb_return;
> -			add_instruction(&bb_return->insns, phi_node);
> -		}
> -		phi = alloc_phi(active, src, expr->ctype);
> -		phi->ident = &return_ident;
> -		use_pseudo(phi_node, phi, add_pseudo(&phi_node->phi_list, phi));
> +		add_return(ep, bb_return, expr->ctype, src);

It does not matter here, but as a matter of interest, why pass
the entrypoint here, rather than the active bb?

Future plans?

ATB,
Ramsay Jones

>  	}
>  	add_goto(ep, bb_return);
>  	return VOID;
>

Luc Van Oostenryck Sept. 6, 2018, 12:52 a.m. UTC | #2

On Thu, Sep 06, 2018 at 01:00:11AM +0100, Ramsay Jones wrote:
> On 05/09/18 23:38, Luc Van Oostenryck wrote:
> > This will allow to reuse this code to generate valid IR
> > in the case of a missing return statement.
> > 
> > +		add_return(ep, bb_return, expr->ctype, src);
> 
> It does not matter here, but as a matter of interest, why pass
> the entrypoint here, rather than the active bb?

Indeed, ep in itself is not needed. I didn't noticed.
It's mostly a matter of consistency: functions that will add
an instruction generaly need the entrypoint because they may
change the active BB.

> Future plans?

Hehe :)
Well ... 3 years ago I had a bit free time and a patch that
was waiting since 12 or 13 years. I said to myself "wouldn't
it be nice if I would clean it up a little bit and upstream it?"
And then when cleaning it I found a few bugs in the official
code, then there was some patches from other people that was
waiting a review since months (_Static_assert() & the constness
rework) and voilà. Now I have about 180 topics/branches that
are pending.

My short term plan is to fix all problems related to linearization
& SSA (the classic SSA is now in (my) master, I'm busy with the
current series to fix some problems with the phi-nodes from
linearization), the next step will be to make simplify_loads()
generate valid SSA. A longer term will be to write an simulator
for the IR (which won't be hard but need stuff put in place first,
like correct SSA).

I also have this series 'codegen' that can be used to generate
code for ARM, ARM64 & RISC-V (it's still crude and lack
register allocation but it's quite cool). And there is tons
of stuff to do on optimization.

There is also an interesting series to allow __bitwise
for enums (which is not active for the moment because
it impacts quite a bit of the current kernel code and there is
a few choices that need to be done).
Having good doc is also a goal ...
It would also be good to be a bit more attentive to the needs
on the kernel side.

Oh yes, it it would be good if the patch 551b85c8a
  ("sparse: add -Wpointer-arith flag to toggle sizeof(void) warnings")
could be in the official tree *long sigh*.

-- Luc

Ramsay Jones Sept. 6, 2018, 1:46 a.m. UTC | #3

On 06/09/18 01:52, Luc Van Oostenryck wrote:
> On Thu, Sep 06, 2018 at 01:00:11AM +0100, Ramsay Jones wrote:
>> On 05/09/18 23:38, Luc Van Oostenryck wrote:
>>> This will allow to reuse this code to generate valid IR
>>> in the case of a missing return statement.
>>>
>>> +		add_return(ep, bb_return, expr->ctype, src);
>>
>> It does not matter here, but as a matter of interest, why pass
>> the entrypoint here, rather than the active bb?
> 
> Indeed, ep in itself is not needed. I didn't noticed.
> It's mostly a matter of consistency: functions that will add
> an instruction generaly need the entrypoint because they may
> change the active BB.
>  
>> Future plans?
> 
> Hehe :)

;-) I was only enquiring about the add_return() function!
(Any future plans to use anything other than ep->active).

However, ...

> Well ... 3 years ago I had a bit free time and a patch that
> was waiting since 12 or 13 years. I said to myself "wouldn't
> it be nice if I would clean it up a little bit and upstream it?"
> And then when cleaning it I found a few bugs in the official
> code, then there was some patches from other people that was
> waiting a review since months (_Static_assert() & the constness
> rework) and voilà. Now I have about 180 topics/branches that
> are pending.
> 
> My short term plan is to fix all problems related to linearization
> & SSA (the classic SSA is now in (my) master, I'm busy with the
> current series to fix some problems with the phi-nodes from
> linearization), the next step will be to make simplify_loads()
> generate valid SSA. A longer term will be to write an simulator
> for the IR (which won't be hard but need stuff put in place first,
> like correct SSA).
> 
> I also have this series 'codegen' that can be used to generate
> code for ARM, ARM64 & RISC-V (it's still crude and lack
> register allocation but it's quite cool). And there is tons
> of stuff to do on optimization.
> 
> There is also an interesting series to allow __bitwise
> for enums (which is not active for the moment because
> it impacts quite a bit of the current kernel code and there is
> a few choices that need to be done).
> Having good doc is also a goal ...
> It would also be good to be a bit more attentive to the needs
> on the kernel side.

... good to know.

> Oh yes, it it would be good if the patch 551b85c8a
>   ("sparse: add -Wpointer-arith flag to toggle sizeof(void) warnings")
> could be in the official tree *long sigh*.

:-D

ATB,
Ramsay Jones

Luc Van Oostenryck Sept. 6, 2018, 8:51 p.m. UTC | #4

On Thu, Sep 06, 2018 at 02:46:28AM +0100, Ramsay Jones wrote:
> 
> 
> On 06/09/18 01:52, Luc Van Oostenryck wrote:
> > On Thu, Sep 06, 2018 at 01:00:11AM +0100, Ramsay Jones wrote:
> >> On 05/09/18 23:38, Luc Van Oostenryck wrote:
> >>> This will allow to reuse this code to generate valid IR
> >>> in the case of a missing return statement.
> >>>
> >>> +		add_return(ep, bb_return, expr->ctype, src);
> >>
> >> It does not matter here, but as a matter of interest, why pass
> >> the entrypoint here, rather than the active bb?
> > 
> > Indeed, ep in itself is not needed. I didn't noticed.
> > It's mostly a matter of consistency: functions that will add
> > an instruction generaly need the entrypoint because they may
> > change the active BB.
> >  
> >> Future plans?
> > 
> > Hehe :)
> 
> ;-) I was only enquiring about the add_return() function!
> (Any future plans to use anything other than ep->active).

I sort of supposed it was but it was more interesting so ;)

To answer your real question: no, not that I can imagine.
Unlike other parts, the linearization is essentially complete:
* C won't have new language constructions
* if new instructions will be added they won't impact linearization
  (I'm thinking about things like OP_ROL/OP_ROR or OP_BSWAP8/16/32).
  There is one expection though, it's for varags.
I can't see other uses for add_return().
I prefer to leave 'ep' as argument since, to me, seeing this as
the first argument means "this function need the current context/
will add instructions to the current BB", while seeing an explicit
BB means "this function will add instructions to an arbitrary BB".

-- Luc

Ramsay Jones Sept. 6, 2018, 9:09 p.m. UTC | #5

On 06/09/18 21:51, Luc Van Oostenryck wrote:
> On Thu, Sep 06, 2018 at 02:46:28AM +0100, Ramsay Jones wrote:
>>
>>
>> On 06/09/18 01:52, Luc Van Oostenryck wrote:
>>> On Thu, Sep 06, 2018 at 01:00:11AM +0100, Ramsay Jones wrote:
>>>> On 05/09/18 23:38, Luc Van Oostenryck wrote:
>>>>> This will allow to reuse this code to generate valid IR
>>>>> in the case of a missing return statement.
>>>>>
>>>>> +		add_return(ep, bb_return, expr->ctype, src);
>>>>
>>>> It does not matter here, but as a matter of interest, why pass
>>>> the entrypoint here, rather than the active bb?
>>>
>>> Indeed, ep in itself is not needed. I didn't noticed.
>>> It's mostly a matter of consistency: functions that will add
>>> an instruction generaly need the entrypoint because they may
>>> change the active BB.
>>>  
>>>> Future plans?
>>>
>>> Hehe :)
>>
>> ;-) I was only enquiring about the add_return() function!
>> (Any future plans to use anything other than ep->active).
> 
> I sort of supposed it was but it was more interesting so ;)

Indeed!

> To answer your real question: no, not that I can imagine.
> Unlike other parts, the linearization is essentially complete:
> * C won't have new language constructions
> * if new instructions will be added they won't impact linearization
>   (I'm thinking about things like OP_ROL/OP_ROR or OP_BSWAP8/16/32).
>   There is one expection though, it's for varags.
> I can't see other uses for add_return().
> I prefer to leave 'ep' as argument since, to me, seeing this as
> the first argument means "this function need the current context/
> will add instructions to the current BB", while seeing an explicit
> BB means "this function will add instructions to an arbitrary BB".

Yep, I suspected it would be something like that. Sounds good
to me ... :-D

ATB,
Ramsay Jones

[08/11] extract add_return() from linearize_return()

Commit Message

Comments

Patch