diff mbox series

[08/11] extract add_return() from linearize_return()

Message ID 20180905223828.73478-9-luc.vanoostenryck@gmail.com (mailing list archive)
State Mainlined, archived
Headers show
Series Fix linearization of missing return | expand

Commit Message

Luc Van Oostenryck Sept. 5, 2018, 10:38 p.m. UTC
This will allow to reuse this code to generate valid IR
in the case of a missing return statement.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 linearize.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

Comments

Ramsay Jones Sept. 6, 2018, midnight UTC | #1
On 05/09/18 23:38, Luc Van Oostenryck wrote:
> This will allow to reuse this code to generate valid IR
> in the case of a missing return statement.
> 
> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
> ---
>  linearize.c | 27 ++++++++++++++++-----------
>  1 file changed, 16 insertions(+), 11 deletions(-)
> 
> diff --git a/linearize.c b/linearize.c
> index d6647e443..53d66ffa0 100644
> --- a/linearize.c
> +++ b/linearize.c
> @@ -1980,6 +1980,21 @@ static pseudo_t linearize_compound_statement(struct entrypoint *ep, struct state
>  	return pseudo;
>  }
>  
> +static void add_return(struct entrypoint *ep, struct basic_block *bb, struct symbol *ctype, pseudo_t src)
> +{
> +	struct instruction *phi_node = first_instruction(bb->insns);
> +	pseudo_t phi;
> +	if (!phi_node) {
> +		phi_node = alloc_typed_instruction(OP_PHI, ctype);
> +		phi_node->target = alloc_pseudo(phi_node);
> +		phi_node->bb = bb;
> +		add_instruction(&bb->insns, phi_node);
> +	}
> +	phi = alloc_phi(ep->active, src, ctype);
> +	phi->ident = &return_ident;
> +	use_pseudo(phi_node, phi, add_pseudo(&phi_node->phi_list, phi));
> +}
> +
>  static pseudo_t linearize_fn_statement(struct entrypoint *ep, struct statement *stmt)
>  {
>  	struct instruction *phi_node;
> @@ -2157,17 +2172,7 @@ static pseudo_t linearize_return(struct entrypoint *ep, struct statement *stmt)
>  	pseudo_t src = linearize_expression(ep, expr);
>  	active = ep->active;
>  	if (active && src != VOID) {
> -		struct instruction *phi_node = first_instruction(bb_return->insns);
> -		pseudo_t phi;
> -		if (!phi_node) {
> -			phi_node = alloc_typed_instruction(OP_PHI, expr->ctype);
> -			phi_node->target = alloc_pseudo(phi_node);
> -			phi_node->bb = bb_return;
> -			add_instruction(&bb_return->insns, phi_node);
> -		}
> -		phi = alloc_phi(active, src, expr->ctype);
> -		phi->ident = &return_ident;
> -		use_pseudo(phi_node, phi, add_pseudo(&phi_node->phi_list, phi));
> +		add_return(ep, bb_return, expr->ctype, src);

It does not matter here, but as a matter of interest, why pass
the entrypoint here, rather than the active bb?

Future plans?

ATB,
Ramsay Jones

>  	}
>  	add_goto(ep, bb_return);
>  	return VOID;
>
Luc Van Oostenryck Sept. 6, 2018, 12:52 a.m. UTC | #2
On Thu, Sep 06, 2018 at 01:00:11AM +0100, Ramsay Jones wrote:
> On 05/09/18 23:38, Luc Van Oostenryck wrote:
> > This will allow to reuse this code to generate valid IR
> > in the case of a missing return statement.
> > 
> > +		add_return(ep, bb_return, expr->ctype, src);
> 
> It does not matter here, but as a matter of interest, why pass
> the entrypoint here, rather than the active bb?

Indeed, ep in itself is not needed. I didn't noticed.
It's mostly a matter of consistency: functions that will add
an instruction generaly need the entrypoint because they may
change the active BB.
 
> Future plans?

Hehe :)
Well ... 3 years ago I had a bit free time and a patch that
was waiting since 12 or 13 years. I said to myself "wouldn't
it be nice if I would clean it up a little bit and upstream it?"
And then when cleaning it I found a few bugs in the official
code, then there was some patches from other people that was
waiting a review since months (_Static_assert() & the constness
rework) and voilĂ . Now I have about 180 topics/branches that
are pending.

My short term plan is to fix all problems related to linearization
& SSA (the classic SSA is now in (my) master, I'm busy with the
current series to fix some problems with the phi-nodes from
linearization), the next step will be to make simplify_loads()
generate valid SSA. A longer term will be to write an simulator
for the IR (which won't be hard but need stuff put in place first,
like correct SSA).

I also have this series 'codegen' that can be used to generate
code for ARM, ARM64 & RISC-V (it's still crude and lack
register allocation but it's quite cool). And there is tons
of stuff to do on optimization.

There is also an interesting series to allow __bitwise
for enums (which is not active for the moment because
it impacts quite a bit of the current kernel code and there is
a few choices that need to be done).
Having good doc is also a goal ...
It would also be good to be a bit more attentive to the needs
on the kernel side.

Oh yes, it it would be good if the patch 551b85c8a
  ("sparse: add -Wpointer-arith flag to toggle sizeof(void) warnings")
could be in the official tree *long sigh*.

-- Luc
Ramsay Jones Sept. 6, 2018, 1:46 a.m. UTC | #3
On 06/09/18 01:52, Luc Van Oostenryck wrote:
> On Thu, Sep 06, 2018 at 01:00:11AM +0100, Ramsay Jones wrote:
>> On 05/09/18 23:38, Luc Van Oostenryck wrote:
>>> This will allow to reuse this code to generate valid IR
>>> in the case of a missing return statement.
>>>
>>> +		add_return(ep, bb_return, expr->ctype, src);
>>
>> It does not matter here, but as a matter of interest, why pass
>> the entrypoint here, rather than the active bb?
> 
> Indeed, ep in itself is not needed. I didn't noticed.
> It's mostly a matter of consistency: functions that will add
> an instruction generaly need the entrypoint because they may
> change the active BB.
>  
>> Future plans?
> 
> Hehe :)

;-) I was only enquiring about the add_return() function!
(Any future plans to use anything other than ep->active).

However, ...

> Well ... 3 years ago I had a bit free time and a patch that
> was waiting since 12 or 13 years. I said to myself "wouldn't
> it be nice if I would clean it up a little bit and upstream it?"
> And then when cleaning it I found a few bugs in the official
> code, then there was some patches from other people that was
> waiting a review since months (_Static_assert() & the constness
> rework) and voilĂ . Now I have about 180 topics/branches that
> are pending.
> 
> My short term plan is to fix all problems related to linearization
> & SSA (the classic SSA is now in (my) master, I'm busy with the
> current series to fix some problems with the phi-nodes from
> linearization), the next step will be to make simplify_loads()
> generate valid SSA. A longer term will be to write an simulator
> for the IR (which won't be hard but need stuff put in place first,
> like correct SSA).
> 
> I also have this series 'codegen' that can be used to generate
> code for ARM, ARM64 & RISC-V (it's still crude and lack
> register allocation but it's quite cool). And there is tons
> of stuff to do on optimization.
> 
> There is also an interesting series to allow __bitwise
> for enums (which is not active for the moment because
> it impacts quite a bit of the current kernel code and there is
> a few choices that need to be done).
> Having good doc is also a goal ...
> It would also be good to be a bit more attentive to the needs
> on the kernel side.

... good to know.

> Oh yes, it it would be good if the patch 551b85c8a
>   ("sparse: add -Wpointer-arith flag to toggle sizeof(void) warnings")
> could be in the official tree *long sigh*.

:-D

ATB,
Ramsay Jones
Luc Van Oostenryck Sept. 6, 2018, 8:51 p.m. UTC | #4
On Thu, Sep 06, 2018 at 02:46:28AM +0100, Ramsay Jones wrote:
> 
> 
> On 06/09/18 01:52, Luc Van Oostenryck wrote:
> > On Thu, Sep 06, 2018 at 01:00:11AM +0100, Ramsay Jones wrote:
> >> On 05/09/18 23:38, Luc Van Oostenryck wrote:
> >>> This will allow to reuse this code to generate valid IR
> >>> in the case of a missing return statement.
> >>>
> >>> +		add_return(ep, bb_return, expr->ctype, src);
> >>
> >> It does not matter here, but as a matter of interest, why pass
> >> the entrypoint here, rather than the active bb?
> > 
> > Indeed, ep in itself is not needed. I didn't noticed.
> > It's mostly a matter of consistency: functions that will add
> > an instruction generaly need the entrypoint because they may
> > change the active BB.
> >  
> >> Future plans?
> > 
> > Hehe :)
> 
> ;-) I was only enquiring about the add_return() function!
> (Any future plans to use anything other than ep->active).

I sort of supposed it was but it was more interesting so ;)

To answer your real question: no, not that I can imagine.
Unlike other parts, the linearization is essentially complete:
* C won't have new language constructions
* if new instructions will be added they won't impact linearization
  (I'm thinking about things like OP_ROL/OP_ROR or OP_BSWAP8/16/32).
  There is one expection though, it's for varags.
I can't see other uses for add_return().
I prefer to leave 'ep' as argument since, to me, seeing this as
the first argument means "this function need the current context/
will add instructions to the current BB", while seeing an explicit
BB means "this function will add instructions to an arbitrary BB".

-- Luc
Ramsay Jones Sept. 6, 2018, 9:09 p.m. UTC | #5
On 06/09/18 21:51, Luc Van Oostenryck wrote:
> On Thu, Sep 06, 2018 at 02:46:28AM +0100, Ramsay Jones wrote:
>>
>>
>> On 06/09/18 01:52, Luc Van Oostenryck wrote:
>>> On Thu, Sep 06, 2018 at 01:00:11AM +0100, Ramsay Jones wrote:
>>>> On 05/09/18 23:38, Luc Van Oostenryck wrote:
>>>>> This will allow to reuse this code to generate valid IR
>>>>> in the case of a missing return statement.
>>>>>
>>>>> +		add_return(ep, bb_return, expr->ctype, src);
>>>>
>>>> It does not matter here, but as a matter of interest, why pass
>>>> the entrypoint here, rather than the active bb?
>>>
>>> Indeed, ep in itself is not needed. I didn't noticed.
>>> It's mostly a matter of consistency: functions that will add
>>> an instruction generaly need the entrypoint because they may
>>> change the active BB.
>>>  
>>>> Future plans?
>>>
>>> Hehe :)
>>
>> ;-) I was only enquiring about the add_return() function!
>> (Any future plans to use anything other than ep->active).
> 
> I sort of supposed it was but it was more interesting so ;)

Indeed!

> To answer your real question: no, not that I can imagine.
> Unlike other parts, the linearization is essentially complete:
> * C won't have new language constructions
> * if new instructions will be added they won't impact linearization
>   (I'm thinking about things like OP_ROL/OP_ROR or OP_BSWAP8/16/32).
>   There is one expection though, it's for varags.
> I can't see other uses for add_return().
> I prefer to leave 'ep' as argument since, to me, seeing this as
> the first argument means "this function need the current context/
> will add instructions to the current BB", while seeing an explicit
> BB means "this function will add instructions to an arbitrary BB".

Yep, I suspected it would be something like that. Sounds good
to me ... :-D

ATB,
Ramsay Jones
diff mbox series

Patch

diff --git a/linearize.c b/linearize.c
index d6647e443..53d66ffa0 100644
--- a/linearize.c
+++ b/linearize.c
@@ -1980,6 +1980,21 @@  static pseudo_t linearize_compound_statement(struct entrypoint *ep, struct state
 	return pseudo;
 }
 
+static void add_return(struct entrypoint *ep, struct basic_block *bb, struct symbol *ctype, pseudo_t src)
+{
+	struct instruction *phi_node = first_instruction(bb->insns);
+	pseudo_t phi;
+	if (!phi_node) {
+		phi_node = alloc_typed_instruction(OP_PHI, ctype);
+		phi_node->target = alloc_pseudo(phi_node);
+		phi_node->bb = bb;
+		add_instruction(&bb->insns, phi_node);
+	}
+	phi = alloc_phi(ep->active, src, ctype);
+	phi->ident = &return_ident;
+	use_pseudo(phi_node, phi, add_pseudo(&phi_node->phi_list, phi));
+}
+
 static pseudo_t linearize_fn_statement(struct entrypoint *ep, struct statement *stmt)
 {
 	struct instruction *phi_node;
@@ -2157,17 +2172,7 @@  static pseudo_t linearize_return(struct entrypoint *ep, struct statement *stmt)
 	pseudo_t src = linearize_expression(ep, expr);
 	active = ep->active;
 	if (active && src != VOID) {
-		struct instruction *phi_node = first_instruction(bb_return->insns);
-		pseudo_t phi;
-		if (!phi_node) {
-			phi_node = alloc_typed_instruction(OP_PHI, expr->ctype);
-			phi_node->target = alloc_pseudo(phi_node);
-			phi_node->bb = bb_return;
-			add_instruction(&bb_return->insns, phi_node);
-		}
-		phi = alloc_phi(active, src, expr->ctype);
-		phi->ident = &return_ident;
-		use_pseudo(phi_node, phi, add_pseudo(&phi_node->phi_list, phi));
+		add_return(ep, bb_return, expr->ctype, src);
 	}
 	add_goto(ep, bb_return);
 	return VOID;