[4/4] CSE: improve hashing of non-commutative binops

Message ID	20170216231231.5781-5-luc.vanoostenryck@gmail.com (mailing list archive)
State	Rejected, archived
Headers

From: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
To: linux-sparse@vger.kernel.org
Cc: Christopher Li <sparse@chrisli.org>,
	Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Subject: [PATCH 4/4] CSE: improve hashing of non-commutative binops
Date: Fri, 17 Feb 2017 00:12:31 +0100
Message-Id: <20170216231231.5781-5-luc.vanoostenryck@gmail.com>
In-Reply-To: <20170216231231.5781-1-luc.vanoostenryck@gmail.com>
References: <20170216231231.5781-1-luc.vanoostenryck@gmail.com>
Sender: linux-sparse-owner@vger.kernel.org
Precedence: bulk

Commit Message

Luc Van Oostenryck Feb. 16, 2017, 11:12 p.m. UTC

During CSE equivalent instructions should hash to the same value
but we should also try to *not* hash to the same value instructions
that cannot be equivalent.
For commutative ops this means that the hash function should itself
be commutative/symmetrical regarding the exchange of its operands.
This is already the case but the current hash function is symmetrical
for all binops, not only the commutative ones. Thus expressions like
'a - b' and 'b - a' hash to the same value while it should be the case
only when 'a == b'.

Fix this by changing the hashing of non-commutative binops so that it
is anti-symmetrical regarding the exchange of operands while keeping
commutative ones symmetrical.

This change has no functional effect (in the sense that it should
CSE exactly the same instructions as before); it should only improve
the efficiency of the hashing+comparing.

Note: on the 5000+ test set I'm using, I can't see any significant
  speedup which is quite normal since most of the functions therein
  have (much) fewer instructions than the size of the hash table.
  The effect of this patch should only be on much bigger functions.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
---
 cse.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)
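
The symmetric and anti-symmetric schemes described in the commit message can be sketched as follows. This is only an illustration, not the code from cse.c: the function names, the `seed` parameter (standing in for the part of the hash derived from the opcode and size) and the identity `hashval()` stand-in are all assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical stand-in for hashval(): the operand id is its own hash. */
static uint32_t hashval(uint32_t operand)
{
    return operand;
}

/* Symmetric combination: swapping src1 and src2 never changes the result. */
static uint32_t hash_symmetric(uint32_t seed, uint32_t src1, uint32_t src2)
{
    uint32_t hash = seed;

    hash += hashval(src2);
    hash += hashval(src1);
    return hash;
}

/* Anti-symmetric combination: swapping the operands changes the result
 * unless both operands hash to the same value (modulo wrap-around). */
static uint32_t hash_antisymmetric(uint32_t seed, uint32_t src1, uint32_t src2)
{
    uint32_t hash = seed;

    hash -= hashval(src2);
    hash += hashval(src1);
    return hash;
}
```

With a seed of 40, `hash_symmetric(40, 7, 3)` and `hash_symmetric(40, 3, 7)` both give 50, so 'a - b' and 'b - a' would land in the same bucket, whereas `hash_antisymmetric` gives 44 and 36 respectively.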

Comments

Christopher Li Feb. 23, 2017, 12:39 p.m. UTC | #1

On Fri, Feb 17, 2017 at 7:12 AM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> During CSE equivalent instructions should hash to the same value
> but we should also try to *not* hash to the same value instructions
> that cannot be equivalent.
> For commutative ops this means that the hash function should itself
> be commutative/symmetrical regarding the exchange of its operands.
> This is already the case but the current hash function is symmetrical
> for all binops, not only the commutative ones. Thus expressions like
> 'a - b' and 'b - a' hash to the same value while it should be the case
> only when 'a == b'.
>
> Fix this by changing the hashing of non-commutative binops so that it
> is anti-symmetrical regarding the exchange of operands while keeping
> commutative ones symmetrical.
>
> This change has no functional effect (in the sense that it should
> CSE exactly the same instructions as before); it should only improve
> the efficiency of the hashing+comparing.
>
> Note: on the 5000+ test set I'm using, I can't see any significant
>   speedup which is quite normal since most of the functions therein
>   have (much) fewer instructions than the size of the hash table.
>   The effect of this patch should only be on much bigger functions.

I applied and pushed this series to sparse-next.

Just curious whether there is a test case that can show the performance
increase. As far as I can tell, this patch has a downside as well: it
hashes the non-commutative operations *twice*. If in most cases this
is actually the higher cost, we might want to keep the old behavior.

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Luc Van Oostenryck Feb. 23, 2017, 2:13 p.m. UTC | #2

On Thu, Feb 23, 2017 at 08:39:17PM +0800, Christopher Li wrote:
> On Fri, Feb 17, 2017 at 7:12 AM, Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> > During CSE equivalent instructions should hash to the same value
> > but we should also try to *not* hash to the same value instructions
> > that cannot be equivalent.
> > For commutative ops this means that the hash function should itself
> > be commutative/symmetrical regarding the exchange of its operands.
> > This is already the case but the current hash function is symmetrical
> > for all binops, not only the commutative ones. Thus expressions like
> > 'a - b' and 'b - a' hash to the same value while it should be the case
> > only when 'a == b'.
> >
> > Fix this by changing the hashing of non-commutative binops so that it
> > is anti-symmetrical regarding the exchange of operands while keeping
> > commutative ones symmetrical.
> >
> > This change has no functional effect (in the sense that it should
> > CSE exactly the same instructions as before); it should only improve
> > the efficiency of the hashing+comparing.
> >
> > Note: on the 5000+ test set I'm using, I can't see any significant
> >   speedup which is quite normal since most of the functions therein
> >   have (much) fewer instructions than the size of the hash table.
> >   The effect of this patch should only be on much bigger functions.
> 
> I applied and pushed this series to sparse-next.
I don't see them yet, but it's not important.

> Just curious whether there is a test case that can show the performance increase.
I don't have such a test case. To see any effect you would need a
function whose number of instructions is quite large relative to
the size of the hash table. It's even worse than that: it's the
number of pairs of dual instructions (like 'a - b' / 'b - a') that
should be large (and to see a significant relative difference,
those pairs should also be a significant proportion of the total
number of instructions).

If needed I could try to build some artificial cases to determine
when a difference appears but ... (see below).

> As far as I can tell, this patch has a downside as well: it hashes the
> non-commutative operations *twice*.
Sure, it has some downside, but it's not true that they are now
hashed twice. The number of operations is exactly the same as for
the commutative ones: each of src1 & src2 is 'hashed' together with
the opcode and the instruction size (and hashval() is
essentially a no-op, not a costly thing anyway). The difference
between the two cases is that for commutative instructions src2
is hashed first and then there is a fall-through to the unary ops
to add src1's hash.
The downside is only a slight increase in code complexity.

> If in most cases this is actually
> the higher cost, we might want to keep the old behavior.
I can redo my tests and give the numbers, but the measurements
I made didn't show any significant difference:
no speedup but also no slowdown.


That said, I completely agree that there is not much justification
for this patch. The only reason I made it was that when I modified
the code for the commutative case (which doesn't give a speedup but
eliminates more common expressions) I took a look at the non-commutative
situation and said to myself something like: "Oh, they are hashed
symmetrically, like the commutative ones, that's not very logical".

So, I have absolutely no problem if this patch is dropped.
(Much more interesting is the patch following this one in the series:
 "[RFC] CSE: relax type checking in hashing/compare").

Luc
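
The operation count discussed in this reply can be checked with a small sketch that mirrors the shape of the patched switch. It is an illustration under assumptions, not the real cse.c: the three-opcode enum, the opcode-only seed, the `hash_insn()` helper and the call counter are all hypothetical, and `hashval()` is modelled as an identity function.

```c
#include <stdint.h>

enum opcode { OP_NEG, OP_ADD, OP_SUB };  /* small subset, for illustration */

static unsigned hashval_calls;           /* counts operand hashes performed */

/* Hypothetical stand-in for hashval(): identity, plus call counting. */
static uint32_t hashval(uint32_t operand)
{
    hashval_calls++;
    return operand;
}

static uint32_t hash_insn(enum opcode op, uint32_t src1, uint32_t src2)
{
    uint32_t hash = (uint32_t)op << 3;   /* simplified seed (assumption) */

    switch (op) {
    case OP_SUB:                         /* non-commutative: anti-symmetric */
        hash -= hashval(src2);
        hash += hashval(src1);
        break;
    case OP_ADD:                         /* commutative: symmetric */
        hash += hashval(src2);
        /* fall through */
    case OP_NEG:                         /* unary */
        hash += hashval(src1);
        break;
    }
    return hash;
}
```

In this sketch both `OP_ADD` and `OP_SUB` hash each operand exactly once (two `hashval()` calls per instruction); only the control flow differs, since the commutative case reaches the second call through the fall-through.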

Luc Van Oostenryck Feb. 23, 2017, 3:03 p.m. UTC | #3

On Thu, Feb 23, 2017 at 03:13:14PM +0100, Luc Van Oostenryck wrote:
> 
> So, I have absolutely no problems if this patch is dropped.

After some more thought I'm convinced that this patch was
a bad idea, please drop it.

The rationale is that even if there are a lot of pairs of
dual instructions ('Xi - Yi' / 'Yi - Xi'), their hashes
will still be spread relatively evenly because there
will be a lot of Xi & Yi, but with the patch *every*
instruction like 'Xi - Xi' will give the same hash
independently of Xi.

Luc
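
The degenerate 'Xi - Xi' case described above can be made concrete with a small sketch. Everything here is an assumption chosen for illustration: the table size of 256, the masking used to pick a bucket, the helper names, and the use of small consecutive integers as operand hashes.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 256u  /* assumed power-of-two table size */

/* Bucket chosen when both operands are added (symmetric scheme). */
static uint32_t bucket_symmetric(uint32_t seed, uint32_t src1, uint32_t src2)
{
    return (seed + src2 + src1) & (TABLE_SIZE - 1);
}

/* Bucket chosen when src2 is subtracted (anti-symmetric scheme). */
static uint32_t bucket_antisymmetric(uint32_t seed, uint32_t src1, uint32_t src2)
{
    return (seed - src2 + src1) & (TABLE_SIZE - 1);
}

/* Count the distinct buckets used by the instructions 'x - x', x = 0..n-1. */
static size_t distinct_buckets(uint32_t (*bucket)(uint32_t, uint32_t, uint32_t),
                               uint32_t seed, uint32_t n)
{
    unsigned char used[TABLE_SIZE] = {0};
    size_t count = 0;

    for (uint32_t x = 0; x < n; x++) {
        uint32_t b = bucket(seed, x, x);

        if (!used[b]) {
            used[b] = 1;
            count++;
        }
    }
    return count;
}
```

For 100 instructions of the form 'x - x' and a seed of 40, the symmetric scheme spreads them over 100 different buckets in this sketch, while the anti-symmetric scheme puts all of them in a single bucket, because the two operand terms cancel and only the seed remains.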

Christopher Li Feb. 27, 2017, 8:03 a.m. UTC | #4

On Thu, Feb 23, 2017 at 11:03 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> On Thu, Feb 23, 2017 at 03:13:14PM +0100, Luc Van Oostenryck wrote:
>>
>> So, I have absolutely no problems if this patch is dropped.
>
> After some more thoughts I'm convinced that this patch was
> a bad idea, please drop it.

OK. Patch dropped in the current sparse-next.

Chris

diff --git a/cse.c b/cse.c
index 0d3815c5a..89812afae 100644
--- a/cse.c
+++ b/cse.c
@@ -51,28 +51,30 @@  static void clean_up_one_instruction(struct basic_block *bb, struct instruction
 		hash += hashval(insn->src3);
 		/* Fall through */	
 
-	/* Binary arithmetic */
-	case OP_ADD: case OP_SUB:
-	case OP_MULU: case OP_MULS:
+	/* non-commutative binops */
+	case OP_SUB:
 	case OP_DIVU: case OP_DIVS:
 	case OP_MODU: case OP_MODS:
 	case OP_SHL:
 	case OP_LSR: case OP_ASR:
-	case OP_AND: case OP_OR:
-
-	/* Binary logical */
-	case OP_XOR: case OP_AND_BOOL:
-	case OP_OR_BOOL:
-
-	/* Binary comparison */
-	case OP_SET_EQ: case OP_SET_NE:
 	case OP_SET_LE: case OP_SET_GE:
 	case OP_SET_LT: case OP_SET_GT:
 	case OP_SET_B:  case OP_SET_A:
 	case OP_SET_BE: case OP_SET_AE:
+		hash -= hashval(insn->src2);
+		hash += hashval(insn->src1);
+		break;
+
+	/* commutative binops */
+	case OP_SET_EQ: case OP_SET_NE:
+	case OP_ADD:
+	case OP_MULU: case OP_MULS:
+	case OP_AND_BOOL: case OP_OR_BOOL:
+	case OP_AND: case OP_OR:
+	case OP_XOR:
 		hash += hashval(insn->src2);
 		/* Fall through */
-	
+
 	/* Unary */
 	case OP_NOT: case OP_NEG:
 		hash += hashval(insn->src1);