[RISU,v3,04/18] risugen_x86_constraints: add module

Message ID: 20190711223300.6061-5-jan.bobek@gmail.com (mailing list archive)
State: New, archived
Series: Support for generating x86 SIMD test images

Commit Message

Jan Bobek July 11, 2019, 10:32 p.m. UTC
The module risugen_x86_constraints.pm provides an environment for
evaluating x86 "!constraints" blocks. Its only export is the
function eval_constraints_block.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 risugen_x86_constraints.pm | 154 +++++++++++++++++++++++++++++++++++++
 1 file changed, 154 insertions(+)
 create mode 100644 risugen_x86_constraints.pm

Comments

Richard Henderson July 12, 2019, 2:24 p.m. UTC | #1
On 7/12/19 12:32 AM, Jan Bobek wrote:
> +sub vex($%)
> +{
> +    my ($insn, %vex) = @_;
> +    my $regidw = $is_x86_64 ? 4 : 3;
> +
> +    # There is no point in randomizing other VEX fields, since
> +    # VEX.R/.X/.B are encoded automatically by risugen_x86_asm, and
> +    # VEX.M/.P are opcodes.
> +    $vex{l} = randint(width => 1) ? 256 : 128 unless defined $vex{l};

VEX.L is sort-of opcode-like as well.  It certainly differentiates AVX1 vs
AVX2, and so probably should be constrained somehow.  I can't think of what's
the best way to do that at the moment, since our existing --xstate=foo isn't right.

Perhaps just a FIXME comment for now?

> +sub modrm_($%)
> +{
> +    my ($insn, %args) = @_;
> +    my $regidw = $is_x86_64 ? 4 : 3;
> +
> +    my %modrm = ();
> +    if (defined $args{reg}) {
> +        # This makes the config file syntax a bit more accommodating
> +        # in cases where MODRM.REG is an opcode extension field.
> +        $modrm{reg} = $args{reg};
> +    } else {
> +        $modrm{reg} = randint(width => $regidw);
> +    }
> +
> +    # There is also a displacement-only form, but we don't know
> +    # absolute address of the memblock, so we cannot test it.

32-bit mode has displacement-only, aka absolute; 64-bit replaces that with
rip-relative.  But agreed that the first is impossible to test and the second
is difficult.
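
To make the point above concrete, here is a small sketch of the encoding in question: ModRM with mod=00 and rm=101 selects the displacement-only form, which is an absolute disp32 in 32-bit mode and RIP-relative in 64-bit mode. The helper names modrm_byte and disp_only_meaning are made up for illustration and are not part of risugen.

```perl
use strict;
use warnings;

# Pack the three ModRM fields into one byte:
# mod (2 bits), reg (3 bits), rm (3 bits).
sub modrm_byte {
    my ($mod, $reg, $rm) = @_;
    return (($mod & 3) << 6) | (($reg & 7) << 3) | ($rm & 7);
}

# Hypothetical helper: what mod=00, rm=101 addresses in each mode.
sub disp_only_meaning {
    my ($is_x86_64) = @_;
    return $is_x86_64 ? 'rip+disp32' : 'disp32';
}

# mod=00, reg=0, rm=101 packs to the byte 0x05.
my $byte = modrm_byte(0b00, 0, 0b101);
printf "ModRM = 0x%02x, 32-bit: [%s], 64-bit: [%s]\n",
    $byte, disp_only_meaning(0), disp_only_meaning(1);
```

Neither form can be pointed at the memblock without knowing its address (or its distance from the instruction) at generation time, which is why modrm_ skips it.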

> +sub modrm($%)
> +{
> +    my ($insn, %args) = @_;
> +    modrm_($insn, indexk => 'index', %args);
> +}

How are you avoiding %rsp as index?
I saw you die for that in the previous patch...


r~
Jan Bobek July 14, 2019, 10:39 p.m. UTC | #2
On 7/12/19 10:24 AM, Richard Henderson wrote:
> On 7/12/19 12:32 AM, Jan Bobek wrote:
>> +sub vex($%)
>> +{
>> +    my ($insn, %vex) = @_;
>> +    my $regidw = $is_x86_64 ? 4 : 3;
>> +
>> +    # There is no point in randomizing other VEX fields, since
>> +    # VEX.R/.X/.B are encoded automatically by risugen_x86_asm, and
>> +    # VEX.M/.P are opcodes.
>> +    $vex{l} = randint(width => 1) ? 256 : 128 unless defined $vex{l};
> 
> VEX.L is sort-of opcode-like as well.  It certainly differentiates AVX1 vs
> AVX2, and so probably should be constrained somehow.  I can't think of what's
> the best way to do that at the moment, since our existing --xstate=foo isn't right.
> 
> Perhaps just a FIXME comment for now?

So, the instructions that use VEX.L specify it in the !constraints
block in the config file. Originally, I thought some instructions are
supposed to ignore it (denoted by LIG in the Intel manual -- it's the
scalar instructions like ADDSS), so it might be worth randomizing.
However, when I later read the manual pages of some of these
instructions, it said they are supposed to be encoded with VEX.L=0
anyway. I didn't check every single one of them, but right now they
are all encoded with VEX.L=0, so I suppose this line can be removed
and we can rely on the caller (the !constraints block) to always
specify it.
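
A minimal sketch of what vex() might look like with the randomization of VEX.L dropped. The die on a missing l is an assumption added here to make the caller's obligation explicit; it is not something the patch does. randint below is a simplified stand-in for the one in risugen_common.

```perl
use strict;
use warnings;

my $is_x86_64 = 1;

# Simplified stand-in for risugen_common's randint (unsigned only).
sub randint {
    my (%args) = @_;
    return int(rand(2 ** $args{width}));
}

sub vex {
    my ($insn, %vex) = @_;
    my $regidw = $is_x86_64 ? 4 : 3;

    # Assumption: refuse to guess VEX.L; the !constraints block must set it.
    die "vex: VEX.L must be given by the !constraints block\n"
        unless defined $vex{l};
    $vex{v} = randint(width => $regidw) unless defined $vex{v};
    $vex{w} = randint(width => 1)       unless defined $vex{w};
    $insn->{vex} = \%vex;
}

my %insn;
vex(\%insn, l => 128, p => 0x66);
printf "l=%d v=%d w=%d\n", @{$insn{vex}}{qw(l v w)};
```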

>> +sub modrm_($%)
>> +{
>> +    my ($insn, %args) = @_;
>> +    my $regidw = $is_x86_64 ? 4 : 3;
>> +
>> +    my %modrm = ();
>> +    if (defined $args{reg}) {
>> +        # This makes the config file syntax a bit more accommodating
>> +        # in cases where MODRM.REG is an opcode extension field.
>> +        $modrm{reg} = $args{reg};
>> +    } else {
>> +        $modrm{reg} = randint(width => $regidw);
>> +    }
>> +
>> +    # There is also a displacement-only form, but we don't know
>> +    # absolute address of the memblock, so we cannot test it.
> 
> 32-bit mode has displacement-only, aka absolute; 64-bit replaces that with
> rip-relative.  But agreed that the first is impossible to test and the second
> is difficult.
> 
>> +sub modrm($%)
>> +{
>> +    my ($insn, %args) = @_;
>> +    modrm_($insn, indexk => 'index', %args);
>> +}
> 
> How are you avoiding %rsp as index?
> I saw you die for that in the previous patch...

See write_mem_getoffset in risugen_x86.pm. I felt there's a better
place for it there, since that's when we actually need to write to it,
so the problem is more exposed.
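
For readers without risugen_x86.pm at hand: write_mem_getoffset is not reproduced here. As an illustration only, one alternative to dying late would be to resample at generation time, since a SIB index field of 100b with REX.X clear means "no index" rather than %rsp. pick_index is a hypothetical name, and this is not what the patch does.

```perl
use strict;
use warnings;

# Hypothetical helper: draw a general-purpose index register number,
# resampling whenever we hit 4 (%rsp / %esp), which SIB cannot use as
# an index. This restriction does not apply to VSIB vector indices.
sub pick_index {
    my ($regidw) = @_;
    my $index;
    do {
        $index = int(rand(2 ** $regidw));
    } while ($index == 4);
    return $index;
}

print "index = ", pick_index(4), "\n";
```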

-Jan

> 
> r~
>
Richard Henderson July 21, 2019, 1:54 a.m. UTC | #3
On 7/11/19 3:32 PM, Jan Bobek wrote:
> +sub data16($%)
> +{
> +    my ($insn, %data16) = @_;
> +    $insn->{data16} = \%data16;
> +}
> +
> +sub rep($%)
> +{
> +    my ($insn, %rep) = @_;
> +    $insn->{rep} = \%rep;
> +}
> +
> +sub repne($%)
> +{
> +    my ($insn, %repne) = @_;
> +    $insn->{repne} = \%repne;
> +}

What do you think of replacing these with p($_, 0x66), etc?

It kinda matches up with the "p => 0x66" within vex(), and it is easier for the
eye to match up with the comments before each pattern.
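
A rough sketch of how such a p() could be layered on the existing hash keys. The table and the error handling are assumptions for illustration; the thread does not show the final implementation.

```perl
use strict;
use warnings;

# Legacy prefix bytes mapped to the keys the v3 patch already uses.
my %prefix_key = (
    0x66 => 'data16',   # operand-size override
    0xF3 => 'rep',      # REP / REPE
    0xF2 => 'repne',    # REPNE
);

sub p {
    my ($insn, $byte, %args) = @_;
    my $key = $prefix_key{$byte};
    defined $key
        or die sprintf("p: unsupported prefix byte 0x%02x\n", $byte);
    $insn->{$key} = \%args;
}

my %insn;
p(\%insn, 0x66);
print join(',', sort keys %insn), "\n";
```

In a config file this would be spelled p($_, 0x66), matching the byte shown in the comment above each pattern.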

> +sub modrm($%)
> +{
> +    my ($insn, %args) = @_;
> +    modrm_($insn, indexk => 'index', %args);
> +}
> +
> +sub modrm_vsib($%)
> +{
> +    my ($insn, %args) = @_;
> +    modrm_($insn, indexk => 'vindex', %args);
> +}

I'm thinking of adding a few more exports for very common patterns:

modrm_reg    -- force use of register.
modrm_mem    -- force use of memory.
modrm_mmx_1  -- crop reg1 to 0-7 for mm register.
modrm_mmx_2  -- crop reg2 to 0-7 if in use.
modrm_mmx_12 -- crop both reg1 and reg2.

I think these would significantly shorten some of the !constraints.
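
As a sketch of the cropping helpers only: the versions below assume they run after $insn->{modrm} has been populated, and that reg is "reg1" and reg2 is the register-form operand. Both are guesses at the intended design rather than the code that was eventually committed.

```perl
use strict;
use warnings;

# Crop MODRM.REG to 0-7 so it always names an mm register.
sub modrm_mmx_1 {
    my ($insn) = @_;
    $insn->{modrm}{reg} &= 0x7;
}

# Crop the second register operand, but only in the register form;
# in the memory forms reg2 is absent and nothing is touched.
sub modrm_mmx_2 {
    my ($insn) = @_;
    $insn->{modrm}{reg2} &= 0x7 if defined $insn->{modrm}{reg2};
}

sub modrm_mmx_12 {
    my ($insn) = @_;
    modrm_mmx_1($insn);
    modrm_mmx_2($insn);
}

my %insn = (modrm => { reg => 13, reg2 => 9 });
modrm_mmx_12(\%insn);
print "reg=$insn{modrm}{reg} reg2=$insn{modrm}{reg2}\n";
```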

I'm willing to do these changes myself; for the GSoC project I'd rather you
continue to the next phase instead of iterating on risugen further.


r~
Jan Bobek July 22, 2019, 1:41 p.m. UTC | #4
On 7/20/19 9:54 PM, Richard Henderson wrote:
> On 7/11/19 3:32 PM, Jan Bobek wrote:
>> +sub data16($%)
>> +{
>> +    my ($insn, %data16) = @_;
>> +    $insn->{data16} = \%data16;
>> +}
>> +
>> +sub rep($%)
>> +{
>> +    my ($insn, %rep) = @_;
>> +    $insn->{rep} = \%rep;
>> +}
>> +
>> +sub repne($%)
>> +{
>> +    my ($insn, %repne) = @_;
>> +    $insn->{repne} = \%repne;
>> +}
> 
> What do you think of replacing these with p($_, 0x66), etc?
> 
> It kinda matches up with the "p => 0x66" within vex(), and it is easier for the
> eye to match up with the comments before each pattern.

Good idea!

>> +sub modrm($%)
>> +{
>> +    my ($insn, %args) = @_;
>> +    modrm_($insn, indexk => 'index', %args);
>> +}
>> +
>> +sub modrm_vsib($%)
>> +{
>> +    my ($insn, %args) = @_;
>> +    modrm_($insn, indexk => 'vindex', %args);
>> +}
> 
> I'm thinking of adding a few more exports for very common patterns:
> 
> modrm_reg    -- force use of register.
> modrm_mem    -- force use of memory.
> modrm_mmx_1  -- crop reg1 to 0-7 for mm register.
> modrm_mmx_2  -- crop reg2 to 0-7 if in use.
> modrm_mmx_12 -- crop both reg1 and reg2.
> 
> I think these would significantly shorten some of the !constraints.

I agree. I thought of something similar when I was preparing the v3
series; I didn't include it only because it would have further delayed
getting the v3 out.

> I'm willing to do these changes myself; for the GSoC project I'd rather you
> continue to the next phase instead of iterating on risugen further.

Of course, and thank you!

-Jan

Patch

diff --git a/risugen_x86_constraints.pm b/risugen_x86_constraints.pm
new file mode 100644
index 0000000..a4ee687
--- /dev/null
+++ b/risugen_x86_constraints.pm
@@ -0,0 +1,154 @@ 
+#!/usr/bin/perl -w
+###############################################################################
+# Copyright (c) 2019 Jan Bobek
+# All rights reserved. This program and the accompanying materials
+# are made available under the terms of the Eclipse Public License v1.0
+# which accompanies this distribution, and is available at
+# http://www.eclipse.org/legal/epl-v10.html
+#
+# Contributors:
+#     Jan Bobek - initial implementation
+###############################################################################
+
+# risugen_x86_constraints -- risugen_x86's helper module for "!constraints" blocks
+package risugen_x86_constraints;
+
+use strict;
+use warnings;
+
+use risugen_common;
+use risugen_x86_asm;
+
+our @ISA    = qw(Exporter);
+our @EXPORT = qw(eval_constraints_block);
+
+my $is_x86_64;
+
+sub data16($%)
+{
+    my ($insn, %data16) = @_;
+    $insn->{data16} = \%data16;
+}
+
+sub rep($%)
+{
+    my ($insn, %rep) = @_;
+    $insn->{rep} = \%rep;
+}
+
+sub repne($%)
+{
+    my ($insn, %repne) = @_;
+    $insn->{repne} = \%repne;
+}
+
+sub rex($%)
+{
+    my ($insn, %rex) = @_;
+    # It doesn't make sense to randomize any REX fields, since REX.W
+    # is opcode-like and REX.R/.X/.B are encoded automatically by
+    # risugen_x86_asm.
+    $insn->{rex} = \%rex;
+}
+
+sub vex($%)
+{
+    my ($insn, %vex) = @_;
+    my $regidw = $is_x86_64 ? 4 : 3;
+
+    # There is no point in randomizing other VEX fields, since
+    # VEX.R/.X/.B are encoded automatically by risugen_x86_asm, and
+    # VEX.M/.P are opcodes.
+    $vex{l} = randint(width => 1) ? 256 : 128 unless defined $vex{l};
+    $vex{v} = randint(width => $regidw)       unless defined $vex{v};
+    $vex{w} = randint(width => 1)             unless defined $vex{w};
+    $insn->{vex} = \%vex;
+}
+
+sub modrm_($%)
+{
+    my ($insn, %args) = @_;
+    my $regidw = $is_x86_64 ? 4 : 3;
+
+    my %modrm = ();
+    if (defined $args{reg}) {
+        # This makes the config file syntax a bit more accommodating
+        # in cases where MODRM.REG is an opcode extension field.
+        $modrm{reg} = $args{reg};
+    } else {
+        $modrm{reg} = randint(width => $regidw);
+    }
+
+    # There is also a displacement-only form, but we don't know
+    # absolute address of the memblock, so we cannot test it.
+    my $form = int(rand(4));
+    if ($form == 0) {
+        $modrm{reg2} = randint(width => $regidw);
+    } else {
+        $modrm{base} = randint(width => $regidw);
+
+        if ($form == 2) {
+            $modrm{base}        = randint(width => $regidw);
+            $modrm{disp}{value} = randint(width => 8, signed => 1);
+            $modrm{disp}{width} = 8;
+        } elsif ($form == 3) {
+            $modrm{base}        = randint(width => $regidw);
+            $modrm{disp}{value} = randint(width => 32, signed => 1);
+            $modrm{disp}{width} = 32;
+        }
+
+        my $have_index = int(rand(2));
+        if ($have_index) {
+            my $indexk      = $args{indexk};
+            $modrm{ss}      = randint(width => 2);
+            $modrm{$indexk} = randint(width => $regidw);
+        }
+    }
+
+    $insn->{modrm} = \%modrm;
+}
+
+sub modrm($%)
+{
+    my ($insn, %args) = @_;
+    modrm_($insn, indexk => 'index', %args);
+}
+
+sub modrm_vsib($%)
+{
+    my ($insn, %args) = @_;
+    modrm_($insn, indexk => 'vindex', %args);
+}
+
+sub imm($%)
+{
+    my ($insn, %args) = @_;
+    $insn->{imm}{value} = randint(%args);
+    $insn->{imm}{width} = $args{width};
+}
+
+sub eval_constraints_block(%)
+{
+    my (%args) = @_;
+    my $rec = $args{rec};
+    my $insn = $args{insn};
+    my $insnname = $rec->{name};
+    my $opcode = $insn->{opcode}{value};
+
+    $is_x86_64 = $args{is_x86_64};
+
+    my $constraint = $rec->{blocks}{"constraints"};
+    if (defined $constraint) {
+        # user-specified constraint: evaluate in an environment
+        # with variables set corresponding to the variable fields.
+        my %env = extract_fields($opcode, $rec);
+        # set the variable $_ to the instruction in question
+        $env{_} = $insn;
+
+        return eval_block($insnname, "constraints", $constraint, \%env);
+    } else {
+        return 1;
+    }
+}
+
+1;
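
For orientation, a simplified stand-in for how a "!constraints" block ends up calling these helpers. eval_sketch below is not risugen_common's eval_block, whose details are not shown in this patch; it only illustrates the idea of evaluating the block text with $_ bound to the instruction. The vex() here is likewise cut down to the bare minimum.

```perl
use strict;
use warnings;

# Cut-down vex(): record the fields and nothing else.
sub vex {
    my ($insn, %vex) = @_;
    $insn->{vex} = \%vex;
}

# Hypothetical evaluator: run the block text with $_ set to the
# instruction hash and return the block's value.
sub eval_sketch {
    my ($insn, $block) = @_;
    local $_ = $insn;
    my $ok = eval $block;
    die "constraints block failed: $@" if $@;
    return $ok;
}

my %insn;
my $ok = eval_sketch(\%insn, 'vex($_, l => 128, p => 0x66); 1');
print "ok=$ok l=$insn{vex}{l}\n";
```

A block that evaluates to a false value would signal that the generated instruction should be rejected; in this sketch the trailing 1 always accepts it.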