Message ID | 20190711223300.6061-5-jan.bobek@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | Support for generating x86 SIMD test images |
On 7/12/19 12:32 AM, Jan Bobek wrote:
> +sub vex($%)
> +{
> +    my ($insn, %vex) = @_;
> +    my $regidw = $is_x86_64 ? 4 : 3;
> +
> +    # There is no point in randomizing other VEX fields, since
> +    # VEX.R/.X/.B are encoded automatically by risugen_x86_asm, and
> +    # VEX.M/.P are opcodes.
> +    $vex{l} = randint(width => 1) ? 256 : 128 unless defined $vex{l};

VEX.L is sort-of opcode-like as well. It certainly differentiates AVX1 vs
AVX2, and so probably should be constrained somehow. I can't think of what's
the best way to do that at the moment, since our existing --xstate=foo isn't right.

Perhaps just a FIXME comment for now?

> +sub modrm_($%)
> +{
> +    my ($insn, %args) = @_;
> +    my $regidw = $is_x86_64 ? 4 : 3;
> +
> +    my %modrm = ();
> +    if (defined $args{reg}) {
> +        # This makes the config file syntax a bit more accommodating
> +        # in cases where MODRM.REG is an opcode extension field.
> +        $modrm{reg} = $args{reg};
> +    } else {
> +        $modrm{reg} = randint(width => $regidw);
> +    }
> +
> +    # There is also a displacement-only form, but we don't know
> +    # absolute address of the memblock, so we cannot test it.

32-bit mode has displacement-only, aka absolute; 64-bit replaces that with
rip-relative. But agreed that the first is impossible to test and the second
is difficult.

> +sub modrm($%)
> +{
> +    my ($insn, %args) = @_;
> +    modrm_($insn, indexk => 'index', %args);
> +}

How are you avoiding %rsp as index?
I saw you die for that in the previous patch...


r~
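The constraint behind the %rsp question can be illustrated with a small sketch. In a SIB byte, index encoding 0b100 (without REX.X) means "no index", so the stack pointer can never serve as a general-purpose index register, while r12 (the same low bits with REX.X set) can. The helper below is hypothetical: the name pick_index and the redraw-until-valid strategy are assumptions made for illustration, not code taken from risugen.

```perl
use strict;
use warnings;

# Register number of (E/R)SP; as a SIB index this encoding means "no index".
use constant REG_RSP => 4;

# Hypothetical helper (not part of risugen): draw a random general-purpose
# register number that is legal as a SIB index.
sub pick_index {
    my ($is_x86_64) = @_;
    my $nregs = $is_x86_64 ? 16 : 8;   # 4-bit vs. 3-bit register IDs
    my $index;
    do {
        $index = int(rand($nregs));
    } while ($index == REG_RSP);       # redraw if we hit the stack pointer
    return $index;
}

printf "index register: %d\n", pick_index(1);
```

Redrawing keeps the remaining registers uniformly distributed; an alternative would be to die on an invalid draw and let the caller retry.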
On 7/12/19 10:24 AM, Richard Henderson wrote:
> On 7/12/19 12:32 AM, Jan Bobek wrote:
>> +sub vex($%)
>> +{
>> +    my ($insn, %vex) = @_;
>> +    my $regidw = $is_x86_64 ? 4 : 3;
>> +
>> +    # There is no point in randomizing other VEX fields, since
>> +    # VEX.R/.X/.B are encoded automatically by risugen_x86_asm, and
>> +    # VEX.M/.P are opcodes.
>> +    $vex{l} = randint(width => 1) ? 256 : 128 unless defined $vex{l};
>
> VEX.L is sort-of opcode-like as well. It certainly differentiates AVX1 vs
> AVX2, and so probably should be constrained somehow. I can't think of what's
> the best way to do that at the moment, since our existing --xstate=foo isn't right.
>
> Perhaps just a FIXME comment for now?

So, the instructions that use VEX.L specify it in the !constraints
block in the config file. Originally, I thought some instructions are
supposed to ignore it (denoted by LIG in the Intel manual -- it's the
scalar instructions like ADDSS), so it might be worth
randomizing. However, when I later read the manual pages of some of
these instructions, it said they are supposed to be encoded with
VEX.L=0 anyway. I didn't check every single one of them, but right now
they are all encoded with VEX.L=0, so I suppose this line can be
removed and we can rely on the caller (the !constraints block) to
always specify it.

>> +sub modrm_($%)
>> +{
>> +    my ($insn, %args) = @_;
>> +    my $regidw = $is_x86_64 ? 4 : 3;
>> +
>> +    my %modrm = ();
>> +    if (defined $args{reg}) {
>> +        # This makes the config file syntax a bit more accommodating
>> +        # in cases where MODRM.REG is an opcode extension field.
>> +        $modrm{reg} = $args{reg};
>> +    } else {
>> +        $modrm{reg} = randint(width => $regidw);
>> +    }
>> +
>> +    # There is also a displacement-only form, but we don't know
>> +    # absolute address of the memblock, so we cannot test it.
>
> 32-bit mode has displacement-only, aka absolute; 64-bit replaces that with
> rip-relative. But agreed that the first is impossible to test and the second
> is difficult.
>
>> +sub modrm($%)
>> +{
>> +    my ($insn, %args) = @_;
>> +    modrm_($insn, indexk => 'index', %args);
>> +}
>
> How are you avoiding %rsp as index?
> I saw you die for that in the previous patch...

See write_mem_getoffset in risugen_x86.pm. I felt there's a better
place for it there, since that's when we actually need to write to it,
so the problem is more exposed.

-Jan

>
> r~
>
On 7/11/19 3:32 PM, Jan Bobek wrote:
> +sub data16($%)
> +{
> +    my ($insn, %data16) = @_;
> +    $insn->{data16} = \%data16;
> +}
> +
> +sub rep($%)
> +{
> +    my ($insn, %rep) = @_;
> +    $insn->{rep} = \%rep;
> +}
> +
> +sub repne($%)
> +{
> +    my ($insn, %repne) = @_;
> +    $insn->{repne} = \%repne;
> +}

What do you think of replacing these with p($_, 0x66), etc?

It kinda matches up with the "p => 0x66" within vex(), and it is easier for the
eye to match up with the comments before each pattern.

> +sub modrm($%)
> +{
> +    my ($insn, %args) = @_;
> +    modrm_($insn, indexk => 'index', %args);
> +}
> +
> +sub modrm_vsib($%)
> +{
> +    my ($insn, %args) = @_;
> +    modrm_($insn, indexk => 'vindex', %args);
> +}

I'm thinking of adding a few more exports for very common patterns:

  modrm_reg -- force use of register.
  modrm_mem -- force use of memory.
  modrm_mmx_1 -- crop reg1 to 0-7 for mm register.
  modrm_mmx_2 -- crop reg2 to 0-7 if in use.
  modrm_mmx_12 -- crop both reg1 and reg2.

I think these would significantly shorten some of the !constraints.

I'm willing to do these changes myself; for the GSoC project I'd rather you
continue to the next phase instead of iterating on risugen further.


r~
On 7/20/19 9:54 PM, Richard Henderson wrote:
> On 7/11/19 3:32 PM, Jan Bobek wrote:
>> +sub data16($%)
>> +{
>> +    my ($insn, %data16) = @_;
>> +    $insn->{data16} = \%data16;
>> +}
>> +
>> +sub rep($%)
>> +{
>> +    my ($insn, %rep) = @_;
>> +    $insn->{rep} = \%rep;
>> +}
>> +
>> +sub repne($%)
>> +{
>> +    my ($insn, %repne) = @_;
>> +    $insn->{repne} = \%repne;
>> +}
>
> What do you think of replacing these with p($_, 0x66), etc?
>
> It kinda matches up with the "p => 0x66" within vex(), and it is easier for the
> eye to match up with the comments before each pattern.

Good idea!

>> +sub modrm($%)
>> +{
>> +    my ($insn, %args) = @_;
>> +    modrm_($insn, indexk => 'index', %args);
>> +}
>> +
>> +sub modrm_vsib($%)
>> +{
>> +    my ($insn, %args) = @_;
>> +    modrm_($insn, indexk => 'vindex', %args);
>> +}
>
> I'm thinking of adding a few more exports for very common patterns:
>
>   modrm_reg -- force use of register.
>   modrm_mem -- force use of memory.
>   modrm_mmx_1 -- crop reg1 to 0-7 for mm register.
>   modrm_mmx_2 -- crop reg2 to 0-7 if in use.
>   modrm_mmx_12 -- crop both reg1 and reg2.
>
> I think these would significantly shorten some of the !constraints.

I agree. I thought of something similar when I was preparing the v3
series; I didn't include it only because it would have further delayed
getting the v3 out.

> I'm willing to do these changes myself; for the GSoC project I'd rather you
> continue to the next phase instead of iterating on risugen further.

Of course, and thank you!

-Jan
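The p($_, 0x66) idea agreed on above could look roughly like the following sketch. Only the call shape p($insn, byte) comes from the thread; the lookup table, the error handling, and the return value are assumptions, and the helper that eventually landed may differ. The byte values are the standard x86 legacy prefixes (0x66 operand-size override, 0xF3 REP, 0xF2 REPNE).

```perl
use strict;
use warnings;

# Assumed mapping from legacy prefix byte to the $insn key used by the
# data16/rep/repne helpers in the patch.
my %PREFIX_KEY = (
    0x66 => 'data16',
    0xF3 => 'rep',
    0xF2 => 'repne',
);

# Hypothetical single replacement for data16(), rep() and repne().
sub p {
    my ($insn, $byte, %args) = @_;
    my $key = $PREFIX_KEY{$byte};
    die sprintf("unknown legacy prefix 0x%02x\n", $byte)
        unless defined $key;
    $insn->{$key} = \%args;
    return 1;
}

my %insn;
p(\%insn, 0x66);
p(\%insn, 0xF3);
print join(',', sort keys %insn), "\n";   # prints "data16,rep"
```

Keying on the prefix byte lets a pattern's constraint line mirror the opcode comment above it, which is the readability benefit mentioned in the review.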
diff --git a/risugen_x86_constraints.pm b/risugen_x86_constraints.pm
new file mode 100644
index 0000000..a4ee687
--- /dev/null
+++ b/risugen_x86_constraints.pm
@@ -0,0 +1,154 @@
+#!/usr/bin/perl -w
+###############################################################################
+# Copyright (c) 2019 Jan Bobek
+# All rights reserved. This program and the accompanying materials
+# are made available under the terms of the Eclipse Public License v1.0
+# which accompanies this distribution, and is available at
+# http://www.eclipse.org/legal/epl-v10.html
+#
+# Contributors:
+#     Jan Bobek - initial implementation
+###############################################################################
+
+# risugen_x86_constraints -- risugen_x86's helper module for "!constraints" blocks
+package risugen_x86_constraints;
+
+use strict;
+use warnings;
+
+use risugen_common;
+use risugen_x86_asm;
+
+our @ISA    = qw(Exporter);
+our @EXPORT = qw(eval_constraints_block);
+
+my $is_x86_64;
+
+sub data16($%)
+{
+    my ($insn, %data16) = @_;
+    $insn->{data16} = \%data16;
+}
+
+sub rep($%)
+{
+    my ($insn, %rep) = @_;
+    $insn->{rep} = \%rep;
+}
+
+sub repne($%)
+{
+    my ($insn, %repne) = @_;
+    $insn->{repne} = \%repne;
+}
+
+sub rex($%)
+{
+    my ($insn, %rex) = @_;
+    # It doesn't make sense to randomize any REX fields, since REX.W
+    # is opcode-like and REX.R/.X/.B are encoded automatically by
+    # risugen_x86_asm.
+    $insn->{rex} = \%rex;
+}
+
+sub vex($%)
+{
+    my ($insn, %vex) = @_;
+    my $regidw = $is_x86_64 ? 4 : 3;
+
+    # There is no point in randomizing other VEX fields, since
+    # VEX.R/.X/.B are encoded automatically by risugen_x86_asm, and
+    # VEX.M/.P are opcodes.
+    $vex{l} = randint(width => 1) ? 256 : 128 unless defined $vex{l};
+    $vex{v} = randint(width => $regidw) unless defined $vex{v};
+    $vex{w} = randint(width => 1) unless defined $vex{w};
+    $insn->{vex} = \%vex;
+}
+
+sub modrm_($%)
+{
+    my ($insn, %args) = @_;
+    my $regidw = $is_x86_64 ? 4 : 3;
+
+    my %modrm = ();
+    if (defined $args{reg}) {
+        # This makes the config file syntax a bit more accommodating
+        # in cases where MODRM.REG is an opcode extension field.
+        $modrm{reg} = $args{reg};
+    } else {
+        $modrm{reg} = randint(width => $regidw);
+    }
+
+    # There is also a displacement-only form, but we don't know
+    # absolute address of the memblock, so we cannot test it.
+    my $form = int(rand(4));
+    if ($form == 0) {
+        $modrm{reg2} = randint(width => $regidw);
+    } else {
+        $modrm{base} = randint(width => $regidw);
+
+        if ($form == 2) {
+            $modrm{base} = randint(width => $regidw);
+            $modrm{disp}{value} = randint(width => 8, signed => 1);
+            $modrm{disp}{width} = 8;
+        } elsif ($form == 3) {
+            $modrm{base} = randint(width => $regidw);
+            $modrm{disp}{value} = randint(width => 32, signed => 1);
+            $modrm{disp}{width} = 32;
+        }
+
+        my $have_index = int(rand(2));
+        if ($have_index) {
+            my $indexk = $args{indexk};
+            $modrm{ss} = randint(width => 2);
+            $modrm{$indexk} = randint(width => $regidw);
+        }
+    }
+
+    $insn->{modrm} = \%modrm;
+}
+
+sub modrm($%)
+{
+    my ($insn, %args) = @_;
+    modrm_($insn, indexk => 'index', %args);
+}
+
+sub modrm_vsib($%)
+{
+    my ($insn, %args) = @_;
+    modrm_($insn, indexk => 'vindex', %args);
+}
+
+sub imm($%)
+{
+    my ($insn, %args) = @_;
+    $insn->{imm}{value} = randint(%args);
+    $insn->{imm}{width} = $args{width};
+}
+
+sub eval_constraints_block(%)
+{
+    my (%args) = @_;
+    my $rec = $args{rec};
+    my $insn = $args{insn};
+    my $insnname = $rec->{name};
+    my $opcode = $insn->{opcode}{value};
+
+    $is_x86_64 = $args{is_x86_64};
+
+    my $constraint = $rec->{blocks}{"constraints"};
+    if (defined $constraint) {
+        # user-specified constraint: evaluate in an environment
+        # with variables set corresponding to the variable fields.
+        my %env = extract_fields($opcode, $rec);
+        # set the variable $_ to the instruction in question
+        $env{_} = $insn;
+
+        return eval_block($insnname, "constraints", $constraint, \%env);
+    } else {
+        return 1;
+    }
+}
+
+1;
The module risugen_x86_constraints.pm provides the environment for
evaluating x86 "!constraints" blocks. This is facilitated by the
single exported function eval_constraints_block.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 risugen_x86_constraints.pm | 154 +++++++++++++++++++++++++++++++++++++
 1 file changed, 154 insertions(+)
 create mode 100644 risugen_x86_constraints.pm
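To make the evaluation model concrete, here is a minimal stand-alone sketch of running a constraints block with $_ bound to the instruction. It does not use risugen_common: eval_constraints_text is a hypothetical stand-in for the eval_block call in the patch, imm is a simplified copy of the patch's helper that uses rand instead of randint, and the block text is an invented example rather than a line from a real config file.

```perl
use strict;
use warnings;

# Simplified stand-in for the imm() helper; the real one calls randint()
# from risugen_common.
sub imm {
    my ($insn, %args) = @_;
    $insn->{imm}{value} = int(rand(2 ** $args{width}));
    $insn->{imm}{width} = $args{width};
    return 1;
}

# Hypothetical evaluator: run the block text as Perl with $_ set to the
# instruction and return the block's final value.
sub eval_constraints_text {
    my ($insn, $text) = @_;
    local $_ = $insn;
    my $ok = eval $text;
    die "constraints block failed: $@" if $@;
    return $ok;
}

my %insn;
my $ok = eval_constraints_text(\%insn, 'imm($_, width => 8); 1');
printf "ok=%d imm width=%d\n", $ok, $insn{imm}{width};   # ok=1 imm width=8
```

Because the block is ordinary Perl, helpers such as imm, modrm and vex can both fill in instruction fields as a side effect and veto an encoding by making the block evaluate to false.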