x86emul: defer rIP-relative address calculation

Message ID	584825BE0200007800126455@prv-mh.provo.novell.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <xen-devel-bounces@lists.xen.org> Message-Id: <584825BE0200007800126455@prv-mh.provo.novell.com> Date: Wed, 07 Dec 2016 07:07:42 -0700 From: "Jan Beulich" <JBeulich@suse.com> To: "xen-devel" <xen-devel@lists.xenproject.org> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=__PartD1E8CCBE.1__=" Cc: Andrew Cooper <andrew.cooper3@citrix.com> Subject: [Xen-devel] [PATCH] x86emul: defer rIP-relative address calculation Precedence: list Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" <xen-devel-bounces@lists.xen.org>

Jan Beulich Dec. 7, 2016, 2:07 p.m. UTC

By putting it after all instruction fetching has been done, we can both
simplify the existing handling of immediate operands and take care of
any future instructions allowing rIP-relative operands and getting
additional bytes fetched in x86_decode_*() (the current cases of extra
bytes getting fetched there are only for operands without ModR/M bytes,
or with them only allowing their register forms).

Similarly the new placement of truncate_ea() will take care of any
future cases of non-standard memory operands (the one existing case -
opcodes A0...A3 - are fine with and without this, as they fetch an
ad_bytes sized unsigned address anyway).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
x86emul: defer rIP-relative address calculation

By putting it after all instruction fetching has been done, we can both
simplify the existing handling of immediate operands and take care of
any future instructions allowing rIP-relative operands and getting
additional bytes fetched in x86_decode_*() (the current cases of extra
bytes getting fetched there are only for operands without ModR/M bytes,
or with them only allowing their register forms).

Similarly the new placement of truncate_ea() will take care of any
future cases of non-standard memory operands (the one existing case -
opcodes A0...A3 - are fine with and without this, as they fetch an
ad_bytes sized unsigned address anyway).

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1925,6 +1925,7 @@ x86_decode(
     uint8_t b, d, sib, sib_index, sib_base;
     unsigned int def_op_bytes, def_ad_bytes, opcode;
     enum x86_segment override_seg = x86_seg_none;
+    bool ip_rel = false;
     int rc = X86EMUL_OKAY;
 
     memset(state, 0, sizeof(*state));
@@ -2290,7 +2291,6 @@ x86_decode(
                 ea.mem.off += insn_fetch_type(int16_t);
                 break;
             }
-            ea.mem.off = truncate_ea(ea.mem.off);
         }
         else
         {
@@ -2339,15 +2339,7 @@ x86_decode(
                 if ( (modrm_rm & 7) != 5 )
                     break;
                 ea.mem.off = insn_fetch_type(int32_t);
-                if ( !mode_64bit() )
-                    break;
-                /* Relative to RIP of next instruction. Argh! */
-                ea.mem.off += state->eip;
-                if ( (d & SrcMask) == SrcImm )
-                    ea.mem.off += (d & ByteOp) ? 1 :
-                        ((op_bytes == 8) ? 4 : op_bytes);
-                else if ( (d & SrcMask) == SrcImmByte )
-                    ea.mem.off += 1;
+                ip_rel = mode_64bit();
                 break;
             case 1:
                 ea.mem.off += insn_fetch_type(int8_t);
@@ -2356,7 +2348,6 @@ x86_decode(
                 ea.mem.off += insn_fetch_type(int32_t);
                 break;
             }
-            ea.mem.off = truncate_ea(ea.mem.off);
         }
     }
 
@@ -2421,6 +2412,14 @@ x86_decode(
         return X86EMUL_UNHANDLEABLE;
     }
 
+    if ( ea.type == OP_MEM )
+    {
+        if ( ip_rel )
+            ea.mem.off += state->eip;
+
+        ea.mem.off = truncate_ea(ea.mem.off);
+    }
+
     /*
      * Undo the operand-size override effect of prefix 66 when it was
      * determined to have another meaning.

Andrew Cooper Dec. 7, 2016, 3:38 p.m. UTC | #1

On 07/12/16 14:07, Jan Beulich wrote:
> By putting it after all instruction fetching has been done, we can both
> simplify the existing handling of immediate operands and take care of
> any future instructions allowing rIP-relative operands and getting
> additional bytes fetched in x86_decode_*() (the current cases of extra
> bytes getting fetched there are only for operands without ModR/M bytes,
> or with them only allowing their register forms).
>
> Similarly the new placement of truncate_ea() will take care of any
> future cases of non-standard memory operands (the one existing case -
> opcodes A0...A3 - are fine with and without this, as they fetch an
> ad_bytes sized unsigned address anyway).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

This is rather clearer to follow.

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>, although...

>
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -1925,6 +1925,7 @@ x86_decode(
>      uint8_t b, d, sib, sib_index, sib_base;
>      unsigned int def_op_bytes, def_ad_bytes, opcode;
>      enum x86_segment override_seg = x86_seg_none;
> +    bool ip_rel = false;

I would name this specifically rip_rel, as that is the term used in all
the manuals.

~Andrew

Jan Beulich Dec. 7, 2016, 3:43 p.m. UTC | #2

>>> On 07.12.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
> On 07/12/16 14:07, Jan Beulich wrote:
>> By putting it after all instruction fetching has been done, we can both
>> simplify the existing handling of immediate operands and take care of
>> any future instructions allowing rIP-relative operands and getting
>> additional bytes fetched in x86_decode_*() (the current cases of extra
>> bytes getting fetched there are only for operands without ModR/M bytes,
>> or with them only allowing their register forms).
>>
>> Similarly the new placement of truncate_ea() will take care of any
>> future cases of non-standard memory operands (the one existing case -
>> opcodes A0...A3 - are fine with and without this, as they fetch an
>> ad_bytes sized unsigned address anyway).
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> This is rather clearer to follow.
> 
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>, although...
> 
>>
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>> @@ -1925,6 +1925,7 @@ x86_decode(
>>      uint8_t b, d, sib, sib_index, sib_base;
>>      unsigned int def_op_bytes, def_ad_bytes, opcode;
>>      enum x86_segment override_seg = x86_seg_none;
>> +    bool ip_rel = false;
> 
> I would name this specifically rip_rel, as that is the term used in all
> the manuals.

And I specifically avoided it as being wrong in the context of the
32-bit test harness. Would pc_rel suit you better than ip_rel?

Jan

Jan Beulich Dec. 7, 2016, 3:47 p.m. UTC | #3

>>> On 07.12.16 at 16:43, <JBeulich@suse.com> wrote:
>>>> On 07.12.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>> On 07/12/16 14:07, Jan Beulich wrote:
>>> By putting it after all instruction fetching has been done, we can both
>>> simplify the existing handling of immediate operands and take care of
>>> any future instructions allowing rIP-relative operands and getting
>>> additional bytes fetched in x86_decode_*() (the current cases of extra
>>> bytes getting fetched there are only for operands without ModR/M bytes,
>>> or with them only allowing their register forms).
>>>
>>> Similarly the new placement of truncate_ea() will take care of any
>>> future cases of non-standard memory operands (the one existing case -
>>> opcodes A0...A3 - are fine with and without this, as they fetch an
>>> ad_bytes sized unsigned address anyway).
>>>
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> 
>> This is rather clearer to follow.
>> 
>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>, although...
>> 
>>>
>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>> @@ -1925,6 +1925,7 @@ x86_decode(
>>>      uint8_t b, d, sib, sib_index, sib_base;
>>>      unsigned int def_op_bytes, def_ad_bytes, opcode;
>>>      enum x86_segment override_seg = x86_seg_none;
>>> +    bool ip_rel = false;
>> 
>> I would name this specifically rip_rel, as that is the term used in all
>> the manuals.
> 
> And I specifically avoided it as being wrong in the context of the
> 32-bit test harness. Would pc_rel suit you better than ip_rel?

Actually the reference to the 32-bit test harness was wrong here
(obviously). Instead, it is wrong in the context of 32-bit addressing
in 64-bit mode.

Jan

Andrew Cooper Dec. 7, 2016, 4:08 p.m. UTC | #4

On 07/12/16 15:47, Jan Beulich wrote:
>>>> On 07.12.16 at 16:43, <JBeulich@suse.com> wrote:
>>>>> On 07.12.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>>> On 07/12/16 14:07, Jan Beulich wrote:
>>>> By putting it after all instruction fetching has been done, we can both
>>>> simplify the existing handling of immediate operands and take care of
>>>> any future instructions allowing rIP-relative operands and getting
>>>> additional bytes fetched in x86_decode_*() (the current cases of extra
>>>> bytes getting fetched there are only for operands without ModR/M bytes,
>>>> or with them only allowing their register forms).
>>>>
>>>> Similarly the new placement of truncate_ea() will take care of any
>>>> future cases of non-standard memory operands (the one existing case -
>>>> opcodes A0...A3 - are fine with and without this, as they fetch an
>>>> ad_bytes sized unsigned address anyway).
>>>>
>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>> This is rather clearer to follow.
>>>
>>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>, although...
>>>
>>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>>> @@ -1925,6 +1925,7 @@ x86_decode(
>>>>      uint8_t b, d, sib, sib_index, sib_base;
>>>>      unsigned int def_op_bytes, def_ad_bytes, opcode;
>>>>      enum x86_segment override_seg = x86_seg_none;
>>>> +    bool ip_rel = false;
>>> I would name this specifically rip_rel, as that is the term used in all
>>> the manuals.
>> And I specifically avoided it as being wrong in the context of the
>> 32-bit test harness. Would pc_rel suit you better than ip_rel?
> Actually the reference to the 32-bit test harness was wrong here
> (obviously). Instead, it is wrong in the context of 32-bit addressing
> in 64-bit mode.

Such a case would still have rip_rel = false.  This addressing mode is
unique to 64bit.

But yes, pc_rel is still slightly better if you insist for not using
rip_rel.

~Andrew

Jan Beulich Dec. 7, 2016, 4:30 p.m. UTC | #5

>>> On 07.12.16 at 17:08, <andrew.cooper3@citrix.com> wrote:
> On 07/12/16 15:47, Jan Beulich wrote:
>>>>> On 07.12.16 at 16:43, <JBeulich@suse.com> wrote:
>>>>>> On 07.12.16 at 16:38, <andrew.cooper3@citrix.com> wrote:
>>>> On 07/12/16 14:07, Jan Beulich wrote:
>>>>> By putting it after all instruction fetching has been done, we can both
>>>>> simplify the existing handling of immediate operands and take care of
>>>>> any future instructions allowing rIP-relative operands and getting
>>>>> additional bytes fetched in x86_decode_*() (the current cases of extra
>>>>> bytes getting fetched there are only for operands without ModR/M bytes,
>>>>> or with them only allowing their register forms).
>>>>>
>>>>> Similarly the new placement of truncate_ea() will take care of any
>>>>> future cases of non-standard memory operands (the one existing case -
>>>>> opcodes A0...A3 - are fine with and without this, as they fetch an
>>>>> ad_bytes sized unsigned address anyway).
>>>>>
>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>> This is rather clearer to follow.
>>>>
>>>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>, although...
>>>>
>>>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>>>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>>>>> @@ -1925,6 +1925,7 @@ x86_decode(
>>>>>      uint8_t b, d, sib, sib_index, sib_base;
>>>>>      unsigned int def_op_bytes, def_ad_bytes, opcode;
>>>>>      enum x86_segment override_seg = x86_seg_none;
>>>>> +    bool ip_rel = false;
>>>> I would name this specifically rip_rel, as that is the term used in all
>>>> the manuals.
>>> And I specifically avoided it as being wrong in the context of the
>>> 32-bit test harness. Would pc_rel suit you better than ip_rel?
>> Actually the reference to the 32-bit test harness was wrong here
>> (obviously). Instead, it is wrong in the context of 32-bit addressing
>> in 64-bit mode.
> 
> Such a case would still have rip_rel = false.  This addressing mode is
> unique to 64bit.

That's what I've said - 32-bit addressing in 64-bit mode (specifically
not compat mode), i.e. an address size override present there.

> But yes, pc_rel is still slightly better if you insist for not using
> rip_rel.

Will use that then.

Jan

x86emul: defer rIP-relative address calculation

Commit Message

Comments

Patch