[v3] x86emul/fuzz: add rudimentary limit checking

Message ID 595E71AA0200007800169398@prv-mh.provo.novell.com (mailing list archive)
State New, archived

Commit Message

Jan Beulich July 6, 2017, 3:21 p.m. UTC
>>> On 06.07.17 at 16:02, <george.dunlap@citrix.com> wrote:
> On 07/06/2017 01:34 PM, Jan Beulich wrote:
>>>>> On 06.07.17 at 12:57, <george.dunlap@citrix.com> wrote:
>>> On 07/06/2017 10:20 AM, Jan Beulich wrote:
>>>> fuzz_insn_fetch() is the only data access helper where it is possible
>>>> to see offsets larger than 4Gb in 16- or 32-bit modes, as we leave the
>>>> incoming rIP untouched in the emulator itself. The check is needed here
>>>> as otherwise, after successfully fetching insn bytes, we may end up
>>>> zero-extending EIP soon after complete_insn, which collides with the
>>>> X86EMUL_EXCEPTION-conditional respective ASSERT() in
>>>> x86_emulate_wrapper(). (NB: put_rep_prefix() is what allows
>>>> complete_insn to be reached with rc set to other than X86EMUL_OKAY or
>>>> X86EMUL_DONE. See also commit 53f87c03b4 ["x86emul: generalize
>>>> exception handling for rep_* hooks"].)
>>>>
>>>> Add assert()-s for all other (data) access routines, as effective
>>>> address generation in the emulator ought to guarantee in-range values.
>>>> For them to not trigger, several adjustments to the emulator's address
>>>> calculations are needed: While for DstBitBase it is really mandatory,
>>>> the specification allows for either behavior for two-part accesses.
>>>> Observed behavior on real hardware, however, is for such accesses to
>>>> silently wrap at the 2^^32 boundary in other than 64-bit mode, just
>>>> like they do at the 2^^64 boundary in 64-bit mode. While adding
>>>> truncate_ea() invocations there, also convert open coded instances of
>>>> it.
>>>>
>>>> Reported-by: George Dunlap <george.dunlap@citrix.com>
>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>> ---
>>>> v3: Add more truncate_ea().
>>>> v2: Correct system segment related assert()-s.
>>>
>>> Still getting crashes in protmode_load_seg(), line 1824.  (See attached
>>> for an example stack trace; but basically any place that calls
>>> protmode_load_seg()).
>> 
>> Ah, this is one I indeed forgot about. We shouldn't deal with this in
>> the emulator though, so slightly relaxing the assert() seems like the
>> only option: We'd need to permit reads up to 0x10007 instead of
>> 0xffff (which would never pass limit checks).
> 
> Replacing !(offset >> 16) with (offset <= 0x10007) makes all the current
> crash cases I have pass.
> 
> If you want I can submit this patch, modified, with my series of afl
> fixes / changes.

I've done the above change slightly differently (distinguishing long
from legacy modes), so if you want to put it in your series, please
use the attached variant (aka v4).

Jan
x86emul/fuzz: add rudimentary limit checking

fuzz_insn_fetch() is the only data access helper where it is possible
to see offsets larger than 4Gb in 16- or 32-bit modes, as we leave the
incoming rIP untouched in the emulator itself. The check is needed here
as otherwise, after successfully fetching insn bytes, we may end up
zero-extending EIP soon after complete_insn, which collides with the
X86EMUL_EXCEPTION-conditional respective ASSERT() in
x86_emulate_wrapper(). (NB: put_rep_prefix() is what allows
complete_insn to be reached with rc set to other than X86EMUL_OKAY or
X86EMUL_DONE. See also commit 53f87c03b4 ["x86emul: generalize
exception handling for rep_* hooks"].)

Add assert()-s for all other (data) access routines, as effective
address generation in the emulator ought to guarantee in-range values.
For them to not trigger, several adjustments to the emulator's address
calculations are needed: While for DstBitBase it is really mandatory,
the specification allows for either behavior for two-part accesses.
Observed behavior on real hardware, however, is for such accesses to
silently wrap at the 2^^32 boundary in other than 64-bit mode, just
like they do at the 2^^64 boundary in 64-bit mode. While adding
truncate_ea() invocations there, also convert open coded instances of
it.

Reported-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v4: Relax system segment read upper bounds for long mode.
v3: Add more truncate_ea().
v2: Correct system segment related assert()-s.

Comments

George Dunlap July 6, 2017, 3:28 p.m. UTC | #1
On 07/06/2017 04:21 PM, Jan Beulich wrote:
>>>> On 06.07.17 at 16:02, <george.dunlap@citrix.com> wrote:
>> On 07/06/2017 01:34 PM, Jan Beulich wrote:
>>>>>> On 06.07.17 at 12:57, <george.dunlap@citrix.com> wrote:
>>>> On 07/06/2017 10:20 AM, Jan Beulich wrote:
>>>>> fuzz_insn_fetch() is the only data access helper where it is possible
>>>>> to see offsets larger than 4Gb in 16- or 32-bit modes, as we leave the
>>>>> incoming rIP untouched in the emulator itself. The check is needed here
>>>>> as otherwise, after successfully fetching insn bytes, we may end up
>>>>> zero-extending EIP soon after complete_insn, which collides with the
>>>>> X86EMUL_EXCEPTION-conditional respective ASSERT() in
>>>>> x86_emulate_wrapper(). (NB: put_rep_prefix() is what allows
>>>>> complete_insn to be reached with rc set to other than X86EMUL_OKAY or
>>>>> X86EMUL_DONE. See also commit 53f87c03b4 ["x86emul: generalize
>>>>> exception handling for rep_* hooks"].)
>>>>>
>>>>> Add assert()-s for all other (data) access routines, as effective
>>>>> address generation in the emulator ought to guarantee in-range values.
>>>>> For them to not trigger, several adjustments to the emulator's address
>>>>> calculations are needed: While for DstBitBase it is really mandatory,
>>>>> the specification allows for either behavior for two-part accesses.
>>>>> Observed behavior on real hardware, however, is for such accesses to
>>>>> silently wrap at the 2^^32 boundary in other than 64-bit mode, just
>>>>> like they do at the 2^^64 boundary in 64-bit mode. While adding
>>>>> truncate_ea() invocations there, also convert open coded instances of
>>>>> it.
>>>>>
>>>>> Reported-by: George Dunlap <george.dunlap@citrix.com>
>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>>> ---
>>>>> v3: Add more truncate_ea().
>>>>> v2: Correct system segment related assert()-s.
>>>>
>>>> Still getting crashes in protmode_load_seg(), line 1824.  (See attached
>>>> for an example stack trace; but basically any place that calls
>>>> protmode_load_seg()).
>>>
>>> Ah, this is one I indeed forgot about. We shouldn't deal with this in
>>> the emulator though, so slightly relaxing the assert() seems like the
>>> only option: We'd need to permit reads up to 0x10007 instead of
>>> 0xffff (which would never pass limit checks).
>>
>> Replacing !(offset >> 16) with (offset <= 0x10007) makes all the current
>> crash cases I have pass.
>>
>> If you want I can submit this patch, modified, with my series of afl
>> fixes / changes.
> 
> I've done the above change slightly differently (distinguishing long
> from legacy modes), so if you want to put it in your series, please
> use the attached variant (aka v4).

OK -- again, that works with all the previously-crashing test cases.
It's fuzzing now; I'll include it in my series.

Thanks Jan,
 -George

Patch

--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -139,7 +139,18 @@  static int fuzz_read(
     struct x86_emulate_ctxt *ctxt)
 {
     /* Reads expected for all user and system segments. */
-    assert(is_x86_user_segment(seg) || is_x86_system_segment(seg));
+    if ( is_x86_user_segment(seg) )
+        assert(ctxt->addr_size == 64 || !(offset >> 32));
+    else if ( seg == x86_seg_tr )
+        /*
+         * The TSS is special in that accesses below the segment base are
+         * possible, as the Interrupt Redirection Bitmap starts 32 bytes
+         * ahead of the I/O Bitmap, regardless of the value of the latter.
+         */
+        assert((long)offset < 0 ? (long)offset > -32 : !(offset >> 17));
+    else
+        assert(is_x86_system_segment(seg) &&
+               (ctxt->lma ? offset <= 0x10007 : !(offset >> 16)));
 
     return data_read(ctxt, seg, "read", p_data, bytes);
 }
@@ -162,6 +173,13 @@  static int fuzz_insn_fetch(
 {
     assert(seg == x86_seg_cs);
 
+    /* Minimal segment limit checking, until full one is being put in place. */
+    if ( ctxt->addr_size < 64 && (offset >> 32) )
+    {
+        x86_emul_hw_exception(13, 0, ctxt);
+        return X86EMUL_EXCEPTION;
+    }
+
     /*
      * Zero-length instruction fetches are made at the destination of jumps,
      * to perform segmentation checks.  No data needs returning.
@@ -232,6 +250,7 @@  static int fuzz_rep_ins(
     struct x86_emulate_ctxt *ctxt)
 {
     assert(dst_seg == x86_seg_es);
+    assert(ctxt->addr_size == 64 || !(dst_offset >> 32));
 
     return _fuzz_rep_read(ctxt, "rep_ins", reps);
 }
@@ -247,6 +266,7 @@  static int fuzz_rep_movs(
 {
     assert(is_x86_user_segment(src_seg));
     assert(dst_seg == x86_seg_es);
+    assert(ctxt->addr_size == 64 || !((src_offset | dst_offset) >> 32));
 
     return _fuzz_rep_read(ctxt, "rep_movs", reps);
 }
@@ -260,6 +280,7 @@  static int fuzz_rep_outs(
     struct x86_emulate_ctxt *ctxt)
 {
     assert(is_x86_user_segment(src_seg));
+    assert(ctxt->addr_size == 64 || !(src_offset >> 32));
 
     return _fuzz_rep_write(ctxt, "rep_outs", reps);
 }
@@ -277,6 +298,7 @@  static int fuzz_rep_stos(
      * for CLZERO.
      */
     assert(is_x86_user_segment(seg));
+    assert(ctxt->addr_size == 64 || !(offset >> 32));
 
     return _fuzz_rep_write(ctxt, "rep_stos", reps);
 }
@@ -290,6 +312,7 @@  static int fuzz_write(
 {
     /* Writes not expected for any system segments. */
     assert(is_x86_user_segment(seg));
+    assert(ctxt->addr_size == 64 || !(offset >> 32));
 
     return maybe_fail(ctxt, "write", true);
 }
@@ -306,8 +329,10 @@  static int fuzz_cmpxchg(
      * Cmpxchg expected for user segments, and setting accessed/busy bits in
      * GDT/LDT enties, but not expected for any IDT or TR accesses.
      */
-    assert(is_x86_user_segment(seg) ||
-           seg == x86_seg_gdtr || seg == x86_seg_ldtr);
+    if ( is_x86_user_segment(seg) )
+        assert(ctxt->addr_size == 64 || !(offset >> 32));
+    else
+        assert((seg == x86_seg_gdtr || seg == x86_seg_ldtr) && !(offset >> 16));
 
     return maybe_fail(ctxt, "cmpxchg", true);
 }
@@ -319,6 +344,7 @@  static int fuzz_invlpg(
 {
     /* invlpg(), unlike all other hooks, may be called with x86_seg_none. */
     assert(is_x86_user_segment(seg) || seg == x86_seg_none);
+    assert(ctxt->addr_size == 64 || !(offset >> 32));
 
     return maybe_fail(ctxt, "invlpg", false);
 }
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1249,10 +1249,10 @@  static void __put_rep_prefix(
 
 /* Clip maximum repetitions so that the index register at most just wraps. */
 #define truncate_ea_and_reps(ea, reps, bytes_per_rep) ({                  \
-    unsigned long todo__, ea__ = truncate_word(ea, ad_bytes);             \
+    unsigned long todo__, ea__ = truncate_ea(ea);                         \
     if ( !(_regs.eflags & X86_EFLAGS_DF) )                                \
-        todo__ = truncate_word(-(ea), ad_bytes) / (bytes_per_rep);        \
-    else if ( truncate_word((ea) + (bytes_per_rep) - 1, ad_bytes) < ea__ )\
+        todo__ = truncate_ea(-ea__) / (bytes_per_rep);                    \
+    else if ( truncate_ea(ea__ + (bytes_per_rep) - 1) < ea__ )            \
         todo__ = 1;                                                       \
     else                                                                  \
         todo__ = ea__ / (bytes_per_rep) + 1;                              \
@@ -3128,6 +3128,7 @@  x86_emulate(
                     op_bytes + (((-src.val - 1) >> 3) & ~(op_bytes - 1L));
             else
                 ea.mem.off += (src.val >> 3) & ~(op_bytes - 1L);
+            ea.mem.off = truncate_ea(ea.mem.off);
         }
 
         /* Bit index always truncated to within range. */
@@ -3346,7 +3347,7 @@  x86_emulate(
         unsigned long src_val2;
         int lb, ub, idx;
         generate_exception_if(src.type != OP_MEM, EXC_UD);
-        if ( (rc = read_ulong(src.mem.seg, src.mem.off + op_bytes,
+        if ( (rc = read_ulong(src.mem.seg, truncate_ea(src.mem.off + op_bytes),
                               &src_val2, op_bytes, ctxt, ops)) )
             goto done;
         ub  = (op_bytes == 2) ? (int16_t)src_val2 : (int32_t)src_val2;
@@ -3897,7 +3898,7 @@  x86_emulate(
         seg = (b & 1) * 3; /* es = 0, ds = 3 */
     les:
         generate_exception_if(src.type != OP_MEM, EXC_UD);
-        if ( (rc = read_ulong(src.mem.seg, src.mem.off + src.bytes,
+        if ( (rc = read_ulong(src.mem.seg, truncate_ea(src.mem.off + src.bytes),
                               &dst.val, 2, ctxt, ops)) != X86EMUL_OKAY )
             goto done;
         ASSERT(is_x86_user_segment(seg));
@@ -4931,7 +4932,8 @@  x86_emulate(
         case 5: /* jmp (far, absolute indirect) */
             generate_exception_if(src.type != OP_MEM, EXC_UD);
 
-            if ( (rc = read_ulong(src.mem.seg, src.mem.off + op_bytes,
+            if ( (rc = read_ulong(src.mem.seg,
+                                  truncate_ea(src.mem.off + op_bytes),
                                   &imm2, 2, ctxt, ops)) )
                 goto done;
             imm1 = src.val;
@@ -5115,8 +5117,8 @@  x86_emulate(
             }
             if ( (rc = ops->write(ea.mem.seg, ea.mem.off, &sreg.limit,
                                   2, ctxt)) != X86EMUL_OKAY ||
-                 (rc = ops->write(ea.mem.seg, ea.mem.off + 2, &sreg.base,
-                                  op_bytes, ctxt)) != X86EMUL_OKAY )
+                 (rc = ops->write(ea.mem.seg, truncate_ea(ea.mem.off + 2),
+                                  &sreg.base, op_bytes, ctxt)) != X86EMUL_OKAY )
                 goto done;
             break;
         case 2: /* lgdt */
@@ -5125,9 +5127,9 @@  x86_emulate(
             generate_exception_if(ea.type != OP_MEM, EXC_UD);
             fail_if(ops->write_segment == NULL);
             memset(&sreg, 0, sizeof(sreg));
-            if ( (rc = read_ulong(ea.mem.seg, ea.mem.off+0,
+            if ( (rc = read_ulong(ea.mem.seg, ea.mem.off,
                                   &limit, 2, ctxt, ops)) ||
-                 (rc = read_ulong(ea.mem.seg, ea.mem.off+2,
+                 (rc = read_ulong(ea.mem.seg, truncate_ea(ea.mem.off + 2),
                                   &base, mode_64bit() ? 8 : 4, ctxt, ops)) )
                 goto done;
             generate_exception_if(!is_canonical_address(base), EXC_GP, 0);