diff mbox

[v3] x86emul/fuzz: add rudimentary limit checking

Message ID 595E1D100200007800169103@prv-mh.provo.novell.com (mailing list archive)
State New, archived
Headers show

Commit Message

Jan Beulich July 6, 2017, 9:20 a.m. UTC
fuzz_insn_fetch() is the only data access helper where it is possible
to see offsets larger than 4Gb in 16- or 32-bit modes, as we leave the
incoming rIP untouched in the emulator itself. The check is needed here
as otherwise, after successfully fetching insn bytes, we may end up
zero-extending EIP soon after complete_insn, which collides with the
X86EMUL_EXCEPTION-conditional respective ASSERT() in
x86_emulate_wrapper(). (NB: put_rep_prefix() is what allows
complete_insn to be reached with rc set to other than X86EMUL_OKAY or
X86EMUL_DONE. See also commit 53f87c03b4 ["x86emul: generalize
exception handling for rep_* hooks"].)

Add assert()-s for all other (data) access routines, as effective
address generation in the emulator ought to guarantee in-range values.
For them to not trigger, several adjustments to the emulator's address
calculations are needed: While for DstBitBase it is really mandatory,
the specification allows for either behavior for two-part accesses.
Observed behavior on real hardware, however, is for such accesses to
silently wrap at the 2^^32 boundary in other than 64-bit mode, just
like they do at the 2^^64 boundary in 64-bit mode. While adding
truncate_ea() invocations there, also convert open coded instances of
it.

Reported-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: Add more truncate_ea().
v2: Correct system segment related assert()-s.

Comments

George Dunlap July 6, 2017, 10:57 a.m. UTC | #1
On 07/06/2017 10:20 AM, Jan Beulich wrote:
> fuzz_insn_fetch() is the only data access helper where it is possible
> to see offsets larger than 4Gb in 16- or 32-bit modes, as we leave the
> incoming rIP untouched in the emulator itself. The check is needed here
> as otherwise, after successfully fetching insn bytes, we may end up
> zero-extending EIP soon after complete_insn, which collides with the
> X86EMUL_EXCEPTION-conditional respective ASSERT() in
> x86_emulate_wrapper(). (NB: put_rep_prefix() is what allows
> complete_insn to be reached with rc set to other than X86EMUL_OKAY or
> X86EMUL_DONE. See also commit 53f87c03b4 ["x86emul: generalize
> exception handling for rep_* hooks"].)
> 
> Add assert()-s for all other (data) access routines, as effective
> address generation in the emulator ought to guarantee in-range values.
> For them to not trigger, several adjustments to the emulator's address
> calculations are needed: While for DstBitBase it is really mandatory,
> the specification allows for either behavior for two-part accesses.
> Observed behavior on real hardware, however, is for such accesses to
> silently wrap at the 2^^32 boundary in other than 64-bit mode, just
> like they do at the 2^^64 boundary in 64-bit mode. While adding
> truncate_ea() invocations there, also convert open coded instances of
> it.
> 
> Reported-by: George Dunlap <george.dunlap@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v3: Add more truncate_ea().
> v2: Correct system segment related assert()-s.

Still getting crashes in protmode_load_seg(), line 1824.  (See attached
for an example stack trace; but basically any place that calls
protmode_load_seg()).

 -George
Setting MSR i3 (c0000080) to 1ad40d032ceb49
Setting CR 4 to 22030d031aeb36
Setting CR 0 to ffff001ae95b0000
Canonicalized 0 to 0
Setting EFER_LMA
Disabling hook cmpxchg
Disabling hook rep_outs
Disabling hook rep_stos
Disabling hook read_io
Disabling hook write_io
Disabling hook write_cr
Disabling hook invlpg
Setting EFER_LMA
Setting EFER_LMA
 -- State -- 
addr / sp size: 16 / 16
 cr0: ffff001ae95b0001
 cr3: 0
 cr4: 22030d031aeb36
 rip: 0
Setting EFER_LMA
EFER: 1ad40d032cef49
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  c9
maybe_fail read: X86EMUL_OKAY
read:  03 ff
Emulation result: 0
Setting EFER_LMA
Setting EFER_LMA
 -- State -- 
addr / sp size: 16 / 16
 cr0: ffff001ae95b0001
 cr3: 0
 cr4: 22030d031aeb36
 rip: 1
Setting EFER_LMA
EFER: 1ad40d032cef49
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  f3
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  0f
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  05
Setting EFER_LMA
maybe_fail write_segment: X86EMUL_OKAY
maybe_fail write_segment: X86EMUL_OKAY
Emulation result: 0
Setting EFER_LMA
Setting EFER_LMA
 -- State -- 
addr / sp size: 64 / 64
 cr0: ffff001ae95b0001
 cr3: 0
 cr4: 22030d031aeb36
 rip: 0
Setting EFER_LMA
EFER: 1ad40d032cef49
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  4c
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  4c
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  4c
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  0f
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  ac
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  30
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  03
maybe_fail read: X86EMUL_OKAY
read:  4c 4c 2b 0d b6 80 18 c9
maybe_fail write: X86EMUL_OKAY
Emulation result: 0
Setting EFER_LMA
Setting EFER_LMA
 -- State -- 
addr / sp size: 64 / 64
 cr0: ffff001ae95b0001
 cr3: 0
 cr4: 22030d031aeb36
 rip: 7
Setting EFER_LMA
EFER: 1ad40d032cef49
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  ff
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  f3
maybe_fail write: X86EMUL_OKAY
Emulation result: 0
Setting EFER_LMA
Setting EFER_LMA
 -- State -- 
addr / sp size: 64 / 64
 cr0: ffff001ae95b0001
 cr3: 0
 cr4: 22030d031aeb36
 rip: 9
Setting EFER_LMA
EFER: 1ad40d032cef49
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  01
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  00
maybe_fail read: X86EMUL_OKAY
read:  64 4c 6a 4c
Emulation result: 0
Setting EFER_LMA
Setting EFER_LMA
 -- State -- 
addr / sp size: 64 / 64
 cr0: ffff001ae95b0001
 cr3: 0
 cr4: 22030d031aeb36
 rip: b
Setting EFER_LMA
EFER: 1ad40d032cef49
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  4c
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  0f
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  03
maybe_fail insn_fetch: X86EMUL_OKAY
insn_fetch:  d4
maybe_fail read: X86EMUL_OKAY
read:  00 00 00 4c 37 4c 4c 77
afl-harness: fuzz-emul.c:177: int fuzz_read(enum x86_segment, unsigned long, void *, unsigned int, struct x86_emulate_ctxt *): Assertion `is_x86_system_segment(seg) && !(offset >> 16)' failed.

Program received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff7a6e3fa in __GI_abort () at abort.c:89
#2  0x00007ffff7a65e37 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x443082 "is_x86_system_segment(seg) && !(offset >> 16)", file=file@entry=0x442fde "fuzz-emul.c", line=line@entry=177, function=function@entry=0x442fea "int fuzz_read(enum x86_segment, unsigned long, void *, unsigned int, struct x86_emulate_ctxt *)") at assert.c:92
#3  0x00007ffff7a65ee2 in __GI___assert_fail (assertion=0x443082 "is_x86_system_segment(seg) && !(offset >> 16)", file=0x442fde "fuzz-emul.c", line=177, function=0x442fea "int fuzz_read(enum x86_segment, unsigned long, void *, unsigned int, struct x86_emulate_ctxt *)") at assert.c:101
#4  0x0000000000403b7e in fuzz_read (seg=<optimized out>, offset=<optimized out>, p_data=0x8, bytes=4294955200, ctxt=0x0) at fuzz-emul.c:177
#5  0x0000000000441afa in protmode_load_seg (seg=<optimized out>, sel=<optimized out>, is_ret=<error reading variable: access outside bounds of object referenced via synthetic pointer>, sreg=<optimized out>, ctxt=<optimized out>, ops=<optimized out>) at ./x86_emulate/x86_emulate.c:1824
#6  0x000000000041888f in x86_emulate (ctxt=<optimized out>, ops=<optimized out>) at ./x86_emulate/x86_emulate.c:5238
#7  0x000000000044240b in x86_emulate_wrapper (ctxt=0x7fffffffe320, ops=0x7fffffffd0c0) at ./x86_emulate/x86_emulate.c:7921
#8  0x00000000004028b4 in runtest (state=<optimized out>, ctxt=<optimized out>) at fuzz-emul.c:911
#9  0x0000000000402ae4 in LLVMFuzzerTestOneInput (data_p=0x6571c0 <input> "\250\254\020\067\003\367\025\016\b", size=112) at fuzz-emul.c:949
#10 0x0000000000401418 in main (argc=<optimized out>, argv=<optimized out>) at afl-harness.c:108
Jan Beulich July 6, 2017, 12:34 p.m. UTC | #2
>>> On 06.07.17 at 12:57, <george.dunlap@citrix.com> wrote:
> On 07/06/2017 10:20 AM, Jan Beulich wrote:
>> fuzz_insn_fetch() is the only data access helper where it is possible
>> to see offsets larger than 4Gb in 16- or 32-bit modes, as we leave the
>> incoming rIP untouched in the emulator itself. The check is needed here
>> as otherwise, after successfully fetching insn bytes, we may end up
>> zero-extending EIP soon after complete_insn, which collides with the
>> X86EMUL_EXCEPTION-conditional respective ASSERT() in
>> x86_emulate_wrapper(). (NB: put_rep_prefix() is what allows
>> complete_insn to be reached with rc set to other than X86EMUL_OKAY or
>> X86EMUL_DONE. See also commit 53f87c03b4 ["x86emul: generalize
>> exception handling for rep_* hooks"].)
>> 
>> Add assert()-s for all other (data) access routines, as effective
>> address generation in the emulator ought to guarantee in-range values.
>> For them to not trigger, several adjustments to the emulator's address
>> calculations are needed: While for DstBitBase it is really mandatory,
>> the specification allows for either behavior for two-part accesses.
>> Observed behavior on real hardware, however, is for such accesses to
>> silently wrap at the 2^^32 boundary in other than 64-bit mode, just
>> like they do at the 2^^64 boundary in 64-bit mode. While adding
>> truncate_ea() invocations there, also convert open coded instances of
>> it.
>> 
>> Reported-by: George Dunlap <george.dunlap@citrix.com>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> v3: Add more truncate_ea().
>> v2: Correct system segment related assert()-s.
> 
> Still getting crashes in protmode_load_seg(), line 1824.  (See attached
> for an example stack trace; but basically any place that calls
> protmode_load_seg()).

Ah, this is one I indeed forgot about. We shouldn't deal with this in
the emulator though, so slightly relaxing the assert() seems like the
only option: We'd need to permit reads up to 0x10007 instead of
0xffff (which would never pass limit checks).

Jan
George Dunlap July 6, 2017, 2:02 p.m. UTC | #3
On 07/06/2017 01:34 PM, Jan Beulich wrote:
>>>> On 06.07.17 at 12:57, <george.dunlap@citrix.com> wrote:
>> On 07/06/2017 10:20 AM, Jan Beulich wrote:
>>> fuzz_insn_fetch() is the only data access helper where it is possible
>>> to see offsets larger than 4Gb in 16- or 32-bit modes, as we leave the
>>> incoming rIP untouched in the emulator itself. The check is needed here
>>> as otherwise, after successfully fetching insn bytes, we may end up
>>> zero-extending EIP soon after complete_insn, which collides with the
>>> X86EMUL_EXCEPTION-conditional respective ASSERT() in
>>> x86_emulate_wrapper(). (NB: put_rep_prefix() is what allows
>>> complete_insn to be reached with rc set to other than X86EMUL_OKAY or
>>> X86EMUL_DONE. See also commit 53f87c03b4 ["x86emul: generalize
>>> exception handling for rep_* hooks"].)
>>>
>>> Add assert()-s for all other (data) access routines, as effective
>>> address generation in the emulator ought to guarantee in-range values.
>>> For them to not trigger, several adjustments to the emulator's address
>>> calculations are needed: While for DstBitBase it is really mandatory,
>>> the specification allows for either behavior for two-part accesses.
>>> Observed behavior on real hardware, however, is for such accesses to
>>> silently wrap at the 2^^32 boundary in other than 64-bit mode, just
>>> like they do at the 2^^64 boundary in 64-bit mode. While adding
>>> truncate_ea() invocations there, also convert open coded instances of
>>> it.
>>>
>>> Reported-by: George Dunlap <george.dunlap@citrix.com>
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>> ---
>>> v3: Add more truncate_ea().
>>> v2: Correct system segment related assert()-s.
>>
>> Still getting crashes in protmode_load_seg(), line 1824.  (See attached
>> for an example stack trace; but basically any place that calls
>> protmode_load_seg()).
> 
> Ah, this is one I indeed forgot about. We shouldn't deal with this in
> the emulator though, so slightly relaxing the assert() seems like the
> only option: We'd need to permit reads up to 0x10007 instead of
> 0xffff (which would never pass limit checks).

Replacing !(offset >> 16) with (offset <= 0x10007) makes all the current
crash cases I have pass.

If you want I can submit this patch, modified, with my series of afl
fixes / changes.

 -George
diff mbox

Patch

--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
+++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
@@ -139,7 +139,17 @@  static int fuzz_read(
     struct x86_emulate_ctxt *ctxt)
 {
     /* Reads expected for all user and system segments. */
-    assert(is_x86_user_segment(seg) || is_x86_system_segment(seg));
+    if ( is_x86_user_segment(seg) )
+        assert(ctxt->addr_size == 64 || !(offset >> 32));
+    else if ( seg == x86_seg_tr )
+        /*
+         * The TSS is special in that accesses below the segment base are
+         * possible, as the Interrupt Redirection Bitmap starts 32 bytes
+         * ahead of the I/O Bitmap, regardless of the value of the latter.
+         */
+        assert((long)offset < 0 ? (long)offset > -32 : !(offset >> 17));
+    else
+        assert(is_x86_system_segment(seg) && !(offset >> 16));
 
     return data_read(ctxt, seg, "read", p_data, bytes);
 }
@@ -162,6 +172,13 @@  static int fuzz_insn_fetch(
 {
     assert(seg == x86_seg_cs);
 
+    /* Minimal segment limit checking, until full one is being put in place. */
+    if ( ctxt->addr_size < 64 && (offset >> 32) )
+    {
+        x86_emul_hw_exception(13, 0, ctxt);
+        return X86EMUL_EXCEPTION;
+    }
+
     /*
      * Zero-length instruction fetches are made at the destination of jumps,
      * to perform segmentation checks.  No data needs returning.
@@ -232,6 +249,7 @@  static int fuzz_rep_ins(
     struct x86_emulate_ctxt *ctxt)
 {
     assert(dst_seg == x86_seg_es);
+    assert(ctxt->addr_size == 64 || !(dst_offset >> 32));
 
     return _fuzz_rep_read(ctxt, "rep_ins", reps);
 }
@@ -247,6 +265,7 @@  static int fuzz_rep_movs(
 {
     assert(is_x86_user_segment(src_seg));
     assert(dst_seg == x86_seg_es);
+    assert(ctxt->addr_size == 64 || !((src_offset | dst_offset) >> 32));
 
     return _fuzz_rep_read(ctxt, "rep_movs", reps);
 }
@@ -260,6 +279,7 @@  static int fuzz_rep_outs(
     struct x86_emulate_ctxt *ctxt)
 {
     assert(is_x86_user_segment(src_seg));
+    assert(ctxt->addr_size == 64 || !(src_offset >> 32));
 
     return _fuzz_rep_write(ctxt, "rep_outs", reps);
 }
@@ -277,6 +297,7 @@  static int fuzz_rep_stos(
      * for CLZERO.
      */
     assert(is_x86_user_segment(seg));
+    assert(ctxt->addr_size == 64 || !(offset >> 32));
 
     return _fuzz_rep_write(ctxt, "rep_stos", reps);
 }
@@ -290,6 +311,7 @@  static int fuzz_write(
 {
     /* Writes not expected for any system segments. */
     assert(is_x86_user_segment(seg));
+    assert(ctxt->addr_size == 64 || !(offset >> 32));
 
     return maybe_fail(ctxt, "write", true);
 }
@@ -306,8 +328,10 @@  static int fuzz_cmpxchg(
      * Cmpxchg expected for user segments, and setting accessed/busy bits in
      * GDT/LDT enties, but not expected for any IDT or TR accesses.
      */
-    assert(is_x86_user_segment(seg) ||
-           seg == x86_seg_gdtr || seg == x86_seg_ldtr);
+    if ( is_x86_user_segment(seg) )
+        assert(ctxt->addr_size == 64 || !(offset >> 32));
+    else
+        assert((seg == x86_seg_gdtr || seg == x86_seg_ldtr) && !(offset >> 16));
 
     return maybe_fail(ctxt, "cmpxchg", true);
 }
@@ -319,6 +343,7 @@  static int fuzz_invlpg(
 {
     /* invlpg(), unlike all other hooks, may be called with x86_seg_none. */
     assert(is_x86_user_segment(seg) || seg == x86_seg_none);
+    assert(ctxt->addr_size == 64 || !(offset >> 32));
 
     return maybe_fail(ctxt, "invlpg", false);
 }
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1249,10 +1249,10 @@  static void __put_rep_prefix(
 
 /* Clip maximum repetitions so that the index register at most just wraps. */
 #define truncate_ea_and_reps(ea, reps, bytes_per_rep) ({                  \
-    unsigned long todo__, ea__ = truncate_word(ea, ad_bytes);             \
+    unsigned long todo__, ea__ = truncate_ea(ea);                         \
     if ( !(_regs.eflags & X86_EFLAGS_DF) )                                \
-        todo__ = truncate_word(-(ea), ad_bytes) / (bytes_per_rep);        \
-    else if ( truncate_word((ea) + (bytes_per_rep) - 1, ad_bytes) < ea__ )\
+        todo__ = truncate_ea(-ea__) / (bytes_per_rep);                    \
+    else if ( truncate_ea(ea__ + (bytes_per_rep) - 1) < ea__ )            \
         todo__ = 1;                                                       \
     else                                                                  \
         todo__ = ea__ / (bytes_per_rep) + 1;                              \
@@ -3128,6 +3128,7 @@  x86_emulate(
                     op_bytes + (((-src.val - 1) >> 3) & ~(op_bytes - 1L));
             else
                 ea.mem.off += (src.val >> 3) & ~(op_bytes - 1L);
+            ea.mem.off = truncate_ea(ea.mem.off);
         }
 
         /* Bit index always truncated to within range. */
@@ -3346,7 +3347,7 @@  x86_emulate(
         unsigned long src_val2;
         int lb, ub, idx;
         generate_exception_if(src.type != OP_MEM, EXC_UD);
-        if ( (rc = read_ulong(src.mem.seg, src.mem.off + op_bytes,
+        if ( (rc = read_ulong(src.mem.seg, truncate_ea(src.mem.off + op_bytes),
                               &src_val2, op_bytes, ctxt, ops)) )
             goto done;
         ub  = (op_bytes == 2) ? (int16_t)src_val2 : (int32_t)src_val2;
@@ -3897,7 +3898,7 @@  x86_emulate(
         seg = (b & 1) * 3; /* es = 0, ds = 3 */
     les:
         generate_exception_if(src.type != OP_MEM, EXC_UD);
-        if ( (rc = read_ulong(src.mem.seg, src.mem.off + src.bytes,
+        if ( (rc = read_ulong(src.mem.seg, truncate_ea(src.mem.off + src.bytes),
                               &dst.val, 2, ctxt, ops)) != X86EMUL_OKAY )
             goto done;
         ASSERT(is_x86_user_segment(seg));
@@ -4931,7 +4932,8 @@  x86_emulate(
         case 5: /* jmp (far, absolute indirect) */
             generate_exception_if(src.type != OP_MEM, EXC_UD);
 
-            if ( (rc = read_ulong(src.mem.seg, src.mem.off + op_bytes,
+            if ( (rc = read_ulong(src.mem.seg,
+                                  truncate_ea(src.mem.off + op_bytes),
                                   &imm2, 2, ctxt, ops)) )
                 goto done;
             imm1 = src.val;
@@ -5115,8 +5117,8 @@  x86_emulate(
             }
             if ( (rc = ops->write(ea.mem.seg, ea.mem.off, &sreg.limit,
                                   2, ctxt)) != X86EMUL_OKAY ||
-                 (rc = ops->write(ea.mem.seg, ea.mem.off + 2, &sreg.base,
-                                  op_bytes, ctxt)) != X86EMUL_OKAY )
+                 (rc = ops->write(ea.mem.seg, truncate_ea(ea.mem.off + 2),
+                                  &sreg.base, op_bytes, ctxt)) != X86EMUL_OKAY )
                 goto done;
             break;
         case 2: /* lgdt */
@@ -5125,9 +5127,9 @@  x86_emulate(
             generate_exception_if(ea.type != OP_MEM, EXC_UD);
             fail_if(ops->write_segment == NULL);
             memset(&sreg, 0, sizeof(sreg));
-            if ( (rc = read_ulong(ea.mem.seg, ea.mem.off+0,
+            if ( (rc = read_ulong(ea.mem.seg, ea.mem.off,
                                   &limit, 2, ctxt, ops)) ||
-                 (rc = read_ulong(ea.mem.seg, ea.mem.off+2,
+                 (rc = read_ulong(ea.mem.seg, truncate_ea(ea.mem.off + 2),
                                   &base, mode_64bit() ? 8 : 4, ctxt, ops)) )
                 goto done;
             generate_exception_if(!is_canonical_address(base), EXC_GP, 0);