[v2,12/13] fuzz/x86_emulate: Set and fuzz more CPU state

Message ID	20170925142648.25959-12-george.dunlap@citrix.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <xen-devel-bounces@lists.xen.org> From: George Dunlap <george.dunlap@citrix.com> To: <xen-devel@lists.xenproject.org> Date: Mon, 25 Sep 2017 15:26:47 +0100 Message-ID: <20170925142648.25959-12-george.dunlap@citrix.com> In-Reply-To: <20170925142648.25959-1-george.dunlap@citrix.com> References: <20170925142648.25959-1-george.dunlap@citrix.com> MIME-Version: 1.0 Cc: Ian Jackson <ian.jackson@citrix.com>, Wei Liu <wei.liu2@citrix.com>, George Dunlap <george.dunlap@citrix.com>, Jan Beulich <jbeulich@suse.com>, Andrew Cooper <andrew.cooper3@citrix.com> Subject: [Xen-devel] [PATCH v2 12/13] fuzz/x86_emulate: Set and fuzz more CPU state Precedence: list Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" <xen-devel-bounces@lists.xen.org>

George Dunlap Sept. 25, 2017, 2:26 p.m. UTC

x86_emulate() operates not only on state passed to it in
cpu_user_regs, but also on state currently found on the cpu: namely,
the FPU and XMM registers.  At the moment, we re-zero (and/or
re-initialize) cpu_user_regs on every invocation, but leave the
cpu-stored state alone.  In "persistent mode", this causes test cases
to behave differently -- sometimes significantly so -- depending on
which test cases have been run beforehand.

Zero out the state before each test run, and then fuzz it based on the
corpus input.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
v2: Rebase on top of previous changes

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c | 71 +++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

Jan Beulich Oct. 4, 2017, 8:28 a.m. UTC | #1

>>> On 25.09.17 at 16:26, <george.dunlap@citrix.com> wrote:
> @@ -597,6 +599,47 @@ static const struct x86_emulate_ops all_fuzzer_ops = {
>  };
>  #undef SET
>  
> +static void _set_fpu_state(char *fxsave, bool store)
> +{
> +    if ( cpu_has_fxsr )
> +    {
> +        static union __attribute__((__aligned__(16))) {
> +            char x[464];

The final part of the save area isn't being written, yes, but is it
really worth saving the few bytes of stack space here, rather than
having the expected 512 as array dimension?

> +            struct {
> +                uint32_t other[6];
> +                uint32_t mxcsr;
> +                uint32_t mxcsr_mask;
> +                /* ... */
> +            };
> +        } *fxs;
> +
> +        fxs = (typeof(fxs)) fxsave;
> +
> +        if ( store ) {

Style.

> +            char null[512] __attribute__((aligned(16))) = { 0 };

No need for the 0 and a blank line between declaration and statements
please.

> +            asm volatile(" fxrstor %0; "::"m"(*null));
> +            asm volatile(" fxrstor %0; "::"m"(*fxsave));

Style again - these want to follow the

    asm volatile ( "..." :: "m" (...) )

form. No need for the ; following the instructions.

> +        }
> +        
> +        asm volatile( "fxsave %0" : "=m" (*fxs) );

This is pretty confusing, the more with the different variable names
used which point to the same piece of memory. You basically store back
into the area you've read from. Is the caller expecting the memory area
to change? Is this being done other than for convenience to not have
another instance of scratch space on the stack? Some comment on the
intentions may be helpful here.

The function's parameter name being "store" adds to the confusion,
since what it controls is actually what we call "load" on x86 (or
"restore" following the insn mnemonics).

And then - what about YMM register state? Other more exotic registers
(like the BND* ones) may indeed not be that relevant to fuzz yet.

> @@ -737,6 +780,17 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
>              printf("Setting cpu_user_regs offset %x\n", offset);
>              continue;
>          }
> +        offset -= sizeof(struct cpu_user_regs);
> +
> +        /* Fuzz fxsave state */
> +        if ( offset < 128 )
> +        {
> +            if ( !input_read(s, s->fxsave + (offset * 4), 4) )
> +                return;
> +            printf("Setting fxsave offset %x\n", offset * 4);

What's this 32-bit granularity derived from?

> @@ -883,6 +937,9 @@ static void sanitize_state(struct x86_emulate_ctxt *ctxt)
>          s->segments[x86_seg_cs].db = 0;
>          s->segments[x86_seg_ss].db = 0;
>      }
> +
> +    /* Setting this value seems to cause crashes in fxrstor */
> +    *((unsigned int *)(s->fxsave) + 6) = 0;

That's the MXCSR field - instead of storing zero you want to mask with
mxcsr_mask. To avoid the ugly literal 6 (and to make clear what it is
that needs adjustment here) it may also be worthwhile widening the
scope of the type declared in _set_fpu_state() and use it for struct
fuzz_state's fxsave field.

Jan

George Dunlap Oct. 5, 2017, 5:08 p.m. UTC | #2

On 10/04/2017 09:28 AM, Jan Beulich wrote:
>>>> On 25.09.17 at 16:26, <george.dunlap@citrix.com> wrote:
>> @@ -597,6 +599,47 @@ static const struct x86_emulate_ops all_fuzzer_ops = {
>>  };
>>  #undef SET
>>  
>> +static void _set_fpu_state(char *fxsave, bool store)
>> +{
>> +    if ( cpu_has_fxsr )
>> +    {
>> +        static union __attribute__((__aligned__(16))) {
>> +            char x[464];
> 
> The final part of the save area isn't being written, yes, but is it
> really worth saving the few bytes of stack space here, rather than
> having the expected 512 as array dimension?

So I didn't actually look into this very much; I mainly just hacked at
it until it seemed to work.  I copied-and-pasted emul_test_init() from
x86_emulate.c (which is where the 464 came from), then copied some
scraps of asm from stackoverflow.

>> +            struct {
>> +                uint32_t other[6];
>> +                uint32_t mxcsr;
>> +                uint32_t mxcsr_mask;
>> +                /* ... */
>> +            };
>> +        } *fxs;
>> +
>> +        fxs = (typeof(fxs)) fxsave;
>> +
>> +        if ( store ) {
> 
> Style.
> 
>> +            char null[512] __attribute__((aligned(16))) = { 0 };
> 
> No need for the 0 and a blank line between declaration and statements
> please.
> 
>> +            asm volatile(" fxrstor %0; "::"m"(*null));
>> +            asm volatile(" fxrstor %0; "::"m"(*fxsave));
> 
> Style again - these want to follow the
> 
>     asm volatile ( "..." :: "m" (...) )
> 
> form. No need for the ; following the instructions.
>
>> +        }
>> +        
>> +        asm volatile( "fxsave %0" : "=m" (*fxs) );
> 
> This is pretty confusing, the more with the different variable names
> used which point to the same piece of memory. You basically store back
> into the area you've read from. Is the caller expecting the memory area
> to change? Is this being done other than for convenience to not have
> another instance of scratch space on the stack? Some comment on the
> intentions may be helpful here.

Yes, sorry for the different variable names.  I should have done a
better clean-up of this patch.

As for why it's doing an fxsave after just doing an fxrstor: I had the
idea that what came out via fxsave might not be the same as what was
written via fxrstor (i.e., the instruction would "interpret" the data),
particularly as what went in would be completely random fuzzed state.
The idea behind doing the restore / save was to "sanitize" the state in
the state struct to look more like real input data.

> The function's parameter name being "store" adds to the confusion,
> since what it controls is actually what we call "load" on x86 (or
> "restore" following the insn mnemonics).

I chose 'store' as the argument name before I realized that fxrstor was
"fx restore" and not "fxr store".

Do you think 'write' would be suitable?  Names like "restore" or "load"
make sense if you're thinking about things from the processor's
perspective (as the architects certainly were); but they make less sense
from a programmer's perspective, since (to me anyway) it seems like I'm
writing to or reading from the FPU unit (rather than loading/restoring
or saving).

If you don't like 'write' I'll change it to 'restore'.

> And then - what about YMM register state? Other more exotic registers
> (like the BND* ones) may indeed not be that relevant to fuzz yet.

I can look into that if you want, or if you want to give me some runes
to copy in I'm happy to do that as well.

>> @@ -737,6 +780,17 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
>>              printf("Setting cpu_user_regs offset %x\n", offset);
>>              continue;
>>          }
>> +        offset -= sizeof(struct cpu_user_regs);
>> +
>> +        /* Fuzz fxsave state */
>> +        if ( offset < 128 )
>> +        {
>> +            if ( !input_read(s, s->fxsave + (offset * 4), 4) )
>> +                return;
>> +            printf("Setting fxsave offset %x\n", offset * 4);
> 
> What's this 32-bit granularity derived from?

Just seemed like a good-sized chunk.  Doing it byte-by-byte seemed to be
"wasting" input on offsets (as in the input you'd have a 2-byte 'offset'
followed by a one-byte bit of data).  This way you have a 2-byte offset
and a 4-byte chunk of data that you write.

Let me know if you think there's a better size for chunks of data to
write.  In any case I'll add a comment in here to let people know that
the size is arbitrary.

>> @@ -883,6 +937,9 @@ static void sanitize_state(struct x86_emulate_ctxt *ctxt)
>>          s->segments[x86_seg_cs].db = 0;
>>          s->segments[x86_seg_ss].db = 0;
>>      }
>> +
>> +    /* Setting this value seems to cause crashes in fxrstor */
>> +    *((unsigned int *)(s->fxsave) + 6) = 0;
> 
> That's the MXCSR field - instead of storing zero you want to mask with
> mxcsr_mask. To avoid the ugly literal 6 (and to make clear what it is
> that needs adjustment here) it may also be worthwhile widening the
> scope of the type declared in _set_fpu_state() and use it for struct
> fuzz_state's fxsave field.

Got it.

 -George

Jan Beulich Oct. 6, 2017, 6:10 a.m. UTC | #3

>>> George Dunlap <george.dunlap@citrix.com> 10/05/17 7:08 PM >>>
>On 10/04/2017 09:28 AM, Jan Beulich wrote:
>>>>> On 25.09.17 at 16:26, <george.dunlap@citrix.com> wrote:
>>> @@ -597,6 +599,47 @@ static const struct x86_emulate_ops all_fuzzer_ops = {
>>>  };
>>>  #undef SET
>>>  
>>> +static void _set_fpu_state(char *fxsave, bool store)
>>> +{
>>> +    if ( cpu_has_fxsr )
>>> +    {
>>> +        static union __attribute__((__aligned__(16))) {
>>> +            char x[464];
>> 
>> The final part of the save area isn't being written, yes, but is it
>> really worth saving the few bytes of stack space here, rather than
>> having the expected 512 as array dimension?
>
>So I didn't actually look into this very much; I mainly just hacked at
>it until it seemed to work.  I copied-and-pasted emul_test_init() from
>x86_emulate.c (which is where the 464 came from), then copied some
>scraps of asm from stackoverflow.

Oh, so it looks like I'm guilty here, as I think it was me who wrote it that
way there. I have to admit I don't really see why I wanted to save on stack
consumption there. In any event I'm then fine for you to leave it that way,
so the two places remain in sync (but I would also be fine if you changed
it here, and I'd then try to remember to clean it up on the other side).

>>> +        }
>>> +        
>>> +        asm volatile( "fxsave %0" : "=m" (*fxs) );
>> 
>> This is pretty confusing, the more with the different variable names
>> used which point to the same piece of memory. You basically store back
>> into the area you've read from. Is the caller expecting the memory area
>> to change? Is this being done other than for convenience to not have
>> another instance of scratch space on the stack? Some comment on the
>> intentions may be helpful here.
>
>Yes, sorry for the different variable names.  I should have done a
>better clean-up of this patch.
>
>As for why it's doing an fxsave after just doing an fxrstor: I had the
>idea that what came out via fxsave might not be the same as what was
>written via fxrstor (i.e., the instruction would "interpret" the data),
>particularly as what went in would be completely random fuzzed state.
>The idea behind doing the restore / save was to "sanitize" the state in
>the state struct to look more like real input data.

Okay, that's what I had guessed. As said, please put this in a comment, the
more that you've realized this doesn't work all by itself (due to the MXCSR field
causing #GP when not sanitized _before_ doing the fxrstor). And the restore
from null then is to pre-init any (theoretical) fields the subsequent restore may
not touch at all?

>> The function's parameter name being "store" adds to the confusion,
>> since what it controls is actually what we call "load" on x86 (or
>> "restore" following the insn mnemonics).
>
>I chose 'store' as the argument name before I realized that fxrstor was
>"fx restore" and not "fxr store".
>
>Do you think 'write' would be suitable?  Names like "restore" or "load"
>make sense if you're thinking about things from the processor's
>perspective (as the architects certainly were); but they make less sense
>from a programmer's perspective, since (to me anyway) it seems like I'm
>writing to or reading from the FPU unit (rather than loading/restoring
>or saving).
>
>If you don't like 'write' I'll change it to 'restore'.

"write" is fine, I think, as would be "ro" or "readonly".

>> And then - what about YMM register state? Other more exotic registers
>> (like the BND* ones) may indeed not be that relevant to fuzz yet.
>
>I can look into that if you want, or if you want to give me some runes
>to copy in I'm happy to do that as well.

As that's not as simple as FXSAVE/FXRSTOR (due to first needing to
discover area sizes) it's perhaps best to simply leave a TODO comment for
now.

>>> @@ -737,6 +780,17 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
>>>              printf("Setting cpu_user_regs offset %x\n", offset);
>>>              continue;
>>>          }
>>> +        offset -= sizeof(struct cpu_user_regs);
>>> +
>>> +        /* Fuzz fxsave state */
>>> +        if ( offset < 128 )
>>> +        {
>>> +            if ( !input_read(s, s->fxsave + (offset * 4), 4) )
>>> +                return;
>>> +            printf("Setting fxsave offset %x\n", offset * 4);
>> 
>> What's this 32-bit granularity derived from?
>
>Just seemed like a good-sized chunk.  Doing it byte-by-byte seemed to be
>"wasting" input on offsets (as in the input you'd have a 2-byte 'offset'
>followed by a one-byte bit of data).  This way you have a 2-byte offset
>and a 4-byte chunk of data that you write.

Well, ideally individual pieces would be taken all-or-nothing, but due to the
varying sizes this would be rather cumbersome. So with the comment about
this being arbitrary add, I think this will be fine for the time being.

Jan

Jan Beulich Oct. 6, 2017, 9:57 a.m. UTC | #4

>>> On 05.10.17 at 19:08, <george.dunlap@citrix.com> wrote:
> On 10/04/2017 09:28 AM, Jan Beulich wrote:
>>>>> On 25.09.17 at 16:26, <george.dunlap@citrix.com> wrote:
>>> @@ -597,6 +599,47 @@ static const struct x86_emulate_ops all_fuzzer_ops = {
>>>  };
>>>  #undef SET
>>>  
>>> +static void _set_fpu_state(char *fxsave, bool store)
>>> +{
>>> +    if ( cpu_has_fxsr )
>>> +    {
>>> +        static union __attribute__((__aligned__(16))) {
>>> +            char x[464];
>> 
>> The final part of the save area isn't being written, yes, but is it
>> really worth saving the few bytes of stack space here, rather than
>> having the expected 512 as array dimension?
> 
> So I didn't actually look into this very much; I mainly just hacked at
> it until it seemed to work.  I copied-and-pasted emul_test_init() from
> x86_emulate.c (which is where the 464 came from), then copied some
> scraps of asm from stackoverflow.

One thing that came to mind in this context: It would perhaps be
useful to not waste input bytes on the unused portions of the
save area. Along those lines it may also be worth considering not
to waste input on the high halves of 64-bit registers as well as
the high 8 GPRs when emulating 32- or 16-bit mode.

Jan

George Dunlap Oct. 6, 2017, 10:50 a.m. UTC | #5

On 10/06/2017 10:57 AM, Jan Beulich wrote:
>>>> On 05.10.17 at 19:08, <george.dunlap@citrix.com> wrote:
>> On 10/04/2017 09:28 AM, Jan Beulich wrote:
>>>>>> On 25.09.17 at 16:26, <george.dunlap@citrix.com> wrote:
>>>> @@ -597,6 +599,47 @@ static const struct x86_emulate_ops all_fuzzer_ops = {
>>>>  };
>>>>  #undef SET
>>>>  
>>>> +static void _set_fpu_state(char *fxsave, bool store)
>>>> +{
>>>> +    if ( cpu_has_fxsr )
>>>> +    {
>>>> +        static union __attribute__((__aligned__(16))) {
>>>> +            char x[464];
>>>
>>> The final part of the save area isn't being written, yes, but is it
>>> really worth saving the few bytes of stack space here, rather than
>>> having the expected 512 as array dimension?
>>
>> So I didn't actually look into this very much; I mainly just hacked at
>> it until it seemed to work.  I copied-and-pasted emul_test_init() from
>> x86_emulate.c (which is where the 464 came from), then copied some
>> scraps of asm from stackoverflow.
> 
> One thing that came to mind in this context: It would perhaps be
> useful to not waste input bytes on the unused portions of the
> save area. Along those lines it may also be worth considering not
> to waste input on the high halves of 64-bit registers as well as
> the high 8 GPRs when emulating 32- or 16-bit mode.

Well with the 'compact' mode we're not wasting anything -- AFL may 'try'
to write to those areas (by setting the <offset> word to those areas),
but it will find that nothing happens and not generate any test cases there.

Even for the high ends of the GPRs, if garbage in those *did* cause a
crash in 32- or 16-bit mode, we'd definitely want to know, wouldn't we?
We'd want to fix it anyway, and we'd need to make sure there wasn't a
way for a guest to trigger that situation with the emulator.

 -George

George Dunlap Oct. 6, 2017, 10:53 a.m. UTC | #6

On 10/06/2017 07:10 AM, Jan Beulich wrote:
>>>> George Dunlap <george.dunlap@citrix.com> 10/05/17 7:08 PM >>>
>> On 10/04/2017 09:28 AM, Jan Beulich wrote:
>>>>>> On 25.09.17 at 16:26, <george.dunlap@citrix.com> wrote:
>>>> @@ -597,6 +599,47 @@ static const struct x86_emulate_ops all_fuzzer_ops = {
>>>>  };
>>>>  #undef SET
>>>>  
>>>> +static void _set_fpu_state(char *fxsave, bool store)
>>>> +{
>>>> +    if ( cpu_has_fxsr )
>>>> +    {
>>>> +        static union __attribute__((__aligned__(16))) {
>>>> +            char x[464];
>>>
>>> The final part of the save area isn't being written, yes, but is it
>>> really worth saving the few bytes of stack space here, rather than
>>> having the expected 512 as array dimension?
>>
>> So I didn't actually look into this very much; I mainly just hacked at
>> it until it seemed to work.  I copied-and-pasted emul_test_init() from
>> x86_emulate.c (which is where the 464 came from), then copied some
>> scraps of asm from stackoverflow.
> 
> Oh, so it looks like I'm guilty here, as I think it was me who wrote it that
> way there. I have to admit I don't really see why I wanted to save on stack
> consumption there. In any event I'm then fine for you to leave it that way,
> so the two places remain in sync (but I would also be fine if you changed
> it here, and I'd then try to remember to clean it up on the other side).

Well I don't think this function really looks much at all like
emul_test_init(); and I think it makes sense to keep this bit and the
"null" load below identical, so I'll change it to 512.

>>>> +        }
>>>> +        
>>>> +        asm volatile( "fxsave %0" : "=m" (*fxs) );
>>>
>>> This is pretty confusing, the more with the different variable names
>>> used which point to the same piece of memory. You basically store back
>>> into the area you've read from. Is the caller expecting the memory area
>>> to change? Is this being done other than for convenience to not have
>>> another instance of scratch space on the stack? Some comment on the
>>> intentions may be helpful here.
>>
>> Yes, sorry for the different variable names.  I should have done a
>> better clean-up of this patch.
>>
>> As for why it's doing an fxsave after just doing an fxrstor: I had the
>> idea that what came out via fxsave might not be the same as what was
>> written via fxrstor (i.e., the instruction would "interpret" the data),
>> particularly as what went in would be completely random fuzzed state.
>> The idea behind doing the restore / save was to "sanitize" the state in
>> the state struct to look more like real input data.
> 
> Okay, that's what I had guessed. As said, please put this in a comment, the
> more that you've realized this doesn't work all by itself (due to the MXCSR field
> causing #GP when not sanitized _before_ doing the fxrstor). And the restore
> from null then is to pre-init any (theoretical) fields the subsequent restore may
> not touch at all?

Yes; as I said, those two instructions were copied-and-pasted from
stackoverflow (or some other website); and I seem to recall them saying
that architecturally, if a certain amount of the "load data" was zero,
that the unit would simply do a full reset.

>>> The function's parameter name being "store" adds to the confusion,
>>> since what it controls is actually what we call "load" on x86 (or
>>> "restore" following the insn mnemonics).
>>
>> I chose 'store' as the argument name before I realized that fxrstor was
>> "fx restore" and not "fxr store".
>>
>> Do you think 'write' would be suitable?  Names like "restore" or "load"
>> make sense if you're thinking about things from the processor's
>> perspective (as the architects certainly were); but they make less sense
>>from a programmer's perspective, since (to me anyway) it seems like I'm
>> writing to or reading from the FPU unit (rather than loading/restoring
>> or saving).
>>
>> If you don't like 'write' I'll change it to 'restore'.
> 
> "write" is fine, I think, as would be "ro" or "readonly".
> 
>>> And then - what about YMM register state? Other more exotic registers
>>> (like the BND* ones) may indeed not be that relevant to fuzz yet.
>>
>> I can look into that if you want, or if you want to give me some runes
>> to copy in I'm happy to do that as well.
> 
> As that's not as simple as FXSAVE/FXRSTOR (due to first needing to
> discover area sizes) it's perhaps best to simply leave a TODO comment for
> now.

OK.

>>>> @@ -737,6 +780,17 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
>>>>              printf("Setting cpu_user_regs offset %x\n", offset);
>>>>              continue;
>>>>          }
>>>> +        offset -= sizeof(struct cpu_user_regs);
>>>> +
>>>> +        /* Fuzz fxsave state */
>>>> +        if ( offset < 128 )
>>>> +        {
>>>> +            if ( !input_read(s, s->fxsave + (offset * 4), 4) )
>>>> +                return;
>>>> +            printf("Setting fxsave offset %x\n", offset * 4);
>>>
>>> What's this 32-bit granularity derived from?
>>
>> Just seemed like a good-sized chunk.  Doing it byte-by-byte seemed to be
>> "wasting" input on offsets (as in the input you'd have a 2-byte 'offset'
>> followed by a one-byte bit of data).  This way you have a 2-byte offset
>> and a 4-byte chunk of data that you write.
> 
> Well, ideally individual pieces would be taken all-or-nothing, but due to the
> varying sizes this would be rather cumbersome. So with the comment about
> this being arbitrary add, I think this will be fine for the time being.

OK.

 -George

Jan Beulich Oct. 6, 2017, 11:53 a.m. UTC | #7

>>> On 06.10.17 at 12:50, <george.dunlap@citrix.com> wrote:
> On 10/06/2017 10:57 AM, Jan Beulich wrote:
>>>>> On 05.10.17 at 19:08, <george.dunlap@citrix.com> wrote:
>>> On 10/04/2017 09:28 AM, Jan Beulich wrote:
>>>>>>> On 25.09.17 at 16:26, <george.dunlap@citrix.com> wrote:
>>>>> @@ -597,6 +599,47 @@ static const struct x86_emulate_ops all_fuzzer_ops = {
>>>>>  };
>>>>>  #undef SET
>>>>>  
>>>>> +static void _set_fpu_state(char *fxsave, bool store)
>>>>> +{
>>>>> +    if ( cpu_has_fxsr )
>>>>> +    {
>>>>> +        static union __attribute__((__aligned__(16))) {
>>>>> +            char x[464];
>>>>
>>>> The final part of the save area isn't being written, yes, but is it
>>>> really worth saving the few bytes of stack space here, rather than
>>>> having the expected 512 as array dimension?
>>>
>>> So I didn't actually look into this very much; I mainly just hacked at
>>> it until it seemed to work.  I copied-and-pasted emul_test_init() from
>>> x86_emulate.c (which is where the 464 came from), then copied some
>>> scraps of asm from stackoverflow.
>> 
>> One thing that came to mind in this context: It would perhaps be
>> useful to not waste input bytes on the unused portions of the
>> save area. Along those lines it may also be worth considering not
>> to waste input on the high halves of 64-bit registers as well as
>> the high 8 GPRs when emulating 32- or 16-bit mode.
> 
> Well with the 'compact' mode we're not wasting anything -- AFL may 'try'
> to write to those areas (by setting the <offset> word to those areas),
> but it will find that nothing happens and not generate any test cases there.
> 
> Even for the high ends of the GPRs, if garbage in those *did* cause a
> crash in 32- or 16-bit mode, we'd definitely want to know, wouldn't we?
> We'd want to fix it anyway, and we'd need to make sure there wasn't a
> way for a guest to trigger that situation with the emulator.

Ah, indeed.

Jan

Jan Beulich Oct. 6, 2017, 11:56 a.m. UTC | #8

>>> On 25.09.17 at 16:26, <george.dunlap@citrix.com> wrote:
> @@ -597,6 +599,47 @@ static const struct x86_emulate_ops all_fuzzer_ops = {
>  };
>  #undef SET
>  
> +static void _set_fpu_state(char *fxsave, bool store)
> +{
> +    if ( cpu_has_fxsr )
> +    {
> +        static union __attribute__((__aligned__(16))) {
> +            char x[464];
> +            struct {
> +                uint32_t other[6];
> +                uint32_t mxcsr;
> +                uint32_t mxcsr_mask;
> +                /* ... */
> +            };
> +        } *fxs;
> +
> +        fxs = (typeof(fxs)) fxsave;
> +
> +        if ( store ) {
> +            char null[512] __attribute__((aligned(16))) = { 0 };
> +            asm volatile(" fxrstor %0; "::"m"(*null));
> +            asm volatile(" fxrstor %0; "::"m"(*fxsave));
> +        }
> +        
> +        asm volatile( "fxsave %0" : "=m" (*fxs) );
> +
> +        if ( fxs->mxcsr_mask )
> +            mxcsr_mask = fxs->mxcsr_mask;
> +        else
> +            mxcsr_mask = 0x000ffbf;

Actually - why is this necessary? I.e. why isn't emul_test_init()
setting mxcsr_mask sufficient?

Jan

George Dunlap Oct. 10, 2017, 3:45 p.m. UTC | #9

On 10/06/2017 12:56 PM, Jan Beulich wrote:
>>>> On 25.09.17 at 16:26, <george.dunlap@citrix.com> wrote:
>> @@ -597,6 +599,47 @@ static const struct x86_emulate_ops all_fuzzer_ops = {
>>  };
>>  #undef SET
>>  
>> +static void _set_fpu_state(char *fxsave, bool store)
>> +{
>> +    if ( cpu_has_fxsr )
>> +    {
>> +        static union __attribute__((__aligned__(16))) {
>> +            char x[464];
>> +            struct {
>> +                uint32_t other[6];
>> +                uint32_t mxcsr;
>> +                uint32_t mxcsr_mask;
>> +                /* ... */
>> +            };
>> +        } *fxs;
>> +
>> +        fxs = (typeof(fxs)) fxsave;
>> +
>> +        if ( store ) {
>> +            char null[512] __attribute__((aligned(16))) = { 0 };
>> +            asm volatile(" fxrstor %0; "::"m"(*null));
>> +            asm volatile(" fxrstor %0; "::"m"(*fxsave));
>> +        }
>> +        
>> +        asm volatile( "fxsave %0" : "=m" (*fxs) );
>> +
>> +        if ( fxs->mxcsr_mask )
>> +            mxcsr_mask = fxs->mxcsr_mask;
>> +        else
>> +            mxcsr_mask = 0x000ffbf;
> 
> Actually - why is this necessary? I.e. why isn't emul_test_init()
> setting mxcsr_mask sufficient?

This is me not understanding what's going on.  I've removed this bit,
and modified this function to do the 'sanitation' -- to mask off mxcsr
before doing the fxrstor (and removed the change from "sanitize_input").

 -George

[v2,12/13] fuzz/x86_emulate: Set and fuzz more CPU state

Commit Message

Comments

Patch