[v3,09/12] fuzz/x86_emulate: Make input more compact

Message ID	20171010162011.9629-9-george.dunlap@citrix.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <xen-devel-bounces@lists.xen.org> From: George Dunlap <george.dunlap@citrix.com> To: <xen-devel@lists.xenproject.org> Date: Tue, 10 Oct 2017 17:20:08 +0100 Message-ID: <20171010162011.9629-9-george.dunlap@citrix.com> In-Reply-To: <20171010162011.9629-1-george.dunlap@citrix.com> References: <20171010162011.9629-1-george.dunlap@citrix.com> MIME-Version: 1.0 Cc: Ian Jackson <ian.jackson@citrix.com>, Wei Liu <wei.liu2@citrix.com>, George Dunlap <george.dunlap@citrix.com>, Jan Beulich <jbeulich@suse.com>, Andrew Cooper <andrew.cooper3@citrix.com> Subject: [Xen-devel] [PATCH v3 09/12] fuzz/x86_emulate: Make input more compact Precedence: list Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" <xen-devel-bounces@lists.xen.org>

George Dunlap Oct. 10, 2017, 4:20 p.m. UTC

At the moment, AFL reckons that for any given input, 87% of it is
completely irrelevant: that is, it can change it as much as it wants
but have no impact on the result of the test; and yet it can't remove
it.

This is largely because we interpret the blob handed to us as a large
struct, including CR values, MSR values, segment registers, and a full
cpu_user_regs.

Instead, modify our interpretation to have a "set state" stanza at the
front.  Begin by reading a 16-bit value; if it is lower than a certain
threshold, set some state according to what byte it is, and repeat.
Continue until the byte is above a certain threshold.

This allows AFL to compact any given test case much smaller; to the
point where now it reckons there is not a single byte of the test file
which becomes irrelevant.  Testing have shown that this option both
allows AFL to reach coverage much faster, and to have a total coverage
higher than with the old format.

Make this an option (rather than a unilateral change) to enable
side-by-side performance comparison of the old and new formats.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
v3:
 - Set default for opt_compact statically in fuzz-emul.c rather than afl-harness
 - Make input size / unconditional input reading more consistent
 - Require only minimum input read, not first instruction byte
 - Use ARRAY_SIZE() for hardcoded values
 - Use `for ( ; ; )` rather than `while(1)`
 - Some style issues
 - Move opt_compact declaration into fuzz-emul.h
v2: Port over previous changes

CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 tools/fuzz/x86_instruction_emulator/afl-harness.c |  9 ++-
 tools/fuzz/x86_instruction_emulator/fuzz-emul.c   | 96 ++++++++++++++++++++---
 tools/fuzz/x86_instruction_emulator/fuzz-emul.h   |  2 +
 3 files changed, 96 insertions(+), 11 deletions(-)

Andrew Cooper Oct. 10, 2017, 4:59 p.m. UTC | #1

On 10/10/17 17:20, George Dunlap wrote:
> At the moment, AFL reckons that for any given input, 87% of it is
> completely irrelevant: that is, it can change it as much as it wants
> but have no impact on the result of the test; and yet it can't remove
> it.
>
> This is largely because we interpret the blob handed to us as a large
> struct, including CR values, MSR values, segment registers, and a full
> cpu_user_regs.
>
> Instead, modify our interpretation to have a "set state" stanza at the
> front.  Begin by reading a 16-bit value; if it is lower than a certain
> threshold, set some state according to what byte it is, and repeat.
> Continue until the byte is above a certain threshold.
>
> This allows AFL to compact any given test case much smaller; to the
> point where now it reckons there is not a single byte of the test file
> which becomes irrelevant.  Testing have shown that this option both
> allows AFL to reach coverage much faster, and to have a total coverage
> higher than with the old format.
>
> Make this an option (rather than a unilateral change) to enable
> side-by-side performance comparison of the old and new formats.
>
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>

I am still of the opinion that this is a waste of effort, which would be
better spent actually removing the irrelevant state in the first place;
not building an obfuscation algorithm.

I'm not going to nack the patch because that is probably over the top,
but I'm not in favour if this change going in.

~Andrew

George Dunlap Oct. 10, 2017, 5:01 p.m. UTC | #2

On 10/10/2017 05:59 PM, Andrew Cooper wrote:
> On 10/10/17 17:20, George Dunlap wrote:
>> At the moment, AFL reckons that for any given input, 87% of it is
>> completely irrelevant: that is, it can change it as much as it wants
>> but have no impact on the result of the test; and yet it can't remove
>> it.
>>
>> This is largely because we interpret the blob handed to us as a large
>> struct, including CR values, MSR values, segment registers, and a full
>> cpu_user_regs.
>>
>> Instead, modify our interpretation to have a "set state" stanza at the
>> front.  Begin by reading a 16-bit value; if it is lower than a certain
>> threshold, set some state according to what byte it is, and repeat.
>> Continue until the byte is above a certain threshold.
>>
>> This allows AFL to compact any given test case much smaller; to the
>> point where now it reckons there is not a single byte of the test file
>> which becomes irrelevant.  Testing have shown that this option both
>> allows AFL to reach coverage much faster, and to have a total coverage
>> higher than with the old format.
>>
>> Make this an option (rather than a unilateral change) to enable
>> side-by-side performance comparison of the old and new formats.
>>
>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
> 
> I am still of the opinion that this is a waste of effort, which would be
> better spent actually removing the irrelevant state in the first place;
> not building an obfuscation algorithm.
> 
> I'm not going to nack the patch because that is probably over the top,
> but I'm not in favour if this change going in.

Did you look at the evidence I presented, demonstrating that this
significantly increases the effectiveness of AFL?

 -George

Andrew Cooper Oct. 10, 2017, 5:11 p.m. UTC | #3

On 10/10/17 18:01, George Dunlap wrote:
> On 10/10/2017 05:59 PM, Andrew Cooper wrote:
>> On 10/10/17 17:20, George Dunlap wrote:
>>> At the moment, AFL reckons that for any given input, 87% of it is
>>> completely irrelevant: that is, it can change it as much as it wants
>>> but have no impact on the result of the test; and yet it can't remove
>>> it.
>>>
>>> This is largely because we interpret the blob handed to us as a large
>>> struct, including CR values, MSR values, segment registers, and a full
>>> cpu_user_regs.
>>>
>>> Instead, modify our interpretation to have a "set state" stanza at the
>>> front.  Begin by reading a 16-bit value; if it is lower than a certain
>>> threshold, set some state according to what byte it is, and repeat.
>>> Continue until the byte is above a certain threshold.
>>>
>>> This allows AFL to compact any given test case much smaller; to the
>>> point where now it reckons there is not a single byte of the test file
>>> which becomes irrelevant.  Testing have shown that this option both
>>> allows AFL to reach coverage much faster, and to have a total coverage
>>> higher than with the old format.
>>>
>>> Make this an option (rather than a unilateral change) to enable
>>> side-by-side performance comparison of the old and new formats.
>>>
>>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>> I am still of the opinion that this is a waste of effort, which would be
>> better spent actually removing the irrelevant state in the first place;
>> not building an obfuscation algorithm.
>>
>> I'm not going to nack the patch because that is probably over the top,
>> but I'm not in favour if this change going in.
> Did you look at the evidence I presented, demonstrating that this
> significantly increases the effectiveness of AFL?

I can easily believe that you've found an obfucation algorithm which
does better than the current state layout.

I do not believe that any amount of obfuscation will be better than
actually fixing the root cause of the problem; that the current state
really is mostly irrelevant, and can easily be shrunk.

~Andrew

George Dunlap Oct. 10, 2017, 5:13 p.m. UTC | #4

On 10/10/2017 06:11 PM, Andrew Cooper wrote:
> On 10/10/17 18:01, George Dunlap wrote:
>> On 10/10/2017 05:59 PM, Andrew Cooper wrote:
>>> On 10/10/17 17:20, George Dunlap wrote:
>>>> At the moment, AFL reckons that for any given input, 87% of it is
>>>> completely irrelevant: that is, it can change it as much as it wants
>>>> but have no impact on the result of the test; and yet it can't remove
>>>> it.
>>>>
>>>> This is largely because we interpret the blob handed to us as a large
>>>> struct, including CR values, MSR values, segment registers, and a full
>>>> cpu_user_regs.
>>>>
>>>> Instead, modify our interpretation to have a "set state" stanza at the
>>>> front.  Begin by reading a 16-bit value; if it is lower than a certain
>>>> threshold, set some state according to what byte it is, and repeat.
>>>> Continue until the byte is above a certain threshold.
>>>>
>>>> This allows AFL to compact any given test case much smaller; to the
>>>> point where now it reckons there is not a single byte of the test file
>>>> which becomes irrelevant.  Testing have shown that this option both
>>>> allows AFL to reach coverage much faster, and to have a total coverage
>>>> higher than with the old format.
>>>>
>>>> Make this an option (rather than a unilateral change) to enable
>>>> side-by-side performance comparison of the old and new formats.
>>>>
>>>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>>> I am still of the opinion that this is a waste of effort, which would be
>>> better spent actually removing the irrelevant state in the first place;
>>> not building an obfuscation algorithm.
>>>
>>> I'm not going to nack the patch because that is probably over the top,
>>> but I'm not in favour if this change going in.
>> Did you look at the evidence I presented, demonstrating that this
>> significantly increases the effectiveness of AFL?
> 
> I can easily believe that you've found an obfucation algorithm which
> does better than the current state layout.
> 
> I do not believe that any amount of obfuscation will be better than
> actually fixing the root cause of the problem; that the current state
> really is mostly irrelevant, and can easily be shrunk.

Right; well I've already explained why I don't think "obfuscation" is
the right term.  For the time being, we have something which improves
efficiency; let's check it in now, and in the future if you or someone
else finds a way to fix it "properly" we can do that.

 -George

Ian Jackson Oct. 10, 2017, 5:26 p.m. UTC | #5

George Dunlap writes ("[PATCH v3 09/12] fuzz/x86_emulate: Make input more compact"):
> At the moment, AFL reckons that for any given input, 87% of it is
> completely irrelevant: that is, it can change it as much as it wants
> but have no impact on the result of the test; and yet it can't remove
> it.
> 
> This is largely because we interpret the blob handed to us as a large
> struct, including CR values, MSR values, segment registers, and a full
> cpu_user_regs.
> 
> Instead, modify our interpretation to have a "set state" stanza at the
> front.  Begin by reading a 16-bit value; if it is lower than a certain
> threshold, set some state according to what byte it is, and repeat.
> Continue until the byte is above a certain threshold.
> 
> This allows AFL to compact any given test case much smaller; to the
> point where now it reckons there is not a single byte of the test file
> which becomes irrelevant.  Testing have shown that this option both
> allows AFL to reach coverage much faster, and to have a total coverage
> higher than with the old format.

This is basically a compression scheme.  How odd that it should help.

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

Andrew Cooper Oct. 10, 2017, 5:31 p.m. UTC | #6

On 10/10/17 18:13, George Dunlap wrote:
> On 10/10/2017 06:11 PM, Andrew Cooper wrote:
>> On 10/10/17 18:01, George Dunlap wrote:
>>> On 10/10/2017 05:59 PM, Andrew Cooper wrote:
>>>> On 10/10/17 17:20, George Dunlap wrote:
>>>>> At the moment, AFL reckons that for any given input, 87% of it is
>>>>> completely irrelevant: that is, it can change it as much as it wants
>>>>> but have no impact on the result of the test; and yet it can't remove
>>>>> it.
>>>>>
>>>>> This is largely because we interpret the blob handed to us as a large
>>>>> struct, including CR values, MSR values, segment registers, and a full
>>>>> cpu_user_regs.
>>>>>
>>>>> Instead, modify our interpretation to have a "set state" stanza at the
>>>>> front.  Begin by reading a 16-bit value; if it is lower than a certain
>>>>> threshold, set some state according to what byte it is, and repeat.
>>>>> Continue until the byte is above a certain threshold.
>>>>>
>>>>> This allows AFL to compact any given test case much smaller; to the
>>>>> point where now it reckons there is not a single byte of the test file
>>>>> which becomes irrelevant.  Testing have shown that this option both
>>>>> allows AFL to reach coverage much faster, and to have a total coverage
>>>>> higher than with the old format.
>>>>>
>>>>> Make this an option (rather than a unilateral change) to enable
>>>>> side-by-side performance comparison of the old and new formats.
>>>>>
>>>>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
>>>> I am still of the opinion that this is a waste of effort, which would be
>>>> better spent actually removing the irrelevant state in the first place;
>>>> not building an obfuscation algorithm.
>>>>
>>>> I'm not going to nack the patch because that is probably over the top,
>>>> but I'm not in favour if this change going in.
>>> Did you look at the evidence I presented, demonstrating that this
>>> significantly increases the effectiveness of AFL?
>> I can easily believe that you've found an obfucation algorithm which
>> does better than the current state layout.
>>
>> I do not believe that any amount of obfuscation will be better than
>> actually fixing the root cause of the problem; that the current state
>> really is mostly irrelevant, and can easily be shrunk.
> Right; well I've already explained why I don't think "obfuscation" is
> the right term.

How else would describe it?  You are purposefully hiding the structure
of the data by doing conditional initialisation based on earlier values.

> For the time being, we have something which improves
> efficiency;

How much of this measured efficiently is actually ALF finding paths
around setup_state() rather than finding new paths in the emulator
itself?  I can't think of a test which would isolate this properly.

> let's check it in now, and in the future if you or someone
> else finds a way to fix it "properly" we can do that.

I wouldn't really mind so much if this change didn't make it harder to
fix the root cause.  As it is, your prerequisite patch prohibits any
ability to overlay a minimal structure over the fuzzing corpus.

~Andrew

George Dunlap Oct. 10, 2017, 6:57 p.m. UTC | #7

> On Oct 10, 2017, at 6:26 PM, Ian Jackson <ian.jackson@eu.citrix.com> wrote:

> 

> George Dunlap writes ("[PATCH v3 09/12] fuzz/x86_emulate: Make input more compact"):

>> At the moment, AFL reckons that for any given input, 87% of it is

>> completely irrelevant: that is, it can change it as much as it wants

>> but have no impact on the result of the test; and yet it can't remove

>> it.

>> 

>> This is largely because we interpret the blob handed to us as a large

>> struct, including CR values, MSR values, segment registers, and a full

>> cpu_user_regs.

>> 

>> Instead, modify our interpretation to have a "set state" stanza at the

>> front.  Begin by reading a 16-bit value; if it is lower than a certain

>> threshold, set some state according to what byte it is, and repeat.

>> Continue until the byte is above a certain threshold.

>> 

>> This allows AFL to compact any given test case much smaller; to the

>> point where now it reckons there is not a single byte of the test file

>> which becomes irrelevant.  Testing have shown that this option both

>> allows AFL to reach coverage much faster, and to have a total coverage

>> higher than with the old format.

> 

> This is basically a compression scheme.  How odd that it should help.


Well I’m pretty sure the size of the input file is more or less the precise cause for the difference in speed: Fuzzing a 32-byte file is just a lot faster than fuzzing a 1-k file.  Running them side by side makes the effect more obvious — I’ll show you tomorrow if you’re interested.

Since the file size is the direct cause of the speed difference, having a “compressed” file will naturally make things faster.


> 

> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>


Thanks.

 -George

George Dunlap Oct. 10, 2017, 8:55 p.m. UTC | #8

> On Oct 10, 2017, at 6:31 PM, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:

> 

> On 10/10/17 18:13, George Dunlap wrote:

>> On 10/10/2017 06:11 PM, Andrew Cooper wrote:

>>> On 10/10/17 18:01, George Dunlap wrote:

>>>> On 10/10/2017 05:59 PM, Andrew Cooper wrote:

>>>>> On 10/10/17 17:20, George Dunlap wrote:

>>>>>> At the moment, AFL reckons that for any given input, 87% of it is

>>>>>> completely irrelevant: that is, it can change it as much as it wants

>>>>>> but have no impact on the result of the test; and yet it can't remove

>>>>>> it.

>>>>>> 

>>>>>> This is largely because we interpret the blob handed to us as a large

>>>>>> struct, including CR values, MSR values, segment registers, and a full

>>>>>> cpu_user_regs.

>>>>>> 

>>>>>> Instead, modify our interpretation to have a "set state" stanza at the

>>>>>> front.  Begin by reading a 16-bit value; if it is lower than a certain

>>>>>> threshold, set some state according to what byte it is, and repeat.

>>>>>> Continue until the byte is above a certain threshold.

>>>>>> 

>>>>>> This allows AFL to compact any given test case much smaller; to the

>>>>>> point where now it reckons there is not a single byte of the test file

>>>>>> which becomes irrelevant.  Testing have shown that this option both

>>>>>> allows AFL to reach coverage much faster, and to have a total coverage

>>>>>> higher than with the old format.

>>>>>> 

>>>>>> Make this an option (rather than a unilateral change) to enable

>>>>>> side-by-side performance comparison of the old and new formats.

>>>>>> 

>>>>>> Signed-off-by: George Dunlap <george.dunlap@citrix.com>

>>>>> I am still of the opinion that this is a waste of effort, which would be

>>>>> better spent actually removing the irrelevant state in the first place;

>>>>> not building an obfuscation algorithm.

>>>>> 

>>>>> I'm not going to nack the patch because that is probably over the top,

>>>>> but I'm not in favour if this change going in.

>>>> Did you look at the evidence I presented, demonstrating that this

>>>> significantly increases the effectiveness of AFL?

>>> I can easily believe that you've found an obfucation algorithm which

>>> does better than the current state layout.

>>> 

>>> I do not believe that any amount of obfuscation will be better than

>>> actually fixing the root cause of the problem; that the current state

>>> really is mostly irrelevant, and can easily be shrunk.

>> Right; well I've already explained why I don't think "obfuscation" is

>> the right term.

> 

> 

> How else would describe it?  You are purposefully hiding the structure

> of the data by doing conditional initialisation based on earlier values.

Consider the following progression:

1. Have a big structure, which includes RAX, and have the corpus be loaded into the whole thing (meaning we always fuzz the whole thing).

2. Have <state, value> tuples which will set specific state to specific values; e.g., <FUZZ_rax, 0xfefefefefefefefe>.  

3. Have <offset, value> tuples which write $value into a specific offset of the processor state (including RAX).

I would consider #3 a generalized form of #2.

>> For the time being, we have something which improves

>> efficiency;

> 

> How much of this measured efficiently is actually ALF finding paths

> around setup_state() rather than finding new paths in the emulator

> itself?  I can't think of a test which would isolate this properly.

Come now; we’re talking about thousands of unique “tuples” difference between “all paths found after 24 hours” for the compact and non-compact versions.  (Like, 12k vs 8k.)  Do you really think there are 4000 unique tuples inside that one function?

We now have a way to generate gcov data from AFL test cases, so that’s a theory that can be tested now, if you care to do so.

> 

>> let's check it in now, and in the future if you or someone

>> else finds a way to fix it "properly" we can do that.

> 

> I wouldn't really mind so much if this change didn't make it harder to

> fix the root cause.  As it is, your prerequisite patch prohibits any

> ability to overlay a minimal structure over the fuzzing corpus.

I’d be interested to know what you mean by “a minimal structure” (and “architectural values in fuzz_state derived from bitfields in fuzz_corpus” in another email).

Consider the following three input formats.

1. We bits to fuzz_state->options for whether we should set the GPRs to fuzzed data or not.  If, for instance, the FUZZ_rax bit is set, then we read 8 bytes out of the fuzz state into RAX as part of the setup.

2. A format similar to what I have, but we have an explicit enumeration.  Read “fuzz_target” byte.  If fuzz_target == FUZZ_rax, then you read 64 bytes and write it into regs->rax.

3. The format I have: read an offset, write bytes into that offset.

Clearly there will be similarly ‘compact’ corpus files in all three that specify: “Set RAX to 0xfefefefefefefefe, then execute mov RAX, RBX”.  In all three formats AFL can efficiently mutate the corpus to setting RAX to 0xffffffffffffffff, or RBX to 0xfefefefefefefefe, or running “mov RAX, RDX” instead.  

From a fuzzing perspective, I think #2 and #3 are almost identical.

One advantage of #2 is that it could in theory align more closely with AFL’s “deterministic” fuzzing steps: If the value that ends up being written into RAX always stored as an aligned 64-bit number inside the file, for example, then maybe AFL’s “arithmetic” and “special value” heuristics might be more effective at finding interesting testcases, rather than relying mainly on “havoc” finding those same testcases by random bit-mashing.

One *disadvantage* of #2 is that it will only fuzz the parts of the data structure you expect to be set.  #3 will set *all* parts of your cpu structure, even the parts you don’t think are relevant.  From a “finding bugs in the emulator” perspective, I think #3 is clearly superior.

See the response I gave to Jan about clearing the upper bits of GPRs in 16- or 32-bit mode. Yes, in theory the upper bits should have no effect on the outcome of the emulation.  But if they *did* have an effect, we’d want to know, wouldn’t we?  Similarly, yes, in theory bits of the cpu structure that aren’t used shouldn’t have any effect on the emulation.  But if they *did*, we’d want to know: it would definitely be a bug, and we’d want to make sure that nobody could trigger that state 

I’m not sure if #2 is anything like you had in mind, but it wouldn’t actually be very difficult to morph what I’ve done here into it: we already have a “range” of “offset” values for MSRs, segment registers, and so on; all it would take is adding specifically-enumerated state into that list.

If that’s not your idea, all I can say is this: the primary way the more “compact” state increases effectiveness is by reducing file size.  Anything that doesn’t reduce file size will, I predict, be ineffective.

 -George

Jan Beulich Oct. 11, 2017, 9:18 a.m. UTC | #9

>>> On 10.10.17 at 18:20, <george.dunlap@citrix.com> wrote:
> At the moment, AFL reckons that for any given input, 87% of it is
> completely irrelevant: that is, it can change it as much as it wants
> but have no impact on the result of the test; and yet it can't remove
> it.
> 
> This is largely because we interpret the blob handed to us as a large
> struct, including CR values, MSR values, segment registers, and a full
> cpu_user_regs.
> 
> Instead, modify our interpretation to have a "set state" stanza at the
> front.  Begin by reading a 16-bit value; if it is lower than a certain
> threshold, set some state according to what byte it is, and repeat.
> Continue until the byte is above a certain threshold.
> 
> This allows AFL to compact any given test case much smaller; to the
> point where now it reckons there is not a single byte of the test file
> which becomes irrelevant.  Testing have shown that this option both
> allows AFL to reach coverage much faster, and to have a total coverage
> higher than with the old format.
> 
> Make this an option (rather than a unilateral change) to enable
> side-by-side performance comparison of the old and new formats.
> 
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>

Without meaning to override Andrew's objections, in case
he can grudgingly accept this going in
Acked-by: Jan Beulich <jbeulich@suse.com>

Jan

[v3,09/12] fuzz/x86_emulate: Make input more compact

Commit Message

Comments

Patch