x86emul/fuzz: adjust canonicalization in sanitize_input()

Message ID	5C9E30F40200007800222DDC@prv1-mh.provo.novell.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <xen-devel-bounces@lists.xenproject.org> Message-Id: <5C9E30F40200007800222DDC@prv1-mh.provo.novell.com> Date: Fri, 29 Mar 2019 08:51:32 -0600 From: "Jan Beulich" <JBeulich@suse.com> To: "xen-devel" <xen-devel@lists.xenproject.org> Mime-Version: 1.0 Content-Disposition: inline Subject: [Xen-devel] [PATCH] x86emul/fuzz: adjust canonicalization in sanitize_input() Precedence: list Cc: George Dunlap <George.Dunlap@eu.citrix.com>, Andrew Cooper <andrew.cooper3@citrix.com>, Wei Liu <wei.liu2@citrix.com>, Roger Pau Monne <roger.pau@citrix.com> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: xen-devel-bounces@lists.xenproject.org Sender: "Xen-devel" <xen-devel-bounces@lists.xenproject.org>
Series	x86emul/fuzz: adjust canonicalization in sanitize_input() \| expand x86emul/fuzz: adjust canonicalization in sanitize_input()

Jan Beulich March 29, 2019, 2:51 p.m. UTC

Drop it entirely for %rbp - this register is not special purpose enough
to warrant such special treatment. Add a comment to clarify the purpose
of the canonicalization of %rip and %rsp.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

Andrew Cooper March 29, 2019, 3:03 p.m. UTC | #1

On 29/03/2019 14:51, Jan Beulich wrote:
> Drop it entirely for %rbp - this register is not special purpose enough
> to warrant such special treatment. Add a comment to clarify the purpose
> of the canonicalization of %rip and %rsp.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

George Dunlap March 29, 2019, 3:14 p.m. UTC | #2

> On Mar 29, 2019, at 2:51 PM, Jan Beulich <JBeulich@suse.com> wrote:
> 
> Drop it entirely for %rbp - this register is not special purpose enough
> to warrant such special treatment. Add a comment to clarify the purpose
> of the canonicalization of %rip and %rsp.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

FAOD:
1. I don’t oppose this, but
2. I don’t support it either; however,
3. I don’t think my Ack is necessary.

 -George

Jan Beulich March 29, 2019, 3:23 p.m. UTC | #3

>>> On 29.03.19 at 16:14, <George.Dunlap@citrix.com> wrote:

> 
>> On Mar 29, 2019, at 2:51 PM, Jan Beulich <JBeulich@suse.com> wrote:
>> 
>> Drop it entirely for %rbp - this register is not special purpose enough
>> to warrant such special treatment. Add a comment to clarify the purpose
>> of the canonicalization of %rip and %rsp.
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> FAOD:
> 1. I don’t oppose this, but
> 2. I don’t support it either; however,
> 3. I don’t think my Ack is necessary.

Well, preferably I would address your concerns despite 3. So could
you clarify what you would suggest instead? Keep things as they
are? Drop all canonicalization? I've basically tried to find a middle
ground between the two extremes.

Jan

George Dunlap March 29, 2019, 3:42 p.m. UTC | #4

> On Mar 29, 2019, at 3:23 PM, Jan Beulich <JBeulich@suse.com> wrote:
> 
>>>> On 29.03.19 at 16:14, <George.Dunlap@citrix.com> wrote:
> 
>> 
>>> On Mar 29, 2019, at 2:51 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>> 
>>> Drop it entirely for %rbp - this register is not special purpose enough
>>> to warrant such special treatment. Add a comment to clarify the purpose
>>> of the canonicalization of %rip and %rsp.
>>> 
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> 
>> FAOD:
>> 1. I don’t oppose this, but
>> 2. I don’t support it either; however,
>> 3. I don’t think my Ack is necessary.
> 
> Well, preferably I would address your concerns despite 3. So could
> you clarify what you would suggest instead? Keep things as they
> are? Drop all canonicalization? I've basically tried to find a middle
> ground between the two extremes.

I appreciate that. :-) But the main reason I wrote this was #3: I didn’t want my silence interpreted as a nack.

I don’t think it will help fuzzing to remove canonicalization of ebp; it may help to have it in.  In fact I’d prefer to CANONICALIZE_MAYBE() more registers.

But I don’t think the question is so clear, or so important, that it’s worth having a long discussion about.  Absent some sort of rigorous testing, we’re all just guessing which is better; you & Andy are guessing one way, I’m guessing the other way.  This patch is about as close to middle-ground as there is.

Peace,
 -George

Jan Beulich March 29, 2019, 4:14 p.m. UTC | #5

>>> On 29.03.19 at 16:42, <George.Dunlap@citrix.com> wrote:
>> On Mar 29, 2019, at 3:23 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 29.03.19 at 16:14, <George.Dunlap@citrix.com> wrote:
>>> FAOD:
>>> 1. I don’t oppose this, but
>>> 2. I don’t support it either; however,
>>> 3. I don’t think my Ack is necessary.
>> 
>> Well, preferably I would address your concerns despite 3. So could
>> you clarify what you would suggest instead? Keep things as they
>> are? Drop all canonicalization? I've basically tried to find a middle
>> ground between the two extremes.
> 
> I appreciate that. :-) But the main reason I wrote this was #3: I didn’t 
> want my silence interpreted as a nack.
> 
> I don’t think it will help fuzzing to remove canonicalization of ebp; it may 
> help to have it in.  In fact I’d prefer to CANONICALIZE_MAYBE() more 
> registers.

But canonicalization removes potentially interesting bits from fuzzed
input, which is liable to be relevant if a register is used for other than
a base address in an effective address calculation. As an example,
take BSR: You'd remove the possibility to get results in the range
[48,62]. Or take the XSA-195 case: The memory range covered by
BT{,C,R,S} is dramatically much smaller when the register holding the
bit offset first got canonicalized.

Granted the canonicalization is conditional, so it wouldn't be making it
entirely impossible to get into such a state, but since fuzzing is all
about likelihood, we'd like to avoid reducing our chances of hitting
interesting cases. But as you say ...

> But I don’t think the question is so clear, or so important, that it’s worth 
> having a long discussion about.  Absent some sort of rigorous testing, we’re 
> all just guessing which is better; you & Andy are guessing one way, I’m 
> guessing the other way.  This patch is about as close to middle-ground as 
> there is.

... here, there's indeed a fair bit of guesswork involved.

Thanks in any event for the clarification,
Jan

George Dunlap March 29, 2019, 7:20 p.m. UTC | #6

> On Mar 29, 2019, at 4:14 PM, Jan Beulich <JBeulich@suse.com> wrote:
> 
>>>> On 29.03.19 at 16:42, <George.Dunlap@citrix.com> wrote:
>>> On Mar 29, 2019, at 3:23 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 29.03.19 at 16:14, <George.Dunlap@citrix.com> wrote:
>>>> FAOD:
>>>> 1. I don’t oppose this, but
>>>> 2. I don’t support it either; however,
>>>> 3. I don’t think my Ack is necessary.
>>> 
>>> Well, preferably I would address your concerns despite 3. So could
>>> you clarify what you would suggest instead? Keep things as they
>>> are? Drop all canonicalization? I've basically tried to find a middle
>>> ground between the two extremes.
>> 
>> I appreciate that. :-) But the main reason I wrote this was #3: I didn’t 
>> want my silence interpreted as a nack.
>> 
>> I don’t think it will help fuzzing to remove canonicalization of ebp; it may 
>> help to have it in.  In fact I’d prefer to CANONICALIZE_MAYBE() more 
>> registers.
> 
> But canonicalization removes potentially interesting bits from fuzzed
> input, which is liable to be relevant if a register is used for other than
> a base address in an effective address calculation. As an example,
> take BSR: You'd remove the possibility to get results in the range
> [48,62]. Or take the XSA-195 case: The memory range covered by
> BT{,C,R,S} is dramatically much smaller when the register holding the
> bit offset first got canonicalized.
> 
> Granted the canonicalization is conditional, so it wouldn't be making it
> entirely impossible to get into such a state, but since fuzzing is all
> about likelihood, we'd like to avoid reducing our chances of hitting
> interesting cases.

OK, since you want to discuss it, let’s discuss it. :-)

Just to summarize everything:

“Canonical” addresses must have bits 48-63 identical to bit 47.  This means the chance of any “random” 64-bit number being canonical is 1 in 65536 (2^16).

A large number of registers (rax, rbp, &c) *can* be used as memory addresses; a handful of registers (esp, rip) are designated to be used primarily for that purpose. 

It is asserted that the canonicity* of any registers has no effect on the codepaths the emulator takes; it simply does calculations and passes the results to various callbacks (e.g., insn_fetch, read, write, &c).  It is these callbacks which, in Xen, check the calculated linear address for canonicity and return an error for non-canonical addresses.

[* This is the form of this word my spell-checker doesn’t red-underline]

However, the whole point of testing is to find places where your assumptions are violated.  If the emulator ever *did* behave differently for canonical and non-canonical addresses, or near the boundary of canonicity, we’d want those behaviors to be tested.  In fact, it’s *possible* that the emulator already does have this kind of behavior in it, and we just haven’t realized it.

Fuzzing is probabilistic; theoretically, if we run the fuzzer long enough, we’ll hit all “interesting” inputs.   But practically speaking, “focusing” the fuzzing such that “interesting” inputs are found more quickly seems like a good idea.

AFL’s fuzzing takes two modes:
* “Deterministic” fuzzing, where individual bits are flipped, bytes / words / &c are incremented and decremented, bytes / words / &c are replaced with “interesting” numbers like 0, 1, -1, and so on
* “Random” fuzzing, where completely random changes are made: random values added, random bits of the file removed, random bits of the file duplicated, &c

At the moment, all registers can be modified at start to any random 64-bit value.  But there’s also a bitfield near the beginning that controls whether some of those registers (namely rip, rsp, and rbp) are “canonicalized”: If the register’s bitfield is 1, bits 48-63 will be set to the value of bit 47; otherwise the register will be left alone.  This allows AFL to choose whether the register will be canonicalized or not.

Effectively, a “maybe canonicalized” register has a 50% chance (plus change) of being “canonical”, and a 50% chance (minus a bit) of being non-canonical.

This canonicalization has several effects.

First of all, it means that any interesting “canonical” addresses (if they exist) has approximately a 65536x higher chance of being selected;  or to put it a different way, AFL will find any interesting “canonical” addresses 65536x faster than otherwise.

On the other hand, any interesting non-canonical addresses (or non-address values) have a 50% lower chance of being selected; AFL will take twice as long to find an interesting “non-canonical” / non-address values address as it would otherwise.

The other effect, at the moment, is to add another branch to fuzz-emulate.c.  This means that the very act of adding a select-able bit to canonicalize (or not) a register adds another path that AFL will consider “interesting”; given two otherwise-identical inputs, A and A’, where one is canonicalized and the other not, AFL will consider them two different test cases, *even if they go through the same paths in x86_emulate()*.  It will keep them both in its “queue", and will spend time mutating them both.

For random fuzzing, even if x86_emulate() is the same, I don’t think this will be wasted effort: 5 seconds fuzzing A and then 5 seconds fuzzing A’ will be the same as 10 seconds of fuzzing A.  It may skew AFL’s priorities a bit by fuzzing something for 10 seconds rather than 5, but it still has a chance of finding new data.

For deterministic fuzzing, if x86_emulate() is the same, then there is duplicated work: the 5 seconds spent fuzzing A’ will go through exactly the same sets of mutations as the 5 seconds spent fuzzing A.

(It may be possible to exclude fuzz-emul.c from instrumentation, in which case this effect would go away.)

Now, we have two possibilities: Either canonicity does not and never will affect x86_emulate(); or it may, either now or later.

And we have a few choices: We maybe-canonicalize no registers; we maybe-canonicalize only registers that are almost always used as an address; or we maybe-canonicalize more (maybe all) registers.

Let us call these C0, C1, and C2 (for “canonicity matters, and we canonicalize none, some, or most, respectively"), and N0, N1, and N2 (for “canonicity doesn’t mater, and we canonicalize none, some, or most respectively”).  (And remember, N vs C is a fact about the world that we don’t know; and 0-2 are options we can choose.)

N0: Everything is the fastest: “interesting” non-address values are found the fastest, AFL isn’t fooled into double-random-fuzzing some entries, AFL doesn’t waste time double-deterministic-fuzzing entries.

N1: “Interesting” non-address values take twice as long to find; but since the registers in question are used almost exclusively for addresses, this is less important.  Canonicity has no effect, so it doesn’t matter whether we choose canonical or non-canonical values for the registers.  We do spend twice as long random-fuzzing entries (for some benefit), and we twice as much time deterministically fuzzing some entries to no benefit; but only when fuzzing these address-mostly registers.

N2: “Interesting” non-address values take twice as long to find.  We spend twice as long random-fuzzing entries (to some benefit), and twice as long deterministically fuzzing some entries (to no benefit).

C0: Finding “interesting” non-address values is as fast as N0.  However, finding “interesting” address values take 65536 times as long.  Extra time spent fuzzing finds useful differences.

C1: “Interesting” non-address values for address-mostly registers take twice as long to find; but “interesting” address-values for mostly-address registers take about the same amount of time to find (i.e., 65536x faster than C0).  “Interesting” address values for other registers take 65536x times as long to find.  Again, extra fuzzing time is spent usefully.

C2: “Interesting” non-address values for all registers take twice as long to find; but “interesting” address values for all registers take that same amount of time.  Again, extra fuzzing time is spent usefully.

If N is true, and we go with option 2, it will take on average 2 hours to find an “interesting" value that would have taken 1 hour with option 0.

If C is true, and we go with 0, it will take on average 65536 hours (7.48 years) to find an “interesting” value that would have taken 1 hour in option 2.

I think the chances of C are less than 50%; but I think it's certainly higher than 1 in 65536.

So given that we don’t know whether we’re in case N or C, from a cost/benefits analysis, I think option 1 is a minimum; and that 2 is an obvious choice — particularly if we can get rid of the “wasted” fuzzing, perhaps by not instrumenting fuzz-emul.c.

Obviously there’s a lot of hand-waving and guessing here.  My understanding is that in practice, ebp is *mostly* used as an address and not a general-purpose register, which is why I maybe-canonicalized it in the first place, and would rather it remain so (option 1).  But I can see the arguments for classifying it as a general-purpose register; so as long as rip and rsp are maybe-canonicalized, I won’t spend a lot of time arguing against this patch.

This was very long — I hope it makes sense. :-)

Peace,
 -George

Andrew Cooper March 29, 2019, 8:40 p.m. UTC | #7

On 29/03/2019 19:20, George Dunlap wrote:
>
>> On Mar 29, 2019, at 4:14 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>
>>>>> On 29.03.19 at 16:42, <George.Dunlap@citrix.com> wrote:
>>>> On Mar 29, 2019, at 3:23 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>> On 29.03.19 at 16:14, <George.Dunlap@citrix.com> wrote:
>>>>> FAOD:
>>>>> 1. I don’t oppose this, but
>>>>> 2. I don’t support it either; however,
>>>>> 3. I don’t think my Ack is necessary.
>>>> Well, preferably I would address your concerns despite 3. So could
>>>> you clarify what you would suggest instead? Keep things as they
>>>> are? Drop all canonicalization? I've basically tried to find a middle
>>>> ground between the two extremes.
>>> I appreciate that. :-) But the main reason I wrote this was #3: I didn’t 
>>> want my silence interpreted as a nack.
>>>
>>> I don’t think it will help fuzzing to remove canonicalization of ebp; it may 
>>> help to have it in.  In fact I’d prefer to CANONICALIZE_MAYBE() more 
>>> registers.
>> But canonicalization removes potentially interesting bits from fuzzed
>> input, which is liable to be relevant if a register is used for other than
>> a base address in an effective address calculation. As an example,
>> take BSR: You'd remove the possibility to get results in the range
>> [48,62]. Or take the XSA-195 case: The memory range covered by
>> BT{,C,R,S} is dramatically much smaller when the register holding the
>> bit offset first got canonicalized.
>>
>> Granted the canonicalization is conditional, so it wouldn't be making it
>> entirely impossible to get into such a state, but since fuzzing is all
>> about likelihood, we'd like to avoid reducing our chances of hitting
>> interesting cases.
> OK, since you want to discuss it, let’s discuss it. :-)
>
> Just to summarize everything:
>
> “Canonical” addresses must have bits 48-63 identical to bit 47.  This means the chance of any “random” 64-bit number being canonical is 1 in 65536 (2^16).

As with everything, this is more complicated.  With 5-level paging
support in IceLake hardware, we now get a different type of canonical
address, where the sign extension is from bit 57, rather than 48.  In
practice, that now means that, depending on CR4.LA57 (which is the
control bit), we may have a 1 in 2^7 chance of being canonical.

>
> A large number of registers (rax, rbp, &c) *can* be used as memory addresses; a handful of registers (esp, rip) are designated to be used primarily for that purpose. 
>
> It is asserted that the canonicity* of any registers has no effect on the codepaths the emulator takes; it simply does calculations and passes the results to various callbacks (e.g., insn_fetch, read, write, &c).  It is these callbacks which, in Xen, check the calculated linear address for canonicity and return an error for non-canonical addresses.

Correct.  Registers are just numbers without any specific meaning.  It
is only when considering a resulting memory operand (which can be
derived from multiple registers and embedded constants) that canonicity*
is relevant.

As said before, we'll shortly gain a second type of canonical address,
but at the end of the day, all it amounts to is "is this a valid linear
address or not".

> [* This is the form of this word my spell-checker doesn’t red-underline]

[* My spell-checker doesn't like it, but meh - the meaning is clear]

> However, the whole point of testing is to find places where your assumptions are violated.  If the emulator ever *did* behave differently for canonical and non-canonical addresses, or near the boundary of canonicity, we’d want those behaviors to be tested.  In fact, it’s *possible* that the emulator already does have this kind of behavior in it, and we just haven’t realized it.

Accepting this point, we do have a couple of places in x86_emulate.c
which check canonicity.

* validate_far_branch()
* protmode_load_seg()
* LIDT/LGDT
* SYSEXIT
* WR{FS,GS}BASE

validate_far_branch() is strictly on %rip.

protmode_load_seg() and LIDT/LGDT are both to do with the content about
to be loaded into a system segment.  WR{FS,GS}BASE is for loading into a
user segment.  These should arguably be deferred to the
->write_segment() hook.

The use in SYSEXIT is explicitly documented in the manual, but they
shouldn't be necessary.  The check for %rdx will be done as part of
validate_far_branch(), whereas SYSEXIT shouldn't care about %rcx, and be
able to load a non-canonical %rsp.  In some copious free time, I'll try
seeing how hardware actually behaves.

All of these uses result in a difference in behaviour which ALF is
liable to pick up on.

I'm afraid that I'm not sufficiently caffeinated to work out if/how this
affects the rest of your reasoning.

> Fuzzing is probabilistic; theoretically, if we run the fuzzer long enough, we’ll hit all “interesting” inputs.   But practically speaking, “focusing” the fuzzing such that “interesting” inputs are found more quickly seems like a good idea.
>
> AFL’s fuzzing takes two modes:
> * “Deterministic” fuzzing, where individual bits are flipped, bytes / words / &c are incremented and decremented, bytes / words / &c are replaced with “interesting” numbers like 0, 1, -1, and so on
> * “Random” fuzzing, where completely random changes are made: random values added, random bits of the file removed, random bits of the file duplicated, &c
>
> At the moment, all registers can be modified at start to any random 64-bit value.  But there’s also a bitfield near the beginning that controls whether some of those registers (namely rip, rsp, and rbp) are “canonicalized”: If the register’s bitfield is 1, bits 48-63 will be set to the value of bit 47; otherwise the register will be left alone.  This allows AFL to choose whether the register will be canonicalized or not.
>
> Effectively, a “maybe canonicalized” register has a 50% chance (plus change) of being “canonical”, and a 50% chance (minus a bit) of being non-canonical.
>
> This canonicalization has several effects.
>
> First of all, it means that any interesting “canonical” addresses (if they exist) has approximately a 65536x higher chance of being selected;  or to put it a different way, AFL will find any interesting “canonical” addresses 65536x faster than otherwise.
>
> On the other hand, any interesting non-canonical addresses (or non-address values) have a 50% lower chance of being selected; AFL will take twice as long to find an interesting “non-canonical” / non-address values address as it would otherwise.
>
> The other effect, at the moment, is to add another branch to fuzz-emulate.c.  This means that the very act of adding a select-able bit to canonicalize (or not) a register adds another path that AFL will consider “interesting”; given two otherwise-identical inputs, A and A’, where one is canonicalized and the other not, AFL will consider them two different test cases, *even if they go through the same paths in x86_emulate()*.  It will keep them both in its “queue", and will spend time mutating them both.
>
> For random fuzzing, even if x86_emulate() is the same, I don’t think this will be wasted effort: 5 seconds fuzzing A and then 5 seconds fuzzing A’ will be the same as 10 seconds of fuzzing A.  It may skew AFL’s priorities a bit by fuzzing something for 10 seconds rather than 5, but it still has a chance of finding new data.
>
> For deterministic fuzzing, if x86_emulate() is the same, then there is duplicated work: the 5 seconds spent fuzzing A’ will go through exactly the same sets of mutations as the 5 seconds spent fuzzing A.
>
> (It may be possible to exclude fuzz-emul.c from instrumentation, in which case this effect would go away.)
>
> Now, we have two possibilities: Either canonicity does not and never will affect x86_emulate(); or it may, either now or later.
>
> And we have a few choices: We maybe-canonicalize no registers; we maybe-canonicalize only registers that are almost always used as an address; or we maybe-canonicalize more (maybe all) registers.
>
> Let us call these C0, C1, and C2 (for “canonicity matters, and we canonicalize none, some, or most, respectively"), and N0, N1, and N2 (for “canonicity doesn’t mater, and we canonicalize none, some, or most respectively”).  (And remember, N vs C is a fact about the world that we don’t know; and 0-2 are options we can choose.)
>
> N0: Everything is the fastest: “interesting” non-address values are found the fastest, AFL isn’t fooled into double-random-fuzzing some entries, AFL doesn’t waste time double-deterministic-fuzzing entries.
>
> N1: “Interesting” non-address values take twice as long to find; but since the registers in question are used almost exclusively for addresses, this is less important.  Canonicity has no effect, so it doesn’t matter whether we choose canonical or non-canonical values for the registers.  We do spend twice as long random-fuzzing entries (for some benefit), and we twice as much time deterministically fuzzing some entries to no benefit; but only when fuzzing these address-mostly registers.
>
> N2: “Interesting” non-address values take twice as long to find.  We spend twice as long random-fuzzing entries (to some benefit), and twice as long deterministically fuzzing some entries (to no benefit).
>
> C0: Finding “interesting” non-address values is as fast as N0.  However, finding “interesting” address values take 65536 times as long.  Extra time spent fuzzing finds useful differences.
>
> C1: “Interesting” non-address values for address-mostly registers take twice as long to find; but “interesting” address-values for mostly-address registers take about the same amount of time to find (i.e., 65536x faster than C0).  “Interesting” address values for other registers take 65536x times as long to find.  Again, extra fuzzing time is spent usefully.
>
> C2: “Interesting” non-address values for all registers take twice as long to find; but “interesting” address values for all registers take that same amount of time.  Again, extra fuzzing time is spent usefully.
>
> If N is true, and we go with option 2, it will take on average 2 hours to find an “interesting" value that would have taken 1 hour with option 0.
>
> If C is true, and we go with 0, it will take on average 65536 hours (7.48 years) to find an “interesting” value that would have taken 1 hour in option 2.
>
> I think the chances of C are less than 50%; but I think it's certainly higher than 1 in 65536.
>
> So given that we don’t know whether we’re in case N or C, from a cost/benefits analysis, I think option 1 is a minimum; and that 2 is an obvious choice — particularly if we can get rid of the “wasted” fuzzing, perhaps by not instrumenting fuzz-emul.c.
>
> Obviously there’s a lot of hand-waving and guessing here.  My understanding is that in practice, ebp is *mostly* used as an address and not a general-purpose register, which is why I maybe-canonicalized it in the first place, and would rather it remain so (option 1).  But I can see the arguments for classifying it as a general-purpose register; so as long as rip and rsp are maybe-canonicalized, I won’t spend a lot of time arguing against this patch.

There is a selection bias here.

It is true that, with -fno-omit-frame-pointer, %rbp is always an
address, but we aren't considering compiler-crafted code here.

To a first approximation, AFL will generate instruction fragments with a
uniform probability of which of the 8 registers are encoded (or 16 for
64bit, but lets take the simple path here).  Therefore, %rbp will be
encoded with a 1/8th probability, as will %rsp, and %rax, etc.

At this point, what matters is how many instructions use %rbp/%rsp as an
implicit memory operand.  %rsp is used by all stack operations, which is
a large quantity of the encoding space.

I can't think of any instruction which uses %rbp in this way. 
ENTER/LEAVE/PUSHA/POPA use/modify it, but only in its integer form - not
as a memory address.

~Andrew

Jan Beulich April 1, 2019, 6:57 a.m. UTC | #8

>>> On 29.03.19 at 21:40, <andrew.cooper3@citrix.com> wrote:
> I can't think of any instruction which uses %rbp in this way. 
> ENTER/LEAVE/PUSHA/POPA use/modify it, but only in its integer form - not
> as a memory address.

ENTER with a nesting level above 1 will use %rbp as a memory
address (or to be precise, %rbp minus some small offset).

Jan

Jan Beulich April 1, 2019, 7:09 a.m. UTC | #9

>>> On 29.03.19 at 20:20, <George.Dunlap@citrix.com> wrote:
> However, the whole point of testing is to find places where your assumptions 
> are violated.  If the emulator ever *did* behave differently for canonical 
> and non-canonical addresses, or near the boundary of canonicity, we’d want 
> those behaviors to be tested.

In this case wouldn't it be preferable to not exactly canonicalize
registers, but only almost, to get them near the boundary, but
both above and below?

Also please let's not forget that there are several constituent
parts to a linear address calculation: Base address (any GPR or
none at all), scaled index (GPRs other than %rsp or none at all,
with AVX and later also [XYZ]MM registers), displacement
(possibly zero), and segment register base (typically but not
always zero). What matters (outside of the special cases that
Andrew did enumerate, where the emulator currently does
canonical checks on its own) is the result of the calculation, not
the value(s) of input registers. Of course, if any of the parts
is non-canonical, there's a good chance that the calculation
result too will be.

Jan

x86emul/fuzz: adjust canonicalization in sanitize_input()

Commit Message

Comments

Patch