[v2,2/2] pci: ensure configuration access is within bounds

Message ID 20200603202251.1199170-3-ppandit@redhat.com (mailing list archive)
State New, archived
Series Ensure PCI configuration access is within bounds

Commit Message

Prasad Pandit June 3, 2020, 8:22 p.m. UTC
From: Prasad J Pandit <pjp@fedoraproject.org>

While reading PCI configuration bytes, a guest may send an
address towards the end of the configuration space. It may lead
to an OOB access issue. Assert that 'address + len' is within
PCI configuration space.

Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
---
 hw/pci/pci.c | 2 ++
 1 file changed, 2 insertions(+)

Update v2: assert PCI configuration access is within bounds
  -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00711.html
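
The bounds condition described in the commit message can be sketched as a
standalone predicate. This is only an illustration under stated assumptions:
config_access_in_bounds() is a hypothetical helper, not a QEMU function, and
it assumes the address is a 32-bit unsigned value and the length a signed int.
It is written so that no intermediate sum can wrap around, which a plain
'address + len' comparison may do for very large addresses.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): returns true when the access
 * [address, address + len) lies entirely inside a configuration space
 * of config_size bytes.
 */
static bool config_access_in_bounds(uint32_t address, int len,
                                    uint32_t config_size)
{
    /* Reject negative lengths and lengths larger than the whole space. */
    if (len < 0 || (uint32_t)len > config_size) {
        return false;
    }
    /* config_size - len cannot underflow because of the check above. */
    return address <= config_size - (uint32_t)len;
}
```

With a 256-byte space, a 4-byte access at offset 252 passes and one at
offset 253 does not.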

Comments

BALATON Zoltan June 3, 2020, 10:13 p.m. UTC | #1
On Thu, 4 Jun 2020, P J P wrote:
> From: Prasad J Pandit <pjp@fedoraproject.org>
>
> While reading PCI configuration bytes, a guest may send an
> address towards the end of the configuration space. It may lead
> to an OOB access issue. Assert that 'address + len' is within
> PCI configuration space.
>
> Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
> ---
> hw/pci/pci.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> Update v2: assert PCI configuration access is within bounds
>  -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00711.html
>
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 70c66965f5..173bec4fd5 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -1381,6 +1381,8 @@ uint32_t pci_default_read_config(PCIDevice *d,
> {
>     uint32_t val = 0;
>
> +    assert(address + len <= pci_config_size(d));

Does this allow guest now to crash QEMU? I think it was suggested that 
assert should only be used for cases that can only arise from a 
programming error and not from values set by the guest. If this is 
considered to be an error now to call this function with wrong parameters 
did you check other callers? I've found a few such as:

hw/scsi/esp-pci.c
hw/watchdog/wdt_i6300esb.c
hw/ide/cmd646.c
hw/vfio/pci.c

and maybe others. Would it be better to not crash just log invalid access 
and either fix up parameters or return some garbage like 0?

Regards,
BALATON Zoltan

> +
>     if (pci_is_express_downstream_port(d) &&
>         ranges_overlap(address, len, d->exp.exp_cap + PCI_EXP_LNKSTA, 2)) {
>         pcie_sync_bridge_lnk(d);
>
Gerd Hoffmann June 4, 2020, 5:14 a.m. UTC | #2
Hi,

> > +    assert(address + len <= pci_config_size(d));
> 
> Does this allow guest now to crash QEMU?

Looks like it does (didn't actually try though).

> I think it was suggested that assert should only be used for cases
> that can only arise from a programming error and not from values set
> by the guest.

Correct.  We do have guest-triggerable asserts in the code base.  They
are not the end of the world as the guest will only hurt itself.  But
in general we try to get rid of them instead of adding new ones ...

Often you can just ignore the illegal guest action (bonus points for
logging GUEST_ERROR as debugging aid).  Sometimes it is more difficult
to deal with it (in case the hardware is expected to throw an error irq
for example).

take care,
  Gerd
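
The "ignore the illegal guest action and log it" approach might look roughly
like the sketch below. Everything in it is a stand-in made up for
illustration: ToyDevice, TOY_CONFIG_SIZE, log_guest_error() and
toy_read_config() are hypothetical, and the logging helper simply prints to
stderr where real QEMU code would use its guest-error logging facility.

```c
#include <stdint.h>
#include <stdio.h>

#define TOY_CONFIG_SIZE 256u  /* assumed conventional PCI config space size */

/* Hypothetical stand-in for a device; QEMU's PCIDevice is far richer. */
typedef struct {
    uint8_t config[TOY_CONFIG_SIZE];
} ToyDevice;

/* Stand-in for a guest-error log call: prints to stderr only. */
static void log_guest_error(uint32_t address, int len)
{
    fprintf(stderr, "guest error: config read at 0x%x len %d out of bounds\n",
            (unsigned)address, len);
}

/*
 * Read 1 to 4 bytes, little-endian. An out-of-range request is logged
 * and answered with 0 instead of aborting the process.
 */
static uint32_t toy_read_config(const ToyDevice *d, uint32_t address, int len)
{
    uint32_t val = 0;

    if (len < 1 || len > 4 || address > TOY_CONFIG_SIZE - (uint32_t)len) {
        log_guest_error(address, len);
        return 0;
    }
    for (int i = 0; i < len; i++) {
        val |= (uint32_t)d->config[address + i] << (8 * i);
    }
    return val;
}
```

Returning 0 is an arbitrary choice in this sketch; a real device model may
need a different value or an error interrupt, as noted above.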
Prasad Pandit June 4, 2020, 5:31 a.m. UTC | #3
+-- On Thu, 4 Jun 2020, BALATON Zoltan wrote --+
| On Thu, 4 Jun 2020, P J P wrote:
| > +    assert(address + len <= pci_config_size(d));
| 
| Does this allow guest now to crash QEMU?

Yes, it is possible. Such a crash (assert failure) can be treated as a regular
bug, since reading PCI configuration is likely a privileged operation inside the guest.

| If this is considered to be an error now to call this function with wrong 
| parameters did you check other callers?

No, I haven't checked all other cases.

| Would it be better to not crash just log invalid access and either fix up 
| parameters or return some garbage like 0?

* Earlier patch v1 did the same, returned 0.

* assert(3) may help to fix current and future incorrect usage of the call.

@mst ...?

Thank you.
--
Prasad J Pandit / Red Hat Product Security Team
8685 545E B54C 486B C6EB 271E E285 8B5A F050 DE8D
Philippe Mathieu-Daudé June 4, 2020, 6:07 a.m. UTC | #4
On 6/4/20 12:13 AM, BALATON Zoltan wrote:
> On Thu, 4 Jun 2020, P J P wrote:
>> From: Prasad J Pandit <pjp@fedoraproject.org>
>>
>> While reading PCI configuration bytes, a guest may send an
>> address towards the end of the configuration space. It may lead
>> to an OOB access issue. Assert that 'address + len' is within
>> PCI configuration space.
>>
>> Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
>> ---
>> hw/pci/pci.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> Update v2: assert PCI configuration access is within bounds
>>  -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00711.html
>>
>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>> index 70c66965f5..173bec4fd5 100644
>> --- a/hw/pci/pci.c
>> +++ b/hw/pci/pci.c
>> @@ -1381,6 +1381,8 @@ uint32_t pci_default_read_config(PCIDevice *d,
>> {
>>     uint32_t val = 0;
>>
>> +    assert(address + len <= pci_config_size(d));
> 
> Does this allow guest now to crash QEMU? I think it was suggested that
> assert should only be used for cases that can only arise from a
> programming error and not from values set by the guest. If this is
> considered to be an error now to call this function with wrong
> parameters did you check other callers? I've found a few such as:
> 
> hw/scsi/esp-pci.c
> hw/watchdog/wdt_i6300esb.c
> hw/ide/cmd646.c
> hw/vfio/pci.c
> 
> and maybe others. Would it be better to not crash just log invalid
> access and either fix up parameters or return some garbage like 0?

Yes, maybe I was not clear while reviewing v1, we need to audit the
callers and fix them first, then we can safely add the assert here.

> 
> Regards,
> BALATON Zoltan
> 
>> +
>>     if (pci_is_express_downstream_port(d) &&
>>         ranges_overlap(address, len, d->exp.exp_cap + PCI_EXP_LNKSTA,
>> 2)) {
>>         pcie_sync_bridge_lnk(d);
>>
Peter Maydell June 4, 2020, 9:10 a.m. UTC | #5
On Wed, 3 Jun 2020 at 21:26, P J P <ppandit@redhat.com> wrote:
>
> From: Prasad J Pandit <pjp@fedoraproject.org>
>
> While reading PCI configuration bytes, a guest may send an
> address towards the end of the configuration space. It may lead
> to an OOB access issue. Assert that 'address + len' is within
> PCI configuration space.

What does the spec say should happen when the guest does this?
Does it depend on the pci controller implementation?

thanks
-- PMM
Michael S. Tsirkin June 4, 2020, 9:35 a.m. UTC | #6
On Thu, Jun 04, 2020 at 10:10:07AM +0100, Peter Maydell wrote:
> On Wed, 3 Jun 2020 at 21:26, P J P <ppandit@redhat.com> wrote:
> >
> > From: Prasad J Pandit <pjp@fedoraproject.org>
> >
> > While reading PCI configuration bytes, a guest may send an
> > address towards the end of the configuration space. It may lead
> > to an OOB access issue. Assert that 'address + len' is within
> > PCI configuration space.
> 
> What does the spec say should happen when the guest does this?

Spec says anything can happen *to the device*. Naturally there's
an expectation that while device might crash it stays
resettable and does not blow up.

> Does it depend on the pci controller implementation?
> 
> thanks
> -- PMM

It shouldn't, I think.
Michael S. Tsirkin June 4, 2020, 9:38 a.m. UTC | #7
On Thu, Jun 04, 2020 at 01:52:51AM +0530, P J P wrote:
> From: Prasad J Pandit <pjp@fedoraproject.org>
> 
> While reading PCI configuration bytes, a guest may send an
> address towards the end of the configuration space. It may lead
> to an OOB access issue. Assert that 'address + len' is within
> PCI configuration space.
> 
> Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>

My understanding is that this can't really happen normally,
this is more an assert in case some pci host devices are buggy,
as is the case of ati-vga.
Right?
Pls clarify commit log so it's obvious this is defence in depth.

> ---
>  hw/pci/pci.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> Update v2: assert PCI configuration access is within bounds
>   -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00711.html
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 70c66965f5..173bec4fd5 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -1381,6 +1381,8 @@ uint32_t pci_default_read_config(PCIDevice *d,
>  {
>      uint32_t val = 0;
>  
> +    assert(address + len <= pci_config_size(d));
> +
>      if (pci_is_express_downstream_port(d) &&
>          ranges_overlap(address, len, d->exp.exp_cap + PCI_EXP_LNKSTA, 2)) {
>          pcie_sync_bridge_lnk(d);
> -- 
> 2.26.2
Michael S. Tsirkin June 4, 2020, 9:41 a.m. UTC | #8
On Thu, Jun 04, 2020 at 08:07:52AM +0200, Philippe Mathieu-Daudé wrote:
> On 6/4/20 12:13 AM, BALATON Zoltan wrote:
> > On Thu, 4 Jun 2020, P J P wrote:
> >> From: Prasad J Pandit <pjp@fedoraproject.org>
> >>
> >> While reading PCI configuration bytes, a guest may send an
> >> address towards the end of the configuration space. It may lead
> >> to an OOB access issue. Assert that 'address + len' is within
> >> PCI configuration space.
> >>
> >> Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> >> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
> >> ---
> >> hw/pci/pci.c | 2 ++
> >> 1 file changed, 2 insertions(+)
> >>
> >> Update v2: assert PCI configuration access is within bounds
> >>  -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00711.html
> >>
> >> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >> index 70c66965f5..173bec4fd5 100644
> >> --- a/hw/pci/pci.c
> >> +++ b/hw/pci/pci.c
> >> @@ -1381,6 +1381,8 @@ uint32_t pci_default_read_config(PCIDevice *d,
> >> {
> >>     uint32_t val = 0;
> >>
> >> +    assert(address + len <= pci_config_size(d));
> > 
> > Does this allow guest now to crash QEMU? I think it was suggested that
> > assert should only be used for cases that can only arise from a
> > programming error and not from values set by the guest. If this is
> > considered to be an error now to call this function with wrong
> > parameters did you check other callers? I've found a few such as:
> > 
> > hw/scsi/esp-pci.c
> > hw/watchdog/wdt_i6300esb.c
> > hw/ide/cmd646.c
> > hw/vfio/pci.c
> > 
> > and maybe others. Would it be better to not crash just log invalid
> > access and either fix up parameters or return some garbage like 0?
> 
> Yes, maybe I was not clear while reviewing v1, we need to audit the
> callers and fix them first, then we can safely add the assert here.

We can add assert here regardless of auditing callers. Doing that
will also make fuzzing easier. But the assert is unrelated to CVE imho.

> > 
> > Regards,
> > BALATON Zoltan
> > 
> >> +
> >>     if (pci_is_express_downstream_port(d) &&
> >>         ranges_overlap(address, len, d->exp.exp_cap + PCI_EXP_LNKSTA,
> >> 2)) {
> >>         pcie_sync_bridge_lnk(d);
> >>
Michael S. Tsirkin June 4, 2020, 9:44 a.m. UTC | #9
On Thu, Jun 04, 2020 at 07:14:00AM +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > > +    assert(address + len <= pci_config_size(d));
> > 
> > Does this allow guest now to crash QEMU?
> 
> Looks like it does (didn't actually try though).
> 
> > I think it was suggested that assert should only be used for cases
> > that can only arise from a programming error and not from values set
> > by the guest.
> 
> Correct.  We do have guest-triggerable asserts in the code base.  They
> are not the end of the world as the guest will only hurt itself.  But
> in general we try to get rid of them instead of adding new ones ...
> 
> Often you can just ignore the illegal guest action (bonus points for
> logging GUEST_ERROR as debugging aid).  Sometimes it is more difficult
> to deal with it (in case the hardware is expected to throw an error irq
> for example).
> 
> take care,
>   Gerd

In this case it's not supposed to be guest triggerable, so I'm inclined
to merge this, but as a separate patch from patch 1,
and the commit log needs to be clearer that it's defence in depth,
not a bugfix.
BALATON Zoltan June 4, 2020, 11:37 a.m. UTC | #10
On Thu, 4 Jun 2020, Michael S. Tsirkin wrote:
> On Thu, Jun 04, 2020 at 08:07:52AM +0200, Philippe Mathieu-Daudé wrote:
>> On 6/4/20 12:13 AM, BALATON Zoltan wrote:
>>> On Thu, 4 Jun 2020, P J P wrote:
>>>> From: Prasad J Pandit <pjp@fedoraproject.org>
>>>>
>>>> While reading PCI configuration bytes, a guest may send an
>>>> address towards the end of the configuration space. It may lead
>>>> to an OOB access issue. Assert that 'address + len' is within
>>>> PCI configuration space.
>>>>
>>>> Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>>>> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
>>>> ---
>>>> hw/pci/pci.c | 2 ++
>>>> 1 file changed, 2 insertions(+)
>>>>
>>>> Update v2: assert PCI configuration access is within bounds
>>>>  -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00711.html
>>>>
>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>> index 70c66965f5..173bec4fd5 100644
>>>> --- a/hw/pci/pci.c
>>>> +++ b/hw/pci/pci.c
>>>> @@ -1381,6 +1381,8 @@ uint32_t pci_default_read_config(PCIDevice *d,
>>>> {
>>>>     uint32_t val = 0;
>>>>
>>>> +    assert(address + len <= pci_config_size(d));
>>>
>>> Does this allow guest now to crash QEMU? I think it was suggested that
>>> assert should only be used for cases that can only arise from a
>>> programming error and not from values set by the guest. If this is
>>> considered to be an error now to call this function with wrong
>>> parameters did you check other callers? I've found a few such as:
>>>
>>> hw/scsi/esp-pci.c
>>> hw/watchdog/wdt_i6300esb.c
>>> hw/ide/cmd646.c
>>> hw/vfio/pci.c
>>>
>>> and maybe others. Would it be better to not crash just log invalid
>>> access and either fix up parameters or return some garbage like 0?
>>
>> Yes, maybe I was not clear while reviewing v1, we need to audit the
>> callers and fix them first, then we can safely add the assert here.
>
> We can add assert here regardless of auditing callers. Doing that
> will also make fuzzing easier. But the assert is unrelated to CVE imho.

I wonder why isn't the check added to pci_default_read_config() right 
away? If we have an assert there the overhead is the same and adding the 
check there would make it unnecessary to patch all callers so it's just 
one patch instead of a whole series.

Regards,
BALATON Zoltan
Michael S. Tsirkin June 4, 2020, 11:40 a.m. UTC | #11
On Thu, Jun 04, 2020 at 01:37:13PM +0200, BALATON Zoltan wrote:
> On Thu, 4 Jun 2020, Michael S. Tsirkin wrote:
> > On Thu, Jun 04, 2020 at 08:07:52AM +0200, Philippe Mathieu-Daudé wrote:
> > > On 6/4/20 12:13 AM, BALATON Zoltan wrote:
> > > > On Thu, 4 Jun 2020, P J P wrote:
> > > > > From: Prasad J Pandit <pjp@fedoraproject.org>
> > > > > 
> > > > > While reading PCI configuration bytes, a guest may send an
> > > > > address towards the end of the configuration space. It may lead
> > > > > to an OOB access issue. Assert that 'address + len' is within
> > > > > PCI configuration space.
> > > > > 
> > > > > Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > > > > Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
> > > > > ---
> > > > > hw/pci/pci.c | 2 ++
> > > > > 1 file changed, 2 insertions(+)
> > > > > 
> > > > > Update v2: assert PCI configuration access is within bounds
> > > > >  -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00711.html
> > > > > 
> > > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > > index 70c66965f5..173bec4fd5 100644
> > > > > --- a/hw/pci/pci.c
> > > > > +++ b/hw/pci/pci.c
> > > > > @@ -1381,6 +1381,8 @@ uint32_t pci_default_read_config(PCIDevice *d,
> > > > > {
> > > > >     uint32_t val = 0;
> > > > > 
> > > > > +    assert(address + len <= pci_config_size(d));
> > > > 
> > > > Does this allow guest now to crash QEMU? I think it was suggested that
> > > > assert should only be used for cases that can only arise from a
> > > > programming error and not from values set by the guest. If this is
> > > > considered to be an error now to call this function with wrong
> > > > parameters did you check other callers? I've found a few such as:
> > > > 
> > > > hw/scsi/esp-pci.c
> > > > hw/watchdog/wdt_i6300esb.c
> > > > hw/ide/cmd646.c
> > > > hw/vfio/pci.c
> > > > 
> > > > and maybe others. Would it be better to not crash just log invalid
> > > > access and either fix up parameters or return some garbage like 0?
> > > 
> > > Yes, maybe I was not clear while reviewing v1, we need to audit the
> > > callers and fix them first, then we can safely add the assert here.
> > 
> > We can add assert here regardless of auditing callers. Doing that
> > will also make fuzzing easier. But the assert is unrelated to CVE imho.
> 
> I wonder why isn't the check added to pci_default_read_config() right away?
> If we have an assert there the overhead is the same and adding the check
> there would make it unnecessary to patch all callers so it's just one patch
> instead of a whole series.
> 
> Regards,
> BALATON Zoltan

We need to return something, and we can't be sure that callers will
handle returning random stuff correctly. Callers know what
to do on errors, we don't.
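
The split argued for here, where the shared helper treats bad arguments as a
programming error and each caller decides how to answer an invalid guest
access, could be sketched as follows. All names (DemoDevice,
DEMO_CONFIG_SIZE, demo_default_read(), demo_device_read()) are hypothetical,
and the all-ones return value in the caller is just one possible
device-specific choice assumed for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CONFIG_SIZE 256u  /* assumed config space size for this sketch */

/* Hypothetical stand-in for a device model. */
typedef struct {
    uint8_t config[DEMO_CONFIG_SIZE];
} DemoDevice;

/* Shared helper: out-of-range arguments are a caller bug, so assert. */
static uint32_t demo_default_read(const DemoDevice *d, uint32_t address,
                                  int len)
{
    uint32_t val = 0;

    assert(len >= 1 && len <= 4 &&
           address <= DEMO_CONFIG_SIZE - (uint32_t)len);
    for (int i = 0; i < len; i++) {
        val |= (uint32_t)d->config[address + i] << (8 * i);
    }
    return val;
}

/*
 * Device-specific caller: validates the guest-supplied values first and
 * picks its own answer for an invalid access (all-ones here, by assumption).
 */
static uint32_t demo_device_read(const DemoDevice *d, uint32_t address,
                                 int len)
{
    if (len < 1 || len > 4 || address > DEMO_CONFIG_SIZE - (uint32_t)len) {
        return UINT32_MAX;
    }
    return demo_default_read(d, address, len);
}
```

In this arrangement the assert can only fire if a caller forgets its own
check, which is the kind of bug a fuzzer would then surface.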
BALATON Zoltan June 4, 2020, 11:49 a.m. UTC | #12
On Thu, 4 Jun 2020, Michael S. Tsirkin wrote:
> On Thu, Jun 04, 2020 at 01:37:13PM +0200, BALATON Zoltan wrote:
>> On Thu, 4 Jun 2020, Michael S. Tsirkin wrote:
>>> On Thu, Jun 04, 2020 at 08:07:52AM +0200, Philippe Mathieu-Daudé wrote:
>>>> On 6/4/20 12:13 AM, BALATON Zoltan wrote:
>>>>> On Thu, 4 Jun 2020, P J P wrote:
>>>>>> From: Prasad J Pandit <pjp@fedoraproject.org>
>>>>>>
>>>>>> While reading PCI configuration bytes, a guest may send an
>>>>>> address towards the end of the configuration space. It may lead
>>>>>> to an OOB access issue. Assert that 'address + len' is within
>>>>>> PCI configuration space.
>>>>>>
>>>>>> Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>>>>>> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
>>>>>> ---
>>>>>> hw/pci/pci.c | 2 ++
>>>>>> 1 file changed, 2 insertions(+)
>>>>>>
>>>>>> Update v2: assert PCI configuration access is within bounds
>>>>>>  -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00711.html
>>>>>>
>>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>>>> index 70c66965f5..173bec4fd5 100644
>>>>>> --- a/hw/pci/pci.c
>>>>>> +++ b/hw/pci/pci.c
>>>>>> @@ -1381,6 +1381,8 @@ uint32_t pci_default_read_config(PCIDevice *d,
>>>>>> {
>>>>>>     uint32_t val = 0;
>>>>>>
>>>>>> +    assert(address + len <= pci_config_size(d));
>>>>>
>>>>> Does this allow guest now to crash QEMU? I think it was suggested that
>>>>> assert should only be used for cases that can only arise from a
>>>>> programming error and not from values set by the guest. If this is
>>>>> considered to be an error now to call this function with wrong
>>>>> parameters did you check other callers? I've found a few such as:
>>>>>
>>>>> hw/scsi/esp-pci.c
>>>>> hw/watchdog/wdt_i6300esb.c
>>>>> hw/ide/cmd646.c
>>>>> hw/vfio/pci.c
>>>>>
>>>>> and maybe others. Would it be better to not crash just log invalid
>>>>> access and either fix up parameters or return some garbage like 0?
>>>>
>>>> Yes, maybe I was not clear while reviewing v1, we need to audit the
>>>> callers and fix them first, then we can safely add the assert here.
>>>
>>> We can add assert here regardless of auditing callers. Doing that
>>> will also make fuzzing easier. But the assert is unrelated to CVE imho.
>>
>> I wonder why isn't the check added to pci_default_read_config() right away?
>> If we have an assert there the overhead is the same and adding the check
>> there would make it unnecessary to patch all callers so it's just one patch
>> instead of a whole series.
>>
>> Regards,
>> BALATON Zoltan
>
> We need to return something, and we can't be sure that callers will
> handle returning random stuff correctly. Callers know what
> to do on errors, we don't.

This is an invalid case where behaviour will be undefined anyway so 
returning anything such as 0 or -1 is probably OK (what do most hardware 
return in this case?). If callers need better error handling they can do a 
check before calling the function but for other (most) callers which will 
just return the same random value you would return from 
pci_default_read_config() having an assert instead makes it necessary to 
modify all of them one by one and doubles the check overhead by 
unnecessarily double checking. So I think having a default check and error 
handling in pci_default_read_config() would be better so callers who don't 
care would work and those few who might care could check before calling or 
actually implement their own callback (which I expect they already do as 
this is just the default implementation of this callback).
Michael S. Tsirkin June 4, 2020, 11:58 a.m. UTC | #13
On Thu, Jun 04, 2020 at 01:49:53PM +0200, BALATON Zoltan wrote:
> On Thu, 4 Jun 2020, Michael S. Tsirkin wrote:
> > On Thu, Jun 04, 2020 at 01:37:13PM +0200, BALATON Zoltan wrote:
> > > On Thu, 4 Jun 2020, Michael S. Tsirkin wrote:
> > > > On Thu, Jun 04, 2020 at 08:07:52AM +0200, Philippe Mathieu-Daudé wrote:
> > > > > On 6/4/20 12:13 AM, BALATON Zoltan wrote:
> > > > > > On Thu, 4 Jun 2020, P J P wrote:
> > > > > > > From: Prasad J Pandit <pjp@fedoraproject.org>
> > > > > > > 
> > > > > > > While reading PCI configuration bytes, a guest may send an
> > > > > > > address towards the end of the configuration space. It may lead
> > > > > > > to an OOB access issue. Assert that 'address + len' is within
> > > > > > > PCI configuration space.
> > > > > > > 
> > > > > > > Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > > > > > > Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
> > > > > > > ---
> > > > > > > hw/pci/pci.c | 2 ++
> > > > > > > 1 file changed, 2 insertions(+)
> > > > > > > 
> > > > > > > Update v2: assert PCI configuration access is within bounds
> > > > > > >  -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00711.html
> > > > > > > 
> > > > > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > > > > index 70c66965f5..173bec4fd5 100644
> > > > > > > --- a/hw/pci/pci.c
> > > > > > > +++ b/hw/pci/pci.c
> > > > > > > @@ -1381,6 +1381,8 @@ uint32_t pci_default_read_config(PCIDevice *d,
> > > > > > > {
> > > > > > >     uint32_t val = 0;
> > > > > > > 
> > > > > > > +    assert(address + len <= pci_config_size(d));
> > > > > > 
> > > > > > Does this allow guest now to crash QEMU? I think it was suggested that
> > > > > > assert should only be used for cases that can only arise from a
> > > > > > programming error and not from values set by the guest. If this is
> > > > > > considered to be an error now to call this function with wrong
> > > > > > parameters did you check other callers? I've found a few such as:
> > > > > > 
> > > > > > hw/scsi/esp-pci.c
> > > > > > hw/watchdog/wdt_i6300esb.c
> > > > > > hw/ide/cmd646.c
> > > > > > hw/vfio/pci.c
> > > > > > 
> > > > > > and maybe others. Would it be better to not crash just log invalid
> > > > > > access and either fix up parameters or return some garbage like 0?
> > > > > 
> > > > > Yes, maybe I was not clear while reviewing v1, we need to audit the
> > > > > callers and fix them first, then we can safely add the assert here.
> > > > 
> > > > We can add assert here regardless of auditing callers. Doing that
> > > > will also make fuzzing easier. But the assert is unrelated to CVE imho.
> > > 
> > > I wonder why isn't the check added to pci_default_read_config() right away?
> > > If we have an assert there the overhead is the same and adding the check
> > > there would make it unnecessary to patch all callers so it's just one patch
> > > instead of a whole series.
> > > 
> > > Regards,
> > > BALATON Zoltan
> > 
> > We need to return something, and we can't be sure that callers will
> > handle returning random stuff correctly. Callers know what
> > to do on errors, we don't.
> 
> This is an invalid case where behaviour will be undefined anyway so
> returning anything such as 0 or -1 is probably OK (what do most hardware
> return in this case?).

This is an internal detail of the API. It's not about what hardware
returns.  Look at the ati as an example.

> If callers need better error handling they can do a
> check before calling the function but for other (most) callers which will
> just return the same random value you would return from
> pci_default_read_config() having an assert instead makes it necessary to
> modify all of them one by one and doubles the check overhead by
> unnecessarily double checking. So I think having a default check and error
> handling in pci_default_read_config() would be better so callers who don't
> care would work and those few who might care could check before calling or
> actually implement their own callback (which I expect they already do as
> this is just the default implementation of this callback).


Basically if you look at the specific example, you will see that it
triggers because of a misaligned access which device code never
expected. Which memory core should not allow at all.
It will likely trigger other bugs, some of them could be
security related. assert is a reasonable way to help us catch them in
fuzzing.
BALATON Zoltan June 4, 2020, 12:14 p.m. UTC | #14
On Thu, 4 Jun 2020, Michael S. Tsirkin wrote:
> On Thu, Jun 04, 2020 at 01:49:53PM +0200, BALATON Zoltan wrote:
>> On Thu, 4 Jun 2020, Michael S. Tsirkin wrote:
>>> On Thu, Jun 04, 2020 at 01:37:13PM +0200, BALATON Zoltan wrote:
>>>> On Thu, 4 Jun 2020, Michael S. Tsirkin wrote:
>>>>> On Thu, Jun 04, 2020 at 08:07:52AM +0200, Philippe Mathieu-Daudé wrote:
>>>>>> On 6/4/20 12:13 AM, BALATON Zoltan wrote:
>>>>>>> On Thu, 4 Jun 2020, P J P wrote:
>>>>>>>> From: Prasad J Pandit <pjp@fedoraproject.org>
>>>>>>>>
>>>>>>>> While reading PCI configuration bytes, a guest may send an
>>>>>>>> address towards the end of the configuration space. It may lead
>>>>>>>> to an OOB access issue. Assert that 'address + len' is within
>>>>>>>> PCI configuration space.
>>>>>>>>
>>>>>>>> Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>>>>>>>> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
>>>>>>>> ---
>>>>>>>> hw/pci/pci.c | 2 ++
>>>>>>>> 1 file changed, 2 insertions(+)
>>>>>>>>
>>>>>>>> Update v2: assert PCI configuration access is within bounds
>>>>>>>>  -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00711.html
>>>>>>>>
>>>>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>>>>>> index 70c66965f5..173bec4fd5 100644
>>>>>>>> --- a/hw/pci/pci.c
>>>>>>>> +++ b/hw/pci/pci.c
>>>>>>>> @@ -1381,6 +1381,8 @@ uint32_t pci_default_read_config(PCIDevice *d,
>>>>>>>> {
>>>>>>>>     uint32_t val = 0;
>>>>>>>>
>>>>>>>> +    assert(address + len <= pci_config_size(d));
>>>>>>>
>>>>>>> Does this allow guest now to crash QEMU? I think it was suggested that
>>>>>>> assert should only be used for cases that can only arise from a
>>>>>>> programming error and not from values set by the guest. If this is
>>>>>>> considered to be an error now to call this function with wrong
>>>>>>> parameters did you check other callers? I've found a few such as:
>>>>>>>
>>>>>>> hw/scsi/esp-pci.c
>>>>>>> hw/watchdog/wdt_i6300esb.c
>>>>>>> hw/ide/cmd646.c
>>>>>>> hw/vfio/pci.c
>>>>>>>
>>>>>>> and maybe others. Would it be better to not crash just log invalid
>>>>>>> access and either fix up parameters or return some garbage like 0?
>>>>>>
>>>>>> Yes, maybe I was not clear while reviewing v1, we need to audit the
>>>>>> callers and fix them first, then we can safely add the assert here.
>>>>>
>>>>> We can add assert here regardless of auditing callers. Doing that
>>>>> will also make fuzzing easier. But the assert is unrelated to CVE imho.
>>>>
>>>> I wonder why isn't the check added to pci_default_read_config() right away?
>>>> If we have an assert there the overhead is the same and adding the check
>>>> there would make it unnecessary to patch all callers so it's just one patch
>>>> instead of a whole series.
>>>>
>>>> Regards,
>>>> BALATON Zoltan
>>>
>>> We need to return something, and we can't be sure that callers will
>>> handle returning random stuff correctly. Callers know what
>>> to do on errors, we don't.
>>
>> This is an invalid case where behaviour will be undefined anyway so
>> returning anything such as 0 or -1 is probably OK (what do most hardware
>> return in this case?).
>
> This is an internal detail of the API. It's not about what hardware
> returns.  Look at the ati as an example.

Considering that this function implements reading PCI config space, its API 
should align with what normally happens on hardware. You could make it 
unrelated, but that does not make much sense other than causing trouble for 
callers.

>> If callers need better error handling they can do a
>> check before calling the function but for other (most) callers which will
>> just return the same random value you would return from
>> pci_default_read_config() having an assert instead makes it necessary to
>> modify all of them one by one and doubles the check overhead by
>> unnecessarily double checking. So I think having a default check and error
>> handling in pci_default_read_config() would be better so callers who don't
>> care would work and those few who might care could check before calling or
>> actually implement their own callback (which I expect they already do as
>> this is just the default implementation of this callback).
>
>
> Basically if you look at the specific example, you will see that it
> triggers because of a misaligned access which device code never
> expected. Which memory core should not allow at all.
> It will likely trigger other bugs, some of them could be
> security related. assert is a reasonable way to help us catch them in
> fuzzing.

The specific example (ati-vga) does expect and should support unaligned 
access. Not for all regs but for most registers; there's a table in the docs 
which says that for PCI POS registers (whatever those are) unaligned access is 
supported. This works now; if it should not work without .impl.unaligned or 
some other value set somewhere, that should be patched instead.

Regards,
BALATON Zoltan
Michael S. Tsirkin June 4, 2020, 2:11 p.m. UTC | #15
On Thu, Jun 04, 2020 at 02:14:46PM +0200, BALATON Zoltan wrote:
> On Thu, 4 Jun 2020, Michael S. Tsirkin wrote:
> > On Thu, Jun 04, 2020 at 01:49:53PM +0200, BALATON Zoltan wrote:
> > > On Thu, 4 Jun 2020, Michael S. Tsirkin wrote:
> > > > On Thu, Jun 04, 2020 at 01:37:13PM +0200, BALATON Zoltan wrote:
> > > > > On Thu, 4 Jun 2020, Michael S. Tsirkin wrote:
> > > > > > On Thu, Jun 04, 2020 at 08:07:52AM +0200, Philippe Mathieu-Daudé wrote:
> > > > > > > On 6/4/20 12:13 AM, BALATON Zoltan wrote:
> > > > > > > > On Thu, 4 Jun 2020, P J P wrote:
> > > > > > > > > From: Prasad J Pandit <pjp@fedoraproject.org>
> > > > > > > > > 
> > > > > > > > > While reading PCI configuration bytes, a guest may send an
> > > > > > > > > address towards the end of the configuration space. It may lead
> > > > > > > > > to an OOB access issue. Assert that 'address + len' is within
> > > > > > > > > PCI configuration space.
> > > > > > > > > 
> > > > > > > > > Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> > > > > > > > > Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
> > > > > > > > > ---
> > > > > > > > > hw/pci/pci.c | 2 ++
> > > > > > > > > 1 file changed, 2 insertions(+)
> > > > > > > > > 
> > > > > > > > > Update v2: assert PCI configuration access is within bounds
> > > > > > > > >  -> https://lists.gnu.org/archive/html/qemu-devel/2020-06/msg00711.html
> > > > > > > > > 
> > > > > > > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > > > > > > index 70c66965f5..173bec4fd5 100644
> > > > > > > > > --- a/hw/pci/pci.c
> > > > > > > > > +++ b/hw/pci/pci.c
> > > > > > > > > @@ -1381,6 +1381,8 @@ uint32_t pci_default_read_config(PCIDevice *d,
> > > > > > > > > {
> > > > > > > > >     uint32_t val = 0;
> > > > > > > > > 
> > > > > > > > > +    assert(address + len <= pci_config_size(d));
> > > > > > > > 
> > > > > > > > Does this allow guest now to crash QEMU? I think it was suggested that
> > > > > > > > assert should only be used for cases that can only arise from a
> > > > > > > > programming error and not from values set by the guest. If this is
> > > > > > > > considered to be an error now to call this function with wrong
> > > > > > > > parameters did you check other callers? I've found a few such as:
> > > > > > > > 
> > > > > > > > hw/scsi/esp-pci.c
> > > > > > > > hw/watchdog/wdt_i6300esb.c
> > > > > > > > hw/ide/cmd646.c
> > > > > > > > hw/vfio/pci.c
> > > > > > > > 
> > > > > > > > and maybe others. Would it be better to not crash just log invalid
> > > > > > > > access and either fix up parameters or return some garbage like 0?
> > > > > > > 
> > > > > > > Yes, maybe I was not clear while reviewing v1, we need to audit the
> > > > > > > callers and fix them first, then we can safely add the assert here.
> > > > > > 
> > > > > > We can add assert here regardless of auditing callers. Doing that
> > > > > > will also make fuzzing easier. But the assert is unrelated to CVE imho.
> > > > > 
> > > > > I wonder why isn't the check added to pci_default_read_config() right away?
> > > > > If we have an assert there the overhead is the same and adding the check
> > > > > there would make it unnecessary to patch all callers so it's just one patch
> > > > > instead of a whole series.
> > > > > 
> > > > > Regards,
> > > > > BALATON Zoltan
> > > > 
> > > > We need to return something, and we can't be sure that callers will
> > > > handle returning random stuff correctly. Callers know what
> > > > to do on errors, we don't.
> > > 
> > > This is an invalid case where behaviour will be undefined anyway so
> > > returning anything such as 0 or -1 is probably OK (what do most hardware
> > > return in this case?).
> > 
> > This is an internal detail of the API. It's not about what hardware
> > returns.  Look at the ati as an example.
> 
> Considering that this function implements reading PCI config space its API
> should align with what happens on hardware normally. You could make it
> unrelated but that does not make much sense other than causing trouble for
> callers.

What happens on hardware is that there's no way to send a device a
transaction that is out of range: on PCI the offset is 8 bits, so <= 0xff,
and on Express it is 12 bits, so <= 0xfff (below 4K).

So this handles something that never happens on real hardware
and it happens because of a bug elsewhere in QEMU.
assert seems appropriate.
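
To make the ranges concrete, here is a rough standalone sketch of the
check (not QEMU code). The names CFG_SIZE_PCI, CFG_SIZE_PCIE,
cfg_access_in_bounds() and cfg_read() are made up for illustration; only
the 256 byte and 4096 byte config space sizes and the
'address + len <= size' condition come from the discussion above. The
little-endian byte assembly in cfg_read() is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Config space sizes: 256 bytes on conventional PCI, 4096 on Express. */
#define CFG_SIZE_PCI   256u
#define CFG_SIZE_PCIE  4096u

/*
 * Hypothetical helper: true if [address, address + len) lies inside a
 * config space of 'size' bytes. Written so that address + len cannot
 * wrap around in 32-bit arithmetic.
 */
static bool cfg_access_in_bounds(uint32_t address, uint32_t len,
                                 uint32_t size)
{
    return len <= size && address <= size - len;
}

/*
 * Hypothetical stand-in for a default config read handler. An
 * out-of-range request is treated as a caller bug and asserted on,
 * rather than being answered with a made-up value.
 */
static uint32_t cfg_read(const uint8_t *config, uint32_t size,
                         uint32_t address, uint32_t len)
{
    uint32_t val = 0;
    uint32_t i;

    assert(len <= sizeof(val));
    assert(cfg_access_in_bounds(address, len, size));

    /* Assemble the value byte by byte, least significant byte first. */
    for (i = 0; i < len; i++) {
        val |= (uint32_t)config[address + i] << (8 * i);
    }
    return val;
}
```

With these definitions a 4-byte read at 0xfc passes on a 256-byte space,
while the same read at 0xfd would trip the assert. A caller that can
receive guest-controlled offsets would have to check the range itself
before calling.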


> > > If callers need better error handling they can do a
> > > check before calling the function but for other (most) callers which will
> > > just return the same random value you would return from
> > > pci_default_read_config() having an assert instead makes it necessary to
> > > modify all of them one by one and doubles the check overhead by
> > > unnecessarily double checking. So I think having a default check and error
> > > handling in pci_default_read_config() would be better so callers who don't
> > > care would work and those few who might care could check before calling or
> > > actually implement their own callback (which I expect they already do as
> > > this is just the default implementation of this callback).
> > 
> > 
> > Basically if you look at the specific example, you will see that it
> > triggers because of a misaligned access which device code never
> > expected. Which memory core should not allow at all.
> > It will likely trigger other bugs, some of them could be
> > security related. assert is a reasonable way to help us catch them in
> > fuzzing.
> 
> The specific example (ati-vga) does expect and should support unaligned
> access.

Then it should set "unaligned = true". It does not seem to do so.

> Not for all regs but for most registers, there's a table in docs
> which says for PCI POS registers (whatever those are) unaligned access is
> supported. This works now, if it should not work without .impl.unaligned or
> some other value set somewhere that should be patched instead.

Argue with the docs/devel/memory.rst about this please, that's not
what it says.


> 
> Regards,
> BALATON Zoltan

Patch

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 70c66965f5..173bec4fd5 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -1381,6 +1381,8 @@  uint32_t pci_default_read_config(PCIDevice *d,
 {
     uint32_t val = 0;
 
+    assert(address + len <= pci_config_size(d));
+
     if (pci_is_express_downstream_port(d) &&
         ranges_overlap(address, len, d->exp.exp_cap + PCI_EXP_LNKSTA, 2)) {
         pcie_sync_bridge_lnk(d);