diff mbox

[V2] device-assignment pci: correct pci config size default for cap version 2 endpoints

Message ID 20110721163733.661.22067.stgit@dddsys0.bos.redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Donald Dutile July 21, 2011, 4:39 p.m. UTC
v2: do local boundary check with respect to legacy PCI header length,
    and don't depend on it in pci_add_capability().
  : fix compilation, and change else>2 to simple else for all other cases.

Doing device assignement using a PCIe device with it's
PCI Cap structure at offset 0xcc showed a problem in
the default size mapped for this cap-id.

The failure caused a corruption which might have gone unnoticed
otherwise.

Fix assigned_device_pci_cap_init() to set the default
size of PCIe Cap structure (cap-id 0x10) to 0x34 instead of 0x3c.
0x34 is default, min, for endpoint device with a cap version of 2.

Add check in assigned_devic_pci_cap_init() to ensure
size of Cap structure doesn't exceed legacy PCI header space,
which is where it must fit.

Signed-off-by: Donald Dutile <ddutile@redhat.com>
cc: Alex Williamson <alex.williamson@redhat.com>
cc: Michael S. Tsirkin <mst@redhat.com>
---

 hw/device-assignment.c |   19 ++++++++++++++++---
 1 files changed, 16 insertions(+), 3 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Donald Dutile July 21, 2011, 5:52 p.m. UTC | #1
On 07/21/2011 01:02 PM, Alex Williamson wrote:
> On Thu, 2011-07-21 at 12:39 -0400, Donald Dutile wrote:
>> v2: do local boundary check with respect to legacy PCI header length,
>>      and don't depend on it in pci_add_capability().
>>    : fix compilation, and change else>2 to simple else for all other cases.
>>
>> Doing device assignement using a PCIe device with it's
>> PCI Cap structure at offset 0xcc showed a problem in
>> the default size mapped for this cap-id.
>>
>> The failure caused a corruption which might have gone unnoticed
>> otherwise.
>>
>> Fix assigned_device_pci_cap_init() to set the default
>> size of PCIe Cap structure (cap-id 0x10) to 0x34 instead of 0x3c.
>> 0x34 is default, min, for endpoint device with a cap version of 2.
>>
>> Add check in assigned_devic_pci_cap_init() to ensure
>> size of Cap structure doesn't exceed legacy PCI header space,
>> which is where it must fit.
>>
>> Signed-off-by: Donald Dutile<ddutile@redhat.com>
>> cc: Alex Williamson<alex.williamson@redhat.com>
>> cc: Michael S. Tsirkin<mst@redhat.com>
>> ---
>>
>>   hw/device-assignment.c |   19 ++++++++++++++++---
>>   1 files changed, 16 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/device-assignment.c b/hw/device-assignment.c
>> index 36ad6b0..6bb8af7 100644
>> --- a/hw/device-assignment.c
>> +++ b/hw/device-assignment.c
>> @@ -1419,21 +1419,34 @@ static int assigned_device_pci_cap_init(PCIDevice *pci_dev)
>>       }
>>
>>       if ((pos = pci_find_cap_offset(pci_dev, PCI_CAP_ID_EXP, 0))) {
>> -        uint8_t version;
>> +        uint8_t version, size;
>>           uint16_t type, devctl, lnkcap, lnksta;
>>           uint32_t devcap;
>> -        int size = 0x3c; /* version 2 size */
>>
>>           version = pci_get_byte(pci_dev->config + pos + PCI_EXP_FLAGS);
>>           version&= PCI_EXP_FLAGS_VERS;
>>           if (version == 1) {
>>               size = 0x14;
>> -        } else if (version>  2) {
>> +        } else if (version == 2) {
>> +            /* don't include slot cap/stat/ctrl 2 regs; only support endpoints */
>> +            size = 0x34;
>> +        } else {
>>               fprintf(stderr, "Unsupported PCI express capability version %d\n",
>>                       version);
>>               return -EINVAL;
>>           }
>>
>> +	/* make sure cap struct resides in legacy hdr space */
>> +	if (size>  PCI_CONFIG_SPACE_SIZE - pos) {
>> +            fprintf(stderr, "ERROR: %04x:%02x:%02x.%x "
>> +                    "Attempt to add PCI Cap Structure 0x%x at offset 0x%x,"
>> +                    "size 0x%x, which exceeds legacy PCI config space 0x%x\n",
>> +                    pci_find_domain(pci_dev->bus), pci_bus_num(pci_dev->bus),
>> +                    PCI_SLOT(pci_dev->devfn), PCI_FUNC(pci_dev->devfn),
>> +                    PCI_CAP_ID_EXP, pos, size, PCI_CONFIG_SPACE_SIZE);
>> +            return -EINVAL;
>> +        }
>> +
>
> This is crazy, why would we only test this for PCI_CAP_ID_EXP?  If the
> test is going to go in device-assignment, we need to wrap
> pci_add_capability and do it for all callers.  However, maybe we can get
> MST to take it in pci_add_capability() if we make the test more
> complete...  something like:
>
> if ((pos<  256&&  size>  256 - pos) ||
>      (pci_config_size()>  256&&  pos>  256&&
>       size>  pci_config_size() - pos)) {
>     ... badness
>
> Thanks,
>
> Alex
>
+1

>>           if ((ret = pci_add_capability(pci_dev, PCI_CAP_ID_EXP,
>>                                         pos, size))<  0) {
>>               return ret;
>>
>
>
>

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael S. Tsirkin July 24, 2011, 8:36 a.m. UTC | #2
On Thu, Jul 21, 2011 at 11:02:53AM -0600, Alex Williamson wrote:
> This is crazy, why would we only test this for PCI_CAP_ID_EXP?  If the
> test is going to go in device-assignment, we need to wrap
> pci_add_capability and do it for all callers.  However, maybe we can get
> MST to take it in pci_add_capability() if we make the test more
> complete...  something like:
> 
> if ((pos < 256 && size > 256 - pos) ||
>     (pci_config_size() > 256 && pos > 256 &&
>      size > pci_config_size() - pos)) {
>    ... badness
> 
> Thanks,
> 
> Alex

We expect device assignment to be able to corrupt
guest memory but not qemu memory. So we must validate
whatever we get from the device, and I think this
validation belongs in device-assignment.c:

IOW I think input should be validated where it's
input, while we still know it's untrusted,
instead of relying on core to validate parameters.

Makes sense?
Alex Williamson July 25, 2011, 8:14 p.m. UTC | #3
On Sun, 2011-07-24 at 11:36 +0300, Michael S. Tsirkin wrote:
> On Thu, Jul 21, 2011 at 11:02:53AM -0600, Alex Williamson wrote:
> > This is crazy, why would we only test this for PCI_CAP_ID_EXP?  If the
> > test is going to go in device-assignment, we need to wrap
> > pci_add_capability and do it for all callers.  However, maybe we can get
> > MST to take it in pci_add_capability() if we make the test more
> > complete...  something like:
> > 
> > if ((pos < 256 && size > 256 - pos) ||
> >     (pci_config_size() > 256 && pos > 256 &&
> >      size > pci_config_size() - pos)) {
> >    ... badness
> > 
> > Thanks,
> > 
> > Alex
> 
> We expect device assignment to be able to corrupt
> guest memory but not qemu memory. So we must validate

Gee, thanks.

> whatever we get from the device, and I think this
> validation belongs in device-assignment.c:
> 
> IOW I think input should be validated where it's
> input, while we still know it's untrusted,
> instead of relying on core to validate parameters.
> 
> Makes sense?

No, not really.  Why should the core "trust" anything?  This is a basic,
simply sanity test, why push it out to the leaf driver when pushing it
into the core provides the sanity test for everyone?  Is it beneath an
emulated driver to pass such a test?  Thanks,

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael S. Tsirkin July 26, 2011, 11:42 a.m. UTC | #4
On Mon, Jul 25, 2011 at 02:14:28PM -0600, Alex Williamson wrote:
> On Sun, 2011-07-24 at 11:36 +0300, Michael S. Tsirkin wrote:
> > On Thu, Jul 21, 2011 at 11:02:53AM -0600, Alex Williamson wrote:
> > > This is crazy, why would we only test this for PCI_CAP_ID_EXP?  If the
> > > test is going to go in device-assignment, we need to wrap
> > > pci_add_capability and do it for all callers.  However, maybe we can get
> > > MST to take it in pci_add_capability() if we make the test more
> > > complete...  something like:
> > > 
> > > if ((pos < 256 && size > 256 - pos) ||
> > >     (pci_config_size() > 256 && pos > 256 &&
> > >      size > pci_config_size() - pos)) {
> > >    ... badness
> > > 
> > > Thanks,
> > > 
> > > Alex
> > 
> > We expect device assignment to be able to corrupt
> > guest memory but not qemu memory. So we must validate
> 
> Gee, thanks.

Ugh, sorry, not device assignment per se.
I really meant an assigned device controlled by
a malicious guest.

> > whatever we get from the device, and I think this
> > validation belongs in device-assignment.c:
> > 
> > IOW I think input should be validated where it's
> > input, while we still know it's untrusted,
> > instead of relying on core to validate parameters.
> > 
> > Makes sense?
> 
> No, not really.  Why should the core "trust" anything?

We generally don't validate most function parameters.
We trust callers to a level because they can corrupt our memory anyway.

> This is a basic, simply sanity test, why push it out to the leaf
> driver when pushing it
> into the core provides the sanity test for everyone?  Is it beneath an
> emulated driver to pass such a test?  Thanks,
> 
> Alex

It's easy to add this to the core but please don't
rely on this: functions validating parameters is
a debugging aid. Validating untrusted input from a guest-controlled
device is a security feature.
Alex Williamson July 26, 2011, 3:45 p.m. UTC | #5
On Tue, 2011-07-26 at 14:42 +0300, Michael S. Tsirkin wrote:
> On Mon, Jul 25, 2011 at 02:14:28PM -0600, Alex Williamson wrote:
> > On Sun, 2011-07-24 at 11:36 +0300, Michael S. Tsirkin wrote:
> > > On Thu, Jul 21, 2011 at 11:02:53AM -0600, Alex Williamson wrote:
> > > > This is crazy, why would we only test this for PCI_CAP_ID_EXP?  If the
> > > > test is going to go in device-assignment, we need to wrap
> > > > pci_add_capability and do it for all callers.  However, maybe we can get
> > > > MST to take it in pci_add_capability() if we make the test more
> > > > complete...  something like:
> > > > 
> > > > if ((pos < 256 && size > 256 - pos) ||
> > > >     (pci_config_size() > 256 && pos > 256 &&
> > > >      size > pci_config_size() - pos)) {
> > > >    ... badness
> > > > 
> > > > Thanks,
> > > > 
> > > > Alex
> > > 
> > > We expect device assignment to be able to corrupt
> > > guest memory but not qemu memory. So we must validate
> > 
> > Gee, thanks.
> 
> Ugh, sorry, not device assignment per se.
> I really meant an assigned device controlled by
> a malicious guest.

Huh?  A malicious guest can't redefine PCI config space for an assigned
device.  This is an initialization phase of parsing the device config
space.

> > > whatever we get from the device, and I think this
> > > validation belongs in device-assignment.c:
> > > 
> > > IOW I think input should be validated where it's
> > > input, while we still know it's untrusted,
> > > instead of relying on core to validate parameters.
> > > 
> > > Makes sense?
> > 
> > No, not really.  Why should the core "trust" anything?
> 
> We generally don't validate most function parameters.
> We trust callers to a level because they can corrupt our memory anyway.

This is device init code though, adding sanity checks doesn't hurt
anything and makes sure emulated device developers don't inadvertently
ignore PCI requirements.

> > This is a basic, simply sanity test, why push it out to the leaf
> > driver when pushing it
> > into the core provides the sanity test for everyone?  Is it beneath an
> > emulated driver to pass such a test?  Thanks,
> > 
> > Alex
> 
> It's easy to add this to the core but please don't
> rely on this: functions validating parameters is
> a debugging aid. Validating untrusted input from a guest-controlled
> device is a security feature.

Some might call it robustness to validate parameters and protect qemu
against all callers.  I hadn't realized that device assignment was such
a second class citizen.  Thanks,

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index 36ad6b0..6bb8af7 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -1419,21 +1419,34 @@  static int assigned_device_pci_cap_init(PCIDevice *pci_dev)
     }
 
     if ((pos = pci_find_cap_offset(pci_dev, PCI_CAP_ID_EXP, 0))) {
-        uint8_t version;
+        uint8_t version, size;
         uint16_t type, devctl, lnkcap, lnksta;
         uint32_t devcap;
-        int size = 0x3c; /* version 2 size */
 
         version = pci_get_byte(pci_dev->config + pos + PCI_EXP_FLAGS);
         version &= PCI_EXP_FLAGS_VERS;
         if (version == 1) {
             size = 0x14;
-        } else if (version > 2) {
+        } else if (version == 2) {
+            /* don't include slot cap/stat/ctrl 2 regs; only support endpoints */
+            size = 0x34;
+        } else {
             fprintf(stderr, "Unsupported PCI express capability version %d\n",
                     version);
             return -EINVAL;
         }
 
+	/* make sure cap struct resides in legacy hdr space */
+	if (size > PCI_CONFIG_SPACE_SIZE - pos) {
+            fprintf(stderr, "ERROR: %04x:%02x:%02x.%x "
+                    "Attempt to add PCI Cap Structure 0x%x at offset 0x%x,"
+                    "size 0x%x, which exceeds legacy PCI config space 0x%x\n",
+                    pci_find_domain(pci_dev->bus), pci_bus_num(pci_dev->bus),
+                    PCI_SLOT(pci_dev->devfn), PCI_FUNC(pci_dev->devfn),
+                    PCI_CAP_ID_EXP, pos, size, PCI_CONFIG_SPACE_SIZE);
+            return -EINVAL;
+        }
+
         if ((ret = pci_add_capability(pci_dev, PCI_CAP_ID_EXP,
                                       pos, size)) < 0) {
             return ret;