diff mbox series

[v2,1/2] VT-d: avoid NULL deref on domain_context_mapping_one() error paths

Message ID 6739cf19-a74a-208d-82e8-28dfde7710f5@suse.com (mailing list archive)
State New, archived
Headers show
Series VT-d: address fallout from XSA-400 | expand

Commit Message

Jan Beulich April 7, 2022, 6:11 a.m. UTC
First there's a printk() which actually wrongly uses pdev in the first
place: We want to log the coordinates of the (perhaps fake) device
acted upon, which may not be pdev.

Then it was quite pointless for eb19326a328d ("VT-d: prepare for per-
device quarantine page tables (part I)") to add a domid_t parameter to
domain_context_unmap_one(): It's only used to pass back here via
me_wifi_quirk() -> map_me_phantom_function(). Drop the parameter again.

Finally there's the invocation of domain_context_mapping_one(), which
needs to be passed the correct domain ID. Avoid taking that path when
pdev is NULL and the quarantine state is what would need restoring to.
This means we can't security-support PCI devices with RMRRs (if such
exist in practice) any longer.

Fixes: 8f41e481b485 ("VT-d: re-assign devices directly")
Fixes: 14dd241aad8a ("IOMMU/x86: use per-device page tables for quarantining")
Coverity ID: 1503784
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Drop SUPPORT.md addition. Adjust comment. Extend another comment.

Comments

Roger Pau Monné April 7, 2022, 7:41 a.m. UTC | #1
On Thu, Apr 07, 2022 at 08:11:06AM +0200, Jan Beulich wrote:
> First there's a printk() which actually wrongly uses pdev in the first
> place: We want to log the coordinates of the (perhaps fake) device
> acted upon, which may not be pdev.
> 
> Then it was quite pointless for eb19326a328d ("VT-d: prepare for per-
> device quarantine page tables (part I)") to add a domid_t parameter to
> domain_context_unmap_one(): It's only used to pass back here via
> me_wifi_quirk() -> map_me_phantom_function(). Drop the parameter again.
> 
> Finally there's the invocation of domain_context_mapping_one(), which
> needs to be passed the correct domain ID. Avoid taking that path when
> pdev is NULL and the quarantine state is what would need restoring to.
> This means we can't security-support PCI devices with RMRRs (if such
> exist in practice) any longer.

The sentence:

"This means we can't security-support PCI devices with RMRRs"

Seems too broad and could lead to confusion. So I would maybe use:
"legacy PCI devices" or "non PCI Express devices".

> 
> Fixes: 8f41e481b485 ("VT-d: re-assign devices directly")
> Fixes: 14dd241aad8a ("IOMMU/x86: use per-device page tables for quarantining")
> Coverity ID: 1503784
> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.
Jan Beulich April 7, 2022, 7:50 a.m. UTC | #2
On 07.04.2022 09:41, Roger Pau Monné wrote:
> On Thu, Apr 07, 2022 at 08:11:06AM +0200, Jan Beulich wrote:
>> First there's a printk() which actually wrongly uses pdev in the first
>> place: We want to log the coordinates of the (perhaps fake) device
>> acted upon, which may not be pdev.
>>
>> Then it was quite pointless for eb19326a328d ("VT-d: prepare for per-
>> device quarantine page tables (part I)") to add a domid_t parameter to
>> domain_context_unmap_one(): It's only used to pass back here via
>> me_wifi_quirk() -> map_me_phantom_function(). Drop the parameter again.
>>
>> Finally there's the invocation of domain_context_mapping_one(), which
>> needs to be passed the correct domain ID. Avoid taking that path when
>> pdev is NULL and the quarantine state is what would need restoring to.
>> This means we can't security-support PCI devices with RMRRs (if such
>> exist in practice) any longer.
> 
> The sentence:
> 
> "This means we can't security-support PCI devices with RMRRs"
> 
> Seems too broad and could lead to confusion. So I would maybe use:
> "legacy PCI devices" or "non PCI Express devices".

Right. I did actually forget to either drop or edit that sentence. I've
now extended this to

"This means we can't security-support non-PCI-Express devices with RMRRs
 (if such exist in practice) any longer; note that as of trhe 1st of the
 two commits referenced below assigning them to DomU-s is unsupported
 anyway."

>> Fixes: 8f41e481b485 ("VT-d: re-assign devices directly")
>> Fixes: 14dd241aad8a ("IOMMU/x86: use per-device page tables for quarantining")
>> Coverity ID: 1503784
>> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

Jan
Jason Andryuk April 7, 2022, 4:31 p.m. UTC | #3
Hi,

On Thu, Apr 7, 2022 at 3:50 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 07.04.2022 09:41, Roger Pau Monné wrote:
> > On Thu, Apr 07, 2022 at 08:11:06AM +0200, Jan Beulich wrote:
> >> First there's a printk() which actually wrongly uses pdev in the first
> >> place: We want to log the coordinates of the (perhaps fake) device
> >> acted upon, which may not be pdev.
> >>
> >> Then it was quite pointless for eb19326a328d ("VT-d: prepare for per-
> >> device quarantine page tables (part I)") to add a domid_t parameter to
> >> domain_context_unmap_one(): It's only used to pass back here via
> >> me_wifi_quirk() -> map_me_phantom_function(). Drop the parameter again.
> >>
> >> Finally there's the invocation of domain_context_mapping_one(), which
> >> needs to be passed the correct domain ID. Avoid taking that path when
> >> pdev is NULL and the quarantine state is what would need restoring to.
> >> This means we can't security-support PCI devices with RMRRs (if such
> >> exist in practice) any longer.
> >
> > The sentence:
> >
> > "This means we can't security-support PCI devices with RMRRs"
> >
> > Seems too broad and could lead to confusion. So I would maybe use:
> > "legacy PCI devices" or "non PCI Express devices".
>
> Right. I did actually forget to either drop or edit that sentence. I've
> now extended this to
>
> "This means we can't security-support non-PCI-Express devices with RMRRs
>  (if such exist in practice) any longer; note that as of trhe 1st of the
>  two commits referenced below assigning them to DomU-s is unsupported
>  anyway."

Mentioning "Express" makes the support statement clearer.  However,
I'm not clear on what unsupported means in "assigning them to DomU-s
is unsupported anyway".  They can't be assigned?  I'm testing with
staging-4.16, so with XSA-400, but not this patch.  I seemingly have a
legacy PCI device still being assigned to a domU.

It is an 8th Gen Intel laptop with:
00:14.0 USB controller: Intel Corporation Cannon Point-LP USB 3.1 xHCI
Controller (rev 30) (prog-if 30 [XHCI]).

lspci output for 00:14.0 does *not* have capability "Express (v2)",
but it does have an RMRR:
(XEN) [VT-D]found ACPI_DMAR_RMRR:
(XEN) [VT-D] endpoint: 0000:00:14.0

It looks like it is PCI compared to 39:00.0 which does have Express (v2):
(XEN) [VT-D]d1:PCI: map 0000:00:14.0
(XEN) [VT-D]d1:PCIe: map 0000:39:00.0

As I understand it, an RMRR is common with USB controllers for
implementing legacy mouse & keyboard support.  The Cannon Point PCH is
fairly modern, so I'd expect it to use PCI Express.  Xen seems to
identify it as DEV_TYPE_PCI given "PCI" above.  It is an integrated
PCI device, so it has no upstream bridge.  Maybe that is why it can
still be assigned?  The "unsupported assignment" is a legacy PCI
device, behind a bridge, with an RMRR?

Thanks,
Jason
Tian, Kevin April 8, 2022, 4:02 a.m. UTC | #4
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Thursday, April 7, 2022 3:50 PM
> 
> On 07.04.2022 09:41, Roger Pau Monné wrote:
> > On Thu, Apr 07, 2022 at 08:11:06AM +0200, Jan Beulich wrote:
> >> First there's a printk() which actually wrongly uses pdev in the first
> >> place: We want to log the coordinates of the (perhaps fake) device
> >> acted upon, which may not be pdev.
> >>
> >> Then it was quite pointless for eb19326a328d ("VT-d: prepare for per-
> >> device quarantine page tables (part I)") to add a domid_t parameter to
> >> domain_context_unmap_one(): It's only used to pass back here via
> >> me_wifi_quirk() -> map_me_phantom_function(). Drop the parameter
> again.
> >>
> >> Finally there's the invocation of domain_context_mapping_one(), which
> >> needs to be passed the correct domain ID. Avoid taking that path when
> >> pdev is NULL and the quarantine state is what would need restoring to.
> >> This means we can't security-support PCI devices with RMRRs (if such
> >> exist in practice) any longer.
> >
> > The sentence:
> >
> > "This means we can't security-support PCI devices with RMRRs"
> >
> > Seems too broad and could lead to confusion. So I would maybe use:
> > "legacy PCI devices" or "non PCI Express devices".
> 
> Right. I did actually forget to either drop or edit that sentence. I've
> now extended this to
> 
> "This means we can't security-support non-PCI-Express devices with RMRRs
>  (if such exist in practice) any longer; note that as of trhe 1st of the
>  two commits referenced below assigning them to DomU-s is unsupported
>  anyway."
> 

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Jan Beulich April 8, 2022, 6:03 a.m. UTC | #5
On 07.04.2022 18:31, Jason Andryuk wrote:
> Hi,
> 
> On Thu, Apr 7, 2022 at 3:50 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 07.04.2022 09:41, Roger Pau Monné wrote:
>>> On Thu, Apr 07, 2022 at 08:11:06AM +0200, Jan Beulich wrote:
>>>> First there's a printk() which actually wrongly uses pdev in the first
>>>> place: We want to log the coordinates of the (perhaps fake) device
>>>> acted upon, which may not be pdev.
>>>>
>>>> Then it was quite pointless for eb19326a328d ("VT-d: prepare for per-
>>>> device quarantine page tables (part I)") to add a domid_t parameter to
>>>> domain_context_unmap_one(): It's only used to pass back here via
>>>> me_wifi_quirk() -> map_me_phantom_function(). Drop the parameter again.
>>>>
>>>> Finally there's the invocation of domain_context_mapping_one(), which
>>>> needs to be passed the correct domain ID. Avoid taking that path when
>>>> pdev is NULL and the quarantine state is what would need restoring to.
>>>> This means we can't security-support PCI devices with RMRRs (if such
>>>> exist in practice) any longer.
>>>
>>> The sentence:
>>>
>>> "This means we can't security-support PCI devices with RMRRs"
>>>
>>> Seems too broad and could lead to confusion. So I would maybe use:
>>> "legacy PCI devices" or "non PCI Express devices".
>>
>> Right. I did actually forget to either drop or edit that sentence. I've
>> now extended this to
>>
>> "This means we can't security-support non-PCI-Express devices with RMRRs
>>  (if such exist in practice) any longer; note that as of trhe 1st of the
>>  two commits referenced below assigning them to DomU-s is unsupported
>>  anyway."
> 
> Mentioning "Express" makes the support statement clearer.  However,
> I'm not clear on what unsupported means in "assigning them to DomU-s
> is unsupported anyway".  They can't be assigned?  I'm testing with
> staging-4.16, so with XSA-400, but not this patch.  I seemingly have a
> legacy PCI device still being assigned to a domU.
> 
> It is an 8th Gen Intel laptop with:
> 00:14.0 USB controller: Intel Corporation Cannon Point-LP USB 3.1 xHCI
> Controller (rev 30) (prog-if 30 [XHCI]).
> 
> lspci output for 00:14.0 does *not* have capability "Express (v2)",
> but it does have an RMRR:
> (XEN) [VT-D]found ACPI_DMAR_RMRR:
> (XEN) [VT-D] endpoint: 0000:00:14.0
> 
> It looks like it is PCI compared to 39:00.0 which does have Express (v2):
> (XEN) [VT-D]d1:PCI: map 0000:00:14.0
> (XEN) [VT-D]d1:PCIe: map 0000:39:00.0
> 
> As I understand it, an RMRR is common with USB controllers for
> implementing legacy mouse & keyboard support.  The Cannon Point PCH is
> fairly modern, so I'd expect it to use PCI Express.  Xen seems to
> identify it as DEV_TYPE_PCI given "PCI" above.  It is an integrated
> PCI device, so it has no upstream bridge.  Maybe that is why it can
> still be assigned?  The "unsupported assignment" is a legacy PCI
> device, behind a bridge, with an RMRR?

Ah yes - the "behind a bridge" aspect does matter (but I can't
adjust the description of an already committed patch). That's both
for the respective part of the XSA-400 series as well as for the
change here.

Jan
Jason Andryuk April 8, 2022, 12:12 p.m. UTC | #6
On Fri, Apr 8, 2022 at 2:03 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 07.04.2022 18:31, Jason Andryuk wrote:
> > As I understand it, an RMRR is common with USB controllers for
> > implementing legacy mouse & keyboard support.  The Cannon Point PCH is
> > fairly modern, so I'd expect it to use PCI Express.  Xen seems to
> > identify it as DEV_TYPE_PCI given "PCI" above.  It is an integrated
> > PCI device, so it has no upstream bridge.  Maybe that is why it can
> > still be assigned?  The "unsupported assignment" is a legacy PCI
> > device, behind a bridge, with an RMRR?
>
> Ah yes - the "behind a bridge" aspect does matter (but I can't
> adjust the description of an already committed patch). That's both
> for the respective part of the XSA-400 series as well as for the
> change here.

Ok.  Thank you for confirming.  It is captured in the mailing list
archive at least.

Regards,
Jason
diff mbox series

Patch

--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -85,7 +85,7 @@  int domain_context_mapping_one(struct do
                                const struct pci_dev *pdev, domid_t domid,
                                paddr_t pgd_maddr, unsigned int mode);
 int domain_context_unmap_one(struct domain *domain, struct vtd_iommu *iommu,
-                             uint8_t bus, uint8_t devfn, domid_t domid);
+                             uint8_t bus, uint8_t devfn);
 int cf_check intel_iommu_get_reserved_device_memory(
     iommu_grdm_t *func, void *ctxt);
 
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1533,7 +1533,7 @@  int domain_context_mapping_one(
                 check_cleanup_domid_map(domain, pdev, iommu);
             printk(XENLOG_ERR
                    "%pp: unexpected context entry %016lx_%016lx (expected %016lx_%016lx)\n",
-                   &PCI_SBDF3(pdev->seg, pdev->bus, devfn),
+                   &PCI_SBDF3(seg, bus, devfn),
                    (uint64_t)(res >> 64), (uint64_t)res,
                    (uint64_t)(old >> 64), (uint64_t)old);
             rc = -EILSEQ;
@@ -1601,9 +1601,14 @@  int domain_context_mapping_one(
 
     if ( rc )
     {
-        if ( !prev_dom )
-            ret = domain_context_unmap_one(domain, iommu, bus, devfn,
-                                           DEVICE_DOMID(domain, pdev));
+        if ( !prev_dom ||
+             /*
+              * Unmapping here means DEV_TYPE_PCI devices with RMRRs (if such
+              * exist) would cause problems if such a region was actually
+              * accessed.
+              */
+             (prev_dom == dom_io && !pdev) )
+            ret = domain_context_unmap_one(domain, iommu, bus, devfn);
         else if ( prev_dom != domain ) /* Avoid infinite recursion. */
             ret = domain_context_mapping_one(prev_dom, iommu, bus, devfn, pdev,
                                              DEVICE_DOMID(prev_dom, pdev),
@@ -1744,7 +1749,9 @@  static int domain_context_mapping(struct
          * Strictly speaking if the device is the only one behind this bridge
          * and the only one with this (secbus,0,0) tuple, it could be allowed
          * to be re-assigned regardless of RMRR presence.  But let's deal with
-         * that case only if it is actually found in the wild.
+         * that case only if it is actually found in the wild.  Note that
+         * dealing with this just here would still not render the operation
+         * secure.
          */
         else if ( prev_present && (mode & MAP_WITH_RMRR) &&
                   domain != pdev->domain )
@@ -1809,7 +1816,7 @@  static int domain_context_mapping(struct
 int domain_context_unmap_one(
     struct domain *domain,
     struct vtd_iommu *iommu,
-    uint8_t bus, uint8_t devfn, domid_t domid)
+    uint8_t bus, uint8_t devfn)
 {
     struct context_entry *context, *context_entries;
     u64 maddr;
@@ -1867,7 +1874,8 @@  int domain_context_unmap_one(
     unmap_vtd_domain_page(context_entries);
 
     if ( !iommu->drhd->segment && !rc )
-        rc = me_wifi_quirk(domain, bus, devfn, domid, 0, UNMAP_ME_PHANTOM_FUNC);
+        rc = me_wifi_quirk(domain, bus, devfn, DOMID_INVALID, 0,
+                           UNMAP_ME_PHANTOM_FUNC);
 
     if ( rc && !is_hardware_domain(domain) && domain != dom_io )
     {
@@ -1916,8 +1924,7 @@  static const struct acpi_drhd_unit *doma
         if ( iommu_debug )
             printk(VTDPREFIX "%pd:PCIe: unmap %pp\n",
                    domain, &PCI_SBDF3(seg, bus, devfn));
-        ret = domain_context_unmap_one(domain, iommu, bus, devfn,
-                                       DEVICE_DOMID(domain, pdev));
+        ret = domain_context_unmap_one(domain, iommu, bus, devfn);
         if ( !ret && devfn == pdev->devfn && ats_device(pdev, drhd) > 0 )
             disable_ats_device(pdev);
 
@@ -1930,8 +1937,7 @@  static const struct acpi_drhd_unit *doma
         if ( iommu_debug )
             printk(VTDPREFIX "%pd:PCI: unmap %pp\n",
                    domain, &PCI_SBDF3(seg, bus, devfn));
-        ret = domain_context_unmap_one(domain, iommu, bus, devfn,
-                                       DEVICE_DOMID(domain, pdev));
+        ret = domain_context_unmap_one(domain, iommu, bus, devfn);
         if ( ret )
             break;
 
@@ -1954,12 +1960,10 @@  static const struct acpi_drhd_unit *doma
             break;
         }
 
-        ret = domain_context_unmap_one(domain, iommu, tmp_bus, tmp_devfn,
-                                       DEVICE_DOMID(domain, pdev));
+        ret = domain_context_unmap_one(domain, iommu, tmp_bus, tmp_devfn);
         /* PCIe to PCI/PCIx bridge */
         if ( !ret && pdev_type(seg, tmp_bus, tmp_devfn) == DEV_TYPE_PCIe2PCI_BRIDGE )
-            ret = domain_context_unmap_one(domain, iommu, secbus, 0,
-                                           DEVICE_DOMID(domain, pdev));
+            ret = domain_context_unmap_one(domain, iommu, secbus, 0);
 
         break;
 
--- a/xen/drivers/passthrough/vtd/quirks.c
+++ b/xen/drivers/passthrough/vtd/quirks.c
@@ -427,7 +427,7 @@  static int __must_check map_me_phantom_f
                                         domid, pgd_maddr, mode);
     else
         rc = domain_context_unmap_one(domain, drhd->iommu, 0,
-                                      PCI_DEVFN(dev, 7), domid);
+                                      PCI_DEVFN(dev, 7));
 
     return rc;
 }