Message ID | e12898835f25234561c9d7de4435590d957b85d9.1734924854.git.dns@arista.com (mailing list archive) |
---|---|
State | New |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | Re: PCI/ASPM: Fix UAF by disabling ASPM for link when child function is removed | expand |
> On Dec 22, 2024, at 7:39 PM, Daniel Stodden <dns@arista.com> wrote: > > From: Daniel Stodden <daniel.stodden@gmail.com> If this gets accepted — remove that line? Not important, but I don’t know how it got there. Might be because I had to export/import patches between private and corporate machines during test a few times. Thanks, Daniel > Before change 456d8aa37d0f "Disable ASPM on MFD function removal to > avoid use-after-free", we would free the ASPM link only after the last > function on the bus pertaining to the given link was removed. > > That was too late. If function 0 is removed before sibling function, > link->downstream would point to free'd memory after. > > After above change, we freed the ASPM parent link state upon any > function removal on the bus pertaining to a given link. > > That is too early. If the link is to a PCIe switch with MFD on the > upstream port, then removing functions other than 0 first would free a > link which still remains parent_link to the remaining downstream > ports. > > The resulting GPFs are especially frequent during hot-unplug, because > pciehp removes devices on the link bus in reverse order. > > On that switch, function 0 is the virtual P2P bridge to the internal > bus. Free exactly when function 0 is removed -- before the parent link > is obsolete, but after all subordinate links are gone. > > Fixes: 456d8aa37d0f ("PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free") > Signed-off-by: Daniel Stodden <dns@arista.com> > --- > drivers/pci/pcie/aspm.c | 17 +++++++++-------- > 1 file changed, 9 insertions(+), 8 deletions(-) > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > index e0bc90597dca..8ae7c75b408c 100644 > --- a/drivers/pci/pcie/aspm.c > +++ b/drivers/pci/pcie/aspm.c > @@ -1273,16 +1273,16 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) > parent_link = link->parent; > > /* > - * link->downstream is a pointer to the pci_dev of function 0. If > - * we remove that function, the pci_dev is about to be deallocated, > - * so we can't use link->downstream again. Free the link state to > - * avoid this. > + * Free the parent link state, no later than function 0 (i.e. > + * link->downstream) being removed. > * > - * If we're removing a non-0 function, it's possible we could > - * retain the link state, but PCIe r6.0, sec 7.5.3.7, recommends > - * programming the same ASPM Control value for all functions of > - * multi-function devices, so disable ASPM for all of them. > + * Do not free free the link state any earlier. If function 0 > + * is a switch upstream port, this link state is parent_link > + * to all subordinate ones. > */ > + if (pdev != link->downstream) > + goto out; > + > pcie_config_aspm_link(link, 0); > list_del(&link->sibling); > free_link_state(link); > @@ -1293,6 +1293,7 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) > pcie_config_aspm_path(parent_link); > } > > + out: > mutex_unlock(&aspm_lock); > up_read(&pci_bus_sem); > } > -- > 2.47.0 >
On Sun, Dec 22, 2024 at 07:39:08PM -0800, Daniel Stodden wrote: > From: Daniel Stodden <daniel.stodden@gmail.com> > > Before change 456d8aa37d0f "Disable ASPM on MFD function removal to > avoid use-after-free", we would free the ASPM link only after the last > function on the bus pertaining to the given link was removed. > > That was too late. If function 0 is removed before sibling function, > link->downstream would point to free'd memory after. > > After above change, we freed the ASPM parent link state upon any > function removal on the bus pertaining to a given link. > > That is too early. If the link is to a PCIe switch with MFD on the > upstream port, then removing functions other than 0 first would free a > link which still remains parent_link to the remaining downstream > ports. Is this specific to a Switch? It seems like removal of any multi-function device might trip over this. > The resulting GPFs are especially frequent during hot-unplug, because > pciehp removes devices on the link bus in reverse order. Do you have a sample GPF? If we can include a few pertinent lines here it may help people connect a problem with this fix. > On that switch, function 0 is the virtual P2P bridge to the internal > bus. Free exactly when function 0 is removed -- before the parent link > is obsolete, but after all subordinate links are gone. I agree this is a problem. IIUC we allocate pcie_link_state when enumerating a device on the upstream end of a link, i.e., a Root Port or Switch Downstream Port, but we deallocate it when removing a device on the downstream end of the link, i.e., an Endpoint or Switch Upstream Port. This asymmetry seems kind of prone to error. Also, struct pcie_link_state contains redundant information, e.g., the pdev, downstream, parent, and sibling members basically duplicate the hierarchy already described by the struct pci_bus parent, self, and devices members. Redundancy like this is also error prone. This patch is attractive because it's a very small fix, and maybe we should use it for that reason. But I do think we're basically papering over a pretty serious design defect in ASPM. I think we'd ultimately be better off if we allocated pcie_link_state either as a member of struct pci_dev (instead of using a pointer), or perhaps in pci_setup_device() when we set up the rest of the bridge-related things and made it live as long as the bridge device. Actually, if we removed all the redundant pointers in struct pcie_link_state, it would be smaller than a single pointer, so there'd be no reason to allocate it dynamically. Of course this would be a much bigger change to aspm.c. > Fixes: 456d8aa37d0f ("PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free") Do we have any public problem reports we could mention here? I'm actually a little surprised that this hasn't been a bigger problem, given that 456d8aa37d0f appeared in v6.5 in Aug 2023. But maybe there is some unusual topology or hot-unplug involved? > Signed-off-by: Daniel Stodden <dns@arista.com> > --- > drivers/pci/pcie/aspm.c | 17 +++++++++-------- > 1 file changed, 9 insertions(+), 8 deletions(-) > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c > index e0bc90597dca..8ae7c75b408c 100644 > --- a/drivers/pci/pcie/aspm.c > +++ b/drivers/pci/pcie/aspm.c > @@ -1273,16 +1273,16 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) > parent_link = link->parent; > > /* > - * link->downstream is a pointer to the pci_dev of function 0. If > - * we remove that function, the pci_dev is about to be deallocated, > - * so we can't use link->downstream again. Free the link state to > - * avoid this. > + * Free the parent link state, no later than function 0 (i.e. > + * link->downstream) being removed. > * > - * If we're removing a non-0 function, it's possible we could > - * retain the link state, but PCIe r6.0, sec 7.5.3.7, recommends > - * programming the same ASPM Control value for all functions of > - * multi-function devices, so disable ASPM for all of them. > + * Do not free free the link state any earlier. If function 0 > + * is a switch upstream port, this link state is parent_link > + * to all subordinate ones. > */ > + if (pdev != link->downstream) > + goto out; > + > pcie_config_aspm_link(link, 0); > list_del(&link->sibling); > free_link_state(link); > @@ -1293,6 +1293,7 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) > pcie_config_aspm_path(parent_link); > } > > + out: > mutex_unlock(&aspm_lock); > up_read(&pci_bus_sem); > } > -- > 2.47.0 >
Hi, On 1/29/25 1:29 PM, Bjorn Helgaas wrote: > On Sun, Dec 22, 2024 at 07:39:08PM -0800, Daniel Stodden wrote: >> From: Daniel Stodden <daniel.stodden@gmail.com> If possible include where you noticed this issue and any dmesg logs related to it. >> Before change 456d8aa37d0f "Disable ASPM on MFD function removal to >> avoid use-after-free", we would free the ASPM link only after the last >> function on the bus pertaining to the given link was removed. >> >> That was too late. If function 0 is removed before sibling function, >> link->downstream would point to free'd memory after. >> >> After above change, we freed the ASPM parent link state upon any >> function removal on the bus pertaining to a given link. >> >> That is too early. If the link is to a PCIe switch with MFD on the >> upstream port, then removing functions other than 0 first would free a >> link which still remains parent_link to the remaining downstream >> ports. > Is this specific to a Switch? It seems like removal of any > multi-function device might trip over this. > >> The resulting GPFs are especially frequent during hot-unplug, because >> pciehp removes devices on the link bus in reverse order. > Do you have a sample GPF? If we can include a few pertinent lines > here it may help people connect a problem with this fix. > >> On that switch, function 0 is the virtual P2P bridge to the internal >> bus. Free exactly when function 0 is removed -- before the parent link >> is obsolete, but after all subordinate links are gone. > I agree this is a problem. > > IIUC we allocate pcie_link_state when enumerating a device on the > upstream end of a link, i.e., a Root Port or Switch Downstream Port, > but we deallocate it when removing a device on the downstream end of > the link, i.e., an Endpoint or Switch Upstream Port. This asymmetry > seems kind of prone to error. > > Also, struct pcie_link_state contains redundant information, e.g., the > pdev, downstream, parent, and sibling members basically duplicate the > hierarchy already described by the struct pci_bus parent, self, and > devices members. Redundancy like this is also error prone. > > This patch is attractive because it's a very small fix, and maybe we > should use it for that reason. But I do think we're basically > papering over a pretty serious design defect in ASPM. > > I think we'd ultimately be better off if we allocated pcie_link_state > either as a member of struct pci_dev (instead of using a pointer), or > perhaps in pci_setup_device() when we set up the rest of the > bridge-related things and made it live as long as the bridge device. > > Actually, if we removed all the redundant pointers in struct > pcie_link_state, it would be smaller than a single pointer, so there'd > be no reason to allocate it dynamically. > > Of course this would be a much bigger change to aspm.c. Agree. I also think creating the link at the setup would be a better approach. But If this issue is seen in any real hardware now and need a urgent fix, may be we can merge this change for now. > >> Fixes: 456d8aa37d0f ("PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free") > Do we have any public problem reports we could mention here? I'm > actually a little surprised that this hasn't been a bigger problem, > given that 456d8aa37d0f appeared in v6.5 in Aug 2023. But maybe there > is some unusual topology or hot-unplug involved? > >> Signed-off-by: Daniel Stodden <dns@arista.com> >> --- >> drivers/pci/pcie/aspm.c | 17 +++++++++-------- >> 1 file changed, 9 insertions(+), 8 deletions(-) >> >> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c >> index e0bc90597dca..8ae7c75b408c 100644 >> --- a/drivers/pci/pcie/aspm.c >> +++ b/drivers/pci/pcie/aspm.c >> @@ -1273,16 +1273,16 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) >> parent_link = link->parent; >> >> /* >> - * link->downstream is a pointer to the pci_dev of function 0. If >> - * we remove that function, the pci_dev is about to be deallocated, >> - * so we can't use link->downstream again. Free the link state to >> - * avoid this. >> + * Free the parent link state, no later than function 0 (i.e. >> + * link->downstream) being removed. >> * >> - * If we're removing a non-0 function, it's possible we could >> - * retain the link state, but PCIe r6.0, sec 7.5.3.7, recommends >> - * programming the same ASPM Control value for all functions of >> - * multi-function devices, so disable ASPM for all of them. >> + * Do not free free the link state any earlier. If function 0 Above line is not very clear. May be you can remove " Do not free free the link state any earlier" >> + * is a switch upstream port, this link state is parent_link >> + * to all subordinate ones. >> */ >> + if (pdev != link->downstream) >> + goto out; >> + >> pcie_config_aspm_link(link, 0); >> list_del(&link->sibling); >> free_link_state(link); >> @@ -1293,6 +1293,7 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) >> pcie_config_aspm_path(parent_link); >> } >> >> + out: >> mutex_unlock(&aspm_lock); >> up_read(&pci_bus_sem); >> } >> -- >> 2.47.0 >>
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c index e0bc90597dca..8ae7c75b408c 100644 --- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -1273,16 +1273,16 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) parent_link = link->parent; /* - * link->downstream is a pointer to the pci_dev of function 0. If - * we remove that function, the pci_dev is about to be deallocated, - * so we can't use link->downstream again. Free the link state to - * avoid this. + * Free the parent link state, no later than function 0 (i.e. + * link->downstream) being removed. * - * If we're removing a non-0 function, it's possible we could - * retain the link state, but PCIe r6.0, sec 7.5.3.7, recommends - * programming the same ASPM Control value for all functions of - * multi-function devices, so disable ASPM for all of them. + * Do not free free the link state any earlier. If function 0 + * is a switch upstream port, this link state is parent_link + * to all subordinate ones. */ + if (pdev != link->downstream) + goto out; + pcie_config_aspm_link(link, 0); list_del(&link->sibling); free_link_state(link); @@ -1293,6 +1293,7 @@ void pcie_aspm_exit_link_state(struct pci_dev *pdev) pcie_config_aspm_path(parent_link); } + out: mutex_unlock(&aspm_lock); up_read(&pci_bus_sem); }