Message ID | 20170512181107.13853.34680.stgit@localhost.localdomain (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On Fri, May 12, 2017 at 11:13:57AM -0700, Alexander Duyck wrote: > From: Alexander Duyck <alexander.h.duyck@intel.com> > > This patch is meant to address issues seen when performing on FLR on some > systems as the long wait can result in a master abort when reading a > function that is not yet ready. > > To prevent the master abort from being reported up we should disable > reporting for it while waiting on the reset. Once the reset is completed > then we can re-enable the master abort for the bus. > > Fixes: 5adecf817dd6 ("PCI: Wait for up to 1000ms after FLR reset") > > Reported-by: Brian Norris <briannorris@chromium.org> > Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> > --- > > I haven't been able to test this code as I don't have a system that can > reproduce the issue. The fix itself is based on the issue as reported by > Brian so I am hoping he can test this on the Samsung Chromebook Plus with > RK399 OP1 that this was seen on. FYI, it's "RK3399" not "RK399". No harm done :) > drivers/pci/pci.c | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 7904d02ffdb9..acbdbabeaa0e 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -3758,14 +3758,31 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev) > */ > static void pci_flr_wait(struct pci_dev *dev) > { > + struct pci_dev *bridge = dev->bus->self; > int i = 0; > + u16 bctl; > u32 id; > > + /* > + * Disable MasterAbortMode while we are waiting to avoid reporting > + * bus errors for reading the command register on a device that is > + * not ready (in some architectures) I assume it's intentional to only do this *after* you've started the reset (but before you start polling)? > + */ > + if (bridge) { > + pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &bctl); > + pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, > + bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT); > + } > + > do { > msleep(100); > pci_read_config_dword(dev, PCI_COMMAND, &id); IIUC, the patch works fine, in that I no longer get an abort from the RC. > } while (i++ < 10 && id == ~0); BTW, the RK3399 host controller doesn't actually return all 1's on aborted transactions. So this loop doesn't really work for it at all. I take it that this (seemingly widely used) pattern all derives from the PCI spec, which notes: "The host bus bridge, in PC compatible systems, must return all 1's on a read transaction and discard data on a write transaction when terminated with Master-Abort." RK3399 is far from a "PC compatible system", but I don't know if that's a real excuse here... > > + /* restore bridge state */ > + if (bridge) > + pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bctl); > + > if (id == ~0) > dev_warn(&dev->dev, "Failed to return from FLR\n"); And there's no actual error code returned here; so since several of the error cases I can trigger on my hardware seem to never actually recover (no longer how long I wait and/or poll), it seems like we should propagate the error upward. Right now, we still unconditionally continue with the pci_dev_restore(), even though that's doomed to abort too... So there may be several implementation bugs with RK3399. I don't know how much can be fixed on Rockchip's side, vs. how much can be accomodated in the PCI core. Brian > else if (i > 1) >
On Mon, May 15, 2017 at 9:36 AM, Brian Norris <briannorris@chromium.org> wrote: > On Fri, May 12, 2017 at 11:13:57AM -0700, Alexander Duyck wrote: >> From: Alexander Duyck <alexander.h.duyck@intel.com> >> >> This patch is meant to address issues seen when performing on FLR on some >> systems as the long wait can result in a master abort when reading a >> function that is not yet ready. >> >> To prevent the master abort from being reported up we should disable >> reporting for it while waiting on the reset. Once the reset is completed >> then we can re-enable the master abort for the bus. >> >> Fixes: 5adecf817dd6 ("PCI: Wait for up to 1000ms after FLR reset") >> >> Reported-by: Brian Norris <briannorris@chromium.org> >> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> >> --- >> >> I haven't been able to test this code as I don't have a system that can >> reproduce the issue. The fix itself is based on the issue as reported by >> Brian so I am hoping he can test this on the Samsung Chromebook Plus with >> RK399 OP1 that this was seen on. > > FYI, it's "RK3399" not "RK399". No harm done :) I will try to update my patch in case there is a future re-submission. >> drivers/pci/pci.c | 17 +++++++++++++++++ >> 1 file changed, 17 insertions(+) >> >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> index 7904d02ffdb9..acbdbabeaa0e 100644 >> --- a/drivers/pci/pci.c >> +++ b/drivers/pci/pci.c >> @@ -3758,14 +3758,31 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev) >> */ >> static void pci_flr_wait(struct pci_dev *dev) >> { >> + struct pci_dev *bridge = dev->bus->self; >> int i = 0; >> + u16 bctl; >> u32 id; >> >> + /* >> + * Disable MasterAbortMode while we are waiting to avoid reporting >> + * bus errors for reading the command register on a device that is >> + * not ready (in some architectures) > > I assume it's intentional to only do this *after* you've started the > reset (but before you start polling)? Yes. The general idea is that the only thing that should be accessing the device is us. So this way we can catch any other code that might be broken such as a device driver that is leaving a thread active that is accessing the device before a reset. >> + */ >> + if (bridge) { >> + pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &bctl); >> + pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, >> + bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT); >> + } >> + >> do { >> msleep(100); >> pci_read_config_dword(dev, PCI_COMMAND, &id); > > IIUC, the patch works fine, in that I no longer get an abort from the > RC. So did you actually test this patch or are you just speculating it should work? I'm just not sure what you mean by "IIUC" in this context here, were you referring to how the patch fixes the issue or how the testing should work for verifying it fixed the issue? >> } while (i++ < 10 && id == ~0); > > BTW, the RK3399 host controller doesn't actually return all 1's on > aborted transactions. So this loop doesn't really work for it at all. > > I take it that this (seemingly widely used) pattern all derives from the > PCI spec, which notes: > > "The host bus bridge, in PC compatible systems, must return all 1's on > a read transaction and discard data on a write transaction when > terminated with Master-Abort." > > RK3399 is far from a "PC compatible system", but I don't know if that's > a real excuse here... So if it doesn't return all 1's does it return all 0's? I'm just curious what is returned? Also do we know why we are reading the PCI_COMMAND register instead of just checking for the device ID? Just wondering because the easiest solution would be to copy out the logic in pci_bus_read_dev_vendor_id() if all we are doing is waiting on the configuration space to begin responding to requests again. >> >> + /* restore bridge state */ >> + if (bridge) >> + pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bctl); >> + >> if (id == ~0) >> dev_warn(&dev->dev, "Failed to return from FLR\n"); > > And there's no actual error code returned here; so since several of the > error cases I can trigger on my hardware seem to never actually recover > (no longer how long I wait and/or poll), it seems like we should > propagate the error upward. Right now, we still unconditionally continue > with the pci_dev_restore(), even though that's doomed to abort too... If you take a look we ignore the return value anyway and call the restore function in pci_reset_function(). I think the general idea is we should still have a device here after attempting to perform the reset. It may be that performing an FLR isn't the right choice for this hardware. Do you know if it functions correctly and actually completes the reset correctly? I know in the case of some of the Intel Ethernet silicon we were advertising support for it but found an issue that was causing the Ethernet PHY to go offline as a result of performing the FLR. Do you know what it was that was kicking off the need to perform the reset in the first place? Also you may consider possibly adding a quirk if the FLR is kicking your device completely out of the system. There was recently a PCI flag added to indicate a device doesn't support FLR. You might try adding your own version of the quirk for this device to see if maybe pushing it to use a different sort of reset might be more useful. > So there may be several implementation bugs with RK3399. I don't know > how much can be fixed on Rockchip's side, vs. how much can be > accomodated in the PCI core. > > Brian This is the kind of thing we have PCIe quirks for if nothing else. Odds are we should be able to work around it, it is just a matter of knowing what all the quirks are. - Alex
On Mon, 15 May 2017 12:48:30 -0700 Alexander Duyck <alexander.duyck@gmail.com> wrote: > On Mon, May 15, 2017 at 9:36 AM, Brian Norris <briannorris@chromium.org> wrote: > > On Fri, May 12, 2017 at 11:13:57AM -0700, Alexander Duyck wrote: > >> From: Alexander Duyck <alexander.h.duyck@intel.com> > >> > >> This patch is meant to address issues seen when performing on FLR on some > >> systems as the long wait can result in a master abort when reading a > >> function that is not yet ready. > >> > >> To prevent the master abort from being reported up we should disable > >> reporting for it while waiting on the reset. Once the reset is completed > >> then we can re-enable the master abort for the bus. > >> > >> Fixes: 5adecf817dd6 ("PCI: Wait for up to 1000ms after FLR reset") > >> > >> Reported-by: Brian Norris <briannorris@chromium.org> > >> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> > >> --- > >> > >> I haven't been able to test this code as I don't have a system that can > >> reproduce the issue. The fix itself is based on the issue as reported by > >> Brian so I am hoping he can test this on the Samsung Chromebook Plus with > >> RK399 OP1 that this was seen on. > > > > FYI, it's "RK3399" not "RK399". No harm done :) > > I will try to update my patch in case there is a future re-submission. > > >> drivers/pci/pci.c | 17 +++++++++++++++++ > >> 1 file changed, 17 insertions(+) > >> > >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > >> index 7904d02ffdb9..acbdbabeaa0e 100644 > >> --- a/drivers/pci/pci.c > >> +++ b/drivers/pci/pci.c > >> @@ -3758,14 +3758,31 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev) > >> */ > >> static void pci_flr_wait(struct pci_dev *dev) > >> { > >> + struct pci_dev *bridge = dev->bus->self; > >> int i = 0; > >> + u16 bctl; > >> u32 id; > >> > >> + /* > >> + * Disable MasterAbortMode while we are waiting to avoid reporting > >> + * bus errors for reading the command register on a device that is > >> + * not ready (in some architectures) > > > > I assume it's intentional to only do this *after* you've started the > > reset (but before you start polling)? > > Yes. The general idea is that the only thing that should be accessing > the device is us. So this way we can catch any other code that might > be broken such as a device driver that is leaving a thread active that > is accessing the device before a reset. > > >> + */ > >> + if (bridge) { > >> + pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &bctl); > >> + pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, > >> + bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT); > >> + } > >> + > >> do { > >> msleep(100); > >> pci_read_config_dword(dev, PCI_COMMAND, &id); > > > > IIUC, the patch works fine, in that I no longer get an abort from the > > RC. > > So did you actually test this patch or are you just speculating it > should work? I'm just not sure what you mean by "IIUC" in this context > here, were you referring to how the patch fixes the issue or how the > testing should work for verifying it fixed the issue? > > >> } while (i++ < 10 && id == ~0); > > > > BTW, the RK3399 host controller doesn't actually return all 1's on > > aborted transactions. So this loop doesn't really work for it at all. > > > > I take it that this (seemingly widely used) pattern all derives from the > > PCI spec, which notes: > > > > "The host bus bridge, in PC compatible systems, must return all 1's on > > a read transaction and discard data on a write transaction when > > terminated with Master-Abort." > > > > RK3399 is far from a "PC compatible system", but I don't know if that's > > a real excuse here... > > So if it doesn't return all 1's does it return all 0's? I'm just > curious what is returned? > > Also do we know why we are reading the PCI_COMMAND register instead of > just checking for the device ID? Just wondering because the easiest > solution would be to copy out the logic in > pci_bus_read_dev_vendor_id() if all we are doing is waiting on the > configuration space to begin responding to requests again. Because SR-IOV VFs don't implement the first dword. Thanks, Alex
On Mon, May 15, 2017 at 12:48:30PM -0700, Alexander Duyck wrote: > On Mon, May 15, 2017 at 9:36 AM, Brian Norris <briannorris@chromium.org> wrote: > > On Fri, May 12, 2017 at 11:13:57AM -0700, Alexander Duyck wrote: > >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > >> index 7904d02ffdb9..acbdbabeaa0e 100644 > >> --- a/drivers/pci/pci.c > >> +++ b/drivers/pci/pci.c > >> @@ -3758,14 +3758,31 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev) > >> */ > >> static void pci_flr_wait(struct pci_dev *dev) > >> { > >> + struct pci_dev *bridge = dev->bus->self; > >> int i = 0; > >> + u16 bctl; > >> u32 id; > >> > >> + /* > >> + * Disable MasterAbortMode while we are waiting to avoid reporting > >> + * bus errors for reading the command register on a device that is > >> + * not ready (in some architectures) > > > > I assume it's intentional to only do this *after* you've started the > > reset (but before you start polling)? > > Yes. The general idea is that the only thing that should be accessing > the device is us. So this way we can catch any other code that might > be broken such as a device driver that is leaving a thread active that > is accessing the device before a reset. I figured as much (and that makes sense). Thus, this patch is independent of the report I made in the first place -- that somehow MSI interrupts can still be delivered from my EP while we're in the midst of FLR. > >> + */ > >> + if (bridge) { > >> + pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &bctl); > >> + pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, > >> + bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT); > >> + } > >> + > >> do { > >> msleep(100); > >> pci_read_config_dword(dev, PCI_COMMAND, &id); > > > > IIUC, the patch works fine, in that I no longer get an abort from the > > RC. > > So did you actually test this patch or are you just speculating it > should work? I'm just not sure what you mean by "IIUC" in this context > here, were you referring to how the patch fixes the issue or how the > testing should work for verifying it fixed the issue? I did test it, but due to the below issue (the timeout loop doesn't actually work for me, and then we still abort / panic when the device doesn't respond later) it's hard to tell whether this is very effective. The precise reason why I use "IIUC": even with this patch, my system can crash with asynchronous aborts (ARM64 SError) just after FLR. I *think* that means the above pci_read_config_dword() is no longer causing an abort, but it's hard to tell for sure what exactly caused the abort at all when looking at asynchronous errors (I can add extra delays, prints, etc., to get a pretty good idea, but that isn't exactly precise, and it's time consuming). I *suspect* that I'm now just crashing due to pci_dev_restore(). > >> } while (i++ < 10 && id == ~0); > > > > BTW, the RK3399 host controller doesn't actually return all 1's on > > aborted transactions. So this loop doesn't really work for it at all. > > > > I take it that this (seemingly widely used) pattern all derives from the > > PCI spec, which notes: > > > > "The host bus bridge, in PC compatible systems, must return all 1's on > > a read transaction and discard data on a write transaction when > > terminated with Master-Abort." > > > > RK3399 is far from a "PC compatible system", but I don't know if that's > > a real excuse here... > > So if it doesn't return all 1's does it return all 0's? I'm just > curious what is returned? All 0's. > >> > >> + /* restore bridge state */ > >> + if (bridge) > >> + pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bctl); > >> + > >> if (id == ~0) > >> dev_warn(&dev->dev, "Failed to return from FLR\n"); > > > > And there's no actual error code returned here; so since several of the > > error cases I can trigger on my hardware seem to never actually recover > > (no longer how long I wait and/or poll), it seems like we should > > propagate the error upward. Right now, we still unconditionally continue > > with the pci_dev_restore(), even though that's doomed to abort too... > > If you take a look we ignore the return value anyway and call the > restore function in pci_reset_function(). Right, that's what I was referring to. > I think the general idea is > we should still have a device here after attempting to perform the > reset. That assumption doesn't always seem to hold true here. > It may be that performing an FLR isn't the right choice for this > hardware. Do you know if it functions correctly and actually completes > the reset correctly? I know in the case of some of the Intel Ethernet > silicon we were advertising support for it but found an issue that was > causing the Ethernet PHY to go offline as a result of performing the > FLR. I don't know exactly the cause for why this Wifi card aborts vs. when it resets properly. A high percentage of the time, this card seems to reset properly, but I am able to trigger certain cases (due to a buggy Wifi driver; what's new?) that seem to get the entire card into a bad state such that it no longer reset properly. I can pursue the source of such driver bugs, but if I know anything about this vendor, there's never going to be an acceptably-close-to-100% certainty that there won't be such bugs again. > Do you know what it was that was kicking off the need to perform > the reset in the first place? It's now baked into the mwifiex Wifi driver. It always has had a fallback such that if the Wifi FW stops responding for whatever reason [1], we can automatically unload the netdev / reset the device / reload the netdev. This was supported for SDIO previously, but was added to the PCIe driver in commit 4c5dae59d2e9 ("mwifiex: add PCIe function level reset support") and extended to work automatically via this not-yet-merged patch: https://patchwork.kernel.org/patch/9706855/ [PATCH v3 2/2] mwifiex: pcie: add card_reset() support > Also you may consider possibly adding a > quirk if the FLR is kicking your device completely out of the system. > There was recently a PCI flag added to indicate a device doesn't > support FLR. You might try adding your own version of the quirk for > this device to see if maybe pushing it to use a different sort of > reset might be more useful. I'm not sure if another sort of reset is available, nor if it would work better. I'll investigate though. Also, I'm told this same Wifi card works fine on other PCIe systems (e.g., Intel-based laptops) without trouble, including over 1000s of FLR cycles. (But I only trust such testing as far as I can throw the tester though -- I don't think they're testing as many corner cases as I am.) > > So there may be several implementation bugs with RK3399. I don't know > > how much can be fixed on Rockchip's side, vs. how much can be > > accomodated in the PCI core. > > > > Brian > > This is the kind of thing we have PCIe quirks for if nothing else. > Odds are we should be able to work around it, it is just a matter of > knowing what all the quirks are. Maybe your imagination for "quirks" isn't creative enough :) Brian [1] Hint: there are a boundless number of reasons for such FW failures.
On Mon, May 15, 2017 at 02:00:33PM -0700, Brian Norris wrote: > On Mon, May 15, 2017 at 12:48:30PM -0700, Alexander Duyck wrote: > > On Mon, May 15, 2017 at 9:36 AM, Brian Norris <briannorris@chromium.org> wrote: > > > So there may be several implementation bugs with RK3399. I don't know > > > how much can be fixed on Rockchip's side, vs. how much can be > > > accomodated in the PCI core. > > > > This is the kind of thing we have PCIe quirks for if nothing else. > > Odds are we should be able to work around it, it is just a matter of > > knowing what all the quirks are. > > Maybe your imagination for "quirks" isn't creative enough :) BTW, I meant to note that people have gone to some extreme lengths to handle buggy / quirky RCs. Look at drivers/pci/dwc/pci-imx6.c, which hooks all the way into the core ARM exception handling just to mask aborts! (See imx6q_pcie_abort_handler.) And they have some patches intended to fix crashes in 4.12-rc1 to hook synchronous aborts to return "all 1's". But that's extremely invasive and not very precise (I don't think that even checks the difference between PCI-induced aborts and other system aborts?). Probably not gonna fly on ARM64 either... Brian [1] [PATCH] PCI: imx6: fix downstream bus scanning https://patchwork.ozlabs.org/patch/760748/
On Fri, May 12, 2017 at 11:13:57AM -0700, Alexander Duyck wrote: > From: Alexander Duyck <alexander.h.duyck@intel.com> > > This patch is meant to address issues seen when performing on FLR on some > systems as the long wait can result in a master abort when reading a > function that is not yet ready. > > To prevent the master abort from being reported up we should disable > reporting for it while waiting on the reset. Once the reset is completed > then we can re-enable the master abort for the bus. > > Fixes: 5adecf817dd6 ("PCI: Wait for up to 1000ms after FLR reset") > > Reported-by: Brian Norris <briannorris@chromium.org> > Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> > --- > > I haven't been able to test this code as I don't have a system that can > reproduce the issue. The fix itself is based on the issue as reported by > Brian so I am hoping he can test this on the Samsung Chromebook Plus with > RK399 OP1 that this was seen on. I'm slightly hesitant about turning off Master Abort reporting in the upstream bridge because that potentially affects several devices below the bridge, not just the one that's being reset. But maybe it's the best we can do. Is there a public problem report we can reference? It'd be nice to have "lspci -vvxxxx" output (for future reference since lspci doesn't decode FRS stuff yet). Alex W, since you wrote 5adecf817dd6, have you seen similar problems or do you have any thoughts on this? I have the same question as Alex D -- why aren't we using CRS in this path as in pci_bus_read_dev_vendor_id()? PCIe r3.1, sec 6.6.2, says a function should use CRS if it requires more time. I see the comment about VFs not implementing the first dword, but a VF vendor ID *should* read as 0xffff, which is distinguishable from a CRS response (0x0001). Per sec 6.6.2, a function must complete an FLR within 100ms. I assume that means the Intel IGD from 5adecf817dd6 and whatever device is involved here (what is it, by the way?) are out of spec. Or is there some weasel wording that allows devices to take this long after a reset? If they're out of spec, it'd be nice to identify them with quirks (maybe using pci_dev_specific_reset()) instead of changing the generic code to accommodate them. That way the code matches the spec better and we can potentially work on better ways to handle the issue within the spec, e.g., via something like Function Readiness Status notifications (sec 6.23.2). Please repost this eventually with the typo correction and hopefully a link to the problem report and lspci info. Bjorn > drivers/pci/pci.c | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 7904d02ffdb9..acbdbabeaa0e 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -3758,14 +3758,31 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev) > */ > static void pci_flr_wait(struct pci_dev *dev) > { > + struct pci_dev *bridge = dev->bus->self; > int i = 0; > + u16 bctl; > u32 id; > > + /* > + * Disable MasterAbortMode while we are waiting to avoid reporting > + * bus errors for reading the command register on a device that is > + * not ready (in some architectures) > + */ > + if (bridge) { > + pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &bctl); > + pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, > + bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT); > + } > + > do { > msleep(100); > pci_read_config_dword(dev, PCI_COMMAND, &id); > } while (i++ < 10 && id == ~0); > > + /* restore bridge state */ > + if (bridge) > + pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bctl); > + > if (id == ~0) > dev_warn(&dev->dev, "Failed to return from FLR\n"); > else if (i > 1) >
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 7904d02ffdb9..acbdbabeaa0e 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -3758,14 +3758,31 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev) */ static void pci_flr_wait(struct pci_dev *dev) { + struct pci_dev *bridge = dev->bus->self; int i = 0; + u16 bctl; u32 id; + /* + * Disable MasterAbortMode while we are waiting to avoid reporting + * bus errors for reading the command register on a device that is + * not ready (in some architectures) + */ + if (bridge) { + pci_read_config_word(bridge, PCI_BRIDGE_CONTROL, &bctl); + pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, + bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT); + } + do { msleep(100); pci_read_config_dword(dev, PCI_COMMAND, &id); } while (i++ < 10 && id == ~0); + /* restore bridge state */ + if (bridge) + pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bctl); + if (id == ~0) dev_warn(&dev->dev, "Failed to return from FLR\n"); else if (i > 1)