Message ID | 20131126183750.GA29265@concerto (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On Tue, Nov 26, 2013 at 10:37 PM, Khalid Aziz <khalid.aziz@oracle.com> wrote: > On Tue, Nov 26, 2013 at 05:48:06PM +0000, Matthew Garrett wrote: >> On Tue, Nov 26, 2013 at 10:35:26AM -0700, Bjorn Helgaas wrote: >> > On Tue, Nov 26, 2013 at 9:40 AM, Khalid Aziz <khalid.aziz@oracle.com> wrote: >> > > Disabling Bus Master bit is effectively a brute force and not an elegant way >> > > to stop unwanted DMA. It can have side effects as Alan and others pointed >> > > out in the original discussion, and we are now seeing one with Lynx Point on >> > > Acer. >> > >> > I'm getting more queasy all the time about disabling Bus Master. I >> > don't think RHEL does it, and that's probably where most kexec use is. >> > So I doubt we really have much experience with it yet. >> >> Does Windows disable the BM bit on shutdown? If not, it's likely that >> there are platforms where the SMM code assumes it's still enabled. We >> also know that there are devices that hang if BM is disabled while their >> DMA engines are still running. >> >> Unless we verify that Windows does this, I think there's no way we can >> guarantee that firmware won't make assumptions about the state of PCI. >> The easiest compromise would probably be to set a flag that disables >> busmastering purely when we're performing a kexec. >> > > This is the approach that is most appealing to me. If we confine > clearing Bus Master bit to just the kexec path, we can side step all > the land mines in normal shutdown path. Here is an informal patch > just for comments and discussion. It has been tested lightly. If > this approach is acceptable, I will create a more formal patch. ACK. This is perfect solution. > > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c > index 9042fdb..c605be0 100644 > --- a/drivers/pci/pci-driver.c > +++ b/drivers/pci/pci-driver.c > @@ -400,10 +400,11 @@ static void pci_device_shutdown(struct device *dev) > pci_msix_shutdown(pci_dev); > > /* > - * Turn off Bus Master bit on the device to tell it to not > - * continue to do DMA. Don't touch devices in D3cold or unknown states. > + * If this is a kexec reboot, turn off Bus Master bit on the > + * device to tell it to not continue to do DMA. Don't touch > + * devices in D3cold or unknown states. > */ > - if (pci_dev->current_state <= PCI_D3hot) > + if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot)) > pci_clear_master(pci_dev); > } > > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h > index 9c91ecc..7d85733 100644 > --- a/drivers/pci/pci.h > +++ b/drivers/pci/pci.h > @@ -9,6 +9,9 @@ > extern const unsigned char pcix_bus_speed[]; > extern const unsigned char pcie_link_speed[]; > > +/* flag to track if kexec reboot is in progress */ > +extern unsigned long kexec_in_progress; > + > /* Functions internal to the PCI core code */ > > int pci_create_sysfs_dev_files(struct pci_dev *pdev); > diff --git a/kernel/kexec.c b/kernel/kexec.c > index 490afc0..fd2d63e 100644 > --- a/kernel/kexec.c > +++ b/kernel/kexec.c > @@ -47,6 +47,9 @@ u32 vmcoreinfo_note[VMCOREINFO_NOTE_SIZE/4]; > size_t vmcoreinfo_size; > size_t vmcoreinfo_max_size = sizeof(vmcoreinfo_data); > > +/* Flag to indicate we are going to kexec a new kernel */ > +unsigned long kexec_in_progress = 0; > + > /* Location of the reserved area for the crash kernel */ > struct resource crashk_res = { > .name = "Crash kernel", > @@ -1675,6 +1678,7 @@ int kernel_kexec(void) > } else > #endif > { > + kexec_in_progress = 1; > kernel_restart_prepare(NULL); > printk(KERN_EMERG "Starting new kernel\n"); > machine_shutdown(); > > > -- > Khalid -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index 9042fdb..c605be0 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -400,10 +400,11 @@ static void pci_device_shutdown(struct device *dev) pci_msix_shutdown(pci_dev); /* - * Turn off Bus Master bit on the device to tell it to not - * continue to do DMA. Don't touch devices in D3cold or unknown states. + * If this is a kexec reboot, turn off Bus Master bit on the + * device to tell it to not continue to do DMA. Don't touch + * devices in D3cold or unknown states. */ - if (pci_dev->current_state <= PCI_D3hot) + if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot)) pci_clear_master(pci_dev); } diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index 9c91ecc..7d85733 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -9,6 +9,9 @@ extern const unsigned char pcix_bus_speed[]; extern const unsigned char pcie_link_speed[]; +/* flag to track if kexec reboot is in progress */ +extern unsigned long kexec_in_progress; + /* Functions internal to the PCI core code */ int pci_create_sysfs_dev_files(struct pci_dev *pdev); diff --git a/kernel/kexec.c b/kernel/kexec.c index 490afc0..fd2d63e 100644 --- a/kernel/kexec.c +++ b/kernel/kexec.c @@ -47,6 +47,9 @@ u32 vmcoreinfo_note[VMCOREINFO_NOTE_SIZE/4]; size_t vmcoreinfo_size; size_t vmcoreinfo_max_size = sizeof(vmcoreinfo_data); +/* Flag to indicate we are going to kexec a new kernel */ +unsigned long kexec_in_progress = 0; + /* Location of the reserved area for the crash kernel */ struct resource crashk_res = { .name = "Crash kernel", @@ -1675,6 +1678,7 @@ int kernel_kexec(void) } else #endif { + kexec_in_progress = 1; kernel_restart_prepare(NULL); printk(KERN_EMERG "Starting new kernel\n"); machine_shutdown();