Message ID | 1527650389-31575-1-git-send-email-okaya@codeaurora.org (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On Tue, May 29, 2018 at 11:19:41PM -0400, Sinan Kaya wrote: > Adding pci=safemode kernel command line parameter to turn off all PCI > Express service driver as well as all optional PCIe features such as LTR, > Extended tags, Relaxed Ordering etc. > > Also setting MPS configuration to PCIE_BUS_SAFE so that MPS and MRRS can be > reconfigured with by the kernel in case BIOS hands off a broken > configuration. Why not fix the BIOS? That's what sane platforms do :) > > Signed-off-by: Sinan Kaya <okaya@codeaurora.org> > --- > Documentation/admin-guide/kernel-parameters.txt | 2 ++ > drivers/pci/pci.c | 7 +++++++ > drivers/pci/pci.h | 2 ++ > drivers/pci/pcie/portdrv_core.c | 2 +- > drivers/pci/probe.c | 6 ++++++ > 5 files changed, 18 insertions(+), 1 deletion(-) > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > index 641ec9c..247adbb 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -3153,6 +3153,8 @@ > noari do not use PCIe ARI. > noats [PCIE, Intel-IOMMU, AMD-IOMMU] > do not use PCIe ATS (and IOMMU device IOTLB). > + safemode turns of all optinal PCI features. Useful > + for bringup/troubleshooting. s/optinal/optional/ ? And you should explain what exactly in PCI is "optional". Who defines this and where is that list and what can go wrong if those options are not enabled? In looking at your patch, I can't determine that at all, so there's no way that someone just looking at this sentence will be able to understand. thanks, greg k-h
On 5/29/2018 9:31 PM, Greg Kroah-Hartman wrote: > On Tue, May 29, 2018 at 11:19:41PM -0400, Sinan Kaya wrote: >> Adding pci=safemode kernel command line parameter to turn off all PCI >> Express service driver as well as all optional PCIe features such as LTR, >> Extended tags, Relaxed Ordering etc. >> >> Also setting MPS configuration to PCIE_BUS_SAFE so that MPS and MRRS can be >> reconfigured with by the kernel in case BIOS hands off a broken >> configuration. > > Why not fix the BIOS? That's what sane platforms do :) > >> >> Signed-off-by: Sinan Kaya <okaya@codeaurora.org> >> --- >> Documentation/admin-guide/kernel-parameters.txt | 2 ++ >> drivers/pci/pci.c | 7 +++++++ >> drivers/pci/pci.h | 2 ++ >> drivers/pci/pcie/portdrv_core.c | 2 +- >> drivers/pci/probe.c | 6 ++++++ >> 5 files changed, 18 insertions(+), 1 deletion(-) >> >> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt >> index 641ec9c..247adbb 100644 >> --- a/Documentation/admin-guide/kernel-parameters.txt >> +++ b/Documentation/admin-guide/kernel-parameters.txt >> @@ -3153,6 +3153,8 @@ >> noari do not use PCIe ARI. >> noats [PCIE, Intel-IOMMU, AMD-IOMMU] >> do not use PCIe ATS (and IOMMU device IOTLB). >> + safemode turns of all optinal PCI features. Useful >> + for bringup/troubleshooting. > > s/optinal/optional/ ? sure. > > And you should explain what exactly in PCI is "optional". Who defines > this and where is that list and what can go wrong if those options are > not enabled? Bjorn and I discussed the need for such a "safe" mode feature when you want to bring up PCI for a platform. You want to turn off everything as a starter and just stick to bare minimum. I can add a few words describing them. The goal of this option is to keep base PCI features with MSI only. Things like PME, AER, ASPM, Extended Tags, LTR, Relaxed Ordering, SRIOV are all considered optional. safemode is certainly not intended for production environments. I can taint the kernel as a suggestion. I defined minimum as just booting a device and to be able to do DMA traffic only with 0 optimization > > In looking at your patch, I can't determine that at all, so there's no > way that someone just looking at this sentence will be able to > understand. > > thanks, > > greg k-h >
On Tue, May 29, 2018 at 09:41:33PM -0700, Sinan Kaya wrote: > On 5/29/2018 9:31 PM, Greg Kroah-Hartman wrote: > > On Tue, May 29, 2018 at 11:19:41PM -0400, Sinan Kaya wrote: > >> Adding pci=safemode kernel command line parameter to turn off all PCI > >> Express service driver as well as all optional PCIe features such as LTR, > >> Extended tags, Relaxed Ordering etc. > >> > >> Also setting MPS configuration to PCIE_BUS_SAFE so that MPS and MRRS can be > >> reconfigured with by the kernel in case BIOS hands off a broken > >> configuration. > > > > Why not fix the BIOS? That's what sane platforms do :) > > > >> > >> Signed-off-by: Sinan Kaya <okaya@codeaurora.org> > >> --- > >> Documentation/admin-guide/kernel-parameters.txt | 2 ++ > >> drivers/pci/pci.c | 7 +++++++ > >> drivers/pci/pci.h | 2 ++ > >> drivers/pci/pcie/portdrv_core.c | 2 +- > >> drivers/pci/probe.c | 6 ++++++ > >> 5 files changed, 18 insertions(+), 1 deletion(-) > >> > >> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt > >> index 641ec9c..247adbb 100644 > >> --- a/Documentation/admin-guide/kernel-parameters.txt > >> +++ b/Documentation/admin-guide/kernel-parameters.txt > >> @@ -3153,6 +3153,8 @@ > >> noari do not use PCIe ARI. > >> noats [PCIE, Intel-IOMMU, AMD-IOMMU] > >> do not use PCIe ATS (and IOMMU device IOTLB). > >> + safemode turns of all optinal PCI features. Useful > >> + for bringup/troubleshooting. > > > > s/optinal/optional/ ? > > sure. > > > > > And you should explain what exactly in PCI is "optional". Who defines > > this and where is that list and what can go wrong if those options are > > not enabled? > > Bjorn and I discussed the need for such a "safe" mode feature when you > want to bring up PCI for a platform. You want to turn off everything as > a starter and just stick to bare minimum. > > I can add a few words describing them. The goal of this option is to keep > base PCI features with MSI only. Things like PME, AER, ASPM, Extended > Tags, LTR, Relaxed Ordering, SRIOV are all considered optional. safemode > is certainly not intended for production environments. Ok, then you should say that here, or somewhere, so that people know this. Otherwise people will see that "hey this lets my hardware boot!" and then never change it :( > I can taint the kernel as a suggestion. I would not worry about that. > I defined minimum as just booting a device and to be able to do DMA traffic > only with 0 optimization Ok, again, just document this really well, so that people do not have questions and start wondering why their devices barely seem to work. thanks, greg k-h
On Tue, May 29, 2018 at 09:41:33PM -0700, Sinan Kaya wrote: > Bjorn and I discussed the need for such a "safe" mode feature when you > want to bring up PCI for a platform. You want to turn off everything as > a starter and just stick to bare minimum. Can we please make it a config option the instead of adding code to every kernel? Also maybe the bringup should be in the name to make this more clear?
On 2018-05-30 00:37, Christoph Hellwig wrote: > On Tue, May 29, 2018 at 09:41:33PM -0700, Sinan Kaya wrote: >> Bjorn and I discussed the need for such a "safe" mode feature when you >> want to bring up PCI for a platform. You want to turn off everything >> as >> a starter and just stick to bare minimum. > > Can we please make it a config option the instead of adding code > to every kernel? Also maybe the bringup should be in the name > to make this more clear? One other requirement was to have a runtime option rather than compile time option. When someone reported a problem, we wanted to be able to say "use this option and see if system boots" without doing any bisects or recompilation. This would be the first step in troubleshooting a system to see if fundamental features are working. I don't mind changing the name Bjorn mentioned safe option. I made it safemode. I am looking at Bjorn for suggestions at this moment.
On Wed, May 30, 2018 at 12:44:29AM -0700, okaya@codeaurora.org wrote: > On 2018-05-30 00:37, Christoph Hellwig wrote: > > On Tue, May 29, 2018 at 09:41:33PM -0700, Sinan Kaya wrote: > > > Bjorn and I discussed the need for such a "safe" mode feature when you > > > want to bring up PCI for a platform. You want to turn off everything > > > as > > > a starter and just stick to bare minimum. > > > > Can we please make it a config option the instead of adding code > > to every kernel? Also maybe the bringup should be in the name > > to make this more clear? > > One other requirement was to have a runtime option rather than compile time > option. > > When someone reported a problem, we wanted to be able to say "use this > option and see if system boots" without doing any bisects or recompilation. > > This would be the first step in troubleshooting a system to see if > fundamental features are working. That makes sense, people can not rebuild their kernels for the most part. Putting it behind a config option would not make sense as it would always have to be enabled. > I don't mind changing the name Bjorn mentioned safe option. I made it > safemode. I am looking at Bjorn for suggestions at this moment. "minimal"? "basic"? "crippled"? "my_hardware_is_so_borked_it_needs_this_option"? :) Naming is hard... greg k-h
On 2018-05-30 00:48, Greg Kroah-Hartman wrote: > On Wed, May 30, 2018 at 12:44:29AM -0700, okaya@codeaurora.org wrote: >> On 2018-05-30 00:37, Christoph Hellwig wrote: >> > On Tue, May 29, 2018 at 09:41:33PM -0700, Sinan Kaya wrote: >> > > Bjorn and I discussed the need for such a "safe" mode feature when you >> > > want to bring up PCI for a platform. You want to turn off everything >> > > as >> > > a starter and just stick to bare minimum. >> > >> > Can we please make it a config option the instead of adding code >> > to every kernel? Also maybe the bringup should be in the name >> > to make this more clear? >> >> One other requirement was to have a runtime option rather than compile >> time >> option. >> >> When someone reported a problem, we wanted to be able to say "use this >> option and see if system boots" without doing any bisects or >> recompilation. >> >> This would be the first step in troubleshooting a system to see if >> fundamental features are working. > > That makes sense, people can not rebuild their kernels for the most > part. Putting it behind a config option would not make sense as it > would always have to be enabled. > Here is where the discussion took place. Last 5-10 messages should help. https://bugzilla.kernel.org/show_bug.cgi?id=196197 >> I don't mind changing the name Bjorn mentioned safe option. I made it >> safemode. I am looking at Bjorn for suggestions at this moment. > > "minimal"? "basic"? "crippled"? > "my_hardware_is_so_borked_it_needs_this_option"? :) > > Naming is hard... > > greg k-h
On 2018-05-30 00:56, okaya@codeaurora.org wrote: > On 2018-05-30 00:48, Greg Kroah-Hartman wrote: >> On Wed, May 30, 2018 at 12:44:29AM -0700, okaya@codeaurora.org wrote: >>> On 2018-05-30 00:37, Christoph Hellwig wrote: >>> > On Tue, May 29, 2018 at 09:41:33PM -0700, Sinan Kaya wrote: >>> > > Bjorn and I discussed the need for such a "safe" mode feature when you >>> > > want to bring up PCI for a platform. You want to turn off everything >>> > > as >>> > > a starter and just stick to bare minimum. >>> > >>> > Can we please make it a config option the instead of adding code >>> > to every kernel? Also maybe the bringup should be in the name >>> > to make this more clear? >>> >>> One other requirement was to have a runtime option rather than >>> compile time >>> option. >>> >>> When someone reported a problem, we wanted to be able to say "use >>> this >>> option and see if system boots" without doing any bisects or >>> recompilation. >>> >>> This would be the first step in troubleshooting a system to see if >>> fundamental features are working. >> >> That makes sense, people can not rebuild their kernels for the most >> part. Putting it behind a config option would not make sense as it >> would always have to be enabled. >> > > Here is where the discussion took place. Last 5-10 messages should > help. > > > https://bugzilla.kernel.org/show_bug.cgi?id=196197 > Some more paper trail for general awareness. https://lkml.org/lkml/2018/5/3/509 > >>> I don't mind changing the name Bjorn mentioned safe option. I made it >>> safemode. I am looking at Bjorn for suggestions at this moment. >> >> "minimal"? "basic"? "crippled"? >> "my_hardware_is_so_borked_it_needs_this_option"? :) >> >> Naming is hard... >> >> greg k-h
On Wed, May 30, 2018 at 12:44:29AM -0700, okaya@codeaurora.org wrote: > On 2018-05-30 00:37, Christoph Hellwig wrote: > > On Tue, May 29, 2018 at 09:41:33PM -0700, Sinan Kaya wrote: > > > Bjorn and I discussed the need for such a "safe" mode feature when you > > > want to bring up PCI for a platform. You want to turn off everything > > > as > > > a starter and just stick to bare minimum. > > > > Can we please make it a config option the instead of adding code > > to every kernel? Also maybe the bringup should be in the name > > to make this more clear? > > One other requirement was to have a runtime option rather than compile time > option. > > When someone reported a problem, we wanted to be able to say "use this > option and see if system boots" without doing any bisects or recompilation. > > This would be the first step in troubleshooting a system to see if > fundamental features are working. > > I don't mind changing the name > Bjorn mentioned safe option. I made it safemode. I am looking at Bjorn for > suggestions at this moment. This *was* my idea, but I'm starting to think it was a bad idea. I don't want people to use pci= parameters as the normal way to get a system to boot. That would be a huge support hassle (putting things in release notes, diagnosing problems when people forget it, etc). But the parameters *are* useful for debugging. If we had a "pci=safemode" and it avoided some problem, the next step would be to narrow it down by using the more specific flags (pci=nomsi, pci=noari, pci=no_ext_tags, etc). So I think 95% of the value is in the specific flags, and a "pci=safemode" might add a little bit of value but at the cost of a small but nagging maintenance concern and code clutter. Bjorn
On 5/30/2018 10:56 AM, Bjorn Helgaas wrote: > This *was* my idea, but I'm starting to think it was a bad idea. > > I don't want people to use pci= parameters as the normal way to get a > system to boot. That would be a huge support hassle (putting things > in release notes, diagnosing problems when people forget it, etc). > > But the parameters *are* useful for debugging. If we had a > "pci=safemode" and it avoided some problem, the next step would be to > narrow it down by using the more specific flags (pci=nomsi, pci=noari, > pci=no_ext_tags, etc). So I think 95% of the value is in the specific > flags, and a "pci=safemode" might add a little bit of value but at the > cost of a small but nagging maintenance concern and code clutter. OK. Let's try noXYZ feature. Can you enumerate the XYZ features that you want to see?
Hi! > > And you should explain what exactly in PCI is "optional". Who defines > > this and where is that list and what can go wrong if those options are > > not enabled? > > Bjorn and I discussed the need for such a "safe" mode feature when you > want to bring up PCI for a platform. You want to turn off everything as > a starter and just stick to bare minimum. > > I can add a few words describing them. The goal of this option is to keep > base PCI features with MSI only. Things like PME, AER, ASPM, Extended > Tags, LTR, Relaxed Ordering, SRIOV are all considered optional. safemode > is certainly not intended for production environments. > > I can taint the kernel as a suggestion. I don't think tainting is required. even modern platforms should work in the safe mode. Pavel
On 2018-06-02 13:43, Pavel Machek wrote: > Hi! > >> > And you should explain what exactly in PCI is "optional". Who defines >> > this and where is that list and what can go wrong if those options are >> > not enabled? >> >> Bjorn and I discussed the need for such a "safe" mode feature when you >> want to bring up PCI for a platform. You want to turn off everything >> as >> a starter and just stick to bare minimum. >> >> I can add a few words describing them. The goal of this option is to >> keep >> base PCI features with MSI only. Things like PME, AER, ASPM, Extended >> Tags, LTR, Relaxed Ordering, SRIOV are all considered optional. >> safemode >> is certainly not intended for production environments. >> >> I can taint the kernel as a suggestion. > > I don't think tainting is required. even modern platforms should work > in the safe mode. Yeah, concern was getting used to the safe mode and never running the full stack to fix the actual issues like getting away with crappy hardware and firmware. It becomes a support issue for the community. > Pavel
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 641ec9c..247adbb 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3153,6 +3153,8 @@ noari do not use PCIe ARI. noats [PCIE, Intel-IOMMU, AMD-IOMMU] do not use PCIe ATS (and IOMMU device IOTLB). + safemode turns of all optinal PCI features. Useful + for bringup/troubleshooting. pcie_scan_all Scan all possible PCIe devices. Otherwise we only look for one device below a PCIe downstream port. diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index d27f771..11f0282 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -115,6 +115,9 @@ static bool pcie_ari_disabled; /* If set, the PCIe ATS capability will not be used. */ static bool pcie_ats_disabled; +/* If set, disables most of the optional PCI features */ +bool pci_safe_mode; + bool pci_ats_disabled(void) { return pcie_ats_disabled; @@ -5845,6 +5848,10 @@ static int __init pci_setup(char *str) if (*str && (str = pcibios_setup(str)) && *str) { if (!strcmp(str, "nomsi")) { pci_no_msi(); + } else if (!strncmp(str, "safemode", 8)) { + pr_info("PCI: safe mode with minimum features\n"); + pci_safe_mode = true; + pcie_bus_config = PCIE_BUS_SAFE; } else if (!strncmp(str, "noats", 5)) { pr_info("PCIe: ATS is disabled\n"); pcie_ats_disabled = true; diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index c358e7a0..4517bcd 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -8,6 +8,8 @@ extern const unsigned char pcie_link_speed[]; +extern bool pci_safe_mode; + bool pcie_cap_has_lnkctl(const struct pci_dev *dev); /* Functions internal to the PCI core code */ diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c index a5b3b3a..9fe4ed6 100644 --- a/drivers/pci/pcie/portdrv_core.c +++ b/drivers/pci/pcie/portdrv_core.c @@ -311,7 +311,7 @@ int pcie_port_device_register(struct pci_dev *dev) /* Get and check PCI Express port services */ capabilities = get_port_device_capability(dev); - if (!capabilities) + if (!capabilities || pci_safe_mode) return 0; pci_set_master(dev); diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 3840207..295b79c 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2047,6 +2047,9 @@ static void pci_configure_device(struct pci_dev *dev) struct hotplug_params hpp; int ret; + if (pci_safe_mode) + return; + pci_configure_mps(dev); pci_configure_extended_tags(dev, NULL); pci_configure_relaxed_ordering(dev); @@ -2213,6 +2216,9 @@ static void pci_init_capabilities(struct pci_dev *dev) /* Setup MSI caps & disable MSI/MSI-X interrupts */ pci_msi_setup_pci_dev(dev); + if (pci_safe_mode) + return; + /* Buffers for saving PCIe and PCI-X capabilities */ pci_allocate_cap_save_buffers(dev);
Adding pci=safemode kernel command line parameter to turn off all PCI Express service driver as well as all optional PCIe features such as LTR, Extended tags, Relaxed Ordering etc. Also setting MPS configuration to PCIE_BUS_SAFE so that MPS and MRRS can be reconfigured with by the kernel in case BIOS hands off a broken configuration. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> --- Documentation/admin-guide/kernel-parameters.txt | 2 ++ drivers/pci/pci.c | 7 +++++++ drivers/pci/pci.h | 2 ++ drivers/pci/pcie/portdrv_core.c | 2 +- drivers/pci/probe.c | 6 ++++++ 5 files changed, 18 insertions(+), 1 deletion(-)