[v4] PCI: Reduce warnings on possible RW1C corruption

Message ID 20200806041455.11070-1-mark.tomlinson@alliedtelesis.co.nz (mailing list archive)
State Accepted, archived
Delegated to: Bjorn Helgaas
Series [v4] PCI: Reduce warnings on possible RW1C corruption

Commit Message

Mark Tomlinson Aug. 6, 2020, 4:14 a.m. UTC
For hardware that only supports 32-bit writes to PCI there is the
possibility of clearing RW1C (write-one-to-clear) bits. A rate-limited
message was introduced by fb2659230120, but rate-limiting is not the
best choice here. Some devices may not show the warnings they should if
another device has just produced a bunch of warnings. Also, the number
of messages can be a nuisance on devices which are otherwise working
fine.

This patch changes the ratelimit to a single warning per bus. This
ensures no bus is 'starved' of emitting a warning and also that there
isn't a continuous stream of warnings. It would be preferable to have a
warning per device, but the pci_dev structure is not available here, and
a lookup from devfn would be far too slow.

Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
Fixes: fb2659230120 ("PCI: Warn on possible RW1C corruption for sub-32 bit config writes")
Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
---
changes in v4:
 - Use bitfield rather than bool to save memory (was meant to be in v3).

 drivers/pci/access.c | 9 ++++++---
 include/linux/pci.h  | 1 +
 2 files changed, 7 insertions(+), 3 deletions(-)
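
To make the hazard described in the commit message concrete, here is a minimal userspace sketch, not kernel code. It assumes a hypothetical register layout (upper 16 bits RW1C, lower 16 bits ordinary read/write); fake_reg, fake_readl(), fake_writel() and emulated_write() are illustrative names that do not exist in the kernel. The mask arithmetic mirrors the two context lines visible in the patch; the final merge-and-write step is an assumption about how the read-modify-write completes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: bits 31..16 are RW1C status, bits 15..0 are RW. */
#define RW1C_MASK 0xffff0000u

struct fake_reg {
	uint32_t value;
};

static uint32_t fake_readl(const struct fake_reg *reg)
{
	return reg->value;
}

/* Model of what the hardware does on a full 32-bit write. */
static void fake_writel(struct fake_reg *reg, uint32_t val)
{
	uint32_t status = reg->value & RW1C_MASK;

	status &= ~(val & RW1C_MASK);	/* writing 1 clears a status bit */
	reg->value = status | (val & ~RW1C_MASK);
}

/*
 * Emulate a 1-, 2- or 4-byte write on hardware that only accepts
 * 32-bit writes, using a read-modify-write of the whole dword.
 */
static void emulated_write(struct fake_reg *reg, unsigned int where,
			   int size, uint32_t val)
{
	uint32_t mask, tmp;

	assert(size == 1 || size == 2 || size == 4);

	if (size == 4) {
		fake_writel(reg, val);
		return;
	}

	/* Keep every byte except the ones being written. */
	mask = ~(((1u << (size * 8)) - 1) << ((where & 0x3) * 8));
	tmp = fake_readl(reg) & mask;

	/* Assumed completion: merge the new bytes and write the dword back. */
	tmp |= val << ((where & 0x3) * 8);
	fake_writel(reg, tmp);
}
```

In this model, starting from 0x00010000 (one status bit set), emulated_write(&reg, 0, 2, 0x1234) leaves the register at 0x00001234: the 2-byte write to the low half read the status bit as 1, wrote it back as 1, and so cleared it.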

Comments

Scott Branden Aug. 6, 2020, 5:55 p.m. UTC | #1
Looks good.

On 2020-08-05 9:14 p.m., Mark Tomlinson wrote:
> For hardware that only supports 32-bit writes to PCI there is the
> possibility of clearing RW1C (write-one-to-clear) bits. A rate-limited
> message was introduced by fb2659230120, but rate-limiting is not the
> best choice here. Some devices may not show the warnings they should if
> another device has just produced a bunch of warnings. Also, the number
> of messages can be a nuisance on devices which are otherwise working
> fine.
>
> This patch changes the ratelimit to a single warning per bus. This
> ensures no bus is 'starved' of emitting a warning and also that there
> isn't a continuous stream of warnings. It would be preferable to have a
> warning per device, but the pci_dev structure is not available here, and
> a lookup from devfn would be far too slow.
>
> Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
> Fixes: fb2659230120 ("PCI: Warn on possible RW1C corruption for sub-32 bit config writes")
> Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Florian Fainelli Aug. 6, 2020, 6:34 p.m. UTC | #2
On 8/5/2020 9:14 PM, Mark Tomlinson wrote:
> For hardware that only supports 32-bit writes to PCI there is the
> possibility of clearing RW1C (write-one-to-clear) bits. A rate-limited
> message was introduced by fb2659230120, but rate-limiting is not the
> best choice here. Some devices may not show the warnings they should if
> another device has just produced a bunch of warnings. Also, the number
> of messages can be a nuisance on devices which are otherwise working
> fine.
> 
> This patch changes the ratelimit to a single warning per bus. This
> ensures no bus is 'starved' of emitting a warning and also that there
> isn't a continuous stream of warnings. It would be preferable to have a
> warning per device, but the pci_dev structure is not available here, and
> a lookup from devfn would be far too slow.
> 
> Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
> Fixes: fb2659230120 ("PCI: Warn on possible RW1C corruption for sub-32 bit config writes")
> Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Rob Herring Aug. 14, 2020, 5:45 p.m. UTC | #3
On Wed, Aug 5, 2020 at 10:15 PM Mark Tomlinson
<mark.tomlinson@alliedtelesis.co.nz> wrote:
>
> For hardware that only supports 32-bit writes to PCI there is the
> possibility of clearing RW1C (write-one-to-clear) bits. A rate-limited
> message was introduced by fb2659230120, but rate-limiting is not the
> best choice here. Some devices may not show the warnings they should if
> another device has just produced a bunch of warnings. Also, the number
> of messages can be a nuisance on devices which are otherwise working
> fine.
>
> This patch changes the ratelimit to a single warning per bus. This
> ensures no bus is 'starved' of emitting a warning and also that there
> isn't a continuous stream of warnings. It would be preferable to have a
> warning per device, but the pci_dev structure is not available here, and
> a lookup from devfn would be far too slow.
>
> Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
> Fixes: fb2659230120 ("PCI: Warn on possible RW1C corruption for sub-32 bit config writes")
> Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>

Reviewed-by: Rob Herring <robh@kernel.org>
Bjorn Helgaas Aug. 20, 2020, 10:11 p.m. UTC | #4
On Thu, Aug 06, 2020 at 04:14:55PM +1200, Mark Tomlinson wrote:
> For hardware that only supports 32-bit writes to PCI there is the
> possibility of clearing RW1C (write-one-to-clear) bits. A rate-limited
> message was introduced by fb2659230120, but rate-limiting is not the
> best choice here. Some devices may not show the warnings they should if
> another device has just produced a bunch of warnings. Also, the number
> of messages can be a nuisance on devices which are otherwise working
> fine.
> 
> This patch changes the ratelimit to a single warning per bus. This
> ensures no bus is 'starved' of emitting a warning and also that there
> isn't a continuous stream of warnings. It would be preferable to have a
> warning per device, but the pci_dev structure is not available here, and
> a lookup from devfn would be far too slow.
> 
> Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
> Fixes: fb2659230120 ("PCI: Warn on possible RW1C corruption for sub-32 bit config writes")
> Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>

Applied with collected reviews/acks to pci/enumeration for v5.10,
thanks!

Chris Packham March 4, 2022, 1:30 a.m. UTC | #5
Hi All,

On 21/08/20 10:11, Bjorn Helgaas wrote:
> On Thu, Aug 06, 2020 at 04:14:55PM +1200, Mark Tomlinson wrote:
>> For hardware that only supports 32-bit writes to PCI there is the
>> possibility of clearing RW1C (write-one-to-clear) bits. A rate-limited
>> message was introduced by fb2659230120, but rate-limiting is not the
>> best choice here. Some devices may not show the warnings they should if
>> another device has just produced a bunch of warnings. Also, the number
>> of messages can be a nuisance on devices which are otherwise working
>> fine.
>>
>> This patch changes the ratelimit to a single warning per bus. This
>> ensures no bus is 'starved' of emitting a warning and also that there
>> isn't a continuous stream of warnings. It would be preferable to have a
>> warning per device, but the pci_dev structure is not available here, and
>> a lookup from devfn would be far too slow.
>>
>> Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
>> Fixes: fb2659230120 ("PCI: Warn on possible RW1C corruption for sub-32 bit config writes")
>> Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
> Applied with collected reviews/acks to pci/enumeration for v5.10,
> thanks!

Whatever happened to this change?

I'm just going through our queue of patches that have been sent upstream 
and expected this one to be gone after we pulled v5.10. Looking at 
Linus's tree I don't see it ever having been applied. I couldn't see 
anything on the relevant mailing lists suggesting that there was a 
problem with this change so I'm just wondering what's happened to it?

Bjorn Helgaas March 4, 2022, 10:01 p.m. UTC | #6
On Fri, Mar 04, 2022 at 01:30:29AM +0000, Chris Packham wrote:
> Hi All,
> 
> On 21/08/20 10:11, Bjorn Helgaas wrote:
> > On Thu, Aug 06, 2020 at 04:14:55PM +1200, Mark Tomlinson wrote:
> >> For hardware that only supports 32-bit writes to PCI there is the
> >> possibility of clearing RW1C (write-one-to-clear) bits. A rate-limited
> >> message was introduced by fb2659230120, but rate-limiting is not the
> >> best choice here. Some devices may not show the warnings they should if
> >> another device has just produced a bunch of warnings. Also, the number
> >> of messages can be a nuisance on devices which are otherwise working
> >> fine.
> >>
> >> This patch changes the ratelimit to a single warning per bus. This
> >> ensures no bus is 'starved' of emitting a warning and also that there
> >> isn't a continuous stream of warnings. It would be preferable to have a
> >> warning per device, but the pci_dev structure is not available here, and
> >> a lookup from devfn would be far too slow.
> >>
> >> Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
> >> Fixes: fb2659230120 ("PCI: Warn on possible RW1C corruption for sub-32 bit config writes")
> >> Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
> > Applied with collected reviews/acks to pci/enumeration for v5.10,
> > thanks!
> 
> Whatever happened to this change?
> 
> I'm just going through our queue of patches that have been sent upstream 
> and expected this one to be gone after we pulled v5.10. Looking at 
> Linus's tree I don't see it ever having been applied. I couldn't see 
> anything on the relevant mailing lists suggesting that there was a 
> problem with this change so I'm just wondering what's happened to it?

Sorry, I blew it somehow and dropped it.  I applied it again for
v5.18.  Thanks for noticing!

Patch

diff --git a/drivers/pci/access.c b/drivers/pci/access.c
index 79c4a2ef269a..b452467fd133 100644
--- a/drivers/pci/access.c
+++ b/drivers/pci/access.c
@@ -160,9 +160,12 @@  int pci_generic_config_write32(struct pci_bus *bus, unsigned int devfn,
 	 * write happen to have any RW1C (write-one-to-clear) bits set, we
 	 * just inadvertently cleared something we shouldn't have.
 	 */
-	dev_warn_ratelimited(&bus->dev, "%d-byte config write to %04x:%02x:%02x.%d offset %#x may corrupt adjacent RW1C bits\n",
-			     size, pci_domain_nr(bus), bus->number,
-			     PCI_SLOT(devfn), PCI_FUNC(devfn), where);
+	if (!bus->unsafe_warn) {
+		dev_warn(&bus->dev, "%d-byte config write to %04x:%02x:%02x.%d offset %#x may corrupt adjacent RW1C bits\n",
+			 size, pci_domain_nr(bus), bus->number,
+			 PCI_SLOT(devfn), PCI_FUNC(devfn), where);
+		bus->unsafe_warn = 1;
+	}
 
 	mask = ~(((1 << (size * 8)) - 1) << ((where & 0x3) * 8));
 	tmp = readl(addr) & mask;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 34c1c4f45288..85211a787f8b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -626,6 +626,7 @@  struct pci_bus {
 	struct bin_attribute	*legacy_io;	/* Legacy I/O for this bus */
 	struct bin_attribute	*legacy_mem;	/* Legacy mem */
 	unsigned int		is_added:1;
+	unsigned int		unsafe_warn:1;	/* warned about RW1C config write */
 };
 
 #define to_pci_bus(n)	container_of(n, struct pci_bus, dev)
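
The warn-once-per-bus pattern in the patch can be tried outside the kernel with a small mock. This is only a sketch under stated assumptions: fake_bus, warn_sub32_write() and the FAKE_PCI_* macros are hypothetical stand-ins (the macros use the same devfn bit layout as the kernel's PCI_SLOT()/PCI_FUNC()), and fprintf() takes the place of dev_warn().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same devfn bit layout as the kernel's PCI_SLOT()/PCI_FUNC(). */
#define FAKE_PCI_SLOT(devfn)	(((devfn) >> 3) & 0x1f)
#define FAKE_PCI_FUNC(devfn)	((devfn) & 0x07)

/* Hypothetical stand-in for struct pci_bus; only the fields used here. */
struct fake_bus {
	unsigned int number;
	unsigned int unsafe_warn:1;	/* warned about RW1C config write */
};

/*
 * Warn about a sub-32-bit config write at most once per bus.
 * Returns 1 if a warning was printed, 0 if it was suppressed.
 */
static int warn_sub32_write(struct fake_bus *bus, unsigned int devfn,
			    unsigned int where, int size)
{
	assert(bus != NULL);

	if (bus->unsafe_warn)
		return 0;

	fprintf(stderr,
		"bus %02x: %d-byte config write to %02x.%u offset %#x may corrupt adjacent RW1C bits\n",
		bus->number, size, FAKE_PCI_SLOT(devfn),
		FAKE_PCI_FUNC(devfn), where);
	bus->unsafe_warn = 1;
	return 1;
}
```

Because the flag lives in each bus structure rather than in one shared rate limiter, a noisy bus cannot suppress the first warning from another bus: a hundred calls on one fake_bus followed by a single call on a second one print exactly two messages in this mock. The trade-off noted in the commit message still applies, since only the first offending device on each bus is named.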