diff mbox series

[v4,6/7] PCI/AER: Add ratelimits to PCI AER Documentation

Message ID 20250320082057.622983-7-pandoh@google.com (mailing list archive)
State Superseded
Delegated to: Bjorn Helgaas
Headers show
Series Rate limit AER logs | expand

Commit Message

Jon Pan-Doh March 20, 2025, 8:20 a.m. UTC
Add ratelimits section for rationale and defaults.

Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
Signed-off-by: Jon Pan-Doh <pandoh@google.com>
---
 Documentation/PCI/pcieaer-howto.rst | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Karolina Stolarek March 20, 2025, 2:57 p.m. UTC | #1
On 20/03/2025 09:20, Jon Pan-Doh wrote:
> Add ratelimits section for rationale and defaults.
> 
> Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
> Signed-off-by: Jon Pan-Doh <pandoh@google.com>
> ---
>   Documentation/PCI/pcieaer-howto.rst | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/Documentation/PCI/pcieaer-howto.rst b/Documentation/PCI/pcieaer-howto.rst
> index f013f3b27c82..896d2a232a90 100644
> --- a/Documentation/PCI/pcieaer-howto.rst
> +++ b/Documentation/PCI/pcieaer-howto.rst
> @@ -85,6 +85,17 @@ In the example, 'Requester ID' means the ID of the device that sent
>   the error message to the Root Port. Please refer to PCIe specs for other
>   fields.
>   
> +AER Ratelimits
> +--------------
> +
> +Since error messages can be generated for each transaction, we may see
> +large volumes of errors reported. To prevent spammy devices from flooding
> +the console/stalling execution, messages are throttled by device and error
> +type (correctable vs. uncorrectable).
> +
> +AER uses the default ratelimit of DEFAULT_RATELIMIT_BURST (10 events) over
> +DEFAULT_RATELIMIT_INTERVAL (5 seconds).

This is not quite true, as we double the number of available bursts so 
we can print both the port info and an error message. We could say that 
this limit (2 * DEFAULT_RATELIMIT_BURST) roughly translates to ten error 
notifications within the 5 second window.

All the best,
Karolina
Sathyanarayanan Kuppuswamy March 21, 2025, 1 a.m. UTC | #2
On 3/20/25 1:20 AM, Jon Pan-Doh wrote:
> Add ratelimits section for rationale and defaults.
>
> Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
> Signed-off-by: Jon Pan-Doh <pandoh@google.com>
> ---

Reviewed-by: Kuppuswamy Sathyanarayanan 
<sathyanarayanan.kuppuswamy@linux.intel.com>

>   Documentation/PCI/pcieaer-howto.rst | 11 +++++++++++
>   1 file changed, 11 insertions(+)
>
> diff --git a/Documentation/PCI/pcieaer-howto.rst b/Documentation/PCI/pcieaer-howto.rst
> index f013f3b27c82..896d2a232a90 100644
> --- a/Documentation/PCI/pcieaer-howto.rst
> +++ b/Documentation/PCI/pcieaer-howto.rst
> @@ -85,6 +85,17 @@ In the example, 'Requester ID' means the ID of the device that sent
>   the error message to the Root Port. Please refer to PCIe specs for other
>   fields.
>   
> +AER Ratelimits
> +--------------
> +
> +Since error messages can be generated for each transaction, we may see
> +large volumes of errors reported. To prevent spammy devices from flooding
> +the console/stalling execution, messages are throttled by device and error
> +type (correctable vs. uncorrectable).
> +
> +AER uses the default ratelimit of DEFAULT_RATELIMIT_BURST (10 events) over
> +DEFAULT_RATELIMIT_INTERVAL (5 seconds).
> +
>   AER Statistics / Counters
>   -------------------------
>
diff mbox series

Patch

diff --git a/Documentation/PCI/pcieaer-howto.rst b/Documentation/PCI/pcieaer-howto.rst
index f013f3b27c82..896d2a232a90 100644
--- a/Documentation/PCI/pcieaer-howto.rst
+++ b/Documentation/PCI/pcieaer-howto.rst
@@ -85,6 +85,17 @@  In the example, 'Requester ID' means the ID of the device that sent
 the error message to the Root Port. Please refer to PCIe specs for other
 fields.
 
+AER Ratelimits
+--------------
+
+Since error messages can be generated for each transaction, we may see
+large volumes of errors reported. To prevent spammy devices from flooding
+the console/stalling execution, messages are throttled by device and error
+type (correctable vs. uncorrectable).
+
+AER uses the default ratelimit of DEFAULT_RATELIMIT_BURST (10 events) over
+DEFAULT_RATELIMIT_INTERVAL (5 seconds).
+
 AER Statistics / Counters
 -------------------------