Message ID | cover.1734005191.git.karolina.stolarek@oracle.com (mailing list archive) |
---|---|
Series | Rate limit PCIe Correctable Errors |
On Thu, 12 Dec 2024 14:27:28 +0000
Karolina Stolarek <karolina.stolarek@oracle.com> wrote:

> TL;DR
> ====
>
> We are getting multiple reports about excessive logging of Correctable
> Errors with no clear common root cause. As these errors are already
> corrected by hardware, it makes sense to limit them. Introduce
> a ratelimit state definition to pci_dev to control the number of
> messages reported by a Root Port within a specified time interval.
> The series adds other improvements in the area, as outlined in the
> Proposal section.

Hi Karolina,

Just to check, this doesn't affect tracepoints? From a quick read
of the patches they look like they will still be triggered so monitoring
tools will see the correctable errors. That's definitely the right
option even if we limit prints to the kernel log.

Assuming I read it right, change the series title to make it clear this
is just the prints to the kernel log that you are touching.

Thanks,

Jonathan

Hi Jonathan,

Many thanks for taking a look at the patches.

On 16/12/2024 11:44, Jonathan Cameron wrote:
>
> Hi Karolina,
>
> Just to check, this doesn't affect tracepoints? From a quick read
> of the patches they look like they will still be triggered so monitoring
> tools will see the correctable errors. That's definitely the right
> option even if we limit prints to the kernel log.

The tracing seems to be working -- rasdaemon recorded every correctable
error despite that we didn't print all of them to the kernel log.

> Assuming I read it right, change the series title to make it clear this
> is just the prints to the kernel log that you are touching.

OK, will do that in the next version.

All the best,
Karolina

>
> Thanks,
>
> Jonathan
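
The behaviour discussed in this thread (every correctable error still reaches the tracepoint, while only a limited number per time interval are printed to the kernel log) can be illustrated with a small userspace sketch. This is not the code from the series: `struct cor_ratelimit`, `cor_ratelimit_ok()`, `struct cor_stats` and `report_correctable()` are hypothetical names made up for illustration, time is an abstract counter rather than jiffies, and the interval/burst scheme is only assumed to resemble what the patches do.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a per-device ratelimit state. */
struct cor_ratelimit {
	long interval;	/* window length, in abstract time units */
	int burst;	/* maximum messages printed per window */
	long begin;	/* start time of the current window */
	int printed;	/* messages printed in the current window */
	int missed;	/* messages suppressed in the current window */
};

/* Counters standing in for "seen by tracing" and "printed to the log". */
struct cor_stats {
	int traced;
	int logged;
};

/* Return true if a message may be printed at time `now`. */
static bool cor_ratelimit_ok(struct cor_ratelimit *rs, long now)
{
	if (now - rs->begin >= rs->interval) {
		/* The window has expired: open a new one and reset counters. */
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}

/* Report one correctable error: always "trace" it, print only if allowed. */
static void report_correctable(struct cor_ratelimit *rs,
			       struct cor_stats *st, long now)
{
	st->traced++;	/* stand-in for the tracepoint, never rate limited */

	if (cor_ratelimit_ok(rs, now)) {
		st->logged++;
		printf("t=%ld: correctable error\n", now);
	}
}
```

With `interval = 5` and `burst = 2`, feeding one error per time unit for ten units traces all ten errors but prints only four of them, two per window. That matches the observation above that rasdaemon recorded every error even though not all of them appeared in the kernel log.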