diff mbox series

[net-next,13/13] bnxt_en: Make PTP TX timestamp HWRM query silent

Message ID 20231212005122.2401-14-michael.chan@broadcom.com (mailing list archive)
State Accepted
Commit 056bce63c469ca397e30d16bdbd4408489f089a9
Delegated to: Netdev Maintainers
Series bnxt_en: Update for net-next

Checks

Context                         Check    Description
netdev/series_format            success  Posting correctly formatted
netdev/tree_selection           success  Clearly marked for net-next
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 1115 this patch: 1115
netdev/cc_maintainers           warning  1 maintainers not CCed: richardcochran@gmail.com
netdev/build_clang              success  Errors and warnings before: 1142 this patch: 1142
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 1142 this patch: 1142
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 18 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Michael Chan Dec. 12, 2023, 12:51 a.m. UTC
From: Pavan Chebbi <pavan.chebbi@broadcom.com>

In a busy network, especially with flow control enabled, we may
experience timestamp query failures fairly regularly. After a while,
dmesg may be flooded with timestamp query failure error messages.

Silence the error message from the low level hwrm function that
sends the firmware message.  Change netdev_err() to netdev_WARN_ONCE()
if this FW call ever fails.

Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Breno Leitao Jan. 24, 2024, 10:18 a.m. UTC | #1
Hello Michael, Pavan,

On Mon, Dec 11, 2023 at 04:51:22PM -0800, Michael Chan wrote:
> From: Pavan Chebbi <pavan.chebbi@broadcom.com>
> 
> In a busy network, especially with flow control enabled, we may
> experience timestamp query failures fairly regularly. After a while,
> dmesg may be flooded with timestamp query failure error messages.
> 
> Silence the error message from the low level hwrm function that
> sends the firmware message.  Change netdev_err() to netdev_WARN_ONCE()
> if this FW call ever fails.

This is starting to cause a warning now, which is not ideal, because
this error-now-warning happens quite frequently in Meta's fleet.

At the same time, we want to have our kernels running warninglessly.
Moreover, the call stack displayed by the warning doesn't seem to be
quite useful and does not help to investigate "the problem", I _think_.

Is it OK to move it back to error, something as:

-	netdev_WARN_ONCE(bp->dev,
+	netdev_err_once(bp->dev,
			 "TS query for TX timer failed rc = %x\n", rc);

Thank you
Pavan Chebbi Jan. 25, 2024, 3:35 a.m. UTC | #2
On Wed, Jan 24, 2024 at 3:48 PM Breno Leitao <leitao@debian.org> wrote:
>
> Hello Michael, Pavan,
>
> On Mon, Dec 11, 2023 at 04:51:22PM -0800, Michael Chan wrote:
> > From: Pavan Chebbi <pavan.chebbi@broadcom.com>
> >
> > In a busy network, especially with flow control enabled, we may
> > experience timestamp query failures fairly regularly. After a while,
> > dmesg may be flooded with timestamp query failure error messages.
> >
> > Silence the error message from the low level hwrm function that
> > sends the firmware message.  Change netdev_err() to netdev_WARN_ONCE()
> > if this FW call ever fails.
>
> This is starting to cause a warning now, which is not ideal, because
> this error-now-warning happens quite frequently in Meta's fleet.
>
> At the same time, we want to have our kernels running warninglessly.
> Moreover, the call stack displayed by the warning doesn't seem to be
> quite useful and does not help to investigate "the problem", I _think_.
>
> Is it OK to move it back to error, something as:
>
> -       netdev_WARN_ONCE(bp->dev,
> +       netdev_err_once(bp->dev,
>                          "TS query for TX timer failed rc = %x\n", rc);

Hi Breno, I think it is OK to change.
Would you be submitting a patch for this?

>
> Thank you
Michael Chan Jan. 25, 2024, 4:47 a.m. UTC | #3
On Wed, Jan 24, 2024 at 7:35 PM Pavan Chebbi <pavan.chebbi@broadcom.com> wrote:
>
> On Wed, Jan 24, 2024 at 3:48 PM Breno Leitao <leitao@debian.org> wrote:
> >
> > Hello Michael, Pavan,
> >
> > On Mon, Dec 11, 2023 at 04:51:22PM -0800, Michael Chan wrote:
> > > From: Pavan Chebbi <pavan.chebbi@broadcom.com>
> > >
> > > In a busy network, especially with flow control enabled, we may
> > > experience timestamp query failures fairly regularly. After a while,
> > > dmesg may be flooded with timestamp query failure error messages.
> > >
> > > Silence the error message from the low level hwrm function that
> > > sends the firmware message.  Change netdev_err() to netdev_WARN_ONCE()
> > > if this FW call ever fails.
> >
> > This is starting to cause a warning now, which is not ideal, because
> > this error-now-warning happens quite frequently in Meta's fleet.
> >
> > At the same time, we want to have our kernels running warninglessly.
> > Moreover, the call stack displayed by the warning doesn't seem to be
> > quite useful and does not help to investigate "the problem", I _think_.
> >
> > Is it OK to move it back to error, something as:
> >
> > -       netdev_WARN_ONCE(bp->dev,
> > +       netdev_err_once(bp->dev,
> >                          "TS query for TX timer failed rc = %x\n", rc);
>
> Hi Breno, I think it is OK to change.
> Would you be submitting a patch for this?
>

Why not netdev_warn_once()?  It will just print a message at the
warning level without the stack trace.  I think we consider this
condition to be just a warning and not an error.  Thanks.
Breno Leitao Jan. 25, 2024, 9:51 a.m. UTC | #4
On Thu, Jan 25, 2024 at 09:05:39AM +0530, Pavan Chebbi wrote:
> On Wed, Jan 24, 2024 at 3:48 PM Breno Leitao <leitao@debian.org> wrote:
> >
> > Hello Michael, Pavan,
> >
> > On Mon, Dec 11, 2023 at 04:51:22PM -0800, Michael Chan wrote:
> > > From: Pavan Chebbi <pavan.chebbi@broadcom.com>
> > >
> > > In a busy network, especially with flow control enabled, we may
> > > experience timestamp query failures fairly regularly. After a while,
> > > dmesg may be flooded with timestamp query failure error messages.
> > >
> > > Silence the error message from the low level hwrm function that
> > > sends the firmware message.  Change netdev_err() to netdev_WARN_ONCE()
> > > if this FW call ever fails.
> >
> > This is starting to cause a warning now, which is not ideal, because
> > this error-now-warning happens quite frequently in Meta's fleet.
> >
> > At the same time, we want to have our kernels running warninglessly.
> > Moreover, the call stack displayed by the warning doesn't seem to be
> > quite useful and does not help to investigate "the problem", I _think_.
> >
> > Is it OK to move it back to error, something as:
> >
> > -       netdev_WARN_ONCE(bp->dev,
> > +       netdev_err_once(bp->dev,
> >                          "TS query for TX timer failed rc = %x\n", rc);
> 
> Hi Breno, I think it is OK to change.

> Would you be submitting a patch for this?

Yes, let me send a patch. I will follow Michael's suggestion and use
netdev_warn_once()

Thanks!
Breno Leitao Jan. 25, 2024, 9:52 a.m. UTC | #5
On Wed, Jan 24, 2024 at 08:47:03PM -0800, Michael Chan wrote:
> On Wed, Jan 24, 2024 at 7:35 PM Pavan Chebbi <pavan.chebbi@broadcom.com> wrote:
> >
> > On Wed, Jan 24, 2024 at 3:48 PM Breno Leitao <leitao@debian.org> wrote:
> > >
> > > Hello Michael, Pavan,
> > >
> > > On Mon, Dec 11, 2023 at 04:51:22PM -0800, Michael Chan wrote:
> > > > From: Pavan Chebbi <pavan.chebbi@broadcom.com>
> > > >
> > > > In a busy network, especially with flow control enabled, we may
> > > > experience timestamp query failures fairly regularly. After a while,
> > > > dmesg may be flooded with timestamp query failure error messages.
> > > >
> > > > Silence the error message from the low level hwrm function that
> > > > sends the firmware message.  Change netdev_err() to netdev_WARN_ONCE()
> > > > if this FW call ever fails.
> > >
> > > This is starting to cause a warning now, which is not ideal, because
> > > this error-now-warning happens quite frequently in Meta's fleet.
> > >
> > > At the same time, we want to have our kernels running warninglessly.
> > > Moreover, the call stack displayed by the warning doesn't seem to be
> > > quite useful and does not help to investigate "the problem", I _think_.
> > >
> > > Is it OK to move it back to error, something as:
> > >
> > > -       netdev_WARN_ONCE(bp->dev,
> > > +       netdev_err_once(bp->dev,
> > >                          "TS query for TX timer failed rc = %x\n", rc);
> >
> > Hi Breno, I think it is OK to change.
> > Would you be submitting a patch for this?
> >
> 
> Why not netdev_warn_once()?  It will just print a message at the
> warning level without the stack trace.  I think we consider this
> condition to be just a warning and not an error.  Thanks.

This is even better. I will send a patch shortly.

Thanks

Patch

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
index 3d1c36d384c2..adad188e38b8 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
@@ -129,7 +129,7 @@  static int bnxt_hwrm_port_ts_query(struct bnxt *bp, u32 flags, u64 *ts)
 	}
 	resp = hwrm_req_hold(bp, req);
 
-	rc = hwrm_req_send(bp, req);
+	rc = hwrm_req_send_silent(bp, req);
 	if (!rc)
 		*ts = le64_to_cpu(resp->ptp_msg_ts);
 	hwrm_req_drop(bp, req);
@@ -684,8 +684,8 @@  static void bnxt_stamp_tx_skb(struct bnxt *bp, struct sk_buff *skb)
 		timestamp.hwtstamp = ns_to_ktime(ns);
 		skb_tstamp_tx(ptp->tx_skb, &timestamp);
 	} else {
-		netdev_err(bp->dev, "TS query for TX timer failed rc = %x\n",
-			   rc);
+		netdev_WARN_ONCE(bp->dev,
+				 "TS query for TX timer failed rc = %x\n", rc);
 	}
 
 	dev_kfree_skb_any(ptp->tx_skb);
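
The thread above settles on a "print once" message without a stack trace. As a rough illustration of that pattern only, here is a minimal userspace sketch. The names warn_once(), messages_printed, stamp_tx_result() and demo() are hypothetical stand-ins invented for this example; this is not the kernel's netdev_warn_once() implementation, and it assumes single-threaded use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts how many messages were actually emitted (for the demo only). */
static int messages_printed;

/*
 * Hypothetical userspace stand-in for a "print once" helper.
 * Each expansion gets its own static flag, so the message is
 * emitted at most once per call site. No stack trace is dumped.
 */
#define warn_once(fmt, ...)						\
	do {								\
		static bool warned;					\
		if (!warned) {						\
			warned = true;					\
			messages_printed++;				\
			fprintf(stderr, "warning: " fmt, __VA_ARGS__);	\
		}							\
	} while (0)

/* Mimics the failure branch of the patch: report a nonzero rc once. */
static void stamp_tx_result(int rc)
{
	if (rc)
		warn_once("TS query for TX timer failed rc = %x\n", rc);
}

/* Repeated failures reach the same call site, so only one line is logged. */
static void demo(void)
{
	stamp_tx_result(0x1);
	stamp_tx_result(0x2);
	stamp_tx_result(0x3);
	assert(messages_printed == 1);
}
```

Calling demo() prints a single warning line for the first failure and stays silent for the later ones, which is the log-flood behaviour the commit message wants to avoid. The difference debated in the thread is separate from this sketch: netdev_WARN_ONCE() additionally produces a kernel warning with a call trace, while netdev_warn_once() only prints the message at warning level.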