diff mbox series

[net,V2,1/4] octeontx2-pf: Update PFC configuration

Message ID 20230809070532.3252464-2-sumang@marvell.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series Fix PFC related issues | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 9 this patch: 9
netdev/cc_maintainers success CCed 10 of 10 maintainers
netdev/build_clang success Errors and warnings before: 1351 this patch: 1351
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1351 this patch: 1351
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 24 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Suman Ghosh Aug. 9, 2023, 7:05 a.m. UTC
As of now we are creating/deleting Tx schedulers when user is
setting PFC on/off. The problem is if we have a running traffic on
the interface and as we are updating the sq->smq mapping on the fly,
we might loose completion interrupt for some packets. As a result of
that a watchdog reset is hit from BQL.
This patch solves the issue by simply calling interface off/on APIs
which will reconfigure all the queues. We might loss the running traffic
momentarily but that should be fine.

Fixes: 99c969a83d82 ("octeontx2-pf: Add egress PFC support")
Signed-off-by: Suman Ghosh <sumang@marvell.com>
---
 .../net/ethernet/marvell/octeontx2/nic/otx2_dcbnl.c  | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

Comments

Jakub Kicinski Aug. 9, 2023, 11:05 p.m. UTC | #1
On Wed, 9 Aug 2023 12:35:29 +0530 Suman Ghosh wrote:
> +		otx2_stop(dev);
> +		otx2_open(dev);

If there is any error in open() this will silently take the interface
down. Can't you force a NAPI poll or some such, if the concern is a
missed IRQ?
Suman Ghosh Aug. 18, 2023, 6:54 a.m. UTC | #2
>----------------------------------------------------------------------
>On Wed, 9 Aug 2023 12:35:29 +0530 Suman Ghosh wrote:
>> +		otx2_stop(dev);
>> +		otx2_open(dev);
>
>If there is any error in open() this will silently take the interface
>down. Can't you force a NAPI poll or some such, if the concern is a
>missed IRQ?
[Suman] I can check the return type of open() and report in case of error. But even if we force NAPI poll we might not be able to control the watchdog reset. If we have a running traffic and interface is up, when we force NAPI poll, then the new packets will have updated scheduler queue and we will still loose the completion interrupts of the previous packets. Do you see any issue if I handle the error situation during open() call?
>--
>pw-bot: cr
Jakub Kicinski Aug. 18, 2023, 4 p.m. UTC | #3
Thanks for replying a week late, always a good use of maintainers time
to swap back all the context from random conversations!

On Fri, 18 Aug 2023 06:54:52 +0000 Suman Ghosh wrote:
> >If there is any error in open() this will silently take the interface
> >down. Can't you force a NAPI poll or some such, if the concern is a
> >missed IRQ?  
> [Suman] I can check the return type of open() and report in case of
> error. But even if we force NAPI poll we might not be able to control
> the watchdog reset. If we have a running traffic and interface is up,
> when we force NAPI poll, then the new packets will have updated
> scheduler queue and we will still loose the completion interrupts of
> the previous packets.

Why does it matter that you lost an interrupt if the poll has happened.
Can you describe the problem in more detail?

> Do you see any issue if I handle the error situation during open() call?

No, for years we have been rejecting code which does this.
If the machine is under memory pressure allocating all the buffers 
for rings can easily fail and make the machine drop off the network.
You either have to refuse to change this setting at runtime or
implement prepare/commit reconfiguration model like other modern
drivers, where allocations are done before the stop().
Suman Ghosh Aug. 19, 2023, 10:10 a.m. UTC | #4
>Thanks for replying a week late, always a good use of maintainers time
>to swap back all the context from random conversations!
[Suman] Sorry for being late this time Jakub. I will remove this patch from the patch set and will push a new version with the other three patches. I will analyze the issue in more detail and will produce a proper fix.
>
>On Fri, 18 Aug 2023 06:54:52 +0000 Suman Ghosh wrote:
>> >If there is any error in open() this will silently take the interface
>> >down. Can't you force a NAPI poll or some such, if the concern is a
>> >missed IRQ?
>> [Suman] I can check the return type of open() and report in case of
>> error. But even if we force NAPI poll we might not be able to control
>> the watchdog reset. If we have a running traffic and interface is up,
>> when we force NAPI poll, then the new packets will have updated
>> scheduler queue and we will still loose the completion interrupts of
>> the previous packets.
>
>Why does it matter that you lost an interrupt if the poll has happened.
>Can you describe the problem in more detail?
>
>> Do you see any issue if I handle the error situation during open()
>call?
>
>No, for years we have been rejecting code which does this.
>If the machine is under memory pressure allocating all the buffers for
>rings can easily fail and make the machine drop off the network.
>You either have to refuse to change this setting at runtime or implement
>prepare/commit reconfiguration model like other modern drivers, where
>allocations are done before the stop().
diff mbox series

Patch

diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_dcbnl.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_dcbnl.c
index ccaf97bb1ce0..d54edfa8fcc9 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_dcbnl.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_dcbnl.c
@@ -406,6 +406,7 @@  static int otx2_dcbnl_ieee_getpfc(struct net_device *dev, struct ieee_pfc *pfc)
 static int otx2_dcbnl_ieee_setpfc(struct net_device *dev, struct ieee_pfc *pfc)
 {
 	struct otx2_nic *pfvf = netdev_priv(dev);
+	bool if_up = netif_running(dev);
 	int err;
 
 	/* Save PFC configuration to interface */
@@ -426,14 +427,9 @@  static int otx2_dcbnl_ieee_setpfc(struct net_device *dev, struct ieee_pfc *pfc)
 	if (err)
 		return err;
 
-	/* Request Per channel Bpids */
-	if (pfc->pfc_en)
-		otx2_nix_config_bp(pfvf, true);
-
-	err = otx2_pfc_txschq_update(pfvf);
-	if (err) {
-		dev_err(pfvf->dev, "%s failed to update TX schedulers\n", __func__);
-		return err;
+	if (if_up) {
+		otx2_stop(dev);
+		otx2_open(dev);
 	}
 
 	return 0;