diff mbox series

[net,01/15] net/mlx5e: E-switch, Fix rate calculation for overflow

Message ID 20210212025641.323844-2-saeed@kernel.org (mailing list archive)
State Accepted
Delegated to: Netdev Maintainers
Headers show
Series [net,01/15] net/mlx5e: E-switch, Fix rate calculation for overflow | expand

Checks

Context Check Description
netdev/cover_letter success Pull request
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net
netdev/subject_prefix success Link
netdev/cc_maintainers warning 4 maintainers not CCed: linux-rdma@vger.kernel.org paulb@mellanox.com eli@mellanox.com leon@kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link
netdev/stable success Stable not CCed

Commit Message

Saeed Mahameed Feb. 12, 2021, 2:56 a.m. UTC
From: Parav Pandit <parav@nvidia.com>

rate_bytes_ps is a 64-bit field. It passed as 32-bit field to
apply_police_params(). Due to this when police rate is higher
than 4Gbps, 32-bit calculation ignores the carry. This results
in incorrect rate configurationn the device.

Fix it by performing 64-bit calculation.

Fixes: fcb64c0f5640 ("net/mlx5: E-Switch, add ingress rate support")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Arnd Bergmann Feb. 27, 2021, 12:14 p.m. UTC | #1
On Fri, Feb 12, 2021 at 3:59 AM Saeed Mahameed <saeed@kernel.org> wrote:
>
> From: Parav Pandit <parav@nvidia.com>
>
> rate_bytes_ps is a 64-bit field. It passed as 32-bit field to
> apply_police_params(). Due to this when police rate is higher
> than 4Gbps, 32-bit calculation ignores the carry. This results
> in incorrect rate configurationn the device.
>
> Fix it by performing 64-bit calculation.

I just stumbled over this commit while looking at an unrelated
problem.

> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> index dd0bfbacad47..717fbaa6ce73 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> @@ -5040,7 +5040,7 @@ static int apply_police_params(struct mlx5e_priv *priv, u64 rate,
>          */
>         if (rate) {
>                 rate = (rate * BITS_PER_BYTE) + 500000;
> -               rate_mbps = max_t(u32, do_div(rate, 1000000), 1);
> +               rate_mbps = max_t(u64, do_div(rate, 1000000), 1);

I think there are still multiple issues with this line:

- Before commit 1fe3e3166b35 ("net/mlx5e: E-switch, Fix rate calculation for
  overflow"), it was trying to calculate rate divided by 1000000, but now
  it uses the remainder of the division rather than the quotient. I assume
  this was meant to use div_u64() instead of do_div().

- Both div_u64() and do_div() return a 32-bit number, and '1' is a constant
  that also comfortably fits into a 32-bit number, so changing the max_t
  to return a 64-bit type has no effect on the result

- The maximum of an arbitrary unsigned integer and '1' is either one or zero,
   so there doesn't need to be an expensive division here at all. From the
   comment it sounds like the intention was to use 'min_t()' instead
of 'max_t()'.
   It has however used 'max_t' since the code was first introduced.

        Arnd
Saeed Mahameed March 2, 2021, 12:52 a.m. UTC | #2
On Sat, 2021-02-27 at 13:14 +0100, Arnd Bergmann wrote:
> On Fri, Feb 12, 2021 at 3:59 AM Saeed Mahameed <saeed@kernel.org>
> wrote:
> > 
> > From: Parav Pandit <parav@nvidia.com>
> > 
> > rate_bytes_ps is a 64-bit field. It passed as 32-bit field to
> > apply_police_params(). Due to this when police rate is higher
> > than 4Gbps, 32-bit calculation ignores the carry. This results
> > in incorrect rate configurationn the device.
> > 
> > Fix it by performing 64-bit calculation.
> 
> I just stumbled over this commit while looking at an unrelated
> problem.
> 
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> > b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> > index dd0bfbacad47..717fbaa6ce73 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> > @@ -5040,7 +5040,7 @@ static int apply_police_params(struct
> > mlx5e_priv *priv, u64 rate,
> >          */
> >         if (rate) {
> >                 rate = (rate * BITS_PER_BYTE) + 500000;
> > -               rate_mbps = max_t(u32, do_div(rate, 1000000), 1);
> > +               rate_mbps = max_t(u64, do_div(rate, 1000000), 1);
> 
> I think there are still multiple issues with this line:
> 
> - Before commit 1fe3e3166b35 ("net/mlx5e: E-switch, Fix rate
> calculation for
>   overflow"), it was trying to calculate rate divided by 1000000, but
> now
>   it uses the remainder of the division rather than the quotient. I
> assume
>   this was meant to use div_u64() instead of do_div().
> 

Yes, I already have a patch lined up to fix this issue.

Thanks for spotting this.

> - Both div_u64() and do_div() return a 32-bit number, and '1' is a
> constant
>   that also comfortably fits into a 32-bit number, so changing the
> max_t
>   to return a 64-bit type has no effect on the result
> 

as of the above comment, we shouldn't be using the return value of
do_div().


> - The maximum of an arbitrary unsigned integer and '1' is either one
> or zero,
>    so there doesn't need to be an expensive division here at all.
> From the
>    comment it sounds like the intention was to use 'min_t()' instead
> of 'max_t()'.
>    It has however used 'max_t' since the code was first introduced.
> 

if the input rate is less that 1mbps then the quotient will be 0,
otherwise we want the quotient, and we don't allow 0, so max_t(rate, 1)
should be used, what am I missing ?
Arnd Bergmann March 2, 2021, 9:01 a.m. UTC | #3
On Tue, Mar 2, 2021 at 1:52 AM Saeed Mahameed <saeed@kernel.org> wrote:
> On Sat, 2021-02-27 at 13:14 +0100, Arnd Bergmann wrote:
> > On Fri, Feb 12, 2021 at 3:59 AM Saeed Mahameed <saeed@kernel.org>
> > wrote:
> > >
> > > From: Parav Pandit <parav@nvidia.com>
> > >
> > > rate_bytes_ps is a 64-bit field. It passed as 32-bit field to
> > > apply_police_params(). Due to this when police rate is higher
> > > than 4Gbps, 32-bit calculation ignores the carry. This results
> > > in incorrect rate configurationn the device.
> > >
> > > Fix it by performing 64-bit calculation.
> >
> > I just stumbled over this commit while looking at an unrelated
> > problem.
> >
> > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> > > b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> > > index dd0bfbacad47..717fbaa6ce73 100644
> > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> > > @@ -5040,7 +5040,7 @@ static int apply_police_params(struct
> > > mlx5e_priv *priv, u64 rate,
> > >          */
> > >         if (rate) {
> > >                 rate = (rate * BITS_PER_BYTE) + 500000;
> > > -               rate_mbps = max_t(u32, do_div(rate, 1000000), 1);
> > > +               rate_mbps = max_t(u64, do_div(rate, 1000000), 1);
> >
> > I think there are still multiple issues with this line:
> >
> > - Before commit 1fe3e3166b35 ("net/mlx5e: E-switch, Fix rate
> > calculation for
> >   overflow"), it was trying to calculate rate divided by 1000000, but
> > now
> >   it uses the remainder of the division rather than the quotient. I
> > assume
> >   this was meant to use div_u64() instead of do_div().
> >
>
> Yes, I already have a patch lined up to fix this issue.

ok

> > - Both div_u64() and do_div() return a 32-bit number, and '1' is a
> > constant
> >   that also comfortably fits into a 32-bit number, so changing the
> > max_t
> >   to return a 64-bit type has no effect on the result
> >
>
> as of the above comment, we shouldn't be using the return value of
> do_div().

Ok, I was confused here because do_div() returns a 32-bit type,
and is called by div_u64(). Of course that was nonsense because
do_div() returns the 32-bit remainder, while the division result
remains 64-bit.

> > - The maximum of an arbitrary unsigned integer and '1' is either one
> > or zero,
> >    so there doesn't need to be an expensive division here at all.
> > From the
> >    comment it sounds like the intention was to use 'min_t()' instead
> > of 'max_t()'.
> >    It has however used 'max_t' since the code was first introduced.
> >
>
> if the input rate is less that 1mbps then the quotient will be 0,
> otherwise we want the quotient, and we don't allow 0, so max_t(rate, 1)
> should be used, what am I missing ?

And I have no idea what I was thinking here, of course you are right
and there is no other bug.

       Arnd
diff mbox series

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index dd0bfbacad47..717fbaa6ce73 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -5040,7 +5040,7 @@  static int apply_police_params(struct mlx5e_priv *priv, u64 rate,
 	 */
 	if (rate) {
 		rate = (rate * BITS_PER_BYTE) + 500000;
-		rate_mbps = max_t(u32, do_div(rate, 1000000), 1);
+		rate_mbps = max_t(u64, do_div(rate, 1000000), 1);
 	}
 
 	err = mlx5_esw_modify_vport_rate(esw, vport_num, rate_mbps);