Message ID | 20220315211225.2923496-2-anthony.l.nguyen@intel.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | Intel Wired LAN Driver Updates 2022-03-15 | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Clearly marked for net |
netdev/fixes_present | success | Fixes tag present in non-next series |
netdev/subject_prefix | success | Link |
netdev/cover_letter | success | Series has a cover letter |
netdev/patch_count | success | Link |
netdev/header_inline | success | No static functions without inline keyword in header files |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/cc_maintainers | success | CCed 4 of 4 maintainers |
netdev/build_clang | success | Errors and warnings before: 0 this patch: 0 |
netdev/module_param | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Signed-off-by tag matches author and committer |
netdev/verify_fixes | success | Fixes tag looks correct |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/checkpatch | warning | WARNING: line length of 87 exceeds 80 columns |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/source_inline | success | Was 0 now: 0 |
On Tue, 15 Mar 2022 14:12:23 -0700 Tony Nguyen wrote: > From: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > > It is possible to do NULL pointer dereference in routine that updates > Tx ring stats. Currently only stats and bytes are updated when ring s/stats/packets/ ? > pointer is valid, but later on ring is accessed to propagate gathered Tx > stats onto VSI stats. > > Change the existing logic to move to next ring when ring is NULL. > > Fixes: e72bba21355d ("ice: split ice_ring onto Tx/Rx separate structs") > Reported-by: kernel test robot <lkp@intel.com> > Reported-by: Dan Carpenter <dan.carpenter@oracle.com> > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > Acked-by: Alexander Lobakin <alexandr.lobakin@intel.com> > Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) > Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> > --- > drivers/net/ethernet/intel/ice/ice_main.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c > index 493942e910be..d4a7c39fd078 100644 > --- a/drivers/net/ethernet/intel/ice/ice_main.c > +++ b/drivers/net/ethernet/intel/ice/ice_main.c > @@ -5962,8 +5962,9 @@ ice_update_vsi_tx_ring_stats(struct ice_vsi *vsi, > u64 pkts = 0, bytes = 0; > > ring = READ_ONCE(rings[i]); Not really related to this patch but why is there a read_once() here? Aren't stats read under rtnl_lock? What is this protecting against? > - if (ring) > - ice_fetch_u64_stats_per_ring(&ring->syncp, ring->stats, &pkts, &bytes); > + if (!ring) > + continue; > + ice_fetch_u64_stats_per_ring(&ring->syncp, ring->stats, &pkts, &bytes);
On Tue, 2022-03-15 at 20:29 -0700, Jakub Kicinski wrote: > On Tue, 15 Mar 2022 14:12:23 -0700 Tony Nguyen wrote: > > From: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > > > > It is possible to do NULL pointer dereference in routine that > > updates > > Tx ring stats. Currently only stats and bytes are updated when ring > > s/stats/packets/ ? Will fix. > > > pointer is valid, but later on ring is accessed to propagate > > gathered Tx > > stats onto VSI stats. > > > > Change the existing logic to move to next ring when ring is NULL. > > > > Fixes: e72bba21355d ("ice: split ice_ring onto Tx/Rx separate > > structs") > > Reported-by: kernel test robot <lkp@intel.com> > > Reported-by: Dan Carpenter <dan.carpenter@oracle.com> > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > > Acked-by: Alexander Lobakin <alexandr.lobakin@intel.com> > > Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent > > worker at Intel) > > Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> > > --- > > drivers/net/ethernet/intel/ice/ice_main.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c > > b/drivers/net/ethernet/intel/ice/ice_main.c > > index 493942e910be..d4a7c39fd078 100644 > > --- a/drivers/net/ethernet/intel/ice/ice_main.c > > +++ b/drivers/net/ethernet/intel/ice/ice_main.c > > @@ -5962,8 +5962,9 @@ ice_update_vsi_tx_ring_stats(struct ice_vsi > > *vsi, > > u64 pkts = 0, bytes = 0; > > > > ring = READ_ONCE(rings[i]); > > Not really related to this patch but why is there a read_once() here? > Aren't stats read under rtnl_lock? What is this protecting against? It looks like it was based on a patch from i40e [1]. From the commit, I gather this is the reason: "Previously the stats were 64 bit but highly racy due to the fact that 64 bit transactions are not atomic on 32 bit systems." Thanks, Tony [1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=980e9b1186424fa3eb766d59fc91003d0ed1ed6a (Resending as some non-text formatting snuck in to my reply. Sorry for the spam)
On Wed, Mar 16, 2022 at 12:01 PM Nguyen, Anthony L <anthony.l.nguyen@intel.com> wrote: > > On Tue, 2022-03-15 at 20:29 -0700, Jakub Kicinski wrote: > > On Tue, 15 Mar 2022 14:12:23 -0700 Tony Nguyen wrote: > > > From: Maciej Fijalkowski <maciej.fijalkowski@intel.com> <snip> > > > pointer is valid, but later on ring is accessed to propagate > > > gathered Tx > > > stats onto VSI stats. > > > > > > Change the existing logic to move to next ring when ring is NULL. > > > > > > Fixes: e72bba21355d ("ice: split ice_ring onto Tx/Rx separate > > > structs") > > > Reported-by: kernel test robot <lkp@intel.com> > > > Reported-by: Dan Carpenter <dan.carpenter@oracle.com> > > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > > > Acked-by: Alexander Lobakin <alexandr.lobakin@intel.com> > > > Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent > > > worker at Intel) > > > Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> > > > --- > > > drivers/net/ethernet/intel/ice/ice_main.c | 5 +++-- > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c > > > b/drivers/net/ethernet/intel/ice/ice_main.c > > > index 493942e910be..d4a7c39fd078 100644 > > > --- a/drivers/net/ethernet/intel/ice/ice_main.c > > > +++ b/drivers/net/ethernet/intel/ice/ice_main.c > > > @@ -5962,8 +5962,9 @@ ice_update_vsi_tx_ring_stats(struct ice_vsi > > > *vsi, > > > u64 pkts = 0, bytes = 0; > > > > > > ring = READ_ONCE(rings[i]); > > > > Not really related to this patch but why is there a read_once() here? > > Aren't stats read under rtnl_lock? What is this protecting against? > > It looks like it was based on a patch from i40e [1]. From the commit, I > gather this is the reason: > > "Previously the stats were 64 bit but highly racy due to the fact that > 64 bit transactions are not atomic on 32 bit systems." > > Thanks, > > Tony > > [1] > https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=980e9b1186424fa3eb766d59fc91003d0ed1ed6a > > > (Resending as some non-text formatting snuck in to my reply. Sorry for > the spam) Actually the rcu locking and READ_ONCE has to do with the fact that the driver code for the igb/ixgbe/i40e driver had a bunch of code that could kick in from the sysfs and/or PCIe paths that would start tearing things down without necessarily holding the rtnl_lock as it should have. It might make sense to reevaluate the usage of the READ_ONCE and rcu locking to determine if it is actually needed since much of that is a hold over from when we were doing this in the watchdog and not while holding the rtnl_lock in the stats call.
On 3/16/2022 1:24 PM, Alexander Duyck wrote: > On Wed, Mar 16, 2022 at 12:01 PM Nguyen, Anthony L > <anthony.l.nguyen@intel.com> wrote: >> On Tue, 2022-03-15 at 20:29 -0700, Jakub Kicinski wrote: >>> On Tue, 15 Mar 2022 14:12:23 -0700 Tony Nguyen wrote: >>>> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > <snip> > >>>> pointer is valid, but later on ring is accessed to propagate >>>> gathered Tx >>>> stats onto VSI stats. >>>> >>>> Change the existing logic to move to next ring when ring is NULL. >>>> >>>> Fixes: e72bba21355d ("ice: split ice_ring onto Tx/Rx separate >>>> structs") >>>> Reported-by: kernel test robot <lkp@intel.com> >>>> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> >>>> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> >>>> Acked-by: Alexander Lobakin <alexandr.lobakin@intel.com> >>>> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent >>>> worker at Intel) >>>> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> >>>> --- >>>> drivers/net/ethernet/intel/ice/ice_main.c | 5 +++-- >>>> 1 file changed, 3 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c >>>> b/drivers/net/ethernet/intel/ice/ice_main.c >>>> index 493942e910be..d4a7c39fd078 100644 >>>> --- a/drivers/net/ethernet/intel/ice/ice_main.c >>>> +++ b/drivers/net/ethernet/intel/ice/ice_main.c >>>> @@ -5962,8 +5962,9 @@ ice_update_vsi_tx_ring_stats(struct ice_vsi >>>> *vsi, >>>> u64 pkts = 0, bytes = 0; >>>> >>>> ring = READ_ONCE(rings[i]); >>> Not really related to this patch but why is there a read_once() here? >>> Aren't stats read under rtnl_lock? What is this protecting against? >> It looks like it was based on a patch from i40e [1]. From the commit, I >> gather this is the reason: >> >> "Previously the stats were 64 bit but highly racy due to the fact that >> 64 bit transactions are not atomic on 32 bit systems." >> >> Thanks, >> >> Tony >> >> [1] >> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=980e9b1186424fa3eb766d59fc91003d0ed1ed6a >> >> >> (Resending as some non-text formatting snuck in to my reply. Sorry for >> the spam) > Actually the rcu locking and READ_ONCE has to do with the fact that > the driver code for the igb/ixgbe/i40e driver had a bunch of code that > could kick in from the sysfs and/or PCIe paths that would start > tearing things down without necessarily holding the rtnl_lock as it > should have. > > It might make sense to reevaluate the usage of the READ_ONCE and rcu > locking to determine if it is actually needed since much of that is a > hold over from when we were doing this in the watchdog and not while > holding the rtnl_lock in the stats call. Thanks for clarifying Alex. I'll ask the teams to look into this.
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 493942e910be..d4a7c39fd078 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -5962,8 +5962,9 @@ ice_update_vsi_tx_ring_stats(struct ice_vsi *vsi, u64 pkts = 0, bytes = 0; ring = READ_ONCE(rings[i]); - if (ring) - ice_fetch_u64_stats_per_ring(&ring->syncp, ring->stats, &pkts, &bytes); + if (!ring) + continue; + ice_fetch_u64_stats_per_ring(&ring->syncp, ring->stats, &pkts, &bytes); vsi_stats->tx_packets += pkts; vsi_stats->tx_bytes += bytes; vsi->tx_restart += ring->tx_stats.restart_q;