Message ID    | 20230412081929.173220-5-mschmidt@redhat.com (mailing list archive)
State         | Awaiting Upstream
Delegated to: | Netdev Maintainers
Series        | ice: lower CPU usage with GNSS
>From: Michal Schmidt <mschmidt@redhat.com>
>Sent: Wednesday, April 12, 2023 10:19 AM
>
>The driver polls for ice_sq_done() with a 100 µs period for up to 1 s
>and it uses udelay to do that.
>
>Let's use usleep_range instead. We know sleeping is allowed here,
>because we're holding a mutex (cq->sq_lock). To preserve the total
>max waiting time, measure the timeout in jiffies.
>
>ICE_CTL_Q_SQ_CMD_TIMEOUT is used also in ice_release_res(), but there
>the polling period is 1 ms (i.e. 10 times longer). Since the timeout was
>expressed in terms of the number of loops, the total timeout in this
>function is 10 s. I do not know if this is intentional. This patch keeps
>it.
>
>The patch lowers the CPU usage of the ice-gnss-<dev_name> kernel thread
>on my system from ~8 % to less than 1 %.
>
>I received a report of high CPU usage with ptp4l where the busy-waiting
>in ice_sq_send_cmd dominated the profile. This patch has been tested in
>that usecase too and it made a huge improvement there.
>
>Tested-by: Brent Rowsell <browsell@redhat.com>
>Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>---
> drivers/net/ethernet/intel/ice/ice_common.c   | 14 +++++++-------
> drivers/net/ethernet/intel/ice/ice_controlq.c |  9 +++++----
> drivers/net/ethernet/intel/ice/ice_controlq.h |  2 +-
> 3 files changed, 13 insertions(+), 12 deletions(-)
>
>diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
>index f4c256563248..3638598d732b 100644
>--- a/drivers/net/ethernet/intel/ice/ice_common.c
>+++ b/drivers/net/ethernet/intel/ice/ice_common.c
>@@ -1992,19 +1992,19 @@ ice_acquire_res(struct ice_hw *hw, enum ice_aq_res_ids res,
>  */
> void ice_release_res(struct ice_hw *hw, enum ice_aq_res_ids res)
> {
>-	u32 total_delay = 0;
>+	unsigned long timeout;
> 	int status;
>
>-	status = ice_aq_release_res(hw, res, 0, NULL);
>-
> 	/* there are some rare cases when trying to release the resource
> 	 * results in an admin queue timeout, so handle them correctly
> 	 */
>-	while ((status == -EIO) && (total_delay < ICE_CTL_Q_SQ_CMD_TIMEOUT)) {
>-		mdelay(1);
>+	timeout = jiffies + 10 * ICE_CTL_Q_SQ_CMD_TIMEOUT;
>+	do {
> 		status = ice_aq_release_res(hw, res, 0, NULL);
>-		total_delay++;
>-	}
>+		if (status != -EIO)
>+			break;
>+		usleep_range(1000, 2000);
>+	} while (time_before(jiffies, timeout));
> }
>
> /**
>diff --git a/drivers/net/ethernet/intel/ice/ice_controlq.c b/drivers/net/ethernet/intel/ice/ice_controlq.c
>index c8fb10106ec3..d2faf1baad2f 100644
>--- a/drivers/net/ethernet/intel/ice/ice_controlq.c
>+++ b/drivers/net/ethernet/intel/ice/ice_controlq.c
>@@ -964,7 +964,7 @@ ice_sq_send_cmd(struct ice_hw *hw, struct ice_ctl_q_info *cq,
> 	struct ice_aq_desc *desc_on_ring;
> 	bool cmd_completed = false;
> 	struct ice_sq_cd *details;
>-	u32 total_delay = 0;
>+	unsigned long timeout;
> 	int status = 0;
> 	u16 retval = 0;
> 	u32 val = 0;
>@@ -1057,13 +1057,14 @@ ice_sq_send_cmd(struct ice_hw *hw, struct ice_ctl_q_info *cq,
> 		cq->sq.next_to_use = 0;
> 	wr32(hw, cq->sq.tail, cq->sq.next_to_use);
>
>+	timeout = jiffies + ICE_CTL_Q_SQ_CMD_TIMEOUT;
> 	do {
> 		if (ice_sq_done(hw, cq))
> 			break;
>
>-		udelay(ICE_CTL_Q_SQ_CMD_USEC);
>-		total_delay++;
>-	} while (total_delay < ICE_CTL_Q_SQ_CMD_TIMEOUT);
>+		usleep_range(ICE_CTL_Q_SQ_CMD_USEC,
>+			     ICE_CTL_Q_SQ_CMD_USEC * 3 / 2);
>+	} while (time_before(jiffies, timeout));
>
> 	/* if ready, copy the desc back to temp */
> 	if (ice_sq_done(hw, cq)) {
>diff --git a/drivers/net/ethernet/intel/ice/ice_controlq.h b/drivers/net/ethernet/intel/ice/ice_controlq.h
>index e790b2f4e437..950b7f4a7a05 100644
>--- a/drivers/net/ethernet/intel/ice/ice_controlq.h
>+++ b/drivers/net/ethernet/intel/ice/ice_controlq.h
>@@ -34,7 +34,7 @@ enum ice_ctl_q {
> };
>
> /* Control Queue timeout settings - max delay 1s */
>-#define ICE_CTL_Q_SQ_CMD_TIMEOUT	10000	/* Count 10000 times */
>+#define ICE_CTL_Q_SQ_CMD_TIMEOUT	HZ	/* Wait max 1s */
> #define ICE_CTL_Q_SQ_CMD_USEC		100	/* Check every 100usec */
> #define ICE_CTL_Q_ADMIN_INIT_TIMEOUT	10	/* Count 10 times */
> #define ICE_CTL_Q_ADMIN_INIT_MSEC	100	/* Check every 100msec */
>--
>2.39.2

Looks good, thank you Michal!

Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
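The hunks quoted above replace a counted udelay() busy-wait with a sleeping poll bounded by a jiffies deadline. A minimal sketch of that general pattern is shown here; struct my_hw and my_hw_done() are invented placeholders, not ice driver symbols, and this is illustrative only rather than the patch's actual code.

#include <linux/delay.h>
#include <linux/errno.h>
#include <linux/jiffies.h>
#include <linux/types.h>

struct my_hw;				/* hypothetical device handle */
bool my_hw_done(struct my_hw *hw);	/* hypothetical "command completed" check */

/* Sleeping poll with an absolute deadline measured in jiffies. */
static int poll_done_sleeping(struct my_hw *hw)
{
	unsigned long timeout = jiffies + HZ;	/* give up after roughly 1 second */

	do {
		if (my_hw_done(hw))
			return 0;
		/* The caller is in process context and may sleep, so prefer
		 * usleep_range() over a udelay() busy-wait.
		 */
		usleep_range(100, 150);
	} while (time_before(jiffies, timeout));

	/* One final check: the poll may have slept past the deadline
	 * just as the hardware finished.
	 */
	return my_hw_done(hw) ? 0 : -ETIMEDOUT;
}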
On Wed, Apr 12, 2023 at 10:19:27AM +0200, Michal Schmidt wrote:
> The driver polls for ice_sq_done() with a 100 µs period for up to 1 s
> and it uses udelay to do that.
>
> Let's use usleep_range instead. We know sleeping is allowed here,
> because we're holding a mutex (cq->sq_lock). To preserve the total
> max waiting time, measure the timeout in jiffies.
>
> ICE_CTL_Q_SQ_CMD_TIMEOUT is used also in ice_release_res(), but there
> the polling period is 1 ms (i.e. 10 times longer). Since the timeout was
> expressed in terms of the number of loops, the total timeout in this
> function is 10 s. I do not know if this is intentional. This patch keeps
> it.
>
> The patch lowers the CPU usage of the ice-gnss-<dev_name> kernel thread
> on my system from ~8 % to less than 1 %.
>
> I received a report of high CPU usage with ptp4l where the busy-waiting
> in ice_sq_send_cmd dominated the profile. This patch has been tested in
> that usecase too and it made a huge improvement there.
>
> Tested-by: Brent Rowsell <browsell@redhat.com>
> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Michal Schmidt
> Sent: Wednesday, April 12, 2023 1:19 AM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Brent Rowsell <browsell@redhat.com>; Andrew Lunn <andrew@lunn.ch>; netdev@vger.kernel.org; Brandeburg, Jesse <jesse.brandeburg@intel.com>; Kolacinski, Karol <karol.kolacinski@intel.com>; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Simon Horman <simon.horman@corigine.com>
> Subject: [Intel-wired-lan] [PATCH net-next v2 4/6] ice: sleep, don't busy-wait, for ICE_CTL_Q_SQ_CMD_TIMEOUT
>
> The driver polls for ice_sq_done() with a 100 µs period for up to 1 s and it uses udelay to do that.
>
> Let's use usleep_range instead. We know sleeping is allowed here, because we're holding a mutex (cq->sq_lock). To preserve the total max waiting time, measure the timeout in jiffies.
>
> ICE_CTL_Q_SQ_CMD_TIMEOUT is used also in ice_release_res(), but there the polling period is 1 ms (i.e. 10 times longer). Since the timeout was expressed in terms of the number of loops, the total timeout in this function is 10 s. I do not know if this is intentional. This patch keeps it.
>
> The patch lowers the CPU usage of the ice-gnss-<dev_name> kernel thread on my system from ~8 % to less than 1 %.
>
> I received a report of high CPU usage with ptp4l where the busy-waiting in ice_sq_send_cmd dominated the profile. This patch has been tested in that usecase too and it made a huge improvement there.
>
> Tested-by: Brent Rowsell <browsell@redhat.com>
> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_common.c   | 14 +++++++-------
>  drivers/net/ethernet/intel/ice/ice_controlq.c |  9 +++++----
>  drivers/net/ethernet/intel/ice/ice_controlq.h |  2 +-
>  3 files changed, 13 insertions(+), 12 deletions(-)
>

Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel)
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index f4c256563248..3638598d732b 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -1992,19 +1992,19 @@ ice_acquire_res(struct ice_hw *hw, enum ice_aq_res_ids res,
  */
 void ice_release_res(struct ice_hw *hw, enum ice_aq_res_ids res)
 {
-	u32 total_delay = 0;
+	unsigned long timeout;
 	int status;
 
-	status = ice_aq_release_res(hw, res, 0, NULL);
-
 	/* there are some rare cases when trying to release the resource
 	 * results in an admin queue timeout, so handle them correctly
 	 */
-	while ((status == -EIO) && (total_delay < ICE_CTL_Q_SQ_CMD_TIMEOUT)) {
-		mdelay(1);
+	timeout = jiffies + 10 * ICE_CTL_Q_SQ_CMD_TIMEOUT;
+	do {
 		status = ice_aq_release_res(hw, res, 0, NULL);
-		total_delay++;
-	}
+		if (status != -EIO)
+			break;
+		usleep_range(1000, 2000);
+	} while (time_before(jiffies, timeout));
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/ice/ice_controlq.c b/drivers/net/ethernet/intel/ice/ice_controlq.c
index c8fb10106ec3..d2faf1baad2f 100644
--- a/drivers/net/ethernet/intel/ice/ice_controlq.c
+++ b/drivers/net/ethernet/intel/ice/ice_controlq.c
@@ -964,7 +964,7 @@ ice_sq_send_cmd(struct ice_hw *hw, struct ice_ctl_q_info *cq,
 	struct ice_aq_desc *desc_on_ring;
 	bool cmd_completed = false;
 	struct ice_sq_cd *details;
-	u32 total_delay = 0;
+	unsigned long timeout;
 	int status = 0;
 	u16 retval = 0;
 	u32 val = 0;
@@ -1057,13 +1057,14 @@ ice_sq_send_cmd(struct ice_hw *hw, struct ice_ctl_q_info *cq,
 		cq->sq.next_to_use = 0;
 	wr32(hw, cq->sq.tail, cq->sq.next_to_use);
 
+	timeout = jiffies + ICE_CTL_Q_SQ_CMD_TIMEOUT;
 	do {
 		if (ice_sq_done(hw, cq))
 			break;
 
-		udelay(ICE_CTL_Q_SQ_CMD_USEC);
-		total_delay++;
-	} while (total_delay < ICE_CTL_Q_SQ_CMD_TIMEOUT);
+		usleep_range(ICE_CTL_Q_SQ_CMD_USEC,
+			     ICE_CTL_Q_SQ_CMD_USEC * 3 / 2);
+	} while (time_before(jiffies, timeout));
 
 	/* if ready, copy the desc back to temp */
 	if (ice_sq_done(hw, cq)) {
diff --git a/drivers/net/ethernet/intel/ice/ice_controlq.h b/drivers/net/ethernet/intel/ice/ice_controlq.h
index e790b2f4e437..950b7f4a7a05 100644
--- a/drivers/net/ethernet/intel/ice/ice_controlq.h
+++ b/drivers/net/ethernet/intel/ice/ice_controlq.h
@@ -34,7 +34,7 @@ enum ice_ctl_q {
 };
 
 /* Control Queue timeout settings - max delay 1s */
-#define ICE_CTL_Q_SQ_CMD_TIMEOUT	10000	/* Count 10000 times */
+#define ICE_CTL_Q_SQ_CMD_TIMEOUT	HZ	/* Wait max 1s */
 #define ICE_CTL_Q_SQ_CMD_USEC		100	/* Check every 100usec */
 #define ICE_CTL_Q_ADMIN_INIT_TIMEOUT	10	/* Count 10 times */
 #define ICE_CTL_Q_ADMIN_INIT_MSEC	100	/* Check every 100msec */
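For reference, the timeout arithmetic described in the commit message, worked out from the old and new constants shown in the hunks above (an explanatory comment only, not code from the patch):

/*
 * Old scheme - ICE_CTL_Q_SQ_CMD_TIMEOUT is an iteration count (10000):
 *   ice_sq_send_cmd():  10000 iterations * udelay(100 us) =  1 s
 *   ice_release_res():  10000 iterations * mdelay(1 ms)   = 10 s
 *
 * New scheme - ICE_CTL_Q_SQ_CMD_TIMEOUT is a jiffies interval (HZ):
 *   ice_sq_send_cmd():  timeout = jiffies + HZ        -> ~1 s
 *   ice_release_res():  timeout = jiffies + 10 * HZ   -> ~10 s
 *
 * The maximum waiting times are preserved while the polling loops now
 * sleep (usleep_range) instead of busy-waiting (udelay/mdelay).
 */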