| Message ID    | 20230401172659.38508-3-mschmidt@redhat.com |
|---------------|--------------------------------------------|
| State         | Superseded                                 |
| Delegated to: | Netdev Maintainers                         |
| Series        | ice: lower CPU usage with GNSS             |
On Sat, Apr 01, 2023 at 07:26:57PM +0200, Michal Schmidt wrote:
> The driver polls for ice_sq_done() with a 100 µs period for up to 1 s
> and it uses udelay to do that.
>
> Let's use usleep_range instead. We know sleeping is allowed here,
> because we're holding a mutex (cq->sq_lock). To preserve the total
> max waiting time, measure cq->sq_cmd_timeout in jiffies.
>
> The sq_cmd_timeout is referenced also in ice_release_res(), but there
> the polling period is 1 ms (i.e. 10 times longer). Since the timeout
> was expressed in terms of the number of loops, the total timeout in this
> function is 10 s. I do not know if this is intentional. This patch keeps
> it.
>
> The patch lowers the CPU usage of the ice-gnss-<dev_name> kernel thread
> on my system from ~8 % to less than 1 %.
> I saw a report of high CPU usage with ptp4l where the busy-waiting in
> ice_sq_send_cmd dominated the profile. The patch should help with that.
>
> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_common.c   | 14 +++++++-------
>  drivers/net/ethernet/intel/ice/ice_controlq.c |  9 +++++----
>  drivers/net/ethernet/intel/ice/ice_controlq.h |  2 +-
>  3 files changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
> index c2fda4fa4188..14cffe49fa8c 100644
> --- a/drivers/net/ethernet/intel/ice/ice_common.c
> +++ b/drivers/net/ethernet/intel/ice/ice_common.c
> @@ -1992,19 +1992,19 @@ ice_acquire_res(struct ice_hw *hw, enum ice_aq_res_ids res,
>   */
>  void ice_release_res(struct ice_hw *hw, enum ice_aq_res_ids res)
>  {
> -	u32 total_delay = 0;
> +	unsigned long timeout;
>  	int status;
>
> -	status = ice_aq_release_res(hw, res, 0, NULL);
> -
>  	/* there are some rare cases when trying to release the resource
>  	 * results in an admin queue timeout, so handle them correctly
>  	 */
> -	while ((status == -EIO) && (total_delay < hw->adminq.sq_cmd_timeout)) {
> -		mdelay(1);
> +	timeout = jiffies + 10 * hw->adminq.sq_cmd_timeout;

Not needed for this series. But it occurs to me that a clean-up would be to
use ICE_CTL_Q_SQ_CMD_TIMEOUT directly and remove the sq_cmd_timeout field,
as it seems to be only set to that constant.
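A minimal sketch of what that clean-up might look like, assuming sq_cmd_timeout really is only ever assigned ICE_CTL_Q_SQ_CMD_TIMEOUT (illustrative only, not the actual follow-up patch; the field itself would also be dropped from struct ice_ctl_q_info):

/* Hypothetical sketch of the suggested clean-up -- not the real v2 patch.
 * With the sq_cmd_timeout field gone, both deadline computations would use
 * the constant (now defined as HZ) directly.
 */

/* in ice_sq_send_cmd() */
unsigned long timeout = jiffies + ICE_CTL_Q_SQ_CMD_TIMEOUT;

do {
	if (ice_sq_done(hw, cq))
		break;
	usleep_range(ICE_CTL_Q_SQ_CMD_USEC,
		     ICE_CTL_Q_SQ_CMD_USEC * 3 / 2);
} while (time_before(jiffies, timeout));

/* in ice_release_res(), keeping the existing 10x margin */
timeout = jiffies + 10 * ICE_CTL_Q_SQ_CMD_TIMEOUT;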
On Sun, Apr 2, 2023 at 1:18 PM Simon Horman <simon.horman@corigine.com> wrote:
> On Sat, Apr 01, 2023 at 07:26:57PM +0200, Michal Schmidt wrote:
> > -	while ((status == -EIO) && (total_delay < hw->adminq.sq_cmd_timeout)) {
> > -		mdelay(1);
> > +	timeout = jiffies + 10 * hw->adminq.sq_cmd_timeout;
>
> Not needed for this series. But it occurs to me that a clean-up would be to
> use ICE_CTL_Q_SQ_CMD_TIMEOUT directly and remove the sq_cmd_timeout field,
> as it seems to be only set to that constant.

Simon,
You are right. I can do that in v2.
BTW, i40e and iavf are similar to ice here.
Thanks,
Michal
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index c2fda4fa4188..14cffe49fa8c 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -1992,19 +1992,19 @@ ice_acquire_res(struct ice_hw *hw, enum ice_aq_res_ids res,
  */
 void ice_release_res(struct ice_hw *hw, enum ice_aq_res_ids res)
 {
-	u32 total_delay = 0;
+	unsigned long timeout;
 	int status;
 
-	status = ice_aq_release_res(hw, res, 0, NULL);
-
 	/* there are some rare cases when trying to release the resource
 	 * results in an admin queue timeout, so handle them correctly
 	 */
-	while ((status == -EIO) && (total_delay < hw->adminq.sq_cmd_timeout)) {
-		mdelay(1);
+	timeout = jiffies + 10 * hw->adminq.sq_cmd_timeout;
+	do {
 		status = ice_aq_release_res(hw, res, 0, NULL);
-		total_delay++;
-	}
+		if (status != -EIO)
+			break;
+		usleep_range(1000, 2000);
+	} while (time_before(jiffies, timeout));
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/ice/ice_controlq.c b/drivers/net/ethernet/intel/ice/ice_controlq.c
index 6bcfee295991..10125e8aa555 100644
--- a/drivers/net/ethernet/intel/ice/ice_controlq.c
+++ b/drivers/net/ethernet/intel/ice/ice_controlq.c
@@ -967,7 +967,7 @@ ice_sq_send_cmd(struct ice_hw *hw, struct ice_ctl_q_info *cq,
 	struct ice_aq_desc *desc_on_ring;
 	bool cmd_completed = false;
 	struct ice_sq_cd *details;
-	u32 total_delay = 0;
+	unsigned long timeout;
 	int status = 0;
 	u16 retval = 0;
 	u32 val = 0;
@@ -1060,13 +1060,14 @@ ice_sq_send_cmd(struct ice_hw *hw, struct ice_ctl_q_info *cq,
 		cq->sq.next_to_use = 0;
 	wr32(hw, cq->sq.tail, cq->sq.next_to_use);
 
+	timeout = jiffies + cq->sq_cmd_timeout;
 	do {
 		if (ice_sq_done(hw, cq))
 			break;
 
-		udelay(ICE_CTL_Q_SQ_CMD_USEC);
-		total_delay++;
-	} while (total_delay < cq->sq_cmd_timeout);
+		usleep_range(ICE_CTL_Q_SQ_CMD_USEC,
+			     ICE_CTL_Q_SQ_CMD_USEC * 3 / 2);
+	} while (time_before(jiffies, timeout));
 
 	/* if ready, copy the desc back to temp */
 	if (ice_sq_done(hw, cq)) {
diff --git a/drivers/net/ethernet/intel/ice/ice_controlq.h b/drivers/net/ethernet/intel/ice/ice_controlq.h
index c07e9cc9fc6e..f2d3b115ae0b 100644
--- a/drivers/net/ethernet/intel/ice/ice_controlq.h
+++ b/drivers/net/ethernet/intel/ice/ice_controlq.h
@@ -34,7 +34,7 @@ enum ice_ctl_q {
 };
 
 /* Control Queue timeout settings - max delay 1s */
-#define ICE_CTL_Q_SQ_CMD_TIMEOUT	10000	/* Count 10000 times */
+#define ICE_CTL_Q_SQ_CMD_TIMEOUT	HZ	/* Wait max 1s */
 #define ICE_CTL_Q_SQ_CMD_USEC		100	/* Check every 100usec */
 #define ICE_CTL_Q_ADMIN_INIT_TIMEOUT	10	/* Count 10 times */
 #define ICE_CTL_Q_ADMIN_INIT_MSEC	100	/* Check every 100msec */
The driver polls for ice_sq_done() with a 100 µs period for up to 1 s
and it uses udelay to do that.

Let's use usleep_range instead. We know sleeping is allowed here,
because we're holding a mutex (cq->sq_lock). To preserve the total
max waiting time, measure cq->sq_cmd_timeout in jiffies.

The sq_cmd_timeout is referenced also in ice_release_res(), but there
the polling period is 1 ms (i.e. 10 times longer). Since the timeout
was expressed in terms of the number of loops, the total timeout in this
function is 10 s. I do not know if this is intentional. This patch keeps
it.

The patch lowers the CPU usage of the ice-gnss-<dev_name> kernel thread
on my system from ~8 % to less than 1 %.
I saw a report of high CPU usage with ptp4l where the busy-waiting in
ice_sq_send_cmd dominated the profile. The patch should help with that.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/ethernet/intel/ice/ice_common.c   | 14 +++++++-------
 drivers/net/ethernet/intel/ice/ice_controlq.c |  9 +++++----
 drivers/net/ethernet/intel/ice/ice_controlq.h |  2 +-
 3 files changed, 13 insertions(+), 12 deletions(-)
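For reference, the sleeping-poll pattern the commit message describes boils down to the following generic sketch (hypothetical names, not driver code; poll_hw_done() stands in for a hardware-ready check such as ice_sq_done()):

#include <linux/delay.h>
#include <linux/errno.h>
#include <linux/jiffies.h>

static bool poll_hw_done(void);	/* hypothetical hardware-ready check */

/* Wait up to ~1 s for the hardware to signal completion, checking roughly
 * every 100 us.  Sleeping with usleep_range() is fine because the caller
 * may block; busy-waiting with udelay() would burn a CPU for the whole wait.
 */
static int wait_for_done_example(void)
{
	unsigned long timeout = jiffies + HZ;

	do {
		if (poll_hw_done())
			return 0;
		usleep_range(100, 150);
	} while (time_before(jiffies, timeout));

	/* one final check, in case we slept past the deadline */
	return poll_hw_done() ? 0 : -ETIMEDOUT;
}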