diff mbox series

[net-next,2/3] ice: avoid the PTP hardware semaphore in gettimex64 path

Message ID 20240226151125.45391-3-mschmidt@redhat.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series ice: lighter locking for PTP time reading | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 940 this patch: 942
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 4 maintainers not CCed: pabeni@redhat.com anthony.l.nguyen@intel.com kuba@kernel.org edumazet@google.com
netdev/build_clang success Errors and warnings before: 957 this patch: 957
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn fail Errors and warnings before: 957 this patch: 959
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 66 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Michal Schmidt Feb. 26, 2024, 3:11 p.m. UTC
The PTP hardware semaphore (PFTSYN_SEM) is used to synchronize
operations that program the PTP timers. The operations involve issuing
commands to the sideband queue. The E810 does not have a hardware
sideband queue, so the admin queue is used. The admin queue is slow.
I have observed delays in hundreds of milliseconds waiting for
ice_sq_done.

When phc2sys reads the time from the ice PTP clock and PFTSYN_SEM is
held by a task performing one of the slow operations, ice_ptp_lock can
easily time out. phc2sys gets -EBUSY and the kernel prints:
  ice 0000:XX:YY.0: PTP failed to get time
These messages appear once every few seconds, causing log spam.

The E810 datasheet recommends an algorithm for reading the upper 64 bits
of the GLTSYN_TIME register. It matches what's implemented in
ice_ptp_read_src_clk_reg. It is robust against wrap-around, but not
necessarily against the concurrent setting of the register (with
GLTSYN_CMD_{INIT,ADJ}_TIME commands). Perhaps that's why
ice_ptp_gettimex64 also takes PFTSYN_SEM.

The race with time setters can be prevented without relying on the PTP
hardware semaphore. Using the "ice_adapter" from the previous patch,
we can have a common spinlock for the PFs that share the clock hardware.
It will protect the reading and writing to the GLTSYN_TIME register.
The writing is performed indirectly, by the hardware, as a result of
the driver writing GLTSYN_CMD_SYNC in ice_ptp_exec_tmr_cmd. I wasn't
sure if the ice_flush there is enough to make sure GLTSYN_TIME has been
updated, but it works well in my testing.

My test code can be seen here:
https://gitlab.com/mschmidt2/linux/-/commits/ice-ptp-host-side-lock
It consists of:
 - kernel threads reading the time in a busy loop and looking at the
   deltas between consecutive values, reporting new maxima.
   in the consecutive values;
 - a shell script that sets the time repeatedly;
 - a bpftrace probe to produce a histogram of the measured deltas.
Without the spinlock ptp_gltsyn_time_lock, it is easy to see tearing.
Deltas in the [2G, 4G) range appear in the histograms.
With the spinlock added, there is no tearing and the biggest delta I saw
was in the range [1M, 2M), that is under 2 ms.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/ethernet/intel/ice/ice_adapter.c | 2 ++
 drivers/net/ethernet/intel/ice/ice_adapter.h | 6 ++++++
 drivers/net/ethernet/intel/ice/ice_ptp.c     | 8 +-------
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c  | 3 +++
 4 files changed, 12 insertions(+), 7 deletions(-)

Comments

Jacob Keller Feb. 26, 2024, 7:36 p.m. UTC | #1
On 2/26/2024 7:11 AM, Michal Schmidt wrote:
> The PTP hardware semaphore (PFTSYN_SEM) is used to synchronize
> operations that program the PTP timers. The operations involve issuing
> commands to the sideband queue. The E810 does not have a hardware
> sideband queue, so the admin queue is used. The admin queue is slow.
> I have observed delays in hundreds of milliseconds waiting for
> ice_sq_done.

Yep, this can happen. We've worked to try and improve this, but
fundamentally the firmware was never originally intended to be involved
in these flows.

> 
> When phc2sys reads the time from the ice PTP clock and PFTSYN_SEM is
> held by a task performing one of the slow operations, ice_ptp_lock can
> easily time out. phc2sys gets -EBUSY and the kernel prints:
>   ice 0000:XX:YY.0: PTP failed to get time
> These messages appear once every few seconds, causing log spam.
> 

Yea, this is something we wanted to fix.

> The E810 datasheet recommends an algorithm for reading the upper 64 bits
> of the GLTSYN_TIME register. It matches what's implemented in
> ice_ptp_read_src_clk_reg. It is robust against wrap-around, but not
> necessarily against the concurrent setting of the register (with
> GLTSYN_CMD_{INIT,ADJ}_TIME commands). Perhaps that's why
> ice_ptp_gettimex64 also takes PFTSYN_SEM.
> 

Yes. We tried removing this as well. However, as you point out later,
doing so would result in tearing and incorrect clock reads.

> The race with time setters can be prevented without relying on the PTP
> hardware semaphore. Using the "ice_adapter" from the previous patch,
> we can have a common spinlock for the PFs that share the clock hardware.
> It will protect the reading and writing to the GLTSYN_TIME register.
> The writing is performed indirectly, by the hardware, as a result of
> the driver writing GLTSYN_CMD_SYNC in ice_ptp_exec_tmr_cmd. I wasn't
> sure if the ice_flush there is enough to make sure GLTSYN_TIME has been
> updated, but it works well in my testing.
> 

I believe this is good enough. I don't know the exact timing involved
here, but the hardware synchronizes the update to all the PHYs and the
MAC and it is supposed to be on the order of nanoseconds.

> My test code can be seen here:
> https://gitlab.com/mschmidt2/linux/-/commits/ice-ptp-host-side-lock
> It consists of:
>  - kernel threads reading the time in a busy loop and looking at the
>    deltas between consecutive values, reporting new maxima.
>    in the consecutive values;
>  - a shell script that sets the time repeatedly;
>  - a bpftrace probe to produce a histogram of the measured deltas.
> Without the spinlock ptp_gltsyn_time_lock, it is easy to see tearing.
> Deltas in the [2G, 4G) range appear in the histograms.
> With the spinlock added, there is no tearing and the biggest delta I saw
> was in the range [1M, 2M), that is under 2 ms.
> 

Nice.


At first, I wasn't convinced we actually need cross-adapter spinlock
here. I thought all the flows that took hardware semaphore were on the
clock owner. Only the clock owner PF will access the GLTSYN_TIME
registers, (gettimex is only exposed through the ptp device, which hooks
into the clock owner). Similarly, only the clock owner does adjtime,
settime, etc.

However... It turns out that at least a couple of flows use the
semaphore on non-clock owners. Specifically E822 hardware has to
initialize the PHY after a link restart, which includes re-doing Vernier
calibration. Each PF handles this itself, but does require use of the
timer synchronization commands to do it. In this flow, the main timer is
not modified but we still use the semaphore and sync registers.

I don't think that impacts the E810 card, but we use the same code flow
here. The E822 series hardware doesn't use the AdminQ for the PHY
messages, so its faster but I think the locking improvement would still
be relevant for them as well.

Given all this, I think it makes sense to go this route.

I'd like to follow-up with possibly replacing the entire HW semaphore
with a mutex initialized here. That would remove a bunch of PCIe posted
transactions required to acquire the HW semaphore which would be a
further improvement here.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_adapter.c | 2 ++
>  drivers/net/ethernet/intel/ice/ice_adapter.h | 6 ++++++
>  drivers/net/ethernet/intel/ice/ice_ptp.c     | 8 +-------
>  drivers/net/ethernet/intel/ice/ice_ptp_hw.c  | 3 +++
>  4 files changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
> index deb063401238..4b9f5d29811c 100644
> --- a/drivers/net/ethernet/intel/ice/ice_adapter.c
> +++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
> @@ -5,6 +5,7 @@
>  #include <linux/mutex.h>
>  #include <linux/pci.h>
>  #include <linux/slab.h>
> +#include <linux/spinlock.h>
>  #include <linux/xarray.h>
>  #include "ice_adapter.h"
>  
> @@ -38,6 +39,7 @@ struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
>  	if (!a)
>  		return NULL;
>  
> +	spin_lock_init(&a->ptp_gltsyn_time_lock);
>  	refcount_set(&a->refcount, 1);
>  
>  	if (xa_is_err(xa_store(&ice_adapters, index, a, GFP_KERNEL))) {
> diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
> index cb5a02eb24c1..9d11014ec02f 100644
> --- a/drivers/net/ethernet/intel/ice/ice_adapter.h
> +++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
> @@ -4,15 +4,21 @@
>  #ifndef _ICE_ADAPTER_H_
>  #define _ICE_ADAPTER_H_
>  
> +#include <linux/spinlock_types.h>
>  #include <linux/refcount_types.h>
>  
>  struct pci_dev;
>  
>  /**
>   * struct ice_adapter - PCI adapter resources shared across PFs
> + * @ptp_gltsyn_time_lock: Spinlock protecting access to the GLTSYN_TIME
> + *                        register of the PTP clock.
>   * @refcount: Reference count. struct ice_pf objects hold the references.
>   */
>  struct ice_adapter {
> +	/* For access to the GLTSYN_TIME register */
> +	spinlock_t ptp_gltsyn_time_lock;
> +
>  	refcount_t refcount;
>  };
>  
> diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
> index c11eba07283c..b6c7246245c6 100644
> --- a/drivers/net/ethernet/intel/ice/ice_ptp.c
> +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
> @@ -374,6 +374,7 @@ ice_ptp_read_src_clk_reg(struct ice_pf *pf, struct ptp_system_timestamp *sts)
>  	u8 tmr_idx;
>  
>  	tmr_idx = ice_get_ptp_src_clock_index(hw);
> +	guard(spinlock_irqsave)(&pf->adapter->ptp_gltsyn_time_lock);
>  	/* Read the system timestamp pre PHC read */
>  	ptp_read_system_prets(sts);
>  
> @@ -1925,15 +1926,8 @@ ice_ptp_gettimex64(struct ptp_clock_info *info, struct timespec64 *ts,
>  		   struct ptp_system_timestamp *sts)
>  {
>  	struct ice_pf *pf = ptp_info_to_pf(info);
> -	struct ice_hw *hw = &pf->hw;
> -
> -	if (!ice_ptp_lock(hw)) {
> -		dev_err(ice_pf_to_dev(pf), "PTP failed to get time\n");
> -		return -EBUSY;
> -	}
>  
>  	ice_ptp_read_time(pf, ts, sts);
> -	ice_ptp_unlock(hw);
>  
>  	return 0;
>  }
> diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
> index 187ce9b54e1a..a47dbbfadb74 100644
> --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
> +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
> @@ -274,6 +274,9 @@ void ice_ptp_src_cmd(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd)
>   */
>  static void ice_ptp_exec_tmr_cmd(struct ice_hw *hw)
>  {
> +	struct ice_pf *pf = container_of(hw, struct ice_pf, hw);
> +
> +	guard(spinlock_irqsave)(&pf->adapter->ptp_gltsyn_time_lock);
>  	wr32(hw, GLTSYN_CMD_SYNC, SYNC_EXEC_CMD);
>  	ice_flush(hw);
>  }
Michal Schmidt Feb. 26, 2024, 8:11 p.m. UTC | #2
On Mon, Feb 26, 2024 at 8:36 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
> On 2/26/2024 7:11 AM, Michal Schmidt wrote:
> > The writing is performed indirectly, by the hardware, as a result of
> > the driver writing GLTSYN_CMD_SYNC in ice_ptp_exec_tmr_cmd. I wasn't
> > sure if the ice_flush there is enough to make sure GLTSYN_TIME has been
> > updated, but it works well in my testing.
> >
>
> I believe this is good enough. I don't know the exact timing involved
> here, but the hardware synchronizes the update to all the PHYs and the
> MAC and it is supposed to be on the order of nanoseconds.

Thanks, that's good to know.

> > My test code can be seen here:
> > https://gitlab.com/mschmidt2/linux/-/commits/ice-ptp-host-side-lock
> > It consists of:
> >  - kernel threads reading the time in a busy loop and looking at the
> >    deltas between consecutive values, reporting new maxima.
> >    in the consecutive values;
> >  - a shell script that sets the time repeatedly;
> >  - a bpftrace probe to produce a histogram of the measured deltas.
> > Without the spinlock ptp_gltsyn_time_lock, it is easy to see tearing.
> > Deltas in the [2G, 4G) range appear in the histograms.
> > With the spinlock added, there is no tearing and the biggest delta I saw
> > was in the range [1M, 2M), that is under 2 ms.
> >
>
> Nice.
>
>
> At first, I wasn't convinced we actually need cross-adapter spinlock
> here. I thought all the flows that took hardware semaphore were on the
> clock owner. Only the clock owner PF will access the GLTSYN_TIME
> registers, (gettimex is only exposed through the ptp device, which hooks
> into the clock owner). Similarly, only the clock owner does adjtime,
> settime, etc.

Non-owners do not expose a ptp device to userspace, but they still do
ice_ptp_periodic_work -> ice_ptp_update_cached_phctime ->
ice_ptp_read_src_clk_reg, where they read GLTSYN_TIME.

> However... It turns out that at least a couple of flows use the
> semaphore on non-clock owners. Specifically E822 hardware has to
> initialize the PHY after a link restart, which includes re-doing Vernier
> calibration. Each PF handles this itself, but does require use of the
> timer synchronization commands to do it. In this flow, the main timer is
> not modified but we still use the semaphore and sync registers.
>
> I don't think that impacts the E810 card, but we use the same code flow
> here. The E822 series hardware doesn't use the AdminQ for the PHY
> messages, so its faster but I think the locking improvement would still
> be relevant for them as well.
>
> Given all this, I think it makes sense to go this route.
>
> I'd like to follow-up with possibly replacing the entire HW semaphore
> with a mutex initialized here. That would remove a bunch of PCIe posted
> transactions required to acquire the HW semaphore which would be a
> further improvement here.

Yes, I agree and I have already been looking into this.
Thanks,
Michal

> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
>
> > Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_adapter.c | 2 ++
> >  drivers/net/ethernet/intel/ice/ice_adapter.h | 6 ++++++
> >  drivers/net/ethernet/intel/ice/ice_ptp.c     | 8 +-------
> >  drivers/net/ethernet/intel/ice/ice_ptp_hw.c  | 3 +++
> >  4 files changed, 12 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
> > index deb063401238..4b9f5d29811c 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_adapter.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
> > @@ -5,6 +5,7 @@
> >  #include <linux/mutex.h>
> >  #include <linux/pci.h>
> >  #include <linux/slab.h>
> > +#include <linux/spinlock.h>
> >  #include <linux/xarray.h>
> >  #include "ice_adapter.h"
> >
> > @@ -38,6 +39,7 @@ struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
> >       if (!a)
> >               return NULL;
> >
> > +     spin_lock_init(&a->ptp_gltsyn_time_lock);
> >       refcount_set(&a->refcount, 1);
> >
> >       if (xa_is_err(xa_store(&ice_adapters, index, a, GFP_KERNEL))) {
> > diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
> > index cb5a02eb24c1..9d11014ec02f 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_adapter.h
> > +++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
> > @@ -4,15 +4,21 @@
> >  #ifndef _ICE_ADAPTER_H_
> >  #define _ICE_ADAPTER_H_
> >
> > +#include <linux/spinlock_types.h>
> >  #include <linux/refcount_types.h>
> >
> >  struct pci_dev;
> >
> >  /**
> >   * struct ice_adapter - PCI adapter resources shared across PFs
> > + * @ptp_gltsyn_time_lock: Spinlock protecting access to the GLTSYN_TIME
> > + *                        register of the PTP clock.
> >   * @refcount: Reference count. struct ice_pf objects hold the references.
> >   */
> >  struct ice_adapter {
> > +     /* For access to the GLTSYN_TIME register */
> > +     spinlock_t ptp_gltsyn_time_lock;
> > +
> >       refcount_t refcount;
> >  };
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
> > index c11eba07283c..b6c7246245c6 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_ptp.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
> > @@ -374,6 +374,7 @@ ice_ptp_read_src_clk_reg(struct ice_pf *pf, struct ptp_system_timestamp *sts)
> >       u8 tmr_idx;
> >
> >       tmr_idx = ice_get_ptp_src_clock_index(hw);
> > +     guard(spinlock_irqsave)(&pf->adapter->ptp_gltsyn_time_lock);
> >       /* Read the system timestamp pre PHC read */
> >       ptp_read_system_prets(sts);
> >
> > @@ -1925,15 +1926,8 @@ ice_ptp_gettimex64(struct ptp_clock_info *info, struct timespec64 *ts,
> >                  struct ptp_system_timestamp *sts)
> >  {
> >       struct ice_pf *pf = ptp_info_to_pf(info);
> > -     struct ice_hw *hw = &pf->hw;
> > -
> > -     if (!ice_ptp_lock(hw)) {
> > -             dev_err(ice_pf_to_dev(pf), "PTP failed to get time\n");
> > -             return -EBUSY;
> > -     }
> >
> >       ice_ptp_read_time(pf, ts, sts);
> > -     ice_ptp_unlock(hw);
> >
> >       return 0;
> >  }
> > diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
> > index 187ce9b54e1a..a47dbbfadb74 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
> > @@ -274,6 +274,9 @@ void ice_ptp_src_cmd(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd)
> >   */
> >  static void ice_ptp_exec_tmr_cmd(struct ice_hw *hw)
> >  {
> > +     struct ice_pf *pf = container_of(hw, struct ice_pf, hw);
> > +
> > +     guard(spinlock_irqsave)(&pf->adapter->ptp_gltsyn_time_lock);
> >       wr32(hw, GLTSYN_CMD_SYNC, SYNC_EXEC_CMD);
> >       ice_flush(hw);
> >  }
>
Jacob Keller Feb. 26, 2024, 9:13 p.m. UTC | #3
On 2/26/2024 12:11 PM, Michal Schmidt wrote:
> On Mon, Feb 26, 2024 at 8:36 PM Jacob Keller <jacob.e.keller@intel.com> wrote:
>> On 2/26/2024 7:11 AM, Michal Schmidt wrote:
>>> The writing is performed indirectly, by the hardware, as a result of
>>> the driver writing GLTSYN_CMD_SYNC in ice_ptp_exec_tmr_cmd. I wasn't
>>> sure if the ice_flush there is enough to make sure GLTSYN_TIME has been
>>> updated, but it works well in my testing.
>>>
>>
>> I believe this is good enough. I don't know the exact timing involved
>> here, but the hardware synchronizes the update to all the PHYs and the
>> MAC and it is supposed to be on the order of nanoseconds.
> 
> Thanks, that's good to know.
> 
>>> My test code can be seen here:
>>> https://gitlab.com/mschmidt2/linux/-/commits/ice-ptp-host-side-lock
>>> It consists of:
>>>  - kernel threads reading the time in a busy loop and looking at the
>>>    deltas between consecutive values, reporting new maxima.
>>>    in the consecutive values;
>>>  - a shell script that sets the time repeatedly;
>>>  - a bpftrace probe to produce a histogram of the measured deltas.
>>> Without the spinlock ptp_gltsyn_time_lock, it is easy to see tearing.
>>> Deltas in the [2G, 4G) range appear in the histograms.
>>> With the spinlock added, there is no tearing and the biggest delta I saw
>>> was in the range [1M, 2M), that is under 2 ms.
>>>
>>
>> Nice.
>>
>>
>> At first, I wasn't convinced we actually need cross-adapter spinlock
>> here. I thought all the flows that took hardware semaphore were on the
>> clock owner. Only the clock owner PF will access the GLTSYN_TIME
>> registers, (gettimex is only exposed through the ptp device, which hooks
>> into the clock owner). Similarly, only the clock owner does adjtime,
>> settime, etc.
> 
> Non-owners do not expose a ptp device to userspace, but they still do
> ice_ptp_periodic_work -> ice_ptp_update_cached_phctime ->
> ice_ptp_read_src_clk_reg, where they read GLTSYN_TIME.
> 

I think we (Karol?) have some work to fix this by refactoring so that
the clock owner does this over auxiliary bus for all PFs, rather than
having 4/8 PFs each with their own background thread we have a single
background thread that reads the value once and updates the cache of all
the PFs. It looks like its still only in internal testing however.. the
current kernel code does what you said above.

Either way, we also have the other flows for E822 devices which also
need to execute timer commands (albiet ones which do not directly affect
the main timer).

It is much simpler to reason about correctness if we simply lock every
executed command and the reads of the main timer. This has the advantage
that we also do not need to lock the timer reads while waiting on
firmware to prep all the PHYs, so it solves a major problem we had
before as well.

Thanks!

>> However... It turns out that at least a couple of flows use the
>> semaphore on non-clock owners. Specifically E822 hardware has to
>> initialize the PHY after a link restart, which includes re-doing Vernier
>> calibration. Each PF handles this itself, but does require use of the
>> timer synchronization commands to do it. In this flow, the main timer is
>> not modified but we still use the semaphore and sync registers.
>>
>> I don't think that impacts the E810 card, but we use the same code flow
>> here. The E822 series hardware doesn't use the AdminQ for the PHY
>> messages, so its faster but I think the locking improvement would still
>> be relevant for them as well.
>>
>> Given all this, I think it makes sense to go this route.
>>
>> I'd like to follow-up with possibly replacing the entire HW semaphore
>> with a mutex initialized here. That would remove a bunch of PCIe posted
>> transactions required to acquire the HW semaphore which would be a
>> further improvement here.
> 
> Yes, I agree and I have already been looking into this.
> Thanks,
> Michal

Ok great. If you think you already have patches for this I can go ahead
and wait on this work instead of duplicating effort.
> 
>> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
>>
>>> Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
>>> ---
>>>  drivers/net/ethernet/intel/ice/ice_adapter.c | 2 ++
>>>  drivers/net/ethernet/intel/ice/ice_adapter.h | 6 ++++++
>>>  drivers/net/ethernet/intel/ice/ice_ptp.c     | 8 +-------
>>>  drivers/net/ethernet/intel/ice/ice_ptp_hw.c  | 3 +++
>>>  4 files changed, 12 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
>>> index deb063401238..4b9f5d29811c 100644
>>> --- a/drivers/net/ethernet/intel/ice/ice_adapter.c
>>> +++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
>>> @@ -5,6 +5,7 @@
>>>  #include <linux/mutex.h>
>>>  #include <linux/pci.h>
>>>  #include <linux/slab.h>
>>> +#include <linux/spinlock.h>
>>>  #include <linux/xarray.h>
>>>  #include "ice_adapter.h"
>>>
>>> @@ -38,6 +39,7 @@ struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
>>>       if (!a)
>>>               return NULL;
>>>
>>> +     spin_lock_init(&a->ptp_gltsyn_time_lock);
>>>       refcount_set(&a->refcount, 1);
>>>
>>>       if (xa_is_err(xa_store(&ice_adapters, index, a, GFP_KERNEL))) {
>>> diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
>>> index cb5a02eb24c1..9d11014ec02f 100644
>>> --- a/drivers/net/ethernet/intel/ice/ice_adapter.h
>>> +++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
>>> @@ -4,15 +4,21 @@
>>>  #ifndef _ICE_ADAPTER_H_
>>>  #define _ICE_ADAPTER_H_
>>>
>>> +#include <linux/spinlock_types.h>
>>>  #include <linux/refcount_types.h>
>>>
>>>  struct pci_dev;
>>>
>>>  /**
>>>   * struct ice_adapter - PCI adapter resources shared across PFs
>>> + * @ptp_gltsyn_time_lock: Spinlock protecting access to the GLTSYN_TIME
>>> + *                        register of the PTP clock.
>>>   * @refcount: Reference count. struct ice_pf objects hold the references.
>>>   */
>>>  struct ice_adapter {
>>> +     /* For access to the GLTSYN_TIME register */
>>> +     spinlock_t ptp_gltsyn_time_lock;
>>> +
>>>       refcount_t refcount;
>>>  };
>>>
>>> diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
>>> index c11eba07283c..b6c7246245c6 100644
>>> --- a/drivers/net/ethernet/intel/ice/ice_ptp.c
>>> +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
>>> @@ -374,6 +374,7 @@ ice_ptp_read_src_clk_reg(struct ice_pf *pf, struct ptp_system_timestamp *sts)
>>>       u8 tmr_idx;
>>>
>>>       tmr_idx = ice_get_ptp_src_clock_index(hw);
>>> +     guard(spinlock_irqsave)(&pf->adapter->ptp_gltsyn_time_lock);
>>>       /* Read the system timestamp pre PHC read */
>>>       ptp_read_system_prets(sts);
>>>
>>> @@ -1925,15 +1926,8 @@ ice_ptp_gettimex64(struct ptp_clock_info *info, struct timespec64 *ts,
>>>                  struct ptp_system_timestamp *sts)
>>>  {
>>>       struct ice_pf *pf = ptp_info_to_pf(info);
>>> -     struct ice_hw *hw = &pf->hw;
>>> -
>>> -     if (!ice_ptp_lock(hw)) {
>>> -             dev_err(ice_pf_to_dev(pf), "PTP failed to get time\n");
>>> -             return -EBUSY;
>>> -     }
>>>
>>>       ice_ptp_read_time(pf, ts, sts);
>>> -     ice_ptp_unlock(hw);
>>>
>>>       return 0;
>>>  }
>>> diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
>>> index 187ce9b54e1a..a47dbbfadb74 100644
>>> --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
>>> +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
>>> @@ -274,6 +274,9 @@ void ice_ptp_src_cmd(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd)
>>>   */
>>>  static void ice_ptp_exec_tmr_cmd(struct ice_hw *hw)
>>>  {
>>> +     struct ice_pf *pf = container_of(hw, struct ice_pf, hw);
>>> +
>>> +     guard(spinlock_irqsave)(&pf->adapter->ptp_gltsyn_time_lock);
>>>       wr32(hw, GLTSYN_CMD_SYNC, SYNC_EXEC_CMD);
>>>       ice_flush(hw);
>>>  }
>>
>
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c
index deb063401238..4b9f5d29811c 100644
--- a/drivers/net/ethernet/intel/ice/ice_adapter.c
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
@@ -5,6 +5,7 @@ 
 #include <linux/mutex.h>
 #include <linux/pci.h>
 #include <linux/slab.h>
+#include <linux/spinlock.h>
 #include <linux/xarray.h>
 #include "ice_adapter.h"
 
@@ -38,6 +39,7 @@  struct ice_adapter *ice_adapter_get(const struct pci_dev *pdev)
 	if (!a)
 		return NULL;
 
+	spin_lock_init(&a->ptp_gltsyn_time_lock);
 	refcount_set(&a->refcount, 1);
 
 	if (xa_is_err(xa_store(&ice_adapters, index, a, GFP_KERNEL))) {
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h b/drivers/net/ethernet/intel/ice/ice_adapter.h
index cb5a02eb24c1..9d11014ec02f 100644
--- a/drivers/net/ethernet/intel/ice/ice_adapter.h
+++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
@@ -4,15 +4,21 @@ 
 #ifndef _ICE_ADAPTER_H_
 #define _ICE_ADAPTER_H_
 
+#include <linux/spinlock_types.h>
 #include <linux/refcount_types.h>
 
 struct pci_dev;
 
 /**
  * struct ice_adapter - PCI adapter resources shared across PFs
+ * @ptp_gltsyn_time_lock: Spinlock protecting access to the GLTSYN_TIME
+ *                        register of the PTP clock.
  * @refcount: Reference count. struct ice_pf objects hold the references.
  */
 struct ice_adapter {
+	/* For access to the GLTSYN_TIME register */
+	spinlock_t ptp_gltsyn_time_lock;
+
 	refcount_t refcount;
 };
 
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index c11eba07283c..b6c7246245c6 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -374,6 +374,7 @@  ice_ptp_read_src_clk_reg(struct ice_pf *pf, struct ptp_system_timestamp *sts)
 	u8 tmr_idx;
 
 	tmr_idx = ice_get_ptp_src_clock_index(hw);
+	guard(spinlock_irqsave)(&pf->adapter->ptp_gltsyn_time_lock);
 	/* Read the system timestamp pre PHC read */
 	ptp_read_system_prets(sts);
 
@@ -1925,15 +1926,8 @@  ice_ptp_gettimex64(struct ptp_clock_info *info, struct timespec64 *ts,
 		   struct ptp_system_timestamp *sts)
 {
 	struct ice_pf *pf = ptp_info_to_pf(info);
-	struct ice_hw *hw = &pf->hw;
-
-	if (!ice_ptp_lock(hw)) {
-		dev_err(ice_pf_to_dev(pf), "PTP failed to get time\n");
-		return -EBUSY;
-	}
 
 	ice_ptp_read_time(pf, ts, sts);
-	ice_ptp_unlock(hw);
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
index 187ce9b54e1a..a47dbbfadb74 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c
@@ -274,6 +274,9 @@  void ice_ptp_src_cmd(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd)
  */
 static void ice_ptp_exec_tmr_cmd(struct ice_hw *hw)
 {
+	struct ice_pf *pf = container_of(hw, struct ice_pf, hw);
+
+	guard(spinlock_irqsave)(&pf->adapter->ptp_gltsyn_time_lock);
 	wr32(hw, GLTSYN_CMD_SYNC, SYNC_EXEC_CMD);
 	ice_flush(hw);
 }