
[v6,net-next,0/4] PHC support in ENA driver

Message ID 20250206141538.549-1-darinzon@amazon.com (mailing list archive)

Message

Arinzon, David Feb. 6, 2025, 2:15 p.m. UTC
Changes in v6:
- Remove PHC error bound

Changes in v5 (https://lore.kernel.org/netdev/20250122102040.752-1-darinzon@amazon.com/):
- Add PHC error bound
- Add PHC enablement and error bound retrieval through sysfs

Changes in v4 (https://lore.kernel.org/netdev/20241114095930.200-1-darinzon@amazon.com/):
- Minor documentation change (resolution instead of accuracy)

Changes in v3 (https://lore.kernel.org/netdev/20241103113140.275-1-darinzon@amazon.com/):
- Resolve a compilation error

Changes in v2 (https://lore.kernel.org/netdev/20241031085245.18146-1-darinzon@amazon.com/):
- CCd PTP maintainer
- Fixed style issues
- Fixed documentation warning

v1 (https://lore.kernel.org/netdev/20241021052011.591-1-darinzon@amazon.com/)

This patchset adds support for PHC (PTP Hardware Clock)
in the ENA driver. The documentation part of the patchset
covers statistics, utilization, and invocation examples
using the testptp utility.
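
For readers unfamiliar with how userspace reads a PHC, here is a minimal sketch of the dynamic POSIX clock interface that testptp relies on. It is not taken from the patchset: the helper names phc_fd_to_clockid() and phc_read() are hypothetical, and the device path passed in (for example /dev/ptp0) is an assumption that depends on the system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Map an open /dev/ptpN file descriptor to a dynamic clock id.
 * Same mapping as the FD_TO_CLOCKID macro in testptp.c, written with an
 * unsigned shift to avoid left-shifting a negative value.
 */
static clockid_t phc_fd_to_clockid(int fd)
{
	return (clockid_t)((~(unsigned int)fd << 3) | 3);
}

/*
 * Hypothetical helper: read the current PHC time from the given device
 * node. Returns 0 on success and -1 if the device cannot be opened or
 * the clock read fails.
 */
static int phc_read(const char *dev, struct timespec *ts)
{
	int fd, ret;

	assert(dev && ts);

	fd = open(dev, O_RDONLY);
	if (fd < 0)
		return -1;

	ret = clock_gettime(phc_fd_to_clockid(fd), ts);
	close(fd);

	return ret == 0 ? 0 : -1;
}
```

A caller would pass the PTP device node that the driver registered and should treat a -1 return as "no reading available" rather than retrying in a tight loop.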


David Arinzon (4):
  net: ena: Add PHC support in the ENA driver
  net: ena: PHC silent reset
  net: ena: Add PHC documentation
  net: ena: PHC enable through sysfs

 .../device_drivers/ethernet/amazon/ena.rst    |  90 +++++++
 drivers/net/ethernet/amazon/Kconfig           |   1 +
 drivers/net/ethernet/amazon/ena/Makefile      |   2 +-
 .../net/ethernet/amazon/ena/ena_admin_defs.h  |  63 ++++-
 drivers/net/ethernet/amazon/ena/ena_com.c     | 247 ++++++++++++++++++
 drivers/net/ethernet/amazon/ena/ena_com.h     |  83 ++++++
 drivers/net/ethernet/amazon/ena/ena_ethtool.c | 102 ++++++--
 drivers/net/ethernet/amazon/ena/ena_netdev.c  |  44 +++-
 drivers/net/ethernet/amazon/ena/ena_netdev.h  |   6 +
 drivers/net/ethernet/amazon/ena/ena_phc.c     | 230 ++++++++++++++++
 drivers/net/ethernet/amazon/ena/ena_phc.h     |  37 +++
 .../net/ethernet/amazon/ena/ena_regs_defs.h   |   8 +
 drivers/net/ethernet/amazon/ena/ena_sysfs.c   |  83 ++++++
 drivers/net/ethernet/amazon/ena/ena_sysfs.h   |  28 ++
 14 files changed, 995 insertions(+), 29 deletions(-)
 create mode 100644 drivers/net/ethernet/amazon/ena/ena_phc.c
 create mode 100644 drivers/net/ethernet/amazon/ena/ena_phc.h
 create mode 100644 drivers/net/ethernet/amazon/ena/ena_sysfs.c
 create mode 100644 drivers/net/ethernet/amazon/ena/ena_sysfs.h

Comments

Jakub Kicinski Feb. 8, 2025, 12:58 a.m. UTC | #1
On Thu, 6 Feb 2025 16:15:34 +0200 David Arinzon wrote:
> This patchset adds the support for PHC (PTP Hardware Clock)
> in the ENA driver. The documentation part of the patchset
> includes additional information, including statistics,
> utilization and invocation examples through the testptp
> utility.

Vadim, Maciek, did you see this? Looks like the device has limitations
on number of gettime calls per sec. Could be a good fit for the work
you are prototyping?

Vadim Fedorenko Feb. 9, 2025, 12:33 p.m. UTC | #2
On 08/02/2025 00:58, Jakub Kicinski wrote:
> On Thu, 6 Feb 2025 16:15:34 +0200 David Arinzon wrote:
>> This patchset adds the support for PHC (PTP Hardware Clock)
>> in the ENA driver. The documentation part of the patchset
>> includes additional information, including statistics,
>> utilization and invocation examples through the testptp
>> utility.
> 
> Vadim, Maciek, did you see this? Looks like the device has limitations
> on number of gettime calls per sec. Could be a good fit for the work
> you are prototyping?

Hi Jakub!

Yes, we have seen this patchset, and we were thinking of how to
generalize error_bound property, which was removed from the latest
version unfortunately. But it's a good point to look at it once
again in terms of our prototype, thanks!

Best,
Vadim

Jakub Kicinski Feb. 11, 2025, 12:46 a.m. UTC | #3
On Sun, 9 Feb 2025 12:33:24 +0000 Vadim Fedorenko wrote:
> Yes, we have seen this patchset, and we were thinking of how to
> generalize error_bound property, which was removed from the latest
> version unfortunately. But it's a good point to look at it once
> again in terms of our prototype, thanks!

I was wondering whether they have a user space "time extrapolation
component" which we should try to be compatible with. Perhaps they
just expect that the user will sync system time.

Maciek Machnikowski Feb. 11, 2025, 7:58 a.m. UTC | #4
On 2/11/2025 1:46 AM, Jakub Kicinski wrote:
> On Sun, 9 Feb 2025 12:33:24 +0000 Vadim Fedorenko wrote:
>> Yes, we have seen this patchset, and we were thinking of how to
>> generalize error_bound property, which was removed from the latest
>> version unfortunately. But it's a good point to look at it once
>> again in terms of our prototype, thanks!
> 
> I was wondering whether they have a user space "time extrapolation
> component" which we should try to be compatible with. Perhaps they
> just expect that the user will sync system time.

error_bound has a different purpose - it tries to get the "baseline"
clock accuracy from the HW. The number returned here is needed to
calculate the uncertainty, not to extrapolate.

And yes - AFAIK AWS suggests system time sync for EC2 instances [1]

[1]
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configure-ec2-ntp.html

David Woodhouse Feb. 14, 2025, 9:57 a.m. UTC | #5
On Tue, 2025-02-11 at 08:58 +0100, Maciek Machnikowski wrote:
> On 2/11/2025 1:46 AM, Jakub Kicinski wrote:
> > On Sun, 9 Feb 2025 12:33:24 +0000 Vadim Fedorenko wrote:
> > > Yes, we have seen this patchset, and we were thinking of how to
> > > generalize error_bound property, which was removed from the latest
> > > version unfortunately. But it's a good point to look at it once
> > > again in terms of our prototype, thanks!
> > 
> > I was wondering whether they have a user space "time extrapolation
> > component" which we should try to be compatible with. Perhaps they
> > just expect that the user will sync system time.
> 
> error_bound has a different purpose - it tries to get the "baseline"
> clock accuracy from the HW. The number returned here is needed to
> calculate the uncertainty, not to extrapolate.
> 
> And yes - AFAIK AWS suggests system time sync for EC2 instances [1]
> 
> [1]
> https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configure-ec2-ntp.html


Right, error bound gives you a min/max for what the time can possibly
be, and that allows you to know whether certain transactions/timestamps
could *possibly* overlap, or if there is a known ordering between them.

There are libraries in userspace which handle this, to be used by
things like distributed databases. https://github.com/aws/clock-bound
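
A rough illustration of the ordering check described above, assuming each timestamp carries a symmetric error bound in nanoseconds. This is only a sketch: struct bounded_ts and the helper names are hypothetical and are not the clock-bound API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type: a clock reading plus its maximum possible error. */
struct bounded_ts {
	int64_t ns;       /* clock reading, in nanoseconds */
	int64_t bound_ns; /* error bound, in nanoseconds (assumed >= 0) */
};

/* Earliest instant at which the event could really have happened. */
static int64_t ts_earliest(struct bounded_ts t)
{
	assert(t.bound_ns >= 0);
	return t.ns - t.bound_ns;
}

/* Latest instant at which the event could really have happened. */
static int64_t ts_latest(struct bounded_ts t)
{
	assert(t.bound_ns >= 0);
	return t.ns + t.bound_ns;
}

/* True only when a is known to precede b, whatever the actual errors were. */
static bool ts_definitely_before(struct bounded_ts a, struct bounded_ts b)
{
	return ts_latest(a) < ts_earliest(b);
}

/* True when neither ordering can be proven, i.e. the intervals intersect. */
static bool ts_may_overlap(struct bounded_ts a, struct bounded_ts b)
{
	return !ts_definitely_before(a, b) && !ts_definitely_before(b, a);
}
```

With a bound of 50 ns on each side, readings of 1000 ns and 1200 ns have a known ordering, while readings of 1000 ns and 1080 ns may overlap and need some other tie-break.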

The error bound is also exposed by the vmclock device which is now
supported in Linux (https://git.kernel.org/torvalds/c/205032724226) and
QEMU (https://gitlab.com/qemu-project/qemu/-/commit/3634039b93cc5),
although QEMU doesn't actually expose time through it yet, only the
fact that time has been *disrupted* (e.g. through live migration).

(The PHC discussed here allows each guest on the system to do the same
work of carefully calibrating the *same* underlying hardware
oscillator; with vmclock the host does it once and then just tells the
guests the result through a shared memory structure. With the added
bonus of still being accurate immediately after live migration to a new
host.)