Message ID | 20250206141538.549-1-darinzon@amazon.com (mailing list archive) |
---|---|
Series | PHC support in ENA driver |
On Thu, 6 Feb 2025 16:15:34 +0200 David Arinzon wrote:
> This patchset adds the support for PHC (PTP Hardware Clock)
> in the ENA driver. The documentation part of the patchset
> includes additional information, including statistics,
> utilization and invocation examples through the testptp
> utility.

Vadim, Maciek, did you see this? Looks like the device has limitations
on number of gettime calls per sec. Could be a good fit for the work
you are prototyping?

On 08/02/2025 00:58, Jakub Kicinski wrote:
> On Thu, 6 Feb 2025 16:15:34 +0200 David Arinzon wrote:
>> This patchset adds the support for PHC (PTP Hardware Clock)
>> in the ENA driver. The documentation part of the patchset
>> includes additional information, including statistics,
>> utilization and invocation examples through the testptp
>> utility.
>
> Vadim, Maciek, did you see this? Looks like the device has limitations
> on number of gettime calls per sec. Could be a good fit for the work
> you are prototyping?

Hi Jakub!

Yes, we have seen this patchset, and we were thinking of how to
generalize error_bound property, which was removed from the latest
version unfortunately. But it's a good point to look at it once
again in terms of our prototype, thanks!

Best,
Vadim

On Sun, 9 Feb 2025 12:33:24 +0000 Vadim Fedorenko wrote:
> Yes, we have seen this patchset, and we were thinking of how to
> generalize error_bound property, which was removed from the latest
> version unfortunately. But it's a good point to look at it once
> again in terms of our prototype, thanks!

I was wondering whether they have a user space "time extrapolation
component" which we should try to be compatible with. Perhaps they
just expect that the user will sync system time.

On 2/11/2025 1:46 AM, Jakub Kicinski wrote:
> On Sun, 9 Feb 2025 12:33:24 +0000 Vadim Fedorenko wrote:
>> Yes, we have seen this patchset, and we were thinking of how to
>> generalize error_bound property, which was removed from the latest
>> version unfortunately. But it's a good point to look at it once
>> again in terms of our prototype, thanks!
>
> I was wondering whether they have a user space "time extrapolation
> component" which we should try to be compatible with. Perhaps they
> just expect that the user will sync system time.

error_bound has a different purpose - it tries to get the "baseline"
clock accuracy from the HW. The number returned here is needed to
calculate the uncertainty, not to extrapolate.

And yes - AFAIK AWS suggests system time sync for EC2 instances [1]

[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configure-ec2-ntp.html

On Tue, 2025-02-11 at 08:58 +0100, Maciek Machnikowski wrote:
> On 2/11/2025 1:46 AM, Jakub Kicinski wrote:
> > On Sun, 9 Feb 2025 12:33:24 +0000 Vadim Fedorenko wrote:
> > > Yes, we have seen this patchset, and we were thinking of how to
> > > generalize error_bound property, which was removed from the latest
> > > version unfortunately. But it's a good point to look at it once
> > > again in terms of our prototype, thanks!
> >
> > I was wondering whether they have a user space "time extrapolation
> > component" which we should try to be compatible with. Perhaps they
> > just expect that the user will sync system time.
>
> error_bound has a different purpose - it tries to get the "baseline"
> clock accuracy from the HW. The number returned here is needed to
> calculate the uncertainty, not to extrapolate.
>
> And yes - AFAIK AWS suggests system time sync for EC2 instances [1]
>
> [1]
> https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configure-ec2-ntp.html

Right, error bound gives you a min/max for what the time can possibly
be, and that allows you to know whether certain transactions/timestamps
could *possibly* overlap, or if there is a known ordering between them.

There are libraries in userspace which handle this, to be used by
things like distributed databases.
https://github.com/aws/clock-bound

The error bound is also exposed by the vmclock device which is now
supported in Linux (https://git.kernel.org/torvalds/c/205032724226)
and QEMU (https://gitlab.com/qemu-project/qemu/-/commit/3634039b93cc5),
although QEMU doesn't actually expose time through it yet, only the
fact that time has been *disrupted*, e.g. through live migration.

(The PHC discussed here allows each guest on the system to do the same
work of carefully calibrating the *same* underlying hardware
oscillator; with vmclock the host does it once and then just tells the
guests the result through a shared memory structure. With the added
bonus of still being accurate immediately after live migration to a
new host.)
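
To illustrate the min/max reasoning described above, here is a minimal sketch of how an error bound can turn two clock readings into either a proven ordering or an "unknown" result. It is not taken from clock-bound, vmclock, or the ENA driver; the names `BoundedTimestamp`, `Ordering`, and `compare` are hypothetical, and the sketch assumes the error bound is symmetric around the reading and expressed in nanoseconds.

```python
from dataclasses import dataclass
from enum import Enum


class Ordering(Enum):
    BEFORE = "before"
    AFTER = "after"
    UNKNOWN = "unknown"  # intervals overlap, so no ordering can be proven


@dataclass(frozen=True)
class BoundedTimestamp:
    """Hypothetical type: a clock reading plus a symmetric error bound (ns)."""
    reading_ns: int
    error_bound_ns: int

    @property
    def earliest_ns(self) -> int:
        # Smallest value the true time could have had.
        return self.reading_ns - self.error_bound_ns

    @property
    def latest_ns(self) -> int:
        # Largest value the true time could have had.
        return self.reading_ns + self.error_bound_ns


def compare(a: BoundedTimestamp, b: BoundedTimestamp) -> Ordering:
    """Return a proven ordering of a relative to b, or UNKNOWN on overlap."""
    if a.latest_ns < b.earliest_ns:
        return Ordering.BEFORE
    if b.latest_ns < a.earliest_ns:
        return Ordering.AFTER
    return Ordering.UNKNOWN


if __name__ == "__main__":
    t1 = BoundedTimestamp(reading_ns=1000, error_bound_ns=50)
    t2 = BoundedTimestamp(reading_ns=1200, error_bound_ns=50)
    t3 = BoundedTimestamp(reading_ns=1080, error_bound_ns=50)
    print(compare(t1, t2).value)  # intervals are disjoint
    print(compare(t1, t3).value)  # intervals overlap
```

A smaller error bound shrinks each interval, so fewer pairs of events fall into the `UNKNOWN` case; that is why a distributed database would care about the baseline accuracy the hardware reports.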