diff mbox series

[RFC,1/2] arm64/tracing: add cntvct based trace clock

Message ID 20211119102117.22304-2-nsaenzju@redhat.com (mailing list archive)
State New, archived
Headers show
Series KVM: arm64: Host/Guest trace syncronization | expand

Commit Message

Nicolas Saenz Julienne Nov. 19, 2021, 10:21 a.m. UTC
Add a new arm64-specific trace clock using the cntvct register, similar
to x64-tsc. This gives us:
 - A clock that is relatively fast (1GHz on armv8.6, 1-50MHz otherwise),
   monotonic, and resilient to low power modes.
 - It can be used to correlate events across cpus as well as across
   hypervisor and guests.

By using arch_timer_read_counter() we make sure that armv8.6 cpus use
the less expensive CNTVCTSS_EL0, which cannot be accessed speculatively.

Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
---
 arch/arm64/include/asm/trace_clock.h | 12 ++++++++++++
 arch/arm64/kernel/Makefile           |  2 +-
 arch/arm64/kernel/trace_clock.c      | 12 ++++++++++++
 3 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/trace_clock.h
 create mode 100644 arch/arm64/kernel/trace_clock.c

Comments

Marcelo Tosatti Nov. 19, 2021, 11:26 a.m. UTC | #1
On Fri, Nov 19, 2021 at 11:21:17AM +0100, Nicolas Saenz Julienne wrote:
> Add a new arm64-specific trace clock using the cntvct register, similar
> to x64-tsc. This gives us:
>  - A clock that is relatively fast (1GHz on armv8.6, 1-50MHz otherwise),
>    monotonic, and resilient to low power modes.
>  - It can be used to correlate events across cpus as well as across
>    hypervisor and guests.
> 
> By using arch_timer_read_counter() we make sure that armv8.6 cpus use
> the less expensive CNTVCTSS_EL0, which cannot be accessed speculatively.

Can this register be read by userspace ? (otherwise it won't be possible
to correlate userspace events).
Marc Zyngier Nov. 19, 2021, noon UTC | #2
On Fri, 19 Nov 2021 11:26:24 +0000,
Marcelo Tosatti <mtosatti@redhat.com> wrote:
> 
> On Fri, Nov 19, 2021 at 11:21:17AM +0100, Nicolas Saenz Julienne wrote:
> > Add a new arm64-specific trace clock using the cntvct register, similar
> > to x64-tsc. This gives us:
> >  - A clock that is relatively fast (1GHz on armv8.6, 1-50MHz otherwise),
> >    monotonic, and resilient to low power modes.
> >  - It can be used to correlate events across cpus as well as across
> >    hypervisor and guests.
> > 
> > By using arch_timer_read_counter() we make sure that armv8.6 cpus use
> > the less expensive CNTVCTSS_EL0, which cannot be accessed speculatively.
> 
> Can this register be read by userspace ? (otherwise it won't be possible
> to correlate userspace events).

Yes. That's part of the userspace ABI. Although this particular
accessor is only available from ARMv8.6 and is advertised via a hwcap
to userspace.

For currently existing implementations, userspace will use the
CNTVCT_EL0 accessor, which requires extra synchronisation as it can be
speculated.

	M.
Nicolas Saenz Julienne Nov. 19, 2021, 1:26 p.m. UTC | #3
On Fri, 2021-11-19 at 12:00 +0000, Marc Zyngier wrote:
> On Fri, 19 Nov 2021 11:26:24 +0000,
> Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > 
> > On Fri, Nov 19, 2021 at 11:21:17AM +0100, Nicolas Saenz Julienne wrote:
> > > Add a new arm64-specific trace clock using the cntvct register, similar
> > > to x64-tsc. This gives us:
> > >  - A clock that is relatively fast (1GHz on armv8.6, 1-50MHz otherwise),
> > >    monotonic, and resilient to low power modes.
> > >  - It can be used to correlate events across cpus as well as across
> > >    hypervisor and guests.
> > > 
> > > By using arch_timer_read_counter() we make sure that armv8.6 cpus use
> > > the less expensive CNTVCTSS_EL0, which cannot be accessed speculatively.
> > 
> > Can this register be read by userspace ? (otherwise it won't be possible
> > to correlate userspace events).
> 
> Yes. That's part of the userspace ABI. Although this particular
> accessor is only available from ARMv8.6 and is advertised via a hwcap
> to userspace.
> 
> For currently existing implementations, userspace will use the
> CNTVCT_EL0 accessor, which requires extra synchronisation as it can be
> speculated.

To complete Marc's reply, here's an example of how CNTVCT_EL0 is being used in
rt-tests' oslat:

https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git/tree/src/oslat/oslat.c#n87
Steven Rostedt Nov. 22, 2021, 2:57 p.m. UTC | #4
On Fri, 19 Nov 2021 11:21:17 +0100
Nicolas Saenz Julienne <nsaenzju@redhat.com> wrote:

> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_ARM64_TRACE_CLOCK_H
> +#define _ASM_ARM64_TRACE_CLOCK_H
> +
> +#include <linux/types.h>
> +
> +extern u64 notrace trace_clock_arm64_cntvct(void);
> +
> +# define ARCH_TRACE_CLOCKS \
> +	{ trace_clock_arm64_cntvct, "cntvct", .in_ns = 0 },
> +
> +#endif  /* _ASM_ARM64_TRACE_CLOCK_H */

So this will appear as a usable clock in trace-cmd.

And since this will be used to synchronize between host and guest like the
x86_tsc is used, that means that trace-cmd needs to know that this is the
an arch "CPU" clock. I wonder if we should rename x86_clock (or at least
make it an alias) to "kvm_clock". Then we can have trace-cmd use
"kvm_clock" as the clock for synchronization between host and guests for
all architectures?

Thinking about this, instead of renaming it, I'll add code to create an
alias to these clocks. Then every arch can pick what clock is used that is
the same between hosts and guests such that user space tooling doesn't have
to keep a database of what clocks are used for synchronization between
hosts and guests for each arch.

I'll go add some code ;-)

-- Steve
Nicolas Saenz Julienne Nov. 24, 2021, 9:45 a.m. UTC | #5
On Mon, 2021-11-22 at 09:57 -0500, Steven Rostedt wrote:
> On Fri, 19 Nov 2021 11:21:17 +0100
> Nicolas Saenz Julienne <nsaenzju@redhat.com> wrote:
> 
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_ARM64_TRACE_CLOCK_H
> > +#define _ASM_ARM64_TRACE_CLOCK_H
> > +
> > +#include <linux/types.h>
> > +
> > +extern u64 notrace trace_clock_arm64_cntvct(void);
> > +
> > +# define ARCH_TRACE_CLOCKS \
> > +	{ trace_clock_arm64_cntvct, "cntvct", .in_ns = 0 },
> > +
> > +#endif  /* _ASM_ARM64_TRACE_CLOCK_H */
> 
> So this will appear as a usable clock in trace-cmd.
> 
> And since this will be used to synchronize between host and guest like the
> x86_tsc is used, that means that trace-cmd needs to know that this is the
> an arch "CPU" clock. I wonder if we should rename x86_clock (or at least
> make it an alias) to "kvm_clock". Then we can have trace-cmd use
> "kvm_clock" as the clock for synchronization between host and guests for
> all architectures?
>
> Thinking about this, instead of renaming it, I'll add code to create an
> alias to these clocks. Then every arch can pick what clock is used that is
> the same between hosts and guests such that user space tooling doesn't have
> to keep a database of what clocks are used for synchronization between
> hosts and guests for each arch.
> 
> I'll go add some code ;-)

I really like the idea, please keep me in the loop if you send something
upstream.
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/trace_clock.h b/arch/arm64/include/asm/trace_clock.h
new file mode 100644
index 000000000000..ce4a66d63108
--- /dev/null
+++ b/arch/arm64/include/asm/trace_clock.h
@@ -0,0 +1,12 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_ARM64_TRACE_CLOCK_H
+#define _ASM_ARM64_TRACE_CLOCK_H
+
+#include <linux/types.h>
+
+extern u64 notrace trace_clock_arm64_cntvct(void);
+
+# define ARCH_TRACE_CLOCKS \
+	{ trace_clock_arm64_cntvct, "cntvct", .in_ns = 0 },
+
+#endif  /* _ASM_ARM64_TRACE_CLOCK_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 88b3e2a21408..ec9180f0ac90 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -29,7 +29,7 @@  obj-y			:= debug-monitors.o entry.o irq.o fpsimd.o		\
 			   cpufeature.o alternative.o cacheinfo.o		\
 			   smp.o smp_spin_table.o topology.o smccc-call.o	\
 			   syscall.o proton-pack.o idreg-override.o idle.o	\
-			   patching.o
+			   patching.o trace_clock.o
 
 targets			+= efi-entry.o
 
diff --git a/arch/arm64/kernel/trace_clock.c b/arch/arm64/kernel/trace_clock.c
new file mode 100644
index 000000000000..fe1315f368cb
--- /dev/null
+++ b/arch/arm64/kernel/trace_clock.c
@@ -0,0 +1,12 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * arm64 trace clocks
+ */
+#include <asm/trace_clock.h>
+
+#include <clocksource/arm_arch_timer.h>
+
+u64 notrace trace_clock_arm64_cntvct(void)
+{
+	return arch_timer_read_counter();
+}