[RFC,1/3] x86/pvclock: add setter for pvclock_pvti_cpu0_va

Message ID 1451339557-24473-2-git-send-email-joao.m.martins@oracle.com (mailing list archive)
State New, archived

Commit Message

Joao Martins Dec. 28, 2015, 9:52 p.m. UTC
Right now there is only a pvclock_pvti_cpu0_va() which is defined on
kvmclock since:

commit dac16fba6fc5
("x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap")

The only user of this interface so far is kvm. This commit adds a setter
function for the pvti page and moves pvclock_pvti_cpu0_va to pvclock, which
is a more generic place to have it; and would allow other PV clocksources
to use it, such as Xen. 

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
---
 arch/x86/include/asm/pvclock.h | 22 +++++++++++++---------
 arch/x86/kernel/kvmclock.c     |  6 +-----
 arch/x86/kernel/pvclock.c      | 11 +++++++++++
 3 files changed, 25 insertions(+), 14 deletions(-)
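
As a rough illustration of the interface described above, here is a minimal userspace sketch of the setter/getter pair. The struct body is a placeholder rather than the real ABI layout, and example_clock_setup() is a hypothetical stand-in for whatever setup routine a second PV clocksource would use; neither comes from the patch.

```c
#include <stddef.h>

/* Placeholder for the kernel's struct pvclock_vsyscall_time_info; the real
 * definition lives in arch/x86/include/asm/pvclock.h. */
struct pvclock_vsyscall_time_info {
	unsigned int version;	/* illustrative field only */
};

/* As in the patch: one file-local pointer behind a setter and a getter. */
static struct pvclock_vsyscall_time_info *pvti_cpu0_va;

void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti)
{
	pvti_cpu0_va = pvti;
}

struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void)
{
	/* NULL until some clocksource has registered its page. */
	return pvti_cpu0_va;
}

/* Hypothetical clocksource setup (assumption, not part of the patch):
 * it owns the cpu0 pvti page and publishes it through the setter. */
static struct pvclock_vsyscall_time_info example_pvti = { .version = 2 };

int example_clock_setup(void)
{
	pvclock_set_pvti_cpu0_va(&example_pvti);
	return 0;
}
```

The point of the split is that a consumer such as the vdso code only ever calls the getter, so it does not need to know which clocksource registered the page.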

Comments

Andy Lutomirski Dec. 28, 2015, 11:45 p.m. UTC | #1
On Mon, Dec 28, 2015 at 1:52 PM, Joao Martins <joao.m.martins@oracle.com> wrote:
> Right now there is only a pvclock_pvti_cpu0_va() which is defined on
> kvmclock since:
>
> commit dac16fba6fc5
> ("x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap")
>
> The only user of this interface so far is kvm. This commit adds a setter
> function for the pvti page and moves pvclock_pvti_cpu0_va to pvclock, which
> is a more generic place to have it; and would allow other PV clocksources
> to use it, such as Xen.
>

> +
> +void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti)
> +{
> +       pvti_cpu0_va = pvti;
> +}

IMO this either wants to be __init or wants a
WARN_ON(vclock_was_used(VCLOCK_PVCLOCK)).  The latter hasn't landed in
-tip yet, but I think it'll land next week unless the merge window
opens early.
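
A minimal userspace sketch of the second option, under stated assumptions: warn_on() stands in for the kernel's WARN_ON(), the pvclock_in_use flag stands in for vclock_was_used(VCLOCK_PVCLOCK) (which, as noted, is not in -tip yet), and example_mark_vclock_used() is a hypothetical helper that only exists to drive the flag here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Placeholder for the kernel struct; not the real ABI layout. */
struct pvclock_vsyscall_time_info {
	unsigned int version;
};

static struct pvclock_vsyscall_time_info *pvti_cpu0_va;

/* Assumption: stands in for vclock_was_used(VCLOCK_PVCLOCK). */
static bool pvclock_in_use;

/* Counts warnings so the behaviour can be observed in userspace. */
static int warn_count;

/* Userspace stand-in for WARN_ON(): report the condition and carry on. */
static bool warn_on(bool cond, const char *what)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical helper: marks the vclock as having been used. */
void example_mark_vclock_used(void)
{
	pvclock_in_use = true;
}

void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti)
{
	/* Swapping the page after users may have seen it is suspicious. */
	warn_on(pvclock_in_use, "pvti page set after the vclock was used");
	pvti_cpu0_va = pvti;
}

struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void)
{
	return pvti_cpu0_va;
}
```

Like WARN_ON(), the stand-in only reports the late call and still performs the assignment. Marking the setter __init instead would confine it to boot-time callers without any runtime check.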

It may pay to actually separate out the kvm-clock clocksource and
rename it rather than partially duplicating it, assuming the result
wouldn't be messy.

Can you CC me on the rest of the series for new versions?

BTW, since this seems to require hypervisor changes to be useful, it
might make sense to rethink the interface a bit.  Are you actually
planning to support per-cpu pvti for this in any useful way?  If not,
I think that this would work a whole lot better and be considerably
less code if you had a single global pvti that lived in
hypervisor-allocated memory instead of an array that lives in guest
memory.  I'd be happy to discuss next week in more detail (currently
on vacation).

--Andy
Joao Martins Dec. 29, 2015, 12:50 p.m. UTC | #2
On 12/28/2015 11:45 PM, Andy Lutomirski wrote:
> On Mon, Dec 28, 2015 at 1:52 PM, Joao Martins <joao.m.martins@oracle.com> wrote:
>> Right now there is only a pvclock_pvti_cpu0_va() which is defined on
>> kvmclock since:
>>
>> commit dac16fba6fc5
>> ("x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap")
>>
>> The only user of this interface so far is kvm. This commit adds a setter
>> function for the pvti page and moves pvclock_pvti_cpu0_va to pvclock, which
>> is a more generic place to have it; and would allow other PV clocksources
>> to use it, such as Xen.
>>
> 
>> +
>> +void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti)
>> +{
>> +       pvti_cpu0_va = pvti;
>> +}
> 
> IMO this either wants to be __init or wants a
> WARN_ON(vclock_was_used(VCLOCK_PVCLOCK)).  The latter hasn't landed in
> -tip yet, but I think it'll land next week unless the merge window
> opens early.
OK, I will add those two once it lands in -tip.

I made a silly mistake in this patch: I blindly omitted the parameter name to
keep checkpatch happy, but didn't compile-test a build without PARAVIRT.
Apologies for that; I will fix it in the next version as well.

> 
> It may pay to actually separate out the kvm-clock clocksource and
> rename it rather than partially duplicating it, assuming the result
> wouldn't be messy.
> 
Not sure if I follow: I moved pvclock_pvti_cpu0_va out of kvm-clock. Or do you
mean separating out kvm-clock in its entirety, or something else within
kvm-clock that is common to both (such as kvm_setup_vsyscall_timeinfo)?

> Can you CC me on the rest of the series for new versions?
>
Sure! Thanks for the prompt reply.

> BTW, since this seems to require hypervisor changes to be useful, it
> might make sense to rethink the interface a bit.  Are you actually
> planning to support per-cpu pvti for this in any useful way?  If not,
> I think that this would work a whole lot better and be considerably
> less code if you had a single global pvti that lived in
> hypervisor-allocated memory instead of an array that lives in guest
> memory.  I'd be happy to discuss next week in more detail (currently
> on vacation).
Initially I had this series using per-cpu pvti's based on Linux 4.4, but since
that was removed in favor of the vdso using solely the cpu0 pvti, I ended up just
registering the cpu 0 page. I don't intend to add per-cpu pvti's (unless the
reviewers think it should be done), since they would only be used for this case,
meaning I would register pvti's for the other CPUs without having them used.
Having a global pvti as you suggest would make things a lot simpler for the guest,
but I guess this would only work assuming PVCLOCK_TSC_STABLE_BIT is there?
Looking forward to discussing it next week.

Joao

> 
> --Andy
>
Andy Lutomirski Dec. 29, 2015, 1:03 p.m. UTC | #3
On Tue, Dec 29, 2015 at 4:50 AM, Joao Martins <joao.m.martins@oracle.com> wrote:
> On 12/28/2015 11:45 PM, Andy Lutomirski wrote:
>> On Mon, Dec 28, 2015 at 1:52 PM, Joao Martins <joao.m.martins@oracle.com> wrote:
>>> Right now there is only a pvclock_pvti_cpu0_va() which is defined on
>>> kvmclock since:
>>>
>>> commit dac16fba6fc5
>>> ("x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap")
>>>
>>> The only user of this interface so far is kvm. This commit adds a setter
>>> function for the pvti page and moves pvclock_pvti_cpu0_va to pvclock, which
>>> is a more generic place to have it; and would allow other PV clocksources
>>> to use it, such as Xen.
>>>
>>
>>> +
>>> +void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti)
>>> +{
>>> +       pvti_cpu0_va = pvti;
>>> +}
>>
>> IMO this either wants to be __init or wants a
>> WARN_ON(vclock_was_used(VCLOCK_PVCLOCK)).  The latter hasn't landed in
>> -tip yet, but I think it'll land next week unless the merge window
>> opens early.
> OK, I will add those two once it lands in -tip.
>
> I made a silly mistake in this patch: I blindly omitted the parameter name to
> keep checkpatch happy, but didn't compile-test a build without PARAVIRT.
> Apologies for that; I will fix it in the next version as well.
>
>>
>> It may pay to actually separate out the kvm-clock clocksource and
>> rename it rather than partially duplicating it, assuming the result
>> wouldn't be messy.
>>
> Not sure if I follow: I moved pvclock_pvti_cpu0_va out of kvm-clock. Or do you
> mean separating out kvm-clock in its entirety, or something else within
> kvm-clock that is common to both (such as kvm_setup_vsyscall_timeinfo)?

I meant literally using the same clocksource.  I don't know whether
the Xen and KVM variants are similar enough for that to make sense.

>
>> Can you CC me on the rest of the series for new versions?
>>
> Sure! Thanks for the prompt reply.
>
>> BTW, since this seems to require hypervisor changes to be useful, it
>> might make sense to rethink the interface a bit.  Are you actually
>> planning to support per-cpu pvti for this in any useful way?  If not,
>> I think that this would work a whole lot better and be considerably
>> less code if you had a single global pvti that lived in
>> hypervisor-allocated memory instead of an array that lives in guest
>> memory.  I'd be happy to discuss next week in more detail (currently
>> on vacation).
> Initially I had this series using per-cpu pvti's based on Linux 4.4, but since
> that was removed in favor of the vdso using solely the cpu0 pvti, I ended up just
> registering the cpu 0 page. I don't intend to add per-cpu pvti's (unless the
> reviewers think it should be done), since they would only be used for this case,
> meaning I would register pvti's for the other CPUs without having them used.
> Having a global pvti as you suggest would make things a lot simpler for the guest,
> but I guess this would only work assuming PVCLOCK_TSC_STABLE_BIT is there?
> Looking forward to discussing it next week.

Sounds good.

--Andy

Patch

diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
index 66df22b..cfb1bb6 100644
--- a/arch/x86/include/asm/pvclock.h
+++ b/arch/x86/include/asm/pvclock.h
@@ -4,15 +4,6 @@ 
 #include <linux/clocksource.h>
 #include <asm/pvclock-abi.h>
 
-#ifdef CONFIG_PARAVIRT_CLOCK
-extern struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void);
-#else
-static inline struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void)
-{
-	return NULL;
-}
-#endif
-
 /* some helper functions for xen and kvm pv clock sources */
 cycle_t pvclock_clocksource_read(struct pvclock_vcpu_time_info *src);
 u8 pvclock_read_flags(struct pvclock_vcpu_time_info *src);
@@ -101,4 +92,17 @@  struct pvclock_vsyscall_time_info {
 
 #define PVTI_SIZE sizeof(struct pvclock_vsyscall_time_info)
 
+#ifdef CONFIG_PARAVIRT_CLOCK
+void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti);
+struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void);
+#else
+static inline void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *)
+{
+}
+static inline struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void)
+{
+	return NULL;
+}
+#endif
+
 #endif /* _ASM_X86_PVCLOCK_H */
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 72cef58..02a5d9e6 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -45,11 +45,6 @@  early_param("no-kvmclock", parse_no_kvmclock);
 static struct pvclock_vsyscall_time_info *hv_clock;
 static struct pvclock_wall_clock wall_clock;
 
-struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void)
-{
-	return hv_clock;
-}
-
 /*
  * The wallclock is the time of day when we booted. Since then, some time may
  * have elapsed since the hypervisor wrote the data. So we try to account for
@@ -329,6 +324,7 @@  int __init kvm_setup_vsyscall_timeinfo(void)
 		return 1;
 	}
 
+	pvclock_set_pvti_cpu0_va(hv_clock);
 	put_cpu();
 
 	kvm_clock.archdata.vclock_mode = VCLOCK_PVCLOCK;
diff --git a/arch/x86/kernel/pvclock.c b/arch/x86/kernel/pvclock.c
index 99bfc02..da6fbe2 100644
--- a/arch/x86/kernel/pvclock.c
+++ b/arch/x86/kernel/pvclock.c
@@ -25,6 +25,7 @@ 
 #include <asm/pvclock.h>
 
 static u8 valid_flags __read_mostly = 0;
+static struct pvclock_vsyscall_time_info *pvti_cpu0_va __read_mostly;
 
 void pvclock_set_flags(u8 flags)
 {
@@ -140,3 +141,13 @@  void pvclock_read_wallclock(struct pvclock_wall_clock *wall_clock,
 
 	set_normalized_timespec(ts, now.tv_sec, now.tv_nsec);
 }
+
+void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti)
+{
+	pvti_cpu0_va = pvti;
+}
+
+struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void)
+{
+	return pvti_cpu0_va;
+}