[0/7] x86/kvmclock: Remove memblock dependency and further cleanups
diff mbox

Message ID faa8df8c-f658-7fca-886d-ae741e41f4a8@amd.com
State New
Headers show

Commit Message

Brijesh Singh July 6, 2018, 11:51 p.m. UTC
Adding Tom and Boris


On 7/6/18 12:47 PM, Paolo Bonzini wrote:
> On 06/07/2018 18:13, Thomas Gleixner wrote:
>> To allow early utilization of kvmclock it is required to remove the
>> memblock dependency. memblock is currently used to allocate the per
>> cpu data for kvmclock.
>>
>> The first patch replaces the memblock with a static array sized 64bytes *
>> NR_CPUS and was posted by Pavel. That patch allocates everything statically
>> which is a waste when kvmclock is not used.
>>
>> The rest of the series cleans up the code and converts it to per cpu
>> variables but does not put the kvmclock data into the per cpu area as that
>> has an issue vs. mapping the boot cpu data into the VDSO (leaks arbitrary
>> data, unless page sized).
>>
>> The per cpu data consists of pointers to the actual data. For the boot cpu
>> a page sized array is statically allocated which can be mapped into the
>> VDSO. That array is used for initializing the first 64 CPU pointers. If
>> there are more CPUs the pvclock data is allocated during CPU bringup.
>>
>> So this still will have some overhead when kvmclock is not in use, but
>> bringing it down to zero would be a massive trainwreck and even more
>> indirections.
>>
>> Thanks,
>>
>> 	tglx
>>
>> 8<--------------
>>  a/arch/x86/include/asm/kvm_guest.h |    7 
>>  arch/x86/include/asm/kvm_para.h    |    1 
>>  arch/x86/kernel/kvm.c              |   14 -
>>  arch/x86/kernel/kvmclock.c         |  262 ++++++++++++++-----------------------
>>  arch/x86/kernel/setup.c            |    4 
>>  5 files changed, 105 insertions(+), 183 deletions(-)
>>
>>
>>
>>
> Thanks, this is really nice.  With the small changes from my review,
>
> Acked-by: Paolo Bonzini <pbonzini@redhat.com>

Hi Paolo and Thomas,


This series breaks SEV guest support. The physical address of both
wall_clock and hv_clock is shared with hypervisor for updates. In case
of SEV the address must be mapped as 'decrypted (i.e C=0)' so that both
guest and HV can access the data correctly. The follow patch should map
the pages as decrypted.


        struct pvclock_vcpu_time_info *vcpu_time;
@@ -269,6 +283,8 @@ void __init kvmclock_init(void)
        wall_clock = (struct pvclock_wall_clock *)wall_clock_mem;
        hv_clock = (struct pvclock_vsyscall_time_info *)hv_clock_mem;
 
+       sev_map_clocks_decrypted();
+
        if (kvm_register_clock("primary cpu clock")) {
                hv_clock = NULL;
                wall_clock = NULL;


But this patch triggers the below kernel crash.
early_set_memory_decrypted() uses kernel_physical_mapping_init() to
split the large pages and clear the C-bit. It seems this function still
has dependency with memblock.

[    0.000000] Hypervisor detected: KVM
[    0.000000] Kernel panic - not syncing: alloc_low_pages: ran out of
memory
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.18.0-rc3-sev #19
[    0.000000] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
0.0.0 02/06/2015
[    0.000000] Call Trace:
[    0.000000]  ? dump_stack+0x5c/0x80
[    0.000000]  ? panic+0xe7/0x247
[    0.000000]  ? alloc_low_pages+0x130/0x130
[    0.000000]  ? kernel_physical_mapping_init+0xe0/0x204
[    0.000000]  ? early_set_memory_enc_dec+0x10f/0x160
[    0.000000]  ? 0xffffffffb1000000
[    0.000000]  ? kvmclock_init+0x83/0x20a
[    0.000000]  ? setup_arch+0x42c/0xce6
[    0.000000]  ? start_kernel+0x67/0x531
[    0.000000]  ? load_ucode_bsp+0x76/0x12e
[    0.000000]  ? secondary_startup_64+0xa5/0xb0
[    0.000000] ---[ end Kernel panic - not syncing: alloc_low_pages: ran
out of memory ]---

- Brijesh

Patch
diff mbox

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 890e9e5..640c796 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -251,6 +251,20 @@  static void kvm_shutdown(void)
        native_machine_shutdown();
 }
 
+static void sev_map_clocks_decrypted(void)
+{
+       if (!sev_active())
+               return;
+
+       /*
+        * wall_clock and hv_clock addresses are shared with hypervisor.
+        * When SEV is enabled, any addresses shared with hypervisor must be
+        * mapped decrypted.
+        */
+       early_set_memory_decrypted((unsigned long) wall_clock,
WALL_CLOCK_SIZE);
+       early_set_memory_decrypted((unsigned long) hv_clock, HV_CLOCK_SIZE);
+}
+
 void __init kvmclock_init(void)
 {