diff mbox series

x86/sev: Make early_set_memory_decrypted() calls page aligned

Message ID 20230818233451.3615464-1-srutherford@google.com (mailing list archive)
State New, archived
Headers show
Series x86/sev: Make early_set_memory_decrypted() calls page aligned | expand

Commit Message

Steve Rutherford Aug. 18, 2023, 11:34 p.m. UTC
early_set_memory_decrypted() assumes its parameters are page aligned.
Non-page aligned calls result in additional pages being marked as
decrypted via the encryption status hypercall, which results in
consistent corruption of pages during live migration. Live
migration requires accurate encryption status information to avoid
migrating pages from the wrong perspective.

Fixes: 4716276184ec ("X86/KVM: Decrypt shared per-cpu variables when SEV is active")
Signed-off-by: Steve Rutherford <srutherford@google.com>
---
 arch/x86/kernel/kvm.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

Comments

Gupta, Pankaj Aug. 21, 2023, 6:44 a.m. UTC | #1
On 8/19/2023 1:34 AM, Steve Rutherford wrote:
> early_set_memory_decrypted() assumes its parameters are page aligned.
> Non-page aligned calls result in additional pages being marked as
> decrypted via the encryption status hypercall, which results in
> consistent corruption of pages during live migration. Live
> migration requires accurate encryption status information to avoid
> migrating pages from the wrong perspective.
> 
> Fixes: 4716276184ec ("X86/KVM: Decrypt shared per-cpu variables when SEV is active")
> Signed-off-by: Steve Rutherford <srutherford@google.com>
> ---
>   arch/x86/kernel/kvm.c | 14 +++++++++++++-
>   1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 6a36db4f79fd..a0c072d3103c 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -419,7 +419,14 @@ static u64 kvm_steal_clock(int cpu)
>   
>   static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
>   {
> -	early_set_memory_decrypted((unsigned long) ptr, size);
> +	/*
> +	 * early_set_memory_decrypted() requires page aligned parameters, but
> +	 * this function needs to handle ptrs offset into a page.
> +	 */
> +	unsigned long start = PAGE_ALIGN_DOWN((unsigned long) ptr);
> +	unsigned long end = (unsigned long) ptr + size;
> +
> +	early_set_memory_decrypted(start, end - start);
>   }
>   
>   /*
> @@ -438,6 +445,11 @@ static void __init sev_map_percpu_data(void)
>   		return;
>   
>   	for_each_possible_cpu(cpu) {
> +		/*
> +		 * Calling __set_percpu_decrypted() for each per-cpu variable is
> +		 * inefficent, since it may decrypt the same page multiple times.
> +		 * That said, it avoids the need for more complicated logic.
> +		 */
>   		__set_percpu_decrypted(&per_cpu(apf_reason, cpu), sizeof(apf_reason));
>   		__set_percpu_decrypted(&per_cpu(steal_time, cpu), sizeof(steal_time));
>   		__set_percpu_decrypted(&per_cpu(kvm_apic_eoi, cpu), sizeof(kvm_apic_eoi));

The fix looks correct to me.

Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Tom Lendacky Aug. 21, 2023, 1:10 p.m. UTC | #2
On 8/18/23 18:34, Steve Rutherford wrote:
> early_set_memory_decrypted() assumes its parameters are page aligned.
> Non-page aligned calls result in additional pages being marked as
> decrypted via the encryption status hypercall, which results in
> consistent corruption of pages during live migration. Live
> migration requires accurate encryption status information to avoid
> migrating pages from the wrong perspective.

Hmmm... I'm not sure this is the proper fix. The code is actually doing 
the right thing from a encyrption/decryption point of view by checking the 
c-bit for the PTE associated with the virtual address and the size 
(possibly crossing page boundaries).

I think the problem is on the call to early_set_mem_enc_dec_hypercall() 
where it doesn't take into account the possible crossing of page 
boundaries and so can under-count the number of pages, right?

Thanks,
Tom

> 
> Fixes: 4716276184ec ("X86/KVM: Decrypt shared per-cpu variables when SEV is active")
> Signed-off-by: Steve Rutherford <srutherford@google.com>
> ---
>   arch/x86/kernel/kvm.c | 14 +++++++++++++-
>   1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 6a36db4f79fd..a0c072d3103c 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -419,7 +419,14 @@ static u64 kvm_steal_clock(int cpu)
>   
>   static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
>   {
> -	early_set_memory_decrypted((unsigned long) ptr, size);
> +	/*
> +	 * early_set_memory_decrypted() requires page aligned parameters, but
> +	 * this function needs to handle ptrs offset into a page.
> +	 */
> +	unsigned long start = PAGE_ALIGN_DOWN((unsigned long) ptr);
> +	unsigned long end = (unsigned long) ptr + size;
> +
> +	early_set_memory_decrypted(start, end - start);
>   }
>   
>   /*
> @@ -438,6 +445,11 @@ static void __init sev_map_percpu_data(void)
>   		return;
>   
>   	for_each_possible_cpu(cpu) {
> +		/*
> +		 * Calling __set_percpu_decrypted() for each per-cpu variable is
> +		 * inefficent, since it may decrypt the same page multiple times.
> +		 * That said, it avoids the need for more complicated logic.
> +		 */
>   		__set_percpu_decrypted(&per_cpu(apf_reason, cpu), sizeof(apf_reason));
>   		__set_percpu_decrypted(&per_cpu(steal_time, cpu), sizeof(steal_time));
>   		__set_percpu_decrypted(&per_cpu(kvm_apic_eoi, cpu), sizeof(kvm_apic_eoi));
Steve Rutherford Aug. 21, 2023, 6:15 p.m. UTC | #3
On Mon, Aug 21, 2023 at 6:10 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 8/18/23 18:34, Steve Rutherford wrote:
> > early_set_memory_decrypted() assumes its parameters are page aligned.
> > Non-page aligned calls result in additional pages being marked as
> > decrypted via the encryption status hypercall, which results in
> > consistent corruption of pages during live migration. Live
> > migration requires accurate encryption status information to avoid
> > migrating pages from the wrong perspective.
>
> Hmmm... I'm not sure this is the proper fix. The code is actually doing
> the right thing from a encyrption/decryption point of view by checking the
> c-bit for the PTE associated with the virtual address and the size
> (possibly crossing page boundaries).
>
> I think the problem is on the call to early_set_mem_enc_dec_hypercall()
> where it doesn't take into account the possible crossing of page
> boundaries and so can under-count the number of pages, right?

Right now, if you request decryption of e.g. a non-page aligned 0x40
byte structure, it rounds the 0x40 bytes up to one page, and then
hypercalls to mark both the page it's on and the subsequent page as
decrypted (since the rounding stretches the structure onto the next
page spuriously). The arithmetic in the combination of
early_set_memory_enc_dec() and early_set_mem_enc_dec_hypercall() are
correct if they are called with page aligned vaddrs (non-page-aligned
sizes are fine iiuc).

Thanks,
Steve
>
> Thanks,
> Tom
>
> >
> > Fixes: 4716276184ec ("X86/KVM: Decrypt shared per-cpu variables when SEV is active")
> > Signed-off-by: Steve Rutherford <srutherford@google.com>
> > ---
> >   arch/x86/kernel/kvm.c | 14 +++++++++++++-
> >   1 file changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > index 6a36db4f79fd..a0c072d3103c 100644
> > --- a/arch/x86/kernel/kvm.c
> > +++ b/arch/x86/kernel/kvm.c
> > @@ -419,7 +419,14 @@ static u64 kvm_steal_clock(int cpu)
> >
> >   static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
> >   {
> > -     early_set_memory_decrypted((unsigned long) ptr, size);
> > +     /*
> > +      * early_set_memory_decrypted() requires page aligned parameters, but
> > +      * this function needs to handle ptrs offset into a page.
> > +      */
> > +     unsigned long start = PAGE_ALIGN_DOWN((unsigned long) ptr);
> > +     unsigned long end = (unsigned long) ptr + size;
> > +
> > +     early_set_memory_decrypted(start, end - start);
> >   }
> >
> >   /*
> > @@ -438,6 +445,11 @@ static void __init sev_map_percpu_data(void)
> >               return;
> >
> >       for_each_possible_cpu(cpu) {
> > +             /*
> > +              * Calling __set_percpu_decrypted() for each per-cpu variable is
> > +              * inefficent, since it may decrypt the same page multiple times.
> > +              * That said, it avoids the need for more complicated logic.
> > +              */
> >               __set_percpu_decrypted(&per_cpu(apf_reason, cpu), sizeof(apf_reason));
> >               __set_percpu_decrypted(&per_cpu(steal_time, cpu), sizeof(steal_time));
> >               __set_percpu_decrypted(&per_cpu(kvm_apic_eoi, cpu), sizeof(kvm_apic_eoi));
Tom Lendacky Aug. 21, 2023, 6:53 p.m. UTC | #4
On 8/21/23 13:15, Steve Rutherford wrote:
> On Mon, Aug 21, 2023 at 6:10 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>
>> On 8/18/23 18:34, Steve Rutherford wrote:
>>> early_set_memory_decrypted() assumes its parameters are page aligned.
>>> Non-page aligned calls result in additional pages being marked as
>>> decrypted via the encryption status hypercall, which results in
>>> consistent corruption of pages during live migration. Live
>>> migration requires accurate encryption status information to avoid
>>> migrating pages from the wrong perspective.
>>
>> Hmmm... I'm not sure this is the proper fix. The code is actually doing
>> the right thing from a encyrption/decryption point of view by checking the
>> c-bit for the PTE associated with the virtual address and the size
>> (possibly crossing page boundaries).
>>
>> I think the problem is on the call to early_set_mem_enc_dec_hypercall()
>> where it doesn't take into account the possible crossing of page
>> boundaries and so can under-count the number of pages, right?
> 
> Right now, if you request decryption of e.g. a non-page aligned 0x40
> byte structure, it rounds the 0x40 bytes up to one page, and then
> hypercalls to mark both the page it's on and the subsequent page as
> decrypted (since the rounding stretches the structure onto the next
> page spuriously). The arithmetic in the combination of
> early_set_memory_enc_dec() and early_set_mem_enc_dec_hypercall() are
> correct if they are called with page aligned vaddrs (non-page-aligned
> sizes are fine iiuc).

Ah, right, correct. It is still related to how the page count is 
calculated for the hypercall, though, right? The encryption/decryption 
operations function properly.

If another caller of early_set_memory_decrypted() gets added, it would 
need to know to do the same thing. So I just wonder if this wouldn't be 
better fixed in early_set_memory_enc_dec() by using a page aligned address 
and proper number of pages when calling early_set_mem_enc_dec_hypercall() 
or in early_set_mem_enc_dec_hypercall() where it would take a size 
argument instead of a page count and does the proper work to get a page 
aligned address and proper page count.

Also, if it is the hypercall that is causing the issue, should the Fixes 
tag be 064ce6c550a0 ("mm: x86: Invoke hypercall when page encryption 
status is changed") since the problem is around the hypercall.

Thanks,
Tom

> 
> Thanks,
> Steve
>>
>> Thanks,
>> Tom
>>
>>>
>>> Fixes: 4716276184ec ("X86/KVM: Decrypt shared per-cpu variables when SEV is active")
>>> Signed-off-by: Steve Rutherford <srutherford@google.com>
>>> ---
>>>    arch/x86/kernel/kvm.c | 14 +++++++++++++-
>>>    1 file changed, 13 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
>>> index 6a36db4f79fd..a0c072d3103c 100644
>>> --- a/arch/x86/kernel/kvm.c
>>> +++ b/arch/x86/kernel/kvm.c
>>> @@ -419,7 +419,14 @@ static u64 kvm_steal_clock(int cpu)
>>>
>>>    static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
>>>    {
>>> -     early_set_memory_decrypted((unsigned long) ptr, size);
>>> +     /*
>>> +      * early_set_memory_decrypted() requires page aligned parameters, but
>>> +      * this function needs to handle ptrs offset into a page.
>>> +      */
>>> +     unsigned long start = PAGE_ALIGN_DOWN((unsigned long) ptr);
>>> +     unsigned long end = (unsigned long) ptr + size;
>>> +
>>> +     early_set_memory_decrypted(start, end - start);
>>>    }
>>>
>>>    /*
>>> @@ -438,6 +445,11 @@ static void __init sev_map_percpu_data(void)
>>>                return;
>>>
>>>        for_each_possible_cpu(cpu) {
>>> +             /*
>>> +              * Calling __set_percpu_decrypted() for each per-cpu variable is
>>> +              * inefficent, since it may decrypt the same page multiple times.
>>> +              * That said, it avoids the need for more complicated logic.
>>> +              */
>>>                __set_percpu_decrypted(&per_cpu(apf_reason, cpu), sizeof(apf_reason));
>>>                __set_percpu_decrypted(&per_cpu(steal_time, cpu), sizeof(steal_time));
>>>                __set_percpu_decrypted(&per_cpu(kvm_apic_eoi, cpu), sizeof(kvm_apic_eoi));
Steve Rutherford Aug. 21, 2023, 7:25 p.m. UTC | #5
On Mon, Aug 21, 2023 at 11:54 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 8/21/23 13:15, Steve Rutherford wrote:
> > On Mon, Aug 21, 2023 at 6:10 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >>
> >> On 8/18/23 18:34, Steve Rutherford wrote:
> >>> early_set_memory_decrypted() assumes its parameters are page aligned.
> >>> Non-page aligned calls result in additional pages being marked as
> >>> decrypted via the encryption status hypercall, which results in
> >>> consistent corruption of pages during live migration. Live
> >>> migration requires accurate encryption status information to avoid
> >>> migrating pages from the wrong perspective.
> >>
> >> Hmmm... I'm not sure this is the proper fix. The code is actually doing
> >> the right thing from a encyrption/decryption point of view by checking the
> >> c-bit for the PTE associated with the virtual address and the size
> >> (possibly crossing page boundaries).
> >>
> >> I think the problem is on the call to early_set_mem_enc_dec_hypercall()
> >> where it doesn't take into account the possible crossing of page
> >> boundaries and so can under-count the number of pages, right?
> >
> > Right now, if you request decryption of e.g. a non-page aligned 0x40
> > byte structure, it rounds the 0x40 bytes up to one page, and then
> > hypercalls to mark both the page it's on and the subsequent page as
> > decrypted (since the rounding stretches the structure onto the next
> > page spuriously). The arithmetic in the combination of
> > early_set_memory_enc_dec() and early_set_mem_enc_dec_hypercall() are
> > correct if they are called with page aligned vaddrs (non-page-aligned
> > sizes are fine iiuc).
>
> Ah, right, correct. It is still related to how the page count is
> calculated for the hypercall, though, right? The encryption/decryption
> operations function properly.

Yep! It's just the hypercall that behaves poorly in this situation.
>
> If another caller of early_set_memory_decrypted() gets added, it would
> need to know to do the same thing. So I just wonder if this wouldn't be
> better fixed in early_set_memory_enc_dec() by using a page aligned address
> and proper number of pages when calling early_set_mem_enc_dec_hypercall()
> or in early_set_mem_enc_dec_hypercall() where it would take a size
> argument instead of a page count and does the proper work to get a page
> aligned address and proper page count.
>
> Also, if it is the hypercall that is causing the issue, should the Fixes
> tag be 064ce6c550a0 ("mm: x86: Invoke hypercall when page encryption
> status is changed") since the problem is around the hypercall.

Fair question. I was torn about where to point this, since either
fixing up the value inside early_set_memory_enc_dec() or fixing up the
per-cpu callers is correct. The non-early version
(__set_memory_enc_pgtable()) calls WARN_ONCE for misaligned addresses
under the hood, so I thought the early version should have the same
contract (though, obviously, this lacks the actual WARN_ONCE). I can
re-upload with a WARN_ONCE or with the masking moved into
early_set_memory_enc_dec().
Thanks,
Steve

>
> Thanks,
> Tom
>
> >
> > Thanks,
> > Steve
> >>
> >> Thanks,
> >> Tom
> >>
> >>>
> >>> Fixes: 4716276184ec ("X86/KVM: Decrypt shared per-cpu variables when SEV is active")
> >>> Signed-off-by: Steve Rutherford <srutherford@google.com>
> >>> ---
> >>>    arch/x86/kernel/kvm.c | 14 +++++++++++++-
> >>>    1 file changed, 13 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> >>> index 6a36db4f79fd..a0c072d3103c 100644
> >>> --- a/arch/x86/kernel/kvm.c
> >>> +++ b/arch/x86/kernel/kvm.c
> >>> @@ -419,7 +419,14 @@ static u64 kvm_steal_clock(int cpu)
> >>>
> >>>    static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
> >>>    {
> >>> -     early_set_memory_decrypted((unsigned long) ptr, size);
> >>> +     /*
> >>> +      * early_set_memory_decrypted() requires page aligned parameters, but
> >>> +      * this function needs to handle ptrs offset into a page.
> >>> +      */
> >>> +     unsigned long start = PAGE_ALIGN_DOWN((unsigned long) ptr);
> >>> +     unsigned long end = (unsigned long) ptr + size;
> >>> +
> >>> +     early_set_memory_decrypted(start, end - start);
> >>>    }
> >>>
> >>>    /*
> >>> @@ -438,6 +445,11 @@ static void __init sev_map_percpu_data(void)
> >>>                return;
> >>>
> >>>        for_each_possible_cpu(cpu) {
> >>> +             /*
> >>> +              * Calling __set_percpu_decrypted() for each per-cpu variable is
> >>> +              * inefficent, since it may decrypt the same page multiple times.
> >>> +              * That said, it avoids the need for more complicated logic.
> >>> +              */
> >>>                __set_percpu_decrypted(&per_cpu(apf_reason, cpu), sizeof(apf_reason));
> >>>                __set_percpu_decrypted(&per_cpu(steal_time, cpu), sizeof(steal_time));
> >>>                __set_percpu_decrypted(&per_cpu(kvm_apic_eoi, cpu), sizeof(kvm_apic_eoi));
Tom Lendacky Aug. 21, 2023, 8:24 p.m. UTC | #6
On 8/21/23 14:25, Steve Rutherford wrote:
> On Mon, Aug 21, 2023 at 11:54 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>
>> On 8/21/23 13:15, Steve Rutherford wrote:
>>> On Mon, Aug 21, 2023 at 6:10 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>>>
>>>> On 8/18/23 18:34, Steve Rutherford wrote:
>>>>> early_set_memory_decrypted() assumes its parameters are page aligned.
>>>>> Non-page aligned calls result in additional pages being marked as
>>>>> decrypted via the encryption status hypercall, which results in
>>>>> consistent corruption of pages during live migration. Live
>>>>> migration requires accurate encryption status information to avoid
>>>>> migrating pages from the wrong perspective.
>>>>
>>>> Hmmm... I'm not sure this is the proper fix. The code is actually doing
>>>> the right thing from a encyrption/decryption point of view by checking the
>>>> c-bit for the PTE associated with the virtual address and the size
>>>> (possibly crossing page boundaries).
>>>>
>>>> I think the problem is on the call to early_set_mem_enc_dec_hypercall()
>>>> where it doesn't take into account the possible crossing of page
>>>> boundaries and so can under-count the number of pages, right?
>>>
>>> Right now, if you request decryption of e.g. a non-page aligned 0x40
>>> byte structure, it rounds the 0x40 bytes up to one page, and then
>>> hypercalls to mark both the page it's on and the subsequent page as
>>> decrypted (since the rounding stretches the structure onto the next
>>> page spuriously). The arithmetic in the combination of
>>> early_set_memory_enc_dec() and early_set_mem_enc_dec_hypercall() are
>>> correct if they are called with page aligned vaddrs (non-page-aligned
>>> sizes are fine iiuc).
>>
>> Ah, right, correct. It is still related to how the page count is
>> calculated for the hypercall, though, right? The encryption/decryption
>> operations function properly.
> 
> Yep! It's just the hypercall that behaves poorly in this situation.

Ok, cool.

>>
>> If another caller of early_set_memory_decrypted() gets added, it would
>> need to know to do the same thing. So I just wonder if this wouldn't be
>> better fixed in early_set_memory_enc_dec() by using a page aligned address
>> and proper number of pages when calling early_set_mem_enc_dec_hypercall()
>> or in early_set_mem_enc_dec_hypercall() where it would take a size
>> argument instead of a page count and does the proper work to get a page
>> aligned address and proper page count.
>>
>> Also, if it is the hypercall that is causing the issue, should the Fixes
>> tag be 064ce6c550a0 ("mm: x86: Invoke hypercall when page encryption
>> status is changed") since the problem is around the hypercall.
> 
> Fair question. I was torn about where to point this, since either
> fixing up the value inside early_set_memory_enc_dec() or fixing up the
> per-cpu callers is correct. The non-early version
> (__set_memory_enc_pgtable()) calls WARN_ONCE for misaligned addresses
> under the hood, so I thought the early version should have the same
> contract (though, obviously, this lacks the actual WARN_ONCE). I can
> re-upload with a WARN_ONCE or with the masking moved into
> early_set_memory_enc_dec().

I like the fix for the hypercall being in early_set_memory_enc_dec(). This 
way the behavior doesn't change for existing callers and doesn't require 
adding a WARN.

Thanks,
Tom

> Thanks,
> Steve
> 
>>
>> Thanks,
>> Tom
>>
>>>
>>> Thanks,
>>> Steve
>>>>
>>>> Thanks,
>>>> Tom
>>>>
>>>>>
>>>>> Fixes: 4716276184ec ("X86/KVM: Decrypt shared per-cpu variables when SEV is active")
>>>>> Signed-off-by: Steve Rutherford <srutherford@google.com>
>>>>> ---
>>>>>     arch/x86/kernel/kvm.c | 14 +++++++++++++-
>>>>>     1 file changed, 13 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
>>>>> index 6a36db4f79fd..a0c072d3103c 100644
>>>>> --- a/arch/x86/kernel/kvm.c
>>>>> +++ b/arch/x86/kernel/kvm.c
>>>>> @@ -419,7 +419,14 @@ static u64 kvm_steal_clock(int cpu)
>>>>>
>>>>>     static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
>>>>>     {
>>>>> -     early_set_memory_decrypted((unsigned long) ptr, size);
>>>>> +     /*
>>>>> +      * early_set_memory_decrypted() requires page aligned parameters, but
>>>>> +      * this function needs to handle ptrs offset into a page.
>>>>> +      */
>>>>> +     unsigned long start = PAGE_ALIGN_DOWN((unsigned long) ptr);
>>>>> +     unsigned long end = (unsigned long) ptr + size;
>>>>> +
>>>>> +     early_set_memory_decrypted(start, end - start);
>>>>>     }
>>>>>
>>>>>     /*
>>>>> @@ -438,6 +445,11 @@ static void __init sev_map_percpu_data(void)
>>>>>                 return;
>>>>>
>>>>>         for_each_possible_cpu(cpu) {
>>>>> +             /*
>>>>> +              * Calling __set_percpu_decrypted() for each per-cpu variable is
>>>>> +              * inefficent, since it may decrypt the same page multiple times.
>>>>> +              * That said, it avoids the need for more complicated logic.
>>>>> +              */
>>>>>                 __set_percpu_decrypted(&per_cpu(apf_reason, cpu), sizeof(apf_reason));
>>>>>                 __set_percpu_decrypted(&per_cpu(steal_time, cpu), sizeof(steal_time));
>>>>>                 __set_percpu_decrypted(&per_cpu(kvm_apic_eoi, cpu), sizeof(kvm_apic_eoi));
Steve Rutherford Aug. 21, 2023, 10:53 p.m. UTC | #7
On Mon, Aug 21, 2023 at 1:24 PM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> I like the fix for the hypercall being in early_set_memory_enc_dec(). This
> way the behavior doesn't change for existing callers and doesn't require
> adding a WARN.
>
> Thanks,
> Tom

I uploaded a version based on your earlier advice to have
early_set_mem_enc_dec_hypercall() take a size. I was hesitant since I
thought I'd have to change a ton of callsites, but the line count was
a lot shorter than I expected. This seems like the right way to go
since it directly fixes the problematic rounding.

Thanks,
Steve
Ingo Molnar Sept. 15, 2023, 11:59 a.m. UTC | #8
* Steve Rutherford <srutherford@google.com> wrote:

> early_set_memory_decrypted() assumes its parameters are page aligned.
> Non-page aligned calls result in additional pages being marked as
> decrypted via the encryption status hypercall, which results in
> consistent corruption of pages during live migration. Live
> migration requires accurate encryption status information to avoid
> migrating pages from the wrong perspective.
> 
> Fixes: 4716276184ec ("X86/KVM: Decrypt shared per-cpu variables when SEV is active")
> Signed-off-by: Steve Rutherford <srutherford@google.com>
> ---
>  arch/x86/kernel/kvm.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)

I suppose this fix is going through the KVM tree, or should we pick it up
in the x86 tree?

Thanks,

	Ingo
Steve Rutherford Sept. 15, 2023, 6:44 p.m. UTC | #9
I believe V3 of this fix was already merged into both x86 and Linus'
tree (I think as ac3f9c9f1b37edaa7d1a9b908bc79d843955a1a2, "x86/sev:
Make enc_dec_hypercall() accept a size instead of npages").


On Fri, Sep 15, 2023 at 4:59 AM Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Steve Rutherford <srutherford@google.com> wrote:
>
> > early_set_memory_decrypted() assumes its parameters are page aligned.
> > Non-page aligned calls result in additional pages being marked as
> > decrypted via the encryption status hypercall, which results in
> > consistent corruption of pages during live migration. Live
> > migration requires accurate encryption status information to avoid
> > migrating pages from the wrong perspective.
> >
> > Fixes: 4716276184ec ("X86/KVM: Decrypt shared per-cpu variables when SEV is active")
> > Signed-off-by: Steve Rutherford <srutherford@google.com>
> > ---
> >  arch/x86/kernel/kvm.c | 14 +++++++++++++-
> >  1 file changed, 13 insertions(+), 1 deletion(-)
>
> I suppose this fix is going through the KVM tree, or should we pick it up
> in the x86 tree?
>
> Thanks,
>
>         Ingo
Ingo Molnar Sept. 16, 2023, 9:19 a.m. UTC | #10
* Steve Rutherford <srutherford@google.com> wrote:

> I believe V3 of this fix was already merged into both x86 and Linus'
> tree (I think as ac3f9c9f1b37edaa7d1a9b908bc79d843955a1a2, "x86/sev:
> Make enc_dec_hypercall() accept a size instead of npages").

Erm ... indeed, and I forgot to mark the old v1 thread as read and then
promptly forgot about it all ... ;-)

Thanks,

	Ingo
diff mbox series

Patch

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 6a36db4f79fd..a0c072d3103c 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -419,7 +419,14 @@  static u64 kvm_steal_clock(int cpu)
 
 static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
 {
-	early_set_memory_decrypted((unsigned long) ptr, size);
+	/*
+	 * early_set_memory_decrypted() requires page aligned parameters, but
+	 * this function needs to handle ptrs offset into a page.
+	 */
+	unsigned long start = PAGE_ALIGN_DOWN((unsigned long) ptr);
+	unsigned long end = (unsigned long) ptr + size;
+
+	early_set_memory_decrypted(start, end - start);
 }
 
 /*
@@ -438,6 +445,11 @@  static void __init sev_map_percpu_data(void)
 		return;
 
 	for_each_possible_cpu(cpu) {
+		/*
+		 * Calling __set_percpu_decrypted() for each per-cpu variable is
+		 * inefficent, since it may decrypt the same page multiple times.
+		 * That said, it avoids the need for more complicated logic.
+		 */
 		__set_percpu_decrypted(&per_cpu(apf_reason, cpu), sizeof(apf_reason));
 		__set_percpu_decrypted(&per_cpu(steal_time, cpu), sizeof(steal_time));
 		__set_percpu_decrypted(&per_cpu(kvm_apic_eoi, cpu), sizeof(kvm_apic_eoi));