diff mbox series

x86/S3: put data segment registers into known state upon resume

Message ID 3cad2798-1a01-7d5e-ea55-ddb9ba6388d9@suse.com (mailing list archive)
State New, archived
Headers show
Series x86/S3: put data segment registers into known state upon resume | expand

Commit Message

Jan Beulich July 20, 2020, 3:20 p.m. UTC
wakeup_32 sets %ds and %es to BOOT_DS, while leaving %fs at what
wakeup_start did set it to, and %gs at whatever BIOS did load into it.
All of this may end up confusing the first load_segments() to run on
the BSP after resume, in particular allowing a non-nul selector value
to be left in %fs.

Alongside %ss, also put all other data segment registers into the same
state that the boot and CPU bringup paths put them in.

Reported-by: M. Vefa Bicakci <m.v.b@runbox.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>

Comments

Roger Pau Monné July 21, 2020, 11:27 a.m. UTC | #1
On Mon, Jul 20, 2020 at 05:20:15PM +0200, Jan Beulich wrote:
> wakeup_32 sets %ds and %es to BOOT_DS, while leaving %fs at what
> wakeup_start did set it to, and %gs at whatever BIOS did load into it.
> All of this may end up confusing the first load_segments() to run on
> the BSP after resume, in particular allowing a non-nul selector value
> to be left in %fs.
> 
> Alongside %ss, also put all other data segment registers into the same
> state that the boot and CPU bringup paths put them in.
> 
> Reported-by: M. Vefa Bicakci <m.v.b@runbox.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

I wouldn't mind if the added chunk was placed before loading %ss, so
that the context of what's in %eax would be clearer.

Roger.
Andrew Cooper July 23, 2020, 2:40 p.m. UTC | #2
On 20/07/2020 16:20, Jan Beulich wrote:
> wakeup_32 sets %ds and %es to BOOT_DS, while leaving %fs at what
> wakeup_start did set it to, and %gs at whatever BIOS did load into it.
> All of this may end up confusing the first load_segments() to run on
> the BSP after resume, in particular allowing a non-nul selector value
> to be left in %fs.
>
> Alongside %ss, also put all other data segment registers into the same
> state that the boot and CPU bringup paths put them in.
>
> Reported-by: M. Vefa Bicakci <m.v.b@runbox.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/acpi/wakeup_prot.S
> +++ b/xen/arch/x86/acpi/wakeup_prot.S
> @@ -52,6 +52,16 @@ ENTRY(s3_resume)
>          mov     %eax, %ss
>          mov     saved_rsp(%rip), %rsp
>  
> +        /*
> +         * Also put other segment registers into known state, like would
> +         * be done on the boot path. This is in particular necessary for
> +         * the first load_segments() to work as intended.
> +         */

I don't think the comment is helpful, not least because it refers to a
broken behaviour in load_segemnts() which is soon going to change anyway.

We've literally just loaded the GDT, at which point reloading all
segments *is* the expected thing to do.

I'd recommend that the diff be simply:

diff --git a/xen/arch/x86/acpi/wakeup_prot.S
b/xen/arch/x86/acpi/wakeup_prot.S
index dcc7e2327d..a2c41c4f3f 100644
--- a/xen/arch/x86/acpi/wakeup_prot.S
+++ b/xen/arch/x86/acpi/wakeup_prot.S
@@ -49,6 +49,10 @@ ENTRY(s3_resume)
         mov     %rax, %cr0
 
         mov     $__HYPERVISOR_DS64, %eax
+        mov     %eax, %ds
+        mov     %eax, %es
+        mov     %eax, %fs
+        mov     %eax, %gs
         mov     %eax, %ss
         mov     saved_rsp(%rip), %rsp
 

It is a shame that the CR0 load breaks up the obvious connection with
lgdt, but IIRC, that was a consequence of how the code was laid out
previously.

Preferably with the above diff, Reviewed-by: Andrew Cooper
<andrew.cooper3@citrix.com>
Jan Beulich July 23, 2020, 3:19 p.m. UTC | #3
On 23.07.2020 16:40, Andrew Cooper wrote:
> On 20/07/2020 16:20, Jan Beulich wrote:
>> wakeup_32 sets %ds and %es to BOOT_DS, while leaving %fs at what
>> wakeup_start did set it to, and %gs at whatever BIOS did load into it.
>> All of this may end up confusing the first load_segments() to run on
>> the BSP after resume, in particular allowing a non-nul selector value
>> to be left in %fs.
>>
>> Alongside %ss, also put all other data segment registers into the same
>> state that the boot and CPU bringup paths put them in.
>>
>> Reported-by: M. Vefa Bicakci <m.v.b@runbox.com>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>
>> --- a/xen/arch/x86/acpi/wakeup_prot.S
>> +++ b/xen/arch/x86/acpi/wakeup_prot.S
>> @@ -52,6 +52,16 @@ ENTRY(s3_resume)
>>          mov     %eax, %ss
>>          mov     saved_rsp(%rip), %rsp
>>  
>> +        /*
>> +         * Also put other segment registers into known state, like would
>> +         * be done on the boot path. This is in particular necessary for
>> +         * the first load_segments() to work as intended.
>> +         */
> 
> I don't think the comment is helpful, not least because it refers to a
> broken behaviour in load_segemnts() which is soon going to change anyway.

Well, I can drop it. I merely thought I'd be nice and comment my
code once in a while (and the comment could be dropped / adjusted
when load_segments() changes)...

> We've literally just loaded the GDT, at which point reloading all
> segments *is* the expected thing to do.

In a way, unless some/all are assumed to already hold a nul selector.

> I'd recommend that the diff be simply:
> 
> diff --git a/xen/arch/x86/acpi/wakeup_prot.S
> b/xen/arch/x86/acpi/wakeup_prot.S
> index dcc7e2327d..a2c41c4f3f 100644
> --- a/xen/arch/x86/acpi/wakeup_prot.S
> +++ b/xen/arch/x86/acpi/wakeup_prot.S
> @@ -49,6 +49,10 @@ ENTRY(s3_resume)
>          mov     %rax, %cr0
>  
>          mov     $__HYPERVISOR_DS64, %eax
> +        mov     %eax, %ds
> +        mov     %eax, %es
> +        mov     %eax, %fs
> +        mov     %eax, %gs
>          mov     %eax, %ss
>          mov     saved_rsp(%rip), %rsp

So I had specifically elected to not put the addition there, to make
sure the stack would get established first. But seeing both Roger
and you ask me to do otherwise - well, so be it then.

> It is a shame that the CR0 load breaks up the obvious connection with
> lgdt, but IIRC, that was a consequence of how the code was laid out
> previously.
> 
> Preferably with the above diff, Reviewed-by: Andrew Cooper
> <andrew.cooper3@citrix.com>

Thanks, Jan
Andrew Cooper July 23, 2020, 4 p.m. UTC | #4
On 23/07/2020 16:19, Jan Beulich wrote:
> On 23.07.2020 16:40, Andrew Cooper wrote:
>> On 20/07/2020 16:20, Jan Beulich wrote:
>>> wakeup_32 sets %ds and %es to BOOT_DS, while leaving %fs at what
>>> wakeup_start did set it to, and %gs at whatever BIOS did load into it.
>>> All of this may end up confusing the first load_segments() to run on
>>> the BSP after resume, in particular allowing a non-nul selector value
>>> to be left in %fs.
>>>
>>> Alongside %ss, also put all other data segment registers into the same
>>> state that the boot and CPU bringup paths put them in.
>>>
>>> Reported-by: M. Vefa Bicakci <m.v.b@runbox.com>
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>
>>> --- a/xen/arch/x86/acpi/wakeup_prot.S
>>> +++ b/xen/arch/x86/acpi/wakeup_prot.S
>>> @@ -52,6 +52,16 @@ ENTRY(s3_resume)
>>>          mov     %eax, %ss
>>>          mov     saved_rsp(%rip), %rsp
>>>  
>>> +        /*
>>> +         * Also put other segment registers into known state, like would
>>> +         * be done on the boot path. This is in particular necessary for
>>> +         * the first load_segments() to work as intended.
>>> +         */
>> I don't think the comment is helpful, not least because it refers to a
>> broken behaviour in load_segemnts() which is soon going to change anyway.
> Well, I can drop it. I merely thought I'd be nice and comment my
> code once in a while (and the comment could be dropped / adjusted
> when load_segments() changes)...
>
>> We've literally just loaded the GDT, at which point reloading all
>> segments *is* the expected thing to do.
> In a way, unless some/all are assumed to already hold a nul selector.
>
>> I'd recommend that the diff be simply:
>>
>> diff --git a/xen/arch/x86/acpi/wakeup_prot.S
>> b/xen/arch/x86/acpi/wakeup_prot.S
>> index dcc7e2327d..a2c41c4f3f 100644
>> --- a/xen/arch/x86/acpi/wakeup_prot.S
>> +++ b/xen/arch/x86/acpi/wakeup_prot.S
>> @@ -49,6 +49,10 @@ ENTRY(s3_resume)
>>          mov     %rax, %cr0
>>  
>>          mov     $__HYPERVISOR_DS64, %eax
>> +        mov     %eax, %ds
>> +        mov     %eax, %es
>> +        mov     %eax, %fs
>> +        mov     %eax, %gs
>>          mov     %eax, %ss
>>          mov     saved_rsp(%rip), %rsp
> So I had specifically elected to not put the addition there, to make
> sure the stack would get established first. But seeing both Roger
> and you ask me to do otherwise - well, so be it then.

There is no IDT.  Any fault is will be triple, irrespective of the exact
code layout.

This sequence actually matches what we have in __high_start().

I don't think it is wise to write code which presumes that
__HYPERVISOR_DS64 is 0 (it happens to be, but could easily be 0xe010 as
well), or that the trampoline has fixed behaviours for the segments.

~Andrew
M. Vefa Bicakci July 29, 2020, 11:29 p.m. UTC | #5
On 7/23/20 7:00 PM, Andrew Cooper wrote:
> On 23/07/2020 16:19, Jan Beulich wrote:
>> On 23.07.2020 16:40, Andrew Cooper wrote:
>>> On 20/07/2020 16:20, Jan Beulich wrote:
>>>> wakeup_32 sets %ds and %es to BOOT_DS, while leaving %fs at what
>>>> wakeup_start did set it to, and %gs at whatever BIOS did load into it.
>>>> All of this may end up confusing the first load_segments() to run on
>>>> the BSP after resume, in particular allowing a non-nul selector value
>>>> to be left in %fs.
>>>>
>>>> Alongside %ss, also put all other data segment registers into the same
>>>> state that the boot and CPU bringup paths put them in.
>>>>
>>>> Reported-by: M. Vefa Bicakci <m.v.b@runbox.com>
>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>>
>>>> --- a/xen/arch/x86/acpi/wakeup_prot.S
>>>> +++ b/xen/arch/x86/acpi/wakeup_prot.S
>>>> @@ -52,6 +52,16 @@ ENTRY(s3_resume)
>>>>           mov     %eax, %ss
>>>>           mov     saved_rsp(%rip), %rsp
>>>>   
>>>> +        /*
>>>> +         * Also put other segment registers into known state, like would
>>>> +         * be done on the boot path. This is in particular necessary for
>>>> +         * the first load_segments() to work as intended.
>>>> +         */
>>> I don't think the comment is helpful, not least because it refers to a
>>> broken behaviour in load_segemnts() which is soon going to change anyway.
>> Well, I can drop it. I merely thought I'd be nice and comment my
>> code once in a while (and the comment could be dropped / adjusted
>> when load_segments() changes)...
>>
>>> We've literally just loaded the GDT, at which point reloading all
>>> segments *is* the expected thing to do.
>> In a way, unless some/all are assumed to already hold a nul selector.
>>
>>> I'd recommend that the diff be simply:
>>>
>>> diff --git a/xen/arch/x86/acpi/wakeup_prot.S
>>> b/xen/arch/x86/acpi/wakeup_prot.S
>>> index dcc7e2327d..a2c41c4f3f 100644
>>> --- a/xen/arch/x86/acpi/wakeup_prot.S
>>> +++ b/xen/arch/x86/acpi/wakeup_prot.S
>>> @@ -49,6 +49,10 @@ ENTRY(s3_resume)
>>>           mov     %rax, %cr0
>>>   
>>>           mov     $__HYPERVISOR_DS64, %eax
>>> +        mov     %eax, %ds
>>> +        mov     %eax, %es
>>> +        mov     %eax, %fs
>>> +        mov     %eax, %gs
>>>           mov     %eax, %ss
>>>           mov     saved_rsp(%rip), %rsp
>> So I had specifically elected to not put the addition there, to make
>> sure the stack would get established first. But seeing both Roger
>> and you ask me to do otherwise - well, so be it then.
> 
> There is no IDT.  Any fault is will be triple, irrespective of the exact
> code layout.
> 
> This sequence actually matches what we have in __high_start().
> 
> I don't think it is wise to write code which presumes that
> __HYPERVISOR_DS64 is 0 (it happens to be, but could easily be 0xe010 as
> well), or that the trampoline has fixed behaviours for the segments.

Hello Jan and Andrew,

Is there anything I can do to help with the delivery/merging of this patch?

If it would help, I can prepare and publish a patch according to Andrew's
comments. Given that the patch is not my work though, I assume that it
would be appropriate for me credit both of you in the commit message and
add a Signed-off-by tag in the commit message for each of you.

Vefa
Andrew Cooper July 29, 2020, 11:31 p.m. UTC | #6
On 30/07/2020 00:29, M. Vefa Bicakci wrote:
> On 7/23/20 7:00 PM, Andrew Cooper wrote:
>> On 23/07/2020 16:19, Jan Beulich wrote:
>>> On 23.07.2020 16:40, Andrew Cooper wrote:
>>>> On 20/07/2020 16:20, Jan Beulich wrote:
>>>>> wakeup_32 sets %ds and %es to BOOT_DS, while leaving %fs at what
>>>>> wakeup_start did set it to, and %gs at whatever BIOS did load into
>>>>> it.
>>>>> All of this may end up confusing the first load_segments() to run on
>>>>> the BSP after resume, in particular allowing a non-nul selector value
>>>>> to be left in %fs.
>>>>>
>>>>> Alongside %ss, also put all other data segment registers into the
>>>>> same
>>>>> state that the boot and CPU bringup paths put them in.
>>>>>
>>>>> Reported-by: M. Vefa Bicakci <m.v.b@runbox.com>
>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>>>
>>>>> --- a/xen/arch/x86/acpi/wakeup_prot.S
>>>>> +++ b/xen/arch/x86/acpi/wakeup_prot.S
>>>>> @@ -52,6 +52,16 @@ ENTRY(s3_resume)
>>>>>           mov     %eax, %ss
>>>>>           mov     saved_rsp(%rip), %rsp
>>>>>   +        /*
>>>>> +         * Also put other segment registers into known state,
>>>>> like would
>>>>> +         * be done on the boot path. This is in particular
>>>>> necessary for
>>>>> +         * the first load_segments() to work as intended.
>>>>> +         */
>>>> I don't think the comment is helpful, not least because it refers to a
>>>> broken behaviour in load_segemnts() which is soon going to change
>>>> anyway.
>>> Well, I can drop it. I merely thought I'd be nice and comment my
>>> code once in a while (and the comment could be dropped / adjusted
>>> when load_segments() changes)...
>>>
>>>> We've literally just loaded the GDT, at which point reloading all
>>>> segments *is* the expected thing to do.
>>> In a way, unless some/all are assumed to already hold a nul selector.
>>>
>>>> I'd recommend that the diff be simply:
>>>>
>>>> diff --git a/xen/arch/x86/acpi/wakeup_prot.S
>>>> b/xen/arch/x86/acpi/wakeup_prot.S
>>>> index dcc7e2327d..a2c41c4f3f 100644
>>>> --- a/xen/arch/x86/acpi/wakeup_prot.S
>>>> +++ b/xen/arch/x86/acpi/wakeup_prot.S
>>>> @@ -49,6 +49,10 @@ ENTRY(s3_resume)
>>>>           mov     %rax, %cr0
>>>>             mov     $__HYPERVISOR_DS64, %eax
>>>> +        mov     %eax, %ds
>>>> +        mov     %eax, %es
>>>> +        mov     %eax, %fs
>>>> +        mov     %eax, %gs
>>>>           mov     %eax, %ss
>>>>           mov     saved_rsp(%rip), %rsp
>>> So I had specifically elected to not put the addition there, to make
>>> sure the stack would get established first. But seeing both Roger
>>> and you ask me to do otherwise - well, so be it then.
>>
>> There is no IDT.  Any fault is will be triple, irrespective of the exact
>> code layout.
>>
>> This sequence actually matches what we have in __high_start().
>>
>> I don't think it is wise to write code which presumes that
>> __HYPERVISOR_DS64 is 0 (it happens to be, but could easily be 0xe010 as
>> well), or that the trampoline has fixed behaviours for the segments.
>
> Hello Jan and Andrew,
>
> Is there anything I can do to help with the delivery/merging of this
> patch?

It was committed last Friday.

https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=55f8c389d4348cc517946fdcb10794112458e81e

I presume Jan will backport it to stable trees when he's not OoO.

~Andrew
M. Vefa Bicakci July 29, 2020, 11:38 p.m. UTC | #7
On 7/30/20 2:31 AM, Andrew Cooper wrote:
> On 30/07/2020 00:29, M. Vefa Bicakci wrote:
>> On 7/23/20 7:00 PM, Andrew Cooper wrote:
>>> On 23/07/2020 16:19, Jan Beulich wrote:
>>>> On 23.07.2020 16:40, Andrew Cooper wrote:
>>>>> On 20/07/2020 16:20, Jan Beulich wrote:
>>>>>> wakeup_32 sets %ds and %es to BOOT_DS, while leaving %fs at what
>>>>>> wakeup_start did set it to, and %gs at whatever BIOS did load into
>>>>>> it.
>>>>>> All of this may end up confusing the first load_segments() to run on
>>>>>> the BSP after resume, in particular allowing a non-nul selector value
>>>>>> to be left in %fs.
>>>>>>
>>>>>> Alongside %ss, also put all other data segment registers into the
>>>>>> same
>>>>>> state that the boot and CPU bringup paths put them in.
>>>>>>
>>>>>> Reported-by: M. Vefa Bicakci <m.v.b@runbox.com>
>>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>>>>
>>>>>> --- a/xen/arch/x86/acpi/wakeup_prot.S
>>>>>> +++ b/xen/arch/x86/acpi/wakeup_prot.S
>>>>>> @@ -52,6 +52,16 @@ ENTRY(s3_resume)
>>>>>>            mov     %eax, %ss
>>>>>>            mov     saved_rsp(%rip), %rsp
>>>>>>    +        /*
>>>>>> +         * Also put other segment registers into known state,
>>>>>> like would
>>>>>> +         * be done on the boot path. This is in particular
>>>>>> necessary for
>>>>>> +         * the first load_segments() to work as intended.
>>>>>> +         */
>>>>> I don't think the comment is helpful, not least because it refers to a
>>>>> broken behaviour in load_segemnts() which is soon going to change
>>>>> anyway.
>>>> Well, I can drop it. I merely thought I'd be nice and comment my
>>>> code once in a while (and the comment could be dropped / adjusted
>>>> when load_segments() changes)...
>>>>
>>>>> We've literally just loaded the GDT, at which point reloading all
>>>>> segments *is* the expected thing to do.
>>>> In a way, unless some/all are assumed to already hold a nul selector.
>>>>
>>>>> I'd recommend that the diff be simply:
>>>>>
>>>>> diff --git a/xen/arch/x86/acpi/wakeup_prot.S
>>>>> b/xen/arch/x86/acpi/wakeup_prot.S
>>>>> index dcc7e2327d..a2c41c4f3f 100644
>>>>> --- a/xen/arch/x86/acpi/wakeup_prot.S
>>>>> +++ b/xen/arch/x86/acpi/wakeup_prot.S
>>>>> @@ -49,6 +49,10 @@ ENTRY(s3_resume)
>>>>>            mov     %rax, %cr0
>>>>>              mov     $__HYPERVISOR_DS64, %eax
>>>>> +        mov     %eax, %ds
>>>>> +        mov     %eax, %es
>>>>> +        mov     %eax, %fs
>>>>> +        mov     %eax, %gs
>>>>>            mov     %eax, %ss
>>>>>            mov     saved_rsp(%rip), %rsp
>>>> So I had specifically elected to not put the addition there, to make
>>>> sure the stack would get established first. But seeing both Roger
>>>> and you ask me to do otherwise - well, so be it then.
>>>
>>> There is no IDT.  Any fault is will be triple, irrespective of the exact
>>> code layout.
>>>
>>> This sequence actually matches what we have in __high_start().
>>>
>>> I don't think it is wise to write code which presumes that
>>> __HYPERVISOR_DS64 is 0 (it happens to be, but could easily be 0xe010 as
>>> well), or that the trampoline has fixed behaviours for the segments.
>>
>> Hello Jan and Andrew,
>>
>> Is there anything I can do to help with the delivery/merging of this
>> patch?
> 
> It was committed last Friday.
> 
> https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=55f8c389d4348cc517946fdcb10794112458e81e
> 
> I presume Jan will backport it to stable trees when he's not OoO.

Great -- thanks! (And sorry for not checking the git tree prior to
sending my e-mail.)

Vefa
diff mbox series

Patch

--- a/xen/arch/x86/acpi/wakeup_prot.S
+++ b/xen/arch/x86/acpi/wakeup_prot.S
@@ -52,6 +52,16 @@  ENTRY(s3_resume)
         mov     %eax, %ss
         mov     saved_rsp(%rip), %rsp
 
+        /*
+         * Also put other segment registers into known state, like would
+         * be done on the boot path. This is in particular necessary for
+         * the first load_segments() to work as intended.
+         */
+        mov     %eax, %ds
+        mov     %eax, %es
+        mov     %eax, %fs
+        mov     %eax, %gs
+
         /* Reload code selector */
         pushq   $__HYPERVISOR_CS
         leaq    1f(%rip),%rax