diff mbox series

pc-bios/s390x: Pack ResetInfo struct

Message ID 20200205182126.13010-1-jjherne@linux.ibm.com (mailing list archive)
State New, archived
Headers show
Series pc-bios/s390x: Pack ResetInfo struct | expand

Commit Message

Jason J. Herne Feb. 5, 2020, 6:21 p.m. UTC
This fixes vfio-ccw when booting non-Linux operating systems. Without this
struct being packed, a few extra bytes of low core memory get overwritten when
we  assign a value to memory address 0 in jump_to_IPL_2. This is enough to
cause some non-Linux OSes of fail when booting.

The problem was introduced by:
5c6f0d5f46a77d77 "pc-bios/s390x: Fix reset psw mask".

The fix is to pack the struct thereby removing the 4 bytes of padding that get
added at the end, likely to allow an array of these structs to naturally align
on an 8-byte boundary.

Fixes: 5c6f0d5f46a7 ("pc-bios/s390x: Fix reset psw mask")
CC: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
---
 pc-bios/s390-ccw/jump2ipl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Cornelia Huck Feb. 6, 2020, 9:55 a.m. UTC | #1
On Wed,  5 Feb 2020 13:21:26 -0500
"Jason J. Herne" <jjherne@linux.ibm.com> wrote:

> This fixes vfio-ccw when booting non-Linux operating systems. Without this
> struct being packed, a few extra bytes of low core memory get overwritten when
> we  assign a value to memory address 0 in jump_to_IPL_2. This is enough to
> cause some non-Linux OSes of fail when booting.

s/of/to/

> 
> The problem was introduced by:
> 5c6f0d5f46a77d77 "pc-bios/s390x: Fix reset psw mask".

So, what introduced the problem was turning two 32 bit values into a 64
bit value?

> 
> The fix is to pack the struct thereby removing the 4 bytes of padding that get
> added at the end, likely to allow an array of these structs to naturally align
> on an 8-byte boundary.
> 
> Fixes: 5c6f0d5f46a7 ("pc-bios/s390x: Fix reset psw mask")
> CC: Janosch Frank <frankja@linux.ibm.com>
> Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
> ---
>  pc-bios/s390-ccw/jump2ipl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c
> index da13c43cc0..1e9eaa037f 100644
> --- a/pc-bios/s390-ccw/jump2ipl.c
> +++ b/pc-bios/s390-ccw/jump2ipl.c
> @@ -18,7 +18,7 @@
>  typedef struct ResetInfo {
>      uint64_t ipl_psw;
>      uint32_t ipl_continue;
> -} ResetInfo;
> +} __attribute__((packed)) ResetInfo;
>  
>  static ResetInfo save;
>  

I'm wondering if we have more stuff like that lurking in the bios.
Christian Borntraeger Feb. 6, 2020, 10:09 a.m. UTC | #2
On 05.02.20 19:21, Jason J. Herne wrote:
> This fixes vfio-ccw when booting non-Linux operating systems. Without this
> struct being packed, a few extra bytes of low core memory get overwritten when
> we  assign a value to memory address 0 in jump_to_IPL_2. This is enough to
> cause some non-Linux OSes of fail when booting.
> 
> The problem was introduced by:
> 5c6f0d5f46a77d77 "pc-bios/s390x: Fix reset psw mask".
> 
> The fix is to pack the struct thereby removing the 4 bytes of padding that get
> added at the end, likely to allow an array of these structs to naturally align
> on an 8-byte boundary.
> 
> Fixes: 5c6f0d5f46a7 ("pc-bios/s390x: Fix reset psw mask")
> CC: Janosch Frank <frankja@linux.ibm.com>
> Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
> ---
>  pc-bios/s390-ccw/jump2ipl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c
> index da13c43cc0..1e9eaa037f 100644
> --- a/pc-bios/s390-ccw/jump2ipl.c
> +++ b/pc-bios/s390-ccw/jump2ipl.c
> @@ -18,7 +18,7 @@
>  typedef struct ResetInfo {
>      uint64_t ipl_psw;
>      uint32_t ipl_continue;
> -} ResetInfo;
> +} __attribute__((packed)) ResetInfo;
>  
>  static ResetInfo save;

Just looked into that.

We do save the old content in "save" and restore the old memory content.

static void jump_to_IPL_2(void)
{
    ResetInfo *current = 0;

    void (*ipl)(void) = (void *) (uint64_t) current->ipl_continue;
--->*current = save;
    ipl(); /* should not return */
}

void jump_to_IPL_code(uint64_t address)
{
    /* store the subsystem information _after_ the bootmap was loaded */
    write_subsystem_identification();

    /* prevent unknown IPL types in the guest */
    if (iplb.pbt == S390_IPL_TYPE_QEMU_SCSI) {
        iplb.pbt = S390_IPL_TYPE_CCW;
        set_iplb(&iplb);
    }

    /*
     * The IPL PSW is at address 0. We also must not overwrite the
     * content of non-BIOS memory after we loaded the guest, so we
     * save the original content and restore it in jump_to_IPL_2.
     */
    ResetInfo *current = 0;

--->save = *current;



does something like


diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c
index da13c43cc0..8839226803 100644
--- a/pc-bios/s390-ccw/jump2ipl.c
+++ b/pc-bios/s390-ccw/jump2ipl.c
@@ -18,6 +18,7 @@
 typedef struct ResetInfo {
     uint64_t ipl_psw;
     uint32_t ipl_continue;
+    uint32_t pad;
 } ResetInfo;
 
 static ResetInfo save;


also work? If yes, both variants are valid. Either packed or explicit padding.
Thomas Huth Feb. 6, 2020, 11 a.m. UTC | #3
On 06/02/2020 11.09, Christian Borntraeger wrote:
> 
> 
> On 05.02.20 19:21, Jason J. Herne wrote:
>> This fixes vfio-ccw when booting non-Linux operating systems. Without this
>> struct being packed, a few extra bytes of low core memory get overwritten when
>> we  assign a value to memory address 0 in jump_to_IPL_2. This is enough to
>> cause some non-Linux OSes of fail when booting.
>>
>> The problem was introduced by:
>> 5c6f0d5f46a77d77 "pc-bios/s390x: Fix reset psw mask".
>>
>> The fix is to pack the struct thereby removing the 4 bytes of padding that get
>> added at the end, likely to allow an array of these structs to naturally align
>> on an 8-byte boundary.
>>
>> Fixes: 5c6f0d5f46a7 ("pc-bios/s390x: Fix reset psw mask")
>> CC: Janosch Frank <frankja@linux.ibm.com>
>> Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
>> ---
>>  pc-bios/s390-ccw/jump2ipl.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c
>> index da13c43cc0..1e9eaa037f 100644
>> --- a/pc-bios/s390-ccw/jump2ipl.c
>> +++ b/pc-bios/s390-ccw/jump2ipl.c
>> @@ -18,7 +18,7 @@
>>  typedef struct ResetInfo {
>>      uint64_t ipl_psw;
>>      uint32_t ipl_continue;
>> -} ResetInfo;
>> +} __attribute__((packed)) ResetInfo;
>>  
>>  static ResetInfo save;
> 
> Just looked into that.
> 
> We do save the old content in "save" and restore the old memory content.
> 
> static void jump_to_IPL_2(void)
> {
>     ResetInfo *current = 0;
> 
>     void (*ipl)(void) = (void *) (uint64_t) current->ipl_continue;
> --->*current = save;
>     ipl(); /* should not return */
> }
> 
> void jump_to_IPL_code(uint64_t address)
> {
>     /* store the subsystem information _after_ the bootmap was loaded */
>     write_subsystem_identification();
> 
>     /* prevent unknown IPL types in the guest */
>     if (iplb.pbt == S390_IPL_TYPE_QEMU_SCSI) {
>         iplb.pbt = S390_IPL_TYPE_CCW;
>         set_iplb(&iplb);
>     }
> 
>     /*
>      * The IPL PSW is at address 0. We also must not overwrite the
>      * content of non-BIOS memory after we loaded the guest, so we
>      * save the original content and restore it in jump_to_IPL_2.
>      */
>     ResetInfo *current = 0;
> 
> --->save = *current;

Right, and this should also work without your modification. I've stared
at the code a couple of weeks ago, looking for a very similar issue:

 https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg03484.html

... but in the end, the problem was something else:

 https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg03520.html

and the fix had been done in the startup code of the test:

 https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg04225.html

So I'd guess that you face the very same problem here. That means, you
either have to convince the non-Linux OS to check their startup code
whether they depend on zeroed registers somewhere, or we fix this issue
for good in jump_to_IPL_2() by clearing the registers there before
jumping into the OS code (which we likely should do anyway since the OS
may expect a clean state).

 Thomas
Christian Borntraeger Feb. 7, 2020, 11:28 a.m. UTC | #4
Jason,

can you run objdump -Sdr on jump2ipl.o on a broken variant?


On 06.02.20 12:00, Thomas Huth wrote:
> On 06/02/2020 11.09, Christian Borntraeger wrote:
>>
>>
>> On 05.02.20 19:21, Jason J. Herne wrote:
>>> This fixes vfio-ccw when booting non-Linux operating systems. Without this
>>> struct being packed, a few extra bytes of low core memory get overwritten when
>>> we  assign a value to memory address 0 in jump_to_IPL_2. This is enough to
>>> cause some non-Linux OSes of fail when booting.
>>>
>>> The problem was introduced by:
>>> 5c6f0d5f46a77d77 "pc-bios/s390x: Fix reset psw mask".
>>>
>>> The fix is to pack the struct thereby removing the 4 bytes of padding that get
>>> added at the end, likely to allow an array of these structs to naturally align
>>> on an 8-byte boundary.
>>>
>>> Fixes: 5c6f0d5f46a7 ("pc-bios/s390x: Fix reset psw mask")
>>> CC: Janosch Frank <frankja@linux.ibm.com>
>>> Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
>>> ---
>>>  pc-bios/s390-ccw/jump2ipl.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c
>>> index da13c43cc0..1e9eaa037f 100644
>>> --- a/pc-bios/s390-ccw/jump2ipl.c
>>> +++ b/pc-bios/s390-ccw/jump2ipl.c
>>> @@ -18,7 +18,7 @@
>>>  typedef struct ResetInfo {
>>>      uint64_t ipl_psw;
>>>      uint32_t ipl_continue;
>>> -} ResetInfo;
>>> +} __attribute__((packed)) ResetInfo;
>>>  
>>>  static ResetInfo save;
>>
>> Just looked into that.
>>
>> We do save the old content in "save" and restore the old memory content.
>>
>> static void jump_to_IPL_2(void)
>> {
>>     ResetInfo *current = 0;
>>
>>     void (*ipl)(void) = (void *) (uint64_t) current->ipl_continue;
>> --->*current = save;
>>     ipl(); /* should not return */
>> }
>>
>> void jump_to_IPL_code(uint64_t address)
>> {
>>     /* store the subsystem information _after_ the bootmap was loaded */
>>     write_subsystem_identification();
>>
>>     /* prevent unknown IPL types in the guest */
>>     if (iplb.pbt == S390_IPL_TYPE_QEMU_SCSI) {
>>         iplb.pbt = S390_IPL_TYPE_CCW;
>>         set_iplb(&iplb);
>>     }
>>
>>     /*
>>      * The IPL PSW is at address 0. We also must not overwrite the
>>      * content of non-BIOS memory after we loaded the guest, so we
>>      * save the original content and restore it in jump_to_IPL_2.
>>      */
>>     ResetInfo *current = 0;
>>
>> --->save = *current;
> 
> Right, and this should also work without your modification. I've stared
> at the code a couple of weeks ago, looking for a very similar issue:
> 
>  https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg03484.html
> 
> ... but in the end, the problem was something else:
> 
>  https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg03520.html
> 
> and the fix had been done in the startup code of the test:
> 
>  https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg04225.html
> 
> So I'd guess that you face the very same problem here. That means, you
> either have to convince the non-Linux OS to check their startup code
> whether they depend on zeroed registers somewhere, or we fix this issue
> for good in jump_to_IPL_2() by clearing the registers there before
> jumping into the OS code (which we likely should do anyway since the OS
> may expect a clean state).
> 
>  Thomas
> 
>
Jason J. Herne Feb. 7, 2020, 2:02 p.m. UTC | #5
On 2/7/20 6:28 AM, Christian Borntraeger wrote:
> Jason,
> 
> can you run objdump -Sdr on jump2ipl.o on a broken variant?
> 
> 
To keep the volume lower, I've only pasted the output that I think you're interested in. 
If you want to see the entire thing just let me know.

static void jump_to_IPL_2(void)
{
  1d0:	eb bf f0 58 00 24 	stmg	%r11,%r15,88(%r15)
  1d6:	a7 fb ff 50       	aghi	%r15,-176
  1da:	b9 04 00 bf       	lgr	%r11,%r15
     ResetInfo *current = 0;
  1de:	a7 19 00 00       	lghi	%r1,0
  1e2:	e3 10 b0 a8 00 24 	stg	%r1,168(%r11)

     void (*ipl)(void) = (void *) (uint64_t) current->ipl_continue;
  1e8:	e3 10 b0 a8 00 04 	lg	%r1,168(%r11)
  1ee:	58 10 10 08       	l	%r1,8(%r1)
  1f2:	b9 16 00 11       	llgfr	%r1,%r1
  1f6:	e3 10 b0 a0 00 24 	stg	%r1,160(%r11)
     *current = save;
  1fc:	e3 10 b0 a8 00 04 	lg	%r1,168(%r11)
  202:	c0 20 00 00 00 00 	larl	%r2,202 <jump_to_IPL_2+0x32>
			204: R_390_PC32DBL	.bss+0x2
  208:	eb 23 20 00 00 04 	lmg	%r2,%r3,0(%r2)
  20e:	eb 23 10 00 00 24 	stmg	%r2,%r3,0(%r1)
     ipl(); /* should not return */
  214:	e3 10 b0 a0 00 04 	lg	%r1,160(%r11)
  21a:	0d e1             	basr	%r14,%r1
}
  21c:	18 00             	lr	%r0,%r0
  21e:	eb bf b1 08 00 04 	lmg	%r11,%r15,264(%r11)
  224:	07 fe             	br	%r14
  226:	07 07             	nopr	%r7

0000000000000228 <jump_to_IPL_code>:

void jump_to_IPL_code(uint64_t address)
{
  228:	eb bf f0 58 00 24 	stmg	%r11,%r15,88(%r15)
  22e:	c0 d0 00 00 00 00 	larl	%r13,22e <jump_to_IPL_code+0x6>
			230: R_390_PC32DBL	.rodata+0x2a
  234:	a7 fb ff 50       	aghi	%r15,-176
  238:	b9 04 00 bf       	lgr	%r11,%r15
  23c:	e3 20 b0 a0 00 24 	stg	%r2,160(%r11)
     /* store the subsystem information _after_ the bootmap was loaded */
     write_subsystem_identification();
  242:	c0 e5 00 00 00 00 	brasl	%r14,242 <jump_to_IPL_code+0x1a>
			244: R_390_PLT32DBL	write_subsystem_identification+0x2

     /* prevent unknown IPL types in the guest */
     if (iplb.pbt == S390_IPL_TYPE_QEMU_SCSI) {
  248:	c0 10 00 00 00 00 	larl	%r1,248 <jump_to_IPL_code+0x20>
			24a: R_390_GOTENT	iplb+0x2
  24e:	e3 10 10 00 00 04 	lg	%r1,0(%r1)
  254:	43 10 10 0c       	ic	%r1,12(%r1)
  258:	a7 28 00 ff       	lhi	%r2,255
  25c:	14 12             	nr	%r1,%r2
  25e:	a7 1e 00 ff       	chi	%r1,255
  262:	a7 74 00 15       	jne	28c <jump_to_IPL_code+0x64>
         iplb.pbt = S390_IPL_TYPE_CCW;
  266:	c0 10 00 00 00 00 	larl	%r1,266 <jump_to_IPL_code+0x3e>
			268: R_390_GOTENT	iplb+0x2
  26c:	e3 10 10 00 00 04 	lg	%r1,0(%r1)
  272:	92 02 10 0c       	mvi	12(%r1),2
         set_iplb(&iplb);
  276:	c0 10 00 00 00 00 	larl	%r1,276 <jump_to_IPL_code+0x4e>
			278: R_390_GOTENT	iplb+0x2
  27c:	e3 10 10 00 00 04 	lg	%r1,0(%r1)
  282:	b9 04 00 21       	lgr	%r2,%r1
  286:	c0 e5 ff ff ff 75 	brasl	%r14,170 <set_iplb>
     /*
      * The IPL PSW is at address 0. We also must not overwrite the
      * content of non-BIOS memory after we loaded the guest, so we
      * save the original content and restore it in jump_to_IPL_2.
      */
     ResetInfo *current = 0;
  28c:	a7 19 00 00       	lghi	%r1,0
  290:	e3 10 b0 a8 00 24 	stg	%r1,168(%r11)

     save = *current;
  296:	c0 10 00 00 00 00 	larl	%r1,296 <jump_to_IPL_code+0x6e>
			298: R_390_PC32DBL	.bss+0x2
  29c:	e3 20 b0 a8 00 04 	lg	%r2,168(%r11)
  2a2:	eb 23 20 00 00 04 	lmg	%r2,%r3,0(%r2)
  2a8:	eb 23 10 00 00 24 	stmg	%r2,%r3,0(%r1)

     current->ipl_psw = (uint64_t) &jump_to_IPL_2;
  2ae:	c0 20 ff ff ff 91 	larl	%r2,1d0 <jump_to_IPL_2>
  2b4:	e3 10 b0 a8 00 04 	lg	%r1,168(%r11)
  2ba:	e3 20 10 00 00 24 	stg	%r2,0(%r1)
     current->ipl_psw |= RESET_PSW_MASK;
  2c0:	e3 10 b0 a8 00 04 	lg	%r1,168(%r11)
  2c6:	e3 10 10 00 00 04 	lg	%r1,0(%r1)
  2cc:	e3 20 d0 00 00 04 	lg	%r2,0(%r13)
  2d2:	b9 81 00 21       	ogr	%r2,%r1
  2d6:	e3 10 b0 a8 00 04 	lg	%r1,168(%r11)
  2dc:	e3 20 10 00 00 24 	stg	%r2,0(%r1)
     current->ipl_continue = address & 0x7fffffff;
  2e2:	e3 10 b0 a0 00 04 	lg	%r1,160(%r11)
  2e8:	b9 17 00 21       	llgtr	%r2,%r1
  2ec:	e3 10 b0 a8 00 04 	lg	%r1,168(%r11)
  2f2:	50 20 10 08       	st	%r2,8(%r1)

     debug_print_int("set IPL addr to", current->ipl_continue);
  2f6:	e3 10 b0 a8 00 04 	lg	%r1,168(%r11)
  2fc:	58 10 10 08       	l	%r1,8(%r1)
  300:	b9 16 00 11       	llgfr	%r1,%r1
  304:	b9 04 00 31       	lgr	%r3,%r1
  308:	c0 20 00 00 00 00 	larl	%r2,308 <jump_to_IPL_code+0xe0>
			30a: R_390_PC32DBL	.rodata+0x2
  30e:	c0 e5 ff ff ff 4d 	brasl	%r14,1a8 <debug_print_int>

     /* Ensure the guest output starts fresh */
     sclp_print("\n");
  314:	c0 20 00 00 00 00 	larl	%r2,314 <jump_to_IPL_code+0xec>
			316: R_390_PC32DBL	.rodata+0x12
  31a:	c0 e5 00 00 00 00 	brasl	%r14,31a <jump_to_IPL_code+0xf2>
			31c: R_390_PLT32DBL	sclp_print+0x2
     /*
      * HACK ALERT.
      * We use the load normal reset to keep r15 unchanged. jump_to_IPL_2
      * can then use r15 as its stack pointer.
      */
     asm volatile("lghi 1,1\n\t"
  320:	a7 19 00 01       	lghi	%r1,1
  324:	83 11 03 08       	diag	%r1,%r1,776
                  "diag 1,1,0x308\n\t"
                  : : : "1", "memory");
     panic("\n! IPL returns !\n");
  328:	c0 20 00 00 00 00 	larl	%r2,328 <jump_to_IPL_code+0x100>
			32a: R_390_PC32DBL	.rodata+0x14
  32e:	c0 e5 00 00 00 00 	brasl	%r14,32e <jump_to_IPL_code+0x106>
			330: R_390_PLT32DBL	panic+0x2
}
  334:	18 00             	lr	%r0,%r0
  336:	eb bf b1 08 00 04 	lmg	%r11,%r15,264(%r11)
  33c:	07 fe             	br	%r14
  33e:	07 07             	nopr	%r7
Jason J. Herne Feb. 13, 2020, 6:02 p.m. UTC | #6
On 2/6/20 5:09 AM, Christian Borntraeger wrote:
> 
> 
> On 05.02.20 19:21, Jason J. Herne wrote:
>> This fixes vfio-ccw when booting non-Linux operating systems. Without this
>> struct being packed, a few extra bytes of low core memory get overwritten when
>> we  assign a value to memory address 0 in jump_to_IPL_2. This is enough to
>> cause some non-Linux OSes of fail when booting.
>>
>> The problem was introduced by:
>> 5c6f0d5f46a77d77 "pc-bios/s390x: Fix reset psw mask".
>>
>> The fix is to pack the struct thereby removing the 4 bytes of padding that get
>> added at the end, likely to allow an array of these structs to naturally align
>> on an 8-byte boundary.
>>
>> Fixes: 5c6f0d5f46a7 ("pc-bios/s390x: Fix reset psw mask")
>> CC: Janosch Frank <frankja@linux.ibm.com>
>> Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
>> ---
>>   pc-bios/s390-ccw/jump2ipl.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c
>> index da13c43cc0..1e9eaa037f 100644
>> --- a/pc-bios/s390-ccw/jump2ipl.c
>> +++ b/pc-bios/s390-ccw/jump2ipl.c
>> @@ -18,7 +18,7 @@
>>   typedef struct ResetInfo {
>>       uint64_t ipl_psw;
>>       uint32_t ipl_continue;
>> -} ResetInfo;
>> +} __attribute__((packed)) ResetInfo;
>>   
>>   static ResetInfo save;
> 
> Just looked into that.
> 
> We do save the old content in "save" and restore the old memory content.
> 
> static void jump_to_IPL_2(void)
> {
>      ResetInfo *current = 0;
> 
>      void (*ipl)(void) = (void *) (uint64_t) current->ipl_continue;
> --->*current = save;
>      ipl(); /* should not return */
> }
> 
> void jump_to_IPL_code(uint64_t address)
> {
>      /* store the subsystem information _after_ the bootmap was loaded */
>      write_subsystem_identification();
> 
>      /* prevent unknown IPL types in the guest */
>      if (iplb.pbt == S390_IPL_TYPE_QEMU_SCSI) {
>          iplb.pbt = S390_IPL_TYPE_CCW;
>          set_iplb(&iplb);
>      }
> 
>      /*
>       * The IPL PSW is at address 0. We also must not overwrite the
>       * content of non-BIOS memory after we loaded the guest, so we
>       * save the original content and restore it in jump_to_IPL_2.
>       */
>      ResetInfo *current = 0;
> 
> --->save = *current;
> 
> 
> 
> does something like
> 
> 
> diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c
> index da13c43cc0..8839226803 100644
> --- a/pc-bios/s390-ccw/jump2ipl.c
> +++ b/pc-bios/s390-ccw/jump2ipl.c
> @@ -18,6 +18,7 @@
>   typedef struct ResetInfo {
>       uint64_t ipl_psw;
>       uint32_t ipl_continue;
> +    uint32_t pad;
>   } ResetInfo;
>   
>   static ResetInfo save;
> 
> 
> also work? If yes, both variants are valid. Either packed or explicit padding.
> 

I don't believe this will work. I think the problem is that we're overwriting too much 
memory when we cast address 0 as a ResetInfo and then overwrite it (*current = save). I 
think we need the struct to be sized at 12-bytes instead of 16.
Christian Borntraeger Feb. 13, 2020, 6:24 p.m. UTC | #7
On 13.02.20 19:02, Jason J. Herne wrote:
> On 2/6/20 5:09 AM, Christian Borntraeger wrote:
>>
>>
>> On 05.02.20 19:21, Jason J. Herne wrote:
>>> This fixes vfio-ccw when booting non-Linux operating systems. Without this
>>> struct being packed, a few extra bytes of low core memory get overwritten when
>>> we  assign a value to memory address 0 in jump_to_IPL_2. This is enough to
>>> cause some non-Linux OSes of fail when booting.
>>>
>>> The problem was introduced by:
>>> 5c6f0d5f46a77d77 "pc-bios/s390x: Fix reset psw mask".
>>>
>>> The fix is to pack the struct thereby removing the 4 bytes of padding that get
>>> added at the end, likely to allow an array of these structs to naturally align
>>> on an 8-byte boundary.
>>>
>>> Fixes: 5c6f0d5f46a7 ("pc-bios/s390x: Fix reset psw mask")
>>> CC: Janosch Frank <frankja@linux.ibm.com>
>>> Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
>>> ---
>>>   pc-bios/s390-ccw/jump2ipl.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c
>>> index da13c43cc0..1e9eaa037f 100644
>>> --- a/pc-bios/s390-ccw/jump2ipl.c
>>> +++ b/pc-bios/s390-ccw/jump2ipl.c
>>> @@ -18,7 +18,7 @@
>>>   typedef struct ResetInfo {
>>>       uint64_t ipl_psw;
>>>       uint32_t ipl_continue;
>>> -} ResetInfo;
>>> +} __attribute__((packed)) ResetInfo;
>>>     static ResetInfo save;
>>
>> Just looked into that.
>>
>> We do save the old content in "save" and restore the old memory content.
>>
>> static void jump_to_IPL_2(void)
>> {
>>      ResetInfo *current = 0;
>>
>>      void (*ipl)(void) = (void *) (uint64_t) current->ipl_continue;
>> --->*current = save;
>>      ipl(); /* should not return */
>> }
>>
>> void jump_to_IPL_code(uint64_t address)
>> {
>>      /* store the subsystem information _after_ the bootmap was loaded */
>>      write_subsystem_identification();
>>
>>      /* prevent unknown IPL types in the guest */
>>      if (iplb.pbt == S390_IPL_TYPE_QEMU_SCSI) {
>>          iplb.pbt = S390_IPL_TYPE_CCW;
>>          set_iplb(&iplb);
>>      }
>>
>>      /*
>>       * The IPL PSW is at address 0. We also must not overwrite the
>>       * content of non-BIOS memory after we loaded the guest, so we
>>       * save the original content and restore it in jump_to_IPL_2.
>>       */
>>      ResetInfo *current = 0;
>>
>> --->save = *current;
>>
>>
>>
>> does something like
>>
>>
>> diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c
>> index da13c43cc0..8839226803 100644
>> --- a/pc-bios/s390-ccw/jump2ipl.c
>> +++ b/pc-bios/s390-ccw/jump2ipl.c
>> @@ -18,6 +18,7 @@
>>   typedef struct ResetInfo {
>>       uint64_t ipl_psw;
>>       uint32_t ipl_continue;
>> +    uint32_t pad;
>>   } ResetInfo;
>>     static ResetInfo save;
>>
>>
>> also work? If yes, both variants are valid. Either packed or explicit padding.
>>
> 
> I don't believe this will work. I think the problem is that we're overwriting too much memory when we cast address 0 as a ResetInfo and then overwrite it (*current = save). I think we need the struct to be sized at 12-bytes instead of 16.
> 

The idea of the code is that we _save_ the original content from address 0 to save and _restore_ it before jumping into final code. I do not yet understand why this does not work.
Jason J. Herne Feb. 25, 2020, 10:23 a.m. UTC | #8
On 2/13/20 1:24 PM, Christian Borntraeger wrote:
...
>>> diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c
>>> index da13c43cc0..8839226803 100644
>>> --- a/pc-bios/s390-ccw/jump2ipl.c
>>> +++ b/pc-bios/s390-ccw/jump2ipl.c
>>> @@ -18,6 +18,7 @@
>>>    typedef struct ResetInfo {
>>>        uint64_t ipl_psw;
>>>        uint32_t ipl_continue;
>>> +    uint32_t pad;
>>>    } ResetInfo;
>>>      static ResetInfo save;
>>>
>>>
>>> also work? If yes, both variants are valid. Either packed or explicit padding.
>>>
>>
>> I don't believe this will work. I think the problem is that we're overwriting too much memory when we cast address 0 as a ResetInfo and then overwrite it (*current = save). I think we need the struct to be sized at 12-bytes instead of 16.
>>
> 
> The idea of the code is that we _save_ the original content from address 0 to save and _restore_ it before jumping into final code. I do not yet understand why this does not work.
> 

I've found the real problem here. Legacy operating systems that expect to start
in 32-bit addressing mode can fail if we leave junk in the high halves of our
64-bit registers. This is because some instructions (LA for example) are
bi-modal and operate differently depending on the machine's current addressing
mode.

In the case where we pack the struct, the compiler happens to use the mvc
instruction to load/store the current/save memory areas.

       *current = save;
   1fc:	e3 10 b0 a8 00 04 	lg	%r1,168(%r11)
   202:	c0 20 00 00 00 00 	larl	%r2,202 <jump_to_IPL_2+0x32>
			204: R_390_PC32DBL	.bss+0x2
   208:	d2 0b 10 00 20 00 	mvc	0(12,%r1),0(%r2)

Everything works as expected here, our legacy OS boots without issue.
However, in the case where we've packed this struct the compiler optimizes the
code and uses lmg/stmg instead of mvc to copy the data:

       *current = save;
   1fc:	e3 10 b0 a8 00 04 	lg	%r1,168(%r11)
   202:	c0 20 00 00 00 00 	larl	%r2,202 <jump_to_IPL_2+0x32>
			204: R_390_PC32DBL	.bss+0x2
   208:	eb 23 20 00 00 04 	lmg	%r2,%r3,0(%r2)
   20e:	eb 23 10 00 00 24 	stmg	%r2,%r3,0(%r1)

Depending on the data being copied, the high halves of the registers may contain
non-zero values. Example:

     r2             0x108000080000780        74309395999098752
     r3             0x601001800004368        432627142283510632

So, by sheer luck of the generated assembler, the patch happens to "fix" the
problem.  A real fix might be to insert inline assembler that clears the high
halves of the registers before we call ipl() in jump_to_IPL_2(). Can we think of
a better way to do that than 15 LLGTR instructions? :) Let me know your
thoughts.

jump_to_IPL_2 for easy reference:
     static void jump_to_IPL_2(void)
     {
         ResetInfo *current = 0;

         void (*ipl)(void) = (void *) (uint64_t) current->ipl_continue;
         *current = save;
         ipl(); /* should not return */
     }
Christian Borntraeger Feb. 25, 2020, 11:13 a.m. UTC | #9
On 25.02.20 11:23, Jason J. Herne wrote:
> On 2/13/20 1:24 PM, Christian Borntraeger wrote:
> ...
>>>> diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c
>>>> index da13c43cc0..8839226803 100644
>>>> --- a/pc-bios/s390-ccw/jump2ipl.c
>>>> +++ b/pc-bios/s390-ccw/jump2ipl.c
>>>> @@ -18,6 +18,7 @@
>>>>    typedef struct ResetInfo {
>>>>        uint64_t ipl_psw;
>>>>        uint32_t ipl_continue;
>>>> +    uint32_t pad;
>>>>    } ResetInfo;
>>>>      static ResetInfo save;
>>>>
>>>>
>>>> also work? If yes, both variants are valid. Either packed or explicit padding.
>>>>
>>>
>>> I don't believe this will work. I think the problem is that we're overwriting too much memory when we cast address 0 as a ResetInfo and then overwrite it (*current = save). I think we need the struct to be sized at 12-bytes instead of 16.
>>>
>>
>> The idea of the code is that we _save_ the original content from address 0 to save and _restore_ it before jumping into final code. I do not yet understand why this does not work.
>>
> 
> I've found the real problem here. Legacy operating systems that expect to start
> in 32-bit addressing mode can fail if we leave junk in the high halves of our
> 64-bit registers. This is because some instructions (LA for example) are
> bi-modal and operate differently depending on the machine's current addressing
> mode.
> 
> In the case where we pack the struct, the compiler happens to use the mvc
> instruction to load/store the current/save memory areas.
> 
>       *current = save;
>   1fc:    e3 10 b0 a8 00 04     lg    %r1,168(%r11)
>   202:    c0 20 00 00 00 00     larl    %r2,202 <jump_to_IPL_2+0x32>
>             204: R_390_PC32DBL    .bss+0x2
>   208:    d2 0b 10 00 20 00     mvc    0(12,%r1),0(%r2)
> 
> Everything works as expected here, our legacy OS boots without issue.
> However, in the case where we've packed this struct the compiler optimizes the
> code and uses lmg/stmg instead of mvc to copy the data:
> 
>       *current = save;
>   1fc:    e3 10 b0 a8 00 04     lg    %r1,168(%r11)
>   202:    c0 20 00 00 00 00     larl    %r2,202 <jump_to_IPL_2+0x32>
>             204: R_390_PC32DBL    .bss+0x2
>   208:    eb 23 20 00 00 04     lmg    %r2,%r3,0(%r2)
>   20e:    eb 23 10 00 00 24     stmg    %r2,%r3,0(%r1)
> 
> Depending on the data being copied, the high halves of the registers may contain
> non-zero values. Example:
> 
>     r2             0x108000080000780        74309395999098752
>     r3             0x601001800004368        432627142283510632
> 
> So, by sheer luck of the generated assembler, the patch happens to "fix" the
> problem.  A real fix might be to insert inline assembler that clears the high
> halves of the registers before we call ipl() in jump_to_IPL_2(). Can we think of
> a better way to do that than 15 LLGTR instructions? :) Let me know your
> thoughts

Does sam31 before the ipl() work?

> 
> jump_to_IPL_2 for easy reference:
>     static void jump_to_IPL_2(void)
>     {
>         ResetInfo *current = 0;
> 
>         void (*ipl)(void) = (void *) (uint64_t) current->ipl_continue;
>         *current = save;
>         ipl(); /* should not return */
>     }
> 
>
Jason J. Herne Feb. 25, 2020, 12:58 p.m. UTC | #10
On 2/25/20 6:13 AM, Christian Borntraeger wrote:
> 
> 
> On 25.02.20 11:23, Jason J. Herne wrote:
>> On 2/13/20 1:24 PM, Christian Borntraeger wrote:
>> ...
>>>>> diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c
>>>>> index da13c43cc0..8839226803 100644
>>>>> --- a/pc-bios/s390-ccw/jump2ipl.c
>>>>> +++ b/pc-bios/s390-ccw/jump2ipl.c
>>>>> @@ -18,6 +18,7 @@
>>>>>     typedef struct ResetInfo {
>>>>>         uint64_t ipl_psw;
>>>>>         uint32_t ipl_continue;
>>>>> +    uint32_t pad;
>>>>>     } ResetInfo;
>>>>>       static ResetInfo save;
>>>>>
>>>>>
>>>>> also work? If yes, both variants are valid. Either packed or explicit padding.
>>>>>
>>>>
>>>> I don't believe this will work. I think the problem is that we're overwriting too much memory when we cast address 0 as a ResetInfo and then overwrite it (*current = save). I think we need the struct to be sized at 12-bytes instead of 16.
>>>>
>>>
>>> The idea of the code is that we _save_ the original content from address 0 to save and _restore_ it before jumping into final code. I do not yet understand why this does not work.
>>>
>>
>> I've found the real problem here. Legacy operating systems that expect to start
>> in 32-bit addressing mode can fail if we leave junk in the high halves of our
>> 64-bit registers. This is because some instructions (LA for example) are
>> bi-modal and operate differently depending on the machine's current addressing
>> mode.
>>
>> In the case where we pack the struct, the compiler happens to use the mvc
>> instruction to load/store the current/save memory areas.
>>
>>        *current = save;
>>    1fc:    e3 10 b0 a8 00 04     lg    %r1,168(%r11)
>>    202:    c0 20 00 00 00 00     larl    %r2,202 <jump_to_IPL_2+0x32>
>>              204: R_390_PC32DBL    .bss+0x2
>>    208:    d2 0b 10 00 20 00     mvc    0(12,%r1),0(%r2)
>>
>> Everything works as expected here, our legacy OS boots without issue.
>> However, in the case where we've packed this struct the compiler optimizes the
>> code and uses lmg/stmg instead of mvc to copy the data:
>>
>>        *current = save;
>>    1fc:    e3 10 b0 a8 00 04     lg    %r1,168(%r11)
>>    202:    c0 20 00 00 00 00     larl    %r2,202 <jump_to_IPL_2+0x32>
>>              204: R_390_PC32DBL    .bss+0x2
>>    208:    eb 23 20 00 00 04     lmg    %r2,%r3,0(%r2)
>>    20e:    eb 23 10 00 00 24     stmg    %r2,%r3,0(%r1)
>>
>> Depending on the data being copied, the high halves of the registers may contain
>> non-zero values. Example:
>>
>>      r2             0x108000080000780        74309395999098752
>>      r3             0x601001800004368        432627142283510632
>>
>> So, by sheer luck of the generated assembler, the patch happens to "fix" the
>> problem.  A real fix might be to insert inline assembler that clears the high
>> halves of the registers before we call ipl() in jump_to_IPL_2(). Can we think of
>> a better way to do that than 15 LLGTR instructions? :) Let me know your
>> thoughts
> 
> Does sam31 before the ipl() work?
asm volatile ("sam31\n");

Inserting the above right before ipl(); does not change the outcome, the guest still fails.

This allows the guest to boot.

asm volatile ("llgtr %r2,%r2\n"
               "llgtr %r3,%r3\n");

My guess as to why sam31 does not work: The legacy OS is eventually doing a sam64 and the 
high halves of the registers are not subsequently cleared before use. I could be wrong 
about this though.
Christian Borntraeger Feb. 25, 2020, 3 p.m. UTC | #11
On 25.02.20 13:58, Jason J. Herne wrote:
> On 2/25/20 6:13 AM, Christian Borntraeger wrote:
>>
>>
>> On 25.02.20 11:23, Jason J. Herne wrote:
>>> On 2/13/20 1:24 PM, Christian Borntraeger wrote:
>>> ...
>>>>>> diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c
>>>>>> index da13c43cc0..8839226803 100644
>>>>>> --- a/pc-bios/s390-ccw/jump2ipl.c
>>>>>> +++ b/pc-bios/s390-ccw/jump2ipl.c
>>>>>> @@ -18,6 +18,7 @@
>>>>>>     typedef struct ResetInfo {
>>>>>>         uint64_t ipl_psw;
>>>>>>         uint32_t ipl_continue;
>>>>>> +    uint32_t pad;
>>>>>>     } ResetInfo;
>>>>>>       static ResetInfo save;
>>>>>>
>>>>>>
>>>>>> also work? If yes, both variants are valid. Either packed or explicit padding.
>>>>>>
>>>>>
>>>>> I don't believe this will work. I think the problem is that we're overwriting too much memory when we cast address 0 as a ResetInfo and then overwrite it (*current = save). I think we need the struct to be sized at 12-bytes instead of 16.
>>>>>
>>>>
>>>> The idea of the code is that we _save_ the original content from address 0 to save and _restore_ it before jumping into final code. I do not yet understand why this does not work.
>>>>
>>>
>>> I've found the real problem here. Legacy operating systems that expect to start
>>> in 32-bit addressing mode can fail if we leave junk in the high halves of our
>>> 64-bit registers. This is because some instructions (LA for example) are
>>> bi-modal and operate differently depending on the machine's current addressing
>>> mode.
>>>
>>> In the case where we pack the struct, the compiler happens to use the mvc
>>> instruction to load/store the current/save memory areas.
>>>
>>>        *current = save;
>>>    1fc:    e3 10 b0 a8 00 04     lg    %r1,168(%r11)
>>>    202:    c0 20 00 00 00 00     larl    %r2,202 <jump_to_IPL_2+0x32>
>>>              204: R_390_PC32DBL    .bss+0x2
>>>    208:    d2 0b 10 00 20 00     mvc    0(12,%r1),0(%r2)
>>>
>>> Everything works as expected here, our legacy OS boots without issue.
>>> However, in the case where we've packed this struct the compiler optimizes the
>>> code and uses lmg/stmg instead of mvc to copy the data:
>>>
>>>        *current = save;
>>>    1fc:    e3 10 b0 a8 00 04     lg    %r1,168(%r11)
>>>    202:    c0 20 00 00 00 00     larl    %r2,202 <jump_to_IPL_2+0x32>
>>>              204: R_390_PC32DBL    .bss+0x2
>>>    208:    eb 23 20 00 00 04     lmg    %r2,%r3,0(%r2)
>>>    20e:    eb 23 10 00 00 24     stmg    %r2,%r3,0(%r1)
>>>
>>> Depending on the data being copied, the high halves of the registers may contain
>>> non-zero values. Example:
>>>
>>>      r2             0x108000080000780        74309395999098752
>>>      r3             0x601001800004368        432627142283510632
>>>
>>> So, by sheer luck of the generated assembler, the patch happens to "fix" the
>>> problem.  A real fix might be to insert inline assembler that clears the high
>>> halves of the registers before we call ipl() in jump_to_IPL_2(). Can we think of
>>> a better way to do that than 15 LLGTR instructions? :) Let me know your
>>> thoughts
>>
>> Does sam31 before the ipl() work?
> asm volatile ("sam31\n");
> 
> Inserting the above right before ipl(); does not change the outcome, the guest still fails.
> 
> This allows the guest to boot.
> 
> asm volatile ("llgtr %r2,%r2\n"
>               "llgtr %r3,%r3\n");
> 
> My guess as to why sam31 does not work: The legacy OS is eventually doing a sam64 and the high halves of the registers are not subsequently cleared before use. I could be wrong about this though.

I think we should rewrite jump_to_IPL_2 is assembler as we cannot clear out all registers
with just inline assembly (we whould kill the stack and others that the compiler might still want).

Do the register clearing in there and then use something like

static void jump_to_IPL_2(void)
{
    asm volatile(	....clearing...
			"llgf 14,8\n"
                	"br 14\n");
}
Christian Borntraeger Feb. 25, 2020, 3:05 p.m. UTC | #12
On 25.02.20 16:00, Christian Borntraeger wrote:
> 
> 
> On 25.02.20 13:58, Jason J. Herne wrote:
>> On 2/25/20 6:13 AM, Christian Borntraeger wrote:
>>>
>>>
>>> On 25.02.20 11:23, Jason J. Herne wrote:
>>>> On 2/13/20 1:24 PM, Christian Borntraeger wrote:
>>>> ...
>>>>>>> diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c
>>>>>>> index da13c43cc0..8839226803 100644
>>>>>>> --- a/pc-bios/s390-ccw/jump2ipl.c
>>>>>>> +++ b/pc-bios/s390-ccw/jump2ipl.c
>>>>>>> @@ -18,6 +18,7 @@
>>>>>>>     typedef struct ResetInfo {
>>>>>>>         uint64_t ipl_psw;
>>>>>>>         uint32_t ipl_continue;
>>>>>>> +    uint32_t pad;
>>>>>>>     } ResetInfo;
>>>>>>>       static ResetInfo save;
>>>>>>>
>>>>>>>
>>>>>>> also work? If yes, both variants are valid. Either packed or explicit padding.
>>>>>>>
>>>>>>
>>>>>> I don't believe this will work. I think the problem is that we're overwriting too much memory when we cast address 0 as a ResetInfo and then overwrite it (*current = save). I think we need the struct to be sized at 12-bytes instead of 16.
>>>>>>
>>>>>
>>>>> The idea of the code is that we _save_ the original content from address 0 to save and _restore_ it before jumping into final code. I do not yet understand why this does not work.
>>>>>
>>>>
>>>> I've found the real problem here. Legacy operating systems that expect to start
>>>> in 32-bit addressing mode can fail if we leave junk in the high halves of our
>>>> 64-bit registers. This is because some instructions (LA for example) are
>>>> bi-modal and operate differently depending on the machine's current addressing
>>>> mode.
>>>>
>>>> In the case where we pack the struct, the compiler happens to use the mvc
>>>> instruction to load/store the current/save memory areas.
>>>>
>>>>        *current = save;
>>>>    1fc:    e3 10 b0 a8 00 04     lg    %r1,168(%r11)
>>>>    202:    c0 20 00 00 00 00     larl    %r2,202 <jump_to_IPL_2+0x32>
>>>>              204: R_390_PC32DBL    .bss+0x2
>>>>    208:    d2 0b 10 00 20 00     mvc    0(12,%r1),0(%r2)
>>>>
>>>> Everything works as expected here, our legacy OS boots without issue.
>>>> However, in the case where we've packed this struct the compiler optimizes the
>>>> code and uses lmg/stmg instead of mvc to copy the data:
>>>>
>>>>        *current = save;
>>>>    1fc:    e3 10 b0 a8 00 04     lg    %r1,168(%r11)
>>>>    202:    c0 20 00 00 00 00     larl    %r2,202 <jump_to_IPL_2+0x32>
>>>>              204: R_390_PC32DBL    .bss+0x2
>>>>    208:    eb 23 20 00 00 04     lmg    %r2,%r3,0(%r2)
>>>>    20e:    eb 23 10 00 00 24     stmg    %r2,%r3,0(%r1)
>>>>
>>>> Depending on the data being copied, the high halves of the registers may contain
>>>> non-zero values. Example:
>>>>
>>>>      r2             0x108000080000780        74309395999098752
>>>>      r3             0x601001800004368        432627142283510632
>>>>
>>>> So, by sheer luck of the generated assembler, the patch happens to "fix" the
>>>> problem.  A real fix might be to insert inline assembler that clears the high
>>>> halves of the registers before we call ipl() in jump_to_IPL_2(). Can we think of
>>>> a better way to do that than 15 LLGTR instructions? :) Let me know your
>>>> thoughts
>>>
>>> Does sam31 before the ipl() work?
>> asm volatile ("sam31\n");
>>
>> Inserting the above right before ipl(); does not change the outcome, the guest still fails.
>>
>> This allows the guest to boot.
>>
>> asm volatile ("llgtr %r2,%r2\n"
>>               "llgtr %r3,%r3\n");
>>
>> My guess as to why sam31 does not work: The legacy OS is eventually doing a sam64 and the high halves of the registers are not subsequently cleared before use. I could be wrong about this though.
> 
> I think we should rewrite jump_to_IPL_2 is assembler as we cannot clear out all registers
> with just inline assembly (we whould kill the stack and others that the compiler might still want).
> 
> Do the register clearing in there and then use something like
> 
> static void jump_to_IPL_2(void)
> {
>     asm volatile(	....clearing...
> 			"llgf 14,8\n"
>                 	"br 14\n");
> }

maybe something like that instead.

    asm volatile(  ...clearing...
		    "llgf 14,%0\n"
                   "br 14\n"::"i" (__builtin_offsetof(ResetInfo, ipl_continue)));
 }
Thomas Huth Aug. 27, 2020, 10:07 a.m. UTC | #13
On 07/02/2020 15.02, Jason J. Herne wrote:
> On 2/7/20 6:28 AM, Christian Borntraeger wrote:
>> Jason,
>>
>> can you run objdump -Sdr on jump2ipl.o on a broken variant?
>>
>>
> To keep the volume lower, I've only pasted the output that I think
> you're interested in. If you want to see the entire thing just let me know.
> 
> static void jump_to_IPL_2(void)
> {
>  1d0:    eb bf f0 58 00 24     stmg    %r11,%r15,88(%r15)
>  1d6:    a7 fb ff 50           aghi    %r15,-176
>  1da:    b9 04 00 bf           lgr    %r11,%r15
>     ResetInfo *current = 0;
>  1de:    a7 19 00 00           lghi    %r1,0
>  1e2:    e3 10 b0 a8 00 24     stg    %r1,168(%r11)
> 
>     void (*ipl)(void) = (void *) (uint64_t) current->ipl_continue;
>  1e8:    e3 10 b0 a8 00 04     lg    %r1,168(%r11)
>  1ee:    58 10 10 08           l    %r1,8(%r1)
>  1f2:    b9 16 00 11           llgfr    %r1,%r1
>  1f6:    e3 10 b0 a0 00 24     stg    %r1,160(%r11)
>     *current = save;
>  1fc:    e3 10 b0 a8 00 04     lg    %r1,168(%r11)
>  202:    c0 20 00 00 00 00     larl    %r2,202 <jump_to_IPL_2+0x32>
>             204: R_390_PC32DBL    .bss+0x2
>  208:    eb 23 20 00 00 04     lmg    %r2,%r3,0(%r2)
>  20e:    eb 23 10 00 00 24     stmg    %r2,%r3,0(%r1)
>     ipl(); /* should not return */
>  214:    e3 10 b0 a0 00 04     lg    %r1,160(%r11)
>  21a:    0d e1                 basr    %r14,%r1
> }
>  21c:    18 00                 lr    %r0,%r0
>  21e:    eb bf b1 08 00 04     lmg    %r11,%r15,264(%r11)
>  224:    07 fe                 br    %r14
>  226:    07 07                 nopr    %r7

I'm currently looking through the past s390-ccw bios patches that still
might need attention ... was there ever a follow up on this discussion?
Do we need to clear the registers before jumping to the OS?
And looking at the disassembly, should we declar the ipl function
pointer with __attribute__((noreturn)) ?

 Thomas
Jason J. Herne Sept. 1, 2020, 1:02 p.m. UTC | #14
On 8/27/20 6:07 AM, Thomas Huth wrote:
> On 07/02/2020 15.02, Jason J. Herne wrote:
>> On 2/7/20 6:28 AM, Christian Borntraeger wrote:
>>> Jason,
>>>
>>> can you run objdump -Sdr on jump2ipl.o on a broken variant?
>>>
>>>
>> To keep the volume lower, I've only pasted the output that I think
>> you're interested in. If you want to see the entire thing just let me know.
>>
>> static void jump_to_IPL_2(void)
>> {
>>   1d0:    eb bf f0 58 00 24     stmg    %r11,%r15,88(%r15)
>>   1d6:    a7 fb ff 50           aghi    %r15,-176
>>   1da:    b9 04 00 bf           lgr    %r11,%r15
>>      ResetInfo *current = 0;
>>   1de:    a7 19 00 00           lghi    %r1,0
>>   1e2:    e3 10 b0 a8 00 24     stg    %r1,168(%r11)
>>
>>      void (*ipl)(void) = (void *) (uint64_t) current->ipl_continue;
>>   1e8:    e3 10 b0 a8 00 04     lg    %r1,168(%r11)
>>   1ee:    58 10 10 08           l    %r1,8(%r1)
>>   1f2:    b9 16 00 11           llgfr    %r1,%r1
>>   1f6:    e3 10 b0 a0 00 24     stg    %r1,160(%r11)
>>      *current = save;
>>   1fc:    e3 10 b0 a8 00 04     lg    %r1,168(%r11)
>>   202:    c0 20 00 00 00 00     larl    %r2,202 <jump_to_IPL_2+0x32>
>>              204: R_390_PC32DBL    .bss+0x2
>>   208:    eb 23 20 00 00 04     lmg    %r2,%r3,0(%r2)
>>   20e:    eb 23 10 00 00 24     stmg    %r2,%r3,0(%r1)
>>      ipl(); /* should not return */
>>   214:    e3 10 b0 a0 00 04     lg    %r1,160(%r11)
>>   21a:    0d e1                 basr    %r14,%r1
>> }
>>   21c:    18 00                 lr    %r0,%r0
>>   21e:    eb bf b1 08 00 04     lmg    %r11,%r15,264(%r11)
>>   224:    07 fe                 br    %r14
>>   226:    07 07                 nopr    %r7
> 
> I'm currently looking through the past s390-ccw bios patches that still
> might need attention ... was there ever a follow up on this discussion?
> Do we need to clear the registers before jumping to the OS?
> And looking at the disassembly, should we declar the ipl function
> pointer with __attribute__((noreturn)) ?
> 

Janosch has done some cleanup work that has not hit master yet. This work, if accepted, 
would fix this problem and eliminate the need for this patch. So I think we should wait to 
see how that plays out before we revisit this.
diff mbox series

Patch

diff --git a/pc-bios/s390-ccw/jump2ipl.c b/pc-bios/s390-ccw/jump2ipl.c
index da13c43cc0..1e9eaa037f 100644
--- a/pc-bios/s390-ccw/jump2ipl.c
+++ b/pc-bios/s390-ccw/jump2ipl.c
@@ -18,7 +18,7 @@ 
 typedef struct ResetInfo {
     uint64_t ipl_psw;
     uint32_t ipl_continue;
-} ResetInfo;
+} __attribute__((packed)) ResetInfo;
 
 static ResetInfo save;