[v5,11/12] s390-ccw: clear pending irqs

Message ID	1517864246-11101-12-git-send-email-walling@linux.vnet.ibm.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org> Gateway: Authorized Use Only! Violators will be prosecuted for <qemu-devel@nongnu.org> from <walling@linux.vnet.ibm.com>; Mon, 5 Feb 2018 16:01:44 -0500 Gateway: Authorized Use Only! Violators will be prosecuted; Mon, 5 Feb 2018 16:01:41 -0500 From: "Collin L. Walling" <walling@linux.vnet.ibm.com> To: qemu-s390x@nongnu.org, qemu-devel@nongnu.org Date: Mon, 5 Feb 2018 15:57:25 -0500 In-Reply-To: <1517864246-11101-1-git-send-email-walling@linux.vnet.ibm.com> References: <1517864246-11101-1-git-send-email-walling@linux.vnet.ibm.com> Message-Id: <1517864246-11101-12-git-send-email-walling@linux.vnet.ibm.com> Subject: [Qemu-devel] [PATCH v5 11/12] s390-ccw: clear pending irqs Precedence: list Cc: frankja@linux.vnet.ibm.com, thuth@redhat.com, cohuck@redhat.com, david@redhat.com, alifm@linux.vnet.ibm.com, mihajlov@linux.vnet.ibm.com, borntraeger@de.ibm.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>

Collin L. Walling Feb. 5, 2018, 8:57 p.m. UTC

It is possible while waiting for multiple types of external
interrupts that we might have pending irqs remaining between
irq consumption and irq disabling. Those interrupts could
propagate to the guest after IPL completes and cause unwanted
behavior.

To avoid this, we clear the write event mask to prevent further
service interrupts from ASCII events and then consume all pending
irqs for a miniscule duration. Once finished, we reset the write
event mask and resume business as usual.

Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
---
 pc-bios/s390-ccw/menu.c | 16 ++++++++++++++++
 pc-bios/s390-ccw/sclp.c | 12 ++++++++++++
 2 files changed, 28 insertions(+)

Thomas Huth Feb. 6, 2018, 10:07 a.m. UTC | #1

On 05.02.2018 21:57, Collin L. Walling wrote:
> It is possible while waiting for multiple types of external
> interrupts that we might have pending irqs remaining between
> irq consumption and irq disabling. Those interrupts could
> propagate to the guest after IPL completes and cause unwanted
> behavior.
> 
> To avoid this, we clear the write event mask to prevent further
> service interrupts from ASCII events and then consume all pending
> irqs for a miniscule duration. Once finished, we reset the write
> event mask and resume business as usual.
> 
> Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
> ---
>  pc-bios/s390-ccw/menu.c | 16 ++++++++++++++++
>  pc-bios/s390-ccw/sclp.c | 12 ++++++++++++
>  2 files changed, 28 insertions(+)
> 
> diff --git a/pc-bios/s390-ccw/menu.c b/pc-bios/s390-ccw/menu.c
> index 85d285f..971f6b6 100644
> --- a/pc-bios/s390-ccw/menu.c
> +++ b/pc-bios/s390-ccw/menu.c
> @@ -64,6 +64,20 @@ static inline bool check_clock_int(void)
>      return *code == 0x1004;
>  }
>  
> +static void clear_pending_irqs(void)
> +{
> +    uint64_t time = 50 * TOD_CLOCK_SECOND / 0x3e8;

s/0x3e8/1000/ please.

> +    sclp_clear_write_mask();
> +
> +    set_clock_comparator(get_clock() + time);
> +    enable_clock_int();
> +    consume_sclp_int();
> +    disable_clock_int();
> +
> +    sclp_setup(); /* re-enable write mask */
> +}

I'm pretty much confused by this code. First, isn't there a small chance
that there is a clock int between consume_sclp_int() and
disable_clock_int() (if consume_sclp_int() has consumed an SCLP
interrupt instead of a clock interrupt) ?
Second, if you finally enable the SCLP write mask again, doesn't that
mean that there could be interrupts pending again afterwards?

 Thomas

Collin L. Walling Feb. 6, 2018, 4:37 p.m. UTC | #2

On 02/06/2018 05:07 AM, Thomas Huth wrote:
> On 05.02.2018 21:57, Collin L. Walling wrote:
>> It is possible while waiting for multiple types of external
>> interrupts that we might have pending irqs remaining between
>> irq consumption and irq disabling. Those interrupts could
>> propagate to the guest after IPL completes and cause unwanted
>> behavior.
>>
>> To avoid this, we clear the write event mask to prevent further
>> service interrupts from ASCII events and then consume all pending
>> irqs for a miniscule duration. Once finished, we reset the write
>> event mask and resume business as usual.
>>
>> Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
>> ---
>>   pc-bios/s390-ccw/menu.c | 16 ++++++++++++++++
>>   pc-bios/s390-ccw/sclp.c | 12 ++++++++++++
>>   2 files changed, 28 insertions(+)
>>
>> diff --git a/pc-bios/s390-ccw/menu.c b/pc-bios/s390-ccw/menu.c
>> index 85d285f..971f6b6 100644
>> --- a/pc-bios/s390-ccw/menu.c
>> +++ b/pc-bios/s390-ccw/menu.c
>> @@ -64,6 +64,20 @@ static inline bool check_clock_int(void)
>>       return *code == 0x1004;
>>   }
>>   
>> +static void clear_pending_irqs(void)
>> +{
>> +    uint64_t time = 50 * TOD_CLOCK_SECOND / 0x3e8;
> s/0x3e8/1000/ please.

Agreed.

>
>> +    sclp_clear_write_mask();
>> +
>> +    set_clock_comparator(get_clock() + time);
>> +    enable_clock_int();
>> +    consume_sclp_int();
>> +    disable_clock_int();
>> +
>> +    sclp_setup(); /* re-enable write mask */
>> +}
> I'm pretty much confused by this code. First, isn't there a small chance
> that there is a clock int between consume_sclp_int() and
> disable_clock_int() (if consume_sclp_int() has consumed an SCLP
> interrupt instead of a clock interrupt) ?

By clearing the write mask, we can no longer send nor receive ASCII events
(such as prints or keystrokes). Therefore we will not have any service
interrupts originating from keystrokes.

Theoretically, the only interrupt we should be waiting for in this case
is from the clock.

> Second, if you finally enable the SCLP write mask again, doesn't that
> mean that there could be interrupts pending again afterwards?

Yes.  When the write mask is re-enabled for cp_receive mask, we can still
get pending service interrupts originating from keystrokes (or other 
external,
ASCII related interruptions). Hopefully the user does not fall asleep
and smack their head on the keyboard during the IPL process. (joking)

Ideally (as it was suggested in a previous review), we should refactor
the consume_sclp_int so we can manually enable / disable service interrupts,
but I think this is a good first step, no?

We also might be able to get away with just not setting the receive mask at
all when we re-enable the write mask. That will prevent any interrupts from
keystrokes.

>
>   Thomas
>

David Hildenbrand Feb. 14, 2018, 10:57 a.m. UTC | #3

On 05.02.2018 21:57, Collin L. Walling wrote:
> It is possible while waiting for multiple types of external
> interrupts that we might have pending irqs remaining between
> irq consumption and irq disabling. Those interrupts could
> propagate to the guest after IPL completes and cause unwanted
> behavior.
> 
> To avoid this, we clear the write event mask to prevent further
> service interrupts from ASCII events and then consume all pending
> irqs for a miniscule duration. Once finished, we reset the write
> event mask and resume business as usual.
> 
> Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
> ---
>  pc-bios/s390-ccw/menu.c | 16 ++++++++++++++++
>  pc-bios/s390-ccw/sclp.c | 12 ++++++++++++
>  2 files changed, 28 insertions(+)
> 
> diff --git a/pc-bios/s390-ccw/menu.c b/pc-bios/s390-ccw/menu.c
> index 85d285f..971f6b6 100644
> --- a/pc-bios/s390-ccw/menu.c
> +++ b/pc-bios/s390-ccw/menu.c
> @@ -64,6 +64,20 @@ static inline bool check_clock_int(void)
>      return *code == 0x1004;
>  }
>  
> +static void clear_pending_irqs(void)
> +{
> +    uint64_t time = 50 * TOD_CLOCK_SECOND / 0x3e8;
> +
> +    sclp_clear_write_mask();
> +
> +    set_clock_comparator(get_clock() + time);
> +    enable_clock_int();
> +    consume_sclp_int();
> +    disable_clock_int();
> +
> +    sclp_setup(); /* re-enable write mask */
> +}
> +
>  static int read_prompt(char *buf, size_t len)
>  {
>      char inp[2] = {};
> @@ -165,6 +179,8 @@ static int get_boot_index(int entries)
>      sclp_print("\nBooting entry #");
>      sclp_print(itostr(boot_index, tmp, sizeof(tmp)));
>  
> +    clear_pending_irqs();
> +
>      return boot_index;
>  }
>  
> diff --git a/pc-bios/s390-ccw/sclp.c b/pc-bios/s390-ccw/sclp.c
> index 5902d5b..025eb2d 100644
> --- a/pc-bios/s390-ccw/sclp.c
> +++ b/pc-bios/s390-ccw/sclp.c
> @@ -46,6 +46,18 @@ static int sclp_service_call(unsigned int command, void *sccb)
>          return 0;
>  }
>  
> +void sclp_clear_write_mask(void)
> +{
> +    WriteEventMask *sccb = (void *)_sccb;
> +
> +    sccb->h.length = sizeof(WriteEventMask);
> +    sccb->mask_length = sizeof(unsigned int);
> +    sccb->cp_receive_mask = 0;
> +    sccb->cp_send_mask = 0;
> +
> +    sclp_service_call(SCLP_CMD_WRITE_EVENT_MASK, sccb);
> +}
> +
>  static void sclp_set_write_mask(void)
>  {
>      WriteEventMask *sccb = (void *)_sccb;
> 

1. CKC interrupts can be cleared by resetting the CKC
2. SCLP interrupts can be cleared only via delivery (apart from CPU reset)

So if you have CKC and SCLP pending at the same time, you get the CKC
delivered first and the SCLP remains pending.

Now, the easiest way to clear that (if you don't know if any is
pending!) is to simply print a string. Then you know that you have
exactly one SCLP interrupt pending.

So simply printing a string after potentially reading should be
sufficient to clear the SCLP interrupt deterministically :)

Collin L. Walling Feb. 14, 2018, 3:33 p.m. UTC | #4

On 02/14/2018 05:57 AM, David Hildenbrand wrote:
> On 05.02.2018 21:57, Collin L. Walling wrote:
>> It is possible while waiting for multiple types of external
>> interrupts that we might have pending irqs remaining between
>> irq consumption and irq disabling. Those interrupts could
>> propagate to the guest after IPL completes and cause unwanted
>> behavior.
>>
>> To avoid this, we clear the write event mask to prevent further
>> service interrupts from ASCII events and then consume all pending
>> irqs for a miniscule duration. Once finished, we reset the write
>> event mask and resume business as usual.
>>
>> Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
>> ---
>>   pc-bios/s390-ccw/menu.c | 16 ++++++++++++++++
>>   pc-bios/s390-ccw/sclp.c | 12 ++++++++++++
>>   2 files changed, 28 insertions(+)
>>
>> diff --git a/pc-bios/s390-ccw/menu.c b/pc-bios/s390-ccw/menu.c
>> index 85d285f..971f6b6 100644
>> --- a/pc-bios/s390-ccw/menu.c
>> +++ b/pc-bios/s390-ccw/menu.c
>> @@ -64,6 +64,20 @@ static inline bool check_clock_int(void)
>>       return *code == 0x1004;
>>   }
>>   
>> +static void clear_pending_irqs(void)
>> +{
>> +    uint64_t time = 50 * TOD_CLOCK_SECOND / 0x3e8;
>> +
>> +    sclp_clear_write_mask();
>> +
>> +    set_clock_comparator(get_clock() + time);
>> +    enable_clock_int();
>> +    consume_sclp_int();
>> +    disable_clock_int();
>> +
>> +    sclp_setup(); /* re-enable write mask */
>> +}
>> +
>>   static int read_prompt(char *buf, size_t len)
>>   {
>>       char inp[2] = {};
>> @@ -165,6 +179,8 @@ static int get_boot_index(int entries)
>>       sclp_print("\nBooting entry #");
>>       sclp_print(itostr(boot_index, tmp, sizeof(tmp)));
>>   
>> +    clear_pending_irqs();
>> +
>>       return boot_index;
>>   }
>>   
>> diff --git a/pc-bios/s390-ccw/sclp.c b/pc-bios/s390-ccw/sclp.c
>> index 5902d5b..025eb2d 100644
>> --- a/pc-bios/s390-ccw/sclp.c
>> +++ b/pc-bios/s390-ccw/sclp.c
>> @@ -46,6 +46,18 @@ static int sclp_service_call(unsigned int command, void *sccb)
>>           return 0;
>>   }
>>   
>> +void sclp_clear_write_mask(void)
>> +{
>> +    WriteEventMask *sccb = (void *)_sccb;
>> +
>> +    sccb->h.length = sizeof(WriteEventMask);
>> +    sccb->mask_length = sizeof(unsigned int);
>> +    sccb->cp_receive_mask = 0;
>> +    sccb->cp_send_mask = 0;
>> +
>> +    sclp_service_call(SCLP_CMD_WRITE_EVENT_MASK, sccb);
>> +}
>> +
>>   static void sclp_set_write_mask(void)
>>   {
>>       WriteEventMask *sccb = (void *)_sccb;
>>
> 1. CKC interrupts can be cleared by resetting the CKC
> 2. SCLP interrupts can be cleared only via delivery (apart from CPU reset)
>
> So if you have CKC and SCLP pending at the same time, you get the CKC
> delivered first and the SCLP remains pending.
>
> Now, the easiest way to clear that (if you don't know if any is
> pending!) is to simply print a string. Then you know that you have
> exactly one SCLP interrupt pending.
>
> So simply printing a string after potentially reading should be
> sufficient to clear the SCLP interrupt deterministically :)
>

Perhaps it is due to my lack of understanding of how irqs are queued, 
but is it
possible that we could still end up with service interrupts pending in 
the SCLP?
Specifically if we're still accepting external interrupts from 
keystrokes but we
aren't reading anything from the SCLP.

Let's say we have 1 service signal pending and we go to print something. 
This
executes the sclp service call instruction and generates a new service 
signal.
The SCLP would consume one of the service interrupts and write to the 
console.
We still have 1 interrupt pending that we need to deal with.

That 1 pending interrupt could have been generated at any time we're still
listening to activity from the keyboard.

In my next update to this patch, I setup the control program receive mask in
the SCLP only when we need to get input from the user and then clear the 
mask
when we're done. Doing so will make it so we generate an interrupt from
keystrokes ONLY when the mask is set. No external interrupts from 
keystrokes
will be generated when the cp_receive mask is NOT set.

After I clear the cp_receive mask, we consume any leftover interrupts by
calling consume_sclp_int (I also fixup the patch to make sure we only end
irq-clearing on a ckc interrupt -- oops).

Am I at least in the ballpark regarding the problem this patch aims to 
solve?

Thomas Huth Feb. 15, 2018, 7:04 a.m. UTC | #5

On 14.02.2018 16:33, Collin L. Walling wrote:
> On 02/14/2018 05:57 AM, David Hildenbrand wrote:
[...]
>> 1. CKC interrupts can be cleared by resetting the CKC
>> 2. SCLP interrupts can be cleared only via delivery (apart from CPU
>> reset)
>>
>> So if you have CKC and SCLP pending at the same time, you get the CKC
>> delivered first and the SCLP remains pending.
>>
>> Now, the easiest way to clear that (if you don't know if any is
>> pending!) is to simply print a string. Then you know that you have
>> exactly one SCLP interrupt pending.
>>
>> So simply printing a string after potentially reading should be
>> sufficient to clear the SCLP interrupt deterministically :)
> 
> Perhaps it is due to my lack of understanding of how irqs are queued,
> but is it
> possible that we could still end up with service interrupts pending in
> the SCLP?
> Specifically if we're still accepting external interrupts from
> keystrokes but we
> aren't reading anything from the SCLP.
> 
> Let's say we have 1 service signal pending and we go to print something.
> This
> executes the sclp service call instruction and generates a new service
> signal.
> The SCLP would consume one of the service interrupts and write to the
> console.
> We still have 1 interrupt pending that we need to deal with.
> 
> That 1 pending interrupt could have been generated at any time we're still
> listening to activity from the keyboard.

There is no "queue" or something like this for service interrupts.
Either one service interrupt is pending, or it is not. Have a look at
arch/s390/kvm/interrupt.c in the Linux kernel sources and search for the
functions __deliver_service() and __inject_service() for example.

> In my next update to this patch, I setup the control program receive
> mask in
> the SCLP only when we need to get input from the user and then clear the
> mask
> when we're done. Doing so will make it so we generate an interrupt from
> keystrokes ONLY when the mask is set. No external interrupts from
> keystrokes
> will be generated when the cp_receive mask is NOT set.
> 
> After I clear the cp_receive mask, we consume any leftover interrupts by
> calling consume_sclp_int (I also fixup the patch to make sure we only end
> irq-clearing on a ckc interrupt -- oops).

Not sure whether you really have to deal with the ckc here again to get
rid of pending service interrupts... David's idea to simply print out
something to clear the pending service interrupt sounds easier to me.

 Thomas

Collin L. Walling Feb. 15, 2018, 3:47 p.m. UTC | #6

On 02/15/2018 02:04 AM, Thomas Huth wrote:
> On 14.02.2018 16:33, Collin L. Walling wrote:
>> On 02/14/2018 05:57 AM, David Hildenbrand wrote:
> [...]
>>> 1. CKC interrupts can be cleared by resetting the CKC
>>> 2. SCLP interrupts can be cleared only via delivery (apart from CPU
>>> reset)
>>>
>>> So if you have CKC and SCLP pending at the same time, you get the CKC
>>> delivered first and the SCLP remains pending.
>>>
>>> Now, the easiest way to clear that (if you don't know if any is
>>> pending!) is to simply print a string. Then you know that you have
>>> exactly one SCLP interrupt pending.
>>>
>>> So simply printing a string after potentially reading should be
>>> sufficient to clear the SCLP interrupt deterministically :)
>> Perhaps it is due to my lack of understanding of how irqs are queued,
>> but is it
>> possible that we could still end up with service interrupts pending in
>> the SCLP?
>> Specifically if we're still accepting external interrupts from
>> keystrokes but we
>> aren't reading anything from the SCLP.
>>
>> Let's say we have 1 service signal pending and we go to print something.
>> This
>> executes the sclp service call instruction and generates a new service
>> signal.
>> The SCLP would consume one of the service interrupts and write to the
>> console.
>> We still have 1 interrupt pending that we need to deal with.
>>
>> That 1 pending interrupt could have been generated at any time we're still
>> listening to activity from the keyboard.
> There is no "queue" or something like this for service interrupts.
> Either one service interrupt is pending, or it is not. Have a look at
> arch/s390/kvm/interrupt.c in the Linux kernel sources and search for the
> functions __deliver_service() and __inject_service() for example.
>
>> In my next update to this patch, I setup the control program receive
>> mask in
>> the SCLP only when we need to get input from the user and then clear the
>> mask
>> when we're done. Doing so will make it so we generate an interrupt from
>> keystrokes ONLY when the mask is set. No external interrupts from
>> keystrokes
>> will be generated when the cp_receive mask is NOT set.
>>
>> After I clear the cp_receive mask, we consume any leftover interrupts by
>> calling consume_sclp_int (I also fixup the patch to make sure we only end
>> irq-clearing on a ckc interrupt -- oops).
> Not sure whether you really have to deal with the ckc here again to get
> rid of pending service interrupts... David's idea to simply print out
> something to clear the pending service interrupt sounds easier to me.
>
>   Thomas
>

Ah, I understand the problem now. We can't have a multiple irqs of the 
*same* *type*
pending at the same time, but we /can/ have multiple irqs of *different* 
*types* pending
at the same time (i.e. a ckc and service signal).

Thanks for clearing this up for me.  I was over thinking the problem and 
I agree that
a print would be the easiest solution. :)

David Hildenbrand Feb. 15, 2018, 4:12 p.m. UTC | #7

> 
> Ah, I understand the problem now. We can't have a multiple irqs of the 
> *same* *type*
> pending at the same time, but we /can/ have multiple irqs of *different* 
> *types* pending
> at the same time (i.e. a ckc and service signal).

Indeed, at least for these types of external interrupts.

(if I remember correctly, previous TCG versions did this wrong but I
fixed it :) )

> 
> Thanks for clearing this up for me.  I was over thinking the problem and 
> I agree that
> a print would be the easiest solution. :)
>

[v5,11/12] s390-ccw: clear pending irqs

Commit Message

Comments

Patch