diff mbox series

[v2,01/12] KVM: s390: leave AIs in IPM of GISA during vcpu_pre_run()

Message ID 20181119172536.52649-2-mimu@linux.ibm.com (mailing list archive)
State New, archived
Headers show
Series KVM: s390: make use of the GIB | expand

Commit Message

Michael Mueller Nov. 19, 2018, 5:25 p.m. UTC
Do not call __deliver_io() for adapter interruptions already
pending in the IPM. That is a double effort. They will
be processed as soon the vcpu control is given to SIE.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
---
 arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++-------------------------
 1 file changed, 25 insertions(+), 29 deletions(-)

Comments

Cornelia Huck Nov. 20, 2018, 11:33 a.m. UTC | #1
On Mon, 19 Nov 2018 18:25:25 +0100
Michael Mueller <mimu@linux.ibm.com> wrote:

> Do not call __deliver_io() for adapter interruptions already
> pending in the IPM. That is a double effort. They will
> be processed as soon the vcpu control is given to SIE.
> 
> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
> ---
>  arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++-------------------------
>  1 file changed, 25 insertions(+), 29 deletions(-)

I think this patch does what it says on the tin, but I'm a bit lost as
to the why. (Might make more sense with the gib.)

Currently, we are trying to process any I/O interrupts, even if we'd
get them delivered via the gisa, when we're out of the SIE anyway.
IIRC, while this looks a bit like a belt-and-suspenders approach, it
also prevented performance problems when the vcpu did not go back into
the SIE immediately (it even may exit to userspace).

Also, if you're ignoring the I/O interrupts pending in the ipm, you may
end up delivering interrupts with a lower priority (higher isc) first.
I'm not sure that's what we want.

But maybe I'm just missing another bit of the code that makes this
safe. Can you elaborate a bit?
David Hildenbrand Nov. 20, 2018, noon UTC | #2
On 20.11.18 12:33, Cornelia Huck wrote:
> On Mon, 19 Nov 2018 18:25:25 +0100
> Michael Mueller <mimu@linux.ibm.com> wrote:
> 
>> Do not call __deliver_io() for adapter interruptions already
>> pending in the IPM. That is a double effort. They will
>> be processed as soon the vcpu control is given to SIE.
>>
>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
>> ---
>>  arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++-------------------------
>>  1 file changed, 25 insertions(+), 29 deletions(-)
> 
> I think this patch does what it says on the tin, but I'm a bit lost as
> to the why. (Might make more sense with the gib.)
> 
> Currently, we are trying to process any I/O interrupts, even if we'd
> get them delivered via the gisa, when we're out of the SIE anyway.
> IIRC, while this looks a bit like a belt-and-suspenders approach, it
> also prevented performance problems when the vcpu did not go back into
> the SIE immediately (it even may exit to userspace).
> 
> Also, if you're ignoring the I/O interrupts pending in the ipm, you may
> end up delivering interrupts with a lower priority (higher isc) first.
> I'm not sure that's what we want.
> 
> But maybe I'm just missing another bit of the code that makes this
> safe. Can you elaborate a bit?
> 

For interrupt priorities to work at least somewhat predictable, we
should always try to inject all interrupts even if the HW would be doing
it for us. In the order of priority.

But we don't have the same thing for external calls injected via SCA. I
remember that I once had a patch lying around a couple of years ago to
fix that ... it went missing :)

IO interrupt almost have lowest priority, so letting the HW inject them
could be problematic when mixing IO interrupt priorities between SW and
HW injected ones (hat Conny described).

There are other corner cases if a e.g. a RESTART interrupt is pending
for that CPU. We would deliver eventually the RESTART interrupt before
delivering the IO interrupt, which would be wrong.
Pierre Morel Nov. 20, 2018, 6:22 p.m. UTC | #3
On 20/11/2018 12:33, Cornelia Huck wrote:
> On Mon, 19 Nov 2018 18:25:25 +0100
> Michael Mueller <mimu@linux.ibm.com> wrote:
> 
>> Do not call __deliver_io() for adapter interruptions already
>> pending in the IPM. That is a double effort. They will
>> be processed as soon the vcpu control is given to SIE.
>>
>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
>> ---
>>   arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++-------------------------
>>   1 file changed, 25 insertions(+), 29 deletions(-)
> 
> I think this patch does what it says on the tin, but I'm a bit lost as
> to the why. (Might make more sense with the gib.)
> 
> Currently, we are trying to process any I/O interrupts, even if we'd
> get them delivered via the gisa, when we're out of the SIE anyway.
> IIRC, while this looks a bit like a belt-and-suspenders approach, it
> also prevented performance problems when the vcpu did not go back into
> the SIE immediately (it even may exit to userspace).
> 
> Also, if you're ignoring the I/O interrupts pending in the ipm, you may
> end up delivering interrupts with a lower priority (higher isc) first.
> I'm not sure that's what we want.
> 
> But maybe I'm just missing another bit of the code that makes this
> safe. Can you elaborate a bit?
> 

I do not think we should worry.

In the architecture all interrupts are asynchronous to any activity of 
the CPU.
The priority of the interrupt is controlled intern by each sub-channel 
and adapter and then the by each CPU among sub-channel and adapter requests.

While the first system is completely hardware dependent the second is 
collisioning with software IRQ we may dispatch out of KVM/QEMU.

The assignment of these priority is model dependent and must guarantee 
that no interrupt is delayed so much that it could cause recovery 
actions to be initiated.

In our case, we can take for sure that the priority seen by the vCPU, 
that is dispatched by the software by touching the SIE page or by the 
GISA mechanism, are compatible with the architecture.

If we agree on this the only problem may arise from the first level of 
interruption also inside the subchannel priority mechanism and from the 
delay induced by GISA when delivering these interrupts to the vCPU.

This delay occurs on an asynchronous interrupt, so yes, there will be a 
delay.
Is it larger or smaller than the delay introduced by the software?

Why should we worry if it is, the interrupt is asynchronous, should we 
worry, for example, if the AP card take longer to send the interrupt?
I don't think so.

Regards,
Pierre
Christian Borntraeger Nov. 20, 2018, 9:03 p.m. UTC | #4
On 11/20/2018 12:33 PM, Cornelia Huck wrote:
> On Mon, 19 Nov 2018 18:25:25 +0100
> Michael Mueller <mimu@linux.ibm.com> wrote:
> 
>> Do not call __deliver_io() for adapter interruptions already
>> pending in the IPM. That is a double effort. They will
>> be processed as soon the vcpu control is given to SIE.
>>
>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
>> ---
>>  arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++-------------------------
>>  1 file changed, 25 insertions(+), 29 deletions(-)
> 
> I think this patch does what it says on the tin, but I'm a bit lost as
> to the why. (Might make more sense with the gib.)
> 
> Currently, we are trying to process any I/O interrupts, even if we'd
> get them delivered via the gisa, when we're out of the SIE anyway.
> IIRC, while this looks a bit like a belt-and-suspenders approach, it
> also prevented performance problems when the vcpu did not go back into
> the SIE immediately (it even may exit to userspace

In fact, doing the inject when in SIE is likely better performance-wise.
Right now we "inject" the floating interrupt and then we handle 
the requests. That can actually mean that it could take a while 
until the interrupt is actually noticed by the guest (when
in SIE). If you now have a 2nd CPU enabled this interrupt could
have been delivered to the guest much earlier but it is "stuck" in
the local CPU.


> 
> Also, if you're ignoring the I/O interrupts pending in the ipm, you may
> end up delivering interrupts with a lower priority (higher isc) first.
> I'm not sure that's what we want.

FWIW, LPAR has the same relaxation regarding priorities of subclasses.

> But maybe I'm just missing another bit of the code that makes this
> safe. Can you elaborate a bit?
Michael Mueller Nov. 21, 2018, 8:30 a.m. UTC | #5
On 20.11.18 12:33, Cornelia Huck wrote:
> On Mon, 19 Nov 2018 18:25:25 +0100
> Michael Mueller <mimu@linux.ibm.com> wrote:
> 
>> Do not call __deliver_io() for adapter interruptions already
>> pending in the IPM. That is a double effort. They will
>> be processed as soon the vcpu control is given to SIE.
>>
>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
>> ---
>>   arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++-------------------------
>>   1 file changed, 25 insertions(+), 29 deletions(-)
> 
> I think this patch does what it says on the tin, but I'm a bit lost as
> to the why. (Might make more sense with the gib.)
> 
> Currently, we are trying to process any I/O interrupts, even if we'd
> get them delivered via the gisa, when we're out of the SIE anyway.
> IIRC, while this looks a bit like a belt-and-suspenders approach, it
> also prevented performance problems when the vcpu did not go back into
> the SIE immediately (it even may exit to userspace).
> 
> Also, if you're ignoring the I/O interrupts pending in the ipm, you may
> end up delivering interrupts with a lower priority (higher isc) first.
> I'm not sure that's what we want.
> 
> But maybe I'm just missing another bit of the code that makes this
> safe. Can you elaborate a bit?


Function kvm_s390_deliver_pending_interrupts() is called in the 
beginning of vcpu_pre_run() and we are about to enter the SIE anyway. 
SIE will also grant the right ISC priority for adapter interruptions.

I basically want to avoid that a GISA that is part of the alert list
will get its IPM bits cleared outside the GIB alert interruption 
context. process_gib_alert_list()

>
Cornelia Huck Nov. 21, 2018, 12:16 p.m. UTC | #6
On Tue, 20 Nov 2018 22:03:59 +0100
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> On 11/20/2018 12:33 PM, Cornelia Huck wrote:
> > On Mon, 19 Nov 2018 18:25:25 +0100
> > Michael Mueller <mimu@linux.ibm.com> wrote:
> >   
> >> Do not call __deliver_io() for adapter interruptions already
> >> pending in the IPM. That is a double effort. They will
> >> be processed as soon the vcpu control is given to SIE.
> >>
> >> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
> >> ---
> >>  arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++-------------------------
> >>  1 file changed, 25 insertions(+), 29 deletions(-)  
> > 
> > I think this patch does what it says on the tin, but I'm a bit lost as
> > to the why. (Might make more sense with the gib.)
> > 
> > Currently, we are trying to process any I/O interrupts, even if we'd
> > get them delivered via the gisa, when we're out of the SIE anyway.
> > IIRC, while this looks a bit like a belt-and-suspenders approach, it
> > also prevented performance problems when the vcpu did not go back into
> > the SIE immediately (it even may exit to userspace  
> 
> In fact, doing the inject when in SIE is likely better performance-wise.
> Right now we "inject" the floating interrupt and then we handle 
> the requests. That can actually mean that it could take a while 
> until the interrupt is actually noticed by the guest (when
> in SIE). If you now have a 2nd CPU enabled this interrupt could
> have been delivered to the guest much earlier but it is "stuck" in
> the local CPU.

Hm, yes. Do we see any different effects if we have a guest with only
one cpu (or only one cpu enabled for I/O interrupts?) Or does all of
this even out in practice?

> > Also, if you're ignoring the I/O interrupts pending in the ipm, you may
> > end up delivering interrupts with a lower priority (higher isc) first.
> > I'm not sure that's what we want.  
> 
> FWIW, LPAR has the same relaxation regarding priorities of subclasses.

Interesting to know, thanks. What about restart etc. interrupts, as
David has noted?
Cornelia Huck Nov. 21, 2018, 12:19 p.m. UTC | #7
On Wed, 21 Nov 2018 09:30:34 +0100
Michael Mueller <mimu@linux.ibm.com> wrote:

> On 20.11.18 12:33, Cornelia Huck wrote:
> > On Mon, 19 Nov 2018 18:25:25 +0100
> > Michael Mueller <mimu@linux.ibm.com> wrote:
> >   
> >> Do not call __deliver_io() for adapter interruptions already
> >> pending in the IPM. That is a double effort. They will
> >> be processed as soon the vcpu control is given to SIE.
> >>
> >> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
> >> ---
> >>   arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++-------------------------
> >>   1 file changed, 25 insertions(+), 29 deletions(-)  
> > 
> > I think this patch does what it says on the tin, but I'm a bit lost as
> > to the why. (Might make more sense with the gib.)
> > 
> > Currently, we are trying to process any I/O interrupts, even if we'd
> > get them delivered via the gisa, when we're out of the SIE anyway.
> > IIRC, while this looks a bit like a belt-and-suspenders approach, it
> > also prevented performance problems when the vcpu did not go back into
> > the SIE immediately (it even may exit to userspace).
> > 
> > Also, if you're ignoring the I/O interrupts pending in the ipm, you may
> > end up delivering interrupts with a lower priority (higher isc) first.
> > I'm not sure that's what we want.
> > 
> > But maybe I'm just missing another bit of the code that makes this
> > safe. Can you elaborate a bit?  
> 
> 
> Function kvm_s390_deliver_pending_interrupts() is called in the 
> beginning of vcpu_pre_run() and we are about to enter the SIE anyway. 
> SIE will also grant the right ISC priority for adapter interruptions.
> 
> I basically want to avoid that a GISA that is part of the alert list
> will get its IPM bits cleared outside the GIB alert interruption 
> context. process_gib_alert_list()

Unfinished sentence?

Anyway, I think the patch description would benefit from adding a
sentence or two describing possible performance benefits and ease of
alert handling when the gib is introduced (?)
Michael Mueller Nov. 21, 2018, 3:58 p.m. UTC | #8
On 21.11.18 13:19, Cornelia Huck wrote:
> On Wed, 21 Nov 2018 09:30:34 +0100
> Michael Mueller <mimu@linux.ibm.com> wrote:
> 
>> On 20.11.18 12:33, Cornelia Huck wrote:
>>> On Mon, 19 Nov 2018 18:25:25 +0100
>>> Michael Mueller <mimu@linux.ibm.com> wrote:
>>>    
>>>> Do not call __deliver_io() for adapter interruptions already
>>>> pending in the IPM. That is a double effort. They will
>>>> be processed as soon the vcpu control is given to SIE.
>>>>
>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
>>>> ---
>>>>    arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++-------------------------
>>>>    1 file changed, 25 insertions(+), 29 deletions(-)
>>>
>>> I think this patch does what it says on the tin, but I'm a bit lost as
>>> to the why. (Might make more sense with the gib.)
>>>
>>> Currently, we are trying to process any I/O interrupts, even if we'd
>>> get them delivered via the gisa, when we're out of the SIE anyway.
>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it
>>> also prevented performance problems when the vcpu did not go back into
>>> the SIE immediately (it even may exit to userspace).
>>>
>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you may
>>> end up delivering interrupts with a lower priority (higher isc) first.
>>> I'm not sure that's what we want.
>>>
>>> But maybe I'm just missing another bit of the code that makes this
>>> safe. Can you elaborate a bit?
>>
>>
>> Function kvm_s390_deliver_pending_interrupts() is called in the
>> beginning of vcpu_pre_run() and we are about to enter the SIE anyway.
>> SIE will also grant the right ISC priority for adapter interruptions.
>>
>> I basically want to avoid that a GISA that is part of the alert list
>> will get its IPM bits cleared outside the GIB alert interruption
>> context. process_gib_alert_list()
> 
> Unfinished sentence?
> 
> Anyway, I think the patch description would benefit from adding a
> sentence or two describing possible performance benefits and ease of
> alert handling when the gib is introduced (?)

The patch is independent from the GIB related code. It shall be
active with the introduction of the GISA already. Yes, the GIB code
will benefit from this change as well but I don't like the forward 
reference to upcoming patches if not really required.

I will update the commit message to:

Do not call __deliver_io() for adapter interruptions already
pending in the IPM. That is a double effort. They will
be processed as soon the vcpu control is given to SIE or by
another SIE instance of the guest currently running.

>
Halil Pasic Nov. 21, 2018, 9:05 p.m. UTC | #9
On Tue, 20 Nov 2018 13:00:03 +0100
David Hildenbrand <david@redhat.com> wrote:

> On 20.11.18 12:33, Cornelia Huck wrote:
> > On Mon, 19 Nov 2018 18:25:25 +0100
> > Michael Mueller <mimu@linux.ibm.com> wrote:
> > 
> >> Do not call __deliver_io() for adapter interruptions already
> >> pending in the IPM. That is a double effort. They will
> >> be processed as soon the vcpu control is given to SIE.
> >>
> >> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
> >> ---
> >>  arch/s390/kvm/interrupt.c | 54
> >> ++++++++++++++++++++++------------------------- 1 file changed, 25
> >> insertions(+), 29 deletions(-)
> > 
> > I think this patch does what it says on the tin, but I'm a bit lost
> > as to the why. (Might make more sense with the gib.)
> > 
> > Currently, we are trying to process any I/O interrupts, even if we'd
> > get them delivered via the gisa, when we're out of the SIE anyway.
> > IIRC, while this looks a bit like a belt-and-suspenders approach, it
> > also prevented performance problems when the vcpu did not go back
> > into the SIE immediately (it even may exit to userspace).
> > 
> > Also, if you're ignoring the I/O interrupts pending in the ipm, you
> > may end up delivering interrupts with a lower priority (higher isc)
> > first. I'm not sure that's what we want.
> > 
> > But maybe I'm just missing another bit of the code that makes this
> > safe. Can you elaborate a bit?
> > 
> 
> For interrupt priorities to work at least somewhat predictable, we
> should always try to inject all interrupts even if the HW would be
> doing it for us. In the order of priority.
> 
> But we don't have the same thing for external calls injected via SCA.
> I remember that I once had a patch lying around a couple of years ago
> to fix that ... it went missing :)
> 
> IO interrupt almost have lowest priority, so letting the HW inject
> them could be problematic when mixing IO interrupt priorities between
> SW and HW injected ones (hat Conny described).
> 
> There are other corner cases if a e.g. a RESTART interrupt is pending
> for that CPU. We would deliver eventually the RESTART interrupt before
> delivering the IO interrupt, which would be wrong.
> 

I do share David's concern. Could somebody try to explain why this
RESTART scenario David described is not actually a problem -- AFAIU it
is a problem.

Regards,
Halil
Michael Mueller Nov. 22, 2018, 11:21 a.m. UTC | #10
On 21.11.18 22:05, Halil Pasic wrote:
> On Tue, 20 Nov 2018 13:00:03 +0100
> David Hildenbrand <david@redhat.com> wrote:
>
>> On 20.11.18 12:33, Cornelia Huck wrote:
>>> On Mon, 19 Nov 2018 18:25:25 +0100
>>> Michael Mueller <mimu@linux.ibm.com> wrote:
>>>
>>>> Do not call __deliver_io() for adapter interruptions already
>>>> pending in the IPM. That is a double effort. They will
>>>> be processed as soon the vcpu control is given to SIE.
>>>>
>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
>>>> ---
>>>>   arch/s390/kvm/interrupt.c | 54
>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25
>>>> insertions(+), 29 deletions(-)
>>> I think this patch does what it says on the tin, but I'm a bit lost
>>> as to the why. (Might make more sense with the gib.)
>>>
>>> Currently, we are trying to process any I/O interrupts, even if we'd
>>> get them delivered via the gisa, when we're out of the SIE anyway.
>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it
>>> also prevented performance problems when the vcpu did not go back
>>> into the SIE immediately (it even may exit to userspace).
>>>
>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you
>>> may end up delivering interrupts with a lower priority (higher isc)
>>> first. I'm not sure that's what we want.
>>>
>>> But maybe I'm just missing another bit of the code that makes this
>>> safe. Can you elaborate a bit?
>>>
>> For interrupt priorities to work at least somewhat predictable, we
>> should always try to inject all interrupts even if the HW would be
>> doing it for us. In the order of priority.
>>
>> But we don't have the same thing for external calls injected via SCA.
>> I remember that I once had a patch lying around a couple of years ago
>> to fix that ... it went missing :)
>>
>> IO interrupt almost have lowest priority, so letting the HW inject
>> them could be problematic when mixing IO interrupt priorities between
>> SW and HW injected ones (hat Conny described).
>>
>> There are other corner cases if a e.g. a RESTART interrupt is pending
>> for that CPU. We would deliver eventually the RESTART interrupt before
>> delivering the IO interrupt, which would be wrong.
>>
> I do share David's concern. Could somebody try to explain why this
> RESTART scenario David described is not actually a problem -- AFAIU it
> is a problem.
>
> Regards,
> Halil
>
Before I start arguing why this is *not* a problem I ask you both why 
you consider
this being a problem. We are talking about the CPU restart  here, right?
David Hildenbrand Nov. 22, 2018, 4:38 p.m. UTC | #11
On 22.11.18 12:21, Michael Mueller wrote:
> 
> 
> On 21.11.18 22:05, Halil Pasic wrote:
>> On Tue, 20 Nov 2018 13:00:03 +0100
>> David Hildenbrand <david@redhat.com> wrote:
>>
>>> On 20.11.18 12:33, Cornelia Huck wrote:
>>>> On Mon, 19 Nov 2018 18:25:25 +0100
>>>> Michael Mueller <mimu@linux.ibm.com> wrote:
>>>>
>>>>> Do not call __deliver_io() for adapter interruptions already
>>>>> pending in the IPM. That is a double effort. They will
>>>>> be processed as soon the vcpu control is given to SIE.
>>>>>
>>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
>>>>> ---
>>>>>   arch/s390/kvm/interrupt.c | 54
>>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25
>>>>> insertions(+), 29 deletions(-)
>>>> I think this patch does what it says on the tin, but I'm a bit lost
>>>> as to the why. (Might make more sense with the gib.)
>>>>
>>>> Currently, we are trying to process any I/O interrupts, even if we'd
>>>> get them delivered via the gisa, when we're out of the SIE anyway.
>>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it
>>>> also prevented performance problems when the vcpu did not go back
>>>> into the SIE immediately (it even may exit to userspace).
>>>>
>>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you
>>>> may end up delivering interrupts with a lower priority (higher isc)
>>>> first. I'm not sure that's what we want.
>>>>
>>>> But maybe I'm just missing another bit of the code that makes this
>>>> safe. Can you elaborate a bit?
>>>>
>>> For interrupt priorities to work at least somewhat predictable, we
>>> should always try to inject all interrupts even if the HW would be
>>> doing it for us. In the order of priority.
>>>
>>> But we don't have the same thing for external calls injected via SCA.
>>> I remember that I once had a patch lying around a couple of years ago
>>> to fix that ... it went missing :)
>>>
>>> IO interrupt almost have lowest priority, so letting the HW inject
>>> them could be problematic when mixing IO interrupt priorities between
>>> SW and HW injected ones (hat Conny described).
>>>
>>> There are other corner cases if a e.g. a RESTART interrupt is pending
>>> for that CPU. We would deliver eventually the RESTART interrupt before
>>> delivering the IO interrupt, which would be wrong.
>>>
>> I do share David's concern. Could somebody try to explain why this
>> RESTART scenario David described is not actually a problem -- AFAIU it
>> is a problem.
>>
>> Regards,
>> Halil
>>
> Before I start arguing why this is *not* a problem I ask you both why 
> you consider
> this being a problem. We are talking about the CPU restart  here, right?
> 
> 

When sending a restart interrupt to a running CPU (e.g. system_reset) an
IO interrupt might remain pending and not delivered.

One could make a guess how bad that is (depending on the type of guest
and use case), however it is guest observable difference to what is
documented in the PoP. Restart interrupt has (almost) lowest priority.
Michael Mueller Nov. 22, 2018, 5:02 p.m. UTC | #12
On 22.11.18 17:38, David Hildenbrand wrote:
> On 22.11.18 12:21, Michael Mueller wrote:
>>
>> On 21.11.18 22:05, Halil Pasic wrote:
>>> On Tue, 20 Nov 2018 13:00:03 +0100
>>> David Hildenbrand <david@redhat.com> wrote:
>>>
>>>> On 20.11.18 12:33, Cornelia Huck wrote:
>>>>> On Mon, 19 Nov 2018 18:25:25 +0100
>>>>> Michael Mueller <mimu@linux.ibm.com> wrote:
>>>>>
>>>>>> Do not call __deliver_io() for adapter interruptions already
>>>>>> pending in the IPM. That is a double effort. They will
>>>>>> be processed as soon the vcpu control is given to SIE.
>>>>>>
>>>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
>>>>>> ---
>>>>>>    arch/s390/kvm/interrupt.c | 54
>>>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25
>>>>>> insertions(+), 29 deletions(-)
>>>>> I think this patch does what it says on the tin, but I'm a bit lost
>>>>> as to the why. (Might make more sense with the gib.)
>>>>>
>>>>> Currently, we are trying to process any I/O interrupts, even if we'd
>>>>> get them delivered via the gisa, when we're out of the SIE anyway.
>>>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it
>>>>> also prevented performance problems when the vcpu did not go back
>>>>> into the SIE immediately (it even may exit to userspace).
>>>>>
>>>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you
>>>>> may end up delivering interrupts with a lower priority (higher isc)
>>>>> first. I'm not sure that's what we want.
>>>>>
>>>>> But maybe I'm just missing another bit of the code that makes this
>>>>> safe. Can you elaborate a bit?
>>>>>
>>>> For interrupt priorities to work at least somewhat predictable, we
>>>> should always try to inject all interrupts even if the HW would be
>>>> doing it for us. In the order of priority.
>>>>
>>>> But we don't have the same thing for external calls injected via SCA.
>>>> I remember that I once had a patch lying around a couple of years ago
>>>> to fix that ... it went missing :)
>>>>
>>>> IO interrupt almost have lowest priority, so letting the HW inject
>>>> them could be problematic when mixing IO interrupt priorities between
>>>> SW and HW injected ones (hat Conny described).
>>>>
>>>> There are other corner cases if a e.g. a RESTART interrupt is pending
>>>> for that CPU. We would deliver eventually the RESTART interrupt before
>>>> delivering the IO interrupt, which would be wrong.
>>>>
>>> I do share David's concern. Could somebody try to explain why this
>>> RESTART scenario David described is not actually a problem -- AFAIU it
>>> is a problem.
>>>
>>> Regards,
>>> Halil
>>>
>> Before I start arguing why this is *not* a problem I ask you both why
>> you consider
>> this being a problem. We are talking about the CPU restart  here, right?
>>
>>
> When sending a restart interrupt to a running CPU (e.g. system_reset) an
> IO interrupt might remain pending and not delivered.
Thanks for your answer David.

The whole GISA is cleared as well in this case. So no indication of any 
unprocessed
adapter interruption survives the system reset.
>
> One could make a guess how bad that is (depending on the type of guest
> and use case), however it is guest observable difference to what is
> documented in the PoP. Restart interrupt has (almost) lowest priority.
>
Independent of this patch I made a minimal change in the 
process_gib_alert_list()
routine that will recognizes if the the interruption has already been 
delivered by
kvm_s390_deliver_pending_interrupts() and in that case not dispatch an 
idle vcpu
for that guest.

That means I will pull back this patch from the series with the next 
release.
Pierre Morel Nov. 22, 2018, 5:33 p.m. UTC | #13
On 22/11/2018 17:38, David Hildenbrand wrote:
> On 22.11.18 12:21, Michael Mueller wrote:
>>
>>
>> On 21.11.18 22:05, Halil Pasic wrote:
>>> On Tue, 20 Nov 2018 13:00:03 +0100
>>> David Hildenbrand <david@redhat.com> wrote:
>>>
>>>> On 20.11.18 12:33, Cornelia Huck wrote:
>>>>> On Mon, 19 Nov 2018 18:25:25 +0100
>>>>> Michael Mueller <mimu@linux.ibm.com> wrote:
>>>>>
>>>>>> Do not call __deliver_io() for adapter interruptions already
>>>>>> pending in the IPM. That is a double effort. They will
>>>>>> be processed as soon the vcpu control is given to SIE.
>>>>>>
>>>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
>>>>>> ---
>>>>>>    arch/s390/kvm/interrupt.c | 54
>>>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25
>>>>>> insertions(+), 29 deletions(-)
>>>>> I think this patch does what it says on the tin, but I'm a bit lost
>>>>> as to the why. (Might make more sense with the gib.)
>>>>>
>>>>> Currently, we are trying to process any I/O interrupts, even if we'd
>>>>> get them delivered via the gisa, when we're out of the SIE anyway.
>>>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it
>>>>> also prevented performance problems when the vcpu did not go back
>>>>> into the SIE immediately (it even may exit to userspace).
>>>>>
>>>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you
>>>>> may end up delivering interrupts with a lower priority (higher isc)
>>>>> first. I'm not sure that's what we want.
>>>>>
>>>>> But maybe I'm just missing another bit of the code that makes this
>>>>> safe. Can you elaborate a bit?
>>>>>
>>>> For interrupt priorities to work at least somewhat predictable, we
>>>> should always try to inject all interrupts even if the HW would be
>>>> doing it for us. In the order of priority.
>>>>
>>>> But we don't have the same thing for external calls injected via SCA.
>>>> I remember that I once had a patch lying around a couple of years ago
>>>> to fix that ... it went missing :)
>>>>
>>>> IO interrupt almost have lowest priority, so letting the HW inject
>>>> them could be problematic when mixing IO interrupt priorities between
>>>> SW and HW injected ones (hat Conny described).
>>>>
>>>> There are other corner cases if a e.g. a RESTART interrupt is pending
>>>> for that CPU. We would deliver eventually the RESTART interrupt before
>>>> delivering the IO interrupt, which would be wrong.
>>>>
>>> I do share David's concern. Could somebody try to explain why this
>>> RESTART scenario David described is not actually a problem -- AFAIU it
>>> is a problem.
>>>
>>> Regards,
>>> Halil
>>>
>> Before I start arguing why this is *not* a problem I ask you both why
>> you consider
>> this being a problem. We are talking about the CPU restart  here, right?
>>
>>
> 
> When sending a restart interrupt to a running CPU (e.g. system_reset) an
> IO interrupt might remain pending and not delivered.
> 
> One could make a guess how bad that is (depending on the type of guest
> and use case), however it is guest observable difference to what is
> documented in the PoP. Restart interrupt has (almost) lowest priority.
> 

I do not see the priority as a problem.
RESET and IO IRQ are asynchronous from each other anyway.

But there I see another issue that your question point.

Generally, on system RESET all peripheral also get a RESET and all 
interrupt are cleared.
This is the case for AP VFIO devices.
This is the case for local interrupts.

I miss the GISA reset code to avoid that the adapter IO IRQ is cleared 
on RESET.

May be I did not look at the right place.

Regards,
Pierre
Michael Mueller Nov. 23, 2018, 8:37 a.m. UTC | #14
On 22.11.18 18:33, Pierre Morel wrote:
> On 22/11/2018 17:38, David Hildenbrand wrote:
>> On 22.11.18 12:21, Michael Mueller wrote:
>>>
>>>
>>> On 21.11.18 22:05, Halil Pasic wrote:
>>>> On Tue, 20 Nov 2018 13:00:03 +0100
>>>> David Hildenbrand <david@redhat.com> wrote:
>>>>
>>>>> On 20.11.18 12:33, Cornelia Huck wrote:
>>>>>> On Mon, 19 Nov 2018 18:25:25 +0100
>>>>>> Michael Mueller <mimu@linux.ibm.com> wrote:
>>>>>>
>>>>>>> Do not call __deliver_io() for adapter interruptions already
>>>>>>> pending in the IPM. That is a double effort. They will
>>>>>>> be processed as soon the vcpu control is given to SIE.
>>>>>>>
>>>>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
>>>>>>> ---
>>>>>>>    arch/s390/kvm/interrupt.c | 54
>>>>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25
>>>>>>> insertions(+), 29 deletions(-)
>>>>>> I think this patch does what it says on the tin, but I'm a bit lost
>>>>>> as to the why. (Might make more sense with the gib.)
>>>>>>
>>>>>> Currently, we are trying to process any I/O interrupts, even if we'd
>>>>>> get them delivered via the gisa, when we're out of the SIE anyway.
>>>>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it
>>>>>> also prevented performance problems when the vcpu did not go back
>>>>>> into the SIE immediately (it even may exit to userspace).
>>>>>>
>>>>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you
>>>>>> may end up delivering interrupts with a lower priority (higher isc)
>>>>>> first. I'm not sure that's what we want.
>>>>>>
>>>>>> But maybe I'm just missing another bit of the code that makes this
>>>>>> safe. Can you elaborate a bit?
>>>>>>
>>>>> For interrupt priorities to work at least somewhat predictable, we
>>>>> should always try to inject all interrupts even if the HW would be
>>>>> doing it for us. In the order of priority.
>>>>>
>>>>> But we don't have the same thing for external calls injected via SCA.
>>>>> I remember that I once had a patch lying around a couple of years ago
>>>>> to fix that ... it went missing :)
>>>>>
>>>>> IO interrupt almost have lowest priority, so letting the HW inject
>>>>> them could be problematic when mixing IO interrupt priorities between
>>>>> SW and HW injected ones (hat Conny described).
>>>>>
>>>>> There are other corner cases if a e.g. a RESTART interrupt is pending
>>>>> for that CPU. We would deliver eventually the RESTART interrupt before
>>>>> delivering the IO interrupt, which would be wrong.
>>>>>
>>>> I do share David's concern. Could somebody try to explain why this
>>>> RESTART scenario David described is not actually a problem -- AFAIU it
>>>> is a problem.
>>>>
>>>> Regards,
>>>> Halil
>>>>
>>> Before I start arguing why this is *not* a problem I ask you both why
>>> you consider
>>> this being a problem. We are talking about the CPU restart  here, right?
>>>
>>>
>>
>> When sending a restart interrupt to a running CPU (e.g. system_reset) an
>> IO interrupt might remain pending and not delivered.
>>
>> One could make a guess how bad that is (depending on the type of guest
>> and use case), however it is guest observable difference to what is
>> documented in the PoP. Restart interrupt has (almost) lowest priority.
>>
> 
> I do not see the priority as a problem.
> RESET and IO IRQ are asynchronous from each other anyway.
> 
> But there I see another issue that your question point.
> 
> Generally, on system RESET all peripheral also get a RESET and all 
> interrupt are cleared.
> This is the case for AP VFIO devices.
> This is the case for local interrupts.
> 
> I miss the GISA reset code to avoid that the adapter IO IRQ is cleared 
> on RESET.
> 
> May be I did not look at the right place.

See kvm_s390_clear_float_irqs() and how it as called.

> 
> Regards,
> Pierre
> 
>
David Hildenbrand Nov. 23, 2018, 8:50 a.m. UTC | #15
On 23.11.18 09:37, Michael Mueller wrote:
> 
> 
> On 22.11.18 18:33, Pierre Morel wrote:
>> On 22/11/2018 17:38, David Hildenbrand wrote:
>>> On 22.11.18 12:21, Michael Mueller wrote:
>>>>
>>>>
>>>> On 21.11.18 22:05, Halil Pasic wrote:
>>>>> On Tue, 20 Nov 2018 13:00:03 +0100
>>>>> David Hildenbrand <david@redhat.com> wrote:
>>>>>
>>>>>> On 20.11.18 12:33, Cornelia Huck wrote:
>>>>>>> On Mon, 19 Nov 2018 18:25:25 +0100
>>>>>>> Michael Mueller <mimu@linux.ibm.com> wrote:
>>>>>>>
>>>>>>>> Do not call __deliver_io() for adapter interruptions already
>>>>>>>> pending in the IPM. That is a double effort. They will
>>>>>>>> be processed as soon the vcpu control is given to SIE.
>>>>>>>>
>>>>>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
>>>>>>>> ---
>>>>>>>>    arch/s390/kvm/interrupt.c | 54
>>>>>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25
>>>>>>>> insertions(+), 29 deletions(-)
>>>>>>> I think this patch does what it says on the tin, but I'm a bit lost
>>>>>>> as to the why. (Might make more sense with the gib.)
>>>>>>>
>>>>>>> Currently, we are trying to process any I/O interrupts, even if we'd
>>>>>>> get them delivered via the gisa, when we're out of the SIE anyway.
>>>>>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it
>>>>>>> also prevented performance problems when the vcpu did not go back
>>>>>>> into the SIE immediately (it even may exit to userspace).
>>>>>>>
>>>>>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you
>>>>>>> may end up delivering interrupts with a lower priority (higher isc)
>>>>>>> first. I'm not sure that's what we want.
>>>>>>>
>>>>>>> But maybe I'm just missing another bit of the code that makes this
>>>>>>> safe. Can you elaborate a bit?
>>>>>>>
>>>>>> For interrupt priorities to work at least somewhat predictable, we
>>>>>> should always try to inject all interrupts even if the HW would be
>>>>>> doing it for us. In the order of priority.
>>>>>>
>>>>>> But we don't have the same thing for external calls injected via SCA.
>>>>>> I remember that I once had a patch lying around a couple of years ago
>>>>>> to fix that ... it went missing :)
>>>>>>
>>>>>> IO interrupt almost have lowest priority, so letting the HW inject
>>>>>> them could be problematic when mixing IO interrupt priorities between
>>>>>> SW and HW injected ones (hat Conny described).
>>>>>>
>>>>>> There are other corner cases if a e.g. a RESTART interrupt is pending
>>>>>> for that CPU. We would deliver eventually the RESTART interrupt before
>>>>>> delivering the IO interrupt, which would be wrong.
>>>>>>
>>>>> I do share David's concern. Could somebody try to explain why this
>>>>> RESTART scenario David described is not actually a problem -- AFAIU it
>>>>> is a problem.
>>>>>
>>>>> Regards,
>>>>> Halil
>>>>>
>>>> Before I start arguing why this is *not* a problem I ask you both why
>>>> you consider
>>>> this being a problem. We are talking about the CPU restart  here, right?
>>>>
>>>>
>>>
>>> When sending a restart interrupt to a running CPU (e.g. system_reset) an
>>> IO interrupt might remain pending and not delivered.
>>>
>>> One could make a guess how bad that is (depending on the type of guest
>>> and use case), however it is guest observable difference to what is
>>> documented in the PoP. Restart interrupt has (almost) lowest priority.
>>>
>>
>> I do not see the priority as a problem.
>> RESET and IO IRQ are asynchronous from each other anyway.
>>
>> But there I see another issue that your question point.
>>
>> Generally, on system RESET all peripheral also get a RESET and all 
>> interrupt are cleared.
>> This is the case for AP VFIO devices.
>> This is the case for local interrupts.

Please don't confuse SIGP RESTART (e.g. RESTART interrupt any CPU can
happily send to another CPU) with system resets. For ordinary SIGP
RESTARTs, no floating interrupts are cleared.

However, the question is if the asynchronous nature of IO interrupts
indeed give no guarantees in respect to priorities against RESTART
interrupts (IOW, will it be observerable). I'll have to think about this
a little bit longer :)

>>
>> I miss the GISA reset code to avoid that the adapter IO IRQ is cleared 
>> on RESET.
>>
>> May be I did not look at the right place.
> 
> See kvm_s390_clear_float_irqs() and how it as called.
> 
>>
>> Regards,
>> Pierre
>>
>>
>
Pierre Morel Nov. 23, 2018, 9 a.m. UTC | #16
On 23/11/2018 09:50, David Hildenbrand wrote:
> On 23.11.18 09:37, Michael Mueller wrote:
>>
>>
>> On 22.11.18 18:33, Pierre Morel wrote:
>>> On 22/11/2018 17:38, David Hildenbrand wrote:
>>>> On 22.11.18 12:21, Michael Mueller wrote:
>>>>>
>>>>>
>>>>> On 21.11.18 22:05, Halil Pasic wrote:
>>>>>> On Tue, 20 Nov 2018 13:00:03 +0100
>>>>>> David Hildenbrand <david@redhat.com> wrote:
>>>>>>
>>>>>>> On 20.11.18 12:33, Cornelia Huck wrote:
>>>>>>>> On Mon, 19 Nov 2018 18:25:25 +0100
>>>>>>>> Michael Mueller <mimu@linux.ibm.com> wrote:
>>>>>>>>
>>>>>>>>> Do not call __deliver_io() for adapter interruptions already
>>>>>>>>> pending in the IPM. That is a double effort. They will
>>>>>>>>> be processed as soon the vcpu control is given to SIE.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
>>>>>>>>> ---
>>>>>>>>>     arch/s390/kvm/interrupt.c | 54
>>>>>>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25
>>>>>>>>> insertions(+), 29 deletions(-)
>>>>>>>> I think this patch does what it says on the tin, but I'm a bit lost
>>>>>>>> as to the why. (Might make more sense with the gib.)
>>>>>>>>
>>>>>>>> Currently, we are trying to process any I/O interrupts, even if we'd
>>>>>>>> get them delivered via the gisa, when we're out of the SIE anyway.
>>>>>>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it
>>>>>>>> also prevented performance problems when the vcpu did not go back
>>>>>>>> into the SIE immediately (it even may exit to userspace).
>>>>>>>>
>>>>>>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you
>>>>>>>> may end up delivering interrupts with a lower priority (higher isc)
>>>>>>>> first. I'm not sure that's what we want.
>>>>>>>>
>>>>>>>> But maybe I'm just missing another bit of the code that makes this
>>>>>>>> safe. Can you elaborate a bit?
>>>>>>>>
>>>>>>> For interrupt priorities to work at least somewhat predictable, we
>>>>>>> should always try to inject all interrupts even if the HW would be
>>>>>>> doing it for us. In the order of priority.
>>>>>>>
>>>>>>> But we don't have the same thing for external calls injected via SCA.
>>>>>>> I remember that I once had a patch lying around a couple of years ago
>>>>>>> to fix that ... it went missing :)
>>>>>>>
>>>>>>> IO interrupt almost have lowest priority, so letting the HW inject
>>>>>>> them could be problematic when mixing IO interrupt priorities between
>>>>>>> SW and HW injected ones (hat Conny described).
>>>>>>>
>>>>>>> There are other corner cases if a e.g. a RESTART interrupt is pending
>>>>>>> for that CPU. We would deliver eventually the RESTART interrupt before
>>>>>>> delivering the IO interrupt, which would be wrong.
>>>>>>>
>>>>>> I do share David's concern. Could somebody try to explain why this
>>>>>> RESTART scenario David described is not actually a problem -- AFAIU it
>>>>>> is a problem.
>>>>>>
>>>>>> Regards,
>>>>>> Halil
>>>>>>
>>>>> Before I start arguing why this is *not* a problem I ask you both why
>>>>> you consider
>>>>> this being a problem. We are talking about the CPU restart  here, right?
>>>>>
>>>>>
>>>>
>>>> When sending a restart interrupt to a running CPU (e.g. system_reset) an
>>>> IO interrupt might remain pending and not delivered.
>>>>
>>>> One could make a guess how bad that is (depending on the type of guest
>>>> and use case), however it is guest observable difference to what is
>>>> documented in the PoP. Restart interrupt has (almost) lowest priority.
>>>>
>>>
>>> I do not see the priority as a problem.
>>> RESET and IO IRQ are asynchronous from each other anyway.
>>>
>>> But there I see another issue that your question point.
>>>
>>> Generally, on system RESET all peripheral also get a RESET and all
>>> interrupt are cleared.
>>> This is the case for AP VFIO devices.
>>> This is the case for local interrupts.
> 
> Please don't confuse SIGP RESTART (e.g. RESTART interrupt any CPU can
> happily send to another CPU) with system resets.

I do not, you were speaking about system_reset.
And I answered on this.


> For ordinary SIGP
> RESTARTs, no floating interrupts are cleared.
> 
> However, the question is if the asynchronous nature of IO interrupts
> indeed give no guarantees in respect to priorities against RESTART
> interrupts (IOW, will it be observerable). I'll have to think about this
> a little bit longer :)
> 
>>>
>>> I miss the GISA reset code to avoid that the adapter IO IRQ is cleared
>>> on RESET.
>>>
>>> May be I did not look at the right place.
>>
>> See kvm_s390_clear_float_irqs() and how it as called.
>>
>>>
>>> Regards,
>>> Pierre
>>>
>>>
>>
> 
>
Pierre Morel Nov. 23, 2018, 9:25 a.m. UTC | #17
On 23/11/2018 09:37, Michael Mueller wrote:
> 
> 
> On 22.11.18 18:33, Pierre Morel wrote:
>> On 22/11/2018 17:38, David Hildenbrand wrote:
>>> On 22.11.18 12:21, Michael Mueller wrote:
>>>>
>>>>
>>>> On 21.11.18 22:05, Halil Pasic wrote:
>>>>> On Tue, 20 Nov 2018 13:00:03 +0100
>>>>> David Hildenbrand <david@redhat.com> wrote:
>>>>>
>>>>>> On 20.11.18 12:33, Cornelia Huck wrote:
>>>>>>> On Mon, 19 Nov 2018 18:25:25 +0100
>>>>>>> Michael Mueller <mimu@linux.ibm.com> wrote:
>>>>>>>
>>>>>>>> Do not call __deliver_io() for adapter interruptions already
>>>>>>>> pending in the IPM. That is a double effort. They will
>>>>>>>> be processed as soon the vcpu control is given to SIE.
>>>>>>>>
>>>>>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
>>>>>>>> ---
>>>>>>>>    arch/s390/kvm/interrupt.c | 54
>>>>>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25
>>>>>>>> insertions(+), 29 deletions(-)
>>>>>>> I think this patch does what it says on the tin, but I'm a bit lost
>>>>>>> as to the why. (Might make more sense with the gib.)
>>>>>>>
>>>>>>> Currently, we are trying to process any I/O interrupts, even if we'd
>>>>>>> get them delivered via the gisa, when we're out of the SIE anyway.
>>>>>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it
>>>>>>> also prevented performance problems when the vcpu did not go back
>>>>>>> into the SIE immediately (it even may exit to userspace).
>>>>>>>
>>>>>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you
>>>>>>> may end up delivering interrupts with a lower priority (higher isc)
>>>>>>> first. I'm not sure that's what we want.
>>>>>>>
>>>>>>> But maybe I'm just missing another bit of the code that makes this
>>>>>>> safe. Can you elaborate a bit?
>>>>>>>
>>>>>> For interrupt priorities to work at least somewhat predictable, we
>>>>>> should always try to inject all interrupts even if the HW would be
>>>>>> doing it for us. In the order of priority.
>>>>>>
>>>>>> But we don't have the same thing for external calls injected via SCA.
>>>>>> I remember that I once had a patch lying around a couple of years ago
>>>>>> to fix that ... it went missing :)
>>>>>>
>>>>>> IO interrupt almost have lowest priority, so letting the HW inject
>>>>>> them could be problematic when mixing IO interrupt priorities between
>>>>>> SW and HW injected ones (hat Conny described).
>>>>>>
>>>>>> There are other corner cases if a e.g. a RESTART interrupt is pending
>>>>>> for that CPU. We would deliver eventually the RESTART interrupt 
>>>>>> before
>>>>>> delivering the IO interrupt, which would be wrong.
>>>>>>
>>>>> I do share David's concern. Could somebody try to explain why this
>>>>> RESTART scenario David described is not actually a problem -- AFAIU it
>>>>> is a problem.
>>>>>
>>>>> Regards,
>>>>> Halil
>>>>>
>>>> Before I start arguing why this is *not* a problem I ask you both why
>>>> you consider
>>>> this being a problem. We are talking about the CPU restart  here, 
>>>> right?
>>>>
>>>>
>>>
>>> When sending a restart interrupt to a running CPU (e.g. system_reset) an
>>> IO interrupt might remain pending and not delivered.
>>>
>>> One could make a guess how bad that is (depending on the type of guest
>>> and use case), however it is guest observable difference to what is
>>> documented in the PoP. Restart interrupt has (almost) lowest priority.
>>>
>>
>> I do not see the priority as a problem.
>> RESET and IO IRQ are asynchronous from each other anyway.
>>
>> But there I see another issue that your question point.
>>
>> Generally, on system RESET all peripheral also get a RESET and all 
>> interrupt are cleared.
>> This is the case for AP VFIO devices.
>> This is the case for local interrupts.
>>
>> I miss the GISA reset code to avoid that the adapter IO IRQ is cleared 
>> on RESET.
>>
>> May be I did not look at the right place.
> 
> See kvm_s390_clear_float_irqs() and how it as called.

right, thanks.

> 
>>
>> Regards,
>> Pierre
>>
>>
>
David Hildenbrand Nov. 23, 2018, 10:23 a.m. UTC | #18
On 23.11.18 10:00, Pierre Morel wrote:
> On 23/11/2018 09:50, David Hildenbrand wrote:
>> On 23.11.18 09:37, Michael Mueller wrote:
>>>
>>>
>>> On 22.11.18 18:33, Pierre Morel wrote:
>>>> On 22/11/2018 17:38, David Hildenbrand wrote:
>>>>> On 22.11.18 12:21, Michael Mueller wrote:
>>>>>>
>>>>>>
>>>>>> On 21.11.18 22:05, Halil Pasic wrote:
>>>>>>> On Tue, 20 Nov 2018 13:00:03 +0100
>>>>>>> David Hildenbrand <david@redhat.com> wrote:
>>>>>>>
>>>>>>>> On 20.11.18 12:33, Cornelia Huck wrote:
>>>>>>>>> On Mon, 19 Nov 2018 18:25:25 +0100
>>>>>>>>> Michael Mueller <mimu@linux.ibm.com> wrote:
>>>>>>>>>
>>>>>>>>>> Do not call __deliver_io() for adapter interruptions already
>>>>>>>>>> pending in the IPM. That is a double effort. They will
>>>>>>>>>> be processed as soon the vcpu control is given to SIE.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
>>>>>>>>>> ---
>>>>>>>>>>     arch/s390/kvm/interrupt.c | 54
>>>>>>>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25
>>>>>>>>>> insertions(+), 29 deletions(-)
>>>>>>>>> I think this patch does what it says on the tin, but I'm a bit lost
>>>>>>>>> as to the why. (Might make more sense with the gib.)
>>>>>>>>>
>>>>>>>>> Currently, we are trying to process any I/O interrupts, even if we'd
>>>>>>>>> get them delivered via the gisa, when we're out of the SIE anyway.
>>>>>>>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it
>>>>>>>>> also prevented performance problems when the vcpu did not go back
>>>>>>>>> into the SIE immediately (it even may exit to userspace).
>>>>>>>>>
>>>>>>>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you
>>>>>>>>> may end up delivering interrupts with a lower priority (higher isc)
>>>>>>>>> first. I'm not sure that's what we want.
>>>>>>>>>
>>>>>>>>> But maybe I'm just missing another bit of the code that makes this
>>>>>>>>> safe. Can you elaborate a bit?
>>>>>>>>>
>>>>>>>> For interrupt priorities to work at least somewhat predictable, we
>>>>>>>> should always try to inject all interrupts even if the HW would be
>>>>>>>> doing it for us. In the order of priority.
>>>>>>>>
>>>>>>>> But we don't have the same thing for external calls injected via SCA.
>>>>>>>> I remember that I once had a patch lying around a couple of years ago
>>>>>>>> to fix that ... it went missing :)
>>>>>>>>
>>>>>>>> IO interrupt almost have lowest priority, so letting the HW inject
>>>>>>>> them could be problematic when mixing IO interrupt priorities between
>>>>>>>> SW and HW injected ones (hat Conny described).
>>>>>>>>
>>>>>>>> There are other corner cases if a e.g. a RESTART interrupt is pending
>>>>>>>> for that CPU. We would deliver eventually the RESTART interrupt before
>>>>>>>> delivering the IO interrupt, which would be wrong.
>>>>>>>>
>>>>>>> I do share David's concern. Could somebody try to explain why this
>>>>>>> RESTART scenario David described is not actually a problem -- AFAIU it
>>>>>>> is a problem.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Halil
>>>>>>>
>>>>>> Before I start arguing why this is *not* a problem I ask you both why
>>>>>> you consider
>>>>>> this being a problem. We are talking about the CPU restart  here, right?
>>>>>>
>>>>>>
>>>>>
>>>>> When sending a restart interrupt to a running CPU (e.g. system_reset) an
>>>>> IO interrupt might remain pending and not delivered.
>>>>>
>>>>> One could make a guess how bad that is (depending on the type of guest
>>>>> and use case), however it is guest observable difference to what is
>>>>> documented in the PoP. Restart interrupt has (almost) lowest priority.
>>>>>
>>>>
>>>> I do not see the priority as a problem.
>>>> RESET and IO IRQ are asynchronous from each other anyway.
>>>>
>>>> But there I see another issue that your question point.
>>>>
>>>> Generally, on system RESET all peripheral also get a RESET and all
>>>> interrupt are cleared.
>>>> This is the case for AP VFIO devices.
>>>> This is the case for local interrupts.
>>
>> Please don't confuse SIGP RESTART (e.g. RESTART interrupt any CPU can
>> happily send to another CPU) with system resets.
> 
> I do not, you were speaking about system_reset.
> And I answered on this.

I think I was confused. "nmi" uses RESTART interrupts from "external".
"system_reset" does not use RESTART interrupts on the QEMU side. (so the
example I gave was both confusing and wrong :) )

In Linux, restart interrupts are used for multiple purposes (hibernation
restore, bringing up CPUs, kdump).

Not sure if the priority thingy could be a problem there (I guess not,
but I am not sure yet about the implications).
diff mbox series

Patch

diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index fcb55b02990e..1f4c0c7286f7 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -271,14 +271,9 @@  static unsigned long disable_iscs(struct kvm_vcpu *vcpu,
 	return active_mask;
 }
 
-static unsigned long deliverable_irqs(struct kvm_vcpu *vcpu)
+static unsigned long __deliverable_irqs(struct kvm_vcpu *vcpu,
+					unsigned long active_mask)
 {
-	unsigned long active_mask;
-
-	active_mask = pending_irqs(vcpu);
-	if (!active_mask)
-		return 0;
-
 	if (psw_extint_disabled(vcpu))
 		active_mask &= ~IRQ_PEND_EXT_MASK;
 	if (psw_ioint_disabled(vcpu))
@@ -315,6 +310,28 @@  static unsigned long deliverable_irqs(struct kvm_vcpu *vcpu)
 	return active_mask;
 }
 
+static unsigned long deliverable_irqs_no_gisa(struct kvm_vcpu *vcpu)
+{
+	unsigned long active_mask;
+
+	active_mask = pending_irqs_no_gisa(vcpu);
+	if (!active_mask)
+		return 0;
+
+	return __deliverable_irqs(vcpu, active_mask);
+}
+
+static unsigned long deliverable_irqs(struct kvm_vcpu *vcpu)
+{
+	unsigned long active_mask;
+
+	active_mask = pending_irqs(vcpu);
+	if (!active_mask)
+		return 0;
+
+	return __deliverable_irqs(vcpu, active_mask);
+}
+
 static void __set_cpu_idle(struct kvm_vcpu *vcpu)
 {
 	kvm_s390_set_cpuflags(vcpu, CPUSTAT_WAIT);
@@ -957,7 +974,6 @@  static int __must_check __deliver_io(struct kvm_vcpu *vcpu,
 	struct list_head *isc_list;
 	struct kvm_s390_float_interrupt *fi;
 	struct kvm_s390_interrupt_info *inti = NULL;
-	struct kvm_s390_io_info io;
 	u32 isc;
 	int rc = 0;
 
@@ -995,28 +1011,8 @@  static int __must_check __deliver_io(struct kvm_vcpu *vcpu,
 	if (inti) {
 		rc = __do_deliver_io(vcpu, &(inti->io));
 		kfree(inti);
-		goto out;
 	}
 
-	if (vcpu->kvm->arch.gisa &&
-	    kvm_s390_gisa_tac_ipm_gisc(vcpu->kvm->arch.gisa, isc)) {
-		/*
-		 * in case an adapter interrupt was not delivered
-		 * in SIE context KVM will handle the delivery
-		 */
-		VCPU_EVENT(vcpu, 4, "%s isc %u", "deliver: I/O (AI/gisa)", isc);
-		memset(&io, 0, sizeof(io));
-		io.io_int_word = isc_to_int_word(isc);
-		vcpu->stat.deliver_io++;
-		trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id,
-			KVM_S390_INT_IO(1, 0, 0, 0),
-			((__u32)io.subchannel_id << 16) |
-			io.subchannel_nr,
-			((__u64)io.io_int_parm << 32) |
-			io.io_int_word);
-		rc = __do_deliver_io(vcpu, &io);
-	}
-out:
 	return rc;
 }
 
@@ -1205,7 +1201,7 @@  int __must_check kvm_s390_deliver_pending_interrupts(struct kvm_vcpu *vcpu)
 	if (cpu_timer_irq_pending(vcpu))
 		set_bit(IRQ_PEND_EXT_CPU_TIMER, &li->pending_irqs);
 
-	while ((irqs = deliverable_irqs(vcpu)) && !rc) {
+	while ((irqs = deliverable_irqs_no_gisa(vcpu)) && !rc) {
 		/* bits are in the reverse order of interrupt priority */
 		irq_type = find_last_bit(&irqs, IRQ_PEND_COUNT);
 		switch (irq_type) {