Message ID | 20211019175401.3757927-4-pasic@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Series | fixes for __airqs_kick_single_vcpu() |
On 19.10.21 at 19:54, Halil Pasic wrote:
> The idea behind kicked mask is that we should not re-kick a vcpu
> from __airqs_kick_single_vcpu() that is already in the middle of
> being kicked by the same function.
>
> If however the vcpu that was idle before when the idle_mask was
> examined, is not idle any more after the kicked_mask is set, that
> means that we don't need to kick, and that we need to clear the
> bit we just set because we may be beyond the point where it would
> get cleared in the wake-up process. Since the time window is short,
> this is probably more a theoretical than a practical thing: the race
> window is small.
>
> To get things harmonized let us also move the clear from vcpu_pre_run()
> to __unset_cpu_idle().

This part makes sense.

>
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
> ---
>  arch/s390/kvm/interrupt.c | 7 ++++++-
>  arch/s390/kvm/kvm-s390.c  | 2 --
>  2 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index 2245f4b8d362..3c80a2237ef5 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -426,6 +426,7 @@ static void __unset_cpu_idle(struct kvm_vcpu *vcpu)
>  {
>  	kvm_s390_clear_cpuflags(vcpu, CPUSTAT_WAIT);
>  	clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.idle_mask);
> +	clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
>  }
>
>  static void __reset_intercept_indicators(struct kvm_vcpu *vcpu)
> @@ -3064,7 +3065,11 @@ static void __airqs_kick_single_vcpu(struct kvm *kvm, u8 deliverable_mask)
>  			/* lately kicked but not yet running */
>  			if (test_and_set_bit(vcpu_idx, gi->kicked_mask))
>  				return;
> -			kvm_s390_vcpu_wakeup(vcpu);
> +			/* if meanwhile not idle: clear and don't kick */
> +			if (test_bit(vcpu_idx, kvm->arch.idle_mask))
> +				kvm_s390_vcpu_wakeup(vcpu);
> +			else
> +				clear_bit(vcpu_idx, gi->kicked_mask);

I think this is now a bug. We should not return but continue in that case, no?

I think it might be safer to also clear kicked_mask in __set_cpu_idle().
From a CPU's perspective: we have been running and are on our way to become idle.
There is no way that someone kicked us for a wakeup. In other words, as long as we
are running, there is no point in kicking us, but when going idle we should get rid
of the old kicked_mask bit.
Doesn't this cover your scenario?

>  			return;
>  		}
>  	}
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 1c97493d21e1..6b779ef9f5fb 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -4067,8 +4067,6 @@ static int vcpu_pre_run(struct kvm_vcpu *vcpu)
>  		kvm_s390_patch_guest_per_regs(vcpu);
>  	}
>
> -	clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
> -
>  	vcpu->arch.sie_block->icptcode = 0;
>  	cpuflags = atomic_read(&vcpu->arch.sie_block->cpuflags);
>  	VCPU_EVENT(vcpu, 6, "entering sie flags %x", cpuflags);
>
On 19.10.21 at 23:35, Christian Borntraeger wrote:
>
>
> On 19.10.21 at 19:54, Halil Pasic wrote:
>> The idea behind kicked mask is that we should not re-kick a vcpu
>> from __airqs_kick_single_vcpu() that is already in the middle of
>> being kicked by the same function.
>>
>> If however the vcpu that was idle before when the idle_mask was
>> examined, is not idle any more after the kicked_mask is set, that
>> means that we don't need to kick, and that we need to clear the
>> bit we just set because we may be beyond the point where it would
>> get cleared in the wake-up process. Since the time window is short,
>> this is probably more a theoretical than a practical thing: the race
>> window is small.
>>
>> To get things harmonized let us also move the clear from vcpu_pre_run()
>> to __unset_cpu_idle().
>
> This part makes sense.
>>
>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>> Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
>> ---
>>  arch/s390/kvm/interrupt.c | 7 ++++++-
>>  arch/s390/kvm/kvm-s390.c  | 2 --
>>  2 files changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>> index 2245f4b8d362..3c80a2237ef5 100644
>> --- a/arch/s390/kvm/interrupt.c
>> +++ b/arch/s390/kvm/interrupt.c
>> @@ -426,6 +426,7 @@ static void __unset_cpu_idle(struct kvm_vcpu *vcpu)
>>  {
>>  	kvm_s390_clear_cpuflags(vcpu, CPUSTAT_WAIT);
>>  	clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.idle_mask);
>> +	clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
>>  }
>>  static void __reset_intercept_indicators(struct kvm_vcpu *vcpu)
>> @@ -3064,7 +3065,11 @@ static void __airqs_kick_single_vcpu(struct kvm *kvm, u8 deliverable_mask)
>>  			/* lately kicked but not yet running */
>>  			if (test_and_set_bit(vcpu_idx, gi->kicked_mask))
>>  				return;
>> -			kvm_s390_vcpu_wakeup(vcpu);
>> +			/* if meanwhile not idle: clear and don't kick */
>> +			if (test_bit(vcpu_idx, kvm->arch.idle_mask))
>> +				kvm_s390_vcpu_wakeup(vcpu);
>> +			else
>> +				clear_bit(vcpu_idx, gi->kicked_mask);
>
> I think this is now a bug. We should not return but continue in that case, no?

Thinking again about this, it might be OK. If we went from idle to non-idle, we
were likely in SIE and the interrupt should have been delivered. But I would
rather wake up too often than too rarely.

>
> I think it might be safer to also clear kicked_mask in __set_cpu_idle().
> From a CPU's perspective: we have been running and are on our way to become idle.
> There is no way that someone kicked us for a wakeup. In other words, as long as we
> are running, there is no point in kicking us, but when going idle we should get rid
> of the old kicked_mask bit.
> Doesn't this cover your scenario?
>
>
>>  			return;
>>  		}
>>  	}
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 1c97493d21e1..6b779ef9f5fb 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -4067,8 +4067,6 @@ static int vcpu_pre_run(struct kvm_vcpu *vcpu)
>>  		kvm_s390_patch_guest_per_regs(vcpu);
>>  	}
>> -	clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
>> -
>>  	vcpu->arch.sie_block->icptcode = 0;
>>  	cpuflags = atomic_read(&vcpu->arch.sie_block->cpuflags);
>>  	VCPU_EVENT(vcpu, 6, "entering sie flags %x", cpuflags);
>>
On Tue, 19 Oct 2021 23:35:25 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> > @@ -426,6 +426,7 @@ static void __unset_cpu_idle(struct kvm_vcpu *vcpu)
> >  {
> >  	kvm_s390_clear_cpuflags(vcpu, CPUSTAT_WAIT);
> >  	clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.idle_mask);
> > +	clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);

BTW, do you know whether bit-ops are guaranteed to be serialized as seen by
another cpu even when acting on a different byte? I mean, could
kick_single_vcpu() see the clear of the kicked_mask bit but
not see the clear of the idle mask?

If that is not guaranteed, we may need some barriers, or possibly merge the
two bitmasks (idle bit and kick bit alternating) to ensure there is
absolutely no race.

> >  }
> >
> >  static void __reset_intercept_indicators(struct kvm_vcpu *vcpu)
> > @@ -3064,7 +3065,11 @@ static void __airqs_kick_single_vcpu(struct kvm *kvm, u8 deliverable_mask)
> >  			/* lately kicked but not yet running */
> >  			if (test_and_set_bit(vcpu_idx, gi->kicked_mask))
> >  				return;
> > -			kvm_s390_vcpu_wakeup(vcpu);
> > +			/* if meanwhile not idle: clear and don't kick */
> > +			if (test_bit(vcpu_idx, kvm->arch.idle_mask))
> > +				kvm_s390_vcpu_wakeup(vcpu);
> > +			else
> > +				clear_bit(vcpu_idx, gi->kicked_mask);
>
> I think this is now a bug. We should not return but continue in that case, no?
>

I don't think so. The purpose of this function is to kick a *single* vcpu
that can handle *some* of the I/O interrupts indicated by the
deliverable_mask. The deliverable mask predates the check of the
idle_mask. I believe that if we selected a suitable vcpu that was idle, and
before we actually do a wakeup on it we see that it isn't idle any
more, that is as good as, if not better than, performing the wakeup.
A new wakeup() call is pointless: this vcpu either already
got the irqs it can get, or is about to enter SIE soon to do so. We just
saved a pointless call to wakeup().

> > I think it might be safer to also clear kicked_mask in __set_cpu_idle

It would not hurt, but my guess is that kvm_arch_vcpu_runnable() gets called
before we really decide to go to sleep:

void kvm_vcpu_block(struct kvm_vcpu *vcpu)
{
[..]
        for (;;) {
                set_current_state(TASK_INTERRUPTIBLE);

                if (kvm_vcpu_check_block(vcpu) < 0) <=== calls runnable()
                        break;

                waited = true;
                schedule();
        }

> From a CPU's perspective: we have been running and are on our way to become idle.
> There is no way that someone kicked us for a wakeup. In other words, as long as we
> are running, there is no point in kicking us, but when going idle we should get rid
> of the old kicked_mask bit.
> Doesn't this cover your scenario?

In practice probably yes, in theory I don't think so. I hope this is
more of a theoretical problem than a practical one anyway. But let me
discuss the theory anyway.

Assume that an arbitrary amount of time can pass between
1) for_each_set_bit finding the vcpu's bit set in the idle mask
and
2) test_and_set_bit(kicked_mask) returning false (the bit was not set,
and we did set it).
If we choose an absurdly large amount of time, it is possible that the vcpu
is past a whole cycle, an __unset_cpu_idle() followed by a __set_cpu_idle(),
but has not yet reached set_current_state(TASK_INTERRUPTIBLE). If we set the
bit at a suitable place in that window, we may theoretically end up in a
situation where the wakeup is ineffective because the task state didn't
change yet, but the bit gets set. So we end up in a stable state where the
vcpu sleeps and cannot be woken up by a kick. That is, if the clear in
kvm_arch_vcpu_runnable() does not save us...

It could be that this clear alone is sufficient. Before we
really go to sleep we kind of attempt to wake up, and that attempt clears
the bit every time. So the clear in kvm_arch_vcpu_runnable()
may be the only clear we need. Or?

Anyway, the scenario I described is very, very far-fetched I guess, but I
prefer solutions that are theoretically race free over solutions that
are practically race free, if performance does not suffer.

Regards,
Halil
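The interleaving described in the mail above can be replayed step by step. What follows is a deliberately simplified, single-threaded sketch and not kernel code: `struct vcpu_model` and all function names are hypothetical, each mask is reduced to one boolean, and "the wakeup is lost while the task state has not changed yet" is the assumption taken from the mail rather than something verified against the scheduler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical one-vcpu model; each field stands for one bit or state. */
struct vcpu_model {
	bool idle;     /* this vcpu's bit in idle_mask */
	bool kicked;   /* this vcpu's bit in kicked_mask */
	bool sleeping; /* task state already set to TASK_INTERRUPTIBLE */
	bool woken;    /* an effective wakeup reached the task */
};

static void model_set_cpu_idle(struct vcpu_model *v)
{
	v->idle = true;
}

static void model_unset_cpu_idle(struct vcpu_model *v)
{
	v->idle = false;
	v->kicked = false; /* the clear this patch moves here */
}

/* Kick path after the idle bit was already observed. True if wakeup called. */
static bool model_kick(struct vcpu_model *v)
{
	if (v->kicked)          /* test_and_set_bit() found the bit set */
		return false;
	v->kicked = true;
	if (!v->idle) {         /* meanwhile not idle: clear and don't kick */
		v->kicked = false;
		return false;
	}
	if (v->sleeping)        /* assumption: otherwise the wakeup is lost */
		v->woken = true;
	return true;
}

/* Models set_current_state() plus the runnable() check before schedule(). */
static void model_prepare_to_sleep(struct vcpu_model *v, bool clear_in_runnable)
{
	v->sleeping = true;
	if (clear_in_runnable)  /* unconditional clear in runnable() */
		v->kicked = false;
}

/* Replays the scenario; true if a later kick can still reach the vcpu. */
static bool model_replay(bool clear_in_runnable)
{
	struct vcpu_model v = { .idle = true };

	assert(v.idle);                 /* 1) kicker saw the idle bit, stalls */
	model_unset_cpu_idle(&v);       /* vcpu runs a whole cycle ...        */
	model_set_cpu_idle(&v);         /* ... and is idle, not yet sleeping  */
	model_kick(&v);                 /* 2) kicker resumes, wakeup is lost  */
	model_prepare_to_sleep(&v, clear_in_runnable);
	return model_kick(&v);          /* a new interrupt arrives later      */
}
```

In this model `model_replay(false)` leaves the kicked bit set forever, so the later kick is skipped, while `model_replay(true)` lets it through. That matches the guess in the mail that the clear in kvm_arch_vcpu_runnable() may be the one that matters, under the stated assumptions.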
On 19.10.21 19:54, Halil Pasic wrote:
> The idea behind kicked mask is that we should not re-kick a vcpu
> from __airqs_kick_single_vcpu() that is already in the middle of
> being kicked by the same function.
>
> If however the vcpu that was idle before when the idle_mask was
> examined, is not idle any more after the kicked_mask is set, that
> means that we don't need to kick, and that we need to clear the
> bit we just set because we may be beyond the point where it would
> get cleared in the wake-up process. Since the time window is short,
> this is probably more a theoretical than a practical thing: the race
> window is small.
>
> To get things harmonized let us also move the clear from vcpu_pre_run()
> to __unset_cpu_idle().
>
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")

Before releasing something like this, where none of us is sure if
it really saves cpu cost, I'd prefer to run some measurements with
the whole kicked_mask logic removed, and to compare the number of
vcpu wake-ups with the number of interrupts to be processed by
the gib alert mechanism in a slightly overcommitted host while
driving it with Matthew's test load.

A similar run can be done with this code.
On 20.10.21 at 11:48, Michael Mueller wrote:
>
>
> On 19.10.21 19:54, Halil Pasic wrote:
>> The idea behind kicked mask is that we should not re-kick a vcpu
>> from __airqs_kick_single_vcpu() that is already in the middle of
>> being kicked by the same function.
>>
>> If however the vcpu that was idle before when the idle_mask was
>> examined, is not idle any more after the kicked_mask is set, that
>> means that we don't need to kick, and that we need to clear the
>> bit we just set because we may be beyond the point where it would
>> get cleared in the wake-up process. Since the time window is short,
>> this is probably more a theoretical than a practical thing: the race
>> window is small.
>>
>> To get things harmonized let us also move the clear from vcpu_pre_run()
>> to __unset_cpu_idle().
>>
>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>> Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
>
> Before releasing something like this, where none of us is sure if
> it really saves cpu cost, I'd prefer to run some measurements with
> the whole kicked_mask logic removed, and to compare the number of
> vcpu wake-ups with the number of interrupts to be processed by
> the gib alert mechanism in a slightly overcommitted host while
> driving it with Matthew's test load.

But I think patches 1 and 2 can go in immediately, as they measurably or
testably fix things. Correct?

> A similar run can be done with this code.
On Wed, 20 Oct 2021 12:31:19 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> > Before releasing something like this, where none of us is sure if
> > it really saves cpu cost, I'd prefer to run some measurements with
> > the whole kicked_mask logic removed, and to compare the number of
> > vcpu wake-ups with the number of interrupts to be processed by
> > the gib alert mechanism in a slightly overcommitted host while
> > driving it with Matthew's test load.
>
> But I think patches 1 and 2 can go in immediately, as they measurably or
> testably fix things. Correct?

I think so as well. And if patch 3 is going to be dropped, I would
really like to keep the unconditional clear in kvm_arch_vcpu_runnable().
As my analysis in the discussion points out, I think it can save us
from the trouble this patch is trying to address.

Regards,
Halil
On 20.10.21 12:31, Christian Borntraeger wrote:
> On 20.10.21 at 11:48, Michael Mueller wrote:
>>
>>
>> On 19.10.21 19:54, Halil Pasic wrote:
>>> The idea behind kicked mask is that we should not re-kick a vcpu
>>> from __airqs_kick_single_vcpu() that is already in the middle of
>>> being kicked by the same function.
>>>
>>> If however the vcpu that was idle before when the idle_mask was
>>> examined, is not idle any more after the kicked_mask is set, that
>>> means that we don't need to kick, and that we need to clear the
>>> bit we just set because we may be beyond the point where it would
>>> get cleared in the wake-up process. Since the time window is short,
>>> this is probably more a theoretical than a practical thing: the race
>>> window is small.
>>>
>>> To get things harmonized let us also move the clear from vcpu_pre_run()
>>> to __unset_cpu_idle().
>>>
>>> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
>>> Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
>>
>> Before releasing something like this, where none of us is sure if
>> it really saves cpu cost, I'd prefer to run some measurements with
>> the whole kicked_mask logic removed, and to compare the number of
>> vcpu wake-ups with the number of interrupts to be processed by
>> the gib alert mechanism in a slightly overcommitted host while
>> driving it with Matthew's test load.
>
> But I think patches 1 and 2 can go in immediately, as they measurably or
> testably fix things. Correct?

Yes

>
>> A similar run can be done with this code.
On 20.10.21 at 12:45, Halil Pasic wrote:
> On Wed, 20 Oct 2021 12:31:19 +0200
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>
>>> Before releasing something like this, where none of us is sure if
>>> it really saves cpu cost, I'd prefer to run some measurements with
>>> the whole kicked_mask logic removed, and to compare the number of
>>> vcpu wake-ups with the number of interrupts to be processed by
>>> the gib alert mechanism in a slightly overcommitted host while
>>> driving it with Matthew's test load.
>>
>> But I think patches 1 and 2 can go in immediately, as they measurably or
>> testably fix things. Correct?
>
> I think so as well. And if patch 3 is going to be dropped, I would
> really like to keep the unconditional clear in kvm_arch_vcpu_runnable().
> As my analysis in the discussion points out, I think it can save us
> from the trouble this patch is trying to address.

Yes, let's keep patches 1 and 2 as is and look deeper into this patch.
I will apply and queue 1 and 2 soon.
On 20.10.21 at 09:52, Halil Pasic wrote:
> On Tue, 19 Oct 2021 23:35:25 +0200
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>
>>> @@ -426,6 +426,7 @@ static void __unset_cpu_idle(struct kvm_vcpu *vcpu)
>>>  {
>>>  	kvm_s390_clear_cpuflags(vcpu, CPUSTAT_WAIT);
>>>  	clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.idle_mask);
>>> +	clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
>
> BTW, do you know whether bit-ops are guaranteed to be serialized as seen by
> another cpu even when acting on a different byte? I mean, could
> kick_single_vcpu() see the clear of the kicked_mask bit but
> not see the clear of the idle mask?

The comment on clear_bit() explicitly says:

 * This is a relaxed atomic operation (no implied memory barriers).

So if we really need the ordering, then we need to add a barrier.

>
> If that is not guaranteed, we may need some barriers, or possibly merge the
> two bitmasks (idle bit and kick bit alternating) to ensure there is
> absolutely no race.
>
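To make the ordering question above concrete, here is a userspace C11 sketch, not kernel code. The two `_Atomic` words and the function names are hypothetical stand-ins for the two bitmaps and the two code paths. The vcpu side clears the idle bit with relaxed ordering and then clears the kicked bit with release ordering; the kicker's read-modify-write on the kicked bit uses acquire semantics. Under the C11 model, a kicker that observes the cleared kicked bit is then guaranteed to observe the cleared idle bit as well. In kernel code a barrier such as smp_mb__after_atomic() between the two clear_bit() calls might play a comparable role, but whether one is needed here is exactly the open question in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for kvm->arch.idle_mask and gi->kicked_mask. */
static _Atomic uint64_t idle_mask;
static _Atomic uint64_t kicked_mask;

/* vcpu side: clear the idle bit, then publish via a release clear. */
static void ordered_unset_cpu_idle(int nr)
{
	assert(nr >= 0 && nr < 64);
	uint64_t bit = UINT64_C(1) << nr;

	atomic_fetch_and_explicit(&idle_mask, ~bit, memory_order_relaxed);
	/* release: the idle clear above is visible to whoever acquires this */
	atomic_fetch_and_explicit(&kicked_mask, ~bit, memory_order_release);
}

/* kicker side: returns true if a wakeup should be issued for vcpu nr. */
static bool ordered_should_kick(int nr)
{
	assert(nr >= 0 && nr < 64);
	uint64_t bit = UINT64_C(1) << nr;

	/* acquire pairs with the release clear in ordered_unset_cpu_idle() */
	if (atomic_fetch_or_explicit(&kicked_mask, bit,
				     memory_order_acq_rel) & bit)
		return false;   /* lately kicked but not yet running */

	if (atomic_load_explicit(&idle_mask, memory_order_relaxed) & bit)
		return true;    /* still idle: go ahead and wake up */

	/* meanwhile not idle: undo the bit we just set, don't kick */
	atomic_fetch_and_explicit(&kicked_mask, ~bit, memory_order_relaxed);
	return false;
}
```

The release/acquire pair is the weakest C11 ordering that gives the "saw kicked cleared, therefore sees idle cleared" guarantee; merging both bits into one word, as suggested in the quoted mail, would be an alternative that avoids reasoning about two locations at all.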
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 2245f4b8d362..3c80a2237ef5 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -426,6 +426,7 @@ static void __unset_cpu_idle(struct kvm_vcpu *vcpu)
 {
 	kvm_s390_clear_cpuflags(vcpu, CPUSTAT_WAIT);
 	clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.idle_mask);
+	clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
 }
 
 static void __reset_intercept_indicators(struct kvm_vcpu *vcpu)
@@ -3064,7 +3065,11 @@ static void __airqs_kick_single_vcpu(struct kvm *kvm, u8 deliverable_mask)
 			/* lately kicked but not yet running */
 			if (test_and_set_bit(vcpu_idx, gi->kicked_mask))
 				return;
-			kvm_s390_vcpu_wakeup(vcpu);
+			/* if meanwhile not idle: clear and don't kick */
+			if (test_bit(vcpu_idx, kvm->arch.idle_mask))
+				kvm_s390_vcpu_wakeup(vcpu);
+			else
+				clear_bit(vcpu_idx, gi->kicked_mask);
 			return;
 		}
 	}
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 1c97493d21e1..6b779ef9f5fb 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4067,8 +4067,6 @@ static int vcpu_pre_run(struct kvm_vcpu *vcpu)
 		kvm_s390_patch_guest_per_regs(vcpu);
 	}
 
-	clear_bit(vcpu->vcpu_idx, vcpu->kvm->arch.gisa_int.kicked_mask);
-
 	vcpu->arch.sie_block->icptcode = 0;
 	cpuflags = atomic_read(&vcpu->arch.sie_block->cpuflags);
 	VCPU_EVENT(vcpu, 6, "entering sie flags %x", cpuflags);
The idea behind kicked mask is that we should not re-kick a vcpu
from __airqs_kick_single_vcpu() that is already in the middle of
being kicked by the same function.

If however the vcpu that was idle before when the idle_mask was
examined, is not idle any more after the kicked_mask is set, that
means that we don't need to kick, and that we need to clear the
bit we just set because we may be beyond the point where it would
get cleared in the wake-up process. Since the time window is short,
this is probably more a theoretical than a practical thing: the race
window is small.

To get things harmonized let us also move the clear from vcpu_pre_run()
to __unset_cpu_idle().

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Fixes: 9f30f6216378 ("KVM: s390: add gib_alert_irq_handler()")
---
 arch/s390/kvm/interrupt.c | 7 ++++++-
 arch/s390/kvm/kvm-s390.c  | 2 --
 2 files changed, 6 insertions(+), 3 deletions(-)
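The logic this patch proposes can be tried out in isolation. The following is a minimal userspace sketch using C11 atomics, not the kernel implementation: the `model_*` helpers, the two global masks, and the `wakeups` counter are hypothetical stand-ins for test_and_set_bit(), test_bit(), clear_bit(), the two bitmaps, and kvm_s390_vcpu_wakeup().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for kvm->arch.idle_mask and gi->kicked_mask. */
static _Atomic uint64_t idle_mask;
static _Atomic uint64_t kicked_mask;
static int wakeups; /* counts calls to the modelled wakeup */

static bool model_test_and_set_bit(int nr, _Atomic uint64_t *mask)
{
	assert(nr >= 0 && nr < 64);
	uint64_t bit = UINT64_C(1) << nr;

	return (atomic_fetch_or(mask, bit) & bit) != 0;
}

static bool model_test_bit(int nr, _Atomic uint64_t *mask)
{
	assert(nr >= 0 && nr < 64);
	return (atomic_load(mask) & (UINT64_C(1) << nr)) != 0;
}

static void model_clear_bit(int nr, _Atomic uint64_t *mask)
{
	assert(nr >= 0 && nr < 64);
	atomic_fetch_and(mask, ~(UINT64_C(1) << nr));
}

/* Tail of the kick path as changed by the patch; true if a wakeup ran. */
static bool model_kick(int vcpu_idx)
{
	/* lately kicked but not yet running */
	if (model_test_and_set_bit(vcpu_idx, &kicked_mask))
		return false;
	/* if meanwhile not idle: clear and don't kick */
	if (model_test_bit(vcpu_idx, &idle_mask)) {
		wakeups++;
		return true;
	}
	model_clear_bit(vcpu_idx, &kicked_mask);
	return false;
}

/* Wake-up side: both bits are cleared together, as in the patch. */
static void model_unset_cpu_idle(int vcpu_idx)
{
	model_clear_bit(vcpu_idx, &idle_mask);
	model_clear_bit(vcpu_idx, &kicked_mask);
}
```

With vcpu 3 marked idle, a first `model_kick(3)` issues one wakeup and a second call is suppressed by the kicked bit. After `model_unset_cpu_idle(3)`, a further `model_kick(3)` issues no wakeup and leaves the kicked bit clear, which is the behaviour the commit message describes.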