
riscv: KVM: Remove unnecessary vcpu kick

Message ID 20250219015426.1939-1-xiangwencheng@lanxincomputing.com (mailing list archive)
State New

Commit Message

xiangwencheng Feb. 19, 2025, 1:54 a.m. UTC
Thank you Andrew Jones, forgive my errors in the last email.
I'm wondering whether it's necessary to kick the virtual hart
after writing to the vsfile of IMSIC.
From my understanding, writing to the vsfile should directly
forward the interrupt as MSI to the virtual hart. This means that
an additional kick should not be necessary, as it would cause the
vCPU to exit unnecessarily and potentially degrade performance.
I've tested this behavior in QEMU, and it seems to work perfectly
fine without the extra kick.
Would appreciate any insights or confirmation on this!
Best regards.

Signed-off-by: BillXiang <xiangwencheng@lanxincomputing.com>
---
 arch/riscv/kvm/aia_imsic.c | 1 -
 1 file changed, 1 deletion(-)

Comments

Andrew Jones Feb. 19, 2025, 8:36 a.m. UTC | #1
On Wed, Feb 19, 2025 at 09:54:26AM +0800, BillXiang wrote:
> Thank you Andrew Jones, forgive my errors in the last email.

From here down is all exactly the same as your first email, which I
already completely replied to.

> I'm wondering whether it's necessary to kick the virtual hart
> after writing to the vsfile of IMSIC.
> From my understanding, writing to the vsfile should directly
> forward the interrupt as MSI to the virtual hart. This means that
> an additional kick should not be necessary, as it would cause the
> vCPU to exit unnecessarily and potentially degrade performance.
> I've tested this behavior in QEMU, and it seems to work perfectly
> fine without the extra kick.
> Would appreciate any insights or confirmation on this!
> Best regards.
> 
> Signed-off-by: BillXiang <xiangwencheng@lanxincomputing.com>
> ---
>  arch/riscv/kvm/aia_imsic.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/riscv/kvm/aia_imsic.c b/arch/riscv/kvm/aia_imsic.c
> index a8085cd8215e..29ef9c2133a9 100644
> --- a/arch/riscv/kvm/aia_imsic.c
> +++ b/arch/riscv/kvm/aia_imsic.c
> @@ -974,7 +974,6 @@ int kvm_riscv_vcpu_aia_imsic_inject(struct kvm_vcpu *vcpu,
>  
>  	if (imsic->vsfile_cpu >= 0) {
>  		writel(iid, imsic->vsfile_va + IMSIC_MMIO_SETIPNUM_LE);
> -		kvm_vcpu_kick(vcpu);
>  	} else {
>  		eix = &imsic->swfile->eix[iid / BITS_PER_TYPE(u64)];
>  		set_bit(iid & (BITS_PER_TYPE(u64) - 1), eix->eip);
> -- 
> 2.46.2
Radim Krčmář Feb. 19, 2025, 8:51 a.m. UTC | #2
2025-02-19T09:54:26+08:00, BillXiang <xiangwencheng@lanxincomputing.com>:
> Thank you Andrew Jones, forgive my errors in the last email.
> I'm wondering whether it's necessary to kick the virtual hart
> after writing to the vsfile of IMSIC.
> From my understanding, writing to the vsfile should directly
> forward the interrupt as MSI to the virtual hart. This means that
> an additional kick should not be necessary, as it would cause the
> vCPU to exit unnecessarily and potentially degrade performance.

Andrew proposed to avoid the exit overhead, but do a wakeup if the VCPU
is "sleeping".  I talked with Andrew and thought so as well, but now I
agree with you that we shouldn't have anything extra here.

Direct MSIs from IOMMU or other harts won't perform anything afterwards,
so what you want to do is correct and KVM has to properly handle the memory
write alone.

> I've tested this behavior in QEMU, and it seems to work perfectly
> fine without the extra kick.

Whether the rest of KVM behaves correctly is a different question.
A mistake might result in a very rare race condition, so it's better to
do verification rather than generic testing.

For example, is `vsfile_cpu >= 0` the right condition for using direct
interrupts?

I don't see KVM setting vsfile_cpu to -1 before descheduling after
emulating WFI, which could cause a bug as a MSI would never cause a wake
up.  It might still look like it works, because something else could be
waking the VCPU up and then the VCPU would notice this MSI as well.

Please note that I didn't actually verify the KVM code, so it can be
correct, I just used this to give you an example of what can go wrong
without being able to see it in testing.

I would like to know if KVM needs fixing before this change is accepted.
(It could make bad things worse.)

> Would appreciate any insights or confirmation on this!

Your patch is not acceptable because of its commit message, though.
Please look again at the document that Andrew posted and always reply
to the previous thread if you do not send a new patch version.

The commit message should be on point.
Please avoid extraneous information that won't help anyone reading the
commit.  Greeting and commentary can go below the "---" line.
(And possibly above a "---8<---" line, although that is not official and
 may cause issues with some maintainers.)

Thanks.
xiangwencheng Feb. 20, 2025, 7:12 a.m. UTC | #3
> From: "Radim Krčmář"<rkrcmar@ventanamicro.com>
> Date:  Wed, Feb 19, 2025, 16:51
> Subject:  Re: [PATCH] riscv: KVM: Remove unnecessary vcpu kick
> To: "BillXiang"<xiangwencheng@lanxincomputing.com>, <anup@brainfault.org>
> Cc: <ajones@ventanamicro.com>, <kvm-riscv@lists.infradead.org>, <kvm@vger.kernel.org>, <linux-riscv@lists.infradead.org>, <linux-kernel@vger.kernel.org>, <atishp@atishpatra.org>, <paul.walmsley@sifive.com>, <palmer@dabbelt.com>, <aou@eecs.berkeley.edu>, "linux-riscv"<linux-riscv-bounces@lists.infradead.org>
> 2025-02-19T09:54:26+08:00, BillXiang <xiangwencheng@lanxincomputing.com>:

> > Thank you Andrew Jones, forgive my errors in the last email.
> > I'm wondering whether it's necessary to kick the virtual hart
> > after writing to the vsfile of IMSIC.
> > From my understanding, writing to the vsfile should directly
> > forward the interrupt as MSI to the virtual hart. This means that
> > an additional kick should not be necessary, as it would cause the
> > vCPU to exit unnecessarily and potentially degrade performance.
>
> Andrew proposed to avoid the exit overhead, but do a wakeup if the VCPU
> is "sleeping".  I talked with Andrew and thought so as well, but now I
> agree with you that we shouldn't have anything extra here.
>
> Direct MSIs from IOMMU or other harts won't perform anything afterwards,
> so what you want to do is correct and KVM has to properly handle the memory
> write alone.
>
> > I've tested this behavior in QEMU, and it seems to work perfectly
> > fine without the extra kick.
>
> Whether the rest of KVM behaves correctly is a different question.
> A mistake might result in a very rare race condition, so it's better to
> do verification rather than generic testing.
>
> For example, is `vsfile_cpu >= 0` the right condition for using direct
> interrupts?
>
> I don't see KVM setting vsfile_cpu to -1 before descheduling after

It's not necessary to set vsfile_cpu to -1, because WFI emulation doesn't release
the vsfile; the vsfile still belongs to the vCPU after WFI.

> emulating WFI, which could cause a bug as a MSI would never cause a wake
> up.  It might still look like it works, because something else could be
> waking the VCPU up and then the VCPU would notice this MSI as well.
>
> Please note that I didn't actually verify the KVM code, so it can be
> correct, I just used this to give you an example of what can go wrong
> without being able to see it in testing.
>
> I would like to know if KVM needs fixing before this change is accepted.
> (It could make bad things worse.)

As described under "KVM: WFI wake-up using IMSIC VS-files" in [1], writing to
VS-FILE will wake up the vCPU.

KVM also handles the WFI case. Here is the WFI emulation process:
kvm_riscv_vcpu_exit 
    -> kvm_riscv_vcpu_virtual_insn 
         -> system_opcode_insn 
              -> wfi_insn 
                 -> kvm_riscv_vcpu_wfi
                     -> kvm_vcpu_halt
                         -> kvm_vcpu_block
                             -> kvm_arch_vcpu_blocking
                                   -> kvm_riscv_aia_wakeon_hgei
                                         -> csr_set(CSR_HGEIE, BIT(hgei));
                             -> set_current_state(TASK_INTERRUPTIBLE);
                             -> schedule

In kvm_arch_vcpu_blocking it will enable the guest external interrupt, which
means writing to VS_FILE will cause an interrupt. And the interrupt handler
hgei_interrupt, which is set in aia_hgei_init, will finally call kvm_vcpu_kick
to wake up the vCPU.

So I still think it is not necessary to call another kvm_vcpu_kick after writing to
VS_FILE.
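
To make the sequence above concrete, here is a minimal user-space C sketch of it.
It is only a model, not kernel code: every model_* name is hypothetical, the
HGEIE CSR is reduced to a plain 64-bit mask, and the host interrupt is a direct
function call. It assumes the behaviour described in this thread, namely that
blocking enables the guest external interrupt line and that hgei_interrupt
kicks the vCPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of the model_* names exist in the kernel. */
struct model_vcpu {
	unsigned int hgei;    /* guest external interrupt line backing the VS-file */
	uint64_t hgeie;       /* stands in for the HGEIE CSR (per-line enable bits) */
	bool vsfile_pending;  /* an interrupt is pending in the VS-file */
	bool blocked;         /* the vCPU task is asleep after emulating WFI */
	int host_irqs;        /* host interrupts taken so far */
};

/* Stands in for kvm_arch_vcpu_blocking(): enable the line, then sleep. */
static void model_block(struct model_vcpu *v)
{
	assert(v->hgei < 64);
	v->hgeie |= UINT64_C(1) << v->hgei;
	v->blocked = true;
}

/* Stands in for hgei_interrupt(): the handler kicks the sleeping vCPU. */
static void model_hgei_interrupt(struct model_vcpu *v)
{
	v->host_irqs++;
	v->blocked = false;
}

/* Stands in for the writel() to the VS-file, with no kick afterwards. */
static void model_vsfile_write(struct model_vcpu *v)
{
	v->vsfile_pending = true;
	/* Only an enabled line raises a host interrupt. */
	if (v->hgeie & (UINT64_C(1) << v->hgei))
		model_hgei_interrupt(v);
}

/* Blocked vCPU: the write alone wakes it through one host interrupt. */
bool model_blocked_vcpu_is_woken(void)
{
	struct model_vcpu v = { .hgei = 3 };

	model_block(&v);
	model_vsfile_write(&v);
	return !v.blocked && v.vsfile_pending && v.host_irqs == 1;
}

/* Running vCPU: the write only marks the interrupt pending. */
bool model_running_vcpu_takes_no_host_irq(void)
{
	struct model_vcpu v = { .hgei = 3 };

	model_vsfile_write(&v);
	return v.vsfile_pending && v.host_irqs == 0;
}
```

In this model both helper functions return true: a blocked vCPU is woken by the
write alone, and a running vCPU sees the pending interrupt without any host
interrupt or exit.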

Waiting for more info. Thanks.

[1]  https://kvm-forum.qemu.org/2022/AIA_Virtualization_in_KVM_RISCV_final.pdf

> > Would appreciate any insights or confirmation on this!
>
> Your patch is not acceptable because of its commit message, though.
> Please look again at the document that Andrew posted and always reply
> to the previous thread if you do not send a new patch version.
>
> The commit message should be on point.
> Please avoid extraneous information that won't help anyone reading the
> commit.  Greeting and commentary can go below the "---" line.
> (And possibly above a "---8<---" line, although that is not official and
>  may cause issues with some maintainers.)
>
> Thanks.
Andrew Jones Feb. 20, 2025, 8:01 a.m. UTC | #4
On Thu, Feb 20, 2025 at 03:12:58PM +0800, xiangwencheng wrote:
...
> As described under "KVM: WFI wake-up using IMSIC VS-files" in [1], writing to
> VS-FILE will wake up the vCPU.
> 
> KVM also handles the WFI case. Here is the WFI emulation process:
> kvm_riscv_vcpu_exit 
>     -> kvm_riscv_vcpu_virtual_insn 
>          -> system_opcode_insn 
>               -> wfi_insn 
>                  -> kvm_riscv_vcpu_wfi
>                      -> kvm_vcpu_halt
>                          -> kvm_vcpu_block
>                              -> kvm_arch_vcpu_blocking
>                                    -> kvm_riscv_aia_wakeon_hgei
>                                          -> csr_set(CSR_HGEIE, BIT(hgei));
>                              -> set_current_state(TASK_INTERRUPTIBLE);
>                              -> schedule
> 
> In kvm_arch_vcpu_blocking it will enable the guest external interrupt, which
> means writing to VS_FILE will cause an interrupt. And the interrupt handler
> hgei_interrupt, which is set in aia_hgei_init, will finally call kvm_vcpu_kick
> to wake up the vCPU.
> 
> So I still think it is not necessary to call another kvm_vcpu_kick after writing to
> VS_FILE.
> 
> Waiting for more info. Thanks.
> 
> [1]  https://kvm-forum.qemu.org/2022/AIA_Virtualization_in_KVM_RISCV_final.pdf
>

Right, we don't need anything since hgei_interrupt() kicks for us, but if
we do

@@ -973,8 +973,8 @@ int kvm_riscv_vcpu_aia_imsic_inject(struct kvm_vcpu *vcpu,
        read_lock_irqsave(&imsic->vsfile_lock, flags);

        if (imsic->vsfile_cpu >= 0) {
+               kvm_vcpu_wake_up(vcpu);
                writel(iid, imsic->vsfile_va + IMSIC_MMIO_SETIPNUM_LE);
-               kvm_vcpu_kick(vcpu);
        } else {
                eix = &imsic->swfile->eix[iid / BITS_PER_TYPE(u64)];
                set_bit(iid & (BITS_PER_TYPE(u64) - 1), eix->eip);

then we should be able to avoid taking a host interrupt.

Thanks,
drew
xiangwencheng Feb. 20, 2025, 8:17 a.m. UTC | #5
> From: "Andrew Jones"<ajones@ventanamicro.com>
> Date:  Thu, Feb 20, 2025, 16:01
> Subject:  Re: [PATCH] riscv: KVM: Remove unnecessary vcpu kick
> To: "xiangwencheng"<xiangwencheng@lanxincomputing.com>
> Cc: "Radim Krčmář"<rkrcmar@ventanamicro.com>, <anup@brainfault.org>, <kvm-riscv@lists.infradead.org>, <kvm@vger.kernel.org>, <linux-riscv@lists.infradead.org>, <linux-kernel@vger.kernel.org>, <atishp@atishpatra.org>, <paul.walmsley@sifive.com>, <palmer@dabbelt.com>, <aou@eecs.berkeley.edu>, "linux-riscv"<linux-riscv-bounces@lists.infradead.org>
> On Thu, Feb 20, 2025 at 03:12:58PM +0800, xiangwencheng wrote:

> ...
> > As described under "KVM: WFI wake-up using IMSIC VS-files" in [1], writing to
> > VS-FILE will wake up the vCPU.
> >
> > KVM also handles the WFI case. Here is the WFI emulation process:
> > kvm_riscv_vcpu_exit
> >     -> kvm_riscv_vcpu_virtual_insn
> >          -> system_opcode_insn
> >               -> wfi_insn
> >                  -> kvm_riscv_vcpu_wfi
> >                      -> kvm_vcpu_halt
> >                          -> kvm_vcpu_block
> >                              -> kvm_arch_vcpu_blocking
> >                                    -> kvm_riscv_aia_wakeon_hgei
> >                                          -> csr_set(CSR_HGEIE, BIT(hgei));
> >                              -> set_current_state(TASK_INTERRUPTIBLE);
> >                              -> schedule
> >
> > In kvm_arch_vcpu_blocking it will enable the guest external interrupt, which
> > means writing to VS_FILE will cause an interrupt. And the interrupt handler
> > hgei_interrupt, which is set in aia_hgei_init, will finally call kvm_vcpu_kick
> > to wake up the vCPU.
> >
> > So I still think it is not necessary to call another kvm_vcpu_kick after writing to
> > VS_FILE.
> >
> > Waiting for more info. Thanks.
> >
> > [1]  https://kvm-forum.qemu.org/2022/AIA_Virtualization_in_KVM_RISCV_final.pdf
> >
> Right, we don't need anything since hgei_interrupt() kicks for us, but if
> we do
>
> @@ -973,8 +973,8 @@ int kvm_riscv_vcpu_aia_imsic_inject(struct kvm_vcpu *vcpu,
>         read_lock_irqsave(&imsic->vsfile_lock, flags);
>
>         if (imsic->vsfile_cpu >= 0) {
> +               kvm_vcpu_wake_up(vcpu);
>                 writel(iid, imsic->vsfile_va + IMSIC_MMIO_SETIPNUM_LE);
> -               kvm_vcpu_kick(vcpu);
>         } else {
>                 eix = &imsic->swfile->eix[iid / BITS_PER_TYPE(u64)];
>                 set_bit(iid & (BITS_PER_TYPE(u64) - 1), eix->eip);
>
> then we should be able to avoid taking a host interrupt.

But the vCPU may schedule again in the for (;;) loop of kvm_vcpu_block after
kvm_vcpu_wake_up but before the write to the vsfile, and we will still get a host interrupt.
@@ -3573,6 +3573,8 @@ bool kvm_vcpu_block(struct kvm_vcpu *vcpu)
        for (;;) {
                set_current_state(TASK_INTERRUPTIBLE);

+               // The check below will not break before the write to the vsfile,
+               // and then we will schedule again.
                if (kvm_vcpu_check_block(vcpu) < 0)
                        break;

                waited = true;
                schedule();
        }
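
The interleaving described above can be sketched as a small user-space C model.
This is an illustration only: the model_* names are hypothetical, the blocking
loop is reduced to a single pending flag, and the model covers just the one
ordering where the woken vCPU runs its check before the vsfile write lands.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the ordering; not kernel code. */
struct model_state {
	bool pending;   /* an interrupt is pending in the vsfile */
	int schedules;  /* how many times the vCPU task went to sleep */
	int host_irqs;  /* host interrupts taken to deliver the wake-up */
};

/* One pass of the blocking loop; returns true if the vCPU stops blocking. */
static bool model_loop_iteration(struct model_state *s)
{
	if (s->pending)
		return true;  /* like kvm_vcpu_check_block() < 0: break */
	s->schedules++;       /* nothing pending yet: schedule() again */
	return false;
}

/* Early wake-up with nothing pending: the loop runs once more. */
static bool model_early_wake_up(struct model_state *s)
{
	return model_loop_iteration(s);
}

/* vsfile write while the task sleeps: host interrupt, then the kick. */
static bool model_write_while_asleep(struct model_state *s)
{
	s->pending = true;
	s->host_irqs++;
	return model_loop_iteration(s);
}

/* Only the vsfile write, as in the posted patch. */
struct model_state model_write_only(void)
{
	struct model_state s = { .schedules = 1 };  /* already asleep once */
	bool running = model_write_while_asleep(&s);

	assert(running);
	return s;
}

/* Wake-up first, then the write, as in the alternative diff. */
struct model_state model_wake_then_write(void)
{
	struct model_state s = { .schedules = 1 };  /* already asleep once */
	bool running = model_early_wake_up(&s);

	assert(!running);  /* the vCPU went back to sleep */
	running = model_write_while_asleep(&s);
	assert(running);
	return s;
}
```

Under these assumptions both orderings take exactly one host interrupt, and the
wake-then-write ordering adds one extra trip through schedule().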

Thanks 

> Thanks,
> drew
Radim Krčmář Feb. 20, 2025, 8:50 a.m. UTC | #6
2025-02-20T16:17:33+08:00, xiangwencheng <xiangwencheng@lanxincomputing.com>:
>> From: "Andrew Jones"<ajones@ventanamicro.com>
>> On Thu, Feb 20, 2025 at 03:12:58PM +0800, xiangwencheng wrote:
>> > In kvm_arch_vcpu_blocking it will enable the guest external interrupt, which
>
>> > means writing to VS_FILE will cause an interrupt. And the interrupt handler
>
>> > hgei_interrupt, which is set in aia_hgei_init, will finally call kvm_vcpu_kick
>
>> > to wake up the vCPU.

(Configure your mail client, so it doesn't add a newline between each
 quoted line when replying.)

>> > So I still think it is not necessary to call another kvm_vcpu_kick after writing to
>> > VS_FILE.

So the kick wasn't there to mask some other bug, thanks.

>> Right, we don't need anything since hgei_interrupt() kicks for us, but if
>> we do
>> 
>> @@ -973,8 +973,8 @@ int kvm_riscv_vcpu_aia_imsic_inject(struct kvm_vcpu *vcpu,
>>         read_lock_irqsave(&imsic->vsfile_lock, flags);
>> 
>>         if (imsic->vsfile_cpu >= 0) {
>> +               kvm_vcpu_wake_up(vcpu);
>>                 writel(iid, imsic->vsfile_va + IMSIC_MMIO_SETIPNUM_LE);
>> -               kvm_vcpu_kick(vcpu);
>>         } else {
>>                 eix = &imsic->swfile->eix[iid / BITS_PER_TYPE(u64)];
>>                 set_bit(iid & (BITS_PER_TYPE(u64) - 1), eix->eip);
>> 
>> then we should be able to avoid taking a host interrupt.

The wakeup is asynchronous, and this would practically never avoid the
host interrupt, but we'd do extra pointless work...
I think it's much better just with the write.  (The wakeup would again
make KVM look like it has a bug elsewhere.)
Andrew Jones Feb. 20, 2025, 12:14 p.m. UTC | #7
On Thu, Feb 20, 2025 at 09:50:06AM +0100, Radim Krčmář wrote:
> 2025-02-20T16:17:33+08:00, xiangwencheng <xiangwencheng@lanxincomputing.com>:
> >> From: "Andrew Jones"<ajones@ventanamicro.com>
> >> On Thu, Feb 20, 2025 at 03:12:58PM +0800, xiangwencheng wrote:
> >> > In kvm_arch_vcpu_blocking it will enable the guest external interrupt, which
> >
> >> > means writing to VS_FILE will cause an interrupt. And the interrupt handler
> >
> >> > hgei_interrupt, which is set in aia_hgei_init, will finally call kvm_vcpu_kick
> >
> >> > to wake up the vCPU.
> 
> (Configure your mail client, so it doesn't add a newline between each
>  quoted line when replying.)
> 
> >> > So I still think it is not necessary to call another kvm_vcpu_kick after writing to
> >> > VS_FILE.
> 
> So the kick wasn't there to mask some other bug, thanks.
> 
> >> Right, we don't need anything since hgei_interrupt() kicks for us, but if
> >> we do
> >> 
> >> @@ -973,8 +973,8 @@ int kvm_riscv_vcpu_aia_imsic_inject(struct kvm_vcpu *vcpu,
> >>         read_lock_irqsave(&imsic->vsfile_lock, flags);
> >> 
> >>         if (imsic->vsfile_cpu >= 0) {
> >> +               kvm_vcpu_wake_up(vcpu);
> >>                 writel(iid, imsic->vsfile_va + IMSIC_MMIO_SETIPNUM_LE);
> >> -               kvm_vcpu_kick(vcpu);
> >>         } else {
> >>                 eix = &imsic->swfile->eix[iid / BITS_PER_TYPE(u64)];
> >>                 set_bit(iid & (BITS_PER_TYPE(u64) - 1), eix->eip);
> >> 
> >> then we should be able to avoid taking a host interrupt.
> 
> The wakeup is asynchronous, and this would practically never avoid the
> host interrupt, but we'd do extra pointless work...
> I think it's much better just with the write.  (The wakeup would again
> make KVM look like it has a bug elsewhere.)

Ah yes, the wakeup is asynchronous. Just dropping the kick is the right
way to go then.

Thanks,
drew
Patch

diff --git a/arch/riscv/kvm/aia_imsic.c b/arch/riscv/kvm/aia_imsic.c
index a8085cd8215e..29ef9c2133a9 100644
--- a/arch/riscv/kvm/aia_imsic.c
+++ b/arch/riscv/kvm/aia_imsic.c
@@ -974,7 +974,6 @@  int kvm_riscv_vcpu_aia_imsic_inject(struct kvm_vcpu *vcpu,
 
 	if (imsic->vsfile_cpu >= 0) {
 		writel(iid, imsic->vsfile_va + IMSIC_MMIO_SETIPNUM_LE);
-		kvm_vcpu_kick(vcpu);
 	} else {
 		eix = &imsic->swfile->eix[iid / BITS_PER_TYPE(u64)];
 		set_bit(iid & (BITS_PER_TYPE(u64) - 1), eix->eip);