Message ID | 20240916181633.366449-1-heinrich.schuchardt@canonical.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | [1/1] target/riscv: enable floating point unit | expand |
On Mon, Sep 16, 2024 at 08:16:33PM GMT, Heinrich Schuchardt wrote: > OpenSBI enables the floating point in mstatus. For consistency QEMU/KVM > should do the same. > > Without this patch EDK II with TLS enabled crashes when hitting the first > floating point instruction while running QEMU with --accel kvm and runs > fine with --accel tcg. > > Additionally to this patch EDK II should be changed to make no assumptions > about the state of the floating point unit. > > Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com> > --- > target/riscv/cpu.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c > index 4bda754b01..c32e2721d4 100644 > --- a/target/riscv/cpu.c > +++ b/target/riscv/cpu.c > @@ -923,6 +923,13 @@ static void riscv_cpu_reset_hold(Object *obj, ResetType type) > if (mcc->parent_phases.hold) { > mcc->parent_phases.hold(obj, type); > } > + if (riscv_has_ext(env, RVF) || riscv_has_ext(env, RVD)) { > + env->mstatus = set_field(env->mstatus, MSTATUS_FS, env->misa_mxl); > + for (int regnr = 0; regnr < 32; ++regnr) { > + env->fpr[regnr] = 0; > + } > + riscv_csrrw(env, CSR_FCSR, NULL, 0, -1); > + } If this is only fixing KVM, then I think it belongs in kvm_riscv_reset_vcpu(). But, I feel like we're working around an issue with KVM synchronization with this, as well as with the "clear CSR values" part of commit 8633951530cc ("target/riscv: Clear CSR values at reset and sync MPSTATE with host"). KVM knows how to reset VCPUs. It does so on VCPU creation and for any secondaries started with SBI HSM start. KVM's reset would set sstatus.FS to 1 ("Initial") and zero out all the fp registers and fcsr. So it seems like we're either synchronizing prior to KVM resetting the boot VCPU, not synchronizing at all, or KVM isn't doing the reset of the boot VCPU. Thanks, drew
On 17.09.24 14:13, Andrew Jones wrote: > On Mon, Sep 16, 2024 at 08:16:33PM GMT, Heinrich Schuchardt wrote: >> OpenSBI enables the floating point in mstatus. For consistency QEMU/KVM >> should do the same. >> >> Without this patch EDK II with TLS enabled crashes when hitting the first >> floating point instruction while running QEMU with --accel kvm and runs >> fine with --accel tcg. >> >> Additionally to this patch EDK II should be changed to make no assumptions >> about the state of the floating point unit. >> >> Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com> >> --- >> target/riscv/cpu.c | 7 +++++++ >> 1 file changed, 7 insertions(+) >> >> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c >> index 4bda754b01..c32e2721d4 100644 >> --- a/target/riscv/cpu.c >> +++ b/target/riscv/cpu.c >> @@ -923,6 +923,13 @@ static void riscv_cpu_reset_hold(Object *obj, ResetType type) >> if (mcc->parent_phases.hold) { >> mcc->parent_phases.hold(obj, type); >> } >> + if (riscv_has_ext(env, RVF) || riscv_has_ext(env, RVD)) { >> + env->mstatus = set_field(env->mstatus, MSTATUS_FS, env->misa_mxl); >> + for (int regnr = 0; regnr < 32; ++regnr) { >> + env->fpr[regnr] = 0; >> + } >> + riscv_csrrw(env, CSR_FCSR, NULL, 0, -1); >> + } > > If this is only fixing KVM, then I think it belongs in > kvm_riscv_reset_vcpu(). But, I feel like we're working around an issue > with KVM synchronization with this, as well as with the "clear CSR values" > part of commit 8633951530cc ("target/riscv: Clear CSR values at reset and > sync MPSTATE with host"). KVM knows how to reset VCPUs. It does so on > VCPU creation and for any secondaries started with SBI HSM start. KVM's > reset would set sstatus.FS to 1 ("Initial") and zero out all the fp > registers and fcsr. So it seems like we're either synchronizing prior to > KVM resetting the boot VCPU, not synchronizing at all, or KVM isn't doing > the reset of the boot VCPU. > > Thanks, > drew Hello Drew, Thanks for reviewing. Concerning the question whether kvm_riscv_reset_vcpu() would be a better place for the change: Is there any specification prescribing what the state of the FS bits should be when entering M-mode and when entering S-mode? Patch 8633951530cc seems not to touch the status register in QEMU's kvm_riscv_reset_vcpu(). So it is not obvious that this patch could have caused the problem. Looking at the call sequences in Linux gives some ideas where to debug: kvm_arch_vcpu_create calls kvm_riscv_reset_vcpu which calls kvm_riscv_vcpu_fp_reset. riscv_vcpu_set_isa_ext_single and kvm_riscv_vcpu_set_reg_config only call kvm_riscv_vcpu_fp_reset if !vcpu->arch.ran_atleast_once. kvm_riscv_vcpu_fp_reset sets FS bits to "initial" if CONFIG_FPU=y and extension F or D is available. It seems that in KVM only the creation of a vcpu will set the FS bits but rebooting will not. Best regards Heinrich
On Tue, Sep 17, 2024 at 03:28:42PM GMT, Heinrich Schuchardt wrote: > On 17.09.24 14:13, Andrew Jones wrote: > > On Mon, Sep 16, 2024 at 08:16:33PM GMT, Heinrich Schuchardt wrote: > > > OpenSBI enables the floating point in mstatus. For consistency QEMU/KVM > > > should do the same. > > > > > > Without this patch EDK II with TLS enabled crashes when hitting the first > > > floating point instruction while running QEMU with --accel kvm and runs > > > fine with --accel tcg. > > > > > > Additionally to this patch EDK II should be changed to make no assumptions > > > about the state of the floating point unit. > > > > > > Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com> > > > --- > > > target/riscv/cpu.c | 7 +++++++ > > > 1 file changed, 7 insertions(+) > > > > > > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c > > > index 4bda754b01..c32e2721d4 100644 > > > --- a/target/riscv/cpu.c > > > +++ b/target/riscv/cpu.c > > > @@ -923,6 +923,13 @@ static void riscv_cpu_reset_hold(Object *obj, ResetType type) > > > if (mcc->parent_phases.hold) { > > > mcc->parent_phases.hold(obj, type); > > > } > > > + if (riscv_has_ext(env, RVF) || riscv_has_ext(env, RVD)) { > > > + env->mstatus = set_field(env->mstatus, MSTATUS_FS, env->misa_mxl); > > > + for (int regnr = 0; regnr < 32; ++regnr) { > > > + env->fpr[regnr] = 0; > > > + } > > > + riscv_csrrw(env, CSR_FCSR, NULL, 0, -1); > > > + } > > > > If this is only fixing KVM, then I think it belongs in > > kvm_riscv_reset_vcpu(). But, I feel like we're working around an issue > > with KVM synchronization with this, as well as with the "clear CSR values" > > part of commit 8633951530cc ("target/riscv: Clear CSR values at reset and > > sync MPSTATE with host"). KVM knows how to reset VCPUs. It does so on > > VCPU creation and for any secondaries started with SBI HSM start. KVM's > > reset would set sstatus.FS to 1 ("Initial") and zero out all the fp > > registers and fcsr. So it seems like we're either synchronizing prior to > > KVM resetting the boot VCPU, not synchronizing at all, or KVM isn't doing > > the reset of the boot VCPU. > > > > Thanks, > > drew > > Hello Drew, > > Thanks for reviewing. > > Concerning the question whether kvm_riscv_reset_vcpu() would be a better > place for the change: > > Is there any specification prescribing what the state of the FS bits should > be when entering M-mode and when entering S-mode? I didn't see anything in the spec, so I think 0 (or 1 when all fp registers are also reset) is reasonable for an implementation to choose. > > Patch 8633951530cc seems not to touch the status register in QEMU's > kvm_riscv_reset_vcpu(). So it is not obvious that this patch could have > caused the problem. I don't think 8633951530cc caused this problem. It was solving its own problem in the same way, which is to add some more reset for the VCPU. I think both it and this patch are working around a problem with KVM or with a problem synchronizing with KVM. If that's the case, and we fix KVM or the synchronization with KVM, then I would revert the reset parts of 8633951530cc too. > > Looking at the call sequences in Linux gives some ideas where to debug: > > kvm_arch_vcpu_create calls kvm_riscv_reset_vcpu which calls > kvm_riscv_vcpu_fp_reset. > > riscv_vcpu_set_isa_ext_single and kvm_riscv_vcpu_set_reg_config > only call kvm_riscv_vcpu_fp_reset if !vcpu->arch.ran_atleast_once. > > kvm_riscv_vcpu_fp_reset sets FS bits to "initial" > if CONFIG_FPU=y and extension F or D is available. > > It seems that in KVM only the creation of a vcpu will set the FS bits but > rebooting will not. If KVM never resets the boot VCPU on reboot, then maybe it should or needs QEMU to inform it to do so. I'd rather just one of the two (KVM or QEMU) decide what needs to be reset and to which values, rather than both having their own ideas. For example, with this patch, the boot hart will have its sstatus.FS set to 3, but, iiuc, all secondaries will be brought up with their sstatus.FS set to 1. Thanks, drew
On 17.09.24 16:49, Andrew Jones wrote: > On Tue, Sep 17, 2024 at 03:28:42PM GMT, Heinrich Schuchardt wrote: >> On 17.09.24 14:13, Andrew Jones wrote: >>> On Mon, Sep 16, 2024 at 08:16:33PM GMT, Heinrich Schuchardt wrote: >>>> OpenSBI enables the floating point in mstatus. For consistency QEMU/KVM >>>> should do the same. >>>> >>>> Without this patch EDK II with TLS enabled crashes when hitting the first >>>> floating point instruction while running QEMU with --accel kvm and runs >>>> fine with --accel tcg. >>>> >>>> Additionally to this patch EDK II should be changed to make no assumptions >>>> about the state of the floating point unit. >>>> >>>> Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com> >>>> --- >>>> target/riscv/cpu.c | 7 +++++++ >>>> 1 file changed, 7 insertions(+) >>>> >>>> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c >>>> index 4bda754b01..c32e2721d4 100644 >>>> --- a/target/riscv/cpu.c >>>> +++ b/target/riscv/cpu.c >>>> @@ -923,6 +923,13 @@ static void riscv_cpu_reset_hold(Object *obj, ResetType type) >>>> if (mcc->parent_phases.hold) { >>>> mcc->parent_phases.hold(obj, type); >>>> } >>>> + if (riscv_has_ext(env, RVF) || riscv_has_ext(env, RVD)) { >>>> + env->mstatus = set_field(env->mstatus, MSTATUS_FS, env->misa_mxl); >>>> + for (int regnr = 0; regnr < 32; ++regnr) { >>>> + env->fpr[regnr] = 0; >>>> + } >>>> + riscv_csrrw(env, CSR_FCSR, NULL, 0, -1); >>>> + } >>> >>> If this is only fixing KVM, then I think it belongs in >>> kvm_riscv_reset_vcpu(). But, I feel like we're working around an issue >>> with KVM synchronization with this, as well as with the "clear CSR values" >>> part of commit 8633951530cc ("target/riscv: Clear CSR values at reset and >>> sync MPSTATE with host"). KVM knows how to reset VCPUs. It does so on >>> VCPU creation and for any secondaries started with SBI HSM start. KVM's >>> reset would set sstatus.FS to 1 ("Initial") and zero out all the fp >>> registers and fcsr. So it seems like we're either synchronizing prior to >>> KVM resetting the boot VCPU, not synchronizing at all, or KVM isn't doing >>> the reset of the boot VCPU. >>> >>> Thanks, >>> drew >> >> Hello Drew, >> >> Thanks for reviewing. >> >> Concerning the question whether kvm_riscv_reset_vcpu() would be a better >> place for the change: >> >> Is there any specification prescribing what the state of the FS bits should >> be when entering M-mode and when entering S-mode? > > I didn't see anything in the spec, so I think 0 (or 1 when all fp > registers are also reset) is reasonable for an implementation to > choose. > >> >> Patch 8633951530cc seems not to touch the status register in QEMU's >> kvm_riscv_reset_vcpu(). So it is not obvious that this patch could have >> caused the problem. > > I don't think 8633951530cc caused this problem. It was solving its own > problem in the same way, which is to add some more reset for the VCPU. > I think both it and this patch are working around a problem with KVM or > with a problem synchronizing with KVM. If that's the case, and we fix > KVM or the synchronization with KVM, then I would revert the reset parts > of 8633951530cc too. > >> >> Looking at the call sequences in Linux gives some ideas where to debug: >> >> kvm_arch_vcpu_create calls kvm_riscv_reset_vcpu which calls >> kvm_riscv_vcpu_fp_reset. >> >> riscv_vcpu_set_isa_ext_single and kvm_riscv_vcpu_set_reg_config >> only call kvm_riscv_vcpu_fp_reset if !vcpu->arch.ran_atleast_once. >> >> kvm_riscv_vcpu_fp_reset sets FS bits to "initial" >> if CONFIG_FPU=y and extension F or D is available. >> >> It seems that in KVM only the creation of a vcpu will set the FS bits but >> rebooting will not. > > If KVM never resets the boot VCPU on reboot, then maybe it should or needs > QEMU to inform it to do so. I'd rather just one of the two (KVM or QEMU) > decide what needs to be reset and to which values, rather than both having > their own ideas. For example, with this patch, the boot hart will have its > sstatus.FS set to 3, but, iiuc, all secondaries will be brought up > with their sstatus.FS set to 1. > > Thanks, > drew Hello Drew, I added some debug messages. Without smp: Linux' kvm_riscv_vcpu_fp_reset() is called before QEMU's kvm_riscv_reset_vcpu() and is never called on reboot. qemu-system-riscv64 -M virt -accel kvm -nographic -kernel payload_workaround.bin [ 920.805102] kvm_arch_vcpu_create: Entry [ 920.805608] kvm_riscv_vcpu_fp_reset: At entry FS=0 [ 920.805961] kvm_riscv_vcpu_fp_reset: At exit FS=8192 [ 920.806289] kvm_arch_vcpu_create: Exit [ 920.810554] kvm_arch_vcpu_create: Entry [ 920.810959] kvm_riscv_vcpu_fp_reset: At entry FS=0 [ 920.811334] kvm_riscv_vcpu_fp_reset: At exit FS=8192 [ 920.811696] kvm_arch_vcpu_create: Exit [ 920.816772] kvm_arch_vcpu_create: Entry [ 920.817095] kvm_riscv_vcpu_fp_reset: At entry FS=0 [ 920.817411] kvm_riscv_vcpu_fp_reset: At exit FS=8192 [ 920.817975] kvm_arch_vcpu_create: Exit [ 920.818395] kvm_riscv_vcpu_set_reg_config: [ 920.818696] kvm_riscv_vcpu_set_reg_config: [ 920.818975] kvm_riscv_vcpu_set_reg_config: QEMU riscv_cpu_reset_hold: Entry QEMU kvm_riscv_reset_vcpu: Entry QEMU kvm_riscv_reset_vcpu: Exit QEMU riscv_cpu_reset_hold: Exit QEMU riscv_cpu_reset_hold: Entry QEMU kvm_riscv_reset_vcpu: Entry QEMU kvm_riscv_reset_vcpu: Exit QEMU riscv_cpu_reset_hold: Exit [ 920.946333] kvm_arch_vcpu_ioctl_run: run->ext_reason 0 [ 920.947031] kvm_arch_vcpu_ioctl_run: run->ext_reason 0 [ 920.947700] kvm_riscv_check_vcpu_requests: Entry [ 920.948482] kvm_riscv_check_vcpu_requests: Entry Test payload ============ [ 920.950012] kvm_arch_vcpu_ioctl_run: run->ext_reason 35 [ 920.950666] kvm_riscv_check_vcpu_requests: Entry Rebooting [ 920.951478] kvm_arch_vcpu_ioctl_run: run->ext_reason 35 [ 920.952051] kvm_riscv_check_vcpu_requests: Entry QEMU riscv_cpu_reset_hold: Entry QEMU kvm_riscv_reset_vcpu: Entry QEMU kvm_riscv_reset_vcpu: Exit QEMU riscv_cpu_reset_hold: Exit [ 920.962404] kvm_arch_vcpu_ioctl_run: run->ext_reason 24 [ 920.962969] kvm_arch_vcpu_ioctl_run: run->ext_reason 24 [ 920.963496] kvm_riscv_check_vcpu_requests: Entry Test payload ============ With -smp 2 this seems to hold true per CPU. So essentially the effect of vm_riscv_vcpu_fp_reset() is always ignored both on the primary and the secondary harts. $ qemu-system-riscv64 -M virt -accel kvm -smp 2 -nographic -kernel payload_workaround.bin [ 202.573659] kvm_arch_vcpu_create: Entry [ 202.574024] kvm_riscv_vcpu_fp_reset: At entry FS=0 [ 202.574328] kvm_riscv_vcpu_fp_reset: At exit FS=8192 [ 202.574626] kvm_arch_vcpu_create: Exit [ 202.580626] kvm_arch_vcpu_create: Entry [ 202.581070] kvm_riscv_vcpu_fp_reset: At entry FS=0 [ 202.581599] kvm_riscv_vcpu_fp_reset: At exit FS=8192 [ 202.582040] kvm_arch_vcpu_create: Exit [ 202.587356] kvm_arch_vcpu_create: Entry [ 202.587894] kvm_riscv_vcpu_fp_reset: At entry FS=0 [ 202.588376] kvm_riscv_vcpu_fp_reset: At exit FS=8192 [ 202.589188] kvm_arch_vcpu_create: Exit [ 202.589650] kvm_riscv_vcpu_set_reg_config: [ 202.590014] kvm_riscv_vcpu_set_reg_config: [ 202.590340] kvm_riscv_vcpu_set_reg_config: [ 202.595220] kvm_arch_vcpu_create: Entry [ 202.595604] kvm_riscv_vcpu_fp_reset: At entry FS=0 [ 202.595939] kvm_riscv_vcpu_fp_reset: At exit FS=8192 [ 202.596278] kvm_arch_vcpu_create: Exit QEMU riscv_cpu_reset_hold: Entry QEMU kvm_riscv_reset_vcpu: Entry QEMU kvm_riscv_reset_vcpu: Exit QEMU riscv_cpu_reset_hold: Exit [ 202.602093] kvm_arch_vcpu_create: Entry [ 202.602426] kvm_riscv_vcpu_fp_reset: At entry FS=0 [ 202.602777] kvm_riscv_vcpu_fp_reset: At exit FS=8192 [ 202.603110] kvm_arch_vcpu_create: Exit [ 202.607898] kvm_arch_vcpu_create: Entry [ 202.608306] kvm_riscv_vcpu_fp_reset: At entry FS=0 [ 202.608989] kvm_riscv_vcpu_fp_reset: At exit FS=8192 [ 202.609416] kvm_arch_vcpu_create: Exit [ 202.609939] kvm_riscv_vcpu_set_reg_config: [ 202.610312] kvm_riscv_vcpu_set_reg_config: [ 202.610666] kvm_riscv_vcpu_set_reg_config: QEMU riscv_cpu_reset_hold: Entry QEMU kvm_riscv_reset_vcpu: Entry QEMU kvm_riscv_reset_vcpu: Exit QEMU riscv_cpu_reset_hold: Exit QEMU riscv_cpu_reset_hold: Entry QEMU kvm_riscv_reset_vcpu: Entry QEMU kvm_riscv_reset_vcpu: Exit QEMU riscv_cpu_reset_hold: Exit QEMU riscv_cpu_reset_hold: Entry QEMU kvm_riscv_reset_vcpu: Entry QEMU kvm_riscv_reset_vcpu: Exit QEMU riscv_cpu_reset_hold: Exit [ 202.749911] kvm_arch_vcpu_ioctl_run: run->ext_reason 0 [ 202.750370] kvm_arch_vcpu_ioctl_run: run->ext_reason 0 [ 202.750799] kvm_arch_vcpu_ioctl_run: run->ext_reason 0 [ 202.750819] kvm_arch_vcpu_ioctl_run: run->ext_reason 0 [ 202.751574] kvm_riscv_check_vcpu_requests: Entry [ 202.751617] kvm_riscv_check_vcpu_requests: Entry [ 202.752737] kvm_riscv_check_vcpu_requests: Entry Test payload ============ [ 202.753678] kvm_arch_vcpu_ioctl_run: run->ext_reason 35 fcvt.d.w fa5,a5 [ 202.754145] kvm_riscv_check_vcpu_requests: Entry Rebooting [ 202.754655] kvm_arch_vcpu_ioctl_run: run->ext_reason 35 [ 202.755030] kvm_riscv_check_vcpu_requests: Entry QEMU riscv_cpu_reset_hold: Entry QEMU kvm_riscv_reset_vcpu: Entry QEMU kvm_riscv_reset_vcpu: Exit QEMU riscv_cpu_reset_hold: Exit QEMU riscv_cpu_reset_hold: Entry QEMU kvm_riscv_reset_vcpu: Entry QEMU kvm_riscv_reset_vcpu: Exit QEMU riscv_cpu_reset_hold: Exit [ 202.770352] kvm_arch_vcpu_ioctl_run: run->ext_reason 24 [ 202.770915] kvm_arch_vcpu_ioctl_run: run->ext_reason 10 [ 202.770951] kvm_arch_vcpu_ioctl_run: run->ext_reason 24 [ 202.771802] kvm_arch_vcpu_ioctl_run: run->ext_reason 10 [ 202.772272] kvm_riscv_check_vcpu_requests: Entry [ 202.772888] kvm_riscv_check_vcpu_requests: Entry Test payload ============ When thinking about the migration of virtual machines shouldn't QEMU be in control of the initial state of vcpus instead of KVM? CCing the RISC-V KVM maintainers. Best regards Heinrich
On Tue, Sep 17, 2024 at 06:45:21PM GMT, Heinrich Schuchardt wrote: ... > When thinking about the migration of virtual machines shouldn't QEMU be in > control of the initial state of vcpus instead of KVM? > Thinking about this more, I'm inclined to agree. Initial state and reset state should be traits of the VMM (potentially influenced by the user) rather than KVM. Thanks, drew
On Wed, 18 Sept 2024 at 07:06, Andrew Jones <ajones@ventanamicro.com> wrote: > > On Tue, Sep 17, 2024 at 06:45:21PM GMT, Heinrich Schuchardt wrote: > ... > > When thinking about the migration of virtual machines shouldn't QEMU be in > > control of the initial state of vcpus instead of KVM? > > > > Thinking about this more, I'm inclined to agree. Initial state and reset > state should be traits of the VMM (potentially influenced by the user) > rather than KVM. Mmm. IIRC the way this works on Arm at least is that at some point post-reset and before running the VM we do a QEMU->kernel state sync, which means that whatever the kernel does with the CPU state doesn't matter, only what QEMU's idea of reset is. Looking at the source I think the way this happens is that kvm_cpu_synchronize_post_reset() arranges to do a kvm_arch_put_registers(). (For Arm we have to do some fiddling around to make sure our CPU state is in the right place for that put_registers to DTRT, which is what kvm_arm_reset_vcpu() is doing, but that's a consequence of the way we chose to handle migration and in particular migration of system registers rather than something necessarily every architecture wants to be doing.) This also works for reset of the vCPU on a guest-reboot. We don't tell KVM to reset the vCPU, we just set up the vCPU state on the QEMU side and then do a QEMU->kernel state sync of it. -- PMM
On 18.09.24 13:10, Peter Maydell wrote: > On Wed, 18 Sept 2024 at 07:06, Andrew Jones <ajones@ventanamicro.com> wrote: >> >> On Tue, Sep 17, 2024 at 06:45:21PM GMT, Heinrich Schuchardt wrote: >> ... >>> When thinking about the migration of virtual machines shouldn't QEMU be in >>> control of the initial state of vcpus instead of KVM? >>> >> >> Thinking about this more, I'm inclined to agree. Initial state and reset >> state should be traits of the VMM (potentially influenced by the user) >> rather than KVM. > > Mmm. IIRC the way this works on Arm at least is that at some point > post-reset and before running the VM we do a QEMU->kernel state > sync, which means that whatever the kernel does with the CPU state > doesn't matter, only what QEMU's idea of reset is. Looking at the > source I think the way this happens is that kvm_cpu_synchronize_post_reset() > arranges to do a kvm_arch_put_registers(). (For Arm we have to do > some fiddling around to make sure our CPU state is in the right > place for that put_registers to DTRT, which is what kvm_arm_reset_vcpu() > is doing, but that's a consequence of the way we chose to handle > migration and in particular migration of system registers rather than > something necessarily every architecture wants to be doing.) > > This also works for reset of the vCPU on a guest-reboot. We don't > tell KVM to reset the vCPU, we just set up the vCPU state on the > QEMU side and then do a QEMU->kernel state sync of it. > > -- PMM Thanks Peter for looking into this. QEMU's cpu_synchronize_all_post_init() and do_kvm_cpu_synchronize_post_reset() both end up in kvm_arch_put_registers() and that is long after Linux kvm_arch_vcpu_create() has been setting some FPU state. See the output below. kvm_arch_put_registers() copies the CSRs by calling kvm_riscv_put_regs_csr(). Here we can find: KVM_RISCV_SET_CSR(cs, env, sstatus, env->mstatus); This call enables or disables the FPU according to the value of env->mstatus. So we need to set the desired state of the floating point unit in QEMU. And this is what the current patch does both for TCG and KVM. Best regards Heinrich $ qemu-system-riscv64 -M virt -accel kvm -nographic -kernel payload.bin QEMU qemu_init: Entry QEMU qmp_x_exit_preconfig: Entry [ 3503.369249] kvm_arch_vcpu_create: Entry [ 3503.369669] kvm_riscv_vcpu_fp_reset: At entry FS=0 [ 3503.369966] kvm_riscv_vcpu_fp_reset: At exit FS=8192 [ 3503.370256] kvm_arch_vcpu_create: Exit [ 3503.378620] kvm_arch_vcpu_create: Entry [ 3503.379123] kvm_riscv_vcpu_fp_reset: At entry FS=0 [ 3503.379610] kvm_riscv_vcpu_fp_reset: At exit FS=8192 [ 3503.380111] kvm_arch_vcpu_create: Exit [ 3503.394837] kvm_arch_vcpu_create: Entry [ 3503.395238] kvm_riscv_vcpu_fp_reset: At entry FS=0 [ 3503.395585] kvm_riscv_vcpu_fp_reset: At exit FS=8192 [ 3503.395947] kvm_arch_vcpu_create: Exit [ 3503.397023] kvm_riscv_vcpu_set_reg_config: [ 3503.398066] kvm_riscv_vcpu_set_reg_config: [ 3503.398430] kvm_riscv_vcpu_set_reg_config: QEMU riscv_cpu_reset_hold: Entry QEMU kvm_riscv_reset_vcpu: Entry QEMU kvm_riscv_reset_vcpu: Exit QEMU riscv_cpu_reset_hold: Exit QEMU qemu_machine_creation_done: Entry QEMU qdev_machine_creation_done: Entry QEMU cpu_synchronize_all_post_init: Entry QEMU cpu_synchronize_post_init: Entry QEMU kvm_cpu_synchronize_post_init: Entry QEMU do_kvm_cpu_synchronize_post_init: Entry QEMU kvm_arch_put_registers: Entry QEMU kvm_riscv_put_regs_csr: Entry QEMU kvm_riscv_put_regs_csr: Exit QEMU kvm_arch_put_registers: Exit QEMU do_kvm_cpu_synchronize_post_init: Exit QEMU kvm_cpu_synchronize_post_init: Exit QEMU cpu_synchronize_post_init: Exit QEMU cpu_synchronize_all_post_init: Exit QEMU qemu_system_reset: Entry QEMU kvm_arch_get_registers: Entry QEMU riscv_cpu_reset_hold: Entry QEMU kvm_riscv_reset_vcpu: Entry QEMU kvm_riscv_reset_vcpu: Exit QEMU riscv_cpu_reset_hold: Exit QEMU cpu_synchronize_all_post_reset: Entry QEMU cpu_synchronize_post_reset: Entry QEMU do_kvm_cpu_synchronize_post_reset: Entry QEMU kvm_arch_put_registers: Entry QEMU kvm_riscv_put_regs_csr: Entry QEMU kvm_riscv_put_regs_csr: Exit QEMU kvm_riscv_sync_mpstate_to_kvm: Entry QEMU kvm_riscv_sync_mpstate_to_kvm: Exit QEMU kvm_arch_put_registers: Exit QEMU do_kvm_cpu_synchronize_post_reset: Exit QEMU cpu_synchronize_post_reset: Exit QEMU cpu_synchronize_all_post_reset: Exit QEMU qemu_system_reset: Exit QEMU qdev_machine_creation_done: Exit QEMU qmp_x_exit_preconfig: Exit QEMU qemu_init: Exit QEMU kvm_cpu_exec: Entry [ 3503.566493] kvm_arch_vcpu_ioctl_run: run->ext_reason 0 QEMU kvm_cpu_exec: Exit QEMU kvm_cpu_exec: Entry [ 3503.568338] kvm_arch_vcpu_ioctl_run: run->ext_reason 0 [ 3503.568740] kvm_riscv_check_vcpu_requests: Entry [ 3503.569534] kvm_riscv_check_vcpu_requests: Entry Test payload ============
On Wed, 18 Sept 2024 at 14:06, Heinrich Schuchardt <heinrich.schuchardt@canonical.com> wrote: > Thanks Peter for looking into this. > > QEMU's cpu_synchronize_all_post_init() and > do_kvm_cpu_synchronize_post_reset() both end up in > kvm_arch_put_registers() and that is long after Linux > kvm_arch_vcpu_create() has been setting some FPU state. See the output > below. > > kvm_arch_put_registers() copies the CSRs by calling > kvm_riscv_put_regs_csr(). Here we can find: > > KVM_RISCV_SET_CSR(cs, env, sstatus, env->mstatus); > > This call enables or disables the FPU according to the value of > env->mstatus. > > So we need to set the desired state of the floating point unit in QEMU. > And this is what the current patch does both for TCG and KVM. If it does this for both TCG and KVM then I don't understand this bit from the commit message: # Without this patch EDK II with TLS enabled crashes when hitting the first # floating point instruction while running QEMU with --accel kvm and runs # fine with --accel tcg. Shouldn't this guest crash the same way with both KVM and TCG without this patch, because the FPU state is the same for both? -- PMM
On 18.09.24 15:12, Peter Maydell wrote: > On Wed, 18 Sept 2024 at 14:06, Heinrich Schuchardt > <heinrich.schuchardt@canonical.com> wrote: >> Thanks Peter for looking into this. >> >> QEMU's cpu_synchronize_all_post_init() and >> do_kvm_cpu_synchronize_post_reset() both end up in >> kvm_arch_put_registers() and that is long after Linux >> kvm_arch_vcpu_create() has been setting some FPU state. See the output >> below. >> >> kvm_arch_put_registers() copies the CSRs by calling >> kvm_riscv_put_regs_csr(). Here we can find: >> >> KVM_RISCV_SET_CSR(cs, env, sstatus, env->mstatus); >> >> This call enables or disables the FPU according to the value of >> env->mstatus. >> >> So we need to set the desired state of the floating point unit in QEMU. >> And this is what the current patch does both for TCG and KVM. > > If it does this for both TCG and KVM then I don't understand > this bit from the commit message: > > # Without this patch EDK II with TLS enabled crashes when hitting the first > # floating point instruction while running QEMU with --accel kvm and runs > # fine with --accel tcg. > > Shouldn't this guest crash the same way with both KVM and TCG without > this patch, because the FPU state is the same for both? > > -- PMM By default `qemu-system-riscv64 --accel tcg` runs OpenSBI as firmware which enables the FPU. If you would choose a different SBI implementation which does not enable the FPU you could experience the same crash. Best regards Heinrich
On Wed, Sep 18, 2024 at 03:49:39PM GMT, Heinrich Schuchardt wrote: > On 18.09.24 15:12, Peter Maydell wrote: > > On Wed, 18 Sept 2024 at 14:06, Heinrich Schuchardt > > <heinrich.schuchardt@canonical.com> wrote: > > > Thanks Peter for looking into this. > > > > > > QEMU's cpu_synchronize_all_post_init() and > > > do_kvm_cpu_synchronize_post_reset() both end up in > > > kvm_arch_put_registers() and that is long after Linux > > > kvm_arch_vcpu_create() has been setting some FPU state. See the output > > > below. > > > > > > kvm_arch_put_registers() copies the CSRs by calling > > > kvm_riscv_put_regs_csr(). Here we can find: > > > > > > KVM_RISCV_SET_CSR(cs, env, sstatus, env->mstatus); > > > > > > This call enables or disables the FPU according to the value of > > > env->mstatus. > > > > > > So we need to set the desired state of the floating point unit in QEMU. > > > And this is what the current patch does both for TCG and KVM. > > > > If it does this for both TCG and KVM then I don't understand > > this bit from the commit message: > > > > # Without this patch EDK II with TLS enabled crashes when hitting the first > > # floating point instruction while running QEMU with --accel kvm and runs > > # fine with --accel tcg. > > > > Shouldn't this guest crash the same way with both KVM and TCG without > > this patch, because the FPU state is the same for both? > > > > -- PMM > > By default `qemu-system-riscv64 --accel tcg` runs OpenSBI as firmware which > enables the FPU. > > If you would choose a different SBI implementation which does not enable the > FPU you could experience the same crash. > Thanks Heinrich, I had also forgotten that distinction. So the last question is whether or not we want to reset mstatus.FS to 1 instead of 3, as is done in this patch. Thanks, drew
On 9/16/24 20:16, Heinrich Schuchardt wrote:
> + env->mstatus = set_field(env->mstatus, MSTATUS_FS, env->misa_mxl);
Using mxl is definitely wrong here.
r~
On Wed, 18 Sept 2024 at 14:49, Heinrich Schuchardt <heinrich.schuchardt@canonical.com> wrote: > > On 18.09.24 15:12, Peter Maydell wrote: > > On Wed, 18 Sept 2024 at 14:06, Heinrich Schuchardt > > <heinrich.schuchardt@canonical.com> wrote: > >> Thanks Peter for looking into this. > >> > >> QEMU's cpu_synchronize_all_post_init() and > >> do_kvm_cpu_synchronize_post_reset() both end up in > >> kvm_arch_put_registers() and that is long after Linux > >> kvm_arch_vcpu_create() has been setting some FPU state. See the output > >> below. > >> > >> kvm_arch_put_registers() copies the CSRs by calling > >> kvm_riscv_put_regs_csr(). Here we can find: > >> > >> KVM_RISCV_SET_CSR(cs, env, sstatus, env->mstatus); > >> > >> This call enables or disables the FPU according to the value of > >> env->mstatus. > >> > >> So we need to set the desired state of the floating point unit in QEMU. > >> And this is what the current patch does both for TCG and KVM. > > > > If it does this for both TCG and KVM then I don't understand > > this bit from the commit message: > > > > # Without this patch EDK II with TLS enabled crashes when hitting the first > > # floating point instruction while running QEMU with --accel kvm and runs > > # fine with --accel tcg. > > > > Shouldn't this guest crash the same way with both KVM and TCG without > > this patch, because the FPU state is the same for both? > By default `qemu-system-riscv64 --accel tcg` runs OpenSBI as firmware > which enables the FPU. > > If you would choose a different SBI implementation which does not enable > the FPU you could experience the same crash. Ah, so KVM vs TCG is a red herring and it's actually "some guest firmware doesn't enable the FPU itself, and if you run that then it will fall over, whether you do it in KVM or TCG" ? That makes more sense. I don't have an opinion on whether you want to do that or not, not knowing what the riscv architecture mandates. (On Arm this would be fairly clearly "the guest software is broken and should be fixed", but that's because the Arm architecture says you can't assume the FPU is enabled from reset.) I do think the commit message could use clarification to explain this. thanks -- PMM
On 18.09.24 15:56, Andrew Jones wrote: > Thanks Heinrich, I had also forgotten that distinction. So the last > question is whether or not we want to reset mstatus.FS to 1 instead of 3, > as is done in this patch. > > Thanks, > drew If the FD flag set to 3 (Dirty) indicates that "the corresponding state has potentially been modified since the last context save". Upon context switches the value of the floating point registers has to be saved. The value 1 (Initial) implies that saving the registers on a context change is not needed. 3 (Dirty) is the safest choice. This is why OpenSBI is also using it. For reference: 3.1.6.6. Extension Context Status in mstatus Register The RISC-V Instruction Set Manual: Volume II - Privileged Architecture Version 2024-04-11 Best regards Heinrich
On 18.09.24 17:32, Peter Maydell wrote: > On Wed, 18 Sept 2024 at 14:49, Heinrich Schuchardt > <heinrich.schuchardt@canonical.com> wrote: >> >> On 18.09.24 15:12, Peter Maydell wrote: >>> On Wed, 18 Sept 2024 at 14:06, Heinrich Schuchardt >>> <heinrich.schuchardt@canonical.com> wrote: >>>> Thanks Peter for looking into this. >>>> >>>> QEMU's cpu_synchronize_all_post_init() and >>>> do_kvm_cpu_synchronize_post_reset() both end up in >>>> kvm_arch_put_registers() and that is long after Linux >>>> kvm_arch_vcpu_create() has been setting some FPU state. See the output >>>> below. >>>> >>>> kvm_arch_put_registers() copies the CSRs by calling >>>> kvm_riscv_put_regs_csr(). Here we can find: >>>> >>>> KVM_RISCV_SET_CSR(cs, env, sstatus, env->mstatus); >>>> >>>> This call enables or disables the FPU according to the value of >>>> env->mstatus. >>>> >>>> So we need to set the desired state of the floating point unit in QEMU. >>>> And this is what the current patch does both for TCG and KVM. >>> >>> If it does this for both TCG and KVM then I don't understand >>> this bit from the commit message: >>> >>> # Without this patch EDK II with TLS enabled crashes when hitting the first >>> # floating point instruction while running QEMU with --accel kvm and runs >>> # fine with --accel tcg. >>> >>> Shouldn't this guest crash the same way with both KVM and TCG without >>> this patch, because the FPU state is the same for both? > >> By default `qemu-system-riscv64 --accel tcg` runs OpenSBI as firmware >> which enables the FPU. >> >> If you would choose a different SBI implementation which does not enable >> the FPU you could experience the same crash. > > Ah, so KVM vs TCG is a red herring and it's actually "some guest > firmware doesn't enable the FPU itself, and if you run that then it will > fall over, whether you do it in KVM or TCG" ? That makes more sense. > > I don't have an opinion on whether you want to do that or not, > not knowing what the riscv architecture mandates. (On Arm this > would be fairly clearly "the guest software is broken and > should be fixed", but that's because the Arm architecture > says you can't assume the FPU is enabled from reset.) > > I do think the commit message could use clarification to > explain this. > > thanks > -- PMM I have not found a specification defining what the status of the FPU should be when M-Mode is stared and when moving from M-Mode to S-Mode. OpenSBI (which is the dominating M-Mode firmware and invoked by default in TCG mode) enables the FPU before jumping to S-Mode. KVM should to the same for consistency. Best regards Heinrich
On Thu, Sep 19, 2024 at 1:34 AM Peter Maydell <peter.maydell@linaro.org> wrote: > > On Wed, 18 Sept 2024 at 14:49, Heinrich Schuchardt > <heinrich.schuchardt@canonical.com> wrote: > > > > On 18.09.24 15:12, Peter Maydell wrote: > > > On Wed, 18 Sept 2024 at 14:06, Heinrich Schuchardt > > > <heinrich.schuchardt@canonical.com> wrote: > > >> Thanks Peter for looking into this. > > >> > > >> QEMU's cpu_synchronize_all_post_init() and > > >> do_kvm_cpu_synchronize_post_reset() both end up in > > >> kvm_arch_put_registers() and that is long after Linux > > >> kvm_arch_vcpu_create() has been setting some FPU state. See the output > > >> below. > > >> > > >> kvm_arch_put_registers() copies the CSRs by calling > > >> kvm_riscv_put_regs_csr(). Here we can find: > > >> > > >> KVM_RISCV_SET_CSR(cs, env, sstatus, env->mstatus); > > >> > > >> This call enables or disables the FPU according to the value of > > >> env->mstatus. > > >> > > >> So we need to set the desired state of the floating point unit in QEMU. > > >> And this is what the current patch does both for TCG and KVM. > > > > > > If it does this for both TCG and KVM then I don't understand > > > this bit from the commit message: > > > > > > # Without this patch EDK II with TLS enabled crashes when hitting the first > > > # floating point instruction while running QEMU with --accel kvm and runs > > > # fine with --accel tcg. > > > > > > Shouldn't this guest crash the same way with both KVM and TCG without > > > this patch, because the FPU state is the same for both? > > > By default `qemu-system-riscv64 --accel tcg` runs OpenSBI as firmware > > which enables the FPU. > > > > If you would choose a different SBI implementation which does not enable > > the FPU you could experience the same crash. > > Ah, so KVM vs TCG is a red herring and it's actually "some guest > firmware doesn't enable the FPU itself, and if you run that then it will > fall over, whether you do it in KVM or TCG" ? That makes more sense. > > I don't have an opinion on whether you want to do that or not, > not knowing what the riscv architecture mandates. (On Arm this > would be fairly clearly "the guest software is broken and > should be fixed", but that's because the Arm architecture > says you can't assume the FPU is enabled from reset.) RISC-V is the same. Section "3.4 Reset" states that: "All other hart state is UNSPECIFIED." (the paragraph doesn't mention the FS state). So it's unspecified what the value is on reset. Guest software shouldn't assume anything about it and it does seem like a guest software bug. In saying that, we are allowed to set it then as the spec doesn't say it should be 0. So setting it to 0x01 (Initial) doesn't seem like a bad idea, as the name kind of implies that it should be 0x01 on reset Alistair > > I do think the commit message could use clarification to > explain this. > > thanks > -- PMM >
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 4bda754b01..c32e2721d4 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -923,6 +923,13 @@ static void riscv_cpu_reset_hold(Object *obj, ResetType type) if (mcc->parent_phases.hold) { mcc->parent_phases.hold(obj, type); } + if (riscv_has_ext(env, RVF) || riscv_has_ext(env, RVD)) { + env->mstatus = set_field(env->mstatus, MSTATUS_FS, env->misa_mxl); + for (int regnr = 0; regnr < 32; ++regnr) { + env->fpr[regnr] = 0; + } + riscv_csrrw(env, CSR_FCSR, NULL, 0, -1); + } #ifndef CONFIG_USER_ONLY env->misa_mxl = mcc->misa_mxl_max; env->priv = PRV_M;
OpenSBI enables the floating point in mstatus. For consistency QEMU/KVM should do the same. Without this patch EDK II with TLS enabled crashes when hitting the first floating point instruction while running QEMU with --accel kvm and runs fine with --accel tcg. Additionally to this patch EDK II should be changed to make no assumptions about the state of the floating point unit. Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com> --- target/riscv/cpu.c | 7 +++++++ 1 file changed, 7 insertions(+)