Message ID: 601f1278-17dd-7124-f328-b865447ca160@amd.com (mailing list archive)
State: New, archived
Series: SRCU dereference check warning with SEV-ES
On 05/05/21 16:01, Tom Lendacky wrote:
> Boris noticed the below warning when running an SEV-ES guest with
> CONFIG_PROVE_LOCKING=y.
>
> The SRCU lock is released before invoking the vCPU run op where the SEV-ES
> support will unmap the GHCB. Is the proper thing to do here to take the
> SRCU lock around the call to kvm_vcpu_unmap() in this case? It does fix
> the issue, but I just want to be sure that I shouldn't, instead, be taking
> the memslot lock:

I would rather avoid having long-lived maps, as I am working on removing
them from the Intel code.  However, it seems to me that the GHCB is almost
not used after sev_handle_vmgexit returns?

If so, there's no need to keep it mapped until pre_sev_es_run:

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index a9d8d6aafdb8..b2226a5e249d 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2200,9 +2200,6 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
 
 static void pre_sev_es_run(struct vcpu_svm *svm)
 {
-	if (!svm->ghcb)
-		return;
-
 	if (svm->ghcb_sa_free) {
 		/*
 		 * The scratch area lives outside the GHCB, so there is a
@@ -2220,13 +2217,6 @@ static void pre_sev_es_run(struct vcpu_svm *svm)
 		svm->ghcb_sa = NULL;
 		svm->ghcb_sa_free = false;
 	}
-
-	trace_kvm_vmgexit_exit(svm->vcpu.vcpu_id, svm->ghcb);
-
-	sev_es_sync_to_ghcb(svm);
-
-	kvm_vcpu_unmap(&svm->vcpu, &svm->ghcb_map, true);
-	svm->ghcb = NULL;
 }
 
 void pre_sev_run(struct vcpu_svm *svm, int cpu)
@@ -2465,7 +2455,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 
 	ret = sev_es_validate_vmgexit(svm);
 	if (ret)
-		return ret;
+		goto out_unmap;
 
 	sev_es_sync_from_ghcb(svm);
 	ghcb_set_sw_exit_info_1(ghcb, 0);
@@ -2485,6 +2485,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		ret = svm_invoke_exit_handler(vcpu, SVM_EXIT_IRET);
 		break;
 	case SVM_VMGEXIT_AP_HLT_LOOP:
+		ghcb_set_sw_exit_info_2(svm->ghcb, 1);
 		ret = kvm_emulate_ap_reset_hold(vcpu);
 		break;
 	case SVM_VMGEXIT_AP_JUMP_TABLE: {
@@ -2531,6 +2521,11 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
 		ret = svm_invoke_exit_handler(vcpu, exit_code);
 	}
 
+	sev_es_sync_to_ghcb(svm);
+
+out_unmap:
+	kvm_vcpu_unmap(&svm->vcpu, &svm->ghcb_map, true);
+	svm->ghcb = NULL;
 	return ret;
 }
 
@@ -2619,21 +2620,4 @@ void sev_es_prepare_guest_switch(struct vcpu_svm *svm, unsigned int cpu)
 
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
 {
-	struct vcpu_svm *svm = to_svm(vcpu);
-
-	/* First SIPI: Use the values as initially set by the VMM */
-	if (!svm->received_first_sipi) {
-		svm->received_first_sipi = true;
-		return;
-	}
-
-	/*
-	 * Subsequent SIPI: Return from an AP Reset Hold VMGEXIT, where
-	 * the guest will set the CS and RIP.  Set SW_EXIT_INFO_2 to a
-	 * non-zero value.
-	 */
-	if (!svm->ghcb)
-		return;
-
-	ghcb_set_sw_exit_info_2(svm->ghcb, 1);
 }

However:

1) I admit I got lost in the maze starting with sev_es_string_io

2) upon an AP reset hold exit, the above patch sets the EXITINFO2 field
before the SIPI was received.  My understanding is that the processor will
not see the value anyway until it resumes execution, and why would other
vCPUs muck with the AP's GHCB.  But I'm not sure if it's okay.

Paolo
On 5/5/21 11:16 AM, Paolo Bonzini wrote:
> On 05/05/21 16:01, Tom Lendacky wrote:
>> Boris noticed the below warning when running an SEV-ES guest with
>> CONFIG_PROVE_LOCKING=y.
>>
>> The SRCU lock is released before invoking the vCPU run op where the SEV-ES
>> support will unmap the GHCB. Is the proper thing to do here to take the
>> SRCU lock around the call to kvm_vcpu_unmap() in this case? It does fix
>> the issue, but I just want to be sure that I shouldn't, instead, be taking
>> the memslot lock:
>
> I would rather avoid having long-lived maps, as I am working on removing
> them from the Intel code.  However, it seems to me that the GHCB is almost
> not used after sev_handle_vmgexit returns?

Except for, as you pointed out below, things like MMIO and IO. Anything
that has to exit to userspace to complete will still need the GHCB mapped
when coming back into the kernel if the shared buffer area of the GHCB is
being used.

Btw, what do you consider long lived maps? Is having a map while context
switching back to userspace considered long lived? The GHCB will
(possibly) only be mapped on VMEXIT (VMGEXIT) and unmapped on the next
VMRUN for the vCPU. An AP reset hold could be a while, though.

> However:
>
> 1) I admit I got lost in the maze starting with sev_es_string_io

Yeah, kvm_sev_es_string_io bypasses the instruction emulation that would
normally be done to get register values and information about the port I/O
and supplies those directly. But in general, it follows the same exiting
to userspace to complete the port I/O operation.

> 2) upon an AP reset hold exit, the above patch sets the EXITINFO2 field
> before the SIPI was received.  My understanding is that the processor will
> not see the value anyway until it resumes execution, and why would other
> vCPUs muck with the AP's GHCB.  But I'm not sure if it's okay.

As long as the vCPU might not be woken up for any reason other than a
SIPI, you can get away with this. But if it was to be woken up for some
other reason (an IPI for some reason?), then you wouldn't want the
non-zero value set in the GHCB in advance, because that might cause the
vCPU to exit the HLT loop it is in waiting for the actual SIPI.

Thanks,
Tom

> Paolo
On 05/05/21 20:39, Tom Lendacky wrote:
> Except for, as you pointed out below, things like MMIO and IO. Anything
> that has to exit to userspace to complete will still need the GHCB mapped
> when coming back into the kernel if the shared buffer area of the GHCB is
> being used.
>
> Btw, what do you consider long lived maps? Is having a map while context
> switching back to userspace considered long lived? The GHCB will
> (possibly) only be mapped on VMEXIT (VMGEXIT) and unmapped on the next
> VMRUN for the vCPU. An AP reset hold could be a while, though.

Anything that cannot be unmapped in the same function that maps it,
essentially.

>> 2) upon an AP reset hold exit, the above patch sets the EXITINFO2 field
>> before the SIPI was received.  My understanding is that the processor will
>> not see the value anyway until it resumes execution, and why would other
>> vCPUs muck with the AP's GHCB.  But I'm not sure if it's okay.
>
> As long as the vCPU might not be woken up for any reason other than a
> SIPI, you can get away with this. But if it was to be woken up for some
> other reason (an IPI for some reason?), then you wouldn't want the
> non-zero value set in the GHCB in advance, because that might cause the
> vCPU to exit the HLT loop it is in waiting for the actual SIPI.

Ok.  Then the best thing to do is to pull sev_es_pre_run to the
prepare_guest_switch callback.

Paolo
On 5/5/21 1:50 PM, Paolo Bonzini wrote:
> On 05/05/21 20:39, Tom Lendacky wrote:
>> As long as the vCPU might not be woken up for any reason other than a
>> SIPI, you can get away with this. But if it was to be woken up for some
>> other reason (an IPI for some reason?), then you wouldn't want the
>> non-zero value set in the GHCB in advance, because that might cause the
>> vCPU to exit the HLT loop it is in waiting for the actual SIPI.
>
> Ok.  Then the best thing to do is to pull sev_es_pre_run to the
> prepare_guest_switch callback.

A quick test of this failed (VMRUN failure), let me see what is going on
and post back.

Thanks,
Tom

> Paolo
On 5/5/21 2:55 PM, Tom Lendacky wrote:
> On 5/5/21 1:50 PM, Paolo Bonzini wrote:
>> Ok.  Then the best thing to do is to pull sev_es_pre_run to the
>> prepare_guest_switch callback.
>
> A quick test of this failed (VMRUN failure), let me see what is going on
> and post back.

I couldn't just move sev_es_pre_run() into sev_es_prepare_guest_switch()
because of the guest_state_loaded check in svm_prepare_guest_switch().
Renaming sev_es_pre_run() to sev_es_unmap_ghcb() and calling it before the
guest_state_loaded check worked nicely. I'll send a patch soon.

Thanks,
Tom
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 17adc1e79136..76f90cf8c5aa 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2723,7 +2723,10 @@ static void pre_sev_es_run(struct vcpu_svm *svm)
 
 	sev_es_sync_to_ghcb(svm);
 
+	svm->vcpu.srcu_idx = srcu_read_lock(&svm->vcpu.kvm->srcu);
 	kvm_vcpu_unmap(&svm->vcpu, &svm->ghcb_map, true);
+	srcu_read_unlock(&svm->vcpu.kvm->srcu, svm->vcpu.srcu_idx);
+
 	svm->ghcb = NULL;
 }