Message ID | 20230922140914.13906-3-phil@philjordan.eu (mailing list archive)
---|---
State | New, archived
Series | hvf x86 correctness and efficiency improvements
On Fri, Sep 22, 2023 at 04:09:13PM +0200, Phil Dennis-Jordan wrote:
> When interrupting a vCPU thread, this patch actually tells the hypervisor to
> stop running guest code on that vCPU.
>
> Calling hv_vcpu_interrupt actually forces a vCPU exit, analogously to
> hv_vcpus_exit on aarch64.
>
> Previously, hvf_kick_vcpu_thread relied upon hv_vcpu_run returning very
> frequently, including many spurious exits, which made it less of a problem
> that nothing was actively done to stop the vCPU thread running guest code.
> The newer, more efficient hv_vcpu_run_until exits much more rarely, so a
> true "kick" is needed.
>

Hi Phil,

I see a severe performance regression with the patch on a Windows XP guest.
The display is not refreshed properly, like a broken LVDS panel: only some
horizontal lines appear on it. My test laptop for x86 hvf is an MBA 2015 with
the latest Big Sur. What are you running QEMU/HVF on?

FWIW, I recall that a few years ago I submitted a patch that does something
similar but addresses a few more issues:

https://patchwork.kernel.org/project/qemu-devel/patch/20200729124832.79375-1-r.bolshakov@yadro.com/

I don't remember why it never got merged.

Regards,
Roman
On Sun, 8 Oct 2023 at 20:23, Roman Bolshakov <roman@roolebo.dev> wrote:
> On Fri, Sep 22, 2023 at 04:09:13PM +0200, Phil Dennis-Jordan wrote:
> > When interrupting a vCPU thread, this patch actually tells the hypervisor
> > to stop running guest code on that vCPU.
> >
> > Calling hv_vcpu_interrupt actually forces a vCPU exit, analogously to
> > hv_vcpus_exit on aarch64.
> >
> > Previously, hvf_kick_vcpu_thread relied upon hv_vcpu_run returning very
> > frequently, including many spurious exits, which made it less of a problem
> > that nothing was actively done to stop the vCPU thread running guest code.
> > The newer, more efficient hv_vcpu_run_until exits much more rarely, so a
> > true "kick" is needed.
> >

Hi Roman,

Thanks for the review and test of this patch and the preceding one!

> I see severe performance regression with the patch on a Windows XP
> guest. The display is not refreshed properly like a broken LVDS panel,
> only some horizontal lines appear on it.

OK, that's interesting - I've been running into that sort of issue while
trying to integrate HVF's APIC implementation into QEMU. I assume that's with
patch 3/3 applied as well? The fact that you've repro'd it with just these
patches would explain why I've not been able to fix it on the APIC side…

> My test laptop for x86 hvf is an MBA 2015 with the latest Big Sur. What are
> you running QEMU/HVF on?

Most of the testing has been with 2018 Mac Mini (Intel Coffee Lake) hosts
running Big Sur (11), Monterey (12), and Ventura (13), and using macOS
guests. I've also sanity-checked with a 2015 MBP (Broadwell) Monterey host
with various macOS guests as well as Linux and FreeBSD guests. Guess I should
have tried Windows guests too, sorry about the regression!

> FWIW, I recall that a few years ago I submitted a patch that does
> something similar but addresses a few more issues:
>
> https://patchwork.kernel.org/project/qemu-devel/patch/20200729124832.79375-1-r.bolshakov@yadro.com/
>
> I don't remember why it never got merged.

Looks like the VM kick might be a more complex undertaking than I was
anticipating. I'll try to repro the problem you ran into, and then look over
your original patch and make sense of it. Hopefully an updated version of
your 'kick' implementation will work well in combination with the newer
hv_vcpu_run_until() API from patch 3/3. I'll keep you posted.

Thanks again,
Phil
On Sun, Oct 08, 2023 at 08:39:08PM +0200, Phil Dennis-Jordan wrote:
> On Sun, 8 Oct 2023 at 20:23, Roman Bolshakov <roman@roolebo.dev> wrote:
> > On Fri, Sep 22, 2023 at 04:09:13PM +0200, Phil Dennis-Jordan wrote:
> > > When interrupting a vCPU thread, this patch actually tells the
> > > hypervisor to stop running guest code on that vCPU.
> > >
> > > Calling hv_vcpu_interrupt actually forces a vCPU exit, analogously to
> > > hv_vcpus_exit on aarch64.
> > >
> > > Previously, hvf_kick_vcpu_thread relied upon hv_vcpu_run returning very
> > > frequently, including many spurious exits, which made it less of a
> > > problem that nothing was actively done to stop the vCPU thread running
> > > guest code. The newer, more efficient hv_vcpu_run_until exits much more
> > > rarely, so a true "kick" is needed.
> > >
>
> Hi Roman,
>
> Thanks for the review and test of this patch and the preceding one!
>

My pleasure. Thanks for submitting the patch and confirming the need for the
feature on x86 HVF.

> > I see severe performance regression with the patch on a Windows XP
> > guest. The display is not refreshed properly like a broken LVDS panel,
> > only some horizontal lines appear on it.
>
> I assume that's with patch 3/3 applied as well? The fact that you've
> repro'd it with just these patches would explain why I've not been able
> to fix it on the APIC side…
>

Yes, I applied it with patch 3/3 and then retested with only the first two
patches.

> > FWIW, I recall that a few years ago I submitted a patch that does
> > something similar but addresses a few more issues:
> >
> > https://patchwork.kernel.org/project/qemu-devel/patch/20200729124832.79375-1-r.bolshakov@yadro.com/
> >
> > I don't remember why it never got merged.
>
> Looks like the VM kick might be a more complex undertaking than I was
> anticipating. I'll try to repro the problem you ran into, and then look
> over your original patch and make sense of it. Hopefully an updated
> version of your 'kick' implementation will work well in combination with
> the newer hv_vcpu_run_until() API from patch 3/3. I'll keep you posted.
>

Apparently I left a note that some interrupts weren't delivered even with my
patch, and I was not able to figure out the reason back then. I had another
attempt at debugging this two weeks after I submitted v4, and I can find a
WIP branch on GitHub where I added a debug registers support patch and some
tracepoints:

https://github.com/qemu/qemu/compare/master...roolebo:qemu:hvf-debug-kick

Perhaps that's where we should start from, besides the obvious need for a
rebase.

With regards to hv_vcpu_run_until() I can find the following thread:
https://lists.nongnu.org/archive/html/qemu-devel/2020-06/msg09468.html

> hv_vcpu_run_until() was also evaluated on macOS 10.15.5 but it degrades
> VM performance significantly compared to explicit setting of the
> VMX-preemption timer value and hv_vcpu_run(). The performance issue was
> observed on a Broadwell-based MacBook Air and an Ivy Bridge-based MacBook
> Pro.
>
> macOS 11.0 Beta deprecated hv_vcpu_run() and introduced a special
> declaration for hv_vcpu_run_until() that's not available on 10.15 -
> HV_DEADLINE_FOREVER (UINT64_MAX, which is bigger than the maximum value
> of the VMX-preemption counter). Perhaps the performance issue is
> addressed there.

The whole discussion with Paolo might be helpful, particularly:
https://lists.nongnu.org/archive/html/qemu-devel/2020-06/msg09893.html

> So, I've tried Big Sur Beta and it has exactly the same performance
> issue with hv_vcpu_run_until() while hv_vcpu_run() works as well as it
> did on 10.15.5. I've submitted FB7827341 to Apple wrt the issue.

In November 2020, Apple responded to FB7827341 that there's an issue on the
QEMU side.

Regards,
Roman
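For context, the API difference under discussion boils down to roughly the
following. This is a minimal sketch against the Hypervisor.framework C API
(hv_vcpu_run_until and HV_DEADLINE_FOREVER require macOS 11+); the loop
structure, helper names and comments are illustrative, not QEMU's actual hvf
code.

#include <Hypervisor/Hypervisor.h>

/* Old style: hv_vcpu_run() returns frequently (including spurious exits),
 * optionally bounded further by programming the VMX-preemption timer VMCS
 * field, so a "kick" can simply wait for the next natural exit. */
static void run_slice_legacy(hv_vcpuid_t vcpu)
{
    /* (optionally write a small VMX-preemption timer value into the VMCS) */
    hv_vcpu_run(vcpu);              /* deprecated on macOS 11+ */
    /* ...handle the exit, check whether a kick was requested... */
}

/* New style: hv_vcpu_run_until() with HV_DEADLINE_FOREVER only exits for
 * "real" reasons, so a pending kick must force an exit explicitly. */
static void run_slice_modern(hv_vcpuid_t vcpu)
{
    hv_vcpu_run_until(vcpu, HV_DEADLINE_FOREVER);
    /* ...handle the exit... */
}

/* The kick itself, issued from another thread. */
static void kick_from_other_thread(hv_vcpuid_t vcpu)
{
    hv_vcpu_interrupt(&vcpu, 1);    /* forces the vCPU out of guest mode */
}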
On Sun, 8 Oct 2023 at 21:19, Roman Bolshakov <roman@roolebo.dev> wrote:
> > I assume that's with patch 3/3 applied as well? The fact that you've
> > repro'd it with just these patches would explain why I've not been able
> > to fix it on the APIC side…
>
> Yes, I applied it with patch 3/3 and then retested with only the first two
> patches.

OK, interesting that it would happen without patch 3 as well.

> > > FWIW, I recall that a few years ago I submitted a patch that does
> > > something similar but addresses a few more issues:
> > >
> > > https://patchwork.kernel.org/project/qemu-devel/patch/20200729124832.79375-1-r.bolshakov@yadro.com/
> > >
> > > I don't remember why it never got merged.
> >
> > Looks like the VM kick might be a more complex undertaking than I was
> > anticipating. I'll try to repro the problem you ran into, and then look
> > over your original patch and make sense of it. Hopefully an updated
> > version of your 'kick' implementation will work well in combination with
> > the newer hv_vcpu_run_until() API from patch 3/3. I'll keep you posted.
> >
>
> Apparently I left a note that some interrupts weren't delivered even with
> my patch, and I was not able to figure out the reason back then. I had
> another attempt at debugging this two weeks after I submitted v4, and I
> can find a WIP branch on GitHub where I added a debug registers support
> patch and some tracepoints:
>
> https://github.com/qemu/qemu/compare/master...roolebo:qemu:hvf-debug-kick
>
> Perhaps that's where we should start from, besides the obvious need for a
> rebase.

Sounds good, I'll take a look at those changes and try to work out what to do
next.

> With regards to hv_vcpu_run_until() I can find the following thread:
> https://lists.nongnu.org/archive/html/qemu-devel/2020-06/msg09468.html
>
> > hv_vcpu_run_until() was also evaluated on macOS 10.15.5 but it degrades
> > VM performance significantly compared to explicit setting of the
> > VMX-preemption timer value and hv_vcpu_run(). The performance issue was
> > observed on a Broadwell-based MacBook Air and an Ivy Bridge-based MacBook
> > Pro.
> >
> > macOS 11.0 Beta deprecated hv_vcpu_run() and introduced a special
> > declaration for hv_vcpu_run_until() that's not available on 10.15 -
> > HV_DEADLINE_FOREVER (UINT64_MAX, which is bigger than the maximum value
> > of the VMX-preemption counter). Perhaps the performance issue is
> > addressed there.
>
> The whole discussion with Paolo might be helpful, particularly:
> https://lists.nongnu.org/archive/html/qemu-devel/2020-06/msg09893.html
>
> > So, I've tried Big Sur Beta and it has exactly the same performance
> > issue with hv_vcpu_run_until() while hv_vcpu_run() works as well as it
> > did on 10.15.5. I've submitted FB7827341 to Apple wrt the issue.
>
> In November 2020, Apple responded to FB7827341 that there's an issue on
> the QEMU side.

Hmm, that's interesting. I'll need to work my way through that thread, but
I'll point out that in my testing with SMP guests, I measured a performance
*improvement* with the hv_vcpu_run_until() API (and the forever deadline)
versus hv_vcpu_run(), as it significantly reduced BQL contention - with so
many VMEXITs, vCPU threads were spending a lot of time waiting for the lock.
Hi Roman,

A quick update on my progress on this so far:

- I've managed to repro the graphical issue; it turns up with the switch to
  hv_vcpu_run_until() (patch 3/3). I can't repro it with just the changes to
  the kick (patch 2/3; it would have surprised me if that had caused it).
- Thanks for linking to the old mailing list threads for background on the
  issues.
- I've rebased/ported your more sophisticated vCPU kick implementation to the
  latest upstream master branch. This doesn't seem to cause any obvious
  regressions, but it also doesn't appear to fix the issue with
  hv_vcpu_run_until(). My branch is here:
  https://github.com/pmj/qemu/tree/hvf-kick-modern
  and the updated version of your patch is here:
  https://github.com/qemu/qemu/commit/fadc716c5bb15345a49e08eecf7ab1077782975c
- I've done some tracing, and so far I can't spot any undelivered or slow
  kicks. Each kick is paired with a return from hvf_vcpu_exec with reason
  EXCP_INTERRUPT, and this happens within less than a millisecond, typically
  around 200µs. I'm seeing kicks being delivered at all the various stages
  we're expecting them, apart from VMX-preemption timer exits, at least so
  far. But those only target a tiny window and so should be statistically
  quite rare.
- I'm starting to entertain the idea that hv_vcpu_run_until() exhibits some
  other difference compared to hv_vcpu_run() which is causing the graphical
  issues, rather than a problem with interrupt delivery. Unfortunately, I'm
  not familiar with how drawing works on legacy (S)VGA graphics, or what
  effect might cause it to end up with these dropped scanline updates. I
  might try to find some kind of OSDev example code that draws to (S)VGA and
  hopefully lets me repro and perhaps debug the problem in isolation.

Any ideas/thoughts?

Thanks,
Phil
Hi Roman, hi Paolo,

Just an update on my investigation of the hv_vcpu_run -> hv_vcpu_run_until
issue. The graphical issues with the Windows XP VM appear to be caused by the
dirty memory page system not working as expected. The emulated (Cirrus) VGA
adapter uses dirty page tracking to perform partial screen updates, so when
pages aren't marked as dirty, they don't get updated on the host console.

This got me digging into how dirty memory tracking is actually implemented in
the QEMU hvf backend, and basically, it should never have worked in the first
place. When we get a write fault, the code marks the *whole* 'logged' memory
range as writable rather than just the page that has just been dirtied. It
just so happens that hv_vcpu_run was causing EPT fault exits on those pages
even after marking them writable (?), and hv_vcpu_run_until() no longer does
that. So basically, this has been a QEMU bug masked by undesirable
hv_vcpu_run() behaviour. I'll start putting together a fix for this.

I'm also hoping to settle the hv_vcpu_interrupt() race condition question
empirically - if we can avoid the complicated signal/vmexit race avoidance
logic with atomic flags, that will make the code rather simpler.

Phil
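To make the per-page idea above concrete, here is a rough sketch of what
handling such a write fault might look like. hv_vm_protect() and
memory_region_set_dirty() are real Hypervisor.framework/QEMU calls, but the
DirtyLogSlot structure, its fields, and the helper name are hypothetical
stand-ins - this is not the actual QEMU hvf code nor the eventual fix.

#include "qemu/osdep.h"
#include "exec/memory.h"
#include <Hypervisor/Hypervisor.h>
#include <unistd.h>

typedef struct DirtyLogSlot {     /* hypothetical stand-in for QEMU's hvf slot */
    uint64_t gpa_start;           /* guest-physical base of the logged range */
    uint64_t size;
    MemoryRegion *region;
} DirtyLogSlot;

static void handle_logged_write_fault(DirtyLogSlot *slot, uint64_t fault_gpa)
{
    uint64_t page_size = (uint64_t)getpagesize();   /* QEMU has its own helper */
    uint64_t page_gpa  = fault_gpa & ~(page_size - 1);

    /* Mark only the faulting page dirty so consumers like the VGA code can
     * perform partial updates. */
    memory_region_set_dirty(slot->region, page_gpa - slot->gpa_start, page_size);

    /* Re-enable writes for this page only; the rest of the logged range stays
     * write-protected, so writes elsewhere keep faulting and keep getting
     * logged - unlike the current code, which write-enables the whole range
     * on the first fault. */
    hv_vm_protect(page_gpa, page_size,
                  HV_MEMORY_READ | HV_MEMORY_WRITE | HV_MEMORY_EXEC);
}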
On Fri, Oct 20, 2023 at 05:12:13PM +0200, Phil Dennis-Jordan wrote:
> Hi Roman, hi Paolo,
>

Hi Phil,

Pardon for not responding earlier, I was travelling the last three weeks.

I appreciate the time you spent on the rebase. I have compiled it and
observed the same issue with SVGA as with your third patch.

> Just an update on my investigation of the hv_vcpu_run ->
> hv_vcpu_run_until issue. The graphical issues with the Windows XP VM
> appear to be caused by the dirty memory page system not working as
> expected. The emulated (Cirrus) VGA adapter uses dirty page tracking
> to perform partial screen updates, so when pages aren't marked as
> dirty, they don't get updated on the host console.
>

That sounds awesome, I think you have tracked it down correctly. I have also
looked at the SVGA code, and the only idea I had was that dirty tracking is
somehow not working properly.

I observed a similar issue when I tried to add a GDB stub for x86 hvf. The
changes from the GDB stub produced no apparent effect on the guest -
breakpoints were there, in memory, but did not stop the guest, and so on. I
never worked out why it didn't work back then.

> This got me digging into how dirty memory tracking is actually
> implemented in the QEMU hvf backend, and basically, it should never
> have worked in the first place. When we get a write fault, the code
> marks the *whole* 'logged' memory range as writable rather than just
> the page that has just been dirtied. It just so happens that hv_vcpu_run
> was causing EPT fault exits on those pages even after marking them
> writable (?), and hv_vcpu_run_until() no longer does that. So
> basically, this has been a QEMU bug masked by undesirable
> hv_vcpu_run() behaviour. I'll start putting together a fix for this.
>

Sounds good, have you got anything to test or review? Meanwhile, I'll review
the pending patches you sent.

Best regards,
Roman
Hi Roman,

On Sun, 5 Nov 2023 at 16:21, Roman Bolshakov <roman@roolebo.dev> wrote:
> > This got me digging into how dirty memory tracking is actually
> > implemented in the QEMU hvf backend, and basically, it should never
> > have worked in the first place. When we get a write fault, the code
> > marks the *whole* 'logged' memory range as writable rather than just
> > the page that has just been dirtied. It just so happens that hv_vcpu_run
> > was causing EPT fault exits on those pages even after marking them
> > writable (?), and hv_vcpu_run_until() no longer does that. So
> > basically, this has been a QEMU bug masked by undesirable
> > hv_vcpu_run() behaviour. I'll start putting together a fix for this.
> >
>
> Sounds good, have you got anything to test or review? Meanwhile, I'll
> review the pending patches you sent.

Sorry, I've likewise been busy with other things the last 2-3 weeks.

As far as I'm aware, we don't actually know 100% for certain whether there's
a race condition when using hv_vcpu_interrupt(), right? (As I mentioned, the
patches with hv_vcpu_run_until and an hv_vcpu_interrupt-only kick have been
running without issue for months on dozens to hundreds of VMs.) So before we
add the complexity of the hybrid hv_vcpu_interrupt & signal-based kick to the
codebase, I'd like to test hv_vcpu_interrupt's behaviour in isolation and
actually force the edge cases we're worried about. But yeah, I haven't got
around to doing that yet. :-) I'm hoping to take a crack at it later this
week or next week, probably using https://github.com/mist64/hvdos as a
starting point.

Thanks for reviewing and testing the first set of patches!

Phil
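For illustration, such an isolated hv_vcpu_interrupt() experiment might look
roughly like the sketch below, in the spirit of the hvdos sample. The guest
memory mapping, VMCS and register setup that hvdos performs are omitted and
merely assumed; the delay in the kicker thread is purely illustrative (you
would vary it to probe the suspected race windows).

#include <Hypervisor/Hypervisor.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static hv_vcpuid_t vcpu;

/* Sends the kick from a second thread while (we hope) the vCPU is spinning
 * inside the guest. */
static void *kicker_thread(void *arg)
{
    (void)arg;
    usleep(1000);
    hv_vcpu_interrupt(&vcpu, 1);    /* force an exit from another thread */
    puts("kick sent");
    return NULL;
}

int main(void)
{
    hv_vm_create(HV_VM_DEFAULT);
    hv_vcpu_create(&vcpu, HV_VCPU_DEFAULT);
    /* ...map guest memory containing a tight spin loop ("jmp $") and set up
     * the VMCS and guest registers as hvdos does - omitted here... */

    pthread_t kicker;
    pthread_create(&kicker, NULL, kicker_thread, NULL);

    /* With a spinning guest and no other exit sources, a prompt return here
     * means the kick was delivered; a hang would suggest a lost kick. */
    hv_vcpu_run_until(vcpu, HV_DEADLINE_FOREVER);
    puts("vCPU exited");

    pthread_join(kicker, NULL);
    hv_vcpu_destroy(vcpu);
    hv_vm_destroy();
    return 0;
}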
diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index cb2cd0b02f..55bd7d2af8 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -209,7 +209,10 @@ static inline bool apic_bus_freq_is_known(CPUX86State *env)
 
 void hvf_kick_vcpu_thread(CPUState *cpu)
 {
+    hv_vcpuid_t hvf_vcpuid;
     cpus_kick_thread(cpu);
+    hvf_vcpuid = cpu->accel->fd;
+    hv_vcpu_interrupt(&hvf_vcpuid, 1);
 }
 
 int hvf_arch_init(void)
When interrupting a vCPU thread, this patch actually tells the hypervisor to
stop running guest code on that vCPU.

Calling hv_vcpu_interrupt actually forces a vCPU exit, analogously to
hv_vcpus_exit on aarch64.

Previously, hvf_kick_vcpu_thread relied upon hv_vcpu_run returning very
frequently, including many spurious exits, which made it less of a problem
that nothing was actively done to stop the vCPU thread running guest code.
The newer, more efficient hv_vcpu_run_until exits much more rarely, so a true
"kick" is needed.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
---
 target/i386/hvf/hvf.c | 3 +++
 1 file changed, 3 insertions(+)