Message ID | CACt9=QgsSM18to9M5k8+3N3NvRoNVmAvsQo5oLO5-A0dm7VFNg@mail.gmail.com (mailing list archive) |
---|---|
State | New |
Series | Xen panic due to xstate mismatch |
This is a sanity check that an algorithm in Xen matches hardware. It is only compiled into debug builds by default.

Given that you're running under VirtualBox, I have a suspicion as to what's wrong.

Can you collect the full `xen-cpuid -p` output from within your environment? I don't believe your suggested code change is correct, but it will be good enough to get these diagnostics.

~Andrew

On Sun, 2 Feb 2025, 15:32 Guillaume, <thouveng@gmail.com> wrote:

> Hello,
>
> I'd like to report an issue I encountered when building Xen from source. To give you some context: during the Xen winter meetup in Grenoble a few days ago, there was a discussion about strengthening collaboration between Xen and academia. One issue raised by a professor was that Xen is harder for students to install and experiment with than KVM. In response, it was mentioned that the Debian packages are quite decent. This motivated me to try installing and playing with Xen myself. While I am familiar with Xen (I work on the XAPI toolstack at Vates), I'm not deeply familiar with its internals, so this seemed like a good learning opportunity and maybe some content for a few blog posts :).
>
> I set up a Debian testing VM on VirtualBox and installed Xen from packages. Everything worked fine: GRUB was updated, I rebooted, and I had a functional Xen setup with xl running in Dom0.
> Next I downloaded the latest version of Xen from xenbits.org and built only the hypervisor (no tools, no stubdom), using the same configuration as the Debian package (which is for Xen 4.19). After updating GRUB and rebooting, Xen failed to boot. Fortunately, I was able to capture the following error via `ttyS0`:
> ```
> (XEN) [0000000d2c23739a] xstate: size: 0x340 and states: 0x7
> (XEN) [0000000d2c509c1d]
> (XEN) [0000000d2c641ffa] ****************************************
> (XEN) [0000000d2c948e3b] Panic on CPU 0:
> (XEN) [0000000d2cb349bb] XSTATE 0x0000000000000003, uncompressed hw size 0x340 != xen size 0x240
> (XEN) [0000000d2cfc5786] ****************************************
> (XEN) [0000000d2d308c24]
> ```
> From my understanding, the hardware xstate size (`hw_size`) represents the maximum memory required for the `XSAVE/XRSTOR` save area, while `xen_size` is computed by summing the space required for the enabled features. In `xen/arch/x86/xstate.c`, if these sizes do not match, Xen panics. However, wouldn’t it be correct for `xen_size` to be **less than or equal to** `hw_size` instead of exactly matching?
>
> I tested the following change:
> ```
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -710,7 +710,7 @@ static void __init check_new_xstate(struct xcheck_state *s, uint64_t new)
>       */
>      xen_size = xstate_uncompressed_size(s->states & X86_XCR0_STATES);
>
> -    if ( xen_size != hw_size )
> +    if ( xen_size > hw_size )
>          panic("XSTATE 0x%016"PRIx64", uncompressed hw size %#x != xen size %#x\n",
>                s->states, hw_size, xen_size);
> ```
> With this change, Xen boots correctly, but I may be missing some side effects...
> Additionally, I am confused as to why this issue does *not* occur with the default Debian Xen package. Even when I rebuild Xen *4.19.1* from source (the same version as the package), I still encounter the issue.
> So I have two questions:
> - Is my understanding correct that xen_size <= hw_size should be allowed?
> - Are there any potential side effects of this change?
> - Bonus: Does anyone have an explanation for why the issue does not occur with the packaged version of Xen but does with a self-built version?
>
> Hope I wasn't too long and thanks for taking the time to read this,
> Best regards,
>
> Guillaume
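To make the size arithmetic in the question concrete, here is a minimal sketch of how an uncompressed XSAVE area size can be derived from CPUID leaf 0xD data. It is not Xen's `xstate_uncompressed_size()`, whose body is not shown in this thread; the struct and function names are hypothetical. It assumes only the architectural layout: a 512-byte legacy region plus a 64-byte header (0x240 in total), followed by components whose size and offset are reported in EAX and EBX of each 0xD subleaf.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * One XSAVE state component as described by CPUID.(EAX=0xD, ECX=index):
 * EAX gives its size in bytes, EBX its offset in the uncompressed format.
 * (Hypothetical type, for illustration only.)
 */
struct xstate_component {
    unsigned int index;   /* bit position of the component in XCR0 */
    uint32_t size;
    uint32_t offset;
};

/* Legacy x87/SSE region (512 bytes) plus the 64-byte XSAVE header. */
#define XSAVE_LEGACY_AND_HEADER 0x240u

/*
 * Uncompressed size for the states enabled in xcr0: the end of the
 * highest enabled component, and never less than legacy + header.
 */
static uint32_t uncompressed_size_sketch(uint64_t xcr0,
                                         const struct xstate_component *c,
                                         size_t nr)
{
    uint32_t size = XSAVE_LEGACY_AND_HEADER;

    for ( size_t i = 0; i < nr; i++ )
    {
        uint32_t end;

        /* Components 0 and 1 (x87, SSE) live in the legacy region. */
        if ( c[i].index < 2 || c[i].index > 63 )
            continue;
        if ( !(xcr0 & (1ULL << c[i].index)) )
            continue;

        end = c[i].offset + c[i].size;
        if ( end > size )
            size = end;
    }

    return size;
}
```

With a single AVX entry (index 2, size 0x100, offset 0x240, the values that subleaf 2 reports in the `xen-cpuid -p` output posted later in the thread), this gives 0x240 for states 0x3 and 0x340 for states 0x7. Those are the two numbers in the panic message, which reports the 0x340 figure from hardware while only states 0x3 are under test.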
Yes, sure, I can collect the output. As you said, the change is good enough to start the dom0 without errors (at least no apparent errors :).
```
Xen reports there are maximum 120 leaves and 2 MSRs
Raw policy: 32 leaves, 2 MSRs
 CPUID:
  leaf subleaf -> eax ebx ecx edx
  00000000:ffffffff -> 00000016:756e6547:6c65746e:49656e69
  00000001:ffffffff -> 000806c1:00020800:f6fa3203:178bfbff
  00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
  00000004:00000000 -> 04000121:02c0003f:0000003f:00000000
  00000004:00000001 -> 04000122:01c0003f:0000003f:00000000
  00000004:00000002 -> 04000143:04c0003f:000003ff:00000000
  00000004:00000003 -> 04000163:02c0003f:00003fff:00000004
  00000006:ffffffff -> 00000004:00000000:00000000:00000000
  00000007:00000000 -> 00000000:208c2569:00000000:30000400
  0000000b:00000000 -> 00000000:00000001:00000100:00000000
  0000000b:00000001 -> 00000001:00000002:00000201:00000000
  0000000d:00000000 -> 00000007:00000000:00000340:00000000
  0000000d:00000002 -> 00000100:00000240:00000000:00000000
  80000000:ffffffff -> 80000008:00000000:00000000:00000000
  80000001:ffffffff -> 00000000:00000000:00000121:28100800
  80000002:ffffffff -> 68743131:6e654720:746e4920:52286c65
  80000003:ffffffff -> 6f432029:54286572:6920294d:31312d37
  80000004:ffffffff -> 37473538:33204020:4730302e:00007a48
  80000006:ffffffff -> 00000000:00000000:01007040:00000000
  80000007:ffffffff -> 00000000:00000000:00000000:00000100
  80000008:ffffffff -> 00003027:00000000:00000000:00000000
 MSRs:
  index -> value
  000000ce -> 0000000000000000
  0000010a -> 0000000000000000
Host policy: 30 leaves, 2 MSRs
 CPUID:
  leaf subleaf -> eax ebx ecx edx
  00000000:ffffffff -> 0000000d:756e6547:6c65746e:49656e69
  00000001:ffffffff -> 000806c1:00020800:c6fa2203:178bfbff
  00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
  00000004:00000000 -> 04000121:02c0003f:0000003f:00000000
  00000004:00000001 -> 04000122:01c0003f:0000003f:00000000
  00000004:00000002 -> 04000143:04c0003f:000003ff:00000000
  00000004:00000003 -> 04000163:02c0003f:00003fff:00000004
  00000007:00000000 -> 00000000:208c2549:00000000:30000400
  0000000d:00000000 -> 00000003:00000000:00000240:00000000
  80000000:ffffffff -> 80000008:00000000:00000000:00000000
  80000001:ffffffff -> 00000000:00000000:00000121:28100800
  80000002:ffffffff -> 68743131:6e654720:746e4920:52286c65
  80000003:ffffffff -> 6f432029:54286572:6920294d:31312d37
  80000004:ffffffff -> 37473538:33204020:4730302e:00007a48
  80000006:ffffffff -> 00000000:00000000:01007040:00000000
  80000007:ffffffff -> 00000000:00000000:00000000:00000100
  80000008:ffffffff -> 00003027:00000000:00000000:00000000
 MSRs:
  index -> value
  000000ce -> 0000000000000000
  0000010a -> 0000000000000000
PV Max policy: 57 leaves, 2 MSRs
 CPUID:
  leaf subleaf -> eax ebx ecx edx
  00000000:ffffffff -> 0000000d:756e6547:6c65746e:49656e69
  00000001:ffffffff -> 000806c1:00020800:c6f82203:1789cbf5
  00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
  00000004:00000000 -> 04000121:02c0003f:0000003f:00000000
  00000004:00000001 -> 04000122:01c0003f:0000003f:00000000
  00000004:00000002 -> 04000143:04c0003f:000003ff:00000000
  00000004:00000003 -> 04000163:02c0003f:00003fff:00000004
  00000007:00000000 -> 00000002:208c0109:00000000:20000400
  0000000d:00000000 -> 00000003:00000000:00000240:00000000
  80000000:ffffffff -> 80000021:00000000:00000000:00000000
  80000001:ffffffff -> 00000000:00000000:00000123:28100800
  80000002:ffffffff -> 68743131:6e654720:746e4920:52286c65
  80000003:ffffffff -> 6f432029:54286572:6920294d:31312d37
  80000004:ffffffff -> 37473538:33204020:4730302e:00007a48
  80000006:ffffffff -> 00000000:00000000:01007040:00000000
  80000007:ffffffff -> 00000000:00000000:00000000:00000100
  80000008:ffffffff -> 00003027:00000000:00000000:00000000
 MSRs:
  index -> value
  000000ce -> 0000000000000000
  0000010a -> 0000000010020004
HVM Max policy: 4 leaves, 2 MSRs
 CPUID:
  leaf subleaf -> eax ebx ecx edx
 MSRs:
  index -> value
  000000ce -> 0000000000000000
  0000010a -> 0000000000000000
PV Default policy: 30 leaves, 2 MSRs
 CPUID:
  leaf subleaf -> eax ebx ecx edx
  00000000:ffffffff -> 0000000d:756e6547:6c65746e:49656e69
  00000001:ffffffff -> 000806c1:00020800:c6d82203:1789cbf5
  00000002:ffffffff -> 00feff01:000000f0:00000000:00000000
  00000004:00000000 -> 04000121:02c0003f:0000003f:00000000
  00000004:00000001 -> 04000122:01c0003f:0000003f:00000000
  00000004:00000002 -> 04000143:04c0003f:000003ff:00000000
  00000004:00000003 -> 04000163:02c0003f:00003fff:00000004
  00000007:00000000 -> 00000000:208c0109:00000000:20000400
  0000000d:00000000 -> 00000003:00000000:00000240:00000000
  80000000:ffffffff -> 80000008:00000000:00000000:00000000
  80000001:ffffffff -> 00000000:00000000:00000121:28100800
  80000002:ffffffff -> 68743131:6e654720:746e4920:52286c65
  80000003:ffffffff -> 6f432029:54286572:6920294d:31312d37
  80000004:ffffffff -> 37473538:33204020:4730302e:00007a48
  80000006:ffffffff -> 00000000:00000000:01007040:00000000
  80000008:ffffffff -> 00003027:00000000:00000000:00000000
 MSRs:
  index -> value
  000000ce -> 0000000000000000
  0000010a -> 0000000000000000
HVM Default policy: 4 leaves, 2 MSRs
 CPUID:
  leaf subleaf -> eax ebx ecx edx
 MSRs:
  index -> value
  000000ce -> 0000000000000000
  0000010a -> 0000000000000000
```

Guillaume

On Sun, Feb 2, 2025 at 4:32 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:

> This is a sanity check that an algorithm in Xen matches hardware. It is only compiled into debug builds by default.
>
> Given that you're running under VirtualBox, I have a suspicion as to what's wrong.
>
> Can you collect the full `xen-cpuid -p` output from within your environment? I don't believe your suggested code change is correct, but it will be good enough to get these diagnostics.
>
> ~Andrew
Can you also get `xl dmesg` too, and attach it?

I think this is a VirtualBox bug, but I'm confused as to why Xen has decided to turn off AVX.

~Andrew

On 02/02/2025 4:01 pm, Guillaume wrote:
> Yes sure I can collect the output. As you said the change is good enough to start the dom0 without errors (at least no apparent errors :).
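The remark about AVX being turned off can be checked against the `xen-cpuid -p` dump by comparing the Raw and Host policy words for the same leaf. The following is a small sketch with a hypothetical helper name, not code from Xen. The bit positions are the architectural ones for CPUID.1:ECX (FMA bit 12, AVX bit 28, F16C bit 29) and CPUID.7[0]:EBX (AVX2 bit 5).

```c
#include <stdint.h>

/* Architectural feature bit positions in CPUID.1:ECX. */
#define CPUID1_ECX_FMA   (1u << 12)
#define CPUID1_ECX_AVX   (1u << 28)
#define CPUID1_ECX_F16C  (1u << 29)

/* Architectural feature bit position in CPUID.7[0]:EBX. */
#define CPUID7_EBX_AVX2  (1u << 5)

/*
 * Bits present in the raw (hardware-reported) word but absent from the
 * host policy word, i.e. features that were masked out.
 * (Hypothetical helper, for illustration only.)
 */
static uint32_t cleared_bits(uint32_t raw, uint32_t host)
{
    return raw & ~host;
}
```

Applied to the values in the dump, leaf 1 ECX (`f6fa3203` raw, `c6fa2203` host) yields FMA, AVX and F16C as the masked bits, and leaf 7 EBX (`208c2569` raw, `208c2549` host) yields AVX2. That agrees with the Host policy reporting only states 0x3 and size 0x240 in leaf 0xD.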
I attached the output of `xl dmesg`. This is the 4.19.1 kernel I rebuilt, but I have the same issue with master (just for info).

Also, as you said earlier, it works with the default installation, because I see that its first line is:
`(XEN) [0000001476779e16] Xen version 4.19.1 (Debian 4.19.1-1+b2) (pkg-xen-devel@lists.alioth.debian.org) (x86_64-linux-gnu-gcc (Debian 14.2.0-14) 14.2.0) debug=n Mon Jan 27 15:31:22 UTC 2025`

Indeed, it is compiled with debug=n while mine has debug set to yes. So that explains why the default one boots. What seemed strange is that, to build the kernel, I copied the default `/boot/xen-4.19-amd64.config` as `.config` into the directory where I built it. So I was probably missing something here. Oh wait, I'm stupid: I copied it into the top-level directory and not the `xen/` directory, so the build in fact generated a default config with debug enabled.
Well, actually this error is interesting because it allows me to dive into the code :)

On Sun, Feb 2, 2025 at 5:11 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:

> Can you also get `xl dmesg` too, and attach it?
>
> I think this is a VirtualBox bug, but I'm confused as to why Xen has decided to turn off AVX.
>
> ~Andrew

root@vbox:~# xl dmesg
 Xen 4.19.1
(XEN) Xen version 4.19.1 (vboxuser@localdomain) (gcc (Debian 14.2.0-12) 14.2.0) debug=y Sun Feb 2 13:51:15 CET 2025
(XEN) Latest ChangeSet: Wed Dec 4 08:52:37 2024 +0100 git:ccf4008467-dirty
(XEN) build-id: 82980349d0eb3bc244071cae65a93d89b5233914
(XEN) Bootloader: GRUB 2.12-5
(XEN) Command line: placeholder
(XEN) Xen image load base address: 0x7f200000
(XEN) Video information:
(XEN) VGA is text mode 80x25, font 8x16
(XEN) Disc information:
(XEN) Found 1 MBR signatures
(XEN) Found 1 EDD information structures
(XEN) CPU Vendor: Intel, Family 6 (0x6), Model 140 (0x8c), Stepping 1 (raw 000806c1)
(XEN) Xen-e820 RAM map:
(XEN) [0000000000000000, 000000000009fbff] (usable)
(XEN) [000000000009fc00, 000000000009ffff] (reserved)
(XEN) [00000000000f0000, 00000000000fffff] (reserved)
(XEN) [0000000000100000, 000000007ffeffff] (usable)
(XEN) [000000007fff0000, 000000007fffffff] (ACPI data)
(XEN) [00000000fec00000, 00000000fec00fff] (reserved)
(XEN) [00000000fee00000, 00000000fee00fff] (reserved)
(XEN) [00000000fffc0000, 00000000ffffffff] (reserved)
(XEN) BSP microcode revision: 0x00000000
(XEN) System RAM: 2047MB (2096700kB)
(XEN) ACPI: RSDP 000E0000, 0024 (r2 VBOX )
(XEN) ACPI: XSDT 7FFF0030, 003C (r1 VBOX VBOXXSDT 1 ASL 61)
(XEN) ACPI: FACP 7FFF00F0, 00F4 (r4 VBOX VBOXFACP 1 ASL 61)
(XEN) ACPI: DSDT 7FFF0610, 2353 (r2 VBOX VBOXBIOS 2 INTL 20200925)
(XEN) ACPI: FACS 7FFF0200, 0040
(XEN) ACPI: APIC 7FFF0240, 005C (r2 VBOX VBOXAPIC 1 ASL 61)
(XEN) ACPI: SSDT 7FFF02A0, 036C (r1 VBOX VBOXCPUT 2 INTL 20200925)
(XEN) No NUMA configuration found
(XEN) Faking a node at 0000000000000000-000000007fff0000
(XEN) Domain heap initialised
(XEN) found SMP MP-table at 0009fff0
(XEN) 
DMI 2.5 present. (XEN) Using APIC driver default (XEN) ACPI: PM-Timer IO Port: 0x4008 (32 bits) (XEN) ACPI: SLEEP INFO: pm1x_cnt[1:4004,1:0], pm1x_evt[1:4000,1:0] (XEN) ACPI: wakeup_vec[7fff020c], vec_size[20] (XEN) ACPI: Local APIC address 0xfee00000 (XEN) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0]) (XEN) IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23 (XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) (XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level) (XEN) ACPI: IRQ0 used by override. (XEN) ACPI: IRQ2 used by override. (XEN) ACPI: IRQ9 used by override. (XEN) Using ACPI (MADT) for SMP configuration information (XEN) SMP: Allowing 2 CPUs (0 hotplug CPUs) (XEN) IRQ limits: 24 GSI, 392 MSI/MSI-X (XEN) Switched to APIC driver x2apic_mixed (XEN) CPU0: 3000 ... 3000 MHz (XEN) xstate: size: 0x340 and states: 0x7 (XEN) Xen WARN at arch/x86/xstate.c:750 (XEN) ----[ Xen-4.19.1 x86_64 debug=y Not tainted ]---- (XEN) CPU: 0 (XEN) RIP: e008:[<ffff82d0404580b3>] arch/x86/xstate.c#check_new_xstate+0x1c5/0x1d3 (XEN) RFLAGS: 0000000000010046 CONTEXT: hypervisor (XEN) rax: 0000000000000000 rbx: 0000000000000988 rcx: 0000000000000000 (XEN) rdx: 0000000000000000 rsi: ffff82d0405d7308 rdi: 0000000000000003 (XEN) rbp: ffff82d040467ce8 rsp: ffff82d040467cc8 r8: 0000000000000000 (XEN) r9: ffff82d0404bed00 r10: 0000000000000000 r11: ffff82d0405e5400 (XEN) r12: ffff82d040467cf8 r13: 0000000000000003 r14: 0000000000000001 (XEN) r15: 0000000000000000 cr0: 0000000080050033 cr4: 00000000000400a0 (XEN) cr3: 000000007f6ca000 cr2: 0000000000000000 (XEN) fsb: 0000000000000000 gsb: 0000000000000000 gss: 0000000000000000 (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 (XEN) Xen code around <ffff82d0404580b3> (arch/x86/xstate.c#check_new_xstate+0x1c5/0x1d3): (XEN) 00 00 0f 85 39 ff ff ff <0f> 0b c6 05 d4 77 1a 00 01 e9 2b ff ff ff 55 48 (XEN) Xen stack trace from rsp=ffff82d040467cc8: (XEN) 0000000000000003 0000000000000007 
0000000000000000 0000000000000007 (XEN) ffff82d040467d20 ffff82d0404580fc 0000000000000003 0000000000000340 (XEN) 0000000000000240 ffff82d0404bed00 0000000000000007 ffff82d040467d48 (XEN) ffff82d04038979e 0000000000000988 ffff82d0404bed00 00000000ffffffff (XEN) ffff82d040467d68 ffff82d040290c77 ffff82d0404bee80 0000000000000001 (XEN) ffff82d040467ee8 ffff82d0404523c4 ffff82d04049cf80 0000000000000000 (XEN) ffff82d0405e7f00 0000000001000000 ffff83000009df80 ffff83000009df80 (XEN) 0000000004066000 000000007697d000 ffff82d040467e44 ffff82d0405e7f00 (XEN) ffff82d04049cf80 ffff82d040393000 0000000000000000 ffff82d040393000 (XEN) 0000000004066000 000000000007fff0 000000000000408c ffff82d000800163 (XEN) 0000000000000002 0000000000000001 ffff83007fdf0000 ffff82d040467ef8 (XEN) ffff83000009df80 ffff83000009dfb0 ffff83000009df60 0007a8007f637cb5 (XEN) 7f63768600b737c0 7f637cb70009df71 702d7cb87f6373ef 7f637cb500000000 (XEN) 7f667e980009df7b 0009df7b7f6376fc 000000047f637cb1 7f667eb00009df01 (XEN) 0000000800000000 000000010000006e 0000000000000003 00000000000002f8 (XEN) ffff82d0405ce000 ffff82d0404ce000 0000000000000002 0000000000000000 (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000 (XEN) 0000000000000000 ffff82d040203334 0000000000000000 0000000000000000 (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000 (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000 (XEN) Xen call trace: (XEN) [<ffff82d0404580b3>] R arch/x86/xstate.c#check_new_xstate+0x1c5/0x1d3 (XEN) [<ffff82d0404580fc>] F arch/x86/xstate.c#xstate_check_sizes+0x3b/0x22a (XEN) [<ffff82d04038979e>] F xstate_init+0x332/0x351 (XEN) [<ffff82d040290c77>] F identify_cpu+0x377/0x4f4 (XEN) [<ffff82d0404523c4>] F __start_xen+0x1ccb/0x2523 (XEN) [<ffff82d040203334>] F __high_start+0x94/0xa0 (XEN) (XEN) CPU0: No MCE banks present. 
Machine check support disabled (XEN) Unrecognised CPU model 0x8c - assuming vulnerable to LazyFPU (XEN) Unrecognised CPU model 0x8c - assuming vulnerable to L1TF (XEN) Unrecognised CPU model 0x8c - assuming vulnerable to MDS (XEN) Mitigating GDS by disabling AVX while virtualised - protections are best-effort (XEN) Speculative mitigation facilities: (XEN) Hardware hints: (XEN) Hardware features: L1D_FLUSH MD_CLEAR (XEN) Compiled-in support: INDIRECT_THUNK SHADOW_PAGING HARDEN_ARRAY HARDEN_BRANCH HARDEN_GUEST_ACCESS HARDEN_LOCK (XEN) Xen settings: BTI-Thunk: RETPOLINE, BHB-Seq: SHORT, SPEC_CTRL: No, Other: L1D_FLUSH VERW BRANCH_HARDEN (XEN) L1TF: believed vulnerable, maxphysaddr L1D 39, CPUID 39, Safe address 6000000000 (XEN) Support for HVM VMs: RSB EAGER_FPU BHB-entry (XEN) Support for PV VMs: RSB EAGER_FPU VERW BHB-entry (XEN) XPTI (64-bit PV only): Dom0 enabled, DomU enabled (with PCID) (XEN) PV L1TF shadowing: Dom0 disabled, DomU enabled (XEN) Using scheduler: SMP Credit Scheduler rev2 (credit2) (XEN) Initializing Credit2 scheduler (XEN) load_precision_shift: 18 (XEN) load_window_shift: 30 (XEN) underload_balance_tolerance: 0 (XEN) overload_balance_tolerance: -3 (XEN) runqueues arrangement: socket (XEN) cap enforcement granularity: 10ms (XEN) load tracking window length 1073741824 ns (XEN) Platform timer is 3.580MHz ACPI PM Timer (XEN) Detected 2995.198 MHz processor. (XEN) Freed 1024kB unused BSS memory (XEN) alt table ffff82d0404a2998 -> ffff82d0404b5422 (XEN) I/O virtualisation disabled (XEN) nr_sockets: 1 (XEN) Enabling APIC mode. Using 1 I/O APICs (XEN) ENABLING IO-APIC IRQs (XEN) -> Using new ACK method (XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1 (XEN) Allocated console ring of 16 KiB. (XEN) mwait-idle: does not run on family 6 model 140 (XEN) alt table ffff82d0404a2998 -> ffff82d0404b5422 (XEN) CPU1: No MCE banks present. 
Machine check support disabled (XEN) Brought up 2 CPUs (XEN) Scheduling granularity: cpu, 1 CPU per sched-resource (XEN) Initializing Credit2 scheduler (XEN) load_precision_shift: 18 (XEN) load_window_shift: 30 (XEN) underload_balance_tolerance: 0 (XEN) overload_balance_tolerance: -3 (XEN) runqueues arrangement: socket (XEN) cap enforcement granularity: 10ms (XEN) load tracking window length 1073741824 ns (XEN) Adding cpu 0 to runqueue 0 (XEN) First cpu on runqueue, activating (XEN) Adding cpu 1 to runqueue 0 (XEN) mtrr: your CPUs had inconsistent variable MTRR settings (XEN) mtrr: probably your BIOS does not setup all CPUs. (XEN) mtrr: corrected configuration. (XEN) MTRR default type: uncachable (XEN) MTRR fixed ranges enabled: (XEN) 00000-9ffff write-back (XEN) a0000-bffff uncachable (XEN) c0000-fffff write-protect (XEN) MTRR variable ranges enabled: (XEN) 0 base 0000000000 mask 7f80000000 write-back (XEN) 1 disabled (XEN) 2 disabled (XEN) 3 disabled (XEN) 4 disabled (XEN) 5 disabled (XEN) 6 disabled (XEN) 7 disabled (XEN) 8 disabled (XEN) 9 disabled (XEN) 10 disabled (XEN) 11 disabled (XEN) 12 disabled (XEN) 13 disabled (XEN) 14 disabled (XEN) 15 disabled (XEN) Running stub recovery selftests... 
(XEN) Fixup #UD[0000]: ffff82d07fffe044 [ffff82d07fffe044] -> ffff82d040391f11 (XEN) Fixup #GP[0000]: ffff82d07fffe045 [ffff82d07fffe045] -> ffff82d040391f11 (XEN) Fixup #SS[0000]: ffff82d07fffe044 [ffff82d07fffe044] -> ffff82d040391f11 (XEN) Fixup #BP[0000]: ffff82d07fffe045 [ffff82d07fffe045] -> ffff82d040391f11 (XEN) NX (Execute Disable) protection active (XEN) d0 has maximum 416 PIRQs (XEN) *** Building a PV Dom0 *** (XEN) ELF: phdr: paddr=0x1000000 memsz=0x1b7912c (XEN) ELF: phdr: paddr=0x2c00000 memsz=0x70f000 (XEN) ELF: phdr: paddr=0x330f000 memsz=0x39000 (XEN) ELF: phdr: paddr=0x3348000 memsz=0xeb8000 (XEN) ELF: memory: 0x1000000 -> 0x4200000 (XEN) ELF: note: PHYS32_RELOC align: 0x200000 min: 0x1000000 max: 0x3fffffff (XEN) ELF: note: PHYS32_ENTRY = 0x1000b20 (XEN) ELF: note: GUEST_OS = "linux" (XEN) ELF: note: GUEST_VERSION = "2.6" (XEN) ELF: note: XEN_VERSION = "xen-3.0" (XEN) ELF: note: VIRT_BASE = 0xffffffff80000000 (XEN) ELF: note: INIT_P2M = 0x8000000000 (XEN) ELF: note: ENTRY = 0xffffffff833596a0 (XEN) ELF: note: FEATURES = "!writable_page_tables" (XEN) ELF: note: PAE_MODE = "yes" (XEN) ELF: note: L1_MFN_VALID (XEN) ELF: note: MOD_START_PFN = 0x1 (XEN) ELF: note: PADDR_OFFSET = 0 (XEN) ELF: note: SUPPORTED_FEATURES = 0x8801 (XEN) ELF: note: LOADER = "generic" (XEN) ELF: note: SUSPEND_CANCEL = 0x1 (XEN) ELF: addresses: (XEN) virt_base = 0xffffffff80000000 (XEN) elf_paddr_offset = 0x0 (XEN) virt_offset = 0xffffffff80000000 (XEN) virt_kstart = 0xffffffff81000000 (XEN) virt_kend = 0xffffffff84200000 (XEN) virt_entry = 0xffffffff833596a0 (XEN) p2m_base = 0x8000000000 (XEN) Xen kernel: 64-bit, lsb (XEN) Dom0 kernel: 64-bit, lsb, paddr 0x1000000 -> 0x4200000 (XEN) PHYSICAL MEMORY ARRANGEMENT: (XEN) Dom0 alloc.: 0000000068000000->0000000070000000 (436342 pages to be allocated) (XEN) Init. 
ramdisk: 000000007b374000->000000007f3ffff5 (XEN) VIRTUAL MEMORY ARRANGEMENT: (XEN) Loaded kernel: ffffffff81000000->ffffffff84200000 (XEN) Phys-Mach map: 0000008000000000->00000080003b4810 (XEN) Start info: ffffffff84200000->ffffffff842004b8 (XEN) Page tables: ffffffff84201000->ffffffff84226000 (XEN) Boot stack: ffffffff84226000->ffffffff84227000 (XEN) TOTAL: ffffffff80000000->ffffffff84400000 (XEN) ENTRY ADDRESS: ffffffff833596a0 (XEN) Dom0 has maximum 2 VCPUs (XEN) ELF: phdr 0 at 0xffffffff81000000 -> 0xffffffff82b7912c (XEN) ELF: phdr 1 at 0xffffffff82c00000 -> 0xffffffff8330f000 (XEN) ELF: phdr 2 at 0xffffffff8330f000 -> 0xffffffff83348000 (XEN) ELF: phdr 3 at 0xffffffff83348000 -> 0xffffffff84200000 (XEN) Initial low memory virq threshold set at 0x4000 pages. (XEN) Scrubbing Free RAM in background (XEN) Std. Loglevel: All (XEN) Guest Loglevel: All (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input) (XEN) Freed 664kB init memory (XEN) PCI add device 0000:00:00.0 (XEN) PCI add device 0000:00:01.0 (XEN) PCI add device 0000:00:01.1 (XEN) PCI add device 0000:00:02.0 (XEN) PCI add device 0000:00:03.0 (XEN) PCI add device 0000:00:04.0 (XEN) PCI add device 0000:00:06.0 (XEN) PCI add device 0000:00:07.0 (XEN) PCI add device 0000:00:0d.0 (XEN) arch/x86/pv/emul-priv-op.c:1017:d0v1 RDMSR 0x0000064e unimplemented (XEN) arch/x86/pv/emul-priv-op.c:1017:d0v1 RDMSR 0x00000034 unimplemented (XEN) CPU1: No irq handler for vector 88 (IRQ -2147483648, LAPIC) (XEN) CPU1: No irq handler for vector 30 (IRQ -2147483648, LAPIC) (XEN) CPU1: No irq handler for vector a0 (IRQ -2147483648, LAPIC) (XEN) arch/x86/pv/emul-priv-op.c:1017:d0v1 RDMSR 0x00000639 unimplemented (XEN) arch/x86/pv/emul-priv-op.c:1017:d0v1 RDMSR 0x00000611 unimplemented (XEN) arch/x86/pv/emul-priv-op.c:1017:d0v1 RDMSR 0x00000619 unimplemented (XEN) arch/x86/pv/emul-priv-op.c:1017:d0v1 RDMSR 0x00000641 unimplemented (XEN) arch/x86/pv/emul-priv-op.c:1017:d0v1 RDMSR 0x0000064d unimplemented (XEN) 
arch/x86/pv/emul-priv-op.c:1017:d0v1 RDMSR 0x00000606 unimplemented (XEN) arch/x86/pv/emul-priv-op.c:1017:d0v0 RDMSR 0x00000611 unimplemented (XEN) arch/x86/pv/emul-priv-op.c:1017:d0v0 RDMSR 0x00000639 unimplemented (XEN) arch/x86/pv/emul-priv-op.c:1017:d0v0 RDMSR 0x00000641 unimplemented (XEN) arch/x86/pv/emul-priv-op.c:1017:d0v0 RDMSR 0x00000619 unimplemented (XEN) arch/x86/pv/emul-priv-op.c:1017:d0v0 RDMSR 0x0000064d unimplemented (XEN) common/grant_table.c:1909:d0v0 Expanding d0 grant table from 1 to 2 frames
On 02/02/2025 4:58 pm, Guillaume wrote:
> I attached the output of the `xl dmesg`. This is the 4.19.1 kernel I
> rebuild but I have the same issue with master (just for info).

Thanks. This is a TigerLake CPU, and:

> (XEN) Mitigating GDS by disabling AVX while virtualised - protections
> are best-effort

is why Xen is ignoring AVX.

Now, as to the bug. From the panic line, you're seeing:

> XSTATE 0x0000000000000003, uncompressed hw size 0x340 != xen size 0x240

xstate is XCR0_SSE | XCR0_X87, and the correct size for this
configuration is 0x240.

The reason why it matters is that this is the amount of data the
processor will write out/read in for the XSAVE/XRSTOR instructions,
which are used for context switching. These instructions are also
available in userspace.

Here, VirtualBox is claiming that with AVX disabled, it will still write
out the AVX registers. This is buggy, but we're going to have to narrow
it down further.

Can you try building Xen with this additional line

diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index af9e345a7ace..5a5011ba8b10 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -789,6 +789,8 @@ static void __init noinline xstate_check_sizes(void)
      */
     check_new_xstate(&s, X86_XCR0_SSE | X86_XCR0_X87);
 
+    asm volatile ("vzeroupper");
+
     if ( cpu_has_avx )
         check_new_xstate(&s, X86_XCR0_YMM);

and see if the result crashes or boots?

One possible bug is that VirtualBox is shadowing XCR0 and the real
setting in hardware is 0x7 (including XCR0_AVX) rather than 0x3. In
this case, the reported size is correct, and VirtualBox is failing to
honour the XSETBV setting.

Alternatively, another bug is that XCR0 is really 0x3, but the CPUID
emulation for max size is wrong, in which case the XSAVE/etc
instructions won't actually access beyond 0x240, and "all" that's wrong
is that we'll allocate a larger buffer than necessary.

The VZEROUPPER (an AVX instruction) should distinguish these two cases.
If Xen crashes with it in place, then the XCR0 register is correct and
it's CPUID which is buggy. If Xen boots with that in place, then
VirtualBox is shadowing XCR0 with a different value behind Xen's back.

~Andrew
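The two sizes in the panic line can be reproduced by hand. The following is a minimal sketch, assuming the architectural XSAVE layout (a 512-byte legacy x87/SSE region, a 64-byte XSAVE header, and the AVX component at offset 0x240 with size 0x100). The table and the function name `uncompressed_size()` are hypothetical and are not Xen's `xstate_uncompressed_size()`; real code would take the per-component offsets and sizes from CPUID leaf 0xD rather than from a hard-coded table.

```c
#include <stdint.h>

/* Hypothetical table for illustration only.  Real code would read these
 * values from CPUID leaf 0xD, sub-leaf i (EAX = size, EBX = offset). */
struct xstate_component {
    uint32_t offset;
    uint32_t size;
};

#define XSAVE_LEGACY_SIZE 512u  /* x87 + SSE (FXSAVE-compatible) region */
#define XSAVE_HDR_SIZE     64u  /* XSAVE header */

static const struct xstate_component components[] = {
    [2] = { 0x240, 0x100 },     /* AVX: upper halves of YMM0-15 */
};

/*
 * Size of the uncompressed (standard format) XSAVE area for a given XCR0
 * value: the end of the highest enabled component, and never less than
 * the legacy region plus the header.
 */
static uint32_t uncompressed_size(uint64_t xcr0)
{
    uint32_t size = XSAVE_LEGACY_SIZE + XSAVE_HDR_SIZE;
    unsigned int i;

    /* Components 0 (x87) and 1 (SSE) live in the legacy region. */
    for ( i = 2; i < sizeof(components) / sizeof(components[0]); i++ )
    {
        uint32_t end = components[i].offset + components[i].size;

        if ( (xcr0 & (1ull << i)) && end > size )
            size = end;
    }

    return size;
}
```

With XCR0 = 0x3 this gives 0x240, and with XCR0 = 0x7 it gives 0x340, which are the "xen size" and "hw size" values from the panic message respectively.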
Oh cool, thanks a lot for the explanation.

I added the "vzeroupper" and Xen crashes, so it looks like the CPUID emulation is buggy. I was also able to try it in a VM (same Debian testing) running on virt-manager+KVM, and it works fine there (Xen in debug mode). I will print the xstate when running on virt-manager+KVM and also run the xen-cpuid command to see the difference, just out of curiosity, since your test has already pinpointed the issue.

Thanks again for the explanation. I will continue my testing later today; if you need me to test anything else, just ask and I will do my best.

Guillaume
On 03/02/2025 8:58 am, Guillaume wrote:
> Oh cool, thanks a lot for the explanation.
> I added the "vzeroupper" and Xen crashes so it looks like the CPUID
> emulation is buggy. Also I was able to try it using a VM (same debian
> testing) running on virt-manager+kvm and it works fine (Xen in debug
> mode). I will have a look by printing the xstate when running on
> virt-manager+KVM and I will also run the xen-cpuid command to see the
> difference just by curiosity as with your test we already spotted the
> issue.
> Thanks again for your enlightenment. I will continue my testing later
> today and if you need me to test something else you are welcome, just
> ask I will do my best.

It sounds like KVM has a better CPUID emulation than VirtualBox.

It would be ideal to report this bug to VirtualBox.

But, as you identified originally, it's not nice that Xen simply panics
like this. We should see what to do for Xen, seeing as we're close to
the line on 4.20. I'm thinking maybe making the xstate checks non-fatal
in the cpu_has_hypervisor case. Thoughts?

~Andrew
On 04.02.2025 18:35, Andrew Cooper wrote:
> It sounds like KVM has a better CPUID emulation than VirtualBox.
>
> It would be ideal to report this bug to VirtualBox.
>
> But, as you identified originally, it's not nice that Xen simply panics
> like this. We should see what to do for Xen, seeing as we're close to
> the line on 4.20. I'm thinking maybe making the xstate checks non-fatal
> in the cpu_has_hypervisor case. Thoughts?

In principle: Yes, that's an option. But then we need to suppress use of
xstate_{,un}compressed_size() everywhere, perhaps by disabling the
respective features. Not sure whether that is what you meant to imply.

Jan
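To picture the idea under discussion, here is a minimal sketch, not Xen code, in which a size mismatch stays fatal on bare metal but is downgraded to a warning when running under a hypervisor, while recording that the computed sizes can no longer be trusted. All names (`check_xstate_size()`, `enum check_result`, the `sizes_trusted` flag) are hypothetical. What callers should do once the flag is cleared, for example hiding the affected features, is exactly the open question raised above and is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* All names below are hypothetical; this is not Xen code. */
enum check_result { CHECK_OK, CHECK_WARN, CHECK_FATAL };

/*
 * Compare the size reported by CPUID (hw_size) with the size calculated
 * from the enabled states (calc_size).  On bare metal a mismatch stays
 * fatal.  Under a hypervisor it becomes a warning, and *sizes_trusted is
 * cleared so that callers can avoid relying on the calculated sizes.
 */
static enum check_result check_xstate_size(uint64_t states,
                                           uint32_t hw_size,
                                           uint32_t calc_size,
                                           bool virtualised,
                                           bool *sizes_trusted)
{
    if ( hw_size == calc_size )
        return CHECK_OK;

    if ( !virtualised )
        return CHECK_FATAL;   /* real hardware disagreeing is a real bug */

    fprintf(stderr, "XSTATE %#llx: hw size %#x != calculated size %#x\n",
            (unsigned long long)states, (unsigned int)hw_size,
            (unsigned int)calc_size);
    *sizes_trusted = false;

    return CHECK_WARN;
}
```

With the values from this report (states 0x3, hw size 0x340, calculated size 0x240), the sketch returns `CHECK_WARN` when `virtualised` is true and `CHECK_FATAL` otherwise.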