Message ID | 20250326152558.350103-1-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | x86/emul: Adjust handling of CR8_LEGACY | expand |
On 26.03.2025 16:25, Andrew Cooper wrote: > The CR8_LEGACY feature was introduced in the K8 Revision F. It doesn't exist > in prior revisions of the K8. > > Furthermore, from APM Vol2 3.1.5 CR8 (Task Priority Register, TPR): > > The AMD64 architecture introduces a new control register, CR8, defined as > the task priority register (TPR). > > Give CR8_LEGACY a dependency on LM, seeing as CR8 doesn't exist on pre-64bit > CPUs. But that's not what LM stands for in the dependencies. If you want to run a guest strictly as 32-bit one, you could suppress exposure of LM. That shouldn't also suppress the ability to access CR8 then, though - the LOCK way of accessing was - aiui - specifically added to allow access to it from a 32-bit kernel. > Additionally, from APM Vol3 4 System Instructions MOV CRn: > > CR8 can be read and written in 64-bit mode, using a REX prefix. CR8 can be > read and written in all modes using a LOCK prefix instead of a REX prefix to > specify the additional opcode bit. > > i.e. the LOCK prefix serves as an alternative encoding for REX.R. > > Switch decode_twobyte() from += 8 to |= 8 to better match the description > given. Other indications that the encoding isn't additive are that the CR > intercepts stop at 15, that LOCK MOV CR8 generates #UD rather than becoming a > CR0 access. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > --- > CC: Jan Beulich <JBeulich@suse.com> > CC: Roger Pau Monné <roger.pau@citrix.com> > > Also, designers never put an ADD into silicon if they can possibly avoid it, > because it's slow and large compared to the single OR gate needed in this > case. My reading of the respective doc was never resulting in something unambiguous. I find this argument far more convincing than anything that's in the doc. I probably should have tried out REX + LOCK. Maybe I even did and concluded from it faulting that the two things are cumulative. Jan
diff --git a/xen/arch/x86/x86_emulate/decode.c b/xen/arch/x86/x86_emulate/decode.c index 7ce97c4726dc..e4a17ea2cd59 100644 --- a/xen/arch/x86/x86_emulate/decode.c +++ b/xen/arch/x86/x86_emulate/decode.c @@ -777,10 +777,10 @@ decode_twobyte(struct x86_emulate_state *s, case 0x20: case 0x22: /* mov to/from cr */ if ( s->lock_prefix && vcpu_has_cr8_legacy() ) { - s->modrm_reg += 8; + s->modrm_reg |= 8; s->lock_prefix = false; } - /* fall through */ + fallthrough; case 0x21: case 0x23: /* mov to/from dr */ ASSERT(s->ea.type == OP_REG); /* Early operand adjustment ensures this. */ generate_exception_if(s->lock_prefix, X86_EXC_UD); diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py index a77cb30bdb4f..bb7ac5291589 100755 --- a/xen/tools/gen-cpuid.py +++ b/xen/tools/gen-cpuid.py @@ -271,10 +271,12 @@ def crunch_numbers(state): # CX16 is only encodable in Long Mode. LAHF_LM indicates that the # SAHF/LAHF instructions are reintroduced in Long Mode. 1GB - # superpages, PCID and PKU are only available in 4 level paging. - # NO_LMSL indicates the absense of Long Mode Segment Limits, which - # have been dropped in hardware. - LM: [CX16, PCID, LAHF_LM, PAGE1GB, PKU, NO_LMSL, AMX_TILE, CMPCCXADD], + # superpages, PCID and PKU are only available in 4 level paging. The + # CR8 register only exists in the AMD64 architecture. NO_LMSL + # indicates the absense of Long Mode Segment Limits, which have been + # dropped in hardware. + LM: [CX16, PCID, LAHF_LM, CR8_LEGACY, PAGE1GB, PKU, NO_LMSL, + AMX_TILE, CMPCCXADD], # AMD K6-2+ and K6-III processors shipped with 3DNow+, beyond the # standard 3DNow in the earlier K6 processors.
The CR8_LEGACY feature was introduced in the K8 Revision F. It doesn't exist in prior revisions of the K8. Furthermore, from APM Vol2 3.1.5 CR8 (Task Priority Register, TPR): The AMD64 architecture introduces a new control register, CR8, defined as the task priority register (TPR). Give CR8_LEGACY a dependency on LM, seeing as CR8 doesn't exist on pre-64bit CPUs. Additionally, from APM Vol3 4 System Instructions MOV CRn: CR8 can be read and written in 64-bit mode, using a REX prefix. CR8 can be read and written in all modes using a LOCK prefix instead of a REX prefix to specify the additional opcode bit. i.e. the LOCK prefix serves as an alternative encoding for REX.R. Switch decode_twobyte() from += 8 to |= 8 to better match the description given. Other indications that the encoding isn't additive are that the CR intercepts stop at 15, that LOCK MOV CR8 generates #UD rather than becoming a CR0 access. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> Also, designers never put an ADD into silicon if they can possibly avoid it, because it's slow and large compared to the single OR gate needed in this case. --- xen/arch/x86/x86_emulate/decode.c | 4 ++-- xen/tools/gen-cpuid.py | 10 ++++++---- 2 files changed, 8 insertions(+), 6 deletions(-) base-commit: 38adc2d7879c9a68b21a1dddb09d9c9f34d15ee4