Message ID | 20240419195147.434894-1-mark.cave-ayland@ilande.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | target/i386/translate.c: always write 32-bits for SGDT and SIDT | expand |
On 4/19/24 12:51, Mark Cave-Ayland wrote: > The various Intel CPU manuals claim that SGDT and SIDT can write either 24-bits > or 32-bits depending upon the operand size, but this is incorrect. Not only do > the Intel CPU manuals give contradictory information between processor > revisions, but this information doesn't even match real-life behaviour. > > In fact, tests on real hardware show that the CPU always writes 32-bits for SGDT > and SIDT, and this behaviour is required for at least OS/2 Warp and WFW 3.11 with > Win32s to function correctly. Remove the masking applied due to the operand size > for SGDT and SIDT so that the TCG behaviour matches the behaviour on real > hardware. > > Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198 > > -- > MCA: Whilst I don't have a copy of OS/2 Warp handy, I've confirmed that this > patch fixes the issue in WFW 3.11 with Win32s. For more technical information I > highly recommend the excellent write-up at > https://www.os2museum.com/wp/sgdtsidt-fiction-and-reality/. > --- > target/i386/tcg/translate.c | 14 ++++++++------ > 1 file changed, 8 insertions(+), 6 deletions(-) > > diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c > index 76a42c679c..3026eb6774 100644 > --- a/target/i386/tcg/translate.c > +++ b/target/i386/tcg/translate.c > @@ -5846,9 +5846,10 @@ static bool disas_insn(DisasContext *s, CPUState *cpu) > gen_op_st_v(s, MO_16, s->T0, s->A0); > gen_add_A0_im(s, 2); > tcg_gen_ld_tl(s->T0, tcg_env, offsetof(CPUX86State, gdt.base)); > - if (dflag == MO_16) { > - tcg_gen_andi_tl(s->T0, s->T0, 0xffffff); > - } > + /* > + * NB: Despite claims to the contrary in Intel CPU documentation, > + * all 32-bits are written regardless of operand size. > + */ Current documentation agrees that all 32 bits are written, so I don't think you need this comment: IF OperandSize =16 or OperandSize = 32 (* Legacy or Compatibility Mode *) THEN DEST[0:15] := GDTR(Limit); DEST[16:47] := GDTR(Base); (* Full 32-bit base address stored *) FI; Anyway, Reviewed-by: Richard Henderson <richard.henderson@linaro.org> r~
On 20/04/2024 02:21, Richard Henderson wrote: > On 4/19/24 12:51, Mark Cave-Ayland wrote: >> The various Intel CPU manuals claim that SGDT and SIDT can write either 24-bits >> or 32-bits depending upon the operand size, but this is incorrect. Not only do >> the Intel CPU manuals give contradictory information between processor >> revisions, but this information doesn't even match real-life behaviour. >> >> In fact, tests on real hardware show that the CPU always writes 32-bits for SGDT >> and SIDT, and this behaviour is required for at least OS/2 Warp and WFW 3.11 with >> Win32s to function correctly. Remove the masking applied due to the operand size >> for SGDT and SIDT so that the TCG behaviour matches the behaviour on real >> hardware. >> >> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> >> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198 >> >> -- >> MCA: Whilst I don't have a copy of OS/2 Warp handy, I've confirmed that this >> patch fixes the issue in WFW 3.11 with Win32s. For more technical information I >> highly recommend the excellent write-up at >> https://www.os2museum.com/wp/sgdtsidt-fiction-and-reality/. >> --- >> target/i386/tcg/translate.c | 14 ++++++++------ >> 1 file changed, 8 insertions(+), 6 deletions(-) >> >> diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c >> index 76a42c679c..3026eb6774 100644 >> --- a/target/i386/tcg/translate.c >> +++ b/target/i386/tcg/translate.c >> @@ -5846,9 +5846,10 @@ static bool disas_insn(DisasContext *s, CPUState *cpu) >> gen_op_st_v(s, MO_16, s->T0, s->A0); >> gen_add_A0_im(s, 2); >> tcg_gen_ld_tl(s->T0, tcg_env, offsetof(CPUX86State, gdt.base)); >> - if (dflag == MO_16) { >> - tcg_gen_andi_tl(s->T0, s->T0, 0xffffff); >> - } >> + /* >> + * NB: Despite claims to the contrary in Intel CPU documentation, >> + * all 32-bits are written regardless of operand size. >> + */ > > Current documentation agrees that all 32 bits are written, so I don't think you need > this comment: Ah that's good to know the docs are now correct. I added the comment as there was a lot of conflicting information around for older CPUs so I thought it was worth an explicit mention. If everyone agrees a version without comments is preferable, I can re-send an updated version without them included. > IF OperandSize =16 or OperandSize = 32 (* Legacy or Compatibility Mode *) > THEN > DEST[0:15] := GDTR(Limit); > DEST[16:47] := GDTR(Base); (* Full 32-bit base address stored *) > FI; > > > Anyway, > Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Thanks! ATB, Mark.
Am 20.04.24 um 07:40 schrieb Mark Cave-Ayland: > On 20/04/2024 02:21, Richard Henderson wrote: > >> On 4/19/24 12:51, Mark Cave-Ayland wrote: >>> The various Intel CPU manuals claim that SGDT and SIDT can write >>> either 24-bits >>> or 32-bits depending upon the operand size, but this is incorrect. >>> Not only do >>> the Intel CPU manuals give contradictory information between processor >>> revisions, but this information doesn't even match real-life behaviour. >>> >>> In fact, tests on real hardware show that the CPU always writes >>> 32-bits for SGDT >>> and SIDT, and this behaviour is required for at least OS/2 Warp and >>> WFW 3.11 with >>> Win32s to function correctly. Remove the masking applied due to the >>> operand size >>> for SGDT and SIDT so that the TCG behaviour matches the behaviour on >>> real >>> hardware. >>> >>> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> >>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198 >>> >>> -- >>> MCA: Whilst I don't have a copy of OS/2 Warp handy, I've confirmed >>> that this >>> patch fixes the issue in WFW 3.11 with Win32s. For more technical >>> information I >>> highly recommend the excellent write-up at >>> https://www.os2museum.com/wp/sgdtsidt-fiction-and-reality/. >>> --- >>> target/i386/tcg/translate.c | 14 ++++++++------ >>> 1 file changed, 8 insertions(+), 6 deletions(-) >>> >>> diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c >>> index 76a42c679c..3026eb6774 100644 >>> --- a/target/i386/tcg/translate.c >>> +++ b/target/i386/tcg/translate.c >>> @@ -5846,9 +5846,10 @@ static bool disas_insn(DisasContext *s, >>> CPUState *cpu) >>> gen_op_st_v(s, MO_16, s->T0, s->A0); >>> gen_add_A0_im(s, 2); >>> tcg_gen_ld_tl(s->T0, tcg_env, offsetof(CPUX86State, >>> gdt.base)); >>> - if (dflag == MO_16) { >>> - tcg_gen_andi_tl(s->T0, s->T0, 0xffffff); >>> - } >>> + /* >>> + * NB: Despite claims to the contrary in Intel CPU >>> documentation, >>> + * all 32-bits are written regardless of operand size. >>> + */ >> >> Current documentation agrees that all 32 bits are written, so I don't >> think you need this comment: > > Ah that's good to know the docs are now correct. I added the comment > as there was a lot of conflicting information around for older CPUs so > I thought it was worth an explicit mention. > > If everyone agrees a version without comments is preferable, I can > re-send an updated version without them included. > Hi Mark, I wouldn't remove the comment. Quote from the Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2B: Instruction Set Reference, M-U March 2024: IA-32 Architecture Compatibility The 16-bit form of SGDT is compatible with the Intel 286 processor if the upper 8 bits are not referenced. The Intel 286 processor fills these bits with 1s; processor generations later than the Intel 286 processor fill these bits with 0s. Intel still claims the upper 8 bits are filled with 0s, but the Operation pseudo code below is correct. The same is true for SIDT. With best regards, Volker >> IF OperandSize =16 or OperandSize = 32 (* Legacy or Compatibility >> Mode *) >> THEN >> DEST[0:15] := GDTR(Limit); >> DEST[16:47] := GDTR(Base); (* Full 32-bit base address stored *) >> FI; >> >> >> Anyway, >> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> > > Thanks! > > > ATB, > > Mark. > > >
On 22/4/24 21:03, Volker Rümelin wrote: > Am 20.04.24 um 07:40 schrieb Mark Cave-Ayland: >> On 20/04/2024 02:21, Richard Henderson wrote: >> >>> On 4/19/24 12:51, Mark Cave-Ayland wrote: >>>> The various Intel CPU manuals claim that SGDT and SIDT can write >>>> either 24-bits >>>> or 32-bits depending upon the operand size, but this is incorrect. >>>> Not only do >>>> the Intel CPU manuals give contradictory information between processor >>>> revisions, but this information doesn't even match real-life behaviour. >>>> >>>> In fact, tests on real hardware show that the CPU always writes >>>> 32-bits for SGDT >>>> and SIDT, and this behaviour is required for at least OS/2 Warp and >>>> WFW 3.11 with >>>> Win32s to function correctly. Remove the masking applied due to the >>>> operand size >>>> for SGDT and SIDT so that the TCG behaviour matches the behaviour on >>>> real >>>> hardware. >>>> >>>> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> >>>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198 >>>> >>>> -- >>>> MCA: Whilst I don't have a copy of OS/2 Warp handy, I've confirmed >>>> that this >>>> patch fixes the issue in WFW 3.11 with Win32s. For more technical >>>> information I >>>> highly recommend the excellent write-up at >>>> https://www.os2museum.com/wp/sgdtsidt-fiction-and-reality/. >>>> --- >>>> target/i386/tcg/translate.c | 14 ++++++++------ >>>> 1 file changed, 8 insertions(+), 6 deletions(-) >>>> >>>> diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c >>>> index 76a42c679c..3026eb6774 100644 >>>> --- a/target/i386/tcg/translate.c >>>> +++ b/target/i386/tcg/translate.c >>>> @@ -5846,9 +5846,10 @@ static bool disas_insn(DisasContext *s, >>>> CPUState *cpu) >>>> gen_op_st_v(s, MO_16, s->T0, s->A0); >>>> gen_add_A0_im(s, 2); >>>> tcg_gen_ld_tl(s->T0, tcg_env, offsetof(CPUX86State, >>>> gdt.base)); >>>> - if (dflag == MO_16) { >>>> - tcg_gen_andi_tl(s->T0, s->T0, 0xffffff); >>>> - } >>>> + /* >>>> + * NB: Despite claims to the contrary in Intel CPU >>>> documentation, >>>> + * all 32-bits are written regardless of operand size. >>>> + */ >>> >>> Current documentation agrees that all 32 bits are written, so I don't >>> think you need this comment: >> >> Ah that's good to know the docs are now correct. I added the comment >> as there was a lot of conflicting information around for older CPUs so >> I thought it was worth an explicit mention. >> >> If everyone agrees a version without comments is preferable, I can >> re-send an updated version without them included. >> > > Hi Mark, > > I wouldn't remove the comment. > > Quote from the Intel® 64 and IA-32 Architectures Software Developer’s > Manual Volume 2B: Instruction Set Reference, M-U March 2024: > > IA-32 Architecture Compatibility > The 16-bit form of SGDT is compatible with the Intel 286 processor if > the upper 8 bits are not referenced. The Intel 286 processor fills these > bits with 1s; processor generations later than the Intel 286 processor > fill these bits with 0s. Is that that OS/2 Warp and WFW 3.11 expect a 286 CPU? QEMU starts modelling the 486, do we want to consider down to the 286? > Intel still claims the upper 8 bits are filled with 0s, but the > Operation pseudo code below is correct. The same is true for SIDT. > > With best regards, > Volker > >>> IF OperandSize =16 or OperandSize = 32 (* Legacy or Compatibility >>> Mode *) >>> THEN >>> DEST[0:15] := GDTR(Limit); >>> DEST[16:47] := GDTR(Base); (* Full 32-bit base address stored *) >>> FI; >>> >>> >>> Anyway, >>> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> >> >> Thanks! >> >> >> ATB, >> >> Mark. >> >> >> > >
On Tue, Apr 23, 2024 at 10:15:57AM +0200, Philippe Mathieu-Daudé wrote: > On 22/4/24 21:03, Volker Rümelin wrote: > > Am 20.04.24 um 07:40 schrieb Mark Cave-Ayland: > > > On 20/04/2024 02:21, Richard Henderson wrote: > > > > > > > On 4/19/24 12:51, Mark Cave-Ayland wrote: > > > > > The various Intel CPU manuals claim that SGDT and SIDT can write > > > > > either 24-bits > > > > > or 32-bits depending upon the operand size, but this is incorrect. > > > > > Not only do > > > > > the Intel CPU manuals give contradictory information between processor > > > > > revisions, but this information doesn't even match real-life behaviour. > > > > > > > > > > In fact, tests on real hardware show that the CPU always writes > > > > > 32-bits for SGDT > > > > > and SIDT, and this behaviour is required for at least OS/2 Warp and > > > > > WFW 3.11 with > > > > > Win32s to function correctly. Remove the masking applied due to the > > > > > operand size > > > > > for SGDT and SIDT so that the TCG behaviour matches the behaviour on > > > > > real > > > > > hardware. > > > > > > > > > > Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> > > > > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198 > > > > > > > > > > -- > > > > > MCA: Whilst I don't have a copy of OS/2 Warp handy, I've confirmed > > > > > that this > > > > > patch fixes the issue in WFW 3.11 with Win32s. For more technical > > > > > information I > > > > > highly recommend the excellent write-up at > > > > > https://www.os2museum.com/wp/sgdtsidt-fiction-and-reality/. > > > > > --- > > > > > target/i386/tcg/translate.c | 14 ++++++++------ > > > > > 1 file changed, 8 insertions(+), 6 deletions(-) > > > > > > > > > > diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c > > > > > index 76a42c679c..3026eb6774 100644 > > > > > --- a/target/i386/tcg/translate.c > > > > > +++ b/target/i386/tcg/translate.c > > > > > @@ -5846,9 +5846,10 @@ static bool disas_insn(DisasContext *s, > > > > > CPUState *cpu) > > > > > gen_op_st_v(s, MO_16, s->T0, s->A0); > > > > > gen_add_A0_im(s, 2); > > > > > tcg_gen_ld_tl(s->T0, tcg_env, offsetof(CPUX86State, > > > > > gdt.base)); > > > > > - if (dflag == MO_16) { > > > > > - tcg_gen_andi_tl(s->T0, s->T0, 0xffffff); > > > > > - } > > > > > + /* > > > > > + * NB: Despite claims to the contrary in Intel CPU > > > > > documentation, > > > > > + * all 32-bits are written regardless of operand size. > > > > > + */ > > > > > > > > Current documentation agrees that all 32 bits are written, so I don't > > > > think you need this comment: > > > > > > Ah that's good to know the docs are now correct. I added the comment > > > as there was a lot of conflicting information around for older CPUs so > > > I thought it was worth an explicit mention. > > > > > > If everyone agrees a version without comments is preferable, I can > > > re-send an updated version without them included. > > > > > > > Hi Mark, > > > > I wouldn't remove the comment. > > > > Quote from the Intel® 64 and IA-32 Architectures Software Developer’s > > Manual Volume 2B: Instruction Set Reference, M-U March 2024: > > > > IA-32 Architecture Compatibility > > The 16-bit form of SGDT is compatible with the Intel 286 processor if > > the upper 8 bits are not referenced. The Intel 286 processor fills these > > bits with 1s; processor generations later than the Intel 286 processor > > fill these bits with 0s. > > Is that that OS/2 Warp and WFW 3.11 expect a 286 CPU? QEMU starts > modelling the 486, do we want to consider down to the 286? Depends on the versions you're talking about. From what I can gather, OS/2 1.x targetted i286, OS/2 2.x & 3.x targetted i386, and OS/2 4.0 Warp targetted i486, while WFW 3.11 was i386. With regards, Daniel
On Mon, Apr 22, 2024 at 9:10 PM Volker Rümelin <vr_qemu@t-online.de> wrote: > > Am 20.04.24 um 07:40 schrieb Mark Cave-Ayland: > >> Current documentation agrees that all 32 bits are written, so I don't > >> think you need this comment: > > > > Ah that's good to know the docs are now correct. I added the comment > > as there was a lot of conflicting information around for older CPUs so > > I thought it was worth an explicit mention. > > Quote from the Intel® 64 and IA-32 Architectures Software Developer’s > Manual Volume 2B: Instruction Set Reference, M-U March 2024: > > IA-32 Architecture Compatibility > The 16-bit form of SGDT is compatible with the Intel 286 processor if > the upper 8 bits are not referenced. The Intel 286 processor fills these > bits with 1s; processor generations later than the Intel 286 processor > fill these bits with 0s. > > Intel still claims the upper 8 bits are filled with 0s, but the > Operation pseudo code below is correct. The same is true for SIDT. I think the claim is that it fills with 0s when the software is compatible with the 286, i.e. never uses a 32-bit LIDT or LGDT instruction. Software written to target specifically older processors typically used the undocumented LOADALL instruction to exit protected mode or to set 4GB segment limits, so it won't run on QEMU. You can read about the usage here: https://www.os2museum.com/wp/more-on-loadall-and-os2/ (286) https://www.os2museum.com/wp/386-loadall/ (386) and about how it worked here: https://www.pcjs.org/documents/manuals/intel/80286/loadall/ https://www.pcjs.org/documents/manuals/intel/80386/loadall/ Interestingly, byte 3 of the GDTR or IDTR on the 286 are documented as "should be zeroes" for LOADALL, not all ones. Let's change "Despite claims to the contrary" with "Despite a confusing description". Paolo
On 23/04/2024 10:18, Paolo Bonzini wrote: > On Mon, Apr 22, 2024 at 9:10 PM Volker Rümelin <vr_qemu@t-online.de> wrote: >> >> Am 20.04.24 um 07:40 schrieb Mark Cave-Ayland: >>>> Current documentation agrees that all 32 bits are written, so I don't >>>> think you need this comment: >>> >>> Ah that's good to know the docs are now correct. I added the comment >>> as there was a lot of conflicting information around for older CPUs so >>> I thought it was worth an explicit mention. >> >> Quote from the Intel® 64 and IA-32 Architectures Software Developer’s >> Manual Volume 2B: Instruction Set Reference, M-U March 2024: >> >> IA-32 Architecture Compatibility >> The 16-bit form of SGDT is compatible with the Intel 286 processor if >> the upper 8 bits are not referenced. The Intel 286 processor fills these >> bits with 1s; processor generations later than the Intel 286 processor >> fill these bits with 0s. >> >> Intel still claims the upper 8 bits are filled with 0s, but the >> Operation pseudo code below is correct. The same is true for SIDT. > > I think the claim is that it fills with 0s when the software is > compatible with the 286, i.e. never uses a 32-bit LIDT or LGDT > instruction. Software written to target specifically older processors > typically used the undocumented LOADALL instruction to exit protected > mode or to set 4GB segment limits, so it won't run on QEMU. You can > read about the usage here: > > https://www.os2museum.com/wp/more-on-loadall-and-os2/ (286) > https://www.os2museum.com/wp/386-loadall/ (386) > > and about how it worked here: > > https://www.pcjs.org/documents/manuals/intel/80286/loadall/ > https://www.pcjs.org/documents/manuals/intel/80386/loadall/ > > Interestingly, byte 3 of the GDTR or IDTR on the 286 are documented as > "should be zeroes" for LOADALL, not all ones. > > Let's change "Despite claims to the contrary" with "Despite a > confusing description". Thanks for sorting this, Paolo. I suspect that KVM needs a similar patch as per https://gitlab.com/qemu-project/qemu/-/issues/2198#note_1815726425 however the Win32s and OS/2 Warp 4 tests seem to work fine on KVM. Maybe it's because the SGDT and SIDT instructions run natively and don't need to be emulated for these cases? ATB, Mark.
On Tue, Apr 23, 2024 at 10:42 PM Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote: > > Let's change "Despite claims to the contrary" with "Despite a > > confusing description". > > Thanks for sorting this, Paolo. I suspect that KVM needs a similar patch as per > https://gitlab.com/qemu-project/qemu/-/issues/2198#note_1815726425 however the Win32s > and OS/2 Warp 4 tests seem to work fine on KVM. Maybe it's because the SGDT and SIDT > instructions run natively and don't need to be emulated for these cases? Yes, they are almost never emulated (only in big real mode and only on old Intel processors). Paolo
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 76a42c679c..3026eb6774 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -5846,9 +5846,10 @@ static bool disas_insn(DisasContext *s, CPUState *cpu) gen_op_st_v(s, MO_16, s->T0, s->A0); gen_add_A0_im(s, 2); tcg_gen_ld_tl(s->T0, tcg_env, offsetof(CPUX86State, gdt.base)); - if (dflag == MO_16) { - tcg_gen_andi_tl(s->T0, s->T0, 0xffffff); - } + /* + * NB: Despite claims to the contrary in Intel CPU documentation, + * all 32-bits are written regardless of operand size. + */ gen_op_st_v(s, CODE64(s) + MO_32, s->T0, s->A0); break; @@ -5901,9 +5902,10 @@ static bool disas_insn(DisasContext *s, CPUState *cpu) gen_op_st_v(s, MO_16, s->T0, s->A0); gen_add_A0_im(s, 2); tcg_gen_ld_tl(s->T0, tcg_env, offsetof(CPUX86State, idt.base)); - if (dflag == MO_16) { - tcg_gen_andi_tl(s->T0, s->T0, 0xffffff); - } + /* + * NB: Despite claims to the contrary in Intel CPU documentation, + * all 32-bits are written regardless of operand size. + */ gen_op_st_v(s, CODE64(s) + MO_32, s->T0, s->A0); break;
The various Intel CPU manuals claim that SGDT and SIDT can write either 24-bits or 32-bits depending upon the operand size, but this is incorrect. Not only do the Intel CPU manuals give contradictory information between processor revisions, but this information doesn't even match real-life behaviour. In fact, tests on real hardware show that the CPU always writes 32-bits for SGDT and SIDT, and this behaviour is required for at least OS/2 Warp and WFW 3.11 with Win32s to function correctly. Remove the masking applied due to the operand size for SGDT and SIDT so that the TCG behaviour matches the behaviour on real hardware. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2198 -- MCA: Whilst I don't have a copy of OS/2 Warp handy, I've confirmed that this patch fixes the issue in WFW 3.11 with Win32s. For more technical information I highly recommend the excellent write-up at https://www.os2museum.com/wp/sgdtsidt-fiction-and-reality/. --- target/i386/tcg/translate.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)