Message ID | 1458175619-32206-2-git-send-email-toshi.kani@hpe.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, Mar 16, 2016 at 06:46:56PM -0600, Toshi Kani wrote: > A Xorg failure on qemu32 was reported as a regression caused > by 'commit 9cd25aac1f44 ("x86/mm/pat: Emulate PAT when it is > disabled")'. [1] This patch fixes the regression. I hope so. > Negative effects of this regression were two failures in Xorg > on qemu32 env, which were triggered by the fact that its virtual "... with QEMU CPU model "qemu32" (-cpu qemu32) ... " > CPU does not support MTRR. [2] > > #1. copy_process() failed in the check in reserve_pfn_range() > > copy_process > copy_mm > dup_mm > dup_mmap > copy_page_range > track_pfn_copy > reserve_pfn_range Here's where you describe why it failed the check and which check. > > #2. error path in copy_process() then hit WARN_ON_ONCE in > untrack_pfn(). > > x86/PAT: Xorg:509 map pfn expected mapping type uncached- > minus for [mem 0xfd000000-0xfdffffff], got write-combining > Call Trace: > dump_stack+0x58/0x79 > warn_slowpath_common+0x8b/0xc0 > ? untrack_pfn+0x9f/0xb0 > ? untrack_pfn+0x9f/0xb0 > warn_slowpath_null+0x22/0x30 > untrack_pfn+0x9f/0xb0 > ? __kunmap_atomic+0x54/0x110 > unmap_single_vma+0x56f/0x580 > ? pagevec_move_tail_fn+0xa0/0xa0 > unmap_vmas+0x43/0x60 > exit_mmap+0x5f/0xf0 > mmput+0x2d/0xa0 > copy_process.part.47+0x1229/0x1430 > _do_fork+0xb4/0x3b0 > SyS_clone+0x2c/0x30 > do_syscall_32_irqs_on+0x54/0xb0 > entry_INT80_32+0x2a/0x2a You can delete the offsets after the "+" - they're useless. > These negative effects are caused by two separate bugs, but they > can be dealt in lower priority. ?? > Fixing the pat_init() issue below > avoids Xorg to hit these cases. > > When the CPU does not support MTRR, MTRR does not call pat_init(), > which leaves PAT enabled without initializing PAT. This pat_init() > issue is a long-standing issue, but manifested as issue #1 (and then > hit issue #2) with the commit commit 9cd25aac1f44 ? > because the memtype now tracks cache > attribute with 'page_cache_mode'. A WC map request is tracked as WC > in memtype, but sets a PTE as UC (pgprot) per __cachemode2pte_tbl[]. > This caused the error in reserve_pfn_range() when it was called from > track_pfn_copy(), which obtained pgprot from a PTE. It converts > pgprot to page_cache_mode, which does not necessarily result in > the original page_cache_mode since __cachemode2pte_tbl[] redirects > multiple types to UC. This is a separate issue in reserve_pfn_range(). Good. > This pat_init() issue existed before the commit, but we used pgprot > in memtype. Hence, we did not have issue #1 before. But WC request > resulted in WT in effect because WC pgrot is actually WT when PAT > is not initialized. This is not how it was designed to work. When > PAT is set to disable properly, WC is converted to UC. The use of > WT can result in a system crash if the target range does not support > WT. Fortunately, nobody ran into such issue before. > > To fix this pat_init() issue, PAT code has been enhanced to provide > pat_disable() interface, which disables the OS to initialize PAT MSR, ... prevents the OS from initializing the PAT MSR. > and sets PAT table to the BIOS handoff state. > This patch changes > MTRR code to call pat_disable() when MTRR is disabled as PAT cannot > be initialized in this case. This sets PAT to disable properly, and > makes PAT code to bypass the memtype check. This avoids issue #1 > (which can be dealt in lower priority). You don't need all that text from "This patch ..." on - we can see that in the diff. The commit message needs to contain "why" not "what". > > [1]: https://lkml.org/lkml/2016/3/3/828 > [2]: https://lkml.org/lkml/2016/3/4/775 I believe I already mentioned that links should be supplied like this: Link: http://lkml.kernel.org/r/<Message-ID> > Signed-off-by: Toshi Kani <toshi.kani@hpe.com> > Cc: Borislav Petkov <bp@suse.de> > Cc: Luis R. Rodriguez <mcgrof@suse.com> > Cc: Juergen Gross <jgross@suse.com> > Cc: Ingo Molnar <mingo@kernel.org> > Cc: H. Peter Anvin <hpa@zytor.com> > Cc: Thomas Gleixner <tglx@linutronix.de> > --- > arch/x86/include/asm/mtrr.h | 6 +++++- > arch/x86/kernel/cpu/mtrr/main.c | 10 +++++++++- > 2 files changed, 14 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h > index b94f6f6..634c593 100644 > --- a/arch/x86/include/asm/mtrr.h > +++ b/arch/x86/include/asm/mtrr.h > @@ -24,6 +24,7 @@ > #define _ASM_X86_MTRR_H > > #include <uapi/asm/mtrr.h> > +#include <asm/pat.h> > > > /* > @@ -83,9 +84,12 @@ static inline int mtrr_trim_uncached_memory(unsigned long end_pfn) > static inline void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi) > { > } > +static inline void mtrr_bp_init(void) > +{ > + pat_disable("Skip PAT initialization"); Make that more user-friendly: "MTRRs disabled, skipping PAT initialization too." > +} > > #define mtrr_ap_init() do {} while (0) > -#define mtrr_bp_init() do {} while (0) > #define set_mtrr_aps_delayed_init() do {} while (0) > #define mtrr_aps_init() do {} while (0) > #define mtrr_bp_restore() do {} while (0) > diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c > index 10f8d47..2d7d8d7 100644 > --- a/arch/x86/kernel/cpu/mtrr/main.c > +++ b/arch/x86/kernel/cpu/mtrr/main.c > @@ -759,8 +759,16 @@ void __init mtrr_bp_init(void) > } > } > > - if (!mtrr_enabled()) > + if (!mtrr_enabled()) { > pr_info("MTRR: Disabled\n"); > + > + /* > + * PAT initialization relies on MTRR's rendezvous handler. > + * Skip PAT init until the handler can initialize both > + * features independently. > + */ > + pat_disable("Skip PAT initialization"); Ditto: you can merge the pr_info text with the pat_disable() string.
On Tue, 2016-03-22 at 18:00 +0100, Borislav Petkov wrote: > On Wed, Mar 16, 2016 at 06:46:56PM -0600, Toshi Kani wrote: > > A Xorg failure on qemu32 was reported as a regression caused > > by 'commit 9cd25aac1f44 ("x86/mm/pat: Emulate PAT when it is > > disabled")'. [1] This patch fixes the regression. > > I hope so. > > > Negative effects of this regression were two failures in Xorg > > on qemu32 env, which were triggered by the fact that its virtual > > "... with QEMU CPU model "qemu32" (-cpu qemu32) ... " Will add this description. > > CPU does not support MTRR. [2] > > > > #1. copy_process() failed in the check in reserve_pfn_range() > > > > copy_process > > copy_mm > > dup_mm > > dup_mmap > > copy_page_range > > track_pfn_copy > > reserve_pfn_range > > Here's where you describe why it failed the check and which check. Will do. > > #2. error path in copy_process() then hit WARN_ON_ONCE in > > untrack_pfn(). > > > > x86/PAT: Xorg:509 map pfn expected mapping type uncached- > > minus for [mem 0xfd000000-0xfdffffff], got write-combining > > Call Trace: > > dump_stack+0x58/0x79 > > warn_slowpath_common+0x8b/0xc0 > > ? untrack_pfn+0x9f/0xb0 > > ? untrack_pfn+0x9f/0xb0 > > warn_slowpath_null+0x22/0x30 > > untrack_pfn+0x9f/0xb0 > > ? __kunmap_atomic+0x54/0x110 > > unmap_single_vma+0x56f/0x580 > > ? pagevec_move_tail_fn+0xa0/0xa0 > > unmap_vmas+0x43/0x60 > > exit_mmap+0x5f/0xf0 > > mmput+0x2d/0xa0 > > copy_process.part.47+0x1229/0x1430 > > _do_fork+0xb4/0x3b0 > > SyS_clone+0x2c/0x30 > > do_syscall_32_irqs_on+0x54/0xb0 > > entry_INT80_32+0x2a/0x2a > > You can delete the offsets after the "+" - they're useless. Will do. > > These negative effects are caused by two separate bugs, but they > > can be dealt in lower priority. > > ?? Will change to "they can be addressed in separate patches." > > Fixing the pat_init() issue below > > avoids Xorg to hit these cases. > > > > When the CPU does not support MTRR, MTRR does not call pat_init(), > > which leaves PAT enabled without initializing PAT. This pat_init() > > issue is a long-standing issue, but manifested as issue #1 (and then > > hit issue #2) with the commit > > commit 9cd25aac1f44 ? Yes. I had to remove this number since checkpatch complained that I needed to quote the whole patch tile again. I will ignore this checkpatch error and add this commit number here. > > because the memtype now tracks cache > > attribute with 'page_cache_mode'. A WC map request is tracked as WC > > in memtype, but sets a PTE as UC (pgprot) per __cachemode2pte_tbl[]. > > This caused the error in reserve_pfn_range() when it was called from > > track_pfn_copy(), which obtained pgprot from a PTE. It converts > > pgprot to page_cache_mode, which does not necessarily result in > > the original page_cache_mode since __cachemode2pte_tbl[] redirects > > multiple types to UC. This is a separate issue in reserve_pfn_range(). > > Good. > > > This pat_init() issue existed before the commit, but we used pgprot > > in memtype. Hence, we did not have issue #1 before. But WC request > > resulted in WT in effect because WC pgrot is actually WT when PAT > > is not initialized. This is not how it was designed to work. When > > PAT is set to disable properly, WC is converted to UC. The use of > > WT can result in a system crash if the target range does not support > > WT. Fortunately, nobody ran into such issue before. > > > > To fix this pat_init() issue, PAT code has been enhanced to provide > > pat_disable() interface, which disables the OS to initialize PAT MSR, > > ... prevents the OS from initializing the > PAT MSR. Will do. > > and sets PAT table to the BIOS handoff state. > > > This patch changes > > MTRR code to call pat_disable() when MTRR is disabled as PAT cannot > > be initialized in this case. This sets PAT to disable properly, and > > makes PAT code to bypass the memtype check. This avoids issue #1 > > (which can be dealt in lower priority). > > You don't need all that text from "This patch ..." on - we can see that > in the diff. The commit message needs to contain "why" not "what". OK. > > > > /* > > @@ -83,9 +84,12 @@ static inline int mtrr_trim_uncached_memory(unsigned > > long end_pfn) > > static inline void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi) > > { > > } > > +static inline void mtrr_bp_init(void) > > +{ > > + pat_disable("Skip PAT initialization"); > > Make that more user-friendly: > > "MTRRs disabled, skipping PAT initialization too." Agreed. Will do. > > +} > > > > #define mtrr_ap_init() do {} while (0) > > -#define mtrr_bp_init() do {} while (0) > > #define set_mtrr_aps_delayed_init() do {} while (0) > > #define mtrr_aps_init() do {} while (0) > > #define mtrr_bp_restore() do {} while (0) > > diff --git a/arch/x86/kernel/cpu/mtrr/main.c > > b/arch/x86/kernel/cpu/mtrr/main.c > > index 10f8d47..2d7d8d7 100644 > > --- a/arch/x86/kernel/cpu/mtrr/main.c > > +++ b/arch/x86/kernel/cpu/mtrr/main.c > > @@ -759,8 +759,16 @@ void __init mtrr_bp_init(void) > > } > > } > > > > - if (!mtrr_enabled()) > > + if (!mtrr_enabled()) { > > pr_info("MTRR: Disabled\n"); > > + > > + /* > > + * PAT initialization relies on MTRR's rendezvous > > handler. > > + * Skip PAT init until the handler can initialize both > > + * features independently. > > + */ > > + pat_disable("Skip PAT initialization"); > > Ditto: you can merge the pr_info text with the pat_disable() string. Will do. Thanks, -Toshi
On Tue, Mar 22, 2016 at 03:53:30PM -0600, Toshi Kani wrote: > Yes. I had to remove this number since checkpatch complained that I needed > to quote the whole patch tile again. I will ignore this checkpatch error > and add this commit number here. Actually, checkpatch is right. We do quote the commit IDs *together* with their names so that the reader knows which commit the text is talking about.
On Wed, 2016-03-23 at 09:44 +0100, Borislav Petkov wrote: > On Tue, Mar 22, 2016 at 03:53:30PM -0600, Toshi Kani wrote: > > Yes. I had to remove this number since checkpatch complained that I > > needed to quote the whole patch tile again. I will ignore this > > checkpatch error and add this commit number here. > > Actually, checkpatch is right. We do quote the commit IDs *together* > with their names so that the reader knows which commit the text is > talking about. OK, I will use [1] to refer this patch. This patch is fully quoted at the top of this changelog, and it'd be verbose to repeat this full quote every time I refers it... Thanks, -Toshi
On Wed, 2016-03-23 at 09:53 -0600, Toshi Kani wrote: > On Wed, 2016-03-23 at 09:44 +0100, Borislav Petkov wrote: > > On Tue, Mar 22, 2016 at 03:53:30PM -0600, Toshi Kani wrote: > > > Yes. I had to remove this number since checkpatch complained that I > > > needed to quote the whole patch tile again. I will ignore this > > > checkpatch error and add this commit number here. > > > > Actually, checkpatch is right. We do quote the commit IDs *together* > > with their names so that the reader knows which commit the text is > > talking about. > > OK, I will use [1] to refer this patch. This patch is fully quoted at > the top of this changelog, and it'd be verbose to repeat this full quote > every time I refers it... I ended up with using "the above-mentioned patch" in v3. Thanks, -Toshi
diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h index b94f6f6..634c593 100644 --- a/arch/x86/include/asm/mtrr.h +++ b/arch/x86/include/asm/mtrr.h @@ -24,6 +24,7 @@ #define _ASM_X86_MTRR_H #include <uapi/asm/mtrr.h> +#include <asm/pat.h> /* @@ -83,9 +84,12 @@ static inline int mtrr_trim_uncached_memory(unsigned long end_pfn) static inline void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi) { } +static inline void mtrr_bp_init(void) +{ + pat_disable("Skip PAT initialization"); +} #define mtrr_ap_init() do {} while (0) -#define mtrr_bp_init() do {} while (0) #define set_mtrr_aps_delayed_init() do {} while (0) #define mtrr_aps_init() do {} while (0) #define mtrr_bp_restore() do {} while (0) diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c index 10f8d47..2d7d8d7 100644 --- a/arch/x86/kernel/cpu/mtrr/main.c +++ b/arch/x86/kernel/cpu/mtrr/main.c @@ -759,8 +759,16 @@ void __init mtrr_bp_init(void) } } - if (!mtrr_enabled()) + if (!mtrr_enabled()) { pr_info("MTRR: Disabled\n"); + + /* + * PAT initialization relies on MTRR's rendezvous handler. + * Skip PAT init until the handler can initialize both + * features independently. + */ + pat_disable("Skip PAT initialization"); + } } void mtrr_ap_init(void)
A Xorg failure on qemu32 was reported as a regression caused by 'commit 9cd25aac1f44 ("x86/mm/pat: Emulate PAT when it is disabled")'. [1] This patch fixes the regression. Negative effects of this regression were two failures in Xorg on qemu32 env, which were triggered by the fact that its virtual CPU does not support MTRR. [2] #1. copy_process() failed in the check in reserve_pfn_range() copy_process copy_mm dup_mm dup_mmap copy_page_range track_pfn_copy reserve_pfn_range #2. error path in copy_process() then hit WARN_ON_ONCE in untrack_pfn(). x86/PAT: Xorg:509 map pfn expected mapping type uncached- minus for [mem 0xfd000000-0xfdffffff], got write-combining Call Trace: dump_stack+0x58/0x79 warn_slowpath_common+0x8b/0xc0 ? untrack_pfn+0x9f/0xb0 ? untrack_pfn+0x9f/0xb0 warn_slowpath_null+0x22/0x30 untrack_pfn+0x9f/0xb0 ? __kunmap_atomic+0x54/0x110 unmap_single_vma+0x56f/0x580 ? pagevec_move_tail_fn+0xa0/0xa0 unmap_vmas+0x43/0x60 exit_mmap+0x5f/0xf0 mmput+0x2d/0xa0 copy_process.part.47+0x1229/0x1430 _do_fork+0xb4/0x3b0 SyS_clone+0x2c/0x30 do_syscall_32_irqs_on+0x54/0xb0 entry_INT80_32+0x2a/0x2a These negative effects are caused by two separate bugs, but they can be dealt in lower priority. Fixing the pat_init() issue below avoids Xorg to hit these cases. When the CPU does not support MTRR, MTRR does not call pat_init(), which leaves PAT enabled without initializing PAT. This pat_init() issue is a long-standing issue, but manifested as issue #1 (and then hit issue #2) with the commit because the memtype now tracks cache attribute with 'page_cache_mode'. A WC map request is tracked as WC in memtype, but sets a PTE as UC (pgprot) per __cachemode2pte_tbl[]. This caused the error in reserve_pfn_range() when it was called from track_pfn_copy(), which obtained pgprot from a PTE. It converts pgprot to page_cache_mode, which does not necessarily result in the original page_cache_mode since __cachemode2pte_tbl[] redirects multiple types to UC. This is a separate issue in reserve_pfn_range(). This pat_init() issue existed before the commit, but we used pgprot in memtype. Hence, we did not have issue #1 before. But WC request resulted in WT in effect because WC pgrot is actually WT when PAT is not initialized. This is not how it was designed to work. When PAT is set to disable properly, WC is converted to UC. The use of WT can result in a system crash if the target range does not support WT. Fortunately, nobody ran into such issue before. To fix this pat_init() issue, PAT code has been enhanced to provide pat_disable() interface, which disables the OS to initialize PAT MSR, and sets PAT table to the BIOS handoff state. This patch changes MTRR code to call pat_disable() when MTRR is disabled as PAT cannot be initialized in this case. This sets PAT to disable properly, and makes PAT code to bypass the memtype check. This avoids issue #1 (which can be dealt in lower priority). [1]: https://lkml.org/lkml/2016/3/3/828 [2]: https://lkml.org/lkml/2016/3/4/775 Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Borislav Petkov <bp@suse.de> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Juergen Gross <jgross@suse.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/include/asm/mtrr.h | 6 +++++- arch/x86/kernel/cpu/mtrr/main.c | 10 +++++++++- 2 files changed, 14 insertions(+), 2 deletions(-)