Message ID | 20240407130537.16977-1-osalvador@suse.de (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm,swapops: Update check in is_pfn_swap_entry for hwpoison entries | expand |
On Sun, Apr 07, 2024 at 03:05:37PM +0200, Oscar Salvador wrote: > Tony reported that the Machine check recovery was broken in v6.9-rc1, > as he was hitting a VM_BUG_ON when injecting uncorrectable memory errors > to DRAM. > After some more digging and debugging on his side, he realized that this > went back to v6.1, with the introduction of 'commit 0d206b5d2e0d ("mm/swap: add > swp_offset_pfn() to fetch PFN from swap entry")'. > That commit, among other things, introduced swp_offset_pfn(), replacing > hwpoison_entry_to_pfn() in its favour. > > The patch also introduced a VM_BUG_ON() check for is_pfn_swap_entry(), > but is_pfn_swap_entry() never got updated to cover hwpoison entries, which > means that we would hit the VM_BUG_ON whenever we would call > swp_offset_pfn() for such entries on environments with CONFIG_DEBUG_VM set. > Fix this by updating the check to cover hwpoison entries as well, and update > the comment while we are it. > > Reported-by: Tony Luck <tony.luck@intel.com> > Closes: https://lore.kernel.org/all/Zg8kLSl2yAlA3o5D@agluck-desk3/ > Tested-by: Tony Luck <tony.luck@intel.com> > Fixes: 0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry") > Cc: <stable@vger.kernel.org> # 6.1.x I think I need to clarify why the stable. It is my understanding that some distros ship their kernel with CONFIG_DEBUG_VM set by default (I think Fedora comes to my mind?). I am fine with backing down if people think that this is an overreaction.
On Sun, Apr 07, 2024 at 03:56:17PM +0200, Oscar Salvador wrote: > On Sun, Apr 07, 2024 at 03:05:37PM +0200, Oscar Salvador wrote: > > Tony reported that the Machine check recovery was broken in v6.9-rc1, > > as he was hitting a VM_BUG_ON when injecting uncorrectable memory errors > > to DRAM. > > After some more digging and debugging on his side, he realized that this > > went back to v6.1, with the introduction of 'commit 0d206b5d2e0d ("mm/swap: add > > swp_offset_pfn() to fetch PFN from swap entry")'. > > That commit, among other things, introduced swp_offset_pfn(), replacing > > hwpoison_entry_to_pfn() in its favour. > > > > The patch also introduced a VM_BUG_ON() check for is_pfn_swap_entry(), > > but is_pfn_swap_entry() never got updated to cover hwpoison entries, which > > means that we would hit the VM_BUG_ON whenever we would call > > swp_offset_pfn() for such entries on environments with CONFIG_DEBUG_VM set. > > Fix this by updating the check to cover hwpoison entries as well, and update > > the comment while we are it. > > > > Reported-by: Tony Luck <tony.luck@intel.com> > > Closes: https://lore.kernel.org/all/Zg8kLSl2yAlA3o5D@agluck-desk3/ > > Tested-by: Tony Luck <tony.luck@intel.com> > > Fixes: 0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry") Totally unexpected, as this commit even removed hwpoison_entry_to_pfn(). Obviously even until now I assumed hwpoison is accounted as pfn swap entry but it's just missing.. Since this commit didn't really change is_pfn_swap_entry() itself, I was thinking maybe an older fix tag would apply, but then I noticed the old code indeed should work well even if hwpoison entry is missing. For example, it's a grey area on whether a hwpoisoned page should be accounted in smaps. So I think the Fixes tag is correct, and thanks for fixing this. Reviewed-by: Peter Xu <peterx@redhat.com> > > Cc: <stable@vger.kernel.org> # 6.1.x > > I think I need to clarify why the stable. > > It is my understanding that some distros ship their kernel with > CONFIG_DEBUG_VM set by default (I think Fedora comes to my mind?). > I am fine with backing down if people think that this is an > overreaction. Fedora stopped having DEBUG_VM for some time, but not sure about when it's still in the 6.1 trees. It looks like cc stable is still reasonable from that regard. A side note is that when I'm looking at this, I went back and see why in some cases we need the pfn maintained for the poisoned, then I saw the only user is check_hwpoisoned_entry() who wants to do fast kills in some contexts and that includes a double check on the pfns in a poisoned entry. Then afaict this path is just too rarely used and buggy. A few things we may need fixing, maybe someone in the loop would have time to have a look: - check_hwpoisoned_entry() - pte_none check is missing - all the rest swap types are missing (e.g., we want to kill the proc too if the page is during migration) - check_hwpoisoned_pmd_entry() - need similar care like above (pmd_none is covered not others) I copied Naoya too. Thanks,
> Totally unexpected, as this commit even removed hwpoison_entry_to_pfn(). > Obviously even until now I assumed hwpoison is accounted as pfn swap entry > but it's just missing.. > > Since this commit didn't really change is_pfn_swap_entry() itself, I was > thinking maybe an older fix tag would apply, but then I noticed the old > code indeed should work well even if hwpoison entry is missing. For > example, it's a grey area on whether a hwpoisoned page should be accounted > in smaps. So I think the Fixes tag is correct, and thanks for fixing this. > > Reviewed-by: Peter Xu <peterx@redhat.com> Thanks Peter > Fedora stopped having DEBUG_VM for some time, but not sure about when it's > still in the 6.1 trees. It looks like cc stable is still reasonable from > that regard. Good to know, thanks for the info. > A side note is that when I'm looking at this, I went back and see why in > some cases we need the pfn maintained for the poisoned, then I saw the only > user is check_hwpoisoned_entry() who wants to do fast kills in some > contexts and that includes a double check on the pfns in a poisoned entry. > Then afaict this path is just too rarely used and buggy. Yes, unfortunately memory-failure code does not get exercised that much, and so there might be subtly bugs lurking in there for quite some time. > A few things we may need fixing, maybe someone in the loop would have time > to have a look: > > - check_hwpoisoned_entry() > - pte_none check is missing > - all the rest swap types are missing (e.g., we want to kill the proc too > if the page is during migration) > - check_hwpoisoned_pmd_entry() > - need similar care like above (pmd_none is covered not others) I will have a look and see what needs fixing, thanks for bringing it up.
On 07.04.24 15:05, Oscar Salvador wrote: > Tony reported that the Machine check recovery was broken in v6.9-rc1, > as he was hitting a VM_BUG_ON when injecting uncorrectable memory errors > to DRAM. > After some more digging and debugging on his side, he realized that this > went back to v6.1, with the introduction of 'commit 0d206b5d2e0d ("mm/swap: add > swp_offset_pfn() to fetch PFN from swap entry")'. > That commit, among other things, introduced swp_offset_pfn(), replacing > hwpoison_entry_to_pfn() in its favour. > > The patch also introduced a VM_BUG_ON() check for is_pfn_swap_entry(), > but is_pfn_swap_entry() never got updated to cover hwpoison entries, which > means that we would hit the VM_BUG_ON whenever we would call > swp_offset_pfn() for such entries on environments with CONFIG_DEBUG_VM set. > Fix this by updating the check to cover hwpoison entries as well, and update > the comment while we are it. > > Reported-by: Tony Luck <tony.luck@intel.com> > Closes: https://lore.kernel.org/all/Zg8kLSl2yAlA3o5D@agluck-desk3/ > Tested-by: Tony Luck <tony.luck@intel.com> > Fixes: 0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry") > Cc: <stable@vger.kernel.org> # 6.1.x > Signed-off-by: Oscar Salvador <osalvador@suse.de> > --- Reviewed-by: David Hildenbrand <david@redhat.com>
On 2024/4/7 21:05, Oscar Salvador wrote: > Tony reported that the Machine check recovery was broken in v6.9-rc1, > as he was hitting a VM_BUG_ON when injecting uncorrectable memory errors > to DRAM. > After some more digging and debugging on his side, he realized that this > went back to v6.1, with the introduction of 'commit 0d206b5d2e0d ("mm/swap: add > swp_offset_pfn() to fetch PFN from swap entry")'. > That commit, among other things, introduced swp_offset_pfn(), replacing > hwpoison_entry_to_pfn() in its favour. > > The patch also introduced a VM_BUG_ON() check for is_pfn_swap_entry(), > but is_pfn_swap_entry() never got updated to cover hwpoison entries, which > means that we would hit the VM_BUG_ON whenever we would call > swp_offset_pfn() for such entries on environments with CONFIG_DEBUG_VM set. > Fix this by updating the check to cover hwpoison entries as well, and update > the comment while we are it. > > Reported-by: Tony Luck <tony.luck@intel.com> > Closes: https://lore.kernel.org/all/Zg8kLSl2yAlA3o5D@agluck-desk3/ > Tested-by: Tony Luck <tony.luck@intel.com> > Fixes: 0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry") > Cc: <stable@vger.kernel.org> # 6.1.x > Signed-off-by: Oscar Salvador <osalvador@suse.de> Acked-by: Miaohe Lin <linmiaohe@huawei.com> Thanks.
On 2024/4/8 4:31, Oscar Salvador wrote: >> Totally unexpected, as this commit even removed hwpoison_entry_to_pfn(). >> Obviously even until now I assumed hwpoison is accounted as pfn swap entry >> but it's just missing.. >> >> Since this commit didn't really change is_pfn_swap_entry() itself, I was >> thinking maybe an older fix tag would apply, but then I noticed the old >> code indeed should work well even if hwpoison entry is missing. For >> example, it's a grey area on whether a hwpoisoned page should be accounted >> in smaps. So I think the Fixes tag is correct, and thanks for fixing this. >> >> Reviewed-by: Peter Xu <peterx@redhat.com> > > Thanks Peter Thanks both. > >> Fedora stopped having DEBUG_VM for some time, but not sure about when it's >> still in the 6.1 trees. It looks like cc stable is still reasonable from >> that regard. > > Good to know, thanks for the info. > >> A side note is that when I'm looking at this, I went back and see why in >> some cases we need the pfn maintained for the poisoned, then I saw the only >> user is check_hwpoisoned_entry() who wants to do fast kills in some >> contexts and that includes a double check on the pfns in a poisoned entry. >> Then afaict this path is just too rarely used and buggy. > > Yes, unfortunately memory-failure code does not get exercised that much, > and so there might be subtly bugs lurking in there for quite some time. There're many memory-failure testcases but some code paths still didn't get exercised. That's a pity. :( > >> A few things we may need fixing, maybe someone in the loop would have time >> to have a look: >> >> - check_hwpoisoned_entry() >> - pte_none check is missing >> - all the rest swap types are missing (e.g., we want to kill the proc too >> if the page is during migration) Firstly, I thought rest swap types just won't exist in this code path. But after second thought, it seems it's possible. For example, when page is being isolated for migration, memory_failure will fails to isolate it. And the second MCE event will goes to kill_accessing_process() and see a migrate swap entry. >> - check_hwpoisoned_pmd_entry() >> - need similar care like above (pmd_none is covered not others) > > I will have a look and see what needs fixing, thanks for bringing it up. Thanks for your time. . > >
diff --git a/include/linux/swapops.h b/include/linux/swapops.h index 48b700ba1d18..a5c560a2f8c2 100644 --- a/include/linux/swapops.h +++ b/include/linux/swapops.h @@ -390,6 +390,35 @@ static inline bool is_migration_entry_dirty(swp_entry_t entry) } #endif /* CONFIG_MIGRATION */ +#ifdef CONFIG_MEMORY_FAILURE + +/* + * Support for hardware poisoned pages + */ +static inline swp_entry_t make_hwpoison_entry(struct page *page) +{ + BUG_ON(!PageLocked(page)); + return swp_entry(SWP_HWPOISON, page_to_pfn(page)); +} + +static inline int is_hwpoison_entry(swp_entry_t entry) +{ + return swp_type(entry) == SWP_HWPOISON; +} + +#else + +static inline swp_entry_t make_hwpoison_entry(struct page *page) +{ + return swp_entry(0, 0); +} + +static inline int is_hwpoison_entry(swp_entry_t swp) +{ + return 0; +} +#endif + typedef unsigned long pte_marker; #define PTE_MARKER_UFFD_WP BIT(0) @@ -483,8 +512,9 @@ static inline struct folio *pfn_swap_entry_folio(swp_entry_t entry) /* * A pfn swap entry is a special type of swap entry that always has a pfn stored - * in the swap offset. They are used to represent unaddressable device memory - * and to restrict access to a page undergoing migration. + * in the swap offset. They can either be used to represent unaddressable device + * memory, to restrict access to a page undergoing migration or to represent a + * pfn which has been hwpoisoned and unmapped. */ static inline bool is_pfn_swap_entry(swp_entry_t entry) { @@ -492,7 +522,7 @@ static inline bool is_pfn_swap_entry(swp_entry_t entry) BUILD_BUG_ON(SWP_TYPE_SHIFT < SWP_PFN_BITS); return is_migration_entry(entry) || is_device_private_entry(entry) || - is_device_exclusive_entry(entry); + is_device_exclusive_entry(entry) || is_hwpoison_entry(entry); } struct page_vma_mapped_walk; @@ -561,35 +591,6 @@ static inline int is_pmd_migration_entry(pmd_t pmd) } #endif /* CONFIG_ARCH_ENABLE_THP_MIGRATION */ -#ifdef CONFIG_MEMORY_FAILURE - -/* - * Support for hardware poisoned pages - */ -static inline swp_entry_t make_hwpoison_entry(struct page *page) -{ - BUG_ON(!PageLocked(page)); - return swp_entry(SWP_HWPOISON, page_to_pfn(page)); -} - -static inline int is_hwpoison_entry(swp_entry_t entry) -{ - return swp_type(entry) == SWP_HWPOISON; -} - -#else - -static inline swp_entry_t make_hwpoison_entry(struct page *page) -{ - return swp_entry(0, 0); -} - -static inline int is_hwpoison_entry(swp_entry_t swp) -{ - return 0; -} -#endif - static inline int non_swap_entry(swp_entry_t entry) { return swp_type(entry) >= MAX_SWAPFILES;