Message ID | 20200220075220.2327056-1-ying.huang@intel.com
---|---
State | New, archived
Series | mm: Fix possible PMD dirty bit lost in set_pmd_migration_entry()
> On Feb 20, 2020, at 12:52 AM, Huang, Ying <ying.huang@intel.com> wrote:
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>

Looks good to me.

Reviewed-by: William Kucharski <william.kucharski@oracle.com>

> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> ---
>  mm/huge_memory.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 580098e115bd..b1e069e68189 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3060,8 +3060,7 @@ void set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
>  		return;
>
>  	flush_cache_range(vma, address, address + HPAGE_PMD_SIZE);
> -	pmdval = *pvmw->pmd;
> -	pmdp_invalidate(vma, address, pvmw->pmd);
> +	pmdval = pmdp_invalidate(vma, address, pvmw->pmd);
>  	if (pmd_dirty(pmdval))
>  		set_page_dirty(page);
>  	entry = make_migration_entry(page, pmd_write(pmdval));
> --
> 2.25.0
On Thu, Feb 20, 2020 at 03:52:20PM +0800, Huang, Ying wrote:
> From: Huang Ying <ying.huang@intel.com>
>
> In set_pmd_migration_entry(), pmdp_invalidate() is used to change PMD
> atomically. But the PMD is read before that with an ordinary memory
> reading. If the THP (transparent huge page) is written between the
> PMD reading and pmdp_invalidate(), the PMD dirty bit may be lost, and
> cause data corruption. The race window is quite small, but still
> possible in theory, so need to be fixed.
>
> The race is fixed via using the return value of pmdp_invalidate() to
> get the original content of PMD, which is a read/modify/write atomic
> operation. So no THP writing can occur in between.
>
> The race has been introduced when the THP migration support is added
> in the commit 616b8371539a ("mm: thp: enable thp migration in generic
> path"). But this fix depends on the commit d52605d7cb30 ("mm: do not
> lose dirty and accessed bits in pmdp_invalidate()"). So it's easy to
> be backported after v4.16. But the race window is really small, so it
> may be fine not to backport the fix at all.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
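To make the race described in the changelog concrete, the timeline below sketches the problematic interleaving with the pre-fix code shown in the patch. The two-CPU ordering is an illustration constructed for this writeup, not code quoted from the thread; only the relative ordering of the steps matters.

```c
/*
 * Illustrative interleaving (hypothetical timing) showing how the dirty
 * bit could be lost with the pre-fix set_pmd_migration_entry() code:
 *
 *  CPU 0: set_pmd_migration_entry()          CPU 1: write to the mapped THP
 *  --------------------------------          ------------------------------
 *  pmdval = *pvmw->pmd;
 *      // plain load, PMD still clean
 *                                            store to the huge page;
 *                                            hardware sets the dirty bit
 *                                            in *pvmw->pmd
 *  pmdp_invalidate(vma, address, pvmw->pmd);
 *      // atomic, but its return value
 *      // (carrying the fresh dirty bit) is discarded
 *  if (pmd_dirty(pmdval))
 *      // checks the stale, clean snapshot
 *          set_page_dirty(page);
 *      // skipped -> dirty data can be lost during migration
 */
```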
On 20 Feb 2020, at 2:52, Huang, Ying wrote:

> From: Huang Ying <ying.huang@intel.com>
>
> In set_pmd_migration_entry(), pmdp_invalidate() is used to change PMD
> atomically. But the PMD is read before that with an ordinary memory
> reading. If the THP (transparent huge page) is written between the
> PMD reading and pmdp_invalidate(), the PMD dirty bit may be lost, and
> cause data corruption. The race window is quite small, but still
> possible in theory, so need to be fixed.
>
> The race is fixed via using the return value of pmdp_invalidate() to
> get the original content of PMD, which is a read/modify/write atomic
> operation. So no THP writing can occur in between.
>
> The race has been introduced when the THP migration support is added
> in the commit 616b8371539a ("mm: thp: enable thp migration in generic
> path"). But this fix depends on the commit d52605d7cb30 ("mm: do not
> lose dirty and accessed bits in pmdp_invalidate()"). So it's easy to
> be backported after v4.16. But the race window is really small, so it
> may be fine not to backport the fix at all.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> ---
>  mm/huge_memory.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 580098e115bd..b1e069e68189 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3060,8 +3060,7 @@ void set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
>  		return;
>
>  	flush_cache_range(vma, address, address + HPAGE_PMD_SIZE);
> -	pmdval = *pvmw->pmd;
> -	pmdp_invalidate(vma, address, pvmw->pmd);
> +	pmdval = pmdp_invalidate(vma, address, pvmw->pmd);
>  	if (pmd_dirty(pmdval))
>  		set_page_dirty(page);
>  	entry = make_migration_entry(page, pmd_write(pmdval));
> --
> 2.25.0

Looks good to me. Thanks.

Reviewed-by: Zi Yan <ziy@nvidia.com>

--
Best Regards,
Yan Zi
On Thu, 20 Feb 2020 15:52:20 +0800 "Huang, Ying" <ying.huang@intel.com> wrote:

> From: Huang Ying <ying.huang@intel.com>
>
> In set_pmd_migration_entry(), pmdp_invalidate() is used to change PMD
> atomically. But the PMD is read before that with an ordinary memory
> reading. If the THP (transparent huge page) is written between the
> PMD reading and pmdp_invalidate(), the PMD dirty bit may be lost, and
> cause data corruption. The race window is quite small, but still
> possible in theory, so need to be fixed.
>
> The race is fixed via using the return value of pmdp_invalidate() to
> get the original content of PMD, which is a read/modify/write atomic
> operation. So no THP writing can occur in between.
>
> The race has been introduced when the THP migration support is added
> in the commit 616b8371539a ("mm: thp: enable thp migration in generic
> path"). But this fix depends on the commit d52605d7cb30 ("mm: do not
> lose dirty and accessed bits in pmdp_invalidate()"). So it's easy to
> be backported after v4.16. But the race window is really small, so it
> may be fine not to backport the fix at all.

Thanks. I'm inclined to add a cc:stable to this one. Silent data
corruption is pretty serious.
```diff
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 580098e115bd..b1e069e68189 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3060,8 +3060,7 @@ void set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
 		return;
 
 	flush_cache_range(vma, address, address + HPAGE_PMD_SIZE);
-	pmdval = *pvmw->pmd;
-	pmdp_invalidate(vma, address, pvmw->pmd);
+	pmdval = pmdp_invalidate(vma, address, pvmw->pmd);
 	if (pmd_dirty(pmdval))
 		set_page_dirty(page);
 	entry = make_migration_entry(page, pmd_write(pmdval));
```
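The reason the returned value is enough to close the race is that pmdp_invalidate() clears the PMD with an atomic read-modify-write and hands back the old entry, so any dirty bit set by hardware before the exchange is captured. As a rough sketch of the generic implementation after commit d52605d7cb30 (architectures defining __HAVE_ARCH_PMDP_INVALIDATE supply their own version, so details may differ):

```c
/*
 * Rough sketch of the generic pmdp_invalidate() (mm/pgtable-generic.c)
 * after commit d52605d7cb30; shown for illustration only.
 */
pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
		      pmd_t *pmdp)
{
	/* Atomically exchange the PMD for a not-present copy, returning the old entry. */
	pmd_t old = pmdp_establish(vma, address, pmdp, pmd_mknotpresent(*pmdp));

	flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
	return old;	/* carries the accumulated dirty/accessed bits */
}
```

With the fix, set_pmd_migration_entry() tests pmd_dirty() on this returned value, so a store that dirties the THP at any point before the exchange is still propagated to the page via set_page_dirty().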