Message ID | 20230224141145.96814-1-ying.huang@intel.com (mailing list archive) |
---|---|
Series | migrate_pages: fix deadlock in batched synchronous migration |
On Fri, 24 Feb 2023 22:11:42 +0800 Huang Ying <ying.huang@intel.com> wrote:

> Two deadlock bugs were reported for the migrate_pages() batching
> series.

"migrate_pages(): batch TLB flushing"

> Thanks Hugh and Pengfei. Analysis shows that if we have
> locked some other folios except the one we are migrating, it's not
> safe in general to wait synchronously, for example, to wait the
> writeback to complete or wait to lock the buffer head.
>
> So 1/3 fixes the deadlock in a simple way, where the batching support
> for the synchronous migration is disabled. The change is
> straightforward and easy to be understood. While 3/3 re-introduce the
> batching for synchronous migration via trying to migrate
> asynchronously in batch optimistically, then fall back to migrate
> synchronously one by one for fail-to-migrate folios. Test shows that
> this can restore the TLB flushing batching performance for synchronous
> migration effectively.

If anyone backports the "migrate_pages(): batch TLB flushing" series
into their kernels, they will want to know about such fixes. So we can
help them by providing suitable Link: tags.

Such a Link: may also be helpful to people who are performing git
bisection searches for some issue but who keep stumbling over the
issues which this series addresses.

Being lazy, I slapped

Fixes: 6f7d760e86fa ("migrate_pages: move THP/hugetlb migration support check to simplify code")

on all three, as this was the final patch in that series. Inaccurate,
but it means that these fixes will land in a suitable place if anyone
needs them.
Andrew Morton <akpm@linux-foundation.org> writes:

> On Fri, 24 Feb 2023 22:11:42 +0800 Huang Ying <ying.huang@intel.com> wrote:
>
>> Two deadlock bugs were reported for the migrate_pages() batching
>> series.
>
> "migrate_pages(): batch TLB flushing"

Yes. Should have written as that.

>> Thanks Hugh and Pengfei. Analysis shows that if we have
>> locked some other folios except the one we are migrating, it's not
>> safe in general to wait synchronously, for example, to wait the
>> writeback to complete or wait to lock the buffer head.
>>
>> So 1/3 fixes the deadlock in a simple way, where the batching support
>> for the synchronous migration is disabled. The change is
>> straightforward and easy to be understood. While 3/3 re-introduce the
>> batching for synchronous migration via trying to migrate
>> asynchronously in batch optimistically, then fall back to migrate
>> synchronously one by one for fail-to-migrate folios. Test shows that
>> this can restore the TLB flushing batching performance for synchronous
>> migration effectively.
>
> If anyone backports the "migrate_pages(): batch TLB flushing" series
> into their kernels, they will want to know about such fixes. So we can
> help them by providing suitable Link: tags.
>
> Such a Link: may also be helpful to people who are performing git
> bisection searches for some issue but who keep stumbling over the
> issues which this series addresses.
>
> Being lazy, I slapped
>
> Fixes: 6f7d760e86fa ("migrate_pages: move THP/hugetlb migration support check to simplify code")
>
> on all three, as this was the final patch in that series. Inaccurate,
> but it means that these fixes will land in a suitable place if anyone
> needs them.

Sorry. I should have added the "Fixes:" tag. I will be more careful in
the future. And, I will add proper "Link:" tag too.

Best Regards,
Huang, Ying
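
The strategy the cover letter describes for patch 3/3 (try the whole batch asynchronously first, then retry only the failures synchronously, one at a time) can be illustrated with a small user-space model. This is a sketch only, not kernel code: `Folio`, `needs_wait`, and all three function names are hypothetical and do not correspond to identifiers in the actual patches. It assumes that a folio fails the asynchronous pass exactly when migrating it would require blocking, and that the single-folio synchronous retry always succeeds.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Folio:
    """Hypothetical stand-in for a folio; not a real kernel structure."""
    name: str
    # Models a folio that cannot be migrated without blocking,
    # e.g. one under writeback or with a contended buffer head lock.
    needs_wait: bool = False


def migrate_async_batch(folios: List[Folio]) -> Tuple[List[Folio], List[Folio]]:
    """Attempt every folio without blocking; return (migrated, failed)."""
    migrated, failed = [], []
    for folio in folios:
        if folio.needs_wait:
            failed.append(folio)    # would have to wait: give up for now
        else:
            migrated.append(folio)  # migrated as part of the batch
    return migrated, failed


def migrate_sync_one(folio: Folio) -> bool:
    """Migrate one folio, waiting if needed.

    In this model no other folio is held while we wait, which is the
    condition the cover letter gives for synchronous waiting being safe.
    """
    folio.needs_wait = False  # model assumption: the wait completes
    return True


def migrate_sync_with_batching(folios: List[Folio]) -> List[Tuple[str, str]]:
    """Optimistic async batch first, then one-by-one sync fallback."""
    migrated, failed = migrate_async_batch(folios)
    log = [("async-batch", f.name) for f in migrated]
    for folio in failed:
        if migrate_sync_one(folio):
            log.append(("sync-single", folio.name))
    return log


if __name__ == "__main__":
    batch = [Folio("a"), Folio("b", needs_wait=True), Folio("c")]
    for mode, name in migrate_sync_with_batching(batch):
        print(mode, name)
```

In this model only the folios that would have blocked leave the batch, so the common case keeps whatever benefit batching provides, while any waiting happens with a single folio in hand.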