Message ID | 20241018192525.95862-1-ryncsn@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm, zswap: don't touch the XArray lock if there is no entry to free | expand |
On Sat, Oct 19, 2024 at 03:25:25AM +0800, Kairui Song wrote: > if (xa_empty(tree)) > return; > > - entry = xa_erase(tree, offset); > - if (entry) > + rcu_read_lock(); > + entry = xas_load(&xas); > + if (entry) { You should call xas_reset() here. And I'm not sure it's a great idea to spin waiting for the xa lock while holding the RCU read lock? Probably not awful but I could easily be wrong. > + xas_lock(&xas); > + WARN_ON_ONCE(xas_reload(&xas) != entry); > + xas_store(&xas, NULL); > + xas_unlock(&xas); > zswap_entry_free(entry); > + } > + rcu_read_unlock(); > } > > int zswap_swapon(int type, unsigned long nr_pages) > -- > 2.47.0 > >
On Sat, Oct 19, 2024 at 3:46 AM Matthew Wilcox <willy@infradead.org> wrote: > > On Sat, Oct 19, 2024 at 03:25:25AM +0800, Kairui Song wrote: > > if (xa_empty(tree)) > > return; > > > > - entry = xa_erase(tree, offset); > > - if (entry) > > + rcu_read_lock(); > > + entry = xas_load(&xas); > > + if (entry) { > > You should call xas_reset() here. And I'm not sure it's a great idea to > spin waiting for the xa lock while holding the RCU read lock? Probably > not awful but I could easily be wrong. Thanks for the review. I thought about it, that could cancel this optimization. Oh, and there is a thing I forgot to mention (maybe I should add some comments about it?). If xas_load found an entry, that entry must be pinned by HAS_CACHE or swap slot count right now, and one entry can only be freed once. So it should be safe here? This might be a little fragile though, maybe this optimization can better be done after some zswap invalidation path cleanup. > > > + xas_lock(&xas); > > + WARN_ON_ONCE(xas_reload(&xas) != entry); > > + xas_store(&xas, NULL); > > + xas_unlock(&xas); > > zswap_entry_free(entry); > > + } > > + rcu_read_unlock(); > > } > > > > int zswap_swapon(int type, unsigned long nr_pages) > > -- > > 2.47.0 > > > > >
On Fri, Oct 18, 2024 at 1:01 PM Kairui Song <ryncsn@gmail.com> wrote: > > On Sat, Oct 19, 2024 at 3:46 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > On Sat, Oct 19, 2024 at 03:25:25AM +0800, Kairui Song wrote: > > > if (xa_empty(tree)) > > > return; > > > > > > - entry = xa_erase(tree, offset); > > > - if (entry) > > > + rcu_read_lock(); > > > + entry = xas_load(&xas); > > > + if (entry) { > > > > You should call xas_reset() here. Oh I thought xas_reload() is enough here to check that the entry is still there after the lock is acquired. Do we have to start the walk over after holding the lock? If yes, it seems like that would be equivalent to the following: entry = xa_load(tree, offset); if (entry) xa_erase(tree, offset); >> And I'm not sure it's a great idea to > > spin waiting for the xa lock while holding the RCU read lock? Probably > > not awful but I could easily be wrong. If we end up using xa_load() and xa_erase() then we avoid that, but then we'd need to walk the xarray twice. I thought we could avoid the rewalk with xas_reload(). I am not sure if the xa_load() check would still be worth it at this point -- or maybe the second walk will be much faster as everything will be cache hot? Idk. Matthew, any prior experience with such patterns of lockless lookups followed by a conditional locked operation? > > Thanks for the review. I thought about it, that could cancel this optimization. > > Oh, and there is a thing I forgot to mention (maybe I should add some > comments about it?). If xas_load found an entry, that entry must be > pinned by HAS_CACHE or swap slot count right now, and one entry can > only be freed once. > So it should be safe here? > > This might be a little fragile though, maybe this optimization can > better be done after some zswap invalidation path cleanup. The only guarantee that we are requiring from the caller here is that the swap entry is stable, i.e. is not freed and reused while zswap_invalidate() is running. This seems to be a reasonable assumption, or did I miss something here?
On Fri, Oct 18, 2024 at 01:40:18PM -0700, Yosry Ahmed wrote: > Oh I thought xas_reload() is enough here to check that the entry is > still there after the lock is acquired. Do we have to start the walk > over after holding the lock? Yes. The entry is guaranteed to still be valid, but the node you're looking in might have been freed, so you can't modify the node without making sure the node is still in the tree. We could make that cheaper than a rewalk, but you're going to need to write that code since you're the first to want to do something like this.
On Fri, Oct 18, 2024 at 1:55 PM Matthew Wilcox <willy@infradead.org> wrote: > > On Fri, Oct 18, 2024 at 01:40:18PM -0700, Yosry Ahmed wrote: > > Oh I thought xas_reload() is enough here to check that the entry is > > still there after the lock is acquired. Do we have to start the walk > > over after holding the lock? > > Yes. The entry is guaranteed to still be valid, but the node you're > looking in might have been freed, so you can't modify the node without > making sure the node is still in the tree. We could make that cheaper > than a rewalk, but you're going to need to write that code since you're > the first to want to do something like this. I see, thanks for elaborating. Could you confirm if the current patch with the xas_reset() added would be equivalent to just checking xa_load() before using xa_erase()?
On Fri, Oct 18, 2024 at 02:00:16PM -0700, Yosry Ahmed wrote: > On Fri, Oct 18, 2024 at 1:55 PM Matthew Wilcox <willy@infradead.org> wrote: > > > > On Fri, Oct 18, 2024 at 01:40:18PM -0700, Yosry Ahmed wrote: > > > Oh I thought xas_reload() is enough here to check that the entry is > > > still there after the lock is acquired. Do we have to start the walk > > > over after holding the lock? > > > > Yes. The entry is guaranteed to still be valid, but the node you're > > looking in might have been freed, so you can't modify the node without > > making sure the node is still in the tree. We could make that cheaper > > than a rewalk, but you're going to need to write that code since you're > > the first to want to do something like this. > > I see, thanks for elaborating. > > Could you confirm if the current patch with the xas_reset() added > would be equivalent to just checking xa_load() before using > xa_erase()? Yes, I think it would, so it's probably a poor tradeoff.
diff --git a/mm/zswap.c b/mm/zswap.c index f6316b66fb23..a5ba80ac8861 100644 --- a/mm/zswap.c +++ b/mm/zswap.c @@ -1641,12 +1641,21 @@ void zswap_invalidate(swp_entry_t swp) struct xarray *tree = swap_zswap_tree(swp); struct zswap_entry *entry; + XA_STATE(xas, tree, offset); + if (xa_empty(tree)) return; - entry = xa_erase(tree, offset); - if (entry) + rcu_read_lock(); + entry = xas_load(&xas); + if (entry) { + xas_lock(&xas); + WARN_ON_ONCE(xas_reload(&xas) != entry); + xas_store(&xas, NULL); + xas_unlock(&xas); zswap_entry_free(entry); + } + rcu_read_unlock(); } int zswap_swapon(int type, unsigned long nr_pages)