Message ID | 20240924011709.7037-5-kanchana.p.sridhar@intel.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: ZSWAP swap-out of mTHP folios | expand |
On Mon, Sep 23, 2024 at 6:17 PM Kanchana P Sridhar <kanchana.p.sridhar@intel.com> wrote: > > Added a new procedure zswap_delete_stored_offsets() that can be > called to delete stored offsets in a folio in case zswap_store() > fails or zswap is disabled. > > Refactored the code in zswap_store() that handles these cases, > to call zswap_delete_stored_offsets(). > > Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com> > --- > mm/zswap.c | 33 ++++++++++++++++++++++++++++++--- > 1 file changed, 30 insertions(+), 3 deletions(-) > > diff --git a/mm/zswap.c b/mm/zswap.c > index fd35a81b6e36..9bea948d653e 100644 > --- a/mm/zswap.c > +++ b/mm/zswap.c > @@ -1435,8 +1435,37 @@ static bool zswap_store_entry(struct xarray *tree, > return true; > } > > +/* > + * If the zswap store fails or zswap is disabled, we must invalidate the > + * possibly stale entries which were previously stored at the offsets > + * corresponding to each page of the folio. Otherwise, writeback could > + * overwrite the new data in the swapfile. > + * > + * This is called after the store of an offset in a large folio has failed. "store of a subpage" rather than "stored of an offset"? > + * All zswap entries in the folio must be deleted. This helps make sure > + * that a swapped-out mTHP is either entirely stored in zswap, or entirely > + * not stored in zswap. > + * > + * This is also called if zswap_store() is invoked, but zswap is not enabled. > + * All offsets for the folio are deleted from zswap in this case. > + */ > +static void zswap_delete_stored_offsets(struct xarray *tree, > + pgoff_t offset, > + long nr_pages) > +{ > + struct zswap_entry *entry; > + long i; > + > + for (i = 0; i < nr_pages; ++i) { > + entry = xa_erase(tree, offset + i); > + if (entry) > + zswap_entry_free(entry); > + } > +} > +
On Mon, Sep 23, 2024 at 6:17 PM Kanchana P Sridhar <kanchana.p.sridhar@intel.com> wrote: > > Added a new procedure zswap_delete_stored_offsets() that can be > called to delete stored offsets in a folio in case zswap_store() > fails or zswap is disabled. I don't see the value in this helper. It will get called in one place AFAICT, and it is a bit inconsistent that we have to explicitly loop in zswap_store() to store pages, but the loop to delete pages upon failure is hidden in the helper. I am not against adding a trivial zswap_tree_delete() helper (or similar) that calls xa_erase() and zswap_entry_free() to match zswap_tree_store() if you prefer that. > > Refactored the code in zswap_store() that handles these cases, > to call zswap_delete_stored_offsets(). > > Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com> > --- > mm/zswap.c | 33 ++++++++++++++++++++++++++++++--- > 1 file changed, 30 insertions(+), 3 deletions(-) > > diff --git a/mm/zswap.c b/mm/zswap.c > index fd35a81b6e36..9bea948d653e 100644 > --- a/mm/zswap.c > +++ b/mm/zswap.c > @@ -1435,8 +1435,37 @@ static bool zswap_store_entry(struct xarray *tree, > return true; > } > > +/* > + * If the zswap store fails or zswap is disabled, we must invalidate the > + * possibly stale entries which were previously stored at the offsets > + * corresponding to each page of the folio. Otherwise, writeback could > + * overwrite the new data in the swapfile. > + * > + * This is called after the store of an offset in a large folio has failed. > + * All zswap entries in the folio must be deleted. This helps make sure > + * that a swapped-out mTHP is either entirely stored in zswap, or entirely > + * not stored in zswap. > + * > + * This is also called if zswap_store() is invoked, but zswap is not enabled. > + * All offsets for the folio are deleted from zswap in this case. > + */ > +static void zswap_delete_stored_offsets(struct xarray *tree, > + pgoff_t offset, > + long nr_pages) > +{ > + struct zswap_entry *entry; > + long i; > + > + for (i = 0; i < nr_pages; ++i) { > + entry = xa_erase(tree, offset + i); > + if (entry) > + zswap_entry_free(entry); > + } > +} > + > bool zswap_store(struct folio *folio) > { > + long nr_pages = folio_nr_pages(folio); > swp_entry_t swp = folio->swap; > pgoff_t offset = swp_offset(swp); > struct xarray *tree = swap_zswap_tree(swp); > @@ -1541,9 +1570,7 @@ bool zswap_store(struct folio *folio) > * possibly stale entry which was previously stored at this offset. > * Otherwise, writeback could overwrite the new data in the swapfile. > */ > - entry = xa_erase(tree, offset); > - if (entry) > - zswap_entry_free(entry); > + zswap_delete_stored_offsets(tree, offset, nr_pages); > return false; > } > > -- > 2.27.0 >
> -----Original Message----- > From: Nhat Pham <nphamcs@gmail.com> > Sent: Tuesday, September 24, 2024 10:25 AM > To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com> > Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; > hannes@cmpxchg.org; yosryahmed@google.com; > chengming.zhou@linux.dev; usamaarif642@gmail.com; > shakeel.butt@linux.dev; ryan.roberts@arm.com; Huang, Ying > <ying.huang@intel.com>; 21cnbao@gmail.com; akpm@linux-foundation.org; > Zou, Nanhai <nanhai.zou@intel.com>; Feghali, Wajdi K > <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com> > Subject: Re: [PATCH v7 4/8] mm: zswap: Refactor code to delete stored > offsets in case of errors. > > On Mon, Sep 23, 2024 at 6:17 PM Kanchana P Sridhar > <kanchana.p.sridhar@intel.com> wrote: > > > > Added a new procedure zswap_delete_stored_offsets() that can be > > called to delete stored offsets in a folio in case zswap_store() > > fails or zswap is disabled. > > > > Refactored the code in zswap_store() that handles these cases, > > to call zswap_delete_stored_offsets(). > > > > Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com> > > --- > > mm/zswap.c | 33 ++++++++++++++++++++++++++++++--- > > 1 file changed, 30 insertions(+), 3 deletions(-) > > > > diff --git a/mm/zswap.c b/mm/zswap.c > > index fd35a81b6e36..9bea948d653e 100644 > > --- a/mm/zswap.c > > +++ b/mm/zswap.c > > @@ -1435,8 +1435,37 @@ static bool zswap_store_entry(struct xarray > *tree, > > return true; > > } > > > > +/* > > + * If the zswap store fails or zswap is disabled, we must invalidate the > > + * possibly stale entries which were previously stored at the offsets > > + * corresponding to each page of the folio. Otherwise, writeback could > > + * overwrite the new data in the swapfile. > > + * > > + * This is called after the store of an offset in a large folio has failed. > > "store of a subpage" rather than "stored of an offset"? Sure, I will make this change in v8. > > > > + * All zswap entries in the folio must be deleted. This helps make sure > > + * that a swapped-out mTHP is either entirely stored in zswap, or entirely > > + * not stored in zswap. > > + * > > + * This is also called if zswap_store() is invoked, but zswap is not enabled. > > + * All offsets for the folio are deleted from zswap in this case. > > + */ > > +static void zswap_delete_stored_offsets(struct xarray *tree, > > + pgoff_t offset, > > + long nr_pages) > > +{ > > + struct zswap_entry *entry; > > + long i; > > + > > + for (i = 0; i < nr_pages; ++i) { > > + entry = xa_erase(tree, offset + i); > > + if (entry) > > + zswap_entry_free(entry); > > + } > > +} > > +
> -----Original Message----- > From: Yosry Ahmed <yosryahmed@google.com> > Sent: Tuesday, September 24, 2024 12:20 PM > To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com> > Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; > hannes@cmpxchg.org; nphamcs@gmail.com; chengming.zhou@linux.dev; > usamaarif642@gmail.com; shakeel.butt@linux.dev; ryan.roberts@arm.com; > Huang, Ying <ying.huang@intel.com>; 21cnbao@gmail.com; akpm@linux- > foundation.org; Zou, Nanhai <nanhai.zou@intel.com>; Feghali, Wajdi K > <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com> > Subject: Re: [PATCH v7 4/8] mm: zswap: Refactor code to delete stored > offsets in case of errors. > > On Mon, Sep 23, 2024 at 6:17 PM Kanchana P Sridhar > <kanchana.p.sridhar@intel.com> wrote: > > > > Added a new procedure zswap_delete_stored_offsets() that can be > > called to delete stored offsets in a folio in case zswap_store() > > fails or zswap is disabled. > > I don't see the value in this helper. It will get called in one place > AFAICT, and it is a bit inconsistent that we have to explicitly loop > in zswap_store() to store pages, but the loop to delete pages upon > failure is hidden in the helper. > > I am not against adding a trivial zswap_tree_delete() helper (or > similar) that calls xa_erase() and zswap_entry_free() to match > zswap_tree_store() if you prefer that. This is a good point. I had refactored this routine in the context of my code that does batching and the same loop over the mTHP's subpages would get called in multiple error condition cases. I am thinking it might probably make sense for say zswap_tree_delete() to take a "folio" and "tree" and encapsulate deleting all stored offsets for that folio. Since we have already done the computes for finding the "tree", having that as an input parameter is mainly for latency, but if it is cleaner to have "zswap_tree_delete(struct folio *folio)", that should be Ok too. Please let me know your suggestion on this. Thanks, Kanchana > > > > > Refactored the code in zswap_store() that handles these cases, > > to call zswap_delete_stored_offsets(). > > > > Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com> > > --- > > mm/zswap.c | 33 ++++++++++++++++++++++++++++++--- > > 1 file changed, 30 insertions(+), 3 deletions(-) > > > > diff --git a/mm/zswap.c b/mm/zswap.c > > index fd35a81b6e36..9bea948d653e 100644 > > --- a/mm/zswap.c > > +++ b/mm/zswap.c > > @@ -1435,8 +1435,37 @@ static bool zswap_store_entry(struct xarray > *tree, > > return true; > > } > > > > +/* > > + * If the zswap store fails or zswap is disabled, we must invalidate the > > + * possibly stale entries which were previously stored at the offsets > > + * corresponding to each page of the folio. Otherwise, writeback could > > + * overwrite the new data in the swapfile. > > + * > > + * This is called after the store of an offset in a large folio has failed. > > + * All zswap entries in the folio must be deleted. This helps make sure > > + * that a swapped-out mTHP is either entirely stored in zswap, or entirely > > + * not stored in zswap. > > + * > > + * This is also called if zswap_store() is invoked, but zswap is not enabled. > > + * All offsets for the folio are deleted from zswap in this case. > > + */ > > +static void zswap_delete_stored_offsets(struct xarray *tree, > > + pgoff_t offset, > > + long nr_pages) > > +{ > > + struct zswap_entry *entry; > > + long i; > > + > > + for (i = 0; i < nr_pages; ++i) { > > + entry = xa_erase(tree, offset + i); > > + if (entry) > > + zswap_entry_free(entry); > > + } > > +} > > + > > bool zswap_store(struct folio *folio) > > { > > + long nr_pages = folio_nr_pages(folio); > > swp_entry_t swp = folio->swap; > > pgoff_t offset = swp_offset(swp); > > struct xarray *tree = swap_zswap_tree(swp); > > @@ -1541,9 +1570,7 @@ bool zswap_store(struct folio *folio) > > * possibly stale entry which was previously stored at this offset. > > * Otherwise, writeback could overwrite the new data in the swapfile. > > */ > > - entry = xa_erase(tree, offset); > > - if (entry) > > - zswap_entry_free(entry); > > + zswap_delete_stored_offsets(tree, offset, nr_pages); > > return false; > > } > > > > -- > > 2.27.0 > >
On Tue, Sep 24, 2024 at 3:33 PM Sridhar, Kanchana P <kanchana.p.sridhar@intel.com> wrote: > > > -----Original Message----- > > From: Yosry Ahmed <yosryahmed@google.com> > > Sent: Tuesday, September 24, 2024 12:20 PM > > To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com> > > Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; > > hannes@cmpxchg.org; nphamcs@gmail.com; chengming.zhou@linux.dev; > > usamaarif642@gmail.com; shakeel.butt@linux.dev; ryan.roberts@arm.com; > > Huang, Ying <ying.huang@intel.com>; 21cnbao@gmail.com; akpm@linux- > > foundation.org; Zou, Nanhai <nanhai.zou@intel.com>; Feghali, Wajdi K > > <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com> > > Subject: Re: [PATCH v7 4/8] mm: zswap: Refactor code to delete stored > > offsets in case of errors. > > > > On Mon, Sep 23, 2024 at 6:17 PM Kanchana P Sridhar > > <kanchana.p.sridhar@intel.com> wrote: > > > > > > Added a new procedure zswap_delete_stored_offsets() that can be > > > called to delete stored offsets in a folio in case zswap_store() > > > fails or zswap is disabled. > > > > I don't see the value in this helper. It will get called in one place > > AFAICT, and it is a bit inconsistent that we have to explicitly loop > > in zswap_store() to store pages, but the loop to delete pages upon > > failure is hidden in the helper. > > > > I am not against adding a trivial zswap_tree_delete() helper (or > > similar) that calls xa_erase() and zswap_entry_free() to match > > zswap_tree_store() if you prefer that. > > This is a good point. I had refactored this routine in the context > of my code that does batching and the same loop over the mTHP's > subpages would get called in multiple error condition cases. > > I am thinking it might probably make sense for say zswap_tree_delete() > to take a "folio" and "tree" and encapsulate deleting all stored offsets > for that folio. Since we have already done the computes for finding the > "tree", having that as an input parameter is mainly for latency, but if > it is cleaner to have "zswap_tree_delete(struct folio *folio)", that should > be Ok too. Please let me know your suggestion on this. > What I meant is "zswap_tree_delete(struct xarray *tree, pgoff_t offset)", and loop and call this in zswap_store(). This would be consistent on looping and calling zswap_store_page(). But we can keep the helper as-is actually and just rename it to zswap_tree_delete() and move the loop inside. No strong preference.
> -----Original Message----- > From: Yosry Ahmed <yosryahmed@google.com> > Sent: Tuesday, September 24, 2024 5:43 PM > To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com> > Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; > hannes@cmpxchg.org; nphamcs@gmail.com; chengming.zhou@linux.dev; > usamaarif642@gmail.com; shakeel.butt@linux.dev; ryan.roberts@arm.com; > Huang, Ying <ying.huang@intel.com>; 21cnbao@gmail.com; akpm@linux- > foundation.org; Zou, Nanhai <nanhai.zou@intel.com>; Feghali, Wajdi K > <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com> > Subject: Re: [PATCH v7 4/8] mm: zswap: Refactor code to delete stored > offsets in case of errors. > > On Tue, Sep 24, 2024 at 3:33 PM Sridhar, Kanchana P > <kanchana.p.sridhar@intel.com> wrote: > > > > > -----Original Message----- > > > From: Yosry Ahmed <yosryahmed@google.com> > > > Sent: Tuesday, September 24, 2024 12:20 PM > > > To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com> > > > Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; > > > hannes@cmpxchg.org; nphamcs@gmail.com; > chengming.zhou@linux.dev; > > > usamaarif642@gmail.com; shakeel.butt@linux.dev; > ryan.roberts@arm.com; > > > Huang, Ying <ying.huang@intel.com>; 21cnbao@gmail.com; akpm@linux- > > > foundation.org; Zou, Nanhai <nanhai.zou@intel.com>; Feghali, Wajdi K > > > <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com> > > > Subject: Re: [PATCH v7 4/8] mm: zswap: Refactor code to delete stored > > > offsets in case of errors. > > > > > > On Mon, Sep 23, 2024 at 6:17 PM Kanchana P Sridhar > > > <kanchana.p.sridhar@intel.com> wrote: > > > > > > > > Added a new procedure zswap_delete_stored_offsets() that can be > > > > called to delete stored offsets in a folio in case zswap_store() > > > > fails or zswap is disabled. > > > > > > I don't see the value in this helper. It will get called in one place > > > AFAICT, and it is a bit inconsistent that we have to explicitly loop > > > in zswap_store() to store pages, but the loop to delete pages upon > > > failure is hidden in the helper. > > > > > > I am not against adding a trivial zswap_tree_delete() helper (or > > > similar) that calls xa_erase() and zswap_entry_free() to match > > > zswap_tree_store() if you prefer that. > > > > This is a good point. I had refactored this routine in the context > > of my code that does batching and the same loop over the mTHP's > > subpages would get called in multiple error condition cases. > > > > I am thinking it might probably make sense for say zswap_tree_delete() > > to take a "folio" and "tree" and encapsulate deleting all stored offsets > > for that folio. Since we have already done the computes for finding the > > "tree", having that as an input parameter is mainly for latency, but if > > it is cleaner to have "zswap_tree_delete(struct folio *folio)", that should > > be Ok too. Please let me know your suggestion on this. > > > > What I meant is "zswap_tree_delete(struct xarray *tree, pgoff_t > offset)", and loop and call this in zswap_store(). This would be > consistent on looping and calling zswap_store_page(). > > But we can keep the helper as-is actually and just rename it to > zswap_tree_delete() and move the loop inside. No strong preference. Ok, sounds good. Thanks, Kanchana
On Tue, Sep 24, 2024 at 05:43:22PM -0700, Yosry Ahmed wrote: > What I meant is "zswap_tree_delete(struct xarray *tree, pgoff_t > offset)", and loop and call this in zswap_store(). This would be > consistent on looping and calling zswap_store_page(). > > But we can keep the helper as-is actually and just rename it to > zswap_tree_delete() and move the loop inside. No strong preference. Both helpers seem unnecesary. zswap_tree_store() is not called in a loop directly. It's called from zswap_store_page(), which is essentially what zswap_store() is now, and that was fine with the open-coded insert. zswap_tree_delete() just hides what's going on. zswap_store() has the for-loop to store the subpages, so it makes sense it has the for loop for unwinding on rejection as well. This makes it easier on the reader to match up attempt and unwind. Please just drop both.
> -----Original Message----- > From: Johannes Weiner <hannes@cmpxchg.org> > Sent: Wednesday, September 25, 2024 7:11 AM > To: Yosry Ahmed <yosryahmed@google.com> > Cc: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>; linux- > kernel@vger.kernel.org; linux-mm@kvack.org; nphamcs@gmail.com; > chengming.zhou@linux.dev; usamaarif642@gmail.com; > shakeel.butt@linux.dev; ryan.roberts@arm.com; Huang, Ying > <ying.huang@intel.com>; 21cnbao@gmail.com; akpm@linux-foundation.org; > Zou, Nanhai <nanhai.zou@intel.com>; Feghali, Wajdi K > <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com> > Subject: Re: [PATCH v7 4/8] mm: zswap: Refactor code to delete stored > offsets in case of errors. > > On Tue, Sep 24, 2024 at 05:43:22PM -0700, Yosry Ahmed wrote: > > What I meant is "zswap_tree_delete(struct xarray *tree, pgoff_t > > offset)", and loop and call this in zswap_store(). This would be > > consistent on looping and calling zswap_store_page(). > > > > But we can keep the helper as-is actually and just rename it to > > zswap_tree_delete() and move the loop inside. No strong preference. > > Both helpers seem unnecesary. > > zswap_tree_store() is not called in a loop directly. It's called from > zswap_store_page(), which is essentially what zswap_store() is now, > and that was fine with the open-coded insert. > > zswap_tree_delete() just hides what's going on. zswap_store() has the > for-loop to store the subpages, so it makes sense it has the for loop > for unwinding on rejection as well. This makes it easier on the reader > to match up attempt and unwind. > > Please just drop both. Ok, sounds good.
diff --git a/mm/zswap.c b/mm/zswap.c index fd35a81b6e36..9bea948d653e 100644 --- a/mm/zswap.c +++ b/mm/zswap.c @@ -1435,8 +1435,37 @@ static bool zswap_store_entry(struct xarray *tree, return true; } +/* + * If the zswap store fails or zswap is disabled, we must invalidate the + * possibly stale entries which were previously stored at the offsets + * corresponding to each page of the folio. Otherwise, writeback could + * overwrite the new data in the swapfile. + * + * This is called after the store of an offset in a large folio has failed. + * All zswap entries in the folio must be deleted. This helps make sure + * that a swapped-out mTHP is either entirely stored in zswap, or entirely + * not stored in zswap. + * + * This is also called if zswap_store() is invoked, but zswap is not enabled. + * All offsets for the folio are deleted from zswap in this case. + */ +static void zswap_delete_stored_offsets(struct xarray *tree, + pgoff_t offset, + long nr_pages) +{ + struct zswap_entry *entry; + long i; + + for (i = 0; i < nr_pages; ++i) { + entry = xa_erase(tree, offset + i); + if (entry) + zswap_entry_free(entry); + } +} + bool zswap_store(struct folio *folio) { + long nr_pages = folio_nr_pages(folio); swp_entry_t swp = folio->swap; pgoff_t offset = swp_offset(swp); struct xarray *tree = swap_zswap_tree(swp); @@ -1541,9 +1570,7 @@ bool zswap_store(struct folio *folio) * possibly stale entry which was previously stored at this offset. * Otherwise, writeback could overwrite the new data in the swapfile. */ - entry = xa_erase(tree, offset); - if (entry) - zswap_entry_free(entry); + zswap_delete_stored_offsets(tree, offset, nr_pages); return false; }
Added a new procedure zswap_delete_stored_offsets() that can be called to delete stored offsets in a folio in case zswap_store() fails or zswap is disabled. Refactored the code in zswap_store() that handles these cases, to call zswap_delete_stored_offsets(). Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com> --- mm/zswap.c | 33 ++++++++++++++++++++++++++++++--- 1 file changed, 30 insertions(+), 3 deletions(-)