
[v2,01/18] mm: Make folios_put() the basis of release_pages()

Message ID 20240217022546.1496101-2-willy@infradead.org (mailing list archive)
State New
Series Rearrange batched folio freeing

Commit Message

Matthew Wilcox Feb. 17, 2024, 2:25 a.m. UTC
By making release_pages() call folios_put(), we can get rid of the calls
to compound_head() for the callers that already know they have folios.
We can also get rid of the lock_batch tracking as we know the size
of the batch is limited by folio_batch.  This does reduce the maximum
number of pages for which the lruvec lock is held, from SWAP_CLUSTER_MAX
(32) to PAGEVEC_SIZE (15).  I do not expect this to make a significant
difference, but if it does, we can increase PAGEVEC_SIZE to 31.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/mm.h | 19 ++---------
 mm/mlock.c         |  3 +-
 mm/swap.c          | 84 +++++++++++++++++++++++++++-------------------
 3 files changed, 52 insertions(+), 54 deletions(-)
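
The loop shape described in the commit message can be modelled in userspace. The following is only a sketch under stated assumptions: `item_batch`, `batch_add()`, `batch_put()` and `release_all()` are hypothetical stand-ins for `folio_batch`, `folio_batch_add()`, `folios_put()` and `release_pages()`, plain `int` counters stand in for folio refcounts, and `BATCH_SIZE` stands in for PAGEVEC_SIZE. No locking is modelled; the point is only that a single flush never covers more than one full batch.

```c
#include <assert.h>

#define BATCH_SIZE 15	/* stands in for PAGEVEC_SIZE */

/* Hypothetical userspace stand-in for struct folio_batch. */
struct item_batch {
	unsigned int nr;
	int *items[BATCH_SIZE];
};

static unsigned int flush_count;	/* number of batch flushes so far */
static unsigned int largest_flush;	/* most items handled by one flush */

/* Add an item; return the number of free slots left (0 means "full"). */
static unsigned int batch_add(struct item_batch *b, int *item)
{
	assert(b->nr < BATCH_SIZE);
	b->items[b->nr++] = item;
	return BATCH_SIZE - b->nr;
}

/* Drop one reference per item, then hand the batch back empty. */
static void batch_put(struct item_batch *b)
{
	unsigned int i;

	for (i = 0; i < b->nr; i++)
		(*b->items[i])--;
	if (b->nr > largest_flush)
		largest_flush = b->nr;
	flush_count++;
	b->nr = 0;	/* ready for reuse, no reinit needed */
}

/* Same loop shape as the new release_pages(): fill, flush when full. */
static void release_all(int **items, int nr)
{
	struct item_batch b = { .nr = 0 };
	int i;

	for (i = 0; i < nr; i++) {
		if (batch_add(&b, items[i]) > 0)
			continue;
		batch_put(&b);
	}

	/* Flush whatever is left in a partially filled batch. */
	if (b.nr)
		batch_put(&b);
}
```

With 40 items and a batch size of 15, this model flushes three times (15, 15, 10), so the per-flush bound is the batch size rather than a separately tracked counter.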

Comments

David Hildenbrand Feb. 19, 2024, 9:43 a.m. UTC | #1
On 17.02.24 03:25, Matthew Wilcox (Oracle) wrote:
> By making release_pages() call folios_put(), we can get rid of the calls
> to compound_head() for the callers that already know they have folios.
> We can also get rid of the lock_batch tracking as we know the size
> of the batch is limited by folio_batch.  This does reduce the maximum
> number of pages for which the lruvec lock is held, from SWAP_CLUSTER_MAX
> (32) to PAGEVEC_SIZE (15).  I do not expect this to make a significant
> difference, but if it does, we can increase PAGEVEC_SIZE to 31.
> 

I'm afraid that won't apply to current mm-unstable anymore, where we can 
now put multiple references to a single folio (as part of unmapping
large PTE-mapped folios).

[...]

> +/**
> + * release_pages - batched put_page()
> + * @arg: array of pages to release
> + * @nr: number of pages
> + *
> + * Decrement the reference count on all the pages in @arg.  If it
> + * fell to zero, remove the page from the LRU and free it.
> + *
> + * Note that the argument can be an array of pages, encoded pages,
> + * or folio pointers. We ignore any encoded bits, and turn any of
> + * them into just a folio that gets free'd.
> + */
> +void release_pages(release_pages_arg arg, int nr)
> +{
> +	struct folio_batch fbatch;
> +	struct encoded_page **encoded = arg.encoded_pages;
> +	int i;
> +
> +	folio_batch_init(&fbatch);
> +	for (i = 0; i < nr; i++) {
> +		/* Turn any of the argument types into a folio */
> +		struct folio *folio = page_folio(encoded_page_ptr(encoded[i]));
> +

As an "easy" way forward, we could handle these "multiple-ref" cases 
here by putting ref-1 references, and leaving the single remaining 
reference to folios_put().

That implies more atomic operations, though.

Alternatively, "struct folio_batch" would have to be optimized to 
understand "put multiple references" as well.

> +		if (folio_batch_add(&fbatch, folio) > 0)
> +			continue;
> +		folios_put(&fbatch);
> +	}
> +
> +	if (fbatch.nr)
> +		folios_put(&fbatch);
>   }
>   EXPORT_SYMBOL(release_pages);
>
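
The trade-off between the two options raised in this reply can be illustrated with a small userspace model. This is a sketch, not kernel code: `struct obj`, `obj_put_refs()`, `put_split()`, `put_with_count()` and `ops_for()` are hypothetical names, and C11 `atomic_fetch_sub()` stands in for the kernel's refcount primitives. It only counts atomic read-modify-write operations per object.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for an object with an atomic refcount. */
struct obj {
	atomic_int refcount;
};

static unsigned int atomic_ops;	/* atomic read-modify-write operations issued */

/* Subtract @refs references; return nonzero if the count reached zero. */
static int obj_put_refs(struct obj *o, int refs)
{
	atomic_ops++;
	return atomic_fetch_sub(&o->refcount, refs) == refs;
}

/*
 * "Easy" variant: drop refs - 1 up front and leave the last reference
 * to the batched put, which only knows how to subtract one.
 */
static int put_split(struct obj *o, int refs)
{
	if (refs > 1)
		obj_put_refs(o, refs - 1);
	return obj_put_refs(o, 1);
}

/* Alternative: the batch carries the count, so one subtraction suffices. */
static int put_with_count(struct obj *o, int refs)
{
	return obj_put_refs(o, refs);
}

/* Count the atomic operations one strategy needs to release @refs refs. */
static unsigned int ops_for(int (*put)(struct obj *, int), int refs)
{
	struct obj o;
	int freed;

	atomic_init(&o.refcount, refs);
	atomic_ops = 0;
	freed = put(&o, refs);
	assert(freed);	/* the object held exactly @refs references */
	return atomic_ops;
}
```

In this model the split variant costs two atomic operations for any object with more than one reference, while the count-carrying variant always costs one.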
Matthew Wilcox Feb. 19, 2024, 3:03 p.m. UTC | #2
On Mon, Feb 19, 2024 at 10:43:06AM +0100, David Hildenbrand wrote:
> On 17.02.24 03:25, Matthew Wilcox (Oracle) wrote:
> > By making release_pages() call folios_put(), we can get rid of the calls
> > to compound_head() for the callers that already know they have folios.
> > We can also get rid of the lock_batch tracking as we know the size
> > of the batch is limited by folio_batch.  This does reduce the maximum
> > number of pages for which the lruvec lock is held, from SWAP_CLUSTER_MAX
> > (32) to PAGEVEC_SIZE (15).  I do not expect this to make a significant
> > difference, but if it does, we can increase PAGEVEC_SIZE to 31.
> > 
> 
> I'm afraid that won't apply to current mm-unstable anymore, where we can now
> put multiple references to a single folio (as part of unmapping
> large PTE-mapped folios).

Argh.  I'm not a huge fan of that approach, but let's live with it for
now.  How about this as a replacement patch?  It compiles ...


diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1743bdeab506..42de41e469a1 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -36,6 +36,7 @@ struct anon_vma;
 struct anon_vma_chain;
 struct user_struct;
 struct pt_regs;
+struct folio_batch;
 
 extern int sysctl_page_lock_unfairness;
 
@@ -1519,6 +1520,8 @@ static inline void folio_put_refs(struct folio *folio, int refs)
 		__folio_put(folio);
 }
 
+void folios_put_refs(struct folio_batch *folios, unsigned int *refs);
+
 /*
  * union release_pages_arg - an array of pages or folios
  *
@@ -1541,18 +1544,19 @@ void release_pages(release_pages_arg, int nr);
 /**
  * folios_put - Decrement the reference count on an array of folios.
  * @folios: The folios.
- * @nr: How many folios there are.
  *
- * Like folio_put(), but for an array of folios.  This is more efficient
- * than writing the loop yourself as it will optimise the locks which
- * need to be taken if the folios are freed.
+ * Like folio_put(), but for a batch of folios.  This is more efficient
+ * than writing the loop yourself as it will optimise the locks which need
+ * to be taken if the folios are freed.  The folios batch is returned
+ * empty and ready to be reused for another batch; there is no need to
+ * reinitialise it.
  *
  * Context: May be called in process or interrupt context, but not in NMI
  * context.  May be called while holding a spinlock.
  */
-static inline void folios_put(struct folio **folios, unsigned int nr)
+static inline void folios_put(struct folio_batch *folios)
 {
-	release_pages(folios, nr);
+	folios_put_refs(folios, NULL);
 }
 
 static inline void put_page(struct page *page)
diff --git a/mm/mlock.c b/mm/mlock.c
index 086546ac5766..1ed2f2ab37cd 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -206,8 +206,7 @@ static void mlock_folio_batch(struct folio_batch *fbatch)
 
 	if (lruvec)
 		unlock_page_lruvec_irq(lruvec);
-	folios_put(fbatch->folios, folio_batch_count(fbatch));
-	folio_batch_reinit(fbatch);
+	folios_put(fbatch);
 }
 
 void mlock_drain_local(void)
diff --git a/mm/swap.c b/mm/swap.c
index e5380d732c0d..6b736fceccfa 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -89,7 +89,7 @@ static void __page_cache_release(struct folio *folio)
 		__folio_clear_lru_flags(folio);
 		unlock_page_lruvec_irqrestore(lruvec, flags);
 	}
-	/* See comment on folio_test_mlocked in release_pages() */
+	/* See comment on folio_test_mlocked in folios_put() */
 	if (unlikely(folio_test_mlocked(folio))) {
 		long nr_pages = folio_nr_pages(folio);
 
@@ -175,7 +175,7 @@ static void lru_add_fn(struct lruvec *lruvec, struct folio *folio)
 	 * while the LRU lock is held.
 	 *
 	 * (That is not true of __page_cache_release(), and not necessarily
-	 * true of release_pages(): but those only clear the mlocked flag after
+	 * true of folios_put(): but those only clear the mlocked flag after
 	 * folio_put_testzero() has excluded any other users of the folio.)
 	 */
 	if (folio_evictable(folio)) {
@@ -221,8 +221,7 @@ static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
 
 	if (lruvec)
 		unlock_page_lruvec_irqrestore(lruvec, flags);
-	folios_put(fbatch->folios, folio_batch_count(fbatch));
-	folio_batch_reinit(fbatch);
+	folios_put(fbatch);
 }
 
 static void folio_batch_add_and_move(struct folio_batch *fbatch,
@@ -946,47 +945,30 @@ void lru_cache_disable(void)
 }
 
 /**
- * release_pages - batched put_page()
- * @arg: array of pages to release
- * @nr: number of pages
+ * folios_put_refs - Reduce the reference count on a batch of folios.
+ * @folios: The folios.
+ * @refs: The number of refs to subtract from each folio.
  *
- * Decrement the reference count on all the pages in @arg.  If it
- * fell to zero, remove the page from the LRU and free it.
+ * Like folio_put(), but for a batch of folios.  This is more efficient
+ * than writing the loop yourself as it will optimise the locks which need
+ * to be taken if the folios are freed.  The folios batch is returned
+ * empty and ready to be reused for another batch; there is no need
+ * to reinitialise it.  If @refs is NULL, we subtract one from each
+ * folio refcount.
  *
- * Note that the argument can be an array of pages, encoded pages,
- * or folio pointers. We ignore any encoded bits, and turn any of
- * them into just a folio that gets free'd.
+ * Context: May be called in process or interrupt context, but not in NMI
+ * context.  May be called while holding a spinlock.
  */
-void release_pages(release_pages_arg arg, int nr)
+void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
 {
 	int i;
-	struct encoded_page **encoded = arg.encoded_pages;
 	LIST_HEAD(pages_to_free);
 	struct lruvec *lruvec = NULL;
 	unsigned long flags = 0;
-	unsigned int lock_batch;
 
-	for (i = 0; i < nr; i++) {
-		unsigned int nr_refs = 1;
-		struct folio *folio;
-
-		/* Turn any of the argument types into a folio */
-		folio = page_folio(encoded_page_ptr(encoded[i]));
-
-		/* Is our next entry actually "nr_pages" -> "nr_refs" ? */
-		if (unlikely(encoded_page_flags(encoded[i]) &
-			     ENCODED_PAGE_BIT_NR_PAGES_NEXT))
-			nr_refs = encoded_nr_pages(encoded[++i]);
-
-		/*
-		 * Make sure the IRQ-safe lock-holding time does not get
-		 * excessive with a continuous string of pages from the
-		 * same lruvec. The lock is held only if lruvec != NULL.
-		 */
-		if (lruvec && ++lock_batch == SWAP_CLUSTER_MAX) {
-			unlock_page_lruvec_irqrestore(lruvec, flags);
-			lruvec = NULL;
-		}
+	for (i = 0; i < folios->nr; i++) {
+		struct folio *folio = folios->folios[i];
+		unsigned int nr_refs = refs ? refs[i] : 1;
 
 		if (is_huge_zero_page(&folio->page))
 			continue;
@@ -1016,13 +998,8 @@ void release_pages(release_pages_arg arg, int nr)
 		}
 
 		if (folio_test_lru(folio)) {
-			struct lruvec *prev_lruvec = lruvec;
-
 			lruvec = folio_lruvec_relock_irqsave(folio, lruvec,
 									&flags);
-			if (prev_lruvec != lruvec)
-				lock_batch = 0;
-
 			lruvec_del_folio(lruvec, folio);
 			__folio_clear_lru_flags(folio);
 		}
@@ -1046,6 +1023,47 @@ void release_pages(release_pages_arg arg, int nr)
 
 	mem_cgroup_uncharge_list(&pages_to_free);
 	free_unref_page_list(&pages_to_free);
+	folios->nr = 0;
+}
+EXPORT_SYMBOL(folios_put);
+
+/**
+ * release_pages - batched put_page()
+ * @arg: array of pages to release
+ * @nr: number of pages
+ *
+ * Decrement the reference count on all the pages in @arg.  If it
+ * fell to zero, remove the page from the LRU and free it.
+ *
+ * Note that the argument can be an array of pages, encoded pages,
+ * or folio pointers. We ignore any encoded bits, and turn any of
+ * them into just a folio that gets free'd.
+ */
+void release_pages(release_pages_arg arg, int nr)
+{
+	struct folio_batch fbatch;
+	int refs[PAGEVEC_SIZE];
+	struct encoded_page **encoded = arg.encoded_pages;
+	int i;
+
+	folio_batch_init(&fbatch);
+	for (i = 0; i < nr; i++) {
+		/* Turn any of the argument types into a folio */
+		struct folio *folio = page_folio(encoded_page_ptr(encoded[i]));
+
+		/* Is our next entry actually "nr_pages" -> "nr_refs" ? */
+		refs[fbatch.nr] = 1;
+		if (unlikely(encoded_page_flags(encoded[i]) &
+			     ENCODED_PAGE_BIT_NR_PAGES_NEXT))
+			refs[fbatch.nr] = encoded_nr_pages(encoded[++i]);
+
+		if (folio_batch_add(&fbatch, folio) > 0)
+			continue;
+		folios_put_refs(&fbatch, refs);
+	}
+
+	if (fbatch.nr)
+		folios_put_refs(&fbatch, refs);
 }
 EXPORT_SYMBOL(release_pages);
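
The decode step in the replacement release_pages() can be exercised outside the kernel with a simplified model. Everything here is an assumption made for illustration: the encoding (flags in the low two bits of a pointer, a count stored shifted left by two in the following entry), and the names `FLAG_NR_NEXT`, `encode_obj()`, `encode_count()`, `batch_put_refs()` and `release_encoded()` are hypothetical, and plain integers replace real refcounts. The sketch declares `refs` as `unsigned int` to match the `folios_put_refs()` prototype in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define BATCH_SIZE	15		/* stands in for PAGEVEC_SIZE */
#define FLAG_MASK	((uintptr_t)3)	/* hypothetical: low two bits carry flags */
#define FLAG_NR_NEXT	((uintptr_t)2)	/* hypothetical: next entry holds a count */

struct obj {
	unsigned int refcount;
};

/* Pack an object pointer and flag bits into one array entry. */
static uintptr_t encode_obj(struct obj *o, uintptr_t flags)
{
	return (uintptr_t)o | flags;
}

/* Pack a reference count into the entry that follows a flagged pointer. */
static uintptr_t encode_count(unsigned int nr)
{
	return (uintptr_t)nr << 2;
}

/* Subtract refs[i] (or 1 if @refs is NULL) from each object, then empty the batch. */
static void batch_put_refs(struct obj **objs, unsigned int *nr,
			   const unsigned int *refs)
{
	unsigned int i;

	for (i = 0; i < *nr; i++) {
		unsigned int n = refs ? refs[i] : 1;

		assert(objs[i]->refcount >= n);
		objs[i]->refcount -= n;
	}
	*nr = 0;
}

/* Decode the array into (object, refs) pairs and flush one batch at a time. */
static void release_encoded(const uintptr_t *enc, int nr)
{
	struct obj *objs[BATCH_SIZE];
	unsigned int refs[BATCH_SIZE];
	unsigned int count = 0;
	int i;

	for (i = 0; i < nr; i++) {
		struct obj *o = (struct obj *)(enc[i] & ~FLAG_MASK);

		/* Default to one ref; a flagged entry consumes the next slot. */
		refs[count] = 1;
		if (enc[i] & FLAG_NR_NEXT)
			refs[count] = (unsigned int)(enc[++i] >> 2);

		objs[count++] = o;
		if (count < BATCH_SIZE)
			continue;
		batch_put_refs(objs, &count, refs);
	}

	if (count)
		batch_put_refs(objs, &count, refs);
}
```

Note that `refs[]` is indexed by the batch slot, not by the input index, which is why a flagged entry that consumes two input slots still occupies a single batch slot.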
David Hildenbrand Feb. 19, 2024, 3:31 p.m. UTC | #3
On 19.02.24 16:03, Matthew Wilcox wrote:
> On Mon, Feb 19, 2024 at 10:43:06AM +0100, David Hildenbrand wrote:
>> On 17.02.24 03:25, Matthew Wilcox (Oracle) wrote:
>>> By making release_pages() call folios_put(), we can get rid of the calls
>>> to compound_head() for the callers that already know they have folios.
>>> We can also get rid of the lock_batch tracking as we know the size
>>> of the batch is limited by folio_batch.  This does reduce the maximum
>>> number of pages for which the lruvec lock is held, from SWAP_CLUSTER_MAX
>>> (32) to PAGEVEC_SIZE (15).  I do not expect this to make a significant
>>> difference, but if it does, we can increase PAGEVEC_SIZE to 31.
>>>
>>
>> I'm afraid that won't apply to current mm-unstable anymore, where we can now
>> put multiple references to a single folio (as part of unmapping
>> large PTE-mapped folios).
> 
> Argh.  I'm not a huge fan of that approach, but let's live with it for
> now.  

I'm hoping we at least can get rid of page ranges at some point (and 
just have folio + nr_refs), but for the time being there is no way 
around that due to delayed rmap handling that needs the exact pages (ugh).

folios_put_refs() does sound reasonable in any case, although likely 
"putting multiple references" is limited to zap/munmap/... code paths.

> How about this as a replacement patch?  It compiles ...
> 

Nothing jumped out at me, one comment:

[...]

> +EXPORT_SYMBOL(folios_put);
> +
> +/**
> + * release_pages - batched put_page()
> + * @arg: array of pages to release
> + * @nr: number of pages
> + *
> + * Decrement the reference count on all the pages in @arg.  If it
> + * fell to zero, remove the page from the LRU and free it.
> + *
> + * Note that the argument can be an array of pages, encoded pages,
> + * or folio pointers. We ignore any encoded bits, and turn any of
> + * them into just a folio that gets free'd.
> + */
> +void release_pages(release_pages_arg arg, int nr)
> +{
> +	struct folio_batch fbatch;
> +	int refs[PAGEVEC_SIZE];
> +	struct encoded_page **encoded = arg.encoded_pages;
> +	int i;
> +
> +	folio_batch_init(&fbatch);
> +	for (i = 0; i < nr; i++) {
> +		/* Turn any of the argument types into a folio */
> +		struct folio *folio = page_folio(encoded_page_ptr(encoded[i]));
> +
> +		/* Is our next entry actually "nr_pages" -> "nr_refs" ? */
> +		refs[fbatch.nr] = 1;
> +		if (unlikely(encoded_page_flags(encoded[i]) &
> +			     ENCODED_PAGE_BIT_NR_PAGES_NEXT))
> +			refs[fbatch.nr] = encoded_nr_pages(encoded[++i]);
> +
> +		if (folio_batch_add(&fbatch, folio) > 0)
> +			continue;
> +		folios_put_refs(&fbatch, refs);
> +	}
> +
> +	if (fbatch.nr)
> +		folios_put_refs(&fbatch, refs);

I wonder if it makes sense to remember if any ref != 1, and simply call 
folios_put() if that's not the case.

But I guess the whole point about PAGEVEC_SIZE is that it is very 
cache-friendly and traversing it a second time (e.g., when all we are 
doing is freeing order-0 folios) is not too expensive.
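
One way to read the suggestion above is to track a single flag while the batch is being filled, and pass NULL for the refs array when every entry asked for exactly one reference. The sketch below assumes that reading; `struct ref_batch`, `ref_batch_add()`, `ref_batch_arg()` and `ref_batch_reset()` are hypothetical names and nothing here is taken from the posted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH_SIZE 15	/* stands in for PAGEVEC_SIZE */

/* Hypothetical per-batch bookkeeping for reference counts. */
struct ref_batch {
	unsigned int nr;
	unsigned int refs[BATCH_SIZE];
	bool multi;	/* set once any entry asked for more than one ref */
};

/* Record the ref count for the next slot and update the flag. */
static void ref_batch_add(struct ref_batch *b, unsigned int refs)
{
	assert(b->nr < BATCH_SIZE);
	b->refs[b->nr++] = refs;
	if (refs != 1)
		b->multi = true;
}

/*
 * Pick the argument for a folios_put_refs()-style call: NULL selects the
 * "one reference each" path, so the refs array is never read in that case.
 */
static const unsigned int *ref_batch_arg(const struct ref_batch *b)
{
	return b->multi ? b->refs : NULL;
}

/* Empty the bookkeeping after a flush so the next batch starts clean. */
static void ref_batch_reset(struct ref_batch *b)
{
	b->nr = 0;
	b->multi = false;
}
```

Whether the extra branch pays for itself is exactly the open question in the thread, since a 15-entry array is small enough that reading it again is cheap.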
Matthew Wilcox Feb. 19, 2024, 4:07 p.m. UTC | #4
On Mon, Feb 19, 2024 at 04:31:14PM +0100, David Hildenbrand wrote:
> I'm hoping we at least can get rid of page ranges at some point (and just
> have folio + nr_refs), but for the time being there is no way around that
> due to delayed rmap handling that needs the exact pages (ugh).

Yup.  I've looked at pulling some of that apart, but realistically it's
not going to happen soon.

> folios_put_refs() does sound reasonable in any case, although likely
> "putting multiple references" is limited to zap/munmap/... code paths.

Well ... maybe.  We have a few places where we call folio_put_refs(),
and maybe some of them could be batched.  unpin_user_pages_dirty_lock()
is a candidate, but I wouldn't be surprised if someone inventive could
find a way to do something similar in the filemap_free_folio() paths.
Although the real solution there is to make the pagecache reference
count once, not N times.

> > +EXPORT_SYMBOL(folios_put);

heh, forgot to change that line.  A full compile (as opposed to just mm/)
picked it up.

> > +	if (fbatch.nr)
> > +		folios_put_refs(&fbatch, refs);
> 
> I wonder if it makes sense to remember if any ref != 1, and simply call
> folios_put() if that's not the case.
> 
> But I guess the whole point about PAGEVEC_SIZE is that it is very
> cache-friendly and traversing it a second time (e.g., when all we are doing
> is freeing order-0 folios) is not too expensive.

I don't think we need to add that; it'd certainly be something we could
look at though.

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6095c86aa040..2a1ebda5fb79 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -36,6 +36,7 @@  struct anon_vma;
 struct anon_vma_chain;
 struct user_struct;
 struct pt_regs;
+struct folio_batch;
 
 extern int sysctl_page_lock_unfairness;
 
@@ -1532,23 +1533,7 @@  typedef union {
 } release_pages_arg __attribute__ ((__transparent_union__));
 
 void release_pages(release_pages_arg, int nr);
-
-/**
- * folios_put - Decrement the reference count on an array of folios.
- * @folios: The folios.
- * @nr: How many folios there are.
- *
- * Like folio_put(), but for an array of folios.  This is more efficient
- * than writing the loop yourself as it will optimise the locks which
- * need to be taken if the folios are freed.
- *
- * Context: May be called in process or interrupt context, but not in NMI
- * context.  May be called while holding a spinlock.
- */
-static inline void folios_put(struct folio **folios, unsigned int nr)
-{
-	release_pages(folios, nr);
-}
+void folios_put(struct folio_batch *folios);
 
 static inline void put_page(struct page *page)
 {
diff --git a/mm/mlock.c b/mm/mlock.c
index 086546ac5766..1ed2f2ab37cd 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -206,8 +206,7 @@  static void mlock_folio_batch(struct folio_batch *fbatch)
 
 	if (lruvec)
 		unlock_page_lruvec_irq(lruvec);
-	folios_put(fbatch->folios, folio_batch_count(fbatch));
-	folio_batch_reinit(fbatch);
+	folios_put(fbatch);
 }
 
 void mlock_drain_local(void)
diff --git a/mm/swap.c b/mm/swap.c
index cd8f0150ba3a..7bdc63b56859 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -89,7 +89,7 @@  static void __page_cache_release(struct folio *folio)
 		__folio_clear_lru_flags(folio);
 		unlock_page_lruvec_irqrestore(lruvec, flags);
 	}
-	/* See comment on folio_test_mlocked in release_pages() */
+	/* See comment on folio_test_mlocked in folios_put() */
 	if (unlikely(folio_test_mlocked(folio))) {
 		long nr_pages = folio_nr_pages(folio);
 
@@ -175,7 +175,7 @@  static void lru_add_fn(struct lruvec *lruvec, struct folio *folio)
 	 * while the LRU lock is held.
 	 *
 	 * (That is not true of __page_cache_release(), and not necessarily
-	 * true of release_pages(): but those only clear the mlocked flag after
+	 * true of folios_put(): but those only clear the mlocked flag after
 	 * folio_put_testzero() has excluded any other users of the folio.)
 	 */
 	if (folio_evictable(folio)) {
@@ -221,8 +221,7 @@  static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
 
 	if (lruvec)
 		unlock_page_lruvec_irqrestore(lruvec, flags);
-	folios_put(fbatch->folios, folio_batch_count(fbatch));
-	folio_batch_reinit(fbatch);
+	folios_put(fbatch);
 }
 
 static void folio_batch_add_and_move(struct folio_batch *fbatch,
@@ -946,41 +945,27 @@  void lru_cache_disable(void)
 }
 
 /**
- * release_pages - batched put_page()
- * @arg: array of pages to release
- * @nr: number of pages
+ * folios_put - Decrement the reference count on a batch of folios.
+ * @folios: The folios.
  *
- * Decrement the reference count on all the pages in @arg.  If it
- * fell to zero, remove the page from the LRU and free it.
+ * Like folio_put(), but for a batch of folios.  This is more efficient
+ * than writing the loop yourself as it will optimise the locks which need
+ * to be taken if the folios are freed.  The folios batch is returned
+ * empty and ready to be reused for another batch; there is no need to
+ * reinitialise it.
  *
- * Note that the argument can be an array of pages, encoded pages,
- * or folio pointers. We ignore any encoded bits, and turn any of
- * them into just a folio that gets free'd.
+ * Context: May be called in process or interrupt context, but not in NMI
+ * context.  May be called while holding a spinlock.
  */
-void release_pages(release_pages_arg arg, int nr)
+void folios_put(struct folio_batch *folios)
 {
 	int i;
-	struct encoded_page **encoded = arg.encoded_pages;
 	LIST_HEAD(pages_to_free);
 	struct lruvec *lruvec = NULL;
 	unsigned long flags = 0;
-	unsigned int lock_batch;
 
-	for (i = 0; i < nr; i++) {
-		struct folio *folio;
-
-		/* Turn any of the argument types into a folio */
-		folio = page_folio(encoded_page_ptr(encoded[i]));
-
-		/*
-		 * Make sure the IRQ-safe lock-holding time does not get
-		 * excessive with a continuous string of pages from the
-		 * same lruvec. The lock is held only if lruvec != NULL.
-		 */
-		if (lruvec && ++lock_batch == SWAP_CLUSTER_MAX) {
-			unlock_page_lruvec_irqrestore(lruvec, flags);
-			lruvec = NULL;
-		}
+	for (i = 0; i < folios->nr; i++) {
+		struct folio *folio = folios->folios[i];
 
 		if (is_huge_zero_page(&folio->page))
 			continue;
@@ -1010,13 +995,8 @@  void release_pages(release_pages_arg arg, int nr)
 		}
 
 		if (folio_test_lru(folio)) {
-			struct lruvec *prev_lruvec = lruvec;
-
 			lruvec = folio_lruvec_relock_irqsave(folio, lruvec,
 									&flags);
-			if (prev_lruvec != lruvec)
-				lock_batch = 0;
-
 			lruvec_del_folio(lruvec, folio);
 			__folio_clear_lru_flags(folio);
 		}
@@ -1040,6 +1020,40 @@  void release_pages(release_pages_arg arg, int nr)
 
 	mem_cgroup_uncharge_list(&pages_to_free);
 	free_unref_page_list(&pages_to_free);
+	folios->nr = 0;
+}
+EXPORT_SYMBOL(folios_put);
+
+/**
+ * release_pages - batched put_page()
+ * @arg: array of pages to release
+ * @nr: number of pages
+ *
+ * Decrement the reference count on all the pages in @arg.  If it
+ * fell to zero, remove the page from the LRU and free it.
+ *
+ * Note that the argument can be an array of pages, encoded pages,
+ * or folio pointers. We ignore any encoded bits, and turn any of
+ * them into just a folio that gets free'd.
+ */
+void release_pages(release_pages_arg arg, int nr)
+{
+	struct folio_batch fbatch;
+	struct encoded_page **encoded = arg.encoded_pages;
+	int i;
+
+	folio_batch_init(&fbatch);
+	for (i = 0; i < nr; i++) {
+		/* Turn any of the argument types into a folio */
+		struct folio *folio = page_folio(encoded_page_ptr(encoded[i]));
+
+		if (folio_batch_add(&fbatch, folio) > 0)
+			continue;
+		folios_put(&fbatch);
+	}
+
+	if (fbatch.nr)
+		folios_put(&fbatch);
 }
 EXPORT_SYMBOL(release_pages);