[GIT,PULL] Btrfs updates for 5.16

Message ID cover.1635773880.git.dsterba@suse.com (mailing list archive)
State New, archived

Pull-request

git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.16-tag

Message

David Sterba Nov. 1, 2021, 4:45 p.m. UTC
Hi,

the updates this time are mostly under the hood and enhance existing
features (subpage with compression and zoned namespaces).

There's a merge conflict due to the last minute 5.15 changes (kmap
reverts) and the conflict is not trivial. I've prepared another branch
that makes the changes as a separate commit, which is possibly easier to
merge; it is the same as what the manual merge diff should look like.

The branch is for-5.16-resolve (with a signed tag for-5.16-resolve-tag).
A git merge that resolves the conflicts by keeping the new code results
in a zero merge diff.

If there's anything else I can do regarding the merge, let me know.
Otherwise, please pull, thanks.


Performance related:

* misc small inode logging improvements (+3% throughput, -11% latency on
  a sample dbench workload)

* more efficient directory logging: bulk item insertion, fewer tree
  searches and less locking

* speed up bulk insertion of items into a b-tree, which is used when
  logging directories, when running delayed items for directories (fsync
  and transaction commits) and when running the slow path (full sync) of
  an fsync (bulk creation run time -4%, deletion -12%)
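
The bulk insertion mentioned in the items above can be illustrated with a
small userspace sketch. It assumes a leaf modeled as a sorted array of
integer keys; `struct leaf`, `LEAF_CAP`, `find_slot()`, `insert_batch()`
and `insert_one()` are hypothetical names made up for illustration and are
not btrfs code. What it tries to show is only that a batch of adjacent keys
needs a single shift of the tail, while item-by-item insertion shifts the
tail once per item.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LEAF_CAP 64 /* hypothetical capacity, for illustration only */

/* Toy leaf: a sorted array of keys. Not the btrfs on-disk format. */
struct leaf {
	unsigned long keys[LEAF_CAP];
	size_t nr;
	size_t shifts; /* how many times the tail had to be moved */
};

/* Binary search for the first slot whose key is >= key. */
static size_t find_slot(const struct leaf *l, unsigned long key)
{
	size_t lo = 0, hi = l->nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (l->keys[mid] < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/*
 * Insert n keys that the caller has already sorted and that all fall into
 * the same gap of the leaf. The tail is shifted once for the whole batch.
 * Returns 0 on success, -1 if there is no room or the batch spans a gap.
 */
static int insert_batch(struct leaf *l, const unsigned long *batch, size_t n)
{
	size_t slot;

	assert(l->nr <= LEAF_CAP);
	if (n == 0)
		return 0;
	if (l->nr + n > LEAF_CAP)
		return -1;

	slot = find_slot(l, batch[0]);
	/* The whole batch must sort before the key currently at 'slot'. */
	if (slot < l->nr && l->keys[slot] <= batch[n - 1])
		return -1;

	if (slot < l->nr) {
		memmove(&l->keys[slot + n], &l->keys[slot],
			(l->nr - slot) * sizeof(l->keys[0]));
		l->shifts++;
	}
	memcpy(&l->keys[slot], batch, n * sizeof(batch[0]));
	l->nr += n;
	return 0;
}

/* Item-by-item insertion: one search and up to one shift per key. */
static int insert_one(struct leaf *l, unsigned long key)
{
	return insert_batch(l, &key, 1);
}
```

Inserting three keys into the middle of a leaf with `insert_batch()` moves
the tail once; doing the same with three `insert_one()` calls moves it three
times and ends with the same key order.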

Core:

* continued subpage support
  * make defragmentation work
  * make compression write work

* zoned mode
  * support ZNS (zoned namespaces), zone capacity is the number of usable
    blocks in each zone
  * add dedicated block group (zoned) for relocation, to prevent out of
    order writes in some cases
  * greedy block group reclaim, pick the ones with the least usable space
    first

* preparatory work for send protocol updates

* error handling improvements

* cleanups and refactoring
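
The greedy reclaim item in the zoned list above amounts to ordering the
reclaim candidates before processing them. A minimal sketch, assuming a
simplified candidate record: `struct bg_candidate`, its `usable` field and
`sort_reclaim_candidates()` are hypothetical names for illustration, not the
btrfs implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a block group queued for reclaim. */
struct bg_candidate {
	uint64_t start;  /* logical start, used here only as an identifier */
	uint64_t usable; /* bytes still usable in the group */
};

/* qsort() comparator: least usable space first. */
static int cmp_usable(const void *a, const void *b)
{
	const struct bg_candidate *x = a;
	const struct bg_candidate *y = b;

	if (x->usable < y->usable)
		return -1;
	if (x->usable > y->usable)
		return 1;
	return 0;
}

/* Order the candidates so the reclaim loop can walk them front to back. */
static void sort_reclaim_candidates(struct bg_candidate *bgs, size_t nr)
{
	assert(bgs != NULL);
	qsort(bgs, nr, sizeof(*bgs), cmp_usable);
}
```

After the call, the first array element is the group with the least usable
space, which matches the "pick first" order described in the list.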

Fixes:

* lockdep warnings
  * in show_devname callback, on seeding device
  * device delete on loop device due to conversions to workqueues

* fix deadlock between chunk allocation and chunk btree modifications

* fix tracking of missing device count and status

----------------------------------------------------------------
The following changes since commit 3906fe9bb7f1a2c8667ae54e967dc8690824f4ea:

  Linux 5.15-rc7 (2021-10-25 11:30:31 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.16-tag

for you to fetch changes up to d1ed82f3559e151804743df0594f45d7ff6e55fa:

  btrfs: remove root argument from check_item_in_log() (2021-10-29 12:39:13 +0200)

----------------------------------------------------------------
Anand Jain (12):
      btrfs: drop unnecessary ret in ioctl_quota_rescan_status
      btrfs: rename and switch to bool btrfs_chunk_readonly
      btrfs: convert latest_bdev type to btrfs_device and rename
      btrfs: use latest_dev in btrfs_show_devname
      btrfs: update latest_dev when we create a sprout device
      btrfs: remove stale comment about the btrfs_show_devname
      btrfs: reduce btrfs_update_block_group alloc argument to bool
      btrfs: use num_device to check for the last surviving seed device
      btrfs: add comments for device counts in struct btrfs_fs_devices
      btrfs: sysfs: convert scnprintf and snprintf to sysfs_emit
      btrfs: fix comment about sector sizes supported in 64K systems
      btrfs: call btrfs_check_rw_degradable only if there is a missing device

Christoph Hellwig (2):
      btrfs: use bvec_kmap_local in btrfs_csum_one_bio
      btrfs: check-integrity: stop storing the block device name in btrfsic_dev_state

David Sterba (1):
      btrfs: send: prepare for v2 protocol

Filipe Manana (26):
      btrfs: check if a log tree exists at inode_logged()
      btrfs: remove no longer needed checks for NULL log context
      btrfs: do not log new dentries when logging that a new name exists
      btrfs: always update the logged transaction when logging new names
      btrfs: avoid expensive search when dropping inode items from log
      btrfs: add helper to truncate inode items when logging inode
      btrfs: avoid expensive search when truncating inode items from the log
      btrfs: avoid search for logged i_size when logging inode if possible
      btrfs: avoid attempt to drop extents when logging inode for the first time
      btrfs: do not commit delayed inode when logging a file in full sync mode
      btrfs: remove root argument from btrfs_log_inode() and its callees
      btrfs: remove redundant log root assignment from log_dir_items()
      btrfs: factor out the copying loop of dir items from log_dir_items()
      btrfs: insert items in batches when logging a directory when possible
      btrfs: keep track of the last logged keys when logging a directory
      btrfs: assert that extent buffers are write locked instead of only locked
      btrfs: loop only once over data sizes array when inserting an item batch
      btrfs: unexport setup_items_for_insert()
      btrfs: use single bulk copy operations when logging directories
      btrfs: fix lost error handling when replaying directory deletes
      btrfs: fix deadlock between chunk allocation and chunk btree modifications
      btrfs: update comments for chunk allocation -ENOSPC cases
      btrfs: remove root argument from drop_one_dir_item()
      btrfs: remove root argument from btrfs_unlink_inode()
      btrfs: remove root argument from add_link()
      btrfs: remove root argument from check_item_in_log()

Johannes Thumshirn (9):
      btrfs: introduce btrfs_is_data_reloc_root
      btrfs: zoned: add a dedicated data relocation block group
      btrfs: zoned: only allow one process to add pages to a relocation inode
      btrfs: zoned: use regular writes for relocation
      btrfs: check for relocation inodes on zoned btrfs in should_nocow
      btrfs: zoned: allow preallocation for relocation inodes
      btrfs: rename setup_extent_mapping in relocation code
      btrfs: zoned: let the for_treelog test in the allocator stand out
      btrfs: zoned: use greedy gc for auto reclaim

Josef Bacik (11):
      btrfs: do not take the uuid_mutex in btrfs_rm_device
      btrfs: change handle_fs_error in recover_log_trees to aborts
      btrfs: change error handling for btrfs_delete_*_in_log
      btrfs: add a BTRFS_FS_ERROR helper
      btrfs: do not infinite loop in data reclaim if we aborted
      btrfs: do not call close_fs_devices in btrfs_rm_device
      btrfs: handle device lookup with btrfs_dev_lookup_args
      btrfs: add a btrfs_get_dev_args_from_path helper
      btrfs: use btrfs_get_dev_args_from_path in dev removal ioctls
      fs: export an inode_update_time helper
      btrfs: update device path inode time instead of bd_inode

Kai Song (1):
      btrfs: zoned: use kmemdup() to replace kmalloc + memcpy

Li Zhang (1):
      btrfs: clear MISSING device status bit in btrfs_close_one_device

Marcos Paulo de Souza (1):
      btrfs: send: simplify send_create_inode_if_needed

Naohiro Aota (17):
      btrfs: zoned: load zone capacity information from devices
      btrfs: zoned: move btrfs_free_excluded_extents out of btrfs_calc_zone_unusable
      btrfs: zoned: calculate free space from zone capacity
      btrfs: zoned: tweak reclaim threshold for zone capacity
      btrfs: zoned: consider zone as full when no more SB can be written
      btrfs: zoned: locate superblock position using zone capacity
      btrfs: zoned: finish superblock zone once no space left for new SB
      btrfs: zoned: load active zone information from devices
      btrfs: zoned: introduce physical_map to btrfs_block_group
      btrfs: zoned: implement active zone tracking
      btrfs: zoned: load active zone info for block group
      btrfs: zoned: activate block group on allocation
      btrfs: zoned: activate new block group
      btrfs: move ffe_ctl one level up
      btrfs: zoned: avoid chunk allocation if active block group has enough space
      btrfs: zoned: finish fully written block group
      btrfs: zoned: finish relocating block group

Nikolay Borisov (6):
      btrfs: rename btrfs_alloc_chunk to btrfs_create_chunk
      btrfs: rename root fields in delayed refs structs
      btrfs: rely on owning_root field in btrfs_add_delayed_tree_ref to detect CHUNK_ROOT
      btrfs: add additional parameters to btrfs_init_tree_ref/btrfs_init_data_ref
      btrfs: pull up qgroup checks from delayed-ref core to init time
      btrfs: make btrfs_ref::real_root optional

Omar Sandoval (1):
      btrfs: fix deadlock when defragging transparent huge pages

Qu Wenruo (50):
      btrfs: subpage: only call btrfs_alloc_subpage() when sectorsize is smaller than PAGE_SIZE
      btrfs: subpage: make btrfs_alloc_subpage() return btrfs_subpage directly
      btrfs: subpage: introduce btrfs_subpage_bitmap_info
      btrfs: subpage: pack all subpage bitmaps into a larger bitmap
      btrfs: defrag: pass file_ra_state instead of file to btrfs_defrag_file()
      btrfs: defrag: also check PagePrivate for subpage cases in cluster_pages_for_defrag()
      btrfs: defrag: replace hard coded PAGE_SIZE with sectorsize
      btrfs: defrag: factor out page preparation into a helper
      btrfs: defrag: introduce helper to collect target file extents
      btrfs: defrag: introduce helper to defrag a contiguous prepared range
      btrfs: defrag: introduce helper to defrag a range
      btrfs: defrag: introduce helper to defrag one cluster
      btrfs: defrag: use defrag_one_cluster() to implement btrfs_defrag_file()
      btrfs: defrag: remove the old infrastructure
      btrfs: defrag: enable defrag for subpage case
      btrfs: unexport repair_io_failure()
      btrfs: rename btrfs_bio to btrfs_io_context
      btrfs: remove btrfs_bio_alloc() helper
      btrfs: rename struct btrfs_io_bio to btrfs_bio
      btrfs: make sure btrfs_io_context::fs_info is always initialized
      btrfs: remove btrfs_raid_bio::fs_info member
      btrfs: remove unused parameter nr_pages in add_ra_bio_pages()
      btrfs: remove unnecessary parameter delalloc_start for writepage_delalloc()
      btrfs: use async_chunk::async_cow to replace the confusing pending pointer
      btrfs: don't pass compressed pages to btrfs_writepage_endio_finish_ordered()
      btrfs: subpage: make add_ra_bio_pages() compatible
      btrfs: introduce compressed_bio::pending_sectors to trace compressed bio
      btrfs: subpage: add bitmap for PageChecked flag
      btrfs: handle errors properly inside btrfs_submit_compressed_read()
      btrfs: handle errors properly inside btrfs_submit_compressed_write()
      btrfs: introduce submit_compressed_bio() for compression
      btrfs: introduce alloc_compressed_bio() for compression
      btrfs: determine stripe boundary at bio allocation time in btrfs_submit_compressed_read
      btrfs: determine stripe boundary at bio allocation time in btrfs_submit_compressed_write
      btrfs: remove unused function btrfs_bio_fits_in_stripe()
      btrfs: refactor submit_compressed_extents()
      btrfs: cleanup for extent_write_locked_range()
      btrfs: subpage: make compress_file_range() compatible
      btrfs: subpage: make btrfs_submit_compressed_write() compatible
      btrfs: subpage: make end_compressed_bio_writeback() compatible
      btrfs: subpage: make extent_write_locked_range() compatible
      btrfs: factor uncompressed async extent submission code into a new helper
      btrfs: subpage: make lzo_compress_pages() compatible
      btrfs: rework page locking in __extent_writepage()
      btrfs: handle page locking in btrfs_page_end_writer_lock with no writers
      btrfs: subpage: avoid potential deadlock with compression and delalloc
      btrfs: subpage: only allow compression if the range is fully page aligned
      btrfs: rename btrfs_dio_private::logical_offset to file_offset
      btrfs: remove btrfs_bio::logical member
      btrfs: make btrfs_super_block size match BTRFS_SUPER_INFO_SIZE

Sidong Yang (1):
      btrfs: reflink: initialize return value to 0 in btrfs_extent_same()

Su Yue (1):
      btrfs: update comment for fs_devices::seed_list in btrfs_rm_device

 fs/btrfs/block-group.c               |  242 +++++---
 fs/btrfs/block-group.h               |    8 +-
 fs/btrfs/btrfs_inode.h               |   46 +-
 fs/btrfs/check-integrity.c           |  205 +++----
 fs/btrfs/compression.c               |  681 +++++++++++++----------
 fs/btrfs/compression.h               |    4 +-
 fs/btrfs/ctree.c                     |  156 +++---
 fs/btrfs/ctree.h                     |   84 ++-
 fs/btrfs/delayed-inode.c             |   41 +-
 fs/btrfs/delayed-ref.c               |   17 +-
 fs/btrfs/delayed-ref.h               |   51 +-
 fs/btrfs/dev-replace.c               |   16 +-
 fs/btrfs/disk-io.c                   |   51 +-
 fs/btrfs/disk-io.h                   |    5 +-
 fs/btrfs/extent-tree.c               |  326 +++++++----
 fs/btrfs/extent_io.c                 |  334 ++++++-----
 fs/btrfs/extent_io.h                 |   10 +-
 fs/btrfs/extent_map.c                |    4 +-
 fs/btrfs/file-item.c                 |   21 +-
 fs/btrfs/file.c                      |   35 +-
 fs/btrfs/free-space-cache.c          |   24 +-
 fs/btrfs/inode.c                     |  611 +++++++++++----------
 fs/btrfs/ioctl.c                     | 1004 ++++++++++++++++------------------
 fs/btrfs/locking.h                   |    7 +-
 fs/btrfs/lzo.c                       |  270 +++++----
 fs/btrfs/raid56.c                    |  175 +++---
 fs/btrfs/raid56.h                    |   22 +-
 fs/btrfs/reada.c                     |   26 +-
 fs/btrfs/ref-verify.c                |    4 +-
 fs/btrfs/reflink.c                   |    4 +-
 fs/btrfs/relocation.c                |   81 +--
 fs/btrfs/scrub.c                     |  139 ++---
 fs/btrfs/send.c                      |   38 +-
 fs/btrfs/send.h                      |    7 +
 fs/btrfs/space-info.c                |   28 +-
 fs/btrfs/subpage.c                   |  290 +++++++---
 fs/btrfs/subpage.h                   |   56 +-
 fs/btrfs/super.c                     |   28 +-
 fs/btrfs/sysfs.c                     |   93 ++--
 fs/btrfs/tests/extent-buffer-tests.c |    2 +-
 fs/btrfs/tests/extent-io-tests.c     |   12 +-
 fs/btrfs/tests/inode-tests.c         |    4 +-
 fs/btrfs/transaction.c               |   11 +-
 fs/btrfs/tree-log.c                  |  745 ++++++++++++++++---------
 fs/btrfs/tree-log.h                  |   18 +-
 fs/btrfs/volumes.c                   |  592 +++++++++++---------
 fs/btrfs/volumes.h                   |  119 ++--
 fs/btrfs/xattr.c                     |    2 +-
 fs/btrfs/zoned.c                     |  531 ++++++++++++++++--
 fs/btrfs/zoned.h                     |   39 +-
 fs/inode.c                           |    7 +-
 include/linux/fs.h                   |    2 +
 include/uapi/linux/btrfs.h           |   11 +-
 53 files changed, 4439 insertions(+), 2900 deletions(-)

Comments

Linus Torvalds Nov. 1, 2021, 8:03 p.m. UTC | #1
On Mon, Nov 1, 2021 at 9:46 AM David Sterba <dsterba@suse.com> wrote:
>
> There's a merge conflict due to the last minute 5.15 changes (kmap
> reverts) and the conflict is not trivial.

You don't say.

I ended up just re-doing that resolution entirely, and as I did so, I
think I found a bug in the original revert that caused the conflict in
the first place.

And since that revert made it into 5.15, I felt like I had to fix that
bug first - and separately - so that the fix can be backported to
stable.

I then re-did my merge on top of that hopefully fixed state, and maybe
it's correct.

Or maybe I messed up entirely.

I did end up comparing it to your other branch too, but that was
equally as messy, apart from the "ok, I can mindlessly just take your
side".

And it was fairly different from what I had done in my merge
resolution, so who knows.

ANYWAY. What I'm trying to say is that you should look very very
carefully at commits

  2cf3f8133bda ("btrfs: fix lzo_decompress_bio() kmap leakage")
  037c50bfbeb3 ("Merge tag 'for-5.16-tag' of git://git.../linux")

because I marked that first one for stable, and the second is
obviously my entirely untested merge.

It makes sense to me, but apart from "it builds", I've not actually
tested any of it. This is all purely from looking at the code and
trying to figure out what the RightThing(tm) is.

Obviously the kmap thing tends to only be noticeable on 32-bit
platforms, and that lzo_decompress_bio() bug also needs just the
proper filesystem settings to trigger in the first place.

Again - please take a careful look. Both at my merge and at that
alleged kmap fix.

                          Linus

pr-tracker-bot@kernel.org Nov. 1, 2021, 8:04 p.m. UTC | #2
The pull request you sent on Mon,  1 Nov 2021 17:45:56 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-5.16-tag

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/037c50bfbeb33b4c74e120eef5b8b99d8f025418

Thank you!

Qu Wenruo Nov. 2, 2021, 12:08 a.m. UTC | #3
On 2021/11/2 04:03, Linus Torvalds wrote:
> On Mon, Nov 1, 2021 at 9:46 AM David Sterba <dsterba@suse.com> wrote:
>>
>> There's a merge conflict due to the last minute 5.15 changes (kmap
>> reverts) and the conflict is not trivial.
>
> You don't say.
>
> I ended up just re-doing that resolution entirely, and as I did so, I
> think I found a bug in the original revert that caused the conflict in
> the first place.
>
> And since that revert made it into 5.15, I felt like I had to fix that
> bug first - and separately - so that the fix can be backported to
> stable.
>
> I then re-did my merge on top of that hopefully fixed state, and maybe
> it's correct.
>
> Or maybe I messed up entirely.
>
> I did end up comparing it to your other branch too, but that was
> equally as messy, apart from the "ok, I can mindlessly just take your
> side".
>
> And it was fairly different from what I had done in my merge
> resolution, so who knows.
>
> ANYWAY. What I'm trying to say is that you should look very very
> carefully at commits
>
>    2cf3f8133bda ("btrfs: fix lzo_decompress_bio() kmap leakage")

Since I did the revert manually for the lzo part, I double-checked the code.

It turns out your fix is the same as the original version I sent to
David (although not through the mailing list).
Full patch attached.

@@ -345,8 +358,9 @@ int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
  		       (cur_in + LZO_LEN - 1) / sectorsize);
  		cur_page = cb->compressed_pages[cur_in / PAGE_SIZE];
  		ASSERT(cur_page);
-		seg_len = read_compress_length(page_address(cur_page) +
-					       offset_in_page(cur_in));
+		kaddr = kmap(cur_page);
+		seg_len = read_compress_length(kaddr + offset_in_page(cur_in));
+		kunmap(cur_page);
  		cur_in += LZO_LEN;

So it looks like my version somehow was not applied?

Thanks,
Qu
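
The map/unmap pairing in the hunk above can be modeled in userspace. This is
a sketch under stated assumptions: `mock_kmap()`, `mock_kunmap()`,
`struct mock_page` and the `mapped` counter are hypothetical stand-ins that
only count outstanding mappings, `read_le32()` assumes the length header is a
4-byte little-endian value, and `read_len_leaky()` is an illustrative
unbalanced variant rather than a copy of any kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical page stand-in; real struct page looks nothing like this. */
struct mock_page {
	uint8_t data[4096];
};

static int mapped; /* number of mappings not yet released */

/* Stand-ins for kmap()/kunmap(): they only track the balance. */
static uint8_t *mock_kmap(struct mock_page *p)
{
	mapped++;
	return p->data;
}

static void mock_kunmap(struct mock_page *p)
{
	(void)p;
	assert(mapped > 0);
	mapped--;
}

/* Decode a little-endian 32-bit value from buf. */
static uint32_t read_le32(const uint8_t *buf)
{
	return (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
	       ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
}

/* Balanced: map, read the header, unmap right away. */
static uint32_t read_len_paired(struct mock_page *p, size_t off)
{
	uint8_t *kaddr;
	uint32_t len;

	assert(off + 4 <= sizeof(p->data));
	kaddr = mock_kmap(p);
	len = read_le32(kaddr + off);
	mock_kunmap(p);
	return len;
}

/* Unbalanced: the mapping is never released, so 'mapped' grows per call. */
static uint32_t read_len_leaky(struct mock_page *p, size_t off)
{
	uint8_t *kaddr;

	assert(off + 4 <= sizeof(p->data));
	kaddr = mock_kmap(p);
	return read_le32(kaddr + off);
}
```

Both helpers return the same length, so the difference does not show up in
the decompressed data; only the `mapped` counter reveals that the second
variant leaves one mapping behind on every call.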

>    037c50bfbeb3 ("Merge tag 'for-5.16-tag' of git://git.../linux")
>
> because I marked that first one for stable, and the second is
> obviously my entirely untested merge.
>
> It makes sense to me, but apart from "it builds", I've not actually
> tested any of it. This is all purely from looking at the code and
> trying to figure out what the RightThing(tm) is.
>
> Obviously the kmap thing tends to only be noticeable on 32-bit
> platforms, and that lzo_decompress_bio() bug also needs just the
> proper filesystem settings to trigger in the first place.
>
> Again - please take a careful look. Both at my merge and at that
> alleged kmap fix.
>
>                            Linus
>

David Sterba Nov. 2, 2021, 2:32 p.m. UTC | #4
On Tue, Nov 02, 2021 at 08:08:22AM +0800, Qu Wenruo wrote:
> On 2021/11/2 04:03, Linus Torvalds wrote:
> > On Mon, Nov 1, 2021 at 9:46 AM David Sterba <dsterba@suse.com> wrote:
> > it's correct.
> >
> > Or maybe I messed up entirely.
> >
> > I did end up comparing it to your other branch too, but that was
> > equally as messy, apart from the "ok, I can mindlessly just take your
> > side".
> >
> > And it was fairly different from what I had done in my merge
> > resolution, so who knows.
> >
> > ANYWAY. What I'm trying to say is that you should look very very
> > carefully at commits
> >
> >    2cf3f8133bda ("btrfs: fix lzo_decompress_bio() kmap leakage")
> 
> Since I did the revert manually for the lzo part, I double-checked the code.
> 
> It turns out your fix is the same as the original version I sent to
> David (although not through the mailing list).
> Full patch attached.
> 
> @@ -345,8 +358,9 @@ int lzo_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
>   		       (cur_in + LZO_LEN - 1) / sectorsize);
>   		cur_page = cb->compressed_pages[cur_in / PAGE_SIZE];
>   		ASSERT(cur_page);
> -		seg_len = read_compress_length(page_address(cur_page) +
> -					       offset_in_page(cur_in));
> +		kaddr = kmap(cur_page);
> +		seg_len = read_compress_length(kaddr + offset_in_page(cur_in));
> +		kunmap(cur_page);
>   		cur_in += LZO_LEN;
> 
> So it looks like my version somehow was not applied?

Yeah, I had a look at what you sent me, that version was correct. The
mistake was on my side, a copy&paste error, sorry.

David Sterba Nov. 2, 2021, 2:50 p.m. UTC | #5
On Mon, Nov 01, 2021 at 01:03:25PM -0700, Linus Torvalds wrote:
> On Mon, Nov 1, 2021 at 9:46 AM David Sterba <dsterba@suse.com> wrote:
> >
> > There's a merge conflict due to the last minute 5.15 changes (kmap
> > reverts) and the conflict is not trivial.
> 
> You don't say.
> 
> I ended up just re-doing that resolution entirely, and as I did so, I
> think I found a bug in the original revert that caused the conflict in
> the first place.
> 
> And since that revert made it into 5.15, I felt like I had to fix that
> bug first - and separately - so that the fix can be backported to
> stable.
> 
> I then re-did my merge on top of that hopefully fixed state, and maybe
> it's correct.
> 
> Or maybe I messed up entirely.
> 
> I did end up comparing it to your other branch too, but that was
> equally as messy, apart from the "ok, I can mindlessly just take your
> side".
> 
> And it was fairly different from what I had done in my merge
> resolution, so who knows.

Looks good to me, thanks for catching the bug.

> ANYWAY. What I'm trying to say is that you should look very very
> carefully at commits
> 
>   2cf3f8133bda ("btrfs: fix lzo_decompress_bio() kmap leakage")
>   037c50bfbeb3 ("Merge tag 'for-5.16-tag' of git://git.../linux")
> 
> because I marked that first one for stable, and the second is
> obviously my entirely untested merge.
> 
> It makes sense to me, but apart from "it builds", I've not actually
> tested any of it. This is all purely from looking at the code and
> trying to figure out what the RightThing(tm) is.
> 
> Obviously the kmap thing tends to only be noticeable on 32-bit
> platforms, and that lzo_decompress_bio() bug also needs just the
> proper filesystem settings to trigger in the first place.
> 
> Again - please take a careful look. Both at my merge and at that
> alleged kmap fix.

I checked the commits individually and in the source files; all the
kmaps seem to be correctly paired with kunmaps. We'll do more tests too.
I'm sorry that it turned out to be such a mess.

Qu Wenruo Nov. 4, 2021, 11:40 a.m. UTC | #6
On 2021/11/2 22:50, David Sterba wrote:
> On Mon, Nov 01, 2021 at 01:03:25PM -0700, Linus Torvalds wrote:
>> On Mon, Nov 1, 2021 at 9:46 AM David Sterba <dsterba@suse.com> wrote:
>>>
>>> There's a merge conflict due to the last minute 5.15 changes (kmap
>>> reverts) and the conflict is not trivial.
>>
>> You don't say.
>>
>> I ended up just re-doing that resolution entirely, and as I did so, I
>> think I found a bug in the original revert that caused the conflict in
>> the first place.
>>
>> And since that revert made it into 5.15, I felt like I had to fix that
>> bug first - and separately - so that the fix can be backported to
>> stable.
>>
>> I then re-did my merge on top of that hopefully fixed state, and maybe
>> it's correct.
>>
>> Or maybe I messed up entirely.
>>
>> I did end up comparing it to your other branch too, but that was
>> equally as messy, apart from the "ok, I can mindlessly just take your
>> side".
>>
>> And it was fairly different from what I had done in my merge
>> resolution, so who knows.
>
> Looks good to me, thanks for catching the bug.
>
>> ANYWAY. What I'm trying to say is that you should look very very
>> carefully at commits
>>
>>    2cf3f8133bda ("btrfs: fix lzo_decompress_bio() kmap leakage")
>>    037c50bfbeb3 ("Merge tag 'for-5.16-tag' of git://git.../linux")
>>
>> because I marked that first one for stable, and the second is
>> obviously my entirely untested merge.
>>
>> It makes sense to me, but apart from "it builds", I've not actually
>> tested any of it. This is all purely from looking at the code and
>> trying to figure out what the RightThing(tm) is.
>>
>> Obviously the kmap thing tends to only be noticeable on 32-bit
>> platforms, and that lzo_decompress_bio() bug also needs just the
>> proper filesystem settings to trigger in the first place.
>>
>> Again - please take a careful look. Both at my merge and at that
>> alleged kmap fix.
>
> I checked the commits individually and in the source files; all the
> kmaps seem to be correctly paired with kunmaps. We'll do more tests too.
> I'm sorry that it turned out to be such a mess.
>

OK, something weird is going on now.

As an extra step to make sure there is no leakage, I ran fstests with
the "compress=lzo" mount option, but the result could hardly be uglier.

On the master branch (which includes the fix), there is a very strange
"leakage" behavior, at least in my 32-bit x86 VM.

The used memory reported by free suddenly skyrockets during
generic/027 (from the usual 100~200M to 800M and more),
easily causing OOM in my VM with 2G of RAM.

I originally thought it was btrfs LZO, but no.

On v5.15 tag with the fix from Linus, generic/027 runs fine as expected
with lzo compression.

BTW, zlib/zstd compression runs fine.

I guess something changed in the MM layer that causes the problem.
Btrfs LZO has tons of quick kmap()/kunmap() pairs, unlike what we do in
zlib/zstd, so I guess that may be a triggering factor.

Any clue?

Thanks,
Qu