diff mbox series

[1/3] btrfs: zlib: fix and simplify the inline extent decompression

Message ID 29b7793e53e1cdd559ad212ee69cec211a3b4db2.1704704328.git.wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: fix and simplify the inline extent decompression path for subpage | expand

Commit Message

Qu Wenruo Jan. 8, 2024, 9:08 a.m. UTC
[BUG]

If we have a filesystem with 4k sectorsize, and an inlined compressed
extent created like this:

	item 4 key (257 INODE_ITEM 0) itemoff 15863 itemsize 160
		generation 8 transid 8 size 4096 nbytes 4096
		block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0
		sequence 1 flags 0x0(none)
	item 5 key (257 INODE_REF 256) itemoff 15839 itemsize 24
		index 2 namelen 14 name: source_inlined
	item 6 key (257 EXTENT_DATA 0) itemoff 15770 itemsize 69
		generation 8 type 0 (inline)
		inline extent data size 48 ram_bytes 4096 compression 1 (zlib)

Which has an inline compressed extent at file offset 0, and its
decompressed size is 4K, allowing us to reflink that 4K range to another
location (which will not be compressed).

If we do such reflink on a subpage system, it would fail like this:

 # xfs_io -f -c "reflink $mnt/source_inlined 0 60k 4k" $mnt/dest
 XFS_IOC_CLONE_RANGE: Input/output error

[CAUSE]
In zlib_decompress(), we didn't treat @start_byte as just a page offset,
but also use it as an indicator on whether we should switch our output
buffer.

In reality, for subpage cases, although @start_byte can be non-zero,
we should never switch input/output buffer, since the whole input/output
buffer should never exceed one sector.
(The above assumption is only not true if we're going to support
multi-page sectorsize)

Thus the current code using @start_byte as a condition to switch
input/output buffer or finish the decompression is completely incorrect.

[FIX]
The fix involves several modification:

- Rename @start_byte to @dest_pgoff to properly express its meaning

- Add extra ASSERT() inside btrfs_decompress() to make sure the
  input/output size never exceed one sector.

- Use Z_FINISH flag to make sure the decompression happens in one go

- Remove all the loop needed to switch input/output buffer

- Use correct destination offset inside the destination page

- Consider early end as an error

After the fix, even on 64K page sized aarch64, above reflink now
works as expected:

 # xfs_io -f -c "reflink $mnt/source_inlined 0 60k 4k" $mnt/dest
 linked 4096/4096 bytes at offset 61440

And resulted a correct file layout:

	item 9 key (258 INODE_ITEM 0) itemoff 15542 itemsize 160
		generation 10 transid 10 size 65536 nbytes 4096
		block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0
		sequence 1 flags 0x0(none)
	item 10 key (258 INODE_REF 256) itemoff 15528 itemsize 14
		index 3 namelen 4 name: dest
	item 11 key (258 XATTR_ITEM 3817753667) itemoff 15445 itemsize 83
		location key (0 UNKNOWN.0 0) type XATTR
		transid 10 data_len 37 name_len 16
		name: security.selinux
		data unconfined_u:object_r:unlabeled_t:s0
	item 12 key (258 EXTENT_DATA 61440) itemoff 15392 itemsize 53
		generation 10 type 1 (regular)
		extent data disk byte 13631488 nr 4096
		extent data offset 0 nr 4096 ram 4096
		extent compression 0 (none)

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/compression.c | 23 +++++++++----
 fs/btrfs/compression.h |  2 +-
 fs/btrfs/super.h       |  3 ++
 fs/btrfs/zlib.c        | 73 +++++++++++-------------------------------
 4 files changed, 39 insertions(+), 62 deletions(-)

Comments

kernel test robot Jan. 9, 2024, 3:02 a.m. UTC | #1
Hi Qu,

kernel test robot noticed the following build warnings:

[auto build test WARNING on kdave/for-next]
[also build test WARNING on next-20240108]
[cannot apply to linus/master v6.7]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Qu-Wenruo/btrfs-zlib-fix-and-simplify-the-inline-extent-decompression/20240108-171206
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
patch link:    https://lore.kernel.org/r/29b7793e53e1cdd559ad212ee69cec211a3b4db2.1704704328.git.wqu%40suse.com
patch subject: [PATCH 1/3] btrfs: zlib: fix and simplify the inline extent decompression
config: i386-allmodconfig (https://download.01.org/0day-ci/archive/20240109/202401091012.pLm6PcKG-lkp@intel.com/config)
compiler: ClangBuiltLinux clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240109/202401091012.pLm6PcKG-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202401091012.pLm6PcKG-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> fs/btrfs/zlib.c:402:15: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
     401 |                 pr_warn_ratelimited("BTRFS: infalte failed, decompressed=%lu expected=%lu\n",
         |                                                                                       ~~~
         |                                                                                       %zu
     402 |                                         to_copy, destlen);
         |                                                  ^~~~~~~
   include/linux/printk.h:665:49: note: expanded from macro 'pr_warn_ratelimited'
     665 |         printk_ratelimited(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
         |                                                ~~~     ^~~~~~~~~~~
   include/linux/printk.h:649:17: note: expanded from macro 'printk_ratelimited'
     649 |                 printk(fmt, ##__VA_ARGS__);                             \
         |                        ~~~    ^~~~~~~~~~~
   include/linux/printk.h:455:60: note: expanded from macro 'printk'
     455 | #define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
         |                                                     ~~~    ^~~~~~~~~~~
   include/linux/printk.h:427:19: note: expanded from macro 'printk_index_wrap'
     427 |                 _p_func(_fmt, ##__VA_ARGS__);                           \
         |                         ~~~~    ^~~~~~~~~~~
   1 warning generated.


vim +402 fs/btrfs/zlib.c

   355	
   356	int zlib_decompress(struct list_head *ws, const u8 *data_in,
   357			struct page *dest_page, unsigned long dest_pgoff, size_t srclen,
   358			size_t destlen)
   359	{
   360		struct workspace *workspace = list_entry(ws, struct workspace, list);
   361		int ret = 0;
   362		int wbits = MAX_WBITS;
   363		unsigned long to_copy;
   364	
   365		workspace->strm.next_in = data_in;
   366		workspace->strm.avail_in = srclen;
   367		workspace->strm.total_in = 0;
   368	
   369		workspace->strm.next_out = workspace->buf;
   370		workspace->strm.avail_out = workspace->buf_size;
   371		workspace->strm.total_out = 0;
   372		/* If it's deflate, and it's got no preset dictionary, then
   373		   we can tell zlib to skip the adler32 check. */
   374		if (srclen > 2 && !(data_in[1] & PRESET_DICT) &&
   375		    ((data_in[0] & 0x0f) == Z_DEFLATED) &&
   376		    !(((data_in[0]<<8) + data_in[1]) % 31)) {
   377	
   378			wbits = -((data_in[0] >> 4) + 8);
   379			workspace->strm.next_in += 2;
   380			workspace->strm.avail_in -= 2;
   381		}
   382	
   383		if (Z_OK != zlib_inflateInit2(&workspace->strm, wbits)) {
   384			pr_warn("BTRFS: inflateInit failed\n");
   385			return -EIO;
   386		}
   387	
   388		/*
   389		 * Everything (in/out buf) should be at most one sector, there should
   390		 * be no need to switch any input/output buffer.
   391		 */
   392		ret = zlib_inflate(&workspace->strm, Z_FINISH);
   393		to_copy = min(workspace->strm.total_out, destlen);
   394		if (ret != Z_STREAM_END)
   395			goto out;
   396	
   397		memcpy_to_page(dest_page, dest_pgoff, workspace->buf, to_copy);
   398	
   399	out:
   400		if (unlikely(to_copy != destlen)) {
   401			pr_warn_ratelimited("BTRFS: infalte failed, decompressed=%lu expected=%lu\n",
 > 402						to_copy, destlen);
   403			ret = -EIO;
   404		} else {
   405			ret = 0;
   406		}
   407	
   408		zlib_inflateEnd(&workspace->strm);
   409	
   410		if (unlikely(to_copy < destlen))
   411			memzero_page(dest_page, dest_pgoff + to_copy, destlen - to_copy);
   412		return ret;
   413	}
   414
David Sterba Jan. 10, 2024, 1:59 a.m. UTC | #2
On Tue, Jan 09, 2024 at 11:02:54AM +0800, kernel test robot wrote:
> Hi Qu,
> 
> kernel test robot noticed the following build warnings:
> 
> [auto build test WARNING on kdave/for-next]
> [also build test WARNING on next-20240108]
> [cannot apply to linus/master v6.7]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Qu-Wenruo/btrfs-zlib-fix-and-simplify-the-inline-extent-decompression/20240108-171206
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
> patch link:    https://lore.kernel.org/r/29b7793e53e1cdd559ad212ee69cec211a3b4db2.1704704328.git.wqu%40suse.com
> patch subject: [PATCH 1/3] btrfs: zlib: fix and simplify the inline extent decompression
> config: i386-allmodconfig (https://download.01.org/0day-ci/archive/20240109/202401091012.pLm6PcKG-lkp@intel.com/config)
> compiler: ClangBuiltLinux clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240109/202401091012.pLm6PcKG-lkp@intel.com/reproduce)
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202401091012.pLm6PcKG-lkp@intel.com/
> 
> All warnings (new ones prefixed by >>):
> 
> >> fs/btrfs/zlib.c:402:15: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
>      401 |                 pr_warn_ratelimited("BTRFS: infalte failed, decompressed=%lu expected=%lu\n",
>          |                                                                                       ~~~
>          |                                                                                       %zu
>      402 |                                         to_copy, destlen);
>          |                                                  ^~~~~~~

Valid report but I can't reproduce it. Built with clang 17 and
explicitly enabled -Wformat. We have additional warnings enabled per
directory fs/btrfs/ so we can add -Wformat, I'd like to know what I'm
missing, we've had fixups for the size_t printk format so it would make
sense to catch it early.
Qu Wenruo Jan. 10, 2024, 2:03 a.m. UTC | #3
On 2024/1/10 12:29, David Sterba wrote:
> On Tue, Jan 09, 2024 at 11:02:54AM +0800, kernel test robot wrote:
>> Hi Qu,
>>
>> kernel test robot noticed the following build warnings:
>>
>> [auto build test WARNING on kdave/for-next]
>> [also build test WARNING on next-20240108]
>> [cannot apply to linus/master v6.7]
>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>> And when submitting patch, we suggest to use '--base' as documented in
>> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>>
>> url:    https://github.com/intel-lab-lkp/linux/commits/Qu-Wenruo/btrfs-zlib-fix-and-simplify-the-inline-extent-decompression/20240108-171206
>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
>> patch link:    https://lore.kernel.org/r/29b7793e53e1cdd559ad212ee69cec211a3b4db2.1704704328.git.wqu%40suse.com
>> patch subject: [PATCH 1/3] btrfs: zlib: fix and simplify the inline extent decompression
>> config: i386-allmodconfig (https://download.01.org/0day-ci/archive/20240109/202401091012.pLm6PcKG-lkp@intel.com/config)
>> compiler: ClangBuiltLinux clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
>> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240109/202401091012.pLm6PcKG-lkp@intel.com/reproduce)
>>
>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>> the same patch/commit), kindly add following tags
>> | Reported-by: kernel test robot <lkp@intel.com>
>> | Closes: https://lore.kernel.org/oe-kbuild-all/202401091012.pLm6PcKG-lkp@intel.com/
>>
>> All warnings (new ones prefixed by >>):
>>
>>>> fs/btrfs/zlib.c:402:15: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
>>       401 |                 pr_warn_ratelimited("BTRFS: infalte failed, decompressed=%lu expected=%lu\n",
>>           |                                                                                       ~~~
>>           |                                                                                       %zu
>>       402 |                                         to_copy, destlen);
>>           |                                                  ^~~~~~~
> 
> Valid report but I can't reproduce it. Built with clang 17 and
> explicitly enabled -Wformat. We have additional warnings enabled per
> directory fs/btrfs/ so we can add -Wformat, I'd like to know what I'm
> missing, we've had fixups for the size_t printk format so it would make
> sense to catch it early.

I guess it's due to the platform? (The report is from 32bit system).

Otherwise it's indeed my bad, for now I don't even have a 32bit VM to 
verify, thus my LSP doesn't warn me about the format.

Thanks,
Qu
David Sterba Jan. 10, 2024, 2:26 a.m. UTC | #4
On Wed, Jan 10, 2024 at 12:33:17PM +1030, Qu Wenruo wrote:
> 
> 
> On 2024/1/10 12:29, David Sterba wrote:
> > On Tue, Jan 09, 2024 at 11:02:54AM +0800, kernel test robot wrote:
> >> Hi Qu,
> >>
> >> kernel test robot noticed the following build warnings:
> >>
> >> [auto build test WARNING on kdave/for-next]
> >> [also build test WARNING on next-20240108]
> >> [cannot apply to linus/master v6.7]
> >> [If your patch is applied to the wrong git tree, kindly drop us a note.
> >> And when submitting patch, we suggest to use '--base' as documented in
> >> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> >>
> >> url:    https://github.com/intel-lab-lkp/linux/commits/Qu-Wenruo/btrfs-zlib-fix-and-simplify-the-inline-extent-decompression/20240108-171206
> >> base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
> >> patch link:    https://lore.kernel.org/r/29b7793e53e1cdd559ad212ee69cec211a3b4db2.1704704328.git.wqu%40suse.com
> >> patch subject: [PATCH 1/3] btrfs: zlib: fix and simplify the inline extent decompression
> >> config: i386-allmodconfig (https://download.01.org/0day-ci/archive/20240109/202401091012.pLm6PcKG-lkp@intel.com/config)
> >> compiler: ClangBuiltLinux clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
> >> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240109/202401091012.pLm6PcKG-lkp@intel.com/reproduce)
> >>
> >> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> >> the same patch/commit), kindly add following tags
> >> | Reported-by: kernel test robot <lkp@intel.com>
> >> | Closes: https://lore.kernel.org/oe-kbuild-all/202401091012.pLm6PcKG-lkp@intel.com/
> >>
> >> All warnings (new ones prefixed by >>):
> >>
> >>>> fs/btrfs/zlib.c:402:15: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
> >>       401 |                 pr_warn_ratelimited("BTRFS: infalte failed, decompressed=%lu expected=%lu\n",
> >>           |                                                                                       ~~~
> >>           |                                                                                       %zu
> >>       402 |                                         to_copy, destlen);
> >>           |                                                  ^~~~~~~
> > 
> > Valid report but I can't reproduce it. Built with clang 17 and
> > explicitly enabled -Wformat. We have additional warnings enabled per
> > directory fs/btrfs/ so we can add -Wformat, I'd like to know what I'm
> > missing, we've had fixups for the size_t printk format so it would make
> > sense to catch it early.
> 
> I guess it's due to the platform? (The report is from 32bit system).

Ah I see, I build on 64bit platform but should the Wformat warning also
point out mismatch there? The size_t type is an alias of unsigned long
so it is not an error but when size_t and %zu don't match could be a
platform-independent error, no? This would save us reports and followup
fixups roundtrips.

> Otherwise it's indeed my bad, for now I don't even have a 32bit VM to 
> verify, thus my LSP doesn't warn me about the format.

Yeah, it could/should though.
Qu Wenruo Jan. 10, 2024, 2:34 a.m. UTC | #5
On 2024/1/10 12:56, David Sterba wrote:
> On Wed, Jan 10, 2024 at 12:33:17PM +1030, Qu Wenruo wrote:
>>
>>
>> On 2024/1/10 12:29, David Sterba wrote:
>>> On Tue, Jan 09, 2024 at 11:02:54AM +0800, kernel test robot wrote:
>>>> Hi Qu,
>>>>
>>>> kernel test robot noticed the following build warnings:
>>>>
>>>> [auto build test WARNING on kdave/for-next]
>>>> [also build test WARNING on next-20240108]
>>>> [cannot apply to linus/master v6.7]
>>>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>>>> And when submitting patch, we suggest to use '--base' as documented in
>>>> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>>>>
>>>> url:    https://github.com/intel-lab-lkp/linux/commits/Qu-Wenruo/btrfs-zlib-fix-and-simplify-the-inline-extent-decompression/20240108-171206
>>>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
>>>> patch link:    https://lore.kernel.org/r/29b7793e53e1cdd559ad212ee69cec211a3b4db2.1704704328.git.wqu%40suse.com
>>>> patch subject: [PATCH 1/3] btrfs: zlib: fix and simplify the inline extent decompression
>>>> config: i386-allmodconfig (https://download.01.org/0day-ci/archive/20240109/202401091012.pLm6PcKG-lkp@intel.com/config)
>>>> compiler: ClangBuiltLinux clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
>>>> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240109/202401091012.pLm6PcKG-lkp@intel.com/reproduce)
>>>>
>>>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>>>> the same patch/commit), kindly add following tags
>>>> | Reported-by: kernel test robot <lkp@intel.com>
>>>> | Closes: https://lore.kernel.org/oe-kbuild-all/202401091012.pLm6PcKG-lkp@intel.com/
>>>>
>>>> All warnings (new ones prefixed by >>):
>>>>
>>>>>> fs/btrfs/zlib.c:402:15: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
>>>>        401 |                 pr_warn_ratelimited("BTRFS: infalte failed, decompressed=%lu expected=%lu\n",
>>>>            |                                                                                       ~~~
>>>>            |                                                                                       %zu
>>>>        402 |                                         to_copy, destlen);
>>>>            |                                                  ^~~~~~~
>>>
>>> Valid report but I can't reproduce it. Built with clang 17 and
>>> explicitly enabled -Wformat. We have additional warnings enabled per
>>> directory fs/btrfs/ so we can add -Wformat, I'd like to know what I'm
>>> missing, we've had fixups for the size_t printk format so it would make
>>> sense to catch it early.
>>
>> I guess it's due to the platform? (The report is from 32bit system).
> 
> Ah I see, I build on 64bit platform but should the Wformat warning also
> point out mismatch there? The size_t type is an alias of unsigned long
> so it is not an error but when size_t and %zu don't match could be a
> platform-independent error, no? This would save us reports and followup
> fixups roundtrips.

size_t is defined differently, in include/uapi/asm-generic/posix_types.h:

/*
  * Most 32 bit architectures use "unsigned int" size_t,
  * and all 64 bit architectures use "unsigned long" size_t.
  */
#ifndef __kernel_size_t
#if __BITS_PER_LONG != 64
typedef unsigned int    __kernel_size_t;
typedef int             __kernel_ssize_t;
typedef int             __kernel_ptrdiff_t;
#else
typedef __kernel_ulong_t __kernel_size_t;
typedef __kernel_long_t __kernel_ssize_t;
typedef __kernel_long_t __kernel_ptrdiff_t;
#endif
#endif

Thus this is the reason why we need @zu for size_t to handle the 
difference, and since for 64bit it's just unsigned long, thus compiler 
won't give any warning.

(That's also why I tend to not use size_t at all, and why I like rust's 
explicit sized type, and we may want to go that path to prefer 
u8/u16/u32/u64 when possible)

Thanks,
Qu
> 
>> Otherwise it's indeed my bad, for now I don't even have a 32bit VM to
>> verify, thus my LSP doesn't warn me about the format.
> 
> Yeah, it could/should though.
David Sterba Jan. 10, 2024, 2:42 a.m. UTC | #6
On Wed, Jan 10, 2024 at 01:04:06PM +1030, Qu Wenruo wrote:
> >>>>
> >>>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> >>>> the same patch/commit), kindly add following tags
> >>>> | Reported-by: kernel test robot <lkp@intel.com>
> >>>> | Closes: https://lore.kernel.org/oe-kbuild-all/202401091012.pLm6PcKG-lkp@intel.com/
> >>>>
> >>>> All warnings (new ones prefixed by >>):
> >>>>
> >>>>>> fs/btrfs/zlib.c:402:15: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
> >>>>        401 |                 pr_warn_ratelimited("BTRFS: infalte failed, decompressed=%lu expected=%lu\n",
> >>>>            |                                                                                       ~~~
> >>>>            |                                                                                       %zu
> >>>>        402 |                                         to_copy, destlen);
> >>>>            |                                                  ^~~~~~~
> >>>
> >>> Valid report but I can't reproduce it. Built with clang 17 and
> >>> explicitly enabled -Wformat. We have additional warnings enabled per
> >>> directory fs/btrfs/ so we can add -Wformat, I'd like to know what I'm
> >>> missing, we've had fixups for the size_t printk format so it would make
> >>> sense to catch it early.
> >>
> >> I guess it's due to the platform? (The report is from 32bit system).
> > 
> > Ah I see, I build on 64bit platform but should the Wformat warning also
> > point out mismatch there? The size_t type is an alias of unsigned long
> > so it is not an error but when size_t and %zu don't match could be a
> > platform-independent error, no? This would save us reports and followup
> > fixups roundtrips.
> 
> size_t is defined differently, in include/uapi/asm-generic/posix_types.h:
> 
> /*
>   * Most 32 bit architectures use "unsigned int" size_t,
>   * and all 64 bit architectures use "unsigned long" size_t.
>   */
> #ifndef __kernel_size_t
> #if __BITS_PER_LONG != 64
> typedef unsigned int    __kernel_size_t;
> typedef int             __kernel_ssize_t;
> typedef int             __kernel_ptrdiff_t;
> #else
> typedef __kernel_ulong_t __kernel_size_t;
> typedef __kernel_long_t __kernel_ssize_t;
> typedef __kernel_long_t __kernel_ptrdiff_t;
> #endif
> #endif
> 
> Thus this is the reason why we need @zu for size_t to handle the 
> difference, and since for 64bit it's just unsigned long, thus compiler 
> won't give any warning.

So it's the int/long difference and kernel does it in a special way due
to absence of the standard libc headers.

> (That's also why I tend to not use size_t at all, and why I like rust's 
> explicit sized type, and we may want to go that path to prefer 
> u8/u16/u32/u64 when possible)

Yeah, from what I've seen so far is that size_t brings us only problems,
I would not mind conversions to explicit types like u32 or u64. We do
that in many places, I think it's namely on the interface boundaries
where we provide a callback with a size_t type, but from that down it
could be u64.

Quick grep for size_t returns 300+ lines and 100+ of that is ssize_t,
that's still a lot so the conversions should be targeted.
diff mbox series

Patch

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 3d8fc2ad0f42..9cf7d38dc66c 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -141,16 +141,16 @@  static int compression_decompress_bio(struct list_head *ws,
 }
 
 static int compression_decompress(int type, struct list_head *ws,
-               const u8 *data_in, struct page *dest_page,
-               unsigned long start_byte, size_t srclen, size_t destlen)
+		const u8 *data_in, struct page *dest_page,
+		unsigned long dest_pgoff, size_t srclen, size_t destlen)
 {
 	switch (type) {
 	case BTRFS_COMPRESS_ZLIB: return zlib_decompress(ws, data_in, dest_page,
-						start_byte, srclen, destlen);
+						dest_pgoff, srclen, destlen);
 	case BTRFS_COMPRESS_LZO:  return lzo_decompress(ws, data_in, dest_page,
-						start_byte, srclen, destlen);
+						dest_pgoff, srclen, destlen);
 	case BTRFS_COMPRESS_ZSTD: return zstd_decompress(ws, data_in, dest_page,
-						start_byte, srclen, destlen);
+						dest_pgoff, srclen, destlen);
 	case BTRFS_COMPRESS_NONE:
 	default:
 		/*
@@ -1037,14 +1037,23 @@  static int btrfs_decompress_bio(struct compressed_bio *cb)
  * start_byte tells us the offset into the compressed data we're interested in
  */
 int btrfs_decompress(int type, const u8 *data_in, struct page *dest_page,
-		     unsigned long start_byte, size_t srclen, size_t destlen)
+		     unsigned long dest_pgoff, size_t srclen, size_t destlen)
 {
+	struct btrfs_fs_info *fs_info = btrfs_sb(dest_page->mapping->host->i_sb);
 	struct list_head *workspace;
+	const u32 sectorsize = fs_info->sectorsize;
 	int ret;
 
+	/*
+	 * The full destination page range should not exceed the page size.
+	 * And the @destlen should not exceed sectorsize, as this is only called for
+	 * inline file extents, which should not exceed sectorsize.
+	 */
+	ASSERT(dest_pgoff + destlen <= PAGE_SIZE && destlen <= sectorsize);
+
 	workspace = get_workspace(type, 0);
 	ret = compression_decompress(type, workspace, data_in, dest_page,
-				     start_byte, srclen, destlen);
+				     dest_pgoff, srclen, destlen);
 	put_workspace(type, workspace);
 
 	return ret;
diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h
index 93cc92974dee..2b4dfb1b010c 100644
--- a/fs/btrfs/compression.h
+++ b/fs/btrfs/compression.h
@@ -148,7 +148,7 @@  int zlib_compress_pages(struct list_head *ws, struct address_space *mapping,
 		unsigned long *total_in, unsigned long *total_out);
 int zlib_decompress_bio(struct list_head *ws, struct compressed_bio *cb);
 int zlib_decompress(struct list_head *ws, const u8 *data_in,
-		struct page *dest_page, unsigned long start_byte, size_t srclen,
+		struct page *dest_page, unsigned long dest_pgoff, size_t srclen,
 		size_t destlen);
 struct list_head *zlib_alloc_workspace(unsigned int level);
 void zlib_free_workspace(struct list_head *ws);
diff --git a/fs/btrfs/super.h b/fs/btrfs/super.h
index f18253ca280d..0dc3dd713cbb 100644
--- a/fs/btrfs/super.h
+++ b/fs/btrfs/super.h
@@ -3,6 +3,9 @@ 
 #ifndef BTRFS_SUPER_H
 #define BTRFS_SUPER_H
 
+#include <linux/fs.h>
+#include "fs.h"
+
 bool btrfs_check_options(struct btrfs_fs_info *info, unsigned long *mount_opt,
 			 unsigned long flags);
 int btrfs_sync_fs(struct super_block *sb, int wait);
diff --git a/fs/btrfs/zlib.c b/fs/btrfs/zlib.c
index 36cf1f0e338e..58cd23f2ca1c 100644
--- a/fs/btrfs/zlib.c
+++ b/fs/btrfs/zlib.c
@@ -354,18 +354,13 @@  int zlib_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 }
 
 int zlib_decompress(struct list_head *ws, const u8 *data_in,
-		struct page *dest_page, unsigned long start_byte, size_t srclen,
+		struct page *dest_page, unsigned long dest_pgoff, size_t srclen,
 		size_t destlen)
 {
 	struct workspace *workspace = list_entry(ws, struct workspace, list);
 	int ret = 0;
 	int wbits = MAX_WBITS;
-	unsigned long bytes_left;
-	unsigned long total_out = 0;
-	unsigned long pg_offset = 0;
-
-	destlen = min_t(unsigned long, destlen, PAGE_SIZE);
-	bytes_left = destlen;
+	unsigned long to_copy;
 
 	workspace->strm.next_in = data_in;
 	workspace->strm.avail_in = srclen;
@@ -390,60 +385,30 @@  int zlib_decompress(struct list_head *ws, const u8 *data_in,
 		return -EIO;
 	}
 
-	while (bytes_left > 0) {
-		unsigned long buf_start;
-		unsigned long buf_offset;
-		unsigned long bytes;
+	/*
+	 * Everything (in/out buf) should be at most one sector, there should
+	 * be no need to switch any input/output buffer.
+	 */
+	ret = zlib_inflate(&workspace->strm, Z_FINISH);
+	to_copy = min(workspace->strm.total_out, destlen);
+	if (ret != Z_STREAM_END)
+		goto out;
 
-		ret = zlib_inflate(&workspace->strm, Z_NO_FLUSH);
-		if (ret != Z_OK && ret != Z_STREAM_END)
-			break;
+	memcpy_to_page(dest_page, dest_pgoff, workspace->buf, to_copy);
 
-		buf_start = total_out;
-		total_out = workspace->strm.total_out;
-
-		if (total_out == buf_start) {
-			ret = -EIO;
-			break;
-		}
-
-		if (total_out <= start_byte)
-			goto next;
-
-		if (total_out > start_byte && buf_start < start_byte)
-			buf_offset = start_byte - buf_start;
-		else
-			buf_offset = 0;
-
-		bytes = min(PAGE_SIZE - pg_offset,
-			    PAGE_SIZE - (buf_offset % PAGE_SIZE));
-		bytes = min(bytes, bytes_left);
-
-		memcpy_to_page(dest_page, pg_offset,
-			       workspace->buf + buf_offset, bytes);
-
-		pg_offset += bytes;
-		bytes_left -= bytes;
-next:
-		workspace->strm.next_out = workspace->buf;
-		workspace->strm.avail_out = workspace->buf_size;
-	}
-
-	if (ret != Z_STREAM_END && bytes_left != 0)
+out:
+	if (unlikely(to_copy != destlen)) {
+		pr_warn_ratelimited("BTRFS: infalte failed, decompressed=%lu expected=%lu\n",
+					to_copy, destlen);
 		ret = -EIO;
-	else
+	} else {
 		ret = 0;
+	}
 
 	zlib_inflateEnd(&workspace->strm);
 
-	/*
-	 * this should only happen if zlib returned fewer bytes than we
-	 * expected.  btrfs_get_block is responsible for zeroing from the
-	 * end of the inline extent (destlen) to the end of the page
-	 */
-	if (pg_offset < destlen) {
-		memzero_page(dest_page, pg_offset, destlen - pg_offset);
-	}
+	if (unlikely(to_copy < destlen))
+		memzero_page(dest_page, dest_pgoff + to_copy, destlen - to_copy);
 	return ret;
 }