diff mbox series

[v9,5/9] block: Make bio_iov_iter_get_pages() respect bio_required_sector_alignment()

Message ID 20210604210908.2105870-6-satyat@google.com (mailing list archive)
State New, archived
Headers show
Series add support for direct I/O with fscrypt using blk-crypto | expand

Commit Message

Satya Tangirala June 4, 2021, 9:09 p.m. UTC
Previously, bio_iov_iter_get_pages() wasn't used with bios that could have
an encryption context. However, direct I/O support using blk-crypto
introduces this possibility, so this function must now respect
bio_required_sector_alignment() (otherwise, xfstests like generic/465 with
ext4 will fail).

Signed-off-by: Satya Tangirala <satyat@google.com>
---
 block/bio.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

Comments

kernel test robot June 4, 2021, 10:51 p.m. UTC | #1
Hi Satya,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on block/for-next]
[also build test ERROR on ext4/dev f2fs/dev-test linus/master v5.13-rc4 next-20210604]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Satya-Tangirala/add-support-for-direct-I-O-with-fscrypt-using-blk-crypto/20210605-051023
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
config: um-x86_64_defconfig (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/eea1225f680da16dce01bfb2914b9e8b78de298d
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Satya-Tangirala/add-support-for-direct-I-O-with-fscrypt-using-blk-crypto/20210605-051023
        git checkout eea1225f680da16dce01bfb2914b9e8b78de298d
        # save the attached .config to linux build tree
        make W=1 ARCH=um SUBARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from include/linux/kernel.h:14,
                    from include/asm-generic/bug.h:20,
                    from ./arch/um/include/generated/asm/bug.h:1,
                    from include/linux/bug.h:5,
                    from include/linux/mmdebug.h:5,
                    from include/linux/mm.h:9,
                    from block/bio.c:5:
   block/bio.c: In function 'bio_iov_iter_get_pages':
>> block/bio.c:1131:10: error: implicit declaration of function 'bio_required_sector_alignment' [-Werror=implicit-function-declaration]
    1131 |          bio_required_sector_alignment(bio));
         |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/math.h:14:46: note: in definition of macro '__round_mask'
      14 | #define __round_mask(x, y) ((__typeof__(x))((y)-1))
         |                                              ^
   block/bio.c:1130:20: note: in expansion of macro 'round_down'
    1130 |  aligned_sectors = round_down(bio_sectors(bio),
         |                    ^~~~~~~~~~
   cc1: some warnings being treated as errors


vim +/bio_required_sector_alignment +1131 block/bio.c

  1083	
  1084	/**
  1085	 * bio_iov_iter_get_pages - add user or kernel pages to a bio
  1086	 * @bio: bio to add pages to
  1087	 * @iter: iov iterator describing the region to be added
  1088	 *
  1089	 * This takes either an iterator pointing to user memory, or one pointing to
  1090	 * kernel pages (BVEC iterator). If we're adding user pages, we pin them and
  1091	 * map them into the kernel. On IO completion, the caller should put those
  1092	 * pages. For bvec based iterators bio_iov_iter_get_pages() uses the provided
  1093	 * bvecs rather than copying them. Hence anyone issuing kiocb based IO needs
  1094	 * to ensure the bvecs and pages stay referenced until the submitted I/O is
  1095	 * completed by a call to ->ki_complete() or returns with an error other than
  1096	 * -EIOCBQUEUED. The caller needs to check if the bio is flagged BIO_NO_PAGE_REF
  1097	 * on IO completion. If it isn't, then pages should be released.
  1098	 *
  1099	 * The function tries, but does not guarantee, to pin as many pages as
  1100	 * fit into the bio, or are requested in @iter, whatever is smaller. If
  1101	 * MM encounters an error pinning the requested pages, it stops. Error
  1102	 * is returned only if 0 pages could be pinned. It also ensures that the number
  1103	 * of sectors added to the bio is aligned to bio_required_sector_alignment().
  1104	 *
  1105	 * It's intended for direct IO, so doesn't do PSI tracking, the caller is
  1106	 * responsible for setting BIO_WORKINGSET if necessary.
  1107	 */
  1108	int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
  1109	{
  1110		int ret = 0;
  1111		unsigned int aligned_sectors;
  1112	
  1113		if (iov_iter_is_bvec(iter)) {
  1114			if (bio_op(bio) == REQ_OP_ZONE_APPEND)
  1115				return bio_iov_bvec_set_append(bio, iter);
  1116			return bio_iov_bvec_set(bio, iter);
  1117		}
  1118	
  1119		do {
  1120			if (bio_op(bio) == REQ_OP_ZONE_APPEND)
  1121				ret = __bio_iov_append_get_pages(bio, iter);
  1122			else
  1123				ret = __bio_iov_iter_get_pages(bio, iter);
  1124		} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
  1125	
  1126		/*
  1127		 * Ensure that number of sectors in bio is aligned to
  1128		 * bio_required_sector_align()
  1129		 */
  1130		aligned_sectors = round_down(bio_sectors(bio),
> 1131					     bio_required_sector_alignment(bio));
  1132		iov_iter_revert(iter, (bio_sectors(bio) - aligned_sectors) << SECTOR_SHIFT);
  1133		bio_truncate(bio, aligned_sectors << SECTOR_SHIFT);
  1134	
  1135		/* don't account direct I/O as memory stall */
  1136		bio_clear_flag(bio, BIO_WORKINGSET);
  1137		return bio->bi_vcnt ? 0 : ret;
  1138	}
  1139	EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages);
  1140	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
kernel test robot June 5, 2021, 2:55 a.m. UTC | #2
Hi Satya,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on block/for-next]
[also build test ERROR on ext4/dev f2fs/dev-test linus/master v5.13-rc4 next-20210604]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Satya-Tangirala/add-support-for-direct-I-O-with-fscrypt-using-blk-crypto/20210605-051023
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
config: powerpc64-randconfig-r025-20210604 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 5c0d1b2f902aa6a9cf47cc7e42c5b83bb2217cf9)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc64 cross compiling tool for clang build
        # apt-get install binutils-powerpc64-linux-gnu
        # https://github.com/0day-ci/linux/commit/eea1225f680da16dce01bfb2914b9e8b78de298d
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Satya-Tangirala/add-support-for-direct-I-O-with-fscrypt-using-blk-crypto/20210605-051023
        git checkout eea1225f680da16dce01bfb2914b9e8b78de298d
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from block/bio.c:5:
   In file included from include/linux/mm.h:9:
   In file included from include/linux/mmdebug.h:5:
   In file included from include/linux/bug.h:5:
   In file included from arch/powerpc/include/asm/bug.h:109:
   In file included from include/asm-generic/bug.h:20:
   In file included from include/linux/kernel.h:12:
   In file included from include/linux/bitops.h:32:
   In file included from arch/powerpc/include/asm/bitops.h:62:
   arch/powerpc/include/asm/barrier.h:49:9: warning: '__lwsync' macro redefined [-Wmacro-redefined]
   #define __lwsync()      __asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory")
           ^
   <built-in>:309:9: note: previous definition is here
   #define __lwsync __builtin_ppc_lwsync
           ^
>> block/bio.c:1131:10: error: implicit declaration of function 'bio_required_sector_alignment' [-Werror,-Wimplicit-function-declaration]
                                        bio_required_sector_alignment(bio));
                                        ^
   1 warning and 1 error generated.


vim +/bio_required_sector_alignment +1131 block/bio.c

  1083	
  1084	/**
  1085	 * bio_iov_iter_get_pages - add user or kernel pages to a bio
  1086	 * @bio: bio to add pages to
  1087	 * @iter: iov iterator describing the region to be added
  1088	 *
  1089	 * This takes either an iterator pointing to user memory, or one pointing to
  1090	 * kernel pages (BVEC iterator). If we're adding user pages, we pin them and
  1091	 * map them into the kernel. On IO completion, the caller should put those
  1092	 * pages. For bvec based iterators bio_iov_iter_get_pages() uses the provided
  1093	 * bvecs rather than copying them. Hence anyone issuing kiocb based IO needs
  1094	 * to ensure the bvecs and pages stay referenced until the submitted I/O is
  1095	 * completed by a call to ->ki_complete() or returns with an error other than
  1096	 * -EIOCBQUEUED. The caller needs to check if the bio is flagged BIO_NO_PAGE_REF
  1097	 * on IO completion. If it isn't, then pages should be released.
  1098	 *
  1099	 * The function tries, but does not guarantee, to pin as many pages as
  1100	 * fit into the bio, or are requested in @iter, whatever is smaller. If
  1101	 * MM encounters an error pinning the requested pages, it stops. Error
  1102	 * is returned only if 0 pages could be pinned. It also ensures that the number
  1103	 * of sectors added to the bio is aligned to bio_required_sector_alignment().
  1104	 *
  1105	 * It's intended for direct IO, so doesn't do PSI tracking, the caller is
  1106	 * responsible for setting BIO_WORKINGSET if necessary.
  1107	 */
  1108	int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
  1109	{
  1110		int ret = 0;
  1111		unsigned int aligned_sectors;
  1112	
  1113		if (iov_iter_is_bvec(iter)) {
  1114			if (bio_op(bio) == REQ_OP_ZONE_APPEND)
  1115				return bio_iov_bvec_set_append(bio, iter);
  1116			return bio_iov_bvec_set(bio, iter);
  1117		}
  1118	
  1119		do {
  1120			if (bio_op(bio) == REQ_OP_ZONE_APPEND)
  1121				ret = __bio_iov_append_get_pages(bio, iter);
  1122			else
  1123				ret = __bio_iov_iter_get_pages(bio, iter);
  1124		} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
  1125	
  1126		/*
  1127		 * Ensure that number of sectors in bio is aligned to
  1128		 * bio_required_sector_align()
  1129		 */
  1130		aligned_sectors = round_down(bio_sectors(bio),
> 1131					     bio_required_sector_alignment(bio));
  1132		iov_iter_revert(iter, (bio_sectors(bio) - aligned_sectors) << SECTOR_SHIFT);
  1133		bio_truncate(bio, aligned_sectors << SECTOR_SHIFT);
  1134	
  1135		/* don't account direct I/O as memory stall */
  1136		bio_clear_flag(bio, BIO_WORKINGSET);
  1137		return bio->bi_vcnt ? 0 : ret;
  1138	}
  1139	EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages);
  1140	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Eric Biggers July 23, 2021, 9:33 p.m. UTC | #3
On Fri, Jun 04, 2021 at 09:09:04PM +0000, Satya Tangirala wrote:
> Previously, bio_iov_iter_get_pages() wasn't used with bios that could have
> an encryption context. However, direct I/O support using blk-crypto
> introduces this possibility, so this function must now respect
> bio_required_sector_alignment() (otherwise, xfstests like generic/465 with
> ext4 will fail).

Can you be more clear that the fscrypt direct I/O support only requires this in
order to support I/O segments that aren't fs-block aligned?

I do still wonder if we should just not support that...  Dave is the only person
who has asked for it, and it's a lot of trouble to support.

I also noticed that f2fs has always only supported direct I/O that is *fully*
fs-block aligned (including the I/O segments) anyway.  So presumably that
limitation is not really that important after all...

Does anyone else have thoughts on this?

One more comment on this patch below:

> 
> Signed-off-by: Satya Tangirala <satyat@google.com>
> ---
>  block/bio.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/block/bio.c b/block/bio.c
> index 32f75f31bb5c..99c510f706e2 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1099,7 +1099,8 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
>   * The function tries, but does not guarantee, to pin as many pages as
>   * fit into the bio, or are requested in @iter, whatever is smaller. If
>   * MM encounters an error pinning the requested pages, it stops. Error
> - * is returned only if 0 pages could be pinned.
> + * is returned only if 0 pages could be pinned. It also ensures that the number
> + * of sectors added to the bio is aligned to bio_required_sector_alignment().
>   *
>   * It's intended for direct IO, so doesn't do PSI tracking, the caller is
>   * responsible for setting BIO_WORKINGSET if necessary.
> @@ -1107,6 +1108,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
>  int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>  {
>  	int ret = 0;
> +	unsigned int aligned_sectors;
>  
>  	if (iov_iter_is_bvec(iter)) {
>  		if (bio_op(bio) == REQ_OP_ZONE_APPEND)
> @@ -1121,6 +1123,15 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>  			ret = __bio_iov_iter_get_pages(bio, iter);
>  	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
>  
> +	/*
> +	 * Ensure that number of sectors in bio is aligned to
> +	 * bio_required_sector_align()
> +	 */
> +	aligned_sectors = round_down(bio_sectors(bio),
> +				     bio_required_sector_alignment(bio));
> +	iov_iter_revert(iter, (bio_sectors(bio) - aligned_sectors) << SECTOR_SHIFT);
> +	bio_truncate(bio, aligned_sectors << SECTOR_SHIFT);
> +
>  	/* don't account direct I/O as memory stall */
>  	bio_clear_flag(bio, BIO_WORKINGSET);
>  	return bio->bi_vcnt ? 0 : ret;

Doesn't this need to return an error if the bio's size gets rounded down to 0?
For example if logical_block_size=512 and data_unit_size=4096, and the iov_iter
points to 4096 bytes in 8 512-byte segments but the last one isn't mapped, then
7 pages would be pinned and the last one would fail.  This would then truncate
the bio's size to 0, but bio->bi_vcnt would be 7, so this would still return 0.
It would also be necessary to release the pages before returning an error.

- Eric
Christoph Hellwig July 24, 2021, 7:19 a.m. UTC | #4
On Fri, Jul 23, 2021 at 02:33:02PM -0700, Eric Biggers wrote:
> I do still wonder if we should just not support that...  Dave is the only person
> who has asked for it, and it's a lot of trouble to support.
> 
> I also noticed that f2fs has always only supported direct I/O that is *fully*
> fs-block aligned (including the I/O segments) anyway.  So presumably that
> limitation is not really that important after all...
> 
> Does anyone else have thoughts on this?

There are some use cases that really like sector aligned direct I/O,
what comes to mind is some data bases, and file system repair tools
(the latter on the raw block device).  So it is nice to support, but not
really required.

So for now I'd much prefer to initially support inline encryption for
direct I/O without that if that simplifies the support.  We can revisit
the additional complexity later.

Also note that for cheap flash media pretending support for 512 byte
blocks is actually a bit awwkward, so just presenting the media as
having 4096 sectors in these setups would be the better choice anyway.
diff mbox series

Patch

diff --git a/block/bio.c b/block/bio.c
index 32f75f31bb5c..99c510f706e2 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1099,7 +1099,8 @@  static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
  * The function tries, but does not guarantee, to pin as many pages as
  * fit into the bio, or are requested in @iter, whatever is smaller. If
  * MM encounters an error pinning the requested pages, it stops. Error
- * is returned only if 0 pages could be pinned.
+ * is returned only if 0 pages could be pinned. It also ensures that the number
+ * of sectors added to the bio is aligned to bio_required_sector_alignment().
  *
  * It's intended for direct IO, so doesn't do PSI tracking, the caller is
  * responsible for setting BIO_WORKINGSET if necessary.
@@ -1107,6 +1108,7 @@  static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
 int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 {
 	int ret = 0;
+	unsigned int aligned_sectors;
 
 	if (iov_iter_is_bvec(iter)) {
 		if (bio_op(bio) == REQ_OP_ZONE_APPEND)
@@ -1121,6 +1123,15 @@  int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 			ret = __bio_iov_iter_get_pages(bio, iter);
 	} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
 
+	/*
+	 * Ensure that number of sectors in bio is aligned to
+	 * bio_required_sector_align()
+	 */
+	aligned_sectors = round_down(bio_sectors(bio),
+				     bio_required_sector_alignment(bio));
+	iov_iter_revert(iter, (bio_sectors(bio) - aligned_sectors) << SECTOR_SHIFT);
+	bio_truncate(bio, aligned_sectors << SECTOR_SHIFT);
+
 	/* don't account direct I/O as memory stall */
 	bio_clear_flag(bio, BIO_WORKINGSET);
 	return bio->bi_vcnt ? 0 : ret;