diff mbox series

[3/6] block, fs: assert that key paths use iovecs, and nothing else

Message ID 20220227093434.2889464-4-jhubbard@nvidia.com (mailing list archive)
State New, archived
Headers show
Series block, fs: convert most Direct IO cases to FOLL_PIN | expand

Commit Message

jhubbard.send.patches@gmail.com Feb. 27, 2022, 9:34 a.m. UTC
From: John Hubbard <jhubbard@nvidia.com>

Upcoming changes to Direct IO will change it from acquiring pages via
get_user_pages_fast(), to calling pin_user_pages_fast() instead.

Place a few assertions at key points, that the pages are IOVEC (user
pages), to enforce the assumptions that there are no kernel or pipe or
other odd variations being passed.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 block/bio.c    | 4 ++++
 fs/direct-io.c | 2 ++
 2 files changed, 6 insertions(+)

Comments

Jens Axboe Feb. 27, 2022, 9:58 p.m. UTC | #1
> diff --git a/block/bio.c b/block/bio.c
> index b15f5466ce08..4679d6539e2d 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1167,6 +1167,8 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>  	BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2);
>  	pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
>  
> +	WARN_ON_ONCE(!iter_is_iovec(iter));
> +

If these make sense, why aren't they also returning an error?
John Hubbard Feb. 27, 2022, 10:12 p.m. UTC | #2
On 2/27/22 13:58, Jens Axboe wrote:
>> diff --git a/block/bio.c b/block/bio.c
>> index b15f5466ce08..4679d6539e2d 100644
>> --- a/block/bio.c
>> +++ b/block/bio.c
>> @@ -1167,6 +1167,8 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>>   	BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2);
>>   	pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
>>   
>> +	WARN_ON_ONCE(!iter_is_iovec(iter));
>> +
> 
> If these make sense, why aren't they also returning an error?
> 

Oops. Got caught up in the luxury of local testing and watching the
logs, but also returning errors is of course the right answer. Will fix.

thanks,
Al Viro Feb. 27, 2022, 10:15 p.m. UTC | #3
On Sun, Feb 27, 2022 at 01:34:31AM -0800, jhubbard.send.patches@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> Upcoming changes to Direct IO will change it from acquiring pages via
> get_user_pages_fast(), to calling pin_user_pages_fast() instead.
> 
> Place a few assertions at key points, that the pages are IOVEC (user
> pages), to enforce the assumptions that there are no kernel or pipe or
> other odd variations being passed.

Umm...  And what should happen when O_DIRECT file gets passed to splice()?
John Hubbard Feb. 27, 2022, 10:27 p.m. UTC | #4
On 2/27/22 14:15, Al Viro wrote:
> On Sun, Feb 27, 2022 at 01:34:31AM -0800, jhubbard.send.patches@gmail.com wrote:
>> From: John Hubbard <jhubbard@nvidia.com>
>>
>> Upcoming changes to Direct IO will change it from acquiring pages via
>> get_user_pages_fast(), to calling pin_user_pages_fast() instead.
>>
>> Place a few assertions at key points, that the pages are IOVEC (user
>> pages), to enforce the assumptions that there are no kernel or pipe or
>> other odd variations being passed.
> 
> Umm...  And what should happen when O_DIRECT file gets passed to splice()?

Hi Al,

First of all, full disclosure: I still haven't worked through how
splice() handles pages in all cases. I was hoping to defer it, by
limiting this series to not *all* of the original callers of
iov_iter_get_pages*().

This series leaves the splice() code pointing to iov_iter_get_pages(),
but maybe that's not possible after all.

Any advice or ideas about how to solve the O_DIRECT-to-splice() is very
welcome.


thanks,
John Hubbard Feb. 28, 2022, 3:29 a.m. UTC | #5
On 2/27/22 14:15, Al Viro wrote:
> On Sun, Feb 27, 2022 at 01:34:31AM -0800, jhubbard.send.patches@gmail.com wrote:
>> From: John Hubbard <jhubbard@nvidia.com>
>>
>> Upcoming changes to Direct IO will change it from acquiring pages via
>> get_user_pages_fast(), to calling pin_user_pages_fast() instead.
>>
>> Place a few assertions at key points, that the pages are IOVEC (user
>> pages), to enforce the assumptions that there are no kernel or pipe or
>> other odd variations being passed.
> 
> Umm...  And what should happen when O_DIRECT file gets passed to splice()?

OK, right, this is not going to work as-is.

Any of the remaining iov_iter_get_pages*() callers that do direct IO
(vmsplice, ceph, ...) will end up calling bio_release_pages(), which is
now unconditionally calling unpin_user_pages(). So this is just
completely broken.

And so, once again, I'm back to, "must convert the whole pile of
iov_iter_get_page*() callers at once".

Back to the drawing board.

thanks,
diff mbox series

Patch

diff --git a/block/bio.c b/block/bio.c
index b15f5466ce08..4679d6539e2d 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1167,6 +1167,8 @@  static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2);
 	pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
 
+	WARN_ON_ONCE(!iter_is_iovec(iter));
+
 	size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset);
 	if (unlikely(size <= 0))
 		return size ? size : -EFAULT;
@@ -1217,6 +1219,8 @@  static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter)
 	BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2);
 	pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
 
+	WARN_ON_ONCE(!iter_is_iovec(iter));
+
 	size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset);
 	if (unlikely(size <= 0))
 		return size ? size : -EFAULT;
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 38bca4980a1c..7dbbbfef300d 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -169,6 +169,8 @@  static inline int dio_refill_pages(struct dio *dio, struct dio_submit *sdio)
 {
 	ssize_t ret;
 
+	WARN_ON_ONCE(!iter_is_iovec(sdio->iter));
+
 	ret = iov_iter_get_pages(sdio->iter, dio->pages, LONG_MAX, DIO_PAGES,
 				&sdio->from);