io_uring: don't allow IORING_SETUP_NO_MMAP rings on highmem pages

Message ID 4c9eddf5-75d8-44cf-9365-a0dd3d0b4c05@kernel.dk (mailing list archive)
State New

Commit Message

Jens Axboe Oct. 3, 2023, 4:02 p.m. UTC
On at least arm32, but presumably any arch with highmem, if the
application passes in memory that resides in highmem for the rings,
then we should fail that ring creation. We fail it with -EINVAL, which
is what kernels that don't support IORING_SETUP_NO_MMAP will do as well.

Cc: stable@vger.kernel.org
Fixes: 03d89a2de25b ("io_uring: support for user allocated memory for rings/sqes")
Signed-off-by: Jens Axboe <axboe@kernel.dk>

---
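
Note for application authors (not part of the patch): since the failure
is -EINVAL both with this check and on kernels that predate
IORING_SETUP_NO_MMAP, a single fallback path can probably cover both.
Below is a minimal sketch in C under stated assumptions. The names
ring_setup_fn, ring_setup_with_fallback and the fake_setup_* helpers are
hypothetical and are not kernel or liburing APIs; a real caller would
wrap its own io_uring_setup(2) calls in the two callbacks.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical callback: attempts ring setup and returns a ring fd (>= 0)
 * or a negative errno. Not a kernel or liburing API.
 */
typedef int (*ring_setup_fn)(unsigned int entries, void *ctx);

enum ring_mode {
	RING_MODE_FAILED,	/* neither attempt produced a ring */
	RING_MODE_NO_MMAP,	/* application-provided ring memory */
	RING_MODE_KERNEL_MMAP,	/* kernel-allocated rings, mapped by the app */
};

struct ring_result {
	int fd;			/* ring fd, or negative errno on failure */
	enum ring_mode mode;
};

static struct ring_result ring_setup_with_fallback(unsigned int entries,
						   ring_setup_fn try_no_mmap,
						   ring_setup_fn try_mmap,
						   void *ctx)
{
	struct ring_result res = { .fd = -EINVAL, .mode = RING_MODE_FAILED };

	assert(try_no_mmap && try_mmap);

	res.fd = try_no_mmap(entries, ctx);
	if (res.fd >= 0) {
		res.mode = RING_MODE_NO_MMAP;
		return res;
	}

	/*
	 * -EINVAL covers both highmem-backed ring memory and kernels without
	 * the flag (it can also mean other bad parameters). Any other error
	 * is reported as-is, without retrying.
	 */
	if (res.fd != -EINVAL)
		return res;

	res.fd = try_mmap(entries, ctx);
	if (res.fd >= 0)
		res.mode = RING_MODE_KERNEL_MMAP;
	return res;
}

/* Stand-ins for illustration only: they simulate possible kernel answers. */
static int fake_setup_einval(unsigned int entries, void *ctx)
{
	(void)entries;
	(void)ctx;
	return -EINVAL;
}

static int fake_setup_enomem(unsigned int entries, void *ctx)
{
	(void)entries;
	(void)ctx;
	return -ENOMEM;
}

static int fake_setup_ok(unsigned int entries, void *ctx)
{
	(void)entries;
	(void)ctx;
	return 3;	/* pretend ring fd */
}
```

Retrying only on -EINVAL is a design choice in this sketch: errors such
as -ENOMEM are passed back to the caller rather than masked by the
fallback.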

Comments

Jens Axboe Oct. 3, 2023, 4:27 p.m. UTC | #1
On 10/3/23 10:30 AM, Jeff Moyer wrote:
> Hi, Jens,
> 
> Jens Axboe <axboe@kernel.dk> writes:
> 
>> On at least arm32, but presumably any arch with highmem, if the
>> application passes in memory that resides in highmem for the rings,
>> then we should fail that ring creation. We fail it with -EINVAL, which
>> is what kernels that don't support IORING_SETUP_NO_MMAP will do as well.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: 03d89a2de25b ("io_uring: support for user allocated memory for rings/sqes")
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>
>> ---
>>
>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>> index 783ed0fff71b..d839a80a6751 100644
>> --- a/io_uring/io_uring.c
>> +++ b/io_uring/io_uring.c
>> @@ -2686,7 +2686,7 @@ static void *__io_uaddr_map(struct page ***pages, unsigned short *npages,
>>  {
>>  	struct page **page_array;
>>  	unsigned int nr_pages;
>> -	int ret;
>> +	int ret, i;
>>  
>>  	*npages = 0;
>>  
>> @@ -2716,6 +2716,20 @@ static void *__io_uaddr_map(struct page ***pages, unsigned short *npages,
>>  	 */
>>  	if (page_array[0] != page_array[ret - 1])
>>  		goto err;
>> +
>> +	/*
>> +	 * Can't support mapping user allocated ring memory on 32-bit archs
>> +	 * where it could potentially reside in highmem. Just fail those with
>> +	 * -EINVAL, just like we did on kernels that didn't support this
>> +	 * feature.
>> +	 */
>> +	for (i = 0; i < nr_pages; i++) {
>> +		if (PageHighMem(page_array[i])) {
>> +			ret = -EINVAL;
>> +			goto err;
>> +		}
>> +	}
>> +
> 
> What do you think about throwing a printk_once in there that explains
> the problem?  I'm worried that this will fail somewhat randomly, and it
> may not be apparent to the user why.  We should also add documentation,
> of course, and encourage developers to add fallbacks for this case.

For both cases posted, it's rather more advanced use cases. And 32-bit
isn't so prevalent anymore, thankfully. I was going to add to the man
pages explaining this failure case. Not sure it's worth adding a printk
for though.

FWIW, once I got an arm32 vm setup, it fails every time for me. Not sure
how it'd do on 32-bit x86, similarly or more randomly. But yeah it's
definitely at the mercy of how things are mapped.

Jeff Moyer Oct. 3, 2023, 4:30 p.m. UTC | #2
Hi, Jens,

Jens Axboe <axboe@kernel.dk> writes:

> On at least arm32, but presumably any arch with highmem, if the
> application passes in memory that resides in highmem for the rings,
> then we should fail that ring creation. We fail it with -EINVAL, which
> is what kernels that don't support IORING_SETUP_NO_MMAP will do as well.
>
> Cc: stable@vger.kernel.org
> Fixes: 03d89a2de25b ("io_uring: support for user allocated memory for rings/sqes")
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>
> ---
>
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index 783ed0fff71b..d839a80a6751 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -2686,7 +2686,7 @@ static void *__io_uaddr_map(struct page ***pages, unsigned short *npages,
>  {
>  	struct page **page_array;
>  	unsigned int nr_pages;
> -	int ret;
> +	int ret, i;
>  
>  	*npages = 0;
>  
> @@ -2716,6 +2716,20 @@ static void *__io_uaddr_map(struct page ***pages, unsigned short *npages,
>  	 */
>  	if (page_array[0] != page_array[ret - 1])
>  		goto err;
> +
> +	/*
> +	 * Can't support mapping user allocated ring memory on 32-bit archs
> +	 * where it could potentially reside in highmem. Just fail those with
> +	 * -EINVAL, just like we did on kernels that didn't support this
> +	 * feature.
> +	 */
> +	for (i = 0; i < nr_pages; i++) {
> +		if (PageHighMem(page_array[i])) {
> +			ret = -EINVAL;
> +			goto err;
> +		}
> +	}
> +

What do you think about throwing a printk_once in there that explains
the problem?  I'm worried that this will fail somewhat randomly, and it
may not be apparent to the user why.  We should also add documentation,
of course, and encourage developers to add fallbacks for this case.

-Jeff

Jeff Moyer Oct. 3, 2023, 6:24 p.m. UTC | #3
Jens Axboe <axboe@kernel.dk> writes:

> On 10/3/23 10:30 AM, Jeff Moyer wrote:
>> Hi, Jens,
>> 
[snip]
>> What do you think about throwing a printk_once in there that explains
>> the problem?  I'm worried that this will fail somewhat randomly, and it
>> may not be apparent to the user why.  We should also add documentation,
>> of course, and encourage developers to add fallbacks for this case.
>
> For both cases posted, it's rather more advanced use cases. And 32-bit
> isn't so prevalent anymore, thankfully. I was going to add to the man
> pages explaining this failure case. Not sure it's worth adding a printk
> for though.

I try not to make decisions based on how prevalent I think a particular
configuration is (mainly because I'm usually wrong).  Anyway, it's not a
big deal, I'm glad you gave it some thought.

> FWIW, once I got an arm32 vm setup, it fails every time for me. Not sure
> how it'd do on 32-bit x86, similarly or more randomly. But yeah it's
> definitely at the mercy of how things are mapped.

...and potentially the load on the system.  Anyway, it's fine with me to
keep it as is.  We can always add a warning later if it ends up being a
problem.

Thanks!
Jeff

Jens Axboe Oct. 3, 2023, 6:25 p.m. UTC | #4
On 10/3/23 12:24 PM, Jeff Moyer wrote:
> Jens Axboe <axboe@kernel.dk> writes:
> 
>> On 10/3/23 10:30 AM, Jeff Moyer wrote:
>>> Hi, Jens,
>>>
> [snip]
>>> What do you think about throwing a printk_once in there that explains
>>> the problem?  I'm worried that this will fail somewhat randomly, and it
>>> may not be apparent to the user why.  We should also add documentation,
>>> of course, and encourage developers to add fallbacks for this case.
>>
>> For both cases posted, it's rather more advanced use cases. And 32-bit
>> isn't so prevalent anymore, thankfully. I was going to add to the man
>> pages explaining this failure case. Not sure it's worth adding a printk
>> for though.
> 
> I try not to make decisions based on how prevalent I think a particular
> configuration is (mainly because I'm usually wrong).  Anyway, it's not a
> big deal, I'm glad you gave it some thought.

Me neither, but I think we can all safely agree that 32-bit highmem is
thankfully not on the uptick :-)

>> FWIW, once I got an arm32 vm setup, it fails every time for me. Not sure
>> how it'd do on 32-bit x86, similarly or more randomly. But yeah it's
>> definitely at the mercy of how things are mapped.
> 
> ...and potentially the load on the system.  Anyway, it's fine with me to
> keep it as is.  We can always add a warning later if it ends up being a
> problem.

Certainly!

Patch

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 783ed0fff71b..d839a80a6751 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2686,7 +2686,7 @@  static void *__io_uaddr_map(struct page ***pages, unsigned short *npages,
 {
 	struct page **page_array;
 	unsigned int nr_pages;
-	int ret;
+	int ret, i;
 
 	*npages = 0;
 
@@ -2716,6 +2716,20 @@  static void *__io_uaddr_map(struct page ***pages, unsigned short *npages,
 	 */
 	if (page_array[0] != page_array[ret - 1])
 		goto err;
+
+	/*
+	 * Can't support mapping user allocated ring memory on 32-bit archs
+	 * where it could potentially reside in highmem. Just fail those with
+	 * -EINVAL, just like we did on kernels that didn't support this
+	 * feature.
+	 */
+	for (i = 0; i < nr_pages; i++) {
+		if (PageHighMem(page_array[i])) {
+			ret = -EINVAL;
+			goto err;
+		}
+	}
+
 	*pages = page_array;
 	*npages = nr_pages;
 	return page_to_virt(page_array[0]);
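
For readers who want to reason about the added loop outside the kernel,
here is a userspace model of it. It is only a sketch: struct model_page
and model_check_no_highmem are hypothetical names, and the highmem flag
stands in for what PageHighMem() reports on a real struct page. The
check matters because the function ends by returning
page_to_virt(page_array[0]), which presumes the pages have a permanent
kernel mapping; highmem pages do not.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; it only models the highmem bit. */
struct model_page {
	bool highmem;
};

/*
 * Mirrors the loop in the patch: returns 0 if every page is lowmem,
 * or -EINVAL as soon as one highmem page is found.
 */
static int model_check_no_highmem(const struct model_page *pages,
				  size_t nr_pages)
{
	size_t i;

	assert(pages != NULL || nr_pages == 0);

	for (i = 0; i < nr_pages; i++) {
		if (pages[i].highmem)
			return -EINVAL;
	}
	return 0;
}
```

As in the patch, a single highmem page anywhere in the range is enough
to reject the whole mapping.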