diff mbox series

[v2,1/5] sgl_alloc_order: remove 4 GiB limit

Message ID 20221112194939.4823-2-dgilbert@interlog.com (mailing list archive)
State Changes Requested
Headers show
Series scatterlist: add operations for scsi_debug | expand

Commit Message

Douglas Gilbert Nov. 12, 2022, 7:49 p.m. UTC
This patch fixes a check done by sgl_alloc_order() before it starts
any allocations. The comment in the original said: "Check for integer
overflow" but the right hand side of the expression in the condition
is resolved as u32 so it can not exceed UINT32_MAX (4 GiB) which
means 'length' can not exceed that value.

This function may be used to replace vmalloc(unsigned long) for a
large allocation (e.g. a ramdisk). vmalloc has no limit at 4 GiB so
it seems unreasonable that sgl_alloc_order() whose length type is
unsigned long long should be limited to 4 GB.

Solutions to this issue were discussed by Jason Gunthorpe
<jgg@ziepe.ca> and Bodo Stroesser <bostroesser@gmail.com>. This
version is base on a linux-scsi post by Jason titled: "Re:
[PATCH v7 1/4] sgl_alloc_order: remove 4 GiB limit" dated 20220201.

An earlier patch fixed a memory leak in sg_alloc_order() due to the
misuse of sgl_free(). Take the opportunity to put a one line comment
above sgl_free()'s declaration warning that it is not suitable when
order > 0 .

Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
---
 include/linux/scatterlist.h |  1 +
 lib/scatterlist.c           | 23 ++++++++++++++---------
 2 files changed, 15 insertions(+), 9 deletions(-)

Comments

Jason Gunthorpe Nov. 15, 2022, 8:33 p.m. UTC | #1
On Sat, Nov 12, 2022 at 02:49:35PM -0500, Douglas Gilbert wrote:
> This patch fixes a check done by sgl_alloc_order() before it starts
> any allocations. The comment in the original said: "Check for integer
> overflow" but the right hand side of the expression in the condition
> is resolved as u32 so it can not exceed UINT32_MAX (4 GiB) which
> means 'length' can not exceed that value.
> 
> This function may be used to replace vmalloc(unsigned long) for a
> large allocation (e.g. a ramdisk). vmalloc has no limit at 4 GiB so
> it seems unreasonable that sgl_alloc_order() whose length type is
> unsigned long long should be limited to 4 GB.
> 
> Solutions to this issue were discussed by Jason Gunthorpe
> <jgg@ziepe.ca> and Bodo Stroesser <bostroesser@gmail.com>. This
> version is base on a linux-scsi post by Jason titled: "Re:
> [PATCH v7 1/4] sgl_alloc_order: remove 4 GiB limit" dated 20220201.
> 
> An earlier patch fixed a memory leak in sg_alloc_order() due to the
> misuse of sgl_free(). Take the opportunity to put a one line comment
> above sgl_free()'s declaration warning that it is not suitable when
> order > 0 .
> 
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Bodo Stroesser <bostroesser@gmail.com>
> Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
> ---
>  include/linux/scatterlist.h |  1 +
>  lib/scatterlist.c           | 23 ++++++++++++++---------
>  2 files changed, 15 insertions(+), 9 deletions(-)

I still prefer the version I posted here:

https://lore.kernel.org/linux-scsi/Y1aDQznakNaWD8kd@ziepe.ca/

Jason
Douglas Gilbert Nov. 16, 2022, 12:20 a.m. UTC | #2
On 2022-11-15 15:33, Jason Gunthorpe wrote:
> On Sat, Nov 12, 2022 at 02:49:35PM -0500, Douglas Gilbert wrote:
>> This patch fixes a check done by sgl_alloc_order() before it starts
>> any allocations. The comment in the original said: "Check for integer
>> overflow" but the right hand side of the expression in the condition
>> is resolved as u32 so it can not exceed UINT32_MAX (4 GiB) which
>> means 'length' can not exceed that value.
>>
>> This function may be used to replace vmalloc(unsigned long) for a
>> large allocation (e.g. a ramdisk). vmalloc has no limit at 4 GiB so
>> it seems unreasonable that sgl_alloc_order() whose length type is
>> unsigned long long should be limited to 4 GB.
>>
>> Solutions to this issue were discussed by Jason Gunthorpe
>> <jgg@ziepe.ca> and Bodo Stroesser <bostroesser@gmail.com>. This
>> version is base on a linux-scsi post by Jason titled: "Re:
>> [PATCH v7 1/4] sgl_alloc_order: remove 4 GiB limit" dated 20220201.
>>
>> An earlier patch fixed a memory leak in sg_alloc_order() due to the
>> misuse of sgl_free(). Take the opportunity to put a one line comment
>> above sgl_free()'s declaration warning that it is not suitable when
>> order > 0 .
>>
>> Cc: Jason Gunthorpe <jgg@ziepe.ca>
>> Cc: Bodo Stroesser <bostroesser@gmail.com>
>> Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
>> ---
>>   include/linux/scatterlist.h |  1 +
>>   lib/scatterlist.c           | 23 ++++++++++++++---------
>>   2 files changed, 15 insertions(+), 9 deletions(-)
> 
> I still prefer the version I posted here:
> 
> https://lore.kernel.org/linux-scsi/Y1aDQznakNaWD8kd@ziepe.ca/

Three reasons that I don't:
   1) making the first argument of type size_t may constrict the size
      that can be allocated on a 32 bit machine (faint recollection of
      extended/expanded memory on 8086). uint64_t would be better
      than unsigned long long but see point 3)
   2) making the last (fifth) argument of type size_t is overkill on a
      64 bit machine. IMO 32 bits is sufficient. The maximum unsigned int
      is 2^32 - 1 and with a typical PAGE_SIZE of 4096 bytes and order 0,
      that is roughly 2^44 bytes or about 16 TB. If part of the kernel
      did want 16 TB in a single allocation, I hope it would choose a
      larger value for order. So then the maximum single allocation
      would be 2^(44+MAX_ORDER-1) bytes. Can I stop now?
   3) it changes the signature of an existing exported kernel function
      requiring changes in several call sites. Changing an output pointer
      type may require more than a one line change at the existing call
      sites. Due to the fact that this patch is removing an existing
      4 GB limit, those call sites have zero need for this. If I was
      maintaining the driver containing those call sites, I would be
      a bit peeved. [That said, maintaining out-of-tree patchsets, while
      trying to get them accepted in the mainline, is a considerable
      pain due to the constant changes in the block layer API.]

Doug Gilbert
Jason Gunthorpe Nov. 16, 2022, 12:39 a.m. UTC | #3
On Tue, Nov 15, 2022 at 07:20:13PM -0500, Douglas Gilbert wrote:
> On 2022-11-15 15:33, Jason Gunthorpe wrote:
> > On Sat, Nov 12, 2022 at 02:49:35PM -0500, Douglas Gilbert wrote:
> > > This patch fixes a check done by sgl_alloc_order() before it starts
> > > any allocations. The comment in the original said: "Check for integer
> > > overflow" but the right hand side of the expression in the condition
> > > is resolved as u32 so it can not exceed UINT32_MAX (4 GiB) which
> > > means 'length' can not exceed that value.
> > > 
> > > This function may be used to replace vmalloc(unsigned long) for a
> > > large allocation (e.g. a ramdisk). vmalloc has no limit at 4 GiB so
> > > it seems unreasonable that sgl_alloc_order() whose length type is
> > > unsigned long long should be limited to 4 GB.
> > > 
> > > Solutions to this issue were discussed by Jason Gunthorpe
> > > <jgg@ziepe.ca> and Bodo Stroesser <bostroesser@gmail.com>. This
> > > version is base on a linux-scsi post by Jason titled: "Re:
> > > [PATCH v7 1/4] sgl_alloc_order: remove 4 GiB limit" dated 20220201.
> > > 
> > > An earlier patch fixed a memory leak in sg_alloc_order() due to the
> > > misuse of sgl_free(). Take the opportunity to put a one line comment
> > > above sgl_free()'s declaration warning that it is not suitable when
> > > order > 0 .
> > > 
> > > Cc: Jason Gunthorpe <jgg@ziepe.ca>
> > > Cc: Bodo Stroesser <bostroesser@gmail.com>
> > > Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
> > > ---
> > >   include/linux/scatterlist.h |  1 +
> > >   lib/scatterlist.c           | 23 ++++++++++++++---------
> > >   2 files changed, 15 insertions(+), 9 deletions(-)
> > 
> > I still prefer the version I posted here:
> > 
> > https://lore.kernel.org/linux-scsi/Y1aDQznakNaWD8kd@ziepe.ca/
> 
> Three reasons that I don't:
>   1) making the first argument of type size_t may constrict the size
>      that can be allocated on a 32 bit machine (faint recollection of
>      extended/expanded memory on 8086). uint64_t would be better
>      than unsigned long long but see point 3)

32 bit machines can't kmap more than size_t - so this is not
correct. We can't put sgl tables into highmem.

>   2) making the last (fifth) argument of type size_t is overkill on a
>      64 bit machine. IMO 32 bits is sufficient. 

IIRC, I changed it to obviously avoid integer promotion/truncation
issues. It is better to handle those with correct typing than
introducing a bunch of frail checks. We don't need to worry about the
extra 32 bits in something like this.

>   3) it changes the signature of an existing exported kernel function
>      requiring changes in several call sites. 

So fix them. It is why we have one git tree. You'll get sympathy if it
is more than 5-10 :)

>      type may require more than a one line change at the existing call
>      sites. Due to the fact that this patch is removing an existing
>      4 GB limit, those call sites have zero need for this. If I was
>      maintaining the driver containing those call sites, I would be
>      a bit peeved.

Uh, if someone is "peeved" they are not understanding how kernel APIs
are expected to evolve, I think.

It should be two patches, one to correct the types in the function
signature, and another to resolve the 4G problem.

>     [That said, maintaining out-of-tree patchsets, while
>      trying to get them accepted in the mainline, is a considerable
>      pain due to the constant changes in the block layer API.]

Which is consistent with how the community views in-kernel APIs.

Jason
diff mbox series

Patch

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 375a5e90d86a..0930755a756e 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -426,6 +426,7 @@  struct scatterlist *sgl_alloc(unsigned long long length, gfp_t gfp,
 			      unsigned int *nent_p);
 void sgl_free_n_order(struct scatterlist *sgl, int nents, int order);
 void sgl_free_order(struct scatterlist *sgl, int order);
+/* Only use sgl_free() when order is 0 */
 void sgl_free(struct scatterlist *sgl);
 #endif /* CONFIG_SGL_ALLOC */
 
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index c8c3d675845c..ee69d33d1228 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -585,13 +585,16 @@  EXPORT_SYMBOL(sg_alloc_table_from_pages_segment);
 #ifdef CONFIG_SGL_ALLOC
 
 /**
- * sgl_alloc_order - allocate a scatterlist and its pages
+ * sgl_alloc_order - allocate a scatterlist with equally sized elements each
+ *		     of which has 2^@order continuous pages
  * @length: Length in bytes of the scatterlist. Must be at least one
- * @order: Second argument for alloc_pages()
+ * @order:  Second argument for alloc_pages(). Each sgl element size will
+ *	    be (PAGE_SIZE*2^@order) bytes. @order must not exceed 16.
  * @chainable: Whether or not to allocate an extra element in the scatterlist
- *	for scatterlist chaining purposes
+ *	       for scatterlist chaining purposes
  * @gfp: Memory allocation flags
- * @nent_p: [out] Number of entries in the scatterlist that have pages
+ * @nent_p: [out] Number of entries in the scatterlist that have pages.
+ *		  Ignored if @nent_p is NULL.
  *
  * Returns: A pointer to an initialized scatterlist or %NULL upon failure.
  */
@@ -601,14 +604,16 @@  struct scatterlist *sgl_alloc_order(unsigned long long length,
 {
 	struct scatterlist *sgl, *sg;
 	struct page *page;
-	unsigned int nent, nalloc;
+	uint64_t nent;
+	unsigned int nalloc;
 	u32 elem_len;
 
+	if (WARN_ON_ONCE(order >= MAX_ORDER))
+		return NULL;
 	nent = round_up(length, PAGE_SIZE << order) >> (PAGE_SHIFT + order);
-	/* Check for integer overflow */
-	if (length > (nent << (PAGE_SHIFT + order)))
+	if (nent > UINT_MAX)
 		return NULL;
-	nalloc = nent;
+	nalloc = (unsigned int)nent;
 	if (chainable) {
 		/* Check for integer overflow */
 		if (nalloc + 1 < nalloc)
@@ -636,7 +641,7 @@  struct scatterlist *sgl_alloc_order(unsigned long long length,
 	}
 	WARN_ONCE(length, "length = %lld\n", length);
 	if (nent_p)
-		*nent_p = nent;
+		*nent_p = (unsigned int)nent;
 	return sgl;
 }
 EXPORT_SYMBOL(sgl_alloc_order);