diff mbox series

[8/9] scsi: sd_zbc: Cleanup sd_zbc_alloc_report_buffer()

Message ID 20191108015702.233102-9-damien.lemoal@wdc.com (mailing list archive)
State New, archived
Headers show
Series Zoned block device enhancements and zone report rework | expand

Commit Message

Damien Le Moal Nov. 8, 2019, 1:57 a.m. UTC
There is no need to arbitrarily limit the size of a report zone to the
number of zones defined by SD_ZBC_REPORT_MAX_ZONES. Rather, simply
calculate the report buffer size needed for the requested number of
zones without exceeding the device total number of zones. This buffer
size limitation to the hardware maximum transfer size and page mapping
capabilities is kept unchanged. Starting with this initial buffer size,
the allocation is optimized by iterating over decreasing buffer size
until the allocation succeeds. This ensures forward progress for zone
reports and avoids failures of zones revalidation under memory pressure.

While at it, also replace the hard coded 512 B sector size with the
SECTOR_SIZE macro.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/scsi/sd_zbc.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

Comments

Christoph Hellwig Nov. 8, 2019, 6:31 a.m. UTC | #1
Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
Hannes Reinecke Nov. 8, 2019, 7:20 a.m. UTC | #2
On 11/8/19 2:57 AM, Damien Le Moal wrote:
> There is no need to arbitrarily limit the size of a report zone to the
> number of zones defined by SD_ZBC_REPORT_MAX_ZONES. Rather, simply
> calculate the report buffer size needed for the requested number of
> zones without exceeding the device total number of zones. This buffer
> size limitation to the hardware maximum transfer size and page mapping
> capabilities is kept unchanged. Starting with this initial buffer size,
> the allocation is optimized by iterating over decreasing buffer size
> until the allocation succeeds. This ensures forward progress for zone
> reports and avoids failures of zones revalidation under memory pressure.
> 
> While at it, also replace the hard coded 512 B sector size with the
> SECTOR_SIZE macro.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> ---
>  drivers/scsi/sd_zbc.c | 22 +++++++++++-----------
>  1 file changed, 11 insertions(+), 11 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
Bart Van Assche Nov. 8, 2019, 7:06 p.m. UTC | #3
On 11/7/19 5:57 PM, Damien Le Moal wrote:
> -	buf = vzalloc(bufsize);
> -	if (buf)
> -		*buflen = bufsize;
> +	while (bufsize >= SECTOR_SIZE) {
> +		buf = vzalloc(bufsize);
> +		if (buf) {
> +			*buflen = bufsize;
> +			return buf;
> +		}
> +		bufsize >>= 1;
> +	}

Hi Damien,

Has it been considered to pass the __GFP_NORETRY flag to this vzalloc() 
call?

Thanks,

Bart.
Damien Le Moal Nov. 9, 2019, 2:54 a.m. UTC | #4
On 2019/11/09 4:06, Bart Van Assche wrote:
> On 11/7/19 5:57 PM, Damien Le Moal wrote:
>> -	buf = vzalloc(bufsize);
>> -	if (buf)
>> -		*buflen = bufsize;
>> +	while (bufsize >= SECTOR_SIZE) {
>> +		buf = vzalloc(bufsize);
>> +		if (buf) {
>> +			*buflen = bufsize;
>> +			return buf;
>> +		}
>> +		bufsize >>= 1;
>> +	}
> 
> Hi Damien,
> 
> Has it been considered to pass the __GFP_NORETRY flag to this vzalloc() 
> call?

Do you mean using

__vmalloc(bufsize,
	  GFP_KERNEL | __GFP_ZERO | __GFP_NORETRY, PAGE_KERNEL);

instead of vzalloc() ? (since we cannot pass GFP flags to vzalloc()...)

Note that this is called with GFP_NOIO set for the caller context in the
case of revalidate zones, and default to GFP_KERNEL for
blkdev_report_zones() unless the caller also tweaks the context memalloc
flags.

> 
> Thanks,
> 
> Bart.
>
Bart Van Assche Nov. 9, 2019, 3:02 a.m. UTC | #5
On 2019-11-08 18:54, Damien Le Moal wrote:
> On 2019/11/09 4:06, Bart Van Assche wrote:
>> On 11/7/19 5:57 PM, Damien Le Moal wrote:
>>> -	buf = vzalloc(bufsize);
>>> -	if (buf)
>>> -		*buflen = bufsize;
>>> +	while (bufsize >= SECTOR_SIZE) {
>>> +		buf = vzalloc(bufsize);
>>> +		if (buf) {
>>> +			*buflen = bufsize;
>>> +			return buf;
>>> +		}
>>> +		bufsize >>= 1;
>>> +	}
>>
>> Hi Damien,
>>
>> Has it been considered to pass the __GFP_NORETRY flag to this vzalloc() 
>> call?
> 
> Do you mean using
> 
> __vmalloc(bufsize,
> 	  GFP_KERNEL | __GFP_ZERO | __GFP_NORETRY, PAGE_KERNEL);
> 
> instead of vzalloc() ? (since we cannot pass GFP flags to vzalloc()...)
> 
> Note that this is called with GFP_NOIO set for the caller context in the
> case of revalidate zones, and default to GFP_KERNEL for
> blkdev_report_zones() unless the caller also tweaks the context memalloc
> flags.

Hi Damien,

Yes, that's what I meant. The following comment from mm/util.c explains
why __GFP_RETRY should be used if it is OK for an allocation to fail:

/*
 * We want to attempt a large physically contiguous block first because
 * it is less likely to fragment multiple larger blocks and therefore
 * contribute to a long term fragmentation less than vmalloc fallback.
 * However make sure that larger requests are not too disruptive - no
 * OOM killer and no allocation failure warnings as we have a fallback.
 */

Thanks,

Bart.
diff mbox series

Patch

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 7c4690f26698..f191af15de1b 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -104,11 +104,6 @@  static int sd_zbc_do_report_zones(struct scsi_disk *sdkp, unsigned char *buf,
 	return 0;
 }
 
-/*
- * Maximum number of zones to get with one report zones command.
- */
-#define SD_ZBC_REPORT_MAX_ZONES		8192U
-
 /**
  * Allocate a buffer for report zones reply.
  * @sdkp: The target disk
@@ -138,17 +133,22 @@  static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp,
 	 * sure that the allocated buffer can always be mapped by limiting the
 	 * number of pages allocated to the HBA max segments limit.
 	 */
-	nr_zones = min(nr_zones, SD_ZBC_REPORT_MAX_ZONES);
-	bufsize = roundup((nr_zones + 1) * 64, 512);
+	nr_zones = min(nr_zones, sdkp->nr_zones);
+	bufsize = roundup((nr_zones + 1) * 64, SECTOR_SIZE);
 	bufsize = min_t(size_t, bufsize,
 			queue_max_hw_sectors(q) << SECTOR_SHIFT);
 	bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT);
 
-	buf = vzalloc(bufsize);
-	if (buf)
-		*buflen = bufsize;
+	while (bufsize >= SECTOR_SIZE) {
+		buf = vzalloc(bufsize);
+		if (buf) {
+			*buflen = bufsize;
+			return buf;
+		}
+		bufsize >>= 1;
+	}
 
-	return buf;
+	return NULL;
 }
 
 /**