[v4,8/8] sd_zbc: clear zone resources for non-zoned case

Message ID 20210128044733.503606-9-damien.lemoal@wdc.com (mailing list archive)
State New, archived
Series block: add zone write granularity limit

Commit Message

Damien Le Moal Jan. 28, 2021, 4:47 a.m. UTC
For a host-aware ZBC disk, setting the device zoned model to BLK_ZONED_HA
using blk_queue_set_zoned() in sd_read_block_characteristics() may
result in the block device's effective zoned model being "none"
(BLK_ZONED_NONE) if partitions are present on the device. In this case,
sd_zbc_read_zones() should not set up the zone related queue limits for
the disk, so that the device limits and configuration are consistent with
a regular disk and resources are not uselessly allocated (e.g. the zone
write pointer tracking array for zone append emulation).

Furthermore, if the disk zoned model changes at run time due to the
creation of a partition by the user, the zone related resources can be
released.

Fix both problems by introducing the function sd_zbc_clear_zone_info()
to reset the scsi disk zone information and free resources and by
returning early in sd_zbc_read_zones() for a block device that has a
zoned model equal to BLK_ZONED_NONE.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/scsi/sd_zbc.c | 37 ++++++++++++++++++++++++++++++++-----
 1 file changed, 32 insertions(+), 5 deletions(-)
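
For readers without the driver source at hand, the control flow described in the commit message can be sketched as a small userspace model. This is only an illustration under stated assumptions: struct model_disk, model_clear_zone_info() and model_read_zones() are hypothetical stand-ins, not the kernel's types or functions, and the rev_mutex serialization from the real patch is left out to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the zone fields of struct scsi_disk. */
struct model_disk {
	bool queue_is_zoned;		/* models blk_queue_is_zoned(q) */
	unsigned int nr_zones;
	unsigned int zone_blocks;
	unsigned int *zones_wp_offset;	/* write pointer tracking array */
};

/* Models sd_zbc_clear_zone_info(): free resources and reset zone info. */
void model_clear_zone_info(struct model_disk *d)
{
	assert(d != NULL);

	/* free(NULL) is a no-op, so calling this twice is harmless. */
	free(d->zones_wp_offset);
	d->zones_wp_offset = NULL;

	d->nr_zones = 0;
	d->zone_blocks = 0;
}

/*
 * Models the revalidate path of sd_zbc_read_zones().
 * Returns 0 on success and -1 if the tracking array cannot be allocated.
 */
int model_read_zones(struct model_disk *d, unsigned int nr_zones,
		     unsigned int zone_blocks)
{
	unsigned int *wp;

	if (!d->queue_is_zoned) {
		/* Effective model is "none": clear zone info, exit early. */
		model_clear_zone_info(d);
		return 0;
	}

	wp = calloc(nr_zones, sizeof(*wp));
	if (!wp)
		return -1;

	free(d->zones_wp_offset);
	d->zones_wp_offset = wp;
	d->nr_zones = nr_zones;
	d->zone_blocks = zone_blocks;
	return 0;
}
```

The point of the sketch is that the same clearing helper serves both the release path and the early-return path, so a disk whose effective model became "none" ends up with no zone resources after the next revalidate.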

Comments

Chaitanya Kulkarni Jan. 28, 2021, 5:38 a.m. UTC | #1
On 1/27/21 20:47, Damien Le Moal wrote:
> -void sd_zbc_release_disk(struct scsi_disk *sdkp)
> +static void sd_zbc_clear_zone_info(struct scsi_disk *sdkp)
>  {
> +	/* Serialize against revalidate zones */
> +	mutex_lock(&sdkp->rev_mutex);
> +
>  	kvfree(sdkp->zones_wp_offset);
>  	sdkp->zones_wp_offset = NULL;
>  	kfree(sdkp->zone_wp_update_buf);
>  	sdkp->zone_wp_update_buf = NULL;
> +
> +	sdkp->nr_zones = 0;
> +	sdkp->rev_nr_zones = 0;
> +	sdkp->zone_blocks = 0;
> +	sdkp->rev_zone_blocks = 0;
> +
> +	mutex_unlock(&sdkp->rev_mutex);
> +}
> +
> +void sd_zbc_release_disk(struct scsi_disk *sdkp)
> +{
> +	if (sd_is_zoned(sdkp))
> +		sd_zbc_clear_zone_info(sdkp);
>  }
>  
If I'm not wrong, there is only one caller for sd_zbc_clear_zone_info().
Is there any reason why sd_zbc_clear_zone_info() is not open coded with
a meaningful comment in sd_zbc_release_disk()? E.g.:

void sd_zbc_release_disk(struct scsi_disk *sdkp)
{
	if (!sd_is_zoned(sdkp))
		return;

	/* Serialize against revalidate zones */
	mutex_lock(&sdkp->rev_mutex);

	kvfree(sdkp->zones_wp_offset);
	sdkp->zones_wp_offset = NULL;
	kfree(sdkp->zone_wp_update_buf);
	sdkp->zone_wp_update_buf = NULL;

	/* clear zone info */
	sdkp->nr_zones = 0;
	sdkp->rev_nr_zones = 0;
	sdkp->zone_blocks = 0;
	sdkp->rev_zone_blocks = 0;

	mutex_unlock(&sdkp->rev_mutex);
}


Unless I'm missing something. In either case, LGTM.

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Damien Le Moal Jan. 28, 2021, 5:40 a.m. UTC | #2
On 2021/01/28 14:38, Chaitanya Kulkarni wrote:
> On 1/27/21 20:47, Damien Le Moal wrote:
>> -void sd_zbc_release_disk(struct scsi_disk *sdkp)
>> +static void sd_zbc_clear_zone_info(struct scsi_disk *sdkp)
>>  {
>> +	/* Serialize against revalidate zones */
>> +	mutex_lock(&sdkp->rev_mutex);
>> +
>>  	kvfree(sdkp->zones_wp_offset);
>>  	sdkp->zones_wp_offset = NULL;
>>  	kfree(sdkp->zone_wp_update_buf);
>>  	sdkp->zone_wp_update_buf = NULL;
>> +
>> +	sdkp->nr_zones = 0;
>> +	sdkp->rev_nr_zones = 0;
>> +	sdkp->zone_blocks = 0;
>> +	sdkp->rev_zone_blocks = 0;
>> +
>> +	mutex_unlock(&sdkp->rev_mutex);
>> +}
>> +
>> +void sd_zbc_release_disk(struct scsi_disk *sdkp)
>> +{
>> +	if (sd_is_zoned(sdkp))
>> +		sd_zbc_clear_zone_info(sdkp);
>>  }
>>  
> If I'm not wrong there is only one caller for sd_zbc_clear_zone_info().
> Is there any reason why sd_zbc_clear_zone_info() is not open coded with
> a meaningful comment in sd_zbc_release_disk() ? e.g. :-

There are 2 call sites: sd_zbc_read_zones() and sd_zbc_release_disk().

> 
> void sd_zbc_release_disk(struct scsi_disk *sdkp)
> {
> 	if (!sd_is_zoned(sdkp))
> 		return; 
> 	/* Serialize against revalidate zones */
> 	mutex_lock(&sdkp->rev_mutex);
> 
>  	kvfree(sdkp->zones_wp_offset);
>  	sdkp->zones_wp_offset = NULL;
>  	kfree(sdkp->zone_wp_update_buf);
>  	sdkp->zone_wp_update_buf = NULL;
> 
> 	/* clear zone info */
> 	sdkp->nr_zones = 0;
> 	sdkp->rev_nr_zones = 0;
> 	sdkp->zone_blocks = 0;
> 	sdkp->rev_zone_blocks = 0;
> 
> 	mutex_unlock(&sdkp->rev_mutex);
>  }
> 
> 
> unless I miss something, in either case LGTM.
> 
> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> 
>
Christoph Hellwig Jan. 28, 2021, 9:24 a.m. UTC | #3
On Thu, Jan 28, 2021 at 01:47:33PM +0900, Damien Le Moal wrote:
> For host-aware ZBC disk, setting the device zoned model to BLK_ZONED_HA
> using blk_queue_set_zoned() in sd_read_block_characteristics() may
> result in the block device effective zoned model to be "none"
> (BLK_ZONED_NONE) if partitions are present on the device. In this case,
> sd_zbc_read_zones() should not setup the zone related queue limits for
> the disk so that the device limits and configuration is consistent with
> a regular disk and resources not uselessly allocated (e.g. the zone
> write pointer tracking array for zone append emulation).
> 
> Furthermore, if the disk zoned model changes at run time due to the
> creation of a partition by the user, the zone related resources can be
> released.
> 
> Fix both problems by introducing the function sd_zbc_clear_zone_info()
> to reset the scsi disk zone information and free resources and by
> returning early in sd_zbc_read_zones() for a block device that has a
> zoned model equal to BLK_ZONED_NONE.

So creating the partition doesn't even call into the driver, which
means we'll leak the info for now.  But I guess the next revalidate
will simply clean it up, so it is not a major issue.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
Damien Le Moal Jan. 28, 2021, 9:36 a.m. UTC | #4
On 2021/01/28 18:24, Christoph Hellwig wrote:
> On Thu, Jan 28, 2021 at 01:47:33PM +0900, Damien Le Moal wrote:
>> For host-aware ZBC disk, setting the device zoned model to BLK_ZONED_HA
>> using blk_queue_set_zoned() in sd_read_block_characteristics() may
>> result in the block device effective zoned model to be "none"
>> (BLK_ZONED_NONE) if partitions are present on the device. In this case,
>> sd_zbc_read_zones() should not setup the zone related queue limits for
>> the disk so that the device limits and configuration is consistent with
>> a regular disk and resources not uselessly allocated (e.g. the zone
>> write pointer tracking array for zone append emulation).
>>
>> Furthermore, if the disk zoned model changes at run time due to the
>> creation of a partition by the user, the zone related resources can be
>> released.
>>
>> Fix both problems by introducing the function sd_zbc_clear_zone_info()
>> to reset the scsi disk zone information and free resources and by
>> returning early in sd_zbc_read_zones() for a block device that has a
>> zoned model equal to BLK_ZONED_NONE.
> 
> So creating the partition doesn't even call into the driver, which
> means we'll leak the info for now.  But I guess the next revalidate
> will simply clean it up, so it is not a major issue.

Exactly. But the leak is now limited to the sd-level resources;
blk_queue_set_zoned() cleans up everything else.
The super annoying thing is that deleting all partitions leaves the disk in
regular mode instead of returning it to zoned mode. A revalidate or a manual
rescan is needed for that to happen. I wonder if we should always trigger a
revalidate, so that the user does not see the device type suddenly changing
long after the partitions were deleted...
Adding such a revalidate for partition creation would solve the "leak" problem
(or rather the lack of cleanup) too.

> 
> Looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
>
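
The behaviour discussed in this exchange can be illustrated with a simplified userspace state model. It is a sketch built only from what the thread states: the names (struct model_dev, model_revalidate(), and so on) are hypothetical, the revalidate_now flag models the "always trigger a revalidate" idea floated above rather than anything in the posted patch, and none of this is kernel code.

```c
#include <stdbool.h>

enum model_zoned { MODEL_ZONED_NONE, MODEL_ZONED_HA, MODEL_ZONED_HM };

/* Hypothetical model of a disk as seen by the block layer and by sd. */
struct model_dev {
	enum model_zoned reported;	/* model reported by the device */
	enum model_zoned effective;	/* model currently exposed to users */
	unsigned int nr_partitions;
	bool sd_zone_info;		/* sd-level zone resources allocated? */
};

/* Models a revalidate: recompute the effective model and sd resources. */
void model_revalidate(struct model_dev *dev)
{
	if (dev->reported == MODEL_ZONED_HA && dev->nr_partitions > 0)
		dev->effective = MODEL_ZONED_NONE;
	else
		dev->effective = dev->reported;

	dev->sd_zone_info = (dev->effective != MODEL_ZONED_NONE);
}

/*
 * Models partition creation. The block layer side switches a host-aware
 * disk to "none" right away, but the driver is not called, so the sd-level
 * zone info stays allocated until a revalidate happens.
 */
void model_add_partition(struct model_dev *dev, bool revalidate_now)
{
	dev->nr_partitions++;
	if (dev->reported == MODEL_ZONED_HA)
		dev->effective = MODEL_ZONED_NONE;
	if (revalidate_now)
		model_revalidate(dev);
}

/*
 * Models deleting all partitions. Without a revalidate, the disk stays in
 * regular mode even though nothing prevents it from being zoned again.
 */
void model_delete_all_partitions(struct model_dev *dev, bool revalidate_now)
{
	dev->nr_partitions = 0;
	if (revalidate_now)
		model_revalidate(dev);
}
```

With revalidate_now set to false the model reproduces both complaints: stale sd-level zone info after partition creation, and a disk stuck in regular mode after partition deletion. Setting it to true removes both in this model.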
Johannes Thumshirn Jan. 28, 2021, 11:48 a.m. UTC | #5
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Martin K. Petersen Feb. 5, 2021, 2:56 a.m. UTC | #6
Damien,

> For host-aware ZBC disk, setting the device zoned model to
> BLK_ZONED_HA using blk_queue_set_zoned() in
> sd_read_block_characteristics() may result in the block device
> effective zoned model to be "none" (BLK_ZONED_NONE) if partitions are
> present on the device. In this case, sd_zbc_read_zones() should not
> setup the zone related queue limits for the disk so that the device
> limits and configuration is consistent with a regular disk and
> resources not uselessly allocated (e.g. the zone write pointer
> tracking array for zone append emulation).
>
> Furthermore, if the disk zoned model changes at run time due to the
> creation of a partition by the user, the zone related resources can be
> released.
>
> Fix both problems by introducing the function sd_zbc_clear_zone_info()
> to reset the scsi disk zone information and free resources and by
> returning early in sd_zbc_read_zones() for a block device that has a
> zoned model equal to BLK_ZONED_NONE.

Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

Patch

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 8293b29584b3..03adb39293c2 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -665,12 +665,28 @@  static int sd_zbc_init_disk(struct scsi_disk *sdkp)
 	return 0;
 }
 
-void sd_zbc_release_disk(struct scsi_disk *sdkp)
+static void sd_zbc_clear_zone_info(struct scsi_disk *sdkp)
 {
+	/* Serialize against revalidate zones */
+	mutex_lock(&sdkp->rev_mutex);
+
 	kvfree(sdkp->zones_wp_offset);
 	sdkp->zones_wp_offset = NULL;
 	kfree(sdkp->zone_wp_update_buf);
 	sdkp->zone_wp_update_buf = NULL;
+
+	sdkp->nr_zones = 0;
+	sdkp->rev_nr_zones = 0;
+	sdkp->zone_blocks = 0;
+	sdkp->rev_zone_blocks = 0;
+
+	mutex_unlock(&sdkp->rev_mutex);
+}
+
+void sd_zbc_release_disk(struct scsi_disk *sdkp)
+{
+	if (sd_is_zoned(sdkp))
+		sd_zbc_clear_zone_info(sdkp);
 }
 
 static void sd_zbc_revalidate_zones_cb(struct gendisk *disk)
@@ -769,6 +785,21 @@  int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 		 */
 		return 0;
 
+	/* READ16/WRITE16 is mandatory for ZBC disks */
+	sdkp->device->use_16_for_rw = 1;
+	sdkp->device->use_10_for_rw = 0;
+
+	if (!blk_queue_is_zoned(q)) {
+		/*
+		 * This can happen for a host aware disk with partitions.
+		 * The block device zone information was already cleared
+		 * by blk_queue_set_zoned(). Only clear the scsi disk zone
+		 * information and exit early.
+		 */
+		sd_zbc_clear_zone_info(sdkp);
+		return 0;
+	}
+
 	/* Check zoned block device characteristics (unconstrained reads) */
 	ret = sd_zbc_check_zoned_characteristics(sdkp, buf);
 	if (ret)
@@ -797,10 +828,6 @@  int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 	if (blk_queue_zoned_model(q) == BLK_ZONED_HM)
 		blk_queue_zone_write_granularity(q, sdkp->physical_block_size);
 
-	/* READ16/WRITE16 is mandatory for ZBC disks */
-	sdkp->device->use_16_for_rw = 1;
-	sdkp->device->use_10_for_rw = 0;
-
 	sdkp->rev_nr_zones = nr_zones;
 	sdkp->rev_zone_blocks = zone_blocks;