diff mbox

[25/27] block: remove the discard_zeroes_data flag

Message ID 1493782395.23202.84.camel@haakon3.risingtidesystems.com (mailing list archive)
State New, archived
Headers show

Commit Message

Nicholas A. Bellinger May 3, 2017, 3:33 a.m. UTC
On Tue, 2017-05-02 at 09:23 +0200, hch@lst.de wrote:
> On Tue, May 02, 2017 at 12:16:13AM -0700, Nicholas A. Bellinger wrote:
> > Or, another options is use bdev_write_zeroes_sectors() to determine when
> > dev_attrib->unmap_zeroes_data should be set.
> 
> Yes, that in combination with your patch to use bdev_write_zeroes_sectors
> for zeroing from write same seems like the right fix.

The larger target/iblock conversion patch looks like post v4.12 material
at this point, so to avoid breakage wrt to existing LBPRZ behavior, I'll
plan to push the following patch post -rc1.


--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Mike Snitzer May 3, 2017, 2:33 p.m. UTC | #1
On Tue, May 02 2017 at 11:33pm -0400,
Nicholas A. Bellinger <nab@linux-iscsi.org> wrote:

> On Tue, 2017-05-02 at 09:23 +0200, hch@lst.de wrote:
> > On Tue, May 02, 2017 at 12:16:13AM -0700, Nicholas A. Bellinger wrote:
> > > Or, another options is use bdev_write_zeroes_sectors() to determine when
> > > dev_attrib->unmap_zeroes_data should be set.
> > 
> > Yes, that in combination with your patch to use bdev_write_zeroes_sectors
> > for zeroing from write same seems like the right fix.
> 
> The larger target/iblock conversion patch looks like post v4.12 material
> at this point, so to avoid breakage wrt to existing LBPRZ behavior, I'll
> plan to push the following patch post -rc1.
> 
> diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c
> index d2f089c..e7caf78 100644
> --- a/drivers/target/target_core_device.c
> +++ b/drivers/target/target_core_device.c
> @@ -851,7 +851,7 @@ bool target_configure_unmap_from_queue(struct se_dev_attrib *attrib,
>         attrib->unmap_granularity = q->limits.discard_granularity / block_size;
>         attrib->unmap_granularity_alignment = q->limits.discard_alignment /
>                                                                 block_size;
> -       attrib->unmap_zeroes_data = 0;
> +       attrib->unmap_zeroes_data = (q->limits.max_write_zeroes_sectors);
>         return true;
>  }
>  EXPORT_SYMBOL(target_configure_unmap_from_queue);
> 

Completely a nit but: why the extra parenthesis?
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicholas A. Bellinger May 5, 2017, 3:10 a.m. UTC | #2
On Wed, 2017-05-03 at 10:33 -0400, Mike Snitzer wrote:
> On Tue, May 02 2017 at 11:33pm -0400,
> Nicholas A. Bellinger <nab@linux-iscsi.org> wrote:
> 
> > On Tue, 2017-05-02 at 09:23 +0200, hch@lst.de wrote:
> > > On Tue, May 02, 2017 at 12:16:13AM -0700, Nicholas A. Bellinger wrote:
> > > > Or, another options is use bdev_write_zeroes_sectors() to determine when
> > > > dev_attrib->unmap_zeroes_data should be set.
> > > 
> > > Yes, that in combination with your patch to use bdev_write_zeroes_sectors
> > > for zeroing from write same seems like the right fix.
> > 
> > The larger target/iblock conversion patch looks like post v4.12 material
> > at this point, so to avoid breakage wrt to existing LBPRZ behavior, I'll
> > plan to push the following patch post -rc1.
> > 
> > diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c
> > index d2f089c..e7caf78 100644
> > --- a/drivers/target/target_core_device.c
> > +++ b/drivers/target/target_core_device.c
> > @@ -851,7 +851,7 @@ bool target_configure_unmap_from_queue(struct se_dev_attrib *attrib,
> >         attrib->unmap_granularity = q->limits.discard_granularity / block_size;
> >         attrib->unmap_granularity_alignment = q->limits.discard_alignment /
> >                                                                 block_size;
> > -       attrib->unmap_zeroes_data = 0;
> > +       attrib->unmap_zeroes_data = (q->limits.max_write_zeroes_sectors);
> >         return true;
> >  }
> >  EXPORT_SYMBOL(target_configure_unmap_from_queue);
> > 
> 
> Completely a nit but: why the extra parenthesis?

dev_attrib->unmap_zeros_data is only compared as a bool.

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig May 7, 2017, 9:22 a.m. UTC | #3
On Tue, May 02, 2017 at 08:33:15PM -0700, Nicholas A. Bellinger wrote:
> The larger target/iblock conversion patch looks like post v4.12 material
> at this point, so to avoid breakage wrt to existing LBPRZ behavior, I'll
> plan to push the following patch post -rc1.

I don't think this is safe.  If you want to do the aboe you also
need to ensure ->execute_unmap always zeroes the data.  For actual
files in the file backend we should be all fine, but for the block
device case [1] and iblock we'd need to use blkdev_issue_zeroout
instead of blkdev_issue_discard when unmap_zeroes_data is set.

[1] which btw already seems broken as it doesn't invalidate cached
data when issuing a discard.
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicholas A. Bellinger May 9, 2017, 6:46 a.m. UTC | #4
On Sun, 2017-05-07 at 11:22 +0200, hch@lst.de wrote:
> On Tue, May 02, 2017 at 08:33:15PM -0700, Nicholas A. Bellinger wrote:
> > The larger target/iblock conversion patch looks like post v4.12 material
> > at this point, so to avoid breakage wrt to existing LBPRZ behavior, I'll
> > plan to push the following patch post -rc1.
> 
> I don't think this is safe.  If you want to do the aboe you also
> need to ensure ->execute_unmap always zeroes the data.  For actual
> files in the file backend we should be all fine, but for the block
> device case [1] and iblock we'd need to use blkdev_issue_zeroout
> instead of blkdev_issue_discard when unmap_zeroes_data is set.
> 
> [1] which btw already seems broken as it doesn't invalidate cached
> data when issuing a discard.

Mmm, for [1] that would appear to be true, but after a deeper look at
existing code I don't think this is the case.

The reason being is target backend attributes emulate_tpu and
emulate_tpws are strictly user configurable, and aren't automatically
set based upon the underlying IBLOCK block_device support for either
one.

According to pre v4.12-rc1 code, q->limits.discard_zeroes_data was only
enabled by drivers/scsi/sd.c:sd_config_discard() for
sdkp->provisioning_mode WRITE_SAME with LBPRZ = 1 or explicit ZERO, and
for NVME for devices that supported NVME_QUIRK_DISCARD_ZEROES.  Eg: Only
real DISCARD + ZERO support.

In your changes to v4.12-rc1, this logic to signal real DISCARD + zero
support for SCSI and NVMe via q->limits.max_write_zeroes_sectors has not
changed..

So AFAICT, regardless if the user sets emulate_tpu or emulate_tpws for a
IBLOCK backend, SCSI host code will have to choose sdkp->zeroing_mode
WRITE_SAME with LBPRZ or explicit ZERO, and NVMe host code will have to
chose a ctrl NVME_QUIRK_DEALLOCATE_ZEROES before
q->limits.max_write_zeroes_sectors != 0 is propagated up to target code,
and LBPRZ = 1 is signaled via READ_CAPACITY_16 and EVPD = 0xb2.

That said, simply propagating up q->limits.max_write_zeroes_sectors as
dev_attrib->unmap_zeroes_data following existing code still looks like
the right thing to do.

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig May 10, 2017, 2:06 p.m. UTC | #5
On Mon, May 08, 2017 at 11:46:14PM -0700, Nicholas A. Bellinger wrote:
> That said, simply propagating up q->limits.max_write_zeroes_sectors as
> dev_attrib->unmap_zeroes_data following existing code still looks like
> the right thing to do.

It is not.  Martin has decoupled write same/zeroes support from discard
support.  Any device will claim to support it initially, and we'll
only clear the flag if a Write Same command fails.

So even if LBPRZ is not set you can trivially get into a situation
where discard is supported through UNMAP, and you'll incorrectly
set LBPRZ and will cause data corruption.
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicholas A. Bellinger May 11, 2017, 4:50 a.m. UTC | #6
On Wed, 2017-05-10 at 16:06 +0200, hch@lst.de wrote:
> On Mon, May 08, 2017 at 11:46:14PM -0700, Nicholas A. Bellinger wrote:
> > That said, simply propagating up q->limits.max_write_zeroes_sectors as
> > dev_attrib->unmap_zeroes_data following existing code still looks like
> > the right thing to do.
> 
> It is not.  Martin has decoupled write same/zeroes support from discard
> support.  Any device will claim to support it initially, and we'll
> only clear the flag if a Write Same command fails.
> 
> So even if LBPRZ is not set you can trivially get into a situation
> where discard is supported through UNMAP, and you'll incorrectly
> set LBPRZ and will cause data corruption.

In that case, there are two choices.

1) Expose a block_device or request_queue bit to signal 'real LBPRZ'
support up to IBLOCK, in order to maintain SCSI target feature
compatibility.

2) Or drop the LBPRZ bit usage for IBLOCK all-together.

Since I happen happen to support a block driver that has 'real LBPRZ'
support for all discards, I'd prefer the latter so this doesn't have to
be carried out-of-tree.

So what are the options for this in post v4.12..?

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig May 11, 2017, 6:26 a.m. UTC | #7
On Wed, May 10, 2017 at 09:50:35PM -0700, Nicholas A. Bellinger wrote:
> 1) Expose a block_device or request_queue bit to signal 'real LBPRZ'
> support up to IBLOCK, in order to maintain SCSI target feature
> compatibility.

No way.  If you want to zero use REQ_OP_WRITE_ZEROES..
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicholas A. Bellinger May 11, 2017, 6:36 a.m. UTC | #8
On Thu, 2017-05-11 at 08:26 +0200, hch@lst.de wrote:
> On Wed, May 10, 2017 at 09:50:35PM -0700, Nicholas A. Bellinger wrote:
> > 1) Expose a block_device or request_queue bit to signal 'real LBPRZ'
> > support up to IBLOCK, in order to maintain SCSI target feature
> > compatibility.
> 
> No way.  If you want to zero use REQ_OP_WRITE_ZEROES..

Yes, I understand that part and it's what the earlier conversion of
IBLOCK to use blkdev_issue_zeroout() already does.

Once the blkdev_issue_zeroout() conversion is in place then LBPRZ can
always be set to one for IBLOCK using blkdev_issue_zeroout().

The point is this is not in -rc1, which as-is breaks LBPRZ compat for a
release.






--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c
index d2f089c..e7caf78 100644
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -851,7 +851,7 @@  bool target_configure_unmap_from_queue(struct se_dev_attrib *attrib,
        attrib->unmap_granularity = q->limits.discard_granularity / block_size;
        attrib->unmap_granularity_alignment = q->limits.discard_alignment /
                                                                block_size;
-       attrib->unmap_zeroes_data = 0;
+       attrib->unmap_zeroes_data = (q->limits.max_write_zeroes_sectors);
        return true;
 }
 EXPORT_SYMBOL(target_configure_unmap_from_queue);