diff mbox

[RFC] Simlify dif_verify routines and fixup fileio protection information code.

Message ID CAC5umygGG4BTs0u35eVi6dLhoDuDshGNPsQaWq2kO5YexWDP+Q@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Akinobu Mita April 15, 2015, 2:16 p.m. UTC
2015-04-15 19:07 GMT+09:00 Sagi Grimberg <sagig@dev.mellanox.co.il>:
> On 4/15/2015 2:52 AM, Akinobu Mita wrote:
>
>>>>
>>>> Looks good...
>>>>
>>>> I'll test with these patches and check if the problems I met
>>>> disappear.
>>>
>>>
>>>
>>> Thanks Akinobu,
>>>
>>> Waiting to hear your verdict before sending a formal patchset.
>>
>>
>> I hit a original bug in sbc_dif_verify() which is not introduced by
>> your patch set, though.
>
>
> What is this original bug?

I meant to say about the problem I fixed by fix-sbc_dif_verify.patch.

>>  Please consider to include attached patch.
>
>
> I'll include it. thanks.
>
>> I'm still seeing another problem and trying to find out a root cause,
>> but it seems like it's caused by other change in -next.
>>
>
> care to elaborate?

When mounting ext4 filesystem at the first time after mkfs, a lot of
WRITE_SAME commands are issued for lazy initialization or something.

By the commit 436f4a0a ("loopback: Add fabric_prot_type attribute
support"), When WRITE_SAME command with WRPROTECT=0 is executed,
sbc_dif_generate() is called but cmd->t_prot_sg is NULL as block layer
didn't allocate it for WRITE_SAME.

I could work around with the attached patch, as the WRITE_SAME command
will fail after all when protection info is enabled with FILEIO, we only need to
avoid null pointer dereference.  But I need to ask Nic about the right
way to fix.

For this patch set, please feel free to add:

Tested-by: Akinobu Mita <akinobu.mita@gmail.com>

Comments

Sagi Grimberg April 15, 2015, 2:33 p.m. UTC | #1
On 4/15/2015 5:16 PM, Akinobu Mita wrote:
> 2015-04-15 19:07 GMT+09:00 Sagi Grimberg <sagig@dev.mellanox.co.il>:
>> On 4/15/2015 2:52 AM, Akinobu Mita wrote:
>>
>>>>>
>>>>> Looks good...
>>>>>
>>>>> I'll test with these patches and check if the problems I met
>>>>> disappear.
>>>>
>>>>
>>>>
>>>> Thanks Akinobu,
>>>>
>>>> Waiting to hear your verdict before sending a formal patchset.
>>>
>>>
>>> I hit a original bug in sbc_dif_verify() which is not introduced by
>>> your patch set, though.
>>
>>
>> What is this original bug?
>
> I meant to say about the problem I fixed by fix-sbc_dif_verify.patch.
>
>>>   Please consider to include attached patch.
>>
>>
>> I'll include it. thanks.
>>
>>> I'm still seeing another problem and trying to find out a root cause,
>>> but it seems like it's caused by other change in -next.
>>>
>>
>> care to elaborate?
>
> When mounting ext4 filesystem at the first time after mkfs, a lot of
> WRITE_SAME commands are issued for lazy initialization or something.
>
> By the commit 436f4a0a ("loopback: Add fabric_prot_type attribute
> support"), When WRITE_SAME command with WRPROTECT=0 is executed,
> sbc_dif_generate() is called but cmd->t_prot_sg is NULL as block layer
> didn't allocate it for WRITE_SAME.
>
> I could work around with the attached patch, as the WRITE_SAME command
> will fail after all when protection info is enabled with FILEIO, we only need to
> avoid null pointer dereference.  But I need to ask Nic about the right
> way to fix.
>

I don't think this is sufficient. With this we actually write
unprotected data for WRITE_SAME (i.e. write data blocks but not storing
the corresponding PI information). When this data will be read back you
will see PI errors (you currently don't see those because your backend
drive contains escape values I assume).

I'd say the correct fix is to calc PI for the block and "write_same"
it...

Sagi.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Martin K. Petersen April 15, 2015, 3:05 p.m. UTC | #2
>>>>> "Sagi" == Sagi Grimberg <sagig@dev.mellanox.co.il> writes:

Sagi> I don't think this is sufficient. With this we actually write
Sagi> unprotected data for WRITE_SAME (i.e. write data blocks but not
Sagi> storing the corresponding PI information). When this data will be
Sagi> read back you will see PI errors (you currently don't see those
Sagi> because your backend drive contains escape values I assume).

Sagi> I'd say the correct fix is to calc PI for the block

Indeed!

Sagi> and "write_same" it...

Well, the ref tag needs to be incremented for each block (for Type 1).
Sagi Grimberg April 15, 2015, 3:08 p.m. UTC | #3
On 4/15/2015 5:33 PM, Sagi Grimberg wrote:
> On 4/15/2015 5:16 PM, Akinobu Mita wrote:
>> 2015-04-15 19:07 GMT+09:00 Sagi Grimberg <sagig@dev.mellanox.co.il>:
>>> On 4/15/2015 2:52 AM, Akinobu Mita wrote:
>>>
>>>>>>
>>>>>> Looks good...
>>>>>>
>>>>>> I'll test with these patches and check if the problems I met
>>>>>> disappear.
>>>>>
>>>>>
>>>>>
>>>>> Thanks Akinobu,
>>>>>
>>>>> Waiting to hear your verdict before sending a formal patchset.
>>>>
>>>>
>>>> I hit a original bug in sbc_dif_verify() which is not introduced by
>>>> your patch set, though.
>>>
>>>
>>> What is this original bug?
>>
>> I meant to say about the problem I fixed by fix-sbc_dif_verify.patch.
>>
>>>>   Please consider to include attached patch.
>>>
>>>
>>> I'll include it. thanks.
>>>
>>>> I'm still seeing another problem and trying to find out a root cause,
>>>> but it seems like it's caused by other change in -next.
>>>>
>>>
>>> care to elaborate?
>>
>> When mounting ext4 filesystem at the first time after mkfs, a lot of
>> WRITE_SAME commands are issued for lazy initialization or something.
>>
>> By the commit 436f4a0a ("loopback: Add fabric_prot_type attribute
>> support"), When WRITE_SAME command with WRPROTECT=0 is executed,
>> sbc_dif_generate() is called but cmd->t_prot_sg is NULL as block layer
>> didn't allocate it for WRITE_SAME.
>>

Actually this is a bug. Why didn't the initiator allocate integrity
meta-data for WRITE_SAME? Looking at the code it looks like it should.

Martin?

>> I could work around with the attached patch, as the WRITE_SAME command
>> will fail after all when protection info is enabled with FILEIO, we
>> only need to
>> avoid null pointer dereference.  But I need to ask Nic about the right
>> way to fix.
>>
>
> I don't think this is sufficient. With this we actually write
> unprotected data for WRITE_SAME (i.e. write data blocks but not storing
> the corresponding PI information). When this data will be read back you
> will see PI errors (you currently don't see those because your backend
> drive contains escape values I assume).
>
> I'd say the correct fix is to calc PI for the block and "write_same"
> it...
>
> Sagi.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Martin K. Petersen April 15, 2015, 4:10 p.m. UTC | #4
>>>>> "Sagi" == Sagi Grimberg <sagig@dev.mellanox.co.il> writes:

>>> By the commit 436f4a0a ("loopback: Add fabric_prot_type attribute
>>> support"), When WRITE_SAME command with WRPROTECT=0 is executed,
>>> sbc_dif_generate() is called but cmd->t_prot_sg is NULL as block
>>> layer didn't allocate it for WRITE_SAME.

Sagi> Actually this is a bug. Why didn't the initiator allocate
Sagi> integrity meta-data for WRITE_SAME? Looking at the code it looks
Sagi> like it should.

We don't issue WRITE SAME with PI so there is no prot SGL.
Sagi Grimberg April 16, 2015, 8:52 a.m. UTC | #5
On 4/15/2015 7:10 PM, Martin K. Petersen wrote:
>>>>>> "Sagi" == Sagi Grimberg <sagig@dev.mellanox.co.il> writes:
>
>>>> By the commit 436f4a0a ("loopback: Add fabric_prot_type attribute
>>>> support"), When WRITE_SAME command with WRPROTECT=0 is executed,
>>>> sbc_dif_generate() is called but cmd->t_prot_sg is NULL as block
>>>> layer didn't allocate it for WRITE_SAME.
>
> Sagi> Actually this is a bug. Why didn't the initiator allocate
> Sagi> integrity meta-data for WRITE_SAME? Looking at the code it looks
> Sagi> like it should.
>
> We don't issue WRITE SAME with PI so there is no prot SGL.
>

Is there a specific reason why we don't?

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Akinobu Mita April 16, 2015, 1:46 p.m. UTC | #6
2015-04-16 17:52 GMT+09:00 Sagi Grimberg <sagig@dev.mellanox.co.il>:
> On 4/15/2015 7:10 PM, Martin K. Petersen wrote:
>>>>>>>
>>>>>>> "Sagi" == Sagi Grimberg <sagig@dev.mellanox.co.il> writes:
>>
>>
>>>>> By the commit 436f4a0a ("loopback: Add fabric_prot_type attribute
>>>>> support"), When WRITE_SAME command with WRPROTECT=0 is executed,
>>>>> sbc_dif_generate() is called but cmd->t_prot_sg is NULL as block
>>>>> layer didn't allocate it for WRITE_SAME.
>>
>>
>> Sagi> Actually this is a bug. Why didn't the initiator allocate
>> Sagi> integrity meta-data for WRITE_SAME? Looking at the code it looks
>> Sagi> like it should.
>>
>> We don't issue WRITE SAME with PI so there is no prot SGL.
>>
>
> Is there a specific reason why we don't?

It is not only for the WRITE SAME requests from block device but
also for READ/WRITE with PROTECT=0 requests by SG_IO.

So isn't is appropreate to allocate prot SGL in
target_write_prot_action() (and mark se_cmd->se_cmd_flags to release
it at deallocation time)?
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Martin K. Petersen April 16, 2015, 3:30 p.m. UTC | #7
>>>>> "Akinobu" == Akinobu Mita <akinobu.mita@gmail.com> writes:

>>> We don't issue WRITE SAME with PI so there is no prot SGL.

>> Is there a specific reason why we don't?

There really isn't much of a benefit when all you're doing is
replicating zeroes. So it hasn't been very high on my list.

Akinobu> It is not only for the WRITE SAME requests from block device
Akinobu> but also for READ/WRITE with PROTECT=0 requests by SG_IO.

Akinobu> So isn't is appropreate to allocate prot SGL in
Akinobu> target_write_prot_action() (and mark se_cmd->se_cmd_flags to
Akinobu> release it at deallocation time)?

Correct. Just because a target is formatted with PI does not mean that
every I/O it receives has PI attached. That's entirely driven by
RDPROTECT/WRPROTECT/VRPROTECT at the initiator's discretion.

It is an absolute requirement that the device, if formatted with PI,
will generate and write the correct protection information when
WRPROTECT is 0.
Sagi Grimberg April 16, 2015, 3:58 p.m. UTC | #8
On 4/16/2015 4:46 PM, Akinobu Mita wrote:
> 2015-04-16 17:52 GMT+09:00 Sagi Grimberg <sagig@dev.mellanox.co.il>:
>> On 4/15/2015 7:10 PM, Martin K. Petersen wrote:
>>>>>>>>
>>>>>>>> "Sagi" == Sagi Grimberg <sagig@dev.mellanox.co.il> writes:
>>>
>>>
>>>>>> By the commit 436f4a0a ("loopback: Add fabric_prot_type attribute
>>>>>> support"), When WRITE_SAME command with WRPROTECT=0 is executed,
>>>>>> sbc_dif_generate() is called but cmd->t_prot_sg is NULL as block
>>>>>> layer didn't allocate it for WRITE_SAME.
>>>
>>>
>>> Sagi> Actually this is a bug. Why didn't the initiator allocate
>>> Sagi> integrity meta-data for WRITE_SAME? Looking at the code it looks
>>> Sagi> like it should.
>>>
>>> We don't issue WRITE SAME with PI so there is no prot SGL.
>>>
>>
>> Is there a specific reason why we don't?
>
> It is not only for the WRITE SAME requests from block device but
> also for READ/WRITE with PROTECT=0 requests by SG_IO.
>

This is specific to loopback which is using target_submit_cmd_map_sgls()
Other fabrics would allocate sgls per IO and the core would allocate
protection SGLs as well.

> So isn't is appropreate to allocate prot SGL in
> target_write_prot_action() (and mark se_cmd->se_cmd_flags to release
> it at deallocation time)?
>

I'd say that given this is specific to loopback, than tcm_loop needs
to be fixed... But specifically for WRITE_SAME, I'd be careful with
allocating a single 8 byte protection buffer because as Martin said,
unlike the data block, the protection field may change from sector to
sector (ref_tag in Type 1).

So allocating a single 8 byte buf will take it's toll in the backend
(iblock backend would need to allocate all the protection information
and add it to the bio anyway, file/rd will need to do multiple writes).

It might be better that for the special WRITE_SAME case, allocate 8 *
sectors sgl and set it up (incrementing ref_tag for type 1). This way,
the backend code can stay the same (other than opening write_same with
PI in iblock).

Sagi.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg April 16, 2015, 4:04 p.m. UTC | #9
On 4/16/2015 6:58 PM, Sagi Grimberg wrote:
> On 4/16/2015 4:46 PM, Akinobu Mita wrote:
>> 2015-04-16 17:52 GMT+09:00 Sagi Grimberg <sagig@dev.mellanox.co.il>:
>>> On 4/15/2015 7:10 PM, Martin K. Petersen wrote:
>>>>>>>>>
>>>>>>>>> "Sagi" == Sagi Grimberg <sagig@dev.mellanox.co.il> writes:
>>>>
>>>>
>>>>>>> By the commit 436f4a0a ("loopback: Add fabric_prot_type attribute
>>>>>>> support"), When WRITE_SAME command with WRPROTECT=0 is executed,
>>>>>>> sbc_dif_generate() is called but cmd->t_prot_sg is NULL as block
>>>>>>> layer didn't allocate it for WRITE_SAME.
>>>>
>>>>
>>>> Sagi> Actually this is a bug. Why didn't the initiator allocate
>>>> Sagi> integrity meta-data for WRITE_SAME? Looking at the code it looks
>>>> Sagi> like it should.
>>>>
>>>> We don't issue WRITE SAME with PI so there is no prot SGL.
>>>>
>>>
>>> Is there a specific reason why we don't?
>>
>> It is not only for the WRITE SAME requests from block device but
>> also for READ/WRITE with PROTECT=0 requests by SG_IO.
>>
>
> This is specific to loopback which is using target_submit_cmd_map_sgls()
> Other fabrics would allocate sgls per IO and the core would allocate
> protection SGLs as well.
>

By "This" I meant the NULL deref you are witnessing in for wrprotect=0.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
index 3d88c00..d8d6267 100644
--- a/drivers/target/target_core_sbc.c
+++ b/drivers/target/target_core_sbc.c
@@ -1183,6 +1183,9 @@  sbc_dif_generate(struct se_cmd *cmd)
 	void *daddr, *paddr;
 	int i, j, offset = 0;
 
+	if (!psg)
+		return;
+
 	for_each_sg(cmd->t_data_sg, dsg, cmd->t_data_nents, i) {
 		daddr = kmap_atomic(sg_page(dsg)) + dsg->offset;
 		paddr = kmap_atomic(sg_page(psg)) + psg->offset;