mbox series

[0/3] blktests: nvme: Fix pass data of nvmet TCs

Message ID 20190505150611.15776-1-minwoo.im.dev@gmail.com (mailing list archive)
Headers show
Series blktests: nvme: Fix pass data of nvmet TCs | expand

Message

Minwoo Im May 5, 2019, 3:06 p.m. UTC
Hi, Omar,

This patchset fixes mismatch between expected pass data and real data
printed out for nvmet testcases.

It has been long time to have changes for nvmet and nvme-cli.  The
following commits can explain why the pass data has to be updated.

About genctr:
  Commit b662a078 ("nvmet: enable Discovery Controller AENs")

About treq field:
- nvmet:
  Commit 9b95d2fb ("nvmet: expose support for fabrics SQ flow control
                    disable in treq")
- nvme-cli:
  Commit 2cf370c3 ("fabrics: support fabrics sq flow control disable")

Please correct me if I'm wrong here. :)

Thanks,

*** BLURB HERE ***

Minwoo Im (3):
  nvme: 002: fix nvmet pass data with loop
  nvme: 016: fix nvmet pass data with loop
  nvme: 017: fix nvmet pass data with loop

 tests/nvme/002.out | 2002 ++++++++++++++++++++++++++--------------------------
 tests/nvme/016.out |    4 +-
 tests/nvme/017.out |    4 +-
 3 files changed, 1005 insertions(+), 1005 deletions(-)

Comments

Minwoo Im May 6, 2019, 4:38 p.m. UTC | #1
On 5/7/19 1:25 AM, Chaitanya Kulkarni wrote:
> We need to get rid of the string comparison here.
Chaitanya,

Do you mean that test case bash script should check the result of output 
and return some error value instead of zero instead of *.out comparison ?

Thanks,
Chaitanya Kulkarni May 6, 2019, 4:46 p.m. UTC | #2
On 05/06/2019 09:38 AM, Minwoo Im wrote:
> On 5/7/19 1:25 AM, Chaitanya Kulkarni wrote:
>> We need to get rid of the string comparison here.
> Chaitanya,
>
> Do you mean that test case bash script should check the result of output
> and return some error value instead of zero instead of *.out comparison ?
>
> Thanks,
>

We need to get rid the string comparison as much as we can e.g.
in following output the nvme-cli output should not be compared
but the return value itself.

-Discovery Log Number of Records 1, Generation counter 1
+Discovery Log Number of Records 1, Generation counter 2
=====Discovery Log Entry 0======
trtype: loop
adrfam: pci
subtype: nvme subsystem
-treq: not specified
+treq: not specified, sq flow control disable supported
portid: X
trsvcid:
subnqn: blktests-subsystem-1

Reason :- we cannot rely on the output as it tends to change
with development which triggers fixes in blktests, unless
testcase is focused on entirely on examining the output of
the tool.
Minwoo Im May 6, 2019, 4:54 p.m. UTC | #3
> We need to get rid the string comparison as much as we can e.g.
> in following output the nvme-cli output should not be compared
> but the return value itself.
> 
> -Discovery Log Number of Records 1, Generation counter 1
> +Discovery Log Number of Records 1, Generation counter 2
> =====Discovery Log Entry 0======
> trtype: loop
> adrfam: pci
> subtype: nvme subsystem
> -treq: not specified
> +treq: not specified, sq flow control disable supported
> portid: X
> trsvcid:
> subnqn: blktests-subsystem-1
> 
> Reason :- we cannot rely on the output as it tends to change
> with development which triggers fixes in blktests, unless
> testcase is focused on entirely on examining the output of
> the tool.

Totally agree with you.  nvme-cli is going to keep updated and output 
format might be changed which may cause test failure in blktests.

If Johannes who wrote these tests agrees to not check output result from 
nvme-cli, I'll get rid of them.

By the way, Checking the return value of a program like nvme-cli might 
not be enough to figure out what happened because this test looks like 
wanted to check the output of discover get log page is exactly the same 
with what it wanted to be in case data size is greater than 4K.
Chaitanya Kulkarni May 6, 2019, 5:47 p.m. UTC | #4
On 05/06/2019 09:54 AM, Minwoo Im wrote:
>> We need to get rid the string comparison as much as we can e.g.
>> in following output the nvme-cli output should not be compared
>> but the return value itself.
>>
>> -Discovery Log Number of Records 1, Generation counter 1
>> +Discovery Log Number of Records 1, Generation counter 2
>> =====Discovery Log Entry 0======
>> trtype: loop
>> adrfam: pci
>> subtype: nvme subsystem
>> -treq: not specified
>> +treq: not specified, sq flow control disable supported
>> portid: X
>> trsvcid:
>> subnqn: blktests-subsystem-1
>>
>> Reason :- we cannot rely on the output as it tends to change
>> with development which triggers fixes in blktests, unless
>> testcase is focused on entirely on examining the output of
>> the tool.
>
> Totally agree with you.  nvme-cli is going to keep updated and output
> format might be changed which may cause test failure in blktests.
>
> If Johannes who wrote these tests agrees to not check output result from
> nvme-cli, I'll get rid of them.
>
> By the way, Checking the return value of a program like nvme-cli might
> not be enough to figure out what happened because this test looks like
> wanted to check the output of discover get log page is exactly the same
> with what it wanted to be in case data size is greater than 4K.
>

I wasn't clear enough.

It doesn't check for the return value for now. What needs to happen is :-

1. Get rid of the variable strings which are not part of the discovery
    log page entries such as Generation counter.
2. Validate each log page entry content.
3. Check the return value.

We cannot *only* rely on the nvme-cli return value or on the output.
Minwoo Im May 6, 2019, 8:13 p.m. UTC | #5
> I wasn't clear enough.
> 
> It doesn't check for the return value for now. What needs to happen is :-
> 
> 1. Get rid of the variable strings which are not part of the discovery
>      log page entries such as Generation counter.
> 2. Validate each log page entry content.

Question again here.
Do you mean that log page entry contents validation should be in bash 
level instead of *.out comparison?

> 3. Check the return value.

nvme-cli is currently returning value like:
   > 0 :   failed with nvme status code (but the actual value may not be 
the same with status)
   == 0 : done successfully
   < 0 :   failed with -errno

But, ( > 0) case may be removed from nvme-cli soon due to [1] discuss. 
Anyway, if nvme-cli is going to return 0 for both cases: success, error 
with nvme status, then test case is going to be hard to check the error 
status by a return value.  It should be with output string parsing which 
would be great if it's going to be commonized.

[1] https://github.com/linux-nvme/nvme-cli/pull/492
Chaitanya Kulkarni May 6, 2019, 11:01 p.m. UTC | #6
On 05/06/2019 01:13 PM, Minwoo Im wrote:
>> I wasn't clear enough.
>>
>> It doesn't check for the return value for now. What needs to happen is :-
>>
>> 1. Get rid of the variable strings which are not part of the discovery
>>       log page entries such as Generation counter.
>> 2. Validate each log page entry content.
>
> Question again here.
> Do you mean that log page entry contents validation should be in bash
> level instead of *.out comparison?

It has *out level comparison.
>
>> 3. Check the return value.
>
> nvme-cli is currently returning value like:
>     > 0 :   failed with nvme status code (but the actual value may not be
> the same with status)
>     == 0 : done successfully
>     < 0 :   failed with -errno
>
> But, ( > 0) case may be removed from nvme-cli soon due to [1] discuss.
> Anyway, if nvme-cli is going to return 0 for both cases: success, error
> with nvme status, then test case is going to be hard to check the error
> status by a return value.

This should not happen as it will break existing scripts which are 
written on the top the nvme-cli in past few years.

   It should be with output string parsing which
> would be great if it's going to be commonized.
>
No, we cannot rely on the output string as this problem is a good example.

I'm not sure returning == 0 for error case is a good idea. Checking 
return value prevents us from writing unstable testcases which are bases 
on the text output and allow us to modify tools as specification moves 
forward.

> [1] https://github.com/linux-nvme/nvme-cli/pull/492
>
Minwoo Im May 6, 2019, 11:23 p.m. UTC | #7
> >> I wasn't clear enough.
> >>
> >> It doesn't check for the return value for now. What needs to happen is :-
> >>
> >> 1. Get rid of the variable strings which are not part of the discovery
> >>       log page entries such as Generation counter.
> >> 2. Validate each log page entry content.
> >
> > Question again here.
> > Do you mean that log page entry contents validation should be in bash
> > level instead of *.out comparison?
>
> It has *out level comparison.

I'm not sure I have got what you really meant.  Please correct me if I'm wrong
here ;)

First of all, removal of variable values in the result of the
discovery get log page
looks great to me also.  Maybe those variable values are able to be replaced
a fixed value like port does (replace port value via sed command to X).

But, it also depends on the outout of the nvme-cli, not return value.
Anyway, let's discuss about it with Keith also because he discussed it with me
few weeks ago,I think.

> >
> >> 3. Check the return value.
> >
> > nvme-cli is currently returning value like:
> >     > 0 :   failed with nvme status code (but the actual value may not be
> > the same with status)
> >     == 0 : done successfully
> >     < 0 :   failed with -errno
> >
> > But, ( > 0) case may be removed from nvme-cli soon due to [1] discuss.
> > Anyway, if nvme-cli is going to return 0 for both cases: success, error
> > with nvme status, then test case is going to be hard to check the error
> > status by a return value.
>
> This should not happen as it will break existing scripts which are
> written on the top the nvme-cli in past few years.

Agreed.  But, please refer the following comment below ;)

>
>    It should be with output string parsing which
> > would be great if it's going to be commonized.
> >
> No, we cannot rely on the output string as this problem is a good example.
>
> I'm not sure returning == 0 for error case is a good idea. Checking
> return value prevents us from writing unstable testcases which are bases
> on the text output and allow us to modify tools as specification moves
> forward.

Please refer a discussion on https://github.com/linux-nvme/nvme-cli/pull/489
Keith and I had a discussion about the return value of the program itself.
The nvme status value is composed of 16 bits value, by the way, the actual
return value of the program is in 8bits(signed) which means it's not
able to carry
the actual nvme status value out of the program.

If you have any idea about it, Please let me know.  By the way, I really do
agree with what you mentioned about the return value.  If it's possible,
I would like to too :)

>
> > [1] https://github.com/linux-nvme/nvme-cli/pull/492
> >
>
Chaitanya Kulkarni May 7, 2019, 1:38 a.m. UTC | #8
On 5/6/19 4:24 PM, Minwoo Im wrote:
>>>> I wasn't clear enough.
>>>>
>>>> It doesn't check for the return value for now. What needs to happen is :-
>>>>
>>>> 1. Get rid of the variable strings which are not part of the discovery
>>>>       log page entries such as Generation counter.
>>>> 2. Validate each log page entry content.
>>> Question again here.
>>> Do you mean that log page entry contents validation should be in bash
>>> level instead of *.out comparison?
>> It has *out level comparison.
Give me sometime I'll get back to you on this.
> I'm not sure I have got what you really meant.  Please correct me if I'm wrong
> here ;)
>
> First of all, removal of variable values in the result of the
> discovery get log page
> looks great to me also.  Maybe those variable values are able to be replaced
> a fixed value like port does (replace port value via sed command to X).
>
> But, it also depends on the outout of the nvme-cli, not return value.
> Anyway, let's discuss about it with Keith also because he discussed it with me
> few weeks ago,I think.
>
>>>> 3. Check the return value.
>>> nvme-cli is currently returning value like:
>>>     > 0 :   failed with nvme status code (but the actual value may not be
>>> the same with status)
>>>     == 0 : done successfully
>>>     < 0 :   failed with -errno
>>>
>>> But, ( > 0) case may be removed from nvme-cli soon due to [1] discuss.
>>> Anyway, if nvme-cli is going to return 0 for both cases: success, error
>>> with nvme status, then test case is going to be hard to check the error
>>> status by a return value.
>> This should not happen as it will break existing scripts which are
>> written on the top the nvme-cli in past few years.
> Agreed.  But, please refer the following comment below ;)
>
>>    It should be with output string parsing which
>>> would be great if it's going to be commonized.
>>>
>> No, we cannot rely on the output string as this problem is a good example.
>>
>> I'm not sure returning == 0 for error case is a good idea. Checking
>> return value prevents us from writing unstable testcases which are bases
>> on the text output and allow us to modify tools as specification moves
>> forward.
> Please refer a discussion on https://github.com/linux-nvme/nvme-cli/pull/489
> Keith and I had a discussion about the return value of the program itself.
> The nvme status value is composed of 16 bits value, by the way, the actual
> return value of the program is in 8bits(signed) which means it's not
> able to carry
> the actual nvme status value out of the program.
>
> If you have any idea about it, Please let me know.  By the way, I really do
> agree with what you mentioned about the return value.  If it's possible,
> I would like to too :)
>
>>> [1] https://github.com/linux-nvme/nvme-cli/pull/492
>>>
Johannes Thumshirn May 7, 2019, 6:20 a.m. UTC | #9
On Tue, May 07, 2019 at 01:54:09AM +0900, Minwoo Im wrote:
> If Johannes who wrote these tests agrees to not check output result from
> nvme-cli, I'll get rid of them.

Yes agreed. In the beginning I though of validating the nvme-cli output but
this would grow more and more filters, which makes it impractical.
Minwoo Im May 7, 2019, 10:23 a.m. UTC | #10
On 5/7/19 3:20 PM, Johannes Thumshirn wrote:
> On Tue, May 07, 2019 at 01:54:09AM +0900, Minwoo Im wrote:
>> If Johannes who wrote these tests agrees to not check output result from
>> nvme-cli, I'll get rid of them.
> 
> Yes agreed. In the beginning I though of validating the nvme-cli output but
> this would grow more and more filters, which makes it impractical.
> 

Cool, then I think I can update the pass condition of this testcase in 
other way.  Once Chaitanya gives his comment on this, I'll prepare it.

Thanks,
Chaitanya Kulkarni May 9, 2019, 5:51 a.m. UTC | #11
On 5/6/19 4:24 PM, Minwoo Im wrote:
>>>> I wasn't clear enough.
>>>>
>>>> It doesn't check for the return value for now. What needs to happen is :-
>>>>
>>>> 1. Get rid of the variable strings which are not part of the discovery
>>>>        log page entries such as Generation counter.
>>>> 2. Validate each log page entry content.
>>>
>>> Question again here.
>>> Do you mean that log page entry contents validation should be in bash
>>> level instead of *.out comparison?
>>
>> It has *out level comparison.
> 
> I'm not sure I have got what you really meant.  Please correct me if I'm wrong
> here ;)
> 
> First of all, removal of variable values in the result of the
> discovery get log page
> looks great to me also.  Maybe those variable values are able to be replaced
> a fixed value like port does (replace port value via sed command to X).
> 
> But, it also depends on the outout of the nvme-cli, not return value.
> Anyway, let's discuss about it with Keith also because he discussed it with me
> few weeks ago,I think.
> 
>>>
>>>> 3. Check the return value.
>>>
>>> nvme-cli is currently returning value like:
>>>      > 0 :   failed with nvme status code (but the actual value may not be
>>> the same with status)
>>>      == 0 : done successfully
>>>      < 0 :   failed with -errno
>>>
>>> But, ( > 0) case may be removed from nvme-cli soon due to [1] discuss.
>>> Anyway, if nvme-cli is going to return 0 for both cases: success, error
>>> with nvme status, then test case is going to be hard to check the error
>>> status by a return value.
>>
>> This should not happen as it will break existing scripts which are
>> written on the top the nvme-cli in past few years.
> 
> Agreed.  But, please refer the following comment below ;)
> 
>>
>>     It should be with output string parsing which
>>> would be great if it's going to be commonized.
>>>
>> No, we cannot rely on the output string as this problem is a good example.
>>
>> I'm not sure returning == 0 for error case is a good idea. Checking
>> return value prevents us from writing unstable testcases which are bases
>> on the text output and allow us to modify tools as specification moves
>> forward.
> 
> Please refer a discussion on https://github.com/linux-nvme/nvme-cli/pull/489
> Keith and I had a discussion about the return value of the program itself.
> The nvme status value is composed of 16 bits value, by the way, the actual
> return value of the program is in 8bits(signed) which means it's not
> able to carry
> the actual nvme status value out of the program.
Isn't this unsigned ? as pointed out by Keith ?

$ cat a.c
int main(void)
{
	return -1;
}
$ gcc a.c -o a
$ ./a
$ echo $?
255

May be I'm missing something here ?
> 
> If you have any idea about it, Please let me know.  By the way, I really do
> agree with what you mentioned about the return value.  If it's possible,
> I would like to too :)

How about we instead of returning the NVMe Status we map the NVMe Status 
of the program to the error code and in-turn return that error code ?

The above is true only when command is successfully submitted from the
program i.e. no errno is set by any library calls and failed in the 
completion queue entry with NVMe Status != NVME_SC_SUCCESS.

For your reference In kernel we already do this detailed mapping where :-

1. Please refer to the drivers/nvme/target/core.c file where we map the
errno_to_nvme_status(), the reverse mapping of that function can be done 
with nvme_status_to_errno(). Of course you will have to add more cases 
and do in-detail reverse error mapping from NVMe status to errno and
return that errno.
2. nvme_error_status() we map NVMe Status to block layer status.
3. blk_status_to_errno() we map the block layer status to the errno.

With the help of 1, 2 and 3 you can reverse map the NVMe Status to errno
and add that mapper function for nvme-cli which will be consistent with 
the kernel NVMe status to errno mapping.

Now you might find some cases where you cannot map all the status codes 
to errno and for those default cases you may end up using something like 
EIO, this is still better way than having to return 0.
> 
>>
>>> [1] https://github.com/linux-nvme/nvme-cli/pull/492
>>>
>>
>
Chaitanya Kulkarni May 9, 2019, 5:52 a.m. UTC | #12
On 5/7/19 3:23 AM, Minwoo Im wrote:
> On 5/7/19 3:20 PM, Johannes Thumshirn wrote:
>> On Tue, May 07, 2019 at 01:54:09AM +0900, Minwoo Im wrote:
>>> If Johannes who wrote these tests agrees to not check output result from
>>> nvme-cli, I'll get rid of them.
>>
>> Yes agreed. In the beginning I though of validating the nvme-cli output but
>> this would grow more and more filters, which makes it impractical.
>>
> 
> Cool, then I think I can update the pass condition of this testcase in
> other way.  Once Chaitanya gives his comment on this, I'll prepare it.
> 
I think responded to this one.
> Thanks,
>
Minwoo Im May 9, 2019, 6:15 a.m. UTC | #13
> Isn't this unsigned ? as pointed out by Keith ?
>
> $ cat a.c
> int main(void)
> {
>         return -1;
> }
> $ gcc a.c -o a
> $ ./a
> $ echo $?
> 255
>
> May be I'm missing something here ?

I meant that the program returns in a signed value, but it's going to be
parsed in 8bits which is unix style, I think.  Sorry for being unclear.
That's not enough to hold nvme status value at all.

> >
> > If you have any idea about it, Please let me know.  By the way, I really do
> > agree with what you mentioned about the return value.  If it's possible,
> > I would like to too :)
>
> How about we instead of returning the NVMe Status we map the NVMe Status
> of the program to the error code and in-turn return that error code ?
>
> The above is true only when command is successfully submitted from the
> program i.e. no errno is set by any library calls and failed in the
> completion queue entry with NVMe Status != NVME_SC_SUCCESS.
>
> For your reference In kernel we already do this detailed mapping where :-
>
> 1. Please refer to the drivers/nvme/target/core.c file where we map the
> errno_to_nvme_status(), the reverse mapping of that function can be done
> with nvme_status_to_errno(). Of course you will have to add more cases
> and do in-detail reverse error mapping from NVMe status to errno and
> return that errno.
> 2. nvme_error_status() we map NVMe Status to block layer status.
> 3. blk_status_to_errno() we map the block layer status to the errno.
>
> With the help of 1, 2 and 3 you can reverse map the NVMe Status to errno
> and add that mapper function for nvme-cli which will be consistent with
> the kernel NVMe status to errno mapping.
>
> Now you might find some cases where you cannot map all the status codes
> to errno and for those default cases you may end up using something like
> EIO, this is still better way than having to return 0.

Got your point. To make this discussion short, I think we need to make it
in nvme-cli first.  Let me have a discussion on the following link:
https://github.com/linux-nvme/nvme-cli/pull/492

Thanks,