diff mbox series

[blktests] nvme: Add passthru error logging tests to nvme/039

Message ID 20240216233053.2795930-1-alan.adamson@oracle.com (mailing list archive)
State New, archived
Headers show
Series [blktests] nvme: Add passthru error logging tests to nvme/039 | expand

Commit Message

Alan Adamson Feb. 16, 2024, 11:30 p.m. UTC
Tests the ability to enable and disable error logging for passthru admin commands issued to
the controller and passthru IO commands issued to a namespace.

Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
---
 tests/nvme/039     | 41 +++++++++++++++++++++++++++++++++++++++++
 tests/nvme/039.out |  2 ++
 tests/nvme/rc      | 37 +++++++++++++++++++++++++++++++++++++
 3 files changed, 80 insertions(+)

Comments

Chaitanya Kulkarni Feb. 17, 2024, 12:50 a.m. UTC | #1
> +_nvme_passthru_logging_setup()
> +{
> +	ctrl_dev_passthru_logging=$(cat /sys/class/nvme/"$2"/passthru_err_log_enabled)
> +	ns_dev_passthru_logging=$(cat /sys/class/nvme/"$2"/"$1"/passthru_err_log_enabled)
> +
> +	_nvme_disable_passthru_admin_error_logging "$2"
> +	_nvme_disable_passthru_io_error_logging "$1" "$2"
> +}
> +
> +_nvme_passthru_logging_cleanup()
> +{
> +	echo $ctrl_dev_passthru_logging > /sys/class/nvme/"$2"/passthru_err_log_enabled
> +	echo $ns_dev_passthru_logging > /sys/class/nvme/"$2"/"$1"/passthru_err_log_enabled
> +}
>   
>   _nvme_err_inject_setup()
>   {
> @@ -985,6 +1002,26 @@ _nvme_disable_err_inject()
>           echo 0 > /sys/kernel/debug/"$1"/fault_inject/times
>   }
>   
> +_nvme_enable_passthru_admin_error_logging()
> +{
> +	echo on > /sys/class/nvme/"$1"/passthru_err_log_enabled
> +}
> +
> +_nvme_enable_passthru_io_error_logging()
> +{
> +	echo on > /sys/class/nvme/"$2"/"$1"/passthru_err_log_enabled
> +}
> +
> +_nvme_disable_passthru_admin_error_logging()
> +{
> +	echo off > /sys/class/nvme/"$1"/passthru_err_log_enabled
> +}
> +
> +_nvme_disable_passthru_io_error_logging()
> +{
> +	echo off > /sys/class/nvme/"$2"/"$1"/passthru_err_log_enabled
> +}
> +
>

Thanks for the test, let's move these helper testcase itself if we get
second testcase ? I'd not bloat nvme rc file unless we have another
user for these function, also if you move these to testcase itself
then you don't really need to make these as a function ...

-ck
Shin'ichiro Kawasaki Feb. 19, 2024, 6:44 a.m. UTC | #2
On Feb 17, 2024 / 00:50, Chaitanya Kulkarni wrote:
> > +_nvme_passthru_logging_setup()
> > +{
> > +	ctrl_dev_passthru_logging=$(cat /sys/class/nvme/"$2"/passthru_err_log_enabled)
> > +	ns_dev_passthru_logging=$(cat /sys/class/nvme/"$2"/"$1"/passthru_err_log_enabled)
> > +
> > +	_nvme_disable_passthru_admin_error_logging "$2"
> > +	_nvme_disable_passthru_io_error_logging "$1" "$2"
> > +}
> > +
> > +_nvme_passthru_logging_cleanup()
> > +{
> > +	echo $ctrl_dev_passthru_logging > /sys/class/nvme/"$2"/passthru_err_log_enabled
> > +	echo $ns_dev_passthru_logging > /sys/class/nvme/"$2"/"$1"/passthru_err_log_enabled
> > +}
> >   
> >   _nvme_err_inject_setup()
> >   {
> > @@ -985,6 +1002,26 @@ _nvme_disable_err_inject()
> >           echo 0 > /sys/kernel/debug/"$1"/fault_inject/times
> >   }
> >   
> > +_nvme_enable_passthru_admin_error_logging()
> > +{
> > +	echo on > /sys/class/nvme/"$1"/passthru_err_log_enabled
> > +}
> > +
> > +_nvme_enable_passthru_io_error_logging()
> > +{
> > +	echo on > /sys/class/nvme/"$2"/"$1"/passthru_err_log_enabled
> > +}
> > +
> > +_nvme_disable_passthru_admin_error_logging()
> > +{
> > +	echo off > /sys/class/nvme/"$1"/passthru_err_log_enabled
> > +}
> > +
> > +_nvme_disable_passthru_io_error_logging()
> > +{
> > +	echo off > /sys/class/nvme/"$2"/"$1"/passthru_err_log_enabled
> > +}
> > +
> >
> 
> Thanks for the test, let's move these helper testcase itself if we get
> second testcase ? I'd not bloat nvme rc file unless we have another
> user for these function, also if you move these to testcase itself
> then you don't really need to make these as a function ...

FYI, we already have similar helper functions in nvme/rc:

  _nvme_err_inject_setup()
  _nvme_err_inject_cleanup()
  _nvme_enable_err_inject()
  _nvme_disable_err_inject()

In the past, it was discussed where to place these functions and we concluded to
place them in nvme/rc rather than nvme/039, so that they can be used in other
test cases [1]. I guess Alan placed the new helper functions in nvme/rc to be
consistent with the existing functions.

[1] https://lore.kernel.org/linux-block/20220512115457.hoa6lhk4as63xrq3@shindev/
Shin'ichiro Kawasaki Feb. 19, 2024, 8:06 a.m. UTC | #3
On Feb 16, 2024 / 15:30, Alan Adamson wrote:
> Tests the ability to enable and disable error logging for passthru admin commands issued to
> the controller and passthru IO commands issued to a namespace.

Alan, thanks for the patch. Good to get the test coverage back.

I applied the patch and ran nvme/039 for a QEMU nvme device on the kernel
v6.8-rc5. Then I observed the failure below:

---------------------------------------------------------------------
nvme/039 => nvme0n1 (test error logging)                     [failed]
    runtime  5.308s  ...  5.318s
    --- tests/nvme/039.out      2024-02-19 15:59:12.143488379 +0900
    +++ /home/shin/Blktests/blktests/results/nvme0n1/nvme/039.out.bad   2024-02-19 16:33:06.135840798 +0900
    @@ -3,5 +3,4 @@
      Read(0x2) @ LBA 0, 1 blocks, Unknown (sct 0x3 / sc 0x75) DNR
      Write(0x1) @ LBA 0, 1 blocks, Write Fault (sct 0x2 / sc 0x80) DNR
      Identify(0x6), Access Denied (sct 0x2 / sc 0x86) DNR cdw10=0x1 cdw11=0x0 cdw12=0x0 cdw13=0x0 cdw14=0x0 cdw15=0x0
    - Read(0x2), Invalid Command Opcode (sct 0x0 / sc 0x1) DNR cdw10=0x0 cdw11=0x0 cdw12=0x1 cdw13=0x0 cdw14=0x0 cdw15=0x0
     Test complete
---------------------------------------------------------------------

FYI, kernel reported messages below:

---------------------------------------------------------------------
[   41.441042][ T1040] run blktests nvme/039 at 2024-02-19 16:33:00
[   46.539249][    C2] nvme0n1: I/O Cmd(0x2) @ LBA 0, 1 blocks, I/O Error (sct 0x2 / sc 0x81) DNR
[   46.539974][    C2] critical medium error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x800 phys_seg 12
[   46.570149][    C3] nvme0n1: I/O Cmd(0x2) @ LBA 0, 1 blocks, I/O Error (sct 0x3 / sc 0x75) DNR
[   46.571329][    C3] I/O error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
[   46.597495][    C1] nvme0n1: I/O Cmd(0x1) @ LBA 0, 1 blocks, I/O Error (sct 0x2 / sc 0x80) DNR
[   46.598334][    C1] critical medium error, dev nvme0n1, sector 0 op 0x1:(WRITE) flags 0x8800 phys_seg0
[   46.672226][    C0] nvme0: Admin Cmd(0x6), I/O Error (sct 0x2 / sc 0x86) DNR cdw10=0x1 cdw11=0x0 cdw10
[   46.714923][ T1137] nvme nvme0: NVME_IOCTL_IO_CMD not supported when multiple namespaces present!
[   46.754907][ T1149] nvme nvme0: NVME_IOCTL_IO_CMD not supported when multiple namespaces present!
---------------------------------------------------------------------

Do I miss anything to make the test case pass?


I also ran nvme/039 on the kernel v6.7 and observed a different failure symptom
below. Old kernels do not have the sysfs attribute passthru_err_log_enabled,
hence the failure.

---------------------------------------------------------------------
nvme/039 => nvme0n1 (test error logging)                     [failed]
    runtime  5.570s  ...  5.308s
    --- tests/nvme/039.out      2024-02-19 15:59:12.143488379 +0900
    +++ /home/shin/Blktests/blktests/results/nvme0n1/nvme/039.out.bad   2024-02-19 16:23:02.669330853 +0900
    @@ -1,7 +1,15 @@
     Running nvme/039
    +cat: /sys/class/nvme/nvme0/passthru_err_log_enabled: No such file or directory
    +cat: /sys/class/nvme/nvme0/nvme0n1/passthru_err_log_enabled: No such file or directory
    +tests/nvme/rc: line 1017: /sys/class/nvme/nvme0/passthru_err_log_enabled: Permission denied
    +tests/nvme/rc: line 1022: /sys/class/nvme/nvme0/nvme0n1/passthru_err_log_enabled: Permission denied
      Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR
      Read(0x2) @ LBA 0, 1 blocks, Unknown (sct 0x3 / sc 0x75) DNR
    ...
    (Run 'diff -u tests/nvme/039.out /home/shpin/Blktests/blktests/results/nvme0n1/nvme/039.out.bad' to see the entire diff)
---------------------------------------------------------------------

Then the added tests should be executed only when the kernel has the sysfs
attribute. If such control is introduced in the test case, the output of the
test case will have variations and can not be compared with static 039.out
file. Check for the added tests must be done by the test case in a different
way.

Another idea is to create another test case dedicated for the added tests.
It will allow comparison with static out file. It also allow to use
_require_test_dev_sysfs() helper function in device_requires() to check the
sysfs attribute (ref: block/005). On the other hand, this approach will need to
move more functions from nvme/039 to nvme/rc.

Which way looks the better for you?
Alan Adamson Feb. 20, 2024, 5:18 p.m. UTC | #4
On 2/19/24 12:06 AM, Shinichiro Kawasaki wrote:
> On Feb 16, 2024 / 15:30, Alan Adamson wrote:
>> Tests the ability to enable and disable error logging for passthru admin commands issued to
>> the controller and passthru IO commands issued to a namespace.
> Alan, thanks for the patch. Good to get the test coverage back.
>
> I applied the patch and ran nvme/039 for a QEMU nvme device on the kernel
> v6.8-rc5. Then I observed the failure below:
>
> ---------------------------------------------------------------------
> nvme/039 => nvme0n1 (test error logging)                     [failed]
>      runtime  5.308s  ...  5.318s
>      --- tests/nvme/039.out      2024-02-19 15:59:12.143488379 +0900
>      +++ /home/shin/Blktests/blktests/results/nvme0n1/nvme/039.out.bad   2024-02-19 16:33:06.135840798 +0900
>      @@ -3,5 +3,4 @@
>        Read(0x2) @ LBA 0, 1 blocks, Unknown (sct 0x3 / sc 0x75) DNR
>        Write(0x1) @ LBA 0, 1 blocks, Write Fault (sct 0x2 / sc 0x80) DNR
>        Identify(0x6), Access Denied (sct 0x2 / sc 0x86) DNR cdw10=0x1 cdw11=0x0 cdw12=0x0 cdw13=0x0 cdw14=0x0 cdw15=0x0
>      - Read(0x2), Invalid Command Opcode (sct 0x0 / sc 0x1) DNR cdw10=0x0 cdw11=0x0 cdw12=0x1 cdw13=0x0 cdw14=0x0 cdw15=0x0
>       Test complete
> ---------------------------------------------------------------------
>
Has to do with the nvme device having multiple namespaces.  I'll fix the 
test.


Thanks,

Alan
Alan Adamson Feb. 20, 2024, 5:40 p.m. UTC | #5
> I also ran nvme/039 on the kernel v6.7 and observed a different failure symptom
> below. Old kernels do not have the sysfs attribute passthru_err_log_enabled,
> hence the failure.
>
> ---------------------------------------------------------------------
> nvme/039 => nvme0n1 (test error logging)                     [failed]
>      runtime  5.570s  ...  5.308s
>      --- tests/nvme/039.out      2024-02-19 15:59:12.143488379 +0900
>      +++ /home/shin/Blktests/blktests/results/nvme0n1/nvme/039.out.bad   2024-02-19 16:23:02.669330853 +0900
>      @@ -1,7 +1,15 @@
>       Running nvme/039
>      +cat: /sys/class/nvme/nvme0/passthru_err_log_enabled: No such file or directory
>      +cat: /sys/class/nvme/nvme0/nvme0n1/passthru_err_log_enabled: No such file or directory
>      +tests/nvme/rc: line 1017: /sys/class/nvme/nvme0/passthru_err_log_enabled: Permission denied
>      +tests/nvme/rc: line 1022: /sys/class/nvme/nvme0/nvme0n1/passthru_err_log_enabled: Permission denied
>        Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR
>        Read(0x2) @ LBA 0, 1 blocks, Unknown (sct 0x3 / sc 0x75) DNR
>      ...
>      (Run 'diff -u tests/nvme/039.out /home/shpin/Blktests/blktests/results/nvme0n1/nvme/039.out.bad' to see the entire diff)
> ---------------------------------------------------------------------
>
> Then the added tests should be executed only when the kernel has the sysfs
> attribute. If such control is introduced in the test case, the output of the
> test case will have variations and can not be compared with static 039.out
> file. Check for the added tests must be done by the test case in a different
> way.
>
> Another idea is to create another test case dedicated for the added tests.
> It will allow comparison with static out file. It also allow to use
> _require_test_dev_sysfs() helper function in device_requires() to check the
> sysfs attribute (ref: block/005). On the other hand, this approach will need to
> move more functions from nvme/039 to nvme/rc.
>
> Which way looks the better for you?

I would prefer to keep all the error logging tests within 039. I'll need 
to fake out the new test output if it is a pre-6.8 kernel.

Thanks,

Alan
diff mbox series

Patch

diff --git a/tests/nvme/039 b/tests/nvme/039
index 73b53d0b949c..7cf7638ed4c8 100755
--- a/tests/nvme/039
+++ b/tests/nvme/039
@@ -130,6 +130,33 @@  inject_invalid_admin_cmd()
 	fi
 }
 
+inject_invalid_io_cmd_passthru()
+{
+	local ns_dev
+	local ctrl_dev
+	local ns
+
+	ns_dev="$1"
+	ctrl_dev=${ns_dev%n*}
+	ns=$(echo "$ns_dev" |  cut -d "n" -f3)
+
+	# Inject a 'Invalid Command Opcode' (0x1) on a read (0x02)
+	_nvme_enable_err_inject "$ns_dev" 0 100 1 0x1 1
+
+	nvme io-passthru /dev/"$ctrl_dev" --opcode=0x02 --namespace-id="$ns" \
+		--data-len=512 --read --cdw10=0 --cdw11=0 --cdw12="$2" 2> /dev/null 1>&2
+
+	_nvme_disable_err_inject "$1"
+	if ${nvme_verbose_errors}; then
+		last_dmesg 2 | grep "Invalid Command Opcode (" | \
+		    sed 's/nvme.*://g'
+	else
+		last_dmesg 2 | grep "Cmd(0x2" | sed 's/I\/O Cmd/Read/g' | \
+		    sed 's/I\/O Error/Invalid Command Opcode/g' | \
+		    sed 's/nvme.*://g'
+	fi
+}
+
 test_device() {
 	echo "Running ${TEST_NAME}"
 
@@ -147,6 +174,7 @@  test_device() {
 	ctrl_dev=${ns_dev%n*}
 
 	_nvme_err_inject_setup "${ns_dev}" "${ctrl_dev}"
+	_nvme_passthru_logging_setup "${ns_dev}" "${ctrl_dev}"
 
 	# wait DEFAULT_RATELIMIT_INTERVAL=5 seconds to ensure errors are printed
 	sleep 5
@@ -155,6 +183,19 @@  test_device() {
 	inject_invalid_status_on_read "${ns_dev}"
 	inject_write_fault_on_write "${ns_dev}"
 
+	# Test Pass Thru Admin Logging
+	_nvme_disable_passthru_admin_error_logging "${ctrl_dev}"
+	inject_invalid_admin_cmd "${ctrl_dev}"
+	_nvme_enable_passthru_admin_error_logging "${ctrl_dev}"
+	inject_access_denied_on_identify "${ctrl_dev}"
+
+	# Test Pass Thru IO Logging
+	_nvme_disable_passthru_io_error_logging "${ns_dev}" "${ctrl_dev}"
+	inject_invalid_io_cmd_passthru "${ns_dev}" 0
+	_nvme_enable_passthru_io_error_logging "${ns_dev}" "${ctrl_dev}"
+	inject_invalid_io_cmd_passthru "${ns_dev}" 1
+
+	_nvme_passthru_logging_cleanup "${ns_dev}" "${ctrl_dev}"
 	_nvme_err_inject_cleanup "${ns_dev}" "${ctrl_dev}"
 
 	echo "Test complete"
diff --git a/tests/nvme/039.out b/tests/nvme/039.out
index 139070d22240..fea76cfd1245 100644
--- a/tests/nvme/039.out
+++ b/tests/nvme/039.out
@@ -2,4 +2,6 @@  Running nvme/039
  Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR 
  Read(0x2) @ LBA 0, 1 blocks, Unknown (sct 0x3 / sc 0x75) DNR 
  Write(0x1) @ LBA 0, 1 blocks, Write Fault (sct 0x2 / sc 0x80) DNR 
+ Identify(0x6), Access Denied (sct 0x2 / sc 0x86) DNR cdw10=0x1 cdw11=0x0 cdw12=0x0 cdw13=0x0 cdw14=0x0 cdw15=0x0
+ Read(0x2), Invalid Command Opcode (sct 0x0 / sc 0x1) DNR cdw10=0x0 cdw11=0x0 cdw12=0x1 cdw13=0x0 cdw14=0x0 cdw15=0x0
 Test complete
diff --git a/tests/nvme/rc b/tests/nvme/rc
index dfc4c1ef1975..2d6ebeab2f6f 100644
--- a/tests/nvme/rc
+++ b/tests/nvme/rc
@@ -943,6 +943,23 @@  _check_uuid() {
 
 declare -A NS_DEV_FAULT_INJECT_SAVE
 declare -A CTRL_DEV_FAULT_INJECT_SAVE
+ns_dev_passthru_logging=off
+ctrl_dev_passthru_logging=off
+
+_nvme_passthru_logging_setup()
+{
+	ctrl_dev_passthru_logging=$(cat /sys/class/nvme/"$2"/passthru_err_log_enabled)
+	ns_dev_passthru_logging=$(cat /sys/class/nvme/"$2"/"$1"/passthru_err_log_enabled)
+
+	_nvme_disable_passthru_admin_error_logging "$2"
+	_nvme_disable_passthru_io_error_logging "$1" "$2"
+}
+
+_nvme_passthru_logging_cleanup()
+{
+	echo $ctrl_dev_passthru_logging > /sys/class/nvme/"$2"/passthru_err_log_enabled
+	echo $ns_dev_passthru_logging > /sys/class/nvme/"$2"/"$1"/passthru_err_log_enabled
+}
 
 _nvme_err_inject_setup()
 {
@@ -985,6 +1002,26 @@  _nvme_disable_err_inject()
         echo 0 > /sys/kernel/debug/"$1"/fault_inject/times
 }
 
+_nvme_enable_passthru_admin_error_logging()
+{
+	echo on > /sys/class/nvme/"$1"/passthru_err_log_enabled
+}
+
+_nvme_enable_passthru_io_error_logging()
+{
+	echo on > /sys/class/nvme/"$2"/"$1"/passthru_err_log_enabled
+}
+
+_nvme_disable_passthru_admin_error_logging()
+{
+	echo off > /sys/class/nvme/"$1"/passthru_err_log_enabled
+}
+
+_nvme_disable_passthru_io_error_logging()
+{
+	echo off > /sys/class/nvme/"$2"/"$1"/passthru_err_log_enabled
+}
+
 _nvme_reset_ctrl() {
 	echo 1 > /sys/class/nvme/"$1"/reset_controller
 }