[blktests,v2,2/2] nvme-rc: Cleanup fc resource before module unloading

Message ID 20230419084757.24846-3-dwagner@suse.de (mailing list archive)
State New, archived
Series nvme_trtype=fc fixes

Commit Message

Daniel Wagner April 19, 2023, 8:47 a.m. UTC
Before we unload the module we should clean up the fc resources first,
basically reordering the shutdown sequence to be the reverse of the
setup path.

Also unload nvme-fcloop after use.

While at it, also move the rdma stop_soft_rdma call before the module
unloading, for the same reason.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 tests/nvme/rc | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

Comments

Chaitanya Kulkarni April 19, 2023, 9:41 a.m. UTC | #1
On 4/19/23 01:47, Daniel Wagner wrote:
> Before we unload the module we should cleanup the fc resources first,
> basically reorder the shutdown sequence to be in reverse order of the
> setup path.
>
> Also unload the nvme-fcloop after usage.
>
> While at it also update the rdma stop_soft_rdma before the module
> unloading for the same reasoning.
>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   tests/nvme/rc | 12 +++++++-----
>   1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/tests/nvme/rc b/tests/nvme/rc
> index ec0cc2d8d8cc..41f196b037d6 100644
> --- a/tests/nvme/rc
> +++ b/tests/nvme/rc
> @@ -260,18 +260,20 @@ _cleanup_nvmet() {
>   	shopt -u nullglob
>   	trap SIGINT
>   
> -	modprobe -rq nvme-"${nvme_trtype}" 2>/dev/null
> -	if [[ "${nvme_trtype}" != "loop" ]]; then
> -		modprobe -rq nvmet-"${nvme_trtype}" 2>/dev/null
> -	fi
> -	modprobe -rq nvmet 2>/dev/null
>   	if [[ "${nvme_trtype}" == "rdma" ]]; then
>   		stop_soft_rdma
>   	fi
>   	if [[ "${nvme_trtype}" == "fc" ]]; then
>   		_cleanup_fcloop "${def_local_wwnn}" "${def_local_wwpn}" \
>   			        "${def_remote_wwnn}" "${def_remote_wwpn}"
> +		modprobe -rq nvme-fcloop
>   	fi
> +
> +	modprobe -rq nvme-"${nvme_trtype}" 2>/dev/null
> +	if [[ "${nvme_trtype}" != "loop" ]]; then
> +		modprobe -rq nvmet-"${nvme_trtype}" 2>/dev/null
> +	fi
> +	modprobe -rq nvmet 2>/dev/null
>   }
>   
>   _setup_nvmet() {

Were you able to test this with RDMA?

Just want to make sure we are not breaking anything, since this patch changes
the order of module unload and stop_soft_rdma().

-ck
Sagi Grimberg April 19, 2023, 9:44 a.m. UTC | #2
> Before we unload the module we should cleanup the fc resources first,
> basically reorder the shutdown sequence to be in reverse order of the
> setup path.

If this triggers a bug, then I think it is a good idea to have a
dedicated test that reproduces it if we are changing the default
behavior.

> 
> Also unload the nvme-fcloop after usage.
> 
> While at it also update the rdma stop_soft_rdma before the module
> unloading for the same reasoning.

Why? It creates the wrong reverse ordering:

1. setup soft-rdma
2. setup nvme-rdma

2. teardown nvme-rdma
1. teardown soft-rdma

I don't think we need this change. Having the rdma device go away
underneath nvme-rdma is a good thing to test, but it belongs in a
dedicated test.

> 
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   tests/nvme/rc | 12 +++++++-----
>   1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/tests/nvme/rc b/tests/nvme/rc
> index ec0cc2d8d8cc..41f196b037d6 100644
> --- a/tests/nvme/rc
> +++ b/tests/nvme/rc
> @@ -260,18 +260,20 @@ _cleanup_nvmet() {
>   	shopt -u nullglob
>   	trap SIGINT
>   
> -	modprobe -rq nvme-"${nvme_trtype}" 2>/dev/null
> -	if [[ "${nvme_trtype}" != "loop" ]]; then
> -		modprobe -rq nvmet-"${nvme_trtype}" 2>/dev/null
> -	fi
> -	modprobe -rq nvmet 2>/dev/null
>   	if [[ "${nvme_trtype}" == "rdma" ]]; then
>   		stop_soft_rdma
>   	fi
>   	if [[ "${nvme_trtype}" == "fc" ]]; then
>   		_cleanup_fcloop "${def_local_wwnn}" "${def_local_wwpn}" \
>   			        "${def_remote_wwnn}" "${def_remote_wwpn}"
> +		modprobe -rq nvme-fcloop
>   	fi
> +
> +	modprobe -rq nvme-"${nvme_trtype}" 2>/dev/null
> +	if [[ "${nvme_trtype}" != "loop" ]]; then
> +		modprobe -rq nvmet-"${nvme_trtype}" 2>/dev/null
> +	fi
> +	modprobe -rq nvmet 2>/dev/null
>   }
>   
>   _setup_nvmet() {
Daniel Wagner April 19, 2023, 10:36 a.m. UTC | #3
On Wed, Apr 19, 2023 at 09:41:28AM +0000, Chaitanya Kulkarni wrote:
> were you able to test this with RDMA ?

Yes, I've tested it with all transports (loop, tcp, rdma, fc)

> just want to make sure we are not breaking anything since we are changing
> the order of module unload and stop_soft_rdma() in this patch ...

Sure thing
Daniel Wagner April 19, 2023, 10:42 a.m. UTC | #4
On Wed, Apr 19, 2023 at 12:44:42PM +0300, Sagi Grimberg wrote:
> 
> > Before we unload the module we should cleanup the fc resources first,
> > basically reorder the shutdown sequence to be in reverse order of the
> > setup path.
> 
> If this triggers a bug, then I think it is a good idea to have a
> dedicated test that reproduces it if we are changing the default
> behavior.

Right, though I would like to tackle one problem after the other: first get fc
working with the 'correct' order.

> > While at it also update the rdma stop_soft_rdma before the module
> > unloading for the same reasoning.
> 
> Why? it creates the wrong reverse ordering.
> 
> 1. setup soft-rdma
> 2. setup nvme-rdma
> 
> 2. teardown nvme-rdma
> 1. teardown soft-rdma
> 
> I don't think we need this change. I mean it is a good test
> to have that the rdma device goes away underneath nvme-rdma
> but it is good for a dedicated test.

I was worried about this setup sequence here:

	modprobe -q nvme-"${nvme_trtype}"
	if [[ "${nvme_trtype}" == "rdma" ]]; then
		start_soft_rdma

The module is loaded before start_soft_rdma is called, thus I thought we should
do the reverse: first call stop_soft_rdma and then unload the module.
Sagi Grimberg April 19, 2023, 10:45 a.m. UTC | #5
>>> Before we unload the module we should cleanup the fc resources first,
>>> basically reorder the shutdown sequence to be in reverse order of the
>>> setup path.
>>
>> If this triggers a bug, then I think it is a good idea to have a
>> dedicated test that reproduces it if we are changing the default
>> behavior.
> 
> Right, though I would like to tackle one problem after the other, first get fc
> working with the 'correct' order.
> 
>>> While at it also update the rdma stop_soft_rdma before the module
>>> unloading for the same reasoning.
>>
>> Why? it creates the wrong reverse ordering.
>>
>> 1. setup soft-rdma
>> 2. setup nvme-rdma
>>
>> 2. teardown nvme-rdma
>> 1. teardown soft-rdma
>>
>> I don't think we need this change. I mean it is a good test
>> to have that the rdma device goes away underneath nvme-rdma
>> but it is good for a dedicated test.
> 
> I was worried about this setup sequence here:
> 
> 	modprobe -q nvme-"${nvme_trtype}"
> 	if [[ "${nvme_trtype}" == "rdma" ]]; then
> 		start_soft_rdma
> 
> The module is loaded before start_soft_rdma is called, thus I thought we should
> do the reverse: first call stop_soft_rdma and then unload the module.

They should be unrelated. The safe route is to first remove the ULD
(upper-layer driver, here nvme-rdma) and then the device.
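
The ordering described above can be sketched as a dry run. This is only an
illustration: plan_rdma_teardown is a hypothetical helper that does not exist
in tests/nvme/rc, and it prints the steps rather than executing them. The
commands it prints are the ones _cleanup_nvmet already uses for the rdma
transport.

```shell
#!/bin/bash
# Hypothetical dry-run helper (not part of blktests): prints the teardown
# steps for nvme_trtype=rdma instead of running them.
plan_rdma_teardown() {
	# 1. Remove the upper-layer drivers first ...
	echo "modprobe -rq nvme-rdma"
	echo "modprobe -rq nvmet-rdma"
	echo "modprobe -rq nvmet"
	# 2. ... and only then take away the soft-RDMA device underneath them.
	echo "stop_soft_rdma"
}

plan_rdma_teardown
```

With this order the driver never sees its device vanish while it is still
loaded, so the surprise-removal path is left to a dedicated test.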
Chaitanya Kulkarni April 19, 2023, 9:15 p.m. UTC | #6
On 4/19/23 02:44, Sagi Grimberg wrote:
>
>> Before we unload the module we should cleanup the fc resources first,
>> basically reorder the shutdown sequence to be in reverse order of the
>> setup path.
>
> If this triggers a bug, then I think it is a good idea to have a
> dedicated test that reproduces it if we are changing the default
> behavior.
>

+1

-ck
Shinichiro Kawasaki April 30, 2023, 10:05 a.m. UTC | #7
On Apr 19, 2023 / 21:15, Chaitanya Kulkarni wrote:
> On 4/19/23 02:44, Sagi Grimberg wrote:
> >
> >> Before we unload the module we should cleanup the fc resources first,
> >> basically reorder the shutdown sequence to be in reverse order of the
> >> setup path.
> >
> > If this triggers a bug, then I think it is a good idea to have a
> > dedicated test that reproduces it if we are changing the default
> > behavior.
> >
> 
> +1

Agreed. A patch adding the new test case would be appreciated. So that this
work is not forgotten, I will open a GitHub issue later.
Shinichiro Kawasaki April 30, 2023, 10:08 a.m. UTC | #8
On Apr 19, 2023 / 12:36, Daniel Wagner wrote:
> On Wed, Apr 19, 2023 at 09:41:28AM +0000, Chaitanya Kulkarni wrote:
> > were you able to test this with RDMA ?
> 
> Yes, I've tested it with all transports (loop, tcp, rdma, fc)
> 
> > just want to make sure we are not breaking anything since we are changing
> > the order of module unload and stop_soft_rdma() in this patch ...
> 
> Sure thing

I also tested, and observed no change in results with these two patches. The
only failure I observed is nvme/003, due to a lockdep WARN, but it happens
regardless of the patches.
Shinichiro Kawasaki April 30, 2023, 10:34 a.m. UTC | #9
On Apr 19, 2023 / 13:45, Sagi Grimberg wrote:
> 
> > > > Before we unload the module we should cleanup the fc resources first,
> > > > basically reorder the shutdown sequence to be in reverse order of the
> > > > setup path.
> > > 
> > > If this triggers a bug, then I think it is a good idea to have a
> > > dedicated test that reproduces it if we are changing the default
> > > behavior.
> > 
> > Right, though I would like to tackle one problem after the other, first get fc
> > working with the 'correct' order.
> > 
> > > > While at it also update the rdma stop_soft_rdma before the module
> > > > unloading for the same reasoning.
> > > 
> > > Why? it creates the wrong reverse ordering.
> > > 
> > > 1. setup soft-rdma
> > > 2. setup nvme-rdma
> > > 
> > > 2. teardown nvme-rdma
> > > 1. teardown soft-rdma
> > > 
> > > I don't think we need this change. I mean it is a good test
> > > to have that the rdma device goes away underneath nvme-rdma
> > > but it is good for a dedicated test.

I agree that the new test case is good.

> > 
> > I was worried about this setup sequence here:
> > 
> > 	modprobe -q nvme-"${nvme_trtype}"
> > 	if [[ "${nvme_trtype}" == "rdma" ]]; then
> > 		start_soft_rdma
> > 
> > The module is loaded before start_soft_rdma is called, thus I thought we should
> > do the reverse: first call stop_soft_rdma and then unload the module.
> 
> They should be unrelated. the safe route is to first remove the uld and
> then the device.

Sagi, the comment above was not clear to me. Is Daniel's patch ok for you?

IMO, it is reasonable to "do clean-up in the reverse order of setup" as a
general guide. It reduces the chance of seeing module-related failures in test
cases that do not expect them. Instead, we can have dedicated test cases for
failures related to module load/unload order. start_soft_rdma and
stop_soft_rdma load and unload modules, so I think the guide applies to those
helper functions as well.
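
A minimal sketch of that guide, assuming two hypothetical helpers,
push_cleanup and run_cleanups, neither of which exists in blktests: each setup
step records its undo command, and cleanup replays them newest first. The echo
commands stand in for real teardown steps.

```shell
#!/bin/bash
# Illustrative only: these names are assumptions, not blktests functions.
cleanup_stack=()

# Record the undo command for a setup step that has just succeeded.
push_cleanup() {
	cleanup_stack+=("$1")
}

# Run the recorded undo commands in reverse order of registration.
run_cleanups() {
	local i
	for ((i = ${#cleanup_stack[@]} - 1; i >= 0; i--)); do
		eval "${cleanup_stack[i]}"
	done
	cleanup_stack=()
}

push_cleanup "echo undo setup step 1"
push_cleanup "echo undo setup step 2"
run_cleanups
```

A stack like this makes the teardown order follow from the setup order, which
is also why it would reproduce whatever order the setup path happens to use.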
Sagi Grimberg May 1, 2023, 2:10 p.m. UTC | #10
On 4/30/23 13:34, Shinichiro Kawasaki wrote:
> On Apr 19, 2023 / 13:45, Sagi Grimberg wrote:
>>
>>>>> Before we unload the module we should cleanup the fc resources first,
>>>>> basically reorder the shutdown sequence to be in reverse order of the
>>>>> setup path.
>>>>
>>>> If this triggers a bug, then I think it is a good idea to have a
>>>> dedicated test that reproduces it if we are changing the default
>>>> behavior.
>>>
>>> Right, though I would like to tackle one problem after the other, first get fc
>>> working with the 'correct' order.
>>>
>>>>> While at it also update the rdma stop_soft_rdma before the module
>>>>> unloading for the same reasoning.
>>>>
>>>> Why? it creates the wrong reverse ordering.
>>>>
>>>> 1. setup soft-rdma
>>>> 2. setup nvme-rdma
>>>>
>>>> 2. teardown nvme-rdma
>>>> 1. teardown soft-rdma
>>>>
>>>> I don't think we need this change. I mean it is a good test
>>>> to have that the rdma device goes away underneath nvme-rdma
>>>> but it is good for a dedicated test.
> 
> I agree that the new test case is good.
> 
>>>
>>> I was worried about this setup sequence here:
>>>
>>> 	modprobe -q nvme-"${nvme_trtype}"
>>> 	if [[ "${nvme_trtype}" == "rdma" ]]; then
>>> 		start_soft_rdma
>>>
>>> The module is loaded before start_soft_rdma is called, thus I thought we should
>>> do the reverse: first call stop_soft_rdma and then unload the module.
>>
>> They should be unrelated. the safe route is to first remove the uld and
>> then the device.
> 
> Sagi, this comment above was not clear for me. Is Daniel's patch ok for you?
> 
> IMO, it is reasonable to "do clean-up in reverse order as setup" as a general
> guide. It will reduce the chance to see module related failures when the test
> cases do not expect such failures. Instead, we can have dedicated test cases for
> the module load/unload order related failures. start_soft_rdma and
> stop_soft_rdma do module load and unload. So I think the guide is good for those
> helper functions also.

As I mentioned, this change exercises the driver code path for a
surprise unplug of the rdma device. It is equivalent to routinely
triggering a surprise removal of the PCI device during nvme-pci test
teardown. While this is worth testing, I'm not sure we want it to be
the default behavior; I would rather add dedicated tests for it.

Hence, my suggestion was to leave nvme-rdma as is.
Shinichiro Kawasaki May 2, 2023, 6:13 a.m. UTC | #11
On May 01, 2023 / 17:10, Sagi Grimberg wrote:
> On 4/30/23 13:34, Shinichiro Kawasaki wrote:
[...]
> > Sagi, this comment above was not clear for me. Is Daniel's patch ok for you?
> > 
> > IMO, it is reasonable to "do clean-up in reverse order as setup" as a general
> > guide. It will reduce the chance to see module related failures when the test
> > cases do not expect such failures. Instead, we can have dedicated test cases for
> > the module load/unload order related failures. start_soft_rdma and
> > stop_soft_rdma do module load and unload. So I think the guide is good for those
> > helper functions also.
> 
> As I mentioned here, this change exercises a code path in the driver
> that is a surprise unplug of the rdma device. It is equivalent to
> triggering a surprise removal of the pci device normally during
> nvme-pci test teardown. While this is worth testing, I'm not sure we
> want the default behavior to do that, but rather add dedicated tests for
> it.
> 
> Hence, my suggestion was to leave nvme-rdma as is.

Thanks for the clarification. I assume that stop_soft_rdma is the "surprise
unplug of the rdma device". If I understand it correctly, the change for nvme-fc
will be like this:


diff --git a/tests/nvme/rc b/tests/nvme/rc
index ec0cc2d..24803af 100644
--- a/tests/nvme/rc
+++ b/tests/nvme/rc
@@ -260,6 +260,11 @@ _cleanup_nvmet() {
 	shopt -u nullglob
 	trap SIGINT
 
+	if [[ "${nvme_trtype}" == "fc" ]]; then
+		_cleanup_fcloop "${def_local_wwnn}" "${def_local_wwpn}" \
+				"${def_remote_wwnn}" "${def_remote_wwpn}"
+		modprobe -rq nvme-fcloop
+	fi
 	modprobe -rq nvme-"${nvme_trtype}" 2>/dev/null
 	if [[ "${nvme_trtype}" != "loop" ]]; then
 		modprobe -rq nvmet-"${nvme_trtype}" 2>/dev/null
@@ -268,10 +273,6 @@ _cleanup_nvmet() {
 	if [[ "${nvme_trtype}" == "rdma" ]]; then
 		stop_soft_rdma
 	fi
-	if [[ "${nvme_trtype}" == "fc" ]]; then
-		_cleanup_fcloop "${def_local_wwnn}" "${def_local_wwpn}" \
-			        "${def_remote_wwnn}" "${def_remote_wwpn}"
-	fi
 }
 
 _setup_nvmet() {

Patch

diff --git a/tests/nvme/rc b/tests/nvme/rc
index ec0cc2d8d8cc..41f196b037d6 100644
--- a/tests/nvme/rc
+++ b/tests/nvme/rc
@@ -260,18 +260,20 @@  _cleanup_nvmet() {
 	shopt -u nullglob
 	trap SIGINT
 
-	modprobe -rq nvme-"${nvme_trtype}" 2>/dev/null
-	if [[ "${nvme_trtype}" != "loop" ]]; then
-		modprobe -rq nvmet-"${nvme_trtype}" 2>/dev/null
-	fi
-	modprobe -rq nvmet 2>/dev/null
 	if [[ "${nvme_trtype}" == "rdma" ]]; then
 		stop_soft_rdma
 	fi
 	if [[ "${nvme_trtype}" == "fc" ]]; then
 		_cleanup_fcloop "${def_local_wwnn}" "${def_local_wwpn}" \
 			        "${def_remote_wwnn}" "${def_remote_wwpn}"
+		modprobe -rq nvme-fcloop
 	fi
+
+	modprobe -rq nvme-"${nvme_trtype}" 2>/dev/null
+	if [[ "${nvme_trtype}" != "loop" ]]; then
+		modprobe -rq nvmet-"${nvme_trtype}" 2>/dev/null
+	fi
+	modprobe -rq nvmet 2>/dev/null
 }
 
 _setup_nvmet() {