Message ID | 20230419084757.24846-3-dwagner@suse.de (mailing list archive) |
---|---|
State | New, archived |
Series | nvme_trtype=fc fixes |
On 4/19/23 01:47, Daniel Wagner wrote:
> Before we unload the module we should cleanup the fc resources first,
> basically reorder the shutdown sequence to be in reverse order of the
> setup path.
>
> Also unload the nvme-fcloop after usage.
>
> While at it also update the rdma stop_soft_rdma before the module
> unloading for the same reasoning.
>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>  tests/nvme/rc | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/tests/nvme/rc b/tests/nvme/rc
> index ec0cc2d8d8cc..41f196b037d6 100644
> --- a/tests/nvme/rc
> +++ b/tests/nvme/rc
> @@ -260,18 +260,20 @@ _cleanup_nvmet() {
>  	shopt -u nullglob
>  	trap SIGINT
>
> -	modprobe -rq nvme-"${nvme_trtype}" 2>/dev/null
> -	if [[ "${nvme_trtype}" != "loop" ]]; then
> -		modprobe -rq nvmet-"${nvme_trtype}" 2>/dev/null
> -	fi
> -	modprobe -rq nvmet 2>/dev/null
>  	if [[ "${nvme_trtype}" == "rdma" ]]; then
>  		stop_soft_rdma
>  	fi
>  	if [[ "${nvme_trtype}" == "fc" ]]; then
>  		_cleanup_fcloop "${def_local_wwnn}" "${def_local_wwpn}" \
>  				"${def_remote_wwnn}" "${def_remote_wwpn}"
> +		modprobe -rq nvme-fcloop
>  	fi
> +
> +	modprobe -rq nvme-"${nvme_trtype}" 2>/dev/null
> +	if [[ "${nvme_trtype}" != "loop" ]]; then
> +		modprobe -rq nvmet-"${nvme_trtype}" 2>/dev/null
> +	fi
> +	modprobe -rq nvmet 2>/dev/null
>  }
>
>  _setup_nvmet() {

were you able to test this with RDMA ?

just want to make sure we are not breaking anything since we are changing
the order of module unload and stop_soft_rdma() in this patch ...

-ck
> Before we unload the module we should cleanup the fc resources first,
> basically reorder the shutdown sequence to be in reverse order of the
> setup path.

If this triggers a bug, then I think it is a good idea to have a
dedicated test that reproduces it if we are changing the default
behavior.

>
> Also unload the nvme-fcloop after usage.
>
> While at it also update the rdma stop_soft_rdma before the module
> unloading for the same reasoning.

Why? it creates the wrong reverse ordering.

1. setup soft-rdma
2. setup nvme-rdma

2. teardown nvme-rdma
1. teardown soft-rdma

I don't think we need this change. I mean it is a good test
to have that the rdma device goes away underneath nvme-rdma
but it is good for a dedicated test.

>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>  tests/nvme/rc | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/tests/nvme/rc b/tests/nvme/rc
> index ec0cc2d8d8cc..41f196b037d6 100644
> --- a/tests/nvme/rc
> +++ b/tests/nvme/rc
> @@ -260,18 +260,20 @@ _cleanup_nvmet() {
>  	shopt -u nullglob
>  	trap SIGINT
>
> -	modprobe -rq nvme-"${nvme_trtype}" 2>/dev/null
> -	if [[ "${nvme_trtype}" != "loop" ]]; then
> -		modprobe -rq nvmet-"${nvme_trtype}" 2>/dev/null
> -	fi
> -	modprobe -rq nvmet 2>/dev/null
>  	if [[ "${nvme_trtype}" == "rdma" ]]; then
>  		stop_soft_rdma
>  	fi
>  	if [[ "${nvme_trtype}" == "fc" ]]; then
>  		_cleanup_fcloop "${def_local_wwnn}" "${def_local_wwpn}" \
>  				"${def_remote_wwnn}" "${def_remote_wwpn}"
> +		modprobe -rq nvme-fcloop
>  	fi
> +
> +	modprobe -rq nvme-"${nvme_trtype}" 2>/dev/null
> +	if [[ "${nvme_trtype}" != "loop" ]]; then
> +		modprobe -rq nvmet-"${nvme_trtype}" 2>/dev/null
> +	fi
> +	modprobe -rq nvmet 2>/dev/null
>  }
>
>  _setup_nvmet() {
On Wed, Apr 19, 2023 at 09:41:28AM +0000, Chaitanya Kulkarni wrote:
> were you able to test this with RDMA ?

Yes, I've tested it with all transports (loop, tcp, rdma, fc)

> just want to make sure we are not breaking anything since we are changing
> the order of module unload and stop_soft_rdma() in this patch ...

Sure thing
On Wed, Apr 19, 2023 at 12:44:42PM +0300, Sagi Grimberg wrote:
>
> > Before we unload the module we should cleanup the fc resources first,
> > basically reorder the shutdown sequence to be in reverse order of the
> > setup path.
>
> If this triggers a bug, then I think it is a good idea to have a
> dedicated test that reproduces it if we are changing the default
> behavior.

Right, though I would like to tackle one problem after the other, first get fc
working with the 'correct' order.

> > While at it also update the rdma stop_soft_rdma before the module
> > unloading for the same reasoning.
>
> Why? it creates the wrong reverse ordering.
>
> 1. setup soft-rdma
> 2. setup nvme-rdma
>
> 2. teardown nvme-rdma
> 1. teardown soft-rdma
>
> I don't think we need this change. I mean it is a good test
> to have that the rdma device goes away underneath nvme-rdma
> but it is good for a dedicated test.

I was worried about this setup sequence here:

	modprobe -q nvme-"${nvme_trtype}"
	if [[ "${nvme_trtype}" == "rdma" ]]; then
		start_soft_rdma

The module is loaded before start_soft_rdma is started, thus I thought we should
do the reverse, first call stop_soft_rdma and then unload the module.
>>> Before we unload the module we should cleanup the fc resources first,
>>> basically reorder the shutdown sequence to be in reverse order of the
>>> setup path.
>>
>> If this triggers a bug, then I think it is a good idea to have a
>> dedicated test that reproduces it if we are changing the default
>> behavior.
>
> Right, though I would like to tackle one problem after the other, first get fc
> working with the 'correct' order.
>
>>> While at it also update the rdma stop_soft_rdma before the module
>>> unloading for the same reasoning.
>>
>> Why? it creates the wrong reverse ordering.
>>
>> 1. setup soft-rdma
>> 2. setup nvme-rdma
>>
>> 2. teardown nvme-rdma
>> 1. teardown soft-rdma
>>
>> I don't think we need this change. I mean it is a good test
>> to have that the rdma device goes away underneath nvme-rdma
>> but it is good for a dedicated test.
>
> I was worried about this setup sequence here:
>
> 	modprobe -q nvme-"${nvme_trtype}"
> 	if [[ "${nvme_trtype}" == "rdma" ]]; then
> 		start_soft_rdma
>
> The module is loaded before start_soft_rdma is started, thus I thought we should
> do the reverse, first call stop_soft_rdma and then unload the module.

They should be unrelated. the safe route is to first remove the uld and
then the device.
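The ordering described in the reply above (set up the device first and the ULD second, then tear down the ULD first and the device last) amounts to running teardown steps as a stack. A minimal sketch, assuming hypothetical helpers `push_teardown` and `run_teardown` that are not part of blktests, with `echo` placeholders standing in for the real setup and teardown commands:

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; not part of blktests.
declare -a _teardown_stack=()

# Record the command that undoes the setup step just performed.
push_teardown() {
	_teardown_stack+=("$1")
}

# Run the recorded teardown commands in reverse (LIFO) order.
run_teardown() {
	local i
	for ((i = ${#_teardown_stack[@]} - 1; i >= 0; i--)); do
		eval "${_teardown_stack[i]}"
	done
	_teardown_stack=()
}

# Placeholder setup: the echo lines stand in for real setup commands.
setup_demo() {
	echo "setup soft-rdma"
	push_teardown 'echo "teardown soft-rdma"'
	echo "setup nvme-rdma"
	push_teardown 'echo "teardown nvme-rdma"'
}
```

Calling `setup_demo` and then `run_teardown` prints the nvme-rdma teardown line before the soft-rdma one, so the last thing set up is the first thing removed.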
On 4/19/23 02:44, Sagi Grimberg wrote:
>
>> Before we unload the module we should cleanup the fc resources first,
>> basically reorder the shutdown sequence to be in reverse order of the
>> setup path.
>
> If this triggers a bug, then I think it is a good idea to have a
> dedicated test that reproduces it if we are changing the default
> behavior.
>

+1

-ck
On Apr 19, 2023 / 21:15, Chaitanya Kulkarni wrote:
> On 4/19/23 02:44, Sagi Grimberg wrote:
> >
> >> Before we unload the module we should cleanup the fc resources first,
> >> basically reorder the shutdown sequence to be in reverse order of the
> >> setup path.
> >
> > If this triggers a bug, then I think it is a good idea to have a
> > dedicated test that reproduces it if we are changing the default
> > behavior.
> >
>
> +1

Agreed. A patch for the new test case would be appreciated. So that this work
is not forgotten, I will open a GitHub issue later.
On Apr 19, 2023 / 12:36, Daniel Wagner wrote:
> On Wed, Apr 19, 2023 at 09:41:28AM +0000, Chaitanya Kulkarni wrote:
> > were you able to test this with RDMA ?
>
> Yes, I've tested it with all transports (loop, tcp, rdma, fc)
>
> > just want to make sure we are not breaking anything since we are changing
> > the order of module unload and stop_soft_rdma() in this patch ...
>
> Sure thing

I also tested, and observed no change in results with these two patches. The
only failure I observed is nvme/003, due to a lockdep WARN, but it happens
regardless of the patches.
On Apr 19, 2023 / 13:45, Sagi Grimberg wrote:
>
> > > > Before we unload the module we should cleanup the fc resources first,
> > > > basically reorder the shutdown sequence to be in reverse order of the
> > > > setup path.
> > >
> > > If this triggers a bug, then I think it is a good idea to have a
> > > dedicated test that reproduces it if we are changing the default
> > > behavior.
> >
> > Right, though I would like to tackle one problem after the other, first get fc
> > working with the 'correct' order.
> >
> > > > While at it also update the rdma stop_soft_rdma before the module
> > > > unloading for the same reasoning.
> > >
> > > Why? it creates the wrong reverse ordering.
> > >
> > > 1. setup soft-rdma
> > > 2. setup nvme-rdma
> > >
> > > 2. teardown nvme-rdma
> > > 1. teardown soft-rdma
> > >
> > > I don't think we need this change. I mean it is a good test
> > > to have that the rdma device goes away underneath nvme-rdma
> > > but it is good for a dedicated test.

I agree that the new test case is good.

> >
> > I was worried about this setup sequence here:
> >
> > 	modprobe -q nvme-"${nvme_trtype}"
> > 	if [[ "${nvme_trtype}" == "rdma" ]]; then
> > 		start_soft_rdma
> >
> > The module is loaded before start_soft_rdma is started, thus I thought we should
> > do the reverse, first call stop_soft_rdma and then unload the module.
>
> They should be unrelated. the safe route is to first remove the uld and
> then the device.

Sagi, this comment above was not clear for me. Is Daniel's patch ok for you?

IMO, it is reasonable to "do clean-up in reverse order as setup" as a general
guide. It will reduce the chance to see module related failures when the test
cases do not expect such failures. Instead, we can have dedicated test cases for
the module load/unload order related failures. start_soft_rdma and
stop_soft_rdma do module load and unload. So I think the guide is good for those
helper functions also.
On 4/30/23 13:34, Shinichiro Kawasaki wrote:
> On Apr 19, 2023 / 13:45, Sagi Grimberg wrote:
>>
>>>>> Before we unload the module we should cleanup the fc resources first,
>>>>> basically reorder the shutdown sequence to be in reverse order of the
>>>>> setup path.
>>>>
>>>> If this triggers a bug, then I think it is a good idea to have a
>>>> dedicated test that reproduces it if we are changing the default
>>>> behavior.
>>>
>>> Right, though I would like to tackle one problem after the other, first get fc
>>> working with the 'correct' order.
>>>
>>>>> While at it also update the rdma stop_soft_rdma before the module
>>>>> unloading for the same reasoning.
>>>>
>>>> Why? it creates the wrong reverse ordering.
>>>>
>>>> 1. setup soft-rdma
>>>> 2. setup nvme-rdma
>>>>
>>>> 2. teardown nvme-rdma
>>>> 1. teardown soft-rdma
>>>>
>>>> I don't think we need this change. I mean it is a good test
>>>> to have that the rdma device goes away underneath nvme-rdma
>>>> but it is good for a dedicated test.
>
> I agree that the new test case is good.
>
>>>
>>> I was worried about this setup sequence here:
>>>
>>> 	modprobe -q nvme-"${nvme_trtype}"
>>> 	if [[ "${nvme_trtype}" == "rdma" ]]; then
>>> 		start_soft_rdma
>>>
>>> The module is loaded before start_soft_rdma is started, thus I thought we should
>>> do the reverse, first call stop_soft_rdma and then unload the module.
>>
>> They should be unrelated. the safe route is to first remove the uld and
>> then the device.
>
> Sagi, this comment above was not clear for me. Is Daniel's patch ok for you?
>
> IMO, it is reasonable to "do clean-up in reverse order as setup" as a general
> guide. It will reduce the chance to see module related failures when the test
> cases do not expect such failures. Instead, we can have dedicated test cases for
> the module load/unload order related failures. start_soft_rdma and
> stop_soft_rdma do module load and unload. So I think the guide is good for those
> helper functions also.

As I mentioned here, this change exercises a code path in the driver
that is a surprise unplug of the rdma device. It is equivalent to
triggering a surprise removal of the pci device normally during
nvme-pci test teardown. While this is worth testing, I'm not sure we
want the default behavior to do that, but rather add dedicated tests for
it.

Hence, my suggestion was to leave nvme-rdma as is.
On May 01, 2023 / 17:10, Sagi Grimberg wrote:
> On 4/30/23 13:34, Shinichiro Kawasaki wrote:
[...]
> > Sagi, this comment above was not clear for me. Is Daniel's patch ok for you?
> >
> > IMO, it is reasonable to "do clean-up in reverse order as setup" as a general
> > guide. It will reduce the chance to see module related failures when the test
> > cases do not expect such failures. Instead, we can have dedicated test cases for
> > the module load/unload order related failures. start_soft_rdma and
> > stop_soft_rdma do module load and unload. So I think the guide is good for those
> > helper functions also.
>
> As I mentioned here, this change exercises a code path in the driver
> that is a surprise unplug of the rdma device. It is equivalent to
> triggering a surprise removal of the pci device normally during
> nvme-pci test teardown. While this is worth testing, I'm not sure we
> want the default behavior to do that, but rather add dedicated tests for
> it.
>
> Hence, my suggestion was to leave nvme-rdma as is.

Thanks for the clarification. I assume that stop_soft_rdma is the "surprise
unplug of the rdma device". If I understand it correctly, the change for nvme-fc
will be like this:

diff --git a/tests/nvme/rc b/tests/nvme/rc
index ec0cc2d..24803af 100644
--- a/tests/nvme/rc
+++ b/tests/nvme/rc
@@ -260,6 +260,11 @@ _cleanup_nvmet() {
 	shopt -u nullglob
 	trap SIGINT
 
+	if [[ "${nvme_trtype}" == "fc" ]]; then
+		_cleanup_fcloop "${def_local_wwnn}" "${def_local_wwpn}" \
+				"${def_remote_wwnn}" "${def_remote_wwpn}"
+		modprobe -rq nvme-fcloop
+	fi
 	modprobe -rq nvme-"${nvme_trtype}" 2>/dev/null
 	if [[ "${nvme_trtype}" != "loop" ]]; then
 		modprobe -rq nvmet-"${nvme_trtype}" 2>/dev/null
@@ -268,10 +273,6 @@ _cleanup_nvmet() {
 	if [[ "${nvme_trtype}" == "rdma" ]]; then
 		stop_soft_rdma
 	fi
-	if [[ "${nvme_trtype}" == "fc" ]]; then
-		_cleanup_fcloop "${def_local_wwnn}" "${def_local_wwpn}" \
-				"${def_remote_wwnn}" "${def_remote_wwpn}"
-	fi
 }
 
 _setup_nvmet() {
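To make the resulting order easier to read, the tail of `_cleanup_nvmet` with the diff above applied would look roughly as follows. This is only a sketch: `modprobe`, `_cleanup_fcloop`, and `stop_soft_rdma` are replaced by echo stubs so it runs without touching real modules, the wrapper name `cleanup_tail_sketch` is hypothetical, and the WWN values are placeholders.

```shell
#!/usr/bin/env bash
# Echo stubs (assumptions for illustration) so nothing is really unloaded.
modprobe() { echo "modprobe $*"; }
_cleanup_fcloop() { echo "_cleanup_fcloop"; }
stop_soft_rdma() { echo "stop_soft_rdma"; }

# Placeholder values; the real defaults live in tests/nvme/rc.
def_local_wwnn="local-wwnn-placeholder"
def_local_wwpn="local-wwpn-placeholder"
def_remote_wwnn="remote-wwnn-placeholder"
def_remote_wwpn="remote-wwpn-placeholder"

# Hypothetical wrapper around the tail of _cleanup_nvmet after the diff.
cleanup_tail_sketch() {
	local nvme_trtype="$1"

	# fc: release fcloop resources first, then unload nvme-fcloop.
	if [[ "${nvme_trtype}" == "fc" ]]; then
		_cleanup_fcloop "${def_local_wwnn}" "${def_local_wwpn}" \
				"${def_remote_wwnn}" "${def_remote_wwpn}"
		modprobe -rq nvme-fcloop
	fi
	modprobe -rq nvme-"${nvme_trtype}" 2>/dev/null
	if [[ "${nvme_trtype}" != "loop" ]]; then
		modprobe -rq nvmet-"${nvme_trtype}" 2>/dev/null
	fi
	modprobe -rq nvmet 2>/dev/null
	# rdma: unchanged, the soft rdma device is stopped after module unload.
	if [[ "${nvme_trtype}" == "rdma" ]]; then
		stop_soft_rdma
	fi
}
```

With these stubs, `cleanup_tail_sketch fc` lists the fcloop cleanup before any transport module removal, while `cleanup_tail_sketch rdma` still ends with `stop_soft_rdma`, matching the suggestion to leave nvme-rdma as is.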
Before we unload the module we should cleanup the fc resources first,
basically reorder the shutdown sequence to be in reverse order of the
setup path.

Also unload the nvme-fcloop after usage.

While at it also update the rdma stop_soft_rdma before the module
unloading for the same reasoning.

Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 tests/nvme/rc | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/tests/nvme/rc b/tests/nvme/rc
index ec0cc2d8d8cc..41f196b037d6 100644
--- a/tests/nvme/rc
+++ b/tests/nvme/rc
@@ -260,18 +260,20 @@ _cleanup_nvmet() {
 	shopt -u nullglob
 	trap SIGINT
 
-	modprobe -rq nvme-"${nvme_trtype}" 2>/dev/null
-	if [[ "${nvme_trtype}" != "loop" ]]; then
-		modprobe -rq nvmet-"${nvme_trtype}" 2>/dev/null
-	fi
-	modprobe -rq nvmet 2>/dev/null
 	if [[ "${nvme_trtype}" == "rdma" ]]; then
 		stop_soft_rdma
 	fi
 	if [[ "${nvme_trtype}" == "fc" ]]; then
 		_cleanup_fcloop "${def_local_wwnn}" "${def_local_wwpn}" \
 				"${def_remote_wwnn}" "${def_remote_wwpn}"
+		modprobe -rq nvme-fcloop
 	fi
+
+	modprobe -rq nvme-"${nvme_trtype}" 2>/dev/null
+	if [[ "${nvme_trtype}" != "loop" ]]; then
+		modprobe -rq nvmet-"${nvme_trtype}" 2>/dev/null
+	fi
+	modprobe -rq nvmet 2>/dev/null
 }
 
 _setup_nvmet() {