
[2/2] nvme: don't freeze/unfreeze queues from different contexts

Message ID 20230613005847.1762378-3-ming.lei@redhat.com (mailing list archive)
State New, archived
Series nvme: fix two kinds of IO hang from removing NSs

Commit Message

Ming Lei June 13, 2023, 12:58 a.m. UTC
For the block layer freeze/unfreeze APIs, the caller is required to call
the two in strict pairs, so most users simply call them from the same
context, and everything works just fine.
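
For reference, the paired pattern looks roughly like this (a minimal
sketch; the helper name and the update step are illustrative, not taken
from this patch):

#include <linux/blk-mq.h>

/* Hypothetical helper: freeze and unfreeze paired in one context. */
static void example_update_queue(struct request_queue *q)
{
	blk_mq_freeze_queue(q);		/* block new requests and wait for entered ones */
	/* ... update queue state that must not race with IO ... */
	blk_mq_unfreeze_queue(q);	/* let new requests enter again */
}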

For NVMe, the two are called from different contexts, and this has caused
all kinds of IO hangs, such as:

1) When io queue connect fails, the controller is deleted without being
marked as DEAD. The upper layer may wait forever in __bio_queue_enter(),
because in del_gendisk() the disk won't be marked as DEAD until bdev sync &
invalidate returns. If any writeback IO waits in __bio_queue_enter(), an IO
deadlock results. Reported by Yi Zhang.

2) Error recovery vs. namespace deletion: if any IO originated from scan
work waits in __bio_queue_enter(), flushing the scan work hangs forever in
nvme_remove_namespaces(), because the controller is left frozen when error
recovery is interrupted by controller removal. Reported by Chunguang.

Fix the issue by calling the two in the same context, only when the reset
is done, instead of starting the freeze at the beginning of error recovery.
Not only is the IO hang solved, the correctness of freeze & unfreeze is
also respected.

This way is correct because quiesce is enough for the driver to handle
error recovery. The only difference is where IO waits during error
recovery: with this change, IO is simply queued in the block layer queue
instead of in __bio_queue_enter(), and waiting for completion is ultimately
done in the upper layer. Either way, IO can't make progress during error
recovery.
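
Concretely, the fabrics transports now only freeze around the queue-count
update once the new queues are up, roughly as below (a sketch distilled
from the rdma/tcp hunks in this patch; the final nvme_unfreeze() after
blk_mq_update_nr_hw_queues() is pre-existing code that the hunks don't show):

	if (!new) {
		nvme_start_freeze(&ctrl->ctrl);		/* freeze only once queues are back */
		nvme_unquiesce_io_queues(&ctrl->ctrl);
		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
			ret = -ENODEV;
			nvme_unfreeze(&ctrl->ctrl);	/* balance the freeze on failure */
			goto out_wait_freeze_timed_out;
		}
		blk_mq_update_nr_hw_queues(ctrl->ctrl.tagset,
					   ctrl->ctrl.queue_count - 1);
		nvme_unfreeze(&ctrl->ctrl);		/* pairs with the freeze above, same context */
	}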

Reported-by: Chunguang Xu <brookxu.cn@gmail.com>
Closes: https://lore.kernel.org/linux-nvme/cover.1685350577.git.chunguang.xu@shopee.com/
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/nvme/host/core.c | 4 +---
 drivers/nvme/host/pci.c  | 8 +++++---
 drivers/nvme/host/rdma.c | 3 ++-
 drivers/nvme/host/tcp.c  | 3 ++-
 4 files changed, 10 insertions(+), 8 deletions(-)

Comments

Sagi Grimberg June 13, 2023, 1:13 p.m. UTC | #1
> For the block layer freeze/unfreeze APIs, the caller is required to call
> the two in strict pairs, so most users simply call them from the same
> context, and everything works just fine.
> 
> For NVMe, the two are called from different contexts, and this has caused
> all kinds of IO hangs, such as:
> 
> 1) When io queue connect fails, the controller is deleted without being
> marked as DEAD. The upper layer may wait forever in __bio_queue_enter(),
> because in del_gendisk() the disk won't be marked as DEAD until bdev sync &
> invalidate returns. If any writeback IO waits in __bio_queue_enter(), an IO
> deadlock results. Reported by Yi Zhang.
> 
> 2) Error recovery vs. namespace deletion: if any IO originated from scan
> work waits in __bio_queue_enter(), flushing the scan work hangs forever in
> nvme_remove_namespaces(), because the controller is left frozen when error
> recovery is interrupted by controller removal. Reported by Chunguang.
> 
> Fix the issue by calling the two in the same context, only when the reset
> is done, instead of starting the freeze at the beginning of error recovery.
> Not only is the IO hang solved, the correctness of freeze & unfreeze is
> also respected.
> 
> This way is correct because quiesce is enough for the driver to handle
> error recovery. The only difference is where IO waits during error
> recovery: with this change, IO is simply queued in the block layer queue
> instead of in __bio_queue_enter(), and waiting for completion is ultimately
> done in the upper layer. Either way, IO can't make progress during error
> recovery.
> 
> Reported-by: Chunguang Xu <brookxu.cn@gmail.com>
> Closes: https://lore.kernel.org/linux-nvme/cover.1685350577.git.chunguang.xu@shopee.com/
> Reported-by: Yi Zhang <yi.zhang@redhat.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>   drivers/nvme/host/core.c | 4 +---
>   drivers/nvme/host/pci.c  | 8 +++++---
>   drivers/nvme/host/rdma.c | 3 ++-
>   drivers/nvme/host/tcp.c  | 3 ++-
>   4 files changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 4ef5eaecaa75..d5d9b6f6ec74 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4707,10 +4707,8 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl)
>   	 * removing the namespaces' disks; fail all the queues now to avoid
>   	 * potentially having to clean up the failed sync later.
>   	 */
> -	if (ctrl->state == NVME_CTRL_DEAD) {
> +	if (ctrl->state == NVME_CTRL_DEAD)
>   		nvme_mark_namespaces_dead(ctrl);
> -		nvme_unquiesce_io_queues(ctrl);
> -	}

Shouldn't this be in the next patch? I'm not sure what it helps with in
this patch; it is not clearly documented in the commit msg.

>   
>   	/* this is a no-op when called from the controller reset handler */
>   	nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING_NOIO);
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 492f319ebdf3..5d775b76baca 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2578,14 +2578,15 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
>   	dead = nvme_pci_ctrl_is_dead(dev);
>   	if (dev->ctrl.state == NVME_CTRL_LIVE ||
>   	    dev->ctrl.state == NVME_CTRL_RESETTING) {
> -		if (pci_is_enabled(pdev))
> -			nvme_start_freeze(&dev->ctrl);
>   		/*
>   		 * Give the controller a chance to complete all entered requests
>   		 * if doing a safe shutdown.
>   		 */
> -		if (!dead && shutdown)
> +		if (!dead && shutdown & pci_is_enabled(pdev)) {
> +			nvme_start_freeze(&dev->ctrl);
>   			nvme_wait_freeze_timeout(&dev->ctrl, NVME_IO_TIMEOUT);
> +			nvme_unfreeze(&dev->ctrl);
> +		}

I'd split out the pci portion; it is not related to the reported issue,
and it is structured differently from the fabrics transports (for now
at least).

>   	}
>   
>   	nvme_quiesce_io_queues(&dev->ctrl);
> @@ -2740,6 +2741,7 @@ static void nvme_reset_work(struct work_struct *work)
>   	 * controller around but remove all namespaces.
>   	 */
>   	if (dev->online_queues > 1) {
> +		nvme_start_freeze(&dev->ctrl);
>   		nvme_unquiesce_io_queues(&dev->ctrl);
>   		nvme_wait_freeze(&dev->ctrl);
>   		nvme_pci_update_nr_queues(dev);
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 0eb79696fb73..354cce8853c1 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -918,6 +918,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
>   		goto out_cleanup_tagset;
>   
>   	if (!new) {
> +		nvme_start_freeze(&ctrl->ctrl);
>   		nvme_unquiesce_io_queues(&ctrl->ctrl);
>   		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
>   			/*
> @@ -926,6 +927,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
>   			 * to be safe.
>   			 */
>   			ret = -ENODEV;
> +			nvme_unfreeze(&ctrl->ctrl);

What does this unfreeze designed to do?
Ming Lei June 13, 2023, 1:20 p.m. UTC | #2
On Tue, Jun 13, 2023 at 04:13:42PM +0300, Sagi Grimberg wrote:
> 
> > For the block layer freeze/unfreeze APIs, the caller is required to call
> > the two in strict pairs, so most users simply call them from the same
> > context, and everything works just fine.
> > 
> > For NVMe, the two are called from different contexts, and this has caused
> > all kinds of IO hangs, such as:
> > 
> > 1) When io queue connect fails, the controller is deleted without being
> > marked as DEAD. The upper layer may wait forever in __bio_queue_enter(),
> > because in del_gendisk() the disk won't be marked as DEAD until bdev sync &
> > invalidate returns. If any writeback IO waits in __bio_queue_enter(), an IO
> > deadlock results. Reported by Yi Zhang.
> > 
> > 2) Error recovery vs. namespace deletion: if any IO originated from scan
> > work waits in __bio_queue_enter(), flushing the scan work hangs forever in
> > nvme_remove_namespaces(), because the controller is left frozen when error
> > recovery is interrupted by controller removal. Reported by Chunguang.
> > 
> > Fix the issue by calling the two in the same context, only when the reset
> > is done, instead of starting the freeze at the beginning of error recovery.
> > Not only is the IO hang solved, the correctness of freeze & unfreeze is
> > also respected.
> > 
> > This way is correct because quiesce is enough for the driver to handle
> > error recovery. The only difference is where IO waits during error
> > recovery: with this change, IO is simply queued in the block layer queue
> > instead of in __bio_queue_enter(), and waiting for completion is ultimately
> > done in the upper layer. Either way, IO can't make progress during error
> > recovery.
> > 
> > Reported-by: Chunguang Xu <brookxu.cn@gmail.com>
> > Closes: https://lore.kernel.org/linux-nvme/cover.1685350577.git.chunguang.xu@shopee.com/
> > Reported-by: Yi Zhang <yi.zhang@redhat.com>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >   drivers/nvme/host/core.c | 4 +---
> >   drivers/nvme/host/pci.c  | 8 +++++---
> >   drivers/nvme/host/rdma.c | 3 ++-
> >   drivers/nvme/host/tcp.c  | 3 ++-
> >   4 files changed, 10 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index 4ef5eaecaa75..d5d9b6f6ec74 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -4707,10 +4707,8 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl)
> >   	 * removing the namespaces' disks; fail all the queues now to avoid
> >   	 * potentially having to clean up the failed sync later.
> >   	 */
> > -	if (ctrl->state == NVME_CTRL_DEAD) {
> > +	if (ctrl->state == NVME_CTRL_DEAD)
> >   		nvme_mark_namespaces_dead(ctrl);
> > -		nvme_unquiesce_io_queues(ctrl);
> > -	}
> 
> Shouldn't this be in the next patch? I'm not sure what it helps with in
> this patch; it is not clearly documented in the commit msg.

oops, good catch, will fix it in V2.

> 
> >   	/* this is a no-op when called from the controller reset handler */
> >   	nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING_NOIO);
> > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > index 492f319ebdf3..5d775b76baca 100644
> > --- a/drivers/nvme/host/pci.c
> > +++ b/drivers/nvme/host/pci.c
> > @@ -2578,14 +2578,15 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
> >   	dead = nvme_pci_ctrl_is_dead(dev);
> >   	if (dev->ctrl.state == NVME_CTRL_LIVE ||
> >   	    dev->ctrl.state == NVME_CTRL_RESETTING) {
> > -		if (pci_is_enabled(pdev))
> > -			nvme_start_freeze(&dev->ctrl);
> >   		/*
> >   		 * Give the controller a chance to complete all entered requests
> >   		 * if doing a safe shutdown.
> >   		 */
> > -		if (!dead && shutdown)
> > +		if (!dead && shutdown & pci_is_enabled(pdev)) {
> > +			nvme_start_freeze(&dev->ctrl);
> >   			nvme_wait_freeze_timeout(&dev->ctrl, NVME_IO_TIMEOUT);
> > +			nvme_unfreeze(&dev->ctrl);
> > +		}
> 
> I'd split out the pci portion; it is not related to the reported issue,

Yes.

> and it is structured differently from the fabrics transports (for now
> at least).

The above change needs to be done in this patch, because applying the same
pattern requires removing the 'if (pci_is_enabled(pdev)) nvme_start_freeze()'
above.

> 
> >   	}
> >   	nvme_quiesce_io_queues(&dev->ctrl);
> > @@ -2740,6 +2741,7 @@ static void nvme_reset_work(struct work_struct *work)
> >   	 * controller around but remove all namespaces.
> >   	 */
> >   	if (dev->online_queues > 1) {
> > +		nvme_start_freeze(&dev->ctrl);
> >   		nvme_unquiesce_io_queues(&dev->ctrl);
> >   		nvme_wait_freeze(&dev->ctrl);
> >   		nvme_pci_update_nr_queues(dev);
> > diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> > index 0eb79696fb73..354cce8853c1 100644
> > --- a/drivers/nvme/host/rdma.c
> > +++ b/drivers/nvme/host/rdma.c
> > @@ -918,6 +918,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
> >   		goto out_cleanup_tagset;
> >   	if (!new) {
> > +		nvme_start_freeze(&ctrl->ctrl);
> >   		nvme_unquiesce_io_queues(&ctrl->ctrl);
> >   		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
> >   			/*
> > @@ -926,6 +927,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
> >   			 * to be safe.
> >   			 */
> >   			ret = -ENODEV;
> > +			nvme_unfreeze(&ctrl->ctrl);
> 
> What does this unfreeze designed to do?

It is for undoing the previous nvme_start_freeze.


Thanks,
Ming
Sagi Grimberg June 13, 2023, 1:26 p.m. UTC | #3
> > > diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> > > index 0eb79696fb73..354cce8853c1 100644
> > > --- a/drivers/nvme/host/rdma.c
> > > +++ b/drivers/nvme/host/rdma.c
> > > @@ -918,6 +918,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
> > >   		goto out_cleanup_tagset;
> > >   	if (!new) {
> > > +		nvme_start_freeze(&ctrl->ctrl);
> > >   		nvme_unquiesce_io_queues(&ctrl->ctrl);
> > >   		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
> > >   			/*
> > > @@ -926,6 +927,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
> > >   			 * to be safe.
> > >   			 */
> > >   			ret = -ENODEV;
> > > +			nvme_unfreeze(&ctrl->ctrl);
> > 
> > What does this unfreeze designed to do?
> 
> It is for undoing the previous nvme_start_freeze.

ok.
Keith Busch June 13, 2023, 2:41 p.m. UTC | #4
On Tue, Jun 13, 2023 at 08:58:47AM +0800, Ming Lei wrote:
> This way is correct because quiesce is enough for the driver to handle
> error recovery. The only difference is where IO waits during error
> recovery: with this change, IO is simply queued in the block layer queue
> instead of in __bio_queue_enter(), and waiting for completion is ultimately
> done in the upper layer. Either way, IO can't make progress during error
> recovery.

The point was to contain the fallout from modifying the hctx mappings.
If you allow IO to queue in the blk-mq layer while a reset is in
progress, they may be entering a context that won't be as expected on
the other side of the reset.
Sagi Grimberg June 13, 2023, 8:34 p.m. UTC | #5
>> This way is correct because quiesce is enough for the driver to handle
>> error recovery. The only difference is where IO waits during error
>> recovery: with this change, IO is simply queued in the block layer queue
>> instead of in __bio_queue_enter(), and waiting for completion is ultimately
>> done in the upper layer. Either way, IO can't make progress during error
>> recovery.
> 
> The point was to contain the fallout from modifying the hctx mappings.
> If you allow IO to queue in the blk-mq layer while a reset is in
> progress, they may be entering a context that won't be as expected on
> the other side of the reset.

That still happens to *some* commands though right?
Keith Busch June 13, 2023, 10:43 p.m. UTC | #6
On Tue, Jun 13, 2023 at 11:34:05PM +0300, Sagi Grimberg wrote:
> 
> > > This way is correct because quiesce is enough for the driver to handle
> > > error recovery. The only difference is where IO waits during error
> > > recovery: with this change, IO is simply queued in the block layer queue
> > > instead of in __bio_queue_enter(), and waiting for completion is ultimately
> > > done in the upper layer. Either way, IO can't make progress during error
> > > recovery.
> > 
> > The point was to contain the fallout from modifying the hctx mappings.
> > If you allow IO to queue in the blk-mq layer while a reset is in
> > progress, they may be entering a context that won't be as expected on
> > the other side of the reset.
> 
> That still happens to *some* commands though right?

That is possible only for commands that were already dispatched and
subsequently failed with retry disposition. At the point of reset today,
nothing new enters a queue till we know what the mapping looks like.
Ming Lei June 14, 2023, 12:38 a.m. UTC | #7
On Tue, Jun 13, 2023 at 08:41:46AM -0600, Keith Busch wrote:
> On Tue, Jun 13, 2023 at 08:58:47AM +0800, Ming Lei wrote:
> > This way is correct because quiesce is enough for the driver to handle
> > error recovery. The only difference is where IO waits during error
> > recovery: with this change, IO is simply queued in the block layer queue
> > instead of in __bio_queue_enter(), and waiting for completion is ultimately
> > done in the upper layer. Either way, IO can't make progress during error
> > recovery.
> 
> The point was to contain the fallout from modifying the hctx mappings.

blk_mq_update_nr_hw_queues() is still called after nvme_wait_freeze()
returns; nothing changes here, so correctness wrt. updating the hctx
mapping is preserved.
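
In other words, the reset path keeps the required ordering in one context,
roughly as below (based on the pci hunk; the final nvme_unfreeze() is
pre-existing code outside the hunk):

	if (dev->online_queues > 1) {
		nvme_start_freeze(&dev->ctrl);		/* gate new requests */
		nvme_unquiesce_io_queues(&dev->ctrl);	/* let already-queued requests drain */
		nvme_wait_freeze(&dev->ctrl);		/* all entered requests have completed */
		nvme_pci_update_nr_queues(dev);		/* safe to change hctx mappings now */
		nvme_unfreeze(&dev->ctrl);		/* pairs with the freeze above */
	}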

> If you allow IO to queue in the blk-mq layer while a reset is in
> progress, they may be entering a context that won't be as expected on
> the other side of the reset.
 
The only difference is that the in-tree code starts the freeze at the
beginning of error recovery, which only prevents new IO; old IOs are still
queued, and in both ways they can't be dispatched to the driver because of
quiescing. With this patch, new IOs queued after error recovery starts are
handled just like old ones canceled before resetting.

So I don't see problems on the driver side with this change, and the nvme
driver has to cover new IOs queued after an error happens anyway.
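
For illustration, the error-recovery teardown now relies on quiesce alone
(a sketch following the tcp hunk below; the rdma path is analogous):

	/* teardown during error recovery: no freeze, only quiesce */
	nvme_quiesce_io_queues(ctrl);	/* stop dispatching to the broken transport */
	nvme_sync_io_queues(ctrl);	/* flush pending queue work, e.g. timeouts */
	nvme_tcp_stop_io_queues(ctrl);
	/*
	 * New bios keep entering blk-mq and simply sit in the queues; the
	 * later reset/reconnect path unquiesces them and does the
	 * freeze/unfreeze pair around the queue-count update.
	 */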


Thanks.
Ming

Patch

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 4ef5eaecaa75..d5d9b6f6ec74 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4707,10 +4707,8 @@  void nvme_remove_namespaces(struct nvme_ctrl *ctrl)
 	 * removing the namespaces' disks; fail all the queues now to avoid
 	 * potentially having to clean up the failed sync later.
 	 */
-	if (ctrl->state == NVME_CTRL_DEAD) {
+	if (ctrl->state == NVME_CTRL_DEAD)
 		nvme_mark_namespaces_dead(ctrl);
-		nvme_unquiesce_io_queues(ctrl);
-	}
 
 	/* this is a no-op when called from the controller reset handler */
 	nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING_NOIO);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 492f319ebdf3..5d775b76baca 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2578,14 +2578,15 @@  static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 	dead = nvme_pci_ctrl_is_dead(dev);
 	if (dev->ctrl.state == NVME_CTRL_LIVE ||
 	    dev->ctrl.state == NVME_CTRL_RESETTING) {
-		if (pci_is_enabled(pdev))
-			nvme_start_freeze(&dev->ctrl);
 		/*
 		 * Give the controller a chance to complete all entered requests
 		 * if doing a safe shutdown.
 		 */
-		if (!dead && shutdown)
+		if (!dead && shutdown & pci_is_enabled(pdev)) {
+			nvme_start_freeze(&dev->ctrl);
 			nvme_wait_freeze_timeout(&dev->ctrl, NVME_IO_TIMEOUT);
+			nvme_unfreeze(&dev->ctrl);
+		}
 	}
 
 	nvme_quiesce_io_queues(&dev->ctrl);
@@ -2740,6 +2741,7 @@  static void nvme_reset_work(struct work_struct *work)
 	 * controller around but remove all namespaces.
 	 */
 	if (dev->online_queues > 1) {
+		nvme_start_freeze(&dev->ctrl);
 		nvme_unquiesce_io_queues(&dev->ctrl);
 		nvme_wait_freeze(&dev->ctrl);
 		nvme_pci_update_nr_queues(dev);
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 0eb79696fb73..354cce8853c1 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -918,6 +918,7 @@  static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 		goto out_cleanup_tagset;
 
 	if (!new) {
+		nvme_start_freeze(&ctrl->ctrl);
 		nvme_unquiesce_io_queues(&ctrl->ctrl);
 		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
 			/*
@@ -926,6 +927,7 @@  static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 			 * to be safe.
 			 */
 			ret = -ENODEV;
+			nvme_unfreeze(&ctrl->ctrl);
 			goto out_wait_freeze_timed_out;
 		}
 		blk_mq_update_nr_hw_queues(ctrl->ctrl.tagset,
@@ -975,7 +977,6 @@  static void nvme_rdma_teardown_io_queues(struct nvme_rdma_ctrl *ctrl,
 		bool remove)
 {
 	if (ctrl->ctrl.queue_count > 1) {
-		nvme_start_freeze(&ctrl->ctrl);
 		nvme_quiesce_io_queues(&ctrl->ctrl);
 		nvme_sync_io_queues(&ctrl->ctrl);
 		nvme_rdma_stop_io_queues(ctrl);
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index bf0230442d57..5ae08e9cb16d 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1909,6 +1909,7 @@  static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 		goto out_cleanup_connect_q;
 
 	if (!new) {
+		nvme_start_freeze(ctrl);
 		nvme_unquiesce_io_queues(ctrl);
 		if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
 			/*
@@ -1917,6 +1918,7 @@  static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 			 * to be safe.
 			 */
 			ret = -ENODEV;
+			nvme_unfreeze(ctrl);
 			goto out_wait_freeze_timed_out;
 		}
 		blk_mq_update_nr_hw_queues(ctrl->tagset,
@@ -2021,7 +2023,6 @@  static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl,
 	if (ctrl->queue_count <= 1)
 		return;
 	nvme_quiesce_admin_queue(ctrl);
-	nvme_start_freeze(ctrl);
 	nvme_quiesce_io_queues(ctrl);
 	nvme_sync_io_queues(ctrl);
 	nvme_tcp_stop_io_queues(ctrl);