diff mbox

vhost-scsi: Take configfs group dependency during VHOST_SCSI_SET_ENDPOINT

Message ID 1412825695-15134-1-git-send-email-nab@daterainc.com (mailing list archive)
State New, archived
Headers show

Commit Message

Nicholas A. Bellinger Oct. 9, 2014, 3:34 a.m. UTC
From: Nicholas Bellinger <nab@linux-iscsi.org>

This patch addresses a bug where individual vhost-scsi configfs endpoint
groups can be removed from below while active exports to QEMU userspace
still exist, resulting in an OOPs.

It adds a configfs_depend_item() in vhost_scsi_set_endpoint() to obtain
an explicit dependency on se_tpg->tpg_group in order to prevent individual
vhost-scsi WWPN endpoints from being released via normal configfs methods
while an QEMU ioctl reference still exists.

Also, add matching configfs_undepend_item() in vhost_scsi_clear_endpoint()
to release the dependency, once QEMU's reference to the individual group
at /sys/kernel/config/target/vhost/$WWPN/$TPGT is released.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: <stable@vger.kernel.org> # 3.6+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
---
 drivers/vhost/scsi.c |   22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

Comments

Nicholas A. Bellinger Oct. 9, 2014, 4:14 a.m. UTC | #1
Hi MST & Co,

Quick question below wrt to this patch..

On Thu, 2014-10-09 at 03:34 +0000, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger <nab@linux-iscsi.org>
> 
> This patch addresses a bug where individual vhost-scsi configfs endpoint
> groups can be removed from below while active exports to QEMU userspace
> still exist, resulting in an OOPs.
> 
> It adds a configfs_depend_item() in vhost_scsi_set_endpoint() to obtain
> an explicit dependency on se_tpg->tpg_group in order to prevent individual
> vhost-scsi WWPN endpoints from being released via normal configfs methods
> while an QEMU ioctl reference still exists.
> 
> Also, add matching configfs_undepend_item() in vhost_scsi_clear_endpoint()
> to release the dependency, once QEMU's reference to the individual group
> at /sys/kernel/config/target/vhost/$WWPN/$TPGT is released.
> 
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Cc: <stable@vger.kernel.org> # 3.6+
> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
> ---
>  drivers/vhost/scsi.c |   22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
> index 69906ca..bc1b620 100644
> --- a/drivers/vhost/scsi.c
> +++ b/drivers/vhost/scsi.c
> @@ -1312,6 +1312,7 @@ static int
>  vhost_scsi_set_endpoint(struct vhost_scsi *vs,
>  			struct vhost_scsi_target *t)
>  {
> +	struct se_portal_group *se_tpg;
>  	struct tcm_vhost_tport *tv_tport;
>  	struct tcm_vhost_tpg *tpg;
>  	struct tcm_vhost_tpg **vs_tpg;
> @@ -1359,6 +1360,19 @@ vhost_scsi_set_endpoint(struct vhost_scsi *vs,
>  				ret = -EEXIST;
>  				goto out;
>  			}
> +			/*
> +			 * In order to ensure individual vhost-scsi configfs
> +			 * groups cannot be removed while in use by vhost ioctl,
> +			 * go ahead and take an explicit se_tpg->tpg_group.cg_item
> +			 * dependency now.
> +			 */
> +			se_tpg = &tpg->se_tpg;
> +			ret = configfs_depend_item(se_tpg->se_tpg_tfo->tf_subsys,
> +						   &se_tpg->tpg_group.cg_item);
> +			if (ret) {
> +				pr_warn("configfs_depend_item() failed: %d\n", ret);
> +				goto out;
> +			}
>  			tpg->tv_tpg_vhost_count++;
>  			tpg->vhost_scsi = vs;
>  			vs_tpg[tpg->tport_tpgt] = tpg;
> @@ -1401,6 +1415,7 @@ static int
>  vhost_scsi_clear_endpoint(struct vhost_scsi *vs,
>  			  struct vhost_scsi_target *t)
>  {
> +	struct se_portal_group *se_tpg;
>  	struct tcm_vhost_tport *tv_tport;
>  	struct tcm_vhost_tpg *tpg;
>  	struct vhost_virtqueue *vq;
> @@ -1449,6 +1464,13 @@ vhost_scsi_clear_endpoint(struct vhost_scsi *vs,
>  		vs->vs_tpg[target] = NULL;
>  		match = true;
>  		mutex_unlock(&tpg->tv_tpg_mutex);
> +		/*
> +		 * Release se_tpg->tpg_group.cg_item configfs dependency now
> +		 * to allow vhost-scsi WWPN se_tpg->tpg_group shutdown to occur.
> +		 */
> +		se_tpg = &tpg->se_tpg;
> +		configfs_undepend_item(se_tpg->se_tpg_tfo->tf_subsys,
> +				       &se_tpg->tpg_group.cg_item);
>  	}
>  	if (match) {
>  		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {

AFAICT from qemu code, the ioctl VHOST_SCSI_CLEAR_ENDPOINT is always
called during shutdown in order to release the endpoint and drop this
new configfs dependency.

The question is, what happens when qemu crashes..?  Is there currently
an assurance that VHOST_SCSI_CLEAR_ENDPOINT is called via the normal
VirtioDeviceClass->unrealize() when qemu exits abnormally..?

Thanks,

--nab

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paolo Bonzini Oct. 9, 2014, 8:49 a.m. UTC | #2
Il 09/10/2014 06:14, Nicholas A. Bellinger ha scritto:
> AFAICT from qemu code, the ioctl VHOST_SCSI_CLEAR_ENDPOINT is always
> called during shutdown in order to release the endpoint and drop this
> new configfs dependency.

As far as I can see, the only path leading to the ioctl is
vhost_scsi_set_status->vhost_scsi_stop.  That only happens if the guest
driver resets the device upon shutdown, or via vhost_scsi_unrealize as
you pointed out.  But unrealize() is only called when a device is
hot-unplugged.

It does not happen if you close QEMU with SIGTERM, ctrl-c, or with the
"quit" command, because no attempt is done to bring down the VM data
structures (or free memory, or close file descriptors) in case of a
fatal exit.  The kernel should do that for us.

Besides that...

> The question is, what happens when qemu crashes..?  Is there currently
> an assurance that VHOST_SCSI_CLEAR_ENDPOINT is called via the normal
> VirtioDeviceClass->unrealize() when qemu exits abnormally..?

... of course nothing is called if you SIGKILL QEMU.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paolo Bonzini Oct. 9, 2014, 10:49 a.m. UTC | #3
Il 09/10/2014 10:49, Paolo Bonzini ha scritto:
> 
> It does not happen if you close QEMU with SIGTERM, ctrl-c, or with the
> "quit" command, because no attempt is done to bring down the VM data
> structures (or free memory, or close file descriptors) in case of a
> fatal exit.  The kernel should do that for us.

... and in the case of vhost-scsi, doesn't it do that when
vhost_scsi_release calls vhost_scsi_clear_endpoint?

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicholas A. Bellinger Oct. 21, 2014, 7:05 p.m. UTC | #4
Hey Paolo,

On Thu, 2014-10-09 at 12:49 +0200, Paolo Bonzini wrote:
> Il 09/10/2014 10:49, Paolo Bonzini ha scritto:
> > 
> > It does not happen if you close QEMU with SIGTERM, ctrl-c, or with the
> > "quit" command, because no attempt is done to bring down the VM data
> > structures (or free memory, or close file descriptors) in case of a
> > fatal exit.  The kernel should do that for us.
> 
> ... and in the case of vhost-scsi, doesn't it do that when
> vhost_scsi_release calls vhost_scsi_clear_endpoint?
> 

Thanks for the extra clarifications here.

The SIGTERM, ctrl-c and "quit" command cases happen as you describe, and
invoke vhost_scsi_release() -> vhost_scsi_clear_endpoint() to drop the
endpoint reference.

AFAICT, it's the SIGKILL case that is problematic both with and without
this patch.  With the patch, the configfs dependency on the vhost-scsi
endpoint group is left in place, thus preventing the group (and
underlying target_core_mod) from being removed until a
VHOST_SCSI_CLEAR_ENDPOINT with the same wwpn is called to drop the
original reference.

Without the patch, the group can still be removed at any time, but any
subsequent VHOST_SCSI_SET_ENDPOINT attempts with the original wwpn will
fail after SIGKILL, because the original reference is still in place.

So I held off on pushing this patch to -rc1 for the moment, but even
with the above limitation preventing group shutdown after SIGKILL, I
think it's still better to obtain the configfs dependency to prevent the
removal of endpoints while there are active references.

That said, I'm still unsure how to address the SIGKILL case, and what's
the most sane way to drop dead references after it happens, and how
vhost-scsi should be differentiating between dead and active references.

Any ideas..?

--nab

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael S. Tsirkin Oct. 21, 2014, 10:23 p.m. UTC | #5
On Tue, Oct 21, 2014 at 12:05:31PM -0700, Nicholas A. Bellinger wrote:
> Hey Paolo,
> 
> On Thu, 2014-10-09 at 12:49 +0200, Paolo Bonzini wrote:
> > Il 09/10/2014 10:49, Paolo Bonzini ha scritto:
> > > 
> > > It does not happen if you close QEMU with SIGTERM, ctrl-c, or with the
> > > "quit" command, because no attempt is done to bring down the VM data
> > > structures (or free memory, or close file descriptors) in case of a
> > > fatal exit.  The kernel should do that for us.
> > 
> > ... and in the case of vhost-scsi, doesn't it do that when
> > vhost_scsi_release calls vhost_scsi_clear_endpoint?
> > 
> 
> Thanks for the extra clarifications here.
> 
> The SIGTERM, ctrl-c and "quit" command cases happen as you describe, and
> invoke vhost_scsi_release() -> vhost_scsi_clear_endpoint() to drop the
> endpoint reference.
> 
> AFAICT, it's the SIGKILL case that is problematic both with and without
> this patch.  With the patch, the configfs dependency on the vhost-scsi
> endpoint group is left in place, thus preventing the group (and
> underlying target_core_mod) from being removed until a
> VHOST_SCSI_CLEAR_ENDPOINT with the same wwpn is called to drop the
> original reference.
> 
> Without the patch, the group can still be removed at any time, but any
> subsequent VHOST_SCSI_SET_ENDPOINT attempts with the original wwpn will
> fail after SIGKILL, because the original reference is still in place.
> 
> So I held off on pushing this patch to -rc1 for the moment, but even
> with the above limitation preventing group shutdown after SIGKILL, I
> think it's still better to obtain the configfs dependency to prevent the
> removal of endpoints while there are active references.
> 
> That said, I'm still unsure how to address the SIGKILL case, and what's
> the most sane way to drop dead references after it happens, and how
> vhost-scsi should be differentiating between dead and active references.
> 
> Any ideas..?
> 
> --nab

Need to use some other file (not sysfs), cleanup on release.
diff mbox

Patch

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 69906ca..bc1b620 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1312,6 +1312,7 @@  static int
 vhost_scsi_set_endpoint(struct vhost_scsi *vs,
 			struct vhost_scsi_target *t)
 {
+	struct se_portal_group *se_tpg;
 	struct tcm_vhost_tport *tv_tport;
 	struct tcm_vhost_tpg *tpg;
 	struct tcm_vhost_tpg **vs_tpg;
@@ -1359,6 +1360,19 @@  vhost_scsi_set_endpoint(struct vhost_scsi *vs,
 				ret = -EEXIST;
 				goto out;
 			}
+			/*
+			 * In order to ensure individual vhost-scsi configfs
+			 * groups cannot be removed while in use by vhost ioctl,
+			 * go ahead and take an explicit se_tpg->tpg_group.cg_item
+			 * dependency now.
+			 */
+			se_tpg = &tpg->se_tpg;
+			ret = configfs_depend_item(se_tpg->se_tpg_tfo->tf_subsys,
+						   &se_tpg->tpg_group.cg_item);
+			if (ret) {
+				pr_warn("configfs_depend_item() failed: %d\n", ret);
+				goto out;
+			}
 			tpg->tv_tpg_vhost_count++;
 			tpg->vhost_scsi = vs;
 			vs_tpg[tpg->tport_tpgt] = tpg;
@@ -1401,6 +1415,7 @@  static int
 vhost_scsi_clear_endpoint(struct vhost_scsi *vs,
 			  struct vhost_scsi_target *t)
 {
+	struct se_portal_group *se_tpg;
 	struct tcm_vhost_tport *tv_tport;
 	struct tcm_vhost_tpg *tpg;
 	struct vhost_virtqueue *vq;
@@ -1449,6 +1464,13 @@  vhost_scsi_clear_endpoint(struct vhost_scsi *vs,
 		vs->vs_tpg[target] = NULL;
 		match = true;
 		mutex_unlock(&tpg->tv_tpg_mutex);
+		/*
+		 * Release se_tpg->tpg_group.cg_item configfs dependency now
+		 * to allow vhost-scsi WWPN se_tpg->tpg_group shutdown to occur.
+		 */
+		se_tpg = &tpg->se_tpg;
+		configfs_undepend_item(se_tpg->se_tpg_tfo->tf_subsys,
+				       &se_tpg->tpg_group.cg_item);
 	}
 	if (match) {
 		for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {