
[0/2] avoid crashing when reading /proc/scsi/scsi and simultaneously removing devices

Message ID 1452539718.2363.12.camel@HansenPartnership.com (mailing list archive)
State Not Applicable, archived

Commit Message

James Bottomley Jan. 11, 2016, 7:15 p.m. UTC
On Mon, 2016-01-11 at 12:28 -0500, Ewan D. Milne wrote:
> From: "Ewan D. Milne" <emilne@redhat.com>
> 
> The klist traversal used by the reading of /proc/scsi/scsi is not
> interlocked against device removal.  It takes a reference on the
> containing object, but this does not prevent the device from being
> removed from the list.  Thus, we get errors and eventually panic, as
> shown in the traces below.  Fix this by keeping a klist iterator in
> the seq_file private data.
>
> The problem can be easily reproduced by repeatedly increasing
> scsi_debug's max_luns to 30 and then deleting the devices via sysfs,
> while simultaneously accessing /proc/scsi/scsi.
>
> From a patch originally developed by David Jeffery
> <djeffery@redhat.com>

OK, so it looks like this is a bug in the klist system.  When a
starting point is used, there should be a check to see if it's still
active; otherwise the whole thing is racy.  If it's fixed in klist, the
fix works for everyone, not just SCSI.

How about this?  It causes the iterator to start at the beginning if
the node has been deleted.  That will produce double output during some
of your test, but I think that's OK given that this is a rare race.

James

---

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Ewan Milne Jan. 11, 2016, 9:32 p.m. UTC | #1
On Mon, 2016-01-11 at 11:15 -0800, James Bottomley wrote:
> On Mon, 2016-01-11 at 12:28 -0500, Ewan D. Milne wrote:
> > From: "Ewan D. Milne" <emilne@redhat.com>
> > 
> > The klist traversal used by the reading of /proc/scsi/scsi is not
> > interlocked against device removal.  It takes a reference on the
> > containing object, but this does not prevent the device from being
> > removed from the list.  Thus, we get errors and eventually panic, as
> > shown in the traces below.  Fix this by keeping a klist iterator in
> > the seq_file private data.
> >
> > The problem can be easily reproduced by repeatedly increasing
> > scsi_debug's max_luns to 30 and then deleting the devices via sysfs,
> > while simultaneously accessing /proc/scsi/scsi.
> >
> > From a patch originally developed by David Jeffery
> > <djeffery@redhat.com>
> 
> OK, so it looks like this is a bug in the klist system.  When a
> starting point is used, there should be a check to see if it's still
> active; otherwise the whole thing is racy.  If it's fixed in klist, the
> fix works for everyone, not just SCSI.
> 
> How about this?  It causes the iterator to start at the beginning if
> the node has been deleted.  That will produce double output during some
> of your test, but I think that's OK given that this is a rare race.
> 
> James

I'm running with your change now, it does appear to fix the problem.
I guess the question is whether this behavior would trip up any other
klist users, for /proc/scsi/scsi it is probably not a problem.  The
worst that might happen is that userspace tools that parse the output
would get duplicate entries.

-Ewan

> ---
> 
> diff --git a/lib/klist.c b/lib/klist.c
> index d74cf7a..0507fa5 100644
> --- a/lib/klist.c
> +++ b/lib/klist.c
> @@ -282,9 +282,9 @@ void klist_iter_init_node(struct klist *k, struct klist_iter *i,
>  			  struct klist_node *n)
>  {
>  	i->i_klist = k;
> -	i->i_cur = n;
> -	if (n)
> -		kref_get(&n->n_ref);
> +	i->i_cur = NULL;
> +	if (n && kref_get_unless_zero(&n->n_ref))
> +		i->i_cur = n;
>  }
>  EXPORT_SYMBOL_GPL(klist_iter_init_node);
>  


James Bottomley Jan. 12, 2016, 2:35 a.m. UTC | #2
On Mon, 2016-01-11 at 16:32 -0500, Ewan Milne wrote:
> On Mon, 2016-01-11 at 11:15 -0800, James Bottomley wrote:
> > On Mon, 2016-01-11 at 12:28 -0500, Ewan D. Milne wrote:
> > > From: "Ewan D. Milne" <emilne@redhat.com>
> > > 
> > > The klist traversal used by the reading of /proc/scsi/scsi is not
> > > interlocked against device removal.  It takes a reference on the
> > > containing object, but this does not prevent the device from being
> > > removed from the list.  Thus, we get errors and eventually panic,
> > > as shown in the traces below.  Fix this by keeping a klist iterator
> > > in the seq_file private data.
> > >
> > > The problem can be easily reproduced by repeatedly increasing
> > > scsi_debug's max_luns to 30 and then deleting the devices via
> > > sysfs, while simultaneously accessing /proc/scsi/scsi.
> > >
> > > From a patch originally developed by David Jeffery
> > > <djeffery@redhat.com>
> > 
> > OK, so it looks like this is a bug in the klist system.  When a
> > starting point is used, there should be a check to see if it's still
> > active; otherwise the whole thing is racy.  If it's fixed in klist,
> > the fix works for everyone, not just SCSI.
> >
> > How about this?  It causes the iterator to start at the beginning if
> > the node has been deleted.  That will produce double output during
> > some of your test, but I think that's OK given that this is a rare
> > race.
> > 
> > James
> 
> I'm running with your change now, it does appear to fix the problem.
> I guess the question is whether this behavior would trip up any other
> klist users,

I don't see how it can.  We simply can't use a removed node as the
starting point without triggering the kref WARN_ON you see in your log.
Things just go downhill from there.  This will happen to any klist
user, not just us.  I'd contend that starting at the beginning is
better than an eventual panic for anyone.

What you should see currently is that the list is actually truncated
when the problem is hit, because the next pointer is set to the poison
value.  That causes another WARN_ON and traversal truncation (or a
straight dereference if you're unlucky).

> for /proc/scsi/scsi it is probably not a problem.  The
> worst that might happen is that userspace tools that parse the output
> would get duplicate entries.

Well, it's not really possible to keep the current behaviour and
truncate the list (before oopsing).  There's no non-racy way of
positioning the iterator at the last element so it exits on
klist_next() because the list can be added to during the iteration.

James


Patch

diff --git a/lib/klist.c b/lib/klist.c
index d74cf7a..0507fa5 100644
--- a/lib/klist.c
+++ b/lib/klist.c
@@ -282,9 +282,9 @@  void klist_iter_init_node(struct klist *k, struct klist_iter *i,
 			  struct klist_node *n)
 {
 	i->i_klist = k;
-	i->i_cur = n;
-	if (n)
-		kref_get(&n->n_ref);
+	i->i_cur = NULL;
+	if (n && kref_get_unless_zero(&n->n_ref))
+		i->i_cur = n;
 }
 EXPORT_SYMBOL_GPL(klist_iter_init_node);