[6/6] Fix unsafe fw_event_list usage

Message ID 1433821856-2815280-7-git-send-email-calvinowens@fb.com (mailing list archive)
State New, archived

Commit Message

Calvin Owens June 9, 2015, 3:50 a.m. UTC
Since the fw_event deletes itself from the list, cleanup_queue() can
walk onto garbage pointers or walk off into freed memory.

This refactors the code in _scsih_fw_event_cleanup_queue() to not
iterate over the fw_event_list without a lock. 

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

Comments

Christoph Hellwig July 3, 2015, 4:02 p.m. UTC | #1
On Mon, Jun 08, 2015 at 08:50:56PM -0700, Calvin Owens wrote:
> Since the fw_event deletes itself from the list, cleanup_queue() can
> walk onto garbage pointers or walk off into freed memory.
> 
> This refactors the code in _scsih_fw_event_cleanup_queue() to not
> iterate over the fw_event_list without a lock. 

I think this really should be folded into the previous one; with the
fixes in this one, the other refcounting changes don't make a whole lot
of sense.

> +static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
> +{
> +	unsigned long flags;
> +	struct fw_event_work *fw_event = NULL;
> +
> +	spin_lock_irqsave(&ioc->fw_event_lock, flags);
> +	if (!list_empty(&ioc->fw_event_list)) {
> +		fw_event = list_first_entry(&ioc->fw_event_list,
> +				struct fw_event_work, list);
> +		list_del_init(&fw_event->list);
> +		fw_event_work_get(fw_event);
> +	}
> +	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
> +
> +	return fw_event;

Shouldn't we have a reference for each item on the list that gets
transferred to whoever removes it from the list?

Additionally _firmware_event_work should call dequeue_next_fw_event
first in the function so that item is off the list before we process
it, and can then just drop the reference once it's done.
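
The ownership model suggested above (the list holds one reference per
queued item, and that reference passes to whoever unlinks the item) can be
sketched in userspace roughly as follows. This is only an illustration
under stated assumptions, not the driver code: struct event, struct
event_queue, event_enqueue(), event_dequeue(), event_put(),
queue_cleanup() and the live_events counter are all hypothetical names, a
C11 atomic_flag stands in for ioc->fw_event_lock, and a hand-rolled FIFO
stands in for the kernel's list_head.

```c
/* Userspace sketch only: every name here is hypothetical, not mpt2sas code. */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct event {
	struct event *next;  /* FIFO link, protected by the queue lock */
	atomic_int refcount; /* event is freed when this reaches zero */
	int id;
};

struct event_queue {
	atomic_flag lock;    /* stand-in for ioc->fw_event_lock */
	struct event *head;
	struct event *tail;
};

static atomic_int live_events; /* allocated-but-not-freed count, for the demo */

static void queue_init(struct event_queue *q)
{
	atomic_flag_clear(&q->lock);
	q->head = q->tail = NULL;
}

static void queue_lock(struct event_queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		; /* spin until the flag is released */
}

static void queue_unlock(struct event_queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Drop one reference; the last holder frees the event. */
static void event_put(struct event *ev)
{
	if (atomic_fetch_sub(&ev->refcount, 1) == 1) {
		atomic_fetch_sub(&live_events, 1);
		free(ev);
	}
}

/* Allocate an event and queue it; the initial reference belongs to the list. */
static int event_enqueue(struct event_queue *q, int id)
{
	struct event *ev = malloc(sizeof(*ev));

	if (!ev)
		return -1;
	ev->next = NULL;
	ev->id = id;
	atomic_init(&ev->refcount, 1);
	atomic_fetch_add(&live_events, 1);

	queue_lock(q);
	if (q->tail)
		q->tail->next = ev;
	else
		q->head = ev;
	q->tail = ev;
	queue_unlock(q);
	return 0;
}

/* Unlink the oldest event; the list's reference passes to the caller. */
static struct event *event_dequeue(struct event_queue *q)
{
	struct event *ev;

	queue_lock(q);
	ev = q->head;
	if (ev) {
		q->head = ev->next;
		if (!q->head)
			q->tail = NULL;
		ev->next = NULL;
	}
	queue_unlock(q);
	return ev;
}

/* Drop everything still queued; returns how many events were released. */
static int queue_cleanup(struct event_queue *q)
{
	struct event *ev;
	int dropped = 0;

	while ((ev = event_dequeue(q))) {
		event_put(ev);
		dropped++;
	}
	return dropped;
}
```

In this sketch a worker would call event_dequeue() first, process the
event, and finish with a single event_put(). An event that a worker has
already unlinked is no longer visible to queue_cleanup(), and it stays
valid until its holder drops the reference, so the cleanup loop never
walks a list that another context is modifying without the lock.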

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Calvin Owens July 12, 2015, 4:20 a.m. UTC | #2
On Friday 07/03 at 09:02 -0700, Christoph Hellwig wrote:
> On Mon, Jun 08, 2015 at 08:50:56PM -0700, Calvin Owens wrote:
> > Since the fw_event deletes itself from the list, cleanup_queue() can
> > walk onto garbage pointers or walk off into freed memory.
> > 
> > This refactors the code in _scsih_fw_event_cleanup_queue() to not
> > iterate over the fw_event_list without a lock. 
> 
> I think this really should be folded into the previous one; with the
> fixes in this one, the other refcounting changes don't make a whole lot
> of sense.
> 
> > +static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
> > +{
> > +	unsigned long flags;
> > +	struct fw_event_work *fw_event = NULL;
> > +
> > +	spin_lock_irqsave(&ioc->fw_event_lock, flags);
> > +	if (!list_empty(&ioc->fw_event_list)) {
> > +		fw_event = list_first_entry(&ioc->fw_event_list,
> > +				struct fw_event_work, list);
> > +		list_del_init(&fw_event->list);
> > +		fw_event_work_get(fw_event);
> > +	}
> > +	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
> > +
> > +	return fw_event;
> 
> Shouldn't we have a reference for each item on the list that gets
> transferred to whoever removes it from the list?

Yes, this was a bit weird the way I did it. I redid this in v2, hopefully
it's clearer.

> Additionally _firmware_event_work should call dequeue_next_fw_event
> first in the function so that item is off the list before we process
> it, and can then just drop the reference once it's done.

That works: cleanup_queue() won't wait on some already-running events, but
destroy_workqueue() drains the wq, so we won't run ahead and free things
from under the fw_event when unwinding.

Patch

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 8d8c814..f504e28 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -2939,6 +2939,23 @@  mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
 	fw_event_work_put(fw_event);
 }
 
+static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
+{
+	unsigned long flags;
+	struct fw_event_work *fw_event = NULL;
+
+	spin_lock_irqsave(&ioc->fw_event_lock, flags);
+	if (!list_empty(&ioc->fw_event_list)) {
+		fw_event = list_first_entry(&ioc->fw_event_list,
+				struct fw_event_work, list);
+		list_del_init(&fw_event->list);
+		fw_event_work_get(fw_event);
+	}
+	spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
+
+	return fw_event;
+}
+
 /**
  * _scsih_fw_event_cleanup_queue - cleanup event queue
  * @ioc: per adapter object
@@ -2951,17 +2968,18 @@  mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
 static void
 _scsih_fw_event_cleanup_queue(struct MPT2SAS_ADAPTER *ioc)
 {
-	struct fw_event_work *fw_event, *next;
+	struct fw_event_work *fw_event;
 
 	if (list_empty(&ioc->fw_event_list) ||
 	     !ioc->firmware_event_thread || in_interrupt())
 		return;
 
-	list_for_each_entry_safe(fw_event, next, &ioc->fw_event_list, list) {
+	while ((fw_event = dequeue_next_fw_event(ioc))) {
 		if (cancel_delayed_work_sync(&fw_event->delayed_work)) {
 			_scsih_fw_event_free(ioc, fw_event);
 			continue;
 		}
+		fw_event_work_put(fw_event);
 	}
 }