diff mbox

scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure

Message ID 1519684636-16564-1-git-send-email-William.Kuzeja@stratus.com (mailing list archive)
State Changes Requested
Headers show

Commit Message

Bill Kuzeja Feb. 26, 2018, 10:37 p.m. UTC
Because of the shifting around of code in qla2x00_probe_one recently,
failures during adapter initialization can lead to problems, i.e. NULL
pointer crashes and doubly freed data structures which cause eventual
panics.

To address these problems, I tested all early probe exit points (and
debugged the resulting panics). The end result is a simple looking patch
set of changes, with minimum reshuffling of code. In my testing with this
patch, I have not encountered any NULL pointer crashes or double frees.

Fixes: d64d6c5671db ("scsi: qla2xxx: Fix NULL pointer crash due to probe failure")
Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>

---

Some of these issues are due to:

commit d64d6c5671db scsi: qla2xxx: Fix NULL pointer crash due to probe failure

where some frees were moved around, as well as the error exit from
a qla2x00_request_irqs failure.

This was a fix for:

commit d74595278f4a scsi: qla2xxx: Add multiple queue pair functionality.

which caused problems of its own.

To reproduce these issues, I run a test where I break the card early in init,
(and also through kprobe fault injection). This way, I've been able to hit
four different types of crashes.

Here's the leadup to the first issue:

[ 1044.133225] mpt3sas_cm4: diag reset: FAILED
[ 1072.694857] qla2xxx [0000:b7:00.0]-d04c:12: MBX Command timeout for cmd 2,
_status 0xffffffff hccr 0xffffffff
[ 1072.734020] qla2xxx [0000:b7:00.0]-00cf:12: Setup chip ****FAILED****.
[ 1072.748506] qla2xxx [0000:b7:00.0]-00d6:12: Failed to initialize adapter - Adapter flags 42.

and we crash....

[ 1072.787691] kernel BUG at mm/slub.c:3601!

Here's the stack trace from the dump:

PID: 20037  TASK: ffffa1447ab24f10  CPU: 0   COMMAND: "kworker/0:14"
 #0 [ffffa144c6ba7850] die at ffffffff8aa2e92b
 #1 [ffffa144c6ba7880] do_trap at ffffffff8b10e1f0
 #2 [ffffa144c6ba78d0] do_invalid_op at ffffffff8aa2b284
 #3 [ffffa144c6ba7980] invalid_op at ffffffff8b11b4ee
    [exception RIP: kfree+313]
    RIP: ffffffff8abf3f79  RSP: ffffa144c6ba7a30  RFLAGS: 00010246
    RAX: 001fffff00000000  RBX: ffffa13cdbe24000  RCX: 000000010040003c
    RDX: 001fffff00000000  RSI: fffffd615cf8a6c0  RDI: ffffa13cdbe24000
    RBP: ffffa144c6ba7a48   R8: ffffa143be29b1c0   R9: 000000010040003c
    R10: 00000000be29bc01  R11: fffffd61416f8900  R12: ffffa14c763bf000
    R13: ffffffffc08df36d  R14: ffffa13cdbe20740  R15: ffffa13cdbe20000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
 #4 [ffffa144c6ba7a50] qla2x00_probe_one at ffffffffc08df36d [qla2xxx]
 #5 [ffffa144c6ba7dd0] local_pci_probe at ffffffff8ad92e45
 #6 [ffffa144c6ba7e08] work_for_cpu_fn at ffffffff8aaae4c4
 #7 [ffffa144c6ba7e20] process_one_work at ffffffff8aab170a
 #8 [ffffa144c6ba7e68] worker_thread at ffffffff8aab2528
 #9 [ffffa144c6ba7ec8] kthread at ffffffff8aab96af

Here's we're trying to re-free ha->nvram in qla2x00_mem_free. When the
failure in ha->isp_ops->initialize_adapter occurs, we jump to the
probe_failed label. Then we end up calling qla2x00_mem_free twice because
qla2x00_free_device calls qla2x00_mem_free as well. This is problem #1.

Problem #2: If we fail even earlier - in qla2x00_mem_alloc - we jump to
probe_hw_failed which now tries to free memory that wasn't allocated
(note before commit d64d6c5671db, we only called qla2x00_clear_drv_active -
this was correct). My server crashed on qla2x00_free_req_que.

Next, notice that the only exit point that calls probe_init_failed is
when qla2x00_alloc_queues fails. But looking inside qla2x00_alloc_queues,
we only do the following if this routine is successful:

        ha->rsp_q_map[0] = rsp;
        ha->req_q_map[0] = req;
        set_bit(0, ha->rsp_qid_map);
        set_bit(0, ha->req_qid_map);

Otherwise, we free up ha->base_qpair, ha->rsp_q_map, ha->req_q_map
and exit with an error. Then we jump to probe_init_failed, which does
two things that are bad.

First, there's (problem #3):

       ha->req_q_map[0] = NULL;
       ha->rsp_q_map[0] = NULL;

In the failure case of qla2x00_alloc_queues, ha->req_q_map and
ha->rsp_q_map are set to NULL. Can't dereference a NULL pointer and
that is what we crash on here.

If the system didn't panic on this, it would have on problem #4.
There are extra calls to qla2x00_free_req_que and qla2x00_free_rsp_que
in probe_init_failed path because these calls are now duplicated in
probe_hw_failed because of commit d64d6c5671db.

Now for the changes...

Problem #1 is easily solved. When we enter probe_failed, have a new
variable that indicates this is the case so probe_hw_failed can check
it and not call the same free routines again (qla2x00_free_device calls
qla2x00_mem_free, qla2x00_free_req_que, and qla2x00_free_rsp_que).

For problem 2, an extra label (probe_hw_failed_no_mem) will be created
at the end of probe_hw_failed, so we don't try to free unallocated memory.

Problems 3 and 4 are also easy to remedy: if qla2x00_alloc_queues fails,
we should call probe_failed instead. And since nobody jumps to
probe_init_failed anymore, just eliminate that label. The calls to
qla2x00_free_req_que are done below in probe_hw_failed anyways, to which
we fall through.

I've tested out these changes thoroughly with a test that fails
initialization at various times. I've also used kprobes to inject errors
to force us into various error paths. The exit points are very solid now,
no NULL pointer crashes, no double frees.
---
 drivers/scsi/qla2xxx/qla_os.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

Comments

Hannes Reinecke Feb. 27, 2018, 11 a.m. UTC | #1
On 02/26/2018 11:37 PM, Bill Kuzeja wrote:
> Because of the shifting around of code in qla2x00_probe_one recently,
> failures during adapter initialization can lead to problems, i.e. NULL
> pointer crashes and doubly freed data structures which cause eventual
> panics.
> 
> To address these problems, I tested all early probe exit points (and
> debugged the resulting panics). The end result is a simple looking patch
> set of changes, with minimum reshuffling of code. In my testing with this
> patch, I have not encountered any NULL pointer crashes or double frees.
> 
> Fixes: d64d6c5671db ("scsi: qla2xxx: Fix NULL pointer crash due to probe failure")
> Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
> 
> ---
> 
> Some of these issues are due to:
> 
> commit d64d6c5671db scsi: qla2xxx: Fix NULL pointer crash due to probe failure
> 
> where some frees were moved around, as well as the error exit from
> a qla2x00_request_irqs failure.
> 
> This was a fix for:
> 
> commit d74595278f4a scsi: qla2xxx: Add multiple queue pair functionality.
> 
> which caused problems of its own.
> 
> To reproduce these issues, I run a test where I break the card early in init,
> (and also through kprobe fault injection). This way, I've been able to hit
> four different types of crashes.
> 
> Here's the leadup to the first issue:
> 
> [ 1044.133225] mpt3sas_cm4: diag reset: FAILED
> [ 1072.694857] qla2xxx [0000:b7:00.0]-d04c:12: MBX Command timeout for cmd 2,
> _status 0xffffffff hccr 0xffffffff
> [ 1072.734020] qla2xxx [0000:b7:00.0]-00cf:12: Setup chip ****FAILED****.
> [ 1072.748506] qla2xxx [0000:b7:00.0]-00d6:12: Failed to initialize adapter - Adapter flags 42.
> 
> and we crash....
> 
> [ 1072.787691] kernel BUG at mm/slub.c:3601!
> 
> Here's the stack trace from the dump:
> 
> PID: 20037  TASK: ffffa1447ab24f10  CPU: 0   COMMAND: "kworker/0:14"
>  #0 [ffffa144c6ba7850] die at ffffffff8aa2e92b
>  #1 [ffffa144c6ba7880] do_trap at ffffffff8b10e1f0
>  #2 [ffffa144c6ba78d0] do_invalid_op at ffffffff8aa2b284
>  #3 [ffffa144c6ba7980] invalid_op at ffffffff8b11b4ee
>     [exception RIP: kfree+313]
>     RIP: ffffffff8abf3f79  RSP: ffffa144c6ba7a30  RFLAGS: 00010246
>     RAX: 001fffff00000000  RBX: ffffa13cdbe24000  RCX: 000000010040003c
>     RDX: 001fffff00000000  RSI: fffffd615cf8a6c0  RDI: ffffa13cdbe24000
>     RBP: ffffa144c6ba7a48   R8: ffffa143be29b1c0   R9: 000000010040003c
>     R10: 00000000be29bc01  R11: fffffd61416f8900  R12: ffffa14c763bf000
>     R13: ffffffffc08df36d  R14: ffffa13cdbe20740  R15: ffffa13cdbe20000
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
>  #4 [ffffa144c6ba7a50] qla2x00_probe_one at ffffffffc08df36d [qla2xxx]
>  #5 [ffffa144c6ba7dd0] local_pci_probe at ffffffff8ad92e45
>  #6 [ffffa144c6ba7e08] work_for_cpu_fn at ffffffff8aaae4c4
>  #7 [ffffa144c6ba7e20] process_one_work at ffffffff8aab170a
>  #8 [ffffa144c6ba7e68] worker_thread at ffffffff8aab2528
>  #9 [ffffa144c6ba7ec8] kthread at ffffffff8aab96af
> 
> Here's we're trying to re-free ha->nvram in qla2x00_mem_free. When the
> failure in ha->isp_ops->initialize_adapter occurs, we jump to the
> probe_failed label. Then we end up calling qla2x00_mem_free twice because
> qla2x00_free_device calls qla2x00_mem_free as well. This is problem #1.
> 
> Problem #2: If we fail even earlier - in qla2x00_mem_alloc - we jump to
> probe_hw_failed which now tries to free memory that wasn't allocated
> (note before commit d64d6c5671db, we only called qla2x00_clear_drv_active -
> this was correct). My server crashed on qla2x00_free_req_que.
> 
> Next, notice that the only exit point that calls probe_init_failed is
> when qla2x00_alloc_queues fails. But looking inside qla2x00_alloc_queues,
> we only do the following if this routine is successful:
> 
>         ha->rsp_q_map[0] = rsp;
>         ha->req_q_map[0] = req;
>         set_bit(0, ha->rsp_qid_map);
>         set_bit(0, ha->req_qid_map);
> 
> Otherwise, we free up ha->base_qpair, ha->rsp_q_map, ha->req_q_map
> and exit with an error. Then we jump to probe_init_failed, which does
> two things that are bad.
> 
> First, there's (problem #3):
> 
>        ha->req_q_map[0] = NULL;
>        ha->rsp_q_map[0] = NULL;
> 
> In the failure case of qla2x00_alloc_queues, ha->req_q_map and
> ha->rsp_q_map are set to NULL. Can't dereference a NULL pointer and
> that is what we crash on here.
> 
> If the system didn't panic on this, it would have on problem #4.
> There are extra calls to qla2x00_free_req_que and qla2x00_free_rsp_que
> in probe_init_failed path because these calls are now duplicated in
> probe_hw_failed because of commit d64d6c5671db.
> 
> Now for the changes...
> 
> Problem #1 is easily solved. When we enter probe_failed, have a new
> variable that indicates this is the case so probe_hw_failed can check
> it and not call the same free routines again (qla2x00_free_device calls
> qla2x00_mem_free, qla2x00_free_req_que, and qla2x00_free_rsp_que).
> 
> For problem 2, an extra label (probe_hw_failed_no_mem) will be created
> at the end of probe_hw_failed, so we don't try to free unallocated memory.
> 
> Problems 3 and 4 are also easy to remedy: if qla2x00_alloc_queues fails,
> we should call probe_failed instead. And since nobody jumps to
> probe_init_failed anymore, just eliminate that label. The calls to
> qla2x00_free_req_que are done below in probe_hw_failed anyways, to which
> we fall through.
> 
> I've tested out these changes thoroughly with a test that fails
> initialization at various times. I've also used kprobes to inject errors
> to force us into various error paths. The exit points are very solid now,
> no NULL pointer crashes, no double frees.
> ---
>  drivers/scsi/qla2xxx/qla_os.c | 26 ++++++++++++--------------
>  1 file changed, 12 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index afcb5567..3faa7a3 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -2739,6 +2739,7 @@ static void qla2x00_iocb_work_fn(struct work_struct *work)
>  	uint16_t req_length = 0, rsp_length = 0;
>  	struct req_que *req = NULL;
>  	struct rsp_que *rsp = NULL;
> +	int probe_failed = 0;
>  	int i;
>  
>  	bars = pci_select_bars(pdev, IORESOURCE_MEM | IORESOURCE_IO);
> @@ -3024,7 +3025,7 @@ static void qla2x00_iocb_work_fn(struct work_struct *work)
>  		ql_log_pci(ql_log_fatal, pdev, 0x0031,
>  		    "Failed to allocate memory for adapter, aborting.\n");
>  
> -		goto probe_hw_failed;
> +		goto probe_hw_failed_no_mem;
>  	}
>  
>  	req->max_q_depth = MAX_Q_DEPTH;
> @@ -3102,7 +3103,7 @@ static void qla2x00_iocb_work_fn(struct work_struct *work)
>  		ql_log(ql_log_fatal, base_vha, 0x003d,
>  		    "Failed to allocate memory for queue pointers..."
>  		    "aborting.\n");
> -		goto probe_init_failed;
> +		goto probe_failed;
>  	}
>  
>  	if (ha->mqenable && shost_use_blk_mq(host)) {
> @@ -3387,16 +3388,8 @@ static void qla2x00_iocb_work_fn(struct work_struct *work)
>  
>  	return 0;
>  
> -probe_init_failed:
> -	qla2x00_free_req_que(ha, req);
> -	ha->req_q_map[0] = NULL;
> -	clear_bit(0, ha->req_qid_map);
> -	qla2x00_free_rsp_que(ha, rsp);
> -	ha->rsp_q_map[0] = NULL;
> -	clear_bit(0, ha->rsp_qid_map);
> -	ha->max_req_queues = ha->max_rsp_queues = 0;
> -
>  probe_failed:
> +	probe_failed = 1;
>  	if (base_vha->timer_active)
>  		qla2x00_stop_timer(base_vha);
>  	base_vha->flags.online = 0;
> @@ -3412,9 +3405,14 @@ static void qla2x00_iocb_work_fn(struct work_struct *work)
>  	scsi_host_put(base_vha->host);
>  
>  probe_hw_failed:
> -	qla2x00_mem_free(ha);
> -	qla2x00_free_req_que(ha, req);
> -	qla2x00_free_rsp_que(ha, rsp);
> +	/* These already done in qla2x00_free_device, don't double free */
> +	if (!probe_failed) {
> +		qla2x00_mem_free(ha);
> +		qla2x00_free_req_que(ha, req);
> +		qla2x00_free_rsp_que(ha, rsp);
> +	}
> +
> +probe_hw_failed_no_mem:
>  	qla2x00_clear_drv_active(ha);
>  
>  iospace_config_failed:
> 
Hmm. Isn't is rather the case that the labels and gotos are mixed up?
We have this order of labels:

probe_init_failed
probe_failed
probe_hw_failed
iospace_config_failed
disable_device

but this potential calling order:

disable_device
iospace_config_failed
probe_hw_failed
probe_hw_failed
probe_init_failed
probe_failed
probe_failed
probe_failed

So it looks to me as if the 'probe_init_failed' and 'probe_failed'
labels should be exchanged...
But yeah, we need to fix this up.
I've run into this issue myself recently :-(

Cheers,

Hannes
Bill Kuzeja Feb. 27, 2018, 3:18 p.m. UTC | #2
Thanks for the quick reply and analysis Hannes! My apologies in advance, I'm stuck on 
Outlook here at work - I'll try to format this to be readable (hopefully it doesn't get
mangled).

On 02/27/2018 06:01 AM, Hannes Reinecke wrote:

> Hmm. Isn't is rather the case that the labels and gotos are mixed up?
> We have this order of labels:

> probe_init_failed
> probe_failed
> probe_hw_failed
> iospace_config_failed
> disable_device

> but this potential calling order:

> disable_device
> iospace_config_failed
> probe_hw_failed
> probe_hw_failed
> probe_init_failed
> probe_failed
> probe_failed
> probe_failed

> So it looks to me as if the 'probe_init_failed' and 'probe_failed'
> labels should be exchanged...
> But yeah, we need to fix this up.
> I've run into this issue myself recently :-(

So, that is partly the problem. But.....if we switch the order, we still have problems.
Consider the code with order switched:

probe_failed:
....
        qla2x00_free_device(base_vha);
        scsi_host_put(base_vha->host);

probe_init_failed:
        qla2x00_free_req_que(ha, req);
        ha->req_q_map[0] = NULL;     alloc_queues succeeds
        clear_bit(0, ha->req_qid_map);
        qla2x00_free_rsp_que(ha, rsp);
        ha->rsp_q_map[0] = NULL;
        clear_bit(0, ha->rsp_qid_map);
        ha->max_req_queues = ha->max_rsp_queues = 0;

probe_hw_failed:
        qla2x00_mem_free(ha);
        qla2x00_free_req_que(ha, req);
        qla2x00_free_rsp_que(ha, rsp);
        qla2x00_clear_drv_active(ha);

1) We call probe_init_failed only if qla2x00_alloc_queues fails. But if this routine fails,
ha->req_q_map = NULL (as well as rsp_q_map), so we crash doing 
ha->req_q_map[0] = NULL. So, I can remove these assignments. But the
qla2x00_free_req_que calls are redundant (they are done below), so we're left 
with clearing out max_req_queues and max_rsp_queues for this label. Since 
we're freeing ha anyways, I'm not sure if this is worth having another label for.

2) We still have the problem where in probe_failed, qla2x00_free_device does 
frees which are repeated later on in probe_hw_failed. 

qla2x00_free_device -> qla2x00_mem_free
qla2x00_free_device -> qla2x00_free_queues -> qla2x00_free_req_que

3) Also, I still need another label for when qla2x00_mem_alloc fails. We can't just
jump to probe_hw_failed.

I suppose we could do something like this:

probe_failed:
        probe_failed = 1;
....
        qla2x00_free_device(base_vha);
        scsi_host_put(base_vha->host);

probe_init_failed:
        ha->max_req_queues = ha->max_rsp_queues = 0;

probe_hw_failed:
        if (!probe_failed) {
              qla2x00_mem_free(ha);
              qla2x00_free_req_que(ha, req);
              qla2x00_free_rsp_que(ha, rsp);
        }
 probe_hw_failed_no_mem:
        qla2x00_clear_drv_active(ha);

This is essentially what I have, except my version does not have the probe_init_failed 
label. Thoughts??

Thanks again and regards.

---
Hannes Reinecke Feb. 27, 2018, 4:28 p.m. UTC | #3
On 02/27/2018 04:18 PM, Kuzeja, William wrote:
> 
> Thanks for the quick reply and analysis Hannes! My apologies in advance, I'm stuck on 
> Outlook here at work - I'll try to format this to be readable (hopefully it doesn't get
> mangled).
> 
> On 02/27/2018 06:01 AM, Hannes Reinecke wrote:
> 
>> Hmm. Isn't is rather the case that the labels and gotos are mixed up?
>> We have this order of labels:
> 
>> probe_init_failed
>> probe_failed
>> probe_hw_failed
>> iospace_config_failed
>> disable_device
> 
>> but this potential calling order:
> 
>> disable_device
>> iospace_config_failed
>> probe_hw_failed
>> probe_hw_failed
>> probe_init_failed
>> probe_failed
>> probe_failed
>> probe_failed
> 
>> So it looks to me as if the 'probe_init_failed' and 'probe_failed'
>> labels should be exchanged...
>> But yeah, we need to fix this up.
>> I've run into this issue myself recently :-(
> 
> So, that is partly the problem. But.....if we switch the order, we still have problems.
> Consider the code with order switched:
> 
> probe_failed:
> ....
>         qla2x00_free_device(base_vha);
>         scsi_host_put(base_vha->host);
> 
> probe_init_failed:
>         qla2x00_free_req_que(ha, req);
>         ha->req_q_map[0] = NULL;     alloc_queues succeeds
>         clear_bit(0, ha->req_qid_map);
>         qla2x00_free_rsp_que(ha, rsp);
>         ha->rsp_q_map[0] = NULL;
>         clear_bit(0, ha->rsp_qid_map);
>         ha->max_req_queues = ha->max_rsp_queues = 0;
> 
> probe_hw_failed:
>         qla2x00_mem_free(ha);
>         qla2x00_free_req_que(ha, req);
>         qla2x00_free_rsp_que(ha, rsp);
>         qla2x00_clear_drv_active(ha);
> 
> 1) We call probe_init_failed only if qla2x00_alloc_queues fails. But if this routine fails,
> ha->req_q_map = NULL (as well as rsp_q_map), so we crash doing 
> ha->req_q_map[0] = NULL. So, I can remove these assignments. But the
> qla2x00_free_req_que calls are redundant (they are done below), so we're left 
> with clearing out max_req_queues and max_rsp_queues for this label. Since 
> we're freeing ha anyways, I'm not sure if this is worth having another label for.
> 
> 2) We still have the problem where in probe_failed, qla2x00_free_device does 
> frees which are repeated later on in probe_hw_failed. 
> 
> qla2x00_free_device -> qla2x00_mem_free
> qla2x00_free_device -> qla2x00_free_queues -> qla2x00_free_req_que
> 
> 3) Also, I still need another label for when qla2x00_mem_alloc fails. We can't just
> jump to probe_hw_failed.
> 
> I suppose we could do something like this:
> 
> probe_failed:
>         probe_failed = 1;
> ....
>         qla2x00_free_device(base_vha);
>         scsi_host_put(base_vha->host);
> 
> probe_init_failed:
>         ha->max_req_queues = ha->max_rsp_queues = 0;
> 
> probe_hw_failed:
>         if (!probe_failed) {
>               qla2x00_mem_free(ha);
>               qla2x00_free_req_que(ha, req);
>               qla2x00_free_rsp_que(ha, rsp);
>         }
Why not make all of these idempotent, so that they won't crash when
called twice? Then we can drop the 'probe_failed' marker, and the entire
code becomes more readable.
Plus the diff will be smaller, too...

Cheers,

Hannes
diff mbox

Patch

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index afcb5567..3faa7a3 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -2739,6 +2739,7 @@  static void qla2x00_iocb_work_fn(struct work_struct *work)
 	uint16_t req_length = 0, rsp_length = 0;
 	struct req_que *req = NULL;
 	struct rsp_que *rsp = NULL;
+	int probe_failed = 0;
 	int i;
 
 	bars = pci_select_bars(pdev, IORESOURCE_MEM | IORESOURCE_IO);
@@ -3024,7 +3025,7 @@  static void qla2x00_iocb_work_fn(struct work_struct *work)
 		ql_log_pci(ql_log_fatal, pdev, 0x0031,
 		    "Failed to allocate memory for adapter, aborting.\n");
 
-		goto probe_hw_failed;
+		goto probe_hw_failed_no_mem;
 	}
 
 	req->max_q_depth = MAX_Q_DEPTH;
@@ -3102,7 +3103,7 @@  static void qla2x00_iocb_work_fn(struct work_struct *work)
 		ql_log(ql_log_fatal, base_vha, 0x003d,
 		    "Failed to allocate memory for queue pointers..."
 		    "aborting.\n");
-		goto probe_init_failed;
+		goto probe_failed;
 	}
 
 	if (ha->mqenable && shost_use_blk_mq(host)) {
@@ -3387,16 +3388,8 @@  static void qla2x00_iocb_work_fn(struct work_struct *work)
 
 	return 0;
 
-probe_init_failed:
-	qla2x00_free_req_que(ha, req);
-	ha->req_q_map[0] = NULL;
-	clear_bit(0, ha->req_qid_map);
-	qla2x00_free_rsp_que(ha, rsp);
-	ha->rsp_q_map[0] = NULL;
-	clear_bit(0, ha->rsp_qid_map);
-	ha->max_req_queues = ha->max_rsp_queues = 0;
-
 probe_failed:
+	probe_failed = 1;
 	if (base_vha->timer_active)
 		qla2x00_stop_timer(base_vha);
 	base_vha->flags.online = 0;
@@ -3412,9 +3405,14 @@  static void qla2x00_iocb_work_fn(struct work_struct *work)
 	scsi_host_put(base_vha->host);
 
 probe_hw_failed:
-	qla2x00_mem_free(ha);
-	qla2x00_free_req_que(ha, req);
-	qla2x00_free_rsp_que(ha, rsp);
+	/* These already done in qla2x00_free_device, don't double free */
+	if (!probe_failed) {
+		qla2x00_mem_free(ha);
+		qla2x00_free_req_que(ha, req);
+		qla2x00_free_rsp_que(ha, rsp);
+	}
+
+probe_hw_failed_no_mem:
 	qla2x00_clear_drv_active(ha);
 
 iospace_config_failed: