Message ID | cc2fe0148944cfac5e58339bf98e76fd5c3a54b8.1636578573.git.christophe.jaillet@wanadoo.fr (mailing list archive) |
---|---|
State | Deferred |
Series | scsi: qla2xxx: Fix memory leaks in the error handling path of 'qla2x00_mem_alloc()' |

On Wed, Nov 10, 2021 at 10:11:34PM +0100, Christophe JAILLET wrote:
> In case of memory allocation failure, we should release many things and
> should not return directly.
>
> The tricky part here, is that some (kzalloc + dma_pool_alloc) resources
> are allocated and stored in 'unusable' and a 'good' list.
> The 'good' list is then freed and only the 'unusable' list remains
> allocated.
> So, only this 'unusable' list is then freed in the error handling path of
> the function.
>
> So, instead of adding even more code in this already huge function, just
> 'continue' (as already done if dma_pool_alloc() fails) instead of
> returning directly.
>
> After the 'for' loop, we will then branch to the correct place of the
> error handling path when another memory allocation will (likely) fail
> afterward.
>
> Fixes: 50b812755e97 ("scsi: qla2xxx: Fix DMA error when the DIF sg buffer crosses 4GB boundary")
> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
> ---
> Certainly not the best solution, but look 'safe' to me.

Your analysis seems correct, but this is deeply weird.  It sort of looks
like this was debug code that was committed accidentally.  Neither
the "good" list nor the "unusable" are used except to print some debug
info:

	ql_dbg_pci(ql_dbg_init, ha->pdev, 0x0024,
	    "%s: dif dma pool (good=%u unusable=%u)\n",
	    __func__, ha->pool.good.count,
	    ha->pool.unusable.count);

The good list is freed immediately, and then there is a no-op free in
qla2x00_mem_free().  The unusable list is preserved until qla2x00_mem_free()
but not used anywhere.

regards,
dan carpenter

On 11/11/2021 at 10:17, Dan Carpenter wrote:
> On Wed, Nov 10, 2021 at 10:11:34PM +0100, Christophe JAILLET wrote:
>> In case of memory allocation failure, we should release many things and
>> should not return directly.
>>
>> The tricky part here, is that some (kzalloc + dma_pool_alloc) resources
>> are allocated and stored in 'unusable' and a 'good' list.
>> The 'good' list is then freed and only the 'unusable' list remains
>> allocated.
>> So, only this 'unusable' list is then freed in the error handling path of
>> the function.
>>
>> So, instead of adding even more code in this already huge function, just
>> 'continue' (as already done if dma_pool_alloc() fails) instead of
>> returning directly.
>>
>> After the 'for' loop, we will then branch to the correct place of the
>> error handling path when another memory allocation will (likely) fail
>> afterward.
>>
>> Fixes: 50b812755e97 ("scsi: qla2xxx: Fix DMA error when the DIF sg buffer crosses 4GB boundary")
>> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
>> ---
>> Certainly not the best solution, but look 'safe' to me.
>
> Your analysis seems correct, but this is deeply weird.

I agree, deeply weird :)

> It sort of looks
> like this was debug code that was committed accidentally.  Neither
> the "good" list nor the "unusable" are used except to print some debug
> info:
>
> 	ql_dbg_pci(ql_dbg_init, ha->pdev, 0x0024,
> 	    "%s: dif dma pool (good=%u unusable=%u)\n",
> 	    __func__, ha->pool.good.count,
> 	    ha->pool.unusable.count);
>
> The good list is freed immediately, and then there is a no-op free in
> qla2x00_mem_free().

I agree.

> The unusable list is preserved until qla2x00_mem_free()
> but not used anywhere.

I agree.

The logic in commit '50b812755e97' puzzled me a lot.
I wonder why the 128 magic number in the for loop.

My understanding is:
   - try to allocate things at start-up
   - check if this allocation crosses the 4G limit (see commit log)
   - keep the "unusable" allocation allocated, so that this memory is
     reserved (i.e. wasted) and won't be allocated later (see usage of the
     dif_bundl_pool dma pool in [1])
   - hope that trying 128 allocations is enough and that no "unusable
     allocation" will be done at run-time.

In other words, I tried to convince myself that there was a real logic,
even if imperfect.

Even if the above description is correct and if it works as expected in
real life, it really looks like overkill!

Now that I reread the code around 'dif_local_dma_alloc' usage, I'm tempted
to agree with your feeling about debug code.

CJ

[1]: https://elixir.bootlin.com/linux/v5.15.1/source/drivers/scsi/qla2xxx/qla_iocb.c#L1138

>
> regards,
> dan carpenter
>
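
The "check if this allocation crosses the 4G limit" step described above can be sketched in plain userspace C. This is only an illustration of the idea, not the driver's code: crosses_4gb(), classify() and struct pool_counts are hypothetical names, the addresses are made-up values standing in for DMA handles, and the choice to compare the first and last byte of the buffer is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a qla2xxx function): true if the buffer
 * [addr, addr + len) spans a 4 GiB boundary, i.e. if the upper 32 bits
 * of its first and last byte addresses differ.
 */
static bool crosses_4gb(uint64_t addr, size_t len)
{
	uint64_t last;

	if (len == 0)
		return false;

	last = addr + (uint64_t)len - 1;
	return (addr >> 32) != (last >> 32);
}

/* Hypothetical counters, loosely modelled on the good/unusable counts. */
struct pool_counts {
	unsigned int good;
	unsigned int unusable;
};

/* Sort n buffers of len bytes each into "good" and "unusable" counts. */
static struct pool_counts classify(const uint64_t *addrs, size_t n, size_t len)
{
	struct pool_counts c = { 0, 0 };
	size_t i;

	assert(addrs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (crosses_4gb(addrs[i], len))
			c.unusable++;	/* would be kept aside, never handed out */
		else
			c.good++;	/* safe to use */
	}
	return c;
}
```

With a 4096-byte buffer, an address of 0xfffff000 ends on the last byte below the boundary and counts as good, while 0xfffff800 straddles the boundary and counts as unusable.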

On Thu, Nov 11, 2021 at 11:18:06AM +0100, Christophe JAILLET wrote:
> On 11/11/2021 at 10:17, Dan Carpenter wrote:
> > On Wed, Nov 10, 2021 at 10:11:34PM +0100, Christophe JAILLET wrote:
> > > In case of memory allocation failure, we should release many things and
> > > should not return directly.
> > >
> > > The tricky part here, is that some (kzalloc + dma_pool_alloc) resources
> > > are allocated and stored in 'unusable' and a 'good' list.
> > > The 'good' list is then freed and only the 'unusable' list remains
> > > allocated.
> > > So, only this 'unusable' list is then freed in the error handling path of
> > > the function.
> > >
> > > So, instead of adding even more code in this already huge function, just
> > > 'continue' (as already done if dma_pool_alloc() fails) instead of
> > > returning directly.
> > >
> > > After the 'for' loop, we will then branch to the correct place of the
> > > error handling path when another memory allocation will (likely) fail
> > > afterward.
> > >
> > > Fixes: 50b812755e97 ("scsi: qla2xxx: Fix DMA error when the DIF sg buffer crosses 4GB boundary")
> > > Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
> > > ---
> > > Certainly not the best solution, but look 'safe' to me.
> >
> > Your analysis seems correct, but this is deeply weird.
>
> I agree, deeply weird :)
>
> > It sort of looks
> > like this was debug code that was committed accidentally.  Neither
> > the "good" list nor the "unusable" are used except to print some debug
> > info:
> >
> > 	ql_dbg_pci(ql_dbg_init, ha->pdev, 0x0024,
> > 	    "%s: dif dma pool (good=%u unusable=%u)\n",
> > 	    __func__, ha->pool.good.count,
> > 	    ha->pool.unusable.count);
> >
> > The good list is freed immediately, and then there is a no-op free in
> > qla2x00_mem_free().
>
> I agree.
>
> > The unusable list is preserved until qla2x00_mem_free()
> > but not used anywhere.
>
> I agree.
>
> The logic in commit '50b812755e97' puzzled me a lot.
> I wonder why the 128 magic number in the for loop.
>
> My understanding is:
>    - try to allocate things at start-up
>    - check if this allocation crosses the 4G limit (see commit log)
>    - keep the "unusable" allocation allocated, so that this memory is
>      reserved (i.e. wasted) and won't be allocated later (see usage of the
>      dif_bundl_pool dma pool in [1])

Ah, I considered that but didn't follow through on the analysis all the
way.  Possible!

regards,
dan carpenter

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index abcd30917263..0722dd618b99 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -4151,7 +4151,7 @@ qla2x00_mem_alloc(struct qla_hw_data *ha, uint16_t req_len, uint16_t rsp_len,
 				ql_dbg_pci(ql_dbg_init, ha->pdev, 0xe0ee,
 				    "%s: failed alloc dsd\n",
 				    __func__);
-				return -ENOMEM;
+				continue;
 			}
 			ha->dif_bundle_kallocs++;

In case of memory allocation failure, we should release many things and
should not return directly.

The tricky part here, is that some (kzalloc + dma_pool_alloc) resources
are allocated and stored in 'unusable' and a 'good' list.
The 'good' list is then freed and only the 'unusable' list remains
allocated.
So, only this 'unusable' list is then freed in the error handling path of
the function.

So, instead of adding even more code in this already huge function, just
'continue' (as already done if dma_pool_alloc() fails) instead of
returning directly.

After the 'for' loop, we will then branch to the correct place of the
error handling path when another memory allocation will (likely) fail
afterward.

Fixes: 50b812755e97 ("scsi: qla2xxx: Fix DMA error when the DIF sg buffer crosses 4GB boundary")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
---
Certainly not the best solution, but look 'safe' to me.
---
 drivers/scsi/qla2xxx/qla_os.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
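
The difference between returning directly and using 'continue', as argued in the commit message above, can be modelled in userspace C. This is a rough sketch under stated assumptions, not the driver's logic: struct ctx, setup(), teardown(), tracked_alloc() and leaked_after_failed_setup() are all hypothetical names, allocation failure is injected at a chosen loop iteration, and the later allocation is simply assumed to fail whenever a loop allocation failed (the "(likely)" case in the commit message).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static int live_allocs;	/* allocations made and not yet freed */

/* Hypothetical allocator wrapper with fault injection. */
static void *tracked_alloc(size_t size, bool fail)
{
	void *p = fail ? NULL : malloc(size);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		free(p);
		live_allocs--;
	}
}

struct node { struct node *next; };

/* Stand-ins for what the function owns: earlier and later resources, and a list. */
struct ctx {
	void *earlier;
	void *later;
	struct node *unusable;
};

static void teardown(struct ctx *c)
{
	while (c->unusable) {
		struct node *nd = c->unusable;

		c->unusable = nd->next;
		tracked_free(nd);
	}
	tracked_free(c->later);
	tracked_free(c->earlier);
	c->later = c->earlier = NULL;
}

/* Returns 0 on success, -1 on failure; the caller frees nothing on failure. */
static int setup(struct ctx *c, int n, int fail_at, bool early_return)
{
	int i;

	c->unusable = NULL;
	c->later = NULL;
	c->earlier = tracked_alloc(32, false);
	if (!c->earlier)
		return -1;

	for (i = 0; i < n; i++) {
		struct node *nd = tracked_alloc(sizeof(*nd), i == fail_at);

		if (!nd) {
			if (early_return)
				return -1;	/* skips the unwinding below: leak */
			continue;		/* keep going, unwind later */
		}
		nd->next = c->unusable;
		c->unusable = nd;
	}

	/* Assumption: memory pressure persists, so this allocation fails too. */
	c->later = tracked_alloc(32, fail_at >= 0 && fail_at < n);
	if (!c->later)
		goto fail;
	return 0;

fail:
	teardown(c);
	return -1;
}

/* Run a failing setup (4 iterations, failure at index 2); report what is left. */
static int leaked_after_failed_setup(bool early_return)
{
	struct ctx c;

	live_allocs = 0;
	(void)setup(&c, 4, 2, early_return);
	return live_allocs;
}
```

In this model the early return leaves the earlier resource and the two nodes already on the list allocated, whereas the 'continue' variant reaches the common unwinding code and leaves nothing behind. The model relies on the later allocation failing; if it succeeded, setup() would return 0 with a shorter list, which matches the "(likely)" caveat in the commit message.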