diff mbox series

[v2] target: core: Prevent memory reclaim recursion

Message ID 20191108082901.417950-1-damien.lemoal@wdc.com (mailing list archive)
State Accepted
Commit 0eccce866f84db2cd23e1f28737920aa7b9e70d7
Headers show
Series [v2] target: core: Prevent memory reclaim recursion | expand

Commit Message

Damien Le Moal Nov. 8, 2019, 8:29 a.m. UTC
Prevent recursion into the IO path under low memory conditions by using
GFP_NOIO in place of GFP_KERNEL when allocating a new command with
tcmu_alloc_cmd() and user ring space with tcmu_get_empty_block().

Reported-by: Masato Suzuki <masato.suzuki@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---

Changes from v1:
* Added reported-by tag

 drivers/target/target_core_user.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Mike Christie Nov. 8, 2019, 3:19 p.m. UTC | #1
On 11/08/2019 02:29 AM, Damien Le Moal wrote:
> Prevent recursion into the IO path under low memory conditions by using
> GFP_NOIO in place of GFP_KERNEL when allocating a new command with
> tcmu_alloc_cmd() and user ring space with tcmu_get_empty_block().
> 
> Reported-by: Masato Suzuki <masato.suzuki@wdc.com>
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> ---
> 
> Changes from v1:
> * Added reported-by tag
> 
>  drivers/target/target_core_user.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
> index 35be1be87d2a..0b9dfa6b17bc 100644
> --- a/drivers/target/target_core_user.c
> +++ b/drivers/target/target_core_user.c
> @@ -499,7 +499,7 @@ static inline bool tcmu_get_empty_block(struct tcmu_dev *udev,
>  			schedule_delayed_work(&tcmu_unmap_work, 0);
>  
>  		/* try to get new page from the mm */
> -		page = alloc_page(GFP_KERNEL);
> +		page = alloc_page(GFP_NOIO);
>  		if (!page)
>  			goto err_alloc;
>  
> @@ -573,7 +573,7 @@ static struct tcmu_cmd *tcmu_alloc_cmd(struct se_cmd *se_cmd)
>  	struct tcmu_dev *udev = TCMU_DEV(se_dev);
>  	struct tcmu_cmd *tcmu_cmd;
>  
> -	tcmu_cmd = kmem_cache_zalloc(tcmu_cmd_cache, GFP_KERNEL);
> +	tcmu_cmd = kmem_cache_zalloc(tcmu_cmd_cache, GFP_NOIO);
>  	if (!tcmu_cmd)
>  		return NULL;
>  
> @@ -584,7 +584,7 @@ static struct tcmu_cmd *tcmu_alloc_cmd(struct se_cmd *se_cmd)
>  	tcmu_cmd_reset_dbi_cur(tcmu_cmd);
>  	tcmu_cmd->dbi_cnt = tcmu_cmd_get_block_cnt(tcmu_cmd);
>  	tcmu_cmd->dbi = kcalloc(tcmu_cmd->dbi_cnt, sizeof(uint32_t),
> -				GFP_KERNEL);
> +				GFP_NOIO);
>  	if (!tcmu_cmd->dbi) {
>  		kmem_cache_free(tcmu_cmd_cache, tcmu_cmd);
>  		return NULL;
> 

Acked-by: Mike Christie <mchristi@redhat.com>

We should also change tcmu_setup_cmd_timer so the gfp use in that code
path is consistent. I think we can do that in a separate patch later as
this one just fixes a specific bug.
Bart Van Assche Nov. 8, 2019, 4:22 p.m. UTC | #2
On 11/8/19 12:29 AM, Damien Le Moal wrote:
> Prevent recursion into the IO path under low memory conditions by using
> GFP_NOIO in place of GFP_KERNEL when allocating a new command with
> tcmu_alloc_cmd() and user ring space with tcmu_get_empty_block().
> 
> Reported-by: Masato Suzuki <masato.suzuki@wdc.com>
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> ---
> 
> Changes from v1:
> * Added reported-by tag
> 
>   drivers/target/target_core_user.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)

The patch subject is weird. Shouldn't the patch subject start with 
"tcmu" instead of "target: core"?

Has the recursion mentioned in the patch description been observed or is 
this a theoretical issue? I'm asking this because GFP_NOIO only prevents 
recursion if it is used inside a block driver or filesystem. The tcmu 
driver is neither - it submits block I/O or filesystem I/O instead of 
implementing a block driver or filesystem. Should we really disallow 
tcmu to use the swap subsystem?

Thanks,

Bart.
Mike Christie Nov. 8, 2019, 4:34 p.m. UTC | #3
On 11/08/2019 10:22 AM, Bart Van Assche wrote:
> On 11/8/19 12:29 AM, Damien Le Moal wrote:
>> Prevent recursion into the IO path under low memory conditions by using
>> GFP_NOIO in place of GFP_KERNEL when allocating a new command with
>> tcmu_alloc_cmd() and user ring space with tcmu_get_empty_block().
>>
>> Reported-by: Masato Suzuki <masato.suzuki@wdc.com>
>> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
>> ---
>>
>> Changes from v1:
>> * Added reported-by tag
>>
>>   drivers/target/target_core_user.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> The patch subject is weird. Shouldn't the patch subject start with
> "tcmu" instead of "target: core"?
> 
> Has the recursion mentioned in the patch description been observed or is
> this a theoretical issue? I'm asking this because GFP_NOIO only prevents

Observed.

> recursion if it is used inside a block driver or filesystem. The tcmu
> driver is neither - it submits block I/O or filesystem I/O instead of
> implementing a block driver or filesystem. Should we really disallow
> tcmu to use the swap subsystem?

A common use is tcm loop on the frontend and tcmu on the backend. You
see this with virt and containers, where some app is used to interacting
SCSI devices, but then the storage is backed by something that people
didn't want to put in the kernel. It's similar to nbd when you use
AF_UNIX sockets with a local running daemon.
Bart Van Assche Nov. 8, 2019, 5:02 p.m. UTC | #4
On 11/8/19 12:29 AM, Damien Le Moal wrote:
> Prevent recursion into the IO path under low memory conditions by using
> GFP_NOIO in place of GFP_KERNEL when allocating a new command with
> tcmu_alloc_cmd() and user ring space with tcmu_get_empty_block().

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Damien Le Moal Nov. 9, 2019, 2:18 a.m. UTC | #5
On 2019/11/09 1:22, Bart Van Assche wrote:
> On 11/8/19 12:29 AM, Damien Le Moal wrote:
>> Prevent recursion into the IO path under low memory conditions by using
>> GFP_NOIO in place of GFP_KERNEL when allocating a new command with
>> tcmu_alloc_cmd() and user ring space with tcmu_get_empty_block().
>>
>> Reported-by: Masato Suzuki <masato.suzuki@wdc.com>
>> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
>> ---
>>
>> Changes from v1:
>> * Added reported-by tag
>>
>>   drivers/target/target_core_user.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> The patch subject is weird. Shouldn't the patch subject start with 
> "tcmu" instead of "target: core"?

I was not sure what to put there since the file being touched is
target_core_user.c (it has both core and user). May be "target: user" is
better ?

Martin,

Please let me know if you want me to resend with a fixed header (and
what that header should be).

> Has the recursion mentioned in the patch description been observed or is 
> this a theoretical issue? I'm asking this because GFP_NOIO only prevents 
> recursion if it is used inside a block driver or filesystem. The tcmu 
> driver is neither - it submits block I/O or filesystem I/O instead of 
> implementing a block driver or filesystem. Should we really disallow 
> tcmu to use the swap subsystem?

As Mike confirmed already, the problem was observed and actually fairly
easy to recreate. Our go-to setup to test this is:

fio->XFS->dm-zoned->tcmu-runner ZBC handler (emulated ZBC
drive)->ext4->regular disk.

Without this patch and Mike's proposed prctl() patch adding NOIO through
prctl() in tcmu-runner context, memory reclaim recursions causing
deadlocks are triggered very quickly.

> 
> Thanks,
> 
> Bart.
> 
>
Martin K. Petersen Nov. 9, 2019, 2:38 a.m. UTC | #6
Damien,

> Please let me know if you want me to resend with a fixed header (and
> what that header should be).

I fixed it up. Applied to 5.5/scsi-queue. Thanks!
diff mbox series

Patch

diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 35be1be87d2a..0b9dfa6b17bc 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -499,7 +499,7 @@  static inline bool tcmu_get_empty_block(struct tcmu_dev *udev,
 			schedule_delayed_work(&tcmu_unmap_work, 0);
 
 		/* try to get new page from the mm */
-		page = alloc_page(GFP_KERNEL);
+		page = alloc_page(GFP_NOIO);
 		if (!page)
 			goto err_alloc;
 
@@ -573,7 +573,7 @@  static struct tcmu_cmd *tcmu_alloc_cmd(struct se_cmd *se_cmd)
 	struct tcmu_dev *udev = TCMU_DEV(se_dev);
 	struct tcmu_cmd *tcmu_cmd;
 
-	tcmu_cmd = kmem_cache_zalloc(tcmu_cmd_cache, GFP_KERNEL);
+	tcmu_cmd = kmem_cache_zalloc(tcmu_cmd_cache, GFP_NOIO);
 	if (!tcmu_cmd)
 		return NULL;
 
@@ -584,7 +584,7 @@  static struct tcmu_cmd *tcmu_alloc_cmd(struct se_cmd *se_cmd)
 	tcmu_cmd_reset_dbi_cur(tcmu_cmd);
 	tcmu_cmd->dbi_cnt = tcmu_cmd_get_block_cnt(tcmu_cmd);
 	tcmu_cmd->dbi = kcalloc(tcmu_cmd->dbi_cnt, sizeof(uint32_t),
-				GFP_KERNEL);
+				GFP_NOIO);
 	if (!tcmu_cmd->dbi) {
 		kmem_cache_free(tcmu_cmd_cache, tcmu_cmd);
 		return NULL;