diff mbox series

[v14,08/15] xfs: Handle krealloc errors in xlog_recover_add_to_cont_trans

Message ID 20201218072917.16805-9-allison.henderson@oracle.com (mailing list archive)
State Superseded
Headers show
Series xfs: Delayed Attributes | expand

Commit Message

Allison Henderson Dec. 18, 2020, 7:29 a.m. UTC
Because xattrs can be over a page in size, we need to handle possible
krealloc errors to avoid warnings

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/xfs_log_recover.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Darrick J. Wong Jan. 5, 2021, 5:38 a.m. UTC | #1
On Fri, Dec 18, 2020 at 12:29:10AM -0700, Allison Henderson wrote:
> Because xattrs can be over a page in size, we need to handle possible
> krealloc errors to avoid warnings

Which warnings?

> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/xfs_log_recover.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 97f3130..295a5c6 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2061,7 +2061,10 @@ xlog_recover_add_to_cont_trans(
>  	old_ptr = item->ri_buf[item->ri_cnt-1].i_addr;
>  	old_len = item->ri_buf[item->ri_cnt-1].i_len;
>  
> -	ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL | __GFP_NOFAIL);
> +	ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL);

Does the removal of NOFAIL increase the likelihood that log recovery
will fail instead of looping around looking for more memory?

Hm, what /are/ we doing here, anyway?  I guess someone logged a gigantic
xattri item, which gets split across multiple log records, and now we're
trying to staple all that back together?  And perhaps the xattri item is
larger than a ... page(?) which causes dmesg warnings when combined with
NOFAIL?

--D

> +	if (ptr == NULL)
> +		return -ENOMEM;
> +
>  	memcpy(&ptr[old_len], dp, len);
>  	item->ri_buf[item->ri_cnt-1].i_len += len;
>  	item->ri_buf[item->ri_cnt-1].i_addr = ptr;
> -- 
> 2.7.4
>
Allison Henderson Jan. 5, 2021, 8:15 p.m. UTC | #2
On 1/4/21 10:38 PM, Darrick J. Wong wrote:
> On Fri, Dec 18, 2020 at 12:29:10AM -0700, Allison Henderson wrote:
>> Because xattrs can be over a page in size, we need to handle possible
>> krealloc errors to avoid warnings
> 
> Which warnings?

Sorry, I should have included it here.  The warning is:
[  +0.000016] WARNING: CPU: 1 PID: 20255 at mm/page_alloc.c:3446 
get_page_from_freelist+0x100b/0x1690

and if we look at that line number we have this snippet:
         /*
          * We most definitely don't want callers attempting to
          * allocate greater than order-1 page units with __GFP_NOFAIL.
          */
         WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
> 
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
>>   fs/xfs/xfs_log_recover.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
>> index 97f3130..295a5c6 100644
>> --- a/fs/xfs/xfs_log_recover.c
>> +++ b/fs/xfs/xfs_log_recover.c
>> @@ -2061,7 +2061,10 @@ xlog_recover_add_to_cont_trans(
>>   	old_ptr = item->ri_buf[item->ri_cnt-1].i_addr;
>>   	old_len = item->ri_buf[item->ri_cnt-1].i_len;
>>   
>> -	ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL | __GFP_NOFAIL);
>> +	ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL);
> 
> Does the removal of NOFAIL increase the likelihood that log recovery
> will fail instead of looping around looking for more memory?

I suppose it would?  But better to return the error code than proceed 
with a NULL pointer.  I would think it would be quickly proceeded with 
questions of what else is causing memory pressure to build though.

> 
> Hm, what /are/ we doing here, anyway?  I guess someone logged a gigantic
> xattri item, which gets split across multiple log records, and now we're
> trying to staple all that back together?  And perhaps the xattri item is
> larger than a ... page(?) which causes dmesg warnings when combined with
> NOFAIL?

Effectively yes, this is coming from one of the new test cases I came up 
with to test the replay.  It progressively sets larger and larger attrs 
and pulls the error tag to see that it replays correctly.  Up to 64k 
which I think is where ATTR_MAX_VALUELEN is.  I figured since we are 
opening up a means of logging as much, its something that we should be 
testing. :-)

Allison
> 
> --D
> 
>> +	if (ptr == NULL)
>> +		return -ENOMEM;
>> +
>>   	memcpy(&ptr[old_len], dp, len);
>>   	item->ri_buf[item->ri_cnt-1].i_len += len;
>>   	item->ri_buf[item->ri_cnt-1].i_addr = ptr;
>> -- 
>> 2.7.4
>>
diff mbox series

Patch

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 97f3130..295a5c6 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2061,7 +2061,10 @@  xlog_recover_add_to_cont_trans(
 	old_ptr = item->ri_buf[item->ri_cnt-1].i_addr;
 	old_len = item->ri_buf[item->ri_cnt-1].i_len;
 
-	ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL | __GFP_NOFAIL);
+	ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL);
+	if (ptr == NULL)
+		return -ENOMEM;
+
 	memcpy(&ptr[old_len], dp, len);
 	item->ri_buf[item->ri_cnt-1].i_len += len;
 	item->ri_buf[item->ri_cnt-1].i_addr = ptr;