[V2] ocfs2: free up write context when direct IO failed
diff mbox series

Message ID 20181102170632.25921-1-wen.gang.wang@oracle.com
State New
Headers show
Series
  • [V2] ocfs2: free up write context when direct IO failed
Related show

Commit Message

Wengang Nov. 2, 2018, 5:06 p.m. UTC
The write context should also be freed even when direct IO failed.
Otherwise a memory leak is introduced and entries remain in
oi->ip_unwritten_list causing the following BUG later in unlink path:

ERROR: bug expression: !list_empty(&oi->ip_unwritten_list)
ERROR: Clear inode of 215043, inode has unwritten extents
...
Call Trace:
? __set_current_blocked+0x42/0x68
ocfs2_evict_inode+0x91/0x6a0 [ocfs2]
? bit_waitqueue+0x40/0x33
evict+0xdb/0x1af
iput+0x1a2/0x1f7
do_unlinkat+0x194/0x28f
SyS_unlinkat+0x1b/0x2f
do_syscall_64+0x79/0x1ae
entry_SYSCALL_64_after_hwframe+0x151/0x0

This patch also logs, with frequency limit, direct IO failures.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
---
v1 -> v2: log direct IO failures
---
 fs/ocfs2/aops.c            | 12 ++++++++++--
 fs/ocfs2/cluster/masklog.h |  9 +++++++++
 2 files changed, 19 insertions(+), 2 deletions(-)

Comments

Wengang Nov. 2, 2018, 5:10 p.m. UTC | #1
Hi Guys,

Thanks for your reviews.
As suggested by Junxiao, logging direct IO failures is added in v2. 
Please review the v2.

thanks,
wengang


On 2018/11/2 10:06, Wengang Wang wrote:
> The write context should also be freed even when direct IO failed.
> Otherwise a memory leak is introduced and entries remain in
> oi->ip_unwritten_list causing the following BUG later in unlink path:
>
> ERROR: bug expression: !list_empty(&oi->ip_unwritten_list)
> ERROR: Clear inode of 215043, inode has unwritten extents
> ...
> Call Trace:
> ? __set_current_blocked+0x42/0x68
> ocfs2_evict_inode+0x91/0x6a0 [ocfs2]
> ? bit_waitqueue+0x40/0x33
> evict+0xdb/0x1af
> iput+0x1a2/0x1f7
> do_unlinkat+0x194/0x28f
> SyS_unlinkat+0x1b/0x2f
> do_syscall_64+0x79/0x1ae
> entry_SYSCALL_64_after_hwframe+0x151/0x0
>
> This patch also logs, with frequency limit, direct IO failures.
>
> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
> ---
> v1 -> v2: log direct IO failures
> ---
>   fs/ocfs2/aops.c            | 12 ++++++++++--
>   fs/ocfs2/cluster/masklog.h |  9 +++++++++
>   2 files changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> index 302cd7c..7578bd5 100644
> --- a/fs/ocfs2/aops.c
> +++ b/fs/ocfs2/aops.c
> @@ -2412,8 +2412,16 @@ static int ocfs2_dio_end_io(struct kiocb *iocb,
>   	/* this io's submitter should not have unlocked this before we could */
>   	BUG_ON(!ocfs2_iocb_is_rw_locked(iocb));
>   
> -	if (bytes > 0 && private)
> -		ret = ocfs2_dio_end_io_write(inode, private, offset, bytes);
> +	if (bytes <= 0)
> +		mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld",
> +				 (long long)bytes);
> +	if (private) {
> +		if (bytes > 0)
> +			ret = ocfs2_dio_end_io_write(inode, private, offset,
> +						     bytes);
> +		else
> +			ocfs2_dio_free_write_ctx(inode, private);
> +	}
>   
>   	ocfs2_iocb_clear_rw_locked(iocb);
>   
> diff --git a/fs/ocfs2/cluster/masklog.h b/fs/ocfs2/cluster/masklog.h
> index 308ea0e..a396096 100644
> --- a/fs/ocfs2/cluster/masklog.h
> +++ b/fs/ocfs2/cluster/masklog.h
> @@ -178,6 +178,15 @@ do {									\
>   			      ##__VA_ARGS__);				\
>   } while (0)
>   
> +#define mlog_ratelimited(mask, fmt, ...)				\
> +do {									\
> +	static DEFINE_RATELIMIT_STATE(_rs,				\
> +				      DEFAULT_RATELIMIT_INTERVAL,	\
> +				      DEFAULT_RATELIMIT_BURST);		\
> +	if (__ratelimit(&_rs))						\
> +		mlog(mask, fmt, ##__VA_ARGS__);				\
> +} while (0)
> +
>   #define mlog_errno(st) ({						\
>   	int _st = (st);							\
>   	if (_st != -ERESTARTSYS && _st != -EINTR &&			\
Greg Kroah-Hartman Nov. 3, 2018, 6:44 a.m. UTC | #2
On Fri, Nov 02, 2018 at 10:06:32AM -0700, Wengang Wang wrote:
> The write context should also be freed even when direct IO failed.
> Otherwise a memory leak is introduced and entries remain in
> oi->ip_unwritten_list causing the following BUG later in unlink path:
> 
> ERROR: bug expression: !list_empty(&oi->ip_unwritten_list)
> ERROR: Clear inode of 215043, inode has unwritten extents
> ...
> Call Trace:
> ? __set_current_blocked+0x42/0x68
> ocfs2_evict_inode+0x91/0x6a0 [ocfs2]
> ? bit_waitqueue+0x40/0x33
> evict+0xdb/0x1af
> iput+0x1a2/0x1f7
> do_unlinkat+0x194/0x28f
> SyS_unlinkat+0x1b/0x2f
> do_syscall_64+0x79/0x1ae
> entry_SYSCALL_64_after_hwframe+0x151/0x0
> 
> This patch also logs, with frequency limit, direct IO failures.
> 
> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
> ---
> v1 -> v2: log direct IO failures
> ---
>  fs/ocfs2/aops.c            | 12 ++++++++++--
>  fs/ocfs2/cluster/masklog.h |  9 +++++++++
>  2 files changed, 19 insertions(+), 2 deletions(-)

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://urldefense.proofpoint.com/v2/url?u=https-3A__www.kernel.org_doc_html_latest_process_stable-2Dkernel-2Drules.html&d=DwIBAg&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=C7gAd4uDxlAvTdc0vmU6X8CMk6L2iDY8-HD0qT6Fo7Y&m=P2I-XIsUfJ0h1jlymjMe7ziTkrs4pJ-sCZYaijMZpWw&s=ygRPYcgBDv1LNZjnBgPzukjEYvJ_sE8jlBo7R_3_b5c&e=
for how to do this properly.

</formletter>
Junxiao Bi Nov. 4, 2018, 11:54 p.m. UTC | #3
On 11/03/2018 01:06 AM, Wengang Wang wrote:

> The write context should also be freed even when direct IO failed.
> Otherwise a memory leak is introduced and entries remain in
> oi->ip_unwritten_list causing the following BUG later in unlink path:
>
> ERROR: bug expression: !list_empty(&oi->ip_unwritten_list)
> ERROR: Clear inode of 215043, inode has unwritten extents
> ...
> Call Trace:
> ? __set_current_blocked+0x42/0x68
> ocfs2_evict_inode+0x91/0x6a0 [ocfs2]
> ? bit_waitqueue+0x40/0x33
> evict+0xdb/0x1af
> iput+0x1a2/0x1f7
> do_unlinkat+0x194/0x28f
> SyS_unlinkat+0x1b/0x2f
> do_syscall_64+0x79/0x1ae
> entry_SYSCALL_64_after_hwframe+0x151/0x0
>
> This patch also logs, with frequency limit, direct IO failures.
>
> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
> ---
> v1 -> v2: log direct IO failures
> ---
>   fs/ocfs2/aops.c            | 12 ++++++++++--
>   fs/ocfs2/cluster/masklog.h |  9 +++++++++
>   2 files changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> index 302cd7c..7578bd5 100644
> --- a/fs/ocfs2/aops.c
> +++ b/fs/ocfs2/aops.c
> @@ -2412,8 +2412,16 @@ static int ocfs2_dio_end_io(struct kiocb *iocb,
>   	/* this io's submitter should not have unlocked this before we could */
>   	BUG_ON(!ocfs2_iocb_is_rw_locked(iocb));
>   
> -	if (bytes > 0 && private)
> -		ret = ocfs2_dio_end_io_write(inode, private, offset, bytes);
> +	if (bytes <= 0)
> +		mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld",
> +				 (long long)bytes);
> +	if (private) {
> +		if (bytes > 0)
> +			ret = ocfs2_dio_end_io_write(inode, private, offset,
> +						     bytes);
> +		else
> +			ocfs2_dio_free_write_ctx(inode, private);
> +	}
>   
>   	ocfs2_iocb_clear_rw_locked(iocb);
>   
> diff --git a/fs/ocfs2/cluster/masklog.h b/fs/ocfs2/cluster/masklog.h
> index 308ea0e..a396096 100644
> --- a/fs/ocfs2/cluster/masklog.h
> +++ b/fs/ocfs2/cluster/masklog.h
> @@ -178,6 +178,15 @@ do {									\
>   			      ##__VA_ARGS__);				\
>   } while (0)
>   
> +#define mlog_ratelimited(mask, fmt, ...)				\
> +do {									\
> +	static DEFINE_RATELIMIT_STATE(_rs,				\
> +				      DEFAULT_RATELIMIT_INTERVAL,	\
> +				      DEFAULT_RATELIMIT_BURST);		\
> +	if (__ratelimit(&_rs))						\
> +		mlog(mask, fmt, ##__VA_ARGS__);				\
> +} while (0)
> +
>   #define mlog_errno(st) ({						\
>   	int _st = (st);							\
>   	if (_st != -ERESTARTSYS && _st != -EINTR &&			\
Changwei Ge Nov. 5, 2018, 1:59 a.m. UTC | #4
This version also added a nice mlog infrastructure.

Looks good to me.

Thanks,
Changwei

On 2018/11/3 1:08, Wengang Wang wrote:
> The write context should also be freed even when direct IO failed.
> Otherwise a memory leak is introduced and entries remain in
> oi->ip_unwritten_list causing the following BUG later in unlink path:
> 
> ERROR: bug expression: !list_empty(&oi->ip_unwritten_list)
> ERROR: Clear inode of 215043, inode has unwritten extents
> ...
> Call Trace:
> ? __set_current_blocked+0x42/0x68
> ocfs2_evict_inode+0x91/0x6a0 [ocfs2]
> ? bit_waitqueue+0x40/0x33
> evict+0xdb/0x1af
> iput+0x1a2/0x1f7
> do_unlinkat+0x194/0x28f
> SyS_unlinkat+0x1b/0x2f
> do_syscall_64+0x79/0x1ae
> entry_SYSCALL_64_after_hwframe+0x151/0x0
> 
> This patch also logs, with frequency limit, direct IO failures.
> 
> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>

Reviewed-by: Changwei Ge <ge.changwei@h3c.com>

> ---
> v1 -> v2: log direct IO failures
> ---
>   fs/ocfs2/aops.c            | 12 ++++++++++--
>   fs/ocfs2/cluster/masklog.h |  9 +++++++++
>   2 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> index 302cd7c..7578bd5 100644
> --- a/fs/ocfs2/aops.c
> +++ b/fs/ocfs2/aops.c
> @@ -2412,8 +2412,16 @@ static int ocfs2_dio_end_io(struct kiocb *iocb,
>   	/* this io's submitter should not have unlocked this before we could */
>   	BUG_ON(!ocfs2_iocb_is_rw_locked(iocb));
>   
> -	if (bytes > 0 && private)
> -		ret = ocfs2_dio_end_io_write(inode, private, offset, bytes);
> +	if (bytes <= 0)
> +		mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld",
> +				 (long long)bytes);
> +	if (private) {
> +		if (bytes > 0)
> +			ret = ocfs2_dio_end_io_write(inode, private, offset,
> +						     bytes);
> +		else
> +			ocfs2_dio_free_write_ctx(inode, private);
> +	}
>   
>   	ocfs2_iocb_clear_rw_locked(iocb);
>   
> diff --git a/fs/ocfs2/cluster/masklog.h b/fs/ocfs2/cluster/masklog.h
> index 308ea0e..a396096 100644
> --- a/fs/ocfs2/cluster/masklog.h
> +++ b/fs/ocfs2/cluster/masklog.h
> @@ -178,6 +178,15 @@ do {									\
>   			      ##__VA_ARGS__);				\
>   } while (0)
>   
> +#define mlog_ratelimited(mask, fmt, ...)				\
> +do {									\
> +	static DEFINE_RATELIMIT_STATE(_rs,				\
> +				      DEFAULT_RATELIMIT_INTERVAL,	\
> +				      DEFAULT_RATELIMIT_BURST);		\
> +	if (__ratelimit(&_rs))						\
> +		mlog(mask, fmt, ##__VA_ARGS__);				\
> +} while (0)
> +
>   #define mlog_errno(st) ({						\
>   	int _st = (st);							\
>   	if (_st != -ERESTARTSYS && _st != -EINTR &&			\
>
Joseph Qi Nov. 5, 2018, 2:05 a.m. UTC | #5
On 18/11/3 01:06, Wengang Wang wrote:
> The write context should also be freed even when direct IO failed.
> Otherwise a memory leak is introduced and entries remain in
> oi->ip_unwritten_list causing the following BUG later in unlink path:
> 
> ERROR: bug expression: !list_empty(&oi->ip_unwritten_list)
> ERROR: Clear inode of 215043, inode has unwritten extents
> ...
> Call Trace:
> ? __set_current_blocked+0x42/0x68
> ocfs2_evict_inode+0x91/0x6a0 [ocfs2]
> ? bit_waitqueue+0x40/0x33
> evict+0xdb/0x1af
> iput+0x1a2/0x1f7
> do_unlinkat+0x194/0x28f
> SyS_unlinkat+0x1b/0x2f
> do_syscall_64+0x79/0x1ae
> entry_SYSCALL_64_after_hwframe+0x151/0x0
> 
> This patch also logs, with frequency limit, direct IO failures.
> 
> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>

Reviewed-by: Joseph Qi <jiangqi903@gmail.com>

> ---
> v1 -> v2: log direct IO failures
> ---
>  fs/ocfs2/aops.c            | 12 ++++++++++--
>  fs/ocfs2/cluster/masklog.h |  9 +++++++++
>  2 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> index 302cd7c..7578bd5 100644
> --- a/fs/ocfs2/aops.c
> +++ b/fs/ocfs2/aops.c
> @@ -2412,8 +2412,16 @@ static int ocfs2_dio_end_io(struct kiocb *iocb,
>  	/* this io's submitter should not have unlocked this before we could */
>  	BUG_ON(!ocfs2_iocb_is_rw_locked(iocb));
>  
> -	if (bytes > 0 && private)
> -		ret = ocfs2_dio_end_io_write(inode, private, offset, bytes);
> +	if (bytes <= 0)
> +		mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld",
> +				 (long long)bytes);
> +	if (private) {
> +		if (bytes > 0)
> +			ret = ocfs2_dio_end_io_write(inode, private, offset,
> +						     bytes);
> +		else
> +			ocfs2_dio_free_write_ctx(inode, private);
> +	}
>  
>  	ocfs2_iocb_clear_rw_locked(iocb);
>  
> diff --git a/fs/ocfs2/cluster/masklog.h b/fs/ocfs2/cluster/masklog.h
> index 308ea0e..a396096 100644
> --- a/fs/ocfs2/cluster/masklog.h
> +++ b/fs/ocfs2/cluster/masklog.h
> @@ -178,6 +178,15 @@ do {									\
>  			      ##__VA_ARGS__);				\
>  } while (0)
>  
> +#define mlog_ratelimited(mask, fmt, ...)				\
> +do {									\
> +	static DEFINE_RATELIMIT_STATE(_rs,				\
> +				      DEFAULT_RATELIMIT_INTERVAL,	\
> +				      DEFAULT_RATELIMIT_BURST);		\
> +	if (__ratelimit(&_rs))						\
> +		mlog(mask, fmt, ##__VA_ARGS__);				\
> +} while (0)
> +
>  #define mlog_errno(st) ({						\
>  	int _st = (st);							\
>  	if (_st != -ERESTARTSYS && _st != -EINTR &&			\
>
Wengang Nov. 6, 2018, 8:02 p.m. UTC | #6
Hi AKPM,

Now we have three reviewed-bys for this patch:

Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Changwei Ge <ge.changwei@h3c.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>

Could you please merge it to mainline code?

Also thanks Junxiao, Changwei and Joseph for the reviews!

Thanks,
Wengang

On 2018/11/4 18:05, Joseph Qi wrote:
>
> On 18/11/3 01:06, Wengang Wang wrote:
>> The write context should also be freed even when direct IO failed.
>> Otherwise a memory leak is introduced and entries remain in
>> oi->ip_unwritten_list causing the following BUG later in unlink path:
>>
>> ERROR: bug expression: !list_empty(&oi->ip_unwritten_list)
>> ERROR: Clear inode of 215043, inode has unwritten extents
>> ...
>> Call Trace:
>> ? __set_current_blocked+0x42/0x68
>> ocfs2_evict_inode+0x91/0x6a0 [ocfs2]
>> ? bit_waitqueue+0x40/0x33
>> evict+0xdb/0x1af
>> iput+0x1a2/0x1f7
>> do_unlinkat+0x194/0x28f
>> SyS_unlinkat+0x1b/0x2f
>> do_syscall_64+0x79/0x1ae
>> entry_SYSCALL_64_after_hwframe+0x151/0x0
>>
>> This patch also logs, with frequency limit, direct IO failures.
>>
>> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
> Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
>
>> ---
>> v1 -> v2: log direct IO failures
>> ---
>>   fs/ocfs2/aops.c            | 12 ++++++++++--
>>   fs/ocfs2/cluster/masklog.h |  9 +++++++++
>>   2 files changed, 19 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
>> index 302cd7c..7578bd5 100644
>> --- a/fs/ocfs2/aops.c
>> +++ b/fs/ocfs2/aops.c
>> @@ -2412,8 +2412,16 @@ static int ocfs2_dio_end_io(struct kiocb *iocb,
>>   	/* this io's submitter should not have unlocked this before we could */
>>   	BUG_ON(!ocfs2_iocb_is_rw_locked(iocb));
>>   
>> -	if (bytes > 0 && private)
>> -		ret = ocfs2_dio_end_io_write(inode, private, offset, bytes);
>> +	if (bytes <= 0)
>> +		mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld",
>> +				 (long long)bytes);
>> +	if (private) {
>> +		if (bytes > 0)
>> +			ret = ocfs2_dio_end_io_write(inode, private, offset,
>> +						     bytes);
>> +		else
>> +			ocfs2_dio_free_write_ctx(inode, private);
>> +	}
>>   
>>   	ocfs2_iocb_clear_rw_locked(iocb);
>>   
>> diff --git a/fs/ocfs2/cluster/masklog.h b/fs/ocfs2/cluster/masklog.h
>> index 308ea0e..a396096 100644
>> --- a/fs/ocfs2/cluster/masklog.h
>> +++ b/fs/ocfs2/cluster/masklog.h
>> @@ -178,6 +178,15 @@ do {									\
>>   			      ##__VA_ARGS__);				\
>>   } while (0)
>>   
>> +#define mlog_ratelimited(mask, fmt, ...)				\
>> +do {									\
>> +	static DEFINE_RATELIMIT_STATE(_rs,				\
>> +				      DEFAULT_RATELIMIT_INTERVAL,	\
>> +				      DEFAULT_RATELIMIT_BURST);		\
>> +	if (__ratelimit(&_rs))						\
>> +		mlog(mask, fmt, ##__VA_ARGS__);				\
>> +} while (0)
>> +
>>   #define mlog_errno(st) ({						\
>>   	int _st = (st);							\
>>   	if (_st != -ERESTARTSYS && _st != -EINTR &&			\
>>

Patch
diff mbox series

diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 302cd7c..7578bd5 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -2412,8 +2412,16 @@  static int ocfs2_dio_end_io(struct kiocb *iocb,
 	/* this io's submitter should not have unlocked this before we could */
 	BUG_ON(!ocfs2_iocb_is_rw_locked(iocb));
 
-	if (bytes > 0 && private)
-		ret = ocfs2_dio_end_io_write(inode, private, offset, bytes);
+	if (bytes <= 0)
+		mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld",
+				 (long long)bytes);
+	if (private) {
+		if (bytes > 0)
+			ret = ocfs2_dio_end_io_write(inode, private, offset,
+						     bytes);
+		else
+			ocfs2_dio_free_write_ctx(inode, private);
+	}
 
 	ocfs2_iocb_clear_rw_locked(iocb);
 
diff --git a/fs/ocfs2/cluster/masklog.h b/fs/ocfs2/cluster/masklog.h
index 308ea0e..a396096 100644
--- a/fs/ocfs2/cluster/masklog.h
+++ b/fs/ocfs2/cluster/masklog.h
@@ -178,6 +178,15 @@  do {									\
 			      ##__VA_ARGS__);				\
 } while (0)
 
+#define mlog_ratelimited(mask, fmt, ...)				\
+do {									\
+	static DEFINE_RATELIMIT_STATE(_rs,				\
+				      DEFAULT_RATELIMIT_INTERVAL,	\
+				      DEFAULT_RATELIMIT_BURST);		\
+	if (__ratelimit(&_rs))						\
+		mlog(mask, fmt, ##__VA_ARGS__);				\
+} while (0)
+
 #define mlog_errno(st) ({						\
 	int _st = (st);							\
 	if (_st != -ERESTARTSYS && _st != -EINTR &&			\