Message ID | 20181102170632.25921-1-wen.gang.wang@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [V2] ocfs2: free up write context when direct IO failed | expand |
Hi Guys, Thanks for your reviews. As suggested by Junxiao, logging direct IO failures is added in v2. Please review the v2. thanks, wengang On 2018/11/2 10:06, Wengang Wang wrote: > The write context should also be freed even when direct IO failed. > Otherwise a memory leak is introduced and entries remain in > oi->ip_unwritten_list causing the following BUG later in unlink path: > > ERROR: bug expression: !list_empty(&oi->ip_unwritten_list) > ERROR: Clear inode of 215043, inode has unwritten extents > ... > Call Trace: > ? __set_current_blocked+0x42/0x68 > ocfs2_evict_inode+0x91/0x6a0 [ocfs2] > ? bit_waitqueue+0x40/0x33 > evict+0xdb/0x1af > iput+0x1a2/0x1f7 > do_unlinkat+0x194/0x28f > SyS_unlinkat+0x1b/0x2f > do_syscall_64+0x79/0x1ae > entry_SYSCALL_64_after_hwframe+0x151/0x0 > > This patch also logs, with frequency limit, direct IO failures. > > Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> > --- > v1 -> v2: log direct IO failures > --- > fs/ocfs2/aops.c | 12 ++++++++++-- > fs/ocfs2/cluster/masklog.h | 9 +++++++++ > 2 files changed, 19 insertions(+), 2 deletions(-) > > diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c > index 302cd7c..7578bd5 100644 > --- a/fs/ocfs2/aops.c > +++ b/fs/ocfs2/aops.c > @@ -2412,8 +2412,16 @@ static int ocfs2_dio_end_io(struct kiocb *iocb, > /* this io's submitter should not have unlocked this before we could */ > BUG_ON(!ocfs2_iocb_is_rw_locked(iocb)); > > - if (bytes > 0 && private) > - ret = ocfs2_dio_end_io_write(inode, private, offset, bytes); > + if (bytes <= 0) > + mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld", > + (long long)bytes); > + if (private) { > + if (bytes > 0) > + ret = ocfs2_dio_end_io_write(inode, private, offset, > + bytes); > + else > + ocfs2_dio_free_write_ctx(inode, private); > + } > > ocfs2_iocb_clear_rw_locked(iocb); > > diff --git a/fs/ocfs2/cluster/masklog.h b/fs/ocfs2/cluster/masklog.h > index 308ea0e..a396096 100644 > --- a/fs/ocfs2/cluster/masklog.h > +++ b/fs/ocfs2/cluster/masklog.h > @@ -178,6 +178,15 @@ do { \ > ##__VA_ARGS__); \ > } while (0) > > +#define mlog_ratelimited(mask, fmt, ...) \ > +do { \ > + static DEFINE_RATELIMIT_STATE(_rs, \ > + DEFAULT_RATELIMIT_INTERVAL, \ > + DEFAULT_RATELIMIT_BURST); \ > + if (__ratelimit(&_rs)) \ > + mlog(mask, fmt, ##__VA_ARGS__); \ > +} while (0) > + > #define mlog_errno(st) ({ \ > int _st = (st); \ > if (_st != -ERESTARTSYS && _st != -EINTR && \
On Fri, Nov 02, 2018 at 10:06:32AM -0700, Wengang Wang wrote: > The write context should also be freed even when direct IO failed. > Otherwise a memory leak is introduced and entries remain in > oi->ip_unwritten_list causing the following BUG later in unlink path: > > ERROR: bug expression: !list_empty(&oi->ip_unwritten_list) > ERROR: Clear inode of 215043, inode has unwritten extents > ... > Call Trace: > ? __set_current_blocked+0x42/0x68 > ocfs2_evict_inode+0x91/0x6a0 [ocfs2] > ? bit_waitqueue+0x40/0x33 > evict+0xdb/0x1af > iput+0x1a2/0x1f7 > do_unlinkat+0x194/0x28f > SyS_unlinkat+0x1b/0x2f > do_syscall_64+0x79/0x1ae > entry_SYSCALL_64_after_hwframe+0x151/0x0 > > This patch also logs, with frequency limit, direct IO failures. > > Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> > --- > v1 -> v2: log direct IO failures > --- > fs/ocfs2/aops.c | 12 ++++++++++-- > fs/ocfs2/cluster/masklog.h | 9 +++++++++ > 2 files changed, 19 insertions(+), 2 deletions(-) <formletter> This is not the correct way to submit patches for inclusion in the stable kernel tree. Please read: https://urldefense.proofpoint.com/v2/url?u=https-3A__www.kernel.org_doc_html_latest_process_stable-2Dkernel-2Drules.html&d=DwIBAg&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=C7gAd4uDxlAvTdc0vmU6X8CMk6L2iDY8-HD0qT6Fo7Y&m=P2I-XIsUfJ0h1jlymjMe7ziTkrs4pJ-sCZYaijMZpWw&s=ygRPYcgBDv1LNZjnBgPzukjEYvJ_sE8jlBo7R_3_b5c&e= for how to do this properly. </formletter>
On 11/03/2018 01:06 AM, Wengang Wang wrote: > The write context should also be freed even when direct IO failed. > Otherwise a memory leak is introduced and entries remain in > oi->ip_unwritten_list causing the following BUG later in unlink path: > > ERROR: bug expression: !list_empty(&oi->ip_unwritten_list) > ERROR: Clear inode of 215043, inode has unwritten extents > ... > Call Trace: > ? __set_current_blocked+0x42/0x68 > ocfs2_evict_inode+0x91/0x6a0 [ocfs2] > ? bit_waitqueue+0x40/0x33 > evict+0xdb/0x1af > iput+0x1a2/0x1f7 > do_unlinkat+0x194/0x28f > SyS_unlinkat+0x1b/0x2f > do_syscall_64+0x79/0x1ae > entry_SYSCALL_64_after_hwframe+0x151/0x0 > > This patch also logs, with frequency limit, direct IO failures. > > Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> > --- > v1 -> v2: log direct IO failures > --- > fs/ocfs2/aops.c | 12 ++++++++++-- > fs/ocfs2/cluster/masklog.h | 9 +++++++++ > 2 files changed, 19 insertions(+), 2 deletions(-) > > diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c > index 302cd7c..7578bd5 100644 > --- a/fs/ocfs2/aops.c > +++ b/fs/ocfs2/aops.c > @@ -2412,8 +2412,16 @@ static int ocfs2_dio_end_io(struct kiocb *iocb, > /* this io's submitter should not have unlocked this before we could */ > BUG_ON(!ocfs2_iocb_is_rw_locked(iocb)); > > - if (bytes > 0 && private) > - ret = ocfs2_dio_end_io_write(inode, private, offset, bytes); > + if (bytes <= 0) > + mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld", > + (long long)bytes); > + if (private) { > + if (bytes > 0) > + ret = ocfs2_dio_end_io_write(inode, private, offset, > + bytes); > + else > + ocfs2_dio_free_write_ctx(inode, private); > + } > > ocfs2_iocb_clear_rw_locked(iocb); > > diff --git a/fs/ocfs2/cluster/masklog.h b/fs/ocfs2/cluster/masklog.h > index 308ea0e..a396096 100644 > --- a/fs/ocfs2/cluster/masklog.h > +++ b/fs/ocfs2/cluster/masklog.h > @@ -178,6 +178,15 @@ do { \ > ##__VA_ARGS__); \ > } while (0) > > +#define mlog_ratelimited(mask, fmt, ...) \ > +do { \ > + static DEFINE_RATELIMIT_STATE(_rs, \ > + DEFAULT_RATELIMIT_INTERVAL, \ > + DEFAULT_RATELIMIT_BURST); \ > + if (__ratelimit(&_rs)) \ > + mlog(mask, fmt, ##__VA_ARGS__); \ > +} while (0) > + > #define mlog_errno(st) ({ \ > int _st = (st); \ > if (_st != -ERESTARTSYS && _st != -EINTR && \
This version also added a nice mlog infrastructure. Looks good to me. Thanks, Changwei On 2018/11/3 1:08, Wengang Wang wrote: > The write context should also be freed even when direct IO failed. > Otherwise a memory leak is introduced and entries remain in > oi->ip_unwritten_list causing the following BUG later in unlink path: > > ERROR: bug expression: !list_empty(&oi->ip_unwritten_list) > ERROR: Clear inode of 215043, inode has unwritten extents > ... > Call Trace: > ? __set_current_blocked+0x42/0x68 > ocfs2_evict_inode+0x91/0x6a0 [ocfs2] > ? bit_waitqueue+0x40/0x33 > evict+0xdb/0x1af > iput+0x1a2/0x1f7 > do_unlinkat+0x194/0x28f > SyS_unlinkat+0x1b/0x2f > do_syscall_64+0x79/0x1ae > entry_SYSCALL_64_after_hwframe+0x151/0x0 > > This patch also logs, with frequency limit, direct IO failures. > > Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Reviewed-by: Changwei Ge <ge.changwei@h3c.com> > --- > v1 -> v2: log direct IO failures > --- > fs/ocfs2/aops.c | 12 ++++++++++-- > fs/ocfs2/cluster/masklog.h | 9 +++++++++ > 2 files changed, 19 insertions(+), 2 deletions(-) > > diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c > index 302cd7c..7578bd5 100644 > --- a/fs/ocfs2/aops.c > +++ b/fs/ocfs2/aops.c > @@ -2412,8 +2412,16 @@ static int ocfs2_dio_end_io(struct kiocb *iocb, > /* this io's submitter should not have unlocked this before we could */ > BUG_ON(!ocfs2_iocb_is_rw_locked(iocb)); > > - if (bytes > 0 && private) > - ret = ocfs2_dio_end_io_write(inode, private, offset, bytes); > + if (bytes <= 0) > + mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld", > + (long long)bytes); > + if (private) { > + if (bytes > 0) > + ret = ocfs2_dio_end_io_write(inode, private, offset, > + bytes); > + else > + ocfs2_dio_free_write_ctx(inode, private); > + } > > ocfs2_iocb_clear_rw_locked(iocb); > > diff --git a/fs/ocfs2/cluster/masklog.h b/fs/ocfs2/cluster/masklog.h > index 308ea0e..a396096 100644 > --- a/fs/ocfs2/cluster/masklog.h > +++ b/fs/ocfs2/cluster/masklog.h > @@ -178,6 +178,15 @@ do { \ > ##__VA_ARGS__); \ > } while (0) > > +#define mlog_ratelimited(mask, fmt, ...) \ > +do { \ > + static DEFINE_RATELIMIT_STATE(_rs, \ > + DEFAULT_RATELIMIT_INTERVAL, \ > + DEFAULT_RATELIMIT_BURST); \ > + if (__ratelimit(&_rs)) \ > + mlog(mask, fmt, ##__VA_ARGS__); \ > +} while (0) > + > #define mlog_errno(st) ({ \ > int _st = (st); \ > if (_st != -ERESTARTSYS && _st != -EINTR && \ >
On 18/11/3 01:06, Wengang Wang wrote: > The write context should also be freed even when direct IO failed. > Otherwise a memory leak is introduced and entries remain in > oi->ip_unwritten_list causing the following BUG later in unlink path: > > ERROR: bug expression: !list_empty(&oi->ip_unwritten_list) > ERROR: Clear inode of 215043, inode has unwritten extents > ... > Call Trace: > ? __set_current_blocked+0x42/0x68 > ocfs2_evict_inode+0x91/0x6a0 [ocfs2] > ? bit_waitqueue+0x40/0x33 > evict+0xdb/0x1af > iput+0x1a2/0x1f7 > do_unlinkat+0x194/0x28f > SyS_unlinkat+0x1b/0x2f > do_syscall_64+0x79/0x1ae > entry_SYSCALL_64_after_hwframe+0x151/0x0 > > This patch also logs, with frequency limit, direct IO failures. > > Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> > --- > v1 -> v2: log direct IO failures > --- > fs/ocfs2/aops.c | 12 ++++++++++-- > fs/ocfs2/cluster/masklog.h | 9 +++++++++ > 2 files changed, 19 insertions(+), 2 deletions(-) > > diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c > index 302cd7c..7578bd5 100644 > --- a/fs/ocfs2/aops.c > +++ b/fs/ocfs2/aops.c > @@ -2412,8 +2412,16 @@ static int ocfs2_dio_end_io(struct kiocb *iocb, > /* this io's submitter should not have unlocked this before we could */ > BUG_ON(!ocfs2_iocb_is_rw_locked(iocb)); > > - if (bytes > 0 && private) > - ret = ocfs2_dio_end_io_write(inode, private, offset, bytes); > + if (bytes <= 0) > + mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld", > + (long long)bytes); > + if (private) { > + if (bytes > 0) > + ret = ocfs2_dio_end_io_write(inode, private, offset, > + bytes); > + else > + ocfs2_dio_free_write_ctx(inode, private); > + } > > ocfs2_iocb_clear_rw_locked(iocb); > > diff --git a/fs/ocfs2/cluster/masklog.h b/fs/ocfs2/cluster/masklog.h > index 308ea0e..a396096 100644 > --- a/fs/ocfs2/cluster/masklog.h > +++ b/fs/ocfs2/cluster/masklog.h > @@ -178,6 +178,15 @@ do { \ > ##__VA_ARGS__); \ > } while (0) > > +#define mlog_ratelimited(mask, fmt, ...) \ > +do { \ > + static DEFINE_RATELIMIT_STATE(_rs, \ > + DEFAULT_RATELIMIT_INTERVAL, \ > + DEFAULT_RATELIMIT_BURST); \ > + if (__ratelimit(&_rs)) \ > + mlog(mask, fmt, ##__VA_ARGS__); \ > +} while (0) > + > #define mlog_errno(st) ({ \ > int _st = (st); \ > if (_st != -ERESTARTSYS && _st != -EINTR && \ >
Hi AKPM, Now we have three reviewed-bys for this patch: Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Changwei Ge <ge.changwei@h3c.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Could you please merge it to mainline code? Also thanks Junxiao, Changwei and Joseph for the reviews! Thanks, Wengang On 2018/11/4 18:05, Joseph Qi wrote: > > On 18/11/3 01:06, Wengang Wang wrote: >> The write context should also be freed even when direct IO failed. >> Otherwise a memory leak is introduced and entries remain in >> oi->ip_unwritten_list causing the following BUG later in unlink path: >> >> ERROR: bug expression: !list_empty(&oi->ip_unwritten_list) >> ERROR: Clear inode of 215043, inode has unwritten extents >> ... >> Call Trace: >> ? __set_current_blocked+0x42/0x68 >> ocfs2_evict_inode+0x91/0x6a0 [ocfs2] >> ? bit_waitqueue+0x40/0x33 >> evict+0xdb/0x1af >> iput+0x1a2/0x1f7 >> do_unlinkat+0x194/0x28f >> SyS_unlinkat+0x1b/0x2f >> do_syscall_64+0x79/0x1ae >> entry_SYSCALL_64_after_hwframe+0x151/0x0 >> >> This patch also logs, with frequency limit, direct IO failures. >> >> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> > Reviewed-by: Joseph Qi <jiangqi903@gmail.com> > >> --- >> v1 -> v2: log direct IO failures >> --- >> fs/ocfs2/aops.c | 12 ++++++++++-- >> fs/ocfs2/cluster/masklog.h | 9 +++++++++ >> 2 files changed, 19 insertions(+), 2 deletions(-) >> >> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c >> index 302cd7c..7578bd5 100644 >> --- a/fs/ocfs2/aops.c >> +++ b/fs/ocfs2/aops.c >> @@ -2412,8 +2412,16 @@ static int ocfs2_dio_end_io(struct kiocb *iocb, >> /* this io's submitter should not have unlocked this before we could */ >> BUG_ON(!ocfs2_iocb_is_rw_locked(iocb)); >> >> - if (bytes > 0 && private) >> - ret = ocfs2_dio_end_io_write(inode, private, offset, bytes); >> + if (bytes <= 0) >> + mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld", >> + (long long)bytes); >> + if (private) { >> + if (bytes > 0) >> + ret = ocfs2_dio_end_io_write(inode, private, offset, >> + bytes); >> + else >> + ocfs2_dio_free_write_ctx(inode, private); >> + } >> >> ocfs2_iocb_clear_rw_locked(iocb); >> >> diff --git a/fs/ocfs2/cluster/masklog.h b/fs/ocfs2/cluster/masklog.h >> index 308ea0e..a396096 100644 >> --- a/fs/ocfs2/cluster/masklog.h >> +++ b/fs/ocfs2/cluster/masklog.h >> @@ -178,6 +178,15 @@ do { \ >> ##__VA_ARGS__); \ >> } while (0) >> >> +#define mlog_ratelimited(mask, fmt, ...) \ >> +do { \ >> + static DEFINE_RATELIMIT_STATE(_rs, \ >> + DEFAULT_RATELIMIT_INTERVAL, \ >> + DEFAULT_RATELIMIT_BURST); \ >> + if (__ratelimit(&_rs)) \ >> + mlog(mask, fmt, ##__VA_ARGS__); \ >> +} while (0) >> + >> #define mlog_errno(st) ({ \ >> int _st = (st); \ >> if (_st != -ERESTARTSYS && _st != -EINTR && \ >>
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c index 302cd7c..7578bd5 100644 --- a/fs/ocfs2/aops.c +++ b/fs/ocfs2/aops.c @@ -2412,8 +2412,16 @@ static int ocfs2_dio_end_io(struct kiocb *iocb, /* this io's submitter should not have unlocked this before we could */ BUG_ON(!ocfs2_iocb_is_rw_locked(iocb)); - if (bytes > 0 && private) - ret = ocfs2_dio_end_io_write(inode, private, offset, bytes); + if (bytes <= 0) + mlog_ratelimited(ML_ERROR, "Direct IO failed, bytes = %lld", + (long long)bytes); + if (private) { + if (bytes > 0) + ret = ocfs2_dio_end_io_write(inode, private, offset, + bytes); + else + ocfs2_dio_free_write_ctx(inode, private); + } ocfs2_iocb_clear_rw_locked(iocb); diff --git a/fs/ocfs2/cluster/masklog.h b/fs/ocfs2/cluster/masklog.h index 308ea0e..a396096 100644 --- a/fs/ocfs2/cluster/masklog.h +++ b/fs/ocfs2/cluster/masklog.h @@ -178,6 +178,15 @@ do { \ ##__VA_ARGS__); \ } while (0) +#define mlog_ratelimited(mask, fmt, ...) \ +do { \ + static DEFINE_RATELIMIT_STATE(_rs, \ + DEFAULT_RATELIMIT_INTERVAL, \ + DEFAULT_RATELIMIT_BURST); \ + if (__ratelimit(&_rs)) \ + mlog(mask, fmt, ##__VA_ARGS__); \ +} while (0) + #define mlog_errno(st) ({ \ int _st = (st); \ if (_st != -ERESTARTSYS && _st != -EINTR && \
The write context should also be freed even when direct IO failed. Otherwise a memory leak is introduced and entries remain in oi->ip_unwritten_list causing the following BUG later in unlink path: ERROR: bug expression: !list_empty(&oi->ip_unwritten_list) ERROR: Clear inode of 215043, inode has unwritten extents ... Call Trace: ? __set_current_blocked+0x42/0x68 ocfs2_evict_inode+0x91/0x6a0 [ocfs2] ? bit_waitqueue+0x40/0x33 evict+0xdb/0x1af iput+0x1a2/0x1f7 do_unlinkat+0x194/0x28f SyS_unlinkat+0x1b/0x2f do_syscall_64+0x79/0x1ae entry_SYSCALL_64_after_hwframe+0x151/0x0 This patch also logs, with frequency limit, direct IO failures. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> --- v1 -> v2: log direct IO failures --- fs/ocfs2/aops.c | 12 ++++++++++-- fs/ocfs2/cluster/masklog.h | 9 +++++++++ 2 files changed, 19 insertions(+), 2 deletions(-)