[v2] ocfs2: do not BUG if jbd2_journal_dirty_metadata fails
diff mbox

Message ID 55515D2B.8010202@huawei.com
State New
Headers show

Commit Message

Joseph Qi May 12, 2015, 1:53 a.m. UTC
jbd2_journal_dirty_metadata may fail. Currently it cannot take care of
non zero return value and just BUG in ocfs2_journal_dirty.
This patch is aborting the handle and journal instead of BUG.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: joyce.xue <xuejiufei@huawei.com>
---
 fs/ocfs2/journal.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

Comments

Andrew Morton May 12, 2015, 11:03 p.m. UTC | #1
On Tue, 12 May 2015 09:53:47 +0800 Joseph Qi <joseph.qi@huawei.com> wrote:

> jbd2_journal_dirty_metadata may fail. Currently it cannot take care of
> non zero return value and just BUG in ocfs2_journal_dirty.
> This patch is aborting the handle and journal instead of BUG.
> 

This patch is internally inconsistent.

> --- a/fs/ocfs2/journal.c
> +++ b/fs/ocfs2/journal.c
> @@ -775,7 +775,17 @@ void ocfs2_journal_dirty(handle_t *handle, struct buffer_head *bh)

The "-775,7 +775,17" means "the 7 lines at line 775 get turned into 17
lines".  ie: 10 lines are added.

>  	trace_ocfs2_journal_dirty((unsigned long long)bh->b_blocknr);
> 
>  	status = jbd2_journal_dirty_metadata(handle, bh);
> -	BUG_ON(status);
> +	if (status) {
> +		mlog_errno(status);
> +		if (!is_handle_aborted(handle)) {
> +			journal_t *journal = handle->h_transaction->t_journal;
> +
> +			mlog(ML_ERROR, "jbd2_journal_dirty_metadata failed. "
> +					"Aborting transaction and journal.");
> +			handle->h_err = status;
> +			jbd2_journal_abort_handle(handle);
> +			jbd2_journal_abort(journal, status);
> +		}
> +	}
>  }

But the patch deletes 1 line and adds 12, for a net addition of 11 lines,
not 10.

So the patch doesn't apply.  Changing the "17" to "18" fixes that.


Whatever tool you used for generating this patch needs slapping.
Junxiao Bi May 20, 2015, 8:22 a.m. UTC | #2
On 05/12/2015 09:53 AM, Joseph Qi wrote:
> jbd2_journal_dirty_metadata may fail. Currently it cannot take care of
> non zero return value and just BUG in ocfs2_journal_dirty.
> This patch is aborting the handle and journal instead of BUG.
> 
> Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
> Cc: joyce.xue <xuejiufei@huawei.com>
> ---
>  fs/ocfs2/journal.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
> index ff53192..eefca1e 100644
> --- a/fs/ocfs2/journal.c
> +++ b/fs/ocfs2/journal.c
> @@ -775,7 +775,17 @@ void ocfs2_journal_dirty(handle_t *handle, struct buffer_head *bh)
>  	trace_ocfs2_journal_dirty((unsigned long long)bh->b_blocknr);
> 
>  	status = jbd2_journal_dirty_metadata(handle, bh);
> -	BUG_ON(status);
> +	if (status) {
> +		mlog_errno(status);
> +		if (!is_handle_aborted(handle)) {
> +			journal_t *journal = handle->h_transaction->t_journal;
> +
> +			mlog(ML_ERROR, "jbd2_journal_dirty_metadata failed. "
> +					"Aborting transaction and journal.");
> +			handle->h_err = status;
> +			jbd2_journal_abort_handle(handle);
> +			jbd2_journal_abort(journal, status);
Let fs go after journal lose affect seemed not safe, may we set fs
read-only here?

Thanks,
Junxiao.


> +		}
> +	}
>  }
> 
>  #define OCFS2_DEFAULT_COMMIT_INTERVAL	(HZ * JBD2_DEFAULT_MAX_COMMIT_AGE)
>
Joseph Qi May 20, 2015, 10:25 a.m. UTC | #3
On 2015/5/20 16:22, Junxiao Bi wrote:
> On 05/12/2015 09:53 AM, Joseph Qi wrote:
>> jbd2_journal_dirty_metadata may fail. Currently it cannot take care of
>> non zero return value and just BUG in ocfs2_journal_dirty.
>> This patch is aborting the handle and journal instead of BUG.
>>
>> Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
>> Cc: joyce.xue <xuejiufei@huawei.com>
>> ---
>>  fs/ocfs2/journal.c | 12 +++++++++++-
>>  1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
>> index ff53192..eefca1e 100644
>> --- a/fs/ocfs2/journal.c
>> +++ b/fs/ocfs2/journal.c
>> @@ -775,7 +775,17 @@ void ocfs2_journal_dirty(handle_t *handle, struct buffer_head *bh)
>>  	trace_ocfs2_journal_dirty((unsigned long long)bh->b_blocknr);
>>
>>  	status = jbd2_journal_dirty_metadata(handle, bh);
>> -	BUG_ON(status);
>> +	if (status) {
>> +		mlog_errno(status);
>> +		if (!is_handle_aborted(handle)) {
>> +			journal_t *journal = handle->h_transaction->t_journal;
>> +
>> +			mlog(ML_ERROR, "jbd2_journal_dirty_metadata failed. "
>> +					"Aborting transaction and journal.");
>> +			handle->h_err = status;
>> +			jbd2_journal_abort_handle(handle);
>> +			jbd2_journal_abort(journal, status);
> Let fs go after journal lose affect seemed not safe, may we set fs
> read-only here?
> 
> Thanks,
> Junxiao.
> 
> 
Do you mean journal can still be updated even if it is aborted?

>> +		}
>> +	}
>>  }
>>
>>  #define OCFS2_DEFAULT_COMMIT_INTERVAL	(HZ * JBD2_DEFAULT_MAX_COMMIT_AGE)
>>
> 
> 
>
Junxiao Bi May 20, 2015, 11:09 a.m. UTC | #4
On 05/20/2015 06:25 PM, Joseph Qi wrote:
> On 2015/5/20 16:22, Junxiao Bi wrote:
>> On 05/12/2015 09:53 AM, Joseph Qi wrote:
>>> jbd2_journal_dirty_metadata may fail. Currently it cannot take care of
>>> non zero return value and just BUG in ocfs2_journal_dirty.
>>> This patch is aborting the handle and journal instead of BUG.
>>>
>>> Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
>>> Cc: joyce.xue <xuejiufei@huawei.com>
>>> ---
>>>  fs/ocfs2/journal.c | 12 +++++++++++-
>>>  1 file changed, 11 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
>>> index ff53192..eefca1e 100644
>>> --- a/fs/ocfs2/journal.c
>>> +++ b/fs/ocfs2/journal.c
>>> @@ -775,7 +775,17 @@ void ocfs2_journal_dirty(handle_t *handle, struct buffer_head *bh)
>>>  	trace_ocfs2_journal_dirty((unsigned long long)bh->b_blocknr);
>>>
>>>  	status = jbd2_journal_dirty_metadata(handle, bh);
>>> -	BUG_ON(status);
>>> +	if (status) {
>>> +		mlog_errno(status);
>>> +		if (!is_handle_aborted(handle)) {
>>> +			journal_t *journal = handle->h_transaction->t_journal;
>>> +
>>> +			mlog(ML_ERROR, "jbd2_journal_dirty_metadata failed. "
>>> +					"Aborting transaction and journal.");
>>> +			handle->h_err = status;
>>> +			jbd2_journal_abort_handle(handle);
>>> +			jbd2_journal_abort(journal, status);
>> Let fs go after journal lose affect seemed not safe, may we set fs
>> read-only here?
>>
>> Thanks,
>> Junxiao.
>>
>>
> Do you mean journal can still be updated even if it is aborted?
No, journal will not be updated. After abort, journal api will return
error, at last fs will be set read-only, but there is also api which
didn't return error like ocfs2_journal_dirty(), so why not stop at the
first time?

Thanks,
Junxiao.
> 
>>> +		}
>>> +	}
>>>  }
>>>
>>>  #define OCFS2_DEFAULT_COMMIT_INTERVAL	(HZ * JBD2_DEFAULT_MAX_COMMIT_AGE)
>>>
>>
>>
>>
> 
>
Joseph Qi May 21, 2015, 12:49 a.m. UTC | #5
On 2015/5/20 19:09, Junxiao Bi wrote:
> On 05/20/2015 06:25 PM, Joseph Qi wrote:
>> On 2015/5/20 16:22, Junxiao Bi wrote:
>>> On 05/12/2015 09:53 AM, Joseph Qi wrote:
>>>> jbd2_journal_dirty_metadata may fail. Currently it cannot take care of
>>>> non zero return value and just BUG in ocfs2_journal_dirty.
>>>> This patch is aborting the handle and journal instead of BUG.
>>>>
>>>> Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
>>>> Cc: joyce.xue <xuejiufei@huawei.com>
>>>> ---
>>>>  fs/ocfs2/journal.c | 12 +++++++++++-
>>>>  1 file changed, 11 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
>>>> index ff53192..eefca1e 100644
>>>> --- a/fs/ocfs2/journal.c
>>>> +++ b/fs/ocfs2/journal.c
>>>> @@ -775,7 +775,17 @@ void ocfs2_journal_dirty(handle_t *handle, struct buffer_head *bh)
>>>>  	trace_ocfs2_journal_dirty((unsigned long long)bh->b_blocknr);
>>>>
>>>>  	status = jbd2_journal_dirty_metadata(handle, bh);
>>>> -	BUG_ON(status);
>>>> +	if (status) {
>>>> +		mlog_errno(status);
>>>> +		if (!is_handle_aborted(handle)) {
>>>> +			journal_t *journal = handle->h_transaction->t_journal;
>>>> +
>>>> +			mlog(ML_ERROR, "jbd2_journal_dirty_metadata failed. "
>>>> +					"Aborting transaction and journal.");
>>>> +			handle->h_err = status;
>>>> +			jbd2_journal_abort_handle(handle);
>>>> +			jbd2_journal_abort(journal, status);
>>> Let fs go after journal lose affect seemed not safe, may we set fs
>>> read-only here?
>>>
>>> Thanks,
>>> Junxiao.
>>>
>>>
>> Do you mean journal can still be updated even if it is aborted?
> No, journal will not be updated. After abort, journal api will return
> error, at last fs will be set read-only, but there is also api which
> didn't return error like ocfs2_journal_dirty(), so why not stop at the
> first time?
> 
> Thanks,
> Junxiao.
Agree with you.
But here bh can be anything like di, gd, ... As Joyce Xue reported, the
problem also exists in ocfs2_abort_trigger. So I don't think we can get
sb we needed.
Is there any other way except changing prototype of ocfs2_journal_dirty?

>>
>>>> +		}
>>>> +	}
>>>>  }
>>>>
>>>>  #define OCFS2_DEFAULT_COMMIT_INTERVAL	(HZ * JBD2_DEFAULT_MAX_COMMIT_AGE)
>>>>
>>>
>>>
>>>
>>
>>
> 
> 
> .
>
Junxiao Bi May 21, 2015, 2:09 a.m. UTC | #6
On 05/21/2015 08:49 AM, Joseph Qi wrote:
> On 2015/5/20 19:09, Junxiao Bi wrote:
>> On 05/20/2015 06:25 PM, Joseph Qi wrote:
>>> On 2015/5/20 16:22, Junxiao Bi wrote:
>>>> On 05/12/2015 09:53 AM, Joseph Qi wrote:
>>>>> jbd2_journal_dirty_metadata may fail. Currently it cannot take care of
>>>>> non zero return value and just BUG in ocfs2_journal_dirty.
>>>>> This patch is aborting the handle and journal instead of BUG.
>>>>>
>>>>> Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
>>>>> Cc: joyce.xue <xuejiufei@huawei.com>
>>>>> ---
>>>>>  fs/ocfs2/journal.c | 12 +++++++++++-
>>>>>  1 file changed, 11 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
>>>>> index ff53192..eefca1e 100644
>>>>> --- a/fs/ocfs2/journal.c
>>>>> +++ b/fs/ocfs2/journal.c
>>>>> @@ -775,7 +775,17 @@ void ocfs2_journal_dirty(handle_t *handle, struct buffer_head *bh)
>>>>>  	trace_ocfs2_journal_dirty((unsigned long long)bh->b_blocknr);
>>>>>
>>>>>  	status = jbd2_journal_dirty_metadata(handle, bh);
>>>>> -	BUG_ON(status);
>>>>> +	if (status) {
>>>>> +		mlog_errno(status);
>>>>> +		if (!is_handle_aborted(handle)) {
>>>>> +			journal_t *journal = handle->h_transaction->t_journal;
>>>>> +
>>>>> +			mlog(ML_ERROR, "jbd2_journal_dirty_metadata failed. "
>>>>> +					"Aborting transaction and journal.");
>>>>> +			handle->h_err = status;
>>>>> +			jbd2_journal_abort_handle(handle);
>>>>> +			jbd2_journal_abort(journal, status);
>>>> Let fs go after journal lose affect seemed not safe, may we set fs
>>>> read-only here?
>>>>
>>>> Thanks,
>>>> Junxiao.
>>>>
>>>>
>>> Do you mean journal can still be updated even if it is aborted?
>> No, journal will not be updated. After abort, journal api will return
>> error, at last fs will be set read-only, but there is also api which
>> didn't return error like ocfs2_journal_dirty(), so why not stop at the
>> first time?
>>
>> Thanks,
>> Junxiao.
> Agree with you.
> But here bh can be anything like di, gd, ... As Joyce Xue reported, the
> problem also exists in ocfs2_abort_trigger. So I don't think we can get
> sb we needed.
Yes, right.

> Is there any other way except changing prototype of ocfs2_journal_dirty?
I can't see how to do that. From me, if jbd2_journal_dirty_metadata()
return an error, there seemed a bug, BUG_ON for a bug seemed acceptable
since not easy to set fs read-only from here.
Did you ever watch this BUG_ON triggered not due to a bug?

Thanks,
Junxiao.


> 
>>>
>>>>> +		}
>>>>> +	}
>>>>>  }
>>>>>
>>>>>  #define OCFS2_DEFAULT_COMMIT_INTERVAL	(HZ * JBD2_DEFAULT_MAX_COMMIT_AGE)
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>> .
>>
> 
>
Joseph Qi May 21, 2015, 3:54 a.m. UTC | #7
On 2015/5/21 10:09, Junxiao Bi wrote:
> On 05/21/2015 08:49 AM, Joseph Qi wrote:
>> On 2015/5/20 19:09, Junxiao Bi wrote:
>>> On 05/20/2015 06:25 PM, Joseph Qi wrote:
>>>> On 2015/5/20 16:22, Junxiao Bi wrote:
>>>>> On 05/12/2015 09:53 AM, Joseph Qi wrote:
>>>>>> jbd2_journal_dirty_metadata may fail. Currently it cannot take care of
>>>>>> non zero return value and just BUG in ocfs2_journal_dirty.
>>>>>> This patch is aborting the handle and journal instead of BUG.
>>>>>>
>>>>>> Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
>>>>>> Cc: joyce.xue <xuejiufei@huawei.com>
>>>>>> ---
>>>>>>  fs/ocfs2/journal.c | 12 +++++++++++-
>>>>>>  1 file changed, 11 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
>>>>>> index ff53192..eefca1e 100644
>>>>>> --- a/fs/ocfs2/journal.c
>>>>>> +++ b/fs/ocfs2/journal.c
>>>>>> @@ -775,7 +775,17 @@ void ocfs2_journal_dirty(handle_t *handle, struct buffer_head *bh)
>>>>>>  	trace_ocfs2_journal_dirty((unsigned long long)bh->b_blocknr);
>>>>>>
>>>>>>  	status = jbd2_journal_dirty_metadata(handle, bh);
>>>>>> -	BUG_ON(status);
>>>>>> +	if (status) {
>>>>>> +		mlog_errno(status);
>>>>>> +		if (!is_handle_aborted(handle)) {
>>>>>> +			journal_t *journal = handle->h_transaction->t_journal;
>>>>>> +
>>>>>> +			mlog(ML_ERROR, "jbd2_journal_dirty_metadata failed. "
>>>>>> +					"Aborting transaction and journal.");
>>>>>> +			handle->h_err = status;
>>>>>> +			jbd2_journal_abort_handle(handle);
>>>>>> +			jbd2_journal_abort(journal, status);
>>>>> Let fs go after journal lose affect seemed not safe, may we set fs
>>>>> read-only here?
>>>>>
>>>>> Thanks,
>>>>> Junxiao.
>>>>>
>>>>>
>>>> Do you mean journal can still be updated even if it is aborted?
>>> No, journal will not be updated. After abort, journal api will return
>>> error, at last fs will be set read-only, but there is also api which
>>> didn't return error like ocfs2_journal_dirty(), so why not stop at the
>>> first time?
>>>
>>> Thanks,
>>> Junxiao.
>> Agree with you.
>> But here bh can be anything like di, gd, ... As Joyce Xue reported, the
>> problem also exists in ocfs2_abort_trigger. So I don't think we can get
>> sb we needed.
> Yes, right.
> 
>> Is there any other way except changing prototype of ocfs2_journal_dirty?
> I can't see how to do that. From me, if jbd2_journal_dirty_metadata()
> return an error, there seemed a bug, BUG_ON for a bug seemed acceptable
> since not easy to set fs read-only from here.
> Did you ever watch this BUG_ON triggered not due to a bug?
> 
Yes, one case is jbd2_journal_dirty_metadata returns -EROFS.
Another is it returns -EINVAL once transaction has changed. ocfs2 will
extends transaction once credits not enough. But I think ocfs2 should
take care of this case when calling jbd2_journal_restart.

> Thanks,
> Junxiao.
> 
> 
>>
>>>>
>>>>>> +		}
>>>>>> +	}
>>>>>>  }
>>>>>>
>>>>>>  #define OCFS2_DEFAULT_COMMIT_INTERVAL	(HZ * JBD2_DEFAULT_MAX_COMMIT_AGE)
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>> .
>>>
>>
>>
> 
> 
> .
>

Patch
diff mbox

diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index ff53192..eefca1e 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -775,7 +775,17 @@  void ocfs2_journal_dirty(handle_t *handle, struct buffer_head *bh)
 	trace_ocfs2_journal_dirty((unsigned long long)bh->b_blocknr);

 	status = jbd2_journal_dirty_metadata(handle, bh);
-	BUG_ON(status);
+	if (status) {
+		mlog_errno(status);
+		if (!is_handle_aborted(handle)) {
+			journal_t *journal = handle->h_transaction->t_journal;
+
+			mlog(ML_ERROR, "jbd2_journal_dirty_metadata failed. "
+					"Aborting transaction and journal.");
+			handle->h_err = status;
+			jbd2_journal_abort_handle(handle);
+			jbd2_journal_abort(journal, status);
+		}
+	}
 }

 #define OCFS2_DEFAULT_COMMIT_INTERVAL	(HZ * JBD2_DEFAULT_MAX_COMMIT_AGE)