[1/8] ocfs2: alloc_dinode counts and group bitmap should be update simultaneously
diff mbox

Message ID 20140319210959.8E1C531C1E1@corp2gmr1-1.hot.corp.google.com
State New, archived
Headers show

Commit Message

Andrew Morton March 19, 2014, 9:09 p.m. UTC
From: Younger Liu <younger.liucn@gmail.com>
Subject: ocfs2: alloc_dinode counts and group bitmap should be update simultaneously

Updating alloc_dinode counts in ocfs2_alloc_dinode_update_counts() and
setting group bitmap in ocfs2_alloc_dinode_update_bitmap() have to be done
simultaneously.  Two cases are as follow:

(1) If ocfs2_alloc_dinode_update_counts() fails, there is no need to
    set group bitmap.  This case has been considered.

(2) If ocfs2_alloc_dinode_update_bitmap() fails, alloc_dinode counts
    should be rolled back.  Otherwise, some clusters would never be used. 
    This case is not considered.

So, we combine two functions, and ensure simultaneity.

Signed-off-by: Younger Liu <younger.liucn@gmail.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/ocfs2/move_extents.c |   11 --
 fs/ocfs2/ocfs2_trace.h  |    2 
 fs/ocfs2/suballoc.c     |  155 +++++++++++++++-----------------------
 fs/ocfs2/suballoc.h     |   21 +----
 4 files changed, 77 insertions(+), 112 deletions(-)

Comments

Mark Fasheh March 30, 2014, 10:44 p.m. UTC | #1
On Wed, Mar 19, 2014 at 02:09:59PM -0700, Andrew Morton wrote:
> From: Younger Liu <younger.liucn@gmail.com>
> Subject: ocfs2: alloc_dinode counts and group bitmap should be update simultaneously
> 
> Updating alloc_dinode counts in ocfs2_alloc_dinode_update_counts() and
> setting group bitmap in ocfs2_alloc_dinode_update_bitmap() have to be done
> simultaneously.  Two cases are as follow:
> 
> (1) If ocfs2_alloc_dinode_update_counts() fails, there is no need to
>     set group bitmap.  This case has been considered.
> 
> (2) If ocfs2_alloc_dinode_update_bitmap() fails, alloc_dinode counts
>     should be rolled back.  Otherwise, some clusters would never be used. 
>     This case is not considered.
> 
> So, we combine two functions, and ensure simultaneity.

By 'ocfs2_alloc_dinode_update_bitmap' above, you mean
ocfs2_block_group_set_bits() correct? I am going under that assumption.


Anyway, I believe you're trying to fix a rare case where we might leak some
space in that our chain counts might be a bit high. Basically the call to
journal_access() in ocfs2_block_group_set_bits() has to fail. To be honest,
this has existed for a while and hasn't ever been an issue.

That said, we can fix it but I don't like that your approach changes core
bitmap handling code. Can you please do this by writing a simple fallback
function to set the correct values on the bitmap dinode? Have that called in
case of error from ocfs2_block_group_set_bits(). It doesn't need to make
journal calls because that has already been set up for it so this is
literally a couple lines of code.

If you like, you can take the pattern of:

call ocfs2_alloc_dinode_update_counts()
call ocfs2_block_group_set_bits()
if (error)
   call ocfs2_rollback_alloc_dinode_counts()
   ...

And put that into a master function.

Thanks,
	--Mark


--
Mark Fasheh
Younger Liu March 31, 2014, 1:35 a.m. UTC | #2
On 2014/3/31 6:44, Mark Fasheh wrote:
> On Wed, Mar 19, 2014 at 02:09:59PM -0700, Andrew Morton wrote:
>> From: Younger Liu <younger.liucn@gmail.com>
>> Subject: ocfs2: alloc_dinode counts and group bitmap should be update simultaneously
>>
>> Updating alloc_dinode counts in ocfs2_alloc_dinode_update_counts() and
>> setting group bitmap in ocfs2_alloc_dinode_update_bitmap() have to be done
>> simultaneously.  Two cases are as follow:
>>
>> (1) If ocfs2_alloc_dinode_update_counts() fails, there is no need to
>>     set group bitmap.  This case has been considered.
>>
>> (2) If ocfs2_alloc_dinode_update_bitmap() fails, alloc_dinode counts
>>     should be rolled back.  Otherwise, some clusters would never be used. 
>>     This case is not considered.
>>
>> So, we combine two functions, and ensure simultaneity.
> 
> By 'ocfs2_alloc_dinode_update_bitmap' above, you mean
> ocfs2_block_group_set_bits() correct? I am going under that assumption.
> 
> 
> Anyway, I believe you're trying to fix a rare case where we might leak some
> space in that our chain counts might be a bit high. Basically the call to
> journal_access() in ocfs2_block_group_set_bits() has to fail. To be honest,
> this has existed for a while and hasn't ever been an issue.
> 
> That said, we can fix it but I don't like that your approach changes core
> bitmap handling code. Can you please do this by writing a simple fallback
> function to set the correct values on the bitmap dinode? Have that called in
> case of error from ocfs2_block_group_set_bits(). It doesn't need to make
> journal calls because that has already been set up for it so this is
> literally a couple lines of code.
> 
> If you like, you can take the pattern of:
> 
> call ocfs2_alloc_dinode_update_counts()
> call ocfs2_block_group_set_bits()
> if (error)
>    call ocfs2_rollback_alloc_dinode_counts()
>    ...
> 
> And put that into a master function.
> 
Hi Mark,
  Thanks for your review.
  It is not a good idea to change core code. I will resend a patch in a
monent.
     Younger
> Thanks,
> 	--Mark
> 
> 
> --
> Mark Fasheh
> 
>

Patch
diff mbox

diff -puN fs/ocfs2/move_extents.c~ocfs2-alloc_dinode-counts-and-group-bitmap-should-be-update-simultaneously fs/ocfs2/move_extents.c
--- a/fs/ocfs2/move_extents.c~ocfs2-alloc_dinode-counts-and-group-bitmap-should-be-update-simultaneously
+++ a/fs/ocfs2/move_extents.c
@@ -681,18 +681,15 @@  static int ocfs2_move_extent(struct ocfs
 	}
 
 	gd = (struct ocfs2_group_desc *)gd_bh->b_data;
-	ret = ocfs2_alloc_dinode_update_counts(gb_inode, handle, gb_bh, len,
-					       le16_to_cpu(gd->bg_chain));
+	ret = ocfs2_alloc_dinode_update_bitmap(handle,
+				gb_inode, gb_bh, gd, gb_bh,
+				le16_to_cpu(gd->bg_chain),
+				goal_bit, len);
 	if (ret) {
 		mlog_errno(ret);
 		goto out_commit;
 	}
 
-	ret = ocfs2_block_group_set_bits(handle, gb_inode, gd, gd_bh,
-					 goal_bit, len);
-	if (ret)
-		mlog_errno(ret);
-
 	/*
 	 * Here we should write the new page out first if we are
 	 * in write-back mode.
diff -puN fs/ocfs2/ocfs2_trace.h~ocfs2-alloc_dinode-counts-and-group-bitmap-should-be-update-simultaneously fs/ocfs2/ocfs2_trace.h
--- a/fs/ocfs2/ocfs2_trace.h~ocfs2-alloc_dinode-counts-and-group-bitmap-should-be-update-simultaneously
+++ a/fs/ocfs2/ocfs2_trace.h
@@ -788,7 +788,7 @@  DEFINE_OCFS2_UINT_UINT_UINT_EVENT(ocfs2_
 
 DEFINE_OCFS2_ULL_EVENT(ocfs2_reserve_new_inode_new_group);
 
-DEFINE_OCFS2_UINT_UINT_EVENT(ocfs2_block_group_set_bits);
+DEFINE_OCFS2_UINT_UINT_EVENT(ocfs2_alloc_dinode_update_bitmap);
 
 TRACE_EVENT(ocfs2_relink_block_group,
 	TP_PROTO(unsigned long long i_blkno, unsigned int chain,
diff -puN fs/ocfs2/suballoc.c~ocfs2-alloc_dinode-counts-and-group-bitmap-should-be-update-simultaneously fs/ocfs2/suballoc.c
--- a/fs/ocfs2/suballoc.c~ocfs2-alloc_dinode-counts-and-group-bitmap-should-be-update-simultaneously
+++ a/fs/ocfs2/suballoc.c
@@ -1337,54 +1337,6 @@  static int ocfs2_block_group_find_clear_
 	return status;
 }
 
-int ocfs2_block_group_set_bits(handle_t *handle,
-					     struct inode *alloc_inode,
-					     struct ocfs2_group_desc *bg,
-					     struct buffer_head *group_bh,
-					     unsigned int bit_off,
-					     unsigned int num_bits)
-{
-	int status;
-	void *bitmap = bg->bg_bitmap;
-	int journal_type = OCFS2_JOURNAL_ACCESS_WRITE;
-
-	/* All callers get the descriptor via
-	 * ocfs2_read_group_descriptor().  Any corruption is a code bug. */
-	BUG_ON(!OCFS2_IS_VALID_GROUP_DESC(bg));
-	BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits);
-
-	trace_ocfs2_block_group_set_bits(bit_off, num_bits);
-
-	if (ocfs2_is_cluster_bitmap(alloc_inode))
-		journal_type = OCFS2_JOURNAL_ACCESS_UNDO;
-
-	status = ocfs2_journal_access_gd(handle,
-					 INODE_CACHE(alloc_inode),
-					 group_bh,
-					 journal_type);
-	if (status < 0) {
-		mlog_errno(status);
-		goto bail;
-	}
-
-	le16_add_cpu(&bg->bg_free_bits_count, -num_bits);
-	if (le16_to_cpu(bg->bg_free_bits_count) > le16_to_cpu(bg->bg_bits)) {
-		ocfs2_error(alloc_inode->i_sb, "Group descriptor # %llu has bit"
-			    " count %u but claims %u are freed. num_bits %d",
-			    (unsigned long long)le64_to_cpu(bg->bg_blkno),
-			    le16_to_cpu(bg->bg_bits),
-			    le16_to_cpu(bg->bg_free_bits_count), num_bits);
-		return -EROFS;
-	}
-	while(num_bits--)
-		ocfs2_set_bit(bit_off++, bitmap);
-
-	ocfs2_journal_dirty(handle, group_bh);
-
-bail:
-	return status;
-}
-
 /* find the one with the most empty bits */
 static inline u16 ocfs2_find_victim_chain(struct ocfs2_chain_list *cl)
 {
@@ -1580,31 +1532,78 @@  static int ocfs2_block_group_search(stru
 	return ret;
 }
 
-int ocfs2_alloc_dinode_update_counts(struct inode *inode,
-				       handle_t *handle,
-				       struct buffer_head *di_bh,
-				       u32 num_bits,
-				       u16 chain)
+int ocfs2_alloc_dinode_update_bitmap(handle_t *handle,
+				struct inode *alloc_inode,
+				struct buffer_head *di_bh,
+				struct ocfs2_group_desc *bg,
+				struct buffer_head *group_bh,
+				u16 chain, u32 bit_off, u32 num_bits)
 {
 	int ret;
 	u32 tmp_used;
 	struct ocfs2_dinode *di = (struct ocfs2_dinode *) di_bh->b_data;
 	struct ocfs2_chain_list *cl = (struct ocfs2_chain_list *) &di->id2.i_chain;
+	void *bitmap = bg->bg_bitmap;
+	int journal_type = OCFS2_JOURNAL_ACCESS_WRITE;
 
-	ret = ocfs2_journal_access_di(handle, INODE_CACHE(inode), di_bh,
-				      OCFS2_JOURNAL_ACCESS_WRITE);
+	/*
+	 * All callers get the descriptor via
+	 * ocfs2_read_group_descriptor().  Any corruption is a code bug.
+	 */
+	BUG_ON(!OCFS2_IS_VALID_GROUP_DESC(bg));
+	BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits);
+
+	trace_ocfs2_alloc_dinode_update_bitmap(bit_off, num_bits);
+
+	ret = ocfs2_journal_access_di(handle,
+			INODE_CACHE(alloc_inode), di_bh, journal_type);
 	if (ret < 0) {
 		mlog_errno(ret);
 		goto out;
 	}
 
+	if (ocfs2_is_cluster_bitmap(alloc_inode))
+		journal_type = OCFS2_JOURNAL_ACCESS_UNDO;
+
+	ret = ocfs2_journal_access_gd(handle,
+			INODE_CACHE(alloc_inode), group_bh, journal_type);
+	if (ret < 0) {
+		mlog_errno(ret);
+		goto out;
+	}
+
+	/* update alloc_dinode counts */
 	tmp_used = le32_to_cpu(di->id1.bitmap1.i_used);
 	di->id1.bitmap1.i_used = cpu_to_le32(num_bits + tmp_used);
 	le32_add_cpu(&cl->cl_recs[chain].c_free, -num_bits);
+
+	/* update bg counts and bitmap*/
+	le16_add_cpu(&bg->bg_free_bits_count, -num_bits);
+	if (le16_to_cpu(bg->bg_free_bits_count) > le16_to_cpu(bg->bg_bits)) {
+		ocfs2_error(alloc_inode->i_sb, "Group descriptor # %llu has bit"
+			" count %u but claims %u are freed. num_bits %d",
+			(unsigned long long)le64_to_cpu(bg->bg_blkno),
+			le16_to_cpu(bg->bg_bits),
+			le16_to_cpu(bg->bg_free_bits_count), num_bits);
+		ret = -EROFS;
+		goto out_rollback;
+	}
+	while (num_bits--)
+		ocfs2_set_bit(bit_off++, bitmap);
+
 	ocfs2_journal_dirty(handle, di_bh);
+	ocfs2_journal_dirty(handle, group_bh);
 
 out:
 	return ret;
+
+out_rollback:
+	le16_add_cpu(&bg->bg_free_bits_count, num_bits);
+
+	di->id1.bitmap1.i_used = cpu_to_le32(tmp_used - num_bits);
+	le32_add_cpu(&cl->cl_recs[chain].c_free, num_bits);
+
+	return ret;
 }
 
 static int ocfs2_bg_discontig_fix_by_rec(struct ocfs2_suballoc_result *res,
@@ -1697,19 +1696,15 @@  static int ocfs2_search_one_group(struct
 	if (ac->ac_find_loc_only)
 		goto out_loc_only;
 
-	ret = ocfs2_alloc_dinode_update_counts(alloc_inode, handle, ac->ac_bh,
-					       res->sr_bits,
-					       le16_to_cpu(gd->bg_chain));
+	ret = ocfs2_alloc_dinode_update_bitmap(handle,
+			alloc_inode, ac->ac_bh, gd, group_bh,
+			le16_to_cpu(gd->bg_chain),
+			res->sr_bit_offset, res->sr_bits);
 	if (ret < 0) {
 		mlog_errno(ret);
 		goto out;
 	}
 
-	ret = ocfs2_block_group_set_bits(handle, alloc_inode, gd, group_bh,
-					 res->sr_bit_offset, res->sr_bits);
-	if (ret < 0)
-		mlog_errno(ret);
-
 out_loc_only:
 	*bits_left = le16_to_cpu(gd->bg_free_bits_count);
 
@@ -1823,20 +1818,9 @@  static int ocfs2_search_chain(struct ocf
 	if (ac->ac_find_loc_only)
 		goto out_loc_only;
 
-	status = ocfs2_alloc_dinode_update_counts(alloc_inode, handle,
-						  ac->ac_bh, res->sr_bits,
-						  chain);
-	if (status) {
-		mlog_errno(status);
-		goto bail;
-	}
-
-	status = ocfs2_block_group_set_bits(handle,
-					    alloc_inode,
-					    bg,
-					    group_bh,
-					    res->sr_bit_offset,
-					    res->sr_bits);
+	status = ocfs2_alloc_dinode_update_bitmap(handle,
+			alloc_inode, ac->ac_bh, bg, group_bh,
+			chain, res->sr_bit_offset, res->sr_bits);
 	if (status < 0) {
 		mlog_errno(status);
 		goto bail;
@@ -2134,20 +2118,9 @@  int ocfs2_claim_new_inode_at_loc(handle_
 	bg = (struct ocfs2_group_desc *) bg_bh->b_data;
 	chain = le16_to_cpu(bg->bg_chain);
 
-	ret = ocfs2_alloc_dinode_update_counts(ac->ac_inode, handle,
-					       ac->ac_bh, res->sr_bits,
-					       chain);
-	if (ret) {
-		mlog_errno(ret);
-		goto out;
-	}
-
-	ret = ocfs2_block_group_set_bits(handle,
-					 ac->ac_inode,
-					 bg,
-					 bg_bh,
-					 res->sr_bit_offset,
-					 res->sr_bits);
+	ret = ocfs2_alloc_dinode_update_bitmap(handle,
+			ac->ac_inode, ac->ac_bh, bg, bg_bh,
+			chain, res->sr_bit_offset, res->sr_bits);
 	if (ret < 0) {
 		mlog_errno(ret);
 		goto out;
diff -puN fs/ocfs2/suballoc.h~ocfs2-alloc_dinode-counts-and-group-bitmap-should-be-update-simultaneously fs/ocfs2/suballoc.h
--- a/fs/ocfs2/suballoc.h~ocfs2-alloc_dinode-counts-and-group-bitmap-should-be-update-simultaneously
+++ a/fs/ocfs2/suballoc.h
@@ -85,19 +85,14 @@  int ocfs2_reserve_new_inode(struct ocfs2
 int ocfs2_reserve_clusters(struct ocfs2_super *osb,
 			   u32 bits_wanted,
 			   struct ocfs2_alloc_context **ac);
-
-int ocfs2_alloc_dinode_update_counts(struct inode *inode,
-			 handle_t *handle,
-			 struct buffer_head *di_bh,
-			 u32 num_bits,
-			 u16 chain);
-int ocfs2_block_group_set_bits(handle_t *handle,
-			 struct inode *alloc_inode,
-			 struct ocfs2_group_desc *bg,
-			 struct buffer_head *group_bh,
-			 unsigned int bit_off,
-			 unsigned int num_bits);
-
+int ocfs2_alloc_dinode_update_bitmap(handle_t *handle,
+				struct inode *alloc_inode,
+				struct buffer_head *di_bh,
+				struct ocfs2_group_desc *bg,
+				struct buffer_head *group_bh,
+				u16 chain,
+				u32 bit_off,
+				u32 num_bits);
 int ocfs2_claim_metadata(handle_t *handle,
 			 struct ocfs2_alloc_context *ac,
 			 u32 bits_wanted,