diff mbox

[3/7] ext2: Avoid DAX zeroing to corrupt data

Message ID 20160516152206.GC21714@quack2.suse.cz (mailing list archive)
State New, archived
Headers show

Commit Message

Jan Kara May 16, 2016, 3:22 p.m. UTC
On Thu 12-05-16 12:45:22, Ross Zwisler wrote:
> On Wed, May 11, 2016 at 11:58:49AM +0200, Jan Kara wrote:
> > Currently ext2 zeroes any data blocks allocated for DAX inode however it
> > still returns them as BH_New. Thus DAX code zeroes them again in
> > dax_insert_mapping() which can possibly overwrite the data that has been
> > already stored to those blocks by a racing dax_io(). Avoid marking
> > pre-zeroed buffers as new.
> > 
> > Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/ext2/inode.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
> > index 6bd58e6ff038..1f07b758b968 100644
> > --- a/fs/ext2/inode.c
> > +++ b/fs/ext2/inode.c
> > @@ -745,11 +745,11 @@ static int ext2_get_blocks(struct inode *inode,
> >  			mutex_unlock(&ei->truncate_mutex);
> >  			goto cleanup;
> >  		}
> > -	}
> > +	} else
> > +		set_buffer_new(bh_result);
> >  
> >  	ext2_splice_branch(inode, iblock, partial, indirect_blks, count);
> >  	mutex_unlock(&ei->truncate_mutex);
> > -	set_buffer_new(bh_result);
> >  got_it:
> >  	map_bh(bh_result, inode->i_sb, le32_to_cpu(chain[depth-1].key));
> >  	if (count > blocks_to_boundary)
> > -- 
> > 2.6.6
> 
> Interestingly this change is causing a bunch of xfstests regressions for me
> with ext2 + DAX.  All of these tests pass without this one change.

Good catch. Attached patch fixes this issue for me. Preferably it should be
merged before the above ext2 change.

								Honza

Comments

Verma, Vishal L May 17, 2016, 6:52 a.m. UTC | #1
T24gTW9uLCAyMDE2LTA1LTE2IGF0IDE3OjIyICswMjAwLCBKYW4gS2FyYSB3cm90ZToNCj4gT24g
VGh1IDEyLTA1LTE2IDEyOjQ1OjIyLCBSb3NzIFp3aXNsZXIgd3JvdGU6DQo+ID4gDQo+ID4gT24g
V2VkLCBNYXkgMTEsIDIwMTYgYXQgMTE6NTg6NDlBTSArMDIwMCwgSmFuIEthcmEgd3JvdGU6DQo+
ID4gPiANCj4gPiA+IEN1cnJlbnRseSBleHQyIHplcm9lcyBhbnkgZGF0YSBibG9ja3MgYWxsb2Nh
dGVkIGZvciBEQVggaW5vZGUNCj4gPiA+IGhvd2V2ZXIgaXQNCj4gPiA+IHN0aWxsIHJldHVybnMg
dGhlbSBhcyBCSF9OZXcuIFRodXMgREFYIGNvZGUgemVyb2VzIHRoZW0gYWdhaW4gaW4NCj4gPiA+
IGRheF9pbnNlcnRfbWFwcGluZygpIHdoaWNoIGNhbiBwb3NzaWJseSBvdmVyd3JpdGUgdGhlIGRh
dGEgdGhhdA0KPiA+ID4gaGFzIGJlZW4NCj4gPiA+IGFscmVhZHkgc3RvcmVkIHRvIHRob3NlIGJs
b2NrcyBieSBhIHJhY2luZyBkYXhfaW8oKS4gQXZvaWQNCj4gPiA+IG1hcmtpbmcNCj4gPiA+IHBy
ZS16ZXJvZWQgYnVmZmVycyBhcyBuZXcuDQo+ID4gPiANCj4gPiA+IFJldmlld2VkLWJ5OiBSb3Nz
IFp3aXNsZXIgPHJvc3Muendpc2xlckBsaW51eC5pbnRlbC5jb20+DQo+ID4gPiBTaWduZWQtb2Zm
LWJ5OiBKYW4gS2FyYSA8amFja0BzdXNlLmN6Pg0KPiA+ID4gLS0tDQo+ID4gPiDCoGZzL2V4dDIv
aW5vZGUuYyB8IDQgKystLQ0KPiA+ID4gwqAxIGZpbGUgY2hhbmdlZCwgMiBpbnNlcnRpb25zKCsp
LCAyIGRlbGV0aW9ucygtKQ0KPiA+ID4gDQo+ID4gPiBkaWZmIC0tZ2l0IGEvZnMvZXh0Mi9pbm9k
ZS5jIGIvZnMvZXh0Mi9pbm9kZS5jDQo+ID4gPiBpbmRleCA2YmQ1OGU2ZmYwMzguLjFmMDdiNzU4
Yjk2OCAxMDA2NDQNCj4gPiA+IC0tLSBhL2ZzL2V4dDIvaW5vZGUuYw0KPiA+ID4gKysrIGIvZnMv
ZXh0Mi9pbm9kZS5jDQo+ID4gPiBAQCAtNzQ1LDExICs3NDUsMTEgQEAgc3RhdGljIGludCBleHQy
X2dldF9ibG9ja3Moc3RydWN0IGlub2RlDQo+ID4gPiAqaW5vZGUsDQo+ID4gPiDCoAkJCW11dGV4
X3VubG9jaygmZWktPnRydW5jYXRlX211dGV4KTsNCj4gPiA+IMKgCQkJZ290byBjbGVhbnVwOw0K
PiA+ID4gwqAJCX0NCj4gPiA+IC0JfQ0KPiA+ID4gKwl9IGVsc2UNCj4gPiA+ICsJCXNldF9idWZm
ZXJfbmV3KGJoX3Jlc3VsdCk7DQo+ID4gPiDCoA0KPiA+ID4gwqAJZXh0Ml9zcGxpY2VfYnJhbmNo
KGlub2RlLCBpYmxvY2ssIHBhcnRpYWwsDQo+ID4gPiBpbmRpcmVjdF9ibGtzLCBjb3VudCk7DQo+
ID4gPiDCoAltdXRleF91bmxvY2soJmVpLT50cnVuY2F0ZV9tdXRleCk7DQo+ID4gPiAtCXNldF9i
dWZmZXJfbmV3KGJoX3Jlc3VsdCk7DQo+ID4gPiDCoGdvdF9pdDoNCj4gPiA+IMKgCW1hcF9iaChi
aF9yZXN1bHQsIGlub2RlLT5pX3NiLCBsZTMyX3RvX2NwdShjaGFpbltkZXB0aC0NCj4gPiA+IDFd
LmtleSkpOw0KPiA+ID4gwqAJaWYgKGNvdW50ID4gYmxvY2tzX3RvX2JvdW5kYXJ5KQ0KPiA+ID4g
LS3CoA0KPiA+ID4gMi42LjYNCj4gPiBJbnRlcmVzdGluZ2x5IHRoaXMgY2hhbmdlIGlzIGNhdXNp
bmcgYSBidW5jaCBvZiB4ZnN0ZXN0cw0KPiA+IHJlZ3Jlc3Npb25zIGZvciBtZQ0KPiA+IHdpdGgg
ZXh0MiArIERBWC7CoMKgQWxsIG9mIHRoZXNlIHRlc3RzIHBhc3Mgd2l0aG91dCB0aGlzIG9uZSBj
aGFuZ2UuDQo+IEdvb2QgY2F0Y2guIEF0dGFjaGVkIHBhdGNoIGZpeGVzIHRoaXMgaXNzdWUgZm9y
IG1lLiBQcmVmZXJhYmx5IGl0DQo+IHNob3VsZCBiZQ0KPiBtZXJnZWQgYmVmb3JlIHRoZSBhYm92
ZSBleHQyIGNoYW5nZS4NCj4gDQo+IAkJCQkJCQkJSG9uemENCg0KSGV5IEphbiwNCg0KSW4gbXkg
cGF0Y2ggMyBvZiB0aGUgZXJyb3IgaGFuZGxpbmcgc2VyaWVzLCBJIGhhdmU6DQoNCi3CoMKgwqDC
oMKgwqDCoMKgwqDCoMKgwqDCoMKgwqBlcnIgPSBkYXhfY2xlYXJfc2VjdG9ycyhpbm9kZS0+aV9z
Yi0+c19iZGV2LA0KLcKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKg
wqDCoMKgwqDCoMKgwqDCoMKgbGUzMl90b19jcHUoY2hhaW5bZGVwdGgtMV0ua2V5KSA8PA0KLcKg
wqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDC
oMKgKGlub2RlLT5pX2Jsa2JpdHMgLSA5KSwNCi3CoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKg
wqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoDEgPDwgaW5vZGUtPmlfYmxrYml0cyk7
DQorwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgZXJyID0gc2JfaXNzdWVfemVyb291dChp
bm9kZS0+aV9zYiwNCivCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDC
oMKgwqDCoMKgwqDCoMKgwqDCoGxlMzJfdG9fY3B1KGNoYWluW2RlcHRoLTFdLmtleSksIDEsIEdG
UF9OT0ZTKTsNCg0KRG9lcyB0aGlzIG1lYW4gSSBoYXZlIHRvIGNoYW5nZSB0byBzZW5kIHRoZSBz
Yl9pc3N1ZV96ZXJvb3V0IGZvcg0KJ2NvdW50JyBibG9ja3MuLiBpLmUuDQoNCi3CoMKgwqDCoMKg
wqDCoMKgwqDCoMKgwqDCoMKgwqBlcnIgPSBkYXhfY2xlYXJfc2VjdG9ycyhpbm9kZS0+aV9zYi0+
c19iZGV2LA0KLcKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDC
oMKgwqDCoMKgwqDCoMKgbGUzMl90b19jcHUoY2hhaW5bZGVwdGgtMV0ua2V5KSA8PA0KLcKgwqDC
oMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKg
KGlub2RlLT5pX2Jsa2JpdHMgLSA5KSwNCi3CoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDC
oMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoDEgPDwgaW5vZGUtPmlfYmxrYml0cyk7DQor
wqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgZXJyID0gc2JfaXNzdWVfemVyb291dChpbm9k
ZS0+aV9zYiwNCivCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKg
wqDCoMKgwqDCoMKgwqDCoGxlMzJfdG9fY3B1KGNoYWluW2RlcHRoLTFdLmtleSksIGNvdW50LCBH
RlBfTk9GUyk7DQoNCklmIHNvLCBJJ2xsIHVwZGF0ZSBteSBzZXJpZXMgdG9tb3Jyb3cgdG8gaW5j
bHVkZSBpbiBib3RoIG9mIHRoZXNlIGNoYW5nZXMuDQoNClRoYW5rcywNCgktVmlzaGFs
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kara May 17, 2016, 7:19 a.m. UTC | #2
On Tue 17-05-16 06:52:47, Verma, Vishal L wrote:
> On Mon, 2016-05-16 at 17:22 +0200, Jan Kara wrote:
> > On Thu 12-05-16 12:45:22, Ross Zwisler wrote:
> > > 
> > > On Wed, May 11, 2016 at 11:58:49AM +0200, Jan Kara wrote:
> > > > 
> > > > Currently ext2 zeroes any data blocks allocated for DAX inode
> > > > however it
> > > > still returns them as BH_New. Thus DAX code zeroes them again in
> > > > dax_insert_mapping() which can possibly overwrite the data that
> > > > has been
> > > > already stored to those blocks by a racing dax_io(). Avoid
> > > > marking
> > > > pre-zeroed buffers as new.
> > > > 
> > > > Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> > > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > > ---
> > > >  fs/ext2/inode.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
> > > > index 6bd58e6ff038..1f07b758b968 100644
> > > > --- a/fs/ext2/inode.c
> > > > +++ b/fs/ext2/inode.c
> > > > @@ -745,11 +745,11 @@ static int ext2_get_blocks(struct inode
> > > > *inode,
> > > >  			mutex_unlock(&ei->truncate_mutex);
> > > >  			goto cleanup;
> > > >  		}
> > > > -	}
> > > > +	} else
> > > > +		set_buffer_new(bh_result);
> > > >  
> > > >  	ext2_splice_branch(inode, iblock, partial,
> > > > indirect_blks, count);
> > > >  	mutex_unlock(&ei->truncate_mutex);
> > > > -	set_buffer_new(bh_result);
> > > >  got_it:
> > > >  	map_bh(bh_result, inode->i_sb, le32_to_cpu(chain[depth-
> > > > 1].key));
> > > >  	if (count > blocks_to_boundary)
> > > > -- 
> > > > 2.6.6
> > > Interestingly this change is causing a bunch of xfstests
> > > regressions for me
> > > with ext2 + DAX.  All of these tests pass without this one change.
> > Good catch. Attached patch fixes this issue for me. Preferably it
> > should be
> > merged before the above ext2 change.
> > 
> > 								Honza
> 
> Hey Jan,
> 
> In my patch 3 of the error handling series, I have:
> 
> -               err = dax_clear_sectors(inode->i_sb->s_bdev,
> -                               le32_to_cpu(chain[depth-1].key) <<
> -                               (inode->i_blkbits - 9),
> -                               1 << inode->i_blkbits);
> +               err = sb_issue_zeroout(inode->i_sb,
> +                               le32_to_cpu(chain[depth-1].key), 1, GFP_NOFS);
> 
> Does this mean I have to change to send the sb_issue_zeroout for
> 'count' blocks.. i.e.

Yes, I've noticed the conflict today as well.

> -               err = dax_clear_sectors(inode->i_sb->s_bdev,
> -                               le32_to_cpu(chain[depth-1].key) <<
> -                               (inode->i_blkbits - 9),
> -                               1 << inode->i_blkbits);
> +               err = sb_issue_zeroout(inode->i_sb,
> +                               le32_to_cpu(chain[depth-1].key), count, GFP_NOFS);
> 
> If so, I'll update my series tomorrow to include in both of these changes.

I'd prefer these two to stay separate commits (they are really
independent).  Since you already depend on other patches from the DAX
cleanup series, just add this patch to the list of dependencies and base
your change on that... Hmm?

								Honza
Verma, Vishal L May 17, 2016, 5:56 p.m. UTC | #3
On Tue, 2016-05-17 at 09:19 +0200, Jan Kara wrote:
> On Tue 17-05-16 06:52:47, Verma, Vishal L wrote:


[...]

> > 

> > 

> > -               err = dax_clear_sectors(inode->i_sb->s_bdev,

> > -                               le32_to_cpu(chain[depth-1].key) <<

> > -                               (inode->i_blkbits - 9),

> > -                               1 << inode->i_blkbits);

> > +               err = sb_issue_zeroout(inode->i_sb,

> > +                               le32_to_cpu(chain[depth-1].key),

> > count, GFP_NOFS);

> > 

> > If so, I'll update my series tomorrow to include in both of these

> > changes.

> I'd prefer these two to stay separate commits (they are really

> independent).  Since you already depend on other patches from the DAX

> cleanup series, just add this patch to the list of dependencies and

> base

> your change on that... Hmm?


Yes I agree. I'm preparing a branch that adds your fix, and modifying my
sb_issue_zeroout to use the same fix. I've verified that those tests
pass after both changes.

Thanks!
	-Vishal
diff mbox

Patch

From 287d6b6cb0b6f325696fff93ff0f29ee5fde5736 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Mon, 16 May 2016 17:17:04 +0200
Subject: [PATCH] ext2: Fix block zeroing in ext2_get_blocks() for DAX

When zeroing allocated blocks for DAX, we accidentally zeroed only the
first allocated block instead of all of them. So far this problem is
hidden by the fact that page faults always need only a single block and
DAX write code zeroes blocks again. But the zeroing in DAX code is racy
and needs to be removed so fix the zeroing in ext2 to zero all allocated
blocks.

Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext2/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 6bd58e6ff038..038d0ed5f565 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -740,7 +740,7 @@  static int ext2_get_blocks(struct inode *inode,
 		err = dax_clear_sectors(inode->i_sb->s_bdev,
 				le32_to_cpu(chain[depth-1].key) <<
 				(inode->i_blkbits - 9),
-				1 << inode->i_blkbits);
+				count << inode->i_blkbits);
 		if (err) {
 			mutex_unlock(&ei->truncate_mutex);
 			goto cleanup;
-- 
2.6.6