Message ID | 26d882184fc43043a810114258f45277752186c7.1535682461.git.ernesto.mnd.fernandez@gmail.com (mailing list archive)
State | New, archived
Series | [1/6] hfsplus: prevent btree data loss on root split
On Fri, Aug 31, 2018 at 12:58:19AM -0300, Ernesto A. Fernández wrote:
> Creating, renaming or deleting a file may cause catalog corruption and
> data loss. This bug is randomly triggered by xfstests generic/027, but
> here is a faster reproducer:
>
> truncate -s 50M fs.iso
> mkfs.hfsplus fs.iso
> mount fs.iso /mnt
> i=100
> while [ $i -le 150 ]; do
>   touch /mnt/$i &>/dev/null
>   ((++i))
> done
> i=100
> while [ $i -le 150 ]; do
>   mv /mnt/$i /mnt/$(perl -e "print $i x82") &>/dev/null
>   ((++i))
> done
> umount /mnt
> fsck.hfsplus -n fs.iso

It would be good to wire up this short reproducer as well for xfstests.
On Thu, Aug 30, 2018 at 10:36:42PM -0700, Christoph Hellwig wrote:
> On Fri, Aug 31, 2018 at 12:58:19AM -0300, Ernesto A. Fernández wrote:
> > Creating, renaming or deleting a file may cause catalog corruption and
> > data loss. This bug is randomly triggered by xfstests generic/027, but
> > here is a faster reproducer:
> >
> > [...]
>
> It would be good to wire up this short reproducer as well for xfstests.

Yes, that's my intention. The problem is that mkfs.hfsplus does not allow
setting the size of the filesystem for scratch_mkfs_sized(); you need a
workaround with the device mapper. I think I should submit that patch first
and see if there is a problem with it.
On Fri, Aug 31, 2018 at 11:55:54AM -0300, Ernesto A. Fernández wrote:
> On Thu, Aug 30, 2018 at 10:36:42PM -0700, Christoph Hellwig wrote:
> > It would be good to wire up this short reproducer as well for xfstests.
>
> Yes, that's my intention. The problem is that mkfs.hfsplus does not allow
> setting the size of the filesystem for scratch_mkfs_sized(); you need a
> workaround with the device mapper. I think I should submit that patch first
> and see if there is a problem with it.

You don't need to do that. We use loop devices like this w/ mkfs_dev
quite a lot in fstests. For example, generic/361 has pretty much the
exact code pattern you need....

Cheers,

Dave.
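The generic/361 pattern referenced above is, roughly, the following. This is a hedged sketch rather than the verbatim test: the helpers (_create_loop_device, _mkfs_dev, _mount, _destroy_loop_device) exist in fstests' common code, but the image size, paths, and comments here are illustrative.

    # Sketch of the fstests loop-device pattern (assumptions noted above)
    fs_img=$TEST_DIR/fs.img                  # backing file on the test fs
    $XFS_IO_PROG -f -c "truncate 50m" $fs_img
    loop_dev=$(_create_loop_device $fs_img)  # expose the file as a block dev
    _mkfs_dev $loop_dev                      # mkfs for the configured FSTYP
    _mount $loop_dev $SCRATCH_MNT
    # ... run the touch/mv reproducer loops against $SCRATCH_MNT ...
    $UMOUNT_PROG $SCRATCH_MNT
    _destroy_loop_device $loop_dev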
On Sat, Sep 01, 2018 at 02:49:26PM +1000, Dave Chinner wrote:
> On Fri, Aug 31, 2018 at 11:55:54AM -0300, Ernesto A. Fernández wrote:
> > Yes, that's my intention. The problem is that mkfs.hfsplus does not allow
> > setting the size of the filesystem for scratch_mkfs_sized(); you need a
> > workaround with the device mapper. I think I should submit that patch first
> > and see if there is a problem with it.
>
> You don't need to do that. We use loop devices like this w/ mkfs_dev
> quite a lot in fstests. For example, generic/361 has pretty much the
> exact code pattern you need....

I see what you mean in this case, but I really want to run every test that
uses _scratch_mkfs_sized(); generic/027 in particular, since it's the one
that found these bugs for me.

> Cheers,
>
> Dave.
On Sun, Sep 02, 2018 at 01:33:09AM -0300, Ernesto A. Fernández wrote:
> On Sat, Sep 01, 2018 at 02:49:26PM +1000, Dave Chinner wrote:
> > You don't need to do that. We use loop devices like this w/ mkfs_dev
> > quite a lot in fstests. For example, generic/361 has pretty much the
> > exact code pattern you need....
>
> I see what you mean in this case, but I really want to run every test that
> uses _scratch_mkfs_sized(); generic/027 in particular, since it's the one
> that found these bugs for me.

Trying to hack around SCRATCH_DEV to be some other device set up by
_scratch_mkfs_sized is likely to break things in unexpected ways.
Lots of library routines directly access SCRATCH_DEV or manipulate
it in interesting ways (e.g. build their own dm constructs on it).

IMO the right thing to do is implement the size parameter in
mkfs.hfsplus. I note that it has a dry-run capability (-N <size>)
already allows it to simulate making a filesystem of any size, so it
looks to me that most of what you need it to do is already in the
mkfs code. Then you can test for the mkfs parameter in
_scratch_mkfs_sized....

Cheers,

Dave.
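Concretely, Dave's suggestion would land as one more branch in the per-filesystem case statement inside fstests' _scratch_mkfs_sized(). The sketch below is hypothetical on two counts: mkfs.hfsplus does not have a size flag yet (the "-s <bytes>" spelling is invented for illustration), and the $MKFS_HFSPLUS_PROG variable is assumed rather than taken from fstests' common/config.

    # Hypothetical hfsplus branch for _scratch_mkfs_sized() in common/rc;
    # assumes a future mkfs.hfsplus size option, written here as "-s".
    hfsplus)
        $MKFS_HFSPLUS_PROG -s $fssize $SCRATCH_DEV
        ;;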
On Mon, Sep 03, 2018 at 09:32:36AM +1000, Dave Chinner wrote:
> Trying to hack around SCRATCH_DEV to be some other device set up by
> _scratch_mkfs_sized is likely to break things in unexpected ways.
> Lots of library routines directly access SCRATCH_DEV or manipulate
> it in interesting ways (e.g. build their own dm constructs on it).

I wasn't changing SCRATCH_DEV, the device mapper was only used briefly
on _scratch_mkfs_sized, and then again on _check_scratch_fs. Still your
point stands, Eryu already found problems with my patch.

> IMO the right thing to do is implement the size parameter in
> mkfs.hfsplus. I note that it has a dry-run capability (-N <size>)
> already allows it to simulate making a filesystem of any size, so it
> looks to me that most of what you need it to do is already in the
> mkfs code. Then you can test for the mkfs parameter in
> _scratch_mkfs_sized....

I'll look into it. I will also need to change fsck.hfsplus, since it
claims it found corruption when the filesystem doesn't fill the device.

Thanks.

> Cheers,
>
> Dave.
I'm sorry Andrew, could you please drop patches 3 and 6 of this series? There is a regression: the free node count is not always updated correctly when reserving space. I'll send the new versions as soon as I'm done testing them.
On Fri, 2018-08-31 at 00:58 -0300, Ernesto A. Fernández wrote:
> Creating, renaming or deleting a file may cause catalog corruption and
> data loss. This bug is randomly triggered by xfstests generic/027, but
> here is a faster reproducer:
>
> [...]
>
> The bug is triggered whenever hfs_brec_update_parent() needs to split
> the root node. The height of the btree is not increased, which leaves
> the new node orphaned and its records lost.
>
> Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
> ---
>  fs/hfsplus/brec.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/fs/hfsplus/brec.c b/fs/hfsplus/brec.c
> index ed8eacb34452..aa17a392b414 100644
> --- a/fs/hfsplus/brec.c
> +++ b/fs/hfsplus/brec.c
> @@ -429,6 +429,10 @@ static int hfs_brec_update_parent(struct hfs_find_data *fd)
>  	if (new_node) {
>  		__be32 cnid;
>
> +		if (!new_node->parent) {
> +			hfs_btree_inc_height(tree);
> +			new_node->parent = tree->root;

I worry about the case when we are adding the node on intermediate (not
root) level. As far as I can see, we will be in trouble here because I
don't see any processing of two possible cases: (1) root node; (2) node
of intermediate level. Do I miss something here?

Thanks,
Vyacheslav Dubeyko.

> +		}
>  		fd->bnode = hfs_bnode_find(tree, new_node->parent);
>  		/* create index key and entry */
>  		hfs_bnode_read_key(new_node, fd->search_key, 14);
On Tue, Oct 23, 2018 at 06:23:53PM -0700, Viacheslav Dubeyko wrote:
> On Fri, 2018-08-31 at 00:58 -0300, Ernesto A. Fernández wrote:
> > @@ -429,6 +429,10 @@ static int hfs_brec_update_parent(struct hfs_find_data *fd)
> >  	if (new_node) {
> >  		__be32 cnid;
> >
> > +		if (!new_node->parent) {
> > +			hfs_btree_inc_height(tree);
> > +			new_node->parent = tree->root;
>
> I worry about the case when we are adding the node on intermediate (not
> root) level. As far as I can see, we will be in trouble here because I
> don't see any processing of two possible cases: (1) root node; (2) node
> of intermediate level. Do I miss something here?

If 'new_node' had been the result of splitting a node other than root,
then it would have a parent.
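The detail backing this answer sits in hfs_bnode_split(), earlier in fs/hfsplus/brec.c: the newly allocated node inherits the parent of the node being split, so a parent of zero can only result from splitting the root. The snippet below is paraphrased from the relevant assignments, not a verbatim excerpt.

    /* Paraphrased sketch of the linkage setup in hfs_bnode_split():
     * the new node copies its metadata from the node being split.
     * Only the root has parent == 0, so a new_node with a zero parent
     * in hfs_brec_update_parent() must come from a root split. */
    new_node->prev = node->this;
    new_node->parent = node->parent;
    new_node->type = node->type;
    new_node->height = node->height;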
On Tue, 2018-10-23 at 22:32 -0300, Ernesto A. Fernández wrote:
> On Tue, Oct 23, 2018 at 06:23:53PM -0700, Viacheslav Dubeyko wrote:
> > I worry about the case when we are adding the node on intermediate (not
> > root) level. As far as I can see, we will be in trouble here because I
> > don't see any processing of two possible cases: (1) root node; (2) node
> > of intermediate level. Do I miss something here?
>
> If 'new_node' had been the result of splitting a node other than root,
> then it would have a parent.

Could you please share the callstack or/and more detailed explanation
that would show the correctness of the fix? Currently, it's not enough
details in the comment for easy understanding the issue's environment
and correctness of the fix. I simply mean here that as implementation
as concept of the HFS+ b-trees is not trivial. But everybody should
easy understand the patch.

Thanks,
Vyacheslav Dubeyko.
On Thu, Oct 25, 2018 at 09:51:57AM -0700, Viacheslav Dubeyko wrote:
> Could you please share the callstack or/and more detailed explanation
> that would show the correctness of the fix? Currently, it's not enough
> details in the comment for easy understanding the issue's environment
> and correctness of the fix.

This patch is two months old now. Why did you wait until the merge
window to give me your traditional "the patch is confusing" reply?

> I simply mean here that as implementation
> as concept of the HFS+ b-trees is not trivial. But everybody should
> easy understand the patch.

Those two sentences are not compatible. Reviewers need to have some
understanding of the code they review. No royal road...
On Thu, 2018-10-25 at 16:42 -0300, Ernesto A. Fernández wrote:
> This patch is two months old now. Why did you wait until the merge
> window to give me your traditional "the patch is confusing" reply?
>
> > I simply mean here that as implementation
> > as concept of the HFS+ b-trees is not trivial. But everybody should
> > easy understand the patch.
>
> Those two sentences are not compatible. Reviewers need to have some
> understanding of the code they review. No royal road...

If you've found an issue or a bug then you did see some error message
and/or callstack. It's not big deal to share it. Moreover, it's
important to include such error message, callstack and issue's details
in the comment section of the patch. Let's imagine that somebody will
use the same test-case and this guy would find another issue. So, how
will it be possible to distinguish your issue with another one for the
same test-case? Does your fix resolve another issue or not? It will be
impossible to realize this without error message or something. I didn't
ask to share some "secret" knowledge. But to share more details about
the issue is really important. Could you please share more details
(error message, callstack and so on) about the found issue? Even if we
are talking about silent corruption then the fsck tool complains about
something. And, finally, you've found some path in the code (callstack)
that is the reason of such corruption. Again, it's not big deal to
share these details.

Thanks,
Vyacheslav Dubeyko.
On Fri, Oct 26, 2018 at 09:58:49AM -0700, Viacheslav Dubeyko wrote:
> If you've found an issue or a bug then you did see some error message
> and/or callstack. It's not big deal to share it. Moreover, it's
> important to include such error message, callstack and issue's details
> in the comment section of the patch. [...] Could you please share more
> details (error message, callstack and so on) about the found issue?
> Even if we are talking about silent corruption then the fsck tool
> complains about something. And, finally, you've found some path in the
> code (callstack) that is the reason of such corruption. Again, it's not
> big deal to share these details.

I didn't remember the fsck error, so I rebuilt without the patch and ran
the reproducer again. No idea why you couldn't do that yourself. The
error is "Invalid sibling link (4, 3)". Have fun with that...

> Thanks,
> Vyacheslav Dubeyko.
diff --git a/fs/hfsplus/brec.c b/fs/hfsplus/brec.c
index ed8eacb34452..aa17a392b414 100644
--- a/fs/hfsplus/brec.c
+++ b/fs/hfsplus/brec.c
@@ -429,6 +429,10 @@ static int hfs_brec_update_parent(struct hfs_find_data *fd)
 	if (new_node) {
 		__be32 cnid;
 
+		if (!new_node->parent) {
+			hfs_btree_inc_height(tree);
+			new_node->parent = tree->root;
+		}
 		fd->bnode = hfs_bnode_find(tree, new_node->parent);
 		/* create index key and entry */
 		hfs_bnode_read_key(new_node, fd->search_key, 14);
Creating, renaming or deleting a file may cause catalog corruption and
data loss. This bug is randomly triggered by xfstests generic/027, but
here is a faster reproducer:

truncate -s 50M fs.iso
mkfs.hfsplus fs.iso
mount fs.iso /mnt
i=100
while [ $i -le 150 ]; do
  touch /mnt/$i &>/dev/null
  ((++i))
done
i=100
while [ $i -le 150 ]; do
  mv /mnt/$i /mnt/$(perl -e "print $i x82") &>/dev/null
  ((++i))
done
umount /mnt
fsck.hfsplus -n fs.iso

The bug is triggered whenever hfs_brec_update_parent() needs to split
the root node. The height of the btree is not increased, which leaves
the new node orphaned and its records lost.

Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
---
 fs/hfsplus/brec.c | 4 ++++
 1 file changed, 4 insertions(+)