diff mbox series

[v2] staging: erofs: fix an error handling in erofs_readdir()

Message ID 20190818015631.6982-1-hsiangkao@aol.com (mailing list archive)
State New, archived
Headers show
Series [v2] staging: erofs: fix an error handling in erofs_readdir() | expand

Commit Message

Gao Xiang Aug. 18, 2019, 1:56 a.m. UTC
From: Gao Xiang <gaoxiang25@huawei.com>

Richard observed a forever loop of erofs_read_raw_page() [1]
which can be generated by forcely setting ->u.i_blkaddr
to 0xdeadbeef (as my understanding block layer can
handle access beyond end of device correctly).

After digging into that, it seems the problem is highly
related with directories and then I found the root cause
is an improper error handling in erofs_readdir().

Let's fix it now.

[1] https://lore.kernel.org/r/1163995781.68824.1566084358245.JavaMail.zimbra@nod.at/

Reported-by: Richard Weinberger <richard@nod.at>
Fixes: 3aa8ec716e52 ("staging: erofs: add directory operations")
Cc: <stable@vger.kernel.org> # 4.19+
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
---

changelog from v1:
 - fix the incorrect external link in commit message.

This patch is based on the following patch as well
https://lore.kernel.org/r/20190816071142.8633-1-gaoxiang25@huawei.com/

and
https://lore.kernel.org/r/20190817082313.21040-1-hsiangkao@aol.com/
can still be properly applied after this patch.

 drivers/staging/erofs/dir.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Matthew Wilcox (Oracle) Aug. 18, 2019, 2:20 a.m. UTC | #1
On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote:
> @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct dir_context *ctx)
>  		unsigned int nameoff, maxsize;
>  
>  		dentry_page = read_mapping_page(mapping, i, NULL);
> -		if (IS_ERR(dentry_page))
> -			continue;
> +		if (IS_ERR(dentry_page)) {
> +			errln("fail to readdir of logical block %u of nid %llu",
> +			      i, EROFS_V(dir)->nid);
> +			err = PTR_ERR(dentry_page);
> +			break;

I don't think you want to use the errno that came back from
read_mapping_page() (which is, I think, always going to be -EIO).
Rather you want -EFSCORRUPTED, at least if I understand the recent
patches to ext2/ext4/f2fs/xfs/...
Gao Xiang Aug. 18, 2019, 2:32 a.m. UTC | #2
Hi Willy,

On Sat, Aug 17, 2019 at 07:20:55PM -0700, Matthew Wilcox wrote:
> On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote:
> > @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct dir_context *ctx)
> >  		unsigned int nameoff, maxsize;
> >  
> >  		dentry_page = read_mapping_page(mapping, i, NULL);
> > -		if (IS_ERR(dentry_page))
> > -			continue;
> > +		if (IS_ERR(dentry_page)) {
> > +			errln("fail to readdir of logical block %u of nid %llu",
> > +			      i, EROFS_V(dir)->nid);
> > +			err = PTR_ERR(dentry_page);
> > +			break;
> 
> I don't think you want to use the errno that came back from
> read_mapping_page() (which is, I think, always going to be -EIO).
> Rather you want -EFSCORRUPTED, at least if I understand the recent
> patches to ext2/ext4/f2fs/xfs/...

Thanks for your reply and noticing this. :)

Yes, as I talked with you about read_mapping_page() in a xfs related
topic earlier, I think I fully understand what returns here.

I actually had some concern about that before sending out this patch.
You know the status is
   PG_uptodate is not set and PG_error is set here.

But we cannot know it is actually a disk read error or due to
corrupted images (due to lack of page flags or some status, and
I think it could be a waste of page structure space for such
corrupted image or disk error)...

And some people also like propagate errors from insiders...
(and they could argue about err = -EFSCORRUPTED as well..)

I'd like hear your suggestion about this after my words above?
still return -EFSCORRUPTED?

Thanks,
Gao Xiang
Matthew Wilcox (Oracle) Aug. 18, 2019, 2:53 a.m. UTC | #3
On Sun, Aug 18, 2019 at 10:32:45AM +0800, Gao Xiang wrote:
> On Sat, Aug 17, 2019 at 07:20:55PM -0700, Matthew Wilcox wrote:
> > On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote:
> > > @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct dir_context *ctx)
> > >  		unsigned int nameoff, maxsize;
> > >  
> > >  		dentry_page = read_mapping_page(mapping, i, NULL);
> > > -		if (IS_ERR(dentry_page))
> > > -			continue;
> > > +		if (IS_ERR(dentry_page)) {
> > > +			errln("fail to readdir of logical block %u of nid %llu",
> > > +			      i, EROFS_V(dir)->nid);
> > > +			err = PTR_ERR(dentry_page);
> > > +			break;
> > 
> > I don't think you want to use the errno that came back from
> > read_mapping_page() (which is, I think, always going to be -EIO).
> > Rather you want -EFSCORRUPTED, at least if I understand the recent
> > patches to ext2/ext4/f2fs/xfs/...
> 
> Thanks for your reply and noticing this. :)
> 
> Yes, as I talked with you about read_mapping_page() in a xfs related
> topic earlier, I think I fully understand what returns here.
> 
> I actually had some concern about that before sending out this patch.
> You know the status is
>    PG_uptodate is not set and PG_error is set here.
> 
> But we cannot know it is actually a disk read error or due to
> corrupted images (due to lack of page flags or some status, and
> I think it could be a waste of page structure space for such
> corrupted image or disk error)...
> 
> And some people also like propagate errors from insiders...
> (and they could argue about err = -EFSCORRUPTED as well..)
> 
> I'd like hear your suggestion about this after my words above?
> still return -EFSCORRUPTED?

I don't think it matters whether it's due to a disk error or a corrupted
image.  We can't read the directory entry, so we should probably return
-EFSCORRUPTED.  Thinking about it some more, read_mapping_page() can
also return -ENOMEM, so it should probably look something like this:

		err = 0;
		if (dentry_page == ERR_PTR(-ENOMEM))
			err = -ENOMEM;
		else if (IS_ERR(dentry_page)) {
			errln("fail to readdir of logical block %u of nid %llu",
			      i, EROFS_V(dir)->nid);
			err = -EFSCORRUPTED;
		}

		if (err)
			break;
Gao Xiang Aug. 18, 2019, 3:01 a.m. UTC | #4
On Sat, Aug 17, 2019 at 07:53:39PM -0700, Matthew Wilcox wrote:
> On Sun, Aug 18, 2019 at 10:32:45AM +0800, Gao Xiang wrote:
> > On Sat, Aug 17, 2019 at 07:20:55PM -0700, Matthew Wilcox wrote:
> > > On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote:
> > > > @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct dir_context *ctx)
> > > >  		unsigned int nameoff, maxsize;
> > > >  
> > > >  		dentry_page = read_mapping_page(mapping, i, NULL);
> > > > -		if (IS_ERR(dentry_page))
> > > > -			continue;
> > > > +		if (IS_ERR(dentry_page)) {
> > > > +			errln("fail to readdir of logical block %u of nid %llu",
> > > > +			      i, EROFS_V(dir)->nid);
> > > > +			err = PTR_ERR(dentry_page);
> > > > +			break;
> > > 
> > > I don't think you want to use the errno that came back from
> > > read_mapping_page() (which is, I think, always going to be -EIO).
> > > Rather you want -EFSCORRUPTED, at least if I understand the recent
> > > patches to ext2/ext4/f2fs/xfs/...
> > 
> > Thanks for your reply and noticing this. :)
> > 
> > Yes, as I talked with you about read_mapping_page() in a xfs related
> > topic earlier, I think I fully understand what returns here.
> > 
> > I actually had some concern about that before sending out this patch.
> > You know the status is
> >    PG_uptodate is not set and PG_error is set here.
> > 
> > But we cannot know it is actually a disk read error or due to
> > corrupted images (due to lack of page flags or some status, and
> > I think it could be a waste of page structure space for such
> > corrupted image or disk error)...
> > 
> > And some people also like propagate errors from insiders...
> > (and they could argue about err = -EFSCORRUPTED as well..)
> > 
> > I'd like hear your suggestion about this after my words above?
> > still return -EFSCORRUPTED?
> 
> I don't think it matters whether it's due to a disk error or a corrupted
> image.  We can't read the directory entry, so we should probably return
> -EFSCORRUPTED.  Thinking about it some more, read_mapping_page() can
> also return -ENOMEM, so it should probably look something like this:

OK, I will send the next version like what you described below.
I realized that at first but I have no tendency to return
EFSCORRUPTED or EIO here.

Thanks,
Gao Xiang

> 
> 		err = 0;
> 		if (dentry_page == ERR_PTR(-ENOMEM))
> 			err = -ENOMEM;
> 		else if (IS_ERR(dentry_page)) {
> 			errln("fail to readdir of logical block %u of nid %llu",
> 			      i, EROFS_V(dir)->nid);
> 			err = -EFSCORRUPTED;
> 		}
> 
> 		if (err)
> 			break;
Chao Yu Aug. 18, 2019, 10:39 a.m. UTC | #5
On 2019-8-18 10:53, Matthew Wilcox wrote:
> On Sun, Aug 18, 2019 at 10:32:45AM +0800, Gao Xiang wrote:
>> On Sat, Aug 17, 2019 at 07:20:55PM -0700, Matthew Wilcox wrote:
>>> On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote:
>>>> @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct dir_context *ctx)
>>>>  		unsigned int nameoff, maxsize;
>>>>  
>>>>  		dentry_page = read_mapping_page(mapping, i, NULL);
>>>> -		if (IS_ERR(dentry_page))
>>>> -			continue;
>>>> +		if (IS_ERR(dentry_page)) {
>>>> +			errln("fail to readdir of logical block %u of nid %llu",
>>>> +			      i, EROFS_V(dir)->nid);
>>>> +			err = PTR_ERR(dentry_page);
>>>> +			break;
>>>
>>> I don't think you want to use the errno that came back from
>>> read_mapping_page() (which is, I think, always going to be -EIO).
>>> Rather you want -EFSCORRUPTED, at least if I understand the recent
>>> patches to ext2/ext4/f2fs/xfs/...
>>
>> Thanks for your reply and noticing this. :)
>>
>> Yes, as I talked with you about read_mapping_page() in a xfs related
>> topic earlier, I think I fully understand what returns here.
>>
>> I actually had some concern about that before sending out this patch.
>> You know the status is
>>    PG_uptodate is not set and PG_error is set here.
>>
>> But we cannot know it is actually a disk read error or due to
>> corrupted images (due to lack of page flags or some status, and
>> I think it could be a waste of page structure space for such
>> corrupted image or disk error)...
>>
>> And some people also like propagate errors from insiders...
>> (and they could argue about err = -EFSCORRUPTED as well..)
>>
>> I'd like hear your suggestion about this after my words above?
>> still return -EFSCORRUPTED?
> 
> I don't think it matters whether it's due to a disk error or a corrupted
> image.  We can't read the directory entry, so we should probably return
> -EFSCORRUPTED.  Thinking about it some more, read_mapping_page() can
> also return -ENOMEM, so it should probably look something like this:
> 
> 		err = 0;
> 		if (dentry_page == ERR_PTR(-ENOMEM))
> 			err = -ENOMEM;
> 		else if (IS_ERR(dentry_page)) {
> 			errln("fail to readdir of logical block %u of nid %llu",
> 			      i, EROFS_V(dir)->nid);
> 			err = -EFSCORRUPTED;

Well, if there is real IO error happen under filesystem, we should return -EIO
instead of EFSCORRUPTED?

The right fix may be that doing sanity check on on-disk blkaddr, and return
-EFSCORRUPTED if the blkaddr is invalid and propagate the error to its caller
erofs_readdir(), IIUC below error info.

> [36297.354090] attempt to access beyond end of device
> [36297.354098] loop17: rw=0, want=29887428984, limit=1953128
> [36297.354107] attempt to access beyond end of device
> [36297.354109] loop17: rw=0, want=29887428480, limit=1953128
> [36301.827234] attempt to access beyond end of device
> [36301.827243] loop17: rw=0, want=29887428480, limit=1953128

Thanks,

> 		}
> 
> 		if (err)
> 			break;
>
Gao Xiang Aug. 18, 2019, 10:52 a.m. UTC | #6
Hi Chao,

On Sun, Aug 18, 2019 at 06:39:52PM +0800, Chao Yu wrote:
> On 2019-8-18 10:53, Matthew Wilcox wrote:
> > On Sun, Aug 18, 2019 at 10:32:45AM +0800, Gao Xiang wrote:
> >> On Sat, Aug 17, 2019 at 07:20:55PM -0700, Matthew Wilcox wrote:
> >>> On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote:
> >>>> @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct dir_context *ctx)
> >>>>  		unsigned int nameoff, maxsize;
> >>>>  
> >>>>  		dentry_page = read_mapping_page(mapping, i, NULL);
> >>>> -		if (IS_ERR(dentry_page))
> >>>> -			continue;
> >>>> +		if (IS_ERR(dentry_page)) {
> >>>> +			errln("fail to readdir of logical block %u of nid %llu",
> >>>> +			      i, EROFS_V(dir)->nid);
> >>>> +			err = PTR_ERR(dentry_page);
> >>>> +			break;
> >>>
> >>> I don't think you want to use the errno that came back from
> >>> read_mapping_page() (which is, I think, always going to be -EIO).
> >>> Rather you want -EFSCORRUPTED, at least if I understand the recent
> >>> patches to ext2/ext4/f2fs/xfs/...
> >>
> >> Thanks for your reply and noticing this. :)
> >>
> >> Yes, as I talked with you about read_mapping_page() in a xfs related
> >> topic earlier, I think I fully understand what returns here.
> >>
> >> I actually had some concern about that before sending out this patch.
> >> You know the status is
> >>    PG_uptodate is not set and PG_error is set here.
> >>
> >> But we cannot know it is actually a disk read error or due to
> >> corrupted images (due to lack of page flags or some status, and
> >> I think it could be a waste of page structure space for such
> >> corrupted image or disk error)...
> >>
> >> And some people also like propagate errors from insiders...
> >> (and they could argue about err = -EFSCORRUPTED as well..)
> >>
> >> I'd like hear your suggestion about this after my words above?
> >> still return -EFSCORRUPTED?
> > 
> > I don't think it matters whether it's due to a disk error or a corrupted
> > image.  We can't read the directory entry, so we should probably return
> > -EFSCORRUPTED.  Thinking about it some more, read_mapping_page() can
> > also return -ENOMEM, so it should probably look something like this:
> > 
> > 		err = 0;
> > 		if (dentry_page == ERR_PTR(-ENOMEM))
> > 			err = -ENOMEM;
> > 		else if (IS_ERR(dentry_page)) {
> > 			errln("fail to readdir of logical block %u of nid %llu",
> > 			      i, EROFS_V(dir)->nid);
> > 			err = -EFSCORRUPTED;
> 
> Well, if there is real IO error happen under filesystem, we should return -EIO
> instead of EFSCORRUPTED?
> 
> The right fix may be that doing sanity check on on-disk blkaddr, and return
> -EFSCORRUPTED if the blkaddr is invalid and propagate the error to its caller
> erofs_readdir(), IIUC below error info.

In my thought, I actually don't care what is actually returned
(In other words, I have no tendency about EFSCORRUPTED / EIO)
as long as it behaves normal for corrupted image.

A little concern is that I have no idea whether all user problems
can handle EUCLEAN properly.

I don't want to limit blkaddr as what ->blocks recorded in
erofs_super_block since it's already used for our hotpatching
approach and that field is only used for statfs() for users
to know its visible size, and block layer will block such
invalid block access.

All in all, that is minor. I think we can fix as what Matthew said.

Thanks,
Gao Xiang

> 
> > [36297.354090] attempt to access beyond end of device
> > [36297.354098] loop17: rw=0, want=29887428984, limit=1953128
> > [36297.354107] attempt to access beyond end of device
> > [36297.354109] loop17: rw=0, want=29887428480, limit=1953128
> > [36301.827234] attempt to access beyond end of device
> > [36301.827243] loop17: rw=0, want=29887428480, limit=1953128
> 
> Thanks,
> 
> > 		}
> > 
> > 		if (err)
> > 			break;
> >
Chao Yu Aug. 18, 2019, 12:28 p.m. UTC | #7
Hi Xiang,

On 2019-8-18 18:52, Gao Xiang wrote:
> Hi Chao,
> 
> On Sun, Aug 18, 2019 at 06:39:52PM +0800, Chao Yu wrote:
>> On 2019-8-18 10:53, Matthew Wilcox wrote:
>>> On Sun, Aug 18, 2019 at 10:32:45AM +0800, Gao Xiang wrote:
>>>> On Sat, Aug 17, 2019 at 07:20:55PM -0700, Matthew Wilcox wrote:
>>>>> On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote:
>>>>>> @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct dir_context *ctx)
>>>>>>  		unsigned int nameoff, maxsize;
>>>>>>  
>>>>>>  		dentry_page = read_mapping_page(mapping, i, NULL);
>>>>>> -		if (IS_ERR(dentry_page))
>>>>>> -			continue;
>>>>>> +		if (IS_ERR(dentry_page)) {
>>>>>> +			errln("fail to readdir of logical block %u of nid %llu",
>>>>>> +			      i, EROFS_V(dir)->nid);
>>>>>> +			err = PTR_ERR(dentry_page);
>>>>>> +			break;
>>>>>
>>>>> I don't think you want to use the errno that came back from
>>>>> read_mapping_page() (which is, I think, always going to be -EIO).
>>>>> Rather you want -EFSCORRUPTED, at least if I understand the recent
>>>>> patches to ext2/ext4/f2fs/xfs/...
>>>>
>>>> Thanks for your reply and noticing this. :)
>>>>
>>>> Yes, as I talked with you about read_mapping_page() in a xfs related
>>>> topic earlier, I think I fully understand what returns here.
>>>>
>>>> I actually had some concern about that before sending out this patch.
>>>> You know the status is
>>>>    PG_uptodate is not set and PG_error is set here.
>>>>
>>>> But we cannot know it is actually a disk read error or due to
>>>> corrupted images (due to lack of page flags or some status, and
>>>> I think it could be a waste of page structure space for such
>>>> corrupted image or disk error)...
>>>>
>>>> And some people also like propagate errors from insiders...
>>>> (and they could argue about err = -EFSCORRUPTED as well..)
>>>>
>>>> I'd like hear your suggestion about this after my words above?
>>>> still return -EFSCORRUPTED?
>>>
>>> I don't think it matters whether it's due to a disk error or a corrupted
>>> image.  We can't read the directory entry, so we should probably return
>>> -EFSCORRUPTED.  Thinking about it some more, read_mapping_page() can
>>> also return -ENOMEM, so it should probably look something like this:
>>>
>>> 		err = 0;
>>> 		if (dentry_page == ERR_PTR(-ENOMEM))
>>> 			err = -ENOMEM;
>>> 		else if (IS_ERR(dentry_page)) {
>>> 			errln("fail to readdir of logical block %u of nid %llu",
>>> 			      i, EROFS_V(dir)->nid);
>>> 			err = -EFSCORRUPTED;
>>
>> Well, if there is real IO error happen under filesystem, we should return -EIO
>> instead of EFSCORRUPTED?
>>
>> The right fix may be that doing sanity check on on-disk blkaddr, and return
>> -EFSCORRUPTED if the blkaddr is invalid and propagate the error to its caller
>> erofs_readdir(), IIUC below error info.
> 
> In my thought, I actually don't care what is actually returned
> (In other words, I have no tendency about EFSCORRUPTED / EIO)
> as long as it behaves normal for corrupted image.

I doubt that user can and be willing to handle different error code in there
error path.

> 
> A little concern is that I have no idea whether all user problems
> can handle EUCLEAN properly.

Yes, I can see it's widely used in fileystem, I guess it would be better to
update manual of common fs interfaces to let user be aware of the meaning of it.

> 
> I don't want to limit blkaddr as what ->blocks recorded in
> erofs_super_block since it's already used for our hotpatching
> approach and that field is only used for statfs() for users
> to know its visible size, and block layer will block such
> invalid block access.
> 
> All in all, that is minor. I think we can fix as what Matthew said.

Yeah, as we discuss offline, we need keep flexibility on current version for
android, and maybe we can add a feature to check blkaddr validation later.

Thanks,

> 
> Thanks,
> Gao Xiang
> 
>>
>>> [36297.354090] attempt to access beyond end of device
>>> [36297.354098] loop17: rw=0, want=29887428984, limit=1953128
>>> [36297.354107] attempt to access beyond end of device
>>> [36297.354109] loop17: rw=0, want=29887428480, limit=1953128
>>> [36301.827234] attempt to access beyond end of device
>>> [36301.827243] loop17: rw=0, want=29887428480, limit=1953128
>>
>> Thanks,
>>
>>> 		}
>>>
>>> 		if (err)
>>> 			break;
>>>
diff mbox series

Patch

diff --git a/drivers/staging/erofs/dir.c b/drivers/staging/erofs/dir.c
index 5f38382637e6..f2d7539589e4 100644
--- a/drivers/staging/erofs/dir.c
+++ b/drivers/staging/erofs/dir.c
@@ -82,8 +82,12 @@  static int erofs_readdir(struct file *f, struct dir_context *ctx)
 		unsigned int nameoff, maxsize;
 
 		dentry_page = read_mapping_page(mapping, i, NULL);
-		if (IS_ERR(dentry_page))
-			continue;
+		if (IS_ERR(dentry_page)) {
+			errln("fail to readdir of logical block %u of nid %llu",
+			      i, EROFS_V(dir)->nid);
+			err = PTR_ERR(dentry_page);
+			break;
+		}
 
 		de = (struct erofs_dirent *)kmap(dentry_page);