Message ID | 20181213115306.fm2mjc3qszjiwkgf@merlin (mailing list archive) |
---|---|
State | New, archived |
Series | fs: Return EOPNOTSUPP if block layer does not support REQ_NOWAIT |
On 12/13/18 1:53 PM, Goldwyn Rodrigues wrote:
> For AIO+DIO with RWF_NOWAIT, if the block layer does not support REQ_NOWAIT,
> it returns EIO. Return EOPNOTSUPP to represent the correct error code.

Cc: stable@?

> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> ---
>  fs/direct-io.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/fs/direct-io.c b/fs/direct-io.c
> index 41a0e97252ae..77adf33916b8 100644
> --- a/fs/direct-io.c
> +++ b/fs/direct-io.c
> @@ -542,10 +542,13 @@ static blk_status_t dio_bio_complete(struct dio *dio, struct bio *bio)
>  	blk_status_t err = bio->bi_status;
>
>  	if (err) {
> -		if (err == BLK_STS_AGAIN && (bio->bi_opf & REQ_NOWAIT))
> -			dio->io_error = -EAGAIN;
> -		else
> -			dio->io_error = -EIO;
> +		dio->io_error = -EIO;
> +		if (bio->bi_opf & REQ_NOWAIT) {
> +			if (err == BLK_STS_AGAIN)
> +				dio->io_error = -EAGAIN;
> +			else if (err == BLK_STS_NOTSUPP)
> +				dio->io_error = -EOPNOTSUPP;
> +		}
>  	}
>
>  	if (dio->is_async && dio->op == REQ_OP_READ && dio->should_dirty) {

Looks good.

I wonder why it only shows up so rarely. Is there an alternative path that
generates EOPNOTSUPP, that works most of the time?

On Thu, Dec 13, 2018 at 02:04:41PM +0200, Avi Kivity wrote:
> On 12/13/18 1:53 PM, Goldwyn Rodrigues wrote:
> > For AIO+DIO with RWF_NOWAIT, if the block layer does not support REQ_NOWAIT,
> > it returns EIO. Return EOPNOTSUPP to represent the correct error code.
> >
> Cc: stable@?
> >
> > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> > ---
> >  fs/direct-io.c | 11 +++++++----
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/direct-io.c b/fs/direct-io.c
> > index 41a0e97252ae..77adf33916b8 100644
> > --- a/fs/direct-io.c
> > +++ b/fs/direct-io.c
> > @@ -542,10 +542,13 @@ static blk_status_t dio_bio_complete(struct dio *dio, struct bio *bio)
> >  	blk_status_t err = bio->bi_status;
> >  	if (err) {

I think this just needs to become:

	if (err)
		dio->io_error = blk_status_to_errno(bio->bi_status);

And Avi, you really should be using XFS ;-)

On 6:24 13/12, Christoph Hellwig wrote:
> On Thu, Dec 13, 2018 at 02:04:41PM +0200, Avi Kivity wrote:
> > On 12/13/18 1:53 PM, Goldwyn Rodrigues wrote:
> > > For AIO+DIO with RWF_NOWAIT, if the block layer does not support REQ_NOWAIT,
> > > it returns EIO. Return EOPNOTSUPP to represent the correct error code.
> > >
> > Cc: stable@?
> > >
> > > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> > > ---
> > >  fs/direct-io.c | 11 +++++++----
> > >  1 file changed, 7 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/fs/direct-io.c b/fs/direct-io.c
> > > index 41a0e97252ae..77adf33916b8 100644
> > > --- a/fs/direct-io.c
> > > +++ b/fs/direct-io.c
> > > @@ -542,10 +542,13 @@ static blk_status_t dio_bio_complete(struct dio *dio, struct bio *bio)
> > >  	blk_status_t err = bio->bi_status;
> > >  	if (err) {
>
> I think this just needs to become:
>
> 	if (err)
> 		dio->io_error = blk_status_to_errno(bio->bi_status);
>

Ahh.. Didn't know of its existence. Yes, the function is much more elaborate.

Thanks!

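As a rough illustration of why a table-driven conversion such as `blk_status_to_errno()` removes the need for the hand-written branches in the patch, here is a minimal userspace sketch. The enum `model_blk_status`, the table, and `model_status_to_errno()` are hypothetical stand-ins written for this example, not the kernel's definitions, and only the status codes discussed in this thread are modeled.

```c
/* Userspace model only: names and values are illustrative assumptions,
 * not the kernel's blk_status_t definitions. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum model_blk_status {
	MODEL_STS_OK = 0,
	MODEL_STS_NOTSUPP,
	MODEL_STS_AGAIN,
	MODEL_STS_IOERR,
	MODEL_STS_COUNT
};

/* One table entry per status, so no caller needs its own if/else chain. */
static const int model_errno_table[] = {
	[MODEL_STS_OK]      = 0,
	[MODEL_STS_NOTSUPP] = -EOPNOTSUPP,
	[MODEL_STS_AGAIN]   = -EAGAIN,
	[MODEL_STS_IOERR]   = -EIO,
};

/* Convert a modeled block status to a negative errno value. */
int model_status_to_errno(unsigned int status)
{
	/* Sanity check: the table must cover every modeled status. */
	assert(sizeof(model_errno_table) / sizeof(model_errno_table[0])
	       == MODEL_STS_COUNT);

	/* Unknown statuses degrade to a generic I/O error. */
	if (status >= MODEL_STS_COUNT)
		return -EIO;
	return model_errno_table[status];
}
```

With a conversion of this shape, the completion path only needs a single assignment, which is the simplification suggested above.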
On Thu, Dec 13, 2018 at 05:53:06AM -0600, Goldwyn Rodrigues wrote:
> For AIO+DIO with RWF_NOWAIT, if the block layer does not support REQ_NOWAIT,
> it returns EIO. Return EOPNOTSUPP to represent the correct error code.

Why is EOPNOTSUPP the "correct" error code? That's a networking error,
not a block layer error.

On 8:27 13/12, Matthew Wilcox wrote:
> On Thu, Dec 13, 2018 at 05:53:06AM -0600, Goldwyn Rodrigues wrote:
> > For AIO+DIO with RWF_NOWAIT, if the block layer does not support REQ_NOWAIT,
> > it returns EIO. Return EOPNOTSUPP to represent the correct error code.
>
> Why is EOPNOTSUPP the "correct" error code? That's a networking error,
> not a block layer error.

No. We return EOPNOTSUPP in filesystems as well, in case RWF_NOWAIT is not
supported.

On Thu, Dec 13, 2018 at 05:53:06AM -0600, Goldwyn Rodrigues wrote:
> For AIO+DIO with RWF_NOWAIT, if the block layer does not support REQ_NOWAIT,
> it returns EIO. Return EOPNOTSUPP to represent the correct error code.

Say what?

Does this mean that if a filesystem supports RWF_NOWAIT, but the
underlying block device/storage stack doesn't support it, then we'll
be getting EIO/EOPNOTSUPP errors returned to userspace?

Isn't that highly unfriendly to userspace applications? i.e. instead
of just ignoring RWF_NOWAIT in this case and having the AIO succeed,
we return a /fatal/ error from deep in the guts of the IO subsystem
that the user has no obvious way of tracking down?

I'm also concerned that this is highly hardware dependent - two
identical filesystems on different storage hardware on the same
machine could behave differently. i.e. it works on one filesystem
but not on the other, and there's no way to tell when it will work
or fail apart from trying to use RWF_NOWAIT?

I'd also like to point out that this error (whether EIO or
EOPNOTSUPP) is completely undocumented in the preadv2/pwritev2 man
page, so application developers that get bug reports about
EOPNOTSUPP errors are going to be rather confused....

Cheers,

Dave.

On 9:43 14/12, Dave Chinner wrote:
> On Thu, Dec 13, 2018 at 05:53:06AM -0600, Goldwyn Rodrigues wrote:
> > For AIO+DIO with RWF_NOWAIT, if the block layer does not support REQ_NOWAIT,
> > it returns EIO. Return EOPNOTSUPP to represent the correct error code.
>
> Say what?
>
> Does this mean that if a filesystem supports RWF_NOWAIT, but the
> underlying block device/storage stack doesn't support it, then we'll
> be getting EIO/EOPNOTSUPP errors returned to userspace?
>
> Isn't that highly unfriendly to userspace applications? i.e. instead
> of just ignoring RWF_NOWAIT in this case and having the AIO succeed,
> we return a /fatal/ error from deep in the guts of the IO subsystem
> that the user has no obvious way of tracking down?

Well, if it is not supported, we'd rather let users decide how they
want to handle it rather than manipulating the request in the kernel.
For all you know, it could be a probe call to understand if RWF_NOWAIT
is supported or not.

>
> I'm also concerned that this is highly hardware dependent - two
> identical filesystems on different storage hardware on the same
> machine could behave differently. i.e. it works on one filesystem
> but not on the other, and there's no way to tell when it will work
> or fail apart from trying to use RWF_NOWAIT?

I was not too happy getting it all the way down to the block layer either.
Multi-device setups make it worse. However, here we are and we need to
tell the user that RWF_NOWAIT is not supported in this environment.

>
> I'd also like to point out that this error (whether EIO or
> EOPNOTSUPP) is completely undocumented in the preadv2/pwritev2 man
> page, so application developers that get bug reports about
> EOPNOTSUPP errors are going to be rather confused....

Yes, I will send a patch to update the man page.

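From the application side, the probing approach mentioned above might look like the following sketch. It assumes a Linux system whose C library exposes `preadv2()`; `read_prefer_nowait()` is a hypothetical helper name, and falling back to a blocking read on both EAGAIN and EOPNOTSUPP is an assumption about what an application may want, not something this thread settles. The thread itself concerns AIO+DIO; the synchronous call is used here only to keep the sketch short.

```c
#define _GNU_SOURCE /* must precede all includes so preadv2() is declared */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

#ifndef RWF_NOWAIT
#define RWF_NOWAIT 0x00000008 /* fallback for older C library headers */
#endif

/*
 * Hypothetical helper: try a non-blocking read first, then fall back to a
 * plain blocking read if the kernel says "would block" (EAGAIN) or
 * "not supported here" (EOPNOTSUPP). Any other error, including EIO, is
 * passed back to the caller untouched.
 *
 * Returns the byte count, or -1 with errno set. *nowait_ok is set to 1
 * only when the RWF_NOWAIT attempt itself succeeded.
 */
ssize_t read_prefer_nowait(int fd, void *buf, size_t len, off_t off,
			   int *nowait_ok)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	ssize_t n;

	assert(nowait_ok != NULL);
	*nowait_ok = 0;

	n = preadv2(fd, &iov, 1, off, RWF_NOWAIT);
	if (n >= 0) {
		*nowait_ok = 1;
		return n;
	}

	if (errno == EAGAIN || errno == EOPNOTSUPP)
		return preadv2(fd, &iov, 1, off, 0); /* blocking fallback */

	return -1; /* real error; errno is left as the kernel set it */
}
```

A latency-sensitive application would typically run the blocking fallback from a worker thread rather than inline. Note that this pattern only helps if "not supported" is reported with an errno that can be told apart from a real EIO, which is the point under discussion.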
On 12/13/18 4:24 PM, Christoph Hellwig wrote:
>
>>> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
>>> ---
>>>  fs/direct-io.c | 11 +++++++----
>>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/direct-io.c b/fs/direct-io.c
>>> index 41a0e97252ae..77adf33916b8 100644
>>> --- a/fs/direct-io.c
>>> +++ b/fs/direct-io.c
>>> @@ -542,10 +542,13 @@ static blk_status_t dio_bio_complete(struct dio *dio, struct bio *bio)
>>>  	blk_status_t err = bio->bi_status;
>>>  	if (err) {
> I think this just need to become:
>
> 	if (err)
> 		dio->io_error = blk_status_to_errno(bio->bi_status);
>
> And Avi, you really should be using XFS ;-)

I did see this on XFS too. The whole thing bothers me, it doesn't happen
consistently in some setups, which I don't understand. Either it should
trigger always or never.

On Fri, Dec 14, 2018 at 11:09:10AM -0600, Goldwyn Rodrigues wrote:
> On 9:43 14/12, Dave Chinner wrote:
> > On Thu, Dec 13, 2018 at 05:53:06AM -0600, Goldwyn Rodrigues wrote:
> > > For AIO+DIO with RWF_NOWAIT, if the block layer does not support REQ_NOWAIT,
> > > it returns EIO. Return EOPNOTSUPP to represent the correct error code.
> >
> > Say what?
> >
> > Does this mean that if a filesystem supports RWF_NOWAIT, but the
> > underlying block device/storage stack doesn't support it, then we'll
> > be getting EIO/EOPNOTSUPP errors returned to userspace?
> >
> > Isn't that highly unfriendly to userspace applications? i.e. instead
> > of just ignoring RWF_NOWAIT in this case and having the AIO succeed,
> > we return a /fatal/ error from deep in the guts of the IO subsystem
> > that the user has no obvious way of tracking down?
>
> Well, if it is not supported, we'd rather let users decide how they
> want to handle it rather than manipulating the request in the kernel.
> For all you know, it could be a probe call to understand if RWF_NOWAIT
> is supported or not.

So even though the filesystem supports it and the app can avoid
blocking on filesystem locks (the biggest problem they have by far),
we're going to prevent the filesystems from being non-blocking
because the underlying block device isn't non-blocking?

That makes no sense to me at all.

> > I'm also concerned that this is highly hardware dependent - two
> > identical filesystems on different storage hardware on the same
> > machine could behave differently. i.e. it works on one filesystem
> > but not on the other, and there's no way to tell when it will work
> > or fail apart from trying to use RWF_NOWAIT?
>
> I was not too happy getting it all the way down to block layer either.
> The multi-devices makes it worse. However, here we are and we need to
> tell the user that RWF_NOWAIT is not supported in this environment.

RWF_NOWAIT matters for filesystems much more than the underlying
block device. If the application is accessing the block device
directly, then yes, RWF_NOWAIT support in the block device matters.
But when the IO is being done through the filesystem it's far more
important to avoid blocking on filesystem locks than whatever the
block device does....

Hence I think that if the bio is coming from a filesystem,
REQ_NOWAIT should always be accepted or bounced with EAGAIN and
never failed with EOPNOTSUPP. It just makes no sense at all for
filesystem based IO....

Cheers,

Dave.

On Sun, Dec 16, 2018 at 12:45:19PM +0200, Avi Kivity wrote:
> I did see this on XFS too. The whole thing bothers me, it doesn't happen
> consistently in some setups, which I don't understand. Either it should
> trigger always or never.

Well, if it also happens in XFS the above change isn't going to fix
it alone, there must be another issue hiding in addition to the error
conversion problems.

On 8:35 17/12, Dave Chinner wrote:
> >
> > I was not too happy getting it all the way down to block layer either.
> > The multi-devices makes it worse. However, here we are and we need to
> > tell the user that RWF_NOWAIT is not supported in this environment.
>
> RWF_NOWAIT matters for filesystems much more than the underlying
> block device. If the application is accessing the block device
> directly, then yes, RWF_NOWAIT support in the block device matters.
> But when the IO is being done through the filesystem it's far more
> important to avoid blocking on filesystem locks than whatever the
> block device does....
>
> Hence I think that if the bio is coming from a filesystem,
> REQ_NOWAIT should always be accepted or bounced with EAGAIN and
> never failed with EOPNOTSUPP. It just makes no sense at all for
> filesystem based IO....

It was initially suggested that the block layer would retry getting a
bio in get_request(). While request based devices were fine, the bio
based ones such as MD needed extra work. However, when I actually got
down to writing code for multi-device, it ran into more hurdles than
solutions, primarily in the area of bio merging.

RWF_NOWAIT should have been restricted to filesystems and I think we
should do away with (or at least ignore) REQ_NOWAIT for now.

On 9:38 17/12, Christoph Hellwig wrote:
> On Sun, Dec 16, 2018 at 12:45:19PM +0200, Avi Kivity wrote:
> > I did see this on XFS too. The whole thing bothers me, it doesn't happen
> > consistently in some setups, which I don't understand. Either it should
> > trigger always or never.
>
> Well, if it also happens in XFS the above change isn't going to fix
> it alone, there must be another issue hiding in addition to the error
> conversion problems.

Are you using multi-device setup as your block device? That could make
it return EOPNOTSUPP since we never got to a point where we could
merge code which supported bio based devices.

On 12/18/18 1:55 PM, Goldwyn Rodrigues wrote:
> On 9:38 17/12, Christoph Hellwig wrote:
>> On Sun, Dec 16, 2018 at 12:45:19PM +0200, Avi Kivity wrote:
>>> I did see this on XFS too. The whole thing bothers me, it doesn't happen
>>> consistently in some setups, which I don't understand. Either it should
>>> trigger always or never.
>> Well, if it also happens in XFS the above change isn't going to fix
>> it alone, there must be another issue hiding in addition to the error
>> conversion problems.
> Are you using multi-device setup as your block device? That could make
> it return EOPNOTSUPP since we never got to a point where we could
> merge code which supported bio based devices.
>

Yes, an lvm linear device on top of a single SATA SSD.

On 13/12/2018 13.53, Goldwyn Rodrigues wrote:
> For AIO+DIO with RWF_NOWAIT, if the block layer does not support REQ_NOWAIT,
> it returns EIO. Return EOPNOTSUPP to represent the correct error code.
>
> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> ---
>  fs/direct-io.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/fs/direct-io.c b/fs/direct-io.c
> index 41a0e97252ae..77adf33916b8 100644
> --- a/fs/direct-io.c
> +++ b/fs/direct-io.c
> @@ -542,10 +542,13 @@ static blk_status_t dio_bio_complete(struct dio *dio, struct bio *bio)
>  	blk_status_t err = bio->bi_status;
>
>  	if (err) {
> -		if (err == BLK_STS_AGAIN && (bio->bi_opf & REQ_NOWAIT))
> -			dio->io_error = -EAGAIN;
> -		else
> -			dio->io_error = -EIO;
> +		dio->io_error = -EIO;
> +		if (bio->bi_opf & REQ_NOWAIT) {
> +			if (err == BLK_STS_AGAIN)
> +				dio->io_error = -EAGAIN;
> +			else if (err == BLK_STS_NOTSUPP)
> +				dio->io_error = -EOPNOTSUPP;
> +		}
>  	}
>
>  	if (dio->is_async && dio->op == REQ_OP_READ && dio->should_dirty) {

In the end, did this or some alternative get applied? I'd like to enable
RWF_NOWAIT support, but EIO scares me and my application.

On 19:08 22/07, Avi Kivity wrote:
>
> On 13/12/2018 13.53, Goldwyn Rodrigues wrote:
> > For AIO+DIO with RWF_NOWAIT, if the block layer does not support REQ_NOWAIT,
> > it returns EIO. Return EOPNOTSUPP to represent the correct error code.
> >
> > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
> > ---
> >  fs/direct-io.c | 11 +++++++----
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/direct-io.c b/fs/direct-io.c
> > index 41a0e97252ae..77adf33916b8 100644
> > --- a/fs/direct-io.c
> > +++ b/fs/direct-io.c
> > @@ -542,10 +542,13 @@ static blk_status_t dio_bio_complete(struct dio *dio, struct bio *bio)
> >  	blk_status_t err = bio->bi_status;
> >  	if (err) {
> > -		if (err == BLK_STS_AGAIN && (bio->bi_opf & REQ_NOWAIT))
> > -			dio->io_error = -EAGAIN;
> > -		else
> > -			dio->io_error = -EIO;
> > +		dio->io_error = -EIO;
> > +		if (bio->bi_opf & REQ_NOWAIT) {
> > +			if (err == BLK_STS_AGAIN)
> > +				dio->io_error = -EAGAIN;
> > +			else if (err == BLK_STS_NOTSUPP)
> > +				dio->io_error = -EOPNOTSUPP;
> > +		}
> >  	}
> >  	if (dio->is_async && dio->op == REQ_OP_READ && dio->should_dirty) {
>
>
> In the end, did this or some alternative get applied? I'd like to enable
> RWF_NOWAIT support, but EIO scares me and my application.
>

No, it was not. There were a lot of objections to returning an error from
the block layer for a filesystem nowait request.

On 28/07/2020 16.38, Goldwyn Rodrigues wrote:
> On 19:08 22/07, Avi Kivity wrote:
>> On 13/12/2018 13.53, Goldwyn Rodrigues wrote:
>>> For AIO+DIO with RWF_NOWAIT, if the block layer does not support REQ_NOWAIT,
>>> it returns EIO. Return EOPNOTSUPP to represent the correct error code.
>>>
>>> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
>>> ---
>>>  fs/direct-io.c | 11 +++++++----
>>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/direct-io.c b/fs/direct-io.c
>>> index 41a0e97252ae..77adf33916b8 100644
>>> --- a/fs/direct-io.c
>>> +++ b/fs/direct-io.c
>>> @@ -542,10 +542,13 @@ static blk_status_t dio_bio_complete(struct dio *dio, struct bio *bio)
>>>  	blk_status_t err = bio->bi_status;
>>>  	if (err) {
>>> -		if (err == BLK_STS_AGAIN && (bio->bi_opf & REQ_NOWAIT))
>>> -			dio->io_error = -EAGAIN;
>>> -		else
>>> -			dio->io_error = -EIO;
>>> +		dio->io_error = -EIO;
>>> +		if (bio->bi_opf & REQ_NOWAIT) {
>>> +			if (err == BLK_STS_AGAIN)
>>> +				dio->io_error = -EAGAIN;
>>> +			else if (err == BLK_STS_NOTSUPP)
>>> +				dio->io_error = -EOPNOTSUPP;
>>> +		}
>>>  	}
>>>  	if (dio->is_async && dio->op == REQ_OP_READ && dio->should_dirty) {
>>
>> In the end, did this or some alternative get applied? I'd like to enable
>> RWF_NOWAIT support, but EIO scares me and my application.
>>
> No, it was not. There were lot of objections to return error from the
> block layer for a filesystem nowait request.
>

I see. For me, it makes RWF_NOWAIT unusable, since I have no way to
distinguish between real EIO and EIO due to this bug.

Maybe the filesystem should ask the block device if it supports nowait
ahead of time (during mounting), and not pass REQ_NOWAIT at all in those
cases.

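The mount-time check suggested above can be modeled in a few lines of userspace C. Everything in this sketch is hypothetical: `struct model_bdev`, `struct model_fs`, `MODEL_REQ_NOWAIT`, and both functions are invented for illustration and do not correspond to any kernel interface named in this thread.

```c
/* Userspace model only: all types, flags, and functions are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_REQ_NOWAIT (1u << 0) /* stand-in for a "do not block" bio flag */

struct model_bdev {
	bool supports_nowait; /* would be reported by the storage stack */
};

struct model_fs {
	bool pass_nowait; /* decided once, at mount time */
};

/* At mount: remember whether the device below can honor nowait requests. */
void model_fs_mount(struct model_fs *fs, const struct model_bdev *bdev)
{
	assert(fs != NULL && bdev != NULL);
	fs->pass_nowait = bdev->supports_nowait;
}

/*
 * At submission: the filesystem still avoids its own blocking paths when
 * the caller asked for nowait, but only forwards the flag to the block
 * layer if the device advertised support.
 */
unsigned int model_bio_flags(const struct model_fs *fs, bool caller_wants_nowait)
{
	assert(fs != NULL);
	if (caller_wants_nowait && fs->pass_nowait)
		return MODEL_REQ_NOWAIT;
	return 0;
}
```

Under this model a device without nowait support never sees the flag, so it has no reason to fail the request, and the application keeps the filesystem-level non-blocking behavior.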
On Wed, Jul 22, 2020 at 07:08:21PM +0300, Avi Kivity wrote:
>
> On 13/12/2018 13.53, Goldwyn Rodrigues wrote:
> > For AIO+DIO with RWF_NOWAIT, if the block layer does not support REQ_NOWAIT,
> > it returns EIO. Return EOPNOTSUPP to represent the correct error code.
> >
> > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>

I think the main problem is the EOPNOTSUPP return value. Everywhere else
we treat the lack of support as BLK_STS_AGAIN / -EAGAIN, so it should return
that.

Independent of that, the legacy direct I/O code really should just use
blk_status_to_errno like most of the other infrastructure instead of
having its own conversion and dropping the detailed error status on the
floor.

diff --git a/fs/direct-io.c b/fs/direct-io.c
index 41a0e97252ae..77adf33916b8 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -542,10 +542,13 @@ static blk_status_t dio_bio_complete(struct dio *dio, struct bio *bio)
 	blk_status_t err = bio->bi_status;
 
 	if (err) {
-		if (err == BLK_STS_AGAIN && (bio->bi_opf & REQ_NOWAIT))
-			dio->io_error = -EAGAIN;
-		else
-			dio->io_error = -EIO;
+		dio->io_error = -EIO;
+		if (bio->bi_opf & REQ_NOWAIT) {
+			if (err == BLK_STS_AGAIN)
+				dio->io_error = -EAGAIN;
+			else if (err == BLK_STS_NOTSUPP)
+				dio->io_error = -EOPNOTSUPP;
+		}
 	}
 
 	if (dio->is_async && dio->op == REQ_OP_READ && dio->should_dirty) {
For AIO+DIO with RWF_NOWAIT, if the block layer does not support REQ_NOWAIT,
it returns EIO. Return EOPNOTSUPP to represent the correct error code.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
---
 fs/direct-io.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)
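
To make the behavioral change in this patch easier to see, the sketch below models the error selection in `dio_bio_complete()` before and after the change as two plain userspace functions. The `M_STS_*` constants, `M_REQ_NOWAIT`, and the function names are hypothetical stand-ins for the kernel symbols in the diff; only the branching logic is taken from the patch.

```c
/* Userspace model only: constants and names are illustrative stand-ins. */
#include <assert.h>
#include <errno.h>

enum { M_STS_OK = 0, M_STS_NOTSUPP, M_STS_AGAIN, M_STS_IOERR };
#define M_REQ_NOWAIT (1u << 0)

/* Logic of the removed lines: only "again + nowait" escapes -EIO. */
int m_dio_error_before(int err, unsigned int opf)
{
	assert(err >= 0);
	if (!err)
		return 0;
	if (err == M_STS_AGAIN && (opf & M_REQ_NOWAIT))
		return -EAGAIN;
	return -EIO;
}

/* Logic of the added lines: "notsupp + nowait" now yields -EOPNOTSUPP. */
int m_dio_error_after(int err, unsigned int opf)
{
	int io_error;

	assert(err >= 0);
	if (!err)
		return 0;

	io_error = -EIO;
	if (opf & M_REQ_NOWAIT) {
		if (err == M_STS_AGAIN)
			io_error = -EAGAIN;
		else if (err == M_STS_NOTSUPP)
			io_error = -EOPNOTSUPP;
	}
	return io_error;
}
```

The two functions differ only for a nowait request that completes with the "not supported" status: the old logic reports -EIO, the patched logic reports -EOPNOTSUPP. Requests without the nowait flag still collapse every failure to -EIO in both versions.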