Message ID | 20191213183632.19441-4-axboe@kernel.dk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | io_uring items for 5.6 | expand |
On Fri, Dec 13, 2019 at 11:36:25AM -0700, Jens Axboe wrote: > If the fast lookup fails, then return -EAGAIN to have the caller retry > the path lookup. This is in preparation for supporting non-blocking > open. NAK. We are not littering fs/namei.c with incremental broken bits and pieces with uncertain eventual use. And it's broken - lookup_slow() is *NOT* the only place that can and does block. For starters, ->d_revalidate() can very well block and it is called outside of lookup_slow(). So does ->d_automount(). So does ->d_manage(). I'm rather sceptical about the usefulness of non-blocking open, to be honest, but in any case, one thing that is absolutely not going to happen is piecewise introduction of such stuff without a discussion of the entire design.
On 12/26/19 5:42 PM, Al Viro wrote: > On Fri, Dec 13, 2019 at 11:36:25AM -0700, Jens Axboe wrote: >> If the fast lookup fails, then return -EAGAIN to have the caller retry >> the path lookup. This is in preparation for supporting non-blocking >> open. > > NAK. We are not littering fs/namei.c with incremental broken bits > and pieces with uncertain eventual use. To be fair, the "eventual use" is just the next patch or two... > And it's broken - lookup_slow() is *NOT* the only place that can and > does block. For starters, ->d_revalidate() can very well block and > it is called outside of lookup_slow(). So does ->d_automount(). > So does ->d_manage(). Fair enough, so it's not complete. I'd love to get it there, though! > I'm rather sceptical about the usefulness of non-blocking open, to be > honest, but in any case, one thing that is absolutely not going to > happen is piecewise introduction of such stuff without a discussion > of the entire design. It's a necessity for io_uring, otherwise _any_ open needs to happen out-of-line. But I get your objection, I'd like to get this moving in a productive way though. What do you want it to look like? I'd be totally fine with knowing if the fs has ->d_revalidate(), and always doing those out-of-line. If I know the open will be slow, that's preferable. Ditto for ->d_automount() and ->d_manage(), all of that looks like cases that would be fine to punt. I honestly care mostly about the cached local case _not_ needing out-of-line handling, that needs to happen inline. Still seems to me like the LOOKUP_NONBLOCK is the way to go, and just have lookup_fast() -EAGAIN if we need to call any of the potentially problematic dentry ops. Yes, they _may_ not block, but they could. I don't think we need to propagate this information further.
On 12/26/19 10:05 PM, Jens Axboe wrote: > On 12/26/19 5:42 PM, Al Viro wrote: >> On Fri, Dec 13, 2019 at 11:36:25AM -0700, Jens Axboe wrote: >>> If the fast lookup fails, then return -EAGAIN to have the caller retry >>> the path lookup. This is in preparation for supporting non-blocking >>> open. >> >> NAK. We are not littering fs/namei.c with incremental broken bits >> and pieces with uncertain eventual use. > > To be fair, the "eventual use" is just the next patch or two... > >> And it's broken - lookup_slow() is *NOT* the only place that can and >> does block. For starters, ->d_revalidate() can very well block and >> it is called outside of lookup_slow(). So does ->d_automount(). >> So does ->d_manage(). > > Fair enough, so it's not complete. I'd love to get it there, though! > >> I'm rather sceptical about the usefulness of non-blocking open, to be >> honest, but in any case, one thing that is absolutely not going to >> happen is piecewise introduction of such stuff without a discussion >> of the entire design. > > It's a necessity for io_uring, otherwise _any_ open needs to happen > out-of-line. But I get your objection, I'd like to get this moving in a > productive way though. > > What do you want it to look like? I'd be totally fine with knowing if > the fs has ->d_revalidate(), and always doing those out-of-line. If I > know the open will be slow, that's preferable. Ditto for ->d_automount() > and ->d_manage(), all of that looks like cases that would be fine to > punt. I honestly care mostly about the cached local case _not_ needing > out-of-line handling, that needs to happen inline. > > Still seems to me like the LOOKUP_NONBLOCK is the way to go, and just > have lookup_fast() -EAGAIN if we need to call any of the potentially > problematic dentry ops. Yes, they _may_ not block, but they could. I > don't think we need to propagate this information further. Incremental here - just check for potentially problematic dentry ops, and have the open redone from a path where it doesn't matter. diff --git a/fs/namei.c b/fs/namei.c index ebd05ed14b0a..9c46b1e04fac 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1549,6 +1549,14 @@ static struct dentry *__lookup_hash(const struct qstr *name, return dentry; } +static inline bool lookup_may_block(struct dentry *dentry) +{ + const struct dentry_operations *ops = dentry->d_op; + + /* assume these dentry ops may block */ + return ops->d_revalidate || ops->d_automount || ops->d_manage; +} + static int lookup_fast(struct nameidata *nd, struct path *path, struct inode **inode, unsigned *seqp) @@ -1573,6 +1581,9 @@ static int lookup_fast(struct nameidata *nd, return 0; } + if ((nd->flags & LOOKUP_NONBLOCK) && lookup_may_block(dentry)) + return -EAGAIN; + /* * This sequence count validates that the inode matches * the dentry name information from lookup. @@ -1615,7 +1626,10 @@ static int lookup_fast(struct nameidata *nd, dentry = __d_lookup(parent, &nd->last); if (unlikely(!dentry)) return 0; - status = d_revalidate(dentry, nd->flags); + if ((nd->flags & LOOKUP_NONBLOCK) && lookup_may_block(dentry)) + status = -EAGAIN; + else + status = d_revalidate(dentry, nd->flags); } if (unlikely(status <= 0)) { if (!status)
On 12/26/19 10:25 PM, Jens Axboe wrote: > On 12/26/19 10:05 PM, Jens Axboe wrote: >> On 12/26/19 5:42 PM, Al Viro wrote: >>> On Fri, Dec 13, 2019 at 11:36:25AM -0700, Jens Axboe wrote: >>>> If the fast lookup fails, then return -EAGAIN to have the caller retry >>>> the path lookup. This is in preparation for supporting non-blocking >>>> open. >>> >>> NAK. We are not littering fs/namei.c with incremental broken bits >>> and pieces with uncertain eventual use. >> >> To be fair, the "eventual use" is just the next patch or two... >> >>> And it's broken - lookup_slow() is *NOT* the only place that can and >>> does block. For starters, ->d_revalidate() can very well block and >>> it is called outside of lookup_slow(). So does ->d_automount(). >>> So does ->d_manage(). >> >> Fair enough, so it's not complete. I'd love to get it there, though! >> >>> I'm rather sceptical about the usefulness of non-blocking open, to be >>> honest, but in any case, one thing that is absolutely not going to >>> happen is piecewise introduction of such stuff without a discussion >>> of the entire design. >> >> It's a necessity for io_uring, otherwise _any_ open needs to happen >> out-of-line. But I get your objection, I'd like to get this moving in a >> productive way though. >> >> What do you want it to look like? I'd be totally fine with knowing if >> the fs has ->d_revalidate(), and always doing those out-of-line. If I >> know the open will be slow, that's preferable. Ditto for ->d_automount() >> and ->d_manage(), all of that looks like cases that would be fine to >> punt. I honestly care mostly about the cached local case _not_ needing >> out-of-line handling, that needs to happen inline. >> >> Still seems to me like the LOOKUP_NONBLOCK is the way to go, and just >> have lookup_fast() -EAGAIN if we need to call any of the potentially >> problematic dentry ops. Yes, they _may_ not block, but they could. I >> don't think we need to propagate this information further. > > Incremental here - just check for potentially problematic dentry ops, > and have the open redone from a path where it doesn't matter. Here's the (updated) full patch, with the bits cleaned up a bit. Would this be more agreeable to you? commit ac605d1d6ca445ba7e2990e0afe0e28ad831a663 Author: Jens Axboe <axboe@kernel.dk> Date: Fri Dec 13 11:09:26 2019 -0700 fs: add namei support for doing a non-blocking path lookup If the fast lookup fails, then return -EAGAIN to have the caller retry the path lookup. Assume that a dentry having any of: ->d_revalidate() ->d_automount() ->d_manage() could block in those callbacks. Preemptively return -EAGAIN if any of these are present. This is in preparation for supporting non-blocking open. Signed-off-by: Jens Axboe <axboe@kernel.dk> diff --git a/fs/namei.c b/fs/namei.c index d6c91d1e88cb..2bfdb932f2f2 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1549,6 +1549,17 @@ static struct dentry *__lookup_hash(const struct qstr *name, return dentry; } +static inline bool lookup_could_block(struct dentry *dentry, unsigned int flags) +{ + const struct dentry_operations *ops = dentry->d_op; + + if (!(flags & LOOKUP_NONBLOCK)) + return 0; + + /* assume these dentry ops may block */ + return ops->d_revalidate || ops->d_automount || ops->d_manage; +} + static int lookup_fast(struct nameidata *nd, struct path *path, struct inode **inode, unsigned *seqp) @@ -1573,6 +1584,9 @@ static int lookup_fast(struct nameidata *nd, return 0; } + if (unlikely(lookup_could_block(dentry, nd->flags))) + return -EAGAIN; + /* * This sequence count validates that the inode matches * the dentry name information from lookup. @@ -1615,7 +1629,10 @@ static int lookup_fast(struct nameidata *nd, dentry = __d_lookup(parent, &nd->last); if (unlikely(!dentry)) return 0; - status = d_revalidate(dentry, nd->flags); + if (unlikely(lookup_could_block(dentry, nd->flags))) + status = -EAGAIN; + else + status = d_revalidate(dentry, nd->flags); } if (unlikely(status <= 0)) { if (!status) @@ -1799,6 +1816,8 @@ static int walk_component(struct nameidata *nd, int flags) if (unlikely(err <= 0)) { if (err < 0) return err; + if (nd->flags & LOOKUP_NONBLOCK) + return -EAGAIN; path.dentry = lookup_slow(&nd->last, nd->path.dentry, nd->flags); if (IS_ERR(path.dentry)) diff --git a/include/linux/namei.h b/include/linux/namei.h index 7fe7b87a3ded..935a1bf0caca 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -38,6 +38,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND}; #define LOOKUP_JUMPED 0x1000 #define LOOKUP_ROOT 0x2000 #define LOOKUP_ROOT_GRABBED 0x0008 +#define LOOKUP_NONBLOCK 0x10000 /* don't block for lookup */ extern int path_pts(struct path *path);
On 12/27/19 8:45 AM, Jens Axboe wrote: > On 12/26/19 10:25 PM, Jens Axboe wrote: >> On 12/26/19 10:05 PM, Jens Axboe wrote: >>> On 12/26/19 5:42 PM, Al Viro wrote: >>>> On Fri, Dec 13, 2019 at 11:36:25AM -0700, Jens Axboe wrote: >>>>> If the fast lookup fails, then return -EAGAIN to have the caller retry >>>>> the path lookup. This is in preparation for supporting non-blocking >>>>> open. >>>> >>>> NAK. We are not littering fs/namei.c with incremental broken bits >>>> and pieces with uncertain eventual use. >>> >>> To be fair, the "eventual use" is just the next patch or two... >>> >>>> And it's broken - lookup_slow() is *NOT* the only place that can and >>>> does block. For starters, ->d_revalidate() can very well block and >>>> it is called outside of lookup_slow(). So does ->d_automount(). >>>> So does ->d_manage(). >>> >>> Fair enough, so it's not complete. I'd love to get it there, though! >>> >>>> I'm rather sceptical about the usefulness of non-blocking open, to be >>>> honest, but in any case, one thing that is absolutely not going to >>>> happen is piecewise introduction of such stuff without a discussion >>>> of the entire design. >>> >>> It's a necessity for io_uring, otherwise _any_ open needs to happen >>> out-of-line. But I get your objection, I'd like to get this moving in a >>> productive way though. >>> >>> What do you want it to look like? I'd be totally fine with knowing if >>> the fs has ->d_revalidate(), and always doing those out-of-line. If I >>> know the open will be slow, that's preferable. Ditto for ->d_automount() >>> and ->d_manage(), all of that looks like cases that would be fine to >>> punt. I honestly care mostly about the cached local case _not_ needing >>> out-of-line handling, that needs to happen inline. >>> >>> Still seems to me like the LOOKUP_NONBLOCK is the way to go, and just >>> have lookup_fast() -EAGAIN if we need to call any of the potentially >>> problematic dentry ops. Yes, they _may_ not block, but they could. I >>> don't think we need to propagate this information further. >> >> Incremental here - just check for potentially problematic dentry ops, >> and have the open redone from a path where it doesn't matter. > > Here's the (updated) full patch, with the bits cleaned up a bit. Would > this be more agreeable to you? Needs a !ops check as well, tmpfs hits that: commit 8a4dbfbbcd675492cf9e8cbcb203386a1ce10916 Author: Jens Axboe <axboe@kernel.dk> Date: Fri Dec 13 11:09:26 2019 -0700 fs: add namei support for doing a non-blocking path lookup If the fast lookup fails, then return -EAGAIN to have the caller retry the path lookup. Assume that a dentry having any of: ->d_revalidate() ->d_automount() ->d_manage() could block in those callbacks. Preemptively return -EAGAIN if any of these are present. This is in preparation for supporting non-blocking open. Signed-off-by: Jens Axboe <axboe@kernel.dk> diff --git a/fs/namei.c b/fs/namei.c index d6c91d1e88cb..42e71e5a69f1 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1549,6 +1549,17 @@ static struct dentry *__lookup_hash(const struct qstr *name, return dentry; } +static inline bool lookup_could_block(struct dentry *dentry, unsigned int flags) +{ + const struct dentry_operations *ops = dentry->d_op; + + if (!ops || !(flags & LOOKUP_NONBLOCK)) + return 0; + + /* assume these dentry ops may block */ + return ops->d_revalidate || ops->d_automount || ops->d_manage; +} + static int lookup_fast(struct nameidata *nd, struct path *path, struct inode **inode, unsigned *seqp) @@ -1573,6 +1584,9 @@ static int lookup_fast(struct nameidata *nd, return 0; } + if (unlikely(lookup_could_block(dentry, nd->flags))) + return -EAGAIN; + /* * This sequence count validates that the inode matches * the dentry name information from lookup. @@ -1615,7 +1629,10 @@ static int lookup_fast(struct nameidata *nd, dentry = __d_lookup(parent, &nd->last); if (unlikely(!dentry)) return 0; - status = d_revalidate(dentry, nd->flags); + if (unlikely(lookup_could_block(dentry, nd->flags))) + status = -EAGAIN; + else + status = d_revalidate(dentry, nd->flags); } if (unlikely(status <= 0)) { if (!status) @@ -1799,6 +1816,8 @@ static int walk_component(struct nameidata *nd, int flags) if (unlikely(err <= 0)) { if (err < 0) return err; + if (nd->flags & LOOKUP_NONBLOCK) + return -EAGAIN; path.dentry = lookup_slow(&nd->last, nd->path.dentry, nd->flags); if (IS_ERR(path.dentry)) diff --git a/include/linux/namei.h b/include/linux/namei.h index 7fe7b87a3ded..935a1bf0caca 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -38,6 +38,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND}; #define LOOKUP_JUMPED 0x1000 #define LOOKUP_ROOT 0x2000 #define LOOKUP_ROOT_GRABBED 0x0008 +#define LOOKUP_NONBLOCK 0x10000 /* don't block for lookup */ extern int path_pts(struct path *path);
diff --git a/fs/namei.c b/fs/namei.c index 2dda552bcf7a..50899721f699 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1801,6 +1801,8 @@ static int walk_component(struct nameidata *nd, int flags) if (unlikely(err <= 0)) { if (err < 0) return err; + if (nd->flags & LOOKUP_NONBLOCK) + return -EAGAIN; path.dentry = lookup_slow(&nd->last, nd->path.dentry, nd->flags); if (IS_ERR(path.dentry)) diff --git a/include/linux/namei.h b/include/linux/namei.h index 397a08ade6a2..a50ad21e3457 100644 --- a/include/linux/namei.h +++ b/include/linux/namei.h @@ -38,6 +38,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, LAST_BIND}; #define LOOKUP_JUMPED 0x1000 #define LOOKUP_ROOT 0x2000 #define LOOKUP_ROOT_GRABBED 0x0008 +#define LOOKUP_NONBLOCK 0x10000 /* don't block for lookup */ extern int path_pts(struct path *path);
If the fast lookup fails, then return -EAGAIN to have the caller retry the path lookup. This is in preparation for supporting non-blocking open. Signed-off-by: Jens Axboe <axboe@kernel.dk> --- fs/namei.c | 2 ++ include/linux/namei.h | 1 + 2 files changed, 3 insertions(+)