Message ID | a8cdb781384791c30e30036aced4c027c5dfea86.1605969341.git.asml.silence@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | iov_iter: optimise iter type checking | expand |
On 21/11/2020 14:37, Pavel Begunkov wrote: > The problem here is that iov_iter_is_*() helpers check types for > equality, but all iterate_* helpers do bitwise ands. This confuses > compilers, so even if some cases were handled separately with > iov_iter_is_*(), corresponding ifs in iterate*() right after are not > eliminated. > > E.g. iov_iter_npages() first handles discards, but iterate_all_kinds() > still checks for discard iter type and generates unreachable code down > the line. Ping. This one should be pretty simple > > text data bss dec hex filename > before: 24409 805 0 25214 627e lib/iov_iter.o > after: 23977 805 0 24782 60ce lib/iov_iter.o > > Reviewed-by: Jens Axboe <axboe@kernel.dk> > Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> > --- > include/linux/uio.h | 10 +++++----- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/include/linux/uio.h b/include/linux/uio.h > index 72d88566694e..c5970b2d3307 100644 > --- a/include/linux/uio.h > +++ b/include/linux/uio.h > @@ -57,27 +57,27 @@ static inline enum iter_type iov_iter_type(const struct iov_iter *i) > > static inline bool iter_is_iovec(const struct iov_iter *i) > { > - return iov_iter_type(i) == ITER_IOVEC; > + return iov_iter_type(i) & ITER_IOVEC; > } > > static inline bool iov_iter_is_kvec(const struct iov_iter *i) > { > - return iov_iter_type(i) == ITER_KVEC; > + return iov_iter_type(i) & ITER_KVEC; > } > > static inline bool iov_iter_is_bvec(const struct iov_iter *i) > { > - return iov_iter_type(i) == ITER_BVEC; > + return iov_iter_type(i) & ITER_BVEC; > } > > static inline bool iov_iter_is_pipe(const struct iov_iter *i) > { > - return iov_iter_type(i) == ITER_PIPE; > + return iov_iter_type(i) & ITER_PIPE; > } > > static inline bool iov_iter_is_discard(const struct iov_iter *i) > { > - return iov_iter_type(i) == ITER_DISCARD; > + return iov_iter_type(i) & ITER_DISCARD; > } > > static inline unsigned char iov_iter_rw(const struct iov_iter *i) >
On 06/12/2020 16:01, Pavel Begunkov wrote: > On 21/11/2020 14:37, Pavel Begunkov wrote: >> The problem here is that iov_iter_is_*() helpers check types for >> equality, but all iterate_* helpers do bitwise ands. This confuses >> compilers, so even if some cases were handled separately with >> iov_iter_is_*(), corresponding ifs in iterate*() right after are not >> eliminated. >> >> E.g. iov_iter_npages() first handles discards, but iterate_all_kinds() >> still checks for discard iter type and generates unreachable code down >> the line. > > Ping. This one should be pretty simple Ping please. Any doubts about this patch? > >> >> text data bss dec hex filename >> before: 24409 805 0 25214 627e lib/iov_iter.o >> after: 23977 805 0 24782 60ce lib/iov_iter.o >> >> Reviewed-by: Jens Axboe <axboe@kernel.dk> >> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> >> --- >> include/linux/uio.h | 10 +++++----- >> 1 file changed, 5 insertions(+), 5 deletions(-) >> >> diff --git a/include/linux/uio.h b/include/linux/uio.h >> index 72d88566694e..c5970b2d3307 100644 >> --- a/include/linux/uio.h >> +++ b/include/linux/uio.h >> @@ -57,27 +57,27 @@ static inline enum iter_type iov_iter_type(const struct iov_iter *i) >> >> static inline bool iter_is_iovec(const struct iov_iter *i) >> { >> - return iov_iter_type(i) == ITER_IOVEC; >> + return iov_iter_type(i) & ITER_IOVEC; >> } >> >> static inline bool iov_iter_is_kvec(const struct iov_iter *i) >> { >> - return iov_iter_type(i) == ITER_KVEC; >> + return iov_iter_type(i) & ITER_KVEC; >> } >> >> static inline bool iov_iter_is_bvec(const struct iov_iter *i) >> { >> - return iov_iter_type(i) == ITER_BVEC; >> + return iov_iter_type(i) & ITER_BVEC; >> } >> >> static inline bool iov_iter_is_pipe(const struct iov_iter *i) >> { >> - return iov_iter_type(i) == ITER_PIPE; >> + return iov_iter_type(i) & ITER_PIPE; >> } >> >> static inline bool iov_iter_is_discard(const struct iov_iter *i) >> { >> - return iov_iter_type(i) == ITER_DISCARD; >> + return iov_iter_type(i) & ITER_DISCARD; >> } >> >> static inline unsigned char iov_iter_rw(const struct iov_iter *i) >> >
On Sat, Jan 09, 2021 at 04:09:08PM +0000, Pavel Begunkov wrote: > On 06/12/2020 16:01, Pavel Begunkov wrote: > > On 21/11/2020 14:37, Pavel Begunkov wrote: > >> The problem here is that iov_iter_is_*() helpers check types for > >> equality, but all iterate_* helpers do bitwise ands. This confuses > >> compilers, so even if some cases were handled separately with > >> iov_iter_is_*(), corresponding ifs in iterate*() right after are not > >> eliminated. > >> > >> E.g. iov_iter_npages() first handles discards, but iterate_all_kinds() > >> still checks for discard iter type and generates unreachable code down > >> the line. > > > > Ping. This one should be pretty simple > > Ping please. Any doubts about this patch? Sorry, had been buried in other crap. I'm really not fond of the bitmap use; if anything, I would rather turn iterate_and_advance() et.al. into switches... How about moving the READ/WRITE part into MSB? Checking is just as fast (if not faster - check for sign vs. checking bit 0). And turn the types into straight (dense) enum. Almost all iov_iter_rw() callers have the form (iov_iter_rw(iter) == READ) or (iov_iter_rw(iter) == WRITE). Out of 50-odd callers there are 5 nominal exceptions: fs/cifs/smbdirect.c:1936: iov_iter_rw(&msg->msg_iter)); fs/exfat/inode.c:442: int rw = iov_iter_rw(iter); fs/f2fs/data.c:3639: int rw = iov_iter_rw(iter); fs/f2fs/f2fs.h:4082: int rw = iov_iter_rw(iter); fs/f2fs/f2fs.h:4092: int rw = iov_iter_rw(iter); The first one is debugging printk if (iov_iter_rw(&msg->msg_iter) == WRITE) { /* It's a bug in upper layer to get there */ cifs_dbg(VFS, "Invalid msg iter dir %u\n", iov_iter_rw(&msg->msg_iter)); rc = -EINVAL; goto out; } and if you look at the condition, the quality of message is underwhelming - "Data source msg iter passed by caller" would be more informative. Other 4... exfat one is if (rw == WRITE) { ... } ... if (ret < 0 && (rw & WRITE)) exfat_write_failed(mapping, size); IOW, doing bool is_write = iov_iter_rw(iter) == WRITE; would be cleaner. f2fs.h ones are int rw = iov_iter_rw(iter); .... if (.... && rw == WRITE ... so they are of the same sort (assuming we want that local variable in the first place). f2fs/data.c is the least trivial - it includes things like if (!down_read_trylock(&fi->i_gc_rwsem[rw])) { and considering the amount of other stuff done there, I would suggest something like int rw = is_data_source(iter) ? WRITE : READ; I'll dig myself from under ->d_revalidate() code review, look through the iov_iter-related series and post review, hopefully by tonight.
On 09/01/2021 17:03, Al Viro wrote: > On Sat, Jan 09, 2021 at 04:09:08PM +0000, Pavel Begunkov wrote: >> On 06/12/2020 16:01, Pavel Begunkov wrote: >>> On 21/11/2020 14:37, Pavel Begunkov wrote: >>>> The problem here is that iov_iter_is_*() helpers check types for >>>> equality, but all iterate_* helpers do bitwise ands. This confuses >>>> compilers, so even if some cases were handled separately with >>>> iov_iter_is_*(), corresponding ifs in iterate*() right after are not >>>> eliminated. >>>> >>>> E.g. iov_iter_npages() first handles discards, but iterate_all_kinds() >>>> still checks for discard iter type and generates unreachable code down >>>> the line. >>> >>> Ping. This one should be pretty simple >> >> Ping please. Any doubts about this patch? > > Sorry, had been buried in other crap. I'm really not fond of the > bitmap use; if anything, I would rather turn iterate_and_advance() et.al. > into switches... > > How about moving the READ/WRITE part into MSB? Checking is just as fast > (if not faster - check for sign vs. checking bit 0). And turn the > types into straight (dense) enum. Didn't realise that approach before, sounds good. Most of it will be replaced with sign jcc, and the rest will be (t >> 31) or movcc, so it should not be of concern. type_mask = 255; iov_iter_type(i) { return i->type & ~type_mask; } I hope this stuff won't add much, because the original patch completely optimises this "&" out. I guess it'll turn into extra xor m m notb8 m and m & type > > Almost all iov_iter_rw() callers have the form (iov_iter_rw(iter) == READ) or > (iov_iter_rw(iter) == WRITE). Out of 50-odd callers there are 5 nominal > exceptions: > fs/cifs/smbdirect.c:1936: iov_iter_rw(&msg->msg_iter)); > fs/exfat/inode.c:442: int rw = iov_iter_rw(iter); > fs/f2fs/data.c:3639: int rw = iov_iter_rw(iter); > fs/f2fs/f2fs.h:4082: int rw = iov_iter_rw(iter); > fs/f2fs/f2fs.h:4092: int rw = iov_iter_rw(iter); > > The first one is debugging printk > if (iov_iter_rw(&msg->msg_iter) == WRITE) { > /* It's a bug in upper layer to get there */ > cifs_dbg(VFS, "Invalid msg iter dir %u\n", > iov_iter_rw(&msg->msg_iter)); > rc = -EINVAL; > goto out; > } > and if you look at the condition, the quality of message is > underwhelming - "Data source msg iter passed by caller" would > be more informative. > > Other 4... exfat one is > if (rw == WRITE) { > ... > } > ... > if (ret < 0 && (rw & WRITE)) > exfat_write_failed(mapping, size); > IOW, doing > bool is_write = iov_iter_rw(iter) == WRITE; > would be cleaner. f2fs.h ones are > int rw = iov_iter_rw(iter); > .... > if (.... && rw == WRITE ... > so they are of the same sort (assuming we want that local > variable in the first place). > > f2fs/data.c is the least trivial - it includes things like > if (!down_read_trylock(&fi->i_gc_rwsem[rw])) { > and considering the amount of other stuff done there, > I would suggest something like > int rw = is_data_source(iter) ? WRITE : READ; > > I'll dig myself from under ->d_revalidate() code review, look > through the iov_iter-related series and post review, hopefully > by tonight. Great, thanks Al. Without it being optimised right my other patches keep worsening iov_iter, and I obviously want to avoid that.
From: Al Viro > Sent: 09 January 2021 17:04 > > On Sat, Jan 09, 2021 at 04:09:08PM +0000, Pavel Begunkov wrote: > > On 06/12/2020 16:01, Pavel Begunkov wrote: > > > On 21/11/2020 14:37, Pavel Begunkov wrote: > > >> The problem here is that iov_iter_is_*() helpers check types for > > >> equality, but all iterate_* helpers do bitwise ands. This confuses > > >> compilers, so even if some cases were handled separately with > > >> iov_iter_is_*(), corresponding ifs in iterate*() right after are not > > >> eliminated. > > >> > > >> E.g. iov_iter_npages() first handles discards, but iterate_all_kinds() > > >> still checks for discard iter type and generates unreachable code down > > >> the line. > > > > > > Ping. This one should be pretty simple > > > > Ping please. Any doubts about this patch? > > Sorry, had been buried in other crap. I'm really not fond of the > bitmap use; if anything, I would rather turn iterate_and_advance() et.al. > into switches... That loses any optimisations in the order of the comparisons. The bitmap also allows different groups to be optimised for in different code paths. > How about moving the READ/WRITE part into MSB? Checking is just as fast > (if not faster - check for sign vs. checking bit 0). And turn the > types into straight (dense) enum. Does any code actually look at the fields as a pair? Would it even be better to use separate bytes? Even growing the on-stack structure by a word won't really matter. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On 09/01/2021 21:49, David Laight wrote: > From: Al Viro >> Sent: 09 January 2021 17:04 >> >> On Sat, Jan 09, 2021 at 04:09:08PM +0000, Pavel Begunkov wrote: >>> On 06/12/2020 16:01, Pavel Begunkov wrote: >>>> On 21/11/2020 14:37, Pavel Begunkov wrote: >>>>> The problem here is that iov_iter_is_*() helpers check types for >>>>> equality, but all iterate_* helpers do bitwise ands. This confuses >>>>> compilers, so even if some cases were handled separately with >>>>> iov_iter_is_*(), corresponding ifs in iterate*() right after are not >>>>> eliminated. >>>>> >>>>> E.g. iov_iter_npages() first handles discards, but iterate_all_kinds() >>>>> still checks for discard iter type and generates unreachable code down >>>>> the line. >>>> >>>> Ping. This one should be pretty simple >>> >>> Ping please. Any doubts about this patch? >> >> Sorry, had been buried in other crap. I'm really not fond of the >> bitmap use; if anything, I would rather turn iterate_and_advance() et.al. >> into switches... > > That loses any optimisations in the order of the comparisons. > The bitmap also allows different groups to be optimised for in different code paths. You still can have a fast path and even retoss ITER_* for convenience. Other use cases are not important at the current state. > >> How about moving the READ/WRITE part into MSB? Checking is just as fast >> (if not faster - check for sign vs. checking bit 0). And turn the >> types into straight (dense) enum. > > Does any code actually look at the fields as a pair? > Would it even be better to use separate bytes? > Even growing the on-stack structure by a word won't really matter. u8 type, rw; That won't bloat the struct. I like the idea. If used together compilers can treat it as u16. btw there is a 4B hole just after for x64.
From: Pavel Begunkov > Sent: 09 January 2021 22:11 .... > > Does any code actually look at the fields as a pair? > > Would it even be better to use separate bytes? > > Even growing the on-stack structure by a word won't really matter. > > u8 type, rw; > > That won't bloat the struct. I like the idea. If used together compilers > can treat it as u16. > > btw there is a 4B hole just after for x64. I've just had a quick look at the sources. (Nothing was powered up earlier.) AFAICT nothing needs the RW flag to be in the same word as the type. If you think about it, the call site is specific to read/write. The only places iov_iter_rw() is used in inside helper functions to save the direction being passed from the caller. I hope the comment about bit 1 being BVEC_FLAG_NO_REF is old. I can't find any references to that flag. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On 11/01/2021 09:35, David Laight wrote: > From: Pavel Begunkov >> Sent: 09 January 2021 22:11 > .... >>> Does any code actually look at the fields as a pair? >>> Would it even be better to use separate bytes? >>> Even growing the on-stack structure by a word won't really matter. >> >> u8 type, rw; >> >> That won't bloat the struct. I like the idea. If used together compilers >> can treat it as u16. >> >> btw there is a 4B hole just after for x64. > > I've just had a quick look at the sources. > (Nothing was powered up earlier.) > > AFAICT nothing needs the RW flag to be in the same word > as the type. > If you think about it, the call site is specific to read/write. > The only places iov_iter_rw() is used in inside helper functions > to save the direction being passed from the caller. > > I hope the comment about bit 1 being BVEC_FLAG_NO_REF is old. > I can't find any references to that flag. Yep, long dead.
On Sat, Jan 09, 2021 at 10:11:09PM +0000, Pavel Begunkov wrote: > > Does any code actually look at the fields as a pair? > > Would it even be better to use separate bytes? > > Even growing the on-stack structure by a word won't really matter. > > u8 type, rw; > > That won't bloat the struct. I like the idea. If used together compilers > can treat it as u16. Reasonable, and from what I remember from looking through the users, no readers will bother with looking at both at the same time. On the write side... it's only set in iov_iter_{kvec,bvec,pipe,discard,init}. I sincerely doubt anyone would give a fuck, not to mention that something like void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe_inode_info *pipe, size_t count) { BUG_ON(direction != READ); WARN_ON(pipe_full(pipe->head, pipe->tail, pipe->ring_size)); *i = (struct iov_iter) { .iter_type = ITER_PIPE, .data_source = false, .pipe = pipe, .head = pipe->head, .start_head = pipe->head, .count = count, .iov_offset = 0 }; } would make more sense anyway - we do want to overwrite everything in the object, and let the compiler do whatever it likes to do. So... something like (completely untested) variant below, perhaps? diff --git a/include/linux/uio.h b/include/linux/uio.h index 72d88566694e..ed8ad2c5d384 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -19,20 +19,16 @@ struct kvec { enum iter_type { /* iter types */ - ITER_IOVEC = 4, - ITER_KVEC = 8, - ITER_BVEC = 16, - ITER_PIPE = 32, - ITER_DISCARD = 64, + ITER_IOVEC, + ITER_KVEC, + ITER_BVEC, + ITER_PIPE, + ITER_DISCARD }; struct iov_iter { - /* - * Bit 0 is the read/write bit, set if we're writing. - * Bit 1 is the BVEC_FLAG_NO_REF bit, set if type is a bvec and - * the caller isn't expecting to drop a page reference when done. - */ - unsigned int type; + u8 iter_type; + bool data_source; size_t iov_offset; size_t count; union { @@ -52,7 +48,7 @@ struct iov_iter { static inline enum iter_type iov_iter_type(const struct iov_iter *i) { - return i->type & ~(READ | WRITE); + return i->iter_type; } static inline bool iter_is_iovec(const struct iov_iter *i) @@ -82,7 +78,7 @@ static inline bool iov_iter_is_discard(const struct iov_iter *i) static inline unsigned char iov_iter_rw(const struct iov_iter *i) { - return i->type & (READ | WRITE); + return i->data_source ? WRITE : READ; } /* diff --git a/lib/iov_iter.c b/lib/iov_iter.c index 1635111c5bd2..133c03b2dcae 100644 --- a/lib/iov_iter.c +++ b/lib/iov_iter.c @@ -81,19 +81,18 @@ #define iterate_all_kinds(i, n, v, I, B, K) { \ if (likely(n)) { \ size_t skip = i->iov_offset; \ - if (unlikely(i->type & ITER_BVEC)) { \ + if (likely(i->iter_type == ITER_IOVEC)) { \ + const struct iovec *iov; \ + struct iovec v; \ + iterate_iovec(i, n, v, iov, skip, (I)) \ + } else if (i->iter_type == ITER_BVEC) { \ struct bio_vec v; \ struct bvec_iter __bi; \ iterate_bvec(i, n, v, __bi, skip, (B)) \ - } else if (unlikely(i->type & ITER_KVEC)) { \ + } else if (i->iter_type == ITER_KVEC) { \ const struct kvec *kvec; \ struct kvec v; \ iterate_kvec(i, n, v, kvec, skip, (K)) \ - } else if (unlikely(i->type & ITER_DISCARD)) { \ - } else { \ - const struct iovec *iov; \ - struct iovec v; \ - iterate_iovec(i, n, v, iov, skip, (I)) \ } \ } \ } @@ -103,7 +102,17 @@ n = i->count; \ if (i->count) { \ size_t skip = i->iov_offset; \ - if (unlikely(i->type & ITER_BVEC)) { \ + if (likely(i->iter_type == ITER_IOVEC)) { \ + const struct iovec *iov; \ + struct iovec v; \ + iterate_iovec(i, n, v, iov, skip, (I)) \ + if (skip == iov->iov_len) { \ + iov++; \ + skip = 0; \ + } \ + i->nr_segs -= iov - i->iov; \ + i->iov = iov; \ + } else if (i->iter_type == ITER_BVEC) { \ const struct bio_vec *bvec = i->bvec; \ struct bio_vec v; \ struct bvec_iter __bi; \ @@ -111,7 +120,7 @@ i->bvec = __bvec_iter_bvec(i->bvec, __bi); \ i->nr_segs -= i->bvec - bvec; \ skip = __bi.bi_bvec_done; \ - } else if (unlikely(i->type & ITER_KVEC)) { \ + } else if (i->iter_type == ITER_KVEC) { \ const struct kvec *kvec; \ struct kvec v; \ iterate_kvec(i, n, v, kvec, skip, (K)) \ @@ -121,18 +130,8 @@ } \ i->nr_segs -= kvec - i->kvec; \ i->kvec = kvec; \ - } else if (unlikely(i->type & ITER_DISCARD)) { \ + } else if (i->iter_type == ITER_DISCARD) { \ skip += n; \ - } else { \ - const struct iovec *iov; \ - struct iovec v; \ - iterate_iovec(i, n, v, iov, skip, (I)) \ - if (skip == iov->iov_len) { \ - iov++; \ - skip = 0; \ - } \ - i->nr_segs -= iov - i->iov; \ - i->iov = iov; \ } \ i->count -= n; \ i->iov_offset = skip; \ @@ -434,7 +433,7 @@ int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes) int err; struct iovec v; - if (!(i->type & (ITER_BVEC|ITER_KVEC))) { + if (i->iter_type == ITER_IOVEC) { iterate_iovec(i, bytes, v, iov, skip, ({ err = fault_in_pages_readable(v.iov_base, v.iov_len); if (unlikely(err)) @@ -450,19 +449,26 @@ void iov_iter_init(struct iov_iter *i, unsigned int direction, size_t count) { WARN_ON(direction & ~(READ | WRITE)); - direction &= READ | WRITE; /* It will get better. Eventually... */ - if (uaccess_kernel()) { - i->type = ITER_KVEC | direction; - i->kvec = (struct kvec *)iov; - } else { - i->type = ITER_IOVEC | direction; - i->iov = iov; - } - i->nr_segs = nr_segs; - i->iov_offset = 0; - i->count = count; + if (uaccess_kernel()) + *i = (struct iov_iter) { + .iter_type = ITER_KVEC, + .data_source = direction, + .kvec = (struct kvec *)iov, + .nr_segs = nr_segs, + .iov_offset = 0, + .count = count + }; + else + *i = (struct iov_iter) { + .iter_type = ITER_IOVEC, + .data_source = direction, + .iov = iov, + .nr_segs = nr_segs, + .iov_offset = 0, + .count = count + }; } EXPORT_SYMBOL(iov_iter_init); @@ -915,17 +921,20 @@ size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes, { if (unlikely(!page_copy_sane(page, offset, bytes))) return 0; - if (i->type & (ITER_BVEC|ITER_KVEC)) { + if (likely(i->iter_type == ITER_IOVEC)) + return copy_page_to_iter_iovec(page, offset, bytes, i); + if (i->iter_type == ITER_BVEC || i->iter_type == ITER_KVEC) { void *kaddr = kmap_atomic(page); size_t wanted = copy_to_iter(kaddr + offset, bytes, i); kunmap_atomic(kaddr); return wanted; - } else if (unlikely(iov_iter_is_discard(i))) - return bytes; - else if (likely(!iov_iter_is_pipe(i))) - return copy_page_to_iter_iovec(page, offset, bytes, i); - else + } + if (i->iter_type == ITER_PIPE) return copy_page_to_iter_pipe(page, offset, bytes, i); + if (i->iter_type == ITER_DISCARD) + return bytes; + WARN_ON(1); + return 0; } EXPORT_SYMBOL(copy_page_to_iter); @@ -934,17 +943,16 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes, { if (unlikely(!page_copy_sane(page, offset, bytes))) return 0; - if (unlikely(iov_iter_is_pipe(i) || iov_iter_is_discard(i))) { - WARN_ON(1); - return 0; - } - if (i->type & (ITER_BVEC|ITER_KVEC)) { + if (likely(i->iter_type == ITER_IOVEC)) + return copy_page_from_iter_iovec(page, offset, bytes, i); + if (i->iter_type == ITER_BVEC || i->iter_type == ITER_KVEC) { void *kaddr = kmap_atomic(page); size_t wanted = _copy_from_iter(kaddr + offset, bytes, i); kunmap_atomic(kaddr); return wanted; - } else - return copy_page_from_iter_iovec(page, offset, bytes, i); + } + WARN_ON(1); + return 0; } EXPORT_SYMBOL(copy_page_from_iter); @@ -1172,11 +1180,14 @@ void iov_iter_kvec(struct iov_iter *i, unsigned int direction, size_t count) { WARN_ON(direction & ~(READ | WRITE)); - i->type = ITER_KVEC | (direction & (READ | WRITE)); - i->kvec = kvec; - i->nr_segs = nr_segs; - i->iov_offset = 0; - i->count = count; + *i = (struct iov_iter) { + .iter_type = ITER_KVEC, + .data_source = direction, + .kvec = kvec, + .nr_segs = nr_segs, + .iov_offset = 0, + .count = count + }; } EXPORT_SYMBOL(iov_iter_kvec); @@ -1185,11 +1196,14 @@ void iov_iter_bvec(struct iov_iter *i, unsigned int direction, size_t count) { WARN_ON(direction & ~(READ | WRITE)); - i->type = ITER_BVEC | (direction & (READ | WRITE)); - i->bvec = bvec; - i->nr_segs = nr_segs; - i->iov_offset = 0; - i->count = count; + *i = (struct iov_iter) { + .iter_type = ITER_BVEC, + .data_source = direction, + .bvec = bvec, + .nr_segs = nr_segs, + .iov_offset = 0, + .count = count + }; } EXPORT_SYMBOL(iov_iter_bvec); @@ -1199,12 +1213,15 @@ void iov_iter_pipe(struct iov_iter *i, unsigned int direction, { BUG_ON(direction != READ); WARN_ON(pipe_full(pipe->head, pipe->tail, pipe->ring_size)); - i->type = ITER_PIPE | READ; - i->pipe = pipe; - i->head = pipe->head; - i->iov_offset = 0; - i->count = count; - i->start_head = i->head; + *i = (struct iov_iter) { + .iter_type = ITER_PIPE, + .data_source = false, + .pipe = pipe, + .head = pipe->head, + .start_head = pipe->head, + .count = count, + .iov_offset = 0 + }; } EXPORT_SYMBOL(iov_iter_pipe); @@ -1220,9 +1237,11 @@ EXPORT_SYMBOL(iov_iter_pipe); void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count) { BUG_ON(direction != READ); - i->type = ITER_DISCARD | READ; - i->count = count; - i->iov_offset = 0; + *i = (struct iov_iter) { + .iter_type = ITER_DISCARD, + .data_source = false, + .count = count, + }; } EXPORT_SYMBOL(iov_iter_discard);
From: Al Viro > Sent: 16 January 2021 05:18 > > On Sat, Jan 09, 2021 at 10:11:09PM +0000, Pavel Begunkov wrote: > > > > Does any code actually look at the fields as a pair? > > > Would it even be better to use separate bytes? > > > Even growing the on-stack structure by a word won't really matter. > > > > u8 type, rw; > > > > That won't bloat the struct. I like the idea. If used together compilers > > can treat it as u16. > > Reasonable, and from what I remember from looking through the users, > no readers will bother with looking at both at the same time. I couldn't find any. ... > So... something like (completely untested) variant below, perhaps? > > diff --git a/include/linux/uio.h b/include/linux/uio.h > index 72d88566694e..ed8ad2c5d384 100644 > --- a/include/linux/uio.h > +++ b/include/linux/uio.h > @@ -19,20 +19,16 @@ struct kvec { > > enum iter_type { > /* iter types */ > - ITER_IOVEC = 4, > - ITER_KVEC = 8, > - ITER_BVEC = 16, > - ITER_PIPE = 32, > - ITER_DISCARD = 64, > + ITER_IOVEC, > + ITER_KVEC, > + ITER_BVEC, > + ITER_PIPE, > + ITER_DISCARD > }; > > struct iov_iter { > - /* > - * Bit 0 is the read/write bit, set if we're writing. > - * Bit 1 is the BVEC_FLAG_NO_REF bit, set if type is a bvec and > - * the caller isn't expecting to drop a page reference when done. > - */ > - unsigned int type; > + u8 iter_type; > + bool data_source; I'd leave it as 'u8 direction' and assign READ (0) or WRITE (1) to it. It will always be confusing whether WRITE means a 'write' system call or a transfer that will write into the buffer (eg a read system call). I'm pretty sure I can detect the performance change from forcing the compiler to convert values to 'bool'. ... Since you've still got tests like: > + if (i->iter_type == ITER_BVEC || i->iter_type == ITER_KVEC) { It is probably still worth using bit values. After all, the only thing that benefits from small dense integers is a case statement lookup table - and we don't do those any more. Otherwise you might as well use 'i', 'k', 'b', 'p' and 'd' so that anyone hexdumping the structure (or reading the asm decode) knows the type without having to go and look it up. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On 16/01/2021 05:18, Al Viro wrote: > On Sat, Jan 09, 2021 at 10:11:09PM +0000, Pavel Begunkov wrote: > >>> Does any code actually look at the fields as a pair? >>> Would it even be better to use separate bytes? >>> Even growing the on-stack structure by a word won't really matter. >> >> u8 type, rw; >> >> That won't bloat the struct. I like the idea. If used together compilers >> can treat it as u16. > > Reasonable, and from what I remember from looking through the users, > no readers will bother with looking at both at the same time. Al, are you going turn it into a patch, or prefer me to take over? > On the write side... it's only set in iov_iter_{kvec,bvec,pipe,discard,init}. > I sincerely doubt anyone would give a fuck, not to mention that something > like > void iov_iter_pipe(struct iov_iter *i, unsigned int direction, > struct pipe_inode_info *pipe, > size_t count) > { > BUG_ON(direction != READ); > WARN_ON(pipe_full(pipe->head, pipe->tail, pipe->ring_size)); > *i = (struct iov_iter) { > .iter_type = ITER_PIPE, > .data_source = false, > .pipe = pipe, > .head = pipe->head, > .start_head = pipe->head, > .count = count, > .iov_offset = 0 > }; > } > > would make more sense anyway - we do want to overwrite everything in the > object, and let the compiler do whatever it likes to do. > > So... something like (completely untested) variant below, perhaps? > > diff --git a/include/linux/uio.h b/include/linux/uio.h > index 72d88566694e..ed8ad2c5d384 100644 > --- a/include/linux/uio.h > +++ b/include/linux/uio.h > @@ -19,20 +19,16 @@ struct kvec { > > enum iter_type { > /* iter types */ > - ITER_IOVEC = 4, > - ITER_KVEC = 8, > - ITER_BVEC = 16, > - ITER_PIPE = 32, > - ITER_DISCARD = 64, > + ITER_IOVEC, > + ITER_KVEC, > + ITER_BVEC, > + ITER_PIPE, > + ITER_DISCARD > }; > > struct iov_iter { > - /* > - * Bit 0 is the read/write bit, set if we're writing. > - * Bit 1 is the BVEC_FLAG_NO_REF bit, set if type is a bvec and > - * the caller isn't expecting to drop a page reference when done. > - */ > - unsigned int type; > + u8 iter_type; > + bool data_source; > size_t iov_offset; > size_t count; > union { > @@ -52,7 +48,7 @@ struct iov_iter { > > static inline enum iter_type iov_iter_type(const struct iov_iter *i) > { > - return i->type & ~(READ | WRITE); > + return i->iter_type; > } > > static inline bool iter_is_iovec(const struct iov_iter *i) > @@ -82,7 +78,7 @@ static inline bool iov_iter_is_discard(const struct iov_iter *i) > > static inline unsigned char iov_iter_rw(const struct iov_iter *i) > { > - return i->type & (READ | WRITE); > + return i->data_source ? WRITE : READ; > } > > /* > diff --git a/lib/iov_iter.c b/lib/iov_iter.c > index 1635111c5bd2..133c03b2dcae 100644 > --- a/lib/iov_iter.c > +++ b/lib/iov_iter.c > @@ -81,19 +81,18 @@ > #define iterate_all_kinds(i, n, v, I, B, K) { \ > if (likely(n)) { \ > size_t skip = i->iov_offset; \ > - if (unlikely(i->type & ITER_BVEC)) { \ > + if (likely(i->iter_type == ITER_IOVEC)) { \ > + const struct iovec *iov; \ > + struct iovec v; \ > + iterate_iovec(i, n, v, iov, skip, (I)) \ > + } else if (i->iter_type == ITER_BVEC) { \ > struct bio_vec v; \ > struct bvec_iter __bi; \ > iterate_bvec(i, n, v, __bi, skip, (B)) \ > - } else if (unlikely(i->type & ITER_KVEC)) { \ > + } else if (i->iter_type == ITER_KVEC) { \ > const struct kvec *kvec; \ > struct kvec v; \ > iterate_kvec(i, n, v, kvec, skip, (K)) \ > - } else if (unlikely(i->type & ITER_DISCARD)) { \ > - } else { \ > - const struct iovec *iov; \ > - struct iovec v; \ > - iterate_iovec(i, n, v, iov, skip, (I)) \ > } \ > } \ > } > @@ -103,7 +102,17 @@ > n = i->count; \ > if (i->count) { \ > size_t skip = i->iov_offset; \ > - if (unlikely(i->type & ITER_BVEC)) { \ > + if (likely(i->iter_type == ITER_IOVEC)) { \ > + const struct iovec *iov; \ > + struct iovec v; \ > + iterate_iovec(i, n, v, iov, skip, (I)) \ > + if (skip == iov->iov_len) { \ > + iov++; \ > + skip = 0; \ > + } \ > + i->nr_segs -= iov - i->iov; \ > + i->iov = iov; \ > + } else if (i->iter_type == ITER_BVEC) { \ > const struct bio_vec *bvec = i->bvec; \ > struct bio_vec v; \ > struct bvec_iter __bi; \ > @@ -111,7 +120,7 @@ > i->bvec = __bvec_iter_bvec(i->bvec, __bi); \ > i->nr_segs -= i->bvec - bvec; \ > skip = __bi.bi_bvec_done; \ > - } else if (unlikely(i->type & ITER_KVEC)) { \ > + } else if (i->iter_type == ITER_KVEC) { \ > const struct kvec *kvec; \ > struct kvec v; \ > iterate_kvec(i, n, v, kvec, skip, (K)) \ > @@ -121,18 +130,8 @@ > } \ > i->nr_segs -= kvec - i->kvec; \ > i->kvec = kvec; \ > - } else if (unlikely(i->type & ITER_DISCARD)) { \ > + } else if (i->iter_type == ITER_DISCARD) { \ > skip += n; \ > - } else { \ > - const struct iovec *iov; \ > - struct iovec v; \ > - iterate_iovec(i, n, v, iov, skip, (I)) \ > - if (skip == iov->iov_len) { \ > - iov++; \ > - skip = 0; \ > - } \ > - i->nr_segs -= iov - i->iov; \ > - i->iov = iov; \ > } \ > i->count -= n; \ > i->iov_offset = skip; \ > @@ -434,7 +433,7 @@ int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes) > int err; > struct iovec v; > > - if (!(i->type & (ITER_BVEC|ITER_KVEC))) { > + if (i->iter_type == ITER_IOVEC) { > iterate_iovec(i, bytes, v, iov, skip, ({ > err = fault_in_pages_readable(v.iov_base, v.iov_len); > if (unlikely(err)) > @@ -450,19 +449,26 @@ void iov_iter_init(struct iov_iter *i, unsigned int direction, > size_t count) > { > WARN_ON(direction & ~(READ | WRITE)); > - direction &= READ | WRITE; > > /* It will get better. Eventually... */ > - if (uaccess_kernel()) { > - i->type = ITER_KVEC | direction; > - i->kvec = (struct kvec *)iov; > - } else { > - i->type = ITER_IOVEC | direction; > - i->iov = iov; > - } > - i->nr_segs = nr_segs; > - i->iov_offset = 0; > - i->count = count; > + if (uaccess_kernel()) > + *i = (struct iov_iter) { > + .iter_type = ITER_KVEC, > + .data_source = direction, > + .kvec = (struct kvec *)iov, > + .nr_segs = nr_segs, > + .iov_offset = 0, > + .count = count > + }; > + else > + *i = (struct iov_iter) { > + .iter_type = ITER_IOVEC, > + .data_source = direction, > + .iov = iov, > + .nr_segs = nr_segs, > + .iov_offset = 0, > + .count = count > + }; > } > EXPORT_SYMBOL(iov_iter_init); > > @@ -915,17 +921,20 @@ size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes, > { > if (unlikely(!page_copy_sane(page, offset, bytes))) > return 0; > - if (i->type & (ITER_BVEC|ITER_KVEC)) { > + if (likely(i->iter_type == ITER_IOVEC)) > + return copy_page_to_iter_iovec(page, offset, bytes, i); > + if (i->iter_type == ITER_BVEC || i->iter_type == ITER_KVEC) { > void *kaddr = kmap_atomic(page); > size_t wanted = copy_to_iter(kaddr + offset, bytes, i); > kunmap_atomic(kaddr); > return wanted; > - } else if (unlikely(iov_iter_is_discard(i))) > - return bytes; > - else if (likely(!iov_iter_is_pipe(i))) > - return copy_page_to_iter_iovec(page, offset, bytes, i); > - else > + } > + if (i->iter_type == ITER_PIPE) > return copy_page_to_iter_pipe(page, offset, bytes, i); > + if (i->iter_type == ITER_DISCARD) > + return bytes; > + WARN_ON(1); > + return 0; > } > EXPORT_SYMBOL(copy_page_to_iter); > > @@ -934,17 +943,16 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes, > { > if (unlikely(!page_copy_sane(page, offset, bytes))) > return 0; > - if (unlikely(iov_iter_is_pipe(i) || iov_iter_is_discard(i))) { > - WARN_ON(1); > - return 0; > - } > - if (i->type & (ITER_BVEC|ITER_KVEC)) { > + if (likely(i->iter_type == ITER_IOVEC)) > + return copy_page_from_iter_iovec(page, offset, bytes, i); > + if (i->iter_type == ITER_BVEC || i->iter_type == ITER_KVEC) { > void *kaddr = kmap_atomic(page); > size_t wanted = _copy_from_iter(kaddr + offset, bytes, i); > kunmap_atomic(kaddr); > return wanted; > - } else > - return copy_page_from_iter_iovec(page, offset, bytes, i); > + } > + WARN_ON(1); > + return 0; > } > EXPORT_SYMBOL(copy_page_from_iter); > > @@ -1172,11 +1180,14 @@ void iov_iter_kvec(struct iov_iter *i, unsigned int direction, > size_t count) > { > WARN_ON(direction & ~(READ | WRITE)); > - i->type = ITER_KVEC | (direction & (READ | WRITE)); > - i->kvec = kvec; > - i->nr_segs = nr_segs; > - i->iov_offset = 0; > - i->count = count; > + *i = (struct iov_iter) { > + .iter_type = ITER_KVEC, > + .data_source = direction, > + .kvec = kvec, > + .nr_segs = nr_segs, > + .iov_offset = 0, > + .count = count > + }; > } > EXPORT_SYMBOL(iov_iter_kvec); > > @@ -1185,11 +1196,14 @@ void iov_iter_bvec(struct iov_iter *i, unsigned int direction, > size_t count) > { > WARN_ON(direction & ~(READ | WRITE)); > - i->type = ITER_BVEC | (direction & (READ | WRITE)); > - i->bvec = bvec; > - i->nr_segs = nr_segs; > - i->iov_offset = 0; > - i->count = count; > + *i = (struct iov_iter) { > + .iter_type = ITER_BVEC, > + .data_source = direction, > + .bvec = bvec, > + .nr_segs = nr_segs, > + .iov_offset = 0, > + .count = count > + }; > } > EXPORT_SYMBOL(iov_iter_bvec); > > @@ -1199,12 +1213,15 @@ void iov_iter_pipe(struct iov_iter *i, unsigned int direction, > { > BUG_ON(direction != READ); > WARN_ON(pipe_full(pipe->head, pipe->tail, pipe->ring_size)); > - i->type = ITER_PIPE | READ; > - i->pipe = pipe; > - i->head = pipe->head; > - i->iov_offset = 0; > - i->count = count; > - i->start_head = i->head; > + *i = (struct iov_iter) { > + .iter_type = ITER_PIPE, > + .data_source = false, > + .pipe = pipe, > + .head = pipe->head, > + .start_head = pipe->head, > + .count = count, > + .iov_offset = 0 > + }; > } > EXPORT_SYMBOL(iov_iter_pipe); > > @@ -1220,9 +1237,11 @@ EXPORT_SYMBOL(iov_iter_pipe); > void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count) > { > BUG_ON(direction != READ); > - i->type = ITER_DISCARD | READ; > - i->count = count; > - i->iov_offset = 0; > + *i = (struct iov_iter) { > + .iter_type = ITER_DISCARD, > + .data_source = false, > + .count = count, > + }; > } > EXPORT_SYMBOL(iov_iter_discard); > >
From: Pavel Begunkov > Sent: 27 January 2021 15:48 > > On 16/01/2021 05:18, Al Viro wrote: > > On Sat, Jan 09, 2021 at 10:11:09PM +0000, Pavel Begunkov wrote: > > > >>> Does any code actually look at the fields as a pair? > >>> Would it even be better to use separate bytes? > >>> Even growing the on-stack structure by a word won't really matter. > >> > >> u8 type, rw; > >> > >> That won't bloat the struct. I like the idea. If used together compilers > >> can treat it as u16. > > > > Reasonable, and from what I remember from looking through the users, > > no readers will bother with looking at both at the same time. > > Al, are you going turn it into a patch, or prefer me to take over? I'd definitely leave the type as a bitmap. It may be useful to add ITER_IOVEC_SINGLE to optimise some very common paths for user iovec with only a single buffer. But you'd probably want to keep the full version for more unusual (or already expensive) cases. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Wed, Jan 27, 2021 at 04:28:38PM +0000, David Laight wrote:
> I'd definitely leave the type as a bitmap.
What the hell for? Microoptimizations in places where we have
much heavier stuff to be done are bloody pointless.
It's already overcomplicated. And compiler is _not_ going to
be able to prove that we'll only ever have one bit set, so
you would be making it harder to optimize, not to mention
reason about.
On Wed, Jan 27, 2021 at 03:48:10PM +0000, Pavel Begunkov wrote: > On 16/01/2021 05:18, Al Viro wrote: > > On Sat, Jan 09, 2021 at 10:11:09PM +0000, Pavel Begunkov wrote: > > > >>> Does any code actually look at the fields as a pair? > >>> Would it even be better to use separate bytes? > >>> Even growing the on-stack structure by a word won't really matter. > >> > >> u8 type, rw; > >> > >> That won't bloat the struct. I like the idea. If used together compilers > >> can treat it as u16. > > > > Reasonable, and from what I remember from looking through the users, > > no readers will bother with looking at both at the same time. > > Al, are you going turn it into a patch, or prefer me to take over? I'll massage that a bit and put into #work.iov_iter - just need to dig my way from under the pile of ->d_revalidate() review...
On 27/01/2021 18:31, Al Viro wrote: > On Wed, Jan 27, 2021 at 03:48:10PM +0000, Pavel Begunkov wrote: >> On 16/01/2021 05:18, Al Viro wrote: >>> On Sat, Jan 09, 2021 at 10:11:09PM +0000, Pavel Begunkov wrote: >>> >>>>> Does any code actually look at the fields as a pair? >>>>> Would it even be better to use separate bytes? >>>>> Even growing the on-stack structure by a word won't really matter. >>>> >>>> u8 type, rw; >>>> >>>> That won't bloat the struct. I like the idea. If used together compilers >>>> can treat it as u16. >>> >>> Reasonable, and from what I remember from looking through the users, >>> no readers will bother with looking at both at the same time. >> >> Al, are you going turn it into a patch, or prefer me to take over? > > I'll massage that a bit and put into #work.iov_iter - just need to dig my > way from under the pile of ->d_revalidate() review... Perfect, thanks
diff --git a/include/linux/uio.h b/include/linux/uio.h index 72d88566694e..c5970b2d3307 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -57,27 +57,27 @@ static inline enum iter_type iov_iter_type(const struct iov_iter *i) static inline bool iter_is_iovec(const struct iov_iter *i) { - return iov_iter_type(i) == ITER_IOVEC; + return iov_iter_type(i) & ITER_IOVEC; } static inline bool iov_iter_is_kvec(const struct iov_iter *i) { - return iov_iter_type(i) == ITER_KVEC; + return iov_iter_type(i) & ITER_KVEC; } static inline bool iov_iter_is_bvec(const struct iov_iter *i) { - return iov_iter_type(i) == ITER_BVEC; + return iov_iter_type(i) & ITER_BVEC; } static inline bool iov_iter_is_pipe(const struct iov_iter *i) { - return iov_iter_type(i) == ITER_PIPE; + return iov_iter_type(i) & ITER_PIPE; } static inline bool iov_iter_is_discard(const struct iov_iter *i) { - return iov_iter_type(i) == ITER_DISCARD; + return iov_iter_type(i) & ITER_DISCARD; } static inline unsigned char iov_iter_rw(const struct iov_iter *i)