Message ID | 20210419201251.GE2531743@casper.infradead.org (mailing list archive) |
---|---|
State | New, archived |
Series | smp_rmb_cond |
Matthew Wilcox <willy@infradead.org> wrote:

> i see worse inlining decisions from gcc with this.  maybe you see
> an improvement that would justify it?
>
> [ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99998]

Perhaps attach the patch to the bz, see if the compiler guys can recommend
anything?

David
On Mon, Apr 19, 2021 at 09:20:40PM +0100, David Howells wrote:
> Matthew Wilcox <willy@infradead.org> wrote:
>
> > i see worse inlining decisions from gcc with this.  maybe you see
> > an improvement that would justify it?
> >
> > [ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99998]
>
> Perhaps attach the patch to the bz, see if the compiler guys can recommend
> anything?

your test case loses the bogus branch

0000000000000000 <PageUptodate>:
   0:   48 8b 47 08             mov    0x8(%rdi),%rax
   4:   a8 01                   test   $0x1,%al
   6:   74 04                   je     c <PageUptodate+0xc>
   8:   48 8d 78 ff             lea    -0x1(%rax),%rdi
   c:   8b 07                   mov    (%rdi),%eax
   e:   48 c1 e8 02             shr    $0x2,%rax
  12:   24 01                   and    $0x1,%al
  14:   74 00                   je     16 <PageUptodate+0x16>
  16:   c3                      retq

0000000000000017 <Page2Uptodate>:
  17:   48 8b 47 08             mov    0x8(%rdi),%rax
  1b:   a8 01                   test   $0x1,%al
  1d:   74 04                   je     23 <Page2Uptodate+0xc>
  1f:   48 8d 78 ff             lea    -0x1(%rax),%rdi
  23:   8b 07                   mov    (%rdi),%eax
  25:   48 c1 e8 02             shr    $0x2,%rax
  29:   83 e0 01                and    $0x1,%eax
  2c:   c3                      retq

but that means that gcc then does more inlining to functions that call
PageUptodate:

$ ./scripts/bloat-o-meter filemap-before.o filemap-after.o
add/remove: 0/0 grow/shrink: 3/4 up/down: 179/-91 (88)
Function                                     old     new   delta
mapping_seek_hole_data                      1203    1347    +144
__lock_page_killable                         394     426     +32
next_uptodate_page                           603     606      +3
wait_on_page_bit_common                      582     576      -6
filemap_get_pages                           1530    1512     -18
do_read_cache_page                          1031    1012     -19
filemap_read_page                            261     213     -48
Total: Before=24603, After=24691, chg +0.36%

but maybe you have a metric that shows this winning at scale instead of
in a micro?
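For anyone following along without the bugzilla attachment, the two function shapes in the disassembly above can be approximated in userspace. This is only a sketch, not the actual test case from the bz: struct toy_page, TOY_PG_UPTODATE, and every toy_* name are made up for illustration, and barrier() is written in GNU C inline asm syntax on the assumption of gcc or clang. The first function keeps the conditional compiler barrier (the shape that left "je 16" behind); the second issues it unconditionally.

```c
#include <assert.h>

/* Compiler-only barrier in GNU C syntax (assumes gcc or clang). */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical two-word page: flags first, compound_head second. */
struct toy_page {
	unsigned long flags;
	unsigned long compound_head;	/* bit 0 set: tail page, value is head + 1 */
};

#define TOY_PG_UPTODATE 2	/* illustrative bit number, matching the shr $0x2 */

static struct toy_page *toy_compound_head(struct toy_page *page)
{
	unsigned long head = page->compound_head;

	if (head & 1)
		return (struct toy_page *)(head - 1);
	return page;
}

/* Conditional barrier: the branch has nothing left in it but the asm. */
int toy_page_uptodate(struct toy_page *page)
{
	int ret;

	page = toy_compound_head(page);
	ret = (int)((page->flags >> TOY_PG_UPTODATE) & 1UL);
	if (ret)
		barrier();
	return ret;
}

/* Unconditional barrier: no branch for the compiler to keep. */
int toy_page2_uptodate(struct toy_page *page)
{
	int ret;

	page = toy_compound_head(page);
	ret = (int)((page->flags >> TOY_PG_UPTODATE) & 1UL);
	barrier();
	return ret;
}

/* Both variants must agree on the result for head and tail pages. */
void toy_selfcheck(void)
{
	struct toy_page head = { .flags = 1UL << TOY_PG_UPTODATE, .compound_head = 0 };
	struct toy_page tail = { .flags = 0, .compound_head = (unsigned long)&head + 1 };

	assert(toy_page_uptodate(&head) == 1);
	assert(toy_page2_uptodate(&tail) == 1);
}
```

Compiling this with optimization and running objdump -d on the result should allow a comparison of the two bodies, though the exact code generated will depend on the compiler version and flags.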
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index 4819d5e5a335..4cbc5bd5bcdd 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -60,6 +60,7 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
 #define __smp_mb()	asm volatile("lock; addl $0,-4(%%rsp)" ::: "memory", "cc")
 #endif
 #define __smp_rmb()	dma_rmb()
+#define smp_rmb_cond(x) barrier()
 #define __smp_wmb()	barrier()
 #define __smp_store_mb(var, value) do { (void)xchg(&var, value); } while (0)
 
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 640f09479bdf..cc0c864f90dc 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -89,6 +89,10 @@
 
 #endif	/* CONFIG_SMP */
 
+#ifndef smp_rmb_cond
+#define smp_rmb_cond(x) do { if (x) smp_rmb(); } while (0)
+#endif
+
 #ifndef __smp_store_mb
 #define __smp_store_mb(var, value)  do { WRITE_ONCE(var, value); __smp_mb(); } while (0)
 #endif
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 04a34c08e0a6..c45d491e9245 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -522,8 +522,7 @@ static inline int PageUptodate(struct page *page)
 	 *
 	 * See SetPageUptodate() for the other side of the story.
 	 */
-	if (ret)
-		smp_rmb();
+	smp_rmb_cond(ret);
 
 	return ret;
 }
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 7a1414622051..260ef2474ff2 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -89,8 +89,7 @@ static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s,
 	 * Make sure that all old data have been read before the buffer
 	 * was reset. This is not needed when we just append data.
 	 */
-	if (!len)
-		smp_rmb();
+	smp_rmb_cond(!len);
 
 	va_copy(ap, args);
 	add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len, fmt, ap);
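The override pattern in the patch (an arch header defines smp_rmb_cond() first, asm-generic supplies the conditional fallback under #ifndef) can be tried outside the kernel with a rough userspace sketch. Everything here is an assumption made for illustration: barrier() and smp_rmb() are C11 fence stand-ins rather than the kernel definitions, ARCH_RMB_IS_COMPILER_BARRIER is a made-up switch that mimics the x86 hunk, and the fake_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace stand-ins, NOT the kernel definitions. */
#define barrier()	atomic_signal_fence(memory_order_seq_cst)	/* compiler-only */
#define smp_rmb()	atomic_thread_fence(memory_order_acquire)	/* read-barrier stand-in */

/*
 * "Arch header": build with -DARCH_RMB_IS_COMPILER_BARRIER to mimic the
 * x86 hunk, where the read barrier is only a compiler barrier and the
 * condition is therefore dropped.
 */
#ifdef ARCH_RMB_IS_COMPILER_BARRIER
#define smp_rmb_cond(x)	barrier()
#endif

/* "Generic header": fallback used when the arch did not define it. */
#ifndef smp_rmb_cond
#define smp_rmb_cond(x)	do { if (x) smp_rmb(); } while (0)
#endif

struct fake_page {
	unsigned long flags;
};

#define FAKE_PG_UPTODATE 2	/* illustrative bit number */

/* Same shape as the PageUptodate() hunk: test the bit, then barrier. */
int fake_page_uptodate(const struct fake_page *page)
{
	int ret = (int)((page->flags >> FAKE_PG_UPTODATE) & 1UL);

	smp_rmb_cond(ret);
	return ret;
}

/* The return value is the same whichever definition is selected. */
void fake_selfcheck(void)
{
	struct fake_page p = { .flags = 1UL << FAKE_PG_UPTODATE };

	assert(fake_page_uptodate(&p) == 1);
	p.flags = 0;
	assert(fake_page_uptodate(&p) == 0);
}
```

One thing the sketch makes visible: the barrier()-only definition never evaluates its argument, while the fallback evaluates it once, so callers would presumably need to keep side effects out of the expression passed in. Both call sites in the patch (ret and !len) are side-effect free.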