Message ID | a3fef3b0-c9f3-208e-3728-62ca9cff70ba@suse.com (mailing list archive) |
---|---|
Headers | show |
Series | assorted replacement of x[mz]alloc_bytes() | expand |
On 08/04/2021 13:13, Jan Beulich wrote: > In the long run I think we want to do away with these type-unsafe > interfaces, the more that they also request (typically) excess > alignment. This series of entirely independent patches is > eliminating the instances where it's relatively clear that they're > not just "blob" allocations. > > > 03: x86/MCE: avoid effectively open-coding xmalloc_array() > 04: x86/HVM: avoid effectively open-coding xmalloc_array() > 05: x86/oprofile: avoid effectively open-coding xmalloc_array() > 06: x86/IRQ: avoid over-alignment in alloc_pirq_struct() > 07: EFI/runtime: avoid effectively open-coding xmalloc_array() > 08: hypfs: avoid effectively open-coding xzalloc_array() > 10: video/lfb: avoid effectively open-coding xzalloc_array() The flex conversions are fine, but I am unconvinced by argument for interchanging array() and bytes(). The cacheline size is 64 bytes, and the minimum allocation size is 16, plus a bhdr overhead of 32 bytes, so you're already at most of a cacheline for a nominally-zero sized allocation. There can however be a severe penalty from cacheline sharing, which is why the bytes() form does have a minimum alignment. There is one xmalloc heap shared across the entire system, so you've got no idea what might be sharing the same cache line for sub-line allocations. We should not support sub-line allocations IMO. The savings is a handful of bytes at best, and some horrible performance cliffs to avoid. People running virtualisation are not going to be ram constrained to the order of a few bytes. For small allocations which don't require specific alignment, then putting bhdr and the allocation in the same line is fine (if we don't do this already), but we shouldn't be in the position of having two bhdr's in the same cache line, even if there are plenty of single-byte allocations in the theoretical worst case. ~Andrew
On 08.04.2021 14:57, Andrew Cooper wrote: > On 08/04/2021 13:13, Jan Beulich wrote: >> In the long run I think we want to do away with these type-unsafe >> interfaces, the more that they also request (typically) excess >> alignment. This series of entirely independent patches is >> eliminating the instances where it's relatively clear that they're >> not just "blob" allocations. >> >> >> 03: x86/MCE: avoid effectively open-coding xmalloc_array() >> 04: x86/HVM: avoid effectively open-coding xmalloc_array() >> 05: x86/oprofile: avoid effectively open-coding xmalloc_array() >> 06: x86/IRQ: avoid over-alignment in alloc_pirq_struct() >> 07: EFI/runtime: avoid effectively open-coding xmalloc_array() >> 08: hypfs: avoid effectively open-coding xzalloc_array() >> 10: video/lfb: avoid effectively open-coding xzalloc_array() > > The flex conversions are fine, but I am unconvinced by argument for > interchanging array() and bytes(). > > The cacheline size is 64 bytes, and the minimum allocation size is 16, > plus a bhdr overhead of 32 bytes, so you're already at most of a > cacheline for a nominally-zero sized allocation. But you're aware that the alignment x[mz]alloc_bytes() forces is 128 bytes? Plus, while sizeof(struct bhdr) is indeed 32, the overhead on allocated blocks is #define BHDR_OVERHEAD (sizeof(struct bhdr) - MIN_BLOCK_SIZE) i.e. 16 (i.e. the other half of the 32 is already the minimum block size of 16 that you also mention). IOW a cacheline sized block would yield 48 bytes of usable space. Specifically a meaningful change in the PV case from what patch 06 does, where we only need 40 bytes. > There can however be a severe penalty from cacheline sharing, which is > why the bytes() form does have a minimum alignment. There is one > xmalloc heap shared across the entire system, so you've got no idea what > might be sharing the same cache line for sub-line allocations. This would call for distinguishing short-lived allocations (and ones to be used mainly from a single CPU) from long-lived ones having system wide use. I.e. a request to gain further allocation function flavors, when already the introduction of the one new xv[mz]alloc() has caused long-winded discussions with (so far) no real outcome. > We should not support sub-line allocations IMO. The savings is a > handful of bytes at best, and some horrible performance cliffs to > avoid. People running virtualisation are not going to be ram > constrained to the order of a few bytes. > > For small allocations which don't require specific alignment, then > putting bhdr and the allocation in the same line is fine (if we don't do > this already), but we shouldn't be in the position of having two bhdr's > in the same cache line, even if there are plenty of single-byte > allocations in the theoretical worst case. That's a request to tweak allocator internals then, not an argument against the conversions done here. Jan