[00/11] assorted replacement of x[mz]alloc_bytes()

Message ID	a3fef3b0-c9f3-208e-3728-62ca9cff70ba@suse.com (mailing list archive)

Headers

DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D83E86115B
Errors-To: xen-devel-bounces@lists.xenproject.org
Precedence: list
Sender: "Xen-devel" <xen-devel-bounces@lists.xenproject.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
 George Dunlap <george.dunlap@citrix.com>, Ian Jackson <iwj@xenproject.org>,
 Julien Grall <julien@xen.org>, Stefano Stabellini <sstabellini@kernel.org>,
 Wei Liu <wl@xen.org>, Roger Pau Monné <roger.pau@citrix.com>
From: Jan Beulich <jbeulich@suse.com>
Subject: [PATCH 00/11] assorted replacement of x[mz]alloc_bytes()
To: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Message-ID: <a3fef3b0-c9f3-208e-3728-62ca9cff70ba@suse.com>
Date: Thu, 8 Apr 2021 14:13:34 +0200
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
 Thunderbird/78.9.0
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit

Message

Jan Beulich April 8, 2021, 12:13 p.m. UTC

In the long run I think we want to do away with these type-unsafe
interfaces, all the more so because they also (typically) request
excess alignment. This series of entirely independent patches
eliminates the instances where it's relatively clear that they're
not just "blob" allocations.

01: x86/HVM: avoid effectively open-coding xzalloc_flex_struct()
02: x86/vPMU: avoid effectively open-coding xzalloc_flex_struct()
03: x86/MCE: avoid effectively open-coding xmalloc_array()
04: x86/HVM: avoid effectively open-coding xmalloc_array()
05: x86/oprofile: avoid effectively open-coding xmalloc_array()
06: x86/IRQ: avoid over-alignment in alloc_pirq_struct()
07: EFI/runtime: avoid effectively open-coding xmalloc_array()
08: hypfs: avoid effectively open-coding xzalloc_array()
09: kexec: avoid effectively open-coding xzalloc_flex_struct()
10: video/lfb: avoid effectively open-coding xzalloc_array()
11: Arm/optee: don't open-code xzalloc_flex_struct()

Jan
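
To illustrate the distinction the cover letter draws, here is a minimal
userspace sketch of an untyped "bytes" allocation next to typed array and
flexible-struct allocations. Every demo_* name is hypothetical and merely
stands in for the x[mz]alloc_bytes(), x[mz]alloc_array() and
xzalloc_flex_struct() interfaces named above; Xen's real macros are not
reproduced here and may differ in detail (alignment handling in particular).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only; NOT Xen's implementation. */

/* Untyped: the caller computes the size by hand and gets a void * back. */
static void *demo_zalloc_bytes(size_t bytes)
{
    return calloc(1, bytes);
}

/* Typed array helper: the multiplication is checked for overflow. */
static void *demo_zalloc_array_(size_t elem_size, size_t num)
{
    if (elem_size && num > SIZE_MAX / elem_size)
        return NULL;
    return calloc(num, elem_size);
}

/* The element type is named once and the result is cast to a matching pointer. */
#define demo_zalloc_array(type, num) \
    ((type *)demo_zalloc_array_(sizeof(type), (num)))

/*
 * Size of a struct with a trailing flexible array of nr elements.
 * No overflow check here, to keep the sketch short.
 */
#define demo_flex_size(type, field, nr) \
    (offsetof(type, field) + sizeof(((type *)0)->field[0]) * (nr))

#define demo_zalloc_flex_struct(type, field, nr) \
    ((type *)demo_zalloc_bytes(demo_flex_size(type, field, nr)))

/* Hypothetical example type with a flexible array member. */
struct demo_msg {
    unsigned int nr;
    uint64_t val[];
};
```

An open-coded call would look like demo_zalloc_bytes(n * sizeof(uint64_t)),
where nothing ties the size to the pointer type it is assigned to; the typed
forms, e.g. demo_zalloc_array(uint64_t, n), carry the type through.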

Comments

Andrew Cooper April 8, 2021, 12:57 p.m. UTC | #1

On 08/04/2021 13:13, Jan Beulich wrote:
> In the long run I think we want to do away with these type-unsafe
> interfaces, all the more so because they also (typically) request
> excess alignment. This series of entirely independent patches
> eliminates the instances where it's relatively clear that they're
> not just "blob" allocations.
>
>
> 03: x86/MCE: avoid effectively open-coding xmalloc_array()
> 04: x86/HVM: avoid effectively open-coding xmalloc_array()
> 05: x86/oprofile: avoid effectively open-coding xmalloc_array()
> 06: x86/IRQ: avoid over-alignment in alloc_pirq_struct()
> 07: EFI/runtime: avoid effectively open-coding xmalloc_array()
> 08: hypfs: avoid effectively open-coding xzalloc_array()
> 10: video/lfb: avoid effectively open-coding xzalloc_array()

The flex conversions are fine, but I am unconvinced by the argument for
interchanging array() and bytes().

The cacheline size is 64 bytes, and the minimum allocation size is 16,
plus a bhdr overhead of 32 bytes, so you're already at most of a
cacheline for a nominally-zero sized allocation.

There can however be a severe penalty from cacheline sharing, which is
why the bytes() form does have a minimum alignment.  There is one
xmalloc heap shared across the entire system, so you've got no idea what
might be sharing the same cache line for sub-line allocations.

We should not support sub-line allocations IMO.  The saving is a
handful of bytes at best, and there are some horrible performance
cliffs to avoid.  People running virtualisation are not going to be
RAM constrained to the order of a few bytes.

For small allocations which don't require specific alignment,
putting bhdr and the allocation in the same line is fine (if we don't do
this already), but we shouldn't be in the position of having two bhdrs
in the same cache line, even if there are plenty of single-byte
allocations in the theoretical worst case.

~Andrew
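
The cacheline-sharing concern above can be made concrete with a small check
of whether two allocations touch a common line. This is only a sketch: the
64-byte line size is the figure quoted in the message, and
demo_share_cacheline() and its parameters are hypothetical names, not
anything from the Xen tree.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHELINE 64u  /* line size as quoted in the thread */

/*
 * Return true if the byte ranges [a, a + a_len) and [b, b + b_len) touch
 * at least one common cache line. Both lengths must be non-zero.
 */
static bool demo_share_cacheline(uintptr_t a, size_t a_len,
                                 uintptr_t b, size_t b_len)
{
    uintptr_t a_first = a / DEMO_CACHELINE;
    uintptr_t a_last  = (a + a_len - 1) / DEMO_CACHELINE;
    uintptr_t b_first = b / DEMO_CACHELINE;
    uintptr_t b_last  = (b + b_len - 1) / DEMO_CACHELINE;

    /* Two closed intervals of line indices overlap iff neither ends first. */
    return a_first <= b_last && b_first <= a_last;
}
```

Two 16-byte objects at offsets 0x00 and 0x30 of the same 64-byte-aligned
region share a line under this check, while objects each aligned to the line
size and no larger than it never do.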

Jan Beulich April 8, 2021, 2:12 p.m. UTC | #2

On 08.04.2021 14:57, Andrew Cooper wrote:
> On 08/04/2021 13:13, Jan Beulich wrote:
>> In the long run I think we want to do away with these type-unsafe
>> interfaces, all the more so because they also (typically) request
>> excess alignment. This series of entirely independent patches
>> eliminates the instances where it's relatively clear that they're
>> not just "blob" allocations.
>>
>>
>> 03: x86/MCE: avoid effectively open-coding xmalloc_array()
>> 04: x86/HVM: avoid effectively open-coding xmalloc_array()
>> 05: x86/oprofile: avoid effectively open-coding xmalloc_array()
>> 06: x86/IRQ: avoid over-alignment in alloc_pirq_struct()
>> 07: EFI/runtime: avoid effectively open-coding xmalloc_array()
>> 08: hypfs: avoid effectively open-coding xzalloc_array()
>> 10: video/lfb: avoid effectively open-coding xzalloc_array()
> 
> The flex conversions are fine, but I am unconvinced by the argument for
> interchanging array() and bytes().
> 
> The cacheline size is 64 bytes, and the minimum allocation size is 16,
> plus a bhdr overhead of 32 bytes, so you're already at most of a
> cacheline for a nominally-zero sized allocation.

But you're aware that the alignment x[mz]alloc_bytes() forces is
128 bytes? Plus, while sizeof(struct bhdr) is indeed 32, the
overhead on allocated blocks is

#define BHDR_OVERHEAD   (sizeof(struct bhdr) - MIN_BLOCK_SIZE)

i.e. 16 (the other half of the 32 is already the minimum block size
of 16 that you also mention). IOW a cacheline sized block would
yield 48 bytes of usable space. That makes what patch 06 does a
meaningful change specifically in the PV case, where we only need
40 bytes.
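
The arithmetic in this reply can be written out as a short sketch. The
constants are the figures stated in the thread (32-byte struct bhdr, 16-byte
minimum block size, 64-byte cacheline); the DEMO_* and demo_* names are
hypothetical and the real allocator's definitions are not reproduced here.

```c
#include <stddef.h>

#define DEMO_BHDR_SIZE      32u  /* sizeof(struct bhdr), per the thread */
#define DEMO_MIN_BLOCK_SIZE 16u  /* minimum block size, per the thread */
#define DEMO_BHDR_OVERHEAD  (DEMO_BHDR_SIZE - DEMO_MIN_BLOCK_SIZE)
#define DEMO_LINE_BYTES     64u  /* cacheline size, per the thread */

/* Payload bytes left in a block of block_bytes after the header overhead. */
static size_t demo_usable_bytes(size_t block_bytes)
{
    return block_bytes > DEMO_BHDR_OVERHEAD
           ? block_bytes - DEMO_BHDR_OVERHEAD
           : 0;
}

/* Non-zero if a payload fits in one cacheline sized block. */
static int demo_fits_in_one_line(size_t payload_bytes)
{
    return payload_bytes <= demo_usable_bytes(DEMO_LINE_BYTES);
}
```

With these figures the overhead is 16 bytes, a 64-byte block leaves 48
usable bytes, and the 40 bytes needed in the PV case fit within that.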

> There can however be a severe penalty from cacheline sharing, which is
> why the bytes() form does have a minimum alignment.  There is one
> xmalloc heap shared across the entire system, so you've got no idea what
> might be sharing the same cache line for sub-line allocations.

This would call for distinguishing short-lived allocations (and
ones to be used mainly from a single CPU) from long-lived ones
having system wide use. I.e. it is a request for further allocation
function flavors, when the introduction of just one new flavor,
xv[mz]alloc(), has already caused long-winded discussions with
(so far) no real outcome.

> We should not support sub-line allocations IMO.  The saving is a
> handful of bytes at best, and there are some horrible performance
> cliffs to avoid.  People running virtualisation are not going to be
> RAM constrained to the order of a few bytes.
> 
> For small allocations which don't require specific alignment,
> putting bhdr and the allocation in the same line is fine (if we don't do
> this already), but we shouldn't be in the position of having two bhdrs
> in the same cache line, even if there are plenty of single-byte
> allocations in the theoretical worst case.

That's a request to tweak allocator internals then, not an argument
against the conversions done here.

Jan