diff mbox series

[11/11] Arm/optee: don't open-code xzalloc_flex_struct()

Message ID aad23304-c727-2921-59fe-ab3763f5da03@suse.com (mailing list archive)
State Superseded
Headers show
Series assorted replacement of x[mz]alloc_bytes() | expand

Commit Message

Jan Beulich April 8, 2021, 12:23 p.m. UTC
There is a difference in generated code: xzalloc_bytes() forces
SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
actually don't want it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

Comments

Julien Grall April 13, 2021, 6:19 p.m. UTC | #1
Hi Jan,

On 08/04/2021 13:23, Jan Beulich wrote:
> There is a difference in generated code: xzalloc_bytes() forces
> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
> actually don't want it.

So I think moving to xmalloc_flex_struct() is a pretty good move. But I 
am actually a bit confused with the argument used.

Could you provide some details why you think forcing the array to be 
aligned to the maximum cache line supported (128 bytes on Arm) is wrong?

Cheers,
Jan Beulich April 14, 2021, 7:03 a.m. UTC | #2
On 13.04.2021 20:19, Julien Grall wrote:
> On 08/04/2021 13:23, Jan Beulich wrote:
>> There is a difference in generated code: xzalloc_bytes() forces
>> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
>> actually don't want it.
> 
> So I think moving to xmalloc_flex_struct() is a pretty good move. But I 
> am actually a bit confused with the argument used.
> 
> Could you provide some details why you think forcing the array to be 
> aligned to the maximum cache line supported (128 bytes on Arm) is wrong?

It is not "wrong" in that sense, but if this is intended it shouldn't
be arranged via use of xmalloc_bytes(). As also pointed out in a
similar discussion on another sub-thread, imo xmalloc_bytes(), being
type-unsafe, should go away altogether mid-term. If individual callers
have specific alignment requirements (which ought to be the exception),
they should explicitly request the needed alignment. If architectures
would prefer all allocations to have certain minimum alignment (e.g.
to avoid cacheline sharing, which was Andrew's argument) or other
"arrangement" (alignment by itself may not be that interesting due to
the bhdr placed ahead of the allocation), it should be the allocator
itself that provides for this, not individual callers.

Jan
Julien Grall April 15, 2021, 10:26 a.m. UTC | #3
Hi Jan,

On 14/04/2021 08:03, Jan Beulich wrote:
> On 13.04.2021 20:19, Julien Grall wrote:
>> On 08/04/2021 13:23, Jan Beulich wrote:
>>> There is a difference in generated code: xzalloc_bytes() forces
>>> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
>>> actually don't want it.
>>
>> So I think moving to xmalloc_flex_struct() is a pretty good move. But I
>> am actually a bit confused with the argument used.
>>
>> Could you provide some details why you think forcing the array to be
>> aligned to the maximum cache line supported (128 bytes on Arm) is wrong?
> 
> It is not "wrong" in that sense, but if this is intended it shouldn't
> be arranged via use of xmalloc_bytes().

This is not very clear from the commit message (or even the cover 
letter). How about:

"
The current use xzalloc_bytes() in optee is nearly an open-coded version 
of xzalloc_flex_struct() which was introduced after the driver was merged.

The main difference is xzalloc_bytes() will also force the allocation to 
be SMP_CACHE_BYTES aligned and therefore avoid sharing the cache line.

While sharing the cache line can have an impact on the performance, this 
is also true for most of the other users of  x*alloc_flex_struct(). So 
if we want to prevent sharing a cache line, it should be part of 
x*alloc_flex_struct().

In this case, we don't need stricter alignment than what the allocator 
does. So the call to xzalloc_bytes() is now replaced with a call 
xzalloc_flex_Struct().
"

Ideally, we want the same sort of the commit message in the other patches.

> As also pointed out in a
> similar discussion on another sub-thread, imo xmalloc_bytes(), being
> type-unsafe, should go away altogether mid-term.

And I will support dropping xmalloc_bytes().

> If individual callers
> have specific alignment requirements (which ought to be the exception),
> they should explicitly request the needed alignment. If architectures
> would prefer all allocations to have certain minimum alignment (e.g.
> to avoid cacheline sharing, which was Andrew's argument) or other
> "arrangement" (alignment by itself may not be that interesting due to
> the bhdr placed ahead of the allocation), it should be the allocator
> itself that provides for this, not individual callers.

And I agree that the allocator should do the alignment if this benefits 
every allocation.

Cheers,
Jan Beulich April 15, 2021, 11:02 a.m. UTC | #4
On 15.04.2021 12:26, Julien Grall wrote:
> Hi Jan,
> 
> On 14/04/2021 08:03, Jan Beulich wrote:
>> On 13.04.2021 20:19, Julien Grall wrote:
>>> On 08/04/2021 13:23, Jan Beulich wrote:
>>>> There is a difference in generated code: xzalloc_bytes() forces
>>>> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
>>>> actually don't want it.
>>>
>>> So I think moving to xmalloc_flex_struct() is a pretty good move. But I
>>> am actually a bit confused with the argument used.
>>>
>>> Could you provide some details why you think forcing the array to be
>>> aligned to the maximum cache line supported (128 bytes on Arm) is wrong?
>>
>> It is not "wrong" in that sense, but if this is intended it shouldn't
>> be arranged via use of xmalloc_bytes().
> 
> This is not very clear from the commit message (or even the cover 
> letter). How about:
> 
> "
> The current use xzalloc_bytes() in optee is nearly an open-coded version 
> of xzalloc_flex_struct() which was introduced after the driver was merged.
> 
> The main difference is xzalloc_bytes() will also force the allocation to 
> be SMP_CACHE_BYTES aligned and therefore avoid sharing the cache line.
> 
> While sharing the cache line can have an impact on the performance, this 
> is also true for most of the other users of  x*alloc_flex_struct(). So 
> if we want to prevent sharing a cache line, it should be part of 
> x*alloc_flex_struct().
> 
> In this case, we don't need stricter alignment than what the allocator 
> does. So the call to xzalloc_bytes() is now replaced with a call 
> xzalloc_flex_Struct().
> "

Well, okay, I've inserted a slightly edited version of this into
the patch. But ...

> Ideally, we want the same sort of the commit message in the other patches.

... I disagree here: In particular because of the recurring
pattern, I dislike repeated overly verbose descriptions. I could
perhaps see them as being not seen as overly verbose when looking
at every commit on its own (like would happen when someone tries
to do some archaeology later on), but in the context of such a
series this is potentially harmful: If almost a dozen patches
have almost the same sufficiently verbose description, possible
differences may not be paid attention to.

Plus, granted possibly controversial as well, I'm afraid I'm not
happy to (repeatedly) state obvious facts. The abuse (in almost
all cases) of x[mz]alloc_bytes() is, afaict, in no way attributed
to the resulting higher alignment, but rather in the desire to
get away easily without needing to think about using a type-safe
flavor. This said I will admit that prior to the introduction of
x[mz]alloc_flex_struct() I can see how alternatives could quickly
have got quite ugly. And for the few (if any) users which
actually care about this higher alignment, it should have been
noted at the time of introducing the use that the specific need
for certain alignment shouldn't be implied by using this function,
but rather be made explicit. This is even more so as it's not
written down anywhere that x[mz]alloc_bytes() guarantees a certain
alignment (i.e. if the plan wasn't anyway to phase out its use, it
should have been permissible to change the alignment it requests
from the underlying implementation without first auditing all
users).

Jan
Julien Grall April 15, 2021, 11:31 a.m. UTC | #5
On 15/04/2021 12:02, Jan Beulich wrote:
> On 15.04.2021 12:26, Julien Grall wrote:
>> Hi Jan,
>>
>> On 14/04/2021 08:03, Jan Beulich wrote:
>>> On 13.04.2021 20:19, Julien Grall wrote:
>>>> On 08/04/2021 13:23, Jan Beulich wrote:
>>>>> There is a difference in generated code: xzalloc_bytes() forces
>>>>> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
>>>>> actually don't want it.
>>>>
>>>> So I think moving to xmalloc_flex_struct() is a pretty good move. But I
>>>> am actually a bit confused with the argument used.
>>>>
>>>> Could you provide some details why you think forcing the array to be
>>>> aligned to the maximum cache line supported (128 bytes on Arm) is wrong?
>>>
>>> It is not "wrong" in that sense, but if this is intended it shouldn't
>>> be arranged via use of xmalloc_bytes().
>>
>> This is not very clear from the commit message (or even the cover
>> letter). How about:
>>
>> "
>> The current use xzalloc_bytes() in optee is nearly an open-coded version
>> of xzalloc_flex_struct() which was introduced after the driver was merged.
>>
>> The main difference is xzalloc_bytes() will also force the allocation to
>> be SMP_CACHE_BYTES aligned and therefore avoid sharing the cache line.
>>
>> While sharing the cache line can have an impact on the performance, this
>> is also true for most of the other users of  x*alloc_flex_struct(). So
>> if we want to prevent sharing a cache line, it should be part of
>> x*alloc_flex_struct().
>>
>> In this case, we don't need stricter alignment than what the allocator
>> does. So the call to xzalloc_bytes() is now replaced with a call
>> xzalloc_flex_Struct().
>> "
> 
> Well, okay, I've inserted a slightly edited version of this into
> the patch. But ...
> 
>> Ideally, we want the same sort of the commit message in the other patches.
> 
> ... I disagree here: In particular because of the recurring
> pattern, I dislike repeated overly verbose descriptions. I could
> perhaps see them as being not seen as overly verbose when looking
> at every commit on its own (like would happen when someone tries
> to do some archaeology later on), but in the context of such a
> series this is potentially harmful: If almost a dozen patches
> have almost the same sufficiently verbose description, possible
> differences may not be paid attention to.

There are two part to take into account:

1) Reviewer
2) Future reader

For reviewer, I agree that the information is redundant. Although, it is 
not like the cover letter (in this case) contain more information about 
the reasoning... ;)

However, such information is not redundant for a future reader. Patches 
in a series may be applied separately, in a different order, title 
mangled...

> 
> Plus, granted possibly controversial as well, I'm afraid I'm not
> happy to (repeatedly) state obvious facts.
I should probably remind you that what's obvious for you may not be for 
the other.

In particular, writing "I think we only don't need this here, but 
actually don't want it" is very unhelpful because it can be interpreted 
differently (my original answer is an example).

It doesn't cost much to write clear commit message. But it will cost a 
lot for the future reader (or even reviewer) to figure out the original 
intention.

> The abuse (in almost
> all cases) of x[mz]alloc_bytes() is, afaict, in no way attributed
> to the resulting higher alignment, but rather in the desire to
> get away easily without needing to think about using a type-safe
> flavor. This said I will admit that prior to the introduction of
> x[mz]alloc_flex_struct() I can see how alternatives could quickly
> have got quite ugly. And for the few (if any) users which
> actually care about this higher alignment, it should have been
> noted at the time of introducing the use that the specific need
> for certain alignment shouldn't be implied by using this function,
> but rather be made explicit. This is even more so as it's not
> written down anywhere that x[mz]alloc_bytes() guarantees a certain
> alignment (i.e. if the plan wasn't anyway to phase out its use, it
> should have been permissible to change the alignment it requests
> from the underlying implementation without first auditing all
> users).
I don't think it is really helpful to argue on whether the developper 
did it by lazyness/lack of helpers or else.

As I wrote, dropping xmalloc_bytes() is a good move. But we should 
documenting our reasoning rather than hoping that everyone know your 
"obvious facts".

Cheers,
diff mbox series

Patch

--- a/xen/arch/arm/tee/optee.c
+++ b/xen/arch/arm/tee/optee.c
@@ -529,8 +529,7 @@  static struct optee_shm_buf *allocate_op
     while ( unlikely(old != atomic_cmpxchg(&ctx->optee_shm_buf_pages,
                                            old, new)) );
 
-    optee_shm_buf = xzalloc_bytes(sizeof(struct optee_shm_buf) +
-                                  pages_cnt * sizeof(struct page *));
+    optee_shm_buf = xzalloc_flex_struct(struct optee_shm_buf, pages, pages_cnt);
     if ( !optee_shm_buf )
     {
         err_code = -ENOMEM;