Message ID | aad23304-c727-2921-59fe-ab3763f5da03@suse.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | assorted replacement of x[mz]alloc_bytes() | expand |
Hi Jan, On 08/04/2021 13:23, Jan Beulich wrote: > There is a difference in generated code: xzalloc_bytes() forces > SMP_CACHE_BYTES alignment. I think we not only don't need this here, but > actually don't want it. So I think moving to xmalloc_flex_struct() is a pretty good move. But I am actually a bit confused with the argument used. Could you provide some details why you think forcing the array to be aligned to the maximum cache line supported (128 bytes on Arm) is wrong? Cheers,
On 13.04.2021 20:19, Julien Grall wrote: > On 08/04/2021 13:23, Jan Beulich wrote: >> There is a difference in generated code: xzalloc_bytes() forces >> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but >> actually don't want it. > > So I think moving to xmalloc_flex_struct() is a pretty good move. But I > am actually a bit confused with the argument used. > > Could you provide some details why you think forcing the array to be > aligned to the maximum cache line supported (128 bytes on Arm) is wrong? It is not "wrong" in that sense, but if this is intended it shouldn't be arranged via use of xmalloc_bytes(). As also pointed out in a similar discussion on another sub-thread, imo xmalloc_bytes(), being type-unsafe, should go away altogether mid-term. If individual callers have specific alignment requirements (which ought to be the exception), they should explicitly request the needed alignment. If architectures would prefer all allocations to have certain minimum alignment (e.g. to avoid cacheline sharing, which was Andrew's argument) or other "arrangement" (alignment by itself may not be that interesting due to the bhdr placed ahead of the allocation), it should be the allocator itself that provides for this, not individual callers. Jan
Hi Jan, On 14/04/2021 08:03, Jan Beulich wrote: > On 13.04.2021 20:19, Julien Grall wrote: >> On 08/04/2021 13:23, Jan Beulich wrote: >>> There is a difference in generated code: xzalloc_bytes() forces >>> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but >>> actually don't want it. >> >> So I think moving to xmalloc_flex_struct() is a pretty good move. But I >> am actually a bit confused with the argument used. >> >> Could you provide some details why you think forcing the array to be >> aligned to the maximum cache line supported (128 bytes on Arm) is wrong? > > It is not "wrong" in that sense, but if this is intended it shouldn't > be arranged via use of xmalloc_bytes(). This is not very clear from the commit message (or even the cover letter). How about: " The current use xzalloc_bytes() in optee is nearly an open-coded version of xzalloc_flex_struct() which was introduced after the driver was merged. The main difference is xzalloc_bytes() will also force the allocation to be SMP_CACHE_BYTES aligned and therefore avoid sharing the cache line. While sharing the cache line can have an impact on the performance, this is also true for most of the other users of x*alloc_flex_struct(). So if we want to prevent sharing a cache line, it should be part of x*alloc_flex_struct(). In this case, we don't need stricter alignment than what the allocator does. So the call to xzalloc_bytes() is now replaced with a call xzalloc_flex_Struct(). " Ideally, we want the same sort of the commit message in the other patches. > As also pointed out in a > similar discussion on another sub-thread, imo xmalloc_bytes(), being > type-unsafe, should go away altogether mid-term. And I will support dropping xmalloc_bytes(). > If individual callers > have specific alignment requirements (which ought to be the exception), > they should explicitly request the needed alignment. If architectures > would prefer all allocations to have certain minimum alignment (e.g. > to avoid cacheline sharing, which was Andrew's argument) or other > "arrangement" (alignment by itself may not be that interesting due to > the bhdr placed ahead of the allocation), it should be the allocator > itself that provides for this, not individual callers. And I agree that the allocator should do the alignment if this benefits every allocation. Cheers,
On 15.04.2021 12:26, Julien Grall wrote: > Hi Jan, > > On 14/04/2021 08:03, Jan Beulich wrote: >> On 13.04.2021 20:19, Julien Grall wrote: >>> On 08/04/2021 13:23, Jan Beulich wrote: >>>> There is a difference in generated code: xzalloc_bytes() forces >>>> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but >>>> actually don't want it. >>> >>> So I think moving to xmalloc_flex_struct() is a pretty good move. But I >>> am actually a bit confused with the argument used. >>> >>> Could you provide some details why you think forcing the array to be >>> aligned to the maximum cache line supported (128 bytes on Arm) is wrong? >> >> It is not "wrong" in that sense, but if this is intended it shouldn't >> be arranged via use of xmalloc_bytes(). > > This is not very clear from the commit message (or even the cover > letter). How about: > > " > The current use xzalloc_bytes() in optee is nearly an open-coded version > of xzalloc_flex_struct() which was introduced after the driver was merged. > > The main difference is xzalloc_bytes() will also force the allocation to > be SMP_CACHE_BYTES aligned and therefore avoid sharing the cache line. > > While sharing the cache line can have an impact on the performance, this > is also true for most of the other users of x*alloc_flex_struct(). So > if we want to prevent sharing a cache line, it should be part of > x*alloc_flex_struct(). > > In this case, we don't need stricter alignment than what the allocator > does. So the call to xzalloc_bytes() is now replaced with a call > xzalloc_flex_Struct(). > " Well, okay, I've inserted a slightly edited version of this into the patch. But ... > Ideally, we want the same sort of the commit message in the other patches. ... I disagree here: In particular because of the recurring pattern, I dislike repeated overly verbose descriptions. I could perhaps see them as being not seen as overly verbose when looking at every commit on its own (like would happen when someone tries to do some archaeology later on), but in the context of such a series this is potentially harmful: If almost a dozen patches have almost the same sufficiently verbose description, possible differences may not be paid attention to. Plus, granted possibly controversial as well, I'm afraid I'm not happy to (repeatedly) state obvious facts. The abuse (in almost all cases) of x[mz]alloc_bytes() is, afaict, in no way attributed to the resulting higher alignment, but rather in the desire to get away easily without needing to think about using a type-safe flavor. This said I will admit that prior to the introduction of x[mz]alloc_flex_struct() I can see how alternatives could quickly have got quite ugly. And for the few (if any) users which actually care about this higher alignment, it should have been noted at the time of introducing the use that the specific need for certain alignment shouldn't be implied by using this function, but rather be made explicit. This is even more so as it's not written down anywhere that x[mz]alloc_bytes() guarantees a certain alignment (i.e. if the plan wasn't anyway to phase out its use, it should have been permissible to change the alignment it requests from the underlying implementation without first auditing all users). Jan
On 15/04/2021 12:02, Jan Beulich wrote: > On 15.04.2021 12:26, Julien Grall wrote: >> Hi Jan, >> >> On 14/04/2021 08:03, Jan Beulich wrote: >>> On 13.04.2021 20:19, Julien Grall wrote: >>>> On 08/04/2021 13:23, Jan Beulich wrote: >>>>> There is a difference in generated code: xzalloc_bytes() forces >>>>> SMP_CACHE_BYTES alignment. I think we not only don't need this here, but >>>>> actually don't want it. >>>> >>>> So I think moving to xmalloc_flex_struct() is a pretty good move. But I >>>> am actually a bit confused with the argument used. >>>> >>>> Could you provide some details why you think forcing the array to be >>>> aligned to the maximum cache line supported (128 bytes on Arm) is wrong? >>> >>> It is not "wrong" in that sense, but if this is intended it shouldn't >>> be arranged via use of xmalloc_bytes(). >> >> This is not very clear from the commit message (or even the cover >> letter). How about: >> >> " >> The current use xzalloc_bytes() in optee is nearly an open-coded version >> of xzalloc_flex_struct() which was introduced after the driver was merged. >> >> The main difference is xzalloc_bytes() will also force the allocation to >> be SMP_CACHE_BYTES aligned and therefore avoid sharing the cache line. >> >> While sharing the cache line can have an impact on the performance, this >> is also true for most of the other users of x*alloc_flex_struct(). So >> if we want to prevent sharing a cache line, it should be part of >> x*alloc_flex_struct(). >> >> In this case, we don't need stricter alignment than what the allocator >> does. So the call to xzalloc_bytes() is now replaced with a call >> xzalloc_flex_Struct(). >> " > > Well, okay, I've inserted a slightly edited version of this into > the patch. But ... > >> Ideally, we want the same sort of the commit message in the other patches. > > ... I disagree here: In particular because of the recurring > pattern, I dislike repeated overly verbose descriptions. I could > perhaps see them as being not seen as overly verbose when looking > at every commit on its own (like would happen when someone tries > to do some archaeology later on), but in the context of such a > series this is potentially harmful: If almost a dozen patches > have almost the same sufficiently verbose description, possible > differences may not be paid attention to. There are two part to take into account: 1) Reviewer 2) Future reader For reviewer, I agree that the information is redundant. Although, it is not like the cover letter (in this case) contain more information about the reasoning... ;) However, such information is not redundant for a future reader. Patches in a series may be applied separately, in a different order, title mangled... > > Plus, granted possibly controversial as well, I'm afraid I'm not > happy to (repeatedly) state obvious facts. I should probably remind you that what's obvious for you may not be for the other. In particular, writing "I think we only don't need this here, but actually don't want it" is very unhelpful because it can be interpreted differently (my original answer is an example). It doesn't cost much to write clear commit message. But it will cost a lot for the future reader (or even reviewer) to figure out the original intention. > The abuse (in almost > all cases) of x[mz]alloc_bytes() is, afaict, in no way attributed > to the resulting higher alignment, but rather in the desire to > get away easily without needing to think about using a type-safe > flavor. This said I will admit that prior to the introduction of > x[mz]alloc_flex_struct() I can see how alternatives could quickly > have got quite ugly. And for the few (if any) users which > actually care about this higher alignment, it should have been > noted at the time of introducing the use that the specific need > for certain alignment shouldn't be implied by using this function, > but rather be made explicit. This is even more so as it's not > written down anywhere that x[mz]alloc_bytes() guarantees a certain > alignment (i.e. if the plan wasn't anyway to phase out its use, it > should have been permissible to change the alignment it requests > from the underlying implementation without first auditing all > users). I don't think it is really helpful to argue on whether the developper did it by lazyness/lack of helpers or else. As I wrote, dropping xmalloc_bytes() is a good move. But we should documenting our reasoning rather than hoping that everyone know your "obvious facts". Cheers,
--- a/xen/arch/arm/tee/optee.c +++ b/xen/arch/arm/tee/optee.c @@ -529,8 +529,7 @@ static struct optee_shm_buf *allocate_op while ( unlikely(old != atomic_cmpxchg(&ctx->optee_shm_buf_pages, old, new)) ); - optee_shm_buf = xzalloc_bytes(sizeof(struct optee_shm_buf) + - pages_cnt * sizeof(struct page *)); + optee_shm_buf = xzalloc_flex_struct(struct optee_shm_buf, pages, pages_cnt); if ( !optee_shm_buf ) { err_code = -ENOMEM;
There is a difference in generated code: xzalloc_bytes() forces SMP_CACHE_BYTES alignment. I think we not only don't need this here, but actually don't want it. Signed-off-by: Jan Beulich <jbeulich@suse.com>