[4/4] mm/slub: refactor calculate_order() and calc_slab_order()

Message ID 20230908145302.30320-10-vbabka@suse.cz (mailing list archive)
State New
Series SLUB: calculate_order() cleanups

Commit Message

Vlastimil Babka Sept. 8, 2023, 2:53 p.m. UTC
After the previous cleanups, we can now move some code from
calc_slab_order() to calculate_order() so it's executed just once, and
do some more cleanups.

- move the min_order and MAX_OBJS_PER_PAGE evaluation to
  calc_slab_order().

- change calc_slab_order() parameter min_objects to min_order

Also make MAX_OBJS_PER_PAGE check more robust by considering also
min_objects in addition to slub_min_order. Otherwise this is not a
functional change.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)
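
As a quick worked example of the reworked flow (the numbers assume 4K pages, the default slub_max_order of 3, slub_min_objects unset and 16 present CPUs, purely for illustration): min_objects = 4 * (fls(16) + 1) = 24; for a 4096-byte object, max_objects = order_objects(3, 4096) = 8, so min_objects is clamped to 8; min_order = max(slub_min_order, get_order(8 * 4096)) = 3, and the fraction loop in calc_slab_order() then starts directly at order 3 instead of re-deriving that bound on every call.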

Comments

kernel test robot Sept. 11, 2023, 5:56 a.m. UTC | #1
Hello,

kernel test robot noticed "UBSAN:shift-out-of-bounds_in_mm/slub.c" on:

commit: f04d441027621c16081803832a54f59272112cf5 ("[PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()")
url: https://github.com/intel-lab-lkp/linux/commits/Vlastimil-Babka/mm-slub-simplify-the-last-resort-slab-order-calculation/20230908-225506
base: git://git.kernel.org/cgit/linux/kernel/git/vbabka/slab.git for-next
patch link: https://lore.kernel.org/all/20230908145302.30320-10-vbabka@suse.cz/
patch subject: [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()

in testcase: boot

compiler: clang-16
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


+-------------------------------------------------+------------+------------+
|                                                 | a17847b835 | f04d441027 |
+-------------------------------------------------+------------+------------+
| UBSAN:shift-out-of-bounds_in_mm/slub.c          | 0          | 12         |
+-------------------------------------------------+------------+------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202309111340.f59c3f22-oliver.sang@intel.com


[    0.901457][    T0] UBSAN: shift-out-of-bounds in mm/slub.c:463:34
[    0.902458][    T0] shift exponent 52 is too large for 32-bit type 'unsigned int'
[    0.903477][    T0] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G                T  6.5.0-rc1-00009-gf04d44102762 #1
[    0.904450][    T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[    0.904450][    T0] Call Trace:
[    0.904450][    T0]  <TASK>
[ 0.904450][ T0] dump_stack_lvl (lib/dump_stack.c:107) 
[ 0.904450][ T0] dump_stack (lib/dump_stack.c:114) 
[ 0.904450][ T0] ubsan_epilogue (lib/ubsan.c:218) 
[ 0.904450][ T0] __ubsan_handle_shift_out_of_bounds (lib/ubsan.c:?) 
[ 0.904450][ T0] ? tdx_handle_virt_exception (arch/x86/include/asm/shared/tdx.h:60 arch/x86/coco/tdx/tdx.c:375 arch/x86/coco/tdx/tdx.c:430 arch/x86/coco/tdx/tdx.c:650 arch/x86/coco/tdx/tdx.c:666) 
[ 0.904450][ T0] ? kmemleak_alloc (mm/kmemleak.c:977) 
[ 0.904450][ T0] __kmem_cache_create (mm/slub.c:? mm/slub.c:4159 mm/slub.c:4473 mm/slub.c:4507 mm/slub.c:5104) 
[ 0.904450][ T0] ? kmem_cache_alloc (mm/slub.c:3502) 
[ 0.904450][ T0] kmem_cache_create_usercopy (mm/slab_common.c:236 mm/slab_common.c:340) 
[ 0.904450][ T0] fork_init (kernel/fork.c:1048) 
[ 0.904450][ T0] ? kmem_cache_create (mm/slab_common.c:395) 
[ 0.904450][ T0] start_kernel (init/main.c:1046) 
[ 0.904450][ T0] x86_64_start_reservations (??:?) 
[ 0.904450][ T0] x86_64_start_kernel (arch/x86/kernel/head64.c:486) 
[ 0.904450][ T0] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:441) 
[    0.904450][    T0]  </TASK>
[    0.904457][    T0] ================================================================================
[    0.905560][    T0] LSM: initializing lsm=lockdown,capability,landlock,yama,safesetid,integrity
[    0.906497][    T0] landlock: Up and running.



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230911/202309111340.f59c3f22-oliver.sang@intel.com
Vlastimil Babka Sept. 15, 2023, 1:36 p.m. UTC | #2
On 9/11/23 07:56, kernel test robot wrote:
> 
> 
> Hello,
> 
> kernel test robot noticed "UBSAN:shift-out-of-bounds_in_mm/slub.c" on:
> 
> commit: f04d441027621c16081803832a54f59272112cf5 ("[PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()")
> url: https://github.com/intel-lab-lkp/linux/commits/Vlastimil-Babka/mm-slub-simplify-the-last-resort-slab-order-calculation/20230908-225506
> base: git://git.kernel.org/cgit/linux/kernel/git/vbabka/slab.git for-next
> patch link: https://lore.kernel.org/all/20230908145302.30320-10-vbabka@suse.cz/
> patch subject: [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()
> 
> in testcase: boot
> 
> compiler: clang-16
> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> 
> (please refer to attached dmesg/kmsg for entire log/backtrace)
> 
> 
> +-------------------------------------------------+------------+------------+
> |                                                 | a17847b835 | f04d441027 |
> +-------------------------------------------------+------------+------------+
> | UBSAN:shift-out-of-bounds_in_mm/slub.c          | 0          | 12         |
> +-------------------------------------------------+------------+------------+
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202309111340.f59c3f22-oliver.sang@intel.com
> 
> 
> [    0.901457][    T0] UBSAN: shift-out-of-bounds in mm/slub.c:463:34
> [    0.902458][    T0] shift exponent 52 is too large for 32-bit type 'unsigned int'
> [    0.903477][    T0] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G                T  6.5.0-rc1-00009-gf04d44102762 #1
> [    0.904450][    T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [    0.904450][    T0] Call Trace:
> [    0.904450][    T0]  <TASK>
> [ 0.904450][ T0] dump_stack_lvl (lib/dump_stack.c:107) 
> [ 0.904450][ T0] dump_stack (lib/dump_stack.c:114) 
> [ 0.904450][ T0] ubsan_epilogue (lib/ubsan.c:218) 
> [ 0.904450][ T0] __ubsan_handle_shift_out_of_bounds (lib/ubsan.c:?) 
> [ 0.904450][ T0] ? tdx_handle_virt_exception (arch/x86/include/asm/shared/tdx.h:60 arch/x86/coco/tdx/tdx.c:375 arch/x86/coco/tdx/tdx.c:430 arch/x86/coco/tdx/tdx.c:650 arch/x86/coco/tdx/tdx.c:666) 
> [ 0.904450][ T0] ? kmemleak_alloc (mm/kmemleak.c:977) 
> [ 0.904450][ T0] __kmem_cache_create (mm/slub.c:? mm/slub.c:4159 mm/slub.c:4473 mm/slub.c:4507 mm/slub.c:5104) 

Oh thanks, fixing up:

--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4152,7 +4152,8 @@ static inline int calculate_order(unsigned int size)
                        nr_cpus = nr_cpu_ids;
                min_objects = 4 * (fls(nr_cpus) + 1);
        }
-       max_objects = order_objects(slub_max_order, size);
+       /* min_objects can't be 0 because get_order(0) is undefined */
+       max_objects = max(order_objects(slub_max_order, size), 1);
        min_objects = min(min_objects, max_objects);
 
        min_order = max(slub_min_order, (unsigned int)get_order(min_objects * size));
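
For reference, the "shift exponent 52" falls out of get_order(0): in the
generic include/asm-generic/getorder.h implementation the runtime path does
size--, a shift by PAGE_SHIFT and fls64(), so passing 0 wraps to ULONG_MAX.
A standalone mock of that arithmetic (not kernel code; assumes a 64-bit
machine with 4K pages):

/*
 * Mimic the generic get_order() runtime path to show why get_order(0)
 * comes out as 52 -- exactly the out-of-range shift exponent UBSAN then
 * flags inside order_objects().
 */
#include <stdio.h>

#define PAGE_SHIFT 12

static int fls64_like(unsigned long x)
{
	return x ? 64 - __builtin_clzl(x) : 0;
}

static int get_order_like(unsigned long size)
{
	size--;			/* 0 wraps to ULONG_MAX here */
	size >>= PAGE_SHIFT;
	return fls64_like(size);
}

int main(void)
{
	printf("get_order(0) -> %d\n", get_order_like(0));	/* prints 52 */
	return 0;
}

With the clamp above, max_objects (and hence min_objects) can no longer be 0,
so get_order() is never called with 0.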
Baoquan He Sept. 16, 2023, 1:28 a.m. UTC | #3
On 09/08/23 at 04:53pm, Vlastimil Babka wrote:
> After the previous cleanups, we can now move some code from
> calc_slab_order() to calculate_order() so it's executed just once, and
> do some more cleanups.
> 
> - move the min_order and MAX_OBJS_PER_PAGE evaluation to
>   calc_slab_order().
> 
> - change calc_slab_order() parameter min_objects to min_order
> 
> Also make MAX_OBJS_PER_PAGE check more robust by considering also
> min_objects in addition to slub_min_order. Otherwise this is not a
> functional change.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  mm/slub.c | 19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index f04eb029d85a..1c91f72c7239 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4110,17 +4110,12 @@ static unsigned int slub_min_objects;
>   * the smallest order which will fit the object.
>   */
>  static inline unsigned int calc_slab_order(unsigned int size,
> -		unsigned int min_objects, unsigned int max_order,
> +		unsigned int min_order, unsigned int max_order,
>  		unsigned int fract_leftover)
>  {
> -	unsigned int min_order = slub_min_order;
>  	unsigned int order;
>  
> -	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
> -		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
> -
> -	for (order = max(min_order, (unsigned int)get_order(min_objects * size));
> -			order <= max_order; order++) {
> +	for (order = min_order; order <= max_order; order++) {
>  
>  		unsigned int slab_size = (unsigned int)PAGE_SIZE << order;
>  		unsigned int rem;
> @@ -4139,7 +4134,7 @@ static inline int calculate_order(unsigned int size)
>  	unsigned int order;
>  	unsigned int min_objects;
>  	unsigned int max_objects;
> -	unsigned int nr_cpus;
> +	unsigned int min_order;
>  
>  	min_objects = slub_min_objects;
>  	if (!min_objects) {
> @@ -4152,7 +4147,7 @@ static inline int calculate_order(unsigned int size)
>  		 * order on systems that appear larger than they are, and too
>  		 * low order on systems that appear smaller than they are.
>  		 */
> -		nr_cpus = num_present_cpus();
> +		unsigned int nr_cpus = num_present_cpus();
>  		if (nr_cpus <= 1)
>  			nr_cpus = nr_cpu_ids;
>  		min_objects = 4 * (fls(nr_cpus) + 1);

A minor concern: should we change 'min_objects' to be a local static
to avoid the "if (!min_objects) {" code block every time? It's deducing
the value from nr_cpus, so we may not need to do the calculation each time.

> @@ -4160,6 +4155,10 @@ static inline int calculate_order(unsigned int size)
>  	max_objects = order_objects(slub_max_order, size);
>  	min_objects = min(min_objects, max_objects);
>  
> +	min_order = max(slub_min_order, (unsigned int)get_order(min_objects * size));
> +	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
> +		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
> +
>  	/*
>  	 * Attempt to find best configuration for a slab. This works by first
>  	 * attempting to generate a layout with the best possible configuration and
> @@ -4176,7 +4175,7 @@ static inline int calculate_order(unsigned int size)
>  	 * long as at least single object fits within slub_max_order.
>  	 */
>  	for (unsigned int fraction = 16; fraction > 1; fraction /= 2) {
> -		order = calc_slab_order(size, min_objects, slub_max_order,
> +		order = calc_slab_order(size, min_order, slub_max_order,
>  					fraction);
>  		if (order <= slub_max_order)
>  			return order;
> -- 
> 2.42.0
>
Feng Tang Sept. 20, 2023, 1:36 p.m. UTC | #4
On Fri, Sep 08, 2023 at 10:53:07PM +0800, Vlastimil Babka wrote:
> After the previous cleanups, we can now move some code from
> calc_slab_order() to calculate_order() so it's executed just once, and
> do some more cleanups.
> 
> - move the min_order and MAX_OBJS_PER_PAGE evaluation to
>   calc_slab_order().

Nit: should this say 'move ... to calculate_order()'?

I tried this patch series with normal boot on a desktop and one 2
socket server: patch 2/4 doesn't change order of any slab, and patch
3/4 does make the slab order of big objects more consistent.

Thanks for making the code much cleaner! And for the whole series, 

Reviewed-by: Feng Tang <feng.tang@intel.com>

> - change calc_slab_order() parameter min_objects to min_order
> 
> Also make MAX_OBJS_PER_PAGE check more robust by considering also
> min_objects in addition to slub_min_order. Otherwise this is not a
> functional change.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  mm/slub.c | 19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index f04eb029d85a..1c91f72c7239 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4110,17 +4110,12 @@ static unsigned int slub_min_objects;
>   * the smallest order which will fit the object.
>   */
>  static inline unsigned int calc_slab_order(unsigned int size,
> -		unsigned int min_objects, unsigned int max_order,
> +		unsigned int min_order, unsigned int max_order,
>  		unsigned int fract_leftover)
>  {
> -	unsigned int min_order = slub_min_order;
>  	unsigned int order;
>  
> -	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
> -		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
> -
> -	for (order = max(min_order, (unsigned int)get_order(min_objects * size));
> -			order <= max_order; order++) {
> +	for (order = min_order; order <= max_order; order++) {
>  
>  		unsigned int slab_size = (unsigned int)PAGE_SIZE << order;
>  		unsigned int rem;
> @@ -4139,7 +4134,7 @@ static inline int calculate_order(unsigned int size)
>  	unsigned int order;
>  	unsigned int min_objects;
>  	unsigned int max_objects;
> -	unsigned int nr_cpus;
> +	unsigned int min_order;
>  
>  	min_objects = slub_min_objects;
>  	if (!min_objects) {
> @@ -4152,7 +4147,7 @@ static inline int calculate_order(unsigned int size)
>  		 * order on systems that appear larger than they are, and too
>  		 * low order on systems that appear smaller than they are.
>  		 */
> -		nr_cpus = num_present_cpus();
> +		unsigned int nr_cpus = num_present_cpus();
>  		if (nr_cpus <= 1)
>  			nr_cpus = nr_cpu_ids;
>  		min_objects = 4 * (fls(nr_cpus) + 1);
> @@ -4160,6 +4155,10 @@ static inline int calculate_order(unsigned int size)
>  	max_objects = order_objects(slub_max_order, size);
>  	min_objects = min(min_objects, max_objects);
>  
> +	min_order = max(slub_min_order, (unsigned int)get_order(min_objects * size));
> +	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
> +		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
> +
>  	/*
>  	 * Attempt to find best configuration for a slab. This works by first
>  	 * attempting to generate a layout with the best possible configuration and
> @@ -4176,7 +4175,7 @@ static inline int calculate_order(unsigned int size)
>  	 * long as at least single object fits within slub_max_order.
>  	 */
>  	for (unsigned int fraction = 16; fraction > 1; fraction /= 2) {
> -		order = calc_slab_order(size, min_objects, slub_max_order,
> +		order = calc_slab_order(size, min_order, slub_max_order,
>  					fraction);
>  		if (order <= slub_max_order)
>  			return order;
> -- 
> 2.42.0
> 
>
Vlastimil Babka Sept. 22, 2023, 6:55 a.m. UTC | #5
On 9/20/23 15:36, Feng Tang wrote:
> On Fri, Sep 08, 2023 at 10:53:07PM +0800, Vlastimil Babka wrote:
>> After the previous cleanups, we can now move some code from
>> calc_slab_order() to calculate_order() so it's executed just once, and
>> do some more cleanups.
>> 
>> - move the min_order and MAX_OBJS_PER_PAGE evaluation to
>>   calc_slab_order().
> 
> Nit: should this say 'move ... to calculate_order()'?

Oops, right, fixed.

> I tried this patch series with normal boot on a desktop and one 2
> socket server: patch 2/4 doesn't change order of any slab, and patch
> 3/4 does make the slab order of big objects more consistent.
> 
> Thanks for making the code much cleaner! And for the whole series, 
> 
> Reviewed-by: Feng Tang <feng.tang@intel.com>

Thanks! Applied.

> 
>> - change calc_slab_order() parameter min_objects to min_order
>> 
>> Also make MAX_OBJS_PER_PAGE check more robust by considering also
>> min_objects in addition to slub_min_order. Otherwise this is not a
>> functional change.
>> 
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>> ---
>>  mm/slub.c | 19 +++++++++----------
>>  1 file changed, 9 insertions(+), 10 deletions(-)
>> 
>> diff --git a/mm/slub.c b/mm/slub.c
>> index f04eb029d85a..1c91f72c7239 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -4110,17 +4110,12 @@ static unsigned int slub_min_objects;
>>   * the smallest order which will fit the object.
>>   */
>>  static inline unsigned int calc_slab_order(unsigned int size,
>> -		unsigned int min_objects, unsigned int max_order,
>> +		unsigned int min_order, unsigned int max_order,
>>  		unsigned int fract_leftover)
>>  {
>> -	unsigned int min_order = slub_min_order;
>>  	unsigned int order;
>>  
>> -	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
>> -		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
>> -
>> -	for (order = max(min_order, (unsigned int)get_order(min_objects * size));
>> -			order <= max_order; order++) {
>> +	for (order = min_order; order <= max_order; order++) {
>>  
>>  		unsigned int slab_size = (unsigned int)PAGE_SIZE << order;
>>  		unsigned int rem;
>> @@ -4139,7 +4134,7 @@ static inline int calculate_order(unsigned int size)
>>  	unsigned int order;
>>  	unsigned int min_objects;
>>  	unsigned int max_objects;
>> -	unsigned int nr_cpus;
>> +	unsigned int min_order;
>>  
>>  	min_objects = slub_min_objects;
>>  	if (!min_objects) {
>> @@ -4152,7 +4147,7 @@ static inline int calculate_order(unsigned int size)
>>  		 * order on systems that appear larger than they are, and too
>>  		 * low order on systems that appear smaller than they are.
>>  		 */
>> -		nr_cpus = num_present_cpus();
>> +		unsigned int nr_cpus = num_present_cpus();
>>  		if (nr_cpus <= 1)
>>  			nr_cpus = nr_cpu_ids;
>>  		min_objects = 4 * (fls(nr_cpus) + 1);
>> @@ -4160,6 +4155,10 @@ static inline int calculate_order(unsigned int size)
>>  	max_objects = order_objects(slub_max_order, size);
>>  	min_objects = min(min_objects, max_objects);
>>  
>> +	min_order = max(slub_min_order, (unsigned int)get_order(min_objects * size));
>> +	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
>> +		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
>> +
>>  	/*
>>  	 * Attempt to find best configuration for a slab. This works by first
>>  	 * attempting to generate a layout with the best possible configuration and
>> @@ -4176,7 +4175,7 @@ static inline int calculate_order(unsigned int size)
>>  	 * long as at least single object fits within slub_max_order.
>>  	 */
>>  	for (unsigned int fraction = 16; fraction > 1; fraction /= 2) {
>> -		order = calc_slab_order(size, min_objects, slub_max_order,
>> +		order = calc_slab_order(size, min_order, slub_max_order,
>>  					fraction);
>>  		if (order <= slub_max_order)
>>  			return order;
>> -- 
>> 2.42.0
>> 
>>
Vlastimil Babka Sept. 22, 2023, 7 a.m. UTC | #6
On 9/16/23 03:28, Baoquan He wrote:
> On 09/08/23 at 04:53pm, Vlastimil Babka wrote:
>> @@ -4152,7 +4147,7 @@ static inline int calculate_order(unsigned int size)
>>  		 * order on systems that appear larger than they are, and too
>>  		 * low order on systems that appear smaller than they are.
>>  		 */
>> -		nr_cpus = num_present_cpus();
>> +		unsigned int nr_cpus = num_present_cpus();
>>  		if (nr_cpus <= 1)
>>  			nr_cpus = nr_cpu_ids;
>>  		min_objects = 4 * (fls(nr_cpus) + 1);
> 
> A minor concern: should we change 'min_objects' to be a local static
> to avoid the "if (!min_objects) {" code block every time? It's deducing
> the value from nr_cpus, so we may not need to do the calculation each time.

Maybe, although it's not a hot path. But we would have to make sure that
num_present_cpus() cannot change. Could it be e.g. low (1) very early when
we bootstrap the initial caches, but then get updated, so that at least most
of the caches reflect the real number of cpus? With a static we would create
everything with 1.
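
To make the hazard concrete, a hypothetical sketch of the cached-static
variant being discussed (kernel-style, not proposed code; the helper name is
made up):

/* Hypothetical only: freeze the CPU-derived default in a static. */
static unsigned int default_min_objects(void)
{
	static unsigned int cached;	/* 0 until first use */

	if (!cached) {
		unsigned int nr_cpus = num_present_cpus();

		if (nr_cpus <= 1)
			nr_cpus = nr_cpu_ids;
		/*
		 * Whatever the CPU counts report at the time of the first
		 * call is frozen here; if that happens while bootstrapping
		 * the initial caches, every later cache is sized from that
		 * early (possibly too low) value.
		 */
		cached = 4 * (fls(nr_cpus) + 1);
	}
	return cached;
}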

>> @@ -4160,6 +4155,10 @@ static inline int calculate_order(unsigned int size)
>>  	max_objects = order_objects(slub_max_order, size);
>>  	min_objects = min(min_objects, max_objects);
>>  
>> +	min_order = max(slub_min_order, (unsigned int)get_order(min_objects * size));
>> +	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
>> +		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
>> +
>>  	/*
>>  	 * Attempt to find best configuration for a slab. This works by first
>>  	 * attempting to generate a layout with the best possible configuration and
>> @@ -4176,7 +4175,7 @@ static inline int calculate_order(unsigned int size)
>>  	 * long as at least single object fits within slub_max_order.
>>  	 */
>>  	for (unsigned int fraction = 16; fraction > 1; fraction /= 2) {
>> -		order = calc_slab_order(size, min_objects, slub_max_order,
>> +		order = calc_slab_order(size, min_order, slub_max_order,
>>  					fraction);
>>  		if (order <= slub_max_order)
>>  			return order;
>> -- 
>> 2.42.0
>> 
>
Baoquan He Sept. 22, 2023, 7:29 a.m. UTC | #7
On 09/22/23 at 09:00am, Vlastimil Babka wrote:
> On 9/16/23 03:28, Baoquan He wrote:
> > On 09/08/23 at 04:53pm, Vlastimil Babka wrote:
> >> @@ -4152,7 +4147,7 @@ static inline int calculate_order(unsigned int size)
> >>  		 * order on systems that appear larger than they are, and too
> >>  		 * low order on systems that appear smaller than they are.
> >>  		 */
> >> -		nr_cpus = num_present_cpus();
> >> +		unsigned int nr_cpus = num_present_cpus();
> >>  		if (nr_cpus <= 1)
> >>  			nr_cpus = nr_cpu_ids;
> >>  		min_objects = 4 * (fls(nr_cpus) + 1);
> > 
> > A minor concern: should we change 'min_objects' to be a local static
> > to avoid the "if (!min_objects) {" code block every time? It's deducing
> > the value from nr_cpus, so we may not need to do the calculation each time.
> 
> Maybe, although it's not a hot path. But we would have to make sure that
> num_present_cpus() cannot change. Could it be e.g. low (1) very early when
> we bootstrap the initial caches, but then get updated, so that at least most
> of the caches reflect the real number of cpus? With a static we would create
> everything with 1.

Yeah, I was being silly and didn't think about it. We could check via
system_state, but it's not worth bothering with since it's not a hot path,
as you said. Sorry for the noise.

Patch

diff --git a/mm/slub.c b/mm/slub.c
index f04eb029d85a..1c91f72c7239 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4110,17 +4110,12 @@  static unsigned int slub_min_objects;
  * the smallest order which will fit the object.
  */
 static inline unsigned int calc_slab_order(unsigned int size,
-		unsigned int min_objects, unsigned int max_order,
+		unsigned int min_order, unsigned int max_order,
 		unsigned int fract_leftover)
 {
-	unsigned int min_order = slub_min_order;
 	unsigned int order;
 
-	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
-		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
-
-	for (order = max(min_order, (unsigned int)get_order(min_objects * size));
-			order <= max_order; order++) {
+	for (order = min_order; order <= max_order; order++) {
 
 		unsigned int slab_size = (unsigned int)PAGE_SIZE << order;
 		unsigned int rem;
@@ -4139,7 +4134,7 @@  static inline int calculate_order(unsigned int size)
 	unsigned int order;
 	unsigned int min_objects;
 	unsigned int max_objects;
-	unsigned int nr_cpus;
+	unsigned int min_order;
 
 	min_objects = slub_min_objects;
 	if (!min_objects) {
@@ -4152,7 +4147,7 @@  static inline int calculate_order(unsigned int size)
 		 * order on systems that appear larger than they are, and too
 		 * low order on systems that appear smaller than they are.
 		 */
-		nr_cpus = num_present_cpus();
+		unsigned int nr_cpus = num_present_cpus();
 		if (nr_cpus <= 1)
 			nr_cpus = nr_cpu_ids;
 		min_objects = 4 * (fls(nr_cpus) + 1);
@@ -4160,6 +4155,10 @@  static inline int calculate_order(unsigned int size)
 	max_objects = order_objects(slub_max_order, size);
 	min_objects = min(min_objects, max_objects);
 
+	min_order = max(slub_min_order, (unsigned int)get_order(min_objects * size));
+	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
+		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
+
 	/*
 	 * Attempt to find best configuration for a slab. This works by first
 	 * attempting to generate a layout with the best possible configuration and
@@ -4176,7 +4175,7 @@  static inline int calculate_order(unsigned int size)
 	 * long as at least single object fits within slub_max_order.
 	 */
 	for (unsigned int fraction = 16; fraction > 1; fraction /= 2) {
-		order = calc_slab_order(size, min_objects, slub_max_order,
+		order = calc_slab_order(size, min_order, slub_max_order,
 					fraction);
 		if (order <= slub_max_order)
 			return order;
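
For anyone who wants to play with the numbers, a small userspace mock of the
reworked entry logic above (a sketch under stated assumptions, not kernel
code; the MAX_OBJS_PER_PAGE clamp is omitted since it cannot trigger for
these sizes):

#include <stdio.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1U << PAGE_SHIFT)

static unsigned int fls_u(unsigned int x)
{
	return x ? 32 - (unsigned int)__builtin_clz(x) : 0;
}

/* smallest order whose slab size covers @bytes (bytes > 0) */
static unsigned int get_order_u(unsigned long bytes)
{
	unsigned int order = 0;

	while (((unsigned long)PAGE_SIZE << order) < bytes)
		order++;
	return order;
}

static unsigned int order_objects(unsigned int order, unsigned int size)
{
	return (PAGE_SIZE << order) / size;
}

int main(void)
{
	/* assumed tunables: defaults of slub_min_order = 0, slub_max_order = 3,
	 * slub_min_objects unset, and 16 present CPUs */
	const unsigned int slub_min_order = 0, slub_max_order = 3, nr_cpus = 16;
	const unsigned int sizes[] = { 256, 4096 };

	for (unsigned int i = 0; i < 2; i++) {
		unsigned int size = sizes[i];
		unsigned int min_objects = 4 * (fls_u(nr_cpus) + 1);
		unsigned int max_objects = order_objects(slub_max_order, size);
		unsigned int min_order;

		if (min_objects > max_objects)
			min_objects = max_objects;
		min_order = get_order_u((unsigned long)min_objects * size);
		if (min_order < slub_min_order)
			min_order = slub_min_order;

		/* prints: size=256 -> min_order=1, size=4096 -> min_order=3 */
		printf("size=%u min_objects=%u min_order=%u\n",
		       size, min_objects, min_order);
	}
	return 0;
}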