mbox series

[v3,0/5] arm64: Dereference CPU operations indirectly

Message ID 20200212004351.66576-1-gshan@redhat.com (mailing list archive)
Headers show
Series arm64: Dereference CPU operations indirectly | expand

Message

Gavin Shan Feb. 12, 2020, 12:43 a.m. UTC
The subject perhaps can't precisely indicate what this series does, but
keep it same as before for consistency.

In current implementation, an array (@cpu_ops[NR_CPUS]) is maintained
to dereference the CPU operations. 2KB memory are consumed when NR_CPUS
is configured to 256. It's too much than what I expected. This series
reworks the implementation to dereference the CPU operations indirectly
by using the index to the CPU operations array, so that less memory (4
bytes) will be consumed for the same purpose. The optimization bases on
the assumption: these CPU operations aren't dereferenced in hot path.
Also, we have only one valid CPU enablement method for all CPUs.

PATCH[1/5] isn't too much relevant, to declare ACPI parking protocol only
when CONFIG_ARM64_ACPI_PARKING_PROTOCOL has been enabled. PATCH[2/5]
renames cpu_read_ops() to init_cpu_ops(), which is obviously more precise
because it's initializing the CPU operations. PATCH[3/5] introduces
get_cpu_ops(), preparing for dereferencing CPU operations indirectly.
PATCH[4/5] removes the CPU operations deferencing array and replaces it
with an 4-bytes variable with the assumption: all CPUs should have same
enablement method. PATCH[5/5] removes the argument of get_cpu_ops() as
it's useless.

Changelog
=========
v3:
   * Assume all CPUs have same enablement method. With this, the used
     memory is further squeezed from 64 bytes to 4 bytes (Lorenzo Pieralisi)
   * Add PATCH[5/5] to remove argument of get_cpu_ops()  (Gavin Shan)
v2:
   * Pack 4 CPUs' indexes into one byte. 64 bytes are consumed in order
     to get the CPU operations                            (Robin Murphy)
   * Use ARRAY_SIZE() to iterate @cpu_ops[]               (Robin Murphy)
   * Make index-0 valid                                   (Robin Murphy)

Gavin Shan (5):
  arm64: Declare ACPI parking protocol CPU operation if needed
  arm64: Rename cpu_read_ops() to init_cpu_ops()
  arm64: Introduce get_cpu_ops() helper function
  arm64: Remove CPU operations dereferencing array
  arm64: Remove argument @cpu of get_cpu_ops()

 arch/arm64/include/asm/cpu_ops.h |  8 ++--
 arch/arm64/kernel/cpu_ops.c      | 69 +++++++++++++++++++-------------
 arch/arm64/kernel/cpuidle.c      | 10 ++---
 arch/arm64/kernel/setup.c        |  8 ++--
 arch/arm64/kernel/smp.c          | 60 ++++++++++++++++++---------
 5 files changed, 95 insertions(+), 60 deletions(-)

Comments

Gavin Shan Feb. 20, 2020, 9:46 p.m. UTC | #1
On 2/12/20 11:43 AM, Gavin Shan wrote:
> The subject perhaps can't precisely indicate what this series does, but
> keep it same as before for consistency.
> 
> In current implementation, an array (@cpu_ops[NR_CPUS]) is maintained
> to dereference the CPU operations. 2KB memory are consumed when NR_CPUS
> is configured to 256. It's too much than what I expected. This series
> reworks the implementation to dereference the CPU operations indirectly
> by using the index to the CPU operations array, so that less memory (4
> bytes) will be consumed for the same purpose. The optimization bases on
> the assumption: these CPU operations aren't dereferenced in hot path.
> Also, we have only one valid CPU enablement method for all CPUs.
> 
> PATCH[1/5] isn't too much relevant, to declare ACPI parking protocol only
> when CONFIG_ARM64_ACPI_PARKING_PROTOCOL has been enabled. PATCH[2/5]
> renames cpu_read_ops() to init_cpu_ops(), which is obviously more precise
> because it's initializing the CPU operations. PATCH[3/5] introduces
> get_cpu_ops(), preparing for dereferencing CPU operations indirectly.
> PATCH[4/5] removes the CPU operations deferencing array and replaces it
> with an 4-bytes variable with the assumption: all CPUs should have same
> enablement method. PATCH[5/5] removes the argument of get_cpu_ops() as
> it's useless.
> 
> Changelog
> =========
> v3:
>     * Assume all CPUs have same enablement method. With this, the used
>       memory is further squeezed from 64 bytes to 4 bytes (Lorenzo Pieralisi)
>     * Add PATCH[5/5] to remove argument of get_cpu_ops()  (Gavin Shan)
> v2:
>     * Pack 4 CPUs' indexes into one byte. 64 bytes are consumed in order
>       to get the CPU operations                            (Robin Murphy)
>     * Use ARRAY_SIZE() to iterate @cpu_ops[]               (Robin Murphy)
>     * Make index-0 valid                                   (Robin Murphy)
> 

Lorenzo/Robin, could you please give it quick scan when you have time?
Please let me know if you have more comments :)

Thanks,
Gavin

> Gavin Shan (5):
>    arm64: Declare ACPI parking protocol CPU operation if needed
>    arm64: Rename cpu_read_ops() to init_cpu_ops()
>    arm64: Introduce get_cpu_ops() helper function
>    arm64: Remove CPU operations dereferencing array
>    arm64: Remove argument @cpu of get_cpu_ops()
> 
>   arch/arm64/include/asm/cpu_ops.h |  8 ++--
>   arch/arm64/kernel/cpu_ops.c      | 69 +++++++++++++++++++-------------
>   arch/arm64/kernel/cpuidle.c      | 10 ++---
>   arch/arm64/kernel/setup.c        |  8 ++--
>   arch/arm64/kernel/smp.c          | 60 ++++++++++++++++++---------
>   5 files changed, 95 insertions(+), 60 deletions(-)
>