[v5,0/7] mm, slab: Make kmalloc_info[] contain all types of names

Message ID 20190916144558.27282-1-lpf.vector@gmail.com (mailing list archive)

Message

Pengfei Li Sept. 16, 2019, 2:45 p.m. UTC
Changes in v5
--
1. patch 1/7:
    - rename SET_KMALLOC_SIZE to INIT_KMALLOC_INFO
2. patch 5/7:
    - fix build errors (Reported-by: kbuild test robot)
    - make all_kmalloc_info[] static (Reported-by: kbuild test robot)
3. patch 6/7:
    - for robustness, check that kmalloc_cache is !NULL in
      new_kmalloc_cache()
4. add ack tag from David Rientjes

Changes in v4
--
1. [old] abandon patch 4/4
2. [new] patch 4/7:
    - return ZERO_SIZE_ALLOC instead of 0 for zero sized requests
3. [new] patch 5/7:
    - reorder kmalloc_info[], kmalloc_caches[] (in order of size)
    - hard to split, so the patch is slightly larger
4. [new] patch 6/7:
    - initialize kmalloc_cache[] with the same size but different
      types
5. [new] patch 7/7:
    - modify kmalloc_caches[type][idx] to kmalloc_caches[idx][type]

Patches 4-7 are newly added; see their commit messages for more
information.
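
To illustrate item 5 above, here is a minimal sketch of how the two index
orders map to flat element offsets in a row-major C array. It is not taken
from the patch: the helper names and the size-class count (14) are
hypothetical, and only the index order comes from the changelog.

```c
#include <assert.h>
#include <stddef.h>

#define NR_TYPES_SKETCH 3   /* NORMAL, RECLAIM, DMA */
#define NR_SIZES_SKETCH 14  /* hypothetical number of size classes */

/* Element offset of [type][idx] in an array declared [NR_TYPES][NR_SIZES]. */
static size_t offset_type_major(size_t type, size_t idx)
{
	assert(type < NR_TYPES_SKETCH && idx < NR_SIZES_SKETCH);
	return type * NR_SIZES_SKETCH + idx;
}

/* Element offset of [idx][type] in an array declared [NR_SIZES][NR_TYPES]. */
static size_t offset_size_major(size_t idx, size_t type)
{
	assert(type < NR_TYPES_SKETCH && idx < NR_SIZES_SKETCH);
	return idx * NR_TYPES_SKETCH + type;
}
```

With [type][idx], two caches of the same size but different types sit
NR_SIZES_SKETCH elements apart; with [idx][type] they are adjacent. Whether
that adjacency is what patch 7/7 is after is best checked against its commit
message.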

Changes in v3
--
1. restore __initconst (patch 1/4)
2. rename patch 3/4
3. add more clarification for patch 4/4
4. add ack tag from Roman Gushchin

Changes in v2
--
1. remove __initconst (patch 1/5)
2. squash patch 2/5
3. add ack tag from Vlastimil Babka


There are three types of kmalloc caches: KMALLOC_NORMAL, KMALLOC_RECLAIM
and KMALLOC_DMA.

The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
generated by kmalloc_cache_name().

Patch 1 predefines the names of all types of kmalloc to save
the time spent dynamically generating names.
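
As a rough userspace sketch of the idea (not the code in patch 1/7), the
names for all three types can be built at compile time by string-literal
concatenation, so nothing has to be formatted at boot. The struct, macro and
function names ending in _sketch or _SKETCH are hypothetical, and the name
prefixes are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Cache types named in the cover letter. */
enum kmalloc_cache_type {
	KMALLOC_NORMAL = 0,
	KMALLOC_RECLAIM,
	KMALLOC_DMA,
	NR_KMALLOC_TYPES
};

/* Hypothetical per-size entry: one predefined name per cache type. */
struct kmalloc_info_sketch {
	const char *name[NR_KMALLOC_TYPES];
	unsigned int size;
};

/*
 * Hypothetical initializer macro. Adjacent string literals are joined by
 * the compiler, so all three names exist as constants in the binary.
 * The prefixes used here are assumptions, not taken from the patch.
 */
#define INIT_INFO_SKETCH(sz, suffix)                          \
	{                                                     \
		.name[KMALLOC_NORMAL]  = "kmalloc-" #suffix,     \
		.name[KMALLOC_RECLAIM] = "kmalloc-rcl-" #suffix, \
		.name[KMALLOC_DMA]     = "dma-kmalloc-" #suffix, \
		.size = (sz),                                 \
	}

static const struct kmalloc_info_sketch info_sketch[] = {
	INIT_INFO_SKETCH(8, 8),
	INIT_INFO_SKETCH(96, 96),
	INIT_INFO_SKETCH(1024, 1k),
};

/* Return the predefined name, or NULL if idx is out of range. */
static const char *cache_name_sketch(size_t idx, enum kmalloc_cache_type type)
{
	assert(type < NR_KMALLOC_TYPES);
	if (idx >= sizeof(info_sketch) / sizeof(info_sketch[0]))
		return NULL;
	return info_sketch[idx].name[type];
}
```

The trade-off in this sketch is a larger constant table in exchange for no
string formatting when the caches are created.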

These changes make sense, and the time spent by new_kmalloc_cache()
has been reduced by approximately 36.3%.

                         Time spent by new_kmalloc_cache()
                                  (CPU cycles)
5.3-rc7                              66264
5.3-rc7+patch_1-3                    42188
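
The 36.3% figure follows from the two cycle counts above:
66264 - 42188 = 24076 cycles saved, and 24076 / 66264 is about 0.363.
A small helper (the name percent_reduction is hypothetical) to redo the
arithmetic:

```c
#include <assert.h>

/* Percentage reduction going from `before` to `after`. */
static double percent_reduction(unsigned long before, unsigned long after)
{
	assert(before > 0);
	return 100.0 * ((double)before - (double)after) / (double)before;
}
```

Calling percent_reduction(66264, 42188) gives a value between 36.3 and 36.4.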


bloat-o-meter
--
$ ./scripts/bloat-o-meter vmlinux_5.3 vmlinux_5.3+patch_1-7
add/remove: 1/2 grow/shrink: 6/65 up/down: 872/-1574 (-702)
Function                                     old     new   delta
all_kmalloc_info                               -     832    +832
crypto_create_tfm                            211     225     +14
ieee80211_key_alloc                         1159    1169     +10
nl80211_parse_sched_scan                    2787    2795      +8
tg3_self_test                               4255    4259      +4
find_get_context.isra                        634     637      +3
sd_probe                                     947     948      +1
nf_queue                                     671     670      -1
trace_parser_get_init                         71      69      -2
pkcs7_verify.cold                            318     316      -2
units                                        323     320      -3
nl80211_set_reg                              642     639      -3
pkcs7_verify                                1503    1495      -8
i915_sw_fence_await_dma_fence                445     437      -8
nla_strdup                                   143     134      -9
kmalloc_slab                                 102      93      -9
xhci_alloc_tt_info                           349     338     -11
xhci_segment_alloc                           303     289     -14
xhci_alloc_container_ctx                     221     207     -14
xfrm_policy_alloc                            277     263     -14
selinux_sk_alloc_security                    119     105     -14
sdev_evt_send_simple                         124     110     -14
sdev_evt_alloc                                85      71     -14
sbitmap_queue_init_node                      424     410     -14
regulatory_hint_found_beacon                 400     386     -14
nf_ct_tmpl_alloc                              91      77     -14
gss_create_cred                              146     132     -14
drm_flip_work_allocate_task                   76      62     -14
cfg80211_stop_iface                          266     252     -14
cfg80211_sinfo_alloc_tid_stats                83      69     -14
cfg80211_port_authorized                     218     204     -14
cfg80211_ibss_joined                         341     327     -14
call_usermodehelper_setup                    155     141     -14
bpf_prog_alloc_no_stats                      188     174     -14
blk_alloc_flush_queue                        197     183     -14
bdi_alloc_node                               201     187     -14
_netlbl_catmap_getnode                       253     239     -14
__igmp_group_dropped                         629     615     -14
____ip_mc_inc_group                          481     467     -14
xhci_alloc_command                           221     205     -16
audit_log_d_path                             204     188     -16
xprt_switch_alloc                            145     128     -17
xhci_ring_alloc                              378     361     -17
xhci_mem_init                               3673    3656     -17
xhci_alloc_virt_device                       505     488     -17
xhci_alloc_stream_info                       727     710     -17
tcp_sendmsg_locked                          3129    3112     -17
tcp_md5_do_add                               783     766     -17
tcp_fastopen_defer_connect                   279     262     -17
sr_read_tochdr.isra                          260     243     -17
sr_read_tocentry.isra                        337     320     -17
sr_is_xa                                     385     368     -17
sr_get_mcn                                   269     252     -17
scsi_probe_and_add_lun                      2947    2930     -17
ring_buffer_read_prepare                     103      86     -17
request_firmware_nowait                      405     388     -17
ohci_urb_enqueue                            3185    3168     -17
nfs_alloc_seqid                               96      79     -17
nfs4_get_state_owner                        1049    1032     -17
nfs4_do_close                                587     570     -17
mempool_create_node                          173     156     -17
ip6_setup_cork                              1030    1013     -17
ida_alloc_range                              951     934     -17
gss_import_sec_context                       187     170     -17
dma_pool_alloc                               419     402     -17
devres_open_group                            223     206     -17
cfg80211_parse_mbssid_data                  2406    2389     -17
ip_setup_cork                                374     354     -20
kmalloc_caches                               336     312     -24
__i915_sw_fence_await_sw_fence               429     405     -24
kmalloc_cache_name                            57       -     -57
create_kmalloc_caches                        270     195     -75
new_kmalloc_cache                            112       -    -112
kmalloc_info                                 432       8    -424
Total: Before=14789209, After=14788507, chg -0.00%

Pengfei Li (7):
  mm, slab: Make kmalloc_info[] contain all types of names
  mm, slab: Remove unused kmalloc_size()
  mm, slab_common: Use enum kmalloc_cache_type to iterate over kmalloc
    caches
  mm, slab: Return ZERO_SIZE_ALLOC for zero sized kmalloc requests
  mm, slab_common: Make kmalloc_caches[] start at size KMALLOC_MIN_SIZE
  mm, slab_common: Initialize the same size of kmalloc_caches[]
  mm, slab_common: Modify kmalloc_caches[type][idx] to
    kmalloc_caches[idx][type]

 include/linux/slab.h | 136 ++++++++++++++------------
 mm/slab.c            |  11 ++-
 mm/slab.h            |  10 +-
 mm/slab_common.c     | 227 ++++++++++++++++++-------------------------
 mm/slub.c            |  12 +--
 5 files changed, 187 insertions(+), 209 deletions(-)

Comments

Christoph Lameter (Ampere) Sept. 16, 2019, 4:04 p.m. UTC | #1
On Mon, 16 Sep 2019, Pengfei Li wrote:

> The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
> but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
> generated by kmalloc_cache_name().
>
> Patch1 predefines the names of all types of kmalloc to save
> the time spent dynamically generating names.
>
> These changes make sense, and the time spent by new_kmalloc_cache()
> has been reduced by approximately 36.3%.

This is time spent during boot and does not affect later system
performance.

>                          Time spent by new_kmalloc_cache()
>                                   (CPU cycles)
> 5.3-rc7                              66264
> 5.3-rc7+patch_1-3                    42188

Ok. 15k cycles during boot saved. So we save 5 microseconds during bootup?

The current approach was created with the view on future setups allowing a
dynamic configuration of kmalloc caches based on need. I.e. ZONE_DMA may
not be needed once the floppy driver no longer makes use of it.

Pengfei Li Sept. 17, 2019, 1:11 p.m. UTC | #2
On Tue, Sep 17, 2019 at 12:04 AM Christopher Lameter <cl@linux.com> wrote:
>
> On Mon, 16 Sep 2019, Pengfei Li wrote:
>
> > The name of KMALLOC_NORMAL is contained in kmalloc_info[].name,
> > but the names of KMALLOC_RECLAIM and KMALLOC_DMA are dynamically
> > generated by kmalloc_cache_name().
> >
> > Patch1 predefines the names of all types of kmalloc to save
> > the time spent dynamically generating names.
> >
> > These changes make sense, and the time spent by new_kmalloc_cache()
> > has been reduced by approximately 36.3%.
>
> This is time spent during boot and does not affect later system
> performance.
>

Yes.

> >                          Time spent by new_kmalloc_cache()
> >                                   (CPU cycles)
> > 5.3-rc7                              66264
> > 5.3-rc7+patch_1-3                    42188
>
> Ok. 15k cycles during boot saved. So we save 5 microseconds during bootup?
>

Yes, this is a very small benefit.

> The current approach was created with the view on future setups allowing a
> dynamic configuration of kmalloc caches based on need. I.e. ZONE_DMA may
> not be needed once the floppy driver no longer makes use of it.

Yes. With this in mind, I also used #ifdef for INIT_KMALLOC_INFO.