mbox series

[v3,0/3] mm: kmemleak: Use a memory pool for kmemleak object allocations

Message ID 20190812160642.52134-1-catalin.marinas@arm.com (mailing list archive)
Headers show
Series mm: kmemleak: Use a memory pool for kmemleak object allocations | expand

Message

Catalin Marinas Aug. 12, 2019, 4:06 p.m. UTC
Following the discussions on v2 of this patch(set) [1], this series
takes slightly different approach:

- it implements its own simple memory pool that does not rely on the
  slab allocator

- drops the early log buffer logic entirely since it can now allocate
  metadata from the memory pool directly before kmemleak is fully
  initialised

- CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
  CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE

- moves the kmemleak_init() call earlier (mm_init())

- to avoid a separate memory pool for struct scan_area, it makes the
  tool robust when such allocations fail as scan areas are rather an
  optimisation

[1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com

Catalin Marinas (3):
  mm: kmemleak: Make the tool tolerant to struct scan_area allocation
    failures
  mm: kmemleak: Simple memory allocation pool for kmemleak objects
  mm: kmemleak: Use the memory pool for early allocations

 init/main.c       |   2 +-
 lib/Kconfig.debug |  11 +-
 mm/kmemleak.c     | 325 ++++++++++++----------------------------------
 3 files changed, 91 insertions(+), 247 deletions(-)

Comments

Andrew Morton Aug. 12, 2019, 9:07 p.m. UTC | #1
On Mon, 12 Aug 2019 17:06:39 +0100 Catalin Marinas <catalin.marinas@arm.com> wrote:

> Following the discussions on v2 of this patch(set) [1], this series
> takes slightly different approach:
> 
> - it implements its own simple memory pool that does not rely on the
>   slab allocator
> 
> - drops the early log buffer logic entirely since it can now allocate
>   metadata from the memory pool directly before kmemleak is fully
>   initialised
> 
> - CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
>   CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE
> 
> - moves the kmemleak_init() call earlier (mm_init())
> 
> - to avoid a separate memory pool for struct scan_area, it makes the
>   tool robust when such allocations fail as scan areas are rather an
>   optimisation
> 
> [1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com

Using the term "memory pool" is a little unfortunate, but better than
using "mempool"!

The changelog doesn't answer the very first question: why not use
mempools.  Please send along a paragraph which explains this decision.
Catalin Marinas Aug. 13, 2019, 9:40 a.m. UTC | #2
On Mon, Aug 12, 2019 at 02:07:30PM -0700, Andrew Morton wrote:
> On Mon, 12 Aug 2019 17:06:39 +0100 Catalin Marinas <catalin.marinas@arm.com> wrote:
> 
> > Following the discussions on v2 of this patch(set) [1], this series
> > takes slightly different approach:
> > 
> > - it implements its own simple memory pool that does not rely on the
> >   slab allocator
> > 
> > - drops the early log buffer logic entirely since it can now allocate
> >   metadata from the memory pool directly before kmemleak is fully
> >   initialised
> > 
> > - CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
> >   CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE
> > 
> > - moves the kmemleak_init() call earlier (mm_init())
> > 
> > - to avoid a separate memory pool for struct scan_area, it makes the
> >   tool robust when such allocations fail as scan areas are rather an
> >   optimisation
> > 
> > [1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com
> 
> Using the term "memory pool" is a little unfortunate, but better than
> using "mempool"!

I agree, it could have been more inspired. What about "metadata pool"
(together with function name updates etc.)? Happy to send a v4.

> The changelog doesn't answer the very first question: why not use
> mempools.  Please send along a paragraph which explains this decision.

I posted one in reply to the patch where the changelog should be
updated.

Thanks.
Noam Stolero Dec. 3, 2019, 3:51 p.m. UTC | #3
On 8/12/2019 7:06 PM, Catalin Marinas wrote:

Following the discussions on v2 of this patch(set) [1], this series
takes slightly different approach:

- it implements its own simple memory pool that does not rely on the
  slab allocator

- drops the early log buffer logic entirely since it can now allocate
  metadata from the memory pool directly before kmemleak is fully
  initialised

- CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
  CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE

- moves the kmemleak_init() call earlier (mm_init())

- to avoid a separate memory pool for struct scan_area, it makes the
  tool robust when such allocations fail as scan areas are rather an
  optimisation

[1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com

Catalin Marinas (3):
  mm: kmemleak: Make the tool tolerant to struct scan_area allocation
    failures
  mm: kmemleak: Simple memory allocation pool for kmemleak objects
  mm: kmemleak: Use the memory pool for early allocations

 init/main.c       |   2 +-
 lib/Kconfig.debug |  11 +-
 mm/kmemleak.c     | 325 ++++++++++++----------------------------------
 3 files changed, 91 insertions(+), 247 deletions(-)



Hi Catalin,

We observe severe degradation in our network performance affecting all of our NICs.
The degradation is directly linked to this patch.

What we run:
Simple Iperf TCP loopback with 8 streams on ConnectX5-100GbE.
Since it's a loopback test, traffic goes from the socket through the IP
stack and back to the socket, without going through the NIC driver.

What we observe:
Throughput performance:
- Kernel 5.3GA - Throughput was 230Gbps
- Kernel 5.4-rc1 and later - Throughput is 50Gbps

CPU utilization-wise:
Using perf we see much higher CPU utilization with kmem related functions:

Function                  | Kernel 5.3GA | Kernel 5.4-rc1 and later

--------------------------|--------------|-------------------------

__kfree_skb               | 3.4%         | 11.0%

kmem_cache_free           | 0.3%         | 10.2%

__alloc_skb               | 2.2%         | 26.0%

queued_spin_lock_slowpath | 1.3%         | 26.3%

delete_object_full        | Not used     | 18.0%

'delete_object_full()' function seems like the one which starts the slower flow.
One of the conditions causing this function to kick into action is 'kmemleak_free_enabled' flag,
which was changed to enabled by default by your series.
Reverting discussed series restore the performance almost completely.

Can you help shed light on the subject?
Catalin Marinas Dec. 3, 2019, 4:08 p.m. UTC | #4
On Tue, Dec 03, 2019 at 03:51:50PM +0000, Noam Stolero wrote:
> On 8/12/2019 7:06 PM, Catalin Marinas wrote:
> >     Following the discussions on v2 of this patch(set) [1], this series
> >     takes slightly different approach:
> > 
> >     - it implements its own simple memory pool that does not rely on the
> >       slab allocator
> > 
> >     - drops the early log buffer logic entirely since it can now allocate
> >       metadata from the memory pool directly before kmemleak is fully
> >       initialised
> > 
> >     - CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
> >       CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE
> > 
> >     - moves the kmemleak_init() call earlier (mm_init())
> > 
> >     - to avoid a separate memory pool for struct scan_area, it makes the
> >       tool robust when such allocations fail as scan areas are rather an
> >       optimisation
> > 
> >     [1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com
> > 
> >     Catalin Marinas (3):
> >       mm: kmemleak: Make the tool tolerant to struct scan_area allocation
> >         failures
> >       mm: kmemleak: Simple memory allocation pool for kmemleak objects
> >       mm: kmemleak: Use the memory pool for early allocations
> > 
> >      init/main.c       |   2 +-
> >      lib/Kconfig.debug |  11 +-
> >      mm/kmemleak.c     | 325 ++++++++++++----------------------------------
> >      3 files changed, 91 insertions(+), 247 deletions(-)
> 
> We observe severe degradation in our network performance affecting all
> of our NICs. The degradation is directly linked to this patch.
> 
> What we run:
> Simple Iperf TCP loopback with 8 streams on ConnectX5-100GbE.
> Since it's a loopback test, traffic goes from the socket through the IP
> stack and back to the socket, without going through the NIC driver.

Something similar was reported before. Can you try commit 2abd839aa7e6
("kmemleak: Do not corrupt the object_list during clean-up") and see if
it fixes the problem for you? It was merged in 5.4-rc4.
Noam Stolero Dec. 5, 2019, 4:16 p.m. UTC | #5
On 12/3/2019 6:08 PM, Catalin Marinas wrote:
> On Tue, Dec 03, 2019 at 03:51:50PM +0000, Noam Stolero wrote:
>> On 8/12/2019 7:06 PM, Catalin Marinas wrote:
>>>      Following the discussions on v2 of this patch(set) [1], this series
>>>      takes slightly different approach:
>>>
>>>      - it implements its own simple memory pool that does not rely on the
>>>        slab allocator
>>>
>>>      - drops the early log buffer logic entirely since it can now allocate
>>>        metadata from the memory pool directly before kmemleak is fully
>>>        initialised
>>>
>>>      - CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
>>>        CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE
>>>
>>>      - moves the kmemleak_init() call earlier (mm_init())
>>>
>>>      - to avoid a separate memory pool for struct scan_area, it makes the
>>>        tool robust when such allocations fail as scan areas are rather an
>>>        optimisation
>>>
>>>      [1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com
>>>
>>>      Catalin Marinas (3):
>>>        mm: kmemleak: Make the tool tolerant to struct scan_area allocation
>>>          failures
>>>        mm: kmemleak: Simple memory allocation pool for kmemleak objects
>>>        mm: kmemleak: Use the memory pool for early allocations
>>>
>>>       init/main.c       |   2 +-
>>>       lib/Kconfig.debug |  11 +-
>>>       mm/kmemleak.c     | 325 ++++++++++++----------------------------------
>>>       3 files changed, 91 insertions(+), 247 deletions(-)
>> We observe severe degradation in our network performance affecting all
>> of our NICs. The degradation is directly linked to this patch.
>>
>> What we run:
>> Simple Iperf TCP loopback with 8 streams on ConnectX5-100GbE.
>> Since it's a loopback test, traffic goes from the socket through the IP
>> stack and back to the socket, without going through the NIC driver.
> Something similar was reported before. Can you try commit 2abd839aa7e6
> ("kmemleak: Do not corrupt the object_list during clean-up") and see if
> it fixes the problem for you? It was merged in 5.4-rc4.

I've tested this commit, as well as 5.4.0-rc6 and 5.4.0 GA versions. We 
don't see the issue I've mentioned.

Thank you for the quick response and the assistance.