[00/15] stackdepot: allow evicting stack traces

Message ID: cover.1693328501.git.andreyknvl@google.com

Message

andrey.konovalov@linux.dev Aug. 29, 2023, 5:11 p.m. UTC
From: Andrey Konovalov <andreyknvl@google.com>

Currently, the stack depot grows indefinitely until it reaches its
capacity. Once that happens, the stack depot stops saving new stack
traces.

This creates a problem for using the stack depot for in-field testing
and in production.

For such uses, an ideal stack trace storage should:

1. Allow saving fresh stack traces on systems with a long uptime while
   limiting the amount of memory used to store the traces;
2. Have a low performance impact.

Implementing #1 in the stack depot is impossible with the current
keep-forever approach. This series aims to address that. Issue #2 is
left to be addressed in a future series.

This series changes the stack depot implementation to allow evicting
unneeded stack traces from the stack depot. The users of the stack depot
can do that via a new stack_depot_evict API.
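
For illustration, here is a minimal sketch of how a user might pair
saving and evicting. The signature of stack_depot_evict is assumed here
to take the handle returned by stack_depot_save; treat this as a sketch,
not the exact API from the series:

#include <linux/stackdepot.h>
#include <linux/stacktrace.h>

static depot_stack_handle_t save_alloc_stack(gfp_t flags)
{
	unsigned long entries[64];
	unsigned int nr_entries;

	/* Capture the current stack trace... */
	nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 0);
	/* ...and deduplicate it into the stack depot, getting a handle. */
	return stack_depot_save(entries, nr_entries, flags);
}

static void drop_alloc_stack(depot_stack_handle_t handle)
{
	/* Assumed: void stack_depot_evict(depot_stack_handle_t handle). */
	if (handle)
		stack_depot_evict(handle);
}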

Internal changes to the stack depot code include (see the sketch after
this list):

1. Storing stack traces in 32-frame-sized slots (vs precisely-sized slots
   in the current implementation);
2. Keeping available slots in a freelist (vs keeping an offset to the next
   free slot);
3. Using a read/write lock for synchronization (vs a lock-free approach
   combined with a spinlock).
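
The following is an illustrative sketch of the reworked record layout;
the names and exact fields are assumptions for this cover letter, not
the actual lib/stackdepot.c definitions from the series:

#define DEPOT_STACK_MAX_FRAMES	32	/* item 1: fixed slot size */

struct stack_record_sketch {
	struct list_head free_list;	/* item 2: links free slots */
	u32 hash;			/* hash of the stack trace */
	u32 size;			/* number of frames actually used */
	unsigned long entries[DEPOT_STACK_MAX_FRAMES];
};

/* Item 3: a single read/write lock protects buckets and the freelist. */
static DEFINE_RWLOCK(pool_rwlock_sketch);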

This series also integrates the eviction functionality into the tag-based
KASAN modes. (I will investigate integrating it into the Generic mode as
well in the following iterations of this series.)
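
As a simplified approximation of what that integration might look like
(the actual change is in the last patch; the names here follow
mm/kasan/tags.c only loosely):

static void save_stack_ring_entry(struct kasan_stack_ring_entry *entry,
				  depot_stack_handle_t new_stack)
{
	depot_stack_handle_t old_stack = entry->stack;

	entry->stack = new_stack;

	/* The overwritten record is unreachable now; release it. */
	if (old_stack)
		stack_depot_evict(old_stack);
}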

Despite wasting some space on rounding up the size of each stack record
to 32 frames, with this change, the tag-based KASAN modes end up
consuming ~5% less memory in the stack depot during boot (with the default
stack ring size of 32k entries). The reason for this is the eviction of
irrelevant stack traces from the stack depot, which frees up space for
other stack traces.

For other tools that heavily rely on the stack depot, like Generic KASAN
and KMSAN, this change leads to the stack depot capacity being reached
sooner than before. However, as these tools are mainly used in fuzzing
scenarios where the kernel is frequently rebooted, this outcome should
be acceptable.

There is no measurable boot time performance impact of these changes for
KASAN on x86-64. I haven't done any tests for the arm64 modes (the stack
depot without performance optimizations is not suitable for the intended
use of those modes anyway), but I expect a similar result. Obtaining and
copying stack trace frames when saving them into the stack depot is what
takes the most time.

This series does not yet provide a way to configure the maximum size of
the stack depot externally (e.g. via a command-line parameter). This will
either be added in the following iterations of this series (if the
approach used here gets approval) or will be added together with the
performance improvement changes.

Andrey Konovalov (15):
  stackdepot: check disabled flag when fetching
  stackdepot: simplify __stack_depot_save
  stackdepot: drop valid bit from handles
  stackdepot: add depot_fetch_stack helper
  stackdepot: use fixed-sized slots for stack records
  stackdepot: fix and clean-up atomic annotations
  stackdepot: rework helpers for depot_alloc_stack
  stackdepot: rename next_pool_required to new_pool_required
  stackdepot: store next pool pointer in new_pool
  stackdepot: store free stack records in a freelist
  stackdepot: use read/write lock
  stackdepot: add refcount for records
  stackdepot: add backwards links to hash table buckets
  stackdepot: allow users to evict stack traces
  kasan: use stack_depot_evict for tag-based modes

 include/linux/stackdepot.h |  11 ++
 lib/stackdepot.c           | 361 ++++++++++++++++++++++++-------------
 mm/kasan/tags.c            |   7 +-
 3 files changed, 249 insertions(+), 130 deletions(-)

Comments

Vlastimil Babka Aug. 30, 2023, 7:46 a.m. UTC | #1
On 8/29/23 19:11, andrey.konovalov@linux.dev wrote:
> From: Andrey Konovalov <andreyknvl@google.com>
> 
> Currently, the stack depot grows indefinitely until it reaches its
> capacity. Once that happens, the stack depot stops saving new stack
> traces.
> 
> This creates a problem for using the stack depot for in-field testing
> and in production.
> 
> For such uses, an ideal stack trace storage should:
> 
> 1. Allow saving fresh stack traces on systems with a long uptime while
>    limiting the amount of memory used to store the traces;
> 2. Have a low performance impact.

I wonder if there's also another thing to consider for the future:

3. With the number of stackdepot users increasing, each with a set of
stacks distinct from the others', would it make sense to create a separate
"storage instance" for each user instead of putting everything in a single
shared one?

In any case, eviction support is a good development, thanks!

Andrey Konovalov Sept. 4, 2023, 6:45 p.m. UTC | #2
On Wed, Aug 30, 2023 at 9:46 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>
> I wonder if there's also another thing to consider for the future:
>
> 3. With the number of stackdepot users increasing, each with a set of
> stacks distinct from the others', would it make sense to create a separate
> "storage instance" for each user instead of putting everything in a single
> shared one?

This shouldn't be hard to implement. However, do you see any
particular use cases for this?

One thing that comes to mind is that the users will then be able to
create/destroy stack depot instances when required. But I don't know
if any of the users need this: so far they all seem to require stack
depot throughout the whole lifetime of the system.

> In any case, eviction support is a good development, thanks!

Thank you!
Kuan-Ying Lee (李冠穎) Sept. 5, 2023, 2:48 a.m. UTC | #3
On Mon, 2023-09-04 at 20:45 +0200, Andrey Konovalov wrote:
> On Wed, Aug 30, 2023 at 9:46 AM Vlastimil Babka <vbabka@suse.cz>
> wrote:
> > 
> > I wonder if there's also another thing to consider for the future:
> > 
> > 3. With the number of stackdepot users increasing, each with a set of
> > stacks distinct from the others', would it make sense to create a
> > separate "storage instance" for each user instead of putting everything
> > in a single shared one?
> 
> This shouldn't be hard to implement. However, do you see any
> particular use cases for this?
> 
> One thing that comes to mind is that the users will then be able to
> create/destroy stack depot instances when required. But I don't know
> if any of the users need this: so far they all seem to require stack
> depot throughout the whole lifetime of the system.
> 
Maybe we can use eviction in page_owner and slub_debug
(SLAB_STORE_USER).

After we update page_owner->handle, we could evict the previous
handle?
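
A purely hypothetical sketch of that idea (the helper name is invented;
it only illustrates evicting the stale handle when page_owner->handle is
overwritten):

static void page_owner_update_handle(struct page_owner *page_owner,
				     depot_stack_handle_t new_handle)
{
	depot_stack_handle_t old_handle = page_owner->handle;

	page_owner->handle = new_handle;

	/* The old stack trace is no longer referenced here. */
	if (old_handle)
		stack_depot_evict(old_handle);
}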

Andrey Konovalov Sept. 13, 2023, 5:10 p.m. UTC | #4
On Tue, Sep 5, 2023 at 4:48 AM 'Kuan-Ying Lee (李冠穎)' via kasan-dev
<kasan-dev@googlegroups.com> wrote:
>
> > > 3. With the number of stackdepot users increasing, each with a set
> > > of stacks distinct from the others', would it make sense to create a
> > > separate "storage instance" for each user instead of putting
> > > everything in a single shared one?
> >
> > This shouldn't be hard to implement. However, do you see any
> > particular use cases for this?
> >
> > One thing that comes to mind is that the users will then be able to
> > create/destroy stack depot instances when required. But I don't know
> > if any of the users need this: so far they all seem to require stack
> > depot throughout the whole lifetime of the system.
> >
> Maybe we can use eviction in page_owner and slub_debug
> (SLAB_STORE_USER).
>
> After we update page_owner->handle, we could evict the previous
> handle?

We can definitely adapt more users to the new API. My comment was
related to the suggestion of splitting the stack depot storage into
per-user instances.

But actually, I have an answer to my own question about the split: if
each user has a separate stack depot storage instance, they can set the
maximum stack trace size as they see fit, and thus save memory.
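
A hypothetical sketch of what such a per-user instance API could look
like (none of these names exist in the kernel; this only makes the idea
concrete):

struct stack_depot_instance;

/* Each user sizes its own storage, including max frames per trace. */
struct stack_depot_instance *
stack_depot_create(unsigned int max_frames, unsigned int max_records);

depot_stack_handle_t
stack_depot_instance_save(struct stack_depot_instance *depot,
			  unsigned long *entries,
			  unsigned int nr_entries, gfp_t flags);

void stack_depot_instance_destroy(struct stack_depot_instance *depot);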