
[RFC,0/2] Alloc kfence_pool after system startup

Message ID 20220303031505.28495-1-dtcccc@linux.alibaba.com (mailing list archive)

Message

Tianchen Ding March 3, 2022, 3:15 a.m. UTC
KFENCE aims at production environments, but it cannot be enabled after
system startup because kfence_pool is only allocated from memblock.
Consider the following production scenario:
At first, for performance reasons, production machines do not
enable KFENCE.
However, after running for a while, the kernel is suspected of having
memory errors (e.g., a sibling machine crashed).
So other production machines need to enable KFENCE, but it is hard for
them to reboot.

The 1st patch allows re-enabling KFENCE if the pool is already
allocated from memblock.

The 2nd patch implements the main part.

Tianchen Ding (2):
  kfence: Allow re-enabling KFENCE after system startup
  kfence: Alloc kfence_pool after system startup

 mm/kfence/core.c | 106 ++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 87 insertions(+), 19 deletions(-)
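The workflow this series aims at could look like the following sketch. The sysfs path is the upstream kfence module-parameter location; the ability to enable KFENCE this way after booting with it off is what the series proposes, not current mainline behavior.

```shell
# Boot with KFENCE disabled, e.g. via the kernel command line:
#   kfence.sample_interval=0

# Later, after a sibling machine crashes, enable KFENCE on the suspect
# machines without a reboot (hypothetical once this series is applied);
# 100 means a 100ms sample interval:
echo 100 > /sys/module/kfence/parameters/sample_interval
```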

Comments

Alexander Potapenko March 3, 2022, 9:05 a.m. UTC | #1
On Thu, Mar 3, 2022 at 4:15 AM Tianchen Ding <dtcccc@linux.alibaba.com>
wrote:

> KFENCE aims at production environments, but it cannot be enabled after
> system startup because kfence_pool is only allocated from memblock.
> Consider the following production scenario:
> At first, for performance reasons, production machines do not
> enable KFENCE.
>
What are the performance considerations you have in mind? Are you running
KFENCE with a very aggressive sampling rate?

> However, after running for a while, the kernel is suspected of having
> memory errors (e.g., a sibling machine crashed).
>
I have doubts regarding this setup. It might be faster (although one can
tune KFENCE to have nearly zero performance impact), but is harder to
maintain.
It will also catch fewer errors than if you just had KFENCE on from the
very beginning:
 - sibling machines may behave differently, and a certain bug may only
occur once - in that case the secondary instances won't notice it, even
with KFENCE;
 - KFENCE also catches non-lethal corruptions (e.g. OOB reads), which may
stay under the radar for a very long time.


> So other production machines need to enable KFENCE, but it's hard for
> them to reboot.
>
> The 1st patch allows re-enabling KFENCE if the pool is already
> allocated from memblock.
>
> The 2nd patch implements the main part.
>
> Tianchen Ding (2):
>   kfence: Allow re-enabling KFENCE after system startup
>   kfence: Alloc kfence_pool after system startup
>
>  mm/kfence/core.c | 106 ++++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 87 insertions(+), 19 deletions(-)
>
> --
> 2.27.0
>
>

Marco Elver March 3, 2022, 9:30 a.m. UTC | #2
On Thu, 3 Mar 2022 at 10:05, Alexander Potapenko <glider@google.com> wrote:

I share Alex's concerns.

> On Thu, Mar 3, 2022 at 4:15 AM Tianchen Ding <dtcccc@linux.alibaba.com> wrote:
>>
>> KFENCE aims at production environments, but it cannot be enabled after
>> system startup because kfence_pool is only allocated from memblock.
>> Consider the following production scenario:
>> At first, for performance reasons, production machines do not
>> enable KFENCE.
>
> What are the performance considerations you have in mind? Are you running KFENCE with a very aggressive sampling rate?

Indeed, what is wrong with simply starting up KFENCE with a sample
interval of 10000ms? I very much doubt that you'll notice any
performance issues with a sample interval above 500ms.

Do let us know what performance issues you have seen. They may be
related to an earlier version of KFENCE and may have since been fixed
(see the Git log).

>> However, after running for a while, the kernel is suspected of having
>> memory errors (e.g., a sibling machine crashed).
>
> I have doubts regarding this setup. It might be faster (although one can tune KFENCE to have nearly zero performance impact), but is harder to maintain.
> It will also catch fewer errors than if you just had KFENCE on from the very beginning:
>  - sibling machines may behave differently, and a certain bug may only occur once - in that case the secondary instances won't notice it, even with KFENCE;
>  - KFENCE also catches non-lethal corruptions (e.g. OOB reads), which may stay under the radar for a very long time.
>
>>
>> So other production machines need to enable KFENCE, but it's hard for
>> them to reboot.
>>
>> The 1st patch allows re-enabling KFENCE if the pool is already
>> allocated from memblock.

Patch 1/2 might be ok by itself, but I still don't see the point
because you should just leave KFENCE enabled. There should be no
reason to have to turn it off. If anything, you can increase the
sample interval to something very large if needed.
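Marco's alternative above can be sketched as plain configuration: keep KFENCE on from boot with a very large sample interval, then make it aggressive at runtime only when suspicion arises. This uses only the existing kfence.sample_interval parameter and its sysfs path; the exact values are illustrative.

```shell
# At boot (pool is allocated from memblock as usual, near-zero overhead):
#   kfence.sample_interval=10000

# When a sibling machine crashes, tighten sampling without a reboot:
echo 100 > /sys/module/kfence/parameters/sample_interval
```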

Tianchen Ding March 4, 2022, 2:24 a.m. UTC | #3
On 2022/3/3 17:30, Marco Elver wrote:

Thanks for your replies.
I do see that setting a large sample_interval effectively disables
KFENCE. In fact, my point is to provide a more “flexible” way, since
some Ops folks may prefer an on/off switch over a 10000ms interval. :-)
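The on/off switch mentioned above could be expressed through the existing sample_interval parameter, where 0 means disabled; making the "back on" step work when the pool came from memblock is what patch 1/2 would allow. A sketch, assuming the series is applied:

```shell
# Turn KFENCE off (0 disables sampling):
echo 0 > /sys/module/kfence/parameters/sample_interval

# Turn it back on later with a 100ms sample interval
# (re-enabling after a disable is the behavior patch 1/2 proposes):
echo 100 > /sys/module/kfence/parameters/sample_interval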

> On Thu, 3 Mar 2022 at 10:05, Alexander Potapenko <glider@google.com> wrote:
> 
> I share Alex's concerns.
> 
>> On Thu, Mar 3, 2022 at 4:15 AM Tianchen Ding <dtcccc@linux.alibaba.com> wrote:
>>>
>>> KFENCE aims at production environments, but it cannot be enabled after
>>> system startup because kfence_pool is only allocated from memblock.
>>> Consider the following production scenario:
>>> At first, for performance reasons, production machines do not
>>> enable KFENCE.
>>
>> What are the performance considerations you have in mind? Are you running KFENCE with a very aggressive sampling rate?
> 
> Indeed, what is wrong with simply starting up KFENCE with a sample
> interval of 10000ms? I very much doubt that you'll notice any
> performance issues with a sample interval above 500ms.
> 
> Do let us know what performance issues you have seen. They may be
> related to an earlier version of KFENCE and may have since been fixed
> (see the Git log).
> 
>>> However, after running for a while, the kernel is suspected of having
>>> memory errors (e.g., a sibling machine crashed).
>>
>> I have doubts regarding this setup. It might be faster (although one can tune KFENCE to have nearly zero performance impact), but is harder to maintain.
>> It will also catch fewer errors than if you just had KFENCE on from the very beginning:
>>   - sibling machines may behave differently, and a certain bug may only occur once - in that case the secondary instances won't notice it, even with KFENCE;
>>   - KFENCE also catches non-lethal corruptions (e.g. OOB reads), which may stay under the radar for a very long time.
>>
>>>
>>> So other production machines need to enable KFENCE, but it's hard for
>>> them to reboot.
>>>
>>> The 1st patch allows re-enabling KFENCE if the pool is already
>>> allocated from memblock.
> 
> Patch 1/2 might be ok by itself, but I still don't see the point
> because you should just leave KFENCE enabled. There should be no
> reason to have to turn it off. If anything, you can increase the
> sample interval to something very large if needed.

Marco Elver March 4, 2022, 6:14 p.m. UTC | #4
On Fri, 4 Mar 2022 at 03:25, Tianchen Ding <dtcccc@linux.alibaba.com> wrote:
>
> On 2022/3/3 17:30, Marco Elver wrote:
>
> Thanks for your replies.
> I do see that setting a large sample_interval effectively disables
> KFENCE. In fact, my point is to provide a more “flexible” way, since
> some Ops folks may prefer an on/off switch over a 10000ms interval. :-)

Have you already successfully caught bugs by turning KFENCE on _in
reaction_ to some suspected issues? We really do not think that
switching on KFENCE _after_ having observed a bug, especially on a
completely different machine, is at all reliable.

While your patches are appreciated, your use case doesn't make sense
to us (based on our experience). This flexibility is nice-to-have, so
the justification just needs changing, to avoid misleading other
folks. Please see comments on the other patches.

Thanks,
-- Marco