Patchwork [v6] mm: Add memory allocation watchdog kernel thread.

login
register
mail settings
Submitter Johannes Weiner
Date Jan. 25, 2017, 6:11 p.m.
Message ID <20170125181150.GA16398@cmpxchg.org>
Download mbox | patch
Permalink /patch/9537691/
State New
Headers show

Comments

Johannes Weiner - Jan. 25, 2017, 6:11 p.m.
On Sun, Nov 06, 2016 at 04:15:01PM +0900, Tetsuo Handa wrote:
> +- Why need to use it?
> +
> +Currently, when something went wrong inside memory allocation request,
> +the system might stall without any kernel messages.
> +
> +Although there is khungtaskd kernel thread as an asynchronous monitoring
> +approach, khungtaskd kernel thread is not always helpful because memory
> +allocating tasks unlikely sleep in uninterruptible state for
> +/proc/sys/kernel/hung_task_timeout_secs seconds.
> +
> +Although there is warn_alloc() as a synchronous monitoring approach
> +which emits
> +
> +  "%s: page allocation stalls for %ums, order:%u, mode:%#x(%pGg)\n"
> +
> +line, warn_alloc() is not bullet proof because allocating tasks can get
> +stuck before calling warn_alloc() and/or allocating tasks are using
> +__GFP_NOWARN flag and/or such lines are suppressed by ratelimiting and/or
> +such lines are corrupted due to collisions.

I'm not fully convinced by this explanation. Do you have a real life
example where the warn_alloc() stall info is not enough? If yes, this
should be included here and in the changelog. If not, the extra code,
the task_struct overhead etc. don't seem justified.

__GFP_NOWARN shouldn't suppress stall warnings, IMO. It's for whether
the caller expects allocation failure and is prepared to handle it; an
allocation stalling out for 10s is an issue regardless of the callsite.

---

From 6420cae52cac8167bd5fb19f45feed2d540bc11d Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Wed, 25 Jan 2017 12:57:20 -0500
Subject: [PATCH] mm: page_alloc: __GFP_NOWARN shouldn't suppress stall
 warnings

__GFP_NOWARN, which is usually added to avoid warnings from callsites
that expect to fail and have fallbacks, currently also suppresses
allocation stall warnings. These trigger when an allocation is stuck
inside the allocator for 10 seconds or longer.

But there is no class of allocations that can get legitimately stuck
in the allocator for this long. This always indicates a problem.

Always emit stall warnings. Restrict __GFP_NOWARN to alloc failures.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Michal Hocko - Jan. 25, 2017, 6:45 p.m.
On Wed 25-01-17 13:11:50, Johannes Weiner wrote:
[...]
> >From 6420cae52cac8167bd5fb19f45feed2d540bc11d Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@cmpxchg.org>
> Date: Wed, 25 Jan 2017 12:57:20 -0500
> Subject: [PATCH] mm: page_alloc: __GFP_NOWARN shouldn't suppress stall
>  warnings
> 
> __GFP_NOWARN, which is usually added to avoid warnings from callsites
> that expect to fail and have fallbacks, currently also suppresses
> allocation stall warnings. These trigger when an allocation is stuck
> inside the allocator for 10 seconds or longer.
> 
> But there is no class of allocations that can get legitimately stuck
> in the allocator for this long. This always indicates a problem.
> 
> Always emit stall warnings. Restrict __GFP_NOWARN to alloc failures.

Tetsuo has already suggested something like this and I didn't really
like it because it makes the semantic of the flag confusing. The mask
says to not warn while the kernel log might contain an allocation splat.
You are right that stalling for 10s seconds means a problem on its own
but on the other hand I can imagine somebody might really want to have
clean logs and the last thing we want is to have another gfp flag for
that purpose.

I also do not think that this change would make a big difference because
most allocations simply use this flag along with __GFP_NORETRY or
GFP_NOWAIT resp GFP_ATOMIC. Have we ever seen a stall with this
allocation requests?

I haven't nacked Tetsuo's patch AFAIR and will not nack this one either
I just do not think we should tweak __GFP_NOWARN.
 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
>  mm/page_alloc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f3e0c69a97b7..7ce051d1d575 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3704,7 +3704,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  
>  	/* Make sure we know about allocations which stall for too long */
>  	if (time_after(jiffies, alloc_start + stall_timeout)) {
> -		warn_alloc(gfp_mask,
> +		warn_alloc(gfp_mask & ~__GFP_NOWARN,
>  			"page allocation stalls for %ums, order:%u",
>  			jiffies_to_msecs(jiffies-alloc_start), order);
>  		stall_timeout += 10 * HZ;
> -- 
> 2.11.0
Johannes Weiner - Jan. 25, 2017, 7:22 p.m.
On Wed, Jan 25, 2017 at 07:45:49PM +0100, Michal Hocko wrote:
> On Wed 25-01-17 13:11:50, Johannes Weiner wrote:
> [...]
> > >From 6420cae52cac8167bd5fb19f45feed2d540bc11d Mon Sep 17 00:00:00 2001
> > From: Johannes Weiner <hannes@cmpxchg.org>
> > Date: Wed, 25 Jan 2017 12:57:20 -0500
> > Subject: [PATCH] mm: page_alloc: __GFP_NOWARN shouldn't suppress stall
> >  warnings
> > 
> > __GFP_NOWARN, which is usually added to avoid warnings from callsites
> > that expect to fail and have fallbacks, currently also suppresses
> > allocation stall warnings. These trigger when an allocation is stuck
> > inside the allocator for 10 seconds or longer.
> > 
> > But there is no class of allocations that can get legitimately stuck
> > in the allocator for this long. This always indicates a problem.
> > 
> > Always emit stall warnings. Restrict __GFP_NOWARN to alloc failures.
> 
> Tetsuo has already suggested something like this and I didn't really
> like it because it makes the semantic of the flag confusing. The mask
> says to not warn while the kernel log might contain an allocation splat.
> You are right that stalling for 10s seconds means a problem on its own
> but on the other hand I can imagine somebody might really want to have
> clean logs and the last thing we want is to have another gfp flag for
> that purpose.

I don't think it's confusing. __GFP_NOWARN tells the allocator whether
an allocation failure can be handled or whether it constitutes a bug.

If we agree that stalling for 10s is a bug, then we should emit the
warnings. Tying this to whether the caller can handle an allocation
failure is non-sensical. Not warning about a bug because the user
would prefer clean logs is... somewhat out there.
Minchan Kim - Jan. 25, 2017, 11:44 p.m.
On Wed, Jan 25, 2017 at 01:11:50PM -0500, Johannes Weiner wrote:
> On Sun, Nov 06, 2016 at 04:15:01PM +0900, Tetsuo Handa wrote:
> > +- Why need to use it?
> > +
> > +Currently, when something went wrong inside memory allocation request,
> > +the system might stall without any kernel messages.
> > +
> > +Although there is khungtaskd kernel thread as an asynchronous monitoring
> > +approach, khungtaskd kernel thread is not always helpful because memory
> > +allocating tasks unlikely sleep in uninterruptible state for
> > +/proc/sys/kernel/hung_task_timeout_secs seconds.
> > +
> > +Although there is warn_alloc() as a synchronous monitoring approach
> > +which emits
> > +
> > +  "%s: page allocation stalls for %ums, order:%u, mode:%#x(%pGg)\n"
> > +
> > +line, warn_alloc() is not bullet proof because allocating tasks can get
> > +stuck before calling warn_alloc() and/or allocating tasks are using
> > +__GFP_NOWARN flag and/or such lines are suppressed by ratelimiting and/or
> > +such lines are corrupted due to collisions.
> 
> I'm not fully convinced by this explanation. Do you have a real life
> example where the warn_alloc() stall info is not enough? If yes, this
> should be included here and in the changelog. If not, the extra code,
> the task_struct overhead etc. don't seem justified.
> 
> __GFP_NOWARN shouldn't suppress stall warnings, IMO. It's for whether
> the caller expects allocation failure and is prepared to handle it; an
> allocation stalling out for 10s is an issue regardless of the callsite.
> 
> ---
> 
> From 6420cae52cac8167bd5fb19f45feed2d540bc11d Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@cmpxchg.org>
> Date: Wed, 25 Jan 2017 12:57:20 -0500
> Subject: [PATCH] mm: page_alloc: __GFP_NOWARN shouldn't suppress stall
>  warnings
> 
> __GFP_NOWARN, which is usually added to avoid warnings from callsites
> that expect to fail and have fallbacks, currently also suppresses
> allocation stall warnings. These trigger when an allocation is stuck
> inside the allocator for 10 seconds or longer.
> 
> But there is no class of allocations that can get legitimately stuck
> in the allocator for this long. This always indicates a problem.
> 
> Always emit stall warnings. Restrict __GFP_NOWARN to alloc failures.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Minchan Kim <minchan@kernel.org>
Michal Hocko - Jan. 26, 2017, 7:57 a.m.
On Wed 25-01-17 14:22:45, Johannes Weiner wrote:
> On Wed, Jan 25, 2017 at 07:45:49PM +0100, Michal Hocko wrote:
> > On Wed 25-01-17 13:11:50, Johannes Weiner wrote:
> > [...]
> > > >From 6420cae52cac8167bd5fb19f45feed2d540bc11d Mon Sep 17 00:00:00 2001
> > > From: Johannes Weiner <hannes@cmpxchg.org>
> > > Date: Wed, 25 Jan 2017 12:57:20 -0500
> > > Subject: [PATCH] mm: page_alloc: __GFP_NOWARN shouldn't suppress stall
> > >  warnings
> > > 
> > > __GFP_NOWARN, which is usually added to avoid warnings from callsites
> > > that expect to fail and have fallbacks, currently also suppresses
> > > allocation stall warnings. These trigger when an allocation is stuck
> > > inside the allocator for 10 seconds or longer.
> > > 
> > > But there is no class of allocations that can get legitimately stuck
> > > in the allocator for this long. This always indicates a problem.
> > > 
> > > Always emit stall warnings. Restrict __GFP_NOWARN to alloc failures.
> > 
> > Tetsuo has already suggested something like this and I didn't really
> > like it because it makes the semantic of the flag confusing. The mask
> > says to not warn while the kernel log might contain an allocation splat.
> > You are right that stalling for 10s seconds means a problem on its own
> > but on the other hand I can imagine somebody might really want to have
> > clean logs and the last thing we want is to have another gfp flag for
> > that purpose.
> 
> I don't think it's confusing. __GFP_NOWARN tells the allocator whether
> an allocation failure can be handled or whether it constitutes a bug.
> 
> If we agree that stalling for 10s is a bug, then we should emit the
> warnings.

Yes, in many cases it would be a bug in the MM. Some of them would be
inherent because the allocator doesn't implement any fairness and
starvation cannot be ruled out (would that be a bug?). In general,
looping/spending a lot of time in kernel can be seen as a bug. We have
watchdogs to report those cases and the time has told us that we had to
develop ways to silent those lockups because in some cases we couldn't
handle them. I am worried we will eventually find cases like that for
allocation stalls as well. I might be over sensitive because we have
made some mistakes in the gfp flags land already and I would like to
prevent more to come.

That being said, I will not stand in the way...
Tetsuo Handa - Jan. 26, 2017, 10:28 a.m.
Michal Hocko wrote:
> On Wed 25-01-17 13:11:50, Johannes Weiner wrote:
> [...]
> > >From 6420cae52cac8167bd5fb19f45feed2d540bc11d Mon Sep 17 00:00:00 2001
> > From: Johannes Weiner <hannes@cmpxchg.org>
> > Date: Wed, 25 Jan 2017 12:57:20 -0500
> > Subject: [PATCH] mm: page_alloc: __GFP_NOWARN shouldn't suppress stall
> >  warnings
> > 
> > __GFP_NOWARN, which is usually added to avoid warnings from callsites
> > that expect to fail and have fallbacks, currently also suppresses
> > allocation stall warnings. These trigger when an allocation is stuck
> > inside the allocator for 10 seconds or longer.
> > 
> > But there is no class of allocations that can get legitimately stuck
> > in the allocator for this long. This always indicates a problem.
> > 
> > Always emit stall warnings. Restrict __GFP_NOWARN to alloc failures.
> 
> Tetsuo has already suggested something like this and I didn't really

Yes, I already suggested it at
http://lkml.kernel.org/r/1484132120-35288-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp .

> like it because it makes the semantic of the flag confusing. The mask
> says to not warn while the kernel log might contain an allocation splat.
> You are right that stalling for 10s seconds means a problem on its own
> but on the other hand I can imagine somebody might really want to have
> clean logs and the last thing we want is to have another gfp flag for
> that purpose.

I agree with Johannes about that __GFP_NOWARN should not suppress allocation
stall warnings. But having another gfp flag for that purpose is not useful.
Given that someone really wants not to have allocation stall warnings in the
kernel logs, where is the switch for enabling/disabling allocation stall
warnings (because gfp flags are constant determined at build time)? We will
need to have either a kernel command line option or a sysctl (or sysfs)
variable. khungtaskd uses sysctl variables for those who really wants not
to have TASK_UNINTERRUPTIBLE warnings; so does kmallocwd.

> 
> I also do not think that this change would make a big difference because
> most allocations simply use this flag along with __GFP_NORETRY or
> GFP_NOWAIT resp GFP_ATOMIC. Have we ever seen a stall with this
> allocation requests?

You are totally ignoring what I explained in the "[PATCH] mm: Ignore
__GFP_NOWARN when reporting stalls" thread shown above.

  Majority of __GFP_DIRECT_RECLAIM allocation requests are tolerable with
  allocation failure (and they will be willing to give up upon SIGKILL if
  they are from syscall) and do not need to alarm the admin to do any action.
  If they are not tolerable with allocation failure, they will add __GFP_NOFAIL.
  
  Apart from the reality that they are not tested well because they are
  currently protected by the "too small to fail" memory-allocation rule,
  they are ready to add __GFP_NOWARN. And current behavior (i.e. !costly
  __GFP_DIRECT_RECLAIM allocation requests won't fail unless __GFP_NORETRY
  is set or TIF_MEMDIE is set after SIGKILL was delivered) keeps them away
  from adding __GFP_NOFAIL.

> 
> I haven't nacked Tetsuo's patch AFAIR and will not nack this one either
> I just do not think we should tweak __GFP_NOWARN.

Leaving this proposal as it is is counterproductive. I already said

  The discussion at this stage should not be "whether we need such
  watchdog and debugging code" but should be "how we can make the impact
  of watchdog and debugging code as small as possible".

at http://lkml.kernel.org/r/1462630604-23410-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp .
And there had been no response.

Johannes Weiner wrote:
> On Sun, Nov 06, 2016 at 04:15:01PM +0900, Tetsuo Handa wrote:
> > +- Why need to use it?
> > +
> > +Currently, when something went wrong inside memory allocation request,
> > +the system might stall without any kernel messages.
> > +
> > +Although there is khungtaskd kernel thread as an asynchronous monitoring
> > +approach, khungtaskd kernel thread is not always helpful because memory
> > +allocating tasks unlikely sleep in uninterruptible state for
> > +/proc/sys/kernel/hung_task_timeout_secs seconds.
> > +
> > +Although there is warn_alloc() as a synchronous monitoring approach
> > +which emits
> > +
> > +  "%s: page allocation stalls for %ums, order:%u, mode:%#x(%pGg)\n"
> > +
> > +line, warn_alloc() is not bullet proof because allocating tasks can get
> > +stuck before calling warn_alloc() and/or allocating tasks are using
> > +__GFP_NOWARN flag and/or such lines are suppressed by ratelimiting and/or
> > +such lines are corrupted due to collisions.
> 
> I'm not fully convinced by this explanation. Do you have a real life
> example where the warn_alloc() stall info is not enough? If yes, this
> should be included here and in the changelog. If not, the extra code,
> the task_struct overhead etc. don't seem justified.

warn_alloc() stall info cannot provide overall analyses. I said

  If you meant (b), it is because synchronous watchdog is not reliable and
  cannot provide enough diagnostic information. Since allocation livelock
  involves several threads due to dependency, it is important to take a
  snapshot of possibly relevant threads. By using asynchronous watchdog,
  we can not only take a snapshot but also take more actions for obtaining
  diagnostic information (e.g. enabling tracepoints when allocation stalls
  are detected).

in the same thread shown above. For example, the cause of allocation stall
might be due to out of idle workqueue thread; e.g. commit 373ccbe5927034b5
("mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make
any progress"). Without reporting all possibly relevant threads, we might
fail to obtain enough diagnostic information. Changing warn_alloc() to
also report workqueues/kswapd/locks etc. will be noisy and incomplete
because warn_alloc() cares only current thread.

I welcome suggestions for "how we can make the impact of watchdog and
debugging code as small as possible". But I don't have environment for
evaluating the task_struct overhead. I wonder whether this overhead matters
because this is allocation slowpath which will consume a lot of CPU cycles
for scanning and/or sleep for many jiffies waiting for I/O. It will be far
cheaper than keeping mm related tracepoints enabled until something happens
which might be uptime of months. Sending the v6 patch to linux-next and
start evaluating the overhead will be the way to avoid leaving this proposal
as it is.
Tetsuo Handa - Feb. 22, 2017, 2:11 a.m.
Johannes, any questions/comments on this patch?
I'd like to send to linux-next because this patch will be useful for
examining problems like "Bug 192981 - page allocation stalls". I think
that this patch will remain be useful as long as we use direct reclaim.

Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Wed 25-01-17 13:11:50, Johannes Weiner wrote:
> > [...]
> > > >From 6420cae52cac8167bd5fb19f45feed2d540bc11d Mon Sep 17 00:00:00 2001
> > > From: Johannes Weiner <hannes@cmpxchg.org>
> > > Date: Wed, 25 Jan 2017 12:57:20 -0500
> > > Subject: [PATCH] mm: page_alloc: __GFP_NOWARN shouldn't suppress stall
> > >  warnings
> > > 
> > > __GFP_NOWARN, which is usually added to avoid warnings from callsites
> > > that expect to fail and have fallbacks, currently also suppresses
> > > allocation stall warnings. These trigger when an allocation is stuck
> > > inside the allocator for 10 seconds or longer.
> > > 
> > > But there is no class of allocations that can get legitimately stuck
> > > in the allocator for this long. This always indicates a problem.
> > > 
> > > Always emit stall warnings. Restrict __GFP_NOWARN to alloc failures.
> > 
> > Tetsuo has already suggested something like this and I didn't really
> 
> Yes, I already suggested it at
> http://lkml.kernel.org/r/1484132120-35288-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp .
> 
> > like it because it makes the semantic of the flag confusing. The mask
> > says to not warn while the kernel log might contain an allocation splat.
> > You are right that stalling for 10s seconds means a problem on its own
> > but on the other hand I can imagine somebody might really want to have
> > clean logs and the last thing we want is to have another gfp flag for
> > that purpose.
> 
> I agree with Johannes about that __GFP_NOWARN should not suppress allocation
> stall warnings. But having another gfp flag for that purpose is not useful.
> Given that someone really wants not to have allocation stall warnings in the
> kernel logs, where is the switch for enabling/disabling allocation stall
> warnings (because gfp flags are constant determined at build time)? We will
> need to have either a kernel command line option or a sysctl (or sysfs)
> variable. khungtaskd uses sysctl variables for those who really wants not
> to have TASK_UNINTERRUPTIBLE warnings; so does kmallocwd.
> 
> > 
> > I also do not think that this change would make a big difference because
> > most allocations simply use this flag along with __GFP_NORETRY or
> > GFP_NOWAIT resp GFP_ATOMIC. Have we ever seen a stall with this
> > allocation requests?
> 
> You are totally ignoring what I explained in the "[PATCH] mm: Ignore
> __GFP_NOWARN when reporting stalls" thread shown above.
> 
>   Majority of __GFP_DIRECT_RECLAIM allocation requests are tolerable with
>   allocation failure (and they will be willing to give up upon SIGKILL if
>   they are from syscall) and do not need to alarm the admin to do any action.
>   If they are not tolerable with allocation failure, they will add __GFP_NOFAIL.
>   
>   Apart from the reality that they are not tested well because they are
>   currently protected by the "too small to fail" memory-allocation rule,
>   they are ready to add __GFP_NOWARN. And current behavior (i.e. !costly
>   __GFP_DIRECT_RECLAIM allocation requests won't fail unless __GFP_NORETRY
>   is set or TIF_MEMDIE is set after SIGKILL was delivered) keeps them away
>   from adding __GFP_NOFAIL.
> 
> > 
> > I haven't nacked Tetsuo's patch AFAIR and will not nack this one either
> > I just do not think we should tweak __GFP_NOWARN.
> 
> Leaving this proposal as it is is counterproductive. I already said
> 
>   The discussion at this stage should not be "whether we need such
>   watchdog and debugging code" but should be "how we can make the impact
>   of watchdog and debugging code as small as possible".
> 
> at http://lkml.kernel.org/r/1462630604-23410-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp .
> And there had been no response.
> 
> Johannes Weiner wrote:
> > On Sun, Nov 06, 2016 at 04:15:01PM +0900, Tetsuo Handa wrote:
> > > +- Why need to use it?
> > > +
> > > +Currently, when something went wrong inside memory allocation request,
> > > +the system might stall without any kernel messages.
> > > +
> > > +Although there is khungtaskd kernel thread as an asynchronous monitoring
> > > +approach, khungtaskd kernel thread is not always helpful because memory
> > > +allocating tasks unlikely sleep in uninterruptible state for
> > > +/proc/sys/kernel/hung_task_timeout_secs seconds.
> > > +
> > > +Although there is warn_alloc() as a synchronous monitoring approach
> > > +which emits
> > > +
> > > +  "%s: page allocation stalls for %ums, order:%u, mode:%#x(%pGg)\n"
> > > +
> > > +line, warn_alloc() is not bullet proof because allocating tasks can get
> > > +stuck before calling warn_alloc() and/or allocating tasks are using
> > > +__GFP_NOWARN flag and/or such lines are suppressed by ratelimiting and/or
> > > +such lines are corrupted due to collisions.
> > 
> > I'm not fully convinced by this explanation. Do you have a real life
> > example where the warn_alloc() stall info is not enough? If yes, this
> > should be included here and in the changelog. If not, the extra code,
> > the task_struct overhead etc. don't seem justified.
> 
> warn_alloc() stall info cannot provide overall analyses. I said
> 
>   If you meant (b), it is because synchronous watchdog is not reliable and
>   cannot provide enough diagnostic information. Since allocation livelock
>   involves several threads due to dependency, it is important to take a
>   snapshot of possibly relevant threads. By using asynchronous watchdog,
>   we can not only take a snapshot but also take more actions for obtaining
>   diagnostic information (e.g. enabling tracepoints when allocation stalls
>   are detected).
> 
> in the same thread shown above. For example, the cause of allocation stall
> might be due to out of idle workqueue thread; e.g. commit 373ccbe5927034b5
> ("mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make
> any progress"). Without reporting all possibly relevant threads, we might
> fail to obtain enough diagnostic information. Changing warn_alloc() to
> also report workqueues/kswapd/locks etc. will be noisy and incomplete
> because warn_alloc() cares only current thread.
> 
> I welcome suggestions for "how we can make the impact of watchdog and
> debugging code as small as possible". But I don't have environment for
> evaluating the task_struct overhead. I wonder whether this overhead matters
> because this is allocation slowpath which will consume a lot of CPU cycles
> for scanning and/or sleep for many jiffies waiting for I/O. It will be far
> cheaper than keeping mm related tracepoints enabled until something happens
> which might be uptime of months. Sending the v6 patch to linux-next and
> start evaluating the overhead will be the way to avoid leaving this proposal
> as it is.
>

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f3e0c69a97b7..7ce051d1d575 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3704,7 +3704,7 @@  __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 
 	/* Make sure we know about allocations which stall for too long */
 	if (time_after(jiffies, alloc_start + stall_timeout)) {
-		warn_alloc(gfp_mask,
+		warn_alloc(gfp_mask & ~__GFP_NOWARN,
 			"page allocation stalls for %ums, order:%u",
 			jiffies_to_msecs(jiffies-alloc_start), order);
 		stall_timeout += 10 * HZ;