diff mbox

[BUG] Swap xarray workingset eviction warning.

Message ID 20180705170019.GA14929@cmpxchg.org (mailing list archive)
State New, archived
Headers show

Commit Message

Johannes Weiner July 5, 2018, 5 p.m. UTC
Hello,

On Sun, Jul 01, 2018 at 07:50:59PM -0700, Matthew Wilcox wrote:
> On Sun, Jul 01, 2018 at 07:09:41PM -0400, Peter Geis wrote:
> > The warning is as follows:
> > [10409.408904] ------------[ cut here ]------------
> > [10409.408912] WARNING: CPU: 0 PID: 38 at ./include/linux/xarray.h:53
> > workingset_eviction+0x14c/0x154
> 
> This is interesting.  Here's the code that leads to the warning:
> 
> static void *pack_shadow(int memcgid, pg_data_t *pgdat, unsigned long eviction)
> {
>         eviction >>= bucket_order;
>         eviction = (eviction << MEM_CGROUP_ID_SHIFT) | memcgid;
>         eviction = (eviction << NODES_SHIFT) | pgdat->node_id;
> 
>         return xa_mk_value(eviction);
> }
> 
> The warning itself comes from:
> 
> static inline void *xa_mk_value(unsigned long v)
> {
>         WARN_ON((long)v < 0);
>         return (void *)((v << 1) | 1);
> }
> 
> The fact that we haven't seen this on other architectures makes me wonder
> if NODES_SHIFT or MEM_CGROUP_ID_SHIFT are messed up on Tegra?
> 
> Johannes, I wonder if you could help out here?  I'm not terribly familiar
> with this part of the workingset code.

This could be a matter of uptime, but the warning triggers on a thing
that is supposed to happen everywhere eventually. Let's fix it.

The eviction timestamp is from a monotonically increasing counter that
will eventually reach values high enough that the left-shifts for
memcg id and node id will truncate the upper bits.

The bucketing logic isn't supposed to prevent this truncation, it's
just making sure that the namespace of the truncated timestamp field
is big enough to cover the full range of actionable refault pages.

xa_mk_value() doesn't understand that we're okay with it chopping off
our upper-most bit. We shouldn't make this an API behavior, either, so
let's fix the workingset code to always clear those bits before hand.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---

Comments

Matthew Wilcox July 5, 2018, 5:53 p.m. UTC | #1
On Thu, Jul 05, 2018 at 01:00:19PM -0400, Johannes Weiner wrote:
> This could be a matter of uptime, but the warning triggers on a thing
> that is supposed to happen everywhere eventually. Let's fix it.

Ahh!  Thank you!

> xa_mk_value() doesn't understand that we're okay with it chopping off
> our upper-most bit. We shouldn't make this an API behavior, either, so
> let's fix the workingset code to always clear those bits before hand.

Makes sense.  I'll just fold this in, if that's OK with you?

> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
> 
> diff --git a/mm/workingset.c b/mm/workingset.c
> index a466e731231d..1da19c04b6f7 100644
> --- a/mm/workingset.c
> +++ b/mm/workingset.c
> @@ -173,6 +173,7 @@ static unsigned int bucket_order __read_mostly;
>  static void *pack_shadow(int memcgid, pg_data_t *pgdat, unsigned long eviction)
>  {
>  	eviction >>= bucket_order;
> +	eviction &= EVICTION_MASK;
>  	eviction = (eviction << MEM_CGROUP_ID_SHIFT) | memcgid;
>  	eviction = (eviction << NODES_SHIFT) | pgdat->node_id;
>
Johannes Weiner July 5, 2018, 6:43 p.m. UTC | #2
On Thu, Jul 05, 2018 at 10:53:52AM -0700, Matthew Wilcox wrote:
> On Thu, Jul 05, 2018 at 01:00:19PM -0400, Johannes Weiner wrote:
> > This could be a matter of uptime, but the warning triggers on a thing
> > that is supposed to happen everywhere eventually. Let's fix it.
> 
> Ahh!  Thank you!
> 
> > xa_mk_value() doesn't understand that we're okay with it chopping off
> > our upper-most bit. We shouldn't make this an API behavior, either, so
> > let's fix the workingset code to always clear those bits before hand.
> 
> Makes sense.  I'll just fold this in, if that's OK with you?

Sounds good to me, thanks.
diff mbox

Patch

diff --git a/mm/workingset.c b/mm/workingset.c
index a466e731231d..1da19c04b6f7 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -173,6 +173,7 @@  static unsigned int bucket_order __read_mostly;
 static void *pack_shadow(int memcgid, pg_data_t *pgdat, unsigned long eviction)
 {
 	eviction >>= bucket_order;
+	eviction &= EVICTION_MASK;
 	eviction = (eviction << MEM_CGROUP_ID_SHIFT) | memcgid;
 	eviction = (eviction << NODES_SHIFT) | pgdat->node_id;