diff mbox series

[RFC,02/15] slub: Add isolate() and migrate() methods

Message ID 20190308041426.16654-3-tobin@kernel.org (mailing list archive)
State New, archived
Series mm: Implement Slab Movable Objects (SMO)

Commit Message

Tobin C. Harding March 8, 2019, 4:14 a.m. UTC
Add the two methods needed for moving objects and enable the display of
the callbacks via the /sys/kernel/slab interface.

Add documentation explaining the use of these methods and the prototypes
to slab.h. Add functions to set up the callbacks for a slab cache.

Add empty functions for SLAB/SLOB. The API is generic so it could
theoretically be implemented for these allocators as well.

Co-developed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
---
 include/linux/slab.h     | 69 ++++++++++++++++++++++++++++++++++++++++
 include/linux/slub_def.h |  3 ++
 mm/slab_common.c         |  4 +++
 mm/slub.c                | 42 ++++++++++++++++++++++++
 4 files changed, 118 insertions(+)

Comments

Tycho Andersen March 8, 2019, 3:28 p.m. UTC | #1
On Fri, Mar 08, 2019 at 03:14:13PM +1100, Tobin C. Harding wrote:
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index f9d89c1b5977..754acdb292e4 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -298,6 +298,10 @@ int slab_unmergeable(struct kmem_cache *s)
>  	if (!is_root_cache(s))
>  		return 1;
>  
> +	/*
> +	 * s->isolate and s->migrate imply s->ctor so no need to
> +	 * check them explicitly.
> +	 */

Shouldn't this implication go the other way, i.e.
    s->ctor => s->isolate & s->migrate
?

>  	if (s->ctor)
>  		return 1;

Tycho
Christoph Lameter (Ampere) March 8, 2019, 4:15 p.m. UTC | #2
On Fri, 8 Mar 2019, Tycho Andersen wrote:

> On Fri, Mar 08, 2019 at 03:14:13PM +1100, Tobin C. Harding wrote:
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index f9d89c1b5977..754acdb292e4 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -298,6 +298,10 @@ int slab_unmergeable(struct kmem_cache *s)
> >  	if (!is_root_cache(s))
> >  		return 1;
> >
> > +	/*
> > +	 * s->isolate and s->migrate imply s->ctor so no need to
> > +	 * check them explicitly.
> > +	 */
>
> Shouldn't this implication go the other way, i.e.
>     s->ctor => s->isolate & s->migrate

A cache can have a constructor but the object may not be movable (e.g.
currently dentries and inodes).
Tycho Andersen March 8, 2019, 4:22 p.m. UTC | #3
On Fri, Mar 08, 2019 at 04:15:46PM +0000, Christopher Lameter wrote:
> On Fri, 8 Mar 2019, Tycho Andersen wrote:
> 
> > On Fri, Mar 08, 2019 at 03:14:13PM +1100, Tobin C. Harding wrote:
> > > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > > index f9d89c1b5977..754acdb292e4 100644
> > > --- a/mm/slab_common.c
> > > +++ b/mm/slab_common.c
> > > @@ -298,6 +298,10 @@ int slab_unmergeable(struct kmem_cache *s)
> > >  	if (!is_root_cache(s))
> > >  		return 1;
> > >
> > > +	/*
> > > +	 * s->isolate and s->migrate imply s->ctor so no need to
> > > +	 * check them explicitly.
> > > +	 */
> >
> > Shouldn't this implication go the other way, i.e.
> >     s->ctor => s->isolate & s->migrate
> 
> A cache can have a constructor but the object may not be movable (I.e.
> currently dentries and inodes).

Yep, thanks. Somehow I got confused by the comment.

Tycho
Tobin Harding March 8, 2019, 7:53 p.m. UTC | #4
On Fri, Mar 08, 2019 at 09:22:37AM -0700, Tycho Andersen wrote:
> On Fri, Mar 08, 2019 at 04:15:46PM +0000, Christopher Lameter wrote:
> > On Fri, 8 Mar 2019, Tycho Andersen wrote:
> > 
> > > On Fri, Mar 08, 2019 at 03:14:13PM +1100, Tobin C. Harding wrote:
> > > > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > > > index f9d89c1b5977..754acdb292e4 100644
> > > > --- a/mm/slab_common.c
> > > > +++ b/mm/slab_common.c
> > > > @@ -298,6 +298,10 @@ int slab_unmergeable(struct kmem_cache *s)
> > > >  	if (!is_root_cache(s))
> > > >  		return 1;
> > > >
> > > > +	/*
> > > > +	 * s->isolate and s->migrate imply s->ctor so no need to
> > > > +	 * check them explicitly.
> > > > +	 */
> > >
> > > Shouldn't this implication go the other way, i.e.
> > >     s->ctor => s->isolate & s->migrate
> > 
> > A cache can have a constructor but the object may not be movable (I.e.
> > currently dentries and inodes).
> 
> Yep, thanks. Somehow I got confused by the comment.

I removed code here from the original RFC-v2; if this comment is
confusing perhaps we are better off without it.

thanks,
Tobin.
Tycho Andersen March 8, 2019, 8:08 p.m. UTC | #5
On Sat, Mar 09, 2019 at 06:53:22AM +1100, Tobin C. Harding wrote:
> On Fri, Mar 08, 2019 at 09:22:37AM -0700, Tycho Andersen wrote:
> > On Fri, Mar 08, 2019 at 04:15:46PM +0000, Christopher Lameter wrote:
> > > On Fri, 8 Mar 2019, Tycho Andersen wrote:
> > > 
> > > > On Fri, Mar 08, 2019 at 03:14:13PM +1100, Tobin C. Harding wrote:
> > > > > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > > > > index f9d89c1b5977..754acdb292e4 100644
> > > > > --- a/mm/slab_common.c
> > > > > +++ b/mm/slab_common.c
> > > > > @@ -298,6 +298,10 @@ int slab_unmergeable(struct kmem_cache *s)
> > > > >  	if (!is_root_cache(s))
> > > > >  		return 1;
> > > > >
> > > > > +	/*
> > > > > +	 * s->isolate and s->migrate imply s->ctor so no need to
> > > > > +	 * check them explicitly.
> > > > > +	 */
> > > >
> > > > Shouldn't this implication go the other way, i.e.
> > > >     s->ctor => s->isolate & s->migrate
> > > 
> > > A cache can have a constructor but the object may not be movable (I.e.
> > > currently dentries and inodes).
> > 
> > Yep, thanks. Somehow I got confused by the comment.
> 
> I removed code here from the original RFC-v2, if this comment is
> confusing perhaps we are better off without it.

I'd say leave it, unless others have objections. I got lost in the
"no need" and return true for unmergeable too-many-nots goop, but it's
definitely worth noting that one implies the other. An alternative
might be to move it to a comment on the struct member instead.

Tycho
Roman Gushchin March 11, 2019, 9:51 p.m. UTC | #6
On Fri, Mar 08, 2019 at 03:14:13PM +1100, Tobin C. Harding wrote:
> Add the two methods needed for moving objects and enable the display of
> the callbacks via the /sys/kernel/slab interface.
> 
> Add documentation explaining the use of these methods and the prototypes
> for slab.h. Add functions to setup the callbacks method for a slab
> cache.
> 
> Add empty functions for SLAB/SLOB. The API is generic so it could be
> theoretically implemented for these allocators as well.
> 
> Co-developed-by: Christoph Lameter <cl@linux.com>
> Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> ---
>  include/linux/slab.h     | 69 ++++++++++++++++++++++++++++++++++++++++
>  include/linux/slub_def.h |  3 ++
>  mm/slab_common.c         |  4 +++
>  mm/slub.c                | 42 ++++++++++++++++++++++++
>  4 files changed, 118 insertions(+)
> 
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 11b45f7ae405..22e87c41b8a4 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -152,6 +152,75 @@ void memcg_create_kmem_cache(struct mem_cgroup *, struct kmem_cache *);
>  void memcg_deactivate_kmem_caches(struct mem_cgroup *);
>  void memcg_destroy_kmem_caches(struct mem_cgroup *);
>  
> +/*
> + * Function prototypes passed to kmem_cache_setup_mobility() to enable
> + * mobile objects and targeted reclaim in slab caches.
> + */
> +
> +/**
> + * typedef kmem_cache_isolate_func - Object migration callback function.
> + * @s: The cache we are working on.
> + * @ptr: Pointer to an array of pointers to the objects to migrate.
> + * @nr: Number of objects in array.
> + *
> + * The purpose of kmem_cache_isolate_func() is to pin each object so that
> + * they cannot be freed until kmem_cache_migrate_func() has processed
> + * them. This may be accomplished by increasing the refcount or setting
> + * a flag.
> + *
> + * The object pointer array passed is also passed to
> + * kmem_cache_migrate_func().  The function may remove objects from the
> + * array by setting pointers to NULL. This is useful if we can determine
> + * that an object is being freed because kmem_cache_isolate_func() was
> + * called when the subsystem was calling kmem_cache_free().  In that
> + * case it is not necessary to increase the refcount or specially mark
> + * the object because the release of the slab lock will lead to the
> + * immediate freeing of the object.
> + *
> + * Context: Called with locks held so that the slab objects cannot be
> + *          freed.  We are in an atomic context and no slab operations
> + *          may be performed.
> + * Return: A pointer that is passed to the migrate function. If any
> + *         objects cannot be touched at this point then the pointer may
> + *         indicate a failure and then the migration function can simply
> + *         remove the references that were already obtained. The private
> + *         data could be used to track the objects that were already pinned.
> + */
> +typedef void *kmem_cache_isolate_func(struct kmem_cache *s, void **ptr, int nr);
> +
> +/**
> + * typedef kmem_cache_migrate_func - Object migration callback function.
> + * @s: The cache we are working on.
> + * @ptr: Pointer to an array of pointers to the objects to migrate.
> + * @nr: Number of objects in array.
> + * @node: The NUMA node where the object should be allocated.
> + * @private: The pointer returned by kmem_cache_isolate_func().
> + *
> + * This function is responsible for migrating objects.  Typically, for
> + * each object in the input array you will want to allocate an new
> + * object, copy the original object, update any pointers, and free the
> + * old object.
> + *
> + * After this function returns all pointers to the old object should now
> + * point to the new object.
> + *
> + * Context: Called with no locks held and interrupts enabled.  Sleeping
> + *          is possible.  Any operation may be performed.
> + */
> +typedef void kmem_cache_migrate_func(struct kmem_cache *s, void **ptr,
> +				     int nr, int node, void *private);
> +
> +/*
> + * kmem_cache_setup_mobility() is used to setup callbacks for a slab cache.
> + */
> +#ifdef CONFIG_SLUB
> +void kmem_cache_setup_mobility(struct kmem_cache *, kmem_cache_isolate_func,
> +			       kmem_cache_migrate_func);
> +#else
> +static inline void kmem_cache_setup_mobility(struct kmem_cache *s,
> +	kmem_cache_isolate_func isolate, kmem_cache_migrate_func migrate) {}
> +#endif
> +
>  /*
>   * Please use this macro to create slab caches. Simply specify the
>   * name of the structure and maybe some flags that are listed above.
> diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
> index 3a1a1dbc6f49..a7340a1ed5dc 100644
> --- a/include/linux/slub_def.h
> +++ b/include/linux/slub_def.h
> @@ -99,6 +99,9 @@ struct kmem_cache {
>  	gfp_t allocflags;	/* gfp flags to use on each alloc */
>  	int refcount;		/* Refcount for slab cache destroy */
>  	void (*ctor)(void *);
> +	kmem_cache_isolate_func *isolate;
> +	kmem_cache_migrate_func *migrate;
> +
>  	unsigned int inuse;		/* Offset to metadata */
>  	unsigned int align;		/* Alignment */
>  	unsigned int red_left_pad;	/* Left redzone padding size */
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index f9d89c1b5977..754acdb292e4 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -298,6 +298,10 @@ int slab_unmergeable(struct kmem_cache *s)
>  	if (!is_root_cache(s))
>  		return 1;
>  
> +	/*
> +	 * s->isolate and s->migrate imply s->ctor so no need to
> +	 * check them explicitly.
> +	 */
>  	if (s->ctor)
>  		return 1;
>  
> diff --git a/mm/slub.c b/mm/slub.c
> index 69164aa7cbbf..0133168d1089 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4325,6 +4325,34 @@ int __kmem_cache_create(struct kmem_cache *s, slab_flags_t flags)
>  	return err;
>  }
>  
> +void kmem_cache_setup_mobility(struct kmem_cache *s,
> +			       kmem_cache_isolate_func isolate,
> +			       kmem_cache_migrate_func migrate)
> +{

I wonder if it's better to adapt kmem_cache_create() to take two additional
arguments? I suspect mobility is not a dynamic option, so it can be
set at kmem_cache creation.

> +	/*
> +	 * Mobile objects must have a ctor otherwise the object may be
> +	 * in an undefined state on allocation.  Since the object may
> +	 * need to be inspected by the migration function at any time
> +	 * after allocation we must ensure that the object always has a
> +	 * defined state.
> +	 */
> +	if (!s->ctor) {
> +		pr_err("%s: cannot setup mobility without a constructor\n",
> +		       s->name);
> +		return;
> +	}
> +
> +	s->isolate = isolate;
> +	s->migrate = migrate;
> +
> +	/*
> +	 * Sadly serialization requirements currently mean that we have
> +	 * to disable fast cmpxchg based processing.
> +	 */

Can you, please, elaborate a bit more here?

> +	s->flags &= ~__CMPXCHG_DOUBLE;
> +}
> +EXPORT_SYMBOL(kmem_cache_setup_mobility);
> +
>  void *__kmalloc_track_caller(size_t size, gfp_t gfpflags, unsigned long caller)
>  {
>  	struct kmem_cache *s;
> @@ -5018,6 +5046,20 @@ static ssize_t ops_show(struct kmem_cache *s, char *buf)
>  
>  	if (s->ctor)
>  		x += sprintf(buf + x, "ctor : %pS\n", s->ctor);
> +
> +	if (s->isolate) {
> +		x += sprintf(buf + x, "isolate : ");
> +		x += sprint_symbol(buf + x,
> +				(unsigned long)s->isolate);
> +		x += sprintf(buf + x, "\n");
> +	}

Is there a reason why s->ctor and s->isolate/migrate are printed
using different methods?

> +
> +	if (s->migrate) {
> +		x += sprintf(buf + x, "migrate : ");
> +		x += sprint_symbol(buf + x,
> +				(unsigned long)s->migrate);
> +		x += sprintf(buf + x, "\n");
> +	}
>  	return x;
>  }
>  SLAB_ATTR_RO(ops);
> -- 
> 2.21.0
> 

Thanks!
Tobin Harding March 12, 2019, 1:08 a.m. UTC | #7
On Mon, Mar 11, 2019 at 09:51:09PM +0000, Roman Gushchin wrote:
> On Fri, Mar 08, 2019 at 03:14:13PM +1100, Tobin C. Harding wrote:
> > Add the two methods needed for moving objects and enable the display of
> > the callbacks via the /sys/kernel/slab interface.
> > 
> > Add documentation explaining the use of these methods and the prototypes
> > for slab.h. Add functions to setup the callbacks method for a slab
> > cache.
> > 
> > Add empty functions for SLAB/SLOB. The API is generic so it could be
> > theoretically implemented for these allocators as well.
> > 
> > Co-developed-by: Christoph Lameter <cl@linux.com>
> > Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> > ---
> >  include/linux/slab.h     | 69 ++++++++++++++++++++++++++++++++++++++++
> >  include/linux/slub_def.h |  3 ++
> >  mm/slab_common.c         |  4 +++
> >  mm/slub.c                | 42 ++++++++++++++++++++++++
> >  4 files changed, 118 insertions(+)
> > 
> > diff --git a/include/linux/slab.h b/include/linux/slab.h
> > index 11b45f7ae405..22e87c41b8a4 100644
> > --- a/include/linux/slab.h
> > +++ b/include/linux/slab.h
> > @@ -152,6 +152,75 @@ void memcg_create_kmem_cache(struct mem_cgroup *, struct kmem_cache *);
> >  void memcg_deactivate_kmem_caches(struct mem_cgroup *);
> >  void memcg_destroy_kmem_caches(struct mem_cgroup *);
> >  
> > +/*
> > + * Function prototypes passed to kmem_cache_setup_mobility() to enable
> > + * mobile objects and targeted reclaim in slab caches.
> > + */
> > +
> > +/**
> > + * typedef kmem_cache_isolate_func - Object migration callback function.
> > + * @s: The cache we are working on.
> > + * @ptr: Pointer to an array of pointers to the objects to migrate.
> > + * @nr: Number of objects in array.
> > + *
> > + * The purpose of kmem_cache_isolate_func() is to pin each object so that
> > + * they cannot be freed until kmem_cache_migrate_func() has processed
> > + * them. This may be accomplished by increasing the refcount or setting
> > + * a flag.
> > + *
> > + * The object pointer array passed is also passed to
> > + * kmem_cache_migrate_func().  The function may remove objects from the
> > + * array by setting pointers to NULL. This is useful if we can determine
> > + * that an object is being freed because kmem_cache_isolate_func() was
> > + * called when the subsystem was calling kmem_cache_free().  In that
> > + * case it is not necessary to increase the refcount or specially mark
> > + * the object because the release of the slab lock will lead to the
> > + * immediate freeing of the object.
> > + *
> > + * Context: Called with locks held so that the slab objects cannot be
> > + *          freed.  We are in an atomic context and no slab operations
> > + *          may be performed.
> > + * Return: A pointer that is passed to the migrate function. If any
> > + *         objects cannot be touched at this point then the pointer may
> > + *         indicate a failure and then the migration function can simply
> > + *         remove the references that were already obtained. The private
> > + *         data could be used to track the objects that were already pinned.
> > + */
> > +typedef void *kmem_cache_isolate_func(struct kmem_cache *s, void **ptr, int nr);
> > +
> > +/**
> > + * typedef kmem_cache_migrate_func - Object migration callback function.
> > + * @s: The cache we are working on.
> > + * @ptr: Pointer to an array of pointers to the objects to migrate.
> > + * @nr: Number of objects in array.
> > + * @node: The NUMA node where the object should be allocated.
> > + * @private: The pointer returned by kmem_cache_isolate_func().
> > + *
> > + * This function is responsible for migrating objects.  Typically, for
> > + * each object in the input array you will want to allocate an new
> > + * object, copy the original object, update any pointers, and free the
> > + * old object.
> > + *
> > + * After this function returns all pointers to the old object should now
> > + * point to the new object.
> > + *
> > + * Context: Called with no locks held and interrupts enabled.  Sleeping
> > + *          is possible.  Any operation may be performed.
> > + */
> > +typedef void kmem_cache_migrate_func(struct kmem_cache *s, void **ptr,
> > +				     int nr, int node, void *private);
> > +
> > +/*
> > + * kmem_cache_setup_mobility() is used to setup callbacks for a slab cache.
> > + */
> > +#ifdef CONFIG_SLUB
> > +void kmem_cache_setup_mobility(struct kmem_cache *, kmem_cache_isolate_func,
> > +			       kmem_cache_migrate_func);
> > +#else
> > +static inline void kmem_cache_setup_mobility(struct kmem_cache *s,
> > +	kmem_cache_isolate_func isolate, kmem_cache_migrate_func migrate) {}
> > +#endif
> > +
> >  /*
> >   * Please use this macro to create slab caches. Simply specify the
> >   * name of the structure and maybe some flags that are listed above.
> > diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
> > index 3a1a1dbc6f49..a7340a1ed5dc 100644
> > --- a/include/linux/slub_def.h
> > +++ b/include/linux/slub_def.h
> > @@ -99,6 +99,9 @@ struct kmem_cache {
> >  	gfp_t allocflags;	/* gfp flags to use on each alloc */
> >  	int refcount;		/* Refcount for slab cache destroy */
> >  	void (*ctor)(void *);
> > +	kmem_cache_isolate_func *isolate;
> > +	kmem_cache_migrate_func *migrate;
> > +
> >  	unsigned int inuse;		/* Offset to metadata */
> >  	unsigned int align;		/* Alignment */
> >  	unsigned int red_left_pad;	/* Left redzone padding size */
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index f9d89c1b5977..754acdb292e4 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -298,6 +298,10 @@ int slab_unmergeable(struct kmem_cache *s)
> >  	if (!is_root_cache(s))
> >  		return 1;
> >  
> > +	/*
> > +	 * s->isolate and s->migrate imply s->ctor so no need to
> > +	 * check them explicitly.
> > +	 */
> >  	if (s->ctor)
> >  		return 1;
> >  
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 69164aa7cbbf..0133168d1089 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -4325,6 +4325,34 @@ int __kmem_cache_create(struct kmem_cache *s, slab_flags_t flags)
> >  	return err;
> >  }
> >  
> > +void kmem_cache_setup_mobility(struct kmem_cache *s,
> > +			       kmem_cache_isolate_func isolate,
> > +			       kmem_cache_migrate_func migrate)
> > +{
> 
> I wonder if it's better to adapt kmem_cache_create() to take two additional
> argument? I suspect mobility is not a dynamic option, so it can be
> set on kmem_cache creation.


Thanks for the review.  You are correct: mobility is not dynamic (at the
moment, once enabled it cannot be disabled).  I don't think we want to
change every caller of kmem_cache_create() though, adding two new
parameters that are almost always going to be NULL.  Also, I cannot ATM
see how object migration would be useful to SLOB, so changing the API for
all slab allocators does not seem like a good thing.

thanks,
Tobin.
Christoph Lameter (Ampere) March 12, 2019, 4:35 a.m. UTC | #8
On Mon, 11 Mar 2019, Roman Gushchin wrote:

> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -4325,6 +4325,34 @@ int __kmem_cache_create(struct kmem_cache *s, slab_flags_t flags)
> >  	return err;
> >  }
> >
> > +void kmem_cache_setup_mobility(struct kmem_cache *s,
> > +			       kmem_cache_isolate_func isolate,
> > +			       kmem_cache_migrate_func migrate)
> > +{
>
> I wonder if it's better to adapt kmem_cache_create() to take two additional
> argument? I suspect mobility is not a dynamic option, so it can be
> set on kmem_cache creation.

One other idea that prior versions of this patchset used was to change
kmem_cache_create() so that the ctor parameter becomes an ops vector.

However, in order to reduce the size of the patchset I dropped that. It
could be easily moved back to the way it was before.

> > +	/*
> > +	 * Sadly serialization requirements currently mean that we have
> > +	 * to disable fast cmpxchg based processing.
> > +	 */
>
> Can you, please, elaborate a bit more here?

cmpxchg-based processing does not lock the struct page. SMO requires
ensuring that all changes on a slab page can be stopped; the page->lock
will accomplish that. I think we could avoid actually locking the
page with some more work.
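[Editorial note] The ops-vector alternative mentioned here might look roughly as follows. This is a hypothetical sketch of the earlier-RFC shape, not code from the posted series; the struct and field names are illustrative only:

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration standing in for the kernel's cache descriptor. */
struct kmem_cache;

/*
 * Hypothetical ops vector that could replace the bare ctor argument to
 * kmem_cache_create(): the constructor and the two mobility callbacks
 * would travel together, so mobility is fixed at cache-creation time.
 */
struct kmem_cache_ops {
	void (*ctor)(void *);
	void *(*isolate)(struct kmem_cache *s, void **ptr, int nr);
	void (*migrate)(struct kmem_cache *s, void **ptr, int nr,
			int node, void *private);
};
```

A cache created with a NULL isolate/migrate pair would simply be immovable, matching today's behaviour for ctor-only caches.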
Roman Gushchin March 12, 2019, 6:47 p.m. UTC | #9
On Tue, Mar 12, 2019 at 04:35:15AM +0000, Christopher Lameter wrote:
> On Mon, 11 Mar 2019, Roman Gushchin wrote:
> 
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -4325,6 +4325,34 @@ int __kmem_cache_create(struct kmem_cache *s, slab_flags_t flags)
> > >  	return err;
> > >  }
> > >
> > > +void kmem_cache_setup_mobility(struct kmem_cache *s,
> > > +			       kmem_cache_isolate_func isolate,
> > > +			       kmem_cache_migrate_func migrate)
> > > +{
> >
> > I wonder if it's better to adapt kmem_cache_create() to take two additional
> > argument? I suspect mobility is not a dynamic option, so it can be
> > set on kmem_cache creation.
> 
> One other idea that prior versions of this patchset used was to change
> kmem_cache_create() so that the ctor parameter becomes an ops vector.
> 
> However, in order to reduce the size of the patchset I dropped that. It
> could be easily moved back to the way it was before.

Understood. I like the idea of an ops vector, but it can be done later,
agree.

> 
> > > +	/*
> > > +	 * Sadly serialization requirements currently mean that we have
> > > +	 * to disable fast cmpxchg based processing.
> > > +	 */
> >
> > Can you, please, elaborate a bit more here?
> 
> cmpxchg based processing does not lock the struct page. SMO requires to
> ensure that all changes on a slab page can be stopped. The page->lock will
> accomplish that. I think we could avoid dealing with actually locking the
> page with some more work.

Thank you for the explanation!
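[Editorial note] To make the registration semantics discussed above concrete, here is a plain userspace mock of the guard and flag handling. The mock_ names and MOCK_CMPXCHG_DOUBLE are invented stand-ins, not kernel API:

```c
#include <assert.h>
#include <stdio.h>

/* Invented stand-in for the kernel's __CMPXCHG_DOUBLE flag. */
#define MOCK_CMPXCHG_DOUBLE	(1U << 0)

/* Simplified stand-in for struct kmem_cache. */
struct mock_cache {
	const char *name;
	unsigned int flags;
	void (*ctor)(void *);
	void *(*isolate)(void **ptr, int nr);
	void (*migrate)(void **ptr, int nr, int node, void *private);
};

/*
 * Mirrors the registration path in the patch: refuse mobility without a
 * constructor (objects would be in an undefined state when inspected by
 * the migration code), otherwise install both callbacks and drop the
 * fast cmpxchg flag.
 */
static void mock_setup_mobility(struct mock_cache *s,
				void *(*isolate)(void **, int),
				void (*migrate)(void **, int, int, void *))
{
	if (!s->ctor) {
		fprintf(stderr, "%s: cannot setup mobility without a constructor\n",
			s->name);
		return;
	}
	s->isolate = isolate;
	s->migrate = migrate;
	s->flags &= ~MOCK_CMPXCHG_DOUBLE;
}

/* Trivial callbacks for exercising the mock. */
static void noop_ctor(void *obj) { (void)obj; }
static void *noop_isolate(void **ptr, int nr)
{ (void)ptr; (void)nr; return NULL; }
static void noop_migrate(void **ptr, int nr, int node, void *private)
{ (void)ptr; (void)nr; (void)node; (void)private; }
```

Calling mock_setup_mobility() on a cache without a ctor leaves both callbacks NULL and the flags untouched, just as the patch's pr_err() path leaves the real cache unmodified.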

Patch

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 11b45f7ae405..22e87c41b8a4 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -152,6 +152,75 @@  void memcg_create_kmem_cache(struct mem_cgroup *, struct kmem_cache *);
 void memcg_deactivate_kmem_caches(struct mem_cgroup *);
 void memcg_destroy_kmem_caches(struct mem_cgroup *);
 
+/*
+ * Function prototypes passed to kmem_cache_setup_mobility() to enable
+ * mobile objects and targeted reclaim in slab caches.
+ */
+
+/**
+ * typedef kmem_cache_isolate_func - Object migration callback function.
+ * @s: The cache we are working on.
+ * @ptr: Pointer to an array of pointers to the objects to migrate.
+ * @nr: Number of objects in array.
+ *
+ * The purpose of kmem_cache_isolate_func() is to pin the objects so that
+ * they cannot be freed until kmem_cache_migrate_func() has processed
+ * them. This may be accomplished by increasing the refcount or setting
+ * a flag.
+ *
+ * The object pointer array passed is also passed to
+ * kmem_cache_migrate_func().  The function may remove objects from the
+ * array by setting pointers to NULL. This is useful if we can determine
+ * that an object is being freed because kmem_cache_isolate_func() was
+ * called when the subsystem was calling kmem_cache_free().  In that
+ * case it is not necessary to increase the refcount or specially mark
+ * the object because the release of the slab lock will lead to the
+ * immediate freeing of the object.
+ *
+ * Context: Called with locks held so that the slab objects cannot be
+ *          freed.  We are in an atomic context and no slab operations
+ *          may be performed.
+ * Return: A pointer that is passed to the migrate function. If any
+ *         objects cannot be touched at this point then the pointer may
+ *         indicate a failure and then the migration function can simply
+ *         remove the references that were already obtained. The private
+ *         data could be used to track the objects that were already pinned.
+ */
+typedef void *kmem_cache_isolate_func(struct kmem_cache *s, void **ptr, int nr);
+
+/**
+ * typedef kmem_cache_migrate_func - Object migration callback function.
+ * @s: The cache we are working on.
+ * @ptr: Pointer to an array of pointers to the objects to migrate.
+ * @nr: Number of objects in array.
+ * @node: The NUMA node where the object should be allocated.
+ * @private: The pointer returned by kmem_cache_isolate_func().
+ *
+ * This function is responsible for migrating objects.  Typically, for
+ * each object in the input array you will want to allocate a new
+ * object, copy the original object, update any pointers, and free the
+ * old object.
+ *
+ * After this function returns, all pointers to the old objects should
+ * point to the new objects.
+ *
+ * Context: Called with no locks held and interrupts enabled.  Sleeping
+ *          is possible.  Any operation may be performed.
+ */
+typedef void kmem_cache_migrate_func(struct kmem_cache *s, void **ptr,
+				     int nr, int node, void *private);
+
+/*
+ * kmem_cache_setup_mobility() is used to set up callbacks for a slab cache.
+ */
+#ifdef CONFIG_SLUB
+void kmem_cache_setup_mobility(struct kmem_cache *, kmem_cache_isolate_func,
+			       kmem_cache_migrate_func);
+#else
+static inline void kmem_cache_setup_mobility(struct kmem_cache *s,
+	kmem_cache_isolate_func isolate, kmem_cache_migrate_func migrate) {}
+#endif
+
 /*
  * Please use this macro to create slab caches. Simply specify the
  * name of the structure and maybe some flags that are listed above.
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 3a1a1dbc6f49..a7340a1ed5dc 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -99,6 +99,9 @@  struct kmem_cache {
 	gfp_t allocflags;	/* gfp flags to use on each alloc */
 	int refcount;		/* Refcount for slab cache destroy */
 	void (*ctor)(void *);
+	kmem_cache_isolate_func *isolate;
+	kmem_cache_migrate_func *migrate;
+
 	unsigned int inuse;		/* Offset to metadata */
 	unsigned int align;		/* Alignment */
 	unsigned int red_left_pad;	/* Left redzone padding size */
diff --git a/mm/slab_common.c b/mm/slab_common.c
index f9d89c1b5977..754acdb292e4 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -298,6 +298,10 @@  int slab_unmergeable(struct kmem_cache *s)
 	if (!is_root_cache(s))
 		return 1;
 
+	/*
+	 * s->isolate and s->migrate imply s->ctor so no need to
+	 * check them explicitly.
+	 */
 	if (s->ctor)
 		return 1;
 
diff --git a/mm/slub.c b/mm/slub.c
index 69164aa7cbbf..0133168d1089 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4325,6 +4325,34 @@  int __kmem_cache_create(struct kmem_cache *s, slab_flags_t flags)
 	return err;
 }
 
+void kmem_cache_setup_mobility(struct kmem_cache *s,
+			       kmem_cache_isolate_func isolate,
+			       kmem_cache_migrate_func migrate)
+{
+	/*
+	 * Mobile objects must have a ctor otherwise the object may be
+	 * in an undefined state on allocation.  Since the object may
+	 * need to be inspected by the migration function at any time
+	 * after allocation we must ensure that the object always has a
+	 * defined state.
+	 */
+	if (!s->ctor) {
+		pr_err("%s: cannot setup mobility without a constructor\n",
+		       s->name);
+		return;
+	}
+
+	s->isolate = isolate;
+	s->migrate = migrate;
+
+	/*
+	 * Sadly serialization requirements currently mean that we have
+	 * to disable fast cmpxchg based processing.
+	 */
+	s->flags &= ~__CMPXCHG_DOUBLE;
+}
+EXPORT_SYMBOL(kmem_cache_setup_mobility);
+
 void *__kmalloc_track_caller(size_t size, gfp_t gfpflags, unsigned long caller)
 {
 	struct kmem_cache *s;
@@ -5018,6 +5046,20 @@  static ssize_t ops_show(struct kmem_cache *s, char *buf)
 
 	if (s->ctor)
 		x += sprintf(buf + x, "ctor : %pS\n", s->ctor);
+
+	if (s->isolate) {
+		x += sprintf(buf + x, "isolate : ");
+		x += sprint_symbol(buf + x,
+				(unsigned long)s->isolate);
+		x += sprintf(buf + x, "\n");
+	}
+
+	if (s->migrate) {
+		x += sprintf(buf + x, "migrate : ");
+		x += sprint_symbol(buf + x,
+				(unsigned long)s->migrate);
+		x += sprintf(buf + x, "\n");
+	}
 	return x;
 }
 SLAB_ATTR_RO(ops);
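[Editorial note] As a footnote to the patch above, the two-phase isolate/migrate contract can be sketched in plain userspace C. Everything below (struct obj, mock_isolate, mock_migrate) is a hypothetical illustration of the protocol, not kernel code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A toy "slab object": pinned while refcount > 0. */
struct obj {
	int refcount;
	int payload;
};

/*
 * Phase 1 (atomic context in the kernel): pin every object so it cannot
 * be freed before the migrate phase runs.  The return value plays the
 * role of the private data handed to the migrate callback.
 */
static void *mock_isolate(void **ptrs, int nr)
{
	for (int i = 0; i < nr; i++) {
		struct obj *o = ptrs[i];

		o->refcount++;		/* in-kernel: atomic inc or a flag */
	}
	return NULL;
}

/*
 * Phase 2 (sleeping allowed in the kernel): allocate a replacement for
 * each object, copy it, update the pointer, drop the pin, and free the
 * original -- the sequence the kmem_cache_migrate_func() documentation
 * describes.
 */
static void mock_migrate(void **ptrs, int nr, void *private)
{
	(void)private;
	for (int i = 0; i < nr; i++) {
		struct obj *src = ptrs[i];
		struct obj *dst = malloc(sizeof(*dst));

		memcpy(dst, src, sizeof(*dst));
		dst->refcount = 0;	/* drop the pin from mock_isolate() */
		ptrs[i] = dst;		/* "update any pointers" */
		free(src);
	}
}
```

In the real API the kernel, not the subsystem, drives both phases: it builds the pointer array from a slab page it wants to evacuate, calls the registered isolate callback under the slab lock, then the migrate callback with no locks held.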