mm/swap: Fix release_pages() when releasing devmap pages

Message ID 20190523223746.4982-1-ira.weiny@intel.com
State New

Commit Message

Ira Weiny May 23, 2019, 10:37 p.m. UTC
From: Ira Weiny <ira.weiny@intel.com>

Device pages can be more than type MEMORY_DEVICE_PUBLIC.

Handle all device pages within release_pages()

This was found via code inspection while determining if release_pages()
and the new put_user_pages() could be interchangeable.

Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 mm/swap.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

Comments

John Hubbard May 24, 2019, 1:05 a.m. UTC | #1
On 5/23/19 3:37 PM, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Device pages can be more than type MEMORY_DEVICE_PUBLIC.
> 
> Handle all device pages within release_pages()
> 
> This was found via code inspection while determining if release_pages()
> and the new put_user_pages() could be interchangeable.
> 
> Cc: Jérôme Glisse <jglisse@redhat.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: John Hubbard <jhubbard@nvidia.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
>  mm/swap.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index 3a75722e68a9..d1e8122568d0 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -739,15 +739,14 @@ void release_pages(struct page **pages, int nr)
>  		if (is_huge_zero_page(page))
>  			continue;
>  
> -		/* Device public page can not be huge page */
> -		if (is_device_public_page(page)) {
> +		if (is_zone_device_page(page)) {
>  			if (locked_pgdat) {
>  				spin_unlock_irqrestore(&locked_pgdat->lru_lock,
>  						       flags);
>  				locked_pgdat = NULL;
>  			}
> -			put_devmap_managed_page(page);
> -			continue;
> +			if (put_devmap_managed_page(page))
> +				continue;
>  		}
>  
>  		page = compound_head(page);
> 

Reviewed-by: John Hubbard <jhubbard@nvidia.com>

thanks,
Dan Williams May 24, 2019, 3:58 a.m. UTC | #2
On Thu, May 23, 2019 at 3:37 PM <ira.weiny@intel.com> wrote:
>
> From: Ira Weiny <ira.weiny@intel.com>
>
> Device pages can be more than type MEMORY_DEVICE_PUBLIC.
>
> Handle all device pages within release_pages()
>
> This was found via code inspection while determining if release_pages()
> and the new put_user_pages() could be interchangeable.
>
> Cc: Jérôme Glisse <jglisse@redhat.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: John Hubbard <jhubbard@nvidia.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
>  mm/swap.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/mm/swap.c b/mm/swap.c
> index 3a75722e68a9..d1e8122568d0 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -739,15 +739,14 @@ void release_pages(struct page **pages, int nr)
>                 if (is_huge_zero_page(page))
>                         continue;
>
> -               /* Device public page can not be huge page */
> -               if (is_device_public_page(page)) {
> +               if (is_zone_device_page(page)) {
>                         if (locked_pgdat) {
>                                 spin_unlock_irqrestore(&locked_pgdat->lru_lock,
>                                                        flags);
>                                 locked_pgdat = NULL;
>                         }
> -                       put_devmap_managed_page(page);
> -                       continue;
> +                       if (put_devmap_managed_page(page))

This "shouldn't" fail, and if it does the code that follows might get
confused by a ZONE_DEVICE page. If anything I would make this a
WARN_ON_ONCE(!put_devmap_managed_page(page)), but always continue
unconditionally.

Other than that you can add:

    Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Ira Weiny May 24, 2019, 3:36 p.m. UTC | #3
On Thu, May 23, 2019 at 08:58:12PM -0700, Dan Williams wrote:
> On Thu, May 23, 2019 at 3:37 PM <ira.weiny@intel.com> wrote:
> >
> > From: Ira Weiny <ira.weiny@intel.com>
> >
> > Device pages can be more than type MEMORY_DEVICE_PUBLIC.
> >
> > Handle all device pages within release_pages()
> >
> > This was found via code inspection while determining if release_pages()
> > and the new put_user_pages() could be interchangeable.
> >
> > Cc: Jérôme Glisse <jglisse@redhat.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Michal Hocko <mhocko@suse.com>
> > Cc: John Hubbard <jhubbard@nvidia.com>
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > ---
> >  mm/swap.c | 7 +++----
> >  1 file changed, 3 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 3a75722e68a9..d1e8122568d0 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -739,15 +739,14 @@ void release_pages(struct page **pages, int nr)
> >                 if (is_huge_zero_page(page))
> >                         continue;
> >
> > -               /* Device public page can not be huge page */
> > -               if (is_device_public_page(page)) {
> > +               if (is_zone_device_page(page)) {
> >                         if (locked_pgdat) {
> >                                 spin_unlock_irqrestore(&locked_pgdat->lru_lock,
> >                                                        flags);
> >                                 locked_pgdat = NULL;
> >                         }
> > -                       put_devmap_managed_page(page);
> > -                       continue;
> > +                       if (put_devmap_managed_page(page))
> 
> This "shouldn't" fail, and if it does the code that follows might get

I agree it shouldn't based on the check.  However...

> confused by a ZONE_DEVICE page. If anything I would make this a
> WARN_ON_ONCE(!put_devmap_managed_page(page)), but always continue
> unconditionally.

I was trying to follow the pattern from put_page(), where a failure indicates
that the page was not a devmap page and so "regular" processing should continue.

Since I'm unsure, I'll just ask: what does this check do?

        if (!static_branch_unlikely(&devmap_managed_key))
                return false;

... In put_devmap_managed_page()?

> 
> Other than that you can add:
> 
>     Reviewed-by: Dan Williams <dan.j.williams@intel.com>

Thanks, v2 to follow.

Ira
Dan Williams May 24, 2019, 3:48 p.m. UTC | #4
On Fri, May 24, 2019 at 8:35 AM Ira Weiny <ira.weiny@intel.com> wrote:
>
> On Thu, May 23, 2019 at 08:58:12PM -0700, Dan Williams wrote:
> > On Thu, May 23, 2019 at 3:37 PM <ira.weiny@intel.com> wrote:
> > >
> > > From: Ira Weiny <ira.weiny@intel.com>
> > >
> > > Device pages can be more than type MEMORY_DEVICE_PUBLIC.
> > >
> > > Handle all device pages within release_pages()
> > >
> > > This was found via code inspection while determining if release_pages()
> > > and the new put_user_pages() could be interchangeable.
> > >
> > > Cc: Jérôme Glisse <jglisse@redhat.com>
> > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > Cc: Michal Hocko <mhocko@suse.com>
> > > Cc: John Hubbard <jhubbard@nvidia.com>
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > ---
> > >  mm/swap.c | 7 +++----
> > >  1 file changed, 3 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/mm/swap.c b/mm/swap.c
> > > index 3a75722e68a9..d1e8122568d0 100644
> > > --- a/mm/swap.c
> > > +++ b/mm/swap.c
> > > @@ -739,15 +739,14 @@ void release_pages(struct page **pages, int nr)
> > >                 if (is_huge_zero_page(page))
> > >                         continue;
> > >
> > > -               /* Device public page can not be huge page */
> > > -               if (is_device_public_page(page)) {
> > > +               if (is_zone_device_page(page)) {
> > >                         if (locked_pgdat) {
> > >                                 spin_unlock_irqrestore(&locked_pgdat->lru_lock,
> > >                                                        flags);
> > >                                 locked_pgdat = NULL;
> > >                         }
> > > -                       put_devmap_managed_page(page);
> > > -                       continue;
> > > +                       if (put_devmap_managed_page(page))
> >
> > This "shouldn't" fail, and if it does the code that follows might get
>
> I agree it shouldn't based on the check.  However...
>
> > confused by a ZONE_DEVICE page. If anything I would make this a
> > WARN_ON_ONCE(!put_devmap_managed_page(page)), but always continue
> > unconditionally.
>
> I was trying to follow the pattern from put_page(), where a failure indicates
> that the page was not a devmap page and so "regular" processing should continue.

In this case that regular continuation already happened by not taking
the if (is_zone_device_page(page)) branch.

>
> Since I'm unsure, I'll just ask: what does this check do?
>
>         if (!static_branch_unlikely(&devmap_managed_key))
>                 return false;

That attempts to skip the overhead imposed by device pages, i.e. the
->page_free() callback and other extras, if there are no device-page
producers in the system. In other words, use the old simple put_page()
path when there is no HMM or pmem.
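
The gating described above can be modelled in user space. This is a minimal sketch under stated assumptions, not the kernel's implementation: a plain counter stands in for the devmap_managed_key static branch, and every gate_* name, the `managed` flag, and the `callback_runs` field are hypothetical. A real static key goes further by patching the branch in the instruction stream instead of testing a variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for devmap_managed_key: counts device-page producers. */
static int gate_producers;

struct gate_page {
	bool zone_device;   /* stand-in for is_zone_device_page() */
	bool managed;       /* assumption: page type wants a page_free-style callback */
	int callback_runs;  /* how often the modelled callback path ran */
};

static void gate_register_producer(void)
{
	gate_producers++;
}

static void gate_unregister_producer(void)
{
	assert(gate_producers > 0);
	gate_producers--;
}

/* Returns true only when the device-page path consumed the put. */
static bool gate_put_devmap_managed_page(struct gate_page *page)
{
	/* Fast path: no producers registered, so skip all device-page work. */
	if (gate_producers == 0)
		return false;

	if (!page->zone_device || !page->managed)
		return false;

	/* Slow path: the page_free-style callback and other extras go here. */
	page->callback_runs++;
	return true;
}
```

With no producer registered, the function returns false even for a device page, so the caller stays on the plain put_page()-style path.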
Dan Williams May 24, 2019, 9:03 p.m. UTC | #5
On Thu, May 23, 2019 at 8:58 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Thu, May 23, 2019 at 3:37 PM <ira.weiny@intel.com> wrote:
> >
> > From: Ira Weiny <ira.weiny@intel.com>
> >
> > Device pages can be more than type MEMORY_DEVICE_PUBLIC.
> >
> > Handle all device pages within release_pages()
> >
> > This was found via code inspection while determining if release_pages()
> > and the new put_user_pages() could be interchangeable.
> >
> > Cc: Jérôme Glisse <jglisse@redhat.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Michal Hocko <mhocko@suse.com>
> > Cc: John Hubbard <jhubbard@nvidia.com>
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > ---
> >  mm/swap.c | 7 +++----
> >  1 file changed, 3 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 3a75722e68a9..d1e8122568d0 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -739,15 +739,14 @@ void release_pages(struct page **pages, int nr)
> >                 if (is_huge_zero_page(page))
> >                         continue;
> >
> > -               /* Device public page can not be huge page */
> > -               if (is_device_public_page(page)) {
> > +               if (is_zone_device_page(page)) {
> >                         if (locked_pgdat) {
> >                                 spin_unlock_irqrestore(&locked_pgdat->lru_lock,
> >                                                        flags);
> >                                 locked_pgdat = NULL;
> >                         }
> > -                       put_devmap_managed_page(page);
> > -                       continue;
> > +                       if (put_devmap_managed_page(page))
>
> This "shouldn't" fail, and if it does the code that follows might get
> confused by a ZONE_DEVICE page. If anything I would make this a
> WARN_ON_ONCE(!put_devmap_managed_page(page)), but always continue
> unconditionally.

As discussed offline, I'm wrong here. It needs to fall through to
put_page_testzero() for the device-dax case, but perhaps add a comment
for the next time I forget that subtlety.
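
The subtlety can be illustrated with a small user-space model of the loop. This is a sketch under assumptions rather than the code in mm/swap.c: the rel_* names, the `managed` flag, and the `unconditional_continue` switch are hypothetical, and a bare integer refcount stands in for put_page_testzero(). The switch selects between the posted behaviour (continue only when the managed put succeeded) and the always-continue variant suggested earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct rel_page {
	int refcount;
	bool zone_device;  /* stand-in for is_zone_device_page() */
	bool managed;      /* assumption: false models the device-dax case */
	bool freed;
};

/* Stand-in for put_devmap_managed_page(): true only if it dropped the ref. */
static bool rel_put_devmap_managed(struct rel_page *page)
{
	if (!page->zone_device || !page->managed)
		return false;
	page->refcount--;
	return true;
}

/* Returns how many pages reached a zero refcount on the regular path. */
static int rel_release_pages(struct rel_page **pages, int nr,
			     bool unconditional_continue)
{
	int freed = 0;

	for (int i = 0; i < nr; i++) {
		struct rel_page *page = pages[i];

		assert(page != NULL);

		if (page->zone_device) {
			/* The real loop would drop the LRU lock here first. */
			if (rel_put_devmap_managed(page) || unconditional_continue)
				continue;
			/*
			 * Not handled by the managed path (device-dax in the
			 * thread): fall through so the reference is still dropped.
			 */
		}

		/* Stand-in for put_page_testzero(). */
		if (--page->refcount != 0)
			continue;

		page->freed = true;
		freed++;
	}
	return freed;
}
```

In this model, an unmanaged device page keeps its reference forever when `unconditional_continue` is true, which is the leak the fall-through avoids.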

Patch

diff --git a/mm/swap.c b/mm/swap.c
index 3a75722e68a9..d1e8122568d0 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -739,15 +739,14 @@  void release_pages(struct page **pages, int nr)
 		if (is_huge_zero_page(page))
 			continue;
 
-		/* Device public page can not be huge page */
-		if (is_device_public_page(page)) {
+		if (is_zone_device_page(page)) {
 			if (locked_pgdat) {
 				spin_unlock_irqrestore(&locked_pgdat->lru_lock,
 						       flags);
 				locked_pgdat = NULL;
 			}
-			put_devmap_managed_page(page);
-			continue;
+			if (put_devmap_managed_page(page))
+				continue;
 		}
 
 		page = compound_head(page);