diff mbox series

[2/6] hugetlbfs: fix cannot migrate the fallocated HugeTLB page

Message ID 20210104065843.5658-2-songmuchun@bytedance.com (mailing list archive)
State New, archived
Headers show
Series [1/6] mm: migrate: do not migrate HugeTLB page whose refcount is one | expand

Commit Message

Muchun Song Jan. 4, 2021, 6:58 a.m. UTC
Because we only can isolate a active page via isolate_huge_page()
and hugetlbfs_fallocate() forget to mark it as active, we cannot
isolate and migrate those pages.

Fixes: 70c3547e36f5 (hugetlbfs: add hugetlbfs_fallocate())
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 fs/hugetlbfs/inode.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Mike Kravetz Jan. 4, 2021, 10:38 p.m. UTC | #1
On 1/3/21 10:58 PM, Muchun Song wrote:
> Because we only can isolate a active page via isolate_huge_page()
> and hugetlbfs_fallocate() forget to mark it as active, we cannot
> isolate and migrate those pages.
> 
> Fixes: 70c3547e36f5 (hugetlbfs: add hugetlbfs_fallocate())
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  fs/hugetlbfs/inode.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

Good catch.  This is indeed an issue.

> 
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index b5c109703daa..2aceb085d202 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -737,10 +737,11 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
>  
>  		/*
>  		 * unlock_page because locked by add_to_page_cache()
> -		 * page_put due to reference from alloc_huge_page()
> +		 * put_page() (which is in the putback_active_hugepage())
> +		 * due to reference from alloc_huge_page()

Thanks for fixing the comment.

>  		 */
>  		unlock_page(page);
> -		put_page(page);
> +		putback_active_hugepage(page);

I'm curious why you used putback_active_hugepage() here instead of simply
calling set_page_huge_active() before the put_page()?

When the page was allocated, it was placed on the active list (alloc_huge_page).
Therefore, the hugetlb_lock locking and list movement should not be necessary.
Muchun Song Jan. 5, 2021, 2:44 a.m. UTC | #2
On Tue, Jan 5, 2021 at 6:40 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 1/3/21 10:58 PM, Muchun Song wrote:
> > Because we only can isolate a active page via isolate_huge_page()
> > and hugetlbfs_fallocate() forget to mark it as active, we cannot
> > isolate and migrate those pages.
> >
> > Fixes: 70c3547e36f5 (hugetlbfs: add hugetlbfs_fallocate())
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > ---
> >  fs/hugetlbfs/inode.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
>
> Good catch.  This is indeed an issue.
>
> >
> > diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> > index b5c109703daa..2aceb085d202 100644
> > --- a/fs/hugetlbfs/inode.c
> > +++ b/fs/hugetlbfs/inode.c
> > @@ -737,10 +737,11 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
> >
> >               /*
> >                * unlock_page because locked by add_to_page_cache()
> > -              * page_put due to reference from alloc_huge_page()
> > +              * put_page() (which is in the putback_active_hugepage())
> > +              * due to reference from alloc_huge_page()
>
> Thanks for fixing the comment.
>
> >                */
> >               unlock_page(page);
> > -             put_page(page);
> > +             putback_active_hugepage(page);
>
> I'm curious why you used putback_active_hugepage() here instead of simply
> calling set_page_huge_active() before the put_page()?
>
> When the page was allocated, it was placed on the active list (alloc_huge_page).
> Therefore, the hugetlb_lock locking and list movement should not be necessary.

I agree with you. Because set_page_huge_active is not exported (static
function). Only exporting set_page_huge_active seems strange (leaving
clear_page_huge_active not export). This is just my opinion. What's
yours, Mike?

Thanks.

>
> --
> Mike Kravetz
>
> >       }
> >
> >       if (!(mode & FALLOC_FL_KEEP_SIZE) && offset + len > inode->i_size)
> >
Mike Kravetz Jan. 5, 2021, 10:27 p.m. UTC | #3
On 1/4/21 6:44 PM, Muchun Song wrote:
> On Tue, Jan 5, 2021 at 6:40 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>
>> On 1/3/21 10:58 PM, Muchun Song wrote:
>>> Because we only can isolate a active page via isolate_huge_page()
>>> and hugetlbfs_fallocate() forget to mark it as active, we cannot
>>> isolate and migrate those pages.
>>>
>>> Fixes: 70c3547e36f5 (hugetlbfs: add hugetlbfs_fallocate())
>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>> ---
>>>  fs/hugetlbfs/inode.c | 5 +++--
>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> Good catch.  This is indeed an issue.
>>
>>>
>>> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
>>> index b5c109703daa..2aceb085d202 100644
>>> --- a/fs/hugetlbfs/inode.c
>>> +++ b/fs/hugetlbfs/inode.c
>>> @@ -737,10 +737,11 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
>>>
>>>               /*
>>>                * unlock_page because locked by add_to_page_cache()
>>> -              * page_put due to reference from alloc_huge_page()
>>> +              * put_page() (which is in the putback_active_hugepage())
>>> +              * due to reference from alloc_huge_page()
>>
>> Thanks for fixing the comment.
>>
>>>                */
>>>               unlock_page(page);
>>> -             put_page(page);
>>> +             putback_active_hugepage(page);
>>
>> I'm curious why you used putback_active_hugepage() here instead of simply
>> calling set_page_huge_active() before the put_page()?
>>
>> When the page was allocated, it was placed on the active list (alloc_huge_page).
>> Therefore, the hugetlb_lock locking and list movement should not be necessary.
> 
> I agree with you. Because set_page_huge_active is not exported (static
> function). Only exporting set_page_huge_active seems strange (leaving
> clear_page_huge_active not export). This is just my opinion. What's
> yours, Mike?

I'm thinking that we should export (make external) set_page_huge_active.
We can leave clear_page_huge_active as static and just add something to
the commit log noting that there are no external users.

My primary reason for doing this is to eliminate the extra and unnecessary
per-page lock/unlock cycle.  I believe there are some applications that
use fallocate to pre-allocate very large hugetlbfs files.  They may notice
the extra overhead.
Muchun Song Jan. 6, 2021, 2:57 a.m. UTC | #4
On Wed, Jan 6, 2021 at 6:29 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 1/4/21 6:44 PM, Muchun Song wrote:
> > On Tue, Jan 5, 2021 at 6:40 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
> >>
> >> On 1/3/21 10:58 PM, Muchun Song wrote:
> >>> Because we only can isolate a active page via isolate_huge_page()
> >>> and hugetlbfs_fallocate() forget to mark it as active, we cannot
> >>> isolate and migrate those pages.
> >>>
> >>> Fixes: 70c3547e36f5 (hugetlbfs: add hugetlbfs_fallocate())
> >>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> >>> ---
> >>>  fs/hugetlbfs/inode.c | 5 +++--
> >>>  1 file changed, 3 insertions(+), 2 deletions(-)
> >>
> >> Good catch.  This is indeed an issue.
> >>
> >>>
> >>> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> >>> index b5c109703daa..2aceb085d202 100644
> >>> --- a/fs/hugetlbfs/inode.c
> >>> +++ b/fs/hugetlbfs/inode.c
> >>> @@ -737,10 +737,11 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
> >>>
> >>>               /*
> >>>                * unlock_page because locked by add_to_page_cache()
> >>> -              * page_put due to reference from alloc_huge_page()
> >>> +              * put_page() (which is in the putback_active_hugepage())
> >>> +              * due to reference from alloc_huge_page()
> >>
> >> Thanks for fixing the comment.
> >>
> >>>                */
> >>>               unlock_page(page);
> >>> -             put_page(page);
> >>> +             putback_active_hugepage(page);
> >>
> >> I'm curious why you used putback_active_hugepage() here instead of simply
> >> calling set_page_huge_active() before the put_page()?
> >>
> >> When the page was allocated, it was placed on the active list (alloc_huge_page).
> >> Therefore, the hugetlb_lock locking and list movement should not be necessary.
> >
> > I agree with you. Because set_page_huge_active is not exported (static
> > function). Only exporting set_page_huge_active seems strange (leaving
> > clear_page_huge_active not export). This is just my opinion. What's
> > yours, Mike?
>
> I'm thinking that we should export (make external) set_page_huge_active.
> We can leave clear_page_huge_active as static and just add something to
> the commit log noting that there are no external users.
>
> My primary reason for doing this is to eliminate the extra and unnecessary
> per-page lock/unlock cycle.  I believe there are some applications that
> use fallocate to pre-allocate very large hugetlbfs files.  They may notice
> the extra overhead.

Agree. Will do in the next version. Thanks.

> --
> Mike Kravetz
diff mbox series

Patch

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index b5c109703daa..2aceb085d202 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -737,10 +737,11 @@  static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 
 		/*
 		 * unlock_page because locked by add_to_page_cache()
-		 * page_put due to reference from alloc_huge_page()
+		 * put_page() (which is in the putback_active_hugepage())
+		 * due to reference from alloc_huge_page()
 		 */
 		unlock_page(page);
-		put_page(page);
+		putback_active_hugepage(page);
 	}
 
 	if (!(mode & FALLOC_FL_KEEP_SIZE) && offset + len > inode->i_size)