Message ID | 1572991351-86061-1-git-send-email-yang.shi@linux.alibaba.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm: shmem: use proper gfp flags for shmem_writepage() | expand |
On Wed 06-11-19 06:02:31, Yang Shi wrote: > The shmem_writepage() uses GFP_ATOMIC to allocate swap cache. > GFP_ATOMIC used to mean __GFP_HIGH, but now it means __GFP_HIGH | > __GFP_ATOMIC | __GFP_KSWAPD_RECLAIM. However, shmem_writepage() should > write out to swap only in response to memory pressure, so > __GFP_KSWAPD_RECLAIM looks useless since the caller may be kswapd itself > or in direct reclaim already. What kind of problem are you trying to fix here? > In addition, XArray node allocations from PF_MEMALLOC contexts could > completely exhaust the page allocator, __GFP_NOMEMALLOC stops emergency > reserves from being allocated. I am not really familiar with XArray much, could you be more specific please? > Here just copy the gfp flags used by add_to_swap(). > > Cc: Hugh Dickins <hughd@google.com> > Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> > --- > mm/shmem.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/mm/shmem.c b/mm/shmem.c > index 220be9f..9691dec 100644 > --- a/mm/shmem.c > +++ b/mm/shmem.c > @@ -1369,7 +1369,8 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc) > if (list_empty(&info->swaplist)) > list_add(&info->swaplist, &shmem_swaplist); > > - if (add_to_swap_cache(page, swap, GFP_ATOMIC) == 0) { > + if (add_to_swap_cache(page, swap, > + __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN) == 0) { > spin_lock_irq(&info->lock); > shmem_recalc_inode(inode); > info->swapped++; > -- > 1.8.3.1
On 11/6/19 7:18 AM, Michal Hocko wrote: > On Wed 06-11-19 06:02:31, Yang Shi wrote: >> The shmem_writepage() uses GFP_ATOMIC to allocate swap cache. >> GFP_ATOMIC used to mean __GFP_HIGH, but now it means __GFP_HIGH | >> __GFP_ATOMIC | __GFP_KSWAPD_RECLAIM. However, shmem_writepage() should >> write out to swap only in response to memory pressure, so >> __GFP_KSWAPD_RECLAIM looks useless since the caller may be kswapd itself >> or in direct reclaim already. > What kind of problem are you trying to fix here? I didn't run into any visible problem. I just happened to find this inconsistency when I was looking into the other problem. The add_to_swap() does: int add_to_swap(struct page *page) { ... err = add_to_swap_cache(page, entry, __GFP_HIGH|__GFP_NOMEMALLOC|__GFP_NOWARN); ... } Actually, shmem_writepage() does almost the same thing and both of them are called in reclaim context, so I didn't see why they should use different gfp flag. And, GFP_ATOMIC is also different from the old definition as I mentioned in the commit log. > >> In addition, XArray node allocations from PF_MEMALLOC contexts could >> completely exhaust the page allocator, __GFP_NOMEMALLOC stops emergency >> reserves from being allocated. > I am not really familiar with XArray much, could you be more specific > please? It comes from the comments of add_to_swap(), says: /* * XArray node allocations from PF_MEMALLOC contexts could * completely exhaust the page allocator. __GFP_NOMEMALLOC * stops emergency reserves from being allocated. And, it looks the original comment came from pre-git time, TBH I'm not quite sure about the specific problem which incurred this. I suspect it may be because PF_MEMALLOC context allows ALLOC_NO_WATERMARK. > >> Here just copy the gfp flags used by add_to_swap(). >> >> Cc: Hugh Dickins <hughd@google.com> >> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> >> --- >> mm/shmem.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/mm/shmem.c b/mm/shmem.c >> index 220be9f..9691dec 100644 >> --- a/mm/shmem.c >> +++ b/mm/shmem.c >> @@ -1369,7 +1369,8 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc) >> if (list_empty(&info->swaplist)) >> list_add(&info->swaplist, &shmem_swaplist); >> >> - if (add_to_swap_cache(page, swap, GFP_ATOMIC) == 0) { >> + if (add_to_swap_cache(page, swap, >> + __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN) == 0) { >> spin_lock_irq(&info->lock); >> shmem_recalc_inode(inode); >> info->swapped++; >> -- >> 1.8.3.1
On Wed, 6 Nov 2019, Yang Shi wrote: > On 11/6/19 7:18 AM, Michal Hocko wrote: > > On Wed 06-11-19 06:02:31, Yang Shi wrote: > > > The shmem_writepage() uses GFP_ATOMIC to allocate swap cache. > > > GFP_ATOMIC used to mean __GFP_HIGH, but now it means __GFP_HIGH | > > > __GFP_ATOMIC | __GFP_KSWAPD_RECLAIM. However, shmem_writepage() should > > > write out to swap only in response to memory pressure, so > > > __GFP_KSWAPD_RECLAIM looks useless since the caller may be kswapd itself > > > or in direct reclaim already. > > What kind of problem are you trying to fix here? > > I didn't run into any visible problem. I just happened to find this > inconsistency when I was looking into the other problem. Yes, I don't think it fixes any actual problem: just a cleanup to make the two calls look the same when they don't need to be different (whereas the call from __read_swap_cache_async() rightly uses a lower priority gfp). If it does fix a problem, then you need to worry also about the * TODO: this could cause a theoretical memory reclaim * deadlock in the swap out path. comment still against the call in add_to_swap(): but I think that is equally theoretical, demanding no attention since 2.6.12. > > The add_to_swap() does: > > int add_to_swap(struct page *page) > { > ... > err = add_to_swap_cache(page, entry, > __GFP_HIGH|__GFP_NOMEMALLOC|__GFP_NOWARN); > ... > } > > Actually, shmem_writepage() does almost the same thing and both of them are > called in reclaim context, so I didn't see why they should use different gfp > flag. And, GFP_ATOMIC is also different from the old definition as I > mentioned in the commit log. > > > > > > In addition, XArray node allocations from PF_MEMALLOC contexts could > > > completely exhaust the page allocator, __GFP_NOMEMALLOC stops emergency > > > reserves from being allocated. > > I am not really familiar with XArray much, could you be more specific > > please? > > It comes from the comments of add_to_swap(), says: > > /* > * XArray node allocations from PF_MEMALLOC contexts could > * completely exhaust the page allocator. __GFP_NOMEMALLOC > * stops emergency reserves from being allocated. > > And, it looks the original comment came from pre-git time, TBH I'm not quite > sure about the specific problem which incurred this. I suspect it may be > because PF_MEMALLOC context allows ALLOC_NO_WATERMARK. > > > > > > Here just copy the gfp flags used by add_to_swap(). > > > > > > Cc: Hugh Dickins <hughd@google.com> > > > Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> > > > --- > > > mm/shmem.c | 3 ++- > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > diff --git a/mm/shmem.c b/mm/shmem.c > > > index 220be9f..9691dec 100644 > > > --- a/mm/shmem.c > > > +++ b/mm/shmem.c > > > @@ -1369,7 +1369,8 @@ static int shmem_writepage(struct page *page, > > > struct writeback_control *wbc) > > > if (list_empty(&info->swaplist)) > > > list_add(&info->swaplist, &shmem_swaplist); > > > - if (add_to_swap_cache(page, swap, GFP_ATOMIC) == 0) { > > > + if (add_to_swap_cache(page, swap, > > > + __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN) == 0) { > > > spin_lock_irq(&info->lock); > > > shmem_recalc_inode(inode); > > > info->swapped++; > > > -- > > > 1.8.3.1
On 11/6/19 10:59 AM, Hugh Dickins wrote: > On Wed, 6 Nov 2019, Yang Shi wrote: >> On 11/6/19 7:18 AM, Michal Hocko wrote: >>> On Wed 06-11-19 06:02:31, Yang Shi wrote: >>>> The shmem_writepage() uses GFP_ATOMIC to allocate swap cache. >>>> GFP_ATOMIC used to mean __GFP_HIGH, but now it means __GFP_HIGH | >>>> __GFP_ATOMIC | __GFP_KSWAPD_RECLAIM. However, shmem_writepage() should >>>> write out to swap only in response to memory pressure, so >>>> __GFP_KSWAPD_RECLAIM looks useless since the caller may be kswapd itself >>>> or in direct reclaim already. >>> What kind of problem are you trying to fix here? >> I didn't run into any visible problem. I just happened to find this >> inconsistency when I was looking into the other problem. > Yes, I don't think it fixes any actual problem: just a cleanup to > make the two calls look the same when they don't need to be different > (whereas the call from __read_swap_cache_async() rightly uses a > lower priority gfp). I'm supposed it is because __read_swap_cache_async()is typically called from page fault context which is less crucial than reclaim. Shall I consider this as an ack but with commit log rephrased to reflect the cleanup? > > If it does fix a problem, then you need to worry also about the > * TODO: this could cause a theoretical memory reclaim > * deadlock in the swap out path. > comment still against the call in add_to_swap(): but I think that > is equally theoretical, demanding no attention since 2.6.12. > >> The add_to_swap() does: >> >> int add_to_swap(struct page *page) >> { >> ... >> err = add_to_swap_cache(page, entry, >> __GFP_HIGH|__GFP_NOMEMALLOC|__GFP_NOWARN); >> ... >> } >> >> Actually, shmem_writepage() does almost the same thing and both of them are >> called in reclaim context, so I didn't see why they should use different gfp >> flag. And, GFP_ATOMIC is also different from the old definition as I >> mentioned in the commit log. >> >>>> In addition, XArray node allocations from PF_MEMALLOC contexts could >>>> completely exhaust the page allocator, __GFP_NOMEMALLOC stops emergency >>>> reserves from being allocated. >>> I am not really familiar with XArray much, could you be more specific >>> please? >> It comes from the comments of add_to_swap(), says: >> >> /* >> * XArray node allocations from PF_MEMALLOC contexts could >> * completely exhaust the page allocator. __GFP_NOMEMALLOC >> * stops emergency reserves from being allocated. >> >> And, it looks the original comment came from pre-git time, TBH I'm not quite >> sure about the specific problem which incurred this. I suspect it may be >> because PF_MEMALLOC context allows ALLOC_NO_WATERMARK. >> >>>> Here just copy the gfp flags used by add_to_swap(). >>>> >>>> Cc: Hugh Dickins <hughd@google.com> >>>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> >>>> --- >>>> mm/shmem.c | 3 ++- >>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/mm/shmem.c b/mm/shmem.c >>>> index 220be9f..9691dec 100644 >>>> --- a/mm/shmem.c >>>> +++ b/mm/shmem.c >>>> @@ -1369,7 +1369,8 @@ static int shmem_writepage(struct page *page, >>>> struct writeback_control *wbc) >>>> if (list_empty(&info->swaplist)) >>>> list_add(&info->swaplist, &shmem_swaplist); >>>> - if (add_to_swap_cache(page, swap, GFP_ATOMIC) == 0) { >>>> + if (add_to_swap_cache(page, swap, >>>> + __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN) == 0) { >>>> spin_lock_irq(&info->lock); >>>> shmem_recalc_inode(inode); >>>> info->swapped++; >>>> -- >>>> 1.8.3.1
On Thu, 7 Nov 2019, Yang Shi wrote: > On 11/6/19 10:59 AM, Hugh Dickins wrote: > > Yes, I don't think it fixes any actual problem: just a cleanup to > > make the two calls look the same when they don't need to be different > > (whereas the call from __read_swap_cache_async() rightly uses a > > lower priority gfp). > > I'm supposed it is because __read_swap_cache_async()is typically called from > page fault context which is less crucial than reclaim. Exactly. > > Shall I consider this as an ack but with commit log rephrased to reflect the > cleanup? Okay, Acked-by: Hugh Dickins <hughd@google.com>
diff --git a/mm/shmem.c b/mm/shmem.c index 220be9f..9691dec 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1369,7 +1369,8 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc) if (list_empty(&info->swaplist)) list_add(&info->swaplist, &shmem_swaplist); - if (add_to_swap_cache(page, swap, GFP_ATOMIC) == 0) { + if (add_to_swap_cache(page, swap, + __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN) == 0) { spin_lock_irq(&info->lock); shmem_recalc_inode(inode); info->swapped++;
The shmem_writepage() uses GFP_ATOMIC to allocate swap cache. GFP_ATOMIC used to mean __GFP_HIGH, but now it means __GFP_HIGH | __GFP_ATOMIC | __GFP_KSWAPD_RECLAIM. However, shmem_writepage() should write out to swap only in response to memory pressure, so __GFP_KSWAPD_RECLAIM looks useless since the caller may be kswapd itself or in direct reclaim already. In addition, XArray node allocations from PF_MEMALLOC contexts could completely exhaust the page allocator, __GFP_NOMEMALLOC stops emergency reserves from being allocated. Here just copy the gfp flags used by add_to_swap(). Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> --- mm/shmem.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)