Message ID | 20181120134323.13007-4-mhocko@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | few memory offlining enhancements | expand |
On Tue, Nov 20, 2018 at 02:43:23PM +0100, Michal Hocko wrote: > From: Michal Hocko <mhocko@suse.com> > > filemap_map_pages takes a speculative reference to each page in the > range before it tries to lock that page. While this is correct it > also can influence page migration which will bail out when seeing > an elevated reference count. The faultaround code would bail on > seeing a locked page so we can pro-actively check the PageLocked > bit before page_cache_get_speculative and prevent from pointless > reference count churn. Looks fine to me. But please drop a line of comment in the code. As is it might be confusing for a reader.
On Tue 20-11-18 17:07:15, Kirill A. Shutemov wrote: > On Tue, Nov 20, 2018 at 02:43:23PM +0100, Michal Hocko wrote: > > From: Michal Hocko <mhocko@suse.com> > > > > filemap_map_pages takes a speculative reference to each page in the > > range before it tries to lock that page. While this is correct it > > also can influence page migration which will bail out when seeing > > an elevated reference count. The faultaround code would bail on > > seeing a locked page so we can pro-actively check the PageLocked > > bit before page_cache_get_speculative and prevent from pointless > > reference count churn. > > Looks fine to me. Thanks for the review. > But please drop a line of comment in the code. As is it might be confusing > for a reader. This? diff --git a/mm/filemap.c b/mm/filemap.c index c76d6a251770..7c4e439a2e85 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2554,6 +2554,10 @@ void filemap_map_pages(struct vm_fault *vmf, head = compound_head(page); + /* + * Check the locked pages before taking a reference to not + * go in the way of migration. + */ if (PageLocked(head)) goto next; if (!page_cache_get_speculative(head))
On Tue, Nov 20, 2018 at 03:12:07PM +0100, Michal Hocko wrote: > On Tue 20-11-18 17:07:15, Kirill A. Shutemov wrote: > > On Tue, Nov 20, 2018 at 02:43:23PM +0100, Michal Hocko wrote: > > > From: Michal Hocko <mhocko@suse.com> > > > > > > filemap_map_pages takes a speculative reference to each page in the > > > range before it tries to lock that page. While this is correct it > > > also can influence page migration which will bail out when seeing > > > an elevated reference count. The faultaround code would bail on > > > seeing a locked page so we can pro-actively check the PageLocked > > > bit before page_cache_get_speculative and prevent from pointless > > > reference count churn. > > > > Looks fine to me. > > Thanks for the review. > > > But please drop a line of comment in the code. As is it might be confusing > > for a reader. > > This? Yep. Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > diff --git a/mm/filemap.c b/mm/filemap.c > index c76d6a251770..7c4e439a2e85 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -2554,6 +2554,10 @@ void filemap_map_pages(struct vm_fault *vmf, > > head = compound_head(page); > > + /* > + * Check the locked pages before taking a reference to not > + * go in the way of migration. > + */ > if (PageLocked(head)) > goto next; > if (!page_cache_get_speculative(head)) > -- > Michal Hocko > SUSE Labs
On Tue 20-11-18 17:17:00, Kirill A. Shutemov wrote: > On Tue, Nov 20, 2018 at 03:12:07PM +0100, Michal Hocko wrote: > > On Tue 20-11-18 17:07:15, Kirill A. Shutemov wrote: > > > On Tue, Nov 20, 2018 at 02:43:23PM +0100, Michal Hocko wrote: > > > > From: Michal Hocko <mhocko@suse.com> > > > > > > > > filemap_map_pages takes a speculative reference to each page in the > > > > range before it tries to lock that page. While this is correct it > > > > also can influence page migration which will bail out when seeing > > > > an elevated reference count. The faultaround code would bail on > > > > seeing a locked page so we can pro-actively check the PageLocked > > > > bit before page_cache_get_speculative and prevent from pointless > > > > reference count churn. > > > > > > Looks fine to me. > > > > Thanks for the review. > > > > > But please drop a line of comment in the code. As is it might be confusing > > > for a reader. > > > > This? > > Yep. > > Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cool, thanks! I will wait some more time for review feedback for other patches and then repost with this folded in.
On 20.11.18 14:43, Michal Hocko wrote: > From: Michal Hocko <mhocko@suse.com> > > filemap_map_pages takes a speculative reference to each page in the > range before it tries to lock that page. While this is correct it > also can influence page migration which will bail out when seeing > an elevated reference count. The faultaround code would bail on > seeing a locked page so we can pro-actively check the PageLocked > bit before page_cache_get_speculative and prevent from pointless > reference count churn. > > Cc: "Kirill A. Shutemov" <kirill@shutemov.name> > Suggested-by: Jan Kara <jack@suse.cz> > Signed-off-by: Michal Hocko <mhocko@suse.com> > --- > mm/filemap.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/mm/filemap.c b/mm/filemap.c > index 81adec8ee02c..c76d6a251770 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -2553,6 +2553,9 @@ void filemap_map_pages(struct vm_fault *vmf, > goto next; > > head = compound_head(page); > + > + if (PageLocked(head)) > + goto next; > if (!page_cache_get_speculative(head)) > goto next; > > Right, no sense in referencing a page if we know we will drop the reference right away. Reviewed-by: David Hildenbrand <david@redhat.com>
On Tue, 20 Nov 2018, Michal Hocko wrote: > From: Michal Hocko <mhocko@suse.com> > > filemap_map_pages takes a speculative reference to each page in the > range before it tries to lock that page. While this is correct it > also can influence page migration which will bail out when seeing > an elevated reference count. The faultaround code would bail on > seeing a locked page so we can pro-actively check the PageLocked > bit before page_cache_get_speculative and prevent from pointless > reference count churn. > > Cc: "Kirill A. Shutemov" <kirill@shutemov.name> > Suggested-by: Jan Kara <jack@suse.cz> > Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Hugh Dickins <hughd@google.com> though I think this patch is more useful to the avoid atomic ops, and unnecessary dirtying of the cacheline, than to avoid the very transient elevation of refcount, which will not affect page migration very much. > --- > mm/filemap.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/mm/filemap.c b/mm/filemap.c > index 81adec8ee02c..c76d6a251770 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -2553,6 +2553,9 @@ void filemap_map_pages(struct vm_fault *vmf, > goto next; > > head = compound_head(page); > + > + if (PageLocked(head)) > + goto next; > if (!page_cache_get_speculative(head)) > goto next; > > -- > 2.19.1
> On Nov 20, 2018, at 7:12 AM, Michal Hocko <mhocko@kernel.org> wrote: > > + /* > + * Check the locked pages before taking a reference to not > + * go in the way of migration. > + */ Could you make this a tiny bit more explanative, something like: + /* + * Check for a locked page first, as a speculative + * reference may adversely influence page migration. + */ Reviewed-by: William Kucharski <william.kucharski@oracle.com>
On Tue 20-11-18 21:51:39, William Kucharski wrote: > > > > On Nov 20, 2018, at 7:12 AM, Michal Hocko <mhocko@kernel.org> wrote: > > > > + /* > > + * Check the locked pages before taking a reference to not > > + * go in the way of migration. > > + */ > > Could you make this a tiny bit more explanative, something like: > > + /* > + * Check for a locked page first, as a speculative > + * reference may adversely influence page migration. > + */ sure > > Reviewed-by: William Kucharski <william.kucharski@oracle.com> Thanks!
On Tue 20-11-18 17:47:21, Hugh Dickins wrote: > On Tue, 20 Nov 2018, Michal Hocko wrote: > > > From: Michal Hocko <mhocko@suse.com> > > > > filemap_map_pages takes a speculative reference to each page in the > > range before it tries to lock that page. While this is correct it > > also can influence page migration which will bail out when seeing > > an elevated reference count. The faultaround code would bail on > > seeing a locked page so we can pro-actively check the PageLocked > > bit before page_cache_get_speculative and prevent from pointless > > reference count churn. > > > > Cc: "Kirill A. Shutemov" <kirill@shutemov.name> > > Suggested-by: Jan Kara <jack@suse.cz> > > Signed-off-by: Michal Hocko <mhocko@suse.com> > > Acked-by: Hugh Dickins <hughd@google.com> Thanks! > though I think this patch is more useful to the avoid atomic ops, > and unnecessary dirtying of the cacheline, than to avoid the very > transient elevation of refcount, which will not affect page migration > very much. Are you sure it would really be transient? In other words is it possible that the fault around can block migration repeatedly under refault heavy workload? I just couldn't convince myself, to be honest.
On Wed, 21 Nov 2018, Michal Hocko wrote: > On Tue 20-11-18 17:47:21, Hugh Dickins wrote: > > On Tue, 20 Nov 2018, Michal Hocko wrote: > > > > > From: Michal Hocko <mhocko@suse.com> > > > > > > filemap_map_pages takes a speculative reference to each page in the > > > range before it tries to lock that page. While this is correct it > > > also can influence page migration which will bail out when seeing > > > an elevated reference count. The faultaround code would bail on > > > seeing a locked page so we can pro-actively check the PageLocked > > > bit before page_cache_get_speculative and prevent from pointless > > > reference count churn. > > > > > > Cc: "Kirill A. Shutemov" <kirill@shutemov.name> > > > Suggested-by: Jan Kara <jack@suse.cz> > > > Signed-off-by: Michal Hocko <mhocko@suse.com> > > > > Acked-by: Hugh Dickins <hughd@google.com> > > Thanks! > > > though I think this patch is more useful to the avoid atomic ops, > > and unnecessary dirtying of the cacheline, than to avoid the very > > transient elevation of refcount, which will not affect page migration > > very much. > > Are you sure it would really be transient? In other words is it possible > that the fault around can block migration repeatedly under refault heavy > workload? I just couldn't convince myself, to be honest. I don't deny that it is possible: I expect that, using fork() (which does not copy the ptes in a shared file vma), you can construct a test case where each child faults one or another page near a page of no interest, and that page of no interest is a target of migration perpetually frustrated by filemap_map_pages()'s briefly raised refcount. But I suggest that's a third-order effect: well worth fixing because it's easily and uncontroversially dealt with, as you have; but not of great importance. The first-order effect is migration conspiring to defeat itself: that's what my put_and_wait_on_page_locked() patch, in other thread, is about. The second order effect is when a page that is really wanted is waited on - the target of a fault, for which page refcount is raised maybe long before it finally gets into the page table (whereupon it becomes visible to try_to_unmap(), and its mapcount matches refcount so that migration can fully account for the page). One class of that can be well dealt with by using put_and_wait_on_page_locked_killable() in lock_page_or_retry(), but I was keeping that as a future instalment. But I shouldn't denigrate the transient case by referring so lightly to migrate_pages()' 10 attempts: each of those failed attempts can be very expensive, unmapping and TLB flushing (including IPIs) and remapping. It may well be that 2 or 3 would be a more cost-effective number of attempts, at least when the page is mapped. Hugh
On Wed 21-11-18 18:27:11, Hugh Dickins wrote: > On Wed, 21 Nov 2018, Michal Hocko wrote: > > On Tue 20-11-18 17:47:21, Hugh Dickins wrote: > > > On Tue, 20 Nov 2018, Michal Hocko wrote: > > > > > > > From: Michal Hocko <mhocko@suse.com> > > > > > > > > filemap_map_pages takes a speculative reference to each page in the > > > > range before it tries to lock that page. While this is correct it > > > > also can influence page migration which will bail out when seeing > > > > an elevated reference count. The faultaround code would bail on > > > > seeing a locked page so we can pro-actively check the PageLocked > > > > bit before page_cache_get_speculative and prevent from pointless > > > > reference count churn. > > > > > > > > Cc: "Kirill A. Shutemov" <kirill@shutemov.name> > > > > Suggested-by: Jan Kara <jack@suse.cz> > > > > Signed-off-by: Michal Hocko <mhocko@suse.com> > > > > > > Acked-by: Hugh Dickins <hughd@google.com> > > > > Thanks! > > > > > though I think this patch is more useful to the avoid atomic ops, > > > and unnecessary dirtying of the cacheline, than to avoid the very > > > transient elevation of refcount, which will not affect page migration > > > very much. > > > > Are you sure it would really be transient? In other words is it possible > > that the fault around can block migration repeatedly under refault heavy > > workload? I just couldn't convince myself, to be honest. > > I don't deny that it is possible: I expect that, using fork() (which does > not copy the ptes in a shared file vma), you can construct a test case > where each child faults one or another page near a page of no interest, > and that page of no interest is a target of migration perpetually > frustrated by filemap_map_pages()'s briefly raised refcount. The other issue I am debugging and which very likely has the same underlying issue in the end has shown [ 883.930477] rac1 kernel: page:ffffea2084bf5cc0 count:1889 mapcount:1887 mapping:ffff8833c82c9ad8 index:0x6b [ 883.930485] rac1 kernel: ext4_da_aops [ext4] [ 883.930497] rac1 kernel: name:"libc-2.22.so" [ 883.931241] rac1 kernel: do_migrate_range done ret=23 pattern. After we have disabled the faultaround the failure has moved to a different page but libc hasn't shown up again. This might be a matter of (bad)luck and timing. But we thought that it is not too unlikely for faultaround on such a shared page to strike in. > But I suggest that's a third-order effect: well worth fixing because > it's easily and uncontroversially dealt with, as you have; but not of > great importance. > > The first-order effect is migration conspiring to defeat itself: that's > what my put_and_wait_on_page_locked() patch, in other thread, is about. yes. That is obviously a much more effective fix. > The second order effect is when a page that is really wanted is waited > on - the target of a fault, for which page refcount is raised maybe > long before it finally gets into the page table (whereupon it becomes > visible to try_to_unmap(), and its mapcount matches refcount so that > migration can fully account for the page). One class of that can be > well dealt with by using put_and_wait_on_page_locked_killable() in > lock_page_or_retry(), but I was keeping that as a future instalment. > > But I shouldn't denigrate the transient case by referring so lightly > to migrate_pages()' 10 attempts: each of those failed attempts can > be very expensive, unmapping and TLB flushing (including IPIs) and > remapping. It may well be that 2 or 3 would be a more cost-effective > number of attempts, at least when the page is mapped. If you want some update to the comment in this function or to the changelog, I am open of course. Right now I have + * Check for a locked page first, as a speculative + * reference may adversely influence page migration. as suggested by William.
On Thu, 22 Nov 2018, Michal Hocko wrote: > > If you want some update to the comment in this function or to the > changelog, I am open of course. Right now I have > + * Check for a locked page first, as a speculative > + * reference may adversely influence page migration. > as suggested by William. I ought to care, since I challenged the significance of this aspect in the first place, but find I don't care enough - I much prefer the patch to the comments on and in it, but have not devised any wording that I'd prefer to see instead - sorry. Hugh
diff --git a/mm/filemap.c b/mm/filemap.c index 81adec8ee02c..c76d6a251770 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2553,6 +2553,9 @@ void filemap_map_pages(struct vm_fault *vmf, goto next; head = compound_head(page); + + if (PageLocked(head)) + goto next; if (!page_cache_get_speculative(head)) goto next;