Message ID | 1548793753-62377-1-git-send-email-yang.shi@linux.alibaba.com |
---|---|
State | New, archived |
Series | [v3] mm: ksm: do not block on page lock when searching stable tree |
On 1/29/19 12:29 PM, Yang Shi wrote: > ksmd need search stable tree to look for the suitable KSM page, but the > KSM page might be locked for a while due to i.e. KSM page rmap walk. > Basically it is not a big deal since commit 2c653d0ee2ae > ("ksm: introduce ksm_max_page_sharing per page deduplication limit"), > since max_page_sharing limits the number of shared KSM pages. > > But it still sounds not worth waiting for the lock, the page can be skip, > then try to merge it in the next scan to avoid potential stall if its > content is still intact. > > Introduce trylock mode to get_ksm_page() to not block on page lock, like > what try_to_merge_one_page() does. And, define three possible > operations (nolock, lock and trylock) as enum type to avoid stacking up > bools and make the code more readable. > > Return -EBUSY if trylock fails, since NULL means not find suitable KSM > page, which is a valid case. > > With the default max_page_sharing setting (256), there is almost no > observed change comparing lock vs trylock. > > However, with ksm02 of LTP, the reduced ksmd full scan time can be > observed, which has set max_page_sharing to 786432. With lock version, > ksmd may tak 10s - 11s to run two full scans, with trylock version ksmd > may take 8s - 11s to run two full scans. And, the number of > pages_sharing and pages_to_scan keep same. Basically, this change has > no harm > > Cc: Hugh Dickins <hughd@google.com> > Cc: Andrea Arcangeli <aarcange@redhat.com> > Suggested-by: John Hubbard <jhubbard@nvidia.com> > Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> > Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> > --- > Hi folks, > > This patch was with "mm: vmscan: skip KSM page in direct reclaim if priority > is low" in the initial submission. Then Hugh and Andrea pointed out commit > 2c653d0ee2ae ("ksm: introduce ksm_max_page_sharing per page deduplication > limit") is good enough for limiting the number of shared KSM page to prevent > from softlock when walking ksm page rmap. This commit does solve the problem. > So, the series was dropped by Andrew from -mm tree. > > However, I thought the second patch (this one) still sounds useful. So, I did > some test and resubmit it. The first version was reviewed by Krill Tkhai, so > I keep his Reviewed-by tag since there is no change to the patch except the > commit log. > > So, would you please reconsider this patch? > > v3: Use enum to define get_ksm_page operations (nolock, lock and trylock) per > John Hubbard > v2: Updated the commit log to reflect some test result and latest discussion > > mm/ksm.c | 46 ++++++++++++++++++++++++++++++++++++---------- > 1 file changed, 36 insertions(+), 10 deletions(-) > > diff --git a/mm/ksm.c b/mm/ksm.c > index 6c48ad1..5647bc1 100644 > --- a/mm/ksm.c > +++ b/mm/ksm.c > @@ -667,6 +667,12 @@ static void remove_node_from_stable_tree(struct stable_node *stable_node) > free_stable_node(stable_node); > } > > +enum get_ksm_page_flags { > + GET_KSM_PAGE_NOLOCK, > + GET_KSM_PAGE_LOCK, > + GET_KSM_PAGE_TRYLOCK > +}; > + > /* > * get_ksm_page: checks if the page indicated by the stable node > * is still its ksm page, despite having held no reference to it. > @@ -686,7 +692,8 @@ static void remove_node_from_stable_tree(struct stable_node *stable_node) > * a page to put something that might look like our key in page->mapping. > * is on its way to being freed; but it is an anomaly to bear in mind. 
> */ > -static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it) > +static struct page *get_ksm_page(struct stable_node *stable_node, > + enum get_ksm_page_flags flags) > { > struct page *page; > void *expected_mapping; > @@ -728,8 +735,15 @@ static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it) > goto stale; > } > > - if (lock_it) { > + if (flags == GET_KSM_PAGE_TRYLOCK) { > + if (!trylock_page(page)) { > + put_page(page); > + return ERR_PTR(-EBUSY); > + } > + } else if (flags == GET_KSM_PAGE_LOCK) > lock_page(page); > + > + if (flags != GET_KSM_PAGE_NOLOCK) { > if (READ_ONCE(page->mapping) != expected_mapping) { > unlock_page(page); > put_page(page); > @@ -763,7 +777,7 @@ static void remove_rmap_item_from_tree(struct rmap_item *rmap_item) > struct page *page; > > stable_node = rmap_item->head; > - page = get_ksm_page(stable_node, true); > + page = get_ksm_page(stable_node, GET_KSM_PAGE_LOCK); > if (!page) > goto out; > > @@ -863,7 +877,7 @@ static int remove_stable_node(struct stable_node *stable_node) > struct page *page; > int err; > > - page = get_ksm_page(stable_node, true); > + page = get_ksm_page(stable_node, GET_KSM_PAGE_LOCK); > if (!page) { > /* > * get_ksm_page did remove_node_from_stable_tree itself. > @@ -1385,7 +1399,7 @@ static struct page *stable_node_dup(struct stable_node **_stable_node_dup, > * stable_node parameter itself will be freed from > * under us if it returns NULL. > */ > - _tree_page = get_ksm_page(dup, false); > + _tree_page = get_ksm_page(dup, GET_KSM_PAGE_NOLOCK); > if (!_tree_page) > continue; > nr += 1; > @@ -1508,7 +1522,7 @@ static struct page *__stable_node_chain(struct stable_node **_stable_node_dup, > if (!is_stable_node_chain(stable_node)) { > if (is_page_sharing_candidate(stable_node)) { > *_stable_node_dup = stable_node; > - return get_ksm_page(stable_node, false); > + return get_ksm_page(stable_node, GET_KSM_PAGE_NOLOCK); > } > /* > * _stable_node_dup set to NULL means the stable_node > @@ -1613,7 +1627,8 @@ static struct page *stable_tree_search(struct page *page) > * wrprotected at all times. Any will work > * fine to continue the walk. > */ > - tree_page = get_ksm_page(stable_node_any, false); > + tree_page = get_ksm_page(stable_node_any, > + GET_KSM_PAGE_NOLOCK); > } > VM_BUG_ON(!stable_node_dup ^ !!stable_node_any); > if (!tree_page) { > @@ -1673,7 +1688,12 @@ static struct page *stable_tree_search(struct page *page) > * It would be more elegant to return stable_node > * than kpage, but that involves more changes. > */ > - tree_page = get_ksm_page(stable_node_dup, true); > + tree_page = get_ksm_page(stable_node_dup, > + GET_KSM_PAGE_TRYLOCK); > + > + if (PTR_ERR(tree_page) == -EBUSY) > + return ERR_PTR(-EBUSY); or just: if (PTR_ERR(tree_page) == -EBUSY) return tree_page; right? > + > if (unlikely(!tree_page)) > /* > * The tree may have been rebalanced, > @@ -1842,7 +1862,8 @@ static struct stable_node *stable_tree_insert(struct page *kpage) > * wrprotected at all times. Any will work > * fine to continue the walk. 
>  		 */
> -		tree_page = get_ksm_page(stable_node_any, false);
> +		tree_page = get_ksm_page(stable_node_any,
> +				GET_KSM_PAGE_NOLOCK);
>  	}
>  	VM_BUG_ON(!stable_node_dup ^ !!stable_node_any);
>  	if (!tree_page) {
> @@ -2060,6 +2081,10 @@ static void cmp_and_merge_page(struct page *page, struct rmap_item *rmap_item)
>
>  	/* We first start with searching the page inside the stable tree */
>  	kpage = stable_tree_search(page);
> +
> +	if (PTR_ERR(kpage) == -EBUSY)
> +		return;
> +
>  	if (kpage == page && rmap_item->head == stable_node) {
>  		put_page(kpage);
>  		return;
> @@ -2242,7 +2267,8 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page)
>
>  		list_for_each_entry_safe(stable_node, next,
>  				&migrate_nodes, list) {
> -			page = get_ksm_page(stable_node, false);
> +			page = get_ksm_page(stable_node,
> +					GET_KSM_PAGE_NOLOCK);
>  			if (page)
>  				put_page(page);
>  			cond_resched();
>

Hi Yang,

The patch looks correct as far as doing what it claims to do. I'll leave it
to others to decide if a trylock-based approach is really what you want,
for KSM scans. It seems reasonable from my very limited knowledge of KSM:
there shouldn't be any cases where you really *need* to wait for a page
lock, because the whole system is really sort of an optimization anyway.

thanks,
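The trylock-and-skip behaviour being weighed here can be illustrated outside the kernel with an ordinary mutex. Below is a hypothetical user-space analogy (plain pthreads; the item/scan names are made up and it is not KSM code): like GET_KSM_PAGE_TRYLOCK, the scanner never sleeps on a held lock, it simply revisits the item on its next pass.

/*
 * Hypothetical user-space analogy of the ksmd change, not kernel code:
 * scan with trylock instead of a blocking lock, skip anything whose
 * lock is currently held, and pick it up again on the next full scan.
 * Build with: cc -pthread trylock_scan.c
 */
#include <pthread.h>
#include <stdio.h>

#define NITEMS 4

struct item {
	pthread_mutex_t lock;
	int merged;
};

static void scan(struct item *items, int n)
{
	for (int i = 0; i < n; i++) {
		/* Like GET_KSM_PAGE_TRYLOCK: never wait for a busy lock. */
		if (pthread_mutex_trylock(&items[i].lock) != 0) {
			printf("item %d busy, skipped this scan\n", i);
			continue;
		}
		items[i].merged = 1;
		pthread_mutex_unlock(&items[i].lock);
	}
}

int main(void)
{
	struct item items[NITEMS];

	for (int i = 0; i < NITEMS; i++) {
		pthread_mutex_init(&items[i].lock, NULL);
		items[i].merged = 0;
	}

	/* Simulate one item being locked elsewhere (e.g. an rmap walk). */
	pthread_mutex_lock(&items[1].lock);
	scan(items, NITEMS);			/* item 1 is skipped, not waited on */
	pthread_mutex_unlock(&items[1].lock);

	scan(items, NITEMS);			/* item 1 gets merged on the next scan */
	return 0;
}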
On 29.01.2019 23:29, Yang Shi wrote: > ksmd need search stable tree to look for the suitable KSM page, but the > KSM page might be locked for a while due to i.e. KSM page rmap walk. > Basically it is not a big deal since commit 2c653d0ee2ae > ("ksm: introduce ksm_max_page_sharing per page deduplication limit"), > since max_page_sharing limits the number of shared KSM pages. > > But it still sounds not worth waiting for the lock, the page can be skip, > then try to merge it in the next scan to avoid potential stall if its > content is still intact. > > Introduce trylock mode to get_ksm_page() to not block on page lock, like > what try_to_merge_one_page() does. And, define three possible > operations (nolock, lock and trylock) as enum type to avoid stacking up > bools and make the code more readable. > > Return -EBUSY if trylock fails, since NULL means not find suitable KSM > page, which is a valid case. > > With the default max_page_sharing setting (256), there is almost no > observed change comparing lock vs trylock. > > However, with ksm02 of LTP, the reduced ksmd full scan time can be > observed, which has set max_page_sharing to 786432. With lock version, > ksmd may tak 10s - 11s to run two full scans, with trylock version ksmd > may take 8s - 11s to run two full scans. And, the number of > pages_sharing and pages_to_scan keep same. Basically, this change has > no harm. > > Cc: Hugh Dickins <hughd@google.com> > Cc: Andrea Arcangeli <aarcange@redhat.com> > Suggested-by: John Hubbard <jhubbard@nvidia.com> > Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Also looks good for me. > Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> > --- > Hi folks, > > This patch was with "mm: vmscan: skip KSM page in direct reclaim if priority > is low" in the initial submission. Then Hugh and Andrea pointed out commit > 2c653d0ee2ae ("ksm: introduce ksm_max_page_sharing per page deduplication > limit") is good enough for limiting the number of shared KSM page to prevent > from softlock when walking ksm page rmap. This commit does solve the problem. > So, the series was dropped by Andrew from -mm tree. > > However, I thought the second patch (this one) still sounds useful. So, I did > some test and resubmit it. The first version was reviewed by Krill Tkhai, so > I keep his Reviewed-by tag since there is no change to the patch except the > commit log. > > So, would you please reconsider this patch? > > v3: Use enum to define get_ksm_page operations (nolock, lock and trylock) per > John Hubbard > v2: Updated the commit log to reflect some test result and latest discussion > > mm/ksm.c | 46 ++++++++++++++++++++++++++++++++++++---------- > 1 file changed, 36 insertions(+), 10 deletions(-) > > diff --git a/mm/ksm.c b/mm/ksm.c > index 6c48ad1..5647bc1 100644 > --- a/mm/ksm.c > +++ b/mm/ksm.c > @@ -667,6 +667,12 @@ static void remove_node_from_stable_tree(struct stable_node *stable_node) > free_stable_node(stable_node); > } > > +enum get_ksm_page_flags { > + GET_KSM_PAGE_NOLOCK, > + GET_KSM_PAGE_LOCK, > + GET_KSM_PAGE_TRYLOCK > +}; > + > /* > * get_ksm_page: checks if the page indicated by the stable node > * is still its ksm page, despite having held no reference to it. > @@ -686,7 +692,8 @@ static void remove_node_from_stable_tree(struct stable_node *stable_node) > * a page to put something that might look like our key in page->mapping. > * is on its way to being freed; but it is an anomaly to bear in mind. 
> */ > -static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it) > +static struct page *get_ksm_page(struct stable_node *stable_node, > + enum get_ksm_page_flags flags) > { > struct page *page; > void *expected_mapping; > @@ -728,8 +735,15 @@ static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it) > goto stale; > } > > - if (lock_it) { > + if (flags == GET_KSM_PAGE_TRYLOCK) { > + if (!trylock_page(page)) { > + put_page(page); > + return ERR_PTR(-EBUSY); > + } > + } else if (flags == GET_KSM_PAGE_LOCK) > lock_page(page); > + > + if (flags != GET_KSM_PAGE_NOLOCK) { > if (READ_ONCE(page->mapping) != expected_mapping) { > unlock_page(page); > put_page(page); > @@ -763,7 +777,7 @@ static void remove_rmap_item_from_tree(struct rmap_item *rmap_item) > struct page *page; > > stable_node = rmap_item->head; > - page = get_ksm_page(stable_node, true); > + page = get_ksm_page(stable_node, GET_KSM_PAGE_LOCK); > if (!page) > goto out; > > @@ -863,7 +877,7 @@ static int remove_stable_node(struct stable_node *stable_node) > struct page *page; > int err; > > - page = get_ksm_page(stable_node, true); > + page = get_ksm_page(stable_node, GET_KSM_PAGE_LOCK); > if (!page) { > /* > * get_ksm_page did remove_node_from_stable_tree itself. > @@ -1385,7 +1399,7 @@ static struct page *stable_node_dup(struct stable_node **_stable_node_dup, > * stable_node parameter itself will be freed from > * under us if it returns NULL. > */ > - _tree_page = get_ksm_page(dup, false); > + _tree_page = get_ksm_page(dup, GET_KSM_PAGE_NOLOCK); > if (!_tree_page) > continue; > nr += 1; > @@ -1508,7 +1522,7 @@ static struct page *__stable_node_chain(struct stable_node **_stable_node_dup, > if (!is_stable_node_chain(stable_node)) { > if (is_page_sharing_candidate(stable_node)) { > *_stable_node_dup = stable_node; > - return get_ksm_page(stable_node, false); > + return get_ksm_page(stable_node, GET_KSM_PAGE_NOLOCK); > } > /* > * _stable_node_dup set to NULL means the stable_node > @@ -1613,7 +1627,8 @@ static struct page *stable_tree_search(struct page *page) > * wrprotected at all times. Any will work > * fine to continue the walk. > */ > - tree_page = get_ksm_page(stable_node_any, false); > + tree_page = get_ksm_page(stable_node_any, > + GET_KSM_PAGE_NOLOCK); > } > VM_BUG_ON(!stable_node_dup ^ !!stable_node_any); > if (!tree_page) { > @@ -1673,7 +1688,12 @@ static struct page *stable_tree_search(struct page *page) > * It would be more elegant to return stable_node > * than kpage, but that involves more changes. > */ > - tree_page = get_ksm_page(stable_node_dup, true); > + tree_page = get_ksm_page(stable_node_dup, > + GET_KSM_PAGE_TRYLOCK); > + > + if (PTR_ERR(tree_page) == -EBUSY) > + return ERR_PTR(-EBUSY); > + > if (unlikely(!tree_page)) > /* > * The tree may have been rebalanced, > @@ -1842,7 +1862,8 @@ static struct stable_node *stable_tree_insert(struct page *kpage) > * wrprotected at all times. Any will work > * fine to continue the walk. 
> */ > - tree_page = get_ksm_page(stable_node_any, false); > + tree_page = get_ksm_page(stable_node_any, > + GET_KSM_PAGE_NOLOCK); > } > VM_BUG_ON(!stable_node_dup ^ !!stable_node_any); > if (!tree_page) { > @@ -2060,6 +2081,10 @@ static void cmp_and_merge_page(struct page *page, struct rmap_item *rmap_item) > > /* We first start with searching the page inside the stable tree */ > kpage = stable_tree_search(page); > + > + if (PTR_ERR(kpage) == -EBUSY) > + return; > + > if (kpage == page && rmap_item->head == stable_node) { > put_page(kpage); > return; > @@ -2242,7 +2267,8 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page) > > list_for_each_entry_safe(stable_node, next, > &migrate_nodes, list) { > - page = get_ksm_page(stable_node, false); > + page = get_ksm_page(stable_node, > + GET_KSM_PAGE_NOLOCK); > if (page) > put_page(page); > cond_resched(); >
On 1/29/19 11:14 PM, John Hubbard wrote: > On 1/29/19 12:29 PM, Yang Shi wrote: >> ksmd need search stable tree to look for the suitable KSM page, but the >> KSM page might be locked for a while due to i.e. KSM page rmap walk. >> Basically it is not a big deal since commit 2c653d0ee2ae >> ("ksm: introduce ksm_max_page_sharing per page deduplication limit"), >> since max_page_sharing limits the number of shared KSM pages. >> >> But it still sounds not worth waiting for the lock, the page can be >> skip, >> then try to merge it in the next scan to avoid potential stall if its >> content is still intact. >> >> Introduce trylock mode to get_ksm_page() to not block on page lock, like >> what try_to_merge_one_page() does. And, define three possible >> operations (nolock, lock and trylock) as enum type to avoid stacking up >> bools and make the code more readable. >> >> Return -EBUSY if trylock fails, since NULL means not find suitable KSM >> page, which is a valid case. >> >> With the default max_page_sharing setting (256), there is almost no >> observed change comparing lock vs trylock. >> >> However, with ksm02 of LTP, the reduced ksmd full scan time can be >> observed, which has set max_page_sharing to 786432. With lock version, >> ksmd may tak 10s - 11s to run two full scans, with trylock version ksmd >> may take 8s - 11s to run two full scans. And, the number of >> pages_sharing and pages_to_scan keep same. Basically, this change has >> no harm > >> Cc: Hugh Dickins <hughd@google.com> >> Cc: Andrea Arcangeli <aarcange@redhat.com> >> Suggested-by: John Hubbard <jhubbard@nvidia.com> >> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> >> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> >> --- >> Hi folks, >> >> This patch was with "mm: vmscan: skip KSM page in direct reclaim if >> priority >> is low" in the initial submission. Then Hugh and Andrea pointed out >> commit >> 2c653d0ee2ae ("ksm: introduce ksm_max_page_sharing per page >> deduplication >> limit") is good enough for limiting the number of shared KSM page to >> prevent >> from softlock when walking ksm page rmap. This commit does solve the >> problem. >> So, the series was dropped by Andrew from -mm tree. >> >> However, I thought the second patch (this one) still sounds useful. >> So, I did >> some test and resubmit it. The first version was reviewed by Krill >> Tkhai, so >> I keep his Reviewed-by tag since there is no change to the patch >> except the >> commit log. >> >> So, would you please reconsider this patch? >> >> v3: Use enum to define get_ksm_page operations (nolock, lock and >> trylock) per >> John Hubbard >> v2: Updated the commit log to reflect some test result and latest >> discussion >> >> mm/ksm.c | 46 ++++++++++++++++++++++++++++++++++++---------- >> 1 file changed, 36 insertions(+), 10 deletions(-) >> >> diff --git a/mm/ksm.c b/mm/ksm.c >> index 6c48ad1..5647bc1 100644 >> --- a/mm/ksm.c >> +++ b/mm/ksm.c >> @@ -667,6 +667,12 @@ static void remove_node_from_stable_tree(struct >> stable_node *stable_node) >> free_stable_node(stable_node); >> } >> +enum get_ksm_page_flags { >> + GET_KSM_PAGE_NOLOCK, >> + GET_KSM_PAGE_LOCK, >> + GET_KSM_PAGE_TRYLOCK >> +}; >> + >> /* >> * get_ksm_page: checks if the page indicated by the stable node >> * is still its ksm page, despite having held no reference to it. >> @@ -686,7 +692,8 @@ static void remove_node_from_stable_tree(struct >> stable_node *stable_node) >> * a page to put something that might look like our key in >> page->mapping. 
>> * is on its way to being freed; but it is an anomaly to bear in mind. >> */ >> -static struct page *get_ksm_page(struct stable_node *stable_node, >> bool lock_it) >> +static struct page *get_ksm_page(struct stable_node *stable_node, >> + enum get_ksm_page_flags flags) >> { >> struct page *page; >> void *expected_mapping; >> @@ -728,8 +735,15 @@ static struct page *get_ksm_page(struct >> stable_node *stable_node, bool lock_it) >> goto stale; >> } >> - if (lock_it) { >> + if (flags == GET_KSM_PAGE_TRYLOCK) { >> + if (!trylock_page(page)) { >> + put_page(page); >> + return ERR_PTR(-EBUSY); >> + } >> + } else if (flags == GET_KSM_PAGE_LOCK) >> lock_page(page); >> + >> + if (flags != GET_KSM_PAGE_NOLOCK) { >> if (READ_ONCE(page->mapping) != expected_mapping) { >> unlock_page(page); >> put_page(page); >> @@ -763,7 +777,7 @@ static void remove_rmap_item_from_tree(struct >> rmap_item *rmap_item) >> struct page *page; >> stable_node = rmap_item->head; >> - page = get_ksm_page(stable_node, true); >> + page = get_ksm_page(stable_node, GET_KSM_PAGE_LOCK); >> if (!page) >> goto out; >> @@ -863,7 +877,7 @@ static int remove_stable_node(struct >> stable_node *stable_node) >> struct page *page; >> int err; >> - page = get_ksm_page(stable_node, true); >> + page = get_ksm_page(stable_node, GET_KSM_PAGE_LOCK); >> if (!page) { >> /* >> * get_ksm_page did remove_node_from_stable_tree itself. >> @@ -1385,7 +1399,7 @@ static struct page *stable_node_dup(struct >> stable_node **_stable_node_dup, >> * stable_node parameter itself will be freed from >> * under us if it returns NULL. >> */ >> - _tree_page = get_ksm_page(dup, false); >> + _tree_page = get_ksm_page(dup, GET_KSM_PAGE_NOLOCK); >> if (!_tree_page) >> continue; >> nr += 1; >> @@ -1508,7 +1522,7 @@ static struct page *__stable_node_chain(struct >> stable_node **_stable_node_dup, >> if (!is_stable_node_chain(stable_node)) { >> if (is_page_sharing_candidate(stable_node)) { >> *_stable_node_dup = stable_node; >> - return get_ksm_page(stable_node, false); >> + return get_ksm_page(stable_node, GET_KSM_PAGE_NOLOCK); >> } >> /* >> * _stable_node_dup set to NULL means the stable_node >> @@ -1613,7 +1627,8 @@ static struct page *stable_tree_search(struct >> page *page) >> * wrprotected at all times. Any will work >> * fine to continue the walk. >> */ >> - tree_page = get_ksm_page(stable_node_any, false); >> + tree_page = get_ksm_page(stable_node_any, >> + GET_KSM_PAGE_NOLOCK); >> } >> VM_BUG_ON(!stable_node_dup ^ !!stable_node_any); >> if (!tree_page) { >> @@ -1673,7 +1688,12 @@ static struct page *stable_tree_search(struct >> page *page) >> * It would be more elegant to return stable_node >> * than kpage, but that involves more changes. >> */ >> - tree_page = get_ksm_page(stable_node_dup, true); >> + tree_page = get_ksm_page(stable_node_dup, >> + GET_KSM_PAGE_TRYLOCK); >> + >> + if (PTR_ERR(tree_page) == -EBUSY) >> + return ERR_PTR(-EBUSY); > > or just: > > if (PTR_ERR(tree_page) == -EBUSY) > return tree_page; > > right? Either looks fine to me. Returning errno may look more explicit? Anyway I really don't have preference. > >> + >> if (unlikely(!tree_page)) >> /* >> * The tree may have been rebalanced, >> @@ -1842,7 +1862,8 @@ static struct stable_node >> *stable_tree_insert(struct page *kpage) >> * wrprotected at all times. Any will work >> * fine to continue the walk. 
>> */ >> - tree_page = get_ksm_page(stable_node_any, false); >> + tree_page = get_ksm_page(stable_node_any, >> + GET_KSM_PAGE_NOLOCK); >> } >> VM_BUG_ON(!stable_node_dup ^ !!stable_node_any); >> if (!tree_page) { >> @@ -2060,6 +2081,10 @@ static void cmp_and_merge_page(struct page >> *page, struct rmap_item *rmap_item) >> /* We first start with searching the page inside the stable >> tree */ >> kpage = stable_tree_search(page); >> + >> + if (PTR_ERR(kpage) == -EBUSY) >> + return; >> + >> if (kpage == page && rmap_item->head == stable_node) { >> put_page(kpage); >> return; >> @@ -2242,7 +2267,8 @@ static struct rmap_item >> *scan_get_next_rmap_item(struct page **page) >> list_for_each_entry_safe(stable_node, next, >> &migrate_nodes, list) { >> - page = get_ksm_page(stable_node, false); >> + page = get_ksm_page(stable_node, >> + GET_KSM_PAGE_NOLOCK); >> if (page) >> put_page(page); >> cond_resched(); >> > > Hi Yang, > > The patch looks correct as far doing what it claims to do. I'll leave it > to others to decide if a trylock-based approach is really what you want, > for KSM scans. It seems reasonable from my very limited knowledge of > KSM: there shouldn't be any cases where you really *need* to wait for > a page lock, because the whole system is really sort of an optimization > anyway. Thanks! Yang > > > thanks,
On 1/30/19 9:47 AM, Yang Shi wrote:
[...]
>>> @@ -1673,7 +1688,12 @@ static struct page *stable_tree_search(struct page *page)
>>>  		 * It would be more elegant to return stable_node
>>>  		 * than kpage, but that involves more changes.
>>>  		 */
>>> -		tree_page = get_ksm_page(stable_node_dup, true);
>>> +		tree_page = get_ksm_page(stable_node_dup,
>>> +				GET_KSM_PAGE_TRYLOCK);
>>> +
>>> +		if (PTR_ERR(tree_page) == -EBUSY)
>>> +			return ERR_PTR(-EBUSY);
>>
>> or just:
>>
>>     if (PTR_ERR(tree_page) == -EBUSY)
>>         return tree_page;
>>
>> right?
>
> Either looks fine to me. Returning errno may look more explicit? Anyway
> I really don't have preference.

Yes, either one is fine. I like to see less code on the screen, all else
being equal, but it's an extremely minor point, and sometimes being
explicit instead is better anyway.

thanks,
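The reason both forms work is that an error pointer simply encodes the errno in the pointer value, so tree_page already holds ERR_PTR(-EBUSY) at that point. A minimal user-space model of the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() helpers (simplified stand-ins for include/linux/err.h, for illustration only) shows the round trip, and why NULL stays reserved for the valid "no suitable KSM page" case:

/* Simplified stand-ins for the kernel's include/linux/err.h helpers. */
#include <stdio.h>
#include <errno.h>

#define MAX_ERRNO	4095
#define ERR_PTR(err)	((void *)(long)(err))
#define PTR_ERR(ptr)	((long)(ptr))
#define IS_ERR(ptr)	((unsigned long)(ptr) >= (unsigned long)-MAX_ERRNO)

int main(void)
{
	/* What get_ksm_page() hands back when trylock_page() fails. */
	void *tree_page = ERR_PTR(-EBUSY);

	/* "return tree_page" and "return ERR_PTR(-EBUSY)" are the same value. */
	printf("round trip: %ld == %d\n", PTR_ERR(tree_page), -EBUSY);

	/*
	 * NULL stays distinct: it means "no suitable KSM page", a valid
	 * result, while an error pointer means "page busy, retry next scan".
	 */
	void *not_found = NULL;
	printf("NULL is%s an error pointer\n", IS_ERR(not_found) ? "" : " not");
	return 0;
}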
diff --git a/mm/ksm.c b/mm/ksm.c index 6c48ad1..5647bc1 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -667,6 +667,12 @@ static void remove_node_from_stable_tree(struct stable_node *stable_node) free_stable_node(stable_node); } +enum get_ksm_page_flags { + GET_KSM_PAGE_NOLOCK, + GET_KSM_PAGE_LOCK, + GET_KSM_PAGE_TRYLOCK +}; + /* * get_ksm_page: checks if the page indicated by the stable node * is still its ksm page, despite having held no reference to it. @@ -686,7 +692,8 @@ static void remove_node_from_stable_tree(struct stable_node *stable_node) * a page to put something that might look like our key in page->mapping. * is on its way to being freed; but it is an anomaly to bear in mind. */ -static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it) +static struct page *get_ksm_page(struct stable_node *stable_node, + enum get_ksm_page_flags flags) { struct page *page; void *expected_mapping; @@ -728,8 +735,15 @@ static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it) goto stale; } - if (lock_it) { + if (flags == GET_KSM_PAGE_TRYLOCK) { + if (!trylock_page(page)) { + put_page(page); + return ERR_PTR(-EBUSY); + } + } else if (flags == GET_KSM_PAGE_LOCK) lock_page(page); + + if (flags != GET_KSM_PAGE_NOLOCK) { if (READ_ONCE(page->mapping) != expected_mapping) { unlock_page(page); put_page(page); @@ -763,7 +777,7 @@ static void remove_rmap_item_from_tree(struct rmap_item *rmap_item) struct page *page; stable_node = rmap_item->head; - page = get_ksm_page(stable_node, true); + page = get_ksm_page(stable_node, GET_KSM_PAGE_LOCK); if (!page) goto out; @@ -863,7 +877,7 @@ static int remove_stable_node(struct stable_node *stable_node) struct page *page; int err; - page = get_ksm_page(stable_node, true); + page = get_ksm_page(stable_node, GET_KSM_PAGE_LOCK); if (!page) { /* * get_ksm_page did remove_node_from_stable_tree itself. @@ -1385,7 +1399,7 @@ static struct page *stable_node_dup(struct stable_node **_stable_node_dup, * stable_node parameter itself will be freed from * under us if it returns NULL. */ - _tree_page = get_ksm_page(dup, false); + _tree_page = get_ksm_page(dup, GET_KSM_PAGE_NOLOCK); if (!_tree_page) continue; nr += 1; @@ -1508,7 +1522,7 @@ static struct page *__stable_node_chain(struct stable_node **_stable_node_dup, if (!is_stable_node_chain(stable_node)) { if (is_page_sharing_candidate(stable_node)) { *_stable_node_dup = stable_node; - return get_ksm_page(stable_node, false); + return get_ksm_page(stable_node, GET_KSM_PAGE_NOLOCK); } /* * _stable_node_dup set to NULL means the stable_node @@ -1613,7 +1627,8 @@ static struct page *stable_tree_search(struct page *page) * wrprotected at all times. Any will work * fine to continue the walk. */ - tree_page = get_ksm_page(stable_node_any, false); + tree_page = get_ksm_page(stable_node_any, + GET_KSM_PAGE_NOLOCK); } VM_BUG_ON(!stable_node_dup ^ !!stable_node_any); if (!tree_page) { @@ -1673,7 +1688,12 @@ static struct page *stable_tree_search(struct page *page) * It would be more elegant to return stable_node * than kpage, but that involves more changes. */ - tree_page = get_ksm_page(stable_node_dup, true); + tree_page = get_ksm_page(stable_node_dup, + GET_KSM_PAGE_TRYLOCK); + + if (PTR_ERR(tree_page) == -EBUSY) + return ERR_PTR(-EBUSY); + if (unlikely(!tree_page)) /* * The tree may have been rebalanced, @@ -1842,7 +1862,8 @@ static struct stable_node *stable_tree_insert(struct page *kpage) * wrprotected at all times. Any will work * fine to continue the walk. 
*/ - tree_page = get_ksm_page(stable_node_any, false); + tree_page = get_ksm_page(stable_node_any, + GET_KSM_PAGE_NOLOCK); } VM_BUG_ON(!stable_node_dup ^ !!stable_node_any); if (!tree_page) { @@ -2060,6 +2081,10 @@ static void cmp_and_merge_page(struct page *page, struct rmap_item *rmap_item) /* We first start with searching the page inside the stable tree */ kpage = stable_tree_search(page); + + if (PTR_ERR(kpage) == -EBUSY) + return; + if (kpage == page && rmap_item->head == stable_node) { put_page(kpage); return; @@ -2242,7 +2267,8 @@ static struct rmap_item *scan_get_next_rmap_item(struct page **page) list_for_each_entry_safe(stable_node, next, &migrate_nodes, list) { - page = get_ksm_page(stable_node, false); + page = get_ksm_page(stable_node, + GET_KSM_PAGE_NOLOCK); if (page) put_page(page); cond_resched();
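A side point from the v3 changelog above ("use enum to define get_ksm_page operations") is worth spelling out: once a third locking mode exists, a single bool parameter either becomes misleading or starts multiplying, while an enum keeps every call site self-describing. A toy sketch of that refactor, using hypothetical names rather than the KSM code itself:

/* Toy illustration of the bool-to-enum change, not the KSM code itself. */
#include <stdio.h>
#include <stdbool.h>

/* Before: the second argument's meaning is invisible at the call site. */
static void lookup_old(int node, bool lock_it)
{
	printf("node %d: %s\n", node, lock_it ? "lock" : "no lock");
}

/* After: three named modes, with room to grow. */
enum lookup_flags { LOOKUP_NOLOCK, LOOKUP_LOCK, LOOKUP_TRYLOCK };

static void lookup_new(int node, enum lookup_flags flags)
{
	switch (flags) {
	case LOOKUP_NOLOCK:
		printf("node %d: no lock\n", node);
		break;
	case LOOKUP_LOCK:
		printf("node %d: blocking lock\n", node);
		break;
	case LOOKUP_TRYLOCK:
		printf("node %d: trylock, skip if busy\n", node);
		break;
	}
}

int main(void)
{
	lookup_old(1, true);		/* "true" says nothing about what it does */
	lookup_new(1, LOOKUP_TRYLOCK);	/* the mode is readable at the call site */
	return 0;
}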