From patchwork Sun Aug 25 00:54:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11113171 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F116D14F7 for ; Sun, 25 Aug 2019 00:54:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id AA95E2190F for ; Sun, 25 Aug 2019 00:54:40 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="0PzduUZ4" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AA95E2190F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 094286B04F5; Sat, 24 Aug 2019 20:54:40 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 06BD56B04F7; Sat, 24 Aug 2019 20:54:40 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EC3D36B04F8; Sat, 24 Aug 2019 20:54:39 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0254.hostedemail.com [216.40.44.254]) by kanga.kvack.org (Postfix) with ESMTP id CD4BC6B04F5 for ; Sat, 24 Aug 2019 20:54:39 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with SMTP id 7659E4853 for ; Sun, 25 Aug 2019 00:54:39 +0000 (UTC) X-FDA: 75859129878.09.crack99_45b2fe51f9439 X-Spam-Summary: 2,0,0,31931747c5a1c9d1,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:henryburns@google.com:henrywolfeburns@gmail.com:jwadams@google.com::mm-commits@vger.kernel.org:shakeelb@google.com:stable@vger.kernel.org:torvalds@linux-foundation.org:vitalywool@gmail.com,RULES_HIT:2:41:355:379:800:960:967:973:988:989:1260:1263:1345:1381:1431:1437:1535:1605:1606:1730:1747:1777:1792:2194:2199:2393:2525:2559:2563:2682:2685:2689:2859:2902:2903:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3865:3866:3867:3868:3870:3871:3872:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4118:4321:4423:4605:5007:6261:6653:7514:7576:7875:8599:8660:9025:9545:10004:10913:11026:11473:11658:11914:12043:12048:12291:12294:12296:12297:12438:12517:12519:12555:12679:12683:12783:12986:13148:13230:13846:14096:14799:21080:21324:21451:21627:21740:21939:30034:30054:30070,0,RBL:error,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk ,SPF:fp, X-HE-Tag: crack99_45b2fe51f9439 X-Filterd-Recvd-Size: 7231 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf17.hostedemail.com (Postfix) with ESMTP for ; Sun, 25 Aug 2019 00:54:38 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 945E0206E0; Sun, 25 Aug 2019 00:54:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566694477; bh=CBZu2oc704LYAmtVnL2Z4143B5NGtHX1hgCERy8iZIA=; h=Date:From:To:Subject:From; b=0PzduUZ4UtAUI6UrMdJDxtgNvAZRwMz7kbDXQJ6PEK5Rvs5Bq02D2rnpbXcZIerUN vVz4DTeVxGXEXA2mz8BeK2IuftVPoGDXVIqMF4SdZe4bV9pWASnJy+RufnbRBzOT14 OA51oftvtj9Cx/AbX7DlULzYdRbB+TLk/QUOk5Ys= Date: Sat, 24 Aug 2019 17:54:37 -0700 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, henryburns@google.com, henrywolfeburns@gmail.com, jwadams@google.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, shakeelb@google.com, stable@vger.kernel.org, torvalds@linux-foundation.org, vitalywool@gmail.com Subject: [patch 01/11] mm/z3fold.c: fix race between migration and destruction Message-ID: <20190825005437.EXZurMmNL%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Henry Burns Subject: mm/z3fold.c: fix race between migration and destruction In z3fold_destroy_pool() we call destroy_workqueue(&pool->compact_wq). However, we have no guarantee that migration isn't happening in the background at that time. Migration directly calls queue_work_on(pool->compact_wq), if destruction wins that race we are using a destroyed workqueue. Link: http://lkml.kernel.org/r/20190809213828.202833-1-henryburns@google.com Signed-off-by: Henry Burns Cc: Vitaly Wool Cc: Shakeel Butt Cc: Jonathan Adams Cc: Henry Burns Cc: Signed-off-by: Andrew Morton --- mm/z3fold.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 89 insertions(+) --- a/mm/z3fold.c~mm-z3foldc-fix-race-between-migration-and-destruction +++ a/mm/z3fold.c @@ -41,6 +41,7 @@ #include #include #include +#include #include #include @@ -145,6 +146,8 @@ struct z3fold_header { * @release_wq: workqueue for safe page release * @work: work_struct for safe page release * @inode: inode for z3fold pseudo filesystem + * @destroying: bool to stop migration once we start destruction + * @isolated: int to count the number of pages currently in isolation * * This structure is allocated at pool creation time and maintains metadata * pertaining to a particular z3fold pool. @@ -163,8 +166,11 @@ struct z3fold_pool { const struct zpool_ops *zpool_ops; struct workqueue_struct *compact_wq; struct workqueue_struct *release_wq; + struct wait_queue_head isolate_wait; struct work_struct work; struct inode *inode; + bool destroying; + int isolated; }; /* @@ -769,6 +775,7 @@ static struct z3fold_pool *z3fold_create goto out_c; spin_lock_init(&pool->lock); spin_lock_init(&pool->stale_lock); + init_waitqueue_head(&pool->isolate_wait); pool->unbuddied = __alloc_percpu(sizeof(struct list_head)*NCHUNKS, 2); if (!pool->unbuddied) goto out_pool; @@ -808,6 +815,15 @@ out: return NULL; } +static bool pool_isolated_are_drained(struct z3fold_pool *pool) +{ + bool ret; + + spin_lock(&pool->lock); + ret = pool->isolated == 0; + spin_unlock(&pool->lock); + return ret; +} /** * z3fold_destroy_pool() - destroys an existing z3fold pool * @pool: the z3fold pool to be destroyed @@ -817,6 +833,22 @@ out: static void z3fold_destroy_pool(struct z3fold_pool *pool) { kmem_cache_destroy(pool->c_handle); + /* + * We set pool-> destroying under lock to ensure that + * z3fold_page_isolate() sees any changes to destroying. This way we + * avoid the need for any memory barriers. + */ + + spin_lock(&pool->lock); + pool->destroying = true; + spin_unlock(&pool->lock); + + /* + * We need to ensure that no pages are being migrated while we destroy + * these workqueues, as migration can queue work on either of the + * workqueues. + */ + wait_event(pool->isolate_wait, !pool_isolated_are_drained(pool)); /* * We need to destroy pool->compact_wq before pool->release_wq, @@ -1307,6 +1339,28 @@ static u64 z3fold_get_pool_size(struct z return atomic64_read(&pool->pages_nr); } +/* + * z3fold_dec_isolated() expects to be called while pool->lock is held. + */ +static void z3fold_dec_isolated(struct z3fold_pool *pool) +{ + assert_spin_locked(&pool->lock); + VM_BUG_ON(pool->isolated <= 0); + pool->isolated--; + + /* + * If we have no more isolated pages, we have to see if + * z3fold_destroy_pool() is waiting for a signal. + */ + if (pool->isolated == 0 && waitqueue_active(&pool->isolate_wait)) + wake_up_all(&pool->isolate_wait); +} + +static void z3fold_inc_isolated(struct z3fold_pool *pool) +{ + pool->isolated++; +} + static bool z3fold_page_isolate(struct page *page, isolate_mode_t mode) { struct z3fold_header *zhdr; @@ -1333,6 +1387,33 @@ static bool z3fold_page_isolate(struct p spin_lock(&pool->lock); if (!list_empty(&page->lru)) list_del(&page->lru); + /* + * We need to check for destruction while holding pool->lock, as + * otherwise destruction could see 0 isolated pages, and + * proceed. + */ + if (unlikely(pool->destroying)) { + spin_unlock(&pool->lock); + /* + * If this page isn't stale, somebody else holds a + * reference to it. Let't drop our refcount so that they + * can call the release logic. + */ + if (unlikely(kref_put(&zhdr->refcount, + release_z3fold_page_locked))) { + /* + * If we get here we have kref problems, so we + * should freak out. + */ + WARN(1, "Z3fold is experiencing kref problems\n"); + return false; + } + z3fold_page_unlock(zhdr); + return false; + } + + + z3fold_inc_isolated(pool); spin_unlock(&pool->lock); z3fold_page_unlock(zhdr); return true; @@ -1401,6 +1482,10 @@ static int z3fold_page_migrate(struct ad queue_work_on(new_zhdr->cpu, pool->compact_wq, &new_zhdr->work); + spin_lock(&pool->lock); + z3fold_dec_isolated(pool); + spin_unlock(&pool->lock); + page_mapcount_reset(page); put_page(page); return 0; @@ -1420,10 +1505,14 @@ static void z3fold_page_putback(struct p INIT_LIST_HEAD(&page->lru); if (kref_put(&zhdr->refcount, release_z3fold_page_locked)) { atomic64_dec(&pool->pages_nr); + spin_lock(&pool->lock); + z3fold_dec_isolated(pool); + spin_unlock(&pool->lock); return; } spin_lock(&pool->lock); list_add(&page->lru, &pool->lru); + z3fold_dec_isolated(pool); spin_unlock(&pool->lock); z3fold_page_unlock(zhdr); } From patchwork Sun Aug 25 00:54:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11113173 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CC9B01399 for ; Sun, 25 Aug 2019 00:54:44 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8C80022CE3 for ; Sun, 25 Aug 2019 00:54:44 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="jmF1xqx0" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8C80022CE3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id B90086B04F7; Sat, 24 Aug 2019 20:54:43 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id B418F6B04F9; Sat, 24 Aug 2019 20:54:43 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A55FD6B04FA; Sat, 24 Aug 2019 20:54:43 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0228.hostedemail.com [216.40.44.228]) by kanga.kvack.org (Postfix) with ESMTP id 85A636B04F7 for ; Sat, 24 Aug 2019 20:54:43 -0400 (EDT) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with SMTP id 2B42052A2 for ; Sun, 25 Aug 2019 00:54:43 +0000 (UTC) X-FDA: 75859130046.27.fog67_463cbf1e2813d X-Spam-Summary: 2,0,0,57fce20eb444c188,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org::m.mizuma@jp.fujitsu.com:mgorman@techsingularity.net:mm-commits@vger.kernel.org:n-horiguchi@ah.jp.nec.com:osalvador@suse.de:pavel.tatashin@microsoft.com:rientjes@google.com:stable@vger.kernel.org:torvalds@linux-foundation.org:vbabka@suse.cz,RULES_HIT:2:41:69:355:379:800:960:965:966:967:968:973:988:989:1260:1263:1345:1381:1431:1437:1535:1605:1606:1730:1747:1777:1792:2194:2196:2198:2199:2200:2201:2393:2525:2553:2559:2563:2682:2685:2689:2693:2731:2859:2898:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3743:3865:3866:3867:3868:3870:3871:3872:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4043:4117:4321:4385:4390:4395:4605:5007:6119:6261:6653:6737:7576:7688:7903:7974:8531:8599:8660:8957:9025:9545:9592:10004:11026:11232:11473:11658:11914:12043:12048:12296:12297:12438:12517:12519:12555:12679:12783:12895:12986:13148:13161:13180:13229:13230: 13255:21 X-HE-Tag: fog67_463cbf1e2813d X-Filterd-Recvd-Size: 6861 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf17.hostedemail.com (Postfix) with ESMTP for ; Sun, 25 Aug 2019 00:54:42 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 15A7C2190F; Sun, 25 Aug 2019 00:54:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566694481; bh=/nAdXxZjqQ/vrNHjU8t4XxTxSowuJMyFyK6hvPaQ6CE=; h=Date:From:To:Subject:From; b=jmF1xqx0OT0PfEa8XiLmdhk0hFX0KLqKmw5CTztQYjNQaoKpxpDZtGg0xTkOrlswV 2Wi7ZfxB6whA/vjJWUxll0hGSvd317XB9SC5M92RZCX2Z/eHPff1E5LKh3e1QWfX13 XCqtNLS2X0itSFt0YfD/EQdzuARNp4pdGowgA6Z0= Date: Sat, 24 Aug 2019 17:54:40 -0700 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, linux-mm@kvack.org, m.mizuma@jp.fujitsu.com, mgorman@techsingularity.net, mm-commits@vger.kernel.org, n-horiguchi@ah.jp.nec.com, osalvador@suse.de, pavel.tatashin@microsoft.com, rientjes@google.com, stable@vger.kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz Subject: [patch 02/11] mm, page_alloc: move_freepages should not examine struct page of reserved memory Message-ID: <20190825005440.tdSQS605E%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: David Rientjes Subject: mm, page_alloc: move_freepages should not examine struct page of reserved memory After commit 907ec5fca3dc ("mm: zero remaining unavailable struct pages"), struct page of reserved memory is zeroed. This causes page->flags to be 0 and fixes issues related to reading /proc/kpageflags, for example, of reserved memory. The VM_BUG_ON() in move_freepages_block(), however, assumes that page_zone() is meaningful even for reserved memory. That assumption is no longer true after the aforementioned commit. There's no reason why move_freepages_block() should be testing the legitimacy of page_zone() for reserved memory; its scope is limited only to pages on the zone's freelist. Note that pfn_valid() can be true for reserved memory: there is a backing struct page. The check for page_to_nid(page) is also buggy but reserved memory normally only appears on node 0 so the zeroing doesn't affect this. Move the debug checks to after verifying PageBuddy is true. This isolates the scope of the checks to only be for buddy pages which are on the zone's freelist which move_freepages_block() is operating on. In this case, an incorrect node or zone is a bug worthy of being warned about (and the examination of struct page is acceptable bcause this memory is not reserved). Why does move_freepages_block() gets called on reserved memory? It's simply math after finding a valid free page from the per-zone free area to use as fallback. We find the beginning and end of the pageblock of the valid page and that can bring us into memory that was reserved per the e820. pfn_valid() is still true (it's backed by a struct page), but since it's zero'd we shouldn't make any inferences here about comparing its node or zone. The current node check just happens to succeed most of the time by luck because reserved memory typically appears on node 0. The fix here is to validate that we actually have buddy pages before testing if there's any type of zone or node strangeness going on. We noticed it almost immediately after bringing 907ec5fca3dc in on CONFIG_DEBUG_VM builds. It depends on finding specific free pages in the per-zone free area where the math in move_freepages() will bring the start or end pfn into reserved memory and wanting to claim that entire pageblock as a new migratetype. So the path will be rare, require CONFIG_DEBUG_VM, and require fallback to a different migratetype. Some struct pages were already zeroed from reserve pages before 907ec5fca3c so it theoretically could trigger before this commit. I think it's rare enough under a config option that most people don't run that others may not have noticed. I wouldn't argue against a stable tag and the backport should be easy enough, but probably wouldn't single out a commit that this is fixing. Mel said: : The overhead of the debugging check is higher with this patch although : it'll only affect debug builds and the path is not particularly hot. : If this was a concern, I think it would be reasonable to simply remove : the debugging check as the zone boundaries are checked in : move_freepages_block and we never expect a zone/node to be smaller than : a pageblock and stuck in the middle of another zone. Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1908122036560.10779@chino.kir.corp.google.com Signed-off-by: David Rientjes Acked-by: Mel Gorman Cc: Naoya Horiguchi Cc: Masayoshi Mizuma Cc: Oscar Salvador Cc: Pavel Tatashin Cc: Vlastimil Babka Cc: Signed-off-by: Andrew Morton --- mm/page_alloc.c | 19 ++++--------------- 1 file changed, 4 insertions(+), 15 deletions(-) --- a/mm/page_alloc.c~mm-page_alloc-move_freepages-should-not-examine-struct-page-of-reserved-memory +++ a/mm/page_alloc.c @@ -2238,27 +2238,12 @@ static int move_freepages(struct zone *z unsigned int order; int pages_moved = 0; -#ifndef CONFIG_HOLES_IN_ZONE - /* - * page_zone is not safe to call in this context when - * CONFIG_HOLES_IN_ZONE is set. This bug check is probably redundant - * anyway as we check zone boundaries in move_freepages_block(). - * Remove at a later date when no bug reports exist related to - * grouping pages by mobility - */ - VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) && - pfn_valid(page_to_pfn(end_page)) && - page_zone(start_page) != page_zone(end_page)); -#endif for (page = start_page; page <= end_page;) { if (!pfn_valid_within(page_to_pfn(page))) { page++; continue; } - /* Make sure we are not inadvertently changing nodes */ - VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page); - if (!PageBuddy(page)) { /* * We assume that pages that could be isolated for @@ -2273,6 +2258,10 @@ static int move_freepages(struct zone *z continue; } + /* Make sure we are not inadvertently changing nodes */ + VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page); + VM_BUG_ON_PAGE(page_zone(page) != zone, page); + order = page_order(page); move_to_free_area(page, &zone->free_area[order], migratetype); page += 1 << order; From patchwork Sun Aug 25 00:54:43 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11113175 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B72F61399 for ; Sun, 25 Aug 2019 00:54:47 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8297E22CE9 for ; Sun, 25 Aug 2019 00:54:47 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="ZlulVT9Q" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8297E22CE9 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 8F7596B04F9; Sat, 24 Aug 2019 20:54:46 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 8A74E6B04FB; Sat, 24 Aug 2019 20:54:46 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7E50B6B04FC; Sat, 24 Aug 2019 20:54:46 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0200.hostedemail.com [216.40.44.200]) by kanga.kvack.org (Postfix) with ESMTP id 5FA6F6B04F9 for ; Sat, 24 Aug 2019 20:54:46 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with SMTP id EC4E2824CA26 for ; Sun, 25 Aug 2019 00:54:45 +0000 (UTC) X-FDA: 75859130130.05.mass81_46ab5a7ae232d X-Spam-Summary: 2,0,0,78619e0cc4d38bc9,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:cai@lca.pw::linux@roeck-us.net:mm-commits@vger.kernel.org:sfr@canb.auug.org.au:torvalds@linux-foundation.org,RULES_HIT:41:355:379:800:960:967:973:988:989:1260:1263:1345:1381:1431:1437:1534:1541:1711:1730:1747:1777:1792:1801:1981:2194:2199:2393:2525:2559:2563:2682:2685:2859:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3352:3866:3867:3870:3871:3872:3873:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4321:4605:5007:6119:6261:6653:7576:8599:9025:9545:9592:10004:10913:11026:11658:11914:12043:12048:12297:12517:12519:12555:12679:12783:12986:13069:13221:13229:13311:13357:13846:14093:14181:14384:14581:14721:14849:21080:21433:21451:21627:21939:30029:30054,0,RBL:error,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:4,LUA_SUMMARY:none X-HE-Tag: mass81_46ab5a7ae232d X-Filterd-Recvd-Size: 3210 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf01.hostedemail.com (Postfix) with ESMTP for ; Sun, 25 Aug 2019 00:54:45 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 55839206E0; Sun, 25 Aug 2019 00:54:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566694484; bh=L2S2GGnciDnwlnmsfC/b9WeIfalDn4jL2tsJvrWBT2o=; h=Date:From:To:Subject:From; b=ZlulVT9QkC1VhYqJ8u8+nxxp00qAMhXD2GYAnYjfQPSe65KDXsAkUUN1g9jnPgv92 Q2DgEYPCtR68irbFtQ0S2lYSl1pql4pI9Y+joYFpSO9SUQTnrTAbpTGN0aAPMHpKsA RB8WUL+qJwowEa0M7nV4kprv77fAx87FN/elR7Ow= Date: Sat, 24 Aug 2019 17:54:43 -0700 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, cai@lca.pw, linux-mm@kvack.org, linux@roeck-us.net, mm-commits@vger.kernel.org, sfr@canb.auug.org.au, torvalds@linux-foundation.org Subject: [patch 03/11] parisc: fix compilation errrors Message-ID: <20190825005443.DjMnGU4bk%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Qian Cai Subject: parisc: fix compilation errrors Commit 0cfaee2af3a0 ("include/asm-generic/5level-fixup.h: fix variable 'p4d' set but not used") converted a few functions from macros to static inline, which causes parisc to complain, In file included from ./include/asm-generic/4level-fixup.h:38:0, from ./arch/parisc/include/asm/pgtable.h:5, from ./arch/parisc/include/asm/io.h:6, from ./include/linux/io.h:13, from sound/core/memory.c:9: ./include/asm-generic/5level-fixup.h:14:18: error: unknown type name 'pgd_t'; did you mean 'pid_t'? #define p4d_t pgd_t ^ ./include/asm-generic/5level-fixup.h:24:28: note: in expansion of macro 'p4d_t' static inline int p4d_none(p4d_t p4d) ^~~~~ It is because "4level-fixup.h" is included before "asm/page.h" where "pgd_t" is defined. Link: http://lkml.kernel.org/r/20190815205305.1382-1-cai@lca.pw Fixes: 0cfaee2af3a0 ("include/asm-generic/5level-fixup.h: fix variable 'p4d' set but not used") Signed-off-by: Qian Cai Reported-by: Guenter Roeck Tested-by: Guenter Roeck Cc: Stephen Rothwell Signed-off-by: Andrew Morton --- arch/parisc/include/asm/pgtable.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) --- a/arch/parisc/include/asm/pgtable.h~parisc-fix-compilation-errrors +++ a/arch/parisc/include/asm/pgtable.h @@ -2,6 +2,7 @@ #ifndef _PARISC_PGTABLE_H #define _PARISC_PGTABLE_H +#include #include #include @@ -98,8 +99,6 @@ static inline void purge_tlb_entries(str #endif /* !__ASSEMBLY__ */ -#include - #define pte_ERROR(e) \ printk("%s:%d: bad pte %08lx.\n", __FILE__, __LINE__, pte_val(e)) #define pmd_ERROR(e) \ From patchwork Sun Aug 25 00:54:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11113177 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 280FF14F7 for ; Sun, 25 Aug 2019 00:54:51 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E711222CE3 for ; Sun, 25 Aug 2019 00:54:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="s36VIb77" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E711222CE3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id EC7606B04FB; Sat, 24 Aug 2019 20:54:49 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id E74896B04FD; Sat, 24 Aug 2019 20:54:49 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D66486B04FE; Sat, 24 Aug 2019 20:54:49 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0025.hostedemail.com [216.40.44.25]) by kanga.kvack.org (Postfix) with ESMTP id B62FD6B04FB for ; Sat, 24 Aug 2019 20:54:49 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with SMTP id 611024820 for ; Sun, 25 Aug 2019 00:54:49 +0000 (UTC) X-FDA: 75859130298.22.smile33_471e498b9e562 X-Spam-Summary: 2,0,0,4ba227e636abf083,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:guro@fb.com:hannes@cmpxchg.org::mhocko@suse.com:mm-commits@vger.kernel.org:stable@vger.kernel.org:torvalds@linux-foundation.org:vdavydov.dev@gmail.com,RULES_HIT:41:355:379:800:960:966:967:968:973:988:989:1260:1263:1345:1381:1431:1437:1534:1543:1711:1730:1747:1777:1792:2005:2194:2195:2196:2198:2199:2200:2201:2202:2393:2525:2559:2563:2682:2685:2693:2741:2859:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3355:3865:3866:3867:3868:3870:3871:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4041:4321:4385:5007:6261:6653:7514:7576:7903:8526:8599:9025:9545:10004:10913:11026:11473:11658:11914:12043:12048:12291:12296:12297:12438:12517:12519:12555:12663:12679:12683:12783:12895:12986:14181:14721:14849:21080:21433:21450:21451:21627:21795:21939:30034:30051:30054:30064,0,RBL:error,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF X-HE-Tag: smile33_471e498b9e562 X-Filterd-Recvd-Size: 4898 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf33.hostedemail.com (Postfix) with ESMTP for ; Sun, 25 Aug 2019 00:54:48 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 73EAD22CE3; Sun, 25 Aug 2019 00:54:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566694487; bh=sZLrUwfxCBbdM8PK8O0U93Mf5Xfl+lDvsy3+XCkpuvA=; h=Date:From:To:Subject:From; b=s36VIb77QroDG9pO3hPB3BOcqFzGpvb1OkV7gBEvl53G9WWY8NEBvd++oIUnHemsD k7HolEY+0dbcAbc8sxX5oRfh+sx+LscUaMbbGAzIExiObKyGKsooFrB2EGVX5+F/ln xrorf5uNMFDw/tmDv5HTLJzTQIJ1Nu1fqWFcWxDs= Date: Sat, 24 Aug 2019 17:54:47 -0700 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, stable@vger.kernel.org, torvalds@linux-foundation.org, vdavydov.dev@gmail.com Subject: [patch 04/11] mm: memcontrol: flush percpu vmstats before releasing memcg Message-ID: <20190825005447.6PmKju-ud%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Roman Gushchin Subject: mm: memcontrol: flush percpu vmstats before releasing memcg Percpu caching of local vmstats with the conditional propagation by the cgroup tree leads to an accumulation of errors on non-leaf levels. Let's imagine two nested memory cgroups A and A/B. Say, a process belonging to A/B allocates 100 pagecache pages on the CPU 0. The percpu cache will spill 3 times, so that 32*3=96 pages will be accounted to A/B and A atomic vmstat counters, 4 pages will remain in the percpu cache. Imagine A/B is nearby memory.max, so that every following allocation triggers a direct reclaim on the local CPU. Say, each such attempt will free 16 pages on a new cpu. That means every percpu cache will have -16 pages, except the first one, which will have 4 - 16 = -12. A/B and A atomic counters will not be touched at all. Now a user removes A/B. All percpu caches are freed and corresponding vmstat numbers are forgotten. A has 96 pages more than expected. As memory cgroups are created and destroyed, errors do accumulate. Even 1-2 pages differences can accumulate into large numbers. To fix this issue let's accumulate and propagate percpu vmstat values before releasing the memory cgroup. At this point these numbers are stable and cannot be changed. Since on cpu hotplug we do flush percpu vmstats anyway, we can iterate only over online cpus. Link: http://lkml.kernel.org/r/20190819202338.363363-2-guro@fb.com Fixes: 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty") Signed-off-by: Roman Gushchin Acked-by: Michal Hocko Cc: Johannes Weiner Cc: Vladimir Davydov Cc: Signed-off-by: Andrew Morton --- mm/memcontrol.c | 40 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) --- a/mm/memcontrol.c~mm-memcontrol-flush-percpu-vmstats-before-releasing-memcg +++ a/mm/memcontrol.c @@ -3260,6 +3260,41 @@ static u64 mem_cgroup_read_u64(struct cg } } +static void memcg_flush_percpu_vmstats(struct mem_cgroup *memcg) +{ + unsigned long stat[MEMCG_NR_STAT]; + struct mem_cgroup *mi; + int node, cpu, i; + + for (i = 0; i < MEMCG_NR_STAT; i++) + stat[i] = 0; + + for_each_online_cpu(cpu) + for (i = 0; i < MEMCG_NR_STAT; i++) + stat[i] += raw_cpu_read(memcg->vmstats_percpu->stat[i]); + + for (mi = memcg; mi; mi = parent_mem_cgroup(mi)) + for (i = 0; i < MEMCG_NR_STAT; i++) + atomic_long_add(stat[i], &mi->vmstats[i]); + + for_each_node(node) { + struct mem_cgroup_per_node *pn = memcg->nodeinfo[node]; + struct mem_cgroup_per_node *pi; + + for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) + stat[i] = 0; + + for_each_online_cpu(cpu) + for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) + stat[i] += raw_cpu_read( + pn->lruvec_stat_cpu->count[i]); + + for (pi = pn; pi; pi = parent_nodeinfo(pi, node)) + for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) + atomic_long_add(stat[i], &pi->lruvec_stat[i]); + } +} + #ifdef CONFIG_MEMCG_KMEM static int memcg_online_kmem(struct mem_cgroup *memcg) { @@ -4682,6 +4717,11 @@ static void __mem_cgroup_free(struct mem { int node; + /* + * Flush percpu vmstats to guarantee the value correctness + * on parent's and all ancestor levels. + */ + memcg_flush_percpu_vmstats(memcg); for_each_node(node) free_mem_cgroup_per_node_info(memcg, node); free_percpu(memcg->vmstats_percpu); From patchwork Sun Aug 25 00:54:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11113179 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2EFEA14F7 for ; Sun, 25 Aug 2019 00:54:54 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id EE8FE22CE3 for ; Sun, 25 Aug 2019 00:54:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="AZCZ5CG1" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EE8FE22CE3 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id B19F06B04FD; Sat, 24 Aug 2019 20:54:52 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id ACB716B04FF; Sat, 24 Aug 2019 20:54:52 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A07A06B0500; Sat, 24 Aug 2019 20:54:52 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0054.hostedemail.com [216.40.44.54]) by kanga.kvack.org (Postfix) with ESMTP id 824746B04FD for ; Sat, 24 Aug 2019 20:54:52 -0400 (EDT) Received: from smtpin19.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with SMTP id 319F8181AC9B4 for ; Sun, 25 Aug 2019 00:54:52 +0000 (UTC) X-FDA: 75859130424.19.burn82_479499eaeb905 X-Spam-Summary: 2,0,0,1d048248a3c4a6d6,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:guro@fb.com:hannes@cmpxchg.org::mhocko@suse.com:mm-commits@vger.kernel.org:stable@vger.kernel.org:torvalds@linux-foundation.org:vdavydov.dev@gmail.com,RULES_HIT:41:355:379:800:960:966:967:973:988:989:1260:1263:1345:1381:1431:1437:1534:1542:1711:1730:1747:1777:1792:2194:2196:2199:2200:2393:2525:2553:2559:2563:2682:2685:2741:2859:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3353:3865:3866:3867:3868:3870:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4041:4321:4385:5007:6261:6653:7514:7576:7903:8599:9025:9545:10004:10913:11026:11473:11658:11914:12043:12048:12291:12296:12297:12438:12517:12519:12555:12679:12783:12986:14181:14721:14849:21080:21451:21627:21939:30054:30064:30090,0,RBL:error,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:5,LUA_SUMMARY:none X-HE-Tag: burn82_479499eaeb905 X-Filterd-Recvd-Size: 3780 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf38.hostedemail.com (Postfix) with ESMTP for ; Sun, 25 Aug 2019 00:54:51 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id A04902190F; Sun, 25 Aug 2019 00:54:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566694490; bh=oDtFyeAh0PJ+Q411JYh5/5alblkC4nC7vhAYVRdeYPw=; h=Date:From:To:Subject:From; b=AZCZ5CG1iRFy6DDBDEGN2TTobvBPXuwctoVPSGHtTwtLYPmr9SbT5BpI0d9aK/+te bNP/7zUGtUxkSY8mQueRRnJciQYLS5/Ua1S0fvXcWlGaF/9/Vz0H9T0b53OnYfgRzq ABAsLMKYupZQ3erI/aQ1M3QB7WlrGfeGWon8OxOA= Date: Sat, 24 Aug 2019 17:54:50 -0700 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, guro@fb.com, hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, stable@vger.kernel.org, torvalds@linux-foundation.org, vdavydov.dev@gmail.com Subject: [patch 05/11] mm: memcontrol: flush percpu vmevents before releasing memcg Message-ID: <20190825005450.VXGCpG7ix%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Roman Gushchin Subject: mm: memcontrol: flush percpu vmevents before releasing memcg Similar to vmstats, percpu caching of local vmevents leads to an accumulation of errors on non-leaf levels. This happens because some leftovers may remain in percpu caches, so that they are never propagated up by the cgroup tree and just disappear into nonexistence with on releasing of the memory cgroup. To fix this issue let's accumulate and propagate percpu vmevents values before releasing the memory cgroup similar to what we're doing with vmstats. Since on cpu hotplug we do flush percpu vmstats anyway, we can iterate only over online cpus. Link: http://lkml.kernel.org/r/20190819202338.363363-4-guro@fb.com Fixes: 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty") Signed-off-by: Roman Gushchin Acked-by: Michal Hocko Cc: Johannes Weiner Cc: Vladimir Davydov Cc: Signed-off-by: Andrew Morton --- mm/memcontrol.c | 22 +++++++++++++++++++++- 1 file changed, 21 insertions(+), 1 deletion(-) --- a/mm/memcontrol.c~mm-memcontrol-flush-percpu-vmevents-before-releasing-memcg +++ a/mm/memcontrol.c @@ -3295,6 +3295,25 @@ static void memcg_flush_percpu_vmstats(s } } +static void memcg_flush_percpu_vmevents(struct mem_cgroup *memcg) +{ + unsigned long events[NR_VM_EVENT_ITEMS]; + struct mem_cgroup *mi; + int cpu, i; + + for (i = 0; i < NR_VM_EVENT_ITEMS; i++) + events[i] = 0; + + for_each_online_cpu(cpu) + for (i = 0; i < NR_VM_EVENT_ITEMS; i++) + events[i] += raw_cpu_read( + memcg->vmstats_percpu->events[i]); + + for (mi = memcg; mi; mi = parent_mem_cgroup(mi)) + for (i = 0; i < NR_VM_EVENT_ITEMS; i++) + atomic_long_add(events[i], &mi->vmevents[i]); +} + #ifdef CONFIG_MEMCG_KMEM static int memcg_online_kmem(struct mem_cgroup *memcg) { @@ -4718,10 +4737,11 @@ static void __mem_cgroup_free(struct mem int node; /* - * Flush percpu vmstats to guarantee the value correctness + * Flush percpu vmstats and vmevents to guarantee the value correctness * on parent's and all ancestor levels. */ memcg_flush_percpu_vmstats(memcg); + memcg_flush_percpu_vmevents(memcg); for_each_node(node) free_mem_cgroup_per_node_info(memcg, node); free_percpu(memcg->vmstats_percpu); From patchwork Sun Aug 25 00:54:53 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11113181 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 562C61399 for ; Sun, 25 Aug 2019 00:54:57 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 17A6822CE9 for ; Sun, 25 Aug 2019 00:54:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="VrpFC1Sa" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 17A6822CE9 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 28EDF6B04FF; Sat, 24 Aug 2019 20:54:56 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 218586B0501; Sat, 24 Aug 2019 20:54:56 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 17C576B0502; Sat, 24 Aug 2019 20:54:56 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0058.hostedemail.com [216.40.44.58]) by kanga.kvack.org (Postfix) with ESMTP id EDF2E6B04FF for ; Sat, 24 Aug 2019 20:54:55 -0400 (EDT) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with SMTP id 91B514820 for ; Sun, 25 Aug 2019 00:54:55 +0000 (UTC) X-FDA: 75859130550.08.space69_481215c01285b X-Spam-Summary: 2,0,0,c4a73ad1383dfa35,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:caspar@linux.alibaba.com:hannes@cmpxchg.org:joseph.qi@linux.alibaba.com:kerneljasonxing@linux.alibaba.com::mingo@redhat.com:mm-commits@vger.kernel.org:peterz@infradead.org:stable@vger.kernel.org:surenb@google.com:torvalds@linux-foundation.org,RULES_HIT:41:355:379:800:960:966:967:973:988:989:1260:1263:1345:1381:1431:1437:1534:1542:1711:1730:1747:1777:1792:2196:2198:2199:2200:2393:2525:2559:2563:2682:2685:2859:2897:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3353:3865:3867:3868:3870:3871:3872:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4250:4321:4385:5007:6119:6261:6653:6737:7576:7875:7903:9025:9391:9545:9707:10004:10913:11026:11658:11914:12043:12048:12296:12297:12438:12517:12519:12555:12679:12783:12986:13053:13161:13229:14181:14721:14849:21080:21212:21324:21433:21450:21451:21627:21740:21795:21819:21939:30006:30012:30041:30051:30 054:3007 X-HE-Tag: space69_481215c01285b X-Filterd-Recvd-Size: 3521 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf19.hostedemail.com (Postfix) with ESMTP for ; Sun, 25 Aug 2019 00:54:55 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id C7F3A2190F; Sun, 25 Aug 2019 00:54:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566694494; bh=YpmFU5fs/NJMrrrVE0ZhpttDS0wDvYzWyG+8KwPKHfM=; h=Date:From:To:Subject:From; b=VrpFC1Saajfsx9VLQwtvXAPYy7hXdVbLfCrBXyb4Wm4IwKNAet80OW8STG8HkbGcU Ys2WyzPJg/SLAeXcwZdtRgjbuCxmRaNTHYhqmelHe33dSrdlxW8tF31mhLBWssL+Iz UHP4cbGrg4r+RMlWxnvY3HN+CbP6lCvnjv0HQg8I= Date: Sat, 24 Aug 2019 17:54:53 -0700 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, caspar@linux.alibaba.com, hannes@cmpxchg.org, joseph.qi@linux.alibaba.com, kerneljasonxing@linux.alibaba.com, linux-mm@kvack.org, mingo@redhat.com, mm-commits@vger.kernel.org, peterz@infradead.org, stable@vger.kernel.org, surenb@google.com, torvalds@linux-foundation.org Subject: [patch 06/11] psi: get poll_work to run when calling poll syscall next time Message-ID: <20190825005453.mWr0lsMZh%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Jason Xing Subject: psi: get poll_work to run when calling poll syscall next time Only when calling the poll syscall the first time can user receive POLLPRI correctly. After that, user always fails to acquire the event signal. Reproduce case: 1. Get the monitor code in Documentation/accounting/psi.txt 2. Run it, and wait for the event triggered. 3. Kill and restart the process. The question is why we can end up with poll_scheduled = 1 but the work not running (which would reset it to 0). And the answer is because the scheduling side sees group->poll_kworker under RCU protection and then schedules it, but here we cancel the work and destroy the worker. The cancel needs to pair with resetting the poll_scheduled flag. Link: http://lkml.kernel.org/r/1566357985-97781-1-git-send-email-joseph.qi@linux.alibaba.com Signed-off-by: Jason Xing Signed-off-by: Joseph Qi Reviewed-by: Caspar Zhang Reviewed-by: Suren Baghdasaryan Acked-by: Johannes Weiner Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Signed-off-by: Andrew Morton --- kernel/sched/psi.c | 8 ++++++++ 1 file changed, 8 insertions(+) --- a/kernel/sched/psi.c~psi-get-poll_work-to-run-when-calling-poll-syscall-next-time +++ a/kernel/sched/psi.c @@ -1131,7 +1131,15 @@ static void psi_trigger_destroy(struct k * deadlock while waiting for psi_poll_work to acquire trigger_lock */ if (kworker_to_destroy) { + /* + * After the RCU grace period has expired, the worker + * can no longer be found through group->poll_kworker. + * But it might have been already scheduled before + * that - deschedule it cleanly before destroying it. + */ kthread_cancel_delayed_work_sync(&group->poll_work); + atomic_set(&group->poll_scheduled, 0); + kthread_destroy_worker(kworker_to_destroy); } kfree(t); From patchwork Sun Aug 25 00:54:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11113183 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C767014F7 for ; Sun, 25 Aug 2019 00:55:00 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 881E422CE9 for ; Sun, 25 Aug 2019 00:55:00 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="wpzfkNyX" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 881E422CE9 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 80D026B0501; Sat, 24 Aug 2019 20:54:59 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 7BD2C6B0503; Sat, 24 Aug 2019 20:54:59 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6D42D6B0504; Sat, 24 Aug 2019 20:54:59 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0048.hostedemail.com [216.40.44.48]) by kanga.kvack.org (Postfix) with ESMTP id 4ADCA6B0501 for ; Sat, 24 Aug 2019 20:54:59 -0400 (EDT) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with SMTP id E6050824CA26 for ; Sun, 25 Aug 2019 00:54:58 +0000 (UTC) X-FDA: 75859130676.15.heat71_488d6cc564a25 X-Spam-Summary: 2,0,0,01297195dfa22371,d41d8cd98f00b204,akpm@linux-foundation.org,:aarcange@redhat.com:akpm@linux-foundation.org:jannh@google.com:jgg@mellanox.com::mhocko@suse.com:mm-commits@vger.kernel.org:oleg@redhat.com:penguin-kernel@i-love.sakura.ne.jp:peterx@redhat.com:rppt@linux.ibm.com:stable@vger.kernel.org:torvalds@linux-foundation.org:wangkefeng.wang@huawei.com,RULES_HIT:41:69:355:379:800:960:966:967:968:973:988:989:1260:1263:1345:1381:1431:1437:1534:1542:1711:1730:1747:1777:1792:2194:2196:2199:2200:2393:2525:2559:2563:2682:2685:2859:2898:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3353:3865:3868:3870:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4321:4385:5007:6261:6653:6737:7576:7904:8599:9025:9545:10004:10913:11026:11473:11658:11914:12043:12048:12114:12296:12297:12438:12517:12519:12533:12555:12679:12783:12986:13221:13229:13255:13846:14181:14721:14849:21080:21451:21627:21939:30054:30070,0,RBL:error,CacheIP:none,Bayesian:0.5,0.5, 0.5,Netc X-HE-Tag: heat71_488d6cc564a25 X-Filterd-Recvd-Size: 4090 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf18.hostedemail.com (Postfix) with ESMTP for ; Sun, 25 Aug 2019 00:54:58 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 186B42339D; Sun, 25 Aug 2019 00:54:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566694497; bh=T03GOG6VGUuHs/Ca4qtKoTZFfuRB7+LZvKLzqSSOIhE=; h=Date:From:To:Subject:From; b=wpzfkNyXyN6ka6oSUQAmEWmPWf6KlANw+++YxBJbJr0fJT8pQH6291yLe+FT/sBTs ZFjVEIoWisA8s7huPfNBNTrreGtifYfUWDjF0yWHvItWQgHxhG0/FkT/J9aSt/Ez3f KGorxLIlwJn6IjjCAwxvf42aQ9EgLhsTZkUfxR9A= Date: Sat, 24 Aug 2019 17:54:56 -0700 From: akpm@linux-foundation.org To: aarcange@redhat.com, akpm@linux-foundation.org, jannh@google.com, jgg@mellanox.com, linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org, oleg@redhat.com, penguin-kernel@I-love.SAKURA.ne.jp, peterx@redhat.com, rppt@linux.ibm.com, stable@vger.kernel.org, torvalds@linux-foundation.org, wangkefeng.wang@huawei.com Subject: [patch 07/11] userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx Message-ID: <20190825005456.L2ZisNIcB%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Oleg Nesterov Subject: userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx userfaultfd_release() should clear vm_flags/vm_userfaultfd_ctx even if mm->core_state != NULL. Otherwise a page fault can see userfaultfd_missing() == T and use an already freed userfaultfd_ctx. Link: http://lkml.kernel.org/r/20190820160237.GB4983@redhat.com Fixes: 04f5866e41fb ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping") Signed-off-by: Oleg Nesterov Reported-by: Kefeng Wang Reviewed-by: Andrea Arcangeli Tested-by: Kefeng Wang Cc: Peter Xu Cc: Mike Rapoport Cc: Jann Horn Cc: Jason Gunthorpe Cc: Michal Hocko Cc: Tetsuo Handa Cc: Signed-off-by: Andrew Morton --- fs/userfaultfd.c | 25 +++++++++++++------------ 1 file changed, 13 insertions(+), 12 deletions(-) --- a/fs/userfaultfd.c~userfaultfd_release-always-remove-uffd-flags-and-clear-vm_userfaultfd_ctx +++ a/fs/userfaultfd.c @@ -880,6 +880,7 @@ static int userfaultfd_release(struct in /* len == 0 means wake all */ struct userfaultfd_wake_range range = { .len = 0, }; unsigned long new_flags; + bool still_valid; WRITE_ONCE(ctx->released, true); @@ -895,8 +896,7 @@ static int userfaultfd_release(struct in * taking the mmap_sem for writing. */ down_write(&mm->mmap_sem); - if (!mmget_still_valid(mm)) - goto skip_mm; + still_valid = mmget_still_valid(mm); prev = NULL; for (vma = mm->mmap; vma; vma = vma->vm_next) { cond_resched(); @@ -907,19 +907,20 @@ static int userfaultfd_release(struct in continue; } new_flags = vma->vm_flags & ~(VM_UFFD_MISSING | VM_UFFD_WP); - prev = vma_merge(mm, prev, vma->vm_start, vma->vm_end, - new_flags, vma->anon_vma, - vma->vm_file, vma->vm_pgoff, - vma_policy(vma), - NULL_VM_UFFD_CTX); - if (prev) - vma = prev; - else - prev = vma; + if (still_valid) { + prev = vma_merge(mm, prev, vma->vm_start, vma->vm_end, + new_flags, vma->anon_vma, + vma->vm_file, vma->vm_pgoff, + vma_policy(vma), + NULL_VM_UFFD_CTX); + if (prev) + vma = prev; + else + prev = vma; + } vma->vm_flags = new_flags; vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX; } -skip_mm: up_write(&mm->mmap_sem); mmput(mm); wakeup: From patchwork Sun Aug 25 00:54:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11113185 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 136DC14F7 for ; Sun, 25 Aug 2019 00:55:04 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D2B43206E0 for ; Sun, 25 Aug 2019 00:55:03 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="aLhNaLvV" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D2B43206E0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id C52DE6B0503; Sat, 24 Aug 2019 20:55:02 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id BB45F6B0505; Sat, 24 Aug 2019 20:55:02 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id ACB076B0506; Sat, 24 Aug 2019 20:55:02 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0063.hostedemail.com [216.40.44.63]) by kanga.kvack.org (Postfix) with ESMTP id 7E8196B0503 for ; Sat, 24 Aug 2019 20:55:02 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with SMTP id 27724824CA26 for ; Sun, 25 Aug 2019 00:55:02 +0000 (UTC) X-FDA: 75859130844.29.war53_49021fcfe6006 X-Spam-Summary: 2,0,0,9b17ea9c98b15d3a,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:kirill@shutemov.name::mgorman@techsingularity.net:mhocko@kernel.org:mm-commits@vger.kernel.org:stable@vger.kernel.org:torvalds@linux-foundation.org:vbabka@suse.cz:willy@infradead.org,RULES_HIT:41:355:379:800:960:966:967:973:988:989:1260:1263:1345:1381:1431:1437:1534:1541:1711:1730:1747:1777:1792:2196:2198:2199:2200:2393:2525:2559:2563:2682:2685:2731:2859:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3352:3865:3867:3868:3870:3871:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4250:4321:4385:5007:6119:6120:6261:6653:7576:7901:7903:8599:8660:8957:9025:9545:10004:10913:11026:11473:11658:11914:12043:12048:12296:12297:12438:12517:12519:12555:12679:12783:12986:13069:13148:13161:13221:13229:13230:13311:13357:13846:14181:14384:14721:14849:21080:21451:21627:21939:30054,0,RBL:error,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0, MSF:not X-HE-Tag: war53_49021fcfe6006 X-Filterd-Recvd-Size: 2917 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf05.hostedemail.com (Postfix) with ESMTP for ; Sun, 25 Aug 2019 00:55:01 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 628DA22CF7; Sun, 25 Aug 2019 00:55:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566694500; bh=5HYpnWuJ3nYcHULbN9ZEAdHlX5aKAAA5am09MKUmBjA=; h=Date:From:To:Subject:From; b=aLhNaLvVpRCMyKWPS+iYJ4gyE6kwGv3jDwTeIz1IrsW1/fP0VN9CV49gFl3CmXlzY j5/s+PCUithPry/y3C3Db1zr3Q5e7MVIC7L+J8dQb+bPuJnTpofO5KmeVYujBmYO1g Ic3xuUsTnOQYyvi5KyV4HbMubgwDXDCZnriKnWaQ= Date: Sat, 24 Aug 2019 17:54:59 -0700 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, kirill@shutemov.name, linux-mm@kvack.org, mgorman@techsingularity.net, mhocko@kernel.org, mm-commits@vger.kernel.org, stable@vger.kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz, willy@infradead.org Subject: [patch 08/11] mm, page_owner: handle THP splits correctly Message-ID: <20190825005459.Ik4Bi2G1I%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Vlastimil Babka Subject: mm, page_owner: handle THP splits correctly THP splitting path is missing the split_page_owner() call that split_page() has. As a result, split THP pages are wrongly reported in the page_owner file as order-9 pages. Furthermore when the former head page is freed, the remaining former tail pages are not listed in the page_owner file at all. This patch fixes that by adding the split_page_owner() call into __split_huge_page(). Link: http://lkml.kernel.org/r/20190820131828.22684-2-vbabka@suse.cz Fixes: a9627bc5e34e ("mm/page_owner: introduce split_page_owner and replace manual handling") Reported-by: Kirill A. Shutemov Signed-off-by: Vlastimil Babka Cc: Michal Hocko Cc: Mel Gorman Cc: Matthew Wilcox Cc: Signed-off-by: Andrew Morton --- mm/huge_memory.c | 4 ++++ 1 file changed, 4 insertions(+) --- a/mm/huge_memory.c~mm-page_owner-handle-thp-splits-correctly +++ a/mm/huge_memory.c @@ -32,6 +32,7 @@ #include #include #include +#include #include #include @@ -2516,6 +2517,9 @@ static void __split_huge_page(struct pag } ClearPageCompound(head); + + split_page_owner(head, HPAGE_PMD_ORDER); + /* See comment in __split_huge_page_tail() */ if (PageAnon(head)) { /* Additional pin to swap cache */ From patchwork Sun Aug 25 00:55:03 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11113187 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 124611399 for ; Sun, 25 Aug 2019 00:55:07 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D151B206E0 for ; Sun, 25 Aug 2019 00:55:06 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="b3C+Ykhd" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D151B206E0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id DB85F6B0505; Sat, 24 Aug 2019 20:55:05 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id D14366B0507; Sat, 24 Aug 2019 20:55:05 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C2A806B0508; Sat, 24 Aug 2019 20:55:05 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0220.hostedemail.com [216.40.44.220]) by kanga.kvack.org (Postfix) with ESMTP id 98B1C6B0505 for ; Sat, 24 Aug 2019 20:55:05 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with SMTP id 47D424853 for ; Sun, 25 Aug 2019 00:55:05 +0000 (UTC) X-FDA: 75859130970.21.bear22_497b7c5d66a2b X-Spam-Summary: 2,0,0,94cbcebe0449766f,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:henryburns@google.com:henrywolfeburns@gmail.com:jwadams@google.com::minchan@kernel.org:mm-commits@vger.kernel.org:sergey.senozhatsky@gmail.com:shakeelb@google.com:stable@vger.kernel.org:torvalds@linux-foundation.org,RULES_HIT:41:355:379:800:960:966:967:973:988:989:1260:1263:1345:1381:1431:1437:1534:1542:1711:1730:1747:1777:1792:2196:2198:2199:2200:2393:2525:2559:2563:2682:2685:2693:2859:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3353:3865:3866:3867:3868:3870:3871:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4321:4385:5007:6261:6653:6737:7514:7576:8599:8660:9010:9025:9545:10004:10913:11026:11658:11914:12043:12048:12296:12297:12438:12517:12519:12555:12679:12783:12986:13148:13230:13846:13869:14093:14096:14181:14721:14849:21080:21451:21627:21740:21939:30002:30005:30012:30054:30091,0,RBL:error,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:n one,Doma X-HE-Tag: bear22_497b7c5d66a2b X-Filterd-Recvd-Size: 4089 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf03.hostedemail.com (Postfix) with ESMTP for ; Sun, 25 Aug 2019 00:55:04 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 9B67222CE9; Sun, 25 Aug 2019 00:55:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566694504; bh=kKaN3Cr5sqAV+xNvs8rkdG+UYpwdBQMLlzsekFu2QlA=; h=Date:From:To:Subject:From; b=b3C+YkhdpZdaK92lI7iq8EhrCpuqxcU2qCpaHqHdUCdCNDNWkFKOtjy7RjUIW2MsE umfe8j51j3sb4BZGFNZ4TC0b7irE1zRMD1P4yyxKz30Z1K0zjlvM2DViyk4E69yNst wve7Gq1B1+a4GNwhgo+5IG4OLlsP6y4xNmukCyWI= Date: Sat, 24 Aug 2019 17:55:03 -0700 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, henryburns@google.com, henrywolfeburns@gmail.com, jwadams@google.com, linux-mm@kvack.org, minchan@kernel.org, mm-commits@vger.kernel.org, sergey.senozhatsky@gmail.com, shakeelb@google.com, stable@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 09/11] mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely Message-ID: <20190825005503.70Mi5FZ2O%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Henry Burns Subject: mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely In zs_page_migrate() we call putback_zspage() after we have finished migrating all pages in this zspage. However, the return value is ignored. If a zs_free() races in between zs_page_isolate() and zs_page_migrate(), freeing the last object in the zspage, putback_zspage() will leave the page in ZS_EMPTY for potentially an unbounded amount of time. To fix this, we need to do the same thing as zs_page_putback() does: schedule free_work to occur. To avoid duplicated code, move the sequence to a new putback_zspage_deferred() function which both zs_page_migrate() and zs_page_putback() call. Link: http://lkml.kernel.org/r/20190809181751.219326-1-henryburns@google.com Fixes: 48b4800a1c6a ("zsmalloc: page migration support") Signed-off-by: Henry Burns Reviewed-by: Sergey Senozhatsky Cc: Henry Burns Cc: Minchan Kim Cc: Shakeel Butt Cc: Jonathan Adams Cc: Signed-off-by: Andrew Morton --- mm/zsmalloc.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-) --- a/mm/zsmalloc.c~mm-zsmallocc-migration-can-leave-pages-in-zs_empty-indefinitely +++ a/mm/zsmalloc.c @@ -1862,6 +1862,18 @@ static void dec_zspage_isolation(struct zspage->isolated--; } +static void putback_zspage_deferred(struct zs_pool *pool, + struct size_class *class, + struct zspage *zspage) +{ + enum fullness_group fg; + + fg = putback_zspage(class, zspage); + if (fg == ZS_EMPTY) + schedule_work(&pool->free_work); + +} + static void replace_sub_page(struct size_class *class, struct zspage *zspage, struct page *newpage, struct page *oldpage) { @@ -2031,7 +2043,7 @@ static int zs_page_migrate(struct addres * the list if @page is final isolated subpage in the zspage. */ if (!is_zspage_isolated(zspage)) - putback_zspage(class, zspage); + putback_zspage_deferred(pool, class, zspage); reset_page(page); put_page(page); @@ -2077,14 +2089,13 @@ static void zs_page_putback(struct page spin_lock(&class->lock); dec_zspage_isolation(zspage); if (!is_zspage_isolated(zspage)) { - fg = putback_zspage(class, zspage); /* * Due to page_lock, we cannot free zspage immediately * so let's defer. */ - if (fg == ZS_EMPTY) - schedule_work(&pool->free_work); + putback_zspage_deferred(pool, class, zspage); } + spin_unlock(&class->lock); } From patchwork Sun Aug 25 00:55:06 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11113189 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CF9D91399 for ; Sun, 25 Aug 2019 00:55:10 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8BE7922CE9 for ; Sun, 25 Aug 2019 00:55:10 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="um4J+4HV" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8BE7922CE9 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 749786B0508; Sat, 24 Aug 2019 20:55:09 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 6AC2B6B0509; Sat, 24 Aug 2019 20:55:09 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5736B6B050A; Sat, 24 Aug 2019 20:55:09 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0120.hostedemail.com [216.40.44.120]) by kanga.kvack.org (Postfix) with ESMTP id 2E5476B0508 for ; Sat, 24 Aug 2019 20:55:09 -0400 (EDT) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with SMTP id C2C84181AC9AE for ; Sun, 25 Aug 2019 00:55:08 +0000 (UTC) X-FDA: 75859131096.29.burn94_49fb298d1e615 X-Spam-Summary: 2,0,0,c09c232b2682519e,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:henryburns@google.com:henrywolfeburns@gmail.com:jwadams@google.com::minchan@kernel.org:mm-commits@vger.kernel.org:sergey.senozhatsky@gmail.com:shakeelb@google.com:stable@vger.kernel.org:torvalds@linux-foundation.org,RULES_HIT:2:41:355:379:800:960:966:967:968:973:988:989:1260:1263:1345:1381:1431:1437:1535:1605:1606:1730:1747:1777:1792:2196:2199:2393:2525:2559:2563:2682:2685:2693:2859:2902:2903:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3865:3866:3867:3868:3870:3871:3872:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4117:4321:4385:5007:6261:6653:6737:7514:7576:7903:8599:8660:9010:9025:9545:9592:10004:10913:11026:11473:11658:11914:12043:12048:12291:12296:12297:12438:12517:12519:12555:12679:12683:12783:12986:13148:13230:13846:13972:14096:21080:21324:21451:21627:21740:21939:30002:30029:30045:30054:30070,0,RBL:error,CacheIP:none,Bayesian:0 .5,0.5,0 X-HE-Tag: burn94_49fb298d1e615 X-Filterd-Recvd-Size: 6955 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf07.hostedemail.com (Postfix) with ESMTP for ; Sun, 25 Aug 2019 00:55:08 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id EFD7F22CED; Sun, 25 Aug 2019 00:55:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566694507; bh=YvUL2+MnQV0e67f4+fMepihiMOJRzGYyvLKFw2Paq1c=; h=Date:From:To:Subject:From; b=um4J+4HV+Ee8mAXZrgetnOlkqF2lvzvEWybh/6OLzelImORXFKXtjQoYYeb1VSkcr F1S0RrKBFSeyHDuVOr5EXq5tUpGCzN2YNKBkqi1SOFqLvbIRyDsMbttakk3AJFWua2 93Zq7h7h6OjE++HiQBlo0fpH6JDaZ9sx9XV6s6KY= Date: Sat, 24 Aug 2019 17:55:06 -0700 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, henryburns@google.com, henrywolfeburns@gmail.com, jwadams@google.com, linux-mm@kvack.org, minchan@kernel.org, mm-commits@vger.kernel.org, sergey.senozhatsky@gmail.com, shakeelb@google.com, stable@vger.kernel.org, torvalds@linux-foundation.org Subject: [patch 10/11] mm/zsmalloc.c: fix race condition in zs_destroy_pool Message-ID: <20190825005506.eC_-skh7p%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Henry Burns Subject: mm/zsmalloc.c: fix race condition in zs_destroy_pool In zs_destroy_pool() we call flush_work(&pool->free_work). However, we have no guarantee that migration isn't happening in the background at that time. Since migration can't directly free pages, it relies on free_work being scheduled to free the pages. But there's nothing preventing an in-progress migrate from queuing the work *after* zs_unregister_migration() has called flush_work(). Which would mean pages still pointing at the inode when we free it. Since we know at destroy time all objects should be free, no new migrations can come in (since zs_page_isolate() fails for fully-free zspages). This means it is sufficient to track a "# isolated zspages" count by class, and have the destroy logic ensure all such pages have drained before proceeding. Keeping that state under the class spinlock keeps the logic straightforward. In this case a memory leak could lead to an eventual crash if compaction hits the leaked page. This crash would only occur if people are changing their zswap backend at runtime (which eventually starts destruction). Link: http://lkml.kernel.org/r/20190809181751.219326-2-henryburns@google.com Fixes: 48b4800a1c6a ("zsmalloc: page migration support") Signed-off-by: Henry Burns Reviewed-by: Sergey Senozhatsky Cc: Henry Burns Cc: Minchan Kim Cc: Shakeel Butt Cc: Jonathan Adams Cc: Signed-off-by: Andrew Morton --- mm/zsmalloc.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 59 insertions(+), 2 deletions(-) --- a/mm/zsmalloc.c~mm-zsmallocc-fix-race-condition-in-zs_destroy_pool +++ a/mm/zsmalloc.c @@ -54,6 +54,7 @@ #include #include #include +#include #include #include @@ -268,6 +269,10 @@ struct zs_pool { #ifdef CONFIG_COMPACTION struct inode *inode; struct work_struct free_work; + /* A wait queue for when migration races with async_free_zspage() */ + struct wait_queue_head migration_wait; + atomic_long_t isolated_pages; + bool destroying; #endif }; @@ -1874,6 +1879,19 @@ static void putback_zspage_deferred(stru } +static inline void zs_pool_dec_isolated(struct zs_pool *pool) +{ + VM_BUG_ON(atomic_long_read(&pool->isolated_pages) <= 0); + atomic_long_dec(&pool->isolated_pages); + /* + * There's no possibility of racing, since wait_for_isolated_drain() + * checks the isolated count under &class->lock after enqueuing + * on migration_wait. + */ + if (atomic_long_read(&pool->isolated_pages) == 0 && pool->destroying) + wake_up_all(&pool->migration_wait); +} + static void replace_sub_page(struct size_class *class, struct zspage *zspage, struct page *newpage, struct page *oldpage) { @@ -1943,6 +1961,7 @@ static bool zs_page_isolate(struct page */ if (!list_empty(&zspage->list) && !is_zspage_isolated(zspage)) { get_zspage_mapping(zspage, &class_idx, &fullness); + atomic_long_inc(&pool->isolated_pages); remove_zspage(class, zspage, fullness); } @@ -2042,8 +2061,16 @@ static int zs_page_migrate(struct addres * Page migration is done so let's putback isolated zspage to * the list if @page is final isolated subpage in the zspage. */ - if (!is_zspage_isolated(zspage)) + if (!is_zspage_isolated(zspage)) { + /* + * We cannot race with zs_destroy_pool() here because we wait + * for isolation to hit zero before we start destroying. + * Also, we ensure that everyone can see pool->destroying before + * we start waiting. + */ putback_zspage_deferred(pool, class, zspage); + zs_pool_dec_isolated(pool); + } reset_page(page); put_page(page); @@ -2094,8 +2121,8 @@ static void zs_page_putback(struct page * so let's defer. */ putback_zspage_deferred(pool, class, zspage); + zs_pool_dec_isolated(pool); } - spin_unlock(&class->lock); } @@ -2118,8 +2145,36 @@ static int zs_register_migration(struct return 0; } +static bool pool_isolated_are_drained(struct zs_pool *pool) +{ + return atomic_long_read(&pool->isolated_pages) == 0; +} + +/* Function for resolving migration */ +static void wait_for_isolated_drain(struct zs_pool *pool) +{ + + /* + * We're in the process of destroying the pool, so there are no + * active allocations. zs_page_isolate() fails for completely free + * zspages, so we need only wait for the zs_pool's isolated + * count to hit zero. + */ + wait_event(pool->migration_wait, + pool_isolated_are_drained(pool)); +} + static void zs_unregister_migration(struct zs_pool *pool) { + pool->destroying = true; + /* + * We need a memory barrier here to ensure global visibility of + * pool->destroying. Thus pool->isolated pages will either be 0 in which + * case we don't care, or it will be > 0 and pool->destroying will + * ensure that we wake up once isolation hits 0. + */ + smp_mb(); + wait_for_isolated_drain(pool); /* This can block */ flush_work(&pool->free_work); iput(pool->inode); } @@ -2357,6 +2412,8 @@ struct zs_pool *zs_create_pool(const cha if (!pool->name) goto err; + init_waitqueue_head(&pool->migration_wait); + if (create_cache(pool)) goto err; From patchwork Sun Aug 25 00:55:09 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Morton X-Patchwork-Id: 11113191 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 17A521399 for ; Sun, 25 Aug 2019 00:55:14 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D635B206E0 for ; Sun, 25 Aug 2019 00:55:13 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="0NeMK/2C" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D635B206E0 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id B415B6B050A; Sat, 24 Aug 2019 20:55:12 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id ACD3A6B050B; Sat, 24 Aug 2019 20:55:12 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 992036B050C; Sat, 24 Aug 2019 20:55:12 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0102.hostedemail.com [216.40.44.102]) by kanga.kvack.org (Postfix) with ESMTP id 721C96B050A for ; Sat, 24 Aug 2019 20:55:12 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with SMTP id 15E6D500F for ; Sun, 25 Aug 2019 00:55:12 +0000 (UTC) X-FDA: 75859131264.11.front74_4a77dcb843819 X-Spam-Summary: 2,0,0,7aca87d9cc082308,d41d8cd98f00b204,akpm@linux-foundation.org,:akpm@linux-foundation.org:andreyknvl@google.com:aryabinin@virtuozzo.com:catalin.marinas@arm.com:dvyukov@google.com:glider@google.com::mark.rutland@arm.com:mm-commits@vger.kernel.org:stable@vger.kernel.org:torvalds@linux-foundation.org:walter-zh.wu@mediatek.com:will.deacon@arm.com,RULES_HIT:41:355:379:800:960:965:966:967:973:988:989:1260:1263:1345:1381:1431:1437:1534:1542:1711:1730:1747:1777:1792:1981:2194:2196:2199:2200:2393:2525:2553:2559:2563:2682:2685:2693:2859:2902:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3353:3865:3866:3867:3868:3870:3871:3872:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4321:4385:4390:4395:5007:6119:6261:6653:6737:7576:8599:8603:9025:9545:10004:10913:11026:11473:11658:11914:12043:12048:12114:12296:12297:12438:12517:12519:12555:12679:12783:12986:13161:13221:13229:13255:13846:14093:14181:14721:14849:21080:21451:21627:21809:21939:30012: 30054:30 X-HE-Tag: front74_4a77dcb843819 X-Filterd-Recvd-Size: 3707 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf07.hostedemail.com (Postfix) with ESMTP for ; Sun, 25 Aug 2019 00:55:11 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 5A08A22CE3; Sun, 25 Aug 2019 00:55:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1566694510; bh=S/u5w9NhTrxsivl7K1Z35nRC16DrJe8DW84Y81s14Ew=; h=Date:From:To:Subject:From; b=0NeMK/2CHmHw3+DMaZZFpHW8Mf6T5HpVKVEEKfxxXpW/b96pn1PoLhJXX3cPT/xSR 5oWlU7eMMuZWy+ASalzhjDt9s6bdjcz1NVaO+R3FzAlIpXI4mTvBQeeZAOd2r8MB/G DmSO8pOg2NaZdUHYs6KO1V7mwrzGk0sY4w++Vq/M= Date: Sat, 24 Aug 2019 17:55:09 -0700 From: akpm@linux-foundation.org To: akpm@linux-foundation.org, andreyknvl@google.com, aryabinin@virtuozzo.com, catalin.marinas@arm.com, dvyukov@google.com, glider@google.com, linux-mm@kvack.org, mark.rutland@arm.com, mm-commits@vger.kernel.org, stable@vger.kernel.org, torvalds@linux-foundation.org, walter-zh.wu@mediatek.com, will.deacon@arm.com Subject: [patch 11/11] mm/kasan: fix false positive invalid-free reports with CONFIG_KASAN_SW_TAGS=y Message-ID: <20190825005509.BW17vfLCd%akpm@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Andrey Ryabinin Subject: mm/kasan: fix false positive invalid-free reports with CONFIG_KASAN_SW_TAGS=y The code like this: ptr = kmalloc(size, GFP_KERNEL); page = virt_to_page(ptr); offset = offset_in_page(ptr); kfree(page_address(page) + offset); may produce false-positive invalid-free reports on the kernel with CONFIG_KASAN_SW_TAGS=y. In the example above we lose the original tag assigned to 'ptr', so kfree() gets the pointer with 0xFF tag. In kfree() we check that 0xFF tag is different from the tag in shadow hence print false report. Instead of just comparing tags, do the following: 1) Check that shadow doesn't contain KASAN_TAG_INVALID. Otherwise it's double-free and it doesn't matter what tag the pointer have. 2) If pointer tag is different from 0xFF, make sure that tag in the shadow is the same as in the pointer. Link: http://lkml.kernel.org/r/20190819172540.19581-1-aryabinin@virtuozzo.com Fixes: 7f94ffbc4c6a ("kasan: add hooks implementation for tag-based mode") Signed-off-by: Andrey Ryabinin Reported-by: Walter Wu Reported-by: Mark Rutland Reviewed-by: Andrey Konovalov Cc: Alexander Potapenko Cc: Dmitry Vyukov Cc: Catalin Marinas Cc: Will Deacon Cc: Signed-off-by: Andrew Morton --- mm/kasan/common.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) --- a/mm/kasan/common.c~mm-kasan-fix-false-positive-invalid-free-reports-with-config_kasan_sw_tags=y +++ a/mm/kasan/common.c @@ -407,8 +407,14 @@ static inline bool shadow_invalid(u8 tag if (IS_ENABLED(CONFIG_KASAN_GENERIC)) return shadow_byte < 0 || shadow_byte >= KASAN_SHADOW_SCALE_SIZE; - else - return tag != (u8)shadow_byte; + + /* else CONFIG_KASAN_SW_TAGS: */ + if ((u8)shadow_byte == KASAN_TAG_INVALID) + return true; + if ((tag != KASAN_TAG_KERNEL) && (tag != (u8)shadow_byte)) + return true; + + return false; } static bool __kasan_slab_free(struct kmem_cache *cache, void *object,