From patchwork Thu Nov 22 19:54:50 2018
X-Patchwork-Submitter: Sasha Levin
X-Patchwork-Id: 10694775
From: Sasha Levin
To: stable@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Vitaly Wool, Vitaly Wool, Jongseok Kim, Andrew Morton, Linus Torvalds, Sasha Levin, linux-mm@kvack.org
Subject: [PATCH AUTOSEL 4.14 19/21] z3fold: fix possible reclaim races
Date: Thu, 22 Nov 2018 14:54:50 -0500
Message-Id: <20181122195452.13520-19-sashal@kernel.org>
In-Reply-To: <20181122195452.13520-1-sashal@kernel.org>
References: <20181122195452.13520-1-sashal@kernel.org>

From: Vitaly Wool

[ Upstream commit ca0246bb97c23da9d267c2107c07fb77e38205c9 ]

Reclaim and free can race on an object, which is basically fine, but in
order for reclaim to be able to map a "freed" object we need to encode
the object length in the handle.  handle_to_chunks() is then introduced
to extract the object length from a handle and use it during mapping.

Moreover, to avoid racing on a z3fold "headless" page release, we should
not try to free that page in z3fold_free() if the reclaim bit is set.
Also, in the unlikely case of trying to reclaim a page being freed, we
should not proceed with that page.

While at it, fix the page accounting in the reclaim function.

This patch supersedes "[PATCH] z3fold: fix reclaim lock-ups".
Link: http://lkml.kernel.org/r/20181105162225.74e8837d03583a9b707cf559@gmail.com
Signed-off-by: Vitaly Wool
Signed-off-by: Jongseok Kim
Reported-by: Jongseok Kim
Reviewed-by: Snild Dolkow
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin
---
 mm/z3fold.c | 101 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 62 insertions(+), 39 deletions(-)

diff --git a/mm/z3fold.c b/mm/z3fold.c
index f33403d718ac..2813cdfa46b9 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -99,6 +99,7 @@ struct z3fold_header {
 #define NCHUNKS		((PAGE_SIZE - ZHDR_SIZE_ALIGNED) >> CHUNK_SHIFT)
 
 #define BUDDY_MASK	(0x3)
+#define BUDDY_SHIFT	2
 
 /**
  * struct z3fold_pool - stores metadata for each z3fold pool
@@ -145,7 +146,7 @@ enum z3fold_page_flags {
 	MIDDLE_CHUNK_MAPPED,
 	NEEDS_COMPACTING,
 	PAGE_STALE,
-	UNDER_RECLAIM
+	PAGE_CLAIMED, /* by either reclaim or free */
 };
 
 /*****************
@@ -174,7 +175,7 @@ static struct z3fold_header *init_z3fold_page(struct page *page,
 	clear_bit(MIDDLE_CHUNK_MAPPED, &page->private);
 	clear_bit(NEEDS_COMPACTING, &page->private);
 	clear_bit(PAGE_STALE, &page->private);
-	clear_bit(UNDER_RECLAIM, &page->private);
+	clear_bit(PAGE_CLAIMED, &page->private);
 	spin_lock_init(&zhdr->page_lock);
 	kref_init(&zhdr->refcount);
@@ -223,8 +224,11 @@ static unsigned long encode_handle(struct z3fold_header *zhdr, enum buddy bud)
 	unsigned long handle;
 
 	handle = (unsigned long)zhdr;
-	if (bud != HEADLESS)
-		handle += (bud + zhdr->first_num) & BUDDY_MASK;
+	if (bud != HEADLESS) {
+		handle |= (bud + zhdr->first_num) & BUDDY_MASK;
+		if (bud == LAST)
+			handle |= (zhdr->last_chunks << BUDDY_SHIFT);
+	}
 	return handle;
 }
@@ -234,6 +238,12 @@ static struct z3fold_header *handle_to_z3fold_header(unsigned long handle)
 	return (struct z3fold_header *)(handle & PAGE_MASK);
 }
 
+/* only for LAST bud, returns zero otherwise */
+static unsigned short handle_to_chunks(unsigned long handle)
+{
+	return (handle & ~PAGE_MASK) >> BUDDY_SHIFT;
+}
+
 /*
  * (handle & BUDDY_MASK) < zhdr->first_num is possible in encode_handle
  * but that doesn't matter. because the masking will result in the
@@ -717,37 +727,39 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
 	page = virt_to_page(zhdr);
 	if (test_bit(PAGE_HEADLESS, &page->private)) {
-		/* HEADLESS page stored */
-		bud = HEADLESS;
-	} else {
-		z3fold_page_lock(zhdr);
-		bud = handle_to_buddy(handle);
-
-		switch (bud) {
-		case FIRST:
-			zhdr->first_chunks = 0;
-			break;
-		case MIDDLE:
-			zhdr->middle_chunks = 0;
-			zhdr->start_middle = 0;
-			break;
-		case LAST:
-			zhdr->last_chunks = 0;
-			break;
-		default:
-			pr_err("%s: unknown bud %d\n", __func__, bud);
-			WARN_ON(1);
-			z3fold_page_unlock(zhdr);
-			return;
+		/* if a headless page is under reclaim, just leave.
+		 * NB: we use test_and_set_bit for a reason: if the bit
+		 * has not been set before, we release this page
+		 * immediately so we don't care about its value any more.
+		 */
+		if (!test_and_set_bit(PAGE_CLAIMED, &page->private)) {
+			spin_lock(&pool->lock);
+			list_del(&page->lru);
+			spin_unlock(&pool->lock);
+			free_z3fold_page(page);
+			atomic64_dec(&pool->pages_nr);
 		}
+		return;
 	}
 
-	if (bud == HEADLESS) {
-		spin_lock(&pool->lock);
-		list_del(&page->lru);
-		spin_unlock(&pool->lock);
-		free_z3fold_page(page);
-		atomic64_dec(&pool->pages_nr);
+	/* Non-headless case */
+	z3fold_page_lock(zhdr);
+	bud = handle_to_buddy(handle);
+
+	switch (bud) {
+	case FIRST:
+		zhdr->first_chunks = 0;
+		break;
+	case MIDDLE:
+		zhdr->middle_chunks = 0;
+		break;
+	case LAST:
+		zhdr->last_chunks = 0;
+		break;
+	default:
+		pr_err("%s: unknown bud %d\n", __func__, bud);
+		WARN_ON(1);
+		z3fold_page_unlock(zhdr);
 		return;
 	}
@@ -755,7 +767,7 @@ static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
 		atomic64_dec(&pool->pages_nr);
 		return;
 	}
-	if (test_bit(UNDER_RECLAIM, &page->private)) {
+	if (test_bit(PAGE_CLAIMED, &page->private)) {
 		z3fold_page_unlock(zhdr);
 		return;
 	}
@@ -833,20 +845,30 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
 		}
 		list_for_each_prev(pos, &pool->lru) {
 			page = list_entry(pos, struct page, lru);
+
+			/* this bit could have been set by free, in which case
+			 * we pass over to the next page in the pool.
+			 */
+			if (test_and_set_bit(PAGE_CLAIMED, &page->private))
+				continue;
+
+			zhdr = page_address(page);
 			if (test_bit(PAGE_HEADLESS, &page->private))
-				/* candidate found */
 				break;
 
-			zhdr = page_address(page);
-			if (!z3fold_page_trylock(zhdr))
+			if (!z3fold_page_trylock(zhdr)) {
+				zhdr = NULL;
 				continue; /* can't evict at this point */
+			}
 			kref_get(&zhdr->refcount);
 			list_del_init(&zhdr->buddy);
 			zhdr->cpu = -1;
-			set_bit(UNDER_RECLAIM, &page->private);
 			break;
 		}
 
+		if (!zhdr)
+			break;
+
 		list_del_init(&page->lru);
 		spin_unlock(&pool->lock);
@@ -895,6 +917,7 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
 		if (test_bit(PAGE_HEADLESS, &page->private)) {
 			if (ret == 0) {
 				free_z3fold_page(page);
+				atomic64_dec(&pool->pages_nr);
 				return 0;
 			}
 			spin_lock(&pool->lock);
@@ -902,7 +925,7 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
 			spin_unlock(&pool->lock);
 		} else {
 			z3fold_page_lock(zhdr);
-			clear_bit(UNDER_RECLAIM, &page->private);
+			clear_bit(PAGE_CLAIMED, &page->private);
 			if (kref_put(&zhdr->refcount,
 					release_z3fold_page_locked)) {
 				atomic64_dec(&pool->pages_nr);
@@ -961,7 +984,7 @@ static void *z3fold_map(struct z3fold_pool *pool, unsigned long handle)
 		set_bit(MIDDLE_CHUNK_MAPPED, &page->private);
 		break;
 	case LAST:
-		addr += PAGE_SIZE - (zhdr->last_chunks << CHUNK_SHIFT);
+		addr += PAGE_SIZE - (handle_to_chunks(handle) << CHUNK_SHIFT);
 		break;
 	default:
 		pr_err("unknown buddy id %d\n", buddy);