From patchwork Mon Jan 28 16:04:03 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 10783941
From: David Hildenbrand
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, David Hildenbrand, Andrew Morton,
 Mel Gorman, "Kirill A.
 Shutemov", Michal Hocko, Naoya Horiguchi, Jan Kara, Andrea Arcangeli,
 Dominik Brodowski, Matthew Wilcox, Vratislav Bendel, Rafael Aquini,
 Konstantin Khlebnikov, Minchan Kim, stable@vger.kernel.org
Subject: [PATCH v1] mm: migrate: don't rely on PageMovable() of newpage after unlocking it
Date: Mon, 28 Jan 2019 17:04:03 +0100
Message-Id: <20190128160403.16657-1-david@redhat.com>

While debugging some crashes related to virtio-balloon deflation that
happened under the old balloon migration code, I stumbled over a race
that still exists today.

What we experienced:

drivers/virtio/virtio_balloon.c:release_pages_balloon():
- WARNING: CPU: 13 PID: 6586 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0
- list_del corruption. prev->next should be ffffe253961090a0, but was dead000000000100

It turns out that, after the page had been added to a local list when
dequeuing, it would suddenly be moved to an LRU list before we could
free it via the local list, corrupting both lists. So a page we own and
that is !LRU was moved to an LRU list.

In __unmap_and_move(), we lock the old and newpage and perform the
migration. In the case of virtio-balloon, the new page will become
movable, the old page will no longer be movable.

However, after unlocking newpage, there is nothing stopping the newpage
from getting dequeued and freed by virtio-balloon. This will result in
the newpage
1. No longer having PageMovable()
2. Getting moved to the local list before finally freeing it (using
   page->lru)

Back in the migration thread in __unmap_and_move(), after unlocking the
newpage we would suddenly no longer have PageMovable(newpage) and would
therefore call putback_lru_page(newpage), modifying page->lru although
that list is still in use by virtio-balloon.

To summarize, we have a race between migrating the newpage and checking
for PageMovable(newpage). Instead of checking PageMovable(newpage), we
can simply rely on is_lru of the original page.

Looks like this was introduced by d6d86c0a7f8d ("mm/balloon_compaction:
redesign ballooned pages management"), which was backported up to 3.12.
Old compaction code used PageBalloon() via __is_movable_balloon_page()
instead of PageMovable(), however with the same semantics.

Cc: Andrew Morton
Cc: Mel Gorman
Cc: "Kirill A. Shutemov"
Cc: Michal Hocko
Cc: Naoya Horiguchi
Cc: Jan Kara
Cc: Andrea Arcangeli
Cc: Dominik Brodowski
Cc: Matthew Wilcox
Cc: Vratislav Bendel
Cc: Rafael Aquini
Cc: Konstantin Khlebnikov
Cc: Minchan Kim
Cc: stable@vger.kernel.org # 3.12+
Fixes: d6d86c0a7f8d ("mm/balloon_compaction: redesign ballooned pages management")
Reported-by: Vratislav Bendel
Acked-by: Michal Hocko
Acked-by: Rafael Aquini
Signed-off-by: David Hildenbrand
---
 mm/migrate.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 4512afab46ac..31e002270b05 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1135,10 +1135,12 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	 * If migration is successful, decrease refcount of the newpage
 	 * which will not free the page because new page owner increased
 	 * refcounter. As well, if it is LRU page, add the page to LRU
-	 * list in here.
+	 * list in here. Don't rely on PageMovable(newpage), as that could
+	 * already have changed after unlocking newpage (e.g.
+	 * virtio-balloon deflation).
 	 */
 	if (rc == MIGRATEPAGE_SUCCESS) {
-		if (unlikely(__PageMovable(newpage)))
+		if (unlikely(!is_lru))
 			put_page(newpage);
 		else
 			putback_lru_page(newpage);
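
For readers who want to see the interleaving in isolation, here is a
rough userspace sketch of the race. It is not kernel code: struct
toy_page and every toy_* function are hypothetical stand-ins invented
for this illustration, and the "deflation" is replayed sequentially at
the racy point instead of running on a second CPU. It only models the
decision taken after newpage has been unlocked, once by re-reading the
page state and once by using the is_lru value sampled from the old page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; all names here are hypothetical. */
struct toy_page {
	bool movable;       /* models PageMovable(page) */
	bool on_local_list; /* models page->lru linked into the driver's local list */
	bool lru_clobbered; /* set when page->lru is written while still linked */
	int refcount;
};

/* Models balloon deflation dequeuing the page once it is unlocked. */
static void toy_balloon_dequeue(struct toy_page *p)
{
	p->movable = false;
	p->on_local_list = true;
}

/* Models putback_lru_page(): it writes page->lru. */
static void toy_putback_lru(struct toy_page *p)
{
	if (p->on_local_list)
		p->lru_clobbered = true;
}

/* Models the tail of __unmap_and_move() after newpage was unlocked. */
static void toy_finish(struct toy_page *newpage, bool is_lru, bool use_is_lru)
{
	assert(newpage != NULL);

	/* Either trust the earlier snapshot or re-read the racy page state. */
	bool non_lru = use_is_lru ? !is_lru : newpage->movable;

	if (non_lru)
		newpage->refcount--;       /* models put_page(newpage) */
	else
		toy_putback_lru(newpage);
}

/* Replays the interleaving; returns true if the local list was clobbered. */
static bool toy_replay(bool use_is_lru)
{
	struct toy_page newpage = { .movable = true, .refcount = 2 };
	bool is_lru = false; /* sampled from the old page before migration */

	toy_balloon_dequeue(&newpage); /* runs right after the unlock */
	toy_finish(&newpage, is_lru, use_is_lru);
	return newpage.lru_clobbered;
}
```

Called from a small main(), toy_replay(false) returns true (the re-read
flag sends the page down the LRU path while its list head is still in
use), whereas toy_replay(true) returns false because the decision no
longer depends on state that can change after the unlock.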