From patchwork Sat Jul 14 13:39:29 2018
X-Patchwork-Submitter: Pavel Tatashin
X-Patchwork-Id: 10524731
From: Pavel Tatashin
Date: Sat, 14 Jul 2018 09:39:29 -0400
Subject: Re: Instability in current -git tree
To: Linus Torvalds
Cc: Andrew Morton, tglx@linutronix.de, willy@infradead.org, mingo@redhat.com,
 axboe@kernel.dk, gregkh@linuxfoundation.org, davem@davemloft.net,
 viro@zeniv.linux.org.uk, Dave Airlie, Tejun Heo, Theodore Tso,
 snitzer@redhat.com, Linux Memory Management List, neelx@redhat.com,
 mgorman@techsingularity.net
References: <20180713164804.fc2c27ccbac4c02ca2c8b984@linux-foundation.org>
 <20180713165812.ec391548ffeead96725d044c@linux-foundation.org>
 <9b93d48c-b997-01f7-2fd6-6e35301ef263@oracle.com>
 <5edf2d71-f548-98f9-16dd-b7fed29f4869@oracle.com>

Hi Linus,

I attached a temporary fix, which I could not test, as I was unable to
reproduce the problem, but it should fix the issue.

Reverting "f7f99100d8d9 mm: stop zeroing memory during allocation in
vmemmap" would introduce a significant boot performance regression, as
we would zero the whole memmap twice during boot.

Later, I will introduce a more detailed fix that will get rid of
zero_resv_unavail() entirely, and instead will zero skipped struct
pages in memmap_init_zone(), where it should be done.

Thank you,
Pavel

On Fri, Jul 13, 2018 at 11:25 PM Linus Torvalds wrote:
>
> On Fri, Jul 13, 2018 at 8:04 PM Pavel Tatashin wrote:
> >
> > > You can't just memset() the 'struct page' to zero after it's been set up.
> >
> > That should not be happening, unless there is a bug.
>
> Well, it does seem to happen. My memory stress-tester has been running
> for about half an hour now with the revert I posted - it used to
> trigger the problem in maybe ~5 minutes before.
>
> So I do think that revert fixes it for me. No guarantees, but since I
> figured out how to trigger it, it's been fairly reliable.
>
> > We want to zero those struct pages so we do not have uninitialized
> > data accessed by various parts of the code that rounds down large
> > pages and access the first page in section without verifying that the
> > page is valid. The example of this is described in commit that
> > introduced zero_resv_unavail()
>
> I'm attaching the relevant (?) parts of dmesg, which has the node
> ranges, maybe you can see what the problem with the code is.
>
> (NOTE! This dmesg is with that "mem=6G" command line option, which causes that
>
>     e820: remove [mem 0x180000000-0xfffffffffffffffe] usable
>
> line - that's just because it's my stress-test boot. It happens with
> or without it, but without the "mem=6G" it took days to trigger).
>
> I'm more than willing to test patches (either for added information or
> for testing fixes), although I think I'm getting off the computer for
> today.
>
>                 Linus

From 95259841ef79cc17c734a994affa3714479753e3 Mon Sep 17 00:00:00 2001
From: Pavel Tatashin
Date: Sat, 14 Jul 2018 09:15:07 -0400
Subject: [PATCH] mm: zero unavailable pages before memmap init

We must zero struct pages for memory that is not backed by physical
memory, or that the kernel does not have access to.

Recently, there was a change which zeroed all memmap for all holes in
e820. Unfortunately, it introduced a bug that is discussed here:

  https://www.spinics.net/lists/linux-mm/msg156764.html

Linus also saw this bug on his machine, and confirmed that pulling
commit 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into
memblock.reserved") fixes the issue.

The problem is that we incorrectly zero some struct pages after they
were set up.

The fix is to zero unavailable struct pages before the struct pages are
initialized.

A more detailed fix should come later that would avoid the double
zeroing cases: one in __init_single_page(), the other one in
zero_resv_unavail().

Fixes: 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into memblock.reserved")
Signed-off-by: Pavel Tatashin
---
 mm/page_alloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1521100f1e63..5d800d61ddb7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6847,6 +6847,7 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
 	/* Initialise every node */
 	mminit_verify_pageflags_layout();
 	setup_nr_node_ids();
+	zero_resv_unavail();
 	for_each_online_node(nid) {
 		pg_data_t *pgdat = NODE_DATA(nid);
 		free_area_init_node(nid, NULL,
@@ -6857,7 +6858,6 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
 			node_set_state(nid, N_MEMORY);
 		check_for_memory(pgdat, nid);
 	}
-	zero_resv_unavail();
 }
 
 static int __init cmdline_parse_core(char *p, unsigned long *core,
@@ -7033,9 +7033,9 @@ void __init set_dma_reserve(unsigned long new_dma_reserve)
 
 void __init free_area_init(unsigned long *zones_size)
 {
+	zero_resv_unavail();
 	free_area_init_node(0, zones_size,
 			__pa(PAGE_OFFSET) >> PAGE_SHIFT, NULL);
-	zero_resv_unavail();
 }
 
 static int page_alloc_cpu_dead(unsigned int cpu)
-- 
2.18.0
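
For readers who want to see why the ordering in the patch matters, here is a
minimal userspace sketch. It is not kernel code: struct toy_page, the toy_*
helpers and the page ranges are all made up for illustration, and it simply
assumes that the range being zeroed overlaps pages that have already been
set up, which is the situation the commit message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct page; the fields are illustrative only. */
struct toy_page {
	unsigned long flags;
	int refcount;
};

#define TOY_NR_PAGES 8

/* Models memmap initialisation: every page in [start, end) gets set up. */
static void toy_init_range(struct toy_page *map, size_t start, size_t end)
{
	for (size_t i = start; i < end; i++) {
		map[i].flags = 1UL;
		map[i].refcount = 1;
	}
}

/* Models a blind memset() of every page in [start, end). */
static void toy_zero_range(struct toy_page *map, size_t start, size_t end)
{
	memset(&map[start], 0, (end - start) * sizeof(map[0]));
}

/* Counts pages in [start, end) that no longer look initialised. */
static size_t toy_count_clobbered(const struct toy_page *map,
				  size_t start, size_t end)
{
	size_t n = 0;

	for (size_t i = start; i < end; i++)
		if (map[i].refcount != 1)
			n++;
	return n;
}

/*
 * Assumed layout: pages [0, 6) are initialised and the "unavailable"
 * range [4, 8) overlaps them. Returns how many initialised pages end up
 * clobbered for the chosen ordering.
 */
static size_t toy_run(int zero_first)
{
	struct toy_page map[TOY_NR_PAGES];

	memset(map, 0xff, sizeof(map));	/* memmap starts out as garbage */
	if (zero_first) {
		toy_zero_range(map, 4, 8);	/* ordering after the patch */
		toy_init_range(map, 0, 6);
	} else {
		toy_init_range(map, 0, 6);	/* ordering before the patch */
		toy_zero_range(map, 4, 8);
	}
	return toy_count_clobbered(map, 0, 6);
}
```

With the zeroing done last, the two overlapping pages lose their state; with
the zeroing done first, the later initialisation simply overwrites the zeroes.
The sketch says nothing about where the overlap comes from in the real memmap.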