From patchwork Thu May 10 11:53:56 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pavel Tatashin <pasha.tatashin@oracle.com>
X-Patchwork-Id: 10391733
From: Pavel Tatashin <pasha.tatashin@oracle.com>
To: steven.sistare@oracle.com, daniel.m.jordan@oracle.com,
 akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
 tglx@linutronix.de, mhocko@suse.com, linux-mm@kvack.org,
 mgorman@techsingularity.net, mingo@kernel.org, peterz@infradead.org,
 rostedt@goodmis.org, fengguang.wu@intel.com, dennisszhou@gmail.com
Subject: [PATCH v2] mm: allow deferred page init for vmemmap only
Date: Thu, 10 May 2018 07:53:56 -0400
Message-Id: <20180510115356.31164-1-pasha.tatashin@oracle.com>
X-Mailer: git-send-email 2.17.0

It is unsafe to do virtual to physical translations before mm_init() is
called if struct page is needed in order to determine the memory section
number (see SECTION_IN_PAGE_FLAGS).
This is because, when deferred struct page initialization is used, the
struct pages for all allocated memory are initialized only in mm_init().

My recent fix exposed this problem because it greatly reduced the number
of pages that are initialized before mm_init(), but the problem existed
even before my fix, as Fengguang Wu found.

Below is a more detailed explanation of the problem.

We initialize struct pages in four places:

1. Early in boot, a small set of struct pages is initialized to fill the
   first section and the lower zones.
2. During mm_init(), we initialize struct pages for all the memory that
   is allocated, i.e. reserved in memblock.
3. Using on-demand logic when pages are allocated after the mm_init()
   call (when memblock is finished).
4. After smp_init(), when the remaining free deferred pages are
   initialized.

The problem occurs if we try to do a va to phys translation of memory
between steps 1 and 2. Because we have not yet initialized struct pages
for all the reserved pages, it is inherently unsafe to do va to phys if
the translation itself requires access to struct page, as is the case
with this combination: CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP

Here is a sample path where translation is required and that occurs
before mm_init():

start_kernel()
 trap_init()
  setup_cpu_entry_areas()
   setup_cpu_entry_area(cpu)
    get_cpu_gdt_paddr(cpu)
     per_cpu_ptr_to_phys(addr)
      pcpu_addr_to_page(addr)
       virt_to_page(addr)
        pfn_to_page(__pa(addr) >> PAGE_SHIFT)

The problems are discussed in these threads:
http://lkml.kernel.org/r/20180418135300.inazvpxjxowogyge@wfg-t540p.sh.intel.com
http://lkml.kernel.org/r/20180419013128.iurzouiqxvcnpbvz@wfg-t540p.sh.intel.com
http://lkml.kernel.org/r/20180426202619.2768-1-pasha.tatashin@oracle.com

Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
---
 mm/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index d5004d82a1d6..1cd32d67ca30 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -635,7 +635,7 @@ config DEFERRED_STRUCT_PAGE_INIT
 	bool "Defer initialisation of struct pages to kthreads"
 	default n
 	depends on NO_BOOTMEM
-	depends on !FLATMEM
+	depends on SPARSEMEM_VMEMMAP
 	help
 	  Ordinarily all struct pages are initialised during early boot in a
 	  single thread. On very large machines this can take a considerable