From patchwork Thu Oct 31 15:18:04 2024
X-Patchwork-Submitter: zhangchun
X-Patchwork-Id: 13858022
From: zhangchun
CC: zhangchun, zhangzhansheng
Subject: [PATCH v4] mm: Release kmap_lock before calling flush_tlb_kernel_range to avoid kmap_high deadlock
Date: Thu, 31 Oct 2024 23:18:04 +0800
Message-ID: <1730387884-57777-1-git-send-email-zhang.chuna@h3c.com>
In-Reply-To: <1728891693-41227-1-git-send-email-zhang.chuna@h3c.com>
References: <1728891693-41227-1-git-send-email-zhang.chuna@h3c.com>

        CPU 0:                                  CPU 1:
kmap_high() {                           kmap_xxx() {
    ...                                     irq_disable();
    spin_lock(&kmap_lock)
    ...
    map_new_virtual                         ...
      flush_all_zero_pkmaps
        flush_tlb_kernel_range          /* CPU0 holds the kmap_lock */
          smp_call_function_many        spin_lock(&kmap_lock)
          ...                               ....
    spin_unlock(&kmap_lock)
    ...

CPU 0 holds kmap_lock and waits for CPU 1 to respond to the IPI. CPU 1 has
disabled irqs and is waiting for kmap_lock, so it cannot answer the IPI.
Neither CPU can make progress.

Fix this by releasing kmap_lock before calling flush_tlb_kernel_range(),
which avoids the kmap_lock deadlock:

	if (need_flush) {
		unlock_kmap();
		flush_tlb_kernel_range(PKMAP_ADDR(0), PKMAP_ADDR(LAST_PKMAP));
		lock_kmap();
	}

Dropping the lock like this is safe. kmap_lock protects pkmap_count,
pkmap_page_table and last_pkmap_nr (a static variable). The call
flush_tlb_kernel_range(PKMAP_ADDR(0), PKMAP_ADDR(LAST_PKMAP)) neither reads
nor modifies these variables, so leaving that data unprotected for the
duration of the call is safe.

map_new_virtual() looks for a usable entry pkmap_count[last_pkmap_nr].
kmap_lock is held whenever pkmap_count[last_pkmap_nr] is read or modified.
The check "if (!pkmap_count[last_pkmap_nr])" determines whether
pkmap_count[last_pkmap_nr] is usable. If it is not, the search tries again.

Furthermore, the value of the static variable last_pkmap_nr is stored in a
local variable last_pkmap_nr while kmap_lock is held, so this is thread-safe.

In an extreme case, Thread A and Thread B may access the same last_pkmap_nr.
Thread A releases kmap_lock to call flush_tlb_kernel_range(), and Thread B
then acquires kmap_lock and modifies pkmap_count[last_pkmap_nr]. After
Thread A returns from flush_tlb_kernel_range() and reacquires kmap_lock, it
checks pkmap_count[last_pkmap_nr] again.

static inline unsigned long map_new_virtual(struct page *page)
{
	unsigned long vaddr;
	int count;
	unsigned int last_pkmap_nr; // local copy of the static variable last_pkmap_nr
	unsigned int color = get_pkmap_color(page);

start:
	...
			flush_all_zero_pkmaps(); // releases kmap_lock, then reacquires it
			count = get_pkmap_entries_count(color);
		}
		...
		if (!pkmap_count[last_pkmap_nr]) // is pkmap_count[last_pkmap_nr] in use?
			break;	/* Found a usable entry */
		if (--count)
			continue;
	...
	vaddr = PKMAP_ADDR(last_pkmap_nr);
	set_pte_at(&init_mm, vaddr,
		   &(pkmap_page_table[last_pkmap_nr]), mk_pte(page, kmap_prot));

	pkmap_count[last_pkmap_nr] = 1;
	...
	return vaddr;
}

Fixes: 3297e760776a ("highmem: atomic highmem kmap page pinning")
Signed-off-by: zhangchun
Co-developed-by: zhangzhansheng
Signed-off-by: zhangzhansheng
Suggested-by: Matthew Wilcox
Reviewed-by: zhangzhengming
---
 mm/highmem.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

--
1.8.3.1

diff --git a/mm/highmem.c b/mm/highmem.c
index ef3189b..07f2c67 100644
--- a/mm/highmem.c
+++ b/mm/highmem.c
@@ -231,8 +231,18 @@ static void flush_all_zero_pkmaps(void)
 		set_page_address(page, NULL);
 		need_flush = 1;
 	}
-	if (need_flush)
+	if (need_flush) {
+		/*
+		 * On a multi-core system, one CPU can hold kmap_lock while
+		 * waiting for other CPUs to respond to an IPI. If another CPU
+		 * has disabled irqs and is waiting for kmap_lock, it cannot
+		 * answer the IPI. Release kmap_lock before calling
+		 * flush_tlb_kernel_range() to avoid this deadlock.
+		 */
+		unlock_kmap();
 		flush_tlb_kernel_range(PKMAP_ADDR(0), PKMAP_ADDR(LAST_PKMAP));
+		lock_kmap();
+	}
 }
 
 void __kmap_flush_unused(void)
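
The safety argument above (drop the lock around the flush, then re-check the
entry with the lock held again) can be illustrated with a small single-threaded
userspace model. This is only a sketch, not kernel code. Every name in it
(NSLOTS, slot_count, model_lock, model_unlock, during_flush,
rival_claims_slot0, model_flush_all_zero, model_map_new) is hypothetical. It
also simplifies the counting: a claimed slot is set straight to 2, and the
"other CPU" is a callback that runs inside the unlocked window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 4                  /* hypothetical stand-in for LAST_PKMAP */

static int  slot_count[NSLOTS];   /* models pkmap_count[]: 0 free, 1 stale, 2 in use */
static bool lock_held;            /* models kmap_lock in this single-threaded model */

/* Hypothetical hook that runs while the lock is dropped. It stands in for
 * another CPU taking the lock during the flush window. */
static void (*during_flush)(void);

static void model_lock(void)   { assert(!lock_held); lock_held = true; }
static void model_unlock(void) { assert(lock_held);  lock_held = false; }

/* The rival "CPU" claims slot 0 if it is free, under the lock. */
static void rival_claims_slot0(void)
{
    model_lock();
    if (slot_count[0] == 0)
        slot_count[0] = 2;
    model_unlock();
}

/* Models flush_all_zero_pkmaps(): recycle stale slots, then drop the lock
 * around the modelled cross-CPU flush and take it again afterwards. */
static void model_flush_all_zero(void)
{
    bool need_flush = false;

    assert(lock_held);
    for (size_t i = 0; i < NSLOTS; i++) {
        if (slot_count[i] == 1) { /* mapped but unused */
            slot_count[i] = 0;
            need_flush = true;
        }
    }
    if (need_flush) {
        model_unlock();
        if (during_flush)
            during_flush();       /* stands in for waiting on the IPI */
        model_lock();
    }
}

/* Models the scan in map_new_virtual(): returns the claimed slot index,
 * or -1 if every slot is busy. */
static int model_map_new(void)
{
    int found = -1;

    model_lock();
    model_flush_all_zero();
    for (int i = 0; i < NSLOTS; i++) {
        if (slot_count[i] == 0) { /* checked with the lock held again */
            slot_count[i] = 2;
            found = i;
            break;
        }
    }
    model_unlock();
    return found;
}
```

If every slot starts at 1 and during_flush is set to rival_claims_slot0, then
model_map_new() returns 1: the rival took slot 0 inside the unlocked window,
and the scan skips it because the check runs only after the lock is retaken.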