From patchwork Fri Jan 10 04:30:17 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 11326625
Subject: [PATCH] mm/memory_hotplug: Fix remove_memory() lockdep splat
From: Dan Williams
To: akpm@linux-foundation.org
Cc: stable@vger.kernel.org, Vishal Verma, David Hildenbrand,
 Pavel Tatashin, Michal Hocko, Dave Hansen, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Date: Thu, 09 Jan 2020 20:30:17 -0800
Message-ID: <157863061737.2230556.3959730620803366776.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-3-g996c

The daxctl unit test for the dax_kmem driver currently triggers the
lockdep splat below.
It results from the fact that remove_memory_block_devices() is invoked
under the mem_hotplug_lock(), causing lockdep entanglements with
cpu_hotplug_lock().

The mem_hotplug_lock() is not needed to synchronize the memory block
device sysfs interface vs the page online state; that is already
handled by lock_device_hotplug(). Specifically, lock_device_hotplug()
is sufficient to allow try_remove_memory() to check the offline state
of the memblocks and be assured that subsequent online attempts will be
blocked. The device_online() path checks mem->section_count before
allowing any state manipulations, and mem->section_count is cleared in
remove_memory_block_devices().

The add_memory() path does create memblock devices under the lock, but
there is no lockdep report on that path, so it is left alone for now.

This change is only possible thanks to the recent change that
refactored memory block device removal out of arch_remove_memory()
(commit 4c4b7f9ba948 ("mm/memory_hotplug: remove memory block devices
before arch_remove_memory()")).

    ======================================================
    WARNING: possible circular locking dependency detected
    5.5.0-rc3+ #230 Tainted: G           OE
    ------------------------------------------------------
    lt-daxctl/6459 is trying to acquire lock:
    ffff99c7f0003510 (kn->count#241){++++}, at: kernfs_remove_by_name_ns+0x41/0x80

    but task is already holding lock:
    ffffffffa76a5450 (mem_hotplug_lock.rw_sem){++++}, at: percpu_down_write+0x20/0xe0

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #2 (mem_hotplug_lock.rw_sem){++++}:
           __lock_acquire+0x39c/0x790
           lock_acquire+0xa2/0x1b0
           get_online_mems+0x3e/0xb0
           kmem_cache_create_usercopy+0x2e/0x260
           kmem_cache_create+0x12/0x20
           ptlock_cache_init+0x20/0x28
           start_kernel+0x243/0x547
           secondary_startup_64+0xb6/0xc0

    -> #1 (cpu_hotplug_lock.rw_sem){++++}:
           __lock_acquire+0x39c/0x790
           lock_acquire+0xa2/0x1b0
           cpus_read_lock+0x3e/0xb0
           online_pages+0x37/0x300
           memory_subsys_online+0x17d/0x1c0
           device_online+0x60/0x80
           state_store+0x65/0xd0
           kernfs_fop_write+0xcf/0x1c0
           vfs_write+0xdb/0x1d0
           ksys_write+0x65/0xe0
           do_syscall_64+0x5c/0xa0
           entry_SYSCALL_64_after_hwframe+0x49/0xbe

    -> #0 (kn->count#241){++++}:
           check_prev_add+0x98/0xa40
           validate_chain+0x576/0x860
           __lock_acquire+0x39c/0x790
           lock_acquire+0xa2/0x1b0
           __kernfs_remove+0x25f/0x2e0
           kernfs_remove_by_name_ns+0x41/0x80
           remove_files.isra.0+0x30/0x70
           sysfs_remove_group+0x3d/0x80
           sysfs_remove_groups+0x29/0x40
           device_remove_attrs+0x39/0x70
           device_del+0x16a/0x3f0
           device_unregister+0x16/0x60
           remove_memory_block_devices+0x82/0xb0
           try_remove_memory+0xb5/0x130
           remove_memory+0x26/0x40
           dev_dax_kmem_remove+0x44/0x6a [kmem]
           device_release_driver_internal+0xe4/0x1c0
           unbind_store+0xef/0x120
           kernfs_fop_write+0xcf/0x1c0
           vfs_write+0xdb/0x1d0
           ksys_write+0x65/0xe0
           do_syscall_64+0x5c/0xa0
           entry_SYSCALL_64_after_hwframe+0x49/0xbe

    other info that might help us debug this:

    Chain exists of:
      kn->count#241 --> cpu_hotplug_lock.rw_sem --> mem_hotplug_lock.rw_sem

     Possible unsafe locking scenario:

           CPU0                    CPU1
           ----                    ----
      lock(mem_hotplug_lock.rw_sem);
                                   lock(cpu_hotplug_lock.rw_sem);
                                   lock(mem_hotplug_lock.rw_sem);
      lock(kn->count#241);

     *** DEADLOCK ***

No fixes tag as this seems to have been a long standing issue that
likely predated the addition of kernfs lockdep annotations.

Cc:
Cc: Vishal Verma
Cc: David Hildenbrand
Cc: Pavel Tatashin
Cc: Michal Hocko
Cc: Dave Hansen
Signed-off-by: Dan Williams
---
 mm/memory_hotplug.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 55ac23ef11c1..a4e7dadded08 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1763,8 +1763,6 @@ static int __ref try_remove_memory(int nid, u64 start, u64 size)
 
 	BUG_ON(check_hotplug_memory_range(start, size));
 
-	mem_hotplug_begin();
-
 	/*
 	 * All memory blocks must be offlined before removing memory. Check
 	 * whether all memory blocks in question are offline and return error
@@ -1777,9 +1775,17 @@ static int __ref try_remove_memory(int nid, u64 start, u64 size)
 	/* remove memmap entry */
 	firmware_map_remove(start, start + size, "System RAM");
 
-	/* remove memory block devices before removing memory */
+	/*
+	 * Remove memory block devices before removing memory, and do
+	 * not hold the mem_hotplug_lock() over kobject removal
+	 * operations. lock_device_hotplug() keeps the
+	 * check_memblock_offlined_cb result valid until the entire
+	 * removal process is complete.
+	 */
 	remove_memory_block_devices(start, size);
 
+	mem_hotplug_begin();
+
 	arch_remove_memory(nid, start, size, NULL);
 	memblock_free(start, size);
 	memblock_remove(start, size);
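
For readers who want to reason about the lock ordering outside the kernel, the cycle in the report can be approximated with a small userspace model. This is only a sketch under stated assumptions: it is not lockdep and uses no kernel code. The enum values, struct lock_graph, and every function name below are hypothetical and chosen for illustration. The model tracks only the three lock classes named in the splat and ignores lock_device_hotplug() entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three lock classes named in the splat. */
enum lock_class { NO_LOCK = -1, KN_COUNT, CPU_HOTPLUG, MEM_HOTPLUG, NR_CLASSES };

/* edge[a][b] is true when b has been acquired while a was held (a --> b). */
struct lock_graph {
	bool edge[NR_CLASSES][NR_CLASSES];
};

/* Depth-first search: is 'to' reachable from 'from' over recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++)
		if (g->edge[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	return false;
}

/*
 * Model "acquire 'want' while holding 'held'". Returns false, without
 * recording anything, when 'held' is already reachable from 'want',
 * i.e. when the new edge would close a cycle. Acquiring with nothing
 * held (NO_LOCK) never adds an edge.
 */
static bool acquire_while_holding(struct lock_graph *g, int held, int want)
{
	bool seen[NR_CLASSES] = { false };

	if (held == NO_LOCK)
		return true;
	if (reachable(g, want, held, seen))
		return false;
	g->edge[held][want] = true;
	return true;
}

/* Seed the chain the report lists: kn->count --> cpu_hotplug --> mem_hotplug. */
static void seed_existing_chain(struct lock_graph *g)
{
	/* #1 in the report: cpus_read_lock() reached from a sysfs write. */
	bool ok = acquire_while_holding(g, KN_COUNT, CPU_HOTPLUG);

	/* #2 in the report: get_online_mems() recorded after cpu_hotplug_lock. */
	ok = ok && acquire_while_holding(g, CPU_HOTPLUG, MEM_HOTPLUG);
	assert(ok);
	(void)ok;
}

/* Old ordering: kernfs removal runs while mem_hotplug_lock is held. */
static bool old_order_is_safe(void)
{
	struct lock_graph g = { 0 };

	seed_existing_chain(&g);
	return acquire_while_holding(&g, MEM_HOTPLUG, KN_COUNT);
}

/* New ordering: kernfs removal completes before mem_hotplug_begin(). */
static bool new_order_is_safe(void)
{
	struct lock_graph g = { 0 };

	seed_existing_chain(&g);
	return acquire_while_holding(&g, NO_LOCK, KN_COUNT) &&
	       acquire_while_holding(&g, NO_LOCK, MEM_HOTPLUG);
}
```

In this model old_order_is_safe() returns false, because adding mem_hotplug_lock --> kn->count closes the cycle shown in the report, while new_order_is_safe() returns true, because neither acquisition nests inside another modeled lock.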