From patchwork Tue Oct 25 12:38:48 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sreekanth Reddy
X-Patchwork-Id: 9394471
From: Sreekanth Reddy
To: axboe@kernel.dk
Cc: jejb@kernel.org, hch@infradead.org, martin.petersen@oracle.com,
 linux-scsi@vger.kernel.org, Sathya.Prakash@broadcom.com,
 kashyap.desai@broadcom.com, linux-kernel@vger.kernel.org,
 suganath-prabu.subramani@broadcom.com, chaitra.basappa@broadcom.com,
 linux-block@vger.kernel.org, Sreekanth Reddy
Subject: [PATCH v2] block: Fix kernel panic occurs while creating second raid disk
Date: Tue, 25 Oct 2016 18:08:48 +0530
Message-Id: <1477399128-22052-1-git-send-email-sreekanth.reddy@broadcom.com>
X-Mailer: git-send-email 2.0.2
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

From: Sreekanth Reddy

The kernel panic below is observed while creating a second RAID disk on an
LSI SAS3008 HBA card.

[ +0.000055] ------------[ cut here ]------------
[ +0.000007] WARNING: CPU: 2 PID: 281 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80
[ +0.000002] sysfs: cannot create duplicate filename '/devices/virtual/bdi/8:32'
[ +0.000001] Modules linked in: mptctl mptbase xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables intel_rapl sb_edac edac_core x86_pkg_temp_pclmul joydev ghash_clmulni_intel iTCO_wdt ipmi_ssif mei_me pcspkr mei iTCO_vendor_support ipmi_si i2c_i801 lpc_ich mfd_corema acpi_pad wmi acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace binfmt_misc sunrpc xfs libcrc32c ast i2c_algo_bit drm_kore raid_class nvme_core scsi_transport_sas dca
[ +0.000067] CPU: 2 PID: 281 Comm: kworker/u49:5 Not tainted 4.9.0-rc2 #1
[ +0.000002] Hardware name: Supermicro SYS-2028U-TNRT+/X10DRU-i+, BIOS 1.1 07/22/2015
[ +0.000005] Workqueue: events_unbound async_run_entry_fn
[ +0.000004] Call Trace:
[ +0.000009] [] dump_stack+0x63/0x85
[ +0.000005] [] __warn+0xcb/0xf0
[ +0.000004] [] warn_slowpath_fmt+0x5f/0x80
[ +0.000006] [] ? kernfs_path_from_node+0x4f/0x60
[ +0.000002] [] sysfs_warn_dup+0x62/0x80
[ +0.000002] [] sysfs_create_dir_ns+0x77/0x90
[ +0.000004] [] kobject_add_internal+0x99/0x330
[ +0.000003] [] ? vsnprintf+0x35b/0x4c0
[ +0.000003] [] kobject_add+0x75/0xd0
[ +0.000006] [] ? device_private_init+0x23/0x70
[ +0.000007] [] ? mutex_lock+0x12/0x30
[ +0.000003] [] device_add+0x119/0x670
[ +0.000004] [] device_create_groups_vargs+0xe0/0xf0
[ +0.000003] [] device_create_vargs+0x1c/0x20
[ +0.000006] [] bdi_register+0x8c/0x180
[ +0.000003] [] bdi_register_owner+0x36/0x60
[ +0.000006] [] device_add_disk+0x168/0x480
[ +0.000005] [] ? update_autosuspend+0x51/0x60
[ +0.000005] [] sd_probe_async+0x110/0x1c0
[ +0.000002] [] async_run_entry_fn+0x39/0x140
[ +0.000003] [] process_one_work+0x15f/0x430
[ +0.000002] [] worker_thread+0x4e/0x490
[ +0.000002] [] ? process_one_work+0x430/0x430
[ +0.000003] [] kthread+0xd9/0xf0
[ +0.000003] [] ? kthread_park+0x60/0x60
[ +0.000003] [] ret_from_fork+0x25/0x30
[ +0.000002] ------------[ cut here ]------------
[ +0.000004] WARNING: CPU: 2 PID: 281 at lib/kobject.c:240 kobject_add_internal+0x2bd/0x330
[ +0.000001] kobject_add_internal failed for 8:32 with -EEXIST, don't try to register things with the same name in the same
[ +0.000001] Modules linked in: mptctl mptbase xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables intel_rapl sb_edac edac_core x86_pkg_temp_pclmul joydev ghash_clmulni_intel iTCO_wdt ipmi_ssif mei_me pcspkr mei iTCO_vendor_support ipmi_si i2c_i801 lpc_ich mfd_corema acpi_pad wmi acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace binfmt_misc sunrpc xfs libcrc32c ast i2c_algo_bit drm_kore raid_class nvme_core scsi_transport_sas dca
[ +0.000043] CPU: 2 PID: 281 Comm: kworker/u49:5 Tainted: G W 4.9.0-rc2 #1
[ +0.000001] Hardware name: Supermicro SYS-2028U-TNRT+/X10DRU-i+, BIOS 1.1 07/22/2015
[ +0.000002] Workqueue: events_unbound async_run_entry_fn
[ +0.000003] Call Trace:
[ +0.000003] [] dump_stack+0x63/0x85
[ +0.000003] [] __warn+0xcb/0xf0
[ +0.000004] [] warn_slowpath_fmt+0x5f/0x80
[ +0.000002] [] ? sysfs_warn_dup+0x6a/0x80
[ +0.000003] [] kobject_add_internal+0x2bd/0x330
[ +0.000003] [] ? vsnprintf+0x35b/0x4c0
[ +0.000003] [] kobject_add+0x75/0xd0
[ +0.000003] [] ? device_private_init+0x23/0x70
[ +0.000004] [] ? mutex_lock+0x12/0x30
[ +0.000002] [] device_add+0x119/0x670
[ +0.000004] [] device_create_groups_vargs+0xe0/0xf0
[ +0.000003] [] device_create_vargs+0x1c/0x20
[ +0.000003] [] bdi_register+0x8c/0x180
[ +0.000003] [] bdi_register_owner+0x36/0x60
[ +0.000004] [] device_add_disk+0x168/0x480
[ +0.000003] [] ? update_autosuspend+0x51/0x60
[ +0.000002] [] sd_probe_async+0x110/0x1c0
[ +0.000002] [] async_run_entry_fn+0x39/0x140
[ +0.000002] [] process_one_work+0x15f/0x430
[ +0.000002] [] worker_thread+0x4e/0x490
[ +0.000002] [] ? process_one_work+0x430/0x430
[ +0.000003] [] kthread+0xd9/0xf0
[ +0.000003] [] ? kthread_park+0x60/0x60
[ +0.000003] [] ret_from_fork+0x25/0x30
[ +0.000949] BUG: unable to handle kernel
[ +0.005263] NULL pointer dereference
[ +0.002853] IP: [] sysfs_do_create_link_sd.isra.2+0x34/0xb0
[ +0.008584] PGD 0
[ +0.006115] Oops: 0000 [#1] SMP
[ +0.004531] Modules linked in: mptctl mptbase xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables intel_rapl sb_edac edac_core x86_pkg_temp_pclmul joydev ghash_clmulni_intel iTCO_wdt ipmi_ssif mei_me pcspkr mei iTCO_vendor_support ipmi_si i2c_i801 lpc_ich mfd_corema acpi_pad wmi acpi_power_meter nfsd auth_rpcgss nfs_acl lockd grace binfmt_misc sunrpc xfs libcrc32c ast i2c_algo_bit drm_kore raid_class nvme_core scsi_transport_sas dca
[ +0.080566] CPU: 17 PID: 281 Comm: kworker/u49:5 Tainted: G W 4.9.0-rc2 #1
[ +0.009472] Hardware name: Supermicro SYS-2028U-TNRT+/X10DRU-i+, BIOS 1.1 07/22/2015
[ +0.009169] Workqueue: events_unbound async_run_entry_fn
[ +0.007340] RIP: 0010:[] [] sysfs_do_create_link_sd.isra.2+0x34/0xb0
[ +0.010294] Call Trace:
[ +0.005269] [] sysfs_create_link+0x25/0x40
[ +0.008568] [] device_add_disk+0x1fc/0x480
[ +0.008551] [] sd_probe_async+0x110/0x1c0
[ +0.008456] [] async_run_entry_fn+0x39/0x140
[ +0.010021] [] process_one_work+0x15f/0x430
[ +0.009623] [] worker_thread+0x4e/0x490
[ +0.007422] [] ? process_one_work+0x430/0x430
[ +0.008728] [] kthread+0xd9/0xf0
[ +0.007578] [] ? kthread_park+0x60/0x60
[ +0.006816] [] ret_from_fork+0x25/0x30
[ +0.006814] Code: 75 48 85 ff 74 70 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fe 53 48 c7 c7 90 74 01 82 48 89 f3 41 89 cc c5 ff ff c6 05 15 48 d5
[ +0.022853] RIP [] sysfs_do_create_link_sd.isra.2+0x34/0xb0
[ +0.008679] RSP
[ +0.006129] BUG: unable to handle kernel

While analyzing this issue, I observed that when the first RAID disk is
created, we hide its PD devices (i.e. the devices are still there, but
they no longer have block device entries). However, the kernel does not
remove the BDI entries of these PD devices from /sys/devices/virtual/bdi/.
It still shows bdi device entries for these PDs even though the PDs no
longer have block device names.

For example, the output of 'ls -l /dev/sd*' after creating the first RAID
disk:

[root@dhcp ~]# ls -l /dev/sd*
brw-rw---- 1 root disk 8,   0 Oct 24 17:37 /dev/sda
brw-rw---- 1 root disk 8,   1 Oct 24 17:37 /dev/sda1
brw-rw---- 1 root disk 8,   2 Oct 24 17:37 /dev/sda2
brw-rw---- 1 root disk 8,   3 Oct 24 17:37 /dev/sda3
brw-rw---- 1 root disk 8,  16 Oct 24 17:37 /dev/sdb
brw-rw---- 1 root disk 8,  64 Oct 24 17:37 /dev/sde
brw-rw---- 1 root disk 8,  80 Oct 24 17:37 /dev/sdf
brw-rw---- 1 root disk 8,  96 Oct 24 17:37 /dev/sdg
brw-rw---- 1 root disk 8, 112 Oct 24 17:37 /dev/sdh
brw-rw---- 1 root disk 8, 128 Oct 24 17:37 /dev/sdi
brw-rw---- 1 root disk 8, 144 Oct 24 17:37 /dev/sdj
brw-rw---- 1 root disk 8, 160 Oct 24 17:41 /dev/sdk

Output of 'ls -l /sys/devices/virtual/bdi/':

[root@dhcp-135-24-192-127 ~]# ls -l /sys/devices/virtual/bdi/
total 0
drwxr-xr-x 3 root root 0 Oct 24 17:39 259:0
drwxr-xr-x 3 root root 0 Oct 24 17:39 8:0
drwxr-xr-x 3 root root 0 Oct 24 17:39 8:112
drwxr-xr-x 3 root root 0 Oct 24 17:39 8:128
drwxr-xr-x 3 root root 0 Oct 24 17:39 8:144
drwxr-xr-x 3 root root 0 Oct 24 17:39 8:16
drwxr-xr-x 3 root root 0 Oct 24 17:41 8:160
drwxr-xr-x 3 root root 0 Oct 24 17:39 8:32
drwxr-xr-x 3 root root 0 Oct 24 17:39 8:48
drwxr-xr-x 3 root root 0 Oct 24 17:39 8:64
drwxr-xr-x 3 root root 0 Oct 24 17:39 8:80
drwxr-xr-x 3 root root 0 Oct 24 17:39 8:96

Here we can observe that there are no block devices for the '8:32' and
'8:48' bdi entries; these entries belong to the PDs of RAID disk /dev/sdk.

Now, while creating a second RAID disk, the kernel tries to use
MAJOR:MINOR 8:32 for it. Registering its bdi fails with -EEXIST because
the stale '8:32' entry still exists, and we observe the kernel oops above.

Calling bdi_unregister() in del_gendisk() resolves this issue.

v2 changes:
 * Remove the bdi from the list only if it has a valid device

Signed-off-by: Sreekanth Reddy
---
 block/genhd.c    |  1 +
 mm/backing-dev.c | 10 +++++-----
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index fcd6d4f..b95f2fa 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -658,6 +658,7 @@ void del_gendisk(struct gendisk *disk)
 	disk->flags &= ~GENHD_FL_UP;
 
 	sysfs_remove_link(&disk_to_dev(disk)->kobj, "bdi");
+	bdi_unregister(&disk->queue->backing_dev_info);
 	blk_unregister_queue(disk);
 	blk_unregister_region(disk_devt(disk), disk->minors);
 
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 8fde443..80a64f0 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -853,12 +853,12 @@ static void bdi_remove_from_list(struct backing_dev_info *bdi)
 
 void bdi_unregister(struct backing_dev_info *bdi)
 {
-	/* make sure nobody finds us on the bdi_list anymore */
-	bdi_remove_from_list(bdi);
-	wb_shutdown(&bdi->wb);
-	cgwb_bdi_destroy(bdi);
-
 	if (bdi->dev) {
+		/* make sure nobody finds us on the bdi_list anymore */
+		bdi_remove_from_list(bdi);
+		wb_shutdown(&bdi->wb);
+		cgwb_bdi_destroy(bdi);
+
 		bdi_debug_unregister(bdi);
 		device_unregister(bdi->dev);
 		bdi->dev = NULL;
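
For readers following the analysis in the commit message, the failure mode can be modelled in user space. This is only a rough sketch and is not kernel code. `struct bdi_dir` and all `model_*` functions are hypothetical stand-ins for the /sys/devices/virtual/bdi/ directory, bdi_register(), bdi_unregister() and del_gendisk(). It assumes, as the commit message describes, that the device number 8:32 becomes reusable once the PD is hidden.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAX_BDI  16
#define NAME_LEN 32

/* Hypothetical stand-in for the /sys/devices/virtual/bdi/ directory. */
struct bdi_dir {
	char names[MAX_BDI][NAME_LEN];
	int used[MAX_BDI];
};

/*
 * Model of bdi registration: add a "major:minor" entry.
 * A duplicate name fails with -EEXIST, mirroring the
 * kobject_add_internal() failure shown in the log.
 */
static int model_bdi_register(struct bdi_dir *d, int major, int minor)
{
	char name[NAME_LEN];
	int i, free_slot = -1;

	snprintf(name, sizeof(name), "%d:%d", major, minor);
	for (i = 0; i < MAX_BDI; i++) {
		if (d->used[i] && strcmp(d->names[i], name) == 0)
			return -EEXIST;
		if (!d->used[i] && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return -ENOSPC;
	strcpy(d->names[free_slot], name);
	d->used[free_slot] = 1;
	return 0;
}

/* Model of bdi unregistration: drop the "major:minor" entry if present. */
static void model_bdi_unregister(struct bdi_dir *d, int major, int minor)
{
	char name[NAME_LEN];
	int i;

	snprintf(name, sizeof(name), "%d:%d", major, minor);
	for (i = 0; i < MAX_BDI; i++)
		if (d->used[i] && strcmp(d->names[i], name) == 0)
			d->used[i] = 0;
}

/*
 * Model of disk removal. Without the fix the bdi entry is left behind;
 * with the fix it is removed together with the disk.
 */
static void model_del_gendisk(struct bdi_dir *d, int major, int minor,
			      int with_fix)
{
	if (with_fix)
		model_bdi_unregister(d, major, minor);
}

/*
 * Replay the scenario from the commit message and return the result of
 * registering the second RAID disk on the reused device number 8:32.
 */
int model_second_raid_disk(int with_fix)
{
	struct bdi_dir d;
	int ret;

	memset(&d, 0, sizeof(d));
	ret = model_bdi_register(&d, 8, 32);	/* PD shows up as a disk */
	assert(ret == 0);
	(void)ret;
	model_del_gendisk(&d, 8, 32, with_fix);	/* PD hidden by first RAID disk */
	return model_bdi_register(&d, 8, 32);	/* second RAID disk reuses 8:32 */
}
```

In this model, `model_second_raid_disk(0)` returns -EEXIST because the stale "8:32" entry is still present, while `model_second_raid_disk(1)` returns 0. The real kernel goes on to dereference a NULL pointer after the failed registration, which the sketch does not attempt to reproduce.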