From patchwork Thu Jun 16 02:55:56 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: DingXiang
X-Patchwork-Id: 9179783
From: DingXiang
Subject: [PATCH V2] libata:fix kernel panic when hotplug
Date: Thu, 16 Jun 2016 10:55:56 +0800
Message-ID: <1466045756-16425-1-git-send-email-dingxiang@huawei.com>
X-Mailer: git-send-email 1.7.1
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-scsi@vger.kernel.org

From: Ding Xiang

Under normal conditions, when the SAS protocol is in use and a SATA disk
is hot-plugged on a port, the SAS driver sends the event
"PORTE_BYTES_DMAED" and calls the function "sas_porte_bytes_dmaed".
However, if a SATA disk is unplugged while it is running I/O and a new
SATA disk is then plugged in, the kernel may panic like this:

[ 2366.923208] Unable to handle kernel NULL pointer dereference at virtual address 000007b8
[ 2366.949253] pgd = ffffffc00121d000
[ 2366.971164] [000007b8] *pgd=00000027df893003, *pud=00000027df893003, *pmd=00000027df894003, *pte=006000006d000707
[ 2367.022822] Internal error: Oops: 96000005 [#1] SMP
[ 2367.048490] Modules linked in: dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) crc32_arm64(E) aes_ce_blk(E) ablk_helper(E) cryptd(E) aes_ce_cipher(E) ghash_ce(E) sha2_ce(E) sha1_ce(E) ses(E) enclosure(E) shpchp(E) marvell(E)
[ 2367.144808] CPU: 16 PID: 710 Comm: kworker/16:1 Tainted: G E 4.1.23-next.aarch64 #1
[ 2367.180161] Hardware name: Huawei Taishan 2280 /BC11SPCC, BIOS 1.28 05/14/2016
[ 2367.213305] Workqueue: events ata_scsi_hotplug
[ 2367.244296] task: ffffffe7db9b5e00 ti: ffffffe7db1a0000 task.ti: ffffffe7db1a0000
[ 2367.279949] PC is at sas_find_dev_by_rphy+0x48/0x118
[ 2367.312045] LR is at sas_find_dev_by_rphy+0x40/0x118
[ 2367.341970] pc : [] lr : [] pstate: 00000145
...
[ 2368.766334] Call trace:
[ 2368.781712] [] sas_find_dev_by_rphy+0x48/0x118
[ 2368.800394] [] sas_target_alloc+0x28/0x98
[ 2368.817975] [] scsi_alloc_target+0x248/0x308
[ 2368.835570] [] __scsi_add_device+0xb8/0x160
[ 2368.853034] [] ata_scsi_scan_host+0x190/0x230
[ 2368.871614] [] ata_scsi_hotplug+0xc8/0xe8
[ 2368.889152] [] process_one_work+0x164/0x438
[ 2368.908003] [] worker_thread+0x144/0x4b0
[ 2368.924613] [] kthread+0xfc/0x110
[ 2368.940923] Code: aa1303e0 97ff5deb 34ffff80 d1082273 (f943de76)

This happens because "dev_to_shost" in "sas_find_dev_by_rphy" returns a
NULL pointer, which SHOST_TO_SAS_HA then uses, so the kernel panics.

Why does dev_to_shost return a NULL pointer?
In "__scsi_add_device", struct device *parent = &shost->shost_gendev,
and in "scsi_alloc_target" "parent" is assigned to
"starget->dev.parent". "sas_target_alloc" then derives a
"struct sas_rphy" from "starget->dev.parent", and
"sas_find_dev_by_rphy" derives "struct Scsi_Host *shost" from
"rphy->dev.parent". In this path rphy->dev.parent =
shost->shost_gendev.parent, and shost_gendev.parent is "ap->tdev",
which has no parent of its own, so "dev_to_shost" returns a NULL
pointer.

When does the panic happen?
When libata is handling an error and has added hotplug_task to the
workqueue, and a new SATA disk is plugged in at that moment, the libata
hotplug task runs and the panic occurs.

In fact, libata does not need to handle hotplug in a SAS environment,
so the ATA hotplug task must not be scheduled when the ATA port is a
SAS host.
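
The parent walk described above can be modelled in user space. The
following is a minimal sketch, not kernel code: struct device and
struct device_type are cut down to the two fields that matter,
find_host() is a hypothetical stand-in for dev_to_shost(), and the rphy
is attached directly to the host device for brevity (a real SAS
topology has intermediate devices).

```c
#include <stddef.h>

/* Toy stand-ins for the kernel structures; only the fields used here. */
struct device_type {
	const char *name;
};

struct device {
	struct device *parent;
	const struct device_type *type;
};

static const struct device_type host_type  = { "scsi_host" };
static const struct device_type other_type = { "other" };

/* Stands in for ap->tdev, which per the description has no parent. */
static struct device tdev = { NULL, &other_type };

/* Stands in for shost->shost_gendev, whose parent is ap->tdev. */
static struct device shost_gendev = { &tdev, &host_type };

/* Stands in for rphy->dev in the normal SAS path (simplified: the
 * rphy hangs directly below the host). */
static struct device rphy_dev = { &shost_gendev, &other_type };

/*
 * Hypothetical model of the dev_to_shost() idea: climb the parent
 * chain until a device of host type is found. Returns NULL when the
 * chain ends without meeting a host.
 */
static struct device *find_host(struct device *dev)
{
	while (dev && dev->type != &host_type)
		dev = dev->parent;
	return dev;
}
```

With this toy topology, find_host(rphy_dev.parent) starts at the host
device and returns it, while find_host(shost_gendev.parent) starts
above the host, never meets a host-type device, and returns NULL. That
second case corresponds to the lookup in the trace, where the result is
used without a NULL check.
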
Signed-off-by: Ding Xiang
---
 drivers/ata/libata-eh.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 61dc7a9..4428a7c 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -816,7 +816,8 @@ void ata_scsi_port_error_handler(struct Scsi_Host *host, struct ata_port *ap)
 
 	if (ap->pflags & ATA_PFLAG_LOADING)
 		ap->pflags &= ~ATA_PFLAG_LOADING;
-	else if (ap->pflags & ATA_PFLAG_SCSI_HOTPLUG)
+	else if ((ap->pflags & ATA_PFLAG_SCSI_HOTPLUG) &&
+		!(ap->pflags & ATA_PFLAG_SAS_HOST))
 		schedule_delayed_work(&ap->hotplug_task, 0);
 
 	if (ap->pflags & ATA_PFLAG_RECOVERED)