From patchwork Wed Aug 19 16:01:35 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Stefan Fausser
X-Patchwork-Id: 7038451
Message-ID: <55D4A85F.1080304@real-time-systems.com>
Date: Wed, 19 Aug 2015 18:01:35 +0200
From: Stefan Fausser
To: intel-linux-scu@intel.com, artur.paszkiewicz@intel.com, JBottomley@odin.com, linux-scsi@vger.kernel.org, linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: isci, INTx mode, race condition
Sender: linux-scsi-owner@vger.kernel.org
X-Mailing-List: linux-scsi@vger.kernel.org

Dear all,

attached are two patches for the "isci" module (CONFIG_SCSI_ISCI). Both apply to the current Linux kernel as retrieved from Git (4.2.0-rc7). The first patch (init.patch) reproduces a problem with the "Intel(R) C600 SAS Controller" in INTx mode, described below. The second patch (host.patch) fixes this problem.

The problem:

With the first patch, "init.patch", applied, the "Intel(R) C600 SAS Controller" (abbreviated SAS below) generates level-triggered INTx interrupts instead of (edge-triggered) MSI-X interrupts. The ISR (isci_intx_isr) determines whether the interrupt is due to normal operation (a normal interrupt) or to an error. For a normal interrupt, a tasklet is scheduled to handle it.
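
The split between ISR and tasklet can be sketched as a small user-space model. This is only an illustration of the behaviour described in this message, not driver code: every name in it (model_dev, model_isr, model_tasklet, model_run) is hypothetical and does not exist in the isci driver. It assumes that a level-triggered line stays asserted while the cause is pending and unmasked, and that the tasklet only gets to run once the line is quiet. The mask_in_isr flag lets the model be run with and without masking in the ISR.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the isci driver. */
struct model_dev {
	bool status_pending;   /* device has a completion to report */
	bool irq_masked;       /* interrupt source is masked at the device */
	bool tasklet_pending;  /* bottom half has been queued by the ISR */
};

/* A level-triggered line stays asserted while the cause is pending and unmasked. */
static bool line_asserted(const struct model_dev *d)
{
	return d->status_pending && !d->irq_masked;
}

/* Top half: optionally mask the source, then queue the bottom half. */
static void model_isr(struct model_dev *d, bool mask_in_isr)
{
	if (mask_in_isr)
		d->irq_masked = true;
	d->tasklet_pending = true;
}

/* Bottom half: service the device, then unmask the source again. */
static void model_tasklet(struct model_dev *d)
{
	d->status_pending = false;
	d->tasklet_pending = false;
	d->irq_masked = false;
}

/*
 * Deliver one completion. The ISR is re-entered for as long as the line is
 * asserted (assumption of this model); the tasklet runs only afterwards.
 * Returns the number of ISR entries before the tasklet ran, or -1 if the
 * ISR was entered max_entries times without the line ever going quiet.
 */
static int model_run(bool mask_in_isr, int max_entries)
{
	struct model_dev d = { .status_pending = true };
	int entries = 0;

	while (line_asserted(&d)) {
		if (entries == max_entries)
			return -1;
		model_isr(&d, mask_in_isr);
		entries++;
	}
	if (d.tasklet_pending)
		model_tasklet(&d);
	return entries;
}
```

In this model, model_run(true, 1000) returns 1 (one ISR entry, then the tasklet runs), while model_run(false, 1000) returns -1 because the unmasked line never goes quiet.
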
However, the ISR leaves the interrupts unmasked, so the SAS device may trigger the next interrupt after the ISR has returned and before the scheduled tasklet has run. Thus, with "init.patch" applied on my system (Intel C600 chipset series), the SAS device repeatedly asserts the level-triggered interrupt and the tasklet that should handle it never gets to run. This results in a soft lockup on the executing core. In my investigations, the problem occurs in all Linux kernel versions from 3.5 up to the present.

The fix:

With the second patch, "host.patch", applied, the INTx ISR masks the interrupts in the case of a normal interrupt. Thus, the handling tasklet has time to run before the next interrupt arrives. In the tasklet (see sci_controller_completion_handler), the interrupts are unmasked again.

Please let me know if you need any other information.

Kind Regards,
Stefan

Acked-by: Artur Paszkiewicz
Signed-off-by: Stefan Fausser

--- linux/drivers/scsi/isci/init.c.orig	2015-08-19 15:23:36.000000000 +0200
+++ linux/drivers/scsi/isci/init.c	2015-08-19 15:47:47.000000000 +0200
@@ -345,6 +345,7 @@ static int isci_setup_interrupts(struct
 	struct isci_host *ihost;
 	struct isci_pci_info *pci_info = to_pci_info(pdev);
 
+	goto intx;
 	/*
 	 * Determine the number of vectors associated with this
 	 * PCI function.