From patchwork Tue Jul  3 07:57:06 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Martin Wilck
X-Patchwork-Id: 10503397
X-Patchwork-Delegate: christophe.varoqui@free.fr
From: Martin Wilck
To: Christophe Varoqui
Date: Tue, 3 Jul 2018 09:57:06 +0200
Message-Id: <20180703075707.834-2-mwilck@suse.com>
In-Reply-To: <20180703075707.834-1-mwilck@suse.com>
References: <20180703075707.834-1-mwilck@suse.com>
Cc: dm-devel@redhat.com, Martin Wilck
Subject: [dm-devel] [PATCH 2/3] libmultipath: alua: retry RTPG for NOT_READY and UNIT_ATTENTION
List-Id: device-mapper development

Use similar logic as the kernel for retrying ALUA commands to avoid
misinterpreting temporary failures as fatal errors.

Signed-off-by: Martin Wilck
Reviewed-by: Benjamin Marzinski
---
 libmultipath/prioritizers/alua_rtpg.c | 59 +++++++++++++++++++++++----
 1 file changed, 50 insertions(+), 9 deletions(-)

diff --git a/libmultipath/prioritizers/alua_rtpg.c b/libmultipath/prioritizers/alua_rtpg.c
index ce405b55..34b5f3ce 100644
--- a/libmultipath/prioritizers/alua_rtpg.c
+++ b/libmultipath/prioritizers/alua_rtpg.c
@@ -69,10 +69,20 @@ print_hex(unsigned char *p, unsigned long len)
 #define SCSI_COMMAND_TERMINATED         0x22
 #define SG_ERR_DRIVER_SENSE             0x08
 #define RECOVERED_ERROR                 0x01
+#define NOT_READY                       0x2
+#define UNIT_ATTENTION                  0x6
+
+enum scsi_disposition {
+	SCSI_GOOD = 0,
+	SCSI_ERROR,
+	SCSI_RETRY,
+};
 
 static int
-scsi_error(struct sg_io_hdr *hdr)
+scsi_error(struct sg_io_hdr *hdr, int opcode)
 {
+	int sense_key, asc, ascq;
+
 	/* Treat SG_ERR here to get rid of sg_err.[ch] */
 	hdr->status &= 0x7e;
 
@@ -81,29 +91,44 @@ scsi_error(struct sg_io_hdr *hdr)
 	    (hdr->host_status == 0)     &&
 	    (hdr->driver_status == 0)
 	) {
-		return 0;
+		return SCSI_GOOD;
 	}
 
+	sense_key = asc = ascq = -1;
 	if (
 	    (hdr->status == SCSI_CHECK_CONDITION)    ||
 	    (hdr->status == SCSI_COMMAND_TERMINATED) ||
 	    ((hdr->driver_status & 0xf) == SG_ERR_DRIVER_SENSE)
 	) {
 		if (hdr->sbp && (hdr->sb_len_wr > 2)) {
-			int		sense_key;
 			unsigned char *	sense_buffer = hdr->sbp;
 
-			if (sense_buffer[0] & 0x2)
+			if (sense_buffer[0] & 0x2) {
 				sense_key = sense_buffer[1] & 0xf;
-			else
+				if (hdr->sb_len_wr > 3)
+					asc = sense_buffer[2];
+				if (hdr->sb_len_wr > 4)
+					ascq = sense_buffer[3];
+			} else {
 				sense_key = sense_buffer[2] & 0xf;
+				if (hdr->sb_len_wr > 13)
+					asc = sense_buffer[12];
+				if (hdr->sb_len_wr > 14)
+					ascq = sense_buffer[13];
+			}
 
 			if (sense_key == RECOVERED_ERROR)
-				return 0;
+				return SCSI_GOOD;
 		}
 	}
 
-	return 1;
+	PRINT_DEBUG("alua: SCSI error for command %02x: status %02x, sense %02x/%02x/%02x",
+		    opcode, hdr->status, sense_key, asc, ascq);
+
+	if (sense_key == UNIT_ATTENTION || sense_key == NOT_READY)
+		return SCSI_RETRY;
+	else
+		return SCSI_ERROR;
 }
 
 /*
@@ -116,7 +141,9 @@ do_inquiry(int fd, int evpd, unsigned int codepage,
 	struct inquiry_command	cmd;
 	struct sg_io_hdr	hdr;
 	unsigned char		sense[SENSE_BUFF_LEN];
+	int rc, retry_count = 3;
 
+retry:
 	memset(&cmd, 0, sizeof(cmd));
 	cmd.op = OPERATION_CODE_INQUIRY;
 	if (evpd) {
@@ -142,9 +169,15 @@ do_inquiry(int fd, int evpd, unsigned int codepage,
 		return -RTPG_INQUIRY_FAILED;
 	}
 
-	if (scsi_error(&hdr)) {
+	rc = scsi_error(&hdr, OPERATION_CODE_INQUIRY);
+	if (rc == SCSI_ERROR) {
 		PRINT_DEBUG("do_inquiry: SCSI error!");
 		return -RTPG_INQUIRY_FAILED;
+	} else if (rc == SCSI_RETRY) {
+		if (--retry_count >= 0)
+			goto retry;
+		PRINT_DEBUG("do_inquiry: retries exhausted!");
+		return -RTPG_INQUIRY_FAILED;
 	}
 
 	PRINT_HEX((unsigned char *) resp, resplen);
@@ -265,7 +298,9 @@ do_rtpg(int fd, void* resp, long resplen, unsigned int timeout)
 	struct rtpg_command	cmd;
 	struct sg_io_hdr	hdr;
 	unsigned char		sense[SENSE_BUFF_LEN];
+	int retry_count = 3, rc;
 
+retry:
 	memset(&cmd, 0, sizeof(cmd));
 	cmd.op = OPERATION_CODE_RTPG;
 	rtpg_command_set_service_action(&cmd);
@@ -286,9 +321,15 @@ do_rtpg(int fd, void* resp, long resplen, unsigned int timeout)
 	if (ioctl(fd, SG_IO, &hdr) < 0)
 		return -RTPG_RTPG_FAILED;
 
-	if (scsi_error(&hdr)) {
+	rc = scsi_error(&hdr, OPERATION_CODE_RTPG);
+	if (rc == SCSI_ERROR) {
 		PRINT_DEBUG("do_rtpg: SCSI error!");
 		return -RTPG_RTPG_FAILED;
+	} else if (rc == SCSI_RETRY) {
+		if (--retry_count >= 0)
+			goto retry;
+		PRINT_DEBUG("do_rtpg: retries exhausted!");
+		return -RTPG_RTPG_FAILED;
 	}
 
 	PRINT_HEX(resp, resplen);
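
For readers who want to try the retry rule outside of multipath-tools, here is a minimal standalone sketch of the classification and the retry budget this patch introduces. It is not the patch's code: the names classify_sense(), run_with_retries(), struct fake_dev and fake_command() are hypothetical and exist only for illustration, and the sketch assumes the command has already come back with a CHECK CONDITION, so the status, host_status and driver_status checks done by scsi_error() are left out. It only mirrors two things from the diff: the sense key sits in byte 1 for descriptor-format sense data and in byte 2 for fixed-format data, and a retryable result is retried while --retry_count >= 0, which with an initial count of 3 means at most four attempts.

```c
#include <stddef.h>

/* Sense key values; the names mirror the macros used in the patch. */
#define SK_RECOVERED_ERROR 0x1
#define SK_NOT_READY       0x2
#define SK_UNIT_ATTENTION  0x6

enum disposition { DISP_GOOD = 0, DISP_ERROR, DISP_RETRY };

/*
 * Hypothetical helper: classify raw sense data the way the patched
 * scsi_error() does once a CHECK CONDITION has been seen.
 * Missing or too-short sense data is treated as a hard error.
 */
enum disposition classify_sense(const unsigned char *sense, size_t len)
{
	int key;

	if (sense == NULL || len <= 2)
		return DISP_ERROR;

	/* Bit 1 of byte 0 is set for descriptor format (0x72/0x73). */
	if (sense[0] & 0x2)
		key = sense[1] & 0xf;	/* descriptor format */
	else
		key = sense[2] & 0xf;	/* fixed format (0x70/0x71) */

	if (key == SK_RECOVERED_ERROR)
		return DISP_GOOD;
	if (key == SK_UNIT_ATTENTION || key == SK_NOT_READY)
		return DISP_RETRY;
	return DISP_ERROR;
}

/* Hypothetical callback type standing in for "issue one SG_IO command". */
typedef enum disposition (*command_fn)(void *ctx);

/*
 * Issue the command once, then retry while it reports DISP_RETRY and
 * the budget lasts. With retries == 3 this makes at most 4 attempts,
 * matching "if (--retry_count >= 0) goto retry;" in the patch.
 * Returns 0 on success, -1 on a hard error or exhausted retries.
 */
int run_with_retries(command_fn fn, void *ctx, int retries)
{
	enum disposition d;

	do {
		d = fn(ctx);
	} while (d == DISP_RETRY && --retries >= 0);

	return d == DISP_GOOD ? 0 : -1;
}

/* Test double (an assumption, not a real device): busy for N calls. */
struct fake_dev {
	int busy_for;	/* number of initial calls that ask for a retry */
	int calls;	/* number of calls seen so far */
};

enum disposition fake_command(void *ctx)
{
	struct fake_dev *dev = ctx;

	dev->calls++;
	return dev->calls <= dev->busy_for ? DISP_RETRY : DISP_GOOD;
}
```

A device that reports UNIT ATTENTION twice and then answers succeeds on the third attempt, while one that never becomes ready fails after the fourth. The patch keeps the loop inline with goto in do_inquiry() and do_rtpg() because both functions rebuild the command and the sg_io_hdr on each attempt; the callback form above is just a compact way to show the counting.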