From patchwork Mon Aug 10 03:55:51 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 6979291
X-Patchwork-Delegate: snitzer@redhat.com
Date: Mon, 10 Aug 2015 13:55:51 +1000
From: NeilBrown
To: Mikulas Patocka, dm-devel@redhat.com
Message-ID: <20150810135551.64d7dbac@noble>
Subject: [dm-devel] dm-snap deadlock in pending_complete()
Reply-To: device-mapper development
List-Id: device-mapper development
Sender: dm-devel-bounces@redhat.com

Hi Mikulas,
 I have a customer hitting the deadlock you described over a year ago in:

Subject: [dm-devel] [PATCH] block: flush queued bios when the process blocks

I notice that patch never went upstream. Has anything else been done to
fix this deadlock?

My thought was that something like the below would be sufficient. Do you
see any problem with that? It avoids the deadlock by dropping the lock
while sleeping.
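For readers unfamiliar with the pattern, here is a rough userspace sketch of "drop the lock while sleeping, then retry". It is not the kernel code: `struct snap`, `tracked_reads`, `finish_read()`, `complete_exception()` and `demo()` are hypothetical names, a pthread rwlock stands in for `s->lock`, and it assumes for illustration that whoever clears the tracked read must itself take the lock at some point. That is a simplification of the real deadlock, which involves queued bios.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the snapshot: a lock plus in-flight read count. */
struct snap {
    pthread_rwlock_t lock;
    int tracked_reads;          /* protected by lock in this sketch */
    bool valid;
};

static void sleep_ms(long ms)
{
    struct timespec ts = { .tv_sec = ms / 1000,
                           .tv_nsec = (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Reader completion: it needs the lock before it can drop its tracking. */
static void *finish_read(void *arg)
{
    struct snap *s = arg;

    sleep_ms(5);
    pthread_rwlock_wrlock(&s->lock);
    s->tracked_reads--;
    pthread_rwlock_unlock(&s->lock);
    return NULL;
}

/*
 * Writer side. Sleeping with the lock held would block finish_read()
 * forever, so the lock is released before every sleep and the state is
 * rechecked from scratch after it is retaken. Returns the retry count.
 */
static int complete_exception(struct snap *s)
{
    int retries = 0;

    for (;;) {
        pthread_rwlock_wrlock(&s->lock);
        if (!s->valid || s->tracked_reads == 0)
            break;              /* safe to proceed, lock still held */
        pthread_rwlock_unlock(&s->lock);
        sleep_ms(1);
        retries++;
    }
    /* ... the real work would happen here, under the lock ... */
    pthread_rwlock_unlock(&s->lock);
    return retries;
}

/* Runs one reader against one writer; returns the remaining read count. */
int demo(void)
{
    struct snap s = { .tracked_reads = 1, .valid = true };
    pthread_t reader;
    int rc;

    rc = pthread_rwlock_init(&s.lock, NULL);
    assert(rc == 0);
    rc = pthread_create(&reader, NULL, finish_read, &s);
    assert(rc == 0);
    (void)rc;

    complete_exception(&s);
    pthread_join(reader, NULL);
    pthread_rwlock_destroy(&s.lock);
    return s.tracked_reads;
}
```

Calling `demo()` from a `main()` should return once the reader has cleared its entry. The cost of this approach is that every check made before the sleep (such as `valid`) has to be repeated after the lock is retaken, which is why the loop restarts from the top.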

Thanks
NeilBrown

---
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
index 7c82d3ccce87..d29bcd02f9cf 100644
--- a/drivers/md/dm-snap.c
+++ b/drivers/md/dm-snap.c
@@ -1454,6 +1454,7 @@ static void pending_complete(struct dm_snap_pending_exception *pe, int success)
 	}
 	*e = pe->e;
 
+retry:
 	down_write(&s->lock);
 	if (!s->valid) {
 		free_completed_exception(e);
@@ -1462,7 +1463,11 @@ static void pending_complete(struct dm_snap_pending_exception *pe, int success)
 	}
 
 	/* Check for conflicting reads */
-	__check_for_conflicting_io(s, pe->e.old_chunk);
+	if (__chunk_is_tracked(s, pe->e.old_chunk)) {
+		up_write(&s->lock);
+		msleep(1);
+		goto retry;
+	}
 
 	/*
 	 * Add a proper exception, and remove the