From patchwork Fri Aug 11 16:48:54 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Wolf X-Patchwork-Id: 9896321 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id DBE46602DA for ; Fri, 11 Aug 2017 16:49:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CC7A228C45 for ; Fri, 11 Aug 2017 16:49:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BFC8528C55; Fri, 11 Aug 2017 16:49:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 47EE628C45 for ; Fri, 11 Aug 2017 16:49:50 +0000 (UTC) Received: from localhost ([::1]:56220 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dgD89-0005ad-CD for patchwork-qemu-devel@patchwork.kernel.org; Fri, 11 Aug 2017 12:49:49 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:50527) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dgD7T-0005Z9-0t for qemu-devel@nongnu.org; Fri, 11 Aug 2017 12:49:08 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dgD7R-0005wj-Sr for qemu-devel@nongnu.org; Fri, 11 Aug 2017 12:49:07 -0400 Received: from mx1.redhat.com ([209.132.183.28]:38469) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1dgD7M-0005vV-0P; Fri, 11 Aug 2017 12:49:00 -0400 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E4B1BC014196; Fri, 11 Aug 2017 16:48:58 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com E4B1BC014196 Authentication-Results: ext-mx07.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx07.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=kwolf@redhat.com Received: from localhost.localdomain (ovpn-116-116.ams2.redhat.com [10.36.116.116]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 22DA7A64B2; Fri, 11 Aug 2017 16:48:55 +0000 (UTC) Date: Fri, 11 Aug 2017 18:48:54 +0200 From: Kevin Wolf To: Christian Ehrhardt Message-ID: <20170811164854.GG4162@localhost.localdomain> References: <20170811120435.GB23309@lemon.lan> <20170811123716.GD4162@localhost.localdomain> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.8.3 (2017-05-23) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Fri, 11 Aug 2017 16:48:59 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.132.183.28 Subject: Re: [Qemu-devel] migration issue with qemu 2.10-rc2: QEMU command 'nbd-server-add': Block node is read-only X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Fam Zheng , qemu-devel@nongnu.org, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP Am 11.08.2017 um 17:34 hat Christian Ehrhardt geschrieben: > On Fri, Aug 11, 2017 at 2:37 PM, Kevin Wolf wrote: > > > Am 11.08.2017 um 14:04 hat Fam Zheng geschrieben: > > > On Fri, 08/11 13:07, Christian Ehrhardt wrote: > > > > Simplifying that to a smaller test: > > > > > > > [...] > > > > > Block node is read-only > > > [...] > > > > > > > This is actually caused by > > > > > > commit 9c5e6594f15b7364624a3ad40306c396c93a2145 > > > Author: Kevin Wolf > > > Date: Thu May 4 18:52:40 2017 +0200 > > > > > > block: Fix write/resize permissions for inactive images > > > > > > > Yeah in the meantime an automated bisect run is through an agrees. > Thanks for pointing out the right change triggering that so early! > > Thanks Kevin for all the suggestions already, I quickly looked at the code > but I'm lost there without taking much more time. > Is anybody experienced in that area looking at fixing that? > Because IMHO that is kind of a block bug for 2.10 by breaking formerly > working behavior (even as Kevin identified it seems to have done it wrong > all the time). The patch below implements what I was thinking of, and it seems to fix this problem. However, you'll immediately run into the next one, which is a message like 'Conflicts with use by ide0-hd0 as 'root', which does not allow 'write' on #block172'. The reason for this is that since commit 4417ab7adf1, bdrv_invalidate_cache() not only activates the image, but also is the signal for guest device BlockBackends that migration has completed and they should now request exclusive access to the image. As the NBD server shows, this assumption is wrong. Images can be activated even before migration completes. Maybe this really needs to be done in a VM state change handler. We can't simply revert commit 4417ab7adf1 because for image file locking, it is important that we drop locks for inactive images, so BdrvChildRole.activate/inactivate will probably stay. Maybe only one bool blk->disable_perm isn't enough, we might need a different one for devices before migration completed, or at least a counter. I'll be on vacation starting tomorrow, so someone else needs to tackle this. Fam, I think you are relatively familiar with the op blockers? Kevin diff --git a/nbd/server.c b/nbd/server.c index 82a78bf439..b68b6fb535 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -1045,11 +1045,22 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, off_t size, bool writethrough, BlockBackend *on_eject_blk, Error **errp) { + AioContext *ctx; BlockBackend *blk; NBDExport *exp = g_malloc0(sizeof(NBDExport)); uint64_t perm; int ret; + /* + * NBD exports are used for non-shared storage migration. Make sure + * that BDRV_O_INACTIVE is cleared and the image is ready for write + * access since the export could be available before migration handover. + */ + ctx = bdrv_get_aio_context(bs); + aio_context_acquire(ctx); + bdrv_invalidate_cache(bs, NULL); + aio_context_release(ctx); + /* Don't allow resize while the NBD server is running, otherwise we don't * care what happens with the node. */ perm = BLK_PERM_CONSISTENT_READ; @@ -1087,15 +1098,6 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, off_t size, exp->eject_notifier.notify = nbd_eject_notifier; blk_add_remove_bs_notifier(on_eject_blk, &exp->eject_notifier); } - - /* - * NBD exports are used for non-shared storage migration. Make sure - * that BDRV_O_INACTIVE is cleared and the image is ready for write - * access since the export could be available before migration handover. - */ - aio_context_acquire(exp->ctx); - blk_invalidate_cache(blk, NULL); - aio_context_release(exp->ctx); return exp; fail: