Message ID | 20190729105343.19250-1-kwolf@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [for-4.1] block/copy-on-read: Fix permissions for inactive node | expand |
On 7/29/19 5:53 AM, Kevin Wolf wrote: > The copy-on-read drive must not request the WRITE_UNCHANGED permission > for its child if the node is inactive, otherwise starting a migration > destination with -incoming will fail because the child cannot provide > write access yet: > > qemu-system-x86_64: -blockdev copy-on-read,file=img,node-name=cor: Block node is read-only > > Earlier QEMU versions additionally ran into an abort() on the migration > source side: bdrv_inactivate_recurse() failed to update permissions. > This is silently ignored today because it was only supposed to loosen > restrictions. This is the symptom that was originally reported here: > > https://bugzilla.redhat.com/show_bug.cgi?id=1733022 > > Signed-off-by: Kevin Wolf <kwolf@redhat.com> > --- > block/copy-on-read.c | 16 +++++++--------- > 1 file changed, 7 insertions(+), 9 deletions(-) Do any of the iotests cover this? Should they, especially if you are trying to get this in for -rc3 tomorrow? > > diff --git a/block/copy-on-read.c b/block/copy-on-read.c > index 22f24fd0db..6631f30205 100644 > --- a/block/copy-on-read.c > +++ b/block/copy-on-read.c > @@ -56,16 +56,14 @@ static void cor_child_perm(BlockDriverState *bs, BdrvChild *c, > uint64_t perm, uint64_t shared, > uint64_t *nperm, uint64_t *nshared) > { > - if (c == NULL) { > - *nperm = (perm & PERM_PASSTHROUGH) | BLK_PERM_WRITE_UNCHANGED; > - *nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED; > - return; > - } > + *nperm = perm & PERM_PASSTHROUGH; > + *nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED; > > - *nperm = (perm & PERM_PASSTHROUGH) | > - (c->perm & PERM_UNCHANGED); > - *nshared = (shared & PERM_PASSTHROUGH) | > - (c->shared_perm & PERM_UNCHANGED); The old code unconditionally returned one set of permissions when c == NULL, or made a choice based on c's existing permissions on whether to pass in those two bits. > + /* We must not request write permissions for an inactive node, the child > + * cannot provide it. */ > + if (!(bs->open_flags & BDRV_O_INACTIVE)) { > + *nperm |= BLK_PERM_WRITE_UNCHANGED; > + } The new code changes the condition for or'ing in WRITE_UNCHANGED to *nperm (it is no longer dependent on whether c == NULL, but whether the drive is inactive), which matches your commit message. But the new code also changes to always pass in the PERM_UNCHANGED to *nshared; that used to be skipped if c was non-NULL and did not already have the permission. I don't follow that change from the commit message, am I missing something?
Am 29.07.2019 um 15:35 hat Eric Blake geschrieben: > On 7/29/19 5:53 AM, Kevin Wolf wrote: > > The copy-on-read drive must not request the WRITE_UNCHANGED permission > > for its child if the node is inactive, otherwise starting a migration > > destination with -incoming will fail because the child cannot provide > > write access yet: > > > > qemu-system-x86_64: -blockdev copy-on-read,file=img,node-name=cor: Block node is read-only > > > > Earlier QEMU versions additionally ran into an abort() on the migration > > source side: bdrv_inactivate_recurse() failed to update permissions. > > This is silently ignored today because it was only supposed to loosen > > restrictions. This is the symptom that was originally reported here: > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1733022 > > > > Signed-off-by: Kevin Wolf <kwolf@redhat.com> > > --- > > block/copy-on-read.c | 16 +++++++--------- > > 1 file changed, 7 insertions(+), 9 deletions(-) > > Do any of the iotests cover this? Should they, especially if you are > trying to get this in for -rc3 tomorrow? No, we don't have any iotests for migration with filter drivers yet. We probably should, but I didn't want to miss -rc3 with the fix because I was busy writing a test case. > > > > diff --git a/block/copy-on-read.c b/block/copy-on-read.c > > index 22f24fd0db..6631f30205 100644 > > --- a/block/copy-on-read.c > > +++ b/block/copy-on-read.c > > @@ -56,16 +56,14 @@ static void cor_child_perm(BlockDriverState *bs, BdrvChild *c, > > uint64_t perm, uint64_t shared, > > uint64_t *nperm, uint64_t *nshared) > > { > > - if (c == NULL) { > > - *nperm = (perm & PERM_PASSTHROUGH) | BLK_PERM_WRITE_UNCHANGED; > > - *nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED; > > - return; > > - } > > + *nperm = perm & PERM_PASSTHROUGH; > > + *nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED; > > > > - *nperm = (perm & PERM_PASSTHROUGH) | > > - (c->perm & PERM_UNCHANGED); > > - *nshared = (shared & PERM_PASSTHROUGH) | > > - (c->shared_perm & PERM_UNCHANGED); > > The old code unconditionally returned one set of permissions when c == > NULL, or made a choice based on c's existing permissions on whether to > pass in those two bits. > > > + /* We must not request write permissions for an inactive node, the child > > + * cannot provide it. */ > > + if (!(bs->open_flags & BDRV_O_INACTIVE)) { > > + *nperm |= BLK_PERM_WRITE_UNCHANGED; > > + } > > The new code changes the condition for or'ing in WRITE_UNCHANGED to > *nperm (it is no longer dependent on whether c == NULL, but whether the > drive is inactive), which matches your commit message. > > But the new code also changes to always pass in the PERM_UNCHANGED to > *nshared; that used to be skipped if c was non-NULL and did not already > have the permission. I don't follow that change from the commit > message, am I missing something? The old code didn't actually do anything that should have a different result (apart from WRITE_UNCHANGED for inactive images), just everything in a more complicated way for no apparent reason. Or at least that's what Max and I concluded after looking at this. Taking the PERM_UNCHANGED bits from the old value effectively means that they are taken from the very first call, which had c == NULL. So we can just use the same code to set them instead of referring to the old values of c->perm and c->shared_perm (which is really something a .bdrv_child_perm implementation shouldn't do - there are more cases, but we can clean them up for 4.2). Not cleaning this up would mean that I'd have to explicitly clear the WRITE_UNCHANGED bit after uselessly copying from the old state. This would be further complication of already unnecessarily complicated code, so I decided that cleaning it up so that its correctness becomes very obvious (request everything the parent nodes need, plus WRITE_UNCHANGED for the copy on read functionality if the node is active) makes more sense. Kevin
On 29.07.19 12:53, Kevin Wolf wrote: > The copy-on-read drive must not request the WRITE_UNCHANGED permission > for its child if the node is inactive, otherwise starting a migration > destination with -incoming will fail because the child cannot provide > write access yet: > > qemu-system-x86_64: -blockdev copy-on-read,file=img,node-name=cor: Block node is read-only > > Earlier QEMU versions additionally ran into an abort() on the migration > source side: bdrv_inactivate_recurse() failed to update permissions. > This is silently ignored today because it was only supposed to loosen > restrictions. This is the symptom that was originally reported here: > > https://bugzilla.redhat.com/show_bug.cgi?id=1733022 > > Signed-off-by: Kevin Wolf <kwolf@redhat.com> > --- > block/copy-on-read.c | 16 +++++++--------- > 1 file changed, 7 insertions(+), 9 deletions(-) Reviewed-by: Max Reitz <mreitz@redhat.com>
diff --git a/block/copy-on-read.c b/block/copy-on-read.c index 22f24fd0db..6631f30205 100644 --- a/block/copy-on-read.c +++ b/block/copy-on-read.c @@ -56,16 +56,14 @@ static void cor_child_perm(BlockDriverState *bs, BdrvChild *c, uint64_t perm, uint64_t shared, uint64_t *nperm, uint64_t *nshared) { - if (c == NULL) { - *nperm = (perm & PERM_PASSTHROUGH) | BLK_PERM_WRITE_UNCHANGED; - *nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED; - return; - } + *nperm = perm & PERM_PASSTHROUGH; + *nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED; - *nperm = (perm & PERM_PASSTHROUGH) | - (c->perm & PERM_UNCHANGED); - *nshared = (shared & PERM_PASSTHROUGH) | - (c->shared_perm & PERM_UNCHANGED); + /* We must not request write permissions for an inactive node, the child + * cannot provide it. */ + if (!(bs->open_flags & BDRV_O_INACTIVE)) { + *nperm |= BLK_PERM_WRITE_UNCHANGED; + } }
The copy-on-read drive must not request the WRITE_UNCHANGED permission for its child if the node is inactive, otherwise starting a migration destination with -incoming will fail because the child cannot provide write access yet: qemu-system-x86_64: -blockdev copy-on-read,file=img,node-name=cor: Block node is read-only Earlier QEMU versions additionally ran into an abort() on the migration source side: bdrv_inactivate_recurse() failed to update permissions. This is silently ignored today because it was only supposed to loosen restrictions. This is the symptom that was originally reported here: https://bugzilla.redhat.com/show_bug.cgi?id=1733022 Signed-off-by: Kevin Wolf <kwolf@redhat.com> --- block/copy-on-read.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-)