diff mbox series

[1/2] iotests: 030 TestParallelOps non-shared base node

Message ID 1550762799-830661-2-git-send-email-andrey.shinkevich@virtuozzo.com (mailing list archive)
State New, archived
Headers show
Series block-stream: include base into job node list | expand

Commit Message

Andrey Shinkevich Feb. 21, 2019, 3:26 p.m. UTC
The test case TestParallelOps::test_stream_parallel in #030 fails
if the base node is protected by the running block-stream job that
includes the base node in the job node list (block_job_add_bdrv)
without the BLK_PERM_GRAPH_MOD shared permission.
The block-stream job would own the base node, not allowing it to go
away due to a graph modification, e.g. when a filter node is
inserted above the top node of the parallel job. That is, the base
node was shared between the two parallel jobs.
This patch avoids sharing the node by inserting additional nodes
in between.
Furthermore, the block-stream job needs to own the base node as the
limit of the copy-on-read operation.
The insertion of additional nodes changes the node numbers in two
other cases:
test_overlapping_1
test_stream_base_node_name
because the nodes with written data get different numbers.

Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
---
 tests/qemu-iotests/030 | 34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)

Comments

Alberto Garcia March 18, 2019, 3:05 p.m. UTC | #1
On Thu 21 Feb 2019 04:26:38 PM CET, Andrey Shinkevich wrote:
> The test case TestParallelOps::test_stream_parallel in #030 fails
> if a base node is protected by the block-stream running job that
> includes the base node into the job node list (block_job_add_bdrv)
> without BLK_PERM_GRAPH_MOD shared permission.
> The block-stream job would own the base node not allowing it to go
> away due to the graph modification, e.g. when a filter node is
> inserted above a top node of the parallel job. That's, the base
> node was shared between the two parallel jobs.
> This patch excludes sharing the node by inserting additional nodes
> inbetween.

So what we have now is:

   A <- B <- C <- D <- E <- F <- G <- H <- I

and we can launch four parallel block-stream jobs:

   From C (base) to A
   From E (base) to C
   From G (base) to E
   From I (base) to G

These jobs share nodes (the base node of each one is the active node of
the next), but that's ok. From the base node we only require that the
guest-visible data does not change.
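The sharing pattern described here can be sketched with a toy model (plain Python with hypothetical helper names, not QEMU code): each job owns the nodes from its top down to just above its base, so the owned sets are pairwise disjoint even though each base is reused as the next job's top.

```python
# Chain written top to base, as in the thread: A's backing is B, etc.
chain = list("ABCDEFGHI")

def stream_nodes(top, base):
    """Nodes a stream job owns: top down to just above base.
    The base itself is only read, never modified."""
    return set(chain[chain.index(top):chain.index(base)])

jobs = [("A", "C"), ("C", "E"), ("E", "G"), ("G", "I")]
owned = [stream_nodes(top, base) for top, base in jobs]

# Each job's base is reused only as the *top* of the next job, so the
# owned sets are pairwise disjoint and the jobs can run in parallel.
assert all(not (owned[i] & owned[j])
           for i in range(len(owned)) for j in range(i + 1, len(owned)))
```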

If I understand it correctly your patch blocks the base node for the
stream job, so it's not possible to share it anymore like we are doing
now and you need to insert additional nodes.

This seems unnecessary. I wrote this test to check precisely that very
scenario. It works and I don't see any reason why it shouldn't. As I
said in the e-mail that I sent some minutes ago, if what we want is to
prevent the base node from disappearing it seems that we have better
tools for that now.

Berto
Vladimir Sementsov-Ogievskiy March 18, 2019, 3:25 p.m. UTC | #2
18.03.2019 18:05, Alberto Garcia wrote:
> On Thu 21 Feb 2019 04:26:38 PM CET, Andrey Shinkevich wrote:
>> The test case TestParallelOps::test_stream_parallel in #030 fails
>> if a base node is protected by the block-stream running job that
>> includes the base node into the job node list (block_job_add_bdrv)
>> without BLK_PERM_GRAPH_MOD shared permission.
>> The block-stream job would own the base node not allowing it to go
>> away due to the graph modification, e.g. when a filter node is
>> inserted above a top node of the parallel job. That's, the base
>> node was shared between the two parallel jobs.
>> This patch excludes sharing the node by inserting additional nodes
>> inbetween.
> 
> So what we have now is:
> 
>     A <- B <- C <- D <- E <- F <- G <- H <- I
> 
> and we can launch four parallel block-stream jobs:
> 
>     From C (base) to A
>     From E (base) to C
>     From G (base) to E
>     From I (base) to G
> 
> These jobs share nodes (the base node of each one is the active node of
> the next), but that's ok. From the base node we only require that the
> guest-visible data does not change.

But with filters you can't.

Assume that "From E (base) to C" is already running. That means there
is a filter inserted between B and C. Then you want to start
"From C (base) to A". But that is illegal, because this range includes
the filter, which belongs to "From E (base) to C".

And even if you instead somehow select this filter to be the base for
"From C (base) to A", so that the jobs do not intersect, it is still
wrong: if "From E (base) to C" finishes earlier, it will want to remove
its filter, which will fail, as the backing link from B to it is frozen.
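This intersection can be made concrete with a small sketch (hypothetical names, not QEMU code; `F` stands for the inserted copy-on-read filter):

```python
# Chain written top to base, as in the thread: A <- B <- C <- D <- E
chain = ["A", "B", "C", "D", "E"]

def insert_filter(chain, top):
    """Insert filter node F directly above `top`, which is what
    starting "From E (base) to C" does for its top node C."""
    i = chain.index(top)
    return chain[:i] + ["F"] + chain[i:]

def job_span(chain, top, base):
    """Nodes between a job's top (inclusive) and base (exclusive)."""
    return chain[chain.index(top):chain.index(base)]

chain = insert_filter(chain, "C")          # A, B, F, C, D, E
# The later "From C (base) to A" now spans A, B *and* F: it would
# include the other job's filter, so starting it must be rejected.
assert job_span(chain, "A", "C") == ["A", "B", "F"]
```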

> 
> If I understand it correctly your patch blocks the base node for the
> stream job, so it's not possible to share it anymore like we are doing
> now and you need to insert additional nodes.
> 
> This seems unnecessary. I wrote this test to check precisely that very
> scenario. It works and I don't see any reason why it shouldn't. As I
> said in the e-mail that I sent some minutes ago, if what we want is to
> prevent the base node from disappearing it seems that we have better
> tools for that now.
> 
> Berto
>
Andrey Shinkevich March 19, 2019, 7:10 a.m. UTC | #3
On 18/03/2019 18:05, Alberto Garcia wrote:
> On Thu 21 Feb 2019 04:26:38 PM CET, Andrey Shinkevich wrote:
>> The test case TestParallelOps::test_stream_parallel in #030 fails
>> if a base node is protected by the block-stream running job that
>> includes the base node into the job node list (block_job_add_bdrv)
>> without BLK_PERM_GRAPH_MOD shared permission.
>> The block-stream job would own the base node not allowing it to go
>> away due to the graph modification, e.g. when a filter node is
>> inserted above a top node of the parallel job. That's, the base
>> node was shared between the two parallel jobs.
>> This patch excludes sharing the node by inserting additional nodes
>> inbetween.
> 
> So what we have now is:
> 
>     A <- B <- C <- D <- E <- F <- G <- H <- I
> 
> and we can launch four parallel block-stream jobs:
> 
>     From C (base) to A
>     From E (base) to C
>     From G (base) to E
>     From I (base) to G
> 
> These jobs share nodes (the base node of each one is the active node of
> the next), but that's ok. From the base node we only require that the
> guest-visible data does not change.
> 
> If I understand it correctly your patch blocks the base node for the
> stream job, so it's not possible to share it anymore like we are doing
> now and you need to insert additional nodes.
> 
> This seems unnecessary. I wrote this test to check precisely that very
> scenario. It works and I don't see any reason why it shouldn't. As I
> said in the e-mail that I sent some minutes ago, if what we want is to
> prevent the base node from disappearing it seems that we have better
> tools for that now.
> 
> Berto
> 
Thank you, Berto, for your response.
I am about to complete a new version of the series that demonstrates
the issue with the given test when a filter node is inserted for every
parallel block-stream job. The series includes the latest update with
the frozen backing chain.
Alberto Garcia March 19, 2019, 4:19 p.m. UTC | #4
On Mon 18 Mar 2019 04:25:10 PM CET, Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> wrote:
>> So what we have now is:
>> 
>>     A <- B <- C <- D <- E <- F <- G <- H <- I
>> 
>> and we can launch four parallel block-stream jobs:
>> 
>>     From C (base) to A
>>     From E (base) to C
>>     From G (base) to E
>>     From I (base) to G
>> 
>> These jobs share nodes (the base node of each one is the active node of
>> the next), but that's ok. From the base node we only require that the
>> guest-visible data does not change.
>
> But with filters you cant.
>
> assume that "From E (base) to C" is already running. It means, that
> there is a filter inserted between B and C. Then you want to start
> "From C (base) to A". But it's illegal, because this range includes
> filter, which belong to "From E (base) to C".

Oh, I see. Let's use a shorter chain for simplicity:

   A <- B <- C <- D <- E

1) If we stream first from E to C we add a filter F:

   A <- B <- F <- C <- D <- E

   Now we can't stream from C to A because F is on the way, and the F-C
   link is frozen.

2) If we stream first from C to A the filter goes on top of A:

   F <- A <- B <- C <- D <- E

   But now we can't stream from E to C, because that would insert a
   filter between B and C and that link is frozen.

I think it's unfortunate that implementing block-stream with a filter
restricts this kind of scenario, but I don't see a simple solution, and
I don't think this use case justifies making the code more complicated.
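Both orderings can be reproduced with a minimal frozen-link sketch (plain Python, not the QEMU implementation; `freeze_chain` imitates what block-stream conceptually does when it starts):

```python
frozen = set()   # frozen (parent, child) backing links

def freeze_chain(chain, top, base):
    """Freeze every backing link from top down to base; fail if any
    link on the way is already frozen."""
    i, j = chain.index(top), chain.index(base)
    links = list(zip(chain[i:j], chain[i + 1:j + 1]))
    if any(l in frozen for l in links):
        return False
    frozen.update(links)
    return True

# Ordering 1: stream E->C runs first; its filter F lands above C.
chain = ["A", "B", "F", "C", "D", "E"]
assert freeze_chain(chain, "C", "E")   # first job freezes C..E
assert freeze_chain(chain, "F", "C")   # ...and its filter link F-C
# "From C (base) to A" must freeze A..C, which crosses the already
# frozen F-C link, so it cannot start (ordering 2 fails symmetrically):
assert not freeze_chain(chain, "A", "C")
```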

Berto
Kevin Wolf March 20, 2019, 9:16 a.m. UTC | #5
Am 19.03.2019 um 17:19 hat Alberto Garcia geschrieben:
> On Mon 18 Mar 2019 04:25:10 PM CET, Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> wrote:
> >> So what we have now is:
> >> 
> >>     A <- B <- C <- D <- E <- F <- G <- H <- I
> >> 
> >> and we can launch four parallel block-stream jobs:
> >> 
> >>     From C (base) to A
> >>     From E (base) to C
> >>     From G (base) to E
> >>     From I (base) to G
> >> 
> >> These jobs share nodes (the base node of each one is the active node of
> >> the next), but that's ok. From the base node we only require that the
> >> guest-visible data does not change.
> >
> > But with filters you cant.
> >
> > assume that "From E (base) to C" is already running. It means, that
> > there is a filter inserted between B and C. Then you want to start
> > "From C (base) to A". But it's illegal, because this range includes
> > filter, which belong to "From E (base) to C".
> 
> Oh, I see. Let's use a shorter chain for simplicity:
> 
>    A <- B <- C <- D <- E

Written from right to left, i.e. E being the base and A the top layer?
We usually write things the other way round, I hope this doesn't get
too confusing later.

> 1) If we stream first from E to C we add a filter F:
> 
>    A <- B <- F <- C <- D <- E
> 
>    Now we can't stream from C to A because F is on the way, and the F-C
>    link is frozen.

Why is a frozen link a problem? The streaming operation isn't going to
change this link, it just copies data from the subchain (including F and
C) to A. This is not something that a frozen link should prevent.

Now if we assume that the C to A job completes first, we still wouldn't
change the link F -> C, it would still point to C. But what would happen
is indeed wrong:

         F
         |
    A ---+-> C -> D -> E

I don't see why frozen backing file links should prevent this, but it's
still wrong.

You would get the correct result if your second job is from F to A. If
the second job finishes first, this works trivially. However, if the
first job finished first and wants to remove F and therefore change the
parent links from F to C, the frozen link B -> F of the second job
would be a problem.

So it seems frozen links allow the wrong case, but block the correct
one? :-(

> 2) If we stream first from C to A the filter goes on top of A:
> 
>    F <- A <- B <- C <- D <- E
> 
>    But now we can't stream from E to C, because that would insert a
>    filter between B and C and that link is frozen.
> 
> I think it's unfortunate that implemeting block-stream with a filter
> restricts this kind of scenarios, but I don't see a simple solution, and
> I don't think this use case justifies making the code more complicated.

Maybe at least this kind of freezing isn't the right tool for block
jobs, after all.

Kevin
Vladimir Sementsov-Ogievskiy March 20, 2019, 9:35 a.m. UTC | #6
20.03.2019 12:16, Kevin Wolf wrote:
> Am 19.03.2019 um 17:19 hat Alberto Garcia geschrieben:
>> On Mon 18 Mar 2019 04:25:10 PM CET, Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> wrote:
>>>> So what we have now is:
>>>>
>>>>      A <- B <- C <- D <- E <- F <- G <- H <- I
>>>>
>>>> and we can launch four parallel block-stream jobs:
>>>>
>>>>      From C (base) to A
>>>>      From E (base) to C
>>>>      From G (base) to E
>>>>      From I (base) to G
>>>>
>>>> These jobs share nodes (the base node of each one is the active node of
>>>> the next), but that's ok. From the base node we only require that the
>>>> guest-visible data does not change.
>>>
>>> But with filters you cant.
>>>
>>> assume that "From E (base) to C" is already running. It means, that
>>> there is a filter inserted between B and C. Then you want to start
>>> "From C (base) to A". But it's illegal, because this range includes
>>> filter, which belong to "From E (base) to C".
>>
>> Oh, I see. Let's use a shorter chain for simplicity:
>>
>>     A <- B <- C <- D <- E
> 
> Written from right to left, i.e. E being the base and A the top layer?
> We usually write things the other write round, I hope this doesn't get
> too confusing later.
> 
>> 1) If we stream first from E to C we add a filter F:
>>
>>     A <- B <- F <- C <- D <- E
>>
>>     Now we can't stream from C to A because F is on the way, and the F-C
>>     link is frozen.
> 
> Why is a frozen link a problem? The streaming operation isn't going to
> change this link, it just copies data from the subchain (including F and
> C) to A. This is not something that a frozen link should prevent.
> 
> Now if we assume that the C to A job completes first, we still wouldn't
> change the link F -> C, it would still point to C. But what would happen
> is indeed wrong:
> 
>           F
>           |
>      A ---+-> C -> D -> E
> 
> I don't see why frozen backing file links should prevent this, but it's
> still wrong.

I think it should be enough not to share WRITE_PERM on the F-C link to
restrict this.

> 
> You would get the correct result if your second job is from F to A. If
> the second job finishes first, this works trivially. However, if the
> first job finished first and wants to remove F and therefore change the
> parent links from F to C, the frozen link B -> F of the second job
> would be a problem.
> 
> So it seems frozen links allow the wrong case, but block the correct
> one? :-(
> 
>> 2) If we stream first from C to A the filter goes on top of A:
>>
>>     F <- A <- B <- C <- D <- E
>>
>>     But now we can't stream from E to C, because that would insert a
>>     filter between B and C and that link is frozen.
>>
>> I think it's unfortunate that implemeting block-stream with a filter
>> restricts this kind of scenarios, but I don't see a simple solution, and
>> I don't think this use case justifies making the code more complicated.
> 
> Maybe at least this kind of freezing isn't the right tool for block
> jobs, after all.
> 

I think the correct way to support these valid cases (which, however,
don't look like real use cases) is to remove base from the stream job's
view. Stream should operate on the bottom node instead, and changing the
backing link of this bottom node to some new base (a filter, for
example) would then be transparent to stream.

But for this we'd have to change the interface (as it uses base), or at
least resolve base to the bottom node as the first action of the job,
before any yield.

Also, the block_status operation would need to support the bottom node
instead of base.

And all of this looks much more complicated than what we already have
with the frozen backing chain.

Do we really need such cases with parallel streams, sharing the same
node as base for one job and top for another? Will someone be upset if
we drop support for such scenarios?
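The bottom-node idea can be sketched as follows (hypothetical helper names, not QEMU code): base is resolved to the bottom intermediate node once at job start, so swapping the backing of that bottom node later does not disturb the job's node set.

```python
def resolve_bottom(chain, top, base):
    """First job action: replace 'base' with the bottom intermediate
    node, i.e. the node directly above base in the chain."""
    j = chain.index(base)
    assert chain.index(top) < j
    return chain[j - 1]

chain = ["A", "B", "C", "D", "E"]
bottom = resolve_bottom(chain, "C", "E")   # job C..E records "D"

# If another operation now inserts a filter F below D (i.e. a new
# effective base), the stream job's node set C..D is unchanged:
chain = ["A", "B", "C", "D", "F", "E"]
assert chain[chain.index("C"):chain.index(bottom) + 1] == ["C", "D"]
```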
Alberto Garcia March 20, 2019, 5:02 p.m. UTC | #7
On Wed 20 Mar 2019 10:16:10 AM CET, Kevin Wolf wrote:
>> Oh, I see. Let's use a shorter chain for simplicity:
>> 
>>    A <- B <- C <- D <- E
>
> Written from right to left, i.e. E being the base and A the top layer?
> We usually write things the other write round, I hope this doesn't get
> too confusing later.

Oh my... yes, of course you're right, I should have written it the other
way around:

   E <- D <- C <- B <- A

>> 1) If we stream first from E to C we add a filter F:
>> 
>>    A <- B <- F <- C <- D <- E

( which should have been written   E <- D <- C <- F <- B <- A )

>>    Now we can't stream from C to A because F is on the way, and the F-C
>>    link is frozen.
>
> Why is a frozen link a problem? The streaming operation isn't going to
> change this link, it just copies data from the subchain (including F
> and C) to A. This is not something that a frozen link should prevent.

Not the operation itself, but the first thing that block-stream does is
freeze the chain from top (A) to base (C), so this would fail if there's
already a frozen link on the way (C <- F in this case?).

> So it seems frozen links allow the wrong case, but block the correct
> one? :-(

Yes, we probably need to rethink this scenario a bit.

Berto
Alberto Garcia March 20, 2019, 5:25 p.m. UTC | #8
On Wed 20 Mar 2019 10:35:27 AM CET, Vladimir Sementsov-Ogievskiy wrote:
>> Maybe at least this kind of freezing isn't the right tool for block
>> jobs, after all.
>
> I think, the correct way, which will support these correct cases
> (which however don't look like real use cases) is removing base from
> stream view. Stream should operate instead using bottom-node.

What is bottom-node?

> And this all looks really more complicated than what we already have
> with frozen backing chain.
>
> Do we really need such cases with parallel streams, sharing same node
> as base for one and top for another? Will someone be upset if we drop
> usage for such scenarios?

That would mean that 'base' is blocked and therefore cannot be used by a
different block job.

I don't think it's terrible to restrict that scenario, although it's
technically a regression so someone could complain about it. As I said
earlier I would prefer not to do it if there's a simple solution (we
would need to think about it), but I wouldn't want to make the code more
complicated just for this use case.

Berto
Vladimir Sementsov-Ogievskiy March 21, 2019, 8:30 a.m. UTC | #9
20.03.2019 20:25, Alberto Garcia wrote:
> On Wed 20 Mar 2019 10:35:27 AM CET, Vladimir Sementsov-Ogievskiy wrote:
>>> Maybe at least this kind of freezing isn't the right tool for block
>>> jobs, after all.
>>
>> I think, the correct way, which will support these correct cases
>> (which however don't look like real use cases) is removing base from
>> stream view. Stream should operate instead using bottom-node.
> 
> What is bottom-node?

I mean the bottom intermediate node, i.e. the node whose backing is base.

> 
>> And this all looks really more complicated than what we already have
>> with frozen backing chain.
>>
>> Do we really need such cases with parallel streams, sharing same node
>> as base for one and top for another? Will someone be upset if we drop
>> usage for such scenarios?
> 
> That would mean that 'base' is blocked and therefore cannot be used by a
> different block job.
> 
> I don't think it's terrible to restrict that scenario, although it's
> technically a regression so someone could complain about it. As I said
> ealier I would prefer not to do it if there's a simple solution (we
> would need to think about it), but I wouldn't want to make the code more
> complicated just for this use case.
> 
> Berto
>
Kevin Wolf March 21, 2019, 10:53 a.m. UTC | #10
Am 20.03.2019 um 18:02 hat Alberto Garcia geschrieben:
> On Wed 20 Mar 2019 10:16:10 AM CET, Kevin Wolf wrote:
> >> Oh, I see. Let's use a shorter chain for simplicity:
> >> 
> >>    A <- B <- C <- D <- E
> >
> > Written from right to left, i.e. E being the base and A the top layer?
> > We usually write things the other write round, I hope this doesn't get
> > too confusing later.
> 
> Oh my... yes, of course you're right, I should have written it the other
> way around:
> 
>    E <- D <- C <- B <- A
> 
> >> 1) If we stream first from E to C we add a filter F:
> >> 
> >>    A <- B <- F <- C <- D <- E
> 
> ( which should have been written   E <- D <- C <- F <- B <- A )
> 
> >>    Now we can't stream from C to A because F is on the way, and the F-C
> >>    link is frozen.
> >
> > Why is a frozen link a problem? The streaming operation isn't going to
> > change this link, it just copies data from the subchain (including F
> > and C) to A. This is not something that a frozen link should prevent.
> 
> Not the operation itself, but the first thing that block-stream does is
> freeze the chain from top (A) to base (C), so this would fail if there's
> already a frozen link on the way (C <- F on this case?).

Oh, I see. I think this is why I suggested a counter originally instead
of a bool.

> > So it seems frozen links allow the wrong case, but block the correct
> > one? :-(
> 
> Yes, we probably need to rethink this scenario a bit.

But yes, even with a counter, the other problem would still remain (that
the first job can't remove the filter on completion because the second
one has frozen its link to the filter).

I don't think that's a case we want to just forbid because nobody needs
this anyway. It's a sign of a more fundamental problem in our design,
and I'm sure it will bite us in other places, too.
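The difference between a bool and the counter mentioned above can be sketched in a few lines (plain Python, not QEMU code): a counter lets a second job freeze an already-frozen link, but the link stays blocked until every holder unfreezes it, which is exactly why the first job still cannot remove its filter.

```python
from collections import Counter

frozen = Counter()   # per-link freeze counts instead of a plain bool

def freeze(link):
    frozen[link] += 1

def unfreeze(link):
    assert frozen[link] > 0
    frozen[link] -= 1

def can_modify(link):
    return frozen[link] == 0

link = ("B", "F")   # the second job's frozen link to the filter F
freeze(link)        # second job freezes B -> F
freeze(link)        # a counter tolerates a second freezer...
unfreeze(link)
# ...but the link stays blocked until *every* holder unfreezes, so the
# first job still cannot drop filter F on completion:
assert not can_modify(link)
unfreeze(link)
assert can_modify(link)
```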

Kevin
Vladimir Sementsov-Ogievskiy March 21, 2019, 11:53 a.m. UTC | #11
21.03.2019 13:53, Kevin Wolf wrote:
> Am 20.03.2019 um 18:02 hat Alberto Garcia geschrieben:
>> On Wed 20 Mar 2019 10:16:10 AM CET, Kevin Wolf wrote:
>>>> Oh, I see. Let's use a shorter chain for simplicity:
>>>>
>>>>     A <- B <- C <- D <- E
>>>
>>> Written from right to left, i.e. E being the base and A the top layer?
>>> We usually write things the other write round, I hope this doesn't get
>>> too confusing later.
>>
>> Oh my... yes, of course you're right, I should have written it the other
>> way around:
>>
>>     E <- D <- C <- B <- A
>>
>>>> 1) If we stream first from E to C we add a filter F:
>>>>
>>>>     A <- B <- F <- C <- D <- E
>>
>> ( which should have been written   E <- D <- C <- F <- B <- A )
>>
>>>>     Now we can't stream from C to A because F is on the way, and the F-C
>>>>     link is frozen.
>>>
>>> Why is a frozen link a problem? The streaming operation isn't going to
>>> change this link, it just copies data from the subchain (including F
>>> and C) to A. This is not something that a frozen link should prevent.
>>
>> Not the operation itself, but the first thing that block-stream does is
>> freeze the chain from top (A) to base (C), so this would fail if there's
>> already a frozen link on the way (C <- F on this case?).
> 
> Oh, I see. I think this is why I suggested a counter originally instead
> of a bool.
> 
>>> So it seems frozen links allow the wrong case, but block the correct
>>> one? :-(
>>
>> Yes, we probably need to rethink this scenario a bit.
> 
> But yes, even with a counter, the other problem would still remain (that
> the first job can't remove the filter on completion because the second
> one has frozen its link to the filter).
> 
> I don't think that's a case we want to just forbid because nobody needs
> this anyway. It's a sign of a more fundamental problem in our design,
> and I'm sure it will bite us in other places, too.
> 
> Kevin
> 

Does it mean that we now have problems with other jobs which already
use a filter, if we run them in parallel with stream?
Alberto Garcia March 21, 2019, 2:23 p.m. UTC | #12
On Thu 21 Mar 2019 09:30:26 AM CET, Vladimir Sementsov-Ogievskiy wrote:
> 20.03.2019 20:25, Alberto Garcia wrote:
>> On Wed 20 Mar 2019 10:35:27 AM CET, Vladimir Sementsov-Ogievskiy wrote:
>>>> Maybe at least this kind of freezing isn't the right tool for block
>>>> jobs, after all.
>>>
>>> I think, the correct way, which will support these correct cases
>>> (which however don't look like real use cases) is removing base from
>>> stream view. Stream should operate instead using bottom-node.
>> 
>> What is bottom-node?
>
> I mean bottom intermediate node, or a node which backing is base.

But the base node is needed, because even if the job doesn't operate
directly on it, it does use the name of that image to store in the top
image's backing file string.

Berto
Alberto Garcia March 21, 2019, 2:51 p.m. UTC | #13
On Thu 21 Mar 2019 12:53:57 PM CET, Vladimir Sementsov-Ogievskiy wrote:
>>> Yes, we probably need to rethink this scenario a bit.
>> 
>> But yes, even with a counter, the other problem would still remain
>> (that the first job can't remove the filter on completion because the
>> second one has frozen its link to the filter).
>> 
>> I don't think that's a case we want to just forbid because nobody
>> needs this anyway. It's a sign of a more fundamental problem in our
>> design, and I'm sure it will bite us in other places, too.
>
> Does it mean that we now have problems with other jobs which already
> has filter if use them in parallel with stream?

I think if there's a job that can insert a filter above the 'base' node
then we have a problem.

I was checking the tests that run commit and stream in parallel in 030,
but they do commit on the upper images and stream on the lower ones, so
that's safe. I'll try to run them the other way around because we might
have a problem there.

Berto
Kevin Wolf March 21, 2019, 2:53 p.m. UTC | #14
Am 21.03.2019 um 15:23 hat Alberto Garcia geschrieben:
> On Thu 21 Mar 2019 09:30:26 AM CET, Vladimir Sementsov-Ogievskiy wrote:
> > 20.03.2019 20:25, Alberto Garcia wrote:
> >> On Wed 20 Mar 2019 10:35:27 AM CET, Vladimir Sementsov-Ogievskiy wrote:
> >>>> Maybe at least this kind of freezing isn't the right tool for block
> >>>> jobs, after all.
> >>>
> >>> I think, the correct way, which will support these correct cases
> >>> (which however don't look like real use cases) is removing base from
> >>> stream view. Stream should operate instead using bottom-node.
> >> 
> >> What is bottom-node?
> >
> > I mean bottom intermediate node, or a node which backing is base.
> 
> But the base node is needed, because even if the job doesn't operate
> directly on it it does use the name of the image to store on the top
> image's backing file string.

That's a one-time thing at the end of the operation, though. I don't
think there is a problem with replacing the base node while the job is
running. It just needs to look at backing_bs(base_node) when it rewrites
the backing file string.
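This completion-time lookup can be sketched as follows (hypothetical Node class and file names, not QEMU code): the job records its bottom node and resolves the backing-file string only when it finishes, so a base replaced mid-job is picked up naturally.

```python
class Node:
    """Minimal stand-in for a BlockDriverState with a backing link."""
    def __init__(self, name, backing=None):
        self.name = name
        self.backing = backing

base = Node("base.qcow2")
bottom = Node("mid.qcow2", backing=base)
top = Node("top.qcow2", backing=bottom)

# Mid-job, another operation replaces the base under the bottom node:
bottom.backing = Node("new-base.qcow2")

def stream_complete(top, bottom):
    """One-time lookup at job completion: read the *current* backing
    of the bottom node (the backing_bs() lookup Kevin describes) and
    use its name for the rewritten backing file string."""
    top.backing = bottom.backing
    return bottom.backing.name

assert stream_complete(top, bottom) == "new-base.qcow2"
```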

Kevin
Alberto Garcia March 21, 2019, 2:59 p.m. UTC | #15
On Thu 21 Mar 2019 03:53:41 PM CET, Kevin Wolf wrote:
>> >>> I think, the correct way, which will support these correct cases
>> >>> (which however don't look like real use cases) is removing base
>> >>> from stream view. Stream should operate instead using
>> >>> bottom-node.
>> >> 
>> >> What is bottom-node?
>> >
>> > I mean bottom intermediate node, or a node which backing is base.
>> 
>> But the base node is needed, because even if the job doesn't operate
>> directly on it it does use the name of the image to store on the top
>> image's backing file string.
>
> That's a one-time thing at the end of the operation, though. I don't
> think there is a problem with replacing the base node while the job is
> running. It just needs to look at backing_bs(base_node) when it
> rewrites the backing file string.

Right, there's a 'backing-file' parameter so the user can specify the
string, but by default it's taken from base.

Berto
Andrey Shinkevich March 21, 2019, 3:36 p.m. UTC | #16
On 20/03/2019 20:02, Alberto Garcia wrote:
> On Wed 20 Mar 2019 10:16:10 AM CET, Kevin Wolf wrote:
>>> Oh, I see. Let's use a shorter chain for simplicity:
>>>
>>>     A <- B <- C <- D <- E
>>
>> Written from right to left, i.e. E being the base and A the top layer?
>> We usually write things the other write round, I hope this doesn't get
>> too confusing later.
> 
> Oh my... yes, of course you're right, I should have written it the other
> way around:
> 
>     E <- D <- C <- B <- A
> 
>>> 1) If we stream first from E to C we add a filter F:
>>>
>>>     A <- B <- F <- C <- D <- E
> 
> ( which should have been written   E <- D <- C <- F <- B <- A )
> 
>>>     Now we can't stream from C to A because F is on the way, and the F-C
>>>     link is frozen.
>>
>> Why is a frozen link a problem? The streaming operation isn't going to
>> change this link, it just copies data from the subchain (including F
>> and C) to A. This is not something that a frozen link should prevent.
> 
> Not the operation itself, but the first thing that block-stream does is
> freeze the chain from top (A) to base (C), so this would fail if there's
> already a frozen link on the way (C <- F on this case?).

I handle the issue by removing the filter F after the chain A-C is
unfrozen in stream_prepare().

> 
>> So it seems frozen links allow the wrong case, but block the correct
>> one? :-(
> 
> Yes, we probably need to rethink this scenario a bit.
> 
> Berto
>
Andrey Shinkevich March 21, 2019, 5:05 p.m. UTC | #17
On 21/03/2019 13:53, Kevin Wolf wrote:
> Am 20.03.2019 um 18:02 hat Alberto Garcia geschrieben:
>> On Wed 20 Mar 2019 10:16:10 AM CET, Kevin Wolf wrote:
>>>> Oh, I see. Let's use a shorter chain for simplicity:
>>>>
>>>>     A <- B <- C <- D <- E
>>>
>>> Written from right to left, i.e. E being the base and A the top layer?
>>> We usually write things the other write round, I hope this doesn't get
>>> too confusing later.
>>
>> Oh my... yes, of course you're right, I should have written it the other
>> way around:
>>
>>     E <- D <- C <- B <- A
>>
>>>> 1) If we stream first from E to C we add a filter F:
>>>>
>>>>     A <- B <- F <- C <- D <- E
>>
>> ( which should have been written   E <- D <- C <- F <- B <- A )
>>
>>>>     Now we can't stream from C to A because F is on the way, and the F-C
>>>>     link is frozen.
>>>
>>> Why is a frozen link a problem? The streaming operation isn't going to
>>> change this link, it just copies data from the subchain (including F
>>> and C) to A. This is not something that a frozen link should prevent.
>>
>> Not the operation itself, but the first thing that block-stream does is
>> freeze the chain from top (A) to base (C), so this would fail if there's
>> already a frozen link on the way (C <- F on this case?).
> 
> Oh, I see. I think this is why I suggested a counter originally instead
> of a bool.
> 
>>> So it seems frozen links allow the wrong case, but block the correct
>>> one? :-(
>>
>> Yes, we probably need to rethink this scenario a bit.
> 
> But yes, even with a counter, the other problem would still remain (that
> the first job can't remove the filter on completion because the second
> one has frozen its link to the filter).

With this example, E <- D <- C <- F <- B <- A:

In the current implementation of the copy-on-read filter, its
bs->backing is not initialized (unlike the filter in the block-commit
job). So bdrv_freeze_backing_chain() doesn't go beyond the cor-filter
node. With the two parallel block-stream jobs, we get the following
sub-chains frozen:
F <- B <- A
E <- D <- C
as the C <- F backing BdrvChild link doesn't exist.
If the cor-filter is inserted with bdrv_append(), we get two BdrvChild
links (file and backing) pointing to the same BlockDriverState 'C', and
additionally some child-permission issues that I have not resolved
yet...
Because of that, freezing the backing chain works with the filter
inserted. But with only the one BdrvChild *file link in the
BlockDriverState of the cor-filter, we encounter a broken chain each
time we iterate through it with backing_bs(F) (= NULL) in many other
pieces of the code. In turn, it breaks the existing model.
That's the point!
What can we do about that?

In my patch Stream-block-job-involves-copy-on-read-filter.patch:

static BlockDriverState *child_file_bs(BlockDriverState *bs)
{
    /* Follow the 'file' child link, or return NULL if there is none */
    return bs->file ? bs->file->bs : NULL;
}

static BlockDriverState *skip_filter(BlockDriverState *bs)
{
    BlockDriverState *ret_bs = bs;

    /* Descend through consecutive filter nodes via their 'file' child */
    while (ret_bs && ret_bs->drv && ret_bs->drv->is_filter) {
        ret_bs = child_file_bs(ret_bs);
    }

    return ret_bs;
}

But the solution above looks clumsy to me.
I would appreciate hearing any other ideas from you.

Andrey

> 
> I don't think that's a case we want to just forbid because nobody needs
> this anyway. It's a sign of a more fundamental problem in our design,
> and I'm sure it will bite us in other places, too.
> 
> Kevin
>
Andrey Shinkevich March 21, 2019, 5:11 p.m. UTC | #18
On 21/03/2019 14:53, Vladimir Sementsov-Ogievskiy wrote:
> 21.03.2019 13:53, Kevin Wolf wrote:
>> Am 20.03.2019 um 18:02 hat Alberto Garcia geschrieben:
>>> On Wed 20 Mar 2019 10:16:10 AM CET, Kevin Wolf wrote:
>>>>> Oh, I see. Let's use a shorter chain for simplicity:
>>>>>
>>>>>      A <- B <- C <- D <- E
>>>>
>>>> Written from right to left, i.e. E being the base and A the top layer?
>>>> We usually write things the other write round, I hope this doesn't get
>>>> too confusing later.
>>>
>>> Oh my... yes, of course you're right, I should have written it the other
>>> way around:
>>>
>>>      E <- D <- C <- B <- A
>>>
>>>>> 1) If we stream first from E to C we add a filter F:
>>>>>
>>>>>      A <- B <- F <- C <- D <- E
>>>
>>> ( which should have been written   E <- D <- C <- F <- B <- A )
>>>
>>>>>      Now we can't stream from C to A because F is on the way, and the F-C
>>>>>      link is frozen.
>>>>
>>>> Why is a frozen link a problem? The streaming operation isn't going to
>>>> change this link, it just copies data from the subchain (including F
>>>> and C) to A. This is not something that a frozen link should prevent.
>>>
>>> Not the operation itself, but the first thing that block-stream does is
>>> freeze the chain from top (A) to base (C), so this would fail if there's
>>> already a frozen link on the way (C <- F on this case?).
>>
>> Oh, I see. I think this is why I suggested a counter originally instead
>> of a bool.
>>
>>>> So it seems frozen links allow the wrong case, but block the correct
>>>> one? :-(
>>>
>>> Yes, we probably need to rethink this scenario a bit.
>>
>> But yes, even with a counter, the other problem would still remain (that
>> the first job can't remove the filter on completion because the second
>> one has frozen its link to the filter).
>>
>> I don't think that's a case we want to just forbid because nobody needs
>> this anyway. It's a sign of a more fundamental problem in our design,
>> and I'm sure it will bite us in other places, too.
>>
>> Kevin
>>
> 
> Does it mean that we now have a problem with other jobs that already
> use a filter, if we run them in parallel with stream?
> 

In the current implementation of the iotests, only
TestParallelOps::test_stream_parallel() in #030
detects the issue.

Andrey
Vladimir Sementsov-Ogievskiy March 22, 2019, 7:26 a.m. UTC | #19
On 21.03.2019 20:05, Andrey Shinkevich wrote:
> 
> 
> On 21/03/2019 13:53, Kevin Wolf wrote:
>> Am 20.03.2019 um 18:02 hat Alberto Garcia geschrieben:
>>> On Wed 20 Mar 2019 10:16:10 AM CET, Kevin Wolf wrote:
>>>>> Oh, I see. Let's use a shorter chain for simplicity:
>>>>>
>>>>>      A <- B <- C <- D <- E
>>>>
>>>> Written from right to left, i.e. E being the base and A the top layer?
>>>> We usually write things the other write round, I hope this doesn't get
>>>> too confusing later.
>>>
>>> Oh my... yes, of course you're right, I should have written it the other
>>> way around:
>>>
>>>      E <- D <- C <- B <- A
>>>
>>>>> 1) If we stream first from E to C we add a filter F:
>>>>>
>>>>>      A <- B <- F <- C <- D <- E
>>>
>>> ( which should have been written   E <- D <- C <- F <- B <- A )
>>>
>>>>>      Now we can't stream from C to A because F is on the way, and the F-C
>>>>>      link is frozen.
>>>>
>>>> Why is a frozen link a problem? The streaming operation isn't going to
>>>> change this link, it just copies data from the subchain (including F
>>>> and C) to A. This is not something that a frozen link should prevent.
>>>
>>> Not the operation itself, but the first thing that block-stream does is
>>> freeze the chain from top (A) to base (C), so this would fail if there's
>>> already a frozen link on the way (C <- F on this case?).
>>
>> Oh, I see. I think this is why I suggested a counter originally instead
>> of a bool.
>>
>>>> So it seems frozen links allow the wrong case, but block the correct
>>>> one? :-(
>>>
>>> Yes, we probably need to rethink this scenario a bit.
>>
>> But yes, even with a counter, the other problem would still remain (that
>> the first job can't remove the filter on completion because the second
>> one has frozen its link to the filter).
> 
> With this example E <- D <- C <- F <- B <- A,
> 
> In the current implementation of the copy-on-read filter,
> its bs->backing is not initialized (while it is not true for the filter
> in block-commit job). So, bdrv_freeze_backing_chain() doesn't go beyond
> the cor-filter node. With the two parallel block-stream jobs, we get the
> following sub-chains frozen:
> F <- B <- A
> E <- D <- C
> as C <- F backing BdrvChild link doesn't exist.
> If the cor-filter is inserted with the bdrv_append(), we get two
> BdrvChild links (file and backing) pointed to the same BlockDriverState
> 'C' and additionally some child-permissions issues that I have not
> resolved yet...
> Due to the fact mentioned above, freezing the backing chain works with
> the filter inserted. But, with the one BdrvChild *file link only in the
> BlockDriverState of the cor-filter, we encounter a broken chain each
> time we iterate through it with the backing_bs(F) (=NULL) in many other
> pieces of the code. In turn, it breaks the existing model.
> That's the point! (((
> What can we do with that?
> 
> In my patch Stream-block-job-involves-copy-on-read-filter.patch :
> 
> static BlockDriverState *child_file_bs(BlockDriverState *bs)
> {
>       return bs->file ? bs->file->bs : NULL;
> }
> 
> static BlockDriverState *skip_filter(BlockDriverState *bs)
> {
>       BlockDriverState *ret_bs = bs;
> 
>       while (ret_bs && ret_bs->drv && ret_bs->drv->is_filter) {
>           ret_bs = child_file_bs(ret_bs);
>       }
> 
>       return ret_bs;
> }
> 
> But the solution above looks clumsy to me.
> I would appreciate to hear any other ideas from you.
> 

As I understand it, we can't drop the .file child from copy-on-read and
use backing instead, as that would break backward compatibility. So I
propose supporting a backing child in COR, so that it may operate
through .file or through .backing, depending on which one it has
(but forbid creating both children).
Kevin Wolf March 22, 2019, 9:28 a.m. UTC | #20
Am 21.03.2019 um 18:05 hat Andrey Shinkevich geschrieben:
> 
> 
> On 21/03/2019 13:53, Kevin Wolf wrote:
> > Am 20.03.2019 um 18:02 hat Alberto Garcia geschrieben:
> >> On Wed 20 Mar 2019 10:16:10 AM CET, Kevin Wolf wrote:
> >>>> Oh, I see. Let's use a shorter chain for simplicity:
> >>>>
> >>>>     A <- B <- C <- D <- E
> >>>
> >>> Written from right to left, i.e. E being the base and A the top layer?
> >>> We usually write things the other write round, I hope this doesn't get
> >>> too confusing later.
> >>
> >> Oh my... yes, of course you're right, I should have written it the other
> >> way around:
> >>
> >>     E <- D <- C <- B <- A
> >>
> >>>> 1) If we stream first from E to C we add a filter F:
> >>>>
> >>>>     A <- B <- F <- C <- D <- E
> >>
> >> ( which should have been written   E <- D <- C <- F <- B <- A )
> >>
> >>>>     Now we can't stream from C to A because F is on the way, and the F-C
> >>>>     link is frozen.
> >>>
> >>> Why is a frozen link a problem? The streaming operation isn't going to
> >>> change this link, it just copies data from the subchain (including F
> >>> and C) to A. This is not something that a frozen link should prevent.
> >>
> >> Not the operation itself, but the first thing that block-stream does is
> >> freeze the chain from top (A) to base (C), so this would fail if there's
> >> already a frozen link on the way (C <- F on this case?).
> > 
> > Oh, I see. I think this is why I suggested a counter originally instead
> > of a bool.
> > 
> >>> So it seems frozen links allow the wrong case, but block the correct
> >>> one? :-(
> >>
> >> Yes, we probably need to rethink this scenario a bit.
> > 
> > But yes, even with a counter, the other problem would still remain (that
> > the first job can't remove the filter on completion because the second
> > one has frozen its link to the filter).
> 
> With this example E <- D <- C <- F <- B <- A,
> 
> In the current implementation of the copy-on-read filter,
> its bs->backing is not initialized (while it is not true for the filter
> in block-commit job). So, bdrv_freeze_backing_chain() doesn't go beyond 
> the cor-filter node. With the two parallel block-stream jobs, we get the
> following sub-chains frozen:
> F <- B <- A
> E <- D <- C
> as C <- F backing BdrvChild link doesn't exist.

Hm, yes, but that's more by accident than a proper design. Of course,
freezing the chain shouldn't be stopped by a filter.

Currently, bdrv_freeze_backing_chain() simply stops if we reach a point
where the backing chain ends, even if we haven't reached the base. I
think this should be an assertion failure because the function should
never be called with a base that isn't even in the backing chain.

> If the cor-filter is inserted with the bdrv_append(), we get two
> BdrvChild links (file and backing) pointed to the same BlockDriverState
> 'C' and additionally some child-permissions issues that I have not 
> resolved yet...
> Due to the fact mentioned above, freezing the backing chain works with
> the filter inserted. But, with the one BdrvChild *file link only in the
> BlockDriverState of the cor-filter, we encounter a broken chain each
> time we iterate through it with the backing_bs(F) (=NULL) in many other
> pieces of the code. In turn, it breaks the existing model.
> That's the point! (((
> What can we do with that?

I think Max's series "Deal with filters" might be very helpful for this
kind of thing. With it, bdrv_freeze_backing_chain() can easily be made
work even across the cor filter.

Basically, the solution in that series is that it replaces backing_bs()
with different functions depending on what is really wanted. Amongst
others, it introduces a function to get to the next non-filter in the
backing file chain, no matter whether it has to go through bs->file
or bs->backing, which is probably the right replacement in your case.

Kevin
Alberto Garcia March 22, 2019, 3:54 p.m. UTC | #21
On Thu 21 Mar 2019 03:51:12 PM CET, Alberto Garcia <berto@igalia.com> wrote:

> I was checking the tests that run commit and stream in parallel in
> 030, but they do commit on the upper images and stream on the lower
> ones, so that's safe. I'll try to run them the other way around
> because we might have a problem there.

I considered these scenarios with the following backing chain:

   E <- D <- C <- B <- A

1) stream from C to A, then commit from C to E

   This fails because qmp_block_commit() checks for op blockers in C's
   overlay (B), which is blocked by the stream block job.
   ("Node 'B' is busy: block device is in use by block job: stream")

2) commit from C to E, then stream from C to A

   This fails because the commit job inserts a filter between C and B
   and the bdrv_freeze_backing_chain(bs, base) call in stream_start()
   fails.

   However! I found this crash on a couple of occasions. I believe that
   it happens when the commit job finishes before block-stream, but I need
   to debug it further to see why the previous error didn't happen.

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000559aca6e745d in stream_prepare (job=0x559acdafad70) at block/stream.c:80
80                      base_fmt = base->drv->format_name;
(gdb) print base
$1 = (BlockDriverState *) 0x559acd070240
(gdb) print base->drv
$2 = (BlockDriver *) 0xb5b5b5b5b5b5b5b5
(gdb) bt
#0  0x0000559aca6e745d in stream_prepare (job=0x559acdafad70) at block/stream.c:80
#1  0x0000559aca973a40 in job_prepare (job=0x559acdafad70) at job.c:771
#2  0x0000559aca9722fd in job_txn_apply (txn=0x559acd01e6d0, fn=0x559aca973a03 <job_prepare>) at job.c:146
#3  0x0000559aca973ad2 in job_do_finalize (job=0x559acdafad70) at job.c:788
#4  0x0000559aca973ca0 in job_completed_txn_success (job=0x559acdafad70) at job.c:842
#5  0x0000559aca973d3d in job_completed (job=0x559acdafad70) at job.c:855
#6  0x0000559aca973d8c in job_exit (opaque=0x559acdafad70) at job.c:874
#7  0x0000559acaa99c55 in aio_bh_call (bh=0x559acd3247f0) at util/async.c:90
#8  0x0000559acaa99ced in aio_bh_poll (ctx=0x559accfb9a30) at util/async.c:118
#9  0x0000559acaa9ebc0 in aio_dispatch (ctx=0x559accfb9a30) at util/aio-posix.c:460
#10 0x0000559acaa9a088 in aio_ctx_dispatch (source=0x559accfb9a30, callback=0x0, user_data=0x0) at util/async.c:261
#11 0x00007f7d8e7787f7 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#12 0x0000559acaa9d4bf in glib_pollfds_poll () at util/main-loop.c:222
#13 0x0000559acaa9d539 in os_host_main_loop_wait (timeout=0) at util/main-loop.c:245
#14 0x0000559acaa9d63e in main_loop_wait (nonblocking=0) at util/main-loop.c:521
#15 0x0000559aca6c0ace in main_loop () at vl.c:1969
#16 0x0000559aca6c7db3 in main (argc=18, argv=0x7ffe11ee6d58, envp=0x7ffe11ee6df0) at vl.c:4589

So we need to look into this :( but I'd say that it seems that stream
should not need 'base' at all, just the node on top of it.

Berto
Andrey Shinkevich March 22, 2019, 6:17 p.m. UTC | #22
On 22/03/2019 18:54, Alberto Garcia wrote:
> On Thu 21 Mar 2019 03:51:12 PM CET, Alberto Garcia <berto@igalia.com> wrote:
> 
>> I was checking the tests that run commit and stream in parallel in
>> 030, but they do commit on the upper images and stream on the lower
>> ones, so that's safe. I'll try to run them the other way around
>> because we might have a problem there.
> 
> I considered these scenarios with the following backing chain:
> 
>     E <- D <- C <- B <- A
> 
> 1) stream from C to A, then commit from C to E
> 
>     This fails because qmp_block_commit() checks for op blockers in C's
>     overlay (B), which is blocked by the stream block job.
>     ("Node 'B' is busy: block device is in use by block job: stream")
> 
> 2) commit from C to E, then stream from C to A
> 
>     This fails because the commit job inserts a filter between C and B
>     and the bdrv_freeze_backing_chain(bs, base) call in stream_start()
>     fails.
> 
>     However! I found this crash in a couple of occasions, I believe that
>     it happens if the commit job finishes before block_stream, but I need
>     to debug it further to see why the previous error didn't happen.
> 
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x0000559aca6e745d in stream_prepare (job=0x559acdafad70) at block/stream.c:80
> 80                      base_fmt = base->drv->format_name;
> (gdb) print base
> $1 = (BlockDriverState *) 0x559acd070240
> (gdb) print base->drv
> $2 = (BlockDriver *) 0xb5b5b5b5b5b5b5b5
> (gdb) bt
> #0  0x0000559aca6e745d in stream_prepare (job=0x559acdafad70) at block/stream.c:80
> #1  0x0000559aca973a40 in job_prepare (job=0x559acdafad70) at job.c:771
> #2  0x0000559aca9722fd in job_txn_apply (txn=0x559acd01e6d0, fn=0x559aca973a03 <job_prepare>) at job.c:146
> #3  0x0000559aca973ad2 in job_do_finalize (job=0x559acdafad70) at job.c:788
> #4  0x0000559aca973ca0 in job_completed_txn_success (job=0x559acdafad70) at job.c:842
> #5  0x0000559aca973d3d in job_completed (job=0x559acdafad70) at job.c:855
> #6  0x0000559aca973d8c in job_exit (opaque=0x559acdafad70) at job.c:874
> #7  0x0000559acaa99c55 in aio_bh_call (bh=0x559acd3247f0) at util/async.c:90
> #8  0x0000559acaa99ced in aio_bh_poll (ctx=0x559accfb9a30) at util/async.c:118
> #9  0x0000559acaa9ebc0 in aio_dispatch (ctx=0x559accfb9a30) at util/aio-posix.c:460
> #10 0x0000559acaa9a088 in aio_ctx_dispatch (source=0x559accfb9a30, callback=0x0, user_data=0x0) at util/async.c:261
> #11 0x00007f7d8e7787f7 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
> #12 0x0000559acaa9d4bf in glib_pollfds_poll () at util/main-loop.c:222
> #13 0x0000559acaa9d539 in os_host_main_loop_wait (timeout=0) at util/main-loop.c:245
> #14 0x0000559acaa9d63e in main_loop_wait (nonblocking=0) at util/main-loop.c:521
> #15 0x0000559aca6c0ace in main_loop () at vl.c:1969
> #16 0x0000559aca6c7db3 in main (argc=18, argv=0x7ffe11ee6d58, envp=0x7ffe11ee6df0) at vl.c:4589
> 
> So we need to look into this :( but I'd say that it seems that stream
> should not need 'base' at all, just the node on top of it.
> 
> Berto
> 

Meanwhile, I will tackle a new series that uses a 'bottom node'
instead of the 'base'...

Andrey
Alberto Garcia March 24, 2019, 6:05 p.m. UTC | #23
On Fri 22 Mar 2019 04:54:59 PM CET, Alberto Garcia <berto@igalia.com> wrote:
>    E <- D <- C <- B <- A
>
> 2) commit from C to E, then stream from C to A
>
>    This fails because the commit job inserts a filter between C and B
>    and the bdrv_freeze_backing_chain(bs, base) call in stream_start()
>    fails.
>
>    However! I found this crash in a couple of occasions, I believe that
>    it happens if the commit job finishes before block_stream, but I need
>    to debug it further to see why the previous error didn't happen.

I was debugging this today. Here's what happens:

 - The commit job starts
 - The stream job starts and yields during bdrv_reopen_set_read_only()
   in stream_start()
 - The commit job ends and removes C and D from the backing chain.
 - stream_start() resumes but now 'C' doesn't exist anymore, so
   BlockDriverState *base is a dead pointer.

Berto

Patch

diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030
index 276e06b..c801be1 100755
--- a/tests/qemu-iotests/030
+++ b/tests/qemu-iotests/030
@@ -153,7 +153,8 @@  class TestSingleDrive(iotests.QMPTestCase):
 
 class TestParallelOps(iotests.QMPTestCase):
     num_ops = 4 # Number of parallel block-stream operations
-    num_imgs = num_ops * 2 + 1
+    step = 3 # base + source + target
+    num_imgs = num_ops * step
     image_len = num_ops * 512 * 1024
     imgs = []
 
@@ -174,14 +175,15 @@  class TestParallelOps(iotests.QMPTestCase):
                      '-o', 'backing_file=%s' % self.imgs[i-1], self.imgs[i])
 
         # Put data into the images we are copying data from
-        odd_img_indexes = [x for x in reversed(range(self.num_imgs)) if x % 2 == 1]
-        for i in range(len(odd_img_indexes)):
+        source_img_indexes = \
+            list(reversed(list(range(self.num_imgs)[1::self.step])))
+        for i in range(len(source_img_indexes)):
             # Alternate between 256KB and 512KB.
             # This way jobs will not finish in the same order they were created
             num_kb = 256 + 256 * (i % 2)
             qemu_io('-f', iotests.imgfmt,
                     '-c', 'write -P 0xFF %dk %dk' % (i * 512, num_kb),
-                    self.imgs[odd_img_indexes[i]])
+                    self.imgs[source_img_indexes[i]])
 
         # Attach the drive to the VM
         self.vm = iotests.VM()
@@ -199,14 +201,14 @@  class TestParallelOps(iotests.QMPTestCase):
         self.assert_no_active_block_jobs()
 
         # Check that the maps don't match before the streaming operations
-        for i in range(2, self.num_imgs, 2):
+        for i in range(2, self.num_imgs, self.step):
             self.assertNotEqual(qemu_io('-f', iotests.imgfmt, '-rU', '-c', 'map', self.imgs[i]),
                                 qemu_io('-f', iotests.imgfmt, '-rU', '-c', 'map', self.imgs[i-1]),
                                 'image file map matches backing file before streaming')
 
         # Create all streaming jobs
         pending_jobs = []
-        for i in range(2, self.num_imgs, 2):
+        for i in range(2, self.num_imgs, self.step):
             node_name = 'node%d' % i
             job_id = 'stream-%s' % node_name
             pending_jobs.append(job_id)
@@ -226,7 +228,7 @@  class TestParallelOps(iotests.QMPTestCase):
         self.vm.shutdown()
 
         # Check that all maps match now
-        for i in range(2, self.num_imgs, 2):
+        for i in range(2, self.num_imgs, self.step):
             self.assertEqual(qemu_io('-f', iotests.imgfmt, '-c', 'map', self.imgs[i]),
                              qemu_io('-f', iotests.imgfmt, '-c', 'map', self.imgs[i-1]),
                              'image file map does not match backing file after streaming')
@@ -237,16 +239,16 @@  class TestParallelOps(iotests.QMPTestCase):
         self.assert_no_active_block_jobs()
 
         # Set a speed limit to make sure that this job blocks the rest
-        result = self.vm.qmp('block-stream', device='node4', job_id='stream-node4', base=self.imgs[1], speed=1024*1024)
+        result = self.vm.qmp('block-stream', device='node5', job_id='stream-node5', base=self.imgs[1], speed=1024*1024)
         self.assert_qmp(result, 'return', {})
 
-        result = self.vm.qmp('block-stream', device='node5', job_id='stream-node5', base=self.imgs[2])
+        result = self.vm.qmp('block-stream', device='node6', job_id='stream-node6', base=self.imgs[2])
         self.assert_qmp(result, 'error/class', 'GenericError')
 
         result = self.vm.qmp('block-stream', device='node3', job_id='stream-node3', base=self.imgs[2])
         self.assert_qmp(result, 'error/class', 'GenericError')
 
-        result = self.vm.qmp('block-stream', device='node4', job_id='stream-node4-v2')
+        result = self.vm.qmp('block-stream', device='node5', job_id='stream-node5-v2')
         self.assert_qmp(result, 'error/class', 'GenericError')
 
         # block-commit should also fail if it touches nodes used by the stream job
@@ -260,7 +262,7 @@  class TestParallelOps(iotests.QMPTestCase):
         result = self.vm.qmp('block-commit', device='drive0', base=self.imgs[0], top=self.imgs[1], job_id='commit-node0')
         self.assert_qmp(result, 'error/class', 'GenericError')
 
-        self.wait_until_completed(drive='stream-node4')
+        self.wait_until_completed(drive='stream-node5')
         self.assert_no_active_block_jobs()
 
     # Similar to test_overlapping_1, but with block-commit
@@ -383,8 +385,8 @@  class TestParallelOps(iotests.QMPTestCase):
     def test_stream_base_node_name(self):
         self.assert_no_active_block_jobs()
 
-        self.assertNotEqual(qemu_io('-f', iotests.imgfmt, '-rU', '-c', 'map', self.imgs[4]),
-                            qemu_io('-f', iotests.imgfmt, '-rU', '-c', 'map', self.imgs[3]),
+        self.assertNotEqual(qemu_io('-f', iotests.imgfmt, '-rU', '-c', 'map', self.imgs[5]),
+                            qemu_io('-f', iotests.imgfmt, '-rU', '-c', 'map', self.imgs[4]),
                             'image file map matches backing file before streaming')
 
         # Error: the base node does not exist
@@ -404,7 +406,7 @@  class TestParallelOps(iotests.QMPTestCase):
         self.assert_qmp(result, 'error/class', 'GenericError')
 
         # Success: the base node is a backing file of the top node
-        result = self.vm.qmp('block-stream', device='node4', base_node='node2', job_id='stream')
+        result = self.vm.qmp('block-stream', device='node5', base_node='node3', job_id='stream')
         self.assert_qmp(result, 'return', {})
 
         self.wait_until_completed(drive='stream')
@@ -412,8 +414,8 @@  class TestParallelOps(iotests.QMPTestCase):
         self.assert_no_active_block_jobs()
         self.vm.shutdown()
 
-        self.assertEqual(qemu_io('-f', iotests.imgfmt, '-c', 'map', self.imgs[4]),
-                         qemu_io('-f', iotests.imgfmt, '-c', 'map', self.imgs[3]),
+        self.assertEqual(qemu_io('-f', iotests.imgfmt, '-c', 'map', self.imgs[5]),
+                         qemu_io('-f', iotests.imgfmt, '-c', 'map', self.imgs[4]),
                          'image file map matches backing file after streaming')
 
 class TestQuorum(iotests.QMPTestCase):