Message ID | 20230421171411.566300-3-berrange@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | tests/qtest: make migration-test massively faster |
Daniel P. Berrangé <berrange@redhat.com> wrote:
> The 'unsigned int iterations' config for migration is somewhat
> overkill. Most tests don't set it, and a value of '0' is treated
> as equivalent to '1'. The only test that does set it, xbzrle,
> used a value of '2'.
>
> This setting, however, only relates to the migration iterations
> that take place prior to allowing convergence. IOW, on top of
> this iteration count, there is always at least 1 further migration
> iteration done to deal with pages that are dirtied during the
> previous iteration(s).
>
> IOW, even with iterations==1, the xbzrle test will be running for
> a minimum of 2 iterations. With this in mind we can simplify the
> code and just get rid of the special case.

Perhaps the old code was already wrong, but we need at least three
iterations for the xbzrle test:
- 1st iteration: xbzrle is not used, nothing is on cache.
- 2nd iteration: pages are put into cache, no xbzrle is used because
  there is no previous page.
- 3rd iteration: We really use xbzrle now against the copy of the
  previous iterations.

And yes, this should be commented somewhere.

Later, Juan.
On Fri, Apr 21, 2023 at 11:54:55PM +0200, Juan Quintela wrote:
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> > The 'unsigned int iterations' config for migration is somewhat
> > overkill. Most tests don't set it, and a value of '0' is treated
> > as equivalent to '1'. The only test that does set it, xbzrle,
> > used a value of '2'.
> >
> > This setting, however, only relates to the migration iterations
> > that take place prior to allowing convergence. IOW, on top of
> > this iteration count, there is always at least 1 further migration
> > iteration done to deal with pages that are dirtied during the
> > previous iteration(s).
> >
> > IOW, even with iterations==1, the xbzrle test will be running for
> > a minimum of 2 iterations. With this in mind we can simplify the
> > code and just get rid of the special case.
>
> Perhaps the old code was already wrong, but we need at least three
> iterations for the xbzrle test:
> - 1st iteration: xbzrle is not used, nothing is on cache.

Are you sure about this? I see ram_save_page() calling
save_xbzrle_page() and unless I'm mis-understanding the
code, it doesn't appear to skip anything on the 1st
iteration.

IIUC save_xbzrle_page will add pages into the cache on
the first iteration, so the second iteration will get
cache hits

> - 2nd iteration: pages are put into cache, no xbzrle is used because
>   there is no previous page.
> - 3rd iteration: We really use xbzrle now against the copy of the
>   previous iterations.
>
> And yes, this should be commented somewhere.

With regards,
Daniel
Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Fri, Apr 21, 2023 at 11:54:55PM +0200, Juan Quintela wrote:
>> Daniel P. Berrangé <berrange@redhat.com> wrote:
>> > The 'unsigned int iterations' config for migration is somewhat
>> > overkill. Most tests don't set it, and a value of '0' is treated
>> > as equivalent to '1'. The only test that does set it, xbzrle,
>> > used a value of '2'.
>> >
>> > This setting, however, only relates to the migration iterations
>> > that take place prior to allowing convergence. IOW, on top of
>> > this iteration count, there is always at least 1 further migration
>> > iteration done to deal with pages that are dirtied during the
>> > previous iteration(s).
>> >
>> > IOW, even with iterations==1, the xbzrle test will be running for
>> > a minimum of 2 iterations. With this in mind we can simplify the
>> > code and just get rid of the special case.
>>
>> Perhaps the old code was already wrong, but we need at least three
>> iterations for the xbzrle test:
>> - 1st iteration: xbzrle is not used, nothing is on cache.
>
> Are you sure about this? I see ram_save_page() calling
> save_xbzrle_page() and unless I'm mis-understanding the
> code, it doesn't appear to skip anything on the 1st
> iteration.

I will admit that code is convoluted as hell.
And I confuse myself a lot here O:-)

struct RAM_STATE {
    ...
    /* Start using XBZRLE (e.g., after the first round). */
    bool xbzrle_enabled;
}

I.e. xbzrle_enabled() and m->xbzrle_enabled are two completely different things.

static int ram_save_page(RAMState *rs, PageSearchStatus *pss)
{
    ...
    if (rs->xbzrle_enabled && !migration_in_postcopy()) {
        pages = save_xbzrle_page(rs, pss, &p, current_addr,
                                 block, offset);
        ....
    }
    ....
}

and

static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
{
    /* Update pss->page for the next dirty bit in ramblock */
    pss_find_next_dirty(pss);

    if (pss->complete_round && pss->block == rs->last_seen_block &&
        ...
        return PAGE_ALL_CLEAN;
    }
    if (!offset_in_ramblock(pss->block,
                            ((ram_addr_t)pss->page) << TARGET_PAGE_BITS)) {
        ....
        if (!pss->block) {
            ....
            if (migrate_use_xbzrle()) {
                rs->xbzrle_enabled = true;
            }
        }
        ...
    } else {
        /* We've found something */
        return PAGE_DIRTY_FOUND;
    }
}

> IIUC save_xbzrle_page will add pages into the cache on
> the first iteration, so the second iteration will get
> cache hits
>
>> - 2nd iteration: pages are put into cache, no xbzrle is used because
>>   there is no previous page.
>> - 3rd iteration: We really use xbzrle now against the copy of the
>>   previous iterations.
>>
>> And yes, this should be commented somewhere.

Seeing that it has been able to confuse you, a single comment will not
make the trick O:-)

Later, Juan.
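The three-iteration claim can be checked against a toy model of the gating shown in the snippets above. This is only a sketch under stated assumptions, not QEMU code: `struct toy_ram_state`, its one-entry cache, and every `toy_*` name are hypothetical stand-ins. The only behaviour taken from the thread is that `rs->xbzrle_enabled` stays false until the end of the first complete round, and that `save_xbzrle_page()` is reached only once it is true.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of sending one dirty page during one precopy pass. */
enum toy_page_action {
    TOY_SENT_RAW_NO_CACHE,  /* XBZRLE not started: plain page, cache untouched */
    TOY_SENT_RAW_CACHED,    /* XBZRLE started, cache miss: plain page, now cached */
    TOY_SENT_XBZRLE_DELTA   /* XBZRLE started, cache hit: delta vs. cached copy */
};

/* Toy stand-in for the relevant RAMState bits (not QEMU's real struct). */
struct toy_ram_state {
    bool xbzrle_started;  /* models rs->xbzrle_enabled */
    bool page_in_cache;   /* models a one-entry XBZRLE cache */
};

/* Models the gate in ram_save_page() for a single, always-dirty page. */
static enum toy_page_action toy_save_page(struct toy_ram_state *rs)
{
    if (!rs->xbzrle_started) {
        return TOY_SENT_RAW_NO_CACHE;
    }
    if (!rs->page_in_cache) {
        rs->page_in_cache = true;
        return TOY_SENT_RAW_CACHED;
    }
    return TOY_SENT_XBZRLE_DELTA;
}

/* Models the end-of-round step in find_dirty_block(). */
static void toy_end_of_round(struct toy_ram_state *rs, bool capability_on)
{
    if (capability_on) {
        rs->xbzrle_started = true;
    }
}

/*
 * Returns the 1-based pass in which the first delta is sent,
 * or 0 if no delta is sent within max_passes.
 */
static int toy_first_delta_pass(bool capability_on, int max_passes)
{
    struct toy_ram_state rs = { false, false };

    assert(max_passes >= 0);
    for (int pass = 1; pass <= max_passes; pass++) {
        if (toy_save_page(&rs) == TOY_SENT_XBZRLE_DELTA) {
            return pass;
        }
        toy_end_of_round(&rs, capability_on);
    }
    return 0;
}
```

Calling `toy_first_delta_pass(true, 5)` in this model reports pass 3 as the first one that sends a delta, matching the list above. With only two passes, no delta is ever sent.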
On Wed, Apr 26, 2023 at 11:42:51AM +0200, Juan Quintela wrote:
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> > On Fri, Apr 21, 2023 at 11:54:55PM +0200, Juan Quintela wrote:
> >> Daniel P. Berrangé <berrange@redhat.com> wrote:
> >> > The 'unsigned int iterations' config for migration is somewhat
> >> > overkill. Most tests don't set it, and a value of '0' is treated
> >> > as equivalent to '1'. The only test that does set it, xbzrle,
> >> > used a value of '2'.
> >> >
> >> > This setting, however, only relates to the migration iterations
> >> > that take place prior to allowing convergence. IOW, on top of
> >> > this iteration count, there is always at least 1 further migration
> >> > iteration done to deal with pages that are dirtied during the
> >> > previous iteration(s).
> >> >
> >> > IOW, even with iterations==1, the xbzrle test will be running for
> >> > a minimum of 2 iterations. With this in mind we can simplify the
> >> > code and just get rid of the special case.
> >>
> >> Perhaps the old code was already wrong, but we need at least three
> >> iterations for the xbzrle test:
> >> - 1st iteration: xbzrle is not used, nothing is on cache.
> >
> > Are you sure about this? I see ram_save_page() calling
> > save_xbzrle_page() and unless I'm mis-understanding the
> > code, it doesn't appear to skip anything on the 1st
> > iteration.
>
> I will admit that code is convoluted as hell.
> And I confuse myself a lot here O:-)
>
> struct RAM_STATE {
>     ...
>     /* Start using XBZRLE (e.g., after the first round). */
>     bool xbzrle_enabled;
> }
>
> I.e. xbzrle_enabled() and m->xbzrle_enabled are two completely different things.

Aieeeee ! That's confusing indeed :-)

Let's rename that struct field to 'xbzrle_started', to better
distinguish active state from enabled state.

> static int ram_save_page(RAMState *rs, PageSearchStatus *pss)
> {
>     ...
>     if (rs->xbzrle_enabled && !migration_in_postcopy()) {
>         pages = save_xbzrle_page(rs, pss, &p, current_addr,
>                                  block, offset);
>         ....
>     }
>     ....
> }
>
> and
>
> static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
> {
>     /* Update pss->page for the next dirty bit in ramblock */
>     pss_find_next_dirty(pss);
>
>     if (pss->complete_round && pss->block == rs->last_seen_block &&
>         ...
>         return PAGE_ALL_CLEAN;
>     }
>     if (!offset_in_ramblock(pss->block,
>                             ((ram_addr_t)pss->page) << TARGET_PAGE_BITS)) {
>         ....
>         if (!pss->block) {
>             ....
>             if (migrate_use_xbzrle()) {
>                 rs->xbzrle_enabled = true;
>             }
>         }
>         ...
>     } else {
>         /* We've found something */
>         return PAGE_DIRTY_FOUND;
>     }
> }
>
> > IIUC save_xbzrle_page will add pages into the cache on
> > the first iteration, so the second iteration will get
> > cache hits
> >
> >> - 2nd iteration: pages are put into cache, no xbzrle is used because
> >>   there is no previous page.
> >> - 3rd iteration: We really use xbzrle now against the copy of the
> >>   previous iterations.
> >>
> >> And yes, this should be commented somewhere.
>
> Seeing that it has been able to confuse you, a single comment will not
> make the trick O:-)
>
> Later, Juan.
>

With regards,
Daniel
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index ac2e8ecac6..e16120ff30 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -568,9 +568,6 @@ typedef struct {
         MIG_TEST_FAIL_DEST_QUIT_ERR,
     } result;
 
-    /* Optional: set number of migration passes to wait for */
-    unsigned int iterations;
-
     /* Postcopy specific fields */
     void *postcopy_data;
     bool postcopy_preempt;
@@ -1354,13 +1351,7 @@ static void test_precopy_common(MigrateCommon *args)
             qtest_set_expected_status(to, EXIT_FAILURE);
         }
     } else {
-        if (args->iterations) {
-            while (args->iterations--) {
-                wait_for_migration_pass(from);
-            }
-        } else {
-            wait_for_migration_pass(from);
-        }
+        wait_for_migration_pass(from);
 
         migrate_ensure_converge(from);
 
@@ -1514,8 +1505,6 @@ static void test_precopy_unix_xbzrle(void)
         .listen_uri = uri,
 
         .start_hook = test_migrate_xbzrle_start,
-
-        .iterations = 2,
     };
 
     test_precopy_common(&args);
The 'unsigned int iterations' config for migration is somewhat
overkill. Most tests don't set it, and a value of '0' is treated
as equivalent to '1'. The only test that does set it, xbzrle,
used a value of '2'.

This setting, however, only relates to the migration iterations
that take place prior to allowing convergence. IOW, on top of
this iteration count, there is always at least 1 further migration
iteration done to deal with pages that are dirtied during the
previous iteration(s).

IOW, even with iterations==1, the xbzrle test will be running for
a minimum of 2 iterations. With this in mind we can simplify the
code and just get rid of the special case.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/qtest/migration-test.c | 13 +------------
 1 file changed, 1 insertion(+), 12 deletions(-)
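The arithmetic in the commit message can be written out as a small sketch. It assumes, as the message states, that the test waits for a number of passes and that at least one more pass always follows before convergence. The helper names `toy_passes_waited` and `toy_min_total_passes` are hypothetical and are not part of migration-test.c.

```c
#include <assert.h>

/*
 * Hypothetical helper: passes the removed test loop waited for.
 * A value of 0 was treated the same as 1.
 */
static unsigned int toy_passes_waited(unsigned int iterations)
{
    return iterations ? iterations : 1;
}

/*
 * Hypothetical helper: lower bound on total migration passes, assuming
 * at least one further pass always runs after the waited-for ones.
 */
static unsigned int toy_min_total_passes(unsigned int iterations)
{
    unsigned int waited = toy_passes_waited(iterations);

    assert(waited >= 1);
    return waited + 1;
}
```

Under this model the removed `.iterations = 2` guaranteed the xbzrle test at least 3 passes, while the simplified code guarantees at least 2. That one-pass difference is what the review thread above is debating.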