Message ID | 20230713055819.30497-1-dongli.zhang@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/1] dump: kdump-zlib data pages not dumped with pvtime/aarch64 | expand |
Hi Marc-André, May I get any feedback on this bugfix? Feel free to let me know if I can help by collecting any data from the aarch64 server. Since the pvtime is supported by default by QEMU/linux, this indicates currently the dump does not work for aarch64 with new version of QEMU/linux. Thank you very much! Dongli Zhang On 7/12/23 22:58, Dongli Zhang wrote: > The kdump-zlib data pages are not dumped from aarch64 host when the > 'pvtime' is involved, that is, when the block->target_end is not aligned to > page_size. In the below example, it is expected to dump two blocks. > > (qemu) info mtree -f > ... ... > 00000000090a0000-00000000090a0fff (prio 0, ram): pvtime KVM > ... ... > 0000000040000000-00000001bfffffff (prio 0, ram): mach-virt.ram KVM > ... ... > > However, there is an issue with get_next_page() so that the pages for > "mach-virt.ram" will not be dumped. > > At line 1296, although we have reached at the end of the 'pvtime' block, > since it is not aligned to the page_size (e.g., 0x10000), it will not break > at line 1298. > > 1255 static bool get_next_page(GuestPhysBlock **blockptr, uint64_t *pfnptr, > 1256 uint8_t **bufptr, DumpState *s) > ... ... > 1294 memcpy(buf + addr % page_size, hbuf, n); > 1295 addr += n; > 1296 if (addr % page_size == 0) { > 1297 /* we filled up the page */ > 1298 break; > 1299 } > > As a result, get_next_page() will continue to the next > block ("mach-virt.ram"). Finally, when get_next_page() returns to the > caller: > > - 'pfnptr' is referring to the 'pvtime' > - but 'blockptr' is referring to the "mach-virt.ram" > > When get_next_page() is called the next time, "*pfnptr += 1" still refers > to the prior 'pvtime'. It will exit immediately because it is out of the > range of the current "mach-virt.ram". > > The fix is to break when it is time to come to the next block, so that both > 'pfnptr' and 'blockptr' refer to the same block. > > Fixes: 94d788408d2d ("dump: fix kdump to work over non-aligned blocks") > Cc: Joe Jin <joe.jin@oracle.com> > Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> > --- > dump/dump.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/dump/dump.c b/dump/dump.c > index 1f1a6edcab..c93e4c572f 100644 > --- a/dump/dump.c > +++ b/dump/dump.c > @@ -1293,8 +1293,8 @@ static bool get_next_page(GuestPhysBlock **blockptr, uint64_t *pfnptr, > > memcpy(buf + addr % page_size, hbuf, n); > addr += n; > - if (addr % page_size == 0) { > - /* we filled up the page */ > + if (addr % page_size == 0 || addr >= block->target_end) { > + /* we filled up the page or the current block is finished */ > break; > } > } else {
Hi On Thu, Jul 13, 2023 at 11:11 AM Dongli Zhang <dongli.zhang@oracle.com> wrote: > > The kdump-zlib data pages are not dumped from aarch64 host when the > 'pvtime' is involved, that is, when the block->target_end is not aligned to > page_size. In the below example, it is expected to dump two blocks. > > (qemu) info mtree -f > ... ... > 00000000090a0000-00000000090a0fff (prio 0, ram): pvtime KVM > ... ... > 0000000040000000-00000001bfffffff (prio 0, ram): mach-virt.ram KVM > ... ... > > However, there is an issue with get_next_page() so that the pages for > "mach-virt.ram" will not be dumped. > > At line 1296, although we have reached at the end of the 'pvtime' block, > since it is not aligned to the page_size (e.g., 0x10000), it will not break > at line 1298. > > 1255 static bool get_next_page(GuestPhysBlock **blockptr, uint64_t *pfnptr, > 1256 uint8_t **bufptr, DumpState *s) > ... ... > 1294 memcpy(buf + addr % page_size, hbuf, n); > 1295 addr += n; > 1296 if (addr % page_size == 0) { > 1297 /* we filled up the page */ > 1298 break; > 1299 } > > As a result, get_next_page() will continue to the next > block ("mach-virt.ram"). Finally, when get_next_page() returns to the > caller: > > - 'pfnptr' is referring to the 'pvtime' > - but 'blockptr' is referring to the "mach-virt.ram" > > When get_next_page() is called the next time, "*pfnptr += 1" still refers > to the prior 'pvtime'. It will exit immediately because it is out of the > range of the current "mach-virt.ram". > > The fix is to break when it is time to come to the next block, so that both > 'pfnptr' and 'blockptr' refer to the same block. > > Fixes: 94d788408d2d ("dump: fix kdump to work over non-aligned blocks") > Cc: Joe Jin <joe.jin@oracle.com> > Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> > --- > dump/dump.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/dump/dump.c b/dump/dump.c > index 1f1a6edcab..c93e4c572f 100644 > --- a/dump/dump.c > +++ b/dump/dump.c > @@ -1293,8 +1293,8 @@ static bool get_next_page(GuestPhysBlock **blockptr, uint64_t *pfnptr, > > memcpy(buf + addr % page_size, hbuf, n); > addr += n; > - if (addr % page_size == 0) { > - /* we filled up the page */ > + if (addr % page_size == 0 || addr >= block->target_end) { > + /* we filled up the page or the current block is finished */ > break; Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> > } > } else { > -- > 2.34.1 >
diff --git a/dump/dump.c b/dump/dump.c index 1f1a6edcab..c93e4c572f 100644 --- a/dump/dump.c +++ b/dump/dump.c @@ -1293,8 +1293,8 @@ static bool get_next_page(GuestPhysBlock **blockptr, uint64_t *pfnptr, memcpy(buf + addr % page_size, hbuf, n); addr += n; - if (addr % page_size == 0) { - /* we filled up the page */ + if (addr % page_size == 0 || addr >= block->target_end) { + /* we filled up the page or the current block is finished */ break; } } else {
The kdump-zlib data pages are not dumped from aarch64 host when the 'pvtime' is involved, that is, when the block->target_end is not aligned to page_size. In the below example, it is expected to dump two blocks. (qemu) info mtree -f ... ... 00000000090a0000-00000000090a0fff (prio 0, ram): pvtime KVM ... ... 0000000040000000-00000001bfffffff (prio 0, ram): mach-virt.ram KVM ... ... However, there is an issue with get_next_page() so that the pages for "mach-virt.ram" will not be dumped. At line 1296, although we have reached at the end of the 'pvtime' block, since it is not aligned to the page_size (e.g., 0x10000), it will not break at line 1298. 1255 static bool get_next_page(GuestPhysBlock **blockptr, uint64_t *pfnptr, 1256 uint8_t **bufptr, DumpState *s) ... ... 1294 memcpy(buf + addr % page_size, hbuf, n); 1295 addr += n; 1296 if (addr % page_size == 0) { 1297 /* we filled up the page */ 1298 break; 1299 } As a result, get_next_page() will continue to the next block ("mach-virt.ram"). Finally, when get_next_page() returns to the caller: - 'pfnptr' is referring to the 'pvtime' - but 'blockptr' is referring to the "mach-virt.ram" When get_next_page() is called the next time, "*pfnptr += 1" still refers to the prior 'pvtime'. It will exit immediately because it is out of the range of the current "mach-virt.ram". The fix is to break when it is time to come to the next block, so that both 'pfnptr' and 'blockptr' refer to the same block. Fixes: 94d788408d2d ("dump: fix kdump to work over non-aligned blocks") Cc: Joe Jin <joe.jin@oracle.com> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> --- dump/dump.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)