Message ID | 20191009164459.8209-1-msmarduch@digitalocean.com (mailing list archive) |
---|---|
Series | log guest name and memory error type AO, AR for MCEs |
On 09/10/19 18:44, Mario Smarduch wrote:
> In a large VPC environment we want to log memory error occurrences
> and log them with guest name and type - there are a few use cases
>
> - if VM crashes on AR mce inform the user about the reason and resolve the case
> - if VM hangs notify the user to reboot and resume processing
> - if VM continues to run let the user know, he/she may be able to correlate
>   to vm internal outage
> - Rowhammer attacks - isolate/determine the attacker, possibly migrating it off
>   the hypervisor
> - In general track memory errors on a hypervisor over time to determine trends
>
> Monitoring our fleet we come across quite a few of these and have been
> able to take action where before there were no clues to the causes.
>
> When a memory error occurs we get a log entry in the qemu log:
>
> Guest [Droplet-12345678] 2019-08-02T05:00:11.940270Z qemu-system-x86_64:
> Guest MCE Memory Error at QEMU addr 0x7f3c7622f000 and GUEST 0x78e42f000
> addr of type BUS_MCEERR_AR injected
>
> With an enterprise logging environment we can take further actions.
>
> v1 -> v2:
> - split into two patches, one to get the guest name, second to log MCEs
> - addressed comments for MCE logging
>
> Mario Smarduch (2):
>   util/qemu-error: add guest name helper with -msg options
>   target/i386: log MCE guest and host addresses
>
>  include/qemu/error-report.h |  1 +
>  qemu-options.hx             | 10 ++++++----
>  target/i386/kvm.c           | 29 ++++++++++++++++++++++++-----
>  util/qemu-error.c           | 31 +++++++++++++++++++++++++++++++
>  vl.c                        |  5 +++++
>  5 files changed, 67 insertions(+), 9 deletions(-)
>

Queued, thanks.

Paolo
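For readers who want to feed the log entry quoted in the cover letter into a logging pipeline, here is a minimal parsing sketch. It is not part of the series. The regular expression is inferred from the single sample line in the cover letter, so the exact field layout is an assumption, and the names `MCE_LINE` and `parse_mce_line` are hypothetical.

```python
import re
from typing import Optional

# Pattern inferred from the one sample entry in the cover letter; the real
# format may differ between QEMU versions and -msg settings (assumption).
MCE_LINE = re.compile(
    r"Guest \[(?P<guest>[^\]]+)\]\s+"
    r"(?P<timestamp>\S+)\s+"
    r"(?P<program>\S+):\s+"
    r"Guest MCE Memory Error at QEMU addr (?P<qemu_addr>0x[0-9a-fA-F]+)\s+"
    r"and GUEST (?P<guest_addr>0x[0-9a-fA-F]+)\s+"
    r"addr of type (?P<type>BUS_MCEERR_A[OR]) injected"
)


def parse_mce_line(line: str) -> Optional[dict]:
    """Return the fields of a guest MCE log entry, or None if no match."""
    match = MCE_LINE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert the hex addresses to integers so they can be compared or binned.
    record["qemu_addr"] = int(record["qemu_addr"], 16)
    record["guest_addr"] = int(record["guest_addr"], 16)
    return record


sample = (
    "Guest [Droplet-12345678] 2019-08-02T05:00:11.940270Z qemu-system-x86_64: "
    "Guest MCE Memory Error at QEMU addr 0x7f3c7622f000 and GUEST 0x78e42f000 "
    "addr of type BUS_MCEERR_AR injected"
)
record = parse_mce_line(sample)
```

The `type` field separates action-required (`BUS_MCEERR_AR`) from action-optional (`BUS_MCEERR_AO`) errors, which is the distinction the use cases in the cover letter depend on.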
Patchew URL: https://patchew.org/QEMU/20191009164459.8209-1-msmarduch@digitalocean.com/

Hi,

This series failed the docker-mingw@fedora build test. Please find the testing
commands and their output below. If you have Docker installed, you can probably
reproduce it locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      util/hbitmap.o
  CC      util/fifo8.o

Encoding error:
'utf-8' codec can't decode byte 0x95 in position 799: invalid start byte
The full traceback has been saved in /tmp/sphinx-err-qsfcd92y.log, if you want to report the issue to the developers.
  CC      util/cacheinfo.o
---
  CC      util/id.o
  CC      util/iov.o
  CC      util/qemu-config.o
make: *** [Makefile:994: docs/interop/index.html] Error 2
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 662, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=c8274a8a922d4d2c81b14b6b5c0902ca', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-vdk7fez9/src/docker-src.2019-10-09-18.53.09.17540:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=c8274a8a922d4d2c81b14b6b5c0902ca
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-vdk7fez9/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    2m36.643s
user    0m7.259s

The full log is available at
http://patchew.org/logs/20191009164459.8209-1-msmarduch@digitalocean.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
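The Sphinx failure above reports a byte (0x95) that is not valid UTF-8. The log does not say which file was being read, so the sketch below only shows one way to locate such bytes in a buffer. The function name `find_invalid_utf8` is hypothetical, and nothing here is taken from the QEMU tree. For what it is worth, 0x95 is the bullet character in Windows-1252, which would fit a bullet pasted into a docs source, but that is a guess.

```python
from typing import List, Tuple


def find_invalid_utf8(data: bytes) -> List[Tuple[int, int]]:
    """Return (offset, byte value) for each position where UTF-8 decoding fails."""
    bad = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the remainder decodes cleanly
        except UnicodeDecodeError as exc:
            offset = pos + exc.start  # exc.start is relative to the slice
            bad.append((offset, data[offset]))
            pos = offset + 1  # skip the offending byte and keep scanning
    return bad


# Illustrative input: a Windows-1252 bullet (0x95) inside otherwise ASCII text.
sample = b"ok " + b"\x95" + b" item"
hits = find_invalid_utf8(sample)
```

To check a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to the function; the offsets it reports are in the same terms as the position in the Sphinx message.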
On 10/09/2019 02:19 PM, Paolo Bonzini wrote:
> On 09/10/19 18:44, Mario Smarduch wrote:
>> In a large VPC environment we want to log memory error occurrences
>> and log them with guest name and type - there are a few use cases
>> [...]
>
> Queued, thanks.
>
> Paolo
>

Great, thanks for the fixup and y'all's time.

- Mario