diff mbox

target/aarch64: exit to main loop after handling MSR

Message ID 20170613225352.GA26288@flamenco (mailing list archive)
State New, archived
Headers show

Commit Message

Emilio Cota June 13, 2017, 10:53 p.m. UTC
The appended fixes it for me. Can you please test?
[ apply with `git am --scissors' ]

Thanks,

		Emilio

---- 8< ----

Commit e75449a3 ("target/aarch64: optimize indirect branches") causes
a regression by which aarch64 guests freeze under TCG with -smp > 1,
even with `-accel accel=tcg,thread=single' (i.e. MTTCG disabled).

I isolated the problem to the MSR handler. This patch forces an exit
after the handler is executed, which fixes the regression.

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 target/arm/translate-a64.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

no-reply@patchew.org June 13, 2017, 11:01 p.m. UTC | #1
Hi,

This series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Subject: [Qemu-devel] [PATCH] target/aarch64: exit to main loop after handling MSR
Type: series
Message-id: 20170613225352.GA26288@flamenco

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=8
time make docker-test-quick@centos6
time make docker-test-mingw@fedora
time make docker-test-build@min-glib
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]         patchew/20170613225352.GA26288@flamenco -> patchew/20170613225352.GA26288@flamenco
Switched to a new branch 'test'
308f131 target/aarch64: exit to main loop after handling MSR

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-7o9lu4by/src/dtc'...
Submodule path 'dtc': checked out '558cd81bdd432769b59bff01240c44f82cfb1a9d'
  BUILD   centos6
make[1]: Entering directory '/var/tmp/patchew-tester-tmp-7o9lu4by/src'
  ARCHIVE qemu.tgz
  ARCHIVE dtc.tgz
  COPY    RUNNER
    RUN test-quick in qemu:centos6 
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
ccache-3.1.6-2.el6.x86_64
epel-release-6-8.noarch
gcc-4.4.7-17.el6.x86_64
git-1.7.1-4.el6_7.1.x86_64
glib2-devel-2.28.8-5.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
make-3.81-23.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
tar-1.23-15.el6_8.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=libfdt-devel ccache     tar git make gcc g++     zlib-devel glib2-devel SDL-devel pixman-devel     epel-release
HOSTNAME=546e27c66238
TERM=xterm
MAKEFLAGS= -j8
HISTSIZE=1000
J=8
USER=root
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
MAIL=/var/spool/mail/root
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
LANG=en_US.UTF-8
TARGET_LIST=
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
TEST_DIR=/tmp/qemu-test
LOGNAME=root
LESSOPEN=||/usr/bin/lesspipe.sh %s
FEATURES= dtc
DEBUG=
G_BROKEN_FILENAMES=1
CCACHE_HASHDIR=
_=/usr/bin/env

Configure options:
--enable-werror --target-list=x86_64-softmmu,aarch64-softmmu --prefix=/var/tmp/qemu-build/install
/tmp/qemu-test/src/configure: line 4683: c++: command not found
No C++ compiler available; disabling C++ specific optional code
Install prefix    /var/tmp/qemu-build/install
BIOS directory    /var/tmp/qemu-build/install/share/qemu
binary directory  /var/tmp/qemu-build/install/bin
library directory /var/tmp/qemu-build/install/lib
module directory  /var/tmp/qemu-build/install/lib/qemu
libexec directory /var/tmp/qemu-build/install/libexec
include directory /var/tmp/qemu-build/install/include
config directory  /var/tmp/qemu-build/install/etc
local state directory   /var/tmp/qemu-build/install/var
Manual directory  /var/tmp/qemu-build/install/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path       /tmp/qemu-test/src
C compiler        cc
Host C compiler   cc
C++ compiler      
Objective-C compiler cc
ARFLAGS           rv
CFLAGS            -O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g 
QEMU_CFLAGS       -I/usr/include/pixman-1   -I$(SRC_PATH)/dtc/libfdt -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include   -fPIE -DPIE -m64 -mcx16 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv  -Wendif-labels -Wno-missing-include-dirs -Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wold-style-declaration -Wold-style-definition -Wtype-limits -fstack-protector-all
LDFLAGS           -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g 
make              make
install           install
python            python -B
smbd              /usr/sbin/smbd
module support    no
host CPU          x86_64
host big endian   no
target list       x86_64-softmmu aarch64-softmmu
tcg debug enabled no
gprof enabled     no
sparse enabled    no
strip binaries    yes
profiler          no
static build      no
pixman            system
SDL support       yes (1.2.14)
GTK support       no 
GTK GL support    no
VTE support       no 
TLS priority      NORMAL
GNUTLS support    no
GNUTLS rnd        no
libgcrypt         no
libgcrypt kdf     no
nettle            no 
nettle kdf        no
libtasn1          no
curses support    no
virgl support     no
curl support      no
mingw32 support   no
Audio drivers     oss
Block whitelist (rw) 
Block whitelist (ro) 
VirtFS support    no
VNC support       yes
VNC SASL support  no
VNC JPEG support  no
VNC PNG support   no
xen support       no
brlapi support    no
bluez  support    no
Documentation     no
PIE               yes
vde support       no
netmap support    no
Linux AIO support no
ATTR/XATTR support yes
Install blobs     yes
KVM support       yes
HAX support       no
RDMA support      no
TCG interpreter   no
fdt support       yes
preadv support    yes
fdatasync         yes
madvise           yes
posix_madvise     yes
libcap-ng support no
vhost-net support yes
vhost-scsi support yes
vhost-vsock support yes
Trace backends    log
spice support     no 
rbd support       no
xfsctl support    no
smartcard support no
libusb            no
usb net redir     no
OpenGL support    no
OpenGL dmabufs    no
libiscsi support  no
libnfs support    no
build guest agent yes
QGA VSS support   no
QGA w32 disk info no
QGA MSI support   no
seccomp support   no
coroutine backend ucontext
coroutine pool    yes
debug stack usage no
GlusterFS support no
gcov              gcov
gcov enabled      no
TPM support       yes
libssh2 support   no
TPM passthrough   yes
QOM debugging     yes
Live block migration yes
lzo support       no
snappy support    no
bzip2 support     no
NUMA host support no
tcmalloc support  no
jemalloc support  no
avx2 optimization no
replication support yes
VxHS block device no
  GEN     x86_64-softmmu/config-devices.mak.tmp
mkdir -p dtc/libfdt
mkdir -p dtc/tests
  GEN     aarch64-softmmu/config-devices.mak.tmp
  GEN     config-host.h
  GEN     qemu-options.def
  GEN     qmp-commands.h
  GEN     qapi-types.h
  GEN     qapi-visit.h
  GEN     qapi-event.h
  GEN     x86_64-softmmu/config-devices.mak
  GEN     qmp-marshal.c
  GEN     aarch64-softmmu/config-devices.mak
  GEN     qapi-types.c
  GEN     qapi-visit.c
  GEN     qapi-event.c
  GEN     qmp-introspect.h
  GEN     qmp-introspect.c
  GEN     trace/generated-tcg-tracers.h
  GEN     trace/generated-helpers-wrappers.h
  GEN     trace/generated-helpers.h
  GEN     trace/generated-helpers.c
  GEN     module_block.h
  GEN     tests/test-qapi-types.h
  GEN     tests/test-qapi-visit.h
  GEN     tests/test-qmp-commands.h
  GEN     tests/test-qapi-event.h
  GEN     tests/test-qmp-introspect.h
  GEN     trace-root.h
  GEN     util/trace.h
  GEN     crypto/trace.h
  GEN     io/trace.h
  GEN     migration/trace.h
  GEN     block/trace.h
  GEN     backends/trace.h
  GEN     chardev/trace.h
  GEN     hw/block/trace.h
  GEN     hw/block/dataplane/trace.h
  GEN     hw/char/trace.h
  GEN     hw/intc/trace.h
  GEN     hw/net/trace.h
  GEN     hw/virtio/trace.h
  GEN     hw/audio/trace.h
  GEN     hw/misc/trace.h
  GEN     hw/usb/trace.h
  GEN     hw/scsi/trace.h
  GEN     hw/nvram/trace.h
  GEN     hw/display/trace.h
  GEN     hw/input/trace.h
  GEN     hw/timer/trace.h
  GEN     hw/dma/trace.h
  GEN     hw/sparc/trace.h
  GEN     hw/sd/trace.h
  GEN     hw/isa/trace.h
  GEN     hw/mem/trace.h
  GEN     hw/i386/trace.h
  GEN     hw/i386/xen/trace.h
  GEN     hw/9pfs/trace.h
  GEN     hw/ppc/trace.h
  GEN     hw/pci/trace.h
  GEN     hw/s390x/trace.h
  GEN     hw/vfio/trace.h
  GEN     hw/acpi/trace.h
  GEN     hw/arm/trace.h
  GEN     hw/alpha/trace.h
  GEN     hw/xen/trace.h
  GEN     ui/trace.h
  GEN     audio/trace.h
  GEN     net/trace.h
  GEN     target/arm/trace.h
  GEN     target/i386/trace.h
  GEN     target/mips/trace.h
  GEN     target/sparc/trace.h
  GEN     target/s390x/trace.h
  GEN     target/ppc/trace.h
  GEN     qom/trace.h
  GEN     linux-user/trace.h
  GEN     trace-root.c
  GEN     qapi/trace.h
  GEN     util/trace.c
  GEN     crypto/trace.c
  GEN     io/trace.c
  GEN     migration/trace.c
  GEN     block/trace.c
  GEN     backends/trace.c
  GEN     chardev/trace.c
  GEN     hw/block/trace.c
  GEN     hw/block/dataplane/trace.c
  GEN     hw/char/trace.c
  GEN     hw/intc/trace.c
  GEN     hw/net/trace.c
  GEN     hw/virtio/trace.c
  GEN     hw/audio/trace.c
  GEN     hw/misc/trace.c
  GEN     hw/usb/trace.c
  GEN     hw/scsi/trace.c
  GEN     hw/nvram/trace.c
  GEN     hw/display/trace.c
  GEN     hw/input/trace.c
  GEN     hw/timer/trace.c
  GEN     hw/dma/trace.c
  GEN     hw/sparc/trace.c
  GEN     hw/sd/trace.c
  GEN     hw/isa/trace.c
  GEN     hw/mem/trace.c
  GEN     hw/i386/trace.c
  GEN     hw/i386/xen/trace.c
  GEN     hw/9pfs/trace.c
  GEN     hw/ppc/trace.c
  GEN     hw/pci/trace.c
  GEN     hw/s390x/trace.c
  GEN     hw/vfio/trace.c
  GEN     hw/acpi/trace.c
  GEN     hw/arm/trace.c
  GEN     hw/alpha/trace.c
  GEN     hw/xen/trace.c
  GEN     ui/trace.c
  GEN     audio/trace.c
  GEN     net/trace.c
  GEN     target/arm/trace.c
  GEN     target/i386/trace.c
  GEN     target/mips/trace.c
  GEN     target/sparc/trace.c
  GEN     target/s390x/trace.c
  GEN     target/ppc/trace.c
  GEN     qom/trace.c
  GEN     linux-user/trace.c
  GEN     qapi/trace.c
  GEN     config-all-devices.mak
	 DEP /tmp/qemu-test/src/dtc/tests/dumptrees.c
	 DEP /tmp/qemu-test/src/dtc/tests/trees.S
	 DEP /tmp/qemu-test/src/dtc/tests/testutils.c
	 DEP /tmp/qemu-test/src/dtc/tests/value-labels.c
	 DEP /tmp/qemu-test/src/dtc/tests/asm_tree_dump.c
	 DEP /tmp/qemu-test/src/dtc/tests/truncated_property.c
	 DEP /tmp/qemu-test/src/dtc/tests/check_path.c
	 DEP /tmp/qemu-test/src/dtc/tests/overlay_bad_fixup.c
	 DEP /tmp/qemu-test/src/dtc/tests/overlay.c
	 DEP /tmp/qemu-test/src/dtc/tests/subnode_iterate.c
	 DEP /tmp/qemu-test/src/dtc/tests/property_iterate.c
	 DEP /tmp/qemu-test/src/dtc/tests/integer-expressions.c
	 DEP /tmp/qemu-test/src/dtc/tests/utilfdt_test.c
	 DEP /tmp/qemu-test/src/dtc/tests/path_offset_aliases.c
	 DEP /tmp/qemu-test/src/dtc/tests/add_subnode_with_nops.c
	 DEP /tmp/qemu-test/src/dtc/tests/dtbs_equal_unordered.c
	 DEP /tmp/qemu-test/src/dtc/tests/dtb_reverse.c
	 DEP /tmp/qemu-test/src/dtc/tests/dtbs_equal_ordered.c
	 DEP /tmp/qemu-test/src/dtc/tests/extra-terminating-null.c
	 DEP /tmp/qemu-test/src/dtc/tests/incbin.c
	 DEP /tmp/qemu-test/src/dtc/tests/boot-cpuid.c
	 DEP /tmp/qemu-test/src/dtc/tests/phandle_format.c
	 DEP /tmp/qemu-test/src/dtc/tests/path-references.c
	 DEP /tmp/qemu-test/src/dtc/tests/references.c
	 DEP /tmp/qemu-test/src/dtc/tests/string_escapes.c
	 DEP /tmp/qemu-test/src/dtc/tests/propname_escapes.c
	 DEP /tmp/qemu-test/src/dtc/tests/appendprop2.c
	 DEP /tmp/qemu-test/src/dtc/tests/appendprop1.c
	 DEP /tmp/qemu-test/src/dtc/tests/del_node.c
	 DEP /tmp/qemu-test/src/dtc/tests/del_property.c
	 DEP /tmp/qemu-test/src/dtc/tests/setprop.c
	 DEP /tmp/qemu-test/src/dtc/tests/set_name.c
	 DEP /tmp/qemu-test/src/dtc/tests/rw_tree1.c
	 DEP /tmp/qemu-test/src/dtc/tests/open_pack.c
	 DEP /tmp/qemu-test/src/dtc/tests/nopulate.c
	 DEP /tmp/qemu-test/src/dtc/tests/mangle-layout.c
	 DEP /tmp/qemu-test/src/dtc/tests/move_and_save.c
	 DEP /tmp/qemu-test/src/dtc/tests/sw_tree1.c
	 DEP /tmp/qemu-test/src/dtc/tests/nop_node.c
	 DEP /tmp/qemu-test/src/dtc/tests/nop_property.c
	 DEP /tmp/qemu-test/src/dtc/tests/setprop_inplace.c
	 DEP /tmp/qemu-test/src/dtc/tests/stringlist.c
	 DEP /tmp/qemu-test/src/dtc/tests/sized_cells.c
	 DEP /tmp/qemu-test/src/dtc/tests/addr_size_cells.c
	 DEP /tmp/qemu-test/src/dtc/tests/notfound.c
	 DEP /tmp/qemu-test/src/dtc/tests/char_literal.c
	 DEP /tmp/qemu-test/src/dtc/tests/get_alias.c
	 DEP /tmp/qemu-test/src/dtc/tests/node_offset_by_compatible.c
	 DEP /tmp/qemu-test/src/dtc/tests/node_check_compatible.c
	 DEP /tmp/qemu-test/src/dtc/tests/node_offset_by_phandle.c
	 DEP /tmp/qemu-test/src/dtc/tests/parent_offset.c
	 DEP /tmp/qemu-test/src/dtc/tests/node_offset_by_prop_value.c
	 DEP /tmp/qemu-test/src/dtc/tests/supernode_atdepth_offset.c
	 DEP /tmp/qemu-test/src/dtc/tests/get_path.c
	 DEP /tmp/qemu-test/src/dtc/tests/get_phandle.c
	 DEP /tmp/qemu-test/src/dtc/tests/getprop.c
	 DEP /tmp/qemu-test/src/dtc/tests/get_name.c
	 DEP /tmp/qemu-test/src/dtc/tests/path_offset.c
	 DEP /tmp/qemu-test/src/dtc/tests/subnode_offset.c
	 DEP /tmp/qemu-test/src/dtc/tests/find_property.c
	 DEP /tmp/qemu-test/src/dtc/tests/root_node.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt_overlay.c
	 DEP /tmp/qemu-test/src/dtc/tests/get_mem_rsv.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt_addresses.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt_empty_tree.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt_rw.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt_sw.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt_strerror.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt_wip.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt.c
	 DEP /tmp/qemu-test/src/dtc/libfdt/fdt_ro.c
	 DEP /tmp/qemu-test/src/dtc/util.c
	 DEP /tmp/qemu-test/src/dtc/fdtput.c
	 DEP /tmp/qemu-test/src/dtc/fdtget.c
	 DEP /tmp/qemu-test/src/dtc/fdtdump.c
	 LEX convert-dtsv0-lexer.lex.c
make[1]: flex: Command not found
	 DEP /tmp/qemu-test/src/dtc/srcpos.c
	 BISON dtc-parser.tab.c
make[1]: bison: Command not found
	 LEX dtc-lexer.lex.c
make[1]: flex: Command not found
	 DEP /tmp/qemu-test/src/dtc/livetree.c
	 DEP /tmp/qemu-test/src/dtc/treesource.c
	 DEP /tmp/qemu-test/src/dtc/fstree.c
	 DEP /tmp/qemu-test/src/dtc/flattree.c
	 DEP /tmp/qemu-test/src/dtc/dtc.c
	 DEP /tmp/qemu-test/src/dtc/data.c
	 DEP /tmp/qemu-test/src/dtc/checks.c
	CHK version_gen.h
	 LEX convert-dtsv0-lexer.lex.c
make[1]: flex: Command not found
	 BISON dtc-parser.tab.c
	 LEX dtc-lexer.lex.c
make[1]: flex: Command not found
make[1]: bison: Command not found
	UPD version_gen.h
	 DEP /tmp/qemu-test/src/dtc/util.c
	 LEX convert-dtsv0-lexer.lex.c
	 BISON dtc-parser.tab.c
make[1]: flex: Command not found
make[1]: bison: Command not found
	 LEX dtc-lexer.lex.c
make[1]: flex: Command not found
	 CC libfdt/fdt.o
	 CC libfdt/fdt_ro.o
	 CC libfdt/fdt_rw.o
	 CC libfdt/fdt_strerror.o
	 CC libfdt/fdt_wip.o
	 CC libfdt/fdt_sw.o
	 CC libfdt/fdt_empty_tree.o
	 CC libfdt/fdt_addresses.o
	 CC libfdt/fdt_overlay.o
	 AR libfdt/libfdt.a
ar: creating libfdt/libfdt.a
a - libfdt/fdt.o
a - libfdt/fdt_ro.o
a - libfdt/fdt_wip.o
a - libfdt/fdt_sw.o
a - libfdt/fdt_rw.o
a - libfdt/fdt_strerror.o
a - libfdt/fdt_empty_tree.o
a - libfdt/fdt_addresses.o
a - libfdt/fdt_overlay.o
	 BISON dtc-parser.tab.c
	 LEX dtc-lexer.lex.c
make[1]: bison: Command not found
make[1]: flex: Command not found
	 LEX convert-dtsv0-lexer.lex.c
make[1]: flex: Command not found
  CC      tests/qemu-iotests/socket_scm_helper.o
  GEN     qga/qapi-generated/qga-qapi-visit.h
  GEN     qga/qapi-generated/qga-qapi-types.h
  GEN     qga/qapi-generated/qga-qmp-commands.h
  GEN     qga/qapi-generated/qga-qapi-types.c
  GEN     qga/qapi-generated/qga-qapi-visit.c
  CC      qmp-introspect.o
  GEN     qga/qapi-generated/qga-qmp-marshal.c
  CC      qapi-visit.o
  CC      qapi-types.o
  CC      qapi-event.o
  CC      qapi/qapi-visit-core.o
  CC      qapi/qapi-dealloc-visitor.o
  CC      qapi/qobject-input-visitor.o
  CC      qapi/qobject-output-visitor.o
  CC      qapi/qmp-dispatch.o
  CC      qapi/qmp-registry.o
  CC      qapi/string-input-visitor.o
  CC      qapi/string-output-visitor.o
  CC      qapi/qapi-clone-visitor.o
  CC      qapi/opts-visitor.o
  CC      qapi/qmp-event.o
  CC      qapi/qapi-util.o
  CC      qobject/qnull.o
  CC      qobject/qint.o
  CC      qobject/qdict.o
  CC      qobject/qstring.o
  CC      qobject/qlist.o
  CC      qobject/qfloat.o
  CC      qobject/qbool.o
  CC      qobject/qjson.o
  CC      qobject/qobject.o
  CC      qobject/json-lexer.o
  CC      qobject/json-streamer.o
  CC      qobject/json-parser.o
  CC      trace/control.o
  CC      trace/qmp.o
  CC      util/osdep.o
  CC      util/cutils.o
  CC      util/unicode.o
  CC      util/bufferiszero.o
  CC      util/qemu-timer-common.o
  CC      util/lockcnt.o
  CC      util/aiocb.o
  CC      util/async.o
  CC      util/thread-pool.o
  CC      util/main-loop.o
  CC      util/qemu-timer.o
  CC      util/iohandler.o
  CC      util/aio-posix.o
  CC      util/compatfd.o
  CC      util/mmap-alloc.o
  CC      util/event_notifier-posix.o
  CC      util/oslib-posix.o
  CC      util/qemu-openpty.o
  CC      util/qemu-thread-posix.o
  CC      util/memfd.o
  CC      util/envlist.o
  CC      util/path.o
  CC      util/module.o
  CC      util/host-utils.o
  CC      util/bitmap.o
  CC      util/fifo8.o
  CC      util/bitops.o
  CC      util/hbitmap.o
  CC      util/acl.o
  CC      util/error.o
  CC      util/qemu-error.o
  CC      util/iov.o
  CC      util/qemu-config.o
  CC      util/id.o
  CC      util/qemu-sockets.o
  CC      util/uri.o
  CC      util/qemu-option.o
  CC      util/notify.o
  CC      util/keyval.o
  CC      util/qemu-progress.o
  CC      util/hexdump.o
  CC      util/crc32c.o
  CC      util/uuid.o
  CC      util/throttle.o
  CC      util/getauxval.o
  CC      util/readline.o
  CC      util/qemu-coroutine.o
  CC      util/rcu.o
  CC      util/qemu-coroutine-lock.o
  CC      util/qemu-coroutine-io.o
  CC      util/coroutine-ucontext.o
  CC      util/buffer.o
  CC      util/qemu-coroutine-sleep.o
  CC      util/timed-average.o
  CC      util/qdist.o
  CC      util/log.o
  CC      util/base64.o
  CC      util/qht.o
  CC      util/range.o
  CC      util/systemd.o
  CC      util/trace.o
  CC      trace-root.o
  CC      crypto/trace.o
  CC      io/trace.o
  CC      migration/trace.o
  CC      block/trace.o
  CC      backends/trace.o
  CC      chardev/trace.o
  CC      hw/block/trace.o
  CC      hw/block/dataplane/trace.o
  CC      hw/char/trace.o
  CC      hw/intc/trace.o
  CC      hw/net/trace.o
  CC      hw/virtio/trace.o
  CC      hw/audio/trace.o
  CC      hw/misc/trace.o
  CC      hw/usb/trace.o
  CC      hw/scsi/trace.o
  CC      hw/nvram/trace.o
  CC      hw/display/trace.o
  CC      hw/input/trace.o
  CC      hw/timer/trace.o
  CC      hw/dma/trace.o
  CC      hw/sparc/trace.o
  CC      hw/sd/trace.o
  CC      hw/isa/trace.o
  CC      hw/mem/trace.o
  CC      hw/i386/xen/trace.o
  CC      hw/i386/trace.o
  CC      hw/9pfs/trace.o
  CC      hw/ppc/trace.o
  CC      hw/pci/trace.o
  CC      hw/s390x/trace.o
  CC      hw/vfio/trace.o
  CC      hw/acpi/trace.o
  CC      hw/arm/trace.o
  CC      hw/alpha/trace.o
  CC      hw/xen/trace.o
  CC      audio/trace.o
  CC      net/trace.o
  CC      ui/trace.o
  CC      target/arm/trace.o
  CC      target/i386/trace.o
  CC      target/mips/trace.o
  CC      target/sparc/trace.o
  CC      target/s390x/trace.o
  CC      target/ppc/trace.o
  CC      linux-user/trace.o
  CC      qom/trace.o
  CC      qapi/trace.o
  CC      crypto/pbkdf-stub.o
  CC      stubs/arch-query-cpu-model-expansion.o
  CC      stubs/arch-query-cpu-def.o
  CC      stubs/arch-query-cpu-model-comparison.o
  CC      stubs/arch-query-cpu-model-baseline.o
  CC      stubs/bdrv-next-monitor-owned.o
  CC      stubs/blk-commit-all.o
  CC      stubs/blockdev-close-all-bdrv-states.o
  CC      stubs/cpu-get-clock.o
  CC      stubs/clock-warp.o
  CC      stubs/cpu-get-icount.o
  CC      stubs/dump.o
  CC      stubs/error-printf.o
  CC      stubs/gdbstub.o
  CC      stubs/fdset.o
  CC      stubs/get-vm-name.o
  CC      stubs/iothread.o
  CC      stubs/iothread-lock.o
  CC      stubs/is-daemonized.o
  CC      stubs/machine-init-done.o
  CC      stubs/migr-blocker.o
  CC      stubs/monitor.o
  CC      stubs/notify-event.o
  CC      stubs/qtest.o
  CC      stubs/replay.o
  CC      stubs/runstate-check.o
  CC      stubs/set-fd-handler.o
  CC      stubs/slirp.o
  CC      stubs/sysbus.o
  CC      stubs/trace-control.o
  CC      stubs/uuid.o
  CC      stubs/vm-stop.o
  CC      stubs/vmstate.o
  CC      stubs/qmp_pc_dimm_device_list.o
  CC      stubs/target-monitor-defs.o
  CC      stubs/target-get-monitor-def.o
  CC      stubs/pc_madt_cpu_entry.o
  CC      stubs/vmgenid.o
  CC      stubs/xen-common.o
  CC      stubs/xen-hvm.o
  CC      contrib/ivshmem-client/main.o
  CC      contrib/ivshmem-client/ivshmem-client.o
  CC      contrib/ivshmem-server/ivshmem-server.o
  CC      contrib/ivshmem-server/main.o
  CC      qemu-nbd.o
  CC      block.o
  CC      blockjob.o
  CC      replication.o
  CC      qemu-io-cmds.o
  CC      block/qcow.o
  CC      block/raw-format.o
  CC      block/vdi.o
  CC      block/cloop.o
  CC      block/vmdk.o
  CC      block/bochs.o
  CC      block/vpc.o
  CC      block/vvfat.o
  CC      block/dmg.o
  CC      block/qcow2.o
  CC      block/qcow2-refcount.o
  CC      block/qcow2-cluster.o
  CC      block/qcow2-snapshot.o
  CC      block/qcow2-cache.o
  CC      block/qed.o
  CC      block/qed-l2-cache.o
  CC      block/qed-gencb.o
  CC      block/qed-table.o
  CC      block/qed-cluster.o
  CC      block/qed-check.o
  CC      block/vhdx-endian.o
  CC      block/vhdx.o
  CC      block/vhdx-log.o
  CC      block/quorum.o
  CC      block/blkdebug.o
  CC      block/parallels.o
  CC      block/blkverify.o
  CC      block/blkreplay.o
  CC      block/block-backend.o
  CC      block/snapshot.o
  CC      block/qapi.o
  CC      block/file-posix.o
  CC      block/null.o
  CC      block/mirror.o
  CC      block/commit.o
  CC      block/io.o
  CC      block/throttle-groups.o
  CC      block/nbd.o
  CC      block/nbd-client.o
  CC      block/sheepdog.o
  CC      block/accounting.o
  CC      block/dirty-bitmap.o
  CC      block/write-threshold.o
  CC      block/backup.o
  CC      block/replication.o
  CC      block/crypto.o
  CC      nbd/server.o
  CC      nbd/client.o
  CC      nbd/common.o
  CC      crypto/init.o
  CC      crypto/hash.o
  CC      crypto/hash-glib.o
  CC      crypto/hmac-glib.o
  CC      crypto/hmac.o
  CC      crypto/aes.o
  CC      crypto/desrfb.o
  CC      crypto/cipher.o
  CC      crypto/tlscreds.o
  CC      crypto/tlscredsx509.o
  CC      crypto/tlscredsanon.o
  CC      crypto/tlssession.o
  CC      crypto/secret.o
  CC      crypto/random-platform.o
  CC      crypto/pbkdf.o
  CC      crypto/ivgen.o
  CC      crypto/ivgen-essiv.o
  CC      crypto/ivgen-plain.o
  CC      crypto/ivgen-plain64.o
  CC      crypto/afsplit.o
  CC      crypto/xts.o
  CC      crypto/block.o
  CC      crypto/block-qcow.o
  CC      crypto/block-luks.o
  CC      io/channel.o
  CC      io/channel-buffer.o
  CC      io/channel-command.o
  CC      io/channel-file.o
  CC      io/channel-socket.o
  CC      io/channel-tls.o
  CC      io/channel-watch.o
  CC      io/channel-websock.o
  CC      io/channel-util.o
  CC      io/dns-resolver.o
  CC      io/task.o
  CC      qom/object.o
  CC      qom/qom-qobject.o
  CC      qom/container.o
  CC      qom/object_interfaces.o
  GEN     qemu-img-cmds.h
  CC      qemu-io.o
  CC      blockdev.o
  CC      qemu-bridge-helper.o
  CC      blockdev-nbd.o
  CC      iothread.o
  CC      qdev-monitor.o
  CC      device-hotplug.o
  CC      os-posix.o
  CC      accel.o
  CC      bt-host.o
  CC      bt-vhci.o
  CC      dma-helpers.o
  CC      vl.o
  CC      tpm.o
  CC      device_tree.o
  CC      qmp-marshal.o
  CC      qmp.o
  CC      hmp.o
  CC      cpus-common.o
  CC      audio/audio.o
  CC      audio/noaudio.o
  CC      audio/wavaudio.o
  CC      audio/mixeng.o
  CC      audio/sdlaudio.o
  CC      audio/ossaudio.o
  CC      audio/wavcapture.o
  CC      backends/rng.o
  CC      backends/rng-egd.o
  CC      backends/rng-random.o
  CC      backends/tpm.o
  CC      backends/hostmem.o
  CC      backends/hostmem-ram.o
  CC      backends/hostmem-file.o
  CC      backends/cryptodev.o
  CC      backends/cryptodev-builtin.o
  CC      block/stream.o
  CC      chardev/msmouse.o
  CC      chardev/wctablet.o
  CC      chardev/testdev.o
  CC      disas/i386.o
  CC      fsdev/qemu-fsdev-dummy.o
  CC      fsdev/qemu-fsdev-opts.o
  CC      disas/arm.o
  CC      fsdev/qemu-fsdev-throttle.o
  CC      hw/acpi/core.o
  CC      hw/acpi/piix4.o
  CC      hw/acpi/pcihp.o
  CC      hw/acpi/tco.o
  CC      hw/acpi/cpu_hotplug.o
  CC      hw/acpi/memory_hotplug.o
  CC      hw/acpi/ich9.o
  CC      hw/acpi/cpu.o
  CC      hw/acpi/nvdimm.o
  CC      hw/acpi/vmgenid.o
  CC      hw/acpi/acpi_interface.o
  CC      hw/acpi/bios-linker-loader.o
  CC      hw/acpi/ipmi.o
  CC      hw/acpi/acpi-stub.o
  CC      hw/acpi/ipmi-stub.o
  CC      hw/audio/sb16.o
  CC      hw/acpi/aml-build.o
  CC      hw/audio/es1370.o
  CC      hw/audio/ac97.o
  CC      hw/audio/adlib.o
  CC      hw/audio/gus.o
  CC      hw/audio/fmopl.o
  CC      hw/audio/gusemu_hal.o
  CC      hw/audio/gusemu_mixer.o
  CC      hw/audio/cs4231a.o
  CC      hw/audio/intel-hda.o
  CC      hw/audio/hda-codec.o
  CC      hw/audio/pcspk.o
  CC      hw/audio/wm8750.o
  CC      hw/audio/pl041.o
  CC      hw/audio/lm4549.o
  CC      hw/audio/marvell_88w8618.o
  CC      hw/audio/soundhw.o
  CC      hw/block/block.o
  CC      hw/block/hd-geometry.o
  CC      hw/block/cdrom.o
  CC      hw/block/fdc.o
  CC      hw/block/m25p80.o
  CC      hw/block/nand.o
  CC      hw/block/pflash_cfi01.o
  CC      hw/block/pflash_cfi02.o
  CC      hw/block/ecc.o
  CC      hw/block/onenand.o
  CC      hw/block/nvme.o
  CC      hw/bt/core.o
  CC      hw/bt/l2cap.o
  CC      hw/bt/sdp.o
  CC      hw/bt/hci.o
  CC      hw/bt/hid.o
  CC      hw/char/ipoctal232.o
  CC      hw/bt/hci-csr.o
  CC      hw/char/parallel.o
  CC      hw/char/pl011.o
  CC      hw/char/serial.o
  CC      hw/char/serial-isa.o
  CC      hw/char/serial-pci.o
  CC      hw/char/virtio-console.o
  CC      hw/char/cadence_uart.o
  CC      hw/char/debugcon.o
  CC      hw/core/qdev.o
  CC      hw/core/qdev-properties.o
  CC      hw/char/imx_serial.o
  CC      hw/core/bus.o
  CC      hw/core/reset.o
  CC      hw/core/fw-path-provider.o
  CC      hw/core/irq.o
  CC      hw/core/hotplug.o
  CC      hw/core/ptimer.o
  CC      hw/core/sysbus.o
  CC      hw/core/nmi.o
  CC      hw/core/loader.o
  CC      hw/core/machine.o
  CC      hw/core/register.o
  CC      hw/core/qdev-properties-system.o
  CC      hw/core/platform-bus.o
  CC      hw/cpu/core.o
  CC      hw/core/or-irq.o
  CC      hw/display/ads7846.o
  CC      hw/display/cirrus_vga.o
  CC      hw/display/pl110.o
  CC      hw/display/ssd0323.o
  CC      hw/display/ssd0303.o
  CC      hw/display/vga-pci.o
  CC      hw/display/vmware_vga.o
  CC      hw/display/blizzard.o
  CC      hw/display/vga-isa.o
  CC      hw/display/exynos4210_fimd.o
  CC      hw/display/framebuffer.o
  CC      hw/display/tc6393xb.o
  CC      hw/dma/pl080.o
  CC      hw/dma/pl330.o
  CC      hw/dma/i8257.o
  CC      hw/dma/xlnx-zynq-devcfg.o
  CC      hw/gpio/max7310.o
  CC      hw/gpio/pl061.o
  CC      hw/gpio/zaurus.o
  CC      hw/gpio/gpio_key.o
  CC      hw/i2c/core.o
  CC      hw/i2c/smbus.o
  CC      hw/i2c/smbus_eeprom.o
  CC      hw/i2c/i2c-ddc.o
  CC      hw/i2c/versatile_i2c.o
  CC      hw/i2c/smbus_ich9.o
  CC      hw/i2c/pm_smbus.o
  CC      hw/i2c/bitbang_i2c.o
  CC      hw/i2c/exynos4210_i2c.o
  CC      hw/i2c/aspeed_i2c.o
  CC      hw/i2c/imx_i2c.o
  CC      hw/ide/core.o
  CC      hw/ide/qdev.o
  CC      hw/ide/atapi.o
  CC      hw/ide/pci.o
  CC      hw/ide/isa.o
  CC      hw/ide/piix.o
  CC      hw/ide/microdrive.o
  CC      hw/ide/ahci.o
  CC      hw/input/hid.o
  CC      hw/ide/ich.o
  CC      hw/input/lm832x.o
  CC      hw/input/pckbd.o
  CC      hw/input/pl050.o
  CC      hw/input/ps2.o
  CC      hw/input/stellaris_input.o
  CC      hw/input/tsc2005.o
  CC      hw/input/virtio-input.o
  CC      hw/input/vmmouse.o
  CC      hw/input/virtio-input-hid.o
  CC      hw/input/virtio-input-host.o
  CC      hw/intc/i8259_common.o
  CC      hw/intc/pl190.o
  CC      hw/intc/i8259.o
  CC      hw/intc/realview_gic.o
  CC      hw/intc/imx_avic.o
  CC      hw/intc/ioapic_common.o
  CC      hw/intc/arm_gic_common.o
  CC      hw/intc/arm_gic.o
  CC      hw/intc/arm_gicv2m.o
  CC      hw/intc/arm_gicv3_common.o
  CC      hw/intc/arm_gicv3.o
  CC      hw/intc/arm_gicv3_dist.o
  CC      hw/intc/arm_gicv3_redist.o
  CC      hw/intc/arm_gicv3_its_common.o
  CC      hw/intc/intc.o
  CC      hw/ipack/ipack.o
  CC      hw/ipmi/ipmi.o
  CC      hw/ipack/tpci200.o
  CC      hw/ipmi/ipmi_bmc_sim.o
  CC      hw/ipmi/ipmi_bmc_extern.o
  CC      hw/ipmi/isa_ipmi_kcs.o
  CC      hw/ipmi/isa_ipmi_bt.o
  CC      hw/isa/isa-bus.o
  CC      hw/isa/apm.o
  CC      hw/mem/pc-dimm.o
  CC      hw/misc/applesmc.o
  CC      hw/misc/max111x.o
  CC      hw/mem/nvdimm.o
  CC      hw/misc/tmp105.o
  CC      hw/misc/tmp421.o
  CC      hw/misc/debugexit.o
  CC      hw/misc/sga.o
  CC      hw/misc/pc-testdev.o
  CC      hw/misc/pci-testdev.o
  CC      hw/misc/unimp.o
  CC      hw/misc/arm_integrator_debug.o
  CC      hw/misc/arm_l2x0.o
  CC      hw/misc/a9scu.o
  CC      hw/misc/arm11scu.o
  CC      hw/net/ne2000.o
  CC      hw/net/eepro100.o
  CC      hw/net/pcnet-pci.o
  CC      hw/net/pcnet.o
  CC      hw/net/e1000.o
  CC      hw/net/e1000x_common.o
  CC      hw/net/net_tx_pkt.o
  CC      hw/net/net_rx_pkt.o
  CC      hw/net/e1000e_core.o
  CC      hw/net/e1000e.o
  CC      hw/net/rtl8139.o
  CC      hw/net/vmxnet3.o
  CC      hw/net/smc91c111.o
  CC      hw/net/ne2000-isa.o
  CC      hw/net/lan9118.o
  CC      hw/net/xgmac.o
  CC      hw/net/allwinner_emac.o
  CC      hw/net/imx_fec.o
  CC      hw/net/cadence_gem.o
  CC      hw/net/stellaris_enet.o
  CC      hw/net/ftgmac100.o
  CC      hw/net/rocker/rocker.o
  CC      hw/net/rocker/rocker_fp.o
  CC      hw/net/rocker/rocker_desc.o
  CC      hw/net/rocker/rocker_world.o
  CC      hw/net/rocker/rocker_of_dpa.o
  CC      hw/nvram/eeprom93xx.o
  CC      hw/nvram/fw_cfg.o
  CC      hw/nvram/chrp_nvram.o
  CC      hw/pci-bridge/pci_bridge_dev.o
  CC      hw/pci-bridge/pcie_root_port.o
  CC      hw/pci-bridge/gen_pcie_root_port.o
  CC      hw/pci-bridge/pci_expander_bridge.o
  CC      hw/pci-bridge/xio3130_upstream.o
  CC      hw/pci-bridge/xio3130_downstream.o
  CC      hw/pci-bridge/ioh3420.o
  CC      hw/pci-bridge/i82801b11.o
  CC      hw/pci-host/pam.o
  CC      hw/pci-host/versatile.o
  CC      hw/pci-host/piix.o
  CC      hw/pci-host/q35.o
  CC      hw/pci-host/gpex.o
  CC      hw/pci/pci.o
  CC      hw/pci/pci_bridge.o
  CC      hw/pci/msix.o
  CC      hw/pci/msi.o
  CC      hw/pci/shpc.o
  CC      hw/pci/slotid_cap.o
  CC      hw/pci/pci_host.o
  CC      hw/pci/pcie_host.o
  CC      hw/pci/pcie.o
  CC      hw/pci/pcie_aer.o
  CC      hw/pci/pcie_port.o
  CC      hw/pci/pci-stub.o
  CC      hw/scsi/scsi-disk.o
  CC      hw/pcmcia/pcmcia.o
  CC      hw/scsi/scsi-generic.o
  CC      hw/scsi/scsi-bus.o
  CC      hw/scsi/lsi53c895a.o
  CC      hw/scsi/mptconfig.o
  CC      hw/scsi/mptsas.o
  CC      hw/scsi/mptendian.o
  CC      hw/scsi/megasas.o
  CC      hw/scsi/vmw_pvscsi.o
  CC      hw/scsi/esp.o
  CC      hw/scsi/esp-pci.o
  CC      hw/sd/pl181.o
  CC      hw/sd/ssi-sd.o
In file included from /tmp/qemu-test/src/hw/net/vmxnet3.c:30:
/tmp/qemu-test/src/include/migration/register.h:18: error: redefinition of typedef ‘LoadStateHandler’
/tmp/qemu-test/src/include/migration/vmstate.h:32: note: previous declaration of ‘LoadStateHandler’ was here
make: *** [hw/net/vmxnet3.o] Error 1
make: *** Waiting for unfinished jobs....
tests/docker/Makefile.include:118: recipe for target 'docker-run' failed
make[1]: *** [docker-run] Error 2
make[1]: Leaving directory '/var/tmp/patchew-tester-tmp-7o9lu4by/src'
tests/docker/Makefile.include:149: recipe for target 'docker-run-test-quick@centos6' failed
make: *** [docker-run-test-quick@centos6] Error 2
=== OUTPUT END ===

Test command exited with code: 2


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@freelists.org
Richard Henderson June 14, 2017, 4:48 a.m. UTC | #2
On 06/13/2017 03:53 PM, Emilio G. Cota wrote:
> The appended fixes it for me. Can you please test?
> [ apply with `git am --scissors' ]
> 
> Thanks,
> 
> 		Emilio
> 
> ---- 8< ----
> 
> Commit e75449a3 ("target/aarch64: optimize indirect branches") causes
> a regression by which aarch64 guests freeze under TCG with -smp > 1,
> even with `-accel accel=tcg,thread=single' (i.e. MTTCG disabled).
> 
> I isolated the problem to the MSR handler. This patch forces an exit
> after the handler is executed, which fixes the regression.

Why would that be?  The cpu_get_tb_cpu_state within helper_lookup_tb_ptr is 
supposed to read the new state that the msr handler would have installed.


r~
Alex Bennée June 14, 2017, 10:38 a.m. UTC | #3
Emilio G. Cota <cota@braap.org> writes:

> The appended fixes it for me. Can you please test?
> [ apply with `git am --scissors' ]
>
> Thanks,
>
> 		Emilio
>
> ---- 8< ----
>
> Commit e75449a3 ("target/aarch64: optimize indirect branches") causes
> a regression by which aarch64 guests freeze under TCG with -smp > 1,
> even with `-accel accel=tcg,thread=single' (i.e. MTTCG disabled).
>
> I isolated the problem to the MSR handler. This patch forces an exit
> after the handler is executed, which fixes the regression.
>
> Signed-off-by: Emilio G. Cota <cota@braap.org>

Tested-by: Alex Bennée <alex.bennee@linaro.org>

But what exactly is the mechanism here? DISAS_UPDATE should have ensured
that the PC was updated before we get to the helper. Is this a case of
msr_i_pstate somehow getting missed or not causing a flag update which
confuses the next TB calculation?

> ---
>  target/arm/translate-a64.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
> index 860e279..5a609a0 100644
> --- a/target/arm/translate-a64.c
> +++ b/target/arm/translate-a64.c
> @@ -1422,7 +1422,7 @@ static void handle_msr_i(DisasContext *s, uint32_t insn,
>          gen_helper_msr_i_pstate(cpu_env, tcg_op, tcg_imm);
>          tcg_temp_free_i32(tcg_imm);
>          tcg_temp_free_i32(tcg_op);
> -        s->is_jmp = DISAS_UPDATE;
> +        s->is_jmp = DISAS_EXIT;
>          break;
>      }
>      default:
> @@ -11362,6 +11362,10 @@ void gen_intermediate_code_a64(ARMCPU *cpu, TranslationBlock *tb)
>          case DISAS_NEXT:
>              gen_goto_tb(dc, 1, dc->pc);
>              break;
> +        case DISAS_EXIT:
> +            gen_a64_set_pc_im(dc->pc);
> +            tcg_gen_exit_tb(0);
> +            break;
>          default:
>          case DISAS_UPDATE:
>              gen_a64_set_pc_im(dc->pc);


--
Alex Bennée
Paolo Bonzini June 14, 2017, 10:46 a.m. UTC | #4
On 14/06/2017 06:48, Richard Henderson wrote:
>>
>> Commit e75449a3 ("target/aarch64: optimize indirect branches") causes
>> a regression by which aarch64 guests freeze under TCG with -smp > 1,
>> even with `-accel accel=tcg,thread=single' (i.e. MTTCG disabled).
>>
>> I isolated the problem to the MSR handler. This patch forces an exit
>> after the handler is executed, which fixes the regression.
> 
> Why would that be?  The cpu_get_tb_cpu_state within helper_lookup_tb_ptr
> is supposed to read the new state that the msr handler would have
> installed.

Could some of these cause an interrupt, or some other change in the
cpu_exec flow?

Thanks,

Paolo
Alex Bennée June 14, 2017, 11:45 a.m. UTC | #5
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 14/06/2017 06:48, Richard Henderson wrote:
>>>
>>> Commit e75449a3 ("target/aarch64: optimize indirect branches") causes
>>> a regression by which aarch64 guests freeze under TCG with -smp > 1,
>>> even with `-accel accel=tcg,thread=single' (i.e. MTTCG disabled).
>>>
>>> I isolated the problem to the MSR handler. This patch forces an exit
>>> after the handler is executed, which fixes the regression.
>>
>> Why would that be?  The cpu_get_tb_cpu_state within helper_lookup_tb_ptr
>> is supposed to read the new state that the msr handler would have
>> installed.
>
> Could some of these cause an interrupt, or some other change in the
> cpu_exec flow?

Well what I was observing was the secondary_start_kernel stalling and
leaving the main cpu spinning. The msr is actually:

	local_irq_enable();
	local_fiq_enable();

Which I assume would re-enable IRQs if they are ready to go. However I
guess if we sink into our cpu_idle without exiting the main loop we
never set any pending IRQs?

>
> Thanks,
>
> Paolo


--
Alex Bennée
Paolo Bonzini June 14, 2017, 12:02 p.m. UTC | #6
On 14/06/2017 13:45, Alex Bennée wrote:
> 
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> On 14/06/2017 06:48, Richard Henderson wrote:
>>>>
>>>> Commit e75449a3 ("target/aarch64: optimize indirect branches") causes
>>>> a regression by which aarch64 guests freeze under TCG with -smp > 1,
>>>> even with `-accel accel=tcg,thread=single' (i.e. MTTCG disabled).
>>>>
>>>> I isolated the problem to the MSR handler. This patch forces an exit
>>>> after the handler is executed, which fixes the regression.
>>>
>>> Why would that be?  The cpu_get_tb_cpu_state within helper_lookup_tb_ptr
>>> is supposed to read the new state that the msr handler would have
>>> installed.
>>
>> Could some of these cause an interrupt, or some other change in the
>> cpu_exec flow?
> 
> Well what I was observing was the secondary_start_kernel stalling and
> leaving the main cpu spinning. The msr is actually:
> 
> 	local_irq_enable();
> 	local_fiq_enable();
> 
> Which I assume would re-enable IRQs if they are ready to go. However I
> guess if we sink into our cpu_idle without exiting the main loop we
> never set any pending IRQs?

Then Emilio's patch, if a bit of a heavy hammer, is correct. After
aa64_daif_write needs you need an exit_tb so that arm_cpu_exec_interrupt
is executed again.

Compare with this from the x86 front-end:

        /* if irq were inhibited with HF_INHIBIT_IRQ_MASK, we clear
           the flag and abort the translation to give the irqs a
           change to be happen */
        if (dc->tf || dc->singlestep_enabled ||
            (flags & HF_INHIBIT_IRQ_MASK)) {
            gen_jmp_im(pc_ptr - dc->cs_base);
            gen_eob(dc);
            break;
        }

(This triggers one instruction after a STI instruction, due to how x86's
"interrupt shadow" work, so it doesn't happen immediately after
helper_sti; but the idea is the same).

Paolo
Alex Bennée June 14, 2017, 12:14 p.m. UTC | #7
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 14/06/2017 13:45, Alex Bennée wrote:
>>
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>>
>>> On 14/06/2017 06:48, Richard Henderson wrote:
>>>>>
>>>>> Commit e75449a3 ("target/aarch64: optimize indirect branches") causes
>>>>> a regression by which aarch64 guests freeze under TCG with -smp > 1,
>>>>> even with `-accel accel=tcg,thread=single' (i.e. MTTCG disabled).
>>>>>
>>>>> I isolated the problem to the MSR handler. This patch forces an exit
>>>>> after the handler is executed, which fixes the regression.
>>>>
>>>> Why would that be?  The cpu_get_tb_cpu_state within helper_lookup_tb_ptr
>>>> is supposed to read the new state that the msr handler would have
>>>> installed.
>>>
>>> Could some of these cause an interrupt, or some other change in the
>>> cpu_exec flow?
>>
>> Well what I was observing was the secondary_start_kernel stalling and
>> leaving the main cpu spinning. The msr is actually:
>>
>> 	local_irq_enable();
>> 	local_fiq_enable();
>>
>> Which I assume would re-enable IRQs if they are ready to go. However I
>> guess if we sink into our cpu_idle without exiting the main loop we
>> never set any pending IRQs?
>
> Then Emilio's patch, if a bit of a heavy hammer, is correct. After
> aa64_daif_write needs you need an exit_tb so that arm_cpu_exec_interrupt
> is executed again.

This is a case of cpu->interrupt_request being pending but not having
set cpu->icount_decr yet to signal the exit. Wouldn't another approach
(that didn't involve futzing with each front-end) to be to check
cpu->interrupt_request and force the exit in lookup_tb_ptr?

>
> Compare with this from the x86 front-end:
>
>         /* if irq were inhibited with HF_INHIBIT_IRQ_MASK, we clear
>            the flag and abort the translation to give the irqs a
>            change to be happen */
>         if (dc->tf || dc->singlestep_enabled ||
>             (flags & HF_INHIBIT_IRQ_MASK)) {
>             gen_jmp_im(pc_ptr - dc->cs_base);
>             gen_eob(dc);
>             break;
>         }
>
> (This triggers one instruction after a STI instruction, due to how x86's
> "interrupt shadow" work, so it doesn't happen immediately after
> helper_sti; but the idea is the same).
>
> Paolo


--
Alex Bennée
Paolo Bonzini June 14, 2017, 12:16 p.m. UTC | #8
On 14/06/2017 14:14, Alex Bennée wrote:
>> Then Emilio's patch, if a bit of a heavy hammer, is correct. After
>> aa64_daif_write needs you need an exit_tb so that arm_cpu_exec_interrupt
>> is executed again.
> 
> This is a case of cpu->interrupt_request being pending but not having
> set cpu->icount_decr yet to signal the exit.

Rather than "yet", "anymore".  So far it has always been an invariant
that anything that re-enabled an interrupt had to do exit_tb.

> Wouldn't another approach
> (that didn't involve futzing with each front-end) to be to check
> cpu->interrupt_request and force the exit in lookup_tb_ptr?

That would cause an unnecessary slowdown in code that runs with
interrupts disabled but does a lot of indirect jumps...  ppc's SLOF
firmware probably qualifies.

Paolo
Alex Bennée June 14, 2017, 12:35 p.m. UTC | #9
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 14/06/2017 14:14, Alex Bennée wrote:
>>> Then Emilio's patch, if a bit of a heavy hammer, is correct. After
>>> aa64_daif_write needs you need an exit_tb so that arm_cpu_exec_interrupt
>>> is executed again.
>>
>> This is a case of cpu->interrupt_request being pending but not having
>> set cpu->icount_decr yet to signal the exit.
>
> Rather than "yet", "anymore".  So far it has always been an invariant
> that anything that re-enabled an interrupt had to do exit_tb.
>
>> Wouldn't another approach
>> (that didn't involve futzing with each front-end) to be to check
>> cpu->interrupt_request and force the exit in lookup_tb_ptr?
>
> That would cause an unnecessary slowdown in code that runs with
> interrupts disabled but does a lot of indirect jumps...  ppc's SLOF
> firmware probably qualifies.

Really? I'd have to measure the change it makes. Is there a benchmark
stanza for measuring the PPC slof firmware time?

I have 3 patches now which all fix the same thing so we can pick and
choose which we should apply. Patches incoming...

--
Alex Bennée
Paolo Bonzini June 14, 2017, 12:43 p.m. UTC | #10
On 14/06/2017 14:35, Alex Bennée wrote:
>> That would cause an unnecessary slowdown in code that runs with
>> interrupts disabled but does a lot of indirect jumps...  ppc's SLOF
>> firmware probably qualifies.
> 
> Really?

Yes. :)  SLOF basically runs a Forth interpreter.  If you run
"qemu-system-ppc64 -d in_asm", you'll see a lot of "bctr" and "bctrl"
instruction (respectively ARM's "br" and "blr" IIRC).

> I'd have to measure the change it makes. Is there a benchmark
> stanza for measuring the PPC slof firmware time?

Just booting.  PPC doesn't have tcg_gen_lookup_and_goto_ptr support yet,
so it would be a theoretical slowdown at this time.

Paolo

> I have 3 patches now which all fix the same thing so we can pick and
> choose which we should apply. Patches incoming...
diff mbox

Patch

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 860e279..5a609a0 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1422,7 +1422,7 @@  static void handle_msr_i(DisasContext *s, uint32_t insn,
         gen_helper_msr_i_pstate(cpu_env, tcg_op, tcg_imm);
         tcg_temp_free_i32(tcg_imm);
         tcg_temp_free_i32(tcg_op);
-        s->is_jmp = DISAS_UPDATE;
+        s->is_jmp = DISAS_EXIT;
         break;
     }
     default:
@@ -11362,6 +11362,10 @@  void gen_intermediate_code_a64(ARMCPU *cpu, TranslationBlock *tb)
         case DISAS_NEXT:
             gen_goto_tb(dc, 1, dc->pc);
             break;
+        case DISAS_EXIT:
+            gen_a64_set_pc_im(dc->pc);
+            tcg_gen_exit_tb(0);
+            break;
         default:
         case DISAS_UPDATE:
             gen_a64_set_pc_im(dc->pc);