mbox series

[for-9.2,00/10] s390: Convert virtio-ccw, cpu to three-phase reset, and followup cleanup

Message ID 20240813165250.2717650-1-peter.maydell@linaro.org (mailing list archive)
Headers show
Series s390: Convert virtio-ccw, cpu to three-phase reset, and followup cleanup | expand

Message

Peter Maydell Aug. 13, 2024, 4:52 p.m. UTC
The main aim of this patchseries is to remove the two remaining uses
of device_class_set_parent_reset() in the tree, which are virtio-ccw
and the s390 CPU class. Doing that lets us do some followup cleanup.
(The diffstat looks alarming but is almost all coccinelle automated
changes.)

virtio-ccw is a simple conversion, because it's a leaf class and no
other code interacts with the reset function. So we convert to
three-phase and resettable_class_set_parent_phases().

The s390 CPU is trickier, because s390 has its own set of different
kinds of reset (the S390_CPU_RESET_* enum values). At the moment we
have a single function implementing the reset, which always chains to
the parent DeviceClass::reset, and which can be called both from the
CPU object's own DeviceClass::reset method and also from other s390
code.

I opted to handle this by adding two new S390-CPU-specific reset types
to the Resettable ResetType enum.  Now instead of having an underlying
implementation of reset that is s390-specific and which might be
called either directly or via the DeviceClass::reset method, we can
implement only the Resettable hold phase method, and have the places
that need to trigger an s390-specific reset type do so by calling
resettable_reset().
                            
The other option would have been to smuggle in the s390 reset type
via, for instance, a field in the CPU state that we set in
s390_do_cpu_initial_reset() etc and then examined in the reset method,
but doing it this way seems cleaner.

The rest of the series is largely cleanup. Some small stuff:
 * remove the now-unused device_class_set_parent_reset() function
 * remove some CPU state struct fields in alpha and hppa that
   were simply unused
 * fix an accidental comma in xilinx_axidma that confused a
   Coccinelle script
 * hw/remote/message.c should not be directly invoking the
   DeviceClass::reset method

There are also patches here which convert devices from directly
setting:
   dc->reset = my_reset_function;
to using a function call:
   device_class_set_legacy_reset(dc, my_reset_function);

The conversion is entirely scripted using a coccinelle patch, and it
is responsible for the large diffstat here.

If people feel this is unnecessary churn I'm open to that argument,
but my motivation here is:
 * it means that people writing device code see the "legacy" word
   and get a hint that a new QEMU device probably should be using
   something else (i.e. Resettable)
 * it makes it easier to find the places in the code that are
   still using legacy-reset with a simple grep

In particular, having done that makes it easier to rename the
DeviceState::reset field to DeviceState::legacy_reset, and that in
turn is how I found that hw/remote/ was still directly invoking it.

The last patch removes device_phases_reset(), which was the part of
the reset transition glue that supported "device is reset via the
legacy reset method but it is a three-phase device".  Now we know
there is no way to directly invoke legacy-reset any more, we can drop
that bit of glue. (The opposite direction, "device is reset via a
three-phase-aware method but only implements the legacy reset method",
is still supported.)

I am pondering the idea of a further Coccinelle-driven conversion
of device_class_set_legacy_reset() calls into three-phase setting
of the Resettable hold phase method. This would be a zero behaviour
change, but the difficult part will be where we have devices
which make direct function calls to their reset function as well
as registering it as the reset method; I'm not sure if Coccinelle
can do "match functions which are not referenced elsewhere in the
file" checks.

Note that my testing here has only been "make check" and
"make check-avocado", which I know doesn't really exercise reset
very heavily.

thanks
-- PMM

Peter Maydell (10):
  hw/s390/virtio-ccw: Convert to three-phase reset
  target/s390: Convert CPU to Resettable interface
  hw: Remove device_class_set_parent_reset()
  target/alpha, hppa: Remove unused parent_reset fields
  hw/dma/xilinx_axidma: Use semicolon at end of statement, not comma
  hw/remote/message.c: Don't directly invoke DeviceClass:reset
  hw: Define new device_class_set_legacy_reset()
  hw: Use device_class_set_legacy_reset() instead of opencoding
  hw: Rename DeviceClass::reset field to legacy_reset
  hw: Remove device_phases_reset()

 docs/devel/reset.rst                  | 10 +++++
 scripts/coccinelle/device-reset.cocci | 30 ++++++++++++++
 hw/s390x/virtio-ccw.h                 |  2 +-
 include/hw/qdev-core.h                | 33 +++++++--------
 include/hw/resettable.h               |  2 +
 target/alpha/cpu.h                    |  2 -
 target/hppa/cpu.h                     |  2 -
 target/s390x/cpu.h                    | 21 ++--------
 hw/acpi/erst.c                        |  2 +-
 hw/acpi/piix4.c                       |  2 +-
 hw/adc/aspeed_adc.c                   |  2 +-
 hw/adc/max111x.c                      |  2 +-
 hw/adc/stm32f2xx_adc.c                |  2 +-
 hw/adc/zynq-xadc.c                    |  2 +-
 hw/arm/armsse.c                       |  2 +-
 hw/arm/highbank.c                     |  2 +-
 hw/arm/musicpal.c                     |  6 +--
 hw/arm/pxa2xx.c                       |  4 +-
 hw/arm/strongarm.c                    |  4 +-
 hw/audio/ac97.c                       |  2 +-
 hw/audio/cs4231.c                     |  2 +-
 hw/audio/cs4231a.c                    |  2 +-
 hw/audio/es1370.c                     |  2 +-
 hw/audio/hda-codec.c                  |  2 +-
 hw/audio/intel-hda.c                  |  2 +-
 hw/audio/marvell_88w8618.c            |  2 +-
 hw/audio/pl041.c                      |  2 +-
 hw/audio/via-ac97.c                   |  2 +-
 hw/block/fdc-isa.c                    |  2 +-
 hw/block/fdc-sysbus.c                 |  2 +-
 hw/block/m25p80.c                     |  2 +-
 hw/block/nand.c                       |  2 +-
 hw/block/onenand.c                    |  2 +-
 hw/block/pflash_cfi01.c               |  2 +-
 hw/block/pflash_cfi02.c               |  2 +-
 hw/block/swim.c                       |  2 +-
 hw/char/avr_usart.c                   |  2 +-
 hw/char/cmsdk-apb-uart.c              |  2 +-
 hw/char/digic-uart.c                  |  2 +-
 hw/char/escc.c                        |  2 +-
 hw/char/etraxfs_ser.c                 |  2 +-
 hw/char/exynos4210_uart.c             |  2 +-
 hw/char/goldfish_tty.c                |  2 +-
 hw/char/grlib_apbuart.c               |  2 +-
 hw/char/ibex_uart.c                   |  2 +-
 hw/char/imx_serial.c                  |  2 +-
 hw/char/mcf_uart.c                    |  2 +-
 hw/char/mchp_pfsoc_mmuart.c           |  2 +-
 hw/char/nrf51_uart.c                  |  2 +-
 hw/char/pl011.c                       |  2 +-
 hw/char/renesas_sci.c                 |  2 +-
 hw/char/sclpconsole-lm.c              |  2 +-
 hw/char/sclpconsole.c                 |  2 +-
 hw/char/sh_serial.c                   |  2 +-
 hw/char/shakti_uart.c                 |  2 +-
 hw/char/stm32f2xx_usart.c             |  2 +-
 hw/char/xilinx_uartlite.c             |  2 +-
 hw/core/or-irq.c                      |  2 +-
 hw/core/qdev.c                        | 58 +++++++--------------------
 hw/cxl/switch-mailbox-cci.c           |  2 +-
 hw/display/artist.c                   |  2 +-
 hw/display/ati.c                      |  2 +-
 hw/display/bcm2835_fb.c               |  2 +-
 hw/display/cg3.c                      |  2 +-
 hw/display/dpcd.c                     |  2 +-
 hw/display/exynos4210_fimd.c          |  2 +-
 hw/display/g364fb.c                   |  2 +-
 hw/display/i2c-ddc.c                  |  2 +-
 hw/display/jazz_led.c                 |  2 +-
 hw/display/macfb.c                    |  4 +-
 hw/display/qxl.c                      |  2 +-
 hw/display/sii9022.c                  |  2 +-
 hw/display/sm501.c                    |  4 +-
 hw/display/tcx.c                      |  2 +-
 hw/display/vga-isa.c                  |  2 +-
 hw/display/vga-mmio.c                 |  2 +-
 hw/display/vga-pci.c                  |  2 +-
 hw/display/vmware_vga.c               |  2 +-
 hw/display/xlnx_dp.c                  |  2 +-
 hw/dma/bcm2835_dma.c                  |  2 +-
 hw/dma/i8257.c                        |  2 +-
 hw/dma/pl080.c                        |  2 +-
 hw/dma/pl330.c                        |  2 +-
 hw/dma/rc4030.c                       |  2 +-
 hw/dma/sparc32_dma.c                  |  2 +-
 hw/dma/xilinx_axidma.c                |  4 +-
 hw/dma/xlnx-zdma.c                    |  2 +-
 hw/dma/xlnx-zynq-devcfg.c             |  2 +-
 hw/dma/xlnx_csu_dma.c                 |  2 +-
 hw/dma/xlnx_dpdma.c                   |  2 +-
 hw/fsi/aspeed_apb2opb.c               |  2 +-
 hw/fsi/fsi-master.c                   |  2 +-
 hw/fsi/fsi.c                          |  2 +-
 hw/fsi/lbus.c                         |  2 +-
 hw/gpio/aspeed_gpio.c                 |  2 +-
 hw/gpio/bcm2835_gpio.c                |  2 +-
 hw/gpio/bcm2838_gpio.c                |  2 +-
 hw/gpio/gpio_key.c                    |  2 +-
 hw/gpio/imx_gpio.c                    |  2 +-
 hw/gpio/max7310.c                     |  2 +-
 hw/gpio/mpc8xxx.c                     |  2 +-
 hw/gpio/nrf51_gpio.c                  |  2 +-
 hw/gpio/omap_gpio.c                   |  4 +-
 hw/gpio/pca9552.c                     |  2 +-
 hw/gpio/pca9554.c                     |  2 +-
 hw/gpio/pcf8574.c                     |  2 +-
 hw/gpio/sifive_gpio.c                 |  2 +-
 hw/hyperv/hyperv.c                    |  2 +-
 hw/hyperv/vmbus.c                     |  2 +-
 hw/i2c/aspeed_i2c.c                   |  4 +-
 hw/i2c/bcm2835_i2c.c                  |  2 +-
 hw/i2c/exynos4210_i2c.c               |  2 +-
 hw/i2c/imx_i2c.c                      |  2 +-
 hw/i2c/microbit_i2c.c                 |  2 +-
 hw/i2c/mpc_i2c.c                      |  2 +-
 hw/i2c/omap_i2c.c                     |  2 +-
 hw/i2c/ppc4xx_i2c.c                   |  2 +-
 hw/i2c/smbus_eeprom.c                 |  2 +-
 hw/i386/amd_iommu.c                   |  2 +-
 hw/i386/intel_iommu.c                 |  2 +-
 hw/i386/kvm/i8254.c                   |  2 +-
 hw/i386/kvm/i8259.c                   |  2 +-
 hw/i386/kvm/ioapic.c                  |  2 +-
 hw/i386/kvm/xen_overlay.c             |  2 +-
 hw/i386/port92.c                      |  2 +-
 hw/i386/vapic.c                       |  2 +-
 hw/i386/vmmouse.c                     |  2 +-
 hw/i386/xen/xen_platform.c            |  2 +-
 hw/ide/ahci.c                         |  2 +-
 hw/ide/cmd646.c                       |  2 +-
 hw/ide/ich.c                          |  2 +-
 hw/ide/isa.c                          |  2 +-
 hw/ide/macio.c                        |  2 +-
 hw/ide/microdrive.c                   |  2 +-
 hw/ide/mmio.c                         |  2 +-
 hw/ide/piix.c                         |  4 +-
 hw/ide/sii3112.c                      |  2 +-
 hw/ide/via.c                          |  2 +-
 hw/input/adb-kbd.c                    |  2 +-
 hw/input/adb-mouse.c                  |  2 +-
 hw/input/lm832x.c                     |  2 +-
 hw/input/pckbd.c                      |  4 +-
 hw/intc/allwinner-a10-pic.c           |  2 +-
 hw/intc/apic_common.c                 |  2 +-
 hw/intc/armv7m_nvic.c                 |  2 +-
 hw/intc/aspeed_intc.c                 |  2 +-
 hw/intc/aspeed_vic.c                  |  2 +-
 hw/intc/bcm2835_ic.c                  |  2 +-
 hw/intc/bcm2836_control.c             |  2 +-
 hw/intc/exynos4210_combiner.c         |  2 +-
 hw/intc/goldfish_pic.c                |  2 +-
 hw/intc/grlib_irqmp.c                 |  2 +-
 hw/intc/heathrow_pic.c                |  2 +-
 hw/intc/i8259.c                       |  2 +-
 hw/intc/imx_avic.c                    |  2 +-
 hw/intc/imx_gpcv2.c                   |  2 +-
 hw/intc/ioapic.c                      |  2 +-
 hw/intc/loongarch_extioi.c            |  2 +-
 hw/intc/loongarch_pch_pic.c           |  2 +-
 hw/intc/m68k_irqc.c                   |  2 +-
 hw/intc/omap_intc.c                   |  4 +-
 hw/intc/openpic.c                     |  2 +-
 hw/intc/openpic_kvm.c                 |  2 +-
 hw/intc/pl190.c                       |  2 +-
 hw/intc/ppc-uic.c                     |  2 +-
 hw/intc/s390_flic.c                   |  2 +-
 hw/intc/s390_flic_kvm.c               |  2 +-
 hw/intc/sifive_plic.c                 |  2 +-
 hw/intc/slavio_intctl.c               |  2 +-
 hw/intc/xlnx-pmu-iomod-intc.c         |  2 +-
 hw/intc/xlnx-zynqmp-ipi.c             |  2 +-
 hw/isa/lpc_ich9.c                     |  2 +-
 hw/isa/pc87312.c                      |  2 +-
 hw/isa/piix.c                         |  2 +-
 hw/isa/vt82c686.c                     | 10 ++---
 hw/m68k/mcf5206.c                     |  2 +-
 hw/m68k/mcf_intc.c                    |  2 +-
 hw/m68k/next-cube.c                   |  2 +-
 hw/m68k/next-kbd.c                    |  2 +-
 hw/mem/cxl_type3.c                    |  2 +-
 hw/misc/a9scu.c                       |  2 +-
 hw/misc/allwinner-cpucfg.c            |  2 +-
 hw/misc/allwinner-h3-ccu.c            |  2 +-
 hw/misc/allwinner-h3-dramc.c          |  2 +-
 hw/misc/allwinner-h3-sysctrl.c        |  2 +-
 hw/misc/allwinner-r40-ccu.c           |  2 +-
 hw/misc/allwinner-r40-dramc.c         |  2 +-
 hw/misc/allwinner-sid.c               |  2 +-
 hw/misc/allwinner-sramc.c             |  2 +-
 hw/misc/applesmc.c                    |  2 +-
 hw/misc/arm_l2x0.c                    |  2 +-
 hw/misc/arm_sysctl.c                  |  2 +-
 hw/misc/armsse-cpu-pwrctrl.c          |  2 +-
 hw/misc/armsse-mhu.c                  |  2 +-
 hw/misc/aspeed_hace.c                 |  2 +-
 hw/misc/aspeed_i3c.c                  |  4 +-
 hw/misc/aspeed_lpc.c                  |  2 +-
 hw/misc/aspeed_peci.c                 |  2 +-
 hw/misc/aspeed_sbc.c                  |  2 +-
 hw/misc/aspeed_scu.c                  | 10 ++---
 hw/misc/aspeed_sdmc.c                 |  4 +-
 hw/misc/aspeed_xdma.c                 |  2 +-
 hw/misc/avr_power.c                   |  2 +-
 hw/misc/bcm2835_cprman.c              |  8 ++--
 hw/misc/bcm2835_mbox.c                |  2 +-
 hw/misc/bcm2835_mphi.c                |  2 +-
 hw/misc/bcm2835_powermgt.c            |  2 +-
 hw/misc/bcm2835_rng.c                 |  2 +-
 hw/misc/bcm2835_thermal.c             |  2 +-
 hw/misc/eccmemctl.c                   |  2 +-
 hw/misc/exynos4210_clk.c              |  2 +-
 hw/misc/exynos4210_pmu.c              |  2 +-
 hw/misc/exynos4210_rng.c              |  2 +-
 hw/misc/imx25_ccm.c                   |  2 +-
 hw/misc/imx31_ccm.c                   |  2 +-
 hw/misc/imx6_ccm.c                    |  2 +-
 hw/misc/imx6_src.c                    |  2 +-
 hw/misc/imx6ul_ccm.c                  |  2 +-
 hw/misc/imx7_ccm.c                    |  4 +-
 hw/misc/imx7_snvs.c                   |  2 +-
 hw/misc/imx7_src.c                    |  2 +-
 hw/misc/imx_rngc.c                    |  2 +-
 hw/misc/iotkit-secctl.c               |  2 +-
 hw/misc/iotkit-sysctl.c               |  2 +-
 hw/misc/ivshmem.c                     |  2 +-
 hw/misc/lasi.c                        |  2 +-
 hw/misc/led.c                         |  2 +-
 hw/misc/macio/cuda.c                  |  2 +-
 hw/misc/macio/gpio.c                  |  2 +-
 hw/misc/macio/mac_dbdma.c             |  2 +-
 hw/misc/macio/pmu.c                   |  2 +-
 hw/misc/mips_cmgcr.c                  |  2 +-
 hw/misc/mips_cpc.c                    |  2 +-
 hw/misc/mips_itu.c                    |  2 +-
 hw/misc/mps2-fpgaio.c                 |  2 +-
 hw/misc/mps2-scc.c                    |  2 +-
 hw/misc/msf2-sysreg.c                 |  2 +-
 hw/misc/nrf51_rng.c                   |  2 +-
 hw/misc/pci-testdev.c                 |  2 +-
 hw/misc/sifive_e_aon.c                |  2 +-
 hw/misc/sifive_u_prci.c               |  2 +-
 hw/misc/slavio_misc.c                 |  2 +-
 hw/misc/stm32f2xx_syscfg.c            |  2 +-
 hw/misc/stm32f4xx_exti.c              |  2 +-
 hw/misc/stm32f4xx_syscfg.c            |  2 +-
 hw/misc/tz-mpc.c                      |  2 +-
 hw/misc/tz-msc.c                      |  2 +-
 hw/misc/tz-ppc.c                      |  2 +-
 hw/misc/virt_ctrl.c                   |  2 +-
 hw/misc/xlnx-versal-cfu.c             |  2 +-
 hw/net/allwinner-sun8i-emac.c         |  2 +-
 hw/net/allwinner_emac.c               |  2 +-
 hw/net/cadence_gem.c                  |  2 +-
 hw/net/can/can_kvaser_pci.c           |  2 +-
 hw/net/can/can_mioe3680_pci.c         |  2 +-
 hw/net/can/can_pcm3680_pci.c          |  2 +-
 hw/net/can/ctucan_pci.c               |  2 +-
 hw/net/can/xlnx-versal-canfd.c        |  2 +-
 hw/net/dp8393x.c                      |  2 +-
 hw/net/etraxfs_eth.c                  |  2 +-
 hw/net/fsl_etsec/etsec.c              |  2 +-
 hw/net/ftgmac100.c                    |  4 +-
 hw/net/imx_fec.c                      |  2 +-
 hw/net/lan9118.c                      |  2 +-
 hw/net/lance.c                        |  2 +-
 hw/net/lasi_i82596.c                  |  2 +-
 hw/net/mcf_fec.c                      |  2 +-
 hw/net/mipsnet.c                      |  2 +-
 hw/net/msf2-emac.c                    |  2 +-
 hw/net/npcm7xx_emc.c                  |  2 +-
 hw/net/npcm_gmac.c                    |  2 +-
 hw/net/opencores_eth.c                |  2 +-
 hw/net/pcnet-pci.c                    |  2 +-
 hw/net/rocker/rocker.c                |  2 +-
 hw/net/rtl8139.c                      |  2 +-
 hw/net/smc91c111.c                    |  2 +-
 hw/net/stellaris_enet.c               |  2 +-
 hw/net/sungem.c                       |  2 +-
 hw/net/sunhme.c                       |  2 +-
 hw/net/tulip.c                        |  2 +-
 hw/net/vmxnet3.c                      |  2 +-
 hw/net/xilinx_axienet.c               |  2 +-
 hw/net/xilinx_ethlite.c               |  2 +-
 hw/nvme/ctrl.c                        |  2 +-
 hw/nvram/eeprom_at24c.c               |  2 +-
 hw/nvram/fw_cfg.c                     |  2 +-
 hw/nvram/mac_nvram.c                  |  2 +-
 hw/nvram/nrf51_nvm.c                  |  2 +-
 hw/pci-bridge/cxl_downstream.c        |  2 +-
 hw/pci-bridge/cxl_upstream.c          |  2 +-
 hw/pci-bridge/i82801b11.c             |  2 +-
 hw/pci-bridge/pci_bridge_dev.c        |  2 +-
 hw/pci-bridge/pci_expander_bridge.c   |  2 +-
 hw/pci-bridge/pcie_pci_bridge.c       |  2 +-
 hw/pci-bridge/simba.c                 |  2 +-
 hw/pci-bridge/xio3130_downstream.c    |  2 +-
 hw/pci-bridge/xio3130_upstream.c      |  2 +-
 hw/pci-host/astro.c                   |  4 +-
 hw/pci-host/designware.c              |  2 +-
 hw/pci-host/dino.c                    |  2 +-
 hw/pci-host/gt64120.c                 |  2 +-
 hw/pci-host/mv64361.c                 |  2 +-
 hw/pci-host/ppc440_pcix.c             |  2 +-
 hw/pci-host/q35.c                     |  2 +-
 hw/pci-host/sabre.c                   |  2 +-
 hw/pci-host/versatile.c               |  2 +-
 hw/pci-host/xilinx-pcie.c             |  2 +-
 hw/ppc/pnv_psi.c                      |  4 +-
 hw/ppc/ppc405_boards.c                |  2 +-
 hw/ppc/ppc405_uc.c                    | 12 +++---
 hw/ppc/ppc4xx_devs.c                  |  6 +--
 hw/ppc/ppc4xx_sdram.c                 |  4 +-
 hw/ppc/ppce500_spin.c                 |  2 +-
 hw/ppc/spapr_cpu_core.c               |  2 +-
 hw/ppc/spapr_iommu.c                  |  2 +-
 hw/ppc/spapr_pci.c                    |  2 +-
 hw/ppc/spapr_vio.c                    |  2 +-
 hw/remote/message.c                   |  5 +--
 hw/remote/proxy.c                     |  2 +-
 hw/rtc/allwinner-rtc.c                |  2 +-
 hw/rtc/aspeed_rtc.c                   |  2 +-
 hw/rtc/ds1338.c                       |  2 +-
 hw/rtc/exynos4210_rtc.c               |  2 +-
 hw/rtc/goldfish_rtc.c                 |  2 +-
 hw/rtc/ls7a_rtc.c                     |  2 +-
 hw/rtc/m48t59-isa.c                   |  2 +-
 hw/rtc/m48t59.c                       |  2 +-
 hw/rtc/xlnx-zynqmp-rtc.c              |  2 +-
 hw/s390x/ccw-device.c                 |  2 +-
 hw/s390x/event-facility.c             |  2 +-
 hw/s390x/ipl.c                        |  2 +-
 hw/s390x/s390-pci-bus.c               |  4 +-
 hw/s390x/sclpquiesce.c                |  2 +-
 hw/s390x/virtio-ccw.c                 | 13 +++---
 hw/scsi/esp-pci.c                     |  2 +-
 hw/scsi/esp.c                         |  2 +-
 hw/scsi/lsi53c895a.c                  |  2 +-
 hw/scsi/megasas.c                     |  2 +-
 hw/scsi/mptsas.c                      |  2 +-
 hw/scsi/scsi-disk.c                   |  2 +-
 hw/scsi/scsi-generic.c                |  2 +-
 hw/scsi/vmw_pvscsi.c                  |  2 +-
 hw/sd/allwinner-sdhost.c              |  2 +-
 hw/sd/aspeed_sdhci.c                  |  2 +-
 hw/sd/bcm2835_sdhost.c                |  2 +-
 hw/sd/cadence_sdhci.c                 |  2 +-
 hw/sd/npcm7xx_sdhci.c                 |  2 +-
 hw/sd/pl181.c                         |  2 +-
 hw/sd/pxa2xx_mmci.c                   |  2 +-
 hw/sd/sd.c                            |  2 +-
 hw/sd/sdhci.c                         |  2 +-
 hw/sd/ssi-sd.c                        |  2 +-
 hw/sensor/dps310.c                    |  2 +-
 hw/sensor/emc141x.c                   |  2 +-
 hw/sensor/lsm303dlhc_mag.c            |  2 +-
 hw/sparc/sun4m_iommu.c                |  2 +-
 hw/sparc64/sun4u_iommu.c              |  2 +-
 hw/ssi/aspeed_smc.c                   |  2 +-
 hw/ssi/bcm2835_spi.c                  |  2 +-
 hw/ssi/ibex_spi_host.c                |  2 +-
 hw/ssi/imx_spi.c                      |  2 +-
 hw/ssi/mss-spi.c                      |  2 +-
 hw/ssi/pl022.c                        |  2 +-
 hw/ssi/pnv_spi.c                      |  2 +-
 hw/ssi/sifive_spi.c                   |  2 +-
 hw/ssi/stm32f2xx_spi.c                |  2 +-
 hw/ssi/xilinx_spi.c                   |  2 +-
 hw/ssi/xilinx_spips.c                 |  4 +-
 hw/ssi/xlnx-versal-ospi.c             |  2 +-
 hw/timer/a9gtimer.c                   |  2 +-
 hw/timer/allwinner-a10-pit.c          |  2 +-
 hw/timer/arm_mptimer.c                |  2 +-
 hw/timer/armv7m_systick.c             |  2 +-
 hw/timer/aspeed_timer.c               |  2 +-
 hw/timer/avr_timer16.c                |  2 +-
 hw/timer/bcm2835_systmr.c             |  2 +-
 hw/timer/cmsdk-apb-dualtimer.c        |  2 +-
 hw/timer/cmsdk-apb-timer.c            |  2 +-
 hw/timer/digic-timer.c                |  2 +-
 hw/timer/exynos4210_mct.c             |  2 +-
 hw/timer/exynos4210_pwm.c             |  2 +-
 hw/timer/grlib_gptimer.c              |  2 +-
 hw/timer/hpet.c                       |  2 +-
 hw/timer/i8254.c                      |  2 +-
 hw/timer/ibex_timer.c                 |  2 +-
 hw/timer/imx_epit.c                   |  2 +-
 hw/timer/imx_gpt.c                    |  2 +-
 hw/timer/nrf51_timer.c                |  2 +-
 hw/timer/renesas_cmt.c                |  2 +-
 hw/timer/renesas_tmr.c                |  2 +-
 hw/timer/sifive_pwm.c                 |  2 +-
 hw/timer/slavio_timer.c               |  2 +-
 hw/timer/sse-counter.c                |  2 +-
 hw/timer/sse-timer.c                  |  2 +-
 hw/timer/stm32f2xx_timer.c            |  2 +-
 hw/tpm/tpm_tis_i2c.c                  |  2 +-
 hw/tpm/tpm_tis_isa.c                  |  2 +-
 hw/tpm/tpm_tis_sysbus.c               |  2 +-
 hw/tricore/tricore_testdevice.c       |  2 +-
 hw/usb/hcd-dwc3.c                     |  2 +-
 hw/usb/hcd-ehci-pci.c                 |  2 +-
 hw/usb/hcd-ehci-sysbus.c              |  2 +-
 hw/usb/hcd-ohci-pci.c                 |  2 +-
 hw/usb/hcd-ohci-sysbus.c              |  2 +-
 hw/usb/hcd-uhci.c                     |  2 +-
 hw/usb/hcd-xhci-pci.c                 |  2 +-
 hw/usb/hcd-xhci-sysbus.c              |  2 +-
 hw/usb/hcd-xhci.c                     |  2 +-
 hw/usb/imx-usb-phy.c                  |  2 +-
 hw/usb/tusb6010.c                     |  2 +-
 hw/vfio/ap.c                          |  2 +-
 hw/vfio/ccw.c                         |  2 +-
 hw/vfio/pci.c                         |  2 +-
 hw/virtio/virtio-mmio.c               |  2 +-
 hw/watchdog/cmsdk-apb-watchdog.c      |  2 +-
 hw/watchdog/sbsa_gwdt.c               |  2 +-
 hw/watchdog/wdt_aspeed.c              |  2 +-
 hw/watchdog/wdt_diag288.c             |  2 +-
 hw/watchdog/wdt_i6300esb.c            |  2 +-
 hw/watchdog/wdt_ib700.c               |  2 +-
 hw/watchdog/wdt_imx2.c                |  2 +-
 target/s390x/cpu.c                    | 38 ++++++++----------
 target/s390x/sigp.c                   |  8 +---
 423 files changed, 553 insertions(+), 569 deletions(-)
 create mode 100644 scripts/coccinelle/device-reset.cocci

Comments

Christian Borntraeger Aug. 14, 2024, 2:22 p.m. UTC | #1
Am 13.08.24 um 18:52 schrieb Peter Maydell:
> The main aim of this patchseries is to remove the two remaining uses
> of device_class_set_parent_reset() in the tree, which are virtio-ccw
> and the s390 CPU class. Doing that lets us do some followup cleanup.
> (The diffstat looks alarming but is almost all coccinelle automated
> changes.)
> 
> virtio-ccw is a simple conversion, because it's a leaf class and no
> other code interacts with the reset function. So we convert to
> three-phase and resettable_class_set_parent_phases().
> 
> The s390 CPU is trickier, because s390 has its own set of different
> kinds of reset (the S390_CPU_RESET_* enum values). At the moment we
> have a single function implementing the reset, which always chains to
> the parent DeviceClass::reset, and which can be called both from the
> CPU object's own DeviceClass::reset method and also from other s390
> code.
> 
> I opted to handle this by adding two new S390-CPU-specific reset types
> to the Resettable ResetType enum.  Now instead of having an underlying
> implementation of reset that is s390-specific and which might be
> called either directly or via the DeviceClass::reset method, we can
> implement only the Resettable hold phase method, and have the places
> that need to trigger an s390-specific reset type do so by calling
> resettable_reset().
>                              
> The other option would have been to smuggle in the s390 reset type
> via, for instance, a field in the CPU state that we set in
> s390_do_cpu_initial_reset() etc and then examined in the reset method,
> but doing it this way seems cleaner.
> 
> The rest of the series is largely cleanup. Some small stuff:
>   * remove the now-unused device_class_set_parent_reset() function
>   * remove some CPU state struct fields in alpha and hppa that
>     were simply unused
>   * fix an accidental comma in xilinx_axidma that confused a
>     Coccinelle script
>   * hw/remote/message.c should not be directly invoking the
>     DeviceClass::reset method
> 
> There are also patches here which convert devices from directly
> setting:
>     dc->reset = my_reset_function;
> to using a function call:
>     device_class_set_legacy_reset(dc, my_reset_function);
> 
> The conversion is entirely scripted using a coccinelle patch, and it
> is responsible for the large diffstat here.
> 
> If people feel this is unnecessary churn I'm open to that argument,
> but my motivation here is:
>   * it means that people writing device code see the "legacy" word
>     and get a hint that a new QEMU device probably should be using
>     something else (i.e. Resettable)
>   * it makes it easier to find the places in the code that are
>     still using legacy-reset with a simple grep
> 
> In particular, having done that makes it easier to rename the
> DeviceState::reset field to DeviceState::legacy_reset, and that in
> turn is how I found that hw/remote/ was still directly invoking it.
> 
> The last patch removes device_phases_reset(), which was the part of
> the reset transition glue that supported "device is reset via the
> legacy reset method but it is a three-phase device".  Now we know
> there is no way to directly invoke legacy-reset any more, we can drop
> that bit of glue. (The opposite direction, "device is reset via a
> three-phase-aware method but only implements the legacy reset method",
> is still supported.)
> 
> I am pondering the idea of a further Coccinelle-driven conversion
> of device_class_set_legacy_reset() calls into three-phase setting
> of the Resettable hold phase method. This would be a zero behaviour
> change, but the difficult part will be where we have devices
> which make direct function calls to their reset function as well
> as registering it as the reset method; I'm not sure if Coccinelle
> can do "match functions which are not referenced elsewhere in the
> file" checks.
> 
> Note that my testing here has only been "make check" and
> "make check-avocado", which I know doesn't really exercise reset
> very heavily.

Nina, can you have a look at those patches?
Peter Maydell Aug. 14, 2024, 8:06 p.m. UTC | #2
On Wed, 14 Aug 2024 at 15:22, Christian Borntraeger
<borntraeger@linux.ibm.com> wrote:
>
> Am 13.08.24 um 18:52 schrieb Peter Maydell:
> > The main aim of this patchseries is to remove the two remaining uses
> > of device_class_set_parent_reset() in the tree, which are virtio-ccw
> > and the s390 CPU class. Doing that lets us do some followup cleanup.
> > (The diffstat looks alarming but is almost all coccinelle automated
> > changes.)
> >
> > Note that my testing here has only been "make check" and
> > "make check-avocado", which I know doesn't really exercise reset
> > very heavily.

> Nina, can you have a look at those patches?

If you plan to do any testing you'll want to locally fix the
silly mistake I made in patch 2 (a RESET_TYPE_COLD where
it should say RESET_TYPE_S390_CPU_NORMAL) -- see the review
comments on that patch. Sorry about that one...

thanks
-- PMM
Nina Schoetterl-Glausch Aug. 22, 2024, 10:34 a.m. UTC | #3
On Wed, 2024-08-14 at 21:06 +0100, Peter Maydell wrote:
> On Wed, 14 Aug 2024 at 15:22, Christian Borntraeger
> <borntraeger@linux.ibm.com> wrote:
> > 
> > Am 13.08.24 um 18:52 schrieb Peter Maydell:
> > > The main aim of this patchseries is to remove the two remaining uses
> > > of device_class_set_parent_reset() in the tree, which are virtio-ccw
> > > and the s390 CPU class. Doing that lets us do some followup cleanup.
> > > (The diffstat looks alarming but is almost all coccinelle automated
> > > changes.)
> > > 
> > > Note that my testing here has only been "make check" and
> > > "make check-avocado", which I know doesn't really exercise reset
> > > very heavily.
> 
> > Nina, can you have a look at those patches?
> 
> If you plan to do any testing you'll want to locally fix the
> silly mistake I made in patch 2 (a RESET_TYPE_COLD where
> it should say RESET_TYPE_S390_CPU_NORMAL) -- see the review
> comments on that patch. Sorry about that one...
> 
> thanks
> -- PMM
> 

I'll run it through our CI and see if anything pops up.
Nico Boehr Aug. 26, 2024, 12:08 p.m. UTC | #4
Quoting Nina Schoetterl-Glausch (2024-08-22 12:34:14)
> I'll run it through our CI and see if anything pops up.

Nina is on holiday, she asked me to quickly report back.

There was a little hickup without the fixup to patch 2, but after Nina
pushed the fixup, we did not observe any failures related to your
changes in our CI. Thanks!
Nico Boehr Aug. 28, 2024, 8:13 a.m. UTC | #5
Quoting Nico Boehr (2024-08-26 14:08:20)
> There was a little hickup without the fixup to patch 2, but after Nina
> pushed the fixup, we did not observe any failures related to your
> changes in our CI. Thanks!

Peter, after a few CI runs, we unfortunately did find some issues with your
patch :-(

Rebooting a guest in a loop sometimes fails. Michael was able to bisect it
to your series.

The problem is intermittent. The guest is unable to load its initramfs:

  [    0.560674] rootfs image is not initramfs (no cpio magic); looks like an initrd
  [    0.588605] Freeing initrd memory: 95680K
  [    0.593143] md: Waiting for all devices to be available before autodetect
  [    0.593144] md: If you don't use raid, use raid=noautodetect
  [    0.593145] md: Autodetecting RAID arrays.
  [    0.593146] md: autorun ...
  [    0.593147] md: ... autorun DONE.
  [    0.593156] RAMDISK: gzip image found at block 0
  [    0.609110] RAMDISK: incomplete write (29120 != 32768)
  [    0.609113] write error

...and then a panic because the kernel doesn't find a rootfs.

It seems like the compressed initramfs is corrupted somehow, since "rootfs
image is not initramfs" doesn't appear on a successful boot.

initramfs and kernel are loaded via direct kernel boot. Running under KVM.

Some vhost error messages do appear before the guest panics, but it is not
entirely clear to me whether they are related:

  [...]
  2024-08-28T06:56:29.765324Z qemu-system-s390x: vhost vring error in virtqueue 0: Invalid argument (22)
  2024-08-28T06:56:32.210982Z qemu-system-s390x: vhost vring error in virtqueue 0: Invalid argument (22)
  2024-08-28 06:56:35.430+0000: panic s390: core='0' psw-mask='0x0002000180000000' psw-addr='0x00000387b028c67e' reason='disabled-wait'

Any idea?
Peter Maydell Aug. 28, 2024, 3:46 p.m. UTC | #6
On Wed, 28 Aug 2024 at 09:13, Nico Boehr <nrb@linux.ibm.com> wrote:
>
> Quoting Nico Boehr (2024-08-26 14:08:20)
> > There was a little hickup without the fixup to patch 2, but after Nina
> > pushed the fixup, we did not observe any failures related to your
> > changes in our CI. Thanks!
>
> Peter, after a few CI runs, we unfortunately did find some issues with your
> patch :-(
>
> Rebooting a guest in a loop sometimes fails. Michael was able to bisect it
> to your series.
>
> The problem is intermittent. The guest is unable to load its initramfs:
>
>   [    0.560674] rootfs image is not initramfs (no cpio magic); looks like an initrd
>   [    0.588605] Freeing initrd memory: 95680K
>   [    0.593143] md: Waiting for all devices to be available before autodetect
>   [    0.593144] md: If you don't use raid, use raid=noautodetect
>   [    0.593145] md: Autodetecting RAID arrays.
>   [    0.593146] md: autorun ...
>   [    0.593147] md: ... autorun DONE.
>   [    0.593156] RAMDISK: gzip image found at block 0
>   [    0.609110] RAMDISK: incomplete write (29120 != 32768)
>   [    0.609113] write error
>
> ...and then a panic because the kernel doesn't find a rootfs.
>
> It seems like the compressed initramfs is corrupted somehow, since "rootfs
> image is not initramfs" doesn't appear on a successful boot.
>
> initramfs and kernel are loaded via direct kernel boot. Running under KVM.
>
> Some vhost error messages do appear before the guest panics, but it is not
> entirely clear to me whether they are related:
>
>   [...]
>   2024-08-28T06:56:29.765324Z qemu-system-s390x: vhost vring error in virtqueue 0: Invalid argument (22)
>   2024-08-28T06:56:32.210982Z qemu-system-s390x: vhost vring error in virtqueue 0: Invalid argument (22)
>   2024-08-28 06:56:35.430+0000: panic s390: core='0' psw-mask='0x0002000180000000' psw-addr='0x00000387b028c67e' reason='disabled-wait'
>
> Any idea?

Well, the series is *supposed* to be just a refactoring, not a change of
behaviour, so I'm not sure. I don't suppose you have a reproduce case
that I can run? (I do have access to an s390 machine if that helps.)

thanks
-- PMM
Nico Boehr Aug. 29, 2024, 12:19 p.m. UTC | #7
Quoting Peter Maydell (2024-08-28 17:46:42)
[...]
> Well, the series is *supposed* to be just a refactoring, not a change of
> behaviour, so I'm not sure. I don't suppose you have a reproduce case
> that I can run? (I do have access to an s390 machine if that helps.)

Well, it's on an internal testing framework which we do not release
publicly. :-(

Luckily, it's not that difficult to reproduce.  It seems like this only
happens when a reboot is initiated over SSH connection via vsock. Here are
some instructions on how to reproduce (with mkosi and Fedora):

1. Craft an mkosi.conf like this:
   [Distribution]
   Distribution=fedora
   Architecture=s390x
   
   [Output]
   Format=cpio
   CompressOutput=xz
   
   [Content]
   Packages=openssh-server
   Packages=kernel-modules-core-6.8.5-301.fc40.s390x
   Bootloader=none
   MakeInitrd=no
   Ssh=yes
   Autologin=yes
   RootPassword=hunter
   Timezone=Etc/UTC
   Locale=C.UTF-8
2. Generate SSH host key:
   mkdir -p mkosi.extra/etc/ssh && ssh-keygen -t ed25519 -f mkosi.extra/etc/ssh/ssh_host_ed25519_key
3. Build image:
   mkosi
4. Boot with QEMU:
   qemu-system-s390x -machine s390-ccw-virtio,accel=kvm -nodefaults -nographic -chardev stdio,id=con0 -device sclpconsole,chardev=con0 -m 2048 -kernel image.vmlinuz -initrd image.cpio.xz -device vhost-vsock-ccw,guest-cid=3
5. In a different terminal, run a reboot loop:
   i=0; while true; do ssh -o ProxyCommand='socat VSOCK-CONNECT:3:22 -' localhost -l root reboot; echo $i; let i=i+1; done

After a while (10 minutes, sometimes longer) you should see the quest crash
either with "Attempted to kill init" or "Unable to mount rootfs" and a message
about corrupted initramfs:
[    1.599082] Initramfs unpacking failed: XZ-compressed data is corrupt

I had this run without your series for 45 minutes, while with your series
this crashed within ~5-10 minutes.

Thanks!
Peter Maydell Aug. 29, 2024, 1:09 p.m. UTC | #8
On Thu, 29 Aug 2024 at 13:20, Nico Boehr <nrb@linux.ibm.com> wrote:
>
> Quoting Peter Maydell (2024-08-28 17:46:42)
> [...]
> > Well, the series is *supposed* to be just a refactoring, not a change of
> > behaviour, so I'm not sure. I don't suppose you have a reproduce case
> > that I can run? (I do have access to an s390 machine if that helps.)
>
> Well, it's on an internal testing framework which we do not release
> publicly. :-(
>
> Luckily, it's not that difficult to reproduce.  It seems like this only
> happens when a reboot is initiated over SSH connection via vsock. Here are
> some instructions on how to reproduce (with mkosi and Fedora):
>
> 1. Craft an mkosi.conf like this:
>    [Distribution]
>    Distribution=fedora
>    Architecture=s390x
>
>    [Output]
>    Format=cpio
>    CompressOutput=xz
>
>    [Content]
>    Packages=openssh-server
>    Packages=kernel-modules-core-6.8.5-301.fc40.s390x
>    Bootloader=none
>    MakeInitrd=no
>    Ssh=yes
>    Autologin=yes
>    RootPassword=hunter
>    Timezone=Etc/UTC
>    Locale=C.UTF-8
> 2. Generate SSH host key:
>    mkdir -p mkosi.extra/etc/ssh && ssh-keygen -t ed25519 -f mkosi.extra/etc/ssh/ssh_host_ed25519_key
> 3. Build image:
>    mkosi
> 4. Boot with QEMU:
>    qemu-system-s390x -machine s390-ccw-virtio,accel=kvm -nodefaults -nographic -chardev stdio,id=con0 -device sclpconsole,chardev=con0 -m 2048 -kernel image.vmlinuz -initrd image.cpio.xz -device vhost-vsock-ccw,guest-cid=3
> 5. In a different terminal, run a reboot loop:
>    i=0; while true; do ssh -o ProxyCommand='socat VSOCK-CONNECT:3:22 -' localhost -l root reboot; echo $i; let i=i+1; done

Thanks. I tried this repro, but mkosi falls over almost
immediately:

‣ Detaching namespace
‣ Setting up package cache…
‣ Setting up package cache /home/linux1/s390-failure/.mkosi-htsau2ot complete
‣ Setting up temporary workspace.
‣ Temporary workspace set up in /var/tmp/mkosi-tjc0nror
‣ Running second (final) stage…
‣  Creating image with partition table…
Disk /home/linux1/s390-failure/.mkosi-ddkx5xpb: 3 GiB, 3221266432
bytes, 6291536 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Created a new GPT disklabel (GUID: 14CF3B05-D072-0A45-8EE4-3219112ACB2E).
/home/linux1/s390-failure/.mkosi-ddkx5xpb1: Created a new partition 1
of type 'unknown' and of size 3 GiB.
/home/linux1/s390-failure/.mkosi-ddkx5xpb2: Done.

New situation:
Disklabel type: gpt
Disk identifier: 14CF3B05-D072-0A45-8EE4-3219112ACB2E

Device                                     Start     End Sectors Size Type
/home/linux1/s390-failure/.mkosi-ddkx5xpb1    40 6291495 6291456   3G unknown

The partition table has been altered.
‣  Created image with partition table as
/home/linux1/s390-failure/.mkosi-ddkx5xpb
‣  Attaching /home/linux1/s390-failure/.mkosi-ddkx5xpb as loopback…
‣  Attached /dev/loop13
‣  Formatting root partition…
mke2fs 1.46.5 (30-Dec-2021)
The file /dev/loop13p1 does not exist and no size was specified.
‣  (Detaching /dev/loop13)
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3/dist-packages/mkosi/__main__.py", line 32, in <module>
    main()
  File "/usr/lib/python3/dist-packages/mkosi/__main__.py", line 26, in main
    run_verb(a)
  File "/usr/lib/python3/dist-packages/mkosi/__init__.py", line 7809,
in run_verb
    manifest = build_stuff(args)
  File "/usr/lib/python3/dist-packages/mkosi/__init__.py", line 7227,
in build_stuff
    image = build_image(args, root, manifest=manifest,
do_run_build_script=False, cleanup=True)
  File "/usr/lib/python3/dist-packages/mkosi/__init__.py", line 6941,
in build_image
    prepare_root(args, encrypted.root, cached)
  File "/usr/lib/python3/dist-packages/mkosi/__init__.py", line 1308,
in prepare_root
    mkfs_generic(args, label, path, dev)
  File "/usr/lib/python3/dist-packages/mkosi/__init__.py", line 1031,
in mkfs_generic
    run([*cmdline, dev])
  File "/usr/lib/python3/dist-packages/mkosi/backend.py", line 699, in run
    return subprocess.run(cmdline, check=check, stdout=stdout,
stderr=stderr, **kwargs)
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['mkfs.ext4', '-I', '256',
'-L', 'root', '-M', '/', '/dev/loop13p1']' returned non-zero exit
status 1.

(My s390 box is running Ubuntu, in case that makes a difference.)

Maybe you could put the kernel and initrd somewhere I can
get them from?

-- PMM
Cédric Le Goater Aug. 29, 2024, 1:20 p.m. UTC | #9
> 
> (My s390 box is running Ubuntu, in case that makes a difference.)
> 
> Maybe you could put the kernel and initrd somewhere I can
> get them from?

I am also trying. See :

   https://www.kaod.org/qemu/s390x/

These were generated on my x86/f40 laptop.

C.
Nico Boehr Aug. 29, 2024, 1:26 p.m. UTC | #10
Quoting Peter Maydell (2024-08-29 15:09:44)
> Thanks. I tried this repro, but mkosi falls over almost
> immediately:
> 
> ‣ Detaching namespace
> ‣ Setting up package cache…
> ‣ Setting up package cache /home/linux1/s390-failure/.mkosi-htsau2ot complete
> ‣ Setting up temporary workspace.
> ‣ Temporary workspace set up in /var/tmp/mkosi-tjc0nror
> ‣ Running second (final) stage…
> ‣  Creating image with partition table…
> Disk /home/linux1/s390-failure/.mkosi-ddkx5xpb: 3 GiB, 3221266432
> bytes, 6291536 sectors
> Units: sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> 
> >>> Script header accepted.
> >>> Script header accepted.
> >>> Script header accepted.
> >>> Created a new GPT disklabel (GUID: 14CF3B05-D072-0A45-8EE4-3219112ACB2E).
> /home/linux1/s390-failure/.mkosi-ddkx5xpb1: Created a new partition 1
> of type 'unknown' and of size 3 GiB.
> /home/linux1/s390-failure/.mkosi-ddkx5xpb2: Done.
> 
> New situation:
> Disklabel type: gpt
> Disk identifier: 14CF3B05-D072-0A45-8EE4-3219112ACB2E
> 
> Device                                     Start     End Sectors Size Type
> /home/linux1/s390-failure/.mkosi-ddkx5xpb1    40 6291495 6291456   3G unknown
> 
> The partition table has been altered.
> ‣  Created image with partition table as
> /home/linux1/s390-failure/.mkosi-ddkx5xpb
> ‣  Attaching /home/linux1/s390-failure/.mkosi-ddkx5xpb as loopback…
> ‣  Attached /dev/loop13
> ‣  Formatting root partition…
> mke2fs 1.46.5 (30-Dec-2021)
> The file /dev/loop13p1 does not exist and no size was specified.
> ‣  (Detaching /dev/loop13)
> Traceback (most recent call last):
>   File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
>     return _run_code(code, main_globals, None,
>   File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
>     exec(code, run_globals)
>   File "/usr/lib/python3/dist-packages/mkosi/__main__.py", line 32, in <module>
>     main()
>   File "/usr/lib/python3/dist-packages/mkosi/__main__.py", line 26, in main
>     run_verb(a)
>   File "/usr/lib/python3/dist-packages/mkosi/__init__.py", line 7809,
> in run_verb
>     manifest = build_stuff(args)
>   File "/usr/lib/python3/dist-packages/mkosi/__init__.py", line 7227,
> in build_stuff
>     image = build_image(args, root, manifest=manifest,
> do_run_build_script=False, cleanup=True)
>   File "/usr/lib/python3/dist-packages/mkosi/__init__.py", line 6941,
> in build_image
>     prepare_root(args, encrypted.root, cached)
>   File "/usr/lib/python3/dist-packages/mkosi/__init__.py", line 1308,
> in prepare_root
>     mkfs_generic(args, label, path, dev)
>   File "/usr/lib/python3/dist-packages/mkosi/__init__.py", line 1031,
> in mkfs_generic
>     run([*cmdline, dev])
>   File "/usr/lib/python3/dist-packages/mkosi/backend.py", line 699, in run
>     return subprocess.run(cmdline, check=check, stdout=stdout,
> stderr=stderr, **kwargs)
>   File "/usr/lib/python3.10/subprocess.py", line 526, in run
>     raise CalledProcessError(retcode, process.args,
> subprocess.CalledProcessError: Command '['mkfs.ext4', '-I', '256',
> '-L', 'root', '-M', '/', '/dev/loop13p1']' returned non-zero exit
> status 1.
> 
> (My s390 box is running Ubuntu, in case that makes a difference.)
> 
> Maybe you could put the kernel and initrd somewhere I can
> get them from?

Sigh. Yeah, I'll try to find a way (may take a while).

In the meantime, looks like mkosi is trying to create an block image, but
that's not what it's configured to do; are you sure mkosi.conf is in the
same directory you're calling it from?
Peter Maydell Aug. 29, 2024, 1:35 p.m. UTC | #11
On Thu, 29 Aug 2024 at 14:26, Nico Boehr <nrb@linux.ibm.com> wrote:
>
> Quoting Peter Maydell (2024-08-29 15:09:44)
> > Thanks. I tried this repro, but mkosi falls over almost
> > immediately:

> In the meantime, looks like mkosi is trying to create an block image, but
> that's not what it's configured to do; are you sure mkosi.conf is in the
> same directory you're calling it from?

It is. I notice however that the manpage for mkosi
says that it looks for "mkosi.default", not "mkosi.conf".
Maybe it needs a newer mkosi than Ubuntu ships?
(mkosi --version says "mkosi 12".)

I'll use the images Cédric has kindly generated.

-- PMM
Nico Boehr Aug. 29, 2024, 2:44 p.m. UTC | #12
Quoting Peter Maydell (2024-08-29 15:35:30)
> On Thu, 29 Aug 2024 at 14:26, Nico Boehr <nrb@linux.ibm.com> wrote:
> >
> > Quoting Peter Maydell (2024-08-29 15:09:44)
> > > Thanks. I tried this repro, but mkosi falls over almost
> > > immediately:
> 
> > In the meantime, looks like mkosi is trying to create an block image, but
> > that's not what it's configured to do; are you sure mkosi.conf is in the
> > same directory you're calling it from?
> 
> It is. I notice however that the manpage for mkosi
> says that it looks for "mkosi.default", not "mkosi.conf".
> Maybe it needs a newer mkosi than Ubuntu ships?
> (mkosi --version says "mkosi 12".)

Likely. I have mkosi 22 here.

> I'll use the images Cédric has kindly generated.

Thanks, images by Cédric look good, but I forgot to tell you that
you need a SSH key for login too :)

You could unpack image.cpio.gz, add your key to root/.ssh and repack the
whole thing to image.new.xz:
- mkdir xtract && cd xtract
- unxz < ../image.cpio.xz | cpio -H newc -iv
- cp ~/.ssh/authorized_keys root/.ssh/
- find . 2>/dev/null | cpio -o -c -R root:root | xz -9 --format=lzma > ../image.new.xz
Peter Maydell Aug. 29, 2024, 3:08 p.m. UTC | #13
On Thu, 29 Aug 2024 at 15:44, Nico Boehr <nrb@linux.ibm.com> wrote:
>
> Quoting Peter Maydell (2024-08-29 15:35:30)
> > On Thu, 29 Aug 2024 at 14:26, Nico Boehr <nrb@linux.ibm.com> wrote:
> > >
> > > Quoting Peter Maydell (2024-08-29 15:09:44)
> > > > Thanks. I tried this repro, but mkosi falls over almost
> > > > immediately:
> >
> > > In the meantime, looks like mkosi is trying to create an block image, but
> > > that's not what it's configured to do; are you sure mkosi.conf is in the
> > > same directory you're calling it from?
> >
> > It is. I notice however that the manpage for mkosi
> > says that it looks for "mkosi.default", not "mkosi.conf".
> > Maybe it needs a newer mkosi than Ubuntu ships?
> > (mkosi --version says "mkosi 12".)
>
> Likely. I have mkosi 22 here.
>
> > I'll use the images Cédric has kindly generated.
>
> Thanks, images by Cédric look good, but I forgot to tell you that
> you need a SSH key for login too :)
>
> You could unpack image.cpio.gz, add your key to root/.ssh and repack the
> whole thing to image.new.xz:
> - mkdir xtract && cd xtract
> - unxz < ../image.cpio.xz | cpio -H newc -iv
> - cp ~/.ssh/authorized_keys root/.ssh/
> - find . 2>/dev/null | cpio -o -c -R root:root | xz -9 --format=lzma > ../image.new.xz

Thanks, I was just wondering how to get the ssh key into there.

I found I needed ...  cpio -H newc -o -R root:root | xz -9 -C crc32
when creating the new image, otherwise the kernel doesn't
like the format. (Also since I'm not root I needed to
sprinkle in some sudo invocations.)

-- PMM
Peter Maydell Aug. 29, 2024, 3:53 p.m. UTC | #14
On Wed, 28 Aug 2024 at 09:13, Nico Boehr <nrb@linux.ibm.com> wrote:
>
> Quoting Nico Boehr (2024-08-26 14:08:20)
> > There was a little hickup without the fixup to patch 2, but after Nina
> > pushed the fixup, we did not observe any failures related to your
> > changes in our CI. Thanks!
>
> Peter, after a few CI runs, we unfortunately did find some issues with your
> patch :-(
>
> Rebooting a guest in a loop sometimes fails. Michael was able to bisect it
> to your series.
>
> The problem is intermittent. The guest is unable to load its initramfs:
>
>   [    0.560674] rootfs image is not initramfs (no cpio magic); looks like an initrd
>   [    0.588605] Freeing initrd memory: 95680K
>   [    0.593143] md: Waiting for all devices to be available before autodetect
>   [    0.593144] md: If you don't use raid, use raid=noautodetect
>   [    0.593145] md: Autodetecting RAID arrays.
>   [    0.593146] md: autorun ...
>   [    0.593147] md: ... autorun DONE.
>   [    0.593156] RAMDISK: gzip image found at block 0
>   [    0.609110] RAMDISK: incomplete write (29120 != 32768)
>   [    0.609113] write error
>
> ...and then a panic because the kernel doesn't find a rootfs.

I repro'd *something*, but it wasn't quite this. I got:


[    4.691853] clk: Disabling unused clocks
[    4.695419] Freeing unused kernel image (initmem) memory: 6520K
[    4.695430] Write protected read-only-after-init data: 144k
[    4.695834] Checked W+X mappings: passed, no unexpected W+X pages found
[    4.695849] Run /init as init process
/init: error while loading shared libraries: libgcc_s.so.1: cannot
open shared object file: No such file or directory
[    4.697009] Kernel panic - not syncing: Attempted to kill init!
exitcode=0x00007f00
[    4.697030] CPU: 0 PID: 1 Comm: init Not tainted 6.8.5-301.fc40.s390x #1
[    4.697035] Hardware name: IBM 8561 LT1 400 (KVM/Linux)
[    4.697040] Call Trace:
[    4.697047]  [<000000007ab6ae36>] dump_stack_lvl+0x66/0x88
[    4.697081]  [<0000000079e17c2a>] panic+0x312/0x328
[    4.697096]  [<0000000079e1de84>] do_exit+0x8a4/0xae8
[    4.697101]  [<0000000079e1e2e0>] do_group_exit+0x40/0xb8
[    4.697103]  [<0000000079e1e386>] __s390x_sys_exit_group+0x2e/0x30
[    4.697105]  [<000000007ab9526a>] __do_syscall+0x252/0x2c0
[    4.697113]  [<000000007aba8840>] system_call+0x70/0x98

Which I guess could be caused by a different corruption
of the initramfs ?

thanks
-- PMM
Nico Boehr Aug. 30, 2024, 5:48 a.m. UTC | #15
Quoting Peter Maydell (2024-08-29 17:53:02)
> On Wed, 28 Aug 2024 at 09:13, Nico Boehr <nrb@linux.ibm.com> wrote:
> >
> > Quoting Nico Boehr (2024-08-26 14:08:20)
> > > There was a little hickup without the fixup to patch 2, but after Nina
> > > pushed the fixup, we did not observe any failures related to your
> > > changes in our CI. Thanks!
> >
> > Peter, after a few CI runs, we unfortunately did find some issues with your
> > patch :-(
> >
> > Rebooting a guest in a loop sometimes fails. Michael was able to bisect it
> > to your series.
> >
> > The problem is intermittent. The guest is unable to load its initramfs:
> >
> >   [    0.560674] rootfs image is not initramfs (no cpio magic); looks like an initrd
> >   [    0.588605] Freeing initrd memory: 95680K
> >   [    0.593143] md: Waiting for all devices to be available before autodetect
> >   [    0.593144] md: If you don't use raid, use raid=noautodetect
> >   [    0.593145] md: Autodetecting RAID arrays.
> >   [    0.593146] md: autorun ...
> >   [    0.593147] md: ... autorun DONE.
> >   [    0.593156] RAMDISK: gzip image found at block 0
> >   [    0.609110] RAMDISK: incomplete write (29120 != 32768)
> >   [    0.609113] write error
> >
> > ...and then a panic because the kernel doesn't find a rootfs.
> 
> I repro'd *something*, but it wasn't quite this. I got:
> 
> 
> [    4.691853] clk: Disabling unused clocks
> [    4.695419] Freeing unused kernel image (initmem) memory: 6520K
> [    4.695430] Write protected read-only-after-init data: 144k
> [    4.695834] Checked W+X mappings: passed, no unexpected W+X pages found
> [    4.695849] Run /init as init process
> /init: error while loading shared libraries: libgcc_s.so.1: cannot
> open shared object file: No such file or directory
> [    4.697009] Kernel panic - not syncing: Attempted to kill init!
> exitcode=0x00007f00
> [    4.697030] CPU: 0 PID: 1 Comm: init Not tainted 6.8.5-301.fc40.s390x #1
> [    4.697035] Hardware name: IBM 8561 LT1 400 (KVM/Linux)
> [    4.697040] Call Trace:
> [    4.697047]  [<000000007ab6ae36>] dump_stack_lvl+0x66/0x88
> [    4.697081]  [<0000000079e17c2a>] panic+0x312/0x328
> [    4.697096]  [<0000000079e1de84>] do_exit+0x8a4/0xae8
> [    4.697101]  [<0000000079e1e2e0>] do_group_exit+0x40/0xb8
> [    4.697103]  [<0000000079e1e386>] __s390x_sys_exit_group+0x2e/0x30
> [    4.697105]  [<000000007ab9526a>] __do_syscall+0x252/0x2c0
> [    4.697113]  [<000000007aba8840>] system_call+0x70/0x98
> 
> Which I guess could be caused by a different corruption
> of the initramfs ?

I think that is the problem, just a different symptom. I also got this
sometimes (in random libraries all over the place).

And yes, agree, it is a corruption of the initramfs:
[    1.597465] Initramfs unpacking failed: XZ-compressed data is corrupt
Peter Maydell Aug. 30, 2024, 10:01 a.m. UTC | #16
On Fri, 30 Aug 2024 at 06:48, Nico Boehr <nrb@linux.ibm.com> wrote:
>
> Quoting Peter Maydell (2024-08-29 17:53:02)
> > I repro'd *something*, but it wasn't quite this. I got:
> >
> >
> > [    4.691853] clk: Disabling unused clocks
> > [    4.695419] Freeing unused kernel image (initmem) memory: 6520K
> > [    4.695430] Write protected read-only-after-init data: 144k
> > [    4.695834] Checked W+X mappings: passed, no unexpected W+X pages found
> > [    4.695849] Run /init as init process
> > /init: error while loading shared libraries: libgcc_s.so.1: cannot
> > open shared object file: No such file or directory
> > [    4.697009] Kernel panic - not syncing: Attempted to kill init!
> > exitcode=0x00007f00
> > [    4.697030] CPU: 0 PID: 1 Comm: init Not tainted 6.8.5-301.fc40.s390x #1
> > [    4.697035] Hardware name: IBM 8561 LT1 400 (KVM/Linux)
> > [    4.697040] Call Trace:
> > [    4.697047]  [<000000007ab6ae36>] dump_stack_lvl+0x66/0x88
> > [    4.697081]  [<0000000079e17c2a>] panic+0x312/0x328
> > [    4.697096]  [<0000000079e1de84>] do_exit+0x8a4/0xae8
> > [    4.697101]  [<0000000079e1e2e0>] do_group_exit+0x40/0xb8
> > [    4.697103]  [<0000000079e1e386>] __s390x_sys_exit_group+0x2e/0x30
> > [    4.697105]  [<000000007ab9526a>] __do_syscall+0x252/0x2c0
> > [    4.697113]  [<000000007aba8840>] system_call+0x70/0x98
> >
> > Which I guess could be caused by a different corruption
> > of the initramfs ?
>
> I think that is the problem, just a different symptom. I also got this
> sometimes (in random libraries all over the place).

I ran overnight with none of the patchset applied, and it never
failed (as expected). Running with just the first virtio-ccw
patch does fall over fairly quickly. So something's up with
that patch, which is curious because that's the one I thought
was a straightforward conversion without any complications :-)

I'll investigate further today, I have the beginnings of a
theory about what might be happening...

-- PMM
Nico Boehr Aug. 30, 2024, 11:56 a.m. UTC | #17
Quoting Peter Maydell (2024-08-30 12:01:47)
> On Fri, 30 Aug 2024 at 06:48, Nico Boehr <nrb@linux.ibm.com> wrote:
> >
> > Quoting Peter Maydell (2024-08-29 17:53:02)
> > > I repro'd *something*, but it wasn't quite this. I got:
> > >
> > >
> > > [    4.691853] clk: Disabling unused clocks
> > > [    4.695419] Freeing unused kernel image (initmem) memory: 6520K
> > > [    4.695430] Write protected read-only-after-init data: 144k
> > > [    4.695834] Checked W+X mappings: passed, no unexpected W+X pages found
> > > [    4.695849] Run /init as init process
> > > /init: error while loading shared libraries: libgcc_s.so.1: cannot
> > > open shared object file: No such file or directory
> > > [    4.697009] Kernel panic - not syncing: Attempted to kill init!
> > > exitcode=0x00007f00
> > > [    4.697030] CPU: 0 PID: 1 Comm: init Not tainted 6.8.5-301.fc40.s390x #1
> > > [    4.697035] Hardware name: IBM 8561 LT1 400 (KVM/Linux)
> > > [    4.697040] Call Trace:
> > > [    4.697047]  [<000000007ab6ae36>] dump_stack_lvl+0x66/0x88
> > > [    4.697081]  [<0000000079e17c2a>] panic+0x312/0x328
> > > [    4.697096]  [<0000000079e1de84>] do_exit+0x8a4/0xae8
> > > [    4.697101]  [<0000000079e1e2e0>] do_group_exit+0x40/0xb8
> > > [    4.697103]  [<0000000079e1e386>] __s390x_sys_exit_group+0x2e/0x30
> > > [    4.697105]  [<000000007ab9526a>] __do_syscall+0x252/0x2c0
> > > [    4.697113]  [<000000007aba8840>] system_call+0x70/0x98
> > >
> > > Which I guess could be caused by a different corruption
> > > of the initramfs ?
> >
> > I think that is the problem, just a different symptom. I also got this
> > sometimes (in random libraries all over the place).
> 
> I ran overnight with none of the patchset applied, and it never
> failed (as expected). Running with just the first virtio-ccw
> patch does fall over fairly quickly. So something's up with
> that patch, which is curious because that's the one I thought
> was a straightforward conversion without any complications :-)
> 
> I'll investigate further today, I have the beginnings of a
> theory about what might be happening...

Thanks for taking the time, Peter! Let me know when you have insights.
Peter Maydell Aug. 30, 2024, 12:04 p.m. UTC | #18
On Fri, 30 Aug 2024 at 12:56, Nico Boehr <nrb@linux.ibm.com> wrote:
>
> Quoting Peter Maydell (2024-08-30 12:01:47)
> > I ran overnight with none of the patchset applied, and it never
> > failed (as expected). Running with just the first virtio-ccw
> > patch does fall over fairly quickly. So something's up with
> > that patch, which is curious because that's the one I thought
> > was a straightforward conversion without any complications :-)
> >
> > I'll investigate further today, I have the beginnings of a
> > theory about what might be happening...
>
> Thanks for taking the time, Peter! Let me know when you have insights.

I think I've found the problem, I'm just testing it to see if it
does properly fix the intermittent. The issue is that before we
can convert a class to three-phase reset we need to have
already converted all its parent classes. TYPE_VIRTIO_CCW_DEVICE
is a subclass of TYPE_CCW_DEVICE, but I forgot to convert
TYPE_CCW_DEVICE to three-phase reset. The result is that the
reset in the parent class was not happening at all (presumably
leaving the TYPE_CCW_DEVICE in an invalid/unexpected state on
reboot). The fix is to add an extra patch at the start of the
series. Once I've tested this I'll send out a v2 of the series,
maybe also adding the cleanup RTH suggested in one of the later
patches. (Ironically, if we do that cleanup then the bug would
also go away, because old-style DeviceClass:reset methods will
get called in exactly the same way as if they were registered
as phases.hold methods, rather than being specially handled...)


commit ff9bc4c8910f636f20e9b6e91d63dd003408ce10
Author: Peter Maydell <peter.maydell@linaro.org>
Date:   Fri Aug 30 12:50:03 2024 +0100

    hw/s390/ccw-device: Convert to three-phase reset

    Convert the TYPE_CCW_DEVICE to three-phase reset. This is a
    device class which is subclassed, so it needs to be three-phase
    before we can convert the subclass.

    Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

diff --git a/hw/s390x/ccw-device.c b/hw/s390x/ccw-device.c
index a7d682e5af9..14c24e38904 100644
--- a/hw/s390x/ccw-device.c
+++ b/hw/s390x/ccw-device.c
@@ -44,9 +44,9 @@ static Property ccw_device_properties[] = {
     DEFINE_PROP_END_OF_LIST(),
 };

-static void ccw_device_reset(DeviceState *d)
+static void ccw_device_reset_hold(Object *obj, ResetType type)
 {
-    CcwDevice *ccw_dev = CCW_DEVICE(d);
+    CcwDevice *ccw_dev = CCW_DEVICE(obj);

     css_reset_sch(ccw_dev->sch);
 }
@@ -55,11 +55,12 @@ static void ccw_device_class_init(ObjectClass
*klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     CCWDeviceClass *k = CCW_DEVICE_CLASS(klass);
+    ResettableClass *rc = RESETTABLE_CLASS(klass);

     k->realize = ccw_device_realize;
     k->refill_ids = ccw_device_refill_ids;
     device_class_set_props(dc, ccw_device_properties);
-    dc->reset = ccw_device_reset;
+    rc->phases.hold = ccw_device_reset_hold;
     dc->bus_type = TYPE_VIRTUAL_CSS_BUS;
 }

-- PMM
Peter Maydell Aug. 30, 2024, 12:17 p.m. UTC | #19
On Fri, 30 Aug 2024 at 13:04, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Fri, 30 Aug 2024 at 12:56, Nico Boehr <nrb@linux.ibm.com> wrote:
> >
> > Quoting Peter Maydell (2024-08-30 12:01:47)
> > > I ran overnight with none of the patchset applied, and it never
> > > failed (as expected). Running with just the first virtio-ccw
> > > patch does fall over fairly quickly. So something's up with
> > > that patch, which is curious because that's the one I thought
> > > was a straightforward conversion without any complications :-)
> > >
> > > I'll investigate further today, I have the beginnings of a
> > > theory about what might be happening...
> >
> > Thanks for taking the time, Peter! Let me know when you have insights.
>
> I think I've found the problem, I'm just testing it to see if it
> does properly fix the intermittent.

> The fix is to add an extra patch at the start of the
> series. Once I've tested this I'll send out a v2 of the series,
> maybe also adding the cleanup RTH suggested in one of the later
> patches.

By the way, how would you like this series to go upstream?
I can think of three options:
 * you take the whole thing through the s390 tree
 * I take the whole thing through my tree
 * you take the s390 specific patches, and I take the
   remaining reset-cleanup patches once those have landed

Any preference?

thanks
-- PMM
Thomas Huth Aug. 30, 2024, 12:21 p.m. UTC | #20
On 30/08/2024 14.17, Peter Maydell wrote:
> On Fri, 30 Aug 2024 at 13:04, Peter Maydell <peter.maydell@linaro.org> wrote:
>>
>> On Fri, 30 Aug 2024 at 12:56, Nico Boehr <nrb@linux.ibm.com> wrote:
>>>
>>> Quoting Peter Maydell (2024-08-30 12:01:47)
>>>> I ran overnight with none of the patchset applied, and it never
>>>> failed (as expected). Running with just the first virtio-ccw
>>>> patch does fall over fairly quickly. So something's up with
>>>> that patch, which is curious because that's the one I thought
>>>> was a straightforward conversion without any complications :-)
>>>>
>>>> I'll investigate further today, I have the beginnings of a
>>>> theory about what might be happening...
>>>
>>> Thanks for taking the time, Peter! Let me know when you have insights.
>>
>> I think I've found the problem, I'm just testing it to see if it
>> does properly fix the intermittent.
> 
>> The fix is to add an extra patch at the start of the
>> series. Once I've tested this I'll send out a v2 of the series,
>> maybe also adding the cleanup RTH suggested in one of the later
>> patches.
> 
> By the way, how would you like this series to go upstream?
> I can think of three options:
>   * you take the whole thing through the s390 tree
>   * I take the whole thing through my tree
>   * you take the s390 specific patches, and I take the
>     remaining reset-cleanup patches once those have landed
> 
> Any preference?

I'm fine if you take the patches through your tree along with the other 
reset-related patches.

  Thomas