Message ID | 20240118185151.265329-1-mjrosato@linux.ibm.com (mailing list archive) |
---|---|
Headers | show |
Series | s390x/pci: fix ISM reset | expand |
On 1/18/24 19:51, Matthew Rosato wrote: > Commit ef1535901a0 (re-)introduced an issue where passthrough ISM devices > on s390x would enter an error state after reboot. This was previously fixed > by 03451953c79e, using device reset callbacks, however the change in > ef1535901a0 effectively triggers a cold reset of the pci bus before the > device reset callbacks are triggered. > > To resolve this, this series proposes to remove the use of the reset callback > for ISM cleanup and instead trigger ISM reset from subsystem_reset before > triggering bus resets. This has to happen before the bus resets because the > reset of s390-pcihost will trigger reset of the PCI bus followed by the > s390-pci bus, and the former will trigger vfio-pci reset / the aperture-wide > unmap that ISM gets upset about. > > /s390-pcihost (s390-pcihost) > /pci.0 (PCI) > /s390-pcibus.0 (s390-pcibus) > > While fixing this, it was also noted that kernel warnings could be seen that > indicate a guest ISC reference count error. That's because in some reset > cases we were not bothering to disable AIF, but would again re-enable it after > the reset (causing the reference count to grow erroneously). This was a base > issue that went unnoticed because the kernel previously did not detect and > issue a warning for this scenario. > > > Changes for v2: > - Fold a typo fix from patch 2 into patch 1 where it belongs > - Add block comment re: timing of ISM reset > - Add review tags Reviewed-by: Cédric Le Goater <clg@redhat.com> Thanks, C.
18.01.2024 21:51, Matthew Rosato : > Commit ef1535901a0 (re-)introduced an issue where passthrough ISM devices > on s390x would enter an error state after reboot. This was previously fixed > by 03451953c79e, using device reset callbacks, however the change in > ef1535901a0 effectively triggers a cold reset of the pci bus before the > device reset callbacks are triggered. > > To resolve this, this series proposes to remove the use of the reset callback > for ISM cleanup and instead trigger ISM reset from subsystem_reset before > triggering bus resets. This has to happen before the bus resets because the > reset of s390-pcihost will trigger reset of the PCI bus followed by the > s390-pci bus, and the former will trigger vfio-pci reset / the aperture-wide > unmap that ISM gets upset about. > > /s390-pcihost (s390-pcihost) > /pci.0 (PCI) > /s390-pcibus.0 (s390-pcibus) > > While fixing this, it was also noted that kernel warnings could be seen that > indicate a guest ISC reference count error. That's because in some reset > cases we were not bothering to disable AIF, but would again re-enable it after > the reset (causing the reference count to grow erroneously). This was a base > issue that went unnoticed because the kernel previously did not detect and > issue a warning for this scenario. > Matthew Rosato (3): > s390x/pci: avoid double enable/disable of aif > s390x/pci: refresh fh before disabling aif > s390x/pci: drive ISM reset from subsystem reset Is it this a material for -stable, or there's no need to bother? (changes 1 and 2 applies to 7.2 (while 2 fixes later change), all 3 applies to 8.1 (while 3 fixes later change), and all 3 can be picked up for 8.2, I guess). Thanks, /mjt
22.01.2024 13:18, Michael Tokarev : .. > Is it this a material for -stable, or there's no need to bother? Actually it's been Cc'd to qemu-stable@ already, I haven't noticed. Still there's a question which branches should get which patches. > (changes 1 and 2 applies to 7.2 (while 2 fixes later change), > all 3 applies to 8.1 (while 3 fixes later change), and all 3 can be > picked up for 8.2, I guess). 07b2c8e034d80f s390x/pci: avoid double enable/disable of aif Fixes: v7.1.0-416-gd0bc7091c2 s390x/pci: enable adapter event notification for interpreted devices 30e35258e25c75 s390x/pci: refresh fh before disabling aif Fixes: v7.2.0-51-g03451953c7 s390x/pci: reset ISM passthrough devices on shutdown and system reset 68c691ca99a253 s390x/pci: drive ISM reset from subsystem reset Fixes: v8.1.0-654-gef1535901a s390x: do a subsystem reset before the unprotect on reboot Fixes: v7.2.0-51-g03451953c7 s390x/pci: reset ISM passthrough devices on shutdown and system reset So all 3 are okay for 8.2. What about 8.1 and 7.2 which are the current still-maintained stable branches? (I think this 8.1 release will be the last in series). Thanks, /mjt
On 22/01/2024 11.31, Michael Tokarev wrote: > 22.01.2024 13:18, Michael Tokarev : > .. >> Is it this a material for -stable, or there's no need to bother? > > Actually it's been Cc'd to qemu-stable@ already, I haven't noticed. > Still there's a question which branches should get which patches. ... > So all 3 are okay for 8.2. > > What about 8.1 and 7.2 which are the current still-maintained stable branches? > (I think this 8.1 release will be the last in series). IIUC the main issue that this series fixes is the bug that has been uncovered by commit ef1535901a07f2e49fa25c8bcee7f0b73801d824 ("s390x: do a subsystem reset before the unprotect on reboot") - and that one is in 8.2 only. So I think it should be OK to just backport this to 8.2 and skip 8.1 and 7.2. Thomas
On 1/22/24 5:49 AM, Thomas Huth wrote: > On 22/01/2024 11.31, Michael Tokarev wrote: >> 22.01.2024 13:18, Michael Tokarev : >> .. >>> Is it this a material for -stable, or there's no need to bother? >> >> Actually it's been Cc'd to qemu-stable@ already, I haven't noticed. >> Still there's a question which branches should get which patches. > ... >> So all 3 are okay for 8.2. >> >> What about 8.1 and 7.2 which are the current still-maintained stable branches? >> (I think this 8.1 release will be the last in series). > > IIUC the main issue that this series fixes is the bug that has been uncovered by commit ef1535901a07f2e49fa25c8bcee7f0b73801d824 ("s390x: do a subsystem reset before the unprotect on reboot") - and that one is in 8.2 only. So I think it should be OK to just backport this to 8.2 and skip 8.1 and 7.2. > Agreed. Thanks, Matt