Message ID | 20190222183941.29596-1-farman@linux.ibm.com (mailing list archive) |
---|---|
Headers | show |
Series | Fix vfio-ccw handling of TIC recursion | expand |
On Fri, 22 Feb 2019 19:39:39 +0100 Eric Farman <farman@linux.ibm.com> wrote: > Per the discussion [1] about a problem with how vfio-ccw calculates > the length of a channel program (specifically when using the > forthcoming QEMU BIOS code for DASD IPL), I present this fix. > > Patch 1 fixes the problem, and is over-engineered > for readability sake. :) > > Patch 2 takes the functions from Patch 1, and refactors the > existing code to make other areas a little easier to understand. > (I hope.) > > I've been running fio for over 24 hours now, and have seen > zero hours. Previously, I would have probably seen "a few" > errors by now, where prior to the original fix I would've seen > "many" errors. Further tests are still ongoing. Awesome, thanks! > > [1] https://marc.info/?l=linux-s390&m=155063096321940&w=2 > > Eric Farman (2): > s390/cio: Fix vfio-ccw handling of recursive TICs > s390/cio: Use cpa range elsewhere within vfio-ccw > > drivers/s390/cio/vfio_ccw_cp.c | 55 ++++++++++++++++++++++++++++++++---------- > 1 file changed, 42 insertions(+), 13 deletions(-) > I hope we can queue the patches soon, reviewing now.
On 02/25/2019 04:11 AM, Cornelia Huck wrote: > On Fri, 22 Feb 2019 19:39:39 +0100 > Eric Farman <farman@linux.ibm.com> wrote: > >> Per the discussion [1] about a problem with how vfio-ccw calculates >> the length of a channel program (specifically when using the >> forthcoming QEMU BIOS code for DASD IPL), I present this fix. >> >> Patch 1 fixes the problem, and is over-engineered >> for readability sake. > > :) > >> >> Patch 2 takes the functions from Patch 1, and refactors the >> existing code to make other areas a little easier to understand. >> (I hope.) >> >> I've been running fio for over 24 hours now, and have seen >> zero hours. Previously, I would have probably seen "a few" >> errors by now, where prior to the original fix I would've seen >> "many" errors. Further tests are still ongoing. > > Awesome, thanks! I left fio running over the weekend, with newly-randomized parameters every hour or two... Had one error yesterday morning, in the NOP+TIC-to-redrive-I/O case. I didn't leave any tracing on because I didn't expect I'd be able to get anything before they wrapped, and didn't have time to figure out a way to cleanly filter errors. Though I did leave a counter in place for the number of times we processed a TIC that goes back into the current chain, and it hit about 1900 times since Friday. More than three quarters of them occurred during the error yesterday morning, so something was being dramatic at the time. I guess there's one obscure corner to track down, but it otherwise seems to run quite a bit better than before. - Eric > >> >> [1] https://marc.info/?l=linux-s390&m=155063096321940&w=2 >> >> Eric Farman (2): >> s390/cio: Fix vfio-ccw handling of recursive TICs >> s390/cio: Use cpa range elsewhere within vfio-ccw >> >> drivers/s390/cio/vfio_ccw_cp.c | 55 ++++++++++++++++++++++++++++++++---------- >> 1 file changed, 42 insertions(+), 13 deletions(-) >> > > I hope we can queue the patches soon, reviewing now. >
On Mon, 25 Feb 2019 11:30:47 -0500 Eric Farman <farman@linux.ibm.com> wrote: > On 02/25/2019 04:11 AM, Cornelia Huck wrote: > > On Fri, 22 Feb 2019 19:39:39 +0100 > > Eric Farman <farman@linux.ibm.com> wrote: > > > >> Per the discussion [1] about a problem with how vfio-ccw calculates > >> the length of a channel program (specifically when using the > >> forthcoming QEMU BIOS code for DASD IPL), I present this fix. > >> > >> Patch 1 fixes the problem, and is over-engineered > >> for readability sake. > > > > :) > > > >> > >> Patch 2 takes the functions from Patch 1, and refactors the > >> existing code to make other areas a little easier to understand. > >> (I hope.) > >> > >> I've been running fio for over 24 hours now, and have seen > >> zero hours. Previously, I would have probably seen "a few" > >> errors by now, where prior to the original fix I would've seen > >> "many" errors. Further tests are still ongoing. > > > > Awesome, thanks! > > I left fio running over the weekend, with newly-randomized parameters > every hour or two... Had one error yesterday morning, in the > NOP+TIC-to-redrive-I/O case. I didn't leave any tracing on because I > didn't expect I'd be able to get anything before they wrapped, and > didn't have time to figure out a way to cleanly filter errors. > > Though I did leave a counter in place for the number of times we > processed a TIC that goes back into the current chain, and it hit about > 1900 times since Friday. More than three quarters of them occurred > during the error yesterday morning, so something was being dramatic at > the time. I guess there's one obscure corner to track down, but it > otherwise seems to run quite a bit better than before. Agreed, the patches as they are now are a real improvement.
On 02/22/2019 01:39 PM, Eric Farman wrote: > Per the discussion [1] about a problem with how vfio-ccw calculates > the length of a channel program (specifically when using the > forthcoming QEMU BIOS code for DASD IPL), I present this fix. > > Patch 1 fixes the problem, and is over-engineered > for readability sake. > > Patch 2 takes the functions from Patch 1, and refactors the > existing code to make other areas a little easier to understand. > (I hope.) > > I've been running fio for over 24 hours now, and have seen > zero hours. Previously, I would have probably seen "a few" > errors by now, where prior to the original fix I would've seen > "many" errors. Further tests are still ongoing. > > [1] https://marc.info/?l=linux-s390&m=155063096321940&w=2 > > Eric Farman (2): > s390/cio: Fix vfio-ccw handling of recursive TICs > s390/cio: Use cpa range elsewhere within vfio-ccw > > drivers/s390/cio/vfio_ccw_cp.c | 55 ++++++++++++++++++++++++++++++++---------- > 1 file changed, 42 insertions(+), 13 deletions(-) > Reviewed-by: Farhan Ali <alifm@linux.ibm.com> for the series.
On Fri, 22 Feb 2019 19:39:39 +0100 Eric Farman <farman@linux.ibm.com> wrote: > Per the discussion [1] about a problem with how vfio-ccw calculates > the length of a channel program (specifically when using the > forthcoming QEMU BIOS code for DASD IPL), I present this fix. > > Patch 1 fixes the problem, and is over-engineered > for readability sake. > > Patch 2 takes the functions from Patch 1, and refactors the > existing code to make other areas a little easier to understand. > (I hope.) > > I've been running fio for over 24 hours now, and have seen > zero hours. Previously, I would have probably seen "a few" > errors by now, where prior to the original fix I would've seen > "many" errors. Further tests are still ongoing. > > [1] https://marc.info/?l=linux-s390&m=155063096321940&w=2 > > Eric Farman (2): > s390/cio: Fix vfio-ccw handling of recursive TICs > s390/cio: Use cpa range elsewhere within vfio-ccw > > drivers/s390/cio/vfio_ccw_cp.c | 55 ++++++++++++++++++++++++++++++++---------- > 1 file changed, 42 insertions(+), 13 deletions(-) > Thanks, applied.