Message ID | 1524149293-12658-2-git-send-email-pmorel@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | New, archived |
On 24/04/2018 08:54, Dong Jia Shi wrote:
> * Pierre Morel <pmorel@linux.vnet.ibm.com> [2018-04-19 16:48:04 +0200]:
>
> [...]
>
>> @@ -94,9 +83,15 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
>>  static void vfio_ccw_sch_irq(struct subchannel *sch)
>>  {
>>  	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
>> +	struct irb *irb = this_cpu_ptr(&cio_irb);
>>
>>  	inc_irq_stat(IRQIO_CIO);
>> -	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_INTERRUPT);
>> +	memcpy(&private->irb, irb, sizeof(*irb));
>> +
>> +	WARN_ON(work_pending(&private->io_work));
> Hmm, why do we need this?

The current design ensures that we do not have two concurrent SSCH requests.
However, I want to track spurious interrupts here.
If we implement cancel, halt or clear requests, we may also trigger (AFAIU)
a second interrupt, depending on races between instructions, controller
and device.

We do not strictly need it.

>
>> +	queue_work(vfio_ccw_work_q, &private->io_work);
>> +	if (private->completion)
>> +		complete(private->completion);
>>  }
>>
>>  static int vfio_ccw_sch_probe(struct subchannel *sch)
> [...]
>
On Tue, 24 Apr 2018 10:40:56 +0200
Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:

> On 24/04/2018 08:54, Dong Jia Shi wrote:
> > * Pierre Morel <pmorel@linux.vnet.ibm.com> [2018-04-19 16:48:04 +0200]:
> >
> > [...]
> >
> >> @@ -94,9 +83,15 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
> >>  static void vfio_ccw_sch_irq(struct subchannel *sch)
> >>  {
> >>  	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
> >> +	struct irb *irb = this_cpu_ptr(&cio_irb);
> >>
> >>  	inc_irq_stat(IRQIO_CIO);
> >> -	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_INTERRUPT);
> >> +	memcpy(&private->irb, irb, sizeof(*irb));
> >> +
> >> +	WARN_ON(work_pending(&private->io_work));
> > Hmm, why do we need this?
>
> The current design ensures that we do not have two concurrent SSCH requests.
> However, I want to track spurious interrupts here.
> If we implement cancel, halt or clear requests, we may also trigger (AFAIU)
> a second interrupt, depending on races between instructions, controller
> and device.

You won't get an interrupt for a successful cancel. If you do a
halt/clear, you will make the subchannel halt/clear pending in addition
to start pending and you'll only get one interrupt (if the I/O has
progressed far enough, you won't be able to issue a hsch). The
interesting case is:
- guest does a ssch, we do a ssch on the device
- the guest does a csch before it got the interrupt for the ssch
- before we do the csch on the device, the subchannel is already status
  pending with completion of the ssch
- after we issue the csch, we get a second interrupt (for the csch)

I think we should present two interrupts to the guest in that case.
Races between issuing ssch/hsch/csch and the subchannel becoming status
pending happen on real hardware as well, we're just more likely to see
them with the vfio layer in between.

(I'm currently trying to recall what we're doing with unsolicited
interrupts. These are fun wrt deferred cc 1; I'm not sure if there are
cases where we want to present a deferred cc to the guest.)

Also, doing a second ssch before we got final state for the first one
is perfectly valid. Linux just does not do it, so I'm not sure if we
should invest too much time there.

>
> We do not strictly need it.
>
> >
> >> +	queue_work(vfio_ccw_work_q, &private->io_work);
> >> +	if (private->completion)
> >> +		complete(private->completion);
> >>  }
> >>
> >>  static int vfio_ccw_sch_probe(struct subchannel *sch)
> > [...]
> >
>
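The ssch/csch race described in this mail can be modelled in plain userspace C. This is only a sketch, not driver code: `struct model_irb`, `struct model_private`, `model_irq()` and `model_work()` are hypothetical stand-ins for `struct irb`, `struct vfio_ccw_private`, `vfio_ccw_sch_irq()` and the queued work item, and the model assumes the work item has not run yet when the second interrupt arrives. Under that assumption the single `irb` slot is overwritten, which is the situation the `WARN_ON(work_pending(...))` in the patch would flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct irb: only records which function completed. */
struct model_irb {
	char fctl;	/* 's' = start function (ssch), 'c' = clear function (csch) */
};

/* Hypothetical stand-in for struct vfio_ccw_private. */
struct model_private {
	struct model_irb irb;	/* single slot, as in the patch */
	bool work_pending;	/* models work_pending(&private->io_work) */
	int delivered;		/* interrupts handed on to the guest */
	int overwritten;	/* how often the WARN_ON condition was true */
};

/* Models the interrupt handler: copy the IRB, then queue the work item. */
static void model_irq(struct model_private *p, const struct model_irb *irb)
{
	assert(p && irb);
	if (p->work_pending)
		p->overwritten++;	/* the previous IRB was never processed */
	memcpy(&p->irb, irb, sizeof(*irb));
	p->work_pending = true;
}

/* Models the work item: hand the stored IRB to the guest. */
static void model_work(struct model_private *p)
{
	assert(p);
	if (!p->work_pending)
		return;
	p->work_pending = false;
	p->delivered++;
}
```

Calling `model_irq()` twice (first for the ssch completion, then for the csch) before a single `model_work()` leaves one delivered interrupt carrying the csch status, so the guest would see one interrupt where the mail argues for two.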
On 24/04/2018 11:59, Cornelia Huck wrote:
> On Tue, 24 Apr 2018 10:40:56 +0200
> Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
>
>> On 24/04/2018 08:54, Dong Jia Shi wrote:
>>> * Pierre Morel <pmorel@linux.vnet.ibm.com> [2018-04-19 16:48:04 +0200]:
>>>
>>> [...]
>>>
>>>> @@ -94,9 +83,15 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
>>>>  static void vfio_ccw_sch_irq(struct subchannel *sch)
>>>>  {
>>>>  	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
>>>> +	struct irb *irb = this_cpu_ptr(&cio_irb);
>>>>
>>>>  	inc_irq_stat(IRQIO_CIO);
>>>> -	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_INTERRUPT);
>>>> +	memcpy(&private->irb, irb, sizeof(*irb));
>>>> +
>>>> +	WARN_ON(work_pending(&private->io_work));
>>> Hmm, why do we need this?
>> The current design ensures that we do not have two concurrent SSCH requests.
>> However, I want to track spurious interrupts here.
>> If we implement cancel, halt or clear requests, we may also trigger (AFAIU)
>> a second interrupt, depending on races between instructions, controller
>> and device.
> You won't get an interrupt for a successful cancel. If you do a
> halt/clear, you will make the subchannel halt/clear pending in addition
> to start pending and you'll only get one interrupt (if the I/O has
> progressed far enough, you won't be able to issue a hsch). The
> interesting case is:
> - guest does a ssch, we do a ssch on the device
> - the guest does a csch before it got the interrupt for the ssch
> - before we do the csch on the device, the subchannel is already status
>   pending with completion of the ssch
> - after we issue the csch, we get a second interrupt (for the csch)

We agree.

>
> I think we should present two interrupts to the guest in that case.
> Races between issuing ssch/hsch/csch and the subchannel becoming status
> pending happen on real hardware as well, we're just more likely to see
> them with the vfio layer in between.

Yes, agreed too.

>
> (I'm currently trying to recall what we're doing with unsolicited
> interrupts. These are fun wrt deferred cc 1; I'm not sure if there are
> cases where we want to present a deferred cc to the guest.)

This patch does not change the current functionality, it only
consolidates the FSM.
The current way to handle unsolicited interrupts is to report them to
the guest along with the deferred code AFAIU.

>
> Also, doing a second ssch before we got final state for the first one
> is perfectly valid. Linux just does not do it, so I'm not sure if we
> should invest too much time there.

I agree too, it would just make things unnecessarily complicated.
On Tue, 24 Apr 2018 13:49:14 +0200
Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:

> On 24/04/2018 11:59, Cornelia Huck wrote:
> > On Tue, 24 Apr 2018 10:40:56 +0200
> > Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
> >
> >> On 24/04/2018 08:54, Dong Jia Shi wrote:
> >>> * Pierre Morel <pmorel@linux.vnet.ibm.com> [2018-04-19 16:48:04 +0200]:
> >>>
> >>> [...]
> >>>
> >>>> @@ -94,9 +83,15 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
> >>>>  static void vfio_ccw_sch_irq(struct subchannel *sch)
> >>>>  {
> >>>>  	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
> >>>> +	struct irb *irb = this_cpu_ptr(&cio_irb);
> >>>>
> >>>>  	inc_irq_stat(IRQIO_CIO);
> >>>> -	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_INTERRUPT);
> >>>> +	memcpy(&private->irb, irb, sizeof(*irb));
> >>>> +
> >>>> +	WARN_ON(work_pending(&private->io_work));
> >>> Hmm, why do we need this?
> >> The current design ensures that we do not have two concurrent SSCH requests.
> >> However, I want to track spurious interrupts here.
> >> If we implement cancel, halt or clear requests, we may also trigger (AFAIU)
> >> a second interrupt, depending on races between instructions, controller
> >> and device.
> > You won't get an interrupt for a successful cancel. If you do a
> > halt/clear, you will make the subchannel halt/clear pending in addition
> > to start pending and you'll only get one interrupt (if the I/O has
> > progressed far enough, you won't be able to issue a hsch). The
> > interesting case is:
> > - guest does a ssch, we do a ssch on the device
> > - the guest does a csch before it got the interrupt for the ssch
> > - before we do the csch on the device, the subchannel is already status
> >   pending with completion of the ssch
> > - after we issue the csch, we get a second interrupt (for the csch)
>
> We agree.
>
> >
> > I think we should present two interrupts to the guest in that case.
> > Races between issuing ssch/hsch/csch and the subchannel becoming status
> > pending happen on real hardware as well, we're just more likely to see
> > them with the vfio layer in between.
>
> Yes, agreed too.
>
> >
> > (I'm currently trying to recall what we're doing with unsolicited
> > interrupts. These are fun wrt deferred cc 1; I'm not sure if there are
> > cases where we want to present a deferred cc to the guest.)
>
> This patch does not change the current functionality, it only
> consolidates the FSM.
> The current way to handle unsolicited interrupts is to report them to
> the guest along with the deferred code AFAIU.

My question was more along the lines of "do we actually want to
_generate_ a deferred cc1 or unsolicited interrupt, based upon what we
do in our state machine". My guess is no, regardless of the changes you
do in this series.

>
> >
> > Also, doing a second ssch before we got final state for the first one
> > is perfectly valid. Linux just does not do it, so I'm not sure if we
> > should invest too much time there.
>
> I agree too, it would just make things unnecessarily complicated.

I'm a big fan of just throwing everything at the hardware and letting it
sort out any races etc. We just need to be sure we don't mix up
interrupts :)
On 24/04/2018 13:55, Cornelia Huck wrote:
> On Tue, 24 Apr 2018 13:49:14 +0200
> Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
>
>> On 24/04/2018 11:59, Cornelia Huck wrote:
>>> On Tue, 24 Apr 2018 10:40:56 +0200
>>> Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
>>>
>>>> On 24/04/2018 08:54, Dong Jia Shi wrote:
>>>>> * Pierre Morel <pmorel@linux.vnet.ibm.com> [2018-04-19 16:48:04 +0200]:
>>>>>
>>>>> [...]
>>>>>
>>>>>> @@ -94,9 +83,15 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
>>>>>>  static void vfio_ccw_sch_irq(struct subchannel *sch)
>>>>>>  {
>>>>>>  	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
>>>>>> +	struct irb *irb = this_cpu_ptr(&cio_irb);
>>>>>>
>>>>>>  	inc_irq_stat(IRQIO_CIO);
>>>>>> -	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_INTERRUPT);
>>>>>> +	memcpy(&private->irb, irb, sizeof(*irb));
>>>>>> +
>>>>>> +	WARN_ON(work_pending(&private->io_work));
>>>>> Hmm, why do we need this?
>>>> The current design ensures that we do not have two concurrent SSCH requests.
>>>> However, I want to track spurious interrupts here.
>>>> If we implement cancel, halt or clear requests, we may also trigger (AFAIU)
>>>> a second interrupt, depending on races between instructions, controller
>>>> and device.
>>> You won't get an interrupt for a successful cancel. If you do a
>>> halt/clear, you will make the subchannel halt/clear pending in addition
>>> to start pending and you'll only get one interrupt (if the I/O has
>>> progressed far enough, you won't be able to issue a hsch). The
>>> interesting case is:
>>> - guest does a ssch, we do a ssch on the device
>>> - the guest does a csch before it got the interrupt for the ssch
>>> - before we do the csch on the device, the subchannel is already status
>>>   pending with completion of the ssch
>>> - after we issue the csch, we get a second interrupt (for the csch)
>> We agree.
>>
>>> I think we should present two interrupts to the guest in that case.
>>> Races between issuing ssch/hsch/csch and the subchannel becoming status
>>> pending happen on real hardware as well, we're just more likely to see
>>> them with the vfio layer in between.
>> Yes, agreed too.
>>
>>> (I'm currently trying to recall what we're doing with unsolicited
>>> interrupts. These are fun wrt deferred cc 1; I'm not sure if there are
>>> cases where we want to present a deferred cc to the guest.)
>> This patch does not change the current functionality, it only
>> consolidates the FSM.
>> The current way to handle unsolicited interrupts is to report them to
>> the guest along with the deferred code AFAIU.
> My question was more along the lines of "do we actually want to
> _generate_ a deferred cc1 or unsolicited interrupt, based upon what we
> do in our state machine". My guess is no, regardless of the changes you
> do in this series.
>
>>> Also, doing a second ssch before we got final state for the first one
>>> is perfectly valid. Linux just does not do it, so I'm not sure if we
>>> should invest too much time there.
>> I agree too, it would just make things unnecessarily complicated.
> I'm a big fan of just throwing everything at the hardware and letting it
> sort out any races etc. We just need to be sure we don't mix up
> interrupts :)
>

OK, I understand. I can do something in the interrupt handler to make
sure we do not lose interrupts.
I will make a proposal in v2.

Thanks,
Pierre
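One way to avoid losing an interrupt is to stop keeping a single IRB copy and queue the copies instead. The sketch below is a userspace illustration of that idea only; the thread does not say what v2 actually does. `struct model_irb`, `struct irb_ring`, `IRB_RING_SIZE`, `irb_ring_push()` and `irb_ring_pop()` are all hypothetical names, and the model is single-threaded: real interrupt-context code would additionally need locking or memory barriers between producer and consumer.

```c
#include <assert.h>
#include <stdbool.h>

#define IRB_RING_SIZE 4	/* hypothetical depth; must be a power of two */

/* Hypothetical stand-in for struct irb. */
struct model_irb {
	char fctl;	/* 's' = start function, 'c' = clear function */
};

/* Fixed-size FIFO of IRB copies, filled by the IRQ side, drained by the worker. */
struct irb_ring {
	struct model_irb slot[IRB_RING_SIZE];
	unsigned int head;	/* count of IRBs ever pushed */
	unsigned int tail;	/* count of IRBs ever popped */
};

/* IRQ side: store a copy of the IRB; returns false if the ring is full. */
static bool irb_ring_push(struct irb_ring *r, const struct model_irb *irb)
{
	assert(r && irb);
	if (r->head - r->tail == IRB_RING_SIZE)
		return false;
	r->slot[r->head % IRB_RING_SIZE] = *irb;
	r->head++;
	return true;
}

/* Worker side: fetch the oldest IRB; returns false if the ring is empty. */
static bool irb_ring_pop(struct irb_ring *r, struct model_irb *out)
{
	assert(r && out);
	if (r->head == r->tail)
		return false;
	*out = r->slot[r->tail % IRB_RING_SIZE];
	r->tail++;
	return true;
}
```

With this layout, an ssch completion followed immediately by a csch interrupt is popped in arrival order, so both could be presented to the guest. A full ring is reported to the caller rather than silently overwriting an entry.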
On 04/24/2018 11:59 AM, Cornelia Huck wrote:
> On Tue, 24 Apr 2018 10:40:56 +0200
> Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
>
>> On 24/04/2018 08:54, Dong Jia Shi wrote:
>>> * Pierre Morel <pmorel@linux.vnet.ibm.com> [2018-04-19 16:48:04 +0200]:
>>>
>>> [...]
>>>
>>>> @@ -94,9 +83,15 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
>>>>  static void vfio_ccw_sch_irq(struct subchannel *sch)
>>>>  {
>>>>  	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
>>>> +	struct irb *irb = this_cpu_ptr(&cio_irb);
>>>>
>>>>  	inc_irq_stat(IRQIO_CIO);
>>>> -	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_INTERRUPT);
>>>> +	memcpy(&private->irb, irb, sizeof(*irb));
>>>> +
>>>> +	WARN_ON(work_pending(&private->io_work));
>>> Hmm, why do we need this?
>>
>> The current design ensures that we do not have two concurrent SSCH requests.
>> However, I want to track spurious interrupts here.
>> If we implement cancel, halt or clear requests, we may also trigger (AFAIU)
>> a second interrupt, depending on races between instructions, controller
>> and device.
>
> You won't get an interrupt for a successful cancel. If you do a
> halt/clear, you will make the subchannel halt/clear pending in addition
> to start pending and you'll only get one interrupt (if the I/O has
> progressed far enough, you won't be able to issue a hsch). The
> interesting case is:
> - guest does a ssch, we do a ssch on the device
> - the guest does a csch before it got the interrupt for the ssch
> - before we do the csch on the device, the subchannel is already status
>   pending with completion of the ssch
> - after we issue the csch, we get a second interrupt (for the csch)
>
> I think we should present two interrupts to the guest in that case.
> Races between issuing ssch/hsch/csch and the subchannel becoming status
> pending happen on real hardware as well, we're just more likely to see
> them with the vfio layer in between.
>

AFAIU this will be the problem of the person implementing the clear
and the halt for vfio-ccw. I.e. it's a non-problem right now.

> (I'm currently trying to recall what we're doing with unsolicited
> interrupts. These are fun wrt deferred cc 1; I'm not sure if there are
> cases where we want to present a deferred cc to the guest.)

What are 'fun wrt deferred cc 1' again? The deferred cc I understand,
but wrt does not click at all.

>
> Also, doing a second ssch before we got final state for the first one
> is perfectly valid. Linux just does not do it, so I'm not sure if we
> should invest too much time there.
>
>>
>> We do not strictly need it.
>>
>>>
>>>> +	queue_work(vfio_ccw_work_q, &private->io_work);
>>>> +	if (private->completion)
>>>> +		complete(private->completion);
>>>>  }
>>>>
>>>>  static int vfio_ccw_sch_probe(struct subchannel *sch)
>>> [...]
>>>
>>
>
On Tue, 24 Apr 2018 18:42:38 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On 04/24/2018 11:59 AM, Cornelia Huck wrote:
> > On Tue, 24 Apr 2018 10:40:56 +0200
> > Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
> >
> >> On 24/04/2018 08:54, Dong Jia Shi wrote:
> >>> * Pierre Morel <pmorel@linux.vnet.ibm.com> [2018-04-19 16:48:04 +0200]:
> >>>
> >>> [...]
> >>>
> >>>> @@ -94,9 +83,15 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
> >>>>  static void vfio_ccw_sch_irq(struct subchannel *sch)
> >>>>  {
> >>>>  	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
> >>>> +	struct irb *irb = this_cpu_ptr(&cio_irb);
> >>>>
> >>>>  	inc_irq_stat(IRQIO_CIO);
> >>>> -	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_INTERRUPT);
> >>>> +	memcpy(&private->irb, irb, sizeof(*irb));
> >>>> +
> >>>> +	WARN_ON(work_pending(&private->io_work));
> >>> Hmm, why do we need this?
> >>
> >> The current design ensures that we do not have two concurrent SSCH requests.
> >> However, I want to track spurious interrupts here.
> >> If we implement cancel, halt or clear requests, we may also trigger (AFAIU)
> >> a second interrupt, depending on races between instructions, controller
> >> and device.
> >
> > You won't get an interrupt for a successful cancel. If you do a
> > halt/clear, you will make the subchannel halt/clear pending in addition
> > to start pending and you'll only get one interrupt (if the I/O has
> > progressed far enough, you won't be able to issue a hsch). The
> > interesting case is:
> > - guest does a ssch, we do a ssch on the device
> > - the guest does a csch before it got the interrupt for the ssch
> > - before we do the csch on the device, the subchannel is already status
> >   pending with completion of the ssch
> > - after we issue the csch, we get a second interrupt (for the csch)
> >
> > I think we should present two interrupts to the guest in that case.
> > Races between issuing ssch/hsch/csch and the subchannel becoming status
> > pending happen on real hardware as well, we're just more likely to see
> > them with the vfio layer in between.
> >
>
> AFAIU this will be the problem of the person implementing the clear
> and the halt for vfio-ccw. I.e. it's a non-problem right now.

Well, that person is me :) I will post some RFC Real Soon Now if I stop
getting sidetracked...

>
> > (I'm currently trying to recall what we're doing with unsolicited
> > interrupts. These are fun wrt deferred cc 1; I'm not sure if there are
> > cases where we want to present a deferred cc to the guest.)
>
> What are 'fun wrt deferred cc 1' again? The deferred cc I understand,
> but wrt does not click at all.

wrt == with regard to

(Or were you asking something else?)

>
> >
> > Also, doing a second ssch before we got final state for the first one
> > is perfectly valid. Linux just does not do it, so I'm not sure if we
> > should invest too much time there.
> >
> >>
> >> We do not strictly need it.
> >>
> >>>
> >>>> +	queue_work(vfio_ccw_work_q, &private->io_work);
> >>>> +	if (private->completion)
> >>>> +		complete(private->completion);
> >>>>  }
> >>>>
> >>>>  static int vfio_ccw_sch_probe(struct subchannel *sch)
> >>> [...]
> >>>
> >>
> >
>
On 04/25/2018 08:57 AM, Cornelia Huck wrote:
>> AFAIU this will be the problem of the person implementing the clear
>> and the halt for vfio-ccw. I.e. it's a non-problem right now.
> Well, that person is me :) I will post some RFC Real Soon Now if I stop
> getting sidetracked...
>

Makes sense. It should be fine either way AFAIU.

CSCH, or more precisely the clear function, is supposed to clear the
interruption request(s) too. But I guess there is no way for the CP to
identify an I/O interrupt that should have been cleared -- that is, to
catch us disrespecting the architecture. I can't think of a way to
establish a must-happen-before relationship...

But discarding the first interrupt and delivering just one for the CSCH
is fine too, for the same reason.

Regards,
Halil
On Wed, 25 Apr 2018 13:06:31 +0200
Halil Pasic <pasic@linux.ibm.com> wrote:

> On 04/25/2018 08:57 AM, Cornelia Huck wrote:
> >> AFAIU this will be the problem of the person implementing the clear
> >> and the halt for vfio-ccw. I.e. it's a non-problem right now.
> > Well, that person is me :) I will post some RFC Real Soon Now if I stop
> > getting sidetracked...
> >
>
> Makes sense. It should be fine either way AFAIU.
>
> CSCH, or more precisely the clear function, is supposed to clear the
> interruption request(s) too. But I guess there is no way for the CP to
> identify an I/O interrupt that should have been cleared -- that is, to
> catch us disrespecting the architecture. I can't think of a way to
> establish a must-happen-before relationship...
>
> But discarding the first interrupt and delivering just one for the CSCH
> is fine too, for the same reason.

Yes, both work. The calling code in the guest has to be able to handle
both anyway, since both can happen on real hardware as well (with a
smaller race window).
diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index ea6a2d0..f1b158c 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -70,20 +70,9 @@ int vfio_ccw_sch_quiesce(struct subchannel *sch)
 static void vfio_ccw_sch_io_todo(struct work_struct *work)
 {
 	struct vfio_ccw_private *private;
-	struct irb *irb;
 
 	private = container_of(work, struct vfio_ccw_private, io_work);
-	irb = &private->irb;
-
-	if (scsw_is_solicited(&irb->scsw)) {
-		cp_update_scsw(&private->cp, &irb->scsw);
-		cp_free(&private->cp);
-	}
-	memcpy(private->io_region.irb_area, irb, sizeof(*irb));
-
-	if (private->io_trigger)
-		eventfd_signal(private->io_trigger, 1);
-
+	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_INTERRUPT);
 	if (private->mdev)
 		private->state = VFIO_CCW_STATE_IDLE;
 }
@@ -94,9 +83,15 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
 static void vfio_ccw_sch_irq(struct subchannel *sch)
 {
 	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
+	struct irb *irb = this_cpu_ptr(&cio_irb);
 
 	inc_irq_stat(IRQIO_CIO);
-	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_INTERRUPT);
+	memcpy(&private->irb, irb, sizeof(*irb));
+
+	WARN_ON(work_pending(&private->io_work));
+	queue_work(vfio_ccw_work_q, &private->io_work);
+	if (private->completion)
+		complete(private->completion);
 }
 
 static int vfio_ccw_sch_probe(struct subchannel *sch)
diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index c30420c..af88551 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -162,14 +162,16 @@ static void fsm_io_request(struct vfio_ccw_private *private,
 static void fsm_irq(struct vfio_ccw_private *private,
 		    enum vfio_ccw_event event)
 {
-	struct irb *irb = this_cpu_ptr(&cio_irb);
+	struct irb *irb = &private->irb;
 
-	memcpy(&private->irb, irb, sizeof(*irb));
-
-	queue_work(vfio_ccw_work_q, &private->io_work);
+	if (scsw_is_solicited(&irb->scsw)) {
+		cp_update_scsw(&private->cp, &irb->scsw);
+		cp_free(&private->cp);
+	}
+	memcpy(private->io_region.irb_area, irb, sizeof(*irb));
 
-	if (private->completion)
-		complete(private->completion);
+	if (private->io_trigger)
+		eventfd_signal(private->io_trigger, 1);
 }
 
 /*
Having state changes out of IRQ context allows us to protect
critical sections with mutexes.
The next patches in the series will use this possibility.

We use work queues to thread the interrupts.

Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
---
 drivers/s390/cio/vfio_ccw_drv.c | 21 ++++++++-------------
 drivers/s390/cio/vfio_ccw_fsm.c | 14 ++++++++------
 2 files changed, 16 insertions(+), 19 deletions(-)