[RFC,v2,7/9] vfio-ccw: Wire up the CRW irq and CRW region

Message ID 20200206213825.11444-8-farman@linux.ibm.com (mailing list archive)
State New, archived
Series s390x/vfio-ccw: Channel Path Handling

Commit Message

Eric Farman Feb. 6, 2020, 9:38 p.m. UTC
From: Farhan Ali <alifm@linux.ibm.com>

Use an IRQ to notify userspace that there is a CRW
pending in the region, related to path-availability
changes on the passthrough subchannel.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
---

Notes:
    v1->v2:
     - Remove extraneous 0x0 in crw.rsid assignment [CH]
     - Refactor the building/queueing of a crw into its own routine [EF]
    
    v0->v1: [EF]
     - Place the non-refactoring changes from the previous patch here
     - Clean up checkpatch (whitespace) errors
     - s/chp_crw/crw/
     - Move acquire/release of io_mutex in vfio_ccw_crw_region_read()
       into patch that introduces that region
     - Remove duplicate include from vfio_ccw_drv.c
     - Reorder include in vfio_ccw_private.h

 drivers/s390/cio/vfio_ccw_chp.c     |  5 ++
 drivers/s390/cio/vfio_ccw_drv.c     | 73 +++++++++++++++++++++++++++++
 drivers/s390/cio/vfio_ccw_ops.c     |  4 ++
 drivers/s390/cio/vfio_ccw_private.h |  9 ++++
 include/uapi/linux/vfio.h           |  1 +
 5 files changed, 92 insertions(+)

Comments

Cornelia Huck Feb. 14, 2020, 1:34 p.m. UTC | #1
On Thu,  6 Feb 2020 22:38:23 +0100
Eric Farman <farman@linux.ibm.com> wrote:

> From: Farhan Ali <alifm@linux.ibm.com>
> 
> Use an IRQ to notify userspace that there is a CRW
> pending in the region, related to path-availability
> changes on the passthrough subchannel.
> 
> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
> 
> Notes:
>     v1->v2:
>      - Remove extraneous 0x0 in crw.rsid assignment [CH]
>      - Refactor the building/queueing of a crw into its own routine [EF]
>     
>     v0->v1: [EF]
>      - Place the non-refactoring changes from the previous patch here
>      - Clean up checkpatch (whitespace) errors
>      - s/chp_crw/crw/
>      - Move acquire/release of io_mutex in vfio_ccw_crw_region_read()
>        into patch that introduces that region
>      - Remove duplicate include from vfio_ccw_drv.c
>      - Reorder include in vfio_ccw_private.h
> 
>  drivers/s390/cio/vfio_ccw_chp.c     |  5 ++
>  drivers/s390/cio/vfio_ccw_drv.c     | 73 +++++++++++++++++++++++++++++
>  drivers/s390/cio/vfio_ccw_ops.c     |  4 ++
>  drivers/s390/cio/vfio_ccw_private.h |  9 ++++
>  include/uapi/linux/vfio.h           |  1 +
>  5 files changed, 92 insertions(+)
> 
(...)
> +static void vfio_ccw_alloc_crw(struct vfio_ccw_private *private,
> +			       struct chp_link *link,
> +			       unsigned int erc)
> +{
> +	struct vfio_ccw_crw *vc_crw;
> +	struct crw *crw;
> +
> +	/*
> +	 * If unable to allocate a CRW, just drop the event and
> +	 * carry on.  The guest will either see a later one or
> +	 * learn when it issues its own store subchannel.
> +	 */
> +	vc_crw = kzalloc(sizeof(*vc_crw), GFP_ATOMIC);
> +	if (!vc_crw)
> +		return;
> +
> +	/*
> +	 * Build in the first CRW space, but don't chain anything
> +	 * into the second one even though the space exists.
> +	 */
> +	crw = &vc_crw->crw[0];
> +
> +	/*
> +	 * Presume every CRW we handle is reported by a channel-path.
> +	 * Maybe not future-proof, but good for what we're doing now.

You could pass in a source indication, maybe? Presumably, at least one
of the callers further up the chain knows...

> +	 *
> +	 * FIXME Sort of a lie, since we're converting a CRW
> +	 * reported by a channel-path into one issued to each
> +	 * subchannel, but still saying it's coming from the path.

It's still channel-path related, though :)

The important point is probably that userspace needs to be aware
that the same channel-path related event is reported on all affected
subchannels, and they therefore need some appropriate handling on their
side.

> +	 */
> +	crw->rsc = CRW_RSC_CPATH;
> +	crw->rsid = (link->chpid.cssid << 8) | link->chpid.id;
> +	crw->erc = erc;
> +
> +	list_add_tail(&vc_crw->next, &private->crw);
> +	queue_work(vfio_ccw_work_q, &private->crw_work);
> +}
> +
>  static int vfio_ccw_chp_event(struct subchannel *sch,
>  			      struct chp_link *link, int event)
>  {
(...)
Eric Farman Feb. 14, 2020, 4:24 p.m. UTC | #2
On 2/14/20 8:34 AM, Cornelia Huck wrote:
> On Thu,  6 Feb 2020 22:38:23 +0100
> Eric Farman <farman@linux.ibm.com> wrote:
> 
>> From: Farhan Ali <alifm@linux.ibm.com>
>>
>> Use an IRQ to notify userspace that there is a CRW
>> pending in the region, related to path-availability
>> changes on the passthrough subchannel.
>>
>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>> Signed-off-by: Eric Farman <farman@linux.ibm.com>
>> ---
>>
>> Notes:
>>     v1->v2:
>>      - Remove extraneous 0x0 in crw.rsid assignment [CH]
>>      - Refactor the building/queueing of a crw into its own routine [EF]
>>     
>>     v0->v1: [EF]
>>      - Place the non-refactoring changes from the previous patch here
>>      - Clean up checkpatch (whitespace) errors
>>      - s/chp_crw/crw/
>>      - Move acquire/release of io_mutex in vfio_ccw_crw_region_read()
>>        into patch that introduces that region
>>      - Remove duplicate include from vfio_ccw_drv.c
>>      - Reorder include in vfio_ccw_private.h
>>
>>  drivers/s390/cio/vfio_ccw_chp.c     |  5 ++
>>  drivers/s390/cio/vfio_ccw_drv.c     | 73 +++++++++++++++++++++++++++++
>>  drivers/s390/cio/vfio_ccw_ops.c     |  4 ++
>>  drivers/s390/cio/vfio_ccw_private.h |  9 ++++
>>  include/uapi/linux/vfio.h           |  1 +
>>  5 files changed, 92 insertions(+)
>>
> (...)
>> +static void vfio_ccw_alloc_crw(struct vfio_ccw_private *private,
>> +			       struct chp_link *link,
>> +			       unsigned int erc)
>> +{
>> +	struct vfio_ccw_crw *vc_crw;
>> +	struct crw *crw;
>> +
>> +	/*
>> +	 * If unable to allocate a CRW, just drop the event and
>> +	 * carry on.  The guest will either see a later one or
>> +	 * learn when it issues its own store subchannel.
>> +	 */
>> +	vc_crw = kzalloc(sizeof(*vc_crw), GFP_ATOMIC);
>> +	if (!vc_crw)
>> +		return;
>> +
>> +	/*
>> +	 * Build in the first CRW space, but don't chain anything
>> +	 * into the second one even though the space exists.
>> +	 */
>> +	crw = &vc_crw->crw[0];
>> +
>> +	/*
>> +	 * Presume every CRW we handle is reported by a channel-path.
>> +	 * Maybe not future-proof, but good for what we're doing now.
> 
> You could pass in a source indication, maybe? Presumably, at least one
> of the callers further up the chain knows...

The "chain" is the vfio_ccw_chp_event() function called off the
.chp_event callback, and then to this point.  So I don't think there's
much we can get back from our callchain, other than the CHP_xxxLINE
event that got us here.

> 
>> +	 *
>> +	 * FIXME Sort of a lie, since we're converting a CRW
>> +	 * reported by a channel-path into one issued to each
>> +	 * subchannel, but still saying it's coming from the path.
> 
> It's still channel-path related, though :)
> 
> The important point is probably that userspace needs to be aware
> that the same channel-path related event is reported on all affected
> subchannels, and they therefore need some appropriate handling on their
> side.

This is true.  And the fact that the RSC and RSID fields will be in
agreement is helpful.  But yes, the fact that userspace should expect
the possibility of more than one CRW per channel path is the thing I'm
still not enjoying.  Mostly because of the race between queueing
additional ones, and unqueuing them on the other side.  So probably not
much that can be done here but awareness.

> 
>> +	 */
>> +	crw->rsc = CRW_RSC_CPATH;
>> +	crw->rsid = (link->chpid.cssid << 8) | link->chpid.id;
>> +	crw->erc = erc;
>> +
>> +	list_add_tail(&vc_crw->next, &private->crw);
>> +	queue_work(vfio_ccw_work_q, &private->crw_work);
>> +}
>> +
>>  static int vfio_ccw_chp_event(struct subchannel *sch,
>>  			      struct chp_link *link, int event)
>>  {
> (...)
>
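
As a purely hypothetical illustration of the userspace-side handling being
discussed here (not part of this series): since every affected subchannel
surfaces the same channel-path CRW, with matching RSC and RSID, userspace
could remember the last CRW it forwarded per channel path and drop repeats.
The names handle_region_crw() and forward_crw_to_guest() are made up for
the sketch:

/*
 * Hypothetical userspace sketch: several subchannels on the same
 * channel path will surface identical CRWs (matching rsc/rsid/erc),
 * so drop repeats before forwarding one machine check to the guest.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for whatever actually injects the CRW into the guest. */
static void forward_crw_to_guest(uint8_t rsc, uint8_t erc, uint16_t rsid)
{
	printf("inject CRW: rsc=%u erc=%u rsid=0x%04x\n", rsc, erc, rsid);
}

struct crw_key {
	uint8_t  rsc;	/* reporting-source code (channel path)       */
	uint8_t  erc;	/* error-recovery code                        */
	uint16_t rsid;	/* reporting-source ID ((cssid << 8) | chpid) */
};

static struct crw_key last_seen[256];	/* indexed by chpid in this sketch */
static bool last_valid[256];

static void handle_region_crw(uint8_t rsc, uint8_t erc, uint16_t rsid)
{
	uint8_t chpid = rsid & 0xff;

	/* Same event already forwarded for this channel path?  Drop it. */
	if (last_valid[chpid] && last_seen[chpid].rsc == rsc &&
	    last_seen[chpid].erc == erc && last_seen[chpid].rsid == rsid)
		return;

	last_seen[chpid] = (struct crw_key){ .rsc = rsc, .erc = erc, .rsid = rsid };
	last_valid[chpid] = true;
	forward_crw_to_guest(rsc, erc, rsid);
}

Whether collapsing repeats like this is the right policy is exactly the open
question in the thread; the sketch only shows why the matching RSC and RSID
fields are useful to have.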
Cornelia Huck March 24, 2020, 4:34 p.m. UTC | #3
On Fri, 14 Feb 2020 11:24:39 -0500
Eric Farman <farman@linux.ibm.com> wrote:

> On 2/14/20 8:34 AM, Cornelia Huck wrote:
> > On Thu,  6 Feb 2020 22:38:23 +0100
> > Eric Farman <farman@linux.ibm.com> wrote:

> > (...)  
> >> +static void vfio_ccw_alloc_crw(struct vfio_ccw_private *private,
> >> +			       struct chp_link *link,
> >> +			       unsigned int erc)
> >> +{
> >> +	struct vfio_ccw_crw *vc_crw;
> >> +	struct crw *crw;
> >> +
> >> +	/*
> >> +	 * If unable to allocate a CRW, just drop the event and
> >> +	 * carry on.  The guest will either see a later one or
> >> +	 * learn when it issues its own store subchannel.
> >> +	 */
> >> +	vc_crw = kzalloc(sizeof(*vc_crw), GFP_ATOMIC);
> >> +	if (!vc_crw)
> >> +		return;
> >> +
> >> +	/*
> >> +	 * Build in the first CRW space, but don't chain anything
> >> +	 * into the second one even though the space exists.
> >> +	 */
> >> +	crw = &vc_crw->crw[0];
> >> +
> >> +	/*
> >> +	 * Presume every CRW we handle is reported by a channel-path.
> >> +	 * Maybe not future-proof, but good for what we're doing now.  
> > 
> > You could pass in a source indication, maybe? Presumably, at least one
> > of the callers further up the chain knows...  
> 
> The "chain" is the vfio_ccw_chp_event() function called off the
> .chp_event callback, and then to this point.  So I don't think there's
> much we can get back from our callchain, other than the CHP_xxxLINE
> event that got us here.

We might want to pass in CRW_RSC_CPATH; that would make it a bit more
flexible. We can easily rearrange code internally later, though.
Eric Farman March 26, 2020, 6:51 p.m. UTC | #4
On 3/24/20 12:34 PM, Cornelia Huck wrote:
> On Fri, 14 Feb 2020 11:24:39 -0500
> Eric Farman <farman@linux.ibm.com> wrote:
> 
>> On 2/14/20 8:34 AM, Cornelia Huck wrote:
>>> On Thu,  6 Feb 2020 22:38:23 +0100
>>> Eric Farman <farman@linux.ibm.com> wrote:
> 
>>> (...)  
>>>> +static void vfio_ccw_alloc_crw(struct vfio_ccw_private *private,
>>>> +			       struct chp_link *link,
>>>> +			       unsigned int erc)
>>>> +{
>>>> +	struct vfio_ccw_crw *vc_crw;
>>>> +	struct crw *crw;
>>>> +
>>>> +	/*
>>>> +	 * If unable to allocate a CRW, just drop the event and
>>>> +	 * carry on.  The guest will either see a later one or
>>>> +	 * learn when it issues its own store subchannel.
>>>> +	 */
>>>> +	vc_crw = kzalloc(sizeof(*vc_crw), GFP_ATOMIC);
>>>> +	if (!vc_crw)
>>>> +		return;
>>>> +
>>>> +	/*
>>>> +	 * Build in the first CRW space, but don't chain anything
>>>> +	 * into the second one even though the space exists.
>>>> +	 */
>>>> +	crw = &vc_crw->crw[0];
>>>> +
>>>> +	/*
>>>> +	 * Presume every CRW we handle is reported by a channel-path.
>>>> +	 * Maybe not future-proof, but good for what we're doing now.  
>>>
>>> You could pass in a source indication, maybe? Presumably, at least one
>>> of the callers further up the chain knows...  
>>
>> The "chain" is the vfio_ccw_chp_event() function called off the
>> .chp_event callback, and then to this point.  So I don't think there's
>> much we can get back from our callchain, other than the CHP_xxxLINE
>> event that got us here.
> 
> We might want to pass in CRW_RSC_CPATH; that would make it a bit more
> flexible. We can easily rearrange code internally later, though.
> 

This is true...  I'll rearrange it so the routine takes the rsid and the
rsc as input instead of the link, so we don't have to do that fiddling
down the road.
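
A rough sketch of what that rearranged routine could look like (illustrative
only; the parameter names and order are not taken from any posted revision):

static void vfio_ccw_alloc_crw(struct vfio_ccw_private *private,
			       unsigned int rsc, unsigned int erc,
			       unsigned int rsid)
{
	struct vfio_ccw_crw *crw;

	/*
	 * If unable to allocate a CRW, just drop the event and
	 * carry on.  The guest will either see a later one or
	 * learn when it issues its own store subchannel.
	 */
	crw = kzalloc(sizeof(*crw), GFP_ATOMIC);
	if (!crw)
		return;

	/* Build in the first CRW space; never chain into the second. */
	crw->crw[0].rsc = rsc;
	crw->crw[0].rsid = rsid;
	crw->crw[0].erc = erc;

	list_add_tail(&crw->next, &private->crw);
	queue_work(vfio_ccw_work_q, &private->crw_work);
}

A path event would then be reported with something like
vfio_ccw_alloc_crw(private, CRW_RSC_CPATH, CRW_ERC_PERRN,
(link->chpid.cssid << 8) | link->chpid.id), keeping the channel-path
assumption out of the allocation routine itself.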
Cornelia Huck April 6, 2020, 1:52 p.m. UTC | #5
On Thu,  6 Feb 2020 22:38:23 +0100
Eric Farman <farman@linux.ibm.com> wrote:

> From: Farhan Ali <alifm@linux.ibm.com>
> 
> Use an IRQ to notify userspace that there is a CRW
> pending in the region, related to path-availability
> changes on the passthrough subchannel.
> 
> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
> 
> Notes:
>     v1->v2:
>      - Remove extraneous 0x0 in crw.rsid assignment [CH]
>      - Refactor the building/queueing of a crw into its own routine [EF]
>     
>     v0->v1: [EF]
>      - Place the non-refactoring changes from the previous patch here
>      - Clean up checkpatch (whitespace) errors
>      - s/chp_crw/crw/
>      - Move acquire/release of io_mutex in vfio_ccw_crw_region_read()
>        into patch that introduces that region
>      - Remove duplicate include from vfio_ccw_drv.c
>      - Reorder include in vfio_ccw_private.h
> 
>  drivers/s390/cio/vfio_ccw_chp.c     |  5 ++
>  drivers/s390/cio/vfio_ccw_drv.c     | 73 +++++++++++++++++++++++++++++
>  drivers/s390/cio/vfio_ccw_ops.c     |  4 ++
>  drivers/s390/cio/vfio_ccw_private.h |  9 ++++
>  include/uapi/linux/vfio.h           |  1 +
>  5 files changed, 92 insertions(+)

[I may have gotten all muddled up from staring at this, but please bear
with me...]

> diff --git a/drivers/s390/cio/vfio_ccw_chp.c b/drivers/s390/cio/vfio_ccw_chp.c
> index 8fde94552149..328b4e1d1972 100644
> --- a/drivers/s390/cio/vfio_ccw_chp.c
> +++ b/drivers/s390/cio/vfio_ccw_chp.c
> @@ -98,6 +98,11 @@ static ssize_t vfio_ccw_crw_region_read(struct vfio_ccw_private *private,
>  		ret = count;
>  
>  	mutex_unlock(&private->io_mutex);
> +
> +	/* Notify the guest if more CRWs are on our queue */
> +	if (!list_empty(&private->crw) && private->crw_trigger)
> +		eventfd_signal(private->crw_trigger, 1);

Here we possibly arm the eventfd again, but don't do anything regarding
queued crws and the region.

> +
>  	return ret;
>  }
>  
> diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
> index 1e1360af1b34..c48c260a129d 100644
> --- a/drivers/s390/cio/vfio_ccw_drv.c
> +++ b/drivers/s390/cio/vfio_ccw_drv.c
> @@ -108,6 +108,31 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
>  		eventfd_signal(private->io_trigger, 1);
>  }
>  
> +static void vfio_ccw_crw_todo(struct work_struct *work)
> +{
> +	struct vfio_ccw_private *private;
> +	struct vfio_ccw_crw *crw;
> +
> +	private = container_of(work, struct vfio_ccw_private, crw_work);
> +
> +	/* FIXME Ugh, need better control of this list */
> +	crw = list_first_entry_or_null(&private->crw,
> +				       struct vfio_ccw_crw, next);
> +
> +	if (crw) {
> +		list_del(&crw->next);
> +
> +		mutex_lock(&private->io_mutex);
> +		memcpy(&private->crw_region->crw0, crw->crw, sizeof(*crw->crw));
> +		mutex_unlock(&private->io_mutex);
> +
> +		kfree(crw);
> +
> +		if (private->crw_trigger)
> +			eventfd_signal(private->crw_trigger, 1);
> +	}
> +}

This function copies one outstanding crw and arms the eventfd.

> +
>  /*
>   * Css driver callbacks
>   */

(...)

> @@ -276,6 +309,44 @@ static int vfio_ccw_sch_event(struct subchannel *sch, int process)
>  	return rc;
>  }
>  
> +static void vfio_ccw_alloc_crw(struct vfio_ccw_private *private,
> +			       struct chp_link *link,
> +			       unsigned int erc)
> +{
> +	struct vfio_ccw_crw *vc_crw;
> +	struct crw *crw;
> +
> +	/*
> +	 * If unable to allocate a CRW, just drop the event and
> +	 * carry on.  The guest will either see a later one or
> +	 * learn when it issues its own store subchannel.
> +	 */
> +	vc_crw = kzalloc(sizeof(*vc_crw), GFP_ATOMIC);
> +	if (!vc_crw)
> +		return;
> +
> +	/*
> +	 * Build in the first CRW space, but don't chain anything
> +	 * into the second one even though the space exists.
> +	 */
> +	crw = &vc_crw->crw[0];
> +
> +	/*
> +	 * Presume every CRW we handle is reported by a channel-path.
> +	 * Maybe not future-proof, but good for what we're doing now.
> +	 *
> +	 * FIXME Sort of a lie, since we're converting a CRW
> +	 * reported by a channel-path into one issued to each
> +	 * subchannel, but still saying it's coming from the path.
> +	 */
> +	crw->rsc = CRW_RSC_CPATH;
> +	crw->rsid = (link->chpid.cssid << 8) | link->chpid.id;
> +	crw->erc = erc;
> +
> +	list_add_tail(&vc_crw->next, &private->crw);
> +	queue_work(vfio_ccw_work_q, &private->crw_work);

This function allocates a new crw and queues it. After that, it
triggers the function doing the copy-to-region-and-notify stuff.

> +}
> +
>  static int vfio_ccw_chp_event(struct subchannel *sch,
>  			      struct chp_link *link, int event)
>  {
> @@ -303,6 +374,7 @@ static int vfio_ccw_chp_event(struct subchannel *sch,
>  	case CHP_OFFLINE:
>  		/* Path is gone */
>  		cio_cancel_halt_clear(sch, &retry);
> +		vfio_ccw_alloc_crw(private, link, CRW_ERC_PERRN);
>  		break;
>  	case CHP_VARY_ON:
>  		/* Path logically turned on */
> @@ -312,6 +384,7 @@ static int vfio_ccw_chp_event(struct subchannel *sch,
>  	case CHP_ONLINE:
>  		/* Path became available */
>  		sch->lpm |= mask & sch->opm;
> +		vfio_ccw_alloc_crw(private, link, CRW_ERC_INIT);
>  		break;
>  	}
>  

These two (path online/offline handling) are the only code paths
triggering an update to the queued crws.

Aren't we missing copying in a new queued crw after userspace has done
a read?
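
One conceivable way to close that gap (just a sketch, not something posted
in this series, and assuming vfio_ccw_work_q is visible from vfio_ccw_chp.c)
would be for the region read to re-queue the existing work item whenever the
list is still non-empty, so the next CRW actually lands in the region before
the eventfd fires again, instead of only re-arming the eventfd:

/*
 * Hypothetical helper, called at the end of vfio_ccw_crw_region_read()
 * after io_mutex has been dropped.
 */
static void vfio_ccw_crw_refill(struct vfio_ccw_private *private)
{
	/*
	 * If more CRWs are queued, run the work item again so
	 * vfio_ccw_crw_todo() copies the next one into the region
	 * and then signals the eventfd.
	 */
	if (!list_empty(&private->crw))
		queue_work(vfio_ccw_work_q, &private->crw_work);
}

That still leaves the queueing/dequeueing race discussed earlier in the
thread, but each notification would at least be backed by fresh data in
the region.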
Eric Farman April 6, 2020, 10:11 p.m. UTC | #6
On 4/6/20 9:52 AM, Cornelia Huck wrote:
> On Thu,  6 Feb 2020 22:38:23 +0100
> Eric Farman <farman@linux.ibm.com> wrote:
> 
>> From: Farhan Ali <alifm@linux.ibm.com>
>>
>> Use an IRQ to notify userspace that there is a CRW
>> pending in the region, related to path-availability
>> changes on the passthrough subchannel.
>>
>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>> Signed-off-by: Eric Farman <farman@linux.ibm.com>
>> ---
>>
>> Notes:
>>     v1->v2:
>>      - Remove extraneous 0x0 in crw.rsid assignment [CH]
>>      - Refactor the building/queueing of a crw into its own routine [EF]
>>     
>>     v0->v1: [EF]
>>      - Place the non-refactoring changes from the previous patch here
>>      - Clean up checkpatch (whitespace) errors
>>      - s/chp_crw/crw/
>>      - Move acquire/release of io_mutex in vfio_ccw_crw_region_read()
>>        into patch that introduces that region
>>      - Remove duplicate include from vfio_ccw_drv.c
>>      - Reorder include in vfio_ccw_private.h
>>
>>  drivers/s390/cio/vfio_ccw_chp.c     |  5 ++
>>  drivers/s390/cio/vfio_ccw_drv.c     | 73 +++++++++++++++++++++++++++++
>>  drivers/s390/cio/vfio_ccw_ops.c     |  4 ++
>>  drivers/s390/cio/vfio_ccw_private.h |  9 ++++
>>  include/uapi/linux/vfio.h           |  1 +
>>  5 files changed, 92 insertions(+)
> 
> [I may have gotten all muddled up from staring at this, but please bear
> with me...]
> 
...snip...
> 
> Aren't we missing copying in a new queued crw after userspace had done
> a read?
> 

Um, huh.  I'll double-check that after dinner, but it sure looks like
you're right.

(Might not get back to you tomorrow, because I don't have much time
until Wednesday.)

Patch

diff --git a/drivers/s390/cio/vfio_ccw_chp.c b/drivers/s390/cio/vfio_ccw_chp.c
index 8fde94552149..328b4e1d1972 100644
--- a/drivers/s390/cio/vfio_ccw_chp.c
+++ b/drivers/s390/cio/vfio_ccw_chp.c
@@ -98,6 +98,11 @@  static ssize_t vfio_ccw_crw_region_read(struct vfio_ccw_private *private,
 		ret = count;
 
 	mutex_unlock(&private->io_mutex);
+
+	/* Notify the guest if more CRWs are on our queue */
+	if (!list_empty(&private->crw) && private->crw_trigger)
+		eventfd_signal(private->crw_trigger, 1);
+
 	return ret;
 }
 
diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 1e1360af1b34..c48c260a129d 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -108,6 +108,31 @@  static void vfio_ccw_sch_io_todo(struct work_struct *work)
 		eventfd_signal(private->io_trigger, 1);
 }
 
+static void vfio_ccw_crw_todo(struct work_struct *work)
+{
+	struct vfio_ccw_private *private;
+	struct vfio_ccw_crw *crw;
+
+	private = container_of(work, struct vfio_ccw_private, crw_work);
+
+	/* FIXME Ugh, need better control of this list */
+	crw = list_first_entry_or_null(&private->crw,
+				       struct vfio_ccw_crw, next);
+
+	if (crw) {
+		list_del(&crw->next);
+
+		mutex_lock(&private->io_mutex);
+		memcpy(&private->crw_region->crw0, crw->crw, sizeof(*crw->crw));
+		mutex_unlock(&private->io_mutex);
+
+		kfree(crw);
+
+		if (private->crw_trigger)
+			eventfd_signal(private->crw_trigger, 1);
+	}
+}
+
 /*
  * Css driver callbacks
  */
@@ -186,7 +211,9 @@  static int vfio_ccw_sch_probe(struct subchannel *sch)
 	if (ret)
 		goto out_free;
 
+	INIT_LIST_HEAD(&private->crw);
 	INIT_WORK(&private->io_work, vfio_ccw_sch_io_todo);
+	INIT_WORK(&private->crw_work, vfio_ccw_crw_todo);
 	atomic_set(&private->avail, 1);
 	private->state = VFIO_CCW_STATE_STANDBY;
 
@@ -212,9 +239,15 @@  static int vfio_ccw_sch_probe(struct subchannel *sch)
 static int vfio_ccw_sch_remove(struct subchannel *sch)
 {
 	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
+	struct vfio_ccw_crw *crw, *temp;
 
 	vfio_ccw_sch_quiesce(sch);
 
+	list_for_each_entry_safe(crw, temp, &private->crw, next) {
+		list_del(&crw->next);
+		kfree(crw);
+	}
+
 	vfio_ccw_mdev_unreg(sch);
 
 	dev_set_drvdata(&sch->dev, NULL);
@@ -276,6 +309,44 @@  static int vfio_ccw_sch_event(struct subchannel *sch, int process)
 	return rc;
 }
 
+static void vfio_ccw_alloc_crw(struct vfio_ccw_private *private,
+			       struct chp_link *link,
+			       unsigned int erc)
+{
+	struct vfio_ccw_crw *vc_crw;
+	struct crw *crw;
+
+	/*
+	 * If unable to allocate a CRW, just drop the event and
+	 * carry on.  The guest will either see a later one or
+	 * learn when it issues its own store subchannel.
+	 */
+	vc_crw = kzalloc(sizeof(*vc_crw), GFP_ATOMIC);
+	if (!vc_crw)
+		return;
+
+	/*
+	 * Build in the first CRW space, but don't chain anything
+	 * into the second one even though the space exists.
+	 */
+	crw = &vc_crw->crw[0];
+
+	/*
+	 * Presume every CRW we handle is reported by a channel-path.
+	 * Maybe not future-proof, but good for what we're doing now.
+	 *
+	 * FIXME Sort of a lie, since we're converting a CRW
+	 * reported by a channel-path into one issued to each
+	 * subchannel, but still saying it's coming from the path.
+	 */
+	crw->rsc = CRW_RSC_CPATH;
+	crw->rsid = (link->chpid.cssid << 8) | link->chpid.id;
+	crw->erc = erc;
+
+	list_add_tail(&vc_crw->next, &private->crw);
+	queue_work(vfio_ccw_work_q, &private->crw_work);
+}
+
 static int vfio_ccw_chp_event(struct subchannel *sch,
 			      struct chp_link *link, int event)
 {
@@ -303,6 +374,7 @@  static int vfio_ccw_chp_event(struct subchannel *sch,
 	case CHP_OFFLINE:
 		/* Path is gone */
 		cio_cancel_halt_clear(sch, &retry);
+		vfio_ccw_alloc_crw(private, link, CRW_ERC_PERRN);
 		break;
 	case CHP_VARY_ON:
 		/* Path logically turned on */
@@ -312,6 +384,7 @@  static int vfio_ccw_chp_event(struct subchannel *sch,
 	case CHP_ONLINE:
 		/* Path became available */
 		sch->lpm |= mask & sch->opm;
+		vfio_ccw_alloc_crw(private, link, CRW_ERC_INIT);
 		break;
 	}
 
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index f3033f8fc96d..8b3ed5b45277 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -393,6 +393,7 @@  static int vfio_ccw_mdev_get_irq_info(struct vfio_irq_info *info)
 {
 	switch (info->index) {
 	case VFIO_CCW_IO_IRQ_INDEX:
+	case VFIO_CCW_CRW_IRQ_INDEX:
 		info->count = 1;
 		info->flags = VFIO_IRQ_INFO_EVENTFD;
 		break;
@@ -420,6 +421,9 @@  static int vfio_ccw_mdev_set_irqs(struct mdev_device *mdev,
 	case VFIO_CCW_IO_IRQ_INDEX:
 		ctx = &private->io_trigger;
 		break;
+	case VFIO_CCW_CRW_IRQ_INDEX:
+		ctx = &private->crw_trigger;
+		break;
 	default:
 		return -EINVAL;
 	}
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index 8289b6850e59..a701f09c6943 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -17,6 +17,7 @@ 
 #include <linux/eventfd.h>
 #include <linux/workqueue.h>
 #include <linux/vfio_ccw.h>
+#include <asm/crw.h>
 #include <asm/debug.h>
 
 #include "css.h"
@@ -59,6 +60,11 @@  int vfio_ccw_register_async_dev_regions(struct vfio_ccw_private *private);
 int vfio_ccw_register_schib_dev_regions(struct vfio_ccw_private *private);
 int vfio_ccw_register_crw_dev_regions(struct vfio_ccw_private *private);
 
+struct vfio_ccw_crw {
+	struct list_head	next;
+	struct crw		crw[2];
+};
+
 /**
  * struct vfio_ccw_private
  * @sch: pointer to the subchannel
@@ -98,9 +104,12 @@  struct vfio_ccw_private {
 	struct channel_program	cp;
 	struct irb		irb;
 	union scsw		scsw;
+	struct list_head	crw;
 
 	struct eventfd_ctx	*io_trigger;
+	struct eventfd_ctx	*crw_trigger;
 	struct work_struct	io_work;
+	struct work_struct	crw_work;
 } __aligned(8);
 
 extern int vfio_ccw_mdev_reg(struct subchannel *sch);
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 5024636d7615..1bdf32772545 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -579,6 +579,7 @@  enum {
 
 enum {
 	VFIO_CCW_IO_IRQ_INDEX,
+	VFIO_CCW_CRW_IRQ_INDEX,
 	VFIO_CCW_NUM_IRQS
 };
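
For completeness, a minimal userspace sketch of how the new IRQ index could
be consumed with the standard VFIO ioctls (the same pattern used for the
existing VFIO_CCW_IO_IRQ_INDEX).  The device fd is assumed to have been
obtained elsewhere, e.g. via VFIO_GROUP_GET_DEVICE_FD, and the CRW region
itself would be located through VFIO_DEVICE_GET_REGION_INFO;
register_crw_eventfd() is an illustrative name:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Sketch: register an eventfd for VFIO_CCW_CRW_IRQ_INDEX.  When the
 * returned fd becomes readable, userspace would read() the eventfd
 * counter and then read the CRW region at its reported offset.
 */
static int register_crw_eventfd(int device_fd)
{
	size_t argsz = sizeof(struct vfio_irq_set) + sizeof(int32_t);
	struct vfio_irq_set *irq_set;
	int32_t efd;

	efd = eventfd(0, EFD_CLOEXEC);
	if (efd < 0)
		return -1;

	irq_set = calloc(1, argsz);
	if (!irq_set) {
		close(efd);
		return -1;
	}

	irq_set->argsz = argsz;
	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
	irq_set->index = VFIO_CCW_CRW_IRQ_INDEX;
	irq_set->start = 0;
	irq_set->count = 1;
	memcpy(irq_set->data, &efd, sizeof(efd));

	if (ioctl(device_fd, VFIO_DEVICE_SET_IRQS, irq_set) < 0) {
		perror("VFIO_DEVICE_SET_IRQS");
		free(irq_set);
		close(efd);
		return -1;
	}

	free(irq_set);
	return efd;
}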