[v6,00/10] s390: vfio-ap: dynamic configuration support
mbox series

Message ID 1568410018-10833-1-git-send-email-akrowiak@linux.ibm.com
Headers show
Series
  • s390: vfio-ap: dynamic configuration support
Related show

Message

Tony Krowiak Sept. 13, 2019, 9:26 p.m. UTC
The current design for AP pass-through does not support making dynamic
changes to the AP matrix of a running guest resulting in three deficiencies
this patch series is intended to mitigate:

1. Adapters, domains and control domains can not be added to or removed
   from a running guest. In order to modify a guest's AP configuration,
   the guest must be terminated; only then can AP resources be assigned
   to or unassigned from the guest's matrix mdev. The new AP configuration
   becomes available to the guest when it is subsequently restarted.

2. The AP bus's /sys/bus/ap/apmask and /sys/bus/ap/aqmask interfaces can
   be modified by a root user without any restrictions. A change to either
   mask can result in AP queue devices being unbound from the vfio_ap
   device driver and bound to a zcrypt device driver even if a guest is
   using the queues, thus giving the host access to the guest's private
   crypto data and vice versa.

3. The APQNs derived from the Cartesian product of the APIDs of the
   adapters and APQIs of the domains assigned to a matrix mdev must
   reference an AP queue device bound to the vfio_ap device driver. 

This patch series introduces the following changes to the current design
to alleviate the shortcomings described above as well as to implement more
of the AP architecture:

1. A root user will be prevented from making changes to the AP bus's
   /sys/bus/ap/apmask or /sys/bus/ap/aqmask if the ownership of an APQN
   changes from the vfio_ap device driver to a zcrypt driver when the APQN
   is assigned to a matrix mdev.

2. The sysfs bind/unbind interfaces will be disabled for the vfio_ap
   device driver.

3. Allow AP resources to be assigned to or removed from a matrix mdev
   while a guest is using it and hot plug the resource into or hot unplug
   the resource from the running guest.

4. Allow assignment of an AP adapter or domain to a matrix mdev even if it
   results in assignment of an APQN that does not reference an AP queue
   device bound to the vfio_ap device driver, as long as the APQN is owned
   by the vfio_ap driver. Allowing over-provisioning of AP resources
   better models the architecture which does not preclude assigning AP
   resources that are not yet available in the system. If/when the queue
   becomes available to the host, it will immediately also become available
   to the guest.

1. Rationale for changes to AP bus's apmask/aqmask interfaces:
----------------------------------------------------------
Due to the extremely sensitive nature of cryptographic data, it is
imperative that great care be taken to ensure that such data is secured.
Allowing a root user, either inadvertently or maliciously, to configure
these masks such that a queue is shared between the host and a guest is
not only avoidable, it is advisable. It was suggested that this scenario
is better handled in user space with management software, but that does
not preclude a malicious administrator from using the sysfs interfaces
to gain access to a guest's crypto data. It was also suggested that this
scenario could be avoided by taking access to the adapter away from the
guest and zeroing out the queues prior to the vfio_ap driver releasing the
device; however, stealing an adapter in use from a guest as a by-product
of an operation is bad and will likely cause problems for the guest
unnecessarily. It was decided that the most effective solution with the
least number of negative side effects is to prevent the situation at the
source.

2. Rationale for disabling bind/unbind interfaces for vfio_ap driver:
-----------------------------------------------------------------
By disabling the bind/unbind interfaces for the vfio_ap device driver, 
the user is forced to use the AP bus's apmask/aqmask interfaces to control
the probing and removing of AP queues. There are two primary reasons for
disabling the bind/unbind interfaces for the vfio_ap device driver:

* The device architecture does not provide a means to prevent unbinding
  a device from a device driver, so an AP queue device can be unbound
  from the vfio_ap driver even when the queue is in use by a guest. By
  disabling the unbind interface, the user is forced to use the AP bus's
  apmask/aqmask interfaces which will prevent this.

* Binding of AP queues is controlled by the AP bus /sys/bus/ap/apmask and
  /sys/bus/ap/aqmask interfaces. If the masks indicate that an APQN is
  owned by zcrypt, trying to bind it to the vfio_ap device driver will
  fail; therefore, the bind interface is somewhat redundant and certainly
  unnecessary.        
  
3. Rationale for hot plug/unplug using matrix mdev sysfs interfaces:
----------------------------------------------------------------
Allowing a user to hot plug/unplug AP resources using the matrix mdev
sysfs interfaces circumvents the need to terminate the guest in order to
modify its AP configuration. Allowing dynamic configuration makes 
reconfiguring a guest's AP matrix much less disruptive.

4. Rationale for allowing over-provisioning of AP resources:
----------------------------------------------------------- 
Allowing assignment of AP resources to a matrix mdev and ultimately to a
guest better models the AP architecture. The architecture does not
preclude assignment of unavailable AP resources. If a queue subsequently
becomes available while a guest using the matrix mdev to which its APQN
is assigned, the guest will automatically acquire access to it. If an APQN
is dynamically unassigned from the underlying host system, it will 
automatically become unavailable to the guest.

Change log v5-v6:
----------------
* Fixed a bug in ap_bus.c introduced with patch 2/7 of the v5 
  series. Harald Freudenberer pointed out that the mutex lock
  for ap_perms_mutex in the apmask_store and aqmask_store functions
  was not being freed. 

* Removed patch 6/7 which added logging to the vfio_ap driver
  to expedite acceptance of this series. The logging will be introduced
  with a separate patch series to allow more time to explore options
  such as DBF logging vs. tracepoints.

* Added 3 patches related to ensuring that APQNs that do not reference
  AP queue devices bound to the vfio_ap device driver are not assigned
  to the guest CRYCB:

  Patch 4: Filter CRYCB bits for unavailable queue devices
  Patch 5: sysfs attribute to display the guest CRYCB
  Patch 6: update guest CRYCB in vfio_ap probe and remove callbacks

* Added a patch (Patch 9) to version the vfio_ap module.

* Reshuffled patches to allow the in_use callback implementation to
  invoke the vfio_ap_mdev_verify_no_sharing() function introduced in
  patch 2.  

Change log v4-v5:
----------------
* Added a patch to provide kernel s390dbf debug logs for VFIO AP

Change log v3->v4:
-----------------
* Restored patches preventing root user from changing ownership of
  APQNs from zcrypt drivers to the vfio_ap driver if the APQN is
  assigned to an mdev.

* No longer enforcing requirement restricting guest access to
  queues represented by a queue device bound to the vfio_ap
  device driver.

* Removed shadow CRYCB and now directly updating the guest CRYCB
  from the matrix mdev's matrix.

* Rebased the patch series on top of 'vfio: ap: AP Queue Interrupt
  Control' patches.

* Disabled bind/unbind sysfs interfaces for vfio_ap driver

Change log v2->v3:
-----------------
* Allow guest access to an AP queue only if the queue is bound to
  the vfio_ap device driver.

* Removed the patch to test CRYCB masks before taking the vCPUs
  out of SIE. Now checking the shadow CRYCB in the vfio_ap driver.

Change log v1->v2:
-----------------
* Removed patches preventing root user from unbinding AP queues from 
  the vfio_ap device driver
* Introduced a shadow CRYCB in the vfio_ap driver to manage dynamic 
  changes to the AP guest configuration due to root user interventions
  or hardware anomalies.

Tony Krowiak (10):
  s390: vfio-ap: Refactor vfio_ap driver probe and remove callbacks
  s390: vfio-ap: allow assignment of unavailable AP resources to mdev
    device
  s390: vfio-ap: allow hot plug/unplug of AP resources using mdev device
  s390: vfio-ap: filter CRYCB bits for unavailable queue devices
  s390: vfio-ap: sysfs attribute to display the guest CRYCB
  s390: vfio-ap: update guest CRYCB in vfio_ap probe and remove
    callbacks
  s390: zcrypt: driver callback to indicate resource in use
  s390: vfio-ap: implement in-use callback for vfio_ap driver
  s390: vfio-ap: added versioning to vfio_ap module
  s390: vfio-ap: update documentation

 Documentation/s390/vfio-ap.rst        | 899 +++++++++++++++++++++++++---------
 drivers/s390/crypto/ap_bus.c          | 144 +++++-
 drivers/s390/crypto/ap_bus.h          |   4 +
 drivers/s390/crypto/vfio_ap_drv.c     |  27 +-
 drivers/s390/crypto/vfio_ap_ops.c     | 610 ++++++++++++++---------
 drivers/s390/crypto/vfio_ap_private.h |  12 +-
 6 files changed, 1200 insertions(+), 496 deletions(-)

Comments

Halil Pasic Oct. 8, 2019, 10:48 a.m. UTC | #1
On Fri, 13 Sep 2019 17:26:48 -0400
Tony Krowiak <akrowiak@linux.ibm.com> wrote:

> The current design for AP pass-through does not support making dynamic
> changes to the AP matrix of a running guest resulting in three deficiencies
> this patch series is intended to mitigate:
> 
> 1. Adapters, domains and control domains can not be added to or removed
>    from a running guest. In order to modify a guest's AP configuration,
>    the guest must be terminated; only then can AP resources be assigned
>    to or unassigned from the guest's matrix mdev. The new AP configuration
>    becomes available to the guest when it is subsequently restarted.
> 
> 2. The AP bus's /sys/bus/ap/apmask and /sys/bus/ap/aqmask interfaces can
>    be modified by a root user without any restrictions. A change to either
>    mask can result in AP queue devices being unbound from the vfio_ap
>    device driver and bound to a zcrypt device driver even if a guest is
>    using the queues, thus giving the host access to the guest's private
>    crypto data and vice versa.
> 
> 3. The APQNs derived from the Cartesian product of the APIDs of the
>    adapters and APQIs of the domains assigned to a matrix mdev must
>    reference an AP queue device bound to the vfio_ap device driver. 
> 
> This patch series introduces the following changes to the current design
> to alleviate the shortcomings described above as well as to implement more
> of the AP architecture:
> 
> 1. A root user will be prevented from making changes to the AP bus's
>    /sys/bus/ap/apmask or /sys/bus/ap/aqmask if the ownership of an APQN
>    changes from the vfio_ap device driver to a zcrypt driver when the APQN
>    is assigned to a matrix mdev.
> 
> 2. The sysfs bind/unbind interfaces will be disabled for the vfio_ap
>    device driver.
> 

Doesn't this have the potential of breaking some userspace stuff that
might be out there?

> 3. Allow AP resources to be assigned to or removed from a matrix mdev
>    while a guest is using it and hot plug the resource into or hot unplug
>    the resource from the running guest.

This looks like a natural extension of the interface -- i.e. should not
break any userspace.

> 
> 4. Allow assignment of an AP adapter or domain to a matrix mdev even if it
>    results in assignment of an APQN that does not reference an AP queue
>    device bound to the vfio_ap device driver, as long as the APQN is owned
>    by the vfio_ap driver. Allowing over-provisioning of AP resources
>    better models the architecture which does not preclude assigning AP
>    resources that are not yet available in the system. If/when the queue
>    becomes available to the host, it will immediately also become available
>    to the guest.

Same here -- I don't think this change breaks any userspace.

> 
> 1. Rationale for changes to AP bus's apmask/aqmask interfaces:
> ----------------------------------------------------------
> Due to the extremely sensitive nature of cryptographic data, it is
> imperative that great care be taken to ensure that such data is secured.
> Allowing a root user, either inadvertently or maliciously, to configure
> these masks such that a queue is shared between the host and a guest is
> not only avoidable, it is advisable. It was suggested that this scenario
> is better handled in user space with management software, but that does
> not preclude a malicious administrator from using the sysfs interfaces
> to gain access to a guest's crypto data. It was also suggested that this
> scenario could be avoided by taking access to the adapter away from the
> guest and zeroing out the queues prior to the vfio_ap driver releasing the
> device; however, stealing an adapter in use from a guest as a by-product
> of an operation is bad and will likely cause problems for the guest
> unnecessarily. It was decided that the most effective solution with the
> least number of negative side effects is to prevent the situation at the
> source.
> 
> 2. Rationale for disabling bind/unbind interfaces for vfio_ap driver:
> -----------------------------------------------------------------
> By disabling the bind/unbind interfaces for the vfio_ap device driver, 
> the user is forced to use the AP bus's apmask/aqmask interfaces to control
> the probing and removing of AP queues. There are two primary reasons for
> disabling the bind/unbind interfaces for the vfio_ap device driver:
> 
> * The device architecture does not provide a means to prevent unbinding
>   a device from a device driver, so an AP queue device can be unbound
>   from the vfio_ap driver even when the queue is in use by a guest. By
>   disabling the unbind interface, the user is forced to use the AP bus's
>   apmask/aqmask interfaces which will prevent this.
> 

Isn't this fixed by your filtering (if implemented correctly)? BTW I believe
it solves the problem regardless whether the unbind was triggered by the
drivers unbind attribute or by a[pq]mask.

> * Binding of AP queues is controlled by the AP bus /sys/bus/ap/apmask and
>   /sys/bus/ap/aqmask interfaces. If the masks indicate that an APQN is
>   owned by zcrypt, trying to bind it to the vfio_ap device driver will
>   fail; therefore, the bind interface is somewhat redundant and certainly
>   unnecessary.        
>   

[..]

> Tony Krowiak (10):
>   s390: vfio-ap: Refactor vfio_ap driver probe and remove callbacks
>   s390: vfio-ap: allow assignment of unavailable AP resources to mdev
>     device
>   s390: vfio-ap: allow hot plug/unplug of AP resources using mdev device
>   s390: vfio-ap: filter CRYCB bits for unavailable queue devices
>   s390: vfio-ap: sysfs attribute to display the guest CRYCB
>   s390: vfio-ap: update guest CRYCB in vfio_ap probe and remove
>     callbacks
>   s390: zcrypt: driver callback to indicate resource in use
>   s390: vfio-ap: implement in-use callback for vfio_ap driver
>   s390: vfio-ap: added versioning to vfio_ap module
>   s390: vfio-ap: update documentation

I believe it would be worthwhile to reorder the patches (fixes and
re-factoring first, the features).

Regards,
Halil

> 
>  Documentation/s390/vfio-ap.rst        | 899 +++++++++++++++++++++++++---------
>  drivers/s390/crypto/ap_bus.c          | 144 +++++-
>  drivers/s390/crypto/ap_bus.h          |   4 +
>  drivers/s390/crypto/vfio_ap_drv.c     |  27 +-
>  drivers/s390/crypto/vfio_ap_ops.c     | 610 ++++++++++++++---------
>  drivers/s390/crypto/vfio_ap_private.h |  12 +-
>  6 files changed, 1200 insertions(+), 496 deletions(-)
>
Pierre Morel Oct. 8, 2019, 12:57 p.m. UTC | #2
On 10/8/19 12:48 PM, Halil Pasic wrote:
> On Fri, 13 Sep 2019 17:26:48 -0400
> Tony Krowiak <akrowiak@linux.ibm.com> wrote:
>
>> The current design for AP pass-through does not support making dynamic
>> changes to the AP matrix of a running guest resulting in three deficiencies
>> this patch series is intended to mitigate:
>>
>> 1. Adapters, domains and control domains can not be added to or removed
>>     from a running guest. In order to modify a guest's AP configuration,
>>     the guest must be terminated; only then can AP resources be assigned
>>     to or unassigned from the guest's matrix mdev. The new AP configuration
>>     becomes available to the guest when it is subsequently restarted.
>>
>> 2. The AP bus's /sys/bus/ap/apmask and /sys/bus/ap/aqmask interfaces can
>>     be modified by a root user without any restrictions. A change to either
>>     mask can result in AP queue devices being unbound from the vfio_ap
>>     device driver and bound to a zcrypt device driver even if a guest is
>>     using the queues, thus giving the host access to the guest's private
>>     crypto data and vice versa.
>>
>> 3. The APQNs derived from the Cartesian product of the APIDs of the
>>     adapters and APQIs of the domains assigned to a matrix mdev must
>>     reference an AP queue device bound to the vfio_ap device driver.
>>
>> This patch series introduces the following changes to the current design
>> to alleviate the shortcomings described above as well as to implement more
>> of the AP architecture:
>>
>> 1. A root user will be prevented from making changes to the AP bus's
>>     /sys/bus/ap/apmask or /sys/bus/ap/aqmask if the ownership of an APQN
>>     changes from the vfio_ap device driver to a zcrypt driver when the APQN
>>     is assigned to a matrix mdev.
>>
>> 2. The sysfs bind/unbind interfaces will be disabled for the vfio_ap
>>     device driver.
>>
> Doesn't this have the potential of breaking some userspace stuff that
> might be out there?
>
>> 3. Allow AP resources to be assigned to or removed from a matrix mdev
>>     while a guest is using it and hot plug the resource into or hot unplug
>>     the resource from the running guest.
> This looks like a natural extension of the interface -- i.e. should not
> break any userspace.
>
>> 4. Allow assignment of an AP adapter or domain to a matrix mdev even if it
>>     results in assignment of an APQN that does not reference an AP queue
>>     device bound to the vfio_ap device driver, as long as the APQN is owned
>>     by the vfio_ap driver. Allowing over-provisioning of AP resources
>>     better models the architecture which does not preclude assigning AP
>>     resources that are not yet available in the system. If/when the queue
>>     becomes available to the host, it will immediately also become available
>>     to the guest.
> Same here -- I don't think this change breaks any userspace.
>
>> 1. Rationale for changes to AP bus's apmask/aqmask interfaces:
>> ----------------------------------------------------------
>> Due to the extremely sensitive nature of cryptographic data, it is
>> imperative that great care be taken to ensure that such data is secured.
>> Allowing a root user, either inadvertently or maliciously, to configure
>> these masks such that a queue is shared between the host and a guest is
>> not only avoidable, it is advisable.

Just curious: how is it possible to do such a configuration?


>>   It was suggested that this scenario
>> is better handled in user space with management software, but that does
>> not preclude a malicious administrator from using the sysfs interfaces
>> to gain access to a guest's crypto data. It was also suggested that this
>> scenario could be avoided by taking access to the adapter away from the
>> guest and zeroing out the queues prior to the vfio_ap driver releasing the
>> device; however, stealing an adapter in use from a guest as a by-product
>> of an operation is bad and will likely cause problems for the guest
>> unnecessarily. It was decided that the most effective solution with the
>> least number of negative side effects is to prevent the situation at the
>> source.


Stealing an adapter in use by a guest, insn't it what is done if we 
allow to unassign an AP/Domain using the unassign sysfs interface when 
the mediated device is in use by the guest?


>>
>> 2. Rationale for disabling bind/unbind interfaces for vfio_ap driver:
>> -----------------------------------------------------------------
>> By disabling the bind/unbind interfaces for the vfio_ap device driver,
>> the user is forced to use the AP bus's apmask/aqmask interfaces to control
>> the probing and removing of AP queues. There are two primary reasons for
>> disabling the bind/unbind interfaces for the vfio_ap device driver:
>>
>> * The device architecture does not provide a means to prevent unbinding
>>    a device from a device driver, so an AP queue device can be unbound
>>    from the vfio_ap driver even when the queue is in use by a guest. By
>>    disabling the unbind interface, the user is forced to use the AP bus's
>>    apmask/aqmask interfaces which will prevent this.
>>
> Isn't this fixed by your filtering (if implemented correctly)? BTW I believe
> it solves the problem regardless whether the unbind was triggered by the
> drivers unbind attribute or by a[pq]mask
>
>> * Binding of AP queues is controlled by the AP bus /sys/bus/ap/apmask and
>>    /sys/bus/ap/aqmask interfaces. If the masks indicate that an APQN is
>>    owned by zcrypt, trying to bind it to the vfio_ap device driver will
>>    fail; therefore, the bind interface is somewhat redundant and certainly
>>    unnecessary.
>>    
>>
>>
>> Tony Krowiak (10):
>>    s390: vfio-ap: Refactor vfio_ap driver probe and remove callbacks
>>    s390: vfio-ap: allow assignment of unavailable AP resources to mdev
>>      device
>>    s390: vfio-ap: allow hot plug/unplug of AP resources using mdev device
>>    s390: vfio-ap: filter CRYCB bits for unavailable queue devices
>>    s390: vfio-ap: sysfs attribute to display the guest CRYCB
>>    s390: vfio-ap: update guest CRYCB in vfio_ap probe and remove
>>      callbacks
>>    s390: zcrypt: driver callback to indicate resource in use
>>    s390: vfio-ap: implement in-use callback for vfio_ap driver
>>    s390: vfio-ap: added versioning to vfio_ap module
>>    s390: vfio-ap: update documentation
> I believe it would be worthwhile to reorder the patches (fixes and
> re-factoring first, the features).
>
> Regards,
> Halil
>
>>   Documentation/s390/vfio-ap.rst        | 899 +++++++++++++++++++++++++---------
>>   drivers/s390/crypto/ap_bus.c          | 144 +++++-
>>   drivers/s390/crypto/ap_bus.h          |   4 +
>>   drivers/s390/crypto/vfio_ap_drv.c     |  27 +-
>>   drivers/s390/crypto/vfio_ap_ops.c     | 610 ++++++++++++++---------
>>   drivers/s390/crypto/vfio_ap_private.h |  12 +-
>>   6 files changed, 1200 insertions(+), 496 deletions(-)
>>
Tony Krowiak Oct. 15, 2019, 8:27 p.m. UTC | #3
On 10/8/19 6:48 AM, Halil Pasic wrote:
> On Fri, 13 Sep 2019 17:26:48 -0400
> Tony Krowiak <akrowiak@linux.ibm.com> wrote:
> 
>> The current design for AP pass-through does not support making dynamic
>> changes to the AP matrix of a running guest resulting in three deficiencies
>> this patch series is intended to mitigate:
>>
>> 1. Adapters, domains and control domains can not be added to or removed
>>     from a running guest. In order to modify a guest's AP configuration,
>>     the guest must be terminated; only then can AP resources be assigned
>>     to or unassigned from the guest's matrix mdev. The new AP configuration
>>     becomes available to the guest when it is subsequently restarted.
>>
>> 2. The AP bus's /sys/bus/ap/apmask and /sys/bus/ap/aqmask interfaces can
>>     be modified by a root user without any restrictions. A change to either
>>     mask can result in AP queue devices being unbound from the vfio_ap
>>     device driver and bound to a zcrypt device driver even if a guest is
>>     using the queues, thus giving the host access to the guest's private
>>     crypto data and vice versa.
>>
>> 3. The APQNs derived from the Cartesian product of the APIDs of the
>>     adapters and APQIs of the domains assigned to a matrix mdev must
>>     reference an AP queue device bound to the vfio_ap device driver.
>>
>> This patch series introduces the following changes to the current design
>> to alleviate the shortcomings described above as well as to implement more
>> of the AP architecture:
>>
>> 1. A root user will be prevented from making changes to the AP bus's
>>     /sys/bus/ap/apmask or /sys/bus/ap/aqmask if the ownership of an APQN
>>     changes from the vfio_ap device driver to a zcrypt driver when the APQN
>>     is assigned to a matrix mdev.
>>
>> 2. The sysfs bind/unbind interfaces will be disabled for the vfio_ap
>>     device driver.
>>
> 
> Doesn't this have the potential of breaking some userspace stuff that
> might be out there?

I have decided to leave these interfaces enabled.

> 
>> 3. Allow AP resources to be assigned to or removed from a matrix mdev
>>     while a guest is using it and hot plug the resource into or hot unplug
>>     the resource from the running guest.
> 
> This looks like a natural extension of the interface -- i.e. should not
> break any userspace.

We agree

> 
>>
>> 4. Allow assignment of an AP adapter or domain to a matrix mdev even if it
>>     results in assignment of an APQN that does not reference an AP queue
>>     device bound to the vfio_ap device driver, as long as the APQN is owned
>>     by the vfio_ap driver. Allowing over-provisioning of AP resources
>>     better models the architecture which does not preclude assigning AP
>>     resources that are not yet available in the system. If/when the queue
>>     becomes available to the host, it will immediately also become available
>>     to the guest.
> 
> Same here -- I don't think this change breaks any userspace.

We agree here also

> 
>>
>> 1. Rationale for changes to AP bus's apmask/aqmask interfaces:
>> ----------------------------------------------------------
>> Due to the extremely sensitive nature of cryptographic data, it is
>> imperative that great care be taken to ensure that such data is secured.
>> Allowing a root user, either inadvertently or maliciously, to configure
>> these masks such that a queue is shared between the host and a guest is
>> not only avoidable, it is advisable. It was suggested that this scenario
>> is better handled in user space with management software, but that does
>> not preclude a malicious administrator from using the sysfs interfaces
>> to gain access to a guest's crypto data. It was also suggested that this
>> scenario could be avoided by taking access to the adapter away from the
>> guest and zeroing out the queues prior to the vfio_ap driver releasing the
>> device; however, stealing an adapter in use from a guest as a by-product
>> of an operation is bad and will likely cause problems for the guest
>> unnecessarily. It was decided that the most effective solution with the
>> least number of negative side effects is to prevent the situation at the
>> source.
>>
>> 2. Rationale for disabling bind/unbind interfaces for vfio_ap driver:
>> -----------------------------------------------------------------
>> By disabling the bind/unbind interfaces for the vfio_ap device driver,
>> the user is forced to use the AP bus's apmask/aqmask interfaces to control
>> the probing and removing of AP queues. There are two primary reasons for
>> disabling the bind/unbind interfaces for the vfio_ap device driver:
>>
>> * The device architecture does not provide a means to prevent unbinding
>>    a device from a device driver, so an AP queue device can be unbound
>>    from the vfio_ap driver even when the queue is in use by a guest. By
>>    disabling the unbind interface, the user is forced to use the AP bus's
>>    apmask/aqmask interfaces which will prevent this.
>>
> 
> Isn't this fixed by your filtering (if implemented correctly)? BTW I believe
> it solves the problem regardless whether the unbind was triggered by the
> drivers unbind attribute or by a[pq]mask.

IMHO, it would be better if we didn't rely on the filtering because when
an unbind is done, the filtering will remove access to the entire
adapter. My goal was to limit the need for filtering to unbinds
triggered by AP deconfiguration via the SE or SCLP command over which we
have no control. We can control all other scenarios except for when an
adapter goes away for some other reason such as a failure.

> 
>> * Binding of AP queues is controlled by the AP bus /sys/bus/ap/apmask and
>>    /sys/bus/ap/aqmask interfaces. If the masks indicate that an APQN is
>>    owned by zcrypt, trying to bind it to the vfio_ap device driver will
>>    fail; therefore, the bind interface is somewhat redundant and certainly
>>    unnecessary.
>>    
> 
> [..]
> 
>> Tony Krowiak (10):
>>    s390: vfio-ap: Refactor vfio_ap driver probe and remove callbacks
>>    s390: vfio-ap: allow assignment of unavailable AP resources to mdev
>>      device
>>    s390: vfio-ap: allow hot plug/unplug of AP resources using mdev device
>>    s390: vfio-ap: filter CRYCB bits for unavailable queue devices
>>    s390: vfio-ap: sysfs attribute to display the guest CRYCB
>>    s390: vfio-ap: update guest CRYCB in vfio_ap probe and remove
>>      callbacks
>>    s390: zcrypt: driver callback to indicate resource in use
>>    s390: vfio-ap: implement in-use callback for vfio_ap driver
>>    s390: vfio-ap: added versioning to vfio_ap module
>>    s390: vfio-ap: update documentation
> 
> I believe it would be worthwhile to reorder the patches (fixes and
> re-factoring first, the features).

Suggestions?

> 
> Regards,
> Halil
> 
>>
>>   Documentation/s390/vfio-ap.rst        | 899 +++++++++++++++++++++++++---------
>>   drivers/s390/crypto/ap_bus.c          | 144 +++++-
>>   drivers/s390/crypto/ap_bus.h          |   4 +
>>   drivers/s390/crypto/vfio_ap_drv.c     |  27 +-
>>   drivers/s390/crypto/vfio_ap_ops.c     | 610 ++++++++++++++---------
>>   drivers/s390/crypto/vfio_ap_private.h |  12 +-
>>   6 files changed, 1200 insertions(+), 496 deletions(-)
>>
>
Tony Krowiak Oct. 15, 2019, 8:33 p.m. UTC | #4
On 10/8/19 8:57 AM, Pierre Morel wrote:
> 
> On 10/8/19 12:48 PM, Halil Pasic wrote:
>> On Fri, 13 Sep 2019 17:26:48 -0400
>> Tony Krowiak <akrowiak@linux.ibm.com> wrote:
>>
>>> The current design for AP pass-through does not support making dynamic
>>> changes to the AP matrix of a running guest resulting in three 
>>> deficiencies
>>> this patch series is intended to mitigate:
>>>
>>> 1. Adapters, domains and control domains can not be added to or removed
>>>     from a running guest. In order to modify a guest's AP configuration,
>>>     the guest must be terminated; only then can AP resources be assigned
>>>     to or unassigned from the guest's matrix mdev. The new AP 
>>> configuration
>>>     becomes available to the guest when it is subsequently restarted.
>>>
>>> 2. The AP bus's /sys/bus/ap/apmask and /sys/bus/ap/aqmask interfaces can
>>>     be modified by a root user without any restrictions. A change to 
>>> either
>>>     mask can result in AP queue devices being unbound from the vfio_ap
>>>     device driver and bound to a zcrypt device driver even if a guest is
>>>     using the queues, thus giving the host access to the guest's private
>>>     crypto data and vice versa.
>>>
>>> 3. The APQNs derived from the Cartesian product of the APIDs of the
>>>     adapters and APQIs of the domains assigned to a matrix mdev must
>>>     reference an AP queue device bound to the vfio_ap device driver.
>>>
>>> This patch series introduces the following changes to the current design
>>> to alleviate the shortcomings described above as well as to implement 
>>> more
>>> of the AP architecture:
>>>
>>> 1. A root user will be prevented from making changes to the AP bus's
>>>     /sys/bus/ap/apmask or /sys/bus/ap/aqmask if the ownership of an APQN
>>>     changes from the vfio_ap device driver to a zcrypt driver when 
>>> the APQN
>>>     is assigned to a matrix mdev.
>>>
>>> 2. The sysfs bind/unbind interfaces will be disabled for the vfio_ap
>>>     device driver.
>>>
>> Doesn't this have the potential of breaking some userspace stuff that
>> might be out there?
>>
>>> 3. Allow AP resources to be assigned to or removed from a matrix mdev
>>>     while a guest is using it and hot plug the resource into or hot 
>>> unplug
>>>     the resource from the running guest.
>> This looks like a natural extension of the interface -- i.e. should not
>> break any userspace.
>>
>>> 4. Allow assignment of an AP adapter or domain to a matrix mdev even 
>>> if it
>>>     results in assignment of an APQN that does not reference an AP queue
>>>     device bound to the vfio_ap device driver, as long as the APQN is 
>>> owned
>>>     by the vfio_ap driver. Allowing over-provisioning of AP resources
>>>     better models the architecture which does not preclude assigning AP
>>>     resources that are not yet available in the system. If/when the 
>>> queue
>>>     becomes available to the host, it will immediately also become 
>>> available
>>>     to the guest.
>> Same here -- I don't think this change breaks any userspace.
>>
>>> 1. Rationale for changes to AP bus's apmask/aqmask interfaces:
>>> ----------------------------------------------------------
>>> Due to the extremely sensitive nature of cryptographic data, it is
>>> imperative that great care be taken to ensure that such data is secured.
>>> Allowing a root user, either inadvertently or maliciously, to configure
>>> these masks such that a queue is shared between the host and a guest is
>>> not only avoidable, it is advisable.
> 
> Just curious: how is it possible to do such a configuration?

In the current implementation of dedicated crypto, there is nothing
stopping a sysadmin from changing the apmask/aqmask in manner that
transfers ownership of one more APQNs from the vfio_ap device driver to
zcrypt which results in unbinding the queue devices from vfio_ap and
binding them to the zcrypt drive. If a guest happens to be using the
queue at the time, both the host and guest will have access. That is
fixed by this series.

> 
> 
>>>   It was suggested that this scenario
>>> is better handled in user space with management software, but that does
>>> not preclude a malicious administrator from using the sysfs interfaces
>>> to gain access to a guest's crypto data. It was also suggested that this
>>> scenario could be avoided by taking access to the adapter away from the
>>> guest and zeroing out the queues prior to the vfio_ap driver 
>>> releasing the
>>> device; however, stealing an adapter in use from a guest as a by-product
>>> of an operation is bad and will likely cause problems for the guest
>>> unnecessarily. It was decided that the most effective solution with the
>>> least number of negative side effects is to prevent the situation at the
>>> source.
> 
> 
> Stealing an adapter in use by a guest, insn't it what is done if we 
> allow to unassign an AP/Domain using the unassign sysfs interface when 
> the mediated device is in use by the guest?

Yes, but that is a deliberate action as opposed to a side effect of
bind/unbind. It is the very definition of dynamic configuration (a.k.a.,
hot plug/unplug).

> 
> 
>>>
>>> 2. Rationale for disabling bind/unbind interfaces for vfio_ap driver:
>>> -----------------------------------------------------------------
>>> By disabling the bind/unbind interfaces for the vfio_ap device driver,
>>> the user is forced to use the AP bus's apmask/aqmask interfaces to 
>>> control
>>> the probing and removing of AP queues. There are two primary reasons for
>>> disabling the bind/unbind interfaces for the vfio_ap device driver:
>>>
>>> * The device architecture does not provide a means to prevent unbinding
>>>    a device from a device driver, so an AP queue device can be unbound
>>>    from the vfio_ap driver even when the queue is in use by a guest. By
>>>    disabling the unbind interface, the user is forced to use the AP 
>>> bus's
>>>    apmask/aqmask interfaces which will prevent this.
>>>
>> Isn't this fixed by your filtering (if implemented correctly)? BTW I 
>> believe
>> it solves the problem regardless whether the unbind was triggered by the
>> drivers unbind attribute or by a[pq]mask
>>
>>> * Binding of AP queues is controlled by the AP bus /sys/bus/ap/apmask 
>>> and
>>>    /sys/bus/ap/aqmask interfaces. If the masks indicate that an APQN is
>>>    owned by zcrypt, trying to bind it to the vfio_ap device driver will
>>>    fail; therefore, the bind interface is somewhat redundant and 
>>> certainly
>>>    unnecessary.
>>>
>>>
>>> Tony Krowiak (10):
>>>    s390: vfio-ap: Refactor vfio_ap driver probe and remove callbacks
>>>    s390: vfio-ap: allow assignment of unavailable AP resources to mdev
>>>      device
>>>    s390: vfio-ap: allow hot plug/unplug of AP resources using mdev 
>>> device
>>>    s390: vfio-ap: filter CRYCB bits for unavailable queue devices
>>>    s390: vfio-ap: sysfs attribute to display the guest CRYCB
>>>    s390: vfio-ap: update guest CRYCB in vfio_ap probe and remove
>>>      callbacks
>>>    s390: zcrypt: driver callback to indicate resource in use
>>>    s390: vfio-ap: implement in-use callback for vfio_ap driver
>>>    s390: vfio-ap: added versioning to vfio_ap module
>>>    s390: vfio-ap: update documentation
>> I believe it would be worthwhile to reorder the patches (fixes and
>> re-factoring first, the features).
>>
>> Regards,
>> Halil
>>
>>>   Documentation/s390/vfio-ap.rst        | 899 
>>> +++++++++++++++++++++++++---------
>>>   drivers/s390/crypto/ap_bus.c          | 144 +++++-
>>>   drivers/s390/crypto/ap_bus.h          |   4 +
>>>   drivers/s390/crypto/vfio_ap_drv.c     |  27 +-
>>>   drivers/s390/crypto/vfio_ap_ops.c     | 610 ++++++++++++++---------
>>>   drivers/s390/crypto/vfio_ap_private.h |  12 +-
>>>   6 files changed, 1200 insertions(+), 496 deletions(-)
>>>