mbox series

[v3,0/2] spapr: Use vIOMMU translation for virtio by default

Message ID 20200305043009.611636-1-david@gibson.dropbear.id.au (mailing list archive)
Headers show
Series spapr: Use vIOMMU translation for virtio by default | expand

Message

David Gibson March 5, 2020, 4:30 a.m. UTC
Upcoming Secure VM support for pSeries machines introduces some
complications for virtio, since the transfer buffers need to be
explicitly shared so that the hypervisor can access them.

While it's not strictly speaking dependent on it, the fact that virtio
devices bypass normal platform IOMMU translation complicates the issue
on the guest side.  Since there are some significan downsides to
bypassing the vIOMMU anyway, let's just disable that.

There's already a flag to do this in virtio, just turn it on by
default for forthcoming pseries machine types.

Any opinions on whether dropping support for the older guest kernels
is acceptable at this point?

Changes since v2:
 * Rebase and improve some comments
Changes since v1:
 * Added information on which guest kernel versions will no longer
   work with these changes
 * Use Michael Tsirkin's suggested better way of handling the machine
   type change

David Gibson (2):
  spapr: Disable legacy virtio devices for pseries-5.0 and later
  spapr: Enable virtio iommu_platform=on by default

 hw/ppc/spapr.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

Comments

Daniel P. Berrangé March 10, 2020, 11:43 a.m. UTC | #1
On Thu, Mar 05, 2020 at 03:30:07PM +1100, David Gibson wrote:
> Upcoming Secure VM support for pSeries machines introduces some
> complications for virtio, since the transfer buffers need to be
> explicitly shared so that the hypervisor can access them.
> 
> While it's not strictly speaking dependent on it, the fact that virtio
> devices bypass normal platform IOMMU translation complicates the issue
> on the guest side.  Since there are some significan downsides to
> bypassing the vIOMMU anyway, let's just disable that.
> 
> There's already a flag to do this in virtio, just turn it on by
> default for forthcoming pseries machine types.

Breaking existing guest OS to support a new secure VM feature that
may not even be used/wanted doesn't seems like a sensible tradeoff
for default out of the box behaviour.

IOW, if Secure VM needs this, can we tie the change in virtio and
IOMMU defaults to the machine type flag that enables the use of
Secure VM.

That way the changed virtio defaults only take effect if a user/mgmt
app has explicitly opted in to the new Secure VM feature, and existing
users won't be broken by a new feature they don't even use.

> Any opinions on whether dropping support for the older guest kernels
> is acceptable at this point?


I think this question has different answers depending on whether you
are considering downstream vendor policy, current upstream policy,
and a possible new downstream policy on guest support. IOW a bit of a
can of worms...


In the case of RHEL downstream there is a very narrow matrix for
what guest OS are considered supported.

In the case of current upstream, there has essentially never been
any documented guest matrix. The unwritten implicit rule upstream
has followed is to not change defaults in a way that would break
ability to run existing guest OS.


As an example, on x86 upstream defaults to i440fx and thus still
uses virtio devices in transitional mode by default, while downstream
RHEL used its narrow support matrix as a justification for why it was
ok to switch to q35 by default & loose guest support in many cases.
Even that was problematic though, because RHEL still needed to support
RHEL-6 guest which are broken by default with q35 since they only
support legacy mode virtio. Thus we needed work in management apps
to enable RHEL-6 to continue working with q35 chipset, by placing
the devices onto a PCI bridge, instead of a PCIe root port, or by
explicitly using i440fx instead.

Thus if we follow our *current* upstream guest support policy, I don't
think it is acceptable to break existing guests with the new machine
type upstream.  It is reasonable to do it downstream if the downstream
is willing to sacrifice these guests, or invest to make any mgmt apps 
add workaround/revert QEMU changes.


With that all said, I do *NOT* think current upstream practice is good
or sustainable long term (though I admit I've argued on the other side
in the past).

This policy is why we're still using a machine designed in 1995 on x86
by default, in order that we avoid breaking the popular guest OS of the
day, like Windows 95.

This is similar to the problem we had with host build platforms, where
we afraid to make any change which would break an existing build platform,
or on the flipside made arbitrary ad-hoc changes with no consistent
approach across different subsystems.


I think that we should aim define some clearer policy around how we
want to support guest OS in upstream. As we did with our host build
platforms policy, any guest support policy should aim to move forward
at a reasonable pace so that we are not locked at a specific point in
time forever.

I can imagine three tiers

 1. Recommended. Expected to work well with machine type defaults
 2. Supported. Should work with QEMU but may need special settings applied
 3. Unsupported. Will not work with QEMU regardless of settings

I don't have an opinion right now on what guest OS should fall in which
category. One possible way to classify them would be to look at whether
the vendor themselves still supported the OS.  IOW, to be in the
"Recommended" criteria, it must be actively supported by the vendor.
Once EOL by the vendor it would be capped at the "Supported" tier.

That definition wouldn't help your pseries issue though, because RHEL-6
is still considered a supported OS.

Another possible way to classify guest OS would be to look purely at
the original release date, and set a cap of "$TODAY - 5 years" for
the "Recommended" tier. That would exclude RHEL-6.

Regards,
Daniel
David Gibson March 11, 2020, 1:12 a.m. UTC | #2
On Tue, Mar 10, 2020 at 11:43:43AM +0000, Daniel P. Berrangé wrote:
> On Thu, Mar 05, 2020 at 03:30:07PM +1100, David Gibson wrote:
> > Upcoming Secure VM support for pSeries machines introduces some
> > complications for virtio, since the transfer buffers need to be
> > explicitly shared so that the hypervisor can access them.
> > 
> > While it's not strictly speaking dependent on it, the fact that virtio
> > devices bypass normal platform IOMMU translation complicates the issue
> > on the guest side.  Since there are some significan downsides to
> > bypassing the vIOMMU anyway, let's just disable that.
> > 
> > There's already a flag to do this in virtio, just turn it on by
> > default for forthcoming pseries machine types.
> 
> Breaking existing guest OS to support a new secure VM feature that
> may not even be used/wanted doesn't seems like a sensible tradeoff
> for default out of the box behaviour.
> 
> IOW, if Secure VM needs this, can we tie the change in virtio and
> IOMMU defaults to the machine type flag that enables the use of
> Secure VM.

There is no such flag.

In the POWER secure VM model, the secure mode option isn't something
that's constructed in when the hypervisor builds the VM.  Instead the
VM is started normally and transitions itself to secure mode by
talking directly with the ultravisor (it then uses TPM shenannigans to
safely get the keys to its real storage backend(s)).

> That way the changed virtio defaults only take effect if a user/mgmt
> app has explicitly opted in to the new Secure VM feature, and existing
> users won't be broken by a new feature they don't even use.

Sure, but qemu has no natural way to know if secure VM is in use,
until too late.

I am wondering if we have to introduce an "svm=on" flag anyway.  It's
pretty ugly, since all it would be doing is changing defaults here and
there for compatibilty with a possible future SVM transition, but
maybe it's the best we can do :/.

> > Any opinions on whether dropping support for the older guest kernels
> > is acceptable at this point?
> 
> 
> I think this question has different answers depending on whether you
> are considering downstream vendor policy, current upstream policy,
> and a possible new downstream policy on guest support. IOW a bit of a
> can of worms...
> 
> 
> In the case of RHEL downstream there is a very narrow matrix for
> what guest OS are considered supported.
> 
> In the case of current upstream, there has essentially never been
> any documented guest matrix. The unwritten implicit rule upstream
> has followed is to not change defaults in a way that would break
> ability to run existing guest OS.

Hrm, ok, that's not how I've been treating it for for pseries, though
previous breakages have been for much older / rarer cases.  We broke
support for guests that don't call "ibm,client-architecture-support"
long, long ago (but such guests are really, really ancient).  We broke
support (without workarounds) for guests with 4kiB standard page size
more recently, but those are at least a decade old for common
downstream distros (you can build such kernels today, but
approximately nobody does).

> As an example, on x86 upstream defaults to i440fx and thus still
> uses virtio devices in transitional mode by default, while downstream
> RHEL used its narrow support matrix as a justification for why it was
> ok to switch to q35 by default & loose guest support in many cases.
> Even that was problematic though, because RHEL still needed to support
> RHEL-6 guest which are broken by default with q35 since they only
> support legacy mode virtio. Thus we needed work in management apps
> to enable RHEL-6 to continue working with q35 chipset, by placing
> the devices onto a PCI bridge, instead of a PCIe root port, or by
> explicitly using i440fx instead.

Yeah, and here's where x86's visibility with mgmt because a big
thing.  Most of these changes are easily enough worked around with
machine type options - and there's no inherent reason those are harder
to work with than whole machine types, or other config details.  But
getting mgmt apps to support machine option based workarounds for us
is a real challenge.

> Thus if we follow our *current* upstream guest support policy, I don't
> think it is acceptable to break existing guests with the new machine
> type upstream.  It is reasonable to do it downstream if the downstream
> is willing to sacrifice these guests, or invest to make any mgmt apps 
> add workaround/revert QEMU changes.
> 
> 
> With that all said, I do *NOT* think current upstream practice is good
> or sustainable long term (though I admit I've argued on the other side
> in the past).
> 
> This policy is why we're still using a machine designed in 1995 on x86
> by default, in order that we avoid breaking the popular guest OS of the
> day, like Windows 95.
> 
> This is similar to the problem we had with host build platforms, where
> we afraid to make any change which would break an existing build platform,
> or on the flipside made arbitrary ad-hoc changes with no consistent
> approach across different subsystems.
> 
> 
> I think that we should aim define some clearer policy around how we
> want to support guest OS in upstream. As we did with our host build
> platforms policy, any guest support policy should aim to move forward
> at a reasonable pace so that we are not locked at a specific point in
> time forever.
> 
> I can imagine three tiers
> 
>  1. Recommended. Expected to work well with machine type defaults
>  2. Supported. Should work with QEMU but may need special settings applied
>  3. Unsupported. Will not work with QEMU regardless of settings
> 
> I don't have an opinion right now on what guest OS should fall in which
> category. One possible way to classify them would be to look at whether
> the vendor themselves still supported the OS.  IOW, to be in the
> "Recommended" criteria, it must be actively supported by the vendor.
> Once EOL by the vendor it would be capped at the "Supported" tier.
> 
> That definition wouldn't help your pseries issue though, because RHEL-6
> is still considered a supported OS.
> 
> Another possible way to classify guest OS would be to look purely at
> the original release date, and set a cap of "$TODAY - 5 years" for
> the "Recommended" tier. That would exclude RHEL-6.

That seems fairly reasonable to me, but it doesn't really help me move
forward right now.  Right now, we have no sane way of distinguishing
early enough between a RHEL-6 guest that needs legacy virtio and a
guest that wants to go to secure mode, and can't have legacy virtio. :(
Michael S. Tsirkin March 11, 2020, 7:33 a.m. UTC | #3
On Wed, Mar 11, 2020 at 12:12:47PM +1100, David Gibson wrote:
> I am wondering if we have to introduce an "svm=on" flag anyway.  It's
> pretty ugly, since all it would be doing is changing defaults here and
> there for compatibilty with a possible future SVM transition, but
> maybe it's the best we can do :/.

Frankly I'm surprised there's no way for the hypervisor to block VM
transition to secure mode. To me an inability to disable DRM looks like
a security problem. Does not the ultravisor somehow allow
enabling/disabling this functionality from the hypervisor? It would be
even better if the hypervisor could block the guest from poking at the
ultravisor completely but I guess that would be too much to hope for.
Daniel P. Berrangé March 11, 2020, 10:01 a.m. UTC | #4
On Wed, Mar 11, 2020 at 12:12:47PM +1100, David Gibson wrote:
> On Tue, Mar 10, 2020 at 11:43:43AM +0000, Daniel P. Berrangé wrote:
> > On Thu, Mar 05, 2020 at 03:30:07PM +1100, David Gibson wrote:
> > > Upcoming Secure VM support for pSeries machines introduces some
> > > complications for virtio, since the transfer buffers need to be
> > > explicitly shared so that the hypervisor can access them.
> > > 
> > > While it's not strictly speaking dependent on it, the fact that virtio
> > > devices bypass normal platform IOMMU translation complicates the issue
> > > on the guest side.  Since there are some significan downsides to
> > > bypassing the vIOMMU anyway, let's just disable that.
> > > 
> > > There's already a flag to do this in virtio, just turn it on by
> > > default for forthcoming pseries machine types.
> > 
> > Breaking existing guest OS to support a new secure VM feature that
> > may not even be used/wanted doesn't seems like a sensible tradeoff
> > for default out of the box behaviour.
> > 
> > IOW, if Secure VM needs this, can we tie the change in virtio and
> > IOMMU defaults to the machine type flag that enables the use of
> > Secure VM.
> 
> There is no such flag.
> 
> In the POWER secure VM model, the secure mode option isn't something
> that's constructed in when the hypervisor builds the VM.  Instead the
> VM is started normally and transitions itself to secure mode by
> talking directly with the ultravisor (it then uses TPM shenannigans to
> safely get the keys to its real storage backend(s)).

This is pretty suprising to me. The ability to use secure VM mode surely
depends on host hardware features. We would need to be able to block the
use of this, in order to allow VMs to be live migrated to hosts which
lack the feature. Automatically & silently enabling a feature that
has a hardware dependancy is something we aim to avoid, unless the user
has opted in via some flag (such as -cpu host, or a -cpu $NAME, that
implies the feature).

> > > Any opinions on whether dropping support for the older guest kernels
> > > is acceptable at this point?
> > 
> > 
> > I think this question has different answers depending on whether you
> > are considering downstream vendor policy, current upstream policy,
> > and a possible new downstream policy on guest support. IOW a bit of a
> > can of worms...
> > 
> > 
> > In the case of RHEL downstream there is a very narrow matrix for
> > what guest OS are considered supported.
> > 
> > In the case of current upstream, there has essentially never been
> > any documented guest matrix. The unwritten implicit rule upstream
> > has followed is to not change defaults in a way that would break
> > ability to run existing guest OS.
> 
> Hrm, ok, that's not how I've been treating it for for pseries, though
> previous breakages have been for much older / rarer cases.  We broke
> support for guests that don't call "ibm,client-architecture-support"
> long, long ago (but such guests are really, really ancient).  We broke
> support (without workarounds) for guests with 4kiB standard page size
> more recently, but those are at least a decade old for common
> downstream distros (you can build such kernels today, but
> approximately nobody does).
> 
> > As an example, on x86 upstream defaults to i440fx and thus still
> > uses virtio devices in transitional mode by default, while downstream
> > RHEL used its narrow support matrix as a justification for why it was
> > ok to switch to q35 by default & loose guest support in many cases.
> > Even that was problematic though, because RHEL still needed to support
> > RHEL-6 guest which are broken by default with q35 since they only
> > support legacy mode virtio. Thus we needed work in management apps
> > to enable RHEL-6 to continue working with q35 chipset, by placing
> > the devices onto a PCI bridge, instead of a PCIe root port, or by
> > explicitly using i440fx instead.
> 
> Yeah, and here's where x86's visibility with mgmt because a big
> thing.  Most of these changes are easily enough worked around with
> machine type options - and there's no inherent reason those are harder
> to work with than whole machine types, or other config details.  But
> getting mgmt apps to support machine option based workarounds for us
> is a real challenge.
> 
> > Thus if we follow our *current* upstream guest support policy, I don't
> > think it is acceptable to break existing guests with the new machine
> > type upstream.  It is reasonable to do it downstream if the downstream
> > is willing to sacrifice these guests, or invest to make any mgmt apps 
> > add workaround/revert QEMU changes.
> > 
> > 
> > With that all said, I do *NOT* think current upstream practice is good
> > or sustainable long term (though I admit I've argued on the other side
> > in the past).
> > 
> > This policy is why we're still using a machine designed in 1995 on x86
> > by default, in order that we avoid breaking the popular guest OS of the
> > day, like Windows 95.
> > 
> > This is similar to the problem we had with host build platforms, where
> > we afraid to make any change which would break an existing build platform,
> > or on the flipside made arbitrary ad-hoc changes with no consistent
> > approach across different subsystems.
> > 
> > 
> > I think that we should aim define some clearer policy around how we
> > want to support guest OS in upstream. As we did with our host build
> > platforms policy, any guest support policy should aim to move forward
> > at a reasonable pace so that we are not locked at a specific point in
> > time forever.
> > 
> > I can imagine three tiers
> > 
> >  1. Recommended. Expected to work well with machine type defaults
> >  2. Supported. Should work with QEMU but may need special settings applied
> >  3. Unsupported. Will not work with QEMU regardless of settings
> > 
> > I don't have an opinion right now on what guest OS should fall in which
> > category. One possible way to classify them would be to look at whether
> > the vendor themselves still supported the OS.  IOW, to be in the
> > "Recommended" criteria, it must be actively supported by the vendor.
> > Once EOL by the vendor it would be capped at the "Supported" tier.
> > 
> > That definition wouldn't help your pseries issue though, because RHEL-6
> > is still considered a supported OS.
> > 
> > Another possible way to classify guest OS would be to look purely at
> > the original release date, and set a cap of "$TODAY - 5 years" for
> > the "Recommended" tier. That would exclude RHEL-6.
> 
> That seems fairly reasonable to me, but it doesn't really help me move
> forward right now.  Right now, we have no sane way of distinguishing
> early enough between a RHEL-6 guest that needs legacy virtio and a
> guest that wants to go to secure mode, and can't have legacy virtio. :(

Regards,
Daniel
Michael S. Tsirkin March 11, 2020, 11:48 a.m. UTC | #5
On Wed, Mar 11, 2020 at 10:01:27AM +0000, Daniel P. Berrangé wrote:
> On Wed, Mar 11, 2020 at 12:12:47PM +1100, David Gibson wrote:
> > On Tue, Mar 10, 2020 at 11:43:43AM +0000, Daniel P. Berrangé wrote:
> > > On Thu, Mar 05, 2020 at 03:30:07PM +1100, David Gibson wrote:
> > > > Upcoming Secure VM support for pSeries machines introduces some
> > > > complications for virtio, since the transfer buffers need to be
> > > > explicitly shared so that the hypervisor can access them.
> > > > 
> > > > While it's not strictly speaking dependent on it, the fact that virtio
> > > > devices bypass normal platform IOMMU translation complicates the issue
> > > > on the guest side.  Since there are some significan downsides to
> > > > bypassing the vIOMMU anyway, let's just disable that.
> > > > 
> > > > There's already a flag to do this in virtio, just turn it on by
> > > > default for forthcoming pseries machine types.
> > > 
> > > Breaking existing guest OS to support a new secure VM feature that
> > > may not even be used/wanted doesn't seems like a sensible tradeoff
> > > for default out of the box behaviour.
> > > 
> > > IOW, if Secure VM needs this, can we tie the change in virtio and
> > > IOMMU defaults to the machine type flag that enables the use of
> > > Secure VM.
> > 
> > There is no such flag.
> > 
> > In the POWER secure VM model, the secure mode option isn't something
> > that's constructed in when the hypervisor builds the VM.  Instead the
> > VM is started normally and transitions itself to secure mode by
> > talking directly with the ultravisor (it then uses TPM shenannigans to
> > safely get the keys to its real storage backend(s)).
> 
> This is pretty suprising to me. The ability to use secure VM mode surely
> depends on host hardware features. We would need to be able to block the
> use of this, in order to allow VMs to be live migrated to hosts which
> lack the feature. Automatically & silently enabling a feature that
> has a hardware dependancy is something we aim to avoid, unless the user
> has opted in via some flag (such as -cpu host, or a -cpu $NAME, that
> implies the feature).

That's something I don't know. Is migration supported in this mode?
Greg Kurz March 11, 2020, 5:19 p.m. UTC | #6
On Thu,  5 Mar 2020 15:30:07 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> Upcoming Secure VM support for pSeries machines introduces some
> complications for virtio, since the transfer buffers need to be
> explicitly shared so that the hypervisor can access them.
> 
> While it's not strictly speaking dependent on it, the fact that virtio
> devices bypass normal platform IOMMU translation complicates the issue
> on the guest side.  Since there are some significan downsides to
> bypassing the vIOMMU anyway, let's just disable that.
> 
> There's already a flag to do this in virtio, just turn it on by
> default for forthcoming pseries machine types.
> 
> Any opinions on whether dropping support for the older guest kernels
> is acceptable at this point?
> 
> Changes since v2:
>  * Rebase and improve some comments
> Changes since v1:
>  * Added information on which guest kernel versions will no longer
>    work with these changes
>  * Use Michael Tsirkin's suggested better way of handling the machine
>    type change
> 
> David Gibson (2):
>   spapr: Disable legacy virtio devices for pseries-5.0 and later

This disables legacy AND transitional devices. IIUC this first patch
is needed because only non-transitional (pure virtio-1) devices
support iommu_platform=on, ie. this check in virtio_pci_device_plugged():

    if (legacy) {
        if (virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM)) {
            error_setg(errp, "VIRTIO_F_IOMMU_PLATFORM was supported by"
                       " neither legacy nor transitional device");
            return ;
        }

It certainly looks right for legacy devices but what about
transitional ones ? I couldn't find any indication in the
spec or in the QEMU archives that explains why IOMMU should
only be used with non-transitional devices...

Jason or Michael, can you explain ?

>   spapr: Enable virtio iommu_platform=on by default
> 
>  hw/ppc/spapr.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
David Gibson March 12, 2020, 1:08 a.m. UTC | #7
On Wed, Mar 11, 2020 at 10:01:27AM +0000, Daniel P. Berrangé wrote:
65;5803;1c> On Wed, Mar 11, 2020 at 12:12:47PM +1100, David Gibson wrote:
> > On Tue, Mar 10, 2020 at 11:43:43AM +0000, Daniel P. Berrangé wrote:
> > > On Thu, Mar 05, 2020 at 03:30:07PM +1100, David Gibson wrote:
> > > > Upcoming Secure VM support for pSeries machines introduces some
> > > > complications for virtio, since the transfer buffers need to be
> > > > explicitly shared so that the hypervisor can access them.
> > > > 
> > > > While it's not strictly speaking dependent on it, the fact that virtio
> > > > devices bypass normal platform IOMMU translation complicates the issue
> > > > on the guest side.  Since there are some significan downsides to
> > > > bypassing the vIOMMU anyway, let's just disable that.
> > > > 
> > > > There's already a flag to do this in virtio, just turn it on by
> > > > default for forthcoming pseries machine types.
> > > 
> > > Breaking existing guest OS to support a new secure VM feature that
> > > may not even be used/wanted doesn't seems like a sensible tradeoff
> > > for default out of the box behaviour.
> > > 
> > > IOW, if Secure VM needs this, can we tie the change in virtio and
> > > IOMMU defaults to the machine type flag that enables the use of
> > > Secure VM.
> > 
> > There is no such flag.
> > 
> > In the POWER secure VM model, the secure mode option isn't something
> > that's constructed in when the hypervisor builds the VM.  Instead the
> > VM is started normally and transitions itself to secure mode by
> > talking directly with the ultravisor (it then uses TPM shenannigans to
> > safely get the keys to its real storage backend(s)).
> 
> This is pretty suprising to me. The ability to use secure VM mode surely
> depends on host hardware features. We would need to be able to block the
> use of this, in order to allow VMs to be live migrated to hosts which
> lack the feature. Automatically & silently enabling a feature that
> has a hardware dependancy is something we aim to avoid, unless the user
> has opted in via some flag (such as -cpu host, or a -cpu $NAME, that
> implies the feature).

That is an excellent point, which I had not previously considered.

I have confirmed that there is indeed not, at present, a way to
disable the secure transition.  But, it looks like it's not too late
to fix it.

I've discussed with Paul Mackerras, and early in the secure transition
apparently the UV makes a call to the HV, which is allowed to fail.

So, we're looking at adding another KVM capability for secure mode.
It will default to disabled, and until it is explicitly enabled, KVM
will always fail that call from the UV, effectively preventing guests
from going into secure mode.

We can then wire that up to a new spapr cap in qemu, which we can also
use to configure these virtio defaults.
David Gibson March 12, 2020, 1:09 a.m. UTC | #8
On Wed, Mar 11, 2020 at 07:48:26AM -0400, Michael S. Tsirkin wrote:
> On Wed, Mar 11, 2020 at 10:01:27AM +0000, Daniel P. Berrangé wrote:
> > On Wed, Mar 11, 2020 at 12:12:47PM +1100, David Gibson wrote:
> > > On Tue, Mar 10, 2020 at 11:43:43AM +0000, Daniel P. Berrangé wrote:
> > > > On Thu, Mar 05, 2020 at 03:30:07PM +1100, David Gibson wrote:
> > > > > Upcoming Secure VM support for pSeries machines introduces some
> > > > > complications for virtio, since the transfer buffers need to be
> > > > > explicitly shared so that the hypervisor can access them.
> > > > > 
> > > > > While it's not strictly speaking dependent on it, the fact that virtio
> > > > > devices bypass normal platform IOMMU translation complicates the issue
> > > > > on the guest side.  Since there are some significan downsides to
> > > > > bypassing the vIOMMU anyway, let's just disable that.
> > > > > 
> > > > > There's already a flag to do this in virtio, just turn it on by
> > > > > default for forthcoming pseries machine types.
> > > > 
> > > > Breaking existing guest OS to support a new secure VM feature that
> > > > may not even be used/wanted doesn't seems like a sensible tradeoff
> > > > for default out of the box behaviour.
> > > > 
> > > > IOW, if Secure VM needs this, can we tie the change in virtio and
> > > > IOMMU defaults to the machine type flag that enables the use of
> > > > Secure VM.
> > > 
> > > There is no such flag.
> > > 
> > > In the POWER secure VM model, the secure mode option isn't something
> > > that's constructed in when the hypervisor builds the VM.  Instead the
> > > VM is started normally and transitions itself to secure mode by
> > > talking directly with the ultravisor (it then uses TPM shenannigans to
> > > safely get the keys to its real storage backend(s)).
> > 
> > This is pretty suprising to me. The ability to use secure VM mode surely
> > depends on host hardware features. We would need to be able to block the
> > use of this, in order to allow VMs to be live migrated to hosts which
> > lack the feature. Automatically & silently enabling a feature that
> > has a hardware dependancy is something we aim to avoid, unless the user
> > has opted in via some flag (such as -cpu host, or a -cpu $NAME, that
> > implies the feature).
> 
> That's something I don't know. Is migration supported in this mode?

Not at this stage, though there's plans for it later.
David Gibson March 12, 2020, 1:10 a.m. UTC | #9
On Wed, Mar 11, 2020 at 03:33:59AM -0400, Michael S. Tsirkin wrote:
> On Wed, Mar 11, 2020 at 12:12:47PM +1100, David Gibson wrote:
> > I am wondering if we have to introduce an "svm=on" flag anyway.  It's
> > pretty ugly, since all it would be doing is changing defaults here and
> > there for compatibilty with a possible future SVM transition, but
> > maybe it's the best we can do :/.
> 
> Frankly I'm surprised there's no way for the hypervisor to block VM
> transition to secure mode. To me an inability to disable DRM looks like
> a security problem.

Uh.. I don't immediately see how it's a security problem, though I'm
certainly convinced it's a problem in other ways.

> Does not the ultravisor somehow allow
> enabling/disabling this functionality from the hypervisor?

Not at present, but as mentioned on the other thread, Paul and I came
up with a tentative plan to change that.

> It would be
> even better if the hypervisor could block the guest from poking at the
> ultravisor completely but I guess that would be too much to hope for.

Yeah, probably :/.
Michael S. Tsirkin March 12, 2020, 6:32 a.m. UTC | #10
On Thu, Mar 12, 2020 at 12:10:49PM +1100, David Gibson wrote:
> On Wed, Mar 11, 2020 at 03:33:59AM -0400, Michael S. Tsirkin wrote:
> > On Wed, Mar 11, 2020 at 12:12:47PM +1100, David Gibson wrote:
> > > I am wondering if we have to introduce an "svm=on" flag anyway.  It's
> > > pretty ugly, since all it would be doing is changing defaults here and
> > > there for compatibilty with a possible future SVM transition, but
> > > maybe it's the best we can do :/.
> > 
> > Frankly I'm surprised there's no way for the hypervisor to block VM
> > transition to secure mode. To me an inability to disable DRM looks like
> > a security problem.
> 
> Uh.. I don't immediately see how it's a security problem, though I'm
> certainly convinced it's a problem in other ways.

Well for one it breaks introspection, allowing guests to hide
malicious code from hypervisors.

> > Does not the ultravisor somehow allow
> > enabling/disabling this functionality from the hypervisor?
> 
> Not at present, but as mentioned on the other thread, Paul and I came
> up with a tentative plan to change that.
> 
> > It would be
> > even better if the hypervisor could block the guest from poking at the
> > ultravisor completely but I guess that would be too much to hope for.
> 
> Yeah, probably :/.
> 
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson
Daniel P. Berrangé March 12, 2020, 9:47 a.m. UTC | #11
On Thu, Mar 12, 2020 at 12:08:47PM +1100, David Gibson wrote:
> On Wed, Mar 11, 2020 at 10:01:27AM +0000, Daniel P. Berrangé wrote:
> 65;5803;1c> On Wed, Mar 11, 2020 at 12:12:47PM +1100, David Gibson wrote:
> > > On Tue, Mar 10, 2020 at 11:43:43AM +0000, Daniel P. Berrangé wrote:
> > > > On Thu, Mar 05, 2020 at 03:30:07PM +1100, David Gibson wrote:
> > > > > Upcoming Secure VM support for pSeries machines introduces some
> > > > > complications for virtio, since the transfer buffers need to be
> > > > > explicitly shared so that the hypervisor can access them.
> > > > > 
> > > > > While it's not strictly speaking dependent on it, the fact that virtio
> > > > > devices bypass normal platform IOMMU translation complicates the issue
> > > > > on the guest side.  Since there are some significan downsides to
> > > > > bypassing the vIOMMU anyway, let's just disable that.
> > > > > 
> > > > > There's already a flag to do this in virtio, just turn it on by
> > > > > default for forthcoming pseries machine types.
> > > > 
> > > > Breaking existing guest OS to support a new secure VM feature that
> > > > may not even be used/wanted doesn't seems like a sensible tradeoff
> > > > for default out of the box behaviour.
> > > > 
> > > > IOW, if Secure VM needs this, can we tie the change in virtio and
> > > > IOMMU defaults to the machine type flag that enables the use of
> > > > Secure VM.
> > > 
> > > There is no such flag.
> > > 
> > > In the POWER secure VM model, the secure mode option isn't something
> > > that's constructed in when the hypervisor builds the VM.  Instead the
> > > VM is started normally and transitions itself to secure mode by
> > > talking directly with the ultravisor (it then uses TPM shenannigans to
> > > safely get the keys to its real storage backend(s)).
> > 
> > This is pretty suprising to me. The ability to use secure VM mode surely
> > depends on host hardware features. We would need to be able to block the
> > use of this, in order to allow VMs to be live migrated to hosts which
> > lack the feature. Automatically & silently enabling a feature that
> > has a hardware dependancy is something we aim to avoid, unless the user
> > has opted in via some flag (such as -cpu host, or a -cpu $NAME, that
> > implies the feature).
> 
> That is an excellent point, which I had not previously considered.
> 
> I have confirmed that there is indeed not, at present, a way to
> disable the secure transition.  But, it looks like it's not too late
> to fix it.
> 
> I've discussed with Paul Mackerras, and early in the secure transition
> apparently the UV makes a call to the HV, which is allowed to fail.
> 
> So, we're looking at adding another KVM capability for secure mode.
> It will default to disabled, and until it is explicitly enabled, KVM
> will always fail that call from the UV, effectively preventing guests
> from going into secure mode.
> 
> We can then wire that up to a new spapr cap in qemu, which we can also
> use to configure these virtio defaults.

Great, that sounds viable to me.

Regards,
Daniel
David Gibson March 16, 2020, 3:06 a.m. UTC | #12
On Thu, Mar 12, 2020 at 02:32:11AM -0400, Michael S. Tsirkin wrote:
> On Thu, Mar 12, 2020 at 12:10:49PM +1100, David Gibson wrote:
> > On Wed, Mar 11, 2020 at 03:33:59AM -0400, Michael S. Tsirkin wrote:
> > > On Wed, Mar 11, 2020 at 12:12:47PM +1100, David Gibson wrote:
> > > > I am wondering if we have to introduce an "svm=on" flag anyway.  It's
> > > > pretty ugly, since all it would be doing is changing defaults here and
> > > > there for compatibilty with a possible future SVM transition, but
> > > > maybe it's the best we can do :/.
> > > 
> > > Frankly I'm surprised there's no way for the hypervisor to block VM
> > > transition to secure mode. To me an inability to disable DRM looks like
> > > a security problem.
> > 
> > Uh.. I don't immediately see how it's a security problem, though I'm
> > certainly convinced it's a problem in other ways.
> 
> Well for one it breaks introspection, allowing guests to hide
> malicious code from hypervisors.

Hm, ok.  Is that much used in practice?

(Aside: I don't think I'd call that "introspection" since it's one
thing examining another, not something examining itself).

> 
> > > Does not the ultravisor somehow allow
> > > enabling/disabling this functionality from the hypervisor?
> > 
> > Not at present, but as mentioned on the other thread, Paul and I came
> > up with a tentative plan to change that.
> > 
> > > It would be
> > > even better if the hypervisor could block the guest from poking at the
> > > ultravisor completely but I guess that would be too much to hope for.
> > 
> > Yeah, probably :/.
> > 
> 
>