mbox series

[v3,0/7] memory: prevent dma-reentracy issues

Message ID 20221028191648.964076-1-alxndr@bu.edu (mailing list archive)
Headers show
Series memory: prevent dma-reentracy issues | expand

Message

Alexander Bulekov Oct. 28, 2022, 7:16 p.m. UTC
These patches aim to solve two types of DMA-reentrancy issues:

1.) mmio -> dma -> mmio case
To solve this, we track whether the device is engaged in io by
checking/setting a flag within APIs used for MMIO access.

2.) bh -> dma write -> mmio case
This case is trickier, since we dont have a generic way to associate a
bh with the underlying Device/DeviceState. Thus, this version introduces
a change to QEMU's DMA APIs to associate each request with the
origiantor DeviceState. In total, the affected APIs are used in
approximately 250 locations:

dma_memory_valid (1 usage)
dma_memory_rw (~5 uses)
dma_memory_read (~92 uses)
dma_memory_write (~71 uses)
dma_memory_set (~4 uses)
dma_memory_map (~18 uses)
dma_memory_unmap (~21 uses)
{ld,st}_{le,be}_{uw,l,q}_dma (~10 uses)
ldub_dma (does not appear to be used anywhere)
stb_dma (1 usage)
dma_buf_read (~18 uses)
dma_buf_write (~7 uses)

It is not trivial to mechanically replace all of the invocations:
For many cases, this will be as simple as adding DEVICE(s) to the
arguments, but there are locations where the code will need to be
slightly changed. As such, for now I added "_guarded" versions of most
of the APIs which can be used until all of the invocations are fixed.

The end goal is to go through all of hw/ and make the required changes
(I will need help with this). Once that is done, the "_guarded" APIs can
take the place of the standard DMA APIs and we can mecahnically remove
the "_guarded" suffix from all invocations.

These changes do not address devices that bypass DMA apis and directly
call into address_space.. APIs. This occurs somewhat commonly, and
prevents me from fixing issues in Virtio devices, such as:
https://gitlab.com/qemu-project/qemu/-/issues/827
I'm not sure what approach we should take for these cases - maybe they
should be switched to DMA APIs (or the DMA API expanded).

v2 -> v3: Bite the bullet and modify the DMA APIs, rather than
    attempting to guess DeviceStates in BHs.

Alexander Bulekov (7):
  memory: associate DMA accesses with the initiator Device
  dma-helpers: switch to guarded DMA accesses
  ahci: switch to guarded DMA acccesses
  sdhci: switch to guarded DMA accesses
  ehci: switch to guarded DMA accesses
  xhci: switch to guarded DMA accesses
  usb/libhw: switch to guarded DMA accesses

 hw/ide/ahci.c          | 16 +++++++++-------
 hw/sd/sdhci.c          | 43 ++++++++++++++++++++++--------------------
 hw/usb/hcd-ehci.c      |  8 ++++----
 hw/usb/hcd-xhci.c      | 24 +++++++++++------------
 hw/usb/libhw.c         |  4 ++--
 include/hw/qdev-core.h |  2 ++
 include/sysemu/dma.h   | 41 ++++++++++++++++++++++++++++++++++++++++
 softmmu/dma-helpers.c  | 15 ++++++++-------
 softmmu/memory.c       | 15 +++++++++++++++
 softmmu/trace-events   |  1 +
 10 files changed, 117 insertions(+), 52 deletions(-)

Comments

Alexander Bulekov Nov. 7, 2022, 5:09 p.m. UTC | #1
On 221028 1516, Alexander Bulekov wrote:
> These patches aim to solve two types of DMA-reentrancy issues:
> 
> 1.) mmio -> dma -> mmio case
> To solve this, we track whether the device is engaged in io by
> checking/setting a flag within APIs used for MMIO access.
> 
> 2.) bh -> dma write -> mmio case
> This case is trickier, since we dont have a generic way to associate a
> bh with the underlying Device/DeviceState. Thus, this version introduces
> a change to QEMU's DMA APIs to associate each request with the
> origiantor DeviceState. In total, the affected APIs are used in
> approximately 250 locations:
> 
> dma_memory_valid (1 usage)
> dma_memory_rw (~5 uses)
> dma_memory_read (~92 uses)
> dma_memory_write (~71 uses)
> dma_memory_set (~4 uses)
> dma_memory_map (~18 uses)
> dma_memory_unmap (~21 uses)
> {ld,st}_{le,be}_{uw,l,q}_dma (~10 uses)
> ldub_dma (does not appear to be used anywhere)
> stb_dma (1 usage)
> dma_buf_read (~18 uses)
> dma_buf_write (~7 uses)
> 
> It is not trivial to mechanically replace all of the invocations:
> For many cases, this will be as simple as adding DEVICE(s) to the
> arguments, but there are locations where the code will need to be
> slightly changed. As such, for now I added "_guarded" versions of most
> of the APIs which can be used until all of the invocations are fixed.
> 
> The end goal is to go through all of hw/ and make the required changes
> (I will need help with this). Once that is done, the "_guarded" APIs can
> take the place of the standard DMA APIs and we can mecahnically remove
> the "_guarded" suffix from all invocations.
> 
> These changes do not address devices that bypass DMA apis and directly
> call into address_space.. APIs. This occurs somewhat commonly, and
> prevents me from fixing issues in Virtio devices, such as:
> https://gitlab.com/qemu-project/qemu/-/issues/827
> I'm not sure what approach we should take for these cases - maybe they
> should be switched to DMA APIs (or the DMA API expanded).
> 
> v2 -> v3: Bite the bullet and modify the DMA APIs, rather than
>     attempting to guess DeviceStates in BHs.
> 
> Alexander Bulekov (7):
>   memory: associate DMA accesses with the initiator Device
>   dma-helpers: switch to guarded DMA accesses
>   ahci: switch to guarded DMA acccesses
>   sdhci: switch to guarded DMA accesses
>   ehci: switch to guarded DMA accesses
>   xhci: switch to guarded DMA accesses
>   usb/libhw: switch to guarded DMA accesses
> 
>  hw/ide/ahci.c          | 16 +++++++++-------
>  hw/sd/sdhci.c          | 43 ++++++++++++++++++++++--------------------
>  hw/usb/hcd-ehci.c      |  8 ++++----
>  hw/usb/hcd-xhci.c      | 24 +++++++++++------------
>  hw/usb/libhw.c         |  4 ++--
>  include/hw/qdev-core.h |  2 ++
>  include/sysemu/dma.h   | 41 ++++++++++++++++++++++++++++++++++++++++
>  softmmu/dma-helpers.c  | 15 ++++++++-------
>  softmmu/memory.c       | 15 +++++++++++++++
>  softmmu/trace-events   |  1 +
>  10 files changed, 117 insertions(+), 52 deletions(-)
> 
> -- 
> 2.27.0
>

ping
Stefan Hajnoczi Nov. 10, 2022, 8:50 p.m. UTC | #2
Preventing this class of bugs is important but QEMU is currently
frozen for the 7.2 release. I'm a little concerned about regressions
in a patch series that changes core device emulation code.

I'll review the series on Monday and if anyone has strong opinions on
whether to merge this into 7.2, please say so. My thoughts are that
this should be merged in the 7.3 release cycle so there's time to work
out any issues.

Stefan
Michael S. Tsirkin Nov. 10, 2022, 8:53 p.m. UTC | #3
On Thu, Nov 10, 2022 at 03:50:51PM -0500, Stefan Hajnoczi wrote:
> Preventing this class of bugs is important but QEMU is currently
> frozen for the 7.2 release. I'm a little concerned about regressions
> in a patch series that changes core device emulation code.
> 
> I'll review the series on Monday and if anyone has strong opinions on
> whether to merge this into 7.2, please say so. My thoughts are that
> this should be merged in the 7.3 release cycle so there's time to work
> out any issues.
> 
> Stefan

Stefan, what you say here makes total sense to me.
Didn't look at the series either yet.
Peter Maydell Nov. 10, 2022, 10:50 p.m. UTC | #4
On Thu, 10 Nov 2022 at 20:51, Stefan Hajnoczi <stefanha@gmail.com> wrote:
>
> Preventing this class of bugs is important but QEMU is currently
> frozen for the 7.2 release. I'm a little concerned about regressions
> in a patch series that changes core device emulation code.
>
> I'll review the series on Monday and if anyone has strong opinions on
> whether to merge this into 7.2, please say so. My thoughts are that
> this should be merged in the 7.3 release cycle so there's time to work
> out any issues.

Yeah, we've lived with this class of issues for many releases
now; I would favour landing any solution early in the 8.0
cycle so we can make sure we've worked out any problems well
before release.

thanks
-- PMM
Philippe Mathieu-Daudé Nov. 15, 2022, 11:28 a.m. UTC | #5
On 10/11/22 21:50, Stefan Hajnoczi wrote:
> Preventing this class of bugs is important but QEMU is currently
> frozen for the 7.2 release. I'm a little concerned about regressions
> in a patch series that changes core device emulation code.

I'm waiting for Alex's MemTxRequesterType field addition in
MemTxAttrs [1] lands to rework my previous approach using
flatview_access_allowed() instead of access_with_adjusted_size()
[2]. I haven't looked at this series in detail, but since the
permission check is done on the Memory API layer, I might have
missed something in my previous intent (by using the FlatView
layer).

[1] 
https://lore.kernel.org/qemu-devel/20221111182535.64844-2-alex.bennee@linaro.org/
[2] 
https://lore.kernel.org/qemu-devel/20211215182421.418374-4-philmd@redhat.com/

> I'll review the series on Monday and if anyone has strong opinions on
> whether to merge this into 7.2, please say so. My thoughts are that
> this should be merged in the 7.3 release cycle so there's time to work
> out any issues.
> 
> Stefan