diff mbox series

[1/1] hw/ide/core: terminate in-flight DMA on IDE bus reset

Message ID 20230921160712.99521-2-simon.rowe@nutanix.com (mailing list archive)
State New, archived
Headers show
Series CVE-2023-5088 | expand

Commit Message

Simon Rowe Sept. 21, 2023, 4:07 p.m. UTC
When an IDE controller is reset, its internal state is being cleared
before any outstanding I/O is cancelled. If a response to DMA is
received in this window, the aio callback will incorrectly continue
with the next part of the transfer (now using sector 0 from
the cleared controller state).

For a write operation, this results in user data being written to the
MBR, replacing the first stage bootloader and/or partition table. A
malicious user could exploit this bug to first read the MBR and then
rewrite it with user-controller bootloader code.

This addresses the bug by checking if DRQ_STAT is still set in the DMA
callback (as it is otherwise cleared at the start of the bus
reset). If it is not, treat the transfer as ended.

This only appears to affect SATA controllers, plain IDE does not use
aio.

Fixes: CVE-2023-5088
Signed-off-by: Simon Rowe <simon.rowe@nutanix.com>
Cc: Felipe Franciosi <felipe@nutanix.com>
---
 hw/ide/core.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

John Snow Sept. 25, 2023, 7:53 p.m. UTC | #1
Niklas, I'm sorry to lean on you here a little bit - You've been
working on the SATA side of this a bit more often, can you let me know
if you think this patch is safe?

I'm not immediately sure what the impact of applying it is, but I have
some questions about it:

(1) When does ide_dma_cb get invoked when DRQ_STAT is set but the
return code we were passed is not < 0, and

(2) what's the impact of just *not* executing the end-of-transfer
block here; what happens to the state machine?

On Thu, Sep 21, 2023 at 12:07 PM Simon Rowe <simon.rowe@nutanix.com> wrote:
>
> When an IDE controller is reset, its internal state is being cleared
> before any outstanding I/O is cancelled. If a response to DMA is
> received in this window, the aio callback will incorrectly continue
> with the next part of the transfer (now using sector 0 from
> the cleared controller state).

Eugh, yikes. It feels like we should fix the cancellation ... I'm
worried that if we've reset the state machine and we need to bail on a
DMA callback that the heuristics we use for that will eventually be
wrong, unless I am mistaken and this is totally safe and reliable for
a reason I don't intuitively see right away.

Thoughts?

>
> For a write operation, this results in user data being written to the
> MBR, replacing the first stage bootloader and/or partition table. A
> malicious user could exploit this bug to first read the MBR and then
> rewrite it with user-controller bootloader code.

Oh, good.

>
> This addresses the bug by checking if DRQ_STAT is still set in the DMA
> callback (as it is otherwise cleared at the start of the bus
> reset). If it is not, treat the transfer as ended.
>
> This only appears to affect SATA controllers, plain IDE does not use
> aio.
>
> Fixes: CVE-2023-5088
> Signed-off-by: Simon Rowe <simon.rowe@nutanix.com>
> Cc: Felipe Franciosi <felipe@nutanix.com>
> ---
>  hw/ide/core.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index b5e0dcd29b..826b7eaeeb 100644
> --- a/hw/ide/core.c
> +++ b/hw/ide/core.c
> @@ -906,8 +906,12 @@ static void ide_dma_cb(void *opaque, int ret)
>          s->nsector -= n;
>      }
>
> -    /* end of transfer ? */
> -    if (s->nsector == 0) {
> +    /*
> +     * End of transfer ?
> +     * If a bus reset occurs immediately before the callback is invoked the
> +     * bus state will have been cleared. Terminate the transfer.
> +     */
> +    if (s->nsector == 0 || !(s->status & DRQ_STAT)) {
>          s->status = READY_STAT | SEEK_STAT;
>          ide_bus_set_irq(s->bus);
>          goto eot;
> --
> 2.22.3
>
Fiona Ebner Sept. 26, 2023, 7:11 a.m. UTC | #2
Am 25.09.23 um 21:53 schrieb John Snow:
> On Thu, Sep 21, 2023 at 12:07 PM Simon Rowe <simon.rowe@nutanix.com> wrote:
>>
>> When an IDE controller is reset, its internal state is being cleared
>> before any outstanding I/O is cancelled. If a response to DMA is
>> received in this window, the aio callback will incorrectly continue
>> with the next part of the transfer (now using sector 0 from
>> the cleared controller state).
> 
> Eugh, yikes. It feels like we should fix the cancellation ... 
Please note that there already is a patch for that on the list:
https://lists.nongnu.org/archive/html/qemu-devel/2023-09/msg01011.html

Best Regards,
Fiona
John Snow Sept. 26, 2023, 2:45 p.m. UTC | #3
On Tue, Sep 26, 2023, 3:11 AM Fiona Ebner <f.ebner@proxmox.com> wrote:

> Am 25.09.23 um 21:53 schrieb John Snow:
> > On Thu, Sep 21, 2023 at 12:07 PM Simon Rowe <simon.rowe@nutanix.com>
> wrote:
> >>
> >> When an IDE controller is reset, its internal state is being cleared
> >> before any outstanding I/O is cancelled. If a response to DMA is
> >> received in this window, the aio callback will incorrectly continue
> >> with the next part of the transfer (now using sector 0 from
> >> the cleared controller state).
> >
> > Eugh, yikes. It feels like we should fix the cancellation ...
> Please note that there already is a patch for that on the list:
> https://lists.nongnu.org/archive/html/qemu-devel/2023-09/msg01011.html
>
> Best Regards,
> Fiona
>

Gotcha, thanks for the pointer. I wonder if that's sufficient to fix the
CVE here? I don't have the reproducer in my hands (that I know of ... it's
genuinely possible I missed it, apologies)

>
Fiona Ebner Sept. 28, 2023, 11:23 a.m. UTC | #4
Am 26.09.23 um 16:45 schrieb John Snow:
> 
> 
> On Tue, Sep 26, 2023, 3:11 AM Fiona Ebner <f.ebner@proxmox.com
> <mailto:f.ebner@proxmox.com>> wrote:
> 
>     Am 25.09.23 um 21:53 schrieb John Snow:
>     > On Thu, Sep 21, 2023 at 12:07 PM Simon Rowe
>     <simon.rowe@nutanix.com <mailto:simon.rowe@nutanix.com>> wrote:
>     >>
>     >> When an IDE controller is reset, its internal state is being cleared
>     >> before any outstanding I/O is cancelled. If a response to DMA is
>     >> received in this window, the aio callback will incorrectly continue
>     >> with the next part of the transfer (now using sector 0 from
>     >> the cleared controller state).
>     >
>     > Eugh, yikes. It feels like we should fix the cancellation ...
>     Please note that there already is a patch for that on the list:
>     https://lists.nongnu.org/archive/html/qemu-devel/2023-09/msg01011.html <https://lists.nongnu.org/archive/html/qemu-devel/2023-09/msg01011.html>
> 
>     Best Regards,
>     Fiona
> 
> 
> Gotcha, thanks for the pointer. I wonder if that's sufficient to fix the
> CVE here? I don't have the reproducer in my hands (that I know of ...
> it's genuinely possible I missed it, apologies)
> 

AFAICT, yes, because the DMA callback is invoked before resetting the
state now. But not 100% sure if it can't be triggered in some other way,
maybe Simon knows more? I don't have a reproducer for the CVE either,
but the second patch after the one linked above adds a qtest for the
reset scenario.

Best Regards,
Fiona
Simon Rowe Oct. 2, 2023, 9:08 a.m. UTC | #5
On Thursday, 28 September 2023 Fiona Ebner <f.ebner@proxmox.com> wrote:

> AFAICT, yes, because the DMA callback is invoked before resetting the
> state now. But not 100% sure if it can't be triggered in some other way,
> maybe Simon knows more? I don't have a reproducer for the CVE either,
> but the second patch after the one linked above adds a qtest for the
> reset scenario.

I initially tested an identical change and, yes, it did seem to address the issue. I preferred my final solution because it felt wrong for the DMA to continue after the point the VM is expecting the controller to be reset. I felt it was best to leave the ordering as is (because there are multiple other controllers that use ide_bus_reset()) and terminate the DMA transaction using state that indicates a reset is being performed.

I have a test setup that I use to reproduce this (that was mentioned in the original CVE disclosure). My patch ran for 24+ hours successfully. I can test any other proposed fix.

Regards
Simon
John Snow Oct. 2, 2023, 10:59 p.m. UTC | #6
On Mon, Oct 2, 2023 at 5:09 AM Simon Rowe <simon.rowe@nutanix.com> wrote:
>
> On Thursday, 28 September 2023 Fiona Ebner <f.ebner@proxmox.com> wrote:
>
>
>
> > AFAICT, yes, because the DMA callback is invoked before resetting the
> > state now. But not 100% sure if it can't be triggered in some other way,
> > maybe Simon knows more? I don't have a reproducer for the CVE either,
> > but the second patch after the one linked above adds a qtest for the
> > reset scenario.
>
>
>
> I initially tested an identical change and, yes, it did seem to address the issue. I preferred my final solution because it felt wrong for the DMA to continue after the point the VM is expecting the controller to be reset. I felt it was best to leave the ordering as is (because there are multiple other controllers that use ide_bus_reset()) and terminate the DMA transaction using state that indicates a reset is being performed.

Which reset pathway are you testing that causes the problem? I'm not
fully clear on why checking for DRQ is legitimate here. Does this new
condition fire before or after the registers have been reset as part
of the reset ...?

I trust you there's a problem, but I don't know the specifics of it
just yet to understand your proposed fix. (I have a nagging fear that
it might trigger in more circumstances than we want it to, but I need
more info to audit that.)

I'm also concerned about -- depending on WHEN this conditional will
fire -- the effects of processing the end-of-transfer block on a newly
reset (or about-to-be reset) device.

>
>
>
> I have a test setup that I use to reproduce this (that was mentioned in the original CVE disclosure). My patch ran for 24+ hours successfully. I can test any other proposed fix.
>
>
>
> Regards
>
> Simon
Simon Rowe Oct. 3, 2023, 12:46 p.m. UTC | #7
On Monday, 2 October 2023 John Snow <jsnow@redhat.com> wrote:

> Which reset pathway are you testing that causes the problem?

The test centres on a VM-initiated bus reset because a DMA write has stalled (I deliberately discard the iSCSI response).

> I'm not fully clear on why checking for DRQ is legitimate here.

It was the best condition I could find, there’s probably something better. DRQ_STAT seems to be set by ide_sector_start_dma() and cleared when the transfer ends.

It would be far simpler if s->nsector was trustworthy, but it’s massaged by ide_set_signature() immediately after being zeroed.

> Does this new condition fire before or after the registers have been reset as part
> of the reset ...?

After, the flow is as follows:
- DMA transfer started
- Guest triggers AHCI reset
- ahci_reset_port() calls ide_bus_reset() calls ide_reset()
- ide_reset() clears state including LBA48 support etc
- ide_bus_reset() attempts to cancel pending async DMA operation
- bdrv_aio_cancel() sends async cancel request then polls for response
- Completion of DMA request arrives
- ide_dma_cb() calculates sector number by calling ide_get_sector()
- Because of the controller state after reset sector number is 0
- Next part of transfer is done

> I trust you there's a problem, but I don't know the specifics of it
> just yet to understand your proposed fix. (I have a nagging fear that
> it might trigger in more circumstances than we want it to, but I need
> more info to audit that.)

Hopefully the above clarifies things. I’ve done my best to make the fix very targeted but this is a complex interaction in subsystems I have little knowledge of.

> I'm also concerned about -- depending on WHEN this conditional will
> fire -- the effects of processing the end-of-transfer block on a newly
> reset (or about-to-be reset) device.

I understand, do you think there’s a better approach?

Regards
Simon
Niklas Cassel Oct. 3, 2023, 2:06 p.m. UTC | #8
On Mon, Sep 25, 2023 at 03:53:23PM -0400, John Snow wrote:
> Niklas, I'm sorry to lean on you here a little bit - You've been
> working on the SATA side of this a bit more often, can you let me know
> if you think this patch is safe?

FWIW, I prefer Fiona's patch series which calls blk_aio_cancel() before
calling ide_reset():
https://lore.kernel.org/qemu-devel/20230906130922.142845-1-f.ebner@proxmox.com/T/#u

Patch 2/2 also adds a qtest which fails before patch 1/2, and passes after
patch 1/2.


> 
> I'm not immediately sure what the impact of applying it is, but I have
> some questions about it:
> 
> (1) When does ide_dma_cb get invoked when DRQ_STAT is set but the
> return code we were passed is not < 0, and
> 
> (2) what's the impact of just *not* executing the end-of-transfer
> block here; what happens to the state machine?
> 
> On Thu, Sep 21, 2023 at 12:07 PM Simon Rowe <simon.rowe@nutanix.com> wrote:
> >
> > When an IDE controller is reset, its internal state is being cleared
> > before any outstanding I/O is cancelled. If a response to DMA is
> > received in this window, the aio callback will incorrectly continue
> > with the next part of the transfer (now using sector 0 from
> > the cleared controller state).
> 
> Eugh, yikes. It feels like we should fix the cancellation ... I'm
> worried that if we've reset the state machine and we need to bail on a
> DMA callback that the heuristics we use for that will eventually be
> wrong, unless I am mistaken and this is totally safe and reliable for
> a reason I don't intuitively see right away.
> 
> Thoughts?

Since Simon has a reliable reproducer, and has offered to test Fiona's patch,
perhaps we should simply take him up on that offer :)


Kind regards,
Niklas
John Snow Oct. 3, 2023, 5:05 p.m. UTC | #9
On Tue, Oct 3, 2023, 10:07 AM Niklas Cassel <Niklas.Cassel@wdc.com> wrote:

> On Mon, Sep 25, 2023 at 03:53:23PM -0400, John Snow wrote:
> > Niklas, I'm sorry to lean on you here a little bit - You've been
> > working on the SATA side of this a bit more often, can you let me know
> > if you think this patch is safe?
>
> FWIW, I prefer Fiona's patch series which calls blk_aio_cancel() before
> calling ide_reset():
>
> https://lore.kernel.org/qemu-devel/20230906130922.142845-1-f.ebner@proxmox.com/T/#u
>
> Patch 2/2 also adds a qtest which fails before patch 1/2, and passes after
> patch 1/2.
>

I think I also prefer Fiona's patch. Simon makes a good point (I think)
that it feels correct to dump the DMA as fast as possible, but Fiona's
patch seems more justifiably and obviously correct.

I think my preference is for Fiona's patches first, and if we want to
optimize cancellation thereafter we can attempt to do so.


>
> >
> > I'm not immediately sure what the impact of applying it is, but I have
> > some questions about it:
> >
> > (1) When does ide_dma_cb get invoked when DRQ_STAT is set but the
> > return code we were passed is not < 0, and
> >
> > (2) what's the impact of just *not* executing the end-of-transfer
> > block here; what happens to the state machine?
> >
> > On Thu, Sep 21, 2023 at 12:07 PM Simon Rowe <simon.rowe@nutanix.com>
> wrote:
> > >
> > > When an IDE controller is reset, its internal state is being cleared
> > > before any outstanding I/O is cancelled. If a response to DMA is
> > > received in this window, the aio callback will incorrectly continue
> > > with the next part of the transfer (now using sector 0 from
> > > the cleared controller state).
> >
> > Eugh, yikes. It feels like we should fix the cancellation ... I'm
> > worried that if we've reset the state machine and we need to bail on a
> > DMA callback that the heuristics we use for that will eventually be
> > wrong, unless I am mistaken and this is totally safe and reliable for
> > a reason I don't intuitively see right away.
> >
> > Thoughts?
>
> Since Simon has a reliable reproducer, and has offered to test Fiona's
> patch,
> perhaps we should simply take him up on that offer :)
>

Yes! I just wanted to understand the differences between the two approaches
since it looked like it was going to be my job to decide between them 
Simon Rowe Oct. 4, 2023, 7:51 a.m. UTC | #10
On Tuesday, 3 October 2023 John Snow <jsnow@redhat.com> wrote:

> Simon, can you confirm that Fiona's patches are appropriate for your reproducer? In the meantime I'll do my
> own audit for the problem as you described it (thank you very much for that) and see if there's anything else
> that needs to be addressed.

I’ll run it through my setup and give it 24 hours testing.

Regards
Simon
diff mbox series

Patch

diff --git a/hw/ide/core.c b/hw/ide/core.c
index b5e0dcd29b..826b7eaeeb 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -906,8 +906,12 @@  static void ide_dma_cb(void *opaque, int ret)
         s->nsector -= n;
     }
 
-    /* end of transfer ? */
-    if (s->nsector == 0) {
+    /*
+     * End of transfer ?
+     * If a bus reset occurs immediately before the callback is invoked the
+     * bus state will have been cleared. Terminate the transfer.
+     */
+    if (s->nsector == 0 || !(s->status & DRQ_STAT)) {
         s->status = READY_STAT | SEEK_STAT;
         ide_bus_set_irq(s->bus);
         goto eot;