diff mbox series

[for,8.0,v3] memory: Prevent recursive memory access

Message ID 20230316162044.31607-1-akihiko.odaki@daynix.com (mailing list archive)
State New, archived
Headers show
Series [for,8.0,v3] memory: Prevent recursive memory access | expand

Commit Message

Akihiko Odaki March 16, 2023, 4:20 p.m. UTC
A guest may ask a memory-mapped device to perform DMA. If the address
specified for DMA belongs to the very device performing it, the access
recurses. It is very unlikely that device implementations are prepared
for such an abnormal access, which can result in unpredictable behavior.

In particular, such recursion breaks e1000e, a network device. If the
device is configured to write received packets to the register that
triggers reception, the write re-enters e1000e's Rx logic. This causes
a use-after-free since the Rx logic is not re-entrant.

As there should be no valid reason to perform recursive memory access,
check for recursion before accessing a memory-mapped device.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1543
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
V1 -> V2: Marked the variable thread-local. Introduced a linked list.

 softmmu/memory.c | 81 ++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 64 insertions(+), 17 deletions(-)
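
For orientation, here is a minimal standalone sketch of the approach the
patch takes: a per-thread stack of device owners that is consulted before
dispatching an MMIO access. The names (Device, access_start, access_end)
are illustrative only, not QEMU API, and the push is written as intended;
the posted patch contains a push bug that the review below identifies.

    #include <stdbool.h>
    #include <stdio.h>

    /* Illustrative stand-in for the QEMU Object that owns a MemoryRegion. */
    typedef struct Device {
        const char *name;
    } Device;

    typedef struct AccessedRegion AccessedRegion;

    struct AccessedRegion {
        const Device *owner;
        const AccessedRegion *next;
    };

    /* One stack per thread so concurrent vCPU threads do not interfere. */
    static __thread const AccessedRegion *accessed_region;

    /* Reject the access if this device is already being accessed. */
    static bool access_start(const Device *owner, AccessedRegion *ar)
    {
        for (const AccessedRegion *it = accessed_region; it; it = it->next) {
            if (it->owner == owner) {
                fprintf(stderr, "recursive access to '%s' rejected\n",
                        owner->name);
                return false;
            }
        }
        ar->owner = owner;
        ar->next = accessed_region;
        accessed_region = ar;   /* push 'ar' itself, not 'ar->next' */
        return true;
    }

    static void access_end(void)
    {
        accessed_region = accessed_region->next;   /* pop */
    }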

Comments

Cédric Le Goater March 17, 2023, 4:25 p.m. UTC | #1
Hello,

On 3/16/23 17:20, Akihiko Odaki wrote:
> A guest may ask a memory-mapped device to perform DMA. If the address
> specified for DMA belongs to the very device performing it, the access
> recurses. It is very unlikely that device implementations are prepared
> for such an abnormal access, which can result in unpredictable behavior.
> 
> In particular, such recursion breaks e1000e, a network device. If the
> device is configured to write received packets to the register that
> triggers reception, the write re-enters e1000e's Rx logic. This causes
> a use-after-free since the Rx logic is not re-entrant.
> 
> As there should be no valid reason to perform recursive memory access,
> check for recursion before accessing a memory-mapped device.
> 
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1543
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
> V1 -> V2: Marked the variable thread-local. Introduced a linked list.
> 
>   softmmu/memory.c | 81 ++++++++++++++++++++++++++++++++++++++----------
>   1 file changed, 64 insertions(+), 17 deletions(-)
> 
> diff --git a/softmmu/memory.c b/softmmu/memory.c
> index 4699ba55ec..6be33a9e3e 100644
> --- a/softmmu/memory.c
> +++ b/softmmu/memory.c
> @@ -61,6 +61,15 @@ struct AddrRange {
>       Int128 size;
>   };
>   
> +typedef struct AccessedRegion AccessedRegion;
> +
> +struct AccessedRegion {
> +    const Object *owner;
> +    const AccessedRegion *next;
> +};
> +
> +static __thread const AccessedRegion *accessed_region;
> +
>   static AddrRange addrrange_make(Int128 start, Int128 size)
>   {
>       return (AddrRange) { start, size };
> @@ -1394,6 +1403,16 @@ bool memory_region_access_valid(MemoryRegion *mr,
>           return false;
>       }
>   
> +    for (const AccessedRegion *ar = accessed_region; ar; ar = ar->next) {
> +        if (ar->owner == mr->owner) {
> +            qemu_log_mask(LOG_GUEST_ERROR, "Invalid %s at addr 0x%" HWADDR_PRIX
> +                          ", size %u, region '%s', reason: recursive access\n",
> +                          is_write ? "write" : "read",
> +                          addr, size, memory_region_name(mr));
> +            return false;
> +        }
> +    }
> +
>       /* Treat zero as compatibility all valid */
>       if (!mr->ops->valid.max_access_size) {
>           return true;
> @@ -1413,6 +1432,29 @@ bool memory_region_access_valid(MemoryRegion *mr,
>       return true;
>   }
>   
> +static bool memory_region_access_start(MemoryRegion *mr,
> +                                       hwaddr addr,
> +                                       unsigned size,
> +                                       bool is_write,
> +                                       MemTxAttrs attrs,
> +                                       AccessedRegion *ar)
> +{
> +    if (!memory_region_access_valid(mr, addr, size, is_write, attrs)) {
> +        return false;
> +    }
> +
> +    ar->owner = mr->owner;
> +    ar->next = accessed_region;
> +    accessed_region = ar->next;

Isn't 'accessed_region' always NULL ?

> +
> +    return true;
> +}
> +
> +static void memory_region_access_end(void)
> +{
> +    accessed_region = accessed_region->next;
> +}
and so, this is a segv.

Thanks,

C.
Peter Maydell March 17, 2023, 4:30 p.m. UTC | #2
On Thu, 16 Mar 2023 at 16:21, Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>
> A guest may ask a memory-mapped device to perform DMA. If the address
> specified for DMA belongs to the very device performing it, the access
> recurses. It is very unlikely that device implementations are prepared
> for such an abnormal access, which can result in unpredictable behavior.
>
> In particular, such recursion breaks e1000e, a network device. If the
> device is configured to write received packets to the register that
> triggers reception, the write re-enters e1000e's Rx logic. This causes
> a use-after-free since the Rx logic is not re-entrant.
>
> As there should be no valid reason to perform recursive memory access,
> check for recursion before accessing a memory-mapped device.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1543
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
> V1 -> V2: Marked the variable thread-local. Introduced a linked list.

Don't we already have proposals on the mailing list for
addressing this? How does this patch differ from those?

Also, "device A DMAs to device A" is only a subset of the problems --
you can also have "device A DMAs to device B which triggers DMA to
device A" and more complicated situations.

thanks
-- PMM
Akihiko Odaki March 18, 2023, 6:05 a.m. UTC | #3
On 2023/03/18 1:25, Cédric Le Goater wrote:
> Hello,
> 
> On 3/16/23 17:20, Akihiko Odaki wrote:
>> A guest may ask a memory-mapped device to perform DMA. If the address
>> specified for DMA belongs to the very device performing it, the access
>> recurses. It is very unlikely that device implementations are prepared
>> for such an abnormal access, which can result in unpredictable behavior.
>>
>> In particular, such recursion breaks e1000e, a network device. If the
>> device is configured to write received packets to the register that
>> triggers reception, the write re-enters e1000e's Rx logic. This causes
>> a use-after-free since the Rx logic is not re-entrant.
>>
>> As there should be no valid reason to perform recursive memory access,
>> check for recursion before accessing a memory-mapped device.
>>
>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1543
>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>> ---
>> V1 -> V2: Marked the variable thread-local. Introduced a linked list.
>>
>>   softmmu/memory.c | 81 ++++++++++++++++++++++++++++++++++++++----------
>>   1 file changed, 64 insertions(+), 17 deletions(-)
>>
>> diff --git a/softmmu/memory.c b/softmmu/memory.c
>> index 4699ba55ec..6be33a9e3e 100644
>> --- a/softmmu/memory.c
>> +++ b/softmmu/memory.c
>> @@ -61,6 +61,15 @@ struct AddrRange {
>>       Int128 size;
>>   };
>> +typedef struct AccessedRegion AccessedRegion;
>> +
>> +struct AccessedRegion {
>> +    const Object *owner;
>> +    const AccessedRegion *next;
>> +};
>> +
>> +static __thread const AccessedRegion *accessed_region;
>> +
>>   static AddrRange addrrange_make(Int128 start, Int128 size)
>>   {
>>       return (AddrRange) { start, size };
>> @@ -1394,6 +1403,16 @@ bool memory_region_access_valid(MemoryRegion *mr,
>>           return false;
>>       }
>> +    for (const AccessedRegion *ar = accessed_region; ar; ar = ar->next) {
>> +        if (ar->owner == mr->owner) {
>> +            qemu_log_mask(LOG_GUEST_ERROR, "Invalid %s at addr 0x%" HWADDR_PRIX
>> +                          ", size %u, region '%s', reason: recursive access\n",
>> +                          is_write ? "write" : "read",
>> +                          addr, size, memory_region_name(mr));
>> +            return false;
>> +        }
>> +    }
>> +
>>       /* Treat zero as compatibility all valid */
>>       if (!mr->ops->valid.max_access_size) {
>>           return true;
>> @@ -1413,6 +1432,29 @@ bool memory_region_access_valid(MemoryRegion *mr,
>>       return true;
>>   }
>> +static bool memory_region_access_start(MemoryRegion *mr,
>> +                                       hwaddr addr,
>> +                                       unsigned size,
>> +                                       bool is_write,
>> +                                       MemTxAttrs attrs,
>> +                                       AccessedRegion *ar)
>> +{
>> +    if (!memory_region_access_valid(mr, addr, size, is_write, attrs)) {
>> +        return false;
>> +    }
>> +
>> +    ar->owner = mr->owner;
>> +    ar->next = accessed_region;
>> +    accessed_region = ar->next;
> 
> Isn't 'accessed_region' always NULL ?
> 
>> +
>> +    return true;
>> +}
>> +
>> +static void memory_region_access_end(void)
>> +{
>> +    accessed_region = accessed_region->next;
>> +}
> and so, this is a segv.
> 
> Thanks,
> 
> C.

It was intended to be: accessed_region = ar;
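
To make the failure mode concrete, a minimal hypothetical reproduction of
the posted push logic, trimmed down from the patch's types:

    #include <assert.h>
    #include <stddef.h>

    typedef struct AccessedRegion AccessedRegion;

    struct AccessedRegion {
        const void *owner;
        const AccessedRegion *next;
    };

    static __thread const AccessedRegion *accessed_region;

    int main(void)
    {
        AccessedRegion ar = { .owner = NULL, .next = accessed_region };

        /* Buggy push: ar.next was just set to the old head (NULL on the
         * first access), so assigning ar.next leaves the head NULL. */
        accessed_region = ar.next;
        assert(accessed_region == NULL);

        /* memory_region_access_end() would now execute
         *     accessed_region = accessed_region->next;
         * dereferencing the NULL head. The intended
         *     accessed_region = &ar;
         * keeps the head valid and makes the pop safe. */
        return 0;
    }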

Anyway, I'm no longer pushing this forward as there is a better alternative:
https://lore.kernel.org/qemu-devel/20230313082417.827484-1-alxndr@bu.edu/

It can detect re-entrancy problems involving the bottom-half API and 
disable the check where re-entrancy is allowed.
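
For context, a conceptual sketch of a per-device re-entrancy guard in the
spirit of that series; the names here are simplified illustrations, not
the actual patch:

    #include <stdbool.h>

    typedef struct MemReentrancyGuard {
        bool engaged_in_io;
    } MemReentrancyGuard;

    /* Set the flag for the duration of an MMIO dispatch; reject the
     * access if the device is already engaged in I/O. */
    static bool reentrancy_guard_enter(MemReentrancyGuard *guard)
    {
        if (guard->engaged_in_io) {
            return false;
        }
        guard->engaged_in_io = true;
        return true;
    }

    static void reentrancy_guard_leave(MemReentrancyGuard *guard)
    {
        guard->engaged_in_io = false;
    }

Because the flag lives with the device rather than on the call stack, it
can also be consulted when a bottom half scheduled by the device runs
later, which is what lets that approach cover the asynchronous case.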

Regards,
Akihiko Odaki
Akihiko Odaki March 18, 2023, 6:10 a.m. UTC | #4
On 2023/03/18 1:30, Peter Maydell wrote:
> On Thu, 16 Mar 2023 at 16:21, Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>>
>> A guest may request ask a memory-mapped device to perform DMA. If the
>> address specified for DMA is the device performing DMA, it will create
>> recursion. It is very unlikely that device implementations are prepared
>> for such an abnormal access, which can result in unpredictable behavior.
>>
>> In particular, such a recursion breaks e1000e, a network device. If
>> the device is configured to write the received packet to the register
>> to trigger receiving, it triggers re-entry to the Rx logic of e1000e.
>> This causes use-after-free since the Rx logic is not re-entrant.
>>
>> As there should be no valid reason to perform recursive memory access,
>> check for recursion before accessing memory-mapped device.
>>
>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1543
>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>> ---
>> V1 -> V2: Marked the variable thread-local. Introduced linked list.
> 
> Don't we already have proposals on the mailing list for
> addressing this? How does this patch differ from those?
> 
> Also, "device A DMAs to device A" is only a subset of the problems --
> you can also have "device A DMAs to device B which triggers DMA to
> device A" and more complicated situations.
> 
> thanks
> -- PMM

I was not aware of the proposals when I wrote this. Alexander's proposal 
is better than this one as it covers the bottom-half API and can disable 
the check where it is not appropriate.
https://lore.kernel.org/qemu-devel/20230313082417.827484-1-alxndr@bu.edu/

My patch can handle the case "device A DMAs to device B which triggers 
DMA to device A" because each accessed memory region is recorded in a 
linked list, so a nested DMA does not overwrite the history of the prior 
DMAs. But it cannot handle the asynchronous case while Alexander's patch 
can.
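
Sketched with the helper names from the standalone example near the top
of this page (dev-a and dev-b are hypothetical devices):

    #include <assert.h>

    static void demo_chain(void)
    {
        static const Device dev_a = { "dev-a" }, dev_b = { "dev-b" };
        AccessedRegion ar_a, ar_b, ar_a2;

        assert(access_start(&dev_a, &ar_a));   /* guest hits A:  stack [A]    */
        assert(access_start(&dev_b, &ar_b));   /* A DMAs to B:   stack [B, A] */

        /* B DMAs back into A: A's owner is still on the stack, so the
         * access is rejected instead of re-entering A's MMIO handlers. */
        assert(!access_start(&dev_a, &ar_a2));

        access_end();    /* pop B */
        access_end();    /* pop A */
    }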

Regards,
Akihiko Odaki
diff mbox series

Patch

diff --git a/softmmu/memory.c b/softmmu/memory.c
index 4699ba55ec..6be33a9e3e 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -61,6 +61,15 @@  struct AddrRange {
     Int128 size;
 };
 
+typedef struct AccessedRegion AccessedRegion;
+
+struct AccessedRegion {
+    const Object *owner;
+    const AccessedRegion *next;
+};
+
+static __thread const AccessedRegion *accessed_region;
+
 static AddrRange addrrange_make(Int128 start, Int128 size)
 {
     return (AddrRange) { start, size };
@@ -1394,6 +1403,16 @@  bool memory_region_access_valid(MemoryRegion *mr,
         return false;
     }
 
+    for (const AccessedRegion *ar = accessed_region; ar; ar = ar->next) {
+        if (ar->owner == mr->owner) {
+            qemu_log_mask(LOG_GUEST_ERROR, "Invalid %s at addr 0x%" HWADDR_PRIX
+                          ", size %u, region '%s', reason: recursive access\n",
+                          is_write ? "write" : "read",
+                          addr, size, memory_region_name(mr));
+            return false;
+        }
+    }
+
     /* Treat zero as compatibility all valid */
     if (!mr->ops->valid.max_access_size) {
         return true;
@@ -1413,6 +1432,29 @@  bool memory_region_access_valid(MemoryRegion *mr,
     return true;
 }
 
+static bool memory_region_access_start(MemoryRegion *mr,
+                                       hwaddr addr,
+                                       unsigned size,
+                                       bool is_write,
+                                       MemTxAttrs attrs,
+                                       AccessedRegion *ar)
+{
+    if (!memory_region_access_valid(mr, addr, size, is_write, attrs)) {
+        return false;
+    }
+
+    ar->owner = mr->owner;
+    ar->next = accessed_region;
+    accessed_region = ar->next;
+
+    return true;
+}
+
+static void memory_region_access_end(void)
+{
+    accessed_region = accessed_region->next;
+}
+
 static MemTxResult memory_region_dispatch_read1(MemoryRegion *mr,
                                                 hwaddr addr,
                                                 uint64_t *pval,
@@ -1443,6 +1485,7 @@  MemTxResult memory_region_dispatch_read(MemoryRegion *mr,
                                         MemTxAttrs attrs)
 {
     unsigned size = memop_size(op);
+    AccessedRegion ar;
     MemTxResult r;
 
     if (mr->alias) {
@@ -1450,12 +1493,13 @@  MemTxResult memory_region_dispatch_read(MemoryRegion *mr,
                                            mr->alias_offset + addr,
                                            pval, op, attrs);
     }
-    if (!memory_region_access_valid(mr, addr, size, false, attrs)) {
+    if (!memory_region_access_start(mr, addr, size, false, attrs, &ar)) {
         *pval = unassigned_mem_read(mr, addr, size);
         return MEMTX_DECODE_ERROR;
     }
 
     r = memory_region_dispatch_read1(mr, addr, pval, size, attrs);
+    memory_region_access_end();
     adjust_endianness(mr, pval, op);
     return r;
 }
@@ -1493,13 +1537,15 @@  MemTxResult memory_region_dispatch_write(MemoryRegion *mr,
                                          MemTxAttrs attrs)
 {
     unsigned size = memop_size(op);
+    AccessedRegion ar;
+    MemTxResult result;
 
     if (mr->alias) {
         return memory_region_dispatch_write(mr->alias,
                                             mr->alias_offset + addr,
                                             data, op, attrs);
     }
-    if (!memory_region_access_valid(mr, addr, size, true, attrs)) {
+    if (!memory_region_access_start(mr, addr, size, true, attrs, &ar)) {
         unassigned_mem_write(mr, addr, data, size);
         return MEMTX_DECODE_ERROR;
     }
@@ -1508,23 +1554,24 @@  MemTxResult memory_region_dispatch_write(MemoryRegion *mr,
 
     if ((!kvm_eventfds_enabled()) &&
         memory_region_dispatch_write_eventfds(mr, addr, data, size, attrs)) {
-        return MEMTX_OK;
-    }
-
-    if (mr->ops->write) {
-        return access_with_adjusted_size(addr, &data, size,
-                                         mr->ops->impl.min_access_size,
-                                         mr->ops->impl.max_access_size,
-                                         memory_region_write_accessor, mr,
-                                         attrs);
+        result = MEMTX_OK;
+    } else if (mr->ops->write) {
+        result = access_with_adjusted_size(addr, &data, size,
+                                           mr->ops->impl.min_access_size,
+                                           mr->ops->impl.max_access_size,
+                                           memory_region_write_accessor, mr,
+                                           attrs);
     } else {
-        return
-            access_with_adjusted_size(addr, &data, size,
-                                      mr->ops->impl.min_access_size,
-                                      mr->ops->impl.max_access_size,
-                                      memory_region_write_with_attrs_accessor,
-                                      mr, attrs);
+        result = access_with_adjusted_size(addr, &data, size,
+                                           mr->ops->impl.min_access_size,
+                                           mr->ops->impl.max_access_size,
+                                           memory_region_write_with_attrs_accessor,
+                                           mr, attrs);
     }
+
+    memory_region_access_end();
+
+    return result;
 }
 
 void memory_region_init_io(MemoryRegion *mr,