diff mbox series

[v2,08/10] dp8393x: don't force 32-bit register access

Message ID 20210625065401.30170-9-mark.cave-ayland@ilande.co.uk (mailing list archive)
State New, archived
Headers show
Series dp8393x: fixes for MacOS toolbox ROM | expand

Commit Message

Mark Cave-Ayland June 25, 2021, 6:53 a.m. UTC
Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that all accesses
to the registers were 32-bit but this is actually not the case. The access size is
determined by the CPU instruction used and not the number of physical address lines.

The big_endian workaround applied to the register read/writes was actually caused
by forcing the access size to 32-bit when the guest OS was using a 16-bit access.
Since the registers are 16-bit then we can simply set .impl.min_access to 2 and
then the memory API will automatically do the right thing for both 16-bit accesses
used by Linux and 32-bit accesses used by the MacOS toolbox ROM.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: 3fe9a838ec ("dp8393x: Always use 32-bit accesses")
---
 hw/net/dp8393x.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

Comments

Philippe Mathieu-Daudé July 1, 2021, 9:34 p.m. UTC | #1
On 6/25/21 8:53 AM, Mark Cave-Ayland wrote:
> Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that all accesses
> to the registers were 32-bit but this is actually not the case. The access size is
> determined by the CPU instruction used and not the number of physical address lines.
> 
> The big_endian workaround applied to the register read/writes was actually caused
> by forcing the access size to 32-bit when the guest OS was using a 16-bit access.
> Since the registers are 16-bit then we can simply set .impl.min_access to 2 and
> then the memory API will automatically do the right thing for both 16-bit accesses
> used by Linux and 32-bit accesses used by the MacOS toolbox ROM.

Hmm I'm not sure. This sounds to me like the "QEMU doesn't model
busses so we end using kludge to hide bugs" pattern. Can you
provide a QTest (ideally) or a "-trace memory_region_ops_\*" log
of your firmware accessing the dp8393x please?

> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> Fixes: 3fe9a838ec ("dp8393x: Always use 32-bit accesses")
> ---
>  hw/net/dp8393x.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c
> index 252c0a2664..6789bcd3af 100644
> --- a/hw/net/dp8393x.c
> +++ b/hw/net/dp8393x.c
> @@ -602,15 +602,14 @@ static uint64_t dp8393x_read(void *opaque, hwaddr addr, unsigned int size)
>  
>      trace_dp8393x_read(reg, reg_names[reg], val, size);
>  
> -    return s->big_endian ? val << 16 : val;
> +    return val;
>  }
>  
> -static void dp8393x_write(void *opaque, hwaddr addr, uint64_t data,
> +static void dp8393x_write(void *opaque, hwaddr addr, uint64_t val,
>                            unsigned int size)
>  {
>      dp8393xState *s = opaque;
>      int reg = addr >> s->it_shift;
> -    uint32_t val = s->big_endian ? data >> 16 : data;
>  
>      trace_dp8393x_write(reg, reg_names[reg], val, size);
>  
> @@ -694,7 +693,7 @@ static void dp8393x_write(void *opaque, hwaddr addr, uint64_t data,
>  static const MemoryRegionOps dp8393x_ops = {
>      .read = dp8393x_read,
>      .write = dp8393x_write,
> -    .impl.min_access_size = 4,
> +    .impl.min_access_size = 2,
>      .impl.max_access_size = 4,
>      .endianness = DEVICE_NATIVE_ENDIAN,
>  };
>
Finn Thain July 2, 2021, 4:36 a.m. UTC | #2
On Thu, 1 Jul 2021, Philippe Mathieu-Daudé wrote:

> On 6/25/21 8:53 AM, Mark Cave-Ayland wrote:
> > Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that 
> > all accesses to the registers were 32-bit 

No, that assumption was not made there. Just take a look at my commits in 
Linux that make 16-bit accesses. If commit 3fe9a838ec worked by accident, 
it probably just reflects my inadequate knowledge of QEMU internals.

> > but this is actually not the case. The access size is determined by 
> > the CPU instruction used and not the number of physical address lines.
> > 

I think that's an over-simplification (in the context of commit 
3fe9a838ec).

> > The big_endian workaround applied to the register read/writes was 
> > actually caused by forcing the access size to 32-bit when the guest OS 
> > was using a 16-bit access. Since the registers are 16-bit then we can 
> > simply set .impl.min_access to 2 and then the memory API will 
> > automatically do the right thing for both 16-bit accesses used by 
> > Linux and 32-bit accesses used by the MacOS toolbox ROM.
> 
> Hmm I'm not sure. This sounds to me like the "QEMU doesn't model busses 
> so we end using kludge to hide bugs" pattern. Can you provide a QTest 
> (ideally) or a "-trace memory_region_ops_\*" log of your firmware 
> accessing the dp8393x please?
> 

The DP83932 chip is highly configurable, so I'm not sure that the 
behaviour of any given firmware would resolve the question.

Anyway, as far as the DP83932 hardware is concerned, the behaviour of the 
upper 16-bits of the data bus depends on the configuration programmed into 
the DP83932 registers, and whether the chip is accessed as a slave or 
performing DMA as a master.
Mark Cave-Ayland July 3, 2021, 6:21 a.m. UTC | #3
On 02/07/2021 05:36, Finn Thain wrote:

>> On 6/25/21 8:53 AM, Mark Cave-Ayland wrote:
>>> Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that
>>> all accesses to the registers were 32-bit
> 
> No, that assumption was not made there. Just take a look at my commits in
> Linux that make 16-bit accesses. If commit 3fe9a838ec worked by accident,
> it probably just reflects my inadequate knowledge of QEMU internals.
> 
>>> but this is actually not the case. The access size is determined by
>>> the CPU instruction used and not the number of physical address lines.
>>>
> 
> I think that's an over-simplification (in the context of commit
> 3fe9a838ec).

Let me try and clarify this a bit more: there are 2 different changes incorporated 
into 3fe9a838ec. The first (as you mention below and also detailed in the commit 
messge), is related to handling writes to the upper 16-bits of a word from the device 
and ensuring that 32-bit accesses are handled correctly. This part seems correct to 
me based upon reading the datasheet and the commit message:

@@ -246,9 +246,19 @@ static void dp8393x_put(dp8393xState *s, int width, int offset,
                          uint16_t val)
  {
      if (s->big_endian) {
-        s->data[offset * width + width - 1] = cpu_to_be16(val);
+        if (width == 2) {
+            s->data[offset * 2] = 0;
+            s->data[offset * 2 + 1] = cpu_to_be16(val);
+        } else {
+            s->data[offset] = cpu_to_be16(val);
+        }
      } else {
-        s->data[offset * width] = cpu_to_le16(val);
+        if (width == 2) {
+            s->data[offset * 2] = cpu_to_le16(val);
+            s->data[offset * 2 + 1] = 0;
+        } else {
+            s->data[offset] = cpu_to_le16(val);
+        }
      }
  }

The second change incorporated into 3fe9a838ec (and the one this patch fixes) is a 
similar change made to the MMIO *register* accesses:

@@ -590,7 +600,7 @@ static uint64_t dp8393x_read(void *opaque, hwaddr addr, unsigned 
int size)

      DPRINTF("read 0x%04x from reg %s\n", val, reg_names[reg]);

-    return val;
+    return s->big_endian ? val << 16 : val;
  }

and:

@@ -598,13 +608,14 @@ static void dp8393x_write(void *opaque, hwaddr addr, uint64_t data,
  {
      dp8393xState *s = opaque;
      int reg = addr >> s->it_shift;
+    uint32_t val = s->big_endian ? data >> 16 : data;

This is not correct since the QEMU memory API handles any access size and endian 
conversion before the MMIO access reaches the device. It is this change which breaks 
the 32-bit accesses used by MacOS to read/write the dp8393x registers because it 
applies an additional endian swap on top of that already done by the memory API.

>>> The big_endian workaround applied to the register read/writes was
>>> actually caused by forcing the access size to 32-bit when the guest OS
>>> was using a 16-bit access. Since the registers are 16-bit then we can
>>> simply set .impl.min_access to 2 and then the memory API will
>>> automatically do the right thing for both 16-bit accesses used by
>>> Linux and 32-bit accesses used by the MacOS toolbox ROM.
>>
>> Hmm I'm not sure. This sounds to me like the "QEMU doesn't model busses
>> so we end using kludge to hide bugs" pattern. Can you provide a QTest
>> (ideally) or a "-trace memory_region_ops_\*" log of your firmware
>> accessing the dp8393x please?
>>
> The DP83932 chip is highly configurable, so I'm not sure that the
> behaviour of any given firmware would resolve the question.

Indeed, I don't think that will help much here. Phil, if you would still like me to 
send you some traces after reading the explanation above then do let me know.

> Anyway, as far as the DP83932 hardware is concerned, the behaviour of the
> upper 16-bits of the data bus depends on the configuration programmed into
> the DP83932 registers, and whether the chip is accessed as a slave or
> performing DMA as a master.

The important part of the commit and its associated message is that it only changes 
the *register* accesses which were introduced as part of 3fe9a838ec.

In the end all the patch does is remove the manual endian swap from the MMIO 
registers since QEMU's memory API does the right thing all by itself, and adds the 
tweak below:

@@ -694,7 +693,7 @@ static void dp8393x_write(void *opaque, hwaddr addr, uint64_t data,
  static const MemoryRegionOps dp8393x_ops = {
      .read = dp8393x_read,
      .write = dp8393x_write,
-    .impl.min_access_size = 4,
+    .impl.min_access_size = 2,
      .impl.max_access_size = 4,
      .endianness = DEVICE_NATIVE_ENDIAN,
  };

As Finn points out the dp8393x registers are 16-bit so we declare the implementation 
size as 2 bytes and the maximum size as 4 bytes. This allows Linux to function 
correctly with 16-bit accesses, whilst 32-bit accesses done by MacOS are split into 2 
separate 16-bit accesses and combined automatically by the memory API.


ATB,

Mark.
Philippe Mathieu-Daudé July 3, 2021, 8:52 a.m. UTC | #4
On 7/3/21 8:21 AM, Mark Cave-Ayland wrote:
> On 02/07/2021 05:36, Finn Thain wrote:
> 
>>> On 6/25/21 8:53 AM, Mark Cave-Ayland wrote:
>>>> Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that
>>>> all accesses to the registers were 32-bit
>>
>> No, that assumption was not made there. Just take a look at my commits in
>> Linux that make 16-bit accesses. If commit 3fe9a838ec worked by accident,
>> it probably just reflects my inadequate knowledge of QEMU internals.
>>
>>>> but this is actually not the case. The access size is determined by
>>>> the CPU instruction used and not the number of physical address lines.
>>>>
>>
>> I think that's an over-simplification (in the context of commit
>> 3fe9a838ec).
> 
> Let me try and clarify this a bit more: there are 2 different changes
> incorporated into 3fe9a838ec. The first (as you mention below and also
> detailed in the commit messge), is related to handling writes to the
> upper 16-bits of a word from the device and ensuring that 32-bit
> accesses are handled correctly. This part seems correct to me based upon
> reading the datasheet and the commit message:
> 
> @@ -246,9 +246,19 @@ static void dp8393x_put(dp8393xState *s, int width,
> int offset,
>                          uint16_t val)
>  {
>      if (s->big_endian) {
> -        s->data[offset * width + width - 1] = cpu_to_be16(val);
> +        if (width == 2) {
> +            s->data[offset * 2] = 0;
> +            s->data[offset * 2 + 1] = cpu_to_be16(val);
> +        } else {
> +            s->data[offset] = cpu_to_be16(val);
> +        }
>      } else {
> -        s->data[offset * width] = cpu_to_le16(val);
> +        if (width == 2) {
> +            s->data[offset * 2] = cpu_to_le16(val);
> +            s->data[offset * 2 + 1] = 0;
> +        } else {
> +            s->data[offset] = cpu_to_le16(val);
> +        }
>      }
>  }
> 
> The second change incorporated into 3fe9a838ec (and the one this patch
> fixes) is a similar change made to the MMIO *register* accesses:
> 
> @@ -590,7 +600,7 @@ static uint64_t dp8393x_read(void *opaque, hwaddr
> addr, unsigned int size)
> 
>      DPRINTF("read 0x%04x from reg %s\n", val, reg_names[reg]);
> 
> -    return val;
> +    return s->big_endian ? val << 16 : val;
>  }
> 
> and:
> 
> @@ -598,13 +608,14 @@ static void dp8393x_write(void *opaque, hwaddr
> addr, uint64_t data,
>  {
>      dp8393xState *s = opaque;
>      int reg = addr >> s->it_shift;
> +    uint32_t val = s->big_endian ? data >> 16 : data;
> 
> This is not correct since the QEMU memory API handles any access size
> and endian conversion before the MMIO access reaches the device. It is
> this change which breaks the 32-bit accesses used by MacOS to read/write
> the dp8393x registers because it applies an additional endian swap on
> top of that already done by the memory API.
> 
>>>> The big_endian workaround applied to the register read/writes was
>>>> actually caused by forcing the access size to 32-bit when the guest OS
>>>> was using a 16-bit access. Since the registers are 16-bit then we can
>>>> simply set .impl.min_access to 2 and then the memory API will
>>>> automatically do the right thing for both 16-bit accesses used by
>>>> Linux and 32-bit accesses used by the MacOS toolbox ROM.
>>>
>>> Hmm I'm not sure. This sounds to me like the "QEMU doesn't model busses
>>> so we end using kludge to hide bugs" pattern. Can you provide a QTest
>>> (ideally) or a "-trace memory_region_ops_\*" log of your firmware
>>> accessing the dp8393x please?
>>>
>> The DP83932 chip is highly configurable, so I'm not sure that the
>> behaviour of any given firmware would resolve the question.
> 
> Indeed, I don't think that will help much here. Phil, if you would still
> like me to send you some traces after reading the explanation above then
> do let me know.

I read it and still would have few traces. We can hand-write them too.

I'd like to add qtests for some read/write,16/32(A)==B.

>> Anyway, as far as the DP83932 hardware is concerned, the behaviour of the
>> upper 16-bits of the data bus depends on the configuration programmed
>> into
>> the DP83932 registers, and whether the chip is accessed as a slave or
>> performing DMA as a master.
> 
> The important part of the commit and its associated message is that it
> only changes the *register* accesses which were introduced as part of
> 3fe9a838ec.
> 
> In the end all the patch does is remove the manual endian swap from the
> MMIO registers since QEMU's memory API does the right thing all by
> itself, and adds the tweak below:
> 
> @@ -694,7 +693,7 @@ static void dp8393x_write(void *opaque, hwaddr addr,
> uint64_t data,
>  static const MemoryRegionOps dp8393x_ops = {
>      .read = dp8393x_read,
>      .write = dp8393x_write,
> -    .impl.min_access_size = 4,
> +    .impl.min_access_size = 2,
>      .impl.max_access_size = 4,
>      .endianness = DEVICE_NATIVE_ENDIAN,
>  };
> 
> As Finn points out the dp8393x registers are 16-bit so we declare the
> implementation size as 2 bytes and the maximum size as 4 bytes. This
> allows Linux to function correctly with 16-bit accesses, whilst 32-bit
> accesses done by MacOS are split into 2 separate 16-bit accesses and
> combined automatically by the memory API.

I hadn't check the datasheet yet but will have a look.

Thanks,

Phil.
Mark Cave-Ayland July 3, 2021, 12:04 p.m. UTC | #5
On 03/07/2021 09:52, Philippe Mathieu-Daudé wrote:

> On 7/3/21 8:21 AM, Mark Cave-Ayland wrote:
>> On 02/07/2021 05:36, Finn Thain wrote:
>>
>>>> On 6/25/21 8:53 AM, Mark Cave-Ayland wrote:
>>>>> Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that
>>>>> all accesses to the registers were 32-bit
>>>
>>> No, that assumption was not made there. Just take a look at my commits in
>>> Linux that make 16-bit accesses. If commit 3fe9a838ec worked by accident,
>>> it probably just reflects my inadequate knowledge of QEMU internals.
>>>
>>>>> but this is actually not the case. The access size is determined by
>>>>> the CPU instruction used and not the number of physical address lines.
>>>>>
>>>
>>> I think that's an over-simplification (in the context of commit
>>> 3fe9a838ec).
>>
>> Let me try and clarify this a bit more: there are 2 different changes
>> incorporated into 3fe9a838ec. The first (as you mention below and also
>> detailed in the commit messge), is related to handling writes to the
>> upper 16-bits of a word from the device and ensuring that 32-bit
>> accesses are handled correctly. This part seems correct to me based upon
>> reading the datasheet and the commit message:
>>
>> @@ -246,9 +246,19 @@ static void dp8393x_put(dp8393xState *s, int width,
>> int offset,
>>                           uint16_t val)
>>   {
>>       if (s->big_endian) {
>> -        s->data[offset * width + width - 1] = cpu_to_be16(val);
>> +        if (width == 2) {
>> +            s->data[offset * 2] = 0;
>> +            s->data[offset * 2 + 1] = cpu_to_be16(val);
>> +        } else {
>> +            s->data[offset] = cpu_to_be16(val);
>> +        }
>>       } else {
>> -        s->data[offset * width] = cpu_to_le16(val);
>> +        if (width == 2) {
>> +            s->data[offset * 2] = cpu_to_le16(val);
>> +            s->data[offset * 2 + 1] = 0;
>> +        } else {
>> +            s->data[offset] = cpu_to_le16(val);
>> +        }
>>       }
>>   }
>>
>> The second change incorporated into 3fe9a838ec (and the one this patch
>> fixes) is a similar change made to the MMIO *register* accesses:
>>
>> @@ -590,7 +600,7 @@ static uint64_t dp8393x_read(void *opaque, hwaddr
>> addr, unsigned int size)
>>
>>       DPRINTF("read 0x%04x from reg %s\n", val, reg_names[reg]);
>>
>> -    return val;
>> +    return s->big_endian ? val << 16 : val;
>>   }
>>
>> and:
>>
>> @@ -598,13 +608,14 @@ static void dp8393x_write(void *opaque, hwaddr
>> addr, uint64_t data,
>>   {
>>       dp8393xState *s = opaque;
>>       int reg = addr >> s->it_shift;
>> +    uint32_t val = s->big_endian ? data >> 16 : data;
>>
>> This is not correct since the QEMU memory API handles any access size
>> and endian conversion before the MMIO access reaches the device. It is
>> this change which breaks the 32-bit accesses used by MacOS to read/write
>> the dp8393x registers because it applies an additional endian swap on
>> top of that already done by the memory API.
>>
>>>>> The big_endian workaround applied to the register read/writes was
>>>>> actually caused by forcing the access size to 32-bit when the guest OS
>>>>> was using a 16-bit access. Since the registers are 16-bit then we can
>>>>> simply set .impl.min_access to 2 and then the memory API will
>>>>> automatically do the right thing for both 16-bit accesses used by
>>>>> Linux and 32-bit accesses used by the MacOS toolbox ROM.
>>>>
>>>> Hmm I'm not sure. This sounds to me like the "QEMU doesn't model busses
>>>> so we end using kludge to hide bugs" pattern. Can you provide a QTest
>>>> (ideally) or a "-trace memory_region_ops_\*" log of your firmware
>>>> accessing the dp8393x please?
>>>>
>>> The DP83932 chip is highly configurable, so I'm not sure that the
>>> behaviour of any given firmware would resolve the question.
>>
>> Indeed, I don't think that will help much here. Phil, if you would still
>> like me to send you some traces after reading the explanation above then
>> do let me know.
> 
> I read it and still would have few traces. We can hand-write them too.
> 
> I'd like to add qtests for some read/write,16/32(A)==B.

Sigh. I was just about to attempt some traces when I realised looking at the patch 
that I made a mistake, and that in order for the memory API to automatically handle 
the 4 byte accesses as 2 x 2 byte accesses then both .impl.min_access_size and 
.impl.max_access_size need to be set to 2 :(  The remainder of the patch is the same 
but dp8393x_ops now looks like this:

static const MemoryRegionOps dp8393x_ops = {
     .read = dp8393x_read,
     .write = dp8393x_write,
     .impl.min_access_size = 2,
     .impl.max_access_size = 2,
     .endianness = DEVICE_NATIVE_ENDIAN,
};

I've tested this under Linux/m68k, NetBSD/arc and MacOS and networking seems fine 
after a quick test in each OS. The slight curiosity is that the 4 byte accesses used 
by MacOS are split into 2 x 2 byte accesses, but since MacOS only uses the bottom 
16-bit of the result and ignores the top 16-bits then despite there being 2 accesses, 
everything still works as expected (see below).


READ:

dp8393x_read reg=0x28 [SR] val=0x0004 size=2
memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a0 value 0x4 size 2
dp8393x_read reg=0x28 [SR] val=0x0004 size=2
memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a2 value 0x4 size 2
memory_region_ops_read cpu 0 mr 0x55f210c9c330 addr 0x50f0a0a0 value 0x40004 size 4

WRITE:

memory_region_ops_write cpu 0 mr 0x55f210c9c330 addr 0x50f0a01c value 0x248fe8 size 4
memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01c value 0x24 size 2
dp8393x_write reg=0x7 [CTDA] val=0x0024 size=2
memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01e value 0x8fe8 size 2
dp8393x_write reg=0x7 [CTDA] val=0x8fe8 size=2


If you're happy with this, I'll resubmit a revised version of the patch but with an 
updated commit message: the Fixes: tag is still the same, but the updated fix is to 
ensure that .impl.min_access_size and .impl.max_access_size match the real-life 
16-bit register size.


ATB,

Mark.
Philippe Mathieu-Daudé July 3, 2021, 1:10 p.m. UTC | #6
On 7/3/21 2:04 PM, Mark Cave-Ayland wrote:
> On 03/07/2021 09:52, Philippe Mathieu-Daudé wrote:
>> On 7/3/21 8:21 AM, Mark Cave-Ayland wrote:
>>> On 02/07/2021 05:36, Finn Thain wrote:
>>>
>>>>> On 6/25/21 8:53 AM, Mark Cave-Ayland wrote:
>>>>>> Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that
>>>>>> all accesses to the registers were 32-bit
>>>>
>>>> No, that assumption was not made there. Just take a look at my
>>>> commits in
>>>> Linux that make 16-bit accesses. If commit 3fe9a838ec worked by
>>>> accident,
>>>> it probably just reflects my inadequate knowledge of QEMU internals.
>>>>
>>>>>> but this is actually not the case. The access size is determined by
>>>>>> the CPU instruction used and not the number of physical address
>>>>>> lines.
>>>>>>
>>>>
>>>> I think that's an over-simplification (in the context of commit
>>>> 3fe9a838ec).
>>>
>>> Let me try and clarify this a bit more: there are 2 different changes
>>> incorporated into 3fe9a838ec. The first (as you mention below and also
>>> detailed in the commit messge), is related to handling writes to the
>>> upper 16-bits of a word from the device and ensuring that 32-bit
>>> accesses are handled correctly. This part seems correct to me based upon
>>> reading the datasheet and the commit message:
>>>
>>> @@ -246,9 +246,19 @@ static void dp8393x_put(dp8393xState *s, int width,
>>> int offset,
>>>                           uint16_t val)
>>>   {
>>>       if (s->big_endian) {
>>> -        s->data[offset * width + width - 1] = cpu_to_be16(val);
>>> +        if (width == 2) {
>>> +            s->data[offset * 2] = 0;
>>> +            s->data[offset * 2 + 1] = cpu_to_be16(val);
>>> +        } else {
>>> +            s->data[offset] = cpu_to_be16(val);
>>> +        }
>>>       } else {
>>> -        s->data[offset * width] = cpu_to_le16(val);
>>> +        if (width == 2) {
>>> +            s->data[offset * 2] = cpu_to_le16(val);
>>> +            s->data[offset * 2 + 1] = 0;
>>> +        } else {
>>> +            s->data[offset] = cpu_to_le16(val);
>>> +        }
>>>       }
>>>   }
>>>
>>> The second change incorporated into 3fe9a838ec (and the one this patch
>>> fixes) is a similar change made to the MMIO *register* accesses:
>>>
>>> @@ -590,7 +600,7 @@ static uint64_t dp8393x_read(void *opaque, hwaddr
>>> addr, unsigned int size)
>>>
>>>       DPRINTF("read 0x%04x from reg %s\n", val, reg_names[reg]);
>>>
>>> -    return val;
>>> +    return s->big_endian ? val << 16 : val;
>>>   }
>>>
>>> and:
>>>
>>> @@ -598,13 +608,14 @@ static void dp8393x_write(void *opaque, hwaddr
>>> addr, uint64_t data,
>>>   {
>>>       dp8393xState *s = opaque;
>>>       int reg = addr >> s->it_shift;
>>> +    uint32_t val = s->big_endian ? data >> 16 : data;
>>>
>>> This is not correct since the QEMU memory API handles any access size
>>> and endian conversion before the MMIO access reaches the device. It is
>>> this change which breaks the 32-bit accesses used by MacOS to read/write
>>> the dp8393x registers because it applies an additional endian swap on
>>> top of that already done by the memory API.
>>>
>>>>>> The big_endian workaround applied to the register read/writes was
>>>>>> actually caused by forcing the access size to 32-bit when the
>>>>>> guest OS
>>>>>> was using a 16-bit access. Since the registers are 16-bit then we can
>>>>>> simply set .impl.min_access to 2 and then the memory API will
>>>>>> automatically do the right thing for both 16-bit accesses used by
>>>>>> Linux and 32-bit accesses used by the MacOS toolbox ROM.
>>>>>
>>>>> Hmm I'm not sure. This sounds to me like the "QEMU doesn't model
>>>>> busses
>>>>> so we end using kludge to hide bugs" pattern. Can you provide a QTest
>>>>> (ideally) or a "-trace memory_region_ops_\*" log of your firmware
>>>>> accessing the dp8393x please?
>>>>>
>>>> The DP83932 chip is highly configurable, so I'm not sure that the
>>>> behaviour of any given firmware would resolve the question.
>>>
>>> Indeed, I don't think that will help much here. Phil, if you would still
>>> like me to send you some traces after reading the explanation above then
>>> do let me know.
>>
>> I read it and still would have few traces. We can hand-write them too.
>>
>> I'd like to add qtests for some read/write,16/32(A)==B.
> 
> Sigh. I was just about to attempt some traces when I realised looking at
> the patch that I made a mistake, and that in order for the memory API to
> automatically handle the 4 byte accesses as 2 x 2 byte accesses then
> both .impl.min_access_size and .impl.max_access_size need to be set to 2
> :(  The remainder of the patch is the same but dp8393x_ops now looks
> like this:
> 
> static const MemoryRegionOps dp8393x_ops = {
>     .read = dp8393x_read,
>     .write = dp8393x_write,
>     .impl.min_access_size = 2,
>     .impl.max_access_size = 2,
>     .endianness = DEVICE_NATIVE_ENDIAN,
> };

Yes :) This is closer to what I wrote during my review:

-- >8 --
diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c
index bff359ed13f..6061268dc49 100644
--- a/hw/net/dp8393x.c
+++ b/hw/net/dp8393x.c
@@ -697,7 +697,9 @@ static const MemoryRegionOps dp8393x_ops = {
     .read = dp8393x_read,
     .write = dp8393x_write,
     .impl.min_access_size = 2,
-    .impl.max_access_size = 4,
+    .impl.max_access_size = 2,
+    .valid.min_access_size = 2,
+    .valid.max_access_size = 4,
     .endianness = DEVICE_NATIVE_ENDIAN,
 };
---

> 
> I've tested this under Linux/m68k, NetBSD/arc and MacOS and networking
> seems fine after a quick test in each OS. The slight curiosity is that
> the 4 byte accesses used by MacOS are split into 2 x 2 byte accesses,
> but since MacOS only uses the bottom 16-bit of the result and ignores
> the top 16-bits then despite there being 2 accesses, everything still
> works as expected (see below).
> 
> 
> READ:
> 
> dp8393x_read reg=0x28 [SR] val=0x0004 size=2
> memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a0 value 0x4
> size 2
> dp8393x_read reg=0x28 [SR] val=0x0004 size=2
> memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a2 value 0x4
> size 2
> memory_region_ops_read cpu 0 mr 0x55f210c9c330 addr 0x50f0a0a0 value
> 0x40004 size 4
> 
> WRITE:
> 
> memory_region_ops_write cpu 0 mr 0x55f210c9c330 addr 0x50f0a01c value
> 0x248fe8 size 4
> memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01c value
> 0x24 size 2
> dp8393x_write reg=0x7 [CTDA] val=0x0024 size=2
> memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01e value
> 0x8fe8 size 2
> dp8393x_write reg=0x7 [CTDA] val=0x8fe8 size=2
> 
> 
> If you're happy with this, I'll resubmit a revised version of the patch
> but with an updated commit message: the Fixes: tag is still the same,
> but the updated fix is to ensure that .impl.min_access_size and
> .impl.max_access_size match the real-life 16-bit register size.

Hold on a bit more, I'm still reviewing the datasheet ;)

My code note so far (untested) are:

-- >8 --
@@ -219,34 +225,48 @@ static uint32_t dp8393x_wt(dp8393xState *s)
     return s->regs[SONIC_WT1] << 16 | s->regs[SONIC_WT0];
 }

-static uint16_t dp8393x_get(dp8393xState *s, int width, int offset)
+static uint16_t dp8393x_get(dp8393xState *s, hwaddr addr, unsigned ofs16)
 {
+    const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED;
     uint16_t val;

-    if (s->big_endian) {
-        val = be16_to_cpu(s->data[offset * width + width - 1]);
+    if (s->regs[SONIC_DCR] & SONIC_DCR_DW) {
+        addr += 2 * ofs16;
+        if (s->big_endian) {
+            val = address_space_ldl_be(&s->as, addr, attrs, NULL);
+        } else {
+            val = address_space_ldl_le(&s->as, addr, attrs, NULL);
+        }
     } else {
-        val = le16_to_cpu(s->data[offset * width]);
+        addr += 1 * ofs16;
+        if (s->big_endian) {
+            val = address_space_lduw_be(&s->as, addr, attrs, NULL);
+        } else {
+            val = address_space_lduw_le(&s->as, addr, attrs, NULL);
+        }
     }
+
     return val;
 }

-static void dp8393x_put(dp8393xState *s, int width, int offset,
-                        uint16_t val)
+static void dp8393x_put(dp8393xState *s,
+                        hwaddr addr, unsigned ofs16, uint16_t val)
 {
-    if (s->big_endian) {
-        if (width == 2) {
-            s->data[offset * 2] = 0;
-            s->data[offset * 2 + 1] = cpu_to_be16(val);
+    const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED;
+
+    if (s->regs[SONIC_DCR] & SONIC_DCR_DW) {
+        addr += 2 * ofs16;
+        if (s->big_endian) {
+            address_space_stl_be(&s->as, addr, val, attrs, NULL);
         } else {
-            s->data[offset] = cpu_to_be16(val);
+            address_space_stl_le(&s->as, addr, val, attrs, NULL);
         }
     } else {
-        if (width == 2) {
-            s->data[offset * 2] = cpu_to_le16(val);
-            s->data[offset * 2 + 1] = 0;
+        addr += 1 * ofs16;
+        if (s->big_endian) {
+            address_space_stw_be(&s->as, addr, val, attrs, NULL);
         } else {
-            s->data[offset] = cpu_to_le16(val);
+            address_space_stw_le(&s->as, addr, val, attrs, NULL);
         }
     }
 }
@@ -313,14 +331,12 @@ static void dp8393x_do_read_rra(dp8393xState *s)
     /* Read memory */
     width = (s->regs[SONIC_DCR] & SONIC_DCR_DW) ? 2 : 1;
     size = sizeof(uint16_t) * 4 * width;
-    address_space_read(&s->as, dp8393x_rrp(s),
-                       MEMTXATTRS_UNSPECIFIED, s->data, size);

     /* Update SONIC registers */
-    s->regs[SONIC_CRBA0] = dp8393x_get(s, width, 0);
-    s->regs[SONIC_CRBA1] = dp8393x_get(s, width, 1);
-    s->regs[SONIC_RBWC0] = dp8393x_get(s, width, 2);
-    s->regs[SONIC_RBWC1] = dp8393x_get(s, width, 3);
+    s->regs[SONIC_CRBA0] = dp8393x_get(s, dp8393x_rrp(s), 0);
+    s->regs[SONIC_CRBA1] = dp8393x_get(s, dp8393x_rrp(s), 1);
+    s->regs[SONIC_RBWC0] = dp8393x_get(s, dp8393x_rrp(s), 2);
+    s->regs[SONIC_RBWC1] = dp8393x_get(s, dp8393x_rrp(s), 3);
     trace_dp8393x_read_rra_regs(s->regs[SONIC_CRBA0], s->regs[SONIC_CRBA1],
                                 s->regs[SONIC_RBWC0],
s->regs[SONIC_RBWC1]);

@@ -416,28 +432,22 @@ static void
dp8393x_do_receiver_disable(dp8393xState *s)
 static void dp8393x_do_transmit_packets(dp8393xState *s)
 {
     NetClientState *nc = qemu_get_queue(s->nic);
-    int width, size;
     int tx_len, len;
     uint16_t i;

-    width = (s->regs[SONIC_DCR] & SONIC_DCR_DW) ? 2 : 1;
-
     while (1) {
         /* Read memory */
-        size = sizeof(uint16_t) * 6 * width;
         s->regs[SONIC_TTDA] = s->regs[SONIC_CTDA];
         trace_dp8393x_transmit_packet(dp8393x_ttda(s));
-        address_space_read(&s->as, dp8393x_ttda(s) + sizeof(uint16_t) *
width,
-                           MEMTXATTRS_UNSPECIFIED, s->data, size);
         tx_len = 0;

         /* Update registers */
-        s->regs[SONIC_TCR] = dp8393x_get(s, width, 0) & 0xf000;
-        s->regs[SONIC_TPS] = dp8393x_get(s, width, 1);
-        s->regs[SONIC_TFC] = dp8393x_get(s, width, 2);
-        s->regs[SONIC_TSA0] = dp8393x_get(s, width, 3);
-        s->regs[SONIC_TSA1] = dp8393x_get(s, width, 4);
-        s->regs[SONIC_TFS] = dp8393x_get(s, width, 5);
+        s->regs[SONIC_TCR] = dp8393x_get(s, dp8393x_ttda(s), 0) & 0xf000;
+        s->regs[SONIC_TPS] = dp8393x_get(s, dp8393x_ttda(s), 1);
+        s->regs[SONIC_TFC] = dp8393x_get(s, dp8393x_ttda(s), 2);
+        s->regs[SONIC_TSA0] = dp8393x_get(s, dp8393x_ttda(s), 3);
+        s->regs[SONIC_TSA1] = dp8393x_get(s, dp8393x_ttda(s), 4);
+        s->regs[SONIC_TFS] = dp8393x_get(s, dp8393x_ttda(s), 5);

         /* Handle programmable interrupt */
         if (s->regs[SONIC_TCR] & SONIC_TCR_PINT) {
@@ -459,15 +469,9 @@ static void
dp8393x_do_transmit_packets(dp8393xState *s)
             i++;
             if (i != s->regs[SONIC_TFC]) {
                 /* Read next fragment details */
-                size = sizeof(uint16_t) * 3 * width;
-                address_space_read(&s->as,
-                                   dp8393x_ttda(s)
-                                   + sizeof(uint16_t) * width * (4 + 3
* i),
-                                   MEMTXATTRS_UNSPECIFIED, s->data,
-                                   size);
-                s->regs[SONIC_TSA0] = dp8393x_get(s, width, 0);
-                s->regs[SONIC_TSA1] = dp8393x_get(s, width, 1);
-                s->regs[SONIC_TFS] = dp8393x_get(s, width, 2);
+                s->regs[SONIC_TSA0] = dp8393x_get(s, dp8393x_ttda(s), 0);
+                s->regs[SONIC_TSA1] = dp8393x_get(s, dp8393x_ttda(s), 1);
+                s->regs[SONIC_TFS] = dp8393x_get(s, dp8393x_ttda(s), 2);
             }
         }

@@ -500,22 +504,12 @@ static void
dp8393x_do_transmit_packets(dp8393xState *s)
         s->regs[SONIC_TCR] |= SONIC_TCR_PTX;

         /* Write status */
-        dp8393x_put(s, width, 0,
-                    s->regs[SONIC_TCR] & 0x0fff); /* status */
-        size = sizeof(uint16_t) * width;
-        address_space_write(&s->as, dp8393x_ttda(s),
-                            MEMTXATTRS_UNSPECIFIED, s->data, size);
+        dp8393x_put(s, dp8393x_ttda(s), 0, s->regs[SONIC_TCR] & 0x0fff);

         if (!(s->regs[SONIC_CR] & SONIC_CR_HTX)) {
             /* Read footer of packet */
-            size = sizeof(uint16_t) * width;
-            address_space_read(&s->as,
-                               dp8393x_ttda(s)
-                               + sizeof(uint16_t) * width
-                                 * (4 + 3 * s->regs[SONIC_TFC]),
-                               MEMTXATTRS_UNSPECIFIED, s->data,
-                               size);
-            s->regs[SONIC_CTDA] = dp8393x_get(s, width, 0);
+            s->regs[SONIC_CTDA] = dp8393x_get(s, dp8393x_ttda(s),
+                                               s->regs[SONIC_TFC]);
             if (s->regs[SONIC_CTDA] & SONIC_DESC_EOL) {
                 /* EOL detected */
                 break;
@@ -764,7 +759,7 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
     dp8393xState *s = qemu_get_nic_opaque(nc);
     int packet_type;
     uint32_t available, address;
-    int width, rx_len, padded_len;
+    int rx_len, padded_len;
     uint32_t checksum;
     int size;

@@ -777,10 +772,8 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,

     rx_len = pkt_size + sizeof(checksum);
     if (s->regs[SONIC_DCR] & SONIC_DCR_DW) {
-        width = 2;
         padded_len = ((rx_len - 1) | 3) + 1;
     } else {
-        width = 1;
         padded_len = ((rx_len - 1) | 1) + 1;
     }

@@ -801,11 +794,7 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
     /* Check for EOL */
     if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) {
         /* Are we still in resource exhaustion? */
-        size = sizeof(uint16_t) * 1 * width;
-        address = dp8393x_crda(s) + sizeof(uint16_t) * 5 * width;
-        address_space_read(&s->as, address, MEMTXATTRS_UNSPECIFIED,
-                           s->data, size);
-        s->regs[SONIC_LLFA] = dp8393x_get(s, width, 0);
+        s->regs[SONIC_LLFA] = dp8393x_get(s, dp8393x_crda(s), 5);
         if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) {
             /* Still EOL ; stop reception */
             return -1;
@@ -813,11 +802,7 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
         /* Link has been updated by host */

         /* Clear in_use */
-        size = sizeof(uint16_t) * width;
-        address = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width;
-        dp8393x_put(s, width, 0, 0);
-        address_space_rw(&s->as, address, MEMTXATTRS_UNSPECIFIED,
-                         (uint8_t *)s->data, size, 1);
+        dp8393x_put(s, dp8393x_crda(s), 6, 0x0000);

         /* Move to next descriptor */
         s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA];
@@ -846,8 +831,8 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,
     /* Pad short packets to keep pointers aligned */
     if (rx_len < padded_len) {
         size = padded_len - rx_len;
-        address_space_rw(&s->as, address, MEMTXATTRS_UNSPECIFIED,
-            (uint8_t *)"\xFF\xFF\xFF", size, 1);
+        address_space_write(&s->as, address, MEMTXATTRS_UNSPECIFIED,
+                            (uint8_t *)"\xFF\xFF\xFF", size);
         address += size;
     }

@@ -871,32 +856,20 @@ static ssize_t dp8393x_receive(NetClientState *nc,
const uint8_t * buf,

     /* Write status to memory */
     trace_dp8393x_receive_write_status(dp8393x_crda(s));
-    dp8393x_put(s, width, 0, s->regs[SONIC_RCR]); /* status */
-    dp8393x_put(s, width, 1, rx_len); /* byte count */
-    dp8393x_put(s, width, 2, s->regs[SONIC_TRBA0]); /* pkt_ptr0 */
-    dp8393x_put(s, width, 3, s->regs[SONIC_TRBA1]); /* pkt_ptr1 */
-    dp8393x_put(s, width, 4, s->regs[SONIC_RSC]); /* seq_no */
-    size = sizeof(uint16_t) * 5 * width;
-    address_space_write(&s->as, dp8393x_crda(s),
-                        MEMTXATTRS_UNSPECIFIED,
-                        s->data, size);
+    dp8393x_put(s, dp8393x_crda(s), 0, s->regs[SONIC_RCR]); /* status */
+    dp8393x_put(s, dp8393x_crda(s), 1, rx_len); /* byte count */
+    dp8393x_put(s, dp8393x_crda(s), 2, s->regs[SONIC_TRBA0]); /*
pkt_ptr0 */
+    dp8393x_put(s, dp8393x_crda(s), 3, s->regs[SONIC_TRBA1]); /*
pkt_ptr1 */
+    dp8393x_put(s, dp8393x_crda(s), 4, s->regs[SONIC_RSC]); /* seq_no */

     /* Check link field */
-    size = sizeof(uint16_t) * width;
-    address_space_read(&s->as,
-                       dp8393x_crda(s) + sizeof(uint16_t) * 5 * width,
-                       MEMTXATTRS_UNSPECIFIED, s->data, size);
-    s->regs[SONIC_LLFA] = dp8393x_get(s, width, 0);
+    s->regs[SONIC_LLFA] = dp8393x_get(s, dp8393x_crda(s), 5);
     if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) {
         /* EOL detected */
         s->regs[SONIC_ISR] |= SONIC_ISR_RDE;
     } else {
         /* Clear in_use */
-        size = sizeof(uint16_t) * width;
-        address = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width;
-        dp8393x_put(s, width, 0, 0);
-        address_space_write(&s->as, address, MEMTXATTRS_UNSPECIFIED,
-                            s->data, size);
+        dp8393x_put(s, dp8393x_crda(s), 6, 0x0000);

         /* Move to next descriptor */
         s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA];
---

Now I'll look at the CAM then the cksum.

Regards,

Phil.
Mark Cave-Ayland July 3, 2021, 2:16 p.m. UTC | #7
On 03/07/2021 14:10, Philippe Mathieu-Daudé wrote:

> On 7/3/21 2:04 PM, Mark Cave-Ayland wrote:
>> On 03/07/2021 09:52, Philippe Mathieu-Daudé wrote:
>>> On 7/3/21 8:21 AM, Mark Cave-Ayland wrote:
>>>> On 02/07/2021 05:36, Finn Thain wrote:
>>>>
>>>>>> On 6/25/21 8:53 AM, Mark Cave-Ayland wrote:
>>>>>>> Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that
>>>>>>> all accesses to the registers were 32-bit
>>>>>
>>>>> No, that assumption was not made there. Just take a look at my
>>>>> commits in
>>>>> Linux that make 16-bit accesses. If commit 3fe9a838ec worked by
>>>>> accident,
>>>>> it probably just reflects my inadequate knowledge of QEMU internals.
>>>>>
>>>>>>> but this is actually not the case. The access size is determined by
>>>>>>> the CPU instruction used and not the number of physical address
>>>>>>> lines.
>>>>>>>
>>>>>
>>>>> I think that's an over-simplification (in the context of commit
>>>>> 3fe9a838ec).
>>>>
>>>> Let me try and clarify this a bit more: there are 2 different changes
>>>> incorporated into 3fe9a838ec. The first (as you mention below and also
>>>> detailed in the commit messge), is related to handling writes to the
>>>> upper 16-bits of a word from the device and ensuring that 32-bit
>>>> accesses are handled correctly. This part seems correct to me based upon
>>>> reading the datasheet and the commit message:
>>>>
>>>> @@ -246,9 +246,19 @@ static void dp8393x_put(dp8393xState *s, int width,
>>>> int offset,
>>>>                            uint16_t val)
>>>>    {
>>>>        if (s->big_endian) {
>>>> -        s->data[offset * width + width - 1] = cpu_to_be16(val);
>>>> +        if (width == 2) {
>>>> +            s->data[offset * 2] = 0;
>>>> +            s->data[offset * 2 + 1] = cpu_to_be16(val);
>>>> +        } else {
>>>> +            s->data[offset] = cpu_to_be16(val);
>>>> +        }
>>>>        } else {
>>>> -        s->data[offset * width] = cpu_to_le16(val);
>>>> +        if (width == 2) {
>>>> +            s->data[offset * 2] = cpu_to_le16(val);
>>>> +            s->data[offset * 2 + 1] = 0;
>>>> +        } else {
>>>> +            s->data[offset] = cpu_to_le16(val);
>>>> +        }
>>>>        }
>>>>    }
>>>>
>>>> The second change incorporated into 3fe9a838ec (and the one this patch
>>>> fixes) is a similar change made to the MMIO *register* accesses:
>>>>
>>>> @@ -590,7 +600,7 @@ static uint64_t dp8393x_read(void *opaque, hwaddr
>>>> addr, unsigned int size)
>>>>
>>>>        DPRINTF("read 0x%04x from reg %s\n", val, reg_names[reg]);
>>>>
>>>> -    return val;
>>>> +    return s->big_endian ? val << 16 : val;
>>>>    }
>>>>
>>>> and:
>>>>
>>>> @@ -598,13 +608,14 @@ static void dp8393x_write(void *opaque, hwaddr
>>>> addr, uint64_t data,
>>>>    {
>>>>        dp8393xState *s = opaque;
>>>>        int reg = addr >> s->it_shift;
>>>> +    uint32_t val = s->big_endian ? data >> 16 : data;
>>>>
>>>> This is not correct since the QEMU memory API handles any access size
>>>> and endian conversion before the MMIO access reaches the device. It is
>>>> this change which breaks the 32-bit accesses used by MacOS to read/write
>>>> the dp8393x registers because it applies an additional endian swap on
>>>> top of that already done by the memory API.
>>>>
>>>>>>> The big_endian workaround applied to the register read/writes was
>>>>>>> actually caused by forcing the access size to 32-bit when the
>>>>>>> guest OS
>>>>>>> was using a 16-bit access. Since the registers are 16-bit then we can
>>>>>>> simply set .impl.min_access to 2 and then the memory API will
>>>>>>> automatically do the right thing for both 16-bit accesses used by
>>>>>>> Linux and 32-bit accesses used by the MacOS toolbox ROM.
>>>>>>
>>>>>> Hmm I'm not sure. This sounds to me like the "QEMU doesn't model
>>>>>> busses
>>>>>> so we end using kludge to hide bugs" pattern. Can you provide a QTest
>>>>>> (ideally) or a "-trace memory_region_ops_\*" log of your firmware
>>>>>> accessing the dp8393x please?
>>>>>>
>>>>> The DP83932 chip is highly configurable, so I'm not sure that the
>>>>> behaviour of any given firmware would resolve the question.
>>>>
>>>> Indeed, I don't think that will help much here. Phil, if you would still
>>>> like me to send you some traces after reading the explanation above then
>>>> do let me know.
>>>
>>> I read it and still would have few traces. We can hand-write them too.
>>>
>>> I'd like to add qtests for some read/write,16/32(A)==B.
>>
>> Sigh. I was just about to attempt some traces when I realised looking at
>> the patch that I made a mistake, and that in order for the memory API to
>> automatically handle the 4 byte accesses as 2 x 2 byte accesses then
>> both .impl.min_access_size and .impl.max_access_size need to be set to 2
>> :(  The remainder of the patch is the same but dp8393x_ops now looks
>> like this:
>>
>> static const MemoryRegionOps dp8393x_ops = {
>>      .read = dp8393x_read,
>>      .write = dp8393x_write,
>>      .impl.min_access_size = 2,
>>      .impl.max_access_size = 2,
>>      .endianness = DEVICE_NATIVE_ENDIAN,
>> };
> 
> Yes :) This is closer to what I wrote during my review:
> 
> -- >8 --
> diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c
> index bff359ed13f..6061268dc49 100644
> --- a/hw/net/dp8393x.c
> +++ b/hw/net/dp8393x.c
> @@ -697,7 +697,9 @@ static const MemoryRegionOps dp8393x_ops = {
>       .read = dp8393x_read,
>       .write = dp8393x_write,
>       .impl.min_access_size = 2,
> -    .impl.max_access_size = 4,
> +    .impl.max_access_size = 2,
> +    .valid.min_access_size = 2,
> +    .valid.max_access_size = 4,
>       .endianness = DEVICE_NATIVE_ENDIAN,
>   };
> ---

Yes, that should work. Sorry I got confused during the initial review.

>> I've tested this under Linux/m68k, NetBSD/arc and MacOS and networking
>> seems fine after a quick test in each OS. The slight curiosity is that
>> the 4 byte accesses used by MacOS are split into 2 x 2 byte accesses,
>> but since MacOS only uses the bottom 16-bit of the result and ignores
>> the top 16-bits then despite there being 2 accesses, everything still
>> works as expected (see below).
>>
>>
>> READ:
>>
>> dp8393x_read reg=0x28 [SR] val=0x0004 size=2
>> memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a0 value 0x4
>> size 2
>> dp8393x_read reg=0x28 [SR] val=0x0004 size=2
>> memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a2 value 0x4
>> size 2
>> memory_region_ops_read cpu 0 mr 0x55f210c9c330 addr 0x50f0a0a0 value
>> 0x40004 size 4
>>
>> WRITE:
>>
>> memory_region_ops_write cpu 0 mr 0x55f210c9c330 addr 0x50f0a01c value
>> 0x248fe8 size 4
>> memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01c value
>> 0x24 size 2
>> dp8393x_write reg=0x7 [CTDA] val=0x0024 size=2
>> memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01e value
>> 0x8fe8 size 2
>> dp8393x_write reg=0x7 [CTDA] val=0x8fe8 size=2
>>
>>
>> If you're happy with this, I'll resubmit a revised version of the patch
>> but with an updated commit message: the Fixes: tag is still the same,
>> but the updated fix is to ensure that .impl.min_access_size and
>> .impl.max_access_size match the real-life 16-bit register size.
> 
> Hold on a bit more, I'm still reviewing the datasheet ;)
> 
> My code note so far (untested) are:
> 
> -- >8 --
> @@ -219,34 +225,48 @@ static uint32_t dp8393x_wt(dp8393xState *s)
>       return s->regs[SONIC_WT1] << 16 | s->regs[SONIC_WT0];
>   }
> 
> -static uint16_t dp8393x_get(dp8393xState *s, int width, int offset)
> +static uint16_t dp8393x_get(dp8393xState *s, hwaddr addr, unsigned ofs16)
>   {
> +    const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED;
>       uint16_t val;
> 
> -    if (s->big_endian) {
> -        val = be16_to_cpu(s->data[offset * width + width - 1]);
> +    if (s->regs[SONIC_DCR] & SONIC_DCR_DW) {
> +        addr += 2 * ofs16;
> +        if (s->big_endian) {
> +            val = address_space_ldl_be(&s->as, addr, attrs, NULL);
> +        } else {
> +            val = address_space_ldl_le(&s->as, addr, attrs, NULL);
> +        }
>       } else {
> -        val = le16_to_cpu(s->data[offset * width]);
> +        addr += 1 * ofs16;
> +        if (s->big_endian) {
> +            val = address_space_lduw_be(&s->as, addr, attrs, NULL);
> +        } else {
> +            val = address_space_lduw_le(&s->as, addr, attrs, NULL);
> +        }
>       }
> +
>       return val;
>   }
> 
> -static void dp8393x_put(dp8393xState *s, int width, int offset,
> -                        uint16_t val)
> +static void dp8393x_put(dp8393xState *s,
> +                        hwaddr addr, unsigned ofs16, uint16_t val)
>   {
> -    if (s->big_endian) {
> -        if (width == 2) {
> -            s->data[offset * 2] = 0;
> -            s->data[offset * 2 + 1] = cpu_to_be16(val);
> +    const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED;
> +
> +    if (s->regs[SONIC_DCR] & SONIC_DCR_DW) {
> +        addr += 2 * ofs16;
> +        if (s->big_endian) {
> +            address_space_stl_be(&s->as, addr, val, attrs, NULL);
>           } else {
> -            s->data[offset] = cpu_to_be16(val);
> +            address_space_stl_le(&s->as, addr, val, attrs, NULL);
>           }
>       } else {
> -        if (width == 2) {
> -            s->data[offset * 2] = cpu_to_le16(val);
> -            s->data[offset * 2 + 1] = 0;
> +        addr += 1 * ofs16;
> +        if (s->big_endian) {
> +            address_space_stw_be(&s->as, addr, val, attrs, NULL);
>           } else {
> -            s->data[offset] = cpu_to_le16(val);
> +            address_space_stw_le(&s->as, addr, val, attrs, NULL);
>           }
>       }
>   }
> @@ -313,14 +331,12 @@ static void dp8393x_do_read_rra(dp8393xState *s)
>       /* Read memory */
>       width = (s->regs[SONIC_DCR] & SONIC_DCR_DW) ? 2 : 1;
>       size = sizeof(uint16_t) * 4 * width;
> -    address_space_read(&s->as, dp8393x_rrp(s),
> -                       MEMTXATTRS_UNSPECIFIED, s->data, size);
> 
>       /* Update SONIC registers */
> -    s->regs[SONIC_CRBA0] = dp8393x_get(s, width, 0);
> -    s->regs[SONIC_CRBA1] = dp8393x_get(s, width, 1);
> -    s->regs[SONIC_RBWC0] = dp8393x_get(s, width, 2);
> -    s->regs[SONIC_RBWC1] = dp8393x_get(s, width, 3);
> +    s->regs[SONIC_CRBA0] = dp8393x_get(s, dp8393x_rrp(s), 0);
> +    s->regs[SONIC_CRBA1] = dp8393x_get(s, dp8393x_rrp(s), 1);
> +    s->regs[SONIC_RBWC0] = dp8393x_get(s, dp8393x_rrp(s), 2);
> +    s->regs[SONIC_RBWC1] = dp8393x_get(s, dp8393x_rrp(s), 3);
>       trace_dp8393x_read_rra_regs(s->regs[SONIC_CRBA0], s->regs[SONIC_CRBA1],
>                                   s->regs[SONIC_RBWC0],
> s->regs[SONIC_RBWC1]);
> 
> @@ -416,28 +432,22 @@ static void
> dp8393x_do_receiver_disable(dp8393xState *s)
>   static void dp8393x_do_transmit_packets(dp8393xState *s)
>   {
>       NetClientState *nc = qemu_get_queue(s->nic);
> -    int width, size;
>       int tx_len, len;
>       uint16_t i;
> 
> -    width = (s->regs[SONIC_DCR] & SONIC_DCR_DW) ? 2 : 1;
> -
>       while (1) {
>           /* Read memory */
> -        size = sizeof(uint16_t) * 6 * width;
>           s->regs[SONIC_TTDA] = s->regs[SONIC_CTDA];
>           trace_dp8393x_transmit_packet(dp8393x_ttda(s));
> -        address_space_read(&s->as, dp8393x_ttda(s) + sizeof(uint16_t) *
> width,
> -                           MEMTXATTRS_UNSPECIFIED, s->data, size);
>           tx_len = 0;
> 
>           /* Update registers */
> -        s->regs[SONIC_TCR] = dp8393x_get(s, width, 0) & 0xf000;
> -        s->regs[SONIC_TPS] = dp8393x_get(s, width, 1);
> -        s->regs[SONIC_TFC] = dp8393x_get(s, width, 2);
> -        s->regs[SONIC_TSA0] = dp8393x_get(s, width, 3);
> -        s->regs[SONIC_TSA1] = dp8393x_get(s, width, 4);
> -        s->regs[SONIC_TFS] = dp8393x_get(s, width, 5);
> +        s->regs[SONIC_TCR] = dp8393x_get(s, dp8393x_ttda(s), 0) & 0xf000;
> +        s->regs[SONIC_TPS] = dp8393x_get(s, dp8393x_ttda(s), 1);
> +        s->regs[SONIC_TFC] = dp8393x_get(s, dp8393x_ttda(s), 2);
> +        s->regs[SONIC_TSA0] = dp8393x_get(s, dp8393x_ttda(s), 3);
> +        s->regs[SONIC_TSA1] = dp8393x_get(s, dp8393x_ttda(s), 4);
> +        s->regs[SONIC_TFS] = dp8393x_get(s, dp8393x_ttda(s), 5);
> 
>           /* Handle programmable interrupt */
>           if (s->regs[SONIC_TCR] & SONIC_TCR_PINT) {
> @@ -459,15 +469,9 @@ static void
> dp8393x_do_transmit_packets(dp8393xState *s)
>               i++;
>               if (i != s->regs[SONIC_TFC]) {
>                   /* Read next fragment details */
> -                size = sizeof(uint16_t) * 3 * width;
> -                address_space_read(&s->as,
> -                                   dp8393x_ttda(s)
> -                                   + sizeof(uint16_t) * width * (4 + 3
> * i),
> -                                   MEMTXATTRS_UNSPECIFIED, s->data,
> -                                   size);
> -                s->regs[SONIC_TSA0] = dp8393x_get(s, width, 0);
> -                s->regs[SONIC_TSA1] = dp8393x_get(s, width, 1);
> -                s->regs[SONIC_TFS] = dp8393x_get(s, width, 2);
> +                s->regs[SONIC_TSA0] = dp8393x_get(s, dp8393x_ttda(s), 0);
> +                s->regs[SONIC_TSA1] = dp8393x_get(s, dp8393x_ttda(s), 1);
> +                s->regs[SONIC_TFS] = dp8393x_get(s, dp8393x_ttda(s), 2);
>               }
>           }
> 
> @@ -500,22 +504,12 @@ static void
> dp8393x_do_transmit_packets(dp8393xState *s)
>           s->regs[SONIC_TCR] |= SONIC_TCR_PTX;
> 
>           /* Write status */
> -        dp8393x_put(s, width, 0,
> -                    s->regs[SONIC_TCR] & 0x0fff); /* status */
> -        size = sizeof(uint16_t) * width;
> -        address_space_write(&s->as, dp8393x_ttda(s),
> -                            MEMTXATTRS_UNSPECIFIED, s->data, size);
> +        dp8393x_put(s, dp8393x_ttda(s), 0, s->regs[SONIC_TCR] & 0x0fff);
> 
>           if (!(s->regs[SONIC_CR] & SONIC_CR_HTX)) {
>               /* Read footer of packet */
> -            size = sizeof(uint16_t) * width;
> -            address_space_read(&s->as,
> -                               dp8393x_ttda(s)
> -                               + sizeof(uint16_t) * width
> -                                 * (4 + 3 * s->regs[SONIC_TFC]),
> -                               MEMTXATTRS_UNSPECIFIED, s->data,
> -                               size);
> -            s->regs[SONIC_CTDA] = dp8393x_get(s, width, 0);
> +            s->regs[SONIC_CTDA] = dp8393x_get(s, dp8393x_ttda(s),
> +                                               s->regs[SONIC_TFC]);
>               if (s->regs[SONIC_CTDA] & SONIC_DESC_EOL) {
>                   /* EOL detected */
>                   break;
> @@ -764,7 +759,7 @@ static ssize_t dp8393x_receive(NetClientState *nc,
> const uint8_t * buf,
>       dp8393xState *s = qemu_get_nic_opaque(nc);
>       int packet_type;
>       uint32_t available, address;
> -    int width, rx_len, padded_len;
> +    int rx_len, padded_len;
>       uint32_t checksum;
>       int size;
> 
> @@ -777,10 +772,8 @@ static ssize_t dp8393x_receive(NetClientState *nc,
> const uint8_t * buf,
> 
>       rx_len = pkt_size + sizeof(checksum);
>       if (s->regs[SONIC_DCR] & SONIC_DCR_DW) {
> -        width = 2;
>           padded_len = ((rx_len - 1) | 3) + 1;
>       } else {
> -        width = 1;
>           padded_len = ((rx_len - 1) | 1) + 1;
>       }
> 
> @@ -801,11 +794,7 @@ static ssize_t dp8393x_receive(NetClientState *nc,
> const uint8_t * buf,
>       /* Check for EOL */
>       if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) {
>           /* Are we still in resource exhaustion? */
> -        size = sizeof(uint16_t) * 1 * width;
> -        address = dp8393x_crda(s) + sizeof(uint16_t) * 5 * width;
> -        address_space_read(&s->as, address, MEMTXATTRS_UNSPECIFIED,
> -                           s->data, size);
> -        s->regs[SONIC_LLFA] = dp8393x_get(s, width, 0);
> +        s->regs[SONIC_LLFA] = dp8393x_get(s, dp8393x_crda(s), 5);
>           if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) {
>               /* Still EOL ; stop reception */
>               return -1;
> @@ -813,11 +802,7 @@ static ssize_t dp8393x_receive(NetClientState *nc,
> const uint8_t * buf,
>           /* Link has been updated by host */
> 
>           /* Clear in_use */
> -        size = sizeof(uint16_t) * width;
> -        address = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width;
> -        dp8393x_put(s, width, 0, 0);
> -        address_space_rw(&s->as, address, MEMTXATTRS_UNSPECIFIED,
> -                         (uint8_t *)s->data, size, 1);
> +        dp8393x_put(s, dp8393x_crda(s), 6, 0x0000);
> 
>           /* Move to next descriptor */
>           s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA];
> @@ -846,8 +831,8 @@ static ssize_t dp8393x_receive(NetClientState *nc,
> const uint8_t * buf,
>       /* Pad short packets to keep pointers aligned */
>       if (rx_len < padded_len) {
>           size = padded_len - rx_len;
> -        address_space_rw(&s->as, address, MEMTXATTRS_UNSPECIFIED,
> -            (uint8_t *)"\xFF\xFF\xFF", size, 1);
> +        address_space_write(&s->as, address, MEMTXATTRS_UNSPECIFIED,
> +                            (uint8_t *)"\xFF\xFF\xFF", size);
>           address += size;
>       }
> 
> @@ -871,32 +856,20 @@ static ssize_t dp8393x_receive(NetClientState *nc,
> const uint8_t * buf,
> 
>       /* Write status to memory */
>       trace_dp8393x_receive_write_status(dp8393x_crda(s));
> -    dp8393x_put(s, width, 0, s->regs[SONIC_RCR]); /* status */
> -    dp8393x_put(s, width, 1, rx_len); /* byte count */
> -    dp8393x_put(s, width, 2, s->regs[SONIC_TRBA0]); /* pkt_ptr0 */
> -    dp8393x_put(s, width, 3, s->regs[SONIC_TRBA1]); /* pkt_ptr1 */
> -    dp8393x_put(s, width, 4, s->regs[SONIC_RSC]); /* seq_no */
> -    size = sizeof(uint16_t) * 5 * width;
> -    address_space_write(&s->as, dp8393x_crda(s),
> -                        MEMTXATTRS_UNSPECIFIED,
> -                        s->data, size);
> +    dp8393x_put(s, dp8393x_crda(s), 0, s->regs[SONIC_RCR]); /* status */
> +    dp8393x_put(s, dp8393x_crda(s), 1, rx_len); /* byte count */
> +    dp8393x_put(s, dp8393x_crda(s), 2, s->regs[SONIC_TRBA0]); /*
> pkt_ptr0 */
> +    dp8393x_put(s, dp8393x_crda(s), 3, s->regs[SONIC_TRBA1]); /*
> pkt_ptr1 */
> +    dp8393x_put(s, dp8393x_crda(s), 4, s->regs[SONIC_RSC]); /* seq_no */
> 
>       /* Check link field */
> -    size = sizeof(uint16_t) * width;
> -    address_space_read(&s->as,
> -                       dp8393x_crda(s) + sizeof(uint16_t) * 5 * width,
> -                       MEMTXATTRS_UNSPECIFIED, s->data, size);
> -    s->regs[SONIC_LLFA] = dp8393x_get(s, width, 0);
> +    s->regs[SONIC_LLFA] = dp8393x_get(s, dp8393x_crda(s), 5);
>       if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) {
>           /* EOL detected */
>           s->regs[SONIC_ISR] |= SONIC_ISR_RDE;
>       } else {
>           /* Clear in_use */
> -        size = sizeof(uint16_t) * width;
> -        address = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width;
> -        dp8393x_put(s, width, 0, 0);
> -        address_space_write(&s->as, address, MEMTXATTRS_UNSPECIFIED,
> -                            s->data, size);
> +        dp8393x_put(s, dp8393x_crda(s), 6, 0x0000);
> 
>           /* Move to next descriptor */
>           s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA];
> ---
> 
> Now I'll look at the CAM then the cksum.

I see, so you're also looking to consolidate all the address space accesses via the 
dp8393x_get() and dp8393x_put() functions instead of using s->data as an 
intermediatery - that makes sense to me. Once you have a patch that works for MIPS 
let me know and I can give a test on Linux/m68k and MacOS :)


ATB,

Mark.
Philippe Mathieu-Daudé July 3, 2021, 2:22 p.m. UTC | #8
On 7/3/21 4:16 PM, Mark Cave-Ayland wrote:
> On 03/07/2021 14:10, Philippe Mathieu-Daudé wrote:
>> On 7/3/21 2:04 PM, Mark Cave-Ayland wrote:

>>> I've tested this under Linux/m68k, NetBSD/arc and MacOS and networking
>>> seems fine after a quick test in each OS. The slight curiosity is that
>>> the 4 byte accesses used by MacOS are split into 2 x 2 byte accesses,
>>> but since MacOS only uses the bottom 16-bit of the result and ignores
>>> the top 16-bits then despite there being 2 accesses, everything still
>>> works as expected (see below).
>>>
>>>
>>> READ:
>>>
>>> dp8393x_read reg=0x28 [SR] val=0x0004 size=2
>>> memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a0 value 0x4
>>> size 2
>>> dp8393x_read reg=0x28 [SR] val=0x0004 size=2
>>> memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a2 value 0x4
>>> size 2
>>> memory_region_ops_read cpu 0 mr 0x55f210c9c330 addr 0x50f0a0a0 value
>>> 0x40004 size 4
>>>
>>> WRITE:
>>>
>>> memory_region_ops_write cpu 0 mr 0x55f210c9c330 addr 0x50f0a01c value
>>> 0x248fe8 size 4
>>> memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01c value
>>> 0x24 size 2
>>> dp8393x_write reg=0x7 [CTDA] val=0x0024 size=2
>>> memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01e value
>>> 0x8fe8 size 2
>>> dp8393x_write reg=0x7 [CTDA] val=0x8fe8 size=2
>>>
>>>
>>> If you're happy with this, I'll resubmit a revised version of the patch
>>> but with an updated commit message: the Fixes: tag is still the same,
>>> but the updated fix is to ensure that .impl.min_access_size and
>>> .impl.max_access_size match the real-life 16-bit register size.
>>
>> Hold on a bit more, I'm still reviewing the datasheet ;)
>>
>> My code note so far (untested) are:
>>
>> -- >8 --
>> @@ -219,34 +225,48 @@ static uint32_t dp8393x_wt(dp8393xState *s)
>>       return s->regs[SONIC_WT1] << 16 | s->regs[SONIC_WT0];
>>   }
>>
>> -static uint16_t dp8393x_get(dp8393xState *s, int width, int offset)
>> +static uint16_t dp8393x_get(dp8393xState *s, hwaddr addr, unsigned
>> ofs16)
>>   {
>> +    const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED;
>>       uint16_t val;
>>
>> -    if (s->big_endian) {
>> -        val = be16_to_cpu(s->data[offset * width + width - 1]);
>> +    if (s->regs[SONIC_DCR] & SONIC_DCR_DW) {
>> +        addr += 2 * ofs16;
>> +        if (s->big_endian) {
>> +            val = address_space_ldl_be(&s->as, addr, attrs, NULL);
>> +        } else {
>> +            val = address_space_ldl_le(&s->as, addr, attrs, NULL);
>> +        }
>>       } else {
>> -        val = le16_to_cpu(s->data[offset * width]);
>> +        addr += 1 * ofs16;
>> +        if (s->big_endian) {
>> +            val = address_space_lduw_be(&s->as, addr, attrs, NULL);
>> +        } else {
>> +            val = address_space_lduw_le(&s->as, addr, attrs, NULL);
>> +        }
>>       }
>> +
>>       return val;
>>   }

>>       /* Check link field */
>> -    size = sizeof(uint16_t) * width;
>> -    address_space_read(&s->as,
>> -                       dp8393x_crda(s) + sizeof(uint16_t) * 5 * width,
>> -                       MEMTXATTRS_UNSPECIFIED, s->data, size);
>> -    s->regs[SONIC_LLFA] = dp8393x_get(s, width, 0);
>> +    s->regs[SONIC_LLFA] = dp8393x_get(s, dp8393x_crda(s), 5);
>>       if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) {
>>           /* EOL detected */
>>           s->regs[SONIC_ISR] |= SONIC_ISR_RDE;
>>       } else {
>>           /* Clear in_use */
>> -        size = sizeof(uint16_t) * width;
>> -        address = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width;
>> -        dp8393x_put(s, width, 0, 0);
>> -        address_space_write(&s->as, address, MEMTXATTRS_UNSPECIFIED,
>> -                            s->data, size);
>> +        dp8393x_put(s, dp8393x_crda(s), 6, 0x0000);
>>
>>           /* Move to next descriptor */
>>           s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA];
>> ---
>>
>> Now I'll look at the CAM then the cksum.
> 
> I see, so you're also looking to consolidate all the address space
> accesses via the dp8393x_get() and dp8393x_put() functions instead of
> using s->data as an intermediatery - that makes sense to me. Once you
> have a patch that works for MIPS let me know and I can give a test on
> Linux/m68k and MacOS :)

Done and sent. Thanks for the testing :)
diff mbox series

Patch

diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c
index 252c0a2664..6789bcd3af 100644
--- a/hw/net/dp8393x.c
+++ b/hw/net/dp8393x.c
@@ -602,15 +602,14 @@  static uint64_t dp8393x_read(void *opaque, hwaddr addr, unsigned int size)
 
     trace_dp8393x_read(reg, reg_names[reg], val, size);
 
-    return s->big_endian ? val << 16 : val;
+    return val;
 }
 
-static void dp8393x_write(void *opaque, hwaddr addr, uint64_t data,
+static void dp8393x_write(void *opaque, hwaddr addr, uint64_t val,
                           unsigned int size)
 {
     dp8393xState *s = opaque;
     int reg = addr >> s->it_shift;
-    uint32_t val = s->big_endian ? data >> 16 : data;
 
     trace_dp8393x_write(reg, reg_names[reg], val, size);
 
@@ -694,7 +693,7 @@  static void dp8393x_write(void *opaque, hwaddr addr, uint64_t data,
 static const MemoryRegionOps dp8393x_ops = {
     .read = dp8393x_read,
     .write = dp8393x_write,
-    .impl.min_access_size = 4,
+    .impl.min_access_size = 2,
     .impl.max_access_size = 4,
     .endianness = DEVICE_NATIVE_ENDIAN,
 };