Message ID | 20210625065401.30170-9-mark.cave-ayland@ilande.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | dp8393x: fixes for MacOS toolbox ROM | expand |
On 6/25/21 8:53 AM, Mark Cave-Ayland wrote: > Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that all accesses > to the registers were 32-bit but this is actually not the case. The access size is > determined by the CPU instruction used and not the number of physical address lines. > > The big_endian workaround applied to the register read/writes was actually caused > by forcing the access size to 32-bit when the guest OS was using a 16-bit access. > Since the registers are 16-bit then we can simply set .impl.min_access to 2 and > then the memory API will automatically do the right thing for both 16-bit accesses > used by Linux and 32-bit accesses used by the MacOS toolbox ROM. Hmm I'm not sure. This sounds to me like the "QEMU doesn't model busses so we end using kludge to hide bugs" pattern. Can you provide a QTest (ideally) or a "-trace memory_region_ops_\*" log of your firmware accessing the dp8393x please? > Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> > Fixes: 3fe9a838ec ("dp8393x: Always use 32-bit accesses") > --- > hw/net/dp8393x.c | 7 +++---- > 1 file changed, 3 insertions(+), 4 deletions(-) > > diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c > index 252c0a2664..6789bcd3af 100644 > --- a/hw/net/dp8393x.c > +++ b/hw/net/dp8393x.c > @@ -602,15 +602,14 @@ static uint64_t dp8393x_read(void *opaque, hwaddr addr, unsigned int size) > > trace_dp8393x_read(reg, reg_names[reg], val, size); > > - return s->big_endian ? val << 16 : val; > + return val; > } > > -static void dp8393x_write(void *opaque, hwaddr addr, uint64_t data, > +static void dp8393x_write(void *opaque, hwaddr addr, uint64_t val, > unsigned int size) > { > dp8393xState *s = opaque; > int reg = addr >> s->it_shift; > - uint32_t val = s->big_endian ? data >> 16 : data; > > trace_dp8393x_write(reg, reg_names[reg], val, size); > > @@ -694,7 +693,7 @@ static void dp8393x_write(void *opaque, hwaddr addr, uint64_t data, > static const MemoryRegionOps dp8393x_ops = { > .read = dp8393x_read, > .write = dp8393x_write, > - .impl.min_access_size = 4, > + .impl.min_access_size = 2, > .impl.max_access_size = 4, > .endianness = DEVICE_NATIVE_ENDIAN, > }; >
On Thu, 1 Jul 2021, Philippe Mathieu-Daudé wrote: > On 6/25/21 8:53 AM, Mark Cave-Ayland wrote: > > Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that > > all accesses to the registers were 32-bit No, that assumption was not made there. Just take a look at my commits in Linux that make 16-bit accesses. If commit 3fe9a838ec worked by accident, it probably just reflects my inadequate knowledge of QEMU internals. > > but this is actually not the case. The access size is determined by > > the CPU instruction used and not the number of physical address lines. > > I think that's an over-simplification (in the context of commit 3fe9a838ec). > > The big_endian workaround applied to the register read/writes was > > actually caused by forcing the access size to 32-bit when the guest OS > > was using a 16-bit access. Since the registers are 16-bit then we can > > simply set .impl.min_access to 2 and then the memory API will > > automatically do the right thing for both 16-bit accesses used by > > Linux and 32-bit accesses used by the MacOS toolbox ROM. > > Hmm I'm not sure. This sounds to me like the "QEMU doesn't model busses > so we end using kludge to hide bugs" pattern. Can you provide a QTest > (ideally) or a "-trace memory_region_ops_\*" log of your firmware > accessing the dp8393x please? > The DP83932 chip is highly configurable, so I'm not sure that the behaviour of any given firmware would resolve the question. Anyway, as far as the DP83932 hardware is concerned, the behaviour of the upper 16-bits of the data bus depends on the configuration programmed into the DP83932 registers, and whether the chip is accessed as a slave or performing DMA as a master.
On 02/07/2021 05:36, Finn Thain wrote: >> On 6/25/21 8:53 AM, Mark Cave-Ayland wrote: >>> Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that >>> all accesses to the registers were 32-bit > > No, that assumption was not made there. Just take a look at my commits in > Linux that make 16-bit accesses. If commit 3fe9a838ec worked by accident, > it probably just reflects my inadequate knowledge of QEMU internals. > >>> but this is actually not the case. The access size is determined by >>> the CPU instruction used and not the number of physical address lines. >>> > > I think that's an over-simplification (in the context of commit > 3fe9a838ec). Let me try and clarify this a bit more: there are 2 different changes incorporated into 3fe9a838ec. The first (as you mention below and also detailed in the commit messge), is related to handling writes to the upper 16-bits of a word from the device and ensuring that 32-bit accesses are handled correctly. This part seems correct to me based upon reading the datasheet and the commit message: @@ -246,9 +246,19 @@ static void dp8393x_put(dp8393xState *s, int width, int offset, uint16_t val) { if (s->big_endian) { - s->data[offset * width + width - 1] = cpu_to_be16(val); + if (width == 2) { + s->data[offset * 2] = 0; + s->data[offset * 2 + 1] = cpu_to_be16(val); + } else { + s->data[offset] = cpu_to_be16(val); + } } else { - s->data[offset * width] = cpu_to_le16(val); + if (width == 2) { + s->data[offset * 2] = cpu_to_le16(val); + s->data[offset * 2 + 1] = 0; + } else { + s->data[offset] = cpu_to_le16(val); + } } } The second change incorporated into 3fe9a838ec (and the one this patch fixes) is a similar change made to the MMIO *register* accesses: @@ -590,7 +600,7 @@ static uint64_t dp8393x_read(void *opaque, hwaddr addr, unsigned int size) DPRINTF("read 0x%04x from reg %s\n", val, reg_names[reg]); - return val; + return s->big_endian ? val << 16 : val; } and: @@ -598,13 +608,14 @@ static void dp8393x_write(void *opaque, hwaddr addr, uint64_t data, { dp8393xState *s = opaque; int reg = addr >> s->it_shift; + uint32_t val = s->big_endian ? data >> 16 : data; This is not correct since the QEMU memory API handles any access size and endian conversion before the MMIO access reaches the device. It is this change which breaks the 32-bit accesses used by MacOS to read/write the dp8393x registers because it applies an additional endian swap on top of that already done by the memory API. >>> The big_endian workaround applied to the register read/writes was >>> actually caused by forcing the access size to 32-bit when the guest OS >>> was using a 16-bit access. Since the registers are 16-bit then we can >>> simply set .impl.min_access to 2 and then the memory API will >>> automatically do the right thing for both 16-bit accesses used by >>> Linux and 32-bit accesses used by the MacOS toolbox ROM. >> >> Hmm I'm not sure. This sounds to me like the "QEMU doesn't model busses >> so we end using kludge to hide bugs" pattern. Can you provide a QTest >> (ideally) or a "-trace memory_region_ops_\*" log of your firmware >> accessing the dp8393x please? >> > The DP83932 chip is highly configurable, so I'm not sure that the > behaviour of any given firmware would resolve the question. Indeed, I don't think that will help much here. Phil, if you would still like me to send you some traces after reading the explanation above then do let me know. > Anyway, as far as the DP83932 hardware is concerned, the behaviour of the > upper 16-bits of the data bus depends on the configuration programmed into > the DP83932 registers, and whether the chip is accessed as a slave or > performing DMA as a master. The important part of the commit and its associated message is that it only changes the *register* accesses which were introduced as part of 3fe9a838ec. In the end all the patch does is remove the manual endian swap from the MMIO registers since QEMU's memory API does the right thing all by itself, and adds the tweak below: @@ -694,7 +693,7 @@ static void dp8393x_write(void *opaque, hwaddr addr, uint64_t data, static const MemoryRegionOps dp8393x_ops = { .read = dp8393x_read, .write = dp8393x_write, - .impl.min_access_size = 4, + .impl.min_access_size = 2, .impl.max_access_size = 4, .endianness = DEVICE_NATIVE_ENDIAN, }; As Finn points out the dp8393x registers are 16-bit so we declare the implementation size as 2 bytes and the maximum size as 4 bytes. This allows Linux to function correctly with 16-bit accesses, whilst 32-bit accesses done by MacOS are split into 2 separate 16-bit accesses and combined automatically by the memory API. ATB, Mark.
On 7/3/21 8:21 AM, Mark Cave-Ayland wrote: > On 02/07/2021 05:36, Finn Thain wrote: > >>> On 6/25/21 8:53 AM, Mark Cave-Ayland wrote: >>>> Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that >>>> all accesses to the registers were 32-bit >> >> No, that assumption was not made there. Just take a look at my commits in >> Linux that make 16-bit accesses. If commit 3fe9a838ec worked by accident, >> it probably just reflects my inadequate knowledge of QEMU internals. >> >>>> but this is actually not the case. The access size is determined by >>>> the CPU instruction used and not the number of physical address lines. >>>> >> >> I think that's an over-simplification (in the context of commit >> 3fe9a838ec). > > Let me try and clarify this a bit more: there are 2 different changes > incorporated into 3fe9a838ec. The first (as you mention below and also > detailed in the commit messge), is related to handling writes to the > upper 16-bits of a word from the device and ensuring that 32-bit > accesses are handled correctly. This part seems correct to me based upon > reading the datasheet and the commit message: > > @@ -246,9 +246,19 @@ static void dp8393x_put(dp8393xState *s, int width, > int offset, > uint16_t val) > { > if (s->big_endian) { > - s->data[offset * width + width - 1] = cpu_to_be16(val); > + if (width == 2) { > + s->data[offset * 2] = 0; > + s->data[offset * 2 + 1] = cpu_to_be16(val); > + } else { > + s->data[offset] = cpu_to_be16(val); > + } > } else { > - s->data[offset * width] = cpu_to_le16(val); > + if (width == 2) { > + s->data[offset * 2] = cpu_to_le16(val); > + s->data[offset * 2 + 1] = 0; > + } else { > + s->data[offset] = cpu_to_le16(val); > + } > } > } > > The second change incorporated into 3fe9a838ec (and the one this patch > fixes) is a similar change made to the MMIO *register* accesses: > > @@ -590,7 +600,7 @@ static uint64_t dp8393x_read(void *opaque, hwaddr > addr, unsigned int size) > > DPRINTF("read 0x%04x from reg %s\n", val, reg_names[reg]); > > - return val; > + return s->big_endian ? val << 16 : val; > } > > and: > > @@ -598,13 +608,14 @@ static void dp8393x_write(void *opaque, hwaddr > addr, uint64_t data, > { > dp8393xState *s = opaque; > int reg = addr >> s->it_shift; > + uint32_t val = s->big_endian ? data >> 16 : data; > > This is not correct since the QEMU memory API handles any access size > and endian conversion before the MMIO access reaches the device. It is > this change which breaks the 32-bit accesses used by MacOS to read/write > the dp8393x registers because it applies an additional endian swap on > top of that already done by the memory API. > >>>> The big_endian workaround applied to the register read/writes was >>>> actually caused by forcing the access size to 32-bit when the guest OS >>>> was using a 16-bit access. Since the registers are 16-bit then we can >>>> simply set .impl.min_access to 2 and then the memory API will >>>> automatically do the right thing for both 16-bit accesses used by >>>> Linux and 32-bit accesses used by the MacOS toolbox ROM. >>> >>> Hmm I'm not sure. This sounds to me like the "QEMU doesn't model busses >>> so we end using kludge to hide bugs" pattern. Can you provide a QTest >>> (ideally) or a "-trace memory_region_ops_\*" log of your firmware >>> accessing the dp8393x please? >>> >> The DP83932 chip is highly configurable, so I'm not sure that the >> behaviour of any given firmware would resolve the question. > > Indeed, I don't think that will help much here. Phil, if you would still > like me to send you some traces after reading the explanation above then > do let me know. I read it and still would have few traces. We can hand-write them too. I'd like to add qtests for some read/write,16/32(A)==B. >> Anyway, as far as the DP83932 hardware is concerned, the behaviour of the >> upper 16-bits of the data bus depends on the configuration programmed >> into >> the DP83932 registers, and whether the chip is accessed as a slave or >> performing DMA as a master. > > The important part of the commit and its associated message is that it > only changes the *register* accesses which were introduced as part of > 3fe9a838ec. > > In the end all the patch does is remove the manual endian swap from the > MMIO registers since QEMU's memory API does the right thing all by > itself, and adds the tweak below: > > @@ -694,7 +693,7 @@ static void dp8393x_write(void *opaque, hwaddr addr, > uint64_t data, > static const MemoryRegionOps dp8393x_ops = { > .read = dp8393x_read, > .write = dp8393x_write, > - .impl.min_access_size = 4, > + .impl.min_access_size = 2, > .impl.max_access_size = 4, > .endianness = DEVICE_NATIVE_ENDIAN, > }; > > As Finn points out the dp8393x registers are 16-bit so we declare the > implementation size as 2 bytes and the maximum size as 4 bytes. This > allows Linux to function correctly with 16-bit accesses, whilst 32-bit > accesses done by MacOS are split into 2 separate 16-bit accesses and > combined automatically by the memory API. I hadn't check the datasheet yet but will have a look. Thanks, Phil.
On 03/07/2021 09:52, Philippe Mathieu-Daudé wrote: > On 7/3/21 8:21 AM, Mark Cave-Ayland wrote: >> On 02/07/2021 05:36, Finn Thain wrote: >> >>>> On 6/25/21 8:53 AM, Mark Cave-Ayland wrote: >>>>> Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that >>>>> all accesses to the registers were 32-bit >>> >>> No, that assumption was not made there. Just take a look at my commits in >>> Linux that make 16-bit accesses. If commit 3fe9a838ec worked by accident, >>> it probably just reflects my inadequate knowledge of QEMU internals. >>> >>>>> but this is actually not the case. The access size is determined by >>>>> the CPU instruction used and not the number of physical address lines. >>>>> >>> >>> I think that's an over-simplification (in the context of commit >>> 3fe9a838ec). >> >> Let me try and clarify this a bit more: there are 2 different changes >> incorporated into 3fe9a838ec. The first (as you mention below and also >> detailed in the commit messge), is related to handling writes to the >> upper 16-bits of a word from the device and ensuring that 32-bit >> accesses are handled correctly. This part seems correct to me based upon >> reading the datasheet and the commit message: >> >> @@ -246,9 +246,19 @@ static void dp8393x_put(dp8393xState *s, int width, >> int offset, >> uint16_t val) >> { >> if (s->big_endian) { >> - s->data[offset * width + width - 1] = cpu_to_be16(val); >> + if (width == 2) { >> + s->data[offset * 2] = 0; >> + s->data[offset * 2 + 1] = cpu_to_be16(val); >> + } else { >> + s->data[offset] = cpu_to_be16(val); >> + } >> } else { >> - s->data[offset * width] = cpu_to_le16(val); >> + if (width == 2) { >> + s->data[offset * 2] = cpu_to_le16(val); >> + s->data[offset * 2 + 1] = 0; >> + } else { >> + s->data[offset] = cpu_to_le16(val); >> + } >> } >> } >> >> The second change incorporated into 3fe9a838ec (and the one this patch >> fixes) is a similar change made to the MMIO *register* accesses: >> >> @@ -590,7 +600,7 @@ static uint64_t dp8393x_read(void *opaque, hwaddr >> addr, unsigned int size) >> >> DPRINTF("read 0x%04x from reg %s\n", val, reg_names[reg]); >> >> - return val; >> + return s->big_endian ? val << 16 : val; >> } >> >> and: >> >> @@ -598,13 +608,14 @@ static void dp8393x_write(void *opaque, hwaddr >> addr, uint64_t data, >> { >> dp8393xState *s = opaque; >> int reg = addr >> s->it_shift; >> + uint32_t val = s->big_endian ? data >> 16 : data; >> >> This is not correct since the QEMU memory API handles any access size >> and endian conversion before the MMIO access reaches the device. It is >> this change which breaks the 32-bit accesses used by MacOS to read/write >> the dp8393x registers because it applies an additional endian swap on >> top of that already done by the memory API. >> >>>>> The big_endian workaround applied to the register read/writes was >>>>> actually caused by forcing the access size to 32-bit when the guest OS >>>>> was using a 16-bit access. Since the registers are 16-bit then we can >>>>> simply set .impl.min_access to 2 and then the memory API will >>>>> automatically do the right thing for both 16-bit accesses used by >>>>> Linux and 32-bit accesses used by the MacOS toolbox ROM. >>>> >>>> Hmm I'm not sure. This sounds to me like the "QEMU doesn't model busses >>>> so we end using kludge to hide bugs" pattern. Can you provide a QTest >>>> (ideally) or a "-trace memory_region_ops_\*" log of your firmware >>>> accessing the dp8393x please? >>>> >>> The DP83932 chip is highly configurable, so I'm not sure that the >>> behaviour of any given firmware would resolve the question. >> >> Indeed, I don't think that will help much here. Phil, if you would still >> like me to send you some traces after reading the explanation above then >> do let me know. > > I read it and still would have few traces. We can hand-write them too. > > I'd like to add qtests for some read/write,16/32(A)==B. Sigh. I was just about to attempt some traces when I realised looking at the patch that I made a mistake, and that in order for the memory API to automatically handle the 4 byte accesses as 2 x 2 byte accesses then both .impl.min_access_size and .impl.max_access_size need to be set to 2 :( The remainder of the patch is the same but dp8393x_ops now looks like this: static const MemoryRegionOps dp8393x_ops = { .read = dp8393x_read, .write = dp8393x_write, .impl.min_access_size = 2, .impl.max_access_size = 2, .endianness = DEVICE_NATIVE_ENDIAN, }; I've tested this under Linux/m68k, NetBSD/arc and MacOS and networking seems fine after a quick test in each OS. The slight curiosity is that the 4 byte accesses used by MacOS are split into 2 x 2 byte accesses, but since MacOS only uses the bottom 16-bit of the result and ignores the top 16-bits then despite there being 2 accesses, everything still works as expected (see below). READ: dp8393x_read reg=0x28 [SR] val=0x0004 size=2 memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a0 value 0x4 size 2 dp8393x_read reg=0x28 [SR] val=0x0004 size=2 memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a2 value 0x4 size 2 memory_region_ops_read cpu 0 mr 0x55f210c9c330 addr 0x50f0a0a0 value 0x40004 size 4 WRITE: memory_region_ops_write cpu 0 mr 0x55f210c9c330 addr 0x50f0a01c value 0x248fe8 size 4 memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01c value 0x24 size 2 dp8393x_write reg=0x7 [CTDA] val=0x0024 size=2 memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01e value 0x8fe8 size 2 dp8393x_write reg=0x7 [CTDA] val=0x8fe8 size=2 If you're happy with this, I'll resubmit a revised version of the patch but with an updated commit message: the Fixes: tag is still the same, but the updated fix is to ensure that .impl.min_access_size and .impl.max_access_size match the real-life 16-bit register size. ATB, Mark.
On 7/3/21 2:04 PM, Mark Cave-Ayland wrote: > On 03/07/2021 09:52, Philippe Mathieu-Daudé wrote: >> On 7/3/21 8:21 AM, Mark Cave-Ayland wrote: >>> On 02/07/2021 05:36, Finn Thain wrote: >>> >>>>> On 6/25/21 8:53 AM, Mark Cave-Ayland wrote: >>>>>> Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that >>>>>> all accesses to the registers were 32-bit >>>> >>>> No, that assumption was not made there. Just take a look at my >>>> commits in >>>> Linux that make 16-bit accesses. If commit 3fe9a838ec worked by >>>> accident, >>>> it probably just reflects my inadequate knowledge of QEMU internals. >>>> >>>>>> but this is actually not the case. The access size is determined by >>>>>> the CPU instruction used and not the number of physical address >>>>>> lines. >>>>>> >>>> >>>> I think that's an over-simplification (in the context of commit >>>> 3fe9a838ec). >>> >>> Let me try and clarify this a bit more: there are 2 different changes >>> incorporated into 3fe9a838ec. The first (as you mention below and also >>> detailed in the commit messge), is related to handling writes to the >>> upper 16-bits of a word from the device and ensuring that 32-bit >>> accesses are handled correctly. This part seems correct to me based upon >>> reading the datasheet and the commit message: >>> >>> @@ -246,9 +246,19 @@ static void dp8393x_put(dp8393xState *s, int width, >>> int offset, >>> uint16_t val) >>> { >>> if (s->big_endian) { >>> - s->data[offset * width + width - 1] = cpu_to_be16(val); >>> + if (width == 2) { >>> + s->data[offset * 2] = 0; >>> + s->data[offset * 2 + 1] = cpu_to_be16(val); >>> + } else { >>> + s->data[offset] = cpu_to_be16(val); >>> + } >>> } else { >>> - s->data[offset * width] = cpu_to_le16(val); >>> + if (width == 2) { >>> + s->data[offset * 2] = cpu_to_le16(val); >>> + s->data[offset * 2 + 1] = 0; >>> + } else { >>> + s->data[offset] = cpu_to_le16(val); >>> + } >>> } >>> } >>> >>> The second change incorporated into 3fe9a838ec (and the one this patch >>> fixes) is a similar change made to the MMIO *register* accesses: >>> >>> @@ -590,7 +600,7 @@ static uint64_t dp8393x_read(void *opaque, hwaddr >>> addr, unsigned int size) >>> >>> DPRINTF("read 0x%04x from reg %s\n", val, reg_names[reg]); >>> >>> - return val; >>> + return s->big_endian ? val << 16 : val; >>> } >>> >>> and: >>> >>> @@ -598,13 +608,14 @@ static void dp8393x_write(void *opaque, hwaddr >>> addr, uint64_t data, >>> { >>> dp8393xState *s = opaque; >>> int reg = addr >> s->it_shift; >>> + uint32_t val = s->big_endian ? data >> 16 : data; >>> >>> This is not correct since the QEMU memory API handles any access size >>> and endian conversion before the MMIO access reaches the device. It is >>> this change which breaks the 32-bit accesses used by MacOS to read/write >>> the dp8393x registers because it applies an additional endian swap on >>> top of that already done by the memory API. >>> >>>>>> The big_endian workaround applied to the register read/writes was >>>>>> actually caused by forcing the access size to 32-bit when the >>>>>> guest OS >>>>>> was using a 16-bit access. Since the registers are 16-bit then we can >>>>>> simply set .impl.min_access to 2 and then the memory API will >>>>>> automatically do the right thing for both 16-bit accesses used by >>>>>> Linux and 32-bit accesses used by the MacOS toolbox ROM. >>>>> >>>>> Hmm I'm not sure. This sounds to me like the "QEMU doesn't model >>>>> busses >>>>> so we end using kludge to hide bugs" pattern. Can you provide a QTest >>>>> (ideally) or a "-trace memory_region_ops_\*" log of your firmware >>>>> accessing the dp8393x please? >>>>> >>>> The DP83932 chip is highly configurable, so I'm not sure that the >>>> behaviour of any given firmware would resolve the question. >>> >>> Indeed, I don't think that will help much here. Phil, if you would still >>> like me to send you some traces after reading the explanation above then >>> do let me know. >> >> I read it and still would have few traces. We can hand-write them too. >> >> I'd like to add qtests for some read/write,16/32(A)==B. > > Sigh. I was just about to attempt some traces when I realised looking at > the patch that I made a mistake, and that in order for the memory API to > automatically handle the 4 byte accesses as 2 x 2 byte accesses then > both .impl.min_access_size and .impl.max_access_size need to be set to 2 > :( The remainder of the patch is the same but dp8393x_ops now looks > like this: > > static const MemoryRegionOps dp8393x_ops = { > .read = dp8393x_read, > .write = dp8393x_write, > .impl.min_access_size = 2, > .impl.max_access_size = 2, > .endianness = DEVICE_NATIVE_ENDIAN, > }; Yes :) This is closer to what I wrote during my review: -- >8 -- diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c index bff359ed13f..6061268dc49 100644 --- a/hw/net/dp8393x.c +++ b/hw/net/dp8393x.c @@ -697,7 +697,9 @@ static const MemoryRegionOps dp8393x_ops = { .read = dp8393x_read, .write = dp8393x_write, .impl.min_access_size = 2, - .impl.max_access_size = 4, + .impl.max_access_size = 2, + .valid.min_access_size = 2, + .valid.max_access_size = 4, .endianness = DEVICE_NATIVE_ENDIAN, }; --- > > I've tested this under Linux/m68k, NetBSD/arc and MacOS and networking > seems fine after a quick test in each OS. The slight curiosity is that > the 4 byte accesses used by MacOS are split into 2 x 2 byte accesses, > but since MacOS only uses the bottom 16-bit of the result and ignores > the top 16-bits then despite there being 2 accesses, everything still > works as expected (see below). > > > READ: > > dp8393x_read reg=0x28 [SR] val=0x0004 size=2 > memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a0 value 0x4 > size 2 > dp8393x_read reg=0x28 [SR] val=0x0004 size=2 > memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a2 value 0x4 > size 2 > memory_region_ops_read cpu 0 mr 0x55f210c9c330 addr 0x50f0a0a0 value > 0x40004 size 4 > > WRITE: > > memory_region_ops_write cpu 0 mr 0x55f210c9c330 addr 0x50f0a01c value > 0x248fe8 size 4 > memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01c value > 0x24 size 2 > dp8393x_write reg=0x7 [CTDA] val=0x0024 size=2 > memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01e value > 0x8fe8 size 2 > dp8393x_write reg=0x7 [CTDA] val=0x8fe8 size=2 > > > If you're happy with this, I'll resubmit a revised version of the patch > but with an updated commit message: the Fixes: tag is still the same, > but the updated fix is to ensure that .impl.min_access_size and > .impl.max_access_size match the real-life 16-bit register size. Hold on a bit more, I'm still reviewing the datasheet ;) My code note so far (untested) are: -- >8 -- @@ -219,34 +225,48 @@ static uint32_t dp8393x_wt(dp8393xState *s) return s->regs[SONIC_WT1] << 16 | s->regs[SONIC_WT0]; } -static uint16_t dp8393x_get(dp8393xState *s, int width, int offset) +static uint16_t dp8393x_get(dp8393xState *s, hwaddr addr, unsigned ofs16) { + const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED; uint16_t val; - if (s->big_endian) { - val = be16_to_cpu(s->data[offset * width + width - 1]); + if (s->regs[SONIC_DCR] & SONIC_DCR_DW) { + addr += 2 * ofs16; + if (s->big_endian) { + val = address_space_ldl_be(&s->as, addr, attrs, NULL); + } else { + val = address_space_ldl_le(&s->as, addr, attrs, NULL); + } } else { - val = le16_to_cpu(s->data[offset * width]); + addr += 1 * ofs16; + if (s->big_endian) { + val = address_space_lduw_be(&s->as, addr, attrs, NULL); + } else { + val = address_space_lduw_le(&s->as, addr, attrs, NULL); + } } + return val; } -static void dp8393x_put(dp8393xState *s, int width, int offset, - uint16_t val) +static void dp8393x_put(dp8393xState *s, + hwaddr addr, unsigned ofs16, uint16_t val) { - if (s->big_endian) { - if (width == 2) { - s->data[offset * 2] = 0; - s->data[offset * 2 + 1] = cpu_to_be16(val); + const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED; + + if (s->regs[SONIC_DCR] & SONIC_DCR_DW) { + addr += 2 * ofs16; + if (s->big_endian) { + address_space_stl_be(&s->as, addr, val, attrs, NULL); } else { - s->data[offset] = cpu_to_be16(val); + address_space_stl_le(&s->as, addr, val, attrs, NULL); } } else { - if (width == 2) { - s->data[offset * 2] = cpu_to_le16(val); - s->data[offset * 2 + 1] = 0; + addr += 1 * ofs16; + if (s->big_endian) { + address_space_stw_be(&s->as, addr, val, attrs, NULL); } else { - s->data[offset] = cpu_to_le16(val); + address_space_stw_le(&s->as, addr, val, attrs, NULL); } } } @@ -313,14 +331,12 @@ static void dp8393x_do_read_rra(dp8393xState *s) /* Read memory */ width = (s->regs[SONIC_DCR] & SONIC_DCR_DW) ? 2 : 1; size = sizeof(uint16_t) * 4 * width; - address_space_read(&s->as, dp8393x_rrp(s), - MEMTXATTRS_UNSPECIFIED, s->data, size); /* Update SONIC registers */ - s->regs[SONIC_CRBA0] = dp8393x_get(s, width, 0); - s->regs[SONIC_CRBA1] = dp8393x_get(s, width, 1); - s->regs[SONIC_RBWC0] = dp8393x_get(s, width, 2); - s->regs[SONIC_RBWC1] = dp8393x_get(s, width, 3); + s->regs[SONIC_CRBA0] = dp8393x_get(s, dp8393x_rrp(s), 0); + s->regs[SONIC_CRBA1] = dp8393x_get(s, dp8393x_rrp(s), 1); + s->regs[SONIC_RBWC0] = dp8393x_get(s, dp8393x_rrp(s), 2); + s->regs[SONIC_RBWC1] = dp8393x_get(s, dp8393x_rrp(s), 3); trace_dp8393x_read_rra_regs(s->regs[SONIC_CRBA0], s->regs[SONIC_CRBA1], s->regs[SONIC_RBWC0], s->regs[SONIC_RBWC1]); @@ -416,28 +432,22 @@ static void dp8393x_do_receiver_disable(dp8393xState *s) static void dp8393x_do_transmit_packets(dp8393xState *s) { NetClientState *nc = qemu_get_queue(s->nic); - int width, size; int tx_len, len; uint16_t i; - width = (s->regs[SONIC_DCR] & SONIC_DCR_DW) ? 2 : 1; - while (1) { /* Read memory */ - size = sizeof(uint16_t) * 6 * width; s->regs[SONIC_TTDA] = s->regs[SONIC_CTDA]; trace_dp8393x_transmit_packet(dp8393x_ttda(s)); - address_space_read(&s->as, dp8393x_ttda(s) + sizeof(uint16_t) * width, - MEMTXATTRS_UNSPECIFIED, s->data, size); tx_len = 0; /* Update registers */ - s->regs[SONIC_TCR] = dp8393x_get(s, width, 0) & 0xf000; - s->regs[SONIC_TPS] = dp8393x_get(s, width, 1); - s->regs[SONIC_TFC] = dp8393x_get(s, width, 2); - s->regs[SONIC_TSA0] = dp8393x_get(s, width, 3); - s->regs[SONIC_TSA1] = dp8393x_get(s, width, 4); - s->regs[SONIC_TFS] = dp8393x_get(s, width, 5); + s->regs[SONIC_TCR] = dp8393x_get(s, dp8393x_ttda(s), 0) & 0xf000; + s->regs[SONIC_TPS] = dp8393x_get(s, dp8393x_ttda(s), 1); + s->regs[SONIC_TFC] = dp8393x_get(s, dp8393x_ttda(s), 2); + s->regs[SONIC_TSA0] = dp8393x_get(s, dp8393x_ttda(s), 3); + s->regs[SONIC_TSA1] = dp8393x_get(s, dp8393x_ttda(s), 4); + s->regs[SONIC_TFS] = dp8393x_get(s, dp8393x_ttda(s), 5); /* Handle programmable interrupt */ if (s->regs[SONIC_TCR] & SONIC_TCR_PINT) { @@ -459,15 +469,9 @@ static void dp8393x_do_transmit_packets(dp8393xState *s) i++; if (i != s->regs[SONIC_TFC]) { /* Read next fragment details */ - size = sizeof(uint16_t) * 3 * width; - address_space_read(&s->as, - dp8393x_ttda(s) - + sizeof(uint16_t) * width * (4 + 3 * i), - MEMTXATTRS_UNSPECIFIED, s->data, - size); - s->regs[SONIC_TSA0] = dp8393x_get(s, width, 0); - s->regs[SONIC_TSA1] = dp8393x_get(s, width, 1); - s->regs[SONIC_TFS] = dp8393x_get(s, width, 2); + s->regs[SONIC_TSA0] = dp8393x_get(s, dp8393x_ttda(s), 0); + s->regs[SONIC_TSA1] = dp8393x_get(s, dp8393x_ttda(s), 1); + s->regs[SONIC_TFS] = dp8393x_get(s, dp8393x_ttda(s), 2); } } @@ -500,22 +504,12 @@ static void dp8393x_do_transmit_packets(dp8393xState *s) s->regs[SONIC_TCR] |= SONIC_TCR_PTX; /* Write status */ - dp8393x_put(s, width, 0, - s->regs[SONIC_TCR] & 0x0fff); /* status */ - size = sizeof(uint16_t) * width; - address_space_write(&s->as, dp8393x_ttda(s), - MEMTXATTRS_UNSPECIFIED, s->data, size); + dp8393x_put(s, dp8393x_ttda(s), 0, s->regs[SONIC_TCR] & 0x0fff); if (!(s->regs[SONIC_CR] & SONIC_CR_HTX)) { /* Read footer of packet */ - size = sizeof(uint16_t) * width; - address_space_read(&s->as, - dp8393x_ttda(s) - + sizeof(uint16_t) * width - * (4 + 3 * s->regs[SONIC_TFC]), - MEMTXATTRS_UNSPECIFIED, s->data, - size); - s->regs[SONIC_CTDA] = dp8393x_get(s, width, 0); + s->regs[SONIC_CTDA] = dp8393x_get(s, dp8393x_ttda(s), + s->regs[SONIC_TFC]); if (s->regs[SONIC_CTDA] & SONIC_DESC_EOL) { /* EOL detected */ break; @@ -764,7 +759,7 @@ static ssize_t dp8393x_receive(NetClientState *nc, const uint8_t * buf, dp8393xState *s = qemu_get_nic_opaque(nc); int packet_type; uint32_t available, address; - int width, rx_len, padded_len; + int rx_len, padded_len; uint32_t checksum; int size; @@ -777,10 +772,8 @@ static ssize_t dp8393x_receive(NetClientState *nc, const uint8_t * buf, rx_len = pkt_size + sizeof(checksum); if (s->regs[SONIC_DCR] & SONIC_DCR_DW) { - width = 2; padded_len = ((rx_len - 1) | 3) + 1; } else { - width = 1; padded_len = ((rx_len - 1) | 1) + 1; } @@ -801,11 +794,7 @@ static ssize_t dp8393x_receive(NetClientState *nc, const uint8_t * buf, /* Check for EOL */ if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) { /* Are we still in resource exhaustion? */ - size = sizeof(uint16_t) * 1 * width; - address = dp8393x_crda(s) + sizeof(uint16_t) * 5 * width; - address_space_read(&s->as, address, MEMTXATTRS_UNSPECIFIED, - s->data, size); - s->regs[SONIC_LLFA] = dp8393x_get(s, width, 0); + s->regs[SONIC_LLFA] = dp8393x_get(s, dp8393x_crda(s), 5); if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) { /* Still EOL ; stop reception */ return -1; @@ -813,11 +802,7 @@ static ssize_t dp8393x_receive(NetClientState *nc, const uint8_t * buf, /* Link has been updated by host */ /* Clear in_use */ - size = sizeof(uint16_t) * width; - address = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width; - dp8393x_put(s, width, 0, 0); - address_space_rw(&s->as, address, MEMTXATTRS_UNSPECIFIED, - (uint8_t *)s->data, size, 1); + dp8393x_put(s, dp8393x_crda(s), 6, 0x0000); /* Move to next descriptor */ s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA]; @@ -846,8 +831,8 @@ static ssize_t dp8393x_receive(NetClientState *nc, const uint8_t * buf, /* Pad short packets to keep pointers aligned */ if (rx_len < padded_len) { size = padded_len - rx_len; - address_space_rw(&s->as, address, MEMTXATTRS_UNSPECIFIED, - (uint8_t *)"\xFF\xFF\xFF", size, 1); + address_space_write(&s->as, address, MEMTXATTRS_UNSPECIFIED, + (uint8_t *)"\xFF\xFF\xFF", size); address += size; } @@ -871,32 +856,20 @@ static ssize_t dp8393x_receive(NetClientState *nc, const uint8_t * buf, /* Write status to memory */ trace_dp8393x_receive_write_status(dp8393x_crda(s)); - dp8393x_put(s, width, 0, s->regs[SONIC_RCR]); /* status */ - dp8393x_put(s, width, 1, rx_len); /* byte count */ - dp8393x_put(s, width, 2, s->regs[SONIC_TRBA0]); /* pkt_ptr0 */ - dp8393x_put(s, width, 3, s->regs[SONIC_TRBA1]); /* pkt_ptr1 */ - dp8393x_put(s, width, 4, s->regs[SONIC_RSC]); /* seq_no */ - size = sizeof(uint16_t) * 5 * width; - address_space_write(&s->as, dp8393x_crda(s), - MEMTXATTRS_UNSPECIFIED, - s->data, size); + dp8393x_put(s, dp8393x_crda(s), 0, s->regs[SONIC_RCR]); /* status */ + dp8393x_put(s, dp8393x_crda(s), 1, rx_len); /* byte count */ + dp8393x_put(s, dp8393x_crda(s), 2, s->regs[SONIC_TRBA0]); /* pkt_ptr0 */ + dp8393x_put(s, dp8393x_crda(s), 3, s->regs[SONIC_TRBA1]); /* pkt_ptr1 */ + dp8393x_put(s, dp8393x_crda(s), 4, s->regs[SONIC_RSC]); /* seq_no */ /* Check link field */ - size = sizeof(uint16_t) * width; - address_space_read(&s->as, - dp8393x_crda(s) + sizeof(uint16_t) * 5 * width, - MEMTXATTRS_UNSPECIFIED, s->data, size); - s->regs[SONIC_LLFA] = dp8393x_get(s, width, 0); + s->regs[SONIC_LLFA] = dp8393x_get(s, dp8393x_crda(s), 5); if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) { /* EOL detected */ s->regs[SONIC_ISR] |= SONIC_ISR_RDE; } else { /* Clear in_use */ - size = sizeof(uint16_t) * width; - address = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width; - dp8393x_put(s, width, 0, 0); - address_space_write(&s->as, address, MEMTXATTRS_UNSPECIFIED, - s->data, size); + dp8393x_put(s, dp8393x_crda(s), 6, 0x0000); /* Move to next descriptor */ s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA]; --- Now I'll look at the CAM then the cksum. Regards, Phil.
On 03/07/2021 14:10, Philippe Mathieu-Daudé wrote: > On 7/3/21 2:04 PM, Mark Cave-Ayland wrote: >> On 03/07/2021 09:52, Philippe Mathieu-Daudé wrote: >>> On 7/3/21 8:21 AM, Mark Cave-Ayland wrote: >>>> On 02/07/2021 05:36, Finn Thain wrote: >>>> >>>>>> On 6/25/21 8:53 AM, Mark Cave-Ayland wrote: >>>>>>> Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that >>>>>>> all accesses to the registers were 32-bit >>>>> >>>>> No, that assumption was not made there. Just take a look at my >>>>> commits in >>>>> Linux that make 16-bit accesses. If commit 3fe9a838ec worked by >>>>> accident, >>>>> it probably just reflects my inadequate knowledge of QEMU internals. >>>>> >>>>>>> but this is actually not the case. The access size is determined by >>>>>>> the CPU instruction used and not the number of physical address >>>>>>> lines. >>>>>>> >>>>> >>>>> I think that's an over-simplification (in the context of commit >>>>> 3fe9a838ec). >>>> >>>> Let me try and clarify this a bit more: there are 2 different changes >>>> incorporated into 3fe9a838ec. The first (as you mention below and also >>>> detailed in the commit messge), is related to handling writes to the >>>> upper 16-bits of a word from the device and ensuring that 32-bit >>>> accesses are handled correctly. This part seems correct to me based upon >>>> reading the datasheet and the commit message: >>>> >>>> @@ -246,9 +246,19 @@ static void dp8393x_put(dp8393xState *s, int width, >>>> int offset, >>>> uint16_t val) >>>> { >>>> if (s->big_endian) { >>>> - s->data[offset * width + width - 1] = cpu_to_be16(val); >>>> + if (width == 2) { >>>> + s->data[offset * 2] = 0; >>>> + s->data[offset * 2 + 1] = cpu_to_be16(val); >>>> + } else { >>>> + s->data[offset] = cpu_to_be16(val); >>>> + } >>>> } else { >>>> - s->data[offset * width] = cpu_to_le16(val); >>>> + if (width == 2) { >>>> + s->data[offset * 2] = cpu_to_le16(val); >>>> + s->data[offset * 2 + 1] = 0; >>>> + } else { >>>> + s->data[offset] = cpu_to_le16(val); >>>> + } >>>> } >>>> } >>>> >>>> The second change incorporated into 3fe9a838ec (and the one this patch >>>> fixes) is a similar change made to the MMIO *register* accesses: >>>> >>>> @@ -590,7 +600,7 @@ static uint64_t dp8393x_read(void *opaque, hwaddr >>>> addr, unsigned int size) >>>> >>>> DPRINTF("read 0x%04x from reg %s\n", val, reg_names[reg]); >>>> >>>> - return val; >>>> + return s->big_endian ? val << 16 : val; >>>> } >>>> >>>> and: >>>> >>>> @@ -598,13 +608,14 @@ static void dp8393x_write(void *opaque, hwaddr >>>> addr, uint64_t data, >>>> { >>>> dp8393xState *s = opaque; >>>> int reg = addr >> s->it_shift; >>>> + uint32_t val = s->big_endian ? data >> 16 : data; >>>> >>>> This is not correct since the QEMU memory API handles any access size >>>> and endian conversion before the MMIO access reaches the device. It is >>>> this change which breaks the 32-bit accesses used by MacOS to read/write >>>> the dp8393x registers because it applies an additional endian swap on >>>> top of that already done by the memory API. >>>> >>>>>>> The big_endian workaround applied to the register read/writes was >>>>>>> actually caused by forcing the access size to 32-bit when the >>>>>>> guest OS >>>>>>> was using a 16-bit access. Since the registers are 16-bit then we can >>>>>>> simply set .impl.min_access to 2 and then the memory API will >>>>>>> automatically do the right thing for both 16-bit accesses used by >>>>>>> Linux and 32-bit accesses used by the MacOS toolbox ROM. >>>>>> >>>>>> Hmm I'm not sure. This sounds to me like the "QEMU doesn't model >>>>>> busses >>>>>> so we end using kludge to hide bugs" pattern. Can you provide a QTest >>>>>> (ideally) or a "-trace memory_region_ops_\*" log of your firmware >>>>>> accessing the dp8393x please? >>>>>> >>>>> The DP83932 chip is highly configurable, so I'm not sure that the >>>>> behaviour of any given firmware would resolve the question. >>>> >>>> Indeed, I don't think that will help much here. Phil, if you would still >>>> like me to send you some traces after reading the explanation above then >>>> do let me know. >>> >>> I read it and still would have few traces. We can hand-write them too. >>> >>> I'd like to add qtests for some read/write,16/32(A)==B. >> >> Sigh. I was just about to attempt some traces when I realised looking at >> the patch that I made a mistake, and that in order for the memory API to >> automatically handle the 4 byte accesses as 2 x 2 byte accesses then >> both .impl.min_access_size and .impl.max_access_size need to be set to 2 >> :( The remainder of the patch is the same but dp8393x_ops now looks >> like this: >> >> static const MemoryRegionOps dp8393x_ops = { >> .read = dp8393x_read, >> .write = dp8393x_write, >> .impl.min_access_size = 2, >> .impl.max_access_size = 2, >> .endianness = DEVICE_NATIVE_ENDIAN, >> }; > > Yes :) This is closer to what I wrote during my review: > > -- >8 -- > diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c > index bff359ed13f..6061268dc49 100644 > --- a/hw/net/dp8393x.c > +++ b/hw/net/dp8393x.c > @@ -697,7 +697,9 @@ static const MemoryRegionOps dp8393x_ops = { > .read = dp8393x_read, > .write = dp8393x_write, > .impl.min_access_size = 2, > - .impl.max_access_size = 4, > + .impl.max_access_size = 2, > + .valid.min_access_size = 2, > + .valid.max_access_size = 4, > .endianness = DEVICE_NATIVE_ENDIAN, > }; > --- Yes, that should work. Sorry I got confused during the initial review. >> I've tested this under Linux/m68k, NetBSD/arc and MacOS and networking >> seems fine after a quick test in each OS. The slight curiosity is that >> the 4 byte accesses used by MacOS are split into 2 x 2 byte accesses, >> but since MacOS only uses the bottom 16-bit of the result and ignores >> the top 16-bits then despite there being 2 accesses, everything still >> works as expected (see below). >> >> >> READ: >> >> dp8393x_read reg=0x28 [SR] val=0x0004 size=2 >> memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a0 value 0x4 >> size 2 >> dp8393x_read reg=0x28 [SR] val=0x0004 size=2 >> memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a2 value 0x4 >> size 2 >> memory_region_ops_read cpu 0 mr 0x55f210c9c330 addr 0x50f0a0a0 value >> 0x40004 size 4 >> >> WRITE: >> >> memory_region_ops_write cpu 0 mr 0x55f210c9c330 addr 0x50f0a01c value >> 0x248fe8 size 4 >> memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01c value >> 0x24 size 2 >> dp8393x_write reg=0x7 [CTDA] val=0x0024 size=2 >> memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01e value >> 0x8fe8 size 2 >> dp8393x_write reg=0x7 [CTDA] val=0x8fe8 size=2 >> >> >> If you're happy with this, I'll resubmit a revised version of the patch >> but with an updated commit message: the Fixes: tag is still the same, >> but the updated fix is to ensure that .impl.min_access_size and >> .impl.max_access_size match the real-life 16-bit register size. > > Hold on a bit more, I'm still reviewing the datasheet ;) > > My code note so far (untested) are: > > -- >8 -- > @@ -219,34 +225,48 @@ static uint32_t dp8393x_wt(dp8393xState *s) > return s->regs[SONIC_WT1] << 16 | s->regs[SONIC_WT0]; > } > > -static uint16_t dp8393x_get(dp8393xState *s, int width, int offset) > +static uint16_t dp8393x_get(dp8393xState *s, hwaddr addr, unsigned ofs16) > { > + const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED; > uint16_t val; > > - if (s->big_endian) { > - val = be16_to_cpu(s->data[offset * width + width - 1]); > + if (s->regs[SONIC_DCR] & SONIC_DCR_DW) { > + addr += 2 * ofs16; > + if (s->big_endian) { > + val = address_space_ldl_be(&s->as, addr, attrs, NULL); > + } else { > + val = address_space_ldl_le(&s->as, addr, attrs, NULL); > + } > } else { > - val = le16_to_cpu(s->data[offset * width]); > + addr += 1 * ofs16; > + if (s->big_endian) { > + val = address_space_lduw_be(&s->as, addr, attrs, NULL); > + } else { > + val = address_space_lduw_le(&s->as, addr, attrs, NULL); > + } > } > + > return val; > } > > -static void dp8393x_put(dp8393xState *s, int width, int offset, > - uint16_t val) > +static void dp8393x_put(dp8393xState *s, > + hwaddr addr, unsigned ofs16, uint16_t val) > { > - if (s->big_endian) { > - if (width == 2) { > - s->data[offset * 2] = 0; > - s->data[offset * 2 + 1] = cpu_to_be16(val); > + const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED; > + > + if (s->regs[SONIC_DCR] & SONIC_DCR_DW) { > + addr += 2 * ofs16; > + if (s->big_endian) { > + address_space_stl_be(&s->as, addr, val, attrs, NULL); > } else { > - s->data[offset] = cpu_to_be16(val); > + address_space_stl_le(&s->as, addr, val, attrs, NULL); > } > } else { > - if (width == 2) { > - s->data[offset * 2] = cpu_to_le16(val); > - s->data[offset * 2 + 1] = 0; > + addr += 1 * ofs16; > + if (s->big_endian) { > + address_space_stw_be(&s->as, addr, val, attrs, NULL); > } else { > - s->data[offset] = cpu_to_le16(val); > + address_space_stw_le(&s->as, addr, val, attrs, NULL); > } > } > } > @@ -313,14 +331,12 @@ static void dp8393x_do_read_rra(dp8393xState *s) > /* Read memory */ > width = (s->regs[SONIC_DCR] & SONIC_DCR_DW) ? 2 : 1; > size = sizeof(uint16_t) * 4 * width; > - address_space_read(&s->as, dp8393x_rrp(s), > - MEMTXATTRS_UNSPECIFIED, s->data, size); > > /* Update SONIC registers */ > - s->regs[SONIC_CRBA0] = dp8393x_get(s, width, 0); > - s->regs[SONIC_CRBA1] = dp8393x_get(s, width, 1); > - s->regs[SONIC_RBWC0] = dp8393x_get(s, width, 2); > - s->regs[SONIC_RBWC1] = dp8393x_get(s, width, 3); > + s->regs[SONIC_CRBA0] = dp8393x_get(s, dp8393x_rrp(s), 0); > + s->regs[SONIC_CRBA1] = dp8393x_get(s, dp8393x_rrp(s), 1); > + s->regs[SONIC_RBWC0] = dp8393x_get(s, dp8393x_rrp(s), 2); > + s->regs[SONIC_RBWC1] = dp8393x_get(s, dp8393x_rrp(s), 3); > trace_dp8393x_read_rra_regs(s->regs[SONIC_CRBA0], s->regs[SONIC_CRBA1], > s->regs[SONIC_RBWC0], > s->regs[SONIC_RBWC1]); > > @@ -416,28 +432,22 @@ static void > dp8393x_do_receiver_disable(dp8393xState *s) > static void dp8393x_do_transmit_packets(dp8393xState *s) > { > NetClientState *nc = qemu_get_queue(s->nic); > - int width, size; > int tx_len, len; > uint16_t i; > > - width = (s->regs[SONIC_DCR] & SONIC_DCR_DW) ? 2 : 1; > - > while (1) { > /* Read memory */ > - size = sizeof(uint16_t) * 6 * width; > s->regs[SONIC_TTDA] = s->regs[SONIC_CTDA]; > trace_dp8393x_transmit_packet(dp8393x_ttda(s)); > - address_space_read(&s->as, dp8393x_ttda(s) + sizeof(uint16_t) * > width, > - MEMTXATTRS_UNSPECIFIED, s->data, size); > tx_len = 0; > > /* Update registers */ > - s->regs[SONIC_TCR] = dp8393x_get(s, width, 0) & 0xf000; > - s->regs[SONIC_TPS] = dp8393x_get(s, width, 1); > - s->regs[SONIC_TFC] = dp8393x_get(s, width, 2); > - s->regs[SONIC_TSA0] = dp8393x_get(s, width, 3); > - s->regs[SONIC_TSA1] = dp8393x_get(s, width, 4); > - s->regs[SONIC_TFS] = dp8393x_get(s, width, 5); > + s->regs[SONIC_TCR] = dp8393x_get(s, dp8393x_ttda(s), 0) & 0xf000; > + s->regs[SONIC_TPS] = dp8393x_get(s, dp8393x_ttda(s), 1); > + s->regs[SONIC_TFC] = dp8393x_get(s, dp8393x_ttda(s), 2); > + s->regs[SONIC_TSA0] = dp8393x_get(s, dp8393x_ttda(s), 3); > + s->regs[SONIC_TSA1] = dp8393x_get(s, dp8393x_ttda(s), 4); > + s->regs[SONIC_TFS] = dp8393x_get(s, dp8393x_ttda(s), 5); > > /* Handle programmable interrupt */ > if (s->regs[SONIC_TCR] & SONIC_TCR_PINT) { > @@ -459,15 +469,9 @@ static void > dp8393x_do_transmit_packets(dp8393xState *s) > i++; > if (i != s->regs[SONIC_TFC]) { > /* Read next fragment details */ > - size = sizeof(uint16_t) * 3 * width; > - address_space_read(&s->as, > - dp8393x_ttda(s) > - + sizeof(uint16_t) * width * (4 + 3 > * i), > - MEMTXATTRS_UNSPECIFIED, s->data, > - size); > - s->regs[SONIC_TSA0] = dp8393x_get(s, width, 0); > - s->regs[SONIC_TSA1] = dp8393x_get(s, width, 1); > - s->regs[SONIC_TFS] = dp8393x_get(s, width, 2); > + s->regs[SONIC_TSA0] = dp8393x_get(s, dp8393x_ttda(s), 0); > + s->regs[SONIC_TSA1] = dp8393x_get(s, dp8393x_ttda(s), 1); > + s->regs[SONIC_TFS] = dp8393x_get(s, dp8393x_ttda(s), 2); > } > } > > @@ -500,22 +504,12 @@ static void > dp8393x_do_transmit_packets(dp8393xState *s) > s->regs[SONIC_TCR] |= SONIC_TCR_PTX; > > /* Write status */ > - dp8393x_put(s, width, 0, > - s->regs[SONIC_TCR] & 0x0fff); /* status */ > - size = sizeof(uint16_t) * width; > - address_space_write(&s->as, dp8393x_ttda(s), > - MEMTXATTRS_UNSPECIFIED, s->data, size); > + dp8393x_put(s, dp8393x_ttda(s), 0, s->regs[SONIC_TCR] & 0x0fff); > > if (!(s->regs[SONIC_CR] & SONIC_CR_HTX)) { > /* Read footer of packet */ > - size = sizeof(uint16_t) * width; > - address_space_read(&s->as, > - dp8393x_ttda(s) > - + sizeof(uint16_t) * width > - * (4 + 3 * s->regs[SONIC_TFC]), > - MEMTXATTRS_UNSPECIFIED, s->data, > - size); > - s->regs[SONIC_CTDA] = dp8393x_get(s, width, 0); > + s->regs[SONIC_CTDA] = dp8393x_get(s, dp8393x_ttda(s), > + s->regs[SONIC_TFC]); > if (s->regs[SONIC_CTDA] & SONIC_DESC_EOL) { > /* EOL detected */ > break; > @@ -764,7 +759,7 @@ static ssize_t dp8393x_receive(NetClientState *nc, > const uint8_t * buf, > dp8393xState *s = qemu_get_nic_opaque(nc); > int packet_type; > uint32_t available, address; > - int width, rx_len, padded_len; > + int rx_len, padded_len; > uint32_t checksum; > int size; > > @@ -777,10 +772,8 @@ static ssize_t dp8393x_receive(NetClientState *nc, > const uint8_t * buf, > > rx_len = pkt_size + sizeof(checksum); > if (s->regs[SONIC_DCR] & SONIC_DCR_DW) { > - width = 2; > padded_len = ((rx_len - 1) | 3) + 1; > } else { > - width = 1; > padded_len = ((rx_len - 1) | 1) + 1; > } > > @@ -801,11 +794,7 @@ static ssize_t dp8393x_receive(NetClientState *nc, > const uint8_t * buf, > /* Check for EOL */ > if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) { > /* Are we still in resource exhaustion? */ > - size = sizeof(uint16_t) * 1 * width; > - address = dp8393x_crda(s) + sizeof(uint16_t) * 5 * width; > - address_space_read(&s->as, address, MEMTXATTRS_UNSPECIFIED, > - s->data, size); > - s->regs[SONIC_LLFA] = dp8393x_get(s, width, 0); > + s->regs[SONIC_LLFA] = dp8393x_get(s, dp8393x_crda(s), 5); > if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) { > /* Still EOL ; stop reception */ > return -1; > @@ -813,11 +802,7 @@ static ssize_t dp8393x_receive(NetClientState *nc, > const uint8_t * buf, > /* Link has been updated by host */ > > /* Clear in_use */ > - size = sizeof(uint16_t) * width; > - address = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width; > - dp8393x_put(s, width, 0, 0); > - address_space_rw(&s->as, address, MEMTXATTRS_UNSPECIFIED, > - (uint8_t *)s->data, size, 1); > + dp8393x_put(s, dp8393x_crda(s), 6, 0x0000); > > /* Move to next descriptor */ > s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA]; > @@ -846,8 +831,8 @@ static ssize_t dp8393x_receive(NetClientState *nc, > const uint8_t * buf, > /* Pad short packets to keep pointers aligned */ > if (rx_len < padded_len) { > size = padded_len - rx_len; > - address_space_rw(&s->as, address, MEMTXATTRS_UNSPECIFIED, > - (uint8_t *)"\xFF\xFF\xFF", size, 1); > + address_space_write(&s->as, address, MEMTXATTRS_UNSPECIFIED, > + (uint8_t *)"\xFF\xFF\xFF", size); > address += size; > } > > @@ -871,32 +856,20 @@ static ssize_t dp8393x_receive(NetClientState *nc, > const uint8_t * buf, > > /* Write status to memory */ > trace_dp8393x_receive_write_status(dp8393x_crda(s)); > - dp8393x_put(s, width, 0, s->regs[SONIC_RCR]); /* status */ > - dp8393x_put(s, width, 1, rx_len); /* byte count */ > - dp8393x_put(s, width, 2, s->regs[SONIC_TRBA0]); /* pkt_ptr0 */ > - dp8393x_put(s, width, 3, s->regs[SONIC_TRBA1]); /* pkt_ptr1 */ > - dp8393x_put(s, width, 4, s->regs[SONIC_RSC]); /* seq_no */ > - size = sizeof(uint16_t) * 5 * width; > - address_space_write(&s->as, dp8393x_crda(s), > - MEMTXATTRS_UNSPECIFIED, > - s->data, size); > + dp8393x_put(s, dp8393x_crda(s), 0, s->regs[SONIC_RCR]); /* status */ > + dp8393x_put(s, dp8393x_crda(s), 1, rx_len); /* byte count */ > + dp8393x_put(s, dp8393x_crda(s), 2, s->regs[SONIC_TRBA0]); /* > pkt_ptr0 */ > + dp8393x_put(s, dp8393x_crda(s), 3, s->regs[SONIC_TRBA1]); /* > pkt_ptr1 */ > + dp8393x_put(s, dp8393x_crda(s), 4, s->regs[SONIC_RSC]); /* seq_no */ > > /* Check link field */ > - size = sizeof(uint16_t) * width; > - address_space_read(&s->as, > - dp8393x_crda(s) + sizeof(uint16_t) * 5 * width, > - MEMTXATTRS_UNSPECIFIED, s->data, size); > - s->regs[SONIC_LLFA] = dp8393x_get(s, width, 0); > + s->regs[SONIC_LLFA] = dp8393x_get(s, dp8393x_crda(s), 5); > if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) { > /* EOL detected */ > s->regs[SONIC_ISR] |= SONIC_ISR_RDE; > } else { > /* Clear in_use */ > - size = sizeof(uint16_t) * width; > - address = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width; > - dp8393x_put(s, width, 0, 0); > - address_space_write(&s->as, address, MEMTXATTRS_UNSPECIFIED, > - s->data, size); > + dp8393x_put(s, dp8393x_crda(s), 6, 0x0000); > > /* Move to next descriptor */ > s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA]; > --- > > Now I'll look at the CAM then the cksum. I see, so you're also looking to consolidate all the address space accesses via the dp8393x_get() and dp8393x_put() functions instead of using s->data as an intermediatery - that makes sense to me. Once you have a patch that works for MIPS let me know and I can give a test on Linux/m68k and MacOS :) ATB, Mark.
On 7/3/21 4:16 PM, Mark Cave-Ayland wrote: > On 03/07/2021 14:10, Philippe Mathieu-Daudé wrote: >> On 7/3/21 2:04 PM, Mark Cave-Ayland wrote: >>> I've tested this under Linux/m68k, NetBSD/arc and MacOS and networking >>> seems fine after a quick test in each OS. The slight curiosity is that >>> the 4 byte accesses used by MacOS are split into 2 x 2 byte accesses, >>> but since MacOS only uses the bottom 16-bit of the result and ignores >>> the top 16-bits then despite there being 2 accesses, everything still >>> works as expected (see below). >>> >>> >>> READ: >>> >>> dp8393x_read reg=0x28 [SR] val=0x0004 size=2 >>> memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a0 value 0x4 >>> size 2 >>> dp8393x_read reg=0x28 [SR] val=0x0004 size=2 >>> memory_region_ops_read cpu 0 mr 0x55f210f44830 addr 0x5000a0a2 value 0x4 >>> size 2 >>> memory_region_ops_read cpu 0 mr 0x55f210c9c330 addr 0x50f0a0a0 value >>> 0x40004 size 4 >>> >>> WRITE: >>> >>> memory_region_ops_write cpu 0 mr 0x55f210c9c330 addr 0x50f0a01c value >>> 0x248fe8 size 4 >>> memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01c value >>> 0x24 size 2 >>> dp8393x_write reg=0x7 [CTDA] val=0x0024 size=2 >>> memory_region_ops_write cpu 0 mr 0x55f210f44830 addr 0x5000a01e value >>> 0x8fe8 size 2 >>> dp8393x_write reg=0x7 [CTDA] val=0x8fe8 size=2 >>> >>> >>> If you're happy with this, I'll resubmit a revised version of the patch >>> but with an updated commit message: the Fixes: tag is still the same, >>> but the updated fix is to ensure that .impl.min_access_size and >>> .impl.max_access_size match the real-life 16-bit register size. >> >> Hold on a bit more, I'm still reviewing the datasheet ;) >> >> My code note so far (untested) are: >> >> -- >8 -- >> @@ -219,34 +225,48 @@ static uint32_t dp8393x_wt(dp8393xState *s) >> return s->regs[SONIC_WT1] << 16 | s->regs[SONIC_WT0]; >> } >> >> -static uint16_t dp8393x_get(dp8393xState *s, int width, int offset) >> +static uint16_t dp8393x_get(dp8393xState *s, hwaddr addr, unsigned >> ofs16) >> { >> + const MemTxAttrs attrs = MEMTXATTRS_UNSPECIFIED; >> uint16_t val; >> >> - if (s->big_endian) { >> - val = be16_to_cpu(s->data[offset * width + width - 1]); >> + if (s->regs[SONIC_DCR] & SONIC_DCR_DW) { >> + addr += 2 * ofs16; >> + if (s->big_endian) { >> + val = address_space_ldl_be(&s->as, addr, attrs, NULL); >> + } else { >> + val = address_space_ldl_le(&s->as, addr, attrs, NULL); >> + } >> } else { >> - val = le16_to_cpu(s->data[offset * width]); >> + addr += 1 * ofs16; >> + if (s->big_endian) { >> + val = address_space_lduw_be(&s->as, addr, attrs, NULL); >> + } else { >> + val = address_space_lduw_le(&s->as, addr, attrs, NULL); >> + } >> } >> + >> return val; >> } >> /* Check link field */ >> - size = sizeof(uint16_t) * width; >> - address_space_read(&s->as, >> - dp8393x_crda(s) + sizeof(uint16_t) * 5 * width, >> - MEMTXATTRS_UNSPECIFIED, s->data, size); >> - s->regs[SONIC_LLFA] = dp8393x_get(s, width, 0); >> + s->regs[SONIC_LLFA] = dp8393x_get(s, dp8393x_crda(s), 5); >> if (s->regs[SONIC_LLFA] & SONIC_DESC_EOL) { >> /* EOL detected */ >> s->regs[SONIC_ISR] |= SONIC_ISR_RDE; >> } else { >> /* Clear in_use */ >> - size = sizeof(uint16_t) * width; >> - address = dp8393x_crda(s) + sizeof(uint16_t) * 6 * width; >> - dp8393x_put(s, width, 0, 0); >> - address_space_write(&s->as, address, MEMTXATTRS_UNSPECIFIED, >> - s->data, size); >> + dp8393x_put(s, dp8393x_crda(s), 6, 0x0000); >> >> /* Move to next descriptor */ >> s->regs[SONIC_CRDA] = s->regs[SONIC_LLFA]; >> --- >> >> Now I'll look at the CAM then the cksum. > > I see, so you're also looking to consolidate all the address space > accesses via the dp8393x_get() and dp8393x_put() functions instead of > using s->data as an intermediatery - that makes sense to me. Once you > have a patch that works for MIPS let me know and I can give a test on > Linux/m68k and MacOS :) Done and sent. Thanks for the testing :)
diff --git a/hw/net/dp8393x.c b/hw/net/dp8393x.c index 252c0a2664..6789bcd3af 100644 --- a/hw/net/dp8393x.c +++ b/hw/net/dp8393x.c @@ -602,15 +602,14 @@ static uint64_t dp8393x_read(void *opaque, hwaddr addr, unsigned int size) trace_dp8393x_read(reg, reg_names[reg], val, size); - return s->big_endian ? val << 16 : val; + return val; } -static void dp8393x_write(void *opaque, hwaddr addr, uint64_t data, +static void dp8393x_write(void *opaque, hwaddr addr, uint64_t val, unsigned int size) { dp8393xState *s = opaque; int reg = addr >> s->it_shift; - uint32_t val = s->big_endian ? data >> 16 : data; trace_dp8393x_write(reg, reg_names[reg], val, size); @@ -694,7 +693,7 @@ static void dp8393x_write(void *opaque, hwaddr addr, uint64_t data, static const MemoryRegionOps dp8393x_ops = { .read = dp8393x_read, .write = dp8393x_write, - .impl.min_access_size = 4, + .impl.min_access_size = 2, .impl.max_access_size = 4, .endianness = DEVICE_NATIVE_ENDIAN, };
Commit 3fe9a838ec "dp8393x: Always use 32-bit accesses" assumed that all accesses to the registers were 32-bit but this is actually not the case. The access size is determined by the CPU instruction used and not the number of physical address lines. The big_endian workaround applied to the register read/writes was actually caused by forcing the access size to 32-bit when the guest OS was using a 16-bit access. Since the registers are 16-bit then we can simply set .impl.min_access to 2 and then the memory API will automatically do the right thing for both 16-bit accesses used by Linux and 32-bit accesses used by the MacOS toolbox ROM. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Fixes: 3fe9a838ec ("dp8393x: Always use 32-bit accesses") --- hw/net/dp8393x.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)