diff mbox series

[v2,1/1] target/riscv: fix endless translation loop on big endian systems

Message ID 20250414034626.3491489-2-ziqiaokong@gmail.com (mailing list archive)
State New
Headers show
Series Fix endless translation loop of riscv | expand

Commit Message

Ziqiao Kong April 14, 2025, 3:46 a.m. UTC
On big endian systems, pte and updated_pte hold big endian host data
while pte_pa points to little endian target data. This means the branch
at cpu_helper.c:1669 will be always satisfied and restart translation,
causing an endless translation loop.

Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
---
 target/riscv/cpu_helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Philippe Mathieu-Daudé April 14, 2025, 10:41 a.m. UTC | #1
Hi,

On 14/4/25 05:46, Ziqiao Kong wrote:
> On big endian systems, pte and updated_pte hold big endian host data
> while pte_pa points to little endian target data. This means the branch
> at cpu_helper.c:1669 will be always satisfied and restart translation,
> causing an endless translation loop.
> 

Cc: qemu-stable@nongnu.org
Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")

> Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
> ---
>   target/riscv/cpu_helper.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 6c4391d96b..bc146771c8 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
>               target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
>               target_ulong old_pte;
>               if (riscv_cpu_sxl(env) == MXL_RV32) {
> -                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
> +                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
>               } else {
> -                old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
> +                old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
>               }
>               if (old_pte != pte) {
>                   goto restart;

If PTEs are always stored in LE order, maybe what we want is earlier:

-- >8 --
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 619c76cc001..b6ac2800240 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState 
*env, hwaddr *physical,
          if (riscv_cpu_mxl(env) == MXL_RV32) {
-            pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
+            pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
          } else {
-            pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
+            pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
          }
---
Ziqiao Kong April 14, 2025, 11:17 a.m. UTC | #2
On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
<philmd@linaro.org> wrote:
>
> Hi,
>
> On 14/4/25 05:46, Ziqiao Kong wrote:
> > On big endian systems, pte and updated_pte hold big endian host data
> > while pte_pa points to little endian target data. This means the branch
> > at cpu_helper.c:1669 will be always satisfied and restart translation,
> > causing an endless translation loop.
> >
>
> Cc: qemu-stable@nongnu.org
> Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
>
> > Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
> > ---
> >   target/riscv/cpu_helper.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > index 6c4391d96b..bc146771c8 100644
> > --- a/target/riscv/cpu_helper.c
> > +++ b/target/riscv/cpu_helper.c
> > @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
> >               target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
> >               target_ulong old_pte;
> >               if (riscv_cpu_sxl(env) == MXL_RV32) {
> > -                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
> > +                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
> >               } else {
> > -                old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
> > +                old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
> >               }
> >               if (old_pte != pte) {
> >                   goto restart;
>
> If PTEs are always stored in LE order, maybe what we want is earlier:
>
> -- >8 --
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 619c76cc001..b6ac2800240 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
> *env, hwaddr *physical,
>           if (riscv_cpu_mxl(env) == MXL_RV32) {
> -            pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
> +            pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
>           } else {
> -            pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
> +            pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);

Unfortunately, this doesn't work in two ways:

1. Note pte is used in the following code and that means pte must hold
a correct value from the
view of host endian (in my case, big endian not little endian).
2. address_space_ldq_le will dispatch to ldq_le_p, while
address_space_leq will dispatch to ldq_p.
However, on little endian targets, ldq_p is an alias of ldq_le_p so
making no effects.

Per my testing, this patch doesn't have any effect indeed. To have a
brief view what is happening,
see the logs just before atomic_cmpxchg:

pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f

>           }
> ---
Ziqiao Kong April 14, 2025, 4:59 p.m. UTC | #3
Hello Philippe,

Any further concern regarding this series? I certainly would like to investigate
and help =).

Bests,
Ziqiao

On Mon, Apr 14, 2025 at 7:17 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
>
> On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
> <philmd@linaro.org> wrote:
> >
> > Hi,
> >
> > On 14/4/25 05:46, Ziqiao Kong wrote:
> > > On big endian systems, pte and updated_pte hold big endian host data
> > > while pte_pa points to little endian target data. This means the branch
> > > at cpu_helper.c:1669 will be always satisfied and restart translation,
> > > causing an endless translation loop.
> > >
> >
> > Cc: qemu-stable@nongnu.org
> > Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
> >
> > > Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
> > > ---
> > >   target/riscv/cpu_helper.c | 4 ++--
> > >   1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > > index 6c4391d96b..bc146771c8 100644
> > > --- a/target/riscv/cpu_helper.c
> > > +++ b/target/riscv/cpu_helper.c
> > > @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
> > >               target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
> > >               target_ulong old_pte;
> > >               if (riscv_cpu_sxl(env) == MXL_RV32) {
> > > -                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
> > > +                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
> > >               } else {
> > > -                old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
> > > +                old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
> > >               }
> > >               if (old_pte != pte) {
> > >                   goto restart;
> >
> > If PTEs are always stored in LE order, maybe what we want is earlier:
> >
> > -- >8 --
> > diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > index 619c76cc001..b6ac2800240 100644
> > --- a/target/riscv/cpu_helper.c
> > +++ b/target/riscv/cpu_helper.c
> > @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
> > *env, hwaddr *physical,
> >           if (riscv_cpu_mxl(env) == MXL_RV32) {
> > -            pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
> > +            pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
> >           } else {
> > -            pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
> > +            pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
>
> Unfortunately, this doesn't work in two ways:
>
> 1. Note pte is used in the following code and that means pte must hold
> a correct value from the
> view of host endian (in my case, big endian not little endian).
> 2. address_space_ldq_le will dispatch to ldq_le_p, while
> address_space_leq will dispatch to ldq_p.
> However, on little endian targets, ldq_p is an alias of ldq_le_p so
> making no effects.
>
> Per my testing, this patch doesn't have any effect indeed. To have a
> brief view what is happening,
> see the logs just before atomic_cmpxchg:
>
> pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f
>
> >           }
> > ---
Philippe Mathieu-Daudé April 14, 2025, 5:38 p.m. UTC | #4
Hi,

On 14/4/25 18:59, Ziqiao Kong wrote:
> Hello Philippe,
> 
> Any further concern regarding this series? I certainly would like to investigate
> and help =).

Short term I can't keep looking because I'm busy with other stuffs and
tagged this patch for another review, because there is some endianness
code smell in get_physical_address(). I understand your change fixes
your issue, but I'm skeptical about it, in part because there are no
such use in the whole code base. My change suggestion is just a starting
point, more is needed.

> 
> Bests,
> Ziqiao
> 
> On Mon, Apr 14, 2025 at 7:17 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
>>
>> On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
>> <philmd@linaro.org> wrote:
>>>
>>> Hi,
>>>
>>> On 14/4/25 05:46, Ziqiao Kong wrote:
>>>> On big endian systems, pte and updated_pte hold big endian host data
>>>> while pte_pa points to little endian target data. This means the branch
>>>> at cpu_helper.c:1669 will be always satisfied and restart translation,
>>>> causing an endless translation loop.
>>>>
>>>
>>> Cc: qemu-stable@nongnu.org
>>> Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
>>>
>>>> Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
>>>> ---
>>>>    target/riscv/cpu_helper.c | 4 ++--
>>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
>>>> index 6c4391d96b..bc146771c8 100644
>>>> --- a/target/riscv/cpu_helper.c
>>>> +++ b/target/riscv/cpu_helper.c
>>>> @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
>>>>                target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
>>>>                target_ulong old_pte;
>>>>                if (riscv_cpu_sxl(env) == MXL_RV32) {
>>>> -                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
>>>> +                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
>>>>                } else {
>>>> -                old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
>>>> +                old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
>>>>                }
>>>>                if (old_pte != pte) {
>>>>                    goto restart;
>>>
>>> If PTEs are always stored in LE order, maybe what we want is earlier:
>>>
>>> -- >8 --
>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
>>> index 619c76cc001..b6ac2800240 100644
>>> --- a/target/riscv/cpu_helper.c
>>> +++ b/target/riscv/cpu_helper.c
>>> @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
>>> *env, hwaddr *physical,
>>>            if (riscv_cpu_mxl(env) == MXL_RV32) {
>>> -            pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
>>> +            pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
>>>            } else {
>>> -            pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
>>> +            pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
>>
>> Unfortunately, this doesn't work in two ways:
>>
>> 1. Note pte is used in the following code and that means pte must hold
>> a correct value from the
>> view of host endian (in my case, big endian not little endian).
>> 2. address_space_ldq_le will dispatch to ldq_le_p, while
>> address_space_leq will dispatch to ldq_p.
>> However, on little endian targets, ldq_p is an alias of ldq_le_p so
>> making no effects.
>>
>> Per my testing, this patch doesn't have any effect indeed. To have a
>> brief view what is happening,
>> see the logs just before atomic_cmpxchg:
>>
>> pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f
>>
>>>            }
>>> ---
Ziqiao Kong April 15, 2025, 7:04 a.m. UTC | #5
Accidentally not cc all recipients. Sorry for the confusion. Below is
the duplicated message:

Hello Philippe,

On Tue, Apr 15, 2025 at 1:38 AM Philippe Mathieu-Daudé
<philmd@linaro.org> wrote:
>
> Hi,
>
> On 14/4/25 18:59, Ziqiao Kong wrote:
> > Hello Philippe,
> >
> > Any further concern regarding this series? I certainly would like to investigate
> > and help =).
>
> Short term I can't keep looking because I'm busy with other stuffs and
> tagged this patch for another review, because there is some endianness
> code smell in get_physical_address(). I understand your change fixes
> your issue, but I'm skeptical about it, in part because there are no
> such use in the whole code base. My change suggestion is just a starting
> point, more is needed.

Thanks for responding.

Actually, the pattern of this usage is actually very common in the code base and
that's why I fixed in this way. Sorry I should have put this in the
cover letter to
justify my fix. Below is an incomplete list of the code using this pattern:

- target/i386/tcg/system/excp_helper.c:129

if (likely(in->haddr)) {
old = cpu_to_le32(old);
new = cpu_to_le32(new);
return qatomic_cmpxchg((uint32_t *)in->haddr, old, new) == old;
}

- target/arm/ptw.c: 840

if (ptw->out_be) {
old_val = cpu_to_be64(old_val);
new_val = cpu_to_be64(new_val);
cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
cur_val = be64_to_cpu(cur_val);
} else {
old_val = cpu_to_le64(old_val);
new_val = cpu_to_le64(new_val);
cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
cur_val = le64_to_cpu(cur_val);
}

You might want to do a `grep -rn "qatomic_cmpxchg" .` to see all matches.


>
> >
> > Bests,
> > Ziqiao
> >
> > On Mon, Apr 14, 2025 at 7:17 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
> >>
> >> On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
> >> <philmd@linaro.org> wrote:
> >>>
> >>> Hi,
> >>>
> >>> On 14/4/25 05:46, Ziqiao Kong wrote:
> >>>> On big endian systems, pte and updated_pte hold big endian host data
> >>>> while pte_pa points to little endian target data. This means the branch
> >>>> at cpu_helper.c:1669 will be always satisfied and restart translation,
> >>>> causing an endless translation loop.
> >>>>
> >>>
> >>> Cc: qemu-stable@nongnu.org
> >>> Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
> >>>
> >>>> Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
> >>>> ---
> >>>>    target/riscv/cpu_helper.c | 4 ++--
> >>>>    1 file changed, 2 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> >>>> index 6c4391d96b..bc146771c8 100644
> >>>> --- a/target/riscv/cpu_helper.c
> >>>> +++ b/target/riscv/cpu_helper.c
> >>>> @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
> >>>>                target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
> >>>>                target_ulong old_pte;
> >>>>                if (riscv_cpu_sxl(env) == MXL_RV32) {
> >>>> -                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
> >>>> +                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
> >>>>                } else {
> >>>> -                old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
> >>>> +                old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
> >>>>                }
> >>>>                if (old_pte != pte) {
> >>>>                    goto restart;
> >>>
> >>> If PTEs are always stored in LE order, maybe what we want is earlier:
> >>>
> >>> -- >8 --
> >>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> >>> index 619c76cc001..b6ac2800240 100644
> >>> --- a/target/riscv/cpu_helper.c
> >>> +++ b/target/riscv/cpu_helper.c
> >>> @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
> >>> *env, hwaddr *physical,
> >>>            if (riscv_cpu_mxl(env) == MXL_RV32) {
> >>> -            pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
> >>> +            pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
> >>>            } else {
> >>> -            pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
> >>> +            pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
> >>
> >> Unfortunately, this doesn't work in two ways:
> >>
> >> 1. Note pte is used in the following code and that means pte must hold
> >> a correct value from the
> >> view of host endian (in my case, big endian not little endian).
> >> 2. address_space_ldq_le will dispatch to ldq_le_p, while
> >> address_space_leq will dispatch to ldq_p.
> >> However, on little endian targets, ldq_p is an alias of ldq_le_p so
> >> making no effects.
> >>
> >> Per my testing, this patch doesn't have any effect indeed. To have a
> >> brief view what is happening,
> >> see the logs just before atomic_cmpxchg:
> >>
> >> pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f
> >>
> >>>            }
> >>> ---
>
Philippe Mathieu-Daudé April 15, 2025, 7:15 a.m. UTC | #6
On 15/4/25 09:04, Ziqiao Kong wrote:
> Accidentally not cc all recipients. Sorry for the confusion. Below is
> the duplicated message:
> 
> Hello Philippe,
> 
> On Tue, Apr 15, 2025 at 1:38 AM Philippe Mathieu-Daudé
> <philmd@linaro.org> wrote:
>>
>> Hi,
>>
>> On 14/4/25 18:59, Ziqiao Kong wrote:
>>> Hello Philippe,
>>>
>>> Any further concern regarding this series? I certainly would like to investigate
>>> and help =).
>>
>> Short term I can't keep looking because I'm busy with other stuffs and
>> tagged this patch for another review, because there is some endianness
>> code smell in get_physical_address(). I understand your change fixes
>> your issue, but I'm skeptical about it, in part because there are no
>> such use in the whole code base. My change suggestion is just a starting
>> point, more is needed.
> 
> Thanks for responding.
> 
> Actually, the pattern of this usage is actually very common in the code base and
> that's why I fixed in this way. Sorry I should have put this in the
> cover letter to
> justify my fix. Below is an incomplete list of the code using this pattern:
> 
> - target/i386/tcg/system/excp_helper.c:129
> 
> if (likely(in->haddr)) {
> old = cpu_to_le32(old);
> new = cpu_to_le32(new);
> return qatomic_cmpxchg((uint32_t *)in->haddr, old, new) == old;
> }
> 
> - target/arm/ptw.c: 840
> 
> if (ptw->out_be) {
> old_val = cpu_to_be64(old_val);
> new_val = cpu_to_be64(new_val);
> cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> cur_val = be64_to_cpu(cur_val);
> } else {
> old_val = cpu_to_le64(old_val);
> new_val = cpu_to_le64(new_val);
> cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> cur_val = le64_to_cpu(cur_val);
> }

Doh OK...

> 
> You might want to do a `grep -rn "qatomic_cmpxchg" .` to see all matches.
> 
> 
>>
>>>
>>> Bests,
>>> Ziqiao
>>>
>>> On Mon, Apr 14, 2025 at 7:17 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
>>>>
>>>> On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
>>>> <philmd@linaro.org> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On 14/4/25 05:46, Ziqiao Kong wrote:
>>>>>> On big endian systems, pte and updated_pte hold big endian host data
>>>>>> while pte_pa points to little endian target data. This means the branch
>>>>>> at cpu_helper.c:1669 will be always satisfied and restart translation,
>>>>>> causing an endless translation loop.
>>>>>>
>>>>>
>>>>> Cc: qemu-stable@nongnu.org
>>>>> Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
>>>>>
>>>>>> Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
>>>>>> ---
>>>>>>     target/riscv/cpu_helper.c | 4 ++--
>>>>>>     1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
>>>>>> index 6c4391d96b..bc146771c8 100644
>>>>>> --- a/target/riscv/cpu_helper.c
>>>>>> +++ b/target/riscv/cpu_helper.c
>>>>>> @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
>>>>>>                 target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
>>>>>>                 target_ulong old_pte;
>>>>>>                 if (riscv_cpu_sxl(env) == MXL_RV32) {
>>>>>> -                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
>>>>>> +                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));

Then don't we need:

     old_pte = le32_to_cpu(old_pte);

>>>>>>                 } else {
>>>>>> -                old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
>>>>>> +                old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));

     old_pte = le64_to_cpu(old_pte);

?

>>>>>>                 }
>>>>>>                 if (old_pte != pte) {
>>>>>>                     goto restart;
>>>>>
>>>>> If PTEs are always stored in LE order, maybe what we want is earlier:
>>>>>
>>>>> -- >8 --
>>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
>>>>> index 619c76cc001..b6ac2800240 100644
>>>>> --- a/target/riscv/cpu_helper.c
>>>>> +++ b/target/riscv/cpu_helper.c
>>>>> @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
>>>>> *env, hwaddr *physical,
>>>>>             if (riscv_cpu_mxl(env) == MXL_RV32) {
>>>>> -            pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
>>>>> +            pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
>>>>>             } else {
>>>>> -            pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
>>>>> +            pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
>>>>
>>>> Unfortunately, this doesn't work in two ways:
>>>>
>>>> 1. Note pte is used in the following code and that means pte must hold
>>>> a correct value from the
>>>> view of host endian (in my case, big endian not little endian).
>>>> 2. address_space_ldq_le will dispatch to ldq_le_p, while
>>>> address_space_leq will dispatch to ldq_p.
>>>> However, on little endian targets, ldq_p is an alias of ldq_le_p so
>>>> making no effects.
>>>>
>>>> Per my testing, this patch doesn't have any effect indeed. To have a
>>>> brief view what is happening,
>>>> see the logs just before atomic_cmpxchg:
>>>>
>>>> pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f
>>>>
>>>>>             }
>>>>> ---
>>
Ziqiao Kong April 15, 2025, 7:19 a.m. UTC | #7
Hello Philippe,

On Tue, Apr 15, 2025 at 3:15 PM Philippe Mathieu-Daudé
<philmd@linaro.org> wrote:
>
> On 15/4/25 09:04, Ziqiao Kong wrote:
> > Accidentally not cc all recipients. Sorry for the confusion. Below is
> > the duplicated message:
> >
> > Hello Philippe,
> >
> > On Tue, Apr 15, 2025 at 1:38 AM Philippe Mathieu-Daudé
> > <philmd@linaro.org> wrote:
> >>
> >> Hi,
> >>
> >> On 14/4/25 18:59, Ziqiao Kong wrote:
> >>> Hello Philippe,
> >>>
> >>> Any further concern regarding this series? I certainly would like to investigate
> >>> and help =).
> >>
> >> Short term I can't keep looking because I'm busy with other stuffs and
> >> tagged this patch for another review, because there is some endianness
> >> code smell in get_physical_address(). I understand your change fixes
> >> your issue, but I'm skeptical about it, in part because there are no
> >> such use in the whole code base. My change suggestion is just a starting
> >> point, more is needed.
> >
> > Thanks for responding.
> >
> > Actually, the pattern of this usage is actually very common in the code base and
> > that's why I fixed in this way. Sorry I should have put this in the
> > cover letter to
> > justify my fix. Below is an incomplete list of the code using this pattern:
> >
> > - target/i386/tcg/system/excp_helper.c:129
> >
> > if (likely(in->haddr)) {
> > old = cpu_to_le32(old);
> > new = cpu_to_le32(new);
> > return qatomic_cmpxchg((uint32_t *)in->haddr, old, new) == old;
> > }
> >
> > - target/arm/ptw.c: 840
> >
> > if (ptw->out_be) {
> > old_val = cpu_to_be64(old_val);
> > new_val = cpu_to_be64(new_val);
> > cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> > cur_val = be64_to_cpu(cur_val);
> > } else {
> > old_val = cpu_to_le64(old_val);
> > new_val = cpu_to_le64(new_val);
> > cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> > cur_val = le64_to_cpu(cur_val);
> > }
>
> Doh OK...
>
> >
> > You might want to do a `grep -rn "qatomic_cmpxchg" .` to see all matches.
> >
> >
> >>
> >>>
> >>> Bests,
> >>> Ziqiao
> >>>
> >>> On Mon, Apr 14, 2025 at 7:17 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
> >>>>
> >>>> On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
> >>>> <philmd@linaro.org> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> On 14/4/25 05:46, Ziqiao Kong wrote:
> >>>>>> On big endian systems, pte and updated_pte hold big endian host data
> >>>>>> while pte_pa points to little endian target data. This means the branch
> >>>>>> at cpu_helper.c:1669 will be always satisfied and restart translation,
> >>>>>> causing an endless translation loop.
> >>>>>>
> >>>>>
> >>>>> Cc: qemu-stable@nongnu.org
> >>>>> Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
> >>>>>
> >>>>>> Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
> >>>>>> ---
> >>>>>>     target/riscv/cpu_helper.c | 4 ++--
> >>>>>>     1 file changed, 2 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> >>>>>> index 6c4391d96b..bc146771c8 100644
> >>>>>> --- a/target/riscv/cpu_helper.c
> >>>>>> +++ b/target/riscv/cpu_helper.c
> >>>>>> @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
> >>>>>>                 target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
> >>>>>>                 target_ulong old_pte;
> >>>>>>                 if (riscv_cpu_sxl(env) == MXL_RV32) {
> >>>>>> -                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
> >>>>>> +                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
>
> Then don't we need:
>
>      old_pte = le32_to_cpu(old_pte);
>
> >>>>>>                 } else {
> >>>>>> -                old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
> >>>>>> +                old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
>
>      old_pte = le64_to_cpu(old_pte);
>
> ?

Note old_pte is no longer used later (it is defined within the scope here) and
dropped immediately after qatomic_cmpxchg so we don't need that extra bswap.

>
> >>>>>>                 }
> >>>>>>                 if (old_pte != pte) {
> >>>>>>                     goto restart;
> >>>>>
> >>>>> If PTEs are always stored in LE order, maybe what we want is earlier:
> >>>>>
> >>>>> -- >8 --
> >>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> >>>>> index 619c76cc001..b6ac2800240 100644
> >>>>> --- a/target/riscv/cpu_helper.c
> >>>>> +++ b/target/riscv/cpu_helper.c
> >>>>> @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
> >>>>> *env, hwaddr *physical,
> >>>>>             if (riscv_cpu_mxl(env) == MXL_RV32) {
> >>>>> -            pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
> >>>>> +            pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
> >>>>>             } else {
> >>>>> -            pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
> >>>>> +            pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
> >>>>
> >>>> Unfortunately, this doesn't work in two ways:
> >>>>
> >>>> 1. Note pte is used in the following code and that means pte must hold
> >>>> a correct value from the
> >>>> view of host endian (in my case, big endian not little endian).
> >>>> 2. address_space_ldq_le will dispatch to ldq_le_p, while
> >>>> address_space_leq will dispatch to ldq_p.
> >>>> However, on little endian targets, ldq_p is an alias of ldq_le_p so
> >>>> making no effects.
> >>>>
> >>>> Per my testing, this patch doesn't have any effect indeed. To have a
> >>>> brief view what is happening,
> >>>> see the logs just before atomic_cmpxchg:
> >>>>
> >>>> pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f
> >>>>
> >>>>>             }
> >>>>> ---
> >>
>
Ziqiao Kong April 15, 2025, 7:22 a.m. UTC | #8
On Tue, Apr 15, 2025 at 3:19 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
>
> Hello Philippe,
>
> On Tue, Apr 15, 2025 at 3:15 PM Philippe Mathieu-Daudé
> <philmd@linaro.org> wrote:
> >
> > On 15/4/25 09:04, Ziqiao Kong wrote:
> > > Accidentally not cc all recipients. Sorry for the confusion. Below is
> > > the duplicated message:
> > >
> > > Hello Philippe,
> > >
> > > On Tue, Apr 15, 2025 at 1:38 AM Philippe Mathieu-Daudé
> > > <philmd@linaro.org> wrote:
> > >>
> > >> Hi,
> > >>
> > >> On 14/4/25 18:59, Ziqiao Kong wrote:
> > >>> Hello Philippe,
> > >>>
> > >>> Any further concern regarding this series? I certainly would like to investigate
> > >>> and help =).
> > >>
> > >> Short term I can't keep looking because I'm busy with other stuffs and
> > >> tagged this patch for another review, because there is some endianness
> > >> code smell in get_physical_address(). I understand your change fixes
> > >> your issue, but I'm skeptical about it, in part because there are no
> > >> such use in the whole code base. My change suggestion is just a starting
> > >> point, more is needed.
> > >
> > > Thanks for responding.
> > >
> > > Actually, the pattern of this usage is actually very common in the code base and
> > > that's why I fixed in this way. Sorry I should have put this in the
> > > cover letter to
> > > justify my fix. Below is an incomplete list of the code using this pattern:
> > >
> > > - target/i386/tcg/system/excp_helper.c:129
> > >
> > > if (likely(in->haddr)) {
> > > old = cpu_to_le32(old);
> > > new = cpu_to_le32(new);
> > > return qatomic_cmpxchg((uint32_t *)in->haddr, old, new) == old;
> > > }
> > >
> > > - target/arm/ptw.c: 840
> > >
> > > if (ptw->out_be) {
> > > old_val = cpu_to_be64(old_val);
> > > new_val = cpu_to_be64(new_val);
> > > cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> > > cur_val = be64_to_cpu(cur_val);
> > > } else {
> > > old_val = cpu_to_le64(old_val);
> > > new_val = cpu_to_le64(new_val);
> > > cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> > > cur_val = le64_to_cpu(cur_val);
> > > }
> >
> > Doh OK...
> >
> > >
> > > You might want to do a `grep -rn "qatomic_cmpxchg" .` to see all matches.
> > >
> > >
> > >>
> > >>>
> > >>> Bests,
> > >>> Ziqiao
> > >>>
> > >>> On Mon, Apr 14, 2025 at 7:17 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
> > >>>>
> > >>>> On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
> > >>>> <philmd@linaro.org> wrote:
> > >>>>>
> > >>>>> Hi,
> > >>>>>
> > >>>>> On 14/4/25 05:46, Ziqiao Kong wrote:
> > >>>>>> On big endian systems, pte and updated_pte hold big endian host data
> > >>>>>> while pte_pa points to little endian target data. This means the branch
> > >>>>>> at cpu_helper.c:1669 will be always satisfied and restart translation,
> > >>>>>> causing an endless translation loop.
> > >>>>>>
> > >>>>>
> > >>>>> Cc: qemu-stable@nongnu.org
> > >>>>> Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
> > >>>>>
> > >>>>>> Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
> > >>>>>> ---
> > >>>>>>     target/riscv/cpu_helper.c | 4 ++--
> > >>>>>>     1 file changed, 2 insertions(+), 2 deletions(-)
> > >>>>>>
> > >>>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > >>>>>> index 6c4391d96b..bc146771c8 100644
> > >>>>>> --- a/target/riscv/cpu_helper.c
> > >>>>>> +++ b/target/riscv/cpu_helper.c
> > >>>>>> @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
> > >>>>>>                 target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
> > >>>>>>                 target_ulong old_pte;
> > >>>>>>                 if (riscv_cpu_sxl(env) == MXL_RV32) {
> > >>>>>> -                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
> > >>>>>> +                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
> >
> > Then don't we need:
> >
> >      old_pte = le32_to_cpu(old_pte);
> >
> > >>>>>>                 } else {
> > >>>>>> -                old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
> > >>>>>> +                old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
> >
> >      old_pte = le64_to_cpu(old_pte);
> >
> > ?
>
> Note old_pte is no longer used later (it is defined within the scope here) and
> dropped immediately after qatomic_cmpxchg so we don't need that extra bswap.

Also pte is not modified inplace previously, unlike the code samples above.

>
> >
> > >>>>>>                 }
> > >>>>>>                 if (old_pte != pte) {
> > >>>>>>                     goto restart;
> > >>>>>
> > >>>>> If PTEs are always stored in LE order, maybe what we want is earlier:
> > >>>>>
> > >>>>> -- >8 --
> > >>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > >>>>> index 619c76cc001..b6ac2800240 100644
> > >>>>> --- a/target/riscv/cpu_helper.c
> > >>>>> +++ b/target/riscv/cpu_helper.c
> > >>>>> @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
> > >>>>> *env, hwaddr *physical,
> > >>>>>             if (riscv_cpu_mxl(env) == MXL_RV32) {
> > >>>>> -            pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
> > >>>>> +            pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
> > >>>>>             } else {
> > >>>>> -            pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
> > >>>>> +            pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
> > >>>>
> > >>>> Unfortunately, this doesn't work in two ways:
> > >>>>
> > >>>> 1. Note pte is used in the following code and that means pte must hold
> > >>>> a correct value from the
> > >>>> view of host endian (in my case, big endian not little endian).
> > >>>> 2. address_space_ldq_le will dispatch to ldq_le_p, while
> > >>>> address_space_leq will dispatch to ldq_p.
> > >>>> However, on little endian targets, ldq_p is an alias of ldq_le_p so
> > >>>> making no effects.
> > >>>>
> > >>>> Per my testing, this patch doesn't have any effect indeed. To have a
> > >>>> brief view what is happening,
> > >>>> see the logs just before atomic_cmpxchg:
> > >>>>
> > >>>> pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f
> > >>>>
> > >>>>>             }
> > >>>>> ---
> > >>
> >
Ziqiao Kong April 15, 2025, 7:40 a.m. UTC | #9
On Tue, Apr 15, 2025 at 3:22 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
>
> On Tue, Apr 15, 2025 at 3:19 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
> >
> > Hello Philippe,
> >
> > On Tue, Apr 15, 2025 at 3:15 PM Philippe Mathieu-Daudé
> > <philmd@linaro.org> wrote:
> > >
> > > On 15/4/25 09:04, Ziqiao Kong wrote:
> > > > Accidentally not cc all recipients. Sorry for the confusion. Below is
> > > > the duplicated message:
> > > >
> > > > Hello Philippe,
> > > >
> > > > On Tue, Apr 15, 2025 at 1:38 AM Philippe Mathieu-Daudé
> > > > <philmd@linaro.org> wrote:
> > > >>
> > > >> Hi,
> > > >>
> > > >> On 14/4/25 18:59, Ziqiao Kong wrote:
> > > >>> Hello Philippe,
> > > >>>
> > > >>> Any further concern regarding this series? I certainly would like to investigate
> > > >>> and help =).
> > > >>
> > > >> Short term I can't keep looking because I'm busy with other stuffs and
> > > >> tagged this patch for another review, because there is some endianness
> > > >> code smell in get_physical_address(). I understand your change fixes
> > > >> your issue, but I'm skeptical about it, in part because there are no
> > > >> such use in the whole code base. My change suggestion is just a starting
> > > >> point, more is needed.
> > > >
> > > > Thanks for responding.
> > > >
> > > > Actually, the pattern of this usage is actually very common in the code base and
> > > > that's why I fixed in this way. Sorry I should have put this in the
> > > > cover letter to
> > > > justify my fix. Below is an incomplete list of the code using this pattern:
> > > >
> > > > - target/i386/tcg/system/excp_helper.c:129
> > > >
> > > > if (likely(in->haddr)) {
> > > > old = cpu_to_le32(old);
> > > > new = cpu_to_le32(new);
> > > > return qatomic_cmpxchg((uint32_t *)in->haddr, old, new) == old;
> > > > }
> > > >
> > > > - target/arm/ptw.c: 840
> > > >
> > > > if (ptw->out_be) {
> > > > old_val = cpu_to_be64(old_val);
> > > > new_val = cpu_to_be64(new_val);
> > > > cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> > > > cur_val = be64_to_cpu(cur_val);
> > > > } else {
> > > > old_val = cpu_to_le64(old_val);
> > > > new_val = cpu_to_le64(new_val);
> > > > cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> > > > cur_val = le64_to_cpu(cur_val);
> > > > }
> > >
> > > Doh OK...
> > >
> > > >
> > > > You might want to do a `grep -rn "qatomic_cmpxchg" .` to see all matches.
> > > >
> > > >
> > > >>
> > > >>>
> > > >>> Bests,
> > > >>> Ziqiao
> > > >>>
> > > >>> On Mon, Apr 14, 2025 at 7:17 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
> > > >>>>
> > > >>>> On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
> > > >>>> <philmd@linaro.org> wrote:
> > > >>>>>
> > > >>>>> Hi,
> > > >>>>>
> > > >>>>> On 14/4/25 05:46, Ziqiao Kong wrote:
> > > >>>>>> On big endian systems, pte and updated_pte hold big endian host data
> > > >>>>>> while pte_pa points to little endian target data. This means the branch
> > > >>>>>> at cpu_helper.c:1669 will be always satisfied and restart translation,
> > > >>>>>> causing an endless translation loop.
> > > >>>>>>
> > > >>>>>
> > > >>>>> Cc: qemu-stable@nongnu.org
> > > >>>>> Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
> > > >>>>>
> > > >>>>>> Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
> > > >>>>>> ---
> > > >>>>>>     target/riscv/cpu_helper.c | 4 ++--
> > > >>>>>>     1 file changed, 2 insertions(+), 2 deletions(-)
> > > >>>>>>
> > > >>>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > > >>>>>> index 6c4391d96b..bc146771c8 100644
> > > >>>>>> --- a/target/riscv/cpu_helper.c
> > > >>>>>> +++ b/target/riscv/cpu_helper.c
> > > >>>>>> @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
> > > >>>>>>                 target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
> > > >>>>>>                 target_ulong old_pte;
> > > >>>>>>                 if (riscv_cpu_sxl(env) == MXL_RV32) {
> > > >>>>>> -                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
> > > >>>>>> +                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
> > >
> > > Then don't we need:
> > >
> > >      old_pte = le32_to_cpu(old_pte);
> > >
> > > >>>>>>                 } else {
> > > >>>>>> -                old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
> > > >>>>>> +                old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
> > >
> > >      old_pte = le64_to_cpu(old_pte);
> > >
> > > ?

Oh you are correct. I will draft a v3.

> >
> > Note old_pte is no longer used later (it is defined within the scope here) and
> > dropped immediately after qatomic_cmpxchg so we don't need that extra bswap.
>
> Also pte is not modified inplace previously, unlike the code samples above.
>
> >
> > >
> > > >>>>>>                 }
> > > >>>>>>                 if (old_pte != pte) {
> > > >>>>>>                     goto restart;
> > > >>>>>
> > > >>>>> If PTEs are always stored in LE order, maybe what we want is earlier:
> > > >>>>>
> > > >>>>> -- >8 --
> > > >>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > > >>>>> index 619c76cc001..b6ac2800240 100644
> > > >>>>> --- a/target/riscv/cpu_helper.c
> > > >>>>> +++ b/target/riscv/cpu_helper.c
> > > >>>>> @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
> > > >>>>> *env, hwaddr *physical,
> > > >>>>>             if (riscv_cpu_mxl(env) == MXL_RV32) {
> > > >>>>> -            pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
> > > >>>>> +            pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
> > > >>>>>             } else {
> > > >>>>> -            pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
> > > >>>>> +            pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
> > > >>>>
> > > >>>> Unfortunately, this doesn't work in two ways:
> > > >>>>
> > > >>>> 1. Note pte is used in the following code and that means pte must hold
> > > >>>> a correct value from the
> > > >>>> view of host endian (in my case, big endian not little endian).
> > > >>>> 2. address_space_ldq_le will dispatch to ldq_le_p, while
> > > >>>> address_space_leq will dispatch to ldq_p.
> > > >>>> However, on little endian targets, ldq_p is an alias of ldq_le_p so
> > > >>>> making no effects.
> > > >>>>
> > > >>>> Per my testing, this patch doesn't have any effect indeed. To have a
> > > >>>> brief view what is happening,
> > > >>>> see the logs just before atomic_cmpxchg:
> > > >>>>
> > > >>>> pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f
> > > >>>>
> > > >>>>>             }
> > > >>>>> ---
> > > >>
> > >
diff mbox series

Patch

diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 6c4391d96b..bc146771c8 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -1662,9 +1662,9 @@  static int get_physical_address(CPURISCVState *env, hwaddr *physical,
             target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
             target_ulong old_pte;
             if (riscv_cpu_sxl(env) == MXL_RV32) {
-                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
+                old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
             } else {
-                old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
+                old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
             }
             if (old_pte != pte) {
                 goto restart;