mbox series

[0/7] Add support to handle misaligned accesses in S-mode

Message ID 20230926150316.1129648-1-cleger@rivosinc.com (mailing list archive)
Headers show
Series Add support to handle misaligned accesses in S-mode | expand

Message

Clément Léger Sept. 26, 2023, 3:03 p.m. UTC
Since commit 61cadb9 ("Provide new description of misaligned load/store
behavior compatible with privileged architecture.") in the RISC-V ISA
manual, it is stated that misaligned load/store might not be supported.
However, the RISC-V kernel uABI describes that misaligned accesses are
supported. In order to support that, this series adds support for S-mode
handling of misaligned accesses as well support for prctl(PR_UNALIGN).

Handling misaligned access in kernel allows for a finer grain control
of the misaligned accesses behavior, and thanks to the prctl call, can
allow disabling misaligned access emulation to generate SIGBUS. User
space can then optimize its software by removing such access based on
SIGBUS generation.

Currently, this series is useful for people that uses a SBI that does
not handled misaligned traps. In a near future, this series will make
use a SBI extension [1] allowing to request delegation of the
misaligned load/store traps to the S-mode software. This extension has
been submitted for review to the riscv tech-prs group. An OpenSBI
implementation for this spec is available at [2].

This series can be tested using the spike simulator [3] and an openSBI
version [4] which allows to always delegate misaligned load/store to
S-mode.

[1] https://lists.riscv.org/g/tech-prs/message/540
[2] https://github.com/rivosinc/opensbi/tree/dev/cleger/fw_feature_upstream
[3] https://github.com/riscv-software-src/riscv-isa-sim
[4] https://github.com/rivosinc/opensbi/tree/dev/cleger/no_misaligned

Clément Léger (7):
  riscv: remove unused functions in traps_misaligned.c
  riscv: add support for misaligned handling in S-mode
  riscv: report perf event for misaligned fault
  riscv: add floating point insn support to misaligned access emulation
  riscv: add support for sysctl unaligned_enabled control
  riscv: report misaligned accesses emulation to hwprobe
  riscv: add support for PR_SET_UNALIGN and PR_GET_UNALIGN

 arch/riscv/Kconfig                    |   1 +
 arch/riscv/include/asm/cpufeature.h   |   6 +
 arch/riscv/include/asm/entry-common.h |   3 +
 arch/riscv/include/asm/processor.h    |   9 +
 arch/riscv/kernel/Makefile            |   2 +-
 arch/riscv/kernel/cpufeature.c        |   6 +-
 arch/riscv/kernel/fpu.S               | 117 ++++++++
 arch/riscv/kernel/process.c           |  18 ++
 arch/riscv/kernel/setup.c             |   1 +
 arch/riscv/kernel/traps.c             |   9 -
 arch/riscv/kernel/traps_misaligned.c  | 374 ++++++++++++++++++++++----
 11 files changed, 488 insertions(+), 58 deletions(-)

Comments

Evan Green Sept. 26, 2023, 9:43 p.m. UTC | #1
On Tue, Sep 26, 2023 at 8:03 AM Clément Léger <cleger@rivosinc.com> wrote:
>
> Since commit 61cadb9 ("Provide new description of misaligned load/store
> behavior compatible with privileged architecture.") in the RISC-V ISA
> manual, it is stated that misaligned load/store might not be supported.
> However, the RISC-V kernel uABI describes that misaligned accesses are
> supported. In order to support that, this series adds support for S-mode
> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
>
> Handling misaligned access in kernel allows for a finer grain control
> of the misaligned accesses behavior, and thanks to the prctl call, can
> allow disabling misaligned access emulation to generate SIGBUS. User
> space can then optimize its software by removing such access based on
> SIGBUS generation.
>
> Currently, this series is useful for people that uses a SBI that does
> not handled misaligned traps. In a near future, this series will make
> use a SBI extension [1] allowing to request delegation of the
> misaligned load/store traps to the S-mode software. This extension has
> been submitted for review to the riscv tech-prs group. An OpenSBI
> implementation for this spec is available at [2].

For my own education, how does the new SBI call behave with respect to
multiple harts? Does a call to change a feature perform that change
across all harts, or just the hart the SBI call was made on? If the
answer is "all harts", what if not all harts are exactly the same, and
some can enable the feature switch while others cannot? Also if the
answer is "all harts", does it also apply to hotplugged cpus, which
may not have even existed at boot time?

What happens if a hart goes through a context loss event, like
suspend/resume? Is the setting expected to be sticky, or is the kernel
expected to replay these calls?

-Evan
Clément Léger Sept. 28, 2023, 7:49 a.m. UTC | #2
On 26/09/2023 23:43, Evan Green wrote:
> On Tue, Sep 26, 2023 at 8:03 AM Clément Léger <cleger@rivosinc.com> wrote:
>>
>> Since commit 61cadb9 ("Provide new description of misaligned load/store
>> behavior compatible with privileged architecture.") in the RISC-V ISA
>> manual, it is stated that misaligned load/store might not be supported.
>> However, the RISC-V kernel uABI describes that misaligned accesses are
>> supported. In order to support that, this series adds support for S-mode
>> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
>>
>> Handling misaligned access in kernel allows for a finer grain control
>> of the misaligned accesses behavior, and thanks to the prctl call, can
>> allow disabling misaligned access emulation to generate SIGBUS. User
>> space can then optimize its software by removing such access based on
>> SIGBUS generation.
>>
>> Currently, this series is useful for people that uses a SBI that does
>> not handled misaligned traps. In a near future, this series will make
>> use a SBI extension [1] allowing to request delegation of the
>> misaligned load/store traps to the S-mode software. This extension has
>> been submitted for review to the riscv tech-prs group. An OpenSBI
>> implementation for this spec is available at [2].
> 
> For my own education, how does the new SBI call behave with respect to
> multiple harts? Does a call to change a feature perform that change
> across all harts, or just the hart the SBI call was made on? If the
> answer is "all harts", what if not all harts are exactly the same, and
> some can enable the feature switch while others cannot? Also if the
> answer is "all harts", does it also apply to hotplugged cpus, which
> may not have even existed at boot time?

Depending on the feature, they can be either global (all harts) or
local (calling hart). The medeleg register is per hart and thus
misaligned load/store delegation for S-mode is also per hart.


> 
> What happens if a hart goes through a context loss event, like
> suspend/resume? Is the setting expected to be sticky, or is the kernel
> expected to replay these calls?

That is a good question that we did not actually clarified yet. Thanks
for raising it !

Clément

> 
> -Evan
Evan Green Sept. 28, 2023, 4:48 p.m. UTC | #3
On Thu, Sep 28, 2023 at 12:49 AM Clément Léger <cleger@rivosinc.com> wrote:
>
>
>
> On 26/09/2023 23:43, Evan Green wrote:
> > On Tue, Sep 26, 2023 at 8:03 AM Clément Léger <cleger@rivosinc.com> wrote:
> >>
> >> Since commit 61cadb9 ("Provide new description of misaligned load/store
> >> behavior compatible with privileged architecture.") in the RISC-V ISA
> >> manual, it is stated that misaligned load/store might not be supported.
> >> However, the RISC-V kernel uABI describes that misaligned accesses are
> >> supported. In order to support that, this series adds support for S-mode
> >> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
> >>
> >> Handling misaligned access in kernel allows for a finer grain control
> >> of the misaligned accesses behavior, and thanks to the prctl call, can
> >> allow disabling misaligned access emulation to generate SIGBUS. User
> >> space can then optimize its software by removing such access based on
> >> SIGBUS generation.
> >>
> >> Currently, this series is useful for people that uses a SBI that does
> >> not handled misaligned traps. In a near future, this series will make
> >> use a SBI extension [1] allowing to request delegation of the
> >> misaligned load/store traps to the S-mode software. This extension has
> >> been submitted for review to the riscv tech-prs group. An OpenSBI
> >> implementation for this spec is available at [2].
> >
> > For my own education, how does the new SBI call behave with respect to
> > multiple harts? Does a call to change a feature perform that change
> > across all harts, or just the hart the SBI call was made on? If the
> > answer is "all harts", what if not all harts are exactly the same, and
> > some can enable the feature switch while others cannot? Also if the
> > answer is "all harts", does it also apply to hotplugged cpus, which
> > may not have even existed at boot time?
>
> Depending on the feature, they can be either global (all harts) or
> local (calling hart). The medeleg register is per hart and thus
> misaligned load/store delegation for S-mode is also per hart.

We should probably state this in the spec update then, both generally
and for each specific feature added. Otherwise firmware writers are
left not knowing if they're supposed to spread a feature across to all
cores or not.

>
>
> >
> > What happens if a hart goes through a context loss event, like
> > suspend/resume? Is the setting expected to be sticky, or is the kernel
> > expected to replay these calls?
>
> That is a good question that we did not actually clarified yet. Thanks
> for raising it !

No problem! This may also need to be specified per-feature in the
spec. I have a vague hunch that it's better to ask the kernel to do it
on resume, though ideally we'd have the terminology (and I don't think
we do?) to specify exactly which points constitute a context loss.
Mostly I'm remembering the x86 and ARM transition from S3, where lots
of firmware code ran at resume, to S0ix-like power states, where
things resumed directly into the OS and they had to figure out how to
do it without firmware. The vague hunch is that keeping the laundry
list of things firmware must do on resume low might keep us from
getting in S0ix's way, but it's all so speculative it's hard to know
if it's really a useful hunch or not.

-Evan
Conor Dooley Sept. 30, 2023, 9:23 a.m. UTC | #4
On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
> Since commit 61cadb9 ("Provide new description of misaligned load/store
> behavior compatible with privileged architecture.") in the RISC-V ISA
> manual, it is stated that misaligned load/store might not be supported.
> However, the RISC-V kernel uABI describes that misaligned accesses are
> supported. In order to support that, this series adds support for S-mode
> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
> 
> Handling misaligned access in kernel allows for a finer grain control
> of the misaligned accesses behavior, and thanks to the prctl call, can
> allow disabling misaligned access emulation to generate SIGBUS. User
> space can then optimize its software by removing such access based on
> SIGBUS generation.
> 
> Currently, this series is useful for people that uses a SBI that does
> not handled misaligned traps. In a near future, this series will make
> use a SBI extension [1] allowing to request delegation of the
> misaligned load/store traps to the S-mode software. This extension has
> been submitted for review to the riscv tech-prs group. An OpenSBI
> implementation for this spec is available at [2].
> 
> This series can be tested using the spike simulator [3] and an openSBI
> version [4] which allows to always delegate misaligned load/store to
> S-mode.

Some patches in this series do not build for any configs, some are
broken for clang builds and others are broken for nommu. Please try to
build test this more thoroughly before you submit the next version.

Also, AIUI, this series should be marked RFC since the SBI extension
this relies on has not been frozen.

Cheers,
Conor.
Clément Léger Oct. 2, 2023, 7:40 a.m. UTC | #5
On 30/09/2023 11:23, Conor Dooley wrote:
> On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
>> Since commit 61cadb9 ("Provide new description of misaligned load/store
>> behavior compatible with privileged architecture.") in the RISC-V ISA
>> manual, it is stated that misaligned load/store might not be supported.
>> However, the RISC-V kernel uABI describes that misaligned accesses are
>> supported. In order to support that, this series adds support for S-mode
>> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
>>
>> Handling misaligned access in kernel allows for a finer grain control
>> of the misaligned accesses behavior, and thanks to the prctl call, can
>> allow disabling misaligned access emulation to generate SIGBUS. User
>> space can then optimize its software by removing such access based on
>> SIGBUS generation.
>>
>> Currently, this series is useful for people that uses a SBI that does
>> not handled misaligned traps. In a near future, this series will make
>> use a SBI extension [1] allowing to request delegation of the
>> misaligned load/store traps to the S-mode software. This extension has
>> been submitted for review to the riscv tech-prs group. An OpenSBI
>> implementation for this spec is available at [2].
>>
>> This series can be tested using the spike simulator [3] and an openSBI
>> version [4] which allows to always delegate misaligned load/store to
>> S-mode.
> 
> Some patches in this series do not build for any configs, some are
> broken for clang builds and others are broken for nommu. Please try to> build test this more thoroughly before you submit the next version.

Hi Conor,

Thanks for the feedback, I'll check that.

> 
> Also, AIUI, this series should be marked RFC since the SBI extension
> this relies on has not been frozen.

This series does not actually uses the SBI extension but provides a way
to detect if misaligned accesses are not handled by hardware nor by the
SBI. It has been reported by Ron & Daniel they they have a minimal SBI
implementation that does not handle misaligned accesses and that they
would like to make use of the PR_SET_UNALIGN feature. This is what this
series addresses (and thus does not depend on the mentioned SBI extension).

Thanks,

Clément

> 
> Cheers,
> Conor.
Conor Dooley Oct. 2, 2023, 10:49 a.m. UTC | #6
On Mon, Oct 02, 2023 at 09:40:04AM +0200, Clément Léger wrote:
> 
> 
> On 30/09/2023 11:23, Conor Dooley wrote:
> > On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
> >> Since commit 61cadb9 ("Provide new description of misaligned load/store
> >> behavior compatible with privileged architecture.") in the RISC-V ISA
> >> manual, it is stated that misaligned load/store might not be supported.
> >> However, the RISC-V kernel uABI describes that misaligned accesses are
> >> supported. In order to support that, this series adds support for S-mode
> >> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
> >>
> >> Handling misaligned access in kernel allows for a finer grain control
> >> of the misaligned accesses behavior, and thanks to the prctl call, can
> >> allow disabling misaligned access emulation to generate SIGBUS. User
> >> space can then optimize its software by removing such access based on
> >> SIGBUS generation.
> >>
> >> Currently, this series is useful for people that uses a SBI that does
> >> not handled misaligned traps. In a near future, this series will make
> >> use a SBI extension [1] allowing to request delegation of the
> >> misaligned load/store traps to the S-mode software. This extension has
> >> been submitted for review to the riscv tech-prs group. An OpenSBI
> >> implementation for this spec is available at [2].
> >>
> >> This series can be tested using the spike simulator [3] and an openSBI
> >> version [4] which allows to always delegate misaligned load/store to
> >> S-mode.
> > 
> > Some patches in this series do not build for any configs, some are
> > broken for clang builds and others are broken for nommu. Please try to> build test this more thoroughly before you submit the next version.
> 
> Hi Conor,
> 
> Thanks for the feedback, I'll check that.
> 
> > 
> > Also, AIUI, this series should be marked RFC since the SBI extension
> > this relies on has not been frozen.
> 
> This series does not actually uses the SBI extension but provides a way
> to detect if misaligned accesses are not handled by hardware nor by the
> SBI. It has been reported by Ron & Daniel they they have a minimal SBI
> implementation that does not handle misaligned accesses and that they
> would like to make use of the PR_SET_UNALIGN feature. This is what this
> series addresses (and thus does not depend on the mentioned SBI extension).

Ah, I must have misread then. Apologies.
Clément Léger Oct. 2, 2023, 11:18 a.m. UTC | #7
On 02/10/2023 12:49, Conor Dooley wrote:
> On Mon, Oct 02, 2023 at 09:40:04AM +0200, Clément Léger wrote:
>>
>>
>> On 30/09/2023 11:23, Conor Dooley wrote:
>>> On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
>>>> Since commit 61cadb9 ("Provide new description of misaligned load/store
>>>> behavior compatible with privileged architecture.") in the RISC-V ISA
>>>> manual, it is stated that misaligned load/store might not be supported.
>>>> However, the RISC-V kernel uABI describes that misaligned accesses are
>>>> supported. In order to support that, this series adds support for S-mode
>>>> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
>>>>
>>>> Handling misaligned access in kernel allows for a finer grain control
>>>> of the misaligned accesses behavior, and thanks to the prctl call, can
>>>> allow disabling misaligned access emulation to generate SIGBUS. User
>>>> space can then optimize its software by removing such access based on
>>>> SIGBUS generation.
>>>>
>>>> Currently, this series is useful for people that uses a SBI that does
>>>> not handled misaligned traps. In a near future, this series will make
>>>> use a SBI extension [1] allowing to request delegation of the
>>>> misaligned load/store traps to the S-mode software. This extension has
>>>> been submitted for review to the riscv tech-prs group. An OpenSBI
>>>> implementation for this spec is available at [2].
>>>>
>>>> This series can be tested using the spike simulator [3] and an openSBI
>>>> version [4] which allows to always delegate misaligned load/store to
>>>> S-mode.
>>>
>>> Some patches in this series do not build for any configs, some are
>>> broken for clang builds and others are broken for nommu. Please try to> build test this more thoroughly before you submit the next version.
>>
>> Hi Conor,
>>
>> Thanks for the feedback, I'll check that.
>>
>>>
>>> Also, AIUI, this series should be marked RFC since the SBI extension
>>> this relies on has not been frozen.
>>
>> This series does not actually uses the SBI extension but provides a way
>> to detect if misaligned accesses are not handled by hardware nor by the
>> SBI. It has been reported by Ron & Daniel they they have a minimal SBI
>> implementation that does not handle misaligned accesses and that they
>> would like to make use of the PR_SET_UNALIGN feature. This is what this
>> series addresses (and thus does not depend on the mentioned SBI extension).
> 
> Ah, I must have misread then. Apologies.

No worries, maybe I should actually remove this from the cover letter to
avoid any confusion !

Clément
ron minnich Oct. 2, 2023, 3:32 p.m. UTC | #8
This was a very interesting read. One other thought crossed my mind,
which is that a RISC-V implementation might make the alignment
delegation hard-wired to always delegate to S mode. I.e, the bit might
be WARL and always 1. For what I'm doing, this would actually be
pretty convenient. Just want to make sure this code can accommodate
that -- wdyt?

We have found lots of value in our experiments with delegating
alignment traps to Linux -- not least because they tend to locate
problems in the kernel :-) -- we've found issues in module loading,
early startup (there's a needed .align2 directive for sbi secondary
startup, AFAICT) and the timing code for misaligned load/store
handling.

I don't know how you test this unaligned trap handling, but it might
be worthwhile to work that out. You can test via oreboot and the
visionfive2, save we have not figured out why SMP startup is going
wrong, yet :-), so we're not as feature-complete as needed. But soon.

Thanks!

On Mon, Oct 2, 2023 at 5:19 AM Clément Léger <cleger@rivosinc.com> wrote:
>
>
>
> On 02/10/2023 12:49, Conor Dooley wrote:
> > On Mon, Oct 02, 2023 at 09:40:04AM +0200, Clément Léger wrote:
> >>
> >>
> >> On 30/09/2023 11:23, Conor Dooley wrote:
> >>> On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
> >>>> Since commit 61cadb9 ("Provide new description of misaligned load/store
> >>>> behavior compatible with privileged architecture.") in the RISC-V ISA
> >>>> manual, it is stated that misaligned load/store might not be supported.
> >>>> However, the RISC-V kernel uABI describes that misaligned accesses are
> >>>> supported. In order to support that, this series adds support for S-mode
> >>>> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
> >>>>
> >>>> Handling misaligned access in kernel allows for a finer grain control
> >>>> of the misaligned accesses behavior, and thanks to the prctl call, can
> >>>> allow disabling misaligned access emulation to generate SIGBUS. User
> >>>> space can then optimize its software by removing such access based on
> >>>> SIGBUS generation.
> >>>>
> >>>> Currently, this series is useful for people that uses a SBI that does
> >>>> not handled misaligned traps. In a near future, this series will make
> >>>> use a SBI extension [1] allowing to request delegation of the
> >>>> misaligned load/store traps to the S-mode software. This extension has
> >>>> been submitted for review to the riscv tech-prs group. An OpenSBI
> >>>> implementation for this spec is available at [2].
> >>>>
> >>>> This series can be tested using the spike simulator [3] and an openSBI
> >>>> version [4] which allows to always delegate misaligned load/store to
> >>>> S-mode.
> >>>
> >>> Some patches in this series do not build for any configs, some are
> >>> broken for clang builds and others are broken for nommu. Please try to> build test this more thoroughly before you submit the next version.
> >>
> >> Hi Conor,
> >>
> >> Thanks for the feedback, I'll check that.
> >>
> >>>
> >>> Also, AIUI, this series should be marked RFC since the SBI extension
> >>> this relies on has not been frozen.
> >>
> >> This series does not actually uses the SBI extension but provides a way
> >> to detect if misaligned accesses are not handled by hardware nor by the
> >> SBI. It has been reported by Ron & Daniel they they have a minimal SBI
> >> implementation that does not handle misaligned accesses and that they
> >> would like to make use of the PR_SET_UNALIGN feature. This is what this
> >> series addresses (and thus does not depend on the mentioned SBI extension).
> >
> > Ah, I must have misread then. Apologies.
>
> No worries, maybe I should actually remove this from the cover letter to
> avoid any confusion !
>
> Clément
Jessica Clarke Oct. 2, 2023, 10:22 p.m. UTC | #9
On 2 Oct 2023, at 16:32, ron minnich <rminnich@gmail.com> wrote:
> 
> This was a very interesting read. One other thought crossed my mind,
> which is that a RISC-V implementation might make the alignment
> delegation hard-wired to always delegate to S mode. I.e, the bit might
> be WARL and always 1. For what I'm doing, this would actually be
> pretty convenient. Just want to make sure this code can accommodate
> that -- wdyt?

Such an implementation would violate the spec:

  An implementation shall not have any bits of medeleg be read-only
  one, i.e., any synchronous trap that can be delegated must support not
  being delegated.

Supporting that is thus out of scope.

Jess

> We have found lots of value in our experiments with delegating
> alignment traps to Linux -- not least because they tend to locate
> problems in the kernel :-) -- we've found issues in module loading,
> early startup (there's a needed .align2 directive for sbi secondary
> startup, AFAICT) and the timing code for misaligned load/store
> handling.
> 
> I don't know how you test this unaligned trap handling, but it might
> be worthwhile to work that out. You can test via oreboot and the
> visionfive2, save we have not figured out why SMP startup is going
> wrong, yet :-), so we're not as feature-complete as needed. But soon.
> 
> Thanks!
> 
> On Mon, Oct 2, 2023 at 5:19 AM Clément Léger <cleger@rivosinc.com> wrote:
>> 
>> 
>> 
>> On 02/10/2023 12:49, Conor Dooley wrote:
>>> On Mon, Oct 02, 2023 at 09:40:04AM +0200, Clément Léger wrote:
>>>> 
>>>> 
>>>> On 30/09/2023 11:23, Conor Dooley wrote:
>>>>> On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
>>>>>> Since commit 61cadb9 ("Provide new description of misaligned load/store
>>>>>> behavior compatible with privileged architecture.") in the RISC-V ISA
>>>>>> manual, it is stated that misaligned load/store might not be supported.
>>>>>> However, the RISC-V kernel uABI describes that misaligned accesses are
>>>>>> supported. In order to support that, this series adds support for S-mode
>>>>>> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
>>>>>> 
>>>>>> Handling misaligned access in kernel allows for a finer grain control
>>>>>> of the misaligned accesses behavior, and thanks to the prctl call, can
>>>>>> allow disabling misaligned access emulation to generate SIGBUS. User
>>>>>> space can then optimize its software by removing such access based on
>>>>>> SIGBUS generation.
>>>>>> 
>>>>>> Currently, this series is useful for people that uses a SBI that does
>>>>>> not handled misaligned traps. In a near future, this series will make
>>>>>> use a SBI extension [1] allowing to request delegation of the
>>>>>> misaligned load/store traps to the S-mode software. This extension has
>>>>>> been submitted for review to the riscv tech-prs group. An OpenSBI
>>>>>> implementation for this spec is available at [2].
>>>>>> 
>>>>>> This series can be tested using the spike simulator [3] and an openSBI
>>>>>> version [4] which allows to always delegate misaligned load/store to
>>>>>> S-mode.
>>>>> 
>>>>> Some patches in this series do not build for any configs, some are
>>>>> broken for clang builds and others are broken for nommu. Please try to> build test this more thoroughly before you submit the next version.
>>>> 
>>>> Hi Conor,
>>>> 
>>>> Thanks for the feedback, I'll check that.
>>>> 
>>>>> 
>>>>> Also, AIUI, this series should be marked RFC since the SBI extension
>>>>> this relies on has not been frozen.
>>>> 
>>>> This series does not actually uses the SBI extension but provides a way
>>>> to detect if misaligned accesses are not handled by hardware nor by the
>>>> SBI. It has been reported by Ron & Daniel they they have a minimal SBI
>>>> implementation that does not handle misaligned accesses and that they
>>>> would like to make use of the PR_SET_UNALIGN feature. This is what this
>>>> series addresses (and thus does not depend on the mentioned SBI extension).
>>> 
>>> Ah, I must have misread then. Apologies.
>> 
>> No worries, maybe I should actually remove this from the cover letter to
>> avoid any confusion !
>> 
>> Clément
> 
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
Clément Léger Oct. 3, 2023, 8:12 a.m. UTC | #10
On 02/10/2023 17:32, ron minnich wrote:
> This was a very interesting read. One other thought crossed my mind,
> which is that a RISC-V implementation might make the alignment
> delegation hard-wired to always delegate to S mode. I.e, the bit might
> be WARL and always 1. For what I'm doing, this would actually be
> pretty convenient. Just want to make sure this code can accommodate
> that -- wdyt?

Hi Ron,

This series does not really care about "how" misaligned load/store are
delegated, it only tries to check if misaligned load/store are handled
by the kernel. So whatever you decide to do to delegate that is a bit
out of the scope of this series.

> 
> We have found lots of value in our experiments with delegating
> alignment traps to Linux -- not least because they tend to locate
> problems in the kernel :-) -- we've found issues in module loading,
> early startup (there's a needed .align2 directive for sbi secondary
> startup, AFAICT) and the timing code for misaligned load/store
> handling.>
> I don't know how you test this unaligned trap handling, but it might
> be worthwhile to work that out. You can test via oreboot and the
> visionfive2, save we have not figured out why SMP startup is going
> wrong, yet :-), so we're not as feature-complete as needed. But soon.

I test that on spike (which does not handle misaligned accesses contrary
to qemu) using a userspace program that actually exercise all kind of
standard load/store instructions as well as FPU ones with different
registers. Regarding the kernel, you are right that I might be lacking a
few tests though. I'll also consider using a visionfive2 board to
validate that on real hardware.

Thanks,

Clément

> 
> Thanks!
> 
> On Mon, Oct 2, 2023 at 5:19 AM Clément Léger <cleger@rivosinc.com> wrote:
>>
>>
>>
>> On 02/10/2023 12:49, Conor Dooley wrote:
>>> On Mon, Oct 02, 2023 at 09:40:04AM +0200, Clément Léger wrote:
>>>>
>>>>
>>>> On 30/09/2023 11:23, Conor Dooley wrote:
>>>>> On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
>>>>>> Since commit 61cadb9 ("Provide new description of misaligned load/store
>>>>>> behavior compatible with privileged architecture.") in the RISC-V ISA
>>>>>> manual, it is stated that misaligned load/store might not be supported.
>>>>>> However, the RISC-V kernel uABI describes that misaligned accesses are
>>>>>> supported. In order to support that, this series adds support for S-mode
>>>>>> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
>>>>>>
>>>>>> Handling misaligned access in kernel allows for a finer grain control
>>>>>> of the misaligned accesses behavior, and thanks to the prctl call, can
>>>>>> allow disabling misaligned access emulation to generate SIGBUS. User
>>>>>> space can then optimize its software by removing such access based on
>>>>>> SIGBUS generation.
>>>>>>
>>>>>> Currently, this series is useful for people that uses a SBI that does
>>>>>> not handled misaligned traps. In a near future, this series will make
>>>>>> use a SBI extension [1] allowing to request delegation of the
>>>>>> misaligned load/store traps to the S-mode software. This extension has
>>>>>> been submitted for review to the riscv tech-prs group. An OpenSBI
>>>>>> implementation for this spec is available at [2].
>>>>>>
>>>>>> This series can be tested using the spike simulator [3] and an openSBI
>>>>>> version [4] which allows to always delegate misaligned load/store to
>>>>>> S-mode.
>>>>>
>>>>> Some patches in this series do not build for any configs, some are
>>>>> broken for clang builds and others are broken for nommu. Please try to> build test this more thoroughly before you submit the next version.
>>>>
>>>> Hi Conor,
>>>>
>>>> Thanks for the feedback, I'll check that.
>>>>
>>>>>
>>>>> Also, AIUI, this series should be marked RFC since the SBI extension
>>>>> this relies on has not been frozen.
>>>>
>>>> This series does not actually uses the SBI extension but provides a way
>>>> to detect if misaligned accesses are not handled by hardware nor by the
>>>> SBI. It has been reported by Ron & Daniel they they have a minimal SBI
>>>> implementation that does not handle misaligned accesses and that they
>>>> would like to make use of the PR_SET_UNALIGN feature. This is what this
>>>> series addresses (and thus does not depend on the mentioned SBI extension).
>>>
>>> Ah, I must have misread then. Apologies.
>>
>> No worries, maybe I should actually remove this from the cover letter to
>> avoid any confusion !
>>
>> Clément
Atish Kumar Patra Oct. 3, 2023, 8:40 a.m. UTC | #11
On Thu, Sep 28, 2023 at 9:48 AM Evan Green <evan@rivosinc.com> wrote:
>
> On Thu, Sep 28, 2023 at 12:49 AM Clément Léger <cleger@rivosinc.com> wrote:
> >
> >
> >
> > On 26/09/2023 23:43, Evan Green wrote:
> > > On Tue, Sep 26, 2023 at 8:03 AM Clément Léger <cleger@rivosinc.com> wrote:
> > >>
> > >> Since commit 61cadb9 ("Provide new description of misaligned load/store
> > >> behavior compatible with privileged architecture.") in the RISC-V ISA
> > >> manual, it is stated that misaligned load/store might not be supported.
> > >> However, the RISC-V kernel uABI describes that misaligned accesses are
> > >> supported. In order to support that, this series adds support for S-mode
> > >> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
> > >>
> > >> Handling misaligned access in kernel allows for a finer grain control
> > >> of the misaligned accesses behavior, and thanks to the prctl call, can
> > >> allow disabling misaligned access emulation to generate SIGBUS. User
> > >> space can then optimize its software by removing such access based on
> > >> SIGBUS generation.
> > >>
> > >> Currently, this series is useful for people that uses a SBI that does
> > >> not handled misaligned traps. In a near future, this series will make
> > >> use a SBI extension [1] allowing to request delegation of the
> > >> misaligned load/store traps to the S-mode software. This extension has
> > >> been submitted for review to the riscv tech-prs group. An OpenSBI
> > >> implementation for this spec is available at [2].
> > >
> > > For my own education, how does the new SBI call behave with respect to
> > > multiple harts? Does a call to change a feature perform that change
> > > across all harts, or just the hart the SBI call was made on? If the
> > > answer is "all harts", what if not all harts are exactly the same, and
> > > some can enable the feature switch while others cannot? Also if the
> > > answer is "all harts", does it also apply to hotplugged cpus, which
> > > may not have even existed at boot time?
> >
> > Depending on the feature, they can be either global (all harts) or
> > local (calling hart). The medeleg register is per hart and thus
> > misaligned load/store delegation for S-mode is also per hart.
>
> We should probably state this in the spec update then, both generally
> and for each specific feature added. Otherwise firmware writers are
> left not knowing if they're supposed to spread a feature across to all
> cores or not.
>

If a feature is required to update any CSR, it must be per hart only.
The supervisor software
is aware of the state of each hart and it should invoke it from all
the present harts.

Doing it in M-mode will result in M-mode IPIs and seems racy with what
kernel might be doing at that time.

> >
> >
> > >
> > > What happens if a hart goes through a context loss event, like
> > > suspend/resume? Is the setting expected to be sticky, or is the kernel
> > > expected to replay these calls?
> >
> > That is a good question that we did not actually clarified yet. Thanks
> > for raising it !
>

IMO, it should be sticky until a reset. I would be interested to hear
other thoughts though if there is a non-sticky
use-case.


> No problem! This may also need to be specified per-feature in the
> spec. I have a vague hunch that it's better to ask the kernel to do it
> on resume, though ideally we'd have the terminology (and I don't think
> we do?) to specify exactly which points constitute a context loss.
> Mostly I'm remembering the x86 and ARM transition from S3, where lots
> of firmware code ran at resume, to S0ix-like power states, where
> things resumed directly into the OS and they had to figure out how to
> do it without firmware. The vague hunch is that keeping the laundry
> list of things firmware must do on resume low might keep us from
> getting in S0ix's way, but it's all so speculative it's hard to know
> if it's really a useful hunch or not.
>
> -Evan
ron minnich Oct. 3, 2023, 3:37 p.m. UTC | #12
While it is true that it violates the spec today, given the fluidity
of the spec of the last 10 years, I'm not sure that matters :-)

Anyway, that's out of scope for this discussion, though I appreciate
your clarification. I'll bring it up elsewhere.

Clement points out that this series would work fine if that bit were
hardwired to 1, which is all I care about.

thanks

On Mon, Oct 2, 2023 at 4:23 PM Jessica Clarke <jrtc27@jrtc27.com> wrote:
>
> On 2 Oct 2023, at 16:32, ron minnich <rminnich@gmail.com> wrote:
> >
> > This was a very interesting read. One other thought crossed my mind,
> > which is that a RISC-V implementation might make the alignment
> > delegation hard-wired to always delegate to S mode. I.e, the bit might
> > be WARL and always 1. For what I'm doing, this would actually be
> > pretty convenient. Just want to make sure this code can accommodate
> > that -- wdyt?
>
> Such an implementation would violate the spec:
>
>   An implementation shall not have any bits of medeleg be read-only
>   one, i.e., any synchronous trap that can be delegated must support not
>   being delegated.
>
> Supporting that is thus out of scope.
>
> Jess
>
> > We have found lots of value in our experiments with delegating
> > alignment traps to Linux -- not least because they tend to locate
> > problems in the kernel :-) -- we've found issues in module loading,
> > early startup (there's a needed .align2 directive for sbi secondary
> > startup, AFAICT) and the timing code for misaligned load/store
> > handling.
> >
> > I don't know how you test this unaligned trap handling, but it might
> > be worthwhile to work that out. You can test via oreboot and the
> > visionfive2, save we have not figured out why SMP startup is going
> > wrong, yet :-), so we're not as feature-complete as needed. But soon.
> >
> > Thanks!
> >
> > On Mon, Oct 2, 2023 at 5:19 AM Clément Léger <cleger@rivosinc.com> wrote:
> >>
> >>
> >>
> >> On 02/10/2023 12:49, Conor Dooley wrote:
> >>> On Mon, Oct 02, 2023 at 09:40:04AM +0200, Clément Léger wrote:
> >>>>
> >>>>
> >>>> On 30/09/2023 11:23, Conor Dooley wrote:
> >>>>> On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
> >>>>>> Since commit 61cadb9 ("Provide new description of misaligned load/store
> >>>>>> behavior compatible with privileged architecture.") in the RISC-V ISA
> >>>>>> manual, it is stated that misaligned load/store might not be supported.
> >>>>>> However, the RISC-V kernel uABI describes that misaligned accesses are
> >>>>>> supported. In order to support that, this series adds support for S-mode
> >>>>>> handling of misaligned accesses as well support for prctl(PR_UNALIGN).
> >>>>>>
> >>>>>> Handling misaligned access in kernel allows for a finer grain control
> >>>>>> of the misaligned accesses behavior, and thanks to the prctl call, can
> >>>>>> allow disabling misaligned access emulation to generate SIGBUS. User
> >>>>>> space can then optimize its software by removing such access based on
> >>>>>> SIGBUS generation.
> >>>>>>
> >>>>>> Currently, this series is useful for people that uses a SBI that does
> >>>>>> not handled misaligned traps. In a near future, this series will make
> >>>>>> use a SBI extension [1] allowing to request delegation of the
> >>>>>> misaligned load/store traps to the S-mode software. This extension has
> >>>>>> been submitted for review to the riscv tech-prs group. An OpenSBI
> >>>>>> implementation for this spec is available at [2].
> >>>>>>
> >>>>>> This series can be tested using the spike simulator [3] and an openSBI
> >>>>>> version [4] which allows to always delegate misaligned load/store to
> >>>>>> S-mode.
> >>>>>
> >>>>> Some patches in this series do not build for any configs, some are
> >>>>> broken for clang builds and others are broken for nommu. Please try to> build test this more thoroughly before you submit the next version.
> >>>>
> >>>> Hi Conor,
> >>>>
> >>>> Thanks for the feedback, I'll check that.
> >>>>
> >>>>>
> >>>>> Also, AIUI, this series should be marked RFC since the SBI extension
> >>>>> this relies on has not been frozen.
> >>>>
> >>>> This series does not actually uses the SBI extension but provides a way
> >>>> to detect if misaligned accesses are not handled by hardware nor by the
> >>>> SBI. It has been reported by Ron & Daniel they they have a minimal SBI
> >>>> implementation that does not handle misaligned accesses and that they
> >>>> would like to make use of the PR_SET_UNALIGN feature. This is what this
> >>>> series addresses (and thus does not depend on the mentioned SBI extension).
> >>>
> >>> Ah, I must have misread then. Apologies.
> >>
> >> No worries, maybe I should actually remove this from the cover letter to
> >> avoid any confusion !
> >>
> >> Clément
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv
>
ron minnich Oct. 3, 2023, 3:39 p.m. UTC | #13
On Tue, Oct 3, 2023 at 2:40 AM Atish Kumar Patra <atishp@rivosinc.com> wrote:

> IMO, it should be sticky until a reset.

that's a wonderful idea; perhaps this discussion needs to be held
elsewhere, esp. as it regards possible violation of the spec. Thanks!
David Laight Oct. 4, 2023, 8:26 a.m. UTC | #14
From: Clément Léger
> Sent: 02 October 2023 08:40
> 
> On 30/09/2023 11:23, Conor Dooley wrote:
> > On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
> >> Since commit 61cadb9 ("Provide new description of misaligned load/store
> >> behavior compatible with privileged architecture.") in the RISC-V ISA
> >> manual, it is stated that misaligned load/store might not be supported.
> >> However, the RISC-V kernel uABI describes that misaligned accesses are
> >> supported.
...

That it just really horrid.
If the cpu is going to trap misaligned accesses then you want
The compiler generated code (ie packed data) not to generate
misaligned accesses.
So you have to change the kernel uABI.

OTOH if you known that such accesses won't fault and will be
not really slower than aligned accesses then optimised versions
of some functions (like memcpy and checksums) can use misaligned
accesses.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Clément Léger Oct. 4, 2023, 10:03 a.m. UTC | #15
On 04/10/2023 10:26, David Laight wrote:
> From: Clément Léger
>> Sent: 02 October 2023 08:40
>>
>> On 30/09/2023 11:23, Conor Dooley wrote:
>>> On Tue, Sep 26, 2023 at 05:03:09PM +0200, Clément Léger wrote:
>>>> Since commit 61cadb9 ("Provide new description of misaligned load/store
>>>> behavior compatible with privileged architecture.") in the RISC-V ISA
>>>> manual, it is stated that misaligned load/store might not be supported.
>>>> However, the RISC-V kernel uABI describes that misaligned accesses are
>>>> supported.
> ...
> 
> That it just really horrid.
> If the cpu is going to trap misaligned accesses then you want
> The compiler generated code (ie packed data) not to generate
> misaligned accesses.

Hi David,

Saying that you support misaligned access does not mean that they are
going to be efficient, just that they are supported (in fact, the uABI
state that they may perform poorly). The compiler is actually not so
stupid and will try to do as much aligned access as possible in what it
generates (unless forced by some assembly, cast or whatever that can
screw up alignment accesses). This is already the case and it will most
probably not change.

> So you have to change the kernel uABI.

I don't think so. Rule N°1 for kernel development is "Don't break the
userspace". So if changing the RISC-V uABI to say "misaligned accesses
are not supported", that is unlikely to happen. We stated that
misaligned access are supported and thus, they will continue to be
supported.

> 
> OTOH if you known that such accesses won't fault and will be
> not really slower than aligned accesses then optimised versions
> of some functions (like memcpy and checksums) can use misaligned
> accesses.

Yes, this is selected by HAVE_EFFICIENT_UNALIGNED_ACCESS. On RISC-V,
since the specification says nothing about the efficiency of such
access, we can't select it like that. Some RISC-V based SoC/CPUs might
want to select it manually in their config. In order to support that
dynamically and in a generic way, some future work could involve using
static keys for such alternatives and enable it based on the speed that
was detected.

Thanks,

Clément

> 
> 	David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
David Laight Oct. 4, 2023, 2:10 p.m. UTC | #16
...
> Saying that you support misaligned access does not mean that they are
> going to be efficient, just that they are supported (in fact, the uABI
> state that they may perform poorly). The compiler is actually not so
> stupid and will try to do as much aligned access as possible in what it
> generates (unless forced by some assembly, cast or whatever that can
> screw up alignment accesses). This is already the case and it will most
> probably not change.

I did a quick check.

https://godbolt.org/z/j3e9drv4e

The code generated by both clang and gcc for misaligned reads is horrid.
Gcc does a better job if the alignment is known but generates much
the same as the clang code when it isn't.

The C code is much shorter.
Even though both gcc and clang add a (different) instruction to it

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)