diff mbox series

[v1,1/4] xen: introduce CONFIG_GENERIC_BUG_FRAME

Message ID 8adf4aeff96750982e3d670cb3aed11553d546d5.1675441720.git.oleksii.kurochko@gmail.com (mailing list archive)
State Superseded
Headers show
Series introduce generic implementation of macros from bug.h | expand

Commit Message

Oleksii Kurochko Feb. 3, 2023, 5:05 p.m. UTC
A large part of the content of the bug.h is repeated among all
architectures, so it was decided to create a new config
CONFIG_GENERIC_BUG_FRAME.

The version of <bug.h> from ARM was taken as the base version,
as it looks the most portable.

The patch introduces the following stuff:
  * common bug.h header
  * generic implementation of do_bug_frame()
  * new config CONFIG_GENERIC_BUG_FRAME

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
 xen/common/Kconfig    |   6 ++
 xen/common/Makefile   |   1 +
 xen/common/bug.c      |  88 +++++++++++++++++++++++++++++
 xen/include/xen/bug.h | 127 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 222 insertions(+)
 create mode 100644 xen/common/bug.c
 create mode 100644 xen/include/xen/bug.h

Comments

Jan Beulich Feb. 13, 2023, 12:24 p.m. UTC | #1
On 03.02.2023 18:05, Oleksii Kurochko wrote:
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -92,6 +92,12 @@ config STATIC_MEMORY
>  
>  	  If unsure, say N.
>  
> +config GENERIC_DO_BUG_FRAME
> +	bool
> +	help
> +	  Generic do_bug_frame() function is needed to handle the type of bug
> +	  frame and print an information about it.

Generally help text for prompt-less functions is not very useful. Assuming
it is put here to inform people actually reading the source file, I'm okay
for it to be left here, but please at least drop the stray "an". I also
think this may want moving up in the file, e.g. ahead of all the HAS_*
controls (which, as you will notice, all have no help text either). Plus
finally how about shorter and more applicable GENERIC_BUG_FRAME - after
all what becomes generic is more than just do_bug_frame()?

> --- /dev/null
> +++ b/xen/common/bug.c
> @@ -0,0 +1,88 @@
> +#include <xen/bug.h>
> +#include <xen/errno.h>
> +#include <xen/types.h>
> +#include <xen/kernel.h>

Please order headers alphabetically unless there's anything preventing
that from being done.

> +#include <xen/string.h>
> +#include <xen/virtual_region.h>
> +
> +#include <asm/processor.h>
> +
> +int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)

I know Arm is using vaddr_t and RISC-V now also has it, but in UNIX-like
environments that's redundant with "unsigned long", and it's also
redundant with C99's uintptr_t. Hence when seeing the type I always
wonder whether it's really a host virtual address which is meant (and
not perhaps a guest one, which in principle could differ from the host
one for certain guest types). In any event the existence of this type
should imo not be a prereq to using this generic piece of infrastructure.

> +{
> +    const struct bug_frame *bug = NULL;
> +    const char *prefix = "", *filename, *predicate;
> +    unsigned long fixup;
> +    int id = -1, lineno;

For both of these I question them needing to be of a signed type.

> +    const struct virtual_region *region;
> +
> +    region = find_text_region(pc);
> +    if ( region )
> +    {
> +        for ( id = 0; id < BUGFRAME_NR; id++ )
> +        {
> +            const struct bug_frame *b;
> +            unsigned int i;
> +
> +            for ( i = 0, b = region->frame[id].bugs;
> +                  i < region->frame[id].n_bugs; b++, i++ )
> +            {
> +                if ( ((vaddr_t)bug_loc(b)) == pc )

Afaics this is the sole use of bug_loc(). If so, better change the macro
to avoid the need for a cast here:

#define bug_loc(b) ((unsigned long)(b) + (b)->loc_disp)

> --- /dev/null
> +++ b/xen/include/xen/bug.h
> @@ -0,0 +1,127 @@
> +#ifndef __XEN_BUG_H__
> +#define __XEN_BUG_H__
> +
> +#define BUG_DISP_WIDTH    24
> +#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
> +#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
> +
> +#define BUGFRAME_run_fn 0
> +#define BUGFRAME_warn   1
> +#define BUGFRAME_bug    2
> +#define BUGFRAME_assert 3
> +
> +#define BUGFRAME_NR     4
> +
> +#ifndef __ASSEMBLY__
> +
> +#include <xen/errno.h>
> +#include <xen/stringify.h>
> +#include <xen/types.h>
> +#include <xen/lib.h>

Again - please sort headers.

> +#include <asm/bug.h>
> +
> +#ifndef BUG_FRAME_STUFF
> +struct bug_frame {

Please can we have a blank line between the above two ones and then similarly
ahead of the #endif?

> +    signed int loc_disp;    /* Relative address to the bug address */
> +    signed int file_disp;   /* Relative address to the filename */
> +    signed int msg_disp;    /* Relative address to the predicate (for ASSERT) */
> +    uint16_t line;          /* Line number */
> +    uint32_t pad0:16;       /* Padding for 8-bytes align */

Already the original comment in Arm code is wrong: The padding doesn't
guarantee 8-byte alignment; it merely guarantees a multiple of 8 bytes
size. Aiui there's also no need for 8-byte alignment anywhere here (in
fact there's ".p2align 2" in the generator macros).

I also wonder why this needs to be a named bitfield: Either make it
plain uint16_t, or if you use a bitfield, then omit the name.

> +};
> +
> +#define bug_loc(b) ((const void *)(b) + (b)->loc_disp)
> +#define bug_file(b) ((const void *)(b) + (b)->file_disp);
> +#define bug_line(b) ((b)->line)
> +#define bug_msg(b) ((const char *)(b) + (b)->msg_disp)
> +#endif /* BUG_FRAME_STUFF */
> +
> +#ifndef BUG_FRAME
> +/* Many versions of GCC doesn't support the asm %c parameter which would
> + * be preferable to this unpleasantness. We use mergeable string
> + * sections to avoid multiple copies of the string appearing in the
> + * Xen image. BUGFRAME_run_fn needs to be handled separately.
> + */

When generalizing the logic, I wonder in how far the comment doesn't
want re-wording some. For example, while Arm prefixes constant insn
operands with # (and x86 uses $), there's no such prefix in RISC-V. At
which point there's no need to use %c in the first place. (Which in
turn means x86'es more compact representation may also be usable there.
And yet in turn the question arises whether RISC-V wouldn't better have
its own derivation of the machinery, rather than trying to generalize
things. RISC-V's would then likely be closer to x86'es, just without
the use of %c on asm() operands. Which might then suggest to rather
generalize x86'es variant down the road.)

At the very least the comment's style wants correcting, and in the first
sentence s/doesn't/don't/. Also %c isn't a parameter, but a modifier.

> +#define BUG_FRAME(type, line, file, has_msg, msg) do {                      \
> +    BUILD_BUG_ON((line) >> 16);                                             \
> +    BUILD_BUG_ON((type) >= BUGFRAME_NR);                                    \
> +    asm ("1:"BUG_INSTR"\n"                                                  \

Nit: Style (missing blank after opening parenthesis, and then also at the
end of the construct ahead of the closing parenthesis).

> +         ".pushsection .rodata.str, \"aMS\", %progbits, 1\n"                \
> +         "2:\t.asciz " __stringify(file) "\n"                               \

file is always a string literal; really it always is __FILE__ in this
non-x86 implementation. So first preference ought to be to drop the
macro parameter and use __FILE__ here (same then for line vs __LINE__).
Yet even if not done like that, the __stringify() is largely unneeded
(unless we expect file names to have e.g. backslashes in their names)
and looks somewhat odd here. So how about

         "2:\t.asciz \"" __FILE__ "\"\n"

? But wait - peeking ahead to the x86 patch I notice that __FILE__ and
__LINE__ need to remain arguments. But then the second preference would
still be

         "2:\t.asciz \"" file "\"\n"

imo. Yet maybe others disagree ...

> +         "3:\n"                                                             \
> +         ".if " #has_msg "\n"                                               \
> +         "\t.asciz " #msg "\n"                                              \
> +         ".endif\n"                                                         \
> +         ".popsection\n"                                                    \
> +         ".pushsection .bug_frames." __stringify(type) ", \"a\", %progbits\n"\
> +         "4:\n"                                                             \
> +         ".p2align 2\n"                                                     \
> +         ".long (1b - 4b)\n"                                                \
> +         ".long (2b - 4b)\n"                                                \
> +         ".long (3b - 4b)\n"                                                \
> +         ".hword " __stringify(line) ", 0\n"                                \

Hmm, .hword. How generic is support for that in assemblers? I notice even
very old gas has support for it, but architectures may not consider it
two bytes wide. (On x86, for example, it's bogus to be two bytes, since
.word also produces 2 bytes of data. And indeed MIPS and NDS32 override it
in gas to produce 1 byte of data only, at least in certain cases.) I'd
like to suggest to use a fourth .long here, and to drop the padding field
from struct bug_frame in exchange.

Furthermore, once assembly code is generalized, you need to pay attention
to formatting: Only labels may start at the beginning of a line; in
particular directives never should.

> +         ".popsection");                                                    \
> +} while (0)

Nit: Style (missing blanks, and preferably "false" instead of "0").

> +#endif /* BUG_FRAME */
> +
> +#ifndef run_in_exception_handler
> +/*
> + * GCC will not allow to use "i"  when PIE is enabled (Xen doesn't set the
> + * flag but instead rely on the default value from the compiler). So the
> + * easiest way to implement run_in_exception_handler() is to pass the to
> + * be called function in a fixed register.
> + */
> +#define  run_in_exception_handler(fn) do {                                  \

Nit: Just one blank please after #define (and on the first comment line
there's also a double blank where only one should be).

> +    register void *fn_ asm(__stringify(BUG_FN_REG)) = (fn);                 \

x86 makes sure "fn" is of correct type. Arm doesn't, which likely is a
mistake. May I ask that you use the correct type here (which is even
better than x86'es extra checking construct):

    register void (*fn_)(struct cpu_user_regs *) asm(__stringify(BUG_FN_REG)) = (fn);

> +    asm ("1:"BUG_INSTR"\n"                                                  \
> +         ".pushsection .bug_frames." __stringify(BUGFRAME_run_fn) ","       \
> +         "             \"a\", %%progbits\n"                                 \
> +         "2:\n"                                                             \
> +         ".p2align 2\n"                                                     \
> +         ".long (1b - 2b)\n"                                                \
> +         ".long 0, 0, 0\n"                                                  \
> +         ".popsection" :: "r" (fn_));                                       \
> +} while (0)
> +#endif /* run_in_exception_handler */
> +
> +#ifndef WARN
> +#define WARN() BUG_FRAME(BUGFRAME_warn, __LINE__, __FILE__, 0, "")
> +#endif /* WARN */
> +
> +#ifndef BUG
> +#define BUG() do {                                              \
> +    BUG_FRAME(BUGFRAME_bug,  __LINE__, __FILE__, 0, "");        \
> +    unreachable();                                              \
> +} while (0)
> +#endif
> +
> +#ifndef assert_failed
> +#define assert_failed(msg) do {                                 \
> +    BUG_FRAME(BUGFRAME_assert, __LINE__, __FILE__, 1, msg);     \
> +    unreachable();                                              \
> +} while (0)
> +#endif
> +
> +extern const struct bug_frame __start_bug_frames[],
> +                              __stop_bug_frames_0[],
> +                              __stop_bug_frames_1[],
> +                              __stop_bug_frames_2[],
> +                              __stop_bug_frames_3[];
> +
> +#else /* !__ASSEMBLY__ */
> +
> +#ifdef CONFIG_X86
> +#include <asm/bug.h>
> +#endif

Why this x86 special? (To be precise: Why can this not be done uniformly?)

Jan
Julien Grall Feb. 13, 2023, 1:19 p.m. UTC | #2
On 13/02/2023 12:24, Jan Beulich wrote:
> On 03.02.2023 18:05, Oleksii Kurochko wrote:
>> --- a/xen/common/Kconfig
>> +++ b/xen/common/Kconfig
>> @@ -92,6 +92,12 @@ config STATIC_MEMORY
>>   
>>   	  If unsure, say N.
>>   
>> +config GENERIC_DO_BUG_FRAME
>> +	bool
>> +	help
>> +	  Generic do_bug_frame() function is needed to handle the type of bug
>> +	  frame and print an information about it.
> 
> Generally help text for prompt-less functions is not very useful. Assuming
> it is put here to inform people actually reading the source file, I'm okay
> for it to be left here, but please at least drop the stray "an". I also
> think this may want moving up in the file, e.g. ahead of all the HAS_*
> controls (which, as you will notice, all have no help text either). Plus
> finally how about shorter and more applicable GENERIC_BUG_FRAME - after
> all what becomes generic is more than just do_bug_frame()?
> 
>> --- /dev/null
>> +++ b/xen/common/bug.c
>> @@ -0,0 +1,88 @@
>> +#include <xen/bug.h>
>> +#include <xen/errno.h>
>> +#include <xen/types.h>
>> +#include <xen/kernel.h>
> 
> Please order headers alphabetically unless there's anything preventing
> that from being done.
> 
>> +#include <xen/string.h>
>> +#include <xen/virtual_region.h>
>> +
>> +#include <asm/processor.h>
>> +
>> +int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)
> 
> I know Arm is using vaddr_t and RISC-V now also has it, but in UNIX-like
> environments that's redundant with "unsigned long", and it's also
> redundant with C99's uintptr_t. Hence when seeing the type I always
> wonder whether it's really a host virtual address which is meant (and
> not perhaps a guest one, which in principle could differ from the host
> one for certain guest types). In any event the existence of this type
> should imo not be a prereq to using this generic piece of infrastructure

C spec aside, the use "unsigned long" is quite overloaded within Xen. 
Although, this has improved since we introduced mfn_t/gfn_t.

In the future, I could imagine us to also introduce typesafe for 
vaddr_t, reducing further the risk to mix different meaning of unsigned 
long.

Therefore, I think the introduction of vaddr_t should be a prereq for 
using the generic piece of infrastructure.

> 
>> +{
>> +    const struct bug_frame *bug = NULL;
>> +    const char *prefix = "", *filename, *predicate;
>> +    unsigned long fixup;
>> +    int id = -1, lineno;
> 
> For both of these I question them needing to be of a signed type.
> 
>> +    const struct virtual_region *region;
>> +
>> +    region = find_text_region(pc);
>> +    if ( region )
>> +    {
>> +        for ( id = 0; id < BUGFRAME_NR; id++ )
>> +        {
>> +            const struct bug_frame *b;
>> +            unsigned int i;
>> +
>> +            for ( i = 0, b = region->frame[id].bugs;
>> +                  i < region->frame[id].n_bugs; b++, i++ )
>> +            {
>> +                if ( ((vaddr_t)bug_loc(b)) == pc )
> 
> Afaics this is the sole use of bug_loc(). If so, better change the macro
> to avoid the need for a cast here:
> 
> #define bug_loc(b) ((unsigned long)(b) + (b)->loc_disp)
> 
>> --- /dev/null
>> +++ b/xen/include/xen/bug.h
>> @@ -0,0 +1,127 @@
>> +#ifndef __XEN_BUG_H__
>> +#define __XEN_BUG_H__
>> +
>> +#define BUG_DISP_WIDTH    24
>> +#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
>> +#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
>> +
>> +#define BUGFRAME_run_fn 0
>> +#define BUGFRAME_warn   1
>> +#define BUGFRAME_bug    2
>> +#define BUGFRAME_assert 3
>> +
>> +#define BUGFRAME_NR     4
>> +
>> +#ifndef __ASSEMBLY__
>> +
>> +#include <xen/errno.h>
>> +#include <xen/stringify.h>
>> +#include <xen/types.h>
>> +#include <xen/lib.h>
> 
> Again - please sort headers.
> 
>> +#include <asm/bug.h>
>> +
>> +#ifndef BUG_FRAME_STUFF
>> +struct bug_frame {
> 
> Please can we have a blank line between the above two ones and then similarly
> ahead of the #endif?
> 
>> +    signed int loc_disp;    /* Relative address to the bug address */
>> +    signed int file_disp;   /* Relative address to the filename */
>> +    signed int msg_disp;    /* Relative address to the predicate (for ASSERT) */
>> +    uint16_t line;          /* Line number */
>> +    uint32_t pad0:16;       /* Padding for 8-bytes align */
> 
> Already the original comment in Arm code is wrong: The padding doesn't
> guarantee 8-byte alignment; it merely guarantees a multiple of 8 bytes
> size.
> Aiui there's also no need for 8-byte alignment anywhere here (in
> fact there's ".p2align 2" in the generator macros).

I would rather keep the pad0 here.

> 
> I also wonder why this needs to be a named bitfield: Either make it
> plain uint16_t, or if you use a bitfield, then omit the name.

Everything you seem to suggest are clean ups. So I think it would be 
better if they are first applied to Arm and then we move the code to 
common afterwards.

This will make easier to confirm what changed and also tracking the 
history (think git blame).

That said, I don't view the clean-ups as necessary in order to move the 
code in common... They could be done afterwards by Oleksii or someone else.

> 
>> +};
>> +
>> +#define bug_loc(b) ((const void *)(b) + (b)->loc_disp)
>> +#define bug_file(b) ((const void *)(b) + (b)->file_disp);
>> +#define bug_line(b) ((b)->line)
>> +#define bug_msg(b) ((const char *)(b) + (b)->msg_disp)
>> +#endif /* BUG_FRAME_STUFF */
>> +
>> +#ifndef BUG_FRAME
>> +/* Many versions of GCC doesn't support the asm %c parameter which would
>> + * be preferable to this unpleasantness. We use mergeable string
>> + * sections to avoid multiple copies of the string appearing in the
>> + * Xen image. BUGFRAME_run_fn needs to be handled separately.
>> + */
> 
> When generalizing the logic, I wonder in how far the comment doesn't
> want re-wording some. For example, while Arm prefixes constant insn
> operands with # (and x86 uses $), there's no such prefix in RISC-V. At
> which point there's no need to use %c in the first place. (Which in
> turn means x86'es more compact representation may also be usable there.
> And yet in turn the question arises whether RISC-V wouldn't better have
> its own derivation of the machinery, rather than trying to generalize
> things. RISC-V's would then likely be closer to x86'es, just without
> the use of %c on asm() operands. Which might then suggest to rather
> generalize x86'es variant down the road.)
> 
> At the very least the comment's style wants correcting, and in the first
> sentence s/doesn't/don't/. Also %c isn't a parameter, but a modifier.
> 
>> +#define BUG_FRAME(type, line, file, has_msg, msg) do {                      \
>> +    BUILD_BUG_ON((line) >> 16);                                             \
>> +    BUILD_BUG_ON((type) >= BUGFRAME_NR);                                    \
>> +    asm ("1:"BUG_INSTR"\n"                                                  \
> 
> Nit: Style (missing blank after opening parenthesis, and then also at the
> end of the construct ahead of the closing parenthesis).
> 
>> +         ".pushsection .rodata.str, \"aMS\", %progbits, 1\n"                \
>> +         "2:\t.asciz " __stringify(file) "\n"                               \
> 
> file is always a string literal; really it always is __FILE__ in this
> non-x86 implementation. So first preference ought to be to drop the
> macro parameter and use __FILE__ here (same then for line vs __LINE__).
> Yet even if not done like that, the __stringify() is largely unneeded
> (unless we expect file names to have e.g. backslashes in their names)
> and looks somewhat odd here. So how about
> 
>           "2:\t.asciz \"" __FILE__ "\"\n"
> 
> ? But wait - peeking ahead to the x86 patch I notice that __FILE__ and
> __LINE__ need to remain arguments. But then the second preference would
> still be
> 
>           "2:\t.asciz \"" file "\"\n"
> 
> imo. Yet maybe others disagree ...

I would prefer to keep the __stringify() version because it avoids 
relying on file to always be a string literal.

[...]
Jan Beulich Feb. 13, 2023, 1:30 p.m. UTC | #3
On 13.02.2023 14:19, Julien Grall wrote:
> On 13/02/2023 12:24, Jan Beulich wrote:
>> On 03.02.2023 18:05, Oleksii Kurochko wrote:
>>> --- /dev/null
>>> +++ b/xen/common/bug.c
>>> @@ -0,0 +1,88 @@
>>> +#include <xen/bug.h>
>>> +#include <xen/errno.h>
>>> +#include <xen/types.h>
>>> +#include <xen/kernel.h>
>>
>> Please order headers alphabetically unless there's anything preventing
>> that from being done.
>>
>>> +#include <xen/string.h>
>>> +#include <xen/virtual_region.h>
>>> +
>>> +#include <asm/processor.h>
>>> +
>>> +int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)
>>
>> I know Arm is using vaddr_t and RISC-V now also has it, but in UNIX-like
>> environments that's redundant with "unsigned long", and it's also
>> redundant with C99's uintptr_t. Hence when seeing the type I always
>> wonder whether it's really a host virtual address which is meant (and
>> not perhaps a guest one, which in principle could differ from the host
>> one for certain guest types). In any event the existence of this type
>> should imo not be a prereq to using this generic piece of infrastructure
> 
> C spec aside, the use "unsigned long" is quite overloaded within Xen. 
> Although, this has improved since we introduced mfn_t/gfn_t.
> 
> In the future, I could imagine us to also introduce typesafe for 
> vaddr_t, reducing further the risk to mix different meaning of unsigned 
> long.
> 
> Therefore, I think the introduction of vaddr_t should be a prereq for 
> using the generic piece of infrastructure.

Would be nice if other maintainers could share their thoughts here. I,
for one, would view a typesafe gaddr_t as reasonable, and perhaps also
a typesafe gvaddr_t, but hypervisor addresses should be fine as void *
or unsigned long.

>>> --- /dev/null
>>> +++ b/xen/include/xen/bug.h
>>> @@ -0,0 +1,127 @@
>>> +#ifndef __XEN_BUG_H__
>>> +#define __XEN_BUG_H__
>>> +
>>> +#define BUG_DISP_WIDTH    24
>>> +#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
>>> +#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
>>> +
>>> +#define BUGFRAME_run_fn 0
>>> +#define BUGFRAME_warn   1
>>> +#define BUGFRAME_bug    2
>>> +#define BUGFRAME_assert 3
>>> +
>>> +#define BUGFRAME_NR     4
>>> +
>>> +#ifndef __ASSEMBLY__
>>> +
>>> +#include <xen/errno.h>
>>> +#include <xen/stringify.h>
>>> +#include <xen/types.h>
>>> +#include <xen/lib.h>
>>
>> Again - please sort headers.
>>
>>> +#include <asm/bug.h>
>>> +
>>> +#ifndef BUG_FRAME_STUFF
>>> +struct bug_frame {
>>
>> Please can we have a blank line between the above two ones and then similarly
>> ahead of the #endif?
>>
>>> +    signed int loc_disp;    /* Relative address to the bug address */
>>> +    signed int file_disp;   /* Relative address to the filename */
>>> +    signed int msg_disp;    /* Relative address to the predicate (for ASSERT) */
>>> +    uint16_t line;          /* Line number */
>>> +    uint32_t pad0:16;       /* Padding for 8-bytes align */
>>
>> Already the original comment in Arm code is wrong: The padding doesn't
>> guarantee 8-byte alignment; it merely guarantees a multiple of 8 bytes
>> size.
>> Aiui there's also no need for 8-byte alignment anywhere here (in
>> fact there's ".p2align 2" in the generator macros).
> 
> I would rather keep the pad0 here.

May I ask why that is?

>> I also wonder why this needs to be a named bitfield: Either make it
>> plain uint16_t, or if you use a bitfield, then omit the name.
> 
> Everything you seem to suggest are clean ups. So I think it would be 
> better if they are first applied to Arm and then we move the code to 
> common afterwards.
> 
> This will make easier to confirm what changed and also tracking the 
> history (think git blame).
> 
> That said, I don't view the clean-ups as necessary in order to move the 
> code in common... They could be done afterwards by Oleksii or someone else.

I disagree: The wider the exposure / use of code, the more important it is
to be reasonably tidy. Cleaning up first and then moving is certainly fine
with me.

>>> +         ".pushsection .rodata.str, \"aMS\", %progbits, 1\n"                \
>>> +         "2:\t.asciz " __stringify(file) "\n"                               \
>>
>> file is always a string literal; really it always is __FILE__ in this
>> non-x86 implementation. So first preference ought to be to drop the
>> macro parameter and use __FILE__ here (same then for line vs __LINE__).
>> Yet even if not done like that, the __stringify() is largely unneeded
>> (unless we expect file names to have e.g. backslashes in their names)
>> and looks somewhat odd here. So how about
>>
>>           "2:\t.asciz \"" __FILE__ "\"\n"
>>
>> ? But wait - peeking ahead to the x86 patch I notice that __FILE__ and
>> __LINE__ need to remain arguments. But then the second preference would
>> still be
>>
>>           "2:\t.asciz \"" file "\"\n"
>>
>> imo. Yet maybe others disagree ...
> 
> I would prefer to keep the __stringify() version because it avoids 
> relying on file to always be a string literal.

... hiding possible mistakes (not passing a string literal is very
likely unintentional here after all).

Jan
Andrew Cooper Feb. 13, 2023, 1:33 p.m. UTC | #4
On 13/02/2023 1:19 pm, Julien Grall wrote:
>
>
> On 13/02/2023 12:24, Jan Beulich wrote:
>> On 03.02.2023 18:05, Oleksii Kurochko wrote:
>>> --- a/xen/common/Kconfig
>>> +++ b/xen/common/Kconfig
>>> @@ -92,6 +92,12 @@ config STATIC_MEMORY
>>>           If unsure, say N.
>>>   +config GENERIC_DO_BUG_FRAME
>>> +    bool
>>> +    help
>>> +      Generic do_bug_frame() function is needed to handle the type
>>> of bug
>>> +      frame and print an information about it.
>>
>> Generally help text for prompt-less functions is not very useful.
>> Assuming
>> it is put here to inform people actually reading the source file, I'm
>> okay
>> for it to be left here, but please at least drop the stray "an". I also
>> think this may want moving up in the file, e.g. ahead of all the HAS_*
>> controls (which, as you will notice, all have no help text either). Plus
>> finally how about shorter and more applicable GENERIC_BUG_FRAME - after
>> all what becomes generic is more than just do_bug_frame()?
>>
>>> --- /dev/null
>>> +++ b/xen/common/bug.c
>>> @@ -0,0 +1,88 @@
>>> +#include <xen/bug.h>
>>> +#include <xen/errno.h>
>>> +#include <xen/types.h>
>>> +#include <xen/kernel.h>
>>
>> Please order headers alphabetically unless there's anything preventing
>> that from being done.
>>
>>> +#include <xen/string.h>
>>> +#include <xen/virtual_region.h>
>>> +
>>> +#include <asm/processor.h>
>>> +
>>> +int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)
>>
>> I know Arm is using vaddr_t and RISC-V now also has it, but in UNIX-like
>> environments that's redundant with "unsigned long", and it's also
>> redundant with C99's uintptr_t. Hence when seeing the type I always
>> wonder whether it's really a host virtual address which is meant (and
>> not perhaps a guest one, which in principle could differ from the host
>> one for certain guest types). In any event the existence of this type
>> should imo not be a prereq to using this generic piece of infrastructure
>
> C spec aside, the use "unsigned long" is quite overloaded within Xen.
> Although, this has improved since we introduced mfn_t/gfn_t.
>
> In the future, I could imagine us to also introduce typesafe for
> vaddr_t, reducing further the risk to mix different meaning of
> unsigned long.
>
> Therefore, I think the introduction of vaddr_t should be a prereq for
> using the generic piece of infrastructure.

I'm with Jan on this.  vaddr_t is pure obfuscation, and you can do what
you like with it in ARM, but you will find very firm objection to it
finding its way into common or x86 code.

virtual addresses are spelt 'unsigned long'.

~Andrew
Julien Grall Feb. 13, 2023, 1:55 p.m. UTC | #5
Hi Andrew,

On 13/02/2023 13:33, Andrew Cooper wrote:
> On 13/02/2023 1:19 pm, Julien Grall wrote:
>>
>>
>> On 13/02/2023 12:24, Jan Beulich wrote:
>>> On 03.02.2023 18:05, Oleksii Kurochko wrote:
>>>> --- a/xen/common/Kconfig
>>>> +++ b/xen/common/Kconfig
>>>> @@ -92,6 +92,12 @@ config STATIC_MEMORY
>>>>            If unsure, say N.
>>>>    +config GENERIC_DO_BUG_FRAME
>>>> +    bool
>>>> +    help
>>>> +      Generic do_bug_frame() function is needed to handle the type
>>>> of bug
>>>> +      frame and print an information about it.
>>>
>>> Generally help text for prompt-less functions is not very useful.
>>> Assuming
>>> it is put here to inform people actually reading the source file, I'm
>>> okay
>>> for it to be left here, but please at least drop the stray "an". I also
>>> think this may want moving up in the file, e.g. ahead of all the HAS_*
>>> controls (which, as you will notice, all have no help text either). Plus
>>> finally how about shorter and more applicable GENERIC_BUG_FRAME - after
>>> all what becomes generic is more than just do_bug_frame()?
>>>
>>>> --- /dev/null
>>>> +++ b/xen/common/bug.c
>>>> @@ -0,0 +1,88 @@
>>>> +#include <xen/bug.h>
>>>> +#include <xen/errno.h>
>>>> +#include <xen/types.h>
>>>> +#include <xen/kernel.h>
>>>
>>> Please order headers alphabetically unless there's anything preventing
>>> that from being done.
>>>
>>>> +#include <xen/string.h>
>>>> +#include <xen/virtual_region.h>
>>>> +
>>>> +#include <asm/processor.h>
>>>> +
>>>> +int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)
>>>
>>> I know Arm is using vaddr_t and RISC-V now also has it, but in UNIX-like
>>> environments that's redundant with "unsigned long", and it's also
>>> redundant with C99's uintptr_t. Hence when seeing the type I always
>>> wonder whether it's really a host virtual address which is meant (and
>>> not perhaps a guest one, which in principle could differ from the host
>>> one for certain guest types). In any event the existence of this type
>>> should imo not be a prereq to using this generic piece of infrastructure
>>
>> C spec aside, the use "unsigned long" is quite overloaded within Xen.
>> Although, this has improved since we introduced mfn_t/gfn_t.
>>
>> In the future, I could imagine us to also introduce typesafe for
>> vaddr_t, reducing further the risk to mix different meaning of
>> unsigned long.
>>
>> Therefore, I think the introduction of vaddr_t should be a prereq for
>> using the generic piece of infrastructure.
> 
> I'm with Jan on this.  vaddr_t is pure obfuscation, and you can do what
> you like with it in ARM, but you will find very firm objection to it
> finding its way into common or x86 code.

Talking about obfuscation...

42sh> ack "unsigned long addr" | wc -l
143

2/3 of this is on x86. Without looking at the code, it can be quite 
difficult to figure out if this is meant to a virtual/physical 
host/guest address.

> 
> virtual addresses are spelt 'unsigned long'.

That's ok so long there are no way to mistakenly mix some parameters. 
For instance in the past, the type of map_pages_to_xen() was:

int map_pages_to_xen(unsigned long virt,
                      unsigned long mfn,
                      unsigned long nr_mfns,
                      unsigned int flags)

Since then 'mfn' is now thankfully mfn_t... But before, it was easy to mix.

For sure, there are other way to do it like properly naming the 
variable... But using a type as the advantage that it could be checked a 
compilation.

I am open to other suggestions if you have a way to guarantee that 
mistake can be avoided.

Cheers,
Julien Grall Feb. 13, 2023, 1:56 p.m. UTC | #6
On 13/02/2023 13:30, Jan Beulich wrote:
> On 13.02.2023 14:19, Julien Grall wrote:
>> On 13/02/2023 12:24, Jan Beulich wrote:
>>> On 03.02.2023 18:05, Oleksii Kurochko wrote:
>>>> --- /dev/null
>>>> +++ b/xen/common/bug.c
>>>> @@ -0,0 +1,88 @@
>>>> +#include <xen/bug.h>
>>>> +#include <xen/errno.h>
>>>> +#include <xen/types.h>
>>>> +#include <xen/kernel.h>
>>>
>>> Please order headers alphabetically unless there's anything preventing
>>> that from being done.
>>>
>>>> +#include <xen/string.h>
>>>> +#include <xen/virtual_region.h>
>>>> +
>>>> +#include <asm/processor.h>
>>>> +
>>>> +int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)
>>>
>>> I know Arm is using vaddr_t and RISC-V now also has it, but in UNIX-like
>>> environments that's redundant with "unsigned long", and it's also
>>> redundant with C99's uintptr_t. Hence when seeing the type I always
>>> wonder whether it's really a host virtual address which is meant (and
>>> not perhaps a guest one, which in principle could differ from the host
>>> one for certain guest types). In any event the existence of this type
>>> should imo not be a prereq to using this generic piece of infrastructure
>>
>> C spec aside, the use "unsigned long" is quite overloaded within Xen.
>> Although, this has improved since we introduced mfn_t/gfn_t.
>>
>> In the future, I could imagine us to also introduce typesafe for
>> vaddr_t, reducing further the risk to mix different meaning of unsigned
>> long.
>>
>> Therefore, I think the introduction of vaddr_t should be a prereq for
>> using the generic piece of infrastructure.
> 
> Would be nice if other maintainers could share their thoughts here. I,
> for one, would view a typesafe gaddr_t as reasonable, and perhaps also
> a typesafe gvaddr_t, but hypervisor addresses should be fine as void *
> or unsigned long.

See my answer to Andrew.

> 
>>>> --- /dev/null
>>>> +++ b/xen/include/xen/bug.h
>>>> @@ -0,0 +1,127 @@
>>>> +#ifndef __XEN_BUG_H__
>>>> +#define __XEN_BUG_H__
>>>> +
>>>> +#define BUG_DISP_WIDTH    24
>>>> +#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
>>>> +#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
>>>> +
>>>> +#define BUGFRAME_run_fn 0
>>>> +#define BUGFRAME_warn   1
>>>> +#define BUGFRAME_bug    2
>>>> +#define BUGFRAME_assert 3
>>>> +
>>>> +#define BUGFRAME_NR     4
>>>> +
>>>> +#ifndef __ASSEMBLY__
>>>> +
>>>> +#include <xen/errno.h>
>>>> +#include <xen/stringify.h>
>>>> +#include <xen/types.h>
>>>> +#include <xen/lib.h>
>>>
>>> Again - please sort headers.
>>>
>>>> +#include <asm/bug.h>
>>>> +
>>>> +#ifndef BUG_FRAME_STUFF
>>>> +struct bug_frame {
>>>
>>> Please can we have a blank line between the above two ones and then similarly
>>> ahead of the #endif?
>>>
>>>> +    signed int loc_disp;    /* Relative address to the bug address */
>>>> +    signed int file_disp;   /* Relative address to the filename */
>>>> +    signed int msg_disp;    /* Relative address to the predicate (for ASSERT) */
>>>> +    uint16_t line;          /* Line number */
>>>> +    uint32_t pad0:16;       /* Padding for 8-bytes align */
>>>
>>> Already the original comment in Arm code is wrong: The padding doesn't
>>> guarantee 8-byte alignment; it merely guarantees a multiple of 8 bytes
>>> size.
>>> Aiui there's also no need for 8-byte alignment anywhere here (in
>>> fact there's ".p2align 2" in the generator macros).
>>
>> I would rather keep the pad0 here.
> 
> May I ask why that is?

It makes clear of the padding (which would have been hidden) when using 
.p2align 2.

Cheers,
Jan Beulich Feb. 13, 2023, 3:02 p.m. UTC | #7
On 13.02.2023 14:56, Julien Grall wrote:
> On 13/02/2023 13:30, Jan Beulich wrote:
>> On 13.02.2023 14:19, Julien Grall wrote:
>>> On 13/02/2023 12:24, Jan Beulich wrote:
>>>> On 03.02.2023 18:05, Oleksii Kurochko wrote:
>>>>> --- /dev/null
>>>>> +++ b/xen/include/xen/bug.h
>>>>> @@ -0,0 +1,127 @@
>>>>> +#ifndef __XEN_BUG_H__
>>>>> +#define __XEN_BUG_H__
>>>>> +
>>>>> +#define BUG_DISP_WIDTH    24
>>>>> +#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
>>>>> +#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
>>>>> +
>>>>> +#define BUGFRAME_run_fn 0
>>>>> +#define BUGFRAME_warn   1
>>>>> +#define BUGFRAME_bug    2
>>>>> +#define BUGFRAME_assert 3
>>>>> +
>>>>> +#define BUGFRAME_NR     4
>>>>> +
>>>>> +#ifndef __ASSEMBLY__
>>>>> +
>>>>> +#include <xen/errno.h>
>>>>> +#include <xen/stringify.h>
>>>>> +#include <xen/types.h>
>>>>> +#include <xen/lib.h>
>>>>
>>>> Again - please sort headers.
>>>>
>>>>> +#include <asm/bug.h>
>>>>> +
>>>>> +#ifndef BUG_FRAME_STUFF
>>>>> +struct bug_frame {
>>>>
>>>> Please can we have a blank line between the above two ones and then similarly
>>>> ahead of the #endif?
>>>>
>>>>> +    signed int loc_disp;    /* Relative address to the bug address */
>>>>> +    signed int file_disp;   /* Relative address to the filename */
>>>>> +    signed int msg_disp;    /* Relative address to the predicate (for ASSERT) */
>>>>> +    uint16_t line;          /* Line number */
>>>>> +    uint32_t pad0:16;       /* Padding for 8-bytes align */
>>>>
>>>> Already the original comment in Arm code is wrong: The padding doesn't
>>>> guarantee 8-byte alignment; it merely guarantees a multiple of 8 bytes
>>>> size.
>>>> Aiui there's also no need for 8-byte alignment anywhere here (in
>>>> fact there's ".p2align 2" in the generator macros).
>>>
>>> I would rather keep the pad0 here.
>>
>> May I ask why that is?
> 
> It makes clear of the padding (which would have been hidden) when using 
> .p2align 2.

But you realize that I didn't ask to drop the member? Besides (later in
the reply to Oleksii) suggesting to make "line" uint32_t, you zapped the
relevant further part of my reply here:

"I also wonder why this needs to be a named bitfield: Either make it
 plain uint16_t, or if you use a bitfield, then omit the name."

I thought that's clear enough as a request to change representation,
but not asking for plain removal. The part of my reply that you commented
on is merely asking to not move a bogus comment (whether it gets corrected
up front or while being moved is secondary to me).

Jan
Julien Grall Feb. 13, 2023, 3:07 p.m. UTC | #8
Hi,

On 13/02/2023 15:02, Jan Beulich wrote:
> On 13.02.2023 14:56, Julien Grall wrote:
>> On 13/02/2023 13:30, Jan Beulich wrote:
>>> On 13.02.2023 14:19, Julien Grall wrote:
>>>> On 13/02/2023 12:24, Jan Beulich wrote:
>>>>> On 03.02.2023 18:05, Oleksii Kurochko wrote:
>>>>>> --- /dev/null
>>>>>> +++ b/xen/include/xen/bug.h
>>>>>> @@ -0,0 +1,127 @@
>>>>>> +#ifndef __XEN_BUG_H__
>>>>>> +#define __XEN_BUG_H__
>>>>>> +
>>>>>> +#define BUG_DISP_WIDTH    24
>>>>>> +#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
>>>>>> +#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
>>>>>> +
>>>>>> +#define BUGFRAME_run_fn 0
>>>>>> +#define BUGFRAME_warn   1
>>>>>> +#define BUGFRAME_bug    2
>>>>>> +#define BUGFRAME_assert 3
>>>>>> +
>>>>>> +#define BUGFRAME_NR     4
>>>>>> +
>>>>>> +#ifndef __ASSEMBLY__
>>>>>> +
>>>>>> +#include <xen/errno.h>
>>>>>> +#include <xen/stringify.h>
>>>>>> +#include <xen/types.h>
>>>>>> +#include <xen/lib.h>
>>>>>
>>>>> Again - please sort headers.
>>>>>
>>>>>> +#include <asm/bug.h>
>>>>>> +
>>>>>> +#ifndef BUG_FRAME_STUFF
>>>>>> +struct bug_frame {
>>>>>
>>>>> Please can we have a blank line between the above two ones and then similarly
>>>>> ahead of the #endif?
>>>>>
>>>>>> +    signed int loc_disp;    /* Relative address to the bug address */
>>>>>> +    signed int file_disp;   /* Relative address to the filename */
>>>>>> +    signed int msg_disp;    /* Relative address to the predicate (for ASSERT) */
>>>>>> +    uint16_t line;          /* Line number */
>>>>>> +    uint32_t pad0:16;       /* Padding for 8-bytes align */
>>>>>
>>>>> Already the original comment in Arm code is wrong: The padding doesn't
>>>>> guarantee 8-byte alignment; it merely guarantees a multiple of 8 bytes
>>>>> size.
>>>>> Aiui there's also no need for 8-byte alignment anywhere here (in
>>>>> fact there's ".p2align 2" in the generator macros).
>>>>
>>>> I would rather keep the pad0 here.
>>>
>>> May I ask why that is?
>>
>> It makes clear of the padding (which would have been hidden) when using
>> .p2align 2.
> 
> But you realize that I didn't ask to drop the member?

I misunderstood your first reply. But the second reply...

  Besides (later in
> the reply to Oleksii) suggesting to make "line" uint32_t, you zapped the
> relevant further part of my reply here:

> 
> "I also wonder why this needs to be a named bitfield: Either make it
>   plain uint16_t, or if you use a bitfield, then omit the name."
> 
> I thought that's clear enough as a request to change representation,

... "May I ask why that is?" added to the confusion because it implied 
that you disagree on keep the pad0.

> but not asking for plain removal. The part of my reply that you commented
> on is merely asking to not move a bogus comment (whether it gets corrected
> up front or while being moved is secondary to me).

I am fine with both suggestions.

Cheers,
Jan Beulich Feb. 13, 2023, 3:14 p.m. UTC | #9
On 13.02.2023 16:07, Julien Grall wrote:
> On 13/02/2023 15:02, Jan Beulich wrote:
>> On 13.02.2023 14:56, Julien Grall wrote:
>>> On 13/02/2023 13:30, Jan Beulich wrote:
>>>> On 13.02.2023 14:19, Julien Grall wrote:
>>>>> On 13/02/2023 12:24, Jan Beulich wrote:
>>>>>> On 03.02.2023 18:05, Oleksii Kurochko wrote:
>>>>>>> --- /dev/null
>>>>>>> +++ b/xen/include/xen/bug.h
>>>>>>> @@ -0,0 +1,127 @@
>>>>>>> +#ifndef __XEN_BUG_H__
>>>>>>> +#define __XEN_BUG_H__
>>>>>>> +
>>>>>>> +#define BUG_DISP_WIDTH    24
>>>>>>> +#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
>>>>>>> +#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
>>>>>>> +
>>>>>>> +#define BUGFRAME_run_fn 0
>>>>>>> +#define BUGFRAME_warn   1
>>>>>>> +#define BUGFRAME_bug    2
>>>>>>> +#define BUGFRAME_assert 3
>>>>>>> +
>>>>>>> +#define BUGFRAME_NR     4
>>>>>>> +
>>>>>>> +#ifndef __ASSEMBLY__
>>>>>>> +
>>>>>>> +#include <xen/errno.h>
>>>>>>> +#include <xen/stringify.h>
>>>>>>> +#include <xen/types.h>
>>>>>>> +#include <xen/lib.h>
>>>>>>
>>>>>> Again - please sort headers.
>>>>>>
>>>>>>> +#include <asm/bug.h>
>>>>>>> +
>>>>>>> +#ifndef BUG_FRAME_STUFF
>>>>>>> +struct bug_frame {
>>>>>>
>>>>>> Please can we have a blank line between the above two ones and then similarly
>>>>>> ahead of the #endif?
>>>>>>
>>>>>>> +    signed int loc_disp;    /* Relative address to the bug address */
>>>>>>> +    signed int file_disp;   /* Relative address to the filename */
>>>>>>> +    signed int msg_disp;    /* Relative address to the predicate (for ASSERT) */
>>>>>>> +    uint16_t line;          /* Line number */
>>>>>>> +    uint32_t pad0:16;       /* Padding for 8-bytes align */
>>>>>>
>>>>>> Already the original comment in Arm code is wrong: The padding doesn't
>>>>>> guarantee 8-byte alignment; it merely guarantees a multiple of 8 bytes
>>>>>> size.
>>>>>> Aiui there's also no need for 8-byte alignment anywhere here (in
>>>>>> fact there's ".p2align 2" in the generator macros).
>>>>>
>>>>> I would rather keep the pad0 here.
>>>>
>>>> May I ask why that is?
>>>
>>> It makes clear of the padding (which would have been hidden) when using
>>> .p2align 2.
>>
>> But you realize that I didn't ask to drop the member?
> 
> I misunderstood your first reply. But the second reply...
> 
>   Besides (later in
>> the reply to Oleksii) suggesting to make "line" uint32_t, you zapped the
>> relevant further part of my reply here:
> 
>>
>> "I also wonder why this needs to be a named bitfield: Either make it
>>   plain uint16_t, or if you use a bitfield, then omit the name."
>>
>> I thought that's clear enough as a request to change representation,
> 
> ... "May I ask why that is?" added to the confusion because it implied 
> that you disagree on keep the pad0.

Hmm, yes, I can see how that may have been ambiguous: I understood that
reply of yours as requesting to retain the _name_ (i.e. objecting to my
thought of using an unnamed bitfield instead).

Jan
Oleksii Kurochko Feb. 14, 2023, 4:22 p.m. UTC | #10
Hi Jan!

Thanks a lot for starting the review of the patch series!

On Mon, 2023-02-13 at 13:24 +0100, Jan Beulich wrote:
> On 03.02.2023 18:05, Oleksii Kurochko wrote:
> > --- a/xen/common/Kconfig
> > +++ b/xen/common/Kconfig
> > @@ -92,6 +92,12 @@ config STATIC_MEMORY
> >  
> >           If unsure, say N.
> >  
> > +config GENERIC_DO_BUG_FRAME
> > +       bool
> > +       help
> > +         Generic do_bug_frame() function is needed to handle the
> > type of bug
> > +         frame and print an information about it.
> 
> Generally help text for prompt-less functions is not very useful.
> Assuming
> it is put here to inform people actually reading the source file, I'm
> okay
> for it to be left here, but please at least drop the stray "an". I
> also
> think this may want moving up in the file, e.g. ahead of all the
> HAS_*
> controls (which, as you will notice, all have no help text either).
> Plus
> finally how about shorter and more applicable GENERIC_BUG_FRAME -
> after
> all what becomes generic is more than just do_bug_frame()?
Thanks for the comments. I will take them into account.
Right now only do_bug_frame() is expected to be generic.

> 
> > --- /dev/null
> > +++ b/xen/common/bug.c
> > @@ -0,0 +1,88 @@
> > +#include <xen/bug.h>
> > +#include <xen/errno.h>
> > +#include <xen/types.h>
> > +#include <xen/kernel.h>
> 
> Please order headers alphabetically unless there's anything
> preventing
> that from being done.
Sure, thanks.

> 
> > +#include <xen/string.h>
> > +#include <xen/virtual_region.h>
> > +
> > +#include <asm/processor.h>
> > +
> > +int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)
> 
> I know Arm is using vaddr_t and RISC-V now also has it, but in UNIX-
> like
> environments that's redundant with "unsigned long", and it's also
> redundant with C99's uintptr_t. Hence when seeing the type I always
> wonder whether it's really a host virtual address which is meant (and
> not perhaps a guest one, which in principle could differ from the
> host
> one for certain guest types). In any event the existence of this type
> should imo not be a prereq to using this generic piece of
> infrastructure.
> 
I am OK with changing vaddr_t to 'unsigned long' and agree with your
point
so if no one is against that I'll change it.

> > +{
> > +    const struct bug_frame *bug = NULL;
> > +    const char *prefix = "", *filename, *predicate;
> > +    unsigned long fixup;
> > +    int id = -1, lineno;
> 
> For both of these I question them needing to be of a signed type.
I took the code from ARM but it looks like the type of id and lineno
can be changed to 'unsgined int'. So I'll update the code
correspondingly.

> 
> > +    const struct virtual_region *region;
> > +
> > +    region = find_text_region(pc);
> > +    if ( region )
> > +    {
> > +        for ( id = 0; id < BUGFRAME_NR; id++ )
> > +        {
> > +            const struct bug_frame *b;
> > +            unsigned int i;
> > +
> > +            for ( i = 0, b = region->frame[id].bugs;
> > +                  i < region->frame[id].n_bugs; b++, i++ )
> > +            {
> > +                if ( ((vaddr_t)bug_loc(b)) == pc )
> 
> Afaics this is the sole use of bug_loc(). If so, better change the
> macro
> to avoid the need for a cast here:
> 
> #define bug_loc(b) ((unsigned long)(b) + (b)->loc_disp)
> 
Thanks for the recommendation. It makes sense to change the macro so
I'll update it too.

> > --- /dev/null
> > +++ b/xen/include/xen/bug.h
> > @@ -0,0 +1,127 @@
> > +#ifndef __XEN_BUG_H__
> > +#define __XEN_BUG_H__
> > +
> > +#define BUG_DISP_WIDTH    24
> > +#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
> > +#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
> > +
> > +#define BUGFRAME_run_fn 0
> > +#define BUGFRAME_warn   1
> > +#define BUGFRAME_bug    2
> > +#define BUGFRAME_assert 3
> > +
> > +#define BUGFRAME_NR     4
> > +
> > +#ifndef __ASSEMBLY__
> > +
> > +#include <xen/errno.h>
> > +#include <xen/stringify.h>
> > +#include <xen/types.h>
> > +#include <xen/lib.h>
> 
> Again - please sort headers.
> 
> > +#include <asm/bug.h>
> > +
> > +#ifndef BUG_FRAME_STUFF
> > +struct bug_frame {
> 
> Please can we have a blank line between the above two ones and then
> similarly
> ahead of the #endif?
Sure.

> 
> > +    signed int loc_disp;    /* Relative address to the bug address
> > */
> > +    signed int file_disp;   /* Relative address to the filename */
> > +    signed int msg_disp;    /* Relative address to the predicate
> > (for ASSERT) */
> > +    uint16_t line;          /* Line number */
> > +    uint32_t pad0:16;       /* Padding for 8-bytes align */
> 
> Already the original comment in Arm code is wrong: The padding
> doesn't
> guarantee 8-byte alignment; it merely guarantees a multiple of 8
> bytes
> size. Aiui there's also no need for 8-byte alignment anywhere here
> (in
> fact there's ".p2align 2" in the generator macros).
> 
> I also wonder why this needs to be a named bitfield: Either make it
> plain uint16_t, or if you use a bitfield, then omit the name.
> 
It will better to change it to 'uint16_t' as I don't see any reason to
use 'uint32_t' with bitfield here.
I'll check if we need the alignment. If there  is '.p2align 2' we
really don't need it.

> > +};
> > +
> > +#define bug_loc(b) ((const void *)(b) + (b)->loc_disp)
> > +#define bug_file(b) ((const void *)(b) + (b)->file_disp);
> > +#define bug_line(b) ((b)->line)
> > +#define bug_msg(b) ((const char *)(b) + (b)->msg_disp)
> > +#endif /* BUG_FRAME_STUFF */
> > +
> > +#ifndef BUG_FRAME
> > +/* Many versions of GCC doesn't support the asm %c parameter which
> > would
> > + * be preferable to this unpleasantness. We use mergeable string
> > + * sections to avoid multiple copies of the string appearing in
> > the
> > + * Xen image. BUGFRAME_run_fn needs to be handled separately.
> > + */
> 
> When generalizing the logic, I wonder in how far the comment doesn't
> want re-wording some. For example, while Arm prefixes constant insn
> operands with # (and x86 uses $), there's no such prefix in RISC-V.
> At
> which point there's no need to use %c in the first place. (Which in
> turn means x86'es more compact representation may also be usable
> there.
> And yet in turn the question arises whether RISC-V wouldn't better
> have
> its own derivation of the machinery, rather than trying to generalize
> things. RISC-V's would then likely be closer to x86'es, just without
> the use of %c on asm() operands. Which might then suggest to rather
> generalize x86'es variant down the road.)
ARM version is more portable because of the '%c' modifier which is not
present everywhere, so I decided to use it as a generic implementation.
Moreover I don't see any reason why we can't switch x86 implementation
to 'generic/ARM'.
And one more thing if you look at WARN/BUG/ASSERT_FAILED/BUG_FRAME
macros implementation to use it in assembly code the implementations
are closer to 'generic/ARM'.

> 
> At the very least the comment's style wants correcting, and in the
> first
> sentence s/doesn't/don't/. Also %c isn't a parameter, but a modifier.
> 
Thanks!

> > +#define BUG_FRAME(type, line, file, has_msg, msg) do
> > {                      \
> > +    BUILD_BUG_ON((line) >>
> > 16);                                             \
> > +    BUILD_BUG_ON((type) >=
> > BUGFRAME_NR);                                    \
> > +    asm
> > ("1:"BUG_INSTR"\n"                                                 
> > \
> 
> Nit: Style (missing blank after opening parenthesis, and then also at
> the
> end of the construct ahead of the closing parenthesis).
> 
I'll fix it.

> > +         ".pushsection .rodata.str, \"aMS\", %progbits,
> > 1\n"                \
> > +         "2:\t.asciz " __stringify(file)
> > "\n"                               \
> 
> file is always a string literal; really it always is __FILE__ in this
> non-x86 implementation. So first preference ought to be to drop the
> macro parameter and use __FILE__ here (same then for line vs
> __LINE__).
> Yet even if not done like that, the __stringify() is largely unneeded
> (unless we expect file names to have e.g. backslashes in their names)
> and looks somewhat odd here. So how about
> 
>          "2:\t.asciz \"" __FILE__ "\"\n"
> 
> ? But wait - peeking ahead to the x86 patch I notice that __FILE__
> and
> __LINE__ need to remain arguments. But then the second preference
> would
> still be
> 
>          "2:\t.asciz \"" file "\"\n"
> 
> imo. Yet maybe others disagree ...
> 
> > +        
> > "3:\n"                                                            
> > \
> > +         ".if " #has_msg
> > "\n"                                               \
> > +         "\t.asciz " #msg
> > "\n"                                              \
> > +        
> > ".endif\n"                                                        
> > \
> > +        
> > ".popsection\n"                                                   
> > \
> > +         ".pushsection .bug_frames." __stringify(type) ", \"a\",
> > %progbits\n"\
> > +        
> > "4:\n"                                                            
> > \
> > +         ".p2align
> > 2\n"                                                     \
> > +         ".long (1b -
> > 4b)\n"                                                \
> > +         ".long (2b -
> > 4b)\n"                                                \
> > +         ".long (3b -
> > 4b)\n"                                                \
> > +         ".hword " __stringify(line) ",
> > 0\n"                                \
> 
> Hmm, .hword. How generic is support for that in assemblers? I notice
> even
> very old gas has support for it, but architectures may not consider
> it
> two bytes wide. (On x86, for example, it's bogus to be two bytes,
> since
> .word also produces 2 bytes of data. And indeed MIPS and NDS32
> override it
> in gas to produce 1 byte of data only, at least in certain cases.)
> I'd
> like to suggest to use a fourth .long here, and to drop the padding
> field
> from struct bug_frame in exchange.
Changing .hword to .half can make the code more portable and generic,
as .half is a more standard and widely supported assembler directive
for declaring 16-bit data. 
But probably you are right and it will be easier to change it to .long.

> 
> Furthermore, once assembly code is generalized, you need to pay
> attention
> to formatting: Only labels may start at the beginning of a line; in
> particular directives never should.
> 
> > +        
> > ".popsection");                                                   
> > \
> > +} while (0)
> 
> Nit: Style (missing blanks, and preferably "false" instead of "0").
> 
> > +#endif /* BUG_FRAME */
> > +
> > +#ifndef run_in_exception_handler
> > +/*
> > + * GCC will not allow to use "i"  when PIE is enabled (Xen doesn't
> > set the
> > + * flag but instead rely on the default value from the compiler).
> > So the
> > + * easiest way to implement run_in_exception_handler() is to pass
> > the to
> > + * be called function in a fixed register.
> > + */
> > +#define  run_in_exception_handler(fn) do
> > {                                  \
> 
> Nit: Just one blank please after #define (and on the first comment
> line
> there's also a double blank where only one should be).
> 
> > +    register void *fn_ asm(__stringify(BUG_FN_REG)) =
> > (fn);                 \
> 
> x86 makes sure "fn" is of correct type. Arm doesn't, which likely is
> a
> mistake. May I ask that you use the correct type here (which is even
> better than x86'es extra checking construct):
> 
>     register void (*fn_)(struct cpu_user_regs *)
> asm(__stringify(BUG_FN_REG)) = (fn);
> 
the type of 'fn' should be updated. I'll take into account in the next
version of patch.

> > +    asm
> > ("1:"BUG_INSTR"\n"                                                 
> > \
> > +         ".pushsection .bug_frames." __stringify(BUGFRAME_run_fn)
> > ","       \
> > +         "             \"a\",
> > %%progbits\n"                                 \
> > +        
> > "2:\n"                                                            
> > \
> > +         ".p2align
> > 2\n"                                                     \
> > +         ".long (1b -
> > 2b)\n"                                                \
> > +         ".long 0, 0,
> > 0\n"                                                  \
> > +         ".popsection" :: "r"
> > (fn_));                                       \
> > +} while (0)
> > +#endif /* run_in_exception_handler */
> > +
> > +#ifndef WARN
> > +#define WARN() BUG_FRAME(BUGFRAME_warn, __LINE__, __FILE__, 0, "")
> > +#endif /* WARN */
> > +
> > +#ifndef BUG
> > +#define BUG() do {                                              \
> > +    BUG_FRAME(BUGFRAME_bug,  __LINE__, __FILE__, 0, "");        \
> > +    unreachable();                                              \
> > +} while (0)
> > +#endif
> > +
> > +#ifndef assert_failed
> > +#define assert_failed(msg) do {                                 \
> > +    BUG_FRAME(BUGFRAME_assert, __LINE__, __FILE__, 1, msg);     \
> > +    unreachable();                                              \
> > +} while (0)
> > +#endif
> > +
> > +extern const struct bug_frame __start_bug_frames[],
> > +                              __stop_bug_frames_0[],
> > +                              __stop_bug_frames_1[],
> > +                              __stop_bug_frames_2[],
> > +                              __stop_bug_frames_3[];
> > +
> > +#else /* !__ASSEMBLY__ */
> > +
> > +#ifdef CONFIG_X86
> > +#include <asm/bug.h>
> > +#endif
> 
> Why this x86 special? (To be precise: Why can this not be done
> uniformly?)
> 
We can leave only '#include <asm/bug.h>'. I had an issue during making
the code more generic. I removed it and re-run tests and it looks like
it is fine to go w/o '#ifdef...':
https://gitlab.com/xen-project/people/olkur/xen/-/pipelines/777288336

> Jan
~ Oleksii
Jan Beulich Feb. 14, 2023, 4:55 p.m. UTC | #11
On 14.02.2023 17:22, Oleksii wrote:
> On Mon, 2023-02-13 at 13:24 +0100, Jan Beulich wrote:
>> On 03.02.2023 18:05, Oleksii Kurochko wrote:
>>> --- a/xen/common/Kconfig
>>> +++ b/xen/common/Kconfig
>>> @@ -92,6 +92,12 @@ config STATIC_MEMORY
>>>  
>>>           If unsure, say N.
>>>  
>>> +config GENERIC_DO_BUG_FRAME
>>> +       bool
>>> +       help
>>> +         Generic do_bug_frame() function is needed to handle the
>>> type of bug
>>> +         frame and print an information about it.
>>
>> Generally help text for prompt-less functions is not very useful.
>> Assuming
>> it is put here to inform people actually reading the source file, I'm
>> okay
>> for it to be left here, but please at least drop the stray "an". I
>> also
>> think this may want moving up in the file, e.g. ahead of all the
>> HAS_*
>> controls (which, as you will notice, all have no help text either).
>> Plus
>> finally how about shorter and more applicable GENERIC_BUG_FRAME -
>> after
>> all what becomes generic is more than just do_bug_frame()?
> Thanks for the comments. I will take them into account.
> Right now only do_bug_frame() is expected to be generic.

Hmm, do you mean to undo part of what you've done? I didn't think
you would. Yet in what you've done so far, the struct an several
macros are also generalized. Hence the "DO" in the name is too
narrow from my pov.

>>> --- /dev/null
>>> +++ b/xen/include/xen/bug.h
>>> @@ -0,0 +1,127 @@
>>> +#ifndef __XEN_BUG_H__
>>> +#define __XEN_BUG_H__
>>> +
>>> +#define BUG_DISP_WIDTH    24
>>> +#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
>>> +#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
>>> +
>>> +#define BUGFRAME_run_fn 0
>>> +#define BUGFRAME_warn   1
>>> +#define BUGFRAME_bug    2
>>> +#define BUGFRAME_assert 3
>>> +
>>> +#define BUGFRAME_NR     4
>>> +
>>> +#ifndef __ASSEMBLY__
>>> +
>>> +#include <xen/errno.h>
>>> +#include <xen/stringify.h>
>>> +#include <xen/types.h>
>>> +#include <xen/lib.h>
>>
>> Again - please sort headers.
>>
>>> +#include <asm/bug.h>
>>> +
>>> +#ifndef BUG_FRAME_STUFF
>>> +struct bug_frame {
>>
>> Please can we have a blank line between the above two ones and then
>> similarly
>> ahead of the #endif?
> Sure.
> 
>>
>>> +    signed int loc_disp;    /* Relative address to the bug address
>>> */
>>> +    signed int file_disp;   /* Relative address to the filename */
>>> +    signed int msg_disp;    /* Relative address to the predicate
>>> (for ASSERT) */
>>> +    uint16_t line;          /* Line number */
>>> +    uint32_t pad0:16;       /* Padding for 8-bytes align */
>>
>> Already the original comment in Arm code is wrong: The padding
>> doesn't
>> guarantee 8-byte alignment; it merely guarantees a multiple of 8
>> bytes
>> size. Aiui there's also no need for 8-byte alignment anywhere here
>> (in
>> fact there's ".p2align 2" in the generator macros).
>>
>> I also wonder why this needs to be a named bitfield: Either make it
>> plain uint16_t, or if you use a bitfield, then omit the name.
>>
> It will better to change it to 'uint16_t' as I don't see any reason to
> use 'uint32_t' with bitfield here.
> I'll check if we need the alignment. If there  is '.p2align 2' we
> really don't need it.

See Julien's replies any my responses: FTAOD I did _not_ ask to remove
the field.

>>> +};
>>> +
>>> +#define bug_loc(b) ((const void *)(b) + (b)->loc_disp)
>>> +#define bug_file(b) ((const void *)(b) + (b)->file_disp);
>>> +#define bug_line(b) ((b)->line)
>>> +#define bug_msg(b) ((const char *)(b) + (b)->msg_disp)
>>> +#endif /* BUG_FRAME_STUFF */
>>> +
>>> +#ifndef BUG_FRAME
>>> +/* Many versions of GCC doesn't support the asm %c parameter which
>>> would
>>> + * be preferable to this unpleasantness. We use mergeable string
>>> + * sections to avoid multiple copies of the string appearing in
>>> the
>>> + * Xen image. BUGFRAME_run_fn needs to be handled separately.
>>> + */
>>
>> When generalizing the logic, I wonder in how far the comment doesn't
>> want re-wording some. For example, while Arm prefixes constant insn
>> operands with # (and x86 uses $), there's no such prefix in RISC-V.
>> At
>> which point there's no need to use %c in the first place. (Which in
>> turn means x86'es more compact representation may also be usable
>> there.
>> And yet in turn the question arises whether RISC-V wouldn't better
>> have
>> its own derivation of the machinery, rather than trying to generalize
>> things. RISC-V's would then likely be closer to x86'es, just without
>> the use of %c on asm() operands. Which might then suggest to rather
>> generalize x86'es variant down the road.)
> ARM version is more portable because of the '%c' modifier which is not
> present everywhere, so I decided to use it as a generic implementation.
> Moreover I don't see any reason why we can't switch x86 implementation
> to 'generic/ARM'.

That would increase data size on x86 for no gain, from all I can tell.

>>> +         ".hword " __stringify(line) ",
>>> 0\n"                                \
>>
>> Hmm, .hword. How generic is support for that in assemblers? I notice
>> even
>> very old gas has support for it, but architectures may not consider
>> it
>> two bytes wide. (On x86, for example, it's bogus to be two bytes,
>> since
>> .word also produces 2 bytes of data. And indeed MIPS and NDS32
>> override it
>> in gas to produce 1 byte of data only, at least in certain cases.)
>> I'd
>> like to suggest to use a fourth .long here, and to drop the padding
>> field
>> from struct bug_frame in exchange.
> Changing .hword to .half can make the code more portable and generic,
> as .half is a more standard and widely supported assembler directive
> for declaring 16-bit data. 

And how is "half" better than "hword" in the mentioned respect? Half
a word is still a byte on x86 (due to its 16-bit history).

That said - from all I can tell by briefly looking at gas sources,
there's no ".half" there.

Jan
Oleksii Kurochko Feb. 15, 2023, 3:06 p.m. UTC | #12
On Tue, 2023-02-14 at 17:55 +0100, Jan Beulich wrote:
> On 14.02.2023 17:22, Oleksii wrote:
> > On Mon, 2023-02-13 at 13:24 +0100, Jan Beulich wrote:
> > > On 03.02.2023 18:05, Oleksii Kurochko wrote:
> > > > --- a/xen/common/Kconfig
> > > > +++ b/xen/common/Kconfig
> > > > @@ -92,6 +92,12 @@ config STATIC_MEMORY
> > > >  
> > > >           If unsure, say N.
> > > >  
> > > > +config GENERIC_DO_BUG_FRAME
> > > > +       bool
> > > > +       help
> > > > +         Generic do_bug_frame() function is needed to handle
> > > > the
> > > > type of bug
> > > > +         frame and print an information about it.
> > > 
> > > Generally help text for prompt-less functions is not very useful.
> > > Assuming
> > > it is put here to inform people actually reading the source file,
> > > I'm
> > > okay
> > > for it to be left here, but please at least drop the stray "an".
> > > I
> > > also
> > > think this may want moving up in the file, e.g. ahead of all the
> > > HAS_*
> > > controls (which, as you will notice, all have no help text
> > > either).
> > > Plus
> > > finally how about shorter and more applicable GENERIC_BUG_FRAME -
> > > after
> > > all what becomes generic is more than just do_bug_frame()?
> > Thanks for the comments. I will take them into account.
> > Right now only do_bug_frame() is expected to be generic.
> 
> Hmm, do you mean to undo part of what you've done? I didn't think
> you would. Yet in what you've done so far, the struct an several
> macros are also generalized. Hence the "DO" in the name is too
> narrow from my pov.
> 
No, I don't undo part of what I have done.
I misunderstood your initial message. So yeah, I will remove "DO" I
think it will be more correct.
> > > > --- /dev/null
> > > > +++ b/xen/include/xen/bug.h
> > > > @@ -0,0 +1,127 @@
> > > > +#ifndef __XEN_BUG_H__
> > > > +#define __XEN_BUG_H__
> > > > +
> > > > +#define BUG_DISP_WIDTH    24
> > > > +#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
> > > > +#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
> > > > +
> > > > +#define BUGFRAME_run_fn 0
> > > > +#define BUGFRAME_warn   1
> > > > +#define BUGFRAME_bug    2
> > > > +#define BUGFRAME_assert 3
> > > > +
> > > > +#define BUGFRAME_NR     4
> > > > +
> > > > +#ifndef __ASSEMBLY__
> > > > +
> > > > +#include <xen/errno.h>
> > > > +#include <xen/stringify.h>
> > > > +#include <xen/types.h>
> > > > +#include <xen/lib.h>
> > > 
> > > Again - please sort headers.
> > > 
> > > > +#include <asm/bug.h>
> > > > +
> > > > +#ifndef BUG_FRAME_STUFF
> > > > +struct bug_frame {
> > > 
> > > Please can we have a blank line between the above two ones and
> > > then
> > > similarly
> > > ahead of the #endif?
> > Sure.
> > 
> > > 
> > > > +    signed int loc_disp;    /* Relative address to the bug
> > > > address
> > > > */
> > > > +    signed int file_disp;   /* Relative address to the
> > > > filename */
> > > > +    signed int msg_disp;    /* Relative address to the
> > > > predicate
> > > > (for ASSERT) */
> > > > +    uint16_t line;          /* Line number */
> > > > +    uint32_t pad0:16;       /* Padding for 8-bytes align */
> > > 
> > > Already the original comment in Arm code is wrong: The padding
> > > doesn't
> > > guarantee 8-byte alignment; it merely guarantees a multiple of 8
> > > bytes
> > > size. Aiui there's also no need for 8-byte alignment anywhere
> > > here
> > > (in
> > > fact there's ".p2align 2" in the generator macros).
> > > 
> > > I also wonder why this needs to be a named bitfield: Either make
> > > it
> > > plain uint16_t, or if you use a bitfield, then omit the name.
> > > 
> > It will better to change it to 'uint16_t' as I don't see any reason
> > to
> > use 'uint32_t' with bitfield here.
> > I'll check if we need the alignment. If there  is '.p2align 2' we
> > really don't need it.
> 
> See Julien's replies any my responses: FTAOD I did _not_ ask to
> remove
> the field.
I didn't see his reply so I'll read it too.
> 
> > > > +};
> > > > +
> > > > +#define bug_loc(b) ((const void *)(b) + (b)->loc_disp)
> > > > +#define bug_file(b) ((const void *)(b) + (b)->file_disp);
> > > > +#define bug_line(b) ((b)->line)
> > > > +#define bug_msg(b) ((const char *)(b) + (b)->msg_disp)
> > > > +#endif /* BUG_FRAME_STUFF */
> > > > +
> > > > +#ifndef BUG_FRAME
> > > > +/* Many versions of GCC doesn't support the asm %c parameter
> > > > which
> > > > would
> > > > + * be preferable to this unpleasantness. We use mergeable
> > > > string
> > > > + * sections to avoid multiple copies of the string appearing
> > > > in
> > > > the
> > > > + * Xen image. BUGFRAME_run_fn needs to be handled separately.
> > > > + */
> > > 
> > > When generalizing the logic, I wonder in how far the comment
> > > doesn't
> > > want re-wording some. For example, while Arm prefixes constant
> > > insn
> > > operands with # (and x86 uses $), there's no such prefix in RISC-
> > > V.
> > > At
> > > which point there's no need to use %c in the first place. (Which
> > > in
> > > turn means x86'es more compact representation may also be usable
> > > there.
> > > And yet in turn the question arises whether RISC-V wouldn't
> > > better
> > > have
> > > its own derivation of the machinery, rather than trying to
> > > generalize
> > > things. RISC-V's would then likely be closer to x86'es, just
> > > without
> > > the use of %c on asm() operands. Which might then suggest to
> > > rather
> > > generalize x86'es variant down the road.)
> > ARM version is more portable because of the '%c' modifier which is
> > not
> > present everywhere, so I decided to use it as a generic
> > implementation.
> > Moreover I don't see any reason why we can't switch x86
> > implementation
> > to 'generic/ARM'.
> 
> That would increase data size on x86 for no gain, from all I can
> tell.
You are right. It will increase data size.

One more point regarding portability is that x86 uses an 'i' constraint
modifier that GCC won't allow when PIE is enabled, so it doesn't look
like the best option to use as generic.

> 
> > > > +         ".hword " __stringify(line) ",
> > > > 0\n"                                \
> > > 
> > > Hmm, .hword. How generic is support for that in assemblers? I
> > > notice
> > > even
> > > very old gas has support for it, but architectures may not
> > > consider
> > > it
> > > two bytes wide. (On x86, for example, it's bogus to be two bytes,
> > > since
> > > .word also produces 2 bytes of data. And indeed MIPS and NDS32
> > > override it
> > > in gas to produce 1 byte of data only, at least in certain
> > > cases.)
> > > I'd
> > > like to suggest to use a fourth .long here, and to drop the
> > > padding
> > > field
> > > from struct bug_frame in exchange.
> > Changing .hword to .half can make the code more portable and
> > generic,
> > as .half is a more standard and widely supported assembler
> > directive
> > for declaring 16-bit data. 
> 
> And how is "half" better than "hword" in the mentioned respect? Half
> a word is still a byte on x86 (due to its 16-bit history).
> 
> That said - from all I can tell by briefly looking at gas sources,
> there's no ".half" there.
Then, it will still be an issue. So then the best solution will be to
change it to the type recommended before.

> 
> Jan
Oleksii Kurochko Feb. 15, 2023, 5:59 p.m. UTC | #13
Hello Jan and community,

I experimented and switched RISC-V to x86 implementation. All that I
changed in x86 implementation for RISC-V was _ASM_BUGFRAME_TEXT. Other
things are the same as for x86.

For RISC-V it is fine to skip '%c' modifier so _ASM_BUGFRAME_TEXT will
look like:

#define _ASM_BUGFRAME_TEXT(second_frame) \
    ".Lbug%=: ebreak\n"   
    ".pushsection .bug_frames.%[bf_type], \"a\", @progbits\n"
    ".p2align 2\n"
    ".Lfrm%=:\n"
    ".long (.Lbug%= - .Lfrm%=) + %[bf_line_hi]\n"
    ".long (%[bf_ptr] - .Lfrm%=) + %[bf_line_lo]\n"
    ".if " #second_frame "\n"
    ".long 0, %[bf_msg] - .Lfrm%=\n"
    ".endif\n"
    ".popsection\n"

The only thing I am worried about is:

#define _ASM_BUGFRAME_INFO(type, line, ptr, msg) \
  [bf_type] "i" (type), ...
because as I understand it can be an issue with 'i' modifier in case of
PIE. I am not sure that Xen enables PIE somewhere but still...
If it is not an issue then we can use x86 implementation as a generic
one.

Could you please share your thoughts about that?

~ Oleksii
Jan Beulich Feb. 16, 2023, 7:31 a.m. UTC | #14
On 15.02.2023 18:59, Oleksii wrote:
> Hello Jan and community,
> 
> I experimented and switched RISC-V to x86 implementation. All that I
> changed in x86 implementation for RISC-V was _ASM_BUGFRAME_TEXT. Other
> things are the same as for x86.
> 
> For RISC-V it is fine to skip '%c' modifier so _ASM_BUGFRAME_TEXT will
> look like:
> 
> #define _ASM_BUGFRAME_TEXT(second_frame) \
>     ".Lbug%=: ebreak\n"   
>     ".pushsection .bug_frames.%[bf_type], \"a\", @progbits\n"
>     ".p2align 2\n"
>     ".Lfrm%=:\n"
>     ".long (.Lbug%= - .Lfrm%=) + %[bf_line_hi]\n"
>     ".long (%[bf_ptr] - .Lfrm%=) + %[bf_line_lo]\n"
>     ".if " #second_frame "\n"
>     ".long 0, %[bf_msg] - .Lfrm%=\n"
>     ".endif\n"
>     ".popsection\n"

I expect this could be further abstracted such that only the actual
instruction is arch-specific.

> The only thing I am worried about is:
> 
> #define _ASM_BUGFRAME_INFO(type, line, ptr, msg) \
>   [bf_type] "i" (type), ...
> because as I understand it can be an issue with 'i' modifier in case of
> PIE. I am not sure that Xen enables PIE somewhere but still...
> If it is not an issue then we can use x86 implementation as a generic
> one.

"i" is not generally an issue with PIE, it only is when the value is the
address of a symbol. Here "type" is a constant in all cases. (Or else
how would you express an immediate operand of an instruction in an
asm()?)

Jan
Oleksii Kurochko Feb. 16, 2023, 10:44 a.m. UTC | #15
On Thu, 2023-02-16 at 08:31 +0100, Jan Beulich wrote:
> On 15.02.2023 18:59, Oleksii wrote:
> > Hello Jan and community,
> > 
> > I experimented and switched RISC-V to x86 implementation. All that
> > I
> > changed in x86 implementation for RISC-V was _ASM_BUGFRAME_TEXT.
> > Other
> > things are the same as for x86.
> > 
> > For RISC-V it is fine to skip '%c' modifier so _ASM_BUGFRAME_TEXT
> > will
> > look like:
> > 
> > #define _ASM_BUGFRAME_TEXT(second_frame) \
> >     ".Lbug%=: ebreak\n"   
> >     ".pushsection .bug_frames.%[bf_type], \"a\", @progbits\n"
> >     ".p2align 2\n"
> >     ".Lfrm%=:\n"
> >     ".long (.Lbug%= - .Lfrm%=) + %[bf_line_hi]\n"
> >     ".long (%[bf_ptr] - .Lfrm%=) + %[bf_line_lo]\n"
> >     ".if " #second_frame "\n"
> >     ".long 0, %[bf_msg] - .Lfrm%=\n"
> >     ".endif\n"
> >     ".popsection\n"
> 
> I expect this could be further abstracted such that only the actual
> instruction is arch-specific.
Generally, the only thing that can't be abstracted ( as you mentioned
is an instruction ).

So we can make additional defines:
  1. #define MODIFIER "" or "c" (depends on architecture) and use it  
the following way:
   ".pushsection .bug_frames.%"MODIFIER"[bf_type], \"a\", @progbits\n"
   ...
  2. #define BUG_INSTR "ebreak" | "ud2" | "..." and use it in the
following way:
   ".Lbug%=: "BUG_ISNTR"\n"
   ...
Except for these defines which should be specified for each
architecture
all other code will be the same for all architectures.

But as I understand with modifier 'c' can be issues that not all
compilers support but for ARM and  x86 before immediate is present
punctuation # or $ which are removed by modifier 'c'. And I am
wondering what other ways to remove punctuation before immediate exist.

Do you think we should consider that as an issue?

> 
> > The only thing I am worried about is:
> > 
> > #define _ASM_BUGFRAME_INFO(type, line, ptr, msg) \
> >   [bf_type] "i" (type), ...
> > because as I understand it can be an issue with 'i' modifier in
> > case of
> > PIE. I am not sure that Xen enables PIE somewhere but still...
> > If it is not an issue then we can use x86 implementation as a
> > generic
> > one.
> 
> "i" is not generally an issue with PIE, it only is when the value is
> the
> address of a symbol. Here "type" is a constant in all cases. (Or else
> how would you express an immediate operand of an instruction in an
> asm()?)
Hmm. I don't know other ways to express an immediate operand except
'i'.

It looks like type, line, msg are always 'constants' 
(
they possibly can be passed without 'i' and used directly inside asm():
   ...
   ".pushsection .bug_frames." __stringify(type) ", \"a\",
%progbits\n"\
   ...
) but the issue will be with 'ptr' for which we have to use 'i'.

And I am not sure about all 'constants'. For example, I think that it
can be an issue with 'line' = '((line & ((1 << BUG_LINE_LO_WIDTH) - 1))
<< BUG_DISP_WIDTH)' if we will use that directly inside asm(...).

> 
> Jan

~ Oleksii
Andrew Cooper Feb. 16, 2023, 10:48 a.m. UTC | #16
On 16/02/2023 7:31 am, Jan Beulich wrote:
> On 15.02.2023 18:59, Oleksii wrote:
>> Hello Jan and community,
>>
>> I experimented and switched RISC-V to x86 implementation. All that I
>> changed in x86 implementation for RISC-V was _ASM_BUGFRAME_TEXT. Other
>> things are the same as for x86.
>>
>> For RISC-V it is fine to skip '%c' modifier so _ASM_BUGFRAME_TEXT will
>> look like:
>>
>> #define _ASM_BUGFRAME_TEXT(second_frame) \
>>     ".Lbug%=: ebreak\n"   
>>     ".pushsection .bug_frames.%[bf_type], \"a\", @progbits\n"
>>     ".p2align 2\n"
>>     ".Lfrm%=:\n"
>>     ".long (.Lbug%= - .Lfrm%=) + %[bf_line_hi]\n"
>>     ".long (%[bf_ptr] - .Lfrm%=) + %[bf_line_lo]\n"
>>     ".if " #second_frame "\n"
>>     ".long 0, %[bf_msg] - .Lfrm%=\n"
>>     ".endif\n"
>>     ".popsection\n"
> I expect this could be further abstracted such that only the actual
> instruction is arch-specific.
>
>> The only thing I am worried about is:
>>
>> #define _ASM_BUGFRAME_INFO(type, line, ptr, msg) \
>>   [bf_type] "i" (type), ...
>> because as I understand it can be an issue with 'i' modifier in case of
>> PIE. I am not sure that Xen enables PIE somewhere but still...
>> If it is not an issue then we can use x86 implementation as a generic
>> one.
> "i" is not generally an issue with PIE, it only is when the value is the
> address of a symbol. Here "type" is a constant in all cases. (Or else
> how would you express an immediate operand of an instruction in an
> asm()?)

At a guess, the problem isn't type.  It's the line number, which ends up
in a relocation.

That said, I'm not sure an architecture could reasonably function
without a generic 4-byte PC-relative relocation...

~Andrew
Oleksii Kurochko Feb. 16, 2023, 12:09 p.m. UTC | #17
On Thu, 2023-02-16 at 12:44 +0200, Oleksii wrote:
> On Thu, 2023-02-16 at 08:31 +0100, Jan Beulich wrote:
> > On 15.02.2023 18:59, Oleksii wrote:
> > > Hello Jan and community,
> > > 
> > > I experimented and switched RISC-V to x86 implementation. All
> > > that
> > > I
> > > changed in x86 implementation for RISC-V was _ASM_BUGFRAME_TEXT.
> > > Other
> > > things are the same as for x86.
> > > 
> > > For RISC-V it is fine to skip '%c' modifier so _ASM_BUGFRAME_TEXT
> > > will
> > > look like:
> > > 
> > > #define _ASM_BUGFRAME_TEXT(second_frame) \
> > >     ".Lbug%=: ebreak\n"   
> > >     ".pushsection .bug_frames.%[bf_type], \"a\", @progbits\n"
> > >     ".p2align 2\n"
> > >     ".Lfrm%=:\n"
> > >     ".long (.Lbug%= - .Lfrm%=) + %[bf_line_hi]\n"
> > >     ".long (%[bf_ptr] - .Lfrm%=) + %[bf_line_lo]\n"
> > >     ".if " #second_frame "\n"
> > >     ".long 0, %[bf_msg] - .Lfrm%=\n"
> > >     ".endif\n"
> > >     ".popsection\n"
> > 
> > I expect this could be further abstracted such that only the actual
> > instruction is arch-specific.
> Generally, the only thing that can't be abstracted ( as you mentioned
> is an instruction ).
> 
> So we can make additional defines:
>   1. #define MODIFIER "" or "c" (depends on architecture) and use it
>  
> the following way:
>    ".pushsection .bug_frames.%"MODIFIER"[bf_type], \"a\",
> @progbits\n"
>    ...
>   2. #define BUG_INSTR "ebreak" | "ud2" | "..." and use it in the
> following way:
>    ".Lbug%=: "BUG_ISNTR"\n"
>    ...
> Except for these defines which should be specified for each
> architecture
> all other code will be the same for all architectures.
> 
> But as I understand with modifier 'c' can be issues that not all
> compilers support but for ARM and  x86 before immediate is present
> punctuation # or $ which are removed by modifier 'c'. And I am
> wondering what other ways to remove punctuation before immediate
> exist.
> 
> Do you think we should consider that as an issue?
> 
> > 
> > > The only thing I am worried about is:
> > > 
> > > #define _ASM_BUGFRAME_INFO(type, line, ptr, msg) \
> > >   [bf_type] "i" (type), ...
> > > because as I understand it can be an issue with 'i' modifier in
> > > case of
> > > PIE. I am not sure that Xen enables PIE somewhere but still...
> > > If it is not an issue then we can use x86 implementation as a
> > > generic
> > > one.
> > 
> > "i" is not generally an issue with PIE, it only is when the value
> > is
> > the
> > address of a symbol. Here "type" is a constant in all cases. (Or
> > else
> > how would you express an immediate operand of an instruction in an
> > asm()?)
> Hmm. I don't know other ways to express an immediate operand except
> 'i'.
> 
> It looks like type, line, msg are always 'constants' 
> (
> they possibly can be passed without 'i' and used directly inside
> asm():
>    ...
>    ".pushsection .bug_frames." __stringify(type) ", \"a\",
> %progbits\n"\
>    ...
> ) but the issue will be with 'ptr' for which we have to use 'i'.
> 
> And I am not sure about all 'constants'. For example, I think that it
> can be an issue with 'line' = '((line & ((1 << BUG_LINE_LO_WIDTH) -
> 1))
> << BUG_DISP_WIDTH)' if we will use that directly inside asm(...).
> 
I think we can avoid 'i' by using 'r' + some kind of mov instruction to
write a register value to memory. The code below is pseudo-code:
#define _ASM_BUGFRAME_TEXT(second_frame)
    ...
    "loc_disp:\n"
    "    .long 0x0"
    "mov [loc_disp], some_register"
    ...
And the we have to define mov for each architecture.
                                            \
> > 
> > Jan
> 
> ~ Oleksii
Andrew Cooper Feb. 16, 2023, 12:19 p.m. UTC | #18
On 16/02/2023 12:09 pm, Oleksii wrote:
> On Thu, 2023-02-16 at 12:44 +0200, Oleksii wrote:
>> On Thu, 2023-02-16 at 08:31 +0100, Jan Beulich wrote:
>>> On 15.02.2023 18:59, Oleksii wrote:
>>>> Hello Jan and community,
>>>>
>>>> I experimented and switched RISC-V to x86 implementation. All
>>>> that
>>>> I
>>>> changed in x86 implementation for RISC-V was _ASM_BUGFRAME_TEXT.
>>>> Other
>>>> things are the same as for x86.
>>>>
>>>> For RISC-V it is fine to skip '%c' modifier so _ASM_BUGFRAME_TEXT
>>>> will
>>>> look like:
>>>>
>>>> #define _ASM_BUGFRAME_TEXT(second_frame) \
>>>>     ".Lbug%=: ebreak\n"   
>>>>     ".pushsection .bug_frames.%[bf_type], \"a\", @progbits\n"
>>>>     ".p2align 2\n"
>>>>     ".Lfrm%=:\n"
>>>>     ".long (.Lbug%= - .Lfrm%=) + %[bf_line_hi]\n"
>>>>     ".long (%[bf_ptr] - .Lfrm%=) + %[bf_line_lo]\n"
>>>>     ".if " #second_frame "\n"
>>>>     ".long 0, %[bf_msg] - .Lfrm%=\n"
>>>>     ".endif\n"
>>>>     ".popsection\n"
>>> I expect this could be further abstracted such that only the actual
>>> instruction is arch-specific.
>> Generally, the only thing that can't be abstracted ( as you mentioned
>> is an instruction ).
>>
>> So we can make additional defines:
>>   1. #define MODIFIER "" or "c" (depends on architecture) and use it
>>  
>> the following way:
>>    ".pushsection .bug_frames.%"MODIFIER"[bf_type], \"a\",
>> @progbits\n"
>>    ...
>>   2. #define BUG_INSTR "ebreak" | "ud2" | "..." and use it in the
>> following way:
>>    ".Lbug%=: "BUG_ISNTR"\n"
>>    ...
>> Except for these defines which should be specified for each
>> architecture
>> all other code will be the same for all architectures.
>>
>> But as I understand with modifier 'c' can be issues that not all
>> compilers support but for ARM and  x86 before immediate is present
>> punctuation # or $ which are removed by modifier 'c'. And I am
>> wondering what other ways to remove punctuation before immediate
>> exist.
>>
>> Do you think we should consider that as an issue?
>>
>>>> The only thing I am worried about is:
>>>>
>>>> #define _ASM_BUGFRAME_INFO(type, line, ptr, msg) \
>>>>   [bf_type] "i" (type), ...
>>>> because as I understand it can be an issue with 'i' modifier in
>>>> case of
>>>> PIE. I am not sure that Xen enables PIE somewhere but still...
>>>> If it is not an issue then we can use x86 implementation as a
>>>> generic
>>>> one.
>>> "i" is not generally an issue with PIE, it only is when the value
>>> is
>>> the
>>> address of a symbol. Here "type" is a constant in all cases. (Or
>>> else
>>> how would you express an immediate operand of an instruction in an
>>> asm()?)
>> Hmm. I don't know other ways to express an immediate operand except
>> 'i'.
>>
>> It looks like type, line, msg are always 'constants' 
>> (
>> they possibly can be passed without 'i' and used directly inside
>> asm():
>>    ...
>>    ".pushsection .bug_frames." __stringify(type) ", \"a\",
>> %progbits\n"\
>>    ...
>> ) but the issue will be with 'ptr' for which we have to use 'i'.
>>
>> And I am not sure about all 'constants'. For example, I think that it
>> can be an issue with 'line' = '((line & ((1 << BUG_LINE_LO_WIDTH) -
>> 1))
>> << BUG_DISP_WIDTH)' if we will use that directly inside asm(...).
>>
> I think we can avoid 'i' by using 'r' + some kind of mov instruction to
> write a register value to memory. The code below is pseudo-code:
> #define _ASM_BUGFRAME_TEXT(second_frame)
>     ...
>     "loc_disp:\n"
>     "    .long 0x0"
>     "mov [loc_disp], some_register"
>     ...
> And the we have to define mov for each architecture.

No, you really cannot.  This is the asm equivalent of writing something
like this:

static struct bugframe __section(.bug_frames.1) foo = {
    .loc = insn - &foo + LINE_LO,
    .msg = "bug string" - &foo + LINE_HI,
};

It is a data structure, not executable code.

~Andrew
Oleksii Kurochko Feb. 16, 2023, 12:21 p.m. UTC | #19
On Thu, 2023-02-16 at 10:48 +0000, Andrew Cooper wrote:
> On 16/02/2023 7:31 am, Jan Beulich wrote:
> > On 15.02.2023 18:59, Oleksii wrote:
> > > Hello Jan and community,
> > > 
> > > I experimented and switched RISC-V to x86 implementation. All
> > > that I
> > > changed in x86 implementation for RISC-V was _ASM_BUGFRAME_TEXT.
> > > Other
> > > things are the same as for x86.
> > > 
> > > For RISC-V it is fine to skip '%c' modifier so _ASM_BUGFRAME_TEXT
> > > will
> > > look like:
> > > 
> > > #define _ASM_BUGFRAME_TEXT(second_frame) \
> > >     ".Lbug%=: ebreak\n"   
> > >     ".pushsection .bug_frames.%[bf_type], \"a\", @progbits\n"
> > >     ".p2align 2\n"
> > >     ".Lfrm%=:\n"
> > >     ".long (.Lbug%= - .Lfrm%=) + %[bf_line_hi]\n"
> > >     ".long (%[bf_ptr] - .Lfrm%=) + %[bf_line_lo]\n"
> > >     ".if " #second_frame "\n"
> > >     ".long 0, %[bf_msg] - .Lfrm%=\n"
> > >     ".endif\n"
> > >     ".popsection\n"
> > I expect this could be further abstracted such that only the actual
> > instruction is arch-specific.
> > 
> > > The only thing I am worried about is:
> > > 
> > > #define _ASM_BUGFRAME_INFO(type, line, ptr, msg) \
> > >   [bf_type] "i" (type), ...
> > > because as I understand it can be an issue with 'i' modifier in
> > > case of
> > > PIE. I am not sure that Xen enables PIE somewhere but still...
> > > If it is not an issue then we can use x86 implementation as a
> > > generic
> > > one.
> > "i" is not generally an issue with PIE, it only is when the value
> > is the
> > address of a symbol. Here "type" is a constant in all cases. (Or
> > else
> > how would you express an immediate operand of an instruction in an
> > asm()?)
> 
> At a guess, the problem isn't type.  It's the line number, which ends
> up
> in a relocation.
Sure. I missed that.
> 
> That said, I'm not sure an architecture could reasonably function
> without a generic 4-byte PC-relative relocation...
> 
> ~Andrew
Jan Beulich Feb. 16, 2023, 2:53 p.m. UTC | #20
On 16.02.2023 11:48, Andrew Cooper wrote:
> On 16/02/2023 7:31 am, Jan Beulich wrote:
>> On 15.02.2023 18:59, Oleksii wrote:
>>> Hello Jan and community,
>>>
>>> I experimented and switched RISC-V to x86 implementation. All that I
>>> changed in x86 implementation for RISC-V was _ASM_BUGFRAME_TEXT. Other
>>> things are the same as for x86.
>>>
>>> For RISC-V it is fine to skip '%c' modifier so _ASM_BUGFRAME_TEXT will
>>> look like:
>>>
>>> #define _ASM_BUGFRAME_TEXT(second_frame) \
>>>     ".Lbug%=: ebreak\n"   
>>>     ".pushsection .bug_frames.%[bf_type], \"a\", @progbits\n"
>>>     ".p2align 2\n"
>>>     ".Lfrm%=:\n"
>>>     ".long (.Lbug%= - .Lfrm%=) + %[bf_line_hi]\n"
>>>     ".long (%[bf_ptr] - .Lfrm%=) + %[bf_line_lo]\n"
>>>     ".if " #second_frame "\n"
>>>     ".long 0, %[bf_msg] - .Lfrm%=\n"
>>>     ".endif\n"
>>>     ".popsection\n"
>> I expect this could be further abstracted such that only the actual
>> instruction is arch-specific.
>>
>>> The only thing I am worried about is:
>>>
>>> #define _ASM_BUGFRAME_INFO(type, line, ptr, msg) \
>>>   [bf_type] "i" (type), ...
>>> because as I understand it can be an issue with 'i' modifier in case of
>>> PIE. I am not sure that Xen enables PIE somewhere but still...
>>> If it is not an issue then we can use x86 implementation as a generic
>>> one.
>> "i" is not generally an issue with PIE, it only is when the value is the
>> address of a symbol. Here "type" is a constant in all cases. (Or else
>> how would you express an immediate operand of an instruction in an
>> asm()?)
> 
> At a guess, the problem isn't type.  It's the line number, which ends up
> in a relocation.

Why would that be a problem? If there's a relocation in the first place
(not because of the line number, but because of the other part of the
expression), then it would only alter the addend of that relocation.

Jan
Jan Beulich Feb. 16, 2023, 2:55 p.m. UTC | #21
On 16.02.2023 11:44, Oleksii wrote:
> On Thu, 2023-02-16 at 08:31 +0100, Jan Beulich wrote:
>> On 15.02.2023 18:59, Oleksii wrote:
>>> The only thing I am worried about is:
>>>
>>> #define _ASM_BUGFRAME_INFO(type, line, ptr, msg) \
>>>   [bf_type] "i" (type), ...
>>> because as I understand it can be an issue with 'i' modifier in
>>> case of
>>> PIE. I am not sure that Xen enables PIE somewhere but still...
>>> If it is not an issue then we can use x86 implementation as a
>>> generic
>>> one.
>>
>> "i" is not generally an issue with PIE, it only is when the value is
>> the
>> address of a symbol. Here "type" is a constant in all cases. (Or else
>> how would you express an immediate operand of an instruction in an
>> asm()?)
> Hmm. I don't know other ways to express an immediate operand except
> 'i'.
> 
> It looks like type, line, msg are always 'constants' 
> (
> they possibly can be passed without 'i' and used directly inside asm():
>    ...
>    ".pushsection .bug_frames." __stringify(type) ", \"a\",
> %progbits\n"\
>    ...
> ) but the issue will be with 'ptr' for which we have to use 'i'.
> 
> And I am not sure about all 'constants'. For example, I think that it
> can be an issue with 'line' = '((line & ((1 << BUG_LINE_LO_WIDTH) - 1))
> << BUG_DISP_WIDTH)' if we will use that directly inside asm(...).

Not matter how complex an expression, the compiler will evaluate it to
a plain number if it's a constant expression. The only think to worry
about with "i" is, as said, is the value supplied is the address of
some symbol (or an expression involving such).

Jan
Oleksii Kurochko Feb. 16, 2023, 8:44 p.m. UTC | #22
On Thu, 2023-02-16 at 12:19 +0000, Andrew Cooper wrote:
> On 16/02/2023 12:09 pm, Oleksii wrote:
> > On Thu, 2023-02-16 at 12:44 +0200, Oleksii wrote:
> > > On Thu, 2023-02-16 at 08:31 +0100, Jan Beulich wrote:
> > > > On 15.02.2023 18:59, Oleksii wrote:
> > > > > Hello Jan and community,
> > > > > 
> > > > > I experimented and switched RISC-V to x86 implementation. All
> > > > > that
> > > > > I
> > > > > changed in x86 implementation for RISC-V was
> > > > > _ASM_BUGFRAME_TEXT.
> > > > > Other
> > > > > things are the same as for x86.
> > > > > 
> > > > > For RISC-V it is fine to skip '%c' modifier so
> > > > > _ASM_BUGFRAME_TEXT
> > > > > will
> > > > > look like:
> > > > > 
> > > > > #define _ASM_BUGFRAME_TEXT(second_frame) \
> > > > >     ".Lbug%=: ebreak\n"   
> > > > >     ".pushsection .bug_frames.%[bf_type], \"a\", @progbits\n"
> > > > >     ".p2align 2\n"
> > > > >     ".Lfrm%=:\n"
> > > > >     ".long (.Lbug%= - .Lfrm%=) + %[bf_line_hi]\n"
> > > > >     ".long (%[bf_ptr] - .Lfrm%=) + %[bf_line_lo]\n"
> > > > >     ".if " #second_frame "\n"
> > > > >     ".long 0, %[bf_msg] - .Lfrm%=\n"
> > > > >     ".endif\n"
> > > > >     ".popsection\n"
> > > > I expect this could be further abstracted such that only the
> > > > actual
> > > > instruction is arch-specific.
> > > Generally, the only thing that can't be abstracted ( as you
> > > mentioned
> > > is an instruction ).
> > > 
> > > So we can make additional defines:
> > >   1. #define MODIFIER "" or "c" (depends on architecture) and use
> > > it
> > >  
> > > the following way:
> > >    ".pushsection .bug_frames.%"MODIFIER"[bf_type], \"a\",
> > > @progbits\n"
> > >    ...
> > >   2. #define BUG_INSTR "ebreak" | "ud2" | "..." and use it in the
> > > following way:
> > >    ".Lbug%=: "BUG_ISNTR"\n"
> > >    ...
> > > Except for these defines which should be specified for each
> > > architecture
> > > all other code will be the same for all architectures.
> > > 
> > > But as I understand with modifier 'c' can be issues that not all
> > > compilers support but for ARM and  x86 before immediate is
> > > present
> > > punctuation # or $ which are removed by modifier 'c'. And I am
> > > wondering what other ways to remove punctuation before immediate
> > > exist.
> > > 
> > > Do you think we should consider that as an issue?
> > > 
> > > > > The only thing I am worried about is:
> > > > > 
> > > > > #define _ASM_BUGFRAME_INFO(type, line, ptr, msg) \
> > > > >   [bf_type] "i" (type), ...
> > > > > because as I understand it can be an issue with 'i' modifier
> > > > > in
> > > > > case of
> > > > > PIE. I am not sure that Xen enables PIE somewhere but
> > > > > still...
> > > > > If it is not an issue then we can use x86 implementation as a
> > > > > generic
> > > > > one.
> > > > "i" is not generally an issue with PIE, it only is when the
> > > > value
> > > > is
> > > > the
> > > > address of a symbol. Here "type" is a constant in all cases.
> > > > (Or
> > > > else
> > > > how would you express an immediate operand of an instruction in
> > > > an
> > > > asm()?)
> > > Hmm. I don't know other ways to express an immediate operand
> > > except
> > > 'i'.
> > > 
> > > It looks like type, line, msg are always 'constants' 
> > > (
> > > they possibly can be passed without 'i' and used directly inside
> > > asm():
> > >    ...
> > >    ".pushsection .bug_frames." __stringify(type) ", \"a\",
> > > %progbits\n"\
> > >    ...
> > > ) but the issue will be with 'ptr' for which we have to use 'i'.
> > > 
> > > And I am not sure about all 'constants'. For example, I think
> > > that it
> > > can be an issue with 'line' = '((line & ((1 << BUG_LINE_LO_WIDTH)
> > > -
> > > 1))
> > > << BUG_DISP_WIDTH)' if we will use that directly inside asm(...).
> > > 
> > I think we can avoid 'i' by using 'r' + some kind of mov
> > instruction to
> > write a register value to memory. The code below is pseudo-code:
> > #define _ASM_BUGFRAME_TEXT(second_frame)
> >     ...
> >     "loc_disp:\n"
> >     "    .long 0x0"
> >     "mov [loc_disp], some_register"
> >     ...
> > And the we have to define mov for each architecture.
> 
> No, you really cannot.  This is the asm equivalent of writing
> something
> like this:
> 
> static struct bugframe __section(.bug_frames.1) foo = {
>     .loc = insn - &foo + LINE_LO,
>     .msg = "bug string" - &foo + LINE_HI,
> };
> 
> It is a data structure, not executable code.
Thanks for the clarification.

What about mainly using C for BUG_FRAME and only allocating bug_frame
sections in assembly?

Please look at POC below:


#include <stdio.h>
#include <stdint.h>

#define BUG_DISP_WIDTH    24
#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)

struct bug_frame {
    signed int loc_disp:BUG_DISP_WIDTH;
    unsigned int line_hi:BUG_LINE_HI_WIDTH;
    signed int ptr_disp:BUG_DISP_WIDTH;
    unsigned int line_lo:BUG_LINE_LO_WIDTH;
    signed int msg_disp[];
};

#define bug_loc(b) ((const void *)(b) + (b)->loc_disp)
#define bug_ptr(b) ((const void *)(b) + (b)->ptr_disp)
#define bug_line(b) (((((b)->line_hi + ((b)->loc_disp < 0)) & \
                       ((1 << BUG_LINE_HI_WIDTH) - 1)) << \
                      BUG_LINE_LO_WIDTH) + \
                     (((b)->line_lo + ((b)->ptr_disp < 0)) &          
\
                      ((1 << BUG_LINE_LO_WIDTH) - 1)))
#define bug_msg(b) ((const char *)(b) + (b)->msg_disp[1])

#define ALLOCATE_BF_SECTION(has_msg) do {                            \
    asm (".pushsection bug_frames                  \n"                
\
         ".long 0, 0 \n"                                              
\
        ".if " #has_msg "\n"                                          
\
         "\t.long 0 \n"                                               
\
         "\t.long 0 \n"                                               
\
         ".endif\n"                                                   
\
         ".popsection" );                                             
\
} while (0)

#define MERGE_(a,b)  a##b
#define UNIQUE_BUG_INSTR_LABEL(a) MERGE_(unique_name_, a)

#define BUG_FRAME(type, line, file, has_msg, msg) do {                
\
    unsigned int line_lo = (((line) >> BUG_LINE_LO_WIDTH) <<
BUG_DISP_WIDTH);                                                      
\
    unsigned int line_hi = ((line & ((1 << BUG_LINE_LO_WIDTH) - 1)) <<
BUG_DISP_WIDTH);                                                      
\
    ALLOCATE_BF_SECTION(1);                                           
\
    *(signed int *)(&bug_frames) = (unsigned long)
&&UNIQUE_BUG_INSTR_LABEL(line) - (unsigned long)&bug_frames + line_lo;
\
    *((signed int *)(&bug_frames) + 1) = (unsigned long)file -
(unsigned long)&bug_frames +  line_hi;                                
\
    bug_var.msg_disp[1] = (signed int)((unsigned long)#msg - (unsigned
long)&bug_frames);                                                    
\
    UNIQUE_BUG_INSTR_LABEL(line):                                     
\
        asm volatile ("nop");                                         
} while (0)

extern char bug_frames[];

static struct bug_frame bug_var __attribute__((section("bug_frames")));

int main() {
    BUG_FRAME(1, __LINE__, __FILE__, 1, "TEST");

    printf("bug_loc: %p\n", bug_loc(&bug_var));
    printf("bug_ptr: 0x%x\n", bug_ptr(&bug_var));
    printf("__FILE__: %s\n", (char *)bug_ptr(&bug_var));
    printf("bug_line: %d\n", bug_line(&bug_var));
    printf("msg: %s\n", bug_msg(&bug_var));

    BUG_FRAME(1, __LINE__, __FILE__, 1, "NEW TEST");

    printf("bug_loc: %p\n", bug_loc(&bug_var));
    printf("bug_ptr: 0x%x\n", bug_ptr(&bug_var));
    printf("__FILE__: %s\n", (char *)bug_ptr(&bug_var));
    printf("bug_line: %d\n", bug_line(&bug_var));
    printf("msg: %s\n", bug_msg(&bug_var));

    return 0;
}

Do you think it makes any sense? In this case, all BUG_FRAME can be re-
used among all architectures, and only bug instructions should be
changed.

> 
> ~Andrew
~ Oleksii
Jan Beulich Feb. 17, 2023, 7:12 a.m. UTC | #23
On 16.02.2023 21:44, Oleksii wrote:
> On Thu, 2023-02-16 at 12:19 +0000, Andrew Cooper wrote:
>> On 16/02/2023 12:09 pm, Oleksii wrote:
>>> On Thu, 2023-02-16 at 12:44 +0200, Oleksii wrote:
>>>> On Thu, 2023-02-16 at 08:31 +0100, Jan Beulich wrote:
>>>>> On 15.02.2023 18:59, Oleksii wrote:
>>>>>> Hello Jan and community,
>>>>>>
>>>>>> I experimented and switched RISC-V to x86 implementation. All
>>>>>> that
>>>>>> I
>>>>>> changed in x86 implementation for RISC-V was
>>>>>> _ASM_BUGFRAME_TEXT.
>>>>>> Other
>>>>>> things are the same as for x86.
>>>>>>
>>>>>> For RISC-V it is fine to skip '%c' modifier so
>>>>>> _ASM_BUGFRAME_TEXT
>>>>>> will
>>>>>> look like:
>>>>>>
>>>>>> #define _ASM_BUGFRAME_TEXT(second_frame) \
>>>>>>     ".Lbug%=: ebreak\n"   
>>>>>>     ".pushsection .bug_frames.%[bf_type], \"a\", @progbits\n"
>>>>>>     ".p2align 2\n"
>>>>>>     ".Lfrm%=:\n"
>>>>>>     ".long (.Lbug%= - .Lfrm%=) + %[bf_line_hi]\n"
>>>>>>     ".long (%[bf_ptr] - .Lfrm%=) + %[bf_line_lo]\n"
>>>>>>     ".if " #second_frame "\n"
>>>>>>     ".long 0, %[bf_msg] - .Lfrm%=\n"
>>>>>>     ".endif\n"
>>>>>>     ".popsection\n"
>>>>> I expect this could be further abstracted such that only the
>>>>> actual
>>>>> instruction is arch-specific.
>>>> Generally, the only thing that can't be abstracted ( as you
>>>> mentioned
>>>> is an instruction ).
>>>>
>>>> So we can make additional defines:
>>>>   1. #define MODIFIER "" or "c" (depends on architecture) and use
>>>> it
>>>>  
>>>> the following way:
>>>>    ".pushsection .bug_frames.%"MODIFIER"[bf_type], \"a\",
>>>> @progbits\n"
>>>>    ...
>>>>   2. #define BUG_INSTR "ebreak" | "ud2" | "..." and use it in the
>>>> following way:
>>>>    ".Lbug%=: "BUG_ISNTR"\n"
>>>>    ...
>>>> Except for these defines which should be specified for each
>>>> architecture
>>>> all other code will be the same for all architectures.
>>>>
>>>> But as I understand with modifier 'c' can be issues that not all
>>>> compilers support but for ARM and  x86 before immediate is
>>>> present
>>>> punctuation # or $ which are removed by modifier 'c'. And I am
>>>> wondering what other ways to remove punctuation before immediate
>>>> exist.
>>>>
>>>> Do you think we should consider that as an issue?
>>>>
>>>>>> The only thing I am worried about is:
>>>>>>
>>>>>> #define _ASM_BUGFRAME_INFO(type, line, ptr, msg) \
>>>>>>   [bf_type] "i" (type), ...
>>>>>> because as I understand it can be an issue with 'i' modifier
>>>>>> in
>>>>>> case of
>>>>>> PIE. I am not sure that Xen enables PIE somewhere but
>>>>>> still...
>>>>>> If it is not an issue then we can use x86 implementation as a
>>>>>> generic
>>>>>> one.
>>>>> "i" is not generally an issue with PIE, it only is when the
>>>>> value
>>>>> is
>>>>> the
>>>>> address of a symbol. Here "type" is a constant in all cases.
>>>>> (Or
>>>>> else
>>>>> how would you express an immediate operand of an instruction in
>>>>> an
>>>>> asm()?)
>>>> Hmm. I don't know other ways to express an immediate operand
>>>> except
>>>> 'i'.
>>>>
>>>> It looks like type, line, msg are always 'constants' 
>>>> (
>>>> they possibly can be passed without 'i' and used directly inside
>>>> asm():
>>>>    ...
>>>>    ".pushsection .bug_frames." __stringify(type) ", \"a\",
>>>> %progbits\n"\
>>>>    ...
>>>> ) but the issue will be with 'ptr' for which we have to use 'i'.
>>>>
>>>> And I am not sure about all 'constants'. For example, I think
>>>> that it
>>>> can be an issue with 'line' = '((line & ((1 << BUG_LINE_LO_WIDTH)
>>>> -
>>>> 1))
>>>> << BUG_DISP_WIDTH)' if we will use that directly inside asm(...).
>>>>
>>> I think we can avoid 'i' by using 'r' + some kind of mov
>>> instruction to
>>> write a register value to memory. The code below is pseudo-code:
>>> #define _ASM_BUGFRAME_TEXT(second_frame)
>>>     ...
>>>     "loc_disp:\n"
>>>     "    .long 0x0"
>>>     "mov [loc_disp], some_register"
>>>     ...
>>> And the we have to define mov for each architecture.
>>
>> No, you really cannot.  This is the asm equivalent of writing
>> something
>> like this:
>>
>> static struct bugframe __section(.bug_frames.1) foo = {
>>     .loc = insn - &foo + LINE_LO,
>>     .msg = "bug string" - &foo + LINE_HI,
>> };
>>
>> It is a data structure, not executable code.
> Thanks for the clarification.
> 
> What about mainly using C for BUG_FRAME and only allocating bug_frame
> sections in assembly?
> 
> Please look at POC below:

That's still using statements (assignments), not initializers. We expect
the table to be populated at build time, and we expect it to be read-only
at runtime. Plus your approach once again increases overall size (just
that this time you add code, not data).

Jan

> #include <stdio.h>
> #include <stdint.h>
> 
> #define BUG_DISP_WIDTH    24
> #define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
> #define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
> 
> struct bug_frame {
>     signed int loc_disp:BUG_DISP_WIDTH;
>     unsigned int line_hi:BUG_LINE_HI_WIDTH;
>     signed int ptr_disp:BUG_DISP_WIDTH;
>     unsigned int line_lo:BUG_LINE_LO_WIDTH;
>     signed int msg_disp[];
> };
> 
> #define bug_loc(b) ((const void *)(b) + (b)->loc_disp)
> #define bug_ptr(b) ((const void *)(b) + (b)->ptr_disp)
> #define bug_line(b) (((((b)->line_hi + ((b)->loc_disp < 0)) & \
>                        ((1 << BUG_LINE_HI_WIDTH) - 1)) << \
>                       BUG_LINE_LO_WIDTH) + \
>                      (((b)->line_lo + ((b)->ptr_disp < 0)) &          
> \
>                       ((1 << BUG_LINE_LO_WIDTH) - 1)))
> #define bug_msg(b) ((const char *)(b) + (b)->msg_disp[1])
> 
> #define ALLOCATE_BF_SECTION(has_msg) do {                            \
>     asm (".pushsection bug_frames                  \n"                
> \
>          ".long 0, 0 \n"                                              
> \
>         ".if " #has_msg "\n"                                          
> \
>          "\t.long 0 \n"                                               
> \
>          "\t.long 0 \n"                                               
> \
>          ".endif\n"                                                   
> \
>          ".popsection" );                                             
> \
> } while (0)
> 
> #define MERGE_(a,b)  a##b
> #define UNIQUE_BUG_INSTR_LABEL(a) MERGE_(unique_name_, a)
> 
> #define BUG_FRAME(type, line, file, has_msg, msg) do {                
> \
>     unsigned int line_lo = (((line) >> BUG_LINE_LO_WIDTH) <<
> BUG_DISP_WIDTH);                                                      
> \
>     unsigned int line_hi = ((line & ((1 << BUG_LINE_LO_WIDTH) - 1)) <<
> BUG_DISP_WIDTH);                                                      
> \
>     ALLOCATE_BF_SECTION(1);                                           
> \
>     *(signed int *)(&bug_frames) = (unsigned long)
> &&UNIQUE_BUG_INSTR_LABEL(line) - (unsigned long)&bug_frames + line_lo;
> \
>     *((signed int *)(&bug_frames) + 1) = (unsigned long)file -
> (unsigned long)&bug_frames +  line_hi;                                
> \
>     bug_var.msg_disp[1] = (signed int)((unsigned long)#msg - (unsigned
> long)&bug_frames);                                                    
> \
>     UNIQUE_BUG_INSTR_LABEL(line):                                     
> \
>         asm volatile ("nop");                                         
> } while (0)
> 
> extern char bug_frames[];
> 
> static struct bug_frame bug_var __attribute__((section("bug_frames")));
> 
> int main() {
>     BUG_FRAME(1, __LINE__, __FILE__, 1, "TEST");
> 
>     printf("bug_loc: %p\n", bug_loc(&bug_var));
>     printf("bug_ptr: 0x%x\n", bug_ptr(&bug_var));
>     printf("__FILE__: %s\n", (char *)bug_ptr(&bug_var));
>     printf("bug_line: %d\n", bug_line(&bug_var));
>     printf("msg: %s\n", bug_msg(&bug_var));
> 
>     BUG_FRAME(1, __LINE__, __FILE__, 1, "NEW TEST");
> 
>     printf("bug_loc: %p\n", bug_loc(&bug_var));
>     printf("bug_ptr: 0x%x\n", bug_ptr(&bug_var));
>     printf("__FILE__: %s\n", (char *)bug_ptr(&bug_var));
>     printf("bug_line: %d\n", bug_line(&bug_var));
>     printf("msg: %s\n", bug_msg(&bug_var));
> 
>     return 0;
> }
> 
> Do you think it makes any sense? In this case, all BUG_FRAME can be re-
> used among all architectures, and only bug instructions should be
> changed.
> 
>>
>> ~Andrew
> ~ Oleksii
Oleksii Kurochko Feb. 17, 2023, 9:33 a.m. UTC | #24
On Fri, 2023-02-17 at 08:12 +0100, Jan Beulich wrote:
> On 16.02.2023 21:44, Oleksii wrote:
> > On Thu, 2023-02-16 at 12:19 +0000, Andrew Cooper wrote:
> > > On 16/02/2023 12:09 pm, Oleksii wrote:
> > > > On Thu, 2023-02-16 at 12:44 +0200, Oleksii wrote:
> > > > > On Thu, 2023-02-16 at 08:31 +0100, Jan Beulich wrote:
> > > > > > On 15.02.2023 18:59, Oleksii wrote:
> > > > > > > Hello Jan and community,
> > > > > > > 
> > > > > > > I experimented and switched RISC-V to x86 implementation.
> > > > > > > All
> > > > > > > that
> > > > > > > I
> > > > > > > changed in x86 implementation for RISC-V was
> > > > > > > _ASM_BUGFRAME_TEXT.
> > > > > > > Other
> > > > > > > things are the same as for x86.
> > > > > > > 
> > > > > > > For RISC-V it is fine to skip '%c' modifier so
> > > > > > > _ASM_BUGFRAME_TEXT
> > > > > > > will
> > > > > > > look like:
> > > > > > > 
> > > > > > > #define _ASM_BUGFRAME_TEXT(second_frame) \
> > > > > > >     ".Lbug%=: ebreak\n"   
> > > > > > >     ".pushsection .bug_frames.%[bf_type], \"a\",
> > > > > > > @progbits\n"
> > > > > > >     ".p2align 2\n"
> > > > > > >     ".Lfrm%=:\n"
> > > > > > >     ".long (.Lbug%= - .Lfrm%=) + %[bf_line_hi]\n"
> > > > > > >     ".long (%[bf_ptr] - .Lfrm%=) + %[bf_line_lo]\n"
> > > > > > >     ".if " #second_frame "\n"
> > > > > > >     ".long 0, %[bf_msg] - .Lfrm%=\n"
> > > > > > >     ".endif\n"
> > > > > > >     ".popsection\n"
> > > > > > I expect this could be further abstracted such that only
> > > > > > the
> > > > > > actual
> > > > > > instruction is arch-specific.
> > > > > Generally, the only thing that can't be abstracted ( as you
> > > > > mentioned
> > > > > is an instruction ).
> > > > > 
> > > > > So we can make additional defines:
> > > > >   1. #define MODIFIER "" or "c" (depends on architecture) and
> > > > > use
> > > > > it
> > > > >  
> > > > > the following way:
> > > > >    ".pushsection .bug_frames.%"MODIFIER"[bf_type], \"a\",
> > > > > @progbits\n"
> > > > >    ...
> > > > >   2. #define BUG_INSTR "ebreak" | "ud2" | "..." and use it in
> > > > > the
> > > > > following way:
> > > > >    ".Lbug%=: "BUG_ISNTR"\n"
> > > > >    ...
> > > > > Except for these defines which should be specified for each
> > > > > architecture
> > > > > all other code will be the same for all architectures.
> > > > > 
> > > > > But as I understand with modifier 'c' can be issues that not
> > > > > all
> > > > > compilers support but for ARM and  x86 before immediate is
> > > > > present
> > > > > punctuation # or $ which are removed by modifier 'c'. And I
> > > > > am
> > > > > wondering what other ways to remove punctuation before
> > > > > immediate
> > > > > exist.
> > > > > 
> > > > > Do you think we should consider that as an issue?
> > > > > 
> > > > > > > The only thing I am worried about is:
> > > > > > > 
> > > > > > > #define _ASM_BUGFRAME_INFO(type, line, ptr, msg) \
> > > > > > >   [bf_type] "i" (type), ...
> > > > > > > because as I understand it can be an issue with 'i'
> > > > > > > modifier
> > > > > > > in
> > > > > > > case of
> > > > > > > PIE. I am not sure that Xen enables PIE somewhere but
> > > > > > > still...
> > > > > > > If it is not an issue then we can use x86 implementation
> > > > > > > as a
> > > > > > > generic
> > > > > > > one.
> > > > > > "i" is not generally an issue with PIE, it only is when the
> > > > > > value
> > > > > > is
> > > > > > the
> > > > > > address of a symbol. Here "type" is a constant in all
> > > > > > cases.
> > > > > > (Or
> > > > > > else
> > > > > > how would you express an immediate operand of an
> > > > > > instruction in
> > > > > > an
> > > > > > asm()?)
> > > > > Hmm. I don't know other ways to express an immediate operand
> > > > > except
> > > > > 'i'.
> > > > > 
> > > > > It looks like type, line, msg are always 'constants' 
> > > > > (
> > > > > they possibly can be passed without 'i' and used directly
> > > > > inside
> > > > > asm():
> > > > >    ...
> > > > >    ".pushsection .bug_frames." __stringify(type) ", \"a\",
> > > > > %progbits\n"\
> > > > >    ...
> > > > > ) but the issue will be with 'ptr' for which we have to use
> > > > > 'i'.
> > > > > 
> > > > > And I am not sure about all 'constants'. For example, I think
> > > > > that it
> > > > > can be an issue with 'line' = '((line & ((1 <<
> > > > > BUG_LINE_LO_WIDTH)
> > > > > -
> > > > > 1))
> > > > > << BUG_DISP_WIDTH)' if we will use that directly inside
> > > > > asm(...).
> > > > > 
> > > > I think we can avoid 'i' by using 'r' + some kind of mov
> > > > instruction to
> > > > write a register value to memory. The code below is pseudo-
> > > > code:
> > > > #define _ASM_BUGFRAME_TEXT(second_frame)
> > > >     ...
> > > >     "loc_disp:\n"
> > > >     "    .long 0x0"
> > > >     "mov [loc_disp], some_register"
> > > >     ...
> > > > And the we have to define mov for each architecture.
> > > 
> > > No, you really cannot.  This is the asm equivalent of writing
> > > something
> > > like this:
> > > 
> > > static struct bugframe __section(.bug_frames.1) foo = {
> > >     .loc = insn - &foo + LINE_LO,
> > >     .msg = "bug string" - &foo + LINE_HI,
> > > };
> > > 
> > > It is a data structure, not executable code.
> > Thanks for the clarification.
> > 
> > What about mainly using C for BUG_FRAME and only allocating
> > bug_frame
> > sections in assembly?
> > 
> > Please look at POC below:
> 
> That's still using statements (assignments), not initializers. We
> expect
> the table to be populated at build time, and we expect it to be read-
> only
> at runtime. Plus your approach once again increases overall size
> (just
> that this time you add code, not data).
If we have such requirements then the best option is to use as a
generic implementation the implementation provided by x86 ( as I
mentioned before, it mainly can be re-used for RISC-V ) and leave ARM
implementation mostly as is ( except for some minor changes ).
> 
> Jan
> 
> > #include <stdio.h>
> > #include <stdint.h>
> > 
> > #define BUG_DISP_WIDTH    24
> > #define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
> > #define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
> > 
> > struct bug_frame {
> >     signed int loc_disp:BUG_DISP_WIDTH;
> >     unsigned int line_hi:BUG_LINE_HI_WIDTH;
> >     signed int ptr_disp:BUG_DISP_WIDTH;
> >     unsigned int line_lo:BUG_LINE_LO_WIDTH;
> >     signed int msg_disp[];
> > };
> > 
> > #define bug_loc(b) ((const void *)(b) + (b)->loc_disp)
> > #define bug_ptr(b) ((const void *)(b) + (b)->ptr_disp)
> > #define bug_line(b) (((((b)->line_hi + ((b)->loc_disp < 0)) & \
> >                        ((1 << BUG_LINE_HI_WIDTH) - 1)) << \
> >                       BUG_LINE_LO_WIDTH) + \
> >                      (((b)->line_lo + ((b)->ptr_disp < 0))
> > &          
> > \
> >                       ((1 << BUG_LINE_LO_WIDTH) - 1)))
> > #define bug_msg(b) ((const char *)(b) + (b)->msg_disp[1])
> > 
> > #define ALLOCATE_BF_SECTION(has_msg) do
> > {                            \
> >     asm (".pushsection bug_frames                 
> > \n"                
> > \
> >          ".long 0, 0
> > \n"                                              
> > \
> >         ".if " #has_msg
> > "\n"                                          
> > \
> >          "\t.long 0
> > \n"                                               
> > \
> >          "\t.long 0
> > \n"                                               
> > \
> >         
> > ".endif\n"                                                   
> > \
> >          ".popsection"
> > );                                             
> > \
> > } while (0)
> > 
> > #define MERGE_(a,b)  a##b
> > #define UNIQUE_BUG_INSTR_LABEL(a) MERGE_(unique_name_, a)
> > 
> > #define BUG_FRAME(type, line, file, has_msg, msg) do
> > {                
> > \
> >     unsigned int line_lo = (((line) >> BUG_LINE_LO_WIDTH) <<
> > BUG_DISP_WIDTH);                                                   
> >    
> > \
> >     unsigned int line_hi = ((line & ((1 << BUG_LINE_LO_WIDTH) - 1))
> > <<
> > BUG_DISP_WIDTH);                                                   
> >    
> > \
> >    
> > ALLOCATE_BF_SECTION(1);                                           
> > \
> >     *(signed int *)(&bug_frames) = (unsigned long)
> > &&UNIQUE_BUG_INSTR_LABEL(line) - (unsigned long)&bug_frames +
> > line_lo;
> > \
> >     *((signed int *)(&bug_frames) + 1) = (unsigned long)file -
> > (unsigned long)&bug_frames + 
> > line_hi;                                
> > \
> >     bug_var.msg_disp[1] = (signed int)((unsigned long)#msg -
> > (unsigned
> > long)&bug_frames);                                                 
> >    
> > \
> >    
> > UNIQUE_BUG_INSTR_LABEL(line):                                     
> > \
> >         asm volatile
> > ("nop");                                         
> > } while (0)
> > 
> > extern char bug_frames[];
> > 
> > static struct bug_frame bug_var
> > __attribute__((section("bug_frames")));
> > 
> > int main() {
> >     BUG_FRAME(1, __LINE__, __FILE__, 1, "TEST");
> > 
> >     printf("bug_loc: %p\n", bug_loc(&bug_var));
> >     printf("bug_ptr: 0x%x\n", bug_ptr(&bug_var));
> >     printf("__FILE__: %s\n", (char *)bug_ptr(&bug_var));
> >     printf("bug_line: %d\n", bug_line(&bug_var));
> >     printf("msg: %s\n", bug_msg(&bug_var));
> > 
> >     BUG_FRAME(1, __LINE__, __FILE__, 1, "NEW TEST");
> > 
> >     printf("bug_loc: %p\n", bug_loc(&bug_var));
> >     printf("bug_ptr: 0x%x\n", bug_ptr(&bug_var));
> >     printf("__FILE__: %s\n", (char *)bug_ptr(&bug_var));
> >     printf("bug_line: %d\n", bug_line(&bug_var));
> >     printf("msg: %s\n", bug_msg(&bug_var));
> > 
> >     return 0;
> > }
> > 
> > Do you think it makes any sense? In this case, all BUG_FRAME can be
> > re-
> > used among all architectures, and only bug instructions should be
> > changed.
> > 
> > > 
> > > ~Andrew
> > ~ Oleksii
>
diff mbox series

Patch

diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index f1ea3199c8..811b4eaf3b 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -92,6 +92,12 @@  config STATIC_MEMORY
 
 	  If unsure, say N.
 
+config GENERIC_DO_BUG_FRAME
+	bool
+	help
+	  Generic do_bug_frame() function is needed to handle the type of bug
+	  frame and print an information about it.
+
 menu "Speculative hardening"
 
 config INDIRECT_THUNK
diff --git a/xen/common/Makefile b/xen/common/Makefile
index bbd75b4be6..7d04c8d3b2 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -1,5 +1,6 @@ 
 obj-$(CONFIG_ARGO) += argo.o
 obj-y += bitmap.o
+obj-$(CONFIG_GENERIC_DO_BUG_FRAME) += bug.o
 obj-$(CONFIG_HYPFS_CONFIG) += config_data.o
 obj-$(CONFIG_CORE_PARKING) += core_parking.o
 obj-y += cpu.o
diff --git a/xen/common/bug.c b/xen/common/bug.c
new file mode 100644
index 0000000000..393e58d571
--- /dev/null
+++ b/xen/common/bug.c
@@ -0,0 +1,88 @@ 
+#include <xen/bug.h>
+#include <xen/errno.h>
+#include <xen/types.h>
+#include <xen/kernel.h>
+#include <xen/string.h>
+#include <xen/virtual_region.h>
+
+#include <asm/processor.h>
+
+int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)
+{
+    const struct bug_frame *bug = NULL;
+    const char *prefix = "", *filename, *predicate;
+    unsigned long fixup;
+    int id = -1, lineno;
+    const struct virtual_region *region;
+
+    region = find_text_region(pc);
+    if ( region )
+    {
+        for ( id = 0; id < BUGFRAME_NR; id++ )
+        {
+            const struct bug_frame *b;
+            unsigned int i;
+
+            for ( i = 0, b = region->frame[id].bugs;
+                  i < region->frame[id].n_bugs; b++, i++ )
+            {
+                if ( ((vaddr_t)bug_loc(b)) == pc )
+                {
+                    bug = b;
+                    goto found;
+                }
+            }
+        }
+    }
+ found:
+    if ( !bug )
+        return -ENOENT;
+
+    if ( id == BUGFRAME_run_fn )
+    {
+        void (*fn)(const struct cpu_user_regs *) = (void *)regs->BUG_FN_REG;
+
+        fn(regs);
+        return 0;
+    }
+
+    /* WARN, BUG or ASSERT: decode the filename pointer and line number. */
+    filename = bug_file(bug);
+    if ( !is_kernel(filename) )
+        return -EINVAL;
+    fixup = strlen(filename);
+    if ( fixup > 50 )
+    {
+        filename += fixup - 47;
+        prefix = "...";
+    }
+    lineno = bug_line(bug);
+
+    switch ( id )
+    {
+    case BUGFRAME_warn:
+        printk("Xen WARN at %s%s:%d\n", prefix, filename, lineno);
+        show_execution_state(regs);
+        return 0;
+
+    case BUGFRAME_bug:
+        printk("Xen BUG at %s%s:%d\n", prefix, filename, lineno);
+        show_execution_state(regs);
+        panic("Xen BUG at %s%s:%d\n", prefix, filename, lineno);
+
+    case BUGFRAME_assert:
+        /* ASSERT: decode the predicate string pointer. */
+        predicate = bug_msg(bug);
+        if ( !is_kernel(predicate) )
+            predicate = "<unknown>";
+
+        printk("Assertion '%s' failed at %s%s:%d\n",
+               predicate, prefix, filename, lineno);
+        show_execution_state(regs);
+        panic("Assertion '%s' failed at %s%s:%d\n",
+              predicate, prefix, filename, lineno);
+    }
+
+    return -EINVAL;
+}
+
diff --git a/xen/include/xen/bug.h b/xen/include/xen/bug.h
new file mode 100644
index 0000000000..b46dae035e
--- /dev/null
+++ b/xen/include/xen/bug.h
@@ -0,0 +1,127 @@ 
+#ifndef __XEN_BUG_H__
+#define __XEN_BUG_H__
+
+#define BUG_DISP_WIDTH    24
+#define BUG_LINE_LO_WIDTH (31 - BUG_DISP_WIDTH)
+#define BUG_LINE_HI_WIDTH (31 - BUG_DISP_WIDTH)
+
+#define BUGFRAME_run_fn 0
+#define BUGFRAME_warn   1
+#define BUGFRAME_bug    2
+#define BUGFRAME_assert 3
+
+#define BUGFRAME_NR     4
+
+#ifndef __ASSEMBLY__
+
+#include <xen/errno.h>
+#include <xen/stringify.h>
+#include <xen/types.h>
+#include <xen/lib.h>
+
+#include <asm/bug.h>
+
+#ifndef BUG_FRAME_STUFF
+struct bug_frame {
+    signed int loc_disp;    /* Relative address to the bug address */
+    signed int file_disp;   /* Relative address to the filename */
+    signed int msg_disp;    /* Relative address to the predicate (for ASSERT) */
+    uint16_t line;          /* Line number */
+    uint32_t pad0:16;       /* Padding for 8-bytes align */
+};
+
+#define bug_loc(b) ((const void *)(b) + (b)->loc_disp)
+#define bug_file(b) ((const void *)(b) + (b)->file_disp);
+#define bug_line(b) ((b)->line)
+#define bug_msg(b) ((const char *)(b) + (b)->msg_disp)
+#endif /* BUG_FRAME_STUFF */
+
+#ifndef BUG_FRAME
+/* Many versions of GCC doesn't support the asm %c parameter which would
+ * be preferable to this unpleasantness. We use mergeable string
+ * sections to avoid multiple copies of the string appearing in the
+ * Xen image. BUGFRAME_run_fn needs to be handled separately.
+ */
+#define BUG_FRAME(type, line, file, has_msg, msg) do {                      \
+    BUILD_BUG_ON((line) >> 16);                                             \
+    BUILD_BUG_ON((type) >= BUGFRAME_NR);                                    \
+    asm ("1:"BUG_INSTR"\n"                                                  \
+         ".pushsection .rodata.str, \"aMS\", %progbits, 1\n"                \
+         "2:\t.asciz " __stringify(file) "\n"                               \
+         "3:\n"                                                             \
+         ".if " #has_msg "\n"                                               \
+         "\t.asciz " #msg "\n"                                              \
+         ".endif\n"                                                         \
+         ".popsection\n"                                                    \
+         ".pushsection .bug_frames." __stringify(type) ", \"a\", %progbits\n"\
+         "4:\n"                                                             \
+         ".p2align 2\n"                                                     \
+         ".long (1b - 4b)\n"                                                \
+         ".long (2b - 4b)\n"                                                \
+         ".long (3b - 4b)\n"                                                \
+         ".hword " __stringify(line) ", 0\n"                                \
+         ".popsection");                                                    \
+} while (0)
+#endif /* BUG_FRAME */
+
+#ifndef run_in_exception_handler
+/*
+ * GCC will not allow to use "i"  when PIE is enabled (Xen doesn't set the
+ * flag but instead rely on the default value from the compiler). So the
+ * easiest way to implement run_in_exception_handler() is to pass the to
+ * be called function in a fixed register.
+ */
+#define  run_in_exception_handler(fn) do {                                  \
+    register void *fn_ asm(__stringify(BUG_FN_REG)) = (fn);                 \
+    asm ("1:"BUG_INSTR"\n"                                                  \
+         ".pushsection .bug_frames." __stringify(BUGFRAME_run_fn) ","       \
+         "             \"a\", %%progbits\n"                                 \
+         "2:\n"                                                             \
+         ".p2align 2\n"                                                     \
+         ".long (1b - 2b)\n"                                                \
+         ".long 0, 0, 0\n"                                                  \
+         ".popsection" :: "r" (fn_));                                       \
+} while (0)
+#endif /* run_in_exception_handler */
+
+#ifndef WARN
+#define WARN() BUG_FRAME(BUGFRAME_warn, __LINE__, __FILE__, 0, "")
+#endif /* WARN */
+
+#ifndef BUG
+#define BUG() do {                                              \
+    BUG_FRAME(BUGFRAME_bug,  __LINE__, __FILE__, 0, "");        \
+    unreachable();                                              \
+} while (0)
+#endif
+
+#ifndef assert_failed
+#define assert_failed(msg) do {                                 \
+    BUG_FRAME(BUGFRAME_assert, __LINE__, __FILE__, 1, msg);     \
+    unreachable();                                              \
+} while (0)
+#endif
+
+extern const struct bug_frame __start_bug_frames[],
+                              __stop_bug_frames_0[],
+                              __stop_bug_frames_1[],
+                              __stop_bug_frames_2[],
+                              __stop_bug_frames_3[];
+
+#else /* !__ASSEMBLY__ */
+
+#ifdef CONFIG_X86
+#include <asm/bug.h>
+#endif
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* __XEN_BUG_H__ */
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */