Message ID | 20230426-devalue-enlarging-afb4fa1bb247@wendy (mailing list archive) |
---|---|
State | Rejected |
Headers | show |
Series | Handle multi-letter extensions starting with caps in riscv,isa | expand |
Context | Check | Description |
---|---|---|
conchuod/cover_letter | success | Series has a cover letter |
conchuod/tree_selection | success | Guessed tree name to be fixes at HEAD 1b50f956c8fe |
conchuod/fixes_present | success | Fixes tag present in non-next series |
conchuod/maintainers_pattern | success | MAINTAINERS pattern errors before the patch: 1 and now 1 |
conchuod/verify_signedoff | success | Signed-off-by tag matches author and committer |
conchuod/kdoc | success | Errors and warnings before: 0 this patch: 0 |
conchuod/build_rv64_clang_allmodconfig | success | Errors and warnings before: 18 this patch: 18 |
conchuod/module_param | success | Was 0 now: 0 |
conchuod/build_rv64_gcc_allmodconfig | success | Errors and warnings before: 18 this patch: 18 |
conchuod/build_rv32_defconfig | success | Build OK |
conchuod/dtb_warn_rv64 | success | Errors and warnings before: 3 this patch: 3 |
conchuod/header_inline | success | No static functions without inline keyword in header files |
conchuod/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 36 lines checked |
conchuod/source_inline | success | Was 0 now: 0 |
conchuod/build_rv64_nommu_k210_defconfig | success | Build OK |
conchuod/verify_fixes | success | Fixes tag looks correct |
conchuod/build_rv64_nommu_virt_defconfig | success | Build OK |
On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > Yangyu Chen reported that if an multi-letter extension begins with a > capital letter the parser will treat the remainder of that multi-letter > extension as single-letter extensions. I think the problem is that the parser doesn't completely abort when it sees something it doesn't understand. Continuing is risky since it may be possible to compose an invalid string that gets the parser to run off the rails. How about completely aborting, noisily, when the string doesn't match expectations, falling back to a default string such as rv64ima instead. That also ought to get faster corrections of device trees. > > Certain versions of rocket-chip will export devicetree containing > rv64ima_Zifencei, which is parsed by the kernel as rv64imafc. > > While capital letters in riscv,isa are invalid and the validation of > devicetree's isn't the kernel's job, we should behave more gracefully > here. Rather than abort parsing on meeting a capital letter, mark the > extension as an error & allow the parser to skip ahead to the next > extension. > > Reported-by: Yangyu Chen <cyy@cyyself.name> > Link: https://lore.kernel.org/all/tencent_1647475C9618C390BEC601BE2CC1206D0C07@qq.com/ > Fixes: 2a31c54be097 ("RISC-V: Minimal parser for "riscv, isa" strings") > Signed-off-by: Conor Dooley <conor.dooley@microchip.com> > --- > arch/riscv/kernel/cpufeature.c | 17 ++++++++++++++--- > 1 file changed, 14 insertions(+), 3 deletions(-) > > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c > index 52585e088873..93850540b0b4 100644 > --- a/arch/riscv/kernel/cpufeature.c > +++ b/arch/riscv/kernel/cpufeature.c > @@ -142,6 +142,10 @@ void __init riscv_fill_hwcap(void) > const char *ext_end = isa; > bool ext_long = false, ext_err = false; > > + if (unlikely(!islower(*ext))) { > + ext_err = true; > + } Can drop the {} > + > switch (*ext) { > case 's': > /** > @@ -156,6 +160,15 @@ void __init riscv_fill_hwcap(void) > break; > } > fallthrough; > + case 'S': > + case 'X': > + case 'Z': > + /* > + * As the riscv,isa string must be lower-case, > + * S, X and Z are not valid characters. Parse > + * the invalid extension anyway, to skip ahead > + * to the next valid one. > + */ > case 'x': > case 'z': > ext_long = true; > @@ -185,10 +198,8 @@ void __init riscv_fill_hwcap(void) > ++ext_end; > break; > default: > - if (unlikely(!islower(*ext))) { > - ext_err = true; > + if (unlikely(ext_err)) > break; > - } > /* Find next extension */ > if (!isdigit(*isa)) > break; > -- > 2.39.2 > Thanks, drew
On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > Yangyu Chen reported that if an multi-letter extension begins with a > > capital letter the parser will treat the remainder of that multi-letter > > extension as single-letter extensions. > > I think the problem is that the parser doesn't completely abort when > it sees something it doesn't understand. Continuing is risky since > it may be possible to compose an invalid string that gets the parser > to run off the rails. Usually I am of the opinion that we should not seek the validate the dt in the kernel, since there are tools for doing so *cough* dt-validate *cough*. This one seemed like low hanging fruit though, since the parser handles having capital letters in any of the other places after the rv##, but falls over pretty badly for this particular issue. In general, I don't think we need to be concerned about anything that fails dt-validate though, you kinda need to trust that that is correct. I'd argue that we might even do too much validation in the parser at present. Is there some attack vector, or ACPI related consideration, that I am unaware of that makes this risky? > How about completely aborting, noisily, when the string doesn't match > expectations, falling back to a default string such as rv64ima instead. > That also ought to get faster corrections of device trees. I did this first actually, but I was afraid that it would cause regressions? If you have riscv,isa = "rv64imafdc_Zifencei_zicbom", yes that is invalid and dt-validate would have told you so, but at present that would be parsed as "rv64imafdc_zicbom" which is a perfect description of the hardware in question (since the meaning of i was set before RVI made a hames of things). So that's why I opted to not do some sort of pr_err/BUG()/WARN() and try to keep processing the string. I'm happy to abort entirely on reaching a capital if people feel there's unlikely to be a fallout from that. Cheers, Conor.
On Wed, Apr 26, 2023 at 01:47:39PM +0100, Conor Dooley wrote: > On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > > Yangyu Chen reported that if an multi-letter extension begins with a > > > capital letter the parser will treat the remainder of that multi-letter > > > extension as single-letter extensions. > > > > I think the problem is that the parser doesn't completely abort when > > it sees something it doesn't understand. Continuing is risky since > > it may be possible to compose an invalid string that gets the parser > > to run off the rails. > > Usually I am of the opinion that we should not seek the validate the dt > in the kernel, since there are tools for doing so *cough* dt-validate > *cough*. This one seemed like low hanging fruit though, since the parser > handles having capital letters in any of the other places after the > rv##, but falls over pretty badly for this particular issue. > > In general, I don't think we need to be concerned about anything that > fails dt-validate though, you kinda need to trust that that is correct. > I'd argue that we might even do too much validation in the parser at > present. > Is there some attack vector, or ACPI related consideration, that I am > unaware of that makes this risky? C language + string processing == potential attack vector > > > How about completely aborting, noisily, when the string doesn't match > > expectations, falling back to a default string such as rv64ima instead. > > That also ought to get faster corrections of device trees. > > I did this first actually, but I was afraid that it would cause > regressions? > > If you have riscv,isa = "rv64imafdc_Zifencei_zicbom", yes that is > invalid and dt-validate would have told you so, but at present that > would be parsed as "rv64imafdc_zicbom" which is a perfect description of > the hardware in question (since the meaning of i was set before RVI made > a hames of things). > > So that's why I opted to not do some sort of pr_err/BUG()/WARN() and > try to keep processing the string. I'm happy to abort entirely on > reaching a capital if people feel there's unlikely to be a fallout from > that. There might be fallout, but the kernel needs to defend itself. IMO, if the kernel doesn't know how to parse something, then it should stop trying to immediately, either with a BUG(), refusing to accept any part of it, by fallbacking back to a default, or by only accepting what it believes it parsed correctly. The third option is probably a reasonable choice in this case. Thanks, drew
On Wed, Apr 26, 2023 at 03:08:25PM +0200, Andrew Jones wrote: > On Wed, Apr 26, 2023 at 01:47:39PM +0100, Conor Dooley wrote: > > On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > > > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > > > Yangyu Chen reported that if an multi-letter extension begins with a > > > > capital letter the parser will treat the remainder of that multi-letter > > > > extension as single-letter extensions. > > > > > > I think the problem is that the parser doesn't completely abort when > > > it sees something it doesn't understand. Continuing is risky since > > > it may be possible to compose an invalid string that gets the parser > > > to run off the rails. > > > > Usually I am of the opinion that we should not seek the validate the dt > > in the kernel, since there are tools for doing so *cough* dt-validate > > *cough*. This one seemed like low hanging fruit though, since the parser > > handles having capital letters in any of the other places after the > > rv##, but falls over pretty badly for this particular issue. > > > > In general, I don't think we need to be concerned about anything that > > fails dt-validate though, you kinda need to trust that that is correct. > > I'd argue that we might even do too much validation in the parser at > > present. > > Is there some attack vector, or ACPI related consideration, that I am > > unaware of that makes this risky? A bit unrelated to this, but your mention of ACPI made me go look at the approved ECR[1] again for the ISA string. It says "Null-terminated ASCII Instruction Set Architecture (ISA) string for this hart. The format of the ISA string is defined in the RISC-V unprivileged specification." I suppose we can still add additional requirements to an ACPI ISA string which the Linux kernel will parse, but it'll be odd to point people at the DT binding to do that. Maybe we should consider making the parser more complete, possibly by importing it from some reference implementation or something. [1] https://drive.google.com/file/d/1nP3nFiH4jkPMp6COOxP6123DCZKR-tia/view Thanks, drew > > C language + string processing == potential attack vector > > > > > > How about completely aborting, noisily, when the string doesn't match > > > expectations, falling back to a default string such as rv64ima instead. > > > That also ought to get faster corrections of device trees. > > > > I did this first actually, but I was afraid that it would cause > > regressions? > > > > If you have riscv,isa = "rv64imafdc_Zifencei_zicbom", yes that is > > invalid and dt-validate would have told you so, but at present that > > would be parsed as "rv64imafdc_zicbom" which is a perfect description of > > the hardware in question (since the meaning of i was set before RVI made > > a hames of things). > > > > So that's why I opted to not do some sort of pr_err/BUG()/WARN() and > > try to keep processing the string. I'm happy to abort entirely on > > reaching a capital if people feel there's unlikely to be a fallout from > > that. > > There might be fallout, but the kernel needs to defend itself. IMO, if > the kernel doesn't know how to parse something, then it should stop > trying to immediately, either with a BUG(), refusing to accept any > part of it, by fallbacking back to a default, or by only accepting what > it believes it parsed correctly. > > The third option is probably a reasonable choice in this case. > > Thanks, > drew
On Wed, Apr 26, 2023 at 03:08:25PM +0200, Andrew Jones wrote: > On Wed, Apr 26, 2023 at 01:47:39PM +0100, Conor Dooley wrote: > > On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > > > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > > > Yangyu Chen reported that if an multi-letter extension begins with a > > > > capital letter the parser will treat the remainder of that multi-letter > > > > extension as single-letter extensions. > > > > > > I think the problem is that the parser doesn't completely abort when > > > it sees something it doesn't understand. Continuing is risky since > > > it may be possible to compose an invalid string that gets the parser > > > to run off the rails. > > > > Usually I am of the opinion that we should not seek the validate the dt > > in the kernel, since there are tools for doing so *cough* dt-validate > > *cough*. This one seemed like low hanging fruit though, since the parser > > handles having capital letters in any of the other places after the > > rv##, but falls over pretty badly for this particular issue. > > > > In general, I don't think we need to be concerned about anything that > > fails dt-validate though, you kinda need to trust that that is correct. > > I'd argue that we might even do too much validation in the parser at > > present. > > Is there some attack vector, or ACPI related consideration, that I am > > unaware of that makes this risky? > > C language + string processing == potential attack vector Right. I thought there was some specific scenario that you had in mind, rather than the "obvious" "parsing strings is bad". What I was wondering is whether the devicetree is an attack vector you actually have to care about? I had thought it wasn't, hence asking. > > > How about completely aborting, noisily, when the string doesn't match > > > expectations, falling back to a default string such as rv64ima instead. > > > That also ought to get faster corrections of device trees. > > > > I did this first actually, but I was afraid that it would cause > > regressions? > > > > If you have riscv,isa = "rv64imafdc_Zifencei_zicbom", yes that is > > invalid and dt-validate would have told you so, but at present that > > would be parsed as "rv64imafdc_zicbom" which is a perfect description of > > the hardware in question (since the meaning of i was set before RVI made > > a hames of things). After thinking about it a bit cycling home, I don't actually think that this would be a regression. If your dt is not valid, then that's your problem, not ours :) Valid dt will be parsed correctly before and after such a change, so I think that that is actually okay. The dt-binding exists for a reason, and can be pointed to if anyone claims this is a regression I think. > > So that's why I opted to not do some sort of pr_err/BUG()/WARN() and > > try to keep processing the string. I'm happy to abort entirely on > > reaching a capital if people feel there's unlikely to be a fallout from > > that. > > There might be fallout, but the kernel needs to defend itself. IMO, if > the kernel doesn't know how to parse something, then it should stop > trying to immediately, either with a BUG(), refusing to accept any > part of it, by fallbacking back to a default, or by only accepting what > it believes it parsed correctly. > > The third option is probably a reasonable choice in this case. My patch could be interpreted as meeting the criteria for option 3. I think you instead mean "stop parsing at that point & only report the extensions seen prior to the first bad one"? Cheers, Conor.
On Wed, Apr 26, 2023 at 02:58:45PM +0100, Conor Dooley wrote: > On Wed, Apr 26, 2023 at 03:08:25PM +0200, Andrew Jones wrote: > > On Wed, Apr 26, 2023 at 01:47:39PM +0100, Conor Dooley wrote: > > > On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > > > > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > > > > Yangyu Chen reported that if an multi-letter extension begins with a > > > > > capital letter the parser will treat the remainder of that multi-letter > > > > > extension as single-letter extensions. > > > > > > > > I think the problem is that the parser doesn't completely abort when > > > > it sees something it doesn't understand. Continuing is risky since > > > > it may be possible to compose an invalid string that gets the parser > > > > to run off the rails. > > > > > > Usually I am of the opinion that we should not seek the validate the dt > > > in the kernel, since there are tools for doing so *cough* dt-validate > > > *cough*. This one seemed like low hanging fruit though, since the parser > > > handles having capital letters in any of the other places after the > > > rv##, but falls over pretty badly for this particular issue. > > > > > > In general, I don't think we need to be concerned about anything that > > > fails dt-validate though, you kinda need to trust that that is correct. > > > I'd argue that we might even do too much validation in the parser at > > > present. > > > Is there some attack vector, or ACPI related consideration, that I am > > > unaware of that makes this risky? > > > > C language + string processing == potential attack vector > > Right. I thought there was some specific scenario that you had in mind, > rather than the "obvious" "parsing strings is bad". Yup, I only pointed out the obvious since it's always possible, at least for me, to lose sight of the forest for the trees. > What I was wondering is whether the devicetree is an attack vector you > actually have to care about? I had thought it wasn't, hence asking. Nope, I haven't put any thought into this potential attack vector beyond this discussion. > > > > > How about completely aborting, noisily, when the string doesn't match > > > > expectations, falling back to a default string such as rv64ima instead. > > > > That also ought to get faster corrections of device trees. > > > > > > I did this first actually, but I was afraid that it would cause > > > regressions? > > > > > > If you have riscv,isa = "rv64imafdc_Zifencei_zicbom", yes that is > > > invalid and dt-validate would have told you so, but at present that > > > would be parsed as "rv64imafdc_zicbom" which is a perfect description of > > > the hardware in question (since the meaning of i was set before RVI made > > > a hames of things). > > After thinking about it a bit cycling home, I don't actually think that > this would be a regression. If your dt is not valid, then that's your > problem, not ours :) > Valid dt will be parsed correctly before and after such a change, so I > think that that is actually okay. > The dt-binding exists for a reason, and can be pointed to if anyone > claims this is a regression I think. I agree. > > > > So that's why I opted to not do some sort of pr_err/BUG()/WARN() and > > > try to keep processing the string. I'm happy to abort entirely on > > > reaching a capital if people feel there's unlikely to be a fallout from > > > that. > > > > There might be fallout, but the kernel needs to defend itself. IMO, if > > the kernel doesn't know how to parse something, then it should stop > > trying to immediately, either with a BUG(), refusing to accept any > > part of it, by fallbacking back to a default, or by only accepting what > > it believes it parsed correctly. > > > > The third option is probably a reasonable choice in this case. > > My patch could be interpreted as meeting the criteria for option 3. > I think you instead mean "stop parsing at that point & only report the > extensions seen prior to the first bad one"? Right. Thanks, drew
On Wed, Apr 26, 2023 at 03:54:55PM +0200, Andrew Jones wrote: > On Wed, Apr 26, 2023 at 03:08:25PM +0200, Andrew Jones wrote: > > On Wed, Apr 26, 2023 at 01:47:39PM +0100, Conor Dooley wrote: > > > On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > > > > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > > > > Yangyu Chen reported that if an multi-letter extension begins with a > > > > > capital letter the parser will treat the remainder of that multi-letter > > > > > extension as single-letter extensions. > > > > > > > > I think the problem is that the parser doesn't completely abort when > > > > it sees something it doesn't understand. Continuing is risky since > > > > it may be possible to compose an invalid string that gets the parser > > > > to run off the rails. > > > > > > Usually I am of the opinion that we should not seek the validate the dt > > > in the kernel, since there are tools for doing so *cough* dt-validate > > > *cough*. This one seemed like low hanging fruit though, since the parser > > > handles having capital letters in any of the other places after the > > > rv##, but falls over pretty badly for this particular issue. > > > > > > In general, I don't think we need to be concerned about anything that > > > fails dt-validate though, you kinda need to trust that that is correct. > > > I'd argue that we might even do too much validation in the parser at > > > present. > > > Is there some attack vector, or ACPI related consideration, that I am > > > unaware of that makes this risky? > > A bit unrelated to this, but your mention of ACPI made me go look at the > approved ECR[1] again for the ISA string. It says "Null-terminated ASCII > Instruction Set Architecture (ISA) string for this hart. The format of the > ISA string is defined in the RISC-V unprivileged specification." I suppose > we can still add additional requirements to an ACPI ISA string which the > Linux kernel will parse, but it'll be odd to point people at the DT > binding to do that. Maybe we should consider making the parser more > complete, possibly by importing it from some reference implementation or > something. Heh, I wonder are we heading for some divergence here then. riscv,isa in a DT is explicitly *not* a match for that due to the backwards-incompatible changes made by RVI to extension definitions since riscv,isa was added to the dt-binding. Clarifying that one is the next patch in my todo list.. ACPI naively saying "it matches the spec" is asking for trouble, since there does not actually appear to be any sort of clarification about which *version* of the spec that may be. At least in the dt-binding, we have a format there, what happens to the ACPI spec if RVI decides that - is a suitable alternative to _ in some future edition? I don't think such a thing is all that likely, but surely you'd like to insulate the ABI from that sort of eventuality? Perhaps the thing to do is to actually take Yangyu's first patch and my second one, since the problem with backwards compatibility doesn't stop the kernel from being more permissive? Cheers, Conor. > > [1] https://drive.google.com/file/d/1nP3nFiH4jkPMp6COOxP6123DCZKR-tia/view > > Thanks, > drew > > > > > C language + string processing == potential attack vector > > > > > > > > > How about completely aborting, noisily, when the string doesn't match > > > > expectations, falling back to a default string such as rv64ima instead. > > > > That also ought to get faster corrections of device trees. > > > > > > I did this first actually, but I was afraid that it would cause > > > regressions? > > > > > > If you have riscv,isa = "rv64imafdc_Zifencei_zicbom", yes that is > > > invalid and dt-validate would have told you so, but at present that > > > would be parsed as "rv64imafdc_zicbom" which is a perfect description of > > > the hardware in question (since the meaning of i was set before RVI made > > > a hames of things). > > > > > > So that's why I opted to not do some sort of pr_err/BUG()/WARN() and > > > try to keep processing the string. I'm happy to abort entirely on > > > reaching a capital if people feel there's unlikely to be a fallout from > > > that. > > > > There might be fallout, but the kernel needs to defend itself. IMO, if > > the kernel doesn't know how to parse something, then it should stop > > trying to immediately, either with a BUG(), refusing to accept any > > part of it, by fallbacking back to a default, or by only accepting what > > it believes it parsed correctly. > > > > The third option is probably a reasonable choice in this case. > > > > Thanks, > > drew
On Wed, Apr 26, 2023 at 03:37:58PM +0100, Conor Dooley wrote: > On Wed, Apr 26, 2023 at 03:54:55PM +0200, Andrew Jones wrote: > > On Wed, Apr 26, 2023 at 03:08:25PM +0200, Andrew Jones wrote: > > > On Wed, Apr 26, 2023 at 01:47:39PM +0100, Conor Dooley wrote: > > > > On Wed, Apr 26, 2023 at 02:18:52PM +0200, Andrew Jones wrote: > > > > > On Wed, Apr 26, 2023 at 11:43:24AM +0100, Conor Dooley wrote: > > > > > > Yangyu Chen reported that if an multi-letter extension begins with a > > > > > > capital letter the parser will treat the remainder of that multi-letter > > > > > > extension as single-letter extensions. > > > > > > > > > > I think the problem is that the parser doesn't completely abort when > > > > > it sees something it doesn't understand. Continuing is risky since > > > > > it may be possible to compose an invalid string that gets the parser > > > > > to run off the rails. > > > > > > > > Usually I am of the opinion that we should not seek the validate the dt > > > > in the kernel, since there are tools for doing so *cough* dt-validate > > > > *cough*. This one seemed like low hanging fruit though, since the parser > > > > handles having capital letters in any of the other places after the > > > > rv##, but falls over pretty badly for this particular issue. > > > > > > > > In general, I don't think we need to be concerned about anything that > > > > fails dt-validate though, you kinda need to trust that that is correct. > > > > I'd argue that we might even do too much validation in the parser at > > > > present. > > > > Is there some attack vector, or ACPI related consideration, that I am > > > > unaware of that makes this risky? > > > > A bit unrelated to this, but your mention of ACPI made me go look at the > > approved ECR[1] again for the ISA string. It says "Null-terminated ASCII > > Instruction Set Architecture (ISA) string for this hart. The format of the > > ISA string is defined in the RISC-V unprivileged specification." I suppose > > we can still add additional requirements to an ACPI ISA string which the > > Linux kernel will parse, but it'll be odd to point people at the DT > > binding to do that. Maybe we should consider making the parser more > > complete, possibly by importing it from some reference implementation or > > something. > > Heh, I wonder are we heading for some divergence here then. riscv,isa in > a DT is explicitly *not* a match for that due to the > backwards-incompatible changes made by RVI to extension definitions > since riscv,isa was added to the dt-binding. Clarifying that one is the > next patch in my todo list.. > > ACPI naively saying "it matches the spec" is asking for trouble, since > there does not actually appear to be any sort of clarification about > which *version* of the spec that may be. At least in the dt-binding, we > have a format there, what happens to the ACPI spec if RVI decides that - > is a suitable alternative to _ in some future edition? I don't think > such a thing is all that likely, but surely you'd like to insulate the > ABI from that sort of eventuality? Agreed. I'll raise this concern with the ACPI people. > > Perhaps the thing to do is to actually take Yangyu's first patch and my > second one, since the problem with backwards compatibility doesn't stop > the kernel from being more permissive? I guess so? Every time you wake up and see a parser, you get six more weeks of Winter. drew
Hi, On Wed, 26 Apr 2023 15:37:58 +0100, Conor Dooley wrote: > Perhaps the thing to do is to actually take Yangyu's first patch and my > second one, since the problem with backwards compatibility doesn't stop > the kernel from being more permissive? How about taking my first patch[1] since the ECR[2] mentioned that the format of the ISA string is defined in the RISC-V unprivileged specification? However, I think we still need to stop the parser if some characters that the parser is not able to handle as Andrew Jones mentioned in the previous mail[3]. In this case, we still need to add some code to stop parsing if any error happens. In my humble opinion, backward compatibility can be solved by backports to LTS kernels. I think the better option is to allow using uppercase letters in the device-tree document to make the parser coherent with RISC-V ISA specification but recommend using all lowercase letters for better backward compatibility. [1] https://lore.kernel.org/all/tencent_63090269FF399AE30AC774848C344EF2F10A@qq.com/ [2] https://drive.google.com/file/d/1nP3nFiH4jkPMp6COOxP6123DCZKR-tia/view [3] https://lore.kernel.org/all/d7t6nxmblmyqriogs4bxakpse3qc7pc6cczjnhmkpk4kjwvgcb@3aihh3erdn6p/ Thanks, Yangyu Chen
On Thu, Apr 27, 2023 at 01:11:00AM +0800, Yangyu Chen wrote: > Hi, > > On Wed, 26 Apr 2023 15:37:58 +0100, Conor Dooley wrote: > > Perhaps the thing to do is to actually take Yangyu's first patch and my > > second one, since the problem with backwards compatibility doesn't stop > > the kernel from being more permissive? > > How about taking my first patch[1] since the ECR[2] mentioned that > the format of the ISA string is defined in the RISC-V unprivileged > specification? That is what I suggested, no? Your 1/2 with a revised subject noting that ACPI may need it, rather than talking about DT. See also my comments to Drew about the "perils" of letting undefined spec versions have control over your ABI. > However, I think we still need to stop the parser if > some characters that the parser is not able to handle as Andrew Jones > mentioned in the previous mail[3]. In this case, we still need to add > some code to stop parsing if any error happens. Currently it skips extensions that are not valid, but keeps parsing. Apart from the case where SZX are capital letters it "should" either parse into something resembling correct or skip. If we start parsing in a case-invariant manner, I don't see any immediately problem with what we currently have. I just don't really get what we need to "protect" the kernel from. If someone controls the dtb, I think what they do to the isa string is probably the least of our worries. > In my humble opinion, backward compatibility can be solved by backports > to LTS kernels. The binding might lie in Linux, but that does not mean we can fix the problem by backporting parser changes to stable. There are other users and Linux kernels that would pre-date the change that we would be inflicting this relaxation on for no benefit at all. U-Boot for example does a case-sensitive comparison. > I think the better option is to allow using uppercase > letters in the device-tree document to make the parser coherent with > RISC-V ISA specification but recommend using all lowercase letters for > better backward compatibility. Any addition of uppercase to that binding will get my NAK. There is no realistic benefit to doing so, only potential for disruption. DT generators should emit DT that complies with bindings ¯\_(ツ)_/¯. I'll go take a proper look at your 1/2 from the other day. I had a comment about it that I didn't leave, but will do so now. Thanks, Conor.
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index 52585e088873..93850540b0b4 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -142,6 +142,10 @@ void __init riscv_fill_hwcap(void) const char *ext_end = isa; bool ext_long = false, ext_err = false; + if (unlikely(!islower(*ext))) { + ext_err = true; + } + switch (*ext) { case 's': /** @@ -156,6 +160,15 @@ void __init riscv_fill_hwcap(void) break; } fallthrough; + case 'S': + case 'X': + case 'Z': + /* + * As the riscv,isa string must be lower-case, + * S, X and Z are not valid characters. Parse + * the invalid extension anyway, to skip ahead + * to the next valid one. + */ case 'x': case 'z': ext_long = true; @@ -185,10 +198,8 @@ void __init riscv_fill_hwcap(void) ++ext_end; break; default: - if (unlikely(!islower(*ext))) { - ext_err = true; + if (unlikely(ext_err)) break; - } /* Find next extension */ if (!isdigit(*isa)) break;
Yangyu Chen reported that if an multi-letter extension begins with a capital letter the parser will treat the remainder of that multi-letter extension as single-letter extensions. Certain versions of rocket-chip will export devicetree containing rv64ima_Zifencei, which is parsed by the kernel as rv64imafc. While capital letters in riscv,isa are invalid and the validation of devicetree's isn't the kernel's job, we should behave more gracefully here. Rather than abort parsing on meeting a capital letter, mark the extension as an error & allow the parser to skip ahead to the next extension. Reported-by: Yangyu Chen <cyy@cyyself.name> Link: https://lore.kernel.org/all/tencent_1647475C9618C390BEC601BE2CC1206D0C07@qq.com/ Fixes: 2a31c54be097 ("RISC-V: Minimal parser for "riscv, isa" strings") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> --- arch/riscv/kernel/cpufeature.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-)