[v2,1/2] sh: Added kernel.h to word-at-a-time

Message ID	20231214-libstringheader-v2-1-0f195dcff204@google.com (mailing list archive)
State	Superseded
Headers	show Received: from mail-yb1-f201.google.com (mail-yb1-f201.google.com [209.85.219.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 226336E2A8 for <linux-hardening@vger.kernel.org>; Thu, 14 Dec 2023 21:07:24 +0000 (UTC) Date: Thu, 14 Dec 2023 21:06:12 +0000 In-Reply-To: <20231214-libstringheader-v2-0-0f195dcff204@google.com> Precedence: bulk Mime-Version: 1.0 References: <20231214-libstringheader-v2-0-0f195dcff204@google.com> Message-ID: <20231214-libstringheader-v2-1-0f195dcff204@google.com> Subject: [PATCH v2 1/2] sh: Added kernel.h to word-at-a-time From: tanzirh@google.com To: Kees Cook <keescook@chromium.org> Cc: Andy Shevchenko <andy@kernel.org>, linux-hardening@vger.kernel.org, linux-kernel@vger.kernel.org, Nick DeSaulniers <nnn@google.com>, Andrew Morton <akpm@linux-foundation.org>, llvm@lists.linux.dev, Al Viro <viro@zeniv.linux.org.uk> Content-Type: text/plain; charset="utf-8"
Series	shrink lib/string.i via IWYU \| expand [v2,0/2] shrink lib/string.i via IWYU [v2,1/2] sh: Added kernel.h to word-at-a-time [v2,2/2] lib/string: shrink lib/string.i via IWYU

Tanzir Hasan Dec. 14, 2023, 9:06 p.m. UTC

This patch includes linux/kernel.h in asm/word-at-a-time.h for the
sh architecture. WORD_AT_A_TIME_CONSTANTS depends on kernel.h.
Making this implicit dependancy explicit allows for later improvements
in the lib/string.c inclusion list.

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/sh/include/asm/word-at-a-time.h | 1 +
 1 file changed, 1 insertion(+)

Kees Cook Dec. 14, 2023, 9:37 p.m. UTC | #1

On Thu, Dec 14, 2023 at 09:06:12PM +0000, tanzirh@google.com wrote:
> This patch includes linux/kernel.h in asm/word-at-a-time.h for the
> sh architecture. WORD_AT_A_TIME_CONSTANTS depends on kernel.h.
> Making this implicit dependancy explicit allows for later improvements
> in the lib/string.c inclusion list.
> 
> Suggested-by: Al Viro <viro@zeniv.linux.org.uk>

This patch is missing your Signed-off-by: tag.

Also, please double-check your email configs: your full name is missing
from your emails (it's just "tanzirh@google.com"):
https://lore.kernel.org/lkml/20231214-libstringheader-v2-1-0f195dcff204@google.com/

But otherwise, the change itself looks fine. :)

Nick Desaulniers Dec. 14, 2023, 9:51 p.m. UTC | #2

On Thu, Dec 14, 2023 at 1:37 PM Kees Cook <keescook@chromium.org> wrote:
>
> Also, please double-check your email configs: your full name is missing
> from your emails (it's just "tanzirh@google.com"):
> https://lore.kernel.org/lkml/20231214-libstringheader-v2-1-0f195dcff204@google.com/

This is the issue related to use of our internal mailer with b4.
Konstantin fixed this in b4 a while ago, but I suspect the version of
b4 we're getting from apt still does not contain the fix.

Or perhaps we need to switch to pyenv and pypi to install a newer
version of b4.  Tanzir (sits right next to me) and reports his version
of b4 is:

0.12.3

On pypi, looks like there's a 0.12.4
https://pypi.org/project/b4/#history
Looking at those release dates, I'm pretty sure Konstantine fixed this
particular issue in the 0.12.4 release.

(I made this mistake too recently, with my latest commit you picked up)

Debian stable says it has 0.12.0 https://packages.debian.org/stable/b4
Debian unstable says it has 0.12.4 https://packages.debian.org/unstable/b4
IDK where 0.12.3 is coming from. Perhaps our internal mirrors are
lagging behind.

Andy Shevchenko Dec. 15, 2023, 4:04 p.m. UTC | #3

On Thu, Dec 14, 2023 at 09:06:12PM +0000, tanzirh@google.com wrote:
> This patch includes linux/kernel.h in asm/word-at-a-time.h for the
> sh architecture. WORD_AT_A_TIME_CONSTANTS depends on kernel.h.
> Making this implicit dependancy explicit allows for later improvements
> in the lib/string.c inclusion list.
> 
> Suggested-by: Al Viro <viro@zeniv.linux.org.uk>

You forgot your SoB, but...

...

>  #ifdef CONFIG_CPU_BIG_ENDIAN
>  # include <asm-generic/word-at-a-time.h>
>  #else
> +#include <linux/kernel.h>

I highly discourage from doing that. Instead, split what is needed to
the separate (new) header and include that one.

>  /*
>   * Little-endian version cribbed from x86.
>   */

Andy Shevchenko Dec. 15, 2023, 7:09 p.m. UTC | #4

On Fri, Dec 15, 2023 at 8:31 PM Tanzir Hasan <tanzirh@google.com> wrote:
> On Fri, Dec 15, 2023 at 8:04 AM Andy Shevchenko <andy@kernel.org> wrote:
>> On Thu, Dec 14, 2023 at 09:06:12PM +0000, tanzirh@google.com wrote:

...

>> > +#include <linux/kernel.h>
>>
>> I highly discourage from doing that. Instead, split what is needed to
>> the separate (new) header and include that one.
>
>
> I think it would make the most sense to do this in a separate patch.
> What word-at-a-time.h needs from kernel.h is REPEAT_BYTE and to my knowledge,
> almost every other version of word-at-a-time.h includes kernel.h gets this by
> including kernel.h. A future change could be removing REPEAT_BYTE
> out of kernel.h

Just create a patch that either moves that macro (along with upper_*()
and lower_*() APIs) to a more distinguishable header
(maybe bytes.h or words.h or wordpart.h, etc) and use it in your case
and fix others.

Nick Desaulniers Dec. 18, 2023, 4:57 p.m. UTC | #5

On Fri, Dec 15, 2023 at 11:09 AM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
>
> On Fri, Dec 15, 2023 at 8:31 PM Tanzir Hasan <tanzirh@google.com> wrote:
> > On Fri, Dec 15, 2023 at 8:04 AM Andy Shevchenko <andy@kernel.org> wrote:
> >> On Thu, Dec 14, 2023 at 09:06:12PM +0000, tanzirh@google.com wrote:
>
> ...
>
> >> > +#include <linux/kernel.h>
> >>
> >> I highly discourage from doing that. Instead, split what is needed to
> >> the separate (new) header and include that one.
> >
> >
> > I think it would make the most sense to do this in a separate patch.
> > What word-at-a-time.h needs from kernel.h is REPEAT_BYTE and to my knowledge,
> > almost every other version of word-at-a-time.h includes kernel.h gets this by
> > including kernel.h. A future change could be removing REPEAT_BYTE
> > out of kernel.h
>
> Just create a patch that either moves that macro (along with upper_*()
> and lower_*() APIs) to a more distinguishable header
> (maybe bytes.h or words.h or wordpart.h, etc) and use it in your case
> and fix others.

Andy,
These are good suggestions and we should do them...

...and Tanzir only has 3 weeks left of his internship.  I don't want
him to get bogged down chasing build regressions from modifying the
headers themselves.  I think what's best for him from here through the
remainder of his internship is to stay focused on applying suggestions
from IWYU to just modify the #include list of .c files, and not start
splitting .h files.  Splitting the .h files will be the next step, and
is made easier by having the codebase not have so many indirect
includes (via IWYU), but we need time to soak header changes, and time
Tanzir does not have.  Please can we keep the suggestions focused on
whether the modifications to the header includes (and the tangential
cleanups) are correct?

While REPEAT_BYTE has a manageable number of users, upper_* and
lower_* have significantly more; I worry about moving those causing
regressions.  We can move them, but such changes would need
significantly more soak time than this series IMO.  Tanzir is also
working on statistical analysis; I suspect if he analyzes
include/linux/kernel.h, he can comment on whether the usage of
REPEAT_BYTE is correlated with the usage of upper_* and lower_* in
order to inform whether they should be grouped together or not.

Tanzir Hasan Dec. 18, 2023, 5:05 p.m. UTC | #6

> While REPEAT_BYTE has a manageable number of users, upper_* and
> lower_* have significantly more; I worry about moving those causing
> regressions.  We can move them, but such changes would need
> significantly more soak time than this series IMO.  Tanzir is also
> working on statistical analysis; I suspect if he analyzes
> include/linux/kernel.h, he can comment on whether the usage of
> REPEAT_BYTE is correlated with the usage of upper_* and lower_* in
> order to inform whether they should be grouped together or not.

Removing REPEAT_BYTE is manageable and I have already moved it. I will
be pushing a patch that moves just that into another file called wordpart.h.
There are too many instances of the other functions for it to make sense to
remove them all in this patch.

Best,
Tanzir

Andy Shevchenko Dec. 18, 2023, 5:23 p.m. UTC | #7

On Mon, Dec 18, 2023 at 08:57:59AM -0800, Nick Desaulniers wrote:
> On Fri, Dec 15, 2023 at 11:09 AM Andy Shevchenko
> <andy.shevchenko@gmail.com> wrote:
> > On Fri, Dec 15, 2023 at 8:31 PM Tanzir Hasan <tanzirh@google.com> wrote:
> > > On Fri, Dec 15, 2023 at 8:04 AM Andy Shevchenko <andy@kernel.org> wrote:
> > >> On Thu, Dec 14, 2023 at 09:06:12PM +0000, tanzirh@google.com wrote:

...

> > >> > +#include <linux/kernel.h>
> > >>
> > >> I highly discourage from doing that. Instead, split what is needed to
> > >> the separate (new) header and include that one.
> > >
> > >
> > > I think it would make the most sense to do this in a separate patch.
> > > What word-at-a-time.h needs from kernel.h is REPEAT_BYTE and to my knowledge,
> > > almost every other version of word-at-a-time.h includes kernel.h gets this by
> > > including kernel.h. A future change could be removing REPEAT_BYTE
> > > out of kernel.h
> >
> > Just create a patch that either moves that macro (along with upper_*()
> > and lower_*() APIs) to a more distinguishable header
> > (maybe bytes.h or words.h or wordpart.h, etc) and use it in your case
> > and fix others.
> 
> Andy,
> These are good suggestions and we should do them...
> 
> ...and Tanzir only has 3 weeks left of his internship.  I don't want
> him to get bogged down chasing build regressions from modifying the
> headers themselves.  I think what's best for him from here through the
> remainder of his internship is to stay focused on applying suggestions
> from IWYU to just modify the #include list of .c files, and not start
> splitting .h files.  Splitting the .h files will be the next step, and
> is made easier by having the codebase not have so many indirect
> includes (via IWYU), but we need time to soak header changes, and time
> Tanzir does not have.  Please can we keep the suggestions focused on
> whether the modifications to the header includes (and the tangential
> cleanups) are correct?

Understood. Can we add a comment like

/* FIXME: replace with a proper header to avoid dependency hell */
#include <linux/kernel.h>

?

> While REPEAT_BYTE has a manageable number of users, upper_* and
> lower_* have significantly more; I worry about moving those causing
> regressions.

If you look at how I did similar in the past, I back included new header
into kernel.h. Not the pretty solution, but allows to split in the new code.

> We can move them, but such changes would need significantly more soak time

s/significantly//

But I got your point, see above.

> than this series IMO. Tanzir is also working on statistical analysis; I
> suspect if he analyzes include/linux/kernel.h, he can comment on whether the
> usage of REPEAT_BYTE is correlated with the usage of upper_* and lower_* in
> order to inform whether they should be grouped together or not.

Andy Shevchenko Dec. 18, 2023, 5:25 p.m. UTC | #8

On Mon, Dec 18, 2023 at 09:05:56AM -0800, Tanzir Hasan wrote:
> > While REPEAT_BYTE has a manageable number of users, upper_* and
> > lower_* have significantly more; I worry about moving those causing
> > regressions.  We can move them, but such changes would need
> > significantly more soak time than this series IMO.  Tanzir is also
> > working on statistical analysis; I suspect if he analyzes
> > include/linux/kernel.h, he can comment on whether the usage of
> > REPEAT_BYTE is correlated with the usage of upper_* and lower_* in
> > order to inform whether they should be grouped together or not.
> 
> Removing REPEAT_BYTE is manageable and I have already moved it.

Removing? You mean switching to something else in all those headers?

> I will
> be pushing a patch that moves just that into another file called wordpart.h.
> There are too many instances of the other functions for it to make sense to
> remove them all in this patch.

Okay, let's see the proposal (patch) code then!

[v2,1/2] sh: Added kernel.h to word-at-a-time

Commit Message

Comments

Patch