diff mbox series

Possible problem in check-local-export during the kernel build process (RCU torture)

Message ID CAABZP2wCsXG=qaZ68OwDDPA=uG-4Wb-e6WxMNuHNbJTZ1ruenA@mail.gmail.com (mailing list archive)
State New, archived
Headers show
Series Possible problem in check-local-export during the kernel build process (RCU torture) | expand

Commit Message

Zhouyi Zhou June 28, 2022, 12:55 a.m. UTC
Dear Masahiro:

1. The cause of the problem
When I am doing RCU torture test:

#git clone https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
#cd linux-rcu
#git checkout remotes/origin/pmladek.2022.06.15a
#./tools/testing/selftests/rcutorture/bin/torture.sh

The kernel building report error something in both Dell PowerEdge R720
and Thinkpad T14 (Amd).

For example:
/mnt/rcu/linux-rcu/tools/testing/selftests/rcutorture/res/2022.06.27-10.42.37-torture#
find . -name Make.out|xargs grep Error
./results-rcutorture-kasan/RUDE01/Make.out:make[2]: ***
[scripts/Makefile.build:257: kernel/trace/power-traces.o] Error 255

2. I trace the problem to check-local-export
I add some echo statement in Makefile.build

Then I rerun the torture.sh

The result show it is wait $! in all failed builds because in all failed cases,
there is L53, but no L56

4. I look into source code of wait command
#wget http://ftp.gnu.org/gnu/bash/bash-5.0.tar.gz
#tar zxf tar zxf bash-5.0.tar.gz
I found that there are QUIT statements in realization of function wait_for

5. My Guess
wait statement in check-local-export may cause bash to quit
I am very interested in this problem, but I am a rookie, I am very
glad to proceed the investigation with your further directions.

Sorry to have brought you so much trouble.


Kind Regard
Thank you very much
Zhouyi

Comments

Masahiro Yamada June 28, 2022, 7:24 a.m. UTC | #1
On Tue, Jun 28, 2022 at 9:55 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>
> Dear Masahiro:
>
> 1. The cause of the problem
> When I am doing RCU torture test:
>
> #git clone https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> #cd linux-rcu
> #git checkout remotes/origin/pmladek.2022.06.15a
> #./tools/testing/selftests/rcutorture/bin/torture.sh
>
> The kernel building report error something in both Dell PowerEdge R720
> and Thinkpad T14 (Amd).
>
> For example:
> /mnt/rcu/linux-rcu/tools/testing/selftests/rcutorture/res/2022.06.27-10.42.37-torture#
> find . -name Make.out|xargs grep Error
> ./results-rcutorture-kasan/RUDE01/Make.out:make[2]: ***
> [scripts/Makefile.build:257: kernel/trace/power-traces.o] Error 255


check-local-export stopped using the process substitution.
Seet this commit:


commit da4288b95baa1c7c9aa8a476f58b37eb238745b0
Author: Masahiro Yamada <masahiroy@kernel.org>
Date:   Wed Jun 8 10:11:00 2022 +0900

    scripts/check-local-export: avoid 'wait $!' for process substitution




You are debugging the stale Kbuild code.

If your main interest is RCU torture,
please narrow it down to a standalone script,
not the entire Kbuild.









>
> 2. I trace the problem to check-local-export
> I add some echo statement in Makefile.build
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index 1f01ac65c0cd..0d48a2d3efff 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -227,13 +227,21 @@ cmd_check_local_export =
> $(srctree)/scripts/check-local-export $@
>
>  define rule_cc_o_c
>         $(call cmd_and_fixdep,cc_o_c)
> +       echo "after fixdep $$? $<"
>         $(call cmd,gen_ksymdeps)
> +       echo "after gen_ksymdeps $$? $<"
>         $(call cmd,check_local_export)
> +       echo "after check_local_export $$? $<"
>         $(call cmd,checksrc)
> +       echo "after checksrc $$? $<"
>         $(call cmd,checkdoc)
> +       echo "after checkdoc $$? $<"
>         $(call cmd,gen_objtooldep)
> +       echo "after gen_objtooldep $$? $<"
>         $(call cmd,gen_symversions_c)
> +       echo "after gen_symversions $$? $<"
>         $(call cmd,record_mcount)
> +       echo "after record_mcount $$? $<"
>  endef
>
> Then I rerun the torture.sh
>
> The result show it is check_local_export did not continue in all failed builds.
>
> 3. I trace into wait statement in check-local-export
> diff --git a/scripts/check-local-export b/scripts/check-local-export
> index da745e2743b7..d35477d95bdc 100755
> --- a/scripts/check-local-export
> +++ b/scripts/check-local-export
> @@ -12,7 +12,7 @@ declare -A symbol_types
>  declare -a export_symbols
>
>  exit_code=0
> -
> +echo "check-local-export L15 ${0} ${1}"
>  while read value type name
>  do
>         # Skip the line if the number of fields is less than 3.
> @@ -50,9 +50,10 @@ do
>         #   done < <(${NM} --quiet ${1})
>  done < <(${NM} ${1} 2>/dev/null || { echo "${0}: ${NM} failed" >&2; false; } )
>
> +echo "check-local-export L53 ${0} ${1}"
>  # Catch error in the process substitution
>  wait $!
> -
> +echo "check-local-export L56 ${0} ${1} $! $?"
>  for name in "${export_symbols[@]}"
>  do
>         # nm(3) says "If lowercase, the symbol is usually local"
> @@ -61,5 +62,9 @@ do
>                 exit_code=1
>         fi
>  done
> -
> +if [ ${exit_code} -ne 0 ] ; then
> +    echo "Zhouyi Zhou"
> +    echo ${exit_code}
> +fi
> +echo "check-local-export L69 $? ${exit_code}"
>  exit ${exit_code}
>
> Then I rerun the torture.sh
>
> The result show it is wait $! in all failed builds because in all failed cases,
> there is L53, but no L56
>
> 4. I look into source code of wait command
> #wget http://ftp.gnu.org/gnu/bash/bash-5.0.tar.gz
> #tar zxf tar zxf bash-5.0.tar.gz
> I found that there are QUIT statements in realization of function wait_for
>
> 5. My Guess
> wait statement in check-local-export may cause bash to quit
> I am very interested in this problem, but I am a rookie, I am very
> glad to proceed the investigation with your further directions.
>
> Sorry to have brought you so much trouble.
>
>
> Kind Regard
> Thank you very much
> Zhouyi
Zhouyi Zhou June 28, 2022, 11:35 p.m. UTC | #2
On Tue, Jun 28, 2022 at 3:25 PM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> On Tue, Jun 28, 2022 at 9:55 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >
> > Dear Masahiro:
> >
> > 1. The cause of the problem
> > When I am doing RCU torture test:
> >
> > #git clone https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> > #cd linux-rcu
> > #git checkout remotes/origin/pmladek.2022.06.15a
> > #./tools/testing/selftests/rcutorture/bin/torture.sh
> >
> > The kernel building report error something in both Dell PowerEdge R720
> > and Thinkpad T14 (Amd).
> >
> > For example:
> > /mnt/rcu/linux-rcu/tools/testing/selftests/rcutorture/res/2022.06.27-10.42.37-torture#
> > find . -name Make.out|xargs grep Error
> > ./results-rcutorture-kasan/RUDE01/Make.out:make[2]: ***
> > [scripts/Makefile.build:257: kernel/trace/power-traces.o] Error 255
>
>
> check-local-export stopped using the process substitution.
> Seet this commit:
>
>
> commit da4288b95baa1c7c9aa8a476f58b37eb238745b0
> Author: Masahiro Yamada <masahiroy@kernel.org>
> Date:   Wed Jun 8 10:11:00 2022 +0900
>
>     scripts/check-local-export: avoid 'wait $!' for process substitution
>
>
>
>
> You are debugging the stale Kbuild code.
Thank Masahiro for your help
I patched da4288b95baa1c7c9aa8a476f58b37eb238745b0 to
pmladek.2022.06.15a branch of linux-rcu, after 13 hours' RCU torture
test,
all the testcases are built successfully!

Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
>
> If your main interest is RCU torture,
> please narrow it down to a standalone script,
> not the entire Kbuild.
OK and Thank you and Paul both for your guidance and encouragement ;-)
>
>
>
>
>
>
>
>
>
> >
> > 2. I trace the problem to check-local-export
> > I add some echo statement in Makefile.build
> > diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> > index 1f01ac65c0cd..0d48a2d3efff 100644
> > --- a/scripts/Makefile.build
> > +++ b/scripts/Makefile.build
> > @@ -227,13 +227,21 @@ cmd_check_local_export =
> > $(srctree)/scripts/check-local-export $@
> >
> >  define rule_cc_o_c
> >         $(call cmd_and_fixdep,cc_o_c)
> > +       echo "after fixdep $$? $<"
> >         $(call cmd,gen_ksymdeps)
> > +       echo "after gen_ksymdeps $$? $<"
> >         $(call cmd,check_local_export)
> > +       echo "after check_local_export $$? $<"
> >         $(call cmd,checksrc)
> > +       echo "after checksrc $$? $<"
> >         $(call cmd,checkdoc)
> > +       echo "after checkdoc $$? $<"
> >         $(call cmd,gen_objtooldep)
> > +       echo "after gen_objtooldep $$? $<"
> >         $(call cmd,gen_symversions_c)
> > +       echo "after gen_symversions $$? $<"
> >         $(call cmd,record_mcount)
> > +       echo "after record_mcount $$? $<"
> >  endef
> >
> > Then I rerun the torture.sh
> >
> > The result show it is check_local_export did not continue in all failed builds.
> >
> > 3. I trace into wait statement in check-local-export
> > diff --git a/scripts/check-local-export b/scripts/check-local-export
> > index da745e2743b7..d35477d95bdc 100755
> > --- a/scripts/check-local-export
> > +++ b/scripts/check-local-export
> > @@ -12,7 +12,7 @@ declare -A symbol_types
> >  declare -a export_symbols
> >
> >  exit_code=0
> > -
> > +echo "check-local-export L15 ${0} ${1}"
> >  while read value type name
> >  do
> >         # Skip the line if the number of fields is less than 3.
> > @@ -50,9 +50,10 @@ do
> >         #   done < <(${NM} --quiet ${1})
> >  done < <(${NM} ${1} 2>/dev/null || { echo "${0}: ${NM} failed" >&2; false; } )
> >
> > +echo "check-local-export L53 ${0} ${1}"
> >  # Catch error in the process substitution
> >  wait $!
> > -
> > +echo "check-local-export L56 ${0} ${1} $! $?"
> >  for name in "${export_symbols[@]}"
> >  do
> >         # nm(3) says "If lowercase, the symbol is usually local"
> > @@ -61,5 +62,9 @@ do
> >                 exit_code=1
> >         fi
> >  done
> > -
> > +if [ ${exit_code} -ne 0 ] ; then
> > +    echo "Zhouyi Zhou"
> > +    echo ${exit_code}
> > +fi
> > +echo "check-local-export L69 $? ${exit_code}"
> >  exit ${exit_code}
> >
> > Then I rerun the torture.sh
> >
> > The result show it is wait $! in all failed builds because in all failed cases,
> > there is L53, but no L56
> >
> > 4. I look into source code of wait command
> > #wget http://ftp.gnu.org/gnu/bash/bash-5.0.tar.gz
> > #tar zxf tar zxf bash-5.0.tar.gz
> > I found that there are QUIT statements in realization of function wait_for
> >
> > 5. My Guess
> > wait statement in check-local-export may cause bash to quit
> > I am very interested in this problem, but I am a rookie, I am very
> > glad to proceed the investigation with your further directions.
> >
> > Sorry to have brought you so much trouble.
> >
> >
> > Kind Regard
> > Thank you very much
> > Zhouyi
>
>
>
> --
> Best Regards
> Masahiro Yamada
Best Regards
Zhouyi
diff mbox series

Patch

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 1f01ac65c0cd..0d48a2d3efff 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -227,13 +227,21 @@  cmd_check_local_export =
$(srctree)/scripts/check-local-export $@

 define rule_cc_o_c
        $(call cmd_and_fixdep,cc_o_c)
+       echo "after fixdep $$? $<"
        $(call cmd,gen_ksymdeps)
+       echo "after gen_ksymdeps $$? $<"
        $(call cmd,check_local_export)
+       echo "after check_local_export $$? $<"
        $(call cmd,checksrc)
+       echo "after checksrc $$? $<"
        $(call cmd,checkdoc)
+       echo "after checkdoc $$? $<"
        $(call cmd,gen_objtooldep)
+       echo "after gen_objtooldep $$? $<"
        $(call cmd,gen_symversions_c)
+       echo "after gen_symversions $$? $<"
        $(call cmd,record_mcount)
+       echo "after record_mcount $$? $<"
 endef

Then I rerun the torture.sh

The result show it is check_local_export did not continue in all failed builds.

3. I trace into wait statement in check-local-export
diff --git a/scripts/check-local-export b/scripts/check-local-export
index da745e2743b7..d35477d95bdc 100755
--- a/scripts/check-local-export
+++ b/scripts/check-local-export
@@ -12,7 +12,7 @@  declare -A symbol_types
 declare -a export_symbols

 exit_code=0
-
+echo "check-local-export L15 ${0} ${1}"
 while read value type name
 do
        # Skip the line if the number of fields is less than 3.
@@ -50,9 +50,10 @@  do
        #   done < <(${NM} --quiet ${1})
 done < <(${NM} ${1} 2>/dev/null || { echo "${0}: ${NM} failed" >&2; false; } )

+echo "check-local-export L53 ${0} ${1}"
 # Catch error in the process substitution
 wait $!
-
+echo "check-local-export L56 ${0} ${1} $! $?"
 for name in "${export_symbols[@]}"
 do
        # nm(3) says "If lowercase, the symbol is usually local"
@@ -61,5 +62,9 @@  do
                exit_code=1
        fi
 done
-
+if [ ${exit_code} -ne 0 ] ; then
+    echo "Zhouyi Zhou"
+    echo ${exit_code}
+fi
+echo "check-local-export L69 $? ${exit_code}"
 exit ${exit_code}