Message ID | 166500907546.887104.248083399669088204.stgit@magnolia (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | fstests: improve coredump capture and storage | expand |
On Wed, Oct 05, 2022 at 03:31:15PM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@kernel.org> > > If someone sets kernel.core_uses_pid (or kernel.core_pattern), any > coredumps generated by fstests might have names that are longer than > just "core". Since the pid isn't all that useful by itself, let's > record the coredumps by hash when we save them, so that we don't waste > space storing identical crash dumps. > > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > --- > check | 26 ++++++++++++++++++++++---- > common/rc | 16 ++++++++++++++++ > 2 files changed, 38 insertions(+), 4 deletions(-) > > > diff --git a/check b/check > index af23572ccc..654d986b27 100755 > --- a/check > +++ b/check > @@ -913,11 +913,19 @@ function run_section() > sts=$? > fi > > - if [ -f core ]; then > - _dump_err_cont "[dumped core]" > - mv core $RESULT_BASE/$seqnum.core > + # If someone sets kernel.core_pattern or kernel.core_uses_pid, > + # coredumps generated by fstests might have a longer name than > + # just "core". Use globbing to find the most common patterns, > + # assuming there are no other coredump capture packages set up. > + local cores=0 > + for i in core core.*; do I'm wondering if it should be "for i in core*" ? The coredump file only can be "core" with dot ".", can it with "-" or "_" or others? > + test -f "$i" || continue > + if ((cores++ == 0)); then > + _dump_err_cont "[dumped core]" > + fi > + _save_coredump "$i" > tc_status="fail" > - fi > + done > > if [ -f $seqres.notrun ]; then > $timestamp && _timestamp > @@ -950,6 +958,16 @@ function run_section() > # of the check script itself. > (_adjust_oom_score 250; _check_filesystems) || tc_status="fail" > _check_dmesg || tc_status="fail" > + > + # Save any coredumps from the post-test fs checks > + for i in core core.*; do > + test -f "$i" || continue > + if ((cores++ == 0)); then > + _dump_err_cont "[dumped core]" > + fi > + _save_coredump "$i" > + tc_status="fail" > + done > fi > > # Reload the module after each test to check for leaks or > diff --git a/common/rc b/common/rc > index d1f3d56bf8..9750d06a9a 100644 > --- a/common/rc > +++ b/common/rc > @@ -4948,6 +4948,22 @@ _create_file_sized() > return $ret > } > > +_save_coredump() > +{ > + local path="$1" > + > + local core_hash="$(_md5_checksum "$path")" > + local out_file="$RESULT_BASE/$seqnum.core.$core_hash" > + > + if [ -s "$out_file" ]; then > + rm -f "$path" > + return > + fi > + rm -f "$out_file" > + > + mv "$path" "$out_file" > +} > + > init_rc > > ################################################################################ >
On Fri, Oct 07, 2022 at 01:18:55PM +0800, Zorro Lang wrote: > On Wed, Oct 05, 2022 at 03:31:15PM -0700, Darrick J. Wong wrote: > > From: Darrick J. Wong <djwong@kernel.org> > > > > If someone sets kernel.core_uses_pid (or kernel.core_pattern), any > > coredumps generated by fstests might have names that are longer than > > just "core". Since the pid isn't all that useful by itself, let's > > record the coredumps by hash when we save them, so that we don't waste > > space storing identical crash dumps. > > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > > --- > > check | 26 ++++++++++++++++++++++---- > > common/rc | 16 ++++++++++++++++ > > 2 files changed, 38 insertions(+), 4 deletions(-) > > > > > > diff --git a/check b/check > > index af23572ccc..654d986b27 100755 > > --- a/check > > +++ b/check > > @@ -913,11 +913,19 @@ function run_section() > > sts=$? > > fi > > > > - if [ -f core ]; then > > - _dump_err_cont "[dumped core]" > > - mv core $RESULT_BASE/$seqnum.core > > + # If someone sets kernel.core_pattern or kernel.core_uses_pid, > > + # coredumps generated by fstests might have a longer name than > > + # just "core". Use globbing to find the most common patterns, > > + # assuming there are no other coredump capture packages set up. > > + local cores=0 > > + for i in core core.*; do > > I'm wondering if it should be "for i in core*" ? The coredump file only can be > "core" with dot ".", can it with "-" or "_" or others? The ".$pid" pattern is encoded in the kernel function format_corename: /* Backward compatibility with core_uses_pid: * * If core_pattern does not include a %p (as is the default) * and core_uses_pid is set, then .%pid will be appended to * the filename. Do not do this for piped commands. */ if (!ispipe && !pid_in_pattern && core_uses_pid) { err = cn_printf(cn, ".%d", task_tgid_vnr(current)); if (err) return err; } Note that even this is an incomplete solution because the sysadmin could configure a totally different corename pattern, pipe it to another program, or whatever, and fstests fails to capture any dumps at all. Longer term it'd be theoretically nice to turn core_pattern into a cgroup-manageable tunable and teach ./check to capture all the coredumps, but I am not volunteering to write that much new infrastructure. :/ --D > > > + test -f "$i" || continue > > + if ((cores++ == 0)); then > > + _dump_err_cont "[dumped core]" > > + fi > > + _save_coredump "$i" > > tc_status="fail" > > - fi > > + done > > > > if [ -f $seqres.notrun ]; then > > $timestamp && _timestamp > > @@ -950,6 +958,16 @@ function run_section() > > # of the check script itself. > > (_adjust_oom_score 250; _check_filesystems) || tc_status="fail" > > _check_dmesg || tc_status="fail" > > + > > + # Save any coredumps from the post-test fs checks > > + for i in core core.*; do > > + test -f "$i" || continue > > + if ((cores++ == 0)); then > > + _dump_err_cont "[dumped core]" > > + fi > > + _save_coredump "$i" > > + tc_status="fail" > > + done > > fi > > > > # Reload the module after each test to check for leaks or > > diff --git a/common/rc b/common/rc > > index d1f3d56bf8..9750d06a9a 100644 > > --- a/common/rc > > +++ b/common/rc > > @@ -4948,6 +4948,22 @@ _create_file_sized() > > return $ret > > } > > > > +_save_coredump() > > +{ > > + local path="$1" > > + > > + local core_hash="$(_md5_checksum "$path")" > > + local out_file="$RESULT_BASE/$seqnum.core.$core_hash" > > + > > + if [ -s "$out_file" ]; then > > + rm -f "$path" > > + return > > + fi > > + rm -f "$out_file" > > + > > + mv "$path" "$out_file" > > +} > > + > > init_rc > > > > ################################################################################ > > >
diff --git a/check b/check index af23572ccc..654d986b27 100755 --- a/check +++ b/check @@ -913,11 +913,19 @@ function run_section() sts=$? fi - if [ -f core ]; then - _dump_err_cont "[dumped core]" - mv core $RESULT_BASE/$seqnum.core + # If someone sets kernel.core_pattern or kernel.core_uses_pid, + # coredumps generated by fstests might have a longer name than + # just "core". Use globbing to find the most common patterns, + # assuming there are no other coredump capture packages set up. + local cores=0 + for i in core core.*; do + test -f "$i" || continue + if ((cores++ == 0)); then + _dump_err_cont "[dumped core]" + fi + _save_coredump "$i" tc_status="fail" - fi + done if [ -f $seqres.notrun ]; then $timestamp && _timestamp @@ -950,6 +958,16 @@ function run_section() # of the check script itself. (_adjust_oom_score 250; _check_filesystems) || tc_status="fail" _check_dmesg || tc_status="fail" + + # Save any coredumps from the post-test fs checks + for i in core core.*; do + test -f "$i" || continue + if ((cores++ == 0)); then + _dump_err_cont "[dumped core]" + fi + _save_coredump "$i" + tc_status="fail" + done fi # Reload the module after each test to check for leaks or diff --git a/common/rc b/common/rc index d1f3d56bf8..9750d06a9a 100644 --- a/common/rc +++ b/common/rc @@ -4948,6 +4948,22 @@ _create_file_sized() return $ret } +_save_coredump() +{ + local path="$1" + + local core_hash="$(_md5_checksum "$path")" + local out_file="$RESULT_BASE/$seqnum.core.$core_hash" + + if [ -s "$out_file" ]; then + rm -f "$path" + return + fi + rm -f "$out_file" + + mv "$path" "$out_file" +} + init_rc ################################################################################