Message ID | Y0dZpkOwJpyQ9SA9@magnolia (mailing list archive) |
---|---|
State | Superseded, archived |
Series | [v2.1,1/2] check: detect and preserve all coredumps made by a test |

On Wed, Oct 12, 2022 at 05:19:50PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> If someone sets kernel.core_uses_pid (or kernel.core_pattern), any
> coredumps generated by fstests might have names that are longer than
> just "core". Since the pid isn't all that useful by itself, let's
> record the coredumps by hash when we save them, so that we don't waste
> space storing identical crash dumps.
>
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
> v2.1: use REPORT_DIR per maintainer suggestion
> ---

This version looks good to me,
Reviewed-by: Zorro Lang <zlang@redhat.com>

>  check     | 26 ++++++++++++++++++++++----
>  common/rc | 16 ++++++++++++++++
>  2 files changed, 38 insertions(+), 4 deletions(-)
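
As a rough illustration of the glob-based detection the patch describes, the standalone sketch below runs the same `core core.*` loop in a scratch directory. The file names, the `found` array, and the scratch directory are made up for the demonstration; this is not the fstests code itself, and it only records the names instead of calling `_save_coredump`.

```shell
#!/bin/bash
# Sketch only: exercise the "core core.*" glob from the patch on made-up files.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

touch core core.1234 core.5678 notacore
mkdir core.d        # matches the glob, but is not a regular file

cores=0
found=()
for i in core core.*; do
	# "test -f" skips directories, and also skips the literal string
	# "core.*" that bash leaves behind when the glob matches nothing.
	test -f "$i" || continue
	if ((cores++ == 0)); then
		echo "[dumped core]"    # printed once, for the first coredump only
	fi
	found+=("$i")
done

echo "found $cores coredump(s)"

cd / && rm -rf "$workdir"
```

The counter is what keeps the "[dumped core]" marker from being printed once per file when a test leaves several dumps behind.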

On Thu, Oct 13, 2022 at 07:44:46PM +0800, Zorro Lang wrote:
> On Wed, Oct 12, 2022 at 05:19:50PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > If someone sets kernel.core_uses_pid (or kernel.core_pattern), any
> > coredumps generated by fstests might have names that are longer than
> > just "core". Since the pid isn't all that useful by itself, let's
> > record the coredumps by hash when we save them, so that we don't waste
> > space storing identical crash dumps.
> >
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> > v2.1: use REPORT_DIR per maintainer suggestion
> > ---
>
> This version looks good to me,
> Reviewed-by: Zorro Lang <zlang@redhat.com>

It occurred to me overnight that ./check doesn't export REPORT_DIR, so
I'll push out a v2.2 that adds that. Currently the lack of an export
doesn't affect anyone, but as soon as any tests want to call
_save_coredump they're going to run into that issue.

(...and yes, I do have future fuzz tests that will call it from a test
in between fuzz field cycles.)

--D
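
The export concern can be shown with a small sketch, assuming the test runs as a separate process from `./check`. The variable names `REPORT_DIR_DEMO` and `RESULT_DIR_DEMO` are hypothetical stand-ins, chosen so the example does not depend on any real fstests state.

```shell
#!/bin/bash
# Sketch only: both variable names are made up for this demonstration.
REPORT_DIR_DEMO=/tmp/report            # set in this shell, but not exported
export RESULT_DIR_DEMO=/tmp/results    # set and exported

# A child process inherits only exported variables, so the first one
# expands to an empty string there.
child_view="$(bash -c 'echo "report=[${REPORT_DIR_DEMO}] result=[${RESULT_DIR_DEMO}]"')"
echo "$child_view"
```

In the child, the unexported variable is empty, so a path built from it would silently start at `/`.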

On Thu, Oct 13, 2022 at 08:48:41AM -0700, Darrick J. Wong wrote:
> On Thu, Oct 13, 2022 at 07:44:46PM +0800, Zorro Lang wrote:
> > On Wed, Oct 12, 2022 at 05:19:50PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <djwong@kernel.org>
> > >
> > > If someone sets kernel.core_uses_pid (or kernel.core_pattern), any
> > > coredumps generated by fstests might have names that are longer than
> > > just "core". Since the pid isn't all that useful by itself, let's
> > > record the coredumps by hash when we save them, so that we don't waste
> > > space storing identical crash dumps.
> > >
> > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > ---
> > > v2.1: use REPORT_DIR per maintainer suggestion
> > > ---
> >
> > This version looks good to me,
> > Reviewed-by: Zorro Lang <zlang@redhat.com>
>
> It occurred to me overnight that ./check doesn't export REPORT_DIR, so
> I'll push out a v2.2 that adds that. Currently the lack of an export
> doesn't affect anyone, but as soon as any tests want to call
> _save_coredump they're going to run into that issue.

Hmm... RESULT_DIR is exported, so you can use it, or use $seqres directly,
since it's defined in common/preamble (although it is not exported).

./common/preamble:42:        seqres=$RESULT_DIR/$seq

What do you think?

Thanks,
Zorro

>
> (...and yes, I do have future fuzz tests that will call it from a test
> in between fuzz field cycles.)
>
> --D

On Fri, Oct 14, 2022 at 12:03:26AM +0800, Zorro Lang wrote:
> On Thu, Oct 13, 2022 at 08:48:41AM -0700, Darrick J. Wong wrote:
> > On Thu, Oct 13, 2022 at 07:44:46PM +0800, Zorro Lang wrote:
> > > On Wed, Oct 12, 2022 at 05:19:50PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <djwong@kernel.org>
> > > >
> > > > If someone sets kernel.core_uses_pid (or kernel.core_pattern), any
> > > > coredumps generated by fstests might have names that are longer than
> > > > just "core". Since the pid isn't all that useful by itself, let's
> > > > record the coredumps by hash when we save them, so that we don't waste
> > > > space storing identical crash dumps.
> > > >
> > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > > ---
> > > > v2.1: use REPORT_DIR per maintainer suggestion
> > > > ---
> > >
> > > This version looks good to me,
> > > Reviewed-by: Zorro Lang <zlang@redhat.com>
> >
> > It occurred to me overnight that ./check doesn't export REPORT_DIR, so
> > I'll push out a v2.2 that adds that. Currently the lack of an export
> > doesn't affect anyone, but as soon as any tests want to call
> > _save_coredump they're going to run into that issue.
>
> Hmm... RESULT_DIR is exported, so you can use it, or use $seqres directly,
> since it's defined in common/preamble (although it is not exported).
>
> ./common/preamble:42:        seqres=$RESULT_DIR/$seq
>
> What do you think?

seqres is defined differently in check than in common/preamble, but I
guess RESULT_DIR will work.

--D

> Thanks,
> Zorro
>
> >
> > (...and yes, I do have future fuzz tests that will call it from a test
> > in between fuzz field cycles.)
> >
> > --D

On Thu, Oct 13, 2022 at 09:27:08AM -0700, Darrick J. Wong wrote:
> On Fri, Oct 14, 2022 at 12:03:26AM +0800, Zorro Lang wrote:
> > On Thu, Oct 13, 2022 at 08:48:41AM -0700, Darrick J. Wong wrote:
> > > On Thu, Oct 13, 2022 at 07:44:46PM +0800, Zorro Lang wrote:
> > > > On Wed, Oct 12, 2022 at 05:19:50PM -0700, Darrick J. Wong wrote:
> > > > > From: Darrick J. Wong <djwong@kernel.org>
> > > > >
> > > > > If someone sets kernel.core_uses_pid (or kernel.core_pattern), any
> > > > > coredumps generated by fstests might have names that are longer than
> > > > > just "core". Since the pid isn't all that useful by itself, let's
> > > > > record the coredumps by hash when we save them, so that we don't waste
> > > > > space storing identical crash dumps.
> > > > >
> > > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > > > ---
> > > > > v2.1: use REPORT_DIR per maintainer suggestion
> > > > > ---
> > > >
> > > > This version looks good to me,
> > > > Reviewed-by: Zorro Lang <zlang@redhat.com>
> > >
> > > It occurred to me overnight that ./check doesn't export REPORT_DIR, so
> > > I'll push out a v2.2 that adds that. Currently the lack of an export
> > > doesn't affect anyone, but as soon as any tests want to call
> > > _save_coredump they're going to run into that issue.
> >
> > Hmm... RESULT_DIR is exported, so you can use it, or use $seqres directly,
> > since it's defined in common/preamble (although it is not exported).
> >
> > ./common/preamble:42:        seqres=$RESULT_DIR/$seq
> >
> > What do you think?
>
> seqres is defined differently in check than in common/preamble, but I
> guess RESULT_DIR will work.

Nope, it won't, because RESULT_DIR ends up getting set to something like
/var/tmp/fstests/xfs_moocow/xfs, seqnum gets set to xfs/350, and then
you end up with garbage paths like:

/var/tmp/fstests/xfs_moocow/xfs/xfs/350.core.XXX

Soooo you were right, I should have used seqres from the start. The
multiple definitions are confusing, but they end up resolving to the
same pathnames(!) so it's all good.

--D

> --D
>
> > Thanks,
> > Zorro
> >
> > >
> > > (...and yes, I do have future fuzz tests that will call it from a test
> > > in between fuzz field cycles.)
> > >
> > > --D
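
The doubled `xfs` component follows directly from string concatenation, as this small sketch shows. The values of `RESULT_DIR` and `seqnum` are the ones quoted in the thread, and the `seqres` definition is the one quoted from common/preamble. Treating `$seq` as the bare test number `350` is an assumption made for illustration, and `bad_path` and `good_path` are hypothetical names.

```shell
#!/bin/bash
# Values quoted in the thread.
RESULT_DIR=/var/tmp/fstests/xfs_moocow/xfs
seqnum=xfs/350

# RESULT_DIR already ends in the test group, and seqnum starts with it,
# so joining them repeats "xfs".
bad_path="$RESULT_DIR/$seqnum.core.XXX"

# Assumption for illustration: $seq is the bare test number.
seq=350
seqres="$RESULT_DIR/$seq"      # definition quoted from common/preamble
good_path="$seqres.core.XXX"

echo "bad:  $bad_path"
echo "good: $good_path"
```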

diff --git a/check b/check
index d587a70546..29303db1c8 100755
--- a/check
+++ b/check
@@ -923,11 +923,19 @@ function run_section()
 			sts=$?
 		fi
 
-		if [ -f core ]; then
-			_dump_err_cont "[dumped core]"
-			mv core $RESULT_BASE/$seqnum.core
+		# If someone sets kernel.core_pattern or kernel.core_uses_pid,
+		# coredumps generated by fstests might have a longer name than
+		# just "core". Use globbing to find the most common patterns,
+		# assuming there are no other coredump capture packages set up.
+		local cores=0
+		for i in core core.*; do
+			test -f "$i" || continue
+			if ((cores++ == 0)); then
+				_dump_err_cont "[dumped core]"
+			fi
+			_save_coredump "$i"
 			tc_status="fail"
-		fi
+		done
 
 		if [ -f $seqres.notrun ]; then
 			$timestamp && _timestamp
@@ -960,6 +968,16 @@ function run_section()
 			# of the check script itself.
 			(_adjust_oom_score 250; _check_filesystems) || tc_status="fail"
 			_check_dmesg || tc_status="fail"
+
+			# Save any coredumps from the post-test fs checks
+			for i in core core.*; do
+				test -f "$i" || continue
+				if ((cores++ == 0)); then
+					_dump_err_cont "[dumped core]"
+				fi
+				_save_coredump "$i"
+				tc_status="fail"
+			done
 		fi
 
 		# Reload the module after each test to check for leaks or
diff --git a/common/rc b/common/rc
index d877ac77a0..2e1891180a 100644
--- a/common/rc
+++ b/common/rc
@@ -4949,6 +4949,22 @@ _create_file_sized()
 	return $ret
 }
 
+_save_coredump()
+{
+	local path="$1"
+
+	local core_hash="$(_md5_checksum "$path")"
+	local out_file="$REPORT_DIR/$seqnum.core.$core_hash"
+
+	if [ -s "$out_file" ]; then
+		rm -f "$path"
+		return
+	fi
+	rm -f "$out_file"
+
+	mv "$path" "$out_file"
+}
+
 init_rc
 
 ################################################################################
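
To see the hash-based deduplication in isolation, here is a minimal standalone sketch modelled on `_save_coredump`. It is not the fstests helper: `save_coredump_demo`, its extra arguments, and the scratch layout are hypothetical, and coreutils `md5sum` is assumed as a stand-in for `_md5_checksum`.

```shell
#!/bin/bash
# Sketch only: hypothetical stand-in for _save_coredump, using md5sum.
save_coredump_demo()
{
	local path="$1"
	local out_dir="$2"
	local prefix="$3"

	local core_hash
	core_hash="$(md5sum "$path" | cut -d ' ' -f 1)"
	local out_file="$out_dir/$prefix.core.$core_hash"

	# A non-empty file with this hash was already saved; drop the duplicate.
	if [ -s "$out_file" ]; then
		rm -f "$path"
		return
	fi
	# Clear out any empty leftover before moving the dump into place.
	rm -f "$out_file"

	mv "$path" "$out_file"
}

scratch="$(mktemp -d)"
mkdir "$scratch/results"
printf 'crash A' > "$scratch/core.100"
printf 'crash A' > "$scratch/core.200"    # same contents as core.100
printf 'crash B' > "$scratch/core.300"

for f in "$scratch"/core.*; do
	save_coredump_demo "$f" "$scratch/results" 350
done

saved_files=("$scratch"/results/*)
saved=${#saved_files[@]}
echo "saved $saved unique coredump(s)"
```

Because the output name is derived from the contents, two dumps with the same bytes map to the same file, and the second one is simply deleted.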