[v2,00/17] Bug fixes for mdadm tests

Message ID 20220526163604.32736-1-logang@deltatee.com (mailing list archive)

Logan Gunthorpe May 26, 2022, 4:35 p.m. UTC
Hi,

This is the updated series, incorporating the feedback received on v1 [1].

This series fixes all the kernel panics seen in the mdadm tests, along
with some related sparse issues. The first 12 patches refactor the
raid5-cache code so that the RCU usage of conf->log can be cleaned up,
which is done in patch 13, fixing some actual kernel NULL pointer
dereference crashes in the mdadm tests.
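
As a rough illustration of the class of bug involved (a userspace analogy,
not code from the patches): if a shared pointer such as conf->log can be
cleared concurrently, testing it and then reading it again can dereference
NULL. The sketch below loads the pointer once and uses only that local
copy. All type and function names are hypothetical, C11 atomics stand in
for the kernel's RCU primitives, and the sketch says nothing about when
the object may be freed, which is the part RCU handles in the kernel.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct r5l_log. */
struct log_sketch {
	bool writeback;
};

/* Hypothetical stand-in for struct r5conf; log may be set to NULL at any time. */
struct conf_sketch {
	_Atomic(struct log_sketch *) log;
};

/*
 * Load the shared pointer exactly once, then test and dereference only
 * the local copy, so a concurrent store of NULL cannot slip in between
 * the check and the use.
 */
static inline bool conf_is_writeback_sketch(struct conf_sketch *conf)
{
	struct log_sketch *log =
		atomic_load_explicit(&conf->log, memory_order_acquire);

	return log != NULL && log->writeback;
}
```

Wrapping the check in a helper that takes the conf means callers never
hold a log pointer that was read at some earlier time.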

Patch 14 fixes some of the remaining sparse warnings that are just
missing __rcu annotations.

Patch 15 provides a cleanup for patches 16 and 17, which fix a couple of
additional hangs seen in an mdadm test.

This series will be followed by another series for mdadm which fixes
the segfaults and annotates some failing tests to make mdadm tests
runnable fairly reliably, but I'll wait for a stable hash for this
series to note the kernel version tested against. Following that,
v3 of my lock contention series will be sent with more confidence
of its correctness.

This series is based on the current md/md-next branch as of today
(42b805af10). A git branch is available here:

  https://github.com/sbates130272/linux-p2pmem md-bug_v2

Thanks,

Logan

[1] https://lore.kernel.org/all/20220519191311.17119-1-logang@deltatee.com

--

Changes since v1:
  * Add a patch to move the struct r5l_log to raid5-log.h in order
    to fix a compiler error with rcu_access_pointer() in versions
    prior to gcc-10
  * Rework r5c_is_writeback() changes to reduce churn (per Christoph)
  * Change some 1s to trues in rcu_dereference_protected calls (per
    Christoph)
  * Fix an odd hunk mistake in the RCU protection patch (per Christoph)
  * Fix an inverted conditional (noticed by Donald)
  * Add a patch to add an enum for the overloaded values used by
    mddev->curr_resync to make the status_resync() fixes clearer
    (per Christoph)
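
To illustrate the idea behind the last item (a sketch only, not the code
from the patch): when a single counter doubles as a state flag for small
values and as a position for larger ones, naming the small values makes
the comparisons readable. The enumerator names, their meanings, and the
values 0 to 3 below are assumptions made for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical names for the small "magic" values of an overloaded
 * resync counter; the meanings given here are assumed for illustration.
 */
enum resync_state_sketch {
	RESYNC_SKETCH_NONE    = 0,	/* no resync in progress */
	RESYNC_SKETCH_YIELDED = 1,	/* yielded to a conflicting resync */
	RESYNC_SKETCH_DELAYED = 2,	/* waiting to check for conflicts */
	RESYNC_SKETCH_ACTIVE  = 3,	/* this value and above: resync running */
};

/* True only when the counter holds a real position, not a state flag. */
static inline bool resync_is_active_sketch(unsigned long long curr_resync)
{
	return curr_resync >= RESYNC_SKETCH_ACTIVE;
}
```

A status function can then compare against the named threshold instead
of a bare integer literal.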

--

Logan Gunthorpe (17):
  md/raid5-log: Drop extern decorators for function prototypes
  md/raid5-cache: Add r5c_conf_is_writeback() helper
  md/raid5-cache: Refactor r5l_start() to take a struct r5conf
  md/raid5-cache: Refactor r5l_flush_stripe_to_raid() to take a struct
    r5conf
  md/raid5-cache: Refactor r5l_wake_reclaim() to take a struct r5conf
  md/raid5-cache: Refactor remaining functions to take a r5conf
  md/raid5-ppl: Drop unused argument from ppl_handle_flush_request()
  md/raid5-cache: Pass the log through to r5c_finish_cache_stripe()
  md/raid5-cache: Don't pass conf to r5c_calculate_new_cp()
  md/raid5-cache: Take struct r5l_log in
    r5c_log_required_to_flush_cache()
  md/raid5: Ensure array is suspended for calls to log_exit()
  md/raid5-cache: Move struct r5l_log definition to raid5-log.h
  md/raid5-cache: Add RCU protection to conf->log accesses
  md/raid5-cache: Annotate pslot with __rcu notation
  md: Use enum for overloaded magic numbers used by mddev->curr_resync
  md: Ensure resync is reported after it starts
  md: Notify sysfs sync_completed in md_reap_sync_thread()

 drivers/md/md.c          |  55 +++----
 drivers/md/md.h          |  15 ++
 drivers/md/raid5-cache.c | 304 ++++++++++++++++++---------------------
 drivers/md/raid5-log.h   | 178 ++++++++++++++++-------
 drivers/md/raid5-ppl.c   |   2 +-
 drivers/md/raid5.c       |  50 +++----
 drivers/md/raid5.h       |   2 +-
 7 files changed, 336 insertions(+), 270 deletions(-)


base-commit: 42b805af102471f53e3c7867b8c2b502ea4eef7e
--
2.30.2