[GIT,PULL] bcachefs updates for 6.8

Message ID wq27r7e3n5jz4z6pn2twwrcp2zklumcfibutcpxrw6sgaxcsl5@m5z7rwxyuh72 (mailing list archive)
State New, archived

Pull-request

https://evilpiepirate.org/git/bcachefs.git tags/bcachefs-2024-01-10

Message

Kent Overstreet Jan. 10, 2024, 7:36 p.m. UTC
Hi Linus, here's the main bcachefs updates for 6.8.

Cheers,
Kent


The following changes since commit 0d72ab35a925d66b044cb62b709e53141c3f0143:

  bcachefs: make RO snapshots actually RO (2024-01-01 11:47:07 -0500)

are available in the Git repository at:

  https://evilpiepirate.org/git/bcachefs.git tags/bcachefs-2024-01-10

for you to fetch changes up to 169de41985f53320580f3d347534966ea83343ca:

  bcachefs: eytzinger0_find() search should be const (2024-01-05 23:24:46 -0500)

----------------------------------------------------------------
bcachefs updates for 6.8:

 - btree write buffer rewrite: instead of adding keys to the btree write
   buffer at transaction commit time, we now journal them with a
   different journal entry type and copy them from the journal to the
   write buffer just prior to the journal write.

   This reduces the number of atomic operations on shared cachelines
   in the transaction commit path and is a significant performance
   improvement on some workloads: multithreaded 4k random writes went
   from ~650k iops to ~850k iops.

 - Bring back optimistic spinning for six locks: the new implementation
   doesn't use osq locks; instead we add to the lock waitlist as normal,
   and then spin on the lock_acquired bit in the waitlist entry, _not_
   the lock itself (a minimal sketch follows this list).

 - BCH_IOCTL_DEV_USAGE_V2, which allows for new data types

 - BCH_IOCTL_FSCK_OFFLINE, which runs the kernel implementation of fsck
   but without mounting: useful for transparently using the kernel
   version of fsck from 'bcachefs fsck' when the kernel version is a
   better match for the on-disk filesystem.

 - BCH_IOCTL_FSCK_ONLINE: online fsck. Not all passes are supported yet,
   but the passes that are supported are fully featured - errors may be
   corrected as normal.

   The new ioctls use the new 'thread_with_file' abstraction for kicking
   off a kthread that's tied to a file descriptor returned to userspace
   via the ioctl (see the sketch after this list).

 - btree_paths within a btree_trans are now dynamically growable,
   instead of being limited to 64. This is important for the
   check_directory_structure phase of fsck, and also fixes some issues
   we were having with btree path overflow in the reflink btree.

 - Trigger refactoring; prep work for the upcoming disk space accounting
   rewrite

 - Numerous bugfixes :)
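
For illustration, a minimal sketch of the six-lock waitlist spin described
above (names and structure are simplified placeholders, not the actual
fs/bcachefs/six.c code):

  #include <linux/jiffies.h>
  #include <linux/list.h>
  #include <linux/processor.h>
  #include <linux/sched.h>

  /* Hypothetical waiter; the real six_lock_waiter carries more state. */
  struct six_waiter_sketch {
          struct list_head        list;
          struct task_struct      *task;
          bool                    lock_acquired;  /* set by the lock holder on handoff */
  };

  /*
   * Spin on this waiter's own entry rather than on the shared lock word,
   * so the polling stays on a cacheline private to the waiter instead of
   * bouncing the lock's cacheline between CPUs.
   */
  static bool six_spin_on_waiter_sketch(struct six_waiter_sketch *w,
                                        unsigned long timeout)
  {
          while (!READ_ONCE(w->lock_acquired)) {
                  if (need_resched() || time_after(jiffies, timeout))
                          return false;   /* give up and sleep as usual */
                  cpu_relax();
          }
          return true;
  }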

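And a minimal sketch of the thread_with_file idea (again illustrative
only; the real interface in fs/bcachefs/thread_with_file.c differs in
detail): create a kthread, then hand userspace an anonymous-inode file
descriptor whose file_operations drive the thread.

  #include <linux/anon_inodes.h>
  #include <linux/err.h>
  #include <linux/fcntl.h>
  #include <linux/fs.h>
  #include <linux/kthread.h>
  #include <linux/sched.h>

  /*
   * Start a kthread and return a file descriptor tied to it; @fops is
   * whatever read/poll/release implementation userspace should use to
   * talk to the thread.  The returned fd is what an ioctl such as
   * BCH_IOCTL_FSCK_OFFLINE would pass back to the caller.
   */
  static int run_thread_with_fd_sketch(int (*fn)(void *), void *arg,
                                       const struct file_operations *fops)
  {
          struct task_struct *t;
          int fd;

          t = kthread_create(fn, arg, "bch-thread-sketch");
          if (IS_ERR(t))
                  return PTR_ERR(t);

          fd = anon_inode_getfd("[bch_thread_sketch]", fops, t,
                                O_RDONLY | O_CLOEXEC);
          if (fd < 0) {
                  kthread_stop(t);
                  return fd;
          }

          wake_up_process(t);
          return fd;
  }
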
----------------------------------------------------------------
Brian Foster (3):
      bcachefs: remove sb lock and flags update on explicit shutdown
      bcachefs: return from fsync on writeback error to avoid early shutdown
      bcachefs: clean up some dead fallocate code

Daniel Hill (6):
      bcachefs: add a quieter bch2_read_super
      bcachefs: remove dead bch2_evacuate_bucket()
      bcachefs: rebalance should wakeup on shutdown if disabled
      bcachefs: copygc should wakeup on shutdown if disabled
      bcachefs: copygc shouldn't try moving buckets on error
      bcachefs: remove redundant condition from data_update_index_update

Gustavo A. R. Silva (3):
      bcachefs: Replace zero-length arrays with flexible-array members
      bcachefs: Use array_size() in call to copy_from_user()
      bcachefs: Replace zero-length array with flex-array member and use __counted_by

Kent Overstreet (210):
      bcachefs: Flush fsck errors before running twice
      bcachefs: Add extra verbose logging for ro path
      bcachefs: Improved backpointer messages in fsck
      bcachefs: kill INODE_LOCK, use lock_two_nondirectories()
      bcachefs: Check for unlinked inodes not on deleted list
      bcachefs: Fix locking when checking freespace btree
      bcachefs: Print old version when scanning for old metadata
      bcachefs: Fix warning when building in userspace
      bcachefs: Include average write size in sysfs journal_debug
      bcachefs: Add an assertion in bch2_journal_pin_set()
      bcachefs: Journal pins must always have a flush_fn
      bcachefs: track_event_change()
      bcachefs: Clear k->needs_whitout earlier in commit path
      bcachefs: BTREE_INSERT_JOURNAL_REPLAY now "don't init trans->journal_res"
      bcachefs: Kill BTREE_UPDATE_PREJOURNAL
      bcachefs: Go rw before journal replay
      bcachefs: Make journal replay more efficient
      bcachefs: Avoiding dropping/retaking write locks in bch2_btree_write_buffer_flush_one()
      bcachefs: Fix redundant variable initialization
      bcachefs: Kill dead BTREE_INSERT flags
      bcachefs: bch_str_hash_flags_t
      bcachefs: Rename BTREE_INSERT flags
      bcachefs: Improve btree_path_dowgrade tracepoint
      bcachefs: backpointers fsck no longer uses BTREE_ITER_ALL_LEVELS
      bcachefs: Kill BTREE_ITER_ALL_LEVELS
      bcachefs: Fix userspace bch2_prt_datetime()
      bcachefs: Don't rejournal keys in key cache flush
      bcachefs: Don't flush journal after replay
      bcachefs: Add a tracepoint for journal entry close
      bcachefs: Kill memset() in bch2_btree_iter_init()
      bcachefs: Kill btree_iter->journal_pos
      bcachefs: Rename bch_replicas_entry -> bch_replicas_entry_v1
      bcachefs: Don't use update_cached_sectors() in bch2_mark_alloc()
      bcachefs: x-macro-ify bch_data_ops enum
      bcachefs: Convert bch2_move_btree() to bbpos
      bcachefs: BCH_DATA_OP_drop_extra_replicas
      powerpc: Export kvm_guest static key, for bcachefs six locks
      bcachefs: six locks: Simplify optimistic spinning
      bcachefs: Simplify check_bucket_ref()
      bcachefs: BCH_IOCTL_DEV_USAGE_V2
      bcachefs: New bucket sector count helpers
      bcachefs: bch2_dev_usage_to_text()
      bcachefs: Kill dev_usage->buckets_ec
      bcachefs: Improve sysfs compression_stats
      bcachefs: Print durability in member_to_text()
      bcachefs: Add a rebalance, data_update tracepoints
      bcachefs: Refactor bch2_check_alloc_to_lru_ref()
      bcachefs: Kill journal_seq/gc args to bch2_dev_usage_update_m()
      bcachefs: convert bch_fs_flags to x-macro
      bcachefs: No need to allocate keys for write buffer
      bcachefs: Improve btree write buffer tracepoints
      bcachefs: kill journal->preres_wait
      bcachefs: delete useless commit_do()
      bcachefs: Clean up btree write buffer write ref handling
      bcachefs: bch2_btree_write_buffer_flush_locked()
      bcachefs: bch2_btree_write_buffer_flush() -> bch2_btree_write_buffer_tryflush()
      bcachefs: count_event()
      bcachefs: Improve trace_trans_restart_too_many_iters()
      bcachefs: Improve trace_trans_restart_would_deadlock
      bcachefs: Don't open code bch2_dev_exists2()
      bcachefs: ONLY_SPECIFIED_DEVS doesn't mean ignore durability anymore
      bcachefs: wb_flush_one_slowpath()
      bcachefs: more write buffer refactoring
      bcachefs: Explicity go RW for fsck
      bcachefs: On missing backpointer to interior node, flush interior updates
      bcachefs: Make backpointer fsck wb flush check more rigorous
      bcachefs: Include btree_trans in more tracepoints
      bcachefs: Move reflink_p triggers into reflink.c
      bcachefs: Refactor trans->paths_allocated to be standard bitmap
      bcachefs: BCH_ERR_opt_parse_error
      bcachefs: Improve error message when finding wrong btree node
      bcachefs: c->ro_ref
      bcachefs: thread_with_file
      bcachefs: Add ability to redirect log output
      bcachefs: Mark recovery passses that are safe to run online
      bcachefs: bch2_run_online_recovery_passes()
      bcachefs: BCH_IOCTL_FSCK_OFFLINE
      bcachefs: BCH_IOCTL_FSCK_ONLINE
      bcachefs: Fix open coded set_btree_iter_dontneed()
      bcachefs: Fix bch2_read_btree()
      bcachefs: continue now works in for_each_btree_key2()
      bcachefs: Kill for_each_btree_key()
      bcachefs: Rename for_each_btree_key2() -> for_each_btree_key()
      bcachefs: reserve path idx 0 for sentinal
      bcachefs: Fix snapshot.c assertion for online fsck
      bcachefs: kill btree_path->(alloc_seq|downgrade_seq)
      bcachefs; kill bch2_btree_key_cache_flush()
      bcachefs: Improve trans->extra_journal_entries
      bcachefs: bch2_trans_node_add no longer uses trans_for_each_path()
      bcachefs: Unwritten journal buffers are always dirty
      bcachefs: journal->buf_lock
      bcachefs: btree write buffer now slurps keys from journal
      bcachefs: Inline btree write buffer sort
      bcachefs: check_root() can now be run online
      bcachefs: kill btree_trans->wb_updates
      bcachefs: Drop journal entry compaction
      bcachefs: fix userspace build errors
      bcachefs: bch_err_(fn|msg) check if should print
      bcachefs: qstr_eq()
      bcachefs: drop extra semicolon
      bcachefs: Make sure allocation failure errors are logged
      MAINTAINERS: Update my email address
      bcachefs: Delete dio read alignment check
      bcachefs: Fixes for rust bindgen
      bcachefs: check for failure to downgrade
      bcachefs: Use GFP_KERNEL for promote allocations
      bcachefs: Improve the nopromote tracepoint
      bcachefs: trans_for_each_update() now declares loop iter
      bcachefs: darray_for_each() now declares loop iter
      bcachefs: simplify bch_devs_list
      bcachefs: better error message in btree_node_write_work()
      bcachefs: add more verbose logging
      bcachefs: fix warning about uninitialized time_stats
      bcachefs: use track_event_change() for allocator blocked stats
      bcachefs: bch2_trans_srcu_lock() should be static
      bcachefs: bch2_dirent_lookup() -> lockrestart_do()
      bcachefs: for_each_btree_key_upto() -> for_each_btree_key_old_upto()
      bcachefs: kill for_each_btree_key_old_upto()
      bcachefs: kill for_each_btree_key_norestart()
      bcachefs: for_each_btree_key() now declares loop iter
      bcachefs: for_each_member_device() now declares loop iter
      bcachefs: for_each_member_device_rcu() now declares loop iter
      bcachefs: vstruct_for_each() now declares loop iter
      bcachefs: fsck -> bch2_trans_run()
      bcachefs: kill __bch2_btree_iter_peek_upto_and_restart()
      bcachefs: bkey_for_each_ptr() now declares loop iter
      bcachefs: for_each_keylist_key() declares loop iter
      bcachefs: skip journal more often in key cache reclaim
      bcachefs: Convert split_devs() to darray
      bcachefs: Kill GFP_NOFAIL usage in readahead path
      bcachefs: minor bch2_btree_path_set_pos() optimization
      bcachefs: bch2_path_get() -> btree_path_idx_t
      bcachefs; bch2_path_put() -> btree_path_idx_t
      bcachefs: bch2_btree_path_set_pos() -> btree_path_idx_t
      bcachefs: bch2_btree_path_make_mut() -> btree_path_idx_t
      bcachefs: bch2_btree_path_traverse() -> btree_path_idx_t
      bcachefs: btree_path_alloc() -> btree_path_idx_t
      bcachefs: btree_iter -> btree_path_idx_t
      bcachefs: btree_insert_entry -> btree_path_idx_t
      bcachefs: struct trans_for_each_path_inorder_iter
      bcachefs: bch2_btree_path_to_text() -> btree_path_idx_t
      bcachefs: kill trans_for_each_path_from()
      bcachefs: trans_for_each_path() no longer uses path->idx
      bcachefs: trans_for_each_path_with_node() no longer uses path->idx
      bcachefs: bch2_path_get() no longer uses path->idx
      bcachefs: bch2_btree_iter_peek_prev() no longer uses path->idx
      bcachefs: get_unlocked_mut_path() -> btree_path_idx_t
      bcachefs: kill btree_path.idx
      bcachefs: Clean up btree_trans
      bcachefs: rcu protect trans->paths
      bcachefs: optimize __bch2_trans_get(), kill DEBUG_TRANSACTIONS
      bcachefs: trans->updates will also be resizable
      bcachefs: trans->nr_paths
      bcachefs: Fix interior update path btree_path uses
      bcachefs: growable btree_paths
      bcachefs: bch2_btree_trans_peek_updates
      bcachefs: bch2_btree_trans_peek_prev_updates
      bcachefs: bch2_btree_trans_peek_slot_updates
      bcachefs: Fix reattach_inode() for snapshots
      bcachefs: check_directory_structure() can now be run online
      bcachefs: Check journal entries for invalid keys in trans commit path
      bcachefs: Fix nochanges/read_only interaction
      bcachefs: bch_member->seq
      bcachefs: Split brain detection
      bcachefs: btree_trans always has stats
      bcachefs: track transaction durations
      bcachefs: wb_key_cmp -> wb_key_ref_cmp
      bcachefs: __journal_keys_sort() refactoring
      bcachefs: __bch2_journal_key_to_wb -> bch2_journal_key_to_wb_slowpath
      bcachefs: Fix printing of device durability
      bcachefs: factor out thread_with_file, thread_with_stdio
      bcachefs: Upgrading uses bch_sb.recovery_passes_required
      bcachefs: trans_mark now takes bkey_s
      bcachefs: mark now takes bkey_s
      bcachefs: Kill BTREE_TRIGGER_NOATOMIC
      bcachefs: BTREE_TRIGGER_TRANSACTIONAL
      bcachefs: kill mem_trigger_run_overwrite_then_insert()
      bcachefs: unify inode trigger
      bcachefs: unify reflink_p trigger
      bcachefs: unify reservation trigger
      bcachefs: move bch2_mark_alloc() to alloc_background.c
      bcachefs: unify alloc trigger
      bcachefs: move stripe triggers to ec.c
      bcachefs: unify stripe trigger
      bcachefs: bch2_trigger_pointer()
      bcachefs: Online fsck can now fix errors
      bcachefs: bch2_trigger_stripe_ptr()
      bcachefs: unify extent trigger
      bcachefs: Combine .trans_trigger, .atomic_trigger
      bcachefs: kill useless return ret
      bcachefs: Add an option to control btree node prefetching
      bcachefs: don't clear accessed bit in btree node fill
      bcachefs: add time_stats for btree_node_read_done()
      bcachefs: increase max_active on io_complete_wq
      bcachefs: add missing bch2_latency_acct() call
      bcachefs: Don't autofix errors we can't fix
      bcachefs: no thread_with_file in userspace
      bcachefs: Upgrades now specify errors to fix, like downgrades
      bcachefs: fsck_err()s don't need to manually check c->sb.version anymore
      bcachefs: Improve would_deadlock trace event
      bcachefs: %pg is banished
      bcachefs: __bch2_sb_field_to_text()
      bcachefs: print sb magic when relevant
      bcachefs: improve validate_bset_keys()
      bcachefs: improve checksum error messages
      bcachefs: bch2_dump_bset() doesn't choke on u64s == 0
      bcachefs: Restart recovery passes more reliably
      bcachefs: fix simulateously upgrading & downgrading
      bcachefs: move "ptrs not changing" optimization to bch2_trigger_extent()
      bcachefs: eytzinger0_find() search should be const

Randy Dunlap (2):
      bcachefs: six lock: fix typos
      bcachefs: mean and variance: fix kernel-doc for function params

Richard Davies (1):
      bcachefs: Remove obsolete comment about zstd

Yang Li (1):
      bcachefs: clean up one inconsistent indenting

 MAINTAINERS                            |    2 +-
 arch/powerpc/kernel/firmware.c         |    2 +
 fs/bcachefs/Kconfig                    |   18 +-
 fs/bcachefs/Makefile                   |    1 +
 fs/bcachefs/alloc_background.c         |  484 +++++-----
 fs/bcachefs/alloc_background.h         |   39 +-
 fs/bcachefs/alloc_foreground.c         |   46 +-
 fs/bcachefs/backpointers.c             |  199 +++--
 fs/bcachefs/backpointers.h             |   27 +-
 fs/bcachefs/bcachefs.h                 |  192 +++-
 fs/bcachefs/bcachefs_format.h          |  123 ++-
 fs/bcachefs/bcachefs_ioctl.h           |   60 +-
 fs/bcachefs/bkey_methods.h             |   82 +-
 fs/bcachefs/bset.c                     |    6 +
 fs/bcachefs/btree_cache.c              |   28 +-
 fs/bcachefs/btree_cache.h              |    4 +-
 fs/bcachefs/btree_gc.c                 |  327 +++----
 fs/bcachefs/btree_io.c                 |  132 ++-
 fs/bcachefs/btree_io.h                 |    2 +-
 fs/bcachefs/btree_iter.c               |  945 ++++++++++----------
 fs/bcachefs/btree_iter.h               |  407 ++++-----
 fs/bcachefs/btree_journal_iter.c       |   25 +-
 fs/bcachefs/btree_key_cache.c          |   63 +-
 fs/bcachefs/btree_key_cache.h          |    2 -
 fs/bcachefs/btree_locking.c            |  111 ++-
 fs/bcachefs/btree_locking.h            |   16 +-
 fs/bcachefs/btree_trans_commit.c       |  313 +++----
 fs/bcachefs/btree_types.h              |  136 +--
 fs/bcachefs/btree_update.c             |  245 ++----
 fs/bcachefs/btree_update.h             |  111 ++-
 fs/bcachefs/btree_update_interior.c    |  322 +++----
 fs/bcachefs/btree_update_interior.h    |   11 +-
 fs/bcachefs/btree_write_buffer.c       |  668 +++++++++-----
 fs/bcachefs/btree_write_buffer.h       |   53 +-
 fs/bcachefs/btree_write_buffer_types.h |   63 +-
 fs/bcachefs/buckets.c                  | 1511 ++++++++------------------------
 fs/bcachefs/buckets.h                  |   45 +-
 fs/bcachefs/buckets_types.h            |    2 -
 fs/bcachefs/chardev.c                  |  363 ++++++--
 fs/bcachefs/checksum.h                 |   23 +
 fs/bcachefs/compress.c                 |    4 -
 fs/bcachefs/darray.h                   |    8 +-
 fs/bcachefs/data_update.c              |   30 +-
 fs/bcachefs/debug.c                    |  141 ++-
 fs/bcachefs/dirent.c                   |   51 +-
 fs/bcachefs/dirent.h                   |    7 +-
 fs/bcachefs/disk_groups.c              |   13 +-
 fs/bcachefs/ec.c                       |  406 +++++++--
 fs/bcachefs/ec.h                       |    5 +-
 fs/bcachefs/ec_types.h                 |    2 +-
 fs/bcachefs/errcode.h                  |    7 +-
 fs/bcachefs/error.c                    |  103 ++-
 fs/bcachefs/extent_update.c            |    2 +-
 fs/bcachefs/extents.c                  |    4 -
 fs/bcachefs/extents.h                  |   24 +-
 fs/bcachefs/eytzinger.h                |   10 +-
 fs/bcachefs/fs-common.c                |   36 +-
 fs/bcachefs/fs-io-buffered.c           |   38 +-
 fs/bcachefs/fs-io-direct.c             |    3 -
 fs/bcachefs/fs-io.c                    |   20 +-
 fs/bcachefs/fs-ioctl.c                 |   12 +-
 fs/bcachefs/fs.c                       |  100 +--
 fs/bcachefs/fs.h                       |    9 +-
 fs/bcachefs/fsck.c                     |  630 ++++++-------
 fs/bcachefs/inode.c                    |  129 ++-
 fs/bcachefs/inode.h                    |   15 +-
 fs/bcachefs/io_misc.c                  |   55 +-
 fs/bcachefs/io_read.c                  |   50 +-
 fs/bcachefs/io_write.c                 |   45 +-
 fs/bcachefs/journal.c                  |  108 ++-
 fs/bcachefs/journal.h                  |    4 +-
 fs/bcachefs/journal_io.c               |  153 ++--
 fs/bcachefs/journal_reclaim.c          |  120 ++-
 fs/bcachefs/journal_reclaim.h          |   16 +-
 fs/bcachefs/journal_seq_blacklist.c    |    2 +-
 fs/bcachefs/journal_types.h            |   16 +-
 fs/bcachefs/keylist.c                  |    2 -
 fs/bcachefs/keylist.h                  |    4 +-
 fs/bcachefs/logged_ops.c               |   18 +-
 fs/bcachefs/lru.c                      |   11 +-
 fs/bcachefs/mean_and_variance.c        |   10 +-
 fs/bcachefs/mean_and_variance.h        |    5 +-
 fs/bcachefs/migrate.c                  |    9 +-
 fs/bcachefs/move.c                     |  187 ++--
 fs/bcachefs/move.h                     |   13 +-
 fs/bcachefs/movinggc.c                 |   49 +-
 fs/bcachefs/opts.c                     |    4 +-
 fs/bcachefs/opts.h                     |   20 +-
 fs/bcachefs/quota.c                    |   28 +-
 fs/bcachefs/rebalance.c                |   38 +-
 fs/bcachefs/recovery.c                 |  291 +++---
 fs/bcachefs/recovery.h                 |    1 +
 fs/bcachefs/recovery_types.h           |   25 +-
 fs/bcachefs/reflink.c                  |  224 ++++-
 fs/bcachefs/reflink.h                  |   22 +-
 fs/bcachefs/replicas.c                 |   66 +-
 fs/bcachefs/replicas.h                 |   22 +-
 fs/bcachefs/replicas_types.h           |    6 +-
 fs/bcachefs/sb-clean.c                 |   20 +-
 fs/bcachefs/sb-downgrade.c             |   90 +-
 fs/bcachefs/sb-downgrade.h             |    1 +
 fs/bcachefs/sb-errors_types.h          |    4 +-
 fs/bcachefs/sb-members.c               |   18 +-
 fs/bcachefs/sb-members.h               |  100 ++-
 fs/bcachefs/six.c                      |  117 +--
 fs/bcachefs/six.h                      |   13 +-
 fs/bcachefs/snapshot.c                 |  174 ++--
 fs/bcachefs/snapshot.h                 |    8 +-
 fs/bcachefs/str_hash.h                 |   25 +-
 fs/bcachefs/subvolume.c                |   31 +-
 fs/bcachefs/subvolume_types.h          |    4 +
 fs/bcachefs/super-io.c                 |  168 ++--
 fs/bcachefs/super-io.h                 |    7 +-
 fs/bcachefs/super.c                    |  388 ++++----
 fs/bcachefs/super.h                    |    6 +-
 fs/bcachefs/super_types.h              |    2 +-
 fs/bcachefs/sysfs.c                    |  160 ++--
 fs/bcachefs/tests.c                    |  193 ++--
 fs/bcachefs/thread_with_file.c         |  299 +++++++
 fs/bcachefs/thread_with_file.h         |   41 +
 fs/bcachefs/thread_with_file_types.h   |   16 +
 fs/bcachefs/trace.h                    |  278 ++++--
 fs/bcachefs/util.c                     |  191 ++--
 fs/bcachefs/util.h                     |   56 +-
 fs/bcachefs/vstructs.h                 |   10 +-
 125 files changed, 7101 insertions(+), 5961 deletions(-)
 create mode 100644 fs/bcachefs/thread_with_file.c
 create mode 100644 fs/bcachefs/thread_with_file.h
 create mode 100644 fs/bcachefs/thread_with_file_types.h

Comments

Kees Cook Jan. 10, 2024, 11:48 p.m. UTC | #1
On Wed, Jan 10, 2024 at 02:36:30PM -0500, Kent Overstreet wrote:
> [...]
>       bcachefs: %pg is banished

Hi!

Not a PR blocker, but this patch re-introduces users of strlcpy() which
has been otherwise removed this cycle. I'll send a patch to replace
these new uses, but process-wise, I'd like to check on how bcachefs patches
are reviewed.

Normally I'd go find the original email that posted the patch and reply
there, but I couldn't find a development list where this patch was
posted. Where is this happening? (Being posted somewhere is supposed
to be a prerequisite for living in -next. E.g. quoting from the -next
inclusion boiler-plate: "* posted to the relevant mailing list,") It
looks like it was authored 5 days ago, which is cutting it awfully close
to the merge window opening:

	AuthorDate: Fri Jan 5 11:58:50 2024 -0500

Actually, it looks like you rebased onto v6.7-rc7? This is normally
strongly discouraged. The common merge base is -rc2.

It also seems it didn't get a run through scripts/checkpatch.pl, which
shows 4 warnings, 2 of which point out the strlcpy deprecation:

WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89
#123: FILE: fs/bcachefs/super.c:1389:
+               strlcpy(c->name, name.buf, sizeof(c->name));

WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89
#124: FILE: fs/bcachefs/super.c:1390:
+       strlcpy(ca->name, name.buf, sizeof(ca->name));

Please make sure you're running checkpatch.pl -- it'll make integration,
technical debt reduction, and coding style adjustments much easier. :)

Thanks!

-Kees
Kent Overstreet Jan. 11, 2024, 12:04 a.m. UTC | #2
On Wed, Jan 10, 2024 at 03:48:43PM -0800, Kees Cook wrote:
> On Wed, Jan 10, 2024 at 02:36:30PM -0500, Kent Overstreet wrote:
> > [...]
> >       bcachefs: %pg is banished
> 
> Hi!
> 
> Not a PR blocker, but this patch re-introduces users of strlcpy() which
> has been otherwise removed this cycle. I'll send a patch to replace
> these new uses, but process-wise, I'd like to check on how bcachefs patches
> are reviewed.

I'm happy to fix it. Perhaps the declaration could get a deprecated
warning, though?

> Normally I'd go find the original email that posted the patch and reply
> there, but I couldn't find a development list where this patch was
> posted. Where is this happening? (Being posted somewhere is supposed
> to be a prerequisite for living in -next. E.g. quoting from the -next
> inclusion boiler-plate: "* posted to the relevant mailing list,") It
> looks like it was authored 5 days ago, which is cutting it awfully close
> to the merge window opening:
> 
> 	AuthorDate: Fri Jan 5 11:58:50 2024 -0500

I'm confident in my testing; if it was a patch that needed more soak
time it would have waited.

> Actually, it looks like you rebased onto v6.7-rc7? This is normally
> strongly discouraged. The common merge base is -rc2.

Is there something special about rc2?

I reorder patches fairly often just in the normal course of backporting
fixes, and if I have to rebase everything for a backport I'll often
rebase onto a newer kernel so that the people who are running my tree
are testing something more stable - it does come up.

> It also seems it didn't get a run through scripts/checkpatch.pl, which
> shows 4 warnings, 2 of which point out the strlcpy deprecation:
> 
> WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89
> #123: FILE: fs/bcachefs/super.c:1389:
> +               strlcpy(c->name, name.buf, sizeof(c->name));
> 
> WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89
> #124: FILE: fs/bcachefs/super.c:1390:
> +       strlcpy(ca->name, name.buf, sizeof(ca->name));
> 
> Please make sure you're running checkpatch.pl -- it'll make integration,
> technical debt reduction, and coding style adjustments much easier. :)

Well, we do have rather a lot of linters these days.

That's actually something I've been meaning to raise - perhaps we could
start thinking about some pluggable way of running linters so that
they're all run as part of a normal kernel build (and something that
would be easy to drop new linters in to; I'd like to write some bcachefs
specific ones).

The current model of "I have to remember to run these 5 things, and then
I'm going to get email nags for 3 more that I can't run" is not terribly
scalable :)
Kees Cook Jan. 11, 2024, 12:39 a.m. UTC | #3
On Wed, Jan 10, 2024 at 07:04:47PM -0500, Kent Overstreet wrote:
> On Wed, Jan 10, 2024 at 03:48:43PM -0800, Kees Cook wrote:
> > On Wed, Jan 10, 2024 at 02:36:30PM -0500, Kent Overstreet wrote:
> > > [...]
> > >       bcachefs: %pg is banished
> > 
> > Hi!
> > 
> > Not a PR blocker, but this patch re-introduces users of strlcpy() which
> > has been otherwise removed this cycle. I'll send a patch to replace
> > these new uses, but process-wise, I'd like to check on how bcachefs patches
> > are reviewed.
> 
> I'm happy to fix it. Perhaps the declaration could get a deprecated
> warning, though?

That's one of checkpatch.pl's purposes, seeing as how deprecation warnings
are ... deprecated. :P
https://docs.kernel.org/process/deprecated.html#id1
This has made treewide changes like this more difficult, but these are
the Rules From Linus. ;)

> > Normally I'd go find the original email that posted the patch and reply
> > there, but I couldn't find a development list where this patch was
> > posted. Where is this happening? (Being posted somewhere is supposed
> > to be a prerequisite for living in -next. E.g. quoting from the -next
> > inclusion boiler-plate: "* posted to the relevant mailing list,") It
> > looks like it was authored 5 days ago, which is cutting it awfully close
> > to the merge window opening:
> > 
> > 	AuthorDate: Fri Jan 5 11:58:50 2024 -0500
> 
> I'm confident in my testing; if it was a patch that needed more soak
> time it would have waited.
> 
> > Actually, it looks like you rebased onto v6.7-rc7? This is normally
> > strongly discouraged. The common merge base is -rc2.
> 
> Is there something special about rc2?

It's what sfr suggested, as it's the point many subsystem maintainers
merge up to when opening their trees for development. Usually it's a
good tree state: after stabilization fixes for any rc1 rough edges.

> I reorder patches fairly often just in the normal course of backporting
> fixes, and if I have to rebase everything for a backport I'll often
> rebase onto a newer kernel so that the people who are running my tree
> are testing something more stable - it does come up.

Okay, gotcha. I personally don't care how maintainers handle rebasing; I
was just confused about the timing and why I couldn't find the original
patch on any lists. :) And to potentially warn about Linus possibly not
liking the rebase too.

> 
> > It also seems it didn't get a run through scripts/checkpatch.pl, which
> > shows 4 warnings, 2 of which point out the strlcpy deprecation:
> > 
> > WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89
> > #123: FILE: fs/bcachefs/super.c:1389:
> > +               strlcpy(c->name, name.buf, sizeof(c->name));
> > 
> > WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89
> > #124: FILE: fs/bcachefs/super.c:1390:
> > +       strlcpy(ca->name, name.buf, sizeof(ca->name));
> > 
> > Please make sure you're running checkpatch.pl -- it'll make integration,
> > technical debt reduction, and coding style adjustments much easier. :)
> 
> Well, we do have rather a lot of linters these days.
> 
> That's actually something I've been meaning to raise - perhaps we could
> start thinking about some pluggable way of running linters so that
> they're all run as part of a normal kernel build (and something that
> would be easy to drop new linters in to; I'd like to write some bcachefs
> specific ones).

With no central CI, the best we've got is everyone running the same
"minimum set" of checks. I'm most familiar with netdev's CI which has
such things (and checkpatch.pl is included). For example see:
https://patchwork.kernel.org/project/netdevbpf/patch/20240110110451.5473-3-ptikhomirov@virtuozzo.com/

> The current model of "I have to remember to run these 5 things, and then
> I'm going to get email nags for 3 more that I can't run" is not terribly
> scalable :)

Oh, I hear you. It's positively agonizing for those of us doing treewide
changes. I've got at least 4 CIs I check (in addition to my own) just to
check everyone's various coverage tools.

At the very least, checkpatch.pl is the common denominator:
https://docs.kernel.org/process/submitting-patches.html#style-check-your-changes

-Kees
Kent Overstreet Jan. 11, 2024, 12:58 a.m. UTC | #4
On Wed, Jan 10, 2024 at 04:39:22PM -0800, Kees Cook wrote:
> On Wed, Jan 10, 2024 at 07:04:47PM -0500, Kent Overstreet wrote:
> > On Wed, Jan 10, 2024 at 03:48:43PM -0800, Kees Cook wrote:
> > > On Wed, Jan 10, 2024 at 02:36:30PM -0500, Kent Overstreet wrote:
> > > > [...]
> > > >       bcachefs: %pg is banished
> > > 
> > > Hi!
> > > 
> > > Not a PR blocker, but this patch re-introduces users of strlcpy() which
> > > has been otherwise removed this cycle. I'll send a patch to replace
> > > these new uses, but process-wise, I'd like to check on how bcachefs patches
> > > are reviewed.
> > 
> > I'm happy to fix it. Perhaps the declaration could get a deprecated
> > warning, though?
> 
> That's one of checkpatch.pl's purposes, seeing as how deprecation warnings
> are ... deprecated. :P
> https://docs.kernel.org/process/deprecated.html#id1
> This has made treewide changes like this more difficult, but these are
> the Rules From Linus. ;)

...And how does that make any sense? "The warnings weren't getting
cleaned up, so get rid of them - except not really, just move them off
to the side so they'll be more annoying when they do come up"...

Perhaps we could've just switched to deprecation warnings being on in a
W=1 build?

> Okay, gotcha. I personally don't care how maintainers handle rebasing; I
> was just confused about the timing and why I couldn't find the original
> patch on any lists. :) And to potentially warn about Linus possibly not
> liking the rebase too.

*nod* If there's some other reason why it's convenient to be on rc2 I
could possibly switch my workflow, but pushing code out quickly is the
norm for me.

> > > It also seems it didn't get a run through scripts/checkpatch.pl, which
> > > shows 4 warnings, 2 of which point out the strlcpy deprecation:
> > > 
> > > WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89
> > > #123: FILE: fs/bcachefs/super.c:1389:
> > > +               strlcpy(c->name, name.buf, sizeof(c->name));
> > > 
> > > WARNING: Prefer strscpy over strlcpy - see: https://github.com/KSPP/linux/issues/89
> > > #124: FILE: fs/bcachefs/super.c:1390:
> > > +       strlcpy(ca->name, name.buf, sizeof(ca->name));
> > > 
> > > Please make sure you're running checkpatch.pl -- it'll make integration,
> > > technical debt reduction, and coding style adjustments much easier. :)
> > 
> > Well, we do have rather a lot of linters these days.
> > 
> > That's actually something I've been meaning to raise - perhaps we could
> > start thinking about some pluggable way of running linters so that
> > they're all run as part of a normal kernel build (and something that
> > would be easy to drop new linters in to; I'd like to write some bcachefs
> > specific ones).
> 
> With no central CI, the best we've got is everyone running the same
> "minimum set" of checks. I'm most familiar with netdev's CI which has
> such things (and checkpatch.pl is included). For example see:
> https://patchwork.kernel.org/project/netdevbpf/patch/20240110110451.5473-3-ptikhomirov@virtuozzo.com/

Yeah, we badly need a central/common CI. I've been making noises that my
own thing could be a good basis for that - e.g. it shouldn't be much
work to use it for running our tests in tools/testing/selftests. Sadly no
time for that myself, but happy to talk about it if someone does start
leading/coordinating that effort.

example tests, example output:
https://evilpiepirate.org/git/ktest.git/tree/tests/bcachefs/single_device.ktest
https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-testing

> > The current model of "I have to remember to run these 5 things, and then
> > I'm going to get email nags for 3 more that I can't run" is not terribly
> > scalable :)
> 
> Oh, I hear you. It's positively agonizing for those of us doing treewide
> changes. I've got at least 4 CIs I check (in addition to my own) just to
> check everyone's various coverage tools.
> 
> At the very least, checkpatch.pl is the common denominator:
> https://docs.kernel.org/process/submitting-patches.html#style-check-your-changes

At one point in my career I was religious about checkpatch; since then
the warnings it produces have seemed to me more on the naggy and less on
the useful end of the spectrum - I like smatch better in that respect.
But - I'll start running it again for the deprecation warnings :)
Linus Torvalds Jan. 11, 2024, 1:47 a.m. UTC | #5
On Wed, 10 Jan 2024 at 16:58, Kent Overstreet <kent.overstreet@linux.dev> wrote:
>
> ...And how does that make any sense? "The warnings weren't getting
> cleaned up, so get rid of them - except not really, just move them off
> to the side so they'll be more annoying when they do come up"...

Honestly, the checkpatch warnings are often garbage too.

The whole deprecation warnings thing never worked. They don't work in
checkpatch either.

> Perhaps we could've just switched to deprecation warnings being on in a
> W=1 build?

No, because the whole idea of "let me mark something deprecated and
then not just remove it" is GARBAGE.

If somebody wants to deprecate something, it is up to *them* to finish
the job. Not annoy thousands of other developers with idiotic
warnings.

            Linus
pr-tracker-bot@kernel.org Jan. 11, 2024, 2:23 a.m. UTC | #6
The pull request you sent on Wed, 10 Jan 2024 14:36:30 -0500:

> https://evilpiepirate.org/git/bcachefs.git tags/bcachefs-2024-01-10

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/999a36b52b1b11b2ca0590756e4f8cf21f2d9182

Thank you!
Mark Brown Jan. 11, 2024, 3:35 p.m. UTC | #7
On Wed, Jan 10, 2024 at 07:58:20PM -0500, Kent Overstreet wrote:
> On Wed, Jan 10, 2024 at 04:39:22PM -0800, Kees Cook wrote:

> > With no central CI, the best we've got is everyone running the same
> > "minimum set" of checks. I'm most familiar with netdev's CI which has
> > such things (and checkpatch.pl is included). For example see:
> > https://patchwork.kernel.org/project/netdevbpf/patch/20240110110451.5473-3-ptikhomirov@virtuozzo.com/

> Yeah, we badly need a central/common CI. I've been making noises that my
> own thing could be a good basis for that - e.g. it shouldn't be much
> work to use it for running our tests in tools/testing/selftests. Sadly no
> time for that myself, but happy to talk about it if someone does start
> leading/coordinating that effort.

IME the actually running the tests bit isn't usually *so* much the
issue, someone making a new test runner and/or output format does mean a
bit of work integrating it into infrastructure but that's more usually
annoying than a blocker.  Issues tend to be more around arranging to
drive the relevant test systems, figuring out which tests to run where
(including things like figuring out capacity on test devices, or how
long you're prepared to wait in interactive usage) and getting the
environment on the target devices into a state where the tests can run.
Plus any stability issues with the tests themselves of course, and
there's a bunch of costs somewhere along the line.

I suspect we're more likely to get traction with aggregating test
results and trying to do UI/reporting on top of that than with the
running things bit, that really would be very good to have.  I've copied
in Nikolai, whose work on kcidb is the main thing I'm aware of there,
though at the minute operational issues mean it's a bit write only.

> example tests, example output:
> https://evilpiepirate.org/git/ktest.git/tree/tests/bcachefs/single_device.ktest
> https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-testing

For example looking at the sample test there it looks like it needs
among other things mkfs.btrfs, bcachefs, stress-ng, xfs_io, fio, mdadm,
rsync and a reasonably performant disk with 40G of space available.
None of that is especially unreasonable for a filesystems test but it's
all things that we need to get onto the system where we want to run the
test and there's a lot of systems where the storage requirements would
be unsustainable for one reason or another.  It also appears to take
about 33000s to run on whatever system you use which is distinctly
non-trivial.

I certainly couldn't run it readily in my lab.

> > At the very least, checkpatch.pl is the common denominator:
> > https://docs.kernel.org/process/submitting-patches.html#style-check-your-changes

> At one point in my career I was religious about checkpatch; since then
> the warnings it produces have seemed to me more on the naggy and less
> on the useful end of the spectrum - I like smatch better in that
> respect.  But - I'll start running it again for the deprecation
> warnings :)

Yeah, I don't run it on incoming stuff because the rate at which it
reports things I don't find useful is far too high.
Kent Overstreet Jan. 11, 2024, 5:38 p.m. UTC | #8
On Thu, Jan 11, 2024 at 03:35:40PM +0000, Mark Brown wrote:
> On Wed, Jan 10, 2024 at 07:58:20PM -0500, Kent Overstreet wrote:
> > On Wed, Jan 10, 2024 at 04:39:22PM -0800, Kees Cook wrote:
> 
> > > With no central CI, the best we've got is everyone running the same
> > > "minimum set" of checks. I'm most familiar with netdev's CI which has
> > > such things (and checkpatch.pl is included). For example see:
> > > https://patchwork.kernel.org/project/netdevbpf/patch/20240110110451.5473-3-ptikhomirov@virtuozzo.com/
> 
> > Yeah, we badly need a central/common CI. I've been making noises that my
> > own thing could be a good basis for that - e.g. it shouldn't be much
> > work to use it for running our tests in tools/testing/selftests. Sadly no
> > time for that myself, but happy to talk about it if someone does start
> > leading/coordinating that effort.
> 
> IME the actually running the tests bit isn't usually *so* much the
> issue, someone making a new test runner and/or output format does mean a
> bit of work integrating it into infrastructure but that's more usually
> annoying than a blocker.

No, the proliferation of test runners, test output formats, CI systems,
etc. really is an issue; it means we can't have one common driver that
anyone can run from the command line, and instead there's a bunch of
disparate systems with patchwork integration and all the feedback is nag
emails - after you've finished what you were working on instead of
moving on to the next thing - with no way to get immediate feedback.

And it's because building something shiny and new is the fun part, no
one wants to do the grungy integration work.

> Issues tend to be more around arranging to
> drive the relevant test systems, figuring out which tests to run where
> (including things like figuring out capacity on test devices, or how
> long you're prepared to wait in interactive usage) and getting the
> environment on the target devices into a state where the tests can run.
> Plus any stability issues with the tests themselves of course, and
> there's a bunch of costs somewhere along the line.
> 
> I suspect we're more likely to get traction with aggregating test
> results and trying to do UI/reporting on top of that than with the
> running things bit, that really would be very good to have.  I've copied
> in Nikolai, whose work on kcidb is the main thing I'm aware of there,
> though at the minute operational issues mean it's a bit write only.
> 
> > example tests, example output:
> > https://evilpiepirate.org/git/ktest.git/tree/tests/bcachefs/single_device.ktest
> > https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-testing
> 
> For example looking at the sample test there it looks like it needs
> among other things mkfs.btrfs, bcachefs, stress-ng, xfs_io, fio, mdadm,
> rsync

Getting all that set up by the end user is one command:
  ktest/root_image create
and running a test is one more command:
  build-test-kernel run ~/ktest/tests/bcachefs/single_device.ktest

> and a reasonably performant disk with 40G of space available.
> None of that is especially unreasonable for a filesystems test but it's
> all things that we need to get onto the system where we want to run the
> test and there's a lot of systems where the storage requirements would
> be unsustainable for one reason or another.  It also appears to take
> about 33000s to run on whatever system you use which is distinctly
> non-trivial.

Getting sufficient coverage in filesystem land does take some amount of
resources, but it's not so bad - I'm leasing 80 core ARM64 machines from
Hetzner for $250/month and running 10 test VMs per machine, so it's
really not that expensive. Other subsystems would probably be fine with
less resources.
Mark Brown Jan. 11, 2024, 9:47 p.m. UTC | #9
On Thu, Jan 11, 2024 at 12:38:57PM -0500, Kent Overstreet wrote:
> On Thu, Jan 11, 2024 at 03:35:40PM +0000, Mark Brown wrote:

> > IME the actually running the tests bit isn't usually *so* much the
> > issue, someone making a new test runner and/or output format does mean a
> > bit of work integrating it into infrastructure but that's more usually
> > annoying than a blocker.

> No, the proliferation of test runners, test output formats, CI systems,
> etc. really is an issue; it means we can't have one common driver that
> anyone can run from the command line, and instead there's a bunch of
> disparate systems with patchwork integration and all the feedback is nag
> emails - after you've finished what you were working on instead of
> moving on to the next thing - with no way to get immediate feedback.

It's certainly an issue and it's much better if people do manage to fit
their tests into some existing thing but I'm not convinced that's the
big reason why you have a bunch of different systems running separately
and doing different things.  For example the enterprise vendors will
naturally tend to have a bunch of server systems in their labs and focus
on their testing needs while I know the Intel audio CI setup has a bunch
of laptops, laptop like dev boards and things in there with loopback
audio cables and I think test equipment plugged in and focuses rather
more on audio.  My own lab is built around systems I can be in the
same room as without getting too annoyed and does things I find useful,
plus using spare bandwidth for KernelCI because they can take donated
lab time.

I think there's a few different issues you're pointing at here:

 - Working out how to run relevant tests for whatever area of the kernel
   you're working on on whatever hardware you have to hand.
 - Working out exactly what other testers will do.
 - Promptness and consistency of feedback from other testers.
 - UI for getting results from other testers.

and while it really sounds like your main annoyances are the bits with
other test systems it really seems like the test runner bit is mainly
for the first issue, possibly also helping with working out what other
testers are going to do.  These are all very real issues.

> And it's because building something shiny and new is the fun part, no
> one wants to do the grungy integration work.

I think you may be overestimating people's enthusiasm for writing test
stuff there!  There is NIH stuff going on for sure but a lot of the time
when you look at something where people have gone off and done their own
thing it's either much older than you initially thought and predates
anything they might've integrated with or there's some reason why none
of the existing systems fit well.  Anecdotally it seems much more common
to see people looking for things to reuse in order to save time than it
is to see people going off and reinventing the world.

> > > example tests, example output:
> > > https://evilpiepirate.org/git/ktest.git/tree/tests/bcachefs/single_device.ktest
> > > https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-testing

> > For example looking at the sample test there it looks like it needs
> > among other things mkfs.btrfs, bcachefs, stress-ng, xfs_io, fio, mdadm,
> > rsync

> Getting all that set up by the end user is one command:
>   ktest/root_image create
> and running a test is one more command:
> build-test-kernel run ~/ktest/tests/bcachefs/single_device.ktest

That does assume that you're building and running everything directly on
the system under test and are happy to have the test in a VM which isn't
an assumption that holds universally, and also that whoever's doing the
testing doesn't want to do something like use their own distro or
something - like I say none of it looks too unreasonable for
filesystems.

> > and a reasonably performant disk with 40G of space available.
> > None of that is especially unreasonable for a filesystems test but it's
> > all things that we need to get onto the system where we want to run the
> > test and there's a lot of systems where the storage requirements would
> > be unsustainable for one reason or another.  It also appears to take
> > about 33000s to run on whatever system you use which is distinctly
> > non-trivial.

> Getting sufficient coverage in filesystem land does take some amount of
> resources, but it's not so bad - I'm leasing 80 core ARM64 machines from
> Hetzner for $250/month and running 10 test VMs per machine, so it's
> really not that expensive. Other subsystems would probably be fine with
> less resources.

Some will be, some will have more demanding requirements especially when
you want to test on actual hardware rather than in a VM.  For example
with my own test setup which is more focused on hardware the operating
costs aren't such a big deal but I've got boards that are for various
reasons irreplaceable, often single instances of boards (which makes
scheduling a thing) and for some of the tests I'd like to get around to
setting up I need special physical setup.  Some of the hardware I'd like
to cover is only available in machines which are in various respects
annoying to automate, I've got a couple of unused systems waiting for me
to have sufficient bandwidth to work out how to automate them.  Either
way I don't think the costs are trival enough to be completely handwaved
away.

I'd also note that the 9 hour turnaround time for that test set you're
pointing at isn't exactly what I'd associate with immediate feedback.
Matthew Wilcox Jan. 11, 2024, 10:57 p.m. UTC | #10
On Wed, Jan 10, 2024 at 05:47:20PM -0800, Linus Torvalds wrote:
> No, because the whole idea of "let me mark something deprecated and
> then not just remove it" is GARBAGE.
> 
> If somebody wants to deprecate something, it is up to *them* to finish
> the job. Not annoy thousands of other developers with idiotic
> warnings.

What would be nice is something that warned about _new_ uses being
added.  ie checkpatch.  Let's at least not make the problem worse.
Kees Cook Jan. 11, 2024, 11:42 p.m. UTC | #11
On Thu, Jan 11, 2024 at 10:57:18PM +0000, Matthew Wilcox wrote:
> On Wed, Jan 10, 2024 at 05:47:20PM -0800, Linus Torvalds wrote:
> > No, because the whole idea of "let me mark something deprecated and
> > then not just remove it" is GARBAGE.
> > 
> > If somebody wants to deprecate something, it is up to *them* to finish
> > the job. Not annoy thousands of other developers with idiotic
> > warnings.
> 
> What would be nice is something that warned about _new_ uses being
> added.  ie checkpatch.  Let's at least not make the problem worse.

For now, we've just kind of "dealt with it". For things that show up
with new -W options we've enlisted sfr to do the -next builds with it
explicitly added (but not to the tree) so he could generate nag emails
when new warnings appeared. That could happen if we added it to W=1
builds, or some other flag like REPORT_DEPRECATED=1.

Another ugly idea would be to do a treewide replacement of "func" to
"func_deprecated", and make "func" just a wrapper for it that is marked
with __deprecated. Then only new instances would show up (assuming people
weren't trying to actively bypass the deprecation work by adding calls to
"func_deprecated"). :P Then the refactoring to replace "func_deprecated"
could happen a bit more easily.
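
For illustration, a minimal sketch of that wrapper shape (names purely
illustrative; since the kernel's __deprecated macro is currently defined
to nothing, a sketch like this reaches for the raw attribute, and it only
warns where -Wdeprecated-declarations is actually enabled):

  #include <linux/types.h>        /* size_t */

  /* The current implementation keeps its body, just under a new name. */
  size_t strlcpy_deprecated(char *dest, const char *src, size_t size);

  /*
   * Existing callers keep building through the thin wrapper; any new call
   * site gets flagged at its point of use.
   */
  static inline __attribute__((deprecated("use strscpy()")))
  size_t strlcpy(char *dest, const char *src, size_t size)
  {
          return strlcpy_deprecated(dest, src, size);
  }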

Most past deprecations have pretty narrow usage. This is not true with
the string functions, which is why it's more noticeable here. :P

-Kees
Linus Torvalds Jan. 11, 2024, 11:58 p.m. UTC | #12
On Thu, 11 Jan 2024 at 15:42, Kees Cook <keescook@chromium.org> wrote:
>
> Another ugly idea would be to do a treewide replacement of "func" to
> "func_deprecated", and make "func" just a wrapper for it that is marked
> with __deprecated.

That's probably not a horrible idea, at least when we're talking a
reasonable number of users (ie when we're talking "tens of users" like
strlcpy is now).

We should probably generally rename functions much more aggressively
any time the "signature" changes.

We've had situations where the semantics changed but not enough to
necessarily trigger type warnings, and then renaming things is just a
good thing to avoid mistakes. Even if it's temporary and you plan
on renaming things back.

And with a coccinelle script (that should be documented in the patch)
it's not necessarily all that painful to do.

                Linus
Kent Overstreet Jan. 12, 2024, 12:05 a.m. UTC | #13
On Thu, Jan 11, 2024 at 03:42:19PM -0800, Kees Cook wrote:
> On Thu, Jan 11, 2024 at 10:57:18PM +0000, Matthew Wilcox wrote:
> > On Wed, Jan 10, 2024 at 05:47:20PM -0800, Linus Torvalds wrote:
> > > No, because the whole idea of "let me mark something deprecated and
> > > then not just remove it" is GARBAGE.
> > > 
> > > If somebody wants to deprecate something, it is up to *them* to finish
> > > the job. Not annoy thousands of other developers with idiotic
> > > warnings.
> > 
> > What would be nice is something that warned about _new_ uses being
> > added.  ie checkpatch.  Let's at least not make the problem worse.
> 
> For now, we've just kind of "dealt with it". For things that show up
> with new -W options we've enlisted sfr to do the -next builds with it
> explicitly added (but not to the tree) so he could generate nag emails
> when new warnings appeared. That could happen if we added it to W=1
> builds, or some other flag like REPORT_DEPRECATED=1.
> 
> Another ugly idea would be to do a treewide replacement of "func" to
> "func_deprecated", and make "func" just a wrapper for it that is marked
> with __deprecated. Then only new instances would show up (assuming people
> weren't trying to actively bypass the deprecation work by adding calls to
> "func_deprecated"). :P Then the refactoring to replace "func_deprecated"
> could happen a bit more easily.
> 
> Most past deprecations have pretty narrow usage. This is not true with
> the string functions, which is why it's more noticeable here. :P

Before doing the renaming - why not just leave a kdoc comment that marks
it as deprecated? Seems odd that checkpatch was patched, but I can't
find anything marking it as deprecated when I cscope to it.
Kees Cook Jan. 12, 2024, 12:18 a.m. UTC | #14
On Thu, Jan 11, 2024 at 07:05:06PM -0500, Kent Overstreet wrote:
> On Thu, Jan 11, 2024 at 03:42:19PM -0800, Kees Cook wrote:
> > On Thu, Jan 11, 2024 at 10:57:18PM +0000, Matthew Wilcox wrote:
> > > On Wed, Jan 10, 2024 at 05:47:20PM -0800, Linus Torvalds wrote:
> > > > No, because the whole idea of "let me mark something deprecated and
> > > > then not just remove it" is GARBAGE.
> > > > 
> > > > If somebody wants to deprecate something, it is up to *them* to finish
> > > > the job. Not annoy thousands of other developers with idiotic
> > > > warnings.
> > > 
> > > What would be nice is something that warned about _new_ uses being
> > > added.  ie checkpatch.  Let's at least not make the problem worse.
> > 
> > For now, we've just kind of "dealt with it". For things that show up
> > with new -W options we've enlisted sfr to do the -next builds with it
> > explicitly added (but not to the tree) so he could generate nag emails
> > when new warnings appeared. That could happen if we added it to W=1
> > builds, or some other flag like REPORT_DEPRECATED=1.
> > 
> > Another ugly idea would be to do a treewide replacement of "func" to
> > "func_deprecated", and make "func" just a wrapper for it that is marked
> > with __deprecated. Then only new instances would show up (assuming people
> > weren't trying to actively bypass the deprecation work by adding calls to
> > "func_deprecated"). :P Then the refactoring to replace "func_deprecated"
> > could happen a bit more easily.
> > 
> > Most past deprecations have pretty narrow usage. This is not true with
> > the string functions, which is why it's more noticeable here. :P
> 
> Before doing the renaming - why not just leave a kdoc comment that marks
> it as deprecated? Seems odd that checkpatch was patched, but I can't
> find anything marking it as deprecated when I cscope to it.

It doesn't explicitly say "deprecated", but this language has been in
the kdoc for a while now (not that people go read this often):

 * Do not use this function. While FORTIFY_SOURCE tries to avoid
 * over-reads when calculating strlen(@q), it is still possible.
 * Prefer strscpy(), though note its different return values for
 * detecting truncation.

But it's all fine -- we're about to wipe out strlcpy for v6.8. Once the
drivers-core and drm-misc-next trees land (and the bcachefs patch[1]),
we'll be at 0 users. :)

-Kees

[1] https://lore.kernel.org/lkml/20240110235438.work.385-kees@kernel.org/
Kent Overstreet Jan. 12, 2024, 1:10 a.m. UTC | #15
On Thu, Jan 11, 2024 at 09:47:26PM +0000, Mark Brown wrote:
> On Thu, Jan 11, 2024 at 12:38:57PM -0500, Kent Overstreet wrote:
> > On Thu, Jan 11, 2024 at 03:35:40PM +0000, Mark Brown wrote:
> 
> > > IME the actually running the tests bit isn't usually *so* much the
> > > issue, someone making a new test runner and/or output format does mean a
> > > bit of work integrating it into infrastructure but that's more usually
> > > annoying than a blocker.
> 
> > No, the proliferation of test runners, test output formats, CI systems,
> > etc. really is an issue; it means we can't have one common driver that
> > anyone can run from the command line, and instead there's a bunch of
> > disparate systems with patchwork integration and all the feedback is nag
> > emails - after you've finished what you were working on instead of
> > moving on to the next thing - with no way to get immediate feedback.
> 
> It's certainly an issue and it's much better if people do manage to fit
> their tests into some existing thing but I'm not convinced that's the
> big reason why you have a bunch of different systems running separately
> and doing different things.  For example the enterprise vendors will
> naturally tend to have a bunch of server systems in their labs and focus
> on their testing needs while I know the Intel audio CI setup has a bunch
> of laptops, laptop like dev boards and things in there with loopback
> audio cables and I think test equipment plugged in and focuses rather
> > more on audio.  My own lab is built around systems I can be in the
> same room as without getting too annoyed and does things I find useful,
> plus using spare bandwidth for KernelCI because they can take donated
> lab time.

No, you're overthinking.

The vast majority of kernel testing requires no special hardware, just a
virtual machine.

There is _no fucking reason_ we shouldn't be able to run tests on our
own local machines - _local_ machines, not waiting for the Intel CI
setup and asking for a git branch to be tested, not waiting for who
knows how long for the CI farm to get to it - just run the damn tests
immediately and get immediate feedback.

You guys are overthinking and overengineering and ignoring the basics,
the way enterprise people always do.

> > And it's because building something shiny and new is the fun part, no
> > one wants to do the grungy integration work.
> 
> I think you may be overestimating people's enthusiasm for writing test
> stuff there!  There is NIH stuff going on for sure but a lot of the time
> when you look at something where people have gone off and done their own
> thing it's either much older than you initially thought and predates
> anything they might've integrated with or there's some reason why none
> of the existing systems fit well.  Anecdotally it seems much more common
> to see people looking for things to reuse in order to save time than it
> is to see people going off and reinventing the world.

It's a basic lack of leadership. Yes, the younger engineers are always
going to be doing the new and shiny, and always going to want to build
something new instead of finishing off the tests or integrating with
something existing. Which is why we're supposed to have managers saying
"ok, what do I need to prioritize for my team to be able to develop
effectively".

> 
> > > > example tests, example output:
> > > > https://evilpiepirate.org/git/ktest.git/tree/tests/bcachefs/single_device.ktest
> > > > https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-testing
> 
> > > For example looking at the sample test there it looks like it needs
> > > among other things mkfs.btrfs, bcachefs, stress-ng, xfs_io, fio, mdadm,
> > > rsync
> 
> > Getting all that set up by the end user is one command:
> >   ktest/root_image create
> > and running a test is one more command:
> > build-test-kernel run ~/ktest/tests/bcachefs/single_device.ktest
> 
> That does assume that you're building and running everything directly on
> the system under test and are happy to have the test in a VM which isn't
> an assumption that holds universally, and also that whoever's doing the
> testing doesn't want to do something like use their own distro or
> something - like I say none of it looks too unreasonable for
> filesystems.

No, I'm doing it that way because technically that's the simplest way to
do it.

All you guys building crazy contraptions for running tests on Google
Cloud or Amazon or whatever - you're building technical workarounds for
broken procurement.

Just requisition the damn machines.

> Some will be, some will have more demanding requirements especially when
> you want to test on actual hardware rather than in a VM.  For example
> with my own test setup which is more focused on hardware the operating
> costs aren't such a big deal but I've got boards that are for various
> reasons irreplaceable, often single instances of boards (which makes
> scheduling a thing) and for some of the tests I'd like to get around to
> setting up I need special physical setup.  Some of the hardware I'd like
> to cover is only available in machines which are in various respects
> annoying to automate, I've got a couple of unused systems waiting for me
> to have sufficient bandwidth to work out how to automate them.  Either
> way I don't think the costs are trivial enough to be completely handwaved
> away.

That does complicate things.

I'd also really like to get automated performance testing going too,
which would have similar requirements in that jobs would need to be
scheduled on specific dedicated machines. I think what you're doing
could still build off of some common infrastructure.

> I'd also note that the 9 hour turnaround time for that test set you're
> pointing at isn't exactly what I'd associate with immediate feedback.

My CI shards at the subtest level, and like I mentioned I run 10 VMs per
physical machine, so with just 2 of the 80 core Ampere boxes I get full
test runs done in ~20 minutes.
Neal Gompa Jan. 12, 2024, 11:11 a.m. UTC | #16
On Thu, Jan 11, 2024 at 8:11 PM Kent Overstreet
<kent.overstreet@linux.dev> wrote:
>
> On Thu, Jan 11, 2024 at 09:47:26PM +0000, Mark Brown wrote:
> > On Thu, Jan 11, 2024 at 12:38:57PM -0500, Kent Overstreet wrote:
> > > On Thu, Jan 11, 2024 at 03:35:40PM +0000, Mark Brown wrote:
> >
> > > > IME the actually running the tests bit isn't usually *so* much the
> > > > issue, someone making a new test runner and/or output format does mean a
> > > > bit of work integrating it into infrastructure but that's more usually
> > > > annoying than a blocker.
> >
> > > No, the proliferation of test runners, test output formats, CI systems,
> > > etc. really is an issue; it means we can't have one common driver that
> > > anyone can run from the command line, and instead there's a bunch of
> > > disparate systems with patchwork integration and all the feedback is nag
> > > emails - after you've finished what you were working on instead of
> > > moving on to the next thing - with no way to get immediate feedback.
> >
> > It's certainly an issue and it's much better if people do manage to fit
> > their tests into some existing thing but I'm not convinced that's the
> > big reason why you have a bunch of different systems running separately
> > and doing different things.  For example the enterprise vendors will
> > naturally tend to have a bunch of server systems in their labs and focus
> > on their testing needs while I know the Intel audio CI setup has a bunch
> > of laptops, laptop like dev boards and things in there with loopback
> > audio cables and I think test equipment plugged in and focuses rather
> > > more on audio.  My own lab is built around systems I can be in the
> > same room as without getting too annoyed and does things I find useful,
> > plus using spare bandwidth for KernelCI because they can take donated
> > lab time.
>
> No, you're overthinking.
>
> The vast majority of kernel testing requires no special hardware, just a
> virtual machine.
>
> There is _no fucking reason_ we shouldn't be able to run tests on our
> own local machines - _local_ machines, not waiting for the Intel CI
> setup and asking for a git branch to be tested, not waiting for who
> knows how long for the CI farm to get to it - just run the damn tests
> immediately and get immediate feedback.
>
> You guys are overthinking and overengineering and ignoring the basics,
> the way enterprise people always do.
>

As one of those former enterprise people that actually did do this
stuff, I can say that even when I was "in the enterprise", I tried to
avoid overthinking and overengineering stuff like this. :)

Nobody can maintain anything that's so complicated nobody can run the
tests on their machine. That is the root of all sadness.

> > > And it's because building something shiny and new is the fun part, no
> > > one wants to do the grungy integration work.
> >
> > I think you may be overestimating people's enthusiasm for writing test
> > stuff there!  There is NIH stuff going on for sure but a lot of the time
> > when you look at something where people have gone off and done their own
> > thing it's either much older than you initially thought and predates
> > anything they might've integrated with or there's some reason why none
> > of the existing systems fit well.  Anecdotally it seems much more common
> > to see people looking for things to reuse in order to save time than it
> > is to see people going off and reinventing the world.
>
> It's a basic lack of leadership. Yes, the younger engineers are always
> going to be doing the new and shiny, and always going to want to build
> something new instead of finishing off the tests or integrating with
> something existing. Which is why we're supposed to have managers saying
> "ok, what do I need to prioritize for my team to be able to develop
> effectively".
>
> >
> > > > > example tests, example output:
> > > > > https://evilpiepirate.org/git/ktest.git/tree/tests/bcachefs/single_device.ktest
> > > > > https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-testing
> >
> > > > For example looking at the sample test there it looks like it needs
> > > > among other things mkfs.btrfs, bcachefs, stress-ng, xfs_io, fio, mdadm,
> > > > rsync
> >
> > > Getting all that set up by the end user is one command:
> > >   ktest/root_image create
> > > and running a test is one more command:
> > > build-test-kernel run ~/ktest/tests/bcachefs/single_device.ktest
> >
> > That does assume that you're building and running everything directly on
> > the system under test and are happy to have the test in a VM which isn't
> > an assumption that holds universally, and also that whoever's doing the
> > testing doesn't want to do something like use their own distro or
> > something - like I say none of it looks too unreasonable for
> > filesystems.
>
> No, I'm doing it that way because technically that's the simplest way to
> do it.
>
> All you guys building crazy contraptions for running tests on Google
> Cloud or Amazon or whatever - you're building technical workarounds for
> broken procurement.
>
> Just requisition the damn machines.
>

Running in the cloud does not mean it has to be complicated. It can be
a simple Buildbot or whatever that knows how to spawn spot instances
for tests and destroy them when they're done *if the test passed*. If
a test failed on an instance, it could hold onto them for a day or two
for someone to debug if needed.

(I mention Buildbot because in a previous life, I used that to run
tests for the dattobd out-of-tree kernel module before. That was the
strategy I used for it.)

> > Some will be, some will have more demanding requirements especially when
> > you want to test on actual hardware rather than in a VM.  For example
> > with my own test setup which is more focused on hardware the operating
> > costs aren't such a big deal but I've got boards that are for various
> > reasons irreplaceable, often single instances of boards (which makes
> > scheduling a thing) and for some of the tests I'd like to get around to
> > setting up I need special physical setup.  Some of the hardware I'd like
> > to cover is only available in machines which are in various respects
> > annoying to automate, I've got a couple of unused systems waiting for me
> > to have sufficient bandwidth to work out how to automate them.  Either
> > way I don't think the costs are trivial enough to be completely handwaved
> > away.
>
> That does complicate things.
>
> I'd also really like to get automated performance testing going too,
> which would have similar requirements in that jobs would need to be
> scheduled on specific dedicated machines. I think what you're doing
> could still build off of some common infrastructure.
>
> > I'd also note that the 9 hour turnaround time for that test set you're
> > pointing at isn't exactly what I'd associate with immediate feedback.
>
> My CI shards at the subtest level, and like I mentioned I run 10 VMs per
> physical machine, so with just 2 of the 80 core Ampere boxes I get full
> test runs done in ~20 minutes.
>

This design, ironically, is way more cloud-friendly than a lot of
testing system designs I've seen in the past. :)
Mark Brown Jan. 12, 2024, 6:22 p.m. UTC | #17
On Fri, Jan 12, 2024 at 06:11:04AM -0500, Neal Gompa wrote:
> On Thu, Jan 11, 2024 at 8:11 PM Kent Overstreet
> > On Thu, Jan 11, 2024 at 09:47:26PM +0000, Mark Brown wrote:

> > > It's certainly an issue and it's much better if people do manage to fit
> > > their tests into some existing thing but I'm not convinced that's the
> > > big reason why you have a bunch of different systems running separately
> > > and doing different things.  For example the enterprise vendors will
> > > naturally tend to have a bunch of server systems in their labs and focus
> > > on their testing needs while I know the Intel audio CI setup has a bunch

> > No, you're overthinking.

> > The vast majority of kernel testing requires no special hardware, just a
> > virtual machine.

This depends a lot on the area of the kernel you're looking at - some
things are very amenable to testing in a VM but there's plenty of code
where you really do want to ensure that at some point you're running
with some actual hardware, ideally as wide a range of it with diverse
implementation decisions as you can manage.  OTOH some things can only
be tested virtually because the hardware doesn't exist yet!

> > There is _no fucking reason_ we shouldn't be able to run tests on our
> > own local machines - _local_ machines, not waiting for the Intel CI
> > setup and asking for a git branch to be tested, not waiting for who
> > knows how long for the CI farm to get to it - just run the damn tests
> > immediately and get immediate feedback.

> > You guys are overthinking and overengineering and ignoring the basics,
> > the way enterprise people always do.

> As one of those former enterprise people that actually did do this
> stuff, I can say that even when I was "in the enterprise", I tried to
> avoid overthinking and overengineering stuff like this. :)

> Nobody can maintain anything that's so complicated nobody can run the
> tests on their machine. That is the root of all sadness.

Yeah, similar with a lot of the more hardware focused or embedded stuff
- running something on the machine that's in front of you is seldom the
bit that causes substantial issues.  Most of the exceptions I've
personally dealt with involved testing hardware (from simple stuff like
wiring the audio inputs and outputs together to verify that they're
working to attaching fancy test equipment to simulate things or validate
that desired physical parameters are being achieved).

> > > of the existing systems fit well.  Anecdotally it seems much more common
> > > to see people looking for things to reuse in order to save time than it
> > > is to see people going off and reinventing the world.

> > It's a basic lack of leadership. Yes, the younger engineers are always
> > going to be doing the new and shiny, and always going to want to build
> > something new instead of finishing off the tests or integrating with
> > something existing. Which is why we're supposed to have managers saying
> > "ok, what do I need to prioritize for my team to be able to develop
> > effectively".

That sounds more like a "(reproducible) tests don't exist" complaint
which is a different thing again to people going off and NIHing fancy
frameworks.

> > > That does assume that you're building and running everything directly on
> > > the system under test and are happy to have the test in a VM which isn't
> > > an assumption that holds universally, and also that whoever's doing the
> > > testing doesn't want to do something like use their own distro or
> > > something - like I say none of it looks too unreasonable for
> > > filesystems.

> > No, I'm doing it that way because technically that's the simplest way to
> > do it.

> > All you guys building crazy contraptions for running tests on Google
> > Cloud or Amazon or whatever - you're building technical workarounds for
> > broken procurement.

I think you're addressing some specific stuff that I'm not super
familiar with here?  My own stuff (and most of the stuff I end up
looking at) involves driving actual hardware.

> > Just requisition the damn machines.

There's some assumptions there which are true for a lot of people
working on the kernel but not all of them...

> Running in the cloud does not mean it has to be complicated. It can be
> a simple Buildbot or whatever that knows how to spawn spot instances
> for tests and destroy them when they're done *if the test passed*. If
> a test failed on an instance, it could hold onto them for a day or two
> for someone to debug if needed.

> (I mention Buildbot because in a previous life, I used that to run
> tests for the dattobd out-of-tree kernel module before. That was the
> strategy I used for it.)

Yeah, or if your thing runs in a Docker container rather than a VM then
throwing it at a Kubernetes cluster using a batch job isn't a big jump.

> > I'd also really like to get automated performance testing going too,
> > which would have similar requirements in that jobs would need to be
> > scheduled on specific dedicated machines. I think what you're doing
> > could still build off of some common infrastructure.

It does actually - like quite a few test labs mine is based around LAVA,
labgrid is the other popular option (people were actually thinking about
integrating the two recently since labgrid is a bit lower level than
LAVA and they could conceptually play nicely with each other).  Since
the control API is internet accessible this means that it's really
simple for me to donate spare time on the boards to KernelCI as it
understands how to drive LAVA, testing that I in turn use myself.  Both
my stuff and KernelCI use a repository of glue which knows how to drive
various testsuites inside a LAVA job, that's also used by other systems
using LAVA like LKFT.

The custom stuff I have is all fairly thin (and quite janky), mostly
just either things specific to my physical lab or managing which tests I
want to run and what results I expect.  What I've got is *much* more
limited than I'd like, and frankly if I wasn't able to pick up huge
amounts of preexisting work most of this stuff would not be happening.

> > > I'd also note that the 9 hour turnaround time for that test set you're
> > > pointing at isn't exactly what I'd associate with immediate feedback.

> > My CI shards at the subtest level, and like I mentioned I run 10 VMs per
> > physical machine, so with just 2 of the 80 core Ampere boxes I get full
> > test runs done in ~20 minutes.

> This design, ironically, is way more cloud-friendly than a lot of
> testing system designs I've seen in the past. :)

Sounds like a small private cloud to me!  :P
Kent Overstreet Jan. 15, 2024, 6:42 p.m. UTC | #18
On Fri, Jan 12, 2024 at 06:22:55PM +0000, Mark Brown wrote:
> This depends a lot on the area of the kernel you're looking at - some
> things are very amenable to testing in a VM but there's plenty of code
> where you really do want to ensure that at some point you're running
> with some actual hardware, ideally as wide a range of it with diverse
> implementation decisions as you can manage.  OTOH some things can only
> be tested virtually because the hardware doesn't exist yet!

Surface wise, there are a lot of drivers that need real hardware; but if
you look at where the complexity is, the hard complex algorithmic stuff
that really needs to be tested thoroughly - that's all essentially
library code that doesn't need specific drivers to test.

More broadly, whenever testing comes up the "special cases and special
hardware" keeps distracting us from making progress on the basics, which
is making sure as much of the kernel as possible can be tested in a
virtual machine, with no special setup.

And if we were better at that, it would be a good nudge towards driver
developers to make their stuff easier to test, perhaps by getting a
virtualized implementation into qemu, or to make the individual drivers
thinner and move heavy logic into easier to test library code.

> Yeah, similar with a lot of the more hardware focused or embedded stuff
> - running something on the machine that's in front of you is seldom the
> bit that causes substantial issues.  Most of the exceptions I've
> personally dealt with involved testing hardware (from simple stuff like
> wiring the audio inputs and outputs together to verify that they're
> working to attaching fancy test equipment to simulate things or validate
> that desired physical parameters are being achieved).

Is that sort of thing a frequent source of regressions?

That sounds like the sort of thing that should be a simple table, and
not something I would expect to need heavy regression testing - but, my
experience with driver development was nearly 15 years ago; not a lot of
day to day. How badly are typical kernel refactorings needing regression
testing in individual drivers?

Filesystem development, OTOH, needs _heavy_ regression testing for
everything we do. Similarly with mm, scheduler; many subtle interactions
going on.

> > > > of the existing systems fit well.  Anecdotally it seems much more common
> > > > to see people looking for things to reuse in order to save time than it
> > > > is to see people going off and reinventing the world.
> 
> > > It's a basic lack of leadership. Yes, the younger engineers are always
> > > going to be doing the new and shiny, and always going to want to build
> > > something new instead of finishing off the tests or integrating with
> > > something existing. Which is why we're supposed to have managers saying
> > > "ok, what do I need to prioritize for my team to be able to develop
> > > effectively".
> 
> That sounds more like a "(reproducible) tests don't exist" complaint
> which is a different thing again to people going off and NIHing fancy
> frameworks.

No, it's a leadership/mentorship thing.

And this is something that's always been lacking in kernel culture.
Witness the kind of general grousing that goes on at maintainer summits;
maintainers complain about being overworked and people not stepping up
to help with the grungy responsibilities, while simultaneously we still
very much have a "fuck off if you haven't proven yourself" attitude
towards newcomers. Understandable given the historical realities (this
shit is hard and the penalties of fucking up are high, so there does
need to be a barrier to entry), but it's left us with some real gaps.

We don't have enough people in the senior engineer role who lay out
designs and organise people to take on projects that are bigger than one
single person can do, or that are necessary but not "fun".

Tests and test infrastructure fall into the necessary but not fun
category, so they languish.

They are also things that you don't really learn the value of until
you've been doing this stuff for a decade or so and you've learned by
experience that yes, good tests really make life easier, as well as how
to write effective tests, and that's knowledge that needs to be
instilled.

> 
> > > > That does assume that you're building and running everything directly on
> > > > the system under test and are happy to have the test in a VM which isn't
> > > > an assumption that holds universally, and also that whoever's doing the
> > > > testing doesn't want to do something like use their own distro or
> > > > something - like I say none of it looks too unreasonable for
> > > > filesystems.
> 
> > > No, I'm doing it that way because technically that's the simplest way to
> > > do it.
> 
> > > All you guys building crazy contraptions for running tests on Google
> > > Cloud or Amazon or whatever - you're building technical workarounds for
> > > broken procurement.
> 
> I think you're addressing some specific stuff that I'm not super
> familiar with here?  My own stuff (and most of the stuff I end up
> looking at) involves driving actual hardware.

Yeah that's fair; that was addressed more towards what's been going on
in the filesystem testing world, where I still (outside of my own stuff)
haven't seen a CI with a proper dashboard of test results; instead a lot
of code has been burned on multi-distro, highly configurable stuff that
targets multiple clouds, but - I want simple and functional, not
whiz-bang features.

> > > Just requisition the damn machines.
> 
> There's some assumptions there which are true for a lot of people
> working on the kernel but not all of them...

$500 a month for my setup (and this is coming out of my patreon funding
right now!). It's a matter of priorities, and being willing to present
this as _necessary_ to the people who control the purse strings.

> > Running in the cloud does not mean it has to be complicated. It can be
> > a simple Buildbot or whatever that knows how to spawn spot instances
> > for tests and destroy them when they're done *if the test passed*. If
> > a test failed on an instance, it could hold onto them for a day or two
> > for someone to debug if needed.
> 
> > (I mention Buildbot because in a previous life, I used that to run
> > tests for the dattobd out-of-tree kernel module before. That was the
> > strategy I used for it.)
> 
> Yeah, or if your thing runs in a Docker container rather than a VM then
> throwing it at a Kubernetes cluster using a batch job isn't a big jump.

Kubernetes might be next level; I'm not a kubernetes guy so I can't say
if it would simplify things over what I've got. But if it meant running
on existing kubernetes clouds, that would make requisitioning hardware
easier.

> > > I'd also really like to get automated performance testing going too,
> > > which would have similar requirements in that jobs would need to be
> > > scheduled on specific dedicated machines. I think what you're doing
> > > could still build off of some common infrastructure.
> 
> It does actually - like quite a few test labs mine is based around LAVA,
> labgrid is the other popular option (people were actually thinking about
> integrating the two recently since labgrid is a bit lower level than
> LAVA and they could conceptually play nicely with each other).  Since
> the control API is internet accessible this means that it's really
> simple for me to donate spare time on the boards to KernelCI as it
> understands how to drive LAVA, testing that I in turn use myself.  Both
> my stuff and KernelCI use a repository of glue which knows how to drive
> various testsuites inside a LAVA job, that's also used by other systems
> using LAVA like LKFT.
> 
> The custom stuff I have is all fairly thin (and quite janky), mostly
> just either things specific to my physical lab or managing which tests I
> want to run and what results I expect.  What I've got is *much* more
> limited than I'd like, and frankly if I wasn't able to pick up huge
> amounts of preexisting work most of this stuff would not be happening.

That's interesting. Do you have or would you be willing to write an
overview of what you've got? The way you describe it I wonder if we've
got some commonality.

The short overview of my system: tests are programs that expose
subcommands for listing dependencies (i.e. virtual machine options, kernel
config options) and for listing and running subtests. Tests themselves
are shell scripts, with various library code for e.g. standard
kernel/vm config options, hooking up tracing, core dump catching, etc.

The idea is for tests to be entirely self contained and need no outside
configuration.

The test framework knows how to
 - build an appropriately configured kernel
 - launch a VM, which needs no prior configuration besides creation of a
   RO root filesystem image (single command, as mentioned)
 - exposes subcommands for qemu's gdb interface, kgdb, ssh access, etc.
   for when running interactively
 - implements watchdogs/test timeouts

and the CI, on top of all that, watches various git repositories and -
as you saw - tests every commit, newest to oldest, and provides the
results in a git log format.

The last one, "results in git log format", is _huge_. I don't know why I
haven't seen anyone else do that - it was a must-have feature for any
system over 10 years ago, and it never appeared so I finally built it
myself.

We (inherently!) have lots of issues with tests that only sometimes fail
making it hard to know when a regression was introduced, but running all
the tests on every commit with a good way to see the results makes this
nearly a non issue - that is, with a weak and noisy signal (tests
results) we just have to gather enough data and present the results
properly to make the signal stand out (which commit(s) were buggy).

I write a lot of code (over 200 commits for bcachefs this merge window
alone), and this is a huge part of why I'm able to - I never have to do
manual bisection anymore, and thanks to a codebase that's littered with
assertions and debugging tools I don't spend that much time bug hunting
either.

> > > > I'd also note that the 9 hour turnaround time for that test set you're
> > > > pointing at isn't exactly what I'd associate with immediate feedback.
> 
> > > My CI shards at the subtest level, and like I mentioned I run 10 VMs per
> > > physical machine, so with just 2 of the 80 core Ampere boxes I get full
> > > test runs done in ~20 minutes.
> 
> > This design, ironically, is way more cloud-friendly than a lot of
> > testing system designs I've seen in the past. :)
> 
> Sounds like a small private cloud to me!  :P

Yep :)
Greg KH Jan. 15, 2024, 8:13 p.m. UTC | #19
On Mon, Jan 15, 2024 at 01:42:53PM -0500, Kent Overstreet wrote:
> > That sounds more like a "(reproducible) tests don't exist" complaint
> > which is a different thing again to people going off and NIHing fancy
> > frameworks.
> 
> No, it's a leadership/mentorship thing.
> 
> And this is something that's always been lacking in kernel culture.
> Witness the kind of general grousing that goes on at maintainer summits;
> maintainers complain about being overworked and people not stepping up
> to help with the grungy responsibilities, while simultaneously we still
> very much have a "fuck off if you haven't proven yourself" attitude
> towards newcomers. Understandable given the historical realities (this
> shit is hard and the penalties of fucking up are high, so there does
> need to be a barrier to entry), but it's left us with some real gaps.
> 
> We don't have enough people in the senior engineer role who lay out
> designs and organise people to take on projects that are bigger than one
> single person can do, or that are necessary but not "fun".
> 
> Tests and test infrastructure fall into the necessary but not fun
> category, so they languish.

No, they fall into the "no company wants to pay someone to do the work"
category, so it doesn't get done.

It's not a "leadership" issue, what is the "leadership" supposed to do
here, refuse to take any new changes unless someone ponies up and does
the infrastructure and testing work first?  That's not going to fly, for
valid reasons.

And as proof of this, we have had many real features, that benefit
everyone, called out as "please, companies, pay for this to be done, you
all want it, and so do we!" and yet, no one does it.  One real example
is the RT work, it has a real roadmap, people to do the work, a tiny
price tag, yet almost no one sponsoring it.  Yes, for that specific
issue it's slowly getting there and better, but it is one example of how
your view of this might not be all that correct.

I have loads of things I would love to see done.  And I get interns at
times to chip away at them, but my track record with interns is that
almost all of them go off and get real jobs at companies doing kernel
work (and getting paid well), and my tasks don't get finished, so it's
back up to me to do them.  And that's fine, and wonderful, I want those
interns to get good jobs, that's why we do this.

> They are also things that you don't really learn the value of until
> you've been doing this stuff for a decade or so and you've learned by
> experience that yes, good tests really make life easier, as well as how
> to write effective tests, and that's knowledge that needs to be
> instilled.

And you will see that we now have the infrastructure in place for this.
The great kunit testing framework, the kselftest framework, and the
stuff tying it all together is there.  All it takes is people actually
using it to write their tests, which is slowly happening.
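
For anyone who hasn't looked at it: a complete KUnit suite really is only
a handful of lines. The example below is illustrative, not from any tree:

#include <kunit/test.h>

/* Trivial helper under test. */
static int add(int a, int b)
{
	return a + b;
}

static void add_test_basic(struct kunit *test)
{
	KUNIT_EXPECT_EQ(test, 4, add(2, 2));
	KUNIT_EXPECT_EQ(test, 0, add(-1, 1));
}

static struct kunit_case add_test_cases[] = {
	KUNIT_CASE(add_test_basic),
	{}
};

static struct kunit_suite add_test_suite = {
	.name = "add-example",
	.test_cases = add_test_cases,
};
kunit_test_suite(add_test_suite);

It can be run under UML with something like ./tools/testing/kunit/kunit.py
run (assuming a .kunitconfig that enables the suite), or built into a
kernel and booted in a VM.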

So maybe, the "leadership" here is working, but in a nice organic way of
"wouldn't it be nice if you cleaned that out-of-tree unit test framework
up and get it merged" type of leadership, not mandates-from-on-high that
just don't work.  So organic you might have missed it :)

Anyway, just my 2c, what do I know...

greg k-h
Kent Overstreet Jan. 17, 2024, 4:41 a.m. UTC | #20
On Mon, Jan 15, 2024 at 09:13:01PM +0100, Greg KH wrote:
> On Mon, Jan 15, 2024 at 01:42:53PM -0500, Kent Overstreet wrote:
> > > That sounds more like a "(reproducible) tests don't exist" complaint
> > > which is a different thing again to people going off and NIHing fancy
> > > frameworks.
> > 
> > No, it's a leadership/mentorship thing.
> > 
> > And this is something that's always been lacking in kernel culture.
> > Witness the kind of general grousing that goes on at maintainer summits;
> > maintainers complain about being overworked and people not stepping up
> > to help with the grungy responsibilities, while simultaneously we still
> > very much have a "fuck off if you haven't proven yourself" attitude
> > towards newcomers. Understandable given the historical realities (this
> > shit is hard and the penalties of fucking up are high, so there does
> > need to be a barrier to entry), but it's left us with some real gaps.
> > 
> > We don't have enough people in the senior engineer role who lay out
> > designs and organise people to take on projects that are bigger than one
> > single person can do, or that are necessary but not "fun".
> > 
> > Tests and test infrastructure fall into the necessary but not fun
> > category, so they languish.
> 
> No, they fall into the "no company wants to pay someone to do the work"
> category, so it doesn't get done.
> 
> It's not a "leadership" issue, what is the "leadership" supposed to do
> here, refuse to take any new changes unless someone ponies up and does
> the infrastructure and testing work first?  That's not going to fly, for
> valid reasons.
> 
> And as proof of this, we have had many real features, that benefit
> everyone, called out as "please, companies, pay for this to be done, you
> all want it, and so do we!" and yet, no one does it.  One real example
> is the RT work, it has a real roadmap, people to do the work, a tiny
> price tag, yet almost no one sponsoring it.  Yes, for that specific
> issue it's slowly getting there and better, but it is one example of how
> > your view of this might not be all that correct.

Well, what's so special about any of those features? What's special
about the RT work? The list of features and enhancements we want is
never ending.

But good tools are important because they affect the rate of everyday
development; they're a multiplier on the money everyone is spending on
salaries.

In everyday development, the rate at which we can run tests and verify
the correctness of the code we're working on is more often than not _the_
limiting factor on rate of development. It's a particularly big deal for
getting new people up to speed, and for work that crosses subsystems.


> And you will see that we now have the infrastructure in place for this.
> The great kunit testing framework, the kselftest framework, and the
> stuff tying it all together is there.  All it takes is people actually
> using it to write their tests, which is slowly happening.
> 
> So maybe, the "leadership" here is working, but in a nice organic way of
> "wouldn't it be nice if you cleaned that out-of-tree unit test framework
> up and get it merged" type of leadership, not mandates-from-on-high that
> just don't work.  So organic you might have missed it :)

Things are moving in the right direction; the testing track at Plumber's
was exciting to see.

Kselftests is not there yet, though. Those tests could all be runnable
with a single command - and _most_ of what's needed is there, the kernel
config dependencies are listed out, but we're still lacking a
testrunner.

I've been trying to get someone interested in hooking them up to ktest
(my ktest, not that other thing), so that we'd have one common
testrunner for running anything that can be a VM test. Similarly with
blktests, mmtests, et cetera.

Having one common way of running all our functional VM tests, and a
common collection of those tests would be a huge win for productivity
because _way_ too many developers are still using slow ad hoc testing
methods, and a good test runner (ktest) gets the edit/compile/test cycle
down to < 1 minute, with the same tests framework for local development
and automated testing in the big test cloud...
Greg KH Jan. 17, 2024, 5:31 a.m. UTC | #21
On Tue, Jan 16, 2024 at 11:41:25PM -0500, Kent Overstreet wrote:
> On Mon, Jan 15, 2024 at 09:13:01PM +0100, Greg KH wrote:
> > On Mon, Jan 15, 2024 at 01:42:53PM -0500, Kent Overstreet wrote:
> > > > That sounds more like a "(reproducible) tests don't exist" complaint
> > > > which is a different thing again to people going off and NIHing fancy
> > > > frameworks.
> > > 
> > > No, it's a leadership/mentorship thing.
> > > 
> > > And this is something that's always been lacking in kernel culture.
> > > Witness the kind of general grousing that goes on at maintainer summits;
> > > maintainers complain about being overworked and people not stepping up
> > > to help with the grungy responsibilities, while simultaneously we still
> > > very much have a "fuck off if you haven't proven yourself" attitude
> > > towards newcomers. Understandable given the historical realities (this
> > > shit is hard and the penalties of fucking up are high, so there does
> > > need to be a barrier to entry), but it's left us with some real gaps.
> > > 
> > > We don't have enough people in the senior engineer role who lay out
> > > designs and organise people to take on projects that are bigger than one
> > > single person can do, or that are necessary but not "fun".
> > > 
> > > Tests and test infrastructure fall into the necessary but not fun
> > > category, so they languish.
> > 
> > No, they fall into the "no company wants to pay someone to do the work"
> > category, so it doesn't get done.
> > 
> > It's not a "leadership" issue, what is the "leadership" supposed to do
> > here, refuse to take any new changes unless someone ponies up and does
> > the infrastructure and testing work first?  That's not going to fly, for
> > valid reasons.
> > 
> > And as proof of this, we have had many real features, that benefit
> > everyone, called out as "please, companies, pay for this to be done, you
> > all want it, and so do we!" and yet, no one does it.  One real example
> > is the RT work, it has a real roadmap, people to do the work, a tiny
> > price tag, yet almost no one sponsoring it.  Yes, for that specific
> > issue it's slowly getting there and better, but it is one example of how
> > your view of this might not be all that correct.
> 
> Well, what's so special about any of those features? What's special
> about the RT work? The list of features and enhancements we want is
> never ending.

Nothing is special about RT except it is a good example of the kernel
"leadership" asking for help, and companies just ignoring us by not
funding the work to be done that they themselves want to see happen
because their own devices rely on it.

> But good tools are important because they affect the rate of everyday
> development; they're a multiplier on the money everyone is spending on
> salaries.
> 
> In everyday development, the rate at which we can run tests and verify
> the correctness of the code we're working on is more often than not _the_
> limiting factor on rate of development. It's a particularly big deal for
> getting new people up to speed, and for work that crosses subsystems.

Agreed, I'm not objecting here at all.

> > And you will see that we now have the infrastructure in place for this.
> > The great kunit testing framework, the kselftest framework, and the
> > stuff tying it all together is there.  All it takes is people actually
> > using it to write their tests, which is slowly happening.
> > 
> > So maybe, the "leadership" here is working, but in a nice organic way of
> > "wouldn't it be nice if you cleaned that out-of-tree unit test framework
> > up and get it merged" type of leadership, not mandates-from-on-high that
> > just don't work.  So organic you might have missed it :)
> 
> Things are moving in the right direction; the testing track at Plumber's
> was exciting to see.
> 
> Kselftests is not there yet, though. Those tests could all be runnable
> with a single command - and _most_ of what's needed is there, the kernel
> config dependencies are listed out, but we're still lacking a
> testrunner.

'make kselftest' is a good start, it outputs in proper format that test
runners can consume.  We even have 'make rusttest' now too because "rust
is special" for some odd reason :)

And that should be all that the kernel needs to provide as test runners
all work differently for various reasons, but if you want to help
standardize on something, that's what kernelci is doing, I know they can
always appreciate the help as well.

> I've been trying to get someone interested in hooking them up to ktest
> (my ktest, not that other thing), so that we'd have one common
> testrunner for running anything that can be a VM test. Similarly with
> blktests, mmtests, et cetera.

Hey, that "other" ktest.pl is what I have been using for stable kernel
test builds for years, it does work well for what it is designed for,
and I know other developers also use it.

> Having one common way of running all our functional VM tests, and a
> common collection of those tests would be a huge win for productivity
> because _way_ too many developers are still using slow ad hoc testing
> methods, and a good test runner (ktest) gets the edit/compile/test cycle
> down to < 1 minute, with the same tests framework for local development
> and automated testing in the big test cloud...

Agreed, and that's what kernelci is working to help provide.

thanks,

greg k-h
Theodore Ts'o Jan. 17, 2024, 5:54 a.m. UTC | #22
On Tue, Jan 16, 2024 at 11:41:25PM -0500, Kent Overstreet wrote:
> > > No, it's a leadership/mentorship thing.
> > > 
> > > And this is something that's always been lacking in kernel culture.
> > > Witness the kind of general grousing that goes on at maintainer summits;
> > > maintainers complain about being overworked and people not stepping up
> > > to help with the grungy responsibilities, while simultaneously we still

     <blah blah blah>

> > > Tests and test infrastructure fall into the necessary but not fun
> > > category, so they languish.
> > 
> > No, they fall into the "no company wants to pay someone to do the work"
> > category, so it doesn't get done.
> > 
> > It's not a "leadership" issue, what is the "leadership" supposed to do
> > here, refuse to take any new changes unless someone ponies up and does
> > the infrastructure and testing work first?  That's not going to fly, for
> > valid reasons.

Greg is absolutely right about this.

> But good tools are important because they affect the rate of everyday
> development; they're a multiplier on the money everyone is spending on
> salaries.

Alas, companies don't see it that way.  They take the value they get
from Linux for granted, and they only care about the multiplier effect
of their employees' salaries (and sometimes not even that).  They most
certainly don't care about the salutary effects on the entire ecosystem.
At least, I haven't seen any company make funding decisions on that
basis.

It's easy enough for you to blame "leadership", but the problem is the
leaders at the VP and SVP level who control the budgets, not the
leadership of the maintainers, who are overworked, and who often
invest in testing themselves, on their own personal time, because they
don't get adequate support from others.

It's also for that reason that we try to get people to prove that they
won't just stick around long enough for their pet feature (or, in the
case of ntfs, their pet file system) to get into the kernel --- and
then disappear.  All too often, this is what happens, either because
they have their
itch scratched, or their company reassigns them to some other project
that is important for their company's bottom-line.

If that person is willing to spend their own personal time, long after
work hours, to steward their contribution in the absence of corporate
support, great.  But we need to have that proven to us, or at the very
least, make sure the feature's long-term maintenance burden is as low as
possible, to mitigate the likelihood that we won't see the new
engineer after their feature lands upstream.

> Having one common way of running all our functional VM tests, and a
> common collection of those tests would be a huge win for productivity
> because _way_ too many developers are still using slow ad hoc testing
> methods, and a good test runner (ktest) gets the edit/compile/test cycle
> down to < 1 minute, with the same tests framework for local development
> and automated testing in the big test cloud...

I'm going to call bullshit on this assertion.  The fact that we have
multiple ways of running our tests is not the reason why testing takes
a long time.

If you are going to run stress tests, which is critical for testing
real file systems, that's going to take at least an hour; more if you
want to test multiple file system features.  The full regression set
for ext4, using the common fstests test suite, takes about 25 hours
of VM time; and about 2.5 hours of wall clock time since I shard it
across a dozen VMs.

Yes, we could try to add some unit tests which take much less time than
running tests where fstests is creating a file system, mounting it,
exercising the code through userspace functions, and then unmounting and
checking the file system.  Even if that were an adequate replacement for
some of the existing fstests, (a) it's not a replacement for stress
testing, and (b) this would require a vast amount of file system
specific software engineering investment, and where is that coming from?

The bottom line is that having one common way of running our functional
VM tests is not even *close* to the root cause of the problem.

	    	       	  	       - Ted
James Bottomley Jan. 17, 2024, 1:03 p.m. UTC | #23
On Wed, 2024-01-17 at 00:54 -0500, Theodore Ts'o wrote:
> On Tue, Jan 16, 2024 at 11:41:25PM -0500, Kent Overstreet wrote:
> > > > No, it's a leadership/mentorship thing.
> > > > 
> > > > And this is something that's always been lacking in kernel
> > > > culture. Witness the kind of general grousing that goes on at
> > > > maintainer summits;maintainers complain about being overworked
> > > > and people not stepping up to help with the grungy
> > > > responsibilities, while simultaneously we still
> 
>      <blah blah blah>
> 
> > > > Tests and test infrastructure fall into the necessary but not
> > > > fun category, so they languish.
> > > 
> > > No, they fall into the "no company wants to pay someone to do the
> > > work" category, so it doesn't get done.
> > > 
> > > It's not a "leadership" issue, what is the "leadership" supposed
> > > to do here, refuse to take any new changes unless someone ponies
> > > up and does the infrastructure and testing work first?  That's
> > > not going to fly, for valid reasons.
> 
> Greg is absolutely right about this.
> 
> > But good tools are important because they affect the rate of
> > everyday development; they're a multiplier on the money everyone is
> > spending on salaries.
> 
> Alas, companies don't see it that way.  They take the value they get
> from Linux for granted, and they only care about the multiplier effect
> of their employees' salaries (and sometimes not even that).  They most
> certainly don't care about the salutary effects on the entire ecosystem.
> At least, I haven't seen any company make funding decisions on that
> basis.

Actually, this is partly our fault.  Companies behave exactly like a
selfish contributor does:

https://archive.fosdem.org/2020/schedule/event/selfish_contributor/

The question they ask is "if I'm putting money into it, what am I
getting out of it".  If the answer to that is that it benefits
everybody, it's basically charity  to the entity being asked (and not
even properly tax deductible at that), which goes way back behind even
real charitable donations (which at least have a publicity benefit) and
you don't even get to speak to anyone about it when you go calling with
the collecting tin.  If you can say it benefits these 5 tasks your
current employees are doing, you might have a possible case for the
engineering budget (you might get in the door but you'll still be
queuing behind every in-plan budget item).  The best case is if you can
demonstrate some useful for profit contribution it makes to the actual
line of business (or better yet could be used to spawn a new line of
business), so when you're asking for a tool, it has to be usable
outside the narrow confines of the kernel and you need to be able to
articulate why it's generally useful (git is a great example, it was
designed to solve a kernel specific problem, but not it's in use pretty
much everywhere source control is a thing).

Somewhere between 2000 and now we seem to have lost our ability to
frame the argument in the above terms, because the business quid pro
quo argument was what got us money for stuff we needed and the Linux
Foundation and the TAB formed, but we're not managing nearly as well
now.  The environment has hardened against us (we're no longer the new
shiny) but that's not the whole explanation.

I also have to say, that for all the complaints there's just not any
open source pull for test tools (there's no-one who's on a mission to
make them better).  Demanding that someone else do it is proof of this
(if you cared enough you'd do it yourself).  That's why all our testing
infrastructure is just some random set of scripts that mostly does what
I want, because it's the last thing I need to prove the thing I
actually care about works.

Finally testing infrastructure is how OSDL (the precursor to the Linux
foundation) got started and got its initial funding, so corporations
have been putting money into it for decades with not much return (and
pretty much nothing to show for a unified testing infrastructure ...
ten points to the team who can actually name the test infrastructure
OSDL produced) and have finally concluded it's not worth it, making it
a 10x harder sell now.

James
Mark Brown Jan. 17, 2024, 5:33 p.m. UTC | #24
On Mon, Jan 15, 2024 at 01:42:53PM -0500, Kent Overstreet wrote:
> On Fri, Jan 12, 2024 at 06:22:55PM +0000, Mark Brown wrote:

> > This depends a lot on the area of the kernel you're looking at - some
> > things are very amenable to testing in a VM but there's plenty of code
> > where you really do want to ensure that at some point you're running
> > with some actual hardware, ideally as wide a range of it with diverse
> > implementation decisions as you can manage.  OTOH some things can only
> > be tested virtually because the hardware doesn't exist yet!

> Surface wise, there are a lot of drivers that need real hardware; but if
> you look at where the complexity is, the hard complex algorithmic stuff
> that really needs to be tested thoroughly - that's all essentially
> library code that doesn't need specific drivers to test.

...

> And if we were better at that, it would be a good nudge towards driver
> developers to make their stuff easier to test, perhaps by getting a
> virtualized implementation into qemu, or to make the individual drivers
> thinner and move heavy logic into easier to test library code.

As Greg indicated with the testing I doubt everyone has infinite budget
for developing emulation, and I will note that model accuracy and
performance tend to be competing goals.  When it comes to factoring
things out into library code, that can be a double-edged sword - changes
in the shared code can affect rather more systems than a single driver
change so really ought to be tested on a wide range of systems.  The
level of risk from changes does vary widely of course, and you can try to
have pure software tests for the things you know are relied upon, but
it can be surprising.

> > Yeah, similar with a lot of the more hardware focused or embedded stuff
> > - running something on the machine that's in front of you is seldom the
> > bit that causes substantial issues.  Most of the exceptions I've
> > personally dealt with involved testing hardware (from simple stuff like
> > wiring the audio inputs and outputs together to verify that they're
> > working to attaching fancy test equipment to simulate things or validate
> > that desired physical parameters are being achieved).

> Is that sort of thing a frequent source of regressions?

> That sounds like the sort of thing that should be a simple table, and
> not something I would expect to need heavy regression testing - but, my
> experience with driver development was nearly 15 years ago; not a lot of
> day to day. How badly are typical kernel refactorings needing regression
> testing in individual drivers?

General refactorings tend not to be that risky, but once you start doing
active work on the shared code dealing with the specific thing the risk
starts to go up and some changes are more risky than others.

> Filesystem development, OTOH, needs _heavy_ regression testing for
> everything we do. Similarly with mm, scheduler; many subtle interactions
> going on.

Right, and a lot of factored out code ends up in the same boat - that's
kind of the issue.

> > > > It's a basic lack of leadership. Yes, the younger engineers are always
> > > > going to be doing the new and shiny, and always going to want to build
> > > > something new instead of finishing off the tests or integrating with
> > > > something existing. Which is why we're supposed to have managers saying
> > > > "ok, what do I need to prioritize for my team to be able to develop
> > > > effectively".

> > That sounds more like a "(reproducible) tests don't exist" complaint
> > which is a different thing again to people going off and NIHing fancy
> > frameworks.

> No, it's a leadership/mentorship thing.

> And this is something that's always been lacking in kernel culture.
> Witness the kind of general grousing that goes on at maintainer summits;
> maintainers complain about being overworked and people not stepping up
> to help with the grungy responsibilities, while simultaneously we still
> very much have a "fuck off if you haven't proven yourself" attitude
> towards newcomers. Understandable given the historical realities (this
> shit is hard and the penalties of fucking up are high, so there does
> need to be a barrier to entry), but it's left us with some real gaps.

> We don't have enough people in the senior engineer role who lay out
> designs and organise people to take on projects that are bigger than one
> single person can do, or that are necessary but not "fun".

> Tests and test infrastructure fall into the necessary but not fun
> category, so they languish.

Like Greg said I don't think that's a realistic view of how we can get
things done here - often the thing with stop energy is that it just
makes people stop.  In a lot of areas everyone is just really busy and
struggling to keep up, we make progress on the generic stuff in part by
accepting that people have limited time and will do what they can with
everyone building on top of everyone's work.

> > > > Just requisition the damn machines.

> > There's some assumptions there which are true for a lot of people
> > working on the kernel but not all of them...

> $500 a month for my setup (and this is coming out of my patreon funding
> right now!). It's a matter of priorities, and being willing to present
> this as _necessary_ to the people who control the purse strings.

One of the assumptions there is that everyone is doing this in a well
funded corporate environment focused on upstream.  Even ignoring
hobbyists and students for example in the embedded world it's fairly
common to have stuff being upstreamed since people did the work anyway
for a customer project or internal product but where the customer
doesn't actually care either way if the code lands anywhere other than
their product (we might suggest that they should care but that doesn't
mean that they actually do care).

I'll also note that there's people like me who do things with areas of
the kernel not urgently related to their current employer's business and
hence very difficult to justify as a work expense.  With my lab some
companies have been generous enough to send me test hardware (which I'm
very grateful for; that's most of the irreplaceable stuff I have) but
the infrastructure around them and the day to day operating costs are
all being paid for by me personally.

> > > > I'd also really like to get automated performance testing going too,
> > > > which would have similar requirements in that jobs would need to be
> > > > scheduled on specific dedicated machines. I think what you're doing
> > > > could still build off of some common infrastructure.

> > It does actually - like quite a few test labs mine is based around LAVA,
> > labgrid is the other popular option (people were actually thinking about
> > integrating the two recently since labgrid is a bit lower level than

...

> > want to run and what results I expect.  What I've got is *much* more
> > limited than I'd like, and frankly if I wasn't able to pick up huge
> > amounts of preexisting work most of this stuff would not be happening.

> That's interesting. Do you have or would you be willing to write an
> overview of what you've got? The way you describe it I wonder if we've
> got some commonality.

I was actually thinking about putting together a talk about it, though
realistically the majority of it is just a very standard LAVA lab which
is something there's a bunch of presentations/documentation about
already.

> The short overview of my system: tests are programs that expose
> subcommands for listing dependencies (i.e. virtual machine options, kernel
> config options) and for listing and running subtests. Tests themselves
> are shell scripts, with various library code for e.g. standard
> kernel/vm config options, hooking up tracing, core dump catching, etc.

> The idea is for tests to be entirely self contained and need no outside
> configuration.
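
To make that concrete, a minimal sketch of a test in that shape might
look like the below - the subcommand names and dependency keys are made
up for illustration rather than being ktest's actual interface:

    #!/bin/bash
    # self-contained test: declares its own dependencies and subtests
    case "$1" in
    deps)
        # what the runner should provide before booting the VM
        echo "kernel_config CONFIG_BCACHEFS_FS=y"
        echo "vm_mem 4G"
        ;;
    list-tests)
        echo "smoke"
        echo "stress"
        ;;
    run)
        case "$2" in
        smoke)  mount -t bcachefs /dev/vdb /mnt && touch /mnt/x ;;
        stress) fio --name=rand --directory=/mnt --rw=randwrite \
                    --size=1G --time_based --runtime=60 ;;
        esac
        ;;
    esac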

The tests themselves bit sounds like what everyone else is doing - it
all comes down to running some shell commands in a target environment
somewhere.  kselftest provides information on which config options it
needs which would be nice to integrate too.
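
For reference, the kselftest config information is shipped as per-suite
config fragments in the kernel tree; one way to fold them into a build
(paths assume a kernel checkout, and the net suite is just an example)
is the in-tree merge_config.sh helper:

    # merge a selftest suite's required options into the current .config
    ./scripts/kconfig/merge_config.sh -m .config \
        tools/testing/selftests/net/config
    make olddefconfig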

> and the CI, on top of all that, watches various git repositories and -
> as you saw - tests every commit, newest to oldest, and provides the
> results in a git log format.

> The last one, "results in git log format", is _huge_. I don't know why I
> haven't seen anyone else do that - it was a must-have feature for any
> system over 10 years ago, and it never appeared so I finally built it
> myself.

A lot of the automated testing that gets done is too expensive to be
done per commit, though some is.  I do actually do it myself, but even
there it's mainly just some very quick smoke tests that get run per
commit with more tests done on the branch as a whole (with a bit more
where I can parallelise things well).  My stuff is more organised for
scripting so expected passes are all just elided; I just use LAVA's UI
if I want to pull the actual jobs for some reason.  I've also seen aiaiai
used for this, though I think the model there was similarly to only get
told about problems.

> We (inherently!) have lots of issues with tests that only sometimes fail
> making it hard to know when a regression was introduced, but running all
> the tests on every commit with a good way to see the results makes this
> nearly a non-issue - that is, with a weak and noisy signal (test
> results) we just have to gather enough data and present the results
> properly to make the signal stand out (which commit(s) were buggy).

Yeah, running for longer and/or more often helps find the hard to
reproduce things.  There's a bunch of strategies for picking exactly
what to do there, per commit is certainly a valid one.
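
As a rough sketch of the per-commit idea, ignoring all the scheduling
and hardware details - the branch names and test command here are
placeholders, not any particular CI's interface:

    # test every commit on a branch, newest to oldest, recording results
    mkdir -p results
    for c in $(git rev-list origin/master..my-branch); do
        git checkout --quiet "$c"
        if ./run-tests.sh > "results/$c.log" 2>&1; then
            git notes --ref=ci add -f -m "PASS" "$c"
        else
            git notes --ref=ci add -f -m "FAIL (results/$c.log)" "$c"
        fi
    done
    # view the results alongside the history, roughly the git log view
    git log --notes=ci my-branch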
Mark Brown Jan. 17, 2024, 6:19 p.m. UTC | #25
On Wed, Jan 17, 2024 at 08:03:35AM -0500, James Bottomley wrote:

> I also have to say, that for all the complaints there's just not any
> open source pull for test tools (there's no-one who's on a mission to
> make them better).  Demanding that someone else do it is proof of this
> (if you cared enough you'd do it yourself).  That's why all our testing
> infrastructure is just some random set of scripts that mostly does what
> I want, because it's the last thing I need to prove the thing I
> actually care about works.

> Finally testing infrastructure is how OSDL (the precursor to the Linux
> foundation) got started and got its initial funding, so corporations
> have been putting money into it for decades with not much return (and
> pretty much nothing to show for a unified testing infrastructure ...
> ten points to the team who can actually name the test infrastructure
> OSDL produced) and have finally concluded it's not worth it, making it
> a 10x harder sell now.

I think that's a *bit* pessimistic, at least for some areas of the
kernel - there is commercial stuff going on with kernel testing with
varying degrees of community engagement (eg, off the top of my head
Baylibre, Collabora and Linaro all have offerings of various kinds that
I'm aware of), and some of that does turn into investments in reusable
things rather than proprietary stuff.  I know that I look at the
kernelci.org results for my trees, and that I've fixed issues I saw
purely in there.  kselftest is noticeably getting much better over time,
and LTP is quite active too.  The stuff I'm aware of is more focused
around the embedded space than the enterprise/server space but it does
exist.  That's not to say that this is all well resourced and there's no
problem (far from it), but it really doesn't feel like a complete dead
loss either.

Some of the issues come from the different questions that people are
trying to answer with testing, or the very different needs of the
tests that people want to run - for example one of the reasons
filesystems aren't particularly well covered for the embedded cases is
that if your local storage is SD or worse eMMC then heavy I/O suddenly
looks a lot more demanding and media durability a real consideration.
Theodore Ts'o Jan. 18, 2024, 2:49 a.m. UTC | #26
On Wed, Jan 17, 2024 at 08:03:35AM -0500, James Bottomley wrote:
> Actually, this is partly our fault.  Companies behave exactly like a
> selfish contributor does:
> 
> https://archive.fosdem.org/2020/schedule/event/selfish_contributor/
> 
> The question they ask is "if I'm putting money into it, what am I
> getting out of it".  If the answer to that is that it benefits
> everybody, it's basically charity  to the entity being asked (and not
> even properly tax deductible at that), which goes way back behind even
> real charitable donations (which at least have a publicity benefit) and
> you don't even get to speak to anyone about it when you go calling with
> the collecting tin.  If you can say it benefits these 5 tasks your
> current employees are doing, you might have a possible case for the
> engineering budget (you might get in the door but you'll still be
> queuing behind every in-plan budget item).  The best case is if you can
> demonstrate some useful for profit contribution it makes to the actual
> line of business (or better yet could be used to spawn a new line of
> business), so when you're asking for a tool, it has to be usable
> outside the narrow confines of the kernel and you need to be able to
> articulate why it's generally useful (git is a great example, it was
> designed to solve a kernel specific problem, but now it's in use pretty
> much everywhere source control is a thing).

I have on occasion tried to make the "it benefits the whole ecosystem"
argument, and that will work on the margins.  But it's a lot harder
when it's more than a full SWE-year's worth of investment, at least
more recently.  I *have* tried to get more test investment, with an
eye towards benefitting not just one company, but in a much more
general fashion --- but multi-engineer projects are a very hard sell,
especially recently.  If Kent wants to impugn my leadership skills,
that's fine; I invite him to try and see if he can get SVPs to cough up
the dough.  :-)

I've certainly had a lot more success with the "Business quid pro quo"
argument; fscrypt and fsverity were developed for Android and Chrome;
casefolding support benefited Android and Steam; ext4 fast commits was
targeted at cloud-based NFS and Samba serving, etc.

My conception of a successful open source maintainer includes a strong
aspect of a product manager whose job is to find product/market fit.
That is, I try to be a matchmaker between some feature that I've
wanted for my subsystem, and would benefit users, and a business case
that is sufficiently compelling that a company is willing to fund the
engineering effort to make that feature happen.  That company might
be one that signs my paycheck, or might be some other company.  For
special bonus points, if I can convince some other company to fund a
good chunk of the engineering effort, and it *also* benefits the
company that pays my salary, that's a win-win that I can crow about at
performance review time.  :-)

> Somewhere between 2000 and now we seem to have lost our ability to
> frame the argument in the above terms, because the business quid pro
> quo argument was what got us money for stuff we needed and the Linux
> Foundation and the TAB formed, but we're not managing nearly as well
> now.  The environment has hardened against us (we're no longer the new
> shiny) but that's not the whole explanation.

There are a couple of dynamics going on here, I think.  When a company
is just starting to invest in open source, and it is the "new shiny"
it's a lot easier to make the pitch for big projects that are good for
everyone.  In the early days of the IBM Linux Technology Center, the
Linux SMP scalability effort, ltp, etc., were significantly funded by
the IBM LTC.  And in some cases, efforts which didn't make it
upstream, but which inspired the features to enter Linux (even if it
wasn't IBM code), such as in the case of IBM's Linux threading or
volume management, it was still considered a win by IBM management.

Unfortunately, this effect fades over time.  It's a lot easier to fund
multi-engineer projects which run for more than a year, when a company
is just starting out, and when it's still trying to attract upstream
developers, and it has a sizeable "investment" budget.  ("IBM will
invest a billion dollars in Linux").  But then in later years, the
VP's have to justify their budget, and so companies tend to become
more and more "selfish".  After all, that's how capitalism works ---
"think of the children^H^H^H^H^H^H^H shareholders!"

I suspect we can all think of companies beyond just IBM where this
dynamic is at play; I certainly can!

The economic cycle can also make a huge difference.  Things got harder
after the dot com implosion; then things loosened up.  However,
post-COVID, we've seen multiple companies really become much more
focused on "how is this good for our company".  It has different names
at different companies, such as "year of efficiency" or "sharpening
our focus", but it often is accompanied with layoffs, and a general
tightening of budgets.  I don't think it's an accident that
maintainer grumpiness has been higher than normal in the last year or
so.

						- Ted
Randy Dunlap Jan. 18, 2024, 5:35 a.m. UTC | #27
On 1/17/24 05:03, James Bottomley wrote:
> Finally testing infrastructure is how OSDL (the precursor to the Linux
> foundation) got started and got its initial funding, so corporations
> have been putting money into it for decades with not much return (and
> pretty much nothing to show for a unified testing infrastructure ...
> ten points to the team who can actually name the test infrastructure
> OSDL produced) and have finally concluded it's not worth it, making it
> a 10x harder sell now.

What will ten points get me?  A weak cup of coffee?

Do I need a team to answer the question?

Anyway, Crucible.
Kent Overstreet Jan. 21, 2024, 2:49 a.m. UTC | #28
On Wed, Jan 17, 2024 at 08:03:35AM -0500, James Bottomley wrote:
> On Wed, 2024-01-17 at 00:54 -0500, Theodore Ts'o wrote:
> > On Tue, Jan 16, 2024 at 11:41:25PM -0500, Kent Overstreet wrote:
> > > > > No, it's a leadership/mentorship thing.
> > > > > 
> > > > > And this is something that's always been lacking in kernel
> > > > > culture. Witness the kind of general grousing that goes on at
> > > > > maintainer summits; maintainers complain about being overworked
> > > > > and people not stepping up to help with the grungy
> > > > > responsibilities, while simultaneously we still
> > 
> >      <blah blah blah>
> > 
> > > > > Tests and test infrastructure fall into the necessary but not
> > > > > fun category, so they languish.
> > > > 
> > > > No, they fall into the "no company wants to pay someone to do the
> > > > work" category, so it doesn't get done.
> > > > 
> > > > It's not a "leadership" issue, what is the "leadership" supposed
> > > > to do here, refuse to take any new changes unless someone ponies
> > > > up and does the infrastructure and testing work first?  That's
> > > > not going to fly, for valid reasons.
> > 
> > Greg is absolutely right about this.
> > 
> > > But good tools are important because they affect the rate of
> > > everyday development; they're a multiplier on the money everyone is
> > > spending on salaries.
> > 
> > Alas, companies don't see it that way.  They take the value they get
> > from Linux for granted, and they only care about the multiplier effect
> > of their employees' salaries (and sometimes not even that).  They most
> > certainly don't care about the salutary effects on the entire ecosystem.
> > At least, I haven't seen any company make funding decisions on that
> > basis.
> 
> Actually, this is partly our fault.  Companies behave exactly like a
> selfish contributor does:
> 
> https://archive.fosdem.org/2020/schedule/event/selfish_contributor/
> 
> The question they ask is "if I'm putting money into it, what am I
> getting out of it".  If the answer to that is that it benefits
> everybody, it's basically charity  to the entity being asked (and not
> even properly tax deductible at that), which goes way back behind even
> real charitable donations (which at least have a publicity benefit) and
> you don't even get to speak to anyone about it when you go calling with
> the collecting tin.  If you can say it benefits these 5 tasks your
> current employees are doing, you might have a possible case for the
> engineering budget (you might get in the door but you'll still be
> queuing behind every in-plan budget item).  The best case is if you can
> demonstrate some useful for profit contribution it makes to the actual
> line of business (or better yet could be used to spawn a new line of
> business), so when you're asking for a tool, it has to be usable
> outside the narrow confines of the kernel and you need to be able to
> articulate why it's generally useful (git is a great example, it was
> designed to solve a kernel specific problem, but now it's in use pretty
> much everywhere source control is a thing).
> 
> Somewhere between 2000 and now we seem to have lost our ability to
> frame the argument in the above terms, because the business quid pro
> quo argument was what got us money for stuff we needed and the Linux
> Foundation and the TAB formed, but we're not managing nearly as well
> now.  The environment has hardened against us (we're no longer the new
> shiny) but that's not the whole explanation.

I think this take is closer to the mark, yeah.

The elephant in the room that I keep seeing is that MBA driven business
culture in the U.S. has gotten _insane_, and we've all been stewing in
the same pot together, collectively boiling, and not noticing or talking
about just how bad it's gotten.

Engineering culture really does matter; it's what makes the difference
between working effectively or not. And by engineering culture I mean
things like being able to set effective goals and deliver on them, and
have a good balance between product based, end user focused development;
exploratory, prototype-minded research product type stuff; and the
"clean up your messes and eat your vegetables" type stuff that keeps
tech debt from getting out of hand.

Culturally, we in the kernel community are quite good on the last front,
not so good on the first two, and I think a large part of the reason is
people being immersed in corporate culture where everything is quarterly
OKRs, "efficiency", et cetera - and everywhere I look, it's hard to find
senior engineering involved in setting a roadmap. Instead we have a lot
of "initiatives" and feifdoms, and if you ask me it's a direct result of
MBA culture run amuck.

Culturally, things seem to be a lot better in Europe - I've been seeing
a _lot_ more willingness to fund grungy difficult long term projects
there; the Silicon Valley mentality of "it must have the potential for a
massive impact (and we have to get it done as quickly as possible) or it's
not worth looking at" is, thankfully, absent there.

> I also have to say, that for all the complaints there's just not any
> open source pull for test tools (there's no-one who's on a mission to
> make them better).  Demanding that someone else do it is proof of this
> (if you cared enough you'd do it yourself).  That's why all our testing
> infrastructure is just some random set of scripts that mostly does what
> I want, because it's the last thing I need to prove the thing I
> actually care about works.

It's awkward because the people with the greatest need, and therefore
(in theory?) the greatest understanding for what kind of tools would be
effective, are the people with massive other responsibilities.

There are things we just can't do without delegating, and delegating is
something we seem to be consistently not great at in the kernel
community. And I don't think it needs to be that way, because younger
engineers would really benefit from working closely with someone more
senior, and in my experience the way to do a lot of these tooling things
right is _not_ to build it all at once in a year of full time SWE salary
time - it's much better to take your time, spend a lot of time learning
the workflows, letting ideas percolate, and gradually build things up.

Yet the way these projects all seem to go is we have one or a few people
working full time mostly writing code, building things with a lot of
_features_... and if you ask me, ending up with something where most of
the features were things we didn't need or ask for, and that just make the end
result harder to use.

Tools are hard to get right; perhaps we should be spending more of our
bikeshedding time on the lists bikeshedding our tools, and a little bit
less on coding style minutiae.

Personally, I've tried to get the ball rolling multiple times with
various people asking them what they want and need out of their testing
tools and how they use them, and it often feels like pulling teeth.

> Finally testing infrastructure is how OSDL (the precursor to the Linux
> foundation) got started and got its initial funding, so corporations
> have been putting money into it for decades with not much return (and
> pretty much nothing to show for a unified testing infrastructure ...
> ten points to the team who can actually name the test infrastructure
> OSDL produced) and have finally concluded it's not worth it, making it
> a 10x harder sell now.

The circle of fail continues :)
Kent Overstreet Jan. 21, 2024, 3:24 a.m. UTC | #29
On Wed, Jan 17, 2024 at 06:19:43PM +0000, Mark Brown wrote:
> On Wed, Jan 17, 2024 at 08:03:35AM -0500, James Bottomley wrote:
> 
> > I also have to say, that for all the complaints there's just not any
> > open source pull for test tools (there's no-one who's on a mission to
> > make them better).  Demanding that someone else do it is proof of this
> > (if you cared enough you'd do it yourself).  That's why all our testing
> > infrastructure is just some random set of scripts that mostly does what
> > I want, because it's the last thing I need to prove the thing I
> > actually care about works.
> 
> > Finally testing infrastructure is how OSDL (the precursor to the Linux
> > foundation) got started and got its initial funding, so corporations
> > have been putting money into it for decades with not much return (and
> > pretty much nothing to show for a unified testing infrastructure ...
> > ten points to the team who can actually name the test infrastructure
> > OSDL produced) and have finally concluded it's not worth it, making it
> > a 10x harder sell now.
> 
> I think that's a *bit* pessimistic, at least for some areas of the
> kernel - there is commercial stuff going on with kernel testing with
> varying degrees of community engagement (eg, off the top of my head
> Baylibre, Collabora and Linaro all have offerings of various kinds that
> I'm aware of), and some of that does turn into investments in reusable
> things rather than proprietary stuff.  I know that I look at the
> kernelci.org results for my trees, and that I've fixed issues I saw
> purely in there.  kselftest is noticeably getting much better over time,
> and LTP is quite active too.  The stuff I'm aware of is more focused
> around the embedded space than the enterprise/server space but it does
> exist.  That's not to say that this is all well resourced and there's no
> problem (far from it), but it really doesn't feel like a complete dead
> loss either.

kselftest is pretty exciting to me; "collect all our integration tests
into one place and start to standardize on running them" is good stuff.

You seem to be pretty familiar with all the various testing efforts, I
wonder if you could talk about what you see that's interesting and
useful in the various projects?

I think a lot of this stems from a lack of organization and a lack of
communication; I see a lot of projects reinventing things in slightly
different ways and failing to build off of each other.

> Some of the issues come from the different questions that people are
> trying to answer with testing, or the very different needs of the
> tests that people want to run - for example one of the reasons
> filesystems aren't particularly well covered for the embedded cases is
> that if your local storage is SD or worse eMMC then heavy I/O suddenly
> looks a lot more demanding and media durability a real consideration.

Well, for filesystem testing we (mostly) don't want to be hammering on
an actual block device if we can help it - there are occasionally bugs
that will only manifest when you're testing on a device with realistic
performance characteristics, and we definitely want to be doing some
amount of performance testing on actual devices, but most of our testing
is best done in a VM where the scratch devices live entirely in DRAM on
the host.

But that's a minor detail, IMO - that doesn't prevent us from having a
common test runner for anything that doesn't need special hardware.
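
A sketch of that kind of setup, with the scratch device backed by a
file on a host tmpfs so the test I/O stays in host RAM - the image
names, sizes and kernel command line are illustrative only:

    # scratch device lives in host RAM via tmpfs
    truncate -s 8G /dev/shm/scratch.img
    qemu-system-x86_64 -enable-kvm -m 8G -smp 4 -nographic \
        -kernel bzImage -append "root=/dev/vda console=ttyS0" \
        -drive file=rootfs.img,if=virtio,format=raw \
        -drive file=/dev/shm/scratch.img,if=virtio,format=raw,cache=unsafe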
Kent Overstreet Jan. 21, 2024, 12:20 p.m. UTC | #30
On Wed, Jan 17, 2024 at 09:49:22PM -0500, Theodore Ts'o wrote:
> On Wed, Jan 17, 2024 at 08:03:35AM -0500, James Bottomley wrote:
> > Actually, this is partly our fault.  Companies behave exactly like a
> > selfish contributor does:
> > 
> > https://archive.fosdem.org/2020/schedule/event/selfish_contributor/
> > 
> > The question they ask is "if I'm putting money into it, what am I
> > getting out of it".  If the answer to that is that it benefits
> > everybody, it's basically charity  to the entity being asked (and not
> > even properly tax deductible at that), which goes way back behind even
> > real charitable donations (which at least have a publicity benefit) and
> > you don't even get to speak to anyone about it when you go calling with
> > the collecting tin.  If you can say it benefits these 5 tasks your
> > current employees are doing, you might have a possible case for the
> > engineering budget (you might get in the door but you'll still be
> > queuing behind every in-plan budget item).  The best case is if you can
> > demonstrate some useful for profit contribution it makes to the actual
> > line of business (or better yet could be used to spawn a new line of
> > business), so when you're asking for a tool, it has to be usable
> > outside the narrow confines of the kernel and you need to be able to
> > articulate why it's generally useful (git is a great example, it was
> > designed to solve a kernel specific problem, but now it's in use pretty
> > much everywhere source control is a thing).
> 
> I have on occasion tried to make the "it benefits the whole ecosystem"
> argument, and that will work on the margins.  But it's a lot harder
> when it's more than a full SWE-year's worth of investment, at least
> more recently.  I *have* tried to get more test investment, with an
> eye towards benefitting not just one company, but in a much more
> general fashion --- but multi-engineer projects are a very hard sell,
> especially recently.  If Kent wants to impugn my leadership skills,
> that's fine; I invite him to try and see if he can get SVPs to cough up
> the dough.  :-)

Well, I've tried talking to you about improving our testing tooling - in
particular, what we could do if we had better, more self contained
tools, not just targeted at xfstests, in particular a VM testrunner that
could run kselftests too - and as I recall, your reaction was pretty
much "why would I be interested in that? What does that do for me?"

So yeah, I would call that a fail in leadership. We filesystem people
have the highest testing requirements and ought to know how to do this
best, and if the people with the most experience aren't trying to share
that knowledge and experience in the form of collaborating on tooling,
what the fuck are we even doing here?

If I sound frustrated, it's because I am.

> I've certainly had a lot more success with the "Business quid pro quo"
> argument; fscrypt and fsverity were developed for Android and Chrome;
> casefolding support benefited Android and Steam; ext4 fast commits was
> targeted at cloud-based NFS and Samba serving, etc.

Yeah, I keep hearing you talking about the product management angle and
I have to call bullshit. There's a lot more to maintaining the health of
projects in the long term than just selling features to customers.

> Unfortunately, this effect fades over time.  It's a lot easier to fund
> multi-engineer projects which run for more than a year, when a company
> is just starting out, and when it's still trying to attract upstream
> developers, and it has a sizeable "investment" budget.  ("IBM will
> invest a billion dollars in Linux").  But then in later years, the
> VP's have to justify their budget, and so companies tend to become
> more and more "selfish".  After all, that's how capitalism works ---
> "think of the children^H^H^H^H^H^H^H shareholders!"

This stuff doesn't have to be huge multi engineer-year projects to get
anything useful done.

ktest has been a tiny side project for me. If I can turn that into a
full blown CI that runs arbitrary self contained VM tests with quick
turnaround and a nice git log UI, in my spare time, why can't we pitch
in together instead of each running in different directions and
collaborate and communicate a bit better instead of bitching so much?
Theodore Ts'o Jan. 24, 2024, 5:52 a.m. UTC | #31
On Sun, Jan 21, 2024 at 07:20:32AM -0500, Kent Overstreet wrote:
> 
> Well, I've tried talking to you about improving our testing tooling - in
> particular, what we could do if we had better, more self contained
> tools, not just targeted at xfstests, in particular a VM testrunner that
> could run kselftests too - and as I recall, your reaction was pretty
> much "why would I be interested in that? What does that do for me?"

My reaction was to your proposal that I throw away my framework which
works super well for me, in favor of your favorite framework.  My
framework already supports blktests and the Phoronix Test Suite, and
it would be a lot less work for me to add support for kselftests to
{gce,kvm,android}-xfstests.

The reality is that we all have test suites that are optimized for our
workflow.  Trying to get everyone to standardize on a single test
framework is going to be hard, since they have optimized for different
use cases.  Mine can be used both for local testing and for sharding
across multiple Google Cloud VMs, it has auto-bisection features, it
already supports blktests and PTS, and it handles both x86 and arm64
with both native and cross-compiling support.  I'm
certainly willing to work with others to improve my xfstests-bld.

> So yeah, I would call that a fail in leadership. Us filesystem people
> have the highest testing requirements and ought to know how to do this
> best, and if the poeple with the most experience aren't trying share
> that knowledge and experience in the form of collaborating on tooling,
> what the fuck are we even doing here?

I'm certainly willing to work with others, and I've accepted patches
from other users of {kvm,gce,android}-xfstests.  If you have something
which is a strict superset of all of the features of xfstests-bld, I'm
certainly willing to talk.

I'm sure you have a system which works well for *you*.  However, I'm
much less interested in throwing away my invested effort for
something that works well for me --- as well as other users of
xfstests-bld.  (This includes other ext4 developers, Google's internal
prodkernel for our data centers, and testing ext4 and xfs for Google's
Cloud-Optimized OS distribution.)

This is not a leadership failure; this is more like telling a Debian
user to throw away their working system because you think Fedora is
better, and "wouldn't it be better if we all used the same
distribution"?

> ktest has been a tiny side project for me. If I can turn that into a
> full blown CI that runs arbitrary self contained VM tests with quick
> turnaround and a nice git log UI, in my spare time, why can't we pitch
> in together instead of each running in different directions and
> collaborate and communicate a bit better instead of bitching so much?

xfstests-bld started as a side project to me as well, and has
accumulated other users and contributors.  Why can't you use my system
instead?  By your definition of "failure of leadership", you have
clearly failed as well in not seeing the light and using *my* system.  :-)

						- Ted
Mark Brown Jan. 25, 2024, 9:46 p.m. UTC | #32
On Sat, Jan 20, 2024 at 10:24:09PM -0500, Kent Overstreet wrote:
> On Wed, Jan 17, 2024 at 06:19:43PM +0000, Mark Brown wrote:
> > On Wed, Jan 17, 2024 at 08:03:35AM -0500, James Bottomley wrote:

> > I think that's a *bit* pessimistic, at least for some areas of the
> > kernel - there is commercial stuff going on with kernel testing with
> > varying degrees of community engagement (eg, off the top of my head
> > Baylibre, Collabora and Linaro all have offerings of various kinds that
> > I'm aware of), and some of that does turn into investments in reusable
> > things rather than proprietary stuff.  I know that I look at the
> > kernelci.org results for my trees, and that I've fixed issues I saw
> > purely in there.  kselftest is noticeably getting much better over time,
> > and LTP is quite active too.  The stuff I'm aware of is more focused
> > around the embedded space than the enterprise/server space but it does
> > exist.  That's not to say that this is all well resourced and there's no
> > problem (far from it), but it really doesn't feel like a complete dead
> > loss either.

> kselftest is pretty exciting to me; "collect all our integration tests
> into one place and start to standardize on running them" is good stuff.

> You seem to be pretty familiar with all the various testing efforts, I
> wonder if you could talk about what you see that's interesting and
> useful in the various projects?

Well, I'm familiar with the bits I look at and some of the adjacent
areas but definitely not with the testing world as a whole.

For tests themselves there's some generic suites like LTP and kselftest,
plus a lot of domain specific things which are widely used in their
areas.  Often the stuff that's separate either lives with something like
a userspace library rather than just being a purely kernel thing or has
some other special infrastructure needs.

For lab orchestration there's at least:

    https://beaker-project.org/
    https://github.com/labgrid-project/labgrid
    https://www.lavasoftware.org/

Beaker and LAVA are broadly similar in a parallel evolution sort of way,
scalable job scheduler/orchestration things intended for non-interactive
use with a lot of overlap in design choices.  LAVA plays nicer with
embedded boards since Beaker comes from Red Hat and is focused more on
server/PC type use cases though I don't think there's anything
fundamental there.  Labgrid has a strong embedded focus with facilities
like integrating ancillary test equipment and caters a lot more to
interactive use than either of the other two but AIUI doesn't help so
much with batch usage, though that can be built on top.  All of them can
handle virtual targets as well as physical ones.

All of these need something driving them to actually generate test jobs
and present the results; as well as the larger projects there are also
people like Guenter Roeck and myself who run things that amuse us and
report them by hand.  Of the bigger general purpose orchestration
projects off
the top of my head there's

    https://github.com/intel/lkp-tests/blob/master/doc/faq.md
    https://cki-project.org/
    https://kernelci.org/
    https://lkft.linaro.org/

CKI and KernelCI are not a million miles apart, they both monitor a
bunch of trees and run well known testsuites that they've integrated,
and have code available if you want to deploy your own thing (eg, for
non-public stuff).  They're looking at pooling their results into kcidb
as part of the KernelCI LF project.  Like 0day is proprietary to Intel,
LKFT is proprietary to Linaro; LKFT has a focus on running a lot of
tests on stable -rcs with manual reporting though they do have some best
effort coverage of mainline and -next as well.

There's also a bunch of people doing things specific to a given hardware
type or other interest, often internal to a vendor but for example Intel
have some public CI for their graphics and audio:

    https://intel-gfx-ci.01.org/
    https://github.com/thesofproject/linux/

(you can see the audio stuff doing its thing on the pull requests in
the SOF repo.)  The infra behind these is a bit task specific AIUI; for
example the audio testing includes a lot of boards that don't have
serial consoles or anything (eg, laptops) so it uses a fixed filesystem
on the device, copies a kernel in and uses grub-reboot to try it one
time.  They're particularly interesting because they're more actively
tied to the development flow.  The clang people have something too using
a github flow:

    https://github.com/ClangBuiltLinux/continuous-integration2

(which does have some boots on virtual platforms as well as just build
coverage.)

> I think a lot of this stems from a lack of organization and a lack of
> communication; I see a lot of projects reinventing things in slightly
> different ways and failing to build off of each other.

There's definitely some NIHing going on in places but a lot of it comes
from people with different needs or environments (like the Intel audio
stuff I mentioned), or just things already existing and nobody wanting
to disrupt what they've got for a wholesale replacement.  People are
rarely working from nothing, and there's a bunch of communication and
sharing of ideas going on.

> > Some of the issues come from the different questions that people are
> > trying to answer with testing, or the very different needs of the
> > tests that people want to run - for example one of the reasons
> > filesystems aren't particularly well covered for the embedded cases is
> > that if your local storage is SD or worse eMMC then heavy I/O suddenly
> > looks a lot more demanding and media durability a real consideration.

> Well, for filesystem testing we (mostly) don't want to be hammering on
> an actual block device if we can help it - there are occasionally bugs
> that will only manifest when you're testing on a device with realistic
> performance characteristics, and we definitely want to be doing some
> amount of performance testing on actual devices, but most of our testing
> is best done in a VM where the scratch devices live entirely in DRAM on
> the host.

Sure, though there can be limitations with the amount of memory on a lot
of these systems too!  You can definitely do things, it's just not
always ideal - for example, filesystem people tend to default to test
filesystems as large as the total memory of a lot of even quite modern
embedded boards, so if nothing else you need to tune things down if
you're going to do a memory-only test.
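
For an xfstests-style run, tuning things down can be as simple as
pointing the test and scratch devices at small loop-backed files - the
sizes, loop device names and choice of ext4 below are arbitrary:

    # small loop-backed devices sized for a low-memory board
    truncate -s 512M /var/tmp/test.img /var/tmp/scratch.img
    losetup /dev/loop0 /var/tmp/test.img
    losetup /dev/loop1 /var/tmp/scratch.img
    mkfs.ext4 -q /dev/loop0
    mkdir -p /mnt/test /mnt/scratch

    # xfstests local.config
    export TEST_DEV=/dev/loop0
    export TEST_DIR=/mnt/test
    export SCRATCH_DEV=/dev/loop1
    export SCRATCH_MNT=/mnt/scratch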