mbox series

[v13,00/17] Introduce the Counter character device interface

Message ID cover.1626165764.git.vilhelm.gray@gmail.com (mailing list archive)
Headers show
Series Introduce the Counter character device interface | expand

Message

William Breathitt Gray July 13, 2021, 9:53 a.m. UTC
Changes in v13:
 - Use GFP_KERNEL when using mutexes instead of spinlocks
 - Free event_node on error in counter_set_event_node
 - Create n_events_list_lock mutex for protecting next_events_list
   access
 - Adjust counter_set_event_node() locking to cover watch_validate() too
 - Pull out COUNTER_ENABLE_EVENTS_IOCTL code into its own function
   counter_enable_events()
 - Reimplement chrdev_lock as a bitmap and utilize
   test_and_set_bit_lock()/clear_bit_unlock() to handle locking
 - Move wake_up_poll() call to after spin_unlock_irqrestore() in the
   counter_push_event() function
 - Drop the lock on error in counter_events_queue_size_write()
 - Rename "group" to "cattr_group" where appropriate for clarity
 - Pull allocation of attribute groups to outside of for-loop in
   counter_sysfs_add()
 - Define and use enum translation function st32_lptim_func_map()
 - Provide complete example in enum counter_component documentation
 - Provide inline description comments for enum counter_event_type
 - Move documentation of Counter ioctl commands inline with code and
   reference them using :c:macro: in generic-counter.rst
 - Make it clear which callbacks are optional in struct counter_ops
 - Use HTML tables in Documentation/driver-api/generic-counter.rst
 - Qualify struct counter_watch watches as statin in counter_example.c
 - Return 1 on read() error in counter_example.c
 - Remove unnecessary explicit casts in counter_example.c

For convenience, this patchset is also available on my personal git
repo: https://gitlab.com/vilhelmgray/iio/-/tree/counter_chrdev_v13

The patches preceding "counter: Internalize sysfs interface code" are
primarily cleanup and fixes that can be picked up and applied now to the
IIO tree if so desired. The "counter: Internalize sysfs interface code"
patch as well may be considered for pickup because it is relatively safe
and makes no changes to the userspace interface.

To summarize the main points of this patchset: there are no changes to
the existing Counter sysfs userspace interface; a Counter character
device interface is introduced that allows Counter events and associated
data to be read() by userspace; the events_configure() and
watch_validate() driver callbacks are introduced to support Counter
events; and IRQ support is added to the 104-QUAD-8 driver, serving as an
example of how to support the new Counter events functionality.

I did discover an issue with the character device code that I'm having
trouble understanding. The issue does not affect the sysfs interface
code so it's still safe to merge up to patch 08/17; only the character
device node is affected starting with patch 10/17.

Suppose I open the chrdev from a userspace application and keep it open,
but then remove the respective driver and Counter subsystem module from
my system. The devm_counter_release() and counter_exit() functions will
be called as expected; the counter_chrdev_release() function will not be
called yet, but that is expected because the chrdev is still open by
userspace. If I try to break out of my userspace application, I expect
counter_chrdev_release() to finally be called, but this does not happen.
Instead, my userspace application stalls and I see the following error
in my dmesg:

[  172.859570] BUG: unable to handle page fault for address: ffffffffc09ae298
[  172.859594] #PF: supervisor read access in kernel mode
[  172.859598] #PF: error_code(0x0000) - not-present page
[  172.859603] PGD 23615067 P4D 23615067 PUD 23617067 PMD 1029ad067 PTE 0
[  172.859623] Oops: 0000 [#1] SMP NOPTI
[  172.859629] CPU: 2 PID: 2485 Comm: counter_example Not tainted 5.13.0+ #1
[  172.859640] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
[  172.859645] RIP: 0010:filp_close+0x29/0x70
[  172.859662] Code: 90 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 48 8b 47 38 48 85 c0 0f 84 b7 40 86 00 48 8b 47 28 49 89 fc 49 89 f5 45 31 f6 <48> 8b 40 78 48 85 c0 74 08 ff d0 0f 1f 00 41 89 c6 41 f6 44 24 45
[  172.859669] RSP: 0018:ffffad31c0ee7cb0 EFLAGS: 00010246
[  172.859675] RAX: ffffffffc09ae220 RBX: 0000000000000001 RCX: 0000000000000001
[  172.859680] RDX: ffff9a43829708e0 RSI: ffff9a4382970840 RDI: ffff9a43821f4f00
[  172.859684] RBP: ffffad31c0ee7cc8 R08: 0000000000000001 R09: 0000000000000001
[  172.859687] R10: ffffffffffff4d00 R11: ffff9a43933c6e10 R12: ffff9a43821f4f00
[  172.859691] R13: ffff9a4382970840 R14: 0000000000000000 R15: 0000000000000003
[  172.859694] FS:  0000000000000000(0000) GS:ffff9a44b7d00000(0000) knlGS:0000000000000000
[  172.859699] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  172.859704] CR2: ffffffffc09ae298 CR3: 0000000023610001 CR4: 0000000000370ee0
[  172.859713] Call Trace:
[  172.859730]  put_files_struct+0x73/0xd0
[  172.859738]  exit_files+0x49/0x50
[  172.859743]  do_exit+0x33b/0xa20
[  172.859751]  do_group_exit+0x3b/0xb0
[  172.859758]  get_signal+0x16f/0x8b0
[  172.859766]  ? _copy_to_user+0x20/0x30
[  172.859774]  ? put_timespec64+0x3d/0x60
[  172.859784]  arch_do_signal_or_restart+0xf3/0x850
[  172.859794]  ? hrtimer_nanosleep+0x9f/0x120
[  172.859802]  ? __hrtimer_init+0xd0/0xd0
[  172.859808]  exit_to_user_mode_prepare+0x122/0x1b0
[  172.859816]  syscall_exit_to_user_mode+0x27/0x50
[  172.859825]  do_syscall_64+0x48/0xc0
[  172.859831]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  172.859839] RIP: 0033:0x7f07f8b9951a
[  172.859850] Code: Unable to access opcode bytes at RIP 0x7f07f8b994f0.
[  172.859853] RSP: 002b:00007ffc0d12c230 EFLAGS: 00000246 ORIG_RAX: 00000000000000e6
[  172.859860] RAX: fffffffffffffdfc RBX: ffffffffffffff01 RCX: 00007f07f8b9951a
[  172.859863] RDX: 00007ffc0d12c2b0 RSI: 0000000000000000 RDI: 0000000000000000
[  172.859867] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007ffc0d12c1c6
[  172.859871] R10: 00007ffc0d12c2b0 R11: 0000000000000246 R12: 00007ffc0d12c2b0
[  172.859874] R13: 00007ffc0d12c2b0 R14: 0000000000000000 R15: 0000000000000000
[  172.859886] Modules linked in: intel_rapl_msr intel_rapl_common kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel nls_iso8859_1 aesni_intel crypto_simd cryptd rapl drm_ttm_helper ttm uvcvideo drm_kms_helper videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common input_leds syscopyarea videodev sysfillrect sysimgblt fb_sys_fops cec rc_core joydev mc drm serio_raw mac_hid qemu_fw_cfg sch_fq_codel msr parport_pc ppdev lp parport virtio_rng ip_tables x_tables autofs4 hid_generic usbhid hid virtio_net psmouse net_failover i2c_piix4 virtio_blk failover pata_acpi floppy [last unloaded: counter]
[  172.859995] CR2: ffffffffc09ae298
[  172.860009] ---[ end trace e7d3d7da1a73b8f4 ]---
[  172.860013] RIP: 0010:filp_close+0x29/0x70
[  172.860021] Code: 90 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 48 8b 47 38 48 85 c0 0f 84 b7 40 86 00 48 8b 47 28 49 89 fc 49 89 f5 45 31 f6 <48> 8b 40 78 48 85 c0 74 08 ff d0 0f 1f 00 41 89 c6 41 f6 44 24 45
[  172.860027] RSP: 0018:ffffad31c0ee7cb0 EFLAGS: 00010246
[  172.860031] RAX: ffffffffc09ae220 RBX: 0000000000000001 RCX: 0000000000000001
[  172.860034] RDX: ffff9a43829708e0 RSI: ffff9a4382970840 RDI: ffff9a43821f4f00
[  172.860038] RBP: ffffad31c0ee7cc8 R08: 0000000000000001 R09: 0000000000000001
[  172.860041] R10: ffffffffffff4d00 R11: ffff9a43933c6e10 R12: ffff9a43821f4f00
[  172.860044] R13: ffff9a4382970840 R14: 0000000000000000 R15: 0000000000000003
[  172.860047] FS:  0000000000000000(0000) GS:ffff9a44b7d00000(0000) knlGS:0000000000000000
[  172.860052] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  172.860056] CR2: ffffffffc09ae298 CR3: 0000000023610001 CR4: 0000000000370ee0
[  172.860073] Fixing recursive fault but reboot is needed!

It looks like faults in filp_close() before counter_chrdev_release() is
called. Is this issue manifesting because counter_exit() was called
earlier while the chrdev was still open?

William Breathitt Gray (17):
  counter: 104-quad-8: Return error when invalid mode during
    ceiling_write
  counter: Return error code on invalid modes
  counter: Standardize to ERANGE for limit exceeded errors
  counter: Rename counter_signal_value to counter_signal_level
  counter: Rename counter_count_function to counter_function
  counter: Internalize sysfs interface code
  counter: Update counter.h comments to reflect sysfs internalization
  docs: counter: Update to reflect sysfs internalization
  counter: Move counter enums to uapi header
  counter: Add character device interface
  docs: counter: Document character device interface
  tools/counter: Create Counter tools
  counter: Implement signalZ_action_component_id sysfs attribute
  counter: Implement *_component_id sysfs attributes
  counter: Implement events_queue_size sysfs attribute
  counter: 104-quad-8: Replace mutex with spinlock
  counter: 104-quad-8: Add IRQ support for the ACCES 104-QUAD-8

 Documentation/ABI/testing/sysfs-bus-counter   |   38 +-
 Documentation/driver-api/generic-counter.rst  |  358 +++-
 .../userspace-api/ioctl/ioctl-number.rst      |    1 +
 MAINTAINERS                                   |    3 +-
 drivers/counter/104-quad-8.c                  |  728 ++++----
 drivers/counter/Kconfig                       |    6 +-
 drivers/counter/Makefile                      |    1 +
 drivers/counter/counter-chrdev.c              |  514 ++++++
 drivers/counter/counter-chrdev.h              |   14 +
 drivers/counter/counter-core.c                |  182 ++
 drivers/counter/counter-sysfs.c               |  962 +++++++++++
 drivers/counter/counter-sysfs.h               |   13 +
 drivers/counter/counter.c                     | 1496 -----------------
 drivers/counter/ftm-quaddec.c                 |   59 +-
 drivers/counter/intel-qep.c                   |  150 +-
 drivers/counter/interrupt-cnt.c               |   73 +-
 drivers/counter/microchip-tcb-capture.c       |  103 +-
 drivers/counter/stm32-lptimer-cnt.c           |  211 ++-
 drivers/counter/stm32-timer-cnt.c             |  147 +-
 drivers/counter/ti-eqep.c                     |  205 ++-
 include/linux/counter.h                       |  717 ++++----
 include/linux/counter_enum.h                  |   45 -
 include/uapi/linux/counter.h                  |  154 ++
 tools/Makefile                                |   13 +-
 tools/counter/Build                           |    1 +
 tools/counter/Makefile                        |   53 +
 tools/counter/counter_example.c               |   93 +
 27 files changed, 3572 insertions(+), 2768 deletions(-)
 create mode 100644 drivers/counter/counter-chrdev.c
 create mode 100644 drivers/counter/counter-chrdev.h
 create mode 100644 drivers/counter/counter-core.c
 create mode 100644 drivers/counter/counter-sysfs.c
 create mode 100644 drivers/counter/counter-sysfs.h
 delete mode 100644 drivers/counter/counter.c
 delete mode 100644 include/linux/counter_enum.h
 create mode 100644 include/uapi/linux/counter.h
 create mode 100644 tools/counter/Build
 create mode 100644 tools/counter/Makefile
 create mode 100644 tools/counter/counter_example.c


base-commit: 50be9417e23af5a8ac860d998e1e3f06b8fd79d7

Comments

William Breathitt Gray July 28, 2021, 8:51 a.m. UTC | #1
On Tue, Jul 13, 2021 at 06:53:04PM +0900, William Breathitt Gray wrote:
> Suppose I open the chrdev from a userspace application and keep it open,
> but then remove the respective driver and Counter subsystem module from
> my system. The devm_counter_release() and counter_exit() functions will
> be called as expected; the counter_chrdev_release() function will not be
> called yet, but that is expected because the chrdev is still open by
> userspace. If I try to break out of my userspace application, I expect
> counter_chrdev_release() to finally be called, but this does not happen.
> Instead, my userspace application stalls and I see the following error
> in my dmesg:
> 
> [  172.859570] BUG: unable to handle page fault for address: ffffffffc09ae298
> [  172.859594] #PF: supervisor read access in kernel mode
> [  172.859598] #PF: error_code(0x0000) - not-present page
> [  172.859603] PGD 23615067 P4D 23615067 PUD 23617067 PMD 1029ad067 PTE 0
> [  172.859623] Oops: 0000 [#1] SMP NOPTI
> [  172.859629] CPU: 2 PID: 2485 Comm: counter_example Not tainted 5.13.0+ #1
> [  172.859640] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
> [  172.859645] RIP: 0010:filp_close+0x29/0x70
> [  172.859662] Code: 90 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 48 8b 47 38 48 85 c0 0f 84 b7 40 86 00 48 8b 47 28 49 89 fc 49 89 f5 45 31 f6 <48> 8b 40 78 48 85 c0 74 08 ff d0 0f 1f 00 41 89 c6 41 f6 44 24 45
> [  172.859669] RSP: 0018:ffffad31c0ee7cb0 EFLAGS: 00010246
> [  172.859675] RAX: ffffffffc09ae220 RBX: 0000000000000001 RCX: 0000000000000001
> [  172.859680] RDX: ffff9a43829708e0 RSI: ffff9a4382970840 RDI: ffff9a43821f4f00
> [  172.859684] RBP: ffffad31c0ee7cc8 R08: 0000000000000001 R09: 0000000000000001
> [  172.859687] R10: ffffffffffff4d00 R11: ffff9a43933c6e10 R12: ffff9a43821f4f00
> [  172.859691] R13: ffff9a4382970840 R14: 0000000000000000 R15: 0000000000000003
> [  172.859694] FS:  0000000000000000(0000) GS:ffff9a44b7d00000(0000) knlGS:0000000000000000
> [  172.859699] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  172.859704] CR2: ffffffffc09ae298 CR3: 0000000023610001 CR4: 0000000000370ee0
> [  172.859713] Call Trace:
> [  172.859730]  put_files_struct+0x73/0xd0
> [  172.859738]  exit_files+0x49/0x50
> [  172.859743]  do_exit+0x33b/0xa20
> [  172.859751]  do_group_exit+0x3b/0xb0
> [  172.859758]  get_signal+0x16f/0x8b0
> [  172.859766]  ? _copy_to_user+0x20/0x30
> [  172.859774]  ? put_timespec64+0x3d/0x60
> [  172.859784]  arch_do_signal_or_restart+0xf3/0x850
> [  172.859794]  ? hrtimer_nanosleep+0x9f/0x120
> [  172.859802]  ? __hrtimer_init+0xd0/0xd0
> [  172.859808]  exit_to_user_mode_prepare+0x122/0x1b0
> [  172.859816]  syscall_exit_to_user_mode+0x27/0x50
> [  172.859825]  do_syscall_64+0x48/0xc0
> [  172.859831]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  172.859839] RIP: 0033:0x7f07f8b9951a
> [  172.859850] Code: Unable to access opcode bytes at RIP 0x7f07f8b994f0.
> [  172.859853] RSP: 002b:00007ffc0d12c230 EFLAGS: 00000246 ORIG_RAX: 00000000000000e6
> [  172.859860] RAX: fffffffffffffdfc RBX: ffffffffffffff01 RCX: 00007f07f8b9951a
> [  172.859863] RDX: 00007ffc0d12c2b0 RSI: 0000000000000000 RDI: 0000000000000000
> [  172.859867] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007ffc0d12c1c6
> [  172.859871] R10: 00007ffc0d12c2b0 R11: 0000000000000246 R12: 00007ffc0d12c2b0
> [  172.859874] R13: 00007ffc0d12c2b0 R14: 0000000000000000 R15: 0000000000000000
> [  172.859886] Modules linked in: intel_rapl_msr intel_rapl_common kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel nls_iso8859_1 aesni_intel crypto_simd cryptd rapl drm_ttm_helper ttm uvcvideo drm_kms_helper videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common input_leds syscopyarea videodev sysfillrect sysimgblt fb_sys_fops cec rc_core joydev mc drm serio_raw mac_hid qemu_fw_cfg sch_fq_codel msr parport_pc ppdev lp parport virtio_rng ip_tables x_tables autofs4 hid_generic usbhid hid virtio_net psmouse net_failover i2c_piix4 virtio_blk failover pata_acpi floppy [last unloaded: counter]
> [  172.859995] CR2: ffffffffc09ae298
> [  172.860009] ---[ end trace e7d3d7da1a73b8f4 ]---
> [  172.860013] RIP: 0010:filp_close+0x29/0x70
> [  172.860021] Code: 90 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 48 8b 47 38 48 85 c0 0f 84 b7 40 86 00 48 8b 47 28 49 89 fc 49 89 f5 45 31 f6 <48> 8b 40 78 48 85 c0 74 08 ff d0 0f 1f 00 41 89 c6 41 f6 44 24 45
> [  172.860027] RSP: 0018:ffffad31c0ee7cb0 EFLAGS: 00010246
> [  172.860031] RAX: ffffffffc09ae220 RBX: 0000000000000001 RCX: 0000000000000001
> [  172.860034] RDX: ffff9a43829708e0 RSI: ffff9a4382970840 RDI: ffff9a43821f4f00
> [  172.860038] RBP: ffffad31c0ee7cc8 R08: 0000000000000001 R09: 0000000000000001
> [  172.860041] R10: ffffffffffff4d00 R11: ffff9a43933c6e10 R12: ffff9a43821f4f00
> [  172.860044] R13: ffff9a4382970840 R14: 0000000000000000 R15: 0000000000000003
> [  172.860047] FS:  0000000000000000(0000) GS:ffff9a44b7d00000(0000) knlGS:0000000000000000
> [  172.860052] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  172.860056] CR2: ffffffffc09ae298 CR3: 0000000023610001 CR4: 0000000000370ee0
> [  172.860073] Fixing recursive fault but reboot is needed!
> 
> It looks like faults in filp_close() before counter_chrdev_release() is
> called. Is this issue manifesting because counter_exit() was called
> earlier while the chrdev was still open?

After giving this some more thought I realized that once the counter
module is unloaded, the counter_fops callbacks are lost and thus the
fops for the chrdev are no longer be valid. This means that
counter_chrdev_release will need to be called before the counter module
is unloaded. How can I guarantee this if a userspace application may
still have the chrdev indefinitely open? In counter_chrdev_remove(),
should I walk through the open chrdevs and release each one?

William Breathitt Gray
William Breathitt Gray July 29, 2021, 9:48 a.m. UTC | #2
On Wed, Jul 28, 2021 at 05:51:45PM +0900, William Breathitt Gray wrote:
> On Tue, Jul 13, 2021 at 06:53:04PM +0900, William Breathitt Gray wrote:
> > Suppose I open the chrdev from a userspace application and keep it open,
> > but then remove the respective driver and Counter subsystem module from
> > my system. The devm_counter_release() and counter_exit() functions will
> > be called as expected; the counter_chrdev_release() function will not be
> > called yet, but that is expected because the chrdev is still open by
> > userspace. If I try to break out of my userspace application, I expect
> > counter_chrdev_release() to finally be called, but this does not happen.
> > Instead, my userspace application stalls and I see the following error
> > in my dmesg:
> > 
> > [  172.859570] BUG: unable to handle page fault for address: ffffffffc09ae298
> > [  172.859594] #PF: supervisor read access in kernel mode
> > [  172.859598] #PF: error_code(0x0000) - not-present page
> > [  172.859603] PGD 23615067 P4D 23615067 PUD 23617067 PMD 1029ad067 PTE 0
> > [  172.859623] Oops: 0000 [#1] SMP NOPTI
> > [  172.859629] CPU: 2 PID: 2485 Comm: counter_example Not tainted 5.13.0+ #1
> > [  172.859640] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
> > [  172.859645] RIP: 0010:filp_close+0x29/0x70
> > [  172.859662] Code: 90 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 48 8b 47 38 48 85 c0 0f 84 b7 40 86 00 48 8b 47 28 49 89 fc 49 89 f5 45 31 f6 <48> 8b 40 78 48 85 c0 74 08 ff d0 0f 1f 00 41 89 c6 41 f6 44 24 45
> > [  172.859669] RSP: 0018:ffffad31c0ee7cb0 EFLAGS: 00010246
> > [  172.859675] RAX: ffffffffc09ae220 RBX: 0000000000000001 RCX: 0000000000000001
> > [  172.859680] RDX: ffff9a43829708e0 RSI: ffff9a4382970840 RDI: ffff9a43821f4f00
> > [  172.859684] RBP: ffffad31c0ee7cc8 R08: 0000000000000001 R09: 0000000000000001
> > [  172.859687] R10: ffffffffffff4d00 R11: ffff9a43933c6e10 R12: ffff9a43821f4f00
> > [  172.859691] R13: ffff9a4382970840 R14: 0000000000000000 R15: 0000000000000003
> > [  172.859694] FS:  0000000000000000(0000) GS:ffff9a44b7d00000(0000) knlGS:0000000000000000
> > [  172.859699] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  172.859704] CR2: ffffffffc09ae298 CR3: 0000000023610001 CR4: 0000000000370ee0
> > [  172.859713] Call Trace:
> > [  172.859730]  put_files_struct+0x73/0xd0
> > [  172.859738]  exit_files+0x49/0x50
> > [  172.859743]  do_exit+0x33b/0xa20
> > [  172.859751]  do_group_exit+0x3b/0xb0
> > [  172.859758]  get_signal+0x16f/0x8b0
> > [  172.859766]  ? _copy_to_user+0x20/0x30
> > [  172.859774]  ? put_timespec64+0x3d/0x60
> > [  172.859784]  arch_do_signal_or_restart+0xf3/0x850
> > [  172.859794]  ? hrtimer_nanosleep+0x9f/0x120
> > [  172.859802]  ? __hrtimer_init+0xd0/0xd0
> > [  172.859808]  exit_to_user_mode_prepare+0x122/0x1b0
> > [  172.859816]  syscall_exit_to_user_mode+0x27/0x50
> > [  172.859825]  do_syscall_64+0x48/0xc0
> > [  172.859831]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [  172.859839] RIP: 0033:0x7f07f8b9951a
> > [  172.859850] Code: Unable to access opcode bytes at RIP 0x7f07f8b994f0.
> > [  172.859853] RSP: 002b:00007ffc0d12c230 EFLAGS: 00000246 ORIG_RAX: 00000000000000e6
> > [  172.859860] RAX: fffffffffffffdfc RBX: ffffffffffffff01 RCX: 00007f07f8b9951a
> > [  172.859863] RDX: 00007ffc0d12c2b0 RSI: 0000000000000000 RDI: 0000000000000000
> > [  172.859867] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007ffc0d12c1c6
> > [  172.859871] R10: 00007ffc0d12c2b0 R11: 0000000000000246 R12: 00007ffc0d12c2b0
> > [  172.859874] R13: 00007ffc0d12c2b0 R14: 0000000000000000 R15: 0000000000000000
> > [  172.859886] Modules linked in: intel_rapl_msr intel_rapl_common kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel nls_iso8859_1 aesni_intel crypto_simd cryptd rapl drm_ttm_helper ttm uvcvideo drm_kms_helper videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common input_leds syscopyarea videodev sysfillrect sysimgblt fb_sys_fops cec rc_core joydev mc drm serio_raw mac_hid qemu_fw_cfg sch_fq_codel msr parport_pc ppdev lp parport virtio_rng ip_tables x_tables autofs4 hid_generic usbhid hid virtio_net psmouse net_failover i2c_piix4 virtio_blk failover pata_acpi floppy [last unloaded: counter]
> > [  172.859995] CR2: ffffffffc09ae298
> > [  172.860009] ---[ end trace e7d3d7da1a73b8f4 ]---
> > [  172.860013] RIP: 0010:filp_close+0x29/0x70
> > [  172.860021] Code: 90 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 48 8b 47 38 48 85 c0 0f 84 b7 40 86 00 48 8b 47 28 49 89 fc 49 89 f5 45 31 f6 <48> 8b 40 78 48 85 c0 74 08 ff d0 0f 1f 00 41 89 c6 41 f6 44 24 45
> > [  172.860027] RSP: 0018:ffffad31c0ee7cb0 EFLAGS: 00010246
> > [  172.860031] RAX: ffffffffc09ae220 RBX: 0000000000000001 RCX: 0000000000000001
> > [  172.860034] RDX: ffff9a43829708e0 RSI: ffff9a4382970840 RDI: ffff9a43821f4f00
> > [  172.860038] RBP: ffffad31c0ee7cc8 R08: 0000000000000001 R09: 0000000000000001
> > [  172.860041] R10: ffffffffffff4d00 R11: ffff9a43933c6e10 R12: ffff9a43821f4f00
> > [  172.860044] R13: ffff9a4382970840 R14: 0000000000000000 R15: 0000000000000003
> > [  172.860047] FS:  0000000000000000(0000) GS:ffff9a44b7d00000(0000) knlGS:0000000000000000
> > [  172.860052] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  172.860056] CR2: ffffffffc09ae298 CR3: 0000000023610001 CR4: 0000000000370ee0
> > [  172.860073] Fixing recursive fault but reboot is needed!
> > 
> > It looks like faults in filp_close() before counter_chrdev_release() is
> > called. Is this issue manifesting because counter_exit() was called
> > earlier while the chrdev was still open?
> 
> After giving this some more thought I realized that once the counter
> module is unloaded, the counter_fops callbacks are lost and thus the
> fops for the chrdev are no longer be valid. This means that
> counter_chrdev_release will need to be called before the counter module
> is unloaded. How can I guarantee this if a userspace application may
> still have the chrdev indefinitely open? In counter_chrdev_remove(),
> should I walk through the open chrdevs and release each one?
> 
> William Breathitt Gray

Sorry for all the noise, I had simply forgotten to define the owner
member of counter_fops structure to THIS_MODULE. I'll make that fix and
also add in some conditional checks to protect the callbacks so they
return -ENODEV if the counter has been unregistered. With that I expect
to have a v14 submission ready some time this coming week.

William Breathitt Gray