Message ID | 0-v3-402a7d6459de+24b-iommufd_jgg@nvidia.com (mailing list archive) |
---|---|
Series | IOMMUFD Generic interface |
On Tue, Oct 25, 2022 at 03:12:09PM -0300, Jason Gunthorpe wrote:
> [
> At this point everything is done and I will start putting this work into a
> git tree and into linux-next with the intention of sending it during the
> next merge window.
>
> I intend to focus the next several weeks on more intensive QA to look at
> error flows and other things. Hopefully including syzkaller if I'm lucky
> ]
> However, these are not necessary for this series to advance.
>
> This is on github: https://github.com/jgunthorpe/linux/commits/iommufd

Tested-by: Nicolin Chen <nicolinc@nvidia.com>

I tested on ARM64+SMMUv3 with the other vfio_iommufd branch that includes
these core changes too.

Thanks
Nicolin
On Tue, 25 Oct 2022 15:12:09 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> [
> At this point everything is done and I will start putting this work into a
> git tree and into linux-next with the intention of sending it during the
> next merge window.
>
> I intend to focus the next several weeks on more intensive QA to look at
> error flows and other things. Hopefully including syzkaller if I'm lucky
> ]

In case this one hasn't been reported yet (with IOMMUFD_VFIO_CONTAINER):

======================================================
WARNING: possible circular locking dependency detected
6.1.0-rc3+ #133 Tainted: G            E
------------------------------------------------------
qemu-system-x86/1731 is trying to acquire lock:
ffff90d3f5fe3e08 (&iopt->iova_rwsem){++++}-{3:3}, at: iopt_map_pages.part.0+0x85/0xe0 [iommufd]

but task is already holding lock:
ffff90d3f5fe3d18 (&iopt->domains_rwsem){.+.+}-{3:3}, at: iopt_map_pages.part.0+0x18/0xe0 [iommufd]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&iopt->domains_rwsem){.+.+}-{3:3}:
       down_read+0x2d/0x40
       iommufd_vfio_ioctl+0x2cc/0x640 [iommufd]
       iommufd_fops_ioctl+0x14e/0x190 [iommufd]
       __x64_sys_ioctl+0x8b/0xc0
       do_syscall_64+0x3b/0x90
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #0 (&iopt->iova_rwsem){++++}-{3:3}:
       __lock_acquire+0x10dc/0x1da0
       lock_acquire+0xc2/0x2d0
       down_write+0x2b/0xd0
       iopt_map_pages.part.0+0x85/0xe0 [iommufd]
       iopt_map_user_pages+0x179/0x1d0 [iommufd]
       iommufd_vfio_ioctl+0x216/0x640 [iommufd]
       iommufd_fops_ioctl+0x14e/0x190 [iommufd]
       __x64_sys_ioctl+0x8b/0xc0
       do_syscall_64+0x3b/0x90
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&iopt->domains_rwsem);
                               lock(&iopt->iova_rwsem);
                               lock(&iopt->domains_rwsem);
  lock(&iopt->iova_rwsem);

 *** DEADLOCK ***

2 locks held by qemu-system-x86/1731:
 #0: ffff90d3f5fe3c70 (&obj->destroy_rwsem){.+.+}-{3:3}, at: get_compat_ioas+0x2b/0x90 [iommufd]
 #1: ffff90d3f5fe3d18 (&iopt->domains_rwsem){.+.+}-{3:3}, at: iopt_map_pages.part.0+0x18/0xe0 [iommufd]

stack backtrace:
CPU: 0 PID: 1731 Comm: qemu-system-x86 Tainted: G            E      6.1.0-rc3+ #133
Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
Call Trace:
 <TASK>
 dump_stack_lvl+0x56/0x73
 check_noncircular+0xd6/0x100
 ? lock_is_held_type+0xe2/0x140
 __lock_acquire+0x10dc/0x1da0
 lock_acquire+0xc2/0x2d0
 ? iopt_map_pages.part.0+0x85/0xe0 [iommufd]
 ? lock_release+0x137/0x2d0
 down_write+0x2b/0xd0
 ? iopt_map_pages.part.0+0x85/0xe0 [iommufd]
 iopt_map_pages.part.0+0x85/0xe0 [iommufd]
 iopt_map_user_pages+0x179/0x1d0 [iommufd]
 iommufd_vfio_ioctl+0x216/0x640 [iommufd]
 iommufd_fops_ioctl+0x14e/0x190 [iommufd]
 __x64_sys_ioctl+0x8b/0xc0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fd1eee7c17b
Code: 0f 1e fa 48 8b 05 1d ad 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed ac 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffd9787b9a8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd1eee7c17b
RDX: 00007ffd9787b9e0 RSI: 0000000000003b71 RDI: 000000000000001c
RBP: 00007ffd9787ba10 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000000c0000 R11: 0000000000000206 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
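[Editor's note] For readers less used to lockdep output, the "Possible unsafe locking scenario" above is the classic AB-BA inversion: one path takes domains_rwsem before iova_rwsem, while another path established the opposite order, so two tasks can each hold one lock and wait forever on the other. The fragment below is a minimal userspace sketch of that inversion, using POSIX rwlocks as stand-ins for the kernel rwsems; the function names, the write-mode-only locking, and the timeout-based reporting are illustrative assumptions, not the iommufd code.

```c
/*
 * Userspace sketch of the AB-BA inversion lockdep reports above.
 * pthread rwlocks stand in for the kernel's domains_rwsem and iova_rwsem;
 * the read/write distinction of the real paths is dropped for simplicity.
 * Build with: cc -pthread demo.c
 */
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>

static pthread_rwlock_t domains_rwsem = PTHREAD_RWLOCK_INITIALIZER;
static pthread_rwlock_t iova_rwsem = PTHREAD_RWLOCK_INITIALIZER;

/* Take the second lock with a bounded wait so the demo terminates instead
 * of hanging; a timeout here is exactly the deadlock lockdep warns about. */
static int second_lock(pthread_rwlock_t *l, const char *who)
{
	struct timespec deadline;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += 2;
	if (pthread_rwlock_timedwrlock(l, &deadline)) {
		fprintf(stderr, "%s: stuck on second lock -> AB-BA deadlock\n", who);
		return -1;
	}
	return 0;
}

/* Mirrors the ordering in the #0 chain: domains_rwsem, then iova_rwsem. */
static void *map_path(void *arg)
{
	pthread_rwlock_wrlock(&domains_rwsem);
	if (!second_lock(&iova_rwsem, "map_path"))
		pthread_rwlock_unlock(&iova_rwsem);
	pthread_rwlock_unlock(&domains_rwsem);
	return NULL;
}

/* Mirrors the reverse ordering in the #1 chain: iova_rwsem first. */
static void *ioctl_path(void *arg)
{
	pthread_rwlock_wrlock(&iova_rwsem);
	if (!second_lock(&domains_rwsem, "ioctl_path"))
		pthread_rwlock_unlock(&domains_rwsem);
	pthread_rwlock_unlock(&iova_rwsem);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, map_path, NULL);
	pthread_create(&b, NULL, ioctl_path, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}
```

The usual remedy, which the fix linked later in the thread follows in spirit, is to pick one ordering for the two locks and use it on every path.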
On Fri, 4 Nov 2022 15:27:13 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Tue, 25 Oct 2022 15:12:09 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> > [
> > At this point everything is done and I will start putting this work into a
> > git tree and into linux-next with the intention of sending it during the
> > next merge window.
> >
> > I intend to focus the next several weeks on more intensive QA to look at
> > error flows and other things. Hopefully including syzkaller if I'm lucky
> > ]
>
> In case this one hasn't been reported yet (with IOMMUFD_VFIO_CONTAINER):

And...

------------[ cut here ]------------
WARNING: CPU: 4 PID: 1736 at drivers/iommu/iommufd/io_pagetable.c:660 iopt_destroy_table+0x91/0xc0 [iommufd]
Modules linked in: scsi_transport_iscsi(E) xt_CHECKSUM(E) xt_MASQUERADE(E) xt_conntrack(E) ipt_REJECT(E) nf_nat_tftp(E) nft_objref(E) nf_conntrack_tftp(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_tables(E) bridge(E) stp(E) llc(E) ebtable_nat(E) ebtable_broute(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ip_set(E) nfnetlink(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) sunrpc(E) intel_rapl_msr(E) intel_rapl_common(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) snd_hda_intel(E) snd_intel_dspcfg(E) kvm_intel(E) snd_hda_codec(E) snd_hwdep(E) bcache(E) iTCO_wdt(E) snd_hda_core(E) kvm(E) mei_hdcp(E) intel_pmc_bxt(E) at24(E) snd_seq(E) iTCO_vendor_support(E) eeepc_wmi(E) snd_seq_device(E) asus_wmi(E) rapl(E) snd_pcm(E) ledtrig_audio(E) intel_cstate(E) sparse_keymap(E) intel_uncore(E) mei_me(E) snd_timer(E) platform_profile(E) i2c_i801(E) rfkill(E) wmi_bmof(E) snd(E) i2c_smbus(E) soundcore(E) mei(E) lpc_ich(E) ip_tables(E) i915(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) vfio_pci(E) vfio_pci_core(E) irqbypass(E) vfio_virqfd(E) serio_raw(E) i2c_algo_bit(E) drm_buddy(E) drm_display_helper(E) drm_kms_helper(E) cec(E) ttm(E) r8169(E) e1000e(E) drm(E) video(E) wmi(E) mtty(E) mdev(E) vfio(E) iommufd(E) macvtap(E) macvlan(E) tap(E)
CPU: 4 PID: 1736 Comm: qemu-system-x86 Tainted: G            E      6.1.0-rc3+ #133
Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
RIP: 0010:iopt_destroy_table+0x91/0xc0 [iommufd]
Code: a8 01 00 00 48 85 c0 75 21 49 83 bc 24 e0 00 00 00 00 75 23 49 8b 84 24 88 01 00 00 48 85 c0 75 25 5b 5d 41 5c c3 cc cc cc cc <0f> 0b 49 83 bc 24 e0 00 00 00 00 74 dd 0f 0b 49 8b 84 24 88 01 00
RSP: 0018:ffff9c8dc1c63cb0 EFLAGS: 00010282
RAX: ffff90d454863a80 RBX: ffff90d3f5fe3e40 RCX: 0000000000000000
RDX: ffffffffffffffff RSI: 0000000000000000 RDI: ffff90d3f5fe3e40
RBP: 0000000000000000 R08: 0000000000000001 R09: ffff90d43234b240
R10: 0000000000000000 R11: ffff90d42c703000 R12: ffff90d3f5fe3ca8
R13: 0000000000000001 R14: ffff90d43ca32138 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff90d7df700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3fba3c6000 CR3: 000000009ba26005 CR4: 00000000001726e0
Call Trace:
 <TASK>
 iommufd_ioas_destroy+0x2b/0x60 [iommufd]
 iommufd_fops_release+0x8b/0xe0 [iommufd]
 __fput+0x94/0x250
 task_work_run+0x59/0x90
 do_exit+0x374/0xbd0
 ? rcu_read_lock_sched_held+0x12/0x70
 do_group_exit+0x33/0xa0
 get_signal+0xaf4/0xb20
 arch_do_signal_or_restart+0x36/0x780
 ? do_futex+0x126/0x1c0
 exit_to_user_mode_prepare+0x181/0x260
 syscall_exit_to_user_mode+0x16/0x50
 do_syscall_64+0x48/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fd1ef7a3750
Code: Unable to access opcode bytes at 0x7fd1ef7a3726.
RSP: 002b:00007fd1e21fb5d8 EFLAGS: 00000282 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00007fd1ef7a3750
RDX: 0000000000000002 RSI: 0000000000000080 RDI: 00005571b8cf38c0
RBP: 00007fd1e21fb630 R08: 0000000000000000 R09: 000000000000000b
R10: 0000000000000000 R11: 0000000000000282 R12: 00007ffd9787d1ae
R13: 00007ffd9787d1af R14: 00007ffd9787d270 R15: 00007fd1e2200700
 </TASK>
irq event stamp: 202
hardirqs last enabled at (201): [<ffffffffa7e235a2>] syscall_enter_from_user_mode+0x22/0xb0
hardirqs last disabled at (202): [<ffffffffa7e2da5d>] __schedule+0x7ed/0xd30
softirqs last enabled at (0): [<ffffffffa70e2241>] copy_process+0x9f1/0x1e90
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace 0000000000000000 ]---
On Fri, Nov 04, 2022 at 03:27:13PM -0600, Alex Williamson wrote:
> On Tue, 25 Oct 2022 15:12:09 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> > [
> > At this point everything is done and I will start putting this work into a
> > git tree and into linux-next with the intention of sending it during the
> > next merge window.
> >
> > I intend to focus the next several weeks on more intensive QA to look at
> > error flows and other things. Hopefully including syzkaller if I'm lucky
> > ]
>
> In case this one hasn't been reported yet (with IOMMUFD_VFIO_CONTAINER):
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.1.0-rc3+ #133 Tainted: G            E
> ------------------------------------------------------
> qemu-system-x86/1731 is trying to acquire lock:
> ffff90d3f5fe3e08 (&iopt->iova_rwsem){++++}-{3:3}, at: iopt_map_pages.part.0+0x85/0xe0 [iommufd]
>
> but task is already holding lock:
> ffff90d3f5fe3d18 (&iopt->domains_rwsem){.+.+}-{3:3}, at: iopt_map_pages.part.0+0x18/0xe0 [iommufd]
>
> which lock already depends on the new lock.

I think this is:

https://lore.kernel.org/all/Y1qR6Zxdmuk+ME5z@nvidia.com/

Thanks,
Jason
On Fri, Nov 04, 2022 at 04:03:48PM -0600, Alex Williamson wrote:
> On Fri, 4 Nov 2022 15:27:13 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
>
> > On Tue, 25 Oct 2022 15:12:09 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > > [
> > > At this point everything is done and I will start putting this work into a
> > > git tree and into linux-next with the intention of sending it during the
> > > next merge window.
> > >
> > > I intend to focus the next several weeks on more intensive QA to look at
> > > error flows and other things. Hopefully including syzkaller if I'm lucky
> > > ]
> >
> > In case this one hasn't been reported yet (with IOMMUFD_VFIO_CONTAINER):
>
> And...
>
> ------------[ cut here ]------------
> WARNING: CPU: 4 PID: 1736 at drivers/iommu/iommufd/io_pagetable.c:660 iopt_destroy_table+0x91/0xc0 [iommufd]

This is a generic splat that says accounting has gone wrong. syzkaller hit
splats like this and they are fixed, so I'm guessing it is sorted out now.

Most likely:

https://lore.kernel.org/all/Y2QfqAWxqT5cCfmN@nvidia.com/
https://lore.kernel.org/all/Y2U9LiwXxPO7G6YW@nvidia.com/

I hope to post v4 by the end of the day (the fixes are on the github
already), so please re-test this.

Jason
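[Editor's note] "Accounting has gone wrong" here refers to the WARN-style sanity check that fired in iopt_destroy_table(): a teardown path checking that per-object counters have drained to zero before the object is freed. The sketch below illustrates that general pattern in plain, self-contained C; the struct, field names, and the demo_warn_on() macro are invented for the example and are not the actual iommufd data structures or checks.

```c
/*
 * Illustration of an "accounting went wrong" warning on a destroy path:
 * before freeing, verify every counter that maps/pins incremented has been
 * decremented back to zero.  Names are invented for the example.
 */
#include <stdio.h>
#include <stdlib.h>

struct io_pagetable_demo {
	unsigned long pinned_pages;	/* pages still pinned for DMA */
	unsigned long live_areas;	/* mapped areas not yet unmapped */
};

/* Poor man's WARN_ON(): report the broken invariant but keep going, so the
 * rest of teardown still runs -- the same spirit as the kernel WARN. */
#define demo_warn_on(cond)						\
	((cond) ? (fprintf(stderr, "WARN: %s at %s:%d\n",		\
			   #cond, __FILE__, __LINE__), 1) : 0)

static void iopt_destroy_table_demo(struct io_pagetable_demo *iopt)
{
	/* By destroy time every map must have been undone; if not, some
	 * earlier error path leaked an area or a page pin. */
	demo_warn_on(iopt->pinned_pages != 0);
	demo_warn_on(iopt->live_areas != 0);
	free(iopt);
}

int main(void)
{
	struct io_pagetable_demo *iopt = calloc(1, sizeof(*iopt));

	if (!iopt)
		return 1;
	iopt->live_areas = 1;		/* simulate a leaked mapping */
	iopt_destroy_table_demo(iopt);	/* fires the demo warning */
	return 0;
}
```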