[1/4] scsi: bnx2fc: convert per-CPU thread to kworker

Message ID 1467734929-24442-1-git-send-email-bigeasy@linutronix.de (mailing list archive)
State Changes Requested, archived
Headers show

Commit Message

Sebastian Andrzej Siewior July 5, 2016, 4:08 p.m. UTC
The driver creates its own per-CPU threads, which are updated based on CPU
hotplug events. It is also possible to use kworkers and remove some of the
infrastructure to get the same job done while saving a few lines of code.

bnx2fc_percpu_io_thread() becomes bnx2fc_percpu_io_work() which is
mostly the same code. The outer loop (kthread_should_stop()) gets
removed and the remaining code is shifted to the left.
In bnx2fc_process_new_cqes() the code checked ->iothread to decide
whether there is an active per-CPU thread. With kworkers this is
neither possible nor required. The allocation of a new work item
(via bnx2fc_alloc_work()) no longer happens with the ->fp_work_lock
lock held. It performs only a memory allocation plus initialization,
which does not require any kind of serialization. The lock is held
while adding the new member to the fps->work_list list.

The remaining part is the removal of the CPU hotplug notifier, since
this is taken care of by the kworker code.

This patch was only compile-tested due to -ENODEV.

Cc: QLogic-Storage-Upstream@qlogic.com
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/scsi/bnx2fc/bnx2fc.h      |   2 +-
 drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 157 ++++++--------------------------------
 drivers/scsi/bnx2fc/bnx2fc_hwi.c  |  22 +++---
 3 files changed, 32 insertions(+), 149 deletions(-)

Comments

Johannes Thumshirn July 20, 2016, 11:04 a.m. UTC | #1
On Tue, Jul 05, 2016 at 06:08:46PM +0200, Sebastian Andrzej Siewior wrote:
> The driver creates its own per-CPU threads which are updated based on CPU
> hotplug events. It is also possible to use kworkers and remove some of the
> infrastructure get the same job done while saving a few lines of code.
> 
> bnx2fc_percpu_io_thread() becomes bnx2fc_percpu_io_work() which is
> mostly the same code. The outer loop (kthread_should_stop()) gets
> removed and the remaining code is shifted to the left.
> In bnx2fc_process_new_cqes() the code checked for ->iothread to
> decide if there is an active per-CPU thread. With the kworkers this
> is no longer possible nor required. The allocation of a new work item
> (via bnx2fc_alloc_work()) does no longer happen with the ->fp_work_lock
> lock held. It performs only a memory allocation + initialization which
> does not require any kind of serialization. The lock is held while
> adding the new member to fps->work_list list.
> 
> The remaining part is the removal CPU hotplug notifier since it is taken
> care by the kworker code.
> 
> This patch was only compile-tested due to -ENODEV.
> 
> Cc: QLogic-Storage-Upstream@qlogic.com
> Cc: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Hi Sebastian,

I've tried to test your series and unfortunately I encountered the following
oops.

[  113.969654] Modules linked in: af_packet(E) 8021q(E) garp(E) stp(E) llc(E) mrp(E) mgag200(E) i2c_algo_bit(E) bnx2x(E) drm_kms_helper(E) syscopyarea(E) mdio(E) sysfillrect(E) libcrc32c(E) sysimgblt(E) dm_service_time(E) bnx2fc(E) fb_sys_fops(E) geneve(E) xhci_pci(E) uhci_hcd(E) ttm(E) sd_mod(E) ehci_hcd(E) xhci_hcd(E) cnic(E) vxlan(E) uio(E) ip6_udp_tunnel(E) udp_tunnel(E) crc32c_intel(E) hpsa(E) ptp(E) usbcore(E) drm(E) scsi_transport_sas(E) usb_common(E) pps_core(E) fjes(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) dm_mirror(E) dm_region_hash(E) dm_log(E) fcoe(E) libfcoe(E) libfc(E) scsi_transport_fc(E) sg(E) dm_multipath(E) dm_mod(E) scsi_mod(E) efivarfs(E) autofs4(E)
[  113.969669] Supported: Yes
[  113.969671] CPU: 11 PID: 398 Comm: kworker/11:1 Tainted: G            EL     4.4.15-0.g6f92308-default #1
[  113.969672] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 12/24/2014
[  113.969674] Workqueue: events bnx2fc_percpu_io_work [bnx2fc]
[  113.969675] task: ffff880667a38600 ti: ffff880667a3c000 task.ti: ffff880667a3c000
[  113.969676] RIP: 0010:[<ffffffff810c4c2f>]  [<ffffffff810c4c2f>] native_queued_spin_lock_slowpath+0x16f/0x190
[  113.969677] RSP: 0018:ffff880667a3fde8  EFLAGS: 00000202
[  113.969678] RAX: 0000000000000101 RBX: ffff88065553cd20 RCX: 0000000000000001
[  113.969678] RDX: 0000000000000101 RSI: 0000000000000001 RDI: ffff8806676d80d0
[  113.969679] RBP: ffff880667a3fdf8 R08: 0000000000000101 R09: 0000000000000001
[  113.969679] R10: ffffffff81d46fc0 R11: 0000000000000000 R12: ffff880667a3fdf8
[  113.969680] R13: ffff8806676d80a0 R14: ffff8806676d80c0 R15: ffff8806676d80d0
[  113.969681] FS:  0000000000000000(0000) GS:ffff8806676c0000(0000) knlGS:0000000000000000
[  113.969681] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  113.969682] CR2: 00007fee621724a0 CR3: 0000000001c0a000 CR4: 00000000001406e0
[  113.969682] Stack:
[  113.969683]  ffffffff81181a17 ffffffffa026a1dc ffff880667a3fdf8 ffff880667a3fdf8
[  113.969684]  ffff8806676d80a0 ffff88017c303b80 ffff8806676d5400 ffffe8fa068c1200
[  113.969685]  0000000000000000 00000000000002c0 ffffffff8109394e 0000000067a38600
[  113.969685] Call Trace:
[  113.969688]  [<ffffffff81181a17>] queued_spin_lock_slowpath+0x7/0xa
[  113.969690]  [<ffffffffa026a1dc>] bnx2fc_percpu_io_work+0xac/0xd0 [bnx2fc]
[  113.969693]  [<ffffffff8109394e>] process_one_work+0x14e/0x410
[  113.969695]  [<ffffffff810941a6>] worker_thread+0x116/0x490
[  113.969697]  [<ffffffff8109964d>] kthread+0xbd/0xe0
[  113.969699]  [<ffffffff815debbf>] ret_from_fork+0x3f/0x70
[  113.970696] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
[  113.970696] 
[  113.970696] Leftover inexact backtrace:
[  113.970696] 
[  113.970698]  [<ffffffff81099590>] ? kthread_park+0x50/0x50
[  113.970705] Code: 89 d0 66 31 c0 39 c1 74 e7 4d 85 c9 c6 07 01 74 2d 41 c7 41 08 01 00 00 00 e9 52 ff ff ff 83 fa 01 74 17 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 b8 01 00 00 00 66 89 07 c3 f3 c3 f3 90 4d 8b 

I'll try to narrow it down to the specific patch, but currently I'm a bit
busy, sorry.

Johannes

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sebastian Andrzej Siewior July 20, 2016, 12:07 p.m. UTC | #2
On 07/20/2016 01:04 PM, Johannes Thumshirn wrote:
> Hi Sebastian,

Hi Johannes,

> I've tried to test your series and unfortunately I encountered the following
> oops.
thanks for testing.
…
> 
> I'll try to narrow it down to the specific patch, but currently I'm a bit
> busy, sorry.

I looked over the patch and noticed nothing wrong. The worker managed to
run, but somehow the spinlock exploded.

> 
> Johannes
> 
Sebastian
Johannes Thumshirn July 20, 2016, 12:13 p.m. UTC | #3
On Wed, Jul 20, 2016 at 02:07:19PM +0200, Sebastian Andrzej Siewior wrote:
> On 07/20/2016 01:04 PM, Johannes Thumshirn wrote:
> > Hi Sebastian,
> 
> Hi Johannes,
> 
> > I've tried to test your series and unfortunately I encountered the following
> > oops.
> thanks for testing.
> …
> > 
> > I'll try to narrow it down to the specific patch, but currently I'm a bit
> > busy, sorry.
> 
> I looked over the patch and noticed nothing. It was possible to run the
> worker but somehow the spinlock exploded.

Strange...

I'll re-apply the patches again, maybe something went wrong there.

Johannes
Sebastian Andrzej Siewior Aug. 12, 2016, 11 a.m. UTC | #4
On 2016-07-05 18:08:46 [+0200], To linux-scsi@vger.kernel.org wrote:
> The driver creates its own per-CPU threads which are updated based on CPU
> hotplug events. It is also possible to use kworkers and remove some of the
> infrastructure get the same job done while saving a few lines of code.

ping

Sebastian
Patch

diff --git a/drivers/scsi/bnx2fc/bnx2fc.h b/drivers/scsi/bnx2fc/bnx2fc.h
index fdd4eb4e41b2..fdd89935cac7 100644
--- a/drivers/scsi/bnx2fc/bnx2fc.h
+++ b/drivers/scsi/bnx2fc/bnx2fc.h
@@ -168,7 +168,7 @@  extern struct fcoe_percpu_s bnx2fc_global;
 extern struct workqueue_struct *bnx2fc_wq;
 
 struct bnx2fc_percpu_s {
-	struct task_struct *iothread;
+	struct work_struct work;
 	struct list_head work_list;
 	spinlock_t fp_work_lock;
 };
diff --git a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
index d6800afd0232..ab08d3ab8a44 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
@@ -127,13 +127,6 @@  module_param_named(log_fka, bnx2fc_log_fka, uint, S_IRUGO|S_IWUSR);
 MODULE_PARM_DESC(log_fka, " Print message to kernel log when fcoe is "
 	"initiating a FIP keep alive when debug logging is enabled.");
 
-static int bnx2fc_cpu_callback(struct notifier_block *nfb,
-			     unsigned long action, void *hcpu);
-/* notification function for CPU hotplug events */
-static struct notifier_block bnx2fc_cpu_notifier = {
-	.notifier_call = bnx2fc_cpu_callback,
-};
-
 static inline struct net_device *bnx2fc_netdev(const struct fc_lport *lport)
 {
 	return ((struct bnx2fc_interface *)
@@ -621,39 +614,32 @@  static void bnx2fc_recv_frame(struct sk_buff *skb)
 }
 
 /**
- * bnx2fc_percpu_io_thread - thread per cpu for ios
+ * bnx2fc_percpu_io_work - work per cpu for ios
  *
- * @arg:	ptr to bnx2fc_percpu_info structure
+ * @work_s:	The work struct
  */
-int bnx2fc_percpu_io_thread(void *arg)
+static void bnx2fc_percpu_io_work(struct work_struct *work_s)
 {
-	struct bnx2fc_percpu_s *p = arg;
+	struct bnx2fc_percpu_s *p;
 	struct bnx2fc_work *work, *tmp;
 	LIST_HEAD(work_list);
 
-	set_user_nice(current, MIN_NICE);
-	set_current_state(TASK_INTERRUPTIBLE);
-	while (!kthread_should_stop()) {
-		schedule();
-		spin_lock_bh(&p->fp_work_lock);
-		while (!list_empty(&p->work_list)) {
-			list_splice_init(&p->work_list, &work_list);
-			spin_unlock_bh(&p->fp_work_lock);
+	p = container_of(work_s, struct bnx2fc_percpu_s, work);
 
-			list_for_each_entry_safe(work, tmp, &work_list, list) {
-				list_del_init(&work->list);
-				bnx2fc_process_cq_compl(work->tgt, work->wqe);
-				kfree(work);
-			}
-
-			spin_lock_bh(&p->fp_work_lock);
-		}
-		__set_current_state(TASK_INTERRUPTIBLE);
+	spin_lock_bh(&p->fp_work_lock);
+	while (!list_empty(&p->work_list)) {
+		list_splice_init(&p->work_list, &work_list);
 		spin_unlock_bh(&p->fp_work_lock);
-	}
-	__set_current_state(TASK_RUNNING);
 
-	return 0;
+		list_for_each_entry_safe(work, tmp, &work_list, list) {
+			list_del_init(&work->list);
+			bnx2fc_process_cq_compl(work->tgt, work->wqe);
+			kfree(work);
+		}
+
+		spin_lock_bh(&p->fp_work_lock);
+	}
+	spin_unlock_bh(&p->fp_work_lock);
 }
 
 static struct fc_host_statistics *bnx2fc_get_host_stats(struct Scsi_Host *shost)
@@ -2570,91 +2556,6 @@  static struct fcoe_transport bnx2fc_transport = {
 	.disable = bnx2fc_disable,
 };
 
-/**
- * bnx2fc_percpu_thread_create - Create a receive thread for an
- *				 online CPU
- *
- * @cpu: cpu index for the online cpu
- */
-static void bnx2fc_percpu_thread_create(unsigned int cpu)
-{
-	struct bnx2fc_percpu_s *p;
-	struct task_struct *thread;
-
-	p = &per_cpu(bnx2fc_percpu, cpu);
-
-	thread = kthread_create_on_node(bnx2fc_percpu_io_thread,
-					(void *)p, cpu_to_node(cpu),
-					"bnx2fc_thread/%d", cpu);
-	/* bind thread to the cpu */
-	if (likely(!IS_ERR(thread))) {
-		kthread_bind(thread, cpu);
-		p->iothread = thread;
-		wake_up_process(thread);
-	}
-}
-
-static void bnx2fc_percpu_thread_destroy(unsigned int cpu)
-{
-	struct bnx2fc_percpu_s *p;
-	struct task_struct *thread;
-	struct bnx2fc_work *work, *tmp;
-
-	BNX2FC_MISC_DBG("destroying io thread for CPU %d\n", cpu);
-
-	/* Prevent any new work from being queued for this CPU */
-	p = &per_cpu(bnx2fc_percpu, cpu);
-	spin_lock_bh(&p->fp_work_lock);
-	thread = p->iothread;
-	p->iothread = NULL;
-
-
-	/* Free all work in the list */
-	list_for_each_entry_safe(work, tmp, &p->work_list, list) {
-		list_del_init(&work->list);
-		bnx2fc_process_cq_compl(work->tgt, work->wqe);
-		kfree(work);
-	}
-
-	spin_unlock_bh(&p->fp_work_lock);
-
-	if (thread)
-		kthread_stop(thread);
-}
-
-/**
- * bnx2fc_cpu_callback - Handler for CPU hotplug events
- *
- * @nfb:    The callback data block
- * @action: The event triggering the callback
- * @hcpu:   The index of the CPU that the event is for
- *
- * This creates or destroys per-CPU data for fcoe
- *
- * Returns NOTIFY_OK always.
- */
-static int bnx2fc_cpu_callback(struct notifier_block *nfb,
-			     unsigned long action, void *hcpu)
-{
-	unsigned cpu = (unsigned long)hcpu;
-
-	switch (action) {
-	case CPU_ONLINE:
-	case CPU_ONLINE_FROZEN:
-		printk(PFX "CPU %x online: Create Rx thread\n", cpu);
-		bnx2fc_percpu_thread_create(cpu);
-		break;
-	case CPU_DEAD:
-	case CPU_DEAD_FROZEN:
-		printk(PFX "CPU %x offline: Remove Rx thread\n", cpu);
-		bnx2fc_percpu_thread_destroy(cpu);
-		break;
-	default:
-		break;
-	}
-	return NOTIFY_OK;
-}
-
 static int bnx2fc_slave_configure(struct scsi_device *sdev)
 {
 	if (!bnx2fc_queue_depth)
@@ -2722,19 +2623,9 @@  static int __init bnx2fc_mod_init(void)
 		p = &per_cpu(bnx2fc_percpu, cpu);
 		INIT_LIST_HEAD(&p->work_list);
 		spin_lock_init(&p->fp_work_lock);
+		INIT_WORK(&p->work, bnx2fc_percpu_io_work);
 	}
 
-	cpu_notifier_register_begin();
-
-	for_each_online_cpu(cpu) {
-		bnx2fc_percpu_thread_create(cpu);
-	}
-
-	/* Initialize per CPU interrupt thread */
-	__register_hotcpu_notifier(&bnx2fc_cpu_notifier);
-
-	cpu_notifier_register_done();
-
 	cnic_register_driver(CNIC_ULP_FCOE, &bnx2fc_cnic_cb);
 
 	return 0;
@@ -2798,17 +2689,13 @@  static void __exit bnx2fc_mod_exit(void)
 	if (l2_thread)
 		kthread_stop(l2_thread);
 
-	cpu_notifier_register_begin();
+	for_each_possible_cpu(cpu) {
+		struct bnx2fc_percpu_s *p;
 
-	/* Destroy per cpu threads */
-	for_each_online_cpu(cpu) {
-		bnx2fc_percpu_thread_destroy(cpu);
+		p = per_cpu_ptr(&bnx2fc_percpu, cpu);
+		flush_work(&p->work);
 	}
 
-	__unregister_hotcpu_notifier(&bnx2fc_cpu_notifier);
-
-	cpu_notifier_register_done();
-
 	destroy_workqueue(bnx2fc_wq);
 	/*
 	 * detach from scsi transport
diff --git a/drivers/scsi/bnx2fc/bnx2fc_hwi.c b/drivers/scsi/bnx2fc/bnx2fc_hwi.c
index 28c671b609b2..acad76d3a81b 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_hwi.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_hwi.c
@@ -1046,23 +1046,19 @@  int bnx2fc_process_new_cqes(struct bnx2fc_rport *tgt)
 			struct bnx2fc_percpu_s *fps = NULL;
 			unsigned int cpu = wqe % num_possible_cpus();
 
-			fps = &per_cpu(bnx2fc_percpu, cpu);
-			spin_lock_bh(&fps->fp_work_lock);
-			if (unlikely(!fps->iothread))
-				goto unlock;
-
 			work = bnx2fc_alloc_work(tgt, wqe);
-			if (work)
+			if (work) {
+				fps = &per_cpu(bnx2fc_percpu, cpu);
+
+				spin_lock_bh(&fps->fp_work_lock);
 				list_add_tail(&work->list,
 					      &fps->work_list);
-unlock:
-			spin_unlock_bh(&fps->fp_work_lock);
-
-			/* Pending work request completion */
-			if (fps->iothread && work)
-				wake_up_process(fps->iothread);
-			else
+				spin_unlock_bh(&fps->fp_work_lock);
+				schedule_work_on(cpu, &fps->work);
+			} else {
 				bnx2fc_process_cq_compl(tgt, wqe);
+			}
+
 			num_free_sqes++;
 		}
 		cqe++;