Message ID | CAPgLHd8_WyQiDpJwRWwNVPbSvVcpMj1bMu7h2QmfJpscd0LOFQ@mail.gmail.com (mailing list archive) |
---|---|
State | Rejected |
Headers | show |
On Fri, 22 Feb 2013 10:48:12 +0800 Wei Yongjun <weiyj.lk@gmail.com> wrote: > From: Wei Yongjun <yongjun_wei@trendmicro.com.cn> > > A spin lock is taken here so we should use GFP_NOWAIT like > other case. > > .. > > --- a/drivers/infiniband/hw/ipath/ipath_driver.c > +++ b/drivers/infiniband/hw/ipath/ipath_driver.c > @@ -204,7 +204,7 @@ static struct ipath_devdata *ipath_alloc_devdata(struct pci_dev *pdev) > idr_preload(GFP_KERNEL); > spin_lock_irqsave(&ipath_devs_lock, flags); > > - ret = idr_alloc(&unit_table, dd, 0, 0, GFP_KERNEL); > + ret = idr_alloc(&unit_table, dd, 0, 0, GFP_NOWAIT); > if (ret < 0) { > printk(KERN_ERR IPATH_DRV_NAME > ": Could not allocate unit ID: error %d\n", -ret); I'd say we're entitled to use GFP_ATOMIC here: it will dip into page reserves and is less likely to fail. Given that this code is performing allocations under spin_lock_irqsave(), it should be using idr_preload()/idr_preload_end() (and perhaps even a repeat loop around those) to make the allocations more reliable. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, Andrew. On Fri, Feb 22, 2013 at 2:16 PM, Andrew Morton <akpm@linux-foundation.org> wrote: > Given that this code is performing allocations under > spin_lock_irqsave(), it should be using idr_preload()/idr_preload_end() > (and perhaps even a repeat loop around those) to make the allocations > more reliable. It's already using preload so GFP_NOWAIT is enough to guarantee GFP_KERNEL level allocation. Thanks.
On Fri, 22 Feb 2013 14:21:58 -0800 Tejun Heo <tj@kernel.org> wrote: > Hello, Andrew. > > On Fri, Feb 22, 2013 at 2:16 PM, Andrew Morton > <akpm@linux-foundation.org> wrote: > > Given that this code is performing allocations under > > spin_lock_irqsave(), it should be using idr_preload()/idr_preload_end() > > (and perhaps even a repeat loop around those) to make the allocations > > more reliable. > > It's already using preload so GFP_NOWAIT is enough to guarantee > GFP_KERNEL level allocation. > Not "guarantee". - idr_preload() might fail. It is a design error that idr_preload() returns void - it should do what radix_tree_preload() does. - even if idr_preload() completed the preload, an interrupt might come in and its handler might drain this cpu's preload pool. This all sounds like crazy will-never-happen stuff. That's what I thought about the radix-tree code when used for pagecache. But under extreme worloads, the cant-happens kept on happening, which is how all that funky preload/preempt-disable stuff occurred. It would be more robust still if the per-cpu preload magazines were per-radixtree and per-idr, rather than kernel-wide. That would reduce the risk of interrupt-time stealing. Or make foo_preload() return with interrupts disabled rather than jsut preempt_disable(). -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, Andrew. On Fri, Feb 22, 2013 at 2:44 PM, Andrew Morton <akpm@linux-foundation.org> wrote: > Not "guarantee". > > - idr_preload() might fail. It is a design error that idr_preload() > returns void - it should do what radix_tree_preload() does. That would equal to GFP_KERNEL allocation failing. The error isn't ignored. It gets propagated to the actual idr_alloc() allocation. You're just collecting the error later. > - even if idr_preload() completed the preload, an interrupt might > come in and its handler might drain this cpu's preload pool. Non-process context doesn't dip into preload pool, so the first allocation after idr_preload() is *guaranteed* to fail iff the previous idr_preload(GFP_KERNEL) failed and there just isn't much we can do other than failing if GFP_KERNEL allocation fails. Thanks.
On Fri, Feb 22, 2013 at 02:51:30PM -0800, Tejun Heo wrote: > Non-process context doesn't dip into preload pool, so the first > allocation after idr_preload() is *guaranteed* to fail iff the > previous idr_preload(GFP_KERNEL) failed and there just isn't much we > can do other than failing if GFP_KERNEL allocation fails. If it still is confusing, a good way to think about it is that idr_preload() upgrades the immediately following idr_preload() allocation flag to the stronger of the two used in idr_preload() and idr_alloc(). In practice, this means there's no reason to use anything other than GFP_NOWAIT inside preloaded section. IMO, this is a better or rather more evolved design of preloading in radix_tree. It's a lot more convenient and I'm thinking about updating radix_tree preloading to about the same scheme. It's a bit cumbersome at the moment requiring determination of whether preloading will be used or not at initialization time. Thanks.
diff --git a/drivers/infiniband/hw/ipath/ipath_driver.c b/drivers/infiniband/hw/ipath/ipath_driver.c index fcdaeea..03ed479 100644 --- a/drivers/infiniband/hw/ipath/ipath_driver.c +++ b/drivers/infiniband/hw/ipath/ipath_driver.c @@ -204,7 +204,7 @@ static struct ipath_devdata *ipath_alloc_devdata(struct pci_dev *pdev) idr_preload(GFP_KERNEL); spin_lock_irqsave(&ipath_devs_lock, flags); - ret = idr_alloc(&unit_table, dd, 0, 0, GFP_KERNEL); + ret = idr_alloc(&unit_table, dd, 0, 0, GFP_NOWAIT); if (ret < 0) { printk(KERN_ERR IPATH_DRV_NAME ": Could not allocate unit ID: error %d\n", -ret);