diff mbox

[-next,v3] IB/ipath: use GFP_NOWAIT under spin lock

Message ID CAPgLHd8_WyQiDpJwRWwNVPbSvVcpMj1bMu7h2QmfJpscd0LOFQ@mail.gmail.com (mailing list archive)
State Rejected
Headers show

Commit Message

Wei Yongjun Feb. 22, 2013, 2:48 a.m. UTC
From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>

A spin lock is taken here so we should use GFP_NOWAIT like
other case.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Acked-by: Tejun Heo <tj@kernel.org>
---
v2 -> v3: add acked and cc akpm@linux-foundation.org
---
 drivers/infiniband/hw/ipath/ipath_driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Andrew Morton Feb. 22, 2013, 10:16 p.m. UTC | #1
On Fri, 22 Feb 2013 10:48:12 +0800
Wei Yongjun <weiyj.lk@gmail.com> wrote:

> From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
> 
> A spin lock is taken here so we should use GFP_NOWAIT like
> other case.
> 
> ..
>
> --- a/drivers/infiniband/hw/ipath/ipath_driver.c
> +++ b/drivers/infiniband/hw/ipath/ipath_driver.c
> @@ -204,7 +204,7 @@ static struct ipath_devdata *ipath_alloc_devdata(struct pci_dev *pdev)
>  	idr_preload(GFP_KERNEL);
>  	spin_lock_irqsave(&ipath_devs_lock, flags);
>  
> -	ret = idr_alloc(&unit_table, dd, 0, 0, GFP_KERNEL);
> +	ret = idr_alloc(&unit_table, dd, 0, 0, GFP_NOWAIT);
>  	if (ret < 0) {
>  		printk(KERN_ERR IPATH_DRV_NAME
>  		       ": Could not allocate unit ID: error %d\n", -ret);

I'd say we're entitled to use GFP_ATOMIC here: it will dip into page
reserves and is less likely to fail.

Given that this code is performing allocations under
spin_lock_irqsave(), it should be using idr_preload()/idr_preload_end()
(and perhaps even a repeat loop around those) to make the allocations
more reliable.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo Feb. 22, 2013, 10:21 p.m. UTC | #2
Hello, Andrew.

On Fri, Feb 22, 2013 at 2:16 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> Given that this code is performing allocations under
> spin_lock_irqsave(), it should be using idr_preload()/idr_preload_end()
> (and perhaps even a repeat loop around those) to make the allocations
> more reliable.

It's already using preload so GFP_NOWAIT is enough to guarantee
GFP_KERNEL level allocation.

Thanks.
Andrew Morton Feb. 22, 2013, 10:44 p.m. UTC | #3
On Fri, 22 Feb 2013 14:21:58 -0800
Tejun Heo <tj@kernel.org> wrote:

> Hello, Andrew.
> 
> On Fri, Feb 22, 2013 at 2:16 PM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> > Given that this code is performing allocations under
> > spin_lock_irqsave(), it should be using idr_preload()/idr_preload_end()
> > (and perhaps even a repeat loop around those) to make the allocations
> > more reliable.
> 
> It's already using preload so GFP_NOWAIT is enough to guarantee
> GFP_KERNEL level allocation.
> 

Not "guarantee".

- idr_preload() might fail.  It is a design error that idr_preload()
  returns void - it should do what radix_tree_preload() does.

- even if idr_preload() completed the preload, an interrupt might
  come in and its handler might drain this cpu's preload pool.

This all sounds like crazy will-never-happen stuff.  That's what I
thought about the radix-tree code when used for pagecache.  But under
extreme worloads, the cant-happens kept on happening, which is how all
that funky preload/preempt-disable stuff occurred.

It would be more robust still if the per-cpu preload magazines were
per-radixtree and per-idr, rather than kernel-wide.  That would reduce
the risk of interrupt-time stealing.  Or make foo_preload() return with
interrupts disabled rather than jsut preempt_disable().

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo Feb. 22, 2013, 10:51 p.m. UTC | #4
Hello, Andrew.

On Fri, Feb 22, 2013 at 2:44 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> Not "guarantee".
>
> - idr_preload() might fail.  It is a design error that idr_preload()
>   returns void - it should do what radix_tree_preload() does.

That would equal to GFP_KERNEL allocation failing. The error isn't
ignored. It gets propagated to the actual idr_alloc() allocation.
You're just collecting the error later.

> - even if idr_preload() completed the preload, an interrupt might
>   come in and its handler might drain this cpu's preload pool.

Non-process context doesn't dip into preload pool, so the first
allocation after idr_preload() is *guaranteed* to fail iff the
previous idr_preload(GFP_KERNEL) failed and there just isn't much we
can do other than failing if GFP_KERNEL allocation fails.

Thanks.
Tejun Heo Feb. 22, 2013, 10:58 p.m. UTC | #5
On Fri, Feb 22, 2013 at 02:51:30PM -0800, Tejun Heo wrote:
> Non-process context doesn't dip into preload pool, so the first
> allocation after idr_preload() is *guaranteed* to fail iff the
> previous idr_preload(GFP_KERNEL) failed and there just isn't much we
> can do other than failing if GFP_KERNEL allocation fails.

If it still is confusing, a good way to think about it is that
idr_preload() upgrades the immediately following idr_preload()
allocation flag to the stronger of the two used in idr_preload() and
idr_alloc().  In practice, this means there's no reason to use
anything other than GFP_NOWAIT inside preloaded section.

IMO, this is a better or rather more evolved design of preloading in
radix_tree.  It's a lot more convenient and I'm thinking about
updating radix_tree preloading to about the same scheme.  It's a bit
cumbersome at the moment requiring determination of whether preloading
will be used or not at initialization time.

Thanks.
diff mbox

Patch

diff --git a/drivers/infiniband/hw/ipath/ipath_driver.c b/drivers/infiniband/hw/ipath/ipath_driver.c
index fcdaeea..03ed479 100644
--- a/drivers/infiniband/hw/ipath/ipath_driver.c
+++ b/drivers/infiniband/hw/ipath/ipath_driver.c
@@ -204,7 +204,7 @@  static struct ipath_devdata *ipath_alloc_devdata(struct pci_dev *pdev)
 	idr_preload(GFP_KERNEL);
 	spin_lock_irqsave(&ipath_devs_lock, flags);
 
-	ret = idr_alloc(&unit_table, dd, 0, 0, GFP_KERNEL);
+	ret = idr_alloc(&unit_table, dd, 0, 0, GFP_NOWAIT);
 	if (ret < 0) {
 		printk(KERN_ERR IPATH_DRV_NAME
 		       ": Could not allocate unit ID: error %d\n", -ret);