[-next,v3] IB/ipath: use GFP_NOWAIT under spin lock

Message ID	CAPgLHd8_WyQiDpJwRWwNVPbSvVcpMj1bMu7h2QmfJpscd0LOFQ@mail.gmail.com (mailing list archive)
State	Rejected
Headers	show Return-Path: <linux-rdma-owner@vger.kernel.org> MIME-Version: 1.0 In-Reply-To: <CAOS58YOHt7OaBHRwwqjLF8xZtVaj=BeFUXFAxW0owb4_BxC3Rg@mail.gmail.com> References: <CAPgLHd965-++Kj6HE-wjHqpZjortmmCAvApSSwiXaX6YwWncHQ@mail.gmail.com> <CAOS58YNcJ0gDmoR4M1tgJbqqzv0i=5EkoAr=OhoTSrozfw-ZtQ@mail.gmail.com> <CAOS58YOHt7OaBHRwwqjLF8xZtVaj=BeFUXFAxW0owb4_BxC3Rg@mail.gmail.com> Date: Fri, 22 Feb 2013 10:48:12 +0800 Message-ID: <CAPgLHd8_WyQiDpJwRWwNVPbSvVcpMj1bMu7h2QmfJpscd0LOFQ@mail.gmail.com> Subject: [PATCH -next v3] IB/ipath: use GFP_NOWAIT under spin lock From: Wei Yongjun <weiyj.lk@gmail.com> To: tj@kernel.org Cc: infinipath@intel.com, roland@kernel.org, sean.hefty@intel.com, hal.rosenstock@gmail.com, linux-rdma@vger.kernel.org, akpm@linux-foundation.org Content-Type: text/plain; charset=ISO-8859-1 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk

Wei Yongjun Feb. 22, 2013, 2:48 a.m. UTC

From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>

A spin lock is taken here so we should use GFP_NOWAIT like
other case.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Acked-by: Tejun Heo <tj@kernel.org>
---
v2 -> v3: add acked and cc akpm@linux-foundation.org
---
 drivers/infiniband/hw/ipath/ipath_driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Andrew Morton Feb. 22, 2013, 10:16 p.m. UTC | #1

On Fri, 22 Feb 2013 10:48:12 +0800
Wei Yongjun <weiyj.lk@gmail.com> wrote:

> From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
> 
> A spin lock is taken here so we should use GFP_NOWAIT like
> other case.
> 
> ..
>
> --- a/drivers/infiniband/hw/ipath/ipath_driver.c
> +++ b/drivers/infiniband/hw/ipath/ipath_driver.c
> @@ -204,7 +204,7 @@ static struct ipath_devdata *ipath_alloc_devdata(struct pci_dev *pdev)
>  	idr_preload(GFP_KERNEL);
>  	spin_lock_irqsave(&ipath_devs_lock, flags);
>  
> -	ret = idr_alloc(&unit_table, dd, 0, 0, GFP_KERNEL);
> +	ret = idr_alloc(&unit_table, dd, 0, 0, GFP_NOWAIT);
>  	if (ret < 0) {
>  		printk(KERN_ERR IPATH_DRV_NAME
>  		       ": Could not allocate unit ID: error %d\n", -ret);

I'd say we're entitled to use GFP_ATOMIC here: it will dip into page
reserves and is less likely to fail.

Given that this code is performing allocations under
spin_lock_irqsave(), it should be using idr_preload()/idr_preload_end()
(and perhaps even a repeat loop around those) to make the allocations
more reliable.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Tejun Heo Feb. 22, 2013, 10:21 p.m. UTC | #2

Hello, Andrew.

On Fri, Feb 22, 2013 at 2:16 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> Given that this code is performing allocations under
> spin_lock_irqsave(), it should be using idr_preload()/idr_preload_end()
> (and perhaps even a repeat loop around those) to make the allocations
> more reliable.

It's already using preload so GFP_NOWAIT is enough to guarantee
GFP_KERNEL level allocation.

Thanks.

Andrew Morton Feb. 22, 2013, 10:44 p.m. UTC | #3

On Fri, 22 Feb 2013 14:21:58 -0800
Tejun Heo <tj@kernel.org> wrote:

> Hello, Andrew.
> 
> On Fri, Feb 22, 2013 at 2:16 PM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> > Given that this code is performing allocations under
> > spin_lock_irqsave(), it should be using idr_preload()/idr_preload_end()
> > (and perhaps even a repeat loop around those) to make the allocations
> > more reliable.
> 
> It's already using preload so GFP_NOWAIT is enough to guarantee
> GFP_KERNEL level allocation.
> 

Not "guarantee".

- idr_preload() might fail.  It is a design error that idr_preload()
  returns void - it should do what radix_tree_preload() does.

- even if idr_preload() completed the preload, an interrupt might
  come in and its handler might drain this cpu's preload pool.

This all sounds like crazy will-never-happen stuff.  That's what I
thought about the radix-tree code when used for pagecache.  But under
extreme worloads, the cant-happens kept on happening, which is how all
that funky preload/preempt-disable stuff occurred.

It would be more robust still if the per-cpu preload magazines were
per-radixtree and per-idr, rather than kernel-wide.  That would reduce
the risk of interrupt-time stealing.  Or make foo_preload() return with
interrupts disabled rather than jsut preempt_disable().

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Tejun Heo Feb. 22, 2013, 10:51 p.m. UTC | #4

Hello, Andrew.

On Fri, Feb 22, 2013 at 2:44 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> Not "guarantee".
>
> - idr_preload() might fail.  It is a design error that idr_preload()
>   returns void - it should do what radix_tree_preload() does.

That would equal to GFP_KERNEL allocation failing. The error isn't
ignored. It gets propagated to the actual idr_alloc() allocation.
You're just collecting the error later.

> - even if idr_preload() completed the preload, an interrupt might
>   come in and its handler might drain this cpu's preload pool.

Non-process context doesn't dip into preload pool, so the first
allocation after idr_preload() is *guaranteed* to fail iff the
previous idr_preload(GFP_KERNEL) failed and there just isn't much we
can do other than failing if GFP_KERNEL allocation fails.

Thanks.

Tejun Heo Feb. 22, 2013, 10:58 p.m. UTC | #5

On Fri, Feb 22, 2013 at 02:51:30PM -0800, Tejun Heo wrote:
> Non-process context doesn't dip into preload pool, so the first
> allocation after idr_preload() is *guaranteed* to fail iff the
> previous idr_preload(GFP_KERNEL) failed and there just isn't much we
> can do other than failing if GFP_KERNEL allocation fails.

If it still is confusing, a good way to think about it is that
idr_preload() upgrades the immediately following idr_preload()
allocation flag to the stronger of the two used in idr_preload() and
idr_alloc().  In practice, this means there's no reason to use
anything other than GFP_NOWAIT inside preloaded section.

IMO, this is a better or rather more evolved design of preloading in
radix_tree.  It's a lot more convenient and I'm thinking about
updating radix_tree preloading to about the same scheme.  It's a bit
cumbersome at the moment requiring determination of whether preloading
will be used or not at initialization time.

Thanks.

[-next,v3] IB/ipath: use GFP_NOWAIT under spin lock

Commit Message

Comments

Patch