diff mbox

[rdma-next,V2,5/5] IB/core: Integrate IB address resolution module into core

Message ID 1462563928-29164-6-git-send-email-leon@kernel.org (mailing list archive)
State Superseded
Headers show

Commit Message

Leon Romanovsky May 6, 2016, 7:45 p.m. UTC
From: Leon Romanovsky <leonro@mellanox.com>

IB address resolution is declared as a module (ib_addr.ko) which loads
itself before IB core module (ib_core.ko).

It causes to the scenario where IB netlink which is initialized by IB
core can't be used by ib_addr.ko.

In order to solve it, we are converting ib_addr.ko to be part of
IB core module.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
---
 drivers/infiniband/core/Makefile |  6 ++----
 drivers/infiniband/core/addr.c   | 11 ++---------
 drivers/infiniband/core/device.c | 11 ++++++++++-
 include/rdma/ib_addr.h           |  3 +++
 4 files changed, 17 insertions(+), 14 deletions(-)

Comments

Doug Ledford May 13, 2016, 7:34 p.m. UTC | #1
On 05/06/2016 03:45 PM, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> IB address resolution is declared as a module (ib_addr.ko) which loads
> itself before IB core module (ib_core.ko).
> 
> It causes to the scenario where IB netlink which is initialized by IB
> core can't be used by ib_addr.ko.
> 
> In order to solve it, we are converting ib_addr.ko to be part of
> IB core module.

> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index 1097984..8894ad0 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -977,16 +977,24 @@ static int __init ib_core_init(void)
>  		goto err_comp;
>  	}
>  
> +	ret = addr_init();
> +	if (ret) {
> +		pr_warn("Could't init IB address resolution\n");
> +		goto err_sysfs;
> +	}
> +
>  	ret = ibnl_init();
>  	if (ret) {
>  		pr_warn("Couldn't init IB netlink interface\n");
> -		goto err_sysfs;
> +		goto err_addr;
>  	}

It's unclear to me how this change helps things.  Theoretically, if you
load ib_addr before ib_core (and therefore before any ibnl_init() is
run) and that prevents ib_addr from using the ibnl infrastructure, then
you would think that building ib_addr into ib_core and then running the
addr_init() before the ibnl_init() would have exactly the same problem.
I don't see what issue is resolved by this patch.  But, just as
importantly, after reading addr.c to see how it uses the ibnl
infrastructure, I don't even see what the original problem can be.
Please elaborate more on what actual problem we are solving with this
because I am loathe to change the module structure without good cause
(in particular, I know at least the Red Hat init scripts and systemd
units know about module names, so this requires changes to user space
scripts and that's a big no-no if it can be avoided).
Mark Bloch May 15, 2016, 10:51 a.m. UTC | #2
> -----Original Message-----
> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
> owner@vger.kernel.org] On Behalf Of Doug Ledford
> Sent: Friday, May 13, 2016 10:34 PM
> To: Leon Romanovsky <leon@kernel.org>
> Cc: linux-rdma@vger.kernel.org; Leon Romanovsky <leonro@mellanox.com>
> Subject: Re: [PATCH rdma-next V2 5/5] IB/core: Integrate IB address
> resolution module into core
> 
> On 05/06/2016 03:45 PM, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@mellanox.com>
> >
> > IB address resolution is declared as a module (ib_addr.ko) which loads
> > itself before IB core module (ib_core.ko).
> >
> > It causes to the scenario where IB netlink which is initialized by IB
> > core can't be used by ib_addr.ko.
> >
> > In order to solve it, we are converting ib_addr.ko to be part of
> > IB core module.
> 
> > diff --git a/drivers/infiniband/core/device.c
> b/drivers/infiniband/core/device.c
> > index 1097984..8894ad0 100644
> > --- a/drivers/infiniband/core/device.c
> > +++ b/drivers/infiniband/core/device.c
> > @@ -977,16 +977,24 @@ static int __init ib_core_init(void)
> >  		goto err_comp;
> >  	}
> >
> > +	ret = addr_init();
> > +	if (ret) {
> > +		pr_warn("Could't init IB address resolution\n");
> > +		goto err_sysfs;
> > +	}
> > +
> >  	ret = ibnl_init();
> >  	if (ret) {
> >  		pr_warn("Couldn't init IB netlink interface\n");
> > -		goto err_sysfs;
> > +		goto err_addr;
> >  	}
> 
> It's unclear to me how this change helps things.  Theoretically, if you
> load ib_addr before ib_core (and therefore before any ibnl_init() is
> run) and that prevents ib_addr from using the ibnl infrastructure, then
> you would think that building ib_addr into ib_core and then running the
> addr_init() before the ibnl_init() would have exactly the same problem.
> I don't see what issue is resolved by this patch.
You are right, this patch doesn't solve the problem of using ibnl inside ib_addr,
But this patch didn't try to do that, all it did was to integrate ib_addr into ib_core,
It tried to preserve the current order of dependencies . 

If this change is accepted, the order will be switched in the patches that add
ibnl usage inside addr.c, another option is to pull this patch with the correct order
into that series, this way we won't have to touch this area twice.

> But, just as importantly, after reading addr.c to see how it uses the ibnl
> infrastructure, I don't even see what the original problem can be.
As it stands today:
- ibnl is part of ib_core.
- ib_core needs ib_addr.

So if we add ibnl usage to ib_addr it means ib_addr will need ib_core,
which causes a dependency cycle.

> Please elaborate more on what actual problem we are solving with this
> because I am loathe to change the module structure without good cause
> (in particular, I know at least the Red Hat init scripts and systemd
> units know about module names, so this requires changes to user space
> scripts and that's a big no-no if it can be avoided).
> 
> 
> --
> Doug Ledford <dledford@redhat.com>
>               GPG KeyID: 0E572FDD
> 


Mark Bloch.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Doug Ledford May 16, 2016, 3:09 p.m. UTC | #3
On 05/15/2016 06:51 AM, Mark Bloch wrote:

>> But, just as importantly, after reading addr.c to see how it uses the ibnl
>> infrastructure, I don't even see what the original problem can be.
> As it stands today:
> - ibnl is part of ib_core.
> - ib_core needs ib_addr.
> 
> So if we add ibnl usage to ib_addr it means ib_addr will need ib_core,
> which causes a dependency cycle.

Right, but in that case, this patch needs to be part of the series that
adds the ibnl support into the ib_addr functionality.  Because you split
them into separate series, this was a patch looking for a problem to
solve and it wasn't clear what it was.  If I had taken the other series
and not this series, it would have broken things.  So please keep
patches like this together with the other patches that depend on it.

That said, I also don't want to redo modules if we don't have to.  As my
previous email points out, changing modules breaks init scripts and
systemd unit files.  It is to be avoided when possible.
Leon Romanovsky May 16, 2016, 4:30 p.m. UTC | #4
On Mon, May 16, 2016 at 11:09:44AM -0400, Doug Ledford wrote:
> On 05/15/2016 06:51 AM, Mark Bloch wrote:
> 
> >> But, just as importantly, after reading addr.c to see how it uses the ibnl
> >> infrastructure, I don't even see what the original problem can be.
> > As it stands today:
> > - ibnl is part of ib_core.
> > - ib_core needs ib_addr.
> > 
> > So if we add ibnl usage to ib_addr it means ib_addr will need ib_core,
> > which causes a dependency cycle.
> 
> Right, but in that case, this patch needs to be part of the series that
> adds the ibnl support into the ib_addr functionality.  Because you split
> them into separate series, this was a patch looking for a problem to
> solve and it wasn't clear what it was.  If I had taken the other series
> and not this series, it would have broken things.  So please keep
> patches like this together with the other patches that depend on it.
> 
> That said, I also don't want to redo modules if we don't have to.  As my
> previous email points out, changing modules breaks init scripts and
> systemd unit files.  It is to be avoided when possible.

Sorry,
I was in the mood of fixing things when I wrote and sent this patch.
The question is which version will you more likely to accept: this one
(remove ib_addr module) or previous one (add ib_netlink module)?

Thanks.
> 
> 
> -- 
> Doug Ledford <dledford@redhat.com>
>               GPG KeyID: 0E572FDD
> 
>
Doug Ledford May 16, 2016, 5:42 p.m. UTC | #5
On 05/16/2016 12:30 PM, Leon Romanovsky wrote:
> On Mon, May 16, 2016 at 11:09:44AM -0400, Doug Ledford wrote:
>> On 05/15/2016 06:51 AM, Mark Bloch wrote:
>>
>>>> But, just as importantly, after reading addr.c to see how it uses the ibnl
>>>> infrastructure, I don't even see what the original problem can be.
>>> As it stands today:
>>> - ibnl is part of ib_core.
>>> - ib_core needs ib_addr.
>>>
>>> So if we add ibnl usage to ib_addr it means ib_addr will need ib_core,
>>> which causes a dependency cycle.
>>
>> Right, but in that case, this patch needs to be part of the series that
>> adds the ibnl support into the ib_addr functionality.  Because you split
>> them into separate series, this was a patch looking for a problem to
>> solve and it wasn't clear what it was.  If I had taken the other series
>> and not this series, it would have broken things.  So please keep
>> patches like this together with the other patches that depend on it.
>>
>> That said, I also don't want to redo modules if we don't have to.  As my
>> previous email points out, changing modules breaks init scripts and
>> systemd unit files.  It is to be avoided when possible.
> 
> Sorry,
> I was in the mood of fixing things when I wrote and sent this patch.
> The question is which version will you more likely to accept: this one
> (remove ib_addr module) or previous one (add ib_netlink module)?

Can you build netlink in and then init the ib_addr module after the
netlink init is complete?  Wouldn't that resolve the dependency ordering
issue without changing the module names?
Jason Gunthorpe May 16, 2016, 6:27 p.m. UTC | #6
On Mon, May 16, 2016 at 01:42:34PM -0400, Doug Ledford wrote:
> Can you build netlink in and then init the ib_addr module after the
> netlink init is complete?  Wouldn't that resolve the dependency ordering
> issue without changing the module names?

Why are you so worried about ib_addr's module name? Is that actually
hand loaded instead of relying in the implicit dependencies???

Is failure to load a module going to break any existing scripts?

Worse comes to worse, just make dummy empty ib_addr.ko, but
I've never seen that in-kernel before.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Doug Ledford May 16, 2016, 6:39 p.m. UTC | #7
On 05/16/2016 02:27 PM, Jason Gunthorpe wrote:
> On Mon, May 16, 2016 at 01:42:34PM -0400, Doug Ledford wrote:
>> Can you build netlink in and then init the ib_addr module after the
>> netlink init is complete?  Wouldn't that resolve the dependency ordering
>> issue without changing the module names?
> 
> Why are you so worried about ib_addr's module name? Is that actually
> hand loaded instead of relying in the implicit dependencies???

It's hand unloaded.  Depending on the release of software you are
talking about, the init script has the ability to unload/reload the IB
stack.  In that case, it has to know about ib_addr so that it can unload
after ib_core.  I should know, when the order changed from ib_core first
and ib_addr second to the opposite it broke the Red Hat init script for
a release, including all the bug reports about the rdma stack failing to
shut down properly because it was unloading modules in the wrong order.
This falls in the same category of a lot of other things: you don't
break user space if you don't have to.

> Is failure to load a module going to break any existing scripts?
> 
> Worse comes to worse, just make dummy empty ib_addr.ko, but
> I've never seen that in-kernel before.
> 
> Jason
>
Leon Romanovsky May 16, 2016, 6:54 p.m. UTC | #8
On Mon, May 16, 2016 at 01:42:34PM -0400, Doug Ledford wrote:
> On 05/16/2016 12:30 PM, Leon Romanovsky wrote:
> > On Mon, May 16, 2016 at 11:09:44AM -0400, Doug Ledford wrote:
> >> On 05/15/2016 06:51 AM, Mark Bloch wrote:
> >>
> >>>> But, just as importantly, after reading addr.c to see how it uses the ibnl
> >>>> infrastructure, I don't even see what the original problem can be.
> >>> As it stands today:
> >>> - ibnl is part of ib_core.
> >>> - ib_core needs ib_addr.
> >>>
> >>> So if we add ibnl usage to ib_addr it means ib_addr will need ib_core,
> >>> which causes a dependency cycle.
> >>
> >> Right, but in that case, this patch needs to be part of the series that
> >> adds the ibnl support into the ib_addr functionality.  Because you split
> >> them into separate series, this was a patch looking for a problem to
> >> solve and it wasn't clear what it was.  If I had taken the other series
> >> and not this series, it would have broken things.  So please keep
> >> patches like this together with the other patches that depend on it.
> >>
> >> That said, I also don't want to redo modules if we don't have to.  As my
> >> previous email points out, changing modules breaks init scripts and
> >> systemd unit files.  It is to be avoided when possible.
> > 
> > Sorry,
> > I was in the mood of fixing things when I wrote and sent this patch.
> > The question is which version will you more likely to accept: this one
> > (remove ib_addr module) or previous one (add ib_netlink module)?
> 
> Can you build netlink in and then init the ib_addr module after the
> netlink init is complete?  Wouldn't that resolve the dependency ordering
> issue without changing the module names?

It seems reasonable and we will test it, before reposting.

However generally speaking, I agree with Jason and Ira that this module
(ib_addr.ko) is useless as module and can be part of ib_core.ko.

It doesn't seem as a big deal to fix all that init scripts (remove two
lines).

> 
> 
> -- 
> Doug Ledford <dledford@redhat.com>
>               GPG KeyID: 0E572FDD
> 
>
Doug Ledford May 16, 2016, 8:27 p.m. UTC | #9
On 05/16/2016 02:54 PM, Leon Romanovsky wrote:
> On Mon, May 16, 2016 at 01:42:34PM -0400, Doug Ledford wrote:
>> On 05/16/2016 12:30 PM, Leon Romanovsky wrote:
>>> On Mon, May 16, 2016 at 11:09:44AM -0400, Doug Ledford wrote:
>>>> On 05/15/2016 06:51 AM, Mark Bloch wrote:
>>>>
>>>>>> But, just as importantly, after reading addr.c to see how it uses the ibnl
>>>>>> infrastructure, I don't even see what the original problem can be.
>>>>> As it stands today:
>>>>> - ibnl is part of ib_core.
>>>>> - ib_core needs ib_addr.
>>>>>
>>>>> So if we add ibnl usage to ib_addr it means ib_addr will need ib_core,
>>>>> which causes a dependency cycle.
>>>>
>>>> Right, but in that case, this patch needs to be part of the series that
>>>> adds the ibnl support into the ib_addr functionality.  Because you split
>>>> them into separate series, this was a patch looking for a problem to
>>>> solve and it wasn't clear what it was.  If I had taken the other series
>>>> and not this series, it would have broken things.  So please keep
>>>> patches like this together with the other patches that depend on it.
>>>>
>>>> That said, I also don't want to redo modules if we don't have to.  As my
>>>> previous email points out, changing modules breaks init scripts and
>>>> systemd unit files.  It is to be avoided when possible.
>>>
>>> Sorry,
>>> I was in the mood of fixing things when I wrote and sent this patch.
>>> The question is which version will you more likely to accept: this one
>>> (remove ib_addr module) or previous one (add ib_netlink module)?
>>
>> Can you build netlink in and then init the ib_addr module after the
>> netlink init is complete?  Wouldn't that resolve the dependency ordering
>> issue without changing the module names?
> 
> It seems reasonable and we will test it, before reposting.

Thanks.

> However generally speaking, I agree with Jason and Ira that this module
> (ib_addr.ko) is useless as module and can be part of ib_core.ko.
> 
> It doesn't seem as a big deal to fix all that init scripts (remove two
> lines).

It's not removing two lines.  If someone wants to be able to boot both
kernels prior to this change and kernel post this change, they have to
now write their script to handle both conditions sanely.  And they have
to know to do so, where as very often they simply get caught off guard
and things are broken for some period of time.  If the change to ib_addr
actually solved a problem that couldn't be solved otherwise, that would
be one thing.  But it doesn't.  It's a complete no-op.
Jason Gunthorpe May 16, 2016, 9:03 p.m. UTC | #10
> and things are broken for some period of time.  If the change to ib_addr
> actually solved a problem that couldn't be solved otherwise, that would
> be one thing.  But it doesn't.  It's a complete no-op.

The original patches had a horrible dynamic registration scheme for
the netlink dispatch that only served to work around the pointless
module split. The issue is that the dispatch needs to have function
relocations from both modules to follow the usual netlink idioms.

It is unfortunate that distro scripts have become such that they break
if the kernel changes even something small like dependencies.

Another alternative is to move only the parts of ib_addr that touch
netlink into ib_core. Hopefully that doesn't create an empty module.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark Bloch May 17, 2016, 5 a.m. UTC | #11
> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgunthorpe@obsidianresearch.com]
> Sent: Tuesday, May 17, 2016 12:03 AM
> To: Doug Ledford <dledford@redhat.com>
> Cc: leon@kernel.org; Mark Bloch <markb@mellanox.com>; linux-
> rdma@vger.kernel.org
> Subject: Re: [PATCH rdma-next V2 5/5] IB/core: Integrate IB address
> resolution module into core
> 
> > and things are broken for some period of time.  If the change to ib_addr
> > actually solved a problem that couldn't be solved otherwise, that would
> > be one thing.  But it doesn't.  It's a complete no-op.
> 
> The original patches had a horrible dynamic registration scheme for
> the netlink dispatch that only served to work around the pointless
> module split. The issue is that the dispatch needs to have function
> relocations from both modules to follow the usual netlink idioms.
Are you talking about  the changes in ibnl_add_client?
Those changes were because of a different module (ib_sa.ko)
In sa_query.c we register a ibnl client for RDMA_NL_LS and
we use that for getting path records (ibacm runs in userspace and listens 
on that netlink socket). If we look at the definition of RDMA_NS_LS
it is defined as a  rdma local service operations, so naturedly
I feel  ip->gid should be added to that family.

It means we have two modules (ib_sa.ko and ib_core/ib_addr depends if we integrate them or not)
Needing to use different operations which are part of the same family.

The approach I've chosen is the easiest to implement because I'm not sure we
need all the complexity other solutions impose. I think Ira suggested to introduce "add operation"
it's an idea I've thought about, but it raises the issue of ibnl needing to manage each family.
This idea has the benefit if we want to have an overlap between clients, can you think
of a use case where different clients need to support the same operation?

I should also mention the option of adding a new family RDMA_NL_LS_IP or something 
like that. I didn't do that because we are using ibacm to answer the ip->gid requests.
Ibacm already listens on a netlink socket for RDMA_NS_LS, so it makes adding
Ip->gid to ibacm much easier and nicer if it needs to listen on only one netlink socket.

> It is unfortunate that distro scripts have become such that they break
> if the kernel changes even something small like dependencies.
> 
> Another alternative is to move only the parts of ib_addr that touch
> netlink into ib_core. Hopefully that doesn't create an empty module.
> 
> Jason

Mark
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig May 17, 2016, 5:46 a.m. UTC | #12
On Mon, May 16, 2016 at 03:03:27PM -0600, Jason Gunthorpe wrote:
> It is unfortunate that distro scripts have become such that they break
> if the kernel changes even something small like dependencies.

What script exactly break, btw?  We had quite a few module merges /
split in other subsystems I contribute to, and they've not been major
issues.


> Another alternative is to move only the parts of ib_addr that touch
> netlink into ib_core. Hopefully that doesn't create an empty module.

Or just create a dummy ib_addr module IFF it really is an issue.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leon Romanovsky May 17, 2016, 5:48 a.m. UTC | #13
On Mon, May 16, 2016 at 04:27:05PM -0400, Doug Ledford wrote:
> On 05/16/2016 02:54 PM, Leon Romanovsky wrote:
> > On Mon, May 16, 2016 at 01:42:34PM -0400, Doug Ledford wrote:
> >> On 05/16/2016 12:30 PM, Leon Romanovsky wrote:
> >>> On Mon, May 16, 2016 at 11:09:44AM -0400, Doug Ledford wrote:
> >>>> On 05/15/2016 06:51 AM, Mark Bloch wrote:
> >>>>
> >>>>>> But, just as importantly, after reading addr.c to see how it uses the ibnl
> >>>>>> infrastructure, I don't even see what the original problem can be.
> >>>>> As it stands today:
> >>>>> - ibnl is part of ib_core.
> >>>>> - ib_core needs ib_addr.
> >>>>>
> >>>>> So if we add ibnl usage to ib_addr it means ib_addr will need ib_core,
> >>>>> which causes a dependency cycle.
> >>>>
> >>>> Right, but in that case, this patch needs to be part of the series that
> >>>> adds the ibnl support into the ib_addr functionality.  Because you split
> >>>> them into separate series, this was a patch looking for a problem to
> >>>> solve and it wasn't clear what it was.  If I had taken the other series
> >>>> and not this series, it would have broken things.  So please keep
> >>>> patches like this together with the other patches that depend on it.
> >>>>
> >>>> That said, I also don't want to redo modules if we don't have to.  As my
> >>>> previous email points out, changing modules breaks init scripts and
> >>>> systemd unit files.  It is to be avoided when possible.
> >>>
> >>> Sorry,
> >>> I was in the mood of fixing things when I wrote and sent this patch.
> >>> The question is which version will you more likely to accept: this one
> >>> (remove ib_addr module) or previous one (add ib_netlink module)?
> >>
> >> Can you build netlink in and then init the ib_addr module after the
> >> netlink init is complete?  Wouldn't that resolve the dependency ordering
> >> issue without changing the module names?
> > 
> > It seems reasonable and we will test it, before reposting.
> 
> Thanks.
> 
> > However generally speaking, I agree with Jason and Ira that this module
> > (ib_addr.ko) is useless as module and can be part of ib_core.ko.
> > 
> > It doesn't seem as a big deal to fix all that init scripts (remove two
> > lines).
> 
> It's not removing two lines.  If someone wants to be able to boot both
> kernels prior to this change and kernel post this change, they have to
> now write their script to handle both conditions sanely.  And they have
> to know to do so, where as very often they simply get caught off guard
> and things are broken for some period of time.  If the change to ib_addr
> actually solved a problem that couldn't be solved otherwise, that would
> be one thing.  But it doesn't.  It's a complete no-op.

I think that your position is too tight for not allowing modules
restructure. It looks like that this complexity doesn't exist outside
testing scripts.

For my testings, I'm using Debian Sid with inbox scripts/packages and it handles
correctly IB stack with different kernels and different set of modules.

> 
> 
> -- 
> Doug Ledford <dledford@redhat.com>
>               GPG KeyID: 0E572FDD
> 
>
Leon Romanovsky May 17, 2016, 5:51 a.m. UTC | #14
On Mon, May 16, 2016 at 10:46:52PM -0700, Christoph Hellwig wrote:
> On Mon, May 16, 2016 at 03:03:27PM -0600, Jason Gunthorpe wrote:
> > It is unfortunate that distro scripts have become such that they break
> > if the kernel changes even something small like dependencies.
> 
> What script exactly break, btw?  We had quite a few module merges /
> split in other subsystems I contribute to, and they've not been major
> issues.

Debian Sid works like a charm with/without ib_addr.ko as a separate
module. It looks like breakage will be in testing scripts only.

> 
> 
> > Another alternative is to move only the parts of ib_addr that touch
> > netlink into ib_core. Hopefully that doesn't create an empty module.
> 
> Or just create a dummy ib_addr module IFF it really is an issue.
Mark Bloch May 17, 2016, 8:29 a.m. UTC | #15
> -----Original Message-----
> From: Doug Ledford [mailto:dledford@redhat.com]
> Sent: Monday, May 16, 2016 8:43 PM
> To: leon@kernel.org
> Cc: Mark Bloch <markb@mellanox.com>; linux-rdma@vger.kernel.org
> Subject: Re: [PATCH rdma-next V2 5/5] IB/core: Integrate IB address
> resolution module into core
> Can you build netlink in and then init the ib_addr module after the
> netlink init is complete?  Wouldn't that resolve the dependency ordering
> issue without changing the module names?
Sorry, but I don't understand what do you mean by this.
If you would like to keep the current module structure (ib_core.ko and ib_addr.ko)
The only way to use ibnl from within addr.c is to move the ibnl initialization to addr.c
and build netlink.c as part of ib_addr.ko, but it seems kind of strange if
ib_addr is responsible to initialize ibnl.

> 
> --
> Doug Ledford <dledford@redhat.com>
>               GPG KeyID: 0E572FDD
> 

Mark
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Doug Ledford May 17, 2016, 2:52 p.m. UTC | #16
On 05/17/2016 01:46 AM, Christoph Hellwig wrote:
> On Mon, May 16, 2016 at 03:03:27PM -0600, Jason Gunthorpe wrote:
>> It is unfortunate that distro scripts have become such that they break
>> if the kernel changes even something small like dependencies.
> 
> What script exactly break, btw?  We had quite a few module merges /
> split in other subsystems I contribute to, and they've not been major
> issues.

The Red Hat SysV Init script for the RDMA subsystem on both RHEL and
Fedora (prior to the switch over to systemd for the RDMA init, which
means EL6 and earlier and I'm not sure when we switched in Fedora, but
I'm sure the old SysV init version of the Fedora rdma package is already
EOL) and the OFED SysV init script are the two for certain that I know of.

The RDMA subsystem is a bit unique in that the older SysV init scripts
allowed a reload operation, and that required removing all the modules
and then reinserting them.  The more modern systemd unit did away with
reload as an option and so it would work fine with the module name
change (although it would barf on other more important modules if they
changed names, primarily the end modules, such as ucma and friends).

I can't speak to Debian.  Among other issues, the only reason this would
even possibly be an issue for seemingly older OS releases is if the OS
in question still does either an RDMA stack update in a stable kernel
(like Red Hat does with all of its releases, including EL 6) or if the
older release gets newer kernels (like we do with Fedora, but our Fedora
releases short lived and so this isn't a problem there, but I have no
knowledge of Debian releases).

> 
>> Another alternative is to move only the parts of ib_addr that touch
>> netlink into ib_core. Hopefully that doesn't create an empty module.
> 
> Or just create a dummy ib_addr module IFF it really is an issue.

This is an awful lot of arguing over seemingly nothing.  These things
were true when I rejected the first patch:

1) It didn't solve any problem.  By code inspection, it wouldn't even
work to solve the problem that the commit message says it was supposed
to solve.
2) It was part of a series that was unrelated to the issue it was
supposed to help instead of part of the series that supposedly needed it.
3) It needlessly broke user space scripts (for systems that get updated
to modern upstream, which, believe it or not, EL 6 still is included in
that...the EL 6.8 update that just went out includes an upstream 4.1
RDMA core subsystem).

If you want me to take the patch, then one of 1? and 2 need to be true:

1a) It needs to actually solve a problem in the patch itself (and the
git log should reflect the problem it solves).

1b) It can solve a problem for some other code in the same series as it.

2) The solution must at least be more elegant or appropriate than
alternative solutions that don't break user space.

In short, give me actual justification and I'll take it.  After all,
breaking older init scripts is a pretty minor user space break.  But I
need *some* justification.
Doug Ledford May 17, 2016, 2:58 p.m. UTC | #17
On 05/17/2016 01:48 AM, Leon Romanovsky wrote:
> On Mon, May 16, 2016 at 04:27:05PM -0400, Doug Ledford wrote:

>> It's not removing two lines.  If someone wants to be able to boot both
>> kernels prior to this change and kernel post this change, they have to
>> now write their script to handle both conditions sanely.  And they have
>> to know to do so, where as very often they simply get caught off guard
>> and things are broken for some period of time.  If the change to ib_addr
>> actually solved a problem that couldn't be solved otherwise, that would
>> be one thing.  But it doesn't.  It's a complete no-op.
> 
> I think that your position is too tight for not allowing modules
> restructure. It looks like that this complexity doesn't exist outside
> testing scripts.

No, they were/are production scripts.

> For my testings, I'm using Debian Sid with inbox scripts/packages and it handles
> correctly IB stack with different kernels and different set of modules.

Does it support reloading the stack and have you tried it if it does?
Jason Gunthorpe May 17, 2016, 4:56 p.m. UTC | #18
On Tue, May 17, 2016 at 05:00:03AM +0000, Mark Bloch wrote:

> Are you talking about  the changes in ibnl_add_client?
> Those changes were because of a different module (ib_sa.ko)

Bleck, why is this such a mess? Can't you get everything into one
module so we don't need this dynamic stuff??

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leon Romanovsky May 18, 2016, 4:41 a.m. UTC | #19
On Mon, May 16, 2016 at 02:39:10PM -0400, Doug Ledford wrote:
> On 05/16/2016 02:27 PM, Jason Gunthorpe wrote:
> > On Mon, May 16, 2016 at 01:42:34PM -0400, Doug Ledford wrote:
> >> Can you build netlink in and then init the ib_addr module after the
> >> netlink init is complete?  Wouldn't that resolve the dependency ordering
> >> issue without changing the module names?
> > 
> > Why are you so worried about ib_addr's module name? Is that actually
> > hand loaded instead of relying in the implicit dependencies???
> 
> It's hand unloaded.  Depending on the release of software you are
> talking about, the init script has the ability to unload/reload the IB
> stack.  In that case, it has to know about ib_addr so that it can unload
> after ib_core.  I should know, when the order changed from ib_core first
> and ib_addr second to the opposite it broke the Red Hat init script for
> a release, including all the bug reports about the rdma stack failing to
> shut down properly because it was unloading modules in the wrong order.
> This falls in the same category of a lot of other things: you don't
> break user space if you don't have to.

I think the reason that this thread fails to die is our trouble to
understand if code refactoring is enough justification to get rid of all
these modules.

All guides/scripts which I saw, instructs to load all these ib_...
modules together and it looks redundant to have such large number of
them [1].

[1]
http://www.rdmamojo.com/2014/11/08/working-rdma-ubuntu/#Starting_the_RDMA_services

> 
> > Is failure to load a module going to break any existing scripts?
> > 
> > Worse comes to worse, just make dummy empty ib_addr.ko, but
> > I've never seen that in-kernel before.
> > 
> > Jason
> > 
> 
> 
> -- 
> Doug Ledford <dledford@redhat.com>
>               GPG KeyID: 0E572FDD
> 
>
Leon Romanovsky May 18, 2016, 4:47 a.m. UTC | #20
On Tue, May 17, 2016 at 10:58:19AM -0400, Doug Ledford wrote:
> On 05/17/2016 01:48 AM, Leon Romanovsky wrote:
> > On Mon, May 16, 2016 at 04:27:05PM -0400, Doug Ledford wrote:
> 
> >> It's not removing two lines.  If someone wants to be able to boot both
> >> kernels prior to this change and kernel post this change, they have to
> >> now write their script to handle both conditions sanely.  And they have
> >> to know to do so, where as very often they simply get caught off guard
> >> and things are broken for some period of time.  If the change to ib_addr
> >> actually solved a problem that couldn't be solved otherwise, that would
> >> be one thing.  But it doesn't.  It's a complete no-op.
> > 
> > I think that your position is too tight for not allowing modules
> > restructure. It looks like that this complexity doesn't exist outside
> > testing scripts.
> 
> No, they were/are production scripts.
> 
> > For my testings, I'm using Debian Sid with inbox scripts/packages and it handles
> > correctly IB stack with different kernels and different set of modules.
> 
> Does it support reloading the stack and have you tried it if it does?

It is RedHat specific, from my understanding in other distribution there is no
special script to reload RDMA, I wrote for myself module reload script
for debug.

> 
> -- 
> Doug Ledford <dledford@redhat.com>
>               GPG KeyID: 0E572FDD
> 
>
Doug Ledford May 18, 2016, 2:20 p.m. UTC | #21
On 05/18/2016 12:41 AM, Leon Romanovsky wrote:
> On Mon, May 16, 2016 at 02:39:10PM -0400, Doug Ledford wrote:
>> On 05/16/2016 02:27 PM, Jason Gunthorpe wrote:
>>> On Mon, May 16, 2016 at 01:42:34PM -0400, Doug Ledford wrote:
>>>> Can you build netlink in and then init the ib_addr module after the
>>>> netlink init is complete?  Wouldn't that resolve the dependency ordering
>>>> issue without changing the module names?
>>>
>>> Why are you so worried about ib_addr's module name? Is that actually
>>> hand loaded instead of relying in the implicit dependencies???
>>
>> It's hand unloaded.  Depending on the release of software you are
>> talking about, the init script has the ability to unload/reload the IB
>> stack.  In that case, it has to know about ib_addr so that it can unload
>> after ib_core.  I should know, when the order changed from ib_core first
>> and ib_addr second to the opposite it broke the Red Hat init script for
>> a release, including all the bug reports about the rdma stack failing to
>> shut down properly because it was unloading modules in the wrong order.
>> This falls in the same category of a lot of other things: you don't
>> break user space if you don't have to.
> 
> I think the reason that this thread fails to die is our trouble to
> understand if code refactoring is enough justification to get rid of all
> these modules.
> 
> All guides/scripts which I saw, instructs to load all these ib_...
> modules together and it looks redundant to have such large number of
> them [1].
> 
> [1]
> http://www.rdmamojo.com/2014/11/08/working-rdma-ubuntu/#Starting_the_RDMA_services

I've never heard of rdmamojo before...interesting.

Anyway, it probably is true that the number of modules is superfluous.
I'm not arguing against that.

>>
>>> Is failure to load a module going to break any existing scripts?
>>>
>>> Worse comes to worse, just make dummy empty ib_addr.ko, but
>>> I've never seen that in-kernel before.
>>>
>>> Jason
>>>
>>
>>
>> -- 
>> Doug Ledford <dledford@redhat.com>
>>               GPG KeyID: 0E572FDD
>>
>>
> 
>
Doug Ledford May 18, 2016, 2:20 p.m. UTC | #22
On 05/18/2016 12:47 AM, Leon Romanovsky wrote:
> On Tue, May 17, 2016 at 10:58:19AM -0400, Doug Ledford wrote:
>> On 05/17/2016 01:48 AM, Leon Romanovsky wrote:
>>> On Mon, May 16, 2016 at 04:27:05PM -0400, Doug Ledford wrote:
>>
>>>> It's not removing two lines.  If someone wants to be able to boot both
>>>> kernels prior to this change and kernel post this change, they have to
>>>> now write their script to handle both conditions sanely.  And they have
>>>> to know to do so, where as very often they simply get caught off guard
>>>> and things are broken for some period of time.  If the change to ib_addr
>>>> actually solved a problem that couldn't be solved otherwise, that would
>>>> be one thing.  But it doesn't.  It's a complete no-op.
>>>
>>> I think that your position is too tight for not allowing modules
>>> restructure. It looks like that this complexity doesn't exist outside
>>> testing scripts.
>>
>> No, they were/are production scripts.
>>
>>> For my testings, I'm using Debian Sid with inbox scripts/packages and it handles
>>> correctly IB stack with different kernels and different set of modules.
>>
>> Does it support reloading the stack and have you tried it if it does?
> 
> It is RedHat specific, from my understanding in other distribution there is no
> special script to reload RDMA, I wrote for myself module reload script
> for debug.

Then I guess I was just nice to developers back in the day.
Mark Bloch May 18, 2016, 3:09 p.m. UTC | #23
> -----Original Message-----
> From: Jason Gunthorpe [mailto:jgunthorpe@obsidianresearch.com]
> Sent: Tuesday, May 17, 2016 7:57 PM
> To: Mark Bloch <markb@mellanox.com>
> Cc: Doug Ledford <dledford@redhat.com>; leon@kernel.org; linux-
> rdma@vger.kernel.org
> Subject: Re: [PATCH rdma-next V2 5/5] IB/core: Integrate IB address
> resolution module into core
> 
> On Tue, May 17, 2016 at 05:00:03AM +0000, Mark Bloch wrote:
> 
> > Are you talking about  the changes in ibnl_add_client?
> > Those changes were because of a different module (ib_sa.ko)
> 
> Bleck, why is this such a mess? Can't you get everything into one
> module so we don't need this dynamic stuff??
Well, seeing as ib_sa.ko doesn't have to be loaded we will still need
A dynamic part somewhere, to handle the registration/deregistration.
We will probably need a new module to manage that.
I think this is doable, but do we need it?

I agree there might be a more elegant solution, but I don't see any other
modules that will use it.  Do we have a use case other than ib_sa/ib_core?

Mark

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index f818538..2c6dc6b 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -2,7 +2,7 @@  infiniband-$(CONFIG_INFINIBAND_ADDR_TRANS)	:= rdma_cm.o
 user_access-$(CONFIG_INFINIBAND_ADDR_TRANS)	:= rdma_ucm.o
 
 obj-$(CONFIG_INFINIBAND) +=		ib_core.o ib_mad.o ib_sa.o \
-					ib_cm.o iw_cm.o ib_addr.o \
+					ib_cm.o iw_cm.o \
 					$(infiniband-y)
 obj-$(CONFIG_INFINIBAND_USER_MAD) +=	ib_umad.o
 obj-$(CONFIG_INFINIBAND_USER_ACCESS) +=	ib_uverbs.o ib_ucm.o \
@@ -10,7 +10,7 @@  obj-$(CONFIG_INFINIBAND_USER_ACCESS) +=	ib_uverbs.o ib_ucm.o \
 
 ib_core-y :=			packer.o ud_header.o verbs.o cq.o sysfs.o \
 				device.o fmr_pool.o cache.o netlink.o \
-				roce_gid_mgmt.o
+				roce_gid_mgmt.o addr.o
 ib_core-$(CONFIG_INFINIBAND_USER_MEM) += umem.o
 ib_core-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += umem_odp.o umem_rbtree.o
 
@@ -28,8 +28,6 @@  rdma_cm-$(CONFIG_INFINIBAND_ADDR_TRANS_CONFIGFS) += cma_configfs.o
 
 rdma_ucm-y :=			ucma.o
 
-ib_addr-y :=			addr.o
-
 ib_umad-y :=			user_mad.o
 
 ib_ucm-y :=			ucm.o
diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index 337353d..3a203ee 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -47,10 +47,6 @@ 
 #include <rdma/ib_addr.h>
 #include <rdma/ib.h>
 
-MODULE_AUTHOR("Sean Hefty");
-MODULE_DESCRIPTION("IB Address Translation");
-MODULE_LICENSE("Dual BSD/GPL");
-
 struct addr_req {
 	struct list_head list;
 	struct sockaddr_storage src_addr;
@@ -634,7 +630,7 @@  static struct notifier_block nb = {
 	.notifier_call = netevent_callback
 };
 
-static int __init addr_init(void)
+int addr_init(void)
 {
 	addr_wq = create_singlethread_workqueue("ib_addr");
 	if (!addr_wq)
@@ -645,12 +641,9 @@  static int __init addr_init(void)
 	return 0;
 }
 
-static void __exit addr_cleanup(void)
+void addr_cleanup(void)
 {
 	rdma_addr_unregister_client(&self);
 	unregister_netevent_notifier(&nb);
 	destroy_workqueue(addr_wq);
 }
-
-module_init(addr_init);
-module_exit(addr_cleanup);
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 1097984..8894ad0 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -977,16 +977,24 @@  static int __init ib_core_init(void)
 		goto err_comp;
 	}
 
+	ret = addr_init();
+	if (ret) {
+		pr_warn("Could't init IB address resolution\n");
+		goto err_sysfs;
+	}
+
 	ret = ibnl_init();
 	if (ret) {
 		pr_warn("Couldn't init IB netlink interface\n");
-		goto err_sysfs;
+		goto err_addr;
 	}
 
 	ib_cache_setup();
 
 	return 0;
 
+err_addr:
+	addr_cleanup();
 err_sysfs:
 	class_unregister(&ib_class);
 err_comp:
@@ -1000,6 +1008,7 @@  static void __exit ib_core_cleanup(void)
 {
 	ib_cache_cleanup();
 	ibnl_cleanup();
+	addr_cleanup();
 	class_unregister(&ib_class);
 	destroy_workqueue(ib_comp_wq);
 	/* Make sure that any pending umem accounting work is done. */
diff --git a/include/rdma/ib_addr.h b/include/rdma/ib_addr.h
index 931a47b..e276a07 100644
--- a/include/rdma/ib_addr.h
+++ b/include/rdma/ib_addr.h
@@ -54,6 +54,9 @@  struct rdma_addr_client {
 	struct completion comp;
 };
 
+int addr_init(void);
+void addr_cleanup(void);
+
 /**
  * rdma_addr_register_client - Register an address client.
  */