[rdma-core] suse: switch fully to the new udev mechanism

Message ID 71835edf-cb2a-d4f8-627e-0f60ee772fb7@suse.de (mailing list archive)
State Accepted
Commit Message

Nicolas Morey-Chaisemartin Aug. 28, 2017, 3:05 p.m. UTC
Stop using the Red Hat services and scripts; use the new udev system instead.

Signed-off-by: Nicolas Morey-Chaisemartin <NMoreyChaisemartin@suse.com>
---
Also sent as a PR on github: https://github.com/linux-rdma/rdma-core/pull/195

 suse/rdma-core.spec | 42 ------------------------------------------
 1 file changed, 42 deletions(-)

Comments

Jason Gunthorpe Aug. 28, 2017, 3:16 p.m. UTC | #1
On Mon, Aug 28, 2017 at 05:05:33PM +0200, Nicolas Morey-Chaisemartin wrote:

> -sed 's%/usr/libexec%/usr/lib%g' redhat/rdma.modules-setup.sh > %{buildroot}%{dracutlibdir}/modules.d/05rdma/module-setup.sh
> -chmod 0755 %{buildroot}%{dracutlibdir}/modules.d/05rdma/module-setup.sh

You probably still need something dracut related.

At a minimum you have to address the problem we saw in Debian: mlx5
and mlx4 need to have their rdma modules included in the initrd if
their core modules are included (e.g. anything request_module'd from
the initrd must be present).

You may need to do more if you intend to support boot over ipoib, srp,
nfs-rdma, etc., which I think the RH scripts aim to support.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicolas Morey-Chaisemartin Aug. 28, 2017, 3:23 p.m. UTC | #2
On 28/08/2017 at 17:16, Jason Gunthorpe wrote:
> On Mon, Aug 28, 2017 at 05:05:33PM +0200, Nicolas Morey-Chaisemartin wrote:
>
>> -sed 's%/usr/libexec%/usr/lib%g' redhat/rdma.modules-setup.sh > %{buildroot}%{dracutlibdir}/modules.d/05rdma/module-setup.sh
>> -chmod 0755 %{buildroot}%{dracutlibdir}/modules.d/05rdma/module-setup.sh
> You probably still need something dracut related.
>
> At a minimum you have to address the problem we saw in Debian: mlx5
> and mlx4 need to have their rdma modules included in the initrd if
> their core modules are included (e.g. anything request_module'd from
> the initrd must be present).
>
> You may need to do more if you intend to support boot over ipoib, srp,
> nfs-rdma, etc., which I think the RH scripts aim to support.
>
> Jason
These were in your mail so I assumed they could be dropped :)
I'll keep the dracut stuff then.

Nicolas
Jason Gunthorpe Aug. 28, 2017, 3:31 p.m. UTC | #3
On Mon, Aug 28, 2017 at 05:23:25PM +0200, Nicolas Morey-Chaisemartin wrote:
> On 28/08/2017 at 17:16, Jason Gunthorpe wrote:
> > On Mon, Aug 28, 2017 at 05:05:33PM +0200, Nicolas Morey-Chaisemartin wrote:
> >
> >> -sed 's%/usr/libexec%/usr/lib%g' redhat/rdma.modules-setup.sh > %{buildroot}%{dracutlibdir}/modules.d/05rdma/module-setup.sh
> >> -chmod 0755 %{buildroot}%{dracutlibdir}/modules.d/05rdma/module-setup.sh
> > You probably still need something dracut related.
> >
> > At a minimum you have to address the problem we saw in Debian: mlx5
> > and mlx4 need to have their rdma modules included in the initrd if
> > their core modules are included (e.g. anything request_module'd from
> > the initrd must be present).
> >
> > You may need to do more if you intend to support boot over ipoib, srp,
> > nfs-rdma, etc., which I think the RH scripts aim to support.

> These were in your mail so I assumed they could be dropped :)
> I'll keep the dracut stuff then.

I was not careful about what I quoted; it was just stuff in that overall area.

The existing dracut script seems to assume other things, so you may
need a new dracut script. If it is totally general then it can live in
kernel-boot.

You should also consider whether you need the udev triggers; I don't know
what the SUSE policy is. Look in debian/rdma-core.postinst and others.

Adding them allows the udev rules to trigger on package install and
makes everything start working without a reboot.
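
As a rough illustration of the trigger idea, a post-install step could replay
"change" events for devices that are already present. This is only a sketch:
the function name `rdma_udev_trigger` and the subsystem list are assumptions
made for the example, not a copy of debian/rdma-core.postinst, and whether a
SUSE scriptlet may do this at all is a policy question.

```shell
#!/bin/sh
# Sketch of a post-install helper that asks udev to re-run its rules for
# devices that already exist, so freshly installed rules take effect
# without a reboot.
# ASSUMPTION: the subsystem list is illustrative only.
rdma_udev_trigger() {
    for subsystem in infiniband infiniband_mad; do
        # A "change" event makes udev re-evaluate the rules for each
        # existing device in the subsystem. Errors are ignored so that a
        # missing or stopped udev never fails the package installation.
        udevadm trigger --subsystem-match="$subsystem" --action=change || true
    done
}
```

A spec file would call such a helper from its %post scriptlet; nothing runs
until the function is invoked.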

Jason
Nicolas Morey-Chaisemartin Aug. 29, 2017, 7:59 a.m. UTC | #4
On 28/08/2017 at 17:31, Jason Gunthorpe wrote:
> On Mon, Aug 28, 2017 at 05:23:25PM +0200, Nicolas Morey-Chaisemartin wrote:
>> On 28/08/2017 at 17:16, Jason Gunthorpe wrote:
>>> On Mon, Aug 28, 2017 at 05:05:33PM +0200, Nicolas Morey-Chaisemartin wrote:
>>>
>>>> -sed 's%/usr/libexec%/usr/lib%g' redhat/rdma.modules-setup.sh > %{buildroot}%{dracutlibdir}/modules.d/05rdma/module-setup.sh
>>>> -chmod 0755 %{buildroot}%{dracutlibdir}/modules.d/05rdma/module-setup.sh
>>> You probably still need something dracut related.
>>>
>>> At a minimum you have to address the problem we saw in Debian: mlx5
>>> and mlx4 need to have their rdma modules included in the initrd if
>>> their core modules are included (e.g. anything request_module'd from
>>> the initrd must be present).
>>>
>>> You may need to do more if you intend to support boot over ipoib, srp,
>>> nfs-rdma, etc., which I think the RH scripts aim to support.
>> These were in your mail so I assumed they could be dropped :)
>> I'll keep the dracut stuff then.
> I was not careful about what I quoted.. Just stuff in that overall area.
>
> The existing dracut script seems to assume other things, so you may
> need a new dracut script. If it is totally general then it can live in
> kernel-boot..
>
> You should also consider if you need the udev triggers, I don't know
> what suse policy is. Look in debian/rdma-core.postinst and others.
>
> Adding them allows the udev rules to trigger on package install and
> makes everything start working without a reboot.


I looked through all the scripts and I'm going to need some time to go through them and decide which we need and which we don't.
Some of those were all upstreamed at once, and I'm not sure which bugs they were fixing or whether the fixes are still needed.
I contacted Doug on the side. It would make sense to work with Red Hat to migrate to the udev system (vs rdma.service) and "clean up" those scripts so they can be shared more easily.
This also means this won't be done before v15 is out. We should keep the SUSE spec as is (it works) and work on the cleanup for v16.


Nicolas

Jason Gunthorpe Aug. 29, 2017, 3:03 p.m. UTC | #5
On Tue, Aug 29, 2017 at 09:59:34AM +0200, Nicolas Morey-Chaisemartin wrote:

> I looked through all the scripts and I'm going to need some time to go through them and decide which we need and which we don't.
> Some of those were all upstreamed at once, and I'm not sure which bugs
> they were fixing or whether the fixes are still needed.

Basically none of it is necessary today from a 'bug fix'
perspective, the bug fix stuff is all ancient history.

Here is my perspective on the RH directory:

rdma.conf
 - Obsoleted by /etc/rdma/modules/rdma/modules
   Except for 'tech preview' which is a RH concept.
rdma.fixup-mtrr.awk
 - Obsolete, supports ancient hardware, done in kernel now
rdma.ifdown-ib
rdma.ifup-ib
 - Looks like this supports RH's old 'network-scripts' system?
   Is it even compatible with suse?
rdma.kernel-init
rdma.service
rdma.udev-rules
 - This is the implementation of rdma.conf, it is obsoleted.
   The bug fix stuff is all for ancient hardware or done in
   the kernel now.
rdma.mlx4-setup.sh
rdma.mlx4.conf
rdma.mlx4.sys.modprobe
 - Mellanox says they now prefer it if the device's EEPROM is
   configured, instead of this approach. So this is old
rdma.modules-setup.sh
 - Dracut support to include more stuff in the initrd.
rdma.sriov-init
rdma.sriov-vfs
 - This creates SRIOV instances at boot. Maybe it should move
   to kernel-boot, but also unclear why we need it? doesn't
   libvirt do this nowadays?
rdma.udev-ipoib-naming.rules
 - This is a user example for udev rules..
   Could go into kernel-boot

If suse never shipped this stuff before then there is no reason
to rush to add it now..

> I contacted Doug on the side. It would make sense to work with Red
> Hat to migrate to the udev system (vs rdma.service) and "cleanup"
> those scripts so they can be shared more easily.  This also means
> this won't be done before v15 is out. We should keep SUSE spec as is
> (it works) and work on the cleanup for v16.

rdma.service is replaced by the stuff in kernel-boot, and by my
eye the remainder is either highly RH specific or I'm not certain what it
is for. SUSE will probably need some distro-specific things as well.

Jason
Doug Ledford Aug. 29, 2017, 3:20 p.m. UTC | #6
On Tue, 2017-08-29 at 09:03 -0600, Jason Gunthorpe wrote:
> On Tue, Aug 29, 2017 at 09:59:34AM +0200, Nicolas Morey-Chaisemartin
> wrote:
> 
> > I looked through all the scripts and I'm going to need some time to
> > go through them and decide which we need and which we don't.
> > Some of those were all upstreamed at once, and I'm not sure which bugs
> > they were fixing or whether the fixes are still needed.
> 
> Basically none of it is necessary today from a 'bug fix'
> perspective, the bug fix stuff is all ancient history.
> 
> Here is my perspective on the RH directory:
> 
> rdma.conf
>  - Obsoleted by /etc/rdma/modules/rdma/modules
>    Except for 'tech preview' which is a RH concept.

Sounds right.

> rdma.fixup-mtrr.awk
>  - Obsolete, supports ancient hardware, done in kernel now

Yeah, this is droppable.  It was needed back in the SDR days for qib
hardware.

> rdma.ifdown-ib
> rdma.ifup-ib
>  - Looks like this supports RH's old 'network-scripts' system?
>    Is it even compatible with suse?

Yes, and probably not.  We have to keep it around because users have
the option of using the network scripts instead of NetworkManager.

> rdma.kernel-init
> rdma.service
> rdma.udev-rules
>  - This is the implementation of rdma.conf, it is obsoleted.
>    The bug fix stuff is all for ancient hardware or done in
>    the kernel now.

Correct.  The module loading should be obsoleted by the udev
autoloading work you just did and the PCI fixups in the script are even
more ancient and droppable than the MTRR fixups ;-)

> rdma.mlx4-setup.sh
> rdma.mlx4.conf
> rdma.mlx4.sys.modprobe
>  - Mellanox says they now prefer it if the device's EEPROM is
>    configured, instead of this approach. So this is old

Right, but it still needs to stick around for now.  Even though the
EEPROM approach is preferred, not all mlx4 level devices support it,
and given that mlx4 is still very much in use, we need to keep it.

> rdma.modules-setup.sh
>  - Dracut support to include more stuff in the initrd.

Right, which Red Hat (at least) must keep.

> rdma.sriov-init
> rdma.sriov-vfs
>  - This creates SRIOV instances at boot.

Correct.

>  Maybe it should move
>    to kernel-boot, but also unclear why we need it? doesn't
>    libvirt do this nowadays?

It needs to die.  For a very long time libvirt has not been smart
enough to deal with the dual ports on mlx4 hardware.  There has been
work upstream in libvirt to make this work.  The scripts here are
useless for any guests that are open to migration as they preconfigure
the devices and then you attach the device to the guest more or less
unmanaged.  Libvirt/qemu can never migrate the guest because it doesn't
know how to set up the card on the new host the same way.  It's my hope
that the rdma tool will be expanded to support the different SRIOV
configuration methods (mlx4 and mlx5 are totally different)
transparently.  If/when that happens, it will be easy for libvirt to
standardize on that method and move this to "fully supported" status. 
Right now, this support is just to allow people to statically configure
SRIOV for use, but I don't consider something that can't migrate guests
to be production ready.
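
For readers unfamiliar with static VF setup, the general idea can be sketched
with the kernel's generic PCI `sriov_numvfs` sysfs attribute. This is an
illustration under assumptions, not a description of what
redhat/rdma.sriov-init does: the function name `set_sriov_vfs` is made up for
the example, and as noted above the mlx4 and mlx5 configuration methods
differ, so a given device may need something else entirely.

```shell
#!/bin/sh
# Sketch: statically request a number of SR-IOV virtual functions through
# the generic PCI sysfs interface.
# ASSUMPTION: the device directory (e.g. /sys/bus/pci/devices/<address>)
# is passed in by the caller; nothing here is mlx4- or mlx5-specific.
set_sriov_vfs() {
    dev_dir="$1"
    count="$2"
    attr="$dev_dir/sriov_numvfs"

    # Bail out if the device does not expose a writable attribute.
    [ -w "$attr" ] || return 1

    # The kernel refuses to go from one non-zero VF count to another,
    # so reset to zero before writing the requested value.
    echo 0 > "$attr" || return 1
    echo "$count" > "$attr"
}
```

Taking the directory as a parameter keeps the helper testable against a
scratch directory. It also shows the limitation discussed here: the VFs are
preconfigured on one host, with no record that a management layer could use
to recreate them elsewhere.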

> rdma.udev-ipoib-naming.rules
>  - This is a user example for udev rules..
>    Could go into kernel-boot

Right.  This is totally generic udev support for consistent device
naming.
Nicolas Morey-Chaisemartin Aug. 29, 2017, 3:38 p.m. UTC | #7
On 29/08/2017 at 17:03, Jason Gunthorpe wrote:
> On Tue, Aug 29, 2017 at 09:59:34AM +0200, Nicolas Morey-Chaisemartin wrote:
>
>> I looked through all the scripts and I'm going to need some time to go through them and decide which we need and which we don't.
>> Some of those were all upstreamed at once, and I'm not sure which bugs
>> they were fixing or whether the fixes are still needed.
> Basically none of it is necessary today from a 'bug fix'
> perspective, the bug fix stuff is all ancient history.
>
> Here is my perspective on the RH directory:
>
> rdma.conf
>  - Obsoleted by /etc/rdma/modules/rdma/modules
>    Except for 'tech preview' which is a RH concept.
> rdma.fixup-mtrr.awk
>  - Obsolete, supports ancient hardware, done in kernel now
> rdma.ifdown-ib
> rdma.ifup-ib
>  - Looks like this supports RH's old 'network-scripts' system?
>    Is it even compatible with suse?
> rdma.kernel-init
> rdma.service
> rdma.udev-rules
>  - This is the implementation of rdma.conf, it is obsoleted.
>    The bug fix stuff is all for ancient hardware or done in
>    the kernel now.
> rdma.mlx4-setup.sh
> rdma.mlx4.conf
> rdma.mlx4.sys.modprobe
>  - Mellanox says they now prefer it if the device's EEPROM is
>    configured, instead of this approach. So this is old
> rdma.modules-setup.sh
>  - Dracut support to include more stuff in the initrd.
> rdma.sriov-init
> rdma.sriov-vfs
>  - This creates SRIOV instances at boot. Maybe it should move
>    to kernel-boot, but also unclear why we need it? doesn't
>    libvirt do this nowadays?
> rdma.udev-ipoib-naming.rules
>  - This is a user example for udev rules..
>    Could go into kernel-boot
>
> If suse never shipped this stuff before then there is no reason
> to rush to add it now..

We did for SLE12SP3, because it was a rushed job: we pretty much copied the Red Hat spec and fixed it up to match our packaging rules.
It appears we should be able to drop all of these. We just need a new dracut file with the instmods but without all the Red Hat files.
It makes sense to keep the ipoib udev example too.
Not sure about the SRIOV files.
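
A minimal sketch of what such a dracut file might look like, assuming the
usual module-setup.sh hooks (`check`, `depends`, `installkernel`) and dracut's
`instmods` helper. The module list is an assumption for illustration; it only
covers the mlx4/mlx5 case raised earlier in the thread, and boot over ipoib,
srp or nfs-rdma would need more.

```shell
#!/bin/bash
# Hypothetical module-setup.sh for modules.d/05rdma, without any of the
# Red Hat helper files.
# ASSUMPTION: the module names below are examples, not a vetted list.

# Tell dracut this module may always be included.
check() {
    return 0
}

# This sketch does not depend on any other dracut module.
depends() {
    return 0
}

# Pull the RDMA drivers into the initrd so that a request_module() issued
# by mlx4_core or mlx5_core from the initrd can be satisfied.
installkernel() {
    # instmods is supplied by dracut when it sources this file.
    instmods mlx4_ib mlx5_ib
}
```

dracut sources the file and calls the hooks itself, so nothing is executed at
the top level.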



>> I contacted Doug on the side. It would make sense to work with Red
>> Hat to migrate to the udev system (vs rdma.service) and "cleanup"
>> those scripts so they can be shared more easily.  This also means
>> this won't be done before v15 is out. We should keep SUSE spec as is
>> (it works) and work on the cleanup for v16.
> rdma.service is replaced by the stuff in kernel boot, and by my
> eye the remainder is highly RH specific or I'm not certain what it is
> for.. suse will probably need some distro specific things as well.

It may come to that, but we have nothing pending yet.

Nicolas

Jason Gunthorpe Aug. 29, 2017, 5:12 p.m. UTC | #8
On Tue, Aug 29, 2017 at 11:20:51AM -0400, Doug Ledford wrote:

> > rdma.fixup-mtrr.awk
> >  - Obsolete, supports ancient hardware, done in kernel now
> 
> Yeah, this is droppable.  It was needed back in the SDR days for qib
> hardware.

PR #199 deletes this file

> > rdma.mlx4-setup.sh
> > rdma.mlx4.conf
> > rdma.mlx4.sys.modprobe
> >  - Mellanox says they now prefer it if the device's EEPROM is
> >    configured, instead of this approach. So this is old
> 
> Right, but it still needs to stick around for now.  Even though the
> EEPROM approach is preferred, not all mlx4 level devices support it,
> and given that mlx4 is still very much in use, we need to keep it.

The modprobe approach is not compatible with hotplug.  Instead, we
really want this to run from a udev rule, but the mlx4 driver does not
create a kobject from mlx4_core, so there is nothing to trigger the
rule on :|

I guess a kernel patch will be needed here.

> >  Maybe it should move
> >    to kernel-boot, but also unclear why we need it? doesn't
> >    libvirt do this nowadays?
> 
> It needs to die.  For a very long time libvirt has not been smart
> enough to deal with the dual ports on mlx4 hardware.  There has been
> work upstream in libvirt to make this work.  The scripts here are
> useless for any guests that are open to migration as they preconfigure
> the devices and then you attach the device to the guest more or less
> unmanaged.  Libvirt/qemu can never migrate the guest because it doesn't
> know how to set up the card on the new host the same way.  It's my hope
> that the rdma tool will be expanded to support the different SRIOV
> configuration methods (mlx4 and mlx5 are totally different)
> transparently.  If/when that happens, it will be easy for libvirt to
> standardize on that method and move this to "fully supported" status. 
> Right now, this support is just to allow people to statically configure
> SRIOV for use, but I don't consider something that can't migrate guests
> production ready IMO.

Okay, let's leave that as redhat/, but I thought Mellanox sorted this
all out with the ipoib netlink patches related to sriov? Sigh.

> > rdma.udev-ipoib-naming.rules
> >  - This is a user example for udev rules..
> >    Could go into kernel-boot
> 
> Right.  This is a totally generic udev consistent device naming
> support.

Done in PR #199

Jason
Patch

diff --git a/suse/rdma-core.spec b/suse/rdma-core.spec
index 76ca7286..64d43b23 100644
--- a/suse/rdma-core.spec
+++ b/suse/rdma-core.spec
@@ -356,24 +356,6 @@  mkdir -p %{buildroot}%{dracutlibdir}/modules.d/05rdma
 mkdir -p %{buildroot}%{sysmodprobedir}
 mkdir -p %{buildroot}%{_unitdir}
 
-install -D -m0644 redhat/rdma.conf %{buildroot}/%{_sysconfdir}/rdma/rdma.conf
-sed 's%/usr/libexec%/usr/lib%' redhat/rdma.service > %{buildroot}%{_unitdir}/rdma.service
-chmod 0644 %{buildroot}%{_unitdir}/rdma.service
-install -D -m0644 redhat/rdma.sriov-vfs %{buildroot}/%{_sysconfdir}/rdma/sriov-vfs
-install -D -m0644 redhat/rdma.mlx4.conf %{buildroot}/%{_sysconfdir}/rdma/mlx4.conf
-install -D -m0644 redhat/rdma.udev-ipoib-naming.rules %{buildroot}%{_udevrulesdir}/70-persistent-ipoib.rules
-sed 's%/usr/libexec%/usr/lib%g' redhat/rdma.modules-setup.sh > %{buildroot}%{dracutlibdir}/modules.d/05rdma/module-setup.sh
-chmod 0755 %{buildroot}%{dracutlibdir}/modules.d/05rdma/module-setup.sh
-install -D -m0644 redhat/rdma.udev-rules %{buildroot}%{_udevrulesdir}/98-rdma.rules
-sed 's%/usr/libexec%/usr/lib%g' redhat/rdma.mlx4.sys.modprobe > %{buildroot}%{sysmodprobedir}/50-libmlx4.conf
-chmod 0644 %{buildroot}%{sysmodprobedir}/50-libmlx4.conf
-
-sed 's%/usr/libexec%/usr/lib%g' redhat/rdma.kernel-init > %{buildroot}%{_libexecdir}/rdma-init-kernel
-chmod 0755 %{buildroot}%{_libexecdir}/rdma-init-kernel
-install -D -m0755 redhat/rdma.sriov-init %{buildroot}%{_libexecdir}/rdma-set-sriov-vf
-install -D -m0644 redhat/rdma.fixup-mtrr.awk %{buildroot}%{_libexecdir}/rdma-fixup-mtrr.awk
-install -D -m0755 redhat/rdma.mlx4-setup.sh %{buildroot}%{_libexecdir}/mlx4-setup.sh
-
 mv %{buildroot}%{_sysconfdir}/modprobe.d/truescale.conf %{buildroot}%{_sysconfdir}/modprobe.d/50-truescale.conf
 %if 0%{?dma_coherent}
 mv %{buildroot}%{_sysconfdir}/modprobe.d/mlx4.conf %{buildroot}%{_sysconfdir}/modprobe.d/50-mlx4.conf
@@ -410,18 +392,6 @@  rm -rf %{buildroot}/%{_sbindir}/srp_daemon.sh
 %post -n %rdmacm_lname -p /sbin/ldconfig
 %postun -n %rdmacm_lname -p /sbin/ldconfig
 
-%pre
-%service_add_pre rdma.service
-
-%post
-%service_add_post rdma.service
-
-%preun
-%service_del_preun -n rdma.service
-
-%postun
-%service_del_postun -n rdma.service
-
 #
 # ibacm
 #
@@ -491,36 +461,24 @@  rm -rf %{buildroot}/%{_sbindir}/srp_daemon.sh
 %dir %{_libexecdir}/udev/rules.d
 %dir %{_sysconfdir}/modprobe.d
 %doc %{_docdir}/%{name}-%{version}/README.md
-%config(noreplace) %{_sysconfdir}/rdma/mlx4.conf
 %config(noreplace) %{_sysconfdir}/rdma/modules/infiniband.conf
 %config(noreplace) %{_sysconfdir}/rdma/modules/iwarp.conf
 %config(noreplace) %{_sysconfdir}/rdma/modules/opa.conf
 %config(noreplace) %{_sysconfdir}/rdma/modules/rdma.conf
 %config(noreplace) %{_sysconfdir}/rdma/modules/roce.conf
-%config(noreplace) %{_sysconfdir}/rdma/rdma.conf
-%config(noreplace) %{_sysconfdir}/rdma/sriov-vfs
 %if 0%{?dma_coherent}
 %config(noreplace) %{_sysconfdir}/modprobe.d/50-mlx4.conf
 %endif
 %config(noreplace) %{_sysconfdir}/modprobe.d/50-truescale.conf
 %{_unitdir}/rdma-hw.target
 %{_unitdir}/rdma-load-modules@.service
-%{_unitdir}/rdma.service
 %dir %{dracutlibdir}
 %dir %{dracutlibdir}/modules.d
 %dir %{dracutlibdir}/modules.d/05rdma
-%{dracutlibdir}/modules.d/05rdma/module-setup.sh
-%{_udevrulesdir}/70-persistent-ipoib.rules
 %{_udevrulesdir}/75-rdma-description.rules
 %{_udevrulesdir}/90-rdma-hw-modules.rules
 %{_udevrulesdir}/90-rdma-ulp-modules.rules
 %{_udevrulesdir}/90-rdma-umad.rules
-%{_udevrulesdir}/98-rdma.rules
-%config %{sysmodprobedir}/50-libmlx4.conf
-%{_libexecdir}/rdma-init-kernel
-%{_libexecdir}/rdma-set-sriov-vf
-%{_libexecdir}/rdma-fixup-mtrr.awk
-%{_libexecdir}/mlx4-setup.sh
 %{_libexecdir}/truescale-serdes.cmds
 %license COPYING.*
 %{_sbindir}/rcrdma