Message ID | 1494613473.14477.12.camel@sandisk.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
On Fri, May 12, 2017 at 06:24:35PM +0000, Bart Van Assche wrote: > When using systemd templates instead of the current approach it would > become cumbersome for users to stop and start the srp_daemon on all ports > simultaneously. I think systemd PartOf dependencies is intended to solve that problem.. PartOf= Configures dependencies similar to Requires=, but limited to stopping and restarting of units. When systemd stops or restarts the units listed here, the action is propagated to this unit. Note that this is a one-way dependency -- changes to this unit do not affect the listed units. So there could still be a service name that covered all of the children (eg srp_daemon.service and srp_daemon_port@.service) > Additionally, if an RDMA adapter is hot-added, should the srp_daemon > be started or should it not be started for the newly added ports? I'm not sure what is best for that policy.. Today it looks like srp_daemon defaults to explicitly opt-in to running on ports. To continue that policy the admin would have to explicitly opt in to each port. Eg perhaps something like: systemctl enable srp_daemon@mlx4_0/0 Instead of editing a /etc/ config file. systemd has a '.device' system, which as I understand it, would allow srp_daemon@mlx4_0/0 to depend on mlx4_0/0.device, which is created by udev on hotplug, and would trigger running the service. (perhaps for this reason the template value would be umad0, not mlx4_0/0) > An in-between program could indeed do the job of listening for udev events > and actively managing srp_daemon children. Since we already have an > in-between program, namely srp_daemon.sh, the simplest approach is to modify > that shell script. Can you have a look at the attached patch? This seems like a reasonable improvement, but as I said, the inbetween program is a bit more complex that you probably want to do with a script: - Sites do not like things running on timers due to the introduced jitter, so the trivial sleep loop is not nice. - Failure of a child should result in a sensible log message and a restart protocol like systemd has. Failure to SIGTERM the child should escalate to SIGKILL to prevent lockups during shutdown - Dumping stderr/etc into /dev/null is a bad idea (eg glibc internal asserts are lost now). In a systemd world each child should get its own stderrr logger connection so this stuff works right. - We still need to be able to configure which ports srp_daemon runs on and to 'reload' that configuration into the script - Ideally hot removal of an adaptor requires halting just those daemons without triggering restart Since so much of that overlaps with systemd capabilities it does make some sense to have systemd manage the child processes directly. For instance you could have the srp_daemon script like in the patch but replace this: ${prog} -e -c -n [..] with systemctl start srp_daemon_port@${hca_id}/${port} Using PartOf dependencies.. The user experience would be very similar to the script directly forking, but many more of the above areas are covered off.. But.. in that case we don't even need the script. If the admin wishes to run srp_daemon on all IB ports then a simple udev rule will do the job: SUBSYSTEM=="infiniband", TAG+="systemd", ATTRS{node_desc}=="*", ENV{SYSTEMD_WANTS}+="srp_daemon_port@$name.service" [more or less] And now the timer is gone too. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From bec3884e36298ab2b6fb09c533b575fa95bd3378 Mon Sep 17 00:00:00 2001 From: Bart Van Assche <bart.vanassche@sandisk.com> Date: Fri, 12 May 2017 10:24:44 -0700 Subject: [PATCH] srp_daemon.service: Add support for hotplugging Instead of only starting srp_daemon.sh after the RDMA subsystem has been initialized, make srp_daemon.sh scan periodically for RDMA ports. An advantage of the new approach is that it supports hot-add of RDMA adapters. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> --- redhat/srp_daemon.service | 4 ---- srp_daemon/srp_daemon.sh.in | 41 ++++++++++++++++++++++++++--------------- 2 files changed, 26 insertions(+), 19 deletions(-) diff --git a/redhat/srp_daemon.service b/redhat/srp_daemon.service index 9510f5fb..cf98aa74 100644 --- a/redhat/srp_daemon.service +++ b/redhat/srp_daemon.service @@ -3,10 +3,6 @@ Description=Start or stop the daemon that attaches to SRP devices Documentation=man:srp_daemon file:/etc/rdma/rdma.conf file:/etc/srp_daemon.conf DefaultDependencies=false Conflicts=emergency.target emergency.service -Requires=rdma.service -Wants=opensm.service -After=rdma.service opensm.service -After=network.target Before=remote-fs-pre.target [Service] diff --git a/srp_daemon/srp_daemon.sh.in b/srp_daemon/srp_daemon.sh.in index 75e8a31b..74c08d7a 100755 --- a/srp_daemon/srp_daemon.sh.in +++ b/srp_daemon/srp_daemon.sh.in @@ -37,6 +37,7 @@ rescan_interval=60 pids=() pidfile="@CMAKE_INSTALL_FULL_RUNDIR@/srp_daemon.sh.pid" mypid=$$ +umad_devs=() trap_handler() { @@ -49,6 +50,17 @@ trap_handler() exit 0 } +# Check whether $1 is equal to one of $2..${$#} +contains() +{ + local v + + for v in "${@:2}"; do + [ "$v" = "$1" ] && return 0 + done + return 1 +} + # Check if there is another copy running of srp_daemon.sh if [ -f "$pidfile" ]; then if [ -e "/proc/$(cat "$pidfile" 2>/dev/null)/status" ]; then @@ -66,19 +78,18 @@ fi trap 'trap_handler' 2 15 -while [ ! -d ${ibdir} ] -do - sleep 30 +while :; do + for d in ${ibdir}_mad/umad*; do + [ -e "$d" ] || continue + contains "$d" "${umad_devs[@]}" && continue + hca_id="$(<"$d/ibdev")" + port="$(<"$d/port")" + add_target="${ibdir}_srp/srp-${hca_id}-${port}/add_target" + if [ -e "${add_target}" ]; then + ${prog} -e -c -n -i "${hca_id}" -p "${port}" -R "${rescan_interval}" "${params[@]}" >/dev/null 2>&1 & + pids+=($!) + umad_dev+=($d) + fi + done + sleep $rescan_interval done - -for d in ${ibdir}_mad/umad*; do - hca_id="$(<"$d/ibdev")" - port="$(<"$d/port")" - add_target="${ibdir}_srp/srp-${hca_id}-${port}/add_target" - if [ -e "${add_target}" ]; then - ${prog} -e -c -n -i "${hca_id}" -p "${port}" -R "${rescan_interval}" "${params[@]}" >/dev/null 2>&1 & - pids+=($!) - fi -done - -wait -- 2.12.2