diff mbox series

[5/5] libmultipath: don't dlclose tur checker DSO

Message ID 1603487708-12547-6-git-send-email-bmarzins@redhat.com (mailing list archive)
State Not Applicable, archived
Delegated to: christophe varoqui
Headers show
Series Misc Multipath patches | expand

Commit Message

Benjamin Marzinski Oct. 23, 2020, 9:15 p.m. UTC
The multipathd tur checker thread is designed to be able to finish at
any time, even after the tur checker itself has been freed. The
multipathd shutdown code makes sure all the checkers have been freed
before freeing the checker_class and calling dlclose() to unload the
DSO, but this doesn't guarantee that the checker threads have finished.
If one hasn't, the DSO will get unloaded while the thread still running
code from it, causing a segfault. Unfortunately, it's not possible to be
sure that all tur checker threads have ended during shutdown, without
making them joinable.

However, since libmultipath will never be reinitialized after it has
been uninitialzed, not dlclosing the tur checker DSO once a thread is
started has minimal cost (keeping the DSO code around until the program
exits, which usually happens right after freeing the checkers).

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 libmultipath/checkers.c     | 10 +++++++++-
 libmultipath/checkers.h     |  1 +
 libmultipath/checkers/tur.c |  1 +
 3 files changed, 11 insertions(+), 1 deletion(-)

Comments

Martin Wilck Oct. 30, 2020, 9:15 p.m. UTC | #1
On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> The multipathd tur checker thread is designed to be able to finish at
> any time, even after the tur checker itself has been freed. The
> multipathd shutdown code makes sure all the checkers have been freed
> before freeing the checker_class and calling dlclose() to unload the
> DSO, but this doesn't guarantee that the checker threads have
> finished.
> If one hasn't, the DSO will get unloaded while the thread still
> running
> code from it, causing a segfault. Unfortunately, it's not possible to
> be
> sure that all tur checker threads have ended during shutdown, without
> making them joinable.
> 
> However, since libmultipath will never be reinitialized after it has
> been uninitialzed, not dlclosing the tur checker DSO once a thread is
> started has minimal cost (keeping the DSO code around until the
> program
> exits, which usually happens right after freeing the checkers).

I'm not against this, but have you considered using an atomic  refcount
for the DSO? With every tur thread starting, we could increase it, and
decrease it in the cleanup function of the thread when it exits. That
should be safe. If the refcount was positive when we exit, we could
refrain from unloading the DSO.

Regards,
Martin


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Benjamin Marzinski Nov. 2, 2020, 6:27 p.m. UTC | #2
On Fri, Oct 30, 2020 at 09:15:39PM +0000, Martin Wilck wrote:
> On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > The multipathd tur checker thread is designed to be able to finish at
> > any time, even after the tur checker itself has been freed. The
> > multipathd shutdown code makes sure all the checkers have been freed
> > before freeing the checker_class and calling dlclose() to unload the
> > DSO, but this doesn't guarantee that the checker threads have
> > finished.
> > If one hasn't, the DSO will get unloaded while the thread still
> > running
> > code from it, causing a segfault. Unfortunately, it's not possible to
> > be
> > sure that all tur checker threads have ended during shutdown, without
> > making them joinable.
> > 
> > However, since libmultipath will never be reinitialized after it has
> > been uninitialzed, not dlclosing the tur checker DSO once a thread is
> > started has minimal cost (keeping the DSO code around until the
> > program
> > exits, which usually happens right after freeing the checkers).
> 
> I'm not against this, but have you considered using an atomic  refcount
> for the DSO? With every tur thread starting, we could increase it, and
> decrease it in the cleanup function of the thread when it exits. That
> should be safe. If the refcount was positive when we exit, we could
> refrain from unloading the DSO.

I tried exactly that, and I would still get crashes, even if it put the
code that decrements the atomic variable in a function that's not part
of the DSO, and put a pthread_exit() before the final
pthread_cleanup_pop() that would decrement it in tur_thread, so that
after the cleanup code is called the thread should never return to code
that is in the DSO. I had to add sleeps in code to force various
orderings, but I couldn't find any way that worked for all possible
orderings.  I would love it if this worked, and you're free to try, but
I could not get this method to work.

-Ben
 
> Regards,
> Martin

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Benjamin Marzinski Nov. 3, 2020, 6:48 p.m. UTC | #3
On Fri, Oct 30, 2020 at 09:15:39PM +0000, Martin Wilck wrote:
> On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > The multipathd tur checker thread is designed to be able to finish at
> > any time, even after the tur checker itself has been freed. The
> > multipathd shutdown code makes sure all the checkers have been freed
> > before freeing the checker_class and calling dlclose() to unload the
> > DSO, but this doesn't guarantee that the checker threads have
> > finished.
> > If one hasn't, the DSO will get unloaded while the thread still
> > running
> > code from it, causing a segfault. Unfortunately, it's not possible to
> > be
> > sure that all tur checker threads have ended during shutdown, without
> > making them joinable.
> > 
> > However, since libmultipath will never be reinitialized after it has
> > been uninitialzed, not dlclosing the tur checker DSO once a thread is
> > started has minimal cost (keeping the DSO code around until the
> > program
> > exits, which usually happens right after freeing the checkers).
> 
> I'm not against this, but have you considered using an atomic  refcount
> for the DSO? With every tur thread starting, we could increase it, and
> decrease it in the cleanup function of the thread when it exits. That
> should be safe. If the refcount was positive when we exit, we could
> refrain from unloading the DSO.
> 
> Regards,
> Martin

NAK. I apparently forgot to commit the version file changes.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Martin Wilck Nov. 4, 2020, 10:39 p.m. UTC | #4
On Mon, 2020-11-02 at 12:27 -0600, Benjamin Marzinski wrote:
> On Fri, Oct 30, 2020 at 09:15:39PM +0000, Martin Wilck wrote:
> > On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > > The multipathd tur checker thread is designed to be able to
> > > finish at
> > > any time, even after the tur checker itself has been freed. The
> > > multipathd shutdown code makes sure all the checkers have been
> > > freed
> > > before freeing the checker_class and calling dlclose() to unload
> > > the
> > > DSO, but this doesn't guarantee that the checker threads have
> > > finished.
> > > If one hasn't, the DSO will get unloaded while the thread still
> > > running
> > > code from it, causing a segfault. Unfortunately, it's not
> > > possible to
> > > be
> > > sure that all tur checker threads have ended during shutdown,
> > > without
> > > making them joinable.
> > > 
> > > However, since libmultipath will never be reinitialized after it
> > > has
> > > been uninitialzed, not dlclosing the tur checker DSO once a
> > > thread is
> > > started has minimal cost (keeping the DSO code around until the
> > > program
> > > exits, which usually happens right after freeing the checkers).
> > 
> > I'm not against this, but have you considered using an
> > atomic  refcount
> > for the DSO? With every tur thread starting, we could increase it,
> > and
> > decrease it in the cleanup function of the thread when it exits.
> > That
> > should be safe. If the refcount was positive when we exit, we could
> > refrain from unloading the DSO.
> 
> I tried exactly that, and I would still get crashes, even if it put
> the
> code that decrements the atomic variable in a function that's not
> part
> of the DSO, and put a pthread_exit() before the final
> pthread_cleanup_pop() that would decrement it in tur_thread, so that
> after the cleanup code is called the thread should never return to
> code
> that is in the DSO. I had to add sleeps in code to force various
> orderings, but I couldn't find any way that worked for all possible
> orderings.  I would love it if this worked, and you're free to try,
> but
> I could not get this method to work.

I've experimented a bit with a trivial test program, and I found that
it worked stably if decrementing the refcount is really the last thing
thread's cleanup function does. Could you provide some details about
the sleeps that you'd put in that made this approach fail?

Martin
Benjamin Marzinski Nov. 4, 2020, 11:27 p.m. UTC | #5
On Wed, Nov 04, 2020 at 10:39:39PM +0000, Martin Wilck wrote:
> On Mon, 2020-11-02 at 12:27 -0600, Benjamin Marzinski wrote:
> > On Fri, Oct 30, 2020 at 09:15:39PM +0000, Martin Wilck wrote:
> > > On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > > > The multipathd tur checker thread is designed to be able to
> > > > finish at
> > > > any time, even after the tur checker itself has been freed. The
> > > > multipathd shutdown code makes sure all the checkers have been
> > > > freed
> > > > before freeing the checker_class and calling dlclose() to unload
> > > > the
> > > > DSO, but this doesn't guarantee that the checker threads have
> > > > finished.
> > > > If one hasn't, the DSO will get unloaded while the thread still
> > > > running
> > > > code from it, causing a segfault. Unfortunately, it's not
> > > > possible to
> > > > be
> > > > sure that all tur checker threads have ended during shutdown,
> > > > without
> > > > making them joinable.
> > > > 
> > > > However, since libmultipath will never be reinitialized after it
> > > > has
> > > > been uninitialzed, not dlclosing the tur checker DSO once a
> > > > thread is
> > > > started has minimal cost (keeping the DSO code around until the
> > > > program
> > > > exits, which usually happens right after freeing the checkers).
> > > 
> > > I'm not against this, but have you considered using an
> > > atomic  refcount
> > > for the DSO? With every tur thread starting, we could increase it,
> > > and
> > > decrease it in the cleanup function of the thread when it exits.
> > > That
> > > should be safe. If the refcount was positive when we exit, we could
> > > refrain from unloading the DSO.
> > 
> > I tried exactly that, and I would still get crashes, even if it put
> > the
> > code that decrements the atomic variable in a function that's not
> > part
> > of the DSO, and put a pthread_exit() before the final
> > pthread_cleanup_pop() that would decrement it in tur_thread, so that
> > after the cleanup code is called the thread should never return to
> > code
> > that is in the DSO. I had to add sleeps in code to force various
> > orderings, but I couldn't find any way that worked for all possible
> > orderings.  I would love it if this worked, and you're free to try,
> > but
> > I could not get this method to work.
> 
> I've experimented a bit with a trivial test program, and I found that
> it worked stably if decrementing the refcount is really the last thing
> thread's cleanup function does. Could you provide some details about
> the sleeps that you'd put in that made this approach fail?

I believe the situation that continued to crash was where the
tur_thread() exitted naturally (so it set running to 0), although I'm
not sure that this is necessary, or if it would still crash when running
the cleanup function because of a cancel.  I put the cleanup function to
decrement the count in libmultipath, so that it wasn't part of the DSO,
and then I put a sleep(5) as the last line of the cleanup function, and
a sleep(10) as the last line of cleanup_checkers(). I also had to set
running to 0 before starting the thread, and then take out the code to
pause the thread if running was aleady 0, to make sure the thread acted
like it was the one to set running to 0. Then the idea is to stop
multipathd while there is a thread in its sleep period, so that
multipathd sees that the counter is 0, and closes the dso, and then
the thread finishes before multipathd shuts the rest of the way down. 

-Ben

> Martin
> 
> -- Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107 SUSE
> Software Solutions Germany GmbH HRB 36809, AG Nürnberg GF: Felix
> Imendörffer
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Martin Wilck Nov. 4, 2020, 11:34 p.m. UTC | #6
On Wed, 2020-11-04 at 22:39 +0000, Martin Wilck wrote:
> On Mon, 2020-11-02 at 12:27 -0600, Benjamin Marzinski wrote:
> > On Fri, Oct 30, 2020 at 09:15:39PM +0000, Martin Wilck wrote:
> > > On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > > > The multipathd tur checker thread is designed to be able to
> > > > finish at
> > > > any time, even after the tur checker itself has been freed. The
> > > > multipathd shutdown code makes sure all the checkers have been
> > > > freed
> > > > before freeing the checker_class and calling dlclose() to
> > > > unload
> > > > the
> > > > DSO, but this doesn't guarantee that the checker threads have
> > > > finished.
> > > > If one hasn't, the DSO will get unloaded while the thread still
> > > > running
> > > > code from it, causing a segfault. Unfortunately, it's not
> > > > possible to
> > > > be
> > > > sure that all tur checker threads have ended during shutdown,
> > > > without
> > > > making them joinable.
> > > > 
> > > > However, since libmultipath will never be reinitialized after
> > > > it
> > > > has
> > > > been uninitialzed, not dlclosing the tur checker DSO once a
> > > > thread is
> > > > started has minimal cost (keeping the DSO code around until the
> > > > program
> > > > exits, which usually happens right after freeing the checkers).
> > > 
> > > I'm not against this, but have you considered using an
> > > atomic  refcount
> > > for the DSO? With every tur thread starting, we could increase
> > > it,
> > > and
> > > decrease it in the cleanup function of the thread when it exits.
> > > That
> > > should be safe. If the refcount was positive when we exit, we
> > > could
> > > refrain from unloading the DSO.
> > 
> > I tried exactly that, and I would still get crashes, even if it put
> > the
> > code that decrements the atomic variable in a function that's not
> > part
> > of the DSO, and put a pthread_exit() before the final
> > pthread_cleanup_pop() that would decrement it in tur_thread, so
> > that
> > after the cleanup code is called the thread should never return to
> > code
> > that is in the DSO. I had to add sleeps in code to force various
> > orderings, but I couldn't find any way that worked for all possible
> > orderings.  I would love it if this worked, and you're free to try,
> > but
> > I could not get this method to work.
> 
> I've experimented a bit with a trivial test program, and I found that
> it worked stably if decrementing the refcount is really the last
> thing
> thread's cleanup function does. Could you provide some details about
> the sleeps that you'd put in that made this approach fail?

I've made some more experiments and if I repeat my test enough, I could
see SIGSEGV.

I was able to make it work by using a thread entrypoint outside the DSO
(calling the DSO's thread function), and decrementing the refcount in
the outermost cleanup function (which is also not in the DSO),
all using proper atomics and barriers.

Martin
Martin Wilck Nov. 4, 2020, 11:56 p.m. UTC | #7
On Wed, 2020-11-04 at 17:27 -0600, Benjamin Marzinski wrote:
> On Wed, Nov 04, 2020 at 10:39:39PM +0000, Martin Wilck wrote:
> > On Mon, 2020-11-02 at 12:27 -0600, Benjamin Marzinski wrote:
> > > On Fri, Oct 30, 2020 at 09:15:39PM +0000, Martin Wilck wrote:
> > > > On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > > > > The multipathd tur checker thread is designed to be able to
> > > > > finish at
> > > > > any time, even after the tur checker itself has been freed.
> > > > > The
> > > > > multipathd shutdown code makes sure all the checkers have
> > > > > been
> > > > > freed
> > > > > before freeing the checker_class and calling dlclose() to
> > > > > unload
> > > > > the
> > > > > DSO, but this doesn't guarantee that the checker threads have
> > > > > finished.
> > > > > If one hasn't, the DSO will get unloaded while the thread
> > > > > still
> > > > > running
> > > > > code from it, causing a segfault. Unfortunately, it's not
> > > > > possible to
> > > > > be
> > > > > sure that all tur checker threads have ended during shutdown,
> > > > > without
> > > > > making them joinable.
> > > > > 
> > > > > However, since libmultipath will never be reinitialized after
> > > > > it
> > > > > has
> > > > > been uninitialzed, not dlclosing the tur checker DSO once a
> > > > > thread is
> > > > > started has minimal cost (keeping the DSO code around until
> > > > > the
> > > > > program
> > > > > exits, which usually happens right after freeing the
> > > > > checkers).
> > > > 
> > > > I'm not against this, but have you considered using an
> > > > atomic  refcount
> > > > for the DSO? With every tur thread starting, we could increase
> > > > it,
> > > > and
> > > > decrease it in the cleanup function of the thread when it
> > > > exits.
> > > > That
> > > > should be safe. If the refcount was positive when we exit, we
> > > > could
> > > > refrain from unloading the DSO.
> > > 
> > > I tried exactly that, and I would still get crashes, even if it
> > > put
> > > the
> > > code that decrements the atomic variable in a function that's not
> > > part
> > > of the DSO, and put a pthread_exit() before the final
> > > pthread_cleanup_pop() that would decrement it in tur_thread, so
> > > that
> > > after the cleanup code is called the thread should never return
> > > to
> > > code
> > > that is in the DSO. I had to add sleeps in code to force various
> > > orderings, but I couldn't find any way that worked for all
> > > possible
> > > orderings.  I would love it if this worked, and you're free to
> > > try,
> > > but
> > > I could not get this method to work.
> > 
> > I've experimented a bit with a trivial test program, and I found
> > that
> > it worked stably if decrementing the refcount is really the last
> > thing
> > thread's cleanup function does. Could you provide some details
> > about
> > the sleeps that you'd put in that made this approach fail?
> 
> I believe the situation that continued to crash was where the
> tur_thread() exitted naturally (so it set running to 0), although I'm
> not sure that this is necessary, or if it would still crash when
> running
> the cleanup function because of a cancel.  I put the cleanup function
> to
> decrement the count in libmultipath, so that it wasn't part of the
> DSO,
> and then I put a sleep(5) as the last line of the cleanup function,
> and
> a sleep(10) as the last line of cleanup_checkers(). I also had to set
> running to 0 before starting the thread, and then take out the code
> to
> pause the thread if running was aleady 0, to make sure the thread
> acted
> like it was the one to set running to 0. Then the idea is to stop
> multipathd while there is a thread in its sleep period, so that
> multipathd sees that the counter is 0, and closes the dso, and then
> the thread finishes before multipathd shuts the rest of the way
> down. 

I guess the key is that the thread's entry point must also be in
libmultipath (i.e. outside the DSO). In pseudo-code:

entrypoint() {
   refcount++;
   pthread_cleanup_push(refcount--);
   tur_thread(ct);
   pthread_cleanup_pop(1);
}

This way the thread can't be in DSO code any more when refcount goes to
zero.

Martin
Benjamin Marzinski Nov. 5, 2020, 12:41 a.m. UTC | #8
On Wed, Nov 04, 2020 at 11:56:07PM +0000, Martin Wilck wrote:
> On Wed, 2020-11-04 at 17:27 -0600, Benjamin Marzinski wrote:
> > On Wed, Nov 04, 2020 at 10:39:39PM +0000, Martin Wilck wrote:
> > > On Mon, 2020-11-02 at 12:27 -0600, Benjamin Marzinski wrote:
> > > > On Fri, Oct 30, 2020 at 09:15:39PM +0000, Martin Wilck wrote:
> > > > > On Fri, 2020-10-23 at 16:15 -0500, Benjamin Marzinski wrote:
> > > > > > The multipathd tur checker thread is designed to be able to
> > > > > > finish at
> > > > > > any time, even after the tur checker itself has been freed.
> > > > > > The
> > > > > > multipathd shutdown code makes sure all the checkers have
> > > > > > been
> > > > > > freed
> > > > > > before freeing the checker_class and calling dlclose() to
> > > > > > unload
> > > > > > the
> > > > > > DSO, but this doesn't guarantee that the checker threads have
> > > > > > finished.
> > > > > > If one hasn't, the DSO will get unloaded while the thread
> > > > > > still
> > > > > > running
> > > > > > code from it, causing a segfault. Unfortunately, it's not
> > > > > > possible to
> > > > > > be
> > > > > > sure that all tur checker threads have ended during shutdown,
> > > > > > without
> > > > > > making them joinable.
> > > > > > 
> > > > > > However, since libmultipath will never be reinitialized after
> > > > > > it
> > > > > > has
> > > > > > been uninitialzed, not dlclosing the tur checker DSO once a
> > > > > > thread is
> > > > > > started has minimal cost (keeping the DSO code around until
> > > > > > the
> > > > > > program
> > > > > > exits, which usually happens right after freeing the
> > > > > > checkers).
> > > > > 
> > > > > I'm not against this, but have you considered using an
> > > > > atomic  refcount
> > > > > for the DSO? With every tur thread starting, we could increase
> > > > > it,
> > > > > and
> > > > > decrease it in the cleanup function of the thread when it
> > > > > exits.
> > > > > That
> > > > > should be safe. If the refcount was positive when we exit, we
> > > > > could
> > > > > refrain from unloading the DSO.
> > > > 
> > > > I tried exactly that, and I would still get crashes, even if it
> > > > put
> > > > the
> > > > code that decrements the atomic variable in a function that's not
> > > > part
> > > > of the DSO, and put a pthread_exit() before the final
> > > > pthread_cleanup_pop() that would decrement it in tur_thread, so
> > > > that
> > > > after the cleanup code is called the thread should never return
> > > > to
> > > > code
> > > > that is in the DSO. I had to add sleeps in code to force various
> > > > orderings, but I couldn't find any way that worked for all
> > > > possible
> > > > orderings.  I would love it if this worked, and you're free to
> > > > try,
> > > > but
> > > > I could not get this method to work.
> > > 
> > > I've experimented a bit with a trivial test program, and I found
> > > that
> > > it worked stably if decrementing the refcount is really the last
> > > thing
> > > thread's cleanup function does. Could you provide some details
> > > about
> > > the sleeps that you'd put in that made this approach fail?
> > 
> > I believe the situation that continued to crash was where the
> > tur_thread() exitted naturally (so it set running to 0), although I'm
> > not sure that this is necessary, or if it would still crash when
> > running
> > the cleanup function because of a cancel.  I put the cleanup function
> > to
> > decrement the count in libmultipath, so that it wasn't part of the
> > DSO,
> > and then I put a sleep(5) as the last line of the cleanup function,
> > and
> > a sleep(10) as the last line of cleanup_checkers(). I also had to set
> > running to 0 before starting the thread, and then take out the code
> > to
> > pause the thread if running was aleady 0, to make sure the thread
> > acted
> > like it was the one to set running to 0. Then the idea is to stop
> > multipathd while there is a thread in its sleep period, so that
> > multipathd sees that the counter is 0, and closes the dso, and then
> > the thread finishes before multipathd shuts the rest of the way
> > down. 
> 
> I guess the key is that the thread's entry point must also be in
> libmultipath (i.e. outside the DSO). In pseudo-code:
> 
> entrypoint() {
>    refcount++;
>    pthread_cleanup_push(refcount--);
>    tur_thread(ct);
>    pthread_cleanup_pop(1);
> }
> 
> This way the thread can't be in DSO code any more when refcount goes to
> zero.

Oh! I didn't think of solving it that way, but it makes sense. So, were
you planning on posting a patch?

-Ben


> 
> Martin
> 
> -- 
> Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
> SUSE  Software Solutions Germany GmbH
> HRB 36809, AG Nürnberg GF: Felix
> Imendörffer
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Martin Wilck Nov. 5, 2020, 11:53 a.m. UTC | #9
On Wed, 2020-11-04 at 18:41 -0600, Benjamin Marzinski wrote:
> On Wed, Nov 04, 2020 at 11:56:07PM +0000, Martin Wilck wrote:
> > 
> > I guess the key is that the thread's entry point must also be in
> > libmultipath (i.e. outside the DSO). In pseudo-code:
> > 
> > entrypoint() {
> >    refcount++;
> >    pthread_cleanup_push(refcount--);
> >    tur_thread(ct);
> >    pthread_cleanup_pop(1);
> > }
> > 
> > This way the thread can't be in DSO code any more when refcount
> > goes to
> > zero.
> 
> Oh! I didn't think of solving it that way, but it makes sense. So,
> were
> you planning on posting a patch?

I just did ("libmultipath: prevent DSO unloading with astray checker
threads"). Please have a look and possibly test it using the setup
that failed for you before.

Regards
Martin
diff mbox series

Patch

diff --git a/libmultipath/checkers.c b/libmultipath/checkers.c
index 18b1f5eb..35a17f8c 100644
--- a/libmultipath/checkers.c
+++ b/libmultipath/checkers.c
@@ -22,6 +22,7 @@  struct checker_class {
 	void (*reset)(void);		     /* to reset the global variables */
 	const char **msgtable;
 	short msgtable_size;
+	int keep_dso;
 };
 
 static const char *checker_state_names[PATH_MAX_STATE] = {
@@ -74,7 +75,7 @@  void free_checker_class(struct checker_class *c)
 	list_del(&c->node);
 	if (c->reset)
 		c->reset();
-	if (c->handle) {
+	if (c->handle && !c->keep_dso) {
 		if (dlclose(c->handle) != 0) {
 			condlog(0, "Cannot unload checker %s: %s",
 				c->name, dlerror());
@@ -197,6 +198,13 @@  out:
 	return NULL;
 }
 
+void checker_keep_dso(struct checker * c)
+{
+	if (!c || !c->cls)
+		return;
+	c->cls->keep_dso = 1;
+}
+
 void checker_set_fd (struct checker * c, int fd)
 {
 	if (!c)
diff --git a/libmultipath/checkers.h b/libmultipath/checkers.h
index 9d5f90b9..af5a4006 100644
--- a/libmultipath/checkers.h
+++ b/libmultipath/checkers.h
@@ -146,6 +146,7 @@  void checker_reset (struct checker *);
 void checker_set_sync (struct checker *);
 void checker_set_async (struct checker *);
 void checker_set_fd (struct checker *, int);
+void checker_keep_dso(struct checker *c);
 void checker_enable (struct checker *);
 void checker_disable (struct checker *);
 int checker_check (struct checker *, int);
diff --git a/libmultipath/checkers/tur.c b/libmultipath/checkers/tur.c
index e886fcf8..fd58d62a 100644
--- a/libmultipath/checkers/tur.c
+++ b/libmultipath/checkers/tur.c
@@ -394,6 +394,7 @@  int libcheck_check(struct checker * c)
 		uatomic_set(&ct->running, 1);
 		tur_set_async_timeout(c);
 		setup_thread_attr(&attr, 32 * 1024, 1);
+		checker_keep_dso(c);
 		r = pthread_create(&ct->thread, &attr, tur_thread, ct);
 		pthread_attr_destroy(&attr);
 		if (r) {