[0/3] locking/mutex: Add mutex_timed_lock() to solve potential deadlock problems

Message ID 20200210204651.21674-1-longman@redhat.com

Message

Waiman Long Feb. 10, 2020, 8:46 p.m. UTC
When writing to some of the sysctl parameters or sysfs files,
locks may be taken in an order that is different from other parts of
the kernel. As a result, lockdep may complain about a circular locking
dependency, indicating that a deadlock may actually happen in some
corner cases. Patch 3 shows an example of this in the sysfs files of
the slub allocator.

In many cases it is hard to change the locking order. One possible
solution is to use a trylock loop, but that is considered inelegant
and makes it hard to control the actual wait time.
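
For illustration only, here is a userspace sketch of the trylock-loop
pattern using POSIX mutexes rather than the kernel mutex API. The helper
name trylock_loop() and its parameters are hypothetical and do not come
from the patches. It shows why the wait time is hard to control: the
total wait is only bounded indirectly by the retry count times the
sleep interval, plus scheduling delays.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/*
 * Hypothetical helper (userspace analogue, not kernel code): poll the
 * mutex up to max_tries times, sleeping sleep_us microseconds between
 * attempts. Returns 0 once the lock is held, EBUSY if every attempt
 * found the mutex locked, or another error number from trylock.
 */
static int trylock_loop(pthread_mutex_t *m, int max_tries, long sleep_us)
{
    for (int i = 0; i < max_tries; i++) {
        int ret = pthread_mutex_trylock(m);

        if (ret != EBUSY)
            return ret; /* 0 on success, or a real error */

        /* Back off before retrying; the lock may be released meanwhile. */
        struct timespec ts = {
            .tv_sec  = sleep_us / 1000000L,
            .tv_nsec = (sleep_us % 1000000L) * 1000L,
        };
        nanosleep(&ts, NULL);
    }
    return EBUSY;
}
```

A caller has to pick both max_tries and sleep_us, and a lock released
just after a failed attempt is not noticed until the sleep ends.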

An alternative solution proposed by this patchset is to add a new
mutex_timed_lock() call that allows an additional timeout argument. This
function will return an error code if timeout happens. The use of this
new API will prevent deadlock from happening while allowing the task
to wait a sufficient period of time before giving up.
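
The cover letter does not show the signature of mutex_timed_lock(), so
the following is only a userspace analogue built on POSIX
pthread_mutex_timedlock(), with a millisecond timeout as described
here. The helper name timed_lock_ms() and its return convention are
assumptions made for this sketch, not the API added by patch 1.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/*
 * Hypothetical helper (userspace analogue, not the kernel API): try to
 * take the mutex, giving up after timeout_ms milliseconds. Returns 0 on
 * success and ETIMEDOUT if the lock could not be acquired in time.
 */
static int timed_lock_ms(pthread_mutex_t *m, long timeout_ms)
{
    struct timespec deadline;

    /* pthread_mutex_timedlock() expects an absolute CLOCK_REALTIME time. */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0)
        return errno;

    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        /* Carry nanosecond overflow into the seconds field. */
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    return pthread_mutex_timedlock(m, &deadline);
}
```

On timeout the caller does not hold the lock and must back out, for
example by failing the write with an error instead of blocking forever.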

The goal of this new API is to prevent deadlock from happening, so
timeout accuracy is not high on the priority list. A coarse-grained
and easily understood millisecond-based integer timeout argument is
used. That is somewhat different from the rt_mutex_timed_lock() function
where a more precise but complex hrtimer_sleeper argument is used.

On a 4-socket 128-thread x86-64 system running a 128-thread mutex
locking microbenchmark with a 1ms timeout, the output of the
microbenchmark was:

  Running locktest with mutex [runtime = 10s, load = 1]
  Threads = 128, Min/Mean/Max = 247,667/601,134/1,621,145
  Threads = 128, Total Rate = 7,694 kop/s; Percpu Rate = 60 kop/s

The corresponding mutex locking events were:

  mutex_handoff=2032
  mutex_optspin=3486239
  mutex_sleep=2047
  mutex_slowpath=3626
  mutex_timeout=294

Waiman Long (3):
  locking/mutex: Add mutex_timed_lock()
  locking/mutex: Enable some lock event counters
  mm/slub: Fix potential deadlock problem in slab_attr_store()

 include/linux/mutex.h             |   3 +
 kernel/locking/lock_events_list.h |   9 +++
 kernel/locking/mutex.c            | 114 +++++++++++++++++++++++++++---
 mm/slub.c                         |   7 +-
 4 files changed, 123 insertions(+), 10 deletions(-)

Comments

Peter Zijlstra Feb. 11, 2020, 12:31 p.m. UTC | #1
On Mon, Feb 10, 2020 at 03:46:48PM -0500, Waiman Long wrote:
> An alternative solution proposed by this patchset is to add a new
> mutex_timed_lock() call that allows an additional timeout argument. This
> function will return an error code if timeout happens. The use of this
> new API will prevent deadlock from happening while allowing the task
> to wait a sufficient period of time before giving up.

We've always rejected timed_lock implementation because, as akpm has
already expressed, their need is disgusting.

Waiman Long Feb. 11, 2020, 11:31 p.m. UTC | #2
On 2/11/20 7:31 AM, Peter Zijlstra wrote:
> On Mon, Feb 10, 2020 at 03:46:48PM -0500, Waiman Long wrote:
>> An alternative solution proposed by this patchset is to add a new
>> mutex_timed_lock() call that allows an additional timeout argument. This
>> function will return an error code if timeout happens. The use of this
>> new API will prevent deadlock from happening while allowing the task
>> to wait a sufficient period of time before giving up.
> We've always rejected timed_lock implementation because, as akpm has
> already expressed, their need is disgusting.
>
That is fine. I will see if the lock order can be changed in a way to
address the problem.

Thanks,
Longman