From patchwork Fri Jun 28 12:20:00 2019
X-Patchwork-Submitter: Mauro Carvalho Chehab
X-Patchwork-Id: 11022191
From: Mauro Carvalho Chehab
To: Linux Doc Mailing List
Cc: Jonathan Corbet, Peter Zijlstra, Sean Paul, David Airlie, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, Mauro Carvalho Chehab, Maxime Ripard, Ingo Molnar, Federico Vaga, Mauro Carvalho Chehab, Darren Hart, Thomas Gleixner, Will Deacon
Subject: [PATCH 04/43] docs: locking: convert docs to ReST and rename to *.rst
Date: Fri, 28 Jun 2019 09:20:00 -0300
Message-Id: <0a4ec1e39a9e2cec38ffb7d24b3b911e9f11f47a.1561723980.git.mchehab+samsung@kernel.org>
X-Mailer: git-send-email 2.21.0

Convert the locking documents to ReST and add them to the kernel development book where they belong.

Most of the changes here just let Sphinx parse the text files properly. The files are already in good shape and do not require massive changes in order to be parsed.

The conversion consists of:

  - add blank lines and indentation in order to identify paragraphs;
  - fix table markups;
  - add some list markups;
  - mark literal blocks;
  - adjust title markups.

In the new index.rst, add an :orphan: tag while the file is not yet linked from the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab
Acked-by: Federico Vaga
---
 Documentation/kernel-hacking/locking.rst | 2 +-
 Documentation/locking/index.rst | 24 +++
 ...{lockdep-design.txt => lockdep-design.rst} | 51 +++--
 Documentation/locking/lockstat.rst | 204 ++++++++++++++++++
 Documentation/locking/lockstat.txt | 183 ----------------
 .../{locktorture.txt => locktorture.rst} | 105 +++++----
 .../{mutex-design.txt => mutex-design.rst} | 26 ++-
 ...t-mutex-design.txt => rt-mutex-design.rst} | 139 ++++++------
 .../locking/{rt-mutex.txt => rt-mutex.rst} | 30 +--
 .../locking/{spinlocks.txt => spinlocks.rst} | 32 ++-
 ...w-mutex-design.txt => ww-mutex-design.rst} | 82 +++----
 Documentation/pi-futex.txt | 2 +-
 .../it_IT/kernel-hacking/locking.rst | 2 +-
 drivers/gpu/drm/drm_modeset_lock.c | 2 +-
 include/linux/lockdep.h | 2 +-
 include/linux/mutex.h | 2 +-
 include/linux/rwsem.h | 2 +-
 kernel/locking/mutex.c | 2 +-
 kernel/locking/rtmutex.c | 2 +-
 lib/Kconfig.debug | 4 +-
 20 files changed, 511 insertions(+), 387 deletions(-)
 create mode 100644 Documentation/locking/index.rst
 rename Documentation/locking/{lockdep-design.txt => lockdep-design.rst} (93%)
 create mode 100644 Documentation/locking/lockstat.rst
 delete mode 100644 Documentation/locking/lockstat.txt
 rename Documentation/locking/{locktorture.txt => locktorture.rst} (57%)
 rename Documentation/locking/{mutex-design.txt => mutex-design.rst} (94%)
 rename Documentation/locking/{rt-mutex-design.txt => rt-mutex-design.rst} (91%)
 rename 
Documentation/locking/{rt-mutex.txt => rt-mutex.rst} (71%) rename Documentation/locking/{spinlocks.txt => spinlocks.rst} (89%) rename Documentation/locking/{ww-mutex-design.txt => ww-mutex-design.rst} (93%) diff --git a/Documentation/kernel-hacking/locking.rst b/Documentation/kernel-hacking/locking.rst index dc698ea456e0..a8518ac0d31d 100644 --- a/Documentation/kernel-hacking/locking.rst +++ b/Documentation/kernel-hacking/locking.rst @@ -1364,7 +1364,7 @@ Futex API reference Further reading =============== -- ``Documentation/locking/spinlocks.txt``: Linus Torvalds' spinlocking +- ``Documentation/locking/spinlocks.rst``: Linus Torvalds' spinlocking tutorial in the kernel sources. - Unix Systems for Modern Architectures: Symmetric Multiprocessing and diff --git a/Documentation/locking/index.rst b/Documentation/locking/index.rst new file mode 100644 index 000000000000..ef5da7fe9aac --- /dev/null +++ b/Documentation/locking/index.rst @@ -0,0 +1,24 @@ +:orphan: + +======= +locking +======= + +.. toctree:: + :maxdepth: 1 + + lockdep-design + lockstat + locktorture + mutex-design + rt-mutex-design + rt-mutex + spinlocks + ww-mutex-design + +.. only:: subproject and html + + Indices + ======= + + * :ref:`genindex` diff --git a/Documentation/locking/lockdep-design.txt b/Documentation/locking/lockdep-design.rst similarity index 93% rename from Documentation/locking/lockdep-design.txt rename to Documentation/locking/lockdep-design.rst index f189d130e543..23fcbc4d3fc0 100644 --- a/Documentation/locking/lockdep-design.txt +++ b/Documentation/locking/lockdep-design.rst @@ -2,6 +2,7 @@ Runtime locking correctness validator ===================================== started by Ingo Molnar + additions by Arjan van de Ven Lock-class @@ -56,7 +57,7 @@ where the last 1 category is: When locking rules are violated, these usage bits are presented in the locking error messages, inside curlies, with a total of 2 * n STATEs bits. 
-A contrived example: +A contrived example:: modprobe/2287 is trying to acquire lock: (&sio_locks[i].lock){-.-.}, at: [] mutex_lock+0x21/0x24 @@ -70,12 +71,14 @@ of the lock and readlock (if exists), for each of the n STATEs listed above respectively, and the character displayed at each bit position indicates: + === =================================================== '.' acquired while irqs disabled and not in irq context '-' acquired in irq context '+' acquired with irqs enabled '?' acquired in irq context with irqs enabled. + === =================================================== -The bits are illustrated with an example: +The bits are illustrated with an example:: (&sio_locks[i].lock){-.-.}, at: [] mutex_lock+0x21/0x24 |||| @@ -90,13 +93,13 @@ context and whether that STATE is enabled yields four possible cases as shown in the table below. The bit character is able to indicate which exact case is for the lock as of the reporting time. - ------------------------------------------- + +--------------+-------------+--------------+ | | irq enabled | irq disabled | - |-------------------------------------------| + +--------------+-------------+--------------+ | ever in irq | ? | - | - |-------------------------------------------| + +--------------+-------------+--------------+ | never in irq | + | . | - ------------------------------------------- + +--------------+-------------+--------------+ The character '-' suggests irq is disabled because if otherwise the charactor '?' would have been shown instead. Similar deduction can be @@ -113,7 +116,7 @@ is irq-unsafe means it was ever acquired with irq enabled. A softirq-unsafe lock-class is automatically hardirq-unsafe as well. 
The following states must be exclusive: only one of them is allowed to be set -for any lock-class based on its usage: +for any lock-class based on its usage:: or or @@ -134,7 +137,7 @@ Multi-lock dependency rules: The same lock-class must not be acquired twice, because this could lead to lock recursion deadlocks. -Furthermore, two locks can not be taken in inverse order: +Furthermore, two locks can not be taken in inverse order:: -> -> @@ -148,7 +151,7 @@ operations; the validator will still find whether these locks can be acquired in a circular fashion. Furthermore, the following usage based lock dependencies are not allowed -between any two lock-classes: +between any two lock-classes:: -> -> @@ -204,16 +207,16 @@ the ordering is not static. In order to teach the validator about this correct usage model, new versions of the various locking primitives were added that allow you to specify a "nesting level". An example call, for the block device mutex, -looks like this: +looks like this:: -enum bdev_bd_mutex_lock_class -{ + enum bdev_bd_mutex_lock_class + { BD_MUTEX_NORMAL, BD_MUTEX_WHOLE, BD_MUTEX_PARTITION -}; + }; - mutex_lock_nested(&bdev->bd_contains->bd_mutex, BD_MUTEX_PARTITION); +mutex_lock_nested(&bdev->bd_contains->bd_mutex, BD_MUTEX_PARTITION); In this case the locking is done on a bdev object that is known to be a partition. @@ -234,7 +237,7 @@ must be held: lockdep_assert_held*(&lock) and lockdep_*pin_lock(&lock). As the name suggests, lockdep_assert_held* family of macros assert that a particular lock is held at a certain time (and generate a WARN() otherwise). This annotation is largely used all over the kernel, e.g. kernel/sched/ -core.c +core.c:: void update_rq_clock(struct rq *rq) { @@ -253,7 +256,7 @@ out to be especially helpful to debug code with callbacks, where an upper layer assumes a lock remains taken, but a lower layer thinks it can maybe drop and reacquire the lock ("unwittingly" introducing races). 
lockdep_pin_lock() returns a 'struct pin_cookie' that is then used by lockdep_unpin_lock() to check -that nobody tampered with the lock, e.g. kernel/sched/sched.h +that nobody tampered with the lock, e.g. kernel/sched/sched.h:: static inline void rq_pin_lock(struct rq *rq, struct rq_flags *rf) { @@ -280,7 +283,7 @@ correctness) in the sense that for every simple, standalone single-task locking sequence that occurred at least once during the lifetime of the kernel, the validator proves it with a 100% certainty that no combination and timing of these locking sequences can cause any class of -lock related deadlock. [*] +lock related deadlock. [1]_ I.e. complex multi-CPU and multi-task locking scenarios do not have to occur in practice to prove a deadlock: only the simple 'component' @@ -299,7 +302,9 @@ possible combination of locking interaction between CPUs, combined with every possible hardirq and softirq nesting scenario (which is impossible to do in practice). -[*] assuming that the validator itself is 100% correct, and no other +.. [1] + + assuming that the validator itself is 100% correct, and no other part of the system corrupts the state of the validator in any way. We also assume that all NMI/SMM paths [which could interrupt even hardirq-disabled codepaths] are correct and do not interfere @@ -310,7 +315,7 @@ to do in practice). Performance: ------------ -The above rules require _massive_ amounts of runtime checking. If we did +The above rules require **massive** amounts of runtime checking. If we did that for every lock taken and for every irqs-enable event, it would render the system practically unusably slow. The complexity of checking is O(N^2), so even with just a few hundred lock-classes we'd have to do @@ -369,17 +374,17 @@ be harder to do than to say. Of course, if you do run out of lock classes, the next thing to do is to find the offending lock classes. 
First, the following command gives -you the number of lock classes currently in use along with the maximum: +you the number of lock classes currently in use along with the maximum:: grep "lock-classes" /proc/lockdep_stats -This command produces the following output on a modest system: +This command produces the following output on a modest system:: - lock-classes: 748 [max: 8191] + lock-classes: 748 [max: 8191] If the number allocated (748 above) increases continually over time, then there is likely a leak. The following command can be used to -identify the leaking lock classes: +identify the leaking lock classes:: grep "BD" /proc/lockdep diff --git a/Documentation/locking/lockstat.rst b/Documentation/locking/lockstat.rst new file mode 100644 index 000000000000..536eab8dbd99 --- /dev/null +++ b/Documentation/locking/lockstat.rst @@ -0,0 +1,204 @@ +=============== +Lock Statistics +=============== + +What +==== + +As the name suggests, it provides statistics on locks. + + +Why +=== + +Because things like lock contention can severely impact performance. + +How +=== + +Lockdep already has hooks in the lock functions and maps lock instances to +lock classes. We build on that (see Documentation/locking/lockdep-design.rst). +The graph below shows the relation between the lock functions and the various +hooks therein:: + + __acquire + | + lock _____ + | \ + | __contended + | | + | + | _______/ + |/ + | + __acquired + | + . + + . 
+ | + __release + | + unlock + + lock, unlock - the regular lock functions + __* - the hooks + <> - states + +With these hooks we provide the following statistics: + + con-bounces + - number of lock contention that involved x-cpu data + contentions + - number of lock acquisitions that had to wait + wait time + min + - shortest (non-0) time we ever had to wait for a lock + max + - longest time we ever had to wait for a lock + total + - total time we spend waiting on this lock + avg + - average time spent waiting on this lock + acq-bounces + - number of lock acquisitions that involved x-cpu data + acquisitions + - number of times we took the lock + hold time + min + - shortest (non-0) time we ever held the lock + max + - longest time we ever held the lock + total + - total time this lock was held + avg + - average time this lock was held + +These numbers are gathered per lock class, per read/write state (when +applicable). + +It also tracks 4 contention points per class. A contention point is a call site +that had to wait on lock acquisition. + +Configuration +------------- + +Lock statistics are enabled via CONFIG_LOCK_STAT. 
+ +Usage +----- + +Enable collection of statistics:: + + # echo 1 >/proc/sys/kernel/lock_stat + +Disable collection of statistics:: + + # echo 0 >/proc/sys/kernel/lock_stat + +Look at the current lock statistics:: + + ( line numbers not part of actual output, done for clarity in the explanation + below ) + + # less /proc/lock_stat + + 01 lock_stat version 0.4 + 02----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + 03 class name con-bounces contentions waittime-min waittime-max waittime-total waittime-avg acq-bounces acquisitions holdtime-min holdtime-max holdtime-total holdtime-avg + 04----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + 05 + 06 &mm->mmap_sem-W: 46 84 0.26 939.10 16371.53 194.90 47291 2922365 0.16 2220301.69 17464026916.32 5975.99 + 07 &mm->mmap_sem-R: 37 100 1.31 299502.61 325629.52 3256.30 212344 34316685 0.10 7744.91 95016910.20 2.77 + 08 --------------- + 09 &mm->mmap_sem 1 [] khugepaged_scan_mm_slot+0x57/0x280 + 10 &mm->mmap_sem 96 [] __do_page_fault+0x1d4/0x510 + 11 &mm->mmap_sem 34 [] vm_mmap_pgoff+0x87/0xd0 + 12 &mm->mmap_sem 17 [] vm_munmap+0x41/0x80 + 13 --------------- + 14 &mm->mmap_sem 1 [] dup_mmap+0x2a/0x3f0 + 15 &mm->mmap_sem 60 [] SyS_mprotect+0xe9/0x250 + 16 &mm->mmap_sem 41 [] __do_page_fault+0x1d4/0x510 + 17 &mm->mmap_sem 68 [] vm_mmap_pgoff+0x87/0xd0 + 18 + 19............................................................................................................................................................................................................................. 
+ 20 + 21 unix_table_lock: 110 112 0.21 49.24 163.91 1.46 21094 66312 0.12 624.42 31589.81 0.48 + 22 --------------- + 23 unix_table_lock 45 [] unix_create1+0x16e/0x1b0 + 24 unix_table_lock 47 [] unix_release_sock+0x31/0x250 + 25 unix_table_lock 15 [] unix_find_other+0x117/0x230 + 26 unix_table_lock 5 [] unix_autobind+0x11f/0x1b0 + 27 --------------- + 28 unix_table_lock 39 [] unix_release_sock+0x31/0x250 + 29 unix_table_lock 49 [] unix_create1+0x16e/0x1b0 + 30 unix_table_lock 20 [] unix_find_other+0x117/0x230 + 31 unix_table_lock 4 [] unix_autobind+0x11f/0x1b0 + + +This excerpt shows the first two lock class statistics. Line 01 shows the +output version - each time the format changes this will be updated. Line 02-04 +show the header with column descriptions. Lines 05-18 and 20-31 show the actual +statistics. These statistics come in two parts; the actual stats separated by a +short separator (line 08, 13) from the contention points. + +Lines 09-12 show the first 4 recorded contention points (the code +which tries to get the lock) and lines 14-17 show the first 4 recorded +contended points (the lock holder). It is possible that the max +con-bounces point is missing in the statistics. + +The first lock (05-18) is a read/write lock, and shows two lines above the +short separator. The contention points don't match the column descriptors, +they have two: contentions and [] symbol. The second set of contention +points are the points we're contending with. + +The integer part of the time values is in us. + +Dealing with nested locks, subclasses may appear:: + + 32........................................................................................................................................................................................................................... 
+ 33 + 34 &rq->lock: 13128 13128 0.43 190.53 103881.26 7.91 97454 3453404 0.00 401.11 13224683.11 3.82 + 35 --------- + 36 &rq->lock 645 [] task_rq_lock+0x43/0x75 + 37 &rq->lock 297 [] try_to_wake_up+0x127/0x25a + 38 &rq->lock 360 [] select_task_rq_fair+0x1f0/0x74a + 39 &rq->lock 428 [] scheduler_tick+0x46/0x1fb + 40 --------- + 41 &rq->lock 77 [] task_rq_lock+0x43/0x75 + 42 &rq->lock 174 [] try_to_wake_up+0x127/0x25a + 43 &rq->lock 4715 [] double_rq_lock+0x42/0x54 + 44 &rq->lock 893 [] schedule+0x157/0x7b8 + 45 + 46........................................................................................................................................................................................................................... + 47 + 48 &rq->lock/1: 1526 11488 0.33 388.73 136294.31 11.86 21461 38404 0.00 37.93 109388.53 2.84 + 49 ----------- + 50 &rq->lock/1 11526 [] double_rq_lock+0x4f/0x54 + 51 ----------- + 52 &rq->lock/1 5645 [] double_rq_lock+0x42/0x54 + 53 &rq->lock/1 1224 [] schedule+0x157/0x7b8 + 54 &rq->lock/1 4336 [] double_rq_lock+0x4f/0x54 + 55 &rq->lock/1 181 [] try_to_wake_up+0x127/0x25a + +Line 48 shows statistics for the second subclass (/1) of &rq->lock class +(subclass starts from 0), since in this case, as line 50 suggests, +double_rq_lock actually acquires a nested lock of two spinlocks. 
+ +View the top contending locks:: + + # grep : /proc/lock_stat | head + clockevents_lock: 2926159 2947636 0.15 46882.81 1784540466.34 605.41 3381345 3879161 0.00 2260.97 53178395.68 13.71 + tick_broadcast_lock: 346460 346717 0.18 2257.43 39364622.71 113.54 3642919 4242696 0.00 2263.79 49173646.60 11.59 + &mapping->i_mmap_mutex: 203896 203899 3.36 645530.05 31767507988.39 155800.21 3361776 8893984 0.17 2254.15 14110121.02 1.59 + &rq->lock: 135014 136909 0.18 606.09 842160.68 6.15 1540728 10436146 0.00 728.72 17606683.41 1.69 + &(&zone->lru_lock)->rlock: 93000 94934 0.16 59.18 188253.78 1.98 1199912 3809894 0.15 391.40 3559518.81 0.93 + tasklist_lock-W: 40667 41130 0.23 1189.42 428980.51 10.43 270278 510106 0.16 653.51 3939674.91 7.72 + tasklist_lock-R: 21298 21305 0.20 1310.05 215511.12 10.12 186204 241258 0.14 1162.33 1179779.23 4.89 + rcu_node_1: 47656 49022 0.16 635.41 193616.41 3.95 844888 1865423 0.00 764.26 1656226.96 0.89 + &(&dentry->d_lockref.lock)->rlock: 39791 40179 0.15 1302.08 88851.96 2.21 2790851 12527025 0.10 1910.75 3379714.27 0.27 + rcu_node_0: 29203 30064 0.16 786.55 1555573.00 51.74 88963 244254 0.00 398.87 428872.51 1.76 + +Clear the statistics:: + + # echo 0 > /proc/lock_stat diff --git a/Documentation/locking/lockstat.txt b/Documentation/locking/lockstat.txt deleted file mode 100644 index fdbeb0c45ef3..000000000000 --- a/Documentation/locking/lockstat.txt +++ /dev/null @@ -1,183 +0,0 @@ - -LOCK STATISTICS - -- WHAT - -As the name suggests, it provides statistics on locks. - -- WHY - -Because things like lock contention can severely impact performance. - -- HOW - -Lockdep already has hooks in the lock functions and maps lock instances to -lock classes. We build on that (see Documentation/locking/lockdep-design.txt). -The graph below shows the relation between the lock functions and the various -hooks therein. - - __acquire - | - lock _____ - | \ - | __contended - | | - | - | _______/ - |/ - | - __acquired - | - . - - . 
- | - __release - | - unlock - -lock, unlock - the regular lock functions -__* - the hooks -<> - states - -With these hooks we provide the following statistics: - - con-bounces - number of lock contention that involved x-cpu data - contentions - number of lock acquisitions that had to wait - wait time min - shortest (non-0) time we ever had to wait for a lock - max - longest time we ever had to wait for a lock - total - total time we spend waiting on this lock - avg - average time spent waiting on this lock - acq-bounces - number of lock acquisitions that involved x-cpu data - acquisitions - number of times we took the lock - hold time min - shortest (non-0) time we ever held the lock - max - longest time we ever held the lock - total - total time this lock was held - avg - average time this lock was held - -These numbers are gathered per lock class, per read/write state (when -applicable). - -It also tracks 4 contention points per class. A contention point is a call site -that had to wait on lock acquisition. - - - CONFIGURATION - -Lock statistics are enabled via CONFIG_LOCK_STAT. 
- - - USAGE - -Enable collection of statistics: - -# echo 1 >/proc/sys/kernel/lock_stat - -Disable collection of statistics: - -# echo 0 >/proc/sys/kernel/lock_stat - -Look at the current lock statistics: - -( line numbers not part of actual output, done for clarity in the explanation - below ) - -# less /proc/lock_stat - -01 lock_stat version 0.4 -02----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -03 class name con-bounces contentions waittime-min waittime-max waittime-total waittime-avg acq-bounces acquisitions holdtime-min holdtime-max holdtime-total holdtime-avg -04----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -05 -06 &mm->mmap_sem-W: 46 84 0.26 939.10 16371.53 194.90 47291 2922365 0.16 2220301.69 17464026916.32 5975.99 -07 &mm->mmap_sem-R: 37 100 1.31 299502.61 325629.52 3256.30 212344 34316685 0.10 7744.91 95016910.20 2.77 -08 --------------- -09 &mm->mmap_sem 1 [] khugepaged_scan_mm_slot+0x57/0x280 -10 &mm->mmap_sem 96 [] __do_page_fault+0x1d4/0x510 -11 &mm->mmap_sem 34 [] vm_mmap_pgoff+0x87/0xd0 -12 &mm->mmap_sem 17 [] vm_munmap+0x41/0x80 -13 --------------- -14 &mm->mmap_sem 1 [] dup_mmap+0x2a/0x3f0 -15 &mm->mmap_sem 60 [] SyS_mprotect+0xe9/0x250 -16 &mm->mmap_sem 41 [] __do_page_fault+0x1d4/0x510 -17 &mm->mmap_sem 68 [] vm_mmap_pgoff+0x87/0xd0 -18 -19............................................................................................................................................................................................................................. 
-20 -21 unix_table_lock: 110 112 0.21 49.24 163.91 1.46 21094 66312 0.12 624.42 31589.81 0.48 -22 --------------- -23 unix_table_lock 45 [] unix_create1+0x16e/0x1b0 -24 unix_table_lock 47 [] unix_release_sock+0x31/0x250 -25 unix_table_lock 15 [] unix_find_other+0x117/0x230 -26 unix_table_lock 5 [] unix_autobind+0x11f/0x1b0 -27 --------------- -28 unix_table_lock 39 [] unix_release_sock+0x31/0x250 -29 unix_table_lock 49 [] unix_create1+0x16e/0x1b0 -30 unix_table_lock 20 [] unix_find_other+0x117/0x230 -31 unix_table_lock 4 [] unix_autobind+0x11f/0x1b0 - - -This excerpt shows the first two lock class statistics. Line 01 shows the -output version - each time the format changes this will be updated. Line 02-04 -show the header with column descriptions. Lines 05-18 and 20-31 show the actual -statistics. These statistics come in two parts; the actual stats separated by a -short separator (line 08, 13) from the contention points. - -Lines 09-12 show the first 4 recorded contention points (the code -which tries to get the lock) and lines 14-17 show the first 4 recorded -contended points (the lock holder). It is possible that the max -con-bounces point is missing in the statistics. - -The first lock (05-18) is a read/write lock, and shows two lines above the -short separator. The contention points don't match the column descriptors, -they have two: contentions and [] symbol. The second set of contention -points are the points we're contending with. - -The integer part of the time values is in us. - -Dealing with nested locks, subclasses may appear: - -32........................................................................................................................................................................................................................... 
-33 -34 &rq->lock: 13128 13128 0.43 190.53 103881.26 7.91 97454 3453404 0.00 401.11 13224683.11 3.82 -35 --------- -36 &rq->lock 645 [] task_rq_lock+0x43/0x75 -37 &rq->lock 297 [] try_to_wake_up+0x127/0x25a -38 &rq->lock 360 [] select_task_rq_fair+0x1f0/0x74a -39 &rq->lock 428 [] scheduler_tick+0x46/0x1fb -40 --------- -41 &rq->lock 77 [] task_rq_lock+0x43/0x75 -42 &rq->lock 174 [] try_to_wake_up+0x127/0x25a -43 &rq->lock 4715 [] double_rq_lock+0x42/0x54 -44 &rq->lock 893 [] schedule+0x157/0x7b8 -45 -46........................................................................................................................................................................................................................... -47 -48 &rq->lock/1: 1526 11488 0.33 388.73 136294.31 11.86 21461 38404 0.00 37.93 109388.53 2.84 -49 ----------- -50 &rq->lock/1 11526 [] double_rq_lock+0x4f/0x54 -51 ----------- -52 &rq->lock/1 5645 [] double_rq_lock+0x42/0x54 -53 &rq->lock/1 1224 [] schedule+0x157/0x7b8 -54 &rq->lock/1 4336 [] double_rq_lock+0x4f/0x54 -55 &rq->lock/1 181 [] try_to_wake_up+0x127/0x25a - -Line 48 shows statistics for the second subclass (/1) of &rq->lock class -(subclass starts from 0), since in this case, as line 50 suggests, -double_rq_lock actually acquires a nested lock of two spinlocks. 
- -View the top contending locks: - -# grep : /proc/lock_stat | head - clockevents_lock: 2926159 2947636 0.15 46882.81 1784540466.34 605.41 3381345 3879161 0.00 2260.97 53178395.68 13.71 - tick_broadcast_lock: 346460 346717 0.18 2257.43 39364622.71 113.54 3642919 4242696 0.00 2263.79 49173646.60 11.59 - &mapping->i_mmap_mutex: 203896 203899 3.36 645530.05 31767507988.39 155800.21 3361776 8893984 0.17 2254.15 14110121.02 1.59 - &rq->lock: 135014 136909 0.18 606.09 842160.68 6.15 1540728 10436146 0.00 728.72 17606683.41 1.69 - &(&zone->lru_lock)->rlock: 93000 94934 0.16 59.18 188253.78 1.98 1199912 3809894 0.15 391.40 3559518.81 0.93 - tasklist_lock-W: 40667 41130 0.23 1189.42 428980.51 10.43 270278 510106 0.16 653.51 3939674.91 7.72 - tasklist_lock-R: 21298 21305 0.20 1310.05 215511.12 10.12 186204 241258 0.14 1162.33 1179779.23 4.89 - rcu_node_1: 47656 49022 0.16 635.41 193616.41 3.95 844888 1865423 0.00 764.26 1656226.96 0.89 - &(&dentry->d_lockref.lock)->rlock: 39791 40179 0.15 1302.08 88851.96 2.21 2790851 12527025 0.10 1910.75 3379714.27 0.27 - rcu_node_0: 29203 30064 0.16 786.55 1555573.00 51.74 88963 244254 0.00 398.87 428872.51 1.76 - -Clear the statistics: - -# echo 0 > /proc/lock_stat diff --git a/Documentation/locking/locktorture.txt b/Documentation/locking/locktorture.rst similarity index 57% rename from Documentation/locking/locktorture.txt rename to Documentation/locking/locktorture.rst index 6a8df4cd19bf..e79eeeca3ac6 100644 --- a/Documentation/locking/locktorture.txt +++ b/Documentation/locking/locktorture.rst @@ -1,6 +1,9 @@ +================================== Kernel Lock Torture Test Operation +================================== CONFIG_LOCK_TORTURE_TEST +======================== The CONFIG LOCK_TORTURE_TEST config option provides a kernel module that runs torture tests on core kernel locking primitives. The kernel @@ -18,61 +21,77 @@ can be simulated by either enlarging this critical region hold time and/or creating more kthreads. 
-MODULE PARAMETERS +Module Parameters +================= This module has the following parameters: - ** Locktorture-specific ** +Locktorture-specific +-------------------- -nwriters_stress Number of kernel threads that will stress exclusive lock +nwriters_stress + Number of kernel threads that will stress exclusive lock ownership (writers). The default value is twice the number of online CPUs. -nreaders_stress Number of kernel threads that will stress shared lock +nreaders_stress + Number of kernel threads that will stress shared lock ownership (readers). The default is the same amount of writer locks. If the user did not specify nwriters_stress, then both readers and writers be the amount of online CPUs. -torture_type Type of lock to torture. By default, only spinlocks will +torture_type + Type of lock to torture. By default, only spinlocks will be tortured. This module can torture the following locks, with string values as follows: - o "lock_busted": Simulates a buggy lock implementation. + - "lock_busted": + Simulates a buggy lock implementation. - o "spin_lock": spin_lock() and spin_unlock() pairs. + - "spin_lock": + spin_lock() and spin_unlock() pairs. - o "spin_lock_irq": spin_lock_irq() and spin_unlock_irq() - pairs. + - "spin_lock_irq": + spin_lock_irq() and spin_unlock_irq() pairs. - o "rw_lock": read/write lock() and unlock() rwlock pairs. + - "rw_lock": + read/write lock() and unlock() rwlock pairs. - o "rw_lock_irq": read/write lock_irq() and unlock_irq() - rwlock pairs. + - "rw_lock_irq": + read/write lock_irq() and unlock_irq() + rwlock pairs. - o "mutex_lock": mutex_lock() and mutex_unlock() pairs. + - "mutex_lock": + mutex_lock() and mutex_unlock() pairs. - o "rtmutex_lock": rtmutex_lock() and rtmutex_unlock() - pairs. Kernel must have CONFIG_RT_MUTEX=y. + - "rtmutex_lock": + rtmutex_lock() and rtmutex_unlock() pairs. + Kernel must have CONFIG_RT_MUTEX=y. - o "rwsem_lock": read/write down() and up() semaphore pairs. 
+ - "rwsem_lock": + read/write down() and up() semaphore pairs. - ** Torture-framework (RCU + locking) ** +Torture-framework (RCU + locking) +--------------------------------- -shutdown_secs The number of seconds to run the test before terminating +shutdown_secs + The number of seconds to run the test before terminating the test and powering off the system. The default is zero, which disables test termination and system shutdown. This capability is useful for automated testing. -onoff_interval The number of seconds between each attempt to execute a +onoff_interval + The number of seconds between each attempt to execute a randomly selected CPU-hotplug operation. Defaults to zero, which disables CPU hotplugging. In CONFIG_HOTPLUG_CPU=n kernels, locktorture will silently refuse to do any CPU-hotplug operations regardless of what value is specified for onoff_interval. -onoff_holdoff The number of seconds to wait until starting CPU-hotplug +onoff_holdoff + The number of seconds to wait until starting CPU-hotplug operations. This would normally only be used when locktorture was built into the kernel and started automatically at boot time, in which case it is useful @@ -80,53 +99,59 @@ onoff_holdoff The number of seconds to wait until starting CPU-hotplug coming and going. This parameter is only useful if CONFIG_HOTPLUG_CPU is enabled. -stat_interval Number of seconds between statistics-related printk()s. +stat_interval + Number of seconds between statistics-related printk()s. By default, locktorture will report stats every 60 seconds. Setting the interval to zero causes the statistics to be printed -only- when the module is unloaded, and this is the default. -stutter The length of time to run the test before pausing for this +stutter + The length of time to run the test before pausing for this same period of time. Defaults to "stutter=5", so as to run and pause for (roughly) five-second intervals. 
Specifying "stutter=0" causes the test to run continuously without pausing, which is the old default behavior. -shuffle_interval The number of seconds to keep the test threads affinitied +shuffle_interval + The number of seconds to keep the test threads affinitied to a particular subset of the CPUs, defaults to 3 seconds. Used in conjunction with test_no_idle_hz. -verbose Enable verbose debugging printing, via printk(). Enabled +verbose + Enable verbose debugging printing, via printk(). Enabled by default. This extra information is mostly related to high-level errors and reports from the main 'torture' framework. -STATISTICS +Statistics +========== -Statistics are printed in the following format: +Statistics are printed in the following format:: -spin_lock-torture: Writes: Total: 93746064 Max/Min: 0/0 Fail: 0 - (A) (B) (C) (D) (E) + spin_lock-torture: Writes: Total: 93746064 Max/Min: 0/0 Fail: 0 + (A) (B) (C) (D) (E) -(A): Lock type that is being tortured -- torture_type parameter. + (A): Lock type that is being tortured -- torture_type parameter. -(B): Number of writer lock acquisitions. If dealing with a read/write primitive - a second "Reads" statistics line is printed. + (B): Number of writer lock acquisitions. If dealing with a read/write + primitive a second "Reads" statistics line is printed. -(C): Number of times the lock was acquired. + (C): Number of times the lock was acquired. -(D): Min and max number of times threads failed to acquire the lock. + (D): Min and max number of times threads failed to acquire the lock. -(E): true/false values if there were errors acquiring the lock. This should - -only- be positive if there is a bug in the locking primitive's - implementation. Otherwise a lock should never fail (i.e., spin_lock()). - Of course, the same applies for (C), above. A dummy example of this is - the "lock_busted" type. + (E): true/false values if there were errors acquiring the lock. 
This should + -only- be positive if there is a bug in the locking primitive's + implementation. Otherwise a lock should never fail (i.e., spin_lock()). + Of course, the same applies for (C), above. A dummy example of this is + the "lock_busted" type. -USAGE +Usage +===== -The following script may be used to torture locks: +The following script may be used to torture locks:: #!/bin/sh diff --git a/Documentation/locking/mutex-design.txt b/Documentation/locking/mutex-design.rst similarity index 94% rename from Documentation/locking/mutex-design.txt rename to Documentation/locking/mutex-design.rst index 818aca19612f..4d8236b81fa5 100644 --- a/Documentation/locking/mutex-design.txt +++ b/Documentation/locking/mutex-design.rst @@ -1,6 +1,9 @@ +======================= Generic Mutex Subsystem +======================= started by Ingo Molnar + updated by Davidlohr Bueso What are mutexes? @@ -23,7 +26,7 @@ Implementation Mutexes are represented by 'struct mutex', defined in include/linux/mutex.h and implemented in kernel/locking/mutex.c. These locks use an atomic variable (->owner) to keep track of the lock state during its lifetime. Field owner -actually contains 'struct task_struct *' to the current lock owner and it is +actually contains `struct task_struct *` to the current lock owner and it is therefore NULL if not currently owned. Since task_struct pointers are aligned at at least L1_CACHE_BYTES, low bits (3) are used to store extra state (e.g., if waiter list is non-empty). 
In its most basic form it also includes a @@ -101,29 +104,36 @@ features that make lock debugging easier and faster: Interfaces ---------- -Statically define the mutex: +Statically define the mutex:: + DEFINE_MUTEX(name); -Dynamically initialize the mutex: +Dynamically initialize the mutex:: + mutex_init(mutex); -Acquire the mutex, uninterruptible: +Acquire the mutex, uninterruptible:: + void mutex_lock(struct mutex *lock); void mutex_lock_nested(struct mutex *lock, unsigned int subclass); int mutex_trylock(struct mutex *lock); -Acquire the mutex, interruptible: +Acquire the mutex, interruptible:: + int mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass); int mutex_lock_interruptible(struct mutex *lock); -Acquire the mutex, interruptible, if dec to 0: +Acquire the mutex, interruptible, if dec to 0:: + int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock); -Unlock the mutex: +Unlock the mutex:: + void mutex_unlock(struct mutex *lock); -Test if the mutex is taken: +Test if the mutex is taken:: + int mutex_is_locked(struct mutex *lock); Disadvantages diff --git a/Documentation/locking/rt-mutex-design.txt b/Documentation/locking/rt-mutex-design.rst similarity index 91% rename from Documentation/locking/rt-mutex-design.txt rename to Documentation/locking/rt-mutex-design.rst index 3d7b865539cc..59c2a64efb21 100644 --- a/Documentation/locking/rt-mutex-design.txt +++ b/Documentation/locking/rt-mutex-design.rst @@ -1,14 +1,15 @@ -# -# Copyright (c) 2006 Steven Rostedt -# Licensed under the GNU Free Documentation License, Version 1.2 -# - +============================== RT-mutex implementation design ------------------------------- +============================== + +Copyright (c) 2006 Steven Rostedt + +Licensed under the GNU Free Documentation License, Version 1.2 + This document tries to describe the design of the rtmutex.c implementation. It doesn't describe the reasons why rtmutex.c exists. 
For that please see -Documentation/locking/rt-mutex.txt. Although this document does explain problems +Documentation/locking/rt-mutex.rst. Although this document does explain problems that happen without this code, but that is in the concept to understand what the code actually is doing. @@ -41,17 +42,17 @@ to release the lock, because for all we know, B is a CPU hog and will never give C a chance to release the lock. This is called unbounded priority inversion. -Here's a little ASCII art to show the problem. +Here's a little ASCII art to show the problem:: - grab lock L1 (owned by C) - | -A ---+ - C preempted by B - | -C +----+ + grab lock L1 (owned by C) + | + A ---+ + C preempted by B + | + C +----+ -B +--------> - B now keeps A from running. + B +--------> + B now keeps A from running. Priority Inheritance (PI) @@ -75,24 +76,29 @@ Terminology Here I explain some terminology that is used in this document to help describe the design that is used to implement PI. -PI chain - The PI chain is an ordered series of locks and processes that cause +PI chain + - The PI chain is an ordered series of locks and processes that cause processes to inherit priorities from a previous process that is blocked on one of its locks. This is described in more detail later in this document. -mutex - In this document, to differentiate from locks that implement +mutex + - In this document, to differentiate from locks that implement PI and spin locks that are used in the PI code, from now on the PI locks will be called a mutex. -lock - In this document from now on, I will use the term lock when +lock + - In this document from now on, I will use the term lock when referring to spin locks that are used to protect parts of the PI algorithm. These locks disable preemption for UP (when CONFIG_PREEMPT is enabled) and on SMP prevents multiple CPUs from entering critical sections simultaneously. -spin lock - Same as lock above. +spin lock + - Same as lock above. 
-waiter - A waiter is a struct that is stored on the stack of a blocked +waiter + - A waiter is a struct that is stored on the stack of a blocked process. Since the scope of the waiter is within the code for a process being blocked on the mutex, it is fine to allocate the waiter on the process's stack (local variable). This @@ -104,14 +110,18 @@ waiter - A waiter is a struct that is stored on the stack of a blocked waiter is sometimes used in reference to the task that is waiting on a mutex. This is the same as waiter->task. -waiters - A list of processes that are blocked on a mutex. +waiters + - A list of processes that are blocked on a mutex. -top waiter - The highest priority process waiting on a specific mutex. +top waiter + - The highest priority process waiting on a specific mutex. -top pi waiter - The highest priority process waiting on one of the mutexes +top pi waiter + - The highest priority process waiting on one of the mutexes that a specific process owns. -Note: task and process are used interchangeably in this document, mostly to +Note: + task and process are used interchangeably in this document, mostly to differentiate between two processes that are being described together. @@ -123,7 +133,7 @@ inheritance to take place. Multiple chains may converge, but a chain would never diverge, since a process can't be blocked on more than one mutex at a time. -Example: +Example:: Process: A, B, C, D, E Mutexes: L1, L2, L3, L4 @@ -137,21 +147,21 @@ Example: D owns L4 E blocked on L4 -The chain would be: +The chain would be:: E->L4->D->L3->C->L2->B->L1->A To show where two chains merge, we could add another process F and another mutex L5 where B owns L5 and F is blocked on mutex L5. -The chain for F would be: +The chain for F would be:: F->L5->B->L1->A Since a process may own more than one mutex, but never be blocked on more than one, the chains merge. 
-Here we show both chains: +Here we show both chains:: E->L4->D->L3->C->L2-+ | @@ -165,12 +175,12 @@ than the processes to the left or below in the chain. Also since a mutex may have more than one process blocked on it, we can have multiple chains merge at mutexes. If we add another process G that is -blocked on mutex L2: +blocked on mutex L2:: G->L2->B->L1->A And once again, to show how this can grow I will show the merging chains -again. +again:: E->L4->D->L3->C-+ +->L2-+ @@ -184,7 +194,7 @@ the chain (A and B in this example), must have their priorities increased to that of G. Mutex Waiters Tree ------------------ +------------------ Every mutex keeps track of all the waiters that are blocked on itself. The mutex has a rbtree to store these waiters by priority. This tree is protected @@ -219,19 +229,19 @@ defined. But is very complex to figure it out, since it depends on all the nesting of mutexes. Let's look at the example where we have 3 mutexes, L1, L2, and L3, and four separate functions func1, func2, func3 and func4. The following shows a locking order of L1->L2->L3, but may not actually -be directly nested that way. +be directly nested that way:: -void func1(void) -{ + void func1(void) + { mutex_lock(L1); /* do anything */ mutex_unlock(L1); -} + } -void func2(void) -{ + void func2(void) + { mutex_lock(L1); mutex_lock(L2); @@ -239,10 +249,10 @@ void func2(void) mutex_unlock(L2); mutex_unlock(L1); -} + } -void func3(void) -{ + void func3(void) + { mutex_lock(L2); mutex_lock(L3); @@ -250,30 +260,30 @@ void func3(void) mutex_unlock(L3); mutex_unlock(L2); -} + } -void func4(void) -{ + void func4(void) + { mutex_lock(L3); /* do something again */ mutex_unlock(L3); -} + } Now we add 4 processes that run each of these functions separately. Processes A, B, C, and D which run functions func1, func2, func3 and func4 respectively, and such that D runs first and A last. 
With D being preempted -in func4 in the "do something again" area, we have a locking that follows: +in func4 in the "do something again" area, we have a locking that follows:: -D owns L3 - C blocked on L3 - C owns L2 - B blocked on L2 - B owns L1 - A blocked on L1 + D owns L3 + C blocked on L3 + C owns L2 + B blocked on L2 + B owns L1 + A blocked on L1 -And thus we have the chain A->L1->B->L2->C->L3->D. + And thus we have the chain A->L1->B->L2->C->L3->D. This gives us a PI depth of 4 (four processes), but looking at any of the functions individually, it seems as though they only have at most a locking @@ -298,7 +308,7 @@ not true, the rtmutex.c code will be broken!), this allows for the least significant bit to be used as a flag. Bit 0 is used as the "Has Waiters" flag. It's set whenever there are waiters on a mutex. -See Documentation/locking/rt-mutex.txt for further details. +See Documentation/locking/rt-mutex.rst for further details. cmpxchg Tricks -------------- @@ -307,17 +317,17 @@ Some architectures implement an atomic cmpxchg (Compare and Exchange). This is used (when applicable) to keep the fast path of grabbing and releasing mutexes short. -cmpxchg is basically the following function performed atomically: +cmpxchg is basically the following function performed atomically:: -unsigned long _cmpxchg(unsigned long *A, unsigned long *B, unsigned long *C) -{ + unsigned long _cmpxchg(unsigned long *A, unsigned long *B, unsigned long *C) + { unsigned long T = *A; if (*A == *B) { *A = *C; } return T; -} -#define cmpxchg(a,b,c) _cmpxchg(&a,&b,&c) + } + #define cmpxchg(a,b,c) _cmpxchg(&a,&b,&c) This is really nice to have, since it allows you to only update a variable if the variable is what you expect it to be. You know if it succeeded if @@ -352,9 +362,10 @@ Then rt_mutex_setprio is called to adjust the priority of the task to the new priority. Note that rt_mutex_setprio is defined in kernel/sched/core.c to implement the actual change in priority. 
-(Note: For the "prio" field in task_struct, the lower the number, the +Note: + For the "prio" field in task_struct, the lower the number, the higher the priority. A "prio" of 5 is of higher priority than a - "prio" of 10.) + "prio" of 10. It is interesting to note that rt_mutex_adjust_prio can either increase or decrease the priority of the task. In the case that a higher priority @@ -439,6 +450,7 @@ wait_lock, which this code currently holds. So setting the "Has Waiters" flag forces the current owner to synchronize with this code. The lock is taken if the following are true: + 1) The lock has no owner 2) The current task is the highest priority against all other waiters of the lock @@ -546,10 +558,13 @@ Credits ------- Author: Steven Rostedt + Updated: Alex Shi - 7/6/2017 -Original Reviewers: Ingo Molnar, Thomas Gleixner, Thomas Duetsch, and +Original Reviewers: + Ingo Molnar, Thomas Gleixner, Thomas Duetsch, and Randy Dunlap + Update (7/6/2017) Reviewers: Steven Rostedt and Sebastian Siewior Updates diff --git a/Documentation/locking/rt-mutex.txt b/Documentation/locking/rt-mutex.rst similarity index 71% rename from Documentation/locking/rt-mutex.txt rename to Documentation/locking/rt-mutex.rst index 35793e003041..c365dc302081 100644 --- a/Documentation/locking/rt-mutex.txt +++ b/Documentation/locking/rt-mutex.rst @@ -1,5 +1,6 @@ +================================== RT-mutex subsystem with PI support ----------------------------------- +================================== RT-mutexes with priority inheritance are used to support PI-futexes, which enable pthread_mutex_t priority inheritance attributes @@ -46,27 +47,30 @@ The state of the rt-mutex is tracked via the owner field of the rt-mutex structure: lock->owner holds the task_struct pointer of the owner. Bit 0 is used to -keep track of the "lock has waiters" state. 
+keep track of the "lock has waiters" state: - owner bit0 + ============ ======= ================================================ + owner bit0 Notes + ============ ======= ================================================ NULL 0 lock is free (fast acquire possible) NULL 1 lock is free and has waiters and the top waiter - is going to take the lock* + is going to take the lock [1]_ taskpointer 0 lock is held (fast release possible) - taskpointer 1 lock is held and has waiters** + taskpointer 1 lock is held and has waiters [2]_ + ============ ======= ================================================ The fast atomic compare exchange based acquire and release is only possible when bit 0 of lock->owner is 0. -(*) It also can be a transitional state when grabbing the lock -with ->wait_lock is held. To prevent any fast path cmpxchg to the lock, -we need to set the bit0 before looking at the lock, and the owner may be -NULL in this small time, hence this can be a transitional state. +.. [1] It also can be a transitional state when grabbing the lock + with ->wait_lock is held. To prevent any fast path cmpxchg to the lock, + we need to set the bit0 before looking at the lock, and the owner may + be NULL in this small time, hence this can be a transitional state. -(**) There is a small time when bit 0 is set but there are no -waiters. This can happen when grabbing the lock in the slow path. -To prevent a cmpxchg of the owner releasing the lock, we need to -set this bit before looking at the lock. +.. [2] There is a small time when bit 0 is set but there are no + waiters. This can happen when grabbing the lock in the slow path. + To prevent a cmpxchg of the owner releasing the lock, we need to + set this bit before looking at the lock. BTW, there is still technically a "Pending Owner", it's just not called that anymore. 
The pending owner happens to be the top_waiter of a lock diff --git a/Documentation/locking/spinlocks.txt b/Documentation/locking/spinlocks.rst similarity index 89% rename from Documentation/locking/spinlocks.txt rename to Documentation/locking/spinlocks.rst index ff35e40bdf5b..098107fb7d86 100644 --- a/Documentation/locking/spinlocks.txt +++ b/Documentation/locking/spinlocks.rst @@ -1,8 +1,13 @@ +=============== +Locking lessons +=============== + Lesson 1: Spin locks +==================== -The most basic primitive for locking is spinlock. +The most basic primitive for locking is spinlock:: -static DEFINE_SPINLOCK(xxx_lock); + static DEFINE_SPINLOCK(xxx_lock); unsigned long flags; @@ -19,23 +24,25 @@ worry about UP vs SMP issues: the spinlocks work correctly under both. NOTE! Implications of spin_locks for memory are further described in: Documentation/memory-barriers.txt + (5) LOCK operations. + (6) UNLOCK operations. The above is usually pretty simple (you usually need and want only one spinlock for most things - using more than one spinlock can make things a lot more complex and even slower and is usually worth it only for -sequences that you _know_ need to be split up: avoid it at all cost if you +sequences that you **know** need to be split up: avoid it at all cost if you aren't sure). This is really the only really hard part about spinlocks: once you start using spinlocks they tend to expand to areas you might not have noticed before, because you have to make sure the spinlocks correctly protect the -shared data structures _everywhere_ they are used. The spinlocks are most +shared data structures **everywhere** they are used. The spinlocks are most easily added to places that are completely independent of other code (for example, internal driver data structures that nobody else ever touches). - NOTE! The spin-lock is safe only when you _also_ use the lock itself + NOTE! 
The spin-lock is safe only when you **also** use the lock itself to do locking across CPU's, which implies that EVERYTHING that touches a shared variable has to agree about the spinlock they want to use. @@ -43,6 +50,7 @@ example, internal driver data structures that nobody else ever touches). ---- Lesson 2: reader-writer spinlocks. +================================== If your data accesses have a very natural pattern where you usually tend to mostly read from the shared variables, the reader-writer locks @@ -54,7 +62,7 @@ to change the variables it has to get an exclusive write lock. simple spinlocks. Unless the reader critical section is long, you are better off just using spinlocks. -The routines look the same as above: +The routines look the same as above:: rwlock_t xxx_lock = __RW_LOCK_UNLOCKED(xxx_lock); @@ -71,7 +79,7 @@ The routines look the same as above: The above kind of lock may be useful for complex data structures like linked lists, especially searching for entries without changing the list itself. The read lock allows many concurrent readers. Anything that -_changes_ the list will have to get the write lock. +**changes** the list will have to get the write lock. NOTE! RCU is better for list traversal, but requires careful attention to design detail (see Documentation/RCU/listRCU.txt). @@ -87,10 +95,11 @@ to get the write-lock at the very beginning. ---- Lesson 3: spinlocks revisited. +============================== The single spin-lock primitives above are by no means the only ones. They are the most safe ones, and the ones that work under all circumstances, -but partly _because_ they are safe they are also fairly slow. They are slower +but partly **because** they are safe they are also fairly slow. They are slower than they'd need to be, because they do have to disable interrupts (which is just a single instruction on a x86, but it's an expensive one - and on other architectures it can be worse). 
@@ -98,7 +107,7 @@ and on other architectures it can be worse). If you have a case where you have to protect a data structure across several CPU's and you want to use spinlocks you can potentially use cheaper versions of the spinlocks. IFF you know that the spinlocks are -never used in interrupt handlers, you can use the non-irq versions: +never used in interrupt handlers, you can use the non-irq versions:: spin_lock(&lock); ... @@ -110,7 +119,7 @@ This is useful if you know that the data in question is only ever manipulated from a "process context", ie no interrupts involved. The reasons you mustn't use these versions if you have interrupts that -play with the spinlock is that you can get deadlocks: +play with the spinlock is that you can get deadlocks:: spin_lock(&lock); ... @@ -147,9 +156,10 @@ indeed), while write-locks need to protect themselves against interrupts. ---- Reference information: +====================== For dynamic initialization, use spin_lock_init() or rwlock_init() as -appropriate: +appropriate:: spinlock_t xxx_lock; rwlock_t xxx_rw_lock; diff --git a/Documentation/locking/ww-mutex-design.txt b/Documentation/locking/ww-mutex-design.rst similarity index 93% rename from Documentation/locking/ww-mutex-design.txt rename to Documentation/locking/ww-mutex-design.rst index f0ed7c30e695..1846c199da23 100644 --- a/Documentation/locking/ww-mutex-design.txt +++ b/Documentation/locking/ww-mutex-design.rst @@ -1,3 +1,4 @@ +====================================== Wound/Wait Deadlock-Proof Mutex Design ====================================== @@ -85,6 +86,7 @@ Furthermore there are three different class of w/w lock acquire functions: no deadlock potential and hence the ww_mutex_lock call will block and not prematurely return -EDEADLK. The advantage of the _slow functions is in interface safety: + - ww_mutex_lock has a __must_check int return type, whereas ww_mutex_lock_slow has a void return type. 
Note that since ww mutex code needs loops/retries anyway the __must_check doesn't result in spurious warnings, even though the @@ -115,36 +117,36 @@ expect the number of simultaneous competing transactions to be typically small, and you want to reduce the number of rollbacks. Three different ways to acquire locks within the same w/w class. Common -definitions for methods #1 and #2: +definitions for methods #1 and #2:: -static DEFINE_WW_CLASS(ww_class); + static DEFINE_WW_CLASS(ww_class); -struct obj { + struct obj { struct ww_mutex lock; /* obj data */ -}; + }; -struct obj_entry { + struct obj_entry { struct list_head head; struct obj *obj; -}; + }; Method 1, using a list in execbuf->buffers that's not allowed to be reordered. This is useful if a list of required objects is already tracked somewhere. Furthermore the lock helper can use propagate the -EALREADY return code back to the caller as a signal that an object is twice on the list. This is useful if the list is constructed from userspace input and the ABI requires userspace to -not have duplicate entries (e.g. for a gpu commandbuffer submission ioctl). +not have duplicate entries (e.g. for a gpu commandbuffer submission ioctl):: -int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx) -{ + int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx) + { struct obj *res_obj = NULL; struct obj_entry *contended_entry = NULL; struct obj_entry *entry; ww_acquire_init(ctx, &ww_class); -retry: + retry: list_for_each_entry (entry, list, head) { if (entry->obj == res_obj) { res_obj = NULL; @@ -160,7 +162,7 @@ retry: ww_acquire_done(ctx); return 0; -err: + err: list_for_each_entry_continue_reverse (entry, list, head) ww_mutex_unlock(&entry->obj->lock); @@ -176,14 +178,14 @@ err: ww_acquire_fini(ctx); return ret; -} + } Method 2, using a list in execbuf->buffers that can be reordered. Same semantics of duplicate entry detection using -EALREADY as method 1 above. 
But the -list-reordering allows for a bit more idiomatic code. +list-reordering allows for a bit more idiomatic code:: -int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx) -{ + int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx) + { struct obj_entry *entry, *entry2; ww_acquire_init(ctx, &ww_class); @@ -216,24 +218,25 @@ int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx) ww_acquire_done(ctx); return 0; -} + } -Unlocking works the same way for both methods #1 and #2: +Unlocking works the same way for both methods #1 and #2:: -void unlock_objs(struct list_head *list, struct ww_acquire_ctx *ctx) -{ + void unlock_objs(struct list_head *list, struct ww_acquire_ctx *ctx) + { struct obj_entry *entry; list_for_each_entry (entry, list, head) ww_mutex_unlock(&entry->obj->lock); ww_acquire_fini(ctx); -} + } Method 3 is useful if the list of objects is constructed ad-hoc and not upfront, e.g. when adjusting edges in a graph where each node has its own ww_mutex lock, and edges can only be changed when holding the locks of all involved nodes. w/w mutexes are a natural fit for such a case for two reasons: + - They can handle lock-acquisition in any order which allows us to start walking a graph from a starting point and then iteratively discovering new edges and locking down the nodes those edges connect to. @@ -243,6 +246,7 @@ mutexes are a natural fit for such a case for two reasons: as a starting point). Note that this approach differs in two important ways from the above methods: + - Since the list of objects is dynamically constructed (and might very well be different when retrying due to hitting the -EDEADLK die condition) there's no need to keep any object on a persistent list when it's not locked. We can @@ -260,17 +264,17 @@ any interface misuse for these cases. Also, method 3 can't fail the lock acquisition step since it doesn't return -EALREADY. 
Of course this would be different when using the _interruptible -variants, but that's outside of the scope of these examples here. +variants, but that's outside of the scope of these examples here:: -struct obj { + struct obj { struct ww_mutex ww_mutex; struct list_head locked_list; -}; + }; -static DEFINE_WW_CLASS(ww_class); + static DEFINE_WW_CLASS(ww_class); -void __unlock_objs(struct list_head *list) -{ + void __unlock_objs(struct list_head *list) + { struct obj *entry, *temp; list_for_each_entry_safe (entry, temp, list, locked_list) { @@ -279,15 +283,15 @@ void __unlock_objs(struct list_head *list) list_del(&entry->locked_list); ww_mutex_unlock(entry->ww_mutex) } -} + } -void lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx) -{ + void lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx) + { struct obj *obj; ww_acquire_init(ctx, &ww_class); -retry: + retry: /* re-init loop start state */ loop { /* magic code which walks over a graph and decides which objects @@ -312,13 +316,13 @@ retry: ww_acquire_done(ctx); return 0; -} + } -void unlock_objs(struct list_head *list, struct ww_acquire_ctx *ctx) -{ + void unlock_objs(struct list_head *list, struct ww_acquire_ctx *ctx) + { __unlock_objs(list); ww_acquire_fini(ctx); -} + } Method 4: Only lock one single objects. In that case deadlock detection and prevention is obviously overkill, since with grabbing just one lock you can't @@ -329,11 +333,14 @@ Implementation Details ---------------------- Design: +^^^^^^^ + ww_mutex currently encapsulates a struct mutex, this means no extra overhead for normal mutex locks, which are far more common. As such there is only a small increase in code size if wait/wound mutexes are not used. We maintain the following invariants for the wait list: + (1) Waiters with an acquire context are sorted by stamp order; waiters without an acquire context are interspersed in FIFO order. 
(2) For Wait-Die, among waiters with contexts, only the first one can have @@ -355,6 +362,8 @@ Design: therefore be directed towards the uncontended cases. Lockdep: +^^^^^^^^ + Special care has been taken to warn for as many cases of api abuse as possible. Some common api abuses will be caught with CONFIG_DEBUG_MUTEXES, but CONFIG_PROVE_LOCKING is recommended. @@ -379,5 +388,6 @@ Lockdep: having called ww_acquire_fini on the first. - 'normal' deadlocks that can occur. -FIXME: Update this section once we have the TASK_DEADLOCK task state flag magic -implemented. +FIXME: + Update this section once we have the TASK_DEADLOCK task state flag magic + implemented. diff --git a/Documentation/pi-futex.txt b/Documentation/pi-futex.txt index b154f6c0c36e..c33ba2befbf8 100644 --- a/Documentation/pi-futex.txt +++ b/Documentation/pi-futex.txt @@ -119,4 +119,4 @@ properties of futexes, and all four combinations are possible: futex, robust-futex, PI-futex, robust+PI-futex. More details about priority inheritance can be found in -Documentation/locking/rt-mutex.txt. +Documentation/locking/rt-mutex.rst. diff --git a/Documentation/translations/it_IT/kernel-hacking/locking.rst b/Documentation/translations/it_IT/kernel-hacking/locking.rst index 5fd8a1abd2be..b9a6be4b8499 100644 --- a/Documentation/translations/it_IT/kernel-hacking/locking.rst +++ b/Documentation/translations/it_IT/kernel-hacking/locking.rst @@ -1404,7 +1404,7 @@ Riferimento per l'API dei Futex Approfondimenti =============== -- ``Documentation/locking/spinlocks.txt``: la guida di Linus Torvalds agli +- ``Documentation/locking/spinlocks.rst``: la guida di Linus Torvalds agli spinlock del kernel. 
- Unix Systems for Modern Architectures: Symmetric Multiprocessing and diff --git a/drivers/gpu/drm/drm_modeset_lock.c b/drivers/gpu/drm/drm_modeset_lock.c index 53187821df01..fcfe1a03c4a1 100644 --- a/drivers/gpu/drm/drm_modeset_lock.c +++ b/drivers/gpu/drm/drm_modeset_lock.c @@ -36,7 +36,7 @@ * of extra utility/tracking out of our acquire-ctx. This is provided * by &struct drm_modeset_lock and &struct drm_modeset_acquire_ctx. * - * For basic principles of &ww_mutex, see: Documentation/locking/ww-mutex-design.txt + * For basic principles of &ww_mutex, see: Documentation/locking/ww-mutex-design.rst * * The basic usage pattern is to:: * diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index 57baa27f238c..0b0d7259276d 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -5,7 +5,7 @@ * Copyright (C) 2006,2007 Red Hat, Inc., Ingo Molnar * Copyright (C) 2007 Red Hat, Inc., Peter Zijlstra * - * see Documentation/locking/lockdep-design.txt for more details. + * see Documentation/locking/lockdep-design.rst for more details. */ #ifndef __LINUX_LOCKDEP_H #define __LINUX_LOCKDEP_H diff --git a/include/linux/mutex.h b/include/linux/mutex.h index 3093dd162424..dcd03fee6e01 100644 --- a/include/linux/mutex.h +++ b/include/linux/mutex.h @@ -151,7 +151,7 @@ static inline bool mutex_is_locked(struct mutex *lock) /* * See kernel/locking/mutex.c for detailed documentation of these APIs. - * Also see Documentation/locking/mutex-design.txt. + * Also see Documentation/locking/mutex-design.rst. 
*/ #ifdef CONFIG_DEBUG_LOCK_ALLOC extern void mutex_lock_nested(struct mutex *lock, unsigned int subclass); diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h index e401358c4e7e..9d9c663987d8 100644 --- a/include/linux/rwsem.h +++ b/include/linux/rwsem.h @@ -160,7 +160,7 @@ extern void downgrade_write(struct rw_semaphore *sem); * static then another method for expressing nested locking is * the explicit definition of lock class keys and the use of * lockdep_set_class() at lock initialization time. - * See Documentation/locking/lockdep-design.txt for more details.) + * See Documentation/locking/lockdep-design.rst for more details.) */ extern void down_read_nested(struct rw_semaphore *sem, int subclass); extern void down_write_nested(struct rw_semaphore *sem, int subclass); diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c index 0c601ae072b3..edd1c082dbf5 100644 --- a/kernel/locking/mutex.c +++ b/kernel/locking/mutex.c @@ -16,7 +16,7 @@ * by Steven Rostedt, based on work by Gregory Haskins, Peter Morreale * and Sven Dietrich. * - * Also see Documentation/locking/mutex-design.txt. + * Also see Documentation/locking/mutex-design.rst. */ #include #include diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c index 38fbf9fa7f1b..fa83d36e30c6 100644 --- a/kernel/locking/rtmutex.c +++ b/kernel/locking/rtmutex.c @@ -9,7 +9,7 @@ * Copyright (C) 2005 Kihon Technologies Inc., Steven Rostedt * Copyright (C) 2006 Esben Nielsen * - * See Documentation/locking/rt-mutex-design.txt for details. + * See Documentation/locking/rt-mutex-design.rst for details. */ #include #include diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 6d2799190fba..18e8c180a874 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1139,7 +1139,7 @@ config PROVE_LOCKING the proof of observed correctness is also maintained for an arbitrary combination of these separate locking variants. - For more details, see Documentation/locking/lockdep-design.txt. 
+ For more details, see Documentation/locking/lockdep-design.rst. config LOCK_STAT bool "Lock usage statistics" @@ -1153,7 +1153,7 @@ config LOCK_STAT help This feature enables tracking lock contention points - For more details, see Documentation/locking/lockstat.txt + For more details, see Documentation/locking/lockstat.rst This also enables lock events required by "perf lock", subcommand of perf.

From patchwork Fri Jun 28 12:20:07 2019
X-Patchwork-Submitter: Mauro Carvalho Chehab
X-Patchwork-Id: 11022181
From: Mauro Carvalho Chehab
To: Linux Doc Mailing List
Subject: [PATCH 11/43] docs: console.txt: convert docs to ReST and rename to *.rst
Date: Fri, 28 Jun 2019 09:20:07 -0300
Cc: linux-fbdev@vger.kernel.org, Bartlomiej Zolnierkiewicz, Greg Kroah-Hartman, Jonathan Corbet, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, Mauro Carvalho Chehab, Jiri Slaby

Convert this small file to ReST in preparation for adding it to the driver-api book.

While this is not part of the driver-api book, mark it as :orphan:, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab
Acked-by: Greg Kroah-Hartman
Acked-by: Bartlomiej Zolnierkiewicz
---
 .../console/{console.txt => console.rst} | 63 ++++++++++---------
 Documentation/fb/fbcon.rst | 4 +-
 drivers/tty/Kconfig | 2 +-
 3 files changed, 38 insertions(+), 31 deletions(-)
 rename Documentation/console/{console.txt => console.rst} (80%)

diff --git a/Documentation/console/console.txt b/Documentation/console/console.rst
similarity index 80%
rename from Documentation/console/console.txt
rename to Documentation/console/console.rst
index d73c2ab4beda..b374141b027e 100644
--- a/Documentation/console/console.txt
+++ b/Documentation/console/console.rst
@@ -1,3 +1,6 @@ +:orphan: + +=============== Console Drivers =============== @@ -17,25 +20,26 @@ of driver occupying the consoles.) They can only take over the console that is occupied by the system driver. In the same token, if the modular driver is released by the console, the system driver will take over. -Modular drivers, from the programmer's point of view, have to call: +Modular drivers, from the programmer's point of view, have to call:: do_take_over_console() - load and bind driver to console layer give_up_console() - unload driver; it will only work if driver is fully unbound -In newer kernels, the following are also available: +In newer kernels, the following are also available:: do_register_con_driver() do_unregister_con_driver() If sysfs is enabled, the contents of /sys/class/vtconsole can be examined. This shows the console backends currently registered by the -system which are named vtcon<n> where <n> is an integer from 0 to 15. Thus: +system which are named vtcon<n> where <n> is an integer from 0 to 15.
+Thus:: ls /sys/class/vtconsole . .. vtcon0 vtcon1 -Each directory in /sys/class/vtconsole has 3 files: +Each directory in /sys/class/vtconsole has 3 files:: ls /sys/class/vtconsole/vtcon0 . .. bind name uevent @@ -46,27 +50,29 @@ What do these files signify? read, or acts to bind or unbind the driver to the virtual consoles when written to. The possible values are: - 0 - means the driver is not bound and if echo'ed, commands the driver + 0 + - means the driver is not bound and if echo'ed, commands the driver to unbind - 1 - means the driver is bound and if echo'ed, commands the driver to + 1 + - means the driver is bound and if echo'ed, commands the driver to bind - 2. name - read-only file. Shows the name of the driver in this format: + 2. name - read-only file. Shows the name of the driver in this format:: - cat /sys/class/vtconsole/vtcon0/name - (S) VGA+ + cat /sys/class/vtconsole/vtcon0/name + (S) VGA+ - '(S)' stands for a (S)ystem driver, i.e., it cannot be directly - commanded to bind or unbind + '(S)' stands for a (S)ystem driver, i.e., it cannot be directly + commanded to bind or unbind - 'VGA+' is the name of the driver + 'VGA+' is the name of the driver - cat /sys/class/vtconsole/vtcon1/name - (M) frame buffer device + cat /sys/class/vtconsole/vtcon1/name + (M) frame buffer device - In this case, '(M)' stands for a (M)odular driver, one that can be - directly commanded to bind or unbind. + In this case, '(M)' stands for a (M)odular driver, one that can be + directly commanded to bind or unbind. 3. uevent - ignore this file @@ -75,14 +81,17 @@ driver takes over the consoles vacated by the driver. Binding, on the other hand, will bind the driver to the consoles that are currently occupied by a system driver. -NOTE1: Binding and unbinding must be selected in Kconfig. It's under: +NOTE1: + Binding and unbinding must be selected in Kconfig. 
It's under:: -Device Drivers -> Character devices -> Support for binding and unbinding -console drivers + Device Drivers -> + Character devices -> + Support for binding and unbinding console drivers -NOTE2: If any of the virtual consoles are in KD_GRAPHICS mode, then binding or -unbinding will not succeed. An example of an application that sets the console -to KD_GRAPHICS is X. +NOTE2: + If any of the virtual consoles are in KD_GRAPHICS mode, then binding or + unbinding will not succeed. An example of an application that sets the + console to KD_GRAPHICS is X. How useful is this feature? This is very useful for console driver developers. By unbinding the driver from the console layer, one can unload the @@ -92,10 +101,10 @@ framebuffer console to VGA console and vice versa, this feature also makes this possible. (NOTE NOTE NOTE: Please read fbcon.txt under Documentation/fb for more details.) -Notes for developers: -===================== +Notes for developers +==================== -do_take_over_console() is now broken up into: +do_take_over_console() is now broken up into:: do_register_con_driver() do_bind_con_driver() - private function @@ -104,7 +113,7 @@ give_up_console() is a wrapper to do_unregister_con_driver(), and a driver must be fully unbound for this call to succeed. con_is_bound() will check if the driver is bound or not. -Guidelines for console driver writers: +Guidelines for console driver writers ===================================== In order for binding to and unbinding from the console to properly work, @@ -140,6 +149,4 @@ The current crop of console drivers should still work correctly, but binding and unbinding them may cause problems. With minimal fixes, these drivers can be made to work correctly. 
-========================== Antonino Daplas - diff --git a/Documentation/fb/fbcon.rst b/Documentation/fb/fbcon.rst index 1da65b9000de..26bc5cdaabab 100644 --- a/Documentation/fb/fbcon.rst +++ b/Documentation/fb/fbcon.rst @@ -187,7 +187,7 @@ the hardware. Thus, in a VGA console:: Assuming the VGA driver can be unloaded, one must first unbind the VGA driver from the console layer before unloading the driver. The VGA driver cannot be unloaded if it is still bound to the console layer. (See -Documentation/console/console.txt for more information). +Documentation/console/console.rst for more information). This is more complicated in the case of the framebuffer console (fbcon), because fbcon is an intermediate layer between the console and the drivers:: @@ -204,7 +204,7 @@ fbcon. Thus, there is no need to explicitly unbind the fbdev drivers from fbcon. So, how do we unbind fbcon from the console? Part of the answer is in -Documentation/console/console.txt. To summarize: +Documentation/console/console.rst. To summarize: Echo a value to the bind file that represents the framebuffer console driver. So assuming vtcon1 represents fbcon, then:: diff --git a/drivers/tty/Kconfig b/drivers/tty/Kconfig index 0e3e4dacbc12..1cb50f19d58c 100644 --- a/drivers/tty/Kconfig +++ b/drivers/tty/Kconfig @@ -93,7 +93,7 @@ config VT_HW_CONSOLE_BINDING select the console driver that will serve as the backend for the virtual terminals. - See for more + See for more information. For framebuffer console users, please refer to . 
From patchwork Fri Jun 28 12:20:30 2019
X-Patchwork-Submitter: Mauro Carvalho Chehab
X-Patchwork-Id: 11022187
From: Mauro Carvalho Chehab
To: Linux Doc Mailing List
Subject: [PATCH 34/43] docs: ioctl: convert to ReST
Date: Fri, 28 Jun 2019 09:20:30 -0300
Message-Id: <05ff3ad912b20c073acd13ccf4b17fd28e9db3df.1561723980.git.mchehab+samsung@kernel.org>
Cc: Jonathan Corbet, Maxime Ripard, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, Mauro Carvalho Chehab, David Airlie, Sean Paul

Rename the ioctl documentation files to ReST, add an index for them and adjust in order to produce a nice html output via the Sphinx build system.

The cdrom.txt and hdio.txt have their own particular syntax.
In order to speed up the conversion, I used a small ancillary perl script:

	my $d;
	$d .= $_ while(<>);
	$d =~ s/(\nCDROM\S+)\s+(\w[^\n]*)/$1\n\t$2\n/g;
	$d =~ s/(\nHDIO\S+)\s+(\w[^\n]*)/$1\n\t$2\n/g;
	$d =~ s/(\n\s*usage:)[\s\n]*(\w[^\n]*)/$1:\n\n\t $2\n/g;
	$d =~ s/(\n\s*)(E\w+[\s\n]*\w[^\n]*)/$1- $2/g;
	$d =~ s/(\n\s*)(inputs|outputs|notes):\s*(\w[^\n]*)/$1$2:\n\t\t$3\n/g;
	print $d;

It basically adds blank lines in a few interesting places. The script is not perfect: several things still require manual work, but it saved quite some time doing some obvious stuff.

At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab
---
 ...g-up-ioctls.txt => botching-up-ioctls.rst} | 1 +
 Documentation/ioctl/cdrom.rst | 1233 +++++++++++++++++
 Documentation/ioctl/cdrom.txt | 967 -------------
 Documentation/ioctl/{hdio.txt => hdio.rst} | 835 +++++++----
 Documentation/ioctl/index.rst | 16 +
 ...{ioctl-decoding.txt => ioctl-decoding.rst} | 13 +-
 drivers/gpu/drm/drm_ioctl.c | 2 +-
 7 files changed, 1814 insertions(+), 1253 deletions(-)
 rename Documentation/ioctl/{botching-up-ioctls.txt => botching-up-ioctls.rst} (99%)
 create mode 100644 Documentation/ioctl/cdrom.rst
 delete mode 100644 Documentation/ioctl/cdrom.txt
 rename Documentation/ioctl/{hdio.txt => hdio.rst} (54%)
 create mode 100644 Documentation/ioctl/index.rst
 rename Documentation/ioctl/{ioctl-decoding.txt => ioctl-decoding.rst} (54%)

diff --git a/Documentation/ioctl/botching-up-ioctls.txt b/Documentation/ioctl/botching-up-ioctls.rst
similarity index 99%
rename from Documentation/ioctl/botching-up-ioctls.txt
rename to Documentation/ioctl/botching-up-ioctls.rst
index 883fb034bd04..ac697fef3545 100644
--- a/Documentation/ioctl/botching-up-ioctls.txt
+++ b/Documentation/ioctl/botching-up-ioctls.rst
@@ -1,3 +1,4 @@
+=================================
 (How to avoid) Botching up ioctls
 =================================

diff --git
a/Documentation/ioctl/cdrom.rst b/Documentation/ioctl/cdrom.rst new file mode 100644 index 000000000000..3b4c0506de46 --- /dev/null +++ b/Documentation/ioctl/cdrom.rst @@ -0,0 +1,1233 @@ +============================ +Summary of CDROM ioctl calls +============================ + +- Edward A. Falk + +November, 2004 + +This document attempts to describe the ioctl(2) calls supported by +the CDROM layer. These are by-and-large implemented (as of Linux 2.6) +in drivers/cdrom/cdrom.c and drivers/block/scsi_ioctl.c + +ioctl values are listed in <linux/cdrom.h>. As of this writing, they +are as follows: + + ====================== =============================================== + CDROMPAUSE Pause Audio Operation + CDROMRESUME Resume paused Audio Operation + CDROMPLAYMSF Play Audio MSF (struct cdrom_msf) + CDROMPLAYTRKIND Play Audio Track/index (struct cdrom_ti) + CDROMREADTOCHDR Read TOC header (struct cdrom_tochdr) + CDROMREADTOCENTRY Read TOC entry (struct cdrom_tocentry) + CDROMSTOP Stop the cdrom drive + CDROMSTART Start the cdrom drive + CDROMEJECT Ejects the cdrom media + CDROMVOLCTRL Control output volume (struct cdrom_volctrl) + CDROMSUBCHNL Read subchannel data (struct cdrom_subchnl) + CDROMREADMODE2 Read CDROM mode 2 data (2336 Bytes) + (struct cdrom_read) + CDROMREADMODE1 Read CDROM mode 1 data (2048 Bytes) + (struct cdrom_read) + CDROMREADAUDIO (struct cdrom_read_audio) + CDROMEJECT_SW enable(1)/disable(0) auto-ejecting + CDROMMULTISESSION Obtain the start-of-last-session + address of multi session disks + (struct cdrom_multisession) + CDROM_GET_MCN Obtain the "Universal Product Code" + if available (struct cdrom_mcn) + CDROM_GET_UPC Deprecated, use CDROM_GET_MCN instead.
+ CDROMRESET hard-reset the drive + CDROMVOLREAD Get the drive's volume setting + (struct cdrom_volctrl) + CDROMREADRAW read data in raw mode (2352 Bytes) + (struct cdrom_read) + CDROMREADCOOKED read data in cooked mode + CDROMSEEK seek msf address + CDROMPLAYBLK scsi-cd only, (struct cdrom_blk) + CDROMREADALL read all 2646 bytes + CDROMGETSPINDOWN return 4-bit spindown value + CDROMSETSPINDOWN set 4-bit spindown value + CDROMCLOSETRAY pendant of CDROMEJECT + CDROM_SET_OPTIONS Set behavior options + CDROM_CLEAR_OPTIONS Clear behavior options + CDROM_SELECT_SPEED Set the CD-ROM speed + CDROM_SELECT_DISC Select disc (for juke-boxes) + CDROM_MEDIA_CHANGED Check if media changed + CDROM_DRIVE_STATUS Get tray position, etc. + CDROM_DISC_STATUS Get disc type, etc. + CDROM_CHANGER_NSLOTS Get number of slots + CDROM_LOCKDOOR lock or unlock door + CDROM_DEBUG Turn debug messages on/off + CDROM_GET_CAPABILITY get capabilities + CDROMAUDIOBUFSIZ set the audio buffer size + DVD_READ_STRUCT Read structure + DVD_WRITE_STRUCT Write structure + DVD_AUTH Authentication + CDROM_SEND_PACKET send a packet to the drive + CDROM_NEXT_WRITABLE get next writable block + CDROM_LAST_WRITTEN get last block written on disc + ====================== =============================================== + + +The information that follows was determined from reading kernel source +code. It is likely that some corrections will be made over time. + +------------------------------------------------------------------------------ + +General: + + Unless otherwise specified, all ioctl calls return 0 on success + and -1 with errno set to an appropriate value on error. (Some + ioctls return non-negative data values.) + + Unless otherwise specified, all ioctl calls return -1 and set + errno to EFAULT on a failed attempt to copy data to or from user + address space. + + Individual drivers may return error codes not listed here.
+ + Unless otherwise specified, all data structures and constants + are defined in <linux/cdrom.h> + +------------------------------------------------------------------------------ + + +CDROMPAUSE + Pause Audio Operation + + + usage:: + + ioctl(fd, CDROMPAUSE, 0); + + + inputs: + none + + + outputs: + none + + + error return: + - ENOSYS cd drive not audio-capable. + + +CDROMRESUME + Resume paused Audio Operation + + + usage:: + + ioctl(fd, CDROMRESUME, 0); + + + inputs: + none + + + outputs: + none + + + error return: + - ENOSYS cd drive not audio-capable. + + +CDROMPLAYMSF + Play Audio MSF + + (struct cdrom_msf) + + + usage:: + + struct cdrom_msf msf; + + ioctl(fd, CDROMPLAYMSF, &msf); + + inputs: + cdrom_msf structure, describing a segment of music to play + + + outputs: + none + + + error return: + - ENOSYS cd drive not audio-capable. + + notes: + - MSF stands for minutes-seconds-frames + - LBA stands for logical block address + - Segment is described as start and end times, where each time + is described as minutes:seconds:frames. + A frame is 1/75 of a second. + + +CDROMPLAYTRKIND + Play Audio Track/index + + (struct cdrom_ti) + + + usage:: + + struct cdrom_ti ti; + + ioctl(fd, CDROMPLAYTRKIND, &ti); + + inputs: + cdrom_ti structure, describing a segment of music to play + + + outputs: + none + + + error return: + - ENOSYS cd drive not audio-capable. + + notes: + - Segment is described as start and end times, where each time + is described as a track and an index. + + + +CDROMREADTOCHDR + Read TOC header + + (struct cdrom_tochdr) + + + usage:: + + struct cdrom_tochdr header; + + ioctl(fd, CDROMREADTOCHDR, &header); + + inputs: + cdrom_tochdr structure + + + outputs: + cdrom_tochdr structure + + + error return: + - ENOSYS cd drive not audio-capable.
+ + + +CDROMREADTOCENTRY + Read TOC entry + + (struct cdrom_tocentry) + + + usage:: + + struct cdrom_tocentry entry; + + ioctl(fd, CDROMREADTOCENTRY, &entry); + + inputs: + cdrom_tocentry structure + + + outputs: + cdrom_tocentry structure + + + error return: + - ENOSYS cd drive not audio-capable. + - EINVAL entry.cdte_format not CDROM_MSF or CDROM_LBA + - EINVAL requested track out of bounds + - EIO I/O error reading TOC + + notes: + - TOC stands for Table Of Contents + - MSF stands for minutes-seconds-frames + - LBA stands for logical block address + + + +CDROMSTOP + Stop the cdrom drive + + + usage:: + + ioctl(fd, CDROMSTOP, 0); + + + inputs: + none + + + outputs: + none + + + error return: + - ENOSYS cd drive not audio-capable. + + notes: + - Exact interpretation of this ioctl depends on the device, + but most seem to spin the drive down. + + +CDROMSTART + Start the cdrom drive + + + usage:: + + ioctl(fd, CDROMSTART, 0); + + + inputs: + none + + + outputs: + none + + + error return: + - ENOSYS cd drive not audio-capable. + + notes: + - Exact interpretation of this ioctl depends on the device, + but most seem to spin the drive up and/or close the tray. + Other devices ignore the ioctl completely. + + +CDROMEJECT + - Ejects the cdrom media + + + usage:: + + ioctl(fd, CDROMEJECT, 0); + + + inputs: + none + + + outputs: + none + + + error returns: + - ENOSYS cd drive not capable of ejecting + - EBUSY other processes are accessing drive, or door is locked + + notes: + - See CDROM_LOCKDOOR, below. + + + + +CDROMCLOSETRAY + pendant of CDROMEJECT + + + usage:: + + ioctl(fd, CDROMCLOSETRAY, 0); + + + inputs: + none + + + outputs: + none + + + error returns: + - ENOSYS cd drive not capable of closing the tray + - EBUSY other processes are accessing drive, or door is locked + + notes: + - See CDROM_LOCKDOOR, below. 
+ + + + +CDROMVOLCTRL + Control output volume (struct cdrom_volctrl) + + + usage:: + + struct cdrom_volctrl volume; + + ioctl(fd, CDROMVOLCTRL, &volume); + + inputs: + cdrom_volctrl structure containing volumes for up to 4 + channels. + + outputs: + none + + + error return: + - ENOSYS cd drive not audio-capable. + + + +CDROMVOLREAD + Get the drive's volume setting + + (struct cdrom_volctrl) + + + usage:: + + struct cdrom_volctrl volume; + + ioctl(fd, CDROMVOLREAD, &volume); + + inputs: + none + + + outputs: + The current volume settings. + + + error return: + - ENOSYS cd drive not audio-capable. + + + +CDROMSUBCHNL + Read subchannel data + + (struct cdrom_subchnl) + + + usage:: + + struct cdrom_subchnl q; + + ioctl(fd, CDROMSUBCHNL, &q); + + inputs: + cdrom_subchnl structure + + + outputs: + cdrom_subchnl structure + + + error return: + - ENOSYS cd drive not audio-capable. + - EINVAL format not CDROM_MSF or CDROM_LBA + + notes: + - Format is converted to CDROM_MSF or CDROM_LBA + as per user request on return + + + +CDROMREADRAW + read data in raw mode (2352 Bytes) + + (struct cdrom_read) + + usage:: + + union { + + struct cdrom_msf msf; /* input */ + char buffer[CD_FRAMESIZE_RAW]; /* return */ + } arg; + ioctl(fd, CDROMREADRAW, &arg); + + inputs: + cdrom_msf structure indicating an address to read. + + Only the start values are significant. + + outputs: + Data written to address provided by user. + + + error return: + - EINVAL address less than 0, or msf less than 0:2:0 + - ENOMEM out of memory + + notes: + - As of 2.6.8.1, comments in <linux/cdrom.h> indicate that this + ioctl accepts a cdrom_read structure, but actual source code + reads a cdrom_msf structure and writes a buffer of data to + the same address.
+ + - MSF values are converted to LBA values via this formula:: + + lba = (((m * CD_SECS) + s) * CD_FRAMES + f) - CD_MSF_OFFSET; + + + + +CDROMREADMODE1 + Read CDROM mode 1 data (2048 Bytes) + + (struct cdrom_read) + + notes: + Identical to CDROMREADRAW except that block size is + CD_FRAMESIZE (2048) bytes + + + +CDROMREADMODE2 + Read CDROM mode 2 data (2336 Bytes) + + (struct cdrom_read) + + notes: + Identical to CDROMREADRAW except that block size is + CD_FRAMESIZE_RAW0 (2336) bytes + + + +CDROMREADAUDIO + (struct cdrom_read_audio) + + usage:: + + struct cdrom_read_audio ra; + + ioctl(fd, CDROMREADAUDIO, &ra); + + inputs: + cdrom_read_audio structure containing read start + point and length + + outputs: + audio data, returned to buffer indicated by ra + + + error return: + - EINVAL format not CDROM_MSF or CDROM_LBA + - EINVAL nframes not in range [1 75] + - ENXIO drive has no queue (probably means invalid fd) + - ENOMEM out of memory + + +CDROMEJECT_SW + enable(1)/disable(0) auto-ejecting + + + usage:: + + int val; + + ioctl(fd, CDROMEJECT_SW, val); + + inputs: + Flag specifying auto-eject flag. + + + outputs: + none + + + error return: + - ENOSYS Drive is not capable of ejecting. + - EBUSY Door is locked + + + + +CDROMMULTISESSION + Obtain the start-of-last-session address of multi session disks + + (struct cdrom_multisession) + + usage:: + + struct cdrom_multisession ms_info; + + ioctl(fd, CDROMMULTISESSION, &ms_info); + + inputs: + cdrom_multisession structure containing desired + + format. + + outputs: + cdrom_multisession structure is filled with last_session + information. + + error return: + - EINVAL format not CDROM_MSF or CDROM_LBA + + +CDROM_GET_MCN + Obtain the "Universal Product Code" + if available + + (struct cdrom_mcn) + + + usage:: + + struct cdrom_mcn mcn; + + ioctl(fd, CDROM_GET_MCN, &mcn); + + inputs: + none + + + outputs: + Universal Product Code + + + error return: + - ENOSYS Drive is not capable of reading MCN data. 
+ + notes: + - Source code comments state:: + + The following function is implemented, although very few + audio discs give Universal Product Code information, which + should just be the Medium Catalog Number on the box. Note, + that the way the code is written on the CD is /not/ uniform + across all discs! + + + + +CDROM_GET_UPC + CDROM_GET_MCN (deprecated) + + + Not implemented, as of 2.6.8.1 + + + +CDROMRESET + hard-reset the drive + + + usage:: + + ioctl(fd, CDROMRESET, 0); + + + inputs: + none + + + outputs: + none + + + error return: + - EACCES Access denied: requires CAP_SYS_ADMIN + - ENOSYS Drive is not capable of resetting. + + + + +CDROMREADCOOKED + read data in cooked mode + + + usage:: + + u8 buffer[CD_FRAMESIZE]; + + ioctl(fd, CDROMREADCOOKED, buffer); + + inputs: + none + + + outputs: + 2048 bytes of data, "cooked" mode. + + + notes: + Not implemented on all drives. + + + + + +CDROMREADALL + read all 2646 bytes + + + Same as CDROMREADCOOKED, but reads 2646 bytes. + + + +CDROMSEEK + seek msf address + + + usage:: + + struct cdrom_msf msf; + + ioctl(fd, CDROMSEEK, &msf); + + inputs: + MSF address to seek to. + + + outputs: + none + + + + +CDROMPLAYBLK + scsi-cd only + + (struct cdrom_blk) + + + usage:: + + struct cdrom_blk blk; + + ioctl(fd, CDROMPLAYBLK, &blk); + + inputs: + Region to play + + + outputs: + none + + + + +CDROMGETSPINDOWN + usage:: + + char spindown; + + ioctl(fd, CDROMGETSPINDOWN, &spindown); + + inputs: + none + + + outputs: + The value of the current 4-bit spindown value. + + + + + +CDROMSETSPINDOWN + usage:: + + char spindown; + + ioctl(fd, CDROMSETSPINDOWN, &spindown); + + inputs: + 4-bit value used to control spindown (TODO: more detail here) + + + outputs: + none + + + + + + +CDROM_SET_OPTIONS + Set behavior options + + + usage:: + + int options; + + ioctl(fd, CDROM_SET_OPTIONS, options); + + inputs: + New values for drive options.
The logical 'or' of: + + ============== ================================== + CDO_AUTO_CLOSE close tray on first open(2) + CDO_AUTO_EJECT open tray on last release + CDO_USE_FFLAGS use O_NONBLOCK information on open + CDO_LOCK lock tray on open files + CDO_CHECK_TYPE check type on open for data + ============== ================================== + + outputs: + Returns the resulting options settings in the + ioctl return value. Returns -1 on error. + + error return: + - ENOSYS selected option(s) not supported by drive. + + + + +CDROM_CLEAR_OPTIONS + Clear behavior options + + + Same as CDROM_SET_OPTIONS, except that selected options are + turned off. + + + +CDROM_SELECT_SPEED + Set the CD-ROM speed + + + usage:: + + int speed; + + ioctl(fd, CDROM_SELECT_SPEED, speed); + + inputs: + New drive speed. + + + outputs: + none + + + error return: + - ENOSYS speed selection not supported by drive. + + + +CDROM_SELECT_DISC + Select disc (for juke-boxes) + + + usage:: + + int disk; + + ioctl(fd, CDROM_SELECT_DISC, disk); + + inputs: + Disk to load into drive. + + + outputs: + none + + + error return: + - EINVAL Disk number beyond capacity of drive + + + +CDROM_MEDIA_CHANGED + Check if media changed + + + usage:: + + int slot; + + ioctl(fd, CDROM_MEDIA_CHANGED, slot); + + inputs: + Slot number to be tested, always zero except for jukeboxes. + + May also be special values CDSL_NONE or CDSL_CURRENT + + outputs: + Ioctl return value is 0 or 1 depending on whether the media + + has been changed, or -1 on error. + + error returns: + - ENOSYS Drive can't detect media change + - EINVAL Slot number beyond capacity of drive + - ENOMEM Out of memory + + + +CDROM_DRIVE_STATUS + Get tray position, etc. + + + usage:: + + int slot; + + ioctl(fd, CDROM_DRIVE_STATUS, slot); + + inputs: + Slot number to be tested, always zero except for jukeboxes.
+ + May also be special values CDSL_NONE or CDSL_CURRENT + + outputs: + Ioctl return value will be one of the following values + + from : + + =================== ========================== + CDS_NO_INFO Information not available. + CDS_NO_DISC + CDS_TRAY_OPEN + CDS_DRIVE_NOT_READY + CDS_DISC_OK + -1 error + =================== ========================== + + error returns: + - ENOSYS Drive can't detect drive status + - EINVAL Slot number beyond capacity of drive + - ENOMEM Out of memory + + + + +CDROM_DISC_STATUS + Get disc type, etc. + + + usage:: + + ioctl(fd, CDROM_DISC_STATUS, 0); + + + inputs: + none + + + outputs: + Ioctl return value will be one of the following values + + from : + + - CDS_NO_INFO + - CDS_AUDIO + - CDS_MIXED + - CDS_XA_2_2 + - CDS_XA_2_1 + - CDS_DATA_1 + + error returns: + none at present + + notes: + - Source code comments state:: + + + Ok, this is where problems start. The current interface for + the CDROM_DISC_STATUS ioctl is flawed. It makes the false + assumption that CDs are all CDS_DATA_1 or all CDS_AUDIO, etc. + Unfortunately, while this is often the case, it is also + very common for CDs to have some tracks with data, and some + tracks with audio. Just because I feel like it, I declare + the following to be the best way to cope. If the CD has + ANY data tracks on it, it will be returned as a data CD. + If it has any XA tracks, I will return it as that. Now I + could simplify this interface by combining these returns with + the above, but this more clearly demonstrates the problem + with the current interface. Too bad this wasn't designed + to use bitmasks... -Erik + + Well, now we have the option CDS_MIXED: a mixed-type CD. + User level programmers might feel the ioctl is not very + useful. + ---david + + + + +CDROM_CHANGER_NSLOTS + Get number of slots + + + usage:: + + ioctl(fd, CDROM_CHANGER_NSLOTS, 0); + + + inputs: + none + + + outputs: + The ioctl return value will be the number of slots in a + CD changer. 
Typically 1 for non-multi-disk devices. + + error returns: + none + + + +CDROM_LOCKDOOR + lock or unlock door + + + usage:: + + int lock; + + ioctl(fd, CDROM_LOCKDOOR, lock); + + inputs: + Door lock flag, 1=lock, 0=unlock + + + outputs: + none + + + error returns: + - EDRIVE_CANT_DO_THIS + + Door lock function not supported. + - EBUSY + + Attempt to unlock when multiple users + have the drive open and not CAP_SYS_ADMIN + + notes: + As of 2.6.8.1, the lock flag is a global lock, meaning that + all CD drives will be locked or unlocked together. This is + probably a bug. + + The EDRIVE_CANT_DO_THIS value is defined in + and is currently (2.6.8.1) the same as EOPNOTSUPP + + + +CDROM_DEBUG + Turn debug messages on/off + + + usage:: + + int debug; + + ioctl(fd, CDROM_DEBUG, debug); + + inputs: + Cdrom debug flag, 0=disable, 1=enable + + + outputs: + The ioctl return value will be the new debug flag. + + + error return: + - EACCES Access denied: requires CAP_SYS_ADMIN + + + +CDROM_GET_CAPABILITY + get capabilities + + + usage:: + + ioctl(fd, CDROM_GET_CAPABILITY, 0); + + + inputs: + none + + + outputs: + The ioctl return value is the current device capability + flags. See CDC_CLOSE_TRAY, CDC_OPEN_TRAY, etc. + + + +CDROMAUDIOBUFSIZ + set the audio buffer size + + + usage:: + + int arg; + + ioctl(fd, CDROMAUDIOBUFSIZ, arg); + + inputs: + New audio buffer size + + + outputs: + The ioctl return value is the new audio buffer size, or -1 + on error. + + error return: + - ENOSYS Not supported by this driver. + + notes: + Not supported by all drivers. 
+ + + + +DVD_READ_STRUCT Read structure + + usage:: + + dvd_struct s; + + ioctl(fd, DVD_READ_STRUCT, &s); + + inputs: + dvd_struct structure, containing: + + =================== ========================================== + type specifies the information desired, one of + DVD_STRUCT_PHYSICAL, DVD_STRUCT_COPYRIGHT, + DVD_STRUCT_DISCKEY, DVD_STRUCT_BCA, + DVD_STRUCT_MANUFACT + physical.layer_num desired layer, indexed from 0 + copyright.layer_num desired layer, indexed from 0 + disckey.agid + =================== ========================================== + + outputs: + dvd_struct structure, containing: + + =================== ================================ + physical for type == DVD_STRUCT_PHYSICAL + copyright for type == DVD_STRUCT_COPYRIGHT + disckey.value for type == DVD_STRUCT_DISCKEY + bca.{len,value} for type == DVD_STRUCT_BCA + manufact.{len,valu} for type == DVD_STRUCT_MANUFACT + =================== ================================ + + error returns: + - EINVAL physical.layer_num exceeds number of layers + - EIO Received invalid response from drive + + + +DVD_WRITE_STRUCT Write structure + + Not implemented, as of 2.6.8.1 + + + +DVD_AUTH Authentication + + usage:: + + dvd_authinfo ai; + + ioctl(fd, DVD_AUTH, &ai); + + inputs: + dvd_authinfo structure. See + + + outputs: + dvd_authinfo structure. + + + error return: + - ENOTTY ai.type not recognized. + + + +CDROM_SEND_PACKET + send a packet to the drive + + + usage:: + + struct cdrom_generic_command cgc; + + ioctl(fd, CDROM_SEND_PACKET, &cgc); + + inputs: + cdrom_generic_command structure containing the packet to send. + + + outputs: + none + + cdrom_generic_command structure containing results. + + error return: + - EIO + + command failed. 
+ - EPERM + + Operation not permitted, either because a + write command was attempted on a drive which + is opened read-only, or because the command + requires CAP_SYS_RAWIO + - EINVAL + + cgc.data_direction not set + + + +CDROM_NEXT_WRITABLE + get next writable block + + + usage:: + + long next; + + ioctl(fd, CDROM_NEXT_WRITABLE, &next); + + inputs: + none + + + outputs: + The next writable block. + + + notes: + If the device does not support this ioctl directly, the + + ioctl will return CDROM_LAST_WRITTEN + 7. + + + +CDROM_LAST_WRITTEN + get last block written on disc + + + usage:: + + long last; + + ioctl(fd, CDROM_LAST_WRITTEN, &last); + + inputs: + none + + + outputs: + The last block written on disc + + + notes: + If the device does not support this ioctl directly, the + result is derived from the disc's table of contents. If the + table of contents can't be read, this ioctl returns an + error. diff --git a/Documentation/ioctl/cdrom.txt b/Documentation/ioctl/cdrom.txt deleted file mode 100644 index a4d62a9d6771..000000000000 --- a/Documentation/ioctl/cdrom.txt +++ /dev/null @@ -1,967 +0,0 @@ - Summary of CDROM ioctl calls. - ============================ - - Edward A. Falk - - November, 2004 - -This document attempts to describe the ioctl(2) calls supported by -the CDROM layer. These are by-and-large implemented (as of Linux 2.6) -in drivers/cdrom/cdrom.c and drivers/block/scsi_ioctl.c - -ioctl values are listed in . 
As of this writing, they -are as follows: - - CDROMPAUSE Pause Audio Operation - CDROMRESUME Resume paused Audio Operation - CDROMPLAYMSF Play Audio MSF (struct cdrom_msf) - CDROMPLAYTRKIND Play Audio Track/index (struct cdrom_ti) - CDROMREADTOCHDR Read TOC header (struct cdrom_tochdr) - CDROMREADTOCENTRY Read TOC entry (struct cdrom_tocentry) - CDROMSTOP Stop the cdrom drive - CDROMSTART Start the cdrom drive - CDROMEJECT Ejects the cdrom media - CDROMVOLCTRL Control output volume (struct cdrom_volctrl) - CDROMSUBCHNL Read subchannel data (struct cdrom_subchnl) - CDROMREADMODE2 Read CDROM mode 2 data (2336 Bytes) - (struct cdrom_read) - CDROMREADMODE1 Read CDROM mode 1 data (2048 Bytes) - (struct cdrom_read) - CDROMREADAUDIO (struct cdrom_read_audio) - CDROMEJECT_SW enable(1)/disable(0) auto-ejecting - CDROMMULTISESSION Obtain the start-of-last-session - address of multi session disks - (struct cdrom_multisession) - CDROM_GET_MCN Obtain the "Universal Product Code" - if available (struct cdrom_mcn) - CDROM_GET_UPC Deprecated, use CDROM_GET_MCN instead. - CDROMRESET hard-reset the drive - CDROMVOLREAD Get the drive's volume setting - (struct cdrom_volctrl) - CDROMREADRAW read data in raw mode (2352 Bytes) - (struct cdrom_read) - CDROMREADCOOKED read data in cooked mode - CDROMSEEK seek msf address - CDROMPLAYBLK scsi-cd only, (struct cdrom_blk) - CDROMREADALL read all 2646 bytes - CDROMGETSPINDOWN return 4-bit spindown value - CDROMSETSPINDOWN set 4-bit spindown value - CDROMCLOSETRAY pendant of CDROMEJECT - CDROM_SET_OPTIONS Set behavior options - CDROM_CLEAR_OPTIONS Clear behavior options - CDROM_SELECT_SPEED Set the CD-ROM speed - CDROM_SELECT_DISC Select disc (for juke-boxes) - CDROM_MEDIA_CHANGED Check is media changed - CDROM_DRIVE_STATUS Get tray position, etc. - CDROM_DISC_STATUS Get disc type, etc. 
- CDROM_CHANGER_NSLOTS Get number of slots - CDROM_LOCKDOOR lock or unlock door - CDROM_DEBUG Turn debug messages on/off - CDROM_GET_CAPABILITY get capabilities - CDROMAUDIOBUFSIZ set the audio buffer size - DVD_READ_STRUCT Read structure - DVD_WRITE_STRUCT Write structure - DVD_AUTH Authentication - CDROM_SEND_PACKET send a packet to the drive - CDROM_NEXT_WRITABLE get next writable block - CDROM_LAST_WRITTEN get last block written on disc - - -The information that follows was determined from reading kernel source -code. It is likely that some corrections will be made over time. - - - - - - - -General: - - Unless otherwise specified, all ioctl calls return 0 on success - and -1 with errno set to an appropriate value on error. (Some - ioctls return non-negative data values.) - - Unless otherwise specified, all ioctl calls return -1 and set - errno to EFAULT on a failed attempt to copy data to or from user - address space. - - Individual drivers may return error codes not listed here. - - Unless otherwise specified, all data structures and constants - are defined in - - - - -CDROMPAUSE Pause Audio Operation - - usage: - - ioctl(fd, CDROMPAUSE, 0); - - inputs: none - - outputs: none - - error return: - ENOSYS cd drive not audio-capable. - - -CDROMRESUME Resume paused Audio Operation - - usage: - - ioctl(fd, CDROMRESUME, 0); - - inputs: none - - outputs: none - - error return: - ENOSYS cd drive not audio-capable. - - -CDROMPLAYMSF Play Audio MSF (struct cdrom_msf) - - usage: - - struct cdrom_msf msf; - ioctl(fd, CDROMPLAYMSF, &msf); - - inputs: - cdrom_msf structure, describing a segment of music to play - - outputs: none - - error return: - ENOSYS cd drive not audio-capable. - - notes: - MSF stands for minutes-seconds-frames - LBA stands for logical block address - - Segment is described as start and end times, where each time - is described as minutes:seconds:frames. A frame is 1/75 of - a second. 
- - -CDROMPLAYTRKIND Play Audio Track/index (struct cdrom_ti) - - usage: - - struct cdrom_ti ti; - ioctl(fd, CDROMPLAYTRKIND, &ti); - - inputs: - cdrom_ti structure, describing a segment of music to play - - outputs: none - - error return: - ENOSYS cd drive not audio-capable. - - notes: - Segment is described as start and end times, where each time - is described as a track and an index. - - - -CDROMREADTOCHDR Read TOC header (struct cdrom_tochdr) - - usage: - - cdrom_tochdr header; - ioctl(fd, CDROMREADTOCHDR, &header); - - inputs: - cdrom_tochdr structure - - outputs: - cdrom_tochdr structure - - error return: - ENOSYS cd drive not audio-capable. - - - -CDROMREADTOCENTRY Read TOC entry (struct cdrom_tocentry) - - usage: - - struct cdrom_tocentry entry; - ioctl(fd, CDROMREADTOCENTRY, &entry); - - inputs: - cdrom_tocentry structure - - outputs: - cdrom_tocentry structure - - error return: - ENOSYS cd drive not audio-capable. - EINVAL entry.cdte_format not CDROM_MSF or CDROM_LBA - EINVAL requested track out of bounds - EIO I/O error reading TOC - - notes: - TOC stands for Table Of Contents - MSF stands for minutes-seconds-frames - LBA stands for logical block address - - - -CDROMSTOP Stop the cdrom drive - - usage: - - ioctl(fd, CDROMSTOP, 0); - - inputs: none - - outputs: none - - error return: - ENOSYS cd drive not audio-capable. - - notes: - Exact interpretation of this ioctl depends on the device, - but most seem to spin the drive down. - - -CDROMSTART Start the cdrom drive - - usage: - - ioctl(fd, CDROMSTART, 0); - - inputs: none - - outputs: none - - error return: - ENOSYS cd drive not audio-capable. - - notes: - Exact interpretation of this ioctl depends on the device, - but most seem to spin the drive up and/or close the tray. - Other devices ignore the ioctl completely. 
- - -CDROMEJECT Ejects the cdrom media - - usage: - - ioctl(fd, CDROMEJECT, 0); - - inputs: none - - outputs: none - - error returns: - ENOSYS cd drive not capable of ejecting - EBUSY other processes are accessing drive, or door is locked - - notes: - See CDROM_LOCKDOOR, below. - - - -CDROMCLOSETRAY pendant of CDROMEJECT - - usage: - - ioctl(fd, CDROMCLOSETRAY, 0); - - inputs: none - - outputs: none - - error returns: - ENOSYS cd drive not capable of closing the tray - EBUSY other processes are accessing drive, or door is locked - - notes: - See CDROM_LOCKDOOR, below. - - - -CDROMVOLCTRL Control output volume (struct cdrom_volctrl) - - usage: - - struct cdrom_volctrl volume; - ioctl(fd, CDROMVOLCTRL, &volume); - - inputs: - cdrom_volctrl structure containing volumes for up to 4 - channels. - - outputs: none - - error return: - ENOSYS cd drive not audio-capable. - - - -CDROMVOLREAD Get the drive's volume setting - (struct cdrom_volctrl) - - usage: - - struct cdrom_volctrl volume; - ioctl(fd, CDROMVOLREAD, &volume); - - inputs: none - - outputs: - The current volume settings. - - error return: - ENOSYS cd drive not audio-capable. - - - -CDROMSUBCHNL Read subchannel data (struct cdrom_subchnl) - - usage: - - struct cdrom_subchnl q; - ioctl(fd, CDROMSUBCHNL, &q); - - inputs: - cdrom_subchnl structure - - outputs: - cdrom_subchnl structure - - error return: - ENOSYS cd drive not audio-capable. - EINVAL format not CDROM_MSF or CDROM_LBA - - notes: - Format is converted to CDROM_MSF or CDROM_LBA - as per user request on return - - - -CDROMREADRAW read data in raw mode (2352 Bytes) - (struct cdrom_read) - - usage: - - union { - struct cdrom_msf msf; /* input */ - char buffer[CD_FRAMESIZE_RAW]; /* return */ - } arg; - ioctl(fd, CDROMREADRAW, &arg); - - inputs: - cdrom_msf structure indicating an address to read. - Only the start values are significant. - - outputs: - Data written to address provided by user. 
- - error return: - EINVAL address less than 0, or msf less than 0:2:0 - ENOMEM out of memory - - notes: - As of 2.6.8.1, comments in indicate that this - ioctl accepts a cdrom_read structure, but actual source code - reads a cdrom_msf structure and writes a buffer of data to - the same address. - - MSF values are converted to LBA values via this formula: - - lba = (((m * CD_SECS) + s) * CD_FRAMES + f) - CD_MSF_OFFSET; - - - - -CDROMREADMODE1 Read CDROM mode 1 data (2048 Bytes) - (struct cdrom_read) - - notes: - Identical to CDROMREADRAW except that block size is - CD_FRAMESIZE (2048) bytes - - - -CDROMREADMODE2 Read CDROM mode 2 data (2336 Bytes) - (struct cdrom_read) - - notes: - Identical to CDROMREADRAW except that block size is - CD_FRAMESIZE_RAW0 (2336) bytes - - - -CDROMREADAUDIO (struct cdrom_read_audio) - - usage: - - struct cdrom_read_audio ra; - ioctl(fd, CDROMREADAUDIO, &ra); - - inputs: - cdrom_read_audio structure containing read start - point and length - - outputs: - audio data, returned to buffer indicated by ra - - error return: - EINVAL format not CDROM_MSF or CDROM_LBA - EINVAL nframes not in range [1 75] - ENXIO drive has no queue (probably means invalid fd) - ENOMEM out of memory - - -CDROMEJECT_SW enable(1)/disable(0) auto-ejecting - - usage: - - int val; - ioctl(fd, CDROMEJECT_SW, val); - - inputs: - Flag specifying auto-eject flag. - - outputs: none - - error return: - ENOSYS Drive is not capable of ejecting. - EBUSY Door is locked - - - - -CDROMMULTISESSION Obtain the start-of-last-session - address of multi session disks - (struct cdrom_multisession) - usage: - - struct cdrom_multisession ms_info; - ioctl(fd, CDROMMULTISESSION, &ms_info); - - inputs: - cdrom_multisession structure containing desired - format. - - outputs: - cdrom_multisession structure is filled with last_session - information. 
- - error return: - EINVAL format not CDROM_MSF or CDROM_LBA - - -CDROM_GET_MCN Obtain the "Universal Product Code" - if available (struct cdrom_mcn) - - usage: - - struct cdrom_mcn mcn; - ioctl(fd, CDROM_GET_MCN, &mcn); - - inputs: none - - outputs: - Universal Product Code - - error return: - ENOSYS Drive is not capable of reading MCN data. - - notes: - Source code comments state: - - The following function is implemented, although very few - audio discs give Universal Product Code information, which - should just be the Medium Catalog Number on the box. Note, - that the way the code is written on the CD is /not/ uniform - across all discs! - - - - -CDROM_GET_UPC CDROM_GET_MCN (deprecated) - - Not implemented, as of 2.6.8.1 - - - -CDROMRESET hard-reset the drive - - usage: - - ioctl(fd, CDROMRESET, 0); - - inputs: none - - outputs: none - - error return: - EACCES Access denied: requires CAP_SYS_ADMIN - ENOSYS Drive is not capable of resetting. - - - - -CDROMREADCOOKED read data in cooked mode - - usage: - - u8 buffer[CD_FRAMESIZE] - ioctl(fd, CDROMREADCOOKED, buffer); - - inputs: none - - outputs: - 2048 bytes of data, "cooked" mode. - - notes: - Not implemented on all drives. - - - - -CDROMREADALL read all 2646 bytes - - Same as CDROMREADCOOKED, but reads 2646 bytes. - - - -CDROMSEEK seek msf address - - usage: - - struct cdrom_msf msf; - ioctl(fd, CDROMSEEK, &msf); - - inputs: - MSF address to seek to. - - outputs: none - - - -CDROMPLAYBLK scsi-cd only, (struct cdrom_blk) - - usage: - - struct cdrom_blk blk; - ioctl(fd, CDROMPLAYBLK, &blk); - - inputs: - Region to play - - outputs: none - - - -CDROMGETSPINDOWN - - usage: - - char spindown; - ioctl(fd, CDROMGETSPINDOWN, &spindown); - - inputs: none - - outputs: - The value of the current 4-bit spindown value. 
- - - - -CDROMSETSPINDOWN - - usage: - - char spindown - ioctl(fd, CDROMSETSPINDOWN, &spindown); - - inputs: - 4-bit value used to control spindown (TODO: more detail here) - - outputs: none - - - - - -CDROM_SET_OPTIONS Set behavior options - - usage: - - int options; - ioctl(fd, CDROM_SET_OPTIONS, options); - - inputs: - New values for drive options. The logical 'or' of: - CDO_AUTO_CLOSE close tray on first open(2) - CDO_AUTO_EJECT open tray on last release - CDO_USE_FFLAGS use O_NONBLOCK information on open - CDO_LOCK lock tray on open files - CDO_CHECK_TYPE check type on open for data - - outputs: - Returns the resulting options settings in the - ioctl return value. Returns -1 on error. - - error return: - ENOSYS selected option(s) not supported by drive. - - - - -CDROM_CLEAR_OPTIONS Clear behavior options - - Same as CDROM_SET_OPTIONS, except that selected options are - turned off. - - - -CDROM_SELECT_SPEED Set the CD-ROM speed - - usage: - - int speed; - ioctl(fd, CDROM_SELECT_SPEED, speed); - - inputs: - New drive speed. - - outputs: none - - error return: - ENOSYS speed selection not supported by drive. - - - -CDROM_SELECT_DISC Select disc (for juke-boxes) - - usage: - - int disk; - ioctl(fd, CDROM_SELECT_DISC, disk); - - inputs: - Disk to load into drive. - - outputs: none - - error return: - EINVAL Disk number beyond capacity of drive - - - -CDROM_MEDIA_CHANGED Check is media changed - - usage: - - int slot; - ioctl(fd, CDROM_MEDIA_CHANGED, slot); - - inputs: - Slot number to be tested, always zero except for jukeboxes. - May also be special values CDSL_NONE or CDSL_CURRENT - - outputs: - Ioctl return value is 0 or 1 depending on whether the media - has been changed, or -1 on error. - - error returns: - ENOSYS Drive can't detect media change - EINVAL Slot number beyond capacity of drive - ENOMEM Out of memory - - - -CDROM_DRIVE_STATUS Get tray position, etc. 
- - usage: - - int slot; - ioctl(fd, CDROM_DRIVE_STATUS, slot); - - inputs: - Slot number to be tested, always zero except for jukeboxes. - May also be special values CDSL_NONE or CDSL_CURRENT - - outputs: - Ioctl return value will be one of the following values - from : - - CDS_NO_INFO Information not available. - CDS_NO_DISC - CDS_TRAY_OPEN - CDS_DRIVE_NOT_READY - CDS_DISC_OK - -1 error - - error returns: - ENOSYS Drive can't detect drive status - EINVAL Slot number beyond capacity of drive - ENOMEM Out of memory - - - - -CDROM_DISC_STATUS Get disc type, etc. - - usage: - - ioctl(fd, CDROM_DISC_STATUS, 0); - - inputs: none - - outputs: - Ioctl return value will be one of the following values - from : - CDS_NO_INFO - CDS_AUDIO - CDS_MIXED - CDS_XA_2_2 - CDS_XA_2_1 - CDS_DATA_1 - - error returns: none at present - - notes: - Source code comments state: - - Ok, this is where problems start. The current interface for - the CDROM_DISC_STATUS ioctl is flawed. It makes the false - assumption that CDs are all CDS_DATA_1 or all CDS_AUDIO, etc. - Unfortunately, while this is often the case, it is also - very common for CDs to have some tracks with data, and some - tracks with audio. Just because I feel like it, I declare - the following to be the best way to cope. If the CD has - ANY data tracks on it, it will be returned as a data CD. - If it has any XA tracks, I will return it as that. Now I - could simplify this interface by combining these returns with - the above, but this more clearly demonstrates the problem - with the current interface. Too bad this wasn't designed - to use bitmasks... -Erik - - Well, now we have the option CDS_MIXED: a mixed-type CD. - User level programmers might feel the ioctl is not very - useful. - ---david - - - - -CDROM_CHANGER_NSLOTS Get number of slots - - usage: - - ioctl(fd, CDROM_CHANGER_NSLOTS, 0); - - inputs: none - - outputs: - The ioctl return value will be the number of slots in a - CD changer. 
Typically 1 for non-multi-disk devices. - - error returns: none - - - -CDROM_LOCKDOOR lock or unlock door - - usage: - - int lock; - ioctl(fd, CDROM_LOCKDOOR, lock); - - inputs: - Door lock flag, 1=lock, 0=unlock - - outputs: none - - error returns: - EDRIVE_CANT_DO_THIS Door lock function not supported. - EBUSY Attempt to unlock when multiple users - have the drive open and not CAP_SYS_ADMIN - - notes: - As of 2.6.8.1, the lock flag is a global lock, meaning that - all CD drives will be locked or unlocked together. This is - probably a bug. - - The EDRIVE_CANT_DO_THIS value is defined in - and is currently (2.6.8.1) the same as EOPNOTSUPP - - - -CDROM_DEBUG Turn debug messages on/off - - usage: - - int debug; - ioctl(fd, CDROM_DEBUG, debug); - - inputs: - Cdrom debug flag, 0=disable, 1=enable - - outputs: - The ioctl return value will be the new debug flag. - - error return: - EACCES Access denied: requires CAP_SYS_ADMIN - - - -CDROM_GET_CAPABILITY get capabilities - - usage: - - ioctl(fd, CDROM_GET_CAPABILITY, 0); - - inputs: none - - outputs: - The ioctl return value is the current device capability - flags. See CDC_CLOSE_TRAY, CDC_OPEN_TRAY, etc. - - - -CDROMAUDIOBUFSIZ set the audio buffer size - - usage: - - int arg; - ioctl(fd, CDROMAUDIOBUFSIZ, val); - - inputs: - New audio buffer size - - outputs: - The ioctl return value is the new audio buffer size, or -1 - on error. - - error return: - ENOSYS Not supported by this driver. - - notes: - Not supported by all drivers. 
- - - -DVD_READ_STRUCT Read structure - - usage: - - dvd_struct s; - ioctl(fd, DVD_READ_STRUCT, &s); - - inputs: - dvd_struct structure, containing: - type specifies the information desired, one of - DVD_STRUCT_PHYSICAL, DVD_STRUCT_COPYRIGHT, - DVD_STRUCT_DISCKEY, DVD_STRUCT_BCA, - DVD_STRUCT_MANUFACT - physical.layer_num desired layer, indexed from 0 - copyright.layer_num desired layer, indexed from 0 - disckey.agid - - outputs: - dvd_struct structure, containing: - physical for type == DVD_STRUCT_PHYSICAL - copyright for type == DVD_STRUCT_COPYRIGHT - disckey.value for type == DVD_STRUCT_DISCKEY - bca.{len,value} for type == DVD_STRUCT_BCA - manufact.{len,valu} for type == DVD_STRUCT_MANUFACT - - error returns: - EINVAL physical.layer_num exceeds number of layers - EIO Received invalid response from drive - - - -DVD_WRITE_STRUCT Write structure - - Not implemented, as of 2.6.8.1 - - - -DVD_AUTH Authentication - - usage: - - dvd_authinfo ai; - ioctl(fd, DVD_AUTH, &ai); - - inputs: - dvd_authinfo structure. See - - outputs: - dvd_authinfo structure. - - error return: - ENOTTY ai.type not recognized. - - - -CDROM_SEND_PACKET send a packet to the drive - - usage: - - struct cdrom_generic_command cgc; - ioctl(fd, CDROM_SEND_PACKET, &cgc); - - inputs: - cdrom_generic_command structure containing the packet to send. - - outputs: none - cdrom_generic_command structure containing results. - - error return: - EIO command failed. - EPERM Operation not permitted, either because a - write command was attempted on a drive which - is opened read-only, or because the command - requires CAP_SYS_RAWIO - EINVAL cgc.data_direction not set - - - -CDROM_NEXT_WRITABLE get next writable block - - usage: - - long next; - ioctl(fd, CDROM_NEXT_WRITABLE, &next); - - inputs: none - - outputs: - The next writable block. - - notes: - If the device does not support this ioctl directly, the - ioctl will return CDROM_LAST_WRITTEN + 7. 
- - - -CDROM_LAST_WRITTEN get last block written on disc - - usage: - - long last; - ioctl(fd, CDROM_LAST_WRITTEN, &last); - - inputs: none - - outputs: - The last block written on disc - - notes: - If the device does not support this ioctl directly, the - result is derived from the disc's table of contents. If the - table of contents can't be read, this ioctl returns an - error. diff --git a/Documentation/ioctl/hdio.txt b/Documentation/ioctl/hdio.rst similarity index 54% rename from Documentation/ioctl/hdio.txt rename to Documentation/ioctl/hdio.rst index 18eb98c44ffe..e822e3dff176 100644 --- a/Documentation/ioctl/hdio.txt +++ b/Documentation/ioctl/hdio.rst @@ -1,9 +1,10 @@ - Summary of HDIO_ ioctl calls. - ============================ +============================== +Summary of `HDIO_` ioctl calls +============================== - Edward A. Falk +- Edward A. Falk - November, 2004 +November, 2004 This document attempts to describe the ioctl(2) calls supported by the HD/IDE layer. These are by-and-large implemented (as of Linux 2.6) @@ -14,6 +15,7 @@ are as follows: ioctls that pass argument pointers to user space: + ======================= ======================================= HDIO_GETGEO get device geometry HDIO_GET_UNMASKINTR get current unmask setting HDIO_GET_MULTCOUNT get current IDE blockmode setting @@ -36,9 +38,11 @@ are as follows: HDIO_DRIVE_TASK execute task and special drive command HDIO_DRIVE_CMD execute a special drive command HDIO_DRIVE_CMD_AEB HDIO_DRIVE_TASK + ======================= ======================================= ioctls that pass non-pointer values: + ======================= ======================================= HDIO_SET_MULTCOUNT change IDE blockmode HDIO_SET_UNMASKINTR permit other irqs during I/O HDIO_SET_KEEPSETTINGS keep ioctl settings on reset @@ -57,16 +61,13 @@ are as follows: HDIO_SET_IDE_SCSI Set scsi emulation mode on/off HDIO_SET_SCSI_IDE not implemented yet + ======================= 
======================================= The information that follows was determined from reading kernel source code. It is likely that some corrections will be made over time. - - - - - +------------------------------------------------------------------------------ General: @@ -80,459 +81,610 @@ General: Unless otherwise specified, all data structures and constants are defined in +------------------------------------------------------------------------------ +HDIO_GETGEO + get device geometry -HDIO_GETGEO get device geometry - usage: + usage:: struct hd_geometry geom; + ioctl(fd, HDIO_GETGEO, &geom); - inputs: none + inputs: + none + + outputs: + hd_geometry structure containing: - hd_geometry structure containing: + ========= ================================== heads number of heads sectors number of sectors/track cylinders number of cylinders, mod 65536 start starting sector of this partition. + ========= ================================== error returns: - EINVAL if the device is not a disk drive or floppy drive, - or if the user passes a null pointer + - EINVAL + + if the device is not a disk drive or floppy drive, + or if the user passes a null pointer notes: + Not particularly useful with modern disk drives, whose geometry + is a polite fiction anyway. Modern drives are addressed + purely by sector number nowadays (lba addressing), and the + drive geometry is an abstraction which is actually subject + to change. Currently (as of Nov 2004), the geometry values + are the "bios" values -- presumably the values the drive had + when Linux first booted. - Not particularly useful with modern disk drives, whose geometry - is a polite fiction anyway. Modern drives are addressed - purely by sector number nowadays (lba addressing), and the - drive geometry is an abstraction which is actually subject - to change. Currently (as of Nov 2004), the geometry values - are the "bios" values -- presumably the values the drive had - when Linux first booted. 
+ In addition, the cylinders field of the hd_geometry is an + unsigned short, meaning that on most architectures, this + ioctl will not return a meaningful value on drives with more + than 65535 tracks. - In addition, the cylinders field of the hd_geometry is an - unsigned short, meaning that on most architectures, this - ioctl will not return a meaningful value on drives with more - than 65535 tracks. + The start field is unsigned long, meaning that it will not + contain a meaningful value for disks over 219 Gb in size. - The start field is unsigned long, meaning that it will not - contain a meaningful value for disks over 219 Gb in size. +HDIO_GET_UNMASKINTR + get current unmask setting -HDIO_GET_UNMASKINTR get current unmask setting - usage: + usage:: long val; + ioctl(fd, HDIO_GET_UNMASKINTR, &val); - inputs: none + inputs: + none + + outputs: - The value of the drive's current unmask setting + The value of the drive's current unmask setting -HDIO_SET_UNMASKINTR permit other irqs during I/O - usage: + +HDIO_SET_UNMASKINTR + permit other irqs during I/O + + + usage:: unsigned long val; + ioctl(fd, HDIO_SET_UNMASKINTR, val); inputs: - New value for unmask flag + New value for unmask flag + + + + outputs: + none + - outputs: none error return: - EINVAL (bdev != bdev->bd_contains) (not sure what this means) - EACCES Access denied: requires CAP_SYS_ADMIN - EINVAL value out of range [0 1] - EBUSY Controller busy + - EINVAL (bdev != bdev->bd_contains) (not sure what this means) + - EACCES Access denied: requires CAP_SYS_ADMIN + - EINVAL value out of range [0 1] + - EBUSY Controller busy -HDIO_GET_MULTCOUNT get current IDE blockmode setting +HDIO_GET_MULTCOUNT + get current IDE blockmode setting - usage: + + usage:: long val; + ioctl(fd, HDIO_GET_MULTCOUNT, &val); - inputs: none + inputs: + none + + outputs: - The value of the current IDE block mode setting. This - controls how many sectors the drive will transfer per - interrupt. 
+ The value of the current IDE block mode setting. This + controls how many sectors the drive will transfer per + interrupt. -HDIO_SET_MULTCOUNT change IDE blockmode +HDIO_SET_MULTCOUNT + change IDE blockmode - usage: + + usage:: int val; + ioctl(fd, HDIO_SET_MULTCOUNT, val); inputs: - New value for IDE block mode setting. This controls how many - sectors the drive will transfer per interrupt. + New value for IDE block mode setting. This controls how many + sectors the drive will transfer per interrupt. + + outputs: + none + - outputs: none error return: - EINVAL (bdev != bdev->bd_contains) (not sure what this means) - EACCES Access denied: requires CAP_SYS_ADMIN - EINVAL value out of range supported by disk. - EBUSY Controller busy or blockmode already set. - EIO Drive did not accept new block mode. + - EINVAL (bdev != bdev->bd_contains) (not sure what this means) + - EACCES Access denied: requires CAP_SYS_ADMIN + - EINVAL value out of range supported by disk. + - EBUSY Controller busy or blockmode already set. + - EIO Drive did not accept new block mode. notes: - - Source code comments read: + Source code comments read:: This is tightly woven into the driver->do_special cannot touch. DON'T do it again until a total personality rewrite is committed. If blockmode has already been set, this ioctl will fail with - EBUSY + -EBUSY -HDIO_GET_QDMA get use-qdma flag +HDIO_GET_QDMA + get use-qdma flag + Not implemented, as of 2.6.8.1 -HDIO_SET_XFER set transfer rate via proc +HDIO_SET_XFER + set transfer rate via proc + Not implemented, as of 2.6.8.1 -HDIO_OBSOLETE_IDENTITY OBSOLETE, DO NOT USE +HDIO_OBSOLETE_IDENTITY + OBSOLETE, DO NOT USE + Same as HDIO_GET_IDENTITY (see below), except that it only returns the first 142 bytes of drive identity information. 
-HDIO_GET_IDENTITY get IDE identification info +HDIO_GET_IDENTITY + get IDE identification info - usage: + + usage:: unsigned char identity[512]; + ioctl(fd, HDIO_GET_IDENTITY, identity); - inputs: none + inputs: + none + + outputs: - - ATA drive identity information. For full description, see - the IDENTIFY DEVICE and IDENTIFY PACKET DEVICE commands in - the ATA specification. + ATA drive identity information. For full description, see + the IDENTIFY DEVICE and IDENTIFY PACKET DEVICE commands in + the ATA specification. error returns: - EINVAL (bdev != bdev->bd_contains) (not sure what this means) - ENOMSG IDENTIFY DEVICE information not available + - EINVAL (bdev != bdev->bd_contains) (not sure what this means) + - ENOMSG IDENTIFY DEVICE information not available notes: + Returns information that was obtained when the drive was + probed. Some of this information is subject to change, and + this ioctl does not re-probe the drive to update the + information. - Returns information that was obtained when the drive was - probed. Some of this information is subject to change, and - this ioctl does not re-probe the drive to update the - information. + This information is also available from /proc/ide/hdX/identify - This information is also available from /proc/ide/hdX/identify +HDIO_GET_KEEPSETTINGS + get keep-settings-on-reset flag -HDIO_GET_KEEPSETTINGS get keep-settings-on-reset flag - usage: + usage:: long val; + ioctl(fd, HDIO_GET_KEEPSETTINGS, &val); - inputs: none + inputs: + none + + outputs: - The value of the current "keep settings" flag + The value of the current "keep settings" flag + + notes: + When set, indicates that kernel should restore settings + after a drive reset. - When set, indicates that kernel should restore settings - after a drive reset. 
+HDIO_SET_KEEPSETTINGS + keep ioctl settings on reset -HDIO_SET_KEEPSETTINGS keep ioctl settings on reset - usage: + usage:: long val; + ioctl(fd, HDIO_SET_KEEPSETTINGS, val); inputs: - New value for keep_settings flag + New value for keep_settings flag + + + + outputs: + none + - outputs: none error return: - EINVAL (bdev != bdev->bd_contains) (not sure what this means) - EACCES Access denied: requires CAP_SYS_ADMIN - EINVAL value out of range [0 1] - EBUSY Controller busy + - EINVAL (bdev != bdev->bd_contains) (not sure what this means) + - EACCES Access denied: requires CAP_SYS_ADMIN + - EINVAL value out of range [0 1] + - EBUSY Controller busy -HDIO_GET_32BIT get current io_32bit setting +HDIO_GET_32BIT + get current io_32bit setting - usage: + + usage:: long val; + ioctl(fd, HDIO_GET_32BIT, &val); - inputs: none + inputs: + none + + outputs: - The value of the current io_32bit setting + The value of the current io_32bit setting + + notes: + 0=16-bit, 1=32-bit, 2,3 = 32bit+sync - 0=16-bit, 1=32-bit, 2,3 = 32bit+sync -HDIO_GET_NOWERR get ignore-write-error flag - usage: +HDIO_GET_NOWERR + get ignore-write-error flag + + + usage:: long val; + ioctl(fd, HDIO_GET_NOWERR, &val); - inputs: none + inputs: + none + + outputs: - The value of the current ignore-write-error flag + The value of the current ignore-write-error flag -HDIO_GET_DMA get use-dma flag - usage: + +HDIO_GET_DMA + get use-dma flag + + + usage:: long val; + ioctl(fd, HDIO_GET_DMA, &val); - inputs: none + inputs: + none + + outputs: - The value of the current use-dma flag + The value of the current use-dma flag -HDIO_GET_NICE get nice flags - usage: + +HDIO_GET_NICE + get nice flags + + + usage:: long nice; + ioctl(fd, HDIO_GET_NICE, &nice); - inputs: none + inputs: + none + + outputs: + The drive's "nice" values. + - The drive's "nice" values. notes: + Per-drive flags which determine when the system will give more + bandwidth to other devices sharing the same IDE bus. 
- Per-drive flags which determine when the system will give more - bandwidth to other devices sharing the same IDE bus. - See <linux/hdreg.h>, near symbol IDE_NICE_DSC_OVERLAP. + See <linux/hdreg.h>, near symbol IDE_NICE_DSC_OVERLAP. -HDIO_SET_NICE set nice flags +HDIO_SET_NICE + set nice flags - usage: + + usage:: unsigned long nice; + ... ioctl(fd, HDIO_SET_NICE, nice); inputs: - bitmask of nice flags. + bitmask of nice flags. + + + + outputs: + none + - outputs: none error returns: - EACCES Access denied: requires CAP_SYS_ADMIN - EPERM Flags other than DSC_OVERLAP and NICE_1 set. - EPERM DSC_OVERLAP specified but not supported by drive + - EACCES Access denied: requires CAP_SYS_ADMIN + - EPERM Flags other than DSC_OVERLAP and NICE_1 set. + - EPERM DSC_OVERLAP specified but not supported by drive notes: + This ioctl sets the DSC_OVERLAP and NICE_1 flags from values + provided by the user. - This ioctl sets the DSC_OVERLAP and NICE_1 flags from values - provided by the user. + Nice flags are listed in <linux/hdreg.h>, starting with + IDE_NICE_DSC_OVERLAP. These values represent shifts. - Nice flags are listed in <linux/hdreg.h>, starting with - IDE_NICE_DSC_OVERLAP. These values represent shifts.
+HDIO_GET_WCACHE + get write cache mode on|off -HDIO_GET_WCACHE get write cache mode on|off - usage: + usage:: long val; + ioctl(fd, HDIO_GET_WCACHE, &val); - inputs: none + inputs: + none + + outputs: - The value of the current write cache mode + The value of the current write cache mode -HDIO_GET_ACOUSTIC get acoustic value - usage: + +HDIO_GET_ACOUSTIC + get acoustic value + + + usage:: long val; + ioctl(fd, HDIO_GET_ACOUSTIC, &val); - inputs: none + inputs: + none + + outputs: - The value of the current acoustic settings + The value of the current acoustic settings + + notes: + See HDIO_SET_ACOUSTIC + - See HDIO_SET_ACOUSTIC HDIO_GET_ADDRESS + usage:: - usage: long val; + ioctl(fd, HDIO_GET_ADDRESS, &val); - inputs: none + inputs: + none + + outputs: - The value of the current addressing mode: - 0 = 28-bit - 1 = 48-bit - 2 = 48-bit doing 28-bit - 3 = 64-bit + The value of the current addressing mode: + = =================== + 0 28-bit + 1 48-bit + 2 48-bit doing 28-bit + 3 64-bit + = =================== -HDIO_GET_BUSSTATE get the bus state of the hwif - usage: +HDIO_GET_BUSSTATE + get the bus state of the hwif + + + usage:: long state; + ioctl(fd, HDIO_SCAN_HWIF, &state); - inputs: none + inputs: + none + + outputs: - Current power state of the IDE bus. One of BUSSTATE_OFF, - BUSSTATE_ON, or BUSSTATE_TRISTATE + Current power state of the IDE bus. One of BUSSTATE_OFF, + BUSSTATE_ON, or BUSSTATE_TRISTATE error returns: - EACCES Access denied: requires CAP_SYS_ADMIN + - EACCES Access denied: requires CAP_SYS_ADMIN -HDIO_SET_BUSSTATE set the bus state of the hwif +HDIO_SET_BUSSTATE + set the bus state of the hwif - usage: + + usage:: int state; + ... ioctl(fd, HDIO_SCAN_HWIF, state); inputs: - Desired IDE power state. One of BUSSTATE_OFF, BUSSTATE_ON, - or BUSSTATE_TRISTATE + Desired IDE power state. 
One of BUSSTATE_OFF, BUSSTATE_ON, + or BUSSTATE_TRISTATE + + outputs: + none + - outputs: none error returns: - EACCES Access denied: requires CAP_SYS_RAWIO - EOPNOTSUPP Hardware interface does not support bus power control + - EACCES Access denied: requires CAP_SYS_RAWIO + - EOPNOTSUPP Hardware interface does not support bus power control -HDIO_TRISTATE_HWIF execute a channel tristate +HDIO_TRISTATE_HWIF + execute a channel tristate + Not implemented, as of 2.6.8.1. See HDIO_SET_BUSSTATE -HDIO_DRIVE_RESET execute a device reset +HDIO_DRIVE_RESET + execute a device reset - usage: + + usage:: int args[3] + ... ioctl(fd, HDIO_DRIVE_RESET, args); - inputs: none + inputs: + none + + + + outputs: + none + - outputs: none error returns: - EACCES Access denied: requires CAP_SYS_ADMIN - ENXIO No such device: phy dead or ctl_addr == 0 - EIO I/O error: reset timed out or hardware error + - EACCES Access denied: requires CAP_SYS_ADMIN + - ENXIO No such device: phy dead or ctl_addr == 0 + - EIO I/O error: reset timed out or hardware error notes: - Execute a reset on the device as soon as the current IO - operation has completed. + - Execute a reset on the device as soon as the current IO + operation has completed. - Executes an ATAPI soft reset if applicable, otherwise - executes an ATA soft reset on the controller. + - Executes an ATAPI soft reset if applicable, otherwise + executes an ATA soft reset on the controller. -HDIO_DRIVE_TASKFILE execute raw taskfile +HDIO_DRIVE_TASKFILE + execute raw taskfile - Note: If you don't have a copy of the ANSI ATA specification - handy, you should probably ignore this ioctl. - Execute an ATA disk command directly by writing the "taskfile" - registers of the drive. Requires ADMIN and RAWIO access - privileges. + Note: + If you don't have a copy of the ANSI ATA specification + handy, you should probably ignore this ioctl. - usage: + - Execute an ATA disk command directly by writing the "taskfile" + registers of the drive. 
Requires ADMIN and RAWIO access + privileges. + + usage:: struct { + ide_task_request_t req_task; u8 outbuf[OUTPUT_SIZE]; u8 inbuf[INPUT_SIZE]; @@ -548,6 +700,7 @@ HDIO_DRIVE_TASKFILE execute raw taskfile (See below for details on memory area passed to ioctl.) + ============ =================================================== io_ports[8] values to be written to taskfile registers hob_ports[8] high-order bytes, for extended commands. out_flags flags indicating which registers are valid @@ -557,24 +710,29 @@ HDIO_DRIVE_TASKFILE execute raw taskfile out_size size of output buffer outbuf buffer of data to be transmitted to disk inbuf buffer of data to be received from disk (see [1]) + ============ =================================================== outputs: + =========== ==================================================== io_ports[] values returned in the taskfile registers hob_ports[] high-order bytes, for extended commands. out_flags flags indicating which registers are valid (see [2]) in_flags flags indicating which registers should be returned outbuf buffer of data to be transmitted to disk (see [1]) inbuf buffer of data to be received from disk + =========== ==================================================== error returns: - EACCES CAP_SYS_ADMIN or CAP_SYS_RAWIO privilege not set. - ENOMSG Device is not a disk drive. - ENOMEM Unable to allocate memory for task - EFAULT req_cmd == TASKFILE_IN_OUT (not implemented as of 2.6.8) - EPERM req_cmd == TASKFILE_MULTI_OUT and drive - multi-count not yet set. - EIO Drive failed the command. + - EACCES CAP_SYS_ADMIN or CAP_SYS_RAWIO privilege not set. + - ENOMSG Device is not a disk drive. + - ENOMEM Unable to allocate memory for task + - EFAULT req_cmd == TASKFILE_IN_OUT (not implemented as of 2.6.8) + - EPERM + + req_cmd == TASKFILE_MULTI_OUT and drive + multi-count not yet set. + - EIO Drive failed the command. 
notes: @@ -615,22 +773,25 @@ HDIO_DRIVE_TASKFILE execute raw taskfile Command is passed to the disk drive via the ide_task_request_t structure, which contains these fields: + ============ =============================================== io_ports[8] values for the taskfile registers hob_ports[8] high-order bytes, for extended commands out_flags flags indicating which entries in the - io_ports[] and hob_ports[] arrays + io_ports[] and hob_ports[] arrays contain valid values. Type ide_reg_valid_t. in_flags flags indicating which entries in the - io_ports[] and hob_ports[] arrays + io_ports[] and hob_ports[] arrays are expected to contain valid values on return. data_phase See below req_cmd Command type, see below out_size output (user->drive) buffer size, bytes in_size input (drive->user) buffer size, bytes + ============ =============================================== When out_flags is zero, the following registers are loaded. + ============ =============================================== HOB_FEATURE If the drive supports LBA48 HOB_NSECTOR If the drive supports LBA48 HOB_SECTOR If the drive supports LBA48 @@ -644,9 +805,11 @@ HDIO_DRIVE_TASKFILE execute raw taskfile SELECT First, masked with 0xE0 if LBA48, 0xEF otherwise; then, or'ed with the default value of SELECT. + ============ =============================================== If any bit in out_flags is set, the following registers are loaded. + ============ =============================================== HOB_DATA If out_flags.b.data is set. HOB_DATA will travel on DD8-DD15 on little endian machines and on DD0-DD7 on big endian machines. @@ -664,6 +827,7 @@ HDIO_DRIVE_TASKFILE execute raw taskfile HCYL If out_flags.b.hcyl is set SELECT Or'ed with the default value of SELECT and loaded regardless of out_flags.b.select. 
+ ============ =============================================== Taskfile registers are read back from the drive into {io|hob}_ports[] after the command completes iff one of the @@ -674,6 +838,7 @@ HDIO_DRIVE_TASKFILE execute raw taskfile 2. One or more than one bits are set in out_flags. 3. The requested data_phase is TASKFILE_NO_DATA. + ============ =============================================== HOB_DATA If in_flags.b.data is set. It will contain DD8-DD15 on little endian machines and DD0-DD7 on big endian machines. @@ -689,10 +854,12 @@ HDIO_DRIVE_TASKFILE execute raw taskfile SECTOR LCYL HCYL + ============ =============================================== The data_phase field describes the data transfer to be performed. Value is one of: + =================== ======================================== TASKFILE_IN TASKFILE_MULTI_IN TASKFILE_OUT @@ -708,15 +875,18 @@ HDIO_DRIVE_TASKFILE execute raw taskfile TASKFILE_P_OUT unimplemented TASKFILE_P_OUT_DMA unimplemented TASKFILE_P_OUT_DMAQ unimplemented + =================== ======================================== The req_cmd field classifies the command type. It may be one of: + ======================== ======================================= IDE_DRIVE_TASK_NO_DATA IDE_DRIVE_TASK_SET_XFER unimplemented IDE_DRIVE_TASK_IN IDE_DRIVE_TASK_OUT unimplemented IDE_DRIVE_TASK_RAW_WRITE + ======================== ======================================= [6] Do not access {in|out}_flags->all except for resetting all the bits. Always access individual bit fields. ->all @@ -726,45 +896,57 @@ HDIO_DRIVE_TASKFILE execute raw taskfile -HDIO_DRIVE_CMD execute a special drive command +HDIO_DRIVE_CMD + execute a special drive command + Note: If you don't have a copy of the ANSI ATA specification handy, you should probably ignore this ioctl. - usage: + usage:: u8 args[4+XFER_SIZE]; + ... 
ioctl(fd, HDIO_DRIVE_CMD, args); inputs: + Commands other than WIN_SMART: - Commands other than WIN_SMART + ======= ======= args[0] COMMAND args[1] NSECTOR args[2] FEATURE args[3] NSECTOR + ======= ======= - WIN_SMART + WIN_SMART: + + ======= ======= args[0] COMMAND args[1] SECTOR args[2] FEATURE args[3] NSECTOR + ======= ======= outputs: + args[] buffer is filled with register values followed by any + - args[] buffer is filled with register values followed by any data returned by the disk. + + ======== ==================================================== args[0] status args[1] error args[2] NSECTOR args[3] undefined args[4+] NSECTOR * 512 bytes of data returned by the command. + ======== ==================================================== error returns: - EACCES Access denied: requires CAP_SYS_RAWIO - ENOMEM Unable to allocate memory for task - EIO Drive reports error + - EACCES Access denied: requires CAP_SYS_RAWIO + - ENOMEM Unable to allocate memory for task + - EIO Drive reports error notes: @@ -789,20 +971,24 @@ HDIO_DRIVE_CMD execute a special drive command -HDIO_DRIVE_TASK execute task and special drive command +HDIO_DRIVE_TASK + execute task and special drive command + Note: If you don't have a copy of the ANSI ATA specification handy, you should probably ignore this ioctl. - usage: + usage:: u8 args[7]; + ... 
ioctl(fd, HDIO_DRIVE_TASK, args); inputs: + Taskfile register values: - Taskfile register values: + ======= ======= args[0] COMMAND args[1] FEATURE args[2] NSECTOR @@ -810,10 +996,13 @@ HDIO_DRIVE_TASK execute task and special drive command args[4] LCYL args[5] HCYL args[6] SELECT + ======= ======= outputs: + Taskfile register values: - Taskfile register values: + + ======= ======= args[0] status args[1] error args[2] NSECTOR @@ -821,12 +1010,13 @@ HDIO_DRIVE_TASK execute task and special drive command args[4] LCYL args[5] HCYL args[6] SELECT + ======= ======= error returns: - EACCES Access denied: requires CAP_SYS_RAWIO - ENOMEM Unable to allocate memory for task - ENOMSG Device is not a disk drive. - EIO Drive failed the command. + - EACCES Access denied: requires CAP_SYS_RAWIO + - ENOMEM Unable to allocate memory for task + - ENOMSG Device is not a disk drive. + - EIO Drive failed the command. notes: @@ -836,236 +1026,317 @@ HDIO_DRIVE_TASK execute task and special drive command -HDIO_DRIVE_CMD_AEB HDIO_DRIVE_TASK +HDIO_DRIVE_CMD_AEB + HDIO_DRIVE_TASK + Not implemented, as of 2.6.8.1 -HDIO_SET_32BIT change io_32bit flags +HDIO_SET_32BIT + change io_32bit flags - usage: + + usage:: int val; + ioctl(fd, HDIO_SET_32BIT, val); inputs: - New value for io_32bit flag + New value for io_32bit flag + + + + outputs: + none + - outputs: none error return: - EINVAL (bdev != bdev->bd_contains) (not sure what this means) - EACCES Access denied: requires CAP_SYS_ADMIN - EINVAL value out of range [0 3] - EBUSY Controller busy + - EINVAL (bdev != bdev->bd_contains) (not sure what this means) + - EACCES Access denied: requires CAP_SYS_ADMIN + - EINVAL value out of range [0 3] + - EBUSY Controller busy -HDIO_SET_NOWERR change ignore-write-error flag +HDIO_SET_NOWERR + change ignore-write-error flag - usage: + + usage:: int val; + ioctl(fd, HDIO_SET_NOWERR, val); inputs: - New value for ignore-write-error flag. Used for ignoring + New value for ignore-write-error flag. 
Used for ignoring + + WRERR_STAT - outputs: none + outputs: + none + + error return: - EINVAL (bdev != bdev->bd_contains) (not sure what this means) - EACCES Access denied: requires CAP_SYS_ADMIN - EINVAL value out of range [0 1] - EBUSY Controller busy + - EINVAL (bdev != bdev->bd_contains) (not sure what this means) + - EACCES Access denied: requires CAP_SYS_ADMIN + - EINVAL value out of range [0 1] + - EBUSY Controller busy -HDIO_SET_DMA change use-dma flag +HDIO_SET_DMA + change use-dma flag - usage: + + usage:: long val; + ioctl(fd, HDIO_SET_DMA, val); inputs: - New value for use-dma flag + New value for use-dma flag + + + + outputs: + none + - outputs: none error return: - EINVAL (bdev != bdev->bd_contains) (not sure what this means) - EACCES Access denied: requires CAP_SYS_ADMIN - EINVAL value out of range [0 1] - EBUSY Controller busy + - EINVAL (bdev != bdev->bd_contains) (not sure what this means) + - EACCES Access denied: requires CAP_SYS_ADMIN + - EINVAL value out of range [0 1] + - EBUSY Controller busy -HDIO_SET_PIO_MODE reconfig interface to new speed +HDIO_SET_PIO_MODE + reconfig interface to new speed - usage: + + usage:: long val; + ioctl(fd, HDIO_SET_PIO_MODE, val); inputs: - New interface speed. + New interface speed. + + + + outputs: + none + - outputs: none error return: - EINVAL (bdev != bdev->bd_contains) (not sure what this means) - EACCES Access denied: requires CAP_SYS_ADMIN - EINVAL value out of range [0 255] - EBUSY Controller busy + - EINVAL (bdev != bdev->bd_contains) (not sure what this means) + - EACCES Access denied: requires CAP_SYS_ADMIN + - EINVAL value out of range [0 255] + - EBUSY Controller busy -HDIO_SCAN_HWIF register and (re)scan interface +HDIO_SCAN_HWIF + register and (re)scan interface - usage: + + usage:: int args[3] + ... 
ioctl(fd, HDIO_SCAN_HWIF, args); inputs: + + ======= ========================= args[0] io address to probe + + args[1] control address to probe args[2] irq number + ======= ========================= + + outputs: + none + - outputs: none error returns: - EACCES Access denied: requires CAP_SYS_RAWIO - EIO Probe failed. + - EACCES Access denied: requires CAP_SYS_RAWIO + - EIO Probe failed. notes: + This ioctl initializes the addresses and irq for a disk + controller, probes for drives, and creates /proc/ide + interfaces as appropriate. - This ioctl initializes the addresses and irq for a disk - controller, probes for drives, and creates /proc/ide - interfaces as appropriate. +HDIO_UNREGISTER_HWIF + unregister interface -HDIO_UNREGISTER_HWIF unregister interface - usage: + usage:: int index; + ioctl(fd, HDIO_UNREGISTER_HWIF, index); inputs: - index index of hardware interface to unregister + index index of hardware interface to unregister + + + + outputs: + none + - outputs: none error returns: - EACCES Access denied: requires CAP_SYS_RAWIO + - EACCES Access denied: requires CAP_SYS_RAWIO notes: + This ioctl removes a hardware interface from the kernel. - This ioctl removes a hardware interface from the kernel. + Currently (2.6.8) this ioctl silently fails if any drive on + the interface is busy. - Currently (2.6.8) this ioctl silently fails if any drive on - the interface is busy. 
+HDIO_SET_WCACHE + change write cache enable-disable -HDIO_SET_WCACHE change write cache enable-disable - usage: + usage:: int val; + ioctl(fd, HDIO_SET_WCACHE, val); inputs: - New value for write cache enable + New value for write cache enable + + + + outputs: + none + - outputs: none error return: - EINVAL (bdev != bdev->bd_contains) (not sure what this means) - EACCES Access denied: requires CAP_SYS_ADMIN - EINVAL value out of range [0 1] - EBUSY Controller busy + - EINVAL (bdev != bdev->bd_contains) (not sure what this means) + - EACCES Access denied: requires CAP_SYS_ADMIN + - EINVAL value out of range [0 1] + - EBUSY Controller busy -HDIO_SET_ACOUSTIC change acoustic behavior +HDIO_SET_ACOUSTIC + change acoustic behavior - usage: + + usage:: int val; + ioctl(fd, HDIO_SET_ACOUSTIC, val); inputs: - New value for drive acoustic settings + New value for drive acoustic settings + + + + outputs: + none + - outputs: none error return: - EINVAL (bdev != bdev->bd_contains) (not sure what this means) - EACCES Access denied: requires CAP_SYS_ADMIN - EINVAL value out of range [0 254] - EBUSY Controller busy + - EINVAL (bdev != bdev->bd_contains) (not sure what this means) + - EACCES Access denied: requires CAP_SYS_ADMIN + - EINVAL value out of range [0 254] + - EBUSY Controller busy -HDIO_SET_QDMA change use-qdma flag +HDIO_SET_QDMA + change use-qdma flag + Not implemented, as of 2.6.8.1 -HDIO_SET_ADDRESS change lba addressing modes +HDIO_SET_ADDRESS + change lba addressing modes - usage: + + usage:: int val; + ioctl(fd, HDIO_SET_ADDRESS, val); inputs: - New value for addressing mode - 0 = 28-bit - 1 = 48-bit - 2 = 48-bit doing 28-bit + New value for addressing mode + + = =================== + 0 28-bit + 1 48-bit + 2 48-bit doing 28-bit + = =================== + + outputs: + none + - outputs: none error return: - EINVAL (bdev != bdev->bd_contains) (not sure what this means) - EACCES Access denied: requires CAP_SYS_ADMIN - EINVAL value out of range [0 2] - EBUSY 
Controller busy - EIO Drive does not support lba48 mode. + - EINVAL (bdev != bdev->bd_contains) (not sure what this means) + - EACCES Access denied: requires CAP_SYS_ADMIN + - EINVAL value out of range [0 2] + - EBUSY Controller busy + - EIO Drive does not support lba48 mode. HDIO_SET_IDE_SCSI + usage:: - usage: long val; + ioctl(fd, HDIO_SET_IDE_SCSI, val); inputs: - New value for scsi emulation mode (?) + New value for scsi emulation mode (?) + + + + outputs: + none + - outputs: none error return: - EINVAL (bdev != bdev->bd_contains) (not sure what this means) - EACCES Access denied: requires CAP_SYS_ADMIN - EINVAL value out of range [0 1] - EBUSY Controller busy + - EINVAL (bdev != bdev->bd_contains) (not sure what this means) + - EACCES Access denied: requires CAP_SYS_ADMIN + - EINVAL value out of range [0 1] + - EBUSY Controller busy HDIO_SET_SCSI_IDE - Not implemented, as of 2.6.8.1 - - diff --git a/Documentation/ioctl/index.rst b/Documentation/ioctl/index.rst new file mode 100644 index 000000000000..1a6f437566e3 --- /dev/null +++ b/Documentation/ioctl/index.rst @@ -0,0 +1,16 @@ +:orphan: + +====== +IOCTLs +====== + +.. toctree:: + :maxdepth: 1 + + ioctl-number + + botching-up-ioctls + ioctl-decoding + + cdrom + hdio diff --git a/Documentation/ioctl/ioctl-decoding.txt b/Documentation/ioctl/ioctl-decoding.rst similarity index 54% rename from Documentation/ioctl/ioctl-decoding.txt rename to Documentation/ioctl/ioctl-decoding.rst index e35efb0cec2e..380d6bb3e3ea 100644 --- a/Documentation/ioctl/ioctl-decoding.txt +++ b/Documentation/ioctl/ioctl-decoding.rst @@ -1,10 +1,16 @@ +============================== +Decoding an IOCTL Magic Number +============================== + To decode a hex IOCTL code: Most architectures use this generic format, but check include/ARCH/ioctl.h for specifics, e.g. powerpc uses 3 bits to encode read/write and 13 bits for size. 
- bits meaning + ====== ================================== + bits meaning + ====== ================================== 31-30 00 - no parameters: uses _IO macro 10 - read: _IOR 01 - write: _IOW @@ -16,9 +22,10 @@ uses 3 bits to encode read/write and 13 bits for size. unique to each driver 7-0 function # + ====== ================================== So for example 0x82187201 is a read with arg length of 0x218, -character 'r' function 1. Grepping the source reveals this is: +character 'r' function 1. Grepping the source reveals this is:: -#define VFAT_IOCTL_READDIR_BOTH _IOR('r', 1, struct dirent [2]) + #define VFAT_IOCTL_READDIR_BOTH _IOR('r', 1, struct dirent [2]) diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index 9441a36a2469..bd810454d239 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -736,7 +736,7 @@ static const struct drm_ioctl_desc drm_ioctls[] = { * }; * * Please make sure that you follow all the best practices from - * ``Documentation/ioctl/botching-up-ioctls.txt``. Note that drm_ioctl() + * ``Documentation/ioctl/botching-up-ioctls.rst``. Note that drm_ioctl() * automatically zero-extends structures, hence make sure you can add more stuff * at the end, i.e. don't put a variable sized array there. *