Message ID | 20230729175952.4068-1-dg573847474@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Series | [v2] dmaengine: plx_dma: Fix potential deadlock on &plxdev->ring_lock |
On 7/29/23 11:59, Chengfeng Ye wrote:
> As plx_dma_process_desc() is invoked both by the tasklet plx_dma_desc_task()
> under softirq context and by the plx_dma_tx_status() callback executed under
> process context, the acquisition of &plxdev->ring_lock inside
> plx_dma_process_desc() should disable bottom halves, otherwise a deadlock
> can occur if the tasklet softirq preempts the process-context code while
> the lock is held on the same CPU.
>
> Possible deadlock scenario:
> plx_dma_tx_status()
>  -> plx_dma_process_desc()
>  -> spin_lock(&plxdev->ring_lock)
>     <tasklet softirq>
>     -> plx_dma_desc_task()
>        -> plx_dma_process_desc()
>        -> spin_lock(&plxdev->ring_lock) (deadlock here)
>
> This flaw was found by an experimental static analysis tool I am developing
> for irq-related deadlocks.
>
> The lock was changed from spin_lock_bh() to spin_lock() by a previous patch
> for performance reasons, but that unintentionally introduced this potential
> deadlock.
>
> This patch reverts back to spin_lock_bh() to fix the deadlock problem.
>
> Fixes: 1d05a0bdb420 ("dmaengine: plx_dma: Move spin_lock_bh() to spin_lock()")
> Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>

Thanks!

Logan
Hello,

On 29.07.2023 at 19:59, Chengfeng Ye wrote:
> This flaw was found by an experimental static analysis tool I am developing
> for irq-related deadlocks.

Just out of curiosity, could
- Linux kernel config checks such as the CONFIG_DEBUG_SPINLOCK option, or
- Smatch [1]
have found this issue too?

I have also found an article by Dan Carpenter about the lock-checking
capability of Smatch, which IMHO relates to what you are doing [2].

The question is whether the checks/algorithm you have developed already
exist in the form of other tools, or whether they could be added to an
existing tool that is already spread across the community and used
accordingly.

Many thanks for your reply in advance.

[1] https://github.com/error27/smatch
[2] https://blogs.oracle.com/linux/post/writing-the-ultimate-locking-check

Cheers
Eric
Hi Eric,

Thank you for your interest in it.

For a dynamic detection solution, the answer is yes. Lockdep, which should
be enabled by CONFIG_DEBUG_SPINLOCK, has the ability to detect such
deadlocks. But the problem is that the detection requires input and the
exact thread interleaving to trigger the bug; otherwise the bug stays
buried and cannot be detected.

For static analysis, I think the answer is no. Smatch, like the static
deadlock detection algorithms in CBMC [1] and Infer [2], is designed to
reason about thread interaction but not about interrupts, which requires
the new algorithms I am working on.

Besides, may I ask a question: I sent some patches [3][4] weeks ago but
have not yet received a reply. Will reviewers check the patches later, or
should I ping them again?

[1] http://www.cprover.org/deadlock-detection/
[2] https://github.com/facebook/infer
[3] https://lore.kernel.org/lkml/20230726062313.77121-1-dg573847474@gmail.com/
[4] https://lore.kernel.org/lkml/20230726051727.64088-1-dg573847474@gmail.com/

Thanks,
Chengfeng
Hello Chengfeng,

On 29.08.2023 at 05:10, Chengfeng Ye wrote:
> Hi Eric,
>
> Thank you for your interest in it.

Thanks for getting back to me.

> For a dynamic detection solution, the answer is yes. Lockdep, which should
> be enabled by CONFIG_DEBUG_SPINLOCK, has the ability to detect such
> deadlocks. But the problem is that the detection requires input and the
> exact thread interleaving to trigger the bug; otherwise the bug stays
> buried and cannot be detected.
>
> For static analysis, I think the answer is no. Smatch, like the static
> deadlock detection algorithms in CBMC [1] and Infer [2], is designed to
> reason about thread interaction but not about interrupts, which requires
> the new algorithms I am working on.

Will you publish your work later on, e.g. on GitHub?

Actually, it might even make sense to integrate your work into
scripts/checkpatch.pl of the Linux kernel (or the like). Basically, if a
patch to be committed fails the locking check, it should not be committed
anyway. IMHO the quality standard one can expect from the code should
always be the same. So adding it to a mandatory check procedure (a script
that must be executed before committing patches) and/or to the "0-DAY CI
Kernel Test Service" [5] would definitely be worth a thought.

> Besides, may I ask a question: I sent some patches [3][4] weeks ago but
> have not yet received a reply. Will reviewers check the patches later, or
> should I ping them again?

There is never a guarantee as to who will review your patch on the mailing
list, or when. It is a best-effort system run mainly by volunteers. Just
give people a bit of time, since it is currently also holiday season. You
may ping the maintainer of the subsystem once some time has passed, since
he is responsible for administering the patches. BTW, I think you already
pinged indirectly with your e-mail.

> [1] http://www.cprover.org/deadlock-detection/
> [2] https://github.com/facebook/infer
> [3] https://lore.kernel.org/lkml/20230726062313.77121-1-dg573847474@gmail.com/
> [4] https://lore.kernel.org/lkml/20230726051727.64088-1-dg573847474@gmail.com/

[5] https://github.com/intel/lkp-tests/wiki

Cheers
Eric
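As background to the CONFIG_DEBUG_SPINLOCK question discussed above:
lockdep's runtime deadlock detection is driven by a small group of related
Kconfig switches. The fragment below is only a sketch of commonly used
settings (option names are from mainline lib/Kconfig.debug; exact
dependencies and defaults vary by kernel version), not a configuration
taken from this thread:

# Lock-debugging sketch; adjust to the kernel version at hand.
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_ATOMIC_SLEEP=y

With CONFIG_PROVE_LOCKING enabled, lockdep records the contexts in which
each lock class is taken, but as noted above it can only report what the
running workload actually exercises.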
diff --git a/drivers/dma/plx_dma.c b/drivers/dma/plx_dma.c
index 34b6416c3287..7693c067a1aa 100644
--- a/drivers/dma/plx_dma.c
+++ b/drivers/dma/plx_dma.c
@@ -137,7 +137,7 @@ static void plx_dma_process_desc(struct plx_dma_dev *plxdev)
 	struct plx_dma_desc *desc;
 	u32 flags;
 
-	spin_lock(&plxdev->ring_lock);
+	spin_lock_bh(&plxdev->ring_lock);
 
 	while (plxdev->tail != plxdev->head) {
 		desc = plx_dma_get_desc(plxdev, plxdev->tail);
@@ -165,7 +165,7 @@ static void plx_dma_process_desc(struct plx_dma_dev *plxdev)
 		plxdev->tail++;
 	}
 
-	spin_unlock(&plxdev->ring_lock);
+	spin_unlock_bh(&plxdev->ring_lock);
 }
 
 static void plx_dma_abort_desc(struct plx_dma_dev *plxdev)
As plx_dma_process_desc() is invoked both by the tasklet plx_dma_desc_task()
under softirq context and by the plx_dma_tx_status() callback executed under
process context, the acquisition of &plxdev->ring_lock inside
plx_dma_process_desc() should disable bottom halves, otherwise a deadlock
can occur if the tasklet softirq preempts the process-context code while
the lock is held on the same CPU.

Possible deadlock scenario:
plx_dma_tx_status()
 -> plx_dma_process_desc()
 -> spin_lock(&plxdev->ring_lock)
    <tasklet softirq>
    -> plx_dma_desc_task()
       -> plx_dma_process_desc()
       -> spin_lock(&plxdev->ring_lock) (deadlock here)

This flaw was found by an experimental static analysis tool I am developing
for irq-related deadlocks.

The lock was changed from spin_lock_bh() to spin_lock() by a previous patch
for performance reasons, but that unintentionally introduced this potential
deadlock.

This patch reverts back to spin_lock_bh() to fix the deadlock problem.

Fixes: 1d05a0bdb420 ("dmaengine: plx_dma: Move spin_lock_bh() to spin_lock()")
Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>

Changes in v2
- Consistently use spin_lock_bh() on &plxdev->ring_lock instead of
  spin_lock_irqsave().
---
 drivers/dma/plx_dma.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
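To make the locking pattern described above concrete, the following is a
minimal, self-contained sketch of the same situation. It is illustration
only, not code from the plx_dma driver; the demo_* names are invented for
this example.

/*
 * Illustration only -- not the plx_dma driver.  The demo_* names are
 * invented; the point is the locking pattern, not the driver logic.
 */
#include <linux/spinlock.h>
#include <linux/interrupt.h>

static DEFINE_SPINLOCK(demo_lock);

/* Shared helper, called from both process context and the tasklet. */
static void demo_process_ring(void)
{
	/*
	 * spin_lock_bh() disables softirqs on the local CPU, so the
	 * tasklet below cannot interrupt a process-context holder and
	 * then spin forever on demo_lock.  With a plain spin_lock()
	 * here, exactly that self-deadlock becomes possible.
	 */
	spin_lock_bh(&demo_lock);
	/* ... walk and complete ring descriptors ... */
	spin_unlock_bh(&demo_lock);
}

/* Softirq (tasklet) path, analogous to plx_dma_desc_task(). */
static void demo_desc_task(struct tasklet_struct *t)
{
	demo_process_ring();
}

/* Process-context path, analogous to plx_dma_tx_status(). */
static void demo_tx_status(void)
{
	demo_process_ring();
}

Taking the _bh variant in the tasklet path is harmless (softirqs are
already running there), so using it unconditionally in the shared helper,
as the patch does, keeps both call sites correct.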