diff mbox series

[v1,04/25] block: nr_sects_write(): Disable preemption on seqcount write

Message ID 20200519214547.352050-5-a.darwish@linutronix.de (mailing list archive)
State New, archived
Headers show
Series seqlock: Extend seqcount API with associated locks | expand

Commit Message

Ahmed S. Darwish May 19, 2020, 9:45 p.m. UTC
For optimized block readers not holding a mutex, the "number of sectors"
64-bit value is protected from tearing on 32-bit architectures by a
sequence counter.

Disable preemption before entering that sequence counter's write side
critical section. Otherwise, the read side can preempt the write side
section and spin for the entire scheduler tick. If the reader belongs to
a real-time scheduling class, it can spin forever and the kernel will
livelock.

Fixes: c83f6bf98dc1 ("block: add partition resize function to blkpg ioctl")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 block/blk.h | 2 ++
 1 file changed, 2 insertions(+)

Comments

Peter Zijlstra May 22, 2020, 4:39 p.m. UTC | #1
On Tue, May 19, 2020 at 11:45:26PM +0200, Ahmed S. Darwish wrote:
> For optimized block readers not holding a mutex, the "number of sectors"
> 64-bit value is protected from tearing on 32-bit architectures by a
> sequence counter.
> 
> Disable preemption before entering that sequence counter's write side
> critical section. Otherwise, the read side can preempt the write side
> section and spin for the entire scheduler tick. If the reader belongs to
> a real-time scheduling class, it can spin forever and the kernel will
> livelock.
> 
> Fixes: c83f6bf98dc1 ("block: add partition resize function to blkpg ioctl")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  block/blk.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/block/blk.h b/block/blk.h
> index 0a94ec68af32..151f86932547 100644
> --- a/block/blk.h
> +++ b/block/blk.h
> @@ -470,9 +470,11 @@ static inline sector_t part_nr_sects_read(struct hd_struct *part)
>  static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
>  {
>  #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
> +	preempt_disable();
>  	write_seqcount_begin(&part->nr_sects_seq);
>  	part->nr_sects = size;
>  	write_seqcount_end(&part->nr_sects_seq);
> +	preempt_enable();
>  #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION)
>  	preempt_disable();
>  	part->nr_sects = size;

This does look like something that include/linux/u64_stats_sync.h could
help with.
Ahmed S. Darwish May 25, 2020, 9:56 a.m. UTC | #2
Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, May 19, 2020 at 11:45:26PM +0200, Ahmed S. Darwish wrote:
> > For optimized block readers not holding a mutex, the "number of sectors"
> > 64-bit value is protected from tearing on 32-bit architectures by a
> > sequence counter.
> >
> > Disable preemption before entering that sequence counter's write side
> > critical section. Otherwise, the read side can preempt the write side
> > section and spin for the entire scheduler tick. If the reader belongs to
> > a real-time scheduling class, it can spin forever and the kernel will
> > livelock.
> >
> > Fixes: c83f6bf98dc1 ("block: add partition resize function to blkpg ioctl")
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
> > Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > ---
> >  block/blk.h | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/block/blk.h b/block/blk.h
> > index 0a94ec68af32..151f86932547 100644
> > --- a/block/blk.h
> > +++ b/block/blk.h
> > @@ -470,9 +470,11 @@ static inline sector_t part_nr_sects_read(struct hd_struct *part)
> >  static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
> >  {
> >  #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
> > +	preempt_disable();
> >  	write_seqcount_begin(&part->nr_sects_seq);
> >  	part->nr_sects = size;
> >  	write_seqcount_end(&part->nr_sects_seq);
> > +	preempt_enable();
> >  #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION)
> >  	preempt_disable();
> >  	part->nr_sects = size;
>
> This does look like something that include/linux/u64_stats_sync.h could
> help with.

Correct.

I just felt though that this would be too much for a 'Cc: stable' patch.

In another (in-progress) seqlock.h patch series, all of the seqcount_t
call sites that are used for 64-bit values tearing protection on 32-bit
kernels are transformed to the u64_stats_sync.h API.

Thanks,

--
Ahmed S. Darwish
Linutronix GmbH
Ahmed S. Darwish May 25, 2020, 10:12 a.m. UTC | #3
Sasha Levin <sashal@kernel.org> wrote:
> Hi
>
> [This is an automated email]
>
> This commit has been processed because it contains a "Fixes:" tag
> fixing commit: c83f6bf98dc1 ("block: add partition resize function to blkpg ioctl").
>
> The bot has tested the following trees: v5.6.13, v5.4.41, v4.19.123, v4.14.180, v4.9.223, v4.4.223.
>
> v5.6.13: Failed to apply! Possible dependencies:
...
> v5.4.41: Failed to apply! Possible dependencies:
...
> v4.19.123: Failed to apply! Possible dependencies:
...
> v4.14.180: Failed to apply! Possible dependencies:
...
> v4.9.223: Failed to apply! Possible dependencies:
...
> v4.4.223: Failed to apply! Possible dependencies:
...
>
> NOTE: The patch will not be queued to stable trees until it is upstream.
>
> How should we proceed with this patch?
>

The v5.7-rc1 commit 581e26004a09 ("block: move block layer internals out
of include/linux/genhd.h") moved the part_nr_sects_write() static inline
function from include/linux/genhd.h to block/blk.h.

After review, I'll send a rebased patch to stable.

Thanks,

--
Ahmed S. Darwish
Linutronix GmbH
diff mbox series

Patch

diff --git a/block/blk.h b/block/blk.h
index 0a94ec68af32..151f86932547 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -470,9 +470,11 @@  static inline sector_t part_nr_sects_read(struct hd_struct *part)
 static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
 {
 #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
+	preempt_disable();
 	write_seqcount_begin(&part->nr_sects_seq);
 	part->nr_sects = size;
 	write_seqcount_end(&part->nr_sects_seq);
+	preempt_enable();
 #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION)
 	preempt_disable();
 	part->nr_sects = size;