Message ID | 20170413080629.7610-1-jthumshirn@suse.de (mailing list archive) |
---|---|
State | New, archived |
On Thu, Apr 13, 2017 at 10:06:29AM +0200, Johannes Thumshirn wrote:
> Doing a mkfs.btrfs on a (qemu emulated) PCIe NVMe causes a kernel panic
> in nvme_setup_prps() because the dma_len will drop below zero but the
> length not.

I think we should also turns this into a WARN_ON_ONCE + error return..

But do you have an exact btrfsprogs version and command line? I do a lot
of testing that involves mkfs.btrfs on nvme and haven't seen it..

> A git bisect tracked the behaviour down to commit 729204ef49ec ("block:
> relax check on sg gap"). Since commit 729204ef49ec a bio's offsets are not
> taken into account in the decision if the bio will gap any more. Restore
> the old behavior of checking bio offsets as well for the decision if a
> bio will gap.
>
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> Fixes: 729204ef49ec ("block: relax check on sg gap")
> Cc: Ming Lei <ming.lei@redhat.com>
> ---
>  include/linux/blkdev.h | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 7548f332121a..a03b7196209e 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1677,11 +1677,14 @@ static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
>  {
>  	if (bio_has_data(prev) && queue_virt_boundary(q)) {
>  		struct bio_vec pb, nb;
> +		bool offset;
>
>  		bio_get_last_bvec(prev, &pb);
>  		bio_get_first_bvec(next, &nb);
>
> -		if (!bios_segs_mergeable(q, prev, &pb, &nb))
> +		offset = pb.bv_offset || nb.bv_offset;
> +
> +		if (offset || !bios_segs_mergeable(q, prev, &pb, &nb))
>  			return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
>  	}

I think the code in NVMe (and potentially the other drivers using
virt_queue_boundary) is bogus. All of them are actually fine with
gaps in the protocol, as long as the gaps are aligned to said boundary.

So I suspect what we really need is to fix up NVMe, and after that we could even relax the above check, to not check for offset but offset & queue_virt_boundary(q).
On Thu, Apr 13, 2017 at 11:48:35AM +0200, Christoph Hellwig wrote:
> I think we should also turns this into a WARN_ON_ONCE + error return..
>
> But do you have an exact btrfsprogs version and command line? I do a lot
> of testing that involves mkfs.btrfs on nvme and haven't seen it..

Sure, it's:
mkfs.btrfs, part of btrfs-progs v4.5.3+20160729
Qemu is 2.6.2

[...]

>
> I think the code in NVMe (and potentially the other drivers using
> virt_queue_boundary) is bogus. All of them are actually fine with
> gaps in the protocol, as long as the gaps are aligned to said boundary.
>
> So I suspect what we really need is to fix up NVMe, and after that
> we could even relax the above check, to not check for offset but
> offset & queue_virt_boundary(q).

That's what I tried doing the last two days but as we're rather late in the rc
cycle and it is a regression that came in with -rc1 I'd rather like to have it
fixed or at least have a band aid in place.

Byte,
	Johannes

On Thu, Apr 13, 2017 at 11:48:35AM +0200, Christoph Hellwig wrote:
> On Thu, Apr 13, 2017 at 10:06:29AM +0200, Johannes Thumshirn wrote:
> > Doing a mkfs.btrfs on a (qemu emulated) PCIe NVMe causes a kernel panic
> > in nvme_setup_prps() because the dma_len will drop below zero but the
> > length not.
>
> I think we should also turns this into a WARN_ON_ONCE + error return..
>
> But do you have an exact btrfsprogs version and command line? I do a lot
> of testing that involves mkfs.btrfs on nvme and haven't seen it..

Ah one detail I forgot: mkfs.xfs _does_ work. Haven't checked ext4 though.

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 7548f332121a..a03b7196209e 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1677,11 +1677,14 @@ static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
 {
 	if (bio_has_data(prev) && queue_virt_boundary(q)) {
 		struct bio_vec pb, nb;
+		bool offset;
 
 		bio_get_last_bvec(prev, &pb);
 		bio_get_first_bvec(next, &nb);
 
-		if (!bios_segs_mergeable(q, prev, &pb, &nb))
+		offset = pb.bv_offset || nb.bv_offset;
+
+		if (offset || !bios_segs_mergeable(q, prev, &pb, &nb))
 			return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
 	}

Doing a mkfs.btrfs on a (qemu emulated) PCIe NVMe causes a kernel panic
in nvme_setup_prps() because the dma_len will drop below zero but the
length not.

A git bisect tracked the behaviour down to commit 729204ef49ec ("block:
relax check on sg gap"). Since commit 729204ef49ec a bio's offsets are not
taken into account in the decision if the bio will gap any more. Restore
the old behavior of checking bio offsets as well for the decision if a
bio will gap.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Fixes: 729204ef49ec ("block: relax check on sg gap")
Cc: Ming Lei <ming.lei@redhat.com>
---
 include/linux/blkdev.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)