From patchwork Thu Apr 13 12:11:40 2017
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 9679287
Date: Thu, 13 Apr 2017 20:11:40 +0800
From: Ming Lei <ming.lei@redhat.com>
To: Johannes Thumshirn
Cc: Jens Axboe, Omar Sandoval, Bart Van Assche, Hannes Reinecke,
 Christoph Hellwig, Linux Block Layer Mailinglist,
 Linux Kernel Mailinglist
Subject: Re: [PATCH] block: bios with an offset are always gappy
Message-ID: <20170413121134.GA32362@ming.t460p>
References: <20170413080629.7610-1-jthumshirn@suse.de>
 <20170413100110.GB5964@ming.t460p>
 <20170413115328.GH6734@linux-x5ow.site>
In-Reply-To: <20170413115328.GH6734@linux-x5ow.site>
User-Agent: Mutt/1.8.0 (2017-02-23)
X-Mailing-List: linux-block@vger.kernel.org

On Thu, Apr 13, 2017 at 01:53:28PM +0200, Johannes Thumshirn wrote:
> On Thu, Apr 13, 2017 at 06:02:21PM +0800, Ming Lei wrote:
> > On Thu, Apr 13, 2017 at 10:06:29AM +0200, Johannes Thumshirn wrote:
> > > Doing a mkfs.btrfs on a (qemu-emulated) PCIe NVMe device causes a
> > > kernel panic in nvme_setup_prps() because dma_len will drop below
> > > zero but length will not.
> >
> > Looks like I can't reproduce the issue in QEMU (32G NVMe, either
> > partitioned or not, just using 'mkfs.btrfs /dev/nvme0n1p1'); could you
> > share the exact mkfs command line and the size of your emulated NVMe?
>
> The exact cmdline is mkfs.btrfs -f /dev/nvme0n1p1 (-f because there was
> an existing btrfs on the image). The image is 17179869184 (a.k.a. 16G)
> bytes.
>
> [...]
> > > Could you try the following patch to see if it fixes your issue?
>
> It's back to the old, erratic behaviour, see log below.

OK, could you apply the attached debug patch and collect the ftrace log?
(ftrace_dump_on_oops needs to be passed on the kernel command line.)

Thanks,
Ming

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 26a5fd05fe88..a813a36d48d9 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -491,6 +491,8 @@ static bool nvme_setup_prps(struct nvme_dev *dev, struct request *req)
 			break;
 		if (dma_len > 0)
 			continue;
+		if (dma_len < 0)
+			blk_dump_rq(req, "nvme dma sg gap");
 		BUG_ON(dma_len < 0);
 		sg = sg_next(sg);
 		dma_addr = sg_dma_address(sg);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 8e521194f6fc..f3b001e401d2 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -811,5 +811,29 @@ static inline int bio_integrity_add_page(struct bio *bio, struct page *page,
 
 #endif /* CONFIG_BLK_DEV_INTEGRITY */
 
+static inline void blk_dump_bio(struct bio *bio, const char *msg)
+{
+	struct bvec_iter iter;
+	struct bio_vec bvec;
+	int i = 0;
+	unsigned sectors = 0;
+
+	trace_printk("%s-%p: %hx/%hx %u %llu %u\n",
+		     msg, bio,
+		     bio->bi_flags, bio->bi_opf,
+		     bio->bi_phys_segments,
+		     (unsigned long long)bio->bi_iter.bi_sector,
+		     bio->bi_iter.bi_size);
+	bio_for_each_segment(bvec, bio, iter) {
+		sectors += bvec.bv_len >> 9;
+		trace_printk("\t %d: %lu %u %u(%u)\n", i++,
+			     (unsigned long)page_to_pfn(bvec.bv_page),
+			     bvec.bv_offset,
+			     bvec.bv_len, bvec.bv_len >> 12);
+	}
+	trace_printk("\t total sectors %u\n", sectors);
+}
+
+
 #endif /* CONFIG_BLOCK */
 #endif /* __LINUX_BIO_H */
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 7548f332121a..b75d6fe5a1b9 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1698,6 +1698,22 @@ static inline bool req_gap_front_merge(struct request *req, struct bio *bio)
 	return bio_will_gap(req->q, bio, req->bio);
 }
 
+static inline void blk_dump_rq(const struct request *req, const char *msg)
+{
+	struct bio *bio;
+	int i = 0;
+
+	trace_printk("%s: dump bvec for %p(f:%x, seg: %d)\n",
+		     msg, req, req->cmd_flags,
+		     req->nr_phys_segments);
+
+	__rq_for_each_bio(bio, req) {
+		char num[16];
+		snprintf(num, 16, "%d", i++);
+		blk_dump_bio(bio, num);
+	}
+}
+
 int kblockd_schedule_work(struct work_struct *work);
 int kblockd_schedule_work_on(int cpu, struct work_struct *work);
 int kblockd_schedule_delayed_work(struct delayed_work *dwork, unsigned long delay);