diff mbox series

libnvdimm: Let revalidate_disk() revalidate region read-only

Message ID 161527286194.446794.5215036039655765042.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive)
State New, archived
Headers show
Series libnvdimm: Let revalidate_disk() revalidate region read-only | expand

Commit Message

Dan Williams March 9, 2021, 6:54 a.m. UTC
Previous kernels allowed the BLKROSET to override the disk's read-only
status. With that situation fixed the pmem driver needs to rely on
revalidate_disk() to clear the disk read-only status after the host
region has been marked read-write.

Recall that when libnvdimm determines that the persistent memory has
lost persistence (for example lack of energy to flush from DRAM to FLASH
on an NVDIMM-N device) it marks the region read-only, but that state can
be overridden by the user via:

   echo 0 > /sys/bus/nd/devices/regionX/read_only

...to date there is no notification that the region has restored
persistence, so the user override is the only recovery.

Fixes: 52f019d43c22 ("block: add a hard-readonly flag to struct gendisk")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/btt.c  |    7 +++++++
 drivers/nvdimm/bus.c  |   14 ++++++--------
 drivers/nvdimm/pmem.c |    7 +++++++
 3 files changed, 20 insertions(+), 8 deletions(-)

Comments

Christoph Hellwig March 9, 2021, 7:31 a.m. UTC | #1
On Mon, Mar 08, 2021 at 10:54:22PM -0800, Dan Williams wrote:
> Previous kernels allowed the BLKROSET to override the disk's read-only
> status. With that situation fixed the pmem driver needs to rely on
> revalidate_disk() to clear the disk read-only status after the host
> region has been marked read-write.
> 
> Recall that when libnvdimm determines that the persistent memory has
> lost persistence (for example lack of energy to flush from DRAM to FLASH
> on an NVDIMM-N device) it marks the region read-only, but that state can
> be overridden by the user via:
> 
>    echo 0 > /sys/bus/nd/devices/regionX/read_only
> 
> ...to date there is no notification that the region has restored
> persistence, so the user override is the only recovery.

I've just resent my series to kill of ->revalidate_disk for good, so this
obvious makes me a little unhappy.  Given that ->revalidate_disk
only ends up beeing called from the same path that ->open is called,
why can't you just hook this up from the open method?

Also any reason the sysfs attribute can't just directly propagate the
information to the disk?
Dan Williams March 9, 2021, 4:36 p.m. UTC | #2
On Mon, Mar 8, 2021 at 11:31 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, Mar 08, 2021 at 10:54:22PM -0800, Dan Williams wrote:
> > Previous kernels allowed the BLKROSET to override the disk's read-only
> > status. With that situation fixed the pmem driver needs to rely on
> > revalidate_disk() to clear the disk read-only status after the host
> > region has been marked read-write.
> >
> > Recall that when libnvdimm determines that the persistent memory has
> > lost persistence (for example lack of energy to flush from DRAM to FLASH
> > on an NVDIMM-N device) it marks the region read-only, but that state can
> > be overridden by the user via:
> >
> >    echo 0 > /sys/bus/nd/devices/regionX/read_only
> >
> > ...to date there is no notification that the region has restored
> > persistence, so the user override is the only recovery.
>
> I've just resent my series to kill of ->revalidate_disk for good, so this
> obvious makes me a little unhappy.  Given that ->revalidate_disk
> only ends up beeing called from the same path that ->open is called,
> why can't you just hook this up from the open method?
>
> Also any reason the sysfs attribute can't just directly propagate the
> information to the disk?

I should have assumed that revalidate_disk() was on the chopping
block. Let me take a look at just propagating from the region update
down to all affected disks. There's already a notification path for
regions to communicate badblocks, should be straightforward to reuse
for read-only updates.
diff mbox series

Patch

diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 41aa1f01fc07..73d3bf5aa208 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -1508,11 +1508,18 @@  static int btt_getgeo(struct block_device *bd, struct hd_geometry *geo)
 	return 0;
 }
 
+static int btt_revalidate(struct gendisk *disk)
+{
+	nvdimm_check_and_set_ro(disk);
+	return 0;
+}
+
 static const struct block_device_operations btt_fops = {
 	.owner =		THIS_MODULE,
 	.submit_bio =		btt_submit_bio,
 	.rw_page =		btt_rw_page,
 	.getgeo =		btt_getgeo,
+	.revalidate_disk =	btt_revalidate,
 };
 
 static int btt_blk_init(struct btt *btt)
diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 48f0985ca8a0..3a777d0073b7 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -631,16 +631,14 @@  void nvdimm_check_and_set_ro(struct gendisk *disk)
 	struct nd_region *nd_region = to_nd_region(dev->parent);
 	int disk_ro = get_disk_ro(disk);
 
-	/*
-	 * Upgrade to read-only if the region is read-only preserve as
-	 * read-only if the disk is already read-only.
-	 */
-	if (disk_ro || nd_region->ro == disk_ro)
+	/* catch the disk up with the region ro state */
+	if (disk_ro == nd_region->ro)
 		return;
 
-	dev_info(dev, "%s read-only, marking %s read-only\n",
-			dev_name(&nd_region->dev), disk->disk_name);
-	set_disk_ro(disk, 1);
+	dev_info(dev, "%s read-%s, marking %s read-%s\n",
+		 dev_name(&nd_region->dev), nd_region->ro ? "only" : "write",
+		 disk->disk_name, nd_region->ro ? "only" : "write");
+	set_disk_ro(disk, nd_region->ro);
 }
 EXPORT_SYMBOL(nvdimm_check_and_set_ro);
 
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index b8a85bfb2e95..af204fce1b1c 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -276,10 +276,17 @@  __weak long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
 	return PHYS_PFN(pmem->size - pmem->pfn_pad - offset);
 }
 
+static int pmem_revalidate(struct gendisk *disk)
+{
+	nvdimm_check_and_set_ro(disk);
+	return 0;
+}
+
 static const struct block_device_operations pmem_fops = {
 	.owner =		THIS_MODULE,
 	.submit_bio =		pmem_submit_bio,
 	.rw_page =		pmem_rw_page,
+	.revalidate_disk =	pmem_revalidate,
 };
 
 static int pmem_dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,