diff mbox series

[v5,7/7] pmem: fix pmem_do_write() avoid writing to 'np' page

Message ID 20220128213150.1333552-8-jane.chu@oracle.com (mailing list archive)
State New, archived
Headers show
Series DAX poison recovery | expand

Commit Message

Jane Chu Jan. 28, 2022, 9:31 p.m. UTC
Since poisoned page is marked as not-present, the first of the
two back-to-back write_pmem() calls can only be made when there
is no poison in the range, otherwise kernel Oops.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 drivers/nvdimm/pmem.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

Comments

Christoph Hellwig Feb. 2, 2022, 1:28 p.m. UTC | #1
On Fri, Jan 28, 2022 at 02:31:50PM -0700, Jane Chu wrote:
> +	if (!bad_pmem) {
>  		write_pmem(pmem_addr, page, page_off, len);
> +	} else {
> +		rc = pmem_clear_poison(pmem, pmem_off, len);
> +		if (rc == BLK_STS_OK)
> +			write_pmem(pmem_addr, page, page_off, len);
> +		else
> +			pr_warn("%s: failed to clear poison\n",
> +				__func__);

This warning probably needs ratelimiting.

Also this flow looks a little odd.  I'd redo the whole function with a
clear bad_mem case:

	if (unlikely(is_bad_pmem(&pmem->bb, sector, len))) {
		blk_status_t rc = pmem_clear_poison(pmem, pmem_off, len);

		if (rc != BLK_STS_OK) {
			pr_warn("%s: failed to clear poison\n", __func__);
			return rc;
		}
	}
	flush_dcache_page(page);
	write_pmem(pmem_addr, page, page_off, len);
	return BLK_STS_OK;
Jane Chu Feb. 2, 2022, 9:31 p.m. UTC | #2
On 2/2/2022 5:28 AM, Christoph Hellwig wrote:
> On Fri, Jan 28, 2022 at 02:31:50PM -0700, Jane Chu wrote:
>> +	if (!bad_pmem) {
>>   		write_pmem(pmem_addr, page, page_off, len);
>> +	} else {
>> +		rc = pmem_clear_poison(pmem, pmem_off, len);
>> +		if (rc == BLK_STS_OK)
>> +			write_pmem(pmem_addr, page, page_off, len);
>> +		else
>> +			pr_warn("%s: failed to clear poison\n",
>> +				__func__);
> 
> This warning probably needs ratelimiting.

Will do, in case bad hardware is encountered, I can see lots of warnings.

> 
> Also this flow looks a little odd.  I'd redo the whole function with a
> clear bad_mem case:
> 
> 	if (unlikely(is_bad_pmem(&pmem->bb, sector, len))) {
> 		blk_status_t rc = pmem_clear_poison(pmem, pmem_off, len);
> 
> 		if (rc != BLK_STS_OK) {
> 			pr_warn("%s: failed to clear poison\n", __func__);
> 			return rc;
> 		}
> 	}
> 	flush_dcache_page(page);
> 	write_pmem(pmem_addr, page, page_off, len);
> 	return BLK_STS_OK;
> 
> 

This is much better, thanks for the suggestion!

-jane
diff mbox series

Patch

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index f2d6b34d48de..283890084d58 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -187,10 +187,15 @@  static blk_status_t pmem_do_write(struct pmem_device *pmem,
 	 * after clear poison.
 	 */
 	flush_dcache_page(page);
-	write_pmem(pmem_addr, page, page_off, len);
-	if (unlikely(bad_pmem)) {
-		rc = pmem_clear_poison(pmem, pmem_off, len);
+	if (!bad_pmem) {
 		write_pmem(pmem_addr, page, page_off, len);
+	} else {
+		rc = pmem_clear_poison(pmem, pmem_off, len);
+		if (rc == BLK_STS_OK)
+			write_pmem(pmem_addr, page, page_off, len);
+		else
+			pr_warn("%s: failed to clear poison\n",
+				__func__);
 	}
 
 	return rc;