[v3,15/16] mtd: rawnand: qcom: helper function for raw read

Message ID 1527250904-21988-16-git-send-email-absahu@codeaurora.org (mailing list archive)
State Superseded, archived
Delegated to: Andy Gross

Commit Message

Abhishek Sahu May 25, 2018, 12:21 p.m. UTC
This patch does a minor code reorganization for raw reads.
Currently a raw read always covers the complete page, but the
subsequent patches for erased codeword bit flip detection
need to read only a few CWs. So, this patch adds a
helper function and introduces a read CW bitmask which
specifies which CWs in the page must be read.

Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
---
* Changes from v2:
  NONE

* Changes from v1:
 1. Included more detail in function comment

 drivers/mtd/nand/raw/qcom_nandc.c | 197 ++++++++++++++++++++++++--------------
 1 file changed, 123 insertions(+), 74 deletions(-)

Comments

Miquel Raynal May 27, 2018, 1:53 p.m. UTC | #1
Hi Abhishek,

On Fri, 25 May 2018 17:51:43 +0530, Abhishek Sahu
<absahu@codeaurora.org> wrote:

> This patch does minor code reorganization for raw reads.
> Currently the raw read is required for complete page but for
> subsequent patches related with erased codeword bit flips
> detection, only few CW should be read. So, this patch adds
> helper function and introduces the read CW bitmask which
> specifies which CW reads are required in complete page.
> 
> Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
> ---
> * Changes from v2:
>   NONE
> 
> * Changes from v1:
>  1. Included more detail in function comment
> 
>  drivers/mtd/nand/raw/qcom_nandc.c | 197 ++++++++++++++++++++++++--------------
>  1 file changed, 123 insertions(+), 74 deletions(-)
> 
> diff --git a/drivers/mtd/nand/raw/qcom_nandc.c b/drivers/mtd/nand/raw/qcom_nandc.c
> index 87f900e..34143a4 100644
> --- a/drivers/mtd/nand/raw/qcom_nandc.c
> +++ b/drivers/mtd/nand/raw/qcom_nandc.c
> @@ -1588,6 +1588,127 @@ static int check_flash_errors(struct qcom_nand_host *host, int cw_cnt)
>  }
>  
>  /*
> + * Helper to perform the page raw read operation. The read_cw_mask will be
> + * used to specify the codewords (CW) for which the data should be read. The
> + * single page contains multiple CW.
> + *
> + * Normally other NAND controllers store the data in main area and
> + * ecc bytes in OOB area. So, if page size is 2048+64 then 2048
> + * data bytes will go in main area followed by ECC bytes. The QCOM NAND
> + * controller follows different layout in which the data+OOB is internally
> + * divided in 528/532 bytes CW and each CW contains 516 bytes followed by
> + * ECC parity bytes for that CW. By this, 4 available OOB bytes per CW
> + * will also be protected with ECC.
> + *
> + * For each CW read, following are the 2 steps:
> + * 1. Read the codeword bytes from NAND chip to NAND controller internal HW
> + *    buffer.
> + * 2. Copy all these bytes from this HW buffer to actual buffer.
> + *
> + * Sometime, only few CW data is required in complete page. The read_cw_mask
> + * specifies which CW in a page needs to be read. Start address will be
> + * determined with this CW mask to skip unnecessary data copy from NAND
> + * flash device. Then, actual data copy from NAND controller HW internal buffer
> + * to data buffer will be done only for the CWs, which have the mask set.
> + */
> +static int
> +nandc_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
> +		    u8 *data_buf, u8 *oob_buf,
> +		    int page, unsigned long read_cw_mask)

Please prefix the helper with "qcom_nandc"

> +{
> +	struct qcom_nand_host *host = to_qcom_nand_host(chip);
> +	struct qcom_nand_controller *nandc = get_qcom_nand_controller(chip);
> +	struct nand_ecc_ctrl *ecc = &chip->ecc;
> +	int i, ret;
> +	int read_loc, start_step, last_step;
> +
> +	nand_read_page_op(chip, page, 0, NULL, 0);
> +
> +	host->use_ecc = false;
> +	start_step = ffs(read_cw_mask) - 1;
> +	last_step = fls(read_cw_mask);
> +
> +	clear_bam_transaction(nandc);
> +	set_address(host, host->cw_size * start_step, page);
> +	update_rw_regs(host, last_step - start_step, true);
> +	config_nand_page_read(nandc);
> +
> +	for (i = start_step; i < last_step; i++) {

This comment applies to both patches 15 and 16:

I would really prefer having a qcom_nandc_read_cw_raw() that reads only
one CW. From qcom_nandc_read_page_raw() you would loop over all the CWs,
calling the qcom_nandc_read_cw_raw() helper (these are raw reads, we don't
care about performance), and from ->read_page_raw() you would use that
helper to check whether CWs with uncorrectable errors are blank. You
would avoid the not-so-nice logic where you read all the CWs between the
first bad one and the last bad one.

Thanks,
Miquèl

--
To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Abhishek Sahu May 28, 2018, 7:34 a.m. UTC | #2
On 2018-05-27 19:23, Miquel Raynal wrote:
> Hi Abhishek,
> 
> On Fri, 25 May 2018 17:51:43 +0530, Abhishek Sahu
> <absahu@codeaurora.org> wrote:
> 
>> This patch does minor code reorganization for raw reads.
>> Currently the raw read is required for complete page but for
>> subsequent patches related with erased codeword bit flips
>> detection, only few CW should be read. So, this patch adds
>> helper function and introduces the read CW bitmask which
>> specifies which CW reads are required in complete page.
>> 
>> Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
>> ---
>> * Changes from v2:
>>   NONE
>> 
>> * Changes from v1:
>>  1. Included more detail in function comment
>> 
>>  drivers/mtd/nand/raw/qcom_nandc.c | 197 ++++++++++++++++++++++++--------------
>>  1 file changed, 123 insertions(+), 74 deletions(-)
>> 
>> diff --git a/drivers/mtd/nand/raw/qcom_nandc.c b/drivers/mtd/nand/raw/qcom_nandc.c
>> index 87f900e..34143a4 100644
>> --- a/drivers/mtd/nand/raw/qcom_nandc.c
>> +++ b/drivers/mtd/nand/raw/qcom_nandc.c
>> @@ -1588,6 +1588,127 @@ static int check_flash_errors(struct qcom_nand_host *host, int cw_cnt)
>>  }
>> 
>>  /*
>> + * Helper to perform the page raw read operation. The read_cw_mask will be
>> + * used to specify the codewords (CW) for which the data should be read. The
>> + * single page contains multiple CW.
>> + *
>> + * Normally other NAND controllers store the data in main area and
>> + * ecc bytes in OOB area. So, if page size is 2048+64 then 2048
>> + * data bytes will go in main area followed by ECC bytes. The QCOM NAND
>> + * controller follows different layout in which the data+OOB is internally
>> + * divided in 528/532 bytes CW and each CW contains 516 bytes followed by
>> + * ECC parity bytes for that CW. By this, 4 available OOB bytes per CW
>> + * will also be protected with ECC.
>> + *
>> + * For each CW read, following are the 2 steps:
>> + * 1. Read the codeword bytes from NAND chip to NAND controller internal HW
>> + *    buffer.
>> + * 2. Copy all these bytes from this HW buffer to actual buffer.
>> + *
>> + * Sometime, only few CW data is required in complete page. The read_cw_mask
>> + * specifies which CW in a page needs to be read. Start address will be
>> + * determined with this CW mask to skip unnecessary data copy from NAND
>> + * flash device. Then, actual data copy from NAND controller HW internal buffer
>> + * to data buffer will be done only for the CWs, which have the mask set.
>> + */
>> +static int
>> +nandc_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
>> +		    u8 *data_buf, u8 *oob_buf,
>> +		    int page, unsigned long read_cw_mask)
> 
> Please prefix the helper with "qcom_nandc"
> 

  Sure Miquel.
  I will update that.

>> +{
>> +	struct qcom_nand_host *host = to_qcom_nand_host(chip);
>> +	struct qcom_nand_controller *nandc = get_qcom_nand_controller(chip);
>> +	struct nand_ecc_ctrl *ecc = &chip->ecc;
>> +	int i, ret;
>> +	int read_loc, start_step, last_step;
>> +
>> +	nand_read_page_op(chip, page, 0, NULL, 0);
>> +
>> +	host->use_ecc = false;
>> +	start_step = ffs(read_cw_mask) - 1;
>> +	last_step = fls(read_cw_mask);
>> +
>> +	clear_bam_transaction(nandc);
>> +	set_address(host, host->cw_size * start_step, page);
>> +	update_rw_regs(host, last_step - start_step, true);
>> +	config_nand_page_read(nandc);
>> +
>> +	for (i = start_step; i < last_step; i++) {
> 
> This comment applies for both patches 15 and 16:
> 
> I would really prefer having a qcom_nandc_read_cw_raw() that reads only
> one CW. From qcom_nandc_read_page_raw() you would loop over all the CW
> calling qcom_nandc_read_cw_raw() helper (it's raw reads, we don't care
> about performances)

  Doing it that way will degrade performance hugely.

  Currently, once we have formed the descriptor, the DMA takes care
  of the complete page data transfer from the NAND device to the
  buffer and generates a single interrupt.

  With a per-CW helper, it will form one CW descriptor and wait for
  it to finish. In the background, the data transfer from the NAND
  device will also be split, and for every CW it will issue the
  PAGE_READ command again, which is again time consuming.

  Data transfer degradation is OK, but this will increase CPU time
  and the number of interrupts, which will impact the performance of
  other peripherals during that time.

  Most NAND parts have a 4K page size, i.e. 8 CWs.

> and from ->read_page_raw() you would check
> CW with uncorrectable errors for being blank with that helper. You
> would avoid the not-so-nice logic where you read all the CW between the
> first bad one and the last bad one.
> 

  The reading between the first CW and the last CW is only from the
  NAND device to the NAND HW buffers. The NAND controller has 2 HW
  buffers which are used to optimize the traffic throughput between
  the NAND device and system memory, in both directions. Each buffer
  is 544B in size: 512B for data + 32B for spare bytes. Throughput
  optimization is achieved by executing internal data transfers
  (i.e. between NANDc buffers and system memory) simultaneously with
  NAND device operations.

  Making a separate function won't help in improving performance for
  this case either, since once everything is set up for reading the
  page (descriptor formation, issuing the PAGE_READ, data transfer
  from the flash array to the data register in the NAND device), the
  read time from the device to the NAND HW buffer is very small.
  Also, we did an optimization in which the copy from the NAND HW
  buffer to the actual buffer is done only for the requested CWs.

  Again, in this case CPU time will be more.

  Thanks,
  Abhishek
Miquel Raynal June 7, 2018, 12:43 p.m. UTC | #3
Hi Abhishek,

On Mon, 28 May 2018 13:04:45 +0530, Abhishek Sahu
<absahu@codeaurora.org> wrote:

> On 2018-05-27 19:23, Miquel Raynal wrote:
> > Hi Abhishek,
> >
> > On Fri, 25 May 2018 17:51:43 +0530, Abhishek Sahu
> > <absahu@codeaurora.org> wrote:
> >
> >> This patch does minor code reorganization for raw reads.
> >> Currently the raw read is required for complete page but for
> >> subsequent patches related with erased codeword bit flips
> >> detection, only few CW should be read. So, this patch adds
> >> helper function and introduces the read CW bitmask which
> >> specifies which CW reads are required in complete page.
> >>
> >> Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
> >> ---
> >> * Changes from v2:
> >>   NONE
> >>
> >> * Changes from v1:
> >>  1. Included more detail in function comment
> >>
> >>  drivers/mtd/nand/raw/qcom_nandc.c | 197 ++++++++++++++++++++++++--------------
> >>  1 file changed, 123 insertions(+), 74 deletions(-)
> >>
> >> diff --git a/drivers/mtd/nand/raw/qcom_nandc.c b/drivers/mtd/nand/raw/qcom_nandc.c
> >> index 87f900e..34143a4 100644
> >> --- a/drivers/mtd/nand/raw/qcom_nandc.c
> >> +++ b/drivers/mtd/nand/raw/qcom_nandc.c
> >> @@ -1588,6 +1588,127 @@ static int check_flash_errors(struct qcom_nand_host *host, int cw_cnt)
> >>  }
> >>
> >>  /*
> >> + * Helper to perform the page raw read operation. The read_cw_mask will be
> >> + * used to specify the codewords (CW) for which the data should be read. The
> >> + * single page contains multiple CW.
> >> + *
> >> + * Normally other NAND controllers store the data in main area and
> >> + * ecc bytes in OOB area. So, if page size is 2048+64 then 2048
> >> + * data bytes will go in main area followed by ECC bytes. The QCOM NAND
> >> + * controller follows different layout in which the data+OOB is internally
> >> + * divided in 528/532 bytes CW and each CW contains 516 bytes followed by
> >> + * ECC parity bytes for that CW. By this, 4 available OOB bytes per CW
> >> + * will also be protected with ECC.
> >> + *
> >> + * For each CW read, following are the 2 steps:
> >> + * 1. Read the codeword bytes from NAND chip to NAND controller internal HW
> >> + *    buffer.
> >> + * 2. Copy all these bytes from this HW buffer to actual buffer.
> >> + *
> >> + * Sometime, only few CW data is required in complete page. The read_cw_mask
> >> + * specifies which CW in a page needs to be read. Start address will be
> >> + * determined with this CW mask to skip unnecessary data copy from NAND
> >> + * flash device. Then, actual data copy from NAND controller HW internal buffer
> >> + * to data buffer will be done only for the CWs, which have the mask set.
> >> + */
> >> +static int
> >> +nandc_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
> >> +		    u8 *data_buf, u8 *oob_buf,
> >> +		    int page, unsigned long read_cw_mask)
> >
> > Please prefix the helper with "qcom_nandc"
> >
>   Sure Miquel.
>   I will update that.
> 
> >> +{
> >> +	struct qcom_nand_host *host = to_qcom_nand_host(chip);
> >> +	struct qcom_nand_controller *nandc = get_qcom_nand_controller(chip);
> >> +	struct nand_ecc_ctrl *ecc = &chip->ecc;
> >> +	int i, ret;
> >> +	int read_loc, start_step, last_step;
> >> +
> >> +	nand_read_page_op(chip, page, 0, NULL, 0);
> >> +
> >> +	host->use_ecc = false;
> >> +	start_step = ffs(read_cw_mask) - 1;
> >> +	last_step = fls(read_cw_mask);
> >> +
> >> +	clear_bam_transaction(nandc);
> >> +	set_address(host, host->cw_size * start_step, page);
> >> +	update_rw_regs(host, last_step - start_step, true);
> >> +	config_nand_page_read(nandc);
> >> +
> >> +	for (i = start_step; i < last_step; i++) {
> > This comment applies for both patches 15 and 16:
> >
> > I would really prefer having a qcom_nandc_read_cw_raw() that reads only
> > one CW. From qcom_nandc_read_page_raw() you would loop over all the CW
> > calling qcom_nandc_read_cw_raw() helper (it's raw reads, we don't care
> > about performances)  
> 
>   Doing that way will degrade performances hugely.
> 
>   Currently once we formed the descriptor, the DMA will take care
>   of complete page data transfer from NAND device to buffer and will
>   generate single interrupt.
> 
>   Now it will form one CW descriptor and wait for it to be finished.
>   In background, the data transfer from NAND device will be also
>   split and for every CW, it will give the PAGE_READ command again,
>   which is again time consuming.
> 
>   Data transfer degradation is ok but it will increase CPU time
>   and number of interrupts which will impact other peripherals
>   performance that time.
> 
>   Most of the NAND parts has 4K page size i.e 8 CWs.
> 
> > and from ->read_page_raw() you would check
> > CW with uncorrectable errors for being blank with that helper. You
> > would avoid the not-so-nice logic where you read all the CW between the
> > first bad one and the last bad one.
> >   
>   The reading b/w first CW and last CW is only from NAND device to NAND
>   HW buffers. The NAND controller has 2 HW buffers which is used to
>   optimize the traffic throughput between the NAND device and
>   system memory,in both directions. Each buffer is 544B in size: 512B
>   for data + 32B spare bytes. Throughput optimization is achieved by
>   executing internal data transfers (i.e. between NANDc buffers and
>   system memory) simultaneously with NAND device operations.
> 
>   Making separate function won't help in improving performance for
>   this case either since once every thing is set for reading page
>   (descriptor formation, issue the PAGE_READ, Data transfer from
>   Flash array to data register in NAND device), the read time from
>   device to NAND HW buffer is very less. Again, we did optimization
>   in which the copying from NAND HW buffer to actual buffer is being
>   done only for those CW's only.
> 
>   Again, in this case CPU time will be more.
> 


I understand the point and thanks for detailing it. But raw accesses
happen either during debug (we don't care about CPU time) or when there
is an uncorrectable error, which is very unlikely to happen often
when using e.g. UBI/UBIFS. So I'm still convinced it is better to have
_simple_ and straightforward code for this path than something much
faster but way harder to understand.

You can add a comment to explain what would be the fastest way and
why though.


Thanks,
Miquèl
Abhishek Sahu June 11, 2018, 9:19 a.m. UTC | #4
On 2018-06-07 18:13, Miquel Raynal wrote:
> Hi Abhishek,
> 
> On Mon, 28 May 2018 13:04:45 +0530, Abhishek Sahu
> <absahu@codeaurora.org> wrote:
> 
>> On 2018-05-27 19:23, Miquel Raynal wrote:
>> > Hi Abhishek,
>> >
>> > On Fri, 25 May 2018 17:51:43 +0530, Abhishek Sahu
>> > <absahu@codeaurora.org> wrote:
>> >
>> >> This patch does minor code reorganization for raw reads.
>> >> Currently the raw read is required for complete page but for
>> >> subsequent patches related with erased codeword bit flips
>> >> detection, only few CW should be read. So, this patch adds
>> >> helper function and introduces the read CW bitmask which
>> >> specifies which CW reads are required in complete page.
>> >>
>> >> Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
>> >> ---

  <snip>

>> >> +	for (i = start_step; i < last_step; i++) {
>> > This comment applies for both patches 15 and 16:
>> >
>> > I would really prefer having a qcom_nandc_read_cw_raw() that reads only
>> > one CW. From qcom_nandc_read_page_raw() you would loop over all the CW
>> > calling qcom_nandc_read_cw_raw() helper (it's raw reads, we don't care
>> > about performances)
>> 
>>   Doing that way will degrade performances hugely.
>> 
>>   Currently once we formed the descriptor, the DMA will take care
>>   of complete page data transfer from NAND device to buffer and will
>>   generate single interrupt.
>> 
>>   Now it will form one CW descriptor and wait for it to be finished.
>>   In background, the data transfer from NAND device will be also
>>   split and for every CW, it will give the PAGE_READ command again,
>>   which is again time consuming.
>> 
>>   Data transfer degradation is ok but it will increase CPU time
>>   and number of interrupts which will impact other peripherals
>>   performance that time.
>> 
>>   Most of the NAND parts has 4K page size i.e 8 CWs.
>> 
>> > and from ->read_page_raw() you would check
>> > CW with uncorrectable errors for being blank with that helper. You
>> > would avoid the not-so-nice logic where you read all the CW between the
>> > first bad one and the last bad one.
>> >
>>   The reading b/w first CW and last CW is only from NAND device to NAND
>>   HW buffers. The NAND controller has 2 HW buffers which is used to
>>   optimize the traffic throughput between the NAND device and
>>   system memory,in both directions. Each buffer is 544B in size: 512B
>>   for data + 32B spare bytes. Throughput optimization is achieved by
>>   executing internal data transfers (i.e. between NANDc buffers and
>>   system memory) simultaneously with NAND device operations.
>> 
>>   Making separate function won't help in improving performance for
>>   this case either since once every thing is set for reading page
>>   (descriptor formation, issue the PAGE_READ, Data transfer from
>>   Flash array to data register in NAND device), the read time from
>>   device to NAND HW buffer is very less. Again, we did optimization
>>   in which the copying from NAND HW buffer to actual buffer is being
>>   done only for those CW's only.
>> 
>>   Again, in this case CPU time will be more.
>> 
> 
> 
> I understand the point and thanks for detailing it. But raw access
> happen either during debug (we don't care about CPU time) or when there
> is an uncorrectable error, which is very unlikely to happen very often
> when using eg. UBI/UBIFS. So I'm still convinced it is better to have a
> _simple_ and straightforward code for this path than something way
> harder to understand and much faster.
> 
> You can add a comment to explain what would be the fastest way and
> why though.
> 

  Thanks Miquel. I will make the changes to add a function for a
  single codeword raw read.

  Regards,
  Abhishek
Patch

diff --git a/drivers/mtd/nand/raw/qcom_nandc.c b/drivers/mtd/nand/raw/qcom_nandc.c
index 87f900e..34143a4 100644
--- a/drivers/mtd/nand/raw/qcom_nandc.c
+++ b/drivers/mtd/nand/raw/qcom_nandc.c
@@ -1588,6 +1588,127 @@  static int check_flash_errors(struct qcom_nand_host *host, int cw_cnt)
 }
 
 /*
+ * Helper to perform the page raw read operation. The read_cw_mask is
+ * used to specify the codewords (CWs) for which the data should be read.
+ * A single page contains multiple CWs.
+ *
+ * Normally, other NAND controllers store the data in the main area and
+ * the ECC bytes in the OOB area. So, if the page size is 2048+64, then 2048
+ * data bytes go in the main area followed by the ECC bytes. The QCOM NAND
+ * controller follows a different layout in which the data+OOB is internally
+ * divided into 528/532 byte CWs and each CW contains 516 bytes followed by
+ * the ECC parity bytes for that CW. This way, the 4 available OOB bytes per
+ * CW are also protected with ECC.
+ *
+ * Each CW read consists of the following 2 steps:
+ * 1. Read the codeword bytes from the NAND chip to the NAND controller's
+ *    internal HW buffer.
+ * 2. Copy all these bytes from this HW buffer to the actual buffer.
+ *
+ * Sometimes, only a few CWs of a complete page are required. The read_cw_mask
+ * specifies which CWs in a page need to be read. The start address is
+ * determined from this CW mask to skip unnecessary data copy from the NAND
+ * flash device. Then, the actual data copy from the controller's HW buffer
+ * to the data buffer is done only for the CWs which have their mask bit set.
+ */
+static int
+nandc_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
+		    u8 *data_buf, u8 *oob_buf,
+		    int page, unsigned long read_cw_mask)
+{
+	struct qcom_nand_host *host = to_qcom_nand_host(chip);
+	struct qcom_nand_controller *nandc = get_qcom_nand_controller(chip);
+	struct nand_ecc_ctrl *ecc = &chip->ecc;
+	int i, ret;
+	int read_loc, start_step, last_step;
+
+	nand_read_page_op(chip, page, 0, NULL, 0);
+
+	host->use_ecc = false;
+	start_step = ffs(read_cw_mask) - 1;
+	last_step = fls(read_cw_mask);
+
+	clear_bam_transaction(nandc);
+	set_address(host, host->cw_size * start_step, page);
+	update_rw_regs(host, last_step - start_step, true);
+	config_nand_page_read(nandc);
+
+	for (i = start_step; i < last_step; i++) {
+		int data_size1, data_size2, oob_size1, oob_size2;
+		int reg_off = FLASH_BUF_ACC;
+
+		data_size1 = mtd->writesize - host->cw_size * (ecc->steps - 1);
+		oob_size1 = host->bbm_size;
+
+		if (i == (ecc->steps - 1)) {
+			data_size2 = ecc->size - data_size1 -
+				     ((ecc->steps - 1) << 2);
+			oob_size2 = (ecc->steps << 2) + host->ecc_bytes_hw +
+				    host->spare_bytes;
+		} else {
+			data_size2 = host->cw_data - data_size1;
+			oob_size2 = host->ecc_bytes_hw + host->spare_bytes;
+		}
+
+		/*
+		 * Don't perform actual data copy from NAND controller internal
+		 * HW buffer to data buffer through DMA for this codeword.
+		 */
+		if (!(read_cw_mask & BIT(i))) {
+			if (nandc->props->is_bam)
+				nandc_set_read_loc(nandc, 0, 0, 0, 1);
+
+			config_nand_cw_read(nandc, false);
+
+			data_buf += data_size1 + data_size2;
+			oob_buf += oob_size1 + oob_size2;
+
+			continue;
+		}
+
+		if (nandc->props->is_bam) {
+			read_loc = 0;
+			nandc_set_read_loc(nandc, 0, read_loc, data_size1, 0);
+			read_loc += data_size1;
+
+			nandc_set_read_loc(nandc, 1, read_loc, oob_size1, 0);
+			read_loc += oob_size1;
+
+			nandc_set_read_loc(nandc, 2, read_loc, data_size2, 0);
+			read_loc += data_size2;
+
+			nandc_set_read_loc(nandc, 3, read_loc, oob_size2, 1);
+		}
+
+		config_nand_cw_read(nandc, false);
+
+		read_data_dma(nandc, reg_off, data_buf, data_size1, 0);
+		reg_off += data_size1;
+		data_buf += data_size1;
+
+		read_data_dma(nandc, reg_off, oob_buf, oob_size1, 0);
+		reg_off += oob_size1;
+		oob_buf += oob_size1;
+
+		read_data_dma(nandc, reg_off, data_buf, data_size2, 0);
+		reg_off += data_size2;
+		data_buf += data_size2;
+
+		read_data_dma(nandc, reg_off, oob_buf, oob_size2, 0);
+		oob_buf += oob_size2;
+	}
+
+	ret = submit_descs(nandc);
+	free_descs(nandc);
+	if (ret) {
+		dev_err(nandc->dev, "failure to read raw page\n");
+		return ret;
+	}
+
+	return check_flash_errors(host, ecc->steps);
+}
+
+/*
  * reads back status registers set by the controller to notify page read
  * errors. this is equivalent to what 'ecc->correct()' would do.
  */
@@ -1815,80 +1936,8 @@  static int qcom_nandc_read_page_raw(struct mtd_info *mtd,
 				    struct nand_chip *chip, uint8_t *buf,
 				    int oob_required, int page)
 {
-	struct qcom_nand_host *host = to_qcom_nand_host(chip);
-	struct qcom_nand_controller *nandc = get_qcom_nand_controller(chip);
-	u8 *data_buf, *oob_buf;
-	struct nand_ecc_ctrl *ecc = &chip->ecc;
-	int i, ret;
-	int read_loc;
-
-	nand_read_page_op(chip, page, 0, NULL, 0);
-	data_buf = buf;
-	oob_buf = chip->oob_poi;
-
-	host->use_ecc = false;
-
-	clear_bam_transaction(nandc);
-	update_rw_regs(host, ecc->steps, true);
-	config_nand_page_read(nandc);
-
-	for (i = 0; i < ecc->steps; i++) {
-		int data_size1, data_size2, oob_size1, oob_size2;
-		int reg_off = FLASH_BUF_ACC;
-
-		data_size1 = mtd->writesize - host->cw_size * (ecc->steps - 1);
-		oob_size1 = host->bbm_size;
-
-		if (i == (ecc->steps - 1)) {
-			data_size2 = ecc->size - data_size1 -
-				     ((ecc->steps - 1) << 2);
-			oob_size2 = (ecc->steps << 2) + host->ecc_bytes_hw +
-				    host->spare_bytes;
-		} else {
-			data_size2 = host->cw_data - data_size1;
-			oob_size2 = host->ecc_bytes_hw + host->spare_bytes;
-		}
-
-		if (nandc->props->is_bam) {
-			read_loc = 0;
-			nandc_set_read_loc(nandc, 0, read_loc, data_size1, 0);
-			read_loc += data_size1;
-
-			nandc_set_read_loc(nandc, 1, read_loc, oob_size1, 0);
-			read_loc += oob_size1;
-
-			nandc_set_read_loc(nandc, 2, read_loc, data_size2, 0);
-			read_loc += data_size2;
-
-			nandc_set_read_loc(nandc, 3, read_loc, oob_size2, 1);
-		}
-
-		config_nand_cw_read(nandc, false);
-
-		read_data_dma(nandc, reg_off, data_buf, data_size1, 0);
-		reg_off += data_size1;
-		data_buf += data_size1;
-
-		read_data_dma(nandc, reg_off, oob_buf, oob_size1, 0);
-		reg_off += oob_size1;
-		oob_buf += oob_size1;
-
-		read_data_dma(nandc, reg_off, data_buf, data_size2, 0);
-		reg_off += data_size2;
-		data_buf += data_size2;
-
-		read_data_dma(nandc, reg_off, oob_buf, oob_size2, 0);
-		oob_buf += oob_size2;
-	}
-
-	ret = submit_descs(nandc);
-	free_descs(nandc);
-	if (ret) {
-		dev_err(nandc->dev, "failure to read raw page\n");
-		return ret;
-	}
-
-	return check_flash_errors(host, ecc->steps);
+	return nandc_read_page_raw(mtd, chip, buf, chip->oob_poi, page,
+				   BIT(chip->ecc.steps) - 1);
 }
 
 /* implements ecc->read_oob() */