spi: rspi: fix the bug related to mount/remount jffs2

Message ID	1486342933-5307-1-git-send-email-cv-dong@jinso.co.jp (mailing list archive)
State	Superseded
Delegated to:	Geert Uytterhoeven
Headers	show Return-Path: <linux-renesas-soc-owner@kernel.org> From: DongCV <cv-dong@jinso.co.jp> To: broonie@kernel.org, geert+renesas@glider.be, linux-spi@vger.kernel.org Cc: kuninori.morimoto.gx@renesas.com, yoshihiro.shimoda.uh@renesas.com, ryusuke.sakato.bx@renesas.com, linux-renesas-soc@vger.kernel.org, nv-dung@jinso.co.jp, h-inayoshi@jinso.co.jp, cm-hiep@jinso.co.jp Subject: [PATCH] spi: rspi: fix the bug related to mount/remount jffs2 Date: Mon, 6 Feb 2017 10:02:13 +0900 Message-Id: <1486342933-5307-1-git-send-email-cv-dong@jinso.co.jp> Sender: linux-renesas-soc-owner@vger.kernel.org Precedence: bulk

Cao Van Dong Feb. 6, 2017, 1:02 a.m. UTC

From: Dong <cv-dong@jinso.co.jp>

This patch fixes the output warning logs and data loss when
performing mount/umount then remount the device with jffs2 format.

This is the warning logs when performing mount/umount then remount the device with jffs2 format:
"root@linaro-naro:~# mount -t jffs2 /dev/mtdblock2 /mnt/media
[ 3839.928013] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40000: 0x1900 instead
[ 3839.956515] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40004: 0x000c instead
[ 3839.985009] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40008: 0xb0b1 instead
[ 3840.013591] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b4000c: 0x1900 instead
[ 3840.042087] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40010: 0x0044 instead
[ 3840.070566] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40014: 0xfb1d instead
[ 3840.099159] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40018: 0x0002 instead
[ 3840.127604] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b4001c: 0x0001 instead
[ 3840.156043] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40020: 0x81a4 instead
[ 3840.184472] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b4002c: 0x6529 instead
[ 3840.212900] jffs2: Further such events for this erase block will not be printed
[ 3840.322915] jffs2: notice: (2008) read_dnode: node CRC failed on dnode at 0x3b40080: read 0xc40b5dfc, calculated 0x264be003
root@linaro-naro[ 3840.356857] jffs2: warning: (2008) jffs2_do_read_inode_internal: no data nodes found for ino #2
:~# [ 3840.386659] jffs2: Returned error for crccheck of ino #2. Expect badness...

Signed-off-by: Dong <cv-dong@jinso.co.jp>
---
 drivers/spi/spi-rspi.c | 1 -
 1 file changed, 1 deletion(-)

Geert Uytterhoeven Feb. 6, 2017, 11:02 a.m. UTC | #1

Hi Dong,

On Mon, Feb 6, 2017 at 2:02 AM, DongCV <cv-dong@jinso.co.jp> wrote:
> From: Dong <cv-dong@jinso.co.jp>
>
> This patch fixes the output warning logs and data loss when
> performing mount/umount then remount the device with jffs2 format.
>
> This is the warning logs when performing mount/umount then remount the device with jffs2 format:
> "root@linaro-naro:~# mount -t jffs2 /dev/mtdblock2 /mnt/media
> [ 3839.928013] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40000: 0x1900 instead
> [ 3839.956515] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40004: 0x000c instead
> [ 3839.985009] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40008: 0xb0b1 instead
> [ 3840.013591] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b4000c: 0x1900 instead
> [ 3840.042087] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40010: 0x0044 instead
> [ 3840.070566] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40014: 0xfb1d instead
> [ 3840.099159] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40018: 0x0002 instead
> [ 3840.127604] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b4001c: 0x0001 instead
> [ 3840.156043] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40020: 0x81a4 instead
> [ 3840.184472] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b4002c: 0x6529 instead
> [ 3840.212900] jffs2: Further such events for this erase block will not be printed
> [ 3840.322915] jffs2: notice: (2008) read_dnode: node CRC failed on dnode at 0x3b40080: read 0xc40b5dfc, calculated 0x264be003
> root@linaro-naro[ 3840.356857] jffs2: warning: (2008) jffs2_do_read_inode_internal: no data nodes found for ino #2
> :~# [ 3840.386659] jffs2: Returned error for crccheck of ino #2. Expect badness...
>
> Signed-off-by: Dong <cv-dong@jinso.co.jp>
> ---
>  drivers/spi/spi-rspi.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/spi/spi-rspi.c b/drivers/spi/spi-rspi.c
> index 9daf500..2ee1301 100644
> --- a/drivers/spi/spi-rspi.c
> +++ b/drivers/spi/spi-rspi.c
> @@ -848,7 +848,6 @@ static int qspi_transfer_in(struct rspi_data *rspi, struct spi_transfer *xfer)
>                         ret = rspi_pio_transfer(rspi, NULL, rx, n);
>                         if (ret < 0)
>                                 return ret;
> -                       *rx++ = ret;
>                 }
>                 n -= len;
>         }

Apart from sending patches inline, my comments from
https://www.spinics.net/lists/linux-spi/msg09753.html are still valid.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

Mark Brown Feb. 6, 2017, 12:46 p.m. UTC | #2

On Mon, Feb 06, 2017 at 10:02:13AM +0900, DongCV wrote:
> From: Dong <cv-dong@jinso.co.jp>
> 
> This patch fixes the output warning logs and data loss when
> performing mount/umount then remount the device with jffs2 format.

This is not a good changelog since it does not describe what the problem
with the driver is or how the change fixes it - it just says that there
is a problem.

> This is the warning logs when performing mount/umount then remount the device with jffs2 format:
> "root@linaro-naro:~# mount -t jffs2 /dev/mtdblock2 /mnt/media
> [ 3839.928013] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40000: 0x1900 instead
> [ 3839.956515] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40004: 0x000c instead

Please think hard before including long log messages in upstream
reports, they often contain almost no useful information relative to
their size so often obscure the relevant content in your message.  Some
subset is usually much better (eg, "produces lots of errors like X"
here).

Cao Van Dong Feb. 7, 2017, 10:25 a.m. UTC | #3

Dear Geert,

Thank you for your reply.

>> diff --git a/drivers/spi/spi-rspi.c b/drivers/spi/spi-rspi.c
>> index 9daf500..2ee1301 100644
>> --- a/drivers/spi/spi-rspi.c
>> +++ b/drivers/spi/spi-rspi.c
>> @@ -848,7 +848,6 @@ static int qspi_transfer_in(struct rspi_data *rspi, struct spi_transfer *xfer)
>>                        ret = rspi_pio_transfer(rspi, NULL, rx, n);
>>                        if (ret < 0)
>>                                return ret;
>> -                     *rx++ = ret;
> Storing the success code (0) in the receive buffer is indeed wrong.
>
> However, there are other bugs in that code:
>
> rspi_pio_transfer(rspi, NULL, rx, n) transfers n bytes instead of len,
> while n is decreased by len (which is <= n).
> Furthermore rx is not incremented.
> Hence if len < n, n will still be non-zero, and a new iteration of the
> loop will be started, trying to receive more data, and overwriting the
> just filled buffer.
>
> The same bug is present in qspi_transfer_out().
>
>>                }
>>                n -= len;
>>        }
>> --
>> 1.9.1
>>
(nip)
> Apart from sending patches inline, my comments from
> https://www.spinics.net/lists/linux-spi/msg09753.html are still valid.

Sorry I might not understand your explanation correctly. Could you 
please explain it more details?
(Because I've tried to understand your explanation then analyzed the 
source code again as below:
https://patchwork.kernel.org/patch/9541629/)

Thank you.
Dong



On 02/06/2017 08:02 PM, Geert Uytterhoeven wrote:
> Hi Dong,
>
> On Mon, Feb 6, 2017 at 2:02 AM, DongCV <cv-dong@jinso.co.jp> wrote:
>> From: Dong <cv-dong@jinso.co.jp>
>>
>> This patch fixes the output warning logs and data loss when
>> performing mount/umount then remount the device with jffs2 format.
>>
>> This is the warning logs when performing mount/umount then remount the device with jffs2 format:
>> "root@linaro-naro:~# mount -t jffs2 /dev/mtdblock2 /mnt/media
>> [ 3839.928013] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40000: 0x1900 instead
>> [ 3839.956515] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40004: 0x000c instead
>> [ 3839.985009] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40008: 0xb0b1 instead
>> [ 3840.013591] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b4000c: 0x1900 instead
>> [ 3840.042087] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40010: 0x0044 instead
>> [ 3840.070566] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40014: 0xfb1d instead
>> [ 3840.099159] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40018: 0x0002 instead
>> [ 3840.127604] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b4001c: 0x0001 instead
>> [ 3840.156043] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40020: 0x81a4 instead
>> [ 3840.184472] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b4002c: 0x6529 instead
>> [ 3840.212900] jffs2: Further such events for this erase block will not be printed
>> [ 3840.322915] jffs2: notice: (2008) read_dnode: node CRC failed on dnode at 0x3b40080: read 0xc40b5dfc, calculated 0x264be003
>> root@linaro-naro[ 3840.356857] jffs2: warning: (2008) jffs2_do_read_inode_internal: no data nodes found for ino #2
>> :~# [ 3840.386659] jffs2: Returned error for crccheck of ino #2. Expect badness...
>>
>> Signed-off-by: Dong <cv-dong@jinso.co.jp>
>> ---
>>   drivers/spi/spi-rspi.c | 1 -
>>   1 file changed, 1 deletion(-)
>>
>> diff --git a/drivers/spi/spi-rspi.c b/drivers/spi/spi-rspi.c
>> index 9daf500..2ee1301 100644
>> --- a/drivers/spi/spi-rspi.c
>> +++ b/drivers/spi/spi-rspi.c
>> @@ -848,7 +848,6 @@ static int qspi_transfer_in(struct rspi_data *rspi, struct spi_transfer *xfer)
>>                          ret = rspi_pio_transfer(rspi, NULL, rx, n);
>>                          if (ret < 0)
>>                                  return ret;
>> -                       *rx++ = ret;
>>                  }
>>                  n -= len;
>>          }
> Apart from sending patches inline, my comments from
> https://www.spinics.net/lists/linux-spi/msg09753.html are still valid.
>
> Gr{oetje,eeting}s,
>
>                          Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                  -- Linus Torvalds
>

Geert Uytterhoeven Feb. 8, 2017, 8:58 a.m. UTC | #4

Hi Dong,

On Tue, Feb 7, 2017 at 11:25 AM, DongCV <cv-dong@jinso.co.jp> wrote:
>>> diff --git a/drivers/spi/spi-rspi.c b/drivers/spi/spi-rspi.c
>>> index 9daf500..2ee1301 100644
>>> --- a/drivers/spi/spi-rspi.c
>>> +++ b/drivers/spi/spi-rspi.c
>>> @@ -848,7 +848,6 @@ static int qspi_transfer_in(struct rspi_data *rspi,
>>> struct spi_transfer *xfer)
>>>                        ret = rspi_pio_transfer(rspi, NULL, rx, n);
>>>                        if (ret < 0)
>>>                                return ret;
>>> -                     *rx++ = ret;
>>
>> Storing the success code (0) in the receive buffer is indeed wrong.
>>
>> However, there are other bugs in that code:
>>
>> rspi_pio_transfer(rspi, NULL, rx, n) transfers n bytes instead of len,
>> while n is decreased by len (which is <= n).
>> Furthermore rx is not incremented.
>> Hence if len < n, n will still be non-zero, and a new iteration of the
>> loop will be started, trying to receive more data, and overwriting the
>> just filled buffer.
>>
>> The same bug is present in qspi_transfer_out().
>>
>>>                }
>>>                n -= len;
>>>        }
>>> --
>>> 1.9.1
>>>
> (nip)
>>
>> Apart from sending patches inline, my comments from
>> https://www.spinics.net/lists/linux-spi/msg09753.html are still valid.
>
>
> Sorry I might not understand your explanation correctly. Could you please
> explain it more details?
> (Because I've tried to understand your explanation then analyzed the source
> code again as below:
> https://patchwork.kernel.org/patch/9541629/)

qspi_transfer_in() does:

        while (n > 0) {
                len = qspi_set_receive_trigger(rspi, n);
                // len will be <= n
                if (len == QSPI_BUFFER_SIZE) {
                        // receive blocks of len bytes
                        ...
                } else {
                        // receive n (not len) bytes
                        ret = rspi_pio_transfer(rspi, NULL, rx, n);
                        //
                        if (ret < 0)
                                return ret;
                        // bogus write (which your patch removes: OK)
                        *rx++ = ret;
                        // here we should also return (see below why)
                        // (in qspi_transfer_out() we should "break")
                }
                // Either we received a block of len bytes
                // or we received n bytes, and the below is wrong if len < n!
                n -= len;
                // If len was < n, n will be non-zero, and we will receive more
                // bytes in the next iteration
        }

        return 0;
}

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

Cao Minh Hiep Feb. 13, 2017, 6:50 a.m. UTC | #5

Hi Geert,

Sorry to bother you!
> qspi_transfer_in() does:
>
>          while (n > 0) {
>                  len = qspi_set_receive_trigger(rspi, n);
>                  // len will be <= n
I agree. This is len = min value of n and 32.
>                  if (len == QSPI_BUFFER_SIZE) {
>                          // receive blocks of len bytes
Yes, This is len == 32 bytes
>                          ...
>                  } else {
>                          // receive n (not len) bytes
This is always n == len ( This case, len or n is always < 32)
Because this is the last n bytes should be received.

>                          ret = rspi_pio_transfer(rspi, NULL, rx, n);
>                          //
>                          if (ret < 0)
>                                  return ret;
>                          // bogus write (which your patch removes: OK)
>                          *rx++ = ret;
I agree. This code needs to be removed.

>                          // here we should also return (see below why)
>                          // (in qspi_transfer_out() we should "break")
>                  }
>                  // Either we received a block of len bytes
>                  // or we received n bytes, and the below is wrong if len < n!
>                  n -= len;
>                  // If len was < n, n will be non-zero, and we will receive more
>                  // bytes in the next iteration
I am sorry, I don't understand your opinion here also.
The following is my opinion:

In case of receiving n bytes data > 32 bytes (Ex: n= 50bytes)
The first loop, n= 50,and  len = 32 bytes by getting min value of 50 and 
32 from qspi_set_receive_trigger()
(this case was n < len).
Then it receives 32 bytes in "if (len == QSPI_BUFFER_SIZE)" statement.
After received 32bytes of data, n -= len is implemented. It means n =(n 
- len) = (50-32)= 18 bytes.
The first loop finished.

The second loop, n=18 bytes, and len = 18 bytes by getting min value of 
32 and 18 from qspi_set_receive_trigger().
This time, 'else' statement would be implemented. *rx pointer was 
increased into rspi_pio_transfer()
After received the last 18 bytes of data into 'else' statement, n -=len 
was implemented again.
It means n = (n - len) = (18 -18) = 0 byte.
The second loop finished.

This time n = 0, Completed receiving data.

In case of receiving n bytes data < 32 bytes  (Ex: n= 20 bytes).
It's the same with the second loop above.

Thank you.
Hiep.

Geert Uytterhoeven Feb. 13, 2017, 8:07 a.m. UTC | #6

Hi Hiep,

On Mon, Feb 13, 2017 at 7:50 AM, Hiep Cao Minh <cm-hiep@jinso.co.jp> wrote:
> Sorry to bother you!

No, thank you for finally make me see my wrong reasoning!

>> qspi_transfer_in() does:
>>
>>          while (n > 0) {
>>                  len = qspi_set_receive_trigger(rspi, n);
>>                  // len will be <= n
>
> I agree. This is len = min value of n and 32.
>>
>>                  if (len == QSPI_BUFFER_SIZE) {
>>                          // receive blocks of len bytes
>
> Yes, This is len == 32 bytes
>>
>>                          ...
>>                  } else {
>>                          // receive n (not len) bytes
>
> This is always n == len ( This case, len or n is always < 32)
> Because this is the last n bytes should be received.

Right.
Doh, I must have missed that if "len < n", the first branch is always taken.

>>                          ret = rspi_pio_transfer(rspi, NULL, rx, n);

It would have been clearer to use "len" here instead of "n".

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

Geert Uytterhoeven Feb. 13, 2017, 8:26 a.m. UTC | #7

On Mon, Feb 6, 2017 at 1:46 PM, Mark Brown <broonie@kernel.org> wrote:
> On Mon, Feb 06, 2017 at 10:02:13AM +0900, DongCV wrote:
>> From: Dong <cv-dong@jinso.co.jp>
>>
>> This patch fixes the output warning logs and data loss when
>> performing mount/umount then remount the device with jffs2 format.
>
> This is not a good changelog since it does not describe what the problem
> with the driver is or how the change fixes it - it just says that there
> is a problem.
>
>> This is the warning logs when performing mount/umount then remount the device with jffs2 format:
>> "root@linaro-naro:~# mount -t jffs2 /dev/mtdblock2 /mnt/media
>> [ 3839.928013] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40000: 0x1900 instead
>> [ 3839.956515] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x03b40004: 0x000c instead
>
> Please think hard before including long log messages in upstream
> reports, they often contain almost no useful information relative to
> their size so often obscure the relevant content in your message.  Some
> subset is usually much better (eg, "produces lots of errors like X"
> here).

May I suggest the following:

    spi: rspi: Fix bogus received byte in qspi_transfer_in()

    When there are less than QSPI_BUFFER_SIZE remaining bytes to be received,
    qspi_transfer_in() writes one bogus byte in the receive buffer, possibly
    leading to a buffer overflow.

    This can be reproduced by mounting, unmounting, and remounting a
    jffs2-formatted device, causing lots of warnings like:

        jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x03b40000: 0x1900 instead

    Remove the bogus write to fix this.

It's also a good idea to add a Fixes tag:

    Fixes: 3be09bec42a800d4 ("spi: rspi: supports 32bytes buffer for
DUAL and QUAD")

(the code was moved afterwards, but both the origin and the move were
 integrated in v4.10-rc1).

Finally:

    Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

Cao Minh Hiep Feb. 15, 2017, 12:26 a.m. UTC | #8

Hi Geert,

Sorry for the delay.

Thanks for your reply.


> On Mon, Feb 13, 2017 at 7:50 AM, Hiep Cao Minh <cm-hiep@jinso.co.jp> wrote:
>> Sorry to bother you!
> No, thank you for finally make me see my wrong reasoning!
>
>>> qspi_transfer_in() does:
>>>
>>>           while (n > 0) {
>>>                   len = qspi_set_receive_trigger(rspi, n);
>>>                   // len will be <= n
>> I agree. This is len = min value of n and 32.
>>>                   if (len == QSPI_BUFFER_SIZE) {
>>>                           // receive blocks of len bytes
>> Yes, This is len == 32 bytes
>>>                           ...
>>>                   } else {
>>>                           // receive n (not len) bytes
>> This is always n == len ( This case, len or n is always < 32)
>> Because this is the last n bytes should be received.
> Right.
> Doh, I must have missed that if "len < n", the first branch is always taken.
>
>>>                           ret = rspi_pio_transfer(rspi, NULL, rx, n);
> It would have been clearer to use "len" here instead of "n".
I agree, Dong will update your suggestion at the patch.

Thank you.
Hiep.

spi: rspi: fix the bug related to mount/remount jffs2

Commit Message

Comments

Patch