cdc-acm: fix abnormal DATA RX issue for Mediatek Preloader.

Message ID 1544671676-23912-1-git-send-email-macpaul.lin@mediatek.com (mailing list archive)
State Superseded

Commit Message

Macpaul Lin Dec. 13, 2018, 3:27 a.m. UTC
From: Macpaul Lin <macpaul.lin@mediatek.com>

Mediatek Preloader is a proprietary embedded boot loader for loading
Little Kernel and Linux into device DRAM.

This boot loader also handles firmware updates. Mediatek Preloader is
enumerated as a virtual COM port when the device is connected to a Windows
or Linux host via the CDC-ACM class driver. Once enumeration is done,
Mediatek Preloader actively sends the handshake command "READY" to the PC
instead of waiting for a command from the download tool.

Since Linux 4.12, the commit "tty: reset termios state on device
registration" (93857edd9829e144acb6c7e72d593f6e01aead66) causes Mediatek
Preloader to receive abnormal data such as "READYXX" echoed back after it
sends "READY". This is treated as an incorrect response, so the handshake
fails.

Disabling the ECHO termios flag avoids this problem. However, it cannot
be done from user space when the download tool opens /dev/ttyACM0,
because the device running Mediatek Preloader sends the handshake command
"READY" as soon as the CDC-ACM driver is ready.

This patch fixes the problem by introducing a "DISABLE_ECHO" quirk in
driver_info. When Mediatek Preloader is connected, the CDC-ACM driver
clears the ECHO flag in termios.

Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com>
---
 drivers/usb/class/cdc-acm.c | 9 ++++++++-
 drivers/usb/class/cdc-acm.h | 1 +
 2 files changed, 9 insertions(+), 1 deletion(-)

Comments

Oliver Neukum Dec. 13, 2018, 9:23 a.m. UTC | #1
On Do, 2018-12-13 at 11:27 +0800, macpaul.lin@mediatek.com wrote:
> From: Macpaul Lin <macpaul.lin@mediatek.com>
> 
> Mediatek Preloader is a proprietary embedded boot loader for loading
> Little Kernel and Linux into device DRAM.
> 
> This boot loader also handle firmware updating. Mediatek Preloader will be
> enumerated as a virtual COM port when the device is connected to Windows
> or Linux OS via CDC-ACM class driver. When the enumeration has been done,
> Mediatek Preloader will send out handshake command "READY" to PC actively
> instead of waiting command from the download tool.
> Since Linux 4.12, the commit "tty: reset termios state on device
> registration" (93857edd9829e144acb6c7e72d593f6e01aead66) causes Mediatek
> Preloader receiving some abnoraml command like "READYXX" as it sended.
> Which will be recognized as an incorrect response. This behavior change
> also causes the handshake fail.

Thank you for making this patch. However, I am afraid I have to ask
for two little alterations before it can go upstream.

1. If I understand you correctly, it usually worked only by accident
on the old kernels. Please CC the patch to stable.
2. Do not check for exact match on your quirk. That will prevent
combining quirks. Please test for the specific bit being set.

> 
>  
> +	/* handle active handshake triggered by device */
> +	if (quirks == DISABLE_ECHO)

This test is too specific.

> +		acm_tty_driver->init_termios.c_lflag &= ~(ECHO);
> +

	Regards
		Oliver
Johan Hovold Dec. 13, 2018, 9:43 a.m. UTC | #2
On Thu, Dec 13, 2018 at 11:27:56AM +0800, macpaul.lin@mediatek.com wrote:
> From: Macpaul Lin <macpaul.lin@mediatek.com>
> 
> Mediatek Preloader is a proprietary embedded boot loader for loading
> Little Kernel and Linux into device DRAM.
> 
> This boot loader also handle firmware updating. Mediatek Preloader will be
> enumerated as a virtual COM port when the device is connected to Windows
> or Linux OS via CDC-ACM class driver. When the enumeration has been done,
> Mediatek Preloader will send out handshake command "READY" to PC actively
> instead of waiting command from the download tool.
> Since Linux 4.12, the commit "tty: reset termios state on device
> registration" (93857edd9829e144acb6c7e72d593f6e01aead66) causes Mediatek
> Preloader receiving some abnoraml command like "READYXX" as it sended.
> Which will be recognized as an incorrect response. This behavior change
> also causes the handshake fail.
> 
> By disabling the ECHO termios flag could avoid this problem. However, it
> cannot be done by user space configuration when download tool open
> /dev/ttyACM0. This is because the device running Mediatek Preloader will
> send handshake command "READY" immediately once the CDC-ACM driver is
> ready.
> 
> This patch wants to fix above problem by introducing "DISABLE_ECHO"
> property in driver_info. When Mediatek Preloader is connected, the
> CDC-ACM driver could disable ECHO flag in termios to avoid the problem.
> 
> Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com>
> ---
>  drivers/usb/class/cdc-acm.c | 9 ++++++++-
>  drivers/usb/class/cdc-acm.h | 1 +
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c
> index 1b68fed..2f744bb 100644
> --- a/drivers/usb/class/cdc-acm.c
> +++ b/drivers/usb/class/cdc-acm.c
> @@ -1156,6 +1156,10 @@ static int acm_probe(struct usb_interface *intf,
>  		goto skip_normal_probe;
>  	}
>  
> +	/* handle active handshake triggered by device */
> +	if (quirks == DISABLE_ECHO)
> +		acm_tty_driver->init_termios.c_lflag &= ~(ECHO);

You cannot change the driver init_termios like this as that will affect
all cdc-acm devices that are probed later. If this is at all needed,
this will have to be done at tty install time.

Note that 93857edd9829 ("tty: reset termios state on device
registration") only makes sure that you get the default terminal setting
whenever you plug in a device (rather than reuse settings from a
previously connected device which happened to be assigned the same minor
number). Specifically, it should not change any behaviour for the first
time a cdc-acm device is plugged in.

From just a quick look, it seems you need to prevent your download tool
from sending "XX" before disabling ECHO. Why wouldn't that work?

Thanks,
Johan
Oliver Neukum Dec. 13, 2018, 10:13 a.m. UTC | #3
On Do, 2018-12-13 at 10:43 +0100, Johan Hovold wrote:
> On Thu, Dec 13, 2018 at 11:27:56AM +0800, macpaul.lin@mediatek.com wrote:
> > From: Macpaul Lin <macpaul.lin@mediatek.com>
> > 
> > +	/* handle active handshake triggered by device */
> > +	if (quirks == DISABLE_ECHO)
> > +		acm_tty_driver->init_termios.c_lflag &= ~(ECHO);
> 
> You cannot change the driver init_termios like this as that will affect
> all cdc-acm devices that are probed later. If this is at all needed,
> this will have to be done at tty install time.

Right. How do we decide on a sensible default anyway?

	Regards
		Oliver
Johan Hovold Dec. 13, 2018, 10:18 a.m. UTC | #4
On Thu, Dec 13, 2018 at 11:13:54AM +0100, Oliver Neukum wrote:
> On Do, 2018-12-13 at 10:43 +0100, Johan Hovold wrote:
> > On Thu, Dec 13, 2018 at 11:27:56AM +0800, macpaul.lin@mediatek.com wrote:
> > > From: Macpaul Lin <macpaul.lin@mediatek.com>
> > > 
> > > +	/* handle active handshake triggered by device */
> > > +	if (quirks == DISABLE_ECHO)
> > > +		acm_tty_driver->init_termios.c_lflag &= ~(ECHO);
> > 
> > You cannot change the driver init_termios like this as that will affect
> > all cdc-acm devices that are probed later. If this is at all needed,
> > this will have to be done at tty install time.
> 
> Right. How do with decide on a sensible default anyway?

I think the current defaults are sensible. They are based on
tty_std_termios, which has ECHO set, as for most (all?) tty drivers.

Johan
Macpaul Lin Dec. 14, 2018, 2 a.m. UTC | #5
On Thu, 2018-12-13 at 11:18 +0100, Johan Hovold wrote:
> On Thu, Dec 13, 2018 at 11:13:54AM +0100, Oliver Neukum wrote:
> > On Do, 2018-12-13 at 10:43 +0100, Johan Hovold wrote:
> > > On Thu, Dec 13, 2018 at 11:27:56AM +0800, macpaul.lin@mediatek.com wrote:
> > > > From: Macpaul Lin <macpaul.lin@mediatek.com>
> > > > 
> > > > +	/* handle active handshake triggered by device */
> > > > +	if (quirks == DISABLE_ECHO)
> > > > +		acm_tty_driver->init_termios.c_lflag &= ~(ECHO);
> > > 
> > > You cannot change the driver init_termios like this as that will affect
> > > all cdc-acm devices that are probed later. If this is at all needed,
> > > this will have to be done at tty install time.
> > 
> > Right. How do with decide on a sensible default anyway?
> 
> I think the current defaults are sensible. They are based on
> tty_std_termios, which has ECHO set, as for most (all?) tty drivers.
> 
> Johan

Well, the problem is that the phone device (preloader) receives
"READYXX" even when no download tool has ever been launched on the PC.

After the reset-termios-state change, ECHO is simply enabled every time,
and the phone device just sends "READY" to the PC once USB has been
connected. The "READY" indication has no "\0" character at the end
of the command string, hence the device receives some dirty data (e.g.
"READYXX") back on its RX path.

Regards,
Macpaul Lin
Johan Hovold Dec. 14, 2018, 11:07 a.m. UTC | #6
On Fri, Dec 14, 2018 at 10:00:16AM +0800, Macpaul Lin wrote:
> On Thu, 2018-12-13 at 11:18 +0100, Johan Hovold wrote:
> > On Thu, Dec 13, 2018 at 11:13:54AM +0100, Oliver Neukum wrote:
> > > On Do, 2018-12-13 at 10:43 +0100, Johan Hovold wrote:
> > > > On Thu, Dec 13, 2018 at 11:27:56AM +0800, macpaul.lin@mediatek.com wrote:
> > > > > From: Macpaul Lin <macpaul.lin@mediatek.com>
> > > > > 
> > > > > +	/* handle active handshake triggered by device */
> > > > > +	if (quirks == DISABLE_ECHO)
> > > > > +		acm_tty_driver->init_termios.c_lflag &= ~(ECHO);
> > > > 
> > > > You cannot change the driver init_termios like this as that will affect
> > > > all cdc-acm devices that are probed later. If this is at all needed,
> > > > this will have to be done at tty install time.
> > > 
> > > Right. How do with decide on a sensible default anyway?
> > 
> > I think the current defaults are sensible. They are based on
> > tty_std_termios, which has ECHO set, as for most (all?) tty drivers.

> Well, the problem is that the phone device (preloader) will get
> "READYXX" even "no any download tool has ever been launched on PC".

Something must hold the port open on the host (PC) side for it to be
echoed back, but perhaps the outgoing data is buffered in the device
until it is eventually opened.

> After the reset termios state change simply set EHCO enable, each time
> the phone device simple send out "READY" to PC after USB has been
> connected. The "READY" indication has no any "\0" character at the end
> of the command string, hence it will receive some dirty data (ex. 
> "READYXX") back on RX path.

So you have a firmware bug which sends out some garbage characters
("XX")?

Either way, you should have hit this also before the commit which
started resetting the terminal settings whenever a device was plugged
in instead of reusing a potentially random other device's settings.

However, after clearing echo and replugging the device, nothing would
have been echoed back on next open, but only *if* the device happened to
be assigned the same minor number.

But if this confuses your firmware and there's no way to restart the
handshake in your host tool, perhaps clearing ECHO at tty install time
(i.e. in tty_acm_install()) is the only way to deal with this.

Johan
Macpaul Lin Dec. 17, 2018, 5:50 a.m. UTC | #7
On Fri, 2018-12-14 at 12:07 +0100, Johan Hovold wrote:
> On Fri, Dec 14, 2018 at 10:00:16AM +0800, Macpaul Lin wrote:
> > On Thu, 2018-12-13 at 11:18 +0100, Johan Hovold wrote:
> > > On Thu, Dec 13, 2018 at 11:13:54AM +0100, Oliver Neukum wrote:
> > > > On Do, 2018-12-13 at 10:43 +0100, Johan Hovold wrote:
> > > > > On Thu, Dec 13, 2018 at 11:27:56AM +0800, macpaul.lin@mediatek.com wrote:
> > > > > > From: Macpaul Lin <macpaul.lin@mediatek.com>
> > > > > > 
> > > > > > +	/* handle active handshake triggered by device */
> > > > > > +	if (quirks == DISABLE_ECHO)
> > > > > > +		acm_tty_driver->init_termios.c_lflag &= ~(ECHO);
> > > > > 
> > > > > You cannot change the driver init_termios like this as that will affect
> > > > > all cdc-acm devices that are probed later. If this is at all needed,
> > > > > this will have to be done at tty install time.
> > > > 
> > > > Right. How do with decide on a sensible default anyway?
> > > 
> > > I think the current defaults are sensible. They are based on
> > > tty_std_termios, which has ECHO set, as for most (all?) tty drivers.
> 
> > Well, the problem is that the phone device (preloader) will get
> > "READYXX" even "no any download tool has ever been launched on PC".
> 
> Something much hold the port open on the host (PC) side for it to be
> echoed back, but perhaps the outgoing data is buffered in the device
> until it is eventually opened.
> 
> > After the reset termios state change simply set EHCO enable, each time
> > the phone device simple send out "READY" to PC after USB has been
> > connected. The "READY" indication has no any "\0" character at the end
> > of the command string, hence it will receive some dirty data (ex. 
> > "READYXX") back on RX path.
> 
> So you have a firmware bug which sends out some garbage characters
> ("XX")?

No, "XX" just means random data of arbitrary length, not the literal
characters "XX". From the hardware FIFO at the bottom, through the USB
driver buffers, up to the higher-level CDC driver and the download
handshake flow, the firmware keeps separate RX and TX data buffers. The
initial command "READY" is statically declared rather than dynamically
allocated, and is used only once, when the handshake flow starts. We also
tried clearing the RX/TX buffers (memset) when they are allocated at
system boot, but the problem persisted.

> Either way, you should have hit this also before the commit which
> started resetting the terminal settings whenever a device was plugged
> in instead of reusing a potentially random other device's settings.

We have reviewed the firmware code as well. This download flow (firmware
code) has been in use for more than 10 years and has never been changed.
Our phone products ship 100 million units per year. What I mean is that
this download flow has been verified many times on factory production
lines, on both Windows and Linux PCs. Hence a bug in the data buffer
management is very unlikely.

This problem was reported by one of our customers about a month ago,
after they upgraded their factory PCs from Ubuntu 16.04 (Linux 4.4) to
Ubuntu 18.04 (Linux 4.15). Mediatek therefore bisected the tty and
cdc-acm changes between the 4.4 and 4.15 releases to find out which
commit changed the behavior on the PC side. Because the reset termios
state change came together with two other patches, we also tested these
patches separately to confirm the bisect result:
        tty: reset termios state on device registration
        tty: drop obsolete termios_locked comments
        tty: close race between device register and open
We also removed the reset termios state commit from the latest 4.19
kernel and confirmed that the download works again on the PC.

> However, after clearing echo and replugging the device, nothing would
> have been echoed back on next open, but only *if* the device happened to
> be assigned the same minor number.
> 
> But if this confuses your firmware and there's no way to restart the
> handshake in your host tool, perhaps clearing ECHO at tty install time
> (i.e. in tty_acm_install()) is the only way to deal with this.

Thanks for your comment!
I'll move the ECHO clearing into tty_acm_install() and check whether it
also works for the download process. Thanks a lot!

> Johan

Regards,
Macpaul Lin
Johan Hovold Dec. 18, 2018, 8:55 a.m. UTC | #8
On Mon, Dec 17, 2018 at 01:50:58PM +0800, Macpaul Lin wrote:
> On Fri, 2018-12-14 at 12:07 +0100, Johan Hovold wrote:
> > On Fri, Dec 14, 2018 at 10:00:16AM +0800, Macpaul Lin wrote:

> > > After the reset termios state change simply set EHCO enable, each time
> > > the phone device simple send out "READY" to PC after USB has been
> > > connected. The "READY" indication has no any "\0" character at the end
> > > of the command string, hence it will receive some dirty data (ex. 
> > > "READYXX") back on RX path.
> > 
> > So you have a firmware bug which sends out some garbage characters
> > ("XX")?
> 
> No, "XX" just meant any random data of any possible length, not the real
> characters "XX". From the bottom of hardware FIIO, usb driver buffers,
> to the higher level CDC driver and download handshake flow, the firmware
> keeps separated RX and TX data buffers. The initial command "READY" is
> static declared and only been used once when the handshake flow starts
> in stead of dynamic allocated. We've also tested memory set at the time 
> when RX/TX buffer is allocated when system boot, but the problem did not
> recovered.

You can enable (verbose) debugging in the cdc-acm driver to see how many
bytes it receives from the device (and how many bytes it echoes back) in
order to determine where these garbage characters come from. Just define
DEBUG and VERBOSE_DEBUG in the cdc-acm driver.

> > Either way, you should have hit this also before the commit which
> > started resetting the terminal settings whenever a device was plugged
> > in instead of reusing a potentially random other device's settings.
> 
> We've also reviewed the firmware code, too. This mechanism of download
> flow (firmware code) has been used more than 10 years and has never been
> changed. Our phone product ships 100 million per year. What I mean here 
> is this download flow has been verified in the factory producing line
> both on Windows and Linux PC so many time. Hence the possibility of bug
> in the data buffer management is very rare. 
> 
> This problem has been reported by one of our customer about 1 month ago
> because they upgraded their factory PC from Ubuntu 16.04 (Linux 4.4) to
> Ubuntu 18.04. (Linux 4.15). Hence Mediatek has done lots of bisect test
> from 4.4 release to 4.15 on both tty and cdc-acm changes to find out
> which commit caused behavior change on PC side. Because the new reset 
> termios state method comes with another 2 patches. We also tested these
> patch separately to confirm the bisect result.  
>         tty: reset termios state on device registration
>         tty: drop obsolete termios_locked comments
>         tty: close race between device register and open
> We've also remove reset termios state commit on latest 4.19 kernel to
> confirm the behavior of PC is back to work for the download, too.

That's great, but the conclusion remains; the problem would have been
there on *first* open also before the termios reset change.

> > However, after clearing echo and replugging the device, nothing would
> > have been echoed back on next open, but only *if* the device happened to
> > be assigned the same minor number.
> > 
> > But if this confuses your firmware and there's no way to restart the
> > handshake in your host tool, perhaps clearing ECHO at tty install time
> > (i.e. in tty_acm_install()) is the only way to deal with this.
> 
> Thanks for you comment!
> I'll make a change to move the clearing ECHO into tty_acm_install() and
> to check if it also work for the download process. Thanks a lot!

Good. You're still modifying the shared driver init_termios in your v2,
but I'll comment on that in a reply to the patch.

Thanks,
Johan
Patch

diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c
index 1b68fed..2f744bb 100644
--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -1156,6 +1156,10 @@  static int acm_probe(struct usb_interface *intf,
 		goto skip_normal_probe;
 	}
 
+	/* handle active handshake triggered by device */
+	if (quirks == DISABLE_ECHO)
+		acm_tty_driver->init_termios.c_lflag &= ~(ECHO);
+
 	/* normal probing*/
 	if (!buffer) {
 		dev_err(&intf->dev, "Weird descriptor references\n");
@@ -1655,7 +1659,10 @@  static int acm_pre_reset(struct usb_interface *intf)
 	.driver_info = NO_UNION_NORMAL, /* has no union descriptor */
 	},
 	{ USB_DEVICE(0x0e8d, 0x0003), /* FIREFLY, MediaTek Inc; andrey.arapov@gmail.com */
-	.driver_info = NO_UNION_NORMAL, /* has no union descriptor */
+	.driver_info = DISABLE_ECHO, /* DISABLE ECHO in termios flag */
+	},
+	{ USB_DEVICE(0x0e8d, 0x2000), /* FIREFLY, MediaTek Inc; Preloader */
+	.driver_info = DISABLE_ECHO, /* DISABLE ECHO in termios flag */
 	},
 	{ USB_DEVICE(0x0e8d, 0x3329), /* MediaTek Inc GPS */
 	.driver_info = NO_UNION_NORMAL, /* has no union descriptor */
diff --git a/drivers/usb/class/cdc-acm.h b/drivers/usb/class/cdc-acm.h
index ca06b20..515aad0 100644
--- a/drivers/usb/class/cdc-acm.h
+++ b/drivers/usb/class/cdc-acm.h
@@ -140,3 +140,4 @@  struct acm {
 #define QUIRK_CONTROL_LINE_STATE	BIT(6)
 #define CLEAR_HALT_CONDITIONS		BIT(7)
 #define SEND_ZERO_PACKET		BIT(8)
+#define DISABLE_ECHO			BIT(9)