Message ID | 20221220055954.11197-1-zhouruihai@huaqin.corp-partner.google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | platform/chrome: cros_ec_typec: deferred probe when typec count mismatch | expand |
On Mon, Dec 19, 2022 at 10:00 PM Ruihai Zhou <zhouruihai@huaqin.corp-partner.google.com> wrote: > > The kernel bootup is much faster with normal mode. In this case, > there is a chance that the cros-ec-typec module get the actual typec > port counts but not accurate from ec before ec is able to setup it. > It will block the HDMI mux function. > Hence, return -EPROBE_DEFER to put the device onto the deferred probe > list when the typec count mismatch between ec and node. > > Signed-off-by: Ruihai Zhou <zhouruihai@huaqin.corp-partner.google.com> > --- > drivers/platform/chrome/cros_ec_typec.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/platform/chrome/cros_ec_typec.c b/drivers/platform/chrome/cros_ec_typec.c > index 59de4ce01fab..d821501e875c 100644 > --- a/drivers/platform/chrome/cros_ec_typec.c > +++ b/drivers/platform/chrome/cros_ec_typec.c > @@ -382,7 +382,7 @@ static int cros_typec_init_ports(struct cros_typec_data *typec) > > if (nports > typec->num_ports) { > dev_err(dev, "More ports listed than can be supported.\n"); > - return -EINVAL; > + return -EPROBE_DEFER; I think that is problematic. It might as well be that nports > EC_USB_PD_MAX_PORTS. Is this really seen in the field ? The EC should never report a wrong (random) number of ports. If it is not ready, there should be _some_ indication that it isn't ready. Does it really report a more or less random number in this case ? Guenter > } > > /* DT uses "reg" to specify port number. */ > -- > 2.17.1 >
> I think that is problematic. It might as well be that nports > > EC_USB_PD_MAX_PORTS. Yes, you're right. so we should consider it's a invalid argument and return -EINVAL if nports > EC_USB_PD_MAX_PORTS. right? > Is this really seen in the field ? The EC should never report a wrong > (random) number of ports. If it is not ready, there should be _some_ > indication that it isn't ready. Does it really report a more or less > random number in this case ? Yes, I saw this on corsola boards. The EC report a wrong(not random) number. because corsola emulates HDMI MUX over the current type-c mux stack. The ec has to fake a type-c port to pass the MUX info. But the task are not initiated on starting up, and increase the type-c port counts after the tasks finished. In this case, I saw the typec->num_ports = 1, but the nports = 2, which will be probe failed and block the HDMI MUX function. I will send v2 patch, if nports > EC_USB_PD_MAX_PORTS, then return -EINVAL, but if nports > typec->num_ports, we consider wait a second to ec task increase the type-c port counts if there're HDMI DB attach, then return -EPROBE_DEFER.
On Tue, Dec 20, 2022 at 6:10 PM Ruihai Zhou <zhouruihai@huaqin.corp-partner.google.com> wrote: >> >> I think that is problematic. It might as well be that nports > >> EC_USB_PD_MAX_PORTS. >> > Yes, you're right. so we should consider it's a invalid argument and return -EINVAL if nports > EC_USB_PD_MAX_PORTS. right? Why ? While this would be a bug, it should hopefully be found early in development and never hit the field. I don't see a reason for changing the code. >> >> Is this really seen in the field ? The EC should never report a wrong >> (random) number of ports. If it is not ready, there should be _some_ >> indication that it isn't ready. Does it really report a more or less >> random number in this case ? > > Yes, I saw this on corsola board. The EC report a wrong number(not random), because corsola emulates HDMI MUX over the current > type-c mux stack. The ec has to fake a type-c port to pass the MUX info. But the task are not initiated on starting up, and increase the > type-c port counts after the tasks finished. In this case, I saw the typec->num_ports = 1, but the nports = 2, which will be probe failed and > block the HDMI mux function. > > I will send v2 patch, if nports > EC_USB_PD_MAX_PORTS, then return -EINVAL, but if nports > typec->num_ports, we consider wait a second > to ec task increase the type-c port counts if there're a HDMI DB attach, then return -EPROBE_DEFER The current code just reduces nports if it is larger than EC_USB_PD_MAX_PORTS. Again, I don't see a reason to change that. Anyway, I am not sure if the above will work reliably. I am not sure what "HDMI DB" refers to, but if it is an external connector its status could change anytime. In that situation, no amount of waiting would help. Either case, the EC on corsola should really not change the number of ports it supports. Either it is a constant that should not change, or it is dynamic and the EC would need to tell the host if the number of ports changes (up or down). Trying to fix this in cros_ec_typec without well defined protocol exchange with the EC is at best a kludge, but not a real solution. Thanks, Guenter
>>> Is this really seen in the field ? The EC should never report a wrong >>> (random) number of ports. If it is not ready, there should be _some_ >>> indication that it isn't ready. Does it really report a more or less >>> random number in this case ? >> >> Yes, I saw this on corsola board. The EC report a wrong number(not random), because corsola emulates HDMI MUX over the current >> type-c mux stack. The ec has to fake a type-c port to pass the MUX info. But the task are not initiated on starting up, and increase the >> type-c port counts after the tasks finished. In this case, I saw the typec->num_ports = 1, but the nports = 2, which will be probe failed and >> block the HDMI mux function. >> >> I will send v2 patch, if nports > EC_USB_PD_MAX_PORTS, then return -EINVAL, but if nports > typec->num_ports, we consider wait a second >> to ec task increase the type-c port counts if there're a HDMI DB attach, then return -EPROBE_DEFER > > The current code just reduces nports if it is larger than > EC_USB_PD_MAX_PORTS. Again, I don't see a reason to change that. > > Anyway, I am not sure if the above will work reliably. I am not sure > what "HDMI DB" refers to, but if it is an external connector its > status could change anytime. In that situation, no amount of waiting > would help. Either case, the EC on corsola should really not change > the number of ports it supports. Either it is a constant that should > not change, or it is dynamic and the EC would need to tell the host if > the number of ports changes (up or down). Trying to fix this in > cros_ec_typec without well defined protocol exchange with the EC is at > best a kludge, but not a real solution. Thank you for the reply. The "HDMI DB" refer to the daughter board attached to the mainboard, which don't change anytime. In fact, the corsola EC increase the port count dynamically with some delay (no see a standard yet) during bootup. There is the EC code to increase the port count [1]. If the cros-ec-typec get the type-c port counts from EC before the EC increase the port counts, then will probe failed if return -EINVAL. I think the cros-ec-typec is replying on the fragile delay in EC, cros-ec-typec module will not get the fake(HDMI) type-c port counts once the EC deferred call is later than the driver probe. That is why I make this change. For the verification, the driver always probe failed when EC reboot without the patch. But the driver probe passed with the patch after a few times deferred probe. Link: [1] https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/ec/zephyr/program/corsola/src/usbc.c;l=47
On Tue, Dec 20, 2022 at 11:01 PM Ruihai Zhou <zhouruihai@huaqin.corp-partner.google.com> wrote: > > >>> Is this really seen in the field ? The EC should never report a wrong > >>> (random) number of ports. If it is not ready, there should be _some_ > >>> indication that it isn't ready. Does it really report a more or less > >>> random number in this case ? > >> > >> Yes, I saw this on corsola board. The EC report a wrong number(not random), because corsola emulates HDMI MUX over the current > >> type-c mux stack. The ec has to fake a type-c port to pass the MUX info. But the task are not initiated on starting up, and increase the > >> type-c port counts after the tasks finished. In this case, I saw the typec->num_ports = 1, but the nports = 2, which will be probe failed and > >> block the HDMI mux function. > >> > >> I will send v2 patch, if nports > EC_USB_PD_MAX_PORTS, then return -EINVAL, but if nports > typec->num_ports, we consider wait a second > >> to ec task increase the type-c port counts if there're a HDMI DB attach, then return -EPROBE_DEFER > > > > The current code just reduces nports if it is larger than > > EC_USB_PD_MAX_PORTS. Again, I don't see a reason to change that. > > > > Anyway, I am not sure if the above will work reliably. I am not sure > > what "HDMI DB" refers to, but if it is an external connector its > > status could change anytime. In that situation, no amount of waiting > > would help. Either case, the EC on corsola should really not change > > the number of ports it supports. Either it is a constant that should > > not change, or it is dynamic and the EC would need to tell the host if > > the number of ports changes (up or down). Trying to fix this in > > cros_ec_typec without well defined protocol exchange with the EC is at > > best a kludge, but not a real solution. > > Thank you for the reply. The "HDMI DB" refer to the daughter board > attached to the mainboard, which don't change anytime. In fact, > the corsola EC increase the port count dynamically with some delay > (no see a standard yet) during bootup. There is the EC code to > increase the port count [1]. If the cros-ec-typec get the type-c > port counts from EC before the EC increase the port counts, then > will probe failed if return -EINVAL. I think the cros-ec-typec > is replying on the fragile delay in EC, cros-ec-typec module will > not get the fake(HDMI) type-c port counts once the EC deferred > call is later than the driver probe. That is why I make this > change. For the verification, the driver always probe failed > when EC reboot without the patch. But the driver probe passed with > the patch after a few times deferred probe. Sorry, but I agree with Guenter; I don't think this is justification to add a hack for 1 board in this driver. In what situation does the EC task take so long after reboot that it occurs after a module probe (which occurs shortly after system boot up)? Why is the "corsola" board increasing the Type-C port count dynamically after some delay, and what is that delay amount? Why can't this be set via CBI/DT or its equivalent on your ARM platform? This seems like fragile code in the EC and should be fixed there. The number of Type-C ports reported by the EC really should be invariant across the boot lifecycle. I don't think this patch can be accepted. Frankly, I'm concerned that doing this might be introducing some subtle bugs in the EC Type-C code too (I'm certain there are assumptions about port count being static there too). If you really need to change the number of ports dynamically in the EC, add a board specific-hook to not respond to the EC_CMD_USB_PD_PORTS host command until you have "set" the number of ports. Best regards, - Prashant
> Sorry, but I agree with Guenter; I don't think this is justification to add a > hack for 1 board in this driver. > > In what situation does the EC task take so long after reboot that it > occurs after a module probe (which occurs shortly after system boot up)? > > Why is the "corsola" board increasing the Type-C port count dynamically > after some delay, and what is that delay amount? Why can't this be set > via CBI/DT or its equivalent on your ARM platform? > > This seems like fragile code in the EC and should be fixed there. The > number of Type-C ports reported by the EC really should be invariant > across the boot lifecycle. I don't think this patch can be accepted. > Alright, I got it. Thank you for your patience and the suggestions. Because the tcpc/pd tasks are not needed for the HDMI port. The EC don't want the tasks initiated on starting up, and increase the counts after the tasks are launched. The delay amount was 2 second before, (always failed when ec reboot on firmware normal mode) and change to 1 second later, and 500ms yesterday(not reproduce until now). The delay on EC can be referred to here [1]. However, with the current design, the number of type-c ports reported by EC isn't invariant across the boot lifecycle. Also, maybe printing the counts getting from EC when encountered this field will be a littel helpful for debugging. just an insignificant thought. > If you really need to change the number of ports dynamically in the EC, > add a board specific-hook to not respond to the EC_CMD_USB_PD_PORTS > host command until you have "set" the number of ports. Got it. Thank you again for your suggestions. Link: [1] https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/ec/zephyr/program/corsola/src/usbc.c;l=222
diff --git a/drivers/platform/chrome/cros_ec_typec.c b/drivers/platform/chrome/cros_ec_typec.c index 59de4ce01fab..d821501e875c 100644 --- a/drivers/platform/chrome/cros_ec_typec.c +++ b/drivers/platform/chrome/cros_ec_typec.c @@ -382,7 +382,7 @@ static int cros_typec_init_ports(struct cros_typec_data *typec) if (nports > typec->num_ports) { dev_err(dev, "More ports listed than can be supported.\n"); - return -EINVAL; + return -EPROBE_DEFER; } /* DT uses "reg" to specify port number. */
The kernel bootup is much faster with normal mode. In this case, there is a chance that the cros-ec-typec module get the actual typec port counts but not accurate from ec before ec is able to setup it. It will block the HDMI mux function. Hence, return -EPROBE_DEFER to put the device onto the deferred probe list when the typec count mismatch between ec and node. Signed-off-by: Ruihai Zhou <zhouruihai@huaqin.corp-partner.google.com> --- drivers/platform/chrome/cros_ec_typec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)