Message ID | 20200922112126.16919-1-oneukum@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] fixes for hangs and error reporting in CDC_WDM | expand |
On Tue, Sep 22, 2020 at 01:21:19PM +0200, Oliver Neukum wrote: > Stress testing has shown that CDC-WDM has some issues with hangs > and error reporting > > 1. wakeups are not correctly handled in multhreaded environments > 2. unresponsive hardware is not handled > 3. errors are not correctly reported. This needs flush() to be > implemented. > > This version makes wdm_flush() use interruptible sleep. > > For easier review all squashed together: > I have like 3 or 4 different "RFC" series here from you for this driver, which one is the "newest"? And can you send a series that isn't RFC so that I can know you feel it is good enough to be merged? thanks, greg k-h
On 2020/09/26 0:11, Greg KH wrote: > On Tue, Sep 22, 2020 at 01:21:19PM +0200, Oliver Neukum wrote: >> Stress testing has shown that CDC-WDM has some issues with hangs >> and error reporting >> >> 1. wakeups are not correctly handled in multhreaded environments >> 2. unresponsive hardware is not handled >> 3. errors are not correctly reported. This needs flush() to be >> implemented. >> >> This version makes wdm_flush() use interruptible sleep. >> >> For easier review all squashed together: >> > > I have like 3 or 4 different "RFC" series here from you for this driver, > which one is the "newest"? https://lkml.kernel.org/r/20200923092136.14824-1-oneukum@suse.com is the newest series from Oliver. But https://lkml.kernel.org/r/b27841ab-a88c-13e2-a66f-6df7af1f46b4@i-love.sakura.ne.jp is the squashed version with updated comments and deduplicated code. > > And can you send a series that isn't RFC so that I can know you feel it > is good enough to be merged? Do you want this fix as a series of patches (the former link)? Since I think that the changeset should be atomically applied, I posted the latter link.
On Sat, Sep 26, 2020 at 12:20:57AM +0900, Tetsuo Handa wrote: > On 2020/09/26 0:11, Greg KH wrote: > > On Tue, Sep 22, 2020 at 01:21:19PM +0200, Oliver Neukum wrote: > >> Stress testing has shown that CDC-WDM has some issues with hangs > >> and error reporting > >> > >> 1. wakeups are not correctly handled in multhreaded environments > >> 2. unresponsive hardware is not handled > >> 3. errors are not correctly reported. This needs flush() to be > >> implemented. > >> > >> This version makes wdm_flush() use interruptible sleep. > >> > >> For easier review all squashed together: > >> > > > > I have like 3 or 4 different "RFC" series here from you for this driver, > > which one is the "newest"? > > https://lkml.kernel.org/r/20200923092136.14824-1-oneukum@suse.com > > is the newest series from Oliver. But > > https://lkml.kernel.org/r/b27841ab-a88c-13e2-a66f-6df7af1f46b4@i-love.sakura.ne.jp > > is the squashed version with updated comments and deduplicated code. > > > > > And can you send a series that isn't RFC so that I can know you feel it > > is good enough to be merged? > > Do you want this fix as a series of patches (the former link)? > Since I think that the changeset should be atomically applied, I posted the latter link. A single patch would be good to send to me again, burried at the end of a long thread is hard to dig out. Also with proper authorship is needed, did Oliver write this, or did you? There is the co-developed-by: tag, which looks like it might be relevant here, can you do that? thanks, greg k-h
On 2020/09/26 0:28, Greg KH wrote: >> Do you want this fix as a series of patches (the former link)? >> Since I think that the changeset should be atomically applied, I posted the latter link. > > A single patch would be good to send to me again, burried at the end of > a long thread is hard to dig out. > > Also with proper authorship is needed, did Oliver write this, or did > you? Oliver wrote the majority part. I just squashed the series and updated comments and deduplicated the code. Thus, I think main authorship should be given to Oliver. > > There is the co-developed-by: tag, which looks like it might be relevant > here, can you do that? If you prefer the co-developed-by: tag, you can add: Co-developed-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Oliver, any comments?
On Sat, Sep 26, 2020 at 12:34:24AM +0900, Tetsuo Handa wrote: > On 2020/09/26 0:28, Greg KH wrote: > >> Do you want this fix as a series of patches (the former link)? > >> Since I think that the changeset should be atomically applied, I posted the latter link. > > > > A single patch would be good to send to me again, burried at the end of > > a long thread is hard to dig out. > > > > Also with proper authorship is needed, did Oliver write this, or did > > you? > > Oliver wrote the majority part. I just squashed the series and updated comments > and deduplicated the code. Thus, I think main authorship should be given to Oliver. > > > > > There is the co-developed-by: tag, which looks like it might be relevant > > here, can you do that? > > If you prefer the co-developed-by: tag, you can add: > > Co-developed-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> That seems reasonable, then put Oliver's address on the first line with a "From:" line to make this all work correctly when I apply it. thanks, greg k-h
Am Samstag, den 26.09.2020, 07:40 +0200 schrieb Greg KH: > On Sat, Sep 26, 2020 at 12:34:24AM +0900, Tetsuo Handa wrote: > > On 2020/09/26 0:28, Greg KH wrote: > > > > Do you want this fix as a series of patches (the former link)? > > > > Since I think that the changeset should be atomically applied, I posted the latter link. > > > > > > A single patch would be good to send to me again, burried at the end of > > > a long thread is hard to dig out. > > > > > > Also with proper authorship is needed, did Oliver write this, or did > > > you? > > > > Oliver wrote the majority part. I just squashed the series and updated comments > > and deduplicated the code. Thus, I think main authorship should be given to Oliver. > > > > > There is the co-developed-by: tag, which looks like it might be relevant > > > here, can you do that? > > > > If you prefer the co-developed-by: tag, you can add: > > > > Co-developed-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > > That seems reasonable, then put Oliver's address on the first line with > a "From:" line to make this all work correctly when I apply it. Hi, I did sent out the series with a PATCH label last week. Should I resend? Regards Oliver
On 2020/09/29 17:46, Oliver Neukum wrote: >> That seems reasonable, then put Oliver's address on the first line with >> a "From:" line to make this all work correctly when I apply it. > > Hi, > > I did sent out the series with a PATCH label last week. > Should I resend? > No need to resend. I already sent a squashed version as https://lkml.kernel.org/r/20200928141755.3476-1-penguin-kernel@I-love.SAKURA.ne.jp .
--- a/drivers/usb/class/cdc-wdm.c +++ b/drivers/usb/class/cdc-wdm.c @@ -58,6 +58,9 @@ MODULE_DEVICE_TABLE (usb, wdm_ids); #define WDM_MAX 16 +/* flush() is uninterruptible, but we cannot wait forever */ +#define WDM_FLUSH_TIMEOUT (30 * HZ) + /* CDC-WMC r1.1 requires wMaxCommand to be "at least 256 decimal (0x100)" */ #define WDM_DEFAULT_BUFSIZE 256 @@ -151,7 +154,7 @@ static void wdm_out_callback(struct urb *urb) kfree(desc->outbuf); desc->outbuf = NULL; clear_bit(WDM_IN_USE, &desc->flags); - wake_up(&desc->wait); + wake_up_all(&desc->wait); } static void wdm_in_callback(struct urb *urb) @@ -393,6 +396,9 @@ static ssize_t wdm_write if (test_bit(WDM_RESETTING, &desc->flags)) r = -EIO; + if (test_bit(WDM_DISCONNECTING, &desc->flags)) + r = -ENODEV; + if (r < 0) { rv = r; goto out_free_mem_pm; @@ -424,6 +430,7 @@ static ssize_t wdm_write if (rv < 0) { desc->outbuf = NULL; clear_bit(WDM_IN_USE, &desc->flags); + wake_up_all(&desc->wait); /* for flush() */ dev_err(&desc->intf->dev, "Tx URB error: %d\n", rv); rv = usb_translate_errors(rv); goto out_free_mem_pm; @@ -583,11 +590,39 @@ static ssize_t wdm_read return rv; } +/* + * The difference to flush is that we wait forever. If you don't like + * that behavior, you need to send a signal. + */ + +static int wdm_fsync(struct file *file, loff_t start, loff_t end, int datasync) +{ + struct wdm_device *desc = file->private_data; + int rv; + + rv = wait_event_interruptible(desc->wait, + !test_bit(WDM_IN_USE, &desc->flags) || + test_bit(WDM_DISCONNECTING, &desc->flags)); + + if (test_bit(WDM_DISCONNECTING, &desc->flags)) + return -ENODEV; + if (rv < 0) + return -EINTR; + + spin_lock_irq(&desc->iuspin); + rv = desc->werr; + desc->werr = 0; + spin_unlock_irq(&desc->iuspin); + + return usb_translate_errors(rv); +} + static int wdm_flush(struct file *file, fl_owner_t id) { struct wdm_device *desc = file->private_data; + int rv; - wait_event(desc->wait, + rv = wait_event_interruptible_timeout(desc->wait, /* * needs both flags. We cannot do with one * because resetting it would cause a race @@ -595,16 +630,27 @@ static int wdm_flush(struct file *file, fl_owner_t id) * a disconnect */ !test_bit(WDM_IN_USE, &desc->flags) || - test_bit(WDM_DISCONNECTING, &desc->flags)); + test_bit(WDM_DISCONNECTING, &desc->flags), + WDM_FLUSH_TIMEOUT); - /* cannot dereference desc->intf if WDM_DISCONNECTING */ + /* + * to report the correct error. + * This is best effort + * We are inevitably racing with the hardware. + */ if (test_bit(WDM_DISCONNECTING, &desc->flags)) return -ENODEV; - if (desc->werr < 0) - dev_err(&desc->intf->dev, "Error in flush path: %d\n", - desc->werr); + if (!rv) + return -EIO; + if (rv < 0) + return -EINTR; + + spin_lock_irq(&desc->iuspin); + rv = desc->werr; + desc->werr = 0; + spin_unlock_irq(&desc->iuspin); - return usb_translate_errors(desc->werr); + return usb_translate_errors(rv); } static __poll_t wdm_poll(struct file *file, struct poll_table_struct *wait) @@ -729,6 +775,7 @@ static const struct file_operations wdm_fops = { .owner = THIS_MODULE, .read = wdm_read, .write = wdm_write, + .fsync = wdm_fsync, .open = wdm_open, .flush = wdm_flush, .release = wdm_release,