Message ID | 20200616131743.4793-3-mark@xwax.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/3] echoaudio: Race conditions around "opencount" | expand |
On Tue, 16 Jun 2020 15:17:43 +0200, Mark Hills wrote: > > +/* Update software pointer to match the hardware > + * > + * Return: 1 if we crossed a period threshold, otherwise 0 > + */ > +static int snd_echo_poll_substream(struct snd_pcm_substream *substream) This is a bit confusing name. One would imagine from the function name as if it were about the handling of poll() syscall. > +{ > + struct snd_pcm_runtime *runtime = substream->runtime; > + struct audiopipe *pipe = runtime->private_data; > + unsigned counter, step, period, last_period; > + size_t buffer_bytes; > + > + if (pipe->state != PIPE_STATE_STARTED) > + return 0; > + > + counter = le32_to_cpu(*pipe->dma_counter); > + > + period = bytes_to_frames(runtime, counter) > + / runtime->period_size; > + last_period = bytes_to_frames(runtime, pipe->last_counter) > + / runtime->period_size; > + > + if (period == last_period) > + return 0; /* don't do any work yet */ > + > + step = counter - pipe->last_counter; > + pipe->last_counter = counter; Dose this calculation consider the overlap of 32bit integer of the counter? (I'm not sure whether the old code did it right, though.) thanks, Takashi
On Tue, 16 Jun 2020, Takashi Iwai wrote: > On Tue, 16 Jun 2020 15:17:43 +0200, > Mark Hills wrote: > > > > +/* Update software pointer to match the hardware > > + * > > + * Return: 1 if we crossed a period threshold, otherwise 0 > > + */ > > +static int snd_echo_poll_substream(struct snd_pcm_substream *substream) > > This is a bit confusing name. One would imagine from the function > name as if it were about the handling of poll() syscall. Poll felt intuitive to me; maybe from FreeBSD where network drivers can poll on a timer instead of interrupts. I do know about poll(), though. How about snd_echo_update_substream()? > > +{ > > + struct snd_pcm_runtime *runtime = substream->runtime; > > + struct audiopipe *pipe = runtime->private_data; > > + unsigned counter, step, period, last_period; > > + size_t buffer_bytes; > > + > > + if (pipe->state != PIPE_STATE_STARTED) > > + return 0; > > + > > + counter = le32_to_cpu(*pipe->dma_counter); > > + > > + period = bytes_to_frames(runtime, counter) > > + / runtime->period_size; > > + last_period = bytes_to_frames(runtime, pipe->last_counter) > > + / runtime->period_size; > > + > > + if (period == last_period) > > + return 0; /* don't do any work yet */ > > + > > + step = counter - pipe->last_counter; > > + pipe->last_counter = counter; > > Dose this calculation consider the overlap of 32bit integer of the > counter? (I'm not sure whether the old code did it right, though.) I believe it does, since (small - big) as unsigned gives small value. And period is checked only for equality (not greater than). I'll add a comment as such. I have run it with long streams. Would it be clearer, or should it be that buffer_bytes to "unsigned"? Though size_t conveys that it is a byte length, in memory. In general I haven't deviated from existing code without need to, so I inherited these types. The same goes for the pattern of calculating "step" with unsigned values, and then using a loop to wrap it to the buffer: while (pipe->position >= buffer_bytes) pipe->position -= buffer_bytes; I've assumed this was a recognised pattern in ALSA code, preferred over modulus.
On Tue, 16 Jun 2020 16:01:11 +0200, Mark Hills wrote: > > On Tue, 16 Jun 2020, Takashi Iwai wrote: > > > On Tue, 16 Jun 2020 15:17:43 +0200, > > Mark Hills wrote: > > > > > > +/* Update software pointer to match the hardware > > > + * > > > + * Return: 1 if we crossed a period threshold, otherwise 0 > > > + */ > > > +static int snd_echo_poll_substream(struct snd_pcm_substream *substream) > > > > This is a bit confusing name. One would imagine from the function > > name as if it were about the handling of poll() syscall. > > Poll felt intuitive to me; maybe from FreeBSD where network drivers can > poll on a timer instead of interrupts. I do know about poll(), though. > > How about snd_echo_update_substream()? Yes, it's more self-explanatory. Or even better snd_echo_update_substream_position() :) > > > +{ > > > + struct snd_pcm_runtime *runtime = substream->runtime; > > > + struct audiopipe *pipe = runtime->private_data; > > > + unsigned counter, step, period, last_period; > > > + size_t buffer_bytes; > > > + > > > + if (pipe->state != PIPE_STATE_STARTED) > > > + return 0; > > > + > > > + counter = le32_to_cpu(*pipe->dma_counter); > > > + > > > + period = bytes_to_frames(runtime, counter) > > > + / runtime->period_size; > > > + last_period = bytes_to_frames(runtime, pipe->last_counter) > > > + / runtime->period_size; > > > + > > > + if (period == last_period) > > > + return 0; /* don't do any work yet */ > > > + > > > + step = counter - pipe->last_counter; > > > + pipe->last_counter = counter; > > > > Dose this calculation consider the overlap of 32bit integer of the > > counter? (I'm not sure whether the old code did it right, though.) > > I believe it does, since (small - big) as unsigned gives small value. And > period is checked only for equality (not greater than). I'll add a comment > as such. I have run it with long streams. The problem is that the last_period calculation can be wrong if the period size isn't aligned. e.g. when period size is 44100, around the boundary position, it gets a different last_period value although it still in the same period. thanks, Takashi
On Tue, 16 Jun 2020 14:17:43 +0100 Mark Hills <mark@xwax.org> wrote: > Distorted audio appears occasionally, affecting either playback > or capture and requiring the affected substream to be closed by > all applications and re-opened. Yes, it also happens here very rarely (one time every some months) and I failed to reproduce the problem. > The best way I have found to reproduce the bug is to use dmix in > combination with Chromium, which opens the audio device multiple > times in threads. Anecdotally, the problems appear to have increased > with faster CPUs. > > Since applying this patch I have not had problems, where previously > they would occur several times a day. > > This patch addresses the following issues: > > * Check for progress using the counter from the hardware, not after > it has been truncated to the buffer. > > There appears to be a possible bug if a whole ringbuffer advances > between interrupts, it goes unnoticed. In that case the stream must be restarted anyway due to xrun. > * Remove chip->last_period: > > It's trivial to derive from pipe->last_counter, and inside pipe > is where it more logically belongs. This has less scope for bugs > as it is not wrapped to the buffer length. Ok. > * Remove the accessing of pipe->dma_counter twice in the interrupt > handler: Why twice? > +static int snd_echo_poll_substream(struct snd_pcm_substream *substream) > [...] > static irqreturn_t snd_echo_interrupt(int irq, void *dev_id) > [...] Looks fine to me.
On Tue, 16 Jun 2020 14:17:43 +0100 Mark Hills <mark@xwax.org> wrote: > + counter = le32_to_cpu(*pipe->dma_counter); > + > + period = bytes_to_frames(runtime, counter) > + / runtime->period_size; > + last_period = bytes_to_frames(runtime, pipe->last_counter) > + / runtime->period_size; > + > + if (period == last_period) > + return 0; /* don't do any work yet */ > + > + step = counter - pipe->last_counter; Uhmm.. no, when the dma_counter wraps around the comparison gives wrong result, unless 4G is divisible by the buffer size. Please consider this patch (to apply after yours). It computes the new period using pipe->position before and after adding the amount of bytes tranferred since the previous period. Briefly tested - It seems ok. Signed-off-by: Giuliano Pochini <pochini@shiny.it> --- sound/pci/echoaudio/echoaudio.c__orig 2020-06-16 23:36:02.818601055 +0200 +++ sound/pci/echoaudio/echoaudio.c 2020-06-16 23:52:46.593325066 +0200 @@ -1781,7 +1781,7 @@ { struct snd_pcm_runtime *runtime = substream->runtime; struct audiopipe *pipe = runtime->private_data; - unsigned counter, step, period, last_period; + u32 counter, step, period, last_period, pipe_position; size_t buffer_bytes; if (pipe->state != PIPE_STATE_STARTED) @@ -1789,24 +1789,22 @@ counter = le32_to_cpu(*pipe->dma_counter); - period = bytes_to_frames(runtime, counter) + step = counter - pipe->last_counter; + pipe_position = pipe->position + step; /* bytes */ + buffer_bytes = frames_to_bytes(runtime, runtime->buffer_size); + while (pipe_position >= buffer_bytes) + pipe_position -= buffer_bytes; + + period = bytes_to_frames(runtime, pipe_position) / runtime->period_size; - last_period = bytes_to_frames(runtime, pipe->last_counter) + last_period = bytes_to_frames(runtime, pipe->position) / runtime->period_size; if (period == last_period) return 0; /* don't do any work yet */ - step = counter - pipe->last_counter; + pipe->position = pipe_position; pipe->last_counter = counter; - - pipe->position += step; /* bytes */ - - buffer_bytes = frames_to_bytes(runtime, runtime->buffer_size); - - while (pipe->position >= buffer_bytes) - pipe->position -= buffer_bytes; - return 1; }
On Tue, 16 Jun 2020, Takashi Iwai wrote: > On Tue, 16 Jun 2020 16:01:11 +0200, > Mark Hills wrote: > > > > On Tue, 16 Jun 2020, Takashi Iwai wrote: > > > > > On Tue, 16 Jun 2020 15:17:43 +0200, > > > Mark Hills wrote: > > > > > > > > +/* Update software pointer to match the hardware > > > > + * > > > > + * Return: 1 if we crossed a period threshold, otherwise 0 > > > > + */ > > > > +static int snd_echo_poll_substream(struct snd_pcm_substream *substream) > > > > > > This is a bit confusing name. One would imagine from the function > > > name as if it were about the handling of poll() syscall. > > > > Poll felt intuitive to me; maybe from FreeBSD where network drivers can > > poll on a timer instead of interrupts. I do know about poll(), though. > > > > How about snd_echo_update_substream()? > > Yes, it's more self-explanatory. Or even better > snd_echo_update_substream_position() :) Out of interest, these are static but the names are globally qualified. I see this elsewhere, so I followed, but is this agreed convention? Because it could be update_substream_position() :) > > > > +{ > > > > + struct snd_pcm_runtime *runtime = substream->runtime; > > > > + struct audiopipe *pipe = runtime->private_data; > > > > + unsigned counter, step, period, last_period; > > > > + size_t buffer_bytes; > > > > + > > > > + if (pipe->state != PIPE_STATE_STARTED) > > > > + return 0; > > > > + > > > > + counter = le32_to_cpu(*pipe->dma_counter); > > > > + > > > > + period = bytes_to_frames(runtime, counter) > > > > + / runtime->period_size; > > > > + last_period = bytes_to_frames(runtime, pipe->last_counter) > > > > + / runtime->period_size; > > > > + > > > > + if (period == last_period) > > > > + return 0; /* don't do any work yet */ > > > > + > > > > + step = counter - pipe->last_counter; > > > > + pipe->last_counter = counter; > > > > > > Dose this calculation consider the overlap of 32bit integer of the > > > counter? (I'm not sure whether the old code did it right, though.) > > > > I believe it does, since (small - big) as unsigned gives small value. And > > period is checked only for equality (not greater than). I'll add a comment > > as such. I have run it with long streams. > > The problem is that the last_period calculation can be wrong if the > period size isn't aligned. e.g. when period size is 44100, around the > boundary position, it gets a different last_period value although it > still in the same period. Agreed, yes. In which case I think it's clearer to not infer anything about periods from the current counter or position, and (effectively) accumulate it. Would you please make suggestions on the code below? Because it allowed for further simplification whilst fixing the bugs. In the end, modulo became clearer than loops and I think it has the bonus of being less risky should the counter make a large step. I'll run for a longer test today. --- /* Update software pointer to match the hardware * * \pre chip->lock is held */ static void snd_echo_update_substream_position(struct echoaudio *chip, struct snd_pcm_substream *substream) { struct snd_pcm_runtime *runtime = substream->runtime; struct audiopipe *pipe = runtime->private_data; u32 counter, step; size_t period_bytes; if (pipe->state != PIPE_STATE_STARTED) return; period_bytes = frames_to_bytes(runtime, runtime->period_size); counter = le32_to_cpu(*pipe->dma_counter); step = counter - pipe->last_counter; /* handles wrapping of counter */ step -= step % period_bytes; /* acknowledge whole periods only */ if (step == 0) return; /* haven't advanced a whole period yet */ pipe->last_counter += step; /* does not always wrap on a period */ pipe->position += step; /* wraparound the buffer */ pipe->position %= frames_to_bytes(runtime, runtime->buffer_size); /* notify only once, even if multiple periods elapsed */ spin_unlock(&chip->lock); snd_pcm_period_elapsed(substream); spin_lock(&chip->lock); } static irqreturn_t snd_echo_interrupt(int irq, void *dev_id) { struct echoaudio *chip = dev_id; int ss, st; spin_lock(&chip->lock); st = service_irq(chip); if (st < 0) { spin_unlock(&chip->lock); return IRQ_NONE; } /* The hardware doesn't tell us which substream caused the irq, thus we have to check all running substreams. */ for (ss = 0; ss < DSP_MAXPIPES; ss++) { struct snd_pcm_substream *substream; substream = chip->substream[ss]; if (substream) snd_echo_update_substream_position(chip, substream); } spin_unlock(&chip->lock); #ifdef ECHOCARD_HAS_MIDI if (st > 0 && chip->midi_in) { snd_rawmidi_receive(chip->midi_in, chip->midi_buffer, st); dev_dbg(chip->card->dev, "rawmidi_iread=%d\n", st); } #endif return IRQ_HANDLED; } static snd_pcm_uframes_t pcm_pointer(struct snd_pcm_substream *substream) { struct snd_pcm_runtime *runtime = substream->runtime; struct audiopipe *pipe = runtime->private_data; return bytes_to_frames(runtime, pipe->position); }
On Tue, 16 Jun 2020, Giuliano Pochini wrote: > On Tue, 16 Jun 2020 14:17:43 +0100 > Mark Hills <mark@xwax.org> wrote: > > > Distorted audio appears occasionally, affecting either playback > > or capture and requiring the affected substream to be closed by > > all applications and re-opened. > > Yes, it also happens here very rarely (one time every some months) and I > failed to reproduce the problem. It frustrated me for years, but I had other work I needed to do. Eventually I was working around it several times a day :-/ [...] > > * Check for progress using the counter from the hardware, not after > > it has been truncated to the buffer. > > > > There appears to be a possible bug if a whole ringbuffer advances > > between interrupts, it goes unnoticed. > > In that case the stream must be restarted anyway due to xrun. Yes, but I think it will go unnoticed, so you don't know to re-start. [...] > > * Remove the accessing of pipe->dma_counter twice in the interrupt > > handler: > > Why twice? Yes, the interrupt handler calls pcm_pointer() directly to make a decision. Then it calls snd_pcm_period_elapsed(), which itself goes on to make another call to pcm_pointer(), by this time the bus mastering of the interface has adjusted the counter. The new code processes the counter only once, and this is easier to rationalise whether there are bugs.
On Wed, 17 Jun 2020, Giuliano Pochini wrote: > On Tue, 16 Jun 2020 14:17:43 +0100 > Mark Hills <mark@xwax.org> wrote: > > > + counter = le32_to_cpu(*pipe->dma_counter); > > + > > + period = bytes_to_frames(runtime, counter) > > + / runtime->period_size; > > + last_period = bytes_to_frames(runtime, pipe->last_counter) > > + / runtime->period_size; > > + > > + if (period == last_period) > > + return 0; /* don't do any work yet */ > > + > > + step = counter - pipe->last_counter; > > Uhmm.. no, when the dma_counter wraps around the comparison gives wrong > result, unless 4G is divisible by the buffer size. Agree. > Please consider this patch (to apply after yours). It computes the new > period using pipe->position before and after adding the amount of bytes > tranferred since the previous period. Briefly tested - It seems ok. > > > Signed-off-by: Giuliano Pochini <pochini@shiny.it> > > --- sound/pci/echoaudio/echoaudio.c__orig 2020-06-16 23:36:02.818601055 +0200 > +++ sound/pci/echoaudio/echoaudio.c 2020-06-16 23:52:46.593325066 +0200 > @@ -1781,7 +1781,7 @@ > { > struct snd_pcm_runtime *runtime = substream->runtime; > struct audiopipe *pipe = runtime->private_data; > - unsigned counter, step, period, last_period; > + u32 counter, step, period, last_period, pipe_position; > size_t buffer_bytes; > > if (pipe->state != PIPE_STATE_STARTED) > @@ -1789,24 +1789,22 @@ > > counter = le32_to_cpu(*pipe->dma_counter); > > - period = bytes_to_frames(runtime, counter) > + step = counter - pipe->last_counter; > + pipe_position = pipe->position + step; /* bytes */ > + buffer_bytes = frames_to_bytes(runtime, runtime->buffer_size); > + while (pipe_position >= buffer_bytes) > + pipe_position -= buffer_bytes; > + > + period = bytes_to_frames(runtime, pipe_position) > / runtime->period_size; > - last_period = bytes_to_frames(runtime, pipe->last_counter) > + last_period = bytes_to_frames(runtime, pipe->position) > / runtime->period_size; > > if (period == last_period) > return 0; /* don't do any work yet */ > > - step = counter - pipe->last_counter; > + pipe->position = pipe_position; > pipe->last_counter = counter; > - > - pipe->position += step; /* bytes */ > - > - buffer_bytes = frames_to_bytes(runtime, runtime->buffer_size); > - > - while (pipe->position >= buffer_bytes) > - pipe->position -= buffer_bytes; > - > return 1; I think this risks returning to a case where it concludes nothing advances if the counter advances by a whole buffer? You might be able to do the comparison before wrapping pipe_position, but hopefully you'll consider my patch in reply to Takashi has more clarity.
On Wed, 17 Jun 2020 12:51:05 +0200, Mark Hills wrote: > > On Tue, 16 Jun 2020, Takashi Iwai wrote: > > > On Tue, 16 Jun 2020 16:01:11 +0200, > > Mark Hills wrote: > > > > > > On Tue, 16 Jun 2020, Takashi Iwai wrote: > > > > > > > On Tue, 16 Jun 2020 15:17:43 +0200, > > > > Mark Hills wrote: > > > > > > > > > > +/* Update software pointer to match the hardware > > > > > + * > > > > > + * Return: 1 if we crossed a period threshold, otherwise 0 > > > > > + */ > > > > > +static int snd_echo_poll_substream(struct snd_pcm_substream *substream) > > > > > > > > This is a bit confusing name. One would imagine from the function > > > > name as if it were about the handling of poll() syscall. > > > > > > Poll felt intuitive to me; maybe from FreeBSD where network drivers can > > > poll on a timer instead of interrupts. I do know about poll(), though. > > > > > > How about snd_echo_update_substream()? > > > > Yes, it's more self-explanatory. Or even better > > snd_echo_update_substream_position() :) > > Out of interest, these are static but the names are globally qualified. I > see this elsewhere, so I followed, but is this agreed convention? > > Because it could be update_substream_position() :) It's fine to use any name for a local function. > > > > > +{ > > > > > + struct snd_pcm_runtime *runtime = substream->runtime; > > > > > + struct audiopipe *pipe = runtime->private_data; > > > > > + unsigned counter, step, period, last_period; > > > > > + size_t buffer_bytes; > > > > > + > > > > > + if (pipe->state != PIPE_STATE_STARTED) > > > > > + return 0; > > > > > + > > > > > + counter = le32_to_cpu(*pipe->dma_counter); > > > > > + > > > > > + period = bytes_to_frames(runtime, counter) > > > > > + / runtime->period_size; > > > > > + last_period = bytes_to_frames(runtime, pipe->last_counter) > > > > > + / runtime->period_size; > > > > > + > > > > > + if (period == last_period) > > > > > + return 0; /* don't do any work yet */ > > > > > + > > > > > + step = counter - pipe->last_counter; > > > > > + pipe->last_counter = counter; > > > > > > > > Dose this calculation consider the overlap of 32bit integer of the > > > > counter? (I'm not sure whether the old code did it right, though.) > > > > > > I believe it does, since (small - big) as unsigned gives small value. And > > > period is checked only for equality (not greater than). I'll add a comment > > > as such. I have run it with long streams. > > > > The problem is that the last_period calculation can be wrong if the > > period size isn't aligned. e.g. when period size is 44100, around the > > boundary position, it gets a different last_period value although it > > still in the same period. > > Agreed, yes. > > In which case I think it's clearer to not infer anything about periods > from the current counter or position, and (effectively) accumulate it. > > Would you please make suggestions on the code below? > > Because it allowed for further simplification whilst fixing the bugs. > > In the end, modulo became clearer than loops and I think it has the bonus > of being less risky should the counter make a large step. > > I'll run for a longer test today. > > --- > > /* Update software pointer to match the hardware > * > * \pre chip->lock is held > */ > static void snd_echo_update_substream_position(struct echoaudio *chip, > struct snd_pcm_substream *substream) > { > struct snd_pcm_runtime *runtime = substream->runtime; > struct audiopipe *pipe = runtime->private_data; > u32 counter, step; > size_t period_bytes; > > if (pipe->state != PIPE_STATE_STARTED) > return; > > period_bytes = frames_to_bytes(runtime, runtime->period_size); > > counter = le32_to_cpu(*pipe->dma_counter); > > step = counter - pipe->last_counter; /* handles wrapping of counter */ > step -= step % period_bytes; /* acknowledge whole periods only */ > > if (step == 0) > return; /* haven't advanced a whole period yet */ > pipe->last_counter += step; /* does not always wrap on a period */ > pipe->position += step; > > /* wraparound the buffer */ > pipe->position %= frames_to_bytes(runtime, runtime->buffer_size); > > /* notify only once, even if multiple periods elapsed */ > spin_unlock(&chip->lock); > snd_pcm_period_elapsed(substream); > spin_lock(&chip->lock); > } > > static irqreturn_t snd_echo_interrupt(int irq, void *dev_id) > { > struct echoaudio *chip = dev_id; > int ss, st; > > spin_lock(&chip->lock); > st = service_irq(chip); > if (st < 0) { > spin_unlock(&chip->lock); > return IRQ_NONE; > } > /* The hardware doesn't tell us which substream caused the irq, > thus we have to check all running substreams. */ > for (ss = 0; ss < DSP_MAXPIPES; ss++) { > struct snd_pcm_substream *substream; > > substream = chip->substream[ss]; > if (substream) > snd_echo_update_substream_position(chip, substream); > } > spin_unlock(&chip->lock); > > #ifdef ECHOCARD_HAS_MIDI > if (st > 0 && chip->midi_in) { > snd_rawmidi_receive(chip->midi_in, chip->midi_buffer, st); > dev_dbg(chip->card->dev, "rawmidi_iread=%d\n", st); > } > #endif > return IRQ_HANDLED; > } > > static snd_pcm_uframes_t pcm_pointer(struct snd_pcm_substream *substream) > { > struct snd_pcm_runtime *runtime = substream->runtime; > struct audiopipe *pipe = runtime->private_data; > > return bytes_to_frames(runtime, pipe->position); I guess this misses the update of the precise position; in your code, pipe->position gets updated only with the period size at irq handler. IMO, we should have the code like: static bool update_stream_position(struct snd_pcm_substream *substream) { // update pipe->position and others, returns true if period elapsed } static irqreturn_t snd_echo_interrupt() { spin_lock(&chip->lock); .... if (update_stream_position(substream)) { spin_unlock(&chip->lock); snd_pcm_period_elapsed(substream); spin_lock(&chip->lock); } .... spin_unlock(&chip->lock); return IRQ_HANDLED; } static snd_pcm_uframes_t pcm_pointer(struct snd_pcm_substream *substream) { .... update_stream_position(substream); return bytes_to_frames(runtime, pipe->position); } Note that snd_pcm_period_elapsed() isn't needed (must not be done) from the pointer callback. The PCM core would detect and handle properly if the period gets elapsed there. thanks, Takashi
On Thu, 18 Jun 2020, Takashi Iwai wrote: > On Wed, 17 Jun 2020 12:51:05 +0200, > Mark Hills wrote: > > > > On Tue, 16 Jun 2020, Takashi Iwai wrote: > > > > > On Tue, 16 Jun 2020 16:01:11 +0200, > > > Mark Hills wrote: [...] > > > > /* Update software pointer to match the hardware > > * > > * \pre chip->lock is held > > */ > > static void snd_echo_update_substream_position(struct echoaudio *chip, > > struct snd_pcm_substream *substream) > > { > > struct snd_pcm_runtime *runtime = substream->runtime; > > struct audiopipe *pipe = runtime->private_data; > > u32 counter, step; > > size_t period_bytes; > > > > if (pipe->state != PIPE_STATE_STARTED) > > return; > > > > period_bytes = frames_to_bytes(runtime, runtime->period_size); > > > > counter = le32_to_cpu(*pipe->dma_counter); > > > > step = counter - pipe->last_counter; /* handles wrapping of counter */ > > step -= step % period_bytes; /* acknowledge whole periods only */ > > > > if (step == 0) > > return; /* haven't advanced a whole period yet */ > > pipe->last_counter += step; /* does not always wrap on a period */ > > pipe->position += step; > > > > /* wraparound the buffer */ > > pipe->position %= frames_to_bytes(runtime, runtime->buffer_size); > > > > /* notify only once, even if multiple periods elapsed */ > > spin_unlock(&chip->lock); > > snd_pcm_period_elapsed(substream); > > spin_lock(&chip->lock); > > } > > > > static irqreturn_t snd_echo_interrupt(int irq, void *dev_id) > > { > > struct echoaudio *chip = dev_id; > > int ss, st; > > > > spin_lock(&chip->lock); > > st = service_irq(chip); > > if (st < 0) { > > spin_unlock(&chip->lock); > > return IRQ_NONE; > > } > > /* The hardware doesn't tell us which substream caused the irq, > > thus we have to check all running substreams. */ > > for (ss = 0; ss < DSP_MAXPIPES; ss++) { > > struct snd_pcm_substream *substream; > > > > substream = chip->substream[ss]; > > if (substream) > > snd_echo_update_substream_position(chip, substream); > > } > > spin_unlock(&chip->lock); > > > > #ifdef ECHOCARD_HAS_MIDI > > if (st > 0 && chip->midi_in) { > > snd_rawmidi_receive(chip->midi_in, chip->midi_buffer, st); > > dev_dbg(chip->card->dev, "rawmidi_iread=%d\n", st); > > } > > #endif > > return IRQ_HANDLED; > > } > > > > static snd_pcm_uframes_t pcm_pointer(struct snd_pcm_substream *substream) > > { > > struct snd_pcm_runtime *runtime = substream->runtime; > > struct audiopipe *pipe = runtime->private_data; > > > > return bytes_to_frames(runtime, pipe->position); > > I guess this misses the update of the precise position; in your code, > pipe->position gets updated only with the period size at irq handler. > > > IMO, we should have the code like: > > static bool update_stream_position(struct snd_pcm_substream *substream) > { > // update pipe->position and others, returns true if period elapsed > } > > static irqreturn_t snd_echo_interrupt() > { > spin_lock(&chip->lock); > .... > if (update_stream_position(substream)) { > spin_unlock(&chip->lock); > snd_pcm_period_elapsed(substream); > spin_lock(&chip->lock); > } > .... > spin_unlock(&chip->lock); > return IRQ_HANDLED; > } > > static snd_pcm_uframes_t pcm_pointer(struct snd_pcm_substream *substream) > { > .... > update_stream_position(substream); > return bytes_to_frames(runtime, pipe->position); > } Thanks. I certainly understand this in isolation. But could I please ask for help with the bigger picture? As it feels mismatched. * Code should take every possible opportunity to update the stream position; interrupts, or explicit pcm_pointer calls (whereas the docs guide towards doing it in the interrupt handler) * I critiqued (elsewhere in thread) the older interrupt handler for checking the counter, unlocking, calling back into PCM core and checking again a moment later. Whereas this is considered good behaviour. * Seems like the overall aim is for userland to be able (if it wants to) to poll the soundcard, even outside of periods. If all the above is true, I would expect interrupt handling to be very simple -- it would straight away call into PCM core, join existing if the codepaths (to take locks) and do a position update. PCM core would decide if a period really elapsed, not the driver. But this is not how it works. This now relates strongly to a question of locking: I ran the code (top of this message) all day, with a few instances in dmesg (at irregular intervals, not wrapping): [161644.076666] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 [163232.392501] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 [164976.098069] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 [165054.946225] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 [165312.141545] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 A definite bug, of course. However (and I am happy to be corrected) the function never finishes with position == buffer size. Only way is a race between interrupt handler and pcm_pointer. Therefore one of these is needed: * pcm_pointer locks chip->lock Even though the docs emphasise PCM core has exclusivity, it it not worth much as it does not protect against the interrupt handler. But now interrupt handler becomes ugly in total: take chip->lock, check the counter, release chip->lock, go to PCM core (which takes middle layer locks), take chip->lock again, check counter again, release chip->lock again. * interrupt handler must make atomic update of pipe->position This could have been a solution, but not if we expect pcm_pointer to also invoke the position update. Now we have a race: both the interrupt handler and pcm_position want to read dma_counter and write pipe->position after. So either do everthing in interrupt, everything in the pointer callback (though there isn't the API for this), but doing both does not seem to work well (though probably can be made to work if necessary) If we can clarify the requirements then I do not think it would be hard for me to write the code. [...] > Takashi Thanks again,
On Thu, 18 Jun 2020 13:07:33 +0200, Mark Hills wrote: > > On Thu, 18 Jun 2020, Takashi Iwai wrote: > > > On Wed, 17 Jun 2020 12:51:05 +0200, > > Mark Hills wrote: > > > > > > On Tue, 16 Jun 2020, Takashi Iwai wrote: > > > > > > > On Tue, 16 Jun 2020 16:01:11 +0200, > > > > Mark Hills wrote: > [...] > > > > > > /* Update software pointer to match the hardware > > > * > > > * \pre chip->lock is held > > > */ > > > static void snd_echo_update_substream_position(struct echoaudio *chip, > > > struct snd_pcm_substream *substream) > > > { > > > struct snd_pcm_runtime *runtime = substream->runtime; > > > struct audiopipe *pipe = runtime->private_data; > > > u32 counter, step; > > > size_t period_bytes; > > > > > > if (pipe->state != PIPE_STATE_STARTED) > > > return; > > > > > > period_bytes = frames_to_bytes(runtime, runtime->period_size); > > > > > > counter = le32_to_cpu(*pipe->dma_counter); > > > > > > step = counter - pipe->last_counter; /* handles wrapping of counter */ > > > step -= step % period_bytes; /* acknowledge whole periods only */ > > > > > > if (step == 0) > > > return; /* haven't advanced a whole period yet */ > > > pipe->last_counter += step; /* does not always wrap on a period */ > > > pipe->position += step; > > > > > > /* wraparound the buffer */ > > > pipe->position %= frames_to_bytes(runtime, runtime->buffer_size); > > > > > > /* notify only once, even if multiple periods elapsed */ > > > spin_unlock(&chip->lock); > > > snd_pcm_period_elapsed(substream); > > > spin_lock(&chip->lock); > > > } > > > > > > static irqreturn_t snd_echo_interrupt(int irq, void *dev_id) > > > { > > > struct echoaudio *chip = dev_id; > > > int ss, st; > > > > > > spin_lock(&chip->lock); > > > st = service_irq(chip); > > > if (st < 0) { > > > spin_unlock(&chip->lock); > > > return IRQ_NONE; > > > } > > > /* The hardware doesn't tell us which substream caused the irq, > > > thus we have to check all running substreams. */ > > > for (ss = 0; ss < DSP_MAXPIPES; ss++) { > > > struct snd_pcm_substream *substream; > > > > > > substream = chip->substream[ss]; > > > if (substream) > > > snd_echo_update_substream_position(chip, substream); > > > } > > > spin_unlock(&chip->lock); > > > > > > #ifdef ECHOCARD_HAS_MIDI > > > if (st > 0 && chip->midi_in) { > > > snd_rawmidi_receive(chip->midi_in, chip->midi_buffer, st); > > > dev_dbg(chip->card->dev, "rawmidi_iread=%d\n", st); > > > } > > > #endif > > > return IRQ_HANDLED; > > > } > > > > > > static snd_pcm_uframes_t pcm_pointer(struct snd_pcm_substream *substream) > > > { > > > struct snd_pcm_runtime *runtime = substream->runtime; > > > struct audiopipe *pipe = runtime->private_data; > > > > > > return bytes_to_frames(runtime, pipe->position); > > > > I guess this misses the update of the precise position; in your code, > > pipe->position gets updated only with the period size at irq handler. > > > > > > IMO, we should have the code like: > > > > static bool update_stream_position(struct snd_pcm_substream *substream) > > { > > // update pipe->position and others, returns true if period elapsed > > } > > > > static irqreturn_t snd_echo_interrupt() > > { > > spin_lock(&chip->lock); > > .... > > if (update_stream_position(substream)) { > > spin_unlock(&chip->lock); > > snd_pcm_period_elapsed(substream); > > spin_lock(&chip->lock); > > } > > .... > > spin_unlock(&chip->lock); > > return IRQ_HANDLED; > > } > > > > static snd_pcm_uframes_t pcm_pointer(struct snd_pcm_substream *substream) > > { > > .... > > update_stream_position(substream); > > return bytes_to_frames(runtime, pipe->position); > > } > > Thanks. > > I certainly understand this in isolation. > > But could I please ask for help with the bigger picture? As it feels > mismatched. > > * Code should take every possible opportunity to update the stream > position; interrupts, or explicit pcm_pointer calls (whereas the docs > guide towards doing it in the interrupt handler) > > * I critiqued (elsewhere in thread) the older interrupt handler for > checking the counter, unlocking, calling back into PCM core and checking > again a moment later. Whereas this is considered good behaviour. > > * Seems like the overall aim is for userland to be able (if it wants to) > to poll the soundcard, even outside of periods. Right, the user-space can query the current position at any time, and the driver should return the position as precisely as possible. Some applications (like PulseAudio) sets the interrupt as minimum as possible while it does schedule the update by itself, judging the position via the ioctl. In such operational mode, the accuracy of the current position query is vital. > If all the above is true, I would expect interrupt handling to be very > simple -- it would straight away call into PCM core, join existing if the > codepaths (to take locks) and do a position update. PCM core would decide > if a period really elapsed, not the driver. But this is not how it works. > > This now relates strongly to a question of locking: > > I ran the code (top of this message) all day, with a few instances in > dmesg (at irregular intervals, not wrapping): > > [161644.076666] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > [163232.392501] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > [164976.098069] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > [165054.946225] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > [165312.141545] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > A definite bug, of course. > > However (and I am happy to be corrected) the function never finishes with > position == buffer size. Only way is a race between interrupt handler and > pcm_pointer. > > Therefore one of these is needed: > > * pcm_pointer locks chip->lock > > Even though the docs emphasise PCM core has exclusivity, it it not worth > much as it does not protect against the interrupt handler. > > But now interrupt handler becomes ugly in total: take chip->lock, check > the counter, release chip->lock, go to PCM core (which takes middle > layer locks), take chip->lock again, check counter again, release > chip->lock again. Yes, that's the most robust way to go. If the lock is really costing too much, you can consider a fast-path by some flag for the irq -> snd_pcm_period_elapsed() case. Basically you can do everything in the pointer callback. The only requirement in the interrupt handle is basically judge whether you need the call of snd_pcm_period_elapsed() and call it. The rest update could be done in the other places. > * interrupt handler must make atomic update of pipe->position > > This could have been a solution, but not if we expect pcm_pointer to > also invoke the position update. Now we have a race: both the interrupt > handler and pcm_position want to read dma_counter and write > pipe->position after. As mentioned, the update of position at query is essential in majority of sound systems. Takashi > So either do everthing in interrupt, everything in the pointer callback > (though there isn't the API for this), but doing both does not seem to > work well (though probably can be made to work if necessary) > > If we can clarify the requirements then I do not think it would be hard > for me to write the code. > > [...] > > Takashi > > Thanks again, > > -- > Mark >
On Thu, 18 Jun 2020, Takashi Iwai wrote: > On Thu, 18 Jun 2020 13:07:33 +0200, > Mark Hills wrote: > > > > On Thu, 18 Jun 2020, Takashi Iwai wrote: > > > > > On Wed, 17 Jun 2020 12:51:05 +0200, > > > Mark Hills wrote: > > [...] > > But could I please ask for help with the bigger picture? As it feels > > mismatched. > > > > * Code should take every possible opportunity to update the stream > > position; interrupts, or explicit pcm_pointer calls (whereas the docs > > guide towards doing it in the interrupt handler) > > > > * I critiqued (elsewhere in thread) the older interrupt handler for > > checking the counter, unlocking, calling back into PCM core and checking > > again a moment later. Whereas this is considered good behaviour. > > > > * Seems like the overall aim is for userland to be able (if it wants to) > > to poll the soundcard, even outside of periods. > > Right, the user-space can query the current position at any time, and > the driver should return the position as precisely as possible. > > Some applications (like PulseAudio) sets the interrupt as minimum as > possible while it does schedule the update by itself, judging the > position via the ioctl. In such operational mode, the accuracy of the > current position query is vital. > > > If all the above is true, I would expect interrupt handling to be very > > simple -- it would straight away call into PCM core, join existing if the > > codepaths (to take locks) and do a position update. PCM core would decide > > if a period really elapsed, not the driver. But this is not how it works. > > > > This now relates strongly to a question of locking: > > > > I ran the code (top of this message) all day, with a few instances in > > dmesg (at irregular intervals, not wrapping): > > > > [161644.076666] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > [163232.392501] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > [164976.098069] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > [165054.946225] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > [165312.141545] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > > > A definite bug, of course. > > > > However (and I am happy to be corrected) the function never finishes with > > position == buffer size. Only way is a race between interrupt handler and > > pcm_pointer. > > > > Therefore one of these is needed: > > > > * pcm_pointer locks chip->lock > > > > Even though the docs emphasise PCM core has exclusivity, it it not worth > > much as it does not protect against the interrupt handler. > > > > But now interrupt handler becomes ugly in total: take chip->lock, check > > the counter, release chip->lock, go to PCM core (which takes middle > > layer locks), take chip->lock again, check counter again, release > > chip->lock again. > > Yes, that's the most robust way to go. If the lock is really costing > too much, you can consider a fast-path by some flag for the irq -> > snd_pcm_period_elapsed() case. I don't understand how a fast path could be made to work, as it can't pass data across snd_pcm_period_elapsed() and it still must syncronise access between reading dma_counter and writing pipe->position. Hence questioning if a better design is simpler interrupt handlers that just enter PCM core. > Basically you can do everything in the pointer callback. The only > requirement in the interrupt handle is basically judge whether you need > the call of snd_pcm_period_elapsed() and call it. The rest update could > be done in the other places. Thanks, please just another clarification: I presume that calls to pcm_pointer are completely independent of the period notifications? ie. period notifications are at regular intervals, regardless of whether pcm_pointer is called inbetween. pcm_pointer must not reset any state used by the interrupt. Which means we must handle when non-interrupt call to pcm_pointer causes a period to elapse. The next interrupt handler must notify. I can see in the original code uses chip->last_period exclusively by the interrupt handler, and I removed it. Some comments around the intent would help. I'll cross reference the original code with my new understanding. My instinct here is that to preserve - regular period notifications - handle period_size not aligning with 32-bit counter - no races between interrupt and pcm_pointer that the clearest and bug-free implementation may be to separate the interrupt (notifications) and pointer updates to separate state. Then there's no lock and the only crossover is an atomic read of dma_counter. And that's what I will try -- thanks.
On Thu, 18 Jun 2020, Mark Hills wrote: > On Thu, 18 Jun 2020, Takashi Iwai wrote: > > > On Thu, 18 Jun 2020 13:07:33 +0200, > > Mark Hills wrote: > > > > > > On Thu, 18 Jun 2020, Takashi Iwai wrote: > > > > > > > On Wed, 17 Jun 2020 12:51:05 +0200, > > > > Mark Hills wrote: > > > [...] > > > But could I please ask for help with the bigger picture? As it feels > > > mismatched. > > > > > > * Code should take every possible opportunity to update the stream > > > position; interrupts, or explicit pcm_pointer calls (whereas the docs > > > guide towards doing it in the interrupt handler) > > > > > > * I critiqued (elsewhere in thread) the older interrupt handler for > > > checking the counter, unlocking, calling back into PCM core and checking > > > again a moment later. Whereas this is considered good behaviour. > > > > > > * Seems like the overall aim is for userland to be able (if it wants to) > > > to poll the soundcard, even outside of periods. > > > > Right, the user-space can query the current position at any time, and > > the driver should return the position as precisely as possible. > > > > Some applications (like PulseAudio) sets the interrupt as minimum as > > possible while it does schedule the update by itself, judging the > > position via the ioctl. In such operational mode, the accuracy of the > > current position query is vital. > > > > > If all the above is true, I would expect interrupt handling to be very > > > simple -- it would straight away call into PCM core, join existing if the > > > codepaths (to take locks) and do a position update. PCM core would decide > > > if a period really elapsed, not the driver. But this is not how it works. > > > > > > This now relates strongly to a question of locking: > > > > > > I ran the code (top of this message) all day, with a few instances in > > > dmesg (at irregular intervals, not wrapping): > > > > > > [161644.076666] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > > [163232.392501] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > > [164976.098069] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > > [165054.946225] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > > [165312.141545] snd_echo3g 0000:09:02.0: invalid position: , pos = 4096, buffer size = 4096, period size = 64 > > > > > > A definite bug, of course. > > > > > > However (and I am happy to be corrected) the function never finishes with > > > position == buffer size. Only way is a race between interrupt handler and > > > pcm_pointer. > > > > > > Therefore one of these is needed: > > > > > > * pcm_pointer locks chip->lock > > > > > > Even though the docs emphasise PCM core has exclusivity, it it not worth > > > much as it does not protect against the interrupt handler. > > > > > > But now interrupt handler becomes ugly in total: take chip->lock, check > > > the counter, release chip->lock, go to PCM core (which takes middle > > > layer locks), take chip->lock again, check counter again, release > > > chip->lock again. > > > > Yes, that's the most robust way to go. If the lock is really costing > > too much, you can consider a fast-path by some flag for the irq -> > > snd_pcm_period_elapsed() case. > > I don't understand how a fast path could be made to work, as it can't pass > data across snd_pcm_period_elapsed() and it still must syncronise access > between reading dma_counter and writing pipe->position. > > Hence questioning if a better design is simpler interrupt handlers that > just enter PCM core. > > > Basically you can do everything in the pointer callback. The only > > requirement in the interrupt handle is basically judge whether you need > > the call of snd_pcm_period_elapsed() and call it. The rest update could > > be done in the other places. > > Thanks, please just another clarification: > > I presume that calls to pcm_pointer are completely independent of the > period notifications? > > ie. period notifications are at regular intervals, regardless of whether > pcm_pointer is called inbetween. pcm_pointer must not reset any state used > by the interrupt. > > Which means we must handle when non-interrupt call to pcm_pointer causes a > period to elapse. The next interrupt handler must notify. > > I can see in the original code uses chip->last_period exclusively by the > interrupt handler, and I removed it. Some comments around the intent would > help. I'll cross reference the original code with my new understanding. > > My instinct here is that to preserve > > - regular period notifications > > - handle period_size not aligning with 32-bit counter > > - no races between interrupt and pcm_pointer > > that the clearest and bug-free implementation may be to separate the > interrupt (notifications) and pointer updates to separate state. > > Then there's no lock and the only crossover is an atomic read of > dma_counter. > > And that's what I will try -- thanks. Ok so the implementation would look something like this below, which I will run for the rest of the day: * Clear separation of the period notification from position updates; only syncronising is around dma_counter, no need for locks * All counting is accumulated to avoid bugs in the cases of wrapping and non-alignment It's easier to see in the end results but of course I'll work on a clear diff. --- /****************************************************************************** IRQ Handling ******************************************************************************/ /* Check if a period has elapsed since last interrupt * * Don't make any updates to state; PCM core handles this with the * correct locks. * * \return true if a period has elapsed, otherwise false */ static bool period_has_elapsed(struct snd_pcm_substream *substream) { struct snd_pcm_runtime *runtime = substream->runtime; struct audiopipe *pipe = runtime->private_data; u32 counter, step; size_t period_bytes; if (pipe->state != PIPE_STATE_STARTED) return false; period_bytes = frames_to_bytes(runtime, runtime->period_size); counter = le32_to_cpu(*pipe->dma_counter); /* presumed atomic */ step = counter - pipe->last_period; /* handles wrapping */ step -= step % period_bytes; /* acknowledge whole periods only */ if (step == 0) return false; /* haven't advanced a whole period yet */ pipe->last_period += step; /* used exclusively by us */ return true; } static irqreturn_t snd_echo_interrupt(int irq, void *dev_id) { struct echoaudio *chip = dev_id; int ss, st; spin_lock(&chip->lock); st = service_irq(chip); if (st < 0) { spin_unlock(&chip->lock); return IRQ_NONE; } /* The hardware doesn't tell us which substream caused the irq, thus we have to check all running substreams. */ for (ss = 0; ss < DSP_MAXPIPES; ss++) { struct snd_pcm_substream *substream; substream = chip->substream[ss]; if (substream && period_has_elapsed(substream)) { spin_unlock(&chip->lock); snd_pcm_period_elapsed(substream); spin_lock(&chip->lock); } } spin_unlock(&chip->lock); #ifdef ECHOCARD_HAS_MIDI if (st > 0 && chip->midi_in) { snd_rawmidi_receive(chip->midi_in, chip->midi_buffer, st); dev_dbg(chip->card->dev, "rawmidi_iread=%d\n", st); } #endif return IRQ_HANDLED; } static snd_pcm_uframes_t pcm_pointer(struct snd_pcm_substream *substream) { struct snd_pcm_runtime *runtime = substream->runtime; struct audiopipe *pipe = runtime->private_data; u32 counter, step; /* * IRQ handling runs concurrently. Do not share tracking of * counter with it, which would race or require locking */ counter = le32_to_cpu(*pipe->dma_counter); /* presumed atomic */ step = counter - pipe->last_counter; /* handles wrapping */ pipe->last_counter = counter; /* counter doesn't neccessarily wrap on a multiple of * buffer_size, so can't derive the position; must * accumulate */ pipe->position += step; pipe->position %= frames_to_bytes(runtime, runtime->buffer_size); /* wrap */ return bytes_to_frames(runtime, pipe->position); }
On Wed, 17 Jun 2020 12:14:42 +0100 (BST) Mark Hills <mark@xwax.org> wrote: > On Wed, 17 Jun 2020, Giuliano Pochini wrote: > [...] > > - pipe->position += step; /* bytes */ > > - > > - buffer_bytes = frames_to_bytes(runtime, runtime->buffer_size); > > - > > - while (pipe->position >= buffer_bytes) > > - pipe->position -= buffer_bytes; > > - > > return 1; > > I think this risks returning to a case where it concludes nothing > advances if the counter advances by a whole buffer? Yes, it can, but you can detect that case checking for step >= period_bytes. > You might be able to do the comparison before wrapping pipe_position, but > hopefully you'll consider my patch in reply to Takashi has more clarity. Your patch is very interesting. I didn't take into account the idea of advancing the position by full periods only. If the PCM subsystem hasn't changed much since I last checked (I wrote the driver many years ago), it should work fine (and I'm sure you tested it). But I don't know if something else requires better resolution.
On Fri, 19 Jun 2020, Giuliano Pochini wrote: > On Wed, 17 Jun 2020 12:14:42 +0100 (BST) > Mark Hills <mark@xwax.org> wrote: > [...] > > You might be able to do the comparison before wrapping pipe_position, > > but hopefully you'll consider my patch in reply to Takashi has more > > clarity. > > Your patch is very interesting. I didn't take into account the idea of > advancing the position by full periods only. If the PCM subsystem hasn't > changed much since I last checked (I wrote the driver many years ago), > it should work fine (and I'm sure you tested it). But I don't know if > something else requires better resolution. It's funny, but I didn't take account of the opposite; that there was any merits to polling inbetween the interrupts for better resolution. Takashi pointed out the need for this and we had some discussion. Check the other thread, where I provided a newer revision of the code. The good thing is I think we can have all the things we want and be bug free, just I have to understand the specification. It would be great if you would like to take a look at the newer code for any problems you can see. I was going to run it for a few days then turn it into some patches.
On Fri, 19 Jun 2020 22:21:54 +0100 (BST) Mark Hills <mark@xwax.org> wrote: > On Fri, 19 Jun 2020, Giuliano Pochini wrote: > > > On Wed, 17 Jun 2020 12:14:42 +0100 (BST) > > Mark Hills <mark@xwax.org> wrote: > > > [...] > > > You might be able to do the comparison before wrapping pipe_position, > > > but hopefully you'll consider my patch in reply to Takashi has more > > > clarity. > > > > Your patch is very interesting. I didn't take into account the idea of > > advancing the position by full periods only. If the PCM subsystem > > hasn't changed much since I last checked (I wrote the driver many years > > ago), it should work fine (and I'm sure you tested it). But I don't > > know if something else requires better resolution. > > It's funny, but I didn't take account of the opposite; that there was any > merits to polling inbetween the interrupts for better resolution. > > Takashi pointed out the need for this and we had some discussion. Check > the other thread, where I provided a newer revision of the code. > > The good thing is I think we can have all the things we want and be bug > free, just I have to understand the specification. > > It would be great if you would like to take a look at the newer code for > any problems you can see. I was going to run it for a few days then turn > it into some patches. I looked at your code and I think it's OK. I'm using it for some days without any problem. I also stressed it with pretty tight timings and it worked fine all the time. Since I could not reproduce that problem before, except in some rare random circumstances, I'm not a good tester at all. At most I can say that your patch does not make things worse :)
On Mon, 29 Jun 2020, Giuliano Pochini wrote: > On Fri, 19 Jun 2020 22:21:54 +0100 (BST) > Mark Hills <mark@xwax.org> wrote: > > > On Fri, 19 Jun 2020, Giuliano Pochini wrote: > > > > > On Wed, 17 Jun 2020 12:14:42 +0100 (BST) > > > Mark Hills <mark@xwax.org> wrote: > > > > > [...] > > > > You might be able to do the comparison before wrapping pipe_position, > > > > but hopefully you'll consider my patch in reply to Takashi has more > > > > clarity. > > > > > > Your patch is very interesting. I didn't take into account the idea of > > > advancing the position by full periods only. If the PCM subsystem > > > hasn't changed much since I last checked (I wrote the driver many years > > > ago), it should work fine (and I'm sure you tested it). But I don't > > > know if something else requires better resolution. > > > > It's funny, but I didn't take account of the opposite; that there was any > > merits to polling inbetween the interrupts for better resolution. > > > > Takashi pointed out the need for this and we had some discussion. Check > > the other thread, where I provided a newer revision of the code. > > > > The good thing is I think we can have all the things we want and be bug > > free, just I have to understand the specification. > > > > It would be great if you would like to take a look at the newer code for > > any problems you can see. I was going to run it for a few days then turn > > it into some patches. > > I looked at your code and I think it's OK. I'm using it for some days > without any problem. I also stressed it with pretty tight timings and it > worked fine all the time. > > Since I could not reproduce that problem before, except in some rare > random circumstances, I'm not a good tester at all. At most I can say > that your patch does not make things worse :) What software are you using on the device, and are you using x86_64 and dmix? I think some issues might be exaggerated by dmix which has a unique way of opening the device several times. And then chromium exercises dmix a lot with all of its threads/forks. That would I presume be how it exercises races between pcm_pointer and interrupts.
On Wed, 1 Jul 2020 13:25:07 +0100 (BST) Mark Hills <mark@xwax.org> wrote: > On Mon, 29 Jun 2020, Giuliano Pochini wrote: > > > Since I could not reproduce that problem before, except in some rare > > random circumstances, I'm not a good tester at all. At most I can say > > that your patch does not make things worse :) > > What software are you using on the device, and are you using x86_64 and > dmix? > > I think some issues might be exaggerated by dmix which has a unique way > of opening the device several times. And then chromium exercises dmix a > lot with all of its threads/forks. That would I presume be how it > exercises races between pcm_pointer and interrupts. x86_64 now. When I wrote the driver I had a powermac. I also can test it on a x86_32 laptop with an Indigo-IO card. I tested both plughw and dmix, but I don't use the latter often.
diff --git a/sound/pci/echoaudio/echoaudio.c b/sound/pci/echoaudio/echoaudio.c index 8cf08988959f..12217649fb44 100644 --- a/sound/pci/echoaudio/echoaudio.c +++ b/sound/pci/echoaudio/echoaudio.c @@ -2,6 +2,7 @@ /* * ALSA driver for Echoaudio soundcards. * Copyright (C) 2003-2004 Giuliano Pochini <pochini@shiny.it> + * Copyright (C) 2020 Mark Hills <mark@xwax.org> */ #include <linux/module.h> @@ -590,7 +591,6 @@ static int init_engine(struct snd_pcm_substream *substream, /* This stuff is used by the irq handler, so it must be * initialized before chip->substream */ - chip->last_period[pipe_index] = 0; pipe->last_counter = 0; pipe->position = 0; smp_wmb(); @@ -759,7 +759,6 @@ static int pcm_trigger(struct snd_pcm_substream *substream, int cmd) pipe = chip->substream[i]->runtime->private_data; switch (pipe->state) { case PIPE_STATE_STOPPED: - chip->last_period[i] = 0; pipe->last_counter = 0; pipe->position = 0; *pipe->dma_counter = 0; @@ -807,19 +806,8 @@ static snd_pcm_uframes_t pcm_pointer(struct snd_pcm_substream *substream) { struct snd_pcm_runtime *runtime = substream->runtime; struct audiopipe *pipe = runtime->private_data; - size_t cnt, bufsize, pos; - cnt = le32_to_cpu(*pipe->dma_counter); - pipe->position += cnt - pipe->last_counter; - pipe->last_counter = cnt; - bufsize = substream->runtime->buffer_size; - pos = bytes_to_frames(substream->runtime, pipe->position); - - while (pos >= bufsize) { - pipe->position -= frames_to_bytes(substream->runtime, bufsize); - pos -= bufsize; - } - return pos; + return bytes_to_frames(runtime, pipe->position); } @@ -1782,14 +1770,50 @@ static const struct snd_kcontrol_new snd_echo_channels_info = { /****************************************************************************** - IRQ Handler + IRQ Handling ******************************************************************************/ +/* Update software pointer to match the hardware + * + * Return: 1 if we crossed a period threshold, otherwise 0 + */ +static int snd_echo_poll_substream(struct snd_pcm_substream *substream) +{ + struct snd_pcm_runtime *runtime = substream->runtime; + struct audiopipe *pipe = runtime->private_data; + unsigned counter, step, period, last_period; + size_t buffer_bytes; + + if (pipe->state != PIPE_STATE_STARTED) + return 0; + + counter = le32_to_cpu(*pipe->dma_counter); + + period = bytes_to_frames(runtime, counter) + / runtime->period_size; + last_period = bytes_to_frames(runtime, pipe->last_counter) + / runtime->period_size; + + if (period == last_period) + return 0; /* don't do any work yet */ + + step = counter - pipe->last_counter; + pipe->last_counter = counter; + + pipe->position += step; /* bytes */ + + buffer_bytes = frames_to_bytes(runtime, runtime->buffer_size); + + while (pipe->position >= buffer_bytes) + pipe->position -= buffer_bytes; + + return 1; +} + static irqreturn_t snd_echo_interrupt(int irq, void *dev_id) { struct echoaudio *chip = dev_id; - struct snd_pcm_substream *substream; - int period, ss, st; + int ss, st; spin_lock(&chip->lock); st = service_irq(chip); @@ -1800,18 +1824,18 @@ static irqreturn_t snd_echo_interrupt(int irq, void *dev_id) /* The hardware doesn't tell us which substream caused the irq, thus we have to check all running substreams. */ for (ss = 0; ss < DSP_MAXPIPES; ss++) { + struct snd_pcm_substream *substream; + substream = chip->substream[ss]; - if (substream && ((struct audiopipe *)substream->runtime-> - private_data)->state == PIPE_STATE_STARTED) { - period = pcm_pointer(substream) / - substream->runtime->period_size; - if (period != chip->last_period[ss]) { - chip->last_period[ss] = period; - spin_unlock(&chip->lock); - snd_pcm_period_elapsed(substream); - spin_lock(&chip->lock); - } - } + if (!substream) + continue; + + if (!snd_echo_poll_substream(substream)) + continue; + + spin_unlock(&chip->lock); + snd_pcm_period_elapsed(substream); + spin_lock(&chip->lock); } spin_unlock(&chip->lock); diff --git a/sound/pci/echoaudio/echoaudio.h b/sound/pci/echoaudio/echoaudio.h index 6fd283e4e676..7ff5d4de6880 100644 --- a/sound/pci/echoaudio/echoaudio.h +++ b/sound/pci/echoaudio/echoaudio.h @@ -332,7 +332,6 @@ struct audioformat { struct echoaudio { spinlock_t lock; struct snd_pcm_substream *substream[DSP_MAXPIPES]; - int last_period[DSP_MAXPIPES]; struct mutex mode_mutex; u16 num_digital_modes, digital_mode_list[6]; u16 num_clock_sources, clock_source_list[10];
Distorted audio appears occasionally, affecting either playback or capture and requiring the affected substream to be closed by all applications and re-opened. The best way I have found to reproduce the bug is to use dmix in combination with Chromium, which opens the audio device multiple times in threads. Anecdotally, the problems appear to have increased with faster CPUs. Since applying this patch I have not had problems, where previously they would occur several times a day. This patch addresses the following issues: * Check for progress using the counter from the hardware, not after it has been truncated to the buffer. There appears to be a possible bug if a whole ringbuffer advances between interrupts, it goes unnoticed. * Remove chip->last_period: It's trivial to derive from pipe->last_counter, and inside pipe is where it more logically belongs. This has less scope for bugs as it is not wrapped to the buffer length. * Remove the accessing of pipe->dma_counter twice in the interrupt handler: The value will be changing between accesses. It doesn't look like this could cause a bug in practice, assuming the value only goes up. Except perhaps if another thread were able to reset it to 0 on the second access because chip->lock is not held. This may improve the performance of the interrupt handler; dma_counter is volatile so access is slow. The resulting interrupt handling resembles more closely the pattern in the kernel "Writing an ALSA driver" documentation. Signed-off-by: Mark Hills <mark@xwax.org> --- sound/pci/echoaudio/echoaudio.c | 80 +++++++++++++++++++++------------ sound/pci/echoaudio/echoaudio.h | 1 - 2 files changed, 52 insertions(+), 29 deletions(-)