Message ID | s5htwj67wbs.wl-tiwai@suse.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Den 2016-04-12 skrev Takashi Iwai: > On Mon, 11 Apr 2016 17:08:01 +0200, > Lars Lindqvist wrote: > > > > On 2016-04-11 Takashi Iwai wrote: > > > [Added Qing Cai to Cc, who was the author of the patch in question] > > > > > > On Sun, 10 Apr 2016 23:57:11 +0200, > > > Lars Lindqvist wrote: > > > > > > > > Hi! > > > > > > > > Since alsa-lib commit dec428c352217010e4b8bd750d302b8062339d32, I've > > > > occationally been hit by an EBADFD whenever any program tries to play > > > > sound. The situation I get is that the first shmget succeds, so > > > > dmix->shmid >= 0, therefore first_instance = 0. > > > > > > I wonder how does this succeed? It's a leftover shmem? > > > But then why it contains the garbage...? > > > > I seem to be able to trigger it by having one client open, starting another, > > and quickly closing the first one. > > One possible case is that the second stream is opened almost at the > same time as the first stream, and the second stream reaches to the > point checking SND_PCM_DIRECT_MAGIC before the first one finishes the > initialization. Does a hackish patch like below make any difference? No it doesn't, the magic check succeeds. The situation is that the first one has been open for "a long time" when I open the second one. So I would think that initialization has been completed properly. If the second one opens just as the first closes, it might be that the second assigns (correctly) first_instance = 0, but by the time it reaches the "if (first_instance)" check, the first one has closed, and left garbage. Lars > diff --git a/src/pcm/pcm_direct.c b/src/pcm/pcm_direct.c > index 14de734d98eb..d0aa4258004f 100644 > --- a/src/pcm/pcm_direct.c > +++ b/src/pcm/pcm_direct.c > @@ -92,6 +92,7 @@ int snd_pcm_direct_shm_create_or_connect(snd_pcm_direct_t *dmix) > { > struct shmid_ds buf; > int tmpid, err, first_instance = 0; > + int repeat = 0; > > retryget: > dmix->shmid = shmget(dmix->ipc_key, sizeof(snd_pcm_direct_share_t), > @@ -136,6 +137,9 @@ retryget: > } else { > if (dmix->shmptr->magic != SND_PCM_DIRECT_MAGIC) { > snd_pcm_direct_shm_discard(dmix); > + /* might be racy, let's retry for a few times */ > + if (++repeat < 10) > + goto retryget; > return -EINVAL; > } > }
On Tue, 12 Apr 2016 16:52:28 +0200, Lars Lindqvist wrote: > > Den 2016-04-12 skrev Takashi Iwai: > > On Mon, 11 Apr 2016 17:08:01 +0200, > > Lars Lindqvist wrote: > > > > > > On 2016-04-11 Takashi Iwai wrote: > > > > [Added Qing Cai to Cc, who was the author of the patch in question] > > > > > > > > On Sun, 10 Apr 2016 23:57:11 +0200, > > > > Lars Lindqvist wrote: > > > > > > > > > > Hi! > > > > > > > > > > Since alsa-lib commit dec428c352217010e4b8bd750d302b8062339d32, I've > > > > > occationally been hit by an EBADFD whenever any program tries to play > > > > > sound. The situation I get is that the first shmget succeds, so > > > > > dmix->shmid >= 0, therefore first_instance = 0. > > > > > > > > I wonder how does this succeed? It's a leftover shmem? > > > > But then why it contains the garbage...? > > > > > > I seem to be able to trigger it by having one client open, starting another, > > > and quickly closing the first one. > > > > One possible case is that the second stream is opened almost at the > > same time as the first stream, and the second stream reaches to the > > point checking SND_PCM_DIRECT_MAGIC before the first one finishes the > > initialization. Does a hackish patch like below make any difference? > > No it doesn't, the magic check succeeds. The situation is that the first > one has been open for "a long time" when I open the second one. So I > would think that initialization has been completed properly. If the > second one opens just as the first closes, it might be that the second > assigns (correctly) first_instance = 0, but by the time it reaches the > "if (first_instance)" check, the first one has closed, and left garbage. Actually we have a semaphore before shm access, so the race at two opens shouldn't happen. I noticed it after I sent my previous mail. But the semaphore is taken also at snd_pcm_dmix_close(). So I wonder where the race actually happens. Both open and close must be protected while another stream is opening or closing. Could you try to check where you get the exact error...? Takashi
Den 2016-04-12 skrev Takashi Iwai: > Actually we have a semaphore before shm access, so the race at two > opens shouldn't happen. I noticed it after I sent my previous mail. > > But the semaphore is taken also at snd_pcm_dmix_close(). So I wonder > where the race actually happens. Both open and close must be > protected while another stream is opening or closing. > > Could you try to check where you get the exact error...? The execution tree, as far as I can find is the following: snd_pcm_dmix_open: * Line 1009 snd_pcm_direct_shm_create_or_connect with the code in question, * which returns 0. So we end up in line 1058, dmix->shmtr->use_server is 0, * so go to line 1072.. running: snd_pcm_open_slave, running: snd_pcm_open_named_slave, running: snd_pcm_open_conf * where snd_dlobj_cache_get gives open_func = snd_pcm_hw_open, so snd_pcm_hw_open() snd_open_device("/dev/snd/pcmC0D0p") * where open() returns -1 with errno = EBADFD Regards, Lars
diff --git a/src/pcm/pcm_direct.c b/src/pcm/pcm_direct.c index 14de734d98eb..d0aa4258004f 100644 --- a/src/pcm/pcm_direct.c +++ b/src/pcm/pcm_direct.c @@ -92,6 +92,7 @@ int snd_pcm_direct_shm_create_or_connect(snd_pcm_direct_t *dmix) { struct shmid_ds buf; int tmpid, err, first_instance = 0; + int repeat = 0; retryget: dmix->shmid = shmget(dmix->ipc_key, sizeof(snd_pcm_direct_share_t), @@ -136,6 +137,9 @@ retryget: } else { if (dmix->shmptr->magic != SND_PCM_DIRECT_MAGIC) { snd_pcm_direct_shm_discard(dmix); + /* might be racy, let's retry for a few times */ + if (++repeat < 10) + goto retryget; return -EINVAL; } }