Message ID: 87ziydvasn.fsf_-_@doppelsaurus.mobileactivedefense.com (mailing list archive)
State: New, archived
On 11/16/2015 05:28 PM, Rainer Weikusat wrote:
> An AF_UNIX datagram socket being the client in an n:1 association with
> some server socket is only allowed to send messages to the server if the
> receive queue of the server socket contains at most sk_max_ack_backlog
> datagrams. This implies that prospective writers might be forced to go
> to sleep even though none of the messages presently enqueued on the
> server receive queue were sent by them. In order to ensure that these
> will be woken up once space again becomes available, the present
> unix_dgram_poll routine does a second sock_poll_wait call with the
> peer_wait wait queue of the server socket as queue argument
> (unix_dgram_recvmsg does a wake up on this queue after a datagram was
> received). This is inherently problematic because the server socket is
> only guaranteed to remain alive for as long as the client still holds a
> reference to it. In case the connection is dissolved via connect or by
> the dead peer detection logic in unix_dgram_sendmsg, the server socket
> may be freed even though "the polling mechanism" (in particular, epoll)
> still holds a pointer to the corresponding peer_wait queue. There's no
> way to forcibly deregister a wait queue with epoll.
>
> Based on an idea by Jason Baron, the patch below changes the code such
> that a wait_queue_t belonging to the client socket is enqueued on the
> peer_wait queue of the server whenever the peer receive queue full
> condition is detected by either a sendmsg or a poll. A wake up on the
> peer queue is then relayed to the ordinary wait queue of the client
> socket via its wake function. The connection to the peer wait queue is
> again dissolved if a wake up is about to be relayed, the client socket
> reconnects, a dead peer is detected, or the client socket is itself
> closed. This enables removing the second sock_poll_wait from
> unix_dgram_poll, thus avoiding the use-after-free, while still ensuring
> that no blocked writer sleeps forever.
>
> Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
> Fixes: ec0d215f9420 ("af_unix: fix 'poll for write'/connected DGRAM sockets")
> ---
>
> Additional remark about "5456f09aaf88/ af_unix: fix unix_dgram_poll()
> behavior for EPOLLOUT event": This shouldn't be an issue anymore with
> this change, even though it restores the "only when writable" behaviour,
> as the wake-up relay will also be set up once _dgram_sendmsg returned
> EAGAIN for a send attempt on an n:1 connected socket.

Hi,

My only comment was about potentially avoiding the double lock in the write path; otherwise this looks ok to me.

Thanks,

-Jason

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Date: Mon, 16 Nov 2015 22:28:40 +0000

> An AF_UNIX datagram socket being the client in an n:1 association with
> some server socket is only allowed to send messages to the server if the
> receive queue of the server socket contains at most sk_max_ack_backlog
> datagrams.

[...]

> This enables removing the second sock_poll_wait from
> unix_dgram_poll, thus avoiding the use-after-free, while still ensuring
> that no blocked writer sleeps forever.
>
> Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
> Fixes: ec0d215f9420 ("af_unix: fix 'poll for write'/connected DGRAM sockets")

So because of a corner case of epoll handling and sender socket release, every single datagram sendmsg has to do a double lock now?

I do not dispute the correctness of your fix at this point, but that added cost in the fast path is really too high.
David Miller <davem@davemloft.net> writes:
> From: Rainer Weikusat <rweikusat@mobileactivedefense.com>
> Date: Mon, 16 Nov 2015 22:28:40 +0000
>
>> An AF_UNIX datagram socket being the client in an n:1 association with
>> some server socket is only allowed to send messages to the server if the
>> receive queue of this socket contains at most sk_max_ack_backlog
>> datagrams.

[...]

>> Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
>> Fixes: ec0d215f9420 ("af_unix: fix 'poll for write'/connected DGRAM sockets")
>
> So because of a corner case of epoll handling and sender socket release,
> every single datagram sendmsg has to do a double lock now?
>
> I do not dispute the correctness of your fix at this point, but that
> added cost in the fast path is really too high.

This leaves only the option of a somewhat incorrect solution, and what is or isn't acceptable in this respect is somewhat difficult to decide. The basic options would be

	- return EAGAIN even if sending became possible (Jason's most
	  recent suggestions)

	- retry sending a limited number of times, e.g., once, before
	  returning EAGAIN, on the grounds that this is nicer to the
	  application and that redoing all the stuff up to the _lock in
	  dgram_sendmsg can possibly/likely be avoided

Which one do you prefer?
Rainer Weikusat <rw@doppelsaurus.mobileactivedefense.com> writes:

[...]

> The basic options would be
>
>	- return EAGAIN even if sending became possible (Jason's most
>	  recent suggestions)
>
>	- retry sending a limited number of times, e.g., once, before
>	  returning EAGAIN, on the grounds that this is nicer to the
>	  application and that redoing all the stuff up to the _lock in
>	  dgram_sendmsg can possibly/likely be avoided

A third option: use trylock to acquire the sk lock. If this succeeds, there's no risk of deadlocking anyone, even when acquiring the locks in the wrong order. This could look as follows (NB: I didn't even compile this, I just wrote the code to get an idea how complicated it would be):

	int need_wakeup;

	[...]

	need_wakeup = 0;
	err = 0;
	if (spin_trylock(&unix_sk(sk)->lock)) {
		if (unix_peer(sk) != other ||
		    unix_dgram_peer_wake_me(sk, other))
			err = -EAGAIN;
	} else {
		err = -EAGAIN;

		unix_state_unlock(other);
		unix_state_lock(sk);

		need_wakeup = unix_peer(sk) != other &&
			      unix_dgram_peer_wake_connect(sk, other) &&
			      sk_receive_queue_len(other) == 0;
	}

	unix_state_unlock(sk);

	if (err) {
		if (need_wakeup)
			wake_up_interruptible_poll(sk_sleep(sk),
						   POLLOUT |
						   POLLWRNORM |
						   POLLWRBAND);

		goto out_free;
	}
Rainer Weikusat <rweikusat@mobileactivedefense.com> writes:

[...]

> This leaves only the option of a somewhat incorrect solution and what is
> or isn't acceptable in this respect is somewhat difficult to decide. The
> basic options would be

[...]

>	- retry sending a limited number of times, e.g., once, before
>	  returning EAGAIN, on the grounds that this is nicer to the
>	  application and that redoing all the stuff up to the _lock in
>	  dgram_sendmsg can possibly/likely be avoided

Since it's better to have a specific example of something, here's another 'code sketch' of this option (hopefully with fewer errors this time; there's an int restart = 0 above):

	if (unix_peer(other) != sk && unix_recvq_full(other)) {
		int need_wakeup;

		[...]

		need_wakeup = 0;
		err = 0;
		unix_state_unlock(other);
		unix_state_lock(sk);

		if (unix_peer(sk) == other) {
			if (++restart == 2) {
				need_wakeup = unix_dgram_peer_wake_connect(sk, other) &&
					      sk_receive_queue_len(other) == 0;
				err = -EAGAIN;
			} else if (unix_dgram_peer_wake_me(sk, other))
				err = -EAGAIN;
		} else
			err = -EAGAIN;

		unix_state_unlock(sk);

		if (err || !restart) {
			if (need_wakeup)
				wake_up_interruptible_poll(sk_sleep(sk),
							   POLLOUT |
							   POLLWRNORM |
							   POLLWRBAND);

			goto out_free;
		}

		goto restart;
	}

I don't particularly like that, either, and to me, the best option seems to be to return the spurious EAGAIN if taking both locks unconditionally is not an option, as that's the simplest choice.
David Miller <davem@davemloft.net> writes:
> From: Rainer Weikusat <rweikusat@mobileactivedefense.com>
> Date: Mon, 16 Nov 2015 22:28:40 +0000
>
>> An AF_UNIX datagram socket being the client in an n:1 [...]
>
> So because of a corner case of epoll handling and sender socket release,
> every single datagram sendmsg has to do a double lock now?
>
> I do not dispute the correctness of your fix at this point, but that
> added cost in the fast path is really too high.

Some more information on this: Running the test program included below 20 times on my 'work' system (otherwise idle, after logging in via a VT with no GUI running; quad-core AMD A10-5700 at 3393.984 MHz) with a patched 4.3 resulted in the following throughput statistics[*]:

	avg		13.617 M/s
	median		13.393 M/s
	max		17.14  M/s
	min		13.047 M/s
	deviation	 0.85

I'll try to post the results for 'unpatched' later as I'm also working on a couple of other things.

[*] I do not use my fingers for counting; hence, these are binary and not decimal units.

------------
#include <inttypes.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>

enum { MSG_SZ = 16, MSGS = 1000000 };

static char msg[MSG_SZ];

static uint64_t tv2u(struct timeval *tv)
{
	uint64_t u;

	u = tv->tv_sec;
	u *= 1000000;
	return u + tv->tv_usec;
}

int main(void)
{
	struct timeval start, stop;
	uint64_t t_diff;
	double rate;
	int sks[2];
	unsigned remain;
	char buf[MSG_SZ];

	socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sks);
	if (fork() == 0) {
		close(*sks);

		gettimeofday(&start, 0);
		while (read(sks[1], buf, sizeof(buf)) > 0);
		gettimeofday(&stop, 0);

		t_diff = tv2u(&stop);
		t_diff -= tv2u(&start);

		rate = MSG_SZ * MSGS;
		rate /= t_diff;
		rate *= 1000000;
		printf("rate %fM/s\n", rate / (1 << 20));
		fflush(stdout);

		_exit(0);
	}

	close(sks[1]);

	remain = MSGS;
	do write(*sks, msg, sizeof(msg)); while (--remain);
	close(*sks);

	wait(NULL);
	return 0;
}
Rainer Weikusat <rw@doppelsaurus.mobileactivedefense.com> writes:

[...]

> Some more information on this: Running the test program included below
> on my 'work' system [...] with a patched 4.3 resulted in the following
> throughput statistics[*]:

Since the results were too variable with only 20 runs, I've also tested this with 100 runs for three kernels: stock 4.3, 4.3 plus the published patch, and 4.3 plus the published patch plus the "just return EAGAIN" modification. The 1st and the 3rd perform about identically for the test program I used (slightly modified version included below); the 2nd is markedly slower. This is most easily visible when grouping the printed data rates (B/s) 'by millions':

stock 4.3
---------
13000000.000-13999999.000	 3 (3%)
14000000.000-14999999.000	82 (82%)
15000000.000-15999999.000	15 (15%)

4.3 + patch
-----------
13000000.000-13999999.000	54 (54%)
14000000.000-14999999.000	35 (35%)
15000000.000-15999999.000	 7 (7%)
16000000.000-16999999.000	 1 (1%)
18000000.000-18999999.000	 1 (1%)
22000000.000-22999999.000	 2 (2%)

4.3 + modified patch
--------------------
13000000.000-13999999.000	 3 (3%)
14000000.000-14999999.000	82 (82%)
15000000.000-15999999.000	14 (14%)
24000000.000-24999999.000	 1 (1%)

IMHO, the 3rd option would be the way to go if it were considered acceptable (i.e., despite returning spurious errors in 'rare cases').

modified test program
=====================
#include <inttypes.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>

enum { MSG_SZ = 16, MSGS = 1000000 };

static char msg[MSG_SZ];

static uint64_t tv2u(struct timeval *tv)
{
	uint64_t u;

	u = tv->tv_sec;
	u *= 1000000;
	return u + tv->tv_usec;
}

int main(void)
{
	struct timeval start, stop;
	uint64_t t_diff;
	double rate;
	int sks[2];
	unsigned remain;
	char buf[MSG_SZ];

	socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sks);
	if (fork() == 0) {
		close(*sks);

		gettimeofday(&start, 0);
		while (read(sks[1], buf, sizeof(buf)) > 0);
		gettimeofday(&stop, 0);

		t_diff = tv2u(&stop);
		t_diff -= tv2u(&start);

		rate = MSG_SZ * MSGS;
		rate /= t_diff;
		rate *= 1000000;
		printf("%f\n", rate);
		fflush(stdout);

		_exit(0);
	}

	close(sks[1]);

	remain = MSGS;
	do write(*sks, msg, sizeof(msg)); while (--remain);
	close(*sks);

	wait(NULL);
	return 0;
}
Rainer Weikusat <rweikusat@mobileactivedefense.com> writes:
> Rainer Weikusat <rw@doppelsaurus.mobileactivedefense.com> writes:
>
> [...]
>
>> The basic options would be
>>
>>	- return EAGAIN even if sending became possible (Jason's most
>>	  recent suggestions)
>>
>>	- retry sending a limited number of times, e.g., once, before
>>	  returning EAGAIN, on the grounds that this is nicer to the
>>	  application and that redoing all the stuff up to the _lock in
>>	  dgram_sendmsg can possibly/likely be avoided
>
> A third option:

A fourth option, and even one that's reasonably simple to implement: in case other became ready during the checks, drop the other lock, do a double lock of sk and other, set a flag variable indicating this, and restart the procedure after the unix_state_lock_other[*], using the value of the flag to lock/unlock sk as needed. Should other still be ready to receive data, execution can then continue with the 'queue it' code, as the other lock was held all the time on this pass. Combined with a few unlikely annotations placed where they're IMHO appropriate, this is speed-wise comparable to the stock kernel.
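Read literally, this fourth option would amount to something like the following control flow in unix_dgram_sendmsg (pseudocode only, not compiled; restart_locked is a name invented here for illustration, the rest follows the posted patch):

```
	restart_locked = 0;
restart:
	/* ... look up / revalidate other ... */
	if (restart_locked)
		unix_state_double_lock(sk, other);
	else
		unix_state_lock(other);

	/* ... permission and dead-peer checks as before ... */

	if (unix_peer(other) != sk && unix_recvq_full(other)) {
		if (!timeo && !restart_locked) {
			/* other looked full: drop its lock, retake both
			 * locks in the right order and recheck */
			unix_state_unlock(other);
			restart_locked = 1;
			goto restart;
		}
		/* both locks held across the recheck: the queue really
		 * is full, so set up the wake-up relay and fail */
		err = -EAGAIN;
		goto out_unlock;
	}

	/* queue the skb; on a restarted pass the other lock was held
	 * the whole time, so the recvq-full test still holds here */
```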
diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index b36d837..2a91a05 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -62,6 +62,7 @@ struct unix_sock {
 #define UNIX_GC_CANDIDATE	0
 #define UNIX_GC_MAYBE_CYCLE	1
 	struct socket_wq	peer_wq;
+	wait_queue_t		peer_wake;
 };
 
 static inline struct unix_sock *unix_sk(const struct sock *sk)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 94f6582..3f4974d 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -326,6 +326,112 @@ found:
 	return s;
 }
 
+/* Support code for asymmetrically connected dgram sockets
+ *
+ * If a datagram socket is connected to a socket not itself connected
+ * to the first socket (eg, /dev/log), clients may only enqueue more
+ * messages if the present receive queue of the server socket is not
+ * "too large". This means there's a second writeability condition
+ * poll and sendmsg need to test. The dgram recv code will do a wake
+ * up on the peer_wait wait queue of a socket upon reception of a
+ * datagram which needs to be propagated to sleeping would-be writers
+ * since these might not have sent anything so far. This can't be
+ * accomplished via poll_wait because the lifetime of the server
+ * socket might be less than that of its clients if these break their
+ * association with it or if the server socket is closed while clients
+ * are still connected to it and there's no way to inform "a polling
+ * implementation" that it should let go of a certain wait queue
+ *
+ * In order to propagate a wake up, a wait_queue_t of the client
+ * socket is enqueued on the peer_wait queue of the server socket
+ * whose wake function does a wake_up on the ordinary client socket
+ * wait queue. This connection is established whenever a write (or
+ * poll for write) hit the flow control condition and broken when the
+ * association to the server socket is dissolved or after a wake up
+ * was relayed.
+ */
+
+static int unix_dgram_peer_wake_relay(wait_queue_t *q, unsigned mode, int flags,
+				      void *key)
+{
+	struct unix_sock *u;
+	wait_queue_head_t *u_sleep;
+
+	u = container_of(q, struct unix_sock, peer_wake);
+
+	__remove_wait_queue(&unix_sk(u->peer_wake.private)->peer_wait,
+			    q);
+	u->peer_wake.private = NULL;
+
+	/* relaying can only happen while the wq still exists */
+	u_sleep = sk_sleep(&u->sk);
+	if (u_sleep)
+		wake_up_interruptible_poll(u_sleep, key);
+
+	return 0;
+}
+
+static int unix_dgram_peer_wake_connect(struct sock *sk, struct sock *other)
+{
+	struct unix_sock *u, *u_other;
+	int rc;
+
+	u = unix_sk(sk);
+	u_other = unix_sk(other);
+	rc = 0;
+	spin_lock(&u_other->peer_wait.lock);
+
+	if (!u->peer_wake.private) {
+		u->peer_wake.private = other;
+		__add_wait_queue(&u_other->peer_wait, &u->peer_wake);
+
+		rc = 1;
+	}
+
+	spin_unlock(&u_other->peer_wait.lock);
+	return rc;
+}
+
+static int unix_dgram_peer_wake_disconnect(struct sock *sk, struct sock *other)
+{
+	struct unix_sock *u, *u_other;
+	int rc;
+
+	u = unix_sk(sk);
+	u_other = unix_sk(other);
+	rc = 0;
+	spin_lock(&u_other->peer_wait.lock);
+
+	if (u->peer_wake.private == other) {
+		__remove_wait_queue(&u_other->peer_wait, &u->peer_wake);
+		u->peer_wake.private = NULL;
+
+		rc = 1;
+	}
+
+	spin_unlock(&u_other->peer_wait.lock);
+	return rc;
+}
+
+/* preconditions:
+ *	- unix_peer(sk) == other
+ *	- association is stable
+ */
+static int unix_dgram_peer_wake_me(struct sock *sk, struct sock *other)
+{
+	int connected;
+
+	connected = unix_dgram_peer_wake_connect(sk, other);
+
+	if (unix_recvq_full(other))
+		return 1;
+
+	if (connected)
+		unix_dgram_peer_wake_disconnect(sk, other);
+
+	return 0;
+}
+
 static inline int unix_writable(struct sock *sk)
 {
 	return (atomic_read(&sk->sk_wmem_alloc) << 2) <= sk->sk_sndbuf;
@@ -430,6 +536,8 @@ static void unix_release_sock(struct sock *sk, int embrion)
 			skpair->sk_state_change(skpair);
 			sk_wake_async(skpair, SOCK_WAKE_WAITD, POLL_HUP);
 		}
+
+		unix_dgram_peer_wake_disconnect(sk, skpair);
 		sock_put(skpair); /* It may now die */
 		unix_peer(sk) = NULL;
 	}
@@ -664,6 +772,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern)
 	INIT_LIST_HEAD(&u->link);
 	mutex_init(&u->readlock); /* single task reading lock */
 	init_waitqueue_head(&u->peer_wait);
+	init_waitqueue_func_entry(&u->peer_wake, unix_dgram_peer_wake_relay);
 	unix_insert_socket(unix_sockets_unbound(sk), sk);
 out:
 	if (sk == NULL)
@@ -1031,6 +1140,13 @@ restart:
 	if (unix_peer(sk)) {
 		struct sock *old_peer = unix_peer(sk);
 		unix_peer(sk) = other;
+
+		if (unix_dgram_peer_wake_disconnect(sk, old_peer))
+			wake_up_interruptible_poll(sk_sleep(sk),
+						   POLLOUT |
+						   POLLWRNORM |
+						   POLLWRBAND);
+
 		unix_state_double_unlock(sk, other);
 
 		if (other != old_peer)
@@ -1548,7 +1664,7 @@ restart:
 		goto out_free;
 	}
 
-	unix_state_lock(other);
+	unix_state_double_lock(sk, other);
 	err = -EPERM;
 	if (!unix_may_send(sk, other))
 		goto out_unlock;
@@ -1562,9 +1678,15 @@ restart:
 		sock_put(other);
 		err = 0;
 
-		unix_state_lock(sk);
 		if (unix_peer(sk) == other) {
 			unix_peer(sk) = NULL;
+
+			if (unix_dgram_peer_wake_disconnect(sk, other))
+				wake_up_interruptible_poll(sk_sleep(sk),
+							   POLLOUT |
+							   POLLWRNORM |
+							   POLLWRBAND);
+
 			unix_state_unlock(sk);
 
 			unix_dgram_disconnected(sk, other);
@@ -1591,20 +1713,27 @@ restart:
 	}
 
 	if (unix_peer(other) != sk && unix_recvq_full(other)) {
-		if (!timeo) {
-			err = -EAGAIN;
-			goto out_unlock;
-		}
+		if (timeo) {
+			unix_state_unlock(sk);
 
-		timeo = unix_wait_for_peer(other, timeo);
+			timeo = unix_wait_for_peer(other, timeo);
 
-		err = sock_intr_errno(timeo);
-		if (signal_pending(current))
-			goto out_free;
+			err = sock_intr_errno(timeo);
+			if (signal_pending(current))
+				goto out_free;
 
-		goto restart;
+			goto restart;
+		}
+
+		if (unix_peer(sk) != other ||
+		    unix_dgram_peer_wake_me(sk, other)) {
+			err = -EAGAIN;
+			goto out_unlock;
+		}
 	}
 
+	unix_state_unlock(sk);
+
 	if (sock_flag(other, SOCK_RCVTSTAMP))
 		__net_timestamp(skb);
 	maybe_add_creds(skb, sock, other);
@@ -1618,7 +1747,7 @@ restart:
 	return len;
 
 out_unlock:
-	unix_state_unlock(other);
+	unix_state_double_unlock(sk, other);
 out_free:
 	kfree_skb(skb);
 out:
@@ -2453,14 +2582,16 @@ static unsigned int unix_dgram_poll(struct file *file, struct socket *sock,
 		return mask;
 
 	writable = unix_writable(sk);
-	other = unix_peer_get(sk);
-	if (other) {
-		if (unix_peer(other) != sk) {
-			sock_poll_wait(file, &unix_sk(other)->peer_wait, wait);
-			if (unix_recvq_full(other))
-				writable = 0;
-		}
-		sock_put(other);
+	if (writable) {
+		unix_state_lock(sk);
+
+		other = unix_peer(sk);
+		if (other && unix_peer(other) != sk &&
+		    unix_recvq_full(other) &&
+		    unix_dgram_peer_wake_me(sk, other))
+			writable = 0;
+
+		unix_state_unlock(sk);
 	}
 
 	if (writable)
An AF_UNIX datagram socket being the client in an n:1 association with some server socket is only allowed to send messages to the server if the receive queue of the server socket contains at most sk_max_ack_backlog datagrams. This implies that prospective writers might be forced to go to sleep even though none of the messages presently enqueued on the server receive queue were sent by them. In order to ensure that these will be woken up once space again becomes available, the present unix_dgram_poll routine does a second sock_poll_wait call with the peer_wait wait queue of the server socket as queue argument (unix_dgram_recvmsg does a wake up on this queue after a datagram was received). This is inherently problematic because the server socket is only guaranteed to remain alive for as long as the client still holds a reference to it. In case the connection is dissolved via connect or by the dead peer detection logic in unix_dgram_sendmsg, the server socket may be freed even though "the polling mechanism" (in particular, epoll) still holds a pointer to the corresponding peer_wait queue. There's no way to forcibly deregister a wait queue with epoll.

Based on an idea by Jason Baron, the patch below changes the code such that a wait_queue_t belonging to the client socket is enqueued on the peer_wait queue of the server whenever the peer receive queue full condition is detected by either a sendmsg or a poll. A wake up on the peer queue is then relayed to the ordinary wait queue of the client socket via its wake function. The connection to the peer wait queue is again dissolved if a wake up is about to be relayed, the client socket reconnects, a dead peer is detected, or the client socket is itself closed. This enables removing the second sock_poll_wait from unix_dgram_poll, thus avoiding the use-after-free, while still ensuring that no blocked writer sleeps forever.

Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Fixes: ec0d215f9420 ("af_unix: fix 'poll for write'/connected DGRAM sockets")
---

Additional remark about "5456f09aaf88/ af_unix: fix unix_dgram_poll() behavior for EPOLLOUT event": This shouldn't be an issue anymore with this change, even though it restores the "only when writable" behaviour, as the wake-up relay will also be set up once _dgram_sendmsg returned EAGAIN for a send attempt on an n:1 connected socket.