Message ID | 20240223214003.17369-12-kuniyu@amazon.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | af_unix: Rework GC. |
On Fri, 2024-02-23 at 13:40 -0800, Kuniyuki Iwashima wrote:
> The definition of the lowlink in Tarjan's algorithm is the
> smallest index of a vertex that is reachable with at most one
> back-edge in SCC.  This is not useful for a cross-edge.
>
> If we start traversing from A in the following graph, the final
> lowlink of D is 3.  The cross-edge here is one between D and C.
>
>   A -> B -> D   D = (4, 3)  (index, lowlink)
>   ^    |    |   C = (3, 1)
>   |    V    |   B = (2, 1)
>   `--- C <--'   A = (1, 1)
>
> This is because the lowlink of D is updated with the index of C.
>
> In the following patch, we detect a dead SCC by checking two
> conditions for each vertex.
>
>   1) vertex has no edge directed to another SCC (no bridge)
>   2) vertex's out_degree is the same as the refcount of its file
>
> If 1) is false, there is a receiver of all fds of the SCC and
> its ancestor SCC.
>
> To evaluate 1), we need to assign a unique index to each SCC and
> assign it to all vertices in the SCC.
>
> This patch changes the lowlink update logic for cross-edge so
> that in the example above, the lowlink of D is updated with the
> lowlink of C.
>
>   A -> B -> D   D = (4, 1)  (index, lowlink)
>   ^    |    |   C = (3, 1)
>   |    V    |   B = (2, 1)
>   `--- C <--'   A = (1, 1)
>
> Then, all vertices in the same SCC have the same lowlink, and we
> can quickly find the bridge connecting to different SCC if exists.
>
> However, it is no longer called lowlink, so we rename it to
> scc_index.  (It's sometimes called lowpoint.)
>
> Also, we add a global variable to hold the last index used in DFS
> so that we do not reset the initial index in each DFS.
>
> This patch can be squashed to the SCC detection patch but is
> split deliberately for anyone wondering why lowlink is not used
> as used in the original Tarjan's algorithm and many reference
> implementations.
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
>  include/net/af_unix.h |  2 +-
>  net/unix/garbage.c    | 15 ++++++++-------
>  2 files changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/include/net/af_unix.h b/include/net/af_unix.h
> index ec040caaa4b5..696d997a5ac9 100644
> --- a/include/net/af_unix.h
> +++ b/include/net/af_unix.h
> @@ -36,7 +36,7 @@ struct unix_vertex {
>  	struct list_head scc_entry;
>  	unsigned long out_degree;
>  	unsigned long index;
> -	unsigned long lowlink;
> +	unsigned long scc_index;
>  };
>
>  struct unix_edge {
> diff --git a/net/unix/garbage.c b/net/unix/garbage.c
> index 1d9a0498dec5..0eb1610c96d7 100644
> --- a/net/unix/garbage.c
> +++ b/net/unix/garbage.c
> @@ -308,18 +308,18 @@ static bool unix_scc_cyclic(struct list_head *scc)
>
>  static LIST_HEAD(unix_visited_vertices);
>  static unsigned long unix_vertex_grouped_index = UNIX_VERTEX_INDEX_MARK2;
> +static unsigned long unix_vertex_last_index = UNIX_VERTEX_INDEX_START;
>
>  static void __unix_walk_scc(struct unix_vertex *vertex)
>  {
> -	unsigned long index = UNIX_VERTEX_INDEX_START;
>  	LIST_HEAD(vertex_stack);
>  	struct unix_edge *edge;
>  	LIST_HEAD(edge_stack);
>
>  next_vertex:
> -	vertex->index = index;
> -	vertex->lowlink = index;
> -	index++;
> +	vertex->index = unix_vertex_last_index;
> +	vertex->scc_index = unix_vertex_last_index;
> +	unix_vertex_last_index++;
>
>  	list_add(&vertex->scc_entry, &vertex_stack);
>
> @@ -342,13 +342,13 @@ static void __unix_walk_scc(struct unix_vertex *vertex)
>
>  			vertex = edge->predecessor->vertex;
>
> -			vertex->lowlink = min(vertex->lowlink, next_vertex->lowlink);
> +			vertex->scc_index = min(vertex->scc_index, next_vertex->scc_index);
>  		} else if (next_vertex->index != unix_vertex_grouped_index) {
> -			vertex->lowlink = min(vertex->lowlink, next_vertex->index);
> +			vertex->scc_index = min(vertex->scc_index, next_vertex->scc_index);

I guess the above will break when unix_vertex_last_index wraps around,
or am I low on coffee? (I guess there is not such a thing as enough
coffee to allow me reviewing this whole series at once ;)

Can we expect a wrap around in host with (surprisingly very) long
uptimes?

Thanks,

Paolo
From: Paolo Abeni <pabeni@redhat.com>
Date: Tue, 27 Feb 2024 12:19:40 +0100
> On Fri, 2024-02-23 at 13:40 -0800, Kuniyuki Iwashima wrote:
> > [...]
> > @@ -342,13 +342,13 @@ static void __unix_walk_scc(struct unix_vertex *vertex)
> >
> >  			vertex = edge->predecessor->vertex;
> >
> > -			vertex->lowlink = min(vertex->lowlink, next_vertex->lowlink);
> > +			vertex->scc_index = min(vertex->scc_index, next_vertex->scc_index);
> >  		} else if (next_vertex->index != unix_vertex_grouped_index) {
> > -			vertex->lowlink = min(vertex->lowlink, next_vertex->index);
> > +			vertex->scc_index = min(vertex->scc_index, next_vertex->scc_index);
>
> I guess the above will break when unix_vertex_last_index wraps around,
> or am I low on coffee? (I guess there is not such a thing as enough
> coffee to allow me reviewing this whole series at once ;)
>
> Can we expect a wrap around in host with (surprisingly very) long
> uptimes?

Then, the number of inflight AF_UNIX sockets is at least 2^64 - 1.

After this series, struct unix_sock is 1024 bytes, so...

the host would have roughly 2^10 * 2^64 == 2^74 bytes == 2^34 TiB
== 17179869184 TiB memory!

So, we need not expect a wrap around :)
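The estimate above can be checked with a few lines of C. This is only a sketch of the arithmetic in the reply: the 1024-byte size of struct unix_sock and the 64-bit index width are taken from the message, while the macro and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Figures taken from the reply; the names are hypothetical. */
#define UNIX_SOCK_SIZE_LOG2	10	/* 1024 bytes == 2^10 */
#define INDEX_BITS		64	/* unsigned long on a 64-bit host */
#define TIB_LOG2		40	/* 1 TiB == 2^40 bytes */

/* log2 of the memory, in bytes, needed to hold 2^64 inflight sockets */
static unsigned int wrap_memory_log2_bytes(void)
{
	return UNIX_SOCK_SIZE_LOG2 + INDEX_BITS;
}

/* The same amount in TiB; 2^74 bytes does not fit in 64 bits, 2^34 TiB does. */
static uint64_t wrap_memory_tib(void)
{
	unsigned int log2_tib = wrap_memory_log2_bytes() - TIB_LOG2;

	assert(log2_tib < 64);
	return UINT64_C(1) << log2_tib;
}
```

Working with exponents avoids overflowing a 64-bit integer while still reproducing the 2^74 bytes and 17179869184 TiB figures.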
On Tue, 2024-02-27 at 19:05 -0800, Kuniyuki Iwashima wrote:
> From: Paolo Abeni <pabeni@redhat.com>
> Date: Tue, 27 Feb 2024 12:19:40 +0100
> > On Fri, 2024-02-23 at 13:40 -0800, Kuniyuki Iwashima wrote:
> > > [...]
> > > -			vertex->lowlink = min(vertex->lowlink, next_vertex->index);
> > > +			vertex->scc_index = min(vertex->scc_index, next_vertex->scc_index);
> >
> > I guess the above will break when unix_vertex_last_index wraps around,
> > or am I low on coffee? (I guess there is not such a thing as enough
> > coffee to allow me reviewing this whole series at once ;)
> >
> > Can we expect a wrap around in host with (surprisingly very) long
> > uptimes?
>
> Then, the number of inflight AF_UNIX sockets is at least 2^64 - 1.

Isn't "unix_vertex_last_index" value preserved across consecutive gc
run? I thought we could reach wrap around after a lot of gc runs...

Cheers,

Paolo
From: Paolo Abeni <pabeni@redhat.com>
Date: Wed, 28 Feb 2024 08:49:46 +0100
> On Tue, 2024-02-27 at 19:05 -0800, Kuniyuki Iwashima wrote:
> > From: Paolo Abeni <pabeni@redhat.com>
> > Date: Tue, 27 Feb 2024 12:19:40 +0100
> > > On Fri, 2024-02-23 at 13:40 -0800, Kuniyuki Iwashima wrote:
> > > > [...]
> > > > -			vertex->lowlink = min(vertex->lowlink, next_vertex->index);
> > > > +			vertex->scc_index = min(vertex->scc_index, next_vertex->scc_index);
> > >
> > > I guess the above will break when unix_vertex_last_index wraps around,
> > > or am I low on coffee? (I guess there is not such a thing as enough
> > > coffee to allow me reviewing this whole series at once ;)
> > >
> > > Can we expect a wrap around in host with (surprisingly very) long
> > > uptimes?
> >
> > Then, the number of inflight AF_UNIX sockets is at least 2^64 - 1.
>
> Isn't "unix_vertex_last_index" value preserved across consecutive gc
> run? I thought we could reach wrap around after a lot of gc runs...

It's preserved across consecutive DFS in a single gc run, but
unix_walk_scc() always resets it.  So, if it's wrapped, there
would be too many sockets.

I used unix_vertex_last_index elsewhere in the initial draft,
but now a local variable could be better here.
On Wed, 2024-02-28 at 08:25 -0800, Kuniyuki Iwashima wrote:
> From: Paolo Abeni <pabeni@redhat.com>
> Date: Wed, 28 Feb 2024 08:49:46 +0100
> > On Tue, 2024-02-27 at 19:05 -0800, Kuniyuki Iwashima wrote:
> > > [...]
> > > Then, the number of inflight AF_UNIX sockets is at least 2^64 - 1.
> >
> > Isn't "unix_vertex_last_index" value preserved across consecutive gc
> > run? I thought we could reach wrap around after a lot of gc runs...
>
> It's preserved across consecutive DFS in a single gc run, but
> unix_walk_scc() always resets it.  So, if it's wrapped, there
> would be too many sockets.

Ah, I missed that point. No wrap-around problem then!

> I used unix_vertex_last_index elsewhere in the initial draft,
> but now a local variable could be better here.

You could bundle the index, hitlist, etc. in a single struct (gs_state
or whatever) and pass around a single argument, if that helps.

Cheers,

Paolo
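The suggestion above, one state struct passed as a single argument, might look roughly like the userspace sketch below. Everything in it is hypothetical: struct gc_state, its fields, the helper names and VERTEX_INDEX_START are illustrative stand-ins, not code from the series. It also shows the point made earlier in the thread: the index carries over between DFS walks inside one gc run and starts over on the next run.

```c
#include <assert.h>

/* Hypothetical stand-in for UNIX_VERTEX_INDEX_START; the value is arbitrary. */
#define VERTEX_INDEX_START	1UL

/* Hypothetical bundle of per-run state, passed around as one argument. */
struct gc_state {
	unsigned long last_index;	/* next DFS index to hand out */
	unsigned int nr_hit;		/* placeholder for the hitlist */
};

static void gc_state_init(struct gc_state *st)
{
	st->last_index = VERTEX_INDEX_START;	/* reset once per gc run */
	st->nr_hit = 0;
}

/* One DFS that visits nr_vertices vertices and consumes that many indices. */
static void walk_one_dfs(struct gc_state *st, unsigned int nr_vertices)
{
	unsigned int i;

	for (i = 0; i < nr_vertices; i++)
		st->last_index++;
}

/* One gc run: the state is local, so nothing survives into the next run. */
static unsigned long run_gc(unsigned int nr_dfs, unsigned int per_dfs)
{
	struct gc_state st;
	unsigned int i;

	gc_state_init(&st);
	for (i = 0; i < nr_dfs; i++)
		walk_one_dfs(&st, per_dfs);

	return st.last_index;
}
```

With the state local to a run, the index can only wrap if a single run visits about 2^64 vertices, which is the scenario ruled out earlier in the thread.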
diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index ec040caaa4b5..696d997a5ac9 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -36,7 +36,7 @@ struct unix_vertex {
 	struct list_head scc_entry;
 	unsigned long out_degree;
 	unsigned long index;
-	unsigned long lowlink;
+	unsigned long scc_index;
 };
 
 struct unix_edge {
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 1d9a0498dec5..0eb1610c96d7 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -308,18 +308,18 @@ static bool unix_scc_cyclic(struct list_head *scc)
 
 static LIST_HEAD(unix_visited_vertices);
 static unsigned long unix_vertex_grouped_index = UNIX_VERTEX_INDEX_MARK2;
+static unsigned long unix_vertex_last_index = UNIX_VERTEX_INDEX_START;
 
 static void __unix_walk_scc(struct unix_vertex *vertex)
 {
-	unsigned long index = UNIX_VERTEX_INDEX_START;
 	LIST_HEAD(vertex_stack);
 	struct unix_edge *edge;
 	LIST_HEAD(edge_stack);
 
 next_vertex:
-	vertex->index = index;
-	vertex->lowlink = index;
-	index++;
+	vertex->index = unix_vertex_last_index;
+	vertex->scc_index = unix_vertex_last_index;
+	unix_vertex_last_index++;
 
 	list_add(&vertex->scc_entry, &vertex_stack);
 
@@ -342,13 +342,13 @@ static void __unix_walk_scc(struct unix_vertex *vertex)
 
 			vertex = edge->predecessor->vertex;
 
-			vertex->lowlink = min(vertex->lowlink, next_vertex->lowlink);
+			vertex->scc_index = min(vertex->scc_index, next_vertex->scc_index);
 		} else if (next_vertex->index != unix_vertex_grouped_index) {
-			vertex->lowlink = min(vertex->lowlink, next_vertex->index);
+			vertex->scc_index = min(vertex->scc_index, next_vertex->scc_index);
 		}
 	}
 
-	if (vertex->index == vertex->lowlink) {
+	if (vertex->index == vertex->scc_index) {
 		struct list_head scc;
 
 		__list_cut_position(&scc, &vertex_stack, &vertex->scc_entry);
@@ -371,6 +371,7 @@ static void __unix_walk_scc(struct unix_vertex *vertex)
 
 static void unix_walk_scc(void)
 {
+	unix_vertex_last_index = UNIX_VERTEX_INDEX_START;
 	unix_graph_maybe_cyclic = false;
 
 	while (!list_empty(&unix_unvisited_vertices)) {
The definition of the lowlink in Tarjan's algorithm is the
smallest index of a vertex that is reachable with at most one
back-edge in SCC.  This is not useful for a cross-edge.

If we start traversing from A in the following graph, the final
lowlink of D is 3.  The cross-edge here is one between D and C.

  A -> B -> D   D = (4, 3)  (index, lowlink)
  ^    |    |   C = (3, 1)
  |    V    |   B = (2, 1)
  `--- C <--'   A = (1, 1)

This is because the lowlink of D is updated with the index of C.

In the following patch, we detect a dead SCC by checking two
conditions for each vertex.

  1) vertex has no edge directed to another SCC (no bridge)
  2) vertex's out_degree is the same as the refcount of its file

If 1) is false, there is a receiver of all fds of the SCC and
its ancestor SCC.

To evaluate 1), we need to assign a unique index to each SCC and
assign it to all vertices in the SCC.

This patch changes the lowlink update logic for cross-edge so
that in the example above, the lowlink of D is updated with the
lowlink of C.

  A -> B -> D   D = (4, 1)  (index, lowlink)
  ^    |    |   C = (3, 1)
  |    V    |   B = (2, 1)
  `--- C <--'   A = (1, 1)

Then, all vertices in the same SCC have the same lowlink, and we
can quickly find the bridge connecting to different SCC if exists.

However, it is no longer called lowlink, so we rename it to
scc_index.  (It's sometimes called lowpoint.)

Also, we add a global variable to hold the last index used in DFS
so that we do not reset the initial index in each DFS.

This patch can be squashed to the SCC detection patch but is
split deliberately for anyone wondering why lowlink is not used
as used in the original Tarjan's algorithm and many reference
implementations.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 include/net/af_unix.h |  2 +-
 net/unix/garbage.c    | 15 ++++++++-------
 2 files changed, 9 insertions(+), 8 deletions(-)
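The two (index, lowlink) tables in the commit message can be reproduced with a small userspace program. The sketch below is a plain recursive Tarjan walk over the four-vertex example, not the iterative kernel code; struct walk, dfs(), walk_example() and the use_scc_index switch are names invented for this illustration. It assumes B's edge to C is followed before its edge to D, which is what gives C index 3 and D index 4.

```c
#include <assert.h>

#define NV	4	/* vertices A, B, C, D */
#define MAX_OUT	2

enum { A, B, C, D };

struct vertex {
	unsigned long index;	/* 0 means "not visited yet" */
	unsigned long low;	/* lowlink or scc_index, depending on the mode */
	int on_stack;
	int nr_out;
	int out[MAX_OUT];
};

struct walk {
	struct vertex v[NV];
	unsigned long next_index;
	int stack[NV];
	int sp;
	int use_scc_index;	/* 0: classic lowlink update, 1: patched update */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static void add_edge(struct walk *w, int from, int to)
{
	assert(w->v[from].nr_out < MAX_OUT);
	w->v[from].out[w->v[from].nr_out++] = to;
}

static void dfs(struct walk *w, int u)
{
	struct vertex *vu = &w->v[u];
	int i, x;

	vu->index = vu->low = w->next_index++;
	w->stack[w->sp++] = u;
	vu->on_stack = 1;

	for (i = 0; i < vu->nr_out; i++) {
		struct vertex *vn = &w->v[vu->out[i]];

		if (!vn->index) {
			/* Unvisited successor: recurse, then propagate its value. */
			dfs(w, vu->out[i]);
			vu->low = min_ul(vu->low, vn->low);
		} else if (vn->on_stack) {
			/* Visited but not yet grouped: the line the patch changes. */
			vu->low = min_ul(vu->low,
					 w->use_scc_index ? vn->low : vn->index);
		}
	}

	if (vu->index == vu->low) {
		/* u is the root of an SCC: pop the whole component. */
		do {
			x = w->stack[--w->sp];
			w->v[x].on_stack = 0;
		} while (x != u);
	}
}

/* Build the example graph and walk it from A in the chosen mode. */
static void walk_example(struct walk *w, int use_scc_index)
{
	*w = (struct walk){ .next_index = 1, .use_scc_index = use_scc_index };

	add_edge(w, A, B);
	add_edge(w, B, C);	/* followed first, so C gets index 3 */
	add_edge(w, B, D);
	add_edge(w, C, A);
	add_edge(w, D, C);

	dfs(w, A);
}
```

Calling walk_example(&w, 0) leaves D with (4, 3) as in the first table, while walk_example(&w, 1) leaves D with (4, 1), so all four vertices share the value 1 in this example.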