[net-next,v5,0/3] Threads support in proc connector

Message ID 20241017181436.2047508-1-anjali.k.kulkarni@oracle.com (mailing list archive)

Message

Anjali Kulkarni Oct. 17, 2024, 6:14 p.m. UTC
We recently committed a fix that allows processes to receive notifications
for non-zero exits via the process connector module (commit a4c9a56e6a2c).

However, when a thread calls pthread_exit(&exit_status), the kernel is not
aware of the exit status passed to pthread_exit(). The status is delivered
by the exiting thread to the parent process only if the parent is waiting
in pthread_join(). Hence, for a thread exiting abnormally, the kernel
cannot send notifications to any listening processes.

The exception is a thread that is sent a signal it does not handle and
dies along with its process as a result, e.g. SIGSEGV or SIGKILL. In this
case, the kernel is aware of the non-zero exit and sends a notification
for it.

For our use case, the parent cannot wait in pthread_join(). One of the main
reasons is that we do not want to track normal pthread_exit() calls, of
which there could be a very large number. We only want to be notified of
abnormal exits. Hence, threads are created with the pthread_attr_t detach
state set to PTHREAD_CREATE_DETACHED.

To fix this problem, we add a new type, PROC_CN_MCAST_NOTIFY, to the proc
connector API. It allows a thread to send its exit status to the kernel
either when it needs to call pthread_exit() with a non-zero value to
indicate an error, or from a signal handler before calling pthread_exit().

We also need to further filter packets carrying non-zero exit
notifications by instance, and instances can be identified by task name.
Hence, we add a comm field to the packet's struct proc_event, in which
task->comm is stored.

v4->v5 changes:
- Handled comment by Stanislav Fomichev to fix a print format error.
- Made thread.c completely automated by starting the proc_filter program
  from within thread.c.
- Changed name CONFIG_CN_HASH_KUNIT_TEST to CN_HASH_KUNIT_TEST in
  Kconfig.debug and changed display text.
 
v3->v4 changes:
- Reduce size of exit.log by removing unnecessary text.

v2->v3 changes:
- Handled comment by Liam Howlett to set hdev to NULL and add comment on
  it.
- Handled comment by Liam Howlett to combine functions for deleting+get
  and deleting into one in cn_hash.c
- Handled comment by Liam Howlett to remove extern in the functions
  defined in cn_hash_test.h
- Some nits by Liam Howlett fixed.
- Handled comment by Liam Howlett to make threads test automated.
  proc_filter.c creates exit.log, which is read by thread.c and checks
  the values reported.
- Added "comm" field to struct proc_event, to copy the task's name to
  the packet to allow further filtering of packets.

v1->v2 changes:
- Handled comment by Peter Zijlstra to remove locking for PF_EXIT_NOTIFY
  task->flags.
- Added error handling in thread.c

v->v1 changes:
- Handled comment by Simon Horman to remove unused err in cn_proc.c
- Handled comment by Simon Horman to make adata and key_display static
  in cn_hash_test.c

Anjali Kulkarni (3):
  connector/cn_proc: Add hash table for threads
  connector/cn_proc: Kunit tests for threads hash table
  connector/cn_proc: Selftest for threads

 drivers/connector/Makefile                    |   2 +-
 drivers/connector/cn_hash.c                   | 221 +++++++++++++++++
 drivers/connector/cn_proc.c                   |  62 ++++-
 drivers/connector/connector.c                 |  75 +++++-
 include/linux/connector.h                     |  35 +++
 include/linux/sched.h                         |   2 +-
 include/uapi/linux/cn_proc.h                  |   5 +-
 lib/Kconfig.debug                             |  17 ++
 lib/Makefile                                  |   1 +
 lib/cn_hash_test.c                            | 167 +++++++++++++
 lib/cn_hash_test.h                            |  10 +
 tools/testing/selftests/connector/Makefile    |  23 +-
 .../testing/selftests/connector/proc_filter.c |  34 ++-
 tools/testing/selftests/connector/thread.c    | 232 ++++++++++++++++++
 .../selftests/connector/thread_filter.c       |  96 ++++++++
 15 files changed, 967 insertions(+), 15 deletions(-)
 create mode 100644 drivers/connector/cn_hash.c
 create mode 100644 lib/cn_hash_test.c
 create mode 100644 lib/cn_hash_test.h
 create mode 100644 tools/testing/selftests/connector/thread.c
 create mode 100644 tools/testing/selftests/connector/thread_filter.c

Comments

Simon Horman Oct. 18, 2024, 9:49 a.m. UTC | #1
As it seems that there will be another revision anyway,
please run this patch-set through checkpatch with the following arguments.

	./scripts/checkpatch.pl --strict --max-line-length=80

And please fix warnings about alignment and line length.
But please do so in such a way that doesn't reduce readability,
e.g. don't split strings over multiple lines.

Anjali Kulkarni Oct. 18, 2024, 3:31 p.m. UTC | #2
> On Oct 18, 2024, at 2:49 AM, Simon Horman <horms@kernel.org> wrote:
> As it seems that there will be another revision anyway,
> please run this patch-set through checkpatch with the following arguments.
> 
> ./scripts/checkpatch.pl --strict --max-line-length=80
> 
> And please fix warnings about alignment and line length.
> But please do so in such a way that doesn't reduce readability,
> e.g. don't split strings over multiple lines.

Ok thanks, will do. 

Anjali