[v6,1/2] net/handshake: Create a NETLINK service for handling handshake requests

Message ID: 167786949141.7199.15896224944077004509.stgit@91.116.238.104.host.secureserver.net
State: Changes Requested
Delegated to: Netdev Maintainers
Series: Another crack at a handshake upcall mechanism

Checks

Context                         Check    Description
netdev/series_format            warning  Target tree name not specified in the subject
netdev/tree_selection           success  Guessed tree name to be net-next, async
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 4553 this patch: 4553
netdev/cc_maintainers           warning  7 maintainers not CCed: mhiramat@kernel.org
                                         linux-trace-kernel@vger.kernel.org davem@davemloft.net
                                         corbet@lwn.net chuck.lever@oracle.com rostedt@goodmis.org
                                         linux-doc@vger.kernel.org
netdev/build_clang              success  Errors and warnings before: 1067 this patch: 1067
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 4765 this patch: 4765
netdev/checkpatch               warning  CHECK: Alignment should match open parenthesis
                                         CHECK: Lines should not end with a '('
                                         CHECK: Please don't use multiple blank lines
                                         CHECK: extern prototypes should be avoided in .h files
                                         WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
                                         WARNING: line length of 81 exceeds 80 columns
                                         WARNING: networking block comments don't use an empty /* line, use /* Comment...
netdev/kdoc                     fail     Errors and warnings before: 0 this patch: 5
netdev/source_inline            success  Was 0 now: 0

Commit Message

Chuck Lever March 3, 2023, 6:51 p.m. UTC
From: Chuck Lever <chuck.lever@oracle.com>

When a kernel consumer needs a transport layer security session, it
first needs a handshake to negotiate and establish a session. This
negotiation can be done in user space via one of the several
existing library implementations, or it can be done in the kernel.

No in-kernel handshake implementations yet exist. In their absence,
we add a netlink service that can:

a. Notify a user space daemon that a handshake is needed.

b. Once notified, the daemon calls the kernel back via this
   netlink service to get the handshake parameters, including an
   open socket on which to establish the session.

c. Once the handshake is complete, the daemon reports the
   session status and other information via a second netlink
   operation. This operation marks that it is safe for the
   kernel to use the open socket and the security session
   established there.
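
Steps a through c can be read as a small per-request state machine. The
following is only a user-space sketch of that reading, not code from this
patch: the state names (REQ_PENDING, REQ_ACCEPTED, REQ_DONE) and the
req_*() helpers are hypothetical and exist purely for illustration.

```c
#include <stdbool.h>

/* Hypothetical model of one handshake request; not the kernel's struct. */
enum req_state {
	REQ_PENDING,	/* step a: daemon notified, request waiting */
	REQ_ACCEPTED,	/* step b: daemon fetched parameters and socket */
	REQ_DONE,	/* step c: daemon reported the session status */
};

struct req_model {
	enum req_state state;
	int status;	/* 0 on success, negative errno otherwise */
};

/* Step a: the kernel queues a request and notifies the daemon. */
static void req_submit(struct req_model *req)
{
	req->state = REQ_PENDING;
	req->status = 0;
}

/* Step b: ACCEPT is valid only for a request that is still pending. */
static bool req_accept(struct req_model *req)
{
	if (req->state != REQ_PENDING)
		return false;
	req->state = REQ_ACCEPTED;
	return true;
}

/* Step c: DONE is valid only after ACCEPT; it records the outcome. */
static bool req_done(struct req_model *req, int status)
{
	if (req->state != REQ_ACCEPTED)
		return false;
	req->state = REQ_DONE;
	req->status = status;
	return true;
}

/* The kernel may touch the socket again only after a successful DONE. */
static bool req_socket_usable(const struct req_model *req)
{
	return req->state == REQ_DONE && req->status == 0;
}
```

In this model a DONE that arrives before ACCEPT is rejected, and the socket
counts as usable only once DONE has reported success.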

The notification service uses a multicast group. Each handshake
mechanism (e.g., tlshd) adopts its own group number so that the
handshake services are completely independent of one another. The
kernel can then tell via netlink_has_listeners() whether a handshake
service is active and prepared to handle a handshake request.

A new netlink operation, ACCEPT, acts like accept(2) in that it
instantiates a file descriptor in the user space daemon's fd table.
If this operation is successful, the reply carries the fd number,
which can be treated as an open and ready file descriptor.

While user space is performing the handshake, the kernel keeps its
muddy paws off the open socket. A second new netlink operation,
DONE, indicates that the user space daemon is finished with the
socket and it is safe for the kernel to use again. The operation
also indicates whether a session was established successfully.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 Documentation/netlink/specs/handshake.yaml |  137 +++++++++++
 include/net/handshake.h                    |   46 ++++
 include/net/net_namespace.h                |    5 
 include/net/sock.h                         |    1 
 include/trace/events/handshake.h           |  159 +++++++++++++
 include/uapi/linux/handshake.h             |   71 ++++++
 net/Makefile                               |    1 
 net/handshake/Makefile                     |   11 +
 net/handshake/handshake.h                  |   41 +++
 net/handshake/netlink.c                    |  345 ++++++++++++++++++++++++++++
 net/handshake/request.c                    |  246 ++++++++++++++++++++
 net/handshake/trace.c                      |   17 +
 12 files changed, 1080 insertions(+)
 create mode 100644 Documentation/netlink/specs/handshake.yaml
 create mode 100644 include/net/handshake.h
 create mode 100644 include/trace/events/handshake.h
 create mode 100644 include/uapi/linux/handshake.h
 create mode 100644 net/handshake/Makefile
 create mode 100644 net/handshake/handshake.h
 create mode 100644 net/handshake/netlink.c
 create mode 100644 net/handshake/request.c
 create mode 100644 net/handshake/trace.c

Comments

Jakub Kicinski March 4, 2023, 2:21 a.m. UTC | #1
On Fri, 03 Mar 2023 13:51:31 -0500 Chuck Lever wrote:
> +operations:
> +  list:
> +    -
> +      name: ready
> +      doc: Notify handlers that a new handshake request is waiting
> +      value: 1

FWIW the value: 1 is now the default for attr sets and ops, so you can
drop it in v7 if you want.

> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
> index 78beaa765c73..a0ce9de4dab1 100644
> --- a/include/net/net_namespace.h
> +++ b/include/net/net_namespace.h
> @@ -188,6 +188,11 @@ struct net {
>  #if IS_ENABLED(CONFIG_SMC)
>  	struct netns_smc	smc;
>  #endif
> +
> +	/* transport layer security handshake requests */
> +	spinlock_t		hs_lock;
> +	struct list_head	hs_requests;
> +	int			hs_pending;

Do we need this statically here? Can you use .id and .size of
pernet_operations and then net_generic() to access?

Also spinlock_t is 4B, right? So it'd be better for packing
to put it next to hs_pending.
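
To illustrate the packing point, here is a user-space sketch that compares
the two field orders. The stand-in types are assumptions: spinlock_model_t
models a 4-byte spinlock_t (which would not hold with lock debugging
enabled), and list_head_model mimics a two-pointer list head. None of these
names come from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for kernel types; the sizes are assumptions, not guarantees. */
struct list_head_model {
	struct list_head_model *next, *prev;
};
typedef uint32_t spinlock_model_t;	/* assumes a 4-byte spinlock_t */

/* Field order as posted: 4-byte lock, then pointers, then 4-byte int. */
struct hs_fields_posted {
	spinlock_model_t	hs_lock;
	struct list_head_model	hs_requests;
	int			hs_pending;
};

/* Suggested order: the two 4-byte members sit next to each other. */
struct hs_fields_packed {
	struct list_head_model	hs_requests;
	spinlock_model_t	hs_lock;
	int			hs_pending;
};

/* Bytes saved by reordering; depends on the ABI's pointer alignment. */
static size_t hs_bytes_saved(void)
{
	return sizeof(struct hs_fields_posted) -
	       sizeof(struct hs_fields_packed);
}
```

With 8-byte pointers the posted order leaves a padding hole after the lock
and another after the counter; placing the two 4-byte members together
closes both.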

>  } __randomize_layout;
>  
>  #include <linux/seq_file_net.h>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 573f2bf7e0de..2a7345ce2540 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -519,6 +519,7 @@ struct sock {
>  
>  	struct socket		*sk_socket;
>  	void			*sk_user_data;
> +	void			*sk_handshake_req;

Additions to core structures need an #ifdef I reckon.
Preferably put the pointer in a hashtable, there will
likely be relatively few sockets in a system with a req
outstanding. Not to mention distro kernels which will have
to burn 8B whether the feature is used or not.
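
A rough user-space sketch of the hashtable idea follows: requests are found
by the address of their socket instead of through a pointer stored in every
struct sock. All names (hs_req_model, hs_table, HS_BUCKETS and the hs_*()
helpers) are hypothetical. A kernel version would presumably build on
<linux/hashtable.h> or rhashtable and would need locking, both of which this
sketch omits.

```c
#include <stddef.h>
#include <stdint.h>

#define HS_BUCKETS 64	/* hypothetical size; must be a power of two */

/* Hypothetical request record, keyed by the address of its socket. */
struct hs_req_model {
	const void		*sk;	/* lookup key */
	struct hs_req_model	*next;	/* bucket chain */
};

static struct hs_req_model *hs_table[HS_BUCKETS];

/* Mix the pointer bits so aligned addresses spread across buckets. */
static size_t hs_bucket(const void *sk)
{
	uintptr_t v = (uintptr_t)sk;

	v ^= v >> 7;
	v ^= v >> 13;
	return (size_t)(v & (HS_BUCKETS - 1));
}

/* Record that req->sk has an outstanding handshake request. */
static void hs_insert(struct hs_req_model *req)
{
	size_t b = hs_bucket(req->sk);

	req->next = hs_table[b];
	hs_table[b] = req;
}

/* Return the request for @sk, or NULL if none is outstanding. */
static struct hs_req_model *hs_lookup(const void *sk)
{
	struct hs_req_model *req;

	for (req = hs_table[hs_bucket(sk)]; req; req = req->next)
		if (req->sk == sk)
			return req;
	return NULL;
}

/* Unlink the request for @sk; returns it, or NULL if not found. */
static struct hs_req_model *hs_remove(const void *sk)
{
	struct hs_req_model **pp = &hs_table[hs_bucket(sk)];

	for (; *pp; pp = &(*pp)->next) {
		if ((*pp)->sk == sk) {
			struct hs_req_model *req = *pp;

			*pp = req->next;
			req->next = NULL;
			return req;
		}
	}
	return NULL;
}
```

Only sockets with a request outstanding cost any memory here, which is the
trade-off the comment above argues for.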

> +static int handshake_status_reply(struct sk_buff *skb, struct genl_info *gi,
> +				  int status)
> +{
> +	struct nlmsghdr *hdr;
> +	struct sk_buff *msg;
> +	int ret;
> +
> +	ret = -ENOMEM;
> +	msg = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> +	if (!msg)
> +		goto out;
> +	hdr = handshake_genl_put(msg, gi);
> +	if (!hdr)
> +		goto out_free;
> +
> +	ret = -EMSGSIZE;
> +	ret = nla_put_u32(msg, HANDSHAKE_A_ACCEPT_STATUS, status);
> +	if (ret < 0)
> +		goto out_free;
> +
> +	genlmsg_end(msg, hdr);
> +	return genlmsg_reply(msg, gi);
> +
> +out_free:
> +	genlmsg_cancel(msg, hdr);
> +out:
> +	return ret;
> +}

Why implement a full reply to return errno? The normal Netlink ACK
carries errno, you can simply return an error from the .doit().

> +static int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *gi)
> +{
> +	struct nlattr *tb[HANDSHAKE_A_ACCEPT_MAX + 1];
> +	struct net *net = sock_net(skb->sk);
> +	struct handshake_req *pos, *req;
> +	int fd, err;
> +
> +	err = -EINVAL;
> +	if (genlmsg_parse(nlmsg_hdr(skb), &handshake_genl_family, tb,
> +			  HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
> +			  handshake_accept_nl_policy, NULL))
> +		goto out_status;

gi->attrs has the attributes already parsed and ready to use!

BTW would you mind sed'ing /gi/info/ on the patches?
That's the most common variable name for struct genl_info.

> +	if (!tb[HANDSHAKE_A_ACCEPT_HANDLER_CLASS])
> +		goto out_status;

Shouldn't that be an error (checked with GENL_REQ_ATTR_CHECK())?

> +	req = NULL;
> +	spin_lock(&net->hs_lock);
> +	list_for_each_entry(pos, &net->hs_requests, hr_list) {
> +		if (pos->hr_proto->hp_handler_class !=
> +		    nla_get_u32(tb[HANDSHAKE_A_ACCEPT_HANDLER_CLASS]))

Maybe let's store this to a local variable to avoid long lines.

> +			continue;
> +		__remove_pending_locked(net, pos);
> +		req = pos;
> +		break;
> +	}
> +	spin_unlock(&net->hs_lock);
> +	if (!req)
> +		goto out_status;
> +
> +	fd = handshake_dup(req->hr_sock);
> +	if (fd < 0) {
> +		err = fd;
> +		goto out_complete;
> +	}
> +	err = req->hr_proto->hp_accept(req, gi, fd);
> +	if (err)
> +		goto out_complete;
> +
> +	trace_handshake_cmd_accept(net, req, req->hr_sock, fd);
> +	return 0;
> +
> +out_complete:
> +	handshake_complete(req, -EIO, NULL);
> +	fput(req->hr_sock->file);
> +out_status:
> +	trace_handshake_cmd_accept_err(net, req, NULL, err);
> +	return handshake_status_reply(skb, gi, err);
> +}
> +
> +static const struct nla_policy
> +handshake_done_nl_policy[HANDSHAKE_A_DONE_MAX + 1] = {
> +	[HANDSHAKE_A_DONE_SOCKFD] = { .type = NLA_U32, },
> +	[HANDSHAKE_A_DONE_STATUS] = { .type = NLA_U32, },
> +	[HANDSHAKE_A_DONE_REMOTE_AUTH] = { .type = NLA_U32, },
> +};

> +static const struct genl_split_ops handshake_nl_ops[] = {
> +	{
> +		.cmd		= HANDSHAKE_CMD_ACCEPT,
> +		.doit		= handshake_nl_accept_doit,
> +		.policy		= handshake_accept_nl_policy,
> +		.maxattr	= HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
> +		.flags		= GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
> +	},
> +	{
> +		.cmd		= HANDSHAKE_CMD_DONE,
> +		.doit		= handshake_nl_done_doit,
> +		.policy		= handshake_done_nl_policy,
> +		.maxattr	= HANDSHAKE_A_DONE_MAX,
> +		.flags		= GENL_CMD_CAP_DO,
> +	},
> +};
> +
> +static const struct genl_multicast_group handshake_nl_mcgrps[] = {
> +	[HANDSHAKE_HANDLER_CLASS_NONE] = { .name = HANDSHAKE_MCGRP_NONE, },
> +};
> +
> +static struct genl_family __ro_after_init handshake_genl_family = {
> +	.hdrsize		= 0,
> +	.name			= HANDSHAKE_FAMILY_NAME,
> +	.version		= HANDSHAKE_FAMILY_VERSION,
> +	.netnsok		= true,
> +	.parallel_ops		= true,
> +	.n_mcgrps		= ARRAY_SIZE(handshake_nl_mcgrps),
> +	.n_split_ops		= ARRAY_SIZE(handshake_nl_ops),
> +	.split_ops		= handshake_nl_ops,
> +	.mcgrps			= handshake_nl_mcgrps,
> +	.module			= THIS_MODULE,
> +};

You're not auto-generating the family, ops, and policies?
Any reason?

> +static void __net_exit handshake_net_exit(struct net *net)
> +{
> +	struct handshake_req *req;
> +	LIST_HEAD(requests);
> +
> +	/*
> +	 * This drains the net's pending list. Requests that
> +	 * have been accepted and are in progress will be
> +	 * destroyed when the socket is closed.
> +	 */
> +	spin_lock(&net->hs_lock);
> +	list_splice_init(&requests, &net->hs_requests);

What about new requests getting queued?

> +	spin_unlock(&net->hs_lock);
> +
> +	while (!list_empty(&requests)) {
> +		req = list_first_entry(&requests, struct handshake_req, hr_list);
> +		list_del(&req->hr_list);
> +
> +		/*
> +		 * Requests on this list have not yet been
> +		 * accepted, so they do not have an fd to put.
> +		 */
> +
> +		handshake_complete(req, -ETIMEDOUT, NULL);
> +	}
> +}
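
One possible answer to the question about new requests, sketched in user
space: mark the namespace as draining before taking the pending requests,
and have the submit path check that mark. This is an assumption about how
the race could be closed, not something the posted patch does. The names
hs_net_model, hs_submit() and hs_drain() are hypothetical, and a kernel
version would make both paths run under hs_lock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-namespace state; the kernel would guard it with hs_lock. */
struct hs_net_model {
	bool	draining;	/* set once namespace exit has begun */
	size_t	pending;	/* number of queued, unaccepted requests */
};

/* Submit path: refuse to queue once draining has started. */
static bool hs_submit(struct hs_net_model *hn)
{
	if (hn->draining)
		return false;
	hn->pending++;
	return true;
}

/*
 * Exit path: set the flag before taking the queued requests so that a
 * later hs_submit() cannot slip a request in behind the drain.
 * Returns how many requests the caller must now complete with an error.
 */
static size_t hs_drain(struct hs_net_model *hn)
{
	size_t taken;

	hn->draining = true;
	taken = hn->pending;
	hn->pending = 0;
	return taken;
}
```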

> +/**
> + * handshake_req_alloc - consumer API to allocate a request
> + * @sock: open socket on which to perform a handshake
> + * @proto: security protocol
> + * @flags: memory allocation flags
> + *
> + * Returns an initialized handshake_req or NULL.
> + */
> +struct handshake_req *handshake_req_alloc(struct socket *sock,
> +					  const struct handshake_proto *proto,
> +					  gfp_t flags)
> +{
> +	struct handshake_req *req;
> +
> +	/* Avoid accessing uninitialized global variables later on */
> +	if (!handshake_genl_inited)
> +		return NULL;
> +
> +	req = kzalloc(sizeof(*req) + proto->hp_privsize, flags);

Go to the next comment, then come back ...

... and then here you can use struct_size(req, priv, proto->hp_privsize)
to avoid false positive "this addition may overflow" patches.

> +	if (!req)
> +		return NULL;
> +
> +	sock_hold(sock->sk);
> +
> +	INIT_LIST_HEAD(&req->hr_list);
> +	req->hr_sock = sock;
> +	req->hr_proto = proto;
> +	return req;
> +}
> +EXPORT_SYMBOL(handshake_req_alloc);
> +
> +/**
> + * handshake_req_private - consumer API to return per-handshake private data
> + * @req: handshake arguments
> + *
> + */
> +void *handshake_req_private(struct handshake_req *req)
> +{
> +	return (void *)(req + 1);

IDK if this is not going to run afoul of the new object size checks
from Kees. You may be better off adding a flex array member to req
(char priv[]) and returning it here. (go back up)

> +}
> +EXPORT_SYMBOL(handshake_req_private);
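
The two linked comments above (the flexible array member and struct_size())
can be sketched together in user space. This is a minimal illustration under
assumptions: hs_req_flex and the hs_req_*() helpers are hypothetical, and
hs_req_size() is only a stand-in that mimics the saturating behaviour of the
kernel's struct_size() helper.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical request layout with a trailing flexible array member. */
struct hs_req_flex {
	int	hr_flags;
	char	priv[];		/* per-consumer private data */
};

/*
 * User-space stand-in for the kernel's struct_size(): header size plus
 * @privsize bytes, saturating to SIZE_MAX if the addition would wrap.
 */
static size_t hs_req_size(size_t privsize)
{
	if (privsize > SIZE_MAX - sizeof(struct hs_req_flex))
		return SIZE_MAX;
	return sizeof(struct hs_req_flex) + privsize;
}

/* Allocate a zeroed request; fails cleanly when the size saturated. */
static struct hs_req_flex *hs_req_alloc(size_t privsize)
{
	size_t size = hs_req_size(privsize);

	if (size == SIZE_MAX)
		return NULL;
	return calloc(1, size);
}

/* Return the private area by name instead of computing (req + 1). */
static void *hs_req_private(struct hs_req_flex *req)
{
	return req->priv;
}
```

Naming the trailing storage tells the compiler and any object-size checker
where the private area starts, which pointer arithmetic on (req + 1) does not.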

> +/**
> + * handshake_req_cancel - consumer API to cancel an in-progress handshake
> + * @sock: socket on which there is an ongoing handshake
> + *
> + * XXX: Perhaps killing the user space agent might also be necessary?
> + *
> + * Request cancellation races with request completion. To determine
> + * who won, callers examine the return value from this function.
> + *
> + * Return values:
> + *   %true - Uncompleted handshake request was canceled or not found
> + *   %false - Handshake request already completed
> + */
> +bool handshake_req_cancel(struct socket *sock)
> +{
> +	struct handshake_req *req;
> +	struct sock *sk;
> +	struct net *net;
> +
> +	if (!sock)
> +		return true;

Is there a strong reason to check the input here?

> +	sk = sock->sk;
> +	req = sk->sk_handshake_req;
> +	net = sock_net(sk);
> +
> +	if (!req) {
> +		trace_handshake_cancel_none(net, req, sock);
> +		return true;
> +	}
> +
> +	if (remove_pending(net, req)) {
> +		/* Request hadn't been accepted */
> +		trace_handshake_cancel(net, req, sock);
> +		return true;
> +	}
> +	if (test_and_set_bit(HANDSHAKE_F_COMPLETED, &req->hr_flags)) {
> +		/* Request already completed */
> +		trace_handshake_cancel_busy(net, req, sock);
> +		return false;
> +	}
> +
> +	__sock_put(sk);
> +	trace_handshake_cancel(net, req, sock);
> +	return true;
> +}
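
The cancel-versus-completion race described in the kernel-doc above rests on
an atomic test-and-set: whichever path sets the completed flag first wins.
The following user-space sketch uses C11 atomic_flag in place of the kernel's
test_and_set_bit(). The struct and function names are hypothetical, and the
pending-list check that the real function performs first is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical request with a single "completed" flag. */
struct hs_req_race {
	atomic_flag completed;
};

static void hs_req_race_init(struct hs_req_race *req)
{
	atomic_flag_clear(&req->completed);
}

/*
 * Atomically set the flag and report whether this caller set it first.
 * Exactly one caller can win, however cancel and complete interleave.
 */
static bool hs_claim(struct hs_req_race *req)
{
	return !atomic_flag_test_and_set(&req->completed);
}

/* Completion path: returns true if it won and may report the result. */
static bool hs_complete(struct hs_req_race *req)
{
	return hs_claim(req);
}

/* Cancel path: true means canceled, false means already completed. */
static bool hs_cancel(struct hs_req_race *req)
{
	return hs_claim(req);
}
```

The return value of hs_cancel() mirrors the documented contract: true when
the cancel won, false when completion got there first.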

Chuck Lever March 4, 2023, 5:25 p.m. UTC | #2
> On Mar 3, 2023, at 9:21 PM, Jakub Kicinski <kuba@kernel.org> wrote:
> 
> On Fri, 03 Mar 2023 13:51:31 -0500 Chuck Lever wrote:
> 
>> +static const struct genl_split_ops handshake_nl_ops[] = {
>> +	{
>> +		.cmd		= HANDSHAKE_CMD_ACCEPT,
>> +		.doit		= handshake_nl_accept_doit,
>> +		.policy		= handshake_accept_nl_policy,
>> +		.maxattr	= HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
>> +		.flags		= GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
>> +	},
>> +	{
>> +		.cmd		= HANDSHAKE_CMD_DONE,
>> +		.doit		= handshake_nl_done_doit,
>> +		.policy		= handshake_done_nl_policy,
>> +		.maxattr	= HANDSHAKE_A_DONE_MAX,
>> +		.flags		= GENL_CMD_CAP_DO,
>> +	},
>> +};
>> +
>> +static const struct genl_multicast_group handshake_nl_mcgrps[] = {
>> +	[HANDSHAKE_HANDLER_CLASS_NONE] = { .name = HANDSHAKE_MCGRP_NONE, },
>> +};
>> +
>> +static struct genl_family __ro_after_init handshake_genl_family = {
>> +	.hdrsize		= 0,
>> +	.name			= HANDSHAKE_FAMILY_NAME,
>> +	.version		= HANDSHAKE_FAMILY_VERSION,
>> +	.netnsok		= true,
>> +	.parallel_ops		= true,
>> +	.n_mcgrps		= ARRAY_SIZE(handshake_nl_mcgrps),
>> +	.n_split_ops		= ARRAY_SIZE(handshake_nl_ops),
>> +	.split_ops		= handshake_nl_ops,
>> +	.mcgrps			= handshake_nl_mcgrps,
>> +	.module			= THIS_MODULE,
>> +};
> 
> You're not auto-generating the family, ops, and policies?
> Any reason?

I couldn't find a way to have the generated source appear
in the middle of a source file. But I see that's not the
way others are doing it, so I have added separate files
under net/handshake for the generated source and header
material. Two things, though:

1. I don't see a generated struct genl_family.

2. The SPDX tag in the generated source files is "BSD
   3-clause", but the tag in my spec is "GPL-2.0 with
   syscall note". Oddly, the generated uapi header still
   has the latter (correct) tag.

--
Chuck Lever

Chuck Lever March 4, 2023, 5:44 p.m. UTC | #3
> On Mar 4, 2023, at 12:25 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
> 
> 
> 
>> On Mar 3, 2023, at 9:21 PM, Jakub Kicinski <kuba@kernel.org> wrote:
>> 
>> On Fri, 03 Mar 2023 13:51:31 -0500 Chuck Lever wrote:
>> 
>>> +static const struct genl_split_ops handshake_nl_ops[] = {
>>> +	{
>>> +		.cmd		= HANDSHAKE_CMD_ACCEPT,
>>> +		.doit		= handshake_nl_accept_doit,
>>> +		.policy		= handshake_accept_nl_policy,
>>> +		.maxattr	= HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
>>> +		.flags		= GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
>>> +	},
>>> +	{
>>> +		.cmd		= HANDSHAKE_CMD_DONE,
>>> +		.doit		= handshake_nl_done_doit,
>>> +		.policy		= handshake_done_nl_policy,
>>> +		.maxattr	= HANDSHAKE_A_DONE_MAX,
>>> +		.flags		= GENL_CMD_CAP_DO,
>>> +	},
>>> +};
>>> +
>>> +static const struct genl_multicast_group handshake_nl_mcgrps[] = {
>>> +	[HANDSHAKE_HANDLER_CLASS_NONE] = { .name = HANDSHAKE_MCGRP_NONE, },
>>> +};
>>> +
>>> +static struct genl_family __ro_after_init handshake_genl_family = {
>>> +	.hdrsize		= 0,
>>> +	.name			= HANDSHAKE_FAMILY_NAME,
>>> +	.version		= HANDSHAKE_FAMILY_VERSION,
>>> +	.netnsok		= true,
>>> +	.parallel_ops		= true,
>>> +	.n_mcgrps		= ARRAY_SIZE(handshake_nl_mcgrps),
>>> +	.n_split_ops		= ARRAY_SIZE(handshake_nl_ops),
>>> +	.split_ops		= handshake_nl_ops,
>>> +	.mcgrps			= handshake_nl_mcgrps,
>>> +	.module			= THIS_MODULE,
>>> +};
>> 
>> You're not auto-generating the family, ops, and policies?
>> Any reason?
> 
> I couldn't find a way to have the generated source appear
> in the middle of a source file. But I see that's not the
> way others are doing it, so I have added separate files
> under net/handshake for the generated source and header
> material. Two things, though:
> 
> 1. I don't see a generated struct genl_family.

Some experimentation revealed that this is because the spec
was a "genetlink-c" spec which prevents the generation of
"struct genl_family".

But switching it to a "genetlink" spec means it wants my
main header to be linux/handshake.h, and it won't allow the
use of "uapi-header" to put that header somewhere else (in
my case, I thought linux/net/handshake.h was more appropriate).


> 2. The SPDX tag in the generated source files is "BSD
>   3-clause", but the tag in my spec is "GPL-2.0 with
>   syscall note". Oddly, the generated uapi header still
>   has the latter (correct) tag.
> 
> --
> Chuck Lever
> 
> 

--
Chuck Lever

Chuck Lever March 4, 2023, 7:48 p.m. UTC | #4
> On Mar 4, 2023, at 2:16 PM, Jakub Kicinski <kuba@kernel.org> wrote:
> 
> On Sat, 4 Mar 2023 17:44:34 +0000 Chuck Lever III wrote:
>>> I couldn't find a way to have the generated source appear
>>> in the middle of a source file. But I see that's not the
>>> way others are doing it, so I have added separate files
>>> under net/handshake for the generated source and header
>>> material. Two things, though:
>>> 
>>> 1. I don't see a generated struct genl_family.  
>> 
>> Some experimentation revealed that this is because the spec
>> was a "genetlink-c" spec which prevents the generation of
>> "struct genl_family".
>> 
>> But switching it to a "genetlink" spec means it wants my
>> main header to be linux/handshake.h, and it won't allow the
>> use of "uapi-header" to put that header somewhere else (in
>> my case, I thought linux/net/handshake.h was more appropriate).
> 
> Hm, include/uapi/linux/net/ does not exist and I did not foresee
> the need :)  We just need to allow it in the schema, right?
> 
> diff --git a/Documentation/netlink/genetlink.yaml b/Documentation/netlink/genetlink.yaml
> index 62a922755ce2..5594410963b5 100644
> --- a/Documentation/netlink/genetlink.yaml
> +++ b/Documentation/netlink/genetlink.yaml
> @@ -33,6 +33,9 @@ additionalProperties: False
>   protocol:
>     description: Schema compatibility level. Default is "genetlink".
>     enum: [ genetlink ]
> +  uapi-header:
> +    description: Path to the uAPI header, default is linux/${family-name}.h
> +    type: string
> 
>   definitions:
>     description: List of type and constant definitions (enums, flags, defines).

Somehow it's working as I had it. I can't fathom where the
generated source file is including include/uapi/linux/handshake.h
from, but <shrug> it's generating properly and compiling now.

I see others, such as fou, also appear to work just the
same way as I structured it. So, false alarm, I hope.


>>> 2. The SPDX tag in the generated source files is "BSD
>>>  3-clause", but the tag in my spec is "GPL-2.0 with
>>>  syscall note". Oddly, the generated uapi header still
>>>  has the latter (correct) tag.
> 
> I was trying to go with least restrictive licenses for the generated
> code. Would BSD-3-clause everywhere be okay with you?

IIUC we cannot generate source code from a GPL-encumbered
specification and label that code with a less-restrictive
license. Isn't generated source code a "derived" artifact?

The spec lives in the kernel tree, therefore it's covered.
Plus, my employer requires that all of my contributions
to the Linux kernel are under GPL v2.

I'd prefer to see all my generated files get a license
that matches the spec's license.

You could add an spdx object in the YAML schema, and output
the value of that object as part of code generation.

To be safe, I'd also find a suitably informed lawyer who
can give us an opinion about how this needs to work. I've
had a similar discussion about the license status of a
spec derived from source code, so I'm skeptical that we
can simply replace the license when going to code from
spec.

If you need to require BSD-3-clause in this area, I can
request an exception from my employer for the YAML that
is contributed as part of the handshake mechanism.

Sorry to make trouble -- hopefully this discussion is also
keeping you out of trouble too.


--
Chuck Lever

Jakub Kicinski March 4, 2023, 8:01 p.m. UTC | #5
On Sat, 4 Mar 2023 19:48:51 +0000 Chuck Lever III wrote:
> >>> 2. The SPDX tag in the generated source files is "BSD
> >>>  3-clause", but the tag in my spec is "GPL-2.0 with
> >>>  syscall note". Oddly, the generated uapi header still
> >>>  has the latter (correct) tag.  
> > 
> > I was trying to go with least restrictive licenses for the generated
> > code. Would BSD-3-clause everywhere be okay with you?  
> 
> IIUC we cannot generate source code from a GPL-encumbered
> specification and label that code with a less-restrictive
> license. Isn't generated source code a "derived" artifact?
> 
> The spec lives in the kernel tree, therefore it's covered.
> Plus, my employer requires that all of my contributions
> to the Linux kernel are under GPL v2.
> 
> I'd prefer to see all my generated files get a license
> that matches the spec's license.
> 
> You could add an spdx object in the YAML schema, and output
> the value of that object as part of code generation.
> 
> To be safe, I'd also find a suitably informed lawyer who
> can give us an opinion about how this needs to work. I've
> had a similar discussion about the license status of a
> spec derived from source code, so I'm skeptical that we
> can simply replace the license when going to code from
> spec.
> 
> If you need to require BSD-3-clause in this area, I can
> request an exception from my employer for the YAML that
> is contributed as part of the handshake mechanism.

The choice of BSD was to make the specs as easy to use as possible.
Some companies may still be iffy about GPL, and it's all basically
an API, not "real code".

If your lawyers agree we should require BSD on all Netlink specs,
document that and make the uAPI also BSD.

> Sorry to make trouble -- hopefully this discussion is also
> keeping you out of trouble too.

I was hoping choice of BSD would keep me out of trouble :)
My second choice was to make them public domain.. but lawyers should
like BSD-3-clause more because of the warranty statement.

Chuck Lever March 4, 2023, 8:19 p.m. UTC | #6
> On Mar 4, 2023, at 3:01 PM, Jakub Kicinski <kuba@kernel.org> wrote:
> 
> On Sat, 4 Mar 2023 19:48:51 +0000 Chuck Lever III wrote:
>>>>> 2. The SPDX tag in the generated source files is "BSD
>>>>> 3-clause", but the tag in my spec is "GPL-2.0 with
>>>>> syscall note". Oddly, the generated uapi header still
>>>>> has the latter (correct) tag.  
>>> 
>>> I was trying to go with least restrictive licenses for the generated
>>> code. Would BSD-3-clause everywhere be okay with you?  
>> 
>> IIUC we cannot generate source code from a GPL-encumbered
>> specification and label that code with a less-restrictive
>> license. Isn't generated source code a "derived" artifact?
>> 
>> The spec lives in the kernel tree, therefore it's covered.
>> Plus, my employer requires that all of my contributions
>> to the Linux kernel are under GPL v2.
>> 
>> I'd prefer to see all my generated files get a license
>> that matches the spec's license.
>> 
>> You could add an spdx object in the YAML schema, and output
>> the value of that object as part of code generation.
>> 
>> To be safe, I'd also find a suitably informed lawyer who
>> can give us an opinion about how this needs to work. I've
>> had a similar discussion about the license status of a
>> spec derived from source code, so I'm skeptical that we
>> can simply replace the license when going to code from
>> spec.
>> 
>> If you need to require BSD-3-clause in this area, I can
>> request an exception from my employer for the YAML that
>> is contributed as part of the handshake mechanism.
> 
> The choice of BSD was to make the specs as easy to use as possible.
> Some companies may still be iffy about GPL, and it's all basically
> an API, not "real code".
> 
> If your lawyers agree we should require BSD on all Netlink specs,
> document that and make the uAPI also BSD.
> 
>> Sorry to make trouble -- hopefully this discussion is also
>> keeping you out of trouble too.
> 
> I was hoping choice of BSD would keep me out of trouble :)
> My second choice was to make them public domain.. but lawyers should
> like BSD-3-clause more because of the warranty statement.

The issue is that the GPL forces our hand. Derived code
is under GPL if the spec is under GPL. The 3 existing
specs in Documentation/netlink/specs are unlabeled, and
therefore I think would be subsumed under the blanket
license that other kernel source falls under.

I don't think you can simply choose a license for
the derived code. The only way to fix this so that the
generated code is under BSD-3-clause is to explicitly
re-license the specs under Documentation/netlink/specs/
as BSD-3-clause. (which is as easy as asking the authors
for permission to do that - I assume this stuff is new
enough that it won't be difficult to track them down).

Again, it would be convenient for contributors in this
area to specify the spec and code license in the YAML
spec. Anyone can contribute under BSD-3-clause or GPL,
but the code and spec licenses have to match, IMO.

I can start with the LF first to see if we actually have
a problem.


--
Chuck Lever

Jakub Kicinski March 4, 2023, 8:45 p.m. UTC | #7
On Sat, 4 Mar 2023 20:19:06 +0000 Chuck Lever III wrote:
> >> Sorry to make trouble -- hopefully this discussion is also
> >> keeping you out of trouble too.  
> > 
> > I was hoping choice of BSD would keep me out of trouble :)
> > My second choice was to make them public domain.. but lawyers should
> > like BSD-3-clause more because of the warranty statement.  
> 
> The issue is that the GPL forces our hand. Derived code
> is under GPL if the spec is under GPL. The 3 existing
> specs in Documentation/netlink/specs are unlabeled, and
> therefore I think would be subsumed under the blanket
> license that other kernel source falls under.

Understood.

> I don't think you can simply choose a license for
> the derived code. The only way to fix this so that the
> generated code is under BSD-3-clause is to explicitly
> re-license the specs under Documentation/netlink/specs/
> as BSD-3-clause. (which is as easy as asking the authors
> for permission to do that - I assume this stuff is new
> enough that it won't be difficult to track them down).

Fair point. I'll relicense, they are all written by me.
The two other people who touched them should be easy to
get hold of.

> Again, it would be convenient for contributors in this
> area to specify the spec and code license in the YAML
> spec. Anyone can contribute under BSD-3-clause or GPL,
> but the code and spec licenses have to match, IMO.

Yes, I'll clean the existing specs up. The only outstanding
question AFAICT is whether we really need the GPL or you can 
get an exception for yourself and use BSD?

I care more about the downstream users than kernel devs on this,
I'd really prefer for the users not to have to worry about 
licensing. There may be a codegen for some funky new language 
which requires a specific license which may not be compatible
with GPL.

For normal C this is covered by the "uAPI note" but I doubt
that will cover generated code. And frankly would prefer not 
to have to ask :( So let's try BSD?

FWIW I always thought that companies which have an explicit
"can contribute to the kernel in GPL" policy do it because
one needs an exception _for_GPL_, not for the kernel.
Logically the answer to BSD-3-Clause ought to be "oh, yea, we
don't care"... I said "logically", you can make the obvious
joke yourself :)

> I can start with the LF first to see if we actually have
> a problem.

Chuck Lever March 4, 2023, 9:40 p.m. UTC | #8
> On Mar 4, 2023, at 3:45 PM, Jakub Kicinski <kuba@kernel.org> wrote:
> 
> On Sat, 4 Mar 2023 20:19:06 +0000 Chuck Lever III wrote:
>>>> Sorry to make trouble -- hopefully this discussion is also
>>>> keeping you out of trouble too.  
>>> 
>>> I was hoping choice of BSD would keep me out of trouble :)
>>> My second choice was to make them public domain.. but lawyers should
>>> like BSD-3-clause more because of the warranty statement.  
>> 
>> The issue is that the GPL forces our hand. Derived code
>> is under GPL if the spec is under GPL. The 3 existing
>> specs in Documentation/netlink/specs are unlabeled, and
>> therefore I think would be subsumed under the blanket
>> license that other kernel source falls under.
> 
> Understood.
> 
>> I don't think you can simply choose a license for
>> the derived code. The only way to fix this so that the
>> generated code is under BSD-3-clause is to explicitly
>> re-license the specs under Documentation/netlink/specs/
>> as BSD-3-clause. (which is as easy as asking the authors
>> for permission to do that - I assume this stuff is new
>> enough that it won't be difficult to track them down).
> 
> Fair point. I'll relicense, they are all written by me.
> The two other people who touched them should be easy to
> get hold of.
> 
>> Again, it would be convenient for contributors in this
>> area to specify the spec and code license in the YAML
>> spec. Anyone can contribute under BSD-3-clause or GPL,
>> but the code and spec licenses have to match, IMO.
> 
> Yes, I'll clean the existing specs up. The only outstanding
> question AFAICT is whether we really need the GPL

I don't believe GPL to be a general contribution
requirement.

My bugaboo is that these are currently unlabeled and
potentially inconsistent with existing license
requirements. All fixable.


> or you can 
> get an exception for yourself and use BSD?

Yes, I will ask if handshake.yaml may be contributed
under BSD. They might suggest dual license.


> I care more about the downstream users than kernel devs on this,
> I'd really prefer for the users not to have to worry about 
> licensing. There may be a codegen for some funky new language 
> which requires a specific license which may not be compatible
> with GPL.

Sure. IMO it starts with the licensing of the specs. Fix
that and you should be good.


> For normal C this is covered by the "uAPI note" but I doubt
> that will cover generated code. And frankly would prefer not 
> to have to ask :( So let's try BSD?
> 
> FWIW I always thought that companies which have an explicit
> "can contribute to the kernel in GPL" policy do it because
> one needs an exception _for_GPL_, not for the kernel.
> Logically the answer to BSD-3-Clause to be "oh, yea, we 
> don't care"... I said "logically", you can make the obvious
> joke yourself :)
> 
>> I can start with the LF first to see if we actually have
>> a problem.

--
Chuck Lever
Chuck Lever March 6, 2023, 7:34 p.m. UTC | #9
> On Mar 4, 2023, at 3:45 PM, Jakub Kicinski <kuba@kernel.org> wrote:
> 
> On Sat, 4 Mar 2023 20:19:06 +0000 Chuck Lever III wrote:
>>>> Sorry to make trouble -- hopefully this discussion is also
>>>> keeping you out of trouble too.  
>>> 
>>> I was hoping choice of BSD would keep me out of trouble :)
>>> My second choice was to make them public domain.. but lawyers should
>>> like BSD-3-clause more because of the warranty statement.  
>> 
>> The issue is that the GPL forces our hand. Derived code
>> is under GPL if the spec is under GPL. The 3 existing
>> specs in Documentation/netlink/specs are unlabeled, and
>> therefore I think would be subsumed under the blanket
>> license that other kernel source falls under.
> 
> Understood.
> 
>> I don't think you can simply choose a license for
>> the derived code. The only way to fix this so that the
>> generated code is under BSD-3-clause is to explicitly
>> re-license the specs under Documentation/netlink/specs/
>> as BSD-3-clause. (which is as easy as asking the authors
>> for permission to do that - I assume this stuff is new
>> enough that it won't be difficult to track them down).
> 
> Fair point. I'll relicense, they are all written by me.
> The two other people who touched them should be easy to
> get hold of.
> 
>> Again, it would be convenient for contributors in this
>> area to specify the spec and code license in the YAML
>> spec. Anyone can contribute under BSD-3-clause or GPL,
>> but the code and spec licenses have to match, IMO.
> 
> Yes, I'll clean the existing specs up. The only outstanding
> question AFAICT is whether we really need the GPL or you can 
> get an exception for yourself and use BSD?

I'm told that without even getting an exception, I am permitted
to contribute the handshake spec as GPL-2.0 OR BSD-2-Clause.

I don't yet have a resolution on whether the code generated
by ynl-gen-c.py is considered a derivative work.


> I care more about the downstream users than kernel devs on this,
> I'd really prefer for the users not to have to worry about 
> licensing. There may be a codegen for some funky new language 
> which requires a specific license which may not be compatible
> with GPL.
> 
> For normal C this is covered by the "uAPI note" but I doubt
> that will cover generated code. And frankly would prefer not 
> to have to ask :( So let's try BSD?
> 
> FWIW I always thought that companies which have an explicit
> "can contribute to the kernel in GPL" policy do it because
> one needs an exception _for_GPL_, not for the kernel.
> Logically the answer to BSD-3-Clause to be "oh, yea, we 
> don't care"... I said "logically", you can make the obvious
> joke yourself :)

So I'm wondering why not dual-license all the specs? That is
the usual way to provide a permissive license that can be
used outside of Linux environments. The output files might
then carry a dual license as well.


--
Chuck Lever

Patch

diff --git a/Documentation/netlink/specs/handshake.yaml b/Documentation/netlink/specs/handshake.yaml
new file mode 100644
index 000000000000..8367f50fb745
--- /dev/null
+++ b/Documentation/netlink/specs/handshake.yaml
@@ -0,0 +1,137 @@ 
+# SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+#
+# GENL HANDSHAKE service.
+#
+# Author: Chuck Lever <chuck.lever@oracle.com>
+#
+# Copyright (c) 2023, Oracle and/or its affiliates.
+#
+
+name: handshake
+
+protocol: genetlink-c
+
+doc: Netlink protocol to request a transport layer security handshake.
+
+uapi-header: linux/net/handshake.h
+
+definitions:
+  -
+    type: enum
+    name: handler-class
+    enum-name:
+    value-start: 0
+    entries: [ none ]
+  -
+    type: enum
+    name: msg-type
+    enum-name:
+    value-start: 0
+    entries: [ unspec, clienthello, serverhello ]
+  -
+    type: enum
+    name: auth
+    enum-name:
+    value-start: 0
+    entries: [ unspec, unauth, psk, x509 ]
+
+attribute-sets:
+  -
+    name: x509
+    attributes:
+      -
+        name: cert
+        type: u32
+        value: 1
+      -
+        name: privkey
+        type: u32
+  -
+    name: accept
+    attributes:
+      -
+        name: status
+        type: u32
+        value: 1
+      -
+        name: sockfd
+        type: u32
+      -
+        name: handler-class
+        type: u32
+        enum: handler-class
+      -
+        name: message-type
+        type: u32
+        enum: msg-type
+      -
+        name: timeout
+        type: u32
+      -
+        name: auth-mode
+        type: u32
+        enum: auth
+      -
+        name: peer-identity
+        type: u32
+        multi-attr: true
+      -
+        name: certificate
+        type: nest
+        nested-attributes: x509
+        multi-attr: true
+  -
+    name: done
+    attributes:
+      -
+        name: status
+        type: u32
+        value: 1
+      -
+        name: sockfd
+        type: u32
+      -
+        name: remote-auth
+        type: u32
+        multi-attr: true
+
+operations:
+  list:
+    -
+      name: ready
+      doc: Notify handlers that a new handshake request is waiting
+      value: 1
+      notify: accept
+    -
+      name: accept
+      doc: Handler retrieves next queued handshake request
+      attribute-set: accept
+      flags: [ admin-perm ]
+      do:
+        request:
+          attributes:
+            - handler-class
+        reply:
+          attributes:
+            - status
+            - sockfd
+            - message-type
+            - timeout
+            - auth-mode
+            - peer-identity
+            - certificate
+    -
+      name: done
+      doc: Handler reports handshake completion
+      attribute-set: done
+      do:
+        request:
+          attributes:
+            - status
+            - sockfd
+            - remote-auth
+
+mcast-groups:
+  list:
+    -
+      name: none
diff --git a/include/net/handshake.h b/include/net/handshake.h
new file mode 100644
index 000000000000..aa5d80a4d66f
--- /dev/null
+++ b/include/net/handshake.h
@@ -0,0 +1,46 @@ 
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Generic HANDSHAKE service.
+ *
+ * Author: Chuck Lever <chuck.lever@oracle.com>
+ *
+ * Copyright (c) 2023, Oracle and/or its affiliates.
+ */
+
+/*
+ * Data structures and functions that are visible only within the
+ * kernel are declared here.
+ */
+
+#ifndef _NET_HANDSHAKE_H
+#define _NET_HANDSHAKE_H
+
+struct handshake_req;
+
+/*
+ * Invariants for all handshake requests for one transport layer
+ * security protocol
+ */
+struct handshake_proto {
+	int			hp_handler_class;
+	size_t			hp_privsize;
+
+	int			(*hp_accept)(struct handshake_req *req,
+					     struct genl_info *gi, int fd);
+	void			(*hp_done)(struct handshake_req *req,
+					   unsigned int status,
+					   struct nlattr **tb);
+	void			(*hp_destroy)(struct handshake_req *req);
+};
+
+extern struct handshake_req *
+handshake_req_alloc(struct socket *sock, const struct handshake_proto *proto,
+		    gfp_t flags);
+extern void *handshake_req_private(struct handshake_req *req);
+extern int handshake_req_submit(struct handshake_req *req, gfp_t flags);
+extern bool handshake_req_cancel(struct socket *sock);
+
+extern struct nlmsghdr *handshake_genl_put(struct sk_buff *msg,
+					   struct genl_info *gi);
+
+#endif /* _NET_HANDSHAKE_H */
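A note on `hp_privsize`: judging from handshake_req_alloc() and handshake_req_private() in net/handshake/request.c of this patch, the consumer's private area is allocated directly behind the request structure and located with `req + 1`. The following is only a userspace sketch of that layout. The `demo_*` names are hypothetical stand-ins, and calloc() takes the place of kzalloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct handshake_req. */
struct demo_req {
	unsigned long flags;
	size_t privsize;
};

/* Hypothetical consumer data kept in the private area. */
struct demo_tls_args {
	int timeout_ms;
	char peername[32];
};

/* Allocate the request plus privsize bytes of zeroed private data. */
struct demo_req *demo_req_alloc(size_t privsize)
{
	struct demo_req *req = calloc(1, sizeof(*req) + privsize);

	if (req)
		req->privsize = privsize;
	return req;
}

/* The private area starts at the first byte past the request itself. */
void *demo_req_private(struct demo_req *req)
{
	return (void *)(req + 1);
}
```

A consumer would call `demo_req_alloc(sizeof(struct demo_tls_args))` and cast the result of `demo_req_private()` to its own type. One caveat of this layout is that the private type must not require stricter alignment than the request structure that precedes it.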
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 78beaa765c73..a0ce9de4dab1 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -188,6 +188,11 @@  struct net {
 #if IS_ENABLED(CONFIG_SMC)
 	struct netns_smc	smc;
 #endif
+
+	/* transport layer security handshake requests */
+	spinlock_t		hs_lock;
+	struct list_head	hs_requests;
+	int			hs_pending;
 } __randomize_layout;
 
 #include <linux/seq_file_net.h>
diff --git a/include/net/sock.h b/include/net/sock.h
index 573f2bf7e0de..2a7345ce2540 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -519,6 +519,7 @@  struct sock {
 
 	struct socket		*sk_socket;
 	void			*sk_user_data;
+	void			*sk_handshake_req;
 #ifdef CONFIG_SECURITY
 	void			*sk_security;
 #endif
diff --git a/include/trace/events/handshake.h b/include/trace/events/handshake.h
new file mode 100644
index 000000000000..feffcd1d6256
--- /dev/null
+++ b/include/trace/events/handshake.h
@@ -0,0 +1,159 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM handshake
+
+#if !defined(_TRACE_HANDSHAKE_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_HANDSHAKE_H
+
+#include <linux/net.h>
+#include <linux/tracepoint.h>
+
+DECLARE_EVENT_CLASS(handshake_event_class,
+	TP_PROTO(
+		const struct net *net,
+		const struct handshake_req *req,
+		const struct socket *sock
+	),
+	TP_ARGS(net, req, sock),
+	TP_STRUCT__entry(
+		__field(const void *, req)
+		__field(const void *, sock)
+		__field(unsigned int, netns_ino)
+	),
+	TP_fast_assign(
+		__entry->req = req;
+		__entry->sock = sock;
+		__entry->netns_ino = net->ns.inum;
+	),
+	TP_printk("req=%p sock=%p",
+		__entry->req, __entry->sock
+	)
+);
+#define DEFINE_HANDSHAKE_EVENT(name)				\
+	DEFINE_EVENT(handshake_event_class, name,		\
+		TP_PROTO(					\
+			const struct net *net,			\
+			const struct handshake_req *req,	\
+			const struct socket *sock		\
+		),						\
+		TP_ARGS(net, req, sock))
+
+DECLARE_EVENT_CLASS(handshake_fd_class,
+	TP_PROTO(
+		const struct net *net,
+		const struct handshake_req *req,
+		const struct socket *sock,
+		int fd
+	),
+	TP_ARGS(net, req, sock, fd),
+	TP_STRUCT__entry(
+		__field(const void *, req)
+		__field(const void *, sock)
+		__field(int, fd)
+		__field(unsigned int, netns_ino)
+	),
+	TP_fast_assign(
+		__entry->req = req;
+		__entry->sock = req->hr_sock;
+		__entry->fd = fd;
+		__entry->netns_ino = net->ns.inum;
+	),
+	TP_printk("req=%p sock=%p fd=%d",
+		__entry->req, __entry->sock, __entry->fd
+	)
+);
+#define DEFINE_HANDSHAKE_FD_EVENT(name)				\
+	DEFINE_EVENT(handshake_fd_class, name,			\
+		TP_PROTO(					\
+			const struct net *net,			\
+			const struct handshake_req *req,	\
+			const struct socket *sock,		\
+			int fd					\
+		),						\
+		TP_ARGS(net, req, sock, fd))
+
+DECLARE_EVENT_CLASS(handshake_error_class,
+	TP_PROTO(
+		const struct net *net,
+		const struct handshake_req *req,
+		const struct socket *sock,
+		int err
+	),
+	TP_ARGS(net, req, sock, err),
+	TP_STRUCT__entry(
+		__field(const void *, req)
+		__field(const void *, sock)
+		__field(int, err)
+		__field(unsigned int, netns_ino)
+	),
+	TP_fast_assign(
+		__entry->req = req;
+		__entry->sock = sock;
+		__entry->err = err;
+		__entry->netns_ino = net->ns.inum;
+	),
+	TP_printk("req=%p sock=%p err=%d",
+		__entry->req, __entry->sock, __entry->err
+	)
+);
+#define DEFINE_HANDSHAKE_ERROR(name)				\
+	DEFINE_EVENT(handshake_error_class, name,		\
+		TP_PROTO(					\
+			const struct net *net,			\
+			const struct handshake_req *req,	\
+			const struct socket *sock,		\
+			int err					\
+		),						\
+		TP_ARGS(net, req, sock, err))
+
+
+/**
+ ** Request lifetime events
+ **/
+
+DEFINE_HANDSHAKE_EVENT(handshake_submit);
+DEFINE_HANDSHAKE_ERROR(handshake_submit_err);
+DEFINE_HANDSHAKE_EVENT(handshake_cancel);
+DEFINE_HANDSHAKE_EVENT(handshake_cancel_none);
+DEFINE_HANDSHAKE_EVENT(handshake_cancel_busy);
+DEFINE_HANDSHAKE_EVENT(handshake_destruct);
+
+
+TRACE_EVENT(handshake_complete,
+	TP_PROTO(
+		const struct net *net,
+		const struct handshake_req *req,
+		const struct socket *sock,
+		int status
+	),
+	TP_ARGS(net, req, sock, status),
+	TP_STRUCT__entry(
+		__field(const void *, req)
+		__field(const void *, sock)
+		__field(int, status)
+		__field(unsigned int, netns_ino)
+	),
+	TP_fast_assign(
+		__entry->req = req;
+		__entry->sock = sock;
+		__entry->status = status;
+		__entry->netns_ino = net->ns.inum;
+	),
+	TP_printk("req=%p sock=%p status=%d",
+		__entry->req, __entry->sock, __entry->status
+	)
+);
+
+/**
+ ** Netlink events
+ **/
+
+DEFINE_HANDSHAKE_ERROR(handshake_notify_err);
+DEFINE_HANDSHAKE_FD_EVENT(handshake_cmd_accept);
+DEFINE_HANDSHAKE_ERROR(handshake_cmd_accept_err);
+DEFINE_HANDSHAKE_FD_EVENT(handshake_cmd_done);
+DEFINE_HANDSHAKE_ERROR(handshake_cmd_done_err);
+
+#endif /* _TRACE_HANDSHAKE_H */
+
+#include <trace/define_trace.h>
diff --git a/include/uapi/linux/handshake.h b/include/uapi/linux/handshake.h
new file mode 100644
index 000000000000..6e0c608a6b91
--- /dev/null
+++ b/include/uapi/linux/handshake.h
@@ -0,0 +1,71 @@ 
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/* Do not edit directly, auto-generated from: */
+/*	Documentation/netlink/specs/handshake.yaml */
+/* YNL-GEN uapi header */
+
+#ifndef _UAPI_LINUX_HANDSHAKE_H
+#define _UAPI_LINUX_HANDSHAKE_H
+
+#define HANDSHAKE_FAMILY_NAME		"handshake"
+#define HANDSHAKE_FAMILY_VERSION	1
+
+enum {
+	HANDSHAKE_HANDLER_CLASS_NONE,
+};
+
+enum {
+	HANDSHAKE_MSG_TYPE_UNSPEC,
+	HANDSHAKE_MSG_TYPE_CLIENTHELLO,
+	HANDSHAKE_MSG_TYPE_SERVERHELLO,
+};
+
+enum {
+	HANDSHAKE_AUTH_UNSPEC,
+	HANDSHAKE_AUTH_UNAUTH,
+	HANDSHAKE_AUTH_PSK,
+	HANDSHAKE_AUTH_X509,
+};
+
+enum {
+	HANDSHAKE_A_X509_CERT = 1,
+	HANDSHAKE_A_X509_PRIVKEY,
+
+	__HANDSHAKE_A_X509_MAX,
+	HANDSHAKE_A_X509_MAX = (__HANDSHAKE_A_X509_MAX - 1)
+};
+
+enum {
+	HANDSHAKE_A_ACCEPT_STATUS = 1,
+	HANDSHAKE_A_ACCEPT_SOCKFD,
+	HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
+	HANDSHAKE_A_ACCEPT_MESSAGE_TYPE,
+	HANDSHAKE_A_ACCEPT_TIMEOUT,
+	HANDSHAKE_A_ACCEPT_AUTH_MODE,
+	HANDSHAKE_A_ACCEPT_PEER_IDENTITY,
+	HANDSHAKE_A_ACCEPT_CERTIFICATE,
+
+	__HANDSHAKE_A_ACCEPT_MAX,
+	HANDSHAKE_A_ACCEPT_MAX = (__HANDSHAKE_A_ACCEPT_MAX - 1)
+};
+
+enum {
+	HANDSHAKE_A_DONE_STATUS = 1,
+	HANDSHAKE_A_DONE_SOCKFD,
+	HANDSHAKE_A_DONE_REMOTE_AUTH,
+
+	__HANDSHAKE_A_DONE_MAX,
+	HANDSHAKE_A_DONE_MAX = (__HANDSHAKE_A_DONE_MAX - 1)
+};
+
+enum {
+	HANDSHAKE_CMD_READY = 1,
+	HANDSHAKE_CMD_ACCEPT,
+	HANDSHAKE_CMD_DONE,
+
+	__HANDSHAKE_CMD_MAX,
+	HANDSHAKE_CMD_MAX = (__HANDSHAKE_CMD_MAX - 1)
+};
+
+#define HANDSHAKE_MCGRP_NONE	"none"
+
+#endif /* _UAPI_LINUX_HANDSHAKE_H */
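The generated attribute enums each end with a `__..._MAX` sentinel followed by `..._MAX = sentinel - 1`, so the highest valid attribute number is available as a constant and lookup tables can be sized as `MAX + 1`, as the `tb[]` arrays in net/handshake/netlink.c do. Below is a small userspace sketch of the idiom. The `DEMO_A_*` names and `demo_parse()` are made up for illustration, and the parser is far simpler than the kernel's netlink attribute parsing.

```c
#include <assert.h>
#include <stddef.h>

enum {
	DEMO_A_STATUS = 1,
	DEMO_A_SOCKFD,
	DEMO_A_REMOTE_AUTH,

	__DEMO_A_MAX,
	DEMO_A_MAX = (__DEMO_A_MAX - 1)
};

/* Hypothetical, simplified attribute: a type number and a u32 payload. */
struct demo_attr {
	int type;
	unsigned int value;
};

/*
 * Index attributes by type. Slot 0 is never used because attribute
 * numbering starts at 1. Unknown types are skipped.
 * Returns the number of attributes stored in tb[].
 */
int demo_parse(const struct demo_attr *attrs, size_t n,
	       const struct demo_attr *tb[DEMO_A_MAX + 1])
{
	int stored = 0;
	size_t i;
	int t;

	for (t = 0; t <= DEMO_A_MAX; t++)
		tb[t] = NULL;

	for (i = 0; i < n; i++) {
		if (attrs[i].type < 1 || attrs[i].type > DEMO_A_MAX)
			continue;
		tb[attrs[i].type] = &attrs[i];
		stored++;
	}
	return stored;
}
```

Adding a new attribute before the sentinel grows `DEMO_A_MAX` automatically, so arrays declared as `tb[DEMO_A_MAX + 1]` do not need to be touched.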
diff --git a/net/Makefile b/net/Makefile
index 0914bea9c335..adbb64277601 100644
--- a/net/Makefile
+++ b/net/Makefile
@@ -79,3 +79,4 @@  obj-$(CONFIG_NET_NCSI)		+= ncsi/
 obj-$(CONFIG_XDP_SOCKETS)	+= xdp/
 obj-$(CONFIG_MPTCP)		+= mptcp/
 obj-$(CONFIG_MCTP)		+= mctp/
+obj-y				+= handshake/
diff --git a/net/handshake/Makefile b/net/handshake/Makefile
new file mode 100644
index 000000000000..a41b03f4837b
--- /dev/null
+++ b/net/handshake/Makefile
@@ -0,0 +1,11 @@ 
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Makefile for the Generic HANDSHAKE service
+#
+# Author: Chuck Lever <chuck.lever@oracle.com>
+#
+# Copyright (c) 2023, Oracle and/or its affiliates.
+#
+
+obj-y += handshake.o
+handshake-y := netlink.o request.o trace.o
diff --git a/net/handshake/handshake.h b/net/handshake/handshake.h
new file mode 100644
index 000000000000..77ba4c68cd00
--- /dev/null
+++ b/net/handshake/handshake.h
@@ -0,0 +1,41 @@ 
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Generic netlink handshake service
+ *
+ * Author: Chuck Lever <chuck.lever@oracle.com>
+ *
+ * Copyright (c) 2023, Oracle and/or its affiliates.
+ */
+
+/*
+ * Data structures and functions that are visible only within the
+ * handshake module are declared here.
+ */
+
+#ifndef _INTERNAL_HANDSHAKE_H
+#define _INTERNAL_HANDSHAKE_H
+
+/*
+ * One handshake request
+ */
+struct handshake_req {
+	struct list_head		hr_list;
+	unsigned long			hr_flags;
+	const struct handshake_proto	*hr_proto;
+	struct socket			*hr_sock;
+
+	void				(*hr_saved_destruct)(struct sock *sk);
+};
+
+#define HANDSHAKE_F_COMPLETED	BIT(0)
+
+/* netlink.c */
+extern bool handshake_genl_inited;
+int handshake_genl_notify(struct net *net, int handler_class, gfp_t flags);
+
+/* request.c */
+void __remove_pending_locked(struct net *net, struct handshake_req *req);
+void handshake_complete(struct handshake_req *req, unsigned int status,
+			struct nlattr **tb);
+
+#endif /* _INTERNAL_HANDSHAKE_H */
diff --git a/net/handshake/netlink.c b/net/handshake/netlink.c
new file mode 100644
index 000000000000..6f3a7852742b
--- /dev/null
+++ b/net/handshake/netlink.c
@@ -0,0 +1,345 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Generic netlink handshake service
+ *
+ * Author: Chuck Lever <chuck.lever@oracle.com>
+ *
+ * Copyright (c) 2023, Oracle and/or its affiliates.
+ */
+
+#include <linux/types.h>
+#include <linux/socket.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/inet.h>
+
+#include <net/sock.h>
+#include <net/genetlink.h>
+#include <net/handshake.h>
+
+#include <uapi/linux/handshake.h>
+#include <trace/events/handshake.h>
+#include "handshake.h"
+
+static struct genl_family __ro_after_init handshake_genl_family;
+bool handshake_genl_inited;
+
+/**
+ * handshake_genl_notify - Notify handlers that a request is waiting
+ * @net: target network namespace
+ * @handler_class: target handler
+ * @flags: memory allocation control flags
+ *
+ * Returns zero on success or a negative errno if notification failed.
+ */
+int handshake_genl_notify(struct net *net, int handler_class, gfp_t flags)
+{
+	struct sk_buff *msg;
+	void *hdr;
+
+	if (!genl_has_listeners(&handshake_genl_family, net, handler_class))
+		return -ESRCH;
+
+	msg = genlmsg_new(GENLMSG_DEFAULT_SIZE, flags);
+	if (!msg)
+		return -ENOMEM;
+
+	hdr = genlmsg_put(msg, 0, 0, &handshake_genl_family, 0,
+			  HANDSHAKE_CMD_READY);
+	if (!hdr)
+		goto out_free;
+
+	if (nla_put_u32(msg, HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
+			handler_class) < 0) {
+		genlmsg_cancel(msg, hdr);
+		goto out_free;
+	}
+
+	genlmsg_end(msg, hdr);
+	return genlmsg_multicast_netns(&handshake_genl_family, net, msg,
+				       0, handler_class, flags);
+
+out_free:
+	nlmsg_free(msg);
+	return -EMSGSIZE;
+}
+
+/**
+ * handshake_genl_put - Create a generic netlink message header
+ * @msg: buffer in which to create the header
+ * @gi: generic netlink message context
+ *
+ * Returns a ready-to-use header, or NULL.
+ */
+struct nlmsghdr *handshake_genl_put(struct sk_buff *msg, struct genl_info *gi)
+{
+	return genlmsg_put(msg, gi->snd_portid, gi->snd_seq,
+			   &handshake_genl_family, 0, gi->genlhdr->cmd);
+}
+EXPORT_SYMBOL(handshake_genl_put);
+
+static int handshake_status_reply(struct sk_buff *skb, struct genl_info *gi,
+				  int status)
+{
+	struct nlmsghdr *hdr;
+	struct sk_buff *msg;
+	int ret;
+
+	ret = -ENOMEM;
+	msg = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		goto out;
+	hdr = handshake_genl_put(msg, gi);
+	if (!hdr)
+		goto out_free;
+
+	ret = -EMSGSIZE;
+	ret = nla_put_u32(msg, HANDSHAKE_A_ACCEPT_STATUS, status);
+	if (ret < 0)
+		goto out_free;
+
+	genlmsg_end(msg, hdr);
+	return genlmsg_reply(msg, gi);
+
+out_free:
+	nlmsg_free(msg);
+out:
+	return ret;
+}
+
+/*
+ * dup() a kernel socket for use as a user space file descriptor
+ * in the current process. The kernel socket must have an
+ * instantiated struct file.
+ *
+ * Implicit argument: "current()"
+ */
+static int handshake_dup(struct socket *kernsock)
+{
+	struct file *file;
+	int newfd;
+
+	if (!kernsock->file)
+		return -EBADF;
+
+	file = get_file(kernsock->file);
+	newfd = get_unused_fd_flags(O_CLOEXEC);
+	if (newfd < 0) {
+		fput(file);
+		return newfd;
+	}
+
+	fd_install(newfd, file);
+	return newfd;
+}
+
+static const struct nla_policy
+handshake_accept_nl_policy[HANDSHAKE_A_ACCEPT_HANDLER_CLASS + 1] = {
+	[HANDSHAKE_A_ACCEPT_HANDLER_CLASS] = { .type = NLA_U32, },
+};
+
+static int handshake_nl_accept_doit(struct sk_buff *skb, struct genl_info *gi)
+{
+	struct nlattr *tb[HANDSHAKE_A_ACCEPT_MAX + 1];
+	struct net *net = sock_net(skb->sk);
+	struct handshake_req *pos, *req;
+	int fd, err;
+
+	err = -EINVAL;
+	if (genlmsg_parse(nlmsg_hdr(skb), &handshake_genl_family, tb,
+			  HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
+			  handshake_accept_nl_policy, NULL))
+		goto out_status;
+	if (!tb[HANDSHAKE_A_ACCEPT_HANDLER_CLASS])
+		goto out_status;
+
+	req = NULL;
+	spin_lock(&net->hs_lock);
+	list_for_each_entry(pos, &net->hs_requests, hr_list) {
+		if (pos->hr_proto->hp_handler_class !=
+		    nla_get_u32(tb[HANDSHAKE_A_ACCEPT_HANDLER_CLASS]))
+			continue;
+		__remove_pending_locked(net, pos);
+		req = pos;
+		break;
+	}
+	spin_unlock(&net->hs_lock);
+	if (!req)
+		goto out_status;
+
+	fd = handshake_dup(req->hr_sock);
+	if (fd < 0) {
+		err = fd;
+		goto out_complete;
+	}
+	err = req->hr_proto->hp_accept(req, gi, fd);
+	if (err)
+		goto out_complete;
+
+	trace_handshake_cmd_accept(net, req, req->hr_sock, fd);
+	return 0;
+
+out_complete:
+	handshake_complete(req, -EIO, NULL);
+	fput(req->hr_sock->file);
+out_status:
+	trace_handshake_cmd_accept_err(net, req, NULL, err);
+	return handshake_status_reply(skb, gi, err);
+}
+
+static const struct nla_policy
+handshake_done_nl_policy[HANDSHAKE_A_DONE_MAX + 1] = {
+	[HANDSHAKE_A_DONE_SOCKFD] = { .type = NLA_U32, },
+	[HANDSHAKE_A_DONE_STATUS] = { .type = NLA_U32, },
+	[HANDSHAKE_A_DONE_REMOTE_AUTH] = { .type = NLA_U32, },
+};
+
+static int handshake_nl_done_doit(struct sk_buff *skb, struct genl_info *gi)
+{
+	struct nlattr *tb[HANDSHAKE_A_DONE_MAX + 1];
+	struct net *net = sock_net(skb->sk);
+	struct socket *sock = NULL;
+	struct handshake_req *req;
+	int fd, status, err;
+
+	err = genlmsg_parse(nlmsg_hdr(skb), &handshake_genl_family, tb,
+			    HANDSHAKE_A_DONE_MAX, handshake_done_nl_policy,
+			    NULL);
+	if (err || !tb[HANDSHAKE_A_DONE_SOCKFD]) {
+		err = -EINVAL;
+		goto out_status;
+	}
+
+	fd = nla_get_u32(tb[HANDSHAKE_A_DONE_SOCKFD]);
+
+	err = 0;
+	sock = sockfd_lookup(fd, &err);
+	if (err) {
+		err = -EBADF;
+		goto out_status;
+	}
+
+	req = sock->sk->sk_handshake_req;
+	if (!req) {
+		err = -EBUSY;
+		goto out_status;
+	}
+
+	trace_handshake_cmd_done(net, req, sock, fd);
+
+	status = -EIO;
+	if (tb[HANDSHAKE_A_DONE_STATUS])
+		status = nla_get_u32(tb[HANDSHAKE_A_DONE_STATUS]);
+
+	handshake_complete(req, status, tb);
+	fput(sock->file);
+	return 0;
+
+out_status:
+	trace_handshake_cmd_done_err(net, req, sock, err);
+	return 0;
+}
+
+static const struct genl_split_ops handshake_nl_ops[] = {
+	{
+		.cmd		= HANDSHAKE_CMD_ACCEPT,
+		.doit		= handshake_nl_accept_doit,
+		.policy		= handshake_accept_nl_policy,
+		.maxattr	= HANDSHAKE_A_ACCEPT_HANDLER_CLASS,
+		.flags		= GENL_ADMIN_PERM | GENL_CMD_CAP_DO,
+	},
+	{
+		.cmd		= HANDSHAKE_CMD_DONE,
+		.doit		= handshake_nl_done_doit,
+		.policy		= handshake_done_nl_policy,
+		.maxattr	= HANDSHAKE_A_DONE_MAX,
+		.flags		= GENL_CMD_CAP_DO,
+	},
+};
+
+static const struct genl_multicast_group handshake_nl_mcgrps[] = {
+	[HANDSHAKE_HANDLER_CLASS_NONE] = { .name = HANDSHAKE_MCGRP_NONE, },
+};
+
+static struct genl_family __ro_after_init handshake_genl_family = {
+	.hdrsize		= 0,
+	.name			= HANDSHAKE_FAMILY_NAME,
+	.version		= HANDSHAKE_FAMILY_VERSION,
+	.netnsok		= true,
+	.parallel_ops		= true,
+	.n_mcgrps		= ARRAY_SIZE(handshake_nl_mcgrps),
+	.n_split_ops		= ARRAY_SIZE(handshake_nl_ops),
+	.split_ops		= handshake_nl_ops,
+	.mcgrps			= handshake_nl_mcgrps,
+	.module			= THIS_MODULE,
+};
+
+static int __net_init handshake_net_init(struct net *net)
+{
+	spin_lock_init(&net->hs_lock);
+	INIT_LIST_HEAD(&net->hs_requests);
+	net->hs_pending	= 0;
+	return 0;
+}
+
+static void __net_exit handshake_net_exit(struct net *net)
+{
+	struct handshake_req *req;
+	LIST_HEAD(requests);
+
+	/*
+	 * This drains the net's pending list. Requests that
+	 * have been accepted and are in progress will be
+	 * destroyed when the socket is closed.
+	 */
+	spin_lock(&net->hs_lock);
+	list_splice_init(&net->hs_requests, &requests);
+	spin_unlock(&net->hs_lock);
+
+	while (!list_empty(&requests)) {
+		req = list_first_entry(&requests, struct handshake_req, hr_list);
+		list_del(&req->hr_list);
+
+		/*
+		 * Requests on this list have not yet been
+		 * accepted, so they do not have an fd to put.
+		 */
+
+		handshake_complete(req, -ETIMEDOUT, NULL);
+	}
+}
+
+static struct pernet_operations handshake_genl_net_ops = {
+	.init		= handshake_net_init,
+	.exit		= handshake_net_exit,
+};
+
+static int __init handshake_init(void)
+{
+	int ret;
+
+	ret = genl_register_family(&handshake_genl_family);
+	if (ret) {
+		pr_warn("handshake: netlink registration failed (%d)\n", ret);
+		return ret;
+	}
+
+	ret = register_pernet_subsys(&handshake_genl_net_ops);
+	if (ret) {
+		pr_warn("handshake: pernet registration failed (%d)\n", ret);
+		genl_unregister_family(&handshake_genl_family);
+	}
+
+	handshake_genl_inited = !ret;
+	return ret;
+}
+
+static void __exit handshake_exit(void)
+{
+	unregister_pernet_subsys(&handshake_genl_net_ops);
+	genl_unregister_family(&handshake_genl_family);
+}
+
+module_init(handshake_init);
+module_exit(handshake_exit);
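handshake_net_exit() relies on list_splice_init() to move queued requests onto a private list before completing them outside the lock. In the kernel's list API the first argument is the list being emptied and the second is the list that receives the entries. The userspace sketch below re-implements just enough of a circular doubly linked list to show that direction. The `demo_*` names are stand-ins, not the kernel's <linux/list.h>.

```c
#include <assert.h>
#include <stddef.h>

struct demo_list {
	struct demo_list *next, *prev;
};

void demo_list_init(struct demo_list *head)
{
	head->next = head;
	head->prev = head;
}

int demo_list_empty(const struct demo_list *head)
{
	return head->next == head;
}

void demo_list_add_tail(struct demo_list *entry, struct demo_list *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Move every entry of @src to the front of @dst, leaving @src empty. */
void demo_list_splice_init(struct demo_list *src, struct demo_list *dst)
{
	struct demo_list *first, *last;

	if (demo_list_empty(src))
		return;

	first = src->next;
	last = src->prev;

	first->prev = dst;
	last->next = dst->next;
	dst->next->prev = last;
	dst->next = first;

	demo_list_init(src);
}

size_t demo_list_count(const struct demo_list *head)
{
	const struct demo_list *pos;
	size_t n = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}
```

Draining a shared queue this way keeps the locked section short: only the pointer surgery happens under the lock, and the per-entry work runs on the private list afterwards.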
diff --git a/net/handshake/request.c b/net/handshake/request.c
new file mode 100644
index 000000000000..43edfa94f4fa
--- /dev/null
+++ b/net/handshake/request.c
@@ -0,0 +1,246 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Handshake request lifetime events
+ *
+ * Author: Chuck Lever <chuck.lever@oracle.com>
+ *
+ * Copyright (c) 2023, Oracle and/or its affiliates.
+ */
+
+#include <linux/types.h>
+#include <linux/socket.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/inet.h>
+#include <linux/fdtable.h>
+
+#include <net/sock.h>
+#include <net/genetlink.h>
+#include <net/handshake.h>
+
+#include <uapi/linux/handshake.h>
+#include <trace/events/handshake.h>
+#include "handshake.h"
+
+/*
+ * This limit is to prevent slow remotes from causing denial of service.
+ * A ulimit-style tunable might be used instead.
+ */
+#define HANDSHAKE_PENDING_MAX (10)
+
+static void __add_pending_locked(struct net *net, struct handshake_req *req)
+{
+	net->hs_pending++;
+	list_add_tail(&req->hr_list, &net->hs_requests);
+}
+
+void __remove_pending_locked(struct net *net, struct handshake_req *req)
+{
+	net->hs_pending--;
+	list_del_init(&req->hr_list);
+}
+
+/*
+ * Return values:
+ *   %true - the request was found on @net's pending list
+ *   %false - the request was not found on @net's pending list
+ *
+ * If @req was on a pending list, it has not yet been accepted.
+ */
+static bool remove_pending(struct net *net, struct handshake_req *req)
+{
+	bool ret;
+
+	ret = false;
+
+	spin_lock(&net->hs_lock);
+	if (!list_empty(&req->hr_list)) {
+		__remove_pending_locked(net, req);
+		ret = true;
+	}
+	spin_unlock(&net->hs_lock);
+
+	return ret;
+}
+
+static void handshake_req_destroy(struct handshake_req *req, struct sock *sk)
+{
+	req->hr_proto->hp_destroy(req);
+	sk->sk_handshake_req = NULL;
+	kfree(req);
+}
+
+static void handshake_sk_destruct(struct sock *sk)
+{
+	struct handshake_req *req = sk->sk_handshake_req;
+
+	if (req) {
+		trace_handshake_destruct(sock_net(sk), req, req->hr_sock);
+		handshake_req_destroy(req, sk);
+	}
+}
+
+/**
+ * handshake_req_alloc - consumer API to allocate a request
+ * @sock: open socket on which to perform a handshake
+ * @proto: security protocol
+ * @flags: memory allocation flags
+ *
+ * Returns an initialized handshake_req or NULL.
+ */
+struct handshake_req *handshake_req_alloc(struct socket *sock,
+					  const struct handshake_proto *proto,
+					  gfp_t flags)
+{
+	struct handshake_req *req;
+
+	/* Avoid accessing uninitialized global variables later on */
+	if (!handshake_genl_inited)
+		return NULL;
+
+	req = kzalloc(sizeof(*req) + proto->hp_privsize, flags);
+	if (!req)
+		return NULL;
+
+	sock_hold(sock->sk);
+
+	INIT_LIST_HEAD(&req->hr_list);
+	req->hr_sock = sock;
+	req->hr_proto = proto;
+	return req;
+}
+EXPORT_SYMBOL(handshake_req_alloc);
+
+/**
+ * handshake_req_private - consumer API to return per-handshake private data
+ * @req: handshake arguments
+ *
+ */
+void *handshake_req_private(struct handshake_req *req)
+{
+	return (void *)(req + 1);
+}
+EXPORT_SYMBOL(handshake_req_private);
+
+/**
+ * handshake_req_submit - consumer API to submit a handshake request
+ * @req: handshake arguments
+ * @flags: memory allocation flags
+ *
+ * Return values:
+ *   %0: Request queued
+ *   %-EBUSY: A handshake is already under way for this socket
+ *   %-ESRCH: No handshake agent is available
+ *   %-EAGAIN: Too many pending handshake requests
+ *   %-ENOMEM: Failed to allocate memory
+ *   %-EMSGSIZE: Failed to construct notification message
+ *
+ * A zero return value from handshake_req_submit() means that
+ * exactly one subsequent completion callback is guaranteed.
+ *
+ * A negative return value from handshake_req_submit() means that
+ * no completion callback will be done and that @req is
+ * destroyed.
+ */
+int handshake_req_submit(struct handshake_req *req, gfp_t flags)
+{
+	struct socket *sock = req->hr_sock;
+	struct sock *sk = sock->sk;
+	struct net *net = sock_net(sk);
+	int ret;
+
+	ret = -EAGAIN;
+	if (READ_ONCE(net->hs_pending) >= HANDSHAKE_PENDING_MAX)
+		goto out_err;
+
+	ret = -EBUSY;
+	spin_lock(&net->hs_lock);
+	if (sk->sk_handshake_req || !list_empty(&req->hr_list)) {
+		spin_unlock(&net->hs_lock);
+		goto out_err;
+	}
+	req->hr_saved_destruct = sk->sk_destruct;
+	sk->sk_destruct = handshake_sk_destruct;
+	sk->sk_handshake_req = req;
+	__add_pending_locked(net, req);
+	spin_unlock(&net->hs_lock);
+
+	ret = handshake_genl_notify(net, req->hr_proto->hp_handler_class,
+				    flags);
+	if (ret) {
+		trace_handshake_notify_err(net, req, sock, ret);
+		if (remove_pending(net, req))
+			goto out_err;
+	}
+
+	trace_handshake_submit(net, req, sock);
+	return 0;
+
+out_err:
+	trace_handshake_submit_err(net, req, sock, ret);
+	handshake_req_destroy(req, sk);
+	return ret;
+}
+EXPORT_SYMBOL(handshake_req_submit);
+
+void handshake_complete(struct handshake_req *req, unsigned int status,
+			struct nlattr **tb)
+{
+	struct socket *sock = req->hr_sock;
+	struct net *net = sock_net(sock->sk);
+
+	if (!test_and_set_bit(HANDSHAKE_F_COMPLETED, &req->hr_flags)) {
+		trace_handshake_complete(net, req, sock, status);
+		req->hr_proto->hp_done(req, status, tb);
+		__sock_put(sock->sk);
+	}
+}
+
+/**
+ * handshake_req_cancel - consumer API to cancel an in-progress handshake
+ * @sock: socket on which there is an ongoing handshake
+ *
+ * XXX: Perhaps killing the user space agent might also be necessary?
+ *
+ * Request cancellation races with request completion. To determine
+ * who won, callers examine the return value from this function.
+ *
+ * Return values:
+ *   %true - Uncompleted handshake request was canceled or not found
+ *   %false - Handshake request already completed
+ */
+bool handshake_req_cancel(struct socket *sock)
+{
+	struct handshake_req *req;
+	struct sock *sk;
+	struct net *net;
+
+	if (!sock)
+		return true;
+
+	sk = sock->sk;
+	req = sk->sk_handshake_req;
+	net = sock_net(sk);
+
+	if (!req) {
+		trace_handshake_cancel_none(net, req, sock);
+		return true;
+	}
+
+	if (remove_pending(net, req)) {
+		/* Request hadn't been accepted */
+		trace_handshake_cancel(net, req, sock);
+		return true;
+	}
+	if (test_and_set_bit(HANDSHAKE_F_COMPLETED, &req->hr_flags)) {
+		/* Request already completed */
+		trace_handshake_cancel_busy(net, req, sock);
+		return false;
+	}
+
+	__sock_put(sk);
+	trace_handshake_cancel(net, req, sock);
+	return true;
+}
+EXPORT_SYMBOL(handshake_req_cancel);
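Both handshake_complete() and handshake_req_cancel() use test_and_set_bit() on HANDSHAKE_F_COMPLETED so that only one of them acts on a given request. The following userspace sketch shows that pattern under stated assumptions: C11 `atomic_fetch_or()` stands in for the kernel's test_and_set_bit(), and the `demo_*` names and the `done_calls` counter are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_F_COMPLETED 0x1UL

/* Hypothetical stand-in for struct handshake_req. */
struct demo_req {
	atomic_ulong flags;
	int done_calls;		/* how often the "done" callback ran */
};

void demo_req_init(struct demo_req *req)
{
	atomic_init(&req->flags, 0UL);
	req->done_calls = 0;
}

/* Atomically set the flag and report whether it was already set. */
bool demo_test_and_set_completed(struct demo_req *req)
{
	unsigned long old = atomic_fetch_or(&req->flags, DEMO_F_COMPLETED);

	return (old & DEMO_F_COMPLETED) != 0;
}

/* Run the completion callback at most once per request. */
void demo_complete(struct demo_req *req)
{
	if (!demo_test_and_set_completed(req))
		req->done_calls++;
}

/* Returns true if cancellation won, false if completion got there first. */
bool demo_cancel(struct demo_req *req)
{
	return !demo_test_and_set_completed(req);
}
```

Because the read and the write of the flag are a single atomic operation, two callers racing on the same request cannot both observe it as unset, which is what allows the submit path to promise exactly one completion callback.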
diff --git a/net/handshake/trace.c b/net/handshake/trace.c
new file mode 100644
index 000000000000..3a5b6f29a2b8
--- /dev/null
+++ b/net/handshake/trace.c
@@ -0,0 +1,17 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Trace points for transport layer security handshakes.
+ *
+ * Author: Chuck Lever <chuck.lever@oracle.com>
+ *
+ * Copyright (c) 2023, Oracle and/or its affiliates.
+ */
+
+#include <linux/types.h>
+#include <net/sock.h>
+
+#include "handshake.h"
+
+#define CREATE_TRACE_POINTS
+
+#include <trace/events/handshake.h>