[00/11] Keyrings, Block and USB notifications [ver #7]

Message ID 156717343223.2204.15875738850129174524.stgit@warthog.procyon.org.uk (mailing list archive)

David Howells Aug. 30, 2019, 1:57 p.m. UTC
Here's a set of patches to add a general notification queue concept and to
add sources of events for:

 (1) Key/keyring events, such as creating, linking and removal of keys.

 (2) General device events (single common queue) including:

     - Block layer events, such as device errors

     - USB subsystem events, such as device/bus attach/remove, device
       reset, device errors.

Tests for the key/keyring events can be found on the keyutils next branch:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log/?h=next

Notifications are done automatically inside of the testing infrastructure
on every change that every test makes to a key or keyring.

Manual pages can be found there also, including pages for watch_queue(7)
and the watch_devices(2) system call (these should be transferred to the
manpages package if taken upstream).

LSM hooks are included:

 (1) A set of hooks is provided that allows an LSM to rule on whether or
     not a watch may be set.  Each of these hooks takes a different
     "watched object" parameter, so they're not really shareable.  The LSM
     should use current's credentials.  [Wanted by SELinux & Smack]

 (2) A hook is provided to allow an LSM to rule on whether or not a
     particular message may be posted to a particular queue.  This is given
     the credentials from the event generator (which may be the system) and
     the watch setter.  [Wanted by Smack]

I've made a preliminary attempt to provide SELinux and Smack with
implementations of some of these hooks.


Design decisions:

 (1) A misc chardev is used to create and open a ring buffer:

	fd = open("/dev/watch_queue", O_RDWR);

     which is then configured and mmap'd into userspace:

	ioctl(fd, IOC_WATCH_QUEUE_SET_SIZE, BUF_SIZE);
	ioctl(fd, IOC_WATCH_QUEUE_SET_FILTER, &filter);
	buf = mmap(NULL, BUF_SIZE * page_size, PROT_READ | PROT_WRITE,
		   MAP_SHARED, fd, 0);

     The fd cannot be read or written (though there is a facility to use
     write to inject records for debugging) and userspace just pulls data
     directly out of the buffer.

 (2) The ring index pointers are stored inside the ring and are thus
     accessible to userspace.  Userspace should only update the tail
     pointer and never the head pointer or risk breaking the buffer.  The
     kernel checks that the pointers appear valid before trying to use
     them.  A 'skip' record is maintained around the pointers.

 (3) poll() can be used to wait for data to appear in the buffer.

 (4) Records in the buffer are binary, typed and have a length so that they
     can be of varying size.

     This means that multiple heterogeneous sources can share a common
     buffer.  Tags may be specified when a watchpoint is created to help
     distinguish the sources.

 (5) The queue is reusable as there are 16 million types available, of
     which I've used just a few, so there is scope for others to be used.

 (6) Records are filterable as types have up to 256 subtypes that can be
     individually filtered.  Other filtration is also available.

 (7) Each time the buffer is opened, a new buffer is created - this means
     that there's no interference between watchers.

 (8) When recording a notification, the kernel will not sleep, but will
     rather mark a queue as overrun if there's insufficient space, thereby
     avoiding userspace causing the kernel to hang.

 (9) The 'watchpoint' should be specific where possible, meaning that you
     specify the object that you want to watch.

(10) The buffer is created and then watchpoints are attached to it, using
     one of:

	keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fd, 0x01);
	watch_devices(fd, 0x02, 0);

     where in both cases, fd indicates the queue and the number after is a
     tag between 0 and 255.

(11) The watch must be removed if either the watch buffer is destroyed or
     the watched object is destroyed.


Things I want to avoid:

 (1) Introducing features that make the core VFS dependent on the network
     stack or networking namespaces (ie. usage of netlink).

 (2) Dumping all this stuff into dmesg and having a daemon that sits there
     parsing the output and distributing it as this then puts the
     responsibility for security into userspace and makes handling
     namespaces tricky.  Further, dmesg might not exist or might be
     inaccessible inside a container.

 (3) Letting users see events they shouldn't be able to see.


The patches can be found here also:

	http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=notifications-core

Changes:

 ver #7:

 (*) Removed the 'watch' argument from the security_watch_key() and
     security_watch_devices() hooks as current_cred() can be used instead
     of watch->cred.

 ver #6:

 (*) Fix mmap bug in watch_queue driver.

 (*) Add an extended removal notification that can transmit an identifier
     to userspace (such as a key ID).

 (*) Don't produce an instantiation notification in mark_key_instantiated()
     but rather do it in the caller to prevent key updates from producing
     an instantiate notification as well as an update notification.

 (*) Set the right number of filters in the sample program.

 (*) Provide preliminary hook implementations for SELinux and Smack.

 ver #5:

 (*) Split the superblock watch and mount watch parts out into their own
     branch (notifications-mount) as they really need certain fsinfo()
     attributes.

 (*) Rearrange the watch notification UAPI header to push the length down
     to bits 0-5 and remove the lost-message bits.  The userspace's watch
     ID tag is moved to bits 8-15 and then the message type is allocated
     all of bits 16-31 for its own purposes.

     The lost-message bit is moved over to the header, rather than being
     placed in the next message to be generated and given its own word so
     it can be cleared with xchg(,0) for parisc.

 (*) The security_post_notification() hook is no longer called with the
     spinlock held and softirqs disabled - though the RCU readlock is still
     held.

 (*) Buffer pages are now accounted towards RLIMIT_MEMLOCK and CAP_IPC_LOCK
     will skip the overuse check.

 (*) The buffer is marked VM_DONTEXPAND.

 (*) Save the watch-setter's creds in struct watch and give that to the LSM
     hook for posting a message.

 ver #4:

 (*) Split the basic UAPI bits out into their own patch and then split the
     LSM hooks out into an intermediate patch.  Add LSM hooks for setting
     watches.

     Rename the *_notify() system calls to watch_*() for consistency.

 ver #3:

 (*) I've added a USB notification source and reformulated the block
     notification source so that there's now a common watch list, for which
     the system call is now device_notify().

     I've assigned a pair of unused ioctl numbers in the 'W' series to the
     ioctls added by this series.

     I've also added a description of the kernel API to the documentation.

 ver #2:

 (*) I've fixed various issues raised by Jann Horn and GregKH and moved to
     krefs for refcounting.  I've added some security features to try and
     give Casey Schaufler the LSM control he wants.

David
---
David Howells (11):
      uapi: General notification ring definitions
      security: Add hooks to rule on setting a watch
      security: Add a hook for the point of notification insertion
      General notification queue with user mmap()'able ring buffer
      keys: Add a notification facility
      Add a general, global device notification watch list
      block: Add block layer notifications
      usb: Add USB subsystem notifications
      Add sample notification program
      selinux: Implement the watch_key security hook
      smack: Implement the watch_key and post_notification hooks [untested]


 Documentation/ioctl/ioctl-number.rst        |    1 
 Documentation/security/keys/core.rst        |   58 ++
 Documentation/watch_queue.rst               |  460 ++++++++++++++
 arch/alpha/kernel/syscalls/syscall.tbl      |    1 
 arch/arm/tools/syscall.tbl                  |    1 
 arch/ia64/kernel/syscalls/syscall.tbl       |    1 
 arch/m68k/kernel/syscalls/syscall.tbl       |    1 
 arch/microblaze/kernel/syscalls/syscall.tbl |    1 
 arch/mips/kernel/syscalls/syscall_n32.tbl   |    1 
 arch/mips/kernel/syscalls/syscall_n64.tbl   |    1 
 arch/mips/kernel/syscalls/syscall_o32.tbl   |    1 
 arch/parisc/kernel/syscalls/syscall.tbl     |    1 
 arch/powerpc/kernel/syscalls/syscall.tbl    |    1 
 arch/s390/kernel/syscalls/syscall.tbl       |    1 
 arch/sh/kernel/syscalls/syscall.tbl         |    1 
 arch/sparc/kernel/syscalls/syscall.tbl      |    1 
 arch/x86/entry/syscalls/syscall_32.tbl      |    1 
 arch/x86/entry/syscalls/syscall_64.tbl      |    1 
 arch/xtensa/kernel/syscalls/syscall.tbl     |    1 
 block/Kconfig                               |    9 
 block/blk-core.c                            |   29 +
 drivers/base/Kconfig                        |    9 
 drivers/base/Makefile                       |    1 
 drivers/base/watch.c                        |   90 +++
 drivers/misc/Kconfig                        |   13 
 drivers/misc/Makefile                       |    1 
 drivers/misc/watch_queue.c                  |  893 +++++++++++++++++++++++++++
 drivers/usb/core/Kconfig                    |    9 
 drivers/usb/core/devio.c                    |   56 ++
 drivers/usb/core/hub.c                      |    4 
 include/linux/blkdev.h                      |   15 
 include/linux/device.h                      |    7 
 include/linux/key.h                         |    3 
 include/linux/lsm_audit.h                   |    1 
 include/linux/lsm_hooks.h                   |   38 +
 include/linux/security.h                    |   32 +
 include/linux/syscalls.h                    |    1 
 include/linux/usb.h                         |   18 +
 include/linux/watch_queue.h                 |   94 +++
 include/uapi/asm-generic/unistd.h           |    4 
 include/uapi/linux/keyctl.h                 |    2 
 include/uapi/linux/watch_queue.h            |  183 ++++++
 kernel/sys_ni.c                             |    1 
 samples/Kconfig                             |    6 
 samples/Makefile                            |    1 
 samples/watch_queue/Makefile                |    8 
 samples/watch_queue/watch_test.c            |  233 +++++++
 security/keys/Kconfig                       |    9 
 security/keys/compat.c                      |    3 
 security/keys/gc.c                          |    5 
 security/keys/internal.h                    |   30 +
 security/keys/key.c                         |   38 +
 security/keys/keyctl.c                      |   99 +++
 security/keys/keyring.c                     |   20 -
 security/keys/request_key.c                 |    4 
 security/security.c                         |   23 +
 security/selinux/hooks.c                    |   14 
 security/smack/smack_lsm.c                  |   82 ++
 58 files changed, 2593 insertions(+), 30 deletions(-)
 create mode 100644 Documentation/watch_queue.rst
 create mode 100644 drivers/base/watch.c
 create mode 100644 drivers/misc/watch_queue.c
 create mode 100644 include/linux/watch_queue.h
 create mode 100644 include/uapi/linux/watch_queue.h
 create mode 100644 samples/watch_queue/Makefile
 create mode 100644 samples/watch_queue/watch_test.c

Comments

David Howells Aug. 30, 2019, 2:15 p.m. UTC | #1
.\"
.\" Copyright (C) 2019 Red Hat, Inc. All Rights Reserved.
.\" Written by David Howells (dhowells@redhat.com)
.\"
.\" This program is free software; you can redistribute it and/or
.\" modify it under the terms of the GNU General Public Licence
.\" as published by the Free Software Foundation; either version
.\" 2 of the Licence, or (at your option) any later version.
.\"
.TH WATCH_QUEUE 7 "28 Aug 2019" Linux "General Kernel Notifications"
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SH NAME
/dev/watch_queue \- General kernel notification queue
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SH SYNOPSIS
.EX
#include <linux/watch_queue.h>

int fd = open("/dev/watch_queue", O_RDWR);
ioctl(fd, IOC_WATCH_QUEUE_SET_SIZE, size / page_size);
ioctl(fd, IOC_WATCH_QUEUE_SET_FILTER, &filter);
buf = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
.EE
.SH OVERVIEW
.PP
The general kernel notification queue is a general purpose transport for kernel
notification messages to userspace.  Notification messages are marked with type
information so that events from multiple sources can be distinguished.
Messages are also of variable length to accommodate different information for
each type.
.PP
This queue is implemented as a misc device that can be opened multiple times,
each opening creating a fully independent queue.  Queues are then configured
with the size and filtering, event sources are attached and the queue is mapped
into a process's VM.
.PP
Queues take the form of a ring buffer with shared index pointers, all of which
is accessed directly within the mapping.  There are no read and write methods,
though poll is provided so that the buffer can be waited upon.
.PP
A queue pins a certain amount of locked kernel memory (so that the kernel can
write a notification into it from contexts where swapping cannot be performed),
and so is subject to resource limit restrictions on
.BR RLIMIT_MEMLOCK .
.PP
Sources must be attached to a queue manually; there's no single global event
source, but rather a variety of sources, each of which can be attached to by
multiple queues.  Attachments can be set up by:
.TP
.BR keyctl_watch_key (3)
Monitor a key or keyring for changes.
.TP
.BR watch_devices (2)
Monitor a global source of device events from USB and block devices, such as
device detection, device removal and I/O errors.
.PP
Because a source can produce a lot of different events, not all of which may be
of interest to the watcher, a filter can be set on a queue to determine whether
a particular event will get inserted in a queue at the point of posting inside
the kernel.

.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SH RING STRUCTURE
.PP
The ring buffer is divided into 8-byte slots and each notification message occupies
between 1 and 63 of those slots.  Each message begins with a header of the
form:
.PP
.in +4n
.EX
struct watch_notification {
	__u32	type:24;
	__u32	subtype:8;
	__u32	info;
};
.EE
.in
.PP
Where
.I type
indicates the general class of notification,
.I subtype
indicates the specific type of notification within that class and
.I info
includes the length (in slots), the watcher's ID and some type-specific
information.
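.PP
As a rough illustration of how the
.I info
word might be unpacked, the sketch below assumes the layout given in the ver #5
change notes of the covering letter: length in bits 0-5, watch ID in bits 8-15
and type-specific data in bits 16-31.  The EX_* masks and the ex_* function
names are hypothetical stand-ins and are not the definitions from
<linux/watch_queue.h>.
.PP
.in +4n
.EX
```c
#include <stdint.h>

/* Assumed layout taken from the ver #5 change notes, not the UAPI macros. */
#define EX_INFO_LENGTH		0x0000003fu	/* bits 0-5: length in slots */
#define EX_INFO_ID		0x0000ff00u	/* bits 8-15: watcher's ID tag */
#define EX_INFO_ID_SHIFT	8
#define EX_INFO_TYPE_INFO	0xffff0000u	/* bits 16-31: type-specific */
#define EX_INFO_TYPE_INFO_SHIFT	16

/* Length of the record in 8-byte slots; 0 would mean a corrupt record. */
static unsigned int ex_info_length(uint32_t info)
{
	return info & EX_INFO_LENGTH;
}

/* The tag that was supplied when the watch was set. */
static unsigned int ex_info_id(uint32_t info)
{
	return (info & EX_INFO_ID) >> EX_INFO_ID_SHIFT;
}

/* Whatever the notification type chose to store in the top 16 bits. */
static unsigned int ex_info_type_info(uint32_t info)
{
	return (info & EX_INFO_TYPE_INFO) >> EX_INFO_TYPE_INFO_SHIFT;
}
```
.EE
.in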
.PP
Messages inserted into the buffer aren't allowed to split over the end of the
buffer; instead a
.I skip
notification will be inserted to pad to the end of the buffer.  A skip
notification will have the type set to
.B WATCH_TYPE_META
and the subtype set to
.BR WATCH_META_SKIP_NOTIFICATION ,
with the length indicating how much should be skipped.
.PP
To avoid the need for an extra page dedicated solely to metadata pointers, the
first few slots are covered by a permanent skip notification and contain ring
metadata including the pointers.  The buffer has a 'header' of the form:
.PP
.in +4n
.EX
struct {
	struct watch_notification watch;
	__u32	head;
	__u32	tail;
	__u32	mask;
	__u32	__reserved;
};
.EE
.in
.PP
This includes the ring indices,
.IR head " and " tail ,
and a
.I mask
to mask them off with before use.  When using the ring indices, the following
precautions should be observed:
.TP
.B (1)
.I head
indicates where the kernel will insert the next message into the buffer.  Only
the kernel is allowed to change head.
.TP
.B (2)
.I tail
indicates where the next message for userspace to consume can be found; tail
will never be changed by the kernel.
.TP
.B (3)
An
.IR acquire -class
memory barrier must be used to read head.  It is not necessary to use a memory
barrier to read tail.
.TP
.B (4)
The buffer is empty if tail == head.
.TP
.B (5)
head and tail should not be masked off after increment, but rather left to wrap
naturally; this means that the index must be masked off before being used to
access the buffer.
.TP
.B (6)
After consuming a message, the length (in slots) of the message should be added
to tail and tail must not be then masked off.
.TP
.B (7)
A
.IR release -class
memory barrier must be used to update
.IR tail .
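.PP
Rules (4) to (6) amount to treating head and tail as free-running 32-bit
counters.  The following sketch shows the arithmetic involved; the ex_* helper
names are made up for illustration and it is assumed that
.I mask
is the number of slots minus one.
.PP
.in +4n
.EX
```c
#include <stdint.h>

/*
 * Number of slots waiting to be consumed.  Unsigned subtraction gives the
 * right answer even after head has wrapped past 0xffffffff.
 */
static uint32_t ex_ring_used(uint32_t head, uint32_t tail)
{
	return head - tail;
}

/* Slot to access for a free-running index; the index itself is not altered. */
static uint32_t ex_ring_slot(uint32_t index, uint32_t mask)
{
	return index & mask;
}

/* Advance tail past a consumed message without masking it off. */
static uint32_t ex_ring_advance(uint32_t tail, uint32_t len_in_slots)
{
	return tail + len_in_slots;
}
```
.EE
.in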
.PP
If the head and tail values become too far separated or head points to a
forbidden area of the buffer, no further message insertion will take place and
.IR poll ()
will flag
.BR POLLERR .
Otherwise, poll() will flag
.BR POLLIN " and " POLLRDNORM
if tail != head.
.PP
The ring as a whole is described by the following structure:
.PP
.in +4n
.EX
struct watch_queue_buffer {
	union {
		struct {
			struct watch_notification watch;
			__u32	head;
			__u32	tail;
			__u32	mask;
			__u32	__reserved;
		} meta;
		struct watch_notification slots[0];
	};
};
.EE
.in
.PP
Where
.I meta
covers the slots holding the ring indices and other metadata.  Note that the
metadata may be extended in future.  Its size can be determined by checking
the length of the skip pseudo-message that covers it (see
.IR meta.watch ).
.PP
In the event that the ring is full when the kernel needs to write in a
notification, it will set
.B WATCH_INFO_NOTIFICATIONS_LOST
in
.I meta.watch.info
to indicate an overrun.  If the flag is noticed as being set, the entire word
can be simply cleared without bothering the kernel as the kernel doesn't ever
read it.

.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SH IOCTL COMMANDS
The device has the following
.IR ioctl ()
commands:
.TP
.B IOC_WATCH_QUEUE_SET_SIZE
The ioctl argument indicates the size of the buffer in pages and must be a
power of two.  This command allocates the memory to back the buffer.
.IP
This may only be done once and the buffer cannot be mmap'd until this command
has been done.
.TP
.B IOC_WATCH_QUEUE_SET_FILTER
This is used to set filters on the notifications that get written into the
buffer.  The ioctl argument points to a structure of the following form:
.IP
.in +4n
.EX
struct watch_notification_filter {
	__u32	nr_filters;
	__u32	__reserved;
	struct watch_notification_type_filter filters[];
};
.EE
.in
.IP
Where
.I nr_filters
indicates the number of elements in the
.IR filters []
array.  Each element in the filters array specifies a filter and is of the
following form:
.IP
.in +4n
.EX
struct watch_notification_type_filter {
	__u32	type;
	__u32	info_filter;
	__u32	info_mask;
	__u32	subtype_filter[8];
};
.EE
.in
.IP
Where
.I type
refers to the type field in a notification record header, info_filter and
info_mask refer to the info field and subtype_filter is a bit-mask of subtypes.
.IP
If no filters are installed, all notifications are allowed by default and if
one or more filters are installed, notifications are disallowed by default.
.IP
A notification matches a filter if, for notification N and filter F:
.IP
.in +4n
.EX
N->type == F->type &&
(F->subtype_filter[N->subtype >> 5] &
	(1U << (N->subtype & 31))) &&
((N->info & F->info_mask) == F->info_filter)
.EE
.in
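.IP
The matching rule and the allow-by-default behaviour can be modelled in
userspace roughly as follows.  This is only a sketch for reasoning about filter
settings: struct ex_type_filter is a local copy of the layout shown above and
the ex_* functions are hypothetical, not part of the kernel interface.
.IP
.in +4n
.EX
```c
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of the filter element described above. */
struct ex_type_filter {
	uint32_t type;
	uint32_t info_filter;
	uint32_t info_mask;
	uint32_t subtype_filter[8];	/* one bit per subtype, 256 in all */
};

/* Apply the matching rule to the fields of one notification header. */
static bool ex_filter_matches(const struct ex_type_filter *f,
			      uint32_t type, uint8_t subtype, uint32_t info)
{
	return type == f->type &&
		(f->subtype_filter[subtype >> 5] & (1U << (subtype & 31))) &&
		(info & f->info_mask) == f->info_filter;
}

/* No filters: everything passes.  Otherwise at least one must match. */
static bool ex_allowed(const struct ex_type_filter *filters, uint32_t nr,
		       uint32_t type, uint8_t subtype, uint32_t info)
{
	uint32_t i;

	if (nr == 0)
		return true;
	for (i = 0; i < nr; i++)
		if (ex_filter_matches(&filters[i], type, subtype, info))
			return true;
	return false;
}
```
.EE
.in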
.IP


.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SH EXAMPLE
To use the notification mechanism, first of all the device has to be opened,
the size must be set and the buffer mapped:
.PP
.in +4n
.EX
int wfd = open("/dev/watch_queue", O_RDWR);

ioctl(wfd, IOC_WATCH_QUEUE_SET_SIZE, 1);

struct watch_queue_buffer *buf =
	mmap(NULL, 1 * PAGE_SIZE, PROT_READ | PROT_WRITE,
	     MAP_SHARED, wfd, 0);

.EE
.in
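.PP
Since the size is given in pages and must be a power of two, a caller that
starts from a byte count has to round up twice.  A minimal sketch of that
calculation is given below; ex_pages_for() is a hypothetical helper, not
something provided by the driver or by keyutils.
.PP
.in +4n
.EX
```c
#include <limits.h>
#include <stddef.h>

/*
 * Smallest power-of-two number of pages that holds `bytes`.
 * Returns 0 if either argument is 0 or the result would overflow.
 */
static unsigned long ex_pages_for(size_t bytes, size_t page_size)
{
	size_t needed;
	unsigned long pages = 1;

	if (bytes == 0 || page_size == 0)
		return 0;

	/* Round the byte count up to whole pages without overflowing. */
	needed = bytes / page_size + (bytes % page_size != 0);

	/* Then round the page count up to the next power of two. */
	while (pages < needed) {
		if (pages > ULONG_MAX / 2)
			return 0;
		pages <<= 1;
	}
	return pages;
}
```
.EE
.in
.PP
The page size would typically be obtained with sysconf(_SC_PAGESIZE); the
result would then be the ioctl argument, and the result multiplied by the page
size would be the mmap length.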
.PP
From this point, the buffer is open for business.  Filters can be set to
restrict the notifications that get inserted into the buffer from the sources
that are watched.  For example:
.PP
.in +4n
.EX
static struct watch_notification_filter filter = {
	.nr_filters	= 2,
	.__reserved	= 0,
	.filters = {
		[0]	= {
			.type			= WATCH_TYPE_KEY_NOTIFY,
			.subtype_filter[0]	= 1 << NOTIFY_KEY_LINKED,
			.info_filter		= 1 << WATCH_INFO_FLAG_2,
			.info_mask		= 1 << WATCH_INFO_FLAG_2,
		},
		[1]	= {
			.type			= WATCH_TYPE_USB_NOTIFY,
			.subtype_filter[0]	= 1 << NOTIFY_USB_DEVICE_ADD,
		},
	},
};

ioctl(wfd, IOC_WATCH_QUEUE_SET_FILTER, &filter);
.EE
.in
.PP
This only allows key-change notifications that indicate a key being linked into
a keyring, and then only if the type-specific flag WATCH_INFO_FLAG_2 is set on
the notification.  It also allows USB device-add notifications, but blocks all
other USB notifications and all block device notifications.
.PP
Sources can then be watched, for example:
.PP
.in +4n
.EX
keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, wfd, 0x33);
watch_devices(wfd, 0x55, 0);
.EE
.in
.PP
The first places a watch on the process's session keyring, directing the
notifications to the buffer we just created and specifying that they should be
tagged with 0x33 in the info ID field.  The second places a watch on the global
device notifications queue, specifying that notifications from that should be
tagged with info ID 0x55.
.PP
The device file descriptor can then be polled to find out when the kernel
writes something into the buffer or if the ring indices become incoherent:
.PP
.in +4n
.EX
struct pollfd p[1];
p[0].fd = wfd;
p[0].events = POLLIN | POLLERR;
p[0].revents = 0;
poll(p, 1, -1);
.EE
.in
.PP
When it is determined that there is something in the buffer, messages can be
read out of the ring with something like the following:
.PP
.in +4n
.EX
struct watch_notification *n;
unsigned int len, head, tail, mask = buf->meta.mask;

while (head = __atomic_load_n(&buf->meta.head,
                              __ATOMIC_ACQUIRE),
       tail = buf->meta.tail,
       tail != head
       ) {
        n = &buf->slots[tail & mask];
        len = n->info & WATCH_INFO_LENGTH;
        len >>= WATCH_INFO_LENGTH__SHIFT;
        if (len == 0)
                abort();

        switch (n->type) {
        case WATCH_TYPE_META:
                switch (n->subtype) {
                case WATCH_META_REMOVAL_NOTIFICATION:
                        saw_removal_notification(n);
                        break;
                }
                break;
        case WATCH_TYPE_KEY_NOTIFY:
                saw_key_change(n);
                break;
        case WATCH_TYPE_USB_NOTIFY:
                saw_usb_event(n);
                break;
        }

        tail += len;
        __atomic_store_n(&buf->meta.tail, tail, __ATOMIC_RELEASE);
}
.EE
.in
.PP

.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SH VERSIONS
The notification queue driver first appeared in v??? of the Linux kernel.
.SH SEE ALSO
.ad l
.nh
.BR ioctl (2),
.BR keyctl (1),
.BR keyctl_watch_key (3),
.BR poll (2),
.BR setrlimit (2)
David Howells Aug. 30, 2019, 2:15 p.m. UTC | #2
'\" t
.\" Copyright (c) 2019 David Howells <dhowells@redhat.com>
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date.  The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein.  The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.TH WATCH_DEVICES 2 2019-08-29 "Linux" "Linux Programmer's Manual"
.SH NAME
watch_devices \- Watch for global device notifications
.SH SYNOPSIS
.nf
.B #include <linux/watch_queue.h>
.br
.B #include <unistd.h>
.br
.BI "int watch_devices(int " watch_fd ", int " watch_id ", unsigned int " flags );
.fi
.PP
.IR Note :
There is no glibc wrapper for this system call.
.SH DESCRIPTION
.PP
.BR watch_devices ()
attaches a watch on the global device notification source to a previously
opened and configured watch queue.  See
.BR watch_queue (7)
for more information on how to set up and use those.
.PP
The global device notification source is provided with events from a number of
sources, including block device errors and USB events.  Each notification type
has a specific format.

.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SS Block Device Notifications
Events on block devices, such as I/O errors, are posted to any watching queues.
The message format is:
.PP
.in +4n
.EX
struct block_notification {
	struct watch_notification watch;
	__u64	dev;
	__u64	sector;
};
.EE
.in
.PP
The
.I watch.type
field will be set to
.BR WATCH_TYPE_BLOCK_NOTIFY ,
the
.I watch.subtype
field will contain a constant that indicates the particular event that occurred
and the watch_id passed to watch_devices() will be placed in
.I watch.info
in the ID field.
.PP
.I dev
will contain the major and minor device numbers in
.B dev_t
form and
.I sector
will contain the first sector the notification pertains to.
.PP
The following events are defined:
.PP
.in +4n
.TS
lB l.
NOTIFY_BLOCK_ERROR_TIMEOUT
NOTIFY_BLOCK_ERROR_NO_SPACE
NOTIFY_BLOCK_ERROR_RECOVERABLE_TRANSPORT
NOTIFY_BLOCK_ERROR_CRITICAL_TARGET
NOTIFY_BLOCK_ERROR_CRITICAL_NEXUS
NOTIFY_BLOCK_ERROR_CRITICAL_MEDIUM
NOTIFY_BLOCK_ERROR_PROTECTION
NOTIFY_BLOCK_ERROR_KERNEL_RESOURCE
NOTIFY_BLOCK_ERROR_DEVICE_RESOURCE
NOTIFY_BLOCK_ERROR_IO
.TE
.in
.PP
All of which indicate error conditions.

.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SS USB Device Notifications
Events on USB devices, such as I/O errors, are posted to any watching queues.
The message format is:
.PP
.in +4n
.EX
struct usb_notification {
        struct watch_notification watch;
        __u32   error;
        __u32   reserved;
        __u8    name_len;
        __u8    name[0];
};
.EE
.in
.PP
The
.I watch.type
field will be set to
.BR WATCH_TYPE_USB_NOTIFY ,
the
.I watch.subtype
field will contain a constant that indicates the particular event that occurred
and the watch_id passed to watch_devices() will be placed in
.I watch.info
in the ID field.
.PP
.IR name " and " name_len
indicate the textual name of the USB device that originated the notification.
The name will be truncated to
.B USB_NOTIFICATION_MAX_NAME_LEN
if it is longer than that.
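.PP
Because the name is carried as a length plus bytes, a consumer may want to copy
it into a terminated string before printing it.  The sketch below assumes the
name is not guaranteed to be NUL-terminated inside the record; ex_copy_name()
is a hypothetical helper.
.PP
.in +4n
.EX
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy name_len bytes of name into buf, truncating if need be, and always
 * terminate the result.  Returns the number of name bytes copied.
 */
static size_t ex_copy_name(char *buf, size_t buflen,
			   const uint8_t *name, uint8_t name_len)
{
	size_t n = name_len;

	if (buflen == 0)
		return 0;
	if (n > buflen - 1)
		n = buflen - 1;
	memcpy(buf, name, n);
	buf[n] = 0;
	return n;
}
```
.EE
.in
.PP
As name_len is a single byte, a 256-byte buffer is always large enough.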
.PP
The following subtypes are currently defined:
.TP
.B NOTIFY_USB_DEVICE_ADD
A new USB device has been plugged in.
.TP
.B NOTIFY_USB_DEVICE_REMOVE
A USB device has been unplugged.
.TP
.B NOTIFY_USB_BUS_ADD
A new USB bus is now available.
.TP
.B NOTIFY_USB_BUS_REMOVE
A USB bus has been removed.
.TP
.B NOTIFY_USB_DEVICE_RESET
A USB device has been reset.
.TP
.B NOTIFY_USB_DEVICE_ERROR
A USB device has generated an error; a suitable error code will have been
placed in
.IR error .


.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SH RETURN VALUE
On success, the function returns 0.  On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
The following errors may be returned:
.TP
.B EBADF
.I watch_fd
is an invalid file descriptor.
.TP
.B EBADSLT
The watch does not exist and so cannot be removed.
.TP
.B EBUSY
The source is already attached to the watch device instance specified by
.I watch_fd
and so cannot be added.
.TP
.B EINVAL
.I watch_fd
does not refer to a watch_queue device file.
.TP
.B EINVAL
.IR watch_fd " or " watch_id
is out of range.
.TP
.B EINVAL
Unsupported
.I flags
set.
.TP
.B ENOMEM
Insufficient memory available to allocate a watch record.
.TP
.B EPERM
The caller does not have the required privileges.
.SH CONFORMING TO
This function is Linux-specific and should not be used in programs intended
to be portable.
.SH VERSIONS
The notification queue driver first appeared in v??? of the Linux kernel.
.SH NOTES
Glibc does not (yet) provide a wrapper for the
.BR watch_devices "()"
system call; call it using
.BR syscall (2).
.SH SEE ALSO
.BR watch_queue (7)
David Howells Aug. 30, 2019, 2:16 p.m. UTC | #3
.\"
.\" Copyright (C) 2019 Red Hat, Inc. All Rights Reserved.
.\" Written by David Howells (dhowells@redhat.com)
.\"
.\" This program is free software; you can redistribute it and/or
.\" modify it under the terms of the GNU General Public License
.\" as published by the Free Software Foundation; either version
.\" 2 of the License, or (at your option) any later version.
.\"
.TH KEYCTL_WATCH_KEY 3 "28 Aug 2019" Linux "Linux Key Management Calls"
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SH NAME
keyctl_watch_key \- Watch for changes to a key
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SH SYNOPSIS
.nf
.B #include <keyutils.h>
.sp
.BI "long keyctl_watch_key(key_serial_t " key ,
.BI "                      int " watch_queue_fd ,
.BI "                      int " watch_id ");"
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SH DESCRIPTION
.BR keyctl_watch_key ()
sets or removes a watch on
.IR key .
.PP
.I watch_id
specifies the ID for a watch that will be included in notification messages.
It can be between 0 and 255 to add a watch; it should be -1 to remove a watch.
.PP
.I watch_queue_fd
is a file descriptor attached to a watch_queue device instance.  Multiple
openings of a device provide separate instances.  Each device instance can
only have one watch on any particular key.
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SS Notification Record
.PP
Key-specific notification messages that the kernel emits into the buffer have
the following format:
.PP
.in +4n
.EX
struct key_notification {
	struct watch_notification watch;
	__u32	key_id;
	__u32	aux;
};
.EE
.in
.PP
The
.I watch.type
field will be set to
.B WATCH_TYPE_KEY_NOTIFY
and the
.I watch.subtype
field will contain one of the constants listed below, indicating the event that
occurred.  The watch_id passed to keyctl_watch_key() will be placed in
.I watch.info
in the ID field.  The following events are defined:
.TP
.B NOTIFY_KEY_INSTANTIATED
This indicates that a watched key got instantiated or negatively instantiated.
.I key_id
indicates the key that was instantiated and
.I aux
is unused.
.TP
.B NOTIFY_KEY_UPDATED
This indicates that a watched key got updated or instantiated by update.
.I key_id
indicates the key that was updated and
.I aux
is unused.
.TP
.B NOTIFY_KEY_LINKED
This indicates that a key got linked into a watched keyring.
.I key_id
indicates the keyring that was modified and
.I aux
indicates the key that was added.
.TP
.B NOTIFY_KEY_UNLINKED
This indicates that a key got unlinked from a watched keyring.
.I key_id
indicates the keyring that was modified and
.I aux
indicates the key that was removed.
.TP
.B NOTIFY_KEY_CLEARED
This indicates that a watched keyring got cleared.
.I key_id
indicates the keyring that was cleared and
.I aux
is unused.
.TP
.B NOTIFY_KEY_REVOKED
This indicates that a watched key got revoked.
.I key_id
indicates the key that was revoked and
.I aux
is unused.
.TP
.B NOTIFY_KEY_INVALIDATED
This indicates that a watched key got invalidated.
.I key_id
indicates the key that was invalidated and
.I aux
is unused.
.TP
.B NOTIFY_KEY_SETATTR
This indicates that a watched key had its attributes (owner, group,
permissions, timeout) modified.
.I key_id
indicates the key that was modified and
.I aux
is unused.
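.PP
As a rough illustration of how a reader of the buffer might interpret these
fields, the sketch below formats one record as text.  Everything prefixed
.I sk_
is a hypothetical stand-in: the numeric values of the event constants, the
flattened header layout, and the placement of the watch ID in bits 8-15 of
.I watch.info
are assumptions made for this example, and a real program would take the
definitions from the UAPI header instead.
.PP
```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins for the UAPI definitions; values and layout are assumptions. */
enum sk_key_event {
	SK_KEY_INSTANTIATED, SK_KEY_UPDATED, SK_KEY_LINKED, SK_KEY_UNLINKED,
	SK_KEY_CLEARED, SK_KEY_REVOKED, SK_KEY_INVALIDATED, SK_KEY_SETATTR,
	SK_KEY_NR_EVENTS
};

struct sk_watch_notification {
	uint32_t type;		/* simplified: no bitfields */
	uint32_t subtype;
	uint32_t info;		/* bits 8-15 assumed to carry the watch_id */
};

struct sk_key_notification {
	struct sk_watch_notification watch;
	uint32_t key_id;
	uint32_t aux;
};

/* Extract the watch_id that was passed to keyctl_watch_key(). */
static unsigned int sk_watch_id(const struct sk_key_notification *n)
{
	return (n->watch.info >> 8) & 0xff;
}

/* Format one record; returns snprintf()'s result, or -1 for an unknown event. */
static int sk_describe(const struct sk_key_notification *n, char *buf, size_t len)
{
	static const char *const verb[SK_KEY_NR_EVENTS] = {
		"instantiated", "updated", "linked", "unlinked",
		"cleared", "revoked", "invalidated", "setattr",
	};

	assert(n && buf && len > 0);
	if (n->watch.subtype >= SK_KEY_NR_EVENTS)
		return -1;

	switch (n->watch.subtype) {
	case SK_KEY_LINKED:
	case SK_KEY_UNLINKED:
		/* key_id is the watched keyring; aux is the key added or removed. */
		return snprintf(buf, len,
				"watch %u: keyring %" PRIu32 " %s key %" PRIu32,
				sk_watch_id(n), n->key_id,
				verb[n->watch.subtype], n->aux);
	default:
		/* aux is unused for every other event. */
		return snprintf(buf, len, "watch %u: key %" PRIu32 " %s",
				sk_watch_id(n), n->key_id,
				verb[n->watch.subtype]);
	}
}
```
.PP
Only the link and unlink events consult
.IR aux ;
for the others the sketch ignores it, matching the "unused" wording above.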
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SS Removal Notification
When a watched key is garbage collected, all of its watches are automatically
destroyed and a notification is delivered to each watcher.  This will normally
be an extended notification of the form:
.PP
.in +4n
.EX
struct watch_notification_removal {
	struct watch_notification watch;
	__u64	id;
};
.EE
.in
.PP
The
.I watch.type
field will be set to
.B WATCH_TYPE_META
and the
.I watch.subtype
field will contain
.BR WATCH_META_REMOVAL_NOTIFICATION .
If the extended notification is given, then the length will be 2 units;
otherwise it will be 1 and only the header will be present.
.PP
The watch_id passed to
.BR keyctl_watch_key ()
will be placed in
.I watch.info
in the ID field.
.PP
If the extension is present,
.I id
will be set to the ID of the destroyed key.
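.PP
The length check described above might be handled along the following lines.
This is a sketch only: the
.I sk_
names are hypothetical, the header is modelled as two 32-bit words, and the
record length is assumed to live in bits 0-5 of
.I watch.info
and to be counted in header-sized units.
.PP
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct sk_hdr {
	uint32_t type_subtype;	/* type and subtype packed together (assumed) */
	uint32_t info;		/* bits 0-5: length in header-sized units (assumed) */
};

struct sk_removal {
	struct sk_hdr watch;
	uint64_t id;		/* only present in the extended form */
};

#define SK_INFO_LENGTH 0x3fu

/*
 * Return true and store the destroyed key's ID in *id if the record at rec
 * is an extended removal notification; return false if only the header is
 * present or if fewer than the required bytes are available.
 */
static bool sk_removal_key_id(const void *rec, size_t avail, uint64_t *id)
{
	struct sk_hdr hdr;
	struct sk_removal ext;

	assert(rec && id);
	if (avail < sizeof(hdr))
		return false;
	memcpy(&hdr, rec, sizeof(hdr));

	if ((hdr.info & SK_INFO_LENGTH) < 2 || avail < sizeof(ext))
		return false;	/* header-only notification */

	memcpy(&ext, rec, sizeof(ext));
	*id = ext.id;
	return true;
}
```
.PP
Copying with memcpy() avoids assuming that the record is suitably aligned for
a direct structure access.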
.PP
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SH RETURN VALUE
On success,
.BR keyctl_watch_key ()
returns
.BR 0 .
On error, the value
.B -1
will be returned and
.I errno
will have been set to an appropriate error.
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SH ERRORS
.TP
.B ENOKEY
The specified key does not exist.
.TP
.B EKEYEXPIRED
The specified key has expired.
.TP
.B EKEYREVOKED
The specified key has been revoked.
.TP
.B EACCES
The named key exists, but does not grant
.B view
permission to the calling process.
.TP
.B EBUSY
The specified key already has a watch on it for that device instance (add
only).
.TP
.B EBADSLT
The specified key doesn't have a watch on it (removal only).
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SH LINKING
This is a library function that can be found in
.IR libkeyutils .
When linking,
.B \-lkeyutils
should be specified to the linker.
.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.SH SEE ALSO
.ad l
.nh
.BR keyctl (1),
.BR add_key (2),
.BR keyctl (2),
.BR request_key (2),
.BR keyctl (3),
.BR keyrings (7),
.BR keyutils (7)
Casey Schaufler Aug. 30, 2019, 10:09 p.m. UTC | #4
On 8/30/2019 6:57 AM, David Howells wrote:
> Here's a set of patches to add a general notification queue concept and to
> add sources of events for:
>
>  (1) Key/keyring events, such as creating, linking and removal of keys.
>
>  (2) General device events (single common queue) including:
>
>      - Block layer events, such as device errors
>
>      - USB subsystem events, such as device/bus attach/remove, device
>        reset, device errors.
>
> Tests for the key/keyring events can be found on the keyutils next branch:
>
> 	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log/?h=next

I'm having trouble with the "make install" on Fedora. Is there an
unusual dependency?

>
> Notifications are done automatically inside of the testing infrastructure
> on every change to that every test makes to a key or keyring.
>
> Manual pages can be found there also, including pages for watch_queue(7)
> and the watch_devices(2) system call (these should be transferred to the
> manpages package if taken upstream).
>
> LSM hooks are included:
>
>  (1) A set of hooks are provided that allow an LSM to rule on whether or
>      not a watch may be set.  Each of these hooks takes a different
>      "watched object" parameter, so they're not really shareable.  The LSM
>      should use current's credentials.  [Wanted by SELinux & Smack]
>
>  (2) A hook is provided to allow an LSM to rule on whether or not a
>      particular message may be posted to a particular queue.  This is given
>      the credentials from the event generator (which may be the system) and
>      the watch setter.  [Wanted by Smack]
>
> I've provided a preliminary attempt to provide SELinux and Smack with
> implementations of some of these hooks.
>
>
> Design decisions:
>
>  (1) A misc chardev is used to create and open a ring buffer:
>
> 	fd = open("/dev/watch_queue", O_RDWR);
>
>      which is then configured and mmap'd into userspace:
>
> 	ioctl(fd, IOC_WATCH_QUEUE_SET_SIZE, BUF_SIZE);
> 	ioctl(fd, IOC_WATCH_QUEUE_SET_FILTER, &filter);
> 	buf = mmap(NULL, BUF_SIZE * page_size, PROT_READ | PROT_WRITE,
> 		   MAP_SHARED, fd, 0);
>
>      The fd cannot be read or written (though there is a facility to use
>      write to inject records for debugging) and userspace just pulls data
>      directly out of the buffer.
>
>  (2) The ring index pointers are stored inside the ring and are thus
>      accessible to userspace.  Userspace should only update the tail
>      pointer and never the head pointer or risk breaking the buffer.  The
>      kernel checks that the pointers appear valid before trying to use
>      them.  A 'skip' record is maintained around the pointers.
>
>  (3) poll() can be used to wait for data to appear in the buffer.
>
>  (4) Records in the buffer are binary, typed and have a length so that they
>      can be of varying size.
>
>      This means that multiple heterogeneous sources can share a common
>      buffer.  Tags may be specified when a watchpoint is created to help
>      distinguish the sources.
>
>  (5) The queue is reusable as there are 16 million types available, of
>      which I've used just a few, so there is scope for others to be used.
>
>  (6) Records are filterable as types have up to 256 subtypes that can be
>      individually filtered.  Other filtration is also available.
>
>  (7) Each time the buffer is opened, a new buffer is created - this means
>      that there's no interference between watchers.
>
>  (8) When recording a notification, the kernel will not sleep, but will
>      rather mark a queue as overrun if there's insufficient space, thereby
>      avoiding userspace causing the kernel to hang.
>
>  (9) The 'watchpoint' should be specific where possible, meaning that you
>      specify the object that you want to watch.
>
> (10) The buffer is created and then watchpoints are attached to it, using
>      one of:
>
> 	keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fd, 0x01);
> 	watch_devices(fd, 0x02, 0);
>
>      where in both cases, fd indicates the queue and the number after is a
>      tag between 0 and 255.
>
> (11) The watch must be removed if either the watch buffer is destroyed or
>      the watched object is destroyed.
>
>
> Things I want to avoid:
>
>  (1) Introducing features that make the core VFS dependent on the network
>      stack or networking namespaces (ie. usage of netlink).
>
>  (2) Dumping all this stuff into dmesg and having a daemon that sits there
>      parsing the output and distributing it as this then puts the
>      responsibility for security into userspace and makes handling
>      namespaces tricky.  Further, dmesg might not exist or might be
>      inaccessible inside a container.
>
>  (3) Letting users see events they shouldn't be able to see.
>
>
> The patches can be found here also:
>
> 	http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=notifications-core
>
> Changes:
>
>  ver #7:
>
>  (*) Removed the 'watch' argument from the security_watch_key() and
>      security_watch_devices() hooks as current_cred() can be used instead
>      of watch->cred.
>
>  ver #6:
>
>  (*) Fix mmap bug in watch_queue driver.
>
>  (*) Add an extended removal notification that can transmit an identifier
>      to userspace (such as a key ID).
>
>  (*) Don't produce a instantiation notification in mark_key_instantiated()
>      but rather do it in the caller to prevent key updates from producing
>      an instantiate notification as well as an update notification.
>
>  (*) Set the right number of filters in the sample program.
>
>  (*) Provide preliminary hook implementations for SELinux and Smack.
>
>  ver #5:
>
>  (*) Split the superblock watch and mount watch parts out into their own
>      branch (notifications-mount) as they really need certain fsinfo()
>      attributes.
>
>  (*) Rearrange the watch notification UAPI header to push the length down
>      to bits 0-5 and remove the lost-message bits.  The userspace's watch
>      ID tag is moved to bits 8-15 and then the message type is allocated
>      all of bits 16-31 for its own purposes.
>
>      The lost-message bit is moved over to the header, rather than being
>      placed in the next message to be generated and given its own word so
>      it can be cleared with xchg(,0) for parisc.
>
>  (*) The security_post_notification() hook is no longer called with the
>      spinlock held and softirqs disabled - though the RCU readlock is still
>      held.
>
>  (*) Buffer pages are now accounted towards RLIMIT_MEMLOCK and CAP_IPC_LOCK
>      will skip the overuse check.
>
>  (*) The buffer is marked VM_DONTEXPAND.
>
>  (*) Save the watch-setter's creds in struct watch and give that to the LSM
>      hook for posting a message.
>
>  ver #4:
>
>  (*) Split the basic UAPI bits out into their own patch and then split the
>      LSM hooks out into an intermediate patch.  Add LSM hooks for setting
>      watches.
>
>      Rename the *_notify() system calls to watch_*() for consistency.
>
>  ver #3:
>
>  (*) I've added a USB notification source and reformulated the block
>      notification source so that there's now a common watch list, for which
>      the system call is now device_notify().
>
>      I've assigned a pair of unused ioctl numbers in the 'W' series to the
>      ioctls added by this series.
>
>      I've also added a description of the kernel API to the documentation.
>
>  ver #2:
>
>  (*) I've fixed various issues raised by Jann Horn and GregKH and moved to
>      krefs for refcounting.  I've added some security features to try and
>      give Casey Schaufler the LSM control he wants.
>
> David
> ---
> David Howells (11):
>       uapi: General notification ring definitions
>       security: Add hooks to rule on setting a watch
>       security: Add a hook for the point of notification insertion
>       General notification queue with user mmap()'able ring buffer
>       keys: Add a notification facility
>       Add a general, global device notification watch list
>       block: Add block layer notifications
>       usb: Add USB subsystem notifications
>       Add sample notification program
>       selinux: Implement the watch_key security hook
>       smack: Implement the watch_key and post_notification hooks [untested]
>
>
>  Documentation/ioctl/ioctl-number.rst        |    1 
>  Documentation/security/keys/core.rst        |   58 ++
>  Documentation/watch_queue.rst               |  460 ++++++++++++++
>  arch/alpha/kernel/syscalls/syscall.tbl      |    1 
>  arch/arm/tools/syscall.tbl                  |    1 
>  arch/ia64/kernel/syscalls/syscall.tbl       |    1 
>  arch/m68k/kernel/syscalls/syscall.tbl       |    1 
>  arch/microblaze/kernel/syscalls/syscall.tbl |    1 
>  arch/mips/kernel/syscalls/syscall_n32.tbl   |    1 
>  arch/mips/kernel/syscalls/syscall_n64.tbl   |    1 
>  arch/mips/kernel/syscalls/syscall_o32.tbl   |    1 
>  arch/parisc/kernel/syscalls/syscall.tbl     |    1 
>  arch/powerpc/kernel/syscalls/syscall.tbl    |    1 
>  arch/s390/kernel/syscalls/syscall.tbl       |    1 
>  arch/sh/kernel/syscalls/syscall.tbl         |    1 
>  arch/sparc/kernel/syscalls/syscall.tbl      |    1 
>  arch/x86/entry/syscalls/syscall_32.tbl      |    1 
>  arch/x86/entry/syscalls/syscall_64.tbl      |    1 
>  arch/xtensa/kernel/syscalls/syscall.tbl     |    1 
>  block/Kconfig                               |    9 
>  block/blk-core.c                            |   29 +
>  drivers/base/Kconfig                        |    9 
>  drivers/base/Makefile                       |    1 
>  drivers/base/watch.c                        |   90 +++
>  drivers/misc/Kconfig                        |   13 
>  drivers/misc/Makefile                       |    1 
>  drivers/misc/watch_queue.c                  |  893 +++++++++++++++++++++++++++
>  drivers/usb/core/Kconfig                    |    9 
>  drivers/usb/core/devio.c                    |   56 ++
>  drivers/usb/core/hub.c                      |    4 
>  include/linux/blkdev.h                      |   15 
>  include/linux/device.h                      |    7 
>  include/linux/key.h                         |    3 
>  include/linux/lsm_audit.h                   |    1 
>  include/linux/lsm_hooks.h                   |   38 +
>  include/linux/security.h                    |   32 +
>  include/linux/syscalls.h                    |    1 
>  include/linux/usb.h                         |   18 +
>  include/linux/watch_queue.h                 |   94 +++
>  include/uapi/asm-generic/unistd.h           |    4 
>  include/uapi/linux/keyctl.h                 |    2 
>  include/uapi/linux/watch_queue.h            |  183 ++++++
>  kernel/sys_ni.c                             |    1 
>  samples/Kconfig                             |    6 
>  samples/Makefile                            |    1 
>  samples/watch_queue/Makefile                |    8 
>  samples/watch_queue/watch_test.c            |  233 +++++++
>  security/keys/Kconfig                       |    9 
>  security/keys/compat.c                      |    3 
>  security/keys/gc.c                          |    5 
>  security/keys/internal.h                    |   30 +
>  security/keys/key.c                         |   38 +
>  security/keys/keyctl.c                      |   99 +++
>  security/keys/keyring.c                     |   20 -
>  security/keys/request_key.c                 |    4 
>  security/security.c                         |   23 +
>  security/selinux/hooks.c                    |   14 
>  security/smack/smack_lsm.c                  |   82 ++
>  58 files changed, 2593 insertions(+), 30 deletions(-)
>  create mode 100644 Documentation/watch_queue.rst
>  create mode 100644 drivers/base/watch.c
>  create mode 100644 drivers/misc/watch_queue.c
>  create mode 100644 include/linux/watch_queue.h
>  create mode 100644 include/uapi/linux/watch_queue.h
>  create mode 100644 samples/watch_queue/Makefile
>  create mode 100644 samples/watch_queue/watch_test.c
>
David Howells Sept. 2, 2019, 12:39 p.m. UTC | #5
Casey Schaufler <casey@schaufler-ca.com> wrote:

> > Tests for the key/keyring events can be found on the keyutils next branch:
> >
> > 	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log/?h=next
> 
> I'm having trouble with the "make install" on Fedora. Is there an
> unusual dependency?

What's the symptom you're seeing?  Is it this:

install -D -m 0644 libkeyutils.a /tmp/opt/lib64 libcrypt.so.2 => /lib64/libcrypt.so.2 (0x00007f7dcbf6d000)/libkeyutils.a
/bin/sh: -c: line 0: syntax error near unexpected token `('
/bin/sh: -c: line 0: `install -D -m 0644 libkeyutils.a /tmp/opt/lib64 libcrypt.so.2 => /lib64/libcrypt.so.2 (0x00007f7dcbf6d000)/libkeyutils.a'

David
David Howells Sept. 2, 2019, 1:26 p.m. UTC | #6
Casey Schaufler <casey@schaufler-ca.com> wrote:

> > Tests for the key/keyring events can be found on the keyutils next branch:
> >
> > 	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log/?h=next
> 
> I'm having trouble with the "make install" on Fedora. Is there an
> unusual dependency?

I've pushed a couple of patches to my next branch.  Do "make install" and
"make rpm" now work for you?

David
David Howells Sept. 3, 2019, 4:06 p.m. UTC | #7
Hillf Danton <hdanton@sina.com> wrote:

> > +	smp_store_release(&buf->meta.head, head);
> 
> Add a line of comment for the paring smp_load_acquire().
> I did not find it in 04/11.

You won't find a pairing smp_load_acquire() in the kernel code, because the
other side of this barrier is in userspace; if you look in the sample, you'll
find the corresponding barrier there.  Note that there's a further implicit
barrier you don't see.

I've added the comments:

	/* Barrier against userspace, ordering data read before tail read */
	ring_tail = READ_ONCE(buf->meta.tail);

and:

	/* Barrier against userspace, ordering head update after data write. */
	smp_store_release(&buf->meta.head, head);
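
To make the pairing concrete, here is a rough single-file model of both
sides using C11 atomics.  It is not the code from the sample program or the
driver: the sk_ names, the fixed slot array and the use of <stdatomic.h> are
assumptions made purely for illustration.  The producer plays the kernel's
role (write data, then release-store head) and the consumer plays
userspace's role (acquire-load head, read data, then release-store tail).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SK_RING_SLOTS 8u	/* power of two, so indices can be masked */

struct sk_ring {
	_Atomic uint32_t head;	/* advanced by the producer only */
	_Atomic uint32_t tail;	/* advanced by the consumer only */
	uint32_t slot[SK_RING_SLOTS];
};

/* Producer side (the kernel's role): write the data, then publish head. */
static int sk_produce(struct sk_ring *r, uint32_t value)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	/* Acquire: pairs with the consumer's release store to tail. */
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail >= SK_RING_SLOTS)
		return -1;	/* full: report an overrun rather than wait */
	r->slot[head & (SK_RING_SLOTS - 1)] = value;
	/* Release: the slot write becomes visible before the new head does. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return 0;
}

/* Consumer side (userspace's role): acquire head, read, then publish tail. */
static unsigned int sk_consume(struct sk_ring *r, uint32_t *out, unsigned int max)
{
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	/* Acquire: pairs with the producer's release store to head. */
	uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
	unsigned int n = 0;

	assert(r && out);
	while (tail != head && n < max)
		out[n++] = r->slot[tail++ & (SK_RING_SLOTS - 1)];
	/* Release: the slot reads complete before the slots are handed back. */
	atomic_store_explicit(&r->tail, tail, memory_order_release);
	return n;
}
```

The consumer only ever stores to tail and the producer only ever stores to
head, which is the division of responsibility the cover letter describes.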

David
David Howells Sept. 3, 2019, 4:37 p.m. UTC | #8
Hillf Danton <hdanton@sina.com> wrote:

> > +	for (i = 0; i < wf->nr_filters; i++) {
> > +		wt = &wf->filters[i];
> > +		if (n->type == wt->type &&
> > +		    (wt->subtype_filter[n->subtype >> 5] &
> > +		     (1U << (n->subtype & 31))) &&
> 
> Replace the pure numbers with something easier to understand.

How about the following:

static bool filter_watch_notification(const struct watch_filter *wf,
				      const struct watch_notification *n)
{
	const struct watch_type_filter *wt;
	unsigned int st_bits = sizeof(wt->subtype_filter[0]) * 8;
	unsigned int st_index = n->subtype / st_bits;
	unsigned int st_bit = 1U << (n->subtype % st_bits);
	int i;

	if (!test_bit(n->type, wf->type_filter))
		return false;

	for (i = 0; i < wf->nr_filters; i++) {
		wt = &wf->filters[i];
		if (n->type == wt->type &&
		    (wt->subtype_filter[st_index] & st_bit) &&
		    (n->info & wt->info_mask) == wt->info_filter)
			return true;
	}

	return false; /* If there is a filter, the default is to reject. */
}
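
For what it's worth, the same index/bit arithmetic can be tried out in
isolation with something like the following.  It is a standalone userspace
sketch, not part of the patch: the sk_ names are made up, and the bitmap is
assumed to be eight 32-bit words covering the 256 possible subtypes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_NR_SUBTYPES	256u
#define SK_WORD_BITS	(sizeof(uint32_t) * 8)

struct sk_type_filter {
	uint32_t subtype_filter[SK_NR_SUBTYPES / 32];	/* one bit per subtype */
};

/* Mark a subtype as wanted: word = subtype / 32, bit = subtype % 32. */
static void sk_allow_subtype(struct sk_type_filter *f, uint8_t subtype)
{
	assert(f);
	f->subtype_filter[subtype / SK_WORD_BITS] |=
		1u << (subtype % SK_WORD_BITS);
}

/* Test whether a subtype passes the filter, using the same arithmetic. */
static bool sk_subtype_allowed(const struct sk_type_filter *f, uint8_t subtype)
{
	assert(f);
	return (f->subtype_filter[subtype / SK_WORD_BITS] &
		(1u << (subtype % SK_WORD_BITS))) != 0;
}
```

Taking the subtype as a uint8_t keeps the word index within the eight-word
array without a separate range check.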

David