freebsd-skq

Author	SHA1	Message	Date
kib	e884cfc968	Move the funsetown(9) call from audit_pipe_close() to cdevpriv destructor. As result, close method becomes trivial and removed. Final cdevsw close method might be called without file context (e.g. in vn_open_vnode() if the vnode is reclaimed meantime), which leaves ap_sigio registered for notification, despite cdevpriv destructor frees the memory later. Call destructor instead of doing a cleanup inline, for devfs_set_cdevpriv() failure in open. This adds missed funsetown(9) call and locks ap to satisfy audit_pipe_free() invariants. Reported and tested by: pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-01-13 14:02:07 +00:00
davide	c71af9be63	Replace dev_clone with cdevpriv(9) KPI in audit_pipe code. This is (yet another) step towards the removal of device cloning from our kernel. CR: https://reviews.freebsd.org/D441 Reviewed by: kib, rwatson Tested by: pho	2014-08-20 16:04:30 +00:00
davide	ec6382d0c2	- Use make_dev_credf(MAKEDEV_REF) instead of the race-prone make_dev()+ dev_ref() in the clone handlers that still use it. - Don't set SI_CHEAPCLONE flag, it's not used anywhere neither in devfs (for anything real) Reviewed by: kib	2013-09-07 13:45:44 +00:00
ed	832b15d289	Get rid of D_PSEUDO. It seems the D_PSEUDO flag was meant to allow make_dev() to return NULL. Nowadays we have a different interface for that; make_dev_p(). There's no need to keep it there. While there, remove an unneeded D_NEEDMINOR from the gpio driver. Discussed with: gonzo@ (gpio)	2011-10-18 08:09:44 +00:00
attilio	683d7a54ce	Fix a deficiency in the selinfo interface: If a selinfo object is recorded (via selrecord()) and then it is quickly destroyed, with the waiters missing the opportunity to awake, at the next iteration they will find the selinfo object destroyed, causing a PF#. That happens because the selinfo interface has no way to drain the waiters before destroying the registered selinfo object. Also this race is quite rare to get in practice, because it would require a selrecord(), a poll request by another thread and a quick destruction of the selrecord()'ed selinfo object. Fix this by adding the seldrain() routine which should be called before destroying the selinfo objects (in order to avoid such case), and fix the present cases where it might have already been called. Sometimes, the context is safe enough to prevent this type of race, like it happens in device drivers which install selinfo objects on poll callbacks. There, the destruction of the selinfo object happens at driver detach time, when all the filedescriptors should be already closed, thus there cannot be a race. For this case, mfi(4) device driver can be set as an example, as it implements a fully correct logic for preventing this from happening. Sponsored by: Sandvine Incorporated Reported by: rstone Tested by: pluknet Reviewed by: jhb, kib Approved by: re (bz) MFC after: 3 weeks	2011-08-25 15:51:54 +00:00
kib	e1cb2941d4	Adapt vfs kqfilter to the shared vnode lock used by zfs write vop. Use vnode interlock to protect the knote fields [1]. The locking assumes that shared vnode lock is held, thus we get exclusive access to knote either by exclusive vnode lock protection, or by shared vnode lock + vnode interlock. Do not use kl_locked() method to assert either lock ownership or the fact that curthread does not own the lock. For shared locks, ownership is not recorded, e.g. VOP_ISLOCKED can return LK_SHARED for the shared lock not owned by curthread, causing false positives in kqueue subsystem assertions about knlist lock. Remove kl_locked method from knlist lock vector, and add two separate assertion methods kl_assert_locked and kl_assert_unlocked, that are supposed to use proper asserts. Change knlist_init accordingly. Add convenience function knlist_init_mtx to reduce number of arguments for typical knlist initialization. Submitted by: jhb [1] Noted by: jhb [2] Reviewed by: jhb Tested by: rnoland	2009-06-10 20:59:32 +00:00
rwatson	8dbf62efb2	Remove D_NEEDGIANT from audit pipes. I'm actually not sure why this was here, but isn't needed. MFC after: 2 weeks Sponsored by: Apple, Inc.	2009-04-16 11:57:16 +00:00
rwatson	1d82f9d188	Set the lower bound on queue size for an audit pipe to 1 instead of 0, as an audit pipe with a queue length of 0 is less useful. Obtained from: TrustedBSD Project Sponsored by: Apple, Inc. MFC after: 1 week	2009-02-08 15:38:31 +00:00
rwatson	8ee4f3581d	Eliminate the local variable 'ape' in audit_pipe_kqread(), as it's only used for an assertion that we don't really need anymore. MFC after: 1 week Reported by: Christoph Mallon <christoph dot mallon at gmx dot de>	2009-02-04 19:56:37 +00:00
rwatson	5d645da259	Do a lockless read of the audit pipe list before grabbing the audit pipe lock in order to avoid the lock acquire hit if the pipe list is very likely empty. Obtained from: TrustedBSD Project MFC after: 3 weeks Sponsored by: Apple, Inc.	2009-01-06 14:15:38 +00:00
rwatson	a5f7e7ad63	Fix white space botch: use carriage returns rather than tabs.	2008-12-31 23:22:45 +00:00
rwatson	20831b1f86	Update introductory comment for audit pipes. MFC after: 2 months Sponsored by: Apple, Inc.	2008-11-02 00:25:48 +00:00
rwatson	368cc5044a	Remove stale comment about filtering in audit pipe ioctl routine: we do support filtering now, although we may want to make it more interesting in the future. MFC after: 2 months Sponsored by: Apple, Inc.	2008-11-02 00:18:19 +00:00
rwatson	3f0f3e5028	Add comment for per-pipe stats. MFC after: 2 months Sponsored by: Apple, Inc.	2008-11-01 23:05:49 +00:00
rwatson	64f6525f93	We only allow a partial read of the first record in an audit pipe record queue, so move the offset field from the per-record audit_pipe_entry structure to the audit_pipe structure. Now that we support reading more than one record at a time, add a new summary field to audit_pipe, ap_qbyteslen, which tracks the total number of bytes present in a pipe, and return that (minus the current offset) via FIONREAD and kqueue's data variable for the pending byte count rather than the number of bytes remaining in only the first record. Add a number of asserts to confirm that these counts and offsets follow the expected rules. MFC after: 2 months Sponsored by: Apple, Inc.	2008-11-01 21:56:45 +00:00
rwatson	f8873b326d	Allow a single read(2) system call on an audit pipe to retrieve data from more than one audit record at a time in order to improve efficiency. MFC after: 2 months Sponsored by: Apple, Inc.	2008-11-01 21:16:09 +00:00
rwatson	efc5b661a1	Since there is no longer the opportunity for record truncation, just return 0 if the truncation counter is queried on an audit pipe. MFC after: 2 months Sponsored by: Apple, Inc.	2008-10-31 15:11:01 +00:00
rwatson	6f79887fc5	Historically, /dev/auditpipe has allowed only whole records to be read via read(2), which meant that records longer than the buffer passed to read(2) were dropped. Instead take the approach of allowing partial reads to be continued across multiple system calls, more in the style of a streaming character device. This means retaining a record on the per-pipe queue in a partially read state, so maintain a current offset into the record. Keep the record on the queue during a read, so add a new lock, ap_sx, to serialize removal of records from the queue by either read(2) or ioctl(2) requesting a pipe flush. Modify the kqueue handler to return bytes left in the current record rather than simply the size of the current record. It is now possible to use praudit, which used the standard FILE * buffer sizes, to track much larger record sizes from /dev/auditpipe, such as very long command lines to execve(2). MFC after: 2 months Sponsored by: Apple, Inc.	2008-10-31 14:40:21 +00:00
rwatson	81bbfda754	When we drop an audit record going to an audit pipe because the audit pipe has overflowed, drop the newest, rather than oldest, record. This makes overflow drop behavior consistent with memory allocation failure leading to drop, avoids touching the consumer end of the queue from a producer, and lowers the CPU overhead of dropping a record by dropping before memory allocation and copying. Obtained from: Apple, Inc. MFC after: 2 months	2008-10-30 23:09:19 +00:00
rwatson	7e2b08356c	Break out single audit_pipe_mtx into two types of locks: a global rwlock protecting the list of audit pipes, and a per-pipe mutex protecting the queue. Likewise, replace the single global condition variable used to signal delivery of a record to one or more pipes, and add a per-pipe condition variable to avoid spurious wakeups when event subscriptions differ across multiple pipes. This slightly increases the cost of delivering to audit pipes, but should reduce lock contention in the presence of multiple readers as only the per-pipe lock is required to read from a pipe, as well as avoid overhead when different pipes are used in different ways. MFC after: 2 months Sponsored by: Apple, Inc.	2008-10-30 21:58:39 +00:00
ed	4212d51a7d	Remove unit2minor() use from kernel code. When I changed kern_conf.c three months ago I made device unit numbers equal to (unneeded) device minor numbers. We used to require bitshifting, because there were eight bits in the middle that were reserved for a device major number. Not very long after I turned dev2unit(), minor(), unit2minor() and minor2unit() into macro's. The unit2minor() and minor2unit() macro's were no-ops. We'd better not remove these four macro's from the kernel, because there is a lot of (external) code that may still depend on them. For now it's harmless to remove all invocations of unit2minor() and minor2unit(). Reviewed by: kib	2008-09-26 14:19:52 +00:00
rwatson	b8596e4794	Further synchronization of copyrights, licenses, white space, etc from Apple and from the OpenBSM vendor tree. Obtained from: Apple Inc., TrustedBSD Project MFC after: 3 days	2008-07-31 09:54:35 +00:00
ed	1bfc292986	Don't enforce unique device minor number policy anymore. Except for the case where we use the cloner library (clone_create() and friends), there is no reason to enforce a unique device minor number policy. There are various drivers in the source tree that allocate unr pools and such to provide minor numbers, without using them themselves. Because we still need to support unique device minor numbers for the cloner library, introduce a new flag called D_NEEDMINOR. All cdevsw's that are used in combination with the cloner library should be marked with this flag to make the cloning work. This means drivers can now freely use si_drv0 to store their own flags and state, making it effectively the same as si_drv1 and si_drv2. We still keep the minor() and dev2unit() routines around to make drivers happy. The NTFS code also used the minor number in its hash table. We should not do this anymore. If the si_drv0 field would be changed, it would no longer end up in the same list. Approved by: philip (mentor)	2008-06-11 18:55:19 +00:00
rwatson	780b65a710	Use __FBSDID() for $FreeBSD$ IDs in the audit code. MFC after: 3 days	2008-04-13 22:06:56 +00:00
wkoszek	a8e6c33502	Change "audit_pipe_preselect" to "audit_pipe_presel" to make it print with proper alignment in ddb(4) and vmstat(8). Reviewed by: rwatson@	2007-12-25 13:23:19 +00:00
csjp	eaecf9354f	Make sure we are incrementing the read count for each audit pipe read. MFC after: 1 week	2007-10-27 22:28:01 +00:00
csjp	d250020a68	- Change the wakeup logic associated with having multiple sleepers on multiple different audit pipes. The old method used cv_signal() which would result in only one thread being woken up after we appended a record to its queue. This resulted in untimely wake-ups when processing audit records in real time. - Assign PSOCK priority to threads that have been sleeping on a read(2). This is the same priority threads are woken up with when they select(2) or poll(2). This yields fairness between various forms of sleep on the audit pipes. Obtained from: TrustedBSD Project Discussed with: rwatson MFC after: 1 week	2007-10-12 15:09:02 +00:00
rwatson	0d42b093e7	Clean up audit comments--formatting, spelling, etc.	2007-06-01 21:58:59 +00:00
rwatson	10d0d9cf47	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
rwatson	c9215ad31e	Allow the user process to query the kernel's notion of a maximum audit record size at run-time, which can be used by the user process to size the user space buffer it reads into from the audit pipe. Perforce change: 105098 Obtained from: TrustedBSD Project	2006-08-26 17:59:31 +00:00
rwatson	1e4f4abfce	Add kqueue support to audit pipe pseudo-devices. Obtained from: TrustedBSD Project	2006-08-24 17:42:38 +00:00
rwatson	4f317e1576	Introduce support for per-audit pipe preselection independent from the global audit trail configuration. This allows applications consuming audit trails to specify parameters for which audit records are of interest, including selecting records not required by the global trail. Allowing application interest specification without changing the global configuration allows intrusion detection systems to run without interfering with global auditing or each other (if multiple are present). To implement this: - Kernel audit records now carry a flag to indicate whether they have been selected by the global trail or by the audit pipe subsystem, set during record commit, so that this information is available after BSM conversion when delivering the BSM to the trail and audit pipes in the audit worker thread asynchronously. Preselection by either record target will cause the record to be kept. - Similar changes to preselection when the audit record is created when the system call is entering: consult both the global trail and pipes. - au_preselect() now accepts the class in order to avoid repeatedly looking up the mask for each preselection test. - Define a series of ioctls that allow applications to specify whether they want to track the global trail, or program their own preselection parameters: they may specify their own flags and naflags masks, similar to the global masks of the same name, as well as a set of per-auid masks. They also set a per-pipe mode specifying whether they track the global trail, or use their own -- the door is left open for future additional modes. A new ioctl is defined to allow a user process to flush the current audit pipe queue, which can be used after reprogramming pre-selection to make sure that only records of interest are received in future reads. - Audit pipe data structures are extended to hold the additional fields necessary to support preselection. By default, audit pipes track the global trail, so "praudit /dev/auditpipe" will track the global audit trail even though praudit doesn't program the audit pipe selection model. - Comment about the complexities of potentially adding partial read support to audit pipes. By using a set of ioctls, applications can select which records are of interest, and toggle the preselection mode. Obtained from: TrustedBSD Project	2006-06-05 14:48:17 +00:00
rwatson	bae874c2cb	Merge Perforce change 93570 from TrustedBSD audit3 branch: Add audit pipe ioctls to query minimum and maximum audit queue lengths. Obtained from: TrustedBSD Project	2006-03-19 15:39:03 +00:00
rwatson	2b1a7974d7	Merge Perforce change 93567 from TrustedBSD audit3 branch: Bump default queue limit for audit pipes from 32 to 128, since 32 is pretty small. Obtained from: TrustedBSD Project	2006-03-19 15:38:03 +00:00
rwatson	a74ff4762f	Merge Perforce change 93506 from TrustedBSD audit3 branch: Add ioctls to audit pipes in order to allow querying of the current record queue state, setting of the queue limit, and querying of pipe statistics. Obtained from: TrustedBSD Project	2006-03-19 15:36:10 +00:00
rwatson	fb6445828e	Count drops when the first of two pipe mallocs fails. Obtained from: TrustedBSD Project	2006-03-04 17:09:17 +00:00
rwatson	bc3d3926ef	Fix queue drop logic when the queue overflows: decrement queue length. Obtained from: TrustedBSD Project	2006-02-07 14:46:26 +00:00
rwatson	a1af4bcfbd	Add support for audit pipe special devices, which allow user space applications to insert a "tee" in the live audit event stream. Records are inserted into a per-clone queue so that user processes can pull discrete records out of the queue. Unlike delivery to disk, audit pipes are "lossy", dropping records in low memory conditions or when the process falls behind real-time events. This mechanism is appropriate for use by live monitoring systems, host-based intrusion detection, etc, and avoids applications having to dig through active on-disk trails that are owned by the audit daemon. Obtained from: TrustedBSD Project	2006-02-06 22:50:39 +00:00

38 Commits