freebsd-skq

Author	SHA1	Message	Date
Conrad Meyer	85078b8573	Split out cwd/root/jail, cmask state from filedesc table No functional change intended. Tracking these structures separately for each proc enables future work to correctly emulate clone(2) in linux(4). __FreeBSD_version is bumped (to 1300130) for consumption by, e.g., lsof. Reviewed by: kib Discussed with: markj, mjg Differential Revision: https://reviews.freebsd.org/D27037	2020-11-17 21:14:13 +00:00
Conrad Meyer	ede4af47ae	unix(4): Enhance LOCAL_CREDS_PERSISTENT ABI As this ABI is still fresh (r367287), let's correct some mistakes now: - Version the structure to allow for future changes - Include sender's pid in control message structure - Use a distinct control message type from the cmsgcred / sockcred mess Discussed with: kib, markj, trasz Differential Revision: https://reviews.freebsd.org/D27084	2020-11-17 20:01:21 +00:00
Konstantin Belousov	441eb16a95	Allow some VOPs to return ERELOOKUP to indicate VFS operation restart at top level. Restart syscalls and some sync operations when filesystem indicated ERELOOKUP condition, mostly for VOPs operating on metdata. In particular, lookup results cached in the inode/v_data is no longer valid and needs recalculating. Right now this should be nop. Assert that ERELOOKUP is catched everywhere and not returned to userspace, by asserting that td_errno != ERELOOKUP on syscall return path. In collaboration with: pho Reviewed by: mckusick (previous version), markj Tested by: markj (syzkaller), pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D26136	2020-11-13 09:42:32 +00:00
Mateusz Guzik	3c50616fc1	fd: make all f_count uses go through refcount_*	2020-11-05 02:12:33 +00:00
Conrad Meyer	2de07e4096	unix(4): Add SOL_LOCAL:LOCAL_CREDS_PERSISTENT This option is intended to be semantically identical to Linux's SOL_SOCKET:SO_PASSCRED. For now, it is mutually exclusive with the pre-existing sockopt SOL_LOCAL:LOCAL_CREDS. Reviewed by: markj (penultimate version) Differential Revision: https://reviews.freebsd.org/D27011	2020-11-03 01:17:45 +00:00
Mark Johnston	b4e07e3da5	Fix locking in uipc_accept(). Reported by: cy MFC after: 1 week Sponsored by: The FreeBSD Foundation	2020-09-15 23:03:56 +00:00
Mark Johnston	448000279e	Fix locking in uipc_accept(). This function wasn't converted to use the new locking protocol in r333744. Make it use the PCB lock for synchronizing connection state. Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26300	2020-09-15 19:23:42 +00:00
Mark Johnston	ccdadf1a9b	Simplify unix socket connection peer locking. unp_pcb_owned_lock2() has some sharp edges and forces callers to deal with a bunch of cases. Simplify it: - Rename to unp_pcb_lock_peer(). - Return the connected peer instead of forcing callers to load it beforehand. - Handle self-connected sockets. - In unp_connectat(), just lock the accept socket directly. It should not be possible for the nascent socket to participate in any other lock orders. - Get rid of connect_internal(). It does not provide any useful checking anymore. - Block in unp_connectat() when a different thread is concurrently attempting to lock both sides of a connection. This provides simpler semantics for callers of unp_pcb_lock_peer(). - Make unp_connectat() return EISCONN if the socket is already connected. This fixes a race[1] when multiple threads attempt to connect() to different addresses using the same datagram socket. Upper layers will disconnect a connected datagram socket before calling the protocol connect's method, but there is no synchronization between this and protocol-layer code. Reported by: syzkaller [1] Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26299	2020-09-15 19:23:22 +00:00
Mark Johnston	ed92e1c78c	Avoid an unnecessary malloc() when connecting dgram sockets. The allocated memory is only required for SOCK_STREAM and SOCK_SEQPACKET sockets. Reviewed by: kevans Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26298	2020-09-15 19:23:01 +00:00
Mark Johnston	f0317f868b	Simplify unp_disconnect() callers. In all cases, PCBs are unlocked after unp_disconnect() returns. Since unp_disconnect() may release the last PCB reference, callers may have to bump the refcount before the call just so that they can release them again. Change unp_disconnect() to release PCB locks as well as connection references; this lets us remove several refcount manipulations. Tighten assertions. Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26297	2020-09-15 19:22:37 +00:00
Mark Johnston	4820bf6ac2	Rename unp_pcb_lock2(). unp_pcb_lock_pair() seems like a better name. Also make it handle the case where the two sockets are the same instead of making callers do it. No functional change intended. Reviewed by: glebius, kevans, kib Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26296	2020-09-15 19:22:16 +00:00
Mark Johnston	5362170da7	Improve unix socket PCB refcounting. - Use refcount_init(). - Define an INVARIANTS-only zone destructor to assert that various bits of PCB state aren't left dangling. - Annotate unp_pcb_rele() with __result_use_check. - Simplify control flow. Reviewed by: glebius, kevans, kib Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26295	2020-09-15 19:21:58 +00:00
Mark Johnston	d5cbccecd8	Update unix domain socket locking comments. - Define a locking key for unpcb members. - Rewrite some of the locking protocol description to make it less verbose and avoid referencing some subroutines which will be renamed. - Reorder includes. Reviewed by: glebius, kevans, kib Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D26294	2020-09-15 19:21:33 +00:00
Mateusz Guzik	6fed89b179	kern: clean up empty lines in .c and .h files	2020-09-01 22:12:32 +00:00
Konstantin Belousov	6e0c8e1ae2	Add SOL_LOCAL symbolic constant for unix socket option level. The constant seems to exists on MacOS X >= 10.8. Requested by: swills Reviewed by: allanjude, kevans Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D25933	2020-08-03 22:13:02 +00:00
Mark Johnston	1b778ba260	Fix a logic error in uipc_ready_scan(). When processing the last record in a socket buffer, take care to avoid a NULL pointer dereference when advancing the record iterator. Reported by: syzbot+6a689cc9c27bd265237a@syzkaller.appspotmail.com Fixes: r359778 MFC after: 1 week Sponsored by: The FreeBSD Foundation	2020-07-30 00:52:37 +00:00
Mark Johnston	b36871af6d	Fix sendto() on unconnected SOCK_STREAM/SEQPACKET unix sockets. Previously the unpcb pointer of the newly connected remote socket was not initialized correctly, so attempting to lock it would result in a null pointer dereference. Reported by: syzkaller MFC after: 1 week Sponsored by: The FreeBSD Foundation	2020-04-13 19:22:05 +00:00
Mark Johnston	25f4ddfb2b	sbappendcontrol() needs to avoid clearing M_NOTREADY on data mbufs. If LOCAL_CREDS is set on a unix socket and sendfile() is called, sendfile will call uipc_send(PRUS_NOTREADY), prepending a control message to the M_NOTREADY mbufs. uipc_send() then calls sbappendcontrol() instead of sbappend(), and sbappendcontrol() would erroneously clear M_NOTREADY. Pass send flags to sbappendcontrol(), like we do for sbappend(), to preserve M_READY when necessary. Reported by: syzkaller MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D24333	2020-04-10 20:42:11 +00:00
Mark Johnston	a50b1900a0	Properly handle disconnected sockets in uipc_ready(). When transmitting over a unix socket, data is placed directly into the receiving socket's receive buffer, instead of the transmitting socket's send buffer. This means that when pru_ready is called during sendfile(), the passed socket does not contain M_NOTREADY mbufs in its buffers; uipc_ready() must locate the linked socket. Currently uipc_ready() frees the mbufs if the socket is disconnected, but this is wrong since the mbufs may still be present in the receiving socket's buffer after a disconnect. This can result in a use-after-free and potentially a double free if the receive buffer is flushed after uipc_ready() frees the mbufs. Fix the problem by trying harder to locate the correct socket buffer and calling sbready(): use the global list of SOCK_STREAM unix sockets to search for a sockbuf containing the now-ready mbufs. Only free the mbufs if we fail this search. Reviewed by: jah, kib Reported and tested by: pho MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D24332	2020-04-10 20:41:59 +00:00
Mark Johnston	db38b699b4	Simplify uipc_detach() slightly. Remove a goto and an unneeded local variable, and fix style. No functional change intended. Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2020-03-20 16:18:54 +00:00
Mark Johnston	569a186d02	Remove UNP_NASCENT, reverting r303855. unp_connectat() no longer holds the link lock across calls to sonewconn(), so the recursion described in r303855 can no longer occur. No functional change intended. Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation	2020-03-20 16:17:54 +00:00
Pawel Biernacki	7029da5c36	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718	2020-02-26 14:26:36 +00:00
Mateusz Guzik	3ff65f71cb	Remove duplicated empty lines from kern/*.c No functional changes.	2020-01-30 20:05:05 +00:00
Jason A. Harmening	a9aa06f7b1	Implement cycle-detecting garbage collector for AF_UNIX sockets The existing AF_UNIX socket garbage collector destroys any socket which may potentially be in a cycle, as indicated by its file reference count being equal to its enqueue count. However, this can produce false positives for in-flight sockets which aren't part of a cycle but are part of one or more SCM_RIGHTS mssages and which have been closed on the sending side. If the garbage collector happens to run at exactly the wrong time, destruction of these sockets will render them unusable on the receiving side, such that no previously-written data may be read. This change rewrites the garbage collector to precisely detect cycles: 1. The existing check of msgcount==f_count is still used to determine whether the socket is potentially in a cycle. 2. The socket is now placed on a local "dead list", which is used to reduce iteration time (and therefore contention on the global unp_link_rwlock). 3. The first pass through the dead list removes each potentially-dead socket's outgoing references from the graph of potentially-dead sockets, using a gc-specific copy of the original reference count. 4. The second series of passes through the dead list removes from the list any socket whose remaining gc refcount is non-zero, as this indicates the socket is actually accessible outside of any possible cycle. Iteration is repeated until no further sockets are removed from the dead list. 5. Sockets remaining in the dead list are destroyed as before. PR: 227285 Submitted by: jan.kokemueller@gmail.com (prior version) Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D23142	2020-01-25 08:57:26 +00:00
Mateusz Guzik	b249ce48ea	vfs: drop the mostly unused flags argument from VOP_UNLOCK Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427	2020-01-03 22:29:58 +00:00
Mark Johnston	4013d72684	Fix handling of empty SCM_RIGHTS messages. As unp_internalize() processes the input control messages, it builds an output mbuf chain containing the internalized representations of those messages. In one special case, that of an empty SCM_RIGHTS message, the message is simply discarded. However, the loop which appends mbufs to the output chain assumed that each iteration would produce an mbuf, resulting in a null pointer dereference if an empty SCM_RIGHTS message was followed by a non-empty message. Fix this by advancing the output mbuf chain tail pointer only if an internalized control message was produced. Reported by: syzbot+1b5cced0f7fad26ae382@syzkaller.appspotmail.com MFC after: 1 week Sponsored by: The FreeBSD Foundation	2019-10-08 23:34:48 +00:00
Konstantin Belousov	f1cf2b9dcb	Check and avoid overflow when incrementing fp->f_count in fget_unlocked() and fhold(). On sufficiently large machine, f_count can be legitimately very large, e.g. malicious code can dup same fd up to the per-process filedescriptors limit, and then fork as much as it can. On some smaller machine, I see kern.maxfilesperproc: 939132 kern.maxprocperuid: 34203 which already overflows u_int. More, the malicious code can create transient references by sending fds over unix sockets. I realized that this check is missed after reading https://secfault-security.com/blog/FreeBSD-SA-1902.fd.html Reviewed by: markj (previous version), mjg Tested by: pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D20947	2019-07-21 15:07:12 +00:00
Konstantin Belousov	47c3450e50	Fix leak of memory and file refs with sendmsg(2) over unix domain sockets. When sendmsg(2) sucessfully internalized one SCM_RIGHTS control message, but failed to process some other control message later, both file references and filedescent memory needs to be freed. This was not done, only mbuf chain was freed. Noted, test case written, reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D21000	2019-07-19 20:51:39 +00:00
Dmitry Chagin	c5afec6e89	Complete LOCAL_PEERCRED support. Cache pid of the remote process in the struct xucred. Do not bump XUCRED_VERSION as struct layout is not changed. PR: 215202 Reviewed by: tijl MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D20415	2019-05-30 14:24:26 +00:00
Mark Johnston	79db6fe7aa	Plug some networking sysctl leaks. Various network protocol sysctl handlers were not zero-filling their output buffers and thus would export uninitialized stack memory to userland. Fix a number of such handlers. Reported by: Thomas Barabosch, Fraunhofer FKIE Reviewed by: tuexen MFC after: 3 days Security: kernel memory disclosure Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18301	2018-11-22 20:49:41 +00:00
Mateusz Guzik	f218ac5087	uipc_usrreq: fix inode number assignment The code was incrementing a global variable in an unsafe manner. Two different threads stating two different sockets could have resulted in the same inode numbers assigned to both. Creation is protected with a global lock, move the assigment there. Since inode numbers are 64-bit now drop the check for overflows. Sponsored by: The FreeBSD Foundation	2018-11-21 22:25:05 +00:00
Mark Johnston	c7902fbeae	Improve handling of control message truncation. If a recvmsg(2) or recvmmsg(2) caller doesn't provide sufficient space for all control messages, the kernel sets MSG_CTRUNC in the message flags to indicate truncation of the control messages. In the case of SCM_RIGHTS messages, however, we were failing to dispose of the rights that had already been externalized into the recipient's file descriptor table. Add a new function and mbuf type to handle this cleanup task, and use it any time we fail to copy control messages out to the recipient. To simplify cleanup, control message truncation is now only performed at control message boundaries. The change also fixes a few related bugs: - Rights could be leaked to the recipient process if an error occurred while copying out a message's contents. - We failed to set MSG_CTRUNC if the truncation occurred on a control message boundary, e.g., if the caller received two control messages and provided only the exact amount of buffer space needed for the first. PR: 131876 Reviewed by: ed (previous version) MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D16561	2018-08-07 16:36:48 +00:00
Mark Johnston	5b0480f2cc	Don't check rcv sockbuf limits when sending on a unix stream socket. sosend_generic() performs an initial comparison of the amount of data (including control messages) to be transmitted with the send buffer size. When transmitting on a unix socket, we then compare the amount of data being sent with the amount of space in the receive buffer size; if insufficient space is available, sbappendcontrol() returns an error and the data is lost. This is easily triggered by sending control messages together with an amount of data roughly equal to the send buffer size, since the control message size may change in uipc_send() as file descriptors are internalized. Fix the problem by removing the space check in sbappendcontrol(), whose only consumer is the unix sockets code. The stream sockets code uses the SB_STOP mechanism to ensure that senders will block if the receive buffer fills up. PR: 181741 MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D16515	2018-08-04 20:26:54 +00:00
Mark Johnston	e62ca80bde	Style.	2018-08-04 20:16:36 +00:00
Alan Somers	da4465506d	Fix LOCAL_PEERCRED with socketpair(2) Enable the LOCAL_PEERCRED socket option for unix domain stream sockets created with socketpair(2). Previously, it only worked with unix domain stream sockets created with socket(2)/listen(2)/connect(2)/accept(2). PR: 176419 Reported by: Nicholas Wilson <nicholas@nicholaswilson.me.uk> MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D16350	2018-08-03 01:37:00 +00:00
Brooks Davis	3a20f06a1c	Use uintptr_t alone when assigning to kvaddr_t variables. Suggested by: jhb	2018-07-10 13:03:06 +00:00
Brooks Davis	7524b4c14b	Correct breakage on 32-bit platforms from r335979.	2018-07-06 10:03:33 +00:00
Brooks Davis	f38b68ae8a	Make struct xinpcb and friends word-size independent. Replace size_t members with ksize_t (uint64_t) and pointer members (never used as pointers in userspace, but instead as unique idenitifiers) with kvaddr_t (uint64_t). This makes the structs identical between 32-bit and 64-bit ABIs. On 64-bit bit systems, the ABI is maintained. On 32-bit systems, this is an ABI breaking change. The ABI of most of these structs was previously broken in r315662. This also imposes a small API change on userspace consumers who must handle kernel pointers becoming virtual addresses. PR: 228301 (exp-run by antoine) Reviewed by: jtl, kib, rwatson (various versions) Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D15386	2018-07-05 13:13:48 +00:00
Matt Macy	a62b4665f4	AF_UNIX: bring uipc_ready in compliance with new locking protocol PR: 228742 Submitted by: markj Reviewed by: markj	2018-06-08 20:31:59 +00:00
Matt Macy	fcabd54160	AF_UNIX: check for unp == unp2 on disconnect	2018-06-07 04:57:40 +00:00
Matt Macy	acf9fd05d8	AF_UNIX: It is possible for UNIX datagram sockets to be connected to themselves. The updated code assumed that that could not happen and would try to lock the unp mutex twice. There may be a lingering issue here but this fixes it for the reporter. PR: 228458 Reported by: marieheleneka at gmail.com	2018-05-24 21:13:46 +00:00
Matt Macy	c684c14ce3	AF_UNIX: evidently Samba likes to connect a unix socket to itself, fix locking	2018-05-24 18:22:13 +00:00
Matt Macy	a3a734908b	AF_UNIX in connectat unp and unp2 can be the same	2018-05-24 18:22:05 +00:00
Matt Macy	16529dace8	AF_UNIX: assert that we're not acquiring the same lock	2018-05-24 15:28:16 +00:00
Matt Macy	9725c9cef7	AF_UNIX gc unused label ...sigh	2018-05-20 21:37:34 +00:00
Matt Macy	cb8f450b94	AF_UNIX: Don't unlock unp/unp2 if they're not locked Reported by: mjg	2018-05-20 21:20:26 +00:00
Matt Macy	e10ef65d23	AF_UNIX: fix LOR introduced by the locking rewrite	2018-05-20 05:50:53 +00:00
Matt Macy	d95253403f	AF_UNIX: make unpcb lock name line up with what's in witness	2018-05-20 04:32:48 +00:00
Warner Losh	f344fb0b4b	Restore the all rights reserved language. Put it on each of the prior two copyrights. The line originated with the Berkeely Regents, who we have not approached about removing it (it's honestly too trivial to be worth that fight). Restore it to rwatson's line as well. He can decide if he wants it or not on his own. Matt clearly doesn't want it, per project preference and his own statements on IRC. Noticed by: rgrimes@	2018-05-19 17:29:57 +00:00
Matt Macy	4b06dee1e5	AF_UNIX: switch to annotations to avoid warnings	2018-05-19 05:37:58 +00:00

1 2 3 4 5 ...

359 Commits