freebsd-nq

Author	SHA1	Message	Date
Andre Oppermann	dc00208ec4	Grammar fixes to r241781. Submitted by: alc	2012-10-20 19:38:22 +00:00
Andre Oppermann	2bdf61ca29	Hide the unfortunately named sysctl kern.ipc.somaxconn from sysctl -a output and replace it with a new visible sysctl kern.ipc.acceptqueue of the same functionality. It specifies the maximum length of the accept queue on a listen socket. The old kern.ipc.somaxconn remains available for reading and writing for compatibility reasons so that existing programs, scripts and configurations continue to work. There are no plans to ever remove the original and now hidden kern.ipc.somaxconn.	2012-10-20 12:53:14 +00:00
Andre Oppermann	1490de00a8	Tidy up somaxconn (accept queue limit) and related functions and move it together into one place.	2012-10-20 10:51:32 +00:00
Andre Oppermann	4b62fe5b0b	Move socket UMA zone initialization functionality together into one place.	2012-10-19 12:16:29 +00:00
Andre Oppermann	cf8e6069e8	Move UMA socket zone initialization from uipc_domain.c to uipc_socket.c into one place next to its other related functions to avoid confusion.	2012-10-19 10:15:32 +00:00
Andre Oppermann	d10733a8da	Remove unnecessary includes from sosend_copyin() and fix a couple of style issues.	2012-10-18 21:04:30 +00:00
Andre Oppermann	1d147759db	Remove double-wrapping of #ifdef ZERO_COPY_SOCKETS within zero copy specialized sosend_copyin() helper function.	2012-10-18 20:22:17 +00:00
Attilio Rao	2e564269d0	Disconnect non-MPSAFE SMBFS from the build in preparation for dropping GIANT from VFS. In addition, also disconnect netsmb, which is a base requirement for SMBFS. In the meantime, regular SMBFS users can use the FUSE interface and the smbnetfs port to work with their SMBFS partitions. Also, there are ongoing efforts by the vendor to support in-kernel smbfs, so there is a good chance that it will get relinked once properly locked. This is not targeted for MFC.	2012-10-18 12:04:56 +00:00
Attilio Rao	a42ac676f5	Disconnect non-MPSAFE NTFS from the build in preparation for dropping GIANT from VFS. This code is particularly broken and fragile, and the other in-kernel implementations found in other operating systems don't really seem clean and solid enough to be imported at all. If someone wants to reconsider an in-kernel NTFS implementation for inclusion again, a fair effort for completely fixing and cleaning it up is expected. In the meantime, regular NTFS users can use the FUSE interface and the ntfs-3g port to work with their NTFS partitions. This is not targeted for MFC.	2012-10-17 11:30:00 +00:00
Attilio Rao	e6116d5b8e	Disconnect non-MPSAFE NWFS from the build in preparation for dropping GIANT from VFS. In addition, also disconnect netncp, which is a base requirement for NWFS. In case the code is maintained in the future and later re-added to the FreeBSD base, maybe we should think about a better location for netncp. I'm not entirely sure the / top location is actually right, however I will let network people comment on that more specifically. This is not targeted for MFC.	2012-10-17 11:16:17 +00:00
Attilio Rao	55793cdccf	Disconnect non-MPSAFE PORTALFS from the build in preparation for dropping GIANT from VFS. This is not targeted for MFC.	2012-10-16 09:59:10 +00:00
Attilio Rao	05e009c443	Disconnect non-MPSAFE HPFS from the build in preparation for dropping GIANT from VFS. This is not targeted for MFC.	2012-10-16 09:55:31 +00:00
Konstantin Belousov	36c6f3aaae	Acquire the rangelock for truncate(2) as well. Reported and reviewed by: avg Tested by: pho MFC after: 1 week	2012-10-15 18:15:18 +00:00
Konstantin Belousov	9b233e2307	Add a KPI to reserve some amount of space in the numvnodes counter, without actually allocating the vnodes. The intended use of getnewvnode_reserve(9) is to reclaim enough free vnodes while the code still does not hold any resources that might be needed during the reclamation, and to consume the slack later for getnewvnode() calls made from the innards. After the critical block is finished, the caller shall free any reserve left, by getnewvnode_drop_reserve(9). Reviewed by: avg Tested by: pho MFC after: 1 week	2012-10-14 19:43:37 +00:00
Alexander Motin	803a9b3efd	panic() with reasonable message instead of returning zero frequency causing division by zero later if event timer's minimal period is above one second. For now it is just a theoretical possibility. Found by: Clang Static Analyzer	2012-10-10 19:46:46 +00:00
Attilio Rao	3a4730256a	Add a unified macro to prevent the compiler from reordering instruction loads/stores at will. The macro __compiler_membar() is currently supported for both gcc and clang, but kernel compilation will fail otherwise. Reviewed by: bde, kib Discussed with: dim, theraven MFC after: 2 weeks	2012-10-09 14:32:30 +00:00
Andriy Gapon	298fbd1605	cngetc: use cpu_spinwait to ease the cncheckc loop a tiny bit Reviewed by: julian MFC after: 10 days	2012-10-06 19:50:23 +00:00
Andriy Gapon	c331c9703c	ktrace/kern_exec: check p_tracecred instead of p_cred .. when deciding whether to continue tracing across suid/sgid exec. Otherwise if root ktrace-d an unprivileged process and the process exec-ed a suid program, then tracing didn't continue across exec. Reviewed by: bde, kib MFC after: 22 days	2012-10-06 19:23:44 +00:00
Ed Schouten	6b1b791da6	Fix faulty error code handling in read(2) on TTYs. When performing a non-blocking read(2) on a TTY while no data is available, we should return EAGAIN. But if there's a modem disconnect, we should return 0. Right now we only return 0 when doing a blocking read, which is wrong. MFC after: 1 month	2012-10-03 13:51:03 +00:00
Garrett Wollman	48b5c7410f	Fix spelling of the function name in two assertion messages.	2012-10-02 18:38:05 +00:00
Eitan Adler	8dbce2a343	Provide a generic way to disable devices at boot time PR: kern/119202 Requested by: peterj Reviewed by: sbruno, jhb Approved by: cperciva MFC after: 1 week	2012-10-02 03:33:41 +00:00
Pawel Jakub Dawidek	55711729f3	- Enforce CAP_MKFIFO on mkfifoat(2), not on mknodat(2). Without this change mkfifoat(2) was not restricted. - Introduce CAP_MKNOD and enforce it on mknodat(2). Sponsored by: FreeBSD Foundation MFC after: 2 weeks	2012-10-01 05:43:24 +00:00
Konstantin Belousov	877d24ac8a	Fix the mis-handling of the VV_TEXT on the nullfs vnodes. If you have a binary on a filesystem which is also mounted over by nullfs, you could execute the binary from the lower filesystem, or from the nullfs mount. When executed from lower filesystem, the lower vnode gets VV_TEXT flag set, and the file cannot be modified while the binary is active. But, if executed as the nullfs alias, only the nullfs vnode gets VV_TEXT set, and you still can open the lower vnode for write. Add a set of VOPs for the VV_TEXT query, set and clear operations, which are correctly bypassed to lower vnode. Tested by: pho (previous version) MFC after: 2 weeks	2012-09-28 11:25:02 +00:00
Matthew D Fleming	fc8fdae0df	Fix up kernel sources to be ready for a 64-bit ino_t. Original code by: Gleb Kurtsou	2012-09-27 23:30:49 +00:00
Pawel Jakub Dawidek	c8e781f6e0	Revert r240931, as the previous comment was actually in sync with POSIX. I have to note that POSIX is simply stupid in how it describes O_EXEC/fexecve and friends. Yes, not only inconsistent, but stupid. In the open(2) description, O_RDONLY flag is described as: O_RDONLY Open for reading only. Taken from: http://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html Note "for reading only". Not "for reading or executing"! In the fexecve(2) description you can find: The fexecve() function shall fail if: [EBADF] The fd argument is not a valid file descriptor open for executing. Taken from: http://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html As you can see the function shall fail if the file was not open with O_EXEC! And yet, if you look closer you can find this mess in the exec.html: Since execute permission is checked by fexecve(), the file description fd need not have been opened with the O_EXEC flag. Yes, O_EXEC flag doesn't have to be specified after all. You can open a file with O_RDONLY and you will still be able to fexecve(2) it.	2012-09-27 16:43:23 +00:00
Mikolaj Golub	47813f5d94	Kernel and modules have "set_vnet" linker set, where virtualized global variables are placed. When a module is loaded by link_elf linker its variables from "set_vnet" linker set are copied to the kernel "set_vnet" ("modspace") and all references to these variables inside the module are relocated accordingly. The issue is when a module is loaded that has references to global variables from another, previously loaded module: these references are not relocated so an invalid address is used when the module tries to access the variable. The example is V_layer3_chain, defined in ipfw module and accessed from ipfw_nat. The same issue is with DPCPU variables, which use "set_pcpu" linker set. Fix this making the link_elf linker on a module load recognize "external" DPCPU/VNET variables defined in the previously loaded modules and relocate them accordingly. For this set_pcpu_list and set_vnet_list are used, where the addresses of modules' "set_pcpu" and "set_vnet" linker sets are stored. Note, archs that use link_elf_obj (amd64) were not affected by this issue. Reviewed by: jhb, julian, zec (initial version) MFC after: 1 month	2012-09-27 14:55:15 +00:00
Konstantin Belousov	94cb35459d	Make the updates of the tid ring buffer's head and tail pointers explicit by moving them into separate statements from the buffer element accesses. Requested by: jhb MFC after: 3 days	2012-09-26 09:25:11 +00:00
Pawel Jakub Dawidek	28f865b0b1	Fix freebsd32_kmq_timedreceive() and freebsd32_kmq_timedsend() to use getmq_read() and getmq_write() respectively, just like sys_kmq_timedreceive() and sys_kmq_timedsend(). Sponsored by: FreeBSD Foundation MFC after: 2 weeks	2012-09-25 22:15:59 +00:00
Pawel Jakub Dawidek	8c706ce0d0	vn_write() always expects FOF_OFFSET flag, which is asserted at the beginning, so there is no need to check for it. Sponsored by: FreeBSD Foundation MFC after: 2 weeks	2012-09-25 21:31:17 +00:00
Pawel Jakub Dawidek	3a038c4d68	We cannot open file for reading and executing (O_RDONLY | O_EXEC). Well, in theory we can pass those two flags, because O_RDONLY is 0, but we won't be able to read from a descriptor opened with O_EXEC. Update the comment. Sponsored by: FreeBSD Foundation MFC after: 2 weeks	2012-09-25 21:11:40 +00:00
Pawel Jakub Dawidek	5c3e5c7f03	Require CAP_DELETE on directory descriptor for unlinkat(2). Sponsored by: FreeBSD Foundation MFC after: 2 weeks	2012-09-25 21:00:36 +00:00
Pawel Jakub Dawidek	cffcbad2bf	Require CAP_CREATE on directory descriptor for symlinkat(2). Sponsored by: FreeBSD Foundation MFC after: 2 weeks	2012-09-25 20:59:12 +00:00
Pawel Jakub Dawidek	d2e166e654	Require CAP_CREATE on directory descriptor for linkat(2). Sponsored by: FreeBSD Foundation MFC after: 2 weeks	2012-09-25 20:58:15 +00:00
Pawel Jakub Dawidek	1159429db8	O_EXEC flag is not part of the O_ACCMODE mask, check it separately. If O_EXEC is provided don't require CAP_READ/CAP_WRITE, as O_EXEC is mutually exclusive to O_RDONLY/O_WRONLY/O_RDWR. Without this change CAP_FEXECVE capability right is not enforced. Sponsored by: FreeBSD Foundation MFC after: 3 days	2012-09-25 20:48:49 +00:00
George V. Neville-Neil	0bf9cb917c	Change the module name for the I/O provider to "kernel" from "genunix". This will require us to modify externally created DTrace scripts but makes logical sense for FreeBSD. Requested by: rpaulo MFC after: 2 weeks	2012-09-25 19:16:28 +00:00
John Baldwin	d95dca1d08	Add optional entropy harvesting for software interrupts in swi_sched() as controlled by kern.random.sys.harvest.swi. SWI harvesting feeds into the interrupt FIFO and each event is estimated as providing a single bit of entropy. Reviewed by: markm, obrien MFC after: 2 weeks	2012-09-25 14:55:46 +00:00
Konstantin Belousov	787a64ddd2	Do not skip two elements of the tid_buffer when reusing the buffer slot. This eventually results in exhaustion of the tid space, causing new threads to get tid -1 as identifier. The bad effect of having the thread id equal to -1 is that UMTX_OP_UMUTEX_WAIT returns EFAULT for a lock owned by such a thread, because casuword cannot distinguish between the literal value -1 read from the address and -1 returned as an indication of faulted access. The _thr_umutex_lock() helper from libthr does not check for errors from _umtx_op_err(2), causing an infinite loop in mutex_lock_sleep(). We observed JVM processes hanging and consuming an enormous amount of system time on machines with approximately 100 days uptime. Reported by: Mykola Dzham <freebsd levsha org ua> MFC after: 1 week	2012-09-22 12:17:09 +00:00
Eitan Adler	96240c89f0	Correct double "the the" Approved by: cperciva MFC after: 3 days	2012-09-14 21:28:56 +00:00
Andriy Gapon	e87fc7cf7b	sched_ule: fix inverted condition in reporting of priority lending via ktr Reviewed by: kan MFC after: 1 week	2012-09-14 19:55:28 +00:00
Attilio Rao	0a15e5d30d	Remove all the checks on curthread != NULL with the exception of some MD trap checks (eg. printtrap()). Generally this check is not needed anymore, as there is not a legitimate case where curthread == NULL, after pcpu 0 area has been properly initialized. Reviewed by: bde, jhb MFC after: 1 week	2012-09-13 22:26:22 +00:00
John Baldwin	0f14f15b62	Ignore stop and continue signals sent to an exiting process. Stop signals set p_xstat to the signal that triggered the stop, but p_xstat is also used to hold the exit status of an exiting process. Without this change, a stop signal that arrived after a process was marked P_WEXIT but before it was marked a zombie would overwrite the exit status with the stop signal number. Reviewed by: kib MFC after: 1 week	2012-09-13 15:51:18 +00:00
Attilio Rao	e3ae0dfe69	Improve check coverage about idle threads. Idle threads are not allowed to acquire any lock but spinlocks. Deny any attempt to do so by panicking at the locking operation when INVARIANTS is on. Then, remove the check on blocking on a turnstile. The check in sleepqueues is left because they are not allowed to use tsleep() either which could happen still. Reviewed by: bde, jhb, kib MFC after: 1 week	2012-09-12 22:10:53 +00:00
Attilio Rao	faa1082aa2	Tweak the panic message for sleeping from threads with TDP_NOSLEEPING on. The current message has no information on the thread and wchan involved, which may be useful in cases where dumps have mangled DWARF information. Reported by: kib Reviewed by: bde, jhb, kib MFC after: 1 week	2012-09-12 22:05:54 +00:00
Konstantin Belousov	bcd5bb8e57	Add a facility for vgone() to inform the set of subscribed mounts about vnode reclamation. Typical use is for the bypass mounts like nullfs to get a notification about lower vnode going away. Now, vgone() calls new VFS op vfs_reclaim_lowervp() with an argument lowervp which is reclaimed. It is possible to register several reclamation event listeners, to correctly handle the case of several nullfs mounts over the same directory. For the filesystem not having nullfs mounts over it, the overhead added is a single mount interlock lock/unlock in the vnode reclamation path. In collaboration with: pho MFC after: 3 weeks	2012-09-09 19:17:15 +00:00
Konstantin Belousov	84c3cd4f19	Add MNTK_LOOKUP_EXCL_DOTDOT struct mount flag, which specifies to the lookup code that dotdot lookups shall override any shared lock requests with the exclusive one. The flag is useful for filesystems which sometimes need to upgrade a shared lock to exclusive inside VOP_LOOKUP or later, which cannot be done safely for dotdot, due to dvp also being locked and causing a LOR. In collaboration with: pho MFC after: 3 weeks	2012-09-09 19:11:52 +00:00
Attilio Rao	16cbf13b53	Move the checks for td_pinned, td_critnest, TDP_NOFAULTING and TDP_NOSLEEPING leaking from syscallret() to userret() so that also trap handling is covered. Also, the check on td_locks is not duplicated between the two functions. Reported by: avg Reviewed by: kib MFC after: 1 week	2012-09-08 18:35:15 +00:00
Attilio Rao	fbe18392a1	Move PT_UPDATED_FLUSH() before td_locks check in order to have more coverage also in the XEN case. Reviewed by: kib MFC after: 1 week	2012-09-08 18:29:53 +00:00
Attilio Rao	324e57150d	userret() already checks for td_locks when INVARIANTS is enabled, so there is no need to check if Giant is acquired after it. Reviewed by: kib MFC after: 1 week	2012-09-08 18:27:11 +00:00
Gleb Smirnoff	aaf6343576	Supply the pr_ctloutput method for local datagram sockets, so that setsockopt() and getsockopt() work on them. This makes 'tools/regression/sockets/unix_cmsg -t dgram' more successful.	2012-09-07 21:06:54 +00:00
John Baldwin	773e3b7dda	A few whitespace and comment fixes.	2012-09-07 15:10:46 +00:00
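
The two tid ring buffer commits above (94cb35459d and 787a64ddd2) describe keeping the head and tail updates in separate statements from the element accesses, and advancing by exactly one slot on reuse. The sketch below illustrates that pattern only; it is not the kernel's code, and the `TidRing` class, its method names and its capacity rule are assumptions made for this example.

```python
class TidRing:
    """Fixed-size ring of recycled thread ids (hypothetical illustration)."""

    def __init__(self, size):
        self.buf = [None] * size
        self.size = size
        self.head = 0  # index of the next id to hand out again
        self.tail = 0  # index of the next free slot for a released id

    def is_empty(self):
        return self.head == self.tail

    def is_full(self):
        # One slot is kept unused so that full and empty are distinguishable.
        return (self.tail + 1) % self.size == self.head

    def put(self, tid):
        """Store a released id; return False when the ring is full."""
        if self.is_full():
            return False
        self.buf[self.tail] = tid                 # element access
        self.tail = (self.tail + 1) % self.size   # pointer update, kept separate
        return True

    def get(self):
        """Return the oldest released id, or -1 when none is available."""
        if self.is_empty():
            return -1
        tid = self.buf[self.head]                 # element access
        self.head = (self.head + 1) % self.size   # advance by exactly one slot
        return tid
```

Advancing `head` once per `get()` means every stored id is handed out again exactly once; advancing it twice would drop an id on each reuse, which is the kind of slow leak the second commit message describes.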

12855 Commits