freebsd-dev

Author	SHA1	Message	Date
Konstantin Belousov	84c3cd4f19	Add MNTK_LOOKUP_EXCL_DOTDOT struct mount flag, which specifies to the lookup code that dotdot lookups shall override any shared lock requests with the exclusive one. The flag is useful for filesystems which sometimes need to upgrade shared lock to exclusive inside the VOP_LOOKUP or later, which cannot be done safely for dotdot, due to dvp also locked and causing LOR. In collaboration with: pho MFC after: 3 weeks	2012-09-09 19:11:52 +00:00
Attilio Rao	16cbf13b53	Move the checks for td_pinned, td_critnest, TDP_NOFAULTING and TDP_NOSLEEPING leaking from syscallret() to userret() so that also trap handling is covered. Also, the check on td_locks is not duplicated between the two functions. Reported by: avg Reviewed by: kib MFC after: 1 week	2012-09-08 18:35:15 +00:00
Attilio Rao	fbe18392a1	Move PT_UPDATED_FLUSH() before td_locks check in order to have more coverage also in the XEN case. Reviewed by: kib MFC after: 1 week	2012-09-08 18:29:53 +00:00
Attilio Rao	324e57150d	userret() already checks for td_locks when INVARIANTS is enabled, so there is no need to check if Giant is acquired after it. Reviewed by: kib MFC after: 1 week	2012-09-08 18:27:11 +00:00
Gleb Smirnoff	aaf6343576	Supply the pr_ctloutput method for local datagram sockets, so that setsockopt() and getsockopt() work on them. This makes 'tools/regression/sockets/unix_cmsg -t dgram' more successful.	2012-09-07 21:06:54 +00:00
John Baldwin	773e3b7dda	A few whitespace and comment fixes.	2012-09-07 15:10:46 +00:00
Aleksandr Rybalko	1bccd8638e	Style fixes. Suggested by: mdf Approved by: adrian (menthor)	2012-09-04 23:16:55 +00:00
Aleksandr Rybalko	6a8dada257	Add missing braces. Approved by: bschmidt (while mentor offline) Pointed by: gcooper Pointy hat to: ray	2012-09-03 09:46:46 +00:00
Andrey Zonov	ceb0f71506	- Mark some sysctls with CTLFLAG_TUN flag instead of CTLFLAG_RDTUN. Pointed out by: avg Approved by: kib (mentor) MFC after: 1 week	2012-09-03 09:26:56 +00:00
Aleksandr Rybalko	70da14c4bb	Add kern.hintmode sysctl variable to show current state of hints: 0 - loader hints in environment only; 1 - static hints only 2 - fallback mode (Dynamic KENV with fallback to kernel environment) Add kern.hintmode write handler, accept only value 2. That will switch static KENV to dynamic. So it will be possible to change device hints. Approved by: adrian (mentor)	2012-09-03 08:52:05 +00:00
Andrey Zonov	c3927cd956	- Make kern.maxtsiz, kern.dfldsiz, kern.maxdsiz, kern.dflssiz, kern.maxssiz and kern.sgrowsiz sysctls writable. Approved by: kib (mentor)	2012-09-02 17:39:02 +00:00
Mikolaj Golub	bb9f214f64	In soreceive_generic() remove the optimization for the case when MSG_WAITALL is set, and it is possible to do the entire receive operation at once if we block (resid <= hiwat). Actually it might make the recv(2) with MSG_WAITALL flag get stuck when there is enough space in the receiver buffer to satisfy the request but not enough to open the window closed previously due to the buffer being full. The issue can be reproduced using the following scenario: On the sender side do 2 send(2) requests: 1) data of size much smaller than SOBUF_SIZE (e.g. SOBUF_SIZE / 10); 2) data of size equal to SOBUF_SIZE. On the receiver side do 2 recv(2) requests with MSG_WAITALL flag set: 1) recv() data of SOBUF_SIZE / 10 size; 2) recv() data of SOBUF_SIZE size; We totally fill the receiver buffer with one SOBUF_SIZE/10 size request and partial SOBUF_SIZE request. When the first request is processed we get SOBUF_SIZE/10 free space. It is just enough to receive the rest of bytes for the second request, and soreceive_generic() blocks in the part that is a subject of this change waiting for the rest. But the window was closed when the buffer was filled and to avoid silly window syndrome it opens only when available space is larger than sb_hiwat/4 or maxseg. So it is stuck and pending data is only sent via TCP window probes. Discussed with: kib (long ago) MFC after: 2 weeks	2012-09-02 07:33:52 +00:00
Mikolaj Golub	2ad099fcb1	In soreceive_generic() when checking if the type of mbuf has changed check it for MT_CONTROL type too, otherwise the assertion "m->m_type == MT_DATA" below may be triggered by the following scenario: - the sender sends some data (MT_DATA) and then a file descriptor (MT_CONTROL); - the receiver calls recv(2) with a MSG_WAITALL asking for data larger than the receive buffer (uio_resid > hiwat). MFC after: 2 week	2012-09-02 07:29:37 +00:00
Pawel Jakub Dawidek	707641ec28	Fix panic in procdesc that can be triggered in the following scenario: 1. Process A pdfork(2)s process B. 2. Process A passes process descriptor of B to unrelated process C. 3. Hit CTRL+C to terminate process A. Process B is also terminated with SIGINT. 4. init(8) collects status of process B. 5. Process C closes process descriptor associated with process B. When we have such order of events, init(8), by collecting status of process B, will call procdesc_reap(). This function sets pd_proc to NULL. Now when process C calls close on this process descriptor, procdesc_close() is called. Unfortunately procdesc_close() assumes that pd_proc points at a valid proc structure, but it was set to NULL earlier, so the kernel panics. The patch also adds setting 'p->p_procdesc' to NULL in procdesc_reap(), which I think should be done. MFC after: 1 week	2012-09-01 11:21:56 +00:00
Attilio Rao	d4a2ab8c07	Post r222812 KTR_CPUMASK started being initialized only as a tunable handler and not more statically. Unfortunately, it seems that this is not ideal for new platform bringup and boot low level development (which needs ktr_cpumask to be effective before tunables can be setup). Because of this, add a way to statically initialize cpusets, by passing an list of initializers, divided by commas. Also, provide a way to enforce an all-set mask, for above mentioned initializers. This imposes some differences on how KTR_CPUMASK is setup now as a kernel option, and in particular this makes the words specifications backward wrt. what is currently in -CURRENT. In order to avoid mismatches between KTR_CPUMASK definition and other way to setup the mask (tunable, sysctl) and to print it, change the ordering how cpusetobj_print() and cpusetobj_scan() acquire the words belonging to the set. Please give a look to sys/conf/NOTES in order to understand how the new format is supposed to work. Also, ktr manpages will be updated shortly by gjb which volountereed for this. This patch won't be merged because it changes a POLA (at least from the theoretical standpoint) and this is however a patch that proves to be effective only in development environments. Requested by: rpaulo Reviewed by: jeff, rpaulo	2012-08-30 21:22:47 +00:00
Marius Strobl	bf38cf8ab3	- Unlike cache invalidation and TLB demapping IPIs, reading registers from other CPUs doesn't require locking so get rid of it. As the latter is used for the timecounter on certain machine models, using a spin lock in this case can lead to a deadlock with the upcoming callout(9) rework. - Merge r134227/r167250 from x86: Avoid cross-IPI SMP deadlock by using the smp_ipi_mtx spin lock not only for smp_rendezvous_cpus() but also for the MD cache invalidation and TLB demapping IPIs. - Mark some unused function arguments as such. MFC after: 1 week	2012-08-29 16:56:50 +00:00
Ed Schouten	fa4dd27847	Remove unused SI_* flags. The SI_DEVOPEN, SI_CONSOPEN and SI_CANDELETE flags are not used by any piece of code in the tree.	2012-08-28 19:30:29 +00:00
John Baldwin	10f0ab3933	Shorten the name of the fast SWI taskqueue to "fast taskq" so that it fits. Reported by: lev MFC after: 1 week	2012-08-28 13:35:37 +00:00
Navdeep Parhar	812302c3eb	Allow nmbjumbop, nmbjumbo9, and nmbjumbo16 to be set directly via loader tunables. MFC after: 1 month	2012-08-23 21:32:02 +00:00
Konstantin Belousov	258f94423b	Provide some compat32 shims for sysctl vfs.conflist. It is required for getvfsbyname(3) operation when called from 32bit process, and getvfsbyname(3) is used by recent bsdtar import. Reported by: many Tested by: David Naylor <naylor.b.david@gmail.com> MFC after: 5 days	2012-08-22 20:05:34 +00:00
John Baldwin	7e690c1f79	Assert that system calls do not leak a pinned thread (via sched_pin()) to userland.	2012-08-22 20:02:42 +00:00
John Baldwin	dda66918e6	Fix a typo.	2012-08-22 20:01:57 +00:00
John Baldwin	ba96d2d816	Mark the idle threads as non-sleepable and also assert that an idle thread never blocks on a turnstile.	2012-08-22 20:01:38 +00:00
John Baldwin	e6bdd477fc	Fix the 'show witness' DDB command to honor db_pager_quit.	2012-08-22 20:00:41 +00:00
John Baldwin	6f7d0018b0	Add a BUS_CHILD_DELETED() method that a bus can hook to allow it to cleanup any bus-specific state (such as ivars) when a child device is deleted. Requested by: kan MFC after: 1 month	2012-08-21 18:13:09 +00:00
Konstantin Belousov	888aefef89	Deliver SIGSYS to the guilty thread, not to the process. MFC after: 1 week	2012-08-18 18:17:10 +00:00
David Xu	e31eb35c3f	regen.	2012-08-17 02:47:16 +00:00
David Xu	d65f1abca7	Implement syscall clock_getcpuclockid2, so we can get a clock id for process, thread or others we want to support. Use the syscall to implement POSIX API clock_getcpuclock and pthread_getcpuclockid. PR: 168417	2012-08-17 02:26:31 +00:00
John Baldwin	f39f73f47c	Remove D_NEEDGIANT from dead_devsw. biofinish() (and thus dead_strategy) does not need Giant. MFC after: 1 month	2012-08-16 18:04:33 +00:00
Konstantin Belousov	3fa615bc11	As a safety measure, disable lowering pid_max too much. Requested by: Peter Jeremy <peter@rulingia.com> MFC after: 1 week	2012-08-16 13:04:21 +00:00
Konstantin Belousov	abce621c3a	Fix grammar. Submitted by: jh MFC after: 1 week	2012-08-16 13:01:56 +00:00
Warner Losh	79f1fdb83b	Limit popcorn limit to something sane (either 2ns or 2 ticks if that's longer). PR: 156481 Submitted by: Ian Lepore	2012-08-16 02:35:44 +00:00
Alan Cox	33327b9e9b	Correct a KASSERT message. Submitted by: bde	2012-08-15 22:12:01 +00:00
Konstantin Belousov	02c6fc2114	Add a sysctl kern.pid_max, which limits the maximum pid the system is allowed to allocate, and corresponding tunable with the same name. Note that existing processes with higher pids are left intact. MFC after: 1 week	2012-08-15 15:56:21 +00:00
Hans Petter Selasky	c01fc06ee9	Revert r239178 and implement two new functions, namely "device_free_softc()" and "device_claim_softc()", to allow USB serial drivers refcounting the softc. These functions are used to grab the softc from auto-free and to free the softc back to the correct malloc type, respectivly. Discussed with: jhb MFC after: 2 weeks	2012-08-15 15:42:57 +00:00
David E. O'Brien	60ee433881	Don't include opt_ddb.h & <ddb/ddb.h> twice.	2012-08-15 14:18:54 +00:00
Jaakko Heinonen	2f0ac2593b	Reserve room for the terminating NUL when setting or getting kernel environment variables. KENV_MNAMELEN and KENV_MVALLEN doesn't include space for the terminating NUL.	2012-08-14 19:16:30 +00:00
David Xu	d7f97db7bd	Some style fixes inspired by @bde.	2012-08-11 23:48:39 +00:00
Alexander Motin	37f4e0254f	Some more minor tunings inspired by bde@.	2012-08-11 20:24:39 +00:00
Alexander Motin	bf89d544d0	Allow idle threads to steal second threads from other cores on systems with 8 or more cores to improve utilization. None of my tests on 2xXeon (2x6x2) system shown any slowdown from mentioned "excess thrashing". Same time in pbzip2 test with number of threads more then number of CPUs I see up to 10% speedup with SMT disabled and up 5% with SMT enabled. Thinking about trashing I was trying to limit that stealing within same last level cache, but got only worse results. Present code any way prefers to steal threads from topologically closer cores. Sponsored by: iXsystems, Inc.	2012-08-11 15:08:19 +00:00
David Xu	e8afbca2bc	tvtohz will print out an error message if a negative value is given to it, avoid this problem by detecting timeout earlier. Reported by: pho	2012-08-11 00:06:56 +00:00
Alexander Motin	579895df01	Some minor tunings/cleanups inspired by bde@ after previous commits: - remove extra dynamic variable initializations; - restore (4BSD) and implement (ULE) hogticks variable setting; - make sched_rr_interval() more tolerant to options; - restore (4BSD) and implement (ULE) kern.sched.quantum sysctl, a more user-friendly wrapper for sched_slice; - tune some sysctl descriptions; - make some style fixes.	2012-08-10 19:02:49 +00:00
Alexander Motin	9000aabf3b	sched_rr_interval() seems always returned period in hz ticks, but same always it was used as rate. Fix use side units to period in hz ticks.	2012-08-10 18:19:57 +00:00
Hans Petter Selasky	ea1bd564ac	Add new device method to free the automatically allocated softc structure which is returned by device_get_softc(). This method can be used to easily implement softc refcounting. This can be desirable when the softc has memory references which are controlled by userspace handles for example. This solves the problem of blocking the caller of device_detach() for a non-deterministic time. Discussed with: kib, ed MFC after: 2 weeks	2012-08-10 15:02:49 +00:00
Alexander Motin	3d7f41175d	Rework r220198 change (by fabient). I believe it solves the problem from the wrong direction. Before it, if preemption and end of time slice happen same time, thread was put to the head of the queue as for only preemption. It could cause single thread to run for indefinitely long time. r220198 handles it by not clearing TDF_NEEDRESCHED in case of preemption. But that causes delayed context switch every time preemption happens, even when not needed. Solve problem by introducing scheduler-specifoc thread flag TDF_SLICEEND, set when thread's time slice is over and it should be put to the tail of queue. Using SW_PREEMPT flag for that purpose as it was before just not enough informative to work correctly. On my tests this by 2-3 times reduces run time deviation (improves fairness) in cases when several threads share one CPU. Reviewed by: fabient MFC after: 2 months Sponsored by: iXsystems, Inc.	2012-08-09 19:26:13 +00:00
Alexander Motin	48317e9e27	SCHED_4BSD scheduling quantum mechanism appears to be broken for some time. With switchticks variable being reset each time thread preempted (that is done regularly by interrupt threads) scheduling quantum may never expire. It was not noticed in time because several other factors still regularly trigger context switches. Handle the problem by replacing that mechanism with its equivalent from SCHED_ULE called time slice. It is effectively the same, just measured in context of stathz instead of hz. Some unification is probably not bad.	2012-08-09 18:09:59 +00:00
Konstantin Belousov	c0c6e95f7f	Always initialize pl_event. Submitted by: Andrey Zonov <andrey@zonov.org> MFC after: 3 days	2012-08-08 00:20:30 +00:00
Alexander Kabaev	c9516c94b4	Do not add handler to event handlers list until ithread is created. In rare event when fast and ithread interrupts share the same vector and the fast handler was registered first, we can end up trying to schedule the ithread that is not created yet. The kernel built with INVARIANTS then triggers an assertion. Change the order to create the ithread first and only then add the handler that needs it to the interrupt event handlers list. Reviewed by: jhb	2012-08-06 16:37:43 +00:00
Konstantin Belousov	1c771f9222	After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason to pull vm_param.h was removed. Other big dependency of vm_page.h on vm_param.h are PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages. Stop including vm_param.h into vm_page.h. Include vm_param.h explicitely for the kernel code which needs it. Suggested and reviewed by: alc MFC after: 2 weeks	2012-08-05 14:11:42 +00:00
Alexander Motin	2038943013	Particlly MFcalloutng r238425 (by davide): Fix an issue related to old periodic timers. The code in kern_clocksource.c uses interrupt to keep track of time, and this time may not match with binuptime(). In order to address such incoherency, switch periodic timers to binuptime(). Except further calloutng it is needed for already present cyclic subsystem.	2012-08-04 08:06:37 +00:00

1 2 3 4 5 ...

12811 Commits