freebsd-nq

Author	SHA1	Message	Date
David Xu	b101b127f3	In function do_rw_wrlock, when a writer gets an error, check before returning whether there are readers blocked by us via the URWLOCK_WRITE_WAITERS flag, and resume the readers. The error must be EAGAIN; otherwise there must be a memory problem, and nobody can rescue the buggy application. The revision 197445 might be reverted.	2009-09-25 00:03:13 +00:00
Alexander Motin	6090db7d38	Do not call BUS_DRIVER_ADDED() for detached buses (attach failed) on driver load. This fixes a crash on atapicam module load on systems where some ata channels (usually ata1) were probed but failed to attach. Reviewed by: jhb, imp Tested by: many	2009-09-24 17:03:32 +00:00
Roman Divacky	1c2825bd80	Change unsigned foo to u_foo as required by style(9). Requested by: bde Approved by: ed (mentor)	2009-09-22 16:16:02 +00:00
Edward Tomasz Napierala	a9315dded6	Add pieces of infrastructure required for NFSv4 ACL support in UFS. Reviewed by: rwatson	2009-09-22 15:15:03 +00:00
Konstantin Belousov	51a6ef34fb	Remove forward_roundrobin(), it is unused for quite some time. Reviewed by: jhb MFC after: 1 week	2009-09-21 13:09:56 +00:00
Michael Tuexen	8518270e20	Get SCTP working in combination with VIMAGE. Contains code from bz. Approved by: rrs (mentor) MFC after: 1 month.	2009-09-19 14:02:16 +00:00
Alan Cox	fe105d45a2	Add a new sysctl for reporting all of the supported page sizes. Reviewed by: jhb MFC after: 3 weeks	2009-09-18 17:04:57 +00:00
Attilio Rao	39df6da8cc	Don't allocate new unnecessary pages when devstat_alloc() loses the race to reacquire the lock, but recheck whether new pages are allocatable from the pool and free the previously allocated ones. Tested by: pho, Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2009-09-18 13:48:38 +00:00
Roman Divacky	6413e27b3a	Fix the style of the previous commit. Approved by: ed (mentor, implicit)	2009-09-17 17:48:13 +00:00
Roman Divacky	abc8594d70	Make these argument/variable unsigned as the defines for them don't fit into signed 32bit integer. Approved by: ed (mentor, implicit) Approved by: sson	2009-09-17 17:41:28 +00:00
Stacey Son	fdc1a1131e	Add EV_RECEIPT to kevents. EV_RECEIPT is useful for disambiguating error conditions when multiple event structures are passed to kevent(2). The error code is returned in the data field and EV_ERROR is set. Approved by: rwatson (co-mentor)	2009-09-16 03:49:54 +00:00
Stacey Son	1a921c410a	Add the EV_DISPATCH flag to kevents. When the EV_DISPATCH flag is used the event source will be disabled immediately after the delivery of an event. This is similar to the EV_ONESHOT flag but it doesn't delete the event. Approved by: rwatson (co-mentor)	2009-09-16 03:37:39 +00:00
Stacey Son	2c2e449905	Add EVFILT_USER to kevents. Add user events support to kernel events which are not associated with any kernel mechanism but are triggered by user level code. This is useful for adding user level events to an event handler that may also be monitoring kernel events. Approved by: rwatson (co-mentor)	2009-09-16 03:30:12 +00:00
Stacey Son	95128e983f	Add optional touch event filter hooks to kevents. The touch event filter is called when a kernel event data is possibly updated. There are two hook points. First, during a kevent() system call. Second, when an event has been triggered. Approved by: rwatson (co-mentor)	2009-09-16 03:15:57 +00:00
Andre Oppermann	11c99a6d7b	-Put the optimized soreceive_stream() under a compile time option called TCP_SORECEIVE_STREAM for the time being. Requested by: brooks Once compiled in make it easily switchable for testers by using a tuneable net.inet.tcp.soreceive_stream and a corresponding read-only sysctl to report the current state. Suggested by: rwatson MFC after: 2 days	2009-09-15 22:23:45 +00:00
Attilio Rao	435068aab7	Fix sched_switch_migrate(): - In 8.x and above the run-queue locks are no longer shared even in the HTT case, so remove the special case. - The deadlock explained in the removed comment here is still possible even with different locks, with the contribution of tdq_lock_pair(). An explanation is here: (hypothesis: a thread needs to migrate to another CPU, thread1 is doing sched_switch_migrate() and thread2 is the one handling the sched_switch() request or in other words, thread1 is the thread that needs to migrate and thread2 is a thread that is going to be preempted, most likely an idle thread. Also, 'old' refers to the context (in terms of run-queue and CPU) thread1 is leaving and 'new' refers to the context thread1 is going into. Finally, thread3 is doing tdq_idletd() or sched_balance() and definitely doing tdq_lock_pair()) * thread1 blocks its td_lock. Now td_lock is 'blocked' * thread1 drops its old runqueue lock * thread1 acquires the new runqueue lock * thread1 adds itself to the new runqueue and sends an IPI_PREEMPT through tdq_notify() to the new CPU * thread1 drops the new lock * thread3, scanning the runqueues, locks the old lock * thread2 receives the IPI_PREEMPT and does thread_lock() with td_lock pointing to the new runqueue * thread3 wants to acquire the new runqueue lock, but it can't because it is held by thread2, so it spins * thread1 wants to acquire the old lock, but as long as it is held by thread3 it can't * thread2, going further, at some point wants to switch in thread1, but it will wait forever because thread1->td_lock is in the blocked state This deadlock has manifested mostly on 7.x and has been reported several times on mailing lists under the heading 'spinlock held too long'. Many thanks to des@ for having worked hard on producing suitable textdumps and Jeff for help on the comment wording. 
Reviewed by: jeff Reported by: des, others Tested by: des, Giovanni Trematerra <giovanni dot trematerra at gmail dot com> (STABLE_7 based version)	2009-09-15 16:56:17 +00:00
Attilio Rao	4c68dee0fb	Revert r196779 in order to implement a different scheme for newbus locking methodology. Requested by: imp	2009-09-13 15:08:19 +00:00
Luigi Rizzo	446e861708	Make sure callouts are not processed one tick late. The problem was introduced in SVN 180608/ rev 1.114 and affects all users of callout_reset() (including select, usleep, setitimer). A better fix probably involves replicating 'ticks' in the struct callout_cpu; this commit is just a temporary thing so that we can MFC it after a suitable test time and RE approval. MFC after: 3 days	2009-09-12 21:44:34 +00:00
Robert Watson	e76d823b81	Use C99 initialization for struct filterops. Obtained from: Mac OS X Sponsored by: Apple Inc. MFC after: 3 weeks	2009-09-12 20:03:45 +00:00
Nick Hibma	b22692bd0a	Add a comment on the consequences of reducing the poweroff delay	2009-09-10 18:24:59 +00:00
Konstantin Belousov	b55ef216fe	kern_select(9) copies fd_set in and out of userspace in quantities of longs. Since longs are 4 bytes in 32-bit processes, a 64-bit kernel may copy in or out 4 bytes more than the process expected. Calculate the number of bytes to copy taking into account the size of fd_set for the current process ABI. Diagnosed and tested by: Peter Jeremy <peterjeremy acm org> Reviewed by: jhb MFC after: 1 week	2009-09-09 20:59:01 +00:00
Konstantin Belousov	1ef6ea9b60	Unlock the image vnode around the call of pmc PMC_FN_PROCESS_EXEC hook. The hook calls vn_fullpath(9), that should not be executed with a vnode lock held. Reported by: Bruce Cran <bruce cran org uk> Tested by: pho MFC after: 3 days	2009-09-09 10:52:36 +00:00
Konstantin Belousov	427992ecdb	In vfs_mark_atime(9), be resistant against reclaimed vnodes. Assert that necessary locks are taken, since vop might not be called. Tested by: pho MFC after: 3 days	2009-09-09 10:51:50 +00:00
Poul-Henning Kamp	6778431478	Revert previous commit and add myself to the list of people who should know better than to commit with a cat in the area.	2009-09-08 13:19:05 +00:00
Poul-Henning Kamp	b34421bf9c	Add necessary include.	2009-09-08 13:16:55 +00:00
Antoine Brodin	bf3f1fe043	Change w_notrunning and w_stillcold from pointer to array so that sizeof returns what is expected. PR: kern/138557 Discussed with: brucec@ MFC after: 1 month	2009-09-06 13:31:05 +00:00
Konstantin Belousov	db17314ea4	In fhopen, vfs_ref() the mount point while vnode is unlocked, to prevent vn_start_write(NULL, &mp) from operating on potentially freed or reused struct mount *. Remove unmatched vfs_rel() in cleanup. Noted and reviewed by: tegge Tested by: pho MFC after: 3 days	2009-09-06 11:44:46 +00:00
Ed Schouten	4d3b1aacfc	Move ptmx into pty(4). Now that pty(4) is a loadable kernel module, I'd better move /dev/ptmx in there as well. This means that pty(4) now provides almost all pseudo-terminal compatibility code. This means it's very easy to test whether applications use the proper library interfaces when allocating pseudo-terminals (namely posix_openpt and openpty).	2009-09-06 10:27:45 +00:00
Jamie Gritton	babbbb9c57	Allow a jail's name to be the same as its jid (which is the default if no name is specified), but still disallow other numeric names. Reviewed by: zec Approved by: bz (mentor) MFC after: 3 days	2009-09-04 19:00:48 +00:00
Attilio Rao	80002a63db	Add intermediate states for attaching and detaching that will be reused by the enhanced newbus locking once it is checked in. This change can be easily MFCed to STABLE_8 at the appropriate moment. Reviewed by: jhb, scottl Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2009-09-03 13:40:41 +00:00
Attilio Rao	8d3635c4db	Fix some bugs related to adaptive spinning: In the lockmgr support: - GIANT_RESTORE() is only called when the sleep finishes, so the current code can end up with a Giant unlock problem. Fix it by calling GIANT_RESTORE() when needed. Note that this is not exactly ideal because for any iteration of the adaptive spinning we drop and restore Giant, but the overhead should not be a factor. - In the lock held in exclusive mode case, after the adaptive spinning is brought to completion, we should just retry to acquire the lock instead of falling through. Fix that. - Fix a style nit In the sx support: - Call GIANT_SAVE() before looping. This saves some overhead because in the current code GIANT_SAVE() is called several times. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2009-09-02 17:33:51 +00:00
Konstantin Belousov	579b976090	Fix mount reference leak when V_XSLEEP is specified to vn_start_write(). Submitted by: tegge	2009-09-01 12:05:39 +00:00
Konstantin Belousov	8a945d109c	Reintroduce the r196640, after fixing the problem with my testing. Remove the altkstacks, instead instantiate threads with kernel stack allocated with the right size from the start. For the thread that has kernel stack cached, verify that requested stack size is equal to the actual, and reallocate the stack if sizes differ [1]. This fixes the bug introduced by r173361 that was committed several days after r173004 and consisted of kthread_add(9) ignoring the non-default kernel stack size. Also, r173361 removed the caching of the kernel stacks for a non-first thread in the process. Introduce separate kernel stack cache that keeps some limited amount of preallocated kernel stacks to lower the latency of thread allocation. Add vm_lowmem handler to prune the cache on low memory condition. This way, system with reasonable amount of the threads get lower latency of thread creation, while still not exhausting significant portion of KVA for unused kstacks. Submitted by: peter [1] Discussed with: jhb, julian, peter Reviewed by: jhb Tested by: pho (and retested according to new test scenarios) MFC after: 1 week	2009-09-01 11:41:51 +00:00
Konstantin Belousov	a505c2c72c	Make the mnt_writeopcount and mnt_secondary_writes counters, used by the suspension code, not greater than mnt_ref reference counter value. Increment mnt_ref together with write counter in vn_start_write()/ vn_start_secondary_write(), releasing in vn_finished_write/vn_finished_secondary_write(). Since r186197, unmount code requires that no writers occurred after all references are expired. We still could get write counter incremented for freed or reused struct mount, but it seems to be innocent, since corresponding vnode should be referenced and reclaimed then. Reported by: pho (last half a year), erwin Reviewed by: attilio Tested by: pho, erwin MFC after: 1 week	2009-08-31 10:20:52 +00:00
Bjoern A. Zeeb	ecc2fda872	Make sure FreeBSD binaries without .note.ABI-tag section work correctly and do not match a colliding Debian GNU/kFreeBSD brandinfo statements. For this mark the Debian GNU/kFreeBSD brandinfo that it must have an .note.ABI-tag section and ignore the old EI_OSABI brandinfo when comparing a possibly colliding set of options. Due to SYSINIT we add the brandinfo in a non-deterministic order, so native FreeBSD is not always first. We may want to consider to force native FreeBSD to come first as well. The only way a problem could currently be noticed is when running an i386 binary without the .note.ABI-tag on amd64 and the Debian GNU/kFreeBSD brandinfo was matched first, as the fallback to ld-elf32.so.1 does not exist in that case. Reported and tested by: ticso In collaboration with: kib MFC after: 3 days	2009-08-30 14:38:17 +00:00
Konstantin Belousov	f25fa6abb2	Reverse r196640 and r196644 for now.	2009-08-29 21:53:08 +00:00
Konstantin Belousov	b6b2d1bf88	Dispose the kernel stack of the proper thread. Submitted by: alc MFC after: 1 week	2009-08-29 18:01:02 +00:00
Konstantin Belousov	c3cf0b476f	Remove the altkstacks, instead instantiate threads with kernel stack allocated with the right size from the start. For the thread that has kernel stack cached, verify that requested stack size is equal to the actual, and reallocate the stack if sizes differ [1]. This fixes the bug introduced by r173361 that was committed several days after r173004 and consisted of kthread_add(9) ignoring the non-default kernel stack size. Also, r173361 removed the caching of the kernel stacks for a non-first thread in the process. Introduce separate kernel stack cache that keeps some limited amount of preallocated kernel stacks to lower the latency of thread allocation. Add vm_lowmem handler to prune the cache on low memory condition. This way, system with reasonable amount of the threads get lower latency of thread creation, while still not exhausting significant portion of KVA for unused kstacks. Submitted by: peter [1] Discussed with: jhb, julian, peter Reviewed by: jhb Tested by: pho MFC after: 1 week	2009-08-29 13:28:02 +00:00
John Baldwin	2fa8c8d21e	Extend the device pager to support different memory attributes on different pages in an object. - Add a new variant of d_mmap() currently called d_mmap2() which accepts an additional in/out parameter that is the memory attribute to use for the requested page. - A driver either uses d_mmap() or d_mmap2() for all requests but not both. The current implementation uses a flag in the cdevsw (D_MMAP2) to indicate that the driver provides a d_mmap2() handler instead of d_mmap(). This is done to make the change ABI compatible with existing drivers and MFC'able to 7 and 8. Submitted by: alc MFC after: 1 month	2009-08-28 14:06:55 +00:00
Jamie Gritton	c4884ffa6f	Fix a LOR between allprison_lock and vnode locks by releasing allprison_lock before releasing a prison's root vnode. PR: kern/138004 Reviewed by: kib Approved by: bz (mentor) MFC after: 3 days	2009-08-27 16:15:51 +00:00
Marius Strobl	5486ffc898	Add a temporary workaround which just lets init die instead of causing a panic if it is killed due to an unsolved stack overflow seen very late during shutdown on sparc64 when the gmirror worker process exits, which is a regression introduced in 8.0. Reviewed by: kib MFC after: 3 days	2009-08-26 21:10:47 +00:00
Konstantin Belousov	4f4946d337	Honor the vfs.timestamp_precision sysctl settings for utimes(path, NULL) and similar calls. Obtained from: Petr Salinger, Debian GNU/kFreeBSD, Debian bug #489894 MFC after: 3 days	2009-08-26 14:32:37 +00:00
Jilles Tjoelker	74d1c4927a	Fix poll() on half-closed sockets, while retaining POLLHUP for fifos. This reverts part of r196460, so that sockets only return POLLHUP if both directions are closed/error. Fifos get POLLHUP by closing the unused direction immediately after creating the sockets. The tools/regression/poll/*poll.c tests now pass except for two other things: - if POLLHUP is returned, POLLIN is always returned as well instead of only when there is data left in the buffer to be read - fifo old/new reader distinction does not work the way POSIX specs it Reviewed by: kib, bde	2009-08-25 21:44:14 +00:00
Warner Losh	58a745889f	Rather than having enabled/disabled, implement a max queue depth. While usually not an issue, this firewalls bugs in the code that may run us out of memory. Fix a memory exhaustion in the case where devctl was disabled, but the link was bouncing. The check to queue was in the wrong place. Implement a new sysctl hw.bus.devctl_queue to control the depth. Make compatibility hacks for hw.bus.devctl_disable to ease transition. Reviewed by: emaste@ Approved by: re@ (kib) MFC after: asap	2009-08-25 06:25:59 +00:00
Bjoern A. Zeeb	89ffc202d6	Fix handling of .note.ABI-tag section for GNU systems [1]. Handle GNU/Linux according to LSB Core Specification 4.0, Chapter 11. Object Format, 11.8. ABI note tag. Also check the first word of desc, not only name, according to glibc abi-tags specification to distinguish between Linux and kFreeBSD. Add explicit handling for Debian GNU/kFreeBSD, which runs on our kernels as well [2]. In {amd64,i386}/trap.c, when checking osrel of the current process, also check the ABI to not change the signal behaviour for Linux binary processes, now that we save an osrel version for all three from the lists above in struct proc [2]. These changes make it possible to run FreeBSD, Debian GNU/kFreeBSD and Linux binaries on the same machine again for at least i386 and amd64, and no longer break kFreeBSD which was detected as GNU(/Linux). PR: kern/135468 Submitted by: dchagin [1] (initial patch) Suggested by: kib [2] Tested by: Petr Salinger (Petr.Salinger seznam.cz) for kFreeBSD Reviewed by: kib MFC after: 3 days	2009-08-24 16:19:47 +00:00
Ed Schouten	2992abe047	Allow multiple console devices per driver without insane code duplication. Say, a driver wants to have multiple console devices to pick from, you would normally write down something like this: CONSOLE_DRIVER(dev1); CONSOLE_DRIVER(dev2); Unfortunately, this means that you have to declare 10 cn routines, instead of 5. It also isn't possible to initialize cn_arg beforehand. I noticed this restriction when I was implementing some of the console bits for my vt(4) driver in my newcons branch. I have a single set of cn routines (termcn_*) which are shared by all vt(4) console devices. In order to solve this, I'm adding a separate consdev_ops structure, which contains all the function pointers. This structure is referenced through consdev's cn_ops field. While there, I'm removing CONS_DRIVER() and cn_checkc, which have been deprecated for years. They weren't used throughout the source, until the Xen console driver showed up. CONSOLE_DRIVER() has been changed to do the right thing. It now declares both the consdev and consdev_ops structure and ties them together. In other words: this change doesn't change the KPI for drivers that used the regular way of declaring console devices. If drivers want to use multiple console devices, they can do this as follows: static const struct consdev_ops mydriver_cnops = { .cn_probe = mydriver_cnprobe, ... }; static struct mydriver_softc cons0_softc = { ... }; CONSOLE_DEVICE(cons0, mydriver_cnops, &cons0_softc); static struct mydriver_softc cons1_softc = { ... }; CONSOLE_DEVICE(cons1, mydriver_cnops, &cons1_softc); Obtained from: //depot/user/ed/newcons/...	2009-08-24 10:53:30 +00:00
Marko Zec	0cb8b6a9b7	When "jail -c vnet" request fails, the current code actually creates and leaves behind an orphaned vnet. This change ensures that such vnets get released. This change affects only options VIMAGE builds. Submitted by: jamie Discussed with: bz Approved by: re (rwatson), julian (mentor) MFC after: 3 days	2009-08-24 10:16:19 +00:00
Marko Zec	d57425ab65	When registering a protocol to an existing protocol domain via pf_proto_register(), iterate over all existing vnets to call protosw_init() and thus the appropriate .pr_init() handler in the context of each vnet. NB in the future we probably want to separate pr_init() handlers into two, i.e. per-vnet and global, functions. This change has no impact on nooptions VIMAGE builds. Approved by: re (rwatson), julian (mentor) MFC after: 3 days	2009-08-24 10:03:41 +00:00
Robert Watson	77dfcdc445	Rework global locks for interface list and index management, correcting several critical bugs, including race conditions and lock order issues: Replace the single rwlock, ifnet_lock, with two locks, an rwlock and an sxlock. Either can be held to stabilize the lists and indexes, but both are required to write. This allows the list to be held stable in both network interrupt contexts and sleepable user threads across sleeping memory allocations or device driver interactions. As before, writes to the interface list must occur from sleepable contexts. Reviewed by: bz, julian MFC after: 3 days	2009-08-23 20:40:19 +00:00
Ed Schouten	bfdaa52382	Allow pty(4) to be loaded as a kld. Unfortunately, the wrappers that are present in pts(4) don't have the mechanics to allow pty(4) to be unloaded safely, so I'm forcing this kld to return EBUSY. This also means we have to enable some extra code in pts(4) unconditionally. Proposed by: rwatson	2009-08-23 20:26:09 +00:00

11376 Commits