freebsd-skq

Author	SHA1	Message	Date
attilio	df8d61d246	When releasing a lockmgr held in shared way we need to use a write memory barrier in order to avoid, on architectures which doesn't have strong ordered writes, CPU instructions reordering. Diagnosed by: fabio	2009-10-03 15:02:55 +00:00
bz	a0d8f55f8a	Print a warning in case we cannot add more brandinfo because we would overflow the MAX_BRANDS sized array. Reviewed by: kib MFC After: 1 month	2009-10-03 10:50:00 +00:00
rwatson	86b3bcad7d	Don't comment on stream socket handling in sosend_dgram, since that's not handled. MFC after: 3 weeks	2009-10-02 21:31:15 +00:00
bz	bc660fe08f	Add a mitigation feature that will prevent user mappings at virtual address 0, limiting the ability to convert a kernel NULL pointer dereference into a privilege escalation attack. If the sysctl is set to 0 a newly started process will not be able to map anything in the address range of the first page (0 to PAGE_SIZE). This is the default. Already running processes are not affected by this. You can either change the sysctl or the tunable from loader in case you need to map at a virtual address of 0, for example when running any of the extinct species of a set of a.out binaries, vm86 emulation, .. In that case set security.bsd.map_at_zero="1". Superseeds: r197537 In collaboration with: jhb, kib, alc	2009-10-02 17:48:51 +00:00
emaste	93e81ca098	In fill_kinfo_thread, copy the thread's name into struct kinfo_proc even if it is empty. Otherwise the previous thread's name would remain in the struct and then be reported for this thread. Submitted by: Ryan Stone MFC after: 1 week	2009-10-01 21:44:30 +00:00
trasz	d5661d631d	Provide default implementation for VOP_ACCESS(9), so that filesystems which want to provide VOP_ACCESSX(9) don't have to implement both. Note that this commit makes implementation of either of these two mandatory. Reviewed by: kib	2009-10-01 17:22:03 +00:00
kib	6f65ac4277	Do not dereference vp->v_mount without holding vnode lock and checking that the vnode is not reclaimed. Noted by: Igor Sysoev <is rambler-co ru> MFC after: 1 week	2009-10-01 12:50:26 +00:00
kib	605a0e085a	Fix typo. MFC after: 3 days	2009-10-01 12:46:58 +00:00
avg	78596c163e	print machine in kernel boot version string Discussed with: gavin, kib, jhb PR: kern/126926 MFC after: 2 weeks	2009-10-01 10:53:12 +00:00
attilio	e9f2530ebf	When releasing a read/shared lock we need to use a write memory barrier in order to avoid, on architectures which doesn't have strong ordered writes, CPU instructions reordering. Diagnosed by: fabio Reviewed by: jhb Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2009-09-30 13:26:31 +00:00
avg	49524d648a	print_caddr_t: drop incorrect __unused attribute from parameter seems like a purely cosmetic change Reviewed by: jhb, kib MFC after: 1 week	2009-09-30 11:14:13 +00:00
rwatson	47ee86367a	Regenerate system call files following r197636.	2009-09-30 08:48:59 +00:00
rwatson	3d5e3df28c	Reserve system call numbers for Capsicum security framework capabilities, capability mode, and process descriptors: cap_new, cap_getrights, cap_enter, cap_getmode, pdfork, pdkill, pdgetpid, and pdwait. Obtained from: TrustedBSD Project Sponsored by: Google MFC after: 3 weeks	2009-09-30 08:46:01 +00:00
jamie	59de460f51	Set the prison in NFS anon and GSS SVC creds. Reviewed by: marcel MFC after: 3 days	2009-09-28 18:07:16 +00:00
delphij	79f2f8c774	Add two new fcntls to enable/disable read-ahead: - F_READAHEAD: specify the amount for sequential access. The amount is specified in bytes and is rounded up to nearest block size. - F_RDAHEAD: Darwin compatible version that use 128KB as the sequential access size. A third argument of zero disables the read-ahead behavior. Please note that the read-ahead amount is also constrainted by sysctl variable, vfs.read_max, which may need to be raised in order to better utilize this feature. Thanks Igor Sysoev for proposing the feature and submitting the original version, and kib@ for his valuable comments. Submitted by: Igor Sysoev <is rambler-co ru> Reviewed by: kib@ MFC after: 1 month	2009-09-28 16:59:47 +00:00
delphij	fd5f08e3e8	Use correct sizeof() object for klist 'list'. Currently, struct klist contained only SLIST_HEAD as its member, thus sizeof(struct klist) would equal to sizeof(struct klist *), so this change makes the code more correct in terms of semantics, but should be a no-op to compiler at this time. Reported by: MQ <antinvidia at gmail com>	2009-09-28 10:22:46 +00:00
davidxu	14cbd7c194	In function do_rw_wrlock, when a writer got an error and before returning, check if there are readers blocked by us via URWLOCK_WRITE_WAITERS flag, and resume the readers. The error must be EAGAIN, otherwise there must have memory problem, and nobody can rescue the buggy application. The revision 197445 might be reverted.	2009-09-25 00:03:13 +00:00
mav	19afb02afb	Do not call BUS_DRIVER_ADDED() for detached buses (attach failed) on driver load. This fixes crash on atapicam module load on systems, where some ata channels (usually ata1) was probed, but failed to attach. Reviewed by: jhb, imp Tested by: many	2009-09-24 17:03:32 +00:00
rdivacky	69bdbc1b60	Change unsigned foo to u_foo as required by style(9). Requested by: bde Approved by: ed (mentor)	2009-09-22 16:16:02 +00:00
trasz	7af86ad4d9	Add pieces of infrastructure required for NFSv4 ACL support in UFS. Reviewed by: rwatson	2009-09-22 15:15:03 +00:00
kib	9357da94fe	Remove forward_roundrobin(), it is unused for quite some time. Reviewed by: jhb MFC after: 1 week	2009-09-21 13:09:56 +00:00
tuexen	de3c71bd61	Get SCTP working in combination with VIMAGE. Contains code from bz. Approved by: rrs (mentor) MFC after: 1 month.	2009-09-19 14:02:16 +00:00
alc	309c5ab06f	Add a new sysctl for reporting all of the supported page sizes. Reviewed by: jhb MFC after: 3 weeks	2009-09-18 17:04:57 +00:00
attilio	6aa8dd54a5	Don't allocate new unnecessary pages when devstat_alloc() looses the run for re-acuiring the lock, but recheck if new pages are allocatable from the pool and free the previously allocated ones. Tested by: pho, Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2009-09-18 13:48:38 +00:00
rdivacky	eefaeaf649	Fix the style of the previous commit. Approved by: ed (mentor, implicit)	2009-09-17 17:48:13 +00:00
rdivacky	1f00412058	Make these argument/variable unsigned as the defines for them don't fit into signed 32bit integer. Approved by: ed (mentor, implicit) Approved by: sson	2009-09-17 17:41:28 +00:00
sson	95d4e7d075	Add EV_RECEIPT to kevents. EV_RECEIPT is useful to disambiguating error conditions when multiple events structures are passed to kevent(2). The error code is returned in the data field and EV_ERROR is set. Approved by: rwatson (co-mentor)	2009-09-16 03:49:54 +00:00
sson	a386443e51	Add the EV_DISPATCH flag to kevents. When the EV_DISPATCH flag is used the event source will be disabled immediately after the delivery of an event. This is similar to the EV_ONESHOT flag but it doesn't delete the event. Approved by: rwatson (co-mentor)	2009-09-16 03:37:39 +00:00
sson	7cb0718a03	Add EVFILT_USER to kevents. Add user events support to kernel events which are not associated with any kernel mechanism but are triggered by user level code. This is useful for adding user level events to an event handler that may also be monitoring kernel events. Approved by: rwatson (co-mentor)	2009-09-16 03:30:12 +00:00
sson	297d4ab14b	Add optional touch event filter hooks to kevents. The touch event filter is called when a kernel event data is possibly updated. There are two hook points. First, during a kevent() system call. Second, when an event has been triggered. Approved by: rwatson (co-mentor)	2009-09-16 03:15:57 +00:00
andre	3e3fe69f6b	-Put the optimized soreceive_stream() under a compile time option called TCP_SORECEIVE_STREAM for the time being. Requested by: brooks Once compiled in make it easily switchable for testers by using a tuneable net.inet.tcp.soreceive_stream and a corresponding read-only sysctl to report the current state. Suggested by: rwatson MFC after: 2 days -This line, and those below, will be ignored-- > Description of fields to fill in above: 76 columns --\| > PR: If a GNATS PR is affected by the change. > Submitted by: If someone else sent in the change. > Reviewed by: If someone else reviewed your modification. > Approved by: If you needed approval for this commit. > Obtained from: If the change is from a third party. > MFC after: N [day[s]\|week[s]\|month[s]]. Request a reminder email. > Security: Vulnerability reference (one per line) or description. > Empty fields above will be automatically removed. M sys/conf/options M sys/kern/uipc_socket.c M sys/netinet/tcp_subr.c M sys/netinet/tcp_usrreq.c	2009-09-15 22:23:45 +00:00
attilio	867bd3bd7f	Fix sched_switch_migrate(): - In 8.x and above the run-queue locks are nomore shared even in the HTT case, so remove the special case. - The deadlock explained in the removed comment here is still possible even with different locks, with the contribution of tdq_lock_pair(). An explanation is here: (hypotesis: a thread needs to migrate on another CPU, thread1 is doing sched_switch_migrate() and thread2 is the one handling the sched_switch() request or in other words, thread1 is the thread that needs to migrate and thread2 is a thread that is going to be preempted, most likely an idle thread. Also, 'old' is referred to the context (in terms of run-queue and CPU) thread1 is leaving and 'new' is referred to the context thread1 is going into. Finally, thread3 is doing tdq_idletd() or sched_balance() and definitively doing tdq_lock_pair()) * thread1 blocks its td_lock. Now td_lock is 'blocked' * thread1 drops its old runqueue lock * thread1 acquires the new runqueue lock * thread1 adds itself to the new runqueue and sends an IPI_PREEMPT through tdq_notify() to the new CPU * thread1 drops the new lock * thread3, scanning the runqueues, locks the old lock * thread2 received the IPI_PREEMPT and does thread_lock() with td_lock pointing to the new runqueue * thread3 wants to acquire the new runqueue lock, but it can't because it is held by thread2 so it spins * thread1 wants to acquire old lock, but as long as it is held by thread3 it can't * thread2 going further, at some point wants to switchin in thread1, but it will wait forever because thread1->td_lock is in blocked state This deadlock has been manifested mostly on 7.x and reported several time on mailing lists under the voice 'spinlock held too long'. Many thanks to des@ for having worked hard on producing suitable textdumps and Jeff for help on the comment wording. Reviewed by: jeff Reported by: des, others Tested by: des, Giovanni Trematerra <giovanni dot trematerra at gmail dot com> (STABLE_7 based version)	2009-09-15 16:56:17 +00:00
attilio	47c9d3ff5e	Revert r196779 in order to implement a different scheme for newbus locking methodology. Requested by: imp	2009-09-13 15:08:19 +00:00
luigi	02e6d1c7f9	Make sure callouts are not processed one tick late. The problem was introduced in SVN 180608/ rev 1.114 and affects all users of callout_reset() (including select, usleep, setitimer). A better fix probably involves replicating 'ticks' in the struct callout_cpu; this commit is just a temporary thing so that we can MFC it after a suitable test time and RE approval. MFC after: 3 days	2009-09-12 21:44:34 +00:00
rwatson	53eaed07bc	Use C99 initialization for struct filterops. Obtained from: Mac OS X Sponsored by: Apple Inc. MFC after: 3 weeks	2009-09-12 20:03:45 +00:00
n_hibma	ae42cb4782	Add a comment on the consequences of reducing the poweroff delay	2009-09-10 18:24:59 +00:00
kib	91e6a5b3cc	kern_select(9) copies fd_set in and out of userspace in quantities of longs. Since 32bit processes longs are 4 bytes, 64bit kernel may copy in or out 4 bytes more then the process expected. Calculate the amount of bytes to copy taking into account size of fd_set for the current process ABI. Diagnosed and tested by: Peter Jeremy <peterjeremy acm org> Reviewed by: jhb MFC after: 1 week	2009-09-09 20:59:01 +00:00
kib	7a991968f2	Unlock the image vnode around the call of pmc PMC_FN_PROCESS_EXEC hook. The hook calls vn_fullpath(9), that should not be executed with a vnode lock held. Reported by: Bruce Cran <bruce cran org uk> Tested by: pho MFC after: 3 days	2009-09-09 10:52:36 +00:00
kib	3119d26c95	In vfs_mark_atime(9), be resistent against reclaimed vnodes. Assert that neccessary locks are taken, since vop might not be called. Tested by: pho MFC after: 3 days	2009-09-09 10:51:50 +00:00
phk	5297261651	Revert previous commit and add myself to the list of people who should know better than to commit with a cat in the area.	2009-09-08 13:19:05 +00:00
phk	2314521104	Add necessary include.	2009-09-08 13:16:55 +00:00
antoine	3aaee1a8e7	Change w_notrunning and w_stillcold from pointer to array so that sizeof returns what is expected. PR: kern/138557 Discussed with: brucec@ MFC after: 1 month	2009-09-06 13:31:05 +00:00
kib	14578a3276	In fhopen, vfs_ref() the mount point while vnode is unlocked, to prevent vn_start_write(NULL, &mp) from operating on potentially freed or reused struct mount *. Remove unmatched vfs_rel() in cleanup. Noted and reviewed by: tegge Tested by: pho MFC after: 3 days	2009-09-06 11:44:46 +00:00
ed	73faea2301	Move ptmx into pty(4). Now that pty(4) is a loadable kernel module, I'd better move /dev/ptmx in there as well. This means that pty(4) now provides almost all pseudo-terminal compatibility code. This means it's very easy to test whether applications use the proper library interfaces when allocating pseudo-terminals (namely posix_openpt and openpty).	2009-09-06 10:27:45 +00:00
jamie	319f7fbbc6	Allow a jail's name to be the same as its jid (which is the default if no name is specified), but still disallow other numeric names. Reviewed by: zec Approved by: bz (mentor) MFC after: 3 days	2009-09-04 19:00:48 +00:00
attilio	ef39f869d5	Add intermediate states for attaching and detaching that will be reused by the enhached newbus locking once it is checked in. This change can be easilly MFCed to STABLE_8 at the appropriate moment. Reviewed by: jhb, scottl Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2009-09-03 13:40:41 +00:00
attilio	ce3b4baf3c	Fix some bugs related to adaptive spinning: In the lockmgr support: - GIANT_RESTORE() is just called when the sleep finishes, so the current code can ends up into a giant unlock problem. Fix it by appropriately call GIANT_RESTORE() when needed. Note that this is not exactly ideal because for any interation of the adaptive spinning we drop and restore Giant, but the overhead should be not a factor. - In the lock held in exclusive mode case, after the adaptive spinning is brought to completition, we should just retry to acquire the lock instead to fallthrough. Fix that. - Fix a style nit In the sx support: - Call GIANT_SAVE() before than looping. This saves some overhead because in the current code GIANT_SAVE() is called several times. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2009-09-02 17:33:51 +00:00
kib	32ffadd900	Fix mount reference leak when V_XSLEEP is specified to vn_start_write(). Submitted by: tegge	2009-09-01 12:05:39 +00:00
kib	bae5df8cbb	Reintroduce the r196640, after fixing the problem with my testing. Remove the altkstacks, instead instantiate threads with kernel stack allocated with the right size from the start. For the thread that has kernel stack cached, verify that requested stack size is equial to the actual, and reallocate the stack if sizes differ [1]. This fixes the bug introduced by r173361 that was committed several days after r173004 and consisted of kthread_add(9) ignoring the non-default kernel stack size. Also, r173361 removed the caching of the kernel stacks for a non-first thread in the process. Introduce separate kernel stack cache that keeps some limited amount of preallocated kernel stacks to lower the latency of thread allocation. Add vm_lowmem handler to prune the cache on low memory condition. This way, system with reasonable amount of the threads get lower latency of thread creation, while still not exhausting significant portion of KVA for unused kstacks. Submitted by: peter [1] Discussed with: jhb, julian, peter Reviewed by: jhb Tested by: pho (and retested according to new test scenarious) MFC after: 1 week	2009-09-01 11:41:51 +00:00
kib	eab33b2665	Make the mnt_writeopcount and mnt_secondary_writes counters, used by the suspension code, not greater then mnt_ref reference counter value. Increment mnt_ref together with write counter in vn_start_write()/ vn_start_secondary_write(), releasing in vn_finished_write/vn_finished_secondary_write(). Since r186197, unmount code requires that no writers occured after all references are expired. We still could get write counter incremented for freed or reused struct mount, but it seems to be innocent, since corresponding vnode should be referenced and reclaimed then. Reported by: pho (last half a year), erwin Reviewed by: attilio Tested by: pho, erwin MFC after: 1 week	2009-08-31 10:20:52 +00:00

... 3 4 5 6 7 ...

11600 Commits