13069 Commits

Author SHA1 Message Date
attilio
d3b7ec3a08 MFC 2013-02-04 22:10:01 +00:00
avg
cfd3c02e72 ktr: prevent possible footshooting with KTR_ENTRIES and KTR_BOOT_ENTRIES
Suggested by:	adrian
MFC after:	14 days
X-MFC with:	r246282
2013-02-04 21:58:57 +00:00
avg
3a33f3e282 ktr: copy content from the early static buffer if KTR_ENTRIES !=
KTR_BOOT_ENTRIES

Reported by:	glebius, jhb
Pointyhat to:	avg
MFC after:	14 days
X-MFC with:	r246282
2013-02-04 21:50:55 +00:00
marius
790d2fce4f Try to improve r242655 take III: move these SYSCTLs describing the kernel
map, which is defined and initialized in vm/vm_kern.c, to the latter.

Submitted by:	alc
2013-02-04 09:35:48 +00:00
marius
a41f9579b9 Further improve r242655 and supply VM_{MIN,MAX}_KERNEL_ADDRESS as constant
values to SYSCTL_ULONG(9) where possible.

Submitted by:	bde
2013-02-03 21:43:55 +00:00
attilio
0d3b58aee0 MFC 2013-02-03 20:13:33 +00:00
avg
08016a40d7 allow for large KTR_ENTRIES values by allocating ktr_buf using malloc(9)
Only during very early boot, before malloc(9) is functional (SI_SUB_KMEM),
the static ktr_buf_init is used.  Size of the static buffer is determined
by a new kernel option KTR_BOOT_ENTRIES.  Its default value is 1024.

This commit builds on top of r243046.

Reviewed by:	alc
MFC after:	17 days
2013-02-03 09:57:39 +00:00
avg
223577af14 fix some fat-fingering in r246246
Submitted by:	mjg
Pointyhat to:	avg
MFC after:	5 days
X-MFC with:	r246246
2013-02-02 14:19:50 +00:00
avg
8e238c660c print compiler version in the kernel banner
And provide kernel compiler version as a sysctl as well.
This is useful while we have gcc and clang cohabitation.
This could be even more useful when we have support
for external toolchains.

In cooperation with:	mjg
MFC after:		13 days
2013-02-02 11:58:35 +00:00
gber
d005d97a01 Get time of next event from other cores only if SMP is already started.
Reviewed by: mav
Obtained from: Semihalf
2013-02-01 11:39:03 +00:00
pjd
2163564eab Now that MPSAFE flag is gone, we can arrange code a bit better. 2013-01-31 22:20:05 +00:00
pjd
8a682d18ff Remove leftover label after Giant removal from VFS. 2013-01-31 22:15:41 +00:00
pjd
b04cb3ac24 Remove label that was accidentally moved during Giant removal from VFS. 2013-01-31 22:14:16 +00:00
pjd
36392a26c9 Simplify code a bit. This is leftover after Giant removal from VFS. 2013-01-31 22:12:48 +00:00
kib
e81abcec51 The case of pid == WAIT_MYPGRP for the kern_wait() is already handled
in kern_wait6(), which is called by kern_wait().  Remove the redundand
check, introduced in r243136, and add a comment noting this, to make
the code less confusing.

The blank lines are added to properly delineate the scope of the
preceeding comments.

Noted by:	"Jukka A. Ukkonen" <jau@iki.fi>
MFC after:	1 week
2013-01-30 13:14:15 +00:00
jhb
5df972afb6 Mark 'ticks', 'time_second', and 'time_uptime' as volatile to prevent the
compiler from caching their values in tight loops.

Reviewed by:	bde
MFC after:	1 week
2013-01-28 19:38:13 +00:00
glebius
fd20cedd21 - Move large functions m_getjcl() and m_get2() to kern/uipc_mbuf.c
- style(9) fixes to mbuf.h

Reviewed by:	bde
2013-01-24 09:29:41 +00:00
jhb
599614d596 Fix a typo. 2013-01-23 14:37:05 +00:00
andre
d8067c27f5 Move the mbuf memory limit calculations from init_param2() to
tunable_mbinit() where it is next to where it is used later.

Change the sysinit level of tunable_mbinit() from SI_SUB_TUNABLES
to SI_SUB_KMEM after the VM is running.  This allows to use better
methods to determine the effectively available physical and virtual
memory available to the kernel.

Update comments.

In a second step it can be merged into mbuf_init().
2013-01-17 21:28:31 +00:00
alfred
c24045a741 Do not autotune ncallout to be greater than 18508.
When maxusers was unrestricted and maxfiles was allowed to autotune
much higher the result was that ncallout which was based on maxfiles
and maxproc grew much higher than was needed.

To fix this clip autotuning to the same number we would get with
the old maxusers algorithm which would stop scaling at 384
maxusers.

Growing ncalout higher is not likely to be needed since most consumers
of timeout(9) are gone and any higher value for ncallout causes the
callwheel hashes to be much larger than will even be needed for
most applications.

MFC after:      1 month

Reviewed by:	mav
2013-01-15 19:26:17 +00:00
zont
5e5b6a9249 - Detect when we are in KVM.
Silence on:	emulation
Approved by:	kib (mentor)
MFC after:	1 week
2013-01-15 14:05:59 +00:00
kib
5a71b324fb Add a trivial comment to record the proper commit log for r245407:
Set the v_hash for a new vnode in the getnewvnode() to the value
calculated based on the vnode structure address.  Filesystems using
vfs_hash_insert() override the v_hash using the standard formula of
(inode_number + mnt_hashseed).  For other filesystems, the
initialization allows the vfs_hash_index() to provide useful hash too.

Suggested, reviewed and tested by:	peter
Sponsored by:	The FreeBSD Foundation
MFC after:	5 days
2013-01-14 05:52:23 +00:00
kib
cef86179d2 diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index 7c243b6..0bdaf36 100644
--- a/sys/kern/vfs_subr.c
+++ b/sys/kern/vfs_subr.c
@@ -279,6 +279,7 @@ SYSCTL_INT(_debug, OID_AUTO, vnlru_nowhere, CTLFLAG_RW,
 #define VSHOULDFREE(vp) (!((vp)->v_iflag & VI_FREE) && !(vp)->v_holdcnt)
 #define VSHOULDBUSY(vp) (((vp)->v_iflag & VI_FREE) && (vp)->v_holdcnt)

+static int vnsz2log;

 /*
  * Initialize the vnode management data structures.
@@ -293,6 +294,7 @@ SYSCTL_INT(_debug, OID_AUTO, vnlru_nowhere, CTLFLAG_RW,
 static void
 vntblinit(void *dummy __unused)
 {
+	u_int i;
 	int physvnodes, virtvnodes;

 	/*
@@ -332,6 +334,9 @@ vntblinit(void *dummy __unused)
 	syncer_maxdelay = syncer_mask + 1;
 	mtx_init(&sync_mtx, "Syncer mtx", NULL, MTX_DEF);
 	cv_init(&sync_wakeup, "syncer");
+	for (i = 1; i <= sizeof(struct vnode); i <<= 1)
+		vnsz2log++;
+	vnsz2log--;
 }
 SYSINIT(vfs, SI_SUB_VFS, SI_ORDER_FIRST, vntblinit, NULL);

@@ -1067,6 +1072,14 @@ alloc:
 	}
 	rangelock_init(&vp->v_rl);

+	/*
+	 * For the filesystems which do not use vfs_hash_insert(),
+	 * still initialize v_hash to have vfs_hash_index() useful.
+	 * E.g., nullfs uses vfs_hash_index() on the lower vnode for
+	 * its own hashing.
+	 */
+	vp->v_hash = (uintptr_t)vp >> vnsz2log;
+
 	*vpp = vp;
 	return (0);
 }
2013-01-14 05:42:54 +00:00
kib
b3d1d61e31 Add exported vfs_hash_index() function, which calculates the canonical
pre-masked hash for the given vnode.  The function assumes that
vp->v_hash is initialized by the filesystem vnode instantiation
function.  At the moment, it is only done if filesystem uses
vfs_hash_insert().

Reviewed by:	peter
Tested by:	peter, pho (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	5 days
2013-01-14 05:41:40 +00:00
kib
b11dfc2e2a Rename vfs_hash_index() to vfs_hash_bucket().
Reviewed by:	peter
Tested by:	peter, pho
Sponsored by:	The FreeBSD Foundation
MFC after:	5 days
2013-01-14 05:40:21 +00:00
kib
a1f9360638 Add flags argument to vfs_write_resume() and remove
vfs_write_resume_flags().

Sponsored by:	The FreeBSD Foundation
2013-01-11 06:08:32 +00:00
attilio
561dd1163d MFC 2013-01-09 11:58:47 +00:00
mjg
10b2623aac lockmgr: unlock interlock (if requested) when dealing with upgrade/downgrade
requests for LK_NOSHARE locks, just like for shared locks.

PR:		kern/174969
Reviewed by:	attilio
MFC after:	1 week
2013-01-06 21:47:59 +00:00
kib
69c365794c Protect the p->p_pgrp dereference with the process lock.
MFC after:	3 days
2013-01-06 15:10:10 +00:00
neel
7dc424a712 Teach the kernel to recognize that it is executing inside a bhyve virtual
machine.

Obtained from:	NetApp
2013-01-05 19:18:50 +00:00
bjk
604b10857d Fix some minor inaccuracies introduced in r243251.
Also correct the comment in kern_synch.c which was the source of the
problematic text.

Reviewed by:	kib (previous version)
Approved by:	hrs (mentor)
2013-01-05 00:23:26 +00:00
davidxu
ab8145269b Revert revision 244760 because strncpy pads trailing space with zero,
this prevents kernel data from being leaked.

Noticed by: Joerg Sonnenberger &lt; joerg at britannica dot bec dot de &gt;
2013-01-04 11:11:12 +00:00
kib
bc62472c88 Remove the deprecated MNT_VNODE_FOREACH interface. Use the
MNT_VNODE_FOREACH_ALL instead.
2013-01-03 19:02:52 +00:00
kib
a280492f6e The process_deferred_inactive() function locks the vnodes of the ufs
mount, which means that is must not be called while the snaplock is
owned.  The vfs_write_resume(9) does call the function as the
VFS_SUSP_CLEAN() method, which is too early and falls into the region
still protected by snaplock.

Add yet another flag for the vfs_write_resume_flags() to avoid calling
suspension cleanup handler after the suspend is lifted, and use it in
the ffs_snapshot() call to vfs_write_resume.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2013-01-01 16:14:48 +00:00
kib
4247222fcc Make it possible to atomically resume writes on the mount and account
the write start, by adding a variation of the vfs_write_resume(9)
which accepts flags.

Use the new function to prevent a deadlock between parallel suspension
and snapshotting a UFS mount.  The ffs_snapshot() code performed
vfs_write_resume() followed by vn_start_write() while owning the
snaplock.  If the suspension intervene between resume and
vn_start_write(), the deadlock occured after the suspending thread
tried to lock the snaplock, most typically during the write in the
ffs_copyonwrite().

Reported and tested by:	Andreas Longwitz <longwitz@incore.de>
Reviewed by:	mckusick
MFC after:	2 weeks
X-MFC-note:	make the vfs_write_resume(9) function a macro after the MFC,
	in HEAD
2012-12-28 23:08:30 +00:00
gonzo
0875ed61b8 Fix build on ARM (and probably other platforms) 2012-12-28 06:52:53 +00:00
davidxu
b7760e9ec4 Use strlcpy to NULL-terminate error message even if user provided a short
buffer.
2012-12-28 02:43:33 +00:00
attilio
27fa8d59ff Fixup r244240: mp_ncpus will be 1 also in the !SMP and smp_disabled=1
case. There is no point in optimizing further the code and use a TRUE
litteral for a path that does heavyweight stuff anyway (like lock acq),
at the price of obfuscated code.

Use the appropriate check where necessary and remove a macro.

Sponsored by:	EMC / Isilon storage division
MFC after:	3 days
2012-12-26 15:20:32 +00:00
attilio
fcadd67d75 MFC 2012-12-26 08:20:27 +00:00
kib
c6bad3bef7 Do not force a writer to the devfs file to drain the buffer writes.
Requested and tested by:	Ian Lepore <freebsd@damnhippie.dyndns.org>
MFC after:	2 weeks
2012-12-23 22:43:27 +00:00
jh
de8a1071d5 Reject spaces and double quotation marks in device names. devctl(4)
and devd(8) can't handle names with such characters properly.

PR:		bin/144736, kern/161912
Discussed with:	imp, kib, pjd
2012-12-22 13:33:28 +00:00
attilio
d2861e9295 Fixup r240424: On entering KDB backends, the hijacked thread to run
interrupt context can still be idlethread. At that point, without the
panic condition, it can still happen that idlethread then will try to
acquire some locks to carry on some operations.

Skip the idlethread check on block/sleep lock operations when KDB is
active.

Reported by:	jh
Tested by:	jh
MFC after:	1 week
2012-12-22 09:37:34 +00:00
attilio
0d14b65c78 Fixup r218424: uio_yield() was scaling directly to userland priority.
When kern_yield() was introduced with the possibility to specify
a new priority, the behaviour changed by not lowering priority at all
in the consumers, making the yielding mechanism highly ineffective for
high priority kthreads like bufdaemon, syncer, vlrudaemon, etc.
There are no evidences that consumers could bear with such change in
semantic and this situation could finally lead to bugs similar to the
ones fixed in r244240.
Re-specify userland pri for kthreads involved.

Tested by:	pho
Reviewed by:	kib, mdf
MFC after:	1 week
2012-12-21 13:14:12 +00:00
des
425c0645c4 Rewrite fdgrowtable() so common mortals can actually understand what
it does and how, and add comments describing the data structures and
explaining how they are managed.
2012-12-20 20:18:27 +00:00
cognet
5780ffa994 Create an architecture-agnostic buffer pool manager that uses uma(9) to
manage a set of power-of-2 sized buffers for bus_dmamem_alloc().

This allows the caller to provide the back-end allocator uma allocator,
allowing full control of the memory pages backing the pool.  For
convenience, it provides an optional builtin allocator that provides pages
allocated with the VM_MEMATTR_UNCACHEABLE attribute, for managing pools of
DMA buffers for BUS_DMA_COHERENT or BUS_DMA_NOCACHE.

This also allows the caller to specify a minimum alignment, and it ensures
that all buffers start on a boundary and have a length that's a multiple of
that value, to avoid using buffers that trigger partial cache line flushes.

Submitted by:	Ian Lepore <freebsd@damnhippie.dyndns.org>
2012-12-20 00:34:54 +00:00
pjd
f4da2a6634 Replace expand_name() function with corefile_open() function, which not
only returns name, but also vnode of corefile to use.

This simplifies the code and closes few races, especially in %I handling.

Reviewed by:	kib
Obtained from:	WHEEL Systems
2012-12-19 23:59:48 +00:00
pjd
15927a4dbb Use correct file permissions when looking for available core file if
kern.corefile contains %I.

Obtained from:	WHEEL Systems
2012-12-19 23:40:02 +00:00
jeff
0a985cfcde - Add new machine parsable KTR macros for timing events.
- Use this new format to automatically handle syscalls and VOPs.  This
   changes the earlier format but is still human readable.

Sponsored by:	EMC / Isilon Storage Division
2012-12-19 20:10:00 +00:00
jeff
c3865e85e8 - Correctly handle EWOULDBLOCK in quiesce_cpus
Discussed with:	mav
2012-12-19 20:08:06 +00:00
pjd
ab9dc4d86b The 'flags' argument can be modified in vn_open_cred(), so we need to
set it for every loop interation.

Pointed out by:	kib
2012-12-19 12:14:08 +00:00