12801 Commits

Author SHA1 Message Date
attilio
640e058da3 MFC 2013-03-07 23:43:03 +00:00
bryanv
840f21fc49 Remove the virtio dependency entry for the VirtIO device drivers. This
will prevent the kernel from linking if the device driver are included
without the virtio module. Remove pci and scbus for the same reason.

Also explain the relationship and necessity of the virtio and virtio_pci
modules. Currently in FreeBSD, we only support VirtIO PCI, but it could
be replaced with a different interface (like MMIO) and the device
(network, block, etc) will still function.

Requested by:	luigi
Approved by:	grehan (mentor)
MFC after:	3 days
2013-03-06 07:17:53 +00:00
ken
d11db422c6 Re-enable CTL in GENERIC on i386 and amd64, but turn on the CTL disable
tunable by default.

This will allow GENERIC configurations to boot on small memory boxes, but
not require end users who want to use CTL to recompile their kernel.  They
can simply set kern.cam.ctl.disable=0 in loader.conf.

The eventual solution to the memory usage problem is to change the way
CTL allocates memory to be more configurable, but this should fix things
for small memory situations in the mean time.

UPDATING:		Explain the change in the CTL configuration, and
			how users can enable CTL if they would like to use
			it.

sys/conf/options:	Add a new option, CTL_DISABLE, that prevents CTL
			from initializing.

ctl.c:			If CTL_DISABLE is turned on, don't initialize.

i386/conf/GENERIC,
amd64/conf/GENERIC:	Re-enable device ctl, and add the CTL_DISABLE
			option.
2013-03-04 21:18:45 +00:00
attilio
d924385016 Fixup XEN pmap to cope with removal of left/right iterators from
pages.

Sponsored by:	EMC / Isilon storage division
2013-03-03 01:26:11 +00:00
attilio
c53a782d3a MFC 2013-03-03 01:06:24 +00:00
attilio
be8012c4e6 Fix-up r247622 by also renaming pv_list iterator into the xen
pmap verbatim copy.

Sponsored by:	EMC / Isilon storage division
Reported by:	tinderbox
2013-03-03 01:02:57 +00:00
attilio
e98f58faf6 MFC 2013-03-02 14:48:41 +00:00
attilio
17028bb6ae MFC 2013-03-02 14:28:31 +00:00
attilio
5d33901b24 Merge from vmc-playground branch:
Rename the pv_entry_t iterator from pv_list to pv_next.
Besides being more correct technically (as the name seems to suggest
this is a list while it is an iterator), it will also be needed by
vm_radix work to avoid a nameclash on macro expansions.

Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc, jeff
Tested by:	flo, pho, jhb, davide
2013-03-02 14:19:08 +00:00
adrian
f77596bc52 Disable the ctl driver in GENERIC.
It unfortunately steals a fair chunk of RAM at startup even if it's not
actively used, which prevents FreeBSD VMs of 128MB from successfully
booting and running.
2013-03-02 08:12:41 +00:00
pjd
f07ebb8888 Merge Capsicum overhaul:
- Capability is no longer separate descriptor type. Now every descriptor
  has set of its own capability rights.

- The cap_new(2) system call is left, but it is no longer documented and
  should not be used in new code.

- The new syscall cap_rights_limit(2) should be used instead of
  cap_new(2), which limits capability rights of the given descriptor
  without creating a new one.

- The cap_getrights(2) syscall is renamed to cap_rights_get(2).

- If CAP_IOCTL capability right is present we can further reduce allowed
  ioctls list with the new cap_ioctls_limit(2) syscall. List of allowed
  ioctls can be retrived with cap_ioctls_get(2) syscall.

- If CAP_FCNTL capability right is present we can further reduce fcntls
  that can be used with the new cap_fcntls_limit(2) syscall and retrive
  them with cap_fcntls_get(2).

- To support ioctl and fcntl white-listing the filedesc structure was
  heavly modified.

- The audit subsystem, kdump and procstat tools were updated to
  recognize new syscalls.

- Capability rights were revised and eventhough I tried hard to provide
  backward API and ABI compatibility there are some incompatible changes
  that are described in detail below:

	CAP_CREATE old behaviour:
	- Allow for openat(2)+O_CREAT.
	- Allow for linkat(2).
	- Allow for symlinkat(2).
	CAP_CREATE new behaviour:
	- Allow for openat(2)+O_CREAT.

	Added CAP_LINKAT:
	- Allow for linkat(2). ABI: Reuses CAP_RMDIR bit.
	- Allow to be target for renameat(2).

	Added CAP_SYMLINKAT:
	- Allow for symlinkat(2).

	Removed CAP_DELETE. Old behaviour:
	- Allow for unlinkat(2) when removing non-directory object.
	- Allow to be source for renameat(2).

	Removed CAP_RMDIR. Old behaviour:
	- Allow for unlinkat(2) when removing directory.

	Added CAP_RENAMEAT:
	- Required for source directory for the renameat(2) syscall.

	Added CAP_UNLINKAT (effectively it replaces CAP_DELETE and CAP_RMDIR):
	- Allow for unlinkat(2) on any object.
	- Required if target of renameat(2) exists and will be removed by this
	  call.

	Removed CAP_MAPEXEC.

	CAP_MMAP old behaviour:
	- Allow for mmap(2) with any combination of PROT_NONE, PROT_READ and
	  PROT_WRITE.
	CAP_MMAP new behaviour:
	- Allow for mmap(2)+PROT_NONE.

	Added CAP_MMAP_R:
	- Allow for mmap(PROT_READ).
	Added CAP_MMAP_W:
	- Allow for mmap(PROT_WRITE).
	Added CAP_MMAP_X:
	- Allow for mmap(PROT_EXEC).
	Added CAP_MMAP_RW:
	- Allow for mmap(PROT_READ | PROT_WRITE).
	Added CAP_MMAP_RX:
	- Allow for mmap(PROT_READ | PROT_EXEC).
	Added CAP_MMAP_WX:
	- Allow for mmap(PROT_WRITE | PROT_EXEC).
	Added CAP_MMAP_RWX:
	- Allow for mmap(PROT_READ | PROT_WRITE | PROT_EXEC).

	Renamed CAP_MKDIR to CAP_MKDIRAT.
	Renamed CAP_MKFIFO to CAP_MKFIFOAT.
	Renamed CAP_MKNODE to CAP_MKNODEAT.

	CAP_READ old behaviour:
	- Allow pread(2).
	- Disallow read(2), readv(2) (if there is no CAP_SEEK).
	CAP_READ new behaviour:
	- Allow read(2), readv(2).
	- Disallow pread(2) (CAP_SEEK was also required).

	CAP_WRITE old behaviour:
	- Allow pwrite(2).
	- Disallow write(2), writev(2) (if there is no CAP_SEEK).
	CAP_WRITE new behaviour:
	- Allow write(2), writev(2).
	- Disallow pwrite(2) (CAP_SEEK was also required).

	Added convinient defines:

	#define	CAP_PREAD		(CAP_SEEK | CAP_READ)
	#define	CAP_PWRITE		(CAP_SEEK | CAP_WRITE)
	#define	CAP_MMAP_R		(CAP_MMAP | CAP_SEEK | CAP_READ)
	#define	CAP_MMAP_W		(CAP_MMAP | CAP_SEEK | CAP_WRITE)
	#define	CAP_MMAP_X		(CAP_MMAP | CAP_SEEK | 0x0000000000000008ULL)
	#define	CAP_MMAP_RW		(CAP_MMAP_R | CAP_MMAP_W)
	#define	CAP_MMAP_RX		(CAP_MMAP_R | CAP_MMAP_X)
	#define	CAP_MMAP_WX		(CAP_MMAP_W | CAP_MMAP_X)
	#define	CAP_MMAP_RWX		(CAP_MMAP_R | CAP_MMAP_W | CAP_MMAP_X)
	#define	CAP_RECV		CAP_READ
	#define	CAP_SEND		CAP_WRITE

	#define	CAP_SOCK_CLIENT \
		(CAP_CONNECT | CAP_GETPEERNAME | CAP_GETSOCKNAME | CAP_GETSOCKOPT | \
		 CAP_PEELOFF | CAP_RECV | CAP_SEND | CAP_SETSOCKOPT | CAP_SHUTDOWN)
	#define	CAP_SOCK_SERVER \
		(CAP_ACCEPT | CAP_BIND | CAP_GETPEERNAME | CAP_GETSOCKNAME | \
		 CAP_GETSOCKOPT | CAP_LISTEN | CAP_PEELOFF | CAP_RECV | CAP_SEND | \
		 CAP_SETSOCKOPT | CAP_SHUTDOWN)

	Added defines for backward API compatibility:

	#define	CAP_MAPEXEC		CAP_MMAP_X
	#define	CAP_DELETE		CAP_UNLINKAT
	#define	CAP_MKDIR		CAP_MKDIRAT
	#define	CAP_RMDIR		CAP_UNLINKAT
	#define	CAP_MKFIFO		CAP_MKFIFOAT
	#define	CAP_MKNOD		CAP_MKNODAT
	#define	CAP_SOCK_ALL		(CAP_SOCK_CLIENT | CAP_SOCK_SERVER)

Sponsored by:	The FreeBSD Foundation
Reviewed by:	Christoph Mallon <christoph.mallon@gmx.de>
Many aspects discussed with:	rwatson, benl, jonathan
ABI compatibility discussed with:	kib
2013-03-02 00:53:12 +00:00
mav
6cf7cc6e4d MFcalloutng:
Switch eventtimers(9) from using struct bintime to sbintime_t.
Even before this not a single driver really supported full dynamic range of
struct bintime even in theory, not speaking about practical inexpediency.
This change legitimates the status quo and cleans up the code.
2013-02-28 13:46:03 +00:00
davide
2bf12d0c7c MFcalloutng:
When CPU becomes idle, cpu_idleclock() calculates time to the next timer
event in order to reprogram hw timer. Return that time in sbintime_t to
the caller and pass it to acpi_cpu_idle(), where it can be used as one
more factor (quite precise) to extimate furter sleep time and choose
optimal sleep state. This is a preparatory change for further callout
improvements will be committed in the next days.

The commmit is not targeted for MFC.
2013-02-28 10:46:54 +00:00
attilio
6ff1954532 MFC 2013-02-27 18:23:12 +00:00
attilio
8d28f94790 Merge from vmobj-rwlock:
VM_OBJECT_LOCKED() macro is only used to implement a custom version
of lock assertions right now (which likely spread out thanks to
copy and paste).
Remove it and implement actual assertions.

Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc
Tested by:	pho
2013-02-27 18:12:13 +00:00
attilio
cff31deb1a MFC 2013-02-24 16:50:53 +00:00
attilio
905e648d42 Hide the details for the assertion for VM_OBJECT_LOCK operations.
Rename current VM_OBJECT_LOCK_ASSERT(foo, RA_WLOCKED) into
VM_OBJECT_ASSERT_WLOCKED(foo)

Sponsored by:	EMC / Isilon storage division
Requested by:	alc
2013-02-21 21:54:53 +00:00
attilio
8746bf6a5f MFC 2013-02-21 15:06:19 +00:00
imp
fa7b0b1bdd Locking for todr got pushed down into inittodr and the client
libraries it calls (although some might not be doing it right). We are
serialized right now by giant as well. This means the splsoftclock are
now an anachronism that has no benefit, even marking where locking
needs to happen. Remove them.
2013-02-21 07:16:40 +00:00
kib
cda69c9622 Convert machine/elf.h, machine/frame.h, machine/sigframe.h,
machine/signal.h and machine/ucontext.h into common x86 includes,
copying from amd64 and merging with i386.

Kernel-only compat definitions are kept in the i386/include/sigframe.h
and i386/include/signal.h, to reduce amd64 kernel namespace pollution.
The amd64 compat uses its own definitions so far.

The _MACHINE_ELF_WANT_32BIT definition is to allow the
sys/boot/userboot/userboot/elf32_freebsd.c to use i386 ELF definitions
on the amd64 compile host.  The same hack could be usefully abused by
other code too.
2013-02-20 17:39:52 +00:00
attilio
1f1e13ca03 There is no need to use VM_OBJECT_LOCKED() as the assertion won't
make the check available in any case if INVARIANTS is switched off.
Remove VM_OBJECT_LOCKED().
2013-02-20 10:51:34 +00:00
attilio
658534ed5a Switch vm_object lock to be a rwlock.
* VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations
* VM_OBJECT_SLEEP() is introduced as a general purpose primitve to
  get a sleep operation using a VM_OBJECT_LOCK() as protection
* The approach must bear with vm_pager.h namespace pollution so many
  files require including directly rwlock.h
2013-02-20 10:38:34 +00:00
jkim
92f5e4e678 Consistently use round_page(x) rather than roundup(x, PAGE_SIZE). There is
no functional change.
2013-02-15 22:43:08 +00:00
attilio
0d3b58aee0 MFC 2013-02-03 20:13:33 +00:00
avg
2e2156704e cpususpend_handler: mark AP as resumed only after fully setting up lapic
Reviewed by:	jhb
Tested by:	Sergey V. Dyatko <sergey.dyatko@gmail.com>,
		KAHO Toshikazu <kaho@elam.kais.kyoto-u.ac.jp>
MFC after:	12 days
2013-02-02 12:04:32 +00:00
avg
09a43450b8 x86 suspend/resume: suspend pics and pseudo-pics in reverse order
- change 'pics' from STAILQ to TAILQ
- ensure that Local APIC is always first in 'pics'

Reviewed by:	jhb
Tested by:	Sergey V. Dyatko <sergey.dyatko@gmail.com>,
		KAHO Toshikazu <kaho@elam.kais.kyoto-u.ac.jp>
MFC after:	12 days
2013-02-02 12:02:42 +00:00
eadler
6a1efe1ad9 Remove support for plip from the GENERIC kernel as no systems in the
last 10 years require this support.

Discussed with:	db
Discussed with:	kib
Reviewed by:	imp
Reviewed by:	jhb
Reviewed by:	-hackers
Approved by:	cperciva (mentor)
2013-02-01 20:17:11 +00:00
andre
6af4b61a84 Remove unused VM_MAX_AUTOTUNE_NMBCLUSTERS define. 2013-02-01 14:16:37 +00:00
jhb
2617d9f095 Reduce duplication between i386/linux/linux.h and amd64/linux32/linux.h
by moving bits that are MI out into headers in compat/linux.

Reviewed by:	Chagin Dmitry  dmitry | gmail
MFC after:	2 weeks
2013-01-29 18:41:30 +00:00
jhb
af17a55dfd Don't assume that all Linux TCP-level socket options are identical to
FreeBSD TCP-level socket options (only the first two are).  Instead,
using a mapping function and fail unsupported options as we do for other
socket option levels.

MFC after:	2 weeks
2013-01-23 21:44:48 +00:00
jhb
fe9918fd55 Fix build with SMP disabled.`
Reported by:	bf
2013-01-19 01:18:22 +00:00
jhb
4a3c4478d3 Don't attempt to use clflush on the local APIC register window. Various
CPUs exhibit bad behavior if this is done (Intel Errata AAJ3, hangs on
Pentium-M, and trashing of the local APIC registers on a VIA C7).  The
local APIC is implicitly mapped UC already via MTRRs, so the clflush isn't
necessary anyway.

MFC after:	2 weeks
2013-01-17 21:32:25 +00:00
bryanv
df2fc90f0e Add VirtIO to the i386 and amd64 GENERIC kernels
This also removes the kludge from r239009 that covered only
the network driver.

Reviewed by:	grehan
Approved by:	grehan (mentor)
MFC after:	1 week
2013-01-13 07:14:16 +00:00
attilio
561dd1163d MFC 2013-01-09 11:58:47 +00:00
kib
5a9188f8d3 Enable the UFS quotas for big-iron GENERIC kernels.
Discussed with:	      mckusick
MFC after:	      2 weeks
2013-01-03 19:03:41 +00:00
des
67e77c00a8 As discussed on -current last October, remove the firewire drivers from
GENERIC.
2013-01-03 14:30:24 +00:00
marius
185d79fdec Fix !INVARIANTS && !SMP build.
MFC after:	3 days
2013-01-03 01:09:50 +00:00
attilio
fcadd67d75 MFC 2012-12-26 08:20:27 +00:00
jimharris
5e7d94235a Revert r243960 based on feedback regarding keeping x86 headers unified
(mdf@, tijl@) and use of KASSERT/systm.h in bus.h (zeising@, bde@).

Alternate implementation will be made in a separate commit.
2012-12-13 21:27:20 +00:00
attilio
be719e9167 MFC 2012-12-11 00:07:19 +00:00
jimharris
f20d18d37b Add amd64 implementations for 8-byte bus_space routines.
Submitted by:	Carl Delsey <carl.r.delsey@intel.com>
Discussed with:	jhb, rwatson
Reviewed by:	jimharris
MFC after:	1 week
2012-12-06 22:33:31 +00:00
kib
6b76c5a1b8 Print the frame addresses for the backtraces on i386 and amd64. It
allows both to inspect the frame sizes and to manually peek into the
frames from ddb, if needed.

Reviewed by:	dim
MFC after:	2 weeks
2012-12-03 22:16:51 +00:00
jkim
caa6f5a7e6 Remove duplicate code. Reduce diff between amd64 and i386. 2012-12-01 00:56:19 +00:00
jkim
9c50b706fb Use volatile keywords properly. 2012-11-30 20:15:01 +00:00
jkim
5bf7bb816c Tidy up inline assembly. No functional change. 2012-11-30 00:59:37 +00:00
dim
fb83924633 Fix a minor warning in sys/i386/xen/clock.c.
MFC after:	3 days
2012-11-12 20:50:11 +00:00
alfred
7beb738c8a Allow maxusers to scale on machines with large address space.
Some hooks are added to clamp down maxusers and nmbclusters for
small address space systems.

VM_MAX_AUTOTUNE_MAXUSERS - the max maxusers that will be autotuned based on
physical memory.
VM_MAX_AUTOTUNE_NMBCLUSTERS - max nmbclusters based on physical memory.

These are set to the old values on i386 to preserve the clamping that was
being done to all arches.

Another macro VM_AUTOTUNE_NMBCLUSTERS is provided to allow an override
for the calculation on a MD basis.  Currently no arch defines this.

Reviewed by: peter
MFC after: 2 weeks
2012-11-10 02:08:40 +00:00
attilio
f3501b109e Rework the known rwlock to benefit about staying on their own
cache line in order to avoid manual frobbing but using
struct rwlock_padalign.

Reviewed by:	alc, jimharris
2012-11-03 23:03:14 +00:00
kib
f16ea99007 The r241025 fixed the case when a binary, executed from nullfs mount,
was still possible to open for write from the lower filesystem.  There
is a symmetric situation where the binary could already has file
descriptors opened for write, but it can be executed from the nullfs
overlay.

Handle the issue by passing one v_writecount reference to the lower
vnode if nullfs vnode has non-zero v_writecount.  Note that only one
write reference can be donated, since nullfs only keeps one use
reference on the lower vnode.  Always use the lower vnode v_writecount
for the checks.

Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is
currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT
to manipulate the v_writecount value, which manages a single bypass
reference to the lower vnode.  Caling the VOPs instead of directly
accessing v_writecount provide the fix described in the previous
paragraph.

Tested by:	pho
MFC after:	3 weeks
2012-11-02 13:56:36 +00:00
kib
09daf9f4d2 Add missed sched_pin().
Submitted by:	Svatopluk Kraus <onwahe@gmail.com>
Reviewed by:	alc
MFC after:	3 days
2012-10-24 18:21:22 +00:00