83116 Commits

Author SHA1 Message Date
avg
4e51477b36 zfs vn_has_cached_data: take into account v_object->cache != NULL
This mirrors code in tmpfs.
This changge shouldn't affect much read path, it may cause unnecessary
vm_page_lookup calls in the case where v_object has no active or inactive
pages but has some cache pages.  I believe this situation to be non-essential.

In write path this change should allow us to properly detect the above
case and free a cache page when we write to a range that corresponds to it.
If this situation is undetected then we could have a discrepancy between
data in page cache and in ARC or on disk.

This change allows us to re-enable vn_has_cached_data() check in zfs_write.

NOTE: strictly speaking resident_page_count and cache fields of v_object
should be exmined under VM_OBJECT_LOCK, but for this particular usage
we may get away with it.

Discussed with:	alc, kib
Approved by:	pjd
Tested with:	tools/regression/fsx
MFC after:	3 weeks
2010-09-15 11:05:41 +00:00
avg
23cfa76cb6 zfs mappedread, update_pages: use int for offset and length within a page
uint64_t, int64_t were redundant there

Approved by:	pjd
Tested by:	tools/regression/fsx
MFC after:	2 weeks
2010-09-15 10:48:16 +00:00
avg
204e0a0dec zfs mappedread: use uiomove_fromphys where possible
Reviewed by:	alc
Approved by:	pjd
Tested by:	tools/regression/fsx
MFC after:	2 weeks
2010-09-15 10:44:20 +00:00
andre
9d9488f599 Change the default MSS for IPv4 and IPv6 TCP connections from an
artificial power-of-2 rounded number to their real values specified
in RFC879 and RFC2460.

From the history and existing comments it appears that the rounded
numbers were intended to be advantageous for the kernel and mbuf
system.  However this hasn't been the case at for at least a long
time.  The mbuf clusters used in tcp_output() have enough space
to hold the larger real value for the default MSS for both IPv4 and
IPv6.  Note that the default MSS is only used when path MTU discovery
is disabled.

Update and expand related comments.

Reviewed by:	lsteward (including some word-smithing)
MFC after:	2 weeks
2010-09-15 10:39:30 +00:00
avg
ddc5721620 zfs: catch up with vm_page_sleep_if_busy changes
Reviewed by:	alc
Approved by:	pjd
Tested by:	tools/regression/fsx
MFC after:	2 weeks
2010-09-15 10:39:21 +00:00
avg
65a73d5f0b tmpfs, zfs + sendfile: mark page bits as valid after populating it with data
Otherwise, adding insult to injury, in addition to double-caching of data
we would always copy the data into a vnode's vm object page from backend.
This is specific to sendfile case only (VOP_READ with UIO_NOCOPY).

PR:		kern/141305
Reported by:	Wiktor Niesiobedzki <bsd@vink.pl>
Reviewed by:	alc
Tested by:	tools/regression/sockets/sendfile
MFC after:	2 weeks
2010-09-15 10:31:27 +00:00
avg
50571f8bfa sys/pcpu.h: remove a workaround for a fixed ld bug
The workaround was incorrectly documented as having something to do with
set_pcpu section's progbits, but in fact it was for incorrect placement
of __start_set_pcpu because of the bug in ld.
The bug was fixed in r210245, see commit message for details.

A side-effect of the workaround was that a zero-size set_pcpu section was
produced for modules, source code of which included pcpu.h but didn't
actually define any dynamic per-cpu variables.
This commit should remove the side-effect.

The same workaround is present sys/net/vnet.h, has an analogous side-effect
and can be removed as well.

An UPDATING entry that warns about a need for recent ld is following.

MFC after:	1 month
2010-09-15 10:02:46 +00:00
neel
da4ad90d63 Add 64-bit SWARM board kernel configs. 2010-09-15 05:32:10 +00:00
neel
8156525e0d Factor out the common parts of the swarm board in SWARM_COMMON and start
including that in SWARM and SWARM_SMP kernel configs.
2010-09-15 05:29:13 +00:00
neel
310427c33e Make the meaning of the 'mask' argument to 'set_intr_mask(mask)' consistent
with the meaning of IM bits in the status register.

Reviewed by:	jmallett, jchandra
2010-09-15 05:10:50 +00:00
emaste
25bb71a720 Add some enums and constants from Adaptec's latest driver
(build 17911).
2010-09-15 01:19:11 +00:00
grehan
bd5391ac7c Introduce inheritance into the PowerPC MMU kobj interface.
include/mmuvar.h - Change the MMU_DEF macro to also create the class
definition as well as define the DATA_SET. Add a macro, MMU_DEF_INHERIT,
which has an extra parameter specifying the MMU class to inherit methods
from. Update the comments at the start of the header file to describe the
new macros.

booke/pmap.c
aim/mmu_oea.c
aim/mmu_oea64.c - Collapse mmu_def_t declaration into updated MMU_DEF macro

The MMU_DEF_INHERIT macro will be used in the PS3 MMU implementation to
allow it to inherit the stock powerpc64 MMU methods.

Reviewed by:	nwhitehorn
2010-09-15 00:17:52 +00:00
marius
774af0eb5d Use saner nsegments and maxsegsz parameters when creating certain DMA tags;
tags for 1-byte allocations cannot possibly be split across 2 segments and
maxsegsz must not exceed maxsize.
2010-09-14 20:41:06 +00:00
marius
2172eaf479 Remove a KASSERT which will also trigger for perfectly valid combinations
of small maxsize and "large" (including BUS_SPACE_UNRESTRICTED) nsegments
parameters. Generally using a presz of 0 (which indeed might indicate the
use of bogus parameters for DMA tag creation) is not fatal, it just means
that no additional DVMA space will be preallocated.
2010-09-14 20:31:09 +00:00
marius
f5db453abe Remove redundant raising of the PIL to PIL_TICK as the respective locore
code already did that.
2010-09-14 19:35:43 +00:00
kib
48b2cf6dd7 Rename the field to not confuse readers. The bytes are actually used.
Discussed with:	rmacklem
MFC after:	1 week
2010-09-14 18:58:51 +00:00
mckusick
dd70ac636a Update comments in soft updates code to more fully describe
the addition of journalling. Only functional change is to
tighten a KASSERT.

Reviewed by:	jeff Roberson
2010-09-14 18:04:05 +00:00
ken
871d0bf87f MFp4: (//depot/projects/mps/...)
Report data overruns properly.

Submitted by:	scottl
2010-09-14 17:22:06 +00:00
pjd
e87685cef9 - Change all places where G_TYPE_ASCNUM is used to G_TYPE_NUMBER.
It turns out the new type wasn't really needed.
- Reorganize code a little bit.
2010-09-14 16:21:13 +00:00
mm
9525362cee Remove duplicated VFS_HOLD due to a mismerge.
PR:		kern/150544
Approved by:	delphij (mentor)
MFC after:	1 day
2010-09-14 12:12:18 +00:00
pjd
65239d84e5 Simplify the code a bit. 2010-09-14 11:42:07 +00:00
mm
af9e1720ca Add missing vop_vector zfsctl_ops_shares
Add missing locks around VOP_READDIR and VOP_GETATTR with z_shares_dir

PR:		kern/150544
Approved by:	delphij (mentor)
Obtained from:	perforce (pjd)
MFC after:	1 day
2010-09-14 10:27:32 +00:00
mav
6eed5acb73 Fix panic on NULL dereference possible after r212541. 2010-09-14 10:26:49 +00:00
mav
6c05aa4db6 Make kern_tc.c provide minimum frequency of tc_ticktock() calls, required
to handle current timecounter wraps. Make kern_clocksource.c to honor that
requirement, scheduling sleeps on first CPU for no more then specified
period. Allow other CPUs to sleep up to 1/4 second (for any case).
2010-09-14 08:48:06 +00:00
mav
5864d6e457 Replace spin lock with the set of atomics. It is impractical for one
tc_ticktock() call to wait for another's completion -- just skip it.
2010-09-14 04:57:30 +00:00
mav
5f7bd119f7 Add some foot shooting protection by checking singlemul value correctness.
Rephrase sysctls descriptions.

Suggested by:	edmaste
2010-09-14 04:48:04 +00:00
grehan
93d9948943 Resurrect PSIM support by moving the cacheline size-detection warning
printf outside of the MMU-disabled region. A call into OpenFirmware
with the MMU off resulted in an internal PSIM assert.
2010-09-14 03:18:11 +00:00
emaste
2b6d501a6e Avoid repeatedly spamming the console while a timed out command is waiting
to complete.  Instead, print one message after the timeout period expires,
and one more when (if) the command eventually completes.

MFC after:	1 month
2010-09-14 01:51:04 +00:00
neel
eb7e545e67 Port r212559 to mips.
Do not explicitly enable interrupts in smp_init_secondary() because it
renders any spinlock protected code after that point to run with
interrupts enabled. This is because the processor is executing in the
context of idlethread whose 'md_spinlock_count' is already set to 1.

Instead just let sched_throw() re-enable interrupts when it releases
the spinlock.

The original powerpc commit log for r212559 is available here:
http://svn.freebsd.org/viewvc/base?view=revision&revision=212559
2010-09-14 01:48:01 +00:00
neel
0f470b2019 Enforce that pmap_mapdev() always returns uncacheable mappings.
Reviewed by:	imp, jchandra, jmallett
2010-09-14 01:27:53 +00:00
nwhitehorn
1680d79cbd Fix a missing set of parantheses that could cause recent versions of libthr
to crash deferencing a NULL pointer to the user context on powerpc64
systems with COMPAT_FREEBSD32 defined.
2010-09-13 22:50:05 +00:00
jkim
1bef4fcb17 Fix segment:offset calculation of interrupt vector for relocated video BIOS
when the original offset is bigger than size of one page.  X86BIOS macros
cannot be used here because it is assumed address is only linear in a page.

Tested by:	netchild
2010-09-13 19:58:46 +00:00
pjd
7766ec39d4 Remove the page queues lock around vm_page_undirty() - it is no longer needed.
Reviewed by:	alc
2010-09-13 19:47:09 +00:00
mdf
3ed6eac561 Revert r212370, as it causes a LOR on powerpc. powerpc does a few
unexpected things in copyout(9) and so wiring the user buffer is not
sufficient to perform a copyout(9) while holding a random mutex.

Requested by: nwhitehorn
2010-09-13 18:48:23 +00:00
rpaulo
8fc202916b Bump __FreeBSD_version to reflect the userland DTrace changes.
Sponsored by:	The FreeBSD Foundation
> Description of fields to fill in above:                     76 columns --|
> PR:            If a GNATS PR is affected by the change.
> Submitted by:  If someone else sent in the change.
> Reviewed by:   If someone else reviewed your modification.
> Approved by:   If you needed approval for this commit.
> Obtained from: If the change is from a third party.
> MFC after:     N [day[s]|week[s]|month[s]].  Request a reminder email.
> Security:      Vulnerability reference (one per line) or description.
> Empty fields above will be automatically removed.

M    param.h
2010-09-13 17:53:43 +00:00
imp
418e75a887 TARGET_64BIT isn't needed anymore, GC it (partial merge from tbemd). 2010-09-13 16:39:33 +00:00
nwhitehorn
c105226853 Fix a subtle bug uncovered by the recent one-shot timer import in which
any spin locks acquired between the enabling of interrupts in
machdep_ap_bootstrap() and the invocation of the scheduler would fail to
have interrupts disabled due to the fake spinlock already held by the
idle thread. sched_throw(NULL) will enable interrupts by itself when
exiting this spinlock, so just let it do that and don't enable interrupts
here.
2010-09-13 15:36:42 +00:00
mav
2a8c47ab11 Change call order to enable interrupts only after timer being programmed.
Submitted by:	nwhitehorn
2010-09-13 14:25:07 +00:00
pjd
3d8ce965d3 - Remove gc_argname field. It was introduced for gpart(8), but if I
understand everything correctly, we don't really need it.
- Provide default numeric value as strings. This allows to simplify
  a lot of code.
- Bump version number.
2010-09-13 13:48:18 +00:00
jchandra
7dc7517414 sys/mips/rmi/msgring.h - fixes and clean up.
- Remove sync from msgrng_send, sync needs to be called just once before
  sending.
- Fix retry logic - don't reload registers when retrying in message_send,
  also fix check for send pending fail.
- remove unused message_send_block_fast()
- merge message_receive_fast() to message_receive
- style(9) fixes, and comments
- rge and nlge updated for the sys/mips/rmi/msgring.h changes
2010-09-13 13:11:50 +00:00
jchandra
3137214722 bus_add_child method is needed now. 2010-09-13 11:47:35 +00:00
avg
c294549cbd acpi_cpu: do not apply P_LVLx_LAT rules to latencies returned by _CST
ACPI specification sates that if P_LVL2_LAT > 100, then a system doesn't
support C2; if P_LVL3_LAT > 1000, then C3 is not supported.
But there are no such rules for Cx state data returned by _CST.  If a
state is not supported it should not be included into the return
package.  In other words, any latency value returned by _CST is valid,
it's up to the OS and/or user to decide whether to use it.

Submitted by:	nork
Suggested by:	mav
MFC after:	1 week
2010-09-13 09:51:24 +00:00
pjd
6f96b7c228 - Allow to specify value as const pointers.
- Make optional string values always an empty string.
2010-09-13 08:56:07 +00:00
avg
ab04d6fe3f bus_add_child: add specialized default implementation that calls panic
If a kobj method doesn't have any explicitly provided default
implementation, then it is auto-assigned kobj_error_method.
kobj_error_method is proper only for methods that return error code,
because it just returns ENXIO.
So, in the case of unimplemented bus_add_child caller would get
(device_t)ENXIO as a return value, which would cause the mistake to go
unnoticed, because return value is typically checked for NULL.
Thus, a specialized null_add_child is added.  It would have sufficied
for correctness to return NULL, but this type of mistake was deemed to
be rare and serious enough to call panic instead.

Watch out for this kind of problem with other kobj methods.

Suggested by:	jhb, imp
MFC after:	2 weeks
2010-09-13 08:34:20 +00:00
imp
367de98e5d Simplify atomic selection 2010-09-13 07:29:02 +00:00
imp
a235626c3c Prefer MACHINE_CPUARCH over MACHINE_ARCH 2010-09-13 07:27:03 +00:00
mav
eb4931dc6c Refactor timer management code with priority to one-shot operation mode.
The main goal of this is to generate timer interrupts only when there is
some work to do. When CPU is busy interrupts are generating at full rate
of hz + stathz to fullfill scheduler and timekeeping requirements. But
when CPU is idle, only minimum set of interrupts (down to 8 interrupts per
second per CPU now), needed to handle scheduled callouts is executed.
This allows significantly increase idle CPU sleep time, increasing effect
of static power-saving technologies. Also it should reduce host CPU load
on virtualized systems, when guest system is idle.

There is set of tunables, also available as writable sysctls, allowing to
control wanted event timer subsystem behavior:
  kern.eventtimer.timer - allows to choose event timer hardware to use.
On x86 there is up to 4 different kinds of timers. Depending on whether
chosen timer is per-CPU, behavior of other options slightly differs.
  kern.eventtimer.periodic - allows to choose periodic and one-shot
operation mode. In periodic mode, current timer hardware taken as the only
source of time for time events. This mode is quite alike to previous kernel
behavior. One-shot mode instead uses currently selected time counter
hardware to schedule all needed events one by one and program timer to
generate interrupt exactly in specified time. Default value depends of
chosen timer capabilities, but one-shot mode is preferred, until other is
forced by user or hardware.
  kern.eventtimer.singlemul - in periodic mode specifies how much times
higher timer frequency should be, to not strictly alias hardclock() and
statclock() events. Default values are 2 and 4, but could be reduced to 1
if extra interrupts are unwanted.
  kern.eventtimer.idletick - makes each CPU to receive every timer interrupt
independently of whether they busy or not. By default this options is
disabled. If chosen timer is per-CPU and runs in periodic mode, this option
has no effect - all interrupts are generating.

As soon as this patch modifies cpu_idle() on some platforms, I have also
refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions
(if supported) under high sleep/wakeup rate, as fast alternative to other
methods. It allows SMP scheduler to wake up sleeping CPUs much faster
without using IPI, significantly increasing performance on some highly
task-switching loads.

Tested by:	many (on i386, amd64, sparc64 and powerc)
H/W donated by:	Gheorghe Ardelean
Sponsored by:	iXsystems, Inc.
2010-09-13 07:25:35 +00:00
imp
c517eaecea Use MACHINE_CPUARCH as appropriate
Define __KLD_SHARED to be yes or no depending on if the target uses shared
binaries for klds or not (this also eliminates 4 uses of MACHINE_ARCH).
2010-09-13 07:16:48 +00:00
mav
1b35612118 Add tunable 'hint.hpet.X.per_cpu' to specify how much per-CPU timers driver
should provide if there is sufficient hardware. Default is 1.
2010-09-13 06:32:56 +00:00
jchandra
edb616a784 The functions in sys/mips/mips/psraccess.S can be implemented with
mips_rd_status/mips_wr_status.  Implement them in mips/include/cpufunc.h,
and remove psraccess.S.

Reviewed by:	neel, imp
2010-09-13 05:03:37 +00:00