by removing parentheses. The main bug is in gcc: on machines with
64-bit longs and 64-bit long longs,
(unsigned long long)rdp->total_sectors / ((1024L * 1024L) / DEV_BSIZE))
has type plain unsigned long instead of the correctly promoted type
unsigned long long, so it can not be printfed using %llu format. Even
1ULL / 1L is mispromoted. Anyway, casting the correct operand
automatically avoids the problem. We do not want to to pessimize the
division; we just want to convert to a common maximal type for printing.
moderately improves msync's and VM object flushing for objects containing
randomly dirtied pages (fsync(), msync(), filesystem update daemon),
and improves cpu use for small-ranged sequential msync()s in the face of
very large mmap()ings from O(N) to O(1) as might be performed by a database.
A sysctl, vm.msync_flush_flag, has been added and defaults to 3 (the two
committed optimizations are turned on by default). 0 will turn off both
optimizations.
This code has already been tested under stable and is one in a series of
memq / vp->v_dirtyblkhd / fsync optimizations to remove O(N^2) restart
conditions that will be coming down the pipe.
MFC after: 3 days
- Move jail checks and some other checks involving constants and stack
variables out from under Giant. This isn't perfectly safe atm because
jail_sysvipc_allowed is read w/o a lock meaning that its value could be
stale. This global variable will soon become a per-jail flag, however,
at which time it will either not need a lock or will use the prison lock.
individual filesystems to determine whether they should operate in
"file system as a single object" mode, or "file system as a set of objects
with individual labels" mode. Note: in the trustedbsd_mac branch,
this is refered to as "MNT_MULTILEVEL", but the two mean the same thing.
MNT_MULTILABEL is more suggestive of a flexible policy system than one
providing purely hierarchal policies. The need for a reserved flag will
go away once nmount() is done.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
and for individual MAC policies. The framework event initializes the
access control subsystem; the policy event allows policies to register
themselves. The gap in between is for all the things we'll think of
later.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
pseudo-devices when an interface goes away. Otherwise, an open /dev/net/foo0
when the interface is removed can cause a crash.
Not objected to by: jlemon
Some buggy firmware workarounds. Fix some endian bugs.
These reduce the diffs from NetBSD, but NetBSD does have more changes since
my last manual merge.
people working on the MAC tree from getting toasted whenever system call
numbers are allocated in the main tree (for example, for KSE :-).
Calls allocated: __mac_{get,set}_proc, __mac_{get,set}_{fd,file}().
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
Includes some minor whitespace changes, and re-ordering to be able to document
properly (e.g, grouping of variables and the SYSCTL macro calls for them, where
the documentation has been added.)
Reviewed by: phk (but all errors are mine)
active-filter in pppd(8).
PR: kern/12281
Submitted by: Tim Moore <moore@bricoworks.com>
Not objected by: peter
Reviewed by: ru
Approved by: ru
MFC after: 1 week
be allocated as arrays indexed by the cpu id. Previously the only reliable
way to know the max cpu id was through MAXCPU. mp_ncpus isn't useful here
because cpu ids may be sparsely mapped, although x86 and alpha do not do this.
Also, call cpu_mp_probe much earlier so the max cpu id is known before the VM
starts up. This is intended to help support per cpu queues for the new
allocator, but may be useful elsewhere.
Reviewed by: jake
Approved by: jake
Link if only ATAPI device in kernel config
Remove unused #includes
Rearrange a bit in ata-raid to make diff against -stable smaller
Enable wc as default again, dunne how this happend...
and again in vm_page.c and vm_pageq.c.
o Delete unusused prototypes. (Mainly a result of the earlier renaming
of various functions from vm_page_*() to vm_pageq_*().)
This makes other power-management system (APM for now) to be able to
generate power profile change events (ie. AC-line status changes), and
other kernel components, not only the ACPI components, can be notified
the events.
- move subroutines in acpi_powerprofile.c (removed) to kern/subr_power.c
- call power_profile_set_state() also from APM driver when AC-line
status changes
- add call-back function for Crusoe LongRun controlling on power
profile changes for a example
on the loader to do it. Improve smp startup code to be less racy and to
defer certain things until the right time. This almost boots single user
on my dual ultra 60, it is still very fragile:
SMP: AP CPU #1 Launched!
Enter full pathname of shell or RETURN for /bin/sh:
# ls
Debugger("trapsig")
Stopped at Debugger+0x1c: ta %xcc, 1
db> heh
No such command
db>
with pmaps. When the context numbers wrap around we flush all user mappings
from the tlb. This makes use of the array indexed by cpuid to allow a pmap
to have a different context number on a different cpu. If the context numbers
are then divided evenly among cpus such that none are shared, we can avoid
sending tlb shootdown ipis in an smp system for non-shared pmaps. This also
removes a limit of 8192 processes (pmaps) that could be active at any given
time due to running out of tlb contexts.
Inspired by: the brown book
Crucial bugfix from: tmm
clobbered by the child. This is more complicated than usual because the
window that could get clobbered is pushed in kernel mode, so a lot of
registers would have to be saved in other registers in userland and we
don't have enough. What we do have is space in the pcb to temporarily
store user windows that were spilled in kernel mode, but could not be
immediately stored to the user stack. So we copy in the parent's topmost
window and save it in the pcb, and arrange for it to be copied back out
when the child is done frobbing the stack.
Reviewed by: tmm
PAGE_SIZE / MCLBYTES == 1 crash. Fix them by changing the
appropriate "allocate new page and bucket" code in mb_alloc to use
the macro for properly grabbing an allocated object from a bucket,
the one that checks whether the bucket is empty.
This should allow ken to continue testing zero-copy stuff on -CURRENT.
Noticed and provided debug info: ken
* Move the section which manipulates ia64_pal_base to after cninit() so
that we don't risk printing anything before we have a console.
* Don't call ia64_probe_sapics() for a SKI build. This should really
be dependant on ACPICA being present or something.
Add code to properly detach/attach disks that are part of a RAID.
Mark a disk that is attached on an ATA channel belonging to a
RAID as a spare disk that can be used for rebuilding failed RAID1's.
Add support for rebuilding failed RAID1's.
Several fixes to the detach/attach code.
For replacing a disk in a failed RAID1 do the following:
Find the controller channel# of the failed disk.
Exec 'atacontrol detach <channel#>' to free the disk from the system.
Replace the failed disk with a new one of at least the same size.
If your have your disks in drawers/enclosures this can be done with
the system still running.
Exec 'atacontrol attach <channel#>' to add the disk to the system and
mark it as a valid spare for rebuild.
Exec 'atacontrol rebuild <array#>'
The system will rebuild the array on the fly, the array can still
be used during this, although with slower performance.
Please let me know of any problems with this!
Sponsored by: Advanis Inc.
MFC after: 2 weeks
fill out netc_anon (a `struct ucred'), and add an XXX around the
entire operation since it isn't clear whether it's doing the right
thing with things like cr_uidinfo and cr_prison.
is not a neighbor. see comments for the detailed reason.
- Rejected the process of nd6_rtrequest() when the request is RESOLVE and
the interface does not need neighbor caches.
Obtained from: KAME
MFC After: 1 week
the data was supplied as a uio or an mbuf. Previously the limit was
ignored for mbuf data, and NFS could run the kernel out of mbufs
when an ipfw rule blocked retransmissions.
Previously, the UPAGES/KSTACK area of processes/threads would leak memory
at the time that a previously swapped process was terminated. Lukcily, the
leak was only 12K/proc, so it was unlikely to be a major problem unless you
had an undersized swap partition.
Submitted by: dillon
Reviewed by: silby
MFC after: 1 week
I put these in to match the use of spl*() in the NetBSD code I was basing this
on, but it appears to cause problems.
I'm doing this in a separate commit so as to be able to refer back if locking
becomes an issue at a later stage.
Define the atm_dev_free() routine so that its OK to free stuff that is defined
as volatile. Note this doesn't FORCE the arguemnts to be volatile,
just says that it's not an error if it is..
and isn't strictly required. However, it lowers the number of false
positives found when grep'ing the kernel sources for p_ucred to ensure
proper locking.
the pipe is locked and shouldn't be.
initialize pipe->pipe_mtxp to NULL when creating pipes in order not
to trip the above assertions.
swap pipe lock with giant around calls to pipe_destroy_write_buffer()
pipe_destroy_write_buffer issue noticed by: jhb
fully protect p_ucred yet so Giant is needed until all the p_ucred
locking is done. This is the original reason td_ucred was not used
immediately after its addition. Unfortunately, not using td_ucred is
not enough to avoid problems. Since p_ucred could be stale, we could
actually be dereferencing a stale pointer to dink with the refcount, so
we really need Giant to avoid foot-shooting. This allows td_ucred to
be safely used as well.
In order to determine what to page out, the vm_daemon checks
reference bits on all pages belonging to all processes. Unfortunately,
the algorithm used reacted badly with shared pages; each shared page
would be checked once per process sharing it; this caused an O(N^2)
growth of tlb invalidations. The algorithm has been changed so that
each page will be checked only 16 times.
Prior to this change, a fork/sleepbomb of 1300 processes could cause
the vm_daemon to take over 60 seconds to complete, effectively
freezing the system for that time period. With this change
in place, the vm_daemon completes in less than a second. Any system
with hundreds of processes sharing pages should benefit from this change.
Note that the vm_daemon is only run when the system is under extreme
memory pressure. It is likely that many people with loaded systems saw
no symptoms of this problem until they reached the point where swapping
began.
Special thanks go to dillon, peter, and Chuck Cranor, who helped me
get up to speed with vm internals.
PR: 33542, 20393
Reviewed by: dillon
MFC after: 1 week
in most machines of the Sun Ultra series. This is a port of the NetBSD
driver which I enhanced to make use of the gather functionality and the
configurable RX buffer offset to avoid copying all received/sent packet
(instead, packets will be directly DMAd from mbuf chains and into mbuf
clusters now).
device drivers for bus system with other endinesses than the CPU (using
interfaces compatible to NetBSD):
- bwap16() and bswap32(). These have optimized implementations on some
architectures; for those that don't, there exist generic implementations.
- macros to convert from a certain byte order to host byte order and vice
versa, using a naming scheme like le16toh(), htole16().
These are implemented using the bswap functions.
- stream bus space access functions, which do not perform a byte order
conversion (while the normal access functions would if the bus endianess
differs from the CPU endianess).
htons(), htonl(), ntohs() and ntohl() are implemented using the new
functions above for kernel usage. None of the above interfaces is currently
exported to user land.
Make use of the new functions in a few places where local implementations
of the same functionality existed.
Reviewed by: mike, bde
Tested on alpha by: mike