use "\n\" instead of "\" at the end of each source line, and don't use
semicolons). Fixed some older style bugs on the same lines (mainly
English errors in comments).
1) The race has to do with zone destruction. From the zone destructor we
would lock the zone, set the working set size to 0, then unlock the zone,
drain it, and then free the structure. Within the window following the
working-set-size set to 0 and unlocking of the zone and the point where
in zone_drain we re-acquire the zone lock, the uma timer routine could
have fired off and changed the working set size to something non-zero,
thereby potentially preventing us from completely freeing slabs before
destroying the zone (and thus leaking them).
2) The leak has to do with zone destruction as well. When destroying a
zone we would take care to free all the buckets cached in the zone, but
although we would drain the pcpu cache buckets, we would not free them.
This resulted in leaking a couple of bucket structures (512 bytes each)
per cpu on SMP during zone destruction.
While I'm here, also silence GCC warnings by turning uma_slab_alloc()
from inline to real function. It's too big to be an inline.
Reviewed by: JeffR
a long-standing mistake in the way a portion of a pipe's KVA is
allocated. Specifically, kmem_alloc_pageable() is inappropriate
for use in the "direct" case because it allows a preceding vm map entry
and vm object to be extended to support the new KVA allocation.
However, the direct case KVA allocation should not have a backing
vm object. This is corrected by using kmem_alloc_nofault().
Submitted by: tegge (with the above explanation by me)
tsb_foreach(), 0 signals to terminate the tsb traversal, so when
tsb_foreach() was used in pmap_protect() (which only happens when
the area to be protected is larger than PMAP_TSB_THRESH = 16MB), only
the first tsb entry in the specified range would be protected.
Reported by: Andrew Belashov <bel@orel.ru>
("UMA Zone") carefully, because it does not have pcpu caches allocated
at all. In the UP case, we did not catch this because one pcpu cache
is always allocated with the zone, but for the MP case, we were getting
bogus stats for this zone.
Tested by: Lukas Ertl <le@univie.ac.at>
- In sysctl_vm_zone use the per cpu locks to read the current cache
statistics this makes them more accurate while under heavy load.
Submitted by: tegge
so not only wastes memory but it can also cause a leak in zones that
will be destroyed later. The problem is that the slab allocation code
places newly created slabs on the partially allocated list because it
assumes that the caller will actually allocate some memory from it.
Failure to do so places an otherwise free slab on the partial slab list
where we wont find it later in zone_drain().
Continuously prodded to fix by: phk (Thanks)
with up to date comments. This fixes booting kernels with boot2
(except for loss of the features provided by loader) and is suitable
for MFC. Contrary to the old comments, most loaders don't clear the bss.
biosboot lost clearing of the bss in a code crunch in 1997, and boot2
never did it.
kan didn't notice the problem with gcc-3.3 putting variables that are
initialized to 0 in the bss until after committing gcc-3.3 because he
was already using essentially this patch. Before gcc-3.3, only the
non-critical `bootdev' variable was clobbered by clearing the bss.
MFC after: 3 days
messages are forwarded as netgraph control messages to the node
that is connected to the manage hook. If that hook is not connected,
the event is lost. Flow control events are converted to netgraph
flow control messages and send along the hook that is connected to
the flow controlled VC. ACR change events are converted to control
messages and sent along the hook for the given VC.
instead of int where the variable has to hold buffer lengths,
use u_int for things like number of network interfaces which
in principle can never be negative.
parts of the system about certain kinds of events, like changes
in the ABR rate, changes in the carrier state, PVC changes. The
main consumers of these events are the harp(4) pseudo-driver
and the ILMI daemon via ng_atm(4).
HIDENAME() macro seems to be unimplementable in C. (HIDENAME() used
to use invalid token pasting using ## for the STDC case until gcc
started rejecting that; now it uses unportable token pasting using
juxtaposition in all cases.) This reduces use of HIDENAME() in the
kernel to only i386 and amd64 profiling code so that it doesn't bite
most kernels whenever gcc becomes stricter. Problems with HIDENAME()
in userland are smaller because userland mostly doesn't use strict
flags yet. There are some advantages to hiding the name of mcount,
but newer arches shouldn't do it; only amd64 does.
MFC after: 3 days
On second thoughts hide tmpstk better by staticizing it.
- cut the version string at the newline, suppressing information about
who built the kernel and in what directory. Most of this information
was already lost to truncation.
- on i386, return the precise CPU class (if known) rather than just
"i386". Linux software which uses this information to select
which binary to run often does not know what to make of "i386".
allocation function. With this patch, it prevents continous growth of
the devbuf memory pool.
Tested with ssh <host> dd of=/dev/null < /dev/zero and vmstat -m | grep devbuf
to such devices. If a device fails due to this commit, add:
options DA_OLD_QUIRKS
to the kernel config and recompile. Then send the output of "camcontrol
inquiry da0" to scsi@freebsd.org so the quirk can be re-enabled.
to set np->n_size back to the desired size again after calling
nfs_meta_setsize(), since it could end up in nfs_loadattrcache() getting
called, which would change n_size back to the value it had before the
truncate request was issued. The result of this bug is that the size info
cached in the nfsnode becomes incorrect, lseek(fd, ofs, SEEK_END) seeks
past the end of the file, stat() returns the wrong size, etc.
PR: 41792
MFC after: 2 weeks
kernel ACL interfaces and system call names.
Break out UFS2 and FFS extattr delete and list vnode operations from
setextattr and getextattr to deleteextattr and listextattr, which
cleans up the implementations, and makes the results more readable,
and makes the APIs more clear.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
kern.file sysctl, don't return information about processes that
fail p_cansee(td, p). This prevents sockstat and related
programs from seeing file descriptors owned by processes not
in the same jail as the thread, as well as having implications
for MAC, etc.
This is a partial solution: it permits an information leak about
the number of descriptors in the sizing calculation (but this is
not new information, you can also get it from kern.openfiles),
and doesn't attempt to mask file descriptors based on the
properties of the descriptor, only the process referencing it.
However, it provides most of what you want under most
circumstances, without complicating the locking.
PR: 54211
Based on a patch submitted by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>
receive 6 byte commands. Add a check for this flag to da(4) and cd(4) so
that they honor it. This is a quick workaround for many devices (especially
USB) that require da(4) quirks to operate. The more complete approach is
to finish the new transport code which will be aware of the SCSI version a
transport implements.
MFC after: 1 day
child when forking. This provides a consistent initial state.
Note that cpu_set_upcall() does not clear the per-CPU unique value as
it is followed by a call to set_mcontext(), which sets it accordingly.
in the `video_state' structure, to larger ones (from u_char to
u_short). Each can now hold values at least as large as the
size of the array it is meant to point into.
This eliminates warnings printed by GCC 3.3.1 and hence makes
pcvt compilable using -Werror.
memory in bus_dmamem_alloc(). This is possible now that
contigmalloc() supports the M_ZERO flag.
- Remove the locking of Giant around calls to contigmalloc() since
contigmalloc() now grabs Giant itself.
mainly to quiet a warning emitted by GCC 3.3 about comparing
a variable to a value which is larger than the former can hold.
The value was checked to make sure the `np->squeue' array is
not accessed behind its boundary.
This worked due to possibly accidental truncation when
(np->squeueput + 1) was larger than or equal to MAX_START (256)
when it was assigned to `qidx'.
`qidx' is used to hold the next position in the start queue
for an insertion. The new type was chosen because some other
code in the function ncr_freeze_devq() also uses plain integers
to hold those indices.
Wrapped the line after the closing parenthesis of an `if'
condition.
function unless the device is configured up. Without this fix, the
device ends up in the RUNNING state even though it is configured down.
Also, check the RUNNING flag before calling the if_start function, in
case the if_init function failed for one reason or another.
contain the filedescriptor number on opens from userland.
The index is used rather than a "struct file *" since it conveys a bit
more information, which may be useful to in particular fdescfs and /dev/fd/*
For now pass -1 all over the place.
invariants/witness/etc on i386, sparc64, amd64 and alpha for GENERIC.
Lint probably still needs fixing, as do a couple of other drivers
that have broken recently and not been noticed.
in ntfs_writentvattr_plain and ntfs_readntvattr_plain, and purge the boot
block from the buffer cache if isn't exactly one cluster long. These two
changes work around the same buffer cache bug that ntfs_subr.c 1.30 tried
to, but in a different way. This may decrease throughput by reading smaller
amounts of data from the disk at a time, but may increase it by avoiding
bogus writes of clean buffers.
Problem (re)reported by Karel J. Bosschaart on -current.
switching anymore, so there's no need to save and restore GP. This
change breaks threaded applications linked against libc_r. Pull the
tier 2 card again: relink. This will link against libthr instead.
the "do-nothing" versions of __RCSID(), __RCSID_SOURCE(), __SCCSID(),
and __COPYRIGHT() were not strictly correct. They should not expand
into [nothing], because the ';' which follows them would then cause
a syntax error (in a strict C compiler, if not gcc...).
So, change the do-nothing versions of those macros to use the
'struct __hack' tactic, as was already used with __FBSDID().
Approved by: discussions with bde
MFC after: 1 week
on i486's (and probably i386's), but it has had very little effect
since gcc-2.7 or gcc-2.95. With gcc-3.3, it gave a small
pessimization for at least i386's, athlon-xp's and pentium4's, a
small optimization (I think) for pentium1's, and made no difference
for i386's. (movzbl is best for all the later processors, and the
micro-optimization was to stop it being used on i486's.)
a non-standard construct. Instead, redefine struct _ia64_fpreg as a
union and put a long double in it. On ia64 and for LP64, this is
defined by the ABI to have 16-byte alignment. For ILP32 a long double
has 4-byte alignment, but we don't support ILP32.
Note that the in-memory image of a long double does not match the in-
memory image of spilled FP registers. This means that one cannot use
the fpr_flt field to interpet the bits. For this reason we continue
to use an aggregate type.
connections. While this confuses tcpdump, it enables other applications
to see and analyze non-IP traffic (signalling, for example).
Pointed out by: Vincent Jardin <vjardin@wanadoo.fr>
OID as the other protocol family sub-trees do, that is equal to the
protocol family identifier. Make the ATM layer debugging flags
available under this tree.
Submitted by: Vincent Jardin <vjardin@wanadoo.fr>
MFC after: 2 weeks
set an initial value. This is aimed at getting us closer to being able to
turn -Werror back on and we can adjust the settings later on. Yes, we
could turn off -Wno-inline instead, but that would hide the effect of
gcc's bogo-estimator ignoring inline (either rightly or wrongly).
written as a template that when inlined is specialized for the caller
through constant value propagation and dead code elimination. Thus,
the specialized code that is generated for pmap_clear_reference() et
al. avoids several conditional branches inside of a loop.
fields in the low 32 bits of the local APIC ICR register. Use this macro
in place of APIC_RESV2_MASK when masking off existing bits from the ICR
when writing to it to send an IPI.
Tested by: scottl
but this just created a weird inconsistency when porting gdb(1).
Instead, we name each high FP register seperately, like we do for
all the other registers.
__attribute__((__always_inline__)) by adding an __always_inline macro
(used like __dead2 etc). __inline_damnit has also been suggested but we
have a precedent of keeping the names similar so they are easier to find.
These were a left over from when the private memory pools were
converted to use uma zones. The limit of UMA zones, however,
works differently. When a zone is limited to only one or two pages
than, on multi-cpu systems, processes can get stuck on the zonelimit,
because all remaining free items are in caches of other CPUs.
Also add rudimentary error handling in some places (panic) when a zone
cannot be created.