attached. Otherwise, the snp->snp_tty would be overwritten, while the
tty line discipline still set to the snpdisc. Then snplwrite() causes
panic because ttytosnp() cannot find the snp.
MFC after: 1 week
support its -k argument:
kern.proc.kstack - dump the kernel stack of a process, if debugging
is permitted.
This sysctl is present if either "options DDB" or "options STACK" is
compiled into the kernel. Having support for tracing the kernel
stacks of processes from user space makes it much easier to debug
(or understand) specific wmesg's while avoiding the need to enter
DDB in order to determine the path by which a process came to be
blocked on a particular wait channel or lock.
- Introduce per-architecture stack_machdep.c to hold stack_save(9).
- Introduce per-architecture machine/stack.h to capture any common
definitions required between db_trace.c and stack_machdep.c.
- Add new kernel option "options STACK"; we will build in stack(9) if it is
defined, or also if "options DDB" is defined to provide compatibility
with existing users of stack(9).
Add new stack_save_td(9) function, which allows the capture of a stacktrace
of another thread rather than the current thread, which the existing
stack_save(9) was limited to. It requires that the thread be neither
swapped out nor running, which is the responsibility of the consumer to
enforce.
Update stack(9) man page.
Build tested: amd64, arm, i386, ia64, powerpc, sparc64, sun4v
Runtime tested: amd64 (rwatson), arm (cognet), i386 (rwatson)
We used to allocate the domains 0-14 for userland, and leave the domain 15
for the kernel. Now supersections requires the use of domain 0, so we
switched the kernel domain to 0, and use 1-15 for userland.
How it's done currently, the kernel domain could be allocated for a
userland process.
So switch back to the previous way we did things, set the first available
domain to 0, and just add 1 to get the real domain number in the struct pmap.
Reported by: Mark Tinguely <tinguely AT casselton DOT net>
MFC After: 3 days
1. A packet comes in that is to be forwarded
2. The destination of the packet is rewritten by some firewall code
3. The next link's MTU is too small
4. The packet has the DF bit set
Then the current code is such that instead of setting the next
link's MTU in the ICMP error, ip_next_mtu() is called and a guess
is sent as to which MTU is supposed to be tried next. This is because
in this case ip_forward() is called with srcrt set to 1. In that
case the ia pointer remains NULL but it is needed to get the MTU
of the interface the packet is to be sent out from.
Thus, we always set ia to the outgoing interface.
MFC after: 2 weeks
The RAS implementation would set the end address, then the start
address. These were used by the kernel to restart a RAS sequence if
it was interrupted. When the thread switching code ran, it would
check these values and adjust the PC and clear them if it did.
However, there's a small flaw in this scheme. Thread T1, sets the end
address and gets preempted. Thread T2 runs and also does a RAS
operation. This resets end to zero. Thread T1 now runs again and
sets start and then begins the RAS sequence, but is preempted before
the RAS sequence executes its last instruction. The kernel code that
would ordinarily restart the RAS sequence doesn't because the PC isn't
between start and 0, so the PC isn't set to the start of the sequence.
So when T1 is resumed again, it is at the wrong location for RAS to
produce the correct results. This causes the wrong results for the
atomic sequence.
The window for the first race is 3 instructions. The window for the
second race is 5-10 instructions depending on the atomic operation.
This makes this failure fairly rare and hard to reproduce.
Mutexs are implemented in libthr using atomic operations. When the
above race would occur, a lock could get stuck locked, causing many
downstream problems, as you might expect.
Also, make sure to reset the start and end address when doing a syscall, or
a malicious process could set them before doing a syscall.
Reviewed by: imp, ups (thanks guys)
Pointy hat to: cognet
MFC After: 3 days
its -f and -v arguments:
kern.proc.filedesc - dump file descriptor information for a process, if
debugging is permitted, including socket addresses, open flags, file
offsets, file paths, etc.
kern.proc.vmmap - dump virtual memory mapping information for a process,
if debugging is permitted, including layout and information on
underlying objects, such as the type of object and path.
These provide a superset of the information historically available
through the now-deprecated procfs(4), and are intended to be exported
in an ABI-robust form.
January 1, 1601. The 1601 - 1970 period was in seconds rather than 100ns
units.
Remove duplication by having NdisGetCurrentSystemTime call ntoskrnl_time.
linker interfaces for looking up function names and offsets from
instruction pointers. Create two variants of each call: one that is
"DDB-safe" and avoids locking in the linker, and one that is safe for
use in live kernels, by virtue of observing locking, and in particular
safe when kernel modules are being loaded and unloaded simultaneous to
their use. This will allow them to be used outside of debugging
contexts.
Modify two of three current stack(9) consumers to use the DDB-safe
interfaces, as they run in low-level debugging contexts, such as inside
lockmgr(9) and the kernel memory allocator.
Update man page.
sx driver), change a magic value in the PLX bridge chip. Apparently later
builds of the PCI cards had corrected values in the configuration eeprom.
This change supposedly fixes some pci bus problems.
information in support of DDB(4); these functions bypass normal linker
locking as they may run in contexts where locking is unsafe (such as the
kernel debugger).
Add a new interface linker_ddb_search_symbol_name(), which looks up a
symbol name and offset given an address, and also
linker_search_symbol_name() which does the same but *does* follow the
locking conventions of the linker.
Unlike existing functions, these functions place the name in a
caller-provided buffer, which is stable even after linker locks have been
released. These functions will be used in upcoming revisions to stack(9)
to support kernel stack trace generation in contexts as part of a live,
rather than suspended, kernel.
gets enabled when INVARIANTS is on instead of DIAGNOSTIC (which apparently
nobody uses). From Tor's description:
This happens when the block range spans two block maps, the first in the
inode (mapping up to NDADDR direct blocks) and the second being the first
indirect block. The current check assumes that both block maps are
indirect blocks.
Work done by: tegge
Tested by: kris, kensmith
in the tcp header. With relevant parts of the tcp header changing after
the 'signature' was computed, the signature becomes invalid.
Reviewed by: tools/regression/netinet/tcpconnect
MFC after: 3 days
Tested by: Nick Hilliard (see net@)
is required by the X.Org PCI domains code and additionally needs
a workaround for Hummingbird and Sabre bridges as these don't
allow their config headers to be read at any width, which is an
unusual behavior.
- In psycho(4) take advantage of DEFINE_CLASS_0 and use more
appropriate types for some softc members.
MFC after: 3 days
hack means you can get the units and flags to match up more easily with
serial consoles on machines with acpi tables that cause the com ports
to be probed in the wrong order (and hence get the wrong sio unit number).
This replaces the common alternative hack of editing the code to comment
out the acpi attachment. This could go away entirely when device wiring
patches are committed.
stomping on the units intended for the motherboard sio ports. This is
no real substitute for the not-yet-committed device wiring enhancements.
Code taken from sio's pci attachment.
allocation fails and pv entries are reclaimed, there may be an unused pv
entry in a pv chunk that survived the reclamation. However, previously,
after reclamation, get_pv_entry() did not look for an unused pv entry in
a surviving pv chunk; it simply retried the page allocation. Now, it
does look for an unused pv entry before retrying the page allocation.
Note: This only applies to RELENG_7. Earlier branches use a different
pv entry allocator.
MFC after: 6 weeks
Intel CPUs with family 0x6, model 0xE and later (i.e., Intel Core(TM))
have a PMC architecture that differs somewhat from previous CPUs in
family 0x6. Even though the basic programming model is similar, the
documented set of legal values that may be loaded into their PMC MSRs
differs from that of the previous PMCs in family 0x6 and reusing bit
values valid for the older PMCs could result in undefined behaviour in
the general case.
per-cpu area. cp_time[] goes away and a new function creates a merged
cp_time-like array for things like linprocfs, sysctl etc. The
atomic ops for updating cp_time[] in statclock go away, and the scope
of the thread lock is reduced.
sysctl kern.cp_time returns a backwards compatible cp_time[] array.
A new kern.cp_times sysctl returns the individual per-cpu stats.
I have pending changes to make top and vmstat optionally show per-cpu
stats.
I'm very aware that there are something like 5 or 6 other versions "out
there" for doing this - but none were handy when I needed them.
I did merge my changes with John Baldwin's, and ended up replacing a
few chunks of my stuff with his, and stealing some other code.
Reviewed by: jhb
Partly obtained from: jhb
since the branch caches on at least Athlon XP through Athlon 64 CPU's
don't understand such instructions and guarantee a cache miss taking
at least 10 cycles. Use the documented workaround "ret $0" instead
("nop; ret" also works, but "ret $0" is probably faster on old CPUs).
Normal code (even asm code) doesn't branch to "ret", since there is
usually some cleanup to do, but the __mcount, .mcount and .mexitcount
entry points were optimized too well to have the minimum number of
instructions (3 instructions each if profiling is not enabled) and
they did this. I didn't see a significant number of cache misses for
.mexitcount, but for the shared "ret" for __mcount and .mcount I
observed cache misses costing 26 cycles each. For a send(2) syscall
that makes about 70 function calls, the cost of these cache misses
alone increased the syscall time from about 4000 cycles to about 7000
cycles. 4000 is for a profiling (GUPROF) kernel with profiling disabled;
after this fix, configuring profiling only costs about 600 cycles in the
4000, which is consistent with almost perfect branch prediction in the
mcounting calls.
unused except to obfuscate disassemblies. -mprofiler-epilogue is
currently with gcc-4 (it does too little), but -finstrument-functions
is broken in a different way (it does too much).
amd64 version: meger whitespace fixes from i386 version.
Call uma_sel_align() there at well.
Set CPU_CONTROL_VECRELOC if we're using the high vectors page.
Submitted by: Rafal Jaworowski <raj AT semihalf DOT com>
MFC After: 1 week
bpf will see inner and outer headers or just inner or outer
headers for incoming and outgoing IPsec packets.
This is useful in bpf to not have over long lines for debugging
or selcting packets based on the inner headers.
It also properly defines the behavior of what the firewalls see.
Last but not least it gives you if_enc(4) for IPv6 as well.
[ As some auxiliary state was not available in the later
input path we save it in the tdbi. That way tcpdump can give a
consistent view of either of (authentic,confidential) for both
before and after states. ]
Discussed with: thompsa (2007-04-25, basic idea of unifying paths)
Reviewed by: thompsa, gnn
- On amd64, just assume type #1 is always used. PCI 2.0 mandated
deprecated type #2 and required type #1 for all future bridges which
was well before amd64 existed.
- For i386, ignore whatever value was in 0xcf8 before testing for type #1
and instead rely on the other tests to determine if type #1 works. Some
newer machines leave garbage in 0xcf8 during boot and as a result the
kernel doesn't find PCI at all (which greatly confuses ACPI which expects
PCI to exist when PCI busses are in the namespace).
MFC after: 3 days
Discussed with: scottl
ZFS porting style didn't extend this, instead using a heap of additional
header files that don't get installed.
My intention had been to allow OpenSolaris external code to build on
FreeBSD out of the box (i.e. without a src tree).
Make clear that this is not a good idea when called from
tcp_output()->ipsec_hdrsiz_tcp()->ipsec4_hdrsize_tcp()
as we do not know if IPsec processing is needed at that point.
T_DIRECT filtering so that disk drives can be attached via the
pass driver. Add CAM locking. Don't mark CAM commands as SG64
since the hardware isn't designed to deal with 64-bit passthru
commands. Hopefully the bounce buffer changes that were done
for the management/ioctl interface are robust enough to handle
this deficiency for CAM as well.
- Enable pcbeep control for Acer + ALC268 (nid 29). Give enough (fake)
hints so the parser will grab it and allocate "speaker" control.
- Fix regression while preparing DAC and ADC for multichannel
format. Since playback policy is to output to every possible path,
ensure that each DAC is started.
Reported / Tested by: Guy Brand
Currently, Giant is not too much contented so that it is ok to treact it
like any other mutexes.
Please don't forget to update your own custom config kernel files.
Approved by: cognet, marcel (maintainers of arches where option is
not enabled at the moment)
of some old programs. Since sigval is union type, this change will not have
binary compatibility problem.
MFC: after 3 days
Discussed with: rwatson, glebius
It should just contain the value we want to add, as if we're interrupted
between the add and the str, we will restart from the beginning. Just use
a register we can scratch instead.
MFC After: 1 week
routine. It is not needed as the existing tests for segment coalescing
already handle bounced addresses and it prevents legal segment coalescing
in certain edge cases.
MFC after: 1 week
Reviewed by: scottl
The call should happen with the driver lock held. We don't hold the driver
lock in newstate as it's a separate thread where we can't sleep (and we only
call wpi_cmd in async mode).
Discovered By: Attillo's callout rework
Approved By: mlaier (comentor)
currently, before to spin the turnstile spinlock is acquired and the
waiters flag is set.
This is not strictly necessary, so just spin before to acquire the
spinlock and to set the flags.
This will simplify a lot other functions too, as now we have the waiters
flag set only if there are actually waiters.
This should make wakeup/sleeping couplet faster under intensive mutex
workload.
This also fixes a bug in rw_try_upgrade() in the adaptive case, where
turnstile_lookup() will recurse on the ts_lock lock that will never be
really released [1].
[1] Reported by: jeff with Nokia help
Tested by: pho, kris (earlier, bugged version of rwlock part)
Discussed with: jhb [2], jeff
MFC after: 1 week
[2] John had a similar patch about 6.x and/or 7.x about mutexes probabilly
sends frames up the stack after changing the current channel then
the lookup by ieee channel number may fail leaving a null ptr in
se_chan; if this happens fallback to the channel recorded when the
frame is processed (curchan). Since the frame doesn't contribute
to scan results for the sta this is acceptable.
Reviewed by: thompsa
MFC after: 3 days
1837014 Kernel panics after authentication of an outgoing packet
1836992 Potential bugs in packet auth code (w/patches)
1836967 Kernel panic when using auth rule with keep state
and another reported only to FreeBSD by Andiry (see PR)
PR: kern/118251
Submitted by: Andriy Syrovenko <andriys@gmail.com>
Reviewed by: darrenr
MFC after: 5 days
cast as uint32_t which is defined as unsigned int. gcc doesn't want to
consider that there might not be much difference between an int and
a long on a 32 bit architecture.
vm_pageout_fallback_object_lock() in vm_contig_launder_page() to better
handle a lock-ordering problem. Consequently, trylock's failure on the
page's containing object no longer implies that the page cannot be
laundered.
MFC after: 6 weeks
This has the benefit that rmlocks have proper support for reader recursion
(in contrast to rwlock(9) which could potential lead to writer stravation).
It also means a significant performance gain, eventhough only visible in
microbenchmarks at the moment.
Discussed on: -arch, -net
malloc_type_allocated(..., 0) calls that occur when contigmalloc() has
failed. Eliminate the acquisition and release of the page queues lock
from vm_page_release_contig(). Rename contigmalloc2() to
contigmapping(), reflecting what it does.
as up if at least one of its ports also has a link up. This fixes using
carp+lagg together and any other system that relies on linkstate events.
PR: kern/113956
MFC after: 3 days
the inpcb when there's an inpcb without associated timewait state, and
not unlocking when the inpcb has been freed. This avoids a kernel panic
when tcpdrop(8) is run on a socket in the TIMEWAIT state.
MFC after: 3 days
Reported by: Rako <rako29 at gmail dot com>
should never be moved by one lock to another.
As, luckily, nothing in our tree is using it, axe the function.
This breaks lockmgr KPI, so interested, third-party modules should update
their source code with appropriate replacement.
Ok'ed by: ups, rwatson
MFC after: 3 days
comments from vnode_pager_setsize(). This call was introduced in
revision 1.140 to address a problem that no longer exists.
Specifically, pmap_zero_page_area() has replaced a (possibly)
problematic implementation of page zeroing that was based on
vm_pager_map(), bzero(), and vm_pager_unmap().
while the global callout spinlock is not held, and can lead to PF#.
Reported by: dougb, Mark Atkinson <atkin901 at yahoo dot com>
Tested by: dougb
Diagnosed by: jhb
The lookup hurts a bit for connections but had been there anyway
if IPSEC was compiled in. So moving the lookup up a bit gives us
TSO support at not extra cost.
PR: kern/115586
Tested by: gallatin
Discussed with: kmacy
MFC after: 2 months
a good job of it) in the copypktopts() function, just call ip6_clearpktopts()
directly. Otherwise, the callers of this function would end up freeing the
memory twice.
Reviewed by: jinmei
PR: kern/116360
o Acer Aspire 4520 laptop
- jack sensing / automute
o Toshiba Satellite A135-S4527 laptop
- jack sensing / automute
Tested by: lioux
o Apple Macbook 3 (is it?)
- require gpio0 (for speakers) and ovref50 (for headphone)
to make it works
- jack sensing / automute
Tested by: Ed Schouten
* Add Nvidia MCP67 controller ids.
* Be sensible about simmilar controller with multiple pci ids.
* Connect unused DAC/ADC to stream#0 rather than forcing each of them
managing their own stream.
MFC after: 3 days
include the ithread scheduling step. Without this, a preemption might
occur in between the interrupt getting masked and the ithread getting
scheduled. Since the interrupt handler runs in the context of curthread,
the scheudler might see it as having a such a low priority on a busy system
that it doesn't get to run for a _long_ time, leaving the interrupt stranded
in a disabled state. The only way that the preemption can happen is by
a fast/filter handler triggering a schduling event earlier in the handler,
so this problem can only happen for cases where an interrupt is being
shared by both a fast/filter handler and an ithread handler. Unfortunately,
it seems to be common for this sharing to happen with network and USB
devices, for example. This fixes many of the mysterious TCP session
timeouts and NIC watchdogs that were being reported. Many thanks to Sam
Lefler for getting to the bottom of this problem.
Reviewed by: jhb, jeff, silby
- Bring HEAD up to the latest shared code
- Fix TSO problem using limited MSS and forwarding
- Dual lock implementation
- New device support
- For my ease, this code can compile in either 6.x or later
- brings this driver in sync with the 6.3
prepend a data mbuf in front of a header mbuf without moving the header
to the new mbuf, and (2) a possible alignment problem on architectures
with strict alignment as reported in kern/4184.
PR: kern/4184 (1)
addresses as the source of an AARP request. While this PR was submitted
in the context of work in OpenBSD to port netatalk (in 1997), I've
synchronized the code more to our ARP input routine, which had similar
requirements.
Submitted by: Denton Gentry
PR: kern/4184
MFC after: 1 week
only at address 0 which is supposed to be the only valid phy address
on Marvell PHY. The more correct solution would be masking PHY
address ranges allowable in PHY probe routine. Unfortunately,
FreeBSD has no way to retrict the PHY address ranges or to pass special
flags to PHY driver.
This change assumes that PHY hardwares attached to msk(4) would be
Marvell made 88E11xx PHY.
With this changes the phantom phys attached on 88E8036(Yukon FE)
should disappear.
Reported by: Oleg Lomaka < oleg AT lomaka DOT org DOT ua >
Tested by: Oleg Lomaka < oleg AT lomaka DOT org DOT ua >
only 4KB SRAM.
o Rework setting Tx/Rx RAM buffer size. Give receiver 2/3 of memory
and round it down to the multiple of 1024. The RAM buffer size of
Yukon II should be multiple of 1024. This fixes bogus RAM buffer
configuration used in Yukon FE.
Reported by: Oleg Lomaka < oleg AT lomaka DOT org DOT ua >
Tested by: Oleg Lomaka < oleg AT lomaka DOT org DOT ua >
timestamps in the initial SYN packet actually use them in the rest of the
connection. Unfortunately, during the 7.0 testing cycle users have already
found network devices that violate this constraint.
RFC 1323 states 'and may send a TSopt in other segments' rather than
'and MUST send', so we must allow it.
Discovered by: Rob Zietlow
Tracked down by: Kip Macy
PR: bin/118005
publicly available datasheet for Yukon II and don't know what
bug/workaround exist for the specific hardware revision. Also I don't
think the vendor will release hardware errata in near future.
The hardware feature lists were not used at all except setting water
mark registers. Since msk(4) should know exact chip model/revision
number to decide which hardware capability could be used the extra
feature lists were redundant.
o Enable jumbo frame support for EC Ultra and disable jumbo frame
for FE.
o Enable store and forward mode for standard MTU sized frame.
o Enable TSO for EC Ultra. However TSO/checksum offload is disabled
for jumbo frame case. Because EC Ultra can't use store and forward
mode for jumbo frame TSO/checksum offload is not available.
o Adjust Tx GMAC almost empty threshold value and add a jumbo frame
water mark. The maic value was obtained from Marvell's sk98lin
driver.
o Fix EC Ultra chip revision number.
rwlocks in conjuction with callouts. The function does basically what
callout_init_mtx() alredy does with the difference of using a rwlock
as extra argument.
CALLOUT_SHAREDLOCK flag can be used, now, in order to acquire the lock only
in read mode when running the callout handler. It has no effects when used
in conjuction with mtx.
In order to implement this, underlying callout functions have been made
completely lock type-unaware, so accordingly with this, sysctl
debug.to_avg_mtxcalls is now changed in the generic
debug.to_avg_lockcalls.
Note: currently the allowed lock classes are mutexes and rwlocks because
callout handlers run in softclock swi, so they cannot sleep and they
cannot acquire sleepable locks like sx or lockmgr.
Requested by: kmacy, pjd, rwatson
Reviewed by: jhb
Revert the probe in atapi-cd.c to the old usage now its fixed on AHCI.
THis change also fixes using virtual CD's om fx parallels.
Still leaves the GEOM problem of telling media vs device access apart in the access function.
server-side RPC retranmission cache for non-idempotent operations: these
hacks substituted 0 (success) for the expected EEXIST in the event that
a target name already existed for LINK, SYMLINK, and MKDIR operations,
under the assumption that EEXIST represented a second application of the
original RPC rather than a true failure.
Background: certain NFS operations (in this case, LINK, SYMLINK, and
MKDIR) are not idempotent, as they leave behind persisting state on the
server that prevents them from being replayed without an error;if an UDP
RPC reply is lost leading to a retransmission by theclient, the second
reply will return EEXIST rather than success, asthe new object has
already been created. The NFS client previouslysilently mapped the
EEXIST return into success to paper over thisproblem.
However, in all modern NFS server implementations, a reply cache is kept
in order to retransmit the original reply to a retransmitted request,
rather than performing the operation a second time, allowing this hack
to be avoided. This allows link()-based filelocking over NFS to operate
correctly, as an application requestingthe creation of a new link for a
file to tell if it succeededatomically or not.
Other NFS clients, including Solaris and Linux, generally follow this
behavior for the same reasons. Most clients also now default to TCP,
which also helps avoid the issue of retransmitted but non-idempotent
requests in most cases.
Reported by: Adam McDougall <mcdouga9 at egr dot msu dot edu>,
Timo Sirainen <tss at iki dot fi>
Reviewed by: mohans
MFC after: 1 week
o buffered write, for chunks smaller than PIPE_MINDIRECT bytes
o direct write, for everything else
A call to writev(2) may receive struct iov of various size and the
kernel may have to switch from one solution to the other. Before doing
this, it must wake reader processes and any select/poll/kqueue up.
This commit fixes a bug where select/poll/kqueue are not triggered
when switching from buffered write to direct write. It adds calls to
pipeselwakeup().
I give more details on freebsd-arch@:
http://lists.freebsd.org/pipermail/freebsd-arch/2007-September/006790.html
This should fix issues with Erlang (lang/erlang) and kqueue.
Reported by: Rickard Green (Erlang)
time ago (2002 according to the gcc log). Using the proper name
fixes a warning in src/lib/libc/gen/ulimit.c about the second
argument of va_start() not being the last named (when it really
was).
This has the following benefits:
- allows to use the AT keyboard maps in share/syscons/keymaps with
sunkbd(4),
- allows to use kbdmux(4) with sunkbd(4),
- allows Sun RS232 keyboards to be configured and used the same
way as Sun USB keyboards driven by ukbd(4) (which also does AT
keyboard emulation) with X.Org, putting an end to the problem
of native support for the former in X.Org being broken over and
over again.
MFC after: 3 days
an unified way for all the lock primitives to express lock assertions.
Currenty, lockmgrs and rmlocks don't have assertions, so just panic in
that case.
This will be a base for more callout improvements.
Ok'ed by: jhb, jeff
strees2 suite, to quote his letter, this change:
1. It removes the tn_lookup_dirent stuff. I think this cannot be fixed,
because nothing protects vnode/tmpfs node between lookup is done, and
actual operation is performed, in the case the vnode lock is dropped.
At least, this is the case with the from vnode for rename.
For now, we do the linear lookup in the parent node. This has its own
drawbacks. Not mentioning speed (that could be fixed by using hash), the
real problem is the situation where several hardlinks exist in the dvp.
But, I think this is fixable.
2. The patch restores the VV_ROOT flag on the root vnode after it became
reclaimed and allocated again. This fixes MPASS assertion at the start
of the tmpfs_lookup() reported by many.
Submitted by: kib
First, a file is mmap(2)ed and then mlock(2)ed. Later, it is truncated.
Under "normal" circumstances, i.e., when the file is not mlock(2)ed, the
pages beyond the EOF are unmapped and freed. However, when the file is
mlock(2)ed, the pages beyond the EOF are unmapped but not freed because
they have a non-zero wire count. This can be a mistake. Specifically,
it is a mistake if the sole reason why the pages are wired is because of
wired, managed mappings. Previously, unmapping the pages destroys these
wired, managed mappings, but does not reduce the pages' wire count.
Consequently, when the file is unmapped, the pages are not unwired
because the wired mapping has been destroyed. Moreover, when the vm
object is finally destroyed, the pages are leaked because they are still
wired. The fix is to reduce the pages' wired count by the number of
wired, managed mappings destroyed. To do this, I introduce a new pmap
function pmap_page_wired_mappings() that returns the number of managed
mappings to the given physical page that are wired, and I use this
function in vm_object_page_remove().
Reviewed by: tegge
MFC after: 6 weeks
If it is set to zero value (default) dummynet module will try to emulate
real link as close as possible (bandwidth & latency): packet will not leave
pipe faster than it should be on real link with given bandwidth.
(This is original behaviour of dummynet which was altered in previous commit)
If it is set to non-zero value only bandwidth is enforced: packet's latency
can be lower comparing to real link with given bandwidth.
- Document recently introduced dummynet(4) sysctl variables.
Requested by: luigi, julian
MFC after: 3 month
with ACCESSPERMS. Document in mount_ntfs(8) only the nine
low-order bits of mask are used (taken from mount_msdosfs(8)).
PR: kern/114856
Submitted by: Ighighi
MFC after: 1 month
In case attach fails because of the priv check we leaked the
memory and left so_pcb as fodder for invariants.
Reported by: Pawel Worach
Reviewed by: rwatson
- Implement timing out of VPD register access.[1]
- Fix an off-by-one error of freeing malloc'd space when checksum is invalid.
- Fix style(9) bugs, i.e., sizeof cannot be followed by space.
- Retire now obsolete 'hw.pci.enable_vpd' tunable.
Submitted by: cokane (initial revision)[1]
Reviewed by: marius (intermediate revision)
Silence from: jhb, jmg, rwatson
Tested by: cokane, jkim
MFC after: 3 days
The register layout is little different from memory-mapped stats
in the previous generation chips. In fact, it is bad because
registers in this range are cleared after reading them.
Reviewed by: scottl
MFC after: 3 days
- Trying to eliminate another racing by replacing the timeout(9) with
callout APIs. In addition to that, the callout_drain() in an_detach()
help us to avoid a possible panic-on-free due to the callout API tries
to lock a destroyed mutex.
- In an_stats_update(), check the return value of an_read_record(). This
should reduce the chance of device removal(PCCARD) panic [2].
- Adding a comment to state the fact that an_stats_update() is now called
via callout(9) with a lock held [2].
Submitted by: jhb [1], ambrisko [2]
Reviewed by: jhb, ambrisko
Reported by: dhw
Tested by: dhw
MFC after: 3 days
priorities of the technologies supported by 802.3 Selector Field
value.
1000BASE-T full duplex
1000BASE-T
100BASE-T2 full duplex
100BASE-TX full duplex
100BASE-T2
100BASE-T4
100BASE-TX
10BASE-T full duplex
10BAST-T
However PHY drivers didn't honor the order such that 100BASE-T4 had
higher priority than 100BASE-TX full duplex. Fix that long standing
bugs such that have PHY drivers choose the highest common denominator
ability.
Fix a bug in dcphy which inadvertently aceepts 100BASE-T4.
PR: 92599
- Populate the register values for the trapframe put on the stack by the
double fault handler.
- Teach DDB's trace routine to treat a double fault like other trap frames.
MFC after: 3 days
process_fini, thread_ctor, thread_dtor, thread_init, thread_fini. This
will allow us to extend dynamically areas in proc/thread for dtrace ;-)
Reviewed by: rwatson
- process_ctor,dtor, init and fini
- thread_ctor,dtor, init and fini
This allows the ability to add on additional things
during construction/destruction of threads and processes.
Reviewed by: rwatson