to intptr_t. This fixes a compiler warning (integer from pointer
without cast) in scvgarndr.c when SC_PIXEL_MODE is defined.
o Define readb() and writeb(). Both are used in scvgarndr.c when,
guess what, SC_PIXEL_MODE is defined.
Both changes are ia64 specific.
Found by: LINT
reacquire the "first" object's lock while a backing object's lock is held.
Since this is a lock-order reversal, vm_fault() uses trylock to acquire
the first object's lock, skipping the sequential access optimization in
the unlikely event that the trylock fails.
in struct vm_page are defined as u_int for 16K pages and u_long
for 32K pages, with the implied assumption that long will at least
be 64 bits wide on platforms where we support 32K pages.
from the ia32 specific stuff. Some of this still needs to move to the MI
freebsd32 area, and some needs to move to the MD area. This is still
work-in-progress.
commands. Add a quirk for the Creative Nomad MuVo USB device that uses
it as well as NO_SYNCHRONIZE_CACHE.
PR: kern/53094
Submitted by: Richard Nyberg <rnyberg@it.su.se>
MFC after: 3 days
device should handle them.
This prevents for instance GEOM::ioctl requests from reaching a
lower BSDlabel node, which ps@ found would confuse newfs(8).
CB710, CB720, CB1211, CB1225, CB1410 and CB1420
These are likely licensed designed from TI, and the Linux PCMCIA code
treats them as TI chips.
Add comment, but no ID for the 711E1 from O2Micro.
boot time. Instead, read it a sector at a time. While this sounds
like a significant slowdown, I've not been able to measure any
signficant difference.
Submitted by: luigi
Reviewed by: jhb, sam (both a while ago)
MFC After: 3 days
mac_reflect_mbuf_icmp()
mac_reflect_mbuf_tcp()
These entry points permit MAC policies to do "update in place"
changes to the labels on ICMP and TCP mbuf headers when an ICMP or
TCP response is generated to a packet outside of the context of
an existing socket. For example, in respond to a ping or a RST
packet to a SYN on a closed port.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
mac_reflect_mbuf_icmp()
mac_reflect_mbuf_tcp()
These entry points permit MAC policies to do "update in place"
changes to the labels on ICMP and TCP mbuf headers when an ICMP or
TCP response is generated to a packet outside of the context of
an existing socket. For example, in respond to a ping or a RST
packet to a SYN on a closed port.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
change in mac_lomac: if both flags are set on the new label, we
may not need to always fill out the label (only if one flag is
set, not both). Avoid stomping on a section of the label if we
are in fact modifying both elements.
Because we know that both flags will be set, we don't need to
test whether the range or single are set in later consistency
checks of the range and single -- just test them.
By checking the range of the new vs. the range of the old label
before testing the single against the new range, we implicitly
test that the new single is in the old range. Document this
with a comment.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
vendors that list the vendor ID in the proper byte order. The second
section is for vendors that get it backwards. The third is for what
appear to be 'random' ones (although 0xcxxx appears to be coherent
enough that maybe somebody else is assigning those numbers).
Framework labels:
- Re-work the label state assertions to use a set of central
ASSERT_type_LABEL() assertions.
- Test to make sure labels passed to externalize/internalize calls haven't
been destroyed.
- For access control checks, assert the condition of all labels passed in.
- For life cycle events, assert the condition of all labels passed in.
- Add new entry point implementations for new MAC Framework entry points:
mac_test_reflect_mbuf_icmp(), mac_test_reflect_mbuf_tcp(),
mac_test_check_vnode_deleteextattr(), mac_test_check_vnode_listextattr().
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
mac_stub policy and no longer mac_none (as found in the repocopy).
Add comment to this effect.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
o AD1980 hook.
o ac97_fix_auxout.
and:
o Associate AC97_MIX_AUXOUT with SOUND_MIXER_OGAIN rather than
SOUND_MIXER_MONITOR.
o Add ac97_fix_tone to remove tone controls from mixer if invalid.
explicit access control checks to delete and list extended attributes
on a vnode, rather than implicitly combining with the setextattr and
getextattr checks. This reflects EA API changes in the kernel made
recently, including the move to explicit VOP's for both of these
operations.
Obtained from: TrustedBSD PRoject
Sponsored by: DARPA, Network Associates Laboratories
MAC_DEBUG_COUNTER_INC() and MAC_DEBUG_COUNTER_DEC() to maintain
debugging counter values rather than #ifdef'ing the atomic
operations to MAC_DEBUG.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
represents the pruely stylistic changes and should have no net impact
on the rest of the code.
bde's more substantive changes will follow in a separate commit once
we've come to closure on them.
Submitted by: bde
UMA_ZFLAG_INTERNAL zones at all. Apparently, Wilko's alpha
was crashing while entering multi-user because, I think, we
were calculating the garbage cachefree for pcpu caches that
essentially don't exist for at least the 'zones' zone and it so
happened that we were reading from an unmapped location.
Confirmed to fix crash: wilko
Helped debug: wilko, gallatin
from queue(3).
Improve vertical compactness by using a IGMP_PRINTF() macro rather
than #ifdefing IGMP_DEBUG a large number of debugging printfs.
Reviewed by: mdodd (SLIST changes)
specific interfaces. This is required by aodvd, and may in future help us
in getting rid of the requirement for BPF from our import of isc-dhcp.
Suggested by: fenestro
Obtained from: BSD/OS
Reviewed by: mini, sam
Approved by: jake (mentor)
are all bogus, and the cards that don't decode things quite right
often have hundreds of them. This will fix starvation of small dmesg
buffers and allow better debugging to happen. I thought about adding
an override, but there is such a thing as too many knobs. :-)
ntp_update_second twice when we have a large step in case that step
goes across a scheduled leap second. The only way this could happen
would be if we didn't call tc_windup over the end of day on the day of
a leap second, which would only happen if timeouts were delayed for
seconds. While it is an edge case, it is an important one to get
right for my employer.
Sponsored by: Timing Solutions Corporation
ultimate trigger for the follow-up fixes in revisions 1.78, 1.80,
1.81 and 1.82 of trap.c. I was simply too pre-occupied with the
gateway page and how it blurs kernel space with user space and
vice versa that I couldn't see that it was all a load of bollocks.
It's not the IP address that matters, it's the privilege level that
counts. We never run in user space with lifted permissions and we
sure can not run in kernel space without it. Sure, the gateway page
is the exception, but not if you look at the privilege level. It's
user space if you run with user permissions and kernel space otherwise.
So, we're back to looking at the privilege level like it should be.
There's no other way.
Pointy hat: marcel
- Fix up TX speed changes.
- Make mpi-350 cards sort-of work with new firmware. It RXs okay but TXs
only work for about 14 packets then fails to get an interrupt. The
TX watchdog fires. It has been reported that my hack for now doesn't
break cards with the older firmware. It appears my card has lost
the ability to RX or TX at all but other peoples cards work. I assume
it got damaged in tansport.
MFC: 1 week.
count handling of station entries in hostap mode:
Input path:
o driver is now expected to find the node associated with the
sender of a received frame; use ic_bss if none is located
o driver passes the (referenced) node into ieee80211_input for
use within the wlan module and is responsible for cleaning up
on return
o the antenna state is no longer passed up with each frame; this
is now considered driver-private state and drivers are responsible
for keeping it in the driver-private part of a node
Output path:
Revamp output path for management frames to eliminate redundant
locking that causes problems and to correct reference counting
bogosity that occurs when stations are timed out due to inactivity
(in AP mode). On output the refcnt'd node is stashed in the pkthdr's
recvif field (yech) and retrieved by the driver. This eliminates
an unref/ref scenario and related node table unlock/lock due to the
driver looking up the node. This is particularly important when
stations are timed out as this causes a lock order reversal that
can result in a deadlock. As a byproduct we also reduce the overhead
for sending management frames (minimal). Additional fallout from
this is a change to ieee80211_encap to return a refcn't node for
tieing to the outbound frame. Node refcnts are not reclaimed until
after a frame is completely processed (e.g. in the tx interrupt
handler). This is especially important for timed out stations as
this deref will be the final one causing the node entry to be
reclaimed.
Additional semi-related changes:
o replace m_copym use with m_copypacket (optimization)
o add assert to verify ic_bss is never free'd during normal operation
o add comments explaining calling conventions by drivers for frames
going in each direction
o remove extraneous code that "cannot be executed" (e.g. because
pointers may never be null)
that caused a 3-4 times slow down in performance.
(the primary Sparc64 developers are all using OFW_NEWPCI already, so it is
the best code path for users)
quietly discard them; this just permits them to be collected with bpf)
o add a counter for the number of rate control frames discarded when not in
monitor mode
o move the rx "too short" statistic in the stat structure so non-error rx stats
are together (NB: ABI change to apps that collect stats via driver ioctl)
mistakes (this mistake was not an issue because the length is only used to
decide whether or not to allocate a cluster)
o while here, move a beacon length comment to the "right place"
_pmap_allocpte(): Guarantee that the page table page is zero filled before
adding it to the directory. Otherwise, a 2nd, 3rd, etc. thread could
access a nearby virtual address and use garbage for the address
translation.
Discussed with: peter, tegge
of bw_meter entries were processed up to one second ahead.
After an unappropriate rescheduling of some of the bw_meter
entries, the upcalls weren't delivered.
* pim_register_prepare() uses the appropriate sw_csum flag to
call ip_fragment() so the IP checksum is computed properly.
* Modify pim_register_prepare() to take care of IP packets that
don't need fragmentation.
* Add-back in_delayed_cksum() to encap_send(), because it seems it
should be there.
Submitted by: Pavlin Radoslavov <pavlin@icir.org>
also fixes pfs_access() since it relies on VOP_GETATTR() which will call
pfs_getattr(). This prevents jailed processes from discovering the
existence, start time and ownership of processes outside the jail.
PR: kern/48156
present, and non-zero when it is (or may be) absent. The test
cbb_child_present was backwards. However, typical usage in the tree
would cause it to do the right thing because the card really wasn't
there the OK flag would be turned on.
Also, assume that if any of these bits are turned on we don't have a
card, rather than requiring both of them in the suspend/resume
routines.
Noticed by: cognet
directories. Previously, pfs_iterate() would return -1 when it
reached the end of the process list while processing a process
directory node, even if the parent directory contained further nodes
(which is the case for the linprocfs root directory, where the process
directory node is actually first in the list). With this patch,
pfs_iterate() will continue to traverse the parent directory's node
list after exhausting the process list (as was the intention all
along). The code should hopefully be easier to read as well.
While I'm here, have pfs_iterate() assert that the allproc lock is
held.
to what is in NetBSD. I have a few cards that tickles this bug, and
this just keeps us from panicing. It doesn't actually fix the problem
(that will happen once I figure out why some cards hate the address
their CIS is mapped to high memory).
binaries in /bin and /sbin installed in /lib. Only the versioned files
reside in /lib, the .so symlink continues to live /usr/lib so the
toolchain doesn't need to be modified.
like we have on other platforms. Move savectx() to <machine/pcb.h>.
A lot of files got these MD prototypes through the indirect inclusion
of <machine/cpu.h> and now need to include <machine/md_var.h>. The
number of which is unexpectedly large...
osf1_misc.c especially is tricky because szsigcode is redefined in
one of the osf1 header files. Reordering of the include files was
needed.
linprocfs.c now needs an explicit extern declaration.
Tested with: LINT
prototypes of cpu_halt(), cpu_reset() and swi_vm() from md_var.h to
cpu.h. This affects db_command.c and kern_shutdown.c.
ia64: move all MD prototypes from cpu.h to md_var.h. This affects
madt.c, interrupt.c and mp_machdep.c. Remove is_physical_memory().
It's not used (vm_machdep.c).
alpha: the MD prototypes have been left in cpu.h with a comment
that they should be there. Moving them is left for later. It was
expected that the impact would be significant enough to be done in
a seperate commit.
powerpc: MD prototypes left in cpu.h. Comment added.
Suggested by: bde
Tested with: make universe (pc98 incomplete)
A timecounter will be selected when registered if its quality is
not negative and no less than the current timecounters.
Add a sysctl to report all available timecounters and their qualities.
Give the dummy timecounter a solid negative quality of minus a million.
Give the i8254 zero and the ACPI 1000.
The TSC gets 800, unless APM or SMP forces it negative.
Other timecounters default to zero quality and thereby retain current
selection behaviour.
Sign extension happens after the shift, not before so that boundary
cases like 0x40000000 will not be caught properly.
Instead, right shift ndirty. It is guaranteed to be a multiple of 8.
While here, do some manual code motion and code commoning.
Range check bug pointed out by: iedowse
case, a "Vortex86" mini PC), the PCI device ID value in the EEPROM (0x8100)
does not agree with the PCI device ID returned by pci_get_device() (0x8139).
This means that while rl_probe() matches the device, rl_attach() doesn't.
Work around this by adding an entry to the rl_devs table for the 8100 with
a device ID of 0x8100.
Also, get rid of extra instance of __FBSDID(). One is enough.
- Update some stale comments.
- Sort a couple of includes.
- Only set 'newcpu' in updatepri() if we use it.
- No functional changes.
Obtained from: bde (via an old diff I got a long time ago)
get the same value from ip->i_ump->um_devvp.
This saves a pointer in the memory copies of inodes, which can
easily run into several hundred kilobytes.
The extra indirection is unmeasurable in benchmarks.
Approved by: mckusick
- Add a macro for the logical shift needed to extract an APIC ID from
either from the local APIC ICR Hi register or the APIC ID registers of
the local and IO APICs.
This code dates back to the very first diskless support on FreeBSD,
back when swapon(8) couldn't simply be run on a NFS backed file.
Suggested replacement command sequence on the client:
dd if=/dev/zero of=/swapfile bs=1k count=1 oseek=100000
swapon /swapfile
rm -f /swapfile
For whatever value of 100000 you want.
Minor code reorganization was required, but the only functional
change was that the first 1024 bytes of output are thrown out
after each reseed, rather than just the initial seed.
was masked. However KIMURA Yasuhiro-san noticed my mistake and was
kind enough to provide a better patch in PR 55581. I've merged that
into the routine. Hopefully I've not overlooked anything this time.
MFC After: 5 days
that were on the kernel stack into account. For now we write them
out to the register stack of the process before creating the dump.
This however is not the final solution. The problem is that we may
invalidate the coredump by overwriting vital information due to an
invalid backing store pointer. Instead we need to write the dirty
registers to an unused region of VM which will result in a seperate
segment in the coredump. For now we can at least get to all the
registers from a coredump.
and the move to control register to avoid dependency violations when
these functions are used. Note that explicit data and instruction
serialization also need to be in a subsequent instruction group.
This too requires that we have an igrp break here.
PT_SETKSTACK. These requests allow the tracing process to access the
dirty registers of the traced process that are on the kernel stack.
Note that there's currently no way to access the rnat register for
those dirty registers that are not (yet) covered by a nat collection
point. The interface for this is still being slept on.
Also note that implied by these requests is the division of work:
The tracing process has to keep track of where registers are spilled
and is responsible to figure out where the NaT bit of the stacked
registers are at any time during the execution of the traced process.
The kernel provides the interfaces but will not abstract the fact
that the register stack can be split. This model does not follow
the approach taken in Linux where PT_PEEK and PT_POKE deals with
this automagically.
check for permissions, do it for all requests, not the known requests.
Later when we actually service the request we deal with the invalid
requests we previously caught earlier.
This commit changes the behaviour of the ptrace(2) interface for
boundary cases such as an unknown request without proper permissions.
Previously we would return EINVAL. Now we return EBUSY or EPERM.
Platforms need to define __HAVE_PTRACE_MACHDEP when they have MD
requests. This makes the prototype of cpu_ptrace() visible and
introduces a call to this function for all requests greater or
equal to PT_FIRSTMACH.
Silence on: audit
Correctly handle additional disks without BIOS partition tables.
Previously, vinum_scandisk stopped scanning additional disks for
native partitions after any good partition was found. This applies
to all platforms, but was a particular problem on systems without
BIOS partition tables.
Submitted by: harti
the hardware mutex if it is held. Re-add calls to Enable/Clear fixed events.
This is not known to have caused problems. Bug symptoms might have included
instability after an aborted suspend attempt or power/sleep buttons not
being enabled.
kobj global method table; also kassert that the table has not overflowed
when defining a new method.
there are indications that the table is being overflowed in certain
situations as we gain more kobj consumers- this will allow us to check
whether kobj is at fault. symptoms would be incorrect methods being called.
CP-168U board. It initializes and attaches in the same way as the
older (but higher performance) C168H. The only difference is the
board ID, which is 0x1681.
PR: kern/53548
Submitted by: regnauld@catpipe.net
MFC after: 1 week
the Palm device and the USB host controller deadlock. The USB host
controller is expecting an early-end-of-transmission packet with 0
data, and the Palm doesn't send one because it's already communicated
the amount of data it's going to send in a header (which ucom/uvisor
are oblivious to). This is the problem that has been known on the
pilot-link lists as the "[Free]BSD USB problem", but not understood.
Submitted by: Nathan J. Williams <nathanw@MIT.EDU>
multi-fragment transmission. I'm not sure if this is a bug or a requirement
that I overlooked with going through the documentation, but the sample
8169 NIC that I have seems to require it at least some of the time or
else it botches TCP checksums on segments that span multiple descriptors.
to override the method pointers for manipulating nodes; this fixes
a problem where the ic_bss node was not being created properly
for the ath driver causing the driver to scribble on random memory.
Noticed by: David Young <dyoung@pobox.com>
rate set element id from an AP. This allows stations to associate with
AP's that violate the 802.11 spec by sending >8 rates. This corrects a
recent regression; older code did likewise.
for partly-aligned operations through /dev/crypto (unlikely)
o add missing case in iov code that never showed up because of the above bug
Submitted by: "Jason L. Wright" <jason@thought.net>
MFC after: 3 days
to walk the list and remove the current item and destroy/free it.
Alexander Kabaev will likely do the equivalent for the other list
types, but I just happened to have this one sitting in a local
non-FreeBSD tree already.
segments are lost for the application. This broke, for example,
ports/benchmarks/dbs which needs the SYN segment to filter the
contents of the trace buffer for the connection it is interested in.
This patch makes the SYN segments available again. Unfortunately they
are now associated with the listening socket instead of the new one, so
a change to applications is required, but without this patch it wouldn't
work altogether.
PR: kern/45966
code has rotten a bit so that the header length is not correct at
the point when tcp_trace is called. Temporarily compute the correct
value before the call and restore the old value after. This makes
ports/benchmarks/dbs to almost work.
This is a NOP unless you compile with TCPDEBUG.
in tcp_input that leave the function before hitting the tcp_trace
function call for the TCPDEBUG option. This has made TCPDEBUG mostly
useless (and tools like ports/benchmarks/dbs not working). Add
tcp_trace calls to the return paths that could be identified in this
maze.
This is a NOP unless you compile with TCPDEBUG.
in user space or kernel space. VM_MIN_KERNEL_ADDRESS starts after the
gateway page, which means that improper memory accesses to the gateway
page while in user mode would panic the kernel. Use VM_MAX_ADDRESS
instead. It ends before the gateway page. The difference between
VM_MIN_KERNEL_ADDRESS and VM_MAX_ADDRESS is exactly the gateway page.
move to ar.rsc. The RSE must be in enforced lazy mode when writing
to RSE modifyable registers. In this case we restore the RSE NaT
collection register ar.rnat. I have seen 2 general exception faults
on pluto1 now that indicate that the move to ar.rsc has already
happened prior to the move to ar.rnat, meaning that the RSE is not
in enforced lazy mode anymore. The ia64 dependency and instruction
ordering rules seem to allow having both registers written to in
the same instruction group, provided ar.rsc is written to later than
ar.rnat (based on the ordering semantics). It appears that we may
be pushing our luck. For now, put them in seperate cycles (by means
of the instruction group break). If we ever get a general exception
fault on the move to ar.rnat again, we have definite proof that
something else is fishy.