an incompatible conversion from a 64-bit pointer to a 32-bit integer on
64-bit platforms. We will investigate whether Solaris uses a 64-bit
token here, or a new record here, in order to avoid truncating user
pointers that are 64-bit. However, in the mean time, truncation is fine
as these are rarely/never used fields in audit records.
Obtained from: TrustedBSD Project
vnode is from a file system that is not MPSAFE, as vrele() expects
Giant to be held when it is called on a non-MPSAFE vnode.
Spotted by: kris
Tested by: glebius
to sbus_mdvec.dv_clock as sbus_mdvec.dv_clock is meant to be specified
in MHz. While this was a bug it shouldn't have affected FreeBSD/sparc64
as sbus_mdvec.dv_clock is used to limit the clock rate of chips when
a machine isn't able to support them at maximum speed which isn't the
case for sun4u machines.
- Remove the code that checks whether the clock frequency returned by
sbus_get_clockfreq() is 0 and falls back to 25MHz if it is as that's
already done in sbus(4).
Approved by: mjacob
MFC after: 3 days
store some pipe pointers on stack. If user reconfigures dummynet
in the interlock gap, we can work with freed pipes after relock.
To fix this, we decided not to send packets in transmit_event(),
but fill a queue. At the end of dummynet() and dummynet_io(),
after the lock is dropped, if there is something in the queue
we run dummynet_send() to process the queue.
In collaboration with: ru
dedicated to storing pv entries, originally so that kva didn't have to be
allocated at inconvenient times. For amd64, we can get the same effect by
using the direct map area. Allocating pages is the same as with the object
backed method, but now we can just lookup the page in the direct map area.
Thus, no more pageable kva is reserved. This is the single largest
consumer of kva on our work machines and this change should help conserve
the fixed size 2GB pageable kva on the amd64 kernel.
There are a pair of sysctl nodes introduced, named the same as their
tunable counterparts. vm.pmap.shpgperproc and vm.pmap.pv_entry_max
They work just like the tunables of the same path, except the values are
linked. The pv entry cap is now dynamically changeable.
I didn't make them totally unlimited because we need some sort of safety
limit still. One could consume all physical memory without a cap.
the system when brelse() was called with B_RELBUF set on the buffer. This
could be a problem when the system was low on memory, had many buffers on
QUEUE_EMPTYKVA and started to traverse directories. For each getnewbuf(),
pages were allocated from the system, driving the free reserve downwards.
For each brelse(), the system put the buffer on QUEUE_CLEAN, with B_INVAL
set.
This commit changes the semantics of B_RELBUF to also free pages from
non-VMIO buffers.
Reviewed by: alc
routines.
- Add or replace cpu_spinwait() with DELAY(1) to a few of the busy
loops when reading from the controller to work around firmware bugs
which can crash the controller.
with pseudo header for tcp/udp packets). This could save one in_pseudo() call
per incoming tcp/udp packet.
Approved by: glebius (mentor)
MFC after: 3 weeks
filtering mechanisms to use the new rwlock(9) locking API:
- Drop the variables stored in the phil_head structure which were specific to
conditions and the home rolled read/write locking mechanism.
- Drop some includes which were used for condition variables
- Drop the inline functions, and convert them to macros. Also, move these
macros into pfil.h
- Move pfil list locking macros intp phil.h as well
- Rename ph_busy_count to ph_nhooks. This variable will represent the number
of IN/OUT hooks registered with the pfil head structure
- Define PFIL_HOOKED macro which evaluates to true if there are any
hooks to be ran by pfil_run_hooks
- In the IP/IP6 stacks, change the ph_busy_count comparison to use the new
PFIL_HOOKED macro.
- Drop optimization in pfil_run_hooks which checks to see if there are any
hooks to be ran, and returns if not. This check is already performed by the
IP stacks when they call:
if (!PFIL_HOOKED(ph))
goto skip_hooks;
- Drop in assertion which makes sure that the number of hooks never drops
below 0 for good measure. This in theory should never happen, and if it
does than there are problems somewhere
- Drop special logic around PFIL_WAITOK because rw_wlock(9) does not sleep
- Drop variables which support home rolled read/write locking mechanism from
the IPFW firewall chain structure.
- Swap out the read/write firewall chain lock internal to use the rwlock(9)
API instead of our home rolled version
- Convert the inlined functions to macros
Reviewed by: mlaier, andre, glebius
Thanks to: jhb for the new locking API
- td_ar to struct thread, which holds the in-progress audit record during
a system call.
- p_au to struct proc, which holds per-process audit state, such as the
audit identifier, audit terminal, and process audit masks.
In the earlier implementation, td_ar was added to the zero'd section of
struct thread. In order to facilitate merging to RELENG_6, it has been
moved to the end of the data structure, requiring explicit
initalization in the thread constructor.
Much help from: wsalamon
Obtained from: TrustedBSD Project
option. We always build audit_syscalls.c so that the system call
stubs can return ENOSYS rather than the system call code
generating SIGSYS for the system calls. We are not yet ready to
add AUDIT to LINT, as the prototypes for system call arguments
won't be there until after the system calls for audit are added.
Much work from: wsalamon
Obtained from: TrustedBSD Project
- Management of audit state on processes.
- Audit system calls to configure process and system audit state.
- Reliable audit record queue implementation, audit_worker kernel
thread to asynchronously store records on disk.
- Audit event argument.
- Internal audit data structure -> BSM audit trail conversion library.
- Audit event pre-selection.
- Audit pseudo-device permitting kernel->user upcalls to notify auditd
of kernel audit events.
Much work by: wsalamon
Obtained from: TrustedBSD Project, Apple Computer, Inc.
couple of FreeBSD-specific modifications that may be merged out
later). These include files define the basic audit data
structures, types, and definitions use by the kernel, or shared
by the kernel and user space.
Obtained from: TrustedBSD Project, Apple Computer, Inc.
capability is present as not all devices supported by the agp_i810 driver
(such as i915) have the AGP capability. Instead, add an identify routine
to the agp_i810 driver that uses the PCI ID to determine if it should
create an agp child device.
to process. It could give us [significant?] perfomance increase if there is big
difference between RX/TX flows.
Submitted by: Mihail Balikov <mihail.balikov AT interbgc DOT com>
Approved by: glebius (mentor)
MFC after: 3 days
synchronized on every call of bge_poll_locked().
Suggested by: Mihail Balikov <mihail.balikov AT interbgc DOT com>
Approved by: glebius (mentor)
MFC after: 3 days
2) add missing bus_dmamap_sync() call in bge_intr()
Tested by: Husnu Demir <hdemir AT metu DOT edu DOT tr>
Approved by: glebius (mentor)
MFC after: 3 days
and signifincantly improve the readability of ip_input() and
ip_output() again.
The resulting IPSEC hooks in ip_input() and ip_output() may be
used later on for making IPSEC loadable.
This move is mostly mechanical and should preserve current IPSEC
behaviour as-is. Nothing shall prevent improvements in the way
IPSEC interacts with the IPv4 stack.
Discussed with: bz, gnn, rwatson; (earlier version)
The former type, size_t, was causing truncation to 32 bits on i386,
which immediately led to undersizing of VM objects backed by
files >4GB. In particular, sendfile(2) was broken for such files.
PR: kern/92243
MFC after: 5 days
without Giant held. Do this by tracking the vfslocked state for
the directory seperate from the child. This is only important
in the case where we cross a mountpoint.
Sponsored by: Isilon Systems, Inc.
MFC After: 3 days
on a lock held the last usecount ref on a vnode and the lock failed we
would not call INACTIVE. Solve this by only holding a holdcnt to prevent
the vnode from disappearing while we wait on vn_lock. Other callers
may now VOP_INACTIVE while we are waiting on the lock, however this race
is acceptable, while losing INACTIVE is not.
Discussed with: kan, pjd
Tested by: kkenn
Sponsored by: Isilon Systems, Inc.
MFC After: 1 week
directory. vrele() may lock the passed vnode, which in these cases would
give an invalid lock order of child -> parent. These situations are
deadlock prone although do not typically deadlock because the vrele
is typically not releasing the last reference to the vnode. Users of
vrele must consider it as a call to vn_lock() and order it appropriately.
MFC After: 1 week
Sponsored by: Isilon Systems, Inc.
Tested by: kkenn
in order to support the on-board LANCE in Ultra 1 and to the MI NOTES as
it should work just fine with the AMD PCnet family of chips on all archs
but is not yet meant to replace lnc(4). If a kernel includes all of le(4),
lnc(4) and pcn(4) precedence is given to lnc(4)/pcn(4) for now.
will be sent if there is an address on the bridge. Exclude the bridge from the
special arp handling.
This has been tested with all combinations of addresses on the bridge and members.
Pointed out by: Michal Mertl
- code expects memcmp() to return a signed value, our memcmp() returns 0 if
args are equal and > 0 if not.
- It's possible to hijack interface for static entry. If bridge recieves
packet from interface marked as learning it will replace the bridge_rtnode
entry for the source address even if such entry marked as static.
Submitted by: Gleb Kurtsov <k-gleb yandex.ru>
MFC after: 3 days
try to use the registrant for numbers in this file, not the OEM that
put their label on it. Use PNY's real number 0x154b. Add another PNY
atachmate with quirks from a PR filed a while ago, but that I can't
seem to find now...
This logic change was introduced in revision 1.74:
Correct an oversight in jail() that allowed processes in jail to access
ptys in ways that might be unethical, especially towards processes not in
jail, or in other jails.
It should be fine to allow root in the host environment to do this. This
allows for more effective monitoring of prisons from the host environment.
Discussed with: rwatson
MFC after: 1 week
beginning and simply refuse to attach to a parent without either
flag.
Our network stack cannot handle well IFF_BROADCAST or IFF_MULTICAST
on an interface changing on the fly. E.g., IP will or won't assign
a broadcast address to an interface and join the all-hosts multicast
group on it depending on its IFF_BROADCAST and IFF_MULTICAST settings.
Should the flags alter later, IP will miss the change and keep using
bogus settings. This can lead to evil things like supplying an
invalid broadcast address or trying to leave a multicast group that
hasn't been joined. So just avoid touching the flags since an
interface was created. This has no practical purpose.
Discussed with: -net, glebius, oleg
MFC after: 1 week
from NetBSD. This driver actually can replace lnc(4). Advantages over
lnc(4) are:
- Cleaner and more flexible regarding MD needs.
- Endian-clean and MPSAFE.
- Supports ALTQ, VLAN_MTU, ifmedia.
- Uses 32bit DMA for the PCI variants.
This commit includes front-ends for the dma(4) pseudo-bus found on SBus-
based sparc64 machines (thus supports the on-board LANCE in Sun Ultra 1)
and PCI. In order to actually replace lnc(4), the front-ends for ISA and
the PC98 CBUS would have to be added but for which I don't have hardware
to test.
Reviewed and some improvements by: yongari
Tested on: i386, sparc64
- Like lsi64854_scsi_intr() return -1 in case there was a DMA error so
the caller can distinguish it from a normal interrupt and leave the
reset of the DMA engine to the caller so we don't kill any state there.
- Move the static 'dodrain' flag to struct lsi64854_softc as there can
be more than one LSI64854 used for a LANCE in a system and reset it
again once draining the E-cache is done so we don't keep draining the
cache with every interrupt.
- Remove calling sc->sc_intrchain(), we will call lsi64854_enet_intr()
via sc->intr() in the interrupt handler of the LANCE driver and not
use it in chained mode.
o lsi64854_pp_intr():
- Like lsi64854_scsi_intr() return -1 in case there was a DMA error so
the caller can distinguish it from a normal interrupt.
o Remove the no longer used sc_intrchain* from struct lsi64854_softc.
o Make lsi64854_reset(), lsi64854_setup*() and lsi64854_*_intr() static
to lsi64854.c as we do and will only call them via the respective
function pointers in struct lsi64854_softc.
o While here fix style(9) bugs (variable definition inside a nested scope).
It detects both: buffer underflows and buffer overflows bugs at runtime
(on free(9) and realloc(9)) and prints backtraces from where memory was
allocated and from where it was freed.
Tested by: kris
interrupt handler for the LANCE devices and remove dma_setup_intr(). We
just can't completely ignore the DMA engine in a LANCE driver anyway and
calling the DMA engine interrupt handler in the LANCE driver directly
allows to cover it by the LANCE driver lock.
work by yar, thompsa and myself. The checksum offloading part also involves
work done by Mihail Balikov.
The most important changes:
o Instead of global linked list of all vlan softc use a per-trunk
hash. The size of hash is dynamically adjusted, depending on
number of entries. This changes struct ifnet, replacing counter
of vlans with a pointer to trunk structure. This change is an
improvement for setups with big number of VLANs, several interfaces
and several CPUs. It is a small regression for a setup with a single
VLAN interface.
An alternative to dynamic hash is a per-trunk static array with
4096 entries, which is a compile time option - VLAN_ARRAY. In my
experiments the array is not an improvement, probably because such
a big trunk structure doesn't fit into CPU cache.
o Introduce an UMA zone for VLAN tags. Since drivers depend on it,
the zone is declared in kern_mbuf.c, not in optional vlan(4) driver.
This change is a big improvement for any setup utilizing vlan(4).
o Use rwlock(9) instead of mutex(9) for locking. We are the first
ones to do this! :)
o Some drivers can do hardware VLAN tagging + hardware checksum
offloading. Add an infrastructure for this. Whenever vlan(4) is
attached to a parent or parent configuration is changed, the flags
on vlan(4) interface are updated.
In collaboration with: yar, thompsa
In collaboration with: Mihail Balikov <mihail.balikov interbgc.com>
this is more consistent with the placement of slaves in /dev/pts. The
actual name doesn't matter as it's not part of the exposed API or used by
libc. In some sense, it would be nice if these device nodes didn't have to
have names in devfs at all.
Suggested by: Stephen McKay <smckay at internode dot on dot net>
however IPv4-in-IPv4 tunnels are now stable on SMP. Details:
- Add per-softc mutex.
- Hold the mutex on output.
The main problem was the rtentry, placed in softc. It could be
freed by ip_output(). Meanwhile, another thread being in
in_gif_output() can read and write this rtentry.
Reported by: many
Tested by: Alexander Shiryaev <aixp mail.ru>