improve performance:
- Eliminate custom reference count and condition variable to monitor
threads entering the framework, as this had both significant overhead
and behaved badly in the face of contention.
- Replace reference count with two locks: an rwlock and an sx lock,
which will be read-acquired by threads entering the framework
depending on whether a give policy entry point is permitted to sleep
or not.
- Replace previous mutex locking of the reference count for exclusive
access with write acquiring of both the policy list sx and rw locks,
which occurs only when policies are attached or detached.
- Do a lockless read of the dynamic policy list head before acquiring
any locks in order to reduce overhead when no dynamic policies are
loaded; this a race we can afford to lose.
- For every policy entry point invocation, decide whether sleeping is
permitted, and if not, use a _NOSLEEP() variant of the composition
macros, which will use the rwlock instead of the sxlock. In some
cases, we decide which to use based on allocation flags passed to the
MAC Framework entry point.
As with the move to rwlocks/rmlocks in pfil, this may trigger witness
warnings, but these should (generally) be false positives as all
acquisition of the locks is for read with two very narrow exceptions
for policy load/unload, and those code blocks should never acquire
other locks.
Sponsored by: Google, Inc.
Obtained from: TrustedBSD Project
Discussed with: csjp (idea, not specific patch)
(1) Fix pcib_read/write_config prototypes.
(2) When contrainting a resource request for a 'subtractive' bridge,
it is important to select a range outside the base/limit
registers, since those are the only values known to not
possibly work. On my HP laptop, the base bridge excludes I/O
ports 0xa000-0xafff, however that was the range we were passing
up the tree. Instead, when a range spans the "hole" we now
arbitrarily pick the range just above the hole to allocate from.
All of my rl and xl cards, at a minimum, started working again on this
laptop with those fixes.
- When sending large PR-SCTP messages over a
lossy link we would incorrectly calculate the fwd-tsn
- When receiving large multipart pr-sctp packets we would
incorrectly send back a SACK that would renege improperly
on already received packets thus causing unneeded retransmissions.
is calculated as 0 which causes errors elsewhere.
Submitted by: KOIE Hidetaka <koie@suri.co.jp>
- When sched_affinity() is called with a thread that is not curthread we
need to handle the ON_RUNQ() case by adding the thread to the correct
run queue.
Submitted by: Justin Teller <justin.teller@gmail.com>
MFC after: 1 Week
".note.ABI-tag" section.
The search order of a brand is changed, now first of all the
".note.ABI-tag" is looked through.
Move code which fetch osreldate for ELF binary to check_note() handler.
PR: 118473
Approved by: kib (mentor)
booting because the CD driver did not use bounce buffers to ensure
request buffers sent to the BIOS were always in the first 1MB. Copy over
the bounce buffer logic from the BIOS disk driver (minus the 64k boundary
code for floppies) to fix this.
Reported by: kensmith
look like a temperature.
This driver will most likely be renamed to something more meaningful in
the near future.
Submitted by: nork
MFC after: 2 weeks
o add 9280 attach that sets up ini, cal, etc.
o new rf backend for 9280 and later parts
o split ini setup and spur mitigation support out to methods
and provide 9280-specific support
o minor fixups to shared code to handle 9280-specific work
Obtained from: Atheros (ini values and some code)
Provide a custom lock around initializing and tearing down EA area,
to prevent both memory leaks and double-free of it. Count the number
of EA area accessors.
Lock protocol requires either holding exclusive vnode lock to modify
i_ea_area, or shared vnode lock and owning IN_EA_LOCKED flag in i_flag.
Noted by: YAMAMOTO, Taku <taku tackymt homeip net>
Tested by: pho (previous version)
MFC after: 2 weeks
both disks, or if we should suppress the slave drive. Default to
suppressing the slave, in the case that this REQIURED tuple turns out
to not actually be present...
Based on the HAL preemption lock there is a problem on SMP machines
and causes a panic.
o When a device detached the current tactic to detach NDIS USB driver is
to call SURPRISE_REMOVED event. So it don't need to call
ndis_halt_nic() again. This fixes some page faults when some drivers
work abnormal.
o it assumes now that URB_FUNCTION_BULK_OR_INTERRUPT_TRANSFER is in
DISPATCH_LEVEL (non-sleepable) and as further work
URB_FUNCTION_VENDOR_XXX and URB_FUNCTION_CLASS_XXX should be.
Reviewed by: Hans Petter Selasky <hselasky_at_freebsd.org>
Tested by: Paul B. Mahol <onemda_at_gmail.com>
More HID parsing fixes for usb mice.
- be less strict on the last HID item usage.
- preserve item size and count accross items
- improve default HID usage selection.
Tested by: ache
Submitted by: Hans Petter Selasky
o Header file cleanup.
o bus_dma(9) conversion.
- Removed all consumers of vtophys(9) and converted to use
bus_dma(9).
- Typhoon2 functional specification says the controller supports
64bit DMA addressing. However all Typhoon controllers are known
to lack of DAC support so 64bit DMA support was disabled.
- The hardware can't handle more 16 fragmented Tx DMA segments so
teach txp(4) to collapse these segments to be less than 16.
- Added Rx buffer alignment requirements(4 bytes alignment) and
implemented fixup code to align receive frame. Previously
txp(4) always copied Rx frame to align it on 2 byte boundary
but its copy overhead is much higher than unaligned access on
i386/amd64. Alignment fixup code is now applied only for
strict-alignment architectures. With this change i386 and
amd64 will get instant Rx performance boost. Typhoon2 datasheet
mentions a command that pads arbitrary bytes in Rx buffer but
that command does not work.
- Nuked pointer trick in descriptor ring. This does not work on
sparc64 and replaced it with bcopy. Alternatively txp(4) can
embed a 32 bits index value into the descriptor and compute
real buffer address but it may make code complicated.
- Added endianness support code in various Tx/Rx/command/response
descriptor access. With this change txp(4) should work on all
architectures.
o Added comments for known firmware bugs(Tx checksum offloading,
TSO, VLAN stripping and Rx buffer padding control).
o Prefer faster memory space register access to I/O space access.
Added fall-back mechanism to use alternative I/O space access.
The hardware supports both memory and I/O mapped access. Users
can still force to use old I/O space access by setting
hw.txp.prefer_iomap tunable to 1 in /boot/loader.conf.
o Added experimental suspend/resume methods.
o Nuke error prone Rx buffer handling code and implemented local
buffer management with TAILQ. Be definition the controller can't
pass the last received frame to host if no Rx free buffers are
available to use as head and tail pointer of Rx descriptor ring
can't have the same value. In that case the Rx buffer pointer in
Rx buffer ring still holds a valid buffer and txp_rxbuf_reclaim()
can't fill Rx buffers as the first buffer is still valid. Instead
of relying on the value of Rx buffer ring, introduce local buffer
management code to handle empty buffer situation. This should fix
a long standing bug which completely hangs the controller under
high network load. I could easily trigger the issue by sending 64
bytes UDP frames with netperf. I have no idea how this bugs was
not fixed for a long time.
o Converted ithread interrupt handler to filter based one.
o Rearranged txp_detach routine such that it's now used for general
clean-up routine.
o Show sleep image version on device attach time. This will help
to know what action should be taken depending on sleep image
version. The version information in datasheet was wrong for newer
NV images so I followed Linux which seems to correctly extract
version numbers from response descriptors.
o Firmware image is no longer downloaded in device attach time. Now
it is reloaded whenever if_init is invoked. This is to ensure
correct operation of hardware when something goes wrong.
Previously the controller always run without regard to running
state of firmware. This change will add additional controller
initialization time but it give more robust operation as txp(4)
always start off from a known state. The controller is put into
sleep state until administrator explicitly up the interface.
o As firmware is loaded in if_init handler, it's now possible to
implement real watchdog timeout handler. When watchdog timer is
expired, full-reset the controller and initialize the hardware
again as most other drivers do. While I'm here use our own timer
for watchdog instead of using if_watchdog/if_timer interface.
o Instead of masking specific interrupts with TXP_IMR register,
program TXP_IER register with the interrupts to be raised and
use TXP_IMR to toggle interrupt generation.
o Implemented txp_wait() to wait a specific state of a controller.
o Separate boot related code from txp_download_fw() and name it
txp_boot() to handle boot process.
o Added bus_barrier(9) to host to ARM communication.
o Added endianness to all typhoon command processing. The ARM93C
always expects little-endian format of command/data.
o Removed __STRICT_ALIGNMENT which is not valid on FreeBSD.
__NO_STRICT_ALIGNMENT is provided for that purpose on FreeBSD.
Previously __STRICT_ALIGNMENT was unconditionally defined for
all architectures.
o Rewrote SIOCSIFCAP ioctl handler such that each capability can be
controlled by ifconfig(8). Note, disabling VLAN hardware tagging
has no effect due to the bug of firmware.
o Don't send TXP_CMD_CLEAR_STATISTICS to clear MAC statistics in
txp_tick(). The command is not atomic. Instead, just read the
statistics and reflect saved statistics to the statistics.
dev.txp.%d.stats sysctl node provides detailed MAC statistics.
This also reduces a lot of waste of CPU cycles as processing a
command ring takes a very long time on ARM93C. Note, Rx
multicast and broadcast statistics does not seem to right. It
might be another bug of firmware.
o Implemented link state change handling in txp_tick(). Now sending
packets is allowed only after establishing a valid link. Also
invoke link state change notification whenever its state is
changed so pseudo drivers like lagg(4) that relies on link state
can work with failover or link aggregation without hacks.
if_baudrate is updated to resolved speed so SNMP agents can get
correct bandwidth parameters.
o Overhauled Tx routine such that it now honors number of allowable
DMA segments and checks for 4 free descriptors before trying to
send a frame. A frame may require 4 descriptors(1 frame
descriptor, 1 or more frame descriptors, 1 TSO option descriptor,
one free descriptor to prevent descriptor wrap-around) at least
so it's necessary to check available free descriptors prior to
setting up DMA operation.
o Added a sysctl variable dev.txp.%d.process_limit to control
how many received frames should be served in Rx handler. Valid
ranges are 16 to 128(default 64) in unit of frames.
o Added ALTQ(4) support.
o Added missing IFCAP_VLAN_HWCSUM as txp(4) can offload checksum
calculation as well as VLAN tag insertion/stripping.
o Fixed media header length for VLAN.
o Don't set if_mtu in device attach, it's already set in
ether_ifattach().
o Enabled MWI.
o Fixed module unload panic when bpf listeners are active.
o Rearranged ethernet address programming logic such that it works
on strict-alignment architectures.
o Removed unused member variables in softc.
o Added support for WOL.
o Removed now unused TXP_PCI_LOMEM/TXP_PCI_LOIO.
o Added wakeup command TXP_BOOTCMD_WAKEUP definition.
o Added a new firmware version query command, TXP_CMD_READ_VERSION.
o Removed volatile keyword in softc as bus_dmamap_sync(9) should
take care of this.
o Removed embedded union trick of a structure used to to access
a pointer on LP64 systems.
o Added a few TSO related definitions for struct txp_tcpseg_desc.
However TSO is not used at all due to the limitation of hardware.
o Redefined PKT_MAX_PKTLEN to theoretical maximum size of a frame.
o Switched from bus_space_{read|write}_4 to bus_{read|write}_4.
o Added a new macro TXP_DESC_INC to compute next descriptor index.
Tested by: don.nasco <> gmail dot com
poll(), only copy out the revents field, not the whole pollfd
structure. Otherwise, if the events field is updated
concurrently by another thread, that update may be lost.
This issue apparently causes problems for the JDK on FreeBSD,
which expects the Linux behavior of not updating all fields
(somewhat oddly, Solaris does not implement the required
behavior, but presumably our adaptation of the JDK is based
on the Linux port?).
MFC after: 2 weeks
PR: kern/130924
Submitted by: Kurt Miller <kurt @ intricatesoftware.com>
Discussed with: kib
internal sysctl_sysctl_name() handler to map the MIB array to a string
name and logs this name in the trace log. This can be useful to see
exactly which sysctls a thread is invoking.
MFC after: 1 month
filesystem supports additional operations using shared vnode locks.
Currently this is used to enable shared locks for open() and close() of
read-only file descriptors.
- When an ISOPEN namei() request is performed with LOCKSHARED, use a
shared vnode lock for the leaf vnode only if the mount point has the
extended shared flag set.
- Set LOCKSHARED in vn_open_cred() for requests that specify O_RDONLY but
not O_CREAT.
- Use a shared vnode lock around VOP_CLOSE() if the file was opened with
O_RDONLY and the mountpoint has the extended shared flag set.
- Adjust md(4) to upgrade the vnode lock on the vnode it gets back from
vn_open() since it now may only have a shared vnode lock.
- Don't enable shared vnode locks on FIFO vnodes in ZFS and UFS since
FIFO's require exclusive vnode locks for their open() and close()
routines. (My recent MPSAFE patches for UDF and cd9660 already included
this change.)
- Enable extended shared operations on UFS, cd9660, and UDF.
Submitted by: ups
Reviewed by: pjd (ZFS bits)
MFC after: 1 month
legal in the spec. Add newline to the verbose messages we print when
debugging when this happens. The Hitachi HT-4840-11 is the only card
to hit these in years, and it works well enough if we're liberal about
what we accept.
the unmanaged flag set in the PVO attributes. Without doing this,
pmap_remove() could try to remove fictitious pages (like those created
by mmap of physical memory) from the wrong UMA zone, causing a panic.
Reported by: Justin Hibbits
MFC after: 1 week
to resurrect (maybe honor foot shooting bit in kern.geom_debugflags)
o fix match macro so we now recognize we want to merge FIS dir with RedBoot
config parameters even if we don't actually do it
or not the inpcb is currenty on various hash lookup lists, rather
than using (lport != 0) to detect this. This means that the full
4-tuple of a connection can be retained after close, which should
lead to more sensible netstat output in the window between TCP
close and socket close.
MFC after: 2 weeks
this fixes geom_redboot on boards that have multiple parts/regions as it
uses the value to locate the FIS directory which is in the last erase
region of flash
Firmware upgraded to 7.1.0 (from 5.0.0).
T3C EEPROM and SRAM added; Code to update eeprom/sram fixed.
fl_empty and rx_fifo_ovfl counters can be observed via sysctl.
Two new cxgbtool commands to get uP logic analyzer info and uP IOQs
Synced up with Chelsio's "common code" (as of 03/03/09)
Submitted by: Navdeep Parhar at Chelsio
Reviewed by: gnn
MFC after: 2 weeks
operate on the resource (we have no local resources to manage); this
fixes drivers that alloc/release resources in their probe method and
then do it again in attach
o while here add some prints to catch failures and massage style a bit
hit in the name cache. cache_lookup() doesn't actually return ENOENT
for such requests to force the filesystem to do an explicit lookup, so
this was effectively dead code.
- Grab the nfsnode mutex while writing to n_dmtime. We don't grab the lock
when comparing the time against the cached directory mod time (just as
we don't when comparing ctime's for +ve name cache hits) since the
attribute caching is already racy for NFS clients as it is.
Discussed with: bde
in6p_ip6_nxt
in6p_vflag
in6p_flags
in6p_socket
in6p_lport
in6p_fport
in6p_ppcb
Remove unused v6 macro aliases for inpcb flags:
IN6P_HIGHPORT
IN6P_LOWPORT
IN6P_ANONPORT
IN6P_RECVIF
IN6P_MTUDISC
IN6P_FAITH
IN6P_CONTROLOPTS
References to in6p_lport and in6_fport in sockstat are also replaced with
normal inp_lport and inp_fport references.
MFC after: 3 days
Reviewed by: bz
o encode need for A4 bus space tag hackery according to the memory
address; checking for "uart" breaks down with the GPS chip support
which is also a uart but does not require the same hackery
o encode the correct memory window instead of carving up all of i/o
space, potentially with a larger window than a device should have;
this likely should be handled in the drivers by using a proper bus
alloc call but since some drivers depend on the bus support to figure
this out we cannot simply mod them
o add optional GPS and RS485 support (conditionally as the support
isn't ready yet)
needed.
- Move the release of the sysctl sx lock after the vsunlock() in
userland_sysctl() to restore the original memlock behavior of
minimizing the amount of memory wired to handle sysctl requests.
MFC after: 1 week
implementation instead. The bypass does not assume that returned vnode
is only held.
Reported by: Paul B. Mahol <onemda gmail com>, pluknet <pluknet gmail com>
Reviewed by: jhb
Tested by: pho, pluknet <pluknet gmail com>
consumers which fork after the shared pages have been setup. pflogd(8)
is an example. The problem is understood and there is a fix coming in
shortly.
Folks who want to continue using it can do so by setting
net.bpf.zerocopy_enable
to 1.
Discussed with: rwatson
the PORTEN and MEMEN bits in the command register than to zero the
bars.
Use pci_write_ivar directly instead of a one-line wrapper that adds no
value.
Track verbosity changes in pci.
Remove a stray blank line.
After I imported libteken into the source tree, I noticed syscons didn't
store the cursor position inside the terminal emulator, but inside the
virtual terminal stat. This is not very useful, because when you
implement more complex forms of line wrapping, you need to keep track of
more state than just the cursor position.
Because the kernel messages didn't share the same terminal emulator as
ttyv0, this caused a lot of strange things, like kernel messages being
misplaced and a missing notification to resize the terminal emulator for
kernel messages never to be resized when using vidcontrol.
This patch just removes kernel_console_ts and adds a special parameter
to te_puts to determine whether messages should be printed using regular
colors or the ones for kernel messages.
Reported by: ache
Tested by: nyan, garga (older version)