arcconf tool by Adaptec already seems to use for identifying the
Serial Number of the devices.
Some simple things (like FIB setup and bound checks) are retrieved
from the Adaptec's driver, but this implementation is quite different
because it does use the normal buffer dmat area for loading segments
and not a special one (like the Adaptec's one does).
Sponsored by: Sandvine Incorporated
Discussed with: emaste, scottl
Reviewed by: emaste, scottl
MFC: 2 weeks
their calling contexts in {IP divert, raw IP sockets, TCP, UDP} and
create new helper functions: in_pcbinfo_init() and in_pcbinfo_destroy()
to do this work in a central spot. As inpcbinfo becomes more complex
due to ongoing work to add connection groups, this will reduce code
duplication.
MFC after: 1 month
Reviewed by: bz
Sponsored by: Juniper Networks
COMPAT_43TTY enables the sgtty interface. Even though its exposure has
only been removed in FreeBSD 8.0, it wasn't used by anything in the base
system in FreeBSD 5.x (possibly even 4.x?). On those releases, if your
ports/packages are less than two years old, they will prefer termios
over sgtty.
pointer, rather than octeon_fpa_alloc.
o) Report half duplex status properly.
o) Do not unconditionally update the last known link status in the softc. If
report_link isn't set, when octeon_rgmx_config_speed is called the first
time it will tell the driver (essentially) that we have already marked the
interface up. Likewise, don't change media speed and duplex if only the
link status is at issue. [1]
o) Remove manual changing of link state and let octeon_rgmx_config_speed do the
heavy lifting. [1]
Reviewed by: [1] imp
Sponsored by: Packet Forensics
have the delayed function take an argument as to the offset
to the SCTP header. This allows it to work for V4 and V6.
This of course means changing all callers of the function
to either pass the header len, if they have it, or create
it (ip_hl << 2 or sizeof(ip6_hdr)).
PR: 144529
MFC after: 2 weeks
- Add a missing callout_drain(9) before the descriptor deallocation.[1]
- Prefer callout_init_mtx(9) over callout_init(9) and let the callout
subsystem handle the mutex for callout function.
PR: kern/144453
Submitted by: Alexander Sack (asack at niksun dot com)[1]
MFC after: 1 week
Yukon FE and Yukon Ultra2. These controllers provide very simple
checksum computation mechanism and it requires additional pseudo
header checksum computation in upper stack. Even though I couldn't
see much performance difference with/without Rx checksum offloading
it may help notebook based controllers.
Actually controller can compute two checksum value by giving
different starting position of checksum computation on received
frame. However, for long time, Marvell's checksum offloading engine
have been known to have several silicon bugs so don't blindly trust
computed partial checksum value. Instead, compute partial checksum
twice by giving the same checksum computation position and compare
the result. If the value is different it's clear indication of
hardware bug. This configuration lose IP checksum offloading
capability but I think it's better to take safe route.
Note, Rx checksum offloading for Yukon XL was still disabled due to
known silicon bug.
index of status block is read first before acknowledging the
interrupts. Otherwise bge(4) may get stale status block as
acknowledging an interrupt may yield another status block update.
Reviewed by: marius
starting from netgraph import in 1999.
netstat(8) used pointer to node as node address, oops. That didn't
work, we need the node ID in brackets to successfully address a node.
We can't look into ng_node, due to inability to include netgraph/netgraph.h
in userland code. So let the node make a hint for a userland, storing
the node ID in its private data.
MFC after: 2 weeks
address as well as the transport protocol port information
from the outbound packets. The routing code is generic and
compares every byte in the given sockaddr object. Therefore
the temporary sockaddr objects must be cleared due to padding
bytes. In addition, the port information must be stripped
or the route search will either fail or return the incorrect
route entry.
Unit testing is done using OpenVPN over the if_tun interface.
MFC after: 7 days
no delayed checksum was added to the ip6 output code. This
causes cards that do not support SCTP checksum offload to
have SCTP packets that are IPv6 NOT have the sctp checksum
performed. Thus you could not communicate with a peer. This
adds the missing bits to make the checksum happen for these cards.
PR: 144529
MFC after: 2 weeks
- add a name argument to flowtable_alloc for printing with ddb commands
- extend ddb commands to print destination address or 4-tuples
- don't parse ports in ulp header if FL_HASH_ALL is not passed
- add kern_flowtable_insert to enable more generic use of flowtable
(e.g. system calls for adding entries)
- don't hash loopback addresses
- cleanup whitespace
- keep statistics per-cpu for per-cpu flowtables to avoid cache line contention
- add sysctls to accumulate stats and report aggregate
MFC after: 7 days
o) Properly configure the CAM to handle IFF_PROMISC and note where IFF_ALLMULTI
handling would go if we didn't already force the NIC to receive all
multicast traffic.
Reviewed by: imp
Sponsored by: Packet Forensics
o) Inline octeon_rgmx_mark_ready into octeon_rgmx_init.
o) Add a media status handler that reports link and media status.
o) Set link state when if_init is called.
o) Remove some printfs related to driver state changes.
o) Remove some gratuitous comments.
Reviewed by: imp
Sponsored by: Packet Forensics
is issued at the beginning of the initial IN/OUT data transfers. Reason
unknown, probably firmware fault. Now the stall is only cleared on data
transfer errors.
PR: usb/144199
Submitted by: Hans Petter Selasky
other modules can generate USB descriptors.
- extend the vendor specific request function by one length pointer argument,
because not all descriptors store the length in the first byte. For example
HID descriptors.
Submitted by: Hans Petter Selasky
1) vm_machdep.c: remove the dangling allocations so they do not
un-necessarily turn off the cache upon consecutive access.
2) busdma_machdep.c: remove the same amount than shadow mapped.
Reported by: Maks Verver
Submitted by: Mark Tinguely
Reviewed by: Grzegorz Bernacki
MFC after: 3 days
does not set or update the if_link_state variable.
As such RT_LINK_IS_UP() fails for the if_tap interface.
Also, the RT_LINK_IS_UP() needs to bypass all loopback
interfaces because loopback interfaces are considered
up logically as long as the system is running.
This patch fixes the above issues by setting and updating
the if_link_state variable when the tap interface is
opened or closed respectively. Similary approach is
already done in the if_tun device.
MFC after: 3 days
for upcoming 64-bit PowerPC and MIPS support. This renames the COMPAT_IA32
option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts
of the kernel and enhances the freebsd32 compatibility code to support
big-endian platforms.
Reviewed by: kib, jhb
given the advent of the extended family and extended model fields. The
values are printed in hex to match their common usage in documentation.
Submitted by: Alexander Best
MFC after: 1 week
- so_pcb is now guaranteed to be non-NULL and valid if a valid socket
reference is held.
- Need to check INP_TIMEWAIT and INP_DROPPED before assuming inp_ppcb is a
tcpcb, as it might be a tcptw or NULL otherwise.
- tp can never be NULL by the end of the function, so only check
TCPS_ESTABLISHED before extracting tcpcb fields.
The NFS server arguably incorporates too many assumptions about TCP
internals, but fixing that is left for nother day.
MFC after: 1 week
Reviewed by: bz
Reviewed and tested by: rmacklem
Sponsored by: Juniper Networks
Also disable relaxed ordering as recommended by data sheet for
PCI-X devices. For PCI-X BCM5704, set maximum outstanding split
transactions to 0 as indicated by data sheet.
For BCM5703 in PCI-X mode, DMA read watermark should be less than
or equal to maximum read byte count configuration. Enforce this
limitation in DMA read watermark configuration.
chip revision often found in the blades and resulting in interfaces
not sensing carrier signal. Looking at all problem reports it
appears that it only affects some very specific silicon revision
(ASIC (0x57081021); Rev (B2)) and version of the PHY that
supports 1000baseSX-FDX media only. Therefore, narrow the scope of
workaround to combination of that revision and media type. Given
that the first report on this issue is dated back to 2007, there is
not much hope that this issue will ever be properly resolved.
Among affected systems are IBM HS21, Intel SBXD132 and HP BL460c.
PR: 118238, 122551, 140970
MFC after: 1 month
These header files only provide functionality that can be used in
combination with libcompat. In order to prevent people from including
them without any actual use (which happens a lot with <sys/timeb.h>),
put a warning here to make people more aware.
This means we have to lower WARNS for libcompat, which is no big deal.
are referenced directly from ivar pointer. It's to do like what other
buses do. [1]
o changes exported prototypes. It doesn't use struct siba_* structures
anymore that instead of it it uses only device_t.
o removes duplicate code and debug messages.
o style(9)
Pointed out by: imp [1]
Setting the new sysctl MIB "debug.acpi.enable_debug_objects" to a non-zero
value enables us to print Debug object when something is written to it.
- Allow users to disable interpreter slack mode. Setting the new tunable
"debug.acpi.interpreter_slack" to zero disables some workarounds for common
BIOS mistakes and enables strict ACPI implementations by the specification.
processors. With this workaround, superpage promotion can be re-enabled
under virtualization. Moreover, machine check exceptions can safely be
enabled when FreeBSD is running natively on Family 10h processors.
Most of the credit should go to Andriy Gapon for diagnosing the error and
working with Borislav Petkov at AMD to document it. Andriy also reviewed
and tested my patches.
Discussed with: jhb
MFC after: 3 weeks
counting in incrementing the interrupt nesting level. This fixes a number
of bugs in which the interrupt thread could be preempted by an IPI,
indefinitely delaying acknowledgement of the interrupt to the PIC, causing
interrupt starvation and hangs.
Reported by: linimon
Reviewed by: marcel, jhb
MFC after: 1 week
allow for connection load balancing across interfaces. Currently
the address alias handling method is colliding with the ECMP code.
For example, when two interfaces are configured on the same prefix,
only one prefix route is installed. So connection load balancing
among the available interfaces is not possible.
The other advantage of ECMP is for failover. The issue with the
current code, is that the interface link-state is not reflected
in the route entry. For example, if there are two interfaces on
the same prefix, the cable on one interface is unplugged, new and
existing connections should switch over to the other interface.
This is not done today and packets go into a black hole.
Also, there is a small bug in the kernel where deleting ECMP routes
in the userland will always return an error even though the command
is successfully executed.
MFC after: 5 days
pmc_flush_logfile is now non-blocking and just ask the kernel
to shutdown the file. From that point, no more data is
accepted by the log thread and when the last buffer is flushed
the file is closed.
This will remove a deadlock between pmcstat asking for
flush while it cannot flush the pipe itself.
MFC after: 3 days
to not leak them, otherwise making UMA/vmstat unhappy with every stoped vnet.
We will still leak pages (especially for zones marked NOFREE).
Reshuffle cleanup order in tcp_destroy() to get rid of what we can
easily free first.
Sponsored by: ISPsystem
Reviewed by: rwatson
MFC after: 5 days
violated: so_pcb can never be NULL for a valid UDP socket, and it is
always SOCK_DGRAM. Use sotoinpcb() as the rest of the UDP code does.
MFC after: 1 week
Reviewed by: bz
Sponsored by: Juniper Networks
On Linux, /proc/<pid>/fd is comparable to fdescfs, where it allows you
to inspect the file descriptors used by each process. Glibc's ttyname()
works by performing a readlink() on these nodes, since all nodes in this
directory are symlinks.
It is a bit hard to implement this in linprocfs right now, so I am not
going to bother. Add a way to make ttyname(3) work, by adding a
/proc/<pid>/fd symlink, which points to /dev/fd only if the calling
process matches. When fdescfs is mounted, this will cause the
readlink() in ttyname() to fail, causing it to fall back on manually
finding a matching node in /dev.
Discussed on: emulation@
been required since FreeBSD 7.0 when the so_pcb pointer leading to inp was
guaranteed to be stable when a valid socket reference is held (as it is in
the output path).
MFC after: 1 week
Reviewed by: bz
Sponsored by: Juniper Networks
tcbinfo lock there: r175612, which re-added it, masked a race between
sonewconn(2) and accept(2) that could allow an incompletely initialized
address on a newly-created socket on a listen queue to be exposed. Full
details can be found in that commit message.
MFC after: 1 week
Sponsored by: Juniper Networks
radix table root nodes. This is only needed (and available)
in the virtualization case to free the resources when tearing
down a virtual network stack.
Sponsored by: ISPsystem
Reviewed by: julian, zec
MFC after: 5 days
to not leak them making the VM subsystem unhappy with every stoped vnet(*).
We will still leak pages (especially as zones are marked NOFREE).
(*) This will also keep vmstat -z more usable.
Sponsored by: ISPsystem
MFC after: 5 days
or overflow the netisr queue and fall back to the interface
queue so that we can garuantee that the ifnet pointer stays
valid. Formerly we ended up with reference counts <= 0 in
case the netisr had returned ENOBUFS. The idea is to track
any packet in the netisr queue and only change the refount
on edge operations for the fallback interface queue. This
also avoids problems in case the if_snd.ifq_len lies to us.
Also rework refount assertions to make sure they trigger if
we go below 1. Formerly a negative refence count did not
trigger the assert as the refcount variable is u_int.
Sponsored by: ISPsystem
MFC after: 5 days
than spinning forever. This fixes booting with CF ejected.
NB: I've made the driver pretty chatty about errors in case there's hardware
that operates differently to mine, so we can easily track down any issues.
Reviewed by: imp
Sponsored by: Packet Forensics
redundant implementations.
o) Use ABI, not ISA, to determine address length.
o) Disable and restore interrupts around any operation that uses all 64 bits of
a register. In kernels using the O32 ABI, the upper 32 bits of those
registers is likely to be corrupted by an interrupt.
Sponsored by: Packet Forensics
In order to do that cleanly, lapic_setup_clock(), on both ia32 and amd64,
now accepts as arguments the desired sources to handle, and returns the
actual ones (LAPIC_CLOCK_NONE is forbidden because otherwise there is no
meaning in calling such function).
This allows to bring out into commont x86 code the handling part for
machdep.lapic_allclocks tunable, which is retained.
- We don't need to fall back to uncacheable memory to satisfy BUS_DMA_COHERENT
requests on these CPUs.
- The bus_dmamap_sync() is a no-op for these CPUs.
A side-effect of this change is rename DMAMAP_COHERENT flag to
DMAMAP_UNCACHEABLE. This conveys the purpose of the flag more accurately.
Reviewed by: gonzo, imp
processes. It did not return an error but instead
just let garbage be passed back. This I fix so
it actually properly translates the priority the
process is at to a posix's high means more priority.
I also fix it so that if the ULE scheduler has bumped
it up to a realtime process you get back a sane value
i.e. the highest priority (63 for time-share).
sched_setscheduler() had the setting of the
timeshare class priority disabled. With some notes
about rejecting the posix high numbers is greater
priority and use nice instead. This fix also
adjusts that to work, with the cavet that a t-s
process may well get bumped up or down i.e. the
setscheduler() will NOT change the nice value only
the current priority. I think this is reasonable
considering if the user wants to play with nice then
he can. At least all the posix'ish interfaces now
respond sanely.
MFC after: 3 weeks
interface didn't be attached automatically at boot time so changes a
approach to attach children based on leveraging some newbus niceties.
Submitted by: nwhitehorn
- change the way in which command queue overflow is handled;
- do not expose to CAM two command slots, used for driver's internal purposes;
- allow driver to use up to 1024 command slots, instead of 256 before.
88E1149 PHY. This will fix intermittent watchdog timeouts as well
as very slow network performance on 88E8072 Yukon Extreme.
PR: kern/144148
MFC after: 1 week
correctly initialized and just then assign to softclock/profclock.
Right now, some atrtc seems reporting strange diagnostic error* making the
current pattern bogus.
In order to do that cleanly, lapic_setup_clock(), on both ia32 and amd64,
now accepts as arguments the desired sources to handle, and returns the
actual ones (LAPIC_CLOCK_NONE is forbidden because otherwise there is no
meaning in calling such function).
This allows to bring out into commont x86 code the handling part for
machdep.lapic_allclocks tunable, which is retained.
Sponsored by: Sandvine Incorporated
Tested by: yongari, Richard Todd
<rmtodd at ichotolot dot servalan dot com>
MFC: 3 weeks
X-MFC: r202387, 204309
and tested over the past two months in the ipfw3-head branch. This
also happens to be the same code available in the Linux and Windows
ports of ipfw and dummynet.
The major enhancement is a completely restructured version of
dummynet, with support for different packet scheduling algorithms
(loadable at runtime), faster queue/pipe lookup, and a much cleaner
internal architecture and kernel/userland ABI which simplifies
future extensions.
In addition to the existing schedulers (FIFO and WF2Q+), we include
a Deficit Round Robin (DRR or RR for brevity) scheduler, and a new,
very fast version of WF2Q+ called QFQ.
Some test code is also present (in sys/netinet/ipfw/test) that
lets you build and test schedulers in userland.
Also, we have added a compatibility layer that understands requests
from the RELENG_7 and RELENG_8 versions of the /sbin/ipfw binaries,
and replies correctly (at least, it does its best; sometimes you
just cannot tell who sent the request and how to answer).
The compatibility layer should make it possible to MFC this code in a
relatively short time.
Some minor glitches (e.g. handling of ipfw set enable/disable,
and a workaround for a bug in RELENG_7's /sbin/ipfw) will be
fixed with separate commits.
CREDITS:
This work has been partly supported by the ONELAB2 project, and
mostly developed by Riccardo Panicucci and myself.
The code for the qfq scheduler is mostly from Fabio Checconi,
and Marta Carbone and Francesco Magno have helped with testing,
debugging and some bug fixes.
device is different from the device used to the original mount.
Note that update_mp does not need devvp locked, and pmp->pm_devvp cannot
be freed meantime.
Reported and tested by: pho
MFC after: 3 weeks
o rename the new variables to comply with the naming scheme
o move the new variables to an AR5212 specific struct
o use ahp when available
o revert to previous ts_flags check
- remove unused and commented code (MIPS_BUS_SPACE_PCI, pic_usb_ack)
- use rmi_pci_bus_space for USB too (needs byteswap)
- uncomment xls_ehci.c in files.xlr
- changes to xls_ehci.c - updated with dev/usb/controller/ehci_*.c as
Obtained from: JC - c.jayachandran@gmail.com
Enhanced process coredump routines.
This brings in the following features:
1) Limit number of cores per process via the %I coredump formatter.
Example:
if corefilename is set to %N.%I.core AND num_cores = 3, then
if a process "rpd" cores, then the corefile will be named
"rpd.0.core", however if it cores again, then the kernel will
generate "rpd.1.core" until we hit the limit of "num_cores".
this is useful to get several corefiles, but also prevent filling
the machine with corefiles.
2) Encode machine hostname in core dump name via %H.
3) Compress coredumps, useful for embedded platforms with limited space.
A sysctl kern.compress_user_cores is made available if turned on.
To enable compressed coredumps, the following config options need to be set:
options COMPRESS_USER_CORES
device zlib # brings in the zlib requirements.
device gzio # brings in the kernel vnode gzip output module.
4) Eventhandlers are fired to indicate coredumps in progress.
5) The imgact sv_coredump routine has grown a flag to pass in more
state, currently this is used only for passing a flag down to compress
the coredump or not.
Note that the gzio facility can be used for generic output of gzip'd
streams via vnodes.
Obtained from: Juniper Networks
Reviewed by: kan
does not generate excessive interrupts any more so we don't need
to have two copies of interrupt handler.
While I'm here remove two STAT_PUT_IDX register accesses in LE
status event handler. After r204539 msk(4) always sync status LEs
so there is no need to resort to reading STAT_PUT_IDX register to
know the end of status LE processing. Just trust status LE's
ownership bit.
countdown timer register. The timer resolution may vary among
controllers but the value would be represented by core clock
cycles. msk(4) will automatically computes number of required clock
cycles from given micro-seconds unit.
The default interrupt holdoff timer value is 100us which will
ensure less than 10k interrupts under load. The timer value can be
changed with dev.mskc.0.int_holdoff sysctl node.
Note, the interrupt moderation is shared resource on dual-port
controllers so you can't use separate interrupt moderation value
for each port. This means we can't stop interrupt moderation in
driver stop routine. Also have msk_tick() reclaim transmitted Tx
buffers as safety belt. With this change there is no need to check
missing Tx completion interrupt in watchdog handler, so remove it.
full-duplex. Previously msk(4) used to allow flow-control on
1000baseT half-duplex media. Also GMAC pause is enabled if link
partner is capable of handling it.
While I'm here use IFM_OPTIONS instead of using IFM_GMASK to check
optional flags of link.
- Rename the netisr protocol registration array, 'np' to 'netisr_proto',
in order to reduce the chances of symbol name collisions. It remains
statically defined, but it will be looked up by netstat(1).
- Move certain internal structure definitions from netisr.c to
netisr_internal.h so that netstat(1) can find them. They remain
private, and should not be used for any other purpose (for example,
they should not be used by kernel modules, which must instead use the
public interfaces in netisr.h).
- Store a kernel-compiled version of NETISR_MAXPROT in the global variable
netisr_maxprot, and export via a sysctl, so that it is available for use
by netstat(1). This is especially important for crashdump
interpretation, where the size of the workstream structure is determined
by the maximum number of protocols compiled into the kernel.
MFC after: 1 week
Sponsored by: Juniper Networks
This is a split merge because of non-uniform licensing of the DTC package
contents and the way these components will be used in the FreeBSD environment.
The original DTC package is composed of the following two major pieces:
1. sys/contrib/libfdt (BSD [dual] license)
2. contrib/dtc (GPLv2)
The libfdt component is going to be shared in all aspects of the environment:
- /boot/loader
- kernel
- dtc (the device tree compiler proper, userspace tool)
msdosfs-specific variant of vn_vget_ino(), msdosfs_deget_dotdot().
As was done for UFS, relookup the dotdot denode after the call to
msdosfs_deget_dotdot(), because vnode lock is dropped and directory
might be moved.
Tested by: pho
MFC after: 3 weeks