- Rename the netisr protocol registration array, 'np' to 'netisr_proto',
in order to reduce the chances of symbol name collisions. It remains
statically defined, but it will be looked up by netstat(1).
- Move certain internal structure definitions from netisr.c to
netisr_internal.h so that netstat(1) can find them. They remain
private, and should not be used for any other purpose (for example,
they should not be used by kernel modules, which must instead use the
public interfaces in netisr.h).
- Store a kernel-compiled version of NETISR_MAXPROT in the global variable
netisr_maxprot, and export via a sysctl, so that it is available for use
by netstat(1). This is especially important for crashdump
interpretation, where the size of the workstream structure is determined
by the maximum number of protocols compiled into the kernel.
MFC after: 1 week
Sponsored by: Juniper Networks
This is a split merge because of non-uniform licensing of the DTC package
contents and the way these components will be used in the FreeBSD environment.
The original DTC package is composed of the following two major pieces:
1. sys/contrib/libfdt (BSD [dual] license)
2. contrib/dtc (GPLv2)
The libfdt component is going to be shared in all aspects of the environment:
- /boot/loader
- kernel
- dtc (the device tree compiler proper, userspace tool)
msdosfs-specific variant of vn_vget_ino(), msdosfs_deget_dotdot().
As was done for UFS, relookup the dotdot denode after the call to
msdosfs_deget_dotdot(), because vnode lock is dropped and directory
might be moved.
Tested by: pho
MFC after: 3 weeks
SLOT_EMPTY deName[0] values. Besides conforming to FAT specification, it
also clears the issue where vfs_hash_insert found the vnode in hash, and
newly allocated vnode is vput()ed. There, deName[0] == 0, and vnode is
not reclaimed, indefinitely kept on mountlist.
Tested by: pho
MFC after: 3 weeks
The plan is to use vnode lock to protect denode and fat cache,
and having separate lock for block use map.
Change the check and return on impossible condition into KASSERT().
Tested by: pho
MFC after: 3 weeks
Do not do additional dev_ref() on the newly created interface in the
if_clone create method [1]. This reference is not needed and never
removed, causing struct cdevpriv leakage. Remove the setting of
SI_CHEAPCLONE flag as well, since it is unused.
For dev_clone handlers, create cdevs with the call make_dev_credf(MAKEDEV_REF)
instead of calling make_dev() and then dev_ref(), to avoid a race.
Call drain_dev_clone_events() at the module unload time after dev_clone
handler is deinstalled.
Submitted by: Mikolaj Golub <to.my.trociny gmail com> [1]
MFC after: 1 week
It is the case, however, that the uidinfo of the temporary credential
set up for access(2) is not properly updated when its effective uid is
changed.
MFC after: 3 days
o Assign vectors based on priority, because vectors have
implied priority in hardware.
o Use unordered memory accesses to the I/O sapic and use
the acceptance form of the mf instruction.
o Remove the sapicreg.h and sapicvar.h headers. All definitions
in sapicreg.h are private to sapic.c and all definitions in
sapicvar.h are either private or interface functions. Move the
interface functions to intr.h.
o Hide the definition of struct sapic.
that the virtual machine monitor has enabled machine check exceptions.
Unfortunately, on AMD Family 10h processors the machine check hardware
has a bug (Erratum 383) that can result in a false machine check exception
when a superpage promotion occurs. Thus, I am disabling superpage
promotion when the FreeBSD kernel is running as a guest operating system
on an AMD Family 10h processor.
Reviewed by: jhb, kib
MFC after: 3 days
in the process queue when gathering information for the process, and set
of signals pending for the thread, when gathering information for the
thread. Previously, the sysctl returned a union of the process and some
arbitrary thread pending set for the process, and union of the process
and the thread pending set for the thread.
MFC after: 1 week
32 bit handles. The RIO (reduced interrupt operation) and fast posting
for the parallel SCSI cards were all 16 bit handles. Furthermore,
target mode parallel SCSI only can have 16 bit handles.
Use part of a supplied patch to switch over to using 32 bit handles.
Be a bit more conservative here and only do this for parallel SCSI
for the 12160 (Ultra3) cards. There were a lot of marginal Ultra2
cards, and, frankly, few are findable now for testing.
Fix the target handle routine to only do 16 bit handles for parallel
SCSI cards. This is okay because the upper sixteen bits of the new
32 bit handles is a sequence number to help protect against duplicate
completions. This would be very unlikely to happen with parallel
SCSI target mode, and wasn't present before, so we're no worse off
than we used to be.
While we're at it, finally split the async mailbox completion handlers
into FC and parallel SCSI functions. This makes it much cleaner and
easier to figure out what is or isn't a legal async mailbox completion
code for different card classes.
PR: kern/144250
Submitted partially by: Charles D
MFC after: 1 week
the issue. I still have no idea why TSO does not work on this
controller. davidch@ also confirmed there is no known TSO related
issues for this controller.
parsing code in TSO path as the controller requires VLAN hardware
tagging to make TSO work over VLANs.
While parsing the mbuf in TSO patch, always perform check for
writable mbuf as bce(4) have to reset IP length and IP checksum
field of IP header and make sure to ensure contiguous buffer before
accessing IP/TCP headers. While I'm here replace magic number 40 to
more readable sizeof(struct ip) + sizeof(struct tcphdr).
Reviewed by: davidch
TX/RX checksum handler to set/clear relavant assist bits which was
used to cause unexpected results.
With this change, bce(4) can be bridged with other interfaces that
lack TSO, VLAN checksum offloading.
Reviewed by: davidch
around. Management firmware(ASF/IPMI/UMP) requires the VLAN
hardware tag stripping so don't actually disable VLAN hardware tag
stripping. If VLAN hardware tag stripping was disabled, bce(4)
manually reconstruct VLAN frame by appending stripped VLAN tag.
Also remove unnecessary IFCAP_VLAN_MTU message.
Reviewed by: davidch
for controllers like 88E8053 which reports two MSI messages.
Because we don't get anything useful things with 2 MSI messages,
allocating 1 MSI message would be more sane approach.
While I'm here, enable MSI for dual-port controllers too. Because
status block is shared for dual-port controllers, I don't think
msk(4) will encounter problem for using MSI on dual-port
controllers.
The register offset is not valid on 88E8072 controller. Also don't
blindly increase max read request size to 4096, instead, use 2048
which seems to be more sane value and only change the value if the
hardware default size(512) was used on that register.
For PCIX controllers, use system defined constant rather than using
magic value.
While I'm here stop showing negotiated link width.
not require checksum LE configuration if checksum start and write
position is the same as before. So keep track last checksum start
and write position and insert new LE whenever the position is
changed. This reduces number of LEs used in TX path as well as
slightly enhance TX performance.
the scratchpage is updated, the PVO's physical address is updated as well.
This makes pmap_extract() begin returning non-zero values again, causing
the panic partially fixed in r204297. Fix this by excluding addresses
beyond virtual_end from consideration as KVA addresses, instead of allowing
addresses up to VM_MAX_KERNEL_ADDRESS.
shared and generalized between our current amd64, i386 and pc98.
This is just an initial step that should lead to a more complete effort.
For the moment, a very simple porting of cpufreq modules, BIOS calls and
the whole MD specific ISA bus part is added to the sub-tree but ideally
a lot of code might be added and more shared support should grow.
Sponsored by: Sandvine Incorporated
Reviewed by: emaste, kib, jhb, imp
Discussed on: arch
MFC: 3 weeks
counters have not gone about MAXCPU or NETISR_MAXPROT. These problems
caused panics on UP kernels with INVARIANTS when using sysctl -a, but
would also have caused problems for 32-core boxes or if the netisr
protocol vector was fully populated.
Reported by: nwhitehorn, Neel Natu <neelnatu@gmail.com>
MFC after: 4 days
its PVO to map physical address 0 instead of kernelstart. This fixes a
situation in which a user process could attempt to return this address
via KVM, have it fault while being modified, and then panic the kernel
because (a) it is supposed to map a valid address and (b) it lies in the
no-fault region between VM_MIN_KERNEL_ADDRESS and virtual_avail.
While here, move msgbuf and dpcpu make into regular KVA space for
consistency with other implementations.
too high due to several overflows. The actual limit is somewhere in the
neighborhood of INT_MAX/4 on 64-bit machines, but most systems could not
support such a limit due to a lack of memory and the cost of duplicate
credentials.
Reported by: bde
been around for a long time now (7.1-ish or even earlier); assume
they are present. These includes MSI, TSO, LRO, VLAN, INTR_FILTERS,
FIRMWARE, etc.
Also, eliminate some dead code and clean up in other places as part
of this quick once-over.
MFC after: 1 week
race as it could already have been tx'd and freed by that time. Place
the bpf tap just _before_ writing the gen bit.
This fixes a panic when running tcpdump on a cxgb interface.
physical address is changed, there is a brief window during which its PTE
is invalid. Since moea64_set_scratchpage_pa() does not and cannot hold
the page table lock, it was possible for another CPU to insert a new PTE
into the scratch page's PTEG slot during this interval, corrupting both
mappings.
Solve this by creating a new flag, LPTE_LOCKED, such that
moea64_pte_insert will avoid claiming locked PTEG slots even if they
are invalid. This change also incorporates some additional paranoia
added to solve things I thought might be this bug.
Reported by: linimon
- Add a separate palette data for 8-bit DAC mode when SC_PIXEL_MODE is set
and fill it up with default gray-scale palette data for text. Now we don't
have to set `hint.sc.0.vesa_mode' to get the default palette data.
- Add a new adapter flag, V_ADP_DAC8 to track whether the controller is
using 8-bit palette format and load correct palette when switching modes.
- Set 8-bit DAC mode only for non-VGA compatible graphics mode.
change and have isp_make_here scan the whole bus which will then scan all
luns.
I think xpt_rescan needs to be fixed, but that's a separable issue.
Suggested by: Alexander
not support TSO over VLAN if VLAN hardware tagging is disabled so
there is no need to check VLAN here.
While I'm here make sure to pullup IP/TCP headers in the first
buffer.
GEM_MIF_CONFIG_MDI0 cannot be trusted when the firmware has powered
down the chip so the internal transceiver has to be hardcoded. This
is also in line with the AppleGMACEthernet driver, which just doesn't
distinguish between internal/external transceiver and MDIO/MDI1
respectively in the first place. Tested by: Andreas Tobler
MFC after: 1 week
frame, make sure to update VLAN capabilities whenever jumbo frame
is configured.
While I'm here rearrange interface capabilities configuration. The
controller requires VLAN hardware tagging to make TSO work on VLANs
so explicitly check this requirement.
Now all contiguous regions returned from bus-dma will be aligned to the
alignment constraint and all but the last region are guaranteed to be
a multiple of the alignment in length. This also means that the relative
alignment of two adjacent bytes in the I/O stream have a difference of 1
even if they are not physically contiguous.
The old code, when needing to perform a copy in order to align data, only
copied the amount of data needed to reach the next page boundary. This
often left an unaligned end to the segment. Drivers such as Xen's blkfront
can't deal with such segments.
The downside to this approach is that, once an unaligned region is encountered,
the remainder of the I/O will be bounced. However, bouncing should be rare.
It is typically caused by non-performance critical userland programs that
don't bother to align their I/O buffers (e.g. bsdlabel). In-kernel I/O
buffers are always aligned to at least a page boundary.
Reviewed by: scottl
MFC after: 2 weeks
no superpage mappings are created within the clean submap, aligning the
start of the clean submap helps to prevent interference with kmem_alloc()'s
use of superpages.
such that a fancier thermal management algorithm can be run from user
space, but the kernel will at least ensure your machine does not either
sound like a wind tunnel or catch fire.
Although this file has historically been used as a dumping ground for
random functions, nowadays it only contains functions related to copying
bits {from,to} userspace and hash table utility functions.
Behold, subr_uio.c and subr_hash.c.
and will try to load it if it's not present. To better meet these
expectations, change the module name for the loopback interface from
'loop' to 'if_lo'. The loopback interface is always compiled into the
base kernel, so there are no resulting changes in kld files, etc.
Discussed with: brooks (ages ago)
MFC after: 1 week
The `name' and `newp' arguments can be marked const, because the buffers
they refer to are never changed. While there, perform some other
cleanups:
- Remove K&R from sysctl.c.
- Implement sysctlbyname() using sysctlnametomib() to prevent
duplication of an undocumented kernel interface.
- Fix some whitespace nits.
It seems the prototypes are now in sync with NetBSD as well.
but also of different types, f.e. Sun Fire V890 can be equipped with a
mix of UltraSPARC IV and IV+ CPUs, requiring different MMU initialization
and different workarounds for model specific errata. Therefore move the
CPU implementation number from a global variable to the per-CPU data.
Functions which are called before the latter is available are passed the
implementation number as a parameter now.
This file was missed in r204152.
Tx DMA burst size 2048, I beleive PCIe maximum read request size
also should match to the value of Tx DMA burst size. With this
change I can get more than 800Mbps for TCP bulk transfers.
Previously I was not able to get more than 700Mbps. If I enable TSO
it now shows 927Mbps.
but also of different types, f.e. Sun Fire V890 can be equipped with a
mix of UltraSPARC IV and IV+ CPUs, requiring different MMU initialization
and different workarounds for model specific errata. Therefore move the
CPU implementation number from a global variable to the per-CPU data.
Functions which are called before the latter is available are passed the
implementation number as a parameter now.
to make TSO work on VLAN. So if VLAN hardware tagging is disabled
explicitly clear TSO on VLAN. While I'm here remove duplicated
VLAN_CAPABILITIES call.
from IFCAP_VLAN_HWTAGGING. I think some hardwares may be able to
TSO over VLAN without VLAN hardware tagging.
Driver changes and userland support will follow.
Reviewed by: thompsa
- 'show ifnets' prints a list of ifnet *s per virtual network stack,
- 'show ifnet <struct ifnet *>' prints fields matching the given ifp.
We do not yet print the complete set of fields and might want to
factor this out to an extra if_debug.c file in case this grows
a lot[1]. We may also want to grow 'show ifnet <if_xname>' support[1].
Sponsored by: ISPsystem
Suggested by: rwatson [1]
Reviewed by: rwatson
MFC after: 5 days
a "locked" version that will only handle a single network stack
instance. The latter is called directly from ip_destroy().
Hook up an ip_destroy() function to release resources from the
legacy IP network layer upon virtual network stack teardown.
Sponsored by: ISPsystem
Reviewed by: rwatson
MFC After: 5 days
- add bus_space_rmi_pci.c for PCI bus space
- files.xlr update for changes in files
- pcibus.c merged into xlr_pci.c (they were small files with inter-dependencies)
- xlr_pci.c - lot of changes here with few fixes, formatting cleanup
Obtained from: C. Jayachandran (JC) - c.jayachandran@gmail.com
- remove pci related code from bus_space_rmi.c, we will have another
file for PCI bus space functions which will do byte-swapping.
- remove local SWAP implementation
- added TODO stub for unimplemented functions
Obtained from: C. Jayachandran - c.jayachandran@gmail.com
- (cleanup) remove rmi specific 'struct mips_intrhand' - this is no
longer needed since 'struct intr_event' have all the required hooks
- add xlr_cpu_establish_hardintr, which has args for pre/post ithread
and filter hooks, so that the PCI code can add the PCI controller
interrupt ack code here
- make 'cpu_establish_hardintr' use the above function.
- (fix) change type of eirr/eimr from register_t to uint64_t. These
have to be 64bit otherwise we cannot handle interrupts from 32.
- (fix) use eimr to mask eirr before checking interrupts, so that we
will not handle masked interrupts.
Obtained from: C. Jayachandran - c.jayachandran@gmail.com
UMA segments at their physical addresses instead of into KVA. This emulates
the direct mapping behavior of OEA32 in an ad-hoc way. To make this work
properly required sharing the entire kernel PMAP with Open Firmware, so
ofw_pmap is transformed into a stub on 64-bit CPUs.
Also implement some more tweaks to get more mileage out of our limited
amount of KVA, principally by extending KVA into segment 16 until the
beginning of the first OFW mapping.
Reported by: linimon
of its argument before atomically replacing it, which could occasionally
return the wrong value on an SMP system. This resulted in user mutex
operations hanging when using threaded applications.
This header file uses __packed, without including <sys/cdefs.h>. This
means it cannot be used in the way described in sysarch(3) by only
including <machine/sysarch.h>.
The backtrace code tries to look for an instruction of the form "sw ra, x(sp)"
to figure out the program counter of the calling function. When we generate
the kernel exception frame we store the 'ra' at the time of the exception
using an instruction of the same form. The problem is that the 'ra' at the
time of the exception is not the same as the 'program counter' at the time
of the exception.
The fix is to save the 'exception program counter' register by staging
it through the 'ra' register.
The RX overflow is reported in bit 2 on real hardware and Linux driver
for the same device already has this defined correctly.
This fixes frequent interrupt storms seen on RouterStation Pro boards.
Discussed with: gonzo
questions on the thermal calibration), and to read and set fan RPMs from
software. While here, fix a number of bugs.
Calibration code from: OpenBSD
MFC after: 2 weeks
HAST allows to transparently store data on two physically separated machines
connected over the TCP/IP network. HAST works in Primary-Secondary
(Master-Backup, Master-Slave) configuration, which means that only one of the
cluster nodes can be active at any given time. Only Primary node is able to
handle I/O requests to HAST-managed devices. Currently HAST is limited to two
cluster nodes in total.
HAST operates on block level - it provides disk-like devices in /dev/hast/
directory for use by file systems and/or applications. Working on block level
makes it transparent for file systems and applications. There in no difference
between using HAST-provided device and raw disk, partition, etc. All of them
are just regular GEOM providers in FreeBSD.
For more information please consult hastd(8), hastctl(8) and hast.conf(5)
manual pages, as well as http://wiki.FreeBSD.org/HAST.
Sponsored by: FreeBSD Foundation
Sponsored by: OMCnet Internet Service GmbH
Sponsored by: TransIP BV
PVOs, and so the modified state of the page can no longer be communicated
to the VM layer, causing pages not to be flushed to swap when needed, in
turn causing memory corruption. Also make several correctness adjustments
to I-Cache synchronization and TLB invalidation for 64-bit Book-S CPUs.
Obtained from: projects/ppc64
Discussed with: grehan
MFC after: 2 weeks
This patch basically gives us the best of both worlds. Instead of
forcing the compiler to emulate GNU-style inline semantics even though
we're using ISO C99, it will only use GNU-style inlining when the
compiler is configured that way (__GNUC_GNU_INLINE__).
Tested by: jhb
Getting the little-endian PCI bus working on the big-endian CPU proved to be
quite challenging. We let the PCI devices be mapped in the "match byte lanes"
address window. This is where they are mapped by the CFE and DMA transfers
generated to or from addresses within this window are not subject to automatic
byte-swapping.
However any access by the driver to memory-mapped pci space is redirected
via the "match bit lanes" address window. We get the benefit of automatic
byte swapping through this address window and drivers don't need to change
to deal with CPU big-endianness.
module. With r203732 it became apparent that creating the sysctl nodes
twice causes at least a warning, however the whole code shouldn't be
present twice in the first place.
Discussed with: rmacklem