using miibus, since for some devices that use multiple addresses on the bus,
going through miibus may be unclear, and for devices that are not standard
MII PHYs, miibus may throw a fit, necessitating complicated interfaces to
fake the interface that it expects during probe/attach.
o) Make the mv88e61xx SMI interface in octe attach a PHY directly and fix some
mistakes in the code that resulted from trying too hard to present a nice
interface to miibus.
o) Add a PHY driver for the mv88e61xx. If attached (it is optional in kernel
compiles so the default behavior of having a dumb switch is preserved) it
will place the switch in a VLAN-tagging mode such that each physical port
has a VLAN associated with it and interfaces for the VLANs can be created to
address or bridge between them.
XXX It would be nice for this to be part of a single module including the
SMI interface, and for it to fit into a generic switch configuration
framework and for it to use DSA rather than VLANs, but this is a start
and gives some sense of the parameters of such frameworks that are not
currently present in FreeBSD. In lieu of a switch configuration
interface, per-port media status and VLAN settings are in a sysctl tree.
XXX There may be some minor nits remaining in the handling of broadcast,
multicast and unknown destination traffic. It would also be nice to go
through and replace the few remaining magic numbers with macros at some
point in the future.
XXX This has only been tested with the MV88E6161, but it should work with
minimal or no modification on related switches, so support for probing
them was included.
Thanks to Pat Saavedra of TELoIP and Rafal Jaworowski of Semihalf for their
assistance in understanding the switch chipset.
data size greater than 8192. Since soreserve(so, 256*1024, 256*1024)
would always fail for the default value of sb_max, modify clnt_dg.c
so that it uses the calculated values and checks for an error return
from soreserve(). Also, add a check for error return from soreserve()
to clnt_vc.c and change __rpc_get_t_size() to use sb_max_adj instead of
the bogus maxsize == 256*1024.
PR: kern/150910
Reviewed by: jhb
MFC after: 2 weeks
receive producer ring only for BCM5700. It was believed that
BCM5700 with external SSRAM is the only controller that supports
mini ring but it seems all BCM570[0-4] requires to disable mini
receive producer ring. Otherwise, it caused unexpected RX DMA
error or watchdog timeouts.
Reported by: marius, Steve Kargl <sgk <> troutmask dot apl dot washington dot edu>
Tested by: marius, Steve Kargl <sgk <> troutmask dot apl dot washington dot edu>
Short description of the changes:
- attempt to retry some commands for which it is possible (read, query)
- always make a short sleep before checking EC status in polled mode
- periodically poll EC status in interrupt mode
- change logic for detecting broken interrupt delivery and falling back
to polled mode
- check that EC is ready for input before starting a new command, wait
if necessary
This commit is based on the original patch by David Naylor.
PR: kern/150517
Submitted by: David Naylor <naylor.b.david@gmail.com>
Reviewed by: jkim
MFC after: 3 weeks
This is based on the same approach as used in panic().
In theory parallel execution of generic_stop_cpus() could lead to two CPUs
stopping each other and everyone else, and thus a total system halt.
Also, in theory, we should have some smarter locking here, because two
(or more CPUs) could be stopping unrelated sets of CPUs.
But in practice, it seems, this function is only used to stop
"all other" CPUs.
Additionally, I took this opportunity to make amd64-specific suspend_cpus()
function use generic_stop_cpus() instead of rolling out essentially
duplicate code.
This code is based on code by Sandvine Incorporated.
Suggested by: mdf
Reviewed by: jhb, jkim (earlier version)
MFC after: 2 weeks
Since r212650 and before this change sendfile(2) could produce
a partially valid page for a trailing portion of a ZFS vnode.
vm_fault() always wants to see a fully valid page even if it's the last
page that partially extends beyond vnode's end. Otherwise it calls
vop_getpages() to bring in the page. In the case of ZFS this means
that the data is read from the page into the same page and this breaks
checks in ZFS mappedread() - a thread that set VPO_BUSY on the page in
vm_fault() will get blocked forever waiting for it to be cleared.
Many thanks to Kai and Jeremy for reproducing the issue and providing
important debugging information and help.
Reported by: Kai Gallasch <gallasch@free.de>,
Jeremy Chadwick <freebsd@jdc.parodius.com>
Tested by: Kai Gallasch <gallasch@free.de>,
Jeremy Chadwick <freebsd@jdc.parodius.com>
Reviewed by: kib
MFC after: 3 days
To-Do: apply the same treatment to tmpfs + sendfile
kernel of exactly the same __FreeBSD_version as the headers module was
compiled against.
Mark our in-tree ABI emulators with DECLARE_MODULE_TIED. The modules
use kernel interfaces that the Release Engineering Team feel are not
stable enough to guarantee they will not change during the life cycle
of a STABLE branch. In particular, the layout of struct sysentvec is
declared to be not part of the STABLE KBI.
Discussed with: bz, rwatson
Approved by: re (bz, kensmith)
MFC after: 2 weeks
NFSv4 server more readable. Mostly changes to comments, but a
case of >= is changed to >, since == can never happen. Also, I've
added a couple of KASSERT()s and a slight optimization, since
once the "else if" case happens, subsequent locks in the list can't
have any effect. None of these changes fixes any known bug.
MFC after: 2 weeks
before setting the flag, interrupt was already enabled such that
interrupt handler could be run before setting IFF_DRV_RUNNING flag.
This can lose initial link state change interrupt which in turn
make bge(4) think that it still does not have valid link. Fix this
race by protecting the taskqueue with a driver lock.
While I'm here move reenabling interrupt code after handling of link
state chage.
Reviewed by: davidch
Previously bge(4) always enabled auto polling for non-BGE_FLAG_TBI
controllers. With this change, auto polling is not used anymore so
polling through mii(4) was introduced.
Reviewed by: davidch
It seems axe(4) controllers support interrupt endpoint such that
enabling interrupt endpoint generates about 1000 interrupts/sec.
Controllers transfer 8 bytes data through interrupt endpoint and
the data include link UP/DOWN state as well as some PHY related
information. Previously axe(4) didn't use the transferred data and
didn't even try to read the data. Because axe(4) counts on mii(4)
to detect link state changes there is no need to use interrupt
endpoint here.
This change fixes generation of unnecessary interrupts which was
seen when interface is brought to UP.
No objections from: hselasky
breakage for old mount(2) syscall, since most struct <filesystem>_args
embed export_args. The mount(2) is supposed to provide ABI
compatibility for pre-nmount mount(8) binaries, so restore ABI to
pre-r184588.
Requested and reviewed by: bde
MFC after: 2 weeks
This is to prevent caching of its value in a register when it is checked
and modified by multiple CPUs in parallel.
Also, move the variable into the scope of the only function that uses it.
Reviewed by: jhb
Hint from: mdf
MFC after: 1 week
rwlock to protect the table. In old code, thread lookup is done with
process lock held, to find a thread, kernel has to iterate through
process and thread list, this is quite inefficient.
With this change, test shows in extreme case performance is
dramatically improved.
Earlier patch was reviewed by: jhb, julian
to the expected type so they work like the corresponding __bswapN_var()
functions and the compiler doesn't complain when arguments of different
width are passed.
opposition to the change, since really we need to implement missing
functionality in drivers or the 802.3 layer.
For now, restore a reminder message for a missing rum_update_mcast, but
print it only once.
controller, but make it optional.
After a problem report from Andrew Boyer, it looks like the LSI
chip may have issues (the watchdog timer fired) if too many aborts
are sent down to the chip at the same time. We know that task
management commands are serialized, and although the manual doesn't
say it, it may be a good idea to just send one at a time.
But, since I'm not certain that this is necessary, add a tunable
and sysctl variable (hw.mps.%d.allow_multiple_tm_cmds) to control
the driver's behavior.
mps.c: Add support for the sysctl and tunable, and add a
comment about the possible return values to
mps_map_command().
mps_sas.c: Run all task management commands through two new
routines, mpssas_issue_tm_request() and
mpssas_complete_tm_request().
This allows us to optionally serialize task
management commands. Also, change things so that
the response to a task management command always
comes back through the callback. (Before it could
come via the callback or the return value.)
mpsvar.h: Add softc variables for the list of active task
management commands, the number of active commands,
and whether we should allow multiple active task
management commands. Add an active command flag.
mps.4: Describe the new sysctl/loader tunable variable.
Sponsored by: Spectra Logic Corporation
physical memory
This is needed to correctly autotune ZFS ARC size when vm_kmem_size is
set to value larger than available physical memory.
MFC after: 2 weeks
A new function prep_devname() sanitizes a device name by removing
leading and redundant sequential slashes. The function returns an error
for names which already exist or are considered invalid.
A new flag MAKEDEV_CHECKNAME for make_dev_p(9) and make_dev_credf(9)
indicates that the caller is prepared to handle an error related to the
device name. An invalid name triggers a panic if the flag is not
specified.
Document the MAKEDEV_CHECKNAME flag in the make_dev(9) manual page.
Idea from: kib
Reviewed by: kib
as 5788. This caused BGE_MISC_LOCAL_CTL register is used to
generate link state change interrupt for non-5788 controllers. The
interrupt handler may or may not detect link state attention as
status block wouldn't be updated when an interrupt was generated
with BGE_MISC_LOCAL_CTL register. All controllers except 5700 and
5788 should use host coalescing mode register to trigger an
interrupt.
a single directory entry. As a consequnce, name cache purge done by lookup
for fvp when DELETE op for namei is specified, might be not enough to
expunge all namecache entries that were installed for this direntry.
Explicitely call cache_purge(fvp) when msdosfs_rename() succeeded.
PR: kern/93634
MFC after: 1 week
* Support for sam9 "EMAC" controller.
* Support for rmii interface to phy.
at91.c & at91sam9.c:
* Eliminate separate at91sam9.c file.
* Add new devices to at91sam9_devs table.
at91_machdep.c & at at91sam9_machdep.c:
* Automatic chip type determination.
* Remove compile time chip dependencies.
* Eliminate separate at91sam9_machdep.c file.
at91_pmc.c:
* Corrected support for all of the sam926? and sam9g20 chips.
* Remove compile time chip dependencies.
My apologies to Greg for taking so long to take care of it.
versions of controller support different number of ring control
blocks such that adjust code a bit to access known number of
send/receive ring control blocks. Previously bge(4) blindly
accessed 16 send/receive RCBs. Also move initializing standard
receive producer ring producer index, jumbo receive producer ring
producer index and mini receive producer ring producer index to
the end of each receive producer ring initialization.
Do not assume mini receive producer ring is available only when
controller has jumbo frame capability, instead explicitly check
ASIC version BCM5700 to disable mini receive producer ring.
Additionally always enable send ring 0 regardless of controller
versions. Previously bge(4) didn't enable send ring 0 if controller
is BGE_IS_5705_PLUS. Becase bge(4) need 1 send ring to send frames
at least, I have no idea how it would have worked so far.
Submitted by: davidch
dev.bce.<unit>.nvram_dump
Add the capability to write the complete contents of the NVRAM via sysctl
dev.bce.<unit>.nvram_write
These are only available if the kernel option BCE_DEBUG is enabled.
The nvram_write sysctl also requires the kernel option
BCE_NVRAM_WRITE_SUPPORT to be enabled. These are to be used at your
own caution. Since the MAC addresses are stored in the NVRAM, if you
dump one NIC and restore it on another NIC the destination NIC's
MAC addresses will not be preserved. A tool can be made using these
sysctl's to manage the on-chip firmware.
Reviewed by: davidch, yongari
BGE_MI_MODE register accesses. Previously bge(4) used to read
BGE_MI_MODE register to detect whether it needs to disable
autopolling feature or not. Because we don't touch autopolling in
other part of driver there is no reason to read BGE_MI_MODE
register given that we know default value in advance. In order to
achieve the goal, check whether the controller has CPMU(Central
Power Mangement Unit) capability. If controller has CPMU feature,
use 500KHz MII management interface(mdio/mdc) frequency regardless
core clock frequency. Otherwise use default MII clock. While I'm
here, add CPMU register definition.
In bge_miibus_readreg(), rearrange code a bit and remove goto
statement. In bge_miibus_writereg(), make sure to restore
autopolling even if MII write failed. The delay time inserted after
accessing BGE_MI_MODE register increased from 40us to 80us.
The default PHY address is now stored in softc. All PHYs supported
by bge(4) currently uses PHY address 1 but it will be changed when
we add newer controllers. This change will make it easier to change
default PHY address depending on PHY models.
Submitted by: davidch
physical address. Adds a dma tag to the XLR/XLS pci bus with the
lowaddr if the CPU happens to be a XLR C rev.
Submitted by: Sreekanth M. S. (kanthms at netlogicmicro dot com))
special eject command to reappear as modem. It also requires DIR_IN flag
in the command message, so we supply some dummy data along with the command.
Feedback from X080S owners appreciated. I have not a pure Alcatel/TCTMobile
device, but another one under "Svyaznoy" (Связной) brand, and I didn't yet
managed to get it working. It is successfully recognized, it responds to
AT commands, but it shuts up right after successfull CONNECT response.
Reviewed by: hps
- nlge_ioctl handles IFF_UP and IFF_PROMISC flags
- Translate table code, to enable flow based CPU assignment added
disabled by default (can be enabled by a tunable).
- Changed signature of nlge_port_disable to make it consistent with nlge_port_enable
- Removed TXCSUM and VLAN_HW_TAGGING from i/f capabilities.
Submitted by: Sriram Gorti (srgorti at netlogicmicro dot com)
values to zero. A correct solution would involve emulating vector
operations on denormalized values, but this has little effect on accuracy
and is much less complicated for now.
MFC after: 2 weeks
pmap_kextract()) before pmap_bootstrap() is called.
Document the set of pmap functions that may be called before
pmap_bootstrap() is called.
Tested by: bde@
Reviewed by: kib@
Discussed with: jhb@
MFC after: 6 weeks
sysctl and counters for message ring threads (intial version). Update
watermark values, and and decrease the maximum threads to 3 (this will leave
a few CPUs for other processes)
Minor comment fix in nlge.
successfully received a frame but we failed to pass it to upper
stack due to lack of resources. So update if_iqdrops counter
instead of updating if_ierrors counter.
mode in the USB core. The patch mostly consists of updating the USB
HUB code to support USB 3.0 HUBs. This patch also add some more USB
controller methods to support more active-alike USB controllers like
the XHCI which needs to be informed about various device state events.
USB 3.0 HUBs are not tested yet, due to lack of hardware, but are
believed to work.
After this update the initial device descriptor is only read twice
when we know that the bMaxPacketSize is too small for a single packet
transfer of this descriptor.
Approved by: thompsa (mentor)
controllers combine multiple TX requests into single one if there
is room in TX buffer of controller. Updating TX packet counter at
the end of TX completion resulted in incorrect TX packet counter as
axe(4) thought it sent 1 packet. There is no easy way to know how
many combined TX were completed in the callback.
Because this change updates TX packet counter before actual
transmission, it may not be ideal one. But I believe it's better
than showing fake 8kpps under high TX load. With this change, TX
shows 221kpps on Linksus USB200M.
these names are used in data sheet. Also use UnicastPkts,
MulticastPkts and BroadcastPkts instead of UcastPkts, McastPkts
and BcastPkts to clarify its meaning.
Suggested by: bde
addresses that is greater than a superpage in size but not a multiple of
the superpage size, then vm_map_find() is not always expanding the kernel
pmap to support the last few small pages being allocated. These failures
are not commonplace, so this was first noticed by someone porting FreeBSD
to a new architecture. Previously, we grew the kernel page table in
vm_map_findspace() when we found the first available virtual address.
This works most of the time because we always grow the kernel pmap or page
table by an amount that is a multiple of the superpage size. Now, instead,
we defer the call to pmap_growkernel() until we are committed to a range
of virtual addresses in vm_map_insert(). In general, there is another
reason to prefer calling pmap_growkernel() in vm_map_insert(). It makes
it possible for someone to do the equivalent of an mmap(MAP_FIXED) on the
kernel map.
Reported by: Svatopluk Kraus
Reviewed by: kib@
MFC after: 3 weeks
suspicious about 'l' and '1' being confused in numeric constants.
The fear being that some old fart programmer might still think that
he is using a Remmington Noiseless as input terminal device.
An easy way to placate this fear is to use capital 'L' or to put
the 'u' in unsigned constants in front of the 'l'.
Unlike actual MTRR, this only controls the mapping attributes for
subsequent mmap() of /dev/mem. Nonetheless, the support is sufficiently
MTRR-like that Xorg can use it, which translates into an enormous increase
in graphics performance on PowerPC.
MFC after: 2 weeks
trap frame when trap initiated kdb entry, incorrectly calculated the
value of %rsp for trapped thread.
According to Intel(R) 64 and IA-32 Architectures Software Developer's Manual
Volume 3A: System Programming Guide, Part 1, rev. 035, 6.14.2 64-Bit Mode
Stack Frame, "64-bit mode ... pushes SS:RSP unconditionally, rather than
only on a CPL change."
Even assuming the conditional push of the %ss:%rsp, the calculation
was still wrong because sizeof(tf_ss) + sizeof(tf_rsp) == 16 on amd64.
Always use the tf_rsp from trap frame. The change supposedly fixes
stepping when using kgdb backend for kdb.
Submitted by: Zhouyi Zhou <zhouzhouyi gmail com>
PR: amd64/151167
Reviewed by: avg
MFC after: 1 week
scratch. This driver adds support for USB3.0 devices. The XHCI
interface is also backwards compatible to USB2.0 and USB1.0 and will
evntually replace the OHCI/UHCI and EHCI drivers.
There will be follow-up commits during the coming week to link the
driver into the default kernel build and add missing USB3.0
functionality in the USB core. Currently only the driver files are
committed.
Approved by: thompsa (mentor)
- Wakeup multiple threads per core using message ring watermark interrupts.
- Update message ring handler registration, use the real device station id
for registering interrupts.
- rge/nlge: update for the new message ring registration code.
- rge/nlge: use 2 message ring stations for incoming packets, this will
allow more messages to be queued.
- nlge: comment fixes, remove unused variable
- style and whitespace fixes
it (the root mount code) into a new file called vfs_mountroot.c
The split is almost trivial, as the code is almost perfectly
non-intertwined. The only adjustment needed was to move the UMA
zone allocation out of vfs_mountroot() [in vfs_mountroot.c] and
into vfs_mount.c, where it had to be done as a SYSINIT [see
vfs_mount_init()].
There are no functional changes with this commit.
different PHY instance being selected and isolation out into the wrappers
around the service methods rather than duplicating them over and over
again (besides, a PHY driver shouldn't need to care about which instance
it actually is).
- Centralize the check for the need to isolate a non-zero PHY instance not
supporting isolation in mii_mediachg() and just ignore it rather than
panicing, which should sufficient given that a) things are likely to
just work anyway if one doesn't plug in more than one port at a time and
b) refusing to attach in this case just leaves us in a unknown but most
likely also not exactly correct configuration (besides several drivers
setting MIIF_NOISOLATE didn't care about these anyway, probably due to
setting this flag for no real reason).
- Minor fixes like removing unnecessary setting of sc->mii_anegticks,
using sc->mii_anegticks instead of hardcoded values etc.
the linker_load_file methods. The change is that the consequent
linker_file_unload() call is not under the vnode lock anymore.
This prevents the LOR between kernel linker sx xlock and vnode lock,
because linker_file_unload() relocks kernel linker lock.
MFC after: 2 weeks
the miibus attached to octe interfaces.
o) Add an SMI/MDIO interface to the MV88E61XX and use it for the switch PHY on
the Lanner MR-320. An actual driver for the switch PHY will come later.
Note that for now it intercepts and fakes MII_BMSR reads to prevent the
miibus from talking to anything but the switch itself.
bus interface does that's special here now is to use a 64-bit register size.
In theory, uart(4) ought to support a regsz as well as regshft and support
64-bit registers directly.
Also use the UART class's range rather than a hand-coded 1024 for the address
range.
and set memory attributes appropriately for mmap() calls on /dev/console.
Xorg no longer uses /dev/console to mmap the framebuffer, so framebuffer
write combining support in X will arrive in the next patch.
by just caching the mode for later use by pmap_enter(), following amd64.
While here, correct some mismerges from mmu_oea64 -> mmu_oea and clean
up some dead code found while fixing the fictitious page behavior.
This patch is significantly based on previous work by jkim.
List of changes:
- added comments that describe topology uniformity assumption
- added reference to Intel Processor Topology Enumeration article
- documented a few global variables that describe topology
- retired weirdly set and used logical_cpus variable
- changed fallback code for mp_ncpus > 0 case, so that CPUs are treated
as being different packages rather than cores in a single package
- moved AMD-specific code to topo_probe_amd [jkim]
- in topo_probe_0x4() follow Intel-prescribed procedure of deriving SMT
and core masks and match APIC IDs against those masks [started by
jkim]
- in topo_probe_0x4() drop code for double-checking topology parameters
by looking at L1 cache properties [jkim]
- in topo_probe_0xb() add fallback path to topo_probe_0x4() as
prescribed by Intel [jkim]
Still to do:
- prepare for upcoming AMD CPUs by using new mechanism of uniform
topology description [pointed by jkim]
- probe cache topology in addition to CPU topology and probably use that
for scheduler affinity topology; e.g. Core2 Duo and Athlon II X2 have
the same CPU topology, but Athlon cores do not share L2 cache while
Core2's do (no L3 cache in both cases)
- think of supporting non-uniform topologies if they are ever
implemented for platforms in question
- think how to better described old HTT vs new HTT distinction, HTT vs
SMT can be confusing as SMT is a generic term
- more robust code for marking CPUs as "logical" and/or "hyperthreaded",
use HTT mask instead of modulo operation
- correct support for halting logical and/or hyperthreaded CPUs, let
scheduler know that it shouldn't schedule any threads on those CPUs
PR: kern/145385 (related)
In collaboration with: jkim
Tested by: Sergey Kandaurov <pluknet@gmail.com>,
Jeremy Chadwick <freebsd@jdc.parodius.com>,
Chip Camden <sterling@camdensoftware.com>,
Steve Wills <steve@mouf.net>,
Olivier Smedts <olivier@gid0.org>,
Florian Smeets <flo@smeets.im>
MFC after: 1 month
IFF_ALLMULTI/IFF_PROMISC as well as multicast filter configuration.
Rewrite RX filter logic to reduce number of register accesses and
make it handle promiscuous/allmulti toggling without controller
reinitialization.
Previously rl(4) counted on controller reinitialization to reprogram
promiscuous configuration but r211767 resulted in avoiding
controller reinitialization whenever promiscuous mode is toggled.
To address this, keep track of driver's view of interface state and
handle IFF_ALLMULTI/IFF_PROMISC changes without reinitializing
controller. This should fix a regression introduced in r211267.
While I'm here remove unnecessary variable reassignment in ioctl
handler.
PR: kern/151079
MFC after: 1 week
SI_SUB_RUN_SCHEDULER+SI_ORDER_ANY should only be used to call
scheduler() function which turns the initial thread into swapper proper
and thus there is no further SYSINIT processing.
Other SYSINITs with SI_SUB_RUN_SCHEDULER+SI_ORDER_ANY may get ordered
after scheduler() and thus never executed. That particular relative
order is semi-arbitrary.
Thus, change such places to use SI_ORDER_MIDDLE.
Also, use SI_ORDER_MIDDLE instead of correct, but less appealing,
SI_ORDER_ANY - 1.
MFC after: 1 week
number of unexplained interrupt problems. For some reason, using HPET
interrupts there breaks HDA sound. Legacy route mode interrupts reported
to work fine there.
controllers. bge(4) exported MAC statistics on controllers that
maintain the statistics in the NIC's internal memory. Newer
controllers require register access to fetch these values. These
counters provide useful information to diagnose driver issues.
The check for alignment should be made against the physical address and not
the virtual address that maps it.
Sponsored by: NetApp
Submitted by: Will McGovern (will at netapp dot com)
Reviewed by: mjacob, jhb
parent driver. Use that information to configure flow-control.
One drawback is there is no way to disable flow-control as we still
don't have proper way to not advertise RX/TX pause capability to
link partner. But I don't think it would cause severe problems and
users can selectively disable flow-control in switch port.
- license clause now contains "AUTHOR AND CONTRIBUTORS"
instead of just "AUTHOR"
- Add license/copyright to gpioc.c
Spotted by: Edward Tomasz Napierala, Andrew Turner
which were raised during hot-swap events. Now such events trigger cam
rescans, as is done in the mps driver.
Submitted by: Mark Johnston <mjohnston at sandvine dot com>
This fixes the bug where setting bw > 1 MTU/tick resulted in
infinite bandwidth if io_fast=1
PR: 147245 148429
Obtained from: Riccardo Panicucci
MFC after: 3 days
has reached. This reduced number of dropped frames when
flow-control is enabled. Previously it dropped incoming frames once
RX MBUF low watermark has reached. The value used in MAC RX MBUF
low watermark is greater than or equal to 4 so receiving two more
RX frames should not be a problem.
Obtained from: OpenBSD
net.inet.ip.fw.one_pass and always moved to the next rule
in case of a successful nat.
This should fix several related PR (waiting for feedback
before closing them)
PR: 145167 149572 150141
MFC after: 3 days
- handle compat32 processes;
- remove the checks for copied in addresses to belong into valid
usermode range, proc_rwmem() does this;
- simplify loop reading single string, limit the total amount of strings
collected by ARG_MAX bytes;
- correctly add '\0' at the end of each copied string;
- fix style.
In linprocfs_doprocenviron():
- unlock the process before calling copyin code [1]. The process is held
by pseudofs.
In linprocfs_doproccmdline:
- use linprocfs_doargv() to handle !curproc case for which p_args is not cached.
Reported by: plulnet [1]
Tested by: pluknet
Approved by: des (linprocfs maintainer, previous
version of the patch)
MFC after: 3 weeks
- GPIO bus controller interface
- GPIO bus interface
- Implementation of GPIO led(4) compatible device
- Implementation of iic(4) bus over GPIO (author: Luiz Otavio O Souza)
Tested by: Luiz Otavio O Souza, Alexandr Rybalko
- Sync shared code with Intel internal
- New client chipset support added
- em driver - fixes to 82574, limit queues to 1 but use MSIX
- em driver - large changes in TX checksum offload and tso
code, thanks to yongari.
- some small changes for watchdog issues.
- igb driver - local timer watchdog code was missing locking
this and a couple other watchdog related fixes.
- bug in rx discard found by Andrew Boyer, check for null pointer
MFC: a week
o) Give a virtual address for I/O ports on n64.
o) On the Portwell CAM-0100, return the right IRQ for the on-board SATA.
o) Except on bridges, only set PORTEN and MEMEN on devices that have I/O or
memory BARs respectively.
o) Disable PORTEN and MEMEN while reprogramming BARs.
o) On the Lanner MR-955, set the Tx DMA power register for the on-board Promise
SATA controller.
- Check for SMAP data from the loader first. If it exists, don't bother
doing any VM86 calls at all. This will be more friendly for non-BIOS
boot environments such as EFI, etc.
- Move the base memory setup into a new basemem_setup() routine instead
of duplicating it.
- Simplify the XEN case by removing all of the VM86/SMAP parsing code rather
than just jumping over it.
- Adjust some comments to better explain the code flow.
MFC after: 2 weeks
un-expiring.
The previous version of code have no locking when testing rt_refcnt.
The result of the lack of locking may result in a condition where
a routing entry have a reference count but at the same time have
RTPRF_OURS bit set and an expiration timer. These would eventually
lead to a panic:
panic: rtqkill route really not free
When the system have ICMP redirects accepted from local gateway
in a moderate frequency, for instance.
Commit this workaround for now until we have some better solution.
PR: kern/149804
Reviewed by: bz
Tested by: Zhao Xin, Pete French
MFC after: 2 weeks
specific devfs path already exists.
The function will be used from kern_conf.c to detect duplicate device
registrations. Callers must hold the devmtx mutex.
Reviewed by: kib
links. The reference counting is needed to be able to determine if a
specific devfs path exists. For true device file paths we can traverse
the cdevp_list but a separate directory list is needed for user created
symbolic links.
Add a new directory entry flag DE_USER to mark entries which should
unreference their parent directory on deletion.
A new function to traverse cdevp_list and the directory list will be
introduced in a separate commit.
Idea from: kib
Reviewed by: kib
- XLS B0 and later revision chips have PCIe link 2 & 3 mapped to different
PIC interrupts. Update pic.h, board.h and xlr_pci.c to reflect this.
- remove debug prints in xlr_pci.c
- add more processor IDs to board.h, add function xlr_is_xls_b0()
- some style(9) and whitespace fixes
Retry IO once with ZIO_FLAG_TRYHARD before declaring a pool faulted
OpenSolaris revision and Bug IDs:
9725:0bf7402e8022
6843014 ZFS B_FAILFAST handling is broken
Approved by: delphij (mentor)
Obtained from: OpenSolaris (Bug ID 6843014)
MFC after: 3 weeks
OpenSolaris revision and Bug IDs:
9701:cc5b64682e64
6803605 should be able to offline log devices
6726045 vdev_deflate_ratio is not set when offlining a log device
6599442 zpool import has faults in the display
Approved by: delphij (mentor)
Obtained from: OpenSolaris (Bug ID 6803605, 6726045, 6599442)
MFC after: 3 weeks
not be used outside of the reassembly queue implementation. Provide a new
function to flush all segments from a reassembly queue and call it from the
appropriate places instead of manipulating the queue directly.
Sponsored by: FreeBSD Foundation
Reviewed by: andre, gnn, rpaulo
MFC after: 2 weeks
these could be made dependent on either of the octusb or octe options, but
making them standard fixes a number of option combinations that were previously
broken.
clean up most layering violations:
sys/boot/i386/common/rbx.h:
RBX_* defines
OPT_SET()
OPT_CHECK()
sys/boot/common/util.[ch]:
memcpy()
memset()
memcmp()
bcpy()
bzero()
bcmp()
strcmp()
strncmp() [new]
strcpy()
strcat()
strchr()
strlen()
printf()
sys/boot/i386/common/cons.[ch]:
ioctrl
putc()
xputc()
putchar()
getc()
xgetc()
keyhit() [now takes number of seconds as an argument]
getstr()
sys/boot/i386/common/drv.[ch]:
struct dsk
drvread()
drvwrite() [new]
drvsize() [new]
sys/boot/common/crc32.[ch] [new]
sys/boot/common/gpt.[ch] [new]
- Teach gptboot and gptzfsboot about new files. I haven't touched the
rest, but there is still a lot of code duplication to be removed.
- Implement full GPT support. Currently we just read primary header and
partition table and don't care about checksums, etc. After this change we
verify checksums of primary header and primary partition table and if
there is a problem we fall back to backup header and backup partition
table.
- Clean up most messages to use prefix of boot program, so in case of an
error we know where the error comes from, eg.:
gptboot: unable to read primary GPT header
- If we can't boot, print boot prompt only once and not every five
seconds.
- Honour newly added GPT attributes:
bootme - this is bootable partition
bootonce - try to boot from this partition only once
bootfailed - we failed to boot from this partition
- Change boot order of gptboot to the following:
1. Try to boot from all the partitions that have both 'bootme'
and 'bootonce' attributes one by one.
2. Try to boot from all the partitions that have only 'bootme'
attribute one by one.
3. If there are no partitions with 'bootme' attribute, boot from
the first UFS partition.
- The 'bootonce' functionality is implemented in the following way:
1. Walk through all the partitions and when 'bootonce'
attribute is found without 'bootme' attribute, remove
'bootonce' attribute and set 'bootfailed' attribute.
'bootonce' attribute alone means that we tried to boot from
this partition, but boot failed after leaving gptboot and
machine was restarted.
2. Find partition with both 'bootme' and 'bootonce' attributes.
3. Remove 'bootme' attribute.
4. Try to execute /boot/loader or /boot/kernel/kernel from that
partition. If succeeded we stop here.
5. If execution failed, remove 'bootonce' and set 'bootfailed'.
6. Go to 2.
If whole boot succeeded there is new /etc/rc.d/gptboot script coming
that will log all partitions that we failed to boot from (the ones with
'bootfailed' attribute) and will remove this attribute. It will also
find partition with 'bootonce' attribute - this is the partition we
booted from successfully. The script will log success and remove the
attribute.
All the GPT updates we do here goes to both primary and backup GPT if
they are valid. We don't touch headers or partition tables when
checksum doesn't match.
Reviewed by: arch (Message-ID: <20100917234542.GE1902@garage.freebsd.pl>)
Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
MFC after: 2 weeks
attribute (it should be allowed only to unset it), but for test purposes it
might be useful, so the current code allows it.
Reviewed by: arch@ (Message-ID: <20100917234542.GE1902@garage.freebsd.pl>)
MFC after: 2 weeks
GPT_ENT_ATTR_BOOTME - this is bootable partition
GPT_ENT_ATTR_BOOTONCE - try to boot only once from this partition
GPT_ENT_ATTR_BOOTFAILED - set this flag if we cannot boot from partition
containing GPT_ENT_ATTR_BOOTONCE flag; note that if we cannot
boot from partition that contains only GPT_ENT_ATTR_BOOTME flag,
the GPT_ENT_ATTR_BOOTFAILED flag won't be set
According to wikipedia Microsoft TechNet says that attributes are divided into
two halves: the lower 4 bytes representing partition independent attributes,
and the upper 4 bytes are partition type dependent. Microsoft is already using
bits 60 (read-only), 62 (hidden) and 63 (do not automount) and I'd like to not
collide with those, so we are using bit 59 (bootme), 58 (bootonce) and 57
(bootfailed).
Reviewed by: arch (Message-ID: <20100917234542.GE1902@garage.freebsd.pl>)
MFC after: 2 weeks
in the kernel (just as inet_ntoa() and inet_aton()) are and sync their
prototype accordingly with already mentioned functions.
Sponsored by: Sandvine Incorporated
Reviewed by: emaste, rstone
Approved by: dfr
MFC after: 2 weeks
driver to try to switch interrupt handlers at setup. It's not a very
good implementation of bus_teardown_intr, though.
o) Set cache line size and latency timers for PCI devices per Linux.
o) Reset and configure the bus from scratch rather than expecting U-Boot to
do it for us. Values and configuration from Linux, U-Boot and comments
in the Cavium Simple Executive sources.
o) Do a resource assignment and bus numbering pass in the absence of a PCI
BIOS or firmware that will do it for us.
XXX This has to be the third or fourth instance of this in FreeBSD and
it would be nice to have it become part of the PCI bus driver itself,
like it is on Linux.
o) Fix interrupt mapping for and adjust bus configuration for the Lanner
MR-955, based on information provided by Lanner.
too many bge(4) controllers there and model name does not
necessarily match asic/chip revision. Relying on VPD string made
it hard to identify exact asic/chip revision so the first step to
debug bge(4) was getting exact asic/chip information with verbose
boot which may not be available on production server.
or some variation in the path, the new version assumes that $0 is
newvers.sh path, and that dirname $0/.. is the same as $S aka $SYSDIR.
It also removes knowledge of ${MACHINE} and ${MACHINE_ARCH}, which is
also good.
# I've had this in my tree for about 6 months now, which is why I
# didn't notice that I broke it in r209510 and that was fixed in
# r212954. This should finally resolve the issues people had with
# r204824 as well as address the issues that motivated r204824.
driver-maintained ifnet fields (such as if_drv_flags).
- Use soft locks as the mutex that protects each interface's knote list
rather than using the global knote list lock. Also, use the softc
for kn_hook instead of the cdev.
- Use mtx_sleep() instead of tsleep() when blocking in the read routines.
This fixes a lost wakeup race.
- Remove D_NEEDGIANT now that the cdevsw routines use the softc lock
where locking is needed.
- Lock IFQ when calculating the result for FIONREAD in tap(4). tun(4)
already did this.
- Remove remaining spl calls.
Submitted by: Marcin Cieslak saper of saper|info (3)
MFC after: 2 weeks
the location, apply elf_relocaddr to the symbol value to have right
values for the symbols from dpcpu segment.
PR: kern/147769
Discussed with: avg
Tested by: marius
MFC after: 2 weeks
This is a followup to r212964.
stack_print call chain obtains linker sx lock and thus potentially may
lead to a deadlock depending on a kind of a panic.
stack_print_ddb doesn't acquire any locks and it doesn't use any
facilities of ddb backend.
Using stack_print_ddb outside of DDB ifdef required taking a number of
helper functions from under it as well.
It is a good idea to rename linker_ddb_* and stack_*_ddb functions to
have 'unlocked' component in their name instead of 'ddb', because those
functions do not use any DDB services, but instead they provide unlocked
access to linker symbol information. The latter was previously needed
only for DDB, hence the 'ddb' name component.
Alternative is to ditch unlocked versions altogether after implementing
proper panic handling:
1. stop other cpus upon a panic
2. make all non-spinlock lock operations (mutex, sx, rwlock) be a no-op
when panicstr != NULL
Suggested by: mdf
Discussed with: attilio
MFC after: 2 weeks
address spaces
There has been no need to do that starting with ACPICA 20040427 as
AcpiEnableSubsystem() installs the handlers automatically.
Additionaly, explicitly calling AcpiInstallAddressSpaceHandler before
AcpiEnableSubsystem is not supported by ACPICA and leads to too early
execution of _REG methods in some DSDTs, which may result in problems.
Big thanks to Robert Moore of ACPICA/Intel for explaining the above.
Reported by: Daniel Bilik <daniel.bilik@neosystem.cz>
Tested by: Daniel Bilik <daniel.bilik@neosystem.cz>
Reviewed by: jkim
Suggested by: "Moore, Robert" <robert.moore@intel.com>
MFC after: 1 week