- correct the ethernet payload remainder which
must be post-offseted by -14 bytes instead of
0 bytes. This is not very clearly defined in the
NCM specification.
- add development feature about limiting the
maximum datagram count in each NCM payload.
- zero-pad alignment data
- add TX-interval tuning sysctl
Approved by: thompsa (mentor)
Link UP state could be reported first before actual completion of
auto-negotiation. This change makes bge(4) reprogram BGE_MAC_MODE,
BGE_TX_MODE and BGE_RX_MODE register only after controller got a
valid link.
receive producer ring only for BCM5700. It was believed that
BCM5700 with external SSRAM is the only controller that supports
mini ring but it seems all BCM570[0-4] requires to disable mini
receive producer ring. Otherwise, it caused unexpected RX DMA
error or watchdog timeouts.
Reported by: marius, Steve Kargl <sgk <> troutmask dot apl dot washington dot edu>
Tested by: marius, Steve Kargl <sgk <> troutmask dot apl dot washington dot edu>
Short description of the changes:
- attempt to retry some commands for which it is possible (read, query)
- always make a short sleep before checking EC status in polled mode
- periodically poll EC status in interrupt mode
- change logic for detecting broken interrupt delivery and falling back
to polled mode
- check that EC is ready for input before starting a new command, wait
if necessary
This commit is based on the original patch by David Naylor.
PR: kern/150517
Submitted by: David Naylor <naylor.b.david@gmail.com>
Reviewed by: jkim
MFC after: 3 weeks
before setting the flag, interrupt was already enabled such that
interrupt handler could be run before setting IFF_DRV_RUNNING flag.
This can lose initial link state change interrupt which in turn
make bge(4) think that it still does not have valid link. Fix this
race by protecting the taskqueue with a driver lock.
While I'm here move reenabling interrupt code after handling of link
state chage.
Reviewed by: davidch
Previously bge(4) always enabled auto polling for non-BGE_FLAG_TBI
controllers. With this change, auto polling is not used anymore so
polling through mii(4) was introduced.
Reviewed by: davidch
It seems axe(4) controllers support interrupt endpoint such that
enabling interrupt endpoint generates about 1000 interrupts/sec.
Controllers transfer 8 bytes data through interrupt endpoint and
the data include link UP/DOWN state as well as some PHY related
information. Previously axe(4) didn't use the transferred data and
didn't even try to read the data. Because axe(4) counts on mii(4)
to detect link state changes there is no need to use interrupt
endpoint here.
This change fixes generation of unnecessary interrupts which was
seen when interface is brought to UP.
No objections from: hselasky
opposition to the change, since really we need to implement missing
functionality in drivers or the 802.3 layer.
For now, restore a reminder message for a missing rum_update_mcast, but
print it only once.
controller, but make it optional.
After a problem report from Andrew Boyer, it looks like the LSI
chip may have issues (the watchdog timer fired) if too many aborts
are sent down to the chip at the same time. We know that task
management commands are serialized, and although the manual doesn't
say it, it may be a good idea to just send one at a time.
But, since I'm not certain that this is necessary, add a tunable
and sysctl variable (hw.mps.%d.allow_multiple_tm_cmds) to control
the driver's behavior.
mps.c: Add support for the sysctl and tunable, and add a
comment about the possible return values to
mps_map_command().
mps_sas.c: Run all task management commands through two new
routines, mpssas_issue_tm_request() and
mpssas_complete_tm_request().
This allows us to optionally serialize task
management commands. Also, change things so that
the response to a task management command always
comes back through the callback. (Before it could
come via the callback or the return value.)
mpsvar.h: Add softc variables for the list of active task
management commands, the number of active commands,
and whether we should allow multiple active task
management commands. Add an active command flag.
mps.4: Describe the new sysctl/loader tunable variable.
Sponsored by: Spectra Logic Corporation
as 5788. This caused BGE_MISC_LOCAL_CTL register is used to
generate link state change interrupt for non-5788 controllers. The
interrupt handler may or may not detect link state attention as
status block wouldn't be updated when an interrupt was generated
with BGE_MISC_LOCAL_CTL register. All controllers except 5700 and
5788 should use host coalescing mode register to trigger an
interrupt.
versions of controller support different number of ring control
blocks such that adjust code a bit to access known number of
send/receive ring control blocks. Previously bge(4) blindly
accessed 16 send/receive RCBs. Also move initializing standard
receive producer ring producer index, jumbo receive producer ring
producer index and mini receive producer ring producer index to
the end of each receive producer ring initialization.
Do not assume mini receive producer ring is available only when
controller has jumbo frame capability, instead explicitly check
ASIC version BCM5700 to disable mini receive producer ring.
Additionally always enable send ring 0 regardless of controller
versions. Previously bge(4) didn't enable send ring 0 if controller
is BGE_IS_5705_PLUS. Becase bge(4) need 1 send ring to send frames
at least, I have no idea how it would have worked so far.
Submitted by: davidch
dev.bce.<unit>.nvram_dump
Add the capability to write the complete contents of the NVRAM via sysctl
dev.bce.<unit>.nvram_write
These are only available if the kernel option BCE_DEBUG is enabled.
The nvram_write sysctl also requires the kernel option
BCE_NVRAM_WRITE_SUPPORT to be enabled. These are to be used at your
own caution. Since the MAC addresses are stored in the NVRAM, if you
dump one NIC and restore it on another NIC the destination NIC's
MAC addresses will not be preserved. A tool can be made using these
sysctl's to manage the on-chip firmware.
Reviewed by: davidch, yongari
BGE_MI_MODE register accesses. Previously bge(4) used to read
BGE_MI_MODE register to detect whether it needs to disable
autopolling feature or not. Because we don't touch autopolling in
other part of driver there is no reason to read BGE_MI_MODE
register given that we know default value in advance. In order to
achieve the goal, check whether the controller has CPMU(Central
Power Mangement Unit) capability. If controller has CPMU feature,
use 500KHz MII management interface(mdio/mdc) frequency regardless
core clock frequency. Otherwise use default MII clock. While I'm
here, add CPMU register definition.
In bge_miibus_readreg(), rearrange code a bit and remove goto
statement. In bge_miibus_writereg(), make sure to restore
autopolling even if MII write failed. The delay time inserted after
accessing BGE_MI_MODE register increased from 40us to 80us.
The default PHY address is now stored in softc. All PHYs supported
by bge(4) currently uses PHY address 1 but it will be changed when
we add newer controllers. This change will make it easier to change
default PHY address depending on PHY models.
Submitted by: davidch
special eject command to reappear as modem. It also requires DIR_IN flag
in the command message, so we supply some dummy data along with the command.
Feedback from X080S owners appreciated. I have not a pure Alcatel/TCTMobile
device, but another one under "Svyaznoy" (Связной) brand, and I didn't yet
managed to get it working. It is successfully recognized, it responds to
AT commands, but it shuts up right after successfull CONNECT response.
Reviewed by: hps
successfully received a frame but we failed to pass it to upper
stack due to lack of resources. So update if_iqdrops counter
instead of updating if_ierrors counter.
mode in the USB core. The patch mostly consists of updating the USB
HUB code to support USB 3.0 HUBs. This patch also add some more USB
controller methods to support more active-alike USB controllers like
the XHCI which needs to be informed about various device state events.
USB 3.0 HUBs are not tested yet, due to lack of hardware, but are
believed to work.
After this update the initial device descriptor is only read twice
when we know that the bMaxPacketSize is too small for a single packet
transfer of this descriptor.
Approved by: thompsa (mentor)
controllers combine multiple TX requests into single one if there
is room in TX buffer of controller. Updating TX packet counter at
the end of TX completion resulted in incorrect TX packet counter as
axe(4) thought it sent 1 packet. There is no easy way to know how
many combined TX were completed in the callback.
Because this change updates TX packet counter before actual
transmission, it may not be ideal one. But I believe it's better
than showing fake 8kpps under high TX load. With this change, TX
shows 221kpps on Linksus USB200M.
these names are used in data sheet. Also use UnicastPkts,
MulticastPkts and BroadcastPkts instead of UcastPkts, McastPkts
and BcastPkts to clarify its meaning.
Suggested by: bde
scratch. This driver adds support for USB3.0 devices. The XHCI
interface is also backwards compatible to USB2.0 and USB1.0 and will
evntually replace the OHCI/UHCI and EHCI drivers.
There will be follow-up commits during the coming week to link the
driver into the default kernel build and add missing USB3.0
functionality in the USB core. Currently only the driver files are
committed.
Approved by: thompsa (mentor)
different PHY instance being selected and isolation out into the wrappers
around the service methods rather than duplicating them over and over
again (besides, a PHY driver shouldn't need to care about which instance
it actually is).
- Centralize the check for the need to isolate a non-zero PHY instance not
supporting isolation in mii_mediachg() and just ignore it rather than
panicing, which should sufficient given that a) things are likely to
just work anyway if one doesn't plug in more than one port at a time and
b) refusing to attach in this case just leaves us in a unknown but most
likely also not exactly correct configuration (besides several drivers
setting MIIF_NOISOLATE didn't care about these anyway, probably due to
setting this flag for no real reason).
- Minor fixes like removing unnecessary setting of sc->mii_anegticks,
using sc->mii_anegticks instead of hardcoded values etc.
number of unexplained interrupt problems. For some reason, using HPET
interrupts there breaks HDA sound. Legacy route mode interrupts reported
to work fine there.
controllers. bge(4) exported MAC statistics on controllers that
maintain the statistics in the NIC's internal memory. Newer
controllers require register access to fetch these values. These
counters provide useful information to diagnose driver issues.
parent driver. Use that information to configure flow-control.
One drawback is there is no way to disable flow-control as we still
don't have proper way to not advertise RX/TX pause capability to
link partner. But I don't think it would cause severe problems and
users can selectively disable flow-control in switch port.
- license clause now contains "AUTHOR AND CONTRIBUTORS"
instead of just "AUTHOR"
- Add license/copyright to gpioc.c
Spotted by: Edward Tomasz Napierala, Andrew Turner
which were raised during hot-swap events. Now such events trigger cam
rescans, as is done in the mps driver.
Submitted by: Mark Johnston <mjohnston at sandvine dot com>
has reached. This reduced number of dropped frames when
flow-control is enabled. Previously it dropped incoming frames once
RX MBUF low watermark has reached. The value used in MAC RX MBUF
low watermark is greater than or equal to 4 so receiving two more
RX frames should not be a problem.
Obtained from: OpenBSD
- GPIO bus controller interface
- GPIO bus interface
- Implementation of GPIO led(4) compatible device
- Implementation of iic(4) bus over GPIO (author: Luiz Otavio O Souza)
Tested by: Luiz Otavio O Souza, Alexandr Rybalko
- Sync shared code with Intel internal
- New client chipset support added
- em driver - fixes to 82574, limit queues to 1 but use MSIX
- em driver - large changes in TX checksum offload and tso
code, thanks to yongari.
- some small changes for watchdog issues.
- igb driver - local timer watchdog code was missing locking
this and a couple other watchdog related fixes.
- bug in rx discard found by Andrew Boyer, check for null pointer
MFC: a week
too many bge(4) controllers there and model name does not
necessarily match asic/chip revision. Relying on VPD string made
it hard to identify exact asic/chip revision so the first step to
debug bge(4) was getting exact asic/chip information with verbose
boot which may not be available on production server.
address spaces
There has been no need to do that starting with ACPICA 20040427 as
AcpiEnableSubsystem() installs the handlers automatically.
Additionaly, explicitly calling AcpiInstallAddressSpaceHandler before
AcpiEnableSubsystem is not supported by ACPICA and leads to too early
execution of _REG methods in some DSDTs, which may result in problems.
Big thanks to Robert Moore of ACPICA/Intel for explaining the above.
Reported by: Daniel Bilik <daniel.bilik@neosystem.cz>
Tested by: Daniel Bilik <daniel.bilik@neosystem.cz>
Reviewed by: jkim
Suggested by: "Moore, Robert" <robert.moore@intel.com>
MFC after: 1 week
StarFire controller does not require controller reinitialization to
program perfect filters. While here, make driver immediately exit
from interrupt/polling handler if driver reinitialized controller.
PR: kern/87506
- Add a single sysctl procedure to all three drivers to read an arbitrary
register (the register is passed as arg2). Use it to replace existing
routines in igb(4) that used a separate routine for each register, and
to add support for missing stats in em(4) and lem(4).
- Move the 'rx_overruns' and 'watchdog_timeouts' stats out of the MAC stats
section as they are driver stats, not MAC counters.
- Simplify the code that creates per-queue stats in igb(4) to use a single
loop and remove duplicated code.
- Properly read all 64 bits of the 'good octets received/transmitted' in
em(4) and lem(4).
- Actually read the interrupt count registers in em(4), and drop the
'host to card' sysctl stats from em(4) as they are not implemented in
any of the hardware this driver supports.
- Restore several stats to em(4) that were lost in the earlier stats
conversion including per-queue stats.
- Export several MAC stats in em(4) that were exported in igb(4) but not
in em(4).
- Export stats in lem(4) using individual sysctls as in em(4) and igb(4).
Reviewed by: jfv
MFC after: 1 week
the device:
- unobscure some of the code by moving it into its own functions
- get rid of some magic numbers
- create similar structure as the reference driver has, this should
make further syncs easier
particular edge case where X-axis resolution is not multiple of font width.
Now we just advance enough scan lines, then deduct a partial scan line.
It is more intuitive than the previous code. Apply the same wisdom to EGA
and VGA planar renderers for consistency.
Reported by: David DEMELIER (demelier dot david at gmail dot com)
When the driver is completely saturated with commands (1024 in the
case of the SAS2008 in my test system), I/O stops. If we tell CAM
that we have one less command slot than we have actually allocated,
everything works fine. We also need a few extra command slots to
allow for aborts and other task management commands to be sent down.
This needs more investigation to determine the root cause, but for
now this fixes things in my testing.
mps.c: Change a printf() to mps_printf().
mps_sas.c: Subtract 5 command slots when we tell CAM how many
commands we can handle.
Add some commented-out logic to print the contents
the CDBs for timed-out commands. This can help
in debugging devices that are timing out. This
will be uncommented once I bring some CAM changes in.
Reported by: Andrew Boyer <aboyer at averesystems dot com>
According to the MPT2 spec, task management commands are
serialized, and so no I/O should start while task management
commands are active.
So, to comply with that, freeze the SIM queue before we send any
task management commands (abort, target reset, etc.) down to the
IOC. We unfreeze the queue once the task management command
completes.
It isn't clear from the spec whether multiple simultaneous task
management commands are supported. Right now it is possible to
have multiple outstanding task management commands, especially in
the abort case. Multiple outstanding aborts do complete
successfully, so it may be supported.
We also don't yet have any recovery mechanism (e.g. reset the IOC)
if the task management command fails.
already updated after allocating mbuf so driver had to use the last
index instead of using next producer index. This should fix driver
hang which may happen under high network load.
Reported by: Igor Sysoev <is <> rambler-co dot ru>, Vlad Galu <dudu <> dudu dot ro>
Tested by: Igor Sysoev <is <> rambler-co dot ru>, Vlad Galu <dudu <> dudu dot ro>
MFC after: 10 days
Add a drain function for struct sysctl_req, and use it for a variety
of handlers, some of which had to do awkward things to get a large
enough SBUF_FIXEDLEN buffer.
Note that some sysctl handlers were explicitly outputting a trailing
NUL byte. This behaviour was preserved, though it should not be
necessary.
Reviewed by: phk (original patch)
- Don't probe for PHYs if we already know to use a SERDES. Unlike as with
cas(4) this only serves to speed up the the device attach though and can
only be determined via the OFW device tree but not from the VPD.
- Don't touch the MIF when using a SERDES.
- Add some missing bus space barriers, mainly in the PCS code path.
- D_TRACKCLOSE may be used there as d_close() are expected to match up
d_open() calls
- Replace the hand-crafted counter and flag with the
device_busy()/device_unbusy() proper usage.
Sponsored by: Sandvine Incorporated
Reported by: Mark Johnston <mjohnston at sandvine dot com>
Tested by: Mark Johnston
Reviewed by: emaste
MFC after: 10 days
when the original offset is bigger than size of one page. X86BIOS macros
cannot be used here because it is assumed address is only linear in a page.
Tested by: netchild
unexpected things in copyout(9) and so wiring the user buffer is not
sufficient to perform a copyout(9) while holding a random mutex.
Requested by: nwhitehorn
ACPI specification sates that if P_LVL2_LAT > 100, then a system doesn't
support C2; if P_LVL3_LAT > 1000, then C3 is not supported.
But there are no such rules for Cx state data returned by _CST. If a
state is not supported it should not be included into the return
package. In other words, any latency value returned by _CST is valid,
it's up to the OS and/or user to decide whether to use it.
Submitted by: nork
Suggested by: mav
MFC after: 1 week
The main goal of this is to generate timer interrupts only when there is
some work to do. When CPU is busy interrupts are generating at full rate
of hz + stathz to fullfill scheduler and timekeeping requirements. But
when CPU is idle, only minimum set of interrupts (down to 8 interrupts per
second per CPU now), needed to handle scheduled callouts is executed.
This allows significantly increase idle CPU sleep time, increasing effect
of static power-saving technologies. Also it should reduce host CPU load
on virtualized systems, when guest system is idle.
There is set of tunables, also available as writable sysctls, allowing to
control wanted event timer subsystem behavior:
kern.eventtimer.timer - allows to choose event timer hardware to use.
On x86 there is up to 4 different kinds of timers. Depending on whether
chosen timer is per-CPU, behavior of other options slightly differs.
kern.eventtimer.periodic - allows to choose periodic and one-shot
operation mode. In periodic mode, current timer hardware taken as the only
source of time for time events. This mode is quite alike to previous kernel
behavior. One-shot mode instead uses currently selected time counter
hardware to schedule all needed events one by one and program timer to
generate interrupt exactly in specified time. Default value depends of
chosen timer capabilities, but one-shot mode is preferred, until other is
forced by user or hardware.
kern.eventtimer.singlemul - in periodic mode specifies how much times
higher timer frequency should be, to not strictly alias hardclock() and
statclock() events. Default values are 2 and 4, but could be reduced to 1
if extra interrupts are unwanted.
kern.eventtimer.idletick - makes each CPU to receive every timer interrupt
independently of whether they busy or not. By default this options is
disabled. If chosen timer is per-CPU and runs in periodic mode, this option
has no effect - all interrupts are generating.
As soon as this patch modifies cpu_idle() on some platforms, I have also
refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions
(if supported) under high sleep/wakeup rate, as fast alternative to other
methods. It allows SMP scheduler to wake up sleeping CPUs much faster
without using IPI, significantly increasing performance on some highly
task-switching loads.
Tested by: many (on i386, amd64, sparc64 and powerc)
H/W donated by: Gheorghe Ardelean
Sponsored by: iXsystems, Inc.
Bring in a driver for the LSI Logic MPT2 6Gb SAS controllers.
This driver supports basic I/O, and works with SAS and SATA drives and
expanders.
Basic error recovery works (i.e. timeouts and aborts) as well.
Integrated RAID isn't supported yet, and there are some known bugs.
So this isn't ready for production use, but is certainly ready for
testing and additional development. For the moment, new commits to this
driver should go into the FreeBSD Perforce repository first
(//depot/projects/mps/...) and then get merged into -current once
they've been vetted.
This has only been added to the amd64 GENERIC, since that is the only
architecture I have tested this driver with.
Submitted by: scottl
Discussed with: imp, gibbs, will
Sponsored by: Yahoo, Spectra Logic Corporation
This reflects actual type used to store and compare child device orders.
Change is mostly done via a Coccinelle (soon to be devel/coccinelle)
semantic patch.
Verified by LINT+modules kernel builds.
Followup to: r212213
MFC after: 10 days
malo and mwl use the firmware framework to access firmware images.
Depending on the firmware modules itself is not required and in this
case even wrong because no modules with those names exist.
Pointed out by: brucec
MFC after: 1 week
handlers, some of which had to do awkward things to get a large enough
FIXEDLEN buffer.
Note that some sysctl handlers were explicitly outputting a trailing NUL
byte. This behaviour was preserved, though it should not be necessary.
Reviewed by: phk
PCI status register to map its current name.
- Use PCIM_* rather than PCIR_* for constants for fields in various AER
registers. I got about half of them right in the previous commit.
MFC after: 1 week
K2 SATA controllers. The chip's status register must be read first, and
as a long, for other registers to be correctly updated after a command, and
this includes the command sequence in device detection as well as the
previously handled case after interrupts. While here, clean up some
previous hacks related to this controller.
Reported by: many
Reviewed by: mav
MFC after: 3 weeks
PCI-express. I used PCIZ_* for ID constants (plain capability IDs use
PCIY_*).
- Add register definitions for the Advanced Error Reporting, Virtual
Channels, and Device Serial Number extended capabilities.
- Teach pciconf -c to list extended as well as plain capabilities. Adds
more detailed parsing for AER, VC, and device serial numbers.
MFC after: 2 weeks
are still bound to BSP. It confuses timer management logic in per-CPU mode
and may cause timer not being reloaded. Check such cases on interrupt
arival and reload timer to give system some more time to manage proper
binding.
discrepencies from the igb version which was the target.
Change the message when neither MSI or MSIX are enabled
and a fallback to Legacy interrupts happen, the existing
message was confusing.
- add identify method to create driver's own device_t
- successfully probe only driver's own device_t instead of any device_t
- (ab)use device order to hopefully be probed/attached after acpi_wmi
PR: kern/147858
Tested by: Maciej Suszko <maciej@suszko.eu>
MFC after: 1 week
- Add special check for case when time expires before being programmed.
This fixes interrupt loss and respectively timer death on attempt to
program very short interval. Increase minimal supported period to more
realistic value.
- Add support for hint.hpet.X.allowed_irqs tunable, allowing manually
specify which interrupts driver allowed to use. Unluckily, many BIOSes
program wrong allowed interrupts mask, so driver tries to stay on safe
side by not using unshareable ISA IRQs. This option gives control over
this limitation, allowing more per-CPU timers to be provided, when FSB
interrupts are not supported. Value of this tunable is bitmask.
- Do not use regular interrupts on virtual machines. QEMU and VirtualBox
do not support them properly, that may cause problems. Stay safe by default.
Same time both QEMU and VirtualBox work fine in legacy_route mode.
VirtualBox also works fine if manually specify allowed ISA IRQs with above.
Both deadline and current_time are time_seconds (+ utc_offset())
casted to unsigned long long. No need to cast to or print as pointers.
MFC after: 4 days