too many bge(4) controllers there and model name does not
necessarily match asic/chip revision. Relying on VPD string made
it hard to identify exact asic/chip revision so the first step to
debug bge(4) was getting exact asic/chip information with verbose
boot which may not be available on production server.
address spaces
There has been no need to do that starting with ACPICA 20040427 as
AcpiEnableSubsystem() installs the handlers automatically.
Additionaly, explicitly calling AcpiInstallAddressSpaceHandler before
AcpiEnableSubsystem is not supported by ACPICA and leads to too early
execution of _REG methods in some DSDTs, which may result in problems.
Big thanks to Robert Moore of ACPICA/Intel for explaining the above.
Reported by: Daniel Bilik <daniel.bilik@neosystem.cz>
Tested by: Daniel Bilik <daniel.bilik@neosystem.cz>
Reviewed by: jkim
Suggested by: "Moore, Robert" <robert.moore@intel.com>
MFC after: 1 week
StarFire controller does not require controller reinitialization to
program perfect filters. While here, make driver immediately exit
from interrupt/polling handler if driver reinitialized controller.
PR: kern/87506
- Add a single sysctl procedure to all three drivers to read an arbitrary
register (the register is passed as arg2). Use it to replace existing
routines in igb(4) that used a separate routine for each register, and
to add support for missing stats in em(4) and lem(4).
- Move the 'rx_overruns' and 'watchdog_timeouts' stats out of the MAC stats
section as they are driver stats, not MAC counters.
- Simplify the code that creates per-queue stats in igb(4) to use a single
loop and remove duplicated code.
- Properly read all 64 bits of the 'good octets received/transmitted' in
em(4) and lem(4).
- Actually read the interrupt count registers in em(4), and drop the
'host to card' sysctl stats from em(4) as they are not implemented in
any of the hardware this driver supports.
- Restore several stats to em(4) that were lost in the earlier stats
conversion including per-queue stats.
- Export several MAC stats in em(4) that were exported in igb(4) but not
in em(4).
- Export stats in lem(4) using individual sysctls as in em(4) and igb(4).
Reviewed by: jfv
MFC after: 1 week
the device:
- unobscure some of the code by moving it into its own functions
- get rid of some magic numbers
- create similar structure as the reference driver has, this should
make further syncs easier
particular edge case where X-axis resolution is not multiple of font width.
Now we just advance enough scan lines, then deduct a partial scan line.
It is more intuitive than the previous code. Apply the same wisdom to EGA
and VGA planar renderers for consistency.
Reported by: David DEMELIER (demelier dot david at gmail dot com)
When the driver is completely saturated with commands (1024 in the
case of the SAS2008 in my test system), I/O stops. If we tell CAM
that we have one less command slot than we have actually allocated,
everything works fine. We also need a few extra command slots to
allow for aborts and other task management commands to be sent down.
This needs more investigation to determine the root cause, but for
now this fixes things in my testing.
mps.c: Change a printf() to mps_printf().
mps_sas.c: Subtract 5 command slots when we tell CAM how many
commands we can handle.
Add some commented-out logic to print the contents
the CDBs for timed-out commands. This can help
in debugging devices that are timing out. This
will be uncommented once I bring some CAM changes in.
Reported by: Andrew Boyer <aboyer at averesystems dot com>
According to the MPT2 spec, task management commands are
serialized, and so no I/O should start while task management
commands are active.
So, to comply with that, freeze the SIM queue before we send any
task management commands (abort, target reset, etc.) down to the
IOC. We unfreeze the queue once the task management command
completes.
It isn't clear from the spec whether multiple simultaneous task
management commands are supported. Right now it is possible to
have multiple outstanding task management commands, especially in
the abort case. Multiple outstanding aborts do complete
successfully, so it may be supported.
We also don't yet have any recovery mechanism (e.g. reset the IOC)
if the task management command fails.
already updated after allocating mbuf so driver had to use the last
index instead of using next producer index. This should fix driver
hang which may happen under high network load.
Reported by: Igor Sysoev <is <> rambler-co dot ru>, Vlad Galu <dudu <> dudu dot ro>
Tested by: Igor Sysoev <is <> rambler-co dot ru>, Vlad Galu <dudu <> dudu dot ro>
MFC after: 10 days
Add a drain function for struct sysctl_req, and use it for a variety
of handlers, some of which had to do awkward things to get a large
enough SBUF_FIXEDLEN buffer.
Note that some sysctl handlers were explicitly outputting a trailing
NUL byte. This behaviour was preserved, though it should not be
necessary.
Reviewed by: phk (original patch)
- Don't probe for PHYs if we already know to use a SERDES. Unlike as with
cas(4) this only serves to speed up the the device attach though and can
only be determined via the OFW device tree but not from the VPD.
- Don't touch the MIF when using a SERDES.
- Add some missing bus space barriers, mainly in the PCS code path.
- D_TRACKCLOSE may be used there as d_close() are expected to match up
d_open() calls
- Replace the hand-crafted counter and flag with the
device_busy()/device_unbusy() proper usage.
Sponsored by: Sandvine Incorporated
Reported by: Mark Johnston <mjohnston at sandvine dot com>
Tested by: Mark Johnston
Reviewed by: emaste
MFC after: 10 days
when the original offset is bigger than size of one page. X86BIOS macros
cannot be used here because it is assumed address is only linear in a page.
Tested by: netchild
unexpected things in copyout(9) and so wiring the user buffer is not
sufficient to perform a copyout(9) while holding a random mutex.
Requested by: nwhitehorn
ACPI specification sates that if P_LVL2_LAT > 100, then a system doesn't
support C2; if P_LVL3_LAT > 1000, then C3 is not supported.
But there are no such rules for Cx state data returned by _CST. If a
state is not supported it should not be included into the return
package. In other words, any latency value returned by _CST is valid,
it's up to the OS and/or user to decide whether to use it.
Submitted by: nork
Suggested by: mav
MFC after: 1 week
The main goal of this is to generate timer interrupts only when there is
some work to do. When CPU is busy interrupts are generating at full rate
of hz + stathz to fullfill scheduler and timekeeping requirements. But
when CPU is idle, only minimum set of interrupts (down to 8 interrupts per
second per CPU now), needed to handle scheduled callouts is executed.
This allows significantly increase idle CPU sleep time, increasing effect
of static power-saving technologies. Also it should reduce host CPU load
on virtualized systems, when guest system is idle.
There is set of tunables, also available as writable sysctls, allowing to
control wanted event timer subsystem behavior:
kern.eventtimer.timer - allows to choose event timer hardware to use.
On x86 there is up to 4 different kinds of timers. Depending on whether
chosen timer is per-CPU, behavior of other options slightly differs.
kern.eventtimer.periodic - allows to choose periodic and one-shot
operation mode. In periodic mode, current timer hardware taken as the only
source of time for time events. This mode is quite alike to previous kernel
behavior. One-shot mode instead uses currently selected time counter
hardware to schedule all needed events one by one and program timer to
generate interrupt exactly in specified time. Default value depends of
chosen timer capabilities, but one-shot mode is preferred, until other is
forced by user or hardware.
kern.eventtimer.singlemul - in periodic mode specifies how much times
higher timer frequency should be, to not strictly alias hardclock() and
statclock() events. Default values are 2 and 4, but could be reduced to 1
if extra interrupts are unwanted.
kern.eventtimer.idletick - makes each CPU to receive every timer interrupt
independently of whether they busy or not. By default this options is
disabled. If chosen timer is per-CPU and runs in periodic mode, this option
has no effect - all interrupts are generating.
As soon as this patch modifies cpu_idle() on some platforms, I have also
refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions
(if supported) under high sleep/wakeup rate, as fast alternative to other
methods. It allows SMP scheduler to wake up sleeping CPUs much faster
without using IPI, significantly increasing performance on some highly
task-switching loads.
Tested by: many (on i386, amd64, sparc64 and powerc)
H/W donated by: Gheorghe Ardelean
Sponsored by: iXsystems, Inc.
Bring in a driver for the LSI Logic MPT2 6Gb SAS controllers.
This driver supports basic I/O, and works with SAS and SATA drives and
expanders.
Basic error recovery works (i.e. timeouts and aborts) as well.
Integrated RAID isn't supported yet, and there are some known bugs.
So this isn't ready for production use, but is certainly ready for
testing and additional development. For the moment, new commits to this
driver should go into the FreeBSD Perforce repository first
(//depot/projects/mps/...) and then get merged into -current once
they've been vetted.
This has only been added to the amd64 GENERIC, since that is the only
architecture I have tested this driver with.
Submitted by: scottl
Discussed with: imp, gibbs, will
Sponsored by: Yahoo, Spectra Logic Corporation
This reflects actual type used to store and compare child device orders.
Change is mostly done via a Coccinelle (soon to be devel/coccinelle)
semantic patch.
Verified by LINT+modules kernel builds.
Followup to: r212213
MFC after: 10 days
malo and mwl use the firmware framework to access firmware images.
Depending on the firmware modules itself is not required and in this
case even wrong because no modules with those names exist.
Pointed out by: brucec
MFC after: 1 week
handlers, some of which had to do awkward things to get a large enough
FIXEDLEN buffer.
Note that some sysctl handlers were explicitly outputting a trailing NUL
byte. This behaviour was preserved, though it should not be necessary.
Reviewed by: phk
PCI status register to map its current name.
- Use PCIM_* rather than PCIR_* for constants for fields in various AER
registers. I got about half of them right in the previous commit.
MFC after: 1 week
K2 SATA controllers. The chip's status register must be read first, and
as a long, for other registers to be correctly updated after a command, and
this includes the command sequence in device detection as well as the
previously handled case after interrupts. While here, clean up some
previous hacks related to this controller.
Reported by: many
Reviewed by: mav
MFC after: 3 weeks
PCI-express. I used PCIZ_* for ID constants (plain capability IDs use
PCIY_*).
- Add register definitions for the Advanced Error Reporting, Virtual
Channels, and Device Serial Number extended capabilities.
- Teach pciconf -c to list extended as well as plain capabilities. Adds
more detailed parsing for AER, VC, and device serial numbers.
MFC after: 2 weeks
are still bound to BSP. It confuses timer management logic in per-CPU mode
and may cause timer not being reloaded. Check such cases on interrupt
arival and reload timer to give system some more time to manage proper
binding.
discrepencies from the igb version which was the target.
Change the message when neither MSI or MSIX are enabled
and a fallback to Legacy interrupts happen, the existing
message was confusing.
- add identify method to create driver's own device_t
- successfully probe only driver's own device_t instead of any device_t
- (ab)use device order to hopefully be probed/attached after acpi_wmi
PR: kern/147858
Tested by: Maciej Suszko <maciej@suszko.eu>
MFC after: 1 week
- Add special check for case when time expires before being programmed.
This fixes interrupt loss and respectively timer death on attempt to
program very short interval. Increase minimal supported period to more
realistic value.
- Add support for hint.hpet.X.allowed_irqs tunable, allowing manually
specify which interrupts driver allowed to use. Unluckily, many BIOSes
program wrong allowed interrupts mask, so driver tries to stay on safe
side by not using unshareable ISA IRQs. This option gives control over
this limitation, allowing more per-CPU timers to be provided, when FSB
interrupts are not supported. Value of this tunable is bitmask.
- Do not use regular interrupts on virtual machines. QEMU and VirtualBox
do not support them properly, that may cause problems. Stay safe by default.
Same time both QEMU and VirtualBox work fine in legacy_route mode.
VirtualBox also works fine if manually specify allowed ISA IRQs with above.
Both deadline and current_time are time_seconds (+ utc_offset())
casted to unsigned long long. No need to cast to or print as pointers.
MFC after: 4 days
to pad with 0xFF when it encounter short frames. According to RFC
1042 the pad bytes should be 0x00.
Because manual padding consumes extra CPU cycles, introduce a new
tunable which controls the padding behavior. Turning this tunable
on will have driver pad manually but it's disabled by default. Users
can enable software padding by setting the following tunable to
non-zero value.
dev.sis.%d.manual_pad="1"
PR: kern/35422 (patch not used)
prevented driver from working on big-endian machines. Also rewrite
station address programming to make it work on strict-alignment
architectures. With this change, sis(4) now works on sparc64 and
performance number looks good even though sis(4) have to apply
fixup code to align received frames on 2 bytes boundary on sparc64.
response to DMA activate FIS under certain circumstances. This is
recommended fix from chip datasheet. If triggered, this bug most likely
cause write command timeout.
MFC after: 2 weeks
value 0xff. On hot-plug this value confuses ata_generic_reset() device
presence detection logic. As soon as we already know drive presence from
SATA hard reset, hint ata_generic_reset() to wait for device signature
until success or full timeout.
greater than 65535 bytes then the CDC driver might not work as expected, which
is not likely with the existing USB speeds.
Submitted by: Hans Petter Selasky
allmulti is toggled. Controller does not require reinitialization.
This removes unnecessary controller reinitialization whenever
tcpdump is used.
While I'm here remove unnecessary variable reinitialization.
configured TX/RX MACs before getting a valid link. After that, when
link state change callback is called, it called device
initialization again to reconfigure TX/RX MACs depending on
resolved link state. This hack created several bad side effects and
it required more hacks to not collide with sis_tick callback as
well as disabling switching to currently selected media in device
initialization. Also it seems sis(4) was used to be a template
driver for long time so other drivers which was modeled after
sis(4) also should be changed.
TX/RX MACs are now reconfigured after getting a valid link. Fix for
short cable error is also applied after getting a link because it's
only valid when the resolved speed is 100Mbps.
While I'm here slightly reorganize interrupt handler such that
sis(4) always read SIS_ISR register to see whether the interrupt is
ours or not. This change removes another hack and make it possible
to nuke sis_stopped variable in softc.
o Enforce TX/RX descriptor ring alignment. NS data sheet says the
controller needs 4 bytes alignment but use 16 to cover both SiS
and NS controllers. I don't have SiS data sheet so I'm not sure
what is alignment restriction of SiS controller but 16 would be
enough because it's larger than the size of a TX/RX descriptor.
Previously sis(4) ignored the alignment restriction.
o Enforce RX buffer alignment, 4.
Previously sis(4) ignored RX buffer alignment restriction.
o Limit number of TX DMA segment to be used to 16. It seems
controller has no restriction on number of DMA segments but
using more than 16 looks resource waste.
o Collapse long mbuf chains with m_collapse(9) instead of calling
expensive m_defrag(9).
o TX/RX side bus_dmamap_load_mbuf_sg(9) support and remove
unnecessary callbacks.
o Initial endianness support.
o Prefer local alignment fixup code to m_devget(9).
o Pre-allocate TX/RX mbuf DMA maps instead of creating/destroying
these maps in fast TX/RX path. On non-x86 architectures, this is
very expensive operation and there is no need to do that.
o Add missing bus_dmamap_sync(9) in TX/RX path.
o watchdog is now unarmed only when there are no pending frames
on controller. Previously sis(4) blindly unarmed watchdog
without checking the number of queued frames.
o For efficiency, loaded DMA map is reused for error frames.
o DMA map loading failure is now gracefully handled. Previously
sis(4) ignored any DMA map loading errors.
o Nuke unused macros which are not appropriate for endianness
operation.
o Stop embedding driver maintained structures into descriptor
rings. Because TX/RX descriptor structures are shared between
host and controller, frequent bus_dmamap_sync(9) operations are
required in fast path. Embedding driver structures will increase
the size of DMA map which in turn will slow down performance.
not change interrupt vector if it is not pointing the ROM itself. Actually,
we just fail shadowing altogether if that is the case because the shadowed
copy will be useless for sure and POST may not be relocatable or useful.
While I'm here, fix a debugging message under bootverbose, really. r211829
fixed one case but broke another. Mea Culpa.
or not by comparing reported TX consumer index with saved index. So
remove unnecessary check done after freeing transmitted mbufs.
While I'm here nuke unnecessary variable initializations.
tag. All controllers that are not BCM5755 or higher have 4GB
boundary DMA bug. Previously bge(4) used 32bit DMA address to
workaround the bug(r199670). However this caused the use of bounce
buffers such that it resulted in poor performance for systems which
have more than 4GB memory. Because bus_dma(9) honors boundary
restriction requirement of DMA tag for dynamic buffers, having a
separate TX/RX mbuf DMA tag will greatly reduce the possibility of
using bounce buffers. For DMA buffers allocated with
bus_dmamem_alloc(9), now bge(4) explicitly checks whether the
requested memory region crossed the boundary or not.
With this change, only the DMA buffer that crossed the boundary
will use 32bit DMA address. Other DMA buffers are not affected as
separate DMA tag is created for each DMA buffer.
Even if 32bit DMA address space is used for a buffer, the chance to
use bounce buffer is still very low as the size of buffer is small.
This change should eliminate most usage of bounce buffers on
systems that have more than 4GB memory.
More correct fix would be teaching bus_dma(9) to honor boundary
restriction for buffers created with bus_dmamem_alloc(9) but it
seems that is not easy.
While I'm here cleanup bge_dma_map_addr() and remove unnecessary
member variables in bge_dmamap_arg structure.
Tested by: marcel
heavy I/O load.
Many thanks to LSI for continuing to support FreeBSD.
PR: kern/149968
Submitted by: LSI (Tom Couch)
Reported by: Kai Kockro <kkockro web de>
Tested by: Kai Kockro, jpaetzel
MFC after: 7 days
configuration function. For failed memory allocations, em(4)/lem(4)
called panic(9) which is not acceptable on production box.
igb(4)/ixgb(4)/ix(4) allocated the required memory in stack which
consumed 768 bytes of stack memory which looks too big.
To address these issues, allocate multicast array memory in device
attach time and make multicast configuration success under any
conditions. This change also removes the excessive use of memory in
stack.
Reviewed by: jfv
Just showing some buffer allocation error is more appropriate
action for drivers. This should fix occasional panic reported on
em(4) when driver encountered resource shortage.
Reviewed by: jfv
supported by many BIOSes to improve performance of VESA BIOS calls for real
mode OSes but it is not our intention here. However, this may help some
platforms where the video ROMs are inaccessible after suspend, for example.
Note it may consume up to 64K bytes of contiguous memory depending on video
controller model when it is enabled. This feature can be disabled by
setting zero to 'debug.vesa.shadow_rom' loader tunable via loader(8) or
loader.conf(5). The default is 1 (enabled), for now.
ID, plus the ability to force '16-bit mode' which really means NE-2000
mode. Other open source drivers suggest that the Holtek misbehaves if
you allow the 8-bit probe. Also, all of the PCI chips emulate
NE-2000ish cards, so always force 16-bit mode for memory transfers.
PR: 84202 (patch not used)
I'm not sure whether adding this logical id is correct or not
because Compex RL2000 is in the list of supported hardware list.
I guess the Compex RL2000 could be PCI variant while the controller
in question is ISA controller. It seems PNP compat id didn't match
or it had multiple compat ids so isa_pnp_probe() seemed to return
ENOENT.
PR: kern/80853
support WOL. Some controllers require additional 3-wire auxiliary
remote wakeup connector to draw power. More recent xl(4)
controllers may not need the wakeup connector though.
initialize controller from a known good state. Previously driver
used to issue controller reset while TX/RX DMA are in progress.
I guess resetting controller in active TX/RX DMA cycle is to ensure
stopping I/Os in xl_shutdown(). I remember some buggy controllers
didn't respond with stop command if controller is under high
network load at the time of shutdown so resetting controller was
the only safe way to stop the I/Os. However, from my experiments,
controller always responded with stop command under high network
load so I think it's okay to remove the xl_reset() in
device_shutdown handler.
Resetting controller also will clear configured RX filter which
in turn will make WOL support hard because driver have to reprogram
RX filter in WOL handler as well as setting station address.
datagrams with checksum value 0 when TX UDP checksum offloading is
enabled. Generating UDP checksum value 0 is RFC 768 violation.
Even though the probability of generating such UDP datagrams is
low, I don't want to see FreeBSD boxes to inject such datagrams
into network so disable UDP checksum offloading by default. Users
still override this behavior by setting a sysctl variable or loader
tunable, dev.bge.%d.forced_udpcsum.
I have no idea why this issue was not reported so far given that
bge(4) is one of the most commonly used controller on high-end
server class systems. Thanks to andre@ who passed the PR to me.
PR: kern/104826
Makefiles or *.mk files, use ${CC:T:Mfoo} instead, so only the basename
of the compiler command (excluding any arguments) is considered.
This allows you to use, for example, CC="/nondefault/path/clang -xxx",
and still have the various tests in bsd.*.mk identify your compiler as
clang correctly.
ICC if cases were also changed.
Submitted by: Dimitry Andric <dimitry at andric.com>
method is used by the PCI bus driver to query the power management system
to determine the proper device state to be used for a device during suspend
and resume. For the ACPI PCI bridge drivers this calls
acpi_device_pwr_for_sleep(). This removes ACPI-specific knowledge from
the PCI and PCI-PCI bridge drivers.
Reviewed by: jkim
If it does, don't then try reprogramming the NF "cap" values (ie
what values are the "maximum" value the NF can be) - instead,
just leave the current CCA value as the NF cap.
This was inspired by some similar work from ath9k. It isn't
a 100% complete solution (as there may be some reason where a
high NF CCA/cap is written, causing the baseband to stop thinking it
is able to transmit, leading to stuck beacon and interface reset)
which I'll investigate and look at fixing in a later commit.
Obtained from: Linux