There have still been intermittent problems with apparent TX
hangs for some customers. These have been problematic to reproduce
but I believe these changes will address them. Testing on a number
of fronts have been positive.
EM: there is an important 'chicken bit' fix for 82574 in the shared
code this is supported in the core here.
- The TX path has been tightened up to improve performance. In
particular UDP with jumbo frames was having problems, and the
changes here have improved that.
- OACTIVE has been used more carefully on the theory that some
hangs may be due to a problem in this interaction
- Problems with the RX init code, the "lazy" allocation and
ring initialization has been found to cause problems in some
newer client systems, and as it really is not that big a win
(its not in a hot path) it seems best to remove it.
- HWTSO was broken when VLAN HWTAGGING or HWFILTER is used, I
found this was due to an error in setting up the descriptors
in em_xmit.
IGB:
- TX is also improved here. With multiqueue I realized its very
important to handle OACTIVE only under the CORE lock so there
are no races between the queues.
- Flow Control handling was broken in a couple ways, I have changed
and I hope improved that in this delta.
- UDP also had a problem in the TX path here, it was change to
improve that.
- On some hardware, with the driver static, a weird stray interrupt
seems to sometimes fire and cause a panic in the RX mbuf refresh
code. This is addressed by setting interrupts late in the init
path, and also to set all interrupts bits off at the start of that.
support of new deltas for both em and igb drivers.
Note that I am not able to track all the bugs fixed in
this code, I am a consumer of it as a component of my
core drivers. It is important to keep the FreeBSD drivers
up to date with it however.
One important note is there is a key fix for 82574 in this
update. Also, there are lots of white space changes, I am
not happy about them but have no control over it :)
control. Controller does not automatically generate pause frames
based on number of available RX buffers so it's very hard to
know when driver should generate XON frame in time. The only
mechanism driver can detect low number of RX buffer condition is
ET_INTR_RXRING0_LOW or ET_INTR_RXRING1_LOW interrupt. This
interrupt is generated whenever controller notices the number of
available RX buffers are lower than pre-programmed value(
ET_RX_RING0_MINCNT and ET_RX_RING1_MINCNT register). This scheme
does not provide a way to detect when controller sees enough number
of RX buffers again such that efficient generation of XON/XOFF
frame is not easy.
While here, add more flow control related register definition.
interrupt is ours. Note, interrupts are automatically ACKed when
the status register is read.
Add RX/TX DMA error to interrupt handler and do full controller
reset if driver happen to encounter these errors. There is no way
to recover from these DMA errors without controller reset.
Rename local variable name intrs with status to enhance
readability.
While I'm here, rename ET_INTR_TXEOF and ET_INTR_RXEOF to
ET_INTR_TXDMA and ET_INTR_RXDMA respectively. These interrupts
indicate that a frame is successfully DMAed to controller's
internal FIFO and they have nothing to do with EOF(end of frame).
Driver does not need to wait actual end of TX/RX of a frame(e.g.
no need to wait the end signal of TX which is generated when a
frame in TX FIFO is emptied by MAC). Previous names were somewhat
confusing.
of a devfs file descriptor in devfs_close_f(). The passed in td argument
may be NULL if the close was invoked by garbage collection of open
file descriptors in pending control messages in the socket buffer of a
UNIX domain socket after it was closed.
PR: kern/151758
Submitted by: Andrey Shidakov andrey shidakov ru
Submitted by: Ruben van Staveren ruben verweg com
Reviewed by: kib
MFC after: 2 weeks
thread_free(newtd). This to avoid a possible page fault in
cpu_thread_clean() as seen on amd64 with syscall fuzzing.
Reviewed by: kib
MFC after: 1 week
suspend state. This will save more power.
On resume, make sure to enable all clocks. While I'm here, if
controller is not fast ethernet, enable gigabit PHY.
Don't blindly re-initialize controller whenever MTU is changed.
Now, reinitializing is done only when driver is running.
While here, remove unnecessary assignment of error value since it
was already initialized to 0.
o Do not report link status if driver is not running.
o TX/RX MAC configuration should be done with resolved speed,
duplex and flow control after establishing a link so it can't
be done in driver initialization routine.
Move the configuration to miibus_statchg callback which will be
called whenever any link state change is detected.
At this moment, flow-control is not enabled yet mainly because
I was not able to set correct flow control parameters to
generate TX pause frames.
o Now TX/RX MAC is enabled only when a valid link is detected.
Rearragnge hardware initialization routine a bit to leave
enabling MAC to miibus_statchg callback. In order to that,
TX/RX DMA engine is enabled in et_init_locked().
o Introduce ET_FLAG_LINK flag to track current link state.
o Introduce ET_FLAG_FASTETHER flag to mark whether controller is
fast ethernet. This flag is checked in miibus_statchg callback
to know whether PHY established a valid link.
o In et_stop(), TX/RX MAC is explicitly disabled instead of
relying on et_reset(). And move et_reset() from et_stop() to
controller initialization. Controler reset is not required here
and it would also clear critial registers(i.e station address,
RX filter configuration, WOL etc) that are required to make WOL
work.
o Switching to current media is done in et_init_locked() after
setting IFF_DRV_RUNNING flag. This should ensure reliable
auto-negotiation/manual link establishment.
o In et_start_locked(), check whether driver got a valid link
before trying to send frames.
o Remove checking a link in et_tick() as this is done by
miibus_statchg callback.
identifier reserved for the implementation in C99 and earlier so there is
no sensible reason for introducing yet another reserved identifier when we
could just use the one C1x uses.
Approved by: brooks (mentor)
manipulation of interrupt register access is done through
CSR_WRITE_4 macro. Also add disabling interrupt into et_reset()
because we want interrupt disabled state after controller reset.
While I'm here slightly change interrupt handler to be more
readable one.
send a single TX command after setting up all TX frames. This
removes unnecessary register accesses and bus_dmamap_sync(9) calls.
et(4) uses TX interrupt moderation so it's possible to have TX
buffers that were already transmitted but waiting for TX completion
interrupt. If the number of available TX descriptor is less then
1/3 of total TX descriptor, try reclaiming first to get enough free
TX descriptors before setting up TX descriptors.
After r228325, et_txeof() no longer tries to send frames after
reclaiming TX buffers. That change was made to give more chance
to transmit frames in main interrupt handler since we can still
send frames in interrupt handler with RX interrupt. So right
before exiting interrupt hander, after enabling interrupt, try to
send more frames. This gives slightly better performance numbers.
While I'm here reduce number of spare TX descriptors from 8 to 4.
Controller does not require reserved TX descriptors, it was just to
reduce TX overhead. After r228325, driver has much lower TX
overhead so it does not make sense to reserve 8 TX descriptors.
change should make et(4) work on any architectures.
o Remove m_getl inline function and replace it with stanard mbuf
interfaces. Previous code tried to minimize code duplication
but this came from incorrect use of common DMA tag.
Driver may be still use a common RX allocation handler with
additional structure changes but I don't see much point to do
that it would make it hard to understand the code.
o Remove DragonflyBSD specific constant EVL_ENCAPLEN, use
ETHER_VLAN_ENCAP_LEN instead.
o Add bunch of new RX status definition. It seems controller
supports RX checksum offloading but I was not able to make the
feature work yet. Currently driver checks whether recevied
frame is good one or not.
o Avoid a typedef ending in '_t' as style(9) says.
o Controller has no restriction on DMA address space, so there
is no reason to limit the DMA address to 32bit. Descriptor
rings, status blocks and TX/RX buffers now use full 64bit DMA
addressing.
o Allocate DMA memory shared between host and controller as
coherent.
o Create 3 separate DMA tags to be used as TX, mini RX ring and
stanard RX ring. Previously it created a single DMA tag and it
was used to all three rings.
o et(4) does not support jumbo frame at this moment and I still
don't quite understand how jumbo frame works on this controller
so use two RX rings to handle small sized frame and normal sized
frame respectively. The mini RX ring will be used to receive
frames that are less than or equal to 127 bytes. The second RX
ring is used to receive frames that are not handled by the first
RX ring.
If jumbo frame support is implemented, driver may have to choose
better RX scheme by letting the second RX ring handle jumbo
frames. This scheme will mimic Broadcom's efficient jumbo frame
handling feature. However RAM buffer size(16KB) of the
controller is too small to hold 2 jumbo frames, if 9KB
jumbo frame is used, I'm not sure how good performance would it
have.
o In et_rxeof(), make sure to check whether controller received
good frame or not. Passing corrupted frame to upper layer is
bad idea.
o If driver receives a bad frame or driver fails to allocate RX
buffer due to resource shortage condition, reuse previously
loaded DMA map for RX buffer instead of unloading/loading RX
buffer again.
o et_init_tx_ring() never fails so change return type to void.
o In watchdog handler, show TX DMA write back status of errored
frame which could be used as a clue to debug watchdog timeout.
o Add missing bus_dmamap_sync() in various places such that et(4)
should work with bounce buffers(e.g. PAE).
o TX side bus_dmamap_load_mbuf_sg(9) support.
o RX side bus_dmamap_load_mbuf_sg(9) support.
o Controller has no DMA alignment limit in RX buffer so use
m_adj(9) in RX buffer allocation to make IP header align on 2
bytes boundary. Otherwise it would trigger unaligned access
error in upper layer on strict alignment architectures.
One of down side of controller is it provides limited set of RX
buffer length like most Intel controllers. This is not problem
at this moment because driver does not support jumbo frame yet
but it may require alignment fixup code to support jumbo frame
on strict alignment architectures.
o In et_txeof(), don't zero TX descriptors for transmitted frames.
TX descriptors don't need write access after transmission.
Driver sets IFF_DRV_OACTIVE when the number of available TX
descriptors are less than or equal to ET_NSEG_SPARE. Make sure
to clear IFF_DRV_OACTIVE only when the number of available TX
descriptor is greater than ET_NSEG_SPARE.
__noreturn macro and modify the other exiting functions to use it.
The __noreturn macro, unlike __dead2, must be used BEFORE the function.
This is in line with the C and C++ specifications that place _Noreturn (c1x)
and [[noreturn]] (C++11) in front of the functions. As with __dead2, this
macro falls back to using the GCC attribute.
Unfortunately, clang currently sets the same value for the C version macro
in C99 and C1x modes, so these functions are hidden by default. At some
point before 10.0, I need to go through the headers and clean up the C1x /
C++11 visibility.
Reviewed by: brooks (mentor)
of vm_kmem_size that may occur if the system administrator has specified a
vm.vm_kmem_size tunable value that exceeds the hard cap.
PR: 162741
Submitted by: Adam McDougall
Reviewed by: bde@
MFC after: 3 weeks
Optimize for the case, by lazily allocating the pipe inode number at the
fstat(2) time. If alloc_unr(9) returns failure, do not fail fstat(2), since
uses of inode numbers are even rare then fstat(2), but provide zero inode
forever. Note that alloc_unr() failure is unlikely due to total number
of pipes in the system limited by the number of file descriptors.
Based on the submission by: gianni
MFC after: 2 weeks
c162516
Remove vtblk_sector_size
c162515
Wrap long license lines
c162514
Remove vtblk_unit
c162513
Wrap long lines in the license.
c162512
Remove verbose messages when link goes up/down.
A similar message is printed elsewhere as a result of
if_link_state_change().
c162511
Explicity compare pointer to NULL
c162510
Allocate the mac filter table at attach time.
c162509
Add real BSD licenses to the header files copied from Linux.
The chases upstream changes made in Linux awhile ago.
c162508
Only notify if we actually dequeued something.
c162507
Change a couple of if () { KASSERT(...) } to just KASSERTs.
In non-debug kernels, the if() { } probably get optomized
away, but I guess this is clearer.
c162506
Remove VIRTIO_BLK_F_TOPOLOGY fields in the config.
TOPOLOGY has since been removed from the spec, and the FreeBSD
didn't really do anything with the fields anyways.
c162505
Move vtblk_enqueue_request() outside the locks when getting the ident.
c162504
Remove soon to be uneeded trylock during dump [1].
http://lists.freebsd.org/pipermail/freebsd-current/2011-November/029226.html
c162503
Remove emtpy line
c162502
Drop frame if cannot allocate a vtnet_tx_header.
If we don't, we set OACTIVE, but if there are no
other frames in flight, vtnet_txeof() will never
be called to unset OACTIVE. The interface would
have to be down/up'ed in order to become usable.
We could be cuter here and only do this if the
virtqueue is emtpy, but its probably not worth
the complication.
c162501
Start mbuf replacement loop at 1 for clarity
Obtained from: Bryan Venteicher bryanv at daemoninthecloset dot org
driver that has high precedence for the controller override et(4).
Add missing callout_drain(9) in device detach and rework detach
routine. While I'm here use rman_get_rid(9) instead of using
cached resource id because bus methods are free to change the
id.
While I'm here remove initializing if_mtu, it is set by
ether_ifattach(9). Also move callout_init_mtx(9) to the right below
driver lock initialization.
descriptors before trying to send frames. If we're not able to
send a frame, make sure to prepend it to if_snd queue such that
alt(4) should work.
While I'm here prefer ETHER_BPF_MTAP to BPF_MTAP. ETHER_BPF_MTAP
should be used for controllers that support VLAN hardware tag
insertion. The controller supports VLAN tag insertion but lacks
VLAN tag stripping in RX path though.
before calling bus_enumerate_hinted_children(9) (which is the minimum for
this to work) instead of fully probing it so later on we can just call
bus_generic_attach(9) on the parent of the miibus(4) instance. The latter
is necessary in order to work around what seems to be a bzzarre race in
newbus affecting a few machines since r227687, causing no driver being
probed for the newly added miibus(4) instance. Presumably this is the
same race that was the motivation for the work around done in r215348.
Reported and tested by: yongari
- Revert the removal of a static in r221913 in order to help compilers to
produce more optimal code.
Citing jilles:
If we are ever going to do ASLR, the AUXV information tells an attacker
where the stack, executable and RTLD are located, which defeats much of
the point of randomizing the addresses in the first place.
Given that the AUXV information seems to be used by debuggers only anyway,
I think it would be good to move it to p_candebug() now.
The full virtual memory maps (KERN_PROC_VMMAP, procstat -v) are already
under p_candebug().
Suggested by: jilles
Discussed with: rwatson
MFC after: 1 week
use superpage reservations. So, for the first time, kernel virtual memory
that is allocated by contigmalloc(), kmem_alloc_attr(), and
kmem_alloc_contig() can be promoted to superpages. In fact, even a series
of small contigmalloc() allocations may collectively result in a promoted
superpage.
Eliminate some duplication of code in vm_reserv_alloc_page().
Change the type of vm_reserv_reclaim_contig()'s first parameter in order
that it be consistent with other vm_*_contig() functions.
Tested by: marius (sparc64)
Where i386/bios/apm.c requires no per-descriptor state, the ACPI version
of these device do. Instead of using hackish clone lists that leave
stale device nodes lying around, use the cdevpriv API.
On my hardware, "em" in netmap mode does about 1.388 Mpps
on one card (on an Asus motherboard), and 1.1 Mpps on another
card (PCIe bus). Both seem to be NIC-limited, because
i have the same rate even with the CPU running at 150 MHz.
On the "re" driver the tx throughput is around 420-450 Kpps
on various (8111C and the like) chipsets. On the Rx side
performance seems much better, and i can receive the full
load generated by the "em" cards.
"igb" is untested as i don't have the hardware.
A link reset now is completely transparent for the netmap client:
even if the NIC resets its own ring (e.g. restarting from 0),
the client will not see any change in the current rx/tx positions,
because the driver will keep track of the offset between the two.
2. make the device-specific code more uniform across different drivers
There were some inconsistencies in the implementation of the netmap
support routines, now drivers have been aligned to a common
code structure.
3. import netmap support for ixgbe . This is implemented as a very
small patch for ixgbe.c (233 lines, 11 chunks, mostly comments:
in total the patch has only 54 lines of new code) , as most of
the code is in an external file sys/dev/netmap/ixgbe_netmap.h ,
following some initial comments from Jack Vogel about making
changes less intrusive.
(Note, i have emailed Jack multiple times asking if he had
comments on this structure of the code; i got no reply so
i assume he is fine with it).
Support for other drivers (em, lem, re, igb) will come later.
"ixgbe" is now the reference driver for netmap support. Both the
external file (sys/dev/netmap/ixgbe_netmap.h) and the device-specific
patches (in sys/dev/ixgbe/ixgbe.c) are heavily commented and should
serve as a reference for other device drivers.
Tested on i386 and amd64 with the pkt-gen program in tools/tools/netmap,
the sender does 14.88 Mpps at 1050 Mhz and 14.2 Mpps at 900 MHz
on an i7-860 with 4 cores and 82599 card. Haven't tried yet more
aggressive optimizations such as adding 'prefetch' instructions
in the time-critical parts of the code.
difference between fi_wgen and f_seqcount, so the change is purely
cosmetic, but it makes the code easier to understand.
Submitted by: gianni
MFC after: 2 weeks
check for a UTF-8 compliant file name. Enabling this sysctl results in
an NFSv4 server that is non-RFC3530 compliant, therefore it is not enabled
by default. However, enabling this sysctl results in NFSv3 compatible
behaviour and fixes the problem reported by "dan at sunsaturn.com"
to freebsd-current@ on Nov. 14, 2011 under the subject "NFSV4 readlink_stat".
Tested by: dan at sunsaturn.com
Reviewed by: zack
MFC after: 2 weeks
pins, rather than defaulting to 0 and 1.
This way the pin order can be reversed. It is reversed with the
TP-Link TL-WR1043nd.
Submitted by: Stefan Bethke <stb@lassitu.de>
These realtek switch PHYs speak a variant of i2c with some slightly
modified handling.
From the submitter, slightly modified now that some further digging
has been done:
The I2C framework makes a assumption that the read/not-write bit of the first
byte (the address) indicates whether reads or writes are to follow.
The RTL8366 family uses the bus: after sending the address+read/not-write byte,
two register address bytes are sent, then the 16-bit register value is sent
or received. While the register write access can be performed as a 4-byte
write, the read access requires the read bit to be set, but the first two bytes
for the register address then need to be transmitted.
This patch maintains the i2c protocol behaviour but allows it to be relaxed
(for these kinds of switch PHYs, and whatever else Realtek may do with this
almost-but-not-quite i2c bus) - by setting the "strict" hint to 0.
The "strict" hint defaults to 1.
Submitted by: Stefan Bethke <stb@lassitu.de>
It appears this device fails if sent a SYNCHRONIZE_CACHE command, so add
quirk to avoid sending it.
I will follow up with Micron on this issue, and will adjust the quirk if
necessary based on their feedback.
Reviewed by: hselasky@
no need to additionally add CPU memory barriers to the acquire variants of
atomic(9), these are documented to also include compiler memory barriers.
So add the latter, which were previously included by using membar(), back.
of the same lock_owner4 string. As such, the handling of cleanup
of lock_owners could be simplified. This simplification permitted
the client to do a ReleaseLockOwner operation when the process that
the lock_owner4 string represents, has exited. This permits the
server to release any storage related to the lock_owner4 string
before the associated open is closed. Without this change, it
is possible to exhaust a server's storage when a long running
process opens a file and then many child processes do locking
on the file, because the open doesn't get closed. A similar patch
was applied to the Linux NFSv4 client recently so that it wouldn't
exhaust a server's storage.
Reviewed by: zack
MFC after: 2 weeks
having dereferenced it. We either should generally check the device_t's
supplied to bus functions before using them (which we seem to virtually
never do) or just assume that they are not NULL.
While at it make this code fit 78 columns.
Found with: Coverity Prevent(tm)
CID: 4230
when actually setting a driver as especially ENOMEM is fatal in these
cases.
- Annotate other calls to device_set_devclass(9) and device_set_driver(9)
without the return value being checked and that are okay to fail.
Reviewed by: yongari (slightly earlier version)
in addition to the user priority for threads whose current real priority
is equal to the previous user priority or if the new priority is a
real-time priority. This allows priority changes of other threads to
have an immediate effect.
MFC after: 2 weeks
According to the open firmware standard, finddevice call has to return
a phandle with value of -1 in case of error.
This commit is to:
- Fix the FDT implementation of this interface (ofw_fdt_finddevice) to
return (phandle_t)-1 in case of error, instead of 0 as it does now.
- Fix up the callers of OF_finddevice() to compare the return value with
-1 instead of 0 to check for errors.
- Since phandle_t is unsigned, the return value of OF_finddevice should
be checked with '== -1' rather than '<= 0' or '> 0', fix up these cases
as well.
Reported by: nwhitehorn
Reviewed by: raj
Approved by: raj, nwhitehorn
to known AHCI-capable chips (AMD/NVIDIA), configured for legacy emulation.
Enabled by default to get additional performance and functionality of AHCI
when it can't be enabled by BIOS. Can be disabled to honor BIOS settings if
needed for some reason.
MFC after: 1 month
NFS server and reuse it for writes as well to allow writes to the backing
store to be clustered.
- Use a prime number for the size of the heuristic table (1017 is not
prime).
- Move the logic to locate a heuristic entry from the table and compute
the sequential count out of VOP_READ() and into a separate routine.
- Use the logic from sequential_heuristic() in vfs_vnops.c to update the
seqcount when a sequential access is performed rather than just
increasing seqcount by 1. This lets the clustering count ramp up
faster.
- Allow for some reordering of RPCs and if it is detected leave the current
seqcount as-is rather than dropping back to a seqcount of 1. Also,
when out of order access is encountered, cut seqcount in half rather than
dropping it all the way back to 1 to further aid with reordering.
- Fix the new NFS server to properly update the next offset after a
successful VOP_READ() so that the readahead actually works.
Some of these changes came from an earlier patch by Bjorn Gronwall that was
forwarded to me by bde@.
Discussed with: bde, rmacklem, fs@
Submitted by: Bjorn Gronwall (1, 4)
MFC after: 2 weeks
-1. But, because ino_t is unsigned, this case was not covered by the
test ino > 0 in pipeclose(), leading to the free_unr(-1). Fix it by
explicitely comparing with 0 and -1. [1]
Do no access freed memory, the inode number was cached to prevent access
to cpipe after it possibly was freed, but I failed to commit the right
patch.
Noted by: gianni [1]
Pointy hat to: kib
MFC after: 3 days
introduced when feed-forward clock support is enabled in the kernel:
- Rename the "choice" variable to "available".
- Streamline the implementation of the "active" variable's sysctl handler
function.
- Create a kern.sysclock sysctl node for general sysclock related configuration
options. Place the "available" and "active" variables under this node.
- Create a kern.sysclock.ffclock sysctl node for feed-forward clock specific
configuration options. Place the "version" and "ffcounter_bypass" variables
under this node.
- Tweak some of the description strings.
Discussed with: Julien Ridoux (jridoux at unimelb edu au)
defined based on WITH/WITHOUT_CTF settings, default is WITHOUT_CTF,
NO_CTF overrides WITH_CTF (used by Makefile.inc1)
- CTFCONVERT_CMD/NORMAL_CTFCONVERT are now defined to empty string
if make(1) can handle empty commands
- CTFCONVERT_CMD=... is a hack (should be defined to empty string instead):
make(1) should be taught to ignore empty commands silently in compat mode
(as it does in !compat mode, GNU make also silently ignores empty commands)
and to skip printing empty commands in !compat mode
- config(8) should generate ${NORMAL_CTFCONVERT} invocation without '@':
this will allow to simplify kern.pre.mk even more and lessen the number
of shell invocations during kernel build when CTF is turned off
- WITH_CTF can now be converted to usual MK_CTF=yes/no infrastructure
Pointy hat to: fjoe [1]
Since the address of vm_page lock mutex depends on the kernel options,
it is easy for module to get out of sync with the kernel.
No vm_page_lockptr() accessor is provided for modules. It can be added
later if needed, unless proper KPI is developed to serve the needs.
Reviewed by: attilio, alc
MFC after: 3 weeks
Committed on behalf of Julien Ridoux and Darryl Veitch from the University of
Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward
Clock Synchronization Algorithms" project.
For more information, see http://www.synclab.org/radclock/
Discussed with: Julien Ridoux (jridoux at unimelb edu au)
Submitted by: Julien Ridoux (jridoux at unimelb edu au)
reimplementing the [get]{bin,nano,micro}[up]time() wrapper functions in terms of
the new "fromclock" API instead.
Committed on behalf of Julien Ridoux and Darryl Veitch from the University of
Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward
Clock Synchronization Algorithms" project.
For more information, see http://www.synclab.org/radclock/
Discussed with: Julien Ridoux (jridoux at unimelb edu au)
Submitted by: Julien Ridoux (jridoux at unimelb edu au)
select which system clock to obtain time from, independent of the current
default system clock. In the brave new multi sysclock world, both feedback and
feed-forward system clocks can be maintained and used concurrently, so this API
provides a minimalist first step for interested consumers to exercise control
over their choice of system clock.
Committed on behalf of Julien Ridoux and Darryl Veitch from the University of
Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward
Clock Synchronization Algorithms" project.
For more information, see http://www.synclab.org/radclock/
Discussed with: Julien Ridoux (jridoux at unimelb edu au)
Submitted by: Julien Ridoux (jridoux at unimelb edu au)
that new APIs with some performance sensitivity can be built on top of them.
These functions should not be called directly except in special circumstances.
Committed on behalf of Julien Ridoux and Darryl Veitch from the University of
Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward
Clock Synchronization Algorithms" project.
For more information, see http://www.synclab.org/radclock/
Discussed with: Julien Ridoux (jridoux at unimelb edu au)
Submitted by: Julien Ridoux (jridoux at unimelb edu au)
fbclock_{nanouptime|microuptime|bintime|nanotime|microtime}() functions to avoid
indirecting through a sysclock_ops wrapper function.
Committed on behalf of Julien Ridoux and Darryl Veitch from the University of
Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward
Clock Synchronization Algorithms" project.
For more information, see http://www.synclab.org/radclock/
Submitted by: Julien Ridoux (jridoux at unimelb edu au)
parent interface. This avoids the overhead of queueing a packet to an IFQ
only to immediately dequeue it again.
Suggested by: np
Reviewed by: brooks
MFC after: 1 month
control for all vr(4) controllers that support it. It's known that
old vr(4) controllers(Rhine II) does not support TX pause but Rhine
III supports both TX and RX pause.
Make TX pause really work on Rhine III by letting controller know
available RX buffers.
While here, adjust XON/XOFF parameters to get better performance
with flow control.
back after I fix the breakages on some of our more exotic platforms.
While here, add the driver to the amd64 NOTES, so it can be picked up in LINT
builds.
using member variables in softc.
While I'm here change media after setting IFF_DRV_RUNNING. This
will remove unnecessary link state handling in vr_tick() if
controller established a link immediately.
have reserved free space in the APM area.
Also instead of one write request per each APM entry, use MAXPHY
sized writes when we are updating APM.
MFC after: 1 month
of hand-made.
- When registering new cloner, check whether a cloner with
same name already exist.
- When allocating unit, also check with help of ifunit()
whether such interface already exist or not. [1]
PR: kern/162789 [1]
transfer statemachine. This work is about using a single
state variable instead of multiple state bits as input
for the USB statemachine to determine what to do in the
various parts of the code. No APIs towards USB device
drivers or USB host controller drivers will be changed.
MFC after: 1 month
Fix a bug where the parameter length of a supported address types
parameter is set to a wrong value if the kernel is built with
with either INET or INET6, but not both.
MFC after: 3 days.
the 16-bit cylinders field of the VTOC8 disk label (at around 502GB). The
geometry chosen for disks above that limit allows to use disks up to 2TB,
which is the limit of the extended VTOC8 format. The geometry used for
disks smaller than the 16-bit cylinders limit stays the same as used by
cam_calc_geometry(9) for extended translation.
Thanks to Hans-Joerg Sirtl for providing hardware for testing this change.
MFC after: 3 days
backup stack queue entry when the zone is exhausted, otherwise we leak a zone
allocation each time we plug a hole in the reassembly queue.
Reported by: many on freebsd-stable@ (thread: "TCP Reassembly Issues")
Tested by: many on freebsd-stable@ (thread: "TCP Reassembly Issues")
Reviewed by: bz (very brief sanity check)
MFC after: 3 days
compatible with each other and since r227539 the last issue seen when
using SCHED_ULE is fixed. At least on UP and 2-way machines SCHED_4BSD
still performs better than SCHED_ULE, however, the optimizations done
in r225889 pretty much compensate that so there's at least no net
regression.
Thanks go to Peter Jeremy for extensive testing.
the second-last 64k seems to be the default firmware board configuration
area.
Since I have no idea whether uboot uses it or not - and it's prefixed
with an atheros eeprom signature (0xaa55), I figure the safest thing
to do is mark it as read-only.
I've modified my local tplink firmware building program to generate
a board configuration section - which is separate to this partition.
It's located in the 64k _before_ this particular 64k.
The firmware build program from OpenWRT never initialises those
values and the firmware images from tplink also leave it 0x0, so I
don't currently know what the exact, correct details should be.
the ar71xx platform code should assume a uboot or redboot environment.
The current code gets very confused (and just crashes) on a uboot
environment, where each attribute=value pair is in a single entry.
Redboot on the other hand stores it as "attribute", "value", "attribute",
"value", ...
This allows the kernel to boot on a TP-LINK TL-WR1043ND from flash,
where the uboot environment gets setup. This didn't show up during a netboot
as "tftpboot" and "go" don't setup the uboot environment variables.
The default flash layout gives only 1 megabyte for the kernel, gzipped.
The uboot firmware running on this device only supports gzip, not lzma, so
we actually _do_ have to try and slim the kernel down a bit.
But, since I can't actually do that at the present, I'm opting to:
* extend the kernel from 1mb to 2mb;
* have rootfs fill the rest of that, save 64k;
* eventually I'll hide a 64k config partition at the end, between the
end of rootfs and the ART (radio configuration data.)
The uboot firmware doesn't care about the partition layout. It just
expects the kernel application image to sit at 0xbf020000 (right after
the 128k uboot image.) The uboot header isn't actually read either -
it's "faked" from a "tplink" flash image header. So as long as the
map configuration here matches what is being written out via the
tplink firmware generator, everything is a-ok.
A previous commit disabled compiling the AR9130 support in the default
HAL build in the kernel. Since the AR9130 support won't actually function
without AH_SUPPORT_AR9130 (and that abomination needs to be undone at some
point, in order to allow USB 11n NICs to also work), we now have to
explicitly compile it in.
But since the 11n RF backends don't (currently) join the RF linker set,
one has to compile in _an_ RF backend for the HAL to compile.
At some point it would be nice to correctly update the bus glue to make
this "correct", including having the DDR flush occur in the right spot
(ie, any AHB interrupt.)
put into suspend/shutdown. Old PCI controllers performed that
operation in firmware but for RTL8111C or newer controllers, it's
responsibility of driver. It's not clear whether the firmware of
RTL8111B still downgrades its speed to 10/100Mbps so leave it as it
was.
issues probably needing workarounds in bge(4) when brgphy(4) handles this
PHY. Letting ukphy(4) handle it instead results in a working configuration,
although likely with performance penalties.
The calibrate callout is done with the sc lock held.
This only showed up when using an older NIC (AR5212) whose
radio/phy requires the rfgain adjustment.
Pointy-hat-to: adrian
Sponsored by: Hobnob, Inc.
* Failall is now named just that.
* Add TX ok and TX fail, for aggregate frame sub-frames.
This will break athstats; a followup commit wil resolve this.
Sponsored by: Hobnob, Inc.
Because there is no reliable way to know whether RX MAC is in
stopped state, rejecting all frames would be the only way to
minimize possible races.
Otherwise it's possible to receive frames while stop command
execution is in progress and controller can DMA the frame to freed
RX buffer during that period.
This was observed on recent PCIe controllers(i.e. RTL8111F).
While this change may not be required on old controllers it
wouldn't make negative effects on old controllers. One side effect
of this change is disabling receive so driver reprograms RL_RXCFG
to receive WOL frames when it is put into suspend or shutdown.
This should address occasional 'memory modified free' errors seen
on recent RealTek controllers.
driver would ignore the first link state update if controller
already established a link such that it would have to take
additional link state handling in re_tick().
access.
While I'm here, enable WOL through magic packet but disable waking
up system via unicast, multicast and broadcast frames. Otherwise,
multicast or unicast frame(e.g. ICMP echo request) can wake up
system which is not probably wanted behavior on most environments.
This was not known as problem because RL_CFG5 register access had
not effect until this change.
The capability to wake up system with unicast/multicast frames
are still set in driver, default off, so users who need that
feature can still activate it with ifconfig(8).
one. Interestingly, these are actually the default for quite some time
(bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9)
since r52045) but even recently added device drivers do this unnecessarily.
Discussed with: jhb, marcel
- While at it, use DEVMETHOD_END.
Discussed with: jhb
- Also while at it, use __FBSDID.
bit should not affect link establishment process of auto-negotiation
if manual configuration is not used, which is true in auto-negotiation.
However it seems setting this bit interfere with IP1001 PHY's
down-shifting feature such that establishing a 10/100Mbps link failed
when 1000baseT link is not available during auto-negotiation process.
Tested by: Andrey Smagin <samspeed <> mail dot ru >
Pause timer value is initialized to 0xFFFF. Controller allows just
4 different TX pause thresholds. The lowest possible threshold
value looks too aggressive so use next available threshold value.
and proc_getenvv(), which were implemented using linprocfs_doargv() as
a reference.
Suggested by: kib
Reviewed by: kib
Approved by: des (linprocfs maintainer)
MFC after: 2 weeks
- Remove MIIBUS statchg callback and program VGE_DIAGCTL before
initiating link establishment. Previously driver used to
program VGE_DIAGCTL after getting a link in statchg callback.
It seems the VGE_DIAGCTL register works like a kind of MII
register such that it requires setting a 'to be' mode in advance
rather than relying on resolved speed/duplex of established link.
This means the statchg callback is not needed in driver. In
addition, if there was no link at the time of media change, this
was not called at all.
- Introduce vge_ifmedia_upd_locked() to change current media to
configured one. Actual media change is performed only after PHY
reset and VGE_DIAGCTL setup.
- In WOL configuration, make sure to clear forced mode such that
controller can rely on auto-negotiation.
- Unlike most other drivers that use miibus(4), vge(4) used
controller's auto-polling feature for link state tracking via
interrupt. This came from controller's inefficient mechanism to
access MII registers. On link state change interrupt, vge(4)
used to get current link state with series of MII register
accesses. Because vge(4) already enabled auto polling, read PHY
status register to resolved speed/duplex/flow control parameters.
vge(4) still does not drive MII_TICK to reduce number of MII
register accesses which in turn means the driver does not know the
status of auto-negotiation. This was a one of long standing
issue of vge(4). Probably driver may be able to implement a timer
that keeps track of auto-negotiation state and restart
auto-negotiation when driver couldn't establish a link within a
specified period. However the controller does not provide a
reliable way to detect auto-negotiation failure so I'm not sure
whether it's worth to implement it in driver.
Alternatively driver can completely disable MII auto-polling and
let miibus(4) poll link state by driving MII_TICK. This may reduce
unnecessary overhead of stopping/restarting MII auto-polling of
controller. Unfortunately it was known that some variants of
controller does not work correctly if MII auto-polling is disabled.
environment strings and ELF auxiliary vectors from a process stack.
Make sysctl_kern_proc_args to read not cached arguments from the
process stack.
Export proc_getargv() and proc_getenvv() so they can be reused by
procfs and linprocfs.
Suggested by: kib
Reviewed by: kib
Discussed with: kib, rwatson, jilles
Tested by: pho
MFC after: 2 weeks
and DEVMETHOD() we can fully hide the explicit mention of kobj(9) from
device drivers.
- Update the example in driver.9 to use DEVMETHOD_END.
Submitted by: jhb
MFC after: 3 days
__FreeBSD_kernel__ indicates that this system uses the kernel of FreeBSD,
which by definition is always true on FreeBSD. This macro is also defined
on other systems that use the kernel of FreeBSD, such as GNU/kFreeBSD.
It is tempting to use this macro in userland code when we want to enable
kernel-specific routines, and in fact it's fine to do this in code that
is part of FreeBSD itself. However, be aware that as presence of this
macro is still not widespread (e.g. older FreeBSD versions, 3rd party
compilers, etc), it is STRONGLY DISCOURAGED to check for this macro in
external applications without also checking for __FreeBSD__ as an
alternative.
Approved by: kib (mentor)
MFC after: 2 weeks
vnode locking for read, readdir, readlink, getattr and access.
It is hoped that this will improve server performance for these
operations, since they will no longer be serialized for a given
file/vnode.