Use the defined types instead of int when manipulating masks.
Supposedly, it could fix support for 32KB page size in the
machine-independend VM layer.
Reviewed by: alc
MFC after: 2 weeks
checksum offloading and VLAN hardware tag insertion/stripping from
the currently enabled hardware offloading capabilities.
Previously if_hwassist, which was initialized to TX/RX checksum
offloading, was blindly used to enable both TX and RX checksum
offloading such that disabling either TX or RX checksum offloading
was not possible.
ti(4) controllers support TX/RX checksum offloading with VLAN
tagging so announce TX/RX checksum offloading capability over VLAN
to vlan(4).
Make VLAN hardware tag insertion/stripping honors currently enabled
interface capability instead of blindly enabling VLAN hardware
tagging. This change allows disabling hardware support of VLAN tag.
Because ti(4) supports VLAN oversized frames, make network stack
know the capability by setting if_hdrlen.
While I'm here, rewrite SIOCSIFCAP handler and make sure to
reinitialize controller whenever TX/RX checksum offloading and VLAN
hardware tagging option is changed. The requirement of controller
reinitialization comes from the limitation of Tigon I/II firmware.
Tigon I/II firmware requires all related RCBs should be
reinitialized whenever any of its hardware offloading capabilities
change.
vlan(4) is also notified whenever the parent interface's capability
changes such that it can correctly handle TX/RX checksum offloading
based on parent interface's enabled offloading capabilities.
RX checksum offloading handler was changed to make upper stack use
controller computed partial checksum value. Previously, ti(4) just
set the computed value for any frames(IPv4, IPv6) and the value was
not used in upper stack because driver didn't set CSUM_DATA_VALID
such that upper network stack had to recompute checksum of TCP/UDP
packets. I have no idea how this was not noticed for a long time.
With this change, upper network stack does not have to fully
recompute the checksum such that calculating pseudo checksum based
on partial checksum is sufficient to know whether received packet's
checksum is correct or not. However, I don't know why ti(4) does
not have controller compute pseudo checksum as controller has
ability to do it. I'm just guessing enabling that feature could
trigger a firmware bug or could be slower than computing it on host
side so just leave it as it was.
In order not to produce false positives, ti(4) now checks whether
controller actually computed IP or TCP/UDP checksum by checking
ti_flags field.
state changes. Hide superfluous link up/down message under
bootverbose since if_link_state_change(9) shows that information.
While I'm here, change baudrate with the resolved speed of the
established link instead of blindly setting it 1G. Unfortunately,
it seems there is no way to differentiate 10/100Mbps from
non-gigabit link so just assume we established a 100Mbps link if
current link is not a gigabit link.
This was broken in r175872.
We have a UMA backed jumbo allocator and that is much better
implementation than having a local jumbo buffer allocator in
driver. This local allocator would be removed in near future but
fixing build before removal wouldn't be a bad idea.
compiled into the kernel.
Do not try to build the module in case of no INET support but
keep #error calls for now in case we would compile it into the
kernel.
This should fix an issue where the module would fail to enable
IPv6 support from the rc framework, but also other INET and INET6
parts being silently compiled out without giving a warning in the
module case.
While here garbage collect unneeded opt_*.h includes.
opt_ipdn.h is not used anywhere but we need to leave the DUMMYNET
entry in options for conditional inclusion in kernel so keep the
file with the same name.
Reported by: pluknet
Reviewed by: plunket, jhb
MFC After: 3 days
madvise(2) except that it operates on a file descriptor instead of a
memory region. It is currently only supported on regular files.
Just as with madvise(2), the advice given to posix_fadvise(2) can be
divided into two types. The first type provide hints about data access
patterns and are used in the file read and write routines to modify the
I/O flags passed down to VOP_READ() and VOP_WRITE(). These modes are
thus filesystem independent. Note that to ease implementation (and
since this API is only advisory anyway), only a single non-normal
range is allowed per file descriptor.
The second type of hints are used to hint to the OS that data will or
will not be used. These hints are implemented via a new VOP_ADVISE().
A default implementation is provided which does nothing for the WILLNEED
request and attempts to move any clean pages to the cache page queue for
the DONTNEED request. This latter case required two other changes.
First, a new V_CLEANONLY flag was added to vinvalbuf(). This requests
vinvalbuf() to only flush clean buffers for the vnode from the buffer
cache and to not remove any backing pages from the vnode. This is
used to ensure clean pages are not wired into the buffer cache before
attempting to move them to the cache page queue. The second change adds
a new vm_object_page_cache() method. This method is somewhat similar to
vm_object_page_remove() except that instead of freeing each page in the
specified range, it attempts to move clean pages to the cache queue if
possible.
To preserve the ABI of struct file, the f_cdevpriv pointer is now reused
in a union to point to the currently active advice region if one is
present for regular files.
Reviewed by: jilles, kib, arch@
Approved by: re (kib)
MFC after: 1 month
file descriptor drops to zero out of _fdrop() and into devfs_close_f()
as it is only relevant for devfs file descriptors.
Reviewed by: kib
MFC after: 1 week
subject heading "mtx_lock() of destroyed mutex on NFS" and
PR# 156168 appear to be caused by clnt_dg_destroy() closing
down the socket prematurely. When to close down the socket
is controlled by a reference count (cs_refs), but clnt_dg_create()
checks for sb_upcall being non-NULL to decide if a new socket
is needed. I believe the crashes were caused by the following race:
clnt_dg_destroy() finds cs_refs == 0 and decides to delete socket
clnt_dg_destroy() then loses race with clnt_dg_create() for
acquisition of the SOCKBUF_LOCK()
clnt_dg_create() finds sb_upcall != NULL and increments cs_refs to 1
clnt_dg_destroy() then acquires SOCKBUF_LOCK(), sets sb_upcall to
NULL and destroys socket
This patch fixes the above race by changing clnt_dg_destroy() so
that it acquires SOCKBUF_LOCK() before testing cs_refs.
Tested by: bz
PR: 156168
Reviewed by: dfr
MFC after: 2 weeks
UP/!SMP case.
The callbacks may be relying on this feature and having 2 different
ways to deal with them is not correct.
Reported by: rstone
Reviewed by: jhb
MFC after: 2 weeks
They seem to be changed unintentionally in r226437, and there were no
any mentions of renaming in commit log message.
Reported by: Anton Yuzhaninov <citrin citrin ru>
- delay consumer closing and detaching on orphan() until all I/Os complete;
- prevent new I/Os submission after orphan() called.
Previous implementation could destroy consumers still having active
requests and worked only because of global workaround made on GEOM level.
and use these new options in the mips pmap.
Wake up the page daemon in vm_page_alloc_freelist() if the number of free
and cached pages becomes too low.
Tidy up vm_page_alloc_init(). In particular, add a comment about an
important restriction on its use.
Tested by: jchandra@
instead of destroy_dev(). It moves device destruction waiting out of the
topology lock and so fixes dead lock between orphanization and closing.
Real provider and geom destruction called from swi context after device
destroyed as callback of the destroy_dev_sched_cb().
replace amd(4) with the former in the amd64, i386 and pc98 GENERIC kernel
configuration files. Besides duplicating functionality, amd(4), which
previously also supported the AMD Am53C974, unlike esp(4) is no longer
maintained and has accumulated enough bit rot over time to always cause
a panic during boot as long as at least one target is attached to it
(see PR 124667).
PR: 124667
Obtained from: NetBSD (based on)
MFC after: 3 days
Do not close/destroy opened consumer directly in case of disconnect. Instead
keep it existing until it will be closed in regular way in response to
upstream provider destruction. Delay geom destruction in the same way.
Previous implementation could destroy consumers still having active
requests and worked only because of global workaround made on GEOM level.
corresponding Linux driver uses. This allows mpt(4) to still recognize
all good SATA devices in presence of a defective one, which takes about
45 seconds.
In the long term we probably should implement the logic used by mpt2sas(4)
allowing IOC port initialization to complete at a later time.
Submitted by: Andrew Boyer
MFC after: 3 days
by rman_get_virtual(9) to access device registers sparc64 currently cares
about.
Ideally ata(4) should just be converted to access these using bus_space(9)
read/write functions instead as there's really no reason to do it the
former way. However, this part of ata-siliconimage.c should go away in
favor of siis(4) sooner or later anyway and I don't have the hardware to
actually test the SX4 bits of ata-promise.c.
Also ideally the other architectures should also properly handle the
BUS_SPACE_MAP_LINEAR flag of bus_space_map(9) so this code wouldn't need
to be #ifdef'ed.
Do not close/destroy opened consumer directly in case of disconnect. Instead
keep it existing until it will be closed in regular way in response to
upstream provider destruction. Delay geom destruction in the same way.
Previous implementation could destroy consumers still having active
requests and worked only because of global workaround made on GEOM level.
take advantage of it instead of duplicating it. This reduces the size of
the i386 GENERIC kernel by about 4k. The only potential in-tree user left
unconverted is xe(4), which generally should be changed to use miibus(4)
instead of implementing PHY handling on its own, as otherwise it makes not
much sense to add a dependency on miibus(4)/mii_bitbang(4) to xe(4) just
for the MII bitbang'ing code. The common MII bitbang'ing code also is
useful in the embedded space for using GPIO pins to implement MII access.
- Based on lessons learnt with dc(4) (see r185750), add bus barriers to the
MII bitbang read and write functions of the other drivers converted in
order to ensure the intended ordering. Given that register access via an
index register as well as register bank/window switching is subject to the
same problem, also add bus barriers to the respective functions of smc(4),
tl(4) and xl(4).
- Sprinkle some const.
Thanks to the following testers:
Andrew Bliznak (nge(4)), nwhitehorn@ (bm(4)), yongari@ (sis(4) and ste(4))
Thanks to Hans-Joerg Sirtl for supplying hardware to test stge(4).
Reviewed by: yongari (subset of drivers)
Obtained from: NetBSD (partially)
r162200 delays provider orphanization until all running requests complete,
to workaround broken orphan() method implementation in some classes.
r215687 removes persistent periodic (10Hz) event thread wake ups.
Together these changes can indefinitely delay orphanization until some
other event wake up the event thread. One consequence of this is inability
of CAM to destroy device disconnected when busy and, as consequence, create
new one after reconnection.
While the best solution would be to revert r162200, it is not easy, as
some classes still look broken in that way. Instead conditionally wake up
event thread if there are some providers waiting for orphanization.
MFC after: 1 week
- Move esp_devclass to ncr53c9x.c in order to allow different bus front-ends
to use it.
- Use KOBJMETHOD_END.
- Remove the gl_clear_latched_intr hook as it's not needed for any of the
chips nor the front-ends supported in FreeBSD and likely never will be.
- Correct the DMA constraints used in the SBus front-end, the LSI64854 isn't
limited to 32-bit DMA.
- The ESP200 also only supports up to 64k transfers.
- Don't let the DMA and SBus front-end supply a maximum transfer size larger
than MAXPHYS as that's the maximum the upper layers use and we otherwise
just waste resources unnecessarily.
- Initialize the ECB callout and don't zero the handle when returning ECBs
to the free list so that ncr53c9x_callout() actually is called with the
driver lock held.
- On detach the driver lock should be held across cam_sim_free() according
to isp(4) and a panic received.
- Check the return value of NCRDMA_SETUP(), i.e. bus_dmamap_load(9), and try
to handle failures gracefully.
- In ncr53c9x_action() replace N calls to xpt_done() in a switch with just
one at the end.
- On XPT_PATH_INQ report "NCR" rather than "Sun" as the vendor as the former
is somewhat more correct as well as the maximum supported transfer size via
maxio in order to take advantage of controllers that that can handle more
than DFLTPHYS.
- Print the number of MESSAGE (EXTENDED) rejected.
- Fix the path encoded in the multiple inclusion protection of ncr53c9xvar.h.
- Correct the DMA constraints used in the LSI64854 core to not exceed the
maximum supported transfer size and include the boundary so we don't need
to check on every setup of a DMA transfer.
- Let the bus DMA map callbacks do nothing in case of an error.
- Correctly handle > 64k transfers for FAS366 in the LSI64854. A new feature
flag NCR_F_LARGEXFER was introduced so we just need to check for this one
and not for individual controllers supporting large transfers in several
places.
- Let the LSI64854 core load transfer buffers using BUS_DMA_NOWAIT as the
NCR53C9x core can't handle EINPROGRESS. Due to lack of bounce buffers
support, sparc64 doesn't actually use EINPROGRESS and likely never will,
as an example for writing additional front-ends for the NCR53C9x core it
makes sense to set BUS_DMA_NOWAIT anyway though.
- Some minor cleanup.
eliminating duplicated code in the various pmap implementations.
Micro-optimize vm_phys_free_pages().
Introduce vm_phys_free_contig(). It is fast routine for freeing an
arbitrary number of physically contiguous pages. In particular, it
doesn't require the number of pages to be a power of two.
Use "u_long" instead of "unsigned long".
Bruce Evans (bde@) has convinced me that the "boundary" parameters
to kmem_alloc_contig(), vm_phys_alloc_contig(), and
vm_reserv_reclaim_contig() should be of type "vm_paddr_t" and not
"u_long". Make this change.
to an API change in CAM. It's once again possible to link a static kernel
with 'mfi' without requiring 'scbus' as well. Ditto for KLD loading.
Submitted by: kib
Reviewed by: ken
MFC after: 3 days
(mostly with Catalan characters in mind, but it probably
benefits other languages).
The new mappings are as follows:
▮ -> █
ÀÈÍÏÓÒÚ -> AEIIOOU
ŀ / Ŀ -> l / L
Reviewed by: ed
Approved by: kib (mentor)
I found this useful when trying to debug the AR9160 STA RX filter issue -
I'd get crypto reply errors but it wasn't entirely clear which TID it
was for.
their length.
Without this, an error frame mbuf would:
* have its size adjusted;
* thrown at the radiotap code;
* then since it's never consumed, the rxbuf/mbuf is then re-added to the
RX descriptor list with the small size;
* .. and the hardware ends up (sometimes) only DMA'ing part of a frame into
the small buffer, chaining RX frames together (setting the more flag).
I discovered this particular issue when doing some promiscuous radiotap
testing; I found that I'd occasionally get rs_more set in RX descriptors
w/ the first frame length being very small (sub-100 bytes.) The driver
handles 2-descriptor RX frames (but not more), so this still worked; it
was just odd.
This is suboptimal and may benefit from being replaced with caching
the m_pkthdr_len and m_len fields, then restoring them after completion.
providers and consumers will be destroyed. Before take some actions
with a geom, check that it is not destroyed at the moment.
Tested by: nwhitehorn
MFC after: 1 week
bge(4) sends BGE_FW_CMD_DRV_ALIVE command to firmware every 2
seconds. BGE_FW_CMD_DRV_ALIVE command requires 4 bytes data. This
data contains timeout value in seconds until the next
BGE_FW_CMD_DRV_ALIVE command.
Broadcom recommends driver set the value 3 times longer than the
interval that it sends BGE_FW_CMD_DRV_ALIVE. Currently bge(4) uses
3 seconds so probably we have to increase it in future and use
different ALIVE command(e.g. BGE_FW_CMD_DRV_ALIVE3).
No functional changes.
This bit(SW event 7 in publicly available data sheet) is used to
make RX CPU handle a firmware command and the bit is automatically
cleared after RX CPU completed the command.
Generally firmware command takes the following steps.
1. Write BGE_SRAM_FW_CMD_MB with a command.
2. Write BGE_SRAM_FW_CMD_LEN_MB with the length of the command in bytes.
3. Write BGE_SRAM_FW_CMD_DATA_MB with actual command data.
4. Generate BGE_RX_CPU_EVENT and let firmware handle the command.
5. Wait for the ACK of the firmware command.
No functional changes.
it no longer exists). Instead, run svnversion if we can find the binary
and test that the output looks like a version string.
Reviewed by: discussion on -current@
Tested by: rodrigc for non-svn case (thanks!)
start only one worker thread. For software crypto it will start by default
N worker threads where N is the number of available CPUs.
This is not optimal if hardware crypto is AES-NI, which uses CPU for AES
calculations.
Change that to always start one worker thread for every available CPU.
Number of worker threads per GELI provider can be easly reduced with
kern.geom.eli.threads sysctl/tunable and even for software crypto it
should be reduced when using more providers.
While here, when number of threads exceeds number of CPUs avilable don't
reduce this number, assume the user knows what he is doing.
Reported by: Yuri Karaban <dev@dev97.com>
MFC after: 3 days
- Operate on uint64_t types when doing XORing, etc. instead of uint8_t.
- Don't bzero() temporary block for every AES block. Do it once for entire
data block.
- AES-NI is available only on little endian architectures. Simplify code
that takes block number from IV.
Benchmarks:
Memory-backed md(4) device, software AES-XTS, 4kB sector:
# dd if=/dev/md0.eli bs=1m
59.61MB/s
Memory-backed md(4) device, old AES-NI AES-XTS, 4kB sector:
# dd if=/dev/md0.eli bs=1m
97.29MB/s
Memory-backed md(4) device, new AES-NI AES-XTS, 4kB sector:
# dd if=/dev/md0.eli bs=1m
221.26MB/s
127% performance improvement between old and new code.
Harddisk, raw speed:
# dd if=/dev/ada0 bs=1m
137.63MB/s
Harddisk, software AES-XTS, 4kB sector:
# dd if=/dev/ada0.eli bs=1m
47.83MB/s (34% of raw disk speed)
Harddisk, old AES-NI AES-XTS, 4kB sector:
# dd if=/dev/ada0.eli bs=1m
68.33MB/s (49% of raw disk speed)
Harddisk, new AES-NI AES-XTS, 4kB sector:
# dd if=/dev/ada0.eli bs=1m
108.35MB/s (78% of raw disk speed)
58% performance improvement between old and new code.
As a side-note, GELI with AES-NI using AES-CBC can achive native disk speed.
MFC after: 3 days
thing when changing the debugging options as part of head becoming a new
stable branch. It may also help people who for one reason or another want
to run head but don't want it slowed down by the debugging support.
Reviewed by: kib
more general VM system interfaces. So, their implementation can now
reside in kern_malloc.c alongside the other functions that are declared
in malloc.h.
about the various driver events like load, unload, reset, suspend,
restart, and ioctl operations.
Define driver's event rather than using hard-coded values. We don't
still send suspend/resume event to firmware.
Previously bge(4) used BGE_SDI_STATUS to send events. Because driver
has to access firmware mail box to inform current state, using
BGE_SDI_STATUS register was wrong. The end result was the same as
BGE_SDI_STATUS is 0x0C04.
No functional changes.
- add support for volumes above 2TiB with Promise metadata format;
- enforse and document other limitations:
- Intel and Promise metadata formats do not support disks above 2TiB;
- NVIDIA metadata format does not support volumes above 2TiB.
Sponsored by: iXsystems, Inc.
MFC after: 2 weeks
The origin of GENCOMM seems to come from Alteon Tigon Host/NIC
interface definition where it defines general communications region
which is active when firmware is loaded and running. This region
was used in communication between the host and processor internal
to the Tigon chip.
Broadcom data sheet also defines the region as 'Software Gencomm'
in NetXtreme memory map but lacks detailed description of its
interface so it was hard to know which ones are used for which
interface.
This change shall slightly enhance readability.
No functional changes.
larger than 4KB in size. However the maximum DMA segment size
created in DMA tag is 4KB, so we wouldn't encounter the issue here.
Just record this issue such that let developers not to create a DMA
segment that is larger than 4KB for BCM5719. It's possible to split
a DMA segment into multiple smaller ones in run time but I believe
it's not worth to implement that.
o Protect bge(4) status block access and register dump with driver lock.
o Add missing bus_dmamap_sync() before dumping status block.
o Use minimum status block size, 32 bytes, for status block dump on most
controllers except BCM5700 AX/BX.
While I'm here, make the handler show 5717 Plus in hardware flags.
* preserve AR_TxIntrReq on every descriptor in an aggregate chain,
not just the first descriptor;
* always blank out the descriptor in ar5416ChainTxDesc() when forming
aggregates - the way I'm using this in the 11n branch is to first
chain aggregates together, then use the other HAL calls to fill in
the details.
* Add the TID field in the TX status descriptor;
* Add in the 11n first/middle/last functions for fiddling
with the descriptors. These are from the Linux and the
reference driver, but I'm not (currently) using them.
* Add further AR_ISR_S5 register definitions.
Obtained from: Linux ath9k, Atheros
interfere with traffic, as the NF load can take quite a while and poking the
AGC every 10uS is just a bit silly.
Instead, just leave the baseband NF calibration where it is and just read it
back next time a longcal interval happens.
and constants related to the BIOS Enhanced Disk Drive Specification.
- Use this header instead of magic numbers and various duplicate structure
definitions for doing I/O.
- Use an actual structure for the request to fetch drive parameters in
drvsize() rather than a gross hack of a char array with some magic
size. While here, change drvsize() to only pass the 1.1 version of
the structure and not request device path information. If we want
device path information you have to set the length of the device
path information as an input (along with probably checking the actual
EDD version to see which size one should use as the device path
information is variable-length). This fixes data smashing problems
from passing an EDD 3 structure to BIOSes supporting EDD 4.
Reviewed by: avg
Tested by: Dennis Koegel dk neveragain.de
MFC after: 1 week
controller.
AX88772B data sheet does not show detailed information about
checksum offloading related things. It seems the controller has
lots of options to support checksum offloading but I failed to
understand why this feature requires so much complex controller
configuration and status bits.
One of major difference between AX88772B and its predecessor is
AX88772B uses a new RX header format when RX checksum offloading is
enabled. It also requires the received length of a frame should be
multiple of 4. Controller will pad necessary bytes to make the
length of received frame to be multiple of 4. It is driver's
responsibility to offset this pad bytes.
Note, AX88772B could be configured to get partial checksum value in
in RX header. This mode uses different RX header format and
currently we don't use that fature.
This change makes axe(4) use driver specific MII attach handler to
override uether(9)'s default MII attach and announce flow-control
capability for AX88178/AX88772A/AX88772B to PHY drivers. It seems
original AX88772 also supports flow-control but I didn't enable it
due to lack of test/access to the controller. The flow-control
threshold parameter is loaded from EEPROM and there is no way to
override this value without reprogramming EEPROM. For AX88772B,
TX/RX IP/TCP/UDP checksum offloading is announced to network stack.
IPv6 and PPPoE checksum offloading is also supported by controller
but we have no way to take advantage of these features.
Driver already knows PHY address so make PHY driver know that
information and remove unnecessary PHY address check used in
miibus_readreg/miibus_writereg callbacks. Also announce AX88178,
AX88772A and AX88772B support VLAN over-sized frame.
While I'm here clean up headers and remove axe_start() in
axe_init() because the link wouldn't be available right after media
change.
common cases that can be handled in constant time. The insight being
that a page's parent in the vm object's tree is very often its
predecessor or successor in the vm object's ordered memq.
Tested by: jhb
MFC after: 10 days
with older FreeBSD versions:
- Add -V option to 'geli init' to specify version number. If no -V is given
the most recent version is used.
- If -V is given don't allow to use features not supported by this version.
- Print version in 'geli list' output.
- Update manual page and add table describing which GELI version is
supported by which FreeBSD version, so one can use it when preparing GELI
device for older FreeBSD version.
Inspired by: Garrett Cooper <yanegomi@gmail.com>
MFC after: 3 days
of the CDDL licence explicitly requires every Contributor to add
a copyright notice.
This also reflects the copyright notices for the changes recently
added by Illumos.
MFC after: 3 days
interfaces. A host route has a NULL mask so check for that condition.
I have also been told by developers who customize the packet output
path with direct manipulation of the route entry (or the outgoing
interface to be specific). This patch checks for the route mask
explicitly to make sure custom code will not panic.
PR: kern/161805
MFC after: 3 days
masked out when adding a prefix route through the "route" command.
However, when deleting the route, simply changing the command keyword
from "add" to "delete" does not work. The failoure is observed in
both IPv4 and IPv6 route insertion. The patch makes the route command
behavior consistent between the "add" and the "delete" operation.
MFC after: 1 week
handler such that driver can announce interface capabilities and
can do its own MII attach. Currently all USB ethernet controllers
have no way to establish a link with pause capabilities. Lack of
checksum offloading support also was one of reason to bring this
change in.
This change adds a couple of wrappers to USB ethernet drivers
(uether_ifmedia_upd, uether_init and uether_start). All exported
functions in uether has prefix uether_ so I think it's more
consistent to have wrappers that follow the convention.
This change preserves ABI/KPI so it should be safe to merge this
change to stable/8.
While I'm here add missing __FBSDID and clean up headers.
Reviewed by: hselasky
controller which is found on ULi M1563 South Bridge & M1689 Bridge.
These controllers look like a tulip clone.
M5263 controller does not support MII bitbang so use DC_ROM
register to access MII registers. Like other tulip variants, ULi
controller uses a setup frame to configure RX filter and uses new
setup frame format. It's not clear to me whether the controller
supports a hash based multicast filtering so this patch uses 14
perfect multicast filter to filter multicast frames. If number of
multicast addresses is greater than 14, controller is put into a
mode that receives all multicast frames.
Due to lack of access to M5261, this change was not tested with
M5261 but it probably works. Many thanks to Marco who provided
remote access to M5263.
Tested by: Marco Steinbach <coco <> executive-computing dot de>,
Martin MATO <martin.mato <> orange dot fr>
link such that calling dc_setcfg() right after media change would
be meaningless unless controller in question is not Davicom DM9102.
Ideally dc_setcfg() should be called when speed/duplex is resolved
otherwise it would reprogram controller with wrong speed/duplex
information. Because MII status change callback already calls
dc_setcfg() I think calling dc_setcfg() in dc_init_locked() is
wrong. For instance, it would take some time to establish a link
after mii_mediachg(), so blindly calling dc_setcfg() right after
mii_mediachg() will always yield wrong media configuration.
Extend dc_ifmedia_upd() to handle media change and still allow
21143 and Davidcom controllers program speed/duplex regardless of
current resolved speed/duplex of link. In theory 21143 may not need
to call dc_setcfg() right after media change, but leave it as it is
because there are too many variants to test that change. Probably
dc(4) shall need a PHY reset in dc_ifmedia_upd() but it's hard to
verify correctness of the change.
This change reliably makes ULi M5263 establish a link.
While I'm here correctly report media change result. Previously it
always reported a success.
failure (the getnewvnode cannot return an error). In this case, the
null_insmntque_dtr() already unlocked the reclaimed vnode, so VOP_UNLOCK()
in the nullfs_mount() after null_nodeget() failure is wrong.
Tested by: pho
MFC after: 1 week
- for the legacy PCI ATA channels move channel number out of the device
description, same as it is for ahci(4), siis(4) and mvs(4);
- add device description for the ISA ATA channels.
where the driver assumed that BA resources are still available due to
net80211 saying so.
PR: 161407, 159768
Tested by: cperciva, rene
MFC after: 3 days
It is possible for file systems with 'mountpoint' preperty set to 'legacy'
or 'none' - we don't have to change mount directory for them.
Currently such file systems are unmounted on rename and not even mounted back.
This introduces layering violation, as we need to update 'f_mntfromname'
field in statfs structure related to mountpoint (for the dataset we are
renaming and all its children).
In my opinion it is worth it, as it allow to update FreeBSD in even cleaner
way - in ZFS-only configuration root file system is ZFS file system with
'mountpoint' property set to 'legacy'. If root dataset is named system/rootfs,
we can snapshot it (system/rootfs@upgrade), clone it (system/oldrootfs),
update FreeBSD and if it doesn't boot we can boot back from system/oldrootfs
and rename it back to system/rootfs while it is mounted as /. Before it was
not possible, because unmounting / was not possible.
MFC after: 2 weeks
This restores the previous behaviour. While here, match '?' and '.'
inputs exactly and improve the error message.
Requested by: avg@
Derived from a patch by: Arnaud Lacombe <lacombar@gmail.com>
of scheduling next run pfsync_bulk_update(), pfsync_bulk_fail()
was scheduled.
This lead to instant 100% state leak after first bulk update
request.
- After above fix, it appeared that pfsync_bulk_update() lacks
locking. To fix this, sc_bulk_tmo callout was converted to an
mtx one. Eventually, all pf/pfsync callouts should be converted
to mtx version, since it isn't possible to stop or drain a
non-mtx callout without risk of race.
- Add comment that callout_stop() in pfsync_clone_destroy() lacks
locking. Since pfsync0 can't be destroyed (yet), let it be here.
The root of problem is re-locking at the end of pfsync_sendout().
Several functions are calling pfsync_sendout() holding pointers
to pf data on stack, and these functions expect this data to be
consistent.
To fix this, the following approach was taken:
- The pfsync_sendout() doesn't call ip_output() directly, but
enqueues the mbuf on sc->sc_ifp's interfaces queue, that
is currently unused. Then pfsync netisr is scheduled. PF_LOCK
isn't dropped in pfsync_sendout().
- The netisr runs through queue and ip_output()s packets
on it.
Apart from fixing race, this also decouples stack, fixing
potential issues, that may happen, when sending pfsync(4)
packets on input path.
Reviewed by: eri (a quick review)
o Detect when Boot Camp is enabled (i.e. the MBR mirrors the GPT).
o When Boot Camp is enabled, update the MBR whenever we write the GPT.
o Creation of a Boot Camp enabled GPT is not supported.
o Automatically disable Boot Camp when the GPT has been changed so that
there's either no EFI partition or no HFS+ partition.
o The first 4 partitions (by index) get mirrored in the MBR.
Requested by, discussed with and tested by: kris@pcbsd.org
MFC after: 1 week
number of packets can be queued on sc, while we are in ip_output(), and then
we wipe the accumulated sc_len. On next pfsync_sendout() that would lead to
writing beyond our mbuf cluster.
This allows to see processes I/O activity in 'top -m io' output.
PR kern/156218
Reported by: Marcus Reid <marcus@blazingdot.com>
Patch by: avg
MFC after: 3 days
According to POSIX, these two header files should be able to be included
by themselves, not depending on other headers. The <net/if.h> header
uses struct sockaddr when __BSD_VISIBLE=1, while <netinet/tcp.h> uses
integer datatypes (u_int32_t, u_short, etc).
MFC after: 2 months
implement a deprecated FPU control interface in addition to the
standard one. To make this clearer, further deprecate ieeefp.h
by not declaring the function prototypes except on architectures
that implement them already.
Currently i386 and amd64 implement the ieeefp.h interface for
compatibility, and for fp[gs]etprec(), which doesn't exist on
most other hardware. Powerpc, sparc64, and ia64 partially implement
it and probably shouldn't, and other architectures don't implement it
at all.
- Decompress assembled gang block data if compressed.
- Verify checksum of a gang header.
- Verify checksum of assembled gang block data.
- Verify checksum of uber block.
Submitted by: avg
MFC after: 3 days
physical block size declared in bp may not always be what we want.
For example in case of gang block header physical block size declared
in bp is much larger than SPA_GANGBLOCKSIZE (512 bytes) and checksum
calculation failed. This bug could lead to accessing unallocated
memory and resets/failures during boot.
MFC after: 3 days
handled by lower layers like vdev_raidz, which uses bp for checksum
verification. This bug could lead to NULL pointer reference and resets
during boot.
MFC after: 3 days
to document where we are expecting to be called with a lock held to
more easily catch unnoticed code paths.
This does not neccessarily improve locking in pfsync, it just tries
to avoid the panics reported.
PR: kern/159390, kern/158873
Submitted by: pluknet (at least something that partly resembles
my patch ignoring other cleanup, which I only saw
too late on the 2nd PR)
MFC After: 3 days
and virtualization it is not helpful but complicates things.
Current state of art is to not virtualize these kinds of locks -
inp_group/hash/info/.. are all not virtualized either.
MFC after: 3 days
pfsync also depends on pf to be initialized already so pf goes at
FIRST and the interfaces go at ANY.
Then the (VNET_)SYSINIT startups for pf stays at SI_SUB_PROTO_BEGIN
and for pfsync we move to the later SI_SUB_PROTO_IF.
This is not ideal either but at least an order that should work for
the moment and can be re-fined with the VIMAGE merge, once this will
actually work with more than one network stack.
MFC after: 3 days
and never remove state.
This fixes the problem some people are seeing that state is removed when pf
is loaded as a module but not in situations when compiled into the kernel.
Reported by: many on freebsd-pf
Tested by: flo
MFC after: 3 days
If we handle an interrupt just before the 'wait' and the interrupt
schedules some work, we need to skip the 'wait' call. The simple solution
of calling sched_runnable() with interrupts disabled immediately before
wait still leaves a window after the call and before 'wait' in which
the same issue can occur.
The solution implemented is to check the EPC in the interrupt handler, and
if it is in a region before the 'wait' call, to fix up the EPC to skip the
wait call.
Reported/analysed by: adrian
Fix suggested by: kib
Reviewed by: jmallett, imp
As the underlying block is 4KB if the PMC throughput is low the measurement
will be reported on the next tick. pmcstat(8) use the modified flush API to
reclaim current buffer before displaying next top.
MFC after: 1 month
As part of the 8.0-RELEASE cycle this was done in stable/8 (r199112)
but was left alone in head so people could work on fixing an issue that
caused boot failure on some motherboards. Apparently nobody has worked
on it and we are getting reports of boot failure with the 9.0 test builds.
So this time I'll comment out the driver in head (still hoping someone
will work on it) and MFC to stable/9.
Submitted by: Alberto Villa <avilla at FreeBSD dot org>
- update xlp_machdep.c to read arguments from FDT if FDT support is
compiled in.
- define rmi_uart_bus_space, and use it as fdtbus_bs_tag
- update conf files for FDT support
- add default dts file xlp-basic.dts
It seems the D_PSEUDO flag was meant to allow make_dev() to return NULL.
Nowadays we have a different interface for that; make_dev_p(). There's
no need to keep it there.
While there, remove an unneeded D_NEEDMINOR from the gpio driver.
Discussed with: gonzo@ (gpio)
only logged instances where an operation on a file descriptor required
capabilities which the file descriptor did not have. By adding a type enum
to struct ktr_cap_fail, we can catch other types of capability failures as
well, such as disallowed system calls or attempts to wrap a file descriptor
with more capabilities than it had to begin with.
Some earlier series (~AR5212?) play badly with BIOSes.
In these instances, they may require a forced reset (by transitioning
the NIC through D0 -> D3 -> D0) before they probe/attach correctly.
This is currently disabled because:
* I haven't figured out the "right" code to ensure this only happens
for PCI NICs (not PCIe or Cardbus);
* I haven't at all done wide scale testing for this, and I'm not yet
ready for said wide-scale testing.
I'm documenting this primarily so users with misbehaving NICs have
something to tinker with.
Obtained from: Atheros
The final missing bit here is enabling the PCI configuration register
read, but there's currently no glue available for the HAL to read (and
write) PCI configuration space registers.
Obtained from: Atheros
The AR5008/AR9001 series NICs have a bug where BB register reads
will occasionally be corrupted. This could cause issues with things
such as ANI, which adjust operational parameters based on the
BB radio register reads. This was introduced in the AR5008 chip
and fixed with the first released AR9002 series NIC (AR9280v2.)
A followup commit will implement the acutal WAR when reading
BB registers. I'm still not sure how I'll implement it - whether
it should be done in the osdep layer, or whether it should just
live in the AR5416 HAL. Either way, they can use this capability
bit to determine whether to implement the WAR or not.
Thankyou to various sources inside Atheros who have helped me track
down what this particular issue is.
Obtained from: Atheros
There are HAL methods which are actually direct register
access, rather than simply HAL calls. Because of this, these
register accesses would use the non-debug path in ah_osdep.h
as opt_ah.h isn't included.
With this, the correct register access methods are used,
so debugging traces show things such as TXDP checking and
TSF32 access.
When calculating space needed for SA_BONUS buffers,
hdrsize is always rounded up to next 8-aligned boundary.
However, in two places the round up was done against
sum of 'total' plus hdrsize. On the other hand,
hdrsize increments by 4 each time, which means in
certain conditions, we would end up returning with
will_spill == 0 and (total + hdrsize) larger than
full_space, leading to a failed assertion because
it's invalid for dmu_set_bonus.
Sponsored by: iXsystems, Inc.
Reviewed by: mm
MFC after: 3 days
Because driver is accessing a common MII structure in
mii_pollstat(), updating user supplied structure should be done
before dropping a driver lock.
Reported by: Karim (fodillemlinkarimi <> gmail dot com)
Because driver is accessing a common MII structure in
mii_pollstat(), updating user supplied structure should be done
before dropping a driver lock.
Reported by: Karim (fodillemlinkarimi <> gmail dot com)
That way the radar errors aren't enabled prematurely.
A DFS tester has reported that radar events are reported
during channel scanning, before DFS is actually enabled.
Use the offset into the device tree from fdtp as the phandle instead
of using pointer into the device tree. This will make sure that the
phandle fits into a uint32_t type, even when compiled for 64bit.
Reviewed by: raj, nathanw, marcel
on the largest multi-write size.
From the submitter:
==
I looked further into the magic 88-byte threshold after which the bug
occurs. It turns out that figure included the 24-byte tx_desc, and up
to 64 bytes of beacon frame (header+data).
rum_write_multi doesn't seem happy with writing >64 bytes at a time to
the MAC register. If I break it up into separate calls (e.g. bytes
0-63, then bytes 64-65, written at the appropriate offset) I see the
proper beacon frames being transmitted now.
==
Submitted by: Steven Chamberlain <steven@pyro.eu.org>
MFC after: 3 days
Reading /dev/mem in 64 bit kernel crashes. This is because the page
used to call uiomove_fromphys() from memrw() does not have md.pv_list
initialized correctly.
The fix is to call pmap_page_init() on the page to initialize it.
separation rewrite changes. r196865 was committed to fix a scope
violation problem in the following test scenario:
box-1# ifconfig em0 inet6 2001:db8:1:: prefixlen 64 anycast
box-1# ifconfig em1 inet6 2001:db8:2::1 prefixlen 64
box-2# ifconfig re0 inet6 2001:db8:1::6 prefixlen 64
em0 and re0 are on the same link.
box-2# ping6 2001:db8:1::
PING6(56=40+8+8 bytes) 2001:db8:1::6 --> 2001:db8:1::
the ICMPv6 response should have a source address of em1, which
is 2001:db8:2::1, not the link-local address of em0.
That code is no longer necessary and breaks the IPv6-Ready logo
testing, so revert it now.
Reviewed by: hrs
MFC after: 3 days
long been superseded by the RFC3390 initial CWND sizing.
Also remove the remnants of TCP_METRICS_CWND which used the
TCP hostcache to set the initial CWND in a non-RFC compliant
way.
MFC after: 1 week
- A race condition could happen if two threads were using RAS at the same time
as the code didn't reset RAS_END, the RAS code could believe we were not in
a RAS, when we were in fact.
- Using signed value logic to compare addresses wasn't such a good idea.
Many thanks to Ian to investigate on these issues.
Pointy hat to: cognet
PR: arm/161498
Submitted by: Ian Lepore <freebsd At damnhippie DOT dyndns dot org
MFC after: 1 week
page has been allocated, or we could end up using random values, and bad things
could happen.
PR: arm/161492
Submitted by: Ian Lepore <freebsd AT damnhippie dot dyndns DOT org>
MFC after: 1 week
This fixes a compiler warning at WARNS=6 when including the header files
as follows:
#include <sys/types.h>
#include <netinet/in.h>
#include <netinet/ip_var.h>
#include <netinet/udp.h>
#include <netinet/udp_var.h>
To run a /31 network, participating hosts MUST drop support
for directed broadcasts, and treat the first and last addresses
on subnet as unicast. The broadcast address for the prefix
should be the link local broadcast address, INADDR_BROADCAST.
- Remove ia_net, ia_netmask, ia_netbroadcast from struct in_ifaddr.
- Remove net.inet.ip.subnetsarelocal, I bet no one need it in 2011.
- fix bug when we were not forwarding to a host which matches classful
net address. For example router having 192.168.x.y/16 network attached,
would not forward traffic to 192.168.*.0, which are legal IPs in
CIDR world.
- For compatibility, leave autoguessing of mask based on class.
Reviewed by: andre, bz, rwatson
* Break out the PCI setup override code into a new function.
* Re-apply the PCI overrides on powersave resume. The retry timeout
register isn't currently being saved/resumed by the PCI driver/bus
code.