Place the arguments at a fixed offset of 0x800 withing the argument area
(of size 0x1000). Allow variable size extended arguments first of which
should be a size of the extended arguments (including the size
parameter).
Consolidate all related definitions in a new i386/common/bootargs.h header.
Many thanks to jhb and bde for their guidance and reviews.
Reviewed by: jhb, bde
Approved by: jhb
MFC after: 1 month
controller to perform MDIO type accesses to a remote transceiver
using message pages defined through MRBE(multirate backplane
ethernet). It's used in blade systems(e.g Dell Blade m610) which
are connected to pass-through blades rather than traditional
switches.
This change directly manipulates firmware's mailboxes to control
remote PHY such that it does not use mii(4). Alternatively, as
David said, it could be implemented in brgphy(4) by creating a fake
PHY and let brgphy(4) do necessary mii accesses and bce(4) can
implement mailbox accesses based on the type of brgphy(4)'s mii
accesses. Personally, I think it would make brgphy(4) hard to
maintain since it would have to access many bce(4) registers in
brgphy(4). Given that there are users who are suffering from lack
of remote PHY support, it would be better to get working system
rather than waiting for complete/perfect implementation.
Tested by: Jan Winter ( jan.winter <> kantarmedia dot de )
Reviewed by: davidch (initial version)
MFC after: 2 weeks
TX and RX PCU stop/drain routines have been thoroughly debugged.
It's also very likely that I should add hooks back up to the
interface glue (if_ath_pci / if_ath_ahb) to do any relevant
bus flushes that are required. A WMAC DDR flush may be required
for the AR9130 SoC.
some of the IPI mechanisms used by the common MIPS SMP code so we could use
the multicast IPI facilities, on GXemul as well as on several real hardware
platforms, and the ability to have multiple hard IPI types.
- DTrace scripts to check for errors, performance, ...
they serve mostly as examples of what you can do with the static probe;s
with moderate load the scripts may be overwhelmed, excessive lock-tracing
may influence program behavior (see the last design decission)
Design decissions:
- use "linuxulator" as the provider for the native bitsize; add the
bitsize for the non-native emulation (e.g. "linuxuator32" on amd64)
- Add probes only for locks which are acquired in one function and released
in another function. Locks which are aquired and released in the same
function should be easy to pair in the code, inter-function
locking is more easy to verify in DTrace.
- Probes for locks should be fired after locking and before releasing to
prevent races (to provide data/function stability in DTrace, see the
man-page of "dtrace -v ..." and the corresponding DTrace docs).
result in INQUIRY VPD 0x81 to SATA devices to return only 63 bytes of data
instead of 64 during SCSI/ATA translation.
Sponsored by: Intel
Approved by: scottl
MFC after: 1 week
I/O port addresses. Even if we do, this is hardly the place to mask
interrupts. It's not clear that this was at all needed. The code came
with CVS revision 1.2 of nexus.c when interrupt support was first added.
What is known is that ia64 has always been designed around the IOSAPIC,
and that doing I/O like this prevents Altix from booting.
them to cleanup and goto out when acknowledging the LD's. Check
for failure on malloc. Remove a couple of extra lines and remove
the spurious return.
Prompted by: Petr Lampa
MFC after: 3 days
Use MADT to match ACPI Processor objects to CPUs. MADT and DSDT/SSDTs may
list CPUs in different orders, especially for disabled logical cores. Now
we match ACPI IDs from the MADT with Processor objects, strictly order CPUs
accordingly, and ignore disabled cores. This prevents us from executing
methods for other CPUs, e. g., _PSS for disabled logical core, which may not
exist. Unfortunately, it is known that there are a few systems with buggy
BIOSes that do not have unique ACPI IDs for MADT and Processor objects. To
work around these problems, 'debug.acpi.cpu_unordered' tunable is added.
Set this to a non-zero value to restore the old behavior.
Many thanks to jhb for pointing me to the right direction and the manual
page change.
Reported by: Harris, James R (james dot r dot harris at intel dot com)
Tested by: Harris, James R (james dot r dot harris at intel dot com)
Reviewed by: jhb
MFC after: 1 month
list CPUs in different orders, especially for disabled logical cores. Now
we match ACPI IDs from the MADT with Processor objects, strictly order CPUs
accordingly, and ignore disabled cores. This prevents us from executing
methods for other CPUs, e. g., _PSS for disabled logical core, which may not
exist. Unfortunately, it is known that there are a few systems with buggy
BIOSes that do not have unique ACPI IDs for MADT and Processor objects. To
work around these problems
ThunderBolt cannot read sector >= 2^32 or 2^21
with supplied patch.
Second the bigger change, fix RAID operation on ThunderBolt base
card such as physically removing a disk from a RAID and replacing
it. The current situation is the RAID firmware effectively hangs
waiting for an acknowledgement from the driver. This is due to
the firmware support of the driver actually accessing the RAID
from under the firmware. This is an interesting feature that
the FreeBSD driver does not use. However, when the firmare
detects the driver has attached it then expects the driver will
synchronize LD's with the firmware. If the driver does not sync.
then the management part of the firmware will hang waiting for
it so a pulled driver will listed as still there.
The fix for this problem isn't extremely difficult. However,
figuring out why some of the code was the way it was and then
redoing it was involved. Not have a spec. made it harder to
try to figure out. The existing driver would send a
MFI_DCMD_LD_MAP_GET_INFO command in write mode to acknowledge
a LD state change. In read mode it gets the RAID map from the
firmware. The FreeBSD driver doesn't do that currently. It
could be added in the future with the appropriate structures.
To simplify things, get the current LD state and then build
the MFI_DCMD_LD_MAP_GET_INFO/write command so that it sends
an acknowledgement for each LD. The map would probably state
which LD's changed so then the driver could probably just
acknowledge the LD's that changed versus all. This doesn't seem
to be a problem. When a MFI_DCMD_LD_MAP_GET_INFO/write command
is sent to the firmware, it will complete later when a change
to the LD's happen. So it is very much like an AEN command
returning when something happened. When the
MFI_DCMD_LD_MAP_GET_INFO/write command completes, we refire the
sync'ing of the LD state. This needs to be done in as an event
so that MFI_DCMD_LD_GET_LIST can wait for that command to
complete before issuing the MFI_DCMD_LD_MAP_GET_INFO/write.
The prior code didn't use the call-back function and tried
to intercept the MFI_DCMD_LD_MAP_GET_INFO/write command when
processing an interrupt. This added a bunch of code complexity
to the interrupt handler. Using the call-back that is done
for other commands got rid of this need. So the interrupt
handler is greatly simplified. It seems that even commands
that shouldn't be acknowledged end up in the interrupt handler.
To deal with this, code was added to check to see if a command
is in the busy queue or not. This might have contributed to the
interrupt storm happening without MSI enabled on these cards.
Note that MFI_DCMD_LD_MAP_GET_INFO/read returns right away.
It would be interesting to see what other complexity could
be removed from the ThunderBolt driver that really isn't
needed in our mode of operation. Letting the RAID firmware
do all of the I/O to disks is a lot faster since it can
use its caches. It greatly simplifies what the driver has
to do and potential bugs if the driver and firmware are
not in sync.
Simplify the aen_abort/cm_map_abort and put it in the softc
versus in the command structure.
This should get merged to 9 before the driver is merged to
8.
PR: 167226
Submitted by: Petr Lampa
MFC after: 3 days
- Use isync/lwsync unconditionally for acquire/release. Use of isync
guarantees a complete memory barrier, which is important for serialization
of bus space accesses with mutexes on multi-processor systems.
- Go back to using sync as the I/O memory barrier, which solves the same
problem as above with respect to mutex release using lwsync, while not
penalizing non-I/O operations like a return to sync on the atomic release
operations would.
- Place an acquisition barrier around thread lock acquisition in
cpu_switchin().
This seems to break at least my test board here (AR71xx + AR8316 switch
PHY). Since I do have a whole sleuth of "normal" PHY boards (with
an AR71xx on a normal PHY port), I'll do some further testing with those
to determine whether this is a general issue, or whether it's limited
to the behaviour of the "fake" dedicated PHY port mode on these atheros
switches.
intr_bind() on x86.
This has been requested by jhb and I strongly disagree with this,
but as long as he is the x86 and interrupt subsystem maintainer I will
follow his directives.
The disagreement cames from what we should really consider as a
public KPI. IMHO, if we really need a selection between the kernel
functions, we may need an explicit protection like _KERNEL_KPI, which
defines which subset of the kernel function might really be considered
as part of the KPI (for thirdy part modules) and which not.
As long as we don't have this mechanism I just consider any possible
function as usable by thirdy part code, thus intr_bind() included.
MFC after: 1 week
is running on other cpu, the CALLOUT_PENDING flag is temporarily
cleared. Then, callout_stop() on this, in fact active, callout fails
because CALLOUT_PENDING is not set, and callout_stop() returns 0.
Now, in sleepq_check_timeout(), the failed callout_stop() causes the
sleepq code to execute mi_switch() without even setting the wmesg,
since the switch-out is supposed to be transient. In fact, the thread
is put off the CPU for full timeout interval, instead of being put on
runq immediately. Until timeout fires, the process is unkillable for
obvious reasons.
Fix this by marking the migrating callouts with CALLOUT_DFRMIGRATION
flag. The flag is cleared by callout_stop_safe() when the function
detects a migration, besides returning the success. The softclock()
rechecks the flag for migrating callout and cancels its execution if
the flag was cleared meantime.
PR: misc/166340
Reported, debugging traces provided and tested by:
Christian Esken <christian.esken trivago com>
Reviewed by: avg, jhb
MFC after: 1 week
Cleaner solution (e.g. adding another header) should be done here.
Original log:
Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h.
Remove ipfw/ip_fw_private.h header from non-ipfw code.
Requested by: luigi
Approved by: kib(mentor)
Lagg(4) restricts the type of packet that may be sent directly to a child
port, to avoid undesired output from accidental misconfiguration.
Previously only ETHERTYPE_PAE was permitted.
BPF writes to a lagg(4) child port are presumably intentional, so just
allow them, while still blocking other packets that should take the
aggregation path.
PR: kern/138620
Approved by: thompsa@
another process is in open() or stat() for the device node, then
close() from the owning process does not result in cdevsw close
method call. This fixes the pemanent "Device busy" seen.
Changed the sndstat_lock from mutex to sx. This allows to extend
the region covered by the lock, to include the uiomove() call in
sndstat_read() and bufptr increment. This fixes the "panic:
sbuf_put_byte called with finished or corrupt sbuf" seen.
In collaboration with: kib
MFC after: 1 week
code and which had only stub implementations or no implementation on all
platforms. Makes gxemul compile.
Hinted by: rwatson
MFC after: 3 weeks
X-MFC by: rwatson:
if the accounting log file is atomically replaced with a new file
(such as during log rotation).
- Simplify accounting log rotation a bit. There is no need to re-run
accton(8) after renaming the new log file to it's real name.
PR: kern/167321
Tested by: Jeremy Chadwick
1) Always implement missing bus space methods using a panic() stub rather
than a NULL pointer. This appeared not to trip up any existing device
drivers, but due to the nature of the devices I'm supporting locally,
I'm making use of some of the more obscure busspace methods, and
panic() is a preferred failure mode. For example, do this for the
setregion methods.
2) Hook up several existing busspace method implementations that were
provided in the file, but not actually present in the methods
structure. Especially, single-byte bus I/O routines. This should
allow bugs to be fixed in the Atheros 802.11 driver.
There are still some remaining unimplemented methods that would be
desirable to implement -- especially, 64-bit I/O calls that would
observably accelerate device performance on FPGA-based soft CPU cores
that are typically clocked an order of magnitude slower than
conventional hard core CPUs, but that remains for another day.
MFC after: 3 weeks
Discussed with: jmallett, scottl
Sponsored by: DARPA, AFRL
entirely of one machdep file lifted from the MALTA port, as well as
a low-level console and tty driver for the gxemul debugging console
device (the emulators stdio). As with many low-level embedded and
hypervisor console devices, it is polled only, so we drive TTY I/O
from a callout; we are perhaps a bit too aware of the MIPS physical
maps in order to attach the console before newbus comes to life.
The sample kernel configuration depends on an MD-based root file
system, which is not provided. However, any 64-bit, big-endian
userspace image (such as one generated for MALTA) should work.
This will hopefully be supplemented by additional device drivers for
gxemul-specific hardware simulations from Juli Mallett. We have
found oldtestmips quite useful for testing and improving aspects of
the MIPS port, so it's worth supporting better in FreeBSD.
Requested by: theraven, jmallett
Sponsored by: DARPA, AFRL
MFC after: 3 weeks
* Flesh out the PLL configuration fetch function, which will return the PLL
configuration based on the unit number and speed.
* Remove the PLL speed config logic from the AR71xx/AR91xx chip PLL config
function - pass in a 'pll' value instead.
* Modify arge_set_pll() to:
+ fetch the PLL configuration
+ write the PLL configuration
+ update the MII speed configuration.
This will allow if_arge to override the PLL configuration as required.
Obtained from: Linux/Atheros/OpenWRT
* Add a new method to set the MII mode - GMII, RGMII, RMII, MII.
+ arge0 supports all four (two for non-Gige interfaces.)
+ arge1 only supports two (one for non-gige interfaces.)
* Set the MII clock speed when changing the MAC PLL speed.
+ Needed for AR91xx and AR71xx; not needed for AR724x.
Tested:
* AR71xx only, I'll do AR913x testing tonight and fix whichever issues
creep up.
TODO:
* Implement the missing AR7242 arge0 PLL configuration, but don't
adjust the MII speed accordingly.
* .. the AR7240/AR7241 don't require this, so make sure it's not set
accidentally.
Bugs (not fixed here):
* Statically configured arge speeds are still broken - investigate why
that is on the AP96 board. Autonegotiate is working fine, but there
still seems to be an occasionally heavy packet loss issue.
Obtained from: Linux/Atheros/OpenWRT
- Align the RX buffers on the cache line size, otherwise the requirement
of partial cache line flushes on every are pretty much guaranteed. [1]
- Make the code setting the RX timeout match its comment (apparently,
start and stop bits were missed in the previous calculation). [1]
- Cover the busdma operations in at91_usart_bus_{ipend,transmit}() with
the hardware mutex, too, so these don't race against each other.
- In at91_usart_bus_ipend(), reduce duplication in the code dealing with
TX interrupts.
- In at91_usart_bus_ipend(), turn the code dealing with RX interrupts
into an else-if cascade in order reduce its complexity and to improve
its run-time behavior.
- In at91_usart_bus_ipend(), add missing BUS_DMASYNC_PREREAD calls on
the RX buffer map before handing things over to the hardware again. [1]
- In at91_usart_bus_getsig(), used a variable of sufficient width for
storing the contents of USART_CSR.
- Use KOBJMETHOD_END.
- Remove an unused header.
Submitted by: Ian Lepore [1]
Reviewed by: Ian Lepore
MFC after: 1 week
V100, the firmware is known to be broken and not allowing to simultaneously
open disk devices, causing attempts to boot from a mirror or RAIDZ to cause
a crash. This will be worked around later. The firmwares of newer sun4u models
don't seem to exhibit this problem though.
Steps for ZFS booting:
1. create VTOC8 label
# gpart create -s vtoc8 da0
2. add partitions, f.e.:
# gpart add -t freebsd-zfs -s 60g da0
# gpart add -t freebsd-swap da0
resulting in something like:
# gpart show
=> 0 143331930 da0 VTOC8 (68G)
0 125821080 1 freebsd-zfs (60G)
125821080 17510850 2 freebsd-swap (8.4G)
3. create zpool
# zpool create bunker da0a
or for mirror/RAIDZ (after preparing additional disks as in steps 1. + 2.):
# zpool create bunker mirror da0a da1a
# zpool create bunker raidz da0a da1a da2a ...
4. set bootfs
# zpool set bootfs=bunker bunker
5. install zfsboot
# zpool export bunker
# gpart bootcode -p /boot/zfsboot da0
6. write zfsloader to the ZFS Boot Block (so far, there's no dedicated tool
for this, so dd(1) has to be used for this purpose)
When using mirror/RAIDZ, step 4. and the dd(1) invocation should be repeated
for the additional disks in order to be able to boot from another disk in
case of failure.
# sysctl kern.geom.debugflags=0x10
# dd if=/boot/zfsloader of=/dev/da0a bs=512 oseek=1024 conv=notrunc
# zpool import bunker
7. install system on ZFS filesystem
Don't forget to set 'zfs_load="YES"' and vfs.root.mountfrom="zfs:bunker" in
loader.conf as well as 'zfs_enable="YES"'in rc.conf.
8. copy zpool.cache to the ZFS filesystem
cp -p /boot/zfs/zpool.cache /bunker/boot/zfs/zpool.cache
9. set mountpoint
# zfs set mountpoint=/ bunker
10. Now, given that aliases for all disks in the zpool exists (check with
the `devalias` command on the boot monitor prompt) and disk0 corresponds
to da0 (likewise for additional disks), the system can be booted from the
ZFS with:
{1} ok boot disk0
PR: 165025
Submitted by: Gavin Mu
* Modified hwmp_recv_preq:
o cleaned up code, removed rootmac variable because preq->origaddr
is the root when we recevie a Proactive PREQ;
o Modified so that a PREP in response of a Proactive PREQ is unicast,
a PREP is ALWAYS unicast;
* Modified hwmp_recv_prep:
o Before we mark a route to be valid we should remove the discovery
flag and then mark it valid in such a way we wont lose the isgate flag;
Approved by: adrian
* Added a new discovery flag IEEE80211_MESHRT_FLAGS_DISCOVER;
* Modified ieee80211_ioctl.h to include IEEE80211_MESHRT_FLAGS_DISCOVER;
* Added hwmp_rediscover_cb, which will be called by a timeout to do
rediscovery if we have not reach max number of preq discovery;
* Modified hwmp_discover to setup a callout for path rediscovery;
* Added to ieee80211req_mesh_route to have a back pointer to ieee80211vap
for the discovery callout context;
* Modified mesh_rt_add_locked arguemnt from ieee80211_mesh_state to
ieee80211vap, this because we have to initialize the above back pointer;
Approved by: adrian
* Renamed IEEE80211_ELEMID_MESHPANN to IEEE80211_ELEMID_MESHGANN according to
amendment;
* Added IEEE80211_IOC_MESH_GATE that controls whether Mesh Gate Announcement
is activated or not;
* Renamed all flags from Portal to Gate in HWMP frames;
* Removed IEEE80211_ACTION_MESHPANN enum cause its part of the Mesh Action
category now as per amendment;
* Renamed IEEE80211_MESHFLAGS_PORTAL to IEEE80211_MESHFLAGS_GATE in
ieee80211_mesh_state flags;
* Modified ieee80211_hwmp.c/ieee80211_mesh.c to use new GATE flags;
Approved by: adrian
* Introduced a new HWMP sysctl, Root Confirmation Interval;
* Added hr_lastrootconf to hwmp_route, is for ratecheck for a specific ROOT;
* We missed reading RANN.interval subfield from a RANN frame before;
* Updated hwmp_recv_rann according to amendment, see comments;
Approved by: adrian
* Added mpp_senderror for Mesh Path Selection protocol;
* Added hwmp_senderror that will send an HWMP PERR according to the
supplied reason code;
* Call mpp_senderror when deleting a route with correct reason code
for whether the route is marked proxy or not;
* Call mpp_senderror when trying to forward an individually addressed
frame and there is no forwarding information;
Approved by: adrian
* When receiving a Proactive PREQ dont return after processing it but propagate;
* When we propagate we should not enforce ratechecking;
* Added checking for multiple pred ID detection;
* Storing proxy orig address when PREQ is not for us;
Approved by: adrian
* Moved hs_lastpreq to be hr_lastpreq cause this rate check should be per
target mesh STA according to amendment (NB: not applicable for PERR);
* Modified hwmp_send_preq to use two extra arguments for last sent PREQ and
minimum PREQ interval;
* hwmp_send_preq is called with last two arguments equal to NULL when sending
Proactive PREQs cause the call back task enforces the rate check;
Approved by: adrian
* Added assertion in mesh_rt_update;
* Fixed some prep propagation that where multicast, ALL PREPS ARE UNICAST;
* Fixed PREP acceptance criteria;
* Fixed some PREP debug messages;
* HWMP intermediate reply (PREP) should only be sent if we have newer
forwarding infomration (FI) about target;
* Fixed PREP propagation condition and PREP w/ AE handling;
* Ignore PREPs that have unknown originator.
* Removed old code inside PREP that was for proactive path building
to root mesh;
Other errors include:
* use seq number of target and not orig mesh STA;
* Metric is what we have stored in our FI;
* Error in amendment, Hop count is not 0 but equals FI hopcount for target;
Approved by: adrian
* In mesh_recv_indiv_data_to_fwd update route entry for both meshDA and meshSA;
* In mesh_recv_indiv_data_to_me update route entry for meshSA;
* in ieee80211_mesh_rt_update put code so that a proxy entry that is gated
by us (number of hops == 0) is never invalidated;
* Fixed so that we always call ieee80211_mesh_rt_update with lifetime in ms;
Approved by: adrian
* Modified HWMP PREP/PREQ to contain a proxy entry and also changed PREP
frame processing according to amendment as following:
o Fixed PREP to always update/create if acceptance criteria is meet;
o PREQ processing to reply if request is for a proxy entry that is
proxied by us;
o Removed hwmp_discover call from PREQ, because sending a PREP will
build the forward path, and by receving and accepting a PREQ we
have already built the reverse path (non-proactive code);
* Disabled code for pro-active in PREP for now (will make a separate patch for
pro-active HWMP routing later)
* Added proxy information for a Mesh route, mesh gate to use and proxy seqno;
* Modified ieee80211_encap according to amendment;
* Introduced Mesh control address extension enum and removed unused struct,
also rename some structure element names.
* Modified mesh_input and added mesh_recv_* that should verify and process mesh
data frames according to 9.32 Mesh forwarding framework in amendment;
* Modified mesh_decap accordingly to changes done in mesh control AE struct;
Approved by: adrian
* Introduced ieee80211_mesh_rt_update that updates a route with the
maximum(lifetime left, new lifetime);
* Modified ieee80211_mesh_route struct by adding a lock that will be used
by both ieee80211_mesh_rt_update and precursor code (added in future commit);
* Modified in ieee80211_hwmp.c HWMP code to use new ieee80211_mesh_rt_update;
* Modified mesh_rt_flush_invalid to use new ieee80211_mesh_rt_update;
* mesh_rt_flush also checks that lifetime == 0, this gives route discovery
a change to complete;
* Modified mesh_recv_mgmt case IEEE80211_FC0_SUBTYPE_BEACON:
when ever we received a beacon from a neighbor we update route lifetime;
Approved by: adrian
* Added IEEE80211_MESH_MAX_NEIGHBORS and it is set to 15, same as before;
* Modified mesh_parse_meshpeering_action to verify MPM frame and send
correct reason code for when a frame is rejected according to standard spec;
* Modified mesh_recv_action_meshpeering_* according to the standard spec;
* Modified mesh_peer_timeout_cb to always send CLOSE frame when in CONFIRMRCV
state according to the standard spec;
Approved by: adrian
* Old struct ieee80211_meshpeer_ie had wrong peer_proto field size;
* Added IEEE80211_MPM_* size macros;
* Created an enum for the Mesh Peering Protocol Identifier field according
to the standard spec and removed old defines;
* Abbreviated Handshake Protocol is not used by the standard anymore;
* Modified mesh_verify_meshpeer to use IEEE80211_MPM_* macros for verification;
* Modified mesh_parse_meshpeering_action to parse complete frame, also to parse
it according to the standard spec;
* Modified ieee80211_add_meshpeer to construct correct MPM frames according to
the standard spec;
Approved by: adrian
* Added new action category IEEE80211_ACTION_CAT_SELF_PROT which is used by 11s
for Mesh Peering Management;
* Updated Self protected enum Action codes to start from 1 instead of 0
according to the standard spec;
* Removed old and wrong action categories IEEE80211_ACTION_CAT_MESHPEERING;
* Modified ieee80211_mesh.c and ieee80211_action.c to use the new action
category code;
* Added earlier verification code in ieee80211_input;
Approved by: adrian
- fixed an incorrect lock status issue.
- fixed an incorrect lock issue of unionfs root vnode removed.
(pointed out by keith)
- fixed an infinity loop issue.
(pointed out by dumbbell)
- changed to do LK_RELEASE expressly when unlocked.
Submitted by: ozawa@ongs.co.jp
arge1 still works (it's the standalone PHY) but arge0 and the other switch
ports don't work. They're enumerated though, demonstrating that the
mdiobus abstraction is correctly working.
This is only done if the ARGE_MDIO option is included.
* Shuffle the arge MDIO bus into a separate device, that needs to be
probed early (use hint.argemdio.X.order=0)
* hint.arge.X.mdio now specifies which miiproxy to rendezvous with.
* Call MAC/MDIO bus init during MDIO attach, not arge attach.
This is done regardless:
* Shift the arge MAC and MDIO bus reset code into separate functions
and call it early during MDIO bus attach. It's required for
correct MDIO bus IO to occur on AR71xx/AR91xx devices.
* Remove the AR71xx/AR91xx centric assumption that there's only one
MDIO bus. The initial code mapped miibus0(arge0) and miibus1(arge1)
MII register operations to the MII0 (arge0) register space. The
AR724x (and later, upcoming chipsets) have two MDIO busses and
the second is very much in use.
TODO:
* since the multiphy behaviour has changed (where now a phymask of >1
PHY will still be enumerated), multiphy setups may be quite wrong.
I'll go and fix these so they still have a chance of working, at least.
until the switch PHY support appears in -HEAD.
Submitted by: Stefan Bethke <stb@lassitu.de>
MDIO/MII rendezvous proxy.
* Add an 'mdio' bus, which is the "IO" side of an MII bus (but by design
can be anything which implements the underlying register access API.)
* Add 'miiproxy' and 'mdioproxy', which provides a rendezvous mechanism
for MII busses to appear hanging off arbitrary busses (ie, that aren't
necessarily a traditional looking MII bus.)
MII busses can now hang off anything that implements an mdiobus.
For the AR71xx SoC, there's one MDIO bus but two MII busses. So to
properly support two or more real PHYs, this can be done:
# arge0 MDIO bus - there's no arge1 MDIO bus for AR71xx
hint.argemdio.0.at="nexus0"
hint.argemdio.0.maddr=0x19000000
hint.argemdio.0.msize=0x1000
hint.argemdio.0.order=0
# Create two mdioproxy instances
hint.mdioproxy.0.at="mdio0"
hint.mdioproxy.1.at="mdio0"
# .. and with a follow-up patch
hint.arge.0.mdio=mdioproxy0
hint.arge.1.mdio=mdioproxy0
TODO:
* Do a sweep or two and add appropriate locking in mdio/mdioproxy/miiproxy.
Submitted by: Stefan Bethke <stb@lassitu.de>
Reviewed by: ray
defined by the SNIA Common RAID Disk Data Format Specification v2.0.
Supports multiple volumes per array and multiple partitions per disk.
Supports standard big-endian and Adaptec's little-endian byte ordering.
Supports all single-layer RAID levels. Dual-layer RAID levels except
RAID10 are not supported now because of GEOM RAID design limitations.
Some work is still to be done, but the present code already manages basic
interoperation with RAID BIOS of the Adaptec 1430SA SATA RAID controller.
MFC after: 1 month
Sponsored by: iXsystems, Inc.
Olympus FE-210 camera
LG UP3S MP3 player
Laser MP3-2GA13 MP3
PR: usb/119201
Submitted by: Peter Jeremy <peterjeremy@optushome.com.au>
Approved by: cperciva
MFC after: 1 week
This will give you more bandwidth for isochronous
FULL speed applications connected through a
High Speed HUB.
This patch has been tested with XHCI and EHCI.
MFC after: 1 week
in the HAL. That's very memory hungry (32k just for channel statistics)
which would be better served by keeping a summary in the ANI state.
Or, later, keep a survey history in net80211.
So:
* Migrate the ah_chansurvey array to be a single entry, for the current
channel.
* Change the ioctl interface and ANI code to just reference that.
* Clear the ah_chansurvey array during channel reset, both in the AR5212
and AR5416 reset path.
* Always call ar5416GetListenTime()
* Modify ar5416GetListenTime() to:
+ don't update the ANI state if there isn't any ANI state;
+ don't update the channel survey state if there's no active
channel - just to be paranoid
+ copy the channel survey results into the current sample slot
based on the current channel; then increment the sample counter
and sample history counter.
* Modify ar5416GetMIBCyclesPct() to simply return a HAL_SURVEY_SAMPLE,
rather than a set of percentages. The ANI code wasn't using the
percentages anyway.
TODO:
* Create a new function which fetches the survey results periodically
* .. then modify the ANI code to use the pre-fetched values rather than
fetching them again
* Roll the 11n ext busy function from ar5416_misc.c to update all the
counters, then do the result calculation
* .. then, modify the MIB counter routine to correctly fetch a snapshot -
freeze the counters, fetch the values, then reset the counters.
The reference driver has a 3ms delay for the AR9130 but I'm not as yet
sure why. From what I can gather, it's likely waiting for some FIFO
flush to occur.
At some point in the future it may be worthwhile adding a WMAC
FIFO flush here, but that'd require some side-call through to the SoC
DDR flush routines.
Obtained from: Atheros
guarantees on acquire for the tlbie mutex. Conversely, the TLB invalidation
sequence provides guarantees that do not need to be redundantly applied on
release. Roll a small custom lock that is just right. Simultaneously,
convert the SLB tree changes back to lwsync, as changing them to sync
was a misdiagnosis of the tlbie barrier problem this commit actually fixes.
do not include file attributes in the reply to an NFS create RPC
under certain circumstances.
This resulted in a vnode of type VNON that was not usable.
This patch adds an NFS getattr RPC to nfs_create() for this case,
to fix the problem. It was tested by the person that reported
the problem and confirmed to fix this case for their server.
Tested by: Steven Haber (steven.haber at isilon.com)
MFC after: 2 weeks
ZFS volume is exported via the new NFS server. The leak occurred
because the new NFS server code didn't handle the case where
a file system sets the SAVENAME flag in its VOP_LOOKUP() and
ZFS does this for the DELETE case.
Tested by: Oliver Brandmueller (ob at gruft.de), hrs
PR: kern/167266
MFC after: 1 month
discrepancy between modules and kernel, but deal with SMP differences
within the functions themselves.
As an added bonus this also helps in terms of code readability.
Requested by: gibbs
Reviewed by: jhb, marius
MFC after: 1 week
to follow the example of OpenSolaris and its descendants, which implemented
cpu as an inline that took a value out of curthread. At certain points in
the FreeBSD scheduler curthread->td_oncpu will no longer be valid (in
particukar, just before the thread gets descheduled) so instead I have
implemented this as its own built-in variable.
Sponsored by: Sandvine Inc.
MFC after: 1 week
and voltage sensor, TWSI is used to get sensor data. msk(4) does
not monitor these sensors and interrupt for TWSI completion is
disabled by default.
However, due to unknown reason, the TWSI completion interrupt fires
and it resulted in interrupt storm. To fix it, acknowledges the
TWSI completion interrupt if driver see the event. Given that not
all Yukon II controllers show the issue it could be a silicon bug
which does not honor interrupt masking.
Probably the right way to address the issue is disabling automatic
TWSI cycle initiation against these sensors. It would be even
better to implement reading voltage/temperature from the NIC but it
requires access to National LM80 through TWSI and documentation to
do that is not available yet(probably will never happen).
Reported by: jhb
Tested by: jhb
MFC after: 2 weeks
which will be needed for AR7010 and AR9287 USB access.
The names differ slightly from Linux and Atheros, for the sake of
consistency.
A lot more work is required in order to convert the 11n HAL support to
fully support USB.
header will make the data go over the 64k limits announced to busdma as
maxsize and the transaction will fail.
With TSO this can result in a TCP regression due to the lost packet.
According to the data sheets ixgbe(4) 82598 and 82599 can handle up to
256k so increase the maximum.
Reported by: Jon Kåre Hellan, UNINETT (jon.kare.hellan uninett.no)
Tested by: Jon Kåre Hellan, UNINETT (jon.kare.hellan uninett.no)
MFC after: 1 week
to the process id. It follows the ptrace(2) interface and allows debugging
libraries to use thread ids directly, without slow and verbose conversion
of thread id into pid.
The PGET_NOTID flag is provided to allow a specific sysctl to disallow
this behaviour. All current callers of pget(9) have useful semantic to
operate on tid and do not need this flag.
Reviewed by: jhb, trocini
MFC after: 1 week
not provide general barriers, but only barriers in the context of the
atomic sequences here. As such, make them private and keep the global
*mb() routines using a variant of sync.
isync to implement read and write barriers, following Appendix B.2 of
Book II of the architecture manual. This provides a 25% speed increase
to fork() on the PowerPC G4.
of sync (lwsync is an alternate encoding of sync on systems that do not
support it, providing graceful fallback). This provides more than an order
of magnitude reduction in the time required to acquire or release a mutex.
MFC after: 2 months
sync performs a strict superset of the functions of eieio, so using both
is redundant. While here, expand bus barriers to all bus_space operations,
since many drivers do not correctly use bus_space_barrier().
In principle, we can also replace sync just with eieio, for a significant
performance increase, but it remains to be seen whether any poorly-written
drivers currently depend on the side effects of sync to properly function.
MFC after: 1 week
the page. This PMAP requires an additional lock besides the PMAP lock
in pmap_extract_and_hold(), which vm_page_pa_tryrelock() did not release.
Suggested by: kib
MFC after: 4 days
via sysctl(4) interface. This permits router not to stop forwarding
packets while route table is being written to user-supplied buffer.
Reported by: Pawel Tyll <ptyll@nitronet.pl>
Approved by: kib(mentor)
MFC after: 1 week
as otherwise the interrupt handling code may modify data in the non-DMA
part of the cache line while we have it stashed away in the temporary
stack buffer, then we end up restoring a stale value.
PR: 160431
Submitted by: Ian Lepore
MFC after: 1 week
cover the initial stack size. For MCL_WIREFUTURE maps, the subsequent
call to vm_map_wire() to wire the whole stack region fails due to
VM_MAP_WIRE_NOHOLES flag.
Use the VM_MAP_WIRE_HOLESOK to only wire mapped part of the stack.
Reported and tested by: Sushanth Rai <sushanth_rai yahoo com>
Reviewed by: alc
MFC after: 1 week
sys/dev/dpt/dpt_scsi.c:612:18: error: implicit truncation from 'int' to bitfield changes value from -2 to 2 [-Werror,-Wconstant-conversion]
dpt->cache_type = DPT_CACHE_WRITEBACK;
^ ~~~~~~~~~~~~~~~~~~~
by defining DPT_CACHE_WRITEBACK as 2, since dpt_softc::cache_type is an
unsigned bitfield. No binary change.
MFC after: 1 week
The default priority is now '1000' rather than '0'. This may cause some
unforseen regressions.
Submitted by: Stefan Bethke <stb@lassitu.de>
Reviewed by: imp
- When switching to 4-bit operation, send a SET_CLR_CARD_DETECT command
to disconnect the card-detect pull-up resistor from the DAT3 line before
sending the SET_BUS_WIDTH command.
- Add the missing "reserved" zero entry to the mantissa table used to
decode various CSD fields. This was causing SD cards to report that they
could run at 30 MHz instead of the maximum 25 MHz mandated in the spec.
o Enhancements:
- At the MMC layer, format various info from the CID into a string that
uniquely identifies the card instance (manufacturer number, serial
number, product name and revision, etc). Export it as an instance
variable.
- At the MMCSD layer, display the formatted card ID string, and also
report the clock speed of the hardware (not the card's max speed), and
the number of bits and number of blocks per transfer. It comes out like
this now:
mmcsd0: 968MB <SD SD01G 8.0 SN 276886905 MFG 08/2008 by 3 SD> at mmc0
22.5MHz/4bit/128-block
o Use DEVMETHOD_END.
o Use NULL instead of 0 for pointers.
PR: 156496
Submitted by: Ian Lepore
MFC after: 1 week
sys/contrib/rdma/rdma_cma.c:1259:8: error: case value not in enumerated type 'enum iw_cm_event_status' [-Werror,-Wswitch]
case ECONNRESET:
^
@/sys/errno.h:118:20: note: expanded from macro 'ECONNRESET'
#define ECONNRESET 54 /* Connection reset by peer */
^
sys/contrib/rdma/rdma_cma.c:1263:8: error: case value not in enumerated type 'enum iw_cm_event_status' [-Werror,-Wswitch]
case ETIMEDOUT:
^
@/sys/errno.h:124:19: note: expanded from macro 'ETIMEDOUT'
#define ETIMEDOUT 60 /* Operation timed out */
^
sys/contrib/rdma/rdma_cma.c:1260:8: error: case value not in enumerated type 'enum iw_cm_event_status' [-Werror,-Wswitch]
case ECONNREFUSED:
^
@/sys/errno.h:125:22: note: expanded from macro 'ECONNREFUSED'
#define ECONNREFUSED 61 /* Connection refused */
^
This is because the switch uses iw_cm_event::status, which is an enum
iw_cm_event_status, while ECONNRESET, ETIMEDOUT and ECONNREFUSED are
just plain defines from errno.h.
It looks like there is only one use of any of the enumeration values of
iw_cm_event_status, in:
sys/contrib/rdma/rdma_iwcm.c: if (iw_event->status == IW_CM_EVENT_STATUS_ACCEPTED) {
So messing around with the enum definitions to fix the warning seems too
disruptive; the simplest fix is to cast the argument of the switch to
int.
Reviewed by: kmacy
MFC after: 1 week
sys/dev/nxge/if_nxge.c:1276:11: error: case value not in enumerated type 'xge_hal_event_e' (aka 'enum xge_hal_event_e') [-Werror,-Wswitch]
case XGE_LL_EVENT_TRY_XMIT_AGAIN:
^
sys/dev/nxge/if_nxge.c:1289:11: error: case value not in enumerated type 'xge_hal_event_e' (aka 'enum xge_hal_event_e') [-Werror,-Wswitch]
case XGE_LL_EVENT_DEVICE_RESETTING:
^
This is because the switch uses xge_queue_item_t::event_type, which is
an enum xge_hal_event_e, while the XGE_LL_EVENT_xx values are of the
enum xge_event_e.
Since messing around with the enum definitions is too disruptive, the
simplest fix is to cast the argument of the switch to int.
Reviewed by: gnn
MFC after: 1 week
STAILQ(). While here, fix another clang warning about a switch which
tests an enum type for a regular integer value.
Submitted by: jhb
MFC after: 1 week
assumes for small buffers (< 64k) that the OS driver is actually using
a buffer rounded up to the next power of 2. It also assumes that the
buffer is at least 4k in size. Furthermore, there is at least one
known instance of megarc sending a request with a 12k buffer where the
firmware writes out a 24k-ish reply.
To workaround the data corruption triggered by this "feature", ensure
that buffers for user commands use a minimum size of 32k, and that
buffers between 32k and 64k use a 64k buffer.
PR: kern/155658
Submitted by: Andreas Longwitz longwitz incore de
Reviewed by: scottl
MFC after: 1 week
code that is used to construct a loader (e.g. libstand, ficl, etc).
There is such a thing as a 64-bit EFI application, but it's not
as standard as 32-bit is. Let's make the 32-bit functional (as in
we can load and actualy boot a kernel) before solving the 64-bit
loader problem.
ar724x_pci.c.
* Move out the code which populates the firmware into ar71xx_fixup.c
* Shuffle around the ar724x fixup code to match what the ar71xx fixup
code does.
I've validated this on an AR7240 with AR9285 on-board NIC. It doesn't
yet load, as the AR9285 EEPROM code needs to be made "flash aware."
TODO:
* Validate that I haven't broken AR71xx
* Test AR9285/AR9287 onboard NICs, complete with EEPROM code changes
* Port over the needed BAR hacks for AR7240, AR7241 and AR7242 from
Linux OpenWRT. The current WAR has only been tested on the AR7240
and I'm not sure the way the BAR register is treated is "right".
The "fixup" method here is right when setting the BAR for local access -
ie, the BAR address is either 0xffff (AR7240) or 0x1000ffff (AR7241/AR7242),
but the ath9k-fixup.c code (Linux OpenWRT) does this when setting the
initial "fixup" BAR. It then restores the original BAR.
I'll have to read the ar724x PCI bus glue to see what other special cases
await.
over just the active vnodes associated with a mount point to replace
MNT_VNODE_FOREACH_ALL in the vfs_msync, ffs_sync_lazy, and qsync
routines.
The vfs_msync routine is run every 30 seconds for every writably
mounted filesystem. It ensures that any files mmap'ed from the
filesystem with modified pages have those pages queued to be
written back to the file from which they are mapped.
The ffs_lazy_sync and qsync routines are run every 30 seconds for
every writably mounted UFS/FFS filesystem. The ffs_lazy_sync routine
ensures that any files that have been accessed in the previous
30 seconds have had their access times queued for updating in the
filesystem. The qsync routine ensures that any files with modified
quotas have those quotas queued to be written back to their
associated quota file.
In a system configured with 250,000 vnodes, less than 1000 are
typically active at any point in time. Prior to this change all
250,000 vnodes would be locked and inspected twice every minute
by the syncer. For UFS/FFS filesystems they would be locked and
inspected six times every minute (twice by each of these three
routines since each of these routines does its own pass over the
vnodes associated with a mount point). With this change the syncer
now locks and inspects only the tiny set of vnodes that are active.
Reviewed by: kib
Tested by: Peter Holm
MFC after: 2 weeks
a mount point. Active vnodes are those with a non-zero use or hold
count, e.g., those vnodes that are not on the free list. Note that
this list is in addition to the list of all the vnodes associated
with a mount point.
To avoid adding another set of linkage pointers to the vnode
structure, the active list uses the existing linkage pointers
used by the free list (previously named v_freelist, now renamed
v_actfreelist).
This update adds the MNT_VNODE_FOREACH_ACTIVE interface that loops
over just the active vnodes associated with a mount point (typically
less than 1% of the vnodes associated with the mount point).
Reviewed by: kib
Tested by: Peter Holm
MFC after: 2 weeks
actually in it. This happens when SCTP receives an unknown chunk, which
requires the sending of an ERROR chunk, and there is no final padding but
the chunk is not 4-byte aligned.
Reported by yueting via rwatson@
MFC after: 3 days
at least until I can root cause what's going on.
The only platform I've seen this on is the AR9220 when attached to
the AR71xx CPUs. I get immediate PCIe bus errors and all subsequent
accesses cause further MIPS bus exceptions. I don't have any other
big-endian platforms to test this on.
If I get a chance (or two), I'll try to whack this on a bus analyser
and see exactly what happens.
I'd rather leave this on, especially for slower, embedded platforms.
But the #ifdef hell is something I'm trying to avoid.
- Implement "configure" command to allow switching operation mode of
running device on-fly without destroying and recreation.
- Implement Active/Read mode as hybrid of Active/Active and Active/Passive.
In this mode all paths not marked FAIL may handle reads same time,
but unlike Active/Active only one path handles write requests at any
point in time. It allows to closer follow original write request order
if above layers need it for data consistency (not waiting for requisite
write completion before sending dependent write).
- Hide duplicate messages about device status change.
- Remove periodic thread wake up with 10Hz rate.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
used only as a helper function in that file. Replace sole call to
vbusy() with inline code in vholdl(). Replace sole calls to vfree()
and vdestroy() with inline code in vdropl().
The Clang compiler already inlines these functions, so they do not
show up in a kernel backtrace which is confusing. Also you cannot
set their frame in kgdb which means that it is impossible to view
their local variables. So, while the produced code is unchanged,
the debugging should be easier.
Discussed with: kib
MFC after: 2 weeks
The primary changes are that the user of the interface no longer
needs to manage the mount-mutex locking and that the vnode that
is returned has its mutex locked (thus avoiding the need to check
to see if its is DOOMED or other possible end of life senarios).
To minimize compatibility issues for third-party developers, the
old MNT_VNODE_FOREACH interface will remain available so that this
change can be MFC'ed to 9. Following the MFC to 9, MNT_VNODE_FOREACH
will be removed in head.
The reason for this update is to prepare for the addition of the
MNT_VNODE_FOREACH_ACTIVE interface that will loop over just the
active vnodes associated with a mount point (typically less than
1% of the vnodes associated with the mount point).
Reviewed by: kib
Tested by: Peter Holm
MFC after: 2 weeks
allow the owner to read and write ACL and file attributes when there
was no entry with subject matching the owner. In other words,
'getfacl meh' shouldn't fail for the owner if the ACL looks like this:
# file: meh
# owner: trasz
# group: wheel
user:root:------a-------:------:allow
Reported by: kientzle
like the one triggered by this:
# kldload geom_vinum
# pwait `pgrep -S gv_worker` &
# kldunload geom_vinum
or this:
GEOM_JOURNAL: Shutting down geom gjournal 3464572051.
panic: destroying non-empty racct: 1 allocated for resource 6
which were tracked by jh@ to be caused by checking p->p_flag,
while it wasn't initialised yet. Basically, during fork, the code
checked p_flag, concluded the process isn't marked as P_SYSTEM,
incremented the counter, and later on, when exiting, checked that
the process was marked as P_SYSTEM, and thus didn't decrement it.
Also, I believe there wasn't any good reason for checking P_SYSTEM
in the first place.
Tested by: jh
even in the face of errors.
If the RX descriptor list fails, the RX lock won't be initialised, but
then the DMA free path wil try freeing it.
This commit is brought to you by a working mwl(4).