Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Fri May 31 12:17:08 2013 +0000
drm: Sort connector modes based on vrefresh
Keeping the modes sorted by vrefresh before the pixel clock makes the
mode list somehow more pleasing to the eye.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
PR: 198936
Obtained from: Linux
MFC after: 1 month
MFC with: r280183
will depend on ficl having been built, and are set via bsd.arch.inc.mk we
need to place this after ficl.
As Makefile.amd64 is now late enough we can add the i386 directory to this.
In particular, such DDB commands were added:
show vmem <addr>
show all vmem
show vmemdump <addr>
show all vmemdump
As possible usage, that allows to see KVA usage and fragmentation.
which showed up after I started changing addresses this early.
It turns out that there's some other malarky going on behind the scenes
in the HAL and merely setting the net80211/ifp mac address this early
isn't enough. If the MAC is set from kenv at attach time, the HAL
also needs to be programmed early.
Without this, the VAP wouldn't work enough for finishing association -
probe requests would be fine as they're broadcast, but association
request would fail.
Older compilers, and compatibility modes, may not support variadic macros.
I normally wouldn't go out of my way to support those old compilers but
there is a prescendent in other system headers for using the same macro
multiple times, and the solution (although non-elegant IMHO) works.
Requested by: bde
Solution by: tijl
This allows the TL-WDR3600 to use the correct MAC address for ath0, ath1
and arge0. arge1 isn't used; until I disable it entirely it'll just
show up with a randomly generated MAC.
This is used by the AR71xx platform code to choose a local MAC based on
the "board MAC address", versus whatever potentially invalid/garbage
values are stored in the Atheros calibration data.
A lot of these dinky atheros based MIPS boards don't have a nice, well,
anything consistent defining their MAC addresses for things.
The Atheros reference design boards will happily put MAC addresses
into the wifi module calibration data like they should, and individual
ethernet MAC addresses into the calibration area in flash.
That makes my life easy - "hint.arge.X.eeprommac=<addr>" reads from
that flash address to extract a MAC, and everything works fine.
However, aside from some very well behaved vendors (eg the Carambola 2
board), everyone else does something odd.
eg:
* a MAC address in the environment (eg ubiquiti routerstation/RSPRO)
that you derive arge0/arge1 MAC addresses from.
* a MAC address in flash that you derive arge0/arge1 MAC addresses from.
* The wifi devices having their own MAC addresses in calibration data,
like normal.
* The wifi devices having a fixed, default or garbage value for a MAC
address in calibration data, and it has to be derived from the
system MAC.
So to support this complete nonsense of a situation, there needs to be
a few hacks:
* The "board" MAC address needs to be derived from somewhere and squirreled
away. For now it's either redboot or a MAC address stored in calibration
flash.
* Then, a "map" set of hints to populate kenv with some MAC addresses
that are derived/local, based on the board address. Each board has
a totally different idea of what you do to derive things, so each
map entry has an "offset" (+ve or -ve) that's added to the board
MAC address.
* Then if_arge (and later, if_ath) should check kenv for said hint and
if it's found, use that rather than the EEPROM MAC address - which may
be totally garbage and not actually work right.
In order to do this, I've undone some of the custom redboot expecting
hacks in if_arge and the stuff that magically adds one to the MAC
address supplied by the board - instead, as I continue to test this
out on more hardware, I'll update the hints file with a map explaining
(a) where the board MAC should come from, and (b) what offsets to use
for each device.
The aim is to have all of the tplink, dlink and other random hardware
we run on have valid MAC addresses at boot, so (a) people don't get
random B:S:D❌x:x ethernet MACs, and (b) the wifi MAC is valid
so it works rather than trying to use an invalid address that
actually upsets systems (think: multicast bit set in BSSID.)
Tested:
* TP-Link TL_WDR3600 - subsequent commits will add the hints map
and the if_ath support.
TODO:
* Since this is -HEAD, and I'm all for debugging, there's a lot of
printf()s in here. They'll eventually go under bootverbose.
* I'd like to turn the macaddr routines into something available
to all drivers - too many places hand-roll random MAC addresses
and parser stuff. I'd rather it just be shared code.
However, that'll require more formal review.
* More boards.
in this area and by the Clang static analyzer.
Remove some dead assignments.
Fix a typo in a panic string.
Use umtx_pi_disown() instead of duplicate code.
Use an existing variable instead of curthread.
Approved by: kib (mentor)
MFC after: 3 days
Sponsored by: Dell Inc
common (autogenerated) versions. Removes extra vertical space,
and makes it easier to grep for usage throughout the tree.
Conditionally compile only for arm6 [1] (yes sounds odd but is right).
Submitted by: andrew [1]
Reviewed by: gnn, andrew (ian earlier version I think)
Differential Revision: https://reviews.freebsd.org/D2159
Obtained from: Cambridge/L41
Sponsored by: DARPA, AFRL
CPU, also add protection against invalid CPU's as well as
split c_flags and c_iflags so that if a user plays with the active
flag (the one expected to be played with by callers in MPSAFE) without
a lock, it won't adversely affect the callout system by causing a corrupt
list. This also means that all callers need to use the macros and *not*
play with the falgs directly (like netgraph used to).
Differential Revision: htts://reviews.freebsd.org/D1894
Reviewed by: .. timed out but looked at by jhb, imp, adrian hselasky
tested by hiren and netflix.
Sponsored by: Netflix Inc.
originated from the return to usermode. #ss must be handled same as
#np.
Reported by: Andrew Lutomirski through secteam
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Without this the autotuning fails for small amounts of RAM (32mb),
which all the AR91xx shipping products seemed to have.
Thanks to gjb for reminding me to re-test this stuff.
Tested:
* AR91xx, TP-Link TL-WR1043nd v1
duplicated code in the two classes, and also allows devices in FDT-based
systems to declare simplebus as their parent and still work correctly
when the FDT data describes the device at the root of the tree rather
than as a child of a simplebus (which is common for interrupt, clock,
and power controllers).
Differential Revision: https://reviews.freebsd.org/D1990
Submitted by: Michal Meloun
are built by default. You can still override that with MODULES_EXTRA
for experimental features like ZFS and dtrace on some
architectures. Also note that kernel config files are not affected by
MK_ options listed, though some targets might be.
- make sure the timeout computations are always above zero by using
the existing "linux_timer_jiffies_until()" function. Negative timeouts
can result in undefined behaviour.
- declare all completion functions like external symbols and move the
code to the LinuxAPI kernel module.
- add a proper prefix to all LinuxAPI kernel functions to avoid
namespace collision with other parts of the FreeBSD kernel.
- clean up header file inclusions in the linux/completion.h, linux/in.h
and linux/fs.h header files.
MFC after: 1 week
Sponsored by: Mellanox Technologies
sequence is performed on UFS SU+J rootfs:
cp -Rp /sbin/init /sbin/init.old
mv -f /sbin/init.old /sbin/init
Hang occurs on the rootfs unmount. There are two issues:
1. Removed init binary, which is still mapped, creates a reference to
the removed vnode. The inodeblock for such vnode must have active
inodedep, which is (eventually) linked through the unlinked list. This
means that ffs_sync(MNT_SUSPEND) cannot succeed, because number of
softdep workitems for the mp is always > 0. FFS is suspended during
unmount, so unmount just hangs.
2. As noted above, the inodedep is linked eventually. It is not
linked until the superblock is written. But at the vfs_unmountall()
time, when the rootfs is unmounted, the call is made to
ffs_unmount()->ffs_sync() before vflush(), and ffs_sync() only calls
ffs_sbupdate() after all workitems are flushed. It is masked for
normal system operations, because syncer works in parallel and
eventually flushes superblock. Syncer is stopped when rootfs
unmounted, so ffs_sync() must do sb update on its own.
Correct the issues listed above. For MNT_SUSPEND, count the number of
linked unlinked inodedeps (this is not a typo) and substract the count
of such workitems from the total. For the second issue, the
ffs_sbupdate() is called right after device sync in ffs_sync() loop.
There is third problem, occuring with both SU and SU+J. The
softdep_waitidle() loop, which waits for softdep_flush() thread to
clear the worklist, only waits 20ms max. It seems that the 1 tick,
specified for msleep(9), was a typo.
Add fsync(devvp, MNT_WAIT) call to softdep_waitidle(), which seems to
significantly help the softdep thread, and change the MNT_LAZY update
at the reboot time to MNT_WAIT for similar reasons. Note that
userspace cannot create more work while devvp is flushed, since the
mount point is always suspended before the call to softdep_waitidle()
in unmount or remount path.
PR: 195458
In collaboration with: gjb, pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
When CPU is not busy, those queues are typically empty. When CPU is busy,
then one more extra sorting is the last thing it needs. If specific device
(HDD) really needs sorting, then it will be done later by CAM.
This supposed to fix livelock reported for mirror of two SSDs, when UFS
fires zillion of BIO_DELETE requests, that totally blocks I/O subsystem by
pointless sorting of requests and responses under single mutex lock.
MFC after: 2 weeks
the PMC_IN_KERNEL() macro definition.
Add missing macros to extract the return address (LR) from the trapframe.
Discussed with: andrew
Obtained from: Cambridge/L41
Sponsored by: DARPA, AFRL
MFC after: 2 weeks
any defaults or user specified actions on the command line. This would
be useful for specifying features that are always broken or that
cannot make sense on a specific architecture, like ACPI on pc98 or
EISA on !i386 (!x86 usage of EISA is broken and there's no supported
hardware that could have it in any event). Any items in
__ALWAYS_NO_OPTIONS are forced to "no" regardless of other settings.
Differential Revision: https://reviews.freebsd.org/D2011
This is pretty much a complete rewrite based on the existing i386 code. The
patches have been circulating for a couple years and have been looked at by
plenty of people, but I'm not putting anybody on the hook as having reviewed
this in any formal sense except myself.
After this has gotten wider testing from the user community, ARM_NEW_PMAP
will become the default and various dregs of the old pmap code will be
removed.
Submitted by: Svatopluk Kraus <onwahe@gmail.com>,
Michal Meloun <meloun@miracle.cz>
set of machine headers needed to build the userland toolchain.
Differential Revision: https://reviews.freebsd.org/D2148
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
the startup trampoline code. The old code allocated a kva page, mapped it
using using pmap_kenter_nocache(), then freed the kva without destroying
the mapping. This is the only use of pmap_kenter_nocache() in the system,
so redoing this one use of allows it to be garbage collected in the
near future.
Swap device is still reported as enabled, and system still may crash later
if some swapped-out kernel pages were lost with the device, but at least
GEOM and CAM can now release the lost disk, allowing it to be reconnected.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Bring support for two gcc function attributes that are likely to be used
in our system headers:
__alloc_size
The alloc_size attribute is used to tell the compiler that the function
return value points to memory, where the size is given by one or two of
the functions parameters.
__result_use_check
Causes a warning to be emitted if a caller of the function with this
attribute does not use its return value. This is known in gcc as
"warn_unused_result" but we considered the original naming unsuitable
for an attribute.
The __alloc_size attribute required some workarounds for lint(1).
Both attributes are supported by clang.
Also see: D2107
MFC after: 3 days
where counter was incremented on parent, instead of vlan(4) interface.
The second is more complicated. Historically, in our stack the incoming
packets are accounted in drivers, while incoming bytes for Ethernet
drivers are accounted in ether_input_internal(). Thus, it should be
removed from vlan(4) driver.
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
the Raspberry Pi B we support most of the devices are already supported,
however the base address has changed.
A few items are not working, or missing. The main ones are:
* DMA doesn't work in the sdhci driver.
* Enabling vchiq halts the boot, may be interrupt related.
* There is no U-Boot port yet so the DTB is embedded in the kernel.
The last point will make it difficult to boot FreeBSD, however there is
support for the Raspberry Pi 2 in the U-Boot git repo. As I have not tested
this it is left as an open task to create a port to build.
X-MFC: When the above issues are fixed
Sponsored by: ABT Systems Ltd
number of dynamically created and destroyed SYSCTLs during runtime it
is very likely that the current new OID number limit of 0x7fffffff can
be reached. Especially if dynamic OID creation and destruction results
from automatic tests. Additional changes:
- Optimize the typical use case by decrementing the next automatic OID
sequence number instead of incrementing it. This saves searching time
when inserting new OIDs into a fresh parent OID node.
- Add simple check for duplicate non-automatic OID numbers.
MFC after: 1 week
vm.boot_pages is marked as a CTLFLAG_RDTUN, but it's used by the VM
before the sysctl subsystem is initialsed. We manually fetch the
variable from the environment to work around this problem.
Tested by: Keith White kwhite at uottawa.ca
MFC after: 1 week
code segment base address.
Also if an instruction doesn't support a mod R/M (modRM) byte, don't
be concerned if the CPU is in real mode.
Reviewed by: neel
statements. This allows for setting all PCM core parameters in the
kernel environment through loader.conf(5) or kenv(1) which is useful
for pluggable PCM devices like USB audio devices which might be
plugged after that sysctl.conf(5) is executed.
requested size. If tag restrictions caused split entry, its size is
less then requsted.
Hardware provided by: Michael Fuckner <michael@fuckner.net>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
the right solution but I will leave it to experts to untangle this
problem to properly stop the build failures.
At the moment only if_ix.c includes dev/netmap/ixgbe_netmap.h which is
good as ixgbe_netmap.h defines a couple of (file) static variables--thus
local to if_ix.c.
static int ix_crcstrip however now also got checked from ix_txrx.c
(as an extern) and should not be visible there. In fact we do see
powerpc and powerpc64 build failures because of this. It is unclear
to me why on other (clang built?) architectures this does not lead
to a reference of an undefined symbol and similar build breakage.
There are four places, all in cam_xpt.c, where ccbs are malloc'ed. Two of
these use M_ZERO, two don't. The two that don't meant that allocated ccbs
had trash in them making it hard to debug errors where they showed up. Due
to this, use M_ZERO all the time when allocating ccbs.
Submitted by: Scott Ferris <scott.ferris@isilon.com>
Sponsored by: EMC/Isilon Storage Division
Reviewed by: scottl, imp
Differential: https://reviews.freebsd.org/D2120
NB: Using NULL for default values in-case someone
or something uncomments it and reboots. See
check-password.4th(8) for additional details.
MFC after: 3 days
X-MFC-to: stable/10 stable/9
When taking user input, don't show asterisks as the user types
but instead spin a twiddle. Implement Ctrl-U to clear user input.
If the buffer is empty, either because the user has yet to type
anything, presses Ctrl-U at any time, or presses backspace enough
to end in an empty buffer, the twiddle is erased to provide feed-
back to the user.
MFC after: 3 days
X-MFC-to: stable/10 stable/9
locking out everyone in the case of setting a password longer than
the maximum (currently 16 characters). Now the required password is
truncated to the maximum input that can be read from the user.
PR: kern/198760
MFC after: 3 days
MFH: stable/10 stable/9
This is a temporary workaround until we determine a reliable sequence
of operations for detecting MC reboots.
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/D2084
bus_space_*_8() are not always macros, so it is not correct to use
#ifndef.
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/D2083
Tx multi queue is added in FreeBSD 8.0. So, the changeset drops earlier
versions support.
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/D2081
In theory the barriers are required to cope with write combining and
reordering. Two barriers are added (sometimes merged to one):
1. Before the first write to guarantee that previous writes to the region
have been done
2. Before the last write to guarantee that write to the last dword/qword is
done after previous writes
Barriers are inserted before in the assumption that it is better to
postpone barriers as much as it is possible (more chances that the
operation has already been already done and barrier does not stall CPU).
On x86 and amd64 bus space write barriers are just compiler memory barriers
which are definitely required.
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/D2077
ioctl to put interface down sets ifp->if_flags which holds the intended
administratively defined state and calls driver callback to apply it.
When everything is done, driver updates internal copy of
interface flags sc->if_flags which holds the operational state.
So, transmit from Rx path is possible when interface is intended to be
administratively down in accordance with ifp->if_flags, but not applied
yet and the operational state is up in accordance with sc->if_flags.
Sponsored by: Solarflare Communications, Inc.
Differential Revision: https://reviews.freebsd.org/D2075
port loader.efi to both 32 and 64-bit ARM where we can use this file with
minimal changes.
Differential Revision: https://reviews.freebsd.org/D2031
Reviewed by: imp
There's a bug in the ticks handling where when initialised at '0', once
the ticks counter wrapped the comparison math would never trigger.
The pps calculation would never happen, and thus aggregation was never
enabled.
It manifests itself as "oh you only get 11n transmit aggregation for the
first 10 minutes of uptime."
I'm sure there are other ticks related issues lurking in net80211.
Tested:
* ath / iwn, both with 'wlandebug +11n' and a little bit of iperf to
kick off the transmit A-MPDU negotiation once the pps gets high enough.
delist_dev() function. In addition to this change:
- add a proper description of this function
- add a proper witness assert inside this function
- switch a nearby line to use the "cdp" pointer instead of cdev2priv()
MFC after: 3 days
The magic number MSDOSFS_ARGSMAGIC, which used to distinguish
"old" vs "new" msdosfs mount arguments, has not been used since
2005; it should just go away now.
Likewise, the local-to-Unicode table that changed at the same
time is unused.
Leave the space reserved in the old style mount arguments, though,
since we still support the old mount call (via the cmount entry
point).
Submitted by: Chris Torek <chris.torek@gmail.com>
MFC after: 2 weeks
This is based on the AP135 design - QCA9558 SoC, 3x3 2GHz wifi, but no
5GHz (11n or 11ac) chip is available.
It however still has 128MiB of RAM, 16MiB of NOR flash and the AR8327N
gigabit switch - so it's quite a beefy router device.
Tested:
* Well, a unit, naturally
Obtained from: Completely messing up an amazon.com order and getting this instead, and asking "hey, wonder if I could.."
* add ipfw
* delete ath / ath_ahb for now, until I can have Warner beat me
with the clue stick about putting in conditional build things into
the ath Makefile so the module builds can just have the HAL bits
that are relevant for a particular target.
path.
* For now there's no exposed control over classic / LNA antenna diversity,
so just stub them out. Adding this will take quite a bit of time.
* Add a function to fetch the CTS timeout.
PR: kern/198558
This allows us to get rid of bzero which was added specifically to make
mtx_init on p_mtx reliable.
This also fixes a potential problem where mtx_init on other mutexes
could trip over on unitialized memory and fire an assertion.
Reviewed by: kib
proc_set_cred_init can be used to set first credentials of a new
process.
Update proc_set_cred assertions so that it only expects already used
processes.
This fixes panics where p_ucred of a new process happens to be non-NULL.
Reviewed by: kib
Prior to this change the kernel would take p1's credentials and assign
them tempororarily to p2. But p1 could change credentials at that time
and in effect give us a use-after-free.
No objections from: kib
named objects to zero before the virtual address is selected. Previously,
the color setting was delayed until after the virtual address was
selected. In rtld, this delay effectively prevented the mapping of a
shared library's code section using superpages. Now, for example, we see
the first 1 MB of libc's code on armv6 mapped by a superpage after we've
gotten through the initial cold misses that bring the first 1 MB of code
into memory. (With the page clustering that we perform on read faults,
this happens quickly.)
Differential Revision: https://reviews.freebsd.org/D2013
Reviewed by: jhb, kib
Tested by: Svatopluk Kraus (armv6)
MFC after: 6 weeks
we're not looking at it.
Fix this by increasing l2->l2_occupancy before we try to alloc (and decrease
it if the allocation failed, or if another thread did a similar allocation).
Submitted by: Kohji Okuno <okuno.kohji@jp.panasonic.com>
MFC after: 1 week
- Use real locking, replace Giant with global sx protecting the
subsystem. Since the subsystem' lock is no longer dropped during
the sleepsk, remove not needed SHMSEG_WANTED segment flag, and
revert r278963.
- To do proper code simplification possible after the change of the
lock, restructure several functions into _locked body and
originally-named wrapper which calls into _locked variant. This
allows to eliminate the 'goto done2' spread over the code.
- Merge shm_find_segment_by_shmid() and shm_find_segment_by_shmidx().
- Consistently change all function prototypes to ANSI C.
Reviewed by: mjg (who has earlier version of the similar patch to
introduce real locking)
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
appears to be too inaccurate that it can be used to synchronize the
playback data stream. If there is a recording endpoint associated with
the playback endpoint, use that instead. That means if the isochronous
OUT endpoint is asynchronus the USB audio driver will automatically
start recording, if possible, to get exact information about the
needed sample rate adjustments. In no recording endpoint is present,
no rate adaption will be done.
While at it fix an issue where the hardware buffer pointers don't get
reset at the first device PCM trigger.
Make some variables 32-bit to avoid problems with multithreading.
MFC after: 3 days
PR: 198444
These are actually almost the same units; except one is 3x3 5GHz, and
one is 2x2 5GHz.
Tested:
* TP-Link TL-WDR3600
TODO:
* The ath0/ath1 MAC addresses are ye garbage (00:02:03:04:05:06); fixing
that will take a little more time. It works fine with the ath0/ath1
MAC addresses set manually.
* Go through and yank the AR9344 on-board switch config (arswitch1);
it's not required here for this AP.
The AR934x (and maybe others in this family) have a more complicated
GPIO mux. The AR71xx just has a single function register for a handful
of "GPIO or X" options, however the AR934x allows for one of roughly
100 behaviours for each GPIO pin.
So, this adds a quick hints based mechanism to configure the output
functions, which is required for some of the more interesting board
configurations. Specifically, some use external LNAs to improve
RX, and without the MUX/output configured right, the 2GHz RX side
will be plain terrible.
It doesn't yet configure the "input" side yet; I'll add that if
it's required.
Tested:
* TP-Link TL-WDR3600, testing 2GHz STA/AP modes, checking some
basic RX sensitivity things (ie, "can I see the AP on the other
side of the apartment that intentionally has poor signal reception
from where I am right now.")
Whilst here, fix a silly bug in the maxpin routine; I was missing
a break.
Previously format string traversal could happen while the string itself was
being modified.
Use allproc_lock as coredumping is a rare operation and as such we don't
have to create a dedicated lock.
Submitted by: Tiwei Bie <btw mail.ustc.edu.cn>
Reviewed by: kib
X-Additional: JuniorJobs project
Currently we update timestamps unconditionally when doing read or
write operations. This may slow things down on hardware where
reading timestamps is expensive (e.g. HPET, because of the default
vfs.timestamp_precision setting is nanosecond now) with limited
benefit.
A new sysctl variable, vfs.devfs.dotimes is added, which can be
set to non-zero value when the old behavior is desirable.
Differential Revision: https://reviews.freebsd.org/D2104
Reported by: Mike Tancsa <mike sentex net>
Reviewed by: kib
Relnotes: yes
Sponsored by: iXsystems, Inc.
MFC after: 2 weeks
- Use ifunit() instead of going through the interface list ourselves.
- Remove unused parameter.
- Move the most important comment above the function.
Sponsored by: Nginx, Inc.
Many thanks to ian who gently provided me the DS1307 breakout board.
Tested on: Raspberry pi
Differential Revision: https://reviews.freebsd.org/D2022
Reviewed by: rpaulo
to get the default frequency of the sdhci device.
While here use a u_int to hold the frequency as it may be too large to fit
in a 32-bit signed integer. This is the case when we have a 250MHz clock.
and create a "hidden" API that can be used in other system headers without
adding namespace pollution.
- If the POPCNT instruction is enabled at compile time, use
__builtin_popcount*() to implement __bitcount*(), otherwise fall back
to software implementations.
- Use the existing bitcount16() and bitcount32() from <sys/systm.h> to
implement the non-POPCNT __bitcount16() and __bitcount32() in
<sys/types.h>.
- For the non-POPCNT __bitcount64(), use a similar SWAR method on 64-bit
systems. For 32-bit systems, use two __bitcount32() operations on the
two halves.
- Use __bitcount32() to provide a __bitcount() that operates on plain ints.
- Use either __bitcount32() or __bitcount64() to provide a
__bitcountl() that operates on longs.
- Add public bitcount*() wrappers for __bitcount*() for use in the kernel
in <sys/libkern.h>.
- Use __builtinl() instead of __builtin_popcountl() in BIT_COUNT().
Discussed with: bde
Each plaform performs virtual memory split between kernel and user space
and assigns kernel certain amount of memory space. However, is is sometimes
reasonable to change the default values. Such situation may happen on
systems where the demand for kernel buffers is high, many devices occupying
memory etc. This of course comes with the cost of decreasing user space
memory range so shall be used with care. Most embedded systems will not
suffer from this limtation but rather take advantage of this potential
since default behavior is left unchanged.
Submitted by: Wojciech Macek <wma@semihalf.com>
Reviewed by: imp
Obtained from: Semihalf
translation. In particular, despite IO-APICs only take 8bit apic id,
IR translation structures accept 32bit APIC Id, which allows x2APIC
mode to function properly. Extend msi_cpu of struct msi_intrsrc and
io_cpu of ioapic_intsrc to full int from one byte.
KPI of IR is isolated into the x86/iommu/iommu_intrmap.h, to avoid
bringing all dmar headers into interrupt code. The non-PCI(e) devices
which generate message interrupts on FSB require special handling. The
HPET FSB interrupts are remapped, while DMAR interrupts are not.
For each msi and ioapic interrupt source, the iommu cookie is added,
which is in fact index of the IRE (interrupt remap entry) in the IR
table. Cookie is made at the source allocation time, and then used at
the map time to fill both IRE and device registers. The MSI
address/data registers and IO-APIC redirection registers are
programmed with the special values which are recognized by IR and used
to restore the IRE index, to find proper delivery mode and target.
Map all MSI interrupts in the block when msi_map() is called.
Since an interrupt source setup and dismantle code are done in the
non-sleepable context, flushing interrupt entries cache in the IR
hardware, which is done async and ideally waits for the interrupt,
requires busy-wait for queue to drain. The dmar_qi_wait_for_seq() is
modified to take a boolean argument requesting busy-wait for the
written sequence number instead of waiting for interrupt.
Some interrupts are configured before IR is initialized, e.g. ACPI
SCI. Add intr_reprogram() function to reprogram all already
configured interrupts, and call it immediately before an IR unit is
enabled. There is still a small window after the IO-APIC redirection
entry is reprogrammed with cookie but before the unit is enabled, but
to fix this properly, IR must be started much earlier.
Add workarounds for 5500 and X58 northbridges, some revisions of which
have severe flaws in handling IR. Use the same identification methods
as employed by Linux.
Review: https://reviews.freebsd.org/D1892
Reviewed by: neel
Discussed with: jhb
Tested by: glebius, pho (previous versions)
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
queue. They are for first-level translations and device TLB.
Review: https://reviews.freebsd.org/D1892
Reviewed by: neel
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
* Just do the buf check early and fail out
* If the offset being searched is:
00110000 00 b5 7e 45 61 e2 76 d3 c1 78 dd 15 95 cd 1f f1 |..~Ea.v..x......|
.. and the match string is '.!/bin/sh'
.. then it'll set the match string[0] to '\0', do a strncmp() against
the read buffer, find it's matching two zero-length strings, and think
that's where to start.
MFC after: 2 weeks
promoted" panics. The sequence of events that leads to a panic is rather
long and circuitous. First, suppose that process P has a promoted
superpage S within vm object O that it can write to. Then, suppose that P
forks, which leads to S being write protected. Now, before P's child
exits, suppose that P writes to another virtual page within O. Since the
pages within O are copy on write, a shadow object for O is created to
house the new physical copy of the faulted on virtual page. Then, before
P can fault on S, P's child exists. Now, when P faults on S, it will
follow the "optimized" path for copy-on-write faults in vm_fault(),
wherein the underlying physical page is moved from O to its shadow object
rather than allocating a new page and copying the new page's contents from
the old page. Moreover, suppose that every 4 KB physical page making up S
is moved to the shadow object in this way. However, the optimized path
does not move the underlying superpage reservation, which is the root
cause of the panics! Ultimately, P performs vm_object_collapse() on O's
shadow object, which destroys O and in doing so breaks any reservations
still belonging to O. This leaves the reservation underlying S in an
inconsistent state: It's simultaneously not in use and promoted. Breaking
a reservation does not demote it because I never intended for a promoted
reservation to be broken. It makes little sense. Finally, this
inconsistency leads to an assertion failure the next time that the
reservation is used.
The failing assertion does not (currently) exist in FreeBSD 10.x or
earlier. There, we will quietly break the promoted reservation. While
illogical and unintended, breaking the reservation is essentially
harmless.
PR: 198163
Reviewed by: kib
Tested by: pho
X-MFC after: r267213
Sponsored by: EMC / Isilon Storage Division
The only drives I have discovered so far that support medium type
reports are newer HP LTO (LTO-5 and LTO-6) drives. IBM drives
only support the density reports.
sys/cam/scsi/scsi_sa.h:
The number of possible density codes in the medium type
report is 9, not 8. This caused problems parsing all of
the medium type report after this point in the structure.
usr.bin/mt/mt.c:
Run the density codes returned in the medium type report
through denstostring(), just like the primary and secondary
density codes in the density report. This will print the
density code in hex, and give a text description if it
is available.
Thanks to Rudolf Cejka for doing extensive testing with HP LTO drives
and Bacula and discovering these problems.
Tested by: Rudolf Cejka <cejkar at fit.vutbr.cz>
Sponsored by: Spectra Logic
MFC after: 4 days
in other places that set the external storage type (ext_type) such as
m_cljset(), m_extadd(), mb_ctor_clust(), and vn_sendfile().
Differential Revision: https://reviews.freebsd.org/D2080
Reviewed by: np, glebius
MFC after: 2 weeks
conversion:
- The linux compat API layer casts the ticks to unsigned long which
might cause problems when the ticks value is negative.
- Guard against already expired ticks values, by checking if the
passed expiry tick is already elapsed.
- While at it avoid referring the address of an inlined function.
MFC after: 3 days
Sponsored by: Mellanox Technologies
* Fix the multiple same-named devclasses; the duplicate name
trips up the linker.
* Re-do the taskqueue stuff to use the new cpuset API, not the old
pinned API.
* Add includes for the new location of the RSS configuration routines.
This allows ixgbe to compile as a module /and/ linked into the kernel,
along with RSS working.
Sponsored by: Norse Corp, Inc.
allocator tries to move the entry up, after the boundary. The new
location may still fail to satisfy boundary requirement, for instance,
if the boundary is set to page size, and allocation is of multiple
pages.
Recheck that boundary is not crossed after the move. If it is
crossed, give up on allocating the whole entry and split it.
Reported by: Michael Fuckner <michael@fuckner.net>, running nvme(4)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
next entry does not intersect with the tail of the new entry, but also
that previous entry is also before new entry start.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
several types of data into the mem-info array (DRAM, SRAM, flash). We
need to extract just the DRAM entries for translation into fdt memory
properties.
Also, increase the number of regions we can handle from 5 to 16.
Submitted by: Michal Meloun
Values smaller than two lead to strange asserts that have nothing to do
with the actual problem (in the case of size=0), or to writing beyond the
end of the allocated buffer in sbuf_finish() (in the case of size=1).
- Allow to call the function with vm object lock held.
- Allow to specify reqpage that doesn't match any page in the region,
meaning freeing all pages.
o Utilize the new function in couple more places in vnode pager.
Reviewed by: alc, kib
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
Overview:
* implemented quirk for forcing SATA interface enable
* restore value to status register - this enables link autonegotiation
Modifications:
* devid:vendorid field
* quirk for forcing PI setting (BIOS is doing that on PC-like systems)
* write to capabilites field to enable phy link initialization
Submitted by: Wojciech Macek <wma@semihalf.com>
Reviewed by: imp, mav
Obtained from: Semihalf
This update brings few features:
o Support for the setmaster/dropmaster ioctls. For instance, they
are used to run multiple X servers simultaneously.
o Support for minor devices. The only user-visible change is a new
entry in /dev/dri but it is useless at the moment. This is a
first step to support render nodes [1].
The main benefit is to greatly reduce the diff with Linux (at the
expense of an unreadable commit diff). Hopefully, next upgrades will be
easier.
No updates were made to the drivers, beside adapting them to API
changes.
[1] https://en.wikipedia.org/wiki/Direct_Rendering_Manager#Render_nodes
Tested by: Many people
MFC after: 1 month
Relnotes: yes
- Split the driver into independent pf and vf loadables. This is
in preparation for SRIOV support which will be following shortly.
This also allows us to keep a seperate revision control over the
two parts, making for easier sustaining.
- Make the TX/RX code a shared/seperated file, in the old code base
the ixv code would miss fixes that went into ixgbe, this model
will eliminate that problem.
- The driver loadables will now match the device names, something that
has been requested for some time.
- Rather than a modules/ixgbe there is now modules/ix and modules/ixv
- It will also be possible to make your static kernel with only one
or the other for streamlined installs, or both.
Enjoy!
Submitted by: jfv and erj
Drops are observed under multi-stream TCP traffic due to put-list
overflow with limit equal to 64.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
Transmit may be called when TxQ is not started yet (i.e. txq->common is
invalid). TxQ state is checked below when mbuf is processed and dropped
if TxQ is not started.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
The information is required for NIC update and config tools.
Submitted by: Artem V. Andreev <Artem.Andreev at oktetlabs.ru>
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
It is done to structure sysctl and do not mix with Tx queue statistics
to be added.
Sponsored by: Solarflare Communications, Inc.
Approved by: gnn (mentor)
without a commit message...
Use sbuf_new() + SYSCTL_OUT() instead of wiring the userland buffer and
using sbuf_new_for_sysctl(). The preallocated 256 byte buffer is always
going to be big enough to hold these results, and this should be more
efficient than wiring the old buffer.
INCLUDENUL is set and sbuf_finish() has been called, the length has been
incremented to count the nulterm byte, and in that case current length is
allowed to be equal to buffer size, otherwise it must be less than.
Add a predicate macro to test for SBUF_INCLUDENUL, and use it in tests, to
be consistant with the style in the rest of this file.
and cap_ioctls_get(). On FreeBSD, these are 'unsigned long', but on Linux,
ioctl(2) takes an 'int', making mild abstraction desirable.
MFC after: 3 days
Sponsored by: Google, Inc.
writing approximately never (< 0.00000001% under heavy VM load, and it can
go for months without ever being acquired in normal operation). This
provides a 10% (2-minute) improvement in wall clock time for make -j32
buildworld on a 4-core 32-thread POWER8.
handle_ddp_close() function in t4_ddp.c as the logic is similar
to handle_ddp_data(). This allows all knowledge of the special
DDP mbufs to be private to t4_ddp.c as well.
This makes FreeBSD guest to not avoid using LAPIC timer, preferring HPET
due to worries about non-existing for virtual CPUs deep sleep states.
Benchmarks of usleep(1) on guest and host show such extra latencies:
- 51us for virtual HPET,
- 22us for virtual LAPIC timer,
- 22us for host HPET and
- 3us for host LAPIC timer.
MFC after: 2 weeks
A comment in the code stated we PROC_LOCK and as a side effect guarantee
all writers released process lock. But at that point such lock was already
taken while we were removing the process from all lists, so it should be already
unreachable.
The goal here is to provide one place altering process credentials.
This eases debugging and opens up posibilities to do additional work when such
an action is performed.
A lot of these embedded boards don't have a unique MAC address per
device stored somewhere unique - sometimes they'll have one MAC
for both arge NICs; someties they'll have one MAC for both arge NICs
/and/ the ath NICs. In these instances, we need to derive device
specific MAC addresses from the base MAC address.
These functions will be used by some follow-up code that'll slot
into if_arge and if_ath.
device restart.
(Committers note - once scan overhaul and a few other things have been
fixed in net80211 to not block things in the taskqueue, this can disappear
and the device specific taskqueues in other drivers can also go away.)
PR: kern/197143
Submitted by: Andriy Voskoboinyk <s3erios@gmail.com>
fixes various races between wpi_notif_intr() and wpi_stop_locked().
(attachment 154381)
Committers note: yes, unlock/if_input/lock has to go away, but that'll
have to be done later.
PR: kern/197143
Submitted by: Andriy Voskoboinyk <s3erios@gmail.com>
(Committer note: these checks will have to be re-established in a future
commit as /well/ as having the KASSERTs.)
PR: kern/197143
Submitted by: Andriy Voskoboinyk <s3erios@gmail.com>
traps do appear in the regular call stack, rather than only in a special
trap frame, so we don't need to inject the trap-frame $pc into a returned
stack trace in DTrace.
MFC after: 3 days
Sponsored by: DARPA, AFRL
skip using the DTrace 'profile' provider on ARM. This causes stack traces
to skip various driver-and callout-related things as they do on x86, where
the likewise arbitrary values are '6' (32-bit) and '10' (64-bit) for
similar sorts of reasons.
MFC after: 3 days
Sponsored by: DARPA, AFRL
standard way (including the nulterm byte in the data returned to userland).
This augments the existing sysctl_handle_string() in that this can be used
with const strings without ugly inappropriate casting.
If zvol_geom_start is called with a BIO_DELETE from a thread which can
sleep it queues it for later processing by the zvol_geom_worker. The
zvol_geom_worker didn't have a delete case so would simply loose the bio
hence preventing the original caller from every completing. In addition
an other unknown types would suffer the same fate.
Allow zvol_geom_worker to process BIO_DELETE's via zvol_strategy and
return unsupported for all unknown bio types.
MFC after: 2 weeks
Sponsored by: Multiplay
strings returned to userland include the nulterm byte.
Some uses of sbuf_new_for_sysctl() write binary data rather than strings;
clear the SBUF_INCLUDENUL flag after calling sbuf_new_for_sysctl() in
those cases. (Note that the sbuf code still automatically adds a nulterm
byte in sbuf_finish(), but since it's not included in the length it won't
get copied to userland along with the binary data.)
Remove explicit adding of a nulterm byte in a couple places now that it
gets done automatically by the sbuf drain code.
PR: 195668
The SBUF_INCLUDENUL flag causes the nulterm byte at the end of the string
to be counted in the length of the data. If copying the data using the
sbuf_data() and sbuf_len() functions, or if writing it automatically with
a drain function, the net effect is that the nulterm byte is copied along
with the rest of the data.
vectors to be dynamically allocated. This allows kernel modules like vmm.ko
to allocate unique IPI slots when loaded (as opposed to hard allocating one
or more vectors).
Also, reorganize the fixed IPI vectors to create a contiguous space for
dynamic IPI allocation.
Reviewed by: kib, jhb
Differential Revision: https://reviews.freebsd.org/D2042
- Add bzipfs to the list of supported filesystems in the EFI loader.
- Increase the heap size allocated for the EFI loader from 2MB to 3MB.
Differential Revision: https://reviews.freebsd.org/D2053
Reviewed by: benno, emaste, imp
MFC after: 2 weeks
Sponsored by: Cisco Systems, Inc.
redzone below the stack pointer for scratch space and requires
interrupt and signal frames to avoid overwriting it. However, EFI uses
the Windows ABI which does not support this. As a result, interrupt
handlers in EFI push their interrupt frames directly on top of the
stack pointer. If the compiler used the red zone in a function in the
EFI loader, then a device interrupt that occurred while that function
was running could trash its local variables. In practice this happens
fairly reliable when using gzipfs as an interrupt during decompression
can trash the local variables in the inflate_table() function
resulting in corrupted output or hangs.
Fix this by disabling the redzone for amd64 EFI binaries. This
requires building not only the loader but any libraries used by the
loader without redzone support.
Thanks to Jilles for pointing me at the redzone once I found the stack
corruption.
Differential Revision: https://reviews.freebsd.org/D2054
Reviewed by: imp
MFC after: 2 weeks
Sponsored by: Cisco Systems, Inc.
to obtain IPv4 next hop address in tablearg case.
Add `fwd tablearg' support for IPv6. ipfw(8) uses INADDR_ANY as next hop
address in O_FORWARD_IP opcode for specifying tablearg case. For IPv6 we
still use this opcode, but when packet identified as IPv6 packet, we
obtain next hop address from dedicated field nh6 in struct table_value.
Replace hopstore field in struct ip_fw_args with anonymous union and add
hopstore6 field. Use this field to copy tablearg value for IPv6.
Replace spare1 field in struct table_value with zoneid. Use it to keep
scope zone id for link-local IPv6 addresses. Since spare1 was used
internally, replace spare0 array with two variables spare0 and spare1.
Use getaddrinfo(3)/getnameinfo(3) functions for parsing and formatting
IPv6 addresses in table_value. Use zoneid field in struct table_value
to store sin6_scope_id value.
Since the kernel still uses embedded scope zone id to represent
link-local addresses, convert next_hop6 address into this form before
return from pfil processing. This also fixes in6_localip() check
for link-local addresses.
Differential Revision: https://reviews.freebsd.org/D2015
Obtained from: Yandex LLC
Sponsored by: Yandex LLC
This is slightly different to the other switches - the VLAN table
(VTU) programs in the vlan port mapping /and/ the port config
(tagged, untagged, passthrough, any.)
So:
* Add VTU operations to program the VTU (vlan table)
* abstract out the mirror-disable function so it's .. well, a function.
* setup the port to have a dot1q configuration for dot1q - the
port security is VLAN (not per-port VLAN) and requires an entry
in the VLAN table;
* add set_dot1q / get_dot1q to program the VLAN table;
* since the tagged/untagged ports are now programmed into the VTU,
rather than global - plumb the ports /and/ untagged ports bitmaps
through the arswitch API.
Tested:
* AP135 - QCA9558 SoC + AR8327N switch
(for example, a large mfsroot). Note that for EFI the kernel and
modules (as well as other metadata files such as splash screens or
memory disk images) are loaded into a statically-sized staging area.
When the EFI loader exits it copies this staging area down to the
location the kernel expects to run at.
- Add bounds checking to the copy routines to fail attempts to access
memory outside of the staging area. Previously loading a combined
kernel + modules larger than the staging size (32MB) would overflow
the staging area trashing whatever memory was afterwards. Under
Intel's OVMF firmware for qemu this resulted in fatal faults in the
firmware itself. Now the attempt will fail with ENOMEM.
- Allow the staging area size to be configured at compile time via
an EFI_STAGING_SIZE variable in src.conf or on the command line.
It accepts the size of the staging area in MB. The default size
remains 32MB.
MFC after: 2 weeks
Sponsored by: Cisco Systems, Inc.
packets and does not schedule interrupts for any packets currently
enqueued. Close two races where enqueued packets may not ever trigger
interrupts. The first of these, at adapter initialization time, was
especially severe since a rush of enqueued packets could actually fill
the receive buffer completely, stalling the interface forever.
MFC after: 2 weeks
initialization, when no input method specified before if_attach().
This prevents panics when if_input() method called directly e.g.
from bpf(4) code.
PR: 192426
Reviewed by: glebius
MFC after: 1 week