- Add RATELIMIT kernel configuration keyword which must be set to
enable the new functionality.
- Add support for hardware driven, Receive Side Scaling, RSS aware, rate
limited sendqueues and expose the functionality through the already
established SO_MAX_PACING_RATE setsockopt(). The API support rates in
the range from 1 to 4Gbytes/s which are suitable for regular TCP and
UDP streams. The setsockopt(2) manual page has been updated.
- Add rate limit function callback API to "struct ifnet" which supports
the following operations: if_snd_tag_alloc(), if_snd_tag_modify(),
if_snd_tag_query() and if_snd_tag_free().
- Add support to ifconfig to view, set and clear the IFCAP_TXRTLMT
flag, which tells if a network driver supports rate limiting or not.
- This patch also adds support for rate limiting through VLAN and LAGG
intermediate network devices.
- How rate limiting works:
1) The userspace application calls setsockopt() after accepting or
making a new connection to set the rate which is then stored in the
socket structure in the kernel. Later on when packets are transmitted
a check is made in the transmit path for rate changes. A rate change
implies a non-blocking ifp->if_snd_tag_alloc() call will be made to the
destination network interface, which then sets up a custom sendqueue
with the given rate limitation parameter. A "struct m_snd_tag" pointer is
returned which serves as a "snd_tag" hint in the m_pkthdr for the
subsequently transmitted mbufs.
2) When the network driver sees the "m->m_pkthdr.snd_tag" different
from NULL, it will move the packets into a designated rate limited sendqueue
given by the snd_tag pointer. It is up to the individual drivers how the rate
limited traffic will be rate limited.
3) Route changes are detected by the NIC drivers in the ifp->if_transmit()
routine when the ifnet pointer in the incoming snd_tag mismatches the
one of the network interface. The network adapter frees the mbuf and
returns EAGAIN which causes the ip_output() to release and clear the send
tag. Upon next ip_output() a new "snd_tag" will be tried allocated.
4) When the PCB is detached the custom sendqueue will be released by a
non-blocking ifp->if_snd_tag_free() call to the currently bound network
interface.
Reviewed by: wblock (manpages), adrian, gallatin, scottl (network)
Differential Revision: https://reviews.freebsd.org/D3687
Sponsored by: Mellanox Technologies
MFC after: 3 months
the fifth argument to functions being traced, however there was an error
where the userspace stack was being used. This may be invalid leading to
a kernel panic if this address is unmapped.
Submitted by: Graeme Jenkinson <graeme.jenkinson@cl.cam.ac.uk>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D9229
As the efi_devpath_last_node() and efi_devpath_trim() can return NULL
pointers, the consumers of this API should check the the NULL pointers.
Same for efinet_dev_init() using calloc().
Reported by: Robert Mustacchi <rm@joyent.com>
Reviewed by: jhb, allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9203
Clang apparently requires the explicit form of this instruction, and rejects
uses which ignore the optional cmpD register. This was the only use of the
shorthand form of the instruction, so just fix it up to match the others.
PR: kern/215681
Submitted by: Mark Millard
Reported by: Mark Millard <markmi _AT_ dsl-only.net>
MFC after: 2 weeks
Languages like C++17 and Go provide direct support for slice types:
pointer/length pairs. The CloudABI generator now has more complete for
this, meaning that for the C binding, pointer/length pairs now use an
automatic naming scheme of ${name} and ${name}_len.
Apart from this change and some reformatting, the ABI definitions are
identical. Binary compatibility is preserved entirely.
This field has no practical use and never readed. Initiators already
receive respective residual size from frontends. Removed field had
different semantics, which looks useless, and was never passed through
by any frontend.
While there, fix kern_data_resid field support in case of HA, missed in
r312291.
MFC after: 13 days
This lock was replaced from rwlock in r272840. But unlike rwlock, rmlock
doesn't allow recursion on rm_rlock(), so at this time fix this with
RM_RECURSE flag. Later we need to change ipfw to avoid such recursions.
PR: 216171
Reported by: Eugene Grosbein
MFC after: 1 week
The Zedboard has a hardware bug where initialization of the USB PHY
occasionally fails on boot-up. Fix regression in -CURRENT when
kernel panics on such occasion. 11-RELEASE branch works fine
PR: 215862
Submitted by: Thomas Skibo <thoma555-bsd@yahoo.com>
Previously "panic: msleep" could happen for a few different reasons.
Break the KASSERTs out into individual cases to identify the failing
condition. Found during the investigation that resulted in r308288.
Reviewed by: kib, jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D8604
sources to return timestamps when SO_TIMESTAMP is enabled. Two additional
clock sources are:
o nanosecond resolution realtime clock (equivalent of CLOCK_REALTIME);
o nanosecond resolution monotonic clock (equivalent of CLOCK_MONOTONIC).
In addition to this, this option provides unified interface to get bintime
(equivalent of using SO_BINTIME), except it also supported with IPv6 where
SO_BINTIME has never been supported. The long term plan is to depreciate
SO_BINTIME and move everything to using SO_TS_CLOCK.
Idea for this enhancement has been briefly discussed on the Net session
during dev summit in Ottawa last June and the general input was positive.
This change is believed to benefit network benchmarks/profiling as well
as other scenarios where precise time of arrival measurement is necessary.
There are two regression test cases as part of this commit: one extends unix
domain test code (unix_cmsg) to test new SCM_XXX types and another one
implementis totally new test case which exchanges UDP packets between two
processes using both conventional methods (i.e. calling clock_gettime(2)
before recv(2) and after send(2)), as well as using setsockopt()+recv() in
receive path. The resulting delays are checked for sanity for all supported
clock types.
Reviewed by: adrian, gnn
Differential Revision: https://reviews.freebsd.org/D9171
gtaskqueue bits at SI_SUB_INIT_IF instead of waiting until SI_SUB_SMP
which is far too late.
Add an assertion in taskqgroup_attach() to catch startup initialization
failures in the future.
Reported by: kib bde
it into pmap-v4.h where they are used. Other than those few lines of
support for different MMU types, nothing in cpuconf.h has been used in our
code for quite a while.
The file existed to set up a variety of symbols to describe the
architecture. Over the past few years we have converted all of our source
to use the new architecture symbols standardized by ARM Inc, and predefined
by both clang and gcc.
PR: 216104
It seems like kern_data_resid was never really implemented. This change
finally does it. Now frontends update this field while transferring data,
while CTL/backends getting it can more flexibly handle the result.
At this point behavior should not change significantly, still reporting
errors on write overrun, but that may be changed later, if we decide so.
CAM target frontend still does not properly handle overruns due to CAM API
limitations. We may need to add some fields to struct ccb_accept_tio to
pass information about initiator requested transfer size(s).
MFC after: 2 weeks
This patch adds driver for temperature/humidity sensor connected via GPIO.
To compile it into kernel add "device gpioths". To activate driver, use
hints (.at and .pins) for gpiobus. As result it will provide temperature &
humidity values via sysctl.
DHT11 is cheap & popular temperature/humidity sensor used via GPIO on ARM
or MIPS devices like Raspberry Pi or Onion Omega.
Reviewed by: adrian
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D9185
The find_currdev() is using variable "copy" to store the reference to trimmed
devpath pointer, if for some reason the efi_devpath_handle() fails, we will
leak this copy.
Also we can simplify the code there a bit.
Reviewed by: allanjude
Approved by: allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D9191
Replace archaic "busses" with modern form "buses."
Intentionally excluded:
* Old/random drivers I didn't recognize
* Old hardware in general
* Use of "busses" in code as identifiers
No functional change.
http://grammarist.com/spelling/buses-busses/
PR: 216099
Reported by: bltsrc at mail.ru
Sponsored by: Dell EMC Isilon
arswitch_setled() and a number of _global_setup functions did not acquire the
lock before calling arswitch_modifyreg(). With WITNESS enabled this would
instantly panic.
Discovered on a TPLink-3600:
("panic: mutex arswitch not owned at sys/dev/etherswitch/arswitch/arswitch_reg.c:236")
Reviewed by: adrian, kan
Differential Revision: https://reviews.freebsd.org/D9187
between exp(3) and `exp` var.
The approach taken previously was not ideal for multiple
functional and stylistic reasons.
Add to existing sed call in Makefile to replace `exp` with
`exponent` instead.
MFC after: 13 days
Requested by: bde
vm_object_madvise() is frequently used to apply advice to a contiguous
set of pages in an object with no backing object. Optimize this case by
skipping non-resident subranges in constant time, and by iterating over
resident pages using the object memq, thus avoiding radix tree lookups on
each page index in the specified range.
While here, move MADV_WILLNEED handling to vm_page_advise(), and rename the
"advise" parameter to vm_object_madvise() to "advice."
Reviewed by: alc, kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D9098
sys/net/iflib.c:
Add ctx to filter_info and don't skpi interrupt early on unless we're on an
SMP system
sys/kern/subr_gtaskqueue.c:
Skip smp check if we're running UP
Submitted by: Matt Macy <mmacy@nextbsd.org>
Reported by: emaste bde
This is Micrel KSZ8995MA driver code. KSZ8995MA uses SPI bus to control.
This code is written & tested on @SRCHACK's ksz8995ma board and FON2100
with gpiospi.
etherswitchcfg support commands: addtag, ingress, striptag, dropuntagged.
Submitted by: Hiroki Mori <yamori813@yahoo.co.jp>
Reviewed by: mizhka, adrian
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D8790
This patch adds missing hints for ath0 (eepromaddr) and GPIO (mask & leds).
ath0 doesn't work without eeprom hints, so this commit should make wifi
works on Onion Omega.
GPIO mask is required if you want to use gpiobus and GPIO pins on your
board. Onion Omega has several leds connected to gpio pins (one on board,
one color on dock).
This commit adds mask for gpiobus and allow you to turn off/on leds via
/dev/leds/{board,blue,green,red} (on by default).
Tested on Onion Omega 1.
Reviewed by: adrian
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D9107
This is needed for kernel dumps to work, as the panicking thread will call
into code that makes use of kernel locks.
Reported and tested by: Eugene Grosbein
MFC after: 1 week
the cmap lock. Releasing the lock first may result in the thread
being immediately rescheduled and bound to the same CPU, only to
unpin itself upon resuming execution.
Noted by: skra (in review for armv6 equivalent)
MFC after: 1 week
This helps fix a -Wshadow issue with exp(3) with tests/sys/acct/acct_test,
which include math.h, which in turn defines exp(3)
MFC after: 2 weeks
Tested with: clang, gcc 4.2.1, gcc 4.9
Sponsored by: Dell EMC Isilon
dropped then reacquired due to using M_WAITOK, which opens a window in
which the tty device can disappear. Check for this and return ENXIO
back up the call chain so that callers can cope.
This closes a race where TF_GONE would get set while buffers were being
allocated as part of ttydev_open(), causing a subsequent call to
ttydevsw_modem() later in ttydev_open() to assert.
Reported by: pho
Reviewed by: kib
header match when using a raw socket to send IPv4 packets and
providing the header. If they don't match, let send return -1
and set errno to EINVAL.
Before this patch is was only enforced that the length in the header
is not larger then the buffer length.
PR: 212283
Reviewed by: ae, gnn
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D9161
I added IEEE80211_TX_LOCK() a few years ago because there were races between
seqno allocation, driver queuing and crypto IV allocation. This meant that
they'd appear out of sequence and the receiver would drop them, leading to
terrible performance or flat out traffic hangs.
This flag should be set by drivers that do their own sequence number
allocation for all frames it needs to happen for, including beacon frames.
Eventually this should lead to the driver taking care of locking for
allocating seqno and other traffic-triggered events (eg addba setup.)
This is the bulk of the magic to start enabling VHT channel negotiation.
It is absolutely, positively not yet even a complete VHT wave-1 implementation.
* parse IEs in scan, assoc req/resp, probe req/resp;
* break apart the channel upgrade from the HT IE parsing - do it after the
VHT IEs are parsed;
* (dirty! sigh) add channel width decision making in ieee80211_ht.c htinfo_update_chw().
This is the main bit where negotiated channel promotion through IEs occur.
* Shoehorn in VHT node init ,teardown, rate control, etc calls like the HT
versions;
* Do VHT channel adjustment where appropriate
Tested:
* monitor mode, ath10k port
* STA mode, ath10k port - VHT20, VHT40, VHT80 modes
TODO:
* IBSS;
* hostap;
* (ignore mesh, wds for now);
* finish 11n state engine - channel width change, opmode notifications, SMPS, etc;
* VHT basic rate negotiation and acceptance criteria when scanning, associating, etc;
* VHT control/management frame handling (group managment and operating mode being
the two big ones);
* Verify TX/RX VHT rate negotiation is actually working correctly.
Whilst here, add some comments about seqno allocation and locking. To achieve
the full VHT rates I need to push seqno allocation into the drivers and
finally remove the IEEE80211_TX_LOCK() I added years ago to fix issues. :/
This sets up:
* vht capabilities in vaps;
* calls vht_announce to announce VHT capabilities if any;
* sets up vht20, vht40 and vht80 channels, assuming the regulatory code
does the right thing with 80MHz available ranges;
* adds support to the ieee80211_add_channel_list_5ghz() code to populate
VHT channels, as this is the API my ath10k driver is using;
* add support for the freq1/freq2 field population and lookup that
VHT channels require.
The VHT80 code assumes that the regulatory domain already has limited VHT80
bands to, well, 80MHz wide chunks.
These are of the few cases where we use the GCC non-null attributes in
non-header code. As part of a review [1] of our use of such attributes we
are replacing such uses of the overly aggressive GCC attribute with clang's
_Nonnull attribute.
In this case the attributes serve little purpose as they just don't
enforce run time checks, If anything the attributes would cause NULL pointer
checks to be ignored but there are no such checks so only effect is
cosmetic.
The references appear to be left over from code development and likely
already fulfilled their purpose.
Reference [1]:
https://reviews.freebsd.org/D9004
Reviewed by: jhb
MFC after: 3 weeks
after tty_timedwait() returns an error only if the error is EWOULDBLOCK;
other errors cause an immediate return. This fixes the case of the tty
disappearing while in tty_drain().
Reported by: pho
Do this here as puc(4) disallows single-port instances; at least
one multi-port PCIe UART chip (in this case, the ASIX MCS9922)
present separate PCI configuration space (functions) for each UART.
Tested using lrzsz and a null-modem cable. The ExpressCard/34
variants containing the MCS9922 should also use MSI with this change.
Reviewed by: jhb, imp, rpokala
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D9123
The sysctl controls the period per interface.
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D9153
and tw_so_options defined here which is supposed to be a copy of the
former (short vs u_short respectively).
Switch tw_so_options to be "signed short" to match the type of the field
it's inherited from.
Some g_raid tasters attempt metadata reads in multiples of the provider
sectorsize. Reads larger than MAXPHYS are invalid, so detect and abort
in such situations.
Spiritually similar to r217305 / PR 147851.
PR: 214721
Sponsored by: Dell EMC Isilon
m_pulldown()
m_pulldown() only needs to determine if a mbuf is writable if it is going to
copy data into the data region of an existing mbuf. It does this to create a
contiguous data region in a single mbuf from multiple mbufs in the chain. If
the requested memory region is already contiguous and nothing needs to
change, the mbuf does not need to be writeable.
Submitted by: Brian Mueller <bmueller@panasas.com>
Reviewed by: bz
MFC after: 1 week
Sponsored by: Panasas
Differential Revision: https://reviews.freebsd.org/D9053
The period should be taken into account by the function which
refreshes driver stats.
Reviewed by: philip
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D9130
Firmware version which takes PERIOD_MS parameter into account is
required.
Reviewed by: philip
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D9129
drain timeout handling to historical freebsd behavior.
The primary reason for these changes is the need to have tty_drain() call
ttydevsw_busy() at some reasonable sub-second rate, to poll hardware that
doesn't signal an interrupt when the transmit shift register becomes empty
(which includes virtually all USB serial hardware). Such hardware hangs
in a ttyout wait, because it never gets an opportunity to trigger a wakeup
from the sleep in tty_drain() by calling ttydisc_getc() again, after
handing the last of the buffered data to the hardware.
While researching the history of changes to tty_drain() I stumbled across
some email describing the historical BSD behavior of tcdrain() and close()
on serial ports, and the ability of comcontrol(1) to control timeout
behavior. Using that and some advice from Bruce Evans as a guide, I've
put together these changes to implement the hardware polling and restore
the historical timeout behaviors...
- tty_drain() now calls ttydevsw_busy() in a loop at 10 Hz to accomodate
hardware that requires polling for busy state.
- The "new historical" behavior for draining during close(2) is retained:
the drain timeout is "1 second without making any progress". When the
1-second timeout expires, if the count of bytes remaining in the tty
layer buffer is smaller than last time, the timeout is extended for
another second. Unfortunately, the same logic cannot be extended all
the way down to the hardware, because the interface to that layer is a
simple busy/not-busy indication.
- Due to the previous point, an application that needs a guarantee that
all data has been transmitted must use TIOCDRAIN/tcdrain(3) before
calling close(2).
- The historical behavior of honoring the drainwait setting for TIOCDRAIN
(used by tcdrain(3)) is restored.
- The historical kern.drainwait sysctl to control the global default
drainwait time is restored, but is now named kern.tty_drainwait.
- The historical default drainwait timeout of 300 seconds is restored.
- Handling of TIOCGDRAINWAIT and TIOCSDRAINWAIT ioctls is restored
(this also makes the comcontrol(1) drainwait verb work again).
- Manpages are updated to document these behaviors.
Reviewed by: bde (prior version)
This lets one interrupt DDB's output, which is useful if paging is
disabled and the output device is slow.
Submitted by: Anton Rang <rang@acm.org>
Reviewed by: jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D9138
Include netinet/in.h before ip_compat.t which will then check if
IPPROTO_IPIP is defined or not. Doing it the other way round,
ip_compat.h would not find it defined and netinet/in.h then
redefine it.
Active Open:
- Save the socket's vnet at the time of the active open (t4_connect) and
switch to it when processing the reply (do_act_open_rpl or
do_act_establish).
Passive Open:
- Save the listening socket's vnet in the driver's listen_ctx and switch
to it when processing incoming SYNs for the socket.
- Reject SYNs that arrive on an ifnet that's not in the same vnet as the
listening socket.
CLIP (Compressed Local IPv6) table:
- Add only those IPv6 addresses to the CLIP that are in a vnet
associated with one of the card's ifnets.
Misc:
- Set vnet from the toepcb when processing TCP state transitions.
- The kernel sets the vnet when calling the driver's output routine
so t4_push_frames runs in proper vnet context already. One exception
is when incoming credits trigger tx within the driver's ithread. Set
the vnet explicitly in do_fw4_ack for that case.
MFC after: 3 days
Sponsored by: Chelsio Communications
With clang 4.0.0, we are getting the following warnings about struct
boot_module_t in efi's boot_module.h:
In file included from sys/boot/efi/boot1/ufs_module.c:41:
sys/boot/efi/boot1/boot_module.h:67:14: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
void (*init)();
^
void
sys/boot/efi/boot1/boot_module.h:92:16: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
void (*status)();
^
void
sys/boot/efi/boot1/boot_module.h:95:24: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
dev_info_t *(*devices)();
^
void
3 errors generated.
Fix this by adding 'void' to the parameter lists. No functional change.
Reviewed by: emaste, imp, smh
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D9144
With clang 4.0.0, the EFI API header causes the following warning:
In file included from sys/boot/efi/loader/bootinfo.c:43:
In file included from sys/boot/efi/loader/../include/efi.h:52:
sys/boot/efi/include/efiapi.h:534:32: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
(EFIAPI *EFI_RESERVED_SERVICE) (
^
Add VOID to make it into a real prototype.
Reviewed by: imp, emaste, tsoome
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D9132
after it, which has a conflicting definition of errno. This leads to
the following warning with clang 4.0.0:
In file included from sys/boot/common/reloc_elf32.c:6:
In file included from sys/boot/common/reloc_elf.c:37:
/usr/obj/usr/src/tmp/usr/include/stand.h:155:12: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
extern int errno;
^
sys/sys/errno.h:46:26: note: expanded from macro 'errno'
#define errno (* __error())
^
MFC after: 3 days
We would previously invalidate such entries individually, resulting in more
IPIs than necessary.
Reviewed by: alc, kib
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D9094
biowait() will otherwise race with completions of such BIOs. In-tree code
only calls biowait() on BIOs that do not specify a handler, so this change
should not have any functional impact.
Reviewed by: mav
MFC after: 1 month
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D9070
This was meant to be used by a future FORTIFY_SOURCE implementation.
Probably for good, FORTIFY_SOURCE and this particular GCCism were never
well supported by clang or other compilers. Furthermore, the technology
has long since been replaced by either static checkers, sanitizers, or
even just the strong stack protector that was enabled by default.
Drop __gnu_inline to avoid cluttering the headers.
MFC after: 5 days
clang 3.9.0 without -fPIC generates absolute jump table for
switch/case statement which trips boot1.efi and loader.efi
on ARM platform.
Reviewed by: andrew
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D9108
Fix section pattern code to exclude .rel.data.* sections from being
merged into .data. Otherwise relocations in those sections are lost
in final binary
Reviewed by: andrew
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D9108
__bss_end should not be included in .bss zeroing code. Otherwise first 4
bytes of the section that follows .bss (in loader's case it's .sdata) are
overwritten by zero.
Reviewed by: andrew
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D9108
In-tree objdump is too old to dump new ELF headers. But for example if we
use: `make CROSS_TOOLCHAIN=riscv64-gcc TARGET_ARCH=riscv64` and do not specify
CROSS_BINUTILS_PREFIX in env, embed_mfs.sh cannot find the correct objdump.
This patch just replaces using of objdump with elfdump to collect needed
information.
Later we may also put an ELFDUMP in CROSSENV and use it in embed_mfs.sh .
Reviewed by: emaste, br
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D9062
a linuxkpi style device is expected. If OFED/linuxkpi actually starts
using this field then we'll have to figure out whether to create fake
devices for these drivers or have linuxkpi deal with NULL device.
This mismatch was first reported as part of D6585.
Unnecessary prefetch just loads HW prefetcher and displaces other
cache entries (which could be really useful).
If we parse mbuf for TSO early and use firmware-assisted TSO, we do not
expect mbuf data access when we compose firmware-assisted TSO (v1 or v2)
option descriptors. If packet header needs to be linearized or finally
FATSO cannot be used because of, for example, too big header, we do not
care about a bit more performance degradation because of prefetch
absence (it is better to optimize more common case).
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D9120
in some arm64 hardware, for example the AMD Opteron A1100.
Reviewed by: mav
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D8852
This is needed for two reasons:
* Drivers will need to know what the negotiated set of VHT capabilities
and rates are in order to configure (and reconfigure for opmode/chanwidth
changes) how to speak to a given peer; and
* Because some vendors are "special", we should be careful in what we announce
to them during peer association.
This isn't the complete solution, as I still need to make sure that when
sending out probe requests before we know what we want, we don't limit
the capabilities being announced. This is important for IBSS/mesh work
later on as probe request/response exchanges are the first hint at what
a peer supports. I'll look at adding that to the API soon.
- em(4) igb(4) and lem(4)
- deprecate the igb device from kernel configurations
- create a symbolic link in /boot/kernel from if_em.ko to if_igb.ko
Devices tested:
- 82574L
- I218-LM
- 82546GB
- 82579LM
- I350
- I217
Please report problems to freebsd-net@freebsd.org
Partial review from jhb and suggestions on how to *not* brick folks who
originally would have lost their igbX device.
Submitted by: mmacy@nextbsd.org
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Limelight Networks and Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D8299
These have been tested back-to-back with Linux 3.x and a similar attachment
at the other end; a CDC EEM-like encapsulation can be used for emulated
Ethernet over udbp(4) with ng_ether.
potentially leading to fatal unaligned accesses on architectures with
strict alignment requirements. This change fixes dummynet(4) as far
as accesses to 64-bit members of struct dn_* are concerned, tripping
up on sparc64 with accesses to 32-bit members happening to be correctly
aligned there. In other words, this only fixes the tip of the iceberg;
larger parts of dummynet(4) still need to be rewritten in order to
properly work on all of !x86.
In principle, considering the amount of code in dummynet(4) that needs
this erroneous pattern corrected, an acceptable workaround would be to
declare all struct dn_* packed, forcing compilers to do byte-accesses
as a side-effect. However, given that the structs in question aren't
laid out well either, this would break ABI/KBI.
While at it, replace all existing bcopy(9) calls with memcpy(9) for
performance reasons, as there is no need to check for overlap in these
cases.
PR: 189219
MFC after: 5 days
If some process' nodes were accessed using procfs and the process
cannot exit properly at the time modunload event is reported to the
pseudofs-backed filesystem, the assertion in pfs_vncache_unload() is
triggered. Assertion is correct, the cache should be cleaned.
Approved by: des (pseudofs maintainer)
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Instead of collecting statistics for each combination of ports and logical
units, that consumed ~45KB per LU with present number of ports, collect
separate statistics for every port and every logical unit separately, that
consume only 176 bytes per each single LU/port. This reduces struct
ctl_lun size down to just 6KB.
Also new IOCTL API/ABI does not hardcode number of LUs/ports, and should
allow handling of very large quantities.
MFC after: 2 weeks (probably keeping old API enabled for some time)
handler which already holds the mutex, and have sdhci_handle_card_present()
be just a tiny wrapper that does the locking for external callers.
This should fix the recursive locking panics seen on rpi3.
Reported by: Shawn Webb
Besides slots always having non-removable media, these HCIs require
a custom hardware reset sequence after power-up.
- Flesh out the support for Intel Braswell eMMC controllers further.
Apart from also requiring said reset code, the timeout clock needs to
be hardcoded to 1 MHz for these.
Both the special reset and timeout clock handlings are implemented as
global sdhci(4) quirks as the same treatment will be necessary for
Intel eMMC controllers attached via ACPI (once sdhci(4) grows such a
front-end).
- In sdhci_init_slot(), use the right capability field for determining
the announced bus width based on MMC_CAP_*_BIT_DATA.
- Correct inverted sdhci_pci_softc member comments added in r276469. [1]
Submitted by: Anton Yuzhaninov [1]
MFC after: 5 days
or write, resulting in random short-read and short-write returns for
requests. Fixing this fixes nominal block I/O via mmcsd(4).
Obtained from: DragonFlyBSD (fd4b97583be1a1e57234713c25f6e81bc0411cb0)
MFC after: 5 days
This array takes 64KB of RAM now, that was more then half of struct ctl_lun
size. If at some point we support more ports, this may need another tune.
MFC after: 2 weeks
card presence and write protect switch detection.
A bridge driver just needs to call the setup routine in its attach(), the
teardown in its detach(), and write a couple tiny glue functions to connect
the sdhci interface functions to the new helper functions. This is not
extensively documented, but multiple examples will exist real soon.
card insert/remove events on controllers that don't implement the insert
and remove interrupts.
Bridge drivers can set a new slot option, SDHCI_NON_REMOVABLE, to indicate
non-removable media (such as eMMC). The sdhci driver will not enable
insert/remove interrupts, and sdhci_generic_get_card_present() will always
return true.
Bridge drivers can set a new quirk, SDHCI_QUIRK_POLL_CARD_PRESENT, and the
sdhci driver will not enable insert/remove interrupts, and instead will use
a callout to poll the card-present status at 5 Hz.
For bridge drivers that get notified of card insert/remove via gpio
interrupts, there is a new sdhci_handle_card_present() function they can
call from the gpio interrupt handler to inform the sdhci code of the event.
In addition to adding these new features, the existing code to debounce card
insertions was updated to use taskqueue_enqueue_timeout() instead of
scheduling a callout to do the taskqueue_enqueue(). There is also now a
comment explaining that insertion-debounce is what's going on -- it took me
a long time to realize that's what the old sdhci_card_delay() routine was
really doing. There is no functional difference between the old and new
debounce code (I hope!).
Use device-specific Rx buffer size to ensure that data will not be
truncated + add a warning if truncation was detected (the driver
cannot handle this case correctly yet).
Tested with:
- RTL8188CUS, RTL8188EU and RTL8821AU, STA / AP modes.
There are places where checks are made against VM_MAX_KERNEL_ADDRESS, or
virtual_end (set to VM_MAX_KERNEL_ADDRESS). With 32-bit checks, an address will
always be less than or equal to 0xffffffff. Drop a page, so those checks can
terminate loops safely.
With clang 4.0.0, I'm getting the following warnings:
sys/geom/vinum/geom_vinum_state.c:186:7: error: logical not is only
applied to the left hand side of this bitwise operator
[-Werror,-Wlogical-not-parentheses]
if (!flags & GV_SETSTATE_FORCE)
^ ~
The logical not operator should obiously be called after masking.
Reviewed by: mav, pfg
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D9093
This function is used only by ipsec_getpolicybysock() to fill security
policy index selector for locally generated packets (that have INPCB).
The function incorrectly assumes that spidx is the same for both directions.
Fix this by using new direction argument to specify correct INPCB security
policy - sp_in or sp_out. There is no need to fill both policy indeces,
because they are overwritten for each packet.
This fixes security policy matching for outbound packets when user has
specified TCP/UDP ports in the security policy upperspec.
PR: 213869
MFC after: 1 week
Inums in cd9660 refer to byte offsets on the media. DVD and BD media
can have entries above 4GB, especially with multi-session images.
PR: 190655
Reported by: Thomas Schmitt <scdbackup at gmx.net>
And HP x2 210, per DragonFlyBSD 240bd9cd58f8259c12c14a8006837e698.
Submitted by: Johannes Lundberg <yohanesu75 at gmail.com>
No objection: gonzo@
Obtained from: DragonFlyBSD
This is a skeleton set based on ieee80211_ht.c. It implements some IE
parsing, some basic unfinished negotiation, and channel promotion/demotion.
However, by itself it's not enough to do VHT - notably, the actual
channel promotion for STA mode at least is done in ieee80211_ht.c as
part of htinfo_update_chw(). I was .. quite amused when I found that
out.
I'm checking this in so others can see progress rather than one huge
commit when VHT is "done" (which will likely be quite a while.)
Many embedded SoC controllers that are (more or less) sdhci-compatible don't
implement card detect, and the related values in the PRESENT_STATE register
aren't useful. A bridge driver can now implement get_card_present() to read
a gpio pin or whatever else is necessary for that system.
The default implementation reads the CARD_PRESENT bit from the PRESENT_STATE
register, so existing drivers will keep working (or keep not-fully-working,
since many drivers right now can't detect card insert/remove).
sched_*(2) syscalls might be not available at runtime. Defining this
constant as zero directs POSIX-compliant code to call sysconf(3) to
detect the feature at runtime, and forces libc sysconf(3) to ask
kernel.
Noted by: ngie
Reviewed by: jilles, ngie
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D9055
Siena has limitation on maximum byte count and 4k boundary crosssing
(which is stricter than maximum byte count).
EF10 has limitation on maximum byte count only.
Reviewed by: philip
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D9061
It is safer to consider EFX_LINK_UNKNOWN as link down.
link_mode is set to EFX_LINK_UNKNOWN on port stop and fini.
Reviewed by: philip
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D9060
* Add the VHT capability element to the driver capabilities so ifconfig
can see if VHT is available
* Add ioctl plumbing for enabling/disabling VHT and each of the VHT
widths.
Note: this DOES change the ABI (the driver caps ioctl struct size, sigh)
so this will require a recompile of at least ifconfig.
In preparation for VHT station support, we need to store VHT IEs when
scanning so we can choose to upgrade to VHT.
This doesn't change the ABI - it just steals spare[] entries.
The VHT operational element (VHTOPMODE) isn't a uint32_t - it's
the MCS sets, freq1/freq2 parameters and channel width.
So, store the channel width too in lieu of just storing the
IE struct.
This changes the VHT parameter layout in ieee80211_node but it
doesn't change ABI at all.
The 11n code uses these bits for both configuration /and/ controlling
the channel width on softmac chips - it uses it to find the widest
width for all VAPs (eg a HT20 vap and a HT40 vap) to know what to
configure the ic_curchan.
For fullmac devices it isn't /as/ important, as each virtual device
exposed by the firmware will likely have its own configuration and the
firmware figures out what to do to enable it.
These came from Linux mac80211 headers and are configuration bits, not
VHTOPMODE field parameters.
Whilst here, add the field names for the VHTCAP bits.
Tested:
* ath10k, 11ac STA mode
Add a MSG_MOREOTOCOME message flag. When this flag is set, sosend*
set PRUS_MOREOTOCOME when invoking the protocol send method. The aio
worker tasks for sending on a socket set this flag when there are
additional write jobs waiting on the socket buffer.
Reviewed by: adrian
MFC after: 1 month
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D8955
Build and install an o32 set of libraries on mips64 suitable for
running o32 binaries via COMPAT_FREEBSD32. Enable COMPAT_FREEBSD32 in
MALTA64.
Reviewed by: jmallett, imp
Sponsored by: DARPA / AFRL
Differential Revision: https://reviews.freebsd.org/D9032
If tmpfs vnode is only shared locked, tn_status field still needs
updates to note the access time modification. Use the same locking
scheme as for UFS, protect tn_status with the node interlock + shared
vnode lock.
Fix nearby style.
Noted and reviewed by: mjg
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
sys/ptrace.h includes sys/signal.h, which includes sys/_sigset.h.
Note that sys/_sigset.h only defines osigset_t if COMPAT_43 was defined.
Two lines later, sys/ptrace.h includes machine/reg.h, which in case of
powerpc, includes opt_compat.h.
After the include headers reordering in r311345, we have sys/ptrace.h
included before sys/sysproto.h.
If COMPAT_43 was requested in the kernel config, the result is that
sys/_sigset.h does not define osigset_t, but sys/sysproto.h sees
COMPAT_43 and uses osigset_t.
Fix this by explicitely including opt_compat.h to cover the whole
kern/kern_exec.c scope.
Sponsored by: The FreeBSD Foundation
This ensures the interface is initialized by the interface driver
before it can be used by the rest of the system.
Reviewed by: jhb, karels, gnn
MFC after: 3 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D8905
The format strings weren't checked when stacksave_subr() used a function
pointer for printf instead of directly using db_printf().
Reported by: kib
Sponsored by: DARPA / AFRL
Do not hardcode elf64-tradbigmips as output format in BERI linker scrips.
Unfortunately, in-tree toolchain and external newer versions of binutils
mean two different things under that. When creating elf binaries using
external toolchain, gcc uses elf64-tradbigmips-freebsd and so linker
script file has to match in order for ld to be able to create the final loader
binary.
Rather than trying to guess, remove hardcoded output format directive from
the linker directive files and use CC to invoke the linker instead.
Reviewed by: brooks
Differential Revision: https://reviews.freebsd.org/D9050
Armada38x is already supported in the tree.
This commit adds support for DB-AP board.
File was taken from Linux v4.8 and accustomed to FreeBSD
in minimal possible way.
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Differential revision: https://reviews.freebsd.org/D7327
ClearFog is equipped with Marvell Armada 388 SoC, which is already
supported in FreeBSD.
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Differential revision: https://reviews.freebsd.org/D7326
Right now size of the structure is 472 bytes on amd64, which is
already large and stack allocations are indesirable. With the ino64
work, MNAMELEN is increased to 1024, which will make it impossible to have
struct statfs on the stack.
Extracted from: ino64 work by gleb
Discussed with: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Hardware buffer management entries are not used yet by FreeBSD.
They were added for compliance with Linux Armada 38x device tree
representation and will be used in future network support.
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Differential revision: https://reviews.freebsd.org/D8179
- recognize ports and vlangroups based on DTS file
- support multi-chip addresing mode (required in upcoming
Armada-388-Clearfog support)
- refactor attachment function
Each port in 'dsa' node should have 'vlangroup' property. Otherwise,
e6000sw will fail to attach.
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Konrad Adamczyk <ka@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Differential revision: https://reviews.freebsd.org/D7328
* rename the ieee80211com field for vht mcsinfo to be ic_, not iv;
* add a vht config field, stealing from the spares I left there.
This doesn't change the ABI.
The HT40 channel population logic was "just" doing pairs of channels starting with
the band entry frequency. Trouble is, a lot of the rules start way off at 5120MHz,
which isn't a valid 5GHz channel. Then, eg for HT40U, it would populate:
* (5120,5140)
* (5160,5180)
* (5200,5220)
* (5240,5260)
.. as the HT40U pairs, with the first being the primary channel. Channel 36
is 5180MHz, and since it's not a primary channel here, it wouldn't populate it.
Then, the next HT40U would be 5200/5220, which is highly wrong.
HT40D had the same problem.
So, this just forces that 5GHz HT40 channels start at channel 36 (5180),
no matter what the band edge says. This includes eg doing 4.9GHz channels.
This erm, meant that the HT40 channels for the low band was always wrong.
Oops!
Tested:
* AR9380, STA mode
* AR9344 SoC, AP mode
MFC after: 1 week
Upon each execve, we allocate a KVA range for use in copying data to the
new image. Pages must be faulted into the range, and when the range is
freed, the backing pages are freed and their mappings are destroyed. This
is a lot of needless overhead, and the exec_map management becomes a
bottleneck when many CPUs are executing execve concurrently. Moreover, the
number of available ranges is fixed at 16, which is insufficient on large
systems and potentially excessive on 32-bit systems.
The new allocator reduces overhead by making exec_map allocations
persistent. When a range is freed, pages backing the range are marked clean
and made easy to reclaim. With this change, the exec_map is sized based on
the number of CPUs.
Reviewed by: kib
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D8921
Previously, the stack unwinder tried to locate the start of the function
in each frame by walking backwards until it found an instruction that
modified the stack pointer and then assumed that was the first instruction
in a function. The unwinder would only print a function name if the
starting instruction's address was an exact match for a symbol name.
However, not all functions generated by modern compilers start off functions
with that instruction. For those functions, the unwinder would fail to
find a matching function name. As a result, most frames in a stack
trace would be printed as raw hex PC's instead of a function name.
Stop depending on this incorrect assumption and just use db_printsym()
like other platforms to display the function name and offset for each
frame. This generates a far more useful stack trace.
While here, don't print out curproc's pid at the end of the trace. The
pid was always from curproc even if tracing some other process.
In addition, remove some rotted comments about hardcoded constants that
are no longer hardcoded.
Sponsored by: DARPA / AFRL
There was a single call to stacktrace() under an #ifdef DEBUG to obtain
a stack trace during a fault that resulted in a function pointer to a
printf function being passed to stacktrace_subr() in db_trace.c. The
kernel now has existing interfaces for obtaining a stack trace outside
of DDB (kdb_backtrace(), or the stack_*() API) that should be used instead.
Rather than fix the one call however, remove it since the kernel will
dump a trace anyway once it panics.
Make stacktrace_subr() static, remove the function pointer and change it
to use db_printf() explicitly.
Discussed with: kan
Sponsored by: DARPA / AFRL
Use the trapframe unwinder recently added for kernel stack overflow
panics for frames crossing MipsKernGenException and MipsKernIntr.
This provides more reliably unwinding across nested interrupts and
exceptions in the kernel.
While here, dump the value of the CAUSE and BADVADDR registers when
crossing a trapframe.
Submitted by: rwatson (original version)
Obtained from: CheriBSD
Sponsored by: DARPA / AFRL
The sim_vid, hba_vid, and dev_name fields of struct ccb_pathinq are
fixed-length strings. AFAICT the only place they're read is in
sbin/camcontrol/camcontrol.c, which assumes they'll be null-terminated.
However, the kernel doesn't null-terminate them. A bunch of copy-pasted code
uses strncpy to write them, and doesn't guarantee null-termination. For at
least 4 drivers (mpr, mps, ciss, and hyperv), the hba_vid field actually
overflows. You can see the result by doing "camcontrol negotiate da0 -v".
This change null-terminates those fields everywhere they're set in the
kernel. It also shortens a few strings to ensure they'll fit within the
16-character field.
PR: 215474
Reported by: Coverity
CID: 1009997 1010000 1010001 1010002 1010003 1010004 1010005
CID: 1331519 1010006 1215097 1010007 1288967 1010008 1306000
CID: 1211924 1010009 1010010 1010011 1010012 1010013 1010014
CID: 1147190 1010017 1010016 1010018 1216435 1010020 1010021
CID: 1010022 1009666 1018185 1010023 1010025 1010026 1010027
CID: 1010028 1010029 1010030 1010031 1010033 1018186 1018187
CID: 1010035 1010036 1010042 1010041 1010040 1010039
Reviewed by: imp, sephe, slm
MFC after: 4 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D9037
Differential Revision: https://reviews.freebsd.org/D9038
returns memory which must be freed, regardless of the error. Assign
NULL to *buf in case we are not going to allocate any memory due to
invalid mode.
Reported and tested by: pho
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks (together with r310638)
Differential revision: https://reviews.freebsd.org/D9042
a symlink and an autofs mount request. The crash was caused by namei()
calling bcopy() with a negative length, caused by numeric underflow:
in lookup(), in the relookup path, the ni_pathlen was decremented too
many times. The bug was introduced in r296715.
Big thanks to Alex Deiter for his help with debugging this.
Reviewed by: kib@
Tested by: Alex Deiter <alex.deiter at gmail.com>
MFC after: 1 month
observed to fix any actual error, but it's the right thing to do
from the correctness point of view.
Tested by: Eugene M. Zheganin <emz at norma.perm.ru>
MFC after: 1 month