When using route-to (or reply-to) pf sends the packet directly to the output
interface. If that interface doesn't support checksum offloading the checksum
has to be calculated in software.
That was already done in the IPv4 case, but not for the IPv6 case. As a result
we'd emit packets with pseudo-header checksums (i.e. incorrect checksums).
This issue was exposed by the changes in r289316 when pf stopped performing full
checksum calculations for all packets.
Submitted by: Luoqi Chen
MFC after: 1 week
pmap_change_attr must change the memory type of both the requested KVA
and the corresponding DMAP mappings (if such mappings exist), to satisfy
an Intel requirement that two or more mappings to the same physical
pages must have the same memory type.
However, not all kernel mapped pages have corresponding DMAP mappings --
for example, 64-bit BARs. Skip fixing up the DMAP for out-of-bounds
addresses.
Submitted by: Steve Wahl <steve_wahl@dell.com>
Reviewed by: alc, jhb
Sponsored by: Dell Compellent
Differential Revision: https://reviews.freebsd.org/D4030
When destroying a character device the si_devsw field is set to NULL
before all references are gone, to indicate the character device is
going away. This can cause a NULL-dereference fault inside physio().
The callers of physio() should own a thread reference on the cdev and
if si_devsw is seen as non-NULL, it is usable during the execution of
the function. Else an ENXIO error code is returned.
Reviewed by: kib
MFC after: 2 weeks
- Move all files related to the LinuxKPI into sys/compat/linuxkpi and
its subfolders.
- Update sys/conf/files and some Makefiles to use new file locations.
- Added description of COMPAT_LINUXKPI to sys/conf/NOTES which in turn
adds the LinuxKPI to all LINT builds.
- The LinuxKPI can be added to the kernel by setting the
COMPAT_LINUXKPI option. The OFED kernel option no longer builds the
LinuxKPI into the kernel. This was done to keep the build rules for
the LinuxKPI in sys/conf/files simple.
- Extend the LinuxKPI module to include support for USB by moving the
Linux USB compat from usb.ko to linuxkpi.ko.
- Bump the FreeBSD_version.
- A universe kernel build has been done.
Reviewed by: np @ (cxgb and cxgbe related changes only)
Sponsored by: Mellanox Technologies
AMD64 pmap assumes ranges will be in the DMAP, which isn't necessarily
true for NTB memory windows (especially 64-bit BARs).
Suggested by: pmap_change_attr_locked -> kassert_panic
Sponsored by: EMC / Isilon Storage Division
Allows DMA from/to arbitrary KVA or physical address. /dev/ioat_test
must be enabled by root and is only R/W root, so this is approximately
as dangerous as /dev/mem and /dev/kmem.
Sponsored by: EMC / Isilon Storage Division
suggested by RFC 6675.
Currently differnt places in the stack tries to guess this in suboptimal ways.
The main problem is that current calculations don't take sacked bytes into
account. Sacked bytes are the bytes receiver acked via SACK option. This is
suboptimal because it assumes that network has more outstanding (unacked) bytes
than the actual value and thus sends less data by setting congestion window
lower than what's possible which in turn may cause slower recovery from losses.
As an example, one of the current calculations looks something like this:
snd_nxt - snd_fack + sackhint.sack_bytes_rexmit
New proposal from RFC 6675 is:
snd_max - snd_una - sackhint.sacked_bytes + sackhint.sack_bytes_rexmit
which takes sacked bytes into account which is a new addition to the sackhint
struct. Only thing we are missing from RFC 6675 is isLost() i.e. segment being
considered lost and thus adjusting pipe based on that which makes this
calculation a bit on conservative side.
The approach is very simple. We already process each ack with sack info in
tcp_sack_doack() and extract sack blocks/holes out of it. We'd now also track
this new variable sacked_bytes which keeps track of total sacked bytes reported.
One downside to this approach is that we may get incorrect count of sacked_bytes
if the other end decides to drop sack info in the ack because of memory pressure
or some other reasons. But in this (not very likely) case also the pipe
calculation would be conservative which is okay as opposed to being aggressive
in sending packets into the network.
Next step is to use this more accurate pipe estimation to drive congestion
window adjustments.
In collaboration with: rrs
Reviewed by: jason_eggnet dot com, rrs
MFC after: 2 weeks
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D3971
For the most of chips (except anscient ones) port handlers have no relation
to port IDs. In such situation old code scanning first 125 handlers was
quite naive. Instead of doing that, send to chip single request to get full
list of port handlers available on specific virtual port and scan only them.
Old code had problems with case of several virtual ports enabled, when port
handlers allocated from global address space could easily go above 125.
This change was successfully tested on 23xx, 24xx and 25xx chips in loop
mode with 4 virtual initiator ports, each seing 50 virtual target ports.
This should make it easier to track down interrupt storms from arge.
Tested:
* AP135 (QCA955x) SoC - defaults to ARGE_DEBUG enabled
* Carambola2 (AR9331 SoC) - defaults to ARGE_DEBUG disabled
the dynamic linker copy them, but not relocate them at the new location.
This allows us to run sqlite3 without it crashing.
Sponsored by: ABT Systems Ltd
This call may be used when device cannot continue to operate normally
(e.g., throws firmware error, watchdog timer expires)
and need to be restarted.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D3998
window in number of segments on fly. It is set to 10 segments by default.
Remove net.inet.tcp.experimental.initcwnd10 which is now redundant. Also remove
the parent node net.inet.tcp.experimental as it's not needed anymore and also
because it was not well thought out.
Differential Revision: https://reviews.freebsd.org/D3858
In collaboration with: lstewart
Reviewed by: gnn (prev version), rwatson, allanjude, wblock (man page)
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Limelight Networks
* refactor out the rx filter and operating mode code into a separate
method.
* add some comments about what's left with setting the operating mode
based on what carl9170 does.
* comment out some init from otus_init_mac() - it's no longer needed as
it's always init'ed now.
* add debugging and a missing return around a failure to call m_get2() -
during monitor mode operation I found RXing of frames > 2k, which
fails allocation. I'm sure they're valid (it's configuring 11n RX and
receiving 11n frames even though the driver doesn't "do" 11n)
and may be A-MSDU; but allocations fail and we should handle that
gracefully.
Tested:
* UB82 reference NIC (AR9170 + AR9104 2x2 dual band NIC); STA and
monitor mode operation.
degradation (7%) for host host TCP connections over 10Gbps links,
even when there were no secuirty policies in place. There is no
change in performance on 1Gbps network links. Testing GENERIC vs.
GENERIC-NOIPSEC vs. GENERIC with this change shows that the new
code removes any overhead introduced by having IPSEC always in the
kernel.
Differential Revision: D3993
MFC after: 1 month
Sponsored by: Rubicon Communications (Netgate)