This fixes a bug that caused a warning in the userland stack
when compiled on Windows.
Thanks to Peter Kasting from Google for reporting the issue and
providing a potential fix.
MFC after: 3 days
towards blind SYN/RST spoofed attack.
Originally our stack used in-window checks for incoming SYN/RST
as proposed by RFC793. Later, circa 2003, the RST attack was
mitigated using the technique described in P. Watson's
"Slipping in the window" paper [1].
After that, the checks were only relaxed for the sake of
compatibility with some buggy TCP stacks. First, r192912
introduced the vulnerability just fixed by the aforementioned SA.
Second, r167310 had slightly relaxed the default RST checks,
instead of utilizing the net.inet.tcp.insecure_rst sysctl.
In 2010 a new technique for mitigation of these attacks was
proposed in RFC5961 [2]. The idea is to send a "challenge ACK"
packet to the peer, to verify that the packet that arrived isn't
spoofed. If the peer receives a challenge ACK, it should regenerate
its RST or SYN with the correct sequence number. This should not
only protect against attacks, but also improve communication with
broken stacks, so the authors of the reverted r167310 and r192912
won't be disappointed.
[1] http://bandwidthco.com/whitepapers/netforensics/tcpip/TCP Reset Attacks.pdf
[2] http://www.rfc-editor.org/rfc/rfc5961.txt
Changes made:
o Revert r167310.
o Implement "challenge ACK" protection as specified in RFC5961
against RST attack. On by default.
- Carefully preserve r138098, which handles empty window edge
case, not described by the RFC.
- Update net.inet.tcp.insecure_rst description.
o Implement "challenge ACK" protection as specified in RFC5961
against SYN attack. On by default.
- Provide net.inet.tcp.insecure_syn sysctl, to turn off
RFC5961 protection.
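
The RST handling described above can be sketched roughly as follows.
This is a minimal illustration of the three cases in RFC 5961
section 3.2, not the code from this commit: the names classify_rst
and enum rst_action are hypothetical, and the empty window edge case
preserved from r138098 is not modelled.

```c
#include <stdint.h>

/* Possible outcomes for an incoming RST segment (illustrative names). */
enum rst_action {
	RST_DROP,		/* outside the receive window: ignore */
	RST_ACCEPT,		/* exact match with rcv_nxt: reset */
	RST_CHALLENGE_ACK	/* in window, not exact: challenge peer */
};

/*
 * Classify an RST by its sequence number.  Unsigned subtraction keeps
 * the window comparison correct when sequence numbers wrap at 2^32.
 */
enum rst_action
classify_rst(uint32_t seg_seq, uint32_t rcv_nxt, uint32_t rcv_wnd)
{
	if (seg_seq == rcv_nxt)
		return (RST_ACCEPT);
	if ((uint32_t)(seg_seq - rcv_nxt) < rcv_wnd)
		return (RST_CHALLENGE_ACK);
	return (RST_DROP);
}
```

A legitimate peer that receives the challenge ACK resends its RST
with the exact sequence number, which then falls into the first case.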
The changes were tested at Netflix. The tested box didn't show
any anomalies compared to the control box, except a slightly
increased number of TCP connections in the LAST_ACK state.
Reviewed by: rrs
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
in order to improve user-friendliness when a system has multiple disks
encrypted using the same passphrase.
When examining a new GELI provider, the most recently used passphrase
will be attempted before prompting for a passphrase; and whenever a
passphrase is entered, it is cached for later reference. When the root
disk is mounted, the cached passphrase is zeroed (triggered by the
"mountroot" event), in order to minimize the possibility of leakage
of passphrases. (After root is mounted, the "taste and prompt for
passphrases on the console" code path is disabled, so there is no
potential for a passphrase to be stored after the zeroing takes place.)
This behaviour can be disabled by setting kern.geom.eli.boot_passcache=0.
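
The cache-then-zero behaviour can be pictured with a small userland
sketch. Everything here is an assumption made for illustration: the
passcache_* names, the PASS_MAX size and the single static buffer are
hypothetical and are not taken from the GELI sources.

```c
#include <stddef.h>
#include <string.h>

#define PASS_MAX 256	/* hypothetical buffer size */

/* An empty string means "nothing cached". */
static char cached_pass[PASS_MAX];

/* Remember the passphrase that was just entered; 0 on success. */
int
passcache_store(const char *pass)
{
	size_t len = strlen(pass);

	if (len >= sizeof(cached_pass))
		return (-1);
	memcpy(cached_pass, pass, len + 1);
	return (0);
}

/* Return the passphrase to try first, or NULL if none is cached. */
const char *
passcache_peek(void)
{
	return (cached_pass[0] != '\0' ? cached_pass : NULL);
}

/*
 * Wipe the cache; stands in for the "mountroot" event handler.
 * Writing through a volatile pointer discourages the compiler from
 * optimizing the wipe away.
 */
void
passcache_zero(void)
{
	volatile char *p = cached_pass;
	size_t i;

	for (i = 0; i < sizeof(cached_pass); i++)
		p[i] = 0;
}
```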
Reviewed by: pjd, dteske, allanjude
MFC after: 7 days
POSIX compliance and to improve compatibility with Linux and NetBSD
The issue was identified with lib/libc/sys/t_access:access_inval from
NetBSD
Update the manpage accordingly
PR: 181155
Reviewed by: jilles (code), jmmv (code), wblock (manpage), wollman (code)
MFC after: 4 weeks
Phabric: D678 (code), D786 (manpage)
Sponsored by: EMC / Isilon Storage Division
. Hook e_lgammal[_r].c to the build.
. Create man page links for lgammal[_r].3.
* Symbol.map:
. Sort lgammal to its rightful place.
. Add FBSD_1.4 section for the new lgammal_r symbol.
* ld128/e_lgammal_r.c:
. 128-bit implementation of lgammal_r().
* ld80/e_lgammal_r.c:
. Intel 80-bit format implementation of lgammal_r().
* src/e_lgamma.c:
. Expose lgammal as a weak reference to lgamma for platforms
where long double is mapped to double.
* src/e_lgamma_r.c:
. Use integer literal constants instead of real literal constants.
Let compiler(s) do the job of conversion to the appropriate type.
. Expose lgammal_r as a weak reference to lgamma_r for platforms
where long double is mapped to double.
* src/e_lgammaf_r.c:
. Fixed the Cygnus Support conversion of e_lgamma_r.c to float.
This includes the generation of new polynomial and rational
approximations with fewer terms. For each approximation, include
a comment on an estimate of the accuracy over the relevant domain.
. Use integer literal constants instead of real literal constants.
Let compiler(s) do the job of conversion to the appropriate type.
This allows the removal of several explicit casts of double values
to float.
* src/e_lgammal.c:
. Wrapper for lgammal() around lgammal_r().
* src/imprecise.c:
. Remove the lgamma.
* src/math.h:
. Add a prototype for lgammal_r().
* man/lgamma.3:
. Document the new functions.
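
The point about integer literal constants can be seen in a small
sketch. The function names are hypothetical; the code only shows that
a float multiplied by an integer literal stays float, while a real
(double) literal promotes the expression to double and would need a
cast back to float.

```c
#include <stddef.h>

/* Size of the result type when a float is scaled by an int literal. */
size_t
size_with_int_literal(float x)
{
	return (sizeof(x * 2));		/* float * int -> float */
}

/* Size of the result type when a float is scaled by a double literal. */
size_t
size_with_double_literal(float x)
{
	return (sizeof(x * 2.0));	/* float * double -> double */
}
```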
Reviewed by: bde
The flowdirector feature shares on-chip memory with other things
such as the RX buffers. In theory it should be configured in a way
that doesn't interfere with the rest of operation. In practice,
the RX buffer calculation didn't take the flow-director allocation
into account and there'd be overlap. This led to various garbage
frames being received containing what looks like internal NIC state.
What _I_ saw was traffic ending up in the wrong RX queues.
When I ran a UDP traffic test with only one NIC ring receiving
traffic, everything was fine. When I fired up a second UDP stream
that came in on another ring, a few percent of the traffic from
both rings ended up in the wrong ring. I.e., the RSS hash would
indicate it was supposed to come in on ring X, but it'd come in on
ring Y.
However, when the allocation was fixed up, the developers at Verisign
still saw traffic stalls.
The flowdirector feature ends up fiddling with the NIC to do various
attempts at load balancing connections by populating flow table rules
based on sampled traffic. It's likely that all of that has to be
carefully reviewed and made less "magic".
So for now the flow director feature is disabled (which fixes both
what I was seeing and what they were seeing) until it's all much
more debugged and verified.
Tested:
* (me) 82599EB 2x10G NIC, RSS UDP testing.
* (verisign) not sure on the NIC (but likely 82599), 100k-200k/sec TCP
transaction tests.
Submitted by: Marc De La Gueronniere <mdelagueronniere@verisign.com>
MFC after: 1 week
Sponsored by: Verisign, Inc.
fmp->buf at the free point is already part of the chain being freed,
so double-freeing is counter-productive.
Submitted by: Marc De La Gueronniere <mdelagueronniere@verisign.com>
MFC after: 1 week
Sponsored by: Verisign, Inc.
This allows the NIC to drop frames on the receive queue and not
cause the MAC to block on receiving to _any_ queue.
Tested:
igb0@pci0:5:0:0: class=0x020000 card=0x152115d9 chip=0x15218086 rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
device = 'I350 Gigabit Network Connection'
class = network
subclass = ethernet
Discussed with: Eric Joyner <eric.joyner@intel.com>
MFC after: 1 week
Sponsored by: Norse Corp, Inc.
- Fail with EINVAL if an invalid protection mask is passed to mmap().
- Fail with EINVAL if an unknown flag is passed to mmap().
- Fail with EINVAL if both MAP_PRIVATE and MAP_SHARED are passed to mmap().
- Require one of either MAP_PRIVATE or MAP_SHARED for non-anonymous
mappings.
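
The four checks can be sketched as a standalone validator. This is
only an illustration: the SK_* constants are stand-ins with arbitrary
values (not the real <sys/mman.h> ABI values), the set of "known"
flags is deliberately tiny, and check_mmap_args is a hypothetical
name rather than the kernel routine.

```c
#include <errno.h>

/* Sketch-local stand-ins; values are arbitrary, not the real ABI. */
enum {
	SK_PROT_READ	= 0x1,
	SK_PROT_WRITE	= 0x2,
	SK_PROT_EXEC	= 0x4,
	SK_MAP_SHARED	= 0x01,
	SK_MAP_PRIVATE	= 0x02,
	SK_MAP_FIXED	= 0x10,
	SK_MAP_ANON	= 0x20
};

#define SK_PROT_ALL	(SK_PROT_READ | SK_PROT_WRITE | SK_PROT_EXEC)
#define SK_FLAGS_ALL	(SK_MAP_SHARED | SK_MAP_PRIVATE | \
			 SK_MAP_FIXED | SK_MAP_ANON)

/* Return 0 if the arguments pass the four checks, EINVAL otherwise. */
int
check_mmap_args(int prot, int flags)
{
	int sharing = flags & (SK_MAP_SHARED | SK_MAP_PRIVATE);

	if ((prot & ~SK_PROT_ALL) != 0)		/* invalid protection */
		return (EINVAL);
	if ((flags & ~SK_FLAGS_ALL) != 0)	/* unknown flag */
		return (EINVAL);
	if (sharing == (SK_MAP_SHARED | SK_MAP_PRIVATE))
		return (EINVAL);		/* both at once */
	if ((flags & SK_MAP_ANON) == 0 && sharing == 0)
		return (EINVAL);		/* file mapping, neither */
	return (0);
}
```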
Reviewed by: alc, kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D698
Eliminate an exclusive object lock acquisition and release on the expected
execution path.
Do page zeroing before the object lock is acquired rather than during the
time that the object lock is held.
Use vm_pager_free_nonreq() to eliminate duplicated code.
Reviewed by: kib
MFC after: 6 weeks
Sponsored by: EMC / Isilon Storage Division
The suspend/resume of event channels is already handled by the xen_intr_pic.
If those methods are set on the PIRQ PIC they are just called twice, which
breaks proper resume. This fix restores migration of FreeBSD guests to a
working state.
Sponsored by: Citrix Systems R&D
by ffs and ext2fs. Remove duplicated call to vm_page_zero_invalid(),
done by VOP and by vm_pager_getpages(). Use vm_pager_free_nonreq().
Reviewed by: alc (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 6 weeks (after r271596)
Without clustering support we have only one group of permanently
active ports anyway, but that gives us one more supported VMware
feature. ;) Solaris' Comstar also reports it even when only one port
is present.
This change fixes transient performance drops in some of my benchmarks,
which vanished as soon as I tried to collect any stats from the
scheduler. It looks like reordered access to those variables sometimes
caused the loss of an IPI_PREEMPT, which delayed thread execution until
some later interrupt.
MFC after: 3 days
spaces, rather than a split address, we actually can't check for being within
the kernel's address range. Instead, do what other backtraces do, and use
trapexit()/asttrapexit() as the stack sentinel.
MFC after: 3 weeks
In the fdt data we've written for ourselves, the interrupt properties
for GIC interrupts have just been a bare interrupt number. In standard
data that conforms to the published bindings, GIC interrupt properties
contain 3-tuples that describe the interrupt as shared vs private, the
interrupt number within the shared/private address space, and configuration
info such as level vs edge triggered.
The new gic_decode_fdt() function parses both types of data, based on the
#interrupt-cells property. Previously, each platform implemented a decode
routine and put a pointer to it into fdt_pic_table. Now they can just
list this function in their table instead if they use arm/gic.c.
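
A rough sketch of decoding both formats follows. It is not the code
from arm/gic.c: decode_gic_intr is a hypothetical name, the cells are
assumed to be already converted from the FDT's big-endian encoding to
host order, and the offsets (shared interrupts start at 32, private
ones at 16) follow the standard GIC binding.

```c
#include <stdint.h>

/*
 * Decode an interrupt property of ncells cells into a GIC interrupt
 * number.  Returns 0 on success, -1 for an unsupported format.
 */
int
decode_gic_intr(const uint32_t *cells, int ncells, uint32_t *irq)
{
	if (ncells == 1) {
		/* Legacy data: a bare interrupt number. */
		*irq = cells[0];
		return (0);
	}
	if (ncells == 3) {
		/* cells[0]: 0 = shared (SPI), 1 = private (PPI). */
		if (cells[0] == 0)
			*irq = cells[1] + 32;
		else if (cells[0] == 1)
			*irq = cells[1] + 16;
		else
			return (-1);
		/* cells[2] holds trigger/level flags; ignored here. */
		return (0);
	}
	return (-1);
}
```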
Set trunc store action to Expand for all X86 targets.
When compiling without SSE2, isTruncStoreLegal(F64, F32) would return
Legal, whereas with SSE2 it would return Expand. And since the Target
doesn't seem to actually handle a truncstore for double -> float, it
would just output a store of a full double in the space for a float
hence overwriting other bits on the stack.
Patch by Luqman Aden!
This should fix clang -O0 on i386 assigning garbage to floats, in
certain scenarios.
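
The effect can be modelled in plain C, as a sketch rather than
anything taken from LLVM. The struct below stands in for two adjacent
stack slots; store_expanded converts before storing, while
store_full_double imitates the bug by writing all the bytes of the
double into the float's slot. All names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* A float slot followed by a neighbour, like adjacent stack slots. */
struct slots {
	float		f;
	uint32_t	neighbour;
};

/* The sketch assumes the double exactly covers both slots. */
_Static_assert(sizeof(struct slots) == sizeof(double),
    "struct slots must be the size of a double");

/* Expanded truncating store: convert, then store a float. */
void
store_expanded(struct slots *s, double d)
{
	s->f = (float)d;
}

/* Model of the bug: the full double lands in the float's slot. */
void
store_full_double(struct slots *s, double d)
{
	memcpy(s, &d, sizeof(d));
}
```

After store_full_double() the float no longer holds the converted
value and the neighbouring slot has been overwritten as well.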
PR: 187437
Submitted by: cebd@gmail.com
Obtained from: http://llvm.org/viewvc/llvm-project?rev=217410&view=rev
MFC after: 3 days
path through the NFS clients' getpages functions.
Introduce vm_pager_free_nonreq(). This function can be used to eliminate
code that is duplicated in many getpages functions. Also, in contrast to
the code that currently appears in those getpages functions,
vm_pager_free_nonreq() avoids acquiring an exclusive object lock in one
case.
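
The duplicated pattern that such a helper replaces looks roughly like
the userland analogue below. It is only a sketch: struct page,
page_free and free_nonreq are hypothetical stand-ins, and the object
locking that the real function deals with is not modelled.

```c
/* Minimal stand-in for a VM page. */
struct page {
	int	freed;	/* set once the page has been released */
};

/* Stand-in for the kernel's page-release routine. */
void
page_free(struct page *p)
{
	p->freed = 1;
}

/* Release every page in m[0..count-1] except the requested one. */
void
free_nonreq(struct page **m, int reqpage, int count)
{
	int i;

	for (i = 0; i < count; i++)
		if (i != reqpage)
			page_free(m[i]);
}
```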
Reviewed by: kib
MFC after: 6 weeks
Sponsored by: EMC / Isilon Storage Division
The code had references to both intr_offset and intr_parent variable names
as referring to the parent interrupt node. The intr_parent variable
wasn't actually defined anywhere, but the only references to it were as
an argument to a macro that didn't use that argument in expansion, so
the undefined variable accidentally didn't cause an error.
The intr_parent name makes more sense in context, so change all occurrences
of intr_offset to intr_parent.
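
How an undefined name can slip through is easy to reproduce. In the
sketch below, NODE_ID is a hypothetical macro (not the one from this
code) that ignores its second argument, so the undeclared intr_parent
never reaches the compiler proper.

```c
/* Hypothetical macro that drops its second argument on expansion. */
#define NODE_ID(node, parent)	((node) + 0)

int
node_id_demo(void)
{
	int intr_offset = 7;

	/*
	 * intr_parent is never declared, but the preprocessor discards
	 * that argument, so this compiles without any diagnostic.
	 */
	return (NODE_ID(intr_offset, intr_parent));
}
```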