netpfil: Introduce PFIL_FWD flag
Forwarded packets passed through PFIL_OUT, which made it difficult for
firewalls to figure out if they were forwarding or producing packets. This in
turn is an issue for pf for IPv6 fragment handling: it needs to call
ip6_output() or ip6_forward() to handle the fragments. Figuring out which was
difficult (and until now, incorrect).
Having pfil distinguish the two removes an ugly piece of code from pf.
Introduce a new variant of the netpfil callbacks with a flags variable, which
has PFIL_FWD set for forwarded packets. This allows pf to reliably work out if
a packet is forwarded.
sus/modules/Makefile part of r329507 just removed ffec
while r331501 also added conditional clause for bcm283x_clkman
and bcm283x_pwm. Since they're part of another revision,
remove mi-merged chunk
These macros take an existing ioctl(2) command and replace the length
with the specified length or length of the specified type respectively.
These can be used to define commands for 32-bit compatibility with fewer
opportunities for cut-and-paste errors then a whole new definition.
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
pf: Improve ioctl validation
Ensure that multiplications for memory allocations cannot overflow, and
that we'll not try to allocate M_WAITOK for potentially overly large
allocations.
pf: Improve ioctl validation for DIOCRGETTABLES, DIOCRGETTSTATS, DIOCRCLRTSTATS and DIOCRSETTFLAGS
These ioctls can process a number of items at a time, which puts us at
risk of overflow in mallocarray() and of impossibly large allocations
even if we don't overflow.
Limit the allocation to required size (or the user allocation, if that's
smaller). That does mean we need to do the allocation with the rules
lock held (so the number doesn't change while we're doing this), so it
can't M_WAITOK.
Add 32-bit compat for ioctls that take struct ifgroupreq.
Use an accessor to access ifgr_group and ifgr_groups.
Use an macro CASE_IOC_IFGROUPREQ(cmd) in place of case statements such
as "case SIOCAIFGROUP:". This avoids poluting the switch statements
with large numbers of #ifdefs.
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14960
pf: Improve ioctl validation for DIOCIGETIFACES and DIOCXCOMMIT
These ioctls can process a number of items at a time, which puts us at
risk of overflow in mallocarray() and of impossibly large allocations
even if we don't overflow.
There's no obvious limit to the request size for these, so we limit the
requests to something which won't overflow. Change the memory allocation
to M_NOWAIT so excessive requests will fail rather than stall forever.
pf: Improve ioctl validation for DIOCRADDTABLES and DIOCRDELTABLES
The DIOCRADDTABLES and DIOCRDELTABLES ioctls can process a number of
tables at a time, and as such try to allocate <number of tables> *
sizeof(struct pfr_table). This multiplication can overflow. Thanks to
mallocarray() this is not exploitable, but an overflow does panic the
system.
Arbitrarily limit this to 65535 tables. pfctl only ever processes one
table at a time, so it presents no issues there.
This was just an artifact of my testing to ensure the option had the
desired effect on freebsd 11, both when enabled and when disabled.
Reported by: Thomas Mueller <tmueller@sysgo>
Point hat: ian@
r332372:
tail(1): Add some long options
Add --blocks, --bytes, and --lines long options for -b, -c, and -n
respectively. This improves tail(1)'s compatibility with its GNU counterpart
in a straightforward way.
r332373:
tail(1): Address mandoc concern (space before punctuation after macro)
r332374:
head(1): Provide long options
Provide long options --bytes and --lines to match -c and -n respectively.
This improves head(1)'s compatibility with its GNU counterpart in a sensible
way.
Exit with usage when extra arguments are on command line
preventing mistakes such as "halt 0p" for "halt -p".
Approved by: bde (mentor, implicit), phk (mentor,implicit)
MFC after: 1 week
r308432: Capsicumize some trivial stdio programs
Trivially capsicumize some simple programs that just interact with
stdio. This list of programs uses 'pledge("stdio")' in OpenBSD.
r308657: fold(1): Revert incorrect r308432
As Jean-Sébastien notes, fold(1) requires handling argv-supplied files. That
will require a slightly more sophisticated approach.
r222319 in newfs raised the default blocksize for UFS/FFS filesystems
from 16K to 32K and the default fragment size from 2K to 4K, with a
rationale that most disks were now running with 4K sectors.
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
After multiple start/stop of netmap, ixgbe will get into a bad state
requiring a reboot to recover. Adding a delay before stopping the interface
appears to work around the issue.
The -CURRENT driver has diverged too far from -STABLE for an MFC.
PR: 221317
Submitted by: Sylvain Galliano <sg@efficientip.com>
Reported by: Cassiano Peixoto <peixoto.cassiano@gmail.com>
Sponsored by: Limelight Networks
328101:
Require the SHF_ALLOC flag for program sections from kernel object modules.
ELF object files can contain program sections which are not supposed
to be loaded into memory (e.g. .comment). Normally the static linker
uses these flags to decide which sections are allocated to loadable
program segments in ELF binaries and shared objects (including kernels
on all architectures and kernel modules on architectures other than
amd64).
Mapping ELF object files (such as amd64 kernel modules) into memory
directly is a bit of a grey area. ELF object files are intended to be
used as inputs to the static linker. As a result, there is not a
standardized definition for what the memory layout of an ELF object
should be (none of the section headers have valid virtual memory
addresses for example).
The kernel and loader were not checking the SHF_ALLOC flag but loading
any program sections with certain types such as SHT_PROGBITS. As a
result, the kernel and loader would load into RAM some sections that
weren't marked with SHF_ALLOC such as .comment that are not loaded
into RAM for kernel modules on other architectures (which are
implemented as ELF shared objects). Aside from possibly requiring
slightly more RAM to hold a kernel module this does not affect runtime
correctness as the kernel relocates symbols based on the layout it
uses.
Debuggers such as gdb and lldb do not extract symbol tables from a
running process or kernel. Instead, they replicate the memory layout
of ELF executables and shared objects and use that to construct their
own symbol tables. For executables and shared objects this works
fine. For ELF objects the current logic in kgdb (and probably lldb
based on a simple reading) assumes that only sections with SHF_ALLOC
are memory resident when constructing a memory layout. If the
debugger constructs a different memory layout than the kernel, then it
will compute different addresses for symbols causing symbols in the
debugger to appear to have the wrong values (though the kernel itself
is working fine). The current port of mdb does not check SHF_ALLOC as
it replicates the kernel's logic in its existing kernel support.
The bfd linker sorts the sections in ELF object files such that all of
the allocated sections (sections with SHF_ALLOCATED) are placed first
followed by unallocated sections. As a result, when kgdb composed a
memory layout using only the allocated sections, this layout happened
to match the layout used by the kernel and loader. The lld linker
does not sort the sections in ELF object files and mixed allocated and
unallocated sections. This resulted in kgdb composing a different
memory layout than the kernel and loader.
We could either patch kgdb (and possibly in the future lldb) to use
custom handling when generating memory layouts for kernel modules that
are ELF objects, or we could change the kernel and loader to check
SHF_ALLOCATED. I chose the latter as I feel we shouldn't be loading
things into RAM that the module won't use. This should mostly be a
NOP when linking with bfd but will allow the existing kgdb to work
with amd64 kernel modules linked with lld.
Note that we only require SHF_ALLOC for "program" sections for types
like SHT_PROGBITS and SHT_NOBITS. Other section types such as symbol
tables, string tables, and relocations must also be loaded and are not
marked with SHF_ALLOC.
328911:
Ignore relocation tables for non-memory-resident sections.
As a followup to r328101, ignore relocation tables for ELF object
sections that are not memory resident. For modules loaded by the
loader, ignore relocation tables whose associated section was not
loaded by the loader (sh_addr is zero). For modules loaded at runtime
via kldload(2), ignore relocation tables whose associated section is
not marked with SHF_ALLOC.
Rework ipfw dynamic states implementation to be lockless on fast path.
o added struct ipfw_dyn_info that keeps all needed for ipfw_chk and
for dynamic states implementation information;
o added DYN_LOOKUP_NEEDED() macro that can be used to determine the
need of new lookup of dynamic states;
o ipfw_dyn_rule now becomes obsolete. Currently it used to pass
information from kernel to userland only.
o IPv4 and IPv6 states now described by different structures
dyn_ipv4_state and dyn_ipv6_state;
o IPv6 scope zones support is added;
o ipfw(4) now depends from Concurrency Kit;
o states are linked with "entry" field using CK_SLIST. This allows
lockless lookup and protected by mutex modifications.
o the "expired" SLIST field is used for states expiring.
o struct dyn_data is used to keep generic information for both IPv4
and IPv6;
o struct dyn_parent is used to keep O_LIMIT_PARENT information;
o IPv4 and IPv6 states are stored in different hash tables;
o O_LIMIT_PARENT states now are kept separately from O_LIMIT and
O_KEEP_STATE states;
o per-cpu dyn_hp pointers are used to implement hazard pointers and they
prevent freeing states that are locklessly used by lookup threads;
o mutexes to protect modification of lists in hash tables now kept in
separate arrays. 65535 limit to maximum number of hash buckets now
removed.
o Separate lookup and install functions added for IPv4 and IPv6 states
and for parent states.
o By default now is used Jenkinks hash function.
Obtained from: Yandex LLC
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D12685
Rework ipfw rules parsing and printing code.
Introduce show_state structure to keep information about printed opcodes.
Split show_static_rule() function into several smaller functions. Make
parsing and printing opcodes into several passes. Each printed opcode
is marked in show_state structure and will be skipped in next passes.
Now show_static_rule() function is simple, it just prints each part
of rule separately: action, modifiers, proto, src and dst addresses,
options. The main goal of this change is avoiding occurrence of wrong
result of `ifpw show` command, that can not be parsed by ipfw(8).
Also now it is possible to make some simple static optimizations
by reordering of opcodes in the rule.
PR: 222705
r329388:
Define CK_MD_TSO for the relevant arches (i386, amd64 and sparc64).
Defaulting to CK_MD_RMO has the unfortunate side effect of generating
memory barriers that are useless on those arches, and the even more
unfortunate side effect of generating lfence/sfence/mfence on i386, even
if older CPUs don't support it.
This should fix the panic reported when using IPFW on a Pentium 3.
Note that mfence and sfence might still be used in a few case, but that
shouldn't happen in FreeBSD right now, and should be fixed upstream first.
r331441:
In __sync_bool_compare_and_swap(), return true if the returned value is the
same as the expected one, not the desired one.
r331898:
Import CK as of commit b19ed4c6a56ec93215ab567ba18ba61bf1cfbac8
It should fix ck_pr_[load|store]_ptr on mips and riscv, make sure no
*fence instructions are used on i386, as older cpus don't support it, and
make sure we don't rely on gcc builtins that can lead to calls to
libatomic when linked with -O0.
r319828:
rc.subr: Optimize repeated sourcing.
When /etc/rc runs all /etc/rc.d scripts, it has already loaded /etc/rc.subr
but each /etc/rc.d script sources it again (since /etc/rc.d scripts must
also work when started stand-alone).
Therefore, if rc.subr is already loaded, return so sh need not parse the
rest of the file.
A second effect is that there is no longer a compound command around most of
rc.subr. This reduces memory usage while sh is loading rc.subr for the first
time (but this memory is free()d once rc.subr is loaded).
For purposes of porting this to other systems, I do not recommend porting
this to systems with shells that do not have the change to the return
special builtin like in r255215 (before FreeBSD 10.0-RELEASE). This change
ensures that return in the top level of a dot script returns from the dot
script, even if the dot script was sourced from a function.
A comparison of CPU time on an amd64 bhyve virtual machine from a times
command added near the end of /etc/rc, all four values summed:
x orig1
+ quickreturn
+--------------------------------------------------------------------------+
| + + + x x x|
||______M__A_________| |______M___A__________| |
+--------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 3 1.704 1.802 1.726 1.744 0.051419841
+ 3 1.467 1.559 1.487 1.5043333 0.048387326
Difference at 95.0% confidence
-0.239667 +/- 0.113163
-13.7424% +/- 6.48873%
(Student's t, pooled s = 0.0499266)
r324625:
rc.subr: Remove test that is always true.
The code above always sets _pidcmd to a non-empty value.
r309350:
If the kenv variable rc_debug is set, turn on rc_debug.
r309352:
Finish incomplete comments in prior revision. I was going to fix this
after I tested it, but didn't.
r308896: rc.subr: $(ps -p $$ -o jid=) is always 0, so do not fork ps for it.
The JID keyword writes 0 for a process also in the host system or in the
same jail.
There are logistics issues that weren't considered when this was originally
MFC'd. All rc scripts in ports need audited (this is in progress) for usage
of ${name}_limits that doesn't line up with the new interpretation, and
individual rc.conf(5)'s need to be scrubbed of usage that doesn't line up.
It's since been decided that it should be left for a feature in 12.
1101514 introduced interpretation of ${name}_limits for rc scripts; this
feature no longer exists as of 1101515.
For mails which has a body not respecting RFC2822 (which often happen with
crontabs) try to split by words finding the last space before 1000's
character
If no spaces are found then consider the mail to be malformed anyway
PR: 208261
I missed that this was an assignment (a bad pattern, use another
member) on i386. As wl(4) is i386 only and gone in head, just
expand the ifr_ifru member rather than adding an accessor.
Reported by: gjb
ifconf(): correct handling of sockaddrs smaller than struct sockaddr.
Portable programs that use SIOCGIFCONF (e.g. traceroute) assume
that each pseudo ifreq is of length MAX(sizeof(struct ifreq),
sizeof(ifr_name) + ifr_addr.sa_len). For short sockaddrs we copied
too much from the source sockaddr resulting in a heap leak.
I believe only one such sockaddr exists (struct sockaddr_sco which
is 8 bytes) and it is unclear if such sockaddrs end up on interfaces
in practice. If it did, the result would be an 8 byte heap leak on
current architectures.
admbugs: 869
Reviewed by: kib
Obtained from: CheriBSD
Security: kernel heap leak
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14981
pf: Fix memory leak in DIOCRADDTABLES
If a user attempts to add two tables with the same name the duplicate table
will not be added, but we forgot to free the duplicate table, leaking memory.
Ensure we free the duplicate table in the error path.
Reported by: Coverity
CID: 1382111
pthread.h: drop nullability attributes.
These have been found to be practically useless. We were actually
following the Android bionic library and had some interest in replicating
the same warnings and behaviour but Android has since removed them.
We are still keeping some uses of nullability attributes in other headers,
somewhat in line with Apple's libc.
Hinted by: bionic (git 3f66e74b903905e763e104396aff52a81718cfde)
Retpoline is a compiler-based mitigation for CVE-2017-5715, also known
as Spectre V2, that protects against speculative execution branch target
injection attacks.
In this commit it is disabled by default, but will be changed in a
followup commit.
MFC r330962: Remove KERNEL_RETPOLINE from BROKEN_OPTIONS on i386
Clang will compile both amd64 and i386 with retpoline.
Sponsored by: The FreeBSD Foundation