Commit Graph

126279 Commits

Author SHA1 Message Date
Alan Somers
2aaf9152a8 MFHead@r345275 2019-03-18 19:21:53 +00:00
Andrey V. Elsukov
d6369c2d18 Revert r345274. It appears that not all 32-bit architectures have
necessary CK primitives.
2019-03-18 14:00:19 +00:00
Andrey V. Elsukov
d7a1cf06f3 Update NAT64LSN implementation:
o most of data structures and relations were modified to be able support
  large number of translation states. Now each supported protocol can
  use full ports range. Ports groups now are belongs to IPv4 alias
  addresses, not hosts. Each ports group can keep several states chunks.
  This is controlled with new `states_chunks` config option. States
  chunks allow to have several translation states for single alias address
  and port, but for different destination addresses.
o by default all hash tables now use jenkins hash.
o ConcurrencyKit and epoch(9) is used to make NAT64LSN lockless on fast path.
o one NAT64LSN instance now can be used to handle several IPv6 prefixes,
  special prefix "::" value should be used for this purpose when instance
  is created.
o due to modified internal data structures relations, the socket opcode
  that does states listing was changed.

Obtained from:	Yandex LLC
MFC after:	1 month
Sponsored by:	Yandex LLC
2019-03-18 12:59:08 +00:00
Andrew Gallatin
f552633918 Fix a typo introduced in r344133
The line was misedited to change tt to st instead of
changing ut to st.

The use of st as the denominator in mul64_by_fraction() will lead
to an integer divide fault in the intr proc (the process holding
ithreads) where st will be 0.  This divide by 0 happens after
the total runtime for all ithreads exceeds 76 hours.

Submitted by: bde
2019-03-18 12:41:42 +00:00
Vincenzo Maffione
d12354a56c netmap: add support for multiple host rings
Some applications forward from/to host rings most or all the
traffic received or sent on a physical interface. In this
cases it is desirable to have more than a pair of RX/TX host
rings, and use multiple threads to speed up forwarding.
This change adds support for multiple host rings. On registering
a netmap port, the user can specify the number of desired receive
and transmit host rings in the nr_host_tx_rings and nr_host_rx_rings
fields of the nmreq_register structure.

MFC after:	2 weeks
2019-03-18 12:22:23 +00:00
Andrey V. Elsukov
5c04f73e07 Add NAT64 CLAT implementation as defined in RFC6877.
CLAT is customer-side translator that algorithmically translates 1:1
private IPv4 addresses to global IPv6 addresses, and vice versa.
It is implemented as part of ipfw_nat64 kernel module. When module
is loaded or compiled into the kernel, it registers "nat64clat" external
action. External action named instance can be created using `create`
command and then used in ipfw rules. The create command accepts two
IPv6 prefixes `plat_prefix` and `clat_prefix`. If plat_prefix is ommitted,
IPv6 NAT64 Well-Known prefix 64:ff9b::/96 will be used.

  # ipfw nat64clat CLAT create clat_prefix SRC_PFX plat_prefix DST_PFX
  # ipfw add nat64clat CLAT ip4 from IPv4_PFX to any out
  # ipfw add nat64clat CLAT ip6 from DST_PFX to SRC_PFX in

Obtained from:	Yandex LLC
Submitted by:	Boris N. Lytochkin
MFC after:	1 month
Relnotes:	yes
Sponsored by:	Yandex LLC
2019-03-18 11:44:53 +00:00
Andrey V. Elsukov
002cae78da Add SPDX-License-Identifier and update year in copyright.
MFC after:	1 month
2019-03-18 10:50:32 +00:00
Andrey V. Elsukov
b11efc1eb6 Modify struct nat64_config.
Add second IPv6 prefix to generic config structure and rename another
fields to conform to RFC6877. Now it contains two prefixes and length:
PLAT is provider-side translator that translates N:1 global IPv6 addresses
to global IPv4 addresses. CLAT is customer-side translator (XLAT) that
algorithmically translates 1:1 IPv4 addresses to global IPv6 addresses.
Use PLAT prefix in stateless (nat64stl) and stateful (nat64lsn)
translators.

Modify nat64_extract_ip4() and nat64_embed_ip4() functions to accept
prefix length and use plat_plen to specify prefix length.

Retire net.inet.ip.fw.nat64_allow_private sysctl variable.
Add NAT64_ALLOW_PRIVATE flag and use "allow_private" config option to
configure this ability separately for each NAT64 instance.

Obtained from:	Yandex LLC
MFC after:	1 month
Sponsored by:	Yandex LLC
2019-03-18 10:39:14 +00:00
Mark Johnston
783efeb544 Revert r345244 for now.
The code which advances the block number is simplistic and is not
correct when the starting offset is non-zero.  Revert the change until
this is fixed.
2019-03-18 05:03:55 +00:00
Andriy Voskoboinyk
f35b8451ba net80211: correct check for SMPS node flags updates
Update node flags when driver supports SMPS, not when it is disabled or
in dynamic mode ((iv_htcaps & HTCAP_SMPS) != 0).

Checked with RTL8188EE (1T1R), STA mode - 'smps' word should disappear
from 'ifconfig wlan0' output.

MFC after:	2 weeks
2019-03-18 02:40:22 +00:00
Konstantin Belousov
c51b496e0b i386: improve detection of the fast page fault assist.
In particular, check that we are assisting the page fault from kernel mode.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2019-03-17 18:31:48 +00:00
Mark Johnston
1a7f456a4b Fix the gcc build (-Wstrict-prototypes) after r345244.
Reported by:	jenkins
MFC with:	r345244
2019-03-17 18:06:13 +00:00
Mark Johnston
c676692c61 Optimize lseek(SEEK_DATA) on UFS.
The old implementation, at the VFS layer, would map the entire range of
logical blocks between the starting offset and the first data block
following that offset.  With large sparse files this is very
inefficient.  The VFS currently doesn't provide an interface to improve
upon the current implementation in a generic way.

Add ufs_bmap_seekdata(), which uses the obvious algorithm of scanning
indirect blocks to look for data blocks.  Use it instead of
vn_bmap_seekhole() to implement SEEK_DATA.

Reviewed by:	kib, mckusick
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19598
2019-03-17 17:34:06 +00:00
Justin Hibbits
e5657ef4cf fdt: Explicitly mark fdt_slicer as dependent on geom_flashmap
Without this dependency relationship, the linker doesn't find the
flash_register_slicer() function, so kldload fails to load fdt_slicer.ko.

Discussed with:	ian@
2019-03-17 04:33:17 +00:00
Dimitry Andric
b0840a28f6 Connect lib/libomp to the build.
* Set MK_OPENMP to yes by default only on amd64, for now.
* Bump __FreeBSD_version to signal this addition.
* Ensure gcc's conflicting omp.h is not installed if MK_OPENMP is yes.
* Update OptionalObsoleteFiles.inc to cope with the conflicting omp.h.
* Regenerate src.conf(5) with new WITH/WITHOUT fragments.

Relnotes:	yes
PR:		236062
MFC after:	1 month
X-MFC-With:	r344779
2019-03-16 15:45:15 +00:00
Konstantin Belousov
fd8d844f76 amd64 KPTI: add control from procctl(2).
Add the infrastructure to allow MD procctl(2) commands, and use it to
introduce amd64 PTI control and reporting.  PTI mode cannot be
modified for existing pmap, the knob controls PTI of the new vmspace
created on exec.

Requested by:	jhb
Reviewed by:	jhb, markj (previous version)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D19514
2019-03-16 11:44:33 +00:00
Konstantin Belousov
6f1fe3305a amd64: Add md process flags and first P_MD_PTI flag.
PTI mode for the process pmap on exec is activated iff P_MD_PTI is set.

On exec, the existing vmspace can be reused only if pti mode of the
pmap matches the P_MD_PTI flag of the process.  Add MD
cpu_exec_vmspace_reuse() callback for exec_new_vmspace() which can
vetoed reuse of the existing vmspace.

MFC note: md_flags change struct proc KBI.

Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D19514
2019-03-16 11:31:01 +00:00
Konstantin Belousov
c1c120b2cb amd64: fix switching to the pmap with pti disabled.
When the pmap with pti disabled (i.e. pm_ucr3 == PMAP_NO_CR3) is
activated, tss.rsp0 was not updated.  Any interrupt that happen before
next context switch would use pti trampoline stack for hardware frame
but fault and interrupt handlers are not prepared to this.  Correctly
update tss.rsp0 for both PMAP_NO_CR3 and pti pmaps.

Note that this case, pti = 1 but pmap->pm_ucr3 == PMAP_NO_CR3 is not
used at the moment.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D19514
2019-03-16 11:16:09 +00:00
Konstantin Belousov
a9262f497a amd64: rewrite cpu_switch.S fragment to reload tss.rsp0 on context switch.
New code avoids jumps.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D19514
2019-03-16 11:12:02 +00:00
Kristof Provost
812483c46e pf: Rename pfsync bucket lock
Previously the main pfsync lock and the bucket locks shared the same name.
This lead to spurious warnings from WITNESS like this:

    acquiring duplicate lock of same type: "pfsync"
     1st pfsync @ /usr/src/sys/netpfil/pf/if_pfsync.c:1402
     2nd pfsync @ /usr/src/sys/netpfil/pf/if_pfsync.c:1429

It's perfectly okay to grab both the main pfsync lock and a bucket lock at the
same time.

We don't need different names for each bucket lock, because we should always
only acquire a single one of those at a time.

MFC after:	1 week
2019-03-16 10:14:03 +00:00
Juli Mallett
ce92b1bf56 Remove obsolete wrappers for 64-bit loads/stores which were only used by the
removed (r342255) SiByte port.

Reviewed by:	imp
2019-03-16 06:09:45 +00:00
Conrad Meyer
54533f66c9 stack(9): Drop unused API mode and comment that referenced it
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D19601
2019-03-15 22:39:55 +00:00
Alexander Motin
6bb46107d8 MFV r336930: 9284 arc_reclaim_thread has 2 jobs
`arc_reclaim_thread()` calls `arc_adjust()` after calling
`arc_kmem_reap_now()`; `arc_adjust()` signals `arc_get_data_buf()` to
indicate that we may no longer be `arc_is_overflowing()`.

The problem is, `arc_kmem_reap_now()` can take several seconds to
complete, has no impact on `arc_is_overflowing()`, but due to how the
code is structured, can impact how long the ARC will remain in the
`arc_is_overflowing()` state.

The fix is to use seperate threads to:

1. keep `arc_size` under `arc_c`, by calling `arc_adjust()`, which
    improves `arc_is_overflowing()`

2. keep enough free memory in the system, by calling
 `arc_kmem_reap_now()` plus `arc_shrink()`, which improves
 `arc_available_memory()`.

illumos/illumos-gate@de753e34f9

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Dan McDonald <danmcd@joyent.com>
Reviewed by: Tim Kordas <tim.kordas@joyent.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: Brad Lewis <brad.lewis@delphix.com>
2019-03-15 18:59:04 +00:00
Gleb Smirnoff
b6f8cfb93a Deanonymize thread and proc state enums, so that a userland app can
use them without redefining the value names. New clang no longer
allows to redefine a enum value name to the same value.

Bump __FreeBSD_version, since ports depend on that.

Discussed with:	jhb
2019-03-15 18:18:05 +00:00
Kyle Evans
4920f9a348 if_bridge(4): Drop pointless rtflush
At this point, all routes should've already been dropped by removing all
members from the bridge. This condition is in-fact KASSERT'd in the line
immediately above where this nop flush was added.
2019-03-15 17:19:36 +00:00
Kyle Evans
6e6b93fe1d Revert r345192: Too many trees in play for bridge(4) bits
An accidental appendage was committed that has not undergone review yet.
2019-03-15 17:18:19 +00:00
Kyle Evans
4b4b284d95 if_bridge(4): Drop pointless rtflush
At this point, all routes should've already been dropped by removing all
members from the bridge. This condition is in-fact KASSERT'd in the line
immediately above where this nop flush was added.
2019-03-15 17:13:05 +00:00
Konstantin Belousov
39d70f6b80 Provide deterministic (and somewhat useful) value for RDPID result,
and for %ecx after RDTSCP.

Initialize TSC_AUX MSR with CPUID.  It allows for userspace to cheaply
identify CPU it was executed on some time ago, which is sometimes useful.

Note: The values returned might be changed in future.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-03-15 16:43:28 +00:00
Konstantin Belousov
7e0a345bc5 Add symbolic name for TSC_AUX MSR address.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2019-03-15 16:39:05 +00:00
Kristof Provost
43d3127ca7 bridge: Fix STP-related panic
After r345180 we need to have the appropriate vnet context set to delete an
rtnode in bridge_rtnode_destroy().
That's usually the case, but not when it's called by the STP code (through
bstp_notify_rtage()).

We have to set the vnet context in bridge_rtable_expire() just as we do in the
other STP callback bridge_state_change().

Reviewed by:	kevans
2019-03-15 15:52:36 +00:00
Kyle Evans
a87407ff85 if_bridge(4): Fix module teardown
bridge_rtnode_zone still has outstanding allocations at the time of
destruction in the current model because all of the interface teardown
happens in a VNET_SYSUNINIT, -after- the MOD_UNLOAD has already been
processed.  The SYSUNINIT triggers destruction of the interfaces, which then
attempts to free the memory from the zone that's already been destroyed, and
we hit a panic.

Solve this by virtualizing the uma_zone we allocate the rtnodes from to fix
the ordering. bridge_rtable_fini should also take care to flush any
remaining routes that weren't taken care of when dynamic routes were flushed
in bridge_stop.

Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D19578
2019-03-15 13:19:52 +00:00
Fedor Uporov
0204d1c793 Remove unneeded mount point unlock function calls.
The ext2_nodealloccg() function unlocks the mount point
in case of successful node allocation.
The additional unlocks are not required and should be removed.

PR:		236452
Reported by:	pho
MFC after:	3 days
2019-03-15 11:49:46 +00:00
Kristof Provost
d6747eafa9 bridge: Fix panic if the STP root is removed
If the spanning tree root interface is removed from the bridge we panic
on the next 'ifconfig'.
While the STP code is notified whenever a bridge member interface is
removed from the bridge it does not clear the bs_root_port. This means
bs_root_port can still point at an bridge_iflist which has been free()d.
The next access to it will panic.

Explicitly check if the interface we're removing in bstp_destroy() is
the root, and if so re-assign the roles, which clears bs_root_port.

Reviewed by:	philip
MFC after:	2 weeks
2019-03-15 11:21:20 +00:00
Kristof Provost
5904868691 pf :Use counter(9) in pf tables.
The counters of pf tables are updated outside the rule lock. That means state
updates might overwrite each other. Furthermore allocation and
freeing of counters happens outside the lock as well.

Use counter(9) for the counters, and always allocate the counter table
element, so that the race condition cannot happen any more.

PR:		230619
Submitted by:	Kajetan Staszkiewicz <vegeta@tuxpowered.net>
Reviewed by:	glebius
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D19558
2019-03-15 11:08:44 +00:00
Gleb Smirnoff
f355cb3e6f PFIL_MEMPTR for ipfw link level hook
With new pfil(9) KPI it is possible to pass a void pointer with length
instead of mbuf pointer to a packet filter. Until this commit no filters
supported that, so pfil run through a shim function pfil_fake_mbuf().

Now the ipfw(4) hook named "default-link", that is instantiated when
net.link.ether.ipfw sysctl is on, supports processing pointer/length
packets natively.

- ip_fw_args now has union for either mbuf or void *, and if flags have
  non-zero length, then we use the void *.
- through ipfw_chk() we handle mem/mbuf cases differently.
- ether_header goes away from args. It is ipfw_chk() responsibility
  to do parsing of Ethernet header.
- ipfw_log() now uses different bpf APIs to log packets.

Although ipfw_chk() is now capable to process pointer/length packets,
this commit adds support for the link level hook only, see
ipfw_check_frame(). Potentially the IP processing hook ipfw_check_packet()
can be improved too, but that requires more changes since the hook
supports more complex actions: NAT, divert, etc.

Reviewed by:	ae
Differential Revision:	https://reviews.freebsd.org/D19357
2019-03-14 22:52:16 +00:00
Gleb Smirnoff
dc0fa4f712 Remove 'dir' argument from dummynet_io(). This makes it possible to make
dn_dir flags private to dummynet. There is still some room for improvement.
2019-03-14 22:32:50 +00:00
Gleb Smirnoff
b00b7e03fd Reduce argument list to ipfw_divert(), as args holds the rule ref and
the direction. While here make 'tee' a bool.
2019-03-14 22:31:12 +00:00
Gleb Smirnoff
cef9f220cd Remove 'dir' argument in ng_ipfw_input, since ip_fw_args now has this info.
While here make 'tee' boolean.
2019-03-14 22:30:05 +00:00
Gleb Smirnoff
b7795b6746 - Add more flags to ip_fw_args. At this changeset only IPFW_ARGS_IN and
IPFW_ARGS_OUT are utilized. They are intented to substitute the "dir"
  parameter that is often passes together with args.
- Rename ip_fw_args.oif to ifp and now it is set to either input or
  output interface, depending on IPFW_ARGS_IN/OUT bit set.
2019-03-14 22:28:50 +00:00
Gleb Smirnoff
1830dae3d3 Make second argument of ip_divert(), that specifies packet direction a bool.
This allows pf(4) to avoid including ipfw(4) private files.
2019-03-14 22:23:09 +00:00
Gleb Smirnoff
2d0232783c Simplify ipfw_bpf_mtap2(). No functional change. 2019-03-14 22:20:48 +00:00
Kyle Evans
521b05ea52 ether_fakeaddr: Use 'b' 's' 'd' for the prefix
This has the advantage of being obvious to sniff out the designated prefix
by eye and it has all the right bits set. Comment stolen from ffec.

I've removed bryanv@'s pending question of using the FreeBSD OUI range --
no one has followed up on this with a definitive action, and there's no
particular reason to shoot for it and the administrative overhead that comes
with deciding exactly how to use it.
2019-03-14 19:48:43 +00:00
Konstantin Belousov
0e05133ab6 mips: remove dead comment and definitions.
Reviewed by:	brooks, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D19584
2019-03-14 19:07:41 +00:00
Kyle Evans
6b7e0c1cca ether: centralize fake hwaddr generation
We currently have two places with identical fake hwaddr generation --
if_vxlan and if_bridge. Lift it into if_ethersubr for reuse in other
interfaces that may also need a fake addr.

Reviewed by:	bryanv, kp, philip
Differential Revision:	https://reviews.freebsd.org/D19573
2019-03-14 17:18:00 +00:00
Ed Maste
cf3f5ed89e Remove radeonkmsfw firmware files
The drivers were removed in r344299 so there is no need to keep the
firmware files in the src tree.

Reviewed by:	imp, jhibbits, johalun
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19583
2019-03-14 17:05:46 +00:00
Brooks Davis
5e0d22c1b4 Style(9): add a missing space between argument declerations. 2019-03-14 15:56:34 +00:00
Brooks Davis
17f254d840 Remove an unused struct proc *p1 in cpu_fork().
The only reference to p1 after a dead store was in a comment so update
the comment to refer to td1.

Submitted by:	sbruno
Differential Revision:	https://reviews.freebsd.org/D16226
2019-03-14 15:55:30 +00:00
Ed Maste
30903c8283 Remove npe microcode
The driver was removed in r336436.
2019-03-14 13:28:21 +00:00
Hans Petter Selasky
582141f69d Revert r345102 until the DRM next port issues are resolved.
Requested by:		Johannes Lundberg <johalun0@gmail.com>
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-03-14 09:18:54 +00:00
Hans Petter Selasky
7d595f6b79 Resolve duplicate symbol name conflict after r345095, when building LINT.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-03-13 19:53:20 +00:00