Commit Graph

241465 Commits

Author SHA1 Message Date
Dmitry Chagin
6e4cf32e95 Add warning to the Linuxulator makefiles that building it outside of a
kernel does not make sense.

PR:		222861
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20179
2019-05-13 18:28:40 +00:00
Dmitry Chagin
c5156c7785 Linuxulator depends on fundamental kernel settings such as SMP. Many
of them are listed in opt_global.h, which is not generated while building
modules outside of a kernel, so such modules never match the real configured
kernel.

So, we should prevent our users from building obviously defective modules.

Therefore, remove the root cause of building modules outside of a
kernel - the possibility of building modules with DEBUG or KTR flags.
Also remove all of the DEBUG printfs, as they are incomplete and not
informative in threaded programs; also, half of the system calls do not have
a DEBUG printf. For debugging Linux programs we have dtrace, ktr and ktrace.

PR:		222861
Reviewed by:	trasz
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20178
2019-05-13 18:24:29 +00:00
Dmitry Chagin
caaad8736e Linuxulator getpeername() returns EINVAL when namelen is less than 0.
MFC after:	2 weeks
2019-05-13 18:14:20 +00:00
Mark Johnston
adbb25df4b Extend the libcap_sysctl tests.
- Add some coverage for cap_sysctl(3).
- Add a test for the case where the caller wishes to find the sysctl
  output length without specifying an output buffer.

Reviewed by:	oshogbo
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17856
2019-05-13 17:53:03 +00:00
Mark Johnston
3c766430f7 Convert the libcap_sysctl test cases to ATF.
Reviewed by:	oshogbo
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17855
2019-05-13 17:51:03 +00:00
Mark Johnston
1608c46ea4 Add cap_sysctl(3) and cap_sysctlnametomib(3).
These complement cap_sysctlbyname(3) to provide a drop-in
replacement for the corresponding libc functions.

Also revise the libcap_sysctl limit interface to provide access
to sysctls by MIB, and to avoid direct manipulation of nvlists
by the caller.

Reviewed by:	oshogbo
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17854
2019-05-13 17:49:54 +00:00
Dmitry Chagin
d5368bf3df Our bsd_to_linux_sockaddr() and linux_to_bsd_sockaddr() functions
alter the userspace sockaddr to convert the format between linux and BSD versions.
That is a minimum of 3 copyin/copyout operations for one syscall.

Also, some syscalls use linux_sa_put() and linux_getsockaddr() when copying a
sockaddr to userspace or from userspace, respectively.

To avoid this chaos, especially converting the sockaddr in userspace,
rewrite these 4 functions to convert the sockaddr only in the kernel and keep
only 2 of these functions.

Also, in order to reduce duplication between MD parts of the Linuxulator, move
the MI struct sockaddr conversion functions into the linux_common module.

PR:		232920
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20157
2019-05-13 17:48:16 +00:00
Mark Johnston
54a3a11421 Provide separate accounting for user-wired pages.
Historically we have not distinguished between kernel wirings and user
wirings for accounting purposes.  User wirings (via mlock(2)) were
subject to a global limit on the number of wired pages, so if large
swaths of physical memory were wired by the kernel, as happens with
the ZFS ARC among other things, the limit could be exceeded, causing
user wirings to fail.

The change adds a new counter, v_user_wire_count, which counts the
number of virtual pages wired by user processes via mlock(2) and
mlockall(2).  Only user-wired pages are subject to the system-wide
limit which helps provide some safety against deadlocks.  In
particular, while sources of kernel wirings typically support some
backpressure mechanism, there is no way to reclaim user-wired pages
short of killing the wiring process.  The limit is exported as
vm.max_user_wired, renamed from vm.max_wired, and changed from u_int
to u_long.

The choice to count virtual user-wired pages rather than physical
pages was made for simplicity.  There are mechanisms that can cause
user-wired mappings to be destroyed while maintaining a wiring of
the backing physical page; these make it difficult to accurately
track user wirings at the physical page layer.

The change also closes some holes which allowed user wirings to succeed
even when they would cause the system limit to be exceeded.  For
instance, mmap() may now fail with ENOMEM in a process that has called
mlockall(MCL_FUTURE) if the new mapping would cause the user wiring
limit to be exceeded.

Note that bhyve -S is subject to the user wiring limit, which defaults
to 1/3 of physical RAM.  Users that wish to exceed the limit must tune
vm.max_user_wired.

Reviewed by:	kib, ngie (mlock() test changes)
Tested by:	pho (earlier version)
MFC after:	45 days
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19908
2019-05-13 16:38:48 +00:00
Andrey V. Elsukov
af1f58df99 Do not leak memory used for binary filter.
2019-05-13 14:07:02 +00:00
Andrey V. Elsukov
699281b545 Rework locking in BPF code to remove rwlock from fast path.
At high packet rates, contention on the rwlock in the bpf_*tap*() functions
can lead to packet drops. To avoid this, migrate this code to use
epoch(9) KPI and ConcurrencyKit's lists.

* all lists changed to use CK_LIST;
* reference counting added to bpf_if and bpf_d;
* now bpf_if references ifnet and releases this reference on destroy;
* each bpf_d descriptor references bpf_if when it is attached;
* new struct bpf_program_buffer introduced to keep BPF filter programs;
* bpf_program_buffer, bpf_d and bpf_if structures are freed by
  epoch_call();
* bpf_freelist and ifnet_departure event are no longer needed, thus
  both are removed;

Reviewed by:	melifaro
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D20224
2019-05-13 13:45:28 +00:00
Emmanuel Vadot
eb4c63f731 Revert r347356 and r347371
passwd related files need to be tagged as config files so pkg update
will attempt merging them when we install a new package.
We should use CONFS for that.
Revert for now until I come up with a better version of this patch as
it breaks pkgbase for users.
2019-05-13 12:38:33 +00:00
Andrey V. Elsukov
740d4c7c9f Revert r347402. After r347429 symlink is no longer needed.
2019-05-13 08:34:13 +00:00
Mark Johnston
11a5fc4fb9 Catch up with r347241.
MFC with:	r347241
2019-05-13 01:18:17 +00:00
Ruslan Bukin
b803d0b790 Add support for HiFive Unleashed -- the board with a multi-core RISC-V SoC
from SiFive, Inc.

The first core on this SoC (hart 0) is a 64-bit microcontroller.

o Pick a hart to run the boot process using a hart lottery.
  This allows hart 0 to be excluded from running the boot process.
  (BBL releases hart 0 after the main harts, so it never wins the lottery).
o Renumber CPUs early on boot.
  Exclude non-MMU cores. Store the original hart ID in struct pcpu. This
  makes it possible to find the correct destination for IPIs and remote sfence
  calls.

Thanks to SiFive, Inc for the board provided.

Reviewed by:	markj
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D20225
2019-05-12 16:17:05 +00:00
Emmanuel Vadot
da153f8e60 arm: allwinner: aw_clk_nm: Don't reparent the clock if we didn't ask
When looking for the best frequency don't change the clock parent if the
clock wasn't configured to do that.
2019-05-12 15:27:01 +00:00
Mateusz Guzik
8ba6c1391b cache: fix a brainfart in r347505
If bumping the counter goes over the limit we have to decrement it back.

Previous code would only bump the counter after adding the entry (thus allowing
the cache to go over the limit).
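
The bump-then-undo pattern described above can be sketched as follows. This
is only an illustration using C11 atomics, not the code from the commit; the
names toy_numcache, toy_limit, toy_reserve_entry() and toy_release_entry()
are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the cache entry counter and its limit. */
static atomic_long toy_numcache;
static const long toy_limit = 4;

/*
 * Reserve a slot before the entry is added.  If the bump takes the
 * counter past the limit, decrement it back and report failure.
 */
static bool
toy_reserve_entry(void)
{
	long after = atomic_fetch_add(&toy_numcache, 1) + 1;

	if (after > toy_limit) {
		atomic_fetch_sub(&toy_numcache, 1);
		return (false);
	}
	return (true);
}

/* Give a slot back when an entry is removed. */
static void
toy_release_entry(void)
{
	long before = atomic_fetch_sub(&toy_numcache, 1);

	assert(before > 0);
}
```

Because the counter is bumped before the entry is inserted, a caller that
fails the reservation never adds an entry, so the count of live entries stays
within the limit.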

Sponsored by:	The FreeBSD Foundation
2019-05-12 07:56:01 +00:00
Mateusz Guzik
2425b5168c seqc: fix sed-introduced typos (seqcuence -> sequence)
Sponsored by:	The FreeBSD Foundation
2019-05-12 07:13:25 +00:00
Mateusz Guzik
b72515e129 amd64: tidy up pagezero*/pagecopy (movq -> movl)
Sponsored by:	The FreeBSD Foundation
2019-05-12 07:11:44 +00:00
Mateusz Guzik
5bf50787e6 cache: bump numcache on entry, while here fix lnumcache type
Sponsored by:	The FreeBSD Foundation
2019-05-12 06:59:22 +00:00
Mateusz Guzik
45372f1a6f amd64: fixup MEMMOVE comment (10 -> r10)
Sponsored by:	The FreeBSD Foundation
2019-05-12 06:42:17 +00:00
Mateusz Guzik
63ad3b65b0 cache: push sdt probes in cache_zap_locked to code doing the work
Avoids branching to check which probe to evaluate. The very same check was
being done later to do the actual work.

Sponsored by:	The FreeBSD Foundation
2019-05-12 06:39:30 +00:00
Mateusz Guzik
a8c2fcb287 x86: store pending bitmapped IPIs in per-cpu areas
This gets rid of the global cpu_ipi_pending array.

While here, replace cmpset with fcmpset in the delivery code and opportunistically
check if given IPI is already pending.
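
A rough userland sketch of the idea, assuming C11 atomics as a stand-in for
the kernel primitives: atomic_compare_exchange_weak() behaves like fcmpset in
that a failed attempt writes the current value back into the expected
variable, so the loop does not need a separate re-read. The function name
toy_post_pending() and the 32-bit word layout are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Mark 'bit' as pending in a per-CPU word.  Returns true if this call
 * set the bit, false if it was already pending (no write is issued in
 * that case).
 */
static bool
toy_post_pending(_Atomic uint32_t *pending, unsigned int bit)
{
	uint32_t mask, old;

	assert(bit < 32);
	mask = UINT32_C(1) << bit;
	old = atomic_load_explicit(pending, memory_order_relaxed);
	for (;;) {
		if (old & mask)
			return (false);	/* already pending */
		/* On failure 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(pending, &old, old | mask))
			return (true);
	}
}
```

Checking the bit before attempting the exchange avoids dirtying the cache
line when the request is already recorded.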

Sponsored by:	The FreeBSD Foundation
2019-05-12 06:36:54 +00:00
Mateusz Guzik
8eae2be460 amd64: stop re-reading curpc in suword
Plugs re-reads missed in r341719

Sponsored by:	The FreeBSD Foundation
2019-05-12 06:34:58 +00:00
Mateusz Guzik
5e57adc874 random(4): depessimize arc4random
- __predict_false reseeding on entry as it is almost never true.
- don't blindly atomic_cmpset as on x86 it ends up dirtying the cacheline;
  it almost never succeeds per above.
- fetch the timestamp prior to getting the cpu number

Reviewed by:	cem
Approved by:	secteam (delphij)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20242
2019-05-12 06:32:46 +00:00
Rick Macklem
3e08dc749c Factor code into two new functions in preparation for a future commit.
The two functions are:
- read_exportfile(), which reads the exports file(s) and calls
  get_exportlist_one() to process each of them.
- delete_export(), which deletes the exports in the kernel for a file
  system.
The contents of these functions are the same code as was used to do the
operations, moved into separate functions. As such, there is no semantic change.
This is being done in preparation for a future commit that will add an
option to do incremental changes of kernel exports upon receiving SIGHUP.

MFC after:	1 month
2019-05-11 22:41:58 +00:00
Jens Schweikhardt
82455a3319 Correct a handful of typos.
2019-05-11 19:31:54 +00:00
Cy Schubert
706a3d9c65 Support the use of the ipsec kld.
X-MFC with:	r347410
2019-05-11 17:59:13 +00:00
Doug Moore
87ae0686a2 A new parameter to blist_alloc specifies an upper bound on the size of
the allocation request, so that the blocks allocated are from the next
set of free blocks big enough to satisfy the minimum requirements of
the request, and the number of blocks allocated are as many as
possible, up to the specified maximum. The implementation of
swp_pager_getswapspace uses this parameter to ask for a number of
blocks between the new halved request size and the previous failed
request size. Thus a request for 32 blocks may fail, but instead of
getting only 16 blocks, the caller asks for 16 to 31 next, and
might get 19 or 27, which is closer to what they originally wanted.
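
The retry scheme can be illustrated with a toy model. This is not the blist
implementation: free space is modelled as a plain array of free-run lengths,
and toy_alloc_range() and toy_getswapspace() are hypothetical names chosen
for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy allocator: 'runs' holds the lengths of free runs in search order.
 * Take blocks from the first run with at least 'min' free blocks, as
 * many as possible up to 'max'.  Returns the count taken, or 0.
 */
static int
toy_alloc_range(int *runs, size_t nruns, int min, int max)
{
	assert(min > 0 && min <= max);
	for (size_t i = 0; i < nruns; i++) {
		if (runs[i] >= min) {
			int got = runs[i] < max ? runs[i] : max;

			runs[i] -= got;
			return (got);
		}
	}
	return (0);
}

/*
 * Ask for 'want' blocks.  After a failure, the next request ranges from
 * half the failed minimum up to one less than the failed minimum.
 */
static int
toy_getswapspace(int *runs, size_t nruns, int want)
{
	int min = want, max = want;

	while (min > 0) {
		int got = toy_alloc_range(runs, nruns, min, max);

		if (got > 0)
			return (got);
		max = min - 1;
		min /= 2;
	}
	return (0);
}
```

With free runs of 5, 19 and 27 blocks, a request for 32 fails at [32, 32],
then succeeds at [16, 31] with 19 blocks rather than settling for exactly 16.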

I expect this to lead to bigger block allocations and less block
fragmentation, at least in some cases.

Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D20001
2019-05-11 16:15:13 +00:00
Justin Hibbits
2f420a7c7f revert r346588 for now
The rewrite of strcmp in assembly uses an instruction added in PowerISA
2.05, making it SIGILL on CPUs older than the POWER6, such as the PPC970 in
the PowerMac G5.  Revert this until we get clang+lld, or retire the in-tree
binutils in favor of newer binutils with IFUNC support, whichever comes
first.
2019-05-11 15:17:42 +00:00
Emmanuel Vadot
f78a4afd30 twsi: Calculate the clock param based on the bus frequency
Instead of precalculating the different speeds, respect the bus frequency
and calculate the clock register parameter based on it.
If the platform didn't register the core clk, fall back on the precomputed
values (this is likely to be the case on Marvell boards).
2019-05-11 15:03:51 +00:00
Emmanuel Vadot
e69181cfc6 allwinner: clk: sun8i_r: Correct resets
The i2c reset wasn't defined and some bits were wrong; correct them.
2019-05-11 15:02:55 +00:00
Emmanuel Vadot
45f64a5956 allwinner: clk: prediv_mux: Init the current parent
Do not init the first parent but read the clock register to find
its current parent and init this one.
2019-05-11 15:02:20 +00:00
Xin LI
7fa22f746b Update leap-seconds to leap-seconds.3757622400.
As per https://datacenter.iers.org/data/latestVersion/16_BULLETIN_C16.txt:

     INTERNATIONAL EARTH ROTATION AND REFERENCE SYSTEMS SERVICE (IERS)

SERVICE INTERNATIONAL DE LA ROTATION TERRESTRE ET DES SYSTEMES DE REFERENCE

SERVICE DE LA ROTATION TERRESTRE DE L'IERS
OBSERVATOIRE DE PARIS
61, Av. de l'Observatoire 75014 PARIS (France)
Tel.      : +33 1 40 51 23 35
e-mail    : services.iers@obspm.fr
http://hpiers.obspm.fr/eop-pc

                                              Paris, 07 January 2019

                                              Bulletin C 57

                                              To authorities responsible
                                              for the measurement and
                                              distribution of time

                          INFORMATION ON UTC - TAI

 NO leap second will be introduced at the end of June 2019.
 The difference between Coordinated Universal Time UTC and the
 International Atomic Time TAI is :

     from 2017 January 1, 0h UTC, until further notice : UTC-TAI = -37 s

 Leap seconds can be introduced in UTC at the end of the months of December
 or June,  depending on the evolution of UT1-TAI. Bulletin C is mailed every
 six months, either to announce a time step in UTC, or to confirm that there
 will be no time step at the next possible date.

                                            Christian BIZOUARD
                                            Director
                                            Earth Orientation Center of IERS
					    Observatoire de Paris, France

Requested by:	rgrimes
Obtained from:	ftp://tycho.usno.navy.mil/pub/ntp/leap-seconds.3757622400
MFC after:	3 days
2019-05-11 14:22:21 +00:00
Doug Moore
48e98a2afc Callers of swp_pager_getswapspace get either as many blocks as they
requested, or none, and in the latter case it is up to them to pick a
smaller request to make - which they always do by halving the failed
request. This change to swp_pager_getswapspace leaves the task of
downsizing the request to the function and not its caller. It still
does so by halving the original request.

Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D20228
2019-05-11 10:16:43 +00:00
Doug Moore
535192530c When bitpos can't be implemented with an inline ffs* instruction,
change the binary search so that it does not depend on a single bit
only being set in the bitmask. Use bitpos more generally, and avoid
some clearing of bits to accommodate its current behavior.
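
One way such a search can work without assuming a single set bit is shown
below. This is a sketch under the assumption that the goal is the position of
the least significant set bit, as an ffs-style instruction would report;
toy_bitpos() is a hypothetical name and not the function from the commit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Binary search for the position of the least significant set bit.
 * Invariant: every bit below 'lo' is clear and the lowest set bit lies
 * in [lo, hi).  Any number of higher bits may also be set.
 */
static int
toy_bitpos(uint64_t mask)
{
	int lo = 0, hi = 64;

	assert(mask != 0);
	while (lo + 1 < hi) {
		int mid = (lo + hi) / 2;

		/* Is any bit below 'mid' set? */
		if (mask & ((UINT64_C(1) << mid) - 1))
			hi = mid;
		else
			lo = mid;
	}
	return (lo);
}
```

Each step tests the whole low half against zero rather than comparing the
mask with a single power of two, which is why extra set bits above the
answer do not disturb the result.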

Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D20237
2019-05-11 09:09:10 +00:00
Kyle Evans
81b3b91e6b tuntap: Improve style
No functional change.

tun_flags of the tuntap_driver was renamed to ident_flags to reflect the
fact that it's a subset of the tun_flags that identifies a tuntap device.
This maps more easily (visually) to the TUN_DRIVER_IDENT_MASK that masks off
the bits of tun_flags that are applicable to tuntap driver ident. This is a
purely cosmetic change.
2019-05-11 04:18:06 +00:00
Doug Moore
0cb36fc9c2 Revert r347469.
Approved by: kib (mentor)
2019-05-11 02:13:52 +00:00
Rick Macklem
1a9a992fce Factor out some exportlist list operations into separate functions.
This patch moves the code that removes and frees all exportlist elements
out into a separate function called free_exports().
It does the same for the insertion of a new exportlist entry into a list.
It also adds a second argument to ex_search() for the list to use.
None of these changes have any semantic effect. They are being done to
prepare the code for future patches that convert the single linked list
for the exportlist to a hash table of lists and a patch that will do
incremental changes of exports in the kernel.
And it fixes the argument for SLIST_HEAD_INITIALIZER() to be a pointer,
which doesn't really matter, since SLIST_HEAD_INITIALIZER() doesn't use
the argument.

MFC after:	1 month
2019-05-10 23:52:17 +00:00
Conrad Meyer
64e7d18f34 netdump: Ref the interface we're attached to
Serialize netdump configuration / deconfiguration, and discard our
configuration when the affiliated interface goes away by monitoring
ifnet_departure_event.

Reviewed by:	markj, with input from vangyzen@ (earlier version)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20206
2019-05-10 23:12:59 +00:00
Doug Moore
12cd7ded68 Don't use _Generic, as many systems don't know about it. Go back to a lo-tech switch statement.
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D20235
2019-05-10 23:12:37 +00:00
Conrad Meyer
070e7bf95e netdump: Fix boot-time configuration typo
Boot-time netdump configuration is much more useful if one can configure the
client and gateway addresses.  Fix trivial typo.

(Long-standing bug, I believe it dates to the original netdump commit.)

Spotted by:	one of vangyzen@ or markj@
Sponsored by:	Dell EMC Isilon
2019-05-10 23:10:22 +00:00
Johannes Lundberg
5098ed5f3b Implement linux_pci_unregister_drm_driver in linuxkpi so that drm drivers
can be unloaded.

This patch is a part of D19565.

Reviewed by:	hps
Approved by:	imp (mentor), hps
MFC after:	1 week
2019-05-10 23:10:22 +00:00
Doug Moore
4ab18ea23a When bitpos can't be implemented with an inline ffs* instruction,
change the binary search so that it does not depend on a single bit
only being set in the bitmask. Use bitpos more generally, and avoid
some clearing of bits to accommodate its current behavior.

Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D20232
2019-05-10 22:49:01 +00:00
Doug Moore
d4808c4403 Add a (q)uit option to the subr_blist test program.
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D20234
2019-05-10 22:02:29 +00:00
Conrad Meyer
6144b50f8b netdump: Don't store sensitive key data we don't need
Prior to this revision, struct diocskerneldump_arg (and struct netdump_conf
with embedded diocskerneldump_arg before r347192), were copied in their
entirety to the global 'nd_conf' variable.  Also prior to this revision,
de-configuring netdump would *not* remove the key material from global
nd_conf.

As part of Encrypted Kernel Crash Dumps (EKCD), which was developed
contemporaneously with netdump but happened to land first, the
diocskerneldump_arg structure will contain sensitive key material
(kda_key[]) when encrypted dumps are configured.

Netdump doesn't have any use for the key data -- encryption is handled in
the core dumper code -- so in this revision, we no longer store it.

Unfortunately, I think this leak dates to the initial import of netdump in
r333283; so it's present in FreeBSD 12.0.

Fortunately, the impact *seems* relatively minor.  Any new *netdump*
configuration would overwrite the key material; for active encrypted netdump
configurations, the key data stored was just a duplicate of the key material
already in the core dumper code; and no user interface (other than
/dev/kmem) actually exposed the leaked material to userspace.

Reviewed by:	markj, rpokala (earlier commit message)
MFC after:	2 weeks
Security:	yes (minor)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20233
2019-05-10 21:55:11 +00:00
Gleb Smirnoff
54bb7ac0c4 Fix regression from r347375: do not panic when sending an IP multicast
packet from an interface that doesn't have an IPv4 address.

Reported by:	Michael Butler <imb protected-networks.net>
2019-05-10 21:51:17 +00:00
John Baldwin
c9d337083f Apply r280991 to ip6_fragment.
This uses m_dup_pkthdr() to copy all of the metadata about a packet to
each of its fragments including VLAN tags, mbuf tags, etc. instead of
hand-copying a few fields.

Reviewed by:	bz
MFC after:	1 month
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20117
2019-05-10 20:15:40 +00:00
Doug Moore
09b380a1ff Replace the expression "-mask & ~mask" with a function call that does
the same thing, but is commented so that it might be better
understood.
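
For readers unfamiliar with the idiom, the effect of the expression can be
spelled out in a small commented helper. The function name
toy_flip_above_lowest() is hypothetical and the 64-bit width is an
assumption; only the expression itself comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Let k be the position of the lowest set bit of 'mask'.  Then -mask
 * equals ~(mask - 1): bits below k are 0, bit k is 1, and bits above k
 * are the inverse of mask.  ANDing with ~mask also clears bit k, so the
 * result has bits 0..k clear and every bit above k inverted.  A mask of
 * 0 yields 0.
 */
static uint64_t
toy_flip_above_lowest(uint64_t mask)
{
	return (-mask & ~mask);
}

/* A few spot checks of the description above. */
static void
toy_flip_selfcheck(void)
{
	assert(toy_flip_above_lowest(0x06) == UINT64_C(0xFFFFFFFFFFFFFFF8));
	assert(toy_flip_above_lowest(0x0A) == UINT64_C(0xFFFFFFFFFFFFFFF4));
	assert(toy_flip_above_lowest(0) == 0);
}
```

When the set bits of the mask form one contiguous run, the result is simply
all the bits above that run.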

Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D20231
2019-05-10 19:55:29 +00:00
Justin Hibbits
f04019c3c6 powerpc: Initialize the Hardware Interrupt Offset Register (HIOR) earlier for ppc970
Since we now have a much larger KVA on powerpc64, it's possible to get SLB
traps earlier in boot, possibly even before the HIOR is properly configured
for us.  Move the HIOR setup to immediately after reset, so that we use our
exception handlers instead of Open Firmware's.

PR:		233863
Submitted by:	Mark Millard (partial)
Reported by:	Mark Millard
MFC after:	2 weeks
2019-05-10 19:36:14 +00:00
Doug Moore
d1c34a6b76 blist_next_leaf_alloc walks over all the meta-nodes between one leaf
and the next one, and if blocks are allocated from the next leaf, it
walks back toward where it started, as long as there are interleaving
meta-nodes to be updated on account of the last free blocks under
those meta-nodes being allocated. Only if the walk goes all the way
back to the starting point must we calculate the position of the
meta-node that is the least-common parent of one leaf and the next,
and update a bit in that meta-node to indicate the allocation of its
last free block.

There's no need to start calculating the position of that least-common
parent until the walk back reaches the original starting point, and
there's no need for a calculation that updates 'radius' to tell us
when we've walked back to the beginning, since comparing scan to next
suffices for that.

Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D20229
2019-05-10 18:25:06 +00:00