Commit Graph

252513 Commits

Author SHA1 Message Date
Peter Grehan
a333a508a2 cpu_auxmsr: assert caller is preventing CPU migration.
Submitted by:	Adam Fenn (adam at fenn dot io)
Requested by:	kib
Reviewed by:	kib, grehan
Approved by:	kib
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D26166
2020-08-24 11:49:49 +00:00
Vincenzo Maffione
de5b46107c iflib: fix isc_rxd_flush call in netmap_fl_refill()
The semantic of the pidx argument of isc_rxd_flush() is the
last valid index of in the free list, rather than the next
index to be published. However, netmap was still using the
old convention. While there, also refactor the netmap_fl_refill()
to simplify a little bit and add an assertion.

MFC after:	2 weeks
2020-08-24 11:44:20 +00:00
Alex Richardson
ebae797c82 Also print number of available CPUs on Linux
Without this change the buildworld/buildkernel epilogue looks like this:
>>> World built in 249 seconds, sysctl: cannot stat /proc/sys/hw/ncpu: No such file or directory
ncpu: , make -j72.

Reviewed By:	emaste, bdrewery
Differential Revision: https://reviews.freebsd.org/D26056
2020-08-24 09:20:38 +00:00
Alex Richardson
0b862b0399 Avoid adding duplicates to SRCS/OBJS/SOBJS/POBJS
This is a change in preparation for stopping to use lorder.sh (D26044) and
instead assume that we have a linker newer than ~1990. Without lorder.sh
duplicates end up being passed to the linker when building .so files and this
can result in duplicate symbol definition errors.

There is one minor change: libcompiler_rt.a will no longer provide
gcc_personality_v0 and instead we now only have it in libgcc_eh.a/libgcc_s.so.
This matches GCC's behaviour.

Reviewed By:	emaste, cem
Differential Revision: https://reviews.freebsd.org/D26042
2020-08-24 09:20:33 +00:00
Alex Richardson
06e20d1bab makefs (msdosfs): Use fprintf instead of debug print for errors
The added print was very helpful for debugging failed disk image creation.

Reviewed By:	emaste
Differential Revision: https://reviews.freebsd.org/D23200
2020-08-24 09:20:27 +00:00
Alex Richardson
50e525e40b Correctly determine the real executable in crunched binaries
This should fix cases like su setting argv[0] to _su for /bin/sh.
Previously cheribsdbox (a crunched tool we use in CheriBSD to reduce the
size of our minimal disk images to allow loading them onto FPGAs without
waiting forever for the transfer) would complain about _su not being
compiled in, but now that we also look at AT_EXECPATH it correctly
invokes the sh tool.

Note: we use use AT_EXECPATH instead of the KERN_PROC_PATHNAME sysctl to get
the crunchgen binary name since it seems like KERN_PROC_PATHNAME just
returns the last cached path for a given hardlink.
When using `su`, instead of invoking /bin/csh this would invoke the last
used hardlink to cheribsdbox. This caused weird test failures when running
tests due to `id` being executed instead of `echo`:

$ id  # id is a hardlink to /bin/cheribsdbox
$ su postgres -c 'echo 1' # su is also a hardlink
uid=1001(postgres) gid=1001(postgres) groups=1001(postgres)

Obtained from: CheriBSD

Reviewed By:	emaste, brooks
Differential Revision: https://reviews.freebsd.org/D25998
2020-08-24 09:20:23 +00:00
Alex Richardson
b0f558df9f Re-indent crunched_main.c in preparation for D25998 2020-08-24 09:20:18 +00:00
Alex Richardson
0f31fdf253 Pass the installworld install(1) flags to make buildenv
This ensure that running make install inside buildenv correctly includes
the METALOG flags when building with -DNO_ROOT.

Reviewed By:	brooks
Differential Revision: https://reviews.freebsd.org/D26038
2020-08-24 09:20:13 +00:00
Mateusz Guzik
e35406c8f7 cache: lockless reverse lookup
This enables fully scalable operation for getcwd and significantly improves
realpath.

For example:
PATH_CUSTOM=/usr/src ./getcwd_processes -t 104
before:  1550851
after: 380135380

Tested by:	pho
2020-08-24 09:00:57 +00:00
Mateusz Guzik
feabaaf995 cache: drop the always curthread argument from reverse lookup routines
Note VOP_VPTOCNP keeps getting it as temporary compatibility for zfs.

Tested by:	pho
2020-08-24 08:57:02 +00:00
Mateusz Guzik
f0696c5e4b cache: perform reverse lookup using v_cache_dd if possible
Tested by:	pho
2020-08-24 08:55:55 +00:00
Mateusz Guzik
ce575cd0e2 cache: populate v_cache_dd for non-VDIR entries
It makes v_cache_dd into a little bit of a misnomer and it may be addressed later.

Tested by:	pho
2020-08-24 08:55:04 +00:00
Chuck Tuffli
71a51f69a4 bhyve: NVMe queue create must init head/tail
The NVMe emulation code did not explicitly initialize queue head and
tail pointers on queue creation. As these pointers are part of
calloc()'ed memory, this only becomes a problem if the queues are
deleted and then recreated.

This error can manifest with messages about completions not matching a
command.
2020-08-24 01:51:21 +00:00
Chuck Tuffli
c4a86c1fc0 bhyve: NVMe set nominal health values
Some operating systems believe bhyve's emulated NVMe drive is failing
based on certain values in the SMART / Health Information log page being
zero. Fix is to set the reported temperature and available spare values
to reasonable defaults.

Submitted by:	wanpengqian@gmail.com
Reviewed by:    grehan
MFC after:      2 weeks
Differential Revision: https://reviews.freebsd.org/D24202
2020-08-24 01:51:17 +00:00
Kyle Evans
cc249d7800 caroot: switch to using echo+shell glob to enumerate certs
This solves an issue on stable/12 that causes certs to not get installed.
ls is apparently not in PATH during installworld, so TRUSTED_CERTS ends up
blank and nothing gets installed. We don't really require anything
ls-specific, though, so let's just simplify it.

MFC after:	3 days
2020-08-23 23:56:57 +00:00
Bjoern A. Zeeb
8f32e493cc net80211: improve media information for VHT5GHZ
Improve ieee80211_media_setup(), media2mode(), and
ieee80211_rate2media() for VHT5GHZ at least.

Reviewed by:	adrian, gnn
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC (d/b/a "Netgate")
Differential Revision:	https://reviews.freebsd.org/D26089
2020-08-23 21:42:23 +00:00
Bjoern A. Zeeb
30fdd33ca3 net80211: set_vht_extchan() reverse order to always return best
In set_vht_extchan() the checks are performed in the order of VHT20/40/80.
That means if a channel has a lower and higheer VHT flag set we would
return the lower first.
We normally do not set more than one VHT flag so this change is supposed
to be a NOP but follows the logical thinking order of returning the best
first. Also we nowhere assert a single VHT flag so make sure we'll not
be stuck with VHT20 when we could do more.

While here add the debugging printfs for VHT160 and VHT80P80 which still
need doing once we deal with a driver at that level.

Reviewed by:	adrian, gnn
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC (d/b/a "Netgate")
Differential Revision:	https://reviews.freebsd.org/D26088
2020-08-23 21:37:20 +00:00
Mateusz Guzik
f0d9c77e52 vfs: validate ndp state after the lookup
The intent is to remove known-to-be-nops NDFREE calls after many lookups.
2020-08-23 21:06:41 +00:00
Mateusz Guzik
4b5001196a vfs: convert nameiop into an enum
While here change the field size from long to int and move it into the
gap next to cn_flags.

Shrinks struct componentname from 64 to 56 bytes on amd64.
2020-08-23 21:05:39 +00:00
Mateusz Guzik
9ce9158b53 vfs: support denying access in vaccess_vexec_smr 2020-08-23 21:05:06 +00:00
Mateusz Guzik
ba3b099198 vfs: factor away doomed vnode handling into vdropl_final 2020-08-23 21:04:35 +00:00
Konstantin Belousov
da477bcdc0 procctl(8): usermode bits to force LA58/LA57 on exec.
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D25273
2020-08-23 20:44:15 +00:00
Konstantin Belousov
f446480b5f amd64: Handle 5-level paging on wakeup.
We can switch into long mode directly with LA57 enabled.

Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D25273
2020-08-23 20:43:23 +00:00
Konstantin Belousov
177622f1fd amd64: Handle 5-level paging for efirt calls.
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D25273
2020-08-23 20:40:35 +00:00
Warner Losh
9b1f6cfc63 Fix another minor style glitch.
Pull { to the end of the struct line rather than having them on their
own line.
2020-08-23 20:38:10 +00:00
Konstantin Belousov
f3eb12e4a6 Add bhyve support for LA57 guest mode.
Noted and reviewed by:	grehan
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D25273
2020-08-23 20:37:21 +00:00
Konstantin Belousov
25f2da2e64 Add amd64 procctl(2) ops to manage forced LA48/LA57 VA after exec.
Tested by:	pho (LA48 hardware)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D25273
2020-08-23 20:32:13 +00:00
Konstantin Belousov
9ce875d9b5 amd64 pmap: LA57 AKA 5-level paging
Since LA57 was moved to the main SDM document with revision 072, it
seems that we should have a support for it, and silicons are coming.

This patch makes pmap support both LA48 and LA57 hardware.  The
selection of page table level is done at startup, kernel always
receives control from loader with 4-level paging.  It is not clear how
UEFI spec would adapt LA57, for instance it could hand out control in
LA57 mode sometimes.

To switch from LA48 to LA57 requires turning off long mode, requesting
LA57 in CR4, then re-entering long mode.  This is somewhat delicate
and done in pmap_bootstrap_la57().  AP startup in LA57 mode is much
easier, we only need to toggle a bit in CR4 and load right value in CR3.

I decided to not change kernel map for now.  Single PML5 entry is
created that points to the existing kernel_pml4 (KML4Phys) page, and a
pml5 entry to create our recursive mapping for vtopte()/vtopde().
This decision is motivated by the fact that we cannot overcommit for
KVA, so large space there is unusable until machines start providing
wider physical memory addressing.  Another reason is that I do not
want to break our fragile autotuning, so the KVA expansion is not
included into this first step.  Nice side effect is that minidumps are
compatible.

On the other hand, (very) large address space is definitely
immediately useful for some userspace applications.

For userspace, numbering of pte entries (or page table pages) is
always done for 5-level structures even if we operate in 4-level mode.
The pmap_is_la57() function is added to report the mode of the
specified pmap, this is done not to allow simultaneous 4-/5-levels
(which is not allowed by hw), but to accomodate for EPT which has
separate level control and in principle might not allow 5-leve EPT
despite x86 paging supports it. Anyway, it does not seems critical to
have 5-level EPT support now.

Tested by:	pho (LA48 hardware)
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D25273
2020-08-23 20:19:04 +00:00
Konstantin Belousov
4ba405dcdb Add definition for CR4.LA57 bit.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D25273
2020-08-23 20:08:05 +00:00
Konstantin Belousov
0cad2aa2dd Pass pointers to info parsed from notes, to brandinfo->header_supported filter.
Currently, we parse notes for the values of ELF FreeBSD feature flags
and osrel.  Knowing these values, or knowing that image does not carry
the note if pointers are NULL, is useful to decide which ABI variant
(brand) we want to activate for the image.

Right now this is only a plumbing change

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D25273
2020-08-23 20:06:55 +00:00
Konstantin Belousov
e3cf75826d Style. 2020-08-23 20:05:35 +00:00
Konstantin Belousov
bc6f027a39 Reserve FreeBSD ELF feature control bit LA48 to control VA layout on amd64.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D25273
2020-08-23 19:47:27 +00:00
Konstantin Belousov
2b313da3bd kern_sharedpage.c: Add exec_sysvec_init_secondary() helper.
It allows a sysent to share existing usermode data in shared page with
other sysent, assuming ABI differences are not in the layout of the
page.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D25273
2020-08-23 19:43:47 +00:00
Mark Johnston
3a3992fb86 ng_ubt: Add a device ID.
PR:		248838
Submitted by:	Andrey Zholos <aaz@q-fu.com>
MFC after:	1 week
2020-08-23 19:30:06 +00:00
Mateusz Guzik
992bcb37c2 libc: hide alphasort_thunk behind I_AM_SCANDIR_B
Should unbreak gcc build as reported by tinderbox:
lib/libc/gen/scandir.c:59:12: warning: 'alphasort_thunk' declared 'static' but never defined [-Wunused-function]
2020-08-23 11:06:59 +00:00
Mateusz Guzik
2ca83b5c27 vfs: mark freevnode as noinline 2020-08-23 11:05:26 +00:00
Navdeep Parhar
6a59b9940e cxgbe(4): Use large clusters for TOE rx queues when TOE+TLS is enabled.
Rx is more efficient within the chip when the receive buffer size
matches the TLS PDU size.

MFC after:	3 days
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D26127
2020-08-23 04:16:20 +00:00
Konstantin Belousov
c5bc28b273 Fix several issues with process group orphanage.
Attempt of adding assertions that pgrp->pg_jobc counters do not
underflow in r361967, reverted in r362910, points out bugs in the
handling of job control.  Peter Holm was able to narrow down the
problem to very easy reproduction with timeout(1) which uses reaping.

The following list of problems with calculation of pg_jobs which
directs SIGHUP/SIGCONT delivery for orphaned process group was
identified:
- Re-calculation of the orphaned status for children of exiting parent
  was wrong, but mostly unnoticed when all children were reparented to
  init(8).  When child can be reparented to a different process which
  could affect the child' job control state, it was not properly
  accounted for in pg_jobc.
- Lockless check for exiting process' parent process group is racy
  because nothing prevents the parent from changing its group
  membership.
- Exited process is left in the process group, until waited. This
  affects other calculations of pg_jobc.

Split handling of job control status on process changing its process
group, and process exiting.  Calculate increments and decrements for
pg_jobs by exact checking the orphanage instead of assuming process
group membership for children and parent.  Move the call to killjobc()
later under the proctree_lock.  Mark exiting process in killjobc()
with a new flag P_TREE_GRPEXITED and skip it for all pg_jobc
calculations after the flag is set.

Add checker that independently recalculates pg_jobc value and compares
it with the memoized process group state. This is enabled under INVARIANTS.

Reviewed by:	jilles
Discussed with:	kevans
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D26116
2020-08-22 21:32:11 +00:00
Ed Maste
da232c04ab mtree(8): add xref to mtree(5)
mtree(5) and mtree(8) come from different contrib sources. The former
already had an xref to the latter, but not the other way around.

MFC after:	1 week
2020-08-22 20:52:02 +00:00
Alexander V. Chernikov
eb1c7adb70 Finish r364492 by renaming rt_flags to rte_flags for multipath code. 2020-08-22 20:02:40 +00:00
Alexander V. Chernikov
93bfd365d2 Rename rt_flags to rte_flags && reduce number of rt_nhop accesses.
No functional changes.

Most of the routing flags are stored in the netxtop instead of rtentry.
Rename rt->rt_flags to rt->rte_flags to simplify reading/modifying code
 checking routing flags.

In the new multipath code, rt->rt_nhop may actually point to nexthop group
 instead of nhop. To ease transition, reduce the amount of rt->rt_nhop->...
 accesses.

Differential Revision:	https://reviews.freebsd.org/D26156
2020-08-22 19:30:56 +00:00
Warner Losh
46809e695f Whitespace change to line up dev_sotfc definition. 2020-08-22 19:18:31 +00:00
Warner Losh
b14f13459e Retire obsolete sysctl hw.bus.devctl_disable
hw.bus.devctl_disable has tagged been obsolete for a decade. Remove it. Also
remove some long obsolete comments. This was done and backed out once in 2014,
but we've had enough releases with the 'new' method of setting queue length that
we can just remove this sysctl now (stable/11, stable/12 and current all don't
reference it).
2020-08-22 19:02:15 +00:00
Alexander V. Chernikov
5676d488c2 Add test for checking RTF_HOST and RTAX_NETMASK inconsistency.
RTF_HOST indicates whether route is a host route
 (netmask is empty or /{32,128}).
Check that if netmask is empty and host route is not specified, kernel
 returns an error.

Differential Revision:	https://reviews.freebsd.org/D26155
2020-08-22 18:14:05 +00:00
Mateusz Guzik
de0fcd3a44 vfs: assert that HASBUF is only set with SAVENAME or SAVESTART
as requested by the caller. The intent is to eradicate the mostly
spurious NDFREE_PNBUF calls.
2020-08-22 16:58:59 +00:00
Mateusz Guzik
1e448a1558 cache: stronger vnode asserts in cache_enter_time 2020-08-22 16:58:34 +00:00
Mateusz Guzik
cd4a1797b0 fd: pwd_drop after releasing filedesc lock
Fixes a potential LOR against vnode lock.
2020-08-22 16:57:45 +00:00
Dimitry Andric
1c1ab42925 Add a missed source file for LLVM's BPF target. This target is not
enabled by default, so I forgot about it, apologies for the breakage.

Reported by:	hrs
MFC after:	6 weeks
X-MFC-With:	r364284
2020-08-22 15:31:56 +00:00
Ed Maste
c078c3fd69 acpi_iort: fix mapping end calculation
According to the ARM Design Document "IO Remapping Table Platform"
(DEN 0049D), the "Number of IDs" field of the ID mapping format means
"The number of IDs in the range minus one".

Submitted by:	Greg V <greg@unrelenting.technology>
Reviewed by:	andrew
Differential Revision:	https://reviews.freebsd.org/D25179
2020-08-22 14:39:14 +00:00
Mark Johnston
c612709bce Fix a typo in r364438 affecting 32-bit platforms.
Reported by:	antoine
MFC with:	r364438
2020-08-22 14:24:17 +00:00