Commit Graph

247888 Commits

Author SHA1 Message Date
jhb
1248834695 Require the SHF_ALLOC flag for program sections from kernel object modules.
ELF object files can contain program sections which are not supposed
to be loaded into memory (e.g. .comment).  Normally the static linker
uses these flags to decide which sections are allocated to loadable
program segments in ELF binaries and shared objects (including kernels
on all architectures and kernel modules on architectures other than
amd64).

Mapping ELF object files (such as amd64 kernel modules) into memory
directly is a bit of a grey area.  ELF object files are intended to be
used as inputs to the static linker.  As a result, there is not a
standardized definition for what the memory layout of an ELF object
should be (none of the section headers have valid virtual memory
addresses for example).

The kernel and loader were not checking the SHF_ALLOC flag but loading
any program sections with certain types such as SHT_PROGBITS.  As a
result, the kernel and loader would load into RAM some sections that
weren't marked with SHF_ALLOC such as .comment that are not loaded
into RAM for kernel modules on other architectures (which are
implemented as ELF shared objects).  Aside from possibly requiring
slightly more RAM to hold a kernel module this does not affect runtime
correctness as the kernel relocates symbols based on the layout it
uses.

Debuggers such as gdb and lldb do not extract symbol tables from a
running process or kernel.  Instead, they replicate the memory layout
of ELF executables and shared objects and use that to construct their
own symbol tables.  For executables and shared objects this works
fine.  For ELF objects the current logic in kgdb (and probably lldb
based on a simple reading) assumes that only sections with SHF_ALLOC
are memory resident when constructing a memory layout.  If the
debugger constructs a different memory layout than the kernel, then it
will compute different addresses for symbols causing symbols in the
debugger to appear to have the wrong values (though the kernel itself
is working fine).  The current port of mdb does not check SHF_ALLOC as
it replicates the kernel's logic in its existing kernel support.

The bfd linker sorts the sections in ELF object files such that all of
the allocated sections (sections with SHF_ALLOCATED) are placed first
followed by unallocated sections.  As a result, when kgdb composed a
memory layout using only the allocated sections, this layout happened
to match the layout used by the kernel and loader.  The lld linker
does not sort the sections in ELF object files and mixed allocated and
unallocated sections.  This resulted in kgdb composing a different
memory layout than the kernel and loader.

We could either patch kgdb (and possibly in the future lldb) to use
custom handling when generating memory layouts for kernel modules that
are ELF objects, or we could change the kernel and loader to check
SHF_ALLOCATED.  I chose the latter as I feel we shouldn't be loading
things into RAM that the module won't use.  This should mostly be a
NOP when linking with bfd but will allow the existing kgdb to work
with amd64 kernel modules linked with lld.

Note that we only require SHF_ALLOC for "program" sections for types
like SHT_PROGBITS and SHT_NOBITS.  Other section types such as symbol
tables, string tables, and relocations must also be loaded and are not
marked with SHF_ALLOC.

Reported by:	np
Reviewed by:	kib, emaste
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D13926
2018-01-17 22:51:59 +00:00
cem
8d50f6fe85 Convert ls(1) to not use libxo(3)
libxo imposes a large burden on system utilities. In the case of ls, that
burden is difficult to justify -- any language that can interact with json
output can use readdir(3) and stat(2).

Logically, this reverts r291607, r285857, r285803, r285734, r285425,
r284494, r284489, r284252, and r284198.

Kyua tests continue to pass (libxo integration was entirely untested).

Reported by:	many
Reviewed by:	imp
Discussed with:	manu, bdrewery
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13959
2018-01-17 22:47:34 +00:00
jhb
0aaf564f94 Use long for the last argument to VOP_PATHCONF rather than a register_t.
pathconf(2) and fpathconf(2) both return a long.  The kern_[f]pathconf()
functions now accept a pointer to a long value rather than modifying
td_retval directly.  Instead, the system calls explicitly store the
returned long value in td_retval[0].

Requested by:	bde
Reviewed by:	kib
Sponsored by:	Chelsio Communications
2018-01-17 22:36:58 +00:00
landonf
d49b03f3fe bwn(4): Enable, by default, the opt-in support for bhnd(4) introduced in
r326454.

bwn(4)/bhnd(4) has been tested with most chipsets currently supported by
bwn(4), and this change should be transparent to existing bwn(4) users;
please report any regressions that you do encounter.

To revert to using siba_bwn(4) instead of bhnd(4), place the following
lines in loader.conf(5):

  hw.bwn_pci.preferred="0"

Once we're satisfied that the switch to bhnd(4) has seen sufficient broader
testing, bwn(4) will be migrated to use the native bhnd(9) interface
directly, and support for siba_bwn(4) will be dropped (see D13518).

Sponsored by:	The FreeBSD Foundation
2018-01-17 22:33:19 +00:00
markj
0c26afc4c5 Annotate a couple of changes from r328083.
Reviewed by:	kib
X-MFC with:	r328083
2018-01-17 21:52:12 +00:00
emaste
fae11c6a87 kldxref: additional sytle(9) cleanup
Reported by:	kib (via comments in D13957)
Sponsored by:	The FreeBSD Foundation
2018-01-17 20:43:30 +00:00
emaste
7a589d5c51 kldxref: improve style(9)
Address style issues including some previously raised in D13923.

- Use designated initializers for structs
- Always use bracketed return style
- No initialization in declarations
- Align function prototype names
- Remove old commented code/unused includes

Submitted by:	Mitchell Horne <mhorne063@gmail.com>
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D13943
2018-01-17 19:59:43 +00:00
pfg
036ebddf97 ufs: use mallocarray(9).
Basic use of mallocarray to prevent overflows: static analyzers are also
likely to perform additional checks.

Since mallocarray expects unsigned parameters, unsign some
related variables to minimize sign conversions.

Reviewed by:	mckusick
2018-01-17 18:18:33 +00:00
mckusick
39d99c1186 Correct fsck journal-recovery code to update a cylinder-group
check-hash after making changes to the cylinder group. The problem
was that the journal-recovery code was calling the libufs bwrite()
function instead of the cgput() function. The cgput() function updates
the cylinder-group check-hash before writing the cylinder group.

This change required the additions of the cgget() and cgput() functions
to the libufs API to avoid a gratuitous bcopy of every cylinder group
to be read or written. These new functions have been added to the
libufs manual pages. This was the first opportunity that I have had
to use and document the use of the EDOOFUS error code.

Reviewed by: kib
Reported by: emaste and others
2018-01-17 17:58:24 +00:00
dim
bbd1562a49 Revert r327340, as the workaround for rep prefixes followed by .byte
directives is no longer needed after r328090.
2018-01-17 17:14:19 +00:00
dim
3207b51c93 Pull in r322623 from upstream llvm trunk (by Andrew V. Tischenko):
Allow usage of X86-prefixes as separate instrs.
  Differential Revision: https://reviews.llvm.org/D42102

This should fix parse errors when x86 prefixes (such as 'lock' and
'rep') are followed by various non-mnemonic tokens, e.g. comments, .byte
directives and labels.

PR:		224669,225054
2018-01-17 17:11:55 +00:00
imp
5af8ebb2f8 Move setting of CAM_SIM_QUEUED to before we actually submit it to the
hardware. Setting it after is racy, and we can lose the race on a
heavily loaded system.

Reviewed by: scottl@, gallatin@
Sponsored by: Netflix
2018-01-17 17:08:26 +00:00
fabient
0b968fec5b Only call flush in pipe mode.
It fixes a crash with a socket in top mode.

Ex:
# pmcstat -R 127.0.0.1:8080 -T -w1
then
# pmcstat -n1 -Sclock.prof -Slock.failed -O 127.0.0.1:8080

MFC after:	1 week
Sponsored by:	Stormshield
2018-01-17 16:55:35 +00:00
fabient
326b96a88c Fix pmcstat exit from kernel introduced by r325275.
pmcstat request for close will generate a close event.
This event will be in turn received by pmcstat to close the file.

Reviewed by:	kib
Tested by:	pho
MFC after:	1 week
Sponsored by: Stormshield
2018-01-17 16:41:22 +00:00
bapt
821c3605c6 Update pciids to 2018.01.14
MFC after:	3 days
2018-01-17 13:25:41 +00:00
dim
786fb4e1d1 Fix buildworld after r328075, by also renaming cgget to cglookup in
fsdb.

Reported by:	ohartmann@walstatt.org,david@catwhisker.org
Pointy hat to:	mckusick
2018-01-17 13:19:37 +00:00
kib
c35d24e497 PTI for amd64.
The implementation of the Kernel Page Table Isolation (KPTI) for
amd64, first version. It provides a workaround for the 'meltdown'
vulnerability.  PTI is turned off by default for now, enable with the
loader tunable vm.pmap.pti=1.

The pmap page table is split into kernel-mode table and user-mode
table. Kernel-mode table is identical to the non-PTI table, while
usermode table is obtained from kernel table by leaving userspace
mappings intact, but only leaving the following parts of the kernel
mapped:

    kernel text (but not modules text)
    PCPU
    GDT/IDT/user LDT/task structures
    IST stacks for NMI and doublefault handlers.

Kernel switches to user page table before returning to usermode, and
restores full kernel page table on the entry. Initial kernel-mode
stack for PTI trampoline is allocated in PCPU, it is only 16
qwords.  Kernel entry trampoline switches page tables. then the
hardware trap frame is copied to the normal kstack, and execution
continues.

IST stacks are kept mapped and no trampoline is needed for
NMI/doublefault, but of course page table switch is performed.

On return to usermode, the trampoline is used again, iret frame is
copied to the trampoline stack, page tables are switched and iretq is
executed.  The case of iretq faulting due to the invalid usermode
context is tricky, since the frame for fault is appended to the
trampoline frame.  Besides copying the fault frame and original
(corrupted) frame to kstack, the fault frame must be patched to make
it look as if the fault occured on the kstack, see the comment in
doret_iret detection code in trap().

Currently kernel pages which are mapped during trampoline operation
are identical for all pmaps.  They are registered using
pmap_pti_add_kva().  Besides initial registrations done during boot,
LDT and non-common TSS segments are registered if user requested their
use.  In principle, they can be installed into kernel page table per
pmap with some work.  Similarly, PCPU can be hidden from userspace
mapping using trampoline PCPU page, but again I do not see much
benefits besides complexity.

PDPE pages for the kernel half of the user page tables are
pre-allocated during boot because we need to know pml4 entries which
are copied to the top-level paging structure page, in advance on a new
pmap creation.  I enforce this to avoid iterating over the all
existing pmaps if a new PDPE page is needed for PTI kernel mappings.
The iteration is a known problematic operation on i386.

The need to flush hidden kernel translations on the switch to user
mode make global tables (PG_G) meaningless and even harming, so PG_G
use is disabled for PTI case.  Our existing use of PCID is
incompatible with PTI and is automatically disabled if PTI is
enabled.  PCID can be forced on only for developer's benefit.

MCE is known to be broken, it requires IST stack to operate completely
correctly even for non-PTI case, and absolutely needs dedicated IST
stack because MCE delivery while trampoline did not switched from PTI
stack is fatal.  The fix is pending.

Reviewed by:	markj (partially)
Tested by:	pho (previous version)
Discussed with:	jeff, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2018-01-17 11:44:21 +00:00
kib
72f9a98571 Amd64 user_ldt_deref() is not used outside sys_machdep.c. Mark it as
static.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-01-17 11:21:03 +00:00
tuexen
73677d4c35 Add missing assignment to make sure non-first cmsgs are handled as such. 2018-01-17 10:30:49 +00:00
wma
687a4fd55c PPC64: implement missing busdma ops
Add missing little-endian 64-bit read and write. Since there
is no direct ASM opcode for this, perform byte swap if
necessary.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 09:45:18 +00:00
wma
0ae6682f6b PPC64: fix copyinout ranges
Use current userspace address for segment mapping. Previously,
there was a bug which made the funciton constantly using the userspace
base address which could cause data integrity issues.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 09:36:48 +00:00
wma
225d71d135 PPC64: add CXGBE and remove AHCI from GENERIC64
Add CXGBE driver which is required for PowerNV system.
Also, remove AHCI which does not work in BigEndian.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 09:33:16 +00:00
wma
842790678f PowerNV: workaround console on OPAL 5.4
FreeBSD prints text char-by-char, which is not what OPAL
is designed to. Poll events more frequently to avoid buffer
overflow and loosing data.

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 08:01:51 +00:00
wma
4b5982d8ab PowerNV: make PowerNV PCIe working on a real hardware
Fixes:
- map all devices to PE0
- use 1:1 TCE mapping
- provide the same TCE mapping for all PEs (not only PE0)
- add TCE reset and alignment (required by OPAL)

Created by:            Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
2018-01-17 07:39:11 +00:00
mckusick
5fda4f45ad Rename cgget => cglookup to clear name space for new libufs function cgget.
No functional change.
2018-01-17 06:31:21 +00:00
landonf
2789832d3c bhndb_pci(4): fix incorrect BHND_PCI_SRSH_PI workaround
On a SPROM-less device, the PCI(e) bridge core will be initialized with its
power-on-reset defaults; this can leave the SPROM-derived BHND_PCI_SRSH_PI
value pointing to the wrong backplane address. This value is used by the
PCI core when performing address translation between the static register
windows in BAR0 that map the PCI core's register block, and backplane
address space.

Previously, bhndb_pci(4) incorrectly used the potentially invalid static
BAR0 PCI register windows when attempting to correct the BHND_PCI_SRSH_PI
value in the PCI core's SPROM shadow.

Instead, we now read/update BHND_PCI_SRSH_PI by fetching the PCI core's
backplane address from the core enumeration table, and then using a dynamic
register window to explicitly map the PCI core's register block into BAR0.

Sponsored by:	The FreeBSD Foundation
2018-01-17 03:34:26 +00:00
pfg
d32751c6b7 SPDX: finish tagging sys/cam. 2018-01-16 23:19:57 +00:00
ian
f0c14bee67 Remove redundant critical_enter/exit() calls. The block of code delimited
by these calls is now protected by a spin mutex (obscured within the
RTC_LOCK/RTC_UNLOCK macros).

Reported by:	bde@
2018-01-16 23:18:52 +00:00
ian
2fcaa5e746 Move some code around and rename a couple variables; no functional changes.
The static atrtc_set() function was called only from clock_settime(), so
just move its contents entirely into clock_settime() and delete atrtc_set().

Rename the struct bcd_clocktime variables from 'ct' to 'bct'.  I had
originally wanted to emphasize how identical the clocktime and bcd_clocktime
structs were, but things evolved to the point where the structs are not at
all identical anymore, so now emphasizing the difference seems better.
2018-01-16 23:14:12 +00:00
pfg
bf3a218e8b scsi_ch.c: Small cleanups to the comments.
Move the the NetBSD tag near to the related licence. Update it to reflect
better the point where we started diverging.

Use grouping parenthesis for the SPDX tag.

No functional change.
2018-01-16 23:08:25 +00:00
tuexen
b096e231df Fix a bug related to fast retransmissions.
When processing a SACK advancing the cumtsn-ack in fast recovery,
increment the miss-indications for all TSN's reported as missing.

Thanks to Fabian Ising for finding the bug and to Timo Voelker
for provinding a fix.

This fix moves also CMT related initialisation of some variables
to a more appropriate place.

MFC after:	1 week
2018-01-16 21:58:38 +00:00
arichardson
1c21ef5dad Use ln -n instead of -h to allow building the kernel on Linux
Both flags do the same thing but -n is more widely supported.

Reviewed By:	jhb, emaste
Approved By:	jhb (mentor)
Differential Revision: https://reviews.freebsd.org/D13936
2018-01-16 21:43:57 +00:00
arichardson
b14a698c09 Allow xinstall and makefs to be crossbuilt on Linux and Mac
I need these tools in order to install the crossbuilt FreeBSD and create a
disk image. Linux does not have a st_flags in struct stat so unfortunately
I need a bunch of ugly ifdefs. The resulting binaries allow me to
sucessfully install a MIPS64 world and create a disk-image that boots.

Reviewed By:	brooks, bdrewery, emaste
Approved By:	jhb (mentor)
Differential Revision: https://reviews.freebsd.org/D13307
2018-01-16 21:43:46 +00:00
arichardson
55c1d10911 Don't build share/syscons in build-tools stage if MK_SYSCONS == "no"
Reviewed By:	emaste, jhb
Approved By:	jhb (mentor)
Differential Revision: https://reviews.freebsd.org/D12926
2018-01-16 21:43:36 +00:00
arichardson
90f63e1c7f libnetbsd: Make the function declaration of efopen() match the definition
In order to crossbuild FreeBSD on Mac/Linux I also need to build libnetbsd
and FILE* is not equal to struct __sFILE on those platforms.

Reviewed By:	brooks, emaste
Approved By:	jhb (mentor)
Differential Revision: https://reviews.freebsd.org/D13305
2018-01-16 21:43:21 +00:00
tsoome
3cbec545e7 utf8_to_ucs2() should check for malloc failure
utf8_to_ucs2() is calling malloc() without checking the result.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D13933
2018-01-16 20:35:54 +00:00
kevans
9614593728 service(8): Reset OPTIND properly now that we're parsing args twice
r328032 introduced a second round of argument parsing to proxy the request
through to a jail as needed, but failed to reset OPTIND before getting to
the second round of parsing to allow other flags to be set.

Reported by:	Oleg Ginzburg <olevole olevole ru>
2018-01-16 20:14:31 +00:00
tuexen
aefeae56df Improve the printing of cmgs when the length is 0. Fix error handling. 2018-01-16 20:02:07 +00:00
tuexen
4df44feae4 Using %p already prints "0x", so don't do it explicitly. 2018-01-16 19:57:30 +00:00
jhb
9c89db2019 Split crp_buf into a union.
This adds explicit crp_mbuf and crp_uio pointers of the right type to
replace casts of crp_buf.  This does not sweep through changing existing
code, but new code should use the correct fields instead of casts.

Reviewed by:	kib
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D13927
2018-01-16 19:41:18 +00:00
pfg
0c49a087dc ext2fs: use mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). These
are not likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.
2018-01-16 19:29:32 +00:00
philip
7a77c1bc0f Import tzdata 2018a
Changes: https://github.com/eggert/tz/blob/2018a/NEWS

MFC after:	3 days
2018-01-16 18:36:25 +00:00
emaste
8823aaaff9 kldxref: handle modules with md_cval at the end of allocated sections
Attempting to retrieve an md_cval string from a kernel module with
kldxref would throw a offset error for modules created using lld, since
this value would be placed at the end of all allocated sections.

Add an ef_read_seg_string method to the ef interface, to allow reading
strings of varying size without attempting to read beyond the segment's
bounds.

PR:		224875
Submitted by:	Mitchell Horne <mhorne063@gmail.com>
Reviewed by:	cem, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D13923
2018-01-16 18:20:12 +00:00
br
35bb6ca08e Fix bug: increment the value of pmcstat_npmcs instead of moving pointer
to the next int position.

Bug was introduced in r324959 ("Extract a set of pmcstat functions and
interfaces to the new internal library -- libpmcstat.")

This fixes pmcstat top mode (-T) operation.
Example: pmcstat -n1 -S clock.hard -T

Reported by:	Peter Holm <peter@holm.cc>
Sponsored by:	DARPA, AFRL
2018-01-16 09:31:01 +00:00
wma
b76b1a3176 PowerNV: XICS support for PowerNV/OPAL
Make XICS to be OPAL-aware.

Created by:            Nathan Whitehorn <nwhitehorn@freebsd.org>
Submitted by:          Wojciech Macek <wma@semihalf.com>
Sponsored by:          FreeBSD Foundation
2018-01-16 06:24:19 +00:00
pfg
d3692a35af Fix build after r328020.
Should have noticed earlier but the build was already broken by another
change.

Reported by:	Ravi Pokala
2018-01-16 06:04:39 +00:00
jhibbits
7c500b69ee Make fsl_sata driver work on P1022
P1022 SATA controller may set the wrong CCR bit for a command completion.
This would previously cause an interrupt storm.  Solve this by marking all
commands complete, and letting the end_transaction deal with the successes.
Causes no problems on P5020.

While here, fix a minor bug in collision detection.  The Freescale SATA
controller only has 16 slots, not 32.
2018-01-16 04:50:23 +00:00
ian
6ac58f6094 Add static inline rtcin_locked() and rtcout_locked() functions for doing a
related series of operations without doing a lock/unlock for each byte.
Use them when reading and writing the entire set of time registers.

The original rtcin() and writertc() functions which do lock/unlock on each
byte still exist, because they are public and called by outside code.
2018-01-16 03:02:41 +00:00
cem
3bac5f1698 random(4): Add CCP random source definitions
The implementation will follow (D12723).  For now, get the changes to
commit-protected files out of the way.

Approved by:	secteam (gordon)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13925
2018-01-16 02:56:27 +00:00
jhb
c3405f6ff4 Rename 'recv' to 'receive' to appease shadow warnings from GCC. 2018-01-16 01:21:07 +00:00