Commit Graph

310 Commits

Author SHA1 Message Date
tychon
2a8ac69f46 Add support for capturing 'struct ptrace_lwpinfo' for signals
resulting in a process dumping core in the corefile.

Also extend procstat to view select members of 'struct ptrace_lwpinfo'
from the contents of the note.

Sponsored by:	Dell EMC Isilon
2017-03-30 18:21:36 +00:00
kib
e617648699 A followup to r315749, two more places where brand->interp_path was
accessed unconditionally.

Reported by:	se
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-30 04:21:02 +00:00
ed
a9f48eafc0 Don't require the presence of the compat_3_brand.
The existing ELF image activator requires the brandinfo to provide such
a string unconditionally, even if the executable format in question
doesn't use this type of branding. Skip matching when it's a null
pointer.

Reviewed by:	kib
MFC after:	2 weeks
2017-03-23 14:09:45 +00:00
kib
da63ef1f60 Update r315753 with the proper flag name.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-22 22:28:13 +00:00
kib
a22b5a3135 Add a flag BI_BRAND_ONLY_STATIC to specify that the brand only
matches static binaries.

Interpretation of the 'static' there is that the binary must not
specify an interpreter.  In particular, shared objects are matched by
the brand if BI_CAN_EXEC_DYN is also set.

This improves precision of the brand matching, which should eliminate
surprises due to brand ordering.

Revert r315701.

Discussed with and tested by:	ed (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-22 22:23:01 +00:00
kib
08a98416ad Adjust r314851 to not require every brand to specify interpreter path.
Reported and tested by:	ed
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-22 22:06:48 +00:00
alc
fb24921f88 Avoid unnecessary calls to vm_map_protect() in elf_load_section().
Typically, when elf_load_section() unconditionally passed VM_PROT_ALL to
elf_map_insert(), it was needlessly enabling execute access on the
mapping, and it would later have to call vm_map_protect() to correct the
mapping's access rights.  Now, instead, elf_load_section() always passes
its parameter "prot" to elf_map_insert().  So, elf_load_section() must
only call vm_map_protect() if it needs to remove the write access that
was temporarily granted to perform a copyout().

Reviewed by:	kib
MFC after:	1 week
2017-03-18 23:37:00 +00:00
kib
c77fb55571 Accept the linker's representation for ELF segments with zero on-disk length.
For such segments, the GNU bfd linker writes a knowingly incorrect value
into the file offset field of the program header entry, with the
motivation that the file should not be mapped for creation of this segment
at all.

Relax checks for the ELF structure validity when the on-disk segment
length is zero, and explicitly set the mapping length to zero for such
segments to avoid validating rounding arithmetic.

PR:	217610
Reported by:	Robert Clausecker <fuz@fuz.su>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-12 13:51:13 +00:00
kib
3d6312cf4f Style.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-12 13:49:42 +00:00
alc
8854b6f932 Simplify the control flow and tidy up a comment in map_insert.
In collaboration with:	kib
MFC after:	1 week
2017-03-11 18:57:13 +00:00
kib
02da26ef90 When selecting brand based on old Elf branding, prefer the brand whose
interpreter exactly matches the one requested by the activated image.

This change applies r295277, which did the same for note branding, to
the old brand selection, with the same reasoning of fixing compat32
interpreter substitution.

PR:	211837
Reported by:	kenji@kens.fm
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-07 13:38:25 +00:00
kib
2594ff8ef5 Require whole brand string matching for old Elf branding.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-07 13:37:35 +00:00
kib
df94808752 Consistently use vm_ooffset_t type for the vm object offset in
elf_load_section.

The values passed currently as vm_offset_t are phdr.p_offset, which
have the native Elf word size.  Since elf_load_section interprets them
as the file offset, use vm object offset type.

Noted and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-07 13:36:43 +00:00
kib
5c17fc3007 Instead of direct use of vm_map_insert(), call vm_map_fixed(MAP_CHECK_EXCL).
This KPI explicitly indicates the intent of creating the mapping at
the fixed address, and incorporates the map locking into the callee.

Suggested and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-06 14:09:54 +00:00
alc
c1537aec36 Style and punctuation fixes.
Reviewed by:	kib
MFC after:	3 days
2017-03-05 23:59:04 +00:00
kib
6a00199f55 Style.
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2017-03-02 17:35:13 +00:00
kib
4f433ead68 Use vm_map_insert() instead of vm_map_find() in elf_map_insert().
Elf_map_insert() needs to create a mapping at a known fixed address.
Usage of vm_map_find() assumes, on the other hand, that any suitable
address space range at or above the specified hint is acceptable.
Because it operates on a fresh or cleared address space, vm_map_find()
usually creates the mapping starting exactly at the hint.

Switch to vm_map_insert() use to clearly request fixed mapping from
the VM.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-03-01 10:28:15 +00:00
kib
cf53f19d89 When deallocating the vm object in elf_map_insert() due to
vm_map_insert() failure, drop the vnode lock around the call to
vm_object_deallocate().

Since the deallocated object is the vm object of the vnode, we might
get vnode lock recursion there.  In fact, it is almost impossible
to make vm_map_insert() fail there on a stock kernel.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-03-01 10:22:07 +00:00
jhb
36dfedc923 Copy the e_machine and e_flags fields from the binary into an ELF core dump.
In the kernel, cache the machine and flags fields from the ELF header to use in
the ELF header of a core dump. For gcore, copy these fields over from
the ELF header in the binary.

This matters for platforms which encode ABI information in the flags field
(such as o32 vs n32 on MIPS).

Reviewed by:	kib
Sponsored by:	DARPA / AFRL
Differential Revision:	https://reviews.freebsd.org/D9392
2017-02-07 20:34:03 +00:00
emaste
ed36e5b9a6 imgact_elf: refactor et_dyn_addr calculation
This simplifies the logic somewhat. It is extracted from the change in
review in D5603.

Differential Revision:	https://reviews.freebsd.org/D9321
2017-01-24 22:46:43 +00:00
avg
05e4e60349 don't abort writing of a core dump after EFAULT
It's possible to get EFAULT when writing a segment backed by a file
if the segment extends beyond the file.
The core dump could still be useful if we skip the rest of the segment
and proceed to other segments.
The skipped segment (or a portion of it) will be zero-filled.

While there, use 'const' to signify that core_write() only reads the
buffer and use __DECONST before calling vn_rdwr_inchunks() because it
can be used for both reading and writing.

Before the change:
kernel: Failed to write core file for process mmap_trunc_core (error 14)
kernel: pid 77718 (mmap_trunc_core), uid 1001: exited on signal 6

After the change:
kernel: Failed to fully fault in a core file segment at VA 0x800645000 with size 0x4000 to be written at offset 0x29000 for process mmap_trunc_core
kernel: pid 4901 (mmap_trunc_core), uid 1001: exited on signal 6 (core dumped)

Reviewed by:	julian, kib
Obtained from:	Panzura (older version of the change)
MFC after:	5 days
Sponsored by:	Panzura
Differential Revision: https://reviews.freebsd.org/D9233
2017-01-20 13:39:07 +00:00
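The skip-and-continue behaviour described in the commit above can be sketched in user-space Python. This is only an illustration under stated assumptions, not the kernel code: `dump_segments`, `read_mem`, and `demo_reader` are hypothetical names, the reader is assumed to return exactly the number of bytes requested, and the sketch writes explicit zeros where the commit message says only that the skipped part ends up zero-filled.

```python
import errno


def dump_segments(segments, read_mem, chunk=4096):
    """Concatenate segments into one buffer, tolerating EFAULT.

    segments: iterable of (base_address, size) pairs.
    read_mem: hypothetical callable (addr, n) -> bytes that raises
              OSError(errno.EFAULT) when the range cannot be faulted in.
    """
    out = bytearray()
    for base, size in segments:
        done = 0
        while done < size:
            n = min(chunk, size - done)
            try:
                out += read_mem(base + done, n)
            except OSError as e:
                if e.errno != errno.EFAULT:
                    raise  # any other error still aborts the dump
                # Skip the rest of this segment, leaving zeros in its
                # place, then carry on with the next segment.
                out += bytes(size - done)
                break
            done += n
    return bytes(out)


def demo_reader(addr, n):
    """Hypothetical reader: the second page of the first segment faults."""
    if 4096 <= addr < 8192:
        raise OSError(errno.EFAULT, "Bad address")
    return b"\xaa" * n
```

With `demo_reader`, dumping `[(0, 8192), (0x10000, 4096)]` keeps the first page, zero-fills the faulting page, and still writes the second segment.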
kib
4a27b2c086 Style.
Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2016-10-04 15:23:03 +00:00
nwhitehorn
dabf3a7300 Back out misfired extra file in r305108.
2016-08-31 04:03:55 +00:00
nwhitehorn
e28e507026 Refix operation on sparse CPU mappings as in r302372, temporarily broken
by r304716.

PR:		kern/210106
MFC after:	2 days
2016-08-31 04:02:52 +00:00
cem
1b6fb112bc imgact_elf: Rename the segment iterator to match reality
The each_writable_segment routine evaluates segments on a slightly more
nuanced metric than simply "writable" or not.  Rename the function to more
closely match its behavior (each_dumpable_segment).

Suggested by:	jhb
Sponsored by:	EMC / Isilon Storage Division
2016-07-20 22:51:33 +00:00
cem
1e19a6f1d9 ANSI-fy imgact_elf.c
Sponsored by:	EMC / Isilon Storage Division
2016-07-20 22:46:56 +00:00
cem
b8b7be1d97 Fix DEBUG build on 64-bit arch after r303099
Reported by:	Larry Rosenman <ler at lerctr.org>
2016-07-20 18:11:22 +00:00
cem
08b61c5d52 Extend ELF coredump to support more than 65535 segments
The ELF e_phnum field is only 16 bits wide. To support more than 65535 segments
(program headers), Sun's "Linker and Libraries Guide" table 7-7 (or 12-7,
depending on document version) prescribes a special first section header where
sh_info represents the real number of program headers.

Test code to follow, when it is ready.

Reference:	http://docs.oracle.com/cd/E18752_01/pdf/817-1984.pdf

Reviewed by:	emaste, markj
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D7255
2016-07-20 16:59:36 +00:00
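A reader of such a core file has to apply the same rule in reverse. The following is a minimal sketch, assuming a little-endian ELF64 image held in memory; `program_header_count` and `build_image` are hypothetical helper names, and `PN_XNUM` (0xffff) is the conventional marker value for "real count is in section header 0".

```python
import struct

PN_XNUM = 0xFFFF  # e_phnum escape value: real count lives in sh_info

# Little-endian ELF64 layouts (assumption: no other class/endianness).
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, 64 bytes
SHDR = struct.Struct("<IIQQQQIIQQ")        # Elf64_Shdr, 64 bytes


def program_header_count(image: bytes) -> int:
    """Return the real number of program headers in an ELF64 image."""
    fields = EHDR.unpack_from(image, 0)
    e_shoff, e_phnum = fields[6], fields[10]
    if e_phnum != PN_XNUM:
        return e_phnum
    # Extended numbering: sh_info of the first section header holds
    # the real program header count.
    return SHDR.unpack_from(image, e_shoff)[7]


def build_image(e_phnum: int, sh_info: int) -> bytes:
    """Hypothetical helper: a header plus one section header, for testing."""
    ident = b"\x7fELF" + bytes(12)
    ehdr = EHDR.pack(ident, 4, 62, 1, 0, 64, 64, 0, 64, 56,
                     e_phnum, 64, 1, 0)
    shdr = SHDR.pack(0, 0, 0, 0, 0, 0, 0, sh_info, 0, 0)
    return ehdr + shdr
```

For example, `program_header_count(build_image(PN_XNUM, 70000))` yields 70000, while an image with an ordinary `e_phnum` is returned unchanged.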
jhb
5535084c1a Include process IDs in core dumps.
When threads were added to the kernel, the pr_pid member of the
NT_PRSTATUS note was repurposed to store LWP IDs instead of process
IDs.  However, the process ID was no longer recorded in core dumps.
This change adds a pr_pid field to prpsinfo (NT_PRSINFO).  Rather than
bumping the prpsinfo version number, note parsers can use the note's
payload size to determine if pr_pid is present.

Reviewed by:	kib, emaste (older version)
MFC after:	2 months
Differential Revision:	https://reviews.freebsd.org/D7117
2016-07-18 15:14:23 +00:00
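The size-based detection mentioned in the commit above might look like the sketch below for a note parser. The struct layout here (field widths, padding, and the offset of `pr_pid`) is an assumption made for illustration on a 64-bit little-endian target and is not copied from the FreeBSD headers; `parse_prpsinfo` is a hypothetical name.

```python
import struct

# Assumed layout for illustration only: version, padding, size,
# 17-byte file name, 81-byte argument string.
OLD_FIELDS = struct.Struct("<i4xQ17s81s")
PID_FIELD = struct.Struct("<i")
PID_OFFSET = 116  # assumed offset of pr_pid after alignment padding


def parse_prpsinfo(payload: bytes) -> dict:
    """Parse an NT_PRPSINFO-style payload, with or without pr_pid."""
    version, size, fname, psargs = OLD_FIELDS.unpack_from(payload, 0)
    pid = None
    # No version bump: the payload length alone tells us whether the
    # producer appended pr_pid.
    if len(payload) >= PID_OFFSET + PID_FIELD.size:
        (pid,) = PID_FIELD.unpack_from(payload, PID_OFFSET)
    return {
        "version": version,
        "fname": fname.split(b"\0")[0].decode(),
        "psargs": psargs.split(b"\0")[0].decode(),
        "pid": pid,
    }
```

An older, shorter payload simply reports `pid` as `None`, so the same parser handles core files written before and after the change.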
jhb
9a57990b79 Include command line arguments in core dump process info.
Fill in pr_psargs in the NT_PRSINFO ELF core dump note with command
line arguments.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D7116
2016-07-14 23:20:05 +00:00
emaste
c769866ed1 add description for debug.elf{32,64}_legacy_coredump sysctl
Approved by:	re (kib)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2016-07-05 14:46:06 +00:00
ian
8d8c35656e Include machine/acle-compat.h in cdefs.h on arm if the compiler doesn't
have ACLE support built in.  The ACLE (ARM C Language Extensions) defines
a set of standardized symbols which indicate the architecture version and
features available.  ACLE support is built in to modern compilers (both
clang and gcc), but absent from gcc prior to 4.4.

ARM (the company) provides the acle-compat.h header file to define the
right symbols for older versions of gcc.  Basically, acle-compat.h does
for arm about the same thing cdefs.h does for freebsd: defines
standardized macros that work no matter which compiler you use.  If ARM
hadn't provided this file we would have ended up with a big #ifdef __arm__
section in cdefs.h with our own compatibility shims.

Remove #include <machine/acle-compat.h> from the zillion other places (an
ever-growing list) that it appears.  Since style(9) requires sys/types.h
or sys/param.h early in the include list, and both of those lead to
including cdefs.h, only a couple special cases still need to include
acle-compat.h directly.

Loves it:     imp
2016-05-25 19:44:26 +00:00
pfg
729533413f sys: use our roundup2/rounddown2() macros when param.h is available.
rounddown2 tends to produce longer lines than the original code
and when the code has a high indentation level it was not really
advantageous to do the replacement.

This tries to strike a balance between readability using the macros
and flexibility of having the expressions, so not everything is
converted.
2016-04-21 19:57:40 +00:00
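For readers unfamiliar with the macros named in the commit above, the arithmetic they perform can be sketched as follows. This is a Python illustration of the usual power-of-two rounding idiom, not the macro text from param.h; the functions reuse the macro names only for readability.

```python
def rounddown2(x: int, y: int) -> int:
    """Round x down to a multiple of y, where y is a power of two."""
    assert y > 0 and y & (y - 1) == 0, "y must be a power of two"
    return x & ~(y - 1)


def roundup2(x: int, y: int) -> int:
    """Round x up to a multiple of y, where y is a power of two."""
    assert y > 0 and y & (y - 1) == 0, "y must be a power of two"
    return (x + (y - 1)) & ~(y - 1)
```

With a 4 KiB page size, `rounddown2(0x1234, 0x1000)` gives `0x1000` and `roundup2(0x1234, 0x1000)` gives `0x2000`; values already on a boundary are returned unchanged.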
trasz
ca92bb3067 Remove some NULL checks for M_WAITOK allocations.
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-03-29 13:56:59 +00:00
kib
977f53633f When matching brand to the ELF binary by notes, try to find a brand
with interpreter name exactly matching one wanted by the binary.  If
no such brand exists, return first brand which accepted the binary by
note.

The change fixes a regression after r292749, where e.g. our two ia32
compat brands, ia32_brand_info and ia32_brand_oinfo, differ only by
the interpreter path, and a binary matches a brand by linkage order.
Old binaries which require /usr/libexec/ld-elf.so.1 but matched
against ia32_brand_info with interp_path /libexec/ld-elf.so.1 were
then considered to require a non-standard interpreter name, and the
magic to force ld-elf32.so.1 did not happen.

Note that it might make sense to apply the same selection of brands
for other matching criteria, SCO EI_OSABI and 3.x string.

Reported and tested by:	dwmalone
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2016-02-04 20:55:49 +00:00
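The selection rule from the commit above (prefer an exact interpreter match, otherwise fall back to the first brand that accepted the binary by note) can be sketched as below. The `Brand` class, its `accepts_note` flag, and `select_brand_by_note` are hypothetical stand-ins for the kernel's brand table and note check; only the brand names and interpreter paths come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Brand:
    name: str
    interp_path: Optional[str]
    accepts_note: bool  # stand-in for the real note-matching check


def select_brand_by_note(brands: Sequence[Brand],
                         wanted_interp: Optional[str]) -> Optional[Brand]:
    """Pick a brand for a binary, preferring an exact interpreter match."""
    first_accepting = None
    for brand in brands:
        if not brand.accepts_note:
            continue
        if wanted_interp is not None and brand.interp_path == wanted_interp:
            return brand  # exact interpreter match wins immediately
        if first_accepting is None:
            first_accepting = brand  # remember the linkage-order fallback
    return first_accepting


BRANDS = [
    Brand("ia32_brand_info", "/libexec/ld-elf.so.1", True),
    Brand("ia32_brand_oinfo", "/usr/libexec/ld-elf.so.1", True),
]
```

With `BRANDS` in this order, a binary asking for `/usr/libexec/ld-elf.so.1` selects `ia32_brand_oinfo` rather than whichever brand happens to be linked first.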
kib
cc13042464 Do not substitute interpreter if the brand interpreter path is
different from the interpreter path requested by the binary.

Before this change, it was impossible to activate a non-default
interpreter for a 32bit image on amd64 when the /libexec/ld-elf32.so.1
file exists.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-12-26 15:40:12 +00:00
jtl
f41bf39357 Only allow one PT_INTERP ELF program header. This also fixes a potential
memory leak for interp_buf.

Differential Revision:	https://reviews.freebsd.org/D4692
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	Juniper Networks
2015-12-24 00:58:11 +00:00
kib
bcb048ba0c If we annoy user with the terminal output due to failed load of
interpreter, also show the actual error code instead of some
interpretation.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-12-22 20:12:52 +00:00
emaste
0d1c50f494 Replace magic value ELF note type with NT_FREEBSD_ABI_TAG
As of r291909 elf_common.h provides a definition.

Suggested by:	kib
Sponsored by:	The FreeBSD Foundation
2015-12-07 18:43:27 +00:00
kib
80e8626b43 Add support for usermode (vdso-like) gettimeofday(2) and
clock_gettime(2) on ARMv7 and ARMv8 systems which have architectural
generic timer hardware. It is similar to how the RDTSC timer is used in
userspace on x86.

Fix a permission problem where generic timer access from EL0 (or
userspace on v7) was not properly initialized on APs.

For ARMv7, mark the stack non-executable. The shared page is added for
all arms (including ARMv8 64bit), and the signal trampoline code is
moved to the page.

Reviewed by:	andrew
Discussed with:	emaste, mmel
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D4209
2015-12-07 12:20:26 +00:00
nwhitehorn
635273ca5b Missed header_supported call from r291020: make really, really sure the brand
likes the executable.
2015-12-01 17:00:31 +00:00
nwhitehorn
2b225aeb0b Extend r270123 to run the brand info's header_supported() routine for
branded as well as unbranded binaries. This will be required to add
support for the new ELFv2 ABI on powerpc64, which is distinguished from
ELFv1 by the contents of the ELF header's flags field.

Reviewed by:	imp
MFC after:	2 weeks
2015-11-18 17:03:22 +00:00
ngie
c5d7b522c7 Define compress in __elfN(coredump) when #ifdef GZIO is true to mute
an -Wunused-but-set-variable warning

Reported by: FreeBSD_HEAD_amd64_gcc4.9 jenkins job
Sponsored by: EMC / Isilon Storage Division
2015-11-02 01:47:26 +00:00
kib
a6091af923 Allow PT_INTERP and PT_NOTES segments to be located anywhere in the
executable image.  Keep one page (arbitrary) limit on the max allowed
size of the PT_NOTES.

The ELF image activators still require that program headers of the
executable are fully contained in the first page of the image file.

Reviewed by:	emaste, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D3871
2015-10-14 18:27:35 +00:00
cem
9c1e214f79 Fix core corruption caused by race in note_procstat_vmmap
This fix is spiritually similar to r287442 and was discovered thanks to
the KASSERT added in that revision.

NT_PROCSTAT_VMMAP output length, when packing kinfo structs, is tied to
the length of filenames corresponding to vnodes in the process' vm map
via vn_fullpath.  As vnodes may move during coredump, this is racy.

We do not remove the race, only prevent it from causing coredump
corruption.

- Add a sysctl, kern.coredump_pack_vmmapinfo, to allow users to disable
  kinfo packing for PROCSTAT_VMMAP notes.  This avoids VMMAP corruption
  and truncation, even if names change, at the cost of up to PATH_MAX
  bytes per mapped object.  The new sysctl is documented in core.5.

- Fix note_procstat_vmmap to self-limit in the second pass.  This
  addresses corruption, at the cost of sometimes producing a truncated
  result.

- Fix PROCSTAT_VMMAP consumers libutil (and libprocstat, via copy-paste)
  to grok the new zero padding.

Reported by:	pho (https://people.freebsd.org/~pho/stress/log/datamove4-2.txt)
Relnotes:	yes
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D3824
2015-10-06 18:07:00 +00:00
cem
a8fae65d69 Follow-up to r287442: Move sysctl to compiled-once file
Avoid duplicate sysctl nodes.

Found by:	tijl
Approved by:	markj (mentor)
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D3586
2015-09-07 16:44:28 +00:00
cem
f96df638b8 Detect badly behaved coredump note helpers
Coredump notes depend on being able to invoke dump routines twice; once
in a dry-run mode to get the size of the note, and another to actually
emit the note to the corefile.

When a note helper emits a different length section the second time
around than the length it requested the first time, the kernel produces
a corrupt coredump.

NT_PROCSTAT_FILES output length, when packing kinfo structs, is tied to
the length of filenames corresponding to vnodes in the process' fd table
via vn_fullpath.  As vnodes may move around during dump, this is racy.

So:

 - Detect badly behaved notes in putnote() and pad underfilled notes.

 - Add a fail point, debug.fail_point.fill_kinfo_vnode__random_path to
   exercise the NT_PROCSTAT_FILES corruption.  It simply picks random
   lengths to expand or truncate paths to in fo_fill_kinfo_vnode().

 - Add a sysctl, kern.coredump_pack_fileinfo, to allow users to
   disable kinfo packing for PROCSTAT_FILES notes.  This should avoid
   both FILES note corruption and truncation, even if filenames change,
   at the cost of about 1 kiB in padding bloat per open fd.  Document
   the new sysctl in core.5.

 - Fix note_procstat_files to self-limit in the 2nd pass.  Since
   sometimes this will result in a short write, pad up to our advertised
   size.  This addresses note corruption, at the risk of sometimes
   truncating the last several fd info entries.

 - Fix NT_PROCSTAT_FILES consumers libutil and libprocstat to grok the
   zero padding.

With suggestions from:	bjk, jhb, kib, wblock
Approved by:	markj (mentor)
Relnotes:	yes
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D3548
2015-09-03 20:32:10 +00:00
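The two-pass scheme and the padding fix described in the commit above can be sketched as follows. This is a user-space illustration under assumptions, not the kernel's putnote(): `put_note`, `racy_producer`, and the producer callback interface are hypothetical.

```python
def put_note(produce, out: bytearray) -> int:
    """Emit one note body using a dry-run pass and a real pass.

    produce: hypothetical callable returning the note body as bytes;
             it may return a different length on each call (the race).
    Returns the advertised length, which is always what gets written.
    """
    advertised = len(produce())        # pass 1: dry run, size only
    data = produce()[:advertised]      # pass 2: self-limit to that size
    out += data
    out += bytes(advertised - len(data))  # zero-pad an underfilled note
    return advertised


def racy_producer(outputs):
    """Hypothetical helper: returns a different body on each call."""
    it = iter(outputs)
    return lambda: next(it)
```

If the second pass comes up short, as in `put_note(racy_producer([b"abcd", b"ab"]), out)`, the note is padded with zeros to the advertised four bytes; if it comes up long, it is truncated, so the surrounding file layout stays consistent either way.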
markj
0f95984131 Fix some error-handling bugs when core dump compression is enabled:
- Ensure that core dump parameters are initialized in the error path.
- Don't call gzio_fini() on a NULL stream.

Reported by:	rpaulo
2015-07-14 18:24:05 +00:00
mjg
d7bc9285a6 Implement lockless resource limits.
Use the same scheme implemented to manage credentials.

Code needing to look at process's credentials (as opposed to thread's) is
provided with *_proc variants of relevant functions.

Places which possibly had to take the proc lock anyway still use the proc
pointer to access limits.
2015-06-10 10:48:12 +00:00
emaste
41e1b133ab Add user facing errors for exceeding process memory limits
Previously the process terminating with SIGABRT at startup was the
only notification.

PR:		200617
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D2731
2015-06-08 16:07:07 +00:00