- Similar to the hack for bootinfo32.c in userboot, define
_MACHINE_ELF_WANT_32BIT in the load_elf32 file handlers in userboot.
This allows userboot to load 32-bit kernels and modules.
- Copy the SMAP generation code out of bootinfo64.c and into its own
file so it can be shared with bootinfo32.c to pass an SMAP to the i386
kernel.
- Use uint32_t instead of u_long when aligning module metadata in
bootinfo32.c in userboot, as otherwise the metadata used 64-bit
alignment which corrupted the layout.
- Populate the basemem and extmem members of the bootinfo struct passed
to 32-bit kernels.
- Fix the 32-bit stack in userboot to start at the top of the stack
instead of the bottom so that there is room to grow before the
kernel switches to its own stack.
- Push a fake return address onto the 32-bit stack in addition to the
arguments normally passed to exec() in the loader. This return
address is needed to convince recover_bootinfo() in the 32-bit
locore code that it is being invoked from a "new" boot block.
- Add a routine to libvmmapi to set up a 32-bit flat mode register state
including a GDT and TSS that is able to start the i386 kernel and
update bhyveload to use it when booting an i386 kernel.
- Use the guest register state to determine the CPU's current instruction
mode (32-bit vs 64-bit) and paging mode (flat, 32-bit, PAE, or long
mode) in the instruction emulation code. Update the gla2gpa() routine
used when fetching instructions to handle flat mode, 32-bit paging, and
PAE paging in addition to long mode paging. Don't look for a REX
prefix when the CPU is in 32-bit mode, and use the detected mode to
enable the existing 32-bit mode code when decoding the mod r/m byte.
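The alignment item above can be illustrated with a minimal sketch. The
helper name align_up() is hypothetical and is not taken from bootinfo32.c;
it only shows why rounding to sizeof(uint32_t) and rounding to an 8-byte
u_long (as in a 64-bit userboot build) produce different offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round "off" up to a power-of-two alignment. */
static size_t
align_up(size_t off, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((off + align - 1) & ~(align - 1));
}
```

With an offset of 9, 4-byte alignment yields 12 while 8-byte alignment
yields 16, so a 32-bit kernel walking the metadata would look for the next
record in the wrong place.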
Reviewed by: grehan, neel
MFC after: 1 month
* The RFC says (in section 10.1) that only when extbuf is not NULL,
extlen shall be checked, so don't perform this check when NULL is
passed.
* socklen_t is unsigned, so checking extlen for less than zero is
not needed.
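A minimal sketch of the two checks described above, assuming the interface
in question follows an inet6_opt_init()-style contract (extlen must be a
non-zero multiple of 8 only when a buffer is supplied). The function name
opt_init_sketch() is hypothetical; this is not the libc code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* socklen_t is unsigned, so an "extlen < 0" test could never be true. */
static_assert((socklen_t)-1 > 0, "socklen_t is expected to be unsigned");

/* Hypothetical stand-in: returns the empty header size (2) or -1. */
static int
opt_init_sketch(void *extbuf, socklen_t extlen)
{
    unsigned char *hdr = extbuf;

    if (hdr != NULL) {
        /* Validate extlen only when a buffer was actually passed. */
        if (extlen == 0 || (extlen % 8) != 0)
            return (-1);
        hdr[1] = (unsigned char)((extlen >> 3) - 1);
    }
    return (2);
}
```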
Submitted by: swildner@dragonflybsd.org
Reviewed by: Mark Martinec <Mark.Martinec+freebsd@ijs.si>
Reviewed by: hrs
Obtained by: DragonFlyBSD
the signal a second time, by adding the missing else before the if statement.
While there, postpone initializing the local curthread variable until
the passed signal number is checked for validity.
Submitted by: John Wolfe <jlw@xinuos.com>
PR: threads/186309
MFC after: 1 week
This also fixes asserts on removal of the module for the mpc74xx.
The PowerPC 970 processors have two different types of events: direct events
and indirect events. Thus far only direct events are supported. I included
some documentation in the driver on how indirect events work, but support is
for the future.
MFC after: 1 month
commit c1acf022c5
Author: Brooks Davis <brooks@one-eyed-alien.net>
Date: Fri Jan 17 21:46:44 2014 +0000
Add an option WITHOUT_NCURSESW to suppress building and linking to
libncursesw. While wide character support is useful, we'd like to
only need one ncurses library on embedded systems.
MFC after: 4 weeks
Sponsored by: DARPA, AFRL
the virtio backends.
- Add a new ioctl to export the count of pins on the I/O APIC from vmm
to the hypervisor.
- Use pins on the I/O APIC >= 16 for PCI interrupts leaving 0-15 for
ISA interrupts.
- Populate the MP Table with I/O interrupt entries for any PCI INTx
interrupts.
- Create a _PRT table under the PCI root bridge in ACPI to route any
PCI INTx interrupts appropriately.
- Track which INTx interrupts are in use per-slot so that functions
that share a slot attempt to distribute their INTx interrupts across
the four available pins.
- Implicitly mask INTx interrupts if either MSI or MSI-X is enabled
and when the INTx DIS bit is set in a function's PCI command register.
Either assert or deassert the associated I/O APIC pin when the
state of one of those conditions changes.
- Add INTx support to the virtio backends.
- Always advertise the MSI capability in the virtio backends.
Submitted by: neel (7)
Reviewed by: neel
MFC after: 2 weeks
known in advance, or where the caller doesn't care and just keeps
reading until it hits EOF.
In fetch_read(): the socket is non-blocking, so read() will return 0
on EOF, and -1 (errno == EAGAIN) when the connection is still open but
there is no data waiting. In the first case, we should immediately
return 0. The EINTR case was also broken, although not in a way that
matters.
In fetch_writev(): use timersub() and timercmp() as in fetch_read().
In http_fillbuf(): set errno to a sensible value when an invalid chunk
header is encountered.
In http_readfn(): as in fetch_read(), a zero return from down the
stack indicates EOF, not an error. Furthermore, when io->error is
EINTR, clear it (but not errno) before returning so the caller can
retry after dealing with the interrupt.
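
The read semantics described above can be sketched as follows. This is
not libfetch's fetch_read(); the names set_nonblocking() and read_some()
are hypothetical, and the timeout handling is simplified (the timeout
restarts on every wait instead of tracking a deadline with timersub()).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put a descriptor into non-blocking mode. */
static int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return (-1);
    return (fcntl(fd, F_SETFL, flags | O_NONBLOCK));
}

/* Hypothetical helper: return whatever is available, 0 on EOF, -1 on error. */
static ssize_t
read_some(int fd, char *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    ssize_t n;
    int r;

    assert(buf != NULL && len > 0);
    for (;;) {
        n = read(fd, buf, len);
        if (n >= 0)
            return (n);     /* data (possibly short), or 0 at EOF */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return (-1);    /* real error; EINTR is left to the caller */
        r = poll(&pfd, 1, timeout_ms);
        if (r == 0) {
            errno = ETIMEDOUT;
            return (-1);
        }
        if (r < 0)
            return (-1);
    }
}
```

The key point is that a zero return from read() is reported as EOF right
away, while EAGAIN only means "wait and try again".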
MFC after: 3 days
simply not trying to return exactly what the caller asked for - just
return whatever we got and let the caller be the judge of whether it
was enough. If an error occurs or the connection times out after we
already received some data, return a short read, under the assumption
that the next call will fail or time out before we read anything.
As it turns out, none of the code that calls fetch_read() assumes an
all-or-nothing result anyway, except for a couple of lines where we
read the CR LF at the end of a chunk in HTTP chunked encoding, so the
changes outside of fetch_read() and http_readfn() are minimal.
While there, replace select(2) with poll(2).
MFC after: 3 days
properly include sys/ headers from the source tree instead of the
host.
These patches are also applied to libdwarf since libdwarf requires
the same sys/ headers as libelf.
device is an active kernel console and "off" otherwise. This is designed to
allow serial-booting x86 systems to provide a login prompt on the serial line
by default without providing one on all systems by default.
Comments and suggestions by: grehan, dteske, jilles
MFC after: 1 month
The resolver in libc creates a kqueue for watching a single file descriptor.
This can be done using poll() which should be lighter on the kernel and
reduce possible problems with rlimits (file descriptors, kqueues).
Reviewed by: jhb
one significant difference: for LIB32 builds both TARGET_ARCH
and MACHINE_ARCH are defined. TARGET_ARCH confusingly holds the
architecture of the host (e.g. amd64), while MACHINE_ARCH holds
the architecture we're trying to build (e.g. i386). With both
set and different, r260022 changed the behaviour to interpret
the condition as building a cross-amd64 libkvm on i386, when
obviously we're trying to build an i386 version on amd64. When
COMPAT_32BIT is defined, we're building LIB32 and ignore the
value of TARGET_ARCH as we did before.
These files are required to get packages in ports to build against atf and
also to get a couple of currently-failing tests to pass.
I'm following the approach already used by the libusb pkg-config files
installed by the system regarding the location and the install rules.
MFC after: 5 days
As a result, the kernel needs to process shorter pathnames if fts is not
changing directories (if fts follows symlinks (-L option to utilities), fts
cannot open "." or FTS_NOCHDIR was specified).
Side effect: If pathnames exceed PATH_MAX, [ENAMETOOLONG] is not hit at the
stat stage but later (opendir or application fts_accpath) or not at all.
Because we respect the FreeBSD src tree layout under /usr/tests, and because
the layout of the tests in the atf distfile does not match the former, the
tests for atf-c++ were not able to find the process_helper binary.
Fix this by explicitly hardcoding the right path in the FreeBSD test suite.
Obtained from: atf (git 1f0e878f7f127741a3762883ef24aef317e239d5)
MFC after: 1 week
Put test programs for internal modules into a 'detail' subdirectory of the
libatf-c and libatf-c++ test directories, just as the upstream distribution
does. This is necessary because the tests assume such layout to find the
process_helper program, and currently fail because of this divergence.
MFC after: 1 week
* Set errno to EAFNOSUPPORT if an address is provided which is neither
AF_INET nor AF_INET6.
* Don't modify the arguments.
* Don't smash the stack when provided with a non-zero port.
* Handle the case correctly where the first address provided is
an IPv6 address.
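A minimal sketch of the first two items, using a hypothetical helper
check_family() that is not part of the changed function. It takes a const
pointer so the argument cannot be modified, and it reports unsupported
address families through errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical validator: 0 for AF_INET/AF_INET6, -1 with errno otherwise. */
static int
check_family(const struct sockaddr *sa)
{
    assert(sa != NULL);
    switch (sa->sa_family) {
    case AF_INET:
    case AF_INET6:
        return (0);
    default:
        errno = EAFNOSUPPORT;
        return (-1);
    }
}
```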
MFC after: 3 days
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
illumos/illumos-gate@43466aae47
NOTE: Make sure the boot code is updated if a zpool upgrade is
done on boot zpool.
MFC after: 2 weeks
* msun/man/sinh.3:
* msun/man/tanh.3:
. Fix grammar.
* msun/src/e_coshl.c:
* msun/src/e_sinhl.c:
. Fix comment.
* msun/src/s_tanhl.c:
. Remove unused variables.
. Fix location/indentation of comments.
. Use comparison involving ints instead of long double.
. Re-order polynomial evaluation on ld128 for |x| < 0.25.
For now, retain the older order in an "#if 0 ... #else" block.
. Use int comparison to short-circuit the |x| < 1.5 condition.
Requested by: bde
. Hook coshl, sinhl, and tanhl into libm.
. Create symbolic links for corresponding manpages.
. While here remove a nearby extraneous space.
* Symbol.map:
* src/math.h:
. Move coshl, sinhl, and tanhl to their proper locations.
* man/cosh.3:
* man/sinh.3:
* man/tanh.3:
. Update the manpages.
* src/e_cosh.c:
* src/e_sinh.c:
* src/s_tanh.c:
. Add weak reference for LDBL_MANT_DIG==53 targets.
* src/imprecise.c:
. Remove the coshl, sinhl, and tanhl kludge.
* src/e_coshl.c:
. ld80 and ld128 implementation of coshl().
* src/e_sinhl.c:
. ld80 and ld128 implementation of sinhl().
* src/s_tanhl.c:
. ld80 and ld128 implementation of tanhl().
Obtained from: bde (mostly), das and kargl
* ld128/k_expl.h:
. Split out a computational kernel, __k_expl(x, &hi, &lo, &k), from expl(x).
x must be finite and not tiny or huge. The kernel returns hi and lo
values for extra precision and an exponent k for a 2**k scale factor.
. Define additional kernels k_hexpl() and hexpl() that include a 1/2
scaling and are used by the hyperbolic functions.
* ld80/s_expl.c:
* ld128/s_expl.c:
. Use the __k_expl() kernel.
Obtained from: bde
file as follows:
1. Common ia64-specific support functions have the ia64_ prefix.
2. Functions that work on physical cores have the phys_ prefix.
3. Functions that work on virtual cores have the virt_ prefix.
With that:
1. _kvm_kvatop() has been renamed to phys_kvatop() as it handles
physical cores only.
2. The new _kvm_kvatop() is nothing but a wrapper that calls either
phys_kvatop() or virt_kvatop() by virtue of the kvatop function
pointer in the vmstate structure.
3. virt_kvatop() is nothing but a wrapper around virt_addr2off().
4. virt_addr2off() iterates over the Phdrs to find the segment in
which the address falls and return the file offset for it.
Now it's up to the kernel to populate the core file appropriately.
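
The dispatch and lookup described in the list above can be sketched like
this. All type and function names carry a _sketch suffix because they are
hypothetical stand-ins, not the libkvm definitions; seg_sketch models only
the three program header fields the lookup needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of an ELF Phdr. */
struct seg_sketch {
    uint64_t vaddr;     /* start virtual address of the segment */
    uint64_t memsz;     /* size of the segment in memory */
    uint64_t offset;    /* offset of the segment in the core file */
};

struct vmstate_sketch {
    const struct seg_sketch *segs;
    size_t nsegs;
    /* Selected once: physical-core or virtual-core translation. */
    int64_t (*kvatop)(const struct vmstate_sketch *, uint64_t);
};

/* Find the segment containing va and return its file offset, or -1. */
static int64_t
virt_addr2off_sketch(const struct vmstate_sketch *vm, uint64_t va)
{
    for (size_t i = 0; i < vm->nsegs; i++) {
        const struct seg_sketch *s = &vm->segs[i];

        if (va >= s->vaddr && va - s->vaddr < s->memsz)
            return ((int64_t)(s->offset + (va - s->vaddr)));
    }
    return (-1);
}

/* Wrapper that only forwards to whichever routine was installed. */
static int64_t
kvatop_sketch(const struct vmstate_sketch *vm, uint64_t va)
{
    assert(vm != NULL && vm->kvatop != NULL);
    return (vm->kvatop(vm, va));
}
```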
produced will be called libkvm-${ARCH} instead of libkvm. This allows
installing it alongside the native version.
For symbol lookups, use ps_pglobal_lookup() instead of __fdnlist()
when building a cross libkvm. It is assumed that the cross tool that
uses the cross libkvm also provides an implementation for this
proc_services function.
Note that this commit does not change any of the architecture-specific
code for cross-compilation.
- Add a generic routine to trigger an LVT interrupt that supports both
fixed and NMI delivery modes.
- Add an ioctl and bhyvectl command to trigger local interrupts inside a
guest. In particular, a global NMI similar to that raised by SERR# or
PERR# can be simulated by asserting LINT1 on all vCPUs.
- Extend the LVT table in the vCPU local APIC to support CMCI.
- Flesh out the local APIC error reporting a bit to cache errors and
report them via ESR when ESR is written to. Add support for asserting
the error LVT when an error occurs. Raise illegal vector errors when
attempting to signal an invalid vector for an interrupt or when sending
an IPI.
- Ignore writes to reserved bits in LVT entries.
- Export table entries in the MADT and MP Table advertising the stock x86
config of LINT0 set to ExtInt and LINT1 wired to NMI.
Reviewed by: neel (earlier version)
clang-specific or gcc-specific flags, introduce the following new
variables for use in Makefiles:
CFLAGS.clang
CFLAGS.gcc
CXXFLAGS.clang
CXXFLAGS.gcc
In bsd.sys.mk, these get appended to the regular CFLAGS or CXXFLAGS for
the right compiler.
MFC after: 1 week
directory is like any subdirectory and as such needs to use a real
cluster number. To this end, keep a DE structure for the root in
the DOS_FS structure and populate it accordingly.
While here:
o allow consecutive path separators by skipping them all.
o add missing $FreeBSD$ keyword to dosfs.h.
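
Skipping consecutive path separators can be sketched as below. The helper
skip_separators() is hypothetical and assumes '/' is the separator; it is
not the dosfs code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: step over any run of '/' at the start of path. */
static const char *
skip_separators(const char *path)
{
    assert(path != NULL);
    while (*path == '/')
        path++;
    return (path);
}
```

Calling this before parsing each component makes "a//b" and "a/b" resolve
to the same entry.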
code more naive and robust:
1. When setting ev_value, also always set ev_flags appropriately
2. Always check ev_value and ev_flags before calling free.
Both the value and the EV_DYNAMIC property can come directly from the
consumers of the environment functionality, so it's good to be careful.
And since this code is typically not looked at for long periods of
time, it's good to have it be a little "dumb-looking".
Trigger case for the bug:
env_setenv("foo", 0, "1", NULL, NULL);
env_setenv("foo", 0, "2", NULL, NULL);
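
A minimal sketch of the two rules, using a simplified hypothetical
structure rather than the real environment code. EV_DYNAMIC_SKETCH stands
in for EV_DYNAMIC, and the "copy" parameter is an assumption made for
illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define EV_DYNAMIC_SKETCH 0x1   /* hypothetical stand-in for EV_DYNAMIC */

struct env_var_sketch {
    char *ev_value;
    int ev_flags;
};

/* Replace the value; copy != 0 asks for a heap copy. Returns 0 or -1. */
static int
ev_set_value(struct env_var_sketch *ev, const char *value, int copy)
{
    char *nv = NULL;

    assert(ev != NULL && value != NULL);
    if (copy) {
        size_t n = strlen(value) + 1;

        nv = malloc(n);
        if (nv == NULL)
            return (-1);
        memcpy(nv, value, n);
    }
    /* Rule 2: check both the value and the flag before calling free. */
    if (ev->ev_value != NULL && (ev->ev_flags & EV_DYNAMIC_SKETCH) != 0)
        free(ev->ev_value);
    /* Rule 1: whenever the value is set, set the flag to match. */
    if (copy) {
        ev->ev_value = nv;
        ev->ev_flags |= EV_DYNAMIC_SKETCH;
    } else {
        ev->ev_value = (char *)value;   /* caller keeps ownership */
        ev->ev_flags &= ~EV_DYNAMIC_SKETCH;
    }
    return (0);
}
```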
Obtained from: Juniper Networks, Inc.
callers treat the MSI 'addr' and 'data' fields as opaque and also lets
bhyve implement multiple destination modes: physical, flat and clustered.
Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
Reviewed by: grehan@
Get rid of the msg_peek() function, which has a problem. If there was less
data in the socket buffer than requested by the caller, the function would busy
loop, as select(2) will always return immediately.
We can just receive nvlhdr now, because some time ago we split the receiving
of data from the receiving of descriptors.
MFC after: 1 week
owning the handle passed to __cxa_finalize() but which are registered
by other dso, when the process is inside exit(3).
Running them makes the destruction order wrong, and there is hope that
such destructors would not call dlclose(3), since it is pointless at
this stage of the process existence.
The change effectively disables the r211706 after the exit(3) is
called.
Reported and tested by: Michael Gmelin <freebsd@grem.de>
Analyzed by: dim
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This is in the process of being submitted to the upstream LLDB
repository. The thread list functionality is modelled in part on
GDBRemoteCommunicationClient.
LLDB bug pr16696 and code review D2267
Sponsored by: DARPA, AFRL
giving access to functionality that is not available in capability mode
sandbox. The functionality can be precisely restricted.
Start with the following services:
- system.dns - provides API compatible to:
- gethostbyname(3),
- gethostbyname2(3),
- gethostbyaddr(3),
- getaddrinfo(3),
- getnameinfo(3),
- system.grp - provides getgrent(3)-compatible API,
- system.pwd - provides getpwent(3)-compatible API,
- system.random - allows obtaining entropy from /dev/random,
- system.sysctl - provides sysctlbyname(3)-compatible API.
Sponsored by: The FreeBSD Foundation
by hastctl(8), hastd(8) and auditdistd(8) and will soon be also used
by casperd(8) and its services. There is no documentation and pjdlog.h
header file is not installed in /usr/include/ to keep it private.
Unfortunately we don't have /lib/private/ at this point, only
/usr/lib/private/, so the library is installed in /lib/.
Sponsored by: The FreeBSD Foundation
shifts into the sign bit. Instead use (1U << 31) which gets the
expected result.
This fix is not ideal as it assumes a 32 bit int, but does fix the issue
for most cases.
A similar change was made in OpenBSD.
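
The difference can be shown with a short sketch. The function names are
hypothetical. The second variant uses a fixed-width type and is offered
only as an illustration of how to avoid depending on the width of int; it
is not what the commit does.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * (1 << 31) shifts a signed int into its sign bit, which is undefined
 * behaviour with a 32-bit int. The unsigned constant is well defined.
 */
static unsigned int
high_bit_unsigned(void)
{
    static_assert(sizeof(unsigned int) * CHAR_BIT >= 32,
        "this form assumes an int of at least 32 bits");
    return (1U << 31);
}

/* Variant that does not depend on the width of int. */
static uint32_t
high_bit_u32(void)
{
    return (UINT32_C(1) << 31);
}
```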
Discussed with: -arch, rdivacky
Reviewed by: cperciva
requires process descriptors to work and having PROCDESC in GENERIC
seems not enough, especially that we hope to have more and more consumers
in the base.
MFC after: 3 days
it is all in the one place again. Rename libc/iconv/iconv.c to
bsd_iconv.c. Compile the wrappers into libc.a so that WITHOUT_DYNAMICROOT
works again.
Discussed with: kib (and partly stolen from his patch)
3-clause BSD license as specified by Oracle America, Inc. in 2010.
This license change was approved by Wim Coekaerts, Senior Vice
President, Linux and Virtualization at Oracle Corporation.
bhyve supports a single timer block with 8 timers. The timers are all 32-bit
and capable of being operated in periodic mode. All timers support interrupt
delivery using MSI. Timers 0 and 1 also support legacy interrupt routing.
At the moment the timers are not connected to any ioapic pins but that will
be addressed in a subsequent commit.
This change is based on a patch from Tycho Nightingale (tycho.nightingale@pluribusnetworks.com).
when there is an invalid character in the output codeset while it is
valid in the input. However, POSIX requires iconv() to perform an
implementation-defined conversion on the character. So, Citrus iconv converts
such a character to a special character which means it is invalid in the
output codeset.
This is not a problem in most cases but some software like libxml2 depends
on GNU's behavior to determine if a character is output as-is or another form
such as a character entity (&#NNN;).
unlocking the rtld bind lock results in the processing of ast and
recursing into the check_deferred_signal(). Nested execution of
check_deferred_signal() delivers the signal to user code and clears
si_signo. On return, top-level check_deferred_signal() frame
continues delivering the same signal one more time, but now with zero
si_signo.
Fix this by adding a flag to indicate that deferred delivery is
running, so check_deferred_signal() should avoid doing anything. A
user signal handler is allowed to modify the passed machine context so
that returning from the signal handler causes an arbitrary jump, or to
call longjmp(). For this case, also clear the flag in thr_sighandler(),
since kernel signal delivery means that nested delivery code should
not run right now.
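
The guard can be modelled with a small single-threaded sketch. Everything
here is hypothetical: thr_sketch is a toy structure, and unlock_sketch()
merely simulates the unlock path re-entering the check. It is not the
libthr implementation.

```c
#include <assert.h>
#include <stddef.h>

struct thr_sketch {
    int deferred_run;   /* set while deferred delivery is in progress */
    int si_signo;       /* pending deferred signal, 0 if none */
    int delivered;      /* number of handler invocations */
    int last_signo;     /* signal number seen by the last invocation */
};

static void check_deferred_sketch(struct thr_sketch *t);

/* Stand-in for an unlock path that re-enters the deferred check. */
static void
unlock_sketch(struct thr_sketch *t)
{
    check_deferred_sketch(t);
}

static void
check_deferred_sketch(struct thr_sketch *t)
{
    int signo;

    assert(t != NULL);
    if (t->deferred_run || t->si_signo == 0)
        return;             /* nested call or nothing pending */
    t->deferred_run = 1;
    unlock_sketch(t);       /* the nested call now returns at once */
    signo = t->si_signo;
    t->si_signo = 0;
    t->deferred_run = 0;
    t->delivered++;         /* "deliver" the signal exactly once */
    t->last_signo = signo;
}
```

Without the deferred_run test, the nested call would deliver the signal
and clear si_signo, and the outer frame would then deliver a second time
with a zero signal number.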
Reported by: Vitaly Magerya <vmagerya@gmail.com>
Reviewed by: davidxu, jilles
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
to inject edge triggered legacy interrupts into the guest.
Start using the new API in device models that use edge triggered interrupts:
viz. the 8254 timer and the LPC/uart device emulation.
Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
SSL_set_tlsext_host_name(3) internally does not modify the host buffer
passed to it. So it is safe to DECONST the struct url* here.
Reported by: gjb
Approved by: bapt (implicit)
MFC after: 1 week
X-MFC-With: r258347
SNI is Server Name Indication, a TLS extension that indicates the host
that is being connected to at the start of the handshake. It allows the
use of virtual hosts on HTTPS.
Submitted by: sbz
Submitted by: Michael Gmelin <freebsd@grem.de> [1]
PR: kern/183583 [1]
Reviewed by: des
Approved by: bapt
MFC after: 1 week
- Add ' to the list of directly encoded characters and * to the list of
optionally directly encoded characters as per RFC 2152.
- In _citrus_UTF7_mbtoutf16 on end of input when the next output character
has only been partially decoded, save a copy of the buffer of input
characters (not just its length). On the next call with more input
characters this buffer is reprocessed together with the new input to
form a fully decoded output character.
- At the end of a base64 encoded sequence fully discard '-' (BASE64_OUT)
by decrementing psenc->chlen and i. This is needed to make room in
psenc->ch (input buffer) in case the next input character starts a new
base64 encoded sequence. And also, if this is the end of input and no
output character can be returned, this brings the encoder in the initial
state as indicated by _citrus_UTF7_stdenc_get_state_desc_generic which
is used by the caller to distinguish between no output and partial
output.
- In _citrus_UTF7_mbrtowc_priv pass the s parameter (input pointer)
directly to _citrus_UTF7_mbtoutf16 instead of a copy (s0). This way s
is updated correctly in case of errors.
- In _citrus_UTF7_mbrtowc_priv when called with psenc->surrogate set
(previous call did not have enough input), retrieve the previously
decoded UTF-16 character from (psenc->cache >> psenc->bits) instead of
(psenc->cache >> 2).
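The two character classes named in the first item can be sketched as
follows, based on Set D and Set O of RFC 2152. The function names are
hypothetical and this is not the Citrus table-driven implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* RFC 2152 Set D: letters, digits and ' ( ) , - . / : ? */
static int
utf7_is_direct(int c)
{
    if (c <= 0 || c > 0x7f)
        return (0);
    return (isalnum(c) || strchr("'(),-./:?", c) != NULL);
}

/* RFC 2152 Set O: optionally direct characters, including '*'. */
static int
utf7_is_optional_direct(int c)
{
    if (c <= 0 || c > 0x7f)
        return (0);
    return (strchr("!\"#$%&*;<=>@[]^_`{|}", c) != NULL);
}
```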
MFC after: 5 days
When building various programs from a single Makefile, program-specific
variables are of the form <VAR>.<PROG>, not <VAR>_<PROG>. Fix this
obvious typo to fix the build when WITH_TESTS=yes.
I am not sure how this ever worked before given that manual inspection
of bsd.progs.mk clearly shows that the expected character between the
two components is a dot and not an underscore... but I suspect the
changes in r258095 exposed this oddity.
Approved by: rpaulo (mentor)
FreeBSD systems usually implemented this as a third party module and
our implementation hasn't played as nicely with the old way as it could
have.
To that end:
* Rename the iconv* symbols in libc.so.7 to have a __bsd_ prefix.
* Provide .symver compatibility with existing 10.x+ binaries that
referenced the iconv symbols. All existing binaries should work.
* Like on Linux/glibc systems, add a libc_nonshared.a to the ldscript
at /usr/lib/libc.so.
* Move the "iconv*" wrapper symbols to libc_nonshared.a
This should solve the runtime ambiguity about which symbols resolve
to where. If you compile against the iconv in libc, your runtime
dependencies will be unambiguous.
Old 9.x libraries and binaries will always resolve against their
libiconv.so.3 like they did on 9.x. They won't resolve against libc.
Old 10.x binaries will be satisfied by the .symver helpers.
This should allow ports to selectively compile against the libiconv
port if needed and it should behave without ambiguity now.
Discussed with: kib
upcoming in-kernel device emulations like the HPET.
The ioctls VM_IOAPIC_ASSERT_IRQ and VM_IOAPIC_DEASSERT_IRQ are used to
manipulate the ioapic pin state.
Discussed with: grehan@
Submitted by: Tycho Nightingale (tycho.nightingale@pluribusnetworks.com)
includes minor changes relative to upstream, for compatibility with
FreeBSD's in-tree LLVM 3.3:
- Reverted LLDB r191806, restoring use of previous API.
- Reverted part of LLDB r189317, restoring previous enum names.
- Work around missing LLVM r192504, using previous registerEHFrames API
(limited functionality).
- Removed PlatformWindows header include and init/terminate calls.
Sponsored by: DARPA, AFRL
Move the installation of /usr/tests/lib/Kyuafile from src/tests/lib/
to src/lib/. This is to keep the src/tests/ hierarchy unaware of the
rest of the tree, which makes things clearer in general. In particular:
1) Everything related to the construction of /usr/tests/lib/ is kept
in src/lib/. There is no need to think about different directories
and how they relate to each other. (The same applies for libexec,
usr.bin, etc. but these are not yet handled.)
2) src/tests becomes the place to keep cross-functional test programs
and nothing else, which also helps in simplifying things.
Reviewed by: freebsd-testing
Approved by: rpaulo (mentor)
There is no reason to keep the two knobs separate: if tests are
enabled, the ATF libraries are required; and if tests are disabled,
the ATF libraries are not necessary. Keeping the two just serves
to complicate the build.
Reviewed by: freebsd-testing
Approved by: rpaulo (mentor)
* Use bit twiddling. This requires inclusion of math_private.h
and inclusion of float.h in s_roundl.c. Raise invalid exception.
* Use literal integer constants where possible. Let the compiler
do the appropriate conversion.
* In s_roundf.c, use an F suffix on float constants instead of
promoting float to double and then converting the result back
to float. In s_roundl.c, use an L suffix.
* In s_roundl.c, use the ENTERI and RETURNI macros. This requires
the inclusion of fpmath.h and on __i386__ class hardware ieeefp.h.
Reviewed by: bde
process if it has not already been stopped, since this is required for
ptrace(2) to work.
libdtrace does not seem to stop target processes before trying to remove
their breakpoints, so we were previously failing to remove the breakpoint
on r_debug_state() in rtld. This was causing processes to die with SIGTRAP
if they called dlopen(3) after dtrace(1) had detached.
Reported by: symbolics@gmx.com
Reviewed by: rpaulo
MFC after: 1 month
This explanation is supposed to be simpler and better. In particular
"comparing it to the snprintf API provides lots of value, since it raises the
bar on understanding, so that programmers/auditors will a better job calling
all 3 of these functions."
Requested by: deraadt@cvs.openbsd.org
Obtained From: OpenBSD
Reviewed by: cperciva
good. This caused libc to spoof the ports libiconv namespace and
provide a colliding libiconv.so.3 to fool rtld. This should have
been removed some time ago.
Unexpand the tag, remove the fbsd:nokeywords property and add the
svn:keywords property. This should eliminate the gratuitous diffs
that appear on these files in projects branches.
Sponsored by: The FreeBSD Foundation
* Don't print any error messages to stderr unless DEBUG is defined.
* Add a DPRINTFX macro for use when errno isn't set.
* Print the error string from libelf when appropriate.
The number of ways to indicate this confuses people.
PR: docs/100196
Reported by: "Dr. Markus Waldeck" <waldeck@gmx.de>
Reported by: Jamie Landeg Jones <jamie.landeg.jones@gmail.com>
Populate /usr/tests with the only test programs that currently live
in the tree (those in lib/libcrypt/tests/) and add all the build
machinery to accompany this change.
In particular:
- Add a WITHOUT_TESTS variable that users can define to request that
no tests be put in /usr/tests.
- Add a top-level Kyuafile for /usr/tests and a way to create similar
Kyuafiles in top-level subdirectories.
- Add a BSD.tests.dist file to define the directory layout of
/usr/tests.
Submitted by: Julio Merino jmmv google.com
Reviewed by: sjg
MFC after: 2 weeks
user. Kqueue now saves the ucred of the allocating thread, to
correctly decrement the counter on close.
Under some specific, not real-world usage scenarios for kqueue, it is
possible for the kqueues to consume memory proportional to the square
of the number of file descriptors available to the process. The limit
allows an administrator to prevent the abuse.
This is kernel-mode side of the change, with the user-mode enabling
commit following.
Reported and tested by: pho
Discussed with: jmg
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
_citrus_mapper_close again and result in a deadlock otherwise.
This is similar to NetBSD PR/24023 (fixed in their r1.5 of this file).
PR: bin/182994
Submitted by: Fabian Keil <fk fabiankeil de>
MFC after: 3 days
Even though not all race conditions can be fixed if the 'e' option is not
used, still fix some race conditions using pipe2():
* Prevent both ends of the pipe from leaking to a concurrent popen().
* Prevent the child process's end of the pipe from leaking to any concurrent
fork and exec.
This change also simplifies the code.
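
A minimal sketch of the idea, assuming a platform that provides pipe2()
(the _GNU_SOURCE definition is only there to expose it on glibc). The
helper name make_cloexec_pipe() is hypothetical and this is not the
popen() implementation.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* exposes pipe2() on glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: both ends are close-on-exec from the moment they
 * exist, so a concurrent fork and exec in another thread cannot inherit
 * them. With pipe() followed by fcntl() there is a window in which it can.
 */
static int
make_cloexec_pipe(int fds[2])
{
    assert(fds != NULL);
    return (pipe2(fds, O_CLOEXEC));
}
```

In the child, the end that the command should see would then be passed on
with dup2(), since the duplicate does not carry the close-on-exec flag.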
This change introduces a new plain.test.mk file that provides the build
infrastructure to build test programs that don't use any framework.
Most of the code previously in bsd.test.mk moves to plain.test.mk and
atf.test.mk is extended with the missing pieces.
In doing so, this change pushes all test program building logic to the
various *.test.mk files instead of trying to reuse some tiny bits.
In fact, this attempt to reuse some definitions makes the code harder
to read and harder to extend.
The clear benefit of this is that the interface of bsd.test.mk is now
clearly delimited.
Submitted by: Julio Merino jmmv google.com
MFC after: 2 weeks
'invpcid' instruction to the guest. Currently bhyve will try to enable this
capability unconditionally if it is available.
Consolidate code in bhyve to set the capabilities so it is no longer
duplicated in BSP and AP bringup.
Add a sysctl 'vm.pmap.invpcid_works' to display whether the 'invpcid'
instruction is available.
Reviewed by: grehan
MFC after: 3 days
tools would need to know about the counter_u64_t type. Allow including
sys/counter.h from userspace.
- Utilize now defined type in kvm_counter_u64_fetch().
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
Only accept 'net' and 'pxe' devices as underlying transport
in tftp.c on x86. Prior to this change tftp code would attempt
to send packets over any boot device, including zfs one with
predictably sad results.
Approved by: re (gjb)
MFC After: 1 month
Since so many programs don't check return value, always NUL terminate
the buf...
fix rounding when using base 1024 (the bug that started it all)...
add a set of test cases so we can make sure that things don't break
in the future...
Thanks to Clifton Royston for testing and the test program...
Approved by: re (hrs, glebius)
MFC after: 1 week
Make the amd64/pmap code aware of nested page table mappings used by bhyve
guests. This allows bhyve to associate each guest with its own vmspace and
deal with nested page faults in the context of that vmspace. This also
enables features like accessed/dirty bit tracking, swapping to disk and
transparent superpage promotions of guest memory.
Guest vmspace:
Each bhyve guest has a unique vmspace to represent the physical memory
allocated to the guest. Each memory segment allocated by the guest is
mapped into the guest's address space via the 'vmspace->vm_map' and is
backed by an object of type OBJT_DEFAULT.
pmap types:
The amd64/pmap now understands two types of pmaps: PT_X86 and PT_EPT.
The PT_X86 pmap type is used by the vmspace associated with the host kernel
as well as user processes executing on the host. The PT_EPT pmap is used by
the vmspace associated with a bhyve guest.
Page Table Entries:
The EPT page table entries are mostly similar in functionality to regular
page table entries although there are some differences in terms of what
bits are used to express that functionality. For example, the dirty bit is
represented by bit 9 in the nested PTE as opposed to bit 6 in the regular
x86 PTE. Therefore the bitmask representing the dirty bit is now computed
at runtime based on the type of the pmap. Thus PG_M that was previously a
macro now becomes a local variable that is initialized at runtime using
'pmap_modified_bit(pmap)'.
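
As a rough illustration of a bitmask chosen at runtime, the sketch below
uses the bit positions quoted above (bit 6 for x86, bit 9 for EPT). The
names with a _sketch suffix are hypothetical, and the case where A/D bits
have to be emulated is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

enum pmap_type_sketch { PT_X86_SKETCH, PT_EPT_SKETCH };

/* Hypothetical analogue of selecting the dirty-bit mask per pmap type. */
static uint64_t
modified_bit_sketch(enum pmap_type_sketch type)
{
    switch (type) {
    case PT_X86_SKETCH:
        return (UINT64_C(1) << 6);
    case PT_EPT_SKETCH:
        return (UINT64_C(1) << 9);
    }
    assert(!"unknown pmap type");
    return (0);
}

static int
pte_is_dirty_sketch(enum pmap_type_sketch type, uint64_t pte)
{
    /* What used to be a macro becomes a local computed at runtime. */
    const uint64_t PG_M = modified_bit_sketch(type);

    return ((pte & PG_M) != 0);
}
```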
An additional wrinkle associated with EPT mappings is that older Intel
processors don't have hardware support for tracking accessed/dirty bits in
the PTE. This means that the amd64/pmap code needs to emulate these bits to
provide proper accounting to the VM subsystem. This is achieved by using
the following mapping for EPT entries that need emulation of A/D bits:
Bit      Position    Interpreted By
PG_V     52          software (accessed bit emulation handler)
PG_RW    53          software (dirty bit emulation handler)
PG_A     0           hardware (aka EPT_PG_RD)
PG_M     1           hardware (aka EPT_PG_WR)
The idea to use the mapping listed above for A/D bit emulation came from
Alan Cox (alc@).
The final difference with respect to x86 PTEs is that some EPT implementations
do not support superpage mappings. This is recorded in the 'pm_flags' field
of the pmap.
TLB invalidation:
The amd64/pmap code has a number of ways to do invalidation of mappings
that may be cached in the TLB: single page, multiple pages in a range or the
entire TLB. All of these funnel into a single EPT invalidation routine called
'pmap_invalidate_ept()'. This routine bumps up the EPT generation number and
sends an IPI to the host cpus that are executing the guest's vcpus. On a
subsequent entry into the guest it will detect that the EPT has changed and
invalidate the mappings from the TLB.
Guest memory access:
Since the guest memory is no longer wired we need to hold the host physical
page that backs the guest physical page before we can access it. The helper
functions 'vm_gpa_hold()/vm_gpa_release()' are available for this purpose.
PCI passthru:
Guests with PCI passthru devices will wire the entire guest physical address
space. The MMIO BAR associated with the passthru device is backed by a
vm_object of type OBJT_SG. An IOMMU domain is created only for guests that
have one or more PCI passthru devices attached to them.
Limitations:
There isn't a way to map a guest physical page without execute permissions.
This is because the amd64/pmap code interprets the guest physical mappings as
user mappings since they are numerically below VM_MAXUSER_ADDRESS. Since PG_U
shares the same bit position as EPT_PG_EXECUTE all guest mappings become
automatically executable.
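The bit collision behind this limitation, as a tiny sketch. The macro and
function names are hypothetical; both masks are assumed to be bit 2, the
position of the x86 user bit.

```c
#include <stdint.h>

#define SK_PG_U             (1ULL << 2)    /* x86 user bit */
#define SK_EPT_PG_EXECUTE   (1ULL << 2)    /* same position in an EPT entry */

/* Guest physical mappings are treated as user mappings. */
uint64_t
sketch_make_user_pte(uint64_t pa)
{
    return (pa | SK_PG_U);
}

/* Interpreted as an EPT entry, the user bit reads as "executable". */
int
sketch_ept_is_executable(uint64_t pte)
{
    return ((pte & SK_EPT_PG_EXECUTE) != 0);
}
```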
Thanks to Alan Cox and Konstantin Belousov for their rigorous code reviews
as well as their support and encouragement.
Thanks to John Baldwin for reviewing the use of OBJT_SG as the backing
object for pci passthru mmio regions.
Special thanks to Peter Holm for testing the patch on short notice.
Approved by: re
Discussed with: grehan
Reviewed by: alc, kib
Tested by: pho
The accept(2) man page warns that O_NONBLOCK and other properties on the
new socket may vary across implementations. However, this issue only
applies to accept() and not to accept4(). On the other hand, accept4()
is not commonly available yet.
Reported by: pluknet
Reviewed by: bjk
Approved by: re (kib)
for. This is useful for software needing to know which architecture a
binary is built for as arm and armv6 have slight differences meaning only
some binaries built for one will work as expected on the other. It is
expected pkgng will be able to make use of this to simplify the logic to
determine which package ABI to use.
Approved by: re (kib)
This connects LLDB to the build, but it is disabled by default. Add
WITH_LLDB= to src.conf to build it.
Note that LLDB requires a C++11 compiler so is disabled on platforms
using GCC.
Approved by: re (gjb)
Sponsored by: DARPA, AFRL
exhausted.
- Add a new protect(1) command that can be used to set or revoke protection
from arbitrary processes. Similar to ktrace it can apply a change to all
existing descendants of a process as well as future descendants.
- Add a new procctl(2) system call that provides a generic interface for
control operations on processes (as opposed to the debugger-specific
operations provided by ptrace(2)). procctl(2) uses a combination of
idtype_t and an id to identify the set of processes on which to operate
similar to wait6().
- Add a PROC_SPROTECT control operation to manage the protection status
of a set of processes. MADV_PROTECT still works for backwards
compatibility.
- Add a p_flag2 to struct proc (and a corresponding ki_flag2 to kinfo_proc)
the first bit of which is used to track if P_PROTECT should be inherited
by new child processes.
Reviewed by: kib, jilles (earlier version)
Approved by: re (delphij)
MFC after: 1 month
preprocessor) gives the following error:
--- Version.map ---
<stdin>:287:4: error: invalid preprocessing directive
# Implemented as weak aliases for imprecise versions
^
1 error generated.
Change the comment to a C-style one, to prevent this error.
Approved by: re (hrs)
an address in the first 2GB of the process's address space. This flag should
have the same semantics as the same flag on Linux.
To facilitate this, add a new parameter to vm_map_find() that specifies an
optional maximum virtual address. While here, fix several callers of
vm_map_find() to use a VMFS_* constant for the findspace argument instead of
TRUE and FALSE.
Reviewed by: alc
Approved by: re (kib)
This change avoids undesirably passing some internal file descriptors to a
process created (fork+exec) by another thread.
Kernel support for SOCK_CLOEXEC was added in r248534, March 19, 2013.
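A small sketch of what SOCK_CLOEXEC buys: the close-on-exec flag is set
atomically when the socket is created, so there is no window in which
another thread's fork+exec can inherit the descriptor. The function name
is hypothetical and AF_UNIX is used only to keep the example local.

```c
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Create a socket with SOCK_CLOEXEC and report whether FD_CLOEXEC is
 * set on it. Returns 1 if set, 0 if not, -1 on error.
 */
int
sketch_socket_is_cloexec(void)
{
    int fd, flags;

    fd = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
    if (fd == -1)
        return (-1);
    flags = fcntl(fd, F_GETFD);
    close(fd);
    if (flags == -1)
        return (-1);
    return ((flags & FD_CLOEXEC) != 0);
}
```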
Austin Group issue #411 requires 'e' to be accepted before and after 'x',
and encourages accepting the characters in any order, except the initial
'r', 'w' or 'a'.
Given that glibc accepts the characters after r/w/a in any order and that
diagnosing this problem may be hard, change our libc to behave that way as
well.
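The order-insensitive parsing can be sketched like this. It is not the
libc implementation: the structure and function names are hypothetical,
and rejecting unknown characters is a choice made for this sketch.

```c
#include <stddef.h>

/* Hypothetical parsed form of a mode string. */
struct sketch_mode {
    char base;      /* 'r', 'w' or 'a' */
    int plus;       /* '+' seen */
    int binary;     /* 'b' seen */
    int excl;       /* 'x' seen */
    int cloexec;    /* 'e' seen */
};

/* Returns 0 on success, -1 if the mode string is not accepted. */
int
sketch_parse_mode(const char *mode, struct sketch_mode *out)
{
    out->base = '\0';
    out->plus = out->binary = out->excl = out->cloexec = 0;

    if (mode == NULL)
        return (-1);
    /* The initial character is the only one with a fixed position. */
    if (*mode != 'r' && *mode != 'w' && *mode != 'a')
        return (-1);
    out->base = *mode;

    /* Everything after it is accepted in any order. */
    for (mode++; *mode != '\0'; mode++) {
        switch (*mode) {
        case '+':
            out->plus = 1;
            break;
        case 'b':
            out->binary = 1;
            break;
        case 'x':
            out->excl = 1;
            break;
        case 'e':
            out->cloexec = 1;
            break;
        default:
            return (-1);
        }
    }
    return (0);
}
```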
These are weak and so can be replaced by other versions in applications
that choose to do so, and will give a linker warning when used so that
applications that rely on the extra precision can avoid them.
Note that since the C/C++ specs only guarantee that long double has
precision equal to double, code that actually relies on these functions
having greater precision is unportable at best and broken at worst.
in the future in a backward compatible (API and ABI) way.
The cap_rights_t represents capability rights. We used to use one bit to
represent one right, but we are running out of spare bits. Currently the new
structure provides place for 114 rights (so 50 more than the previous
cap_rights_t), but it is possible to grow the structure to hold at least 285
rights, although we can make it even larger if 285 rights won't be enough.
The structure definition looks like this:
struct cap_rights {
	uint64_t cr_rights[CAP_RIGHTS_VERSION + 2];
};
The initial CAP_RIGHTS_VERSION is 0.
The top two bits in the first element of the cr_rights[] array contain the
total number of elements in the array minus 2. This means that if those two
bits are equal to 0, we have 2 array elements.
The top two bits in all remaining array elements should be 0.
The next five bits in all array elements contain the array index. Only one bit
is used, and the bit position in this five-bit range defines the array index.
This means there can be at most five array elements in the future.
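The layout described above can be sketched as follows. The names are
hypothetical, and the sketch assumes that the index bit for element i
sits at position 57 + i, directly below the two version bits. With 2
version bits and 5 index bits, each 64-bit element has 57 bits left for
rights, which gives 2 * 57 = 114 rights today and 5 * 57 = 285 at most.

```c
#include <stdint.h>

#define SK_INDEX_SHIFT  57    /* assumed position of the 5-bit index field */
#define SK_RIGHT_BITS   57    /* 64 - 2 version bits - 5 index bits */

/* Encode a right: one index bit plus the right's own bit. */
#define SK_CAPRIGHT(idx, bit) \
    ((1ULL << (SK_INDEX_SHIFT + (idx))) | (bit))

/* Number of array elements encoded in the first element's top two bits. */
int
sketch_array_size(uint64_t first_element)
{
    return ((int)(first_element >> 62) + 2);
}

/* Decode the array index, or -1 unless exactly one index bit is set. */
int
sketch_right_index(uint64_t right)
{
    uint64_t field = (right >> SK_INDEX_SHIFT) & 0x1f;
    int i;

    for (i = 0; i < 5; i++) {
        if (field == (1ULL << i))
            return (i);
    }
    return (-1);
}

/* How many distinct rights fit in an array of the given size. */
int
sketch_capacity(int nelems)
{
    return (nelems * SK_RIGHT_BITS);
}
```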
To define a new right the CAPRIGHT() macro must be used. The macro takes two
arguments - an array index and a bit to set, e.g.
#define CAP_PDKILL CAPRIGHT(1, 0x0000000000000800ULL)
We still support aliases that combine a few rights, but the rights have to
belong to the same array element, e.g.:
#define CAP_LOOKUP CAPRIGHT(0, 0x0000000000000400ULL)
#define CAP_FCHMOD CAPRIGHT(0, 0x0000000000002000ULL)
#define CAP_FCHMODAT (CAP_FCHMOD | CAP_LOOKUP)
There is a new API to manage the new cap_rights_t structure:
cap_rights_t *cap_rights_init(cap_rights_t *rights, ...);
void cap_rights_set(cap_rights_t *rights, ...);
void cap_rights_clear(cap_rights_t *rights, ...);
bool cap_rights_is_set(const cap_rights_t *rights, ...);
bool cap_rights_is_valid(const cap_rights_t *rights);
void cap_rights_merge(cap_rights_t *dst, const cap_rights_t *src);
void cap_rights_remove(cap_rights_t *dst, const cap_rights_t *src);
bool cap_rights_contains(const cap_rights_t *big, const cap_rights_t *little);
Capability rights to the cap_rights_init(), cap_rights_set(),
cap_rights_clear() and cap_rights_is_set() functions are provided by
separating them with commas, e.g.:
cap_rights_t rights;
cap_rights_init(&rights, CAP_READ, CAP_WRITE, CAP_FSTAT);
There is no need to terminate the list of rights, as those functions are
actually macros that take care of the termination, e.g.:
#define cap_rights_set(rights, ...) \
__cap_rights_set((rights), __VA_ARGS__, 0ULL)
void __cap_rights_set(cap_rights_t *rights, ...);
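A self-contained sketch of the same termination trick. The structure is
flattened to a single word and all names and right values are
hypothetical; the point is only that the macro appends the 0ULL sentinel
so callers never have to.

```c
#include <stdarg.h>

/* Hypothetical rights; every argument must be unsigned long long. */
#define SK_READ     0x1ULL
#define SK_WRITE    0x2ULL
#define SK_FSTAT    0x4ULL

/* Simplified stand-in for the rights structure. */
struct sketch_rights {
    unsigned long long bits;
};

/* Consume rights until the 0ULL terminator supplied by the macro. */
void
sketch_rights_set_impl(struct sketch_rights *rights, ...)
{
    va_list ap;
    unsigned long long right;

    va_start(ap, rights);
    while ((right = va_arg(ap, unsigned long long)) != 0)
        rights->bits |= right;
    va_end(ap);
}

/* Callers use the macro and never write the terminator themselves. */
#define sketch_rights_set(rights, ...) \
    sketch_rights_set_impl((rights), __VA_ARGS__, 0ULL)
```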
Thanks to using one bit as an array index we can assert in those functions that
there are no two rights belonging to different array elements provided
together. For example this is illegal and will be detected, because CAP_LOOKUP
belongs to element 0 and CAP_PDKILL to element 1:
cap_rights_init(&rights, CAP_LOOKUP | CAP_PDKILL);
Providing several rights that belong to the same array element this way is
correct, but is not advised. It should only be used for alias definitions.
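The detection can be sketched as a check that exactly one index bit is
set in the OR-ed value. The names are hypothetical and the index field is
assumed to start at bit 57; the right values are the ones quoted above.

```c
#include <stdint.h>

#define SK_INDEX_SHIFT  57    /* assumed position of the 5-bit index field */
#define SK_CAPRIGHT(idx, bit) \
    ((1ULL << (SK_INDEX_SHIFT + (idx))) | (bit))

#define SK_LOOKUP   SK_CAPRIGHT(0, 0x0000000000000400ULL)
#define SK_FCHMOD   SK_CAPRIGHT(0, 0x0000000000002000ULL)
#define SK_PDKILL   SK_CAPRIGHT(1, 0x0000000000000800ULL)

/*
 * Rights from different array elements OR-ed together leave more than
 * one index bit set, so a power-of-two test on the index field is
 * enough to catch the mistake.
 */
int
sketch_single_element(uint64_t right)
{
    uint64_t field = (right >> SK_INDEX_SHIFT) & 0x1f;

    return (field != 0 && (field & (field - 1)) == 0);
}
```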
This commit also breaks compatibility with some existing Capsicum system calls,
but I see no other way to do that. This should be fine as Capsicum is still
experimental and this change is not going to 9.x.
Sponsored by: The FreeBSD Foundation
headers.
Lots of third-party code expects to find C++03 headers under tr1 because that's
where GNU decided to hide them. This should fix ports that expect them there.
MFC after: 1 week
As mentioned in r16117 and the book "Advanced Programming in the Unix
Environment" by W. Richard Stevens, we should ignore SIGINT and SIGQUIT
before forking, since it is not guaranteed that the parent process starts
running soon enough.
To avoid calling sigaction() in the vforked child, instead block SIGINT and
SIGQUIT before vfork() and keep the sigaction() to ignore after vfork(). The
FreeBSD kernel discards ignored signals, even if they are blocked;
therefore, it is not necessary to unblock SIGINT and SIGQUIT earlier.
This ensures strerror() and friends continue to work correctly even if a
(non-PIE) executable linked against an older libc imports sys_errlist (which
causes sys_errlist to refer to the executable's copy with a size fixed when
that executable was linked).
The executable's use of sys_errlist remains broken because it uses the
current value of sys_nerr and may access past the bounds of the array.
Different from the message "Using sys_errlist from executables is not
ABI-stable" on freebsd-arch, this change does not affect the static library.
There seems no reason to prevent overriding the error messages in the static
library.
for ARM.
This is quite ugly, because it has to work around a clang bug that does not
allow built-in functions to be defined, even when they're ones that are
expected to be built as part of a library.
Reviewed by: ed
o Fix range error checking to detect overflow when uint64_t < uintmax_t.
o Remove a non-functional check for no valid digits as pointed out by Bruce.
o Remove a rather pointless comment describing what the function does.
o Clean up a bunch of style bugs.
Brucified by: bde
. Use integer literal constants instead of double literal constants.
* s_erff.c:
. Use integer literal constants instead of casting double literal
constants to float.
. Update the threshold values from those carried over from erf() to
values appropriate for float.
. New sets of polynomial coefficients for the rational approximations.
These coefficients have little, but positive, effect on the maximum
error in ULP in the four intervals, but do improve the overall
speed of execution.
. Remove redundant GET_FLOAT_WORD(ix,x) as hx already contained the
contents that is packed into ix.
. Update the mask that is used to zero-out lower-order bits in x in
the intervals [1.25, 2.857143] and [2.857143, 12]. In tests on
amd64, this change improves the maximum error in ULP from 6.27739
and 63.8095 to 3.16774 and 2.92095 on these intervals for erfcf().
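For readers unfamiliar with the masking step mentioned above, a minimal
sketch of zeroing low-order bits of a float through its integer
representation. The function name is hypothetical, and the mask is a
caller-supplied parameter because the value used in s_erff.c is not
stated here.

```c
#include <stdint.h>
#include <string.h>

/*
 * Return x with the bits cleared by 'mask' zeroed in its IEEE-754
 * single-precision representation. memcpy() avoids aliasing problems.
 */
float
sketch_trim_low_bits(float x, uint32_t mask)
{
    uint32_t word;
    float z;

    memcpy(&word, &x, sizeof(word));
    word &= mask;
    memcpy(&z, &word, sizeof(z));
    return (z);
}
```

For a positive x, the result never exceeds x, since only low-order
significand bits are dropped.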
Reviewed by: bde
lib/libpam/modules/pam_passwdqc/Makefile:
Bump WARNS to 2.
contrib/pam_modules/pam_passwdqc/pam_passwdqc.c:
Bump _XOPEN_SOURCE and _XOPEN_VERSION from 500 to 600
so that vsnprintf() is declared.
Use the two new union types (pam_conv_item_t and
pam_text_item_t) to resolve strict aliasing violations
caused by casts to comply with the pam_get_item() API taking
a "const void **" for all item types. Warnings are
generated for casts that create "type puns" (pointers of
conflicting sized types that are set to access the same
memory location) since these pointers may be used in ways
that violate C's strict aliasing rules. Casts to a new
type must be performed through a union in order to be
compliant, and access must be performed through only one
of the union's data types during the lifetime of the union
instance. Handle strict-aliasing warnings through pointer
assignments, which drastically simplifies this change.
Correct a CLANG "printf-like function with more arguments
than format" error.
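The union approach to a "const void **" style getter can be sketched as
below. The union, the getter, and the data are hypothetical stand-ins and
do not reproduce pam_text_item_t or pam_get_item().

```c
#include <stddef.h>

/* Hypothetical union in the spirit of the text item type. */
typedef union {
    const void *as_void;    /* what the getter writes through */
    const char *as_text;    /* what the caller actually wants */
} sketch_text_item_t;

static const char sketch_user[] = "alice";

/* Stand-in for an API that returns every item as "const void **". */
int
sketch_get_item(const void **item)
{
    *item = sketch_user;
    return (0);
}

/*
 * No cast of "const char **" to "const void **" is needed: the void
 * member is passed to the getter and the text member is read back.
 */
const char *
sketch_get_text(void)
{
    sketch_text_item_t item;

    if (sketch_get_item(&item.as_void) != 0)
        return (NULL);
    return (item.as_text);
}
```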
Submitted by: gibbs
Sponsored by: Spectra Logic
Notable new features:
* Elliptic Curve Digital Signature Algorithm keys and signatures in
DNSSEC are now supported per RFC 6605. [RT #21918]
* Introduces a new tool "dnssec-verify" that validates a signed zone,
checking for the correctness of signatures and NSEC/NSEC3 chains.
[RT #23673]
* BIND now recognizes the TLSA resource record type, created to
support IETF DANE (DNS-based Authentication of Named Entities)
[RT #28989]
* The new "inline-signing" option, in combination with the
"auto-dnssec" option that was introduced in BIND 9.7, allows
named to sign zones completely transparently.
Approved by: delphij (mentor)
MFC after: 3 days
Sponsored by: DK Hostmaster A/S
request, RFC 2616 14.23 mandates the presence of the Host: header in
all HTTP 1.1 requests.
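A minimal sketch of a request formatter that always emits the Host:
header. The function name and its parameters are hypothetical and do not
reflect the code changed by this commit.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format an HTTP/1.1 request line followed by the mandatory Host:
 * header. Returns the length written, or -1 if 'buf' is too small.
 */
int
sketch_format_request(char *buf, size_t len, const char *method,
    const char *target, const char *host)
{
    int n;

    n = snprintf(buf, len, "%s %s HTTP/1.1\r\nHost: %s\r\n\r\n",
        method, target, host);
    if (n < 0 || (size_t)n >= len)
        return (-1);
    return (n);
}
```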
PR: kern/181445
Submitted by: Kimo <kimor79@yahoo.com>
MFC after: 3 days