1670 Commits

Author SHA1 Message Date
Mahdi Mokhtari
507c3d47af Add new catrigl.c (r313761) APIs to include/complex.h
Reviewed by:	bde, emaste
Approved by:	bde, emaste (src committers)
Differential Revision:	https://reviews.freebsd.org/D9615
2017-02-18 21:08:09 +00:00
Pedro F. Giffuni
10723054ce Remove outdated claim.
Despite wishful thinking the removal of these old function hasn't
happened yet.

MFC after:	3 days
2017-02-16 20:30:55 +00:00
Pedro F. Giffuni
4eecef9062 Small inclusion guard comment fix. 2017-02-16 20:28:30 +00:00
Pedro F. Giffuni
649702c5a3 Make use of clang nullability attributes.
Replace uses of the GCC __nonnull__ attribute with the clang nullability
qualifiers. The replacement should be transparent for clang developers as
the new qualifiers will produce the same warnings and will be useful for
static checkers but will not cause aggressive optimizations.

GCC will not produce such warnings and developers will have to use
upgraded GCC ports built with the system headers from r312538.

Hinted by:	Apple's Libc-1158.20.4, Bionic libc
MFC after:	11.1 Release

Differential Revision:	https://reviews.freebsd.org/D9004
2017-01-28 20:54:43 +00:00
Pedro F. Giffuni
f1b298ad46 Remove some uses of the GCC __nonnull() attribute.
While the checks are considered useful, the attribute does dangerous
optimizations, removing NULL checks where they can be needed. Remove the
uses of this attribute introduced in r281130: the changes were inspired on
Google's bionic where this attribute is not used anymore.

The __nonnull() attribute will be deprecrated from our headers and
replaced with the Clang _Nonnull qualifier in the future.

MFC after:	3 days
2017-01-01 17:16:47 +00:00
Andriy Gapon
7502cc401b libkvm: support access to vmm guest memory, allow writes to fwmem and vmm
This change consists of two parts:
- allow libkvm to recognize /dev/vmm/* character devices as devices that
  provide access to the physical memory of a system (similarly to /dev/fwmem*)
- allow libkvm to recognize that /dev/vmm/* and /dev/fwmem* devices provide
  access to the physical memory of live remote systems and, thus, the memory
  is writable

As a result, it should be possible to run commands like
$ kgdb -w /path/to/kernel /dev/fwmem0.0
$ kgdb /path/to/kernel /dev/vmm/guest

Reviewed by:	kib, jhb
MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	Panzura
Differential Revision: https://reviews.freebsd.org/D8679
2016-12-27 10:17:56 +00:00
Sepherosa Ziehau
9622c93ae8 hyperv: Allow userland to ro-mmap reference TSC page
This paves way to implement VDSO for the enlightened time counter.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8768
2016-12-15 03:32:24 +00:00
Bryan Drewery
34ecf41885 Create the /usr/lib/include symlink as relative.
This ugly code is done to avoid assuming LIBDIR is 2 components
deep.

Reported by:	jhb
2016-12-03 05:29:12 +00:00
John Baldwin
31ad7c11b3 Use the correct name for the GCC macro indicating max_align_t is defined.
MFC after:	3 days
2016-11-29 00:16:19 +00:00
Sepherosa Ziehau
168fce73b5 hyperv/vss: Add driver and tools for VSS
VSS stands for "Volume Shadow Copy Service".  Unlike virtual machine
snapshot, it only takes snapshot for the virtual disks, so both
filesystem and applications have to aware of it, and cooperate the
whole VSS process.

This driver exposes two device files to the userland:

    /dev/hv_fsvss_dev

    Normally userland programs should _not_ mess with this device file.
    It is currently used by the hv_vss_daemon(8), which freezes and
    thaws the filesystem.  NOTE: currently only UFS is supported, if
    the system mounts _any_ other filesystems, the hv_vss_daemon(8)
    will veto the VSS process.

    If hv_vss_daemon(8) was disabled, then this device file must be
    opened, and proper ioctls must be issued to keep the VSS working.

    /dev/hv_appvss_dev

    Userland application can opened this device file to receive the
    VSS freeze notification, hold the VSS for a while (mainly to flush
    application data to filesystem), release the VSS process, and
    receive the VSS thaw notification i.e. applications can run again.

    The VSS will still work, even if this device file is not opened.
    However, only filesystem consistency is promised, if this device
    file is not opened or is not operated properly.

hv_vss_daemon(8) is started by devd(8) by default.  It can be disabled
by editting /etc/devd/hyperv.conf.

Submitted by:	Hongjiang Zhang <honzhan microsoft com>
Reviewed by:	kib, mckusick
MFC after:	3 weeks
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8224
2016-11-15 02:36:12 +00:00
Ed Schouten
34168b28e9 Replace basename(3) by a thread-safe implementation.
Now that the changes to the dirname(3) function had some time to settle,
let's go ahead and use the same approach for replacing basename(3) by a
simple implementation that modifies the input string, thereby making it
thread-safe and guaranteed to succeed.

Unlike dirname(3), this function already had a thread-safe variant
basename_r(3). This function had its own set of problems, like having an
upper bound on the pathname length. Keep this function around for
compatibility, but remove most references from the man page. Make the
man page more similar to that of dirname(3).

As the basename_r(3) function is only provided by FreeBSD (and Bionic),
depending on its use is even more implementation defined than assuming
that basename(3) is thread-safe.

Reviewed by:	emaste
Differential Revision:	https://reviews.freebsd.org/D8382
2016-11-03 20:21:34 +00:00
Ruslan Bukin
130a08a362 Detect integer overflow and limit the number of positional
arguments in the string format.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
Differential Revision:	https://reviews.freebsd.org/D8286
2016-10-31 18:38:58 +00:00
John Baldwin
5dd723425e Define max_align_t for C11.
libc++'s stddef.h includes an existing definition of max_align_t for
C++11, but it is only defined for C++, not for C.  In addition, GCC and
clang both define an alternate version of max_align_t that uses a
union of multiple types rather than a plain long double as in libc++.
This adds a __max_align_t to <sys/_types.h> that matches the GCC and
clang definition that is mapped to max_align_t in <stddef.h>.

PR:		210890
Reviewed by:	dim
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D8194
2016-10-21 23:50:02 +00:00
Marcel Moolenaar
50875ed2c1 Re-apply change 306811 or alternatively, revert change 307385. 2016-10-16 02:43:51 +00:00
Marcel Moolenaar
9ffbf09f2f Revert change 306811 so that the change can be re-done using
svn copy instead of svn move.  This to preserve history on
the originals headers as well.
2016-10-16 02:05:22 +00:00
Ed Schouten
4ef9bd22ed Improve typing of POSIX search tree functions.
Back in 2015 when I reimplemented these functions to use an AVL tree, I
was annoyed by the weakness of the typing of these functions. Both tree
nodes and keys are represented by 'void *', meaning that things like the
documentation for these functions are an absolute train wreck.

To make things worse, users of these functions need to cast the return
value of tfind()/tsearch() from 'void *' to 'type_of_key **' in order to
access the key. Technically speaking such casts violate aliasing rules.
I've observed actual breakages as a result of this by enabling features
like LTO.

I've filed a bug report at the Austin Group. Looking at the way the bug
got resolved, they made a pretty good step in the right direction. A new
type 'posix_tnode' has been added to correspond to tree nodes. It is
still defined as 'void' for source-level compatibility, but in the very
far future it could be replaced by a proper structure type containing a
key pointer.

MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D8205
2016-10-13 18:25:40 +00:00
Andriy Gapon
dca5dd6894 install header files required development with libzfs_core
libzfs_core provides a rather limited but committed (stable) interface
for working with ZFS.  We install libzfs_core shared library but we do
not install header files required for developing programs that use
the library.  This change is to install the required header files
libzfs_core.h, libnvpair.h and sys/nvpair.h.

The headers are installed into the same locations as on illumos.

Reviewed by:	mav, markj
Differential Revision: https://reviews.freebsd.org/D8005
2016-10-12 07:08:32 +00:00
Marcel Moolenaar
0974f66d06 In order to allow mkimg(1) (and other tools) to become a build tool
that can be compiled on various OSes (including on older versions
of FreeBSD), make it possible to have it include the partitioning
scheme definitions without pulling in FreeBSD specifics.
In particular this means:
 o  move the scheme definitions iand related defines to header files
    under sys/disk,
 o  make them (more) portable by using uint#_t (where applicable)
    and renaming defines so that they at least have a good prefix,
 o  make the new headers stand-alone so that they don't need FreeBSD
    definitions, like struct uuid(*)
 o  keep the original headers for compatibility, but rewrite them to
    get the scheme definitions from <sys/disk/$scheme.h>.

(*) since UUID/GUID type definitions are non-portable and the GPT
scheme uses them, make it possible to have the scheme definitions
use an external type by allowing consumers of the header to set
GPT_UUID_TYPE. When GPT_UUID_TYPE has not been defined, the header
will use it's own type definition, which is the same as struct uuid.
The gpt_uuid_t typedef is created to abstract the details and allows
consumers to refer to a single type.

There is not conflict between the partitioning scheme headers and
what is defined in them. All headers can be included in the same
source files.

Note: consumers of the old headers have not been changed yet. Such
will be done if and when needed/beneficial.

Reviewed by:	imp, jhb
MFC after:	1 month
Sponsored by:	Bracket Computing
2016-10-07 15:42:20 +00:00
Ed Schouten
1a466ddc79 Remove setkey(), encrypt(), des_setkey() and des_cipher().
The setkey() and encrypt() functions are part of XSI, not the POSIX base
definitions. There is no strict requirement for us to provide these,
especially if we're only going to keep these around as undocumented
stubs. The same holds for des_setkey() and des_cipher().

Instead of providing functions that only generate warnings when linking,
simply disallow linking against them. The impact of this is relatively
low. It only causes two leaf ports to break. I'll see what I can do to
help out to get those fixed.

PR:		211626
2016-10-03 18:20:58 +00:00
Konstantin Belousov
ddce1c3ddb Export the mq_getfd_np() symbol from librt.so, which allows to get
file descriptor for the given posix mqueue.  Export the
timer_oshandle_np() symbol to get ktimer id for the given posix timer.

Requested by:	Lewis Donzis <lew@perftech.com>
Reviewed by:	jilles
Discussed with:	kan
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-10-02 17:02:59 +00:00
Eric van Gyzen
d6744932a2 Add the __printflike attribute to the declaration of vdprintf(3)
I intended to add this in r306568.

MFC after:	3 days
Sponsored by:	Dell EMC
2016-10-01 23:08:26 +00:00
Eric van Gyzen
21ac7a7f75 Add the __printflike attribute to the declaration of dprintf(3)
MFC after:	3 days
Sponsored by:	Dell EMC
2016-10-01 22:34:38 +00:00
Ed Schouten
6bf530f4bd Refine the dirname(3) compatibility workaround a bit more.
Right now our workaround is so good that it doesn't throw any warnings
on misuse. This means that people will keep on using the old version
of dirname(3) silently without fixing their code.

Go ahead and change the prototype of __old_dirname() to also use a plain
char *, so that we still get a compiler warning. This won't have any
negative effect on building older versions of FreeBSD on HEAD, as those
are built with -Werror disabled.

Differential Revision:	https://reviews.freebsd.org/D7844
2016-09-21 13:03:55 +00:00
Oleksandr Tymoshenko
2b3f6d6650 Add evdev protocol implementation
evdev is a generic input event interface compatible with Linux
evdev API at ioctl level. It allows using unmodified (apart from
header name) input evdev drivers in Xorg, Wayland, Qt.

This commit has only generic kernel API. evdev support for individual
hardware drivers like ukbd, ums, atkbd, etc. will be committed later.

Project was started by Jakub Klama as part of GSoC 2014. Jakub's
evdev implementation was later used as a base, updated and finished
by Vladimir Kondratiev.

Submitted by:	Vladimir Kondratiev <wulf@cicgroup.ru>
Reviewed by:	adrian, hans
Differential Revision:	https://reviews.freebsd.org/D6998
2016-09-11 18:56:38 +00:00
Ed Schouten
cd4dcac89a Improve compatibility of calls to dirname() on constant strings.
As the xinstall(8) utility had to be patched up to work with the POSIXly
correct basename()/dirname() prototypes, we make it pretty hard to build
previous versions of FreeBSD on HEAD. xinstall(8) is part of the
bootstrap tools.

Add some logic to <libgen.h> to automatically detect bad calls to
dirname() based on the type of the argument. If the argument is of type
'const char *', we simply fall back to calling into dirname@FBSD_1.0
directly.

I'll also give basename() similar treatment when importing the
thread-safe version of that function.

Tested by:	bdrewery, madpilot (thanks!)
2016-08-26 20:23:10 +00:00
Andrey A. Chernov
2f5007f6d2 LC_*_MASK bit shifting order was partially broken from the initial commit
time at year 2012. Only LC_COLLATE_MASK and LC_CTYPE_MASK are in the
right order.

The order here should match XLC_* from "xlocale_private.h" which, in turn,
match LC_* publicly visible order from <locale.h> which determines how
locale components are stored in the structure.
LC_*_MASK -> XLC_* translation done as "ffs(mask) - 1" in the querylocale()
and equivalent shift loop in the newlocale(), so mapped to some wrong
components (excluding two mentioned above).

Formally the fix is ABI breakage, but old code using those masks
never works properly in any case.
Only newlocale() and querylocale() are affected.

MFC after:      7 days
2016-08-23 20:33:56 +00:00
Konstantin Belousov
295af703a0 Add an implementation of fdatasync(2).
The syscall is a trivial wrapper around new VOP_FDATASYNC(), sharing
code with fsync(2).  For all filesystems, this commit provides the
implementation which delegates the work of VOP_FDATASYNC() to
VOP_FSYNC().  This is functionally correct but not efficient.

This is not yet POSIX-compliant implementation, because it does not
ensure that queued AIO requests are completed before returning.

Reviewed by:	mckusick
Discussed with:	avg (ZFS), jhb (AIO part)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D7471
2016-08-15 19:08:51 +00:00
Xin LI
854023f054 Add timingsafe_bcmp and timingsafe_memcmp.
Obtained from:	OpenBSD
Reviewed by:	trasz
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D7280
2016-08-14 23:38:50 +00:00
Ed Schouten
5f521d7ba7 Make libcrypt thread-safe. Add crypt_r(3).
glibc has a pretty nice function called crypt_r(3), which is nothing
more than crypt(3), but thread-safe. It accomplishes this by introducing
a 'struct crypt_data' structure that contains a buffer that is large
enough to hold the resulting string.

Let's go ahead and also add this function. It would be a shame if a
useful function like this wouldn't be usable in multithreaded apps.
Refactor crypt.c and all of the backends to no longer declare static
arrays, but write their output in a provided buffer.

There is no need to do any buffer length computation here, as we'll just
need to ensure that 'struct crypt_data' is large enough, which it is.
_PASSWORD_LEN is defined to 128 bytes, but in this case I'm picking 256,
as this is going to be part of the actual ABI.

Differential Revision:	https://reviews.freebsd.org/D7306
2016-08-10 15:16:28 +00:00
Warner Losh
08ed5b80da tools/build looks for _WITH_GETLINE in /usr/include/stdio.h to see if
we need to include it in -legacy or not. Since the ifdef was removed,
this broke building 10.x and older source trees on -current. Restore
just enough of _WITH_GETLINE to allow these older source trees to
still build and properly omit getline() from their -legacy library.
2016-08-02 21:55:23 +00:00
Ed Schouten
9c24291370 Fix up setgrent(3) to have a POSIX-compliant prototype.
Just like with freelocale(3), I haven't been able to find any piece of
code that actually makes use of this function's return value, both in
base and in ports. The reason for this is that FreeBSD seems to be the
only operating system to have such a prototype. This is why I'm deciding
to not use symbol versioning for this.

It does seem that the pw(8) utility depends on the function's typing and
already had a switch in place to toggle between the FreeBSD and POSIX
variant of this function. Clean this up by always expecting the POSIX
variant.

There is also a single port that has a couple of local declarations of
setgrent(3) that need to be patched up. This is in the process of being
fixed.

PR:		211394 (exp-run)
2016-07-31 08:05:15 +00:00
Baptiste Daroussin
dd47921eac Remove _WITH_GETLINE and _WITH_DPRINTF guards
When adding getline(3) and dprintf(3) into libc, those guards were added
to prevent breaking too many ports.

7 years later the ports tree have been fixed, it is time to remove this
FreeBSDism

While here remove the extra parenthesis surrounding dprintf(3)
2016-07-30 01:00:16 +00:00
Ed Schouten
718fe473dd Change the return type of freelocale(3) to void.
Our version of this function currently returns an integer indicating
failure or success, whereas POSIX specifies that this function has no
return value. It returns void. Patch up the header, sources and man page
to use the right type. While there, use the opportunity to simplify the
body of this function.

Theoretically speaking, this change breaks the ABI of this function.
That said, I have yet to find any code that makes use of freelocale()'s
return value. I couldn't find any of it in the base system, nor did an
exp-run reveal any breakage caused by this change.

PR:		211394 (exp-run)
2016-07-29 17:18:47 +00:00
Ed Schouten
938809f941 Fix up prototypes of basename(3) and dirname(3) to comply to POSIX.
POSIX allows these functions to be implemented in a way that the
resulting string is stored in the input buffer. Though some may find
this annoying, this has the advantage that it makes it possible to
implement this function in a thread-safe way. It also means that they
can be implemented in a way that they work for paths of arbitrary
length, as the output string of these functions is never longer than
max(1, len(input)).

Portable code already needs to be written with this in mind, so in my
opinion it makes very little sense to allow the existing behaviour.
Prevent the base system from falling back to this by switching over to
POSIX prototypes.

I'm not going to bump the __FreeBSD_version for this. The reason is that
it's possible to account for this change in a portable way, without
depending on a specific version of FreeBSD. An exp-run was done some
time ago. As far as I know, all regressions as a result of this have
already been fixed.

I'll give this change some time to settle. In the long run I want to
replace our copies by ones that are thread-safe and don't depend on
PATH_MAX/MAXPATHLEN.
2016-07-28 16:20:27 +00:00
Ed Schouten
b4a395a41b Add NI_NUMERICSCOPE.
POSIX also declares NI_NUMERICSCOPE, which makes getnameinfo() return a
numerical scope identifier. The interesting thing is that support for
this is already present in code, but #ifdef disabled. Expose this
functionality by placing a definition for it in <netdb.h>.

While there, remove references to NI_WITHSCOPEID, as that got removed 11
years ago.
2016-07-28 10:05:41 +00:00
Ed Schouten
822b22a9bf Change type of MB_CUR_MAX and MB_CUR_MAX_L() to size_t.
POSIX requires that MB_CUR_MAX expands to an expression of type size_t.
It currently expands to an int. As these are already macros, don't
change the underlying type of these functions. There is no ned to touch
those.

Differential Revision:	https://reviews.freebsd.org/D6645
2016-07-28 09:50:19 +00:00
Ed Schouten
8de6c26711 Fix typing of srandom() and initstate().
POSIX requires that these functions have an unsigned int for their first
argument; not an unsigned long.

My reasoning is that we can safely change these functions without
breaking the ABI. As far as I know, our supported architectures either
use registers for passing function arguments that are at least as big as
long (e.g., amd64), or int and long are of the same size (e.g., i386).

Reviewed by:	ache
Differential Revision:	https://reviews.freebsd.org/D6644
2016-07-26 20:11:29 +00:00
Pedro F. Giffuni
9143e6e49a Remove incorrect attributes from posix_memalign(3) declaration.
Both __alloc_align and __alloc_size can't be used when the function
returns a pointer to memory. This fixes breakage when building with
clang 3.4:

In file included from /usr/src/svn/usr.sbin/bhyve/atkbdc.c:40:
/usr/include/stdlib.h:176:6: error: '__alloc_size__' attribute only
applies to functions that return a pointer [-Werror,-Wignored-attributes]

Pointed out by:	ngie, cem
Approved by:	re (gjb)
2016-07-05 22:30:29 +00:00
Warner Losh
f24c011beb Commit the bits of nda that were missed. This should fix the build.
Approved by: re@
2016-06-10 06:04:53 +00:00
Mark Johnston
714ac00292 Implement an NSS backend for netgroups and add getnetgrent_r(3).
This support appears to have been documented in nsswitch.conf(5) for some
time. The implementation adds two NSS netgroup providers to libc. The
default, compat, provides the behaviour documented in netgroup(5), so this
change does not make any user-visible behaviour changes. A files provider
is also implemented.

innetgr(3) is implemented as an optional NSS method so that providers such
as NIS which are able to implement efficient reverse lookup can do so.
A fallback implementation is used otherwise. getnetgrent_r(3) is added for
convenience and to provide compatibility with glibc and Solaris.

With a small patch to net/nss_ldap, it's possible to specify an ldap
netgroup provider, allowing one to query nisNetgroupTriple entries.

Sponsored by:	EMC / Isilon Storage Division
2016-06-09 01:28:44 +00:00
Ed Schouten
de1d269583 Fix prototype of dbm_open().
The last argument of dbm_open() should be a mode_t according to POSIX;
not an int.

Reviewed by:	pfg, kib
Differential Revision:	https://reviews.freebsd.org/D6650
2016-05-31 18:32:57 +00:00
Ed Schouten
b240f5e262 Make strfmon_l() work without requiring the use of <xlocale.h>.
The strfmon_l() function provided by <xlocale/_monetary.h> is also part
of POSIX 2008's <monetary.h>, so it should be exposed by default.

Change the check used in <monetary.h> to be similar to the one that's
part of <wchar.h>, where we both test for __POSIX_VISIBLE and
_XLOCALE_H_.
2016-05-31 12:29:21 +00:00
Ed Schouten
2fed5061db Let dbm's datum::dptr use the right type.
According to POSIX, it should use void *, not char *. Unfortunately, the
dsize field also has the wrong type. It should be size_t. I'm not going
to change that, as that will break the ABI.

Reviewed by:	pfg
Differential Revision:	https://reviews.freebsd.org/D6647
2016-05-30 16:52:23 +00:00
Ed Schouten
7e3327be32 Add missing va_list to <wchar.h>.
It looks like va_list should always be defined when XSI is enabled. It
moved over to the POSIX base in the 2008 edition.
2016-05-30 16:26:34 +00:00
Ed Schouten
0977bd1e88 Fix the signature of the psignal() function.
POSIX 2008 added the psignal() function which has already been part of
the BSDs for a long time. The only difference is, the POSIX version uses
an 'int' for the signal number, unlike our version which uses an
'unsigned int'. Fix up the function to use an 'int'. This should not
affect the ABI.
2016-05-30 13:51:27 +00:00
Ed Schouten
07acf54b4b Add missing types and constants to <netdb.h>.
According to POSIX, the netdb.h header must also provide in_addr_t and
in_port_t. It should also provide IPPORT_RESERVED. Copy over the
necessary bits from <netinet/in.h> to achieve that.
2016-05-30 13:37:11 +00:00
Ed Schouten
611c29bab9 Add missing declaration of ino_t.
POSIX requires that <dirent.h> provides ino_t in the XSI case. In our
case, this wasn't being exposed, as d_ino is a macro that expands to
d_fileno that is an uint32_t, not an ino_t.
2016-05-30 07:50:57 +00:00
Ed Schouten
864a391104 Fix style of the libgen.h header.
- Remove unneeded declarations of removed/unimplemented features.
- Add missing tab after #define.
- Add missing ! before trailing comment.
2016-05-29 12:21:54 +00:00
Bryan Drewery
9ea89f3223 WITH_META_MODE: Disable cookie handling for include installation.
Using a cookie with meta mode causes it to *not rerun* (as normal make
does) unless the command changes or filemon-detected files change.

After all of the work done here it turns out that skipping installation
is dangerous since the install commands use <dir>/*.h.  The actual build
command is not changing but the files installed are changing by the mere
act of adding a new header into the source tree.  Thus we cannot safely
use meta mode logic here.  It must always rerun and install the headers.
The install -C flag at least prevents churning timestamps when
installing a header that was already present.

Sponsored by:	EMC / Isilon Storage Division
2016-05-21 01:31:57 +00:00
Konstantin Belousov
2a339d9e3d Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.

A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held.  The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.

The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths.  Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.

The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive).  Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.

Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot.  When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.

The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.

Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
   pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
   the lifetime of the shared mutex associated with a vnode' page.

Reviewed by:	jilles (previous version, supposedly the objection was fixed)
Discussed with:	brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
2016-05-17 09:56:22 +00:00