This change consists of two parts:
- allow libkvm to recognize /dev/vmm/* character devices as devices that
provide access to the physical memory of a system (similarly to /dev/fwmem*)
- allow libkvm to recognize that /dev/vmm/* and /dev/fwmem* devices provide
access to the physical memory of live remote systems and, thus, the memory
is writable
As a result, it should be possible to run commands like
$ kgdb -w /path/to/kernel /dev/fwmem0.0
$ kgdb /path/to/kernel /dev/vmm/guest
Reviewed by: kib, jhb
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D8679
This paves way to implement VDSO for the enlightened time counter.
Reviewed by: kib
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D8768
VSS stands for "Volume Shadow Copy Service". Unlike virtual machine
snapshot, it only takes snapshot for the virtual disks, so both
filesystem and applications have to aware of it, and cooperate the
whole VSS process.
This driver exposes two device files to the userland:
/dev/hv_fsvss_dev
Normally userland programs should _not_ mess with this device file.
It is currently used by the hv_vss_daemon(8), which freezes and
thaws the filesystem. NOTE: currently only UFS is supported, if
the system mounts _any_ other filesystems, the hv_vss_daemon(8)
will veto the VSS process.
If hv_vss_daemon(8) was disabled, then this device file must be
opened, and proper ioctls must be issued to keep the VSS working.
/dev/hv_appvss_dev
Userland application can opened this device file to receive the
VSS freeze notification, hold the VSS for a while (mainly to flush
application data to filesystem), release the VSS process, and
receive the VSS thaw notification i.e. applications can run again.
The VSS will still work, even if this device file is not opened.
However, only filesystem consistency is promised, if this device
file is not opened or is not operated properly.
hv_vss_daemon(8) is started by devd(8) by default. It can be disabled
by editting /etc/devd/hyperv.conf.
Submitted by: Hongjiang Zhang <honzhan microsoft com>
Reviewed by: kib, mckusick
MFC after: 3 weeks
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D8224
Now that the changes to the dirname(3) function had some time to settle,
let's go ahead and use the same approach for replacing basename(3) by a
simple implementation that modifies the input string, thereby making it
thread-safe and guaranteed to succeed.
Unlike dirname(3), this function already had a thread-safe variant
basename_r(3). This function had its own set of problems, like having an
upper bound on the pathname length. Keep this function around for
compatibility, but remove most references from the man page. Make the
man page more similar to that of dirname(3).
As the basename_r(3) function is only provided by FreeBSD (and Bionic),
depending on its use is even more implementation defined than assuming
that basename(3) is thread-safe.
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D8382
libc++'s stddef.h includes an existing definition of max_align_t for
C++11, but it is only defined for C++, not for C. In addition, GCC and
clang both define an alternate version of max_align_t that uses a
union of multiple types rather than a plain long double as in libc++.
This adds a __max_align_t to <sys/_types.h> that matches the GCC and
clang definition that is mapped to max_align_t in <stddef.h>.
PR: 210890
Reviewed by: dim
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D8194
Back in 2015 when I reimplemented these functions to use an AVL tree, I
was annoyed by the weakness of the typing of these functions. Both tree
nodes and keys are represented by 'void *', meaning that things like the
documentation for these functions are an absolute train wreck.
To make things worse, users of these functions need to cast the return
value of tfind()/tsearch() from 'void *' to 'type_of_key **' in order to
access the key. Technically speaking such casts violate aliasing rules.
I've observed actual breakages as a result of this by enabling features
like LTO.
I've filed a bug report at the Austin Group. Looking at the way the bug
got resolved, they made a pretty good step in the right direction. A new
type 'posix_tnode' has been added to correspond to tree nodes. It is
still defined as 'void' for source-level compatibility, but in the very
far future it could be replaced by a proper structure type containing a
key pointer.
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D8205
libzfs_core provides a rather limited but committed (stable) interface
for working with ZFS. We install libzfs_core shared library but we do
not install header files required for developing programs that use
the library. This change is to install the required header files
libzfs_core.h, libnvpair.h and sys/nvpair.h.
The headers are installed into the same locations as on illumos.
Reviewed by: mav, markj
Differential Revision: https://reviews.freebsd.org/D8005
that can be compiled on various OSes (including on older versions
of FreeBSD), make it possible to have it include the partitioning
scheme definitions without pulling in FreeBSD specifics.
In particular this means:
o move the scheme definitions iand related defines to header files
under sys/disk,
o make them (more) portable by using uint#_t (where applicable)
and renaming defines so that they at least have a good prefix,
o make the new headers stand-alone so that they don't need FreeBSD
definitions, like struct uuid(*)
o keep the original headers for compatibility, but rewrite them to
get the scheme definitions from <sys/disk/$scheme.h>.
(*) since UUID/GUID type definitions are non-portable and the GPT
scheme uses them, make it possible to have the scheme definitions
use an external type by allowing consumers of the header to set
GPT_UUID_TYPE. When GPT_UUID_TYPE has not been defined, the header
will use it's own type definition, which is the same as struct uuid.
The gpt_uuid_t typedef is created to abstract the details and allows
consumers to refer to a single type.
There is not conflict between the partitioning scheme headers and
what is defined in them. All headers can be included in the same
source files.
Note: consumers of the old headers have not been changed yet. Such
will be done if and when needed/beneficial.
Reviewed by: imp, jhb
MFC after: 1 month
Sponsored by: Bracket Computing
The setkey() and encrypt() functions are part of XSI, not the POSIX base
definitions. There is no strict requirement for us to provide these,
especially if we're only going to keep these around as undocumented
stubs. The same holds for des_setkey() and des_cipher().
Instead of providing functions that only generate warnings when linking,
simply disallow linking against them. The impact of this is relatively
low. It only causes two leaf ports to break. I'll see what I can do to
help out to get those fixed.
PR: 211626
file descriptor for the given posix mqueue. Export the
timer_oshandle_np() symbol to get ktimer id for the given posix timer.
Requested by: Lewis Donzis <lew@perftech.com>
Reviewed by: jilles
Discussed with: kan
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Right now our workaround is so good that it doesn't throw any warnings
on misuse. This means that people will keep on using the old version
of dirname(3) silently without fixing their code.
Go ahead and change the prototype of __old_dirname() to also use a plain
char *, so that we still get a compiler warning. This won't have any
negative effect on building older versions of FreeBSD on HEAD, as those
are built with -Werror disabled.
Differential Revision: https://reviews.freebsd.org/D7844
evdev is a generic input event interface compatible with Linux
evdev API at ioctl level. It allows using unmodified (apart from
header name) input evdev drivers in Xorg, Wayland, Qt.
This commit has only generic kernel API. evdev support for individual
hardware drivers like ukbd, ums, atkbd, etc. will be committed later.
Project was started by Jakub Klama as part of GSoC 2014. Jakub's
evdev implementation was later used as a base, updated and finished
by Vladimir Kondratiev.
Submitted by: Vladimir Kondratiev <wulf@cicgroup.ru>
Reviewed by: adrian, hans
Differential Revision: https://reviews.freebsd.org/D6998
As the xinstall(8) utility had to be patched up to work with the POSIXly
correct basename()/dirname() prototypes, we make it pretty hard to build
previous versions of FreeBSD on HEAD. xinstall(8) is part of the
bootstrap tools.
Add some logic to <libgen.h> to automatically detect bad calls to
dirname() based on the type of the argument. If the argument is of type
'const char *', we simply fall back to calling into dirname@FBSD_1.0
directly.
I'll also give basename() similar treatment when importing the
thread-safe version of that function.
Tested by: bdrewery, madpilot (thanks!)
time at year 2012. Only LC_COLLATE_MASK and LC_CTYPE_MASK are in the
right order.
The order here should match XLC_* from "xlocale_private.h" which, in turn,
match LC_* publicly visible order from <locale.h> which determines how
locale components are stored in the structure.
LC_*_MASK -> XLC_* translation done as "ffs(mask) - 1" in the querylocale()
and equivalent shift loop in the newlocale(), so mapped to some wrong
components (excluding two mentioned above).
Formally the fix is ABI breakage, but old code using those masks
never works properly in any case.
Only newlocale() and querylocale() are affected.
MFC after: 7 days
The syscall is a trivial wrapper around new VOP_FDATASYNC(), sharing
code with fsync(2). For all filesystems, this commit provides the
implementation which delegates the work of VOP_FDATASYNC() to
VOP_FSYNC(). This is functionally correct but not efficient.
This is not yet POSIX-compliant implementation, because it does not
ensure that queued AIO requests are completed before returning.
Reviewed by: mckusick
Discussed with: avg (ZFS), jhb (AIO part)
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D7471
glibc has a pretty nice function called crypt_r(3), which is nothing
more than crypt(3), but thread-safe. It accomplishes this by introducing
a 'struct crypt_data' structure that contains a buffer that is large
enough to hold the resulting string.
Let's go ahead and also add this function. It would be a shame if a
useful function like this wouldn't be usable in multithreaded apps.
Refactor crypt.c and all of the backends to no longer declare static
arrays, but write their output in a provided buffer.
There is no need to do any buffer length computation here, as we'll just
need to ensure that 'struct crypt_data' is large enough, which it is.
_PASSWORD_LEN is defined to 128 bytes, but in this case I'm picking 256,
as this is going to be part of the actual ABI.
Differential Revision: https://reviews.freebsd.org/D7306
we need to include it in -legacy or not. Since the ifdef was removed,
this broke building 10.x and older source trees on -current. Restore
just enough of _WITH_GETLINE to allow these older source trees to
still build and properly omit getline() from their -legacy library.
Just like with freelocale(3), I haven't been able to find any piece of
code that actually makes use of this function's return value, both in
base and in ports. The reason for this is that FreeBSD seems to be the
only operating system to have such a prototype. This is why I'm deciding
to not use symbol versioning for this.
It does seem that the pw(8) utility depends on the function's typing and
already had a switch in place to toggle between the FreeBSD and POSIX
variant of this function. Clean this up by always expecting the POSIX
variant.
There is also a single port that has a couple of local declarations of
setgrent(3) that need to be patched up. This is in the process of being
fixed.
PR: 211394 (exp-run)
When adding getline(3) and dprintf(3) into libc, those guards were added
to prevent breaking too many ports.
7 years later the ports tree have been fixed, it is time to remove this
FreeBSDism
While here remove the extra parenthesis surrounding dprintf(3)
Our version of this function currently returns an integer indicating
failure or success, whereas POSIX specifies that this function has no
return value. It returns void. Patch up the header, sources and man page
to use the right type. While there, use the opportunity to simplify the
body of this function.
Theoretically speaking, this change breaks the ABI of this function.
That said, I have yet to find any code that makes use of freelocale()'s
return value. I couldn't find any of it in the base system, nor did an
exp-run reveal any breakage caused by this change.
PR: 211394 (exp-run)
POSIX allows these functions to be implemented in a way that the
resulting string is stored in the input buffer. Though some may find
this annoying, this has the advantage that it makes it possible to
implement this function in a thread-safe way. It also means that they
can be implemented in a way that they work for paths of arbitrary
length, as the output string of these functions is never longer than
max(1, len(input)).
Portable code already needs to be written with this in mind, so in my
opinion it makes very little sense to allow the existing behaviour.
Prevent the base system from falling back to this by switching over to
POSIX prototypes.
I'm not going to bump the __FreeBSD_version for this. The reason is that
it's possible to account for this change in a portable way, without
depending on a specific version of FreeBSD. An exp-run was done some
time ago. As far as I know, all regressions as a result of this have
already been fixed.
I'll give this change some time to settle. In the long run I want to
replace our copies by ones that are thread-safe and don't depend on
PATH_MAX/MAXPATHLEN.
POSIX also declares NI_NUMERICSCOPE, which makes getnameinfo() return a
numerical scope identifier. The interesting thing is that support for
this is already present in code, but #ifdef disabled. Expose this
functionality by placing a definition for it in <netdb.h>.
While there, remove references to NI_WITHSCOPEID, as that got removed 11
years ago.
POSIX requires that MB_CUR_MAX expands to an expression of type size_t.
It currently expands to an int. As these are already macros, don't
change the underlying type of these functions. There is no ned to touch
those.
Differential Revision: https://reviews.freebsd.org/D6645
POSIX requires that these functions have an unsigned int for their first
argument; not an unsigned long.
My reasoning is that we can safely change these functions without
breaking the ABI. As far as I know, our supported architectures either
use registers for passing function arguments that are at least as big as
long (e.g., amd64), or int and long are of the same size (e.g., i386).
Reviewed by: ache
Differential Revision: https://reviews.freebsd.org/D6644
Both __alloc_align and __alloc_size can't be used when the function
returns a pointer to memory. This fixes breakage when building with
clang 3.4:
In file included from /usr/src/svn/usr.sbin/bhyve/atkbdc.c:40:
/usr/include/stdlib.h:176:6: error: '__alloc_size__' attribute only
applies to functions that return a pointer [-Werror,-Wignored-attributes]
Pointed out by: ngie, cem
Approved by: re (gjb)
This support appears to have been documented in nsswitch.conf(5) for some
time. The implementation adds two NSS netgroup providers to libc. The
default, compat, provides the behaviour documented in netgroup(5), so this
change does not make any user-visible behaviour changes. A files provider
is also implemented.
innetgr(3) is implemented as an optional NSS method so that providers such
as NIS which are able to implement efficient reverse lookup can do so.
A fallback implementation is used otherwise. getnetgrent_r(3) is added for
convenience and to provide compatibility with glibc and Solaris.
With a small patch to net/nss_ldap, it's possible to specify an ldap
netgroup provider, allowing one to query nisNetgroupTriple entries.
Sponsored by: EMC / Isilon Storage Division
The last argument of dbm_open() should be a mode_t according to POSIX;
not an int.
Reviewed by: pfg, kib
Differential Revision: https://reviews.freebsd.org/D6650
The strfmon_l() function provided by <xlocale/_monetary.h> is also part
of POSIX 2008's <monetary.h>, so it should be exposed by default.
Change the check used in <monetary.h> to be similar to the one that's
part of <wchar.h>, where we both test for __POSIX_VISIBLE and
_XLOCALE_H_.
According to POSIX, it should use void *, not char *. Unfortunately, the
dsize field also has the wrong type. It should be size_t. I'm not going
to change that, as that will break the ABI.
Reviewed by: pfg
Differential Revision: https://reviews.freebsd.org/D6647
POSIX 2008 added the psignal() function which has already been part of
the BSDs for a long time. The only difference is, the POSIX version uses
an 'int' for the signal number, unlike our version which uses an
'unsigned int'. Fix up the function to use an 'int'. This should not
affect the ABI.
According to POSIX, the netdb.h header must also provide in_addr_t and
in_port_t. It should also provide IPPORT_RESERVED. Copy over the
necessary bits from <netinet/in.h> to achieve that.
POSIX requires that <dirent.h> provides ino_t in the XSI case. In our
case, this wasn't being exposed, as d_ino is a macro that expands to
d_fileno that is an uint32_t, not an ino_t.
Using a cookie with meta mode causes it to *not rerun* (as normal make
does) unless the command changes or filemon-detected files change.
After all of the work done here it turns out that skipping installation
is dangerous since the install commands use <dir>/*.h. The actual build
command is not changing but the files installed are changing by the mere
act of adding a new header into the source tree. Thus we cannot safely
use meta mode logic here. It must always rerun and install the headers.
The install -C flag at least prevents churning timestamps when
installing a header that was already present.
Sponsored by: EMC / Isilon Storage Division
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
I'm still not sure why only Pypy runs into the error with the function
typedefs. Fix it anyway.
Use __ssize_t instead of ssize_t for the types; it's possible for the size_t
type to not be visible if at the wrong POSIX_VISIBLE level.
A final (crossing my fingers) follow-up to r299456.
Sponsored by: EMC / Isilon Storage Division
Despite the private namespace, several broken ports depend on the __off64_t
name for the type. Export it exactly the same way off_t and __off_t are
exported.
A follow-up to r299456.
Suggested by: php56
Sponsored by: EMC / Isilon Storage Division
Two new functions are provided, bit_ffs_at() and bit_ffc_at(), which allow
for efficient searching of set or cleared bits starting from any bit offset
within the bit string.
Performance is improved by operating on longs instead of bytes and using
ffsl() for searches within a long. ffsl() is a compiler builtin in both
clang and gcc for most architectures, converting what was a brute force
while loop search into a couple of instructions.
All of the bitstring(3) API continues to be contained in the header file.
Some of the functions are large enough that perhaps they should be uninlined
and moved to a library, but that is beyond the scope of this commit.
sys/sys/bitstring.h:
Convert the majority of the existing bit string implementation from
macros to inline functions.
Properly protect the implementation from inadvertant macro expansion
when included in a user's program by prefixing all private
macros/functions and local variables with '_'.
Add bit_ffs_at() and bit_ffc_at(). Implement bit_ffs() and
bit_ffc() in terms of their "at" counterparts.
Provide a kernel implementation of bit_alloc(), making the full API
usable in the kernel.
Improve code documenation.
share/man/man3/bitstring.3:
Add pre-exisiting API bit_ffc() to the synopsis.
Document new APIs.
Document the initialization state of the bit strings
allocated/declared by bit_alloc() and bit_decl().
Correct documentation for bitstr_size(). The original code comments
indicate the size is in bytes, not "elements of bitstr_t". The new
implementation follows this lead. Only hastd assumed "elements"
rather than bytes and it has been corrected.
etc/mtree/BSD.tests.dist:
tests/sys/Makefile:
tests/sys/sys/Makefile:
tests/sys/sys/bitstring.c:
Add tests for all existing and new functionality.
include/bitstring.h
Include all headers needed by sys/bitstring.h
lib/libbluetooth/bluetooth.h:
usr.sbin/bluetooth/hccontrol/le.c:
Include bitstring.h instead of sys/bitstring.h.
sbin/hastd/activemap.c:
Correct usage of bitstr_size().
sys/dev/xen/blkback/blkback.c
Use new bit_alloc.
sys/kern/subr_unit.c:
Remove hard-coded assumption that sizeof(bitstr_t) is 1. Get rid of
unrb.busy, which caches the number of bits set in unrb.map. When
INVARIANTS are disabled, nothing needs to know that information.
callapse_unr can be adapted to use bit_ffs and bit_ffc instead.
Eliminating unrb.busy saves memory, simplifies the code, and
provides a slight speedup when INVARIANTS are disabled.
sys/net/flowtable.c:
Use the new kernel implementation of bit-alloc, instead of hacking
the old libc-dependent macro.
sys/sys/param.h
Update __FreeBSD_version to indicate availability of new API
Submitted by: gibbs, asomers
Reviewed by: gibbs, ngie
MFC after: 4 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D6004