Commit Graph

47 Commits

Author SHA1 Message Date
Brooks Davis
f373437a01 Add helper functions to copy strings into struct image_args.
Given a zeroed struct image_args with an allocated buf member,
exec_args_add_fname() must be called to install a file name (or NULL).
Then zero or more calls to exec_args_add_env() followed by zero or
more calls to exec_args_add_env(). exec_args_adjust_args() may be
called after args and/or env to allow an interpreter to be prepended to
the argument list.

To allow code reuse when adding arg and env variables, begin_envv
should be accessed with the accessor exec_args_get_begin_envv()
which handles the case when no environment entries have been added.

Use these functions to simplify exec_copyin_args() and
freebsd32_exec_copyin_args().

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15468
2018-11-29 21:00:56 +00:00
Brooks Davis
c7d0908e1c Regenerated assorted syscall related files after:
- r327895: Implement 'domainset'...
 - r329876: Use linux types for linux-specific syscalls

Diff generated with:
	find . -name syscalls.conf | xargs dirname | \
	    xargs -n1 -I DIR make -C DIR sysent

Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-10-09 20:42:17 +00:00
Bryan Drewery
595109196a Don't use an .OBJDIR for 'make sysent'.
Reported by:	emaste, jhb
Sponsored by:	Dell EMC
2018-01-29 19:14:15 +00:00
Ed Schouten
4b6b56b32b Use mallocarray(9) in CloudABI kernel code where possible.
Submitted by:	pfg@
2018-01-07 22:38:45 +00:00
Ed Schouten
8e04fe5af3 Allow timed waits with relative timeouts on locks and condvars.
Even though pthreads doesn't support this, there are various alternative
APIs that use this. For example, uv_cond_timedwait() accepts a relative
timeout. So does Rust's std::sync::Condvar::wait_timeout().

Though I personally think that relative timeouts are bad (due to
imprecision for repeated operations), it does seem that people want
this. Extend the existing futex functions to keep track of whether an
absolute timeout is used in a boolean flag.

MFC after:	1 month
2018-01-04 21:57:37 +00:00
Ed Schouten
7e6155744d Upgrade to CloudABI v0.17.
Compared to the previous version, v0.16, there are a couple of minor
changes:

- CLOUDABI_AT_PID: Process identifiers for CloudABI processes.

  Initially, BSD process identifiers weren't exposed inside the runtime,
  due to them being pretty much useless inside of a cluster computing
  environment. When jobs are scheduled across systems, the BSD process
  number doesn't act as an identifier. Even on individual systems they
  may recycle relatively quickly.

  With this change, the kernel will now generate a UUIDv4 when executing
  a process. These UUIDs can be obtained within the process using
  program_getpid(). Right now, FreeBSD will not attempt to store this
  value. This should of course happen at some point in time, so that it
  may be printed by administration tools.

- Removal of some unused structure members for polling.

  With the polling framework being simplified/redesigned, it turns out
  some of the structure fields were not used by the C library. We can
  remove these to keep things nice and tidy.

Obtained from:	https://github.com/NuxiNL/cloudabi
2017-11-08 14:21:52 +00:00
Ed Schouten
4e1847781b Import the latest CloudABI definitions, version 0.16.
The most important change in this release is the removal of the
poll_fd() system call; CloudABI's equivalent of kevent(). Though I think
that kqueue is a lot saner than many of its alternatives, our
experience is that emulating this system call on other systems
accurately isn't easy. It has become a complex API, even though I'm not
convinced this complexity is needed. This is why we've decided to take a
different approach, by looking one layer up.

We're currently adding an event loop to CloudABI's C library that is API
compatible with libuv (except when incompatible with Capsicum).
Initially, this event loop will be built on top of plain inefficient
poll() calls. Only after this is finished, we'll work our way backwards
and design a new set of system calls to optimize it.

Interesting challenges will include integrating asynchronous I/O into
such a system call API. libuv currently doesn't aio(4) on Linux/BSD, due
to it being unreliable and having undesired semantics.

Obtained from:	https://github.com/NuxiNL/cloudabi
2017-10-18 19:22:53 +00:00
Ed Schouten
b53b978a6c Complete the CloudABI networking refactoring.
Now that all of the packaged software has been adjusted to either use
Flower (https://github.com/NuxiNL/flower) for making incoming/outgoing
network connections or can have connections injected, there is no longer
need to keep accept() around. It is now a lot easier to write networked
services that are address family independent, dual-stack, testable, etc.

Remove all of the bits related to accept(), but also to
getsockopt(SO_ACCEPTCONN).
2017-08-30 07:30:06 +00:00
Ed Schouten
8212ad9a99 Sync CloudABI compatibility against the latest upstream version (v0.13).
With Flower (CloudABI's network connection daemon) becoming more
complete, there is no longer any need for creating any unconnected
sockets. Socket pairs in combination with file descriptor passing is all
that is necessary, as that is what is used by Flower to pass network
connections from the public internet to listening processes.

Remove all of the kernel bits that were used to implement socket(),
listen(), bindat() and connectat(). In principle, accept() and
SO_ACCEPTCONN may also be removed, but there are still some consumers
left.

Obtained from:	https://github.com/NuxiNL/cloudabi
MFC after:	1 month
2017-08-25 11:01:39 +00:00
Ed Schouten
cea9310d4e Upgrade to the latest sources generated from the CloudABI specification.
The CloudABI specification has had some minor changes over the last half
year. No substantial features have been added, but some features that
are deemed unnecessary in retrospect have been removed:

- mlock()/munlock():

  These calls tend to be used for two different purposes: real-time
  support and handling of sensitive (cryptographic) material that
  shouldn't end up in swap. The former use case is out of scope for
  CloudABI. The latter may also be handled by encrypting swap.

  Removing this has the advantage that we no longer need to worry about
  having resource limits put in place.

- SOCK_SEQPACKET:

  Support for SOCK_SEQPACKET is rather inconsistent across various
  operating systems. Some operating systems supported by CloudABI (e.g.,
  macOS) don't support it at all. Considering that they are rarely used,
  remove support for the time being.

- getsockname(), getpeername(), etc.:

  A shortcoming of the sockets API is that it doesn't allow you to
  create socket(pair)s, having fake socket addresses associated with
  them. This makes it harder to test applications or transparently
  forward (proxy) connections to them.

  With CloudABI, we're slowly moving networking connectivity into a
  separate daemon called Flower. In addition to passing around socket
  file descriptors, this daemon provides address information in the form
  of arbitrary string labels. There is thus no longer any need for
  requesting socket address information from the kernel itself.

This change also updates consumers of the generated code accordingly.
Even though system calls end up getting renumbered, this won't cause any
problems in practice. CloudABI programs always call into the kernel
through a kernel-supplied vDSO that has the numbers updated as well.

Obtained from:	https://github.com/NuxiNL/cloudabi
2017-07-26 06:57:15 +00:00
Ed Schouten
75865d0d75 Make file descriptor passing for CloudABI's recvmsg() work.
Similar to the change for sendmsg(), create a pointer size independent
implementation of recvmsg() and let cloudabi32 and cloudabi64 call into
it. In case userspace requests one or more file descriptors, call
kern_recvit() in such a way that we get the control message headers in
an mbuf. Iterate over all of the headers and copy the file descriptors
to userspace.
2017-03-22 19:20:39 +00:00
Ed Schouten
36cc183884 Make file descriptor passing work for CloudABI's sendmsg().
Reduce the potential amount of code duplication between cloudabi32 and
cloudabi64 by creating a cloudabi_sock_recv() utility function. The
cloudabi32 and cloudabi64 modules will then only contain code to convert
the iovecs to the native pointer size.

In cloudabi_sock_recv(), we can now construct an SCM_RIGHTS cmsghdr in
an mbuf and pass that on to kern_sendit().
2017-03-22 06:43:10 +00:00
John Baldwin
bb9b710477 Regenerate all the system call tables to drop "created from" lines.
One of the ibcs2 files contains some actual changes (new headers) as
it hasn't been regenerated after older changes to makesyscalls.sh.
2017-02-10 19:45:02 +00:00
Baptiste Daroussin
b4b4b5304b Revert crap accidentally committed 2017-01-28 16:31:23 +00:00
Baptiste Daroussin
814aaaa7da Revert r312923 a better approach will be taken later 2017-01-28 16:30:14 +00:00
Ed Schouten
4423244072 Catch up with changes to structure member names.
Pointer/length pairs are now always named ${name} and ${name}_len.
2017-01-17 22:05:52 +00:00
Ed Schouten
c20537def1 Regenerate sources based on the system call tables. 2017-01-17 22:05:01 +00:00
Mark Johnston
bdaf6d6913 Regenerate syscall provider argument strings. 2016-09-22 04:50:03 +00:00
Ed Schouten
90f4145f82 Don't forget to define __ELF_WORD_SIZE.
Without it, we only obtain the ELF types native to the system. In this
we explicitly want the 64-bit versions.
2016-08-21 15:37:49 +00:00
Ed Schouten
47cb4d7bd0 Add a utility macro for converting 64-bit pointers to native pointers.
Right now we're casting uint64_t's to native pointers. This isn't
causing any problems right now, but if we want to provide a 32-bit
compatibility layer that works on 64-bit systems as well, this will
cause problems. Casting a uint32_t to a 64-bit pointer throws a compiler
error.

Introduce a TO_PTR() macro that casts the value to uintptr_t before
casting it to a pointer.
2016-08-21 15:36:18 +00:00
Ed Schouten
4fbc90654c Move the linker script from cloudabi64/ to cloudabi/.
It turns out that it works perfectly fine for generating 32-bits vDSOs
as well. While there, get rid of the extraneous .s file extension.
2016-08-21 15:14:06 +00:00
Ed Schouten
a953f555e1 Use the right _MAX constant.
Though uio_resid is of type ssize_t, we need to take into account that
this source file contains an implementation specific to a certain
userspace pointer size. If this file provided 32-bit implementations,
this should have used INT32_MAX, even when running a 64-bit kernel.

This change has no effect, but is simply in preparation for adding
support for running 32-bit CloudABI executables.
2016-08-21 09:32:20 +00:00
Ed Schouten
7a3558bed5 Regenerate system call table after r304483. 2016-08-19 17:54:06 +00:00
Ed Schouten
a5c3c5b14f Import the new automatically generated system call table for CloudABI.
Now that we've switched over to using the vDSO on CloudABI, it becomes a
lot easier for us to phase out old features. System call numbering is no
longer something that's part of the ABI. It's fully based on names. As
long as the numbering used by the kernel and the vDSO is consistent
(which it always is), it's all right.

Let's put this to the test by removing a system call (thread_tcb_set())
that's already unused for quite some time now, but was only left intact
to serve as a placeholder. Sync in the new system call table that uses
alphabetic sorting of system calls.

Obtained from:	https://github.com/NuxiNL/cloudabi
2016-08-19 17:49:35 +00:00
Ed Schouten
13b4b4df98 Provide the CloudABI vDSO to its executables.
CloudABI executables already provide support for passing in vDSOs. This
functionality is used by the emulator for OS X to inject system call
handlers. On FreeBSD, we could use it to optimize calls to
gettimeofday(), etc.

Though I don't have any plans to optimize any system calls right now,
let's go ahead and already pass in a vDSO. This will allow us to
simplify the executables, as the traditional "syscall" shims can be
removed entirely. It also means that we gain more flexibility with
regards to adding and removing system calls.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D7438
2016-08-10 21:02:41 +00:00
Ed Schouten
ab83575070 Make CloudABI's way of doing TLS more friendly to userspace emulators.
We're currently seeing how hard it would be to run CloudABI binaries on
operating systems cannot be modified easily (Windows, Mac OS X). The
idea is that we want to just run them without any sandboxing. Now
that CloudABI executables are PIE, this is already a bit easier, but TLS
is still problematic:

- CloudABI executables want to write to the %fs, which typically
  requires extra system calls by the emulator every time it needs to
  switch between CloudABI's and its own TLS.

- If CloudABI executables overwrite the %fs base unconditionally, it
  also becomes harder for the emulator to store a backup of the old
  value of %fs. To solve this, let's no longer overwrite %fs, but just
  %fs:0.

As CloudABI's C library does not use a TCB, this space can now be used
by an emulator to keep track of its internal state. The executable can
now safely overwrite %fs:0, as long as it makes sure that the TCB is
copied over to the new TLS area.

Ensure that there is an initial TLS area set up when the process starts,
only containing a bogus TCB. We don't really care about its contents on
FreeBSD.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D5836
2016-04-06 11:11:31 +00:00
Ed Schouten
4a8b3b18cc Make Position Independent Executables work for CloudABI.
- Set BI_CAN_EXEC_DYN, so we can execute ET_DYN ELF files in addition to
  regular ET_EXECs.
- Provide an AT_BASE entry in the auxiliary vector, so the executable
  knows at which address it got loaded and can apply relocations.
2016-03-31 18:52:00 +00:00
Ed Schouten
2054309b6a Regenerate system call table after r297468. 2016-03-31 18:50:52 +00:00
Ed Schouten
38526a2cf1 Sync in the latest CloudABI system call definitions.
Some time ago I made a change to merge together the memory scope
definitions used by mmap (MAP_{PRIVATE,SHARED}) and lock objects
(PTHREAD_PROCESS_{PRIVATE,SHARED}). Though that sounded pretty smart
back then, it's backfiring. In the case of mmap it's used with other
flags in a bitmask, but for locking it's an enumeration. As our plan is
to automatically generate bindings for other languages, that looks a bit
sloppy.

Change all of the locking functions to use separate flags instead.

Obtained from:	https://github.com/NuxiNL/cloudabi
2016-03-31 18:50:06 +00:00
Ed Schouten
3cf50041ef Regenerate system call table after r297247. 2016-03-24 21:49:39 +00:00
Ed Schouten
1f3bbfd875 Replace the CloudABI system call table by a machine generated version.
The type definitions and constants that were used by COMPAT_CLOUDABI64
are a literal copy of some headers stored inside of CloudABI's C
library, cloudlibc. What is annoying is that we can't make use of
cloudlibc's system call list, as the format is completely different and
doesn't provide enough information. It had to be synced in manually.

We recently decided to solve this (and some other problems) by moving
the ABI definitions into a separate file:

	https://github.com/NuxiNL/cloudabi/blob/master/cloudabi.txt

This file is processed by a pile of Python scripts to generate the
header files like before, documentation (markdown), but in our case more
importantly: a FreeBSD system call table.

This change discards the old files in sys/contrib/cloudabi and replaces
them by the latest copies, which requires some minor changes here and
there. Because cloudabi.txt also enforces consistent names of the system
call arguments, we have to patch up a small number of system call
implementations to use the new argument names.

The new header files can also be included directly in FreeBSD kernel
space without needing any includes/defines, so we can now remove
cloudabi_syscalldefs.h and cloudabi64_syscalldefs.h. Patch up the
sources to include the definitions directly from sys/contrib/cloudabi
instead.
2016-03-24 21:47:15 +00:00
Ed Schouten
b78ef4bd86 Refactoring: move out generic bits from cloudabi64_sysvec.c.
In order to make it easier to support CloudABI on ARM64, move out all of
the bits from the AMD64 cloudabi_sysvec.c into a new file
cloudabi_module.c that would otherwise remain identical. This reduces
the AMD64 specific code to just ~160 lines.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D3974
2015-10-22 09:07:53 +00:00
Ed Schouten
fbb624e76f Add the last remaining system calls: send() and recv().
There is still one TODO item for these calls: add file descriptor
passing. The data structures are already prepared for this. It's just
the translation that's missing.

Obtained from:	http://github.com/NuxiNL/freebsd
2015-08-12 17:42:20 +00:00
Ed Schouten
18528470cb Make blocking CloudABI futex operations work.
Blocking on locks and condition variables can be accomplished by polling
and using the special filters CONDVAR, LOCK_RDLOCK and LOCK_WRLOCK.

For now it wouldn't make sense to implement this functionality into
kqueue() itself, for the reason that they are CloudABI specific and
would require us to resize 'struct kevent' to hold all of the parameters
of interest.

Add a bandaid to the CloudABI poll system call to call into the futex
code directly if it detects specific combinations of events that are
used by the C library.

Obtained from:	https://github.com/NuxiNL/freebsd
2015-08-12 08:41:48 +00:00
Ed Schouten
322e16e87e Make poll() and kqueue() on CloudABI work.
This change implements two functions, cloudabi64_kevent_copyin() and
cloudabi64_kevent_copyout(), that convert CloudABI structures to
FreeBSD's struct kevent. CloudABI uses two structures: subscription_t
and event_t. The former is used for input, whereas the latter is used
for output. Unlike struct kevent, fields aren't overloaded for multiple
purposes or for separate event types.

For poll() we call into the newly introduced kern_kevent_anonymous()
function that allows us to poll without a file descriptor. This function
is not only used by poll(), but also by functions such as
sleep() and clock_nanosleep().

Reviewed by:	jmg
Obtained from:	https://github.com/NuxiNL/freebsd
Differential Revision:	https://reviews.freebsd.org/D3308
2015-08-12 07:59:00 +00:00
Ed Schouten
2412ae2b8e Regenerate the system call table. 2015-08-05 13:10:13 +00:00
Ed Schouten
2837d9ed43 Import the latest CloudABI system call definitions and table.
We're going to need these for next code I'm going to send out for
review: support for poll() and kqueue() on CloudABI.
2015-08-05 13:09:46 +00:00
Ed Schouten
533c8a29da Regenerate system call table. 2015-07-27 10:04:28 +00:00
Ed Schouten
f4c06d124f Sync in latest upstream system call definitions.
Futex object scopes have been renamed from using their own constants to
simply reusing the existing CLOUDABI_MAP_{PRIVATE,SHARED} flags, as they
are more accurate in this context.
2015-07-27 10:04:06 +00:00
Ed Schouten
c989441af6 Regenerate system call table. 2015-07-22 10:05:46 +00:00
Ed Schouten
73dcd7db56 Import upstream changes to the system call definitions.
Support has been added for providing the scope of a futex operation,
whether the futex is local to the process or shared between processes.
2015-07-22 10:04:53 +00:00
Ed Schouten
21d30b29d5 Make thread creation work for CloudABI processes.
Summary:
Remove the stub system call that was put in place during the system call
import and replace it by a target-dependent version stored in sys/amd64.
Initialize the thread in a way similar to cpu_set_upcall_kse(). We
provide the entry point with two arguments: the thread ID and the
argument pointer.

Test Plan:
Thread creation still seems to work, both for FreeBSD and CloudABI
binaries.

Reviewers: dchagin, mjg, kib

Reviewed By: kib

Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D3110
2015-07-21 12:47:15 +00:00
Ed Schouten
460ac6370a Regenerate system call table for r285540. 2015-07-14 15:12:24 +00:00
Ed Schouten
1eb7c7cae3 Implement thread_tcb_set() and thread_yield().
The first system call is used to set the user TLS address. Right now
this system call is invoked by the C library for both the initial thread
and additional threads unconditionally, but in the future we'll only
call this if the architecture does not support this. On recent x86-64
CPUs we could use the WRFSBASE instruction.

This system call was erroneously placed in sys/compat/cloudabi64, even
though it does not depend on any pointer size dependent datastructure.
Move it to the right place.

Obtained from:	https://github.com/NuxiNL/freebsd
2015-07-14 15:11:50 +00:00
Ed Schouten
03744d7c8d Implement {,p}{read,write}{,v}().
Add a routine similar to copyinuio() and freebsd32_copyinuio() that
copies in CloudABI's struct iovecs. These are then translated into
FreeBSD format and placed in a 'struct uio', so we can call into the
kern_*() functions.

Obtained from:	https://github.com/NuxiNL/freebsd
2015-07-14 14:33:21 +00:00
Ed Schouten
f355e810cf Generate CloudABI system call table with proper $FreeBSD$ tags. 2015-07-09 07:21:33 +00:00
Ed Schouten
6d338f9a81 Import the CloudABI datatypes and create a system call table.
CloudABI is a pure capability-based runtime environment for UNIX. It
works similar to Capsicum, except that processes already run in
capabilities mode on startup. All functionality that conflicts with this
model has been omitted, making it a compact binary interface that can be
supported by other operating systems without too much effort.

CloudABI is 'secure by default'; the idea is that it should be safe to
run arbitrary third-party binaries without requiring any explicit
hardware virtualization (Bhyve) or namespace virtualization (Jails). The
rights of an application are purely determined by the set of file
descriptors that you grant it on startup.

The datatypes and constants used by CloudABI's C library (cloudlibc) are
defined in separate files called syscalldefs_mi.h (pointer size
independent) and syscalldefs_md.h (pointer size dependent). We import
these files in sys/contrib/cloudabi and wrap around them in
cloudabi*_syscalldefs.h.

We then add stubs for all of the system calls in sys/compat/cloudabi or
sys/compat/cloudabi64, depending on whether the system call depends on
the pointer size. We only have nine system calls that depend on the
pointer size. If we ever want to support 32-bit binaries, we can simply
add sys/compat/cloudabi32 and implement these nine system calls again.

The next step is to send in code reviews for the individual system call
implementations, but also add a sysentvec, to allow CloudABI executabled
to be started through execve().

More information about CloudABI:
- GitHub: https://github.com/NuxiNL/cloudlibc
- Talk at BSDCan: https://www.youtube.com/watch?v=SVdF84x1EdA

Differential Revision:	https://reviews.freebsd.org/D2848
Reviewed by:	emaste, brooks
Obtained from:	https://github.com/NuxiNL/freebsd
2015-07-09 07:20:15 +00:00