Commit Graph

29 Commits

Author SHA1 Message Date
Ed Schouten
4e1847781b Import the latest CloudABI definitions, version 0.16.
The most important change in this release is the removal of the
poll_fd() system call; CloudABI's equivalent of kevent(). Though I think
that kqueue is a lot saner than many of its alternatives, our
experience is that emulating this system call on other systems
accurately isn't easy. It has become a complex API, even though I'm not
convinced this complexity is needed. This is why we've decided to take a
different approach, by looking one layer up.

We're currently adding an event loop to CloudABI's C library that is API
compatible with libuv (except when incompatible with Capsicum).
Initially, this event loop will be built on top of plain inefficient
poll() calls. Only after this is finished, we'll work our way backwards
and design a new set of system calls to optimize it.

Interesting challenges will include integrating asynchronous I/O into
such a system call API. libuv currently doesn't aio(4) on Linux/BSD, due
to it being unreliable and having undesired semantics.

Obtained from:	https://github.com/NuxiNL/cloudabi
2017-10-18 19:22:53 +00:00
Ed Schouten
733ba7f881 Merge pipes and socket pairs.
Now that CloudABI's sockets API has been changed to be addressless and
only connected socket instances are used (e.g., socket pairs), they have
become fairly similar to pipes. The only differences on CloudABI is that
socket pairs additionally support shutdown(), send() and recv().

To simplify the ABI, we've therefore decided to remove pipes as a
separate file descriptor type and just let pipe() return a socket pair
of type SOCK_STREAM. S_ISFIFO() and S_ISSOCK() are now defined
identically.
2017-09-05 07:46:45 +00:00
Ed Schouten
b53b978a6c Complete the CloudABI networking refactoring.
Now that all of the packaged software has been adjusted to either use
Flower (https://github.com/NuxiNL/flower) for making incoming/outgoing
network connections or can have connections injected, there is no longer
need to keep accept() around. It is now a lot easier to write networked
services that are address family independent, dual-stack, testable, etc.

Remove all of the bits related to accept(), but also to
getsockopt(SO_ACCEPTCONN).
2017-08-30 07:30:06 +00:00
Ed Schouten
8212ad9a99 Sync CloudABI compatibility against the latest upstream version (v0.13).
With Flower (CloudABI's network connection daemon) becoming more
complete, there is no longer any need for creating any unconnected
sockets. Socket pairs in combination with file descriptor passing is all
that is necessary, as that is what is used by Flower to pass network
connections from the public internet to listening processes.

Remove all of the kernel bits that were used to implement socket(),
listen(), bindat() and connectat(). In principle, accept() and
SO_ACCEPTCONN may also be removed, but there are still some consumers
left.

Obtained from:	https://github.com/NuxiNL/cloudabi
MFC after:	1 month
2017-08-25 11:01:39 +00:00
Ed Schouten
cea9310d4e Upgrade to the latest sources generated from the CloudABI specification.
The CloudABI specification has had some minor changes over the last half
year. No substantial features have been added, but some features that
are deemed unnecessary in retrospect have been removed:

- mlock()/munlock():

  These calls tend to be used for two different purposes: real-time
  support and handling of sensitive (cryptographic) material that
  shouldn't end up in swap. The former use case is out of scope for
  CloudABI. The latter may also be handled by encrypting swap.

  Removing this has the advantage that we no longer need to worry about
  having resource limits put in place.

- SOCK_SEQPACKET:

  Support for SOCK_SEQPACKET is rather inconsistent across various
  operating systems. Some operating systems supported by CloudABI (e.g.,
  macOS) don't support it at all. Considering that they are rarely used,
  remove support for the time being.

- getsockname(), getpeername(), etc.:

  A shortcoming of the sockets API is that it doesn't allow you to
  create socket(pair)s, having fake socket addresses associated with
  them. This makes it harder to test applications or transparently
  forward (proxy) connections to them.

  With CloudABI, we're slowly moving networking connectivity into a
  separate daemon called Flower. In addition to passing around socket
  file descriptors, this daemon provides address information in the form
  of arbitrary string labels. There is thus no longer any need for
  requesting socket address information from the kernel itself.

This change also updates consumers of the generated code accordingly.
Even though system calls end up getting renumbered, this won't cause any
problems in practice. CloudABI programs always call into the kernel
through a kernel-supplied vDSO that has the numbers updated as well.

Obtained from:	https://github.com/NuxiNL/cloudabi
2017-07-26 06:57:15 +00:00
Edward Tomasz Napierala
d293f35c09 Add kern_listen(), kern_shutdown(), and kern_socket(), and use them
instead of their sys_*() counterparts in various compats. The svr4
is left untouched, because there's no point.

Reviewed by:	ed@, kib@
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9367
2017-01-30 12:57:22 +00:00
Edward Tomasz Napierala
f67d6b5f12 Add kern_lseek() and use it instead of sys_lseek() in various compats.
I didn't touch svr4/, there's no point.

Reviewed by:	ed@, kib@
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9366
2017-01-30 12:24:47 +00:00
Baptiste Daroussin
b4b4b5304b Revert crap accidentally committed 2017-01-28 16:31:23 +00:00
Baptiste Daroussin
814aaaa7da Revert r312923 a better approach will be taken later 2017-01-28 16:30:14 +00:00
Mateusz Guzik
02274c93c3 cloudabi: use fget_cap instead of hand-rolling capability read
This has a side effect of unbreaking the build after r306272.

Discussed with:		ed
2016-09-23 23:08:23 +00:00
Ed Schouten
93d9ebd82e Eliminate use of sys_fsync() and sys_fdatasync().
Make the kern_fsync() function public, so that it can be used by other
parts of the kernel. Fix up existing consumers to make use of it.

Requested by:	kib
2016-08-15 20:11:52 +00:00
Ed Schouten
2256eed3eb Let CloudABI use fdatasync() as well.
Now that FreeBSD supports fdatasync() natively, we can tidy up
CloudABI's equivalent system call to use that instead.
2016-08-15 19:42:21 +00:00
Ed Schouten
1f3bbfd875 Replace the CloudABI system call table by a machine generated version.
The type definitions and constants that were used by COMPAT_CLOUDABI64
are a literal copy of some headers stored inside of CloudABI's C
library, cloudlibc. What is annoying is that we can't make use of
cloudlibc's system call list, as the format is completely different and
doesn't provide enough information. It had to be synced in manually.

We recently decided to solve this (and some other problems) by moving
the ABI definitions into a separate file:

	https://github.com/NuxiNL/cloudabi/blob/master/cloudabi.txt

This file is processed by a pile of Python scripts to generate the
header files like before, documentation (markdown), but in our case more
importantly: a FreeBSD system call table.

This change discards the old files in sys/contrib/cloudabi and replaces
them by the latest copies, which requires some minor changes here and
there. Because cloudabi.txt also enforces consistent names of the system
call arguments, we have to patch up a small number of system call
implementations to use the new argument names.

The new header files can also be included directly in FreeBSD kernel
space without needing any includes/defines, so we can now remove
cloudabi_syscalldefs.h and cloudabi64_syscalldefs.h. Patch up the
sources to include the definitions directly from sys/contrib/cloudabi
instead.
2016-03-24 21:47:15 +00:00
Ed Schouten
bc1ace0b96 Decompose linkat()/renameat() rights to source and target.
To make it easier to understand how Capsicum interacts with linkat() and
renameat(), rename the rights to CAP_{LINK,RENAME}AT_{SOURCE,TARGET}.

This also addresses a shortcoming in Capsicum, where it isn't possible
to disable linking to files stored in a directory. Creating hardlinks
essentially makes it possible to access files with additional rights.

Reviewed by:	rwatson, wblock
Differential Revision:	https://reviews.freebsd.org/D3411
2015-08-27 15:16:41 +00:00
Ed Schouten
2c20fbe43a Use CAP_EVENT instead of CAP_PDWAIT.
The cloudlibc pdwait() function ends up using FreeBSD's kqueue() in
combination with EVFILT_PROCDESC. This depends on CAP_EVENT -- not
CAP_PDWAIT.

Obtained from:	https://github.com/NuxiNL/freebsd
2015-08-12 11:07:03 +00:00
Ed Schouten
65c17fe451 Make cap_rights_limit() work for CloudABI processes.
Call into the recently introduced kern_cap_rights_limit() function to
restrict rights.
2015-08-11 08:44:19 +00:00
Ed Schouten
0f85ff377b Add file_open(): the underlying system call of openat().
CloudABI purely operates on file descriptor rights (CAP_*). File
descriptor access modes (O_ACCMODE) are emulated on top of rights.

Instead of accepting the traditional flags argument, file_open() copies
in an fdstat_t object that contains the initial rights the descriptor
should have, but also file descriptor flags that should persist after
opening (APPEND, NONBLOCK, *SYNC). Only flags that don't persist (EXCL,
TRUNC, CREAT, DIRECTORY) are passed in as an argument.

file_open() first converts the rights, the persistent flags and the
non-persistent flags to fflags. It then calls into vn_open(). If
successful, it installs the file descriptor with the requested
rights, trimming off rights that don't apply to the type of
the file that has been opened.

Unlike kern_openat(), this function does not support /dev/fd/*. I can't
think of a reason why we need to support this for CloudABI.

Obtained from:	https://github.com/NuxiNL/freebsd
Differential Revision:	https://reviews.freebsd.org/D3235
2015-08-06 06:47:28 +00:00
Ed Schouten
36310bcd1d Make fcntl(F_SETFL) work.
The stat_put() system call can be used to modify file descriptor
attributes, such as flags, but also Capsicum permission bits. Support
for changing Capsicum bits will be added as soon as its dependent
changes have been pushed through code review.

Obtained from:	https://github.com/NuxiNL/freebsd
2015-08-05 16:15:43 +00:00
Ed Schouten
4958fab8cd Allow the creation of polling descriptors (kqueues) on CloudABI. 2015-08-05 07:37:06 +00:00
Ed Schouten
f52c3dd415 Allow CloudABI processes to create shared memory objects.
Summary:
Use the newly created `kern_shm_open()` function to create objects with
just the rights that are actually needed.

Reviewers: jhb, kib

Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D3260
2015-08-01 07:51:48 +00:00
Ed Schouten
8328babdd0 Make pipes in CloudABI work.
Summary:
Pipes in CloudABI are unidirectional. The reason for this is that
CloudABI attempts to provide a uniform runtime environment across
different flavours of UNIX.

Instead of implementing a custom pipe that is unidirectional, we can
simply reuse Capsicum permission bits to support this. This is nice,
because CloudABI already attempts to restrict permission bits to
correspond with the operations that apply to a certain file descriptor.

Replace kern_pipe() and kern_pipe2() by a single kern_pipe() that takes
a pair of filecaps. These filecaps are passed to the newly introduced
falloc_caps() function that creates the descriptors with rights in
place.

Test Plan:
CloudABI pipes seem to be created with proper rights in place:

https://github.com/NuxiNL/cloudlibc/blob/master/src/libc/unistd/pipe_test.c#L44

Reviewers: jilles, mjg

Reviewed By: mjg

Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D3236
2015-07-29 17:18:27 +00:00
Ed Schouten
9d2332c9ee Split up Capsicum to CloudABI rights conversion into two separate routines.
CloudABI's openat() ensures that files are opened with the smallest set
of relevant rights. For example, when opening a FIFO, unrelated rights
like CAP_RECV are automatically removed. To remove unrelated rights, we
can just reuse the code for this that was already present in the rights
conversion function.
2015-07-29 12:42:45 +00:00
Ed Schouten
cec575201a Make fstat() and friends work.
Summary:
CloudABI provides access to two different stat structures:

- fdstat, containing file descriptor level status: oflags, file
  descriptor type and Capsicum rights, used by cap_rights_get(),
  fcntl(F_GETFL), getsockopt(SO_TYPE).
- filestat, containing your regular file status: timestamps, inode
  number, used by fstat().

Unlike FreeBSD's stat::st_mode, CloudABI file descriptor types don't
have overloaded meanings (e.g., returning S_ISCHR() for kqueues). Add a
utility function to extract the type of a file descriptor accurately.

CloudABI does not work with O_ACCMODEs. File descriptors have two sets
of Capsicum-style rights: rights that apply to the file descriptor
itself ('base') and rights that apply to any new file descriptors
yielded through openat() ('inheriting'). Though not perfect, we can
pretty safely decompose Capsicum rights to such a pair. This is done in
convert_capabilities().

Test Plan: Tests for these system calls are fairly extensive in cloudlibc.

Reviewers: jonathan, mjg, #manpages

Reviewed By: mjg

Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D3171
2015-07-28 06:36:49 +00:00
Ed Schouten
fef97e09d9 Allow us to create UNIX sockets and socketpairs in CloudABI processes. 2015-07-23 13:52:53 +00:00
Ed Schouten
a4001f4cb9 Use FDDUP_NORMAL instead of hardcoding value 0.
Proposed by:	mjg
2015-07-11 18:53:30 +00:00
Ed Schouten
329d1bca7f Add missing function parameter.
A function parameter got added in r285356, meaning that the call to
kern_dup() needs to be patched up.
2015-07-11 18:39:16 +00:00
Mateusz Guzik
5fe97c20dc fd: split kern_dup flags argument into actual flags and a mode
Tidy up the code inside to switch on the mode.
2015-07-10 11:01:30 +00:00
Ed Schouten
2491302a04 Add implementations for some of the CloudABI file descriptor system calls.
All of the CloudABI system calls that operate on file descriptors of an
arbitrary type are prefixed with fd_. This change adds wrappers for
most of these system calls around their FreeBSD equivalents.

The dup2() system call present on CloudABI deviates from POSIX, in the
sense that it can only be used to replace existing file descriptor. It
cannot be used to create new ones. The reason for this is that this is
inherently thread-unsafe. Furthermore, there is no need on CloudABI to
use fixed file descriptor numbers. File descriptors 0, 1 and 2 have no
special meaning.

This change exposes the kern_dup() through <sys/syscallsubr.h> and puts
the FDDUP_* flags in <sys/filedesc.h>. It then adds a new flag,
FDDUP_MUSTREPLACE to force that file descriptors are replaced -- not
allocated.

Differential Revision:	https://reviews.freebsd.org/D3035
Reviewed by:	mjg
2015-07-09 16:07:01 +00:00
Ed Schouten
6d338f9a81 Import the CloudABI datatypes and create a system call table.
CloudABI is a pure capability-based runtime environment for UNIX. It
works similar to Capsicum, except that processes already run in
capabilities mode on startup. All functionality that conflicts with this
model has been omitted, making it a compact binary interface that can be
supported by other operating systems without too much effort.

CloudABI is 'secure by default'; the idea is that it should be safe to
run arbitrary third-party binaries without requiring any explicit
hardware virtualization (Bhyve) or namespace virtualization (Jails). The
rights of an application are purely determined by the set of file
descriptors that you grant it on startup.

The datatypes and constants used by CloudABI's C library (cloudlibc) are
defined in separate files called syscalldefs_mi.h (pointer size
independent) and syscalldefs_md.h (pointer size dependent). We import
these files in sys/contrib/cloudabi and wrap around them in
cloudabi*_syscalldefs.h.

We then add stubs for all of the system calls in sys/compat/cloudabi or
sys/compat/cloudabi64, depending on whether the system call depends on
the pointer size. We only have nine system calls that depend on the
pointer size. If we ever want to support 32-bit binaries, we can simply
add sys/compat/cloudabi32 and implement these nine system calls again.

The next step is to send in code reviews for the individual system call
implementations, but also add a sysentvec, to allow CloudABI executabled
to be started through execve().

More information about CloudABI:
- GitHub: https://github.com/NuxiNL/cloudlibc
- Talk at BSDCan: https://www.youtube.com/watch?v=SVdF84x1EdA

Differential Revision:	https://reviews.freebsd.org/D2848
Reviewed by:	emaste, brooks
Obtained from:	https://github.com/NuxiNL/freebsd
2015-07-09 07:20:15 +00:00