ZFS file system was ported from OpenSolaris operating system. The code in under
CDDL license.
I'd like to thank all SUN developers that created this great piece of software.
Supported by: Wheel LTD (http://www.wheel.pl/)
Supported by: The FreeBSD Foundation (http://www.freebsdfoundation.org/)
Supported by: Sentex (http://www.sentex.net/)
It may be used for external modules to attach some data to jail's in-kernel
structure.
- Change allprison_mtx mutex to allprison_sx sx(9) lock.
We will need to call external functions while holding this lock, which may
want to allocate memory.
Make use of the fact that this is shared-exclusive lock and use shared
version when possible.
- Implement the following functions:
prison_service_register() - registers a service that wants to be noticed
when a jail is created and destroyed
prison_service_deregister() - deregisters service
prison_service_data_add() - adds service-specific data to the jail structure
prison_service_data_get() - takes service-specific data from the jail
structure
prison_service_data_del() - removes service-specific data from the jail
structure
Reviewed by: rwatson
unmount jail-friendly file systems from within a jail.
Precisely it grants PRIV_VFS_MOUNT, PRIV_VFS_UNMOUNT and
PRIV_VFS_MOUNT_NONUSER privileges for a jailed super-user.
It is turned off by default.
A jail-friendly file system is a file system which driver registers
itself with VFCF_JAIL flag via VFS_SET(9) API.
The lsvfs(1) command can be used to see which file systems are
jail-friendly ones.
There currently no jail-friendly file systems, ZFS will be the first one.
In the future we may consider marking file systems like nullfs as
jail-friendly.
Reviewed by: rwatson
The problem is this: vm_fault_additional_pages() calls vm_pager_has_page(),
which calls vnode_pager_haspage(). Now when VOP_BMAP() returns an error (eg.
EOPNOTSUPP), vnode_pager_haspage() returns TRUE without initializing 'before'
and 'after' arguments, so we have some accidental values there. This bascially
was causing this condition to be meet:
if ((rahead + rbehind) >
((cnt.v_free_count + cnt.v_cache_count) - cnt.v_free_reserved)) {
pagedaemon_wakeup();
[...]
}
(we have some random values in rahead and rbehind variables)
I'm not entirely sure this is the right fix, maybe we should just return FALSE
in vnode_pager_haspage() when VOP_BMAP() fails?
alc@ knows about this problem, maybe he will be able to come up with a better
fix if this is not the right one.
implementation, and mark it as deprecated. It will be removed entirely
in libarchive 3.0 (in FreeBSD 8.0?) but there's no reason for anyone to
use it instead of archive_read_data.
Approved by: kientzle
two values, the latter does not tend to have sign extension
and/or overflow bugs, and makes the code more obvious.
While I'm there, make use of a macro which is derived from
bin/ps/ps.c: ps_compat() to improve the readability of the
code.
Suggested by: bde
MFC after: 1 week
tcp_input() to syncache_socket() where it belongs and the majority
of it already happens.
The "tp->snd_up = tp->snd_una" is removed as it is done with the
tcp_sendseqinit() macro a few lines earlier.
read data from the standard input. This allows tail -f to pipe
data to lastcomm, and thereby real-time monitoring of executed
commands. The manual page includes the exact incantation.
MFC after: 2 weeks
and flags with an sxlock. This leads to a significant and measurable
performance improvement as a result of access to shared locking for
frequent lookup operations, reduced general overhead, and reduced overhead
in the event of contention. All of these are imported for threaded
applications where simultaneous access to a shared file descriptor array
occurs frequently. Kris has reported 2x-4x transaction rate improvements
on 8-core MySQL benchmarks; smaller improvements can be expected for many
workloads as a result of reduced overhead.
- Generally eliminate the distinction between "fast" and regular
acquisisition of the filedesc lock; the plan is that they will now all
be fast. Change all locking instances to either shared or exclusive
locks.
- Correct a bug (pointed out by kib) in fdfree() where previously msleep()
was called without the mutex held; sx_sleep() is now always called with
the sxlock held exclusively.
- Universally hold the struct file lock over changes to struct file,
rather than the filedesc lock or no lock. Always update the f_ops
field last. A further memory barrier is required here in the future
(discussed with jhb).
- Improve locking and reference management in linux_at(), which fails to
properly acquire vnode references before using vnode pointers. Annotate
improper use of vn_fullpath(), which will be replaced at a future date.
In fcntl(), we conservatively acquire an exclusive lock, even though in
some cases a shared lock may be sufficient, which should be revisited.
The dropping of the filedesc lock in fdgrowtable() is no longer required
as the sxlock can be held over the sleep operation; we should consider
removing that (pointed out by attilio).
Tested by: kris
Discussed with: jhb, kris, attilio, jeff