vn_open_cred invocations shall not audit namei path.
In particular, specify VN_OPEN_NOAUDIT for dotdot lookup performed by
default implementation of vop_vptocnp, and for the open done for core
file. vn_fullpath is called from the audit code, and vn_open there need
to disable audit to avoid infinite recursion. Core file is created on
return to user mode, that, in particular, happens during syscall return.
The creation of the core file is audited by direct calls, and we do not
want to overwrite audit information for syscall.
Reported, reviewed and tested by: rwatson
The system hostname is now stored in prison0, and the global variable
"hostname" has been removed, as has the hostname_mtx mutex. Jails may
have their own host information, or they may inherit it from the
parent/system. The proper way to read the hostname is via
getcredhostname(), which will copy either the hostname associated with
the passed cred, or the system hostname if you pass NULL. The system
hostname can still be accessed directly (and without locking) at
prison0.pr_host, but that should be avoided where possible.
The "similar information" referred to is domainname, hostid, and
hostuuid, which have also become prison parameters and had their
associated global variables removed.
Approved by: bz (mentor)
about removing a few #ifdefs and providing compatibility wrappers and
VOP implementations to get and set an ACL; ZFS does ACL enforcement all
by itself.
Note that the VOPs are ifdefed out for now, so this change should be
a no-op.
Reviewed by: pjd
the VFS. Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.
In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.
While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.
VFS KPI is heavilly changed by this commit so thirdy parts modules needs
to be recompiled. Bump __FreeBSD_version in order to signal such
situation.
virtualized instances of hostname and domainname, as well as a new top-level
virtualization struct vimage, which holds pointers to struct vnet and struct
vprocg. Struct vprocg is likely to become replaced in the near future with
a new jail management API import.
As a consequence of this change, change struct ucred to point to a struct
vimage, instead of directly pointing to a vnet.
Merge vnet / vimage / ucred refcounting infrastructure from p4 / vimage
branch.
Permit kldload / kldunload operations to be executed only from the default
vimage context.
This change should have no functional impact on nooptions VIMAGE kernel
builds.
Reviewed by: bz
Approved by: julian (mentor)
interface as nmount(2). Three new system calls are added:
* jail_set, to create jails and change the parameters of existing jails.
This replaces jail(2).
* jail_get, to read the parameters of existing jails. This replaces the
security.jail.list sysctl.
* jail_remove to kill off a jail's processes and remove the jail.
Most jail parameters may now be changed after creation, and jails may be
set to exist without any attached processes. The current jail(2) system
call still exists, though it is now a stub to jail_set(2).
Approved by: bz (mentor)
vfsopt and the vfs_buildopts function public, and add some new fields
to struct vfsopt (pos and seen), and new functions vfs_getopt_pos and
vfs_opterror.
Further extend the interface to allow reading options from the kernel
in addition to sending them to the kernel, with vfs_setopt and related
functions.
While this allows the "name=value" option interface to be used for more
than just FS mounts (planned use is for jails), it retains the current
"vfsopt" name and <sys/mount.h> requirement.
Approved by: bz (mentor)
This bring huge amount of changes, I'll enumerate only user-visible changes:
- Delegated Administration
Allows regular users to perform ZFS operations, like file system
creation, snapshot creation, etc.
- L2ARC
Level 2 cache for ZFS - allows to use additional disks for cache.
Huge performance improvements mostly for random read of mostly
static content.
- slog
Allow to use additional disks for ZFS Intent Log to speed up
operations like fsync(2).
- vfs.zfs.super_owner
Allows regular users to perform privileged operations on files stored
on ZFS file systems owned by him. Very careful with this one.
- chflags(2)
Not all the flags are supported. This still needs work.
- ZFSBoot
Support to boot off of ZFS pool. Not finished, AFAIK.
Submitted by: dfr
- Snapshot properties
- New failure modes
Before if write requested failed, system paniced. Now one
can select from one of three failure modes:
- panic - panic on write error
- wait - wait for disk to reappear
- continue - serve read requests if possible, block write requests
- Refquota, refreservation properties
Just quota and reservation properties, but don't count space consumed
by children file systems, clones and snapshots.
- Sparse volumes
ZVOLs that don't reserve space in the pool.
- External attributes
Compatible with extattr(2).
- NFSv4-ACLs
Not sure about the status, might not be complete yet.
Submitted by: trasz
- Creation-time properties
- Regression tests for zpool(8) command.
Obtained from: OpenSolaris
to add more V* constants, and the variables changed by this patch were often
being assigned to mode_t variables, which is 16 bit.
Approved by: rwatson (mentor)
doesn't overflow in arc.c in this check:
if (kmem_used() > (kmem_size() * 4) / 5)
return (1);
With this bug ZFS almost doesn't cache.
Only 32bit machines are affected that have vm.kmem_size set to values >=1GB.
Reported by: David Taylor <davidt@yadt.co.uk>
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.
KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.
Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.
Manpage and FreeBSD_version will be updated through further commits.
As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.
Tested by: Diego Sardina <siarodx at gmail dot com>,
Andrea Di Pasquale <whyx dot it at gmail dot com>
some cases, move to priv_check() if it was an operation on a thread and
no other flags were present.
Eliminate caller-side jail exception checking (also now-unused); jail
privilege exception code now goes solely in kern_jail.c.
We can't yet eliminate suser() due to some cases in the KAME code where
a privilege check is performed and then used in many different deferred
paths. Do, however, move those prototypes to priv.h.
Reviewed by: csjp
Obtained from: TrustedBSD Project
implementing some of them using existing ones.
- Allow to compile ZFS on all archs and use atomic operations surrounded
by global mutex on archs we don't have or can't have all atomic
operations needed by ZFS.
sysctl_handle_int is not sizeof the int type you want to export.
The type must always be an int or an unsigned int.
Remove the instances where a sizeof(variable) is passed to stop
people accidently cut and pasting these examples.
In a few places this was sysctl_handle_int was being used on 64 bit
types, which would truncate the value to be exported. In these
cases use sysctl_handle_quad to export them and change the format
to Q so that sysctl(1) can still print them.
1. Pass locking flags to VFS_ROOT().
2. Check v_mountedhere while the vnode is locked.
3. Always return locked vnode on success.
Change 1 fixes problem reported by Stephen M. Rumble - after
zfs_vfsops.c,1.9 change, zfs_root() no longer locks the vnode
unconditionally and traverse() didn't pass right lock type to
VFS_ROOT(). The result was that kernel paniced when .zfs/ directory
was accessed via NFS.
Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation
argument from being file descriptor index into the pointer to struct file.
Proposed and reviewed by: jhb
Reviewed by: daichi (unionfs)
Approved by: re (kensmith)
- Move FreeBSD-specific code to zfs_freebsd_*() functions in zfs_vnops.c
and keep original functions as similar to vendor's code as possible.
- Add various includes back, now that we have them.
@118370 Correct typo.
@118371 Integrate changes from vendor.
@118491 Show backtrace on unexpected code paths.
@118494 Integrate changes from vendor.
@118504 Fix sendfile(2). I had two ways of fixing it:
1. Fixing sendfile(2) itself to use VOP_GETPAGES() instead of
hacking around with vn_rdwr(UIO_NOCOPY), which was suggested
by ups.
2. Modify ZFS behaviour to handle this special case.
Although 1 is more correct, I've choosen 2, because hack from 1
have a side-effect of beeing faster - it reads ahead MAXBSIZE
bytes instead of reading page by page. This is not easy to implement
with VOP_GETPAGES(), at least not for me in this very moment.
Reported by: Andrey V. Elsukov <bu7cher@yandex.ru>
@118525 Reorganize the code to reduce diff.
@118526 This code path is expected. It is simply when file is opened with
O_FSYNC flag.
Reported by: kris
Reported by: Michal Suszko <dry@dry.pl>
on a snapshot directory:
- Remove PRIV_VFS_MOUNT check - regular users can mount snapshots
via lookups on snapshot directory.
- Reset mount credential to kcred, so user won't be able to unmount
the snapshot.
- Reset owner uid.
- Unlock vnode in case of a failure.
Reported by: simokawa
This fixes stange panics when listing .zfs/snapshot/ directory for me.
Reported by: simokawa
Reported by: Johan Hendriks <Johan@double-l.nl>
- Hide cache_purge() under FREEBSD_NAMECACHE like in other files.
- Protect mnt_flag with mount interlock.
popular names. Hence:
- comment current index() and rindex() functions, as these serve the same
functionality as, respectively, strchr() and strrchr() from userland;
- add inlined version of strchr() and strrchr(), as we tend to use them more
often;
- remove str[r]chr() definitions from ZFS code;
Reviewed by: pjd
Approved by: cognet (mentor)
- Allow to shrink ARC down to 16MB (instead of 64MB).
- Set arc_max to 1/2 of kmem_map by default.
- Start freeing things earlier when low memory situation is detected.
- Serialize execution of arc_lowmem().
I decided to setup minimum ZFS memory requirements to 512MB of RAM and 256MB of
kmem_map size. If there is less RAM or kmem_map, a warning will be printed.
World is cruel, be no better. In other words: modern file system requires
modern hardware:)
From ZFS administration guide:
"Currently the minimum amount of memory recommended to install a Solaris
system is 512 Mbytes. However, for good ZFS performance, at least one
Gbyte or more of memory is recommended."
ZFS file system was ported from OpenSolaris operating system. The code in under
CDDL license.
I'd like to thank all SUN developers that created this great piece of software.
Supported by: Wheel LTD (http://www.wheel.pl/)
Supported by: The FreeBSD Foundation (http://www.freebsdfoundation.org/)
Supported by: Sentex (http://www.sentex.net/)