take a const 'name', since they dont modify anything.
159: warning: passing arg 1 of `getenv_int' discards qualifiers...
167: warning: passing arg 1 of `getenv' discards qualifiers from pointer..
Replace the a.out emulation of 'struct linker_set' with something
a little more flexible. <sys/linker_set.h> now provides macros for
accessing elements and completely hides the implementation.
The linker_set.h macros have been on the back burner in various
forms since 1998 and has ideas and code from Mike Smith (SET_FOREACH()),
John Polstra (ELF clue) and myself (cleaned up API and the conversion
of the rest of the kernel to use it).
The macros declare a strongly typed set. They return elements with the
type that you declare the set with, rather than a generic void *.
For ELF, we use the magic ld symbols (__start_<setname> and
__stop_<setname>). Thanks to Richard Henderson <rth@redhat.com> for the
trick about how to force ld to provide them for kld's.
For a.out, we use the old linker_set struct.
NOTE: the item lists are no longer null terminated. This is why
the code impact is high in certain areas.
The runtime linker has a new method to find the linker set
boundaries depending on which backend format is in use.
linker sets are still module/kld unfriendly and should never be used
for anything that may be modular one day.
Reviewed by: eivind
- Replace some very poorly thought out API hacks that should have been
fixed a long while ago.
- Provide some much more flexible search functions (resource_find_*())
- Use strings for storage instead of an outgrowth of the rather
inconvenient temporary ioconf table from config(). We already had a
fallback to using strings before malloc/vm was running anyway.
This work was based on kame-20010528-freebsd43-snap.tgz and some
critical problem after the snap was out were fixed.
There are many many changes since last KAME merge.
TODO:
- The definitions of SADB_* in sys/net/pfkeyv2.h are still different
from RFC2407/IANA assignment because of binary compatibility
issue. It should be fixed under 5-CURRENT.
- ip6po_m member of struct ip6_pktopts is no longer used. But, it
is still there because of binary compatibility issue. It should
be removed under 5-CURRENT.
Reviewed by: itojun
Obtained from: KAME
MFC after: 3 weeks
around, use a common function for looking up and extracting the tunables
from the kernel environment. This saves duplicating the same function
over and over again. This way typically has an overhead of 8 bytes + the
path string, versus about 26 bytes + the path string.
compliant. All the variable definitions and function names are
reasonably consistent, and the functions which should be static (i.e.,
all of them) are. Other assorted fixes were made. The majority of
the delta is indentation fixes.
Partially reviewed by: bde
certain cases, and a close() by another process could potentially rip the
pipe out from under the (blocked) locking operation.
Reported-by: Alexander Viro <viro@math.psu.edu>
- move the sysctl code to kern_intr.c
- do not use INTRCNT_COUNT, but rather eintrcnt - intrcnt to determine
the length of the intrcnt array
- move the declarations of intrnames, eintrnames, intrcnt and eintrcnt
from machine-dependent include files to sys/interrupt.h
- remove the hw.nintr sysctl, it is not needed.
- fix various style bugs
Requested by: bde
Reviewed by: bde (some time ago)
* all members of msginfo from sysv_msg.c;
* msqids from sysv_msg.c;
* sema from sysv_sem.c; and
* shmsegs from sysv_shm.c;
These will be used by ipcs(1) in non-kvm mode.
Reviewed by: tmm
attempting to remove nonexistant exports with MNT_DELEXPORT returns
an error; before this change it always succeeded. This caused
mountd(8) to log "can't delete exports for /whatever" warnings.
Change the error code from EINVAL to a more specific ENOENT, and
make mountd ignore this error when deleting the export list. I
could have just restored the previous behaviour of returning success,
but I think an error return is a useful diagnostic.
Reviewed by: phk
it becomes possible to trap in ptsstop() in kern/tty_pty.c
if the slave side has never been opened during the life of a kernel.
What happens is that calls to ttyflush() done from ptyioctl() for the
controlling side end up calling ptsstop() [via (*tp->t_stop)(tp, <X>)]
which evaluates the following:
struct pt_ioctl *pti = tp->t_dev->si_drv1;
In order for tp->t_dev to be set, the slave device must first be
opened in ttyopen() [kern/tty.c].
It appears that the only problem is calls to (*tp->t_stop)(tp, <n>),
so this could also happen with other ioctls initiated by the
controlling side before the slave has been opened.
PR: 27698
Submitted by: David Bein bein@netapp.com
MFC after: 6 days
the saved uid and gid during execve(). Unfortunately, the optimizations
were incorrect in the case where the credential was updated, skipping
the setting of the saved uid and gid when new credentials were generated.
This change corrects that problem by handling the newcred!=NULL case
correctly.
Reported/tested by: David Malone <dwmalone@maths.tcd.ie>
Obtained from: TrustedBSD Project
dev_t. The dev_depends(dev_t, dev_t) function is for tying them
to each other.
When destroy_dev() is called on a dev_t, all dev_t's depending
on it will also be destroyed (depth first order).
Rewrite the make_dev_alias() to use this dependency facility.
kern/subr_disk.c:
Make the disk mini-layer use dependencies to make sure all
relevant dev_t's are removed when the disk disappears.
Make the disk mini-layer precreate some magic sub devices
which the disk/slice/label code expects to be there.
kern/subr_disklabel.c:
Remove some now unneeded variables.
kern/subr_diskmbr.c:
Remove some ancient, commented out code.
kern/subr_diskslice.c:
Minor cleanup. Use name from dev_t instead of dsname()
real uid, saved uid, real gid, and saved gid to ucred, as well as the
pcred->pc_uidinfo, which was associated with the real uid, only rename
it to cr_ruidinfo so as not to conflict with cr_uidinfo, which
corresponds to the effective uid.
o Remove p_cred from struct proc; add p_ucred to struct proc, replacing
original macro that pointed.
p->p_ucred to p->p_cred->pc_ucred.
o Universally update code so that it makes use of ucred instead of pcred,
p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo,
cr_{r,sv}{u,g}id instead of p_*, etc.
o Remove pcred0 and its initialization from init_main.c; initialize
cr_ruidinfo there.
o Restruction many credential modification chunks to always crdup while
we figure out locking and optimizations; generally speaking, this
means moving to a structure like this:
newcred = crdup(oldcred);
...
p->p_ucred = newcred;
crfree(oldcred);
It's not race-free, but better than nothing. There are also races
in sys_process.c, all inter-process authorization, fork, exec, and
exit.
o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid;
remove comments indicating that the old arrangement was a problem.
o Restructure exec1() a little to use newcred/oldcred arrangement, and
use improved uid management primitives.
o Clean up exit1() so as to do less work in credential cleanup due to
pcred removal.
o Clean up fork1() so as to do less work in credential cleanup and
allocation.
o Clean up ktrcanset() to take into account changes, and move to using
suser_xxx() instead of performing a direct uid==0 comparision.
o Improve commenting in various kern_prot.c credential modification
calls to better document current behavior. In a couple of places,
current behavior is a little questionable and we need to check
POSIX.1 to make sure it's "right". More commenting work still
remains to be done.
o Update credential management calls, such as crfree(), to take into
account new ruidinfo reference.
o Modify or add the following uid and gid helper routines:
change_euid()
change_egid()
change_ruid()
change_rgid()
change_svuid()
change_svgid()
In each case, the call now acts on a credential not a process, and as
such no longer requires more complicated process locking/etc. They
now assume the caller will do any necessary allocation of an
exclusive credential reference. Each is commented to document its
reference requirements.
o CANSIGIO() is simplified to require only credentials, not processes
and pcreds.
o Remove lots of (p_pcred==NULL) checks.
o Add an XXX to authorization code in nfs_lock.c, since it's
questionable, and needs to be considered carefully.
o Simplify posix4 authorization code to require only credentials, not
processes and pcreds. Note that this authorization, as well as
CANSIGIO(), needs to be updated to use the p_cansignal() and
p_cansched() centralized authorization routines, as they currently
do not take into account some desirable restrictions that are handled
by the centralized routines, as well as being inconsistent with other
similar authorization instances.
o Update libkvm to take these changes into account.
Obtained from: TrustedBSD Project
Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit
Tor created a while ago, removes the raw I/O piece (that has cache coherency
problems), and adds a buffer cache / VM freeing piece.
Essentially this patch causes O_DIRECT I/O to not be left in the cache, but
does not prevent it from going through the cache, hence the 80%. For
the last 20% we need a method by which the I/O can be issued directly to
buffer supplied by the user process and bypass the buffer cache entirely,
but still maintain cache coherency.
I also have the code working under -stable but the changes made to sys/file.h
may not be MFCable, so an MFC is not on the table yet.
Submitted by: tegge, dillon
- Always call vfs_setdirty() with vm_mtx held.
- Fix an old comment: vm_hold_unload_pages is called vm_hold_free_pages()
nowadays.
- Always call vm_hold_free_pages() w/o vm_mtx held.
- Don't bother releasing Giant while doing a lookup on the vm_map of
initproc while starting up init. We have to grab it again right after
the lookup anyways.
at the top of the minute, whichever comes first. It seems
logtimeout() is only called once after the kernel log is opened
and then never again after that. So I guess syslogd only gets
kernel log messages by virtue of syncer(4)'s flushes ...?
PR: 27361
Submitted by: pkern@utcc.utoronto.ca
MFC after: 1 week
systems were repo-copied from sys/miscfs to sys/fs.
- Renamed the following file systems and their modules:
fdesc -> fdescfs, portal -> portalfs, union -> unionfs.
- Renamed corresponding kernel options:
FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS.
- Install header files for the above file systems.
- Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland
Makefiles.
needs instead of relying on idiosyncratic hacks in the tty subsystem.
Also add module code since this can now be compiled as a module.
Silence by: -hackers, -audit
simpler for npx exceptions that start as traps (no assembly required...)
and works better for npx exceptions that start as interrupts (there is
no longer a problem for nested interrupts).
Submitted by: original (pre-SMPng) version by luoqi
shm_deallocate_segment because shmexit_myhook calls it, and the latter
should always be called with it already held.
Submitted by: dwmalone, dd
Approved by: alfred
- Don't release the vm mutex early in pipespace() but instead hold it
across vm_object_deallocate() if vm_map_find() returns an error and
across pipe_free_kmem() if vm_map_find() succeeds.
- Add a XXX above a zfree() since zalloc already has its own locking,
one would hope that zfree() wouldn't need the vm lock.
vm_mtx does not recurse and is required for most low level
vm operations.
faults can not be taken without holding Giant.
Memory subsystems can now call the base page allocators safely.
Almost all atomic ops were removed as they are covered under the
vm mutex.
Alpha and ia64 now need to catch up to i386's trap handlers.
FFS and NFS have been tested, other filesystems will need minor
changes (grabbing the vm lock when twiddling page properties).
Reviewed (partially) by: jake, jhb
lock. Since we won't actually block on a try lock operation, it's not
a problem. Add a comment explaining why it is safe to skip lock order
checking with try locks.
- Remove the ithread list lock spin lock from the order list.
sleep locks.
- Delay returning from ithread_remove_handler() until we are certain that
the interrupt handler being removed has in fact been removed from the
ithread.
- XXX: There is still a problem in that nothing protects the kernel from
adding a new handler while the ithread is running, though with our
current architectures this is not a problem.
Requested by: gibbs (2)
- Attach a writable sysctl to bootverbose (debug.bootverbose) so it can be
toggled after boot.
- Move the printf of the version string to a SI_SUB_COPYRIGHT SYSINIT just
afer the display of the copyright message instead of doing it by hand in
three MD places.
follows: the effective uid of p1 (subject) must equal the real, saved,
and effective uids of p2 (object), p2 must not have undergone a
credential downgrade. A subject with appropriate privilege may override
these protections.
In the future, we will extend these checks to require that p1 effective
group membership must be a superset of p2 effective group membership.
Obtained from: TrustedBSD Project
Remove comment about setting error for reads on EOF, read returns 0 on
EOF so the code should be ok.
Remove non-effective priority boost, PRIO+1 doesn't do anything
(according to McKusick), if a real priority boost is needed it should
have been +4.
Style fixes:
.) return foo -> return (foo)
.) FLAG1|FlAG2 -> FLAG1 | FlAG2
.) wrap long lines
.) unwrap short lines
.) for(i=0;i=foo;i++) -> for (i = 0; i=foo; i++)
.) remove braces for some conditionals with a single statement
.) fix continuation lines.
md5 couldn't verify the binary because some code had to
be shuffled around to address the style issues.
the number of references on the filesystem root vnode to be both
expected and released. Many filesystems hold an extra reference on
the filesystem root vnode, which must be accounted for when
determining if the filesystem is busy and then released if it isn't
busy. The old `skipvp' approach required individual filesystem
xxx_unmount functions to re-implement much of vflush()'s logic to
deal with the root vnode.
All 9 filesystems that hold an extra reference on the root vnode
got the logic wrong in the case of forced unmounts, so `umount -f'
would always fail if there were any extra root vnode references.
Fix this issue centrally in vflush(), now that we can.
This commit also fixes a vnode reference leak in devfs, which could
result in idle devfs filesystems that refuse to unmount.
Reviewed by: phk, bp
- Require the proc lock be held for killproc() to allow for the vmdaemon to
kill a process when memory is exhausted while holding the lock of the
process to kill.
When people access /dev/tty, locate their controlling tty and return
the dev_t of it to them. This basically makes /dev/tty act like
a variant symlink sort of thing which is much simpler than all the
mucking about with vnodes.
- Since polling should not involve sleeping, keep holding a
process lock upon scanning file descriptors.
- Hold a reference to every file descriptor prior to entering
polling loop in order to avoid lock order reversal between
lockmgr and p_mtx upon calling fdrop() in fo_poll().
(NOTE: this work has not been done for netncp and netsmb
yet because a socket itself has no reference counts.)
Reviewed by: jhb
KASSERT when vp->v_usecount is zero or negative. In this case, the
"v*: negative ref cnt" panic that follows is much more appropriate.
Reviewed by: mckusick
fail due to witness exhausting its internal resources and shutting down.
Reported by: Szilveszter Adam <sziszi@petra.hos.u-szeged.hu>
Tested by: David Wolfskill <david@catwhisker.org>
The pipe code could not handle running out of kva, it would panic
if that happened. Instead return ENFILE to the application which
is an acceptable error return from pipe(2).
There was some slightly tricky things that needed to be worked on,
namely that the pipe code can 'realloc' the size of the buffer if
it detects that the pipe could use a bit more room. However if it
failed the reallocation it could not cope and would panic. Fix
this by attempting to grow the pipe while holding onto our old
resources. If all goes well free the old resources and use the
new ones, otherwise continue to use the smaller buffer already
allocated.
While I'm here add a few blank lines for style(9) and remove
'register'.