Commit Graph

76 Commits

Author SHA1 Message Date
Konstantin Belousov
140dedb81c The r241025 fixed the case when a binary, executed from nullfs mount,
was still possible to open for write from the lower filesystem.  There
is a symmetric situation where the binary could already has file
descriptors opened for write, but it can be executed from the nullfs
overlay.

Handle the issue by passing one v_writecount reference to the lower
vnode if nullfs vnode has non-zero v_writecount.  Note that only one
write reference can be donated, since nullfs only keeps one use
reference on the lower vnode.  Always use the lower vnode v_writecount
for the checks.

Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is
currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT
to manipulate the v_writecount value, which manages a single bypass
reference to the lower vnode.  Caling the VOPs instead of directly
accessing v_writecount provide the fix described in the previous
paragraph.

Tested by:	pho
MFC after:	3 weeks
2012-11-02 13:56:36 +00:00
Konstantin Belousov
3494f31ad2 Fix misuse of the kernel map in miscellaneous image activators.
Vnode-backed mappings cannot be put into the kernel map, since it is a
system map.

Use exec_map for transient mappings, and remove the mappings with
kmem_free_wakeup() to notify the waiters on available map space.

Do not map the whole executable into KVA at all to copy it out into
usermode.  Directly use vn_rdwr() for the case of not page aligned
binary.

There is one place left where the potentially unbounded amount of data
is mapped into exec_map, namely, in the COFF image activator
enumeration of the needed shared libraries.

Reviewed by:   alc
MFC after:     2 weeks
2012-02-17 23:47:16 +00:00
Ulrich Spörlein
9a14aa017b Convert files to UTF-8 2012-01-15 13:23:18 +00:00
Konstantin Belousov
eb46c93fa3 Corrections for the iBCS2 support that seems to regressed from 4.x times.
In particular:
- fix format specifiers in the DPRINTFs;
- do not use kernel_map for temporal mapping backed by the vnode, this
  cannot work since kernel map is a system map. Use exec_map instead.
- ignore error code from an attempt to insert the hole. If supposed hole
  is located at the region already populated by .bss, it is not an error.
- correctly translate vm error codes to errno, when appropriate.

Reported and tested by:	Rich Naill <rich enterprisesystems net>
Approved by:	re (kensmith)
MFC after:	1 week
2011-08-02 18:12:19 +00:00
Attilio Rao
0359a12ead Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread
was always curthread and totally unuseful.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
2008-08-28 15:23:18 +00:00
Jeff Roberson
6617724c5f Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to
FreeBSD, the M:N approach taken by the kse library was never developed
to its full potential.  Backwards compatibility will be provided via
libmap.conf for dynamically linked binaries and static binaries will
be broken.
2008-03-12 10:12:01 +00:00
Attilio Rao
22db15c06f VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
2008-01-13 14:44:15 +00:00
Attilio Rao
cb05b60a89 vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by:	Diego Sardina <siarodx at gmail dot com>,
		Andrea Di Pasquale <whyx dot it at gmail dot com>
2008-01-10 01:10:58 +00:00
Konstantin Belousov
89b57fcf01 Fix for the panic("vm_thread_new: kstack allocation failed") and
silent NULL pointer dereference in the i386 and sparc64 pmap_pinit()
when the kmem_alloc_nofault() failed to allocate address space. Both
functions now return error instead of panicing or dereferencing NULL.

As consequence, vmspace_exec() and vmspace_unshare() returns the errno
int. struct vmspace arg was added to vm_forkproc() to avoid dealing
with failed allocation when most of the fork1() job is already done.

The kernel stack for the thread is now set up in the thread_alloc(),
that itself may return NULL. Also, allocation of the first process
thread is performed in the fork1() to properly deal with stack
allocation failure. proc_linkup() is separated into proc_linkup()
called from fork1(), and proc_linkup0(), that is used to set up the
kernel process (was known as swapper).

In collaboration with:	Peter Holm
Reviewed by:	jhb
2007-11-05 11:36:16 +00:00
Konstantin Belousov
9e223287c0 Revert UF_OPENING workaround for CURRENT.
Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation
argument from being file descriptor index into the pointer to struct file.

Proposed and reviewed by:	jhb
Reviewed by:	daichi (unionfs)
Approved by:	re (kensmith)
2007-05-31 11:51:53 +00:00
John Baldwin
a46a6706c5 Retire the stackgap macros from ibcs2 as they are no longer used. Push
the includes of <sys/exec.h> and <sys/sysent.h> down into the only files
that now need them.
2006-07-10 17:59:26 +00:00
Jeff Roberson
7625cbf3cc - Pass the ISOPEN flag to namei so filesystems will know we're about to
open them or otherwise access the data.
2005-04-27 09:05:19 +00:00
John Baldwin
98df9218da - Change the vm_mmap() function to accept an objtype_t parameter specifying
the type of object represented by the handle argument.
- Allow vm_mmap() to map device memory via cdev objects in addition to
  vnodes and anonymous memory.  Note that mmaping a cdev directly does not
  currently perform any MAC checks like mapping a vnode does.
- Unbreak the DRM getbufs ioctl by having it call vm_mmap() directly on the
  cdev the ioctl is acting on rather than trying to find a suitable vnode
  to map from.

Reviewed by:	alc, arch@
2005-04-01 20:00:11 +00:00
Maxim Sobolev
610ecfe035 o Split out kernel part of execve(2) syscall into two parts: one that
copies arguments into the kernel space and one that operates
  completely in the kernel space;

o use kernel-only version of execve(2) to kill another stackgap in
  linuxlator/i386.

Obtained from:  DragonFlyBSD (partially)
MFC after:      2 weeks
2005-01-29 23:12:00 +00:00
Tim J. Robbins
7e1985fcd8 Replace the dangerous strcpy() call with strlcpy(), instead of the safe one
that was incorrectly changed in rev. 1.61.

Approved by:	re
2003-11-18 14:21:34 +00:00
Tim J. Robbins
a95edcba94 Fix a multitude of security bugs in the iBCS2 emulator:
- Return NULL instead of returning memory outside of the stackgap
  in stackgap_alloc() (FreeBSD-SA-00:42.linux)
- Check for stackgap_alloc() returning NULL in ibcs2_emul_find();
  other calls to stackgap_alloc() have not been changed since they
  are small fixed-size allocations.
- Replace use of strcpy() with strlcpy() in exec_coff_imgact()
  to avoid buffer overflow
- Use strlcat() instead of strcat() to avoid a one byte buffer
  overflow in ibcs2_setipdomainname()
- Use copyinstr() instead of copyin() in ibcs2_setipdomainname()
  to ensure that the string is null-terminated
- Avoid integer overflow in ibcs2_setgroups() and ibcs2_setgroups()
  by checking that gidsetsize argument is non-negative and
  no larger than NGROUPS_MAX.
- Range-check signal numbers in ibcs2_wait(), ibcs2_sigaction(),
  ibcs2_sigsys() and ibcs2_kill() to avoid accessing array past
  the end (or before the start)
2003-10-12 04:25:26 +00:00
Poul-Henning Kamp
a8d43c90af Add a "int fd" argument to VOP_OPEN() which in the future will
contain the filedescriptor number on opens from userland.

The index is used rather than a "struct file *" since it conveys a bit
more information, which may be useful to in particular fdescfs and /dev/fd/*

For now pass -1 all over the place.
2003-07-26 07:32:23 +00:00
David E. O'Brien
db31907c73 Use __FBSDID(). 2003-06-02 06:48:51 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Jake Burkholder
05ba50f522 Use the fields in the sysentvec and in the vm map header in place of the
constants VM_MIN_ADDRESS, VM_MAXUSER_ADDRESS, USRSTACK and PS_STRINGS.
This is mainly so that they can be variable even for the native abi, based
on different machine types.  Get stack protections from the sysentvec too.
This makes it trivial to map the stack non-executable for certain abis, on
machines that support it.
2002-09-21 22:07:17 +00:00
Robert Drehmel
4dfbdd7823 - Fix iBCS2 emulation of COFF files that have multiple libraries
in their library (STYP_LIB) section.
 - Attempt to make the code which calculates the next entry and
   string offsets look clearer.

PR:		kern/42580
Tested by:	Olaf Klein <ok@adimus.de> (on 4.7-PRERELEASE)
2002-09-09 15:51:10 +00:00
Jeff Roberson
619eb6e579 - Hold the vnode lock throughout execve.
- Set VV_TEXT in the top level execve code.
 - Fixup the image activators to deal with the newly locked vnode.
2002-08-13 06:55:28 +00:00
Jeff Roberson
e6e370a7fe - Replace v_flag with v_iflag and v_vflag
- v_vflag is protected by the vnode lock and is used when synchronization
   with VOP calls is needed.
 - v_iflag is protected by interlock and is used for dealing with vnode
   management issues.  These flags include X/O LOCK, FREE, DOOMED, etc.
 - All accesses to v_iflag and v_vflag have either been locked or marked with
   mp_fixme's.
 - Many ASSERT_VOP_LOCKED calls have been added where the locking was not
   clear.
 - Many functions in vfs_subr.c were restructured to provide for stronger
   locking.

Idea stolen from:	BSD/OS
2002-08-04 10:29:36 +00:00
Peter Wemm
3ebc124838 Infrastructure tweaks to allow having both an Elf32 and an Elf64 executable
handler in the kernel at the same time.  Also, allow for the
exec_new_vmspace() code to build a different sized vmspace depending on
the executable environment.  This is a big help for execing i386 binaries
on ia64.   The ELF exec code grows the ability to map partial pages when
there is a page size difference, eg: emulating 4K pages on 8K or 16K
hardware pages.

Flesh out the i386 emulation support for ia64.  At this point, the only
binary that I know of that fails is cvsup, because the cvsup runtime
tries to execute code in pages not marked executable.

Obtained from:  dfr (mostly, many tweaks from me).
2002-07-20 02:56:12 +00:00
Jens Schweikhardt
21dc7d4f57 Fix typo in the BSD copyright: s/withough/without/
Spotted and suggested by:	des
MFC after:	3 weeks
2002-06-02 20:05:59 +00:00
Alfred Perlstein
b63dc6ad47 Remove __P. 2002-03-20 05:48:58 +00:00
John Baldwin
a854ed9893 Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.
2002-02-27 18:32:23 +00:00
Julian Elischer
079b7badea Pre-KSE/M3 commit.
this is a low-functionality change that changes the kernel to access the main
thread of a process via the linked list of threads rather than
assuming that it is embedded in the process. It IS still embeded there
but remove all teh code that assumes that in preparation for the next commit
which will actually move it out.

Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
2002-02-07 20:58:47 +00:00
David E. O'Brien
6e551fb628 Update to C99, s/__FUNCTION__/__func__/,
also don't use ANSI string concatenation.
2001-12-10 08:09:49 +00:00
Julian Elischer
b40ce4165d KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
Mark Murray
fb919e4d5a Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by:	bde (with reservations)
2001-05-01 08:13:21 +00:00
John Baldwin
ba88dfc733 Back out proc locking to protect p_ucred for obtaining additional
references along with the actual obtaining of additional references.
2001-01-27 00:01:31 +00:00
John Baldwin
1070c735ff - Use 'p' instead of 'curproc' for the namei lookup as this is what
other image activators use.
- Proc locking.
2001-01-23 23:59:30 +00:00
Poul-Henning Kamp
2c9b67a8df Remove unneeded #include <vm/vm_zone.h>
Generated by:	src/tools/tools/kerninclude
2000-04-30 18:52:11 +00:00
Peter Wemm
041cd80735 Initial dependency so that the kld's will link. imgact_coff depends
on the ibcs2 module being present.
2000-04-29 13:32:16 +00:00
Eivind Eklund
762e6b856c Introduce NDFREE (and remove VOP_ABORTOP) 1999-12-15 23:02:35 +00:00
Eivind Eklund
2c582f0fc6 Fix case where vnode could be unlocked twice. Untested; bug found by code
reading.

Reviewed by:	phk
1999-11-14 17:33:40 +00:00
Poul-Henning Kamp
923502ff91 useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments)
of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>.  This puts
the #defines for the vm_inherit_t and vm_prot_t types next to their
typedefs.

This paves the road for the commit to follow shortly: change
useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE}
as argument.
1999-10-29 18:09:36 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
Peter Wemm
e37622b251 Fix a couple of warnings and some bitrot in comments. 1999-05-09 16:04:14 +00:00
Matthew Dillon
0a5e03dda5 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-28 01:59:53 +00:00
Peter Wemm
55b8081be1 Remove references to an LKM that isn't built any more. 1999-01-17 20:39:08 +00:00
Peter Wemm
aa855a598d *gulp*. Jordan specifically OK'ed this..
This is the bulk of the support for doing kld modules.  Two linker_sets
were replaced by SYSINIT()'s.  VFS's and exec handlers are self registered.
kld is now a superset of lkm.  I have converted most of them, they will
follow as a seperate commit as samples.
This all still works as a static a.out kernel using LKM's.
1998-10-16 03:55:01 +00:00
David Greenman
6cde7a165f Fixed two potentially serious classes of bugs:
1) The vnode pager wasn't properly tracking the file size due to
   "size" being page rounded in some cases and not in others.
   This sometimes resulted in corrupted files. First noticed by
   Terry Lambert.
   Fixed by changing the "size" pager_alloc parameter to be a 64bit
   byte value (as opposed to a 32bit page index) and changing the
   pagers and their callers to deal with this properly.
2) Fixed a bogus type cast in round_page() and trunc_page() that
   caused some 64bit offsets and sizes to be scrambled. Removing
   the cast required adding casts at a few dozen callers.
   There may be problems with other bogus casts in close-by
   macros. A quick check seemed to indicate that those were okay,
   however.
1998-10-13 08:24:45 +00:00
Bruce Evans
86a14a7a0a Use [u]intptr_t instead of [u_]long for casts between pointers and
integers.  Don't forget to cast to (void *) as well.
1998-08-16 01:21:52 +00:00
Eivind Eklund
d131a704ed De-staticize enough to make all the LKMs work again. Add comments where
deemed relevant.
1998-02-11 01:46:50 +00:00
Eivind Eklund
303b270b0a Staticize. 1998-02-09 06:11:36 +00:00
Bruce Evans
675ea6f083 Unspammed nested include of <vm/vm_zone.h>. 1997-12-27 02:56:39 +00:00
John Dyson
99448ed11d Change the M_NAMEI allocations to use the zone allocator. This change
plus the previous changes to use the zone allocator decrease the useage
of malloc by half.  The Zone allocator will be upgradeable to be able
to use per CPU-pools, and has more intelligent usage of SPLs.  Additionally,
it has reasonable stats gathering capabilities, while making most calls
inline.
1997-09-21 04:24:27 +00:00