Commit Graph

385 Commits

Author SHA1 Message Date
marcel
c727378338 Fix off by one error introduced by the use of the ifnet_byindex()
macro. The commit log clearly states that the index given to the
macro is one higher than previously used to index the array. This
wasn't represented in the code and resulted in kernel page faults.

Reported by: Andrew Atrens <atrens@nortelnetworks.com>
2001-09-14 08:04:25 +00:00
julian
874cf1f6ad Fix typo.
noticed by: jhb
2001-09-13 22:02:48 +00:00
jhb
7029820884 Whitespace fix. 2001-09-12 22:16:18 +00:00
julian
5596676e6c KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
dillon
d73b3c59f0 This brings in a Yahoo coredump patch from Paul, with additional mods by
me (addition of vn_rdwr_inchunks).  The problem Yahoo is solving is that
if you have large process images core dumping, or you have a large number of
forked processes all core dumping at the same time, the original coredump code
would leave the vnode locked throughout.  This can cause the directory vnode
to get locked up, which can cause the parent directory vnode to get locked
up, and so on all the way to the root node, locking the entire machine up
for extremely long periods of time.

This patch solves the problem in two ways.  First it uses an advisory
non-blocking lock to abort multiple processes trying to core to the same
file.  Second (my contribution) it chunks up the writes and uses bwillwrite()
to avoid holding the vnode locked while blocking in the buffer cache.

Submitted by:	ps
Reviewed by:	dillon
MFC after:	2 weeks
2001-09-08 20:02:33 +00:00
marcel
df61d9eb64 Round of cleanups and enhancements. These include (in random order):
o  Introduce private types for use in linux syscalls for two reasons:
   1. establish type independence for ease in porting and,
   2. provide a visual queue as to which syscalls have proper
      prototypes to further cleanup the i386/alpha split.
   Linuxulator types are prefixed by 'l_'. void and char have not
   been "virtualized".

o  Provide dummy functions for all syscalls and remove dummy functions
   or implementations of truely obsolete syscalls.

o  Sanitize the shm*, sem* and msg* syscalls.

o  Make a first attempt to implement the linux_sysctl syscall. At this
   time it only returns one MIB (KERN_VERSION), but most importantly,
   it tells us when we need to add additional sysctls :-)

o  Bump the kenel version up to 2.4.2 (this is not the same as the
   KERN_VERSION MIB, BTW).

o  Implement new syscalls, of which most are specific to i386. Our
   syscall table is now up to date with Linux 2.4.2. Some highlights:
   -  Implement the 32-bit uid_t and gid_t bases syscalls.
   -  Implement a couple of 64-bit file size/offset bases syscalls.

o  Fix or improve numerous syscalls and prototypes.

o  Reduce style(9) violations while I'm here. Especially indentation
   inconsistencies within the same file are addressed. Re-indenting
   did not obfuscate actual changes to the extend that it could not
   be combined.

NOTE: I spend some time testing these changes and found that if there
      were regressions, they were not caused by these changes AFAICT.
      It was observed that installing a RH 7.1 runtime environment
      did make matters worse. Hangs and/or reboots have been observed
      with and without these changes, so when it failed to make life
      better in cases it doesn't look like it made it worse.
2001-09-08 19:07:04 +00:00
jlemon
f729fe0a4a Wrap array accesses in macros, which also happen to be lvalues:
ifnet_addrs[i - 1]  -> ifaddr_byindex(i)
        ifindex2ifnet[i]    -> ifnet_byindex(i)

This is intended to ease the conversion to SMPng.
2001-09-06 02:40:43 +00:00
dillon
0bdd59d16a Synchronize syscalls.master(s) with recent Giant pushdown work 2001-09-01 19:36:48 +00:00
marcel
f01626a3e5 Speculatively add this file. It's part of the Linuxulator update
to make it emulate Linux kernel version 2.4.2, which is required
in order to upgrade the linux_base port to RH 7.1.

Note that this file is only needed for 32-bit architectures. To
us this means i386 (for now?)
2001-09-01 18:11:45 +00:00
gallatin
a0087e29ad Fix linux_getcwd() so that if the cwd isn't cached (__getcwd() fails),
the cwd is looked up inside the kernel. The native getcwd() in libc
handles this in userland if __getcwd() fails.

Obtained from: NetBSD via OpenBSD
Tested by: Chris Casey <chriss@phys.ksu.edu>, Markus Holmberg <markush@acc.umu.se>
Reviewed by: Darrell Anderson <anderson@cs.duke.edu>
PR: kern/24315
2001-08-29 19:05:27 +00:00
pirzyk
0cd2751bca Added the linux_sysinfo function to implement sysinfo(2).
PR:		kern/27759
Reviewed by:	marcel
Approved by:	marcel
MFC after:	1 week
2001-07-23 06:22:10 +00:00
assar
6833b4cd91 get rid of some printf and pointer type warnings 2001-07-22 00:12:22 +00:00
rwatson
da1a848c61 o Replace calls to p_can(..., P_CAN_xxx) with calls to p_canxxx().
The p_can(...) construct was a premature (and, it turns out,
  awkward) abstraction.  The individual calls to p_canxxx() better
  reflect differences between the inter-process authorization checks,
  such as differing checks based on the type of signal.  This has
  a side effect of improving code readability.
o Replace direct credential authorization checks in ktrace() with
  invocation of p_candebug(), while maintaining the special case
  check of KTR_ROOT.  This allows ktrace() to "play more nicely"
  with new mandatory access control schemes, as well as making its
  authorization checks consistent with other "debugging class"
  checks.
o Eliminate "privused" construct for p_can*() calls which allowed the
  caller to determine if privilege was required for successful
  evaluation of the access control check.  This primitive is currently
  unused, and as such, serves only to complicate the API.

Approved by:	({procfs,linprocfs} changes) des
Obtained from:	TrustedBSD Project
2001-07-05 17:10:46 +00:00
peter
e6d5c70da1 Bah, back out part of previous commit. I got too carried away.
linux_debug_map[] is referred to from elsewhere.
2001-06-15 08:18:24 +00:00
peter
7908b13fc3 Fix warnings:
235: warning: unsigned int format, pointer arg (arg 3)
621: warning: cast discards qualifiers from pointer target type
2001-06-15 07:50:54 +00:00
peter
ded47f0635 Fix warning:
239: warning: no previous prototype for `linux_debug'
2001-06-15 07:48:21 +00:00
peter
581435f4bd Fix warning:
413: warning: long unsigned int format, vm_offset_t arg (arg 2)
2001-06-15 07:46:18 +00:00
peter
f10fa038c1 With this commit, I hereby pronounce gensetdefs past its use-by date.
Replace the a.out emulation of 'struct linker_set' with something
a little more flexible.  <sys/linker_set.h> now provides macros for
accessing elements and completely hides the implementation.

The linker_set.h macros have been on the back burner in various
forms since 1998 and has ideas and code from Mike Smith (SET_FOREACH()),
John Polstra (ELF clue) and myself (cleaned up API and the conversion
of the rest of the kernel to use it).

The macros declare a strongly typed set.  They return elements with the
type that you declare the set with, rather than a generic void *.

For ELF, we use the magic ld symbols (__start_<setname> and
__stop_<setname>).  Thanks to Richard Henderson <rth@redhat.com> for the
trick about how to force ld to provide them for kld's.

For a.out, we use the old linker_set struct.

NOTE: the item lists are no longer null terminated.  This is why
the code impact is high in certain areas.

The runtime linker has a new method to find the linker set
boundaries depending on which backend format is in use.

linker sets are still module/kld unfriendly and should never be used
for anything that may be modular one day.

Reviewed by:	eivind
2001-06-13 10:58:39 +00:00
des
eb8c70e07b Say one thing, do the other... nextpid -> lastpid 2001-06-11 23:00:35 +00:00
des
45a75d4587 Implement proc/cpuinfo for the Alpha (thanks to gallatin).
Implement proc/pid/cmdline.
2001-06-11 21:55:40 +00:00
des
b1d6b7887d Minor whitespace changes. 2001-06-11 00:17:59 +00:00
des
7183d49acf These aren't needed any more. 2001-06-10 23:24:14 +00:00
des
2a797ca9f1 New pseudofs-based linprocfs (repo-copied from linprocfs_misc.c). 2001-06-10 23:23:59 +00:00
paul
9471850c14 S_IFCHR is not a bit mask, it's just a value in a field. The correct
way to clear that field is to use S_IFMT.

Pointed out by BDE.
2001-06-04 03:39:14 +00:00
ru
e7a85be33f Remove vestiges of MFS. 2001-06-01 10:07:28 +00:00
phk
9dfeaf738e Remove MFS 2001-05-29 20:39:47 +00:00
ru
05f3be90b2 - sys/n[tw]fs moved to sys/fs/n[tw]fs
- /usr/include/n[tw]fs moved to /usr/include/fs/n[tw]fs
2001-05-26 11:57:45 +00:00
rwatson
f504530d9f o Merge contents of struct pcred into struct ucred. Specifically, add the
real uid, saved uid, real gid, and saved gid to ucred, as well as the
  pcred->pc_uidinfo, which was associated with the real uid, only rename
  it to cr_ruidinfo so as not to conflict with cr_uidinfo, which
  corresponds to the effective uid.
o Remove p_cred from struct proc; add p_ucred to struct proc, replacing
  original macro that pointed.
  p->p_ucred to p->p_cred->pc_ucred.
o Universally update code so that it makes use of ucred instead of pcred,
  p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo,
  cr_{r,sv}{u,g}id instead of p_*, etc.
o Remove pcred0 and its initialization from init_main.c; initialize
  cr_ruidinfo there.
o Restruction many credential modification chunks to always crdup while
  we figure out locking and optimizations; generally speaking, this
  means moving to a structure like this:
        newcred = crdup(oldcred);
        ...
        p->p_ucred = newcred;
        crfree(oldcred);
  It's not race-free, but better than nothing.  There are also races
  in sys_process.c, all inter-process authorization, fork, exec, and
  exit.
o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid;
  remove comments indicating that the old arrangement was a problem.
o Restructure exec1() a little to use newcred/oldcred arrangement, and
  use improved uid management primitives.
o Clean up exit1() so as to do less work in credential cleanup due to
  pcred removal.
o Clean up fork1() so as to do less work in credential cleanup and
  allocation.
o Clean up ktrcanset() to take into account changes, and move to using
  suser_xxx() instead of performing a direct uid==0 comparision.
o Improve commenting in various kern_prot.c credential modification
  calls to better document current behavior.  In a couple of places,
  current behavior is a little questionable and we need to check
  POSIX.1 to make sure it's "right".  More commenting work still
  remains to be done.
o Update credential management calls, such as crfree(), to take into
  account new ruidinfo reference.
o Modify or add the following uid and gid helper routines:
      change_euid()
      change_egid()
      change_ruid()
      change_rgid()
      change_svuid()
      change_svgid()
  In each case, the call now acts on a credential not a process, and as
  such no longer requires more complicated process locking/etc.  They
  now assume the caller will do any necessary allocation of an
  exclusive credential reference.  Each is commented to document its
  reference requirements.
o CANSIGIO() is simplified to require only credentials, not processes
  and pcreds.
o Remove lots of (p_pcred==NULL) checks.
o Add an XXX to authorization code in nfs_lock.c, since it's
  questionable, and needs to be considered carefully.
o Simplify posix4 authorization code to require only credentials, not
  processes and pcreds.  Note that this authorization, as well as
  CANSIGIO(), needs to be updated to use the p_cansignal() and
  p_cansched() centralized authorization routines, as they currently
  do not take into account some desirable restrictions that are handled
  by the centralized routines, as well as being inconsistent with other
  similar authorization instances.
o Update libkvm to take these changes into account.

Obtained from:	TrustedBSD Project
Reviewed by:	green, bde, jhb, freebsd-arch, freebsd-audit
2001-05-25 16:59:11 +00:00
jhb
d47e07ca44 Sort includes. 2001-05-21 18:52:02 +00:00
jlemon
45bfbf2bf8 Add new 'loadavg' entry, fix overflow with meminfo.
PR: 27253, 27350
Submitted by: Jim Pirzyk
2001-05-19 05:54:26 +00:00
alfred
a3f0842419 Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level
vm operations.

faults can not be taken without holding Giant.

Memory subsystems can now call the base page allocators safely.

Almost all atomic ops were removed as they are covered under the
vm mutex.

Alpha and ia64 now need to catch up to i386's trap handlers.

FFS and NFS have been tested, other filesystems will need minor
changes (grabbing the vm lock when twiddling page properties).

Reviewed (partially) by: jake, jhb
2001-05-19 01:28:09 +00:00
des
a7cc1aa05f Avoid overflow when converting ticks to jiffies.
PR:		27215
Submitted by:	Jim Pirzyk <Jim.Pirzyk@disney.com>
2001-05-09 11:41:54 +00:00
jlemon
7a74be42b6 Fix the problem of some directory entries going missing when
read by the linux version of 'ls'.

Spotted by: rwatson
2001-05-04 05:19:22 +00:00
markm
bcca5847d5 Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by:	bde (with reservations)
2001-05-01 08:13:21 +00:00
phk
608c1caf3b Add a vop_stdbmap(), and make it part of the default vop vector.
Make 7 filesystems which don't really know about VOP_BMAP rely
on the default vector, rather than more or less complete local
vop_nopbmap() implementations.
2001-04-29 11:48:41 +00:00
paul
457c353d3c A bogus check for a char device also matched symbolic links.
Replace it with a correct check using S_ISCHR()

Symbolic links will now work again in linux compatibility.
2001-04-25 22:07:16 +00:00
rwatson
5bf1eebdef o Change a suser() call to a suser_xxx(..., PRISON_ROOT) call in the
linuxulator so as to allow privileged processes within a jail() to
  invoke the Linux initgroups() system call.  This allows the Linux
  "su" to work properly (better) when running a complete Linux
  environment under jail().  This problem was reported by Attila
  Nagy <bra@fsn.hu>.

Reviewed by:	marcel
2001-04-24 19:08:53 +00:00
jhb
9c03a8ae91 Change the pfind() and zpfind() functions to lock the process that they
find before releasing the allproc lock and returning.

Reviewed by:	-smp, dfr, jake
2001-04-24 00:51:53 +00:00
alc
0aa5742239 Add linux_sched_get_priority_max() and linux_sched_get_priority_min(): The
policy parameter requires translation.
2001-04-01 06:37:40 +00:00
jhb
074862548d Add missing includes of <sys/sx.h>
Reported by:	peter
2001-03-28 15:04:22 +00:00
jhb
79cf991a6b Convert the allproc and proctree locks from lockmgr locks to sx locks. 2001-03-28 11:52:56 +00:00
gallatin
24f4ef1037 fix linux_times() to take into account linux's value of CLK_TCK on the alpha.
Previously, results were off by a factor of 10

Tested by: Yoriaki FUJIMORI <fujimori@grafin.fujimori.cache.waseda.ac.jp>
2001-03-23 19:22:21 +00:00
jlemon
9060ef19e9 Eliminate global node types and instead use an operations vector for
each node in order to make it easier to add new entries.

Rewrite the internal directory structure so that it is possible to
have independent subdirectories.  Utilize this to add /proc/net/dev.

Reviewed by:  DES
2001-03-12 03:16:56 +00:00
jhb
9cd254601b Grab the process lock while calling psignal and before calling psignal. 2001-03-07 03:37:06 +00:00
jhb
6b93e2c5c0 Just hold the proc lock while getting the parent's PID rather than a
proctree lock.
2001-03-07 03:21:26 +00:00
jhb
e2c9c435ff - Hold both an exclusive proctree lock and the proc lock when reparenting
a traced process during exit.
- Lock the parent process while sending it SIGCHLD.
2001-03-07 02:17:43 +00:00
jlemon
0e6ea63318 Only pick up so_error the first time through with EISCONN, as advertised.
The sense of the test was reversed, so we were returning EISCONN, then 0.

Pointed out and tested by: Martin Blapp <mb@imp.ch>
2001-03-02 19:29:53 +00:00
jlemon
0bdef26329 Correctly emulate linux_connect. For nonblocking sockets, the behavior
is to return EINPROGRESS, EALREADY, (so_error ONCE), EISCONN.  Certain
linux applications rely on the so_error (normally 0) being returned in
order to operate properly.

Tested by:	Thomas Moestl <tmoestl@gmx.net>
2001-03-01 21:44:40 +00:00
adrian
4018955334 Reviewed by: jlemon
An initial tidyup of the mount() syscall and VFS mount code.

This code replaces the earlier work done by jlemon in an attempt to
make linux_mount() work.

* the guts of the mount work has been moved into vfs_mount().

* move `type', `path' and `flags' from being userland variables into being
  kernel variables in vfs_mount(). `data' remains a pointer into
  userspace.

* Attempt to verify the `type' and `path' strings passed to vfs_mount()
  aren't too long.

* rework mount() and linux_mount() to take the userland parameters
  (besides data, as mentioned) and pass kernel variables to vfs_mount().
  (linux_mount() already did this, I've just tidied it up a little more.)

* remove the copyin*() stuff for `path'. `data' still requires copyin*()
  since its a pointer into userland.

* set `mount->mnt_statf_mntonname' in vfs_mount() rather than in each
  filesystem.  This variable is generally initialised with `path', and
  each filesystem can override it if they want to.

* NOTE: f_mntonname is intiailised with "/" in the case of a root mount.
2001-03-01 21:00:17 +00:00
obrien
16ff3c2b54 MFS: bring the consistent `compat_3_brand' support into -CURRENT
(the work was first done in the RELENG_4 branch near a release
	 during a MFC to make the code cleaner and more consistent)
2001-02-24 22:20:11 +00:00