kern_prot, which cleans up some namespace issues
o Don't need a special handler to limit un-setting, as suser is used to
protect suser_permitted, making it one-way by definition.
Suggested by: bde
returning anything but EPERM.
o suser is enabled by default; once disabled, cannot be reenabled
o To be used in alternative security models where uid0 does not connote
additional privileges
o Should be noted that uid0 still has some additional powers as it
owns many important files and executables, so suffers from the same
fundamental security flaws as securelevels. This is fixed with
MAC integrity protection code (in progress)
o Not safe for consumption unless you are *really* sure you don't want
things like shutdown to work, et al :-)
Obtained from: TrustedBSD Project
TCP/IP (v4) sockets, and routing sockets. Previously, interaction
with IPv6 was not well-defined, and might be inappropriate for some
environments. Similarly, sysctl MIB entries providing interface
information also give out only addresses from those protocol domains.
For the time being, this functionality is enabled by default, and
toggleable using the sysctl variable jail.socket_unixiproute_only.
In the future, protocol domains will be able to determine whether or
not they are ``jail aware''.
o Further limitations on process use of getpriority() and setpriority()
by jailed processes. Addresses problem described in kern/17878.
Reviewed by: phk, jmg
Symbol values are now represented using array sizes (4 arrays per symbol
so that 16-bit machines can represent 64-bit values) instead of being raw
binary values.
Reviewed by: marcel
Further experimentation showed that some Dell 2450 machines with the
prevention kludge installed still got T_RESERVED traps. CPU interrupt
vector 0x7A was observed to be triggered. This might have been the
bitwise OR of two different vectors sent from each of the IOAPICs at
the same time.
IOAPIC #0: 0x68 --> irq 8: RTC timer interrupt
IOAPIC #1: 0x32 --> irq 18: scsi host adapter or network interface
----
0x7a --> T_RESERVED
Both IOAPICs had ID 0.
Appendix B.3 in the MP spec indicates that the operating system is
responsible for assigning unique IDs to the IOAPICs.
The enclosed patch programs the IOAPIC IDs according to the IOAPIC
entries in the MP table.
Submitted by: tegge
and sysv shared memory support for it. It implements a new
PG_UNMANAGED flag that has slightly different characteristics
from PG_FICTICIOUS.
A new sysctl, kern.ipc.shm_use_phys has been added to enable the
use of physically-backed sysv shared memory rather then swap-backed.
Physically backed shm segments are not tracked with PV entries,
allowing programs which use a large shm segment as a rendezvous
point to operate without eating an insane amount of KVM in the
PV entry management. Read: Oracle.
Peter's OBJT_PHYS object will also allow us to eventually implement
page-table sharing and/or 4MB physical page support for such segments.
We're half way there.
buzy, only search upwards for a free slot to use..
This broke unit numbering on ATA systems where PCI attached controllers
come before the mainboard ones...
Reviewed by: dfr
This function will probably rewritten/renamed to devpp.
Submitted by: Assar Westerlund <assar@sics.se> on -current
Confirmed to work: Steinar Haug <sthaug@nethelp.no>,
Manfred Antar <mantar@pacbell.net>
Reviewed by: phk
this in awk using the hack of counting args of type off_t twice and args
of all other types once. This is too simple to work. It gave benignly
wrong results on alphas (off_t shouldn't be counted twice) and for
svr4_sys_mmap64() on i386's (off64_t should be counted twice). It gave
fatally wrong results for i386's with 64-bit longs (longs should be
counted twice). The correct value for sy_nargs is easier to determine
from the size of the args struct anyway, except for complications to
make the generated code almost readable.
Improved formatting of sysent tables by lining up the comments where
possible.
type. This gave an inconsistent amount of crufty padding on i386's with
64-bit longs (8 bytes instead of 4). On alphas it gives a consistent
amount of crufty padding (8 bytes) in addition to the 4 bytes of normal
padding caused by passing int args as register_t's.
Fixed the args struct tag for the NOPROTO syscalls (netbsd_lchown() and
netbsd_msync()). The tag is currently unused for NOPROTO syscalls, so
the bug has no effect, but it will be used even in the NOPROTO case to
calculate sy_nargs correctly.
<sys/bio.h>.
<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.
Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.
Still a few bogus uses of struct buf to track down.
Repocopy by: peter
spl() protection in the case of a copyout error.
Add missing spl calls around the intial activation call that is
done when when the kevent is added.
Add two KASSERT macros to help catch errors in the future.
is used to control whether the debug messages are output at runtime.
It defaults to on so that if you define BUS_DEBUG in your kernel
then you get all the debugging info when you boot.
It's very useful for disabling all the debugging info when you're
developing a loadable device driver and you're doing lots of loads
and unloads but don't always want to see all the debugging info.
Remove evil allocation macros from machdep.c (why was that there???) and
use malloc() instead.
Move paramters out of param.h and into the code itself.
Move a bunch of internal definitions from public sys/*.h headers (without
#ifdef _KERNEL even) into the code itself.
I had hoped to make some of this more dynamic, but the cost of doing
wakeups on all sleeping processes on old arrays was too frightening.
The other possibility is to initialize on the first use, and allow
dynamic sysctl changes to parameters right until that point. That would
allow /etc/rc.sysctl to change SEM* and MSG* defaults as we presently
do with SHM*, but without the nightmare of changing a running system.
wrong for many years that negative niceness would lower the priority
of a process below PUSER, and once below PUSER, there were conditionals
in the code that are required to test for whether a process was in
the kernel which would break.
The breakage could (and did) cause lock-ups, basically nothing else
but the least nice program being able to run in some conditions. The
algorithm which adjusts the priority now subtracts PRIO_MIN to do
things properly, and the ESTCPULIM() algorithm was updated to use
PRIO_TOTAL (PRIO_MAX - PRIO_MIN) to calculate the estcpu.
NICE_WEIGHT is now 1 to accomodate the full range of priorities better
(a -20 process with full CPU time has the priority of a +0 process with
no CPU time). There are now 20 queues (exactly; 80 priorities) for
use in user processes' scheduling, and PUSER has been lowered to 48
to accomplish this.
This means, to the user, that things will be scheduled more correctly
(noticeable), there is no lock-up anymore WRT a niced -20 process
never releasing the CPU time for other processes. In this fair system,
tsleep()ed < PUSER processes now will get the proper higher priority
than priority >= PUSER user processes.
The detective work of this was done by me, along with part of the
solution. Luoqi Chen has provided most of the solution, and really
helped me understand what was happening better, to boot :)
Submitted by: luoqi
Concept reviewed by: bde
bus/driver/kobj system. I am not 100% sure that this is the correct fix,
but it is harmless and does seem to solve the problem. At worst, it could
cause a tiny memory leak at unload time - this is better than a free(NULL)
and subsequent panic. I'm waiting for comments from Doug about this.
This may yet be backed out and fixed differently.
The change itself is to increment the reference count on drivers in one
case where it appears to have been missed. When everything is unloaded,
kobj_class_free() was being called twice in some cases, and panicing the
second time.