via sysctl. It's done pretty simply but it should be quite adequate.
Also move SHMMAXPGS from $machine/include/vmparam.h as the comments that
went with it were wrong... we don't allocate KVM space for the pages so
that comment is bogus.. The only practical limit is how much physical
ram you want to lock up as this stuff isn't paged out or swap backed.
syscall path inward. A system call may select whether it needs the MP
lock or not (the default being that it does need it).
A great deal of conditional SMP code for various deadended experiments
has been removed. 'cil' and 'cml' have been removed entirely, and the
locking around the cpl has been removed. The conditional
separately-locked fast-interrupt code has been removed, meaning that
interrupts must hold the CPL now (but they pretty much had to anyway).
Another reason for doing this is that the original separate-lock for
interrupts just doesn't apply to the interrupt thread mechanism being
contemplated.
Modifications to the cpl may now ONLY occur while holding the MP
lock. For example, if an otherwise MP safe syscall needs to mess with
the cpl, it must hold the MP lock for the duration and must (as usual)
save/restore the cpl in a nested fashion.
This is precursor work for the real meat coming later: avoiding having
to hold the MP lock for common syscalls and I/O's and interrupt threads.
It is expected that the spl mechanisms and new interrupt threading
mechanisms will be able to run in tandem, allowing a slow piecemeal
transition to occur.
This patch should result in a moderate performance improvement due to
the considerable amount of code that has been removed from the critical
path, especially the simplification of the spl*() calls. The real
performance gains will come later.
Approved by: jkh
Reviewed by: current, bde (exception.s)
Some work taken from: luoqi's patch
fragmentation problem due to geteblk() reserving too much space for the
buffer and imposes a larger granularity (16K) on KVA reservations for
the buffer cache to avoid fragmentation issues. The buffer cache size
calculations have been redone to simplify them (fewer defines, better
comments, less chance of running out of KVA).
The geteblk() fix solves a performance problem that DG was able reproduce.
This patch does not completely fix the KVA fragmentation problems, but
it goes a long way
Mostly Reviewed by: bde and others
Approved by: jkh
static int setrootbyname(char *name);
out into
dev_t getdiskbyname(char *name);
This makes it easy to create a new DDB command, which is the big reason
for the change. You can now do the following in DDB:
Example rc.conf entry:
dumpdev="/dev/ad0s1b" # Device name to crashdump to (if enabled).
db> show disk/ad0s1b
dev_t = 0xc0b7ea00
db> p *dumpdev
c0b7ea00
Make the public interface more systematically named.
Remove the alternate method, it doesn't do any good, only ruins performance.
Add counters to profile the usage of the 8 access functions.
Apply the beer-ware to my code.
The weird +/- counts are caused by two repocopies behind the scenes:
kern/kern_clock.c -> kern/kern_tc.c
sys/time.h -> sys/timetc.h
(thanks peter!)
substitute BUF_WRITE(foo) for VOP_BWRITE(foo->b_vp, foo)
substitute BUF_STRATEGY(foo) for VOP_STRATEGY(foo->b_vp, foo)
This patch is machine generated except for the ccd.c and buf.h parts.
field in struct buf: b_iocmd. The b_iocmd is enforced to have
exactly one bit set.
B_WRITE was bogusly defined as zero giving rise to obvious coding
mistakes.
Also eliminate the redundant struct buf flag B_CALL, it can just
as efficiently be done by comparing b_iodone to NULL.
Should you get a panic or drop into the debugger, complaining about
"b_iocmd", don't continue. It is likely to write on your disk
where it should have been reading.
This change is a step in the direction towards a stackable BIO capability.
A lot of this patch were machine generated (Thanks to style(9) compliance!)
Vinum users: Greg has not had time to test this yet, be careful.
return ENXIO (Device not configured). Without this, vn_isdisk()
could (and did in the case of lstat() under fdesc) pass a NULL pointer
to devsw(), which caused a page fault.
Reviewed by: alfred
This avoids the unit number from going up indefinitely when
diconnecting and connecting 2 devices alternately.
Noticed by: nsayer (quite a while ago)
And stop calling DEVICE_NOMATCH at probe repeatedly. This stops the
message on the PCI VGA board from being printed when loading a PCI driver.
to recycle full fsids after only 16 mount/unmount's. This is probably
too often for exported fsids. Now we recycle the full fsids only
after 2^16 mount/ umount's and only ensure uniqueness in the lower 16
bits if there have been <= 256 calls to vfs_getnewfsid() since the
system started.
Previously, it was being called whether it was needed or not and the
ASU flag was being set (as a side affect of calling 'suser()') in
cases where superuser privileges were not actually needed. This was
all pointed out to me by Bruce Evans.
Reviewed by: bde
number was packed very wastefully, giving perfect non-uniqeness in
the lower 16 bits of fsids for filesystems with the same vfs type.
This made linux_stat() return perfectly non-unique (broken) 16-bit
st_dev's for nfs mount points, and effectively reduced mntid_base to
8 bits so that the vfs_getnewfsid() looped endlessly when there are
already 256 mounted filesystems with the required vfs type.
Approved by: jkh
VM_PROT_EXECUTE must be added to prot before calling vm_map_find.
Without this change, an mprotect on a shmat'ed region fails (when
it shouldn't). This bug was reported Feb 28 by Brooks Davis
<brooks@one-eyed-alien.net> on -hackers.
Reviewed by: bde
Approved by: jkh
with the known bogus currtpriority. This undoes the previous changes to
sys/i386/i386/trap.c, sys/alpha/alpha/trap.c, sys/sys/systm.h
Now we have the patch set approved by bde.
Approved by: bde
This
This feature allows you to specify if mmap'd data is included in
an application's corefile.
Change the type of eflags in struct vm_map_entry from u_char to
vm_eflags_t (an unsigned int).
Reviewed by: dillon,jdp,alfred
Approved by: jkh
how the kernel was booted and perhaps do conditional things
based upon it (sysinstall, for example, will now turn Debug mode
on automatically if boot -v was done).
Submitted by: msmith
Suggested by: ulf
Now this check is necessary because IPv6 source routing might use
control data bigger than MLEN. (e.g. 16bytes IPv6 addr x 23 hops)
Actually mbuf cluster should be used in uipc_socket.c:sbcreatecontrol()
and uipc_syscalls.c:sockargs() when data size is bigger then MLEN,
and such patches were already in KAME environment and have been
confirmed to work well. I just forgot to merge them into 4.0, sorry.
For safety, I'll postpone such patches until after 4.0 release.
The effect of postponement is followings.
-Ping6 source routing hops are limitted to around 6 or so.
-If some apps do setsockopt IPV6_RTHDR and try to receive
incoming IPv6 source routing info, it can't receive more
than 6 hops source routing info.
(But currently, no apps seems to be doing it.)
Approved by: jkh
VFS_AIO option is specified, all aio-related syscalls return ENOSYS.
The aio code is very fragile right now, and is unsuitable for default
inclusion in a production shell box.
Approved by: jkh
was using them exits.
Don't allow a user process to cause the kernel to take a TRCTRAP on a
user space address.
Reviewed by: jlemon, sef
Approved by: jkh
fd's in the range of 32-63, 96-127 etc. The first problem was the
FD_*() macros were shifting a 32 bit integer "1" left by more than
32 bits. The same problem happened in selscan(). ffs() also takes
an int argument and causes failure. For cases where int == long
(ie: the usual case for x86, but not always as gcc can have long
being a 64 bit quantity) ffs() could be used.
Reported by: Marian Stagarescu <marian@bile.skycache.com>
Reviewed by: dfr, gallatin (sys/types.h only)
Approved by: jkh
was needed to make attach/detach of devices work, which is
needed for the PCCARD support.
(PCCARD support is still not working though, more to come on that)
Support the CMD646 chip which is used on many alphas, sadly only
in WDMA2 mode, as the silicon is broken beyond belief for UDMA modes.
Lots of cosmetic fixes here and there.
Sorry for the size of this megapatchfromhell but it was not
possible otherwise...
newbus patches based on work from: dfr (Doug Rabson)
firmware prompt. Several sleepy folk mistook the '>>>' for the SRM
prompt, which was never the desired idea.
Submitted by: Andrew Gallatin <gallatin@cs.duke.edu>
Approved by: jkh
run out of KVM through a mmap()/fork() bomb that allocates hundreds
of thousands of vm_map_entry structures.
Add panic to make null-pointer dereference crash a little more verbose.
Add a new sysctl, vm.max_proc_mmap, which specifies the maximum number
of mmap()'d spaces (discrete vm_map_entry's in the process). The value
defaults to around 9000 for a 128MB machine. The test is scaled for the
number of processes sharing a vmspace (aka linux threads). Setting
the value to 0 disables the feature.
PR: kern/16573
Approved by: jkh
This unspams the boot messages, concentrating on the drivers that have
actually been probed.
This basically resurrects revision 1.106 from old /sys/i386/isa/isa.c.
Reviewed by: jkh, dfr
curproc. This only makes any difference on SMP, where we used a
(potentially very bogus) switchtime from our own CPU to calculate
resource usage on another CPU.
This should remove some if not all calcru() related warnings on SMP.
Approved by: jkh
#! /bin/sh # -*- perl -*-
This is simply "delete everything after the next '#', not counting the
first char in the line". No effort has been made to allow quoting,
backslash escaping or '#' in interpreter names.
The complies to POSIX 1003.2 in that Posix says the implementation is
free to choose whatever it likes.
PR: bin/16393
``jail'', and move the set_hostname_allowed sysctl there, as well as
fixing a bug in the sysctl that resulted in jails being over-limited
(preventing them from reading as well as writing the hostname). Also,
correct some formatting issues, courtesy bde :-).
Reviewed by: phk
Approved by: jkh
or not a process in a jail, with privilege, may set the jail's hostname.
Defaults to 1, which permits this. May be set to 0 by a process with
appropriate privilege outside of jail. Preventing hostname renaming
from within a jail is currently required to make jails manageable, as they
a currently identifiable only by hostname using /proc, which may be
modified without this sysctl being set to 0. This will be documented
in upcoming man commits.
Authorized by: jkh, the ever-patient