Compile and link a new kernel; that will give native ELF support and
provide the hooks for other ELF interpreters as well.
To make native ELF binaries, use John Polstra's elf-kit-1.0.1.
For the time being also use his ld-elf.so.1 and put it in
/usr/libexec.
The Linux emulator has been enhanced to also run ELF binaries; it
is, however, in its very first incarnation.
Just get some Linux ELF libs (Slackware-3.0) and put them in the
proper place (/compat/linux/...).
I've been able to run all the Slackware-3.0 binaries I've tried
so far.
(No, it won't run quake yet :)
residing in a buffer that had been dirtied by a process was being
handled incorrectly. The pages were mistakenly placed into the
cache queue. This would likely have the effect of mmapped page modifications
being lost when I/O system calls were being used simultaneously on
the same locations in a file.
Submitted by: davidg
on in the FreeBSD development, I had made a global lock around the
rlist code. This was bogus, and now the lock is maintained on a
per-resource-list basis. This now allows the rlist code to be used for
almost any non-interrupt level application.
Linux binaries from the *BSD a.out loader. This is a hack, but lets me run
static NetBSD binaries. Dynamic binaries are a much bigger problem because
the shared libraries would conflict with our native libraries, so a
/compat/netbsd alternate namespace and translation would be needed.
netscape-2.0 for Linux running all the Java stuff. The scrollbars are now
working, at least on my machine. (whew! :-)
I'm uncomfortable with the size of this commit, but it's too
interdependent to easily separate out.
The main changes:
COMPAT_LINUX is *GONE*. Most of the code has been moved out of the i386
machine dependent section into the linux emulator itself. The int 0x80
syscall code was almost identical to the lcall 7,0 code and a minor tweak
allows them to both be used with the same C code. All kernels can now
just modload the lkm and it'll DTRT without having to rebuild the kernel
first. Like IBCS2, you can statically compile it in with "options LINUX".
A pile of new syscalls implemented, including getdents(), llseek(),
readv(), writev(), msync(), personality(). The Linux-ELF libraries want
to use some of these.
linux_select() now obeys Linux semantics, ie: it updates the timeout value
to the time remaining rather than leaving it at the original value.
Quite a few bugs removed, including incorrect arguments being used in
syscalls.. eg: mixups between passing the sigset as an int, vs passing
it as a pointer and doing a copyin(), missing return values, unhandled
cases, SIOC* ioctls, etc.
The build for the code has changed. i386/conf/files now knows how
to build linux_genassym and generate linux_assym.h on the fly.
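The linux_select() timeout behaviour mentioned above can be pictured with a
small user-space sketch. This is not the emulator code: remaining_timeout()
is a hypothetical helper, and it assumes both arguments are normalized
timevals (0 <= tv_usec < 1000000).

```c
#include <assert.h>
#include <sys/time.h>

/*
 * Hypothetical helper, not taken from the emulator: given the timeout the
 * caller requested and the time spent waiting, compute the value that
 * Linux semantics would leave in the caller's timeval.
 */
struct timeval
remaining_timeout(struct timeval requested, struct timeval elapsed)
{
	struct timeval left;

	/* Assumption: both inputs are already normalized. */
	assert(requested.tv_usec >= 0 && requested.tv_usec < 1000000);
	assert(elapsed.tv_usec >= 0 && elapsed.tv_usec < 1000000);

	left.tv_sec = requested.tv_sec - elapsed.tv_sec;
	left.tv_usec = requested.tv_usec - elapsed.tv_usec;
	if (left.tv_usec < 0) {
		/* Borrow one second to keep tv_usec in range. */
		left.tv_sec--;
		left.tv_usec += 1000000;
	}
	if (left.tv_sec < 0) {
		/* The whole timeout was used up: report zero, never negative. */
		left.tv_sec = 0;
		left.tv_usec = 0;
	}
	return (left);
}
```

A BSD-style select() would leave the caller's timeval alone; a wrapper with
Linux semantics would store the result of a calculation like this back into
it before returning.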
Supporting changes elsewhere in the kernel:
The user-mode signal trampoline has moved from the U area to immediately
below the top of the stack (below PS_STRINGS). This allows the different
binary emulations to have their own signal trampoline code (which gets rid
of the hardwired syscall 103 (sigreturn on BSD, syslog on Linux)) and so
that the emulator can provide the exact "struct sigcontext *" argument to
the program's signal handlers.
The sigstack's "ss_flags" now uses SS_DISABLE and SS_ONSTACK flags, which
have the same values as the re-used SA_DISABLE and SA_ONSTACK which are
intended for sigaction only. This enables the support of a SA_RESETHAND
flag to sigaction to implement the gross SYSV and Linux SA_ONESHOT signal
semantics where the signal handler is reset when it's triggered.
makesyscalls.sh no longer appends the struct sysentvec on the end of the
generated init_sysent.c code. It's a lot saner to have it in a separate
file rather than trying to update the structure inside the awk script. :-)
At exec time, the dozen bytes or so of signal trampoline code are copied
to the top of the user's stack, rather than obtaining the trampoline code
the old way by getting a clone of the parent's user area. This allows
Linux and native binaries to freely exec each other without getting
trampolines mixed up.
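The SA_RESETHAND (one-shot) semantics mentioned above can be observed from
user space with standard POSIX calls. A minimal sketch, assuming SIGUSR1 is
not blocked and the process is single-threaded; the function and handler
names are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t hits;

static void
on_usr1(int signo)
{
	(void)signo;
	hits++;
}

/*
 * Install a one-shot handler, deliver the signal once, then ask the kernel
 * what the disposition is now.  Returns 1 if the handler ran exactly once
 * and was reset to SIG_DFL, 0 if not, -1 on error.
 */
int
oneshot_handler_was_reset(void)
{
	struct sigaction sa, cur;

	memset(&sa, 0, sizeof(sa));
	sigemptyset(&sa.sa_mask);
	sa.sa_handler = on_usr1;
	sa.sa_flags = SA_RESETHAND;	/* reset to default on delivery */
	if (sigaction(SIGUSR1, &sa, NULL) == -1)
		return (-1);

	hits = 0;
	if (raise(SIGUSR1) != 0)
		return (-1);

	/* Query only: a NULL new action leaves the disposition untouched. */
	if (sigaction(SIGUSR1, NULL, &cur) == -1)
		return (-1);
	return (hits == 1 && cur.sa_handler == SIG_DFL);
}
```

A second raise(SIGUSR1) after this would take the default action and
terminate the process, which is the SYSV/Linux SA_ONESHOT behaviour the
commit refers to.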
queue type is not set to QUEUE_NONE. This appears to have
caused a hang bug that has been lurking.
2) Fix bugs where brelse'ing a locked buffer does not "free" it, although
the code assumes that it does. This can cause hangs when LFS is used.
3) Use malloced memory for directories when applicable. The amount
of malloced memory is seriously limited, but should decrease the
amount of memory used by an average directory to 1/4 - 1/2 previous.
This capability is fully tunable. (Note that there is no config
parameter, and might never be.)
4) Bias buffer cache usage slightly towards non-VMIO buffers. Since
the data in VMIO buffers is not lost when the buffer is reclaimed, this
will help performance. This is also adjustable.
is <sys/unistd.h>, with the prototype in <unistd.h>. sys/unistd.h
is visible to the kernel compile, and is #included by unistd.h.
Also, I missed a reference to a static int in the midst of my other diffs.
kern_fork.c: add the tiny bit of code for rfork operation.
kern/sysv_*: shmfork() takes one less arg, it was never used.
sys/shm.h: drop "isvfork" arg from shmfork() prototype
sys/param.h: declare rfork args.. (this is where OpenBSD put it..)
sys/filedesc.h: protos for fdshare/fdcopy.
vm/vm_mmap.c: add minherit code, add rounding to mmap() type args where
it makes sense.
vm/*: drop unused isvfork arg.
Note: this rfork() implementation copies the address space mappings,
it does not connect the mappings together. ie: once the two processes
have split, the pages may be shared, but the address space is not. If one
does a mmap() etc, it does not appear in the other. This makes it not
useful for pthreads, but it is useful in its own right for having
light-weight threads in a static shared address space.
Obtained from: Original by Ron Minnich, extended by OpenBSD
fixes for previous version of new pipes from Bruce Evans. This
new version:
Supports more properly the semantics of select (BDE).
Supports "OLD_PIPE" correctly (kern_descrip.c, BDE).
Eliminates incorrect EPIPE returns (bash 'pipe broken' messages.)
Much faster yet, currently tuned relatively conservatively -- but now
gives approx 50% more perf than the new pipes code did originally.
(That was about 50% more perf than the original BSD pipe code.)
Known bugs outstanding:
No support for async io (SIGIO). Will be included soon.
Next to do:
Merge support for FIFOs.
Submitted by: bde
The code outputs the dc then calls the device specific externalize
routines to fill in the dc_data area. The old code assumed that dc_data
started one byte from the end of the dc, but with the compiler optimizing
alignment and padding, this isn't always the case. Do an explicit
&(dc.dc_data) - &dc. This fixes lsdev -c which must have been broken
for some time.
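The padding problem described above is easy to reproduce in isolation. A
small sketch using a hypothetical stand-in struct (the field layout is
invented, only the trailing data member mirrors dc_data): the header size is
taken from the member's real offset instead of assuming it sits one byte
from the end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real record; the fields are illustrative. */
struct devconf_like {
	char	name[5];
	int	unit;
	short	flags;
	char	data[1];	/* variable-length payload starts here */
};

/* Old assumption: payload starts one byte before the end of the struct. */
size_t
header_size_by_sizeof(void)
{
	return (sizeof(struct devconf_like) - 1);
}

/* Explicit pointer difference, in the spirit of &(dc.dc_data) - &dc. */
size_t
header_size_by_pointers(const struct devconf_like *dc)
{
	assert(dc != NULL);
	return ((size_t)((const char *)dc->data - (const char *)dc));
}
```

When the compiler adds tail padding after data[], header_size_by_sizeof()
comes out larger than the true offset, so a consumer would skip into the
payload. offsetof(struct devconf_like, data) from <stddef.h> yields the same
value as the pointer difference without needing an object.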
1) The calculation didn't account for NMBCLUSTERS, so if a large number of
clusters was specified, it would leave little or no space for kernel
malloc.
2) It was bogusly restricted to v_page_count. This doesn't take into
account the sparseness of the malloc area and would have caused
problems on machines with small amounts of memory. It should probably
instead be changed to set the malloc limit to be constrained by
the amount of memory, but I didn't do this.
First attempt at creating devfs entries for sliced devices. Doesn't
quite work yet, so the heart of it is disabled.
Added bdev and cdev args to dsopen().
Create devfs entries in dsopen() and (unsuccessfully) attempt to make
them go away at the right times. DEVFS is #undefed at the start so
that this shouldn't cause problems.
PT_ATTACH/PT_DETACH implemented now and fully operational.
PT_{GET|SET}{REGS|FPREGS} implemented now, using code shared with procfs
PT_{READ|WRITE}_{I|D} now uses code shared with procfs
ptrace opcodes now fully permission checked, including ownerships.
doing an operation to the u-area on a swapped process should no longer
panic.
running gdb as root works for me now, where it didn't before.
general cleanup..
Note that this has some tightening of permissions/access checks etc.
Some of these may be going too far.. In particular, the "owner" of the
traced process is enforced. The process that created or attached to
the traced process is now the only one that can "do" things to it.
Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish
overhead for merged cache.
Efficiency improvement for vfs_cluster. It used to do a lot of redundant
calls to cluster_rbuild.
Correct the ordering for vrele of .text and release of credentials.
Use the selective tlb update for 486/586/P6.
Numerous fixes to the size of objects allocated for files. Additionally,
fixes in the various pagers.
Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs.
Fixes in the swap pager for exhausted resources. The pageout code
will not as readily thrash.
Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into
page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE),
thereby improving efficiency of several routines.
Eliminate even more unnecessary vm_page_protect operations.
Significantly speed up process forks.
Make vm_object_page_clean more efficient, thereby eliminating the pause
that happens every 30 seconds.
Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the
case of filesystems mounted async.
Fix a panic with busy pages when write clustering is done for non-VMIO
buffers.
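The change from page queue flags to page queue indices listed above can be
sketched as follows. This is an illustration, not the VM code: only the
PQ_ACTIVE, PQ_INACTIVE, PQ_FREE and PQ_CACHE names come from the commit;
the numeric values, PQ_NONE, PQ_COUNT, and the struct layouts are
assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Values are assumptions; PQ_NONE and PQ_COUNT are added for the sketch. */
enum { PQ_NONE, PQ_FREE, PQ_INACTIVE, PQ_ACTIVE, PQ_CACHE, PQ_COUNT };

struct page {
	int		queue;		/* index of the queue holding the page */
	struct page	*prev, *next;
};

struct pageq {
	struct page	*head;
	int		count;
};

static struct pageq queues[PQ_COUNT];

/* Remove a page from whatever queue it is on: one array lookup, no tests. */
void
page_unqueue(struct page *p)
{
	struct pageq *q;

	if (p->queue == PQ_NONE)
		return;
	q = &queues[p->queue];
	if (p->prev != NULL)
		p->prev->next = p->next;
	else
		q->head = p->next;
	if (p->next != NULL)
		p->next->prev = p->prev;
	p->prev = p->next = NULL;
	q->count--;
	p->queue = PQ_NONE;
}

/* Move a page to the head of the given queue. */
void
page_enqueue(struct page *p, int queue)
{
	struct pageq *q;

	assert(queue > PQ_NONE && queue < PQ_COUNT);
	page_unqueue(p);
	q = &queues[queue];
	p->prev = NULL;
	p->next = q->head;
	if (q->head != NULL)
		q->head->prev = p;
	q->head = p;
	q->count++;
	p->queue = queue;
}

int
page_queue_count(int queue)
{
	return (queues[queue].count);
}
```

With flag bits, the removal path has to test each PG_* bit in turn to find
the right queue; with an index, the queue is found by a single array
reference, which is where the efficiency gain comes from.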
Add more features to the one remaining to handle the job:
+ signed quantity.
# alternate format
- left padding
* read width as next arg.
n numeric in (argument specified) default radix.
Fix the DDB debugger to use these.
Use vprintf in debug routine in pcvt.
The warnings from gcc may become more wrong and intolerable because
of this.
Warning: I have not checked the entire source for unsupported or
changed constructs, but generally believe that there are only a few.
Suggested by: bde
sysv_ipc.c: add stub functions that either simply return (for the hooks
in kern_fork/kern_exit) or log() a message and call enosys() (for the
syscalls). sysv_ipc.c will become "standard" in conf/files and has
#ifs for all the permutations.
are about to go in. This is to fix the problem with the ibcs2 and linux
lkm's not being able to call the sysv ipc functions unless the build is
modified.
the range [210:260] by sweeping the problem under the rug. This change
has the following effects:
1) A new MIB variable in the kern branch is defined to allow modification
of the socket buffer layer's ``wastage factor'' (which determines how
much unused-but-allocated space in mbufs and mbuf clusters is allowed
in a socket buffer).
2) The default value of the wastage factor is changed from 2 to 8.
The original value was chosen when MINCLSIZE was 7*MLEN (!), and is not
appropriate for an environment where MINCLSIZE is much less.
The real solution to this problem is to scrap both mbufs and sockbufs
and completely redesign the buffering mechanism used at both levels.
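To get a feel for what changing the wastage factor from 2 to 8 means, here is
a rough arithmetic sketch. It assumes the factor simply multiplies the
buffer's byte limit to give the cap on allocated mbuf storage, subject to a
hard maximum; that model, the function name, and the parameter names are
assumptions, not the socket buffer code itself.

```c
#include <stddef.h>

/*
 * Assumed model: allocated mbuf storage for a socket buffer may reach
 * (byte limit * wastage factor), but never more than hard_max.
 */
size_t
sockbuf_mbuf_limit(size_t hiwat, size_t wastage_factor, size_t hard_max)
{
	size_t limit = hiwat * wastage_factor;

	return (limit < hard_max ? limit : hard_max);
}
```

Under this model a 16384-byte buffer could tie up 32768 bytes of mbuf
storage with a factor of 2 and 131072 bytes with a factor of 8, which is why
a larger factor tolerates small payloads sitting in mostly empty mbufs or
clusters.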
sysctl handler (ouch!)
Add a "const" qualifier to the source of the copyin() and copyout()
functions - the other const warning in kern_sysctl.c was silenced when
copyout was declared as having a const source.. (which it is)
just like on SVR4.
This has no effect on any current programs in our source, but makes
the use of SVR4 code a little easier. There is no code or implementation
cost in the kernel.. This two-line change merely sets the modes on the ends
of the pipes to be bidirectional. There are no other changes.
looking at a high resolution clock for each of the following events:
function call, function return, interrupt entry, interrupt exit,
and interesting branches. The differences between the times of
these events are added at appropriate places in an ordinary histogram
(as if very fast statistical profiling sampled the pc at those
places) so that ordinary gprof can be used to analyze the times.
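The accounting described above can be sketched in a few lines. This is an
illustration under stated assumptions, not the kernel's profiling code: the
struct, the bucket size, and the function name are hypothetical, and the
clock is just a caller-supplied tick count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS	1024
#define BUCKET_SHIFT	4	/* assumed: one bucket per 16 bytes of text */

struct profiler {
	uint32_t	hist[NBUCKETS];	/* 4-byte counters */
	uintptr_t	lowpc;		/* lowest profiled address */
	uintptr_t	cur_pc;		/* code charged until the next event */
	uint64_t	last_time;	/* clock reading at the previous event */
};

/*
 * Called at each event (call, return, interrupt entry/exit).  The time
 * since the previous event is added to the bucket of the code that was
 * running, as if a very fast sampler had hit that pc repeatedly.
 */
void
prof_event(struct profiler *p, uintptr_t new_pc, uint64_t now)
{
	assert(now >= p->last_time);
	if (p->cur_pc >= p->lowpc) {
		size_t idx = (size_t)((p->cur_pc - p->lowpc) >> BUCKET_SHIFT);

		if (idx < NBUCKETS)
			p->hist[idx] += (uint32_t)(now - p->last_time);
	}
	p->cur_pc = new_pc;
	p->last_time = now;
}
```

Because the result has the same shape as a statistical pc histogram, an
analysis tool that understands the counter size can read it unchanged.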
gmon.h:
Histogram counters need to be 4 bytes for microsecond resolutions.
They will need to be larger for the 586 clock.
The comments were vax-centric and wrong even on vaxes. Does anyone
disagree?
gprof4.c:
The standard gprof should support counters of all integral sizes
and the size of the counter should be in the gmon header. This
hack will do until then. (Use gprof4 -u to examine the results
of non-statistical profiling.)
config/*:
Non-statistical profiling is configured with `config -pp'.
`config -p' still gives ordinary profiling.
kgmon/*:
Non-statistical profiling is enabled with `kgmon -B'. `kgmon -b'
still enables ordinary profiling (and disables non-statistical
profiling) if non-statistical profiling is configured.
int shmget(key_t key, int size, int shmflg);
If 'key' already exists in the system and 'shmflg' is set to
'(IPC_CREAT|IPC_EXCL)', then shmget() must return the error 'EEXIST'.
Submitted by: m_tanaka@pa.yokogawa.co.jp (Mihoko Tanaka)
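The required behaviour can be checked from user space with the standard
SysV calls. A minimal sketch, assuming System V shared memory is available
and that the caller picks a key not already in use; the function name, the
4096-byte size, and the 0600 mode are arbitrary choices for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Create a segment under `key`, then try to create it again with
 * IPC_CREAT | IPC_EXCL.  Returns the errno of the second attempt, 0 if it
 * unexpectedly succeeded, or -1 if the first creation failed.
 */
int
second_exclusive_create_errno(key_t key)
{
	int id, dup, err;

	/* IPC_PRIVATE always makes a new segment, so it cannot collide. */
	assert(key != IPC_PRIVATE);

	id = shmget(key, 4096, IPC_CREAT | IPC_EXCL | 0600);
	if (id == -1)
		return (-1);

	dup = shmget(key, 4096, IPC_CREAT | IPC_EXCL | 0600);
	err = (dup == -1) ? errno : 0;

	/* Remove the segment so repeated runs start clean. */
	shmctl(id, IPC_RMID, NULL);
	return (err);
}
```

On a system with the fix, the second call fails and the function returns
EEXIST.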
See the comments for addupc_intr() and the NetBSD implementation.
We use dummy versions of fuswintr() and suswintr(), so addupc_intr()
always pushes the work to trap() (this is inefficient), and trap()
calls the special i386 function addupc() instead of addupc_task().
addupc() is more efficient than addupc_intr(), so some of the lost
efficiency is recovered. However, addupc() may be broken on plain
i386's since it doesn't check for write permission like copyout().