Implement all futex atomic operations in assembler so they no longer depend
on fuword(), which cannot distinguish a stored value of -1 from a failure return.
Correctly return 0 from atomic operations on success.
In collaboration with: rdivacky
Tested by: Scot Hetzel <swhetzel gmail com>, Milos Vyletel <mvyletel mzm cz>
Sponsored by: Google SoC 2007
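A minimal userland sketch of the ambiguity described above, not the committed assembler. The names fetch_inband() and futex_add_sketch() are hypothetical, a NULL pointer stands in for a faulting user address, and the read-modify-write here is deliberately not atomic:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * In-band style, like fuword(): the fetched value is the return value and
 * -1 signals a fault. A word that really contains -1 looks like a failure.
 */
static long fetch_inband(const long *uaddr)
{
    if (uaddr == NULL)      /* NULL models a faulting user address */
        return -1;
    return *uaddr;          /* may legitimately be -1 */
}

/*
 * Out-of-band style: return 0 on success and an errno value on a fault;
 * the old value is handed back through a separate pointer.
 * NOTE: this sketch is NOT atomic; it only illustrates the return contract.
 */
static int futex_add_sketch(long *uaddr, long delta, long *oldval)
{
    assert(oldval != NULL);
    if (uaddr == NULL)
        return EFAULT;
    *oldval = *uaddr;
    *uaddr += delta;
    return 0;
}
```

With the second form a caller can tell "the old value was -1" apart from "the access faulted", which the first form cannot express.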
dynamic nature (if no native aio code is available, the linux part
returns ENOSYS because of missing requisites) should be solved differently
than it is.
All this will be done in P4.
Not included in this commit is a backout of the changes to the native aio
code (removing static in some places). Those changes (and some more) will
also be needed when the reworked linux aio stuff reenters the tree.
Requested by: rwatson
Discussed with: rwatson
Implement the linux_io_* syscalls (AIO). They are only enabled if the native
AIO code is available (either compiled in to the kernel or as a module) at
the time the functions are used. If the AIO stuff is not available, the
syscalls return ENOSYS.
From the submitter:
---snip---
DESIGN NOTES:
1. Linux permits a process to own multiple AIO queues (distinguished by
"context"), but FreeBSD creates only one single AIO queue per process.
My code maintains a request queue (STAILQ of queue(3)) per "context",
and throws all AIO requests of all contexts owned by a process into
the single FreeBSD per-process AIO queue.
When the process calls io_destroy(2), io_getevents(2), io_submit(2) and
io_cancel(2), my code can pick out requests owned by the specified context
from the single FreeBSD per-process AIO queue according to the per-context
request queues maintained by my code.
2. The request queue maintained by my code stores the correspondence between
Linux IO control blocks (struct linux_iocb) and FreeBSD IO control blocks
(struct aiocb). FreeBSD IO control block actually exists in userland memory
space, required by FreeBSD native aio_XXXXXX(2).
3. It is quite troubling that the function io_getevents() of libaio-0.3.105
needs to use Linux-specific "struct aio_ring", which is a partial mirror
of context in user space. I would rather take the address of context in
kernel as the context ID, but the io_getevents() of libaio forces me to
take the address of the "ring" in user space as the context ID.
To my surprise, one comment line in the file "io_getevents.c" of
libaio-0.3.105 reads:
Ben will hate me for this
REFERENCE:
1. Linux kernel source code: http://www.kernel.org/pub/linux/kernel/v2.6/
(include/linux/aio_abi.h, fs/aio.c)
2. Linux manual pages: http://www.kernel.org/pub/linux/docs/manpages/
(io_setup(2), io_destroy(2), io_getevents(2), io_submit(2), io_cancel(2))
3. Linux Scalability Effort: http://lse.sourceforge.net/io/aio.html
The design notes: http://lse.sourceforge.net/io/aionotes.txt
4. The package libaio, both source and binary:
http://rpmfind.net/linux/rpm2html/search.php?query=libaio
Simple transparent interface to Linux AIO system calls.
5. Libaio-oracle: http://oss.oracle.com/projects/libaio-oracle/
POSIX AIO implementation based on Linux AIO system calls (depending on
libaio).
---snip---
Submitted by: Li, Xiao <intron@intron.ac>
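The per-context bookkeeping from design notes 1 and 2 can be sketched roughly as follows. This is an illustration under assumptions, not the submitted code: struct req, struct ctx and the ctx_*() helpers are hypothetical names, and opaque pointers stand in for struct linux_iocb and struct aiocb:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/queue.h>

/* One tracked request: pairs a Linux control block with a native one. */
struct req {
    const void *linux_iocb;     /* stand-in for struct linux_iocb * */
    const void *bsd_aiocb;      /* stand-in for struct aiocb * */
    STAILQ_ENTRY(req) link;
};

STAILQ_HEAD(req_head, req);

/* One Linux AIO "context"; each context owns its own request queue. */
struct ctx {
    struct req_head reqs;
};

static void ctx_init(struct ctx *c)
{
    STAILQ_INIT(&c->reqs);
}

/* Record a submitted request in the context that owns it. */
static int ctx_submit(struct ctx *c, const void *liocb, const void *aiocb)
{
    struct req *r;

    assert(c != NULL);
    r = malloc(sizeof(*r));
    if (r == NULL)
        return -1;
    r->linux_iocb = liocb;
    r->bsd_aiocb = aiocb;
    STAILQ_INSERT_TAIL(&c->reqs, r, link);
    return 0;
}

/* Find the native block for a Linux block, searching this context only. */
static const void *ctx_lookup(struct ctx *c, const void *liocb)
{
    struct req *r;

    STAILQ_FOREACH(r, &c->reqs, link) {
        if (r->linux_iocb == liocb)
            return r->bsd_aiocb;
    }
    return NULL;
}

/* io_destroy-like teardown: drop every request owned by the context. */
static size_t ctx_destroy(struct ctx *c)
{
    struct req *r;
    size_t n = 0;

    while ((r = STAILQ_FIRST(&c->reqs)) != NULL) {
        STAILQ_REMOVE_HEAD(&c->reqs, link);
        free(r);
        n++;
    }
    return n;
}
```

Because each context has its own list, requests that all sit in one shared per-process queue can still be picked out by owner.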
- Prepare the modules for build on amd64, but don't build them there as
part of the kernel build yet. The code for the missing symbols on amd64
isn't committed and it may be solved differently.
Sponsored by: Google SoC 2006
Submitted by: rdivacky
Add back in a scheme to emulate old type major/minor numbers via hooks into
stat, linprocfs to return major/minors that Linux apps expect. Currently
only /dev/null is always registered. Drivers can register via the Linux
type shim similar to the ioctl shim but by using
linux_device_register_handler/linux_device_unregister_handler functions.
The structure is:
struct linux_device_handler {
char *bsd_driver_name;
char *linux_driver_name;
char *bsd_device_name;
char *linux_device_name;
int linux_major;
int linux_minor;
int linux_char_device;
};
Linprocfs uses this to display the major number of the driver. The
soon to be available linsysfs will use it to fill in the driver name.
Linux_stat uses it to translate the major/minor into Linux type values.
Note major numbers are dynamically assigned via passing in a -1 for
the major number so we don't need to keep track of them.
This is somewhat needed due to us switching to our devfs. MegaCli
will not run until I add in the linsysfs and mfi Linux compat changes.
Sponsored by: IronPort Systems
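A userland mock of how such a registry might behave, including the -1 convention for dynamically assigned majors. Only the structure comes from the text above; register_handler_sketch(), unregister_handler_sketch(), lookup_major_minor(), the table size and the first dynamic major are all hypothetical and do not reflect the real kernel functions or their signatures:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Structure as given in the commit message. */
struct linux_device_handler {
    char *bsd_driver_name;
    char *linux_driver_name;
    char *bsd_device_name;
    char *linux_device_name;
    int linux_major;
    int linux_minor;
    int linux_char_device;
};

#define MAX_HANDLERS        16   /* hypothetical table size */
#define FIRST_DYNAMIC_MAJOR 200  /* hypothetical start of the dynamic range */

static struct linux_device_handler *handlers[MAX_HANDLERS];
static int next_major = FIRST_DYNAMIC_MAJOR;

/* Register a handler; a major of -1 asks for a dynamically assigned one. */
static int register_handler_sketch(struct linux_device_handler *d)
{
    assert(d != NULL);
    for (int i = 0; i < MAX_HANDLERS; i++) {
        if (handlers[i] == NULL) {
            if (d->linux_major == -1)
                d->linux_major = next_major++;
            handlers[i] = d;
            return 0;
        }
    }
    return -1;                   /* table full */
}

static int unregister_handler_sketch(struct linux_device_handler *d)
{
    assert(d != NULL);
    for (int i = 0; i < MAX_HANDLERS; i++) {
        if (handlers[i] == d) {
            handlers[i] = NULL;
            return 0;
        }
    }
    return -1;                   /* not registered */
}

/* stat-style translation: native device name to Linux major/minor. */
static int lookup_major_minor(const char *bsd_device_name,
    int *major, int *minor)
{
    for (int i = 0; i < MAX_HANDLERS; i++) {
        const struct linux_device_handler *h = handlers[i];

        if (h != NULL && strcmp(h->bsd_device_name, bsd_device_name) == 0) {
            *major = h->linux_major;
            *minor = h->linux_minor;
            return 0;
        }
    }
    return -1;
}
```

A driver that does not care about a fixed number passes -1 and reads the assigned major back from its own structure after registration.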
the kernel by wrapping all targets for fake opt_*.h files in
.if defined(KERNBUILDDIR). Thus, such fake files won't be
created at all if modules are built with the kernel.
Some modules undergo cleanup like removing unused or unneeded
options or .h files, without which they wouldn't build this way
or the other.
Reviewed by: ru
Tested by: no binary changes in modules built alone
Tested on: i386 sparc64 amd64
recognized as such at the time. Now that the other bogons in the
tree have been fixed, we can remove this ugly kludge.
o Remove stale/bogus opt_foo.h files. These are left over from
by-gone resources. And they point to the need, yet again, to
improve the build system so meta information is only in one place.
Submitted by: ru
Reviewed by: bde
Approved by: re@ (jhb)
- add dependencies on opt_cpu.h and opt_kstack_pages.h to the linux module
Makefile in the i386 case. The latter is needed by an i386-only file, the
former by the i386 implementation of linux_sysvec.c (opt_cpu.h is used for
architecture-dependent options, so I added it only for i386, although this
file is also generated for the alpha).
- add a dependency on opt_kstack_pages.h to the pecoff module Makefile.
kernel access control.
Invoke appropriate MAC entry points for a number of VFS-related
operations in the Linux ABI module. In particular, handle uselib
in a manner similar to open() (more work is probably needed here),
as well as handle statfs(), and linux readdir()-like calls.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
but time and other interests are making it hard. Open the door for
new blood and fresh tactics now that the Linuxulator has had its
facelift.
Thanks to all who contributed during my tour of duty!
o Introduce private types for use in linux syscalls for two reasons:
1. establish type independence for ease in porting and,
2. provide a visual cue as to which syscalls have proper
prototypes to further cleanup the i386/alpha split.
Linuxulator types are prefixed by 'l_'. void and char have not
been "virtualized".
o Provide dummy functions for all syscalls and remove dummy functions
or implementations of truly obsolete syscalls.
o Sanitize the shm*, sem* and msg* syscalls.
o Make a first attempt to implement the linux_sysctl syscall. At this
time it only returns one MIB (KERN_VERSION), but most importantly,
it tells us when we need to add additional sysctls :-)
o Bump the kernel version up to 2.4.2 (this is not the same as the
KERN_VERSION MIB, BTW).
o Implement new syscalls, of which most are specific to i386. Our
syscall table is now up to date with Linux 2.4.2. Some highlights:
- Implement the 32-bit uid_t and gid_t based syscalls.
- Implement a couple of 64-bit file size/offset based syscalls.
o Fix or improve numerous syscalls and prototypes.
o Reduce style(9) violations while I'm here. Especially indentation
inconsistencies within the same file are addressed. Re-indenting
did not obfuscate actual changes to the extent that it could not
be combined.
NOTE: I spent some time testing these changes and found that if there
were regressions, they were not caused by these changes AFAICT.
It was observed that installing a RH 7.1 runtime environment
did make matters worse. Hangs and/or reboots have been observed
with and without these changes, so where they failed to make life
better, they do not appear to have made it worse.
the cwd is looked up inside the kernel. The native getcwd() in libc
handles this in userland if __getcwd() fails.
Obtained from: NetBSD via OpenBSD
Tested by: Chris Casey <chriss@phys.ksu.edu>, Markus Holmberg <markush@acc.umu.se>
Reviewed by: Darrell Anderson <anderson@cs.duke.edu>
PR: kern/24315
out of fashion. This particular case, unlike joy(8) and friends which
are just plain silly, did more than just load a kernel loadable module.
However, /etc/rc and the linux_base port were adjusted a while back to
cope with the absence of this script.
The only outstanding reason to hang on to it would have been for the
linux(8) manual page, which clued folks into the existence of the
Linuxulator. A new linux(4) was introduced a while back. It does
a much better job.
This script just isn't useful any more.
This means that the kernel can be totally self contained now and is not
dependent on the last buildworld to update /usr/share/mk. This might
also make it easier to build 5.x kernels on 4.0 boxes etc, assuming
gensetdefs and config(8) are updated.
-U_KERNEL became negative when all the genassym.c's were converted
to be cross-built.
Use "genassym ... > ${.TARGET}", not "genassym -o $@ ...", so that
genassym(1) doesn't need to support -o.
Removed duplicate -D_KERNEL from flags for compiling linux_locore.s.
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistent with the other
BSDs, which made this change quite some time ago. More commits to come.
discussed on current.
The following variables are defined (for now):
osname (defaults to "Linux")
Allow users to change the name of the OS as returned by uname(2),
specially added for all those Linux Netscape users and statistics
maniacs :-) We now have what we all wanted!
osrelease (defaults to "2.2.5")
Allow users to change the version of the OS as returned by uname(2).
Since -current supports glibc2.1 now, change the default to 2.2.5
(was 2.0.36).
oss_version (defaults to 198144 [0x030600])
This one will be used by the OSS_GETVERSION ioctl (PR 12917) which I
can commit now that we have the MIB. The default version number is the
lowest version possible with the current 'encoding'.
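The arithmetic behind the oss_version default can be checked with a small sketch. It assumes the encoding packs one byte each for major, minor and patch level, which matches 0x030600 but is not stated in the text; oss_encode() and the oss_*() accessors are hypothetical names:

```c
#include <assert.h>

/* Assumed encoding: 0xMMmmpp, one byte per component. */
static unsigned oss_encode(unsigned major, unsigned minor, unsigned patch)
{
    assert(major <= 0xff && minor <= 0xff && patch <= 0xff);
    return (major << 16) | (minor << 8) | patch;
}

static unsigned oss_major(unsigned v) { return (v >> 16) & 0xff; }
static unsigned oss_minor(unsigned v) { return (v >> 8) & 0xff; }
static unsigned oss_patch(unsigned v) { return v & 0xff; }
```

Under that assumption, 3.6.0 encodes as 3 * 65536 + 6 * 256 + 0 = 198144, which is 0x030600.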
A note about imprisoned processes (see jail(2)):
These variables are copy-on-write (as suggested by phk). This means that
imprisoned processes will use the system wide value unless it is written/set
by the process. From that moment on, a copy local to the prison will be
used.
A note about the implementation:
I chose to add a single pointer to struct prison, because I didn't like the
idea of changing struct prison every time I come up with a new variable. As
a side effect, the extra storage is only needed when a variable is set from
within the prison. This also minimizes kernel bloat when the Linuxulator is
not used; both compiled in or as a module.
Reviewed by: bde (first version only) and phk
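The copy-on-write behaviour and the single-pointer layout can be sketched in userland as follows. This is an illustration under assumptions, not the committed code: struct prison_sketch, struct linux_vars, get_osname() and set_osname() are hypothetical, only osname is modelled, and a NULL prison stands for an unjailed process:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define OSNAME_MAX 32

/* All Linuxulator variables live behind one pointer (only osname here). */
struct linux_vars {
    char osname[OSNAME_MAX];
};

/* Stand-in for struct prison: a single extra pointer, NULL until first set. */
struct prison_sketch {
    struct linux_vars *linux_data;
};

static struct linux_vars global_vars = { "Linux" };  /* system-wide value */

/* Read: use the prison's private copy if it has one, else the global. */
static const char *get_osname(const struct prison_sketch *pr)
{
    if (pr != NULL && pr->linux_data != NULL)
        return pr->linux_data->osname;
    return global_vars.osname;
}

/* Write: the first write from inside a prison allocates a private copy. */
static int set_osname(struct prison_sketch *pr, const char *name)
{
    assert(name != NULL);
    if (strlen(name) >= OSNAME_MAX)
        return -1;
    if (pr == NULL) {                       /* not jailed: set global */
        strcpy(global_vars.osname, name);
        return 0;
    }
    if (pr->linux_data == NULL) {
        pr->linux_data = malloc(sizeof(*pr->linux_data));
        if (pr->linux_data == NULL)
            return -1;
        *pr->linux_data = global_vars;      /* copy on first write */
    }
    strcpy(pr->linux_data->osname, name);
    return 0;
}
```

Adding another variable only grows struct linux_vars; the prison structure itself keeps its single pointer, and no storage is allocated for prisons that never set anything.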
Change the ELF registration/unregistration scheme to be less error prone.
Adding a new brand requires a single addition to linux_brandlist instead of
modifying linux_load(), linux_unload(), and linux_elf_init().
Approved by: jkh
Reviewed by: msmith
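The table-driven idea can be sketched like this: one NULL-terminated list is walked by both the load and unload paths, so a new brand is a one-line addition. Everything here is a hypothetical userland model (struct brand_sketch, brand_a, brand_b, insert_brand(), remove_brand()), not the real ELF brand interface:

```c
#include <assert.h>
#include <stddef.h>

struct brand_sketch {
    const char *name;
    int registered;
};

static struct brand_sketch brand_a = { "brand_a", 0 };
static struct brand_sketch brand_b = { "brand_b", 0 };

/* The single place to touch when adding a brand. */
static struct brand_sketch *brandlist[] = {
    &brand_a,
    &brand_b,
    NULL
};

static int insert_brand(struct brand_sketch *b)
{
    assert(b != NULL);
    if (b->registered)
        return -1;
    b->registered = 1;
    return 0;
}

static int remove_brand(struct brand_sketch *b)
{
    assert(b != NULL);
    if (!b->registered)
        return -1;
    b->registered = 0;
    return 0;
}

/* Load and unload both walk the same list. */
static int load_all(void)
{
    int error = 0;

    for (struct brand_sketch **p = brandlist; *p != NULL; p++) {
        if (insert_brand(*p) < 0)
            error = -1;
    }
    return error;
}

static int unload_all(void)
{
    int error = 0;

    for (struct brand_sketch **p = brandlist; *p != NULL; p++) {
        if (remove_brand(*p) < 0)
            error = -1;
    }
    return error;
}

static int count_registered(void)
{
    int n = 0;

    for (struct brand_sketch **p = brandlist; *p != NULL; p++)
        n += (*p)->registered;
    return n;
}
```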
leftover files in /tmp. Script slightly modified from PR version
to use fewer processes.
PR: i386/7725
Submitted by: Stefan Eggers seggers@semyam.dinoco.de
not actually work for cross compiling, but that is another problem.)
Honor LDFLAGS for building internal tools. (Tools should normally
be built static to avoid problems with picking up target shared
libraries. bsd.kmod doesn't set -static yet, and has some problems
with `LDFLAGS=-static ...' in the environment.)
Use the name argument almost the same in all LKM types. Maintain
the current behavior for the external (e.g., modstat) name for DEV,
EXEC, and MISC types being #name ## "_mod" and SYSCALL and VFS only
#name. This is a candidate for change and I vote just the name without
the "_mod".
Change the DISPATCH macro to MOD_DISPATCH for consistency with the
other macros.
Add an LKM_ANON #define to eliminate the magic -1 and associated
signed/unsigned warnings.
Add MOD_PRIVATE to support wcd.c's poking around in the lkm structure.
Change source in tree to use the new interface.
Reviewed by: Bruce Evans
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
were declared as non-const. This is backwards (_lkm_exec() changes the
pointers but all the target `struct execsw's are const). Fixed this
and poisoned related declarations to match and removed the bogus casts
that hid the bug.
too late to be used in all cases. It should probably be created (early)
in bsd.kmod.mk for all LKMs.
Use cc instead of cpp | as for the same reasons as in the kernel makefile.
CFLAGS isn't split up as well as in the kernel makefile, but cc doesn't
pass compiler warning flags to cpp, so there is no need to split it.
Compile and link a new kernel, that will give native ELF support, and
provide the hooks for other ELF interpreters as well.
To make native ELF binaries use John Polstra's elf-kit-1.0.1.
For the time being also use his ld-elf.so.1 and put it in
/usr/libexec.
The Linux emulator has been enhanced to also run ELF binaries, it
is however in its very first incarnation.
Just get some Linux ELF libs (Slackware-3.0) and put them in the
proper place (/compat/linux/...).
I've been able to run all the Slackware-3.0 binaries I've tried
so far.
(No it won't run quake yet :)
from a string to an identifier so that it can be used to generate
declarations and strings. It's much easier to stringize an identifier
than to identifize a string. A uniform naming scheme must be used
for the automatically generated things to apply. This is a feature.
Used the module identifier to generate prototypes for the module load,
unload and stat functions. Removed the few prototypes for these that
already existed.
Used the module identifier to generate a unique struct tag in MOD_DEV().
This should probably be done for all the MOD_*() macros.
Moved the trailing semicolon from the MOD_*() macro definitions to the
macro invocations that didn't already (bogusly) have it.
Staticized the module load and unload functions.
Added function return types for the module load, unload and stat functions.
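Why an identifier is more useful than a string can be shown with a small macro sketch. MOD_SKETCH() and the module name demo are hypothetical and far simpler than the real MOD_*() macros; the point is only that one identifier yields both prototypes (via ##) and the name string (via #):

```c
#include <assert.h>
#include <string.h>

/*
 * From one identifier, generate prototypes for the load and unload
 * functions and the external module name string. The trailing semicolon
 * is left to the invocation.
 */
#define MOD_SKETCH(name)                                \
    static int name##_load(void);                       \
    static int name##_unload(void);                     \
    static const char name##_modname[] = #name "_mod"

MOD_SKETCH(demo);

static int demo_loaded;

/* The definitions must follow the uniform naming scheme to match. */
static int demo_load(void)
{
    assert(!demo_loaded);
    demo_loaded = 1;
    return 0;
}

static int demo_unload(void)
{
    assert(demo_loaded);
    demo_loaded = 0;
    return 0;
}
```

Going the other way, from the string "demo_mod" to the identifier demo_load, is not something the preprocessor can do, which is the asymmetry the text refers to.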
lkm/ibcs2/ibcs2.c:
Included <sys/sysproto.h> to get everything prototyped.
Cleaned up #includes.
lkm/ibcs2/ipfw.c:
Cleaned up #includes.
lkm/linux/linux.c:
The module name had to change from "linux_emulator" to "linux_mod" to
be automatically generated.
Cleaned up #includes.
lkm/syscons/*/*_saver.c:
Completed declarations of function pointers.
sys/i386/isa/atapi.c:
The module name had to change from "atapi" to "atapi_mod" to be
automatically generated.
sys/i386/isa/wcd.c:
Used the fixed MOD_DEV(). This module has two devices and expanded the
macro in the source instead of fixing it.
The module names had to change from "wcd" and "rwcd" to "wcd_mod" and
"rwcd_mod" to be automatically generated.
sys/pccard/pcic.c:
The module name had to change from "pcic" to "pcic_mod" to be
automatically generated.
convention of having their entry point named "<modname>_mod".
Symorder is enforcing this when the current bsd.kmod.mk is installed.
I've not tested all these, but at least they all compile now.
Reattach them to the makefile.
Note that the change that I made to symorder needs to be compiled and
installed before any LKM's will work - the last version was corrupting
the relocation tables. A "make world" will do this, but if you
manually run a make on the lkm's you'll need to take care of it by
hand.
This first shot only incorporates enough functionality that DOOM
can run (the X version); signal handling is VERY weak, and so are many
other things. But it meets my milestone number one (you guessed it
- running DOOM).
Uses /compat/linux as prefix for loading shared libs, so it won't
conflict with our own libs.
Kernel must be compiled with "options COMPAT_LINUX" for this to work.
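The prefixing itself amounts to simple path construction, sketched below. linux_prefixed_path() is a hypothetical helper, not the kernel code, and the sketch deliberately says nothing about what happens when the prefixed file does not exist:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define LINUX_PREFIX "/compat/linux"

/*
 * Build the path under the emulation prefix for an absolute path.
 * Returns 0 on success, -1 if the path is relative or does not fit.
 */
static int linux_prefixed_path(const char *path, char *out, size_t outlen)
{
    int n;

    assert(path != NULL && out != NULL);
    if (path[0] != '/')
        return -1;
    n = snprintf(out, outlen, "%s%s", LINUX_PREFIX, path);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return 0;
}
```

A Linux binary asking for /lib/libc.so.5 would thus be pointed at /compat/linux/lib/libc.so.5 rather than at a native library of the same name.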