10885 Commits

Author SHA1 Message Date
ru
a11da36d0e Refine previous revision to allow acpi_wakecode.h to be safely built
from both the acpi module build directory and a kernel build directory.
The latter didn't work when one attempted to build a kernel which had
"device acpi" with the "make kernel-toolchain buildkernel" command
because a cross-compiler couldn't find anything in the standard system
include path (it's empty in the kernel-toolchain case).

Fix this by passing a better root path to kernel headers (src/sys)
which works for both cases, kernel and module (-I@ only worked for
module).

Also, while here, pass -nostdinc (and a different spelling for icc) --
it's a feature that the kernel source tree is self-contained, and this
change enforces this.

Reported by:	glebius
2006-09-06 14:23:40 +00:00
sobomax
6c1dd2ff20 By default, FreeBSD "disables" hyper-threading cores by not scheduling
any threads on them. However, it still counts those cores as "active but
permanently idle" when calculating system-wide CPU statistics. This is
incorrect: it skews the statistics quite a bit and creates real problems
for certain types of applications (monitoring applications, for example)
by making them believe that the system has enough idle CPU resources
when in fact it does not.

Correct the problem by not calling performance counting routines on "disabled"
cores. The cleaner solution would be to disable APIC timer interrupts on
those cores completely, but ENOTIME here, and it is not clear whether the
additional complexity is really worth the minor performance gain.

Reviewed by:	ssouhlal
Sponsored by:	Sippy Software, Inc.
MFC after:	2 weeks
2006-09-05 17:15:24 +00:00
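The effect described in 6c1dd2ff20 can be illustrated with a small model. This is only a sketch in Python, not the kernel change itself: the tuple layout and the `system_idle_fraction` helper are hypothetical, and the tick counts are made up. It shows why counting a never-scheduled hyper-threading core as idle inflates the system-wide idle figure.

```python
def system_idle_fraction(cpus, count_disabled):
    """Return the idle fraction summed over CPUs.

    cpus: list of (busy_ticks, idle_ticks, disabled) tuples (hypothetical layout).
    count_disabled: include "disabled" cores in the totals, as before the change.
    """
    busy = idle = 0
    for busy_ticks, idle_ticks, disabled in cpus:
        if disabled and not count_disabled:
            continue  # skip statistics for cores that never run threads
        busy += busy_ticks
        idle += idle_ticks
    total = busy + idle
    return idle / total if total else 0.0


# One heavily loaded core plus one "disabled" hyper-threading sibling.
cpus = [(90, 10, False), (0, 100, True)]
old_view = system_idle_fraction(cpus, count_disabled=True)
new_view = system_idle_fraction(cpus, count_disabled=False)
```

With these made-up numbers the old accounting reports the system as more than half idle, while the new accounting reports it as nearly saturated.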
davidxu
87b5aa08ee Implement casuword32, compare and set user integer. Thanks to Marcel Moolenaar,
who wrote the IA64 version of casuword32.
2006-08-28 02:28:15 +00:00
netchild
c1c941b5f5 Fix video playing and network connections in realplayer (and most likely
other stuff) in the osrelease=2.6.16 case:
 - implement CLONE_PARENT semantic
 - fix TLS loading in clone CLONE_SETTLS
 - lock proc in the currently disabled part of CLONE_THREAD

I suggest not unloading the linux module after testing this: there are
some "<defunct>" processes hanging around after exiting (they aren't
with osrelease=2.4.2) and they may panic your kernel when unloading the
linux module. They are in state Z and some of them consume CPU according
to ps. But I don't trust the CPU part: the idle threads get too much CPU
for this to be possible (adding up idle, X and 2 defunct processes
results in 104.7%, which looks too much to be a rounding error).

Noticed by:	Intron <mag@intron.ac>
Submitted by:	rdivacky (in collaboration with Intron)
Tested by:	Intron, netchild
Reviewed by:	jhb (previous version)
2006-08-27 18:51:32 +00:00
netchild
ac9f0aa27b regen 2006-08-27 08:58:00 +00:00
netchild
33681b868d Add the linux statfs64 call. This allows Tivoli backup to proceed a little
bit further on -current (still not successful, but a step in the right
direction).

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Tested by:	Paul Mather <paul@gromit.dlib.vt.edu>
2006-08-27 08:56:54 +00:00
netchild
fedc5604a0 Emulate what vfork does instead of using it in linux_vfork. This way
we can do the stuff we need to do with linux processes at fork and
don't panic the kernel at exit of the child.

Submitted by:	rdivacky
Tested with:	tst-vfork* (glibc regression tests)
Tested by:	netchild
2006-08-25 11:59:56 +00:00
netchild
81450589e7 Get rid of some nested includes.
Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Noticed by:	jhb
2006-08-19 15:13:01 +00:00
netchild
5d552cdc47 Move some stuff into headers where they belong.
Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Noticed by:	jhb, ssouhlal
2006-08-17 21:06:48 +00:00
netchild
39fd1c6d47 Style fixes to comments.
Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
Noticed by:	jhb, ssouhlal
2006-08-16 18:54:51 +00:00
pav
e678605a6e - Fix typo in #error pragma: compitable -> compatible
Submitted by:	neologism
2006-08-15 20:10:49 +00:00
jhb
d900df3c77 Regen to propagate <prefix>_AUE_<mumble> changes as well as the earlier
systrace changes.
2006-08-15 17:37:01 +00:00
jhb
61e1e0725a - Remove unused sysvec variables from various syscalls.conf.
- Send the systrace_args files for all the compat ABIs to /dev/null for
  now.  Right now makesyscalls.sh generates a file with a hardcoded
  function name, so it wouldn't work for any of the ABIs anyway.  Probably
  the function name should be configurable via a 'systracename' variable
  and the functions should be stored in a function pointer in the sysvec
  structure.
2006-08-15 17:25:55 +00:00
imp
0f33eed4fd No need for opt_global.h here 2006-08-15 15:48:58 +00:00
netchild
67b50487d2 Remove the include of opt_global.h. It's included globally by a command
line switch. Other files which may make the same mistake (according to
fxr.watson.org) but aren't fixed in this commit (people with more clue
about those files should fix this):
 - i386/xbox/xbox.c
 - arm/arm/elf_trampoline.c
 - arm/arm/mem.c

Noticed by:	cognet
2006-08-15 15:27:13 +00:00
netchild
09751738bc Add include of opt_global.h, else the futex operations aren't locked on
SMP systems.

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
2006-08-15 13:45:39 +00:00
netchild
133c6ea862 add autogenerated systrace_args stuff for dtrace 2006-08-15 12:56:36 +00:00
netchild
ec2ba5d85d Add the linux 2.6.x stuff (not used by default!):
 - TLS - complete
 - pid/tid mangling - complete
 - thread area - complete
 - futexes - complete with issues
 - clone() extension - complete with some possible minor issues
 - mq*/timer*/clock* stuff - complete but untested and the mq* stuff is
   disabled when not built as part of the kernel with native FreeBSD mq*
   support (module support for this will come later)

Tested with:
 - linux-firefox - works, tested
 - linux-opera - works, tested
 - linux-realplay - doesn't work, issue with futexes
 - linux-skype - doesn't work, issue with futexes
 - linux-rt2-demo - works, tested
 - linux-acroread - doesn't work, unknown reason (coredump) and sometimes
   issue with futexes
 - various unix utilities in linux-base-gentoo3 and linux-base-fc4:
   everything tried worked

On amd64 not everything that works on i386 is supported yet; the catch-up is planned for
later, when the remaining bugs in the new functions are fixed.

To test this new stuff, you have to run
	sysctl compat.linux.osrelease=2.6.16
to switch back use
	sysctl compat.linux.osrelease=2.4.2

Don't switch while running a linux program, strange things may or may not
happen.

Sponsored by:			Google SoC 2006
Submitted by:			rdivacky
Some suggestions/help by:	jhb, kib, manu@NetBSD.org, netchild
2006-08-15 12:54:30 +00:00
netchild
e8cb5b5578 regen 2006-08-15 12:51:45 +00:00
netchild
fd333609bf Add new syscalls in the linuxolator (only used when the sysctl
compat.linux.osrelease is changed to "2.6.16" or similar).

On amd64 not everything that works on i386 is supported yet; the catch-up is planned for
later, when the remaining bugs in the new functions are fixed.

Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
2006-08-15 12:28:14 +00:00
netchild
3a1395fb43 - Add some ASM stuff needed for futexes (linuxolator).
Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
With help from:	kib
2006-08-15 12:14:36 +00:00
alc
4fcd318e7c Eliminate an unnecessary initialization from trap_pfault() that also
happens to contain a style error.
2006-08-14 19:53:53 +00:00
jhb
98def9ff62 Don't try to preserve PAT bits in pmap_enter(). We currently only use PAT bits on pages that
aren't mapped via pmap_enter() (KVA).  We will eventually support PAT bits
on user pages, but those will require some sort of MI caching mode stored
in the vm_page.

Reviewed by:	alc
2006-08-14 15:39:41 +00:00
jhb
ce9f8963fd First pass at allowing memory to be mapped using cache modes other than
WB (write-back) on x86 via control bits in PTEs and PDEs (including making
use of the PAT MSR).  Changes include:
- A new pmap_mapdev_attr() function for amd64 and i386 which takes an
  additional parameter (relative to pmap_mapdev()) specifying the cache
  mode for this mapping.  Note that on amd64 only WB mappings are done with
  the direct map, all other modes result in a private mapping.
- pmap_mapdev() on i386 and amd64 now defaults to using UC (uncached)
  mappings rather than WB.  Previously we relied on the BIOS setting up
  MTRR's to enforce memio regions being treated as UC.  This might make
  hw.cbb_start_memory unnecessary in some cases now for example.
- A new pmap_mapbios()/pmap_unmapbios() API has been added to allow places
  that used pmap_mapdev() to map non-device memory (such as ACPI tables)
  to do so using WB as before.
- A new pmap_change_attr() function for amd64 and i386 that changes the
  caching mode for a range of KVA.

Reviewed by:	alc
2006-08-11 19:22:57 +00:00
netchild
1f1a93f2ab Add some more errno mappings (bsd -> linux) and a comment about the status.
Submitted by:	"Intron" <mag@intron.ac>
2006-08-10 22:05:25 +00:00
imp
b7167a2ca5 Eliminate one set of XBOX #ifdefs. The Xbox code just needs to set a
different TIMER_FREQ value than default.  Accomplish this via the
config file rather than via an #ifdef.
2006-08-09 23:47:38 +00:00
imp
af62585f2d Minor style(9) nit. 2006-08-09 23:37:30 +00:00
njl
6b5ea55333 If a beep was enabled, turn it off 3 seconds after resume.
MFC after:	3 days
2006-08-08 01:30:54 +00:00
alc
99dcbcf3fd Eliminate the acquisition and release of the page queues lock around a call
to vm_page_sleep_if_busy().
2006-08-06 06:29:16 +00:00
mr
77027a81d4 Don't overwrite cpu_model in the case of VIA's C3 CPU.
Noticed by:  Mike Tancsa
MFC after:	2 days
2006-08-04 13:49:16 +00:00
yar
209e4786e7 Commit the results of the typo hunt by Darren Pilgrim.
This change affects documentation and comments only,
no real code involved.

PR:		misc/101245
Submitted by:	Darren Pilgrim <darren pilgrim bitfreak org>
Tested by:	md5(1)
MFC after:	1 week
2006-08-04 07:56:35 +00:00
alc
a152234cf9 Complete the transition from pmap_page_protect() to pmap_remove_write().
Originally, I had adopted sparc64's name, pmap_clear_write(), for the
function that is now pmap_remove_write().  However, this function is more
like pmap_remove_all() than like pmap_clear_modify() or
pmap_clear_reference(), hence, the name change.

The higher-level rationale behind this change is described in
src/sys/amd64/amd64/pmap.c revision 1.567.  The short version is that I'm
trying to clean up and fix our support for execute access.

Reviewed by: marcel@ (ia64)
2006-08-01 19:06:06 +00:00
obrien
040ba91ea8 Correct spelling of 3DNow!. 2006-08-01 01:23:39 +00:00
marcel
7067faff16 Remove sio(4) and related options from MI files to amd64, i386
and pc98 MD files. Remove nodevice and nooption lines specific
to sio(4) from ia64, powerpc and sparc64 NOTES. There were no
such lines for arm yet.
sio(4) is usable on less than half the platforms, not counting
a future mips platform. Its presence in MI files is therefore
increasingly becoming a burden.
2006-07-29 18:38:54 +00:00
jhb
3a707d012d Retire SYF_ARGMASK and remove both SYF_MPSAFE and SYF_ARGMASK. sy_narg is
now back to just being an argument count.
2006-07-28 20:22:58 +00:00
jhb
dee1b3da95 Regen for MPSAFE flag removal. 2006-07-28 19:08:37 +00:00
jhb
c62c38439f Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to
mark system calls as being MPSAFE:
- Stop conditionally acquiring Giant around system call invocations.
- Remove all of the 'M' prefixes from the master system call files.
- Remove support for the 'M' prefix from the script that generates the
  syscall-related files from the master system call files.
- Don't explicitly set SYF_MPSAFE when registering nfssvc.
2006-07-28 19:05:28 +00:00
jhb
6a211b6d81 Various fixes to comments in the syscall master files including removing
cruft from the audit import and adding mention of COMPAT4 to freebsd32.
2006-07-28 18:55:18 +00:00
jhb
12302c47d0 Unify the checking for lock misbehavior in the various syscall()
implementations and adjust some of the checks while I'm here:
- Add a new check to make sure we don't return from a syscall in a critical
  section.
- Add a new explicit check before userret() to make sure we don't return
  with any locks held.  The advantage here is that we can include the
  syscall number and name in syscall() whereas that info is not available
  in userret().
- Drop the mtx_assert()'s of sched_lock and Giant.  They are replaced by
  the more general checks just added.

MFC after:	2 weeks
2006-07-27 22:32:30 +00:00
jhb
dc69447236 Argh, fix compile with XBOX enabled. Somehow I missed a LINT compile. :( 2006-07-27 22:19:02 +00:00
jhb
c95747d9a1 Don't allow MAXMEM or hw.physmem to extend the top of memory if our memory
map was obtained from the SMAP.  SMAP is trustworthy, and the memory
extending feature is a band-aid for older systems where FreeBSD's methods
of detecting memory were not always trustworthy.  This fixes the issue
where using hw.physmem could result in the ACPI tables getting trashed
breaking ACPI.

MFC after:	3 days
Tested on:	i386
2006-07-27 19:47:22 +00:00
yongari
9b54b752db Add stge(4) to the list of drivers supported by GENERIC kernel. 2006-07-25 01:06:32 +00:00
jhb
e96f2e292b Regen. 2006-07-21 20:41:33 +00:00
jhb
675c87997e - Pass the MPSAFE flag to namei() in linux_uselib() and handle conditional
Giant VFS locking in that function.
- Remove bogus code to handle the case where namei() returns success but a
  NULL vnode pointer.
- Note that this code duplicates exec_check_permissions() and annotate
  where it differs.
- Hold the vnode lock longer to protect the write to set VV_TEXT in
  v_vflag.
- Mark linux_uselib() MPSAFE.

Reviewed by:	rwatson
2006-07-21 20:22:13 +00:00
alc
004ef88e09 Add pmap_clear_write() to the interface between the virtual memory
system's machine-dependent and machine-independent layers.  Once
pmap_clear_write() is implemented on all of our supported
architectures, I intend to replace all calls to pmap_page_protect() by
calls to pmap_clear_write().  Why?  Both the use and implementation of
pmap_page_protect() in our virtual memory system has subtle errors,
specifically, the management of execute permission is broken on some
architectures.  The "prot" argument to pmap_page_protect() should
behave differently from the "prot" argument to other pmap functions.
Instead of meaning, "give the specified access rights to all of the
physical page's mappings," it means "don't take away the specified
access rights from all of the physical page's mappings, but do take
away the ones that aren't specified."  However, owing to our i386
legacy, i.e., no support for no-execute rights, all but one invocation
of pmap_page_protect() specifies VM_PROT_READ only, when the intent
is, in fact, to remove only write permission.  Consequently, a
faithful implementation of pmap_page_protect(), e.g., ia64, would
remove execute permission as well as write permission.  On the other
hand, some architectures that support execute permission have
basically ignored whether or not VM_PROT_EXECUTE is passed to
pmap_page_protect(), e.g., amd64 and sparc64.  This change represents
the first step in replacing pmap_page_protect() by the less subtle
pmap_clear_write() that is already implemented on amd64, i386, and
sparc64.

Discussed with: grehan@ and marcel@
2006-07-20 17:48:41 +00:00
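The difference between the two interfaces described in 004ef88e09 can be shown with plain bit masks. This is a minimal sketch in Python, not the pmap code: the `page_protect` and `clear_write` helpers are hypothetical stand-ins that model only the access-rights arithmetic described in the message, and the numeric values of the protection bits are illustrative.

```python
# Protection bits (values chosen for illustration).
VM_PROT_READ = 0x01
VM_PROT_WRITE = 0x02
VM_PROT_EXECUTE = 0x04


def page_protect(current, prot):
    """Model of the "faithful" semantics: keep only the rights named in prot."""
    return current & prot


def clear_write(current):
    """Model of the replacement: remove write permission and nothing else."""
    return current & ~VM_PROT_WRITE


rwx = VM_PROT_READ | VM_PROT_WRITE | VM_PROT_EXECUTE
after_protect = page_protect(rwx, VM_PROT_READ)  # execute is lost as well
after_clear = clear_write(rwx)                   # execute survives
```

Passing VM_PROT_READ alone to the first helper drops execute permission along with write, which is the surprise the message attributes to a faithful implementation such as ia64.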
alc
f0337456d9 MFamd64
pmap_clear_ptes() is already convoluted.  This will worsen with the
 implementation of superpages.  Eliminate it and add pmap_clear_write().

There are no functional changes.  Checked by: md5
2006-07-18 03:17:12 +00:00
alc
45cb178426 Now that free_pv_entry() accesses the pmap, call free_pv_entry() in
pmap_remove_all() before rather than after the pmap is unlocked.  At
present, the page queues lock provides sufficient synchronization.  In the
future, the page queues lock may not always be held when free_pv_entry() is
called.
2006-07-17 03:10:17 +00:00
alc
ae11c9115b MFamd64
Make three simplifications to pmap_ts_referenced():
   Eliminate an initialized but otherwise unused variable.
   Eliminate an unnecessary test.
   Exit the loop in a shorter way.
2006-07-16 21:05:58 +00:00
alc
5afff0eadf Eliminate the remaining uses of "register".
Convert the remaining K&R-style function declarations to ANSI-style.

Eliminate excessive white space from pmap_ts_referenced().
2006-07-16 19:43:49 +00:00
alc
8f169c00cb Make pc_freemask an array of uint32_t, rather than uint64_t. (I believe
that the use of the latter is simply an oversight in porting the new pv
entry code from amd64.)
2006-07-15 07:24:30 +00:00
jhb
df5064de23 Regen. 2006-07-14 15:42:47 +00:00
jhb
9b1ba3b554 Somewhat surprisingly, ibcs2_ioctl() is MPSAFE as it is without needing any
further fixes.
2006-07-14 15:42:21 +00:00
jhb
917f450cf6 Regen. 2006-07-14 15:31:01 +00:00
jhb
ebe022b0c4 Mark ibcs2_mount() (just returns EINVAL) and ibcs2_umount() (just calls
unmount(2)) MPSAFE.
2006-07-14 15:30:50 +00:00
jhb
6ae97a774e Regen. 2006-07-14 15:11:46 +00:00
jhb
e860523612 ibcs2_sigprocmask() is already marked MPSAFE in syscalls.xenix, so mark
it MPSAFE in syscalls.isc.
2006-07-14 15:11:20 +00:00
jkim
03e0206d84 Sync specialreg.h changes between amd64 and i386 with few fixes. 2006-07-13 16:09:40 +00:00
jhb
a72b0bcd7f Simplify the pager support in DDB. Allowing different db commands to
install custom pager functions didn't actually happen in practice (they
all just used the simple pager and passed in a local quit pointer).  So,
just hardcode the simple pager as the only pager and make it set a global
db_pager_quit flag that db commands can check when the user hits 'q' (or a
suitable variant) at the pager prompt.  Also, now that it's easy to do so,
enable paging by default for all ddb commands.  Any command that wishes to
honor the quit flag can do so by checking db_pager_quit.  Note that the
pager can also be effectively disabled by setting $lines to 0.

Other fixes:
- 'show idt' on i386 and pc98 now actually checks the quit flag and
  terminates early.
- 'show intr' now actually checks the quit flag and terminates early.
2006-07-12 21:22:44 +00:00
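The single-pager design described in a72b0bcd7f can be modelled in a few lines. This is a hedged sketch in Python rather than DDB code: the `Pager` class, its scripted key presses and the `show_table` command are hypothetical, with `quit` standing in for db_pager_quit and `lines` for $lines.

```python
class Pager:
    """Toy model of one hardcoded pager with a shared quit flag."""

    def __init__(self, lines, keys):
        self.lines = lines      # page height; 0 disables paging
        self.keys = iter(keys)  # scripted answers at the pager prompt
        self.shown = 0
        self.quit = False       # stands in for db_pager_quit
        self.output = []

    def emit(self, text):
        self.output.append(text)
        if self.lines == 0:
            return
        self.shown += 1
        if self.shown >= self.lines:
            # A full page has been printed: prompt, and record a 'q' answer.
            self.shown = 0
            if next(self.keys, "") == "q":
                self.quit = True


def show_table(pager, rows):
    """A command that honors the quit flag by checking it after each row."""
    for row in rows:
        pager.emit(row)
        if pager.quit:
            break


pager = Pager(lines=2, keys=["", "q"])
show_table(pager, [f"row {i}" for i in range(10)])
```

A command that never looks at the flag would still be paged in this model but would keep printing after 'q', which matches the distinction the message draws between paging by default and honoring the quit flag.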
mr
0130801813 Initialise (if necessary) the VIA C3/C7 features.
Store the capabilities for further use by random(4), padlock(4), ...

Obtained from:	mostly OpenBSD
MFC after:	1 week
2006-07-12 19:46:08 +00:00
mr
cb9048aebc fix typo in identcpu.c and add one define to specialreg.h.
MFC after:	1 week
2006-07-12 16:52:56 +00:00
mr
83b3720abd First step to identify and initialize the newer VIA C7 CPU
as found in a VIA EPIA EN-15000 board.

Obtained from:	large parts from OpenBSD
2006-07-12 14:52:32 +00:00
jkim
3117fa3da4 Add two new CPUID bits for AMD CPUs, i. e., SVM and extended APIC register. 2006-07-12 06:04:12 +00:00
jhb
286a0ec5a8 Regen. 2006-07-11 20:55:23 +00:00
jhb
9569e81b84 - Add conditional VFS Giant locking to getdents_common() (linux ABIs),
ibcs2_getdents(), ibcs2_read(), ogetdirentries(), svr4_sys_getdents(),
  and svr4_sys_getdents64() similar to that in getdirentries().
- Mark ibcs2_getdents(), ibcs2_read(), linux_getdents(), linux_getdents64(),
  linux_readdir(), ogetdirentries(), svr4_sys_getdents(), and
  svr4_sys_getdents64() MPSAFE.
2006-07-11 20:52:08 +00:00
jhb
0e6d9ac511 Retire the stackgap macros from ibcs2 as they are no longer used. Push
the includes of <sys/exec.h> and <sys/sysent.h> down into the only files
that now need them.
2006-07-10 17:59:26 +00:00
jhb
3924866a78 Regen. 2006-07-10 15:55:38 +00:00
jhb
387e004f72 Mark ibcs2_msgsys(), ibcs2_semsys(), and ibcs2_shmsys() MPSAFE. 2006-07-10 15:55:17 +00:00
twinterg
3e2c62d319 Extend i4b to support CAPI manager based ISDN controllers (CAPI manager is part of
c4b, CAPI for BSD). This is a preparation to add CAPI for BSD to the source tree.

Approved by:	hm (mentor)
MFC after:	2 weeks
2006-07-09 21:16:06 +00:00
mjacob
6850136348 Make the firmware assist driver resident in
preparation for isp using it.

Reviewed by:	sam, max
2006-07-09 16:41:22 +00:00
mjacob
4574ebad6a If PAE is built w/o modules, make sure that isp(4)
has its firmware resident as well.
2006-07-09 16:38:58 +00:00
jhb
fff357912c Regen. 2006-07-08 20:14:34 +00:00
jhb
094306d69d - Split ioctl() up into ioctl() and kern_ioctl(). The kern_ioctl() assumes
that the 'data' pointer is already setup to point to a valid KVM buffer
  or contains the copied-in data from userland as appropriate (ioctl(2)
  still does this).  kern_ioctl() takes care of looking up a file pointer,
  implementing FIONCLEX and FIOCLEX, and calling fi_ioctl().
- Use kern_ioctl() to implement xenix_rdchk() instead of using the stackgap
  and mark xenix_rdchk() MPSAFE.
2006-07-08 20:12:14 +00:00
jhb
28bb163264 Use kern_connect() in spx_open() to avoid the need for the stackgap. I
also used kern_close() for simplicity though close(2) wasn't requiring
the use of the stackgap.
2006-07-08 20:05:04 +00:00
jhb
df27227bab - Split the IBCS2 ipc foosys() system calls up into subfunctions matching
the organization in svr4_ipc.c.
- Use kern_msgctl(), kern_semctl(), and kern_shmctl() instead of the
  stackgap.
2006-07-08 19:54:12 +00:00
jhb
9f226f3f9d Use ibsc2_key_t rather than key_t. 2006-07-08 19:52:49 +00:00
jhb
a63b63284f Regen. 2006-07-06 21:43:14 +00:00
jhb
4d231459c7 - Protect the list of linux ioctl handlers with an sx lock.
- Hold Giant while calling linux ioctl handlers for now as they aren't all
  known to be MPSAFE yet.
- Mark linux_ioctl() MPSAFE.
2006-07-06 21:42:36 +00:00
jhb
e216ca9f3b Regen. 2006-07-06 21:33:14 +00:00
jhb
54c687571c Add kern_setgroups() and kern_getgroups() and use them to implement
ibcs2_[gs]etgroups() rather than using the stackgap.  This also makes
ibcs2_[gs]etgroups() MPSAFE.  Also, it cleans up one bit of weirdness in
the old setgroups() where it allocated an entire credential just so it had
a place to copy the group list into.  Now setgroups just allocates a
NGROUPS_MAX array on the stack that it copies into and then passes to
kern_setgroups().
2006-07-06 21:32:20 +00:00
jhb
6fe08fdbd3 Use the regular poll(2) function to implement poll(2) for the IBCS2 compat
ABI as FreeBSD's poll(2) is ABI compatible.  The ibcs2_poll() function
attempted to implement poll(2) using a wrapper around select(2).  Besides
being somewhat ugly, it also had at least one bug in that instead of
allocating complete fdset's on the stack via the stackgap it just allocated
pointers to fdsets.
2006-07-06 21:29:05 +00:00
davidxu
41e65e69dc Temporarily remove SCHED_CORE; it seems I have a lot of other work I can do now,
one example being POSIX priority mutexes for libthr.
2006-07-05 02:32:55 +00:00
alc
4748a85152 Correct an error in the new pmap_collect(), thus only affecting HEAD.
Specifically, the pv entry was always being freed to the caller's pmap
instead of the pmap to which the pv entry belongs.
2006-07-02 18:22:47 +00:00
rink
6131479c85 Updated the XBOX kernel to use the new nfe(4) driver obtained from
OpenBSD. This driver seems to give a small performance increase, and
should lead to better maintainability in the future.

The nForce Ethernet-specific hack in sys/i386/xbox/xbox.c is still
required, judging from dev/nfe/if_nfe.c. The condition it hacks will
almost certainly only occur on XBOX-es anyway, so it is best left there.

Approved by:	imp (mentor)
2006-06-27 20:22:32 +00:00
jhb
693417c025 Regen. 2006-06-27 18:32:16 +00:00
jhb
dff69a853e - Add a kern_semctl() helper function for __semctl(). It accepts a pointer
to a copied-in copy of the 'union semun' and a uioseg to indicate which
  memory space the 'buf' pointer of the union points to.  This is then used
  in linux_semctl() and svr4_sys_semctl() to eliminate use of the stackgap.
- Mark linux_ipc() and svr4_sys_semsys() MPSAFE.
2006-06-27 18:28:50 +00:00
jhb
db4d1f72c7 Regen. 2006-06-27 14:47:08 +00:00
jhb
5ceeece21b - Expand the scope of Giant some in mount(2) to protect the vfsp structure
from going away.  mount(2) is now MPSAFE.
- Expand the scope of Giant some in unmount(2) to protect the mp structure
  (or rather, to handle concurrent unmount races) from going away.
  umount(2) is now MPSAFE, as well as linux_umount() and linux_oldumount().
- nmount(2) and linux_mount() were already MPSAFE.
2006-06-27 14:46:31 +00:00
alc
49b81721c7 Correct a very old and very obscure bug: vmspace_fork() calls
pmap_copy() if the mapping is VM_INHERIT_SHARE.  Suppose the mapping
is also wired.  vmspace_fork() clears the wiring attributes in the vm
map entry but pmap_copy() copies the PG_W attribute in the PTE.  I
don't think this is catastrophic.  It blocks pmap_remove_pages() from
destroying the mapping and corrupts the pmap's wiring count.

This revision fixes the problem by changing pmap_copy() to clear the
PG_W attribute.

Reviewed by: tegge@
2006-06-27 04:28:23 +00:00
obrien
5094b5a232 Add a pure open source nForce Ethernet driver, under BSDL.
This driver was ported from OpenBSD by Shigeaki Tagashira
<shigeaki@se.hiroshima-u.ac.jp> and posted at
http://www.se.hiroshima-u.ac.jp/~shigeaki/software/freebsd-nfe.html
It was additionally cleaned up by me.
It is still a work-in-progress and thus is purposefully not in GENERIC.
And it conflicts with nve(4), so only one should be loaded.
2006-06-26 23:41:07 +00:00
babkin
f0555f2de9 Backed out the change by request from rwatson.
PR:		kern/14584
2006-06-26 22:03:22 +00:00
jhb
368eefb9bf Regen. 2006-06-26 18:37:36 +00:00
jhb
ddfdf64e37 linux_brk() is MPSAFE. 2006-06-26 18:36:16 +00:00
alc
1b735c58a6 Eliminate a comment that became stale after revision 1.547. 2006-06-25 22:15:02 +00:00
babkin
3d8be823b0 The common UID/GID space implementation. It has been discussed on -arch
in 1999, and there are changes to the sysctl names compared to PR,
according to that discussion. The description is in sys/conf/NOTES.
Lines in the GENERIC files are added in commented-out form.
I'll attach the test script I've used to PR.

PR:		kern/14584
Submitted by:	babkin
2006-06-25 18:37:44 +00:00
alc
13b4d64335 Change get_pv_entry() such that the call to vm_page_alloc() specifies
VM_ALLOC_NORMAL instead of VM_ALLOC_SYSTEM when try is TRUE.  In other
words, when get_pv_entry() is permitted to fail, it no longer tries as
hard to allocate a page.

Change pmap_enter_quick_locked() to fail rather than wait if it is
unable to allocate a page table page.  This prevents a race between
pmap_enter_object() and the page daemon.  Specifically, an inactive
page that is a successor to the page that was given to
pmap_enter_quick_locked() might become a cache page while
pmap_enter_quick_locked() waits and later pmap_enter_object() maps
the cache page violating the invariant that cache pages are never
mapped.  Similarly, change
pmap_enter_quick_locked() to call pmap_try_insert_pv_entry() rather
than pmap_insert_entry().  Generally speaking,
pmap_enter_quick_locked() is used to create speculative mappings.  So,
it should not try hard to allocate memory if free memory is scarce.

Add an assertion that the object containing m_start is locked in
pmap_enter_object().  Remove a similar assertion from
pmap_enter_quick_locked() because that function no longer accesses the
containing object.

Remove a stale comment.

Reviewed by: ups@
2006-06-20 20:52:11 +00:00
netchild
64550de991 regen after change to syscalls.master 2006-06-20 20:41:29 +00:00
netchild
247b98ef25 Switch to using the DUMMY infrastructure instead of UNIMPL for the new
syscalls. This way there will be a log message printed to the console
(this time for real).

Note: UNIMPL should be used for syscalls we do not implement ever, e.g.
syscalls to load linux kernel modules.

Submitted by:	rdivacky
Sponsored by:	Google SoC 2006
P4 IDs:		99600, 99602
2006-06-20 20:38:44 +00:00
yar
7e90b114e3 We no longer need to disable interrupts in MD trap machinery
when we're about to call kdb_trap() because the latter MI
function can disable interrupts by itself now.

Pointed out by:	bde
X-MFC remark:	depends on kern/subr_kdb.c#1.18
Sponsored by:	RiNet (Cronyx Plus LLC)
2006-06-20 12:44:21 +00:00
davidxu
76a64a293b Style fix, use lowercase. 2006-06-19 07:55:29 +00:00
davidxu
9ef6a74011 Clear bit 22 in MSR IA32_MISC_ENABLE. According to the Intel documentation,
when bit 22 is set to 1, CPUID with EAX=0 returns a maximum
value of 3 in EAX[7..0]; when it is set to 0 (the default), CPUID with EAX=0
returns the number corresponding to the maximum standard function
supported. On my machine, the BIOS sets the bit to 1 for
compatibility with old operating systems, which causes a dual-core Pentium-D (two
physical cores) to be identified as hyperthreading (two logical
cores) by the function mp_topology().
2006-06-19 07:51:47 +00:00
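The register update in 9ef6a74011 amounts to clearing a single bit. The following is a sketch in Python of that arithmetic only: the constant name `MISC_ENABLE_LIMIT_CPUID` and the helper are hypothetical, the sample register value is made up, and real code would read and write the MSR with privileged instructions.

```python
# Bit 22 of IA32_MISC_ENABLE, as described in the commit message.
MISC_ENABLE_LIMIT_CPUID = 1 << 22  # hypothetical name for the mask


def clear_cpuid_limit(misc_enable):
    """Return the MSR value with bit 22 cleared and all other bits unchanged."""
    return misc_enable & ~MISC_ENABLE_LIMIT_CPUID


bios_value = MISC_ENABLE_LIMIT_CPUID | 0x1  # made-up value with the limit set
fixed_value = clear_cpuid_limit(bios_value)
```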
yar
2ace0191b7 Fix style while I'm here. 2006-06-18 12:13:49 +00:00
yar
e95f07384c The i386 "call" instruction works as follows: it pushes
the return address on the stack and only then "dereferences" %pc.
Therefore, in the case of a call to an invalid address, we arrive
to the trap handler with the invalid value in tf_eip.  This used
to prevent db_backtrace() from assigning the most recent and interesting
frame on the stack to the right spot in the right function, from
which the invalid call was attempted.

Try to detect and work around that by recovering the return address
from the stack.

The work-around requires the fault address be passed to db_backtrace().
Smuggle it as tf_err.

MFC after:	1 month
Sponsored by:	RiNet (Cronyx Plus LLC)
2006-06-18 12:07:00 +00:00
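The work-around in e95f07384c can be pictured with a toy trap frame. This is a sketch in Python under stated assumptions, not the db_backtrace() code: the `recover_pc` helper and the dictionary used as a stack are hypothetical, and the detection rule (saved tf_eip equals the fault address smuggled in tf_err) is an assumption about how "detect" might work.

```python
def recover_pc(tf_eip, tf_err, esp, stack):
    """Return (pc, esp) to start the backtrace from.

    stack maps addresses to 32-bit words (hypothetical model of memory).
    If the saved instruction pointer equals the fault address, assume the
    fault came from a call to an invalid address: the call already pushed
    its return address, so take the pc from the top of the stack.
    """
    if tf_eip == tf_err:
        return stack[esp], esp + 4
    return tf_eip, esp


stack = {0x1000: 0xC0123456}  # made-up return address inside the caller
bad_call = recover_pc(0xDEADBEEF, 0xDEADBEEF, 0x1000, stack)
ordinary = recover_pc(0xC0200000, 0x00000010, 0x1000, stack)
```

In the first case the backtrace starts at the caller that attempted the invalid call; in the second the saved instruction pointer is used unchanged.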
mjacob
5292755b7e Unbreak tinderbox: fix device_printf arg to accommodate different sizes
of vm_paddr_t in different contexts (e.g., PAE vs. non PAE).
2006-06-16 14:04:21 +00:00
yar
4d78a1cd9f Return -1 from db_numargs() if number of args couldn't be guessed.
Use this later to indicate in backtrace output that args shown are
uncertain.

Sponsored by: RiNet (Cronyx Plus LLC)
2006-06-16 11:49:37 +00:00
yar
f92d46b4f1 Guess the number of arguments to a function somewhat better.
Now GCC likes to stick a "mov %eax, %FOO" instruction before
"addl $BAR, %esp" if the function just called returns an int,
which is a very common case in the kernel.

Sponsored by: RiNet (Cronyx Plus LLC)
2006-06-16 11:14:54 +00:00
netchild
11681ee0b5 Remove COMPAT_43 from GENERIC (and other kernel configs). For amd64 there was
an explicit comment that it's needed for the linuxolator. This is not the
case anymore. For all other architectures there was only a "KEEP THIS".
I (and other people too) have been running a COMPAT_43-less kernel since it stopped being
necessary for the linuxolator. Roman has been running such a kernel for
a longer time. No problems so far. And I doubt other (newer than ia32
or alpha) architectures really depend on it.

This may result in a small performance increase for some workloads.

If the removal of COMPAT_43 results in a not working program, please
recompile it and all dependencies and try again before reporting a
problem.

The only place where COMPAT_43 is needed (as in: does not compile without
it) is in the (outdated/not usable since too old) svr4 code.

Note: this does not remove the COMPAT_43TTY option.

Nagging by:	rdivacky
2006-06-15 19:58:53 +00:00
ups
b3a7439a45 Remove mpte optimization from pmap_enter_quick().
There is a race with the current locking scheme and removing
it should have no measurable performance impact.
This fixes page faults leading to panics in pmap_enter_quick_locked()
on amd64/i386.

Reviewed by: alc,jhb,peter,ps
2006-06-15 01:01:06 +00:00
netchild
de5cf4e1bd regen after MFP4 (soc2006/rdivacky_linuxolator) of syscalls.master
P4-Changes:	similar to 98673 and 98675 but regenerated locally
Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
2006-06-13 18:48:30 +00:00
netchild
a561ebc3f4 MFP4 (soc2006/rdivacky_linuxolator)
Update of syscall.master:
	o	Adding of several new dummy syscalls (268-310)
	o	Synchronization of amd64 syscall.master with i386 one
	o	Auditing added to amd64 syscall.master
	o	Change auditing type for lstat syscall (bugfix). [1]

P4-Changes:	98672, 98674
Noticed by:	rwatson [1]
Sponsored by:	Google SoC 2006
Submitted by:	rdivacky
2006-06-13 18:43:55 +00:00
davidxu
82b666ed4a Add scheduler CORE, work I did half a year ago and recently
picked up again. The scheduler is forked from ULE, but the
algorithm to detect an interactive process is almost completely
different from ULE's; it comes from the Linux paper "Understanding the
Linux 2.6.8.1 CPU Scheduler", although I still use the same word
"score" for the priority boost as in the ULE scheduler.

Briefly, the scheduler has the following characteristics:
1. Timesharing process's nice value is seriously respected,
   timeslice and interaction detecting algorithm are based
   on nice value.
2. per-cpu scheduling queue and load balancing.
3. O(1) scheduling.
4. Some cpu affinity code in wakeup path.
5. Support POSIX SCHED_FIFO and SCHED_RR.
Unlike the 4BSD and ULE schedulers, which use the fuzzy RQ_PPQ, this
scheduler uses 256 priority queues. Unlike ULE, which uses both pull and
push, this scheduler uses only the pull method; the main reason is to let
a relatively idle CPU do the work. However, the whole scheduler is
currently protected by the big sched_lock, so the benefit is not visible;
it can actually be worse than nothing, because all other CPUs are locked
out while we are doing balancing work, a problem the 4BSD scheduler does
not have.
The scheduler does not support hyperthreading very well; in fact, it
does not distinguish between physical and logical CPUs. This should be
improved in the future. The scheduler has a priority inversion problem
on MP machines; it is not good for realtime scheduling, as it can cause
realtime processes to starve.
As a result, it seems that MySQL super-smack runs better on my
Pentium-D machine when using libthr, on both UP and SMP kernels.
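
The combination of 256 priority queues and O(1) scheduling described above
can be illustrated with a small user-space sketch. This is not the code
from this commit: the names (struct runq, runq_add, runq_choose) and the
convention that a lower number means a higher priority are assumptions
made for illustration only. A bitmap records which queues are non-empty,
so choosing the next task inspects a fixed number of words regardless of
how many tasks are runnable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NQUEUES 256                 /* one queue per priority level */
#define NWORDS  (NQUEUES / 32)      /* bitmap words covering all queues */

/* Hypothetical task record; prio 0 is assumed to be the highest. */
struct task {
    struct task *next;
    int prio;
};

/* Hypothetical run queue: a FIFO per priority plus a "non-empty" bitmap. */
struct runq {
    uint32_t bitmap[NWORDS];
    struct task *head[NQUEUES];
    struct task *tail[NQUEUES];
};

/* Append a task to the tail of its priority queue and mark the queue. */
static void runq_add(struct runq *rq, struct task *t)
{
    assert(t->prio >= 0 && t->prio < NQUEUES);
    t->next = NULL;
    if (rq->tail[t->prio] != NULL)
        rq->tail[t->prio]->next = t;
    else
        rq->head[t->prio] = t;
    rq->tail[t->prio] = t;
    rq->bitmap[t->prio / 32] |= (uint32_t)1 << (t->prio % 32);
}

/* Index of the lowest set bit; w must be non-zero (at most 31 steps). */
static int lowest_set_bit(uint32_t w)
{
    int i = 0;

    while ((w & 1) == 0) {
        w >>= 1;
        i++;
    }
    return i;
}

/*
 * Remove and return the best-priority task, or NULL if nothing is
 * runnable. The loop bound is NWORDS, a constant, so the cost does not
 * depend on the number of queued tasks.
 */
static struct task *runq_choose(struct runq *rq)
{
    for (int w = 0; w < NWORDS; w++) {
        if (rq->bitmap[w] == 0)
            continue;
        int prio = w * 32 + lowest_set_bit(rq->bitmap[w]);
        struct task *t = rq->head[prio];

        rq->head[prio] = t->next;
        if (rq->head[prio] == NULL) {
            rq->tail[prio] = NULL;
            rq->bitmap[w] &= ~((uint32_t)1 << (prio % 32));
        }
        return t;
    }
    return NULL;
}
```

With one queue per priority level, tasks of equal priority are served in
FIFO order and no priorities share a queue, which is the contrast with
the coarser RQ_PPQ grouping mentioned above.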
2006-06-13 13:12:56 +00:00
marius
9e60ac43b5 Make the ISAPNP code optional and only enable it on i386 and pc98 (used
for CBUS-PNP cards there) by default, as no amd64 or sparc64 machines
with ISA slots, which could therefore make use of this code, are known
to exist. For sparc64 this additionally allows getting rid of the
compat shims for in{b,w,l}()/out{b,w,l}() etc and the associated hacks.

OK'ed by:	imp, peter
2006-06-12 21:07:13 +00:00
jhb
3ec293f314 Enable a few more things in x86 NOTES to get broader LINT coverage:
- Turn on iwi(4), ipw(4), and ndis(4) on amd64 and i386.
- Turn on ral(4) and ural(4) on i386, pc98, and amd64.
2006-06-12 20:38:17 +00:00
alc
cbeb562815 Don't invalidate the TLB in pmap_qenter() unless the old mapping was valid.
Most often, it isn't.

Reviewed by: tegge@
2006-06-12 20:05:27 +00:00
imp
038d1db25e Add the ability to subset the devices that UART pulls in. This allows
the arm to compile without all the extras that don't appear, at least
not in the flavors of ARM I deal with.  This helps us save about 100k.

If I've botched the available devices on a platform, please let me
know and I'll correct ASAP.
2006-06-12 04:21:50 +00:00
njl
547b3085ee * Ask for a page-aligned page instead of an arbitrary address. This should
not be necessary but might be helpful and at least reduce fragmentation.
* Add an assert to detect if the wakecode ever grows too big.  We include
  1 KB for stack, which should be more than enough also.
* Remove unnecessary initialization of static variables.
* Add comments and a bootverbose print giving the page phys address.
2006-06-10 08:20:17 +00:00
njl
66b0070261 Minor tweaks to the resume code. Previous commit reverted alignment back
to 4.  There is no need to be more strict at assembly time since we copy
the code anyway to a private page.

* Clear the direction flag and eflags.  Probably not necessary but it won't
  hurt to be safe.
* Add prefixes to all instructions to prevent any assembler mistakes.
* Remove zeroing of eax - edi.  We use those registers immediately after
  to transfer values to protected mode so this was pointless.
* Update comments to reflect info found during code review.
2006-06-10 08:20:03 +00:00
njl
00c07c3991 Move the reset beep tunable/sysctl to debug.acpi.resume_beep. This makes
more sense than under hw.acpi.  Also, document this in the man page.
2006-06-10 08:06:16 +00:00
njl
b8f1ff9a05 Minor tweaks to the resume code that might help people debug.
* Add hw.acpi.resume_beep tunable and sysctl, default to 0.  Beeps the PC
speaker soon after waking to diagnose whether the wakeup code is even
getting run before other drivers possibly hang the system.  To stop the beep,
cause another beep (i.e. keyboard bell).  Submitted by takawata@, I changed
the frequency to be lower.

* Use 4096 instead of 4 byte alignment.  Might be useful although doesn't
seem to be necessary.

* Remove a useless assignment to acpi_reset_video.  It was overwritten by
the default sysctl value anyway.
2006-06-08 17:54:10 +00:00
alc
ff4adb11fe Introduce the function pmap_enter_object(). It maps a sequence of resident
pages from the same object.  Use it in vm_map_pmap_enter() to reduce the
locking overhead of premapping objects.

Reviewed by: tegge@
2006-06-05 20:35:27 +00:00
emaste
b9360f5c27 Fix cut-n-pasteo: use the i386 version #define for i386 dumps, not the amd64 one.
2006-06-05 18:21:29 +00:00
alc
efb5d1da26 MFamd64
Eliminate unnecessary, recursive acquisitions and releases of the page
 queues lock by free_pv_entry() and pmap_remove_pages().

 Reduce the scope of the page queues lock in pmap_remove_pages().
2006-06-05 06:08:21 +00:00
silby
89bd691dee After much discussion with mjacob and scottl, change bus_dmamem_alloc so
that it just warns the user with a printf when it misaligns a piece
of memory that was requested through a busdma tag.

Some drivers (such as mpt, and probably others) were asking for alignments
that could not be satisfied, but as far as driver operation was concerned,
that did not matter.  On the theory that other drivers will fall into
this same category, we agreed that panicking or making the allocation
fail will cause more hardship than is necessary.  The printf should
be sufficient motivation to get the driver glitch fixed.
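
The warn-instead-of-fail policy described above can be sketched as a
small user-space check. This is an illustration only, not the
bus_dmamem_alloc code: the function name check_dma_alignment and the
message text are assumptions, and the alignment is assumed to be a power
of two.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: report whether addr satisfies the requested
 * alignment. On a mismatch it only prints a warning and returns 0;
 * the caller keeps the allocation instead of panicking or failing.
 */
static int check_dma_alignment(uintptr_t addr, uintptr_t alignment)
{
    /* The mask test below is only valid for power-of-two alignments. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    if ((addr & (alignment - 1)) != 0) {
        printf("warning: allocation at %#lx is not aligned to %lu bytes\n",
            (unsigned long)addr, (unsigned long)alignment);
        return 0;
    }
    return 1;
}
```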
2006-06-01 04:49:29 +00:00
mjacob
1b7bd7c5ee Turn the panic on not being able to meet alignment constraints
in bus_dmamem_alloc into the more reasonable EINVAL return.

Also, reclaim memory allocated but then not used if we had
an error return.
2006-05-31 00:37:56 +00:00
davidxu
42175dc944 Clear invalid bits only if the CPU supports SSE; otherwise, some fields in
struct save87 will be cleared unexpectedly.
2006-05-31 00:17:29 +00:00
davidxu
fa9df4abe1 Use the method described in IA-32 Intel Architecture Software Developer's
Manual chapter 11.6.6 to get the valid mxcsr bits, and use the mxcsr mask to
clear invalid bits passed by user code.

Reviewed by: bde
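
A minimal sketch of the masking step, assuming the mask has already been
read from the MXCSR_MASK field of an FXSAVE image (obtaining that image
requires the fxsave instruction and is not shown). The function names
are hypothetical and this is not the code from this commit. The fallback
value 0x0000FFBF is the default mask the Intel manual specifies for
processors that report a mask of zero.

```c
#include <assert.h>
#include <stdint.h>

#define MXCSR_DEFAULT_MASK 0x0000FFBFu  /* used when the CPU reports 0 */

/*
 * Hypothetical helper: turn the raw MXCSR_MASK field from an FXSAVE
 * image into the mask of bits that may legally be set in mxcsr.
 */
static uint32_t mxcsr_valid_mask(uint32_t fxsave_mask_field)
{
    return fxsave_mask_field != 0 ? fxsave_mask_field : MXCSR_DEFAULT_MASK;
}

/*
 * Hypothetical helper: drop every bit of a user-supplied mxcsr value
 * that the mask does not allow, so that a later fxrstor does not fault.
 */
static uint32_t sanitize_mxcsr(uint32_t user_mxcsr, uint32_t valid_mask)
{
    return user_mxcsr & valid_mask;
}
```

Because every valid mask has its upper 16 bits clear, applying it also
removes the reserved high bits of a user-supplied value.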
2006-05-30 23:44:21 +00:00
davidxu
b60160771c Back out the changes that tried to inherit the floating-point
environment. Although POSIX (susv3) requires this, it is unclear what
should be inherited, and duplicating the whole 387 stack for a new thread
seems unnecessary and dangerous. Revert to the previous code and force a
new thread to be started with clean FP state.
2006-05-29 02:58:37 +00:00
silby
7f96e8451a Add a quick hack to ensure that bus_dmamem_alloc properly aligns
small allocations with large alignment requirements.

Add a panic to detect cases where we've still failed to properly align.
2006-05-28 18:30:36 +00:00
davidxu
dc6d8065e6 Clear the high 16 bits of the mxcsr register. According to the Intel
documentation, if the high 16 bits are non-zero, the fxrstor instruction
will generate a GP fault, resulting in a kernel crash. This bug can be
triggered by setcontext and ptrace(PT_SETXMMREGS).
2006-05-28 06:51:57 +00:00
davidxu
9cd9aeea7a PCB_NPXINITDONE is cleared by npx_fork_thread.
2006-05-28 04:47:56 +00:00
davidxu
03c7322bae If the parent thread never used the FPU, the only work needed is to
clear the PCB_NPXINITDONE flag for the new thread and let the trap code
initialize it.
2006-05-28 04:40:45 +00:00
davidxu
bf6b4844d3 When creating a new thread, inherit the floating-point environment from
the current thread; this is required by the POSIX pthread_create documentation.
2006-05-28 02:03:13 +00:00
imp
7854550aa7 APM was calling the suspend process from a timeout. This meant that
other timeouts could not happen while suspending, including timeouts
for things like msleep.  This caused the system to hang on suspend
when the cbb was enabled, since its suspend path powered down the
socket which used a timeout to wait for it to be done.

APM now creates a thread when it is enabled, and deletes the thread
when it is disabled.  This thread takes the place of the timeout by
doing its polling every ~.9s.  When the thread is disabled, it will
wake up early, otherwise it times out and polls the various things the
old timeout polled (APM events, suspend delays, etc).

This makes my Sony VAIO 505TS suspend/resume correctly when APM is
enabled (ACPI is black listed on my 505TS).

This will likely fix other problems with the suspend path where
drivers would sleep with msleep and/or do other timeouts.  Maybe
there's some special case code that would use DELAY while suspending
and msleep otherwise that can be revisited and removed.

This was also tested by glebius@, who pointed out that in the patch I
sent him, I'd forgotten apm_saver.c

MFC After: 3 weeks
2006-05-25 23:06:38 +00:00
sobomax
210b6777a4 Move clock_lock prototype into <machine/clock.h>, where it is more
appropriate.

Discussed with:	jhb
2006-05-19 18:53:50 +00:00
marius
70daffddff - Add C-bus and ISA front-ends for le(4) so it can actually replace
lnc(4) on PC98 and i386. The ISA front-end supports the same non-PNP
  network cards as lnc(4) did and additionally a couple of PNP ones.
  Like lnc(4), the C-bus front-end of le(4) only supports C-NET(98)S
  and is untested due to lack of such hardware, but given that it's
  based on the respective lnc(4) one and not too different from the ISA
  front-end it should be highly likely to work.
- Remove the descriptions of le(4), which were converted from lnc(4),
  from sys/i386/conf/NOTES and sys/pc98/conf/NOTES as there's a common
  one in sys/conf/NOTES.
2006-05-17 21:25:23 +00:00
marius
0d3e65af24 - As only the PCI front-end of le(4) is common to all platforms move its
entry to the PCI NICs section so it's in the same spot in all GENERIC
  config files.
- Add a note to the description of pcn(4) stating that it has precedence
  over le(4).
2006-05-17 20:44:01 +00:00
phk
537a82e24b Send the pcvt(4) driver off to retirement.
2006-05-17 09:33:15 +00:00
phk
ef310efff8 Since DELAY() was moved, most <machine/clock.h> #includes have been
unnecessary.
2006-05-16 14:37:58 +00:00
ru
c249b5bd38 Kill more references to lnc(4).
Submitted by:	grep(1)
2006-05-16 12:15:39 +00:00
marius
be5f202f36 Remove some remnants of lnc(4).
2006-05-14 18:49:25 +00:00
gnn
d1e0397ab9 Prefer the le device driver for Lance (AMD7990 et al) hardware over the
older, and less capable lnc driver.

Reviewed by:	imp
2006-05-14 01:40:41 +00:00
peter
c0cb1adae1 Test commit after repoman upgrade. Remove one of my many email addresses
from a copyright message.
2006-05-12 22:41:58 +00:00
peter
a7162e4983 Test commit after repoman upgrade. Remove one of my many email addresses
from a copyright message.
2006-05-12 22:38:53 +00:00
njl
1d3b84d7cb Add support for the VIA C7-M processor family.
Remove an unnecessary check of the table's bus clock.  CPUs that
support this feature export only the high/low settings via the MSR,
packed into 32 bits.

Hardware from:	Centaur Technologies
MFC after:	1 week
2006-05-11 17:35:44 +00:00
phk
5d8c57a08b Clean out sysctl machdep.* related defines.
The cmos clock related stuff should really be in MI code.
2006-05-11 17:29:25 +00:00
netchild
021fd75458 regen (linux rt_sigpending)
2006-05-10 18:19:51 +00:00
netchild
24c492f42c Implement rt_sigpending in the linuxolator.
PR:		92671
Submitted by:	Markus Niemistö <markus.niemisto@gmx.net>
2006-05-10 18:17:29 +00:00
sam
0b63676c43 make tinderbox happy: GENERIC got ath and wlan added so we need to
now mark these "nodevice" or we'll get undefined references
2006-05-10 05:19:21 +00:00
ambrisko
f7d4a6b03b Add in linsysfs, a Linux 2.6-like sys filesystem, to pacify the Linux
LSI MegaRAID SAS utility.

Sponsored by:		IronPort Systems
Man page help from:	brueffer
2006-05-09 22:27:01 +00:00
maxim
d447c4f045 o Add acpi_ibm to the build.
PR:		kern/96940
Submitted by:	Rong-En Fan
2006-05-07 20:13:18 +00:00
ambrisko
31b22ce017 Enhance the Linux emulation layer to make the MegaRAID SAS management tool happy.
Add back in a scheme to emulate old type major/minor numbers via hooks into
stat and linprocfs to return the major/minors that Linux apps expect.  Currently
only /dev/null is always registered.  Drivers can register via the Linux
type shim, similar to the ioctl shim, by using the
linux_device_register_handler/linux_device_unregister_handler functions.
The structure is:

    struct linux_device_handler {
        char    *bsd_driver_name;
        char    *linux_driver_name;
        char    *bsd_device_name;
        char    *linux_device_name;
        int     linux_major;
        int     linux_minor;
        int     linux_char_device;
    };

Linprocfs uses this to display the major number of the driver.  The
soon to be available linsysfs will use it to fill in the driver name.
Linux_stat uses it to translate the major/minor into Linux type values.

Note major numbers are dynamically assigned via passing in a -1 for
the major number so we don't need to keep track of them.
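
The -1 convention can be sketched as follows. This is a user-space
illustration under stated assumptions, not the code from this commit:
register_handler_sketch, MAX_HANDLERS and FIRST_DYNAMIC_MAJOR are
hypothetical names and values, and the real
linux_device_register_handler may have a different signature and
allocation policy. Only the structure layout is taken from the text
above.

```c
#include <assert.h>
#include <stddef.h>

/* Layout as given in the commit message above. */
struct linux_device_handler {
    char    *bsd_driver_name;
    char    *linux_driver_name;
    char    *bsd_device_name;
    char    *linux_device_name;
    int     linux_major;
    int     linux_minor;
    int     linux_char_device;
};

#define MAX_HANDLERS        16   /* hypothetical table size */
#define FIRST_DYNAMIC_MAJOR 200  /* hypothetical start of dynamic range */

static struct linux_device_handler *handlers[MAX_HANDLERS];
static int nhandlers;
static int next_major = FIRST_DYNAMIC_MAJOR;

/*
 * Hypothetical registration routine: a caller that passes -1 as the
 * major number receives the next free dynamic major, written back into
 * its own structure. Returns 0 on success and -1 on failure.
 */
static int register_handler_sketch(struct linux_device_handler *d)
{
    if (d == NULL || nhandlers >= MAX_HANDLERS)
        return -1;
    if (d->linux_major == -1)
        d->linux_major = next_major++;
    handlers[nhandlers++] = d;
    return 0;
}
```

A driver that does not care which major it gets fills in -1 and reads
the assigned value back from its structure after registering.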

This is somewhat needed due to us switching to our devfs.  MegaCli
will not run until I add in the linsysfs and mfi Linux compat changes.

Sponsored by:	IronPort Systems
2006-05-05 16:10:45 +00:00