a set of helper routines to deal with real-time clocks. The generic
functions access the clock diver using a kobj interface. This is intended
to reduce code reduplication and make it easy to support more than one
clock model on a single architecture.
This code is currently only used on sparc64, but it is planned to convert
the code of the other architectures to it later.
Unfortunately, this level doesn't really provide enough granularity. We
probably need several MI NOTES type files for things that are shared by
several architectures but not by all. For example, the PCI options could
live in a NOTES.pci.
This also updates the Makefile for i386 to generate LINT. The only changes
in the generated LINT are the order of various options.
Suggestions for improvement welcome.
following sysctl variables:
debug.mutex.prof.enable enable / disable profiling
debug.mutex.prof.acquisitions number of mutex acquisitions recorded
debug.mutex.prof.records number of acquisition points recorded
debug.mutex.prof.maxrecords max number of acquisition points
debug.mutex.prof.rejected number of rejections (due to full table)
debug.mutex.prof.hashsize hash size
debug.mutex.prof.collisions number of hash collisions
debug.mutex.prof.stats profiling statistics
The code records four numbers for each acquisition point (identified by
source file name and line number): longest time held, total time held,
number of non-recursive acquisitions, average time held. The measurements
are in clock cycles (as returned by get_cyclecount(9)); this may cause
measurements on some SMP systems to be unreliable. This can probably be
worked around by replacing get_cyclecount(9) by some incarnation of
nanotime(9).
This work was derived from initial patches by eivind.
dump the trace buffer feasible.
- Remove KTR_EXTEND. This changes the format of the trace entries when
activated, making writing a userland tool which is not tied to a specific
kernel configuration difficult.
- Use get_cyclecount() for timestamps. nanotime() is much too heavy weight
and requires recursion protection due to ktr traces occuring as a result
of ktr traces. KTR_VERBOSE may still require recursion protection, which
is now conditional on it.
- Allow KTR_CPU to be overridden by MD code. This is so that it is possible
to trace early in startup before pcpu and/or curthread are setup.
- Add a version number for the ktr interface. A userland tool can check this
to detect mismatches.
- Use an array for the parameters to make decoding in userland easier.
- Add file and line recording to the non-extended traces now that the extended
version is no more.
These changes will break gdb macros to decode the extended version of the
trace buffer which are floating around. Users of these macros should either
use the show ktr command in ddb, or use the userland utility which can be run
on a core dump.
Approved by: jhb
Tested on: i386, sparc64
I have not been able to find very much information about the PC98
extended partition layout so this is gleaned from the source in
our pc98 architecture. Corrections and patched very welcome.
Sponsored by: DARPA and NAI Labs.
disablement assumptions in kern_fork.c by adding another API call,
cpu_critical_fork_exit(). Cleanup the td_savecrit field by moving it
from MI to MD. Temporarily move cpu_critical*() from <arch>/include/cpufunc.h
to <arch>/<arch>/critical.c (stage-2 will clean this up).
Implement interrupt deferral for i386 that allows interrupts to remain
enabled inside critical sections. This also fixes an IPI interlock bug,
and requires uses of icu_lock to be enclosed in a true interrupt disablement.
This is the stage-1 commit. Stage-2 will occur after stage-1 has stabilized,
and will move cpu_critical*() into its own header file(s) + other things.
This commit may break non-i386 architectures in trivial ways. This should
be temporary.
Reviewed by: core
Approved by: core
"env name=value ... cmd ..." is just a pessimized way of doing
"name=value ... cmd ..." in real shells. Set the environment
(without using env(1)) before starting xargs so that env(1)
is not needed in "xargs env name=value ... cmd ..."
lint, so this is turned off by default. Setting WANT_LINT will turn
on generation of lint libraries for /usr/libdata/lint/*.ln.
Reviewd by: silence in -audit.
The detection code in this method is written so that it should work on
all architectures which means that you can plug a Sun disk into a i386
now and access the partitions.
We still need an endian-agnostic ufs/ffs before this is really
interresting, but the main focus was to get sparc64 onto the GEOM
trail.
The stat() and open() calls have been changed to make use of this new functionality. Using shared locks in
these cases is sufficient and can significantly reduce their latency if IO is pending to these vnodes. Also,
this reduces the number of exclusive locks that are floating around in the system, which helps reduce the
number of deadlocks that occur.
A new kernel option "LOOKUP_SHARED" has been added. It defaults to off so this patch can be turned on for
testing, and should eventually go away once it is proven to be stable. I have personally been running this
patch for over a year now, so it is believed to be fully stable.
Reviewed by: jake, obrien
Approved by: jake
and commenting of NETSMB, NETWMBCRYPTO, and SMBFS. In NOTES, they
had all floated to the bottom of the file with the list of seemingly
random and unclassified kernel options. This change moves them back
up to the network protocol and file system areas, and also documents
the dependencies.
This makes other power-management system (APM for now) to be able to
generate power profile change events (ie. AC-line status changes), and
other kernel components, not only the ACPI components, can be notified
the events.
- move subroutines in acpi_powerprofile.c (removed) to kern/subr_power.c
- call power_profile_set_state() also from APM driver when AC-line
status changes
- add call-back function for Crusoe LongRun controlling on power
profile changes for a example
device drivers for bus system with other endinesses than the CPU (using
interfaces compatible to NetBSD):
- bwap16() and bswap32(). These have optimized implementations on some
architectures; for those that don't, there exist generic implementations.
- macros to convert from a certain byte order to host byte order and vice
versa, using a naming scheme like le16toh(), htole16().
These are implemented using the bswap functions.
- stream bus space access functions, which do not perform a byte order
conversion (while the normal access functions would if the bus endianess
differs from the CPU endianess).
htons(), htonl(), ntohs() and ntohl() are implemented using the new
functions above for kernel usage. None of the above interfaces is currently
exported to user land.
Make use of the new functions in a few places where local implementations
of the same functionality existed.
Reviewed by: mike, bde
Tested on alpha by: mike
There is some unresolved badness that has been eluding me, particularly
affecting uniprocessor kernels. Turning off PG_G helped (which is a bad
sign) but didn't solve it entirely. Userland programs still crashed.
on for a while:
- fine grained TLB shootdown for SMP on i386
- ranged TLB shootdowns.. eg: specify a range of pages to shoot down with
a single IPI, since the IPI is very expensive. Adjust some callers
that used to trigger this inside tight loops to do a ranged shootdown
at the end instead.
- PG_G support for SMP on i386 (options ENABLE_PG_G)
- defer PG_G activation till after we decide what we are going to do with
PSE and the 4MB pages at the start of the kernel. This should solve
some rumored strangeness about stale PG_G entries getting stuck
underneath the 4MB pages.
- add some instrumentation for the fine TLB shootdown
- convert some asm instruction wrappers from functions to inlines. gcc
seems to do a fair bit better with this.
- [temporarily!] pessimize the tlb shootdown IPI handlers. I will fix
this again shortly.
This has been working fairly well for me for a while, but I have tweaked
it again prior to commit since my last major testing round. The only
outstanding problem that I know of is PG_G related, which is why there
is an option for it (not on by default for SMP). I have seen a world
speedups by a few percent (as much as 4 or 5% in one case) but I have
*not* accurately measured this - I am a bit sceptical of these numbers.
- fix the warnings, they are there for a reason!
- add -DNO_ERROR to your make(1) command.
- add 'makeoptions NO_WERROR=true' to your kernel config.
- add 'nowerror' to conf/files* that have warnings that should be fixed
due to tracking 3rd party vendor code.
- add 'nowerror' to conf/files* where the warning is false due to a
compiler bug and fixing it with brute force would be too expensive.
There are some very sloppy warnings in our kernel build, come on folks!
'make release' uses -DNO_WERROR intentionally.
so that this is safe even if VARIABLE is longer than kern.argmax.
There is another instance of CFILES which might need the same treatment,
and might be noticed when doing a "make links".
The same has to be done in RELENG_4 (on some different file).
Noticed-by: picobsd cross-compiling LINT
Suggested-by: Alfred (bright@mu.org), des@freebsd.org
MFC-after: 3 days
It doesn't actually do it yet though. This adds a flag to config so
that we can exclude certain vendor files from this even when the rest
of the kernel has it on. make -DNO_WERROR would also bypass all of it.
buffer length, determine if the pointer is to a valid string. Currently,
the only check is whether a '\0' appears in the buffer. This is useful
when pulling in a structure from userland that may contain one or more
strings, and validity testing must be performed on elements of the
structure. When copying normal string arguments, copyinstr() is
expected to be used.
the structure definitions come from NetBSD to make it easier to share card
definitions. The driver only acts as a shim between the pci bus and the
sio driver. Later pci parallel ports could also be supported through this
driver. Support for most single and multiport pci serial cards should be
as simple as adding its definition to pucdata.c
Tested with the following pci cards:
Moxa Industio CP-114, 4 port RS-232,RS-422/485
Syba Tech Ltd. PCI-4S2P-550-ECP, 4 port RS-232 + 2 parallel ports
Netmos NM9835 PCI-2S-550, 2 port RS-232
ACPI_NO_SEMAPHORES, ASR_MEASURE_PERFORMANCE, AST_DEBUG, ATAPI_DEBUG,
ATA_DEBUG, BKTR_ALLOC_PAGES, BROOKTREE_ALLOC_PAGES, CAPABILITIES,
COMPAT_SUNOS, CV_DEBUG, MAXFILES, METEOR_TEST_VIDEO, NDEVFSINO,
NDEVFSOVERFLOW, NETGRAPH_BRIDGE, NETSMB, NETSMBCRYPTO, PFIL_HOOKS,
SIMOS, SMBFS, VESA_DEBUG, VGA_DEBUG.
Start using #! to comment out negative options and ## to comment out
broken options.
atapi-all.c:
Fixed rotted bits that were hiding under ATAPI_DEBUG.
atapi-cd.c:
#include "opt_ata.h" so that ACD_DEBUG is actually visible.
ata/atapi-tape.c
#include "opt_ata.h" so that AST_DEBUG is actually visible.
descriptors. This simplifies code for jumbo frames.
- Cleaned up coding conventions to make code more unix-like.
- Cleaned up code in if_em_fxhw.c and if_em_phy.c.
Added relevant comments.
MFC after: 1 week
feature bit on newer Athlon CPUs if the BIOS has forgotten to enable
it.
This patch was constructed using some info made available by John
Clemens at http://www.deater.net/john/PavilionN5430.html
Reviewed by: -audit
MFC after: 3 weeks
npx is no more mandatory than sc. Its mandatoryness went away in
rev.1.226 of i386/machdep.c 9 months before it was made mandatory in
rev.1.24 of config/mkmakefile.c.
This change is mainly to test building of minimal kernel configurations.
npx should really be even more standard than clk. It was optional mainly
so that the usual device driver configuration info could be specified in
the usual way in config files, but this hasn't been necessary for a few
years.
prior ICP Vortex models. This driver was developed by Achim Leubner
of Intel (previously with ICP Vortex) and Boji Kannanthanam of Intel.
Submitted by: "Kannanthanam, Boji T" <boji.t.kannanthanam@intel.com>
MFC after: 2 weeks
Integrated RAID driver, supported by <boji.t.kannanthanam@intel.com> and
<achim.leubner@intel.com>.
Submitted by: "Kannanthanam, Boji T" <boji.t.kannanthanam@intel.com>
defined, no symbols are exported from the module. This is
the typical configuration for most device drivers and
standalone modules; only infrastructure modules or those with
special requirements typically need to export symbols.
Don't print the objcopy commands as they are run when converting
symbols; they're bulky and annoying in many cases.
simplifying the module linking process and eliminating the risks
associated with doubly-defined variables.
Cases where commons were legitimately used (detection of
compiled-in subsystems) have been converted to use sysinits, and
any new code should use this or an equivalent practice as a
matter of course.
Modules can override this behaviour by substituting -fno-common
out of ${CFLAGS} in cases where commons are necessary
(eg. third-party object modules). Commons will be resolved and
allocated space when the kld is linked as part of the module
build process, so they will not pose a risk to the kernel or
other modules.
Provide a mechanism for controlling the export of symbols from
the module namespace. The EXPORT_SYMS variable may be set in the
Makefile to NO (export no symbols), a list of symbols to export,
or the name of a file containing a newline-seperated list of
symbols to be exported. Non-exported symbols are converted to
local symbols. If EXPORT_SYMS is not set, all global symbols are
currently exported. This behaviour is expected to change (to
exporting no symbols) once modules have been converted.
Reviewed by: peter (in principle)
Obtained from: green (kmod_syms.awk)
hw.midi.debug and hw.midi.seq.debug to 1 to enable debug log.
- Make debug messages human-frendly.
- Implement /dev/music.
- Add a timer engine required by /dev/music.
- Fix nonblocking I/O.
- Fix the numbering of midi and synth devices.
Remove the explicit call to aio_proc_rundown() from exit1(), instead AIO
will use at_exit(9).
Add functions at_exec(9), rm_at_exec(9) which function nearly the
same as at_exec(9) and rm_at_exec(9), these functions are called
on behalf of modules at the time of execve(2) after the image
activator has run.
Use a modified version of tegge's suggestion via at_exec(9) to close
an exploitable race in AIO.
Fix SYSCALL_MODULE_HELPER such that it's archetecuterally neutral,
the problem was that one had to pass it a paramater indicating the
number of arguments which were actually the number of "int". Fix
it by using an inline version of the AS macro against the syscall
arguments. (AS should be available globally but we'll get to that
later.)
Add a primative system for dynamically adding kqueue ops, it's really
not as sophisticated as it should be, but I'll discuss with jlemon when
he's around.
- Temporary fix a bug of Intel ACPI CA core code.
- Add OS layer ACPI mutex support. This can be disabled by
specifying option ACPI_NO_SEMAPHORES.
- Add ACPI threading support. Now that we have a dedicate taskqueue for
ACPI tasks and more ACPI task threads can be created by specifying option
ACPI_MAX_THREADS.
- Change acpi_EvaluateIntoBuffer() behavior slightly to reuse given
caller's buffer unless AE_BUFFER_OVERFLOW occurs. Also CM battery's
evaluations were changed to use acpi_EvaluateIntoBuffer().
- Add new utility function acpi_ConvertBufferToInteger().
- Add simple locking for CM battery and temperature updating.
- Fix a minor problem on EC locking.
- Make the thermal zone polling rate to be changeable.
- Change minor things on AcpiOsSignal(); in ACPI_SIGNAL_FATAL case,
entering Debugger is easier to investigate the problem rather than panic.
Non-SMP, i386-only, no polling in the idle loop at the moment.
To use this code you must compile a kernel with
options DEVICE_POLLING
and at runtime enable polling with
sysctl kern.polling.enable=1
The percentage of CPU reserved to userland can be set with
sysctl kern.polling.user_frac=NN (default is 50)
while the remainder is used by polling device drivers and netisr's.
These are the only two variables that you should need to touch. There
are a few more parameters in kern.polling but the default values
are adequate for all purposes. See the code in kern_poll.c for
more details on them.
Polling in the idle loop will be implemented shortly by introducing
a kernel thread which does the job. Until then, the amount of CPU
dedicated to polling will never exceed (100-user_frac).
The equivalent (actually, better) code for -stable is at
http://info.iet.unipi.it/~luigi/polling/
and also supports polling in the idle loop.
NOTE to Alpha developers:
There is really nothing in this code that is i386-specific.
If you move the 2 lines supporting the new option from
sys/conf/{files,options}.i386 to sys/conf/{files,options} I am
pretty sure that this should work on the Alpha as well, just that
I do not have a suitable test box to try it. If someone feels like
trying it, I would appreciate it.
NOTE to other developers:
sure some things could be done better, and as always I am open to
constructive criticism, which a few of you have already given and
I greatly appreciated.
However, before proposing radical architectural changes, please
take some time to possibly try out this code, or at the very least
read the comments in kern_poll.c, especially re. the reason why I
am using a soft netisr and cannot (I believe) replace it with a
simple timeout.
Quick description of files touched by this commit:
sys/conf/files.i386
new file kern/kern_poll.c
sys/conf/options.i386
new option
sys/i386/i386/trap.c
poll in trap (disabled by default)
sys/kern/kern_clock.c
initialization and hardclock hooks.
sys/kern/kern_intr.c
minor swi_net changes
sys/kern/kern_poll.c
the bulk of the code.
sys/net/if.h
new flag
sys/net/if_var.h
declaration for functions used in device drivers.
sys/net/netisr.h
NETISR_POLL
sys/dev/fxp/if_fxp.c
sys/dev/fxp/if_fxpvar.h
sys/pci/if_dc.c
sys/pci/if_dcreg.h
sys/pci/if_sis.c
sys/pci/if_sisreg.h
device driver modifications
for use on machines with untrusted local users, for security as well
as stability reasons.
o Lack of clarity pointed out by: David Rufino <dr@soniq.net> via bugtraq.
Assert that compilation takes place in a freestanding environment. This
implies `-fno-builtin'. A freestanding environment is one in which the
standard library may not exist, and program startup may not necessarily be
at main. The most obvious example is an OS kernel. This is equivalent to
`-fno-hosted'.
cardbus in the kernel, not on all the bridges that implement it.
Note: this is NEWCARD only, so we don't want it for the 'card' case,
unlike card_if.m, which is both NEWCARD and OLDCARD.
NEWCARD. Other patches may be reqiured to sio to prevent a hang on
eject. Also add commented out entries for sio_pccard.c in files.pc98
to match other architectures.
Submitted by: yamamoto shigeru-san
kern.post.mk.
# this should allow us to move kern.post.mk to the last line of the makefiles,
# but I'll do that slowly as I verify that one can do that w/o breaking things.
Submitted by: naddy
Any modifications to SYSTEM_OBJS after including kern.post.mk
will not make it to SYSTEM_DEP and consequently any dependency
rules. This caused __{div|rem}* to not be built...
- Add S4BIOS sleep implementation. This will works well if MIB
hw.acpi.s4bios is set (and of course BIOS supports it and hibernation
is enabled correctly).
- Add DSDT overriding support which is submitted by takawata originally.
If loader tunable acpi_dsdt_load="YES" and DSDT file is set to
acpi_dsdt_name (default DSDT file name is /boot/acpi_dsdt.aml),
ACPI CA core loads DSDT from given file rather than BIOS memory block.
DSDT file can be generated by iasl in ports/devel/acpicatools/.
- Add new files so that we can add our proposed additional code to Intel
ACPI CA into these files temporary. They will be removed when
similar code is added into ACPI CA officially.
"[...] and removes the hostcache code from standard kernels---the
code that depends on it is not going to happen any time soon,
I'm afraid."
Time to clean up.
Fix it by putting back the link of machine to sys/i386/include rather
than ../../include (aka sys/pc98/include). I had a stale machine link
on my first test.
Not sure what the "right" fix is, but this unbreaks things.
new files: kern.pre.mk, which contains most of the definitions, and
kern.post.mk, which contains most of the rules.
I've tested this on i386 and pc98. I have had feedback on the sparc64
port, but no reports from anybody on alpha, ia64 or powerpc. I
appologize in advance if I've broken you.
Reviewed by: jake, jhb, arch@
- Now that apm loadable module can inform its existence to other kernel
components (e.g. i386/isa/clock.c:startrtclock()'s TCS hack).
- Exchange priority of SI_SUB_CPU and SI_SUB_KLD for above purpose.
- Add simple arbitration mechanism for APM vs. ACPI. This prevents
the kernel enables both of them.
- Remove obsolete `#ifdef DEV_APM' related code.
- Add abstracted interface for Powermanagement operations. Public apm(4)
functions, such as apm_suspend(), should be replaced new interfaces.
Currently only power_pm_suspend (successor of apm_suspend) is implemented.
Reviewed by: peter, arch@ and audit@
booted from it when doing an installkernel.
Only change kern.bootfile from ${DESTDIR}${KODIR}/${KERNEL_KO}
to ${DESTDIR}${KODIR}.old/${KERNEL_KO}, and only when we're renaming
a booted ${DESTDIR}${KODIR}/${KERNEL_KO} kernel.
Small tweaks to kldxref may be necessary to avoid the surprising (but harm-
less) behaviour of 'kldload foo' loading foo.ko.debug instead of foo.ko if
it is present in the kernel directory.
Approved by: a week of silence on -arch
MFC after: 2 weeks
This emulates APM device node interface APIs (mainly ioctl) and
provides APM services for the applications. The goal is to support
most of APM applications without any changes.
Implemented ioctls in this commit are:
- APMIO_SUSPEND (mapped ACPI S3 as default but changable by sysctl)
- APMIO_STANDBY (mapped ACPI S1 as default but changable by sysctl)
- APMIO_GETINFO and APMIO_GETINFO_OLD
- APMIO_GETPWSTATUS
With above, many APM applications which get batteries, ac-line
info. and transition the system into suspend/standby mode (such as
wmapm, xbatt) should work with ACPI enabled kernel (if ACPI works well :-)
Reviewed by: arch@, audit@ and some guys
defeats the point of LINT to comment out positive options.
Fixed style bugs in rev.1.973:
- disordering of PCI options list.
- missing space after "options".
- line longer than 80 characters.
- bogus quoting of "BIOS".
sio_pccard_detach to use new siodetach. Add an extra arg to sioprobe
to tell driver to probe/not probe the device for IRQs.
This incorporates most of Bruce's review material. I'm at a good
checkpoint, but there will be more to come based on bde's further
reviews.
Reviewed by: bde
Move sio from isa/sio.c to dev/sio/sio.c. The next step is to break
out the front end attachments, improve support for these parts on
different busses, and maybe, if we're lucky, merging in pc98 support.
It will also be MI and live in conf/files rather than files.*.
Approved by: bde
Tested with: i386, pc98
been misled to believe by unknown parties. It probably *should* be an option,
but the runtime value is controlled by a tunable, which Ought To Be Enough.