as multi-processor kernels. The old way made it difficult for kernel
modules to be portable between uni-processor and multi-processor
kernels. It is no longer necessary to jump through hoops.
- always load %fs with the private segment on entry to the kernel
- change the type of the self-referential pointer from struct privatespace
to struct globaldata
- make the globaldata symbol have value 0 in all cases, so the symbols
in globals.s are always offsets, not aliases for fields in globaldata
- define the globaldata space used for uniprocessor kernels in C, rather
than assembler
- change the assembly language accessors to use %fs, add a macro
PCPU_ADDR(member, reg), which loads the register reg with the address
of the per-cpu variable member
- Provide TUNABLE_INT() hooks for ktr_cpumask, ktr_mask, and ktr_verbose
so that they can be set from the loader by their respective sysctl names.
For example, to turn on KTR_INTR and KTR_PROC in ktr_mask, one could
stick 'debug.ktr.mask="0x1200"' in /boot/loader.conf.
This version is functional and is approaching solid.
Notice I said APPROACHING. There are many node types I cannot test.
I have tested: echo hole ppp socket vjc iface tee bpf async tty
The rest compile and "Look" right. More changes to follow.
DEBUGGING is enabled in this code to help if people have problems.
to suppress logging when ARP replies arrive on the wrong interface:
"/kernel: arp: 1.2.3.4 is on dc0 but got reply from 00:00:c5:79:d0:0c on dc1"
the default is to log just to give notice about possibly incorrectly
configured networks.
aic7xxx.h:
First pass at big-endian support in the Core.
Capture state for second channel on TWIN channel adapters
for suspend and resume.
aic7xxx_freebsd.h:
Stubs for endian conversion functions. These will get filled
out once we get an official kernel api for this kind of thing
that is something more elegant and efficient than a bunch of
manual swaps #ifdefed by platform.
aic7xxx_pci.c
Allow the second channel of motherboard aic7896 chips to be attached.
It turns out that the encoding of the subdevice id differs between
PCI cards and MB based controllers and our check to see, via
the subvendor id, if the second channel was "stuffed" always
turned out negative.
the video switch by another. Exactly as VESA does on top of VGA.
It adds linear framebuffer to S3 VESA 1.2 cards.
Obtained from: The original S3 ISA code comes from
Peter Horton <pdh@colonel-panic.com>
o Use 8 space hard tabs
o Eliminate trailing white space (while I'm here, just in a couple of places)
o wrap mostly at 80 columns (printf literal strings being the notable
exception)
o use return (foo) consistently
o use 0 vs NULL more consistently
o use queue(3) xxx_FOREACH macros where appropriate (some places used it
before, others didn't).
o use BSD line continuation parameters
Pedants will likely notice minor style(9) violations, but for the
most part the file now looks much much closer to style(9) and is
mostly self-consistent.
Approved in principle by: dfr
Reviewed by: md5 (no changes to the .o)
specific snd_mixer device rather than global across all mixers.
- Add per-mixer mute status and saved mute_level so that the mixer_hwmute()
function can now toggle the mute state when the mute button is pressed.
- Create a dynamic sysctl tree hw.snd.pcmX when a pcm device is registered.
- Move the hw.snd.hwvol_* sysctl's to hw.snd.pcmX.hwvol_* so that they
are now properly device-specific. Eventually when the mixers become
their own devices these sysctl's will move to live under a mixerX tree.
- Change the interface of the hwvol_mixer sysctl so that it reports the
name of the current mixer device instead of the number and is settable
with the name instead of the number.
- Add a new function mixer_hwinit() used to setup the dynamic sysctl's
needed for the hwvol support that can be called by drivers that support
hwvol.
Reviewed by: cg
to the SYSCTL_ADD_FOO() macros is a constant that should be turned into
a string via the pre-processor. Instead, require it to be an explicit
string so that names can be generated on the fly.
- Make some of the char * arguments to sysctl_add_oid() const to quiet
warnings.
'chancount' never got up to equaling 'maxchans'. As a result,
pcm_makelinks() was never called, and one always had to set the sysctl to
get the /dev/mixer and other symlinks generated in the DEVFS case. Instead,
change the test in pcm_addchan() to call pcm_makelinks() after the first
channel is initialized, since the aliases are linked to channel 0.
Reviewed by: cg
the file verifier. The NFS client is supposed to do a SETATTR after a
successful O_EXCL open/create to clean up the attributes. FreeBSD's
client code was generating a SETATTR rpc but was not generating an access
or modification time update within that rpc, leaving the file with a
broken access time that solaris chokes on (and it doesn't look very
nice when you ls -lua under FreeBSD either!). Fixed.
file.
While there, fix the layout of function headers (noticeable in long headers).
Fix up some style nits. It's Perl and should be written in that style.
Bump __FreeBSD_version to reflect the move.
For the moment, <sys/select.h> includes <sys/selinfo.h> to allow
clients time to catch up.
Changes made in preparation for SUSv2/POSIX <sys/select.h> requirements.
status register rather than 0. Without this, a single hardware volume
event triggers an interrupt storm.
- Implement hardware volume control for the Maestro chips. This version
only handles the case where both channels are adjusted at the same time.
Reviewed by: cg
- The mixer_hwmute() function can be called when a soundcard receives a
mute request.
- The mixer_hwstep() function can be used to adjust the volume of one or
both channels.
- The 'hw.snd.hwvol_step' sysctl determines the amount that mixer_hwstep()
adjusts the volume by on each call.
- The 'hw.snd.hwvol_mixer' sysctl specifies the mixer device to adjust the
volume on for both functions. The values used correspond to the
SOUNDCARD_MIXER_* constants.
want according to the modes set with the ppc(4) flags. Especially, it
should fix some problems with mode detection of parallel chipsets
configured to EPP but which have timing troubles with the drives. In such
a case, the driver should now fall back to slower modes (PS2, NIBBLE).
out: label in psignal() did not grab sched_lock before trying to release
it. Also, the previous version had several cases where it grabbed
sched_lock before jumping to out: unnecessarily, so rework this a bit.
The runfast: and out: labels must be called with sched_lock released, and
the run: label must be called with it held. Appropriate mtx_assert()'s
have been added that should catch any bugs that may still be in this
code.
Noticed by: bde
extension.
Add ability to create a preload disk giving an address and a length
(suggested by imp)
Fix bug relating to very small md(4) devices.
Update md.c copyright to reflect the status of code copied from vn.c.
(noticed by dillon)
all devices are by default known by their 'cooked' name, so
my change was wrong. I thought it was a hangover from old 'block
tape device' support which hasn't worked (if it ever did) since
v6/PWB.
So, the default tape name is now the same as Linux. Far out, man....
attaching to running processes, it completely breaks normal debugging.
A better fix is in the works, but cannot be properly tested until
the problem with gdb hanging the system in -current is solved.
WWNs correctly (Again!) - this time for the case that we're not going
to fully init the adapter if isp_init is called (with ISP_CFG_NOINIT
set in options). The purpose for this is to bring the adapter up to
almost ready to go, get info out of NVRAM, but to not start it up- leaving
it until later to actually start things up if wanted (and possibly with
different roles selected).
process. This fixes a problem when attaching to a process in gdb
and the process staying in the STOP'd state after quitting gdb.
This whole process seems a bit suspect, but this seems to work.
Reviewed by: peter
with the driver locking up under load.
- Restructure so that we use a static pool of commands/FIBs, rather than
allocating them in clusters. The cluster allocation just made things
more complicated, and allowed us to waste more memory in peak load
situations.
- Make queueing macros more like my other drivers. This adds queue stats
for free. Add some debugging to take advantage of this.
- Reimplement the periodic timeout scan. Kick the interrupt handler
and the start routine every scan as well, just to be safe. Track busy
commands properly.
- Bring resource cleanup into line with resource allocation. We should
now clean up correctly after a failed probe/unload/etc.
- Try to start new commands when old ones are completed. We weren't doing
this before, which could lead to deadlock when the controller was full.
- Don't try to build a new command if we have found a deferred command.
This could cause us to lose the deferred command.
- Use diskerr() to report I/O errors.
- Don't bail if the AdapterInfo structure is the wrong size. Some variation
seems to be normal. We need to improve our handling of 2.x firmware sets.
- Improve some comments in an attempt to try to make things clearer.
- Restructure to avoid some warnings.
in 4.2-REL which I ripped out in -stable and -current when implementing the
low-memory handling solution. However, maxlaunder turns out to be the saving
grace in certain very heavily loaded systems (e.g. newsreader box). The new
algorithm limits the number of pages laundered in the first pageout daemon
pass. If that is not sufficient then successive passes will be run without any
limit.
Write I/O is now pipelined using two sysctls, vfs.lorunningspace and
vfs.hirunningspace. This prevents excessive buffered writes in the
disk queues which cause long (multi-second) delays for reads. It leads
to more stable (less jerky) and generally faster I/O streaming to disk
by allowing required read ops (e.g. for indirect blocks and such) to occur
without interrupting the write stream, among other things.
NOTE: eventually, filesystem write I/O pipelining needs to be done on a
per-device basis. At the moment it is globalized.
o Move the ax88190 code to its own function.
o Move all device_method_t, driver_t and DRIVER_MODULE definitions to the
end of files.
o Wrap a few lines > 80 characters.
o Use the same devclass for all ed drivers. This allows machines with
multiple types of cards to have their cards numbered correctly. Before,
you could wind up with two ed0's.
o Protect if_edvar.h from multiple includes because I was there.
modify chn_setblocksize() to pick a default soft-blocksize appropriate to the
sample rate and format in use. it will aim for a power of two size small
enough to generate block sizes of at most 20ms. it will also set the
hard-blocksize taking into account rate/format conversions in use.
update drivers to implement setblocksize correctly:
updated, tested: sb16, emu10k1, maestro, solo
updated, untested: ad1816, ess, mss, sb8, csa
not updated: ds1, es137x, fm801, neomagic, t4dwave, via82c686
i lack hardware to test: ad1816, csa, fm801, neomagic
others will be updated/tested in the next few days.
This is because calls with M_WAIT (now M_TRYWAIT) may not wait
forever when nothing is available for allocation, and may end up
returning NULL. Hopefully we now communicate more of the right thing
to developers and make it very clear that it's necessary to check whether
calls with M_(TRY)WAIT also resulted in a failed allocation.
M_TRYWAIT basically means "try harder, block if necessary, but don't
necessarily wait forever." The time spent blocking is tunable with
the kern.ipc.mbuf_wait sysctl.
M_WAIT is now deprecated but still defined for the next little while.
* Fix a typo in a comment in mbuf.h
* Fix some code that was actually passing the mbuf subsystem's M_WAIT to
malloc(). Made it pass M_WAITOK instead. If we were ever to redefine the
value of the M_WAIT flag, this could have become a big problem.
point in calling a function just to set a flag.
Keep better track of the syslog FAC/PRI code and try to DTRT if
they mingle.
Log all writes to /dev/console to syslog with <console.info>
priority. The formatting is not preserved; there is no robust
way of doing it. (Ideas with patches welcome.)
To use it, some dll is needed. And currently, the dll is only for NetBSD.
So one more kernel module is needed.
For more information,
http://chiharu.haun.org/peace/ .
Reviewed by: bp
striped plexes.
Submitted by: des
Don't lock buffers before calls to sdio, sdio does it by itself.
Submitted by: tegge
parityops: Use correct casts when returning error information.
Requested by: Bernd Walter <ticso@cicely8.cicely.de>
Cor Bosman <cor@xs4all.net>
Kai Storbeck <kai@xs4all.net>
Joe Greco <jgreco@ns.sol.net>
Add support for Compaq SMART-2 RAID (idad) as storage
device for Vinum subdisks.
Reported by: Aaron Hill <hillaa@hotmail.com>
ahc_pci.c:
Add detach support.
Make use of softc allocated on our behalf by newbus.
For PCI devices, disable the mapping type we aren't
using for extra protection from rogue code.
aic7xxx_93cx6.c:
aic7xxx_93cx6.h:
Sync perforce IDs.
aic7xxx_freebsd.c:
Capture the eventhandle returned by EVENTHANDLER_REGISTER
so we can kill the handler off during detach.
Use AHC_* constants instead of hard coded numbers in a
few more places.
Test PPR option state when deciding to "really" negotiate
when the CAM_NEGOTIATE flag is passed in a CCB.
Make use of core "ahc_pause_and_flushwork" routine in our
timeout handler rather than re-inventing this code.
Cleanup all of our resources (really!) in ahc_platform_free().
We should be all set to become a module now.
Implement the core ahc_detach() routine shared by all of
the FreeBSD front-ends.
aic7xxx_freebsd.h:
Softc storage for our event handler.
Null implementation for the ahc_platform_flushwork() OSM
callback. FreeBSD doesn't need this as XPT callbacks are
safe from all contexts and are done directly in ahc_done().
aic7xxx_inline.h:
Implement new lazy interrupt scheme. To avoid an extra
PCI bus read, we first check our completion queues to
see if any work has completed. If work is available, we
assume that this is the source of the interrupt and skip
reading INTSTAT. Any remaining interrupt status will be
cleared by a second call to the interrupt handler should
the interrupt line still be asserted. This drops the
interrupt handler down to a single PCI bus read in the
common case of I/O completion. This is the same overhead
as in the not so distant past, but the extra sanity of
performing a PCI read after clearing the command complete
interrupt and before running the completion queue to avoid
missing command complete interrupts added a cycle.
aic7xxx.c:
During initialization, be sure to initialize all scratch
ram locations before they are read to avoid parity errors.
In this case, we use a new function, ahc_unbusy_tcl() to
initialize the scratch ram busy target table.
Replace instances of ahc_index_busy_tcl() used to unbusy
a tcl without looking at the old value with ahc_unbusy_tcl().
Modify ahc_sent_msg so that it can find single byte messages.
ahc_sent_msg is now used to determine if a transfer negotiation
attempt resulted in a bus free.
Be more careful in filtering out only the SCSI interrupts
of interest in ahc_handle_scsiint.
Rearrange interrupt clearing code to ensure that at least
one PCI transaction occurs after hitting CLRSINT1 and
writing to CLRINT. CLRSINT1 writes take a bit to
take effect, and the re-arrangement provides sufficient
delay to ensure the write to CLRINT is effective. The
old code might report a spurious interrupt on some "fast"
chipsets.
Export ahc_update_target_msg_request for use by OSM code.
If a target does not respond to our ATN request, clear
it once we move to a non-message phase. This avoids
sending a MSG_NOOP in some later message out phase.
Use max lun and max target constants instead of
hard-coded values.
Use softc storage built into our device_t under FreeBSD.
Fix a bug in ahc_free() that caused us to delete
resources that were not allocated.
Clean up any tstate/lstate info in ahc_free().
Clear the powerdown state in ahc_reset() so that
registers can be accessed.
Add a preliminary function for pausing the chip and
processing any posted work.
Add preliminary suspend and resume functions.
aic7xxx.h:
Limit the number of supported luns to 64. We don't
support information unit transfers, so this is the
maximum that makes sense for these chips.
Add a new flag AHC_ALL_INTERRUPTS that forces the
processing of all interrupt state in a single invocation
of ahc_intr(). When the flag is not set, we use the
lazy interrupt handling scheme.
Add data structures to store controller state while
we are suspended.
Use constants instead of hard coded values where appropriate.
Correct some harmless "unsigned/signed" conflicts.
aic7xxx.seq:
Only perform the SCSIBUSL fix on ULTRA2 or newer controllers.
Older controllers seem to be confused by this.
In target mode, ignore PHASEMIS during data phases.
This bit seems to be flakey on U160 controllers acting
in target mode.
aic7xxx_pci.c:
Add support for the 29160C CPCI adapter.
Add definitions for subvendor ID information
available for devices with the "9005" vendor id.
We currently use this information to determine
if a multi-function device doesn't have the second
channel hooked up on a board.
Add rudimentary power mode code so we can put the
controller into the D0 state. In the future this
will be an OSM callback so that in FreeBSD we don't
duplicate functionality provided by the PCI code.
The powerstate code was added after I'd completed
my regression tests on this code.
Only capture "left over BIOS state" if the POWRDN
setting is not set in HCNTRL.
In target mode, don't bother sending incremental
CRC data.
to negotiate from scratch. Make leased lines survive being put into
loopback mode. Bits and pieces and ideas taken from PRs 11238 and 21771.
Make it a module so that it can be kldloaded. Whitespace cleanup. (Can be
ignored with "cvs diff -b".)
PR: 11238 and 21771 (bits and pieces)
This should make the system power-off correctly where the howto had
more bits set than RB_POWEROFF, e.g. RB_NOSYNC.
Submitted by: Peter Pentchev <roam@orbitel.bg>
1) Be more tolerant of missing snapshot files by only trying to decrement
their reference count if they are registered as active.
2) Fix for snapshots of filesystems with block sizes larger than 8K
(from Ollivier Robert <roberto@eurocontrol.fr>).
3) Fix to avoid losing last block in snapshot file when calculating blocks
that need to be copied (from Don Coleman <coleman@coleman.org>).
which fails to set the modification time on the file. The same
check a few lines later takes the correct action.
Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
going to hurt sio(4) performance for the time being. As we get closer to
release and have more of the kernel unlocked we can come back to doing
arcane optimizations to work around the limitations of the sio hardware.
claimed that their Intel NIC is comatose after a warm boot from Windoze.
This is most likely due to the card getting put in the D3 state. This
should bring it back to life.
PCI code. This saves each driver from having to grovel around looking
for the right registers to twiddle.
I should eventually convert the other PCI drivers to do this; for now,
these three are ones which I know need power state handling.
The fix works by reverting the ordering of free memory so that the
chances of contig_malloc() succeeding increase.
PR: 23291
Submitted by: Andrew Atrens <atrens@nortel.ca>
format version number. (Userland programs should not need to be
recompiled when the netgraph kernel internal ABI is changed.)
Also fix modules that don't handle the fact that a caller may not supply
a return message pointer. (Benign at the moment because the calling code
checks, but that will change.)
require the addition of flag 0x80000 to their config line in
pccard.conf(5). This flag is not optional. These Linksys cards will
not be recognized without it.
Reviewed by: imp, iwasaki
The what argument is the hold type that the assertion acts on: LK_SHARED
to assert that the process holds a shared lock, LK_EXCLUSIVE to assert that
the process holds _either_ a shared lock or an exclusive lock.
this gives us several benefits, including:
* easier extensibility- new optional methods can be added to
ac97/mixer/channel classes without having to fixup every driver.
* forward compatibility for drivers, provided no new mandatory methods are
added.
by ensuring that newly allocated blocks are zeroed. The
race can occur even in the case where the write covers
the entire block.
Reported by: Sven Berkvens <sven@berkvens.net>, Marc Olzheim <zlo@zlo.nu>
and had no data available returned 0. Now it returns -1 with errno
set to EWOULDBLOCK (== EAGAIN) as it should. This fix makes the bpf
device usable in threaded programs.
Reviewed by: bde
messages sent by routers when they deny our traffic. This causes
a timeout when trying to connect to TCP ports/services on a remote
host that is blocked by routers or firewalls.
RFC 1122 (Requirements for Internet Hosts) section 3.2.2.1 actually
requires that we treat such a message for a TCP session as if we
had received a RST.
quote begin.
A Destination Unreachable message that is received MUST be
reported to the transport layer. The transport layer SHOULD
use the information appropriately; for example, see Sections
4.1.3.3, 4.2.3.9, and 4.2.4 below. A transport protocol
that has its own mechanism for notifying the sender that a
port is unreachable (e.g., TCP, which sends RST segments)
MUST nevertheless accept an ICMP Port Unreachable for the
same purpose.
quote end.
I've written a small extension that implements this; it also creates
a sysctl "net.inet.tcp.icmp_admin_prohib_like_rst" to control if
this new behaviour is activated.
When it's activated (set to 1) we'll treat an ICMP administratively
prohibited message (icmp type 3 code 9, 10 and 13) for a TCP
session as if we received a TCP RST, but only if the TCP session
is in SYN_SENT state.
The reason for only reacting when in SYN_SENT state, is that this
will solve the problem, and at the same time minimize the risk of
this being abused.
I suggest that we enable this new behaviour by default, but it
would be a change of current behaviour, so if people prefer to
leave it disabled by default, at least for now, this would be ok
for me, the attached diff actually has the sysctl set to 0 by
default.
PR: 23086
Submitted by: Jesper Skriver <jesper@skriver.dk>
1. ICMP ECHO and TSTAMP replies are now rate limited.
2. RSTs generated due to packets sent to open and unopen ports
are now limited by separate counters.
3. Each rate limiting queue now has its own description, as
follows:
Limiting icmp unreach response from 439 to 200 packets per second
Limiting closed port RST response from 283 to 200 packets per second
Limiting open port RST response from 18724 to 200 packets per second
Limiting icmp ping response from 211 to 200 packets per second
Limiting icmp tstamp response from 394 to 200 packets per second
Submitted by: Mike Silbersack <silby@silby.com>
the kernel console. Instead, change logwakeup() to set a flag in the
softc. A callout then wakes up every so often and wakes up any processes
selecting on /dev/log (such as syslogd) if the flag is set. By default
this callout fires 5 times a second, but that can be adjusted by the
sysctl kern.log_wakeups_per_second.
Reviewed by: phk
Add detach routine and turn driver into a module so it can be loaded
and unloaded. Also take a stab at implementing multicast packet
reception so that this NIC will work with IPv6. Promiscuous mode
doesn't seem to work, but I'm not sure why. It works well enough that
I can run dhclient on it and put it on the office network though.
Also ripped out spl stuff and replaced it with mutexes.
This is a driver for the LanMedia/SBE LMC150x E1/T1 family of cards.
The driver currently supports unframed E1 (2048kbit/s) and framed
E1 (nx64).
These cards will provision E1/T1 lines for about 1/4 the cost of
a cisco router...
commands have also been slightly updated as follows:
- Use ktr_idx to find the newest entry rather than walking the buffer
comparing timespecs. Timespecs are not always unique after the change
to use getnanotime(9).
- Add a new verbose setting. When the verbose setting is on, then the
timestamp is printed with each message. If KTR_EXTEND is on, then the
filename and line number are output as well. By default this option is
off. It can be turned on with the 'v' modifier passed to the 'tbuf'
and 'tall' commands. For the 'tnext' command, the 'v' modifier toggles
the verbose mode.
- Only display the cpu number for each message on SMP systems.
- Don't display anything for an empty entry that hasn't been used yet.
MS will be treated as having this quirk. In the event that we falsely
identify one that doesn't need it, no harm will be done. Ken
suggested that we make this more generic since there may be more
needed in the future.
Reported by: TERAMOTO Masahiro <teramoto@comm.eng.osaka-u.ac.jp>
PR: kern/23378
Reviewed by: ken
aicasm is run on the build machine and therefore needs to be
compiled and linked against the headers and libraries (resp)
of the build machine. Since normally the default include
directories are searched after any specified on the command
line, make sure we don't accidentally pick up machine
dependent headers from the kernel compile directory by
specifying /usr/include first.
This solves the (cross) build problem for ia64.
Approved by: gibbs
functions. If this flag is set, then no KTR log messages are issued.
This is useful for blocking excessive logging, such as with the internal
mutex used by the witness code.
- Use MTX_QUIET on all of the mtx_enter/exit operations on the internal
mutex used by the witness code.
- If we are in a panic, don't do witness checks in witness_enter(),
witness_exit(), and witness_try_enter(), just return.
Generate a version string that looks just like a real Linux one - almost :)
Use sbufs everywhere instead of sprintf(). Note that this is still imperfect,
as the code does not check whether the sbuf overflowed - but it'll still
work better than before, since if the sbuf overflows, the code now simply
copies out 0 bytes instead of causing a trap (or worse, corrupting kernel
structures)
vm86_trap() to return to the calling program directly. vm86_trap()
doesn't return, thus it was never returning to trap() to release
Giant. Thus, release Giant before calling vm86_trap().
struct swblock entries by dividing the number of the entries by 2
until the swap metadata fits.
- Reject swapon(2) upon failure of swap_zone allocation.
This is just a temporary fix. Better solutions include:
(suggested by: dillon)
o reserving swap in SWAP_META_PAGES chunks, and
o swapping the swblock structures themselves.
Reviewed by: alfred, dillon
variables from i386 assembly language. The syntax is PCPU(member)
where member is the capitalized name of the per-cpu variable, without
the gd_ prefix. Example: movl %eax,PCPU(CURPROC). The capitalization
is due to using the offsets generated by genassym rather than the symbols
provided by linking with globals.o. asmacros.h is the wrong place for
this but it seemed as good a place as any for now. The old implementation
in asnames.h has not been removed because it is still used to de-mangle
the symbols used by the C variables for the UP case.
Previously, the syncer process was the only process in the
system that could process the soft updates background work
list. If enough other processes were adding requests to that
list, it would eventually grow without bound. Because some of
the work list requests require vnodes to be locked, it was
not generally safe to let random processes process the work
list while they already held vnodes locked. By adding a flag
to the work list queue processing function to indicate whether
the calling process could safely lock vnodes, it becomes possible
to co-opt other processes into helping out with the work list.
Now when the worklist gets too large, other processes can safely
help out by picking off those work requests that can be handled
without locking a vnode, leaving only the small number of
requests requiring a vnode lock for the syncer process. With
this change, it appears possible to keep even the nastiest
workloads under control.
Submitted by: Paul Saab <ps@yahoo-inc.com>
was not atomic. We now make sure that we free the ext buf if the reference
count is about to reach 0 but also make sure that nobody else has done it
before us.
While I'm here, change refcnt to u_int (from long). This fixes a compiler
warning regarding use of atomic_cmpset_long on i386.
Submitted by: jasone
Reviewed by: jlemon, jake
- Remove redundant header-type-specific support in the cardbus pcibus
clone. The bridges don't need this anymore.
- Use pcib_get_bus instead of the deprecated pci_get_secondarybus.
- Implement read/write ivar support for the pccbb, and teach it how
to report its secondary bus number. Save the subsidiary bus number
as well, although we don't use it yet.
- Break out the /dev/pci driver into a separate file.
- Kill the COMPAT_OLDPCI support.
- Make the EISA bridge attach a bit more like the old code; explicitly
check for the existence of eisa0/isa0 and only attach if they don't
already exist. Only make one bus_generic_attach() pass over the
bridge, once both busses are attached. Note that the stupid Intel
bridge's class is entirely unpredictable.
- Add prototypes and re-layout the core PCI modules in line with
current coding standards (not a major whitespace change, just moving
the module data to the top of the file).
- Remove redundant type-2 bridge support from the core PCI code; the
PCI-CardBus code does this itself internally. Remove the now
entirely redundant header-class-specific support, as well as the
secondary and subordinate bus number fields. These are bridge
attributes now.
- Add support for PCI Extended Capabilities.
- Add support for PCI Power Management. The interface currently
allows a driver to query and set the power state of a device.
- Add helper functions to allow drivers to enable/disable busmastering
and the decoding of I/O and memory ranges.
- Use PCI_SLOTMAX and PCI_FUNCMAX rather than magic numbers in some
places.
- Make the PCI-PCI bridge code a little more paranoid about valid
I/O and memory decodes.
- Add some more PCI register definitions for the command and status
registers. Correct another bogus definition for type-1 bridges.
of explicit calls to lockmgr. Also provides macros for the flags
passed to specify shared, exclusive or release which map to the
lockmgr flags. This is so that the use of lockmgr can be easily
replaced with optimized reader-writer locks.
- Add some locking that I missed the first time.
This clears out my outstanding netgraph changes.
There is a netgraph change of design in the offing and this is to some
extent a superset of some of the new functionality and some of the old
functionality that may be removed.
This code works as before, but allows some new features that I want to
work with and evaluate. It is the basis for a version of netgraph
with integral locking for SMP use.
This is running on my test machine with no new problems :-)
rather than finding our parent pcib and using its PCI_READ_CONFIG
method.
- Fix the defines for the 32-bit I/O decode registers, and properly
process the 16-bit versions. Now we will correctly check that I/O
resources behind the bridge are going to be decoded.
- Bring the quirk for the Orion PCI:PCI bridge in here (since it
seems to want to set the secondary/supplementary bus numbers).
- Use PCI_SLOTMAX rather than a magic number.
but serves to work around some uncleanliness whereby the ISA bus is not
found on Alpha systems with PCI:EISA bridges due to the lack of EISA code
for the Alpha.
no longer contains kernel specific data structures, but rather
only scalar values and structures that are already part of the
kernel/user interface, specifically rusage and rtprio. It no
longer contains proc, session, pcred, ucred, procsig, vmspace,
pstats, mtx, sigiolst, klist, callout, pasleep, or mdproc. If
any of these changed in size, ps, w, fstat, gcore, systat, and
top would all stop working. The new structure has over 200 bytes
of unassigned space for future values to be added, yet is nearly
100 bytes smaller per entry than the structure that it replaced.
be safely held across an eventhandler function call.
- Fix an instance of the head of an eventhandler list being read without
the lock being held.
- Break down and use a SYSINIT at the new SI_SUB_EVENTHANDLER to initialize
the eventhandler global mutex and the eventhandler list of lists rather
than using a non-MP safe initialization during the first call to
eventhandler_register().
- Add in a KASSERT() to eventhandler_register() to ensure that we don't try
to register an eventhandler before things have been initialized.
the witness code is compiled in. Without this, the witness code doesn't
notice that sched_lock is released by fork_trampoline() and thus gets all
confused about spin lock order later on.
macros, the mutex KTR log entries don't actually have the useful filename
and line numbers in the KTR_EXTEND case, so remove a comment claiming this
and go back to one set of KTR strings.
the ISA bus.
- Don't expect that a PCI:ISA bridge will have a correct class value;
if we're checking PCI IDs, only depend on these.
This should fix the loss of ISA on machines with PCI:EISA bridges like the
AS4100.
CPU version (apecs:ev4::cia:ev5) and the irq hardware depends on the systype.
Previously, only ev4 AS1000s and ev5 AS1000a's would have worked.
tested by: wilko (in its -stable form)
noticed by: daniel
held and panic if so (conditional on witness).
- Change witness_list to return the number of locks held so this is easier.
- Add kern/syscalls.c to the kernel build if witness is defined so that the
panic message can contain the name of the offending system call.
- Add assertions that Giant and sched_lock are not held when returning from
a system call, which were missing for alpha and ia64.
can lead to further panics.
- Call getnanotime() instead of nanotime() for the timestamp. nanotime()
is more precise, but it also calls into the timer code, which results
in mutex operations on the i386 arch. If KTR_LOCK is turned on, then
ktr_tracepoint() recurses on itself until it exhausts the kernel stack.
Eventually this should change to use get_cyclecount() instead, but that
can't happen if get_cyclecount() is calling nanotime() instead of
getnanotime().
class/subclass, so give up trying to cull the list. Instead, complain
in the bootverbose case, but otherwise just accept that we will have to
carry this list of device IDs around.
cases with file fragments and read-write mmap's can lead to a situation
where a VM page has odd dirty bits, e.g. 0xFC - due to being dirtied by
an mmap and only the fragment (representing a non-page-aligned end of
file) synced via a filesystem buffer. A correct solution that
guarantees consistent m->dirty for the file EOF case is being
worked on. In the meantime we can't be so conservative in the
KASSERT.