Commit Graph

55975 Commits

Author SHA1 Message Date
Bill Paul
21628ddbd6 This commit makes a big round of updates and fixes many, many things.
First and most importantly, I threw out the thread priority-twiddling
implementation of KeRaiseIrql()/KeLowerIrq()/KeGetCurrentIrql() in
favor of a new scheme that uses sleep mutexes. The old scheme was
really very naughty and sought to provide the same behavior as
Windows spinlocks (i.e. blocking pre-emption) but in a way that
wouldn't raise the ire of WITNESS. The new scheme represents
'DISPATCH_LEVEL' as the acquisition of a per-cpu sleep mutex. If
a thread on cpu0 acquires the 'dispatcher mutex,' it will block
any other thread on the same processor that tries to acquire it,
in effect only allowing one thread on the processor to be at
'DISPATCH_LEVEL' at any given time. It can then do the 'atomic sit
and spin' routine on the spinlock variable itself. If a thread on
cpu1 wants to acquire the same spinlock, it acquires the 'dispatcher
mutex' for cpu1 and then it too does an atomic sit and spin to try
acquiring the spinlock.

Unlike real spinlocks, this does not disable pre-emption of all
threads on the CPU, but it does put any threads involved with
the NDISulator to sleep, which is just as good for our purposes.

This means I can now play nice with WITNESS, and I can safely do
things like call malloc() when I'm at 'DISPATCH_LEVEL,' which
you're allowed to do in Windows.

Next, I completely re-wrote most of the event/timer/mutex handling
and wait code. KeWaitForSingleObject() and KeWaitForMultipleObjects()
have been re-written to use condition variables instead of msleep().
This allows us to use the Windows convention whereby thread A can
tell thread B "wake up with a boosted priority." (With msleep(), you
instead have thread B saying "when I get woken up, I'll use this
priority here," and thread A can't tell it to do otherwise.) The
new KeWaitForMultipleObjects() has been better tested and better
duplicates the semantics of its Windows counterpart.

I also overhauled the IoQueueWorkItem() API and underlying code.
Like KeInsertQueueDpc(), IoQueueWorkItem() must insure that the
same work item isn't put on the queue twice. ExQueueWorkItem(),
which in my implementation is built on top of IoQueueWorkItem(),
was also modified to perform a similar test.

I renamed the doubly-linked list macros to give them the same names
as their Windows counterparts and fixed RemoveListTail() and
RemoveListHead() so they properly return the removed item.

I also corrected the list handling code in ntoskrnl_dpc_thread()
and ntoskrnl_workitem_thread(). I realized that the original logic
did not correctly handle the case where a DPC callout tries to
queue up another DPC. It works correctly now.

I implemented IoConnectInterrupt() and IoDisconnectInterrupt() and
modified NdisMRegisterInterrupt() and NdisMDisconnectInterrupt() to
use them. I also tried to duplicate the interrupt handling scheme
used in Windows. The interrupt handling is now internal to ndis.ko,
and the ndis_intr() function has been removed from if_ndis.c. (In
the USB case, interrupt handling isn't needed in if_ndis.c anyway.)

NdisMSleep() has been rewritten to use a KeWaitForSingleObject()
and a KeTimer, which is how it works in Windows. (This is mainly
to insure that the NDISulator uses the KeTimer API so I can spot
any problems with it that may arise.)

KeCancelTimer() has been changed so that it only cancels timers, and
does not attempt to cancel a DPC if the timer managed to fire and
queue one up before KeCancelTimer() was called. The Windows DDK
documentation seems to imply that KeCantelTimer() will also call
KeRemoveQueueDpc() if necessary, but it really doesn't.

The KeTimer implementation has been rewritten to use the callout API
directly instead of timeout()/untimeout(). I still cheat a little in
that I have to manage my own small callout timer wheel, but the timer
code works more smoothly now. I discovered a race condition using
timeout()/untimeout() with periodic timers where untimeout() fails
to actually cancel a timer. I don't quite understand where the race
is, using callout_init()/callout_reset()/callout_stop() directly
seems to fix it.

I also discovered and fixed a bug in winx32_wrap.S related to
translating _stdcall calls. There are a couple of routines
(i.e. the 64-bit arithmetic intrinsics in subr_ntoskrnl) that
return 64-bit quantities. On the x86 arch, 64-bit values are
returned in the %eax and %edx registers. However, it happens
that the ctxsw_utow() routine uses %edx as a scratch register,
and x86_stdcall_wrap() and x86_stdcall_call() were only preserving
%eax before branching to ctxsw_utow(). This means %edx was getting
clobbered in some cases. Curiously, the most noticeable effect of this
bug is that the driver for the TI AXC110 chipset would constantly drop
and reacquire its link for no apparent reason. Both %eax and %edx
are preserved on the stack now. The _fastcall and _regparm
wrappers already handled everything correctly.

I changed if_ndis to use IoAllocateWorkItem() and IoQueueWorkItem()
instead of the NdisScheduleWorkItem() API. This is to avoid possible
deadlocks with any drivers that use NdisScheduleWorkItem() themselves.

The unicode/ansi conversion handling code has been cleaned up. The
internal routines have been moved to subr_ntoskrnl and the
RtlXXX routines have been exported so that subr_ndis can call them.
This removes the incestuous relationship between the two modules
regarding this code and fixes the implementation so that it honors
the 'maxlen' fields correctly. (Previously it was possible for
NdisUnicodeStringToAnsiString() to possibly clobber memory it didn't
own, which was causing many mysterious crashes in the Marvell 8335
driver.)

The registry handling code (NdisOpen/Close/ReadConfiguration()) has
been fixed to allocate memory for all the parameters it hands out to
callers and delete whem when NdisCloseConfiguration() is called.
(Previously, it would secretly use a single static buffer.)

I also substantially updated if_ndis so that the source can now be
built on FreeBSD 7, 6 and 5 without any changes. On FreeBSD 5, only
WEP support is enabled. On FreeBSD 6 and 7, WPA-PSK support is enabled.

The original WPA code has been updated to fit in more cleanly with
the net80211 API, and to eleminate the use of magic numbers. The
ndis_80211_setstate() routine now sets a default authmode of OPEN
and initializes the RTS threshold and fragmentation threshold.
The WPA routines were changed so that the authentication mode is
always set first, followed by the cipher. Some drivers depend on
the operations being performed in this order.

I also added passthrough ioctls that allow application code to
directly call the MiniportSetInformation()/MiniportQueryInformation()
methods via ndis_set_info() and ndis_get_info(). The ndis_linksts()
routine also caches the last 4 events signalled by the driver via
NdisMIndicateStatus(), and they can be queried by an application via
a separate ioctl. This is done to allow wpa_supplicant to directly
program the various crypto and key management options in the driver,
allowing things like WPA2 support to work.

Whew.
2005-10-10 16:46:39 +00:00
Joseph Koshy
126a039375 Bug fix initialization on multi-core HTT CPUs.
Reported by:	ps
Tested by:	ps
2005-10-10 15:21:08 +00:00
Gleb Smirnoff
376e05d113 ALTQ support for ng_iface(4). Before turning on please consult manual page. 2005-10-10 15:12:59 +00:00
Tor Egge
8272da3106 Release clean buffer with wrong size and no dependencies also for non-VMIO
case.
2005-10-09 22:41:25 +00:00
Tor Egge
4e0cd00988 Adjust totread argument passed to cluster_read() to account for offset not
being block aligned.
2005-10-09 21:11:25 +00:00
Peter Edwards
96ca84d197 When breaking up a large request into smaller ones for the strategy
routine, create all the child bio objects before starting the
requests, rather than starting them as created. This closes a race
whereby some number of child operations could complete before the
rest were ever created, and prematurely freeing the parent bio.
This fixes the panics installing in VMWare and qemu
2005-10-09 21:11:05 +00:00
Tor Egge
9248a8271c Don't pretend that a failed sync write was succesful. 2005-10-09 20:49:01 +00:00
Tor Egge
021869b542 Reduce probability for a deadlock that can occur when a snapshot inode is
updated by a process holding the snapshot lock.  Another process updating a
different inode in the same inodeblock will do copy on write checks and lock in
the opposite direction.

The snapshot code force a copy on write of these blocks manually (cf. start of
expunge_ufs[12]) and these inode blocks are later put on snapblklist.

This partial fix is to 'drain' the relevant ffs_copyonwrite() operation after
installing new snapblklist.  This is not a 100% solution since a failed block
allocation can cause implicit fsync() which might deadlock before the new
snapblklist has been installed.
2005-10-09 20:15:15 +00:00
Tor Egge
d4d530da96 Eliminate a deadlock that can occur when a dirty block belonging to a snapshot
file is flushed by a process not holding snaplk (e.g. bufdaemon).  Another
process might hold snaplk and try to access the block due to ffs_copyonwrite
processing.
2005-10-09 20:07:51 +00:00
Tor Egge
45f91051da Eliminate a deadlock that can occur during the cgaccount() processing due to
the cg map buffer being held when writing indirect blocks.  The process ends up
in ffs_copyonwrite(), attempting to get snaplk while holding the cg map buffer
lock.

Another process might be in ffs_copyonwrite(), trying to allocate a new block
for a copy.  It would hold snaplk while trying to get the cg map buffer lock.

Release the cg map buffer early and use the copy for most of the cgaccount
processing to avoid this deadlock.
2005-10-09 20:00:16 +00:00
Tor Egge
17026ff61a Reduce the probability of low block numbers passed to ffs_snapblkfree() by
skipping the call from ffs_snapremove() if the block number is zero.

Simplify snapshot locking in ffs_copyonwrite() and ffs_snapblkfree() by using
the same locking protocol for low block numbers as for larger block numbers.
This removes a lock leak that could happen if vn_lock() succeeded after
lockmgr() failed in ffs_snapblkfree().

Check if snapshot is gone before retrying a lock in ffs_copyonwrite().
2005-10-09 19:45:01 +00:00
Tor Egge
c73e9e9c7b Reinitialize v_type and v_op fields in case vnode has been reused without
reclamation.  If the vnode previously was a fifo then v_op would point to
ffs_fifoops[12] instead of the expected ffs_vnodeops[12], causing a panic at
the end of ffsext_strategy.
2005-10-09 19:06:34 +00:00
Marcel Moolenaar
ada6a4d2b7 Rough implementation of the create and add verbs. The verbs cause
in-memory changes only and as such are only useful for prototyping
and regression testing purposes.
2005-10-09 17:10:35 +00:00
Craig Rodrigues
a3d7f575c0 - Do not hardcode the bsize to a sectorsize of 2048, even though
the UDF specification specifies a logical sectorsize of 2048.
  Instead, get it from GEOM.
- When reading the UDF Anchor Volume Descriptor, use the logical
  sectorsize of 2048 when calculating the offset to read from, but
  use the actual sectorsize to determine how much to read.

- works with reading a DVD disk and a DVD disk image file via mdconfig
- correctly returns EINVAL if we try to mount_udf an audio CD, instead
  of panicking inside GEOM when INVARIANTS is set
2005-10-09 04:45:33 +00:00
Christian S.J. Peron
a5b7fde722 Lock object while we iterate through it's backing objects.
Discussed with:	alc
2005-10-09 02:37:27 +00:00
Scott Long
8eeb2ca6bf Ue a better msleep identifier. Fix some whitespace. 2005-10-08 22:41:57 +00:00
Scott Long
7a48c6d4ea aac_intr0 rotted long ago, remove it. 2005-10-08 22:36:54 +00:00
Dag-Erling Smørgrav
3803b26bae As alc pointed out to me, vm_page.c 1.305 was incomplete: uma_startup()
still uses the constant UMA_BOOT_PAGES.  Change it to accept boot_pages
as an additional argument.

MFC after:	2 weeks
2005-10-08 21:03:54 +00:00
Scott Long
7cb209f5d0 Mega Update to the aac driver to support a whole new family of cards and
the modified interface that they use.  Changes include:

- Register a different interrupt handler for the new interface.  This one is
  INTR_MPSAFE, not INTR_FAST, and directly processes completions and AIFs.
- Add an event registration and callback mechanism for the ioctl and CAM
  modules can know when a resource shortage clears.  This condition was
  previously fatal in CAM due to programming oversights.
- Fix locking to play better with newbus.
- Provide access methods for talking to cards with the NEWCOMM interface.
- Fix up the CAM module to be better suited for dealing with newer firmware
  on the PERC Si/Di series that requires talking to plain SCSI via aac.
- Add a whole slew of new PCI Id's.

Thanks to Adaptec for providing an initial version of this work and for
answering countless questions about it.  There are still some rough edges in
this, but it works well enough to commit and test for now.

Obtained from: Adaptec, Inc.
2005-10-08 15:55:09 +00:00
Seigo Tanimura
314378233c In ngt_input(), do not derefer sc (= (sc_p) tp->t_lsc) before making
sure sc != NULL.
2005-10-08 11:03:29 +00:00
Warner Losh
7f33c2df93 MFP4: More removal of unused stuff. 2005-10-08 06:58:51 +00:00
Warner Losh
f481fa4d29 MFP4: Changes to hopefully make the new power code work better
o Rather than just try to turn off EXCA_INTR_RESET, set the entire register
  to 0.  This is slightly faster, and a better hammer.
o Move attempted clearing of the output enable (EXCA_PWRCTL_OE) back to
  after we turn off the power.  Modify it to write 0 so that we don't get
  Bad Vcc messages on TI bridges (untested, but ru@ sent me a similar patch)
  while at the same time avoiding interrupt storms on Ricoh bridges (tested
  by me on my Sony).

# Many of my observations of 'breakage' for this patch are due to some bug
# in the load/unload of cbb.ko unlreated to this change.  I'll be investigating
# and fixing that bug in the fullness of time.
2005-10-08 06:57:13 +00:00
Warner Losh
f1abc0ea53 MFP4: We no longer use intr_handlers, so remove it. 2005-10-08 06:53:17 +00:00
Warner Losh
ed448ee4de MFP4: Note why we do the dance we do for waiting for the thread to die. 2005-10-08 06:51:47 +00:00
Scott Long
a3699bcaa6 Remove a couple of explicit memset(0) ops that were zeroing past the end of
an allocation.  This fixes the malloc 'use after free' panic on boot that
many were seeing.  It doesn't solve the problem of the allocations being
cached and then written past their bounds later.  That will take more work.

Submitted by: kan
2005-10-08 05:16:45 +00:00
Damien Bergamini
71016a2499 Fixes my previous commit (rev 1.20)
MFC after:	1 day
2005-10-07 18:11:32 +00:00
Gleb Smirnoff
6512768b89 A deja vu of:
http://lists.freebsd.org/pipermail/cvs-src/2004-October/033496.html

The same problem applies to if_bridge(4), too.

- Copy-and-paste the if_bridge(4) related block from
  if_ethersubr.c to ng_ether.c
- Add XXXs, so that copy-and-paste would be noticed by
  any future editors of this code.
- Also add XXXs near if_bridge(4) declarations.

Silence from:	thompsa
2005-10-07 14:14:47 +00:00
Marcel Moolenaar
125fbd3cdc Add parse_uuid() that creates a binary representation of an UUID from
a string representation.
2005-10-07 13:37:10 +00:00
Pawel Jakub Dawidek
8597a1c5b2 We don't need 'imp' here. 2005-10-07 10:30:47 +00:00
Gleb Smirnoff
9c26aa3c12 Polling is now configured with help of ifconfig(8), not sysctl.
Prodded by:     maxim
2005-10-07 09:23:51 +00:00
Gleb Smirnoff
6e65f82cd1 Polling is now configured with help of ifconfig(8), not sysctl.
Prodded by: 	maxim
2005-10-07 08:55:58 +00:00
Joel Dahl
727ded3a70 snd_ess needs snd_sbc, so add a note about that. 2005-10-07 06:32:11 +00:00
Poul-Henning Kamp
0694506637 Eliminate __RMAN_RESOURCE_VISIBLE hack entirely by moving the struct
resource_ to subr_rman.c where it belongs.
2005-10-06 21:49:31 +00:00
Damien Bergamini
80e1a7127f o Use firmware extended scan command; this one doesn't crash when scanning
the 5GHz band.
o Enable 802.11a channels scanning for 2915ABG adapters.
o Fix a typo (negociated->negotiated).

With hints from NetBSD.

MFC after:	2 days
2005-10-06 20:11:01 +00:00
Poul-Henning Kamp
947fc8de03 Make sure that the worker thread knows the type early enough to
grab Giant for vnode backing.

Found by:	pho & tegge
2005-10-06 19:47:04 +00:00
Pawel Jakub Dawidek
24f8c87b41 Backout strtok() addition to libkern, strsep() is enough and strtok()
is not safe.

Discussed with:	stefanf, njl
2005-10-06 19:06:07 +00:00
Pawel Jakub Dawidek
df71afde00 - Use strsep() instead of strtok().
- strdup() uses M_WAITOK, so we don't need to check it's return value
  against NULL.

MFC after:	2 weeks
2005-10-06 19:04:08 +00:00
John Baldwin
46ceae8bc4 Fix another edge case I just noticed when committing the previous changes:
If bus_setup_intr() fails, cleanup the ifnet setup in vx_attach() by
calling ether_ifdetach() and if_free().

MFC after:	1 week
2005-10-06 18:41:31 +00:00
John Baldwin
fa08ebbbb1 Rototill vx(4), add locking, and mark MPSAFE:
- Rename vxfoo() functions to vx_foo() to improve readability and
  consistency with other drivers.
- Prefix most the softc members with 'vx_' (the other members already had
  the prefix).
- Switch to using callout_init_mtx() and callout_*() rather than
  timeout() and untimeout().
- Add some missing calls to if_free() in some failure cases in vx_attach().
- Use if_printf() and remove the unit number from the softc.
- Remove uses of the 'register' keyword and spls.
- Add locked variants of vx_init() and vx_start().
- Add a mutex to the softc and lock it in various appropriate places.
- Setup the interrupt handler last during attach.

Tested by:	imp
MFC after:	1 week
2005-10-06 18:27:59 +00:00
Poul-Henning Kamp
2628fdabad Eliminate need for __RMAN_RESOURCE_VISIBLE
Reviewed by:	marcel@
2005-10-06 17:39:18 +00:00
Søren Schmidt
40fdf81237 Add support for setting the SG list segment size.
Use this for the SiI3112 workaround to get rid of the "oversized DMA" errors.

MFC to 6.0 candidate.
2005-10-06 15:44:07 +00:00
Olivier Houchard
2dfc7d008b Export PAGE_SIZE from genassym.c, and include assym.s in bcopy_page.S,
instead of <machine/param.h>.
2005-10-06 11:26:37 +00:00
Pawel Jakub Dawidek
720f3948c0 Add boot.nfsroot.options loader tunable.
It allows to specify options for NFS root file system.
Currently supported options are: soft, intr, conn, lockd.

I'm adding this functionality mostly for 'lockd' option, which is only
honored when performing the initial mount and will be silently ignored
if used while updating the mount options.

This will allow to use flock(2) without the need of using varmfs or
rpc.lockd and friends.

Example of use:
boot.nfsroot.options="intr,lockd"

MFC after:	2 weeks
2005-10-06 11:18:34 +00:00
Pawel Jakub Dawidek
5e66cbaeaf Add strtok() and strtok_r() function to libkern.
MFC after:	2 weeks
2005-10-06 11:10:09 +00:00
Pawel Jakub Dawidek
57432591c1 Fix a nasty typo. Change:
if (foo);
		bar();
to:
	if (foo)
		bar();
Really, really nasty bug and a very nice catch of mine.

Unfortunately, I'll not become a hero of the day, because the code is
commented out.
2005-10-06 08:30:40 +00:00
Tai-hwa Liang
11e0838887 Fixing a boot time panic(when if_fwip is compiled into kernel) by renaming
module name to something that wouldn't conflict with
sys/dev/firewire/firewire.c.

Submitted by:	Cai, Quanqing <caiquanqing at gmail dot com>
PR:		kern/82727
MFC after:	3 days
2005-10-06 07:09:34 +00:00
Andrew Thompson
64465c6bd3 Fix KASSERT function name in ether_output, use __func__ while I am here. 2005-10-06 01:21:40 +00:00
Warner Losh
4120e213d4 Make param.h includable again from assembler. 2005-10-05 23:36:19 +00:00
Warner Losh
48ce90210f Include forgotten rtl80x9 file for ed. 2005-10-05 21:56:27 +00:00
Alexander Leidinger
e7d2d131f1 - Locking improvements.
- Don't keep the SPDIF state in the driver private struct since it
  can be overriden by hand with pciconf(8), query it when needed instead.

Regarding the locking I let Ariff explain it himself:
---snip---
About the locking, that is what I'm intended to do since the beginning.
The reason I'm not putting that along since my first patchset was
because several people especially from amd46 camp reported that it cause
lots of LORs, which is weird considering that I've never encounter such
in a pretty much strict locking environment (i386). However, since our
previous discussion with Pyun YongHyeon about strict locking, I've
decided to bring it back for all the affected drivers, not just for
es137x. It turns out that the root of the problem was within dsp.c
during device open, which has been fixed since dsp.c revision 1.84.
---snip---

Submitted by:	Ariff Abdullah <skywizard@MyBSD.org.my>
2005-10-05 20:05:52 +00:00
Alexander Leidinger
dcbde45390 Add a comment regarding problems with NForce 2 mainboards and add disabled
code which may help.

People with a ich compatible soundcard which want to help out should
change the "#if 1" to a "#if 0" and try if the soundcard still works.
Reports about working or not-working soundcards with this change to
multimedia@ please.

PR:		73987
2005-10-05 20:00:12 +00:00
Alexander Leidinger
b64944a543 Don't use the builtin vaalist for icc.
Submitted by:	Igor Sysoev <is@rambler-co.ru>
MFC after:	3 days
2005-10-05 17:21:09 +00:00
Gleb Smirnoff
f0796cd26c - Don't pollute opt_global.h with DEVICE_POLLING and introduce
opt_device_polling.h
- Include opt_device_polling.h into appropriate files.
- Embrace with HAVE_KERNEL_OPTION_HEADERS the include in the files that
  can be compiled as loadable modules.

Reviewed by:	bde
2005-10-05 10:09:17 +00:00
Gleb Smirnoff
2afb277f09 - Don't include opt_global.h, it is always included implicitly.
- Include opt_device_polling.h
2005-10-05 10:07:27 +00:00
Gleb Smirnoff
0c06364148 Define HAVE_KERNEL_OPTION_HEADERS when building kernel and when building
modules along with kernel.

After this change it is possible to embrace opt_*.h includes with ifdef
HAVE_KERNEL_OPTION_HEADERS. And thus, avoid editing a lot of Makefiles
in modules directory each time we introduce a new opt_xxx.h.

Requested by:	bde
2005-10-05 10:05:55 +00:00
Warner Losh
eb562f7bc6 Add if_ed_rtl80x9.c 2005-10-05 05:26:03 +00:00
Warner Losh
6ec4c010b5 Remove debug that crept in.. 2005-10-05 05:24:35 +00:00
Warner Losh
95a2adef58 MFp4:
o Add support for Tamarack TC5299J + MII found on SMC 8041TX V.2
	  and corega PCCCCTXD
	o Add support for ISA/PCI RTL80[12]9 chips
	o Improve support for the ax88790 based
	o minor code movement

Submitted by: (#2) David Madole
2005-10-05 05:21:07 +00:00
Peter Wemm
07f5921b86 Don't set segment registers via ptrace yet. Its not ready. 2005-10-04 23:26:00 +00:00
Warner Losh
2f624c21c5 When data passed into devctl_notify is NULL, don't print (null). Instead
don't print anything at all.

# this fixes a problem that I noticed with devd.pipe not terminating lines
# with \n correctly sometimes.
2005-10-04 22:25:14 +00:00
Olivier Houchard
4d71248bc3 Remove a never reached RET. 2005-10-04 20:47:27 +00:00
Olivier Houchard
7bfacf3dc6 strd needs the destination to be double-word aligned, but the pointer passed
to savectx isn't always, so always use stmia, savectx isn't called enough
to need that kind of optimization.
2005-10-04 20:42:42 +00:00
Andrew Thompson
f69453ca8b When bridging is enabled and an ARP request is recieved on a member interface,
the arp code will search all local interfaces for a match. This triggers a
kernel log if the bridge has been assigned an address.

arp: ac🇩🇪48:18:83:3d is using my IP address 192.168.0.142!

bridge0: flags=8041<UP,RUNNING,MULTICAST> mtu 1500
        inet 192.168.0.142 netmask 0xffffff00
        ether ac🇩🇪48:18:83:3d

Silence this warning for 6.0 to stop unnecessary bug reports, the code will need
to be reworked.

Approved by:	mlaier (mentor)
MFC after:	3 days
2005-10-04 19:50:02 +00:00
Andre Oppermann
1fd7af262a Correct brainfart in SO_BINTIME test.
Pointed out by:	nate
Pointy hat to:	andre
2005-10-04 18:19:21 +00:00
Andre Oppermann
e5fbf72cd8 Make SO_BINTIME timestamps available on raw_ip sockets.
Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-10-04 18:07:11 +00:00
Robert Watson
7723d5ed12 Re-order MAC and DAC checks in shmget() in order to give precedence to
the MAC result, as well as avoid losing the DAC check result when MAC
is enabled.

MFC after:	3 days
Reported by:	Patrick LeBlanc <Patrick dot LeBlanc at sparta dot com>
2005-10-04 16:40:20 +00:00
Olivier Houchard
db7db23dd8 dump_avail has nothing to do with ARM_USE_SMALL_ALLOC, so move its
declaration out of the #ifdef.
2005-10-04 16:29:31 +00:00
Roman Kurakin
826cf005ed Use FILEDESC_UNLOCK(fdp) after FILE_UNLOCK(p), not before to avoid LOR.
Slightly discussed on current@.

LOR #055

MFC after:	14 days
2005-10-04 16:27:54 +00:00
Christian S.J. Peron
cb1d4f92ec Protect PID initializations for statistics by the bpf descriptor
locks. Also while we are here, protect the bpf descriptor during
knlist_remove{add} operations.

Discussed with:	rwatson
2005-10-04 15:06:10 +00:00
Diomidis Spinellis
1e3090039d Update the vnode's access time after an mmap operation on it.
Before this change a copy operation with cp(1) would not update the
file access times.

According to the POSIX mmap(2) documentation: the st_atime field
of the mapped file may be marked for update at any time between the
mmap() call and the corresponding munmap() call. The initial read
or write reference to a mapped region shall cause the file's st_atime
field to be marked for update if it has not already been marked for
update.
2005-10-04 14:58:58 +00:00
Christian S.J. Peron
7367bc5a54 Use the correct object's backing_object_offset while calculating offsets.
While we are here, add a note that we need to lock the object before walking
the backing object list.

Pointed out by:	alc
Discussed with:	rwatson
2005-10-04 14:47:47 +00:00
Olivier Houchard
6270a55ec3 Remove duplicate entry for DDB. 2005-10-04 14:39:33 +00:00
Olivier Houchard
fe7336292d Really detect if DDB is enabled.
Remove kernel.tramp if it exists on make clean.
2005-10-04 14:38:55 +00:00
Olivier Houchard
ae4135983b Make arm/disassem.c depends on DDB
make arm/in_cksum.c and arm/in_cksum_asm.S depend on INET.
2005-10-04 14:37:53 +00:00
Olivier Houchard
7c616799dc Fix build when DDB isn't defined. 2005-10-04 14:37:03 +00:00
Christian S.J. Peron
9eea3d85cc Standard Giant push down operations for the Mandatory Access Control (MAC)
framework. This makes Giant protection around MAC operations which inter-
act with VFS conditional, based on the MPSAFE status of the file system.

Affected the following syscalls:

o __mac_get_fd
o __mac_get_file
o __mac_get_link
o __mac_set_fd
o __mac_set_file
o __mac_set_link

-Drop Giant all together in __mac_set_proc because the
 mac_cred_mmapped_drop_perms_recurse routine no longer requires it.
-Move conditional Giant aquisitions to after label allocation routines.
-Move the conditional release of Giant to before label de-allocation
 routines.

Discussed with:	rwatson
2005-10-04 14:32:58 +00:00
Christian S.J. Peron
dc063b81ab Conditionally pickup Giant in mac_cred_mmapped_drop_perms_recurse so
we can drop it all together in __mac_set_proc.

Reviewed by:	alc
Discussed with:	rwatson
2005-10-04 14:32:15 +00:00
Robert Watson
cea2165b10 Rename net.isr.enable to net.isr.dispatch.
No compatibility code is provided, as this will be the production name
as of 6.0.

MFC after:	3 days
Requested by:	scottl
2005-10-04 07:59:28 +00:00
Scott Long
67a7895ff4 For some utterly bizarre reason, sparc64 coerces PAGE_SIZE to be a long
instead of an int.  No other FreeBSD architecture does this.  Patch over
this problem in the lmc driver.  While I'm here, correct a mistake with
DEVICE_POLLING.
2005-10-04 04:49:21 +00:00
Scott Long
b9aaa6961b Oops, left a compile option enabled that should not be enabled. 2005-10-04 04:40:21 +00:00
Don Lewis
34ea500bea Add missing word to comment. 2005-10-04 04:02:33 +00:00
Olivier Houchard
70359cf526 Bring in the good version of this file. 2005-10-03 22:44:54 +00:00
Don Lewis
448434c3c9 Initialize the inode i_flag field in ffs_valloc() to clean up any
stale flag bits left over from before the inode was recycled.

Without this change, a leftover IN_SPACECOUNTED flag could prevent
softdep_freefile() and softdep_releasefile() from incrementing
fs_pendinginodes.  Because handle_workitem_freefile() unconditionally
decrements fs_pendinginodes, a negative value could be reported at
file system unmount time with a message like:
        unmount pending error: blocks 0 files -3
The pending block count in fs_pendingblocks could also be negative
for similar reasons.  These errors can cause the data returned by
statfs() to be slightly incorrect.  Some other cleanup code in
softdep_releasefile() could also be incorrectly bypassed.

MFC after:	3 days
2005-10-03 21:57:43 +00:00
John Baldwin
f2107e8d54 Use the constants for the syscall names from syscall.h rather than
hardcoding the numbers for the SYSVIPC syscalls.
2005-10-03 18:34:17 +00:00
John Baldwin
78de678e75 - Use if_printf() and device_printf() and axe lge_unit from the softc.
- Don't bzero the softc first thing in attach.
- Cleanup error handling in attach() to avoid lots of duplication.
- Don't initialize the callout handle twice.

MFC after:	3 days
2005-10-03 15:52:34 +00:00
John Baldwin
78486f55ef - Use PCIR_BAR().
- Remove unused TXP_PCI_INTLINE and TXP_DEVNAME macros.
2005-10-03 15:47:15 +00:00
Olivier Houchard
31fbb5019f Add dma and aau. 2005-10-03 14:20:44 +00:00
Olivier Houchard
50f31d87f2 Import dummy drivers for the i80321 DMA controller and AAU.
The DMA controller driver only knows how to do memory to memory copies, and
the AAU driver how to zero a chunk of memory.
Use them to process big (>=1KB) copying/zeroing.
2005-10-03 14:19:55 +00:00
Olivier Houchard
9ea7c8bb52 Make mem.c know about the pages allocated with ARM_USE_SMALL_ALLOC. 2005-10-03 14:18:21 +00:00
Olivier Houchard
735a72bbee Export the variables needed for the copy/zero API. 2005-10-03 14:17:45 +00:00
Olivier Houchard
edcbfd05d1 Make sure the interrupt is masked before processing it, or bad things
can happen.
2005-10-03 14:17:16 +00:00
Olivier Houchard
1f59105ee7 If a thread already tries to allocate a new memory range, wait for it
instead of trying to do the same.
2005-10-03 14:16:41 +00:00
Olivier Houchard
b834efd591 Provide a dump_avail[] variable, which contains the page ranges to be
dumped.

For iq31244_machdep.c, attempt to recognize hints provided by the elf
trampoline.
2005-10-03 14:15:50 +00:00
Gleb Smirnoff
e113edf30a o Move a lot of parameter checking from netisr_poll() to
dedicated sysctl handlers. Protect manipulations with
  poll_mtx. The affected sysctls are:
  - kern.polling.burst_max
  - kern.polling.each_burst
  - kern.polling.user_frac
  - kern.polling.reg_frac
o Use CTLFLAG_RD on MIBs that supposed to be read-only.
o u_int32t -> uint32_t
o Remove unneeded locking from poll_switch().
2005-10-03 14:15:26 +00:00
Olivier Houchard
92399cb788 - Provide the kernel l1pt physical address, for userland.
- Use the new API for pmap_copy_page() and pmap_zero_page().
- Just write-back the pages in pmap_qenter(), and invalidate it in
pmap_qremove().
- Nuke the cache flushing in pmap_enter_quick(), it's not needed anymore.
2005-10-03 14:13:50 +00:00
Olivier Houchard
0122bd1470 Add a new API to let platform-specific ports provide functions for big
copy/zeroing.
2005-10-03 14:12:10 +00:00
Olivier Houchard
87351d5fd2 Export the virtual and physical address in which the kernel was loaded,
needed for userland when reading kernel dumps.
2005-10-03 14:10:55 +00:00
Olivier Houchard
7141c0d343 Makefile magic for the ELF trampoline. 2005-10-03 14:09:58 +00:00
Olivier Houchard
b91c6ffecb Import a small ELF trampoline, in which the kernel is embedded, and that
is able to load the kernel into memory, symbol table included. This is
needed to be able to access the symbol table from DDB without a boot
loader.
2005-10-03 14:09:36 +00:00
Olivier Houchard
ffbce9965f *blush*
Don't try to dereference map if it's NULL.
While I'm there, increase the minimum value to write-back/invalidate the
whole dcache in bus_dmamap_sync().
2005-10-03 14:07:57 +00:00
Olivier Houchard
31c197a753 Only save the registers that are used. 2005-10-03 14:07:09 +00:00
Olivier Houchard
93d18f4760 asm versions of in_cksum_hdr() and in_pseudo(). 2005-10-03 14:06:44 +00:00
Olivier Houchard
f4f2359647 Define KERNELDUMP_ARM_VERSION. 2005-10-03 14:06:00 +00:00
Olivier Houchard
6d918e02d0 Implement savectx().
Obtained from:	NetBSD
2005-10-03 14:05:38 +00:00
Olivier Houchard
f7a1a1e2e5 Kernel dump for arm, ripped from the ia64/amd64 version. 2005-10-03 14:05:03 +00:00
Colin Percival
33812c066d If sufficiently bad things happen during a call to kern_execve(), it is
possible for do_execve() to call exit1() rather than returning.  As a
result, the sequence "allocate memory; call kern_execve; free memory"
can end up leaking memory.

This commit documents this astonishing behaviour and adds a call to
exec_free_args() before the exit1() call in do_execve().  Since all
the users of kern_execve() in the tree use exec_free_args() to free
the command-line arguments after kern_execve() returns, this should
be safe, and it fixes the memory leak which can otherwise occur.

Submitted by:	Peter Holm
MFC after:	3 days
Security:	Local denial of service
2005-10-03 12:49:54 +00:00
Robert Watson
c48b03fb69 Unlock Giant symmetrically with respect to lock acquire order as that's
generally nicer.

Spotted by:	johan
MFC after:	1 week
2005-10-03 11:34:29 +00:00
Robert Watson
1fa9efeffb Acquire Giant conditionally in in_addmulti() and in_delmulti() based on
whether the interface being accessed is IFF_NEEDSGIANT or not.  This
avoids lock order reversals when calling into the interface ioctl
handler, which could potentially lead to deadlock.

The long term solution is to eliminate non-MPSAFE network drivers.

Discussed with:	jhb
MFC after:	1 week
2005-10-03 11:09:39 +00:00
Scott Long
2bc6081c9f Reintroduce the lmc T1/E1/T3 WAN driver. This version is locked, supports
interface polling, compiles on 64-bit platforms, and compiles on NetBSD,
OpenBSD, BSD/OS, and Linux.  Woo!  Thanks to David Boggs for providing this
driver.

Altq, sppp, netgraph, and bpf are required for this driver to operate.
Userland tools and man pages will be committed next.

Submitted by: David Boggs
2005-10-03 07:05:34 +00:00
Hajimu UMEMOTO
56e5a87a55 make saved cpu level stackable. 2005-10-03 06:57:29 +00:00
Yaroslav Tykhiy
1cf236fb0c Improve handling flags that must be propagated
to the parent interface, such as IFF_PROMISC and
IFF_ALLMULTI.  In addition, vlan(4) gains ability
to migrate from one parent to another w/o losing
its own flags.

PR:		kern/81978
MFC after:	2 weeks
2005-10-03 02:24:21 +00:00
Yaroslav Tykhiy
b5c8bd5924 Clean up consistency checks in if_setflag():
. use KASSERT for all checks so that the source of an error can be detected;
. use __func__ instead of spelling function name each time;
. fix a typo.
2005-10-03 02:14:51 +00:00
Yaroslav Tykhiy
7aebc5e86e Log a message about entering or leaving permanently promiscuous mode,
as it is done for usual promiscuous mode already.  This info is important
because promiscuous mode in the hands of a malicious party can jeopardize
the whole network.
2005-10-03 01:47:43 +00:00
Don Lewis
5032ff8197 Always wire the sysctl output buffer in sysctl_kern_proc() before
calling sysctl_out_proc().  -- fix from jhb

Move the code in fill_kinfo_thread() that gathers data from struct proc
into the new function fill_kinfo_proc_only().

Change all callers of fill_kinfo_thread() to call both
fill_kinfo_proc_only() and fill_kinfo() thread.  When gathering
data from a multi-threaded process, fill_kinfo_proc_only() only needs
to be called once.

Grab sched_lock before accessing the process thread list or calling
fill_kinfo_thread().

PR:		kern/84684
MFC after:	3 days
2005-10-02 23:27:56 +00:00
Olivier Houchard
da927f93bd - Call db_setup_paging() for traceall.
- Make it so one can't call db_setup_paging() if it has already been called
before. traceall needs this, or else the db_setup_paging() call from
db_trace_thread() will reset the printed line number, and override its
argument.
This is not perfect for traceall, because even if one presses 'q' while in
the middle of printing a backtrace it will finish printing the backtrace
before exiting, as db_trace_thread() won't be notified it should stop, but
it is hard to do better without reworking the pager interface a lot more.
2005-10-02 22:57:31 +00:00
Andrew Thompson
d5edd47e8f Do not packet filter in the bridge_start() routine, locally generated packets
are already filtered by the higher layers.

Approved by:	mlaier (mentor)
MFC after:	3 days
2005-10-02 19:15:56 +00:00
Alexander Leidinger
34ac5f0f5f * Fixed rate operation for es1370 chip to solve conflicting
sampling rate between playback and recording. This can be
  disabled / enabled via kernel hints
  (hint.pcm.<unit>.fixed_rate=0/4000-48000) or sysctl
  hw.snd.pcm<unit>.fixed_rate=0/4000-48000). Default to 48khz
  fixed rate. [1]
* Basic cleanup. *_es1371x_* -> *_es137x_*.
* Some locking fixes. [2]

Submitted by:	Ariff Abdullah <skywizard@MyBSD.org.my>
Discussed with:	yongari [2]
See also:	http://lists.freebsd.org/pipermail/freebsd-multimedia/2005-September/002758.html [1]
Reported by:	Jos Backus <jos at catnook.com> [1]
2005-10-02 15:56:36 +00:00
Alexander Leidinger
f84e94870d Emulate pcm mixer controller for any uaudio device without it.
Submitted by:	Ariff Abdullah <skywizard@MyBSD.org.my>
2005-10-02 15:51:19 +00:00
Alexander Leidinger
d793e09c95 The cmi9739_patch function which is referenced by ac97.c (rev. 1.56) now...
Submitted by:	Ariff Abdullah <skywizard@MyBSD.org.my>
Pointy hat to:	netchild (for not committing it with rev. 1.56 of ac97.c)
2005-10-02 15:50:22 +00:00
Alexander Leidinger
28ef3fb011 sys/dev/sound/pcm/sndstat.c:
* General spl* cleanup. It doesn't serve any purpose anymore.
   * Nuke sndstat_busy(). Addition of sndstat_acquire() /
     sndstat_release() for sndstat exclusive access. [1]

sys/dev/sound/pcm/sound.c:
   * Remove duplicate SLIST_INIT()
   * Use sndstat_acquire() / release() to lock / release the entire
     sndstat during pcm_unregister(). This should fix LOR #159 [1]

sys/dev/sound/pcm/sound.h:
   * Definition of SD_F_SOFTVOL (part of feeder volume)
   * Nuke sndstat_busy(). Addition of sndstat_acquire() /
     sndstat_release() for exclusive sndstat access. [1]

Submitted by:	Ariff Abdullah <skywizard@MyBSD.org.my>
LOR:		159 [1]
Discussed with:	yongari [1]
2005-10-02 15:43:57 +00:00
Alexander Leidinger
62340837c3 General spl* cleanup. It doesn't serve any purpose anymore.
Submitted by:	Ariff Abdullah <skywizard@MyBSD.org.my>
2005-10-02 15:39:07 +00:00
Alexander Leidinger
cb44f623ec sys/dev/sound/pcm/ac97.c:
* Added codec id for CMI9761.
   * feeder_volume *whitelist* through ac97_fix_volume()

sys/dev/sound/pcm/ac97.h:
   * Added AC97_F_SOFTVOL definition.

sys/dev/sound/pcm/channel.c:
   * Slight changes for chn_setvolume() to conform with OSS.
   * FEEDER_VOLUME is now part of feeder building process.

sys/dev/sound/pcm/mixer.c:
   * General spl* cleanup. It doesn't serve any purpose anymore.
   * Main hook for feeder_volume.

Submitted by:	Ariff Abdullah <skywizard@MyBSD.org.my>
Tested by:	multimedia@
2005-10-02 15:37:40 +00:00
Alexander Leidinger
4406886f5e Soft volume implementation for audio devices without pcm mixer controller.
Submitted by:	Ariff Abdullah <skywizard@MyBSD.org.my>
Tested by:	multimedia@
2005-10-02 15:31:03 +00:00
Robert Watson
a7ad956bdf Add a DDB "traceall" function, which stack traces all known process
threads.  This is quite useful if generating a debug log for post-mortem
by another developer, in which case the person at the console may not
know which threads are of interest.  The output of this can be quite
long.

Discussed with:	kris
MFC after:	3 days
2005-10-02 11:41:12 +00:00
Robert Watson
c30bf5c317 Include kdb.h so that kdb_active is declared regardless of KDB being
included in the kernel.

MFC after:	0 days
2005-10-02 10:03:51 +00:00
Robert Watson
5bb52dc4d5 Complete removal of mac_create_root_mount/mpo_create_root_mount MAC
interfaces.

Obtained from:	TrustedBSD Project
Submitted by:	Chris Vance <Christopher dot Vance at SPARTA dot com>
MFC after:	3 days
2005-10-02 09:53:00 +00:00
Maxim Konovalov
ac827533df o Teach sysctl_drop() how to deal with the sockets in TIME_WAIT state.
This is a special case because tcp_twstart() destroys a tcp control
block via tcp_discardcb() so we cannot call tcp_drop(struct *tcpcb) on
such connections.  Use tcp_twclose() instead.

MFC after:	5 days
2005-10-02 08:43:57 +00:00
Boris Popov
ef29b0f6a1 Allow user to override default port numbers used by communication
protocols.  This is very useful for tunneled SMB connections.

MFC after:	4 weeks
2005-10-02 08:32:49 +00:00
Tai-hwa Liang
fec39060e2 Fixing WEP bustage in hostap mode since 5.2-RELEASE.
- WEP TX fix:

  The original code called software crypto, ieee80211_crypto_encap(),
which never worked since IEEE80211_KEY_SWCRYPT was never flagged due to
ieee80211_crypto_newkey() assumes that wi always supports hardware based
crypto regardless of operational mode(by virtue of IEEE80211_C_WEP).
This fix works around that issue by adding wi_key_alloc() to force
the use of s/w crypto.  Also if anyone ever decides to cleanup ioctl
handling where key changes wouldn't cause a call to wi_init() every time,
we'll need wi_key_alloc() to DTRT.

  In addition to that, this fix also adds code to wi_write_wep() to force
existing keys to be switched between h/w and s/w crypto such that an
operation mode change(sta <-> hostap) will flag IEEE80211_KEY_SWCRYPT
properly.

- WEP RX fix:

  Clear IEEE80211_F_DROPUNENC even in hostap mode.  Quote from Sam:

	"This is really gross but I don't see an easy way around it.
	By doing it we lose the ability to independently drop unencode
	frames (and support mixed wep/!wep use).  We should really be
	setting the EXCLUDE_UNENCRYPTED flag written in wi_write_wep
	based on IEEE80211_F_DROPUNENC but with our clearing it we can't
	depend on it being set properly."

Reported by:	Holm Tiffe <holm at freibergnet dot de>
Submitted by:	sam
MFC after:	3 days
2005-10-02 04:29:08 +00:00
Tai-hwa Liang
4f4035be47 Honouring ic->ic_dtim_period.
Submitted by:	sam
MFC after:	3 days
2005-10-02 03:55:07 +00:00
Robert Watson
2affdbee3e Second attempt at a work-around for fifo-related socket panics during
make -j with high levels of parallelism: acquire Giant in fifo I/O
routines.

Discussed with:	ups
MFC after:	3 days
2005-10-01 20:15:41 +00:00
Poul-Henning Kamp
7bbb3a2690 Make sure the clone lists are sorted in the right order.
Explosion triggered by:	pjd
MFC:	3 days
2005-10-01 19:21:03 +00:00
Don Lewis
460858e9ef Correct previous commit to fix the sense of the TDP_NORUNNINGBUF
check in ffs_copyonwrite() that is a precondition for calling
waitrunningbufspace().

Pointed out by:	tegge
Pointy hat to:	truckman
MFC after:	3 days
2005-10-01 19:10:48 +00:00
Gleb Smirnoff
4092996774 Big polling(4) cleanup.
o Axe poll in trap.

o Axe IFF_POLLING flag from if_flags.

o Rework revision 1.21 (Giant removal), in such a way that
  poll_mtx is not dropped during call to polling handler.
  This fixes problem with idle polling.

o Make registration and deregistration from polling in a
  functional way, insted of next tick/interrupt.

o Obsolete kern.polling.enable. Polling is turned on/off
  with ifconfig.

Detailed kern_poll.c changes:
  - Remove polling handler flags, introduced in 1.21. The are not
    needed now.
  - Forget and do not check if_flags, if_capenable and if_drv_flags.
  - Call all registered polling handlers unconditionally.
  - Do not drop poll_mtx, when entering polling handlers.
  - In ether_poll() NET_LOCK_GIANT prior to locking poll_mtx.
  - In netisr_poll() axe the block, where polling code asks drivers
    to unregister.
  - In netisr_poll() and ether_poll() do polling always, if any
    handlers are present.
  - In ether_poll_[de]register() remove a lot of error hiding code. Assert
    that arguments are correct, instead.
  - In ether_poll_[de]register() use standard return values in case of
    error or success.
  - Introduce poll_switch() that is a sysctl handler for kern.polling.enable.
    poll_switch() goes through interface list and enabled/disables polling.
    A message that kern.polling.enable is deprecated is printed.

Detailed driver changes:
  - On attach driver announces IFCAP_POLLING in if_capabilities, but
    not in if_capenable.
  - On detach driver calls ether_poll_deregister() if polling is enabled.
  - In polling handler driver obtains its lock and checks IFF_DRV_RUNNING
    flag. If there is no, then unlocks and returns.
  - In ioctl handler driver checks for IFCAP_POLLING flag requested to
    be set or cleared. Driver first calls ether_poll_[de]register(), then
    obtains driver lock and [dis/en]ables interrupts.
  - In interrupt handler driver checks IFCAP_POLLING flag in if_capenable.
    If present, then returns.This is important to protect from spurious
    interrupts.

Reviewed by:	ru, sam, jhb
2005-10-01 18:56:19 +00:00
Don Lewis
5997cae9a4 Copy new process argument list in do_execve() before grabbing PROC_LOCK
to avoid touching pageable memory while holding a mutex.

Simplify argument list replacement logic.

PR:		kern/84935
Submitted by:	"Antoine Pelisse" apelisse AT gmail.com (in a different form)
MFC after:	3 days
2005-10-01 08:33:56 +00:00
Tom Rhodes
24b3d59965 Allow the root user to be aware of other credentials by virtue
of privilege.

Submitted by:	rwatson
2005-09-30 23:41:10 +00:00
Warner Losh
7d830ac9c2 Use ansi function definitions in preference to K&R to reduce diffs
with NetBSD (and cause it looks cooler).
2005-09-30 19:39:27 +00:00
Warner Losh
0d50777898 Not sporttings on other cards 2005-09-30 19:35:44 +00:00
Poul-Henning Kamp
73a2c3a32e The NWFS code in RELENG_6 is broken due to a typo in
sys/fs/nwfs/nwfs_vfsop= s.c, introduced with the conversion to
nmount with revision 1.38. This causes mount_nwfs to fail with
the error message:

  mount_nwfs: mount error: /mnt/netware: syserr = No such file or directo=
ry

This is caused by a typo on line 178, which specifies "nwfw_args"
rather than "nwfs_args".

Submitted by:	Antony Mawer <gnats@mawer.org>
Fat fingers:	phk
PR:		86757
MFC:		3 days
2005-09-30 18:21:05 +00:00
Don Lewis
bd3c2d867d Un-staticize waitrunningbufspace() and call it before returning from
ffs_copyonwrite() if any async writes were launched.

Restore the threads previous TDP_NORUNNINGBUF state before returning
from ffs_copyonwrite().
2005-09-30 18:07:41 +00:00
Tor Egge
2e93e9099e Move some devstat collection to below where large IO operations are chopped
up.  This make iostat report operations passed down to the device driver
instead of operations passed down to GEOM disk.  The transfer size limit
imposed by the device driver is no longer hidden, improving the correlation
between iostat output and device driver workload.
2005-09-30 17:32:08 +00:00
Warner Losh
bf12d88020 ciphy wasn't included here. 2005-09-30 14:54:17 +00:00
Warner Losh
08576af02f Add a more generic version of the mii_phy_match routine (mii_phy_match_gen)
which can be used for phy that want to piggy back other data with their
table.
2005-09-30 14:51:44 +00:00
Warner Losh
a1039f82ae Add macros which follow the miidevs design pattern to make it easier
to construct tables for mii_phy_match.
2005-09-30 14:45:10 +00:00
Yoshihiro Takahashi
f51b0ce0ee MFi386: revision 1.33.
> Cause all flags passed by boot2 to set the respective loader(8)
  > boot_* variable.  The end effect is that all flags from boot2
  > are now passed to the kernel.
2005-09-30 13:24:14 +00:00
Yoshihiro Takahashi
ea7d61cd29 Use 'PC Card' 2005-09-30 13:17:54 +00:00
David Xu
763a429571 Fox a LOR of sleep and sched_lock by using a timeout wait
when process reaches maximum number of threads.

MFC after: 3 days
2005-09-30 06:09:41 +00:00
Don Lewis
6c8b634f1d Un-staticize runningbufwakeup() and staticize updateproc.
Add a new private thread flag to indicate that the thread should
not sleep if runningbufspace is too large.

Set this flag on the bufdaemon and syncer threads so that they skip
the waitrunningbufspace() call in bufwrite() rather than than
checking the proc pointer vs. the known proc pointers for these two
threads.  A way of preventing these threads from being starved for
I/O but still placing limits on their outstanding I/O would be
desirable.

Set this flag in ffs_copyonwrite() to prevent bufwrite() calls from
blocking on the runningbufspace check while holding snaplk.  This
prevents snaplk from being held for an arbitrarily long period of
time if runningbufspace is high and greatly reduces the contention
for snaplk.  The disadvantage is that ffs_copyonwrite() can start
a large amount of I/O if there are a large number of snapshots,
which could cause a deadlock in other parts of the code.

Call runningbufwakeup() in ffs_copyonwrite() to decrement runningbufspace
before attempting to grab snaplk so that I/O requests waiting on
snaplk are not counted in runningbufspace as being in-progress.
Increment runningbufspace again before actually launching the
original I/O request.

Prior to the above two changes, the system could deadlock if enough
I/O requests were blocked by snaplk to prevent runningbufspace from
falling below lorunningspace and one of the bawrite() calls in
ffs_copyonwrite() blocked in waitrunningbufspace() while holding
snaplk.

See <http://www.holm.cc/stress/log/cons143.html>
2005-09-30 01:30:01 +00:00
Max Khon
f3b5092061 - Fix "end_blk out of range" panic when INVARIANTS.
- Do not allow rw access.

Submitted by:	Dario Freni <saturnero at freesbie dot org>
MFC after:	3 days
2005-09-29 22:45:16 +00:00
Don Lewis
445193b887 After a rmdir()ed directory has been truncated, force an update of
the directory's inode after queuing the dirrem that will decrement
the parent directory's link count.  This will force the update of
the parent directory's actual link to actually be scheduled.  Without
this change the parent directory's actual link count would not be
updated until ufs_inactive() cleared the inode of the newly removed
directory, which might be deferred indefinitely.  ufs_inactive()
will not be called as long as any process holds a reference to the
removed directory, and ufs_inactive() will not clear the inode if
the link count is non-zero, which could be the result of an earlier
system crash.

If a background fsck is run before the update of the parent directory's
actual link count has been performed, or at least scheduled by
putting the dirrem on the leaf directory's inodedep id_bufwait list,
fsck will corrupt the file system by decrementing the parent
directory's effective link count, which was previously correct
because it already took the removal of the leaf directory into
account, and setting the actual link count to the same value as the
effective link count after the dangling, removed, leaf directory
has been removed.  This happens because fsck acts based on the
actual link count, which will be too high when fsck creates the
file system snapshot that it references.

This change has the fortunate side effect of more quickly cleaning
up the large number dirrem structures that linger for an extended
time after the removal of a large directory tree.  It also fixes a
potential problem with the shutdown of the syncer thread timing out
if the system is rebooted immediately after removing a large directory
tree.

Submitted by:	tegge
MFC after:	3 days
2005-09-29 21:50:26 +00:00