Commit Graph

62423 Commits

Author SHA1 Message Date
Bruce Evans
1300fd67f3 Fixed some style bugs. Routine except:
- don't use __GNUCLIKE___OFFSETOF, since __offsetof() is a standard
  FreeBSD implementation detail which has nothing to do with GNUC.
2007-02-06 18:04:02 +00:00
Robert Watson
1ca8672907 Print intptr_t values by first casting to intmax_t and then printing with
%jd, as intptr_t may not be int-sized.

Assistance from:	jhb
Spotted by:		Mr Tinderbox
2007-02-06 17:22:36 +00:00
Robert Watson
4dbb37bd82 Update comments in mac.h.
Obtained from:	TrustedBSD Project
2007-02-06 16:24:57 +00:00
Bruce Evans
3764a82377 Simplified PCPU_GET() and PCPU_SET(). We must copy through a temporary
variable to avoid invalid constraints in dead code.  Use an array of
u_char's (inside a struct) instead of a char/short/int/long variable so
that the variable and its accesses can be spelled in the same way in all
cases and code doesn't need to be cloned just to hold the spelling
differences.

Fixed strict-aliasing errors in PCPU_SET() and in the amd64 PCPU_GET().
Cast to (void *) as in rev.1.37 of the i386 version where the errors
were fixed for the i386 PCPU_GET() only.  It would be more correct to
copy to and from the temp. variable using memcpy(), but then an
ifdef tangle would be required to ensure using the builtin memcpy().
We depend on fairly aggressive optimization to put the temp. variable
only in a register despite it being copied using
*(type *)(void *)&anothertype and could depend on this when using
memcpy() too.  This seems to work right even for -O0, but the -O0 case
has not been completely tested.

This change gives identical object code for all object files in LINT
on amd64 (except for one file with a __TIME__ stamp).  For LINT on
i386 it gives unimportant differences in instruction order and padding
in a few object files.  This was only tested for -O.

This change (actually a previous version of it) gives the following
reductions in the number of object files in LINT that fail to compile
with -O2 but without the -fno-strict-aliasing kludge:
- amd64: 29 (down from 211)
- i386: 36 (down from 47)

gcc-3.4.6 actually allows the invalid constraints that result from not
using the temp. variable, at least with -O[1-2], but gcc-3.3.3 crashes
on them and I don't want to depend on compiler bugs.
2007-02-06 16:21:09 +00:00
Robert Watson
1f837c4753 Push UNIX domain socket locking further into uipc_ctloutput() in order to
avoid holding the UNIX domain socket subsystem lock over sooptcopyin()
and sooptcopyout().  This problem was introduced when LOCAL_CREDS and
LOCAL_CONNWAIT support were added.

Reviewed by:	mdodd
2007-02-06 14:31:37 +00:00
Robert Watson
0142affc77 Introduce accessor functions mac_label_get() and mac_label_set() to replace
LABEL_TO_SLOT() macro used by policy modules to query and set label data
in struct label.  Instead of using a union, store an intptr_t, simplifying
the API.

Update policies: in most cases this required only small tweaks to current
wrapper macros.  In two cases, a single wrapper macro had to be split into
separate get and set macros.

Move struct label definition from _label.h to mac_internal.h and remove
_label.h.  With this change, policies may now treat struct label * as
opaque, allowing us to change the layout of struct label without breaking
the policy module ABI.  For example, we could make the maximum number of
policies with labels modifiable at boot-time rather than just at
compile-time.

Obtained from:	TrustedBSD Project
2007-02-06 14:19:25 +00:00
Warner Losh
21389c94d9 at91_twi depends on the iicbus module to satisfy its symbols when
loaded, so make that explicit.  Works for the monolithic kernel case,
won't work for the kldload case.
2007-02-06 12:07:14 +00:00
Robert Watson
c96ae1968a Continue 7-CURRENT MAC Framework rearrangement and cleanup:
Don't perform a nested include of _label.h in mac.h, as mac.h now
describes only  the user API to MAC, and _label.h defines the in-kernel
representation of MAC labels.

Remove mac.h includes from policies and MAC framework components that do
not use userspace MAC API definitions.

Add _KERNEL inclusion checks to mac_internal.h and mac_policy.h, as these
are kernel-only include files.

Obtained from:	TrustedBSD Project
2007-02-06 10:59:23 +00:00
Mike Pritchard
af7a34173d The change to the vm_page_queue_freelist lock from a spin lock to a
sleep lock missed the witness code, and the system will panic
immediately on boot if WITNESS is enabled.

Changed the witness definition to the new type.
2007-02-06 05:51:55 +00:00
Craig Rodrigues
8a4cab026b Eliminate some dead code which was introduced in 1.23, yet was always
commented out.
2007-02-06 03:30:58 +00:00
John Baldwin
c632517124 Change GDB_BUFSZ to be large enough to hold a register dump where each
register takes 16 characters (64-bit register in hex).  In practice this
is a slight bit of overkill as 7 of the 56 registers are only 32-bit, but
having the buffer too small results in remote kgdb trashing kernel memory
when it connects.

PR:		amd64/108673
Submitted by:	Ravi Murty, Nikhil Rao @ Intel
MFC after:	3 days
2007-02-05 21:48:32 +00:00
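The sizing argument in the GDB_BUFSZ commit above can be checked with a little arithmetic. This is only a sketch: the register counts come from the commit message, while the helper name and the assumption of two hex characters per byte with no protocol framing are illustrative and not taken from the kernel source.

```python
# Hypothetical helper: characters needed to hex-encode a register dump.
HEX_CHARS_PER_BYTE = 2

def dump_chars(reg_sizes_bytes):
    """Return the number of hex characters for a list of register sizes."""
    return sum(size * HEX_CHARS_PER_BYTE for size in reg_sizes_bytes)

# Figures from the commit message: 56 registers, 7 of them only 32-bit.
worst_case = dump_chars([8] * 56)        # every register treated as 64-bit
exact = dump_chars([8] * 49 + [4] * 7)   # 49 64-bit plus 7 32-bit registers

print(worst_case, exact, worst_case - exact)
```

Under these assumptions the "slight bit of overkill" is the difference between the two totals.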
Bruce M Simpson
e9077dd658 Fix devfs cloning for non-superusers when net.link.tap.user_open is non-zero.
Note: 'ifconfig tapX create' still requires PRIV_NET_IFCREATE privilege.

Reviewed by:	rwatson
2007-02-05 11:29:08 +00:00
Bruce M Simpson
cc67c657e0 Clean up after tun(4) properly; remove routes whose ifp is set to
that of the tun instance even for the !AF_INET case, and properly
remove configured addresses by calling if_purgeaddrs().

Maintain the TUN_DSTADDR behaviour for compatibility with the OS/390
emulator.

MFC after:	3 weeks
PR:		100080
Reviewed by:	bz
2007-02-05 11:15:52 +00:00
Bruce M Simpson
6ede684320 MFC after: 3 days 2007-02-05 11:05:41 +00:00
Kevin Lo
e4d87be479 <sys/sx.h> is unneeded. 2007-02-05 10:33:39 +00:00
Alan Cox
3ae3919d0b Change the free page queue lock from a spin mutex to a default (blocking)
mutex.  With the demise of Alpha support, there is no longer a reason for
it to be a spin mutex.
2007-02-05 06:02:55 +00:00
Bruce M Simpson
64e740a352 When fast-forwarding is enabled, do not forward directed IPv4 broadcasts
to locally attached broadcast networks.

Note well: This relies on the layer 2 route cloning behaviour in BSD.

PR:		98799
Tested by:	Dmitry Sergienko
MFC after:	1 week
2007-02-05 00:15:40 +00:00
Tor Egge
0d86a7f7c2 Call pbgetvp() and pbrelvp() instead of setting b_vp directly.
PR:		kern/108151
2007-02-04 23:42:02 +00:00
Lukas Ertl
92fb2d84f5 Add support for another 3G card and update man page accordingly.
The patch from the PR was a little outdated w/regards to the
Vodafone vendor string.

PR:            kern/106033
Submitted by:  Volker Werth <volker_AT_vwsoft.com>
MFC in:        3 days
2007-02-04 22:14:18 +00:00
Bruce M Simpson
cd83bbd2aa Implement ifnet cloning for tun(4)/tap(4).
Make devfs cloning a sysctl/tunable which defaults to on.

If devfs cloning is enabled, only the super-user may create
tun(4)/tap(4)/vmnet(4) instances. Devfs cloning is still enabled by
default; it may be disabled from the loader or via sysctl with
"net.link.tap.devfs_cloning" and "net.link.tun.devfs_cloning".

Disabling its use affects potentially all tun(4)/tap(4) consumers
including OpenSSH, OpenVPN and VMware.

PR:		105228 (potentially also 90413, 105570)
Submitted by:	Landon Fuller
Tested by:	Andrej Tobola
Approved by:	core (rwatson)
MFC after:	4 weeks
2007-02-04 16:32:46 +00:00
Jean-Sébastien Pédron
5a10830e1a Synaptics TouchPad seems to go back to Relative Mode after the call
to set_controller_command_byte(); by issuing a Read Mode Byte
command, the touchpad is in Absolute Mode again.

This problem occurred at least on Asus V6V laptops.
2007-02-04 12:47:52 +00:00
Joel Dahl
5bcbb3c5e8 Orion originally wrote and added these files in 2002/2003, so with his
approval, change the copyright statement to point at him instead of
"FreeBSD, Inc".

Encouraged by:	rwatson
Reviewed by:	imp
Discussed with and approved by:	orion
2007-02-04 06:52:33 +00:00
Mike Pritchard
522883b87f If quotacheck or edquota reset the block or inode grace time for
a user or group, when the kernel first sees this, it will update
the grace time value.  However, it never flags the quota as modified
and the updated value never makes it to the quota data file unless
the user actually makes some other change that would write the
data out.

Fixed to flag the quota as modified if the soft limit has actually
been reached and should be now enforced.
2007-02-04 06:46:57 +00:00
Warner Losh
6ae9968c9a Document the init_chroot and init_script variables.
# I didn't check the markup too closely, so doc people, please check

Submitted by: Oliver Fromme
2007-02-04 06:35:10 +00:00
Sam Leffler
f3b179a4b1 clear/reclaim challenge text when switching auth mode and operating as an ap
Obtained from:	Atheros
2007-02-04 05:49:16 +00:00
Alan Cox
055867a06c Include opt_ipdivert.h so that the message announcing ipfw correctly
describes the state of IPDIVERT.
2007-02-03 22:11:53 +00:00
Florent Thoumie
2b5fb13e20 Fix build (sc->dev => sc->sc_dev). 2007-02-03 21:11:11 +00:00
Rink Springer
cece26a63a Add support for the NetCell NC3000/5000 series SATA RAID cards.
Reviewed by:	sos
Approved by:	imp (mentor)
MFC after:	1 week
2007-02-03 20:12:00 +00:00
Warner Losh
df96f93d49 It turns out we were mallocing too early, so move the allocation so we
don't leak.
2007-02-03 19:11:09 +00:00
Warner Losh
5f1413947b Fix memory leak of devinfop
PR: 108719
Submitted by: Antoine Brodin
2007-02-03 16:41:55 +00:00
Warner Losh
f254d5b0d1 Fix possible memory leaks of devinfo.
PR: 108719
Submitted by: Antoine Brodin
2007-02-03 16:38:32 +00:00
Warner Losh
881c241ce3 Fix non-use, but not memory leak, of devinfop. Set the device's
description here.  The fix in the PR isn't necessary at all for memory
leaks, but we weren't setting the device description.

While I'm here, remove some of the obfuscating macros in attach.

PR: 108719
2007-02-03 16:33:47 +00:00
Warner Losh
9b4f44b31e Fix memory leak of devinfo. The leak itself was documented in
PR/108719, but there's a simpler fix: free it after it is used, and
then get rid of the redundant frees this causes.  Other leaks in this
PR not yet fixed.

While I'm here, remove NetBSD/OpenBSD code and some of the portability
#defines that were getting in the way of understanding this code.  The
devinfo bug was harder to spot because one needed to know that
device_set_desc_copy() was used inside of one of them (one that didn't
take an argument!).

Prefer device_printf(sc->sc_dev, "...") to printf("%s:...",
device_get_nameunit(sc->sc_dev)).  This saves almost 300 bytes.

PR: 108719
Submitted by: Antoine Brodin
2007-02-03 16:19:28 +00:00
Max Laier
38d4db193b Add a small informative printf under bootverbose to firmware_register to
track problems when loading firmware from loader.
2007-02-03 16:01:46 +00:00
Max Laier
fe46dc7031 Add ALTQ support for aue(4).
Tested by:	Greg Hennessy, Volker
MFC after:	1 week
2007-02-03 13:53:22 +00:00
Hajimu UMEMOTO
c57086ced7 ng_iface requires neighbor cache as well.
MFC after:	3 days
2007-02-03 09:34:36 +00:00
Bruce M Simpson
4729603e9a Style; remove argument names from prototype, be consistent with
rest of file.
This has the additional side-effect of removing a C++ reserved keyword
from this file; the keyword had prevented the Click Modular Router's
FreeBSD kernel support from building.

Reviewed by:	silence on -current
2007-02-03 07:49:20 +00:00
Kevin Lo
1f35d9bfbd ether_ifattach() sets if_mtu to ETHERMTU, so don't bother setting it again.
Approved by: imp, cognet
2007-02-03 07:46:26 +00:00
Warner Losh
8ccfb0607f We need to free the ivars for the child that we just deleted. 2007-02-03 07:09:36 +00:00
Bruce M Simpson
d256723b8b In fast forwarding path, defer processing of 169.254.0.0/16
to ip_input(). See RFC 3927 section 2.7.
2007-02-03 06:46:48 +00:00
Warner Losh
8cad31a480 The path to the mmc/mmcbus_if.m file is wrong. Correct it by
prepending dev/

Submitted by: Andrea Bittau
2007-02-03 06:46:11 +00:00
Bruce M Simpson
f8429ca2e1 In regular forwarding path, reject packets destined for 169.254.0.0/16
link-local addresses. See RFC 3927 section 2.7.
2007-02-03 06:45:51 +00:00
Warner Losh
9d5ef0737d Mark mmc *_if.m files as standard to allow for mmc/sd being compiled
as a module.

Submitted by: Andrea Bittau
2007-02-03 06:45:02 +00:00
Bruce M Simpson
7dc8d021ea Diff reduction with RELENG_6, style(9):
Remove unnecessary brace; && should be on end of line.
No functional changes.
2007-02-03 03:57:45 +00:00
Bruce M Simpson
7059a5e0bd Drop unicast Ethernet frames not destined for the configured address
of a tap(4) instance, if IFF_PROMISC is not set.

In tap(4), we should emulate the effect IFF_PROMISC would have on
hardware, otherwise we risk introducing layer 2 loops if tap(4) is
used with bridges. This means not even bpf(4) gets to see them.

This patch has been tested in a variety of situations. Multicast and
broadcast frames are correctly allowed through. I have observed this
behaviour causing problems with multiple QEMU instances hosted on
the same FreeBSD machine.

The checks in ether_demux() [if_ethersubr.c, rev 1.222, line 638]
are insufficient to prevent this bug from occurring, as ifp->if_vlantrunk
will always be NULL for the non-vlan case.

MFC after:	3 weeks
PR:		86429
Submitted by:	Pieter de Boer (with changes)
2007-02-03 02:57:45 +00:00
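The filtering rule described in the tap(4) commit above can be sketched as follows. This is an illustrative Python model, not the driver code: the function names are hypothetical, and the only facts relied on are the rule stated in the message and the Ethernet convention that the low bit of the first octet marks a multicast or broadcast address.

```python
def is_group_address(mac: bytes) -> bool:
    """True for multicast and broadcast destinations (I/G bit set)."""
    return bool(mac[0] & 0x01)

def tap_accepts(dst: bytes, own: bytes, promisc: bool) -> bool:
    """Model of the rule: unicast frames for other hosts are dropped
    unless the interface is in promiscuous mode."""
    if promisc or is_group_address(dst):
        return True
    return dst == own

own = bytes.fromhex("00bdaabbcc01")
other = bytes.fromhex("00bdaabbcc02")
broadcast = bytes.fromhex("ffffffffffff")

print(tap_accepts(other, own, promisc=False))      # unicast for someone else
print(tap_accepts(broadcast, own, promisc=False))  # broadcast still passes
```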
Bruce M Simpson
217f71d80c Use int instead of u_int for the 'extra' argument to the
clone_create() KPI.
This fixes a signedness bug in unit number comparisons.

Submitted by:	imp, Landon Fuller
PR:		kern/105228
MFC after:	2 weeks
2007-02-02 22:27:45 +00:00
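The commit above does not say which comparison was affected, so the following is only a generic illustration of why an unsigned parameter breaks comparisons against negative values. The -1 sentinel and the function names are assumptions made for the example; ctypes is used to mimic C integer storage.

```python
import ctypes

def as_u_int(value: int) -> int:
    """Reinterpret a Python int the way a C 'unsigned int' would store it."""
    return ctypes.c_uint(value).value

def unit_is_unset_signed(extra: int) -> bool:
    # With a signed parameter, a negative sentinel is detectable.
    return ctypes.c_int(extra).value < 0

def unit_is_unset_unsigned(extra: int) -> bool:
    # With an unsigned parameter the same test can never be true.
    return as_u_int(extra) < 0

print(as_u_int(-1))                 # a very large positive number
print(unit_is_unset_signed(-1))
print(unit_is_unset_unsigned(-1))
```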
Bruce M Simpson
d055815799 Comply with RFC 3927, by forcing ARP replies which contain a source
address within the link-local IPv4 prefix 169.254.0.0/16, to be
broadcast at link layer.

Reviewed by:	fenner
MFC after:	2 weeks
2007-02-02 20:31:44 +00:00
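A minimal sketch of the decision described in the ARP commit above, assuming the reply's link-layer destination is chosen from the sender's IPv4 address alone. The function and constant names are hypothetical; only the 169.254.0.0/16 prefix and the broadcast requirement come from the commit message.

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"

def arp_reply_dst(sender_ip: str, requester_mac: str) -> str:
    """Pick the link-layer destination for an ARP reply."""
    if ipaddress.ip_address(sender_ip) in LINK_LOCAL:
        # Link-local source address: broadcast the reply at layer 2.
        return BROADCAST_MAC
    return requester_mac

print(arp_reply_dst("169.254.7.9", "00:11:22:33:44:55"))
print(arp_reply_dst("192.0.2.10", "00:11:22:33:44:55"))
```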
John Baldwin
f50589d755 Add constants for the PCIY_VENDOR (vendor-specific), PCIY_DEBUG (EHCI
debug port), and PCIY_EXPRESS (PCI-express) capabilities.
2007-02-02 19:48:25 +00:00
Bruce M Simpson
1baaf8347c Expose smoothed RTT and RTT variance measurements to userland via
socket option TCP_INFO.
Note that the units used in the original Linux API are in microseconds,
so use a 64-bit mantissa to convert FreeBSD's internal measurements
from struct tcpcb from ticks.
2007-02-02 18:34:18 +00:00
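The need for a wide intermediate in the TCP_INFO commit above can be illustrated with the unit conversion alone. This sketch assumes a plain ticks-to-microseconds conversion at a given hz and ignores any fixed-point scaling the kernel applies to the stored values; the function names are hypothetical.

```python
U32_MAX = 0xFFFFFFFF

def ticks_to_usec(ticks: int, hz: int) -> int:
    """Convert a tick count to microseconds (Python ints never overflow)."""
    return ticks * 1_000_000 // hz

def intermediate_fits_u32(ticks: int) -> bool:
    """Would ticks * 1000000 fit in a 32-bit unsigned intermediate?"""
    return ticks * 1_000_000 <= U32_MAX

# Five seconds of RTT at hz=1000 is 5000 ticks.
print(ticks_to_usec(5000, 1000))
print(intermediate_fits_u32(5000))
```

With these inputs the result itself is small, but the product formed before the division does not fit in 32 bits, which is the situation the commit message guards against.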
Pawel Jakub Dawidek
5ab5525469 coda_vptofh is never defined nor used. 2007-02-02 15:47:28 +00:00
Joel Dahl
155414e2e4 Remove dead email address.
Requested by:	luigi
2007-02-02 13:44:09 +00:00
Joel Dahl
a0afd24d9c Clean up the BSD license to match the preferred license in
/usr/share/examples/etc/bsd-style-copyright.  I've fixed a
few minor wording and formatting differences.

Approved by:	luigi, Hannu Savolainen <hannu@opensound.com>
2007-02-02 13:39:20 +00:00
Joel Dahl
262e034444 Add a standard BSD license to these files.
Discussed with:	rwatson
Approved by:	luigi
2007-02-02 13:33:35 +00:00
Gleb Smirnoff
f8e159d658 Quoting Alexander:
Formulas described in RFC require high precision of floating point.
  Formulas of integer math implemented in ng_pptpgre give mistake in range
  of +0-7ms on RTT and +0-3ms on deviation. This leads to significant
  underestimation of real packet RTT.

  I have made a very simple patch to reduce mistake to +4-3ms on RTT and
  +2-1ms on deviation. Mistake in RTT is not good, but gets covered by
  deviation. To cover worst possible negative mistake in deviation I have
  added 2ms to it. Also this 2 ms cover the case when measured deviation
  is so small (about zero) that it can interfere with process scheduling
  delays or weather on Mars.

  My tests show decreasing of packet losses on 20ms RTT link from 2.5% to
  0.3% while speed increased by 1/3.

Reviewed by:	archie
2007-02-02 09:45:23 +00:00
Gleb Smirnoff
fbfdcf8735 Since rev. 1.94 of netinet/in.c, the netinet layer frees all its
multicast memberships, when interface is detached. Thus, when
an underlying interface is detached, we do not need to free
our multicast memberships.

Reviewed by:	bms
2007-02-02 09:39:09 +00:00
Konstantin Belousov
e6a4f4cd40 Record kqueue -> struct mount mtx -> vnode interlock lock order to
catch the places where reverse lock order is instantiated.

OKed by:	jeff
2007-02-02 09:02:18 +00:00
Konstantin Belousov
b4bb515484 Remove extern int hz; use proper include file instead. 2007-02-02 08:58:16 +00:00
Kevin Lo
b0ea96df61 Use bus_get_dma_tag() so iwi(4) works on platforms requiring it.
Approved by: cognet
2007-02-02 05:17:18 +00:00
Julian Elischer
c6226eea4c Move the setting of the idle_mask bits to a place where they
can't be wrong.
Also use the IDLETD bit in the thread mask to test if it's an idle thread
rather than doing a PCPU access.
2007-02-02 05:14:22 +00:00
Kevin Lo
8c605560c4 Remove a bogus i = 0
Approved by: cognet
2007-02-02 05:14:21 +00:00
Kip Macy
6d449d27d9 Add support for IPI_PREEMPT in order to enable use of the ULE scheduler 2007-02-02 05:00:21 +00:00
Kip Macy
d5ab3ef787 match against both dirty and writeable for marking page dirty 2007-02-02 04:57:11 +00:00
Sam Leffler
d9ff043726 add IEEE80211_IS_CHAN_PASSIVE
MFC after:	1 week
2007-02-02 02:45:33 +00:00
Andre Oppermann
6741ecf595 Auto sizing TCP socket buffers.
Normally the socket buffers are static (either derived from global
defaults or set with setsockopt) and do not adapt to real network
conditions. Two things happen: a) your socket buffers are too small
and you can't reach the full potential of the network between both
hosts; b) your socket buffers are too big and you waste a lot of
kernel memory for data just sitting around.

With automatic TCP send and receive socket buffers we can start with a
small buffer and quickly grow it in parallel with the TCP congestion
window to match real network conditions.

FreeBSD has a default 32K send socket buffer. This supports a maximal
transfer rate of only slightly more than 2Mbit/s on a 100ms RTT
trans-continental link. Or at 200ms just above 1Mbit/s. With TCP send
buffer auto scaling and the default values below it supports 20Mbit/s
at 100ms and 10Mbit/s at 200ms. That's an improvement of factor 10, or
1000%. For the receive side it looks slightly better with a default of
64K buffer size.

New sysctls are:
  net.inet.tcp.sendbuf_auto=1 (enabled)
  net.inet.tcp.sendbuf_inc=8192 (8K, step size)
  net.inet.tcp.sendbuf_max=262144 (256K, growth limit)
  net.inet.tcp.recvbuf_auto=1 (enabled)
  net.inet.tcp.recvbuf_inc=16384 (16K, step size)
  net.inet.tcp.recvbuf_max=262144 (256K, growth limit)

Tested by:	many (on HEAD and RELENG_6)
Approved by:	re
MFC after:	1 month
2007-02-01 18:32:13 +00:00
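The throughput figures quoted in the auto-sizing commit above follow from the usual window-per-round-trip bound. The sketch below reproduces them; the helper name is hypothetical, and it assumes a sender can have at most one socket buffer of data in flight per RTT.

```python
def max_rate_mbit(buffer_bytes: int, rtt_ms: int) -> float:
    """Upper bound on throughput when one buffer is sent per round trip."""
    bits_per_second = buffer_bytes * 8 * 1000 / rtt_ms
    return bits_per_second / 1_000_000

for buf in (32 * 1024, 262144):          # static default vs. sendbuf_max
    for rtt in (100, 200):
        print(buf, rtt, round(max_rate_mbit(buf, rtt), 2))
```

Under that assumption, a 32K buffer gives roughly 2.6 Mbit/s at 100ms and 1.3 Mbit/s at 200ms, and the 256K growth limit gives roughly 21 and 10.5 Mbit/s.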
Andre Oppermann
6a37f331d7 Generic socket buffer auto sizing support, header defines, flag inheritance.
MFC after:	1 month
2007-02-01 17:53:41 +00:00
Andre Oppermann
087b55ea59 Change the way the advertized TCP window scaling is computed. Instead of
upper-bounding it to the size of the initial socket buffer lower-bound it
to the smallest MSS we accept.  Ideally we'd use the actual MSS information
here but it is not available yet.

For socket buffer auto sizing to be effective we need room to grow the
receive window.  The window scale shift is determined at connection setup
and can't be changed afterwards.  The previous, original, method effectively
just did a power of two roundup of the socket buffer size at connection
setup severely limiting the headroom for larger socket buffers.

Tested by:	many (as part of the socket buffer auto sizing patch)
MFC after:	1 month
2007-02-01 17:39:18 +00:00
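To see why the old sizing limits headroom, the sketch below models the previous behaviour as the commit above describes it: pick the smallest shift whose scaled 16-bit window covers the initial socket buffer. The function name is hypothetical, the constants follow the TCP window scale option (a 65535-byte unscaled maximum and a maximum shift of 14), and the new lower-bound computation is not modelled because the message does not give its details.

```python
TCP_MAXWIN = 65535        # largest unscaled window
TCP_MAX_WINSHIFT = 14     # largest shift allowed by the window scale option

def scale_for(window_bytes: int) -> int:
    """Smallest shift such that (TCP_MAXWIN << shift) covers window_bytes."""
    shift = 0
    while shift < TCP_MAX_WINSHIFT and (TCP_MAXWIN << shift) < window_bytes:
        shift += 1
    return shift

initial = scale_for(65536)            # 64K default receive buffer
print(initial, TCP_MAXWIN << initial) # shift and largest advertisable window
print(scale_for(262144))              # shift needed to reach a 256K buffer
```

Because the shift is fixed at connection setup, a shift chosen for a 64K buffer cannot advertise a window anywhere near the 256K growth limit from the auto-sizing commit.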
Konstantin Belousov
d0b2365eec Introduce some more SO_ option equivalents from Linux to FreeBSD.
The msg variable in linux_recvmsg() was not initialized.
Copy it from userspace.

Submitted by: rdivacky
2007-02-01 13:36:19 +00:00
Konstantin Belousov
75ee4e5462 No need to lock emul_lock in exit_group() because em->shared
cannot change (because it is referenced by curthread). This fixes
a LOR caused by acquiring emul_shared_lock while holding emul_lock.

Fix typo in comment.

Submitted by: rdivacky
2007-02-01 13:33:33 +00:00
Konstantin Belousov
25954d7430 No need to synchronize linux_schedtail with linux_proc_init.
p->p_emuldata is properly initialized by the time the child can run.

Do not set p->p_emuldata to NULL when the process is exiting.
It does not make any sense and only costs 2 mutex operations.

Do not lock emul_data to unlock it on the very next line.
Comment on possible race while there.

Reparent all procs that are part of a threading group but not its leaders
to init and SIGCHLD init to finish the zombies off. This fixes zombies
left after opera's exit. [1]

There is no need to lock p_em in the linux_proc_init CLONE_THREAD
case because the process cannot change the address of the p_em->shared
because it is currently running this code path.
Move assigning of em->shared outside emul_shared_lock.

Noticed by: Scott Robbins <scottro@nyc.rr.com> [1]
Submitted by: rdivacky
2007-02-01 13:29:27 +00:00
Konstantin Belousov
a9ccaccfc3 Fix LOR that occurs because proctree_lock was acquired while holding
emuldata lock by moving the code upwards outside the emul_lock coverage.

Submitted by: rdivacky
2007-02-01 13:27:52 +00:00
Konstantin Belousov
84fbdf86b3 MFi386: Use LINUX_SIG_VALID macro.
Submitted by: rdivacky
2007-02-01 13:24:40 +00:00
Ariff Abdullah
9e4c8259a3 Fix huge memory leak within sound buffer (during channel destruction,
buffer resizing, etc.) that has been here for ages. Free all (unmanaged)
allocated buffers through sndbuf_destroy() in case we forgot to call
sndbuf_free(). For a managed buffer (mostly hw specific managed buffer),
either provide a CHANNEL_FREE() method with an appropriate return value to
invoke semi-automatic sndbuf_free() or simply do it on their own. If
all else fails, sndbuf_destroy() will come to the rescue as a
final measure.

MFC after:	3 days
2007-02-01 09:46:03 +00:00
Ariff Abdullah
e444a20971 Fix apparent memory leak (during vchan destruction) that has been here
for ages.
2007-02-01 09:30:01 +00:00
Tai-hwa Liang
c69e1d83f5 Reflecting the removal of MSDOSFS_LARGE found in sys/conf/files:1.1173.
This should fix the run time bustage observed on recent -CURRENT whilst
mounting a MSDOS filesystem with non-default locale/code page:

	link_elf: symbol msdosfs_fileno_free undefined
	KLD msdosfs_iconv.ko: depends on msdosfs - not available
2007-02-01 04:21:03 +00:00
Mike Pritchard
6c62e3fce9 Prevent quotactl calls that pass in an id of -1 from incorrectly
using the caller's UID instead of the GID when performing group
operations.  This could allow users to determine group quota
information for groups they are not a member of in some cases.

Rename the "uid" parameter in ufs_quotactl to "id" to better show
that it is used for more than just the uid, and to be more in line
with the naming conventions in the other quota routines.

PR:	kern/33940
2007-02-01 02:13:53 +00:00
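The defaulting rule fixed by the quotactl commit above can be sketched as a small function. This is a model of the behaviour the message describes, not the ufs_quotactl() source; the function and parameter names are hypothetical, and USRQUOTA/GRPQUOTA are given the conventional values 0 and 1.

```python
USRQUOTA = 0
GRPQUOTA = 1

def resolve_id(req_id: int, qtype: int, caller_uid: int, caller_gid: int) -> int:
    """Map an id of -1 to the caller's own id for the given quota type."""
    if req_id != -1:
        return req_id
    if qtype == GRPQUOTA:
        # The bug was falling through to the UID here.
        return caller_gid
    return caller_uid

print(resolve_id(-1, USRQUOTA, caller_uid=1001, caller_gid=20))
print(resolve_id(-1, GRPQUOTA, caller_uid=1001, caller_gid=20))
```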
Mike Pritchard
3c0508582d Disallow negative UIDs when processing quotactl options. 2007-02-01 01:01:56 +00:00
Mohan Srinivasan
4e99994cc9 Fix for a vnode lock leak in nfs_create() in the event of an error.
Spotted by ups@.
2007-01-31 23:10:27 +00:00
Andrew Gallatin
dce01b9b27 - Add 99% of a callout based watchdog. The remaining 1% is waiting
for pci_cfg_restore() to be exported.  It was tested using a
  hackily accessed pci_cfg_restore().

- Add ifmedia_removeall() to mxge_detach() in order to stop leaking
  an ifaddr

- Fix a small accounting bug introduced by the locking code shuffle
  which could cause spurious watchdog resets now that we have a
  watchdog.

Sponsored by: Myricom
2007-01-31 19:53:36 +00:00
Andrew Gallatin
c265717682 destroy busdma maps even if they are NULL, so as to avoid leaking
busdma tags.
2007-01-31 15:47:44 +00:00
Andrew Gallatin
a98d6cd71c Abandon using sleepable locks in favor of mutexes for mxge's if_ioctl
locking in preparation for adding a watchdog handler (callouts must
not use sleepable locks).  This required shuffling memory and
interrupt allocation to the attach routine rather than if_ioctl so as
to avoid potential sleeps while bringing up the interface.
2007-01-31 15:29:31 +00:00
Bruce M Simpson
1976bc4af7 Import macros IN_LINKLOCAL(), IN_PRIVATE(), IN_LOCAL_GROUP(), IN_ANY_LOCAL().
This is not a functional change.

IN_LINKLOCAL() tests if an address falls within the IPv4 link-local prefix.
IN_PRIVATE() tests if an address falls within an RFC 1918 private prefix.
IN_LOCAL_GROUP() tests if an address falls within the statically assigned
link-local multicast scope specified in RFC 2365.
IN_ANY_LOCAL() tests for either of IN_LINKLOCAL() or IN_LOCAL_GROUP().

As with the existing macros in the FreeBSD netinet stack, comparisons
are performed in host-byte order.

See also:	RFC 1918, RFC 2365, RFC 3927
Obtained from:	NetBSD (dyoung@)
MFC after:	2 weeks
2007-01-31 14:34:47 +00:00
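The four tests named in the commit above can be modelled on host-byte-order integers, as the message says the macros operate. This is a sketch: the masks below are derived from the prefixes in RFC 1918, RFC 2365 and RFC 3927 rather than copied from the header, and the lower-case function names and the host_order() helper are hypothetical.

```python
import ipaddress

def host_order(dotted: str) -> int:
    """IPv4 dotted quad as a host-byte-order integer."""
    return int(ipaddress.IPv4Address(dotted))

def in_linklocal(a: int) -> bool:      # 169.254.0.0/16
    return (a & 0xFFFF0000) == 0xA9FE0000

def in_private(a: int) -> bool:        # 10/8, 172.16/12, 192.168/16
    return ((a & 0xFF000000) == 0x0A000000
            or (a & 0xFFF00000) == 0xAC100000
            or (a & 0xFFFF0000) == 0xC0A80000)

def in_local_group(a: int) -> bool:    # 224.0.0.0/24
    return (a & 0xFFFFFF00) == 0xE0000000

def in_any_local(a: int) -> bool:
    return in_linklocal(a) or in_local_group(a)

print(in_linklocal(host_order("169.254.10.20")))
print(in_private(host_order("172.31.255.255")))
print(in_local_group(host_order("224.0.0.251")))
```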
Joel Dahl
fcacf52ec7 Put #ifndef... after the license.
Approved by:	ariff
2007-01-31 12:10:48 +00:00
Joel Dahl
22821dadfc s/WHETHERIN/WHETHER IN/ & s/THEPOSSIBILITY/THE POSSIBILITY/ in the
license text.

Approved by:	imp
2007-01-31 08:53:45 +00:00
Ruslan Ermilov
a35dcad5cb MFsparc64: Add .cvsignore file here too. 2007-01-30 10:50:55 +00:00
Ruslan Ermilov
9633d80634 Remove the last vestige of opt_msdosfs.h.
Submitted by:	grep(1)
2007-01-30 10:17:36 +00:00
Andrew Gallatin
a82c2581b5 Minor updates:
- initialize ifq_drv_maxlen correctly
- mark the interface as jumbo capable
- keep stats on the number of times the hw transmit queue filled and
  was restarted.
2007-01-30 08:39:44 +00:00
Tai-hwa Liang
61ad2e26ef Fixing compilation bustage by removing references to opt_msdosfs.h.
This auto-generated header file no longer exists since the removal of
MSDOSFS_LARGE in sys/conf/options:1.574.
2007-01-30 08:05:04 +00:00
Craig Rodrigues
379064396d Remove MSDOSFS_LARGE compile time option. It has been converted
to a run time "-o large" mount option.

PR:		105964
MFC after:	2 weeks
2007-01-30 05:01:06 +00:00
Tom Rhodes
bade0e00f3 Fix spacing from my previous commit to this file:
Noticed by:	fjoe
2007-01-30 04:41:38 +00:00
Craig Rodrigues
f458f2a553 Add a "-o large" mount option for msdosfs. Convert compile-time checks for
#ifdef MSDOSFS_LARGE to run-time checks to see if "-o large" was specified.

Test case provided by Oliver Fromme:
  truncate -s 200G test.img
  mdconfig -a -t vnode -f test.img -u 9
  newfs_msdos -s 419430400 -n 1 /dev/md9 zip250
  mount -t msdosfs /dev/md9 /mnt    # should fail
  mount -t msdosfs -o large /dev/md9 /mnt   # should succeed

PR:		105964
Requested by:	Oliver Fromme <olli lurza secnetix de>
Tested by:	trhodes
MFC after:	2 weeks
2007-01-30 03:11:45 +00:00
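The sector count in the test case above matches the image size if sectors are 512 bytes and the G suffix means 2^30 bytes; both are assumptions here, and the helper name is hypothetical. A quick check:

```python
SECTOR_BYTES = 512          # assumed sector size
GIB = 2 ** 30               # assumed meaning of the "G" suffix

def sectors_for(size_bytes: int, sector_bytes: int = SECTOR_BYTES) -> int:
    """Number of whole sectors in an image of the given size."""
    return size_bytes // sector_bytes

image_bytes = 200 * GIB     # truncate -s 200G
print(sectors_for(image_bytes))   # compare with newfs_msdos -s 419430400
```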
Kevin Lo
25777ce331 Use our own timer that piggybacks on npe_tick() callout instead of
if_watchdog/if_timer interface.

Approved by: sam, cognet
2007-01-30 01:18:29 +00:00
Kris Kennaway
410355bf69 Instead of always hard-coding the socket type for the nfs root mount as
SOCK_DGRAM (i.e. UDP), respect the value configured earlier.  This allows
TCP NFS root mounts using e.g. the boot.nfsroot.options="tcp" tunable.

In this case some of the connection parameters like the retry timer were
previously set appropriately for TCP but inappropriately for the UDP
socket that was actually used, leading to e.g. extremely long recovery
times (O(hours)) after a nfs server reboot.

Reviewed by:    mohans
MFC After:      2 weeks
2007-01-30 00:26:04 +00:00
Robert Watson
6d38c5ad80 Update comment for struct bpf_d: we now store buffered packets for BPF
in malloc'd storage, not in mbuf clusters.
2007-01-29 14:41:03 +00:00
Pawel Jakub Dawidek
1ded77b222 We expect 'bio_data != NULL' for BIO_{READ,WRITE,GETATTR}, but for
BIO_{DELETE,FLUSH} we expect 'bio_data == NULL'.

Reviewed by:	phk
2007-01-28 23:36:07 +00:00
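The expectation stated in the commit above amounts to a small consistency check per request type. The sketch below models it with strings for the command names; the function name and the use of None for a NULL pointer are illustrative choices, not the GEOM code.

```python
NEEDS_DATA = {"BIO_READ", "BIO_WRITE", "BIO_GETATTR"}
NO_DATA = {"BIO_DELETE", "BIO_FLUSH"}

def bio_data_ok(cmd: str, data) -> bool:
    """True if the data pointer matches what the command expects."""
    if cmd in NEEDS_DATA:
        return data is not None
    if cmd in NO_DATA:
        return data is None
    raise ValueError(f"unknown command: {cmd}")

print(bio_data_ok("BIO_READ", bytearray(512)))
print(bio_data_ok("BIO_FLUSH", None))
```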
Joel Dahl
48351eaf73 Clean up the BSD license to match the preferred license in
/usr/share/examples/etc/bsd-style-copyright.  I've fixed a
few minor wording and formatting differences.

Approved by:	matk, Hannu Savolainen <hannu@opensound.com>
Reviewed by:	imp
2007-01-28 20:38:07 +00:00
Pawel Jakub Dawidek
a1ea1a22e9 It is possible for GEOM to taste a provider before SMP is started.
We can't bind to a CPU which is not yet on-line, so add code that waits for
CPUs to go on-line before binding to them.

Reported by:	Alin-Adrian Anton <aanton@spintech.ro>
MFC after:	2 weeks
2007-01-28 20:29:12 +00:00
Sam Leffler
3d14f9377c ath and ath_rate_sample now have a compile-time dependency on the hal
so we need to build them only on architectures where there's a released
hal; this hack can be eliminated when an ia64 hal build is present
2007-01-28 18:35:46 +00:00
Robert Watson
b1e5dcf778 As we now have an SFB_NOWAIT flag, change 'will' to 'may' where the
comment for sf_buf_alloc(9) talks about sleeping.
2007-01-28 17:39:03 +00:00
Robert Watson
5d1f828354 Remove slightly dubious comment; add descriptive strings for several
sysctls.

MFC after:	3 days
2007-01-28 16:38:44 +00:00
Takanori Watanabe
c5286e1196 Add support for serial communication with Windows CE based Handheld Computer.
Obtained from:	NetBSD
2007-01-28 11:56:14 +00:00
Takanori Watanabe
224b9013e8 Add some vendor IDs mainly from NetBSD. 2007-01-28 10:46:32 +00:00
Yoshihiro Takahashi
f67d792692 MFi386: revision 1.647.
exclude the icu and clock lock from LOCK_PROFILING
2007-01-28 07:19:14 +00:00
Sam Leffler
9c05116297 for newer hal's we need opt_ah.h as it specifies how the hal has been
configured and that in turn controls the descriptor layout
2007-01-28 04:38:35 +00:00
Sam Leffler
bc14459ba8 for newer hal's we need opt_ah.h as it specifies how the hal has been
configured and that in turn controls the descriptor layout; the rate
control module has no business peeking inside the descriptor but until
we can change the api so the driver records the tx rates and passes
them deal with it
2007-01-28 04:36:05 +00:00
Ariff Abdullah
5430d30e44 Add speaker control for HP xw4300. This hardware doesn't respond to
unsolicited pin sense events and needs manual control to turn off speaker
volume while attaching headphones.

Tested by:		Ingeborg Hellemo <Ingeborg.Hellemo@cc.uit.no>

Disable global Acer + ALC883 headphone automute settings since there are
a few models that do not respect this, causing broken behaviour.

Reported/Tested by:	Pavel Argentov <argentoff@rtelekom.ru>
2007-01-28 03:16:54 +00:00
Remko Lodder
7fd6875fc5 Add the SMART command to the ATA instruction set.
When the disk has an error, it will now print SMART
instead of 'Unknown CMD'.

PR:		kern/93368
Submitted by:	Garry Belka <garry at NetworkPhysics dot COM>
Approved by:	sos
2007-01-27 21:15:59 +00:00
Max Laier
191c2cea1c In case we are supplied with an imagename that matches a module, but not a
firmware in that module (even though this is a programming error) - drop the
reference to the module again.

Submitted by:	Benjamin Close
MFC after:	3 days
2007-01-27 19:52:08 +00:00
Joseph Koshy
20c71e39c3 Use a known good stack at the time of servicing an NMI --- reuse
the space allocated for the double fault handler since this space
is otherwise unused till the time a double fault occurs.

This change should have been committed alongside r1.127 of
"exception.S", but I somehow missed doing so.

Problem reported by:	jeff
Pointy hat to:		jkoshy
2007-01-27 18:13:24 +00:00
Robert Watson
a85614b42b Remove BSD < 199103 compatibility entries in the bpf_d structure: they are
not used in any of our code.  Also remove explicit padding variable that
kept the bpf_d structure the same size before and after the change in
select implementation, since binary compatibility is not required for this
data structure on 7-CURRENT.
2007-01-27 18:12:50 +00:00
Robert Watson
b6957b8597 Remove now unused bpf_compat.h. This compatibility file emulates malloc(9)
using the mbuf allocator.
2007-01-27 17:32:12 +00:00
Ariff Abdullah
d130d86519 Rearrange locking order to avoid LOR (cat /dev/midistat).
Reported by:	rodrigc
2007-01-27 15:55:59 +00:00
Ariff Abdullah
b9ba7b9e78 Massive inlining cleanups/removal to make it survive on WARNS=2. 2007-01-27 13:30:19 +00:00
Ariff Abdullah
a12b5a0728 Reduce maximum DMA segments from 128 to 64. We don't need more than that. 2007-01-27 07:35:05 +00:00
Ariff Abdullah
f39ee7ef2e Total DMA segments should include total number of record channel(s). 2007-01-26 23:53:56 +00:00
Bruce A. Mah
f234bea7d7 Revert nd6.c revs. 1.67, 1.68, 1.69, 1.70 in an attempt to unbreak
IPv6 over point-to-point gif(4) tunnels.

These revisions caused a host route to the destination of a
point-to-point gif(4) interface to not get installed when the interface
and destination addresses were assigned.  This caused
"no route to host" errors when trying to send traffic over the
interface.  The first packet arriving inbound over the tunnel,
however, would cause the correct route to get installed, allowing
subsequent outbound traffic to be routed correctly.

gif(4) interfaces with prefix lengths of less than 128 bits
(i.e. no explicit destination address assigned) were not affected
by this bug.

This bug fix is a possible candidate for a 6.2-RELEASE errata note.

Approved by:	jhay (original committer)
Discussed with:	jhay, JINMEI Tatuya
MFC after:	3 days
2007-01-26 23:22:58 +00:00
Ariff Abdullah
b1d922169b Fix forever broken ua_chan_setblocksize() uninitialized return value,
which was causing divide-by-zero panics in other places (notably chn_sync()).
2007-01-26 19:14:41 +00:00
Ariff Abdullah
3ad47bdd54 Sync uaudio_sndstat_prepare_pcm() output with sndstat_prepare_pcm() to get
similar (debugging) output.
2007-01-26 19:06:17 +00:00
Doug White
e30d3a0c79 Add missing MIIBUS_MEDIAINIT() call. 2007-01-26 17:06:02 +00:00
Doug White
3c3d8e1e45 Collapse 5706C and 5708C PHYs into one entry. ID 0x15 is actually used for
the SERDES PHY on these chips and we want gentbi to pick this up, not brgphy.
2007-01-26 17:05:24 +00:00
Doug White
4a5cd040cb Add support for SERDES PHY configurations. These are commonly found in
blade systems, such as the Dell 1955 and the Intel SBXD132.

Development hardware for this work was provided by Broadcom and iXsystems.
A SBXD132 blade for testing was provided by Iron Systems.
2007-01-26 17:03:51 +00:00
Xin LI
7868ee24d3 While we do not expect any change before and after GNU gzip
is replaced with BSD gzip, let's make it possible to
distinguish between the two with a __FreeBSDversion bump,
just in case some developers want it.

Suggested by:	linimon
2007-01-26 14:57:17 +00:00
Marcel Moolenaar
500a5696d4 Remove stale header.
MFC after: 3 days
2007-01-26 04:58:31 +00:00
Kevin Lo
27864bcad8 Fix comments.
Approved by: cognet
2007-01-26 01:37:32 +00:00
Jeff Roberson
fc3a97dcb7 - Implement much more intelligent ipi sending. This algorithm tries to
   minimize IPIs and rescheduling when scheduling like tasks while keeping
   latency low for important threads.
   1) An idle thread is running.
   2) The current thread is worse than realtime and the new thread is
      better than realtime.  Realtime to realtime doesn't preempt.
   3) The new thread's priority is less than the threshold.
2007-01-25 23:51:59 +00:00
Gleb Smirnoff
39c14742d9 - Create ng_ppp_bypass() function, that prepares a packet
  with bypass header, to send it out to userland.
- Use ng_ppp_bypass() in ng_ppp_proto_recv().
- Use ng_ppp_bypass() in ng_ppp_comp_recv() and in
  ng_ppp_crypt_recv() if compression or encryption is
  disabled, respectively.
- Any LCP packet goes directly to ng_ppp_bypass(), instead
  of passing through PPP stack.
- Any non-LCP packet on disabled link is discarded. This
  is behavior defined in RFC.

Submitted by:	Alexander Motin <mav alkar.net>
2007-01-25 21:16:50 +00:00
Jeff Roberson
1461899028 - Get rid of the unused DIDRUN flag. This was really only present to
   support sched_4bsd.
 - Rename the KTR level for non schedgraph parsed events.  They take event
   space from things we'd like to graph.
 - Reset our slice value after we sleep.  The slice is simply there to
   prevent starvation among equal priorities.  A thread which had almost
   exhausted its slice and then slept doesn't need to be rescheduled a
   tick after it wakes up.
 - Set the maximum slice value to a more conservative 100ms now that it is
   more accurately enforced.
2007-01-25 19:14:11 +00:00
Gleb Smirnoff
3cf0d02480 Make it possible that carpdetach() unlocks on return. Then, in
carp_clone_destroy() we are on a safe side, we don't need to
unlock the cif, that can be already non-existent at this point.

Reported by:	Anton Yuzhaninov <citrin rambler-co.ru>
2007-01-25 18:03:40 +00:00
Matt Jacob
325bba15cc Whoops- #ifdef problem caused uninitialized transport. Not horribly
a problem, but caused annoying messages.
2007-01-25 18:02:23 +00:00
Gleb Smirnoff
62dae1e917 Spacing. 2007-01-25 17:58:16 +00:00
Bill Paul
e2bcb489ef The TCP checksum offload handling in the 8111B/8168B and 8101E PCIe can
apparently be confused by short TCP segments that have been manually
padded to the minimum ethernet frame size. The driver does short frame
padding in software as a workaround for a bug in the 8169 PCI devices
that causes short IP fragments to be corrupted due to an apparent
conflict between the hardware autopadding and hardware IP checksumming.

To fix this, we avoid software padding for short TCP segments, since
the hardware seems to autopad and checksum these correctly (even the
older 8169 NICs get these right). Short UDP packets appear to be
handled correctly in all cases. This should work around the IP header
checksum bug in the 8169 while not tripping the TCP checksum bug in
the 8111B/8168B and 8101E.
2007-01-25 17:30:30 +00:00
Bruce Evans
63cb891e8b Rename some functions and variables from nfs_* to nfs4_* to avoid
collisions with nfsclient's names.  Even static names should have a
unique prefix so that they can be debugged easily.

Hide the unused colliding variable nfsv3_commit_on_close in "#if 0"
together with other unused sysctl variables.  Duplicating the nfs sysctl
under nfs4 is probably just a bug.

Fix some nearby style bugs.

Remove duplicate $FreeBSD$.
2007-01-25 14:33:13 +00:00
Bruce Evans
8754c03a11 Rename some functions and variables (mainly vfsops entry points) from
nfs_* to nfs4_* to avoid collisions with nfsclient's names.   Even
static names should have a unique prefix so that they can be debugged
easily.

Most of the renamed functions can probably be shared.  nfs4_cmount()
and nfs4_sync() are identical to the nfs_* versions, and all the others
except nfs4_vfsops() seem to be identical except for style bugs,
missing support for mountroot, and bugs.

Fix some nearby style bugs.

Remove duplicate $FreeBSD$.
2007-01-25 14:18:40 +00:00
Bruce Evans
e43982a801 Unstaticize nfs_iosize() in nfsclient and use it in nfs4client instead
of duplicating it except for larger style bugs in the copy.

Fix some nearby style bugs (including a harmless type mismatch)
in and near the remaining copy.

This is part of fixing collisions of the 2 nfs*client's names.  Even
static names should have a unique prefix so that they can be debugged
easily.
2007-01-25 13:07:25 +00:00
Mohan Srinivasan
6c125b8df6 Fix for problems that occur when all mbuf clusters migrate to the mbuf packet
zone. Cluster allocations fail when this happens. Also processes that may have
blocked on cluster allocations will never be woken up. Thanks to rwatson for
an overview of the issue and pointers to the mbuma paper and his tool to dump
out UMA zones.

Reviewed by: andre@
2007-01-25 01:05:23 +00:00
Mohan Srinivasan
7738029183 Fix for a bug where only one process (of multiple) blocked on
maxpages on a zone is woken up, with the rest never being woken up as
a result of the ZFLAG_FULL flag being cleared. Wake up all such blocked
processes instead. This change introduces a thundering herd, but since
this should be relatively infrequent, optimizing this (by introducing
a count of blocked processes, for example) may be premature.

Reviewed by: ups@
2007-01-24 22:49:11 +00:00
Jeff Roberson
9a93305a2e - With a sleep time over 2097 seconds hzticks and slptime could end up
   negative.  Use unsigned integers for sleep and run time so this doesn't
   disturb sched_interact_score().  This should fix the invalid interactive
   priority panics reported by several users.
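
A minimal sketch of how a scaled tick count can turn negative, assuming
hz = 1000 and a fixed-point shift of 10 bits.  Both values and all names
below are assumptions for illustration and are not taken from this commit.
Under those assumptions 2097 seconds still fits in a signed 32-bit integer
after scaling, while 2098 seconds does not:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_HZ	1000u	/* assumed clock ticks per second */
#define SKETCH_SHIFT	10u	/* assumed fixed-point scaling shift */

/* Scale a sleep time in seconds to shifted ticks using unsigned math. */
static uint32_t
scaled_ticks_unsigned(uint32_t seconds)
{
	/* Keep the result inside 32 bits so the sketch stays well defined. */
	assert(seconds <= UINT32_MAX / (SKETCH_HZ << SKETCH_SHIFT));
	return ((seconds * SKETCH_HZ) << SKETCH_SHIFT);
}

/*
 * The same value stored in a signed 32-bit variable.  Converting an
 * out-of-range value is implementation-defined; on the usual
 * two's-complement targets it shows up as a negative number.
 */
static int32_t
scaled_ticks_signed(uint32_t seconds)
{
	return ((int32_t)scaled_ticks_unsigned(seconds));
}
```

Keeping the value unsigned, as the commit describes, avoids the sign flip
for sleep times in this range.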
2007-01-24 18:18:43 +00:00
Randall Stewart
6dbde03086 Fixes the MSG_PEEK for sctp_generic_recvmsg(): the msg_flags
were not being copied in properly, so PEEK and any other
msg_flags input operation were not being performed right.
Approved by:	gnn
2007-01-24 12:59:56 +00:00
Bruno Ducrot
8867dfa953 o introduce an 'errata' flags field for HW bugs in the softc.
o remove errata_a0 and introduce the corresponding flag into 'errata'.
o introduce a new erratum for K8: some platforms might set the
  PENDING_BIT but aren't able to unset it, so don't loop forever
  waiting for PENDING_BIT to be cleared.
o try to introduce a workaround for the PENDING_BIT stuck problem.
o support half multipliers for K8.

Tested by:	Abdullah Al-Marrie

Approved by:	njl
2007-01-23 19:20:30 +00:00
Warner Losh
8bd73484dc Use the more specific 'EM732X' designation rather than * to disable sync
cache commands, per request from njl@.
2007-01-23 17:29:31 +00:00
Konstantin Belousov
2cc7d26f7f Cylinder group bitmaps and blocks containing inode for a snapshot
file are after snaplock, while other ffs device buffers are before
snaplock in global lock order. By itself, this could cause deadlock
when bdwrite() tries to flush dirty buffers on snapshotted ffs. If,
during the flush, COW activity for snapshot needs to allocate block
and ffs_alloccg() selects the cylinder group that is being written
by bdwrite(), then kernel would panic due to recursive buffer lock
acquisition.

Avoid dealing with buffers in bdwrite() that are from other side of
snaplock divisor in the lock order than the buffer being written. Add
new BOP, bop_bdwrite(), to do dirty buffer flushing for same vnode in
the bdwrite(). Default implementation, bufbdflush(), refactors the code
from bdwrite(). For ffs device buffers, specialized implementation is
used.

Reviewed by:	tegge, jeff, Russell Cattelan (cattelan xfs org, xfs changes)
Tested by:	Peter Holm
X-MFC after:	3 weeks (if ever: it changes ABI)
2007-01-23 10:01:19 +00:00
Jeff Roberson
7a5e5e2a59 - Catch up to setrunqueue/choosethread/etc. api changes.
 - Define our own maybe_preempt() as sched_preempt().  We want to be able
   to preempt idlethread in all cases.
 - Define our idlethread to require preemption to exit.
 - Get the cpu estimation tick from sched_tick() so we don't have to worry
   about errors from a sampling interval that differs from the time
   domain.  This was the source of sched_priority prints/panics and
   inaccurate pctcpu display in top.
2007-01-23 08:50:34 +00:00
Bruce Evans
cec54a8d96 Oops, pc98 is independent of i386 for clock.c and machdep.c but not
for clock.h, so changing the i386 clock.h broke it.  MFi386 (not tested):

Cleaned up declaration and initialization of clock_lock.  It is only
used by clock code, so don't export it to the world for machdep.c to
initialize.  There is a minor problem initializing it before it is
used, since although clock initialization is split up so that parts
of it can be done early, the first part was never done early enough
to actually work.  Split it up a bit more and do the first part as
late as possible to document the necessary order.  The functions that
implement the split are still bogusly exported.

Cleaned up initialization of the i8254 clock hardware using the new
split.  Actually initialize it early enough, and don't work around it
not being initialized in DELAY() when DELAY() is called early for
initialization of some console drivers.

This unfortunately moves a little more code before the early debugger
breakpoint so that it is harder to debug.  The ordering of console and
related initialization is delicate because we want to do as little as
possible before the breakpoint, but must initialize a console.
2007-01-23 08:48:26 +00:00
Jeff Roberson
f0393f063a - Remove setrunqueue and replace it with direct calls to sched_add().
   setrunqueue() was mostly empty.  The few asserts and thread state
   setting were moved to the individual schedulers.  sched_add() was
   chosen to displace it for naming consistency reasons.
 - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be
   different on all three schedulers where it was only called in one place
   each.
 - Remove the long ifdef'd out remrunqueue code.
 - Remove the now redundant ts_state.  Inspect the thread state directly.
 - Don't set TSF_* flags from kern_switch.c, we were only doing this to
   support a feature in one scheduler.
 - Change sched_choose() to return a thread rather than a td_sched.  Also,
   rely on the schedulers to return the idlethread.  This simplifies the
   logic in choosethread().  Aside from the run queue links kern_switch.c
   mostly does not care about the contents of td_sched.

Discussed with:	julian

 - Move the idle thread loop into the per scheduler area.  ULE wants to
   do something different from the other schedulers.

Suggested by:	jhb

Tested on:	x86/amd64 sched_{4BSD, ULE, CORE}.
2007-01-23 08:46:51 +00:00
Jeff Roberson
3c93ca7d2f - Allow the schedulers to IPI_PREEMPT idlethread. This puts the decision
   for this behavior on the initiator side.
2007-01-23 08:38:39 +00:00
Bruce Evans
71799af2d5 Cleaned up declaration and initialization of clock_lock. It is only
used by clock code, so don't export it to the world for machdep.c to
initialize.  There is a minor problem initializing it before it is
used, since although clock initialization is split up so that parts
of it can be done early, the first part was never done early enough
to actually work.  Split it up a bit more and do the first part as
late as possible to document the necessary order.  The functions that
implement the split are still bogusly exported.

Cleaned up initialization of the i8254 clock hardware using the new
split.  Actually initialize it early enough, and don't work around it
not being initialized in DELAY() when DELAY() is called early for
initialization of some console drivers.

This unfortunately moves a little more code before the early debugger
breakpoint so that it is harder to debug.  The ordering of console and
related initialization is delicate because we want to do as little as
possible before the breakpoint, but must initialize a console.
2007-01-23 08:01:20 +00:00
Nate Lawson
7826bf983c Add missing function trace for debug prints. 2007-01-23 07:20:44 +00:00
Craig Rodrigues
61e323a2fa When exiting vfs_export(), delete the "export" option from
the mount options list with vfs_deleteopt().  At this point, the export
information is saved in mp->mnt_export, so we can delete
the "export" mount option from mp->mnt_optnew and mp->mnt_opt.

This fixes read-write/read-only update mounts (mount -u -o rw, mount -u -o ro)
of NFS exported directories.

For some reason, I could only reproduce the problem with a configuration
supplied by Andre:
- "options QUOTA" enabled in kernel config
- "/ -maproot=root 10.0.1.105" in /etc/exports

Reported by:	kris, Andre Guibert de Bruet <andy siliconlandmark com>,
            	Andrzej Tobola <ato iem pw edu pl>
Tested by:	Andre Guibert de Bruet
2007-01-23 06:19:16 +00:00
Scott Long
95a8bcd854 Remove a PCI ID entry that conflicts with the AMR driver. 2007-01-23 02:47:33 +00:00
Pyun YongHyeon
d01fac16ac It seems that enabling Tx and Rx before setting descriptor DMA
addresses makes PCIe hardware access invalid descriptor DMA addresses,
which then panics the system.
To fix it set descriptor DMA addresses before enabling Tx and Rx
such that hardware can see valid descriptor DMA addresses. Also
set RL_EARLY_TX_THRESH before starting Tx and Rx.

Reported by:	steve.tell AT crashmail DOT de
Tested by:	steve.tell AT crashmail DOT de
Obtained from:	NetBSD
MFC after:	1 week
2007-01-23 00:44:12 +00:00
Matt Jacob
f9734398e3 Clean up some of the various platform and release specific dma tag
stuff so it is centralized in isp_freebsd.h.

Take out PCI posting flushes in qla2100/2200 register reads except for
2100s.
2007-01-23 00:02:29 +00:00
John Baldwin
5fe82bca57 Expand the MSI/MSI-X API to address some deficiencies in the MSI-X support.
- First off, device drivers really do need to know if they are allocating
  MSI or MSI-X messages.  MSI requires allocating powerof2() messages for
  example where MSI-X does not.  To address this, split out the MSI-X
  support from pci_msi_count() and pci_alloc_msi() into new driver-visible
  functions pci_msix_count() and pci_alloc_msix().  As a result,
  pci_msi_count() now just returns a count of the max supported MSI
  messages for the device, and pci_alloc_msi() only tries to allocate MSI
  messages.  To get a count of the max supported MSI-X messages, use
  pci_msix_count().  To allocate MSI-X messages, use pci_alloc_msix().
  pci_release_msi() still handles both MSI and MSI-X messages, however.
  As a result of this change, drivers using the existing API will only
  use MSI messages and will no longer try to use MSI-X messages.
- Because MSI-X allows for each message to have its own data and address
  values (and thus does not require all of the messages to have their
  MD vectors allocated as a group), some devices allow for "sparse" use
  of MSI-X message slots.  For example, if a device supports 8 messages
  but the OS is only able to allocate 2 messages, the device may make the
  best use of 2 IRQs if it enables the messages at slots 1 and 4 rather
  than the default of using the first N slots (or indices) at 1 and 2.  To
  support this, add a new pci_remap_msix() function that a driver may call
  after a successful pci_alloc_msix() (but before allocating any of the
  SYS_RES_IRQ resources) to allow the allocated IRQ resources to be
  assigned to different message indices.  For example, from the earlier
  example, after pci_alloc_msix() returned a value of 2, the driver would
  call pci_remap_msix() passing in array of integers { 1, 4 } as the
  new message indices to use.  The rid's for the SYS_RES_IRQ resources
  will always match the message indices.  Thus, after the call to
  pci_remap_msix() the driver would be able to access the first message
  in slot 1 at SYS_RES_IRQ rid 1, and the second message at slot 4 at
  SYS_RES_IRQ rid 4.  Note that the message slots/indices are 1-based
  rather than 0-based so that they will always correspond to the rid
  values (SYS_RES_IRQ rid 0 is reserved for the legacy INTx interrupt).
  To support this API, a new PCIB_REMAP_MSIX() method was added to the
  pcib interface to change the message index for a single IRQ.
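
The two rules described above (MSI counts must be a power of two, and after
a remap each SYS_RES_IRQ rid equals its 1-based message slot) can be
sketched as a small user-space model.  This is not the kernel API: every
function name here is hypothetical and only mirrors the { 1, 4 } example
from the text.

```c
#include <assert.h>
#include <stddef.h>

/* MSI message counts must be a power of two; MSI-X counts need not be. */
static int
sketch_is_power_of_2(unsigned int n)
{
	return (n != 0 && (n & (n - 1)) == 0);
}

/* Largest power of two that does not exceed n (0 when n is 0). */
static unsigned int
sketch_round_down_pow2(unsigned int n)
{
	unsigned int p = 1;

	if (n == 0)
		return (0);
	while (p <= n / 2)
		p *= 2;
	return (p);
}

/*
 * Model of a remap: the i-th allocated message is moved to slot
 * indices[i], and its rid is that same 1-based slot number.
 * Returns 0 on success, -1 if a slot is outside 1..max_slots.
 */
static int
sketch_remap(const unsigned int *indices, size_t count,
    unsigned int max_slots, unsigned int *rids)
{
	for (size_t i = 0; i < count; i++) {
		if (indices[i] < 1 || indices[i] > max_slots)
			return (-1);
		rids[i] = indices[i];	/* rid always matches the slot */
	}
	return (0);
}
```

With an 8-slot device and two allocated messages, passing { 1, 4 } yields
rids 1 and 4, matching the example in the commit message.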

Tested by:	scottl
2007-01-22 21:48:44 +00:00
Andre Oppermann
7c32173ba8 Unbreak writes of 0 bytes. Zero byte writes happen when only ancillary
control data but no payload data is passed.

Change m_uiotombuf() to return at least one empty mbuf if the requested
length was zero.  Add comment to sosend_dgram and sosend_generic().

Diagnosed by:		jhb
Regression test by:	rwatson
Pointy hat to:		andre
2007-01-22 14:50:28 +00:00
Konstantin Belousov
7f92c4ee02 Below is slightly edited description of the LOR by Tor Egge:
--------------------------
[Deadlock] is caused by a lock order reversal in vfs_lookup(), where
[some] process is trying to lock a directory vnode (that is the parent
directory of covered vnode) while holding an exclusive vnode lock on
covering vnode.

A simplified scenario:

root fs					var fs
/    		A			/    (/var)	D
/var		B			/log (/var/log) E
vfs lock	C			vfs lock	F

Within each file system, the lock order is clear: C->A->B and F->D->E

When traversing across mounts, the system can choose between two lock orders,
but everything must then follow that lock order:

      L1: C->A->B
		|
	        +->F->D->E

      L2: F->D->E
	     |
             +->C->A->B

The lookup() process for namei("/var") mixes those two lock orders:

    VOP_LOOKUP() obtains B while A is held
    vfs_busy() obtains a shared lock on F while A and B are held (follows L1,
    violates L2)
    vput() releases lock on B
    VOP_UNLOCK() releases lock on A
    VFS_ROOT() obtains lock on D while shared lock on F is held
    vfs_unbusy() releases shared lock on F
    vn_lock() obtains lock on A while D is held (violates L1, follows L2)

dounmount() follows L1 (B is locked while F is drained).

Without unmount activity, vfs_busy() will always succeed without blocking
and the deadlock isn't triggered (the system behaves as if L2 is followed).

With unmount, you can get 4 processes in a deadlock:

     p1: holds D, want A (in lookup())
     p2: holds shared lock on F, want D (in VFS_ROOT())
     p3: holds B, want drain lock on F (in dounmount())
     p4: holds A, want B (in VOP_LOOKUP())

You can have more than one instance of p2.
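
The four-process deadlock above can be checked with a small wait-for
sketch.  This is only an illustrative user-space model: the enum names and
the helper are hypothetical, and the shared lock on F is simplified to a
single holder (p2).

```c
#include <assert.h>

enum { LOCK_A, LOCK_B, LOCK_D, LOCK_F, NLOCKS };
enum { P1, P2, P3, P4, NPROCS };

/* Which process holds each lock, taken from the list above. */
static const int holder[NLOCKS] = {
	[LOCK_A] = P4, [LOCK_B] = P3, [LOCK_D] = P1, [LOCK_F] = P2
};

/* Which lock each process is waiting for. */
static const int wants[NPROCS] = {
	[P1] = LOCK_A, [P2] = LOCK_D, [P3] = LOCK_F, [P4] = LOCK_B
};

/*
 * Follow the wait-for chain from 'start'.  Returns the number of steps
 * needed to get back to 'start' (a deadlock cycle), or 0 if the chain
 * does not return within NPROCS steps.
 */
static int
wait_cycle_length(int start)
{
	int p = start;

	for (int steps = 1; steps <= NPROCS; steps++) {
		p = holder[wants[p]];
		if (p == start)
			return (steps);
	}
	return (0);
}
```

Starting from p1 the chain is p1 -> p4 -> p3 -> p2 -> p1, so every process
sits on a cycle of length 4.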

The reversal was introduced in revision 1.81 of src/sys/kern/vfs_lookup.c and
MFCed to revision 1.80.2.1, probably to avoid a cascade of vnode locks when nfs
servers are dead (VFS_ROOT() just hangs) spreading to the root fs root vnode.

- Tor Egge

To fix the LOR, ups@ noted that when crossing the mount point, ni_dvp
is actually not used by the callers of namei. Thus, placeholder deadfs
vnode vp_crossmp is introduced that is filled into ni_dvp.

Idea by:	ups
Reviewed by:	tegge, ups, jeff, rwatson (mac interaction)
Tested by:	Peter Holm
MFC after:	2 weeks
2007-01-22 11:25:22 +00:00
Warner Losh
d5f2a6f556 Add quirk for EasyMP3 EM732X usb 2.0 flash mp3 player.
(It appears that the quirk procedures link has disappeared and that
this PR complied with it, if there's a problem, please contact me).

PR: usb/96546
2007-01-22 04:34:03 +00:00
Marius Strobl
c2175ff5ca Change the remainder of the drivers for DMA'ing devices enabled in the
sparc64 GENERIC and the sound device drivers known working on sparc64
to use bus_get_dma_tag() to obtain the parent DMA tag so we can get rid
of the sparc64_root_dma_tag kludge eventually. Except for ath(4), sk(4),
stge(4) and ti(4) these changes are runtime tested (unless I booted up
the wrong kernels again...).
2007-01-21 19:32:51 +00:00
Marius Strobl
e54f674652 Correct a logic bug in the previous change. 2007-01-21 19:28:00 +00:00
Alexander Leidinger
eff9c72b4b Use a printf-modifier which doesn't need a cast.
Submitted by:	scottl
2007-01-21 13:18:52 +00:00
Jeff Roberson
5cea64d54f - Disable the long-term load balancer. I believe that steal_busy works
   better and gives more predictable results.
2007-01-20 21:24:05 +00:00
Alexander Leidinger
9cb5a012fb Fix tinderbox build on amd64. 2007-01-20 19:32:23 +00:00
Marius Strobl
d7a0d759c0 Quiet GCC4 warnings regarding the width of printf()-arguments not
matching the format. While at it limit the format to unsigned int as
we're only interested in the 11 least significant bits anyway.
2007-01-20 17:14:12 +00:00
Scott Long
089292ab0b The multicast hash table has 8 slots in the BCE hardware, not 4 slots like
the BGE hardware.  Adapt the driver for this.
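
A rough sketch of why the slot count matters, assuming each slot is a
32-bit register and that the low bits of a per-address CRC pick the slot
and the bit within it.  The exact bit selection used by the real drivers
is not stated in this commit, so the function below is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_REG_BITS	32u	/* assumed width of one hash register */

/*
 * Set the filter bit for 'crc' in a table of 'nslots' 32-bit registers.
 * 'nslots' must be a power of two (4 or 8 in the text above).
 */
static void
sketch_mc_hash_set(uint32_t *table, unsigned int nslots, uint32_t crc)
{
	uint32_t h;

	assert(nslots != 0 && (nslots & (nslots - 1)) == 0);
	h = crc & (nslots * SKETCH_REG_BITS - 1);	/* 7 or 8 low bits */
	table[h / SKETCH_REG_BITS] |= (uint32_t)1 << (h % SKETCH_REG_BITS);
}
```

With 8 slots a hash value of 0xE5 lands in slot 7, bit 5; code written for
4 slots would mask it to 0x65 and set slot 3 instead, leaving half of the
larger table unused.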

Submitted by: Mike Karels
MFC After: 3 days
2007-01-20 17:05:12 +00:00
Jeff Roberson
c95d2db298 - We do need to IPI the idlethread on some systems. It may be stuck in
   a power saving mode otherwise.
 - If the thread is already bound in sched_bind() unbind it before
   re-binding it to a new cpu.  I don't like these semantics but they are
   expected by some code in the tree.  Patch by jkoshy.
2007-01-20 17:03:33 +00:00
Alexander Leidinger
d071f5048c MFp4 (113077, 113083, 113103, 113124, 113097):
Don't expose em->shared to the outside world before it's properly
	initialized. Might not affect anything but its at least a better
	coding style.

	Don't expose em via p->p_emuldata until it's properly initialized.
	This also enables us to get rid of some locking and simplify the
	code because we are working on a local copy.

	In linux_fork and linux_vfork create the process in stopped state
	to be sure that the new process runs with fully initialized emuldata
	structure [1]. Also fix the vfork (both in linux_clone and linux_vfork)
	race that could result in never woken up process [2].

Reported by:	Scot Hetzel	[1]
Suggested by:	jhb		[2]
Reviewed by:	jhb (at least some important parts)
Submitted by:	rdivacky
Tested by:	Scot Hetzel (on amd64)

Change 2 comments (in the new code) to comply to style(9).

Suggested by:	jhb
2007-01-20 14:58:59 +00:00
Marius Strobl
8dbf0223f3 Add macros for the individual divisor bits as some MC146818A-compatible
chips also use them for different purposes.
2007-01-20 14:57:51 +00:00
Marius Strobl
0c7d35d0b9 Remove BUS_DMA_WAITOK from bus_dma_tag_create() invocations as it's
not a valid flag there.
2007-01-20 14:19:29 +00:00
Marius Strobl
e6770fff6b - Use bus_get_dma_tag() to obtain the parent DMA tag so dma(4) will
  work when we start requiring this.
- Don't specify an alignment when creating our own parent DMA tag;
  the supported DMA engines require no alignment constraint (f.e. the
  LANCE child does though) and it's not inherited by the child DMA
  tags anyway (which probably is a bug though).
- Fix whitespace nits.
2007-01-20 14:06:01 +00:00
Xin LI
e499c6135c Fix build. chkdquot() should not return anything. 2007-01-20 13:54:28 +00:00
Marius Strobl
0222c13479 Add front-ends for the 'lebuffer' variants found on some SBus cards.
These are shared-memory variants based on Am79C90-compatible chips
that apart from the missing DMA engine are similar to the 'ledma'
variant including using a (pseudo-)bus/device for the buffer that
the actual LANCE device hangs off from. The performance of these is
close to that of the 'ledma' one, like expected at a few times the
CPU load though.
2007-01-20 12:53:30 +00:00
Mike Pritchard
db9b81eabc Quota system cleanup.
1) Do not do quota accounting for the actual quota data files
   or for file system snapshot files ("system" files).  This
   prevents a deadlock described in PR kern/30958 if the kernel
   ever has to grow the quota file.  Snapshot files were already
   exempt from the quota checks, but this change generalized the check.
2) Fix a cast that caused extremely large uids/gids to incorrectly
   write the quota information to the data file at a truncated
   value for a uint_t32 id value.  The incorrect cast caused quota
   files in this case to be around 4GB in size, with the correct cast
   they can now be 131GB in size.  Also related to PR kern/30958.
3) Check for what appear to be negative UIDs/GIDs and not account
   for them.  This prevents the quota files from becoming 131GB in
   size and causing quotacheck to run forever at bootup.  This could
   also cause the kernel to try and expand the quota file, which might
   deadlock due to the issue in #1.  kern/30958 and kern/38156
   (and some much older closed PR's).
4) With the deadlock problems gone, the kernel can now expand the
   size of the quota database files if it needs to.
5) Pass in the i-node count change value to chkiq and chkiqchg as an
   int, like it used to be before the common routine was split up
   into 2 different routines to increase / decrease the i-node in-use
   count.  Prevents an underflow on the i-node count.  Related
   to PR kern/89247.
6) Prevent the block usage from growing slowly if a file system is
   full and the write was denied due to that fact.  PR kern/89247.

Some of these changes require an updated quotacheck to prevent
the creation of huge (131GB) quota data files (item #3).

#1/#4 probably fixes a lot of the random hangs when quotas are enabled,
possibly some of the jail hangs.
2007-01-20 11:58:32 +00:00
Alexander Leidinger
f0cad96d23 Ooops, fix the ratelimit. 2007-01-20 11:31:14 +00:00
Alexander Leidinger
456ede3976 Convert a KASSERT into a runtime warning (rate limited) + failsafe fallback.
Because of a stupid bug (also fixed with this commit) the KASSERT was
triggered when running the linux top.

Pointy hat to:	netchild
2007-01-20 11:07:41 +00:00
Marius Strobl
17792f45fb For setting the port PCnet chips must be powered down or stopped and
unlike documented may not take effect without an initialization. So
don't invoke (*sc_mediachange) directly in lance_mediachange() but
go through lance_init_locked(). It's suboptimal to impose this for
all chips but given that besides the affected PCI bus front-end the
only other front-end which supports media selection is and likely
ever will be the 'ledma' front-end I see not enough reason to break
the in-driver API for this (though one could argue both ways here).
2007-01-20 10:47:16 +00:00
Marius Strobl
d2255d0286 Use bus_get_dma_tag() to obtain the parent DMA tag so le(4) works on
platforms requiring this.
2007-01-20 09:57:09 +00:00
Jeff Roberson
6b2f763f7c - In tdq_transfer() always set NEEDRESCHED when necessary regardless of
   the ipi settings.  If NEEDRESCHED is set and an ipi is later delivered
   it will clear it rather than cause extra context switches.  However, if
   we miss setting it we can have terrible latency.
 - In sched_bind() correctly implement bind.  Also be slightly more
   tolerant of code which calls bind multiple times.  However, we don't
   change binding if another call is made with a different cpu.  This
   does not presently work with hwpmc which I believe should be changed.
2007-01-20 09:03:43 +00:00
Matt Jacob
8ada63303e Grumble- let a linux-ism slip in and had an llx which
then choked on 64 bit platforms. Oops.
2007-01-20 07:38:31 +00:00
Matt Jacob
6c81a0aecb MFP4: Move default setting to the end of isp_reset instead of the
front of isp_init so we can read NVRAM even if we're role ISP_NONE.
Prepare for reintroduction of channels (for FC) for N-Port
Virtualization.

Fix a botch in handle assignment that caused us to nuke one device
when a new one arrives and end up with two devices with the same
identity in the virtual target mapping table.
2007-01-20 04:00:21 +00:00
Marius Strobl
9bcdfcae43 - In miibus_attach() remove IFM_IMASK from the dontcare_mask of the
  ifmedia_init() invocation. IFM_IMASK only makes sense here when all of
  the maximum of 32 PHYs on each MII bus support disjoint sets of media,
  to let NIC drivers indicate that for the few card models where the PHY
  configuration is known/fixed and IFM_IMASK actually makes sense).
- Add and use a miibus_print_child() for the bus_print_child method which
  additionally prints the PHY number (which actually is the PHY address)
  so one can figure out the media instance <-> PHY number mapping from the
  PHY driver attach output. This is intended to be useful in situations
  where the addresses of the PHYs on the bus are known (f.e. of internal/
  integrated PHYs) so one can feed the appropriate media instance number
  to ifconfig(8) (with the upcoming change for ifconfig(8)).
  This is more or less inspired by the NetBSD mii_print().
2007-01-20 00:55:03 +00:00
Marius Strobl
b8a5d0481a - Don't set MIIF_NOISOLATE so ukphy(4) can be used in configurations with
  multiple PHYs. In case some PHYs currently driven by ukphy(4) exhibit
  problems when isolating due to incomplete implementations or silicon bugs
  we'll need to add specific drivers for these. Looking at NetBSD and
  OpenBSD I don't expect problems here though (quite the contrary; we still
  seem to set MIIF_NOISOLATE without good reason in a bunch of PHY drivers).
- Fix a style(9) whitespace nit.
2007-01-20 00:52:29 +00:00
John Baldwin
6eb7ebfe25 - Change the PCI-X registers constants to be relative to the PCI-X PCI
  capability rather than hardcoded offsets for a particular card.  While
  I'm here, expand the constants some.
- Change the ahd(4) driver to use pci_find_extcap() to locate the PCI-X
  capability to keep up with the first change.

Reviewed by:	scottl, gibbs (earlier version)
2007-01-19 22:37:52 +00:00
Jeff Roberson
7b8bfa0de9 Major revamp of ULE's cpu load balancing:
 - Switch back to direct modification of remote CPU run queues.  This added
   a lot of complexity with questionable gain.  It's easy enough to
   reimplement if it's shown to help on huge machines.
 - Re-implement the old tdq_transfer() call as tdq_pickidle().  Change
   sched_add() so we have selectable cpu choosers and simplify the logic
   a bit here.
 - Implement tdq_pickpri() as the new default cpu chooser.  This algorithm
   is similar to Solaris in that it tries to always run the threads with
   the best priorities.  It is actually slightly more complex than
   Solaris's algorithm because we also tend to favor the local cpu over
   other cpus which has a boost in latency but also potentially enables
   cache sharing between the waking thread and the woken thread.
 - Add a bunch of tunables that can be used to measure effects of different
   load balancing strategies.  Most of these will go away once the
   algorithm is more definite.
 - Add a new mechanism to steal threads from busy cpus when we idle.  This
   is enabled with kern.sched.steal_busy and kern.sched.busy_thresh.  The
   threshold is the required length of a tdq's run queue before another
   cpu will be able to steal runnable threads.  This prevents most queue
   imbalances that contribute to long latencies.
2007-01-19 21:56:08 +00:00
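Note on 7b8bfa0de9: a minimal sketch of the busy-threshold idea behind kern.sched.busy_thresh, assuming each run queue is reduced to a single load counter. The type tdq_sketch and the function pick_steal_victim() are hypothetical, and choosing the most loaded qualifying queue is an assumption of this sketch; the commit message only states that a queue must reach the threshold length before another cpu may steal from it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified per-CPU run queue: only the load matters here. */
struct tdq_sketch {
	int load;	/* number of runnable threads queued on this CPU */
};

/*
 * Called by an idle CPU ('self').  Return the index of the most loaded
 * other queue whose length is at least 'busy_thresh', or -1 if no
 * queue is busy enough to steal from.
 */
static int
pick_steal_victim(const struct tdq_sketch *tdqs, int ncpu, int self,
    int busy_thresh)
{
	int best = -1;

	assert(tdqs != NULL && ncpu > 0);
	for (int cpu = 0; cpu < ncpu; cpu++) {
		if (cpu == self || tdqs[cpu].load < busy_thresh)
			continue;	/* never steal from self or a short queue */
		if (best == -1 || tdqs[cpu].load > tdqs[best].load)
			best = cpu;
	}
	return (best);
}
```

A higher threshold leaves short queues alone and so avoids migrations, while a lower one lets idle cpus pull work sooner.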
Marius Strobl
47c422c3a8 Remove remnants of this file's sparc64 origin that are either
unlikely ever to be used or misplaced on sun4v.
2007-01-19 12:22:50 +00:00
Marius Strobl
0ca3609e30 Convert the remainder of the low hanging fruits regarding including
headers in .S directly rather than getting to their macros through
genassym.c/assym.s so there are fewer headers genassym.c has to be
kept in sync with.
While at it fix some style(9) bugs (indentation, prototype format,
sort headers, etc) and remove trailing whitespace.
2007-01-19 11:15:34 +00:00
Warner Losh
7e2ff8bbff Cope gracefully with device_get_children returning an error.
Obtained from: Hans Petter Selasky
P4: http://perforce.freebsd.org/chv.cgi?CH=112957
2007-01-19 08:49:28 +00:00
Marius Strobl
97202af2dc - Add a uart_rxready() and corresponding device-specific implementations
that can be used to check whether receive data is ready, i.e. whether
  the subsequent call of uart_poll() should return a char, and unlike
  uart_poll() doesn't actually receive data.
- Remove the device-specific implementations of uart_poll() and implement
  uart_poll() in terms of uart_getc() and the newly added uart_rxready()
  in order to minimize code duplication.
- In sunkbd(4) take advantage of uart_rxready() and use it to implement
  the polled mode part of sunkbd_check() so we don't need to buffer a
  potentially read char in the softc.
- Fix some mis-indentation in sunkbd_read_char().

Discussed with:	marcel
2007-01-18 22:01:19 +00:00
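Note on 97202af2dc: a minimal sketch of a device-independent uart_poll() built from uart_rxready() and uart_getc(), as the commit describes. The ops structure, the fake device, and the -1 "nothing ready" return value are assumptions made for illustration; the real uart(4) structures and signatures differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device ops; the real driver's structures differ. */
struct uart_ops_sketch {
	bool (*rxready)(void *dev);	/* true if a char is waiting; consumes nothing */
	int  (*getc)(void *dev);	/* returns and consumes the next received char */
};

static bool
uart_rxready(const struct uart_ops_sketch *ops, void *dev)
{
	return (ops->rxready(dev));
}

static int
uart_getc(const struct uart_ops_sketch *ops, void *dev)
{
	return (ops->getc(dev));
}

/* Device-independent poll: return a char if one is ready, else -1. */
static int
uart_poll(const struct uart_ops_sketch *ops, void *dev)
{
	assert(ops != NULL);
	if (!uart_rxready(ops, dev))
		return (-1);
	return (uart_getc(ops, dev));
}

/* Fake one-character device, used only to exercise the sketch. */
struct fake_uart {
	int pending;	/* the waiting character */
	int have;	/* nonzero while 'pending' is unread */
};

static bool
fake_rxready(void *dev)
{
	return (((struct fake_uart *)dev)->have != 0);
}

static int
fake_getc(void *dev)
{
	struct fake_uart *u = dev;

	u->have = 0;
	return (u->pending);
}
```

Because uart_rxready() does not consume data, a caller such as a keyboard check routine can test for input without having to stash an already-read character anywhere.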
Matt Jacob
33eb7cb0a9 A less draconian fix to the build. 2007-01-18 19:41:39 +00:00
Marius Strobl
3284c150d2 - Probe the CS4231 in USIII machines.
- Remove unused variables. [1]

Reported by:	Coverity Prevent (CID 700, 701) [1]
2007-01-18 19:19:19 +00:00
David E. O'Brien
da1fa91ac0 Temporarily comment out the KASSERT that broke the kernel build. 2007-01-18 18:53:13 +00:00
Marius Strobl
23e81b7e03 - Rename UPA_BUS_SPACE to NEXUS_BUS_SPACE; besides a UPA bus, nexus(4)
may also reflect a Fireplane/Safari or JBus bus (or a virtual bus which
  in turn reflects a JBus bus or something like that...).
- In both the sparc64 and sun4v bus_machdep.c use __FBSDID.
- Spell SBus the official way in comments.
- Replace hardcoded function names (all of which were actually outdated)
  in panic and status strings with __func__.
- Fix whitespace nits.
2007-01-18 18:32:26 +00:00
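Note on 23e81b7e03: a minimal sketch of why __func__ is preferable to a hardcoded function name in a diagnostic string. The function name and message text below are hypothetical and only illustrate the C99 identifier; they are not taken from bus_machdep.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a diagnostic into 'buf'.  __func__ expands to the name of the
 * enclosing function, so the message stays correct if the function is
 * renamed, which a hardcoded name in the string would not.
 */
static const char *
nexus_bus_map_sketch(char *buf, size_t len, int type)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "%s: illegal space type %d", __func__, type);
	return (buf);
}
```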
Gleb Smirnoff
164b576e96 Revise the ng_ppp(4) node, so that code flow is more clear. All non-link
hooks get their per hook rcvdata methods, and all functions are organized
corresponding to protocol stack model.

Submitted by:	Alexander Motin <mav alkar.net>
Reviewed by:	archie, julian
2007-01-18 13:55:21 +00:00
Marius Strobl
441b9412d6 Remove the compat shims for the ISA old-style in{b,w,l}()/out{b,w,l}()
and friends along with all hacks required to implement them. None of
the drivers currently built (as part of GENERIC, LINT or modules) on
sparc64 or sun4v and none of those we might want to use there in
future uses them. AFAICT there actually never was a driver hooked up
to the sparc64 or sun4v build that correctly used these functions
(and it looks like that due to a bug read{b,w,l}()/write{b,w,l}() and
the other functions working on a memory handle never actually worked on
sun4v). All they ever were good for on sparc64 and sun4v was erroneously
dragging in dependencies on isa(4) in drivers like f.e. dpt(4), si(4)
and syscons(4) in source files that supposedly were bus-neutral and
hiding issues with drivers like f.e. ng_bt3c(4) that used these
functions with busses other than isa(4) and therefore couldn't work on
these platforms.
2007-01-18 13:52:44 +00:00
Marius Strobl
420a38dd4b Wrap the EISA-specific parts of the dpt(4) and si(4) back-ends in
the newly added DEV_EISA. This is done so that these back-ends can
be compiled on platforms not providing in{b,w,l}()/out{b,w,l}() and
friends (but which may wish to use them together with bus front-ends other
than the EISA one).
2007-01-18 13:33:36 +00:00
Marius Strobl
2f11f3372a On sparc64 also use the fillw() this header provides for ia64 so
the sparc64 MD code doesn't need to provide a memsetw() along with
the ISA compat cruft.
2007-01-18 13:08:08 +00:00
Randall Stewart
93164cf98c - Almost all includes (#include <>) migrate to the sctp_os_bsd.h file
- Finally all splxx() are removed
 - Count error fixed in mapping array which might
   cause a wrong cumack generation.
 - Invariants around panic for case D + printf when no invariants.
 - one-to-one model race condition fixed by using
   a pre-formed connection and then completing the
   work so accept won't happen on a non-formed
   association.
 - Some additional paranoia checks in sctp_output.
 - Locks that were missing in the accept code.

Approved by:	gnn
2007-01-18 09:58:43 +00:00
Konstantin Belousov
4349c6ba29 Add support for LINUX_O_DIRECT, LINUX_O_DIRECTORY and LINUX_O_NOFOLLOW flags
to open() [1].
Improve locking for accessing session control structures [2].
Try to document (most likely harmless) races in the code [3].

Based on submission by:	Intron (intron at intron ac) [1]
Reviewed by:		jhb [2]
Discussed with:		netchild, rwatson, jhb [3]
2007-01-18 09:32:08 +00:00
Andrew Thompson
98b81793ed Set topology change propagation on all ports _except_ the caller. 2007-01-18 07:13:01 +00:00
Craig Rodrigues
5a09873361 Revert previous change.
Requested by:	kan
2007-01-18 05:46:32 +00:00
Craig Rodrigues
e76c6d8cd3 Forward declare __pcpu as a pointer type instead of an array type to
eliminate GCC 4.1 error: "array type has incomplete element type".
2007-01-18 02:00:04 +00:00
Xin LI
f67af5c918 Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form. 2007-01-17 15:05:52 +00:00
Xin LI
4f506694bb Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form. 2007-01-17 14:58:53 +00:00
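Note on f67af5c918 and 4f506694bb: a minimal sketch of an iteration macro replacing its unrolled loop. The list type, the list head, and the macro FOREACH_PROC_SKETCH() are hypothetical stand-ins; the sketch uses a plain singly linked list rather than the kernel's sys/queue.h list types and is not the definition of FOREACH_PROC_IN_SYSTEM.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal stand-ins for the kernel's process list. */
struct proc_sketch {
	int pid;
	struct proc_sketch *p_next;
};

static struct proc_sketch *allproc_sketch;	/* head of the list */

/*
 * The macro hides the loop header.  Its unrolled form is:
 *	for (p = allproc_sketch; p != NULL; p = p->p_next)
 */
#define FOREACH_PROC_SKETCH(p)						\
	for ((p) = allproc_sketch; (p) != NULL; (p) = (p)->p_next)

/* Return the entry with the given pid, or NULL if it is not listed. */
static struct proc_sketch *
find_proc_sketch(int pid)
{
	struct proc_sketch *p;

	assert(pid >= 0);
	FOREACH_PROC_SKETCH(p) {
		if (p->pid == pid)
			return (p);
	}
	return (NULL);
}
```

Callers written against the macro keep working if the list representation changes, since only the macro body has to follow.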
Markus Brueffer
740ae2a34c Fix a buffer overflow iff USB_DEBUG is set, hw.usb.ums.debug is > 5 and the
total size of all input reports is < 6.

PR:		usb/106435
Submitted by:	Eygene Ryabinkin <rea-fbsd@codelabs.ru>
Approved by:	emax (mentor)
MFC after:	3 days
2007-01-17 03:50:45 +00:00