Commit Graph

62571 Commits

Author SHA1 Message Date
cognet
638a7ba075 Use __NO_STRICT_ALIGNMENT, instead of special casing ia64 and sparc64.
This fixes panics I got on arm, with struct ip aligned on 4 bytes.

MFC After:	1 week
2007-02-09 00:09:35 +00:00
bms
ecb2edb6fa Store the cached route in vifp in the normal send_packet() case.
The VIFF_TUNNEL case no longer exists, therefore this field is free to
use, and its use eliminates a static data member.
2007-02-08 23:05:08 +00:00
bms
51ca9a4740 Nuke the token bucket filter code. Attempting to request rate limiting
by the token bucket filter will result in EINVAL being returned.

If you want to rate-limit traffic in future, use ALTQ or dummynet; this
isn't a general purpose QoS engine.

Preserve the now unused fields in struct vif so as to avoid having to
recompile netstat(1) and other tools.

Reviewed by:	Pavlin Radslavov, Bill Fenner
2007-02-08 22:58:01 +00:00
imp
3ae687fcd5 Add sanity check to make sure that the MAC address isn't all 0's. Bad
boot loaders can do this, and this leads to all kinds of ill effects
downstream.  Also, minor formatting nits.
2007-02-08 21:42:10 +00:00
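The sanity check described in the commit above can be sketched as follows. This is a minimal illustration, not the driver's actual code: the function name `mac_is_all_zero` and the constant `MAC_ADDR_LEN` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAC_ADDR_LEN 6	/* hypothetical name: bytes in an Ethernet MAC address */

/* Return true when every byte of the address is zero. */
bool
mac_is_all_zero(const uint8_t mac[MAC_ADDR_LEN])
{
	for (size_t i = 0; i < MAC_ADDR_LEN; i++) {
		if (mac[i] != 0)
			return (false);
	}
	return (true);
}
```

A driver could call such a helper at attach time and refuse, or replace, an address for which it returns true.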
imp
4b83088ca6 Fix problem with RTL8201L PHY. From submitter:
Bugfix for the Realtek PHY driver... an RTL8201L standalone PHY
    needs different handling than the integrated ones in terms of
    speed detection.  There was a bogus test based on the parent
    device driver name string controlling which speed register to
    query.  That test began failing when the rl driver was split into
    separate rl and re drivers some time ago.  Apparently nobody ever
    noticed because the buggy code only executes if NWAY negotiation
    failed.  Since we happen to be testing with an ancient dumb hub
    rather than a modern switch, we found it.

    To fix it all, have the attach() routine notice whether we're
    dealing with an integrated PHY or an RTL8201L and store that info
    in a struct accessible to the status() routine that needs to know
    which register to query.

I touched up the fixes because they were relative to RELENG_6 and to
bring a few nits into line with style(9).

MFC After: 2 weeks
Submitted by: Ian Lepore
2007-02-08 19:16:15 +00:00
jhb
62b1a5668d Don't send interrupts to CPUs disabled via lapic hints.
Reported by:	Ludger Bolmerg <lbolmerg ! web.de>
MFC after:	3 days
Pointy hat to:	jhb
2007-02-08 16:49:59 +00:00
rwatson
fb9b1cf91c As VPD support still causes hard hangs on boot with some hardware, add a
tunable allowing automatic parsing of VPD data to be disabled.  The
default is left as-is; if you are having problems with hard hangs at boot
due to VPD, try setting hw.pci.enable_vpd=0.  A proper architectural
solution has been under discussion for some time, but this allows me to
boot my test machines in the mean time.

Submitted by:	bz
Head nod:	jmg
2007-02-08 14:33:07 +00:00
kib
08a6b49351 Remove unneeded acquisition of the mount interlock around reading of
mnt_kern_flags in ufs_itimes().

Suggested by:	ssouhlal
Confirmed by:	tegge
MFC after:	2 weeks
2007-02-08 09:47:19 +00:00
rodrigc
4b93723aab #include <sys/systm.h> before <sys/geom.h> to get KASSERT(), and fix LINT build.
2007-02-08 04:02:56 +00:00
rodrigc
9327f3ee0d Add noatime to the list of mount options that msdosfs accepts.
PR:		108896
Submitted by:	Eugene Grosbein <eugen grosbein pp ru>
2007-02-08 02:30:55 +00:00
rodrigc
9a7c587caa Style fixes: use ANSI C function declarations.
2007-02-08 02:25:35 +00:00
jeff
7038c5de35 - Change types for recent runq additions to u_char rather than int.
- Fix these types in ULE as well.  This fixes bugs in priority index
   calculations in certain edge cases. (int)-1 % 64 != (uint)-1 % 64.

Reported by:	kkenn using pho's stress2.
2007-02-08 01:52:25 +00:00
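The inequality quoted in the commit above, (int)-1 % 64 != (uint)-1 % 64, follows from C's rules for the remainder operator. A minimal sketch, assuming a queue count of 64; `NQUEUES_SKETCH` and both function names are hypothetical and not taken from the scheduler source:

```c
#define NQUEUES_SKETCH 64	/* hypothetical stand-in for the number of run queues */

/* Signed remainder: the result takes the sign of the dividend (C99). */
int
index_signed(int pri)
{
	return (pri % NQUEUES_SKETCH);
}

/* Unsigned remainder: the value wraps first, so the result is never negative. */
unsigned int
index_unsigned(unsigned int pri)
{
	return (pri % NQUEUES_SKETCH);
}
```

With these definitions `index_signed(-1)` yields -1, which is not a valid array index, while `index_unsigned((unsigned int)-1)` yields 63.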
marcel
35c8706625 Don't recurse into geom_apple and geom_gpt. They have been moved
into the g_part framework.
2007-02-07 21:37:02 +00:00
bms
6fc869225c eliminate redundant macro MC_SEND()
2007-02-07 20:36:33 +00:00
marcel
0245423ad8 Evolve the ctlreq interface added to geom_gpt into a generic
partitioning class that supports multiple schemes. Current
schemes supported are APM (Apple Partition Map) and GPT.
Change all GEOM_APPLE and GEOM_GPT options into GEOM_PART_APM
and GEOM_PART_GPT (resp).

The ctlreq interface supports verbs to create and destroy
partitioning schemes on a disk; to add, delete and modify
partitions; and to commit or undo changes made.
2007-02-07 18:55:31 +00:00
jhb
9c764c7fc3 - Move 'struct swdevt' back into swap_pager.h and expose it to userland.
- Restore support for fetching swap information from crash dumps via
  kvm_get_swapinfo(3) to fix pstat -T/-s on crash dumps.

Reviewed by:	arch@, phk
MFC after:	1 week
2007-02-07 17:43:11 +00:00
bms
61cc2fad7d Remove support for IPIP tunnels in IPv4 multicast forwarding. XORP has
never used them; with mrouted, their functionality may be replaced by
explicitly configuring gif(4) instances and specifying them with the
'phyint' keyword.

Bump __FreeBSD_version to 700030, and update UPDATING.
A doc update is forthcoming.

Discussed on:	net
Reviewed by:	fenner
MFC after:	3 months
2007-02-07 16:04:13 +00:00
kib
0e5b15d726 Fix the race of dereferencing /proc/<pid>/file with execve(2) by caching
the value of p_textvp. This way, we always unlock the locked vnode.
While there, vhold() the vnode around the vn_lock().

Reported and tested by:	Guy Helmer (ghelmer palisadesys com)
Approved by:		des (procfs maintainer)
MFC after:		1 week
2007-02-07 10:30:49 +00:00
alc
2eb15b506b Change the pagedaemon, vm_wait(), and vm_waitpfault() to sleep on the
vm page queue free mutex instead of the vm page queue mutex.
2007-02-07 06:37:30 +00:00
alc
c1270b41ec Remove the vm page queue free mutex from the CDEV order.
2007-02-07 05:43:31 +00:00
bde
2847eb0a1f Fixed some style bugs. Routine except:
- don't use __GNUCLIKE___OFFSETOF, since __offsetof() is a standard
  FreeBSD implementation detail which has nothing to do with GNUC.
2007-02-06 18:04:02 +00:00
rwatson
477d310a8e Print intptr_t values by first casting to intmax_t and then printing with
%jd, as intptr_t may not be int-sized.

Assistance from:	jhb
Spotted by:		Mr Tinderbox
2007-02-06 17:22:36 +00:00
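The technique in the commit above (widen to intmax_t, then print with %jd) can be sketched in portable C. The helper name `format_intptr` is hypothetical; only the cast and the `%jd` conversion come from the commit message.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format an intptr_t without assuming it is int-sized:
 * cast to intmax_t and use the 'j' length modifier.
 * Returns the number of characters snprintf() would write.
 */
int
format_intptr(char *buf, size_t len, intptr_t value)
{
	return (snprintf(buf, len, "%jd", (intmax_t)value));
}
```

Printing with plain `%d` instead would be undefined behaviour on platforms where intptr_t is wider than int, such as amd64.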
rwatson
8a05e41ae5 Update comments in mac.h.
Obtained from:	TrustedBSD Project
2007-02-06 16:24:57 +00:00
bde
a1801d3dbe Simplified PCPU_GET() and PCPU_SET(). We must copy through a temporary
variable to avoid invalid constraints in dead code.  Use an array of
u_char's (inside a struct) instead of a char/short/int/long variable so
that the variable and its accesses can be spelled in the same way in all
cases and code doesn't need to be cloned just to hold the spelling
differences.

Fixed strict-aliasing errors in PCPU_SET() and in the amd64 PCPU_GET().
Cast to (void *) as in rev.1.37 of the i386 version where the errors
were fixed for the i386 PCPU_GET() only.  It would be more correct to
copy to and from the temp. variable using memcpy(), but then an
ifdef tangle would be required to ensure using the builtin memcpy().
We depend on fairly aggressive optimization to put the temp. variable
only in a register despite it being copied using
*(type *)(void *)&anothertype and could depend on this when using
memcpy() too.  This seems to work right even for -O0, but the -O0 case
has not been completely tested.

This change gives identical object code for all object files in LINT
on amd64 (except for one file with a __TIME__ stamp).  For LINT on
i386 it gives unimportant differences in instruction order and padding
in a few object files.  This was only tested for -O.

This change (actually a previous version of it) gives the following
reductions in the number of object files in LINT that fail to compile
with -O2 but without the -fno-strict-aliasing kludge:
- amd64: 29 (down from 211)
- i386: 36 (down from 47)

gcc-3.4.6 actually allows the invalid constraints that result from not
using the temp. variable, at least with -O[1-2], but gcc-3.3.3 crashes
on them and I don't want to depend on compiler bugs.
2007-02-06 16:21:09 +00:00
rwatson
a7eaaf4149 Push UNIX domain socket locking further into uipc_ctloutput() in order to
avoid holding the UNIX domain socket subsystem lock over sooptcopyin()
and sooptcopyout().  This problem was introduced when LOCAL_CREDS and
LOCAL_CONNWAIT support were added.

Reviewed by:	mdodd
2007-02-06 14:31:37 +00:00
rwatson
19777f0802 Introduce accessor functions mac_label_get() and mac_label_set() to replace
LABEL_TO_SLOT() macro used by policy modules to query and set label data
in struct label.  Instead of using a union, store an intptr_t, simplifying
the API.

Update policies: in most cases this required only small tweaks to current
wrapper macros.  In two cases, a single wrapper macro had to be split into
separate get and set macros.

Move struct label definition from _label.h to mac_internal.h and remove
_label.h.  With this change, policies may now treat struct label * as
opaque, allowing us to change the layout of struct label without breaking
the policy module ABI.  For example, we could make the maximum number of
policies with labels modifiable at boot-time rather than just at
compile-time.

Obtained from:	TrustedBSD Project
2007-02-06 14:19:25 +00:00
imp
f82b2337b9 at91_twi depends on the iicbus module to satisfy its symbols when
loaded, so make that explicit.  Works for the monolithic kernel case,
won't work for the kldload case.
2007-02-06 12:07:14 +00:00
rwatson
d945a8c499 Continue 7-CURRENT MAC Framework rearrangement and cleanup:
Don't perform a nested include of _label.h in mac.h, as mac.h now
describes only the user API to MAC, and _label.h defines the in-kernel
representation of MAC labels.

Remove mac.h includes from policies and MAC framework components that do
not use userspace MAC API definitions.

Add _KERNEL inclusion checks to mac_internal.h and mac_policy.h, as these
are kernel-only include files.

Obtained from:	TrustedBSD Project
2007-02-06 10:59:23 +00:00
mpp
f010375878 The change to the vm_page_queue_freelist lock from a spin lock to a
sleep lock missed the witness code, and the system will panic
immediately on boot if WITNESS is enabled.

Changed the witness definition to the new type.
2007-02-06 05:51:55 +00:00
rodrigc
5efe7e2fa5 Eliminate some dead code which was introduced in 1.23, yet was always
commented out.
2007-02-06 03:30:58 +00:00
jhb
adbe57597a Change GDB_BUFSZ to be large enough to hold a register dump where each
register takes 16 characters (64-bit register in hex).  In practice this
is a slight bit of overkill as 7 of the 56 registers are only 32-bit, but
having the buffer too small results in remote kgdb trashing kernel memory
when it connects.

PR:		amd64/108673
Submitted by:	Ravi Murty, Nikhil Rao @ Intel
MFC after:	3 days
2007-02-05 21:48:32 +00:00
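The sizing argument in the commit above can be checked with a little arithmetic. The sketch below only counts hex characters; it does not reproduce the real GDB_BUFSZ definition, and the function name `regdump_hex_chars` is hypothetical.

```c
#include <stddef.h>

/* Hex characters needed to dump nregs registers of reg_bytes bytes each. */
size_t
regdump_hex_chars(size_t nregs, size_t reg_bytes)
{
	return (nregs * reg_bytes * 2);	/* two hex digits per byte */
}
```

Treating all 56 registers as 64-bit gives `regdump_hex_chars(56, 8)`, i.e. 896 characters, whereas 49 64-bit plus 7 32-bit registers need only 840. That difference is the "slight bit of overkill" the message mentions.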
bms
94de0f0fd0 Fix devfs cloning for non-superusers when net.link.tap.user_open is non-zero.
Note: 'ifconfig tapX create' still requires PRIV_NET_IFCREATE privilege.

Reviewed by:	rwatson
2007-02-05 11:29:08 +00:00
bms
ece591cff4 Clean up after tun(4) properly; remove routes whose ifp is set to
that of the tun instance even for the !AF_INET case, and properly
remove configured addresses by calling if_purgeaddrs().

Maintain the TUN_DSTADDR behaviour for compatibility with the OS/390
emulator.

MFC after:	3 weeks
PR:		100080
Reviewed by:	bz
2007-02-05 11:15:52 +00:00
bms
dd70643685 MFC after: 3 days
2007-02-05 11:05:41 +00:00
kevlo
b504b0b913 <sys/sx.h> is unneeded.
2007-02-05 10:33:39 +00:00
alc
4881bd38e2 Change the free page queue lock from a spin mutex to a default (blocking)
mutex.  With the demise of Alpha support, there is no longer a reason for
it to be a spin mutex.
2007-02-05 06:02:55 +00:00
bms
7925e63ddf When fast-forwarding is enabled, do not forward directed IPv4 broadcasts
to locally attached broadcast networks.

Note well: This relies on the layer 2 route cloning behaviour in BSD.

PR:		98799
Tested by:	Dmitry Sergienko
MFC after:	1 week
2007-02-05 00:15:40 +00:00
tegge
06da132002 Call pbgetvp() and pbrelvp() instead of setting b_vp directly.
PR:		kern/108151
2007-02-04 23:42:02 +00:00
le
176abfa169 Add support for another 3G card and update man page accordingly.
The patch from the PR was a little outdated w/regards to the
Vodafone vendor string.

PR:            kern/106033
Submitted by:  Volker Werth <volker_AT_vwsoft.com>
MFC in:        3 days
2007-02-04 22:14:18 +00:00
bms
77c2e11309 Implement ifnet cloning for tun(4)/tap(4).
Make devfs cloning a sysctl/tunable which defaults to on.

If devfs cloning is enabled, only the super-user may create
tun(4)/tap(4)/vmnet(4) instances. Devfs cloning is still enabled by
default; it may be disabled from the loader or via sysctl with
"net.link.tap.devfs_cloning" and "net.link.tun.devfs_cloning".

Disabling its use affects potentially all tun(4)/tap(4) consumers
including OpenSSH, OpenVPN and VMware.

PR:		105228 (potentially also 90413, 105570)
Submitted by:	Landon Fuller
Tested by:	Andrej Tobola
Approved by:	core (rwatson)
MFC after:	4 weeks
2007-02-04 16:32:46 +00:00
dumbbell
dff38aee3c Synaptics TouchPad seems to go back to Relative Mode after the call
to set_controller_command_byte(); by issuing a Read Mode Byte
command, the touchpad is in Absolute Mode again.

This problem occurred at least on Asus V6V laptops.
2007-02-04 12:47:52 +00:00
joel
0a07b005d5 Orion originally wrote and added these files in 2002/2003, so with his
approval, change the copyright statement to point at him instead of
"FreeBSD, Inc".

Encouraged by:	rwatson
Reviewed by:	imp
Discussed with and approved by:	orion
2007-02-04 06:52:33 +00:00
mpp
2ba513e47f If quotacheck or edquota reset the block or inode grace time for
a user or group, when the kernel first sees this, it will update
the grace time value.  However, it never flags the quota as modified
and the updated value never makes it to the quota data file unless
the user actually makes some other change that would write the
data out.

Fixed to flag the quota as modified if the soft limit has actually
been reached and should be now enforced.
2007-02-04 06:46:57 +00:00
imp
98f50a3dfb Document the init_chroot and init_script variables.
# I didn't check the markup too closely, so doc people, please check

Submitted by: Oliver Fromme
2007-02-04 06:35:10 +00:00
sam
9f1eaf60b8 clear/reclaim challenge text when switching auth mode and operating as an ap
Obtained from:	Atheros
2007-02-04 05:49:16 +00:00
alc
d7099516c9 Include opt_ipdivert.h so that the message announcing ipfw correctly
describes the state of IPDIVERT.
2007-02-03 22:11:53 +00:00
flz
c6ea5290e2 Fix build (sc->dev => sc->sc_dev).
2007-02-03 21:11:11 +00:00
rink
9722cd2743 Add support for the NetCell NC3000/5000 series SATA RAID cards.
Reviewed by:	sos
Approved by:	imp (mentor)
MFC after:	1 week
2007-02-03 20:12:00 +00:00
imp
0b64c14442 It turns out we were mallocing too early, so move the allocation so we
don't leak.
2007-02-03 19:11:09 +00:00
imp
82ed1dbff8 Fix memory leak of devinfop
PR: 108719
Submitted by: Antoine Brodin
2007-02-03 16:41:55 +00:00
imp
b7897e7411 Fix possible memory leaks of devinfo.
PR: 108719
Submitted by: Antoine Brodin
2007-02-03 16:38:32 +00:00
imp
363d79c219 Fix non-use, but not memory leak, of devinfop. Set the device's
description here.  The fix in the PR isn't necessary at all for memory
leaks, but we weren't setting the device description.

While I'm here, remove some of the obfuscating macros in attach.

PR: 108719
2007-02-03 16:33:47 +00:00
imp
afa51f755a Fix memory leak of devinfo. The leak itself was documented in
PR/108719, but there's a simpler fix: free it after it is used, and
then get rid of the redundant frees this causes.  Other leaks in this
PR not yet fixed.

While I'm here, remove NetBSD/OpenBSD code and some of the portability
#defines that were getting in the way of understanding this code.  The
devinfo bug was harder to spot because one needed to know that
device_set_desc_copy() was used inside of one of them (one that didn't
take an argument!).

Prefer device_printf(sc->sc_dev, "...") to printf("%s:...",
device_get_nameunit(sc->sc_dev)).  This saves almost 300 bytes.

PR: 108719
Submitted by: Antoine Brodin
2007-02-03 16:19:28 +00:00
mlaier
4bd0763c38 Add a small informative printf under bootverbose to firmware_register to
track problems when loading firmware from loader.
2007-02-03 16:01:46 +00:00
mlaier
9a2ac087c9 Add ALTQ support for aue(4).
Tested by:	Greg Hennessy, Volker
MFC after:	1 week
2007-02-03 13:53:22 +00:00
ume
de9e7e731d ng_iface requires neighbor cache as well.
MFC after:	3 days
2007-02-03 09:34:36 +00:00
bms
ee59ac20b6 Style; remove argument names from prototype, be consistent with
rest of file.
This has the additional side-effect of removing a C++ reserved keyword
from this file, which prevents the Click Modular Router's FreeBSD
kernel support from building.

Reviewed by:	silence on -current
2007-02-03 07:49:20 +00:00
kevlo
ed33e9dab8 ether_ifattach() sets if_mtu to ETHERMTU, don't bother setting it again.
Approved by: imp, cognet
2007-02-03 07:46:26 +00:00
imp
39fa690f18 We need to free the ivars for the child that we just deleted.
2007-02-03 07:09:36 +00:00
bms
929d8d99d7 In fast forwarding path, defer processing of 169.254.0.0/16
to ip_input(). See RFC 3927 section 2.7.
2007-02-03 06:46:48 +00:00
imp
426b160b28 The path to the mmc/mmcbus_if.m file is wrong. Correct it by
prepending dev/

Submitted by: Andrea Bittau
2007-02-03 06:46:11 +00:00
bms
b6b883252e In regular forwarding path, reject packets destined for 169.254.0.0/16
link-local addresses. See RFC 3927 section 2.7.
2007-02-03 06:45:51 +00:00
imp
6443ab2e87 Mark mmc *_if.m files as standard to allow for mmc/sd being compiled
as a module.

Submitted by: Andrea Bittau
2007-02-03 06:45:02 +00:00
bms
2b8498ff24 Diff reduction with RELENG_6, style(9):
Remove unnecessary brace; && should be on end of line.
No functional changes.
2007-02-03 03:57:45 +00:00
bms
cb84e5a9bd Drop unicast Ethernet frames not destined for the configured address
of a tap(4) instance, if IFF_PROMISC is not set.

In tap(4), we should emulate the effect IFF_PROMISC would have on
hardware, otherwise we risk introducing layer 2 loops if tap(4) is
used with bridges. This means not even bpf(4) gets to see them.

This patch has been tested in a variety of situations. Multicast and
broadcast frames are correctly allowed through. I have observed this
behaviour causing problems with multiple QEMU instances hosted on
the same FreeBSD machine.

The checks in ether_demux() [if_ethersubr.c, rev 1.222, line 638]
are insufficient to prevent this bug from occurring, as ifp->if_vlantrunk
will always be NULL for the non-vlan case.

MFC after:	3 weeks
PR:		86429
Submitted by:	Pieter de Boer (with changes)
2007-02-03 02:57:45 +00:00
bms
a6c57fe6a9 Use int instead of u_int for the 'extra' argument to the
clone_create() KPI.
This fixes a signedness bug in unit number comparisons.

Submitted by:	imp, Landon Fuller
PR:		kern/105228
MFC after:	2 weeks
2007-02-02 22:27:45 +00:00
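The commit above does not show the comparison that misbehaved, so the following is only a generic illustration of how mixing int and u_int in a comparison goes wrong. Both function names are hypothetical; the explicit cast in the first one spells out the conversion that C otherwise performs implicitly.

```c
/*
 * Buggy shape: when one operand is unsigned int, the int operand is
 * converted to unsigned int, so -1 becomes UINT_MAX before comparing.
 */
int
unit_below_limit_unsigned(int unit, unsigned int limit)
{
	return ((unsigned int)unit < limit);
}

/* Fixed shape: both operands are int, so -1 compares as expected. */
int
unit_below_limit_signed(int unit, int limit)
{
	return (unit < limit);
}
```

For a unit of -1 and a limit of 10, the unsigned variant returns 0 and the signed variant returns 1.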
bms
4341e5c6f6 Comply with RFC 3927, by forcing ARP replies which contain a source
address within the link-local IPv4 prefix 169.254.0.0/16, to be
broadcast at link layer.

Reviewed by:	fenner
MFC after:	2 weeks
2007-02-02 20:31:44 +00:00
jhb
c09539e88f Add constants for the PCIY_VENDOR (vendor-specific), PCIY_DEBUG (EHCI
debug port), and PCIY_EXPRESS (PCI-express) capabilities.
2007-02-02 19:48:25 +00:00
bms
e9ca37568e Expose smoothed RTT and RTT variance measurements to userland via
socket option TCP_INFO.
Note that the units used in the original Linux API are in microseconds,
so use a 64-bit mantissa to convert FreeBSD's internal measurements
from struct tcpcb from ticks.
2007-02-02 18:34:18 +00:00
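The reason for a 64-bit intermediate in the commit above can be sketched as below. This is a simplified illustration, not the committed code: the function name `ticks_to_usec` is hypothetical, and any fixed-point scaling that struct tcpcb applies to its RTT fields is ignored here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a tick count to microseconds for a clock running at hz ticks
 * per second.  The multiplication is done in 64 bits because
 * ticks * 1000000 overflows 32 bits once ticks exceeds 4294.
 */
uint32_t
ticks_to_usec(uint32_t ticks, uint32_t hz)
{
	assert(hz > 0);
	return ((uint32_t)(((uint64_t)ticks * 1000000u) / hz));
}
```

With hz = 1000, 5000 ticks convert to 5000000 microseconds; a 32-bit multiplication would have wrapped before the division.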
pjd
a71f42e832 coda_vptofh is never defined nor used.
2007-02-02 15:47:28 +00:00
joel
52a84ff1e8 Remove dead email address.
Requested by:	luigi
2007-02-02 13:44:09 +00:00
joel
090305539f Clean up the BSD license to match the preferred license in
/usr/share/examples/etc/bsd-style-copyright.  I've fixed a
few minor wording and formatting differences.

Approved by:	luigi, Hannu Savolainen <hannu@opensound.com>
2007-02-02 13:39:20 +00:00
joel
d997352f31 Add a standard BSD license to these files.
Discussed with:	rwatson
Approved by:	luigi
2007-02-02 13:33:35 +00:00
glebius
fd07387e04 Quoting Alexander:
Formulas described in RFC require high precision of floating point.
  Formulas of integer math implemented in ng_pptpgre give mistake in range
  of +0-7ms on RTT and +0-3ms on deviation. This leads to significant
  underestimation of real packet RTT.

  I have made a very simple patch to reduce mistake to +4-3ms on RTT and
  +2-1ms on deviation. Mistake in RTT is not good, but gets covered by
  deviation. To cover worst possible negative mistake in deviation I have
  added 2ms to it. Also this 2 ms cover the case when measured deviation
  is so small (about zero) that it can interfere with process scheduling
  delays or weather on Mars.

  My tests show decreasing of packet losses on 20ms RTT link from 2.5% to
  0.3% while speed increased by 1/3.

Reviewed by:	archie
2007-02-02 09:45:23 +00:00
glebius
325d4d7fda Since rev. 1.94 of netinet/in.c, the netinet layer frees all its
multicast memberships, when interface is detached. Thus, when
an underlying interface is detached, we do not need to free
our multicast memberships.

Reviewed by:	bms
2007-02-02 09:39:09 +00:00
kib
a816abd565 Record kqueue -> struct mount mtx -> vnode interlock lock order to
catch the places where reverse lock order is instantiated.

OKed by:	jeff
2007-02-02 09:02:18 +00:00
kib
de1264b042 Remove extern int hz; use proper include file instead.
2007-02-02 08:58:16 +00:00
kevlo
8b1aa284e9 Use bus_get_dma_tag() so iwi(4) works on platforms requiring it.
Approved by: cognet
2007-02-02 05:17:18 +00:00
julian
743211870f Move the setting of the idle_mask bits to a place where they
can't be wrong.
Also use the IDLETD bit in the thread mask to test if it's an idle thread
rather than doing a PCPU access.
2007-02-02 05:14:22 +00:00
kevlo
c5ea90d498 Remove a bogus i = 0
Approved by: cognet
2007-02-02 05:14:21 +00:00
kmacy
bcdd0af22d Add support for IPI_PREEMPT in order to enable use of the ULE scheduler
2007-02-02 05:00:21 +00:00
kmacy
efb054426a match against both dirty and writeable for marking page dirty
2007-02-02 04:57:11 +00:00
sam
fe499355a4 add IEEE80211_IS_CHAN_PASSIVE
MFC after:	1 week
2007-02-02 02:45:33 +00:00
andre
25c4be862e Auto sizing TCP socket buffers.
Normally the socket buffers are static (either derived from global
defaults or set with setsockopt) and do not adapt to real network
conditions. Two things happen: a) your socket buffers are too small
and you can't reach the full potential of the network between both
hosts; b) your socket buffers are too big and you waste a lot of
kernel memory for data just sitting around.

With automatic TCP send and receive socket buffers we can start with a
small buffer and quickly grow it in parallel with the TCP congestion
window to match real network conditions.

FreeBSD has a default 32K send socket buffer. This supports a maximal
transfer rate of only slightly more than 2Mbit/s on a 100ms RTT
trans-continental link. Or at 200ms just above 1Mbit/s. With TCP send
buffer auto scaling and the default values below it supports 20Mbit/s
at 100ms and 10Mbit/s at 200ms. That's an improvement of factor 10, or
1000%. For the receive side it looks slightly better with a default of
64K buffer size.

New sysctls are:
  net.inet.tcp.sendbuf_auto=1 (enabled)
  net.inet.tcp.sendbuf_inc=8192 (8K, step size)
  net.inet.tcp.sendbuf_max=262144 (256K, growth limit)
  net.inet.tcp.recvbuf_auto=1 (enabled)
  net.inet.tcp.recvbuf_inc=16384 (16K, step size)
  net.inet.tcp.recvbuf_max=262144 (256K, growth limit)

Tested by:	many (on HEAD and RELENG_6)
Approved by:	re
MFC after:	1 month
2007-02-01 18:32:13 +00:00
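The throughput figures in the commit above follow from the rule that a sender can have at most one buffer's worth of data in flight per round trip. A minimal sketch of that arithmetic; the function name `max_throughput_bps` is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Upper bound on throughput, in bits per second, when at most
 * buf_bytes can be in flight during one round trip of rtt_ms.
 */
uint64_t
max_throughput_bps(uint64_t buf_bytes, uint64_t rtt_ms)
{
	assert(rtt_ms > 0);
	return (buf_bytes * 8 * 1000 / rtt_ms);
}
```

A 32K buffer (32768 bytes) at 100 ms gives 2621440 bit/s, and at 200 ms 1310720 bit/s. The 256K limit (262144 bytes) gives 20971520 bit/s at 100 ms and 10485760 bit/s at 200 ms, in line with the roughly 2, 1, 20 and 10 Mbit/s quoted in the message.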
andre
ad9bb7722c Generic socket buffer auto sizing support, header defines, flag inheritance.
MFC after:	1 month
2007-02-01 17:53:41 +00:00
andre
347b7a6a1e Change the way the advertized TCP window scaling is computed. Instead of
upper-bounding it to the size of the initial socket buffer lower-bound it
to the smallest MSS we accept.  Ideally we'd use the actual MSS information
here but it is not available yet.

For socket buffer auto sizing to be effective we need room to grow the
receive window.  The window scale shift is determined at connection setup
and can't be changed afterwards.  The previous, original, method effectively
just did a power of two roundup of the socket buffer size at connection
setup severely limiting the headroom for larger socket buffers.

Tested by:	many (as part of the socket buffer auto sizing patch)
MFC after:	1 month
2007-02-01 17:39:18 +00:00
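To see why a shift derived from the initial buffer size leaves little headroom, consider the sketch below. It is not the kernel's computation (the commit does not give the formula); it only assumes the TCP window field is 16 bits wide and the shift is capped at 14, as in RFC 1323. All names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_MAXWIN	65535u	/* largest value of the 16-bit window field */
#define SKETCH_MAXSHIFT	14	/* upper bound on the window scale shift */

/* Smallest shift whose scaled window can cover 'bytes'. */
int
winshift_for(uint32_t bytes)
{
	int shift = 0;

	while (shift < SKETCH_MAXSHIFT &&
	    ((uint32_t)SKETCH_MAXWIN << shift) < bytes)
		shift++;
	return (shift);
}

/* Largest window that can be advertised with a given shift. */
uint32_t
max_window(int shift)
{
	return ((uint32_t)SKETCH_MAXWIN << shift);
}
```

Under these assumptions a 32768-byte buffer yields shift 0, which caps the window at 65535 bytes for the life of the connection, while covering a 262144-byte buffer needs shift 3.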
kib
8f812418c1 Introduce some more SO_ option equivalents from Linux to FreeBSD.
The msg variable in linux_recvmsg() was not initialized.
Copy it from userspace.

Submitted by: rdivacky
2007-02-01 13:36:19 +00:00
kib
3f2b6c010a No need to lock emul_lock in exit_group() because em->shared
cannot change (because it's referenced by curthread). This fixes
a LOR caused by acquiring emul_shared_lock while holding emul_lock.

Fix typo in comment.

Submitted by: rdivacky
2007-02-01 13:33:33 +00:00
kib
02650398d1 No need to synchronize linux_schedtail with linux_proc_init.
p->p_emuldata is properly initialized in the time when the child can run.

Do not set p->p_emuldata to NULL when the process is exiting.
It does not make any sense and only costs 2 mutex operations.

Do not lock emul_data to unlock it on the very next line.
Comment on possible race while there.

Reparent all procs that are part of a threading group but not its leaders
to init and SIGCHLD init to finish the zombies off. This fixes zombies
left after opera's exit. [1]

There is no need to lock p_em in the linux_proc_init CLONE_THREAD
case because the process cannot change the address of the p_em->shared
because it's currently running this code path.
Move assigning of em->shared outside emul_shared_lock.

Noticed by: Scott Robbins <scottro@nyc.rr.com> [1]
Submitted by: rdivacky
2007-02-01 13:29:27 +00:00
kib
b9ce1aaa2a Fix LOR that occurs because proctree_lock was acquired while holding
emuldata lock by moving the code upwards outside the emul_lock coverage.

Submitted by: rdivacky
2007-02-01 13:27:52 +00:00
kib
84f6f6c749 MFi386: Use LINUX_SIG_VALID macro.
Submitted by: rdivacky
2007-02-01 13:24:40 +00:00
ariff
ef779241ba Fix huge memory leak within sound buffer (during channel destruction,
buffer resizing, etc.) that was here since eon. Free all (unmanaged)
allocated buffer through sndbuf_destroy() in case we forgot to call
sndbuf_free(). For a managed buffer (mostly hw specific managed buffer),
either provide CHANNEL_FREE() method with appropriate return value to
invoke semi-automatic sndbuf_free() or simply do it on their own. If
everything else fails, sndbuf_destroy() will come to the rescue as a
final measure.

MFC after:	3 days
2007-02-01 09:46:03 +00:00
ariff
f763a443cd Fix apparent memory leak (during vchan destruction) that was here
since eon.
2007-02-01 09:30:01 +00:00
avatar
114dbcad62 Reflecting the removal of MSDOSFS_LARGE found in sys/conf/files:1.1173.
This should fix the run time bustage observed on recent -CURRENT whilst
mounting a MSDOS filesystem with non-default locale/code page:

	link_elf: symbol msdosfs_fileno_free undefined
	KLD msdosfs_iconv.ko: depends on msdosfs - not available
2007-02-01 04:21:03 +00:00
mpp
261dbc8078 Prevent quotactl calls that pass in an id of -1 from incorrectly
using the caller's UID instead of the GID when performing group
operations.  This could allow users to determine group quota
information for groups they are not a member of in some cases.

Rename the "uid" parameter in ufs_quotactl to "id" to better show
that it is used for more than just the uid, and to be more in line
with the naming conventions in the other quota routines.

PR:	kern/33940
2007-02-01 02:13:53 +00:00
mpp
3cdb06d461 Disallow negative UIDs when processing quotactl options.
2007-02-01 01:01:56 +00:00
mohans
5f0bd46234 Fix for a vnode lock leak in nfs_create() in the event of an error.
Spotted by ups@.
2007-01-31 23:10:27 +00:00
gallatin
dd7403e0de - Add 99% of a callout based watchdog. The remaining 1% is waiting
  for pci_cfg_restore() to be exported.  It was tested using a
  hackily accessed pci_cfg_restore().

- Add ifmedia_removeall() to mxge_detach() in order to stop leaking
  an ifaddr

- Fix a small accounting bug introduced by the locking code shuffle
  which could cause spurious watchdog resets now that we have a
  watchdog.

Sponsored by: Myricom
2007-01-31 19:53:36 +00:00
gallatin
2313de63b4 destroy busdma maps even if they are NULL, so as to avoid leaking
busdma tags.
2007-01-31 15:47:44 +00:00
gallatin
60e3c70670 Abandon using sleepable locks in favor of mutexes for mxge's if_ioctl
locking in preparation for adding a watchdog handler (callouts must
not use sleepable locks).  This required shuffling memory and
interrupt allocation to the attach routine rather than if_ioctl so as
to avoid potential sleeps while bringing up the interface.
2007-01-31 15:29:31 +00:00
bms
cad0bb8b16 Import macros IN_LINKLOCAL(), IN_PRIVATE(), IN_LOCAL_GROUP(), IN_ANY_LOCAL().
This is not a functional change.

IN_LINKLOCAL() tests if an address falls within the IPv4 link-local prefix.
IN_PRIVATE() tests if an address falls within an RFC 1918 private prefix.
IN_LOCAL_GROUP() tests if an address falls within the statically assigned
link-local multicast scope specified in RFC 2365.
IN_ANY_LOCAL() tests for either of IN_LINKLOCAL() or IN_LOCAL_GROUP().

As with the existing macros in the FreeBSD netinet stack, comparisons
are performed in host-byte order.

See also:	RFC 1918, RFC 2365, RFC 3927
Obtained from:	NetBSD (dyoung@)
MFC after:	2 weeks
2007-01-31 14:34:47 +00:00
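The prefix tests described in the commit above can be sketched as plain functions on host-byte-order addresses. These are illustrations written from the prefixes in RFC 3927 and RFC 1918, not the imported macro definitions; the function names are hypothetical, and treating the "statically assigned link-local multicast scope" as 224.0.0.0/24 is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* 169.254.0.0/16, the IPv4 link-local prefix (RFC 3927). */
bool
is_linklocal(uint32_t a)
{
	return ((a & 0xffff0000u) == 0xa9fe0000u);
}

/* 10/8, 172.16/12 and 192.168/16, the RFC 1918 private prefixes. */
bool
is_private(uint32_t a)
{
	return ((a & 0xff000000u) == 0x0a000000u ||
	    (a & 0xfff00000u) == 0xac100000u ||
	    (a & 0xffff0000u) == 0xc0a80000u);
}

/* 224.0.0.0/24, assumed here to be the link-local multicast scope. */
bool
is_local_group(uint32_t a)
{
	return ((a & 0xffffff00u) == 0xe0000000u);
}

/* Either link-local unicast or link-local multicast. */
bool
is_any_local(uint32_t a)
{
	return (is_linklocal(a) || is_local_group(a));
}
```

Because the comparisons are done in host byte order, an address taken from a packet would first go through ntohl() before being passed to any of these helpers.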
joel
4577663dd4 Put #ifndef... after the license.
Approved by:	ariff
2007-01-31 12:10:48 +00:00
joel
84ea198abf s/WHETHERIN/WHETHER IN/ & s/THEPOSSIBILITY/THE POSSIBILITY/ in the
license text.

Approved by:	imp
2007-01-31 08:53:45 +00:00
ru
313e318cc8 MFsparc64: Add .cvsignore file here too.
2007-01-30 10:50:55 +00:00
ru
75c059c38f Remove the last vestige of opt_msdosfs.h.
Submitted by:	grep(1)
2007-01-30 10:17:36 +00:00
gallatin
ec105ac682 Minor updates:
- initialize ifq_drv_maxlen correctly
- mark the interface as jumbo capable
- keep stats on the number of times the hw transmit queue filled and
  was restarted.
2007-01-30 08:39:44 +00:00
avatar
15257284d7 Fixing compilation bustage by removing references to opt_msdosfs.h.
This auto-generated header file no longer exists since the removal of
MSDOSFS_LARGE in sys/conf/options:1.574.
2007-01-30 08:05:04 +00:00
rodrigc
b188fcdf27 Remove MSDOSFS_LARGE compile time option. It has been converted
to a run time "-o large" mount option.

PR:		105964
MFC after:	2 weeks
2007-01-30 05:01:06 +00:00
trhodes
810381390e Fix spacing from my previous commit to this file:
Noticed by:	fjoe
2007-01-30 04:41:38 +00:00
rodrigc
fdf518fe9a Add a "-o large" mount option for msdosfs. Convert compile-time checks for
#ifdef MSDOSFS_LARGE to run-time checks to see if "-o large" was specified.

Test case provided by Oliver Fromme:
  truncate -s 200G test.img
  mdconfig -a -t vnode -f test.img -u 9
  newfs_msdos -s 419430400 -n 1 /dev/md9 zip250
  mount -t msdosfs /dev/md9 /mnt    # should fail
  mount -t msdosfs -o large /dev/md9 /mnt   # should succeed

PR:		105964
Requested by:	Oliver Fromme <olli lurza secnetix de>
Tested by:	trhodes
MFC after:	2 weeks
2007-01-30 03:11:45 +00:00
kevlo
bbf865842d Use our own timer that piggybacks on npe_tick() callout instead of
if_watchdog/if_timer interface.

Approved by: sam, cognet
2007-01-30 01:18:29 +00:00
kris
365d5c4ba7 Instead of always hard-coding the socket type for the nfs root mount as
SOCK_DGRAM (i.e. UDP), respect the value configured earlier.  This allows
TCP NFS root mounts using e.g. the boot.nfsroot.options="tcp" tunable.

In this case some of the connection parameters like the retry timer were
previously set appropriately for TCP but inappropriately for the UDP
socket that was actually used, leading to e.g. extremely long recovery
times (O(hours)) after a nfs server reboot.

Reviewed by:    mohans
MFC After:      2 weeks
2007-01-30 00:26:04 +00:00
rwatson
c8cb2f0c11 Update comment for struct bpf_d: we now store buffered packets for BPF
in malloc'd storage, not in mbuf clusters.
2007-01-29 14:41:03 +00:00
pjd
cb51d8d011 We expect 'bio_data != NULL' for BIO_{READ,WRITE,GETATTR}, but for
BIO_{DELETE,FLUSH} we expect 'bio_data == NULL'.

Reviewed by:	phk
2007-01-28 23:36:07 +00:00
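Note (illustrative, not part of the commit): the expectation stated in cb51d8d011 above can be modelled as a small check. The command names mirror the BIO_* names from the message, but the function and sets are hypothetical Python, not the GEOM API.

```python
# Hypothetical model of the bio_data expectation; not kernel code.
NEEDS_DATA = {"BIO_READ", "BIO_WRITE", "BIO_GETATTR"}
NO_DATA = {"BIO_DELETE", "BIO_FLUSH"}


def bio_data_ok(cmd, data):
    """Return True if 'data' matches what the command is expected to carry."""
    if cmd in NEEDS_DATA:
        return data is not None
    if cmd in NO_DATA:
        return data is None
    raise ValueError("unknown command: %s" % cmd)
```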
joel
0fa9b5986e Clean up the BSD license to match the preferred license in
/usr/share/examples/etc/bsd-style-copyright.  I've fixed a
few minor wording and formatting differences.

Approved by:	matk, Hannu Savolainen <hannu@opensound.com>
Reviewed by:	imp
2007-01-28 20:38:07 +00:00
pjd
4e4fa80cab It is possible that GEOM tastes a provider before SMP is started.
We can't bind to a CPU which is not yet on-line, so add code that waits for
CPUs to go on-line before binding to them.

Reported by:	Alin-Adrian Anton <aanton@spintech.ro>
MFC after:	2 weeks
2007-01-28 20:29:12 +00:00
sam
1640741614 ath and ath_rate_sample now have a compile-time dependency on the hal
so we need to build them only on architectures where there's a released
hal; this hack can be eliminated when an ia64 hal build is present
2007-01-28 18:35:46 +00:00
rwatson
86d2a65642 As we now have an SFB_NOWAIT flag, change 'will' to 'may' where the
comment for sf_buf_alloc(9) talks about sleeping.
2007-01-28 17:39:03 +00:00
rwatson
14613c4dc7 Remove slightly dubious comment; add descriptive strings for several
sysctls.

MFC after:	3 days
2007-01-28 16:38:44 +00:00
takawata
7712a93b6b Add support for serial communication with Windows CE based Handheld Computer.
Obtained from:	NetBSD
2007-01-28 11:56:14 +00:00
takawata
432435a805 Add some vendor IDs mainly from NetBSD. 2007-01-28 10:46:32 +00:00
nyan
6603d04ea6 MFi386: revision 1.647.
exclude the icu and clock lock from LOCK_PROFILING
2007-01-28 07:19:14 +00:00
sam
7112c0ba66 for newer hal's we need opt_ah.h as it specifies how the hal has been
configured and that in turn controls the descriptor layout
2007-01-28 04:38:35 +00:00
sam
663b4cdc59 for newer hal's we need opt_ah.h as it specifies how the hal has been
configured and that in turn controls the descriptor layout; the rate
control module has no business peeking inside the descriptor but until
we can change the api so the driver records the tx rates and passes
them deal with it
2007-01-28 04:36:05 +00:00
ariff
8490eade81 Add speaker control for HP xw4300. This hardware doesn't respond to
unsolicited pin sense events and needs manual control to turn off speaker
volume while attaching headphones.

Tested by:		Ingeborg Hellemo <Ingeborg.Hellemo@cc.uit.no>

Disable global Acer + ALC883 headphone automute settings since there are a
few models that do not respect this, causing broken behaviour.

Reported/Tested by:	Pavel Argentov <argentoff@rtelekom.ru>
2007-01-28 03:16:54 +00:00
remko
3f83d16647 Add the SMART command to the ATA instruction set.
When the disk has an error, it will now print SMART
instead of 'Unknown CMD'.

PR:		kern/93368
Submitted by:	Garry Belka <garry at NetworkPhysics dot COM>
Approved by:	sos
2007-01-27 21:15:59 +00:00
mlaier
e3327eddd2 In case we are supplied with an imagename that matches a module, but not a
firmware in that module (even though this is a programming error) - drop the
reference to the module again.

Submitted by:	Benjamin Close
MFC after:	3 days
2007-01-27 19:52:08 +00:00
jkoshy
2664c129a9 Use a known good stack at the time of servicing an NMI --- reuse
the space allocated for the double fault handler since this space
is otherwise unused till the time a double fault occurs.

This change should have been committed alongside r1.127 of
"exception.S", but I somehow missed doing so.

Problem reported by:	jeff
Pointy hat to:		jkoshy
2007-01-27 18:13:24 +00:00
rwatson
708b428377 Remove BSD < 199103 compatibility entries in the bpf_d structure: they are
not used in any of our code.  Also remove explicit padding variable that
kept the bpf_d structure the same size before and after the change in
select implementation, since binary compatibility is not required for this
data structure on 7-CURRENT.
2007-01-27 18:12:50 +00:00
rwatson
ebd5cdbc2e Remove now unused bpf_compat.h. This compatibility file emulates malloc(9)
using the mbuf allocator.
2007-01-27 17:32:12 +00:00
ariff
039ccd88f6 Rearrange locking order to avoid LOR (cat /dev/midistat).
Reported by:	rodrigc
2007-01-27 15:55:59 +00:00
ariff
971f435178 Massive inlining cleanups/removal to make it survive on WARNS=2. 2007-01-27 13:30:19 +00:00
ariff
e055eda20e Reduce maximum DMA segments from 128 to 64. We don't need more than that. 2007-01-27 07:35:05 +00:00
ariff
5b6974c9c1 Total DMA segments should include total number of record channel(s). 2007-01-26 23:53:56 +00:00
bmah
c083ee91fa Revert nd6.c revs. 1.67, 1.68, 1.69, 1.70 in an attempt to unbreak
IPv6 over point-to-point gif(4) tunnels.

These revisions caused a host route to the destination of a
point-to-point gif(4) interface to not get installed when the interface
and destination addresses were assigned.  This caused
"no route to host" errors when trying to send traffic over the
interface.  The first packet arriving inbound over the tunnel,
however, would cause the correct route to get installed, allowing
subsequent outbound traffic to be routed correctly.

gif(4) interfaces with prefix lengths of less than 128 bits
(i.e. no explicit destination address assigned) were not affected
by this bug.

This bug fix is a possible candidate for a 6.2-RELEASE errata note.

Approved by:	jhay (original committer)
Discussed with:	jhay, JINMEI Tatuya
MFC after:	3 days
2007-01-26 23:22:58 +00:00
ariff
dcde6ce44f Fix forever broken ua_chan_setblocksize() uninitialized return value
which was causing a divide by zero panic in other places (notably chn_sync()).
2007-01-26 19:14:41 +00:00
ariff
796b51b951 Sync uaudio_sndstat_prepare_pcm() output with sndstat_prepare_pcm() to get
similar (debugging) output.
2007-01-26 19:06:17 +00:00
dwhite
7141aa5cc2 Add missing MIIBUS_MEDIAINIT() call. 2007-01-26 17:06:02 +00:00
dwhite
308276e932 Collapse 5706C and 5708C PHYs into one entry. ID 0x15 is actually used for
the SERDES PHY on these chips and we want gentbi to pick this up, not brgphy.
2007-01-26 17:05:24 +00:00
dwhite
3c7fc4d94c Add support for SERDES PHY configurations. These are commonly found in
blade systems, such as the Dell 1955 and the Intel SBXD132.

Development hardware for this work was provided by Broadcom and iXsystems.
A SBXD132 blade for testing was provided by Iron Systems.
2007-01-26 17:03:51 +00:00
delphij
9737abdf72 While we do not expect any change before and after GNU gzip
is replaced with BSD gzip, let's make it possible to
distinguish between the two with a __FreeBSDversion bump,
just in case some developers want it.

Suggested by:	linimon
2007-01-26 14:57:17 +00:00
marcel
db6667954e Remove stale header.
MFC after: 3 days
2007-01-26 04:58:31 +00:00
kevlo
40ff793d55 Fix comments.
Approved by: cognet
2007-01-26 01:37:32 +00:00
jeff
0f05ca9b5b - Implement much more intelligent ipi sending. This algorithm tries to
minimize IPIs and rescheduling when scheduling like tasks while keeping
   latency low for important threads.
   1) An idle thread is running.
   2) The current thread is worse than realtime and the new thread is
      better than realtime.  Realtime to realtime doesn't preempt.
   3) The new thread's priority is less than the threshold.
2007-01-25 23:51:59 +00:00
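Note (illustrative, not part of the commit): reading the three numbered cases in 0f05ca9b5b above as the conditions under which an IPI is sent (the message does not say so explicitly), the decision might be sketched as below. Lower numbers are taken to mean better priority, and both cut-off values are made-up placeholders, not the scheduler's real constants.

```python
# Hypothetical cut-offs for illustration only; lower value = better priority.
REALTIME_MAX = 100   # priorities <= this count as "realtime or better"
IPI_THRESH = 50      # priorities below this always justify an IPI


def should_send_ipi(cur_is_idle, cur_pri, new_pri):
    """Decide whether the remote CPU should be interrupted for a new thread."""
    # 1) An idle thread is running.
    if cur_is_idle:
        return True
    # 2) Current thread is worse than realtime, new thread is better.
    #    Realtime to realtime does not preempt.
    if cur_pri > REALTIME_MAX and new_pri <= REALTIME_MAX:
        return True
    # 3) The new thread's priority is less than the threshold.
    if new_pri < IPI_THRESH:
        return True
    return False
```

With these placeholder values, a realtime thread arriving on a CPU that already runs a realtime thread does not trigger an IPI unless it is below the threshold.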
glebius
0688eeab06 - Create ng_ppp_bypass() function, that prepares a packet
with bypass header, to send it out to userland.
- Use ng_ppp_bypass() in ng_ppp_proto_recv().
- Use ng_ppp_bypass() in ng_ppp_comp_recv() and in
  ng_ppp_crypt_recv() if compression or encryption is
  disabled, respectively.
- Any LCP packet goes directly to ng_ppp_bypass(), instead
  of passing through PPP stack.
- Any non-LCP packet on disabled link is discarded. This
  is behavior defined in RFC.

Submitted by:	Alexander Motin <mav alkar.net>
2007-01-25 21:16:50 +00:00
jeff
94085f7612 - Get rid of the unused DIDRUN flag. This was really only present to
support sched_4bsd.
 - Rename the KTR level for non schedgraph parsed events.  They take event
   space from things we'd like to graph.
 - Reset our slice value after we sleep.  The slice is simply there to
   prevent starvation among equal priorities.  A thread which had almost
   exhausted its slice and then slept doesn't need to be rescheduled a
   tick after it wakes up.
 - Set the maximum slice value to a more conservative 100ms now that it is
   more accurately enforced.
2007-01-25 19:14:11 +00:00
glebius
34079a8c02 Make it possible that carpdetach() unlocks on return. Then, in
carp_clone_destroy() we are on the safe side, we don't need to
unlock the cif, which may already be non-existent at this point.

Reported by:	Anton Yuzhaninov <citrin rambler-co.ru>
2007-01-25 18:03:40 +00:00
mjacob
94549ac283 Whoops- #ifdef problem caused uninitialized transport. Not horribly
a problem, but caused annoying messages.
2007-01-25 18:02:23 +00:00
glebius
b7c8a97d9b Spacing. 2007-01-25 17:58:16 +00:00
wpaul
901783bd82 The TCP checksum offload handling in the 8111B/8168B and 8101E PCIe can
apparently be confused by short TCP segments that have been manually
padded to the minimum ethernet frame size. The driver does short frame
padding in software as a workaround for a bug in the 8169 PCI devices
that causes short IP fragments to be corrupted due to an apparent
conflict between the hardware autopadding and hardware IP checksumming.

To fix this, we avoid software padding for short TCP segments, since
the hardware seems to autopad and checksum these correctly (even the
older 8169 NICs get these right). Short UDP packets appear to be
handled correctly in all cases. This should work around the IP header
checksum bug in the 8169 while not tripping the TCP checksum bug in
the 8111B/8168B and 8101E.
2007-01-25 17:30:30 +00:00
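Note (illustrative, not part of the commit): the padding rule described in 901783bd82 above reduces to a simple predicate. The 60-byte minimum (Ethernet minimum frame size without the FCS) and the function name are assumptions for this sketch, not taken from the driver.

```python
# Assumption: minimum Ethernet frame length excluding the 4-byte FCS.
MIN_FRAME_LEN = 60


def needs_software_padding(frame_len, is_tcp):
    """Pad short frames in software, except short TCP segments,
    which are left for the hardware to autopad and checksum."""
    return frame_len < MIN_FRAME_LEN and not is_tcp
```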
bde
abf353c8bd Rename some functions and variables from nfs_* to nfs4_* to avoid
collisions with nfsclient's names.  Even static names should have a
unique prefix so that they can be debugged easily.

Hide the unused colliding variable nfsv3_commit_on_close in "#if 0"
together with other unused sysctl variables.  Duplicating the nfs sysctl
under nfs4 is probably just a bug.

Fix some nearby style bugs.

Remove duplicate $FreeBSD$.
2007-01-25 14:33:13 +00:00
bde
2bc6f7d071 Rename some functions and variables (mainly vfsops entry points) from
nfs_* to nfs4_* to avoid collisions with nfsclient's names.   Even
static names should have a unique prefix so that they can be debugged
easily.

Most of the renamed functions can probably be shared.  nfs4_cmount()
and nfs4_sync() are identical to the nfs_* versions, and all the others
except nfs4_vfsops() seem to be identical except for style bugs,
missing support for mountroot, and bugs.

Fix some nearby style bugs.

Remove duplicate $FreeBSD$.
2007-01-25 14:18:40 +00:00
bde
5a952c0766 Unstaticize nfs_iosize() in nfsclient and use it in nfs4client instead
of duplicating it except for larger style bugs in the copy.

Fix some nearby style bugs (including a harmless type mismatch)
in and near the remaining copy.

This is part of fixing collisions of the 2 nfs*client's names.  Even
static names should have a unique prefix so that they can be debugged
easily.
2007-01-25 13:07:25 +00:00
mohans
83064ec323 Fix for problems that occur when all mbuf clusters migrate to the mbuf packet
zone. Cluster allocations fail when this happens. Also processes that may have
blocked on cluster allocations will never be woken up. Thanks to rwatson for
an overview of the issue and pointers to the mbuma paper and his tool to dump
out UMA zones.

Reviewed by: andre@
2007-01-25 01:05:23 +00:00
mohans
9799fcf93c Fix for a bug where only one process (of multiple) blocked on
maxpages on a zone is woken up, with the rest never being woken up as
a result of the ZFLAG_FULL flag being cleared. Wakeup all such blocked
processes instead. This change introduces a thundering herd, but since
this should be relatively infrequent, optimizing this (by introducing
a count of blocked processes, for example) may be premature.

Reviewed by: ups@
2007-01-24 22:49:11 +00:00
jeff
743ea48fbc - With a sleep time over 2097 seconds hzticks and slptime could end up
negative.  Use unsigned integers for sleep and run time so this doesn't
   disturb sched_interact_score().  This should fix the invalid interactive
   priority panics reported by several users.
2007-01-24 18:18:43 +00:00
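Note (illustrative, not part of the commit): the 2097 second limit in 743ea48fbc above is consistent with a 32-bit signed value holding ticks scaled by a 10-bit shift at hz = 1000. Both the shift and the hz value are assumptions for this sketch; the commit message only gives the limit.

```python
HZ = 1000         # assumed clock rate (ticks per second)
TICK_SHIFT = 10   # assumed fixed-point scaling applied to tick counts


def scaled_ticks(seconds):
    """Sleep time in scaled ticks, as an unbounded Python integer."""
    return (seconds * HZ) << TICK_SHIFT


def to_int32(value):
    """Interpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def to_uint32(value):
    """Interpret the low 32 bits of value as an unsigned integer."""
    return value & 0xFFFFFFFF
```

Under these assumptions a 2097 second sleep still fits in a signed 32-bit value, a 2098 second sleep wraps negative, and the unsigned interpretation stays positive.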
rrs
ba4b733a7c Fixes the MSG_PEEK for sctp_generic_recvmsg() the msg_flags
were not being copied in properly so PEEK and any other
msg_flags input operation were not being performed right.
Approved by:	gnn
2007-01-24 12:59:56 +00:00
bruno
dcb4be2750 o introduce an 'errata' flags field for HW bugs in the softc.
o remove errata_a0 and introduce the corresponding flag in 'errata'.
o introduce a new erratum for K8: some platforms might set the
  PENDING_BIT but aren't able to unset it, also don't loop forever
  waiting for PENDING_BIT to be cleared.
o try to introduce a workaround for the PENDING_BIT stuck problem.
o support half multipliers for K8.

Tested by:	Abdullah Al-Marrie

Approved by:	njl
2007-01-23 19:20:30 +00:00
imp
941d9aa2cc Use the more specific 'EM732X' designation rather than * to disable sync
cache commands, per request from njl@.
2007-01-23 17:29:31 +00:00
kib
fdd50404d1 Cylinder group bitmaps and blocks containing inode for a snapshot
file are after snaplock, while other ffs device buffers are before
snaplock in global lock order. By itself, this could cause deadlock
when bdwrite() tries to flush dirty buffers on snapshotted ffs. If,
during the flush, COW activity for snapshot needs to allocate block
and ffs_alloccg() selects the cylinder group that is being written
by bdwrite(), then kernel would panic due to recursive buffer lock
acquisition.

Avoid dealing with buffers in bdwrite() that are from other side of
snaplock divisor in the lock order than the buffer being written. Add
new BOP, bop_bdwrite(), to do dirty buffer flushing for same vnode in
the bdwrite(). Default implementation, bufbdflush(), refactors the code
from bdwrite(). For ffs device buffers, specialized implementation is
used.

Reviewed by:	tegge, jeff, Russell Cattelan (cattelan xfs org, xfs changes)
Tested by:	Peter Holm
X-MFC after:	3 weeks (if ever: it changes ABI)
2007-01-23 10:01:19 +00:00
jeff
8fd8265087 - Catch up to setrunqueue/choosethread/etc. api changes.
- Define our own maybe_preempt() as sched_preempt().  We want to be able
   to preempt idlethread in all cases.
 - Define our idlethread to require preemption to exit.
 - Get the cpu estimation tick from sched_tick() so we don't have to worry
   about errors from a sampling interval that differs from the time
   domain.  This was the source of sched_priority prints/panics and
   inaccurate pctcpu display in top.
2007-01-23 08:50:34 +00:00
bde
3f550f19b8 Oops, pc98 is independent of i386 for clock.c and machdep.c but not
for clock.h, so changing the i386 clock.h broke it.  MFi386 (not tested):

Cleaned up declaration and initialization of clock_lock.  It is only
used by clock code, so don't export it to the world for machdep.c to
initialize.  There is a minor problem initializing it before it is
used, since although clock initialization is split up so that parts
of it can be done early, the first part was never done early enough
to actually work.  Split it up a bit more and do the first part as
late as possible to document the necessary order.  The functions that
implement the split are still bogusly exported.

Cleaned up initialization of the i8254 clock hardware using the new
split.  Actually initialize it early enough, and don't work around it
not being initialized in DELAY() when DELAY() is called early for
initialization of some console drivers.

This unfortunately moves a little more code before the early debugger
breakpoint so that it is harder to debug.  The ordering of console and
related initialization is delicate because we want to do as little as
possible before the breakpoint, but must initialize a console.
2007-01-23 08:48:26 +00:00
jeff
474b917526 - Remove setrunqueue and replace it with direct calls to sched_add().
setrunqueue() was mostly empty.  The few asserts and thread state
   setting were moved to the individual schedulers.  sched_add() was
   chosen to displace it for naming consistency reasons.
 - Remove adjustrunqueue, it was 4 lines of code that was ifdef'd to be
   different on all three schedulers where it was only called in one place
   each.
 - Remove the long ifdef'd out remrunqueue code.
 - Remove the now redundant ts_state.  Inspect the thread state directly.
 - Don't set TSF_* flags from kern_switch.c, we were only doing this to
   support a feature in one scheduler.
 - Change sched_choose() to return a thread rather than a td_sched.  Also,
   rely on the schedulers to return the idlethread.  This simplifies the
   logic in choosethread().  Aside from the run queue links kern_switch.c
   mostly does not care about the contents of td_sched.

Discussed with:	julian

 - Move the idle thread loop into the per scheduler area.  ULE wants to
   do something different from the other schedulers.

Suggested by:	jhb

Tested on:	x86/amd64 sched_{4BSD, ULE, CORE}.
2007-01-23 08:46:51 +00:00
jeff
f53a7830f7 - Allow the schedulers to IPI_PREEMPT idlethread. This puts the decision
for this behavior on the initiator side.
2007-01-23 08:38:39 +00:00
bde
b12ed0640c Cleaned up declaration and initialization of clock_lock. It is only
used by clock code, so don't export it to the world for machdep.c to
initialize.  There is a minor problem initializing it before it is
used, since although clock initialization is split up so that parts
of it can be done early, the first part was never done early enough
to actually work.  Split it up a bit more and do the first part as
late as possible to document the necessary order.  The functions that
implement the split are still bogusly exported.

Cleaned up initialization of the i8254 clock hardware using the new
split.  Actually initialize it early enough, and don't work around it
not being initialized in DELAY() when DELAY() is called early for
initialization of some console drivers.

This unfortunately moves a little more code before the early debugger
breakpoint so that it is harder to debug.  The ordering of console and
related initialization is delicate because we want to do as little as
possible before the breakpoint, but must initialize a console.
2007-01-23 08:01:20 +00:00
njl
ae205e3573 Add missing function trace for debug prints. 2007-01-23 07:20:44 +00:00
rodrigc
4c351de443 When exiting vfs_export(), delete the "export" option from
the mount options list with vfs_deleteopt().  At this point, the export
information is saved in mp->mnt_export, so we can delete
the "export" mount option from mp->mnt_optnew and mp->mnt_opt.

This fixes read-write/read-only update mounts (mount -u -o rw, mount -u -o ro)
of NFS exported directories.

For some reason, I could only reproduce the problem with a configuration
supplied by Andre:
- "options QUOTA" enabled in kernel config
- "/ -maproot=root 10.0.1.105" in /etc/exports

Reported by:	kris, Andre Guibert de Bruet <andy siliconlandmark com>,
            	Andrzej Tobola <ato iem pw edu pl>
Tested by:	Andre Guibert de Bruet
2007-01-23 06:19:16 +00:00
scottl
2685a7063b Remove a PCI ID entry that conflicts with the AMR driver. 2007-01-23 02:47:33 +00:00
yongari
b4c0dd68e0 It seems that enabling Tx and Rx before setting descriptor DMA
addresses makes PCIe hardware access invalid descriptor DMA addresses,
which then panics the system.
To fix it set descriptor DMA addresses before enabling Tx and Rx
such that hardware can see valid descriptor DMA addresses. Also
set RL_EARLY_TX_THRESH before starting Tx and Rx.

Reported by:	steve.tell AT crashmail DOT de
Tested by:	steve.tell AT crashmail DOT de
Obtained from:	NetBSD
MFC after:	1 week
2007-01-23 00:44:12 +00:00
mjacob
2df6044b61 Clean up some of the various platform and release specific dma tag
stuff so it is centralized in isp_freebsd.h.

Take out PCI posting flushed in qla2100/2200 register reads except for
2100s.
2007-01-23 00:02:29 +00:00
jhb
3624354c54 Expand the MSI/MSI-X API to address some deficiencies in the MSI-X support.
- First off, device drivers really do need to know if they are allocating
  MSI or MSI-X messages.  MSI requires allocating powerof2() messages for
  example where MSI-X does not.  To address this, split out the MSI-X
  support from pci_msi_count() and pci_alloc_msi() into new driver-visible
  functions pci_msix_count() and pci_alloc_msix().  As a result,
  pci_msi_count() now just returns a count of the max supported MSI
  messages for the device, and pci_alloc_msi() only tries to allocate MSI
  messages.  To get a count of the max supported MSI-X messages, use
  pci_msix_count().  To allocate MSI-X messages, use pci_alloc_msix().
  pci_release_msi() still handles both MSI and MSI-X messages, however.
  As a result of this change, drivers using the existing API will only
  use MSI messages and will no longer try to use MSI-X messages.
- Because MSI-X allows for each message to have its own data and address
  values (and thus does not require all of the messages to have their
  MD vectors allocated as a group), some devices allow for "sparse" use
  of MSI-X message slots.  For example, if a device supports 8 messages
  but the OS is only able to allocate 2 messages, the device may make the
  best use of 2 IRQs if it enables the messages at slots 1 and 4 rather
  than the default of using the first N slots (or indices) at 1 and 2.  To
  support this, add a new pci_remap_msix() function that a driver may call
  after a successful pci_alloc_msix() (but before allocating any of the
  SYS_RES_IRQ resources) to allow the allocated IRQ resources to be
  assigned to different message indices.  For example, from the earlier
  example, after pci_alloc_msix() returned a value of 2, the driver would
  call pci_remap_msix() passing in array of integers { 1, 4 } as the
  new message indices to use.  The rid's for the SYS_RES_IRQ resources
  will always match the message indices.  Thus, after the call to
  pci_remap_msix() the driver would be able to access the first message
  in slot 1 at SYS_RES_IRQ rid 1, and the second message at slot 4 at
  SYS_RES_IRQ rid 4.  Note that the message slots/indices are 1-based
  rather than 0-based so that they will always correspond to the rid
  values (SYS_RES_IRQ rid 0 is reserved for the legacy INTx interrupt).
  To support this API, a new PCIB_REMAP_MSIX() method was added to the
  pcib interface to change the message index for a single IRQ.

Tested by:	scottl
2007-01-22 21:48:44 +00:00
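Note (illustrative, not part of the commit): the remapping example in 3624354c54 above (two allocated messages moved to slots 1 and 4) can be modelled with a plain dictionary from rid to IRQ. The function names and IRQ numbers below are hypothetical; this is a toy model of the described behaviour, not the pci(4) driver API.

```python
def is_power_of_2(n):
    """MSI, unlike MSI-X, requires a power-of-two message count."""
    return n > 0 and (n & (n - 1)) == 0


def alloc_msix_model(irqs):
    """Default layout: the first N slots, 1-based, so rid == slot index."""
    return {slot: irq for slot, irq in enumerate(irqs, start=1)}


def remap_msix_model(rid_to_irq, new_indices):
    """Reassign the already allocated IRQs to the given 1-based slots."""
    irqs = [rid_to_irq[rid] for rid in sorted(rid_to_irq)]
    if len(new_indices) != len(irqs):
        raise ValueError("one index is needed per allocated message")
    if any(i < 1 for i in new_indices) or len(set(new_indices)) != len(new_indices):
        raise ValueError("indices must be unique and 1-based")
    return dict(zip(new_indices, irqs))
```

After remapping, the rid used for each SYS_RES_IRQ resource in the model equals the message slot, matching the rule stated in the message.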
andre
4a22f82e6c Unbreak writes of 0 bytes. Zero byte writes happen when only ancillary
control data but no payload data is passed.

Change m_uiotombuf() to return at least one empty mbuf if the requested
length was zero.  Add comment to sosend_dgram and sosend_generic().

Diagnosed by:		jhb
Regression test by:	rwatson
Pointy hat to:		andre
2007-01-22 14:50:28 +00:00
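Note (illustrative, not part of the commit): the behaviour change in 4a22f82e6c above, returning at least one empty buffer for a zero-length request, can be sketched with byte strings standing in for mbufs. The function name and the chunk size are hypothetical.

```python
def to_chunks(data, chunk_size=256):
    """Split data into fixed-size chunks; a zero-length input still
    yields one empty chunk so the caller always has something to send."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    return chunks or [b""]
```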
kib
79752b63e1 Below is slightly edited description of the LOR by Tor Egge:
--------------------------
[Deadlock] is caused by a lock order reversal in vfs_lookup(), where
[some] process is trying to lock a directory vnode (that is the parent
directory of covered vnode) while holding an exclusive vnode lock on
covering vnode.

A simplified scenario:

root fs					var fs
/    		A			/    (/var)	D
/var		B			/log (/var/log) E
vfs lock	C			vfs lock	F

Within each file system, the lock order is clear: C->A->B and F->D->E

When traversing across mounts, the system can choose between two lock orders,
but everything must then follow that lock order:

      L1: C->A->B
		|
	        +->F->D->E

      L2: F->D->E
	     |
             +->C->A->B

The lookup() process for namei("/var") mixes those two lock orders:

    VOP_LOOKUP() obtains B while A is held
    vfs_busy() obtains a shared lock on F while A and B are held (follows L1,
    violates L2)
    vput() releases lock on B
    VOP_UNLOCK() releases lock on A
    VFS_ROOT() obtains lock on D while shared lock on F is held
    vfs_unbusy() releases shared lock on F
    vn_lock() obtains lock on A while D is held (violates L1, follows L2)

dounmount() follows L1 (B is locked while F is drained).

Without unmount activity, vfs_busy() will always succeed without blocking
and the deadlock isn't triggered (the system behaves as if L2 is followed).

With unmount, you can get 4 processes in a deadlock:

     p1: holds D, want A (in lookup())
     p2: holds shared lock on F, want D (in VFS_ROOT())
     p3: holds B, want drain lock on F (in dounmount())
     p4: holds A, want B (in VOP_LOOKUP())

You can have more than one instance of p2.

The reversal was introduced in revision 1.81 of src/sys/kern/vfs_lookup.c and
MFCed to revision 1.80.2.1, probably to avoid a cascade of vnode locks when nfs
servers are dead (VFS_ROOT() just hangs) spreading to the root fs root vnode.

- Tor Egge

To fix the LOR, ups@ noted that when crossing the mount point, ni_dvp
is actually not used by the callers of namei. Thus, placeholder deadfs
vnode vp_crossmp is introduced that is filled into ni_dvp.

Idea by:	ups
Reviewed by:	tegge, ups, jeff, rwatson (mac interaction)
Tested by:	Peter Holm
MFC after:	2 weeks
2007-01-22 11:25:22 +00:00
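Note (illustrative, not part of the commit): the four-process deadlock listed in 79752b63e1 above forms a cycle in the wait-for graph. The sketch below encodes the holds/wants table from the message and walks it. It simplifies the shared lock on F to a single holder, which is an assumption of this model.

```python
# Lock names and processes follow the p1..p4 list in the commit message.
HOLDS = {"p1": "D", "p2": "F", "p3": "B", "p4": "A"}
WANTS = {"p1": "A", "p2": "D", "p3": "F", "p4": "B"}


def find_cycle(holds, wants):
    """Return a list of processes forming a wait-for cycle, or None."""
    owner = {lock: proc for proc, lock in holds.items()}
    for start in holds:
        seen = []
        proc = start
        # Follow "proc waits for the owner of the lock it wants".
        while proc is not None and proc not in seen:
            seen.append(proc)
            proc = owner.get(wants.get(proc))
        if proc == start:
            return seen
    return None
```

Removing any one edge (for example, p4 no longer waiting for B) breaks the cycle in this model.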
imp
bbae4f9949 Add quirk for EasyMP3 EM732X usb 2.0 flash mp3 player.
(It appears that the quirk procedures link has disappeared and that
this PR complied with it, if there's a problem, please contact me).

PR: usb/96546
2007-01-22 04:34:03 +00:00
marius
95a9b2142a Change the remainder of the drivers for DMA'ing devices enabled in the
sparc64 GENERIC and the sound device drivers known working on sparc64
to use bus_get_dma_tag() to obtain the parent DMA tag so we can get rid
of the sparc64_root_dma_tag kludge eventually. Except for ath(4), sk(4),
stge(4) and ti(4) these changes are runtime tested (unless I booted up
the wrong kernels again...).
2007-01-21 19:32:51 +00:00
marius
32ccb0b969 Correct a logic bug in the previous change. 2007-01-21 19:28:00 +00:00
netchild
1542e0642c Use a printf-modifier which doesn't need a cast.
Submitted by:	scottl
2007-01-21 13:18:52 +00:00
jeff
5fd995e14a - Disable the long-term load balancer. I believe that steal_busy works
better and gives more predictable results.
2007-01-20 21:24:05 +00:00
netchild
023c3ce346 Fix tinderbox build on amd64. 2007-01-20 19:32:23 +00:00
marius
c8d049b911 Quiet GCC4 warnings regarding the width of printf()-arguments not
matching the format. While at it limit the format to unsigned int as
we're only interested in the 11 least significant bits anyway.
2007-01-20 17:14:12 +00:00
scottl
c05aa6bb3f The multicast hash table has 8 slots in the BCE hardware, not 4 slots like
the BGE hardware.  Adapt the driver for this.

Submitted by: Mike Karels
MFC After: 3 days
2007-01-20 17:05:12 +00:00
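Note (illustrative, not part of the commit): the slot count in c05aa6bb3f above determines how many hash bits are usable. The sketch assumes each slot is a 32-bit register and that the hash selects a slot with its upper bits and a bit within the slot with its lower five bits; neither detail is stated in the commit message.

```python
def set_hash_bit(table, hash_value):
    """Set the filter bit for hash_value in a table of 32-bit slots."""
    nbits = len(table) * 32          # 4 slots -> 128 bits, 8 slots -> 256 bits
    h = hash_value % nbits           # keep only as many bits as the table has
    table[h >> 5] |= 1 << (h & 0x1F)
    return table
```

With the same hash value, a 4-slot table and an 8-slot table can pick different slots, which is why code written for one layout does not fit the other.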
jeff
a1996060b3 - We do need to IPI the idlethread on some systems. It may be stuck in
a power saving mode otherwise.
 - If the thread is already bound in sched_bind() unbind it before
   re-binding it to a new cpu.  I don't like these semantics but they are
   expected by some code in the tree.  Patch by jkoshy.
2007-01-20 17:03:33 +00:00
netchild
42392e7a0b MFp4 (113077, 113083, 113103, 113124, 113097):
Don't expose em->shared to the outside world before it's properly
	initialized. Might not affect anything but it's at least a better
	coding style.

	Don't expose em via p->p_emuldata until it's properly initialized.
	This also enables us to get rid of some locking and simplify the
	code because we are working on a local copy.

	In linux_fork and linux_vfork create the process in stopped state
	to be sure that the new process runs with fully initialized emuldata
	structure [1]. Also fix the vfork (both in linux_clone and linux_vfork)
	race that could result in never woken up process [2].

Reported by:	Scot Hetzel	[1]
Suggested by:	jhb		[2]
Reviewed by:	jhb (at least some important parts)
Submitted by:	rdivacky
Tested by:	Scot Hetzel (on amd64)

Change 2 comments (in the new code) to comply to style(9).

Suggested by:	jhb
2007-01-20 14:58:59 +00:00
marius
de8f010827 Add macros for the individual divisor bits as some MC146818A-compatible
chips also use them for different purposes.
2007-01-20 14:57:51 +00:00
marius
da9eaf073e Remove BUS_DMA_WAITOK from bus_dma_tag_create() invocations as it's
no valid flag there.
2007-01-20 14:19:29 +00:00
marius
46318caabd - Use bus_get_dma_tag() to obtain the parent DMA tag so dma(4) will
work when we start requiring this.
- Don't specify an alignment when creating our own parent DMA tag;
  the supported DMA engines require no alignment constraint (f.e. the
  LANCE child does though) and it's not inherited by the child DMA
  tags anyway (which probably is a bug though).
- Fix whitespace nits.
2007-01-20 14:06:01 +00:00
delphij
49f7e5db02 Fix build. chkdquot() should not return anything. 2007-01-20 13:54:28 +00:00
marius
a87efa794e Add front-ends for the 'lebuffer' variants found on some SBus cards.
These are shared-memory variants based on Am79C90-compatible chips
that apart from the missing DMA engine are similar to the 'ledma'
variant including using a (pseudo-)bus/device for the buffer that
the actual LANCE device hangs off from. The performance of these is
close to that of the 'ledma' one, like expected at a few times the
CPU load though.
2007-01-20 12:53:30 +00:00
mpp
0f6ed07b89 Quota system cleanup.
1) Do not do quota accounting for the actual quota data files
   or for file system snapshot files ("system" files).  This
   prevents a deadlock described in PR kern/30958 if the kernel
   ever has to grow the quota file.  Snapshot files were already
   exempt from the quota checks, but this change generalized the check.
2) Fix a cast that caused extremely large uids/gids to incorrectly
   write the quota information to the data file at a truncated
   value for a uint32_t id value.  The incorrect cast caused quota
   files in this case to be around 4GB in size, with the correct cast
   they can now be 131GB in size.  Also related to PR kern/30958.
3) Check for what appear to be negative UIDs/GIDs and not account
   for them.  This prevents the quota files from becoming 131GB in
   size and causing quotacheck to run forever at bootup.  This could
   also cause the kernel to try and expand the quota file, which might
   deadlock due to the issue in #1.  kern/30958 and kern/38156
   (and some much older closed PR's).
4) With the deadlock problems gone, the kernel can now expand the
   size of the quota database files if it needs to.
5) Pass in the i-node count change value to chkiq and chkiqchg as an
   int, like it used to be before the common routine was split up
   into 2 different routines to increase / decrease the i-node in-use
   count.  Prevents an underflow on the i-node count.  Related
   to PR kern/89247.
6) Prevent the block usage from growing slowly if a file system is
   full and the write was denied due to that fact.  PR kern/89247.

Some of these changes require an updated quotacheck to prevent
the creation of huge (131GB) quota data files (item #3).

#1/#4 probably fixes a lot of the random hangs when quotas are enabled,
possibly some of the jail hangs.
2007-01-20 11:58:32 +00:00
netchild
d1c3c94c60 Ooops, fix the ratelimit. 2007-01-20 11:31:14 +00:00
netchild
22ce9c2574 Convert a KASSERT into a runtime warning (rate limited) + failsafe fallback.
Because of a stupid bug (also fixed with this commit) the KASSERT was
triggered when running the linux top.

Pointy hat to:	netchild
2007-01-20 11:07:41 +00:00
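
The pattern named in this message (replace an assertion with a rate-limited warning plus a safe fallback) might look roughly like the user-space sketch below. All names are hypothetical, and the one-second window with a fixed message budget is an assumption made for illustration; the commit does not say how the limit is implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical limiter state: at most `max_per_sec` messages per second. */
struct ratelimit {
	time_t	window;		/* second in which `count` was accumulated */
	int	count;		/* messages emitted in that second */
	int	max_per_sec;
};

/* Returns 1 if a message may be emitted at time `now`, 0 if suppressed. */
static int
ratelimit_ok(struct ratelimit *rl, time_t now)
{
	assert(rl != NULL && rl->max_per_sec > 0);
	if (now != rl->window) {
		rl->window = now;
		rl->count = 0;
	}
	if (rl->count >= rl->max_per_sec)
		return (0);
	rl->count++;
	return (1);
}

/* Instead of asserting, warn (rate limited) and fall back to a safe value. */
static int
checked_index(struct ratelimit *rl, int idx, int limit, time_t now)
{
	if (idx >= 0 && idx < limit)
		return (idx);
	if (ratelimit_ok(rl, now))
		fprintf(stderr, "warning: index %d out of range\n", idx);
	return (0);	/* failsafe fallback */
}
```

The fallback value keeps the caller running where the assertion would have stopped it, while the limiter keeps a repeatedly triggered condition from flooding the log.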
marius
a31f50a3fb For setting the port PCnet chips must be powered down or stopped and
unlike documented may not take effect without an initialization. So
don't invoke (*sc_mediachange) directly in lance_mediachange() but
go through lance_init_locked(). It's suboptimal to impose this for
all chips but given that besides the affected PCI bus front-end the
only other front-end which supports media selection is and likely
ever will be the 'ledma' front-end I see not enough reason to break
the in-driver API for this (though one could argue both ways here).
2007-01-20 10:47:16 +00:00
marius
613c7e2883 Use bus_get_dma_tag() to obtain the parent DMA tag so le(4) works on
platforms requiring this.
2007-01-20 09:57:09 +00:00
jeff
3f693f3417 - In tdq_transfer() always set NEEDRESCHED when necessary regardless of
the ipi settings.  If NEEDRESCHED is set and an ipi is later delivered
   it will clear it rather than cause extra context switches.  However, if
   we miss setting it we can have terrible latency.
 - In sched_bind() correctly implement bind.  Also be slightly more
   tolerant of code which calls bind multiple times.  However, we don't
   change binding if another call is made with a different cpu.  This
   does not presently work with hwpmc which I believe should be changed.
2007-01-20 09:03:43 +00:00
mjacob
da2ef49ea3 Grumble- let a linux-ism slip in and had an llx which
then choked on 64-bit platforms. Oops.
2007-01-20 07:38:31 +00:00
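
As a hedged illustration of the portability issue: `%llx` expects an `unsigned long long`, which is not necessarily the type behind `uint64_t` on every platform. The sketch below uses the standard `PRIx64` macro; the function name `format_u64` is made up for this example and the commit does not say which value was being printed.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit value portably; returns the number of characters written. */
static int
format_u64(char *buf, size_t len, uint64_t val)
{
	assert(buf != NULL && len > 0);
	/* PRIx64 expands to the right length modifier for uint64_t. */
	return (snprintf(buf, len, "0x%016" PRIx64, val));
}
```

Another common approach is to cast the argument to `uintmax_t` and print it with `%jx`, which avoids the macro at the cost of a cast at each call site.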
mjacob
31cdd06b7a MFP4: Move default setting to the end of isp_reset instead of the
front of isp_init so we can read NVRAM even if we're role ISP_NONE.
Prepare for reintroduction of channels (for FC) for N-Port
Virtualization.

Fix a botch in handle assignment that caused us to nuke one device
when a new one arrives and end up with two devices with the same
identity in the virtual target mapping table.
2007-01-20 04:00:21 +00:00
marius
536c29257b - In miibus_attach() remove IFM_IMASK from the dontcare_mask of the
ifmedia_init() invocation. IFM_IMASK makes only sense here when all of
  the maximum of 32 PHYs on each one MII bus support disjoint sets of media,
  which generally isn't the case (though it would be nice if we had a way
  to let NIC drivers indicate that for the few card models where the PHY
  configuration is known/fixed and IFM_IMASK actually makes sense).
- Add and use a miibus_print_child() for the bus_print_child method which
  additionally prints the PHY number (which actually is the PHY address)
  so one can figure out the media instance <-> PHY number mapping from the
  PHY driver attach output. This is intended to be useful in situations
  where the addresses of the PHYs on the bus are known (f.e. of internal/
  integrated PHYs) so one can feed the appropriate media instance number
  to ifconfig(8) (with the upcoming change for ifconfig(8)).
  This is more or less inspired by the NetBSD mii_print().
2007-01-20 00:55:03 +00:00
marius
8b9041b642 - Don't set MIIF_NOISOLATE so ukphy(4) can be used in configurations with
multiple PHYs. In case some PHYs currently driven by ukphy(4) exhibit
  problems when isolating due to incomplete implementations or silicon bugs
  we'll need to add specific drivers for these. Looking at NetBSD and
  OpenBSD I don't expect problems here though (quite the contrary; we still
  seem to set MIIF_NOISOLATE without good reason in a bunch of PHY drivers).
- Fix a style(9) whitespace nit.
2007-01-20 00:52:29 +00:00
jhb
a4f70979ed - Change the PCI-X registers constants to be relative to the PCI-X PCI
capability rather than hardcoded offsets for a particular card.  While
  I'm here, expand the constants some.
- Change the ahd(4) driver to use pci_find_extcap() to locate the PCI-X
  capability to keep up with the first change.

Reviewed by:	scottl, gibbs (earlier version)
2007-01-19 22:37:52 +00:00
jeff
a5cccc05cb Major revamp of ULE's cpu load balancing:
- Switch back to direct modification of remote CPU run queues.  This added
   a lot of complexity with questionable gain.  It's easy enough to
   reimplement if it's shown to help on huge machines.
 - Re-implement the old tdq_transfer() call as tdq_pickidle().  Change
   sched_add() so we have selectable cpu choosers and simplify the logic
   a bit here.
 - Implement tdq_pickpri() as the new default cpu chooser.  This algorithm
   is similar to Solaris in that it tries to always run the threads with
   the best priorities.  It is actually slightly more complex than
   solaris's algorithm because we also tend to favor the local cpu over
   other cpus which has a boost in latency but also potentially enables
   cache sharing between the waking thread and the woken thread.
 - Add a bunch of tunables that can be used to measure effects of different
   load balancing strategies.  Most of these will go away once the
   algorithm is more definite.
 - Add a new mechanism to steal threads from busy cpus when we idle.  This
   is enabled with kern.sched.steal_busy and kern.sched.busy_thresh.  The
   threshold is the required length of a tdq's run queue before another
   cpu will be able to steal runnable threads.  This prevents most queue
   imbalances that contribute to long latencies.
2007-01-19 21:56:08 +00:00
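
The busy-threshold rule from the last item can be sketched as follows. This is a simplified user-space model, not the ULE code: run queues are reduced to a plain array of lengths, the function name is invented, and picking the longest eligible queue is an assumption, since the message only states the minimum length a queue needs before it can be stolen from.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the busiest cpu whose run queue length is at least
 * `busy_thresh`, or -1 if no queue is long enough to steal from.
 * `self` is the idle cpu doing the stealing and is never chosen.
 */
static int
pick_steal_victim(const int *load, int ncpu, int self, int busy_thresh)
{
	int best = -1;

	assert(load != NULL && ncpu > 0);
	for (int cpu = 0; cpu < ncpu; cpu++) {
		if (cpu == self || load[cpu] < busy_thresh)
			continue;
		if (best == -1 || load[cpu] > load[best])
			best = cpu;
	}
	return (best);
}
```

Raising the threshold makes stealing rarer and leaves short queues alone; lowering it evens out queue lengths more aggressively.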
marius
d25d2952b1 Remove remnants from the sparc64 origin of this file and which are
unlikely to be ever used and misplaced on sun4v respectively.
2007-01-19 12:22:50 +00:00
marius
6cc2be150f Convert the remainder of the low hanging fruits regarding including
headers in .S directly rather than getting to their macros through
genassym.c/assym.s so there are less headers genassym.c has to be
kept in sync with.
While at it fix some style(9) bugs (indentation, prototype format,
sort headers, etc) and remove trailing whitespace.
2007-01-19 11:15:34 +00:00
imp
7cf268c182 Cope gracefully with device_get_children returning an error.
Obtained from: Hans Petter Selasky
P4: http://perforce.freebsd.org/chv.cgi?CH=112957
2007-01-19 08:49:28 +00:00
marius
545a381d5f - Add a uart_rxready() and corresponding device-specific implementations
that can be used to check whether receive data is ready, i.e. whether
  the subsequent call of uart_poll() should return a char, and unlike
  uart_poll() doesn't actually receive data.
- Remove the device-specific implementations of uart_poll() and implement
  uart_poll() in terms of uart_getc() and the newly added uart_rxready()
  in order to minimize code duplication.
- In sunkbd(4) take advantage of uart_rxready() and use it to implement
  the polled mode part of sunkbd_check() so we don't need to buffer a
  potentially read char in the softc.
- Fix some mis-indentation in sunkbd_read_char().

Discussed with:	marcel
2007-01-18 22:01:19 +00:00
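
The layering described above (a non-destructive readiness check, a destructive read, and a generic poll built from the two) can be modelled with a toy device. The `toy_` names and the in-memory FIFO are assumptions for illustration only; the real uart(4) routines operate on hardware registers and have different signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device with a tiny receive FIFO, standing in for a UART. */
struct toy_uart {
	const unsigned char *rx;	/* pending receive data */
	size_t rxlen;			/* bytes still pending */
};

/* Non-destructive check: is receive data ready? */
static int
toy_rxready(const struct toy_uart *u)
{
	return (u->rxlen > 0);
}

/* Destructive read; the caller must have checked toy_rxready() first. */
static int
toy_getc(struct toy_uart *u)
{
	assert(u->rxlen > 0);
	u->rxlen--;
	return (*u->rx++);
}

/* Generic poll: -1 when nothing is pending, otherwise the next character. */
static int
toy_poll(struct toy_uart *u)
{
	if (!toy_rxready(u))
		return (-1);
	return (toy_getc(u));
}
```

With the readiness check available separately, a caller such as a keyboard driver can ask whether input exists without consuming it, so it no longer has to stash an already-read character.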
mjacob
6f6da4e54a A less draconian fix to the build. 2007-01-18 19:41:39 +00:00
marius
a312b87531 - Probe the CS4231 in USIII machines.
- Remove unused variables. [1]

Reported by:	Coverity Prevent (CID 700, 701) [1]
2007-01-18 19:19:19 +00:00
obrien
d7c0868be3 Temporarily comment out the KASSERT that broke the kernel build. 2007-01-18 18:53:13 +00:00
marius
2d9010d810 - Rename UPA_BUS_SPACE to NEXUS_BUS_SPACE; besides an UPA bus, nexus(4)
may also reflect a Fireplane/Safari or JBus bus (or a virtual bus which
  in turn reflects a JBus bus or something like that...).
- In both the sparc64 and sun4v bus_machdep.c use __FBSDID.
- Spell SBus the official way in comments.
- Replace hardcoded function names (all of which were actually outdated)
  in panic and status strings with __func__.
- Fix whitespace nits.
2007-01-18 18:32:26 +00:00
glebius
632643de7f Revise the ng_ppp(4) node, so that code flow is more clear. All non-link
hooks get their per hook rcvdata methods, and all functions are organized
corresponding to protocol stack model.

Submitted by:	Alexander Motin <mav alkar.net>
Reviewed by:	archie, julian
2007-01-18 13:55:21 +00:00
marius
52099a4877 Remove the compat shims for the ISA old-style in{b,w,l}()/out{b,w,l}()
and friends along with all hacks required to implement them. None of
the drivers currently built (as part of GENERIC, LINT or modules) on
sparc64 or sun4v and none of those we might want to use there in
future uses them, AFAICT there actually never was a driver hooked up
to the sparc64 or sun4v build that correctly used these functions
(and it looks like that due to a bug read{b,w,l}()/write{b,w,l}() and
the other functions working on a memory handle never actually worked on
sun4v). All they ever were good for on sparc64 and sun4v was erroneously
dragging in dependencies on isa(4) in drivers like f.e. dpt(4), si(4)
and syscons(4) in source files that supposedly were bus-neutral and
hiding issues with drivers like f.e. ng_bt3c(4) that used these
functions with busses other than isa(4) and therefore couldn't work on
these platforms.
2007-01-18 13:52:44 +00:00
marius
85c3e6d089 Wrap the EISA-specific parts of the dpt(4) and si(4) back-ends in
the newly added DEV_EISA. This is done so that these back-ends can
be compiled on platforms not providing in{b,w,l}()/out{b,w,l}() and
friends (but may wish to use them together with bus front-ends other
than the EISA one).
2007-01-18 13:33:36 +00:00
marius
fb522f68ac On sparc64 also use the fillw() this header provides for ia64 so
the sparc64 MD code doesn't need to provide a memsetw() along with
the ISA compat cruft.
2007-01-18 13:08:08 +00:00
rrs
1b181171ae - most all includes (#include <>) migrate to the sctp_os_bsd.h file
- Finally all splxx() are removed
 - Count error fixed in mapping array which might
   cause a wrong cumack generation.
 - Invariants around panic for case D + printf when no invariants.
 - one-to-one model race condition fixed by using
   a pre-formed connection and then completing the
   work so accept won't happen on a non-formed
   association.
 - Some additional paranoia checks in sctp_output.
 - Locks that were missing in the accept code.

Approved by:	gnn
2007-01-18 09:58:43 +00:00
kib
a40cd17e13 Add support for LINUX_O_DIRECT, LINUX_O_DIRECTORY and LINUX_O_NOFOLLOW flags
to open() [1].
Improve locking for accessing session control structures [2].
Try to document (most likely harmless) races in the code [3].

Based on submission by:	Intron (intron at intron ac) [1]
Reviewed by:		jhb [2]
Discussed with:		netchild, rwatson, jhb [3]
2007-01-18 09:32:08 +00:00
thompsa
d02694d662 Set topology change propagation on all ports _except_ the caller. 2007-01-18 07:13:01 +00:00
rodrigc
abfdc2d6f3 Revert previous change.
Requested by:	kan
2007-01-18 05:46:32 +00:00
rodrigc
d7b980406d Forward declare __pcpu as a pointer type instead of an array type to
eliminate GCC 4.1 error: "array type has incomplete element type".
2007-01-18 02:00:04 +00:00
delphij
9856d14ea1 Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form. 2007-01-17 15:05:52 +00:00
delphij
2e20bff54b Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form. 2007-01-17 14:58:53 +00:00
markus
c778226d2f Fix a buffer overflow iff USB_DEBUG is set, hw.usb.ums.debug is > 5 and the
total size of all input reports is < 6.

PR:		usb/106435
Submitted by:	Eygene Ryabinkin <rea-fbsd@codelabs.ru>
Approved by:	emax (mentor)
MFC after:	3 days
2007-01-17 03:50:45 +00:00
scottl
073fd8d865 Add PCI Id's for upcoming controllers.
Obtained from: LSI Corp.
MFC After: 3 days
2007-01-17 02:58:41 +00:00
cognet
199c2180b1 Create bus dma tags for both the PCI bus and the IXP425 root bus. Set the
PCI bus' one as the default one, and explicitly use the other one for
non-PCI devices.
This is needed because the PCI bus can only address 64MB of RAM, while some
IXP425 boards have 128MB or more, and most of the PCI drivers do not bother
providing the parent dma tag.
2007-01-17 00:58:25 +00:00
cognet
b0b19280da - Add bounce pages for arm, largely based on the i386 implementation.
- Add a default parent dma tag, similar to what has been done for sparc64.
- Before invalidating the dcache in POSTREAD, save the bits which are in the
same cachelines as our buffers, but not part of them, and restore them after
the invalidation.
2007-01-17 00:53:05 +00:00
trhodes
317819a5f6 Add a 3rd entry in the cache, which keeps the end position
from just before extending a file.  This has the desired effect
of keeping the write speed constant.  And yes, that helps a lot
copying large files always at full speed now, and I have seen
improvements using benchmarks/bonnie.

Stolen from:	NetBSD
Reviewed by:	bde
2007-01-16 23:43:14 +00:00
ssouhlal
d4434aa6e9 Remove hptlock from the static witness table, now that it's a regular sleep
mutex.
2007-01-16 22:56:28 +00:00
marius
2d74edbd80 Resurrect upa(4), now used for the subordinate/slave UPA bridge and
bus hanging off from the Fireplane/Safari bus in some USIII machines.
This is part 3/4 of allowing creator(4) to work in these machines.
The little info needed on how to configure the bridge and to work
around the incorrect values contained in the `interrupts' properties
of its children was obtained from OpenSolaris.
2007-01-16 22:08:27 +00:00
marius
7f03dfc1a6 - Merge sys/sparc64/creator/creator_upa.c into sys/dev/fb/creator.c.
The separate bus front-end was inherited from the OpenBSD creator(4),
  which at that time had a mainbus(4) (for USI/II machines, which use
  an UPA interconnection bus as the nexus) and an upa(4) (for USIII
  machines, which use a subordinate/slave UPA bus hanging off from the
  Fireplane/Safari interconnection bus) front-end. With FreeBSD and
  newbus there is/will be no need to have two separate bus front-ends
  for these busses, so we can easily collapse the shared front-end
  and the back-end into a single source file (note that the FreeBSD
  creator_upa.c was a misnomer anyway; based on what it actually attached
  to that should have been creator_nexus.c), actually OpenBSD meanwhile
  also has moved to a shared front-end and a single source file. Due
  to the low-level console support creator.c also wasn't free from bus
  related things before.
  While at it, also split sys/sparc64/creator/creator.h into a
  sys/dev/fb/creatorreg.h that only contains register macros and move
  the structures to the top of sys/dev/fb/creator.c as suggested by
  style(9) so creator(4) is no longer scattered over two directories.
- Use OF_decode_addr()/sparc64_fake_bustag() to obtain the bus tags and
  handles for the low-level console support instead of hardcoding
  support for AFB/FFB hanging off from nexus(4) only. This is part 2/4
  of allowing creator(4) to work in USIII machines (which have a UPA
  bus hanging off from the Fireplane/Safari bus reflected by the nexus),
  which already makes it work as the low-level console there.
- Allocate resources in the bus attach routine regardless of whether
  creator(4) is used for the low-level console and thus the required
  bus tags and handles have been already obtained or not so the resources
  are marked as taken in the respective RMAN.
- For both obtaining the bus tags and handles for the low-level console
  support as well as allocating the corresponding resources in the
  regular bus attach routine don't bother to get all for the maximum of
  24 register banks but only (for) the two tag/handle pairs required for
  providing the video interface for syscons(4) support. If we can't
  allocate the rest of them just limit the memory range accessible via
  creator_fb_mmap() accordingly.
- Sanity check the memory range spanned by the first and last resources
  and the resources in between as far as possible, as the XFree86/Xorg
  sunffb(4) expects to be able to access the whole region, even though
  the backing resources are actually non-continuous. Limit and check
  the memory range accessible via creator_fb_mmap() accordingly.
- Reduce the size of buffers for OFW properties to what they actually
  need to hold.
- Rename some tables to creator_<foo> for consistency.
- Also for the sizes in the creator_fb_mmap() mapping table entries use
  macros for consistency, add macros for the remaining register banks
  for completeness.
2007-01-16 21:08:22 +00:00
marius
026ed31f96 Teach OF_decode_addr() about the bus space used for devices on the
nexus (which might or might not reflect an UPA interconnection bus;
accordingly UPA_BUS_SPACE should be renamed to NEXUS_BUS_SPACE at a
later point) and subordinate/slave UPA busses. This is part 1/4 of
allowing creator(4) to work in USIII machines (which have a UPA bus
hanging off from the Fireplane/Safari bus reflected by the nexus).
2007-01-16 20:42:21 +00:00
marius
da0f11a745 o In re_newbuf() and re_encap() if re_dma_map_desc() aborts the mapping
operation as it ran out of free descriptors or if there are too many
  segments in the first place, call bus_dmamap_unload() in order to
  unload the already loaded segments.
  For trying to map the defragmented mbuf (chain) in re_encap() this
  introduces re_dma_map_desc() setting arg.rl_maxsegs to 0 as a new
  failure mode. Previously we just ignored this case, corrupting our
  view of the TX ring.
o In re_txeof():
  - Don't clear IFF_DRV_OACTIVE unless there are at least 4 free TX
    descriptors. Further down the road re_encap() will bail if there
    aren't at least 4 free TX descriptors, causing re_start() to
    abort and prepend the dequeued mbuf again so it makes no sense
    to pretend we could process mbufs again when in fact we won't.
    While at it replace this magic 4 with a macro RL_TX_DESC_THLD
    throughout this driver.
  - Don't cancel the watchdog timeout as soon as there's at least one
    free TX descriptor but instead only if all descriptors have been
    handled. It's perfectly normal, especially in the DEVICE_POLLING
    case, that re_txeof() is called when only a part of the enqueued
    TX descriptors have been handled, causing the watchdog to be
    disarmed prematurely.
o In re_encap():
  - If m_defrag() fails just drop the packet like other NIC drivers
    do. This should only happen when there's a mbuf shortage, in which
    case it was possible to end up with an IFQ full of packets which
    couldn't be processed as they couldn't be defragmented as they
    were taking up all the mbufs themselves. This includes adjusting
    re_start() to not try to prepend the mbuf (chain) if re_encap()
    has freed it.
  - Remove dupe initialization of members of struct rl_dmaload_arg to
    values that didn't change since trying to process the fragmented
    mbuf chain.
    While at it remove an unused member from struct rl_dmaload_arg.
o In re_start() remove an abandoned, banal comment. The corresponding
  code was moved to re_attach() some time ago.

With these changes re(4) now survives one day (until stopped) of
hammering out packets here.

Reviewed by:	yongari
MFC after:	2 weeks
2007-01-16 20:35:23 +00:00
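
The two re_txeof() rules above can be summarised in a small model. This is a sketch under stated assumptions rather than the driver code: the ring is reduced to counters, `TX_DESC_THLD` stands in for RL_TX_DESC_THLD with the value 4 mentioned in the message, and the struct and function names are invented.

```c
#include <assert.h>

#define	TX_DESC_THLD	4	/* stands in for RL_TX_DESC_THLD */

struct toy_txring {
	int total;	/* descriptors in the ring */
	int free;	/* descriptors currently free */
	int oactive;	/* nonzero while the output queue is stalled */
	int watchdog;	/* nonzero while the watchdog timer is armed */
};

/* Called after `done` descriptors have been reclaimed from the hardware. */
static void
toy_txeof(struct toy_txring *r, int done)
{
	assert(done >= 0 && r->free + done <= r->total);
	r->free += done;
	/* Only unstall output once the encap routine could actually succeed. */
	if (r->free >= TX_DESC_THLD)
		r->oactive = 0;
	/* Only disarm the watchdog once every descriptor has been handled. */
	if (r->free == r->total)
		r->watchdog = 0;
}
```

Keeping the watchdog armed while any descriptor is outstanding matters most when the routine is called on partial progress, as the message notes for the DEVICE_POLLING case.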
jhb
4193a14b0b Disable MSI for the Intel 845 and 865 chipsets and update comment for
E7210 to note it is the same devid as the 875 chipset.
2007-01-16 19:44:45 +00:00
mpp
be61542c54 Fix a spelling error. heirarchy -> hierarchy.
Obtained from:	OpenBSD
2007-01-16 19:40:25 +00:00
mpp
5c20abfae9 Fix a spelling error in some comments. heirarchy -> hierarchy.
Obtained from: OpenBSD
2007-01-16 19:35:43 +00:00
jkim
2506ea94b7 Correct driver_t brgphy_driver, which was forgotten from the last commit. 2007-01-16 17:48:57 +00:00
jhb
0de926dfd1 Fix the subvendor ID for PCI-PCI bridges.
- Retire the PCI_SUB*_1 constants and don't try to read a subvendor ID out
  of them.  There isn't a standard subvendor ID field for PCI-PCI bridges.
  Instead, the dword at offset 0x34 is actually mostly reserved except for
  the LSB which is the capabilities pointer.
- Add support for the PCI-PCI bridge subvendor ID capability (13) and use
  it to set the subvendor ID for PCI-PCI bridges.

MFC after:	 1 month
2007-01-16 17:04:42 +00:00
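
A rough sketch of the lookup this message describes, working on a 256-byte image of configuration space instead of real hardware. The helper names are hypothetical, the placement of the subvendor ID four bytes into the capability is an assumption about the capability layout, and the check of the status register's capability-list bit that a real driver would perform is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	CFG_CAP_PTR		0x34	/* capabilities pointer (LSB of the dword) */
#define	CAP_ID_SUBVENDOR	0x0d	/* capability 13: bridge subvendor ID */

/*
 * Walk the capability list in a 256-byte configuration space image and
 * return the offset of capability `id`, or 0 if it is not present.
 */
static uint8_t
find_cap(const uint8_t cfg[256], uint8_t id)
{
	uint8_t ptr;
	int guard = 48;		/* bound the walk against malformed lists */

	assert(cfg != NULL);
	ptr = cfg[CFG_CAP_PTR] & 0xfc;
	while (ptr >= 0x40 && guard-- > 0) {
		if (cfg[ptr] == id)
			return (ptr);
		ptr = cfg[ptr + 1] & 0xfc;
	}
	return (0);
}

/* Subvendor ID is assumed to sit 4 bytes into the capability, little-endian. */
static uint16_t
bridge_subvendor(const uint8_t cfg[256])
{
	uint8_t cap = find_cap(cfg, CAP_ID_SUBVENDOR);

	if (cap == 0 || cap > 250)
		return (0);
	return ((uint16_t)(cfg[cap + 4] | (cfg[cap + 5] << 8)));
}
```

Returning 0 when the capability is absent mirrors the point of the commit: a bridge has no fixed subvendor field, so the value exists only if the capability does.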
jhb
067aec5afd Remove duplicate variable initialization.
CID:		1706
Found by:	Coverity Prevent (tm)
2007-01-16 17:01:42 +00:00
ume
66ac91ed3b Avoid infinite loop if nicmp6 and nip6 are not on the same mbuf.
NetBSD PR 34994+35333

MFC after:	3 days
2007-01-16 15:55:29 +00:00
joel
132337be8f Fix typo in a comment. 2007-01-16 12:27:13 +00:00
rrs
e614960c33 Removes useless (flags | ) KASSERT. The ^ one that actually
does what we want.

Submitted by:	Li Xin delphij@delphij.net
Reviewed by:	rrs
Approved by:	gnn
2007-01-16 11:40:55 +00:00
jkim
6a86994034 Move MII model and revision into softc. 2007-01-16 00:52:26 +00:00
kmacy
4da320a732 Fix warning by adding extra parentheses 2007-01-16 00:09:58 +00:00
marius
6d5dd8a49a Check the return value of bus_setup_intr() when setting up the
over-temperature and power-fail interrupts.

Suggested by:	Coverity Prevent (CID 683)
MFC after:	1 week
2007-01-15 22:37:59 +00:00
jkim
613d1a9f5a - Move Ethernet@WireSpeed and jumbo frame configurations to separate
functions.  The idea is taken from OpenBSD.
- Set/clear jumbo frame configurations for bge(4).
- Re-add BCM5750 PHY workaround for bce(4), which was mistakenly removed
from the previous commit.
2007-01-15 22:21:44 +00:00
jkim
ffb38d52f7 - Fix BCM5754 support found in Dell PowerEdge SC440.
- Move some PHY bug detections from brgphy.c to if_bge.c.
- Do not penalize working PHYs.
- Re-arrange bge_flags roughly by their categories.
- Fix minor style(9) nits.

PR:		kern/107257
Obtained from:	OpenBSD
Tested by:	Mike Hibler <mike at flux dot utah dot edu>
2007-01-15 21:43:43 +00:00
pav
9ca9d354d9 Rewrite the udf_read() routine to use a file vnode instead of the devvp vnode.
The code is modelled after cd9660, including support for simple read-ahead
courtesy of clustered read.

Fix udf_strategy to DTRT.

This change fixes sendfile(2) not to send out garbage.

Reviewed by:	scottl
MFC after:	1 month
2007-01-15 18:45:36 +00:00
njl
fd60f52ecc Clean up some debug prints from last commit and move one under boot -v.
Reminded by:	bruno
2007-01-15 18:17:36 +00:00
scottl
6b5470947c Add a missing mutex unlock to an error path.
Submitted by: Yuxiang Luo
PR: 107943
2007-01-15 16:22:20 +00:00
rrs
094d70fac7 - Macroizes the V6ONLY flag check.
- Added a short time wait (not used yet) constant
- Corrected the type of the crc32c table (it was
  unsigned long and really is a uint32_t).
- Got rid of the user of MHeaders until they
  are truly needed by lower layers.
- Fixed an initialization problem in the readq structure
  (ordering was off).
- Found yet another collision bug when the random number
  generator returns two numbers on one side (during a collision)
  that are the same. Also added some tracking of cookies
  that will go away when we know that we have the last collision
  bug gone.
- Fixed an init bug for book_size_scale, that was causing
  Early FR code to run when it should not.
- Fixed a flight size tracking bug that was associated with
  Early FR but due to above bug also affected all FRs
- Fixed it so Max Burst also will apply to Fast Retransmit.
- Fixed a bug in the temporary logging code that allowed a
  static log array overflow
- hashinit_flags is now used.
- Two last mcopym's were converted to the macro sctp_m_copym that
  has always been used by all other places
- macro sctp_m_copym was converted to upper case.
- We now validate sinfo_flags on input (we did not before).
- Fixed a bug that prevented a user from sending data and immediately
  shutting down with one send operation.
- Moved to use hashdestroy instead of free() in our macros.
- Fixed an init problem in our timed_wait vtag where we
  did not fully initialize our time-wait blocks.
- Timer stops were re-positioned.
- A pcb cleanup method was added, however this probably will
  not be used in BSD.. unless we make module loadable protocols
- I think this fixes the mysterious timer bug.. it was an
  ordering of locks problem in the way we did timers. It
  now conforms to the timeout(9) manual (except for the
  _drain part, we had to do this a different way due
  to locks).
- Fixed error return code so we get either CONNREUSED or CONNRESET
  depending on where one is in progression
- Purged an unused clone macro.
- Fixed a read error code issue where we were NOT getting the proper
  error when the connection was reset.
Approved by:	gnn
2007-01-15 15:12:10 +00:00
rrs
af870dbd2e Reviewed by: rwatson
Approved by:	gnn

Add a new function hashinit_flags() which allows NOT-waiting
for memory (or waiting). The old hashinit() function now
calls hashinit_flags(..., HASH_WAITOK);
2007-01-15 15:06:28 +00:00
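
The compatibility pattern in this message (a new flag-taking entry point, with the old function kept as a thin wrapper that requests the waiting behaviour) can be sketched in user space as below. The `toy_` names, the flag values and the simplified signature are all assumptions; the kernel functions take additional arguments and allocate differently.

```c
#include <assert.h>
#include <stdlib.h>

#define	TOY_HASH_NOWAIT	0x1	/* fail instead of waiting for memory */
#define	TOY_HASH_WAITOK	0x2	/* caller may block until memory is available */

/*
 * Allocate a table with a power-of-two number of buckets no larger than
 * `elements` and report the index mask.  In user space calloc() never
 * sleeps, so the flag only documents the caller's intent here.
 */
static void **
toy_hashinit_flags(int elements, unsigned long *mask, int flags)
{
	unsigned long size = 1;
	void **tbl;

	assert(elements > 0 && mask != NULL);
	assert(flags == TOY_HASH_NOWAIT || flags == TOY_HASH_WAITOK);
	while (size * 2 <= (unsigned long)elements)
		size *= 2;
	tbl = calloc(size, sizeof(*tbl));
	if (tbl == NULL)
		return (NULL);	/* the outcome a NOWAIT caller must handle */
	*mask = size - 1;
	return (tbl);
}

/* The old entry point keeps its signature and simply asks for WAITOK. */
static void **
toy_hashinit(int elements, unsigned long *mask)
{
	return (toy_hashinit_flags(elements, mask, TOY_HASH_WAITOK));
}
```

Existing callers keep compiling unchanged, while code that cannot sleep can call the flag-taking variant and handle a NULL return.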
glebius
06a509ffdc Whitespace cleanup.
Checked with:	cvs diff -b
2007-01-15 05:55:56 +00:00
glebius
3ff4e1770d Update ip and tcp pointers after m_pullup().
Submitted by:	Alexander Motin <mav alkar.net>
2007-01-15 05:01:31 +00:00
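
The hazard behind this fix is that m_pullup() may hand back different storage, so header pointers computed earlier can be stale. The sketch below is only a user-space analogy under that assumption: `toy_pullup()` moves a heap buffer the way a reallocation would, and the `toy_` names and byte offsets are invented for illustration, not taken from the patched code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_pkt {
	unsigned char *data;
	size_t len;
};

/*
 * Grow the packet's storage to at least `need` bytes.  Like m_pullup(),
 * this may move the data, so every pointer into the old storage is stale
 * afterwards.  Returns 0 on success, -1 on allocation failure.
 */
static int
toy_pullup(struct toy_pkt *p, size_t need)
{
	unsigned char *n;

	if (need <= p->len)
		return (0);
	n = malloc(need);
	if (n == NULL)
		return (-1);
	memset(n, 0, need);
	memcpy(n, p->data, p->len);
	free(p->data);
	p->data = n;
	p->len = need;
	return (0);
}

/* Re-derive a header pointer from the packet base and a byte offset. */
static unsigned char *
toy_hdr(struct toy_pkt *p, size_t off)
{
	assert(off < p->len);
	return (p->data + off);
}
```

The safe habit is to keep offsets rather than pointers across the call and to call the equivalent of `toy_hdr()` again for both headers once the pullup has succeeded.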