Commit Graph

65251 Commits

Author SHA1 Message Date
Ken Smith
a258946554 Make sure that either inp is NULL or we have obtained a lock on it before
jumping to dropunlock to avoid a panic.  While here move the calls to
ipsec4_in_reject() and ipsec6_in_reject() so they are after we obtain
the lock on inp.

Original patch to avoid panic:	pjd
Review of locking adjustments:	gnn, sam
Approved by:			re (rwatson)
2007-09-10 14:49:32 +00:00
Robert Watson
f5514f084e Further UDPv4 cleanup:
- Re-sort includes a bit.
- Correct typos and wording problems in comments.
- Rename udpcksum to udp_cksum to be consistent with other UDP-related
  configuration variables.
- Remove indirection of udp_notify through local notify variable in
  udp_ctlinput(), which is presumably due to copying and pasting from TCP,
  where multiple notify routines exist.

Approved by:	re (kensmith)
2007-09-10 14:22:15 +00:00
Bjoern A. Zeeb
7fd627f00f Fix a DIV0 in case a large value for fs_avgfilesize or fs_avgfpdir
is given (with newfs or tunefs) and dirsize overflows.

In case dirsize is <= 0 because of an overflow set maxcontigdirs
to 0 so it will be 1 later. This is what would happen for large
fs_avgfilesize. [1]

Identified with help from:	roberto, pjd
Submitted by:			pjd [1]
Approved by:			re (rwatson)
MFC after:			8 days
2007-09-10 14:12:29 +00:00
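The guard described in 7fd627f00f can be sketched roughly as follows. This is an illustration only, not the committed code: the function name `compute_maxcontigdirs`, its parameters, and the cap of 255 are assumptions made for the example, and the wrapping 32-bit product is emulated with unsigned arithmetic (two's complement assumed).

```c
#include <stdint.h>

/*
 * Hypothetical helper: derive a "max contiguous dirs" value from the two
 * tunables. A non-positive dirsize (the result of an overflowing product)
 * skips the division, and the final clamp turns 0 into 1.
 */
int
compute_maxcontigdirs(int32_t avgfilesize, int32_t avgfpdir,
    int64_t avail_bytes)
{
	/* Emulate a 32-bit product that wraps on overflow. */
	int32_t dirsize = (int32_t)((uint32_t)avgfilesize * (uint32_t)avgfpdir);
	int64_t maxcontigdirs;

	if (dirsize <= 0) {
		/* Overflow: dividing by dirsize would be wrong or a DIV0. */
		maxcontigdirs = 0;
	} else {
		maxcontigdirs = avail_bytes / dirsize;
		if (maxcontigdirs > 255)	/* illustrative upper bound */
			maxcontigdirs = 255;
	}
	if (maxcontigdirs == 0)
		maxcontigdirs = 1;
	return ((int)maxcontigdirs);
}
```

With both tunables set to 65536 the 32-bit product wraps to 0, so the sketch returns 1 instead of dividing by zero.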
Tai-hwa Liang
73474451b9 Fixing invalid channel display in ifconfig(8) by implementing required
ioctl().

Note that other information provided by ifconfig(8), such as "list chan"
or "list ap", is still not available at this moment.

Before an(4) is connected to wlan(4), users are encouraged to use
ancontrol(8) to retrieve aforementioned information.

Reported by:	dhw (http://lists.freebsd.org/pipermail/freebsd-current/2007-July/074848.html)
Reviewed by:	ambrisko
Tested by:	dhw
Approved by:	re (bmah)
2007-09-10 12:53:34 +00:00
Kip Macy
2de1fa86d7 pull in changes made to RELENG_6 version in the process of doing the MFC
Supported by: Chelsio
Approved by: re (blanket)
2007-09-10 00:59:51 +00:00
Andrew Thompson
cb44b6dfe8 Check for multicast destination on bpf injected packets and update the M_*CAST
flags; the absence of these flags causes problems in other areas such as
bridging, which expect them to be correct.

At the moment only Ethernet DLTs are checked.

Reviewed by:	bms, csjp, sam
Approved by:	re (bmah)
2007-09-10 00:03:06 +00:00
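A minimal sketch of the kind of destination check cb44b6dfe8 describes for Ethernet frames. The function name and the `EX_FLAG_*` constants are hypothetical stand-ins for the kernel's M_BCAST/M_MCAST mbuf flags; only the address test itself (group bit in the first octet, all-ones for broadcast) is standard Ethernet behavior.

```c
#include <stdint.h>
#include <string.h>

#define EX_ETHER_ADDR_LEN	6
#define EX_FLAG_BCAST		0x1	/* stand-in for M_BCAST */
#define EX_FLAG_MCAST		0x2	/* stand-in for M_MCAST */

/*
 * Classify an Ethernet destination address.
 * Returns 0 for unicast, EX_FLAG_BCAST for ff:ff:ff:ff:ff:ff,
 * and EX_FLAG_MCAST for any other group address.
 */
int
classify_ether_dst(const uint8_t *dst)
{
	static const uint8_t bcast[EX_ETHER_ADDR_LEN] =
	    { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

	/* The low bit of the first octet marks a group address. */
	if ((dst[0] & 0x01) == 0)
		return (0);
	if (memcmp(dst, bcast, EX_ETHER_ADDR_LEN) == 0)
		return (EX_FLAG_BCAST);
	return (EX_FLAG_MCAST);
}
```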
Robert Watson
45e0f3d63d Rename mac_check_vnode_delete() MAC Framework and MAC Policy entry
point to mac_check_vnode_unlink(), reflecting UNIX naming conventions.

This is the first of several commits to synchronize the MAC Framework
in FreeBSD 7.0 with the MAC Framework as it will appear in Mac OS X
Leopard.

Reviewed by:    csjp, Samy Bahra <sbahra at gwu dot edu>
Submitted by:   Jacques Vidrine <nectar at apple dot com>
Obtained from:  Apple Computer, Inc.
Sponsored by:   SPARTA, SPAWAR
Approved by:    re (bmah)
2007-09-10 00:00:18 +00:00
Kip Macy
f4a2d780df - Remove filter support
Supported by: Chelsio
Approved by: re (blanket)
2007-09-09 20:26:02 +00:00
Olivier Houchard
4168e66b1f In __bswap16_var(), make sure the 16 upper bits are cleared; while
optimizing, gcc4 doesn't always do so.

Reported by:	Nathan Whitehorn
Approved by:	re (blanket)
2007-09-09 11:58:38 +00:00
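The idea behind 4168e66b1f, shown in portable C rather than the architecture-specific code the commit touches. This is a sketch under assumptions: `bswap16_sketch` is a hypothetical name, and the 32-bit intermediate stands in for a register whose upper half may hold leftover bits after the shifts.

```c
#include <stdint.h>

/*
 * Swap the two bytes of a 16-bit value using a 32-bit intermediate,
 * then explicitly clear the upper 16 bits of the result.
 */
uint32_t
bswap16_sketch(uint16_t in)
{
	uint32_t r = in;

	/* The left shift pushes the old high byte into bits 16..23. */
	r = (r << 8) | (r >> 8);
	/* Without this mask the stray bits above bit 15 would survive. */
	return (r & 0xffff);
}
```

For an input of 0x1234 the unmasked intermediate is 0x123412; the mask reduces it to 0x3412.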
Kip Macy
8adc65adda Add back in support for normal mbuf chaining on RX under DISABLE_MBUF_IOVEC
Approved by: re (blanket)
Supported by: Chelsio
2007-09-09 04:34:03 +00:00
Kip Macy
a8d57f7f24 Fix last-minute typo in last commit caused by pre-commit scripts
Approved by: re (blanket)
2007-09-09 03:51:25 +00:00
Kip Macy
5c5df3da16 - fix qset to port binding as a proper fix for the problems encountered on the 4-port
- fix the use after free seen when sending packets small enough to fit as an immediate
  and bpf peers are present
- update to firmware rev 4.7 along with various small vendor fixes

Supported by: Chelsio
Approved by: re (blanket)
MFC after: 3 days
2007-09-09 01:28:03 +00:00
Olivier Houchard
18b6e4c8d2 Do not set the RTF_GATEWAY flag if RTF_LLINFO is set; it doesn't make much
sense in that context, and leads to unusable routes.
This should unbreak bootpd.

Discussed with: glebius
Submitted by:   bms
Approved by:    re (bmah)
2007-09-08 19:28:45 +00:00
Randall Stewart
851b7298b3 - send call has a reference to uio->uio_resid in
  the recent send code, but uio may be NULL on sendfile
  calls. Change to use sndlen variable.
- EMSGSIZE is not being returned in non-blocking mode
  and needs a small tweak to look if the msg would
  ever fit when returning EWOULDBLOCK.
- FWD-TSN has a bug in stream processing which could
  cause a panic. This is a follow-on to the Codenomicon
  fix.
- PDAPI level 1 and 2 do not work unless the reader
  gets his returned buffer full. Fix so we can break
  out when at level 1 or 2.
- Fix fast-handoff features to copy across properly on
  accepted sockets
- Fix sctp_peeloff() system call when no true system call
  exists to screen arguments for errors. In cases where a
  real system call exists the system call itself does this.
- Fix raddr leak in recent add-ip code change for bundled
  asconfs (even when non-bundled asconfs are received)
- Make sure ipi_addr lock is held when walking global addr
  list. Need to change this lock type to a rwlock().
- Add don't wake flag on both input and output when the
  socket is closing.
- When deleting an address verify the interface is correct
  before allowing the delete to process. This protects panda
  and unnumbered.
- Clean up old sysctl stuff and get rid of the old Open/Net
  BSD structures.
- Add a function to watch the ranges in the sysctl sets.
- When appending in the reassembly queue, validate that
  the assoc has not gone to the about-to-be-freed state. If so
  (in the middle) abort out. Note this especially affects
  MAC I think due to the lock/unlock they do (or with
  LOCK testing in place).
- Netstat patch to get rid of warnings.
- Make sure that no data gets queued to inactive/unconfirmed
  destinations. This especially affects CMT but also has an
  impact on regular SCTP.
- During init collision when we detect seq number out
  of sync we need to treat it like Case C and discard
  the cookie (no invariant needed here).
- Atomic access to the random store.
- When we declare a vtag good, we need to shove it
  into the time wait hash to prevent further use. When
  the tag is put into the assoc hash, we need to remove it
  from the twait hash (where it will surely be). This prevents
  duplicate tag assignments.
- Move decr-ref count to better protect sysctl out of
  data.
- ltrace error corrections in sctp6_usrreq.c
- Add hook for interface up/down to be sent to us.
- Make sysctl() exported structures independent of processor
  architecture.
- Fix route and src addr cache clearing for delete address case.
- Make sure address marked SCTP_DEL_IP_ADDRESS is never selected
  as src addr.
- ICMP handling fixed so we actually look at the ICMP codes
  to figure out what to do.
- Modified mobility code.
  Reception of DELETE IP ADDRESS for a primary destination and
  SET PRIMARY for a new primary destination is used for
  retransmission trigger to the new primary destination.
  Also, in this case, destination of chunks in send_queue are
  changed to the new primary destination.
- Fix so that we disallow sending by mbuf to ever have EEOR
  mode set upon it.

Approved by:	re@freebsd.org (B Mah)
2007-09-08 17:48:46 +00:00
Randall Stewart
ceaad40ae7 - Locking compatibility changes. This involves adding
  additional flags to many function calls. The flags only
  get used in BSD when we compile with lock testing. These
  flags allow Apple to escape the "giant" lock it holds on
  the socket and have more fine-grained locking in the NKE.
  It also allows us to test (with witness) the locking used
  by Apple via a compile switch (manually applied).

Approved by:	re@freebsd.org (B Mah)
2007-09-08 11:35:11 +00:00
Robert Watson
ce4d8529e3 Continue UDP/UDPv6 synchronization project:
- Fix copyrights, comments in UDPv6.
- Remove macro defines for in6pcb and udp6stat.
- Consistently refer to inpcbs as 'inp' and not also 'in6p'.

Reviewed by:	gnn, jinmei, bz
Approved by:	re (bmah)
2007-09-08 08:18:24 +00:00
Robert Watson
85d9437250 Back out tcp_timer.c:1.93 and associated changes that reimplemented the many
TCP timers as a single timer, but retain the API changes necessary to
reintroduce this change.  This will back out the source of at least two
reported problems: lock leaks in certain timer edge cases, and TCP timers
continuing to fire after a connection has closed (a bug previously fixed and
then reintroduced with the timer rewrite).

In a follow-up commit, some minor restylings and comment changes performed
after the TCP timer rewrite will be reapplied, and a further change to allow
the TCP timer rewrite to be added back without disturbing the ABI.  The new
design is believed to be a good thing, but the outstanding issues are
leading to significant stability/correctness problems that are holding
up 7.0.

This patch was generated by silby, but is being committed by proxy due to
poor network connectivity for silby this week.

Approved by:	re (kensmith)
Submitted by:	silby
Tested by:	rwatson, kris
Problems reported by:	peter, kris, others
2007-09-07 09:19:22 +00:00
Sam Leffler
2a2391c23c - fix a bug where zyd_attach() returns 0 even if it encountered errors,
  which can lead to a panic when the stick is yanked.
- make sure that zyd_attach() returns 0 or errno.

Submitted by:	Weongyo Jeong <weongyo.jeong@gmail.com>
Reported by:	Ted Lindgreen <ted@tednet.nl>
Reviewed by:	sam
Approved by:	re (blanket wireless)
2007-09-07 03:54:54 +00:00
Marius Strobl
7439368f60 o Revamp the sparc64 interrupt code in order to be able to interface
  with the INTR_FILTER-enabled MI code. Basically this consists of
  registering an interrupt controller (of which there can be multiple
  and optionally different ones either per host-to-foo bridge or shared
  amongst host-to-foo bridges in any one machine) along with an interrupt
  vector as specific argument for all the interrupt vectors used by a
  given host-to-foo bridge (roughly similar to registering interrupt
  sources on amd64 and i386), providing functions to enable, clear and
  disable the interrupts of the children beneath the bridge.
  This also includes:
  - No longer entering a critical section in tl0_intr() and tl1_intr()
    for executing interrupt handlers but rather let the handlers enter
    it themselves so in the case of intr_event_handle() we don't enter
    a nested critical section.
  - Adding infrastructure for binding delivery of interrupt vectors to
    specific CPUs which later on can be interfaced with the code from
    amd64/i386 for binding interrupts to specific CPUs.
  - Getting rid of the wrapper hack introduced along the lines of the
    API changes for INTR_FILTER which as a side-effect caused interrupts
    associated with ithread handlers only to get the elevated priority
    of those associated with filters ("fast handlers") (this removes the
    hack also in the non-INTR_FILTER case).
  - Disabling (by not clearing) an interrupt in the interrupt controller
    until all associated handlers have been executed, which is crucial
    for the typical locking strategy of NIC drivers in order to work
    correctly in case of shared interrupts. This was a more or less
    theoretical problem on sparc64 though, as shared interrupts are
    rather uncommon there except for the on-board SCCs and UARTs.
  Note that due to the behavior of at least some of the interrupt
  controllers used on sparc64 an enable+EOI instead of a disable+EOI
  approach (as implied by the INTR_FILTER MI code and implemented on
  other architectures) is used as the latter can cause lost interrupts
  or in the worst case interrupt starvation.
o Correct a typo in sbus_alloc_resource() which caused (pass-through)
  allocations to only work down to the grandchildren of the bus, which
  wasn't a real problem so far as we don't support any devices which are
  great-grandchildren or greater of a U2S bridge, yet.
o In fhc(4) use bus_{read,write}_4() instead of bus_space_{read,write}_4()
  in order to get rid of sc_bh and sc_bt in the fhc_softc. Also get rid
  of some other unneeded members in fhc_softc.

Reviewed by:	marcel (earlier version)
Approved by:	re (kensmith)
2007-09-06 19:16:30 +00:00
Marius Strobl
5435966282 Style(9) fix - use #define<tab> consistently.
Approved by:	re (kensmith)
2007-09-06 14:56:09 +00:00
Sam Leffler
7595008bb1 oops, add missing bit from last change
Approved by:	re (blanket wireless)
2007-09-06 00:08:02 +00:00
Sam Leffler
c066143c08 Fixup sta inactivity handling:
o reset ni_inact when ni_inact_reload is changed so we're
  assured a valid setting
o never let ni_inact go negative
o add a knob to disable hostap sta idle handling (e.g. so it can be done
  by a user application)
o remove bogus reload on associate

Reviewed by:	avatar
Approved by:	re (blanket wireless)
2007-09-06 00:04:36 +00:00
Sam Leffler
5c096cfbe5 Add missing bg scanning bits; update ic_lastdata and cancel any
bg scan when there's outbound traffic.

Approved by:	re (blanket wireless)
2007-09-05 23:40:59 +00:00
Sam Leffler
2b9411e29f Add missing bits that made bg scanning lame:
o update ic_lastdata to reflect time of last outbound frame
o outbound traffic must preempt/cancel bg scanning to avoid delays

This stuff was somehow missed in the initial import.

Reviewed by:	thompsa, avatar, sephe (earlier version)
Approved by:	re (blanket wireless)
2007-09-05 23:00:27 +00:00
Sam Leffler
14fb6b8fe2 o add 802.11 state machine states for DFS and client-side power save
o fixup drivers to ignore new states

Reviewed by:	avatar (?)
Approved by:	re (blanket wireless)
2007-09-05 21:31:32 +00:00
Sam Leffler
dc60433061 add defs just removed from ieee80211.h
Approved by:	re (blanket wireless)
2007-09-05 21:25:58 +00:00
Sam Leffler
3f87f68e74 Update channel definition:
o add ic_extieee to hold the HT40 extension channel number
o add ic_state to track dynamic channel state for DFS
o add flags to mark regulatory channel requirements
o add state defs for DFS/radar support

Reviewed by:	avatar
Approved by:	re (blanket wireless)
2007-09-05 20:37:39 +00:00
Sam Leffler
eddedabe31 Miscellaneous fixups to 802.11 defs:
o update 11n definitions to D2.0 spec
o add IEEE80211_CAPINFO_SPECTRUM_MGMT for DFS support
o add CSA ie definition for DFS support
o purge some unused definitions
o correct 802.11 reason and status codes
o correct reason code returned when a sta tries to associate to an
  ap operating with WPA/RSN but without a WPA/RSN ie

Reviewed by:	thompsa, avatar
Approved by:	re (blanket wireless)
2007-09-05 20:29:51 +00:00
Sam Leffler
b1acbdbbbb o add M_WEP mbuf flag so drivers can mark frames that are decrypted by the
  device and have had the crypto bits stripped from the 802.11 header
o strip mbuf flags in the rx path before passing up the stack

Reviewed by:	thompsa, sephe, avatar
Approved by:	re (blanket wireless)
2007-09-05 20:22:59 +00:00
Olivier Houchard
33321c8166 There's no need to re-read PCIR_COMMAND once we set it.
Approved by:	re (blanket)
2007-09-04 18:45:27 +00:00
Jack F Vogel
3ec35e52b8 This is an update to the new Intel 10G 82598 driver.
The first drop was Beta, this code is expected to be the release version.
Note that this driver code will build in either 6.2 or 7. If you
use the code in 6.2 you will not get TSO or MSI/X support but it will
function in a legacy mode.

Approved by: re
2007-09-04 02:31:35 +00:00
Robert Watson
70ffc2fb53 In userland_sysctl(), call useracc() with the actual newlen value to be
used, rather than the one passed via 'req', which may not reflect a
rewrite.  This call to useracc() is redundant to validation performed by
later copyin()/copyout() calls, so there isn't a security issue here,
but this could technically lead to excessive validation of addresses if
the length in newlen is shorter than req.newlen.

Approved by:	re (kensmith)
Reviewed by:	jhb
Submitted by:	Constantine A. Murenin <cnst+freebsd@bugmail.mojo.ru>
Sponsored by:	Google Summer of Code 2007
2007-09-02 09:59:33 +00:00
Yoshihiro Takahashi
7b226dfaa8 Fix a kernel panic due to a NULL pointer access on pc98.
When any PnP device exists, isa_release_resource() is called with no
activated resource.  So a bushandle is not allocated yet.

Approved by:	re (kensmith)
2007-09-01 12:18:28 +00:00
Pawel Jakub Dawidek
864cba9669 Add support for Camellia encryption algorithm.
PR:		kern/113790
Submitted by:	Yoshisato YANAGISAWA <yanagisawa@csg.is.titech.ac.jp>
Approved by:	re (bmah)
2007-09-01 06:33:02 +00:00
Pawel Jakub Dawidek
6bc581fcf0 Use CTLFLAG_RDTUN for tunable sysctls.
Approved by:	re (bmah)
2007-09-01 06:23:42 +00:00
Bruce Evans
c2819440b3 Fix races in msdosfs_lookup() and msdosfs_readdir(). These functions
can easily block in bread(), and then there was nothing to prevent the
static buffer (nambuf_{ptr,len,last_id}) being clobbered by another
thread.

The effects of the bug seem to have been limited to failed lookups and
mangled names in readdir(), since Giant locking provides enough
serialization to prevent concurrent calls to the functions that access
the buffer.  They were very obvious for multiple concurrent tree walks,
especially with a small cluster size.

The bug was introduced in msdosfs_conv.c 1.34 and associated changes,
and is in all releases starting with 5.2.

The fix is to allocate the buffer as a local variable and pass around
pointers to it like "_r" functions in libc do.  Stack use from this
is large but not too large.  This also fixes a memory leak on module
unload.

Reviewed by:	kib
Approved by:	re (kensmith)
2007-08-31 22:29:55 +00:00
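The fix in c2819440b3 follows the familiar "_r" pattern: per-call state lives in a caller-supplied object instead of a shared static buffer. A small sketch of that pattern, with a hypothetical `struct name_buf` and helper names that do not appear in the msdosfs code:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-call name buffer; the caller owns the storage. */
struct name_buf {
	char	data[32];
	size_t	len;
};

void
name_buf_init(struct name_buf *nb)
{
	nb->len = 0;
	nb->data[0] = '\0';
}

/* Append a chunk; returns 0 on success, -1 if it would not fit. */
int
name_buf_append(struct name_buf *nb, const char *chunk)
{
	size_t n = strlen(chunk);

	/* Need room for the chunk plus the terminating NUL. */
	if (n >= sizeof(nb->data) - nb->len)
		return (-1);
	memcpy(nb->data + nb->len, chunk, n + 1);	/* copies the NUL too */
	nb->len += n;
	return (0);
}
```

Because each caller declares its own `struct name_buf` (typically on the stack), two threads that interleave their appends cannot clobber each other's partial name, which is the failure mode the commit message describes for the static buffer.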
John Baldwin
67b158d888 Close a race that snuck in with the recent changes to fix a LOR between
the callout_lock spin lock and the sleepqueue spin locks.  In the fix,
callout_drain() has to drop the callout_lock so it can acquire the
sleepqueue lock.  The state of the callout can change while the
callout_lock is held however (for example, it can be rescheduled via
callout_reset()).  The previous code assumed that the only state change
that could happen is that the callout could finish executing.  This change
alters callout_drain() to effectively restart and recheck everything
after it acquires the sleepqueue lock thus handling all the possible
states that the callout could be in after any changes while callout_lock
was dropped.

Approved by:	re (kensmith)
Tested by:	kris
2007-08-31 19:01:30 +00:00
Diomidis Spinellis
d5b6981e69 Add missing newline in the log message of the previous commit.
Approved by:	re (kensmith) - implied
2007-08-31 13:56:26 +00:00
Diomidis Spinellis
72de1b3709 Don't panic. When encountering a negative value call log(LOG_NOTICE, ...)
and record LONG_MAX, instead of calling KASSERT(...).

Reported by:	rwatson
Approved by:	re (kensmith)
2007-08-31 13:36:58 +00:00
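The behavior change in 72de1b3709 (log and substitute a sentinel instead of asserting) might look like this in userland C. The function name is hypothetical, and `fprintf(stderr, ...)` stands in for the kernel's log(LOG_NOTICE, ...) call mentioned in the message.

```c
#include <limits.h>
#include <stdio.h>

/*
 * Return the value to record. A negative input is reported and replaced
 * by LONG_MAX rather than treated as a fatal invariant violation.
 */
long
sanitize_for_record(long val)
{
	if (val < 0) {
		fprintf(stderr,
		    "notice: negative value %ld, recording LONG_MAX\n", val);
		return (LONG_MAX);
	}
	return (val);
}
```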
Nate Lawson
c961faca8c Evaluate _OSC on boot to indicate our OS capabilities to ACPI. This is
needed at least to convince the BIOS to give us access to CPU freq
control on MacBooks.

Submitted by:	Rui Paulo <rpaulo / fnop.net>
Approved by:	re
MFC after:	5 days
2007-08-30 21:18:42 +00:00
Andrew Thompson
207455510b Show the ACTIVE flag in ifconfig for the single interface that is actually
active in failover mode rather than all interfaces with a link. This makes it
clear if the master interface is in use or one of the backup links.

Found by:	Writing the Handbook section
Approved by:	re (kensmith)
2007-08-30 19:12:10 +00:00
Andrew Thompson
06035e8252 Remove the lock assert from iwi_newstate, this function does not need the lock
to be held and this will falsely trigger if called from net80211.

Reported by:	Munehiro (haro) Matsuda
Reviewed by:	sam
Approved by:	re (kensmith)
2007-08-29 21:52:03 +00:00
John Baldwin
57b7fe337e Partially revert the previous change. I failed to notice that where
ktruserret() is invoked, an unlocked check of the per-process queue
is performed inline; thus, we don't lock the ktrace_sx on every userret().

Pointy hat to:	jhb
Approved by:	re (kensmith)
Pointy hat recovered from:	rwatson
2007-08-29 21:17:11 +00:00
Warner Losh
eb0fa74e92 A port of the zyd driver from NetBSD by Weongyo Jeong. This supports the ZyDAS
ZD1211/ZD1211B USB IEEE 802.11b/g wireless network devices.  Not (yet)
connected to the build process (next batch of commits once I've looped
the current back).

Submitted by: Weongyo Jeong
Reviewed by: sam@
Approved by: re@
2007-08-29 21:16:50 +00:00
Warner Losh
44298c2b79 Makefile for building zyd kernel module.
Submitted by: Weongyo Jeong
Approved by: re@ (kensmith)
2007-08-29 21:04:26 +00:00
Warner Losh
4c2b0b2a5e Add devices for the forthcoming zyd driver, ported from NetBSD, by
Weongyo Jeong.

Submitted by: Weongyo Jeong
Approved by: re@
2007-08-29 21:00:57 +00:00
Brian Feldman
598fa04675 Repair ALTQ-tagging rules in IPFW which got broken in the last PF
import.  The PF mbuf-tagging support routines changed to link the
allocated tags into the provided mbuf themselves, so the left-over
m_tag_prepend() was trying to add a bogus (usually NULL) tag.

Reviewed by: mlaier
Approved by: re
2007-08-29 19:34:28 +00:00
John Baldwin
cc479dda4a Rework the routines to convert a 5.x+ statfs structure (with fixed-size
64-bit counters) to a 4.x statfs structure (with long-sized counters).
- For block counters, we scale up the block size sufficiently large so
  that the resulting block counts fit into the long-sized (long for the
  ABI, so 32-bit in freebsd32) counters.  In 4.x the NFS client's statfs
  VOP did this already.  This can lie about the block size to 4.x binaries,
  but it presents a more accurate picture of the ratios of free and
  available space.
- For non-block counters, fix the freebsd32 stats converter to cap the
  values at INT32_MAX rather than losing the upper 32-bits to match the
  behavior of the 4.x statfs conversion routine in vfs_syscalls.c

Approved by:	re (kensmith)
2007-08-28 20:28:12 +00:00
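The two conversion rules in cc479dda4a can be sketched as below. This is not the kernel routine: `struct old_counts`, `shrink_counts`, and `cap_int32` are hypothetical names, the counters are simplified to unsigned 64-bit inputs, and 32-bit signed fields stand in for the "long-sized" counters of a 32-bit ABI.

```c
#include <stdint.h>

/* Hypothetical narrow structure with 32-bit signed counters. */
struct old_counts {
	int32_t	bsize;
	int32_t	blocks;
	int32_t	bfree;
	int32_t	files;
};

/* Clamp a 64-bit counter to INT32_MAX instead of truncating it. */
static int32_t
cap_int32(uint64_t v)
{
	return (v > INT32_MAX ? INT32_MAX : (int32_t)v);
}

void
shrink_counts(uint64_t bsize, uint64_t blocks, uint64_t bfree,
    uint64_t files, struct old_counts *out)
{
	uint64_t largest = blocks > bfree ? blocks : bfree;
	unsigned shift = 0;

	/* Grow the reported block size until the block counts fit. */
	while ((largest >> shift) > INT32_MAX)
		shift++;
	out->bsize = cap_int32(bsize << shift);	/* assumes no 64-bit overflow */
	out->blocks = (int32_t)(blocks >> shift);
	out->bfree = (int32_t)(bfree >> shift);

	/* Non-block counters cannot be rescaled, so they are capped. */
	out->files = cap_int32(files);
}
```

Scaling both the block size and the block counts by the same power of two keeps the free-to-total ratio intact, at the cost of reporting a block size that differs from the real one.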
Konstantin Belousov
0e6ed4feab Regenerate.
Approved by:	re (kensmith)
2007-08-28 12:36:23 +00:00
Konstantin Belousov
b6e645c90f Implement fake Linux sched_getaffinity() syscall to enable Java to work
with Linux 2.6 emulation. This shall be reimplemented once FreeBSD gets
native scheduler affinity syscalls.

Submitted by:	rdivacky
Reviewed by:	jkim
Sponsored by:	Google Summer of Code 2007
Approved by:	re (kensmith)
2007-08-28 12:26:35 +00:00
Jung-uk Kim
8553cd622c Fix off-by-two errors.
Both WWNN and WWPN are 64-bit unsigned integers and they are prefixed
with "0x", which requires two more bytes each.

Submitted by:	Danny Braniss (danny at cs dot huji dot ac dot il)
		via Matthew Jacob (lydianconcepts at gmail dot com)
Approved by:	re (bmah)
MFC after:	3 days
2007-08-28 00:09:12 +00:00
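The sizing arithmetic behind 8553cd622c, as a sketch: a 64-bit value printed in hexadecimal needs up to 16 digits, the "0x" prefix adds two bytes, and the terminating NUL one more. The macro and function names are hypothetical, and the zero-padded format is an assumption made so the length is constant.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* "0x" prefix + 16 hex digits + terminating NUL. */
#define EX_WWN_STRLEN	(2 + 16 + 1)

/* Format a 64-bit world wide name; returns the number of characters written. */
int
format_wwn(char *buf, uint64_t wwn)
{
	return (snprintf(buf, EX_WWN_STRLEN, "0x%016" PRIx64, wwn));
}
```

A buffer sized for only the 16 digits plus the NUL would be two bytes short once the prefix is included, which matches the "two more bytes each" wording of the commit.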
Randall Stewart
2afb3e849f - During shutdown pending, when the last sack came in and
  the last message on the send stream was "null" but still
  there, a state we allow, we could get hung and not clean
  it up and wait for the shutdown guard timer to clear the
  association without a graceful close. Fix this so
  that we properly clean up.
- Added support for Multiple ASCONF per new RFC. We only
  (so far) accept input of these and cannot yet generate
  a multi-asconf.
- Sysctl'd support for experimental Fast Handover feature. Always
  disabled unless sysctl or socket option changes to enable.
- Error case in add-ip where the peer supports AUTH and ADD-IP
  but does NOT require AUTH of ASCONF/ASCONF-ACK. We need to
  ABORT in this case.
- According to the Kyoto summit of socket api developers
  (Solaris, Linux, BSD). We need to have:
   o non-eeor mode messages be atomic - Fixed
   o Allow implicit setup of an assoc in 1-2-1 model if
     using the sctp_**() send calls - Fixed
   o Get rid of HAVE_XXX declarations - Done
   o add a sctp_pr_policy in hole in sndrcvinfo structure - Done
   o add a PR_SCTP_POLICY_VALID type flag - yet to-do in a future patch!
- Optimize sctp6 calls to reuse code in sctp_usrreq. Also optimize
  when we close sending out the data and disabling Nagle.
- Change key concatenation order to match the auth RFC
- When sending OOTB shutdown_complete always do csum.
- Don't send PKT-DROP to a PKT-DROP
- For abort chunks, always checksum, the same as for
  shutdown-complete.
- inpcb_free front state had a bug where in queue
  data could wedge an assoc. We need to just abandon
  ones in front states (free_assoc).
- If a peer sends us a 64k abort, we would try to
  assemble a response packet which may be larger than
  64k. This then would be dropped by IP. Instead make
  a "minimum" size for us 64k-2k (we want at least
  2k for our initack). If we receive such an init
  discard it early without all the processing.
- When we peel off we must increment the tcb ref count
  to keep it from being freed from underneath us.
- handling fwd-tsn had bugs that caused memory overwrites
  when given faulty data, fixed so can't happen and we
  also stop at the first bad stream no.
- Fixed so comm-up generates the adaption indication.
- peeloff did not get the hmac params copied.
- fix it so we lock the addr list when doing src-addr selection
  (in future we need to use a multi-reader/one writer lock here)
- During lowlevel output, we could end up with a _l_addr set
  to null if the iterator is calling the output routine. This
  means we would possibly crash when we gather the MTU info.
  Fix so we only do the gather where we have a src address
  cached.
- we need to be sure to set abort flag on conn state when
  we receive an abort.
- peeloff could leak a socket. Moved code so the close will
  find the socket if the peeloff fails (uipc_syscalls.c)

Approved by:	re@freebsd.org(Ken Smith)
2007-08-27 05:19:48 +00:00
Maxim Konovalov
4a296ec798 o Fix bug I introduced in the previous commit (ipfw set extension):
  pack a set number correctly.

Submitted by:	oleg

o Plug a memory leak.

Submitted by:	oleg and Andrey V. Elsukov
Approved by:	re (kensmith)
MFC after:	1 week
2007-08-26 18:38:31 +00:00
Sepherosa Ziehau
f05ba5eeed Off-by-one bug in country ie construction, which will make HOSTAP send out
malformed beacons.

Reviewed by: sam
Approved by: re (bmah), sam (mentor)
2007-08-26 11:34:51 +00:00
Sepherosa Ziehau
98b335504d Fix following nits:
- Per ieee80211com sysctl ctx leakage on detach
- getmgtframe incorrectly adjusts mbuf.m_data

Reviewed by: sam
Approved by: re (bmah), sam (mentor)
2007-08-26 11:32:56 +00:00
Scott Long
610f2ef365 Update the MFI driver to support new "1078" series of hardware. This
includes the upcoming Dell PERC6 series.  Many thanks to LSI for
contributing this code.

Submitted by: LSI
Approved by: re
2007-08-25 23:58:45 +00:00
Kip Macy
7ac2e6c362 Fixes for 4 port and small packet optimization
- remove cpl->iff panic - we can't know the port number from the rspq on the 4-port
- pick the ifnet based on the interface in the CPL header
- switch to using qset 0 for egress on the 4-port for now - may change
  when we start using RSS
- move ether_ifdetach to before the port lock gets deinitialized to avoid
  hang in the case where there are BPF peers (cxgb_ioctl is called indirectly
  when BPF peers are present)
- don't call t3_mac_reset if multiport is set, this was causing tx errors
  by misconfiguring the MAC on the 4-port
- change V_TXPKT_INTF to use txpkt_intf as the interfaces are not contiguous
- free the mbuf immediately in the case where the payload is small enough to be copied
  into the rspq
- only update the coalesce timer for a queue if packets were taken off of it
- add in missed 20ms DELAY in vsc8211 initialization

- prompt MFC as this only applies to the 4-port which is currently completely
  broken - OK'd by kensmith

Supported by: Chelsio
Approved by: re (blanket)
MFC after: 0 days
2007-08-25 21:07:37 +00:00
Sam Leffler
d72c72537e drop frames marked for encryption when no key is available
Reviewed by:	avatar
Approved by:	re (kensmith)
Obtained from:	madwifi
2007-08-24 15:44:27 +00:00
Randall Stewart
c4739e2f47 - Fix address add handling to clear cached routes and source addresses
  when peer acks the add in case the routing table changes.
- Fix sctp_lower_sosend to send shutdown chunk for mbuf send
  case when sndlen = 0 and sinfoflag = SCTP_EOF
- Fix sctp_lower_sosend for SCTP_ABORT mbuf send case with null data,
  so that it does not send the "null" data mbuf out and cause
  it to get freed twice.
- Fix so auto-asconf sysctl actually affects the socket's asconf state.
- Do not allow SCTP_AUTO_ASCONF option to be used on subset bound sockets.
- Memset bug in sctp_output.c (arguments were reversed), found and
  reported by Dave Jones (davej@codemonkey.org.uk).
- PD-API point needs to be invoked on >=, not just >, to conform to the socket api
  draft; this fixes sctp_indata.c in the two places that need to be >=.
- move M_NOTIFICATION to use M_PROTO5.
- PEER_ADDR_PARAMS did not fail properly if you specify an address
  that is not in the association with a valid assoc_id. This meant
  you got or set the stcb level values instead of the destination
  you thought you were going to get/set. Now validate if the
  stcb is non-null and the net is NULL that the sa_family is
  set and the address is unspecified otherwise return an error.
- The thread based iterator could crash if associations were freed
  at the exact time it was running. rework the worker thread to
  use the increment/decrement to prevent this and no longer use
  the markers that the timer based iterator uses.
- Fix the memleak in sctp_add_addr_to_vrf() for the case when it is
  detected that ifa is already pointing to a ifn.
- Fix it so that if someone is so insane that they drop the
  send window below the minimal add mark, they still can send.
- Changed all state for associations to use mask safe macro.
- During front states in association freeing in sctp_inpcbfree, we
  had a locking problem where locks were not in place where they
  should have been.
- Free association calls were not testing the return value in
  sctp_inpcb_free() properly... others should be cast  void returns
  where we don't care about the return value.
- If a reference count is held on an assoc, even from the "force free"
  we should not do the actual free.. but instead let the timer
  free it.
- When we enter sctp_input(), if the SCTP_ASOC_ABOUT_TO_BE_FREED
  flag is set, we must NOT process the packet but handle it like
  ootb. This is because while freeing an assoc we release the
  locks to get all the higher order locks so we can purge all
  the hash tables. This leaves a hole if a packet comes in
  just at that point. Now sctp_common_input_processing() will
  call the ootb code in such a case.
- Change MBUF M_NOTIFICATION to use M_PROTO5 (per Sam L). This makes
  it so we don't have a conflict (I think this is a Coverity change).
  We made this change AFTER some conversation and looking to make sure
  that M_PROTO5 does not have a problem between SCTP and the 802.11
  stuff (which is the only other place it's used).
- Fixed lock order reversal and missing atomic protection around
  locked_tcb during association lookup and the 1-2-1 model.
- Added debug to source address selection.
- V6 output must always do checksum even for loopback.
- Remove more locks around inp that are not needed for an atomically
  added/subtracted ref count.
- slight optimization in the way we zero the array in sctp_sack_check()
- It was possible to respond to an ABORT() with bad checksum with
  a PKT-DROP. This led to a PKT-DROP/ABORT war. Add code to NOT
  send a PKT-DROP to any ABORT().
- Add an option for local logging (useful for Macintosh or when
  you need better performance during debugging). Note no commands
  are here to get the log info, you must just use kgdb.
- The timer code needs to be aware of if it needs to call
  sctp_sack_check() to slide the maps and adjust the cum-ack.
  This is because it may be out of sync cum-ack wise.
- Added threshold management logging.
- If the user picked just the right size, that just filled the send
  window minus one mtu, we would enter a forever loop not copying and
  at the same time not blocking. Change from < to <= solves this.
- Sysctl added to control the fragment interleave level which defaults
  to 1.
- My rwnd control was not being used to control the rwnd properly (we
  did not add and subtract to it :-() this is now fixed so we handle
  small messages (1 byte etc) better to bring our rwnd down more
  slowly.

Approved by:	re@freebsd.org (Bruce Mah)
2007-08-24 00:53:53 +00:00
Ed Maste
afa3f6df27 Add PCI IDs for two cards:
- Adaptec RAID 3405
- Adaptec RAID 3805

Approved by:	re (bmah)
Submitted by:	John Marra  jmarra at nmu dot edu
MFC After:	1 week
2007-08-23 20:12:40 +00:00
Maksim Yevmenkin
d46210e60d Return EADDRNOTAVAIL instead of EDESTADDRREQ error when
listen(2) is called on improperly bound socket.

Suggested by:	Iain Hibbert
Approved by:	re (kensmith)
MFC after:	3 days
2007-08-23 16:55:22 +00:00
Jung-uk Kim
fada2376b8 Export 4Gbps Fibre Channel link speed correctly with inquiry commands.
Approved by:	re (kensmith)
MFC after:	3 days
2007-08-23 15:57:13 +00:00
Dag-Erling Smørgrav
5afb221c66 Style nits + more reliable Tj(max) detection + improved reporting of
critical temperature + sched_unbind() after rdmsr + initialize sc_dev.

Submitted by:	Rui Paulo <rpaulo@fnop.net>, cnst
Approved by:	re (kensmith)
2007-08-23 10:53:03 +00:00
Daniel Hartmeier
7f368082ad When checking the sequence number of a TCP header embedded in an
ICMP error message, do not access th_flags. The field is beyond
the first eight bytes of the header that are required to be present
and were pulled up in the mbuf.

A random value of th_flags can have TH_SYN set, which made the
sequence number comparison not apply the window scaling factor,
which led to legitimate ICMP(v6) packets getting blocked with
"BAD ICMP" debug log messages (if enabled with pfctl -xm), thus
breaking PMTU discovery.

Triggering the bug requires TCP window scaling to be enabled
(sysctl net.inet.tcp.rfc1323, enabled by default) on both end-
points of the TCP connection. Large scaling factors increase
the probability of triggering the bug.

PR:		kern/115413: [ipv6] ipv6 pmtu not working
Tested by:	Jacek Zapala
Reviewed by:	mlaier
Approved by:	re (kensmith)
2007-08-23 09:30:58 +00:00
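Note on 7f368082ad: the sketch below illustrates why th_flags cannot be trusted in the quoted header. It assumes the standard 20-byte TCP header layout and that only the first eight bytes of the embedded header are guaranteed to be present, as the commit message states. The names tcp_hdr_layout, ICMP_QUOTED_BYTES and field_in_quote() are hypothetical and are not taken from pf.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the on-wire TCP header layout (20 bytes, no options). */
struct tcp_hdr_layout {
	uint16_t sport;		/* bytes 0-1 */
	uint16_t dport;		/* bytes 2-3 */
	uint32_t seq;		/* bytes 4-7 */
	uint32_t ack;		/* bytes 8-11 */
	uint8_t  off_x2;	/* byte 12: data offset and reserved bits */
	uint8_t  flags;		/* byte 13: SYN, ACK, ... */
	uint16_t win;		/* bytes 14-15 */
	uint16_t sum;		/* bytes 16-17 */
	uint16_t urp;		/* bytes 18-19 */
};

/* Number of embedded-header bytes assumed to be present in the ICMP error. */
#define ICMP_QUOTED_BYTES 8

/* Returns 1 if a field at [offset, offset + len) lies entirely inside the quote. */
static int
field_in_quote(size_t offset, size_t len)
{
	return (offset + len <= ICMP_QUOTED_BYTES);
}
```

Under these assumptions the sequence number (bytes 4-7) is inside the quoted region, while the flags byte at offset 13 is not, so reading it yields whatever happens to follow in the mbuf.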
Andrew Gallatin
c587e59f20 - Fix a bug which could cause a panic when enabling LRO
  on a down mxge interface
- Fix a bug where mxge reported the link state as
  active when it wasn't (after ifconfig down).
- Prevent spurious watchdog resets when link partner is not consuming
- Add support for CX4 and popular XFP media detection
- Update the firmware and associated header files to 1.4.25

Approved by: re (kensmith)
2007-08-22 13:22:12 +00:00
Joseph Koshy
ea49750231 Assign sizes to assembly language support functions.
Approved by:	re (kensmith)
2007-08-22 05:06:14 +00:00
Joseph Koshy
298889efcb Define an END() macro for use in i386 and amd64 assembly code, akin
to the one available on the ia64, sparc64, and sun4v architectures.

Approved by:	re (kensmith)
2007-08-22 04:26:07 +00:00
Konstantin Belousov
046ea980e1 Properly initialize the dev_priv before calling the i915_dma_cleanup().
This fixes my rev. 1.5.

Reviewed by:	anholt
Approved by:	re (kensmith)
MFC after:	2 weeks
2007-08-21 12:52:57 +00:00
Alan Cox
8beae25391 In general, when we map a page into the kernel's address space, we no
longer create a pv entry for that mapping.  (The two exceptions are
mappings into the kernel's exec and pipe submaps.)  Consequently, there is
no reason for get_pv_entry() to dig deep into the free page queues, i.e.,
use VM_ALLOC_SYSTEM, by default.  This revision changes get_pv_entry() to
use VM_ALLOC_NORMAL by default, i.e., before calling pmap_collect() to
reclaim pv entries.

Approved by:	re (kensmith)
2007-08-21 04:59:34 +00:00
Olivier Houchard
7dd9c45f26 Some time ago, vfs_getopts() was changed so that it would set error to
ENOENT if the option wasn't provided, instead of setting it to 0.
xfs, however, didn't catch up on this: it assumed something had gone wrong
whenever vfs_getopts() set the error to non-zero, and just returned the error.
Unbreak xfs mount by just ignoring the error if vfs_getopts() sets the
error to ENOENT, as we should have sane defaults.

Reviewed by:    kan
Approved by:    re (rwatson)
Tested by:      rpaulo
2007-08-20 15:33:22 +00:00
Konstantin Belousov
d239bd3ccc Do not drop vm_map lock between doing vm_map_remove() and vm_map_insert().
For this, introduce vm_map_fixed() that does that for MAP_FIXED case.

Dropping the lock allowed for parallel thread to occupy the freed space.

Reported by:	Tijl Coosemans <tijl ulyssis org>
Reviewed by:	alc
Approved by:	re (kensmith)
MFC after:	2 weeks
2007-08-20 12:05:45 +00:00
Konstantin Belousov
5114048b63 Destroy the kaio_mtx when freeing the struct kaioinfo in
aio_proc_rundown.

Do not allow a zero-length read to be passed to the fo_read file method
by aio.

Reported and tested by:	Peter Holm
Approved by:	re (kensmith)
2007-08-20 11:53:26 +00:00
Jeff Roberson
67e20930bd - Improve runq_findbit_from() which is used by ULE's circular queue. Mask
   off the bits we want to ignore on the first pass rather than doing a
   linear scan.  This puts us within a few instructions of the cost of
   runq_findbit() and removes this function from the top of profiling output
   for context switch heavy workloads.

Approved by:	re
2007-08-20 06:36:12 +00:00
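Note on 67e20930bd: a minimal sketch of the masking idea, reduced to a single 32-bit word. The real runq bitmap spans several words and the kernel code differs; findbit_from() here is a hypothetical stand-in, and ffs() is the POSIX find-first-set routine from <strings.h>.

```c
#include <stdint.h>
#include <strings.h>	/* ffs() */

/*
 * Return the index of the first set bit at or after 'start' (0..31),
 * wrapping around to bit 0 if needed, or -1 if no bit is set.
 */
static int
findbit_from(uint32_t bits, unsigned int start)
{
	uint32_t first_pass;

	/* First pass: mask off the bits below 'start' instead of scanning them. */
	first_pass = bits & (~(uint32_t)0 << start);
	if (first_pass != 0)
		return (ffs((int)first_pass) - 1);
	/* Second pass: wrap around and consider the whole word. */
	if (bits != 0)
		return (ffs((int)bits) - 1);
	return (-1);
}
```

With bits 0 and 2 set, a search starting at bit 1 finds bit 2 directly, and a search starting at bit 3 wraps around to bit 0, without testing each position in turn.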
Jeff Roberson
9862717afe - Set steal_thresh to log2(ncpus). This improves idle-time load balancing
   on 2cpu machines by reducing it to 1 by default.  This improves loaded
   operation on 8cpu machines by increasing it to 3 where the extra idle
   time is not as critical.

Approved by:	re
2007-08-20 06:34:20 +00:00
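Note on 9862717afe: the arithmetic in the message can be checked with a small sketch. It assumes the threshold is the floor of log2(ncpus); ilog2_floor() and steal_thresh_for() are hypothetical helpers, not the scheduler's own code.

```c
/* Floor of log2(n) for n >= 1; returns 0 for n == 0 or n == 1. */
static int
ilog2_floor(unsigned int n)
{
	int log = 0;

	while (n > 1) {
		n >>= 1;
		log++;
	}
	return (log);
}

/* Hypothetical helper: the steal threshold for a given CPU count. */
static int
steal_thresh_for(unsigned int ncpus)
{
	return (ilog2_floor(ncpus));
}
```

This gives 1 for a 2-CPU machine and 3 for an 8-CPU machine, matching the values quoted in the commit message.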
Nate Lawson
62db376af3 Always call sched_bind(), even if on the CPU in question. It is wrong to
check if we're already on that cpu and skip the bind since the thread could
be migrated off in the meantime.

Suggested by:	jeff
Approved by:	re
2007-08-20 06:28:26 +00:00
Nate Lawson
2145b9d207 Use a different loop variable for the inner loop. This previous reuse could
have caused a hang, but we got lucky with the available multi-CPU states
on actual hardware.

Submitted by:	Bjorn Koenig <bkoenig / alpha-tierchen.de>
Approved by:	re
MFC after:	3 days
2007-08-19 20:34:13 +00:00
Olivier Houchard
d3973c98d5 Just wbinv if both PREREAD and PREWRITE are set.
In PREREAD, just invalidate the cache lines, and do not write them back, if
the buffer is properly aligned.

Approved by:	re (blanket)
2007-08-18 16:47:28 +00:00
Konstantin Belousov
daab56673e Remove comment that is no longer quite true.
Noted by:	alc
Approved by:	re (kensmith)
2007-08-18 16:41:31 +00:00
Konstantin Belousov
efe7553ed7 Fix the phys_pager in a way similar to rev. 1.83 of
sys/vm/device_pager.c:

Protect the creation of the phys pager with non-NULL handle with the
phys_pager_mtx. Lookup of phys pager in the pagers list by handle is now
synchronized with its removal from the list, and phys_pager_mtx is put
before vm object lock in lock order. Dispose of the phys_pager_alloc_lock
and tsleep calls, together with acquiring Giant, since phys_pager_mtx
now covers the same block.

Reviewed by:	alc
Approved by:	re (kensmith)
2007-08-18 16:40:33 +00:00
Andrew Thompson
11eeea5e85 If the STP state machine is stopped then clear the bridge-id and root-id.
Approved by:	re (kensmith)
2007-08-18 12:06:13 +00:00
Alexander Motin
3fb87c2411 Add ng_send_fn() error handling inside ng_con_nodes().
Without it some errors may be left unnoticed and unhandled,
which will lead to hooks left in a half-connected state.

Reviewed by:	julian@
Approved by:	re (kensmith), glebius (mentor)
2007-08-18 11:59:17 +00:00
Warner Losh
eb2e7f82ff Don't pass RB_BOOTINFO to the kernel. There's no bootinfo actually
passed into the kernel, and the kernel will soon grow that ability on
arm.

Approved by: re@ (blanket)
2007-08-17 18:22:31 +00:00
Kip Macy
7aff6d8ed3 forward port signedness fixes from RELENG_6
fix compile error for case where MSI_SUPPORTED not defined

Approved by: re (blanket)
2007-08-17 05:57:04 +00:00
Hidetoshi Shimokawa
ff038e3a82 We don't need to call dcons_poll event handlers if KDB is not active.
Approved by: re (kensmith)
2007-08-17 05:32:39 +00:00
Pawel Jakub Dawidek
70eaa4219c Some ZFS threads need a stack larger than the default 8kB, so use 16kB of
alternate stack if the default is smaller than 16kB.

Approved by:	re (rwatson)
2007-08-16 20:33:20 +00:00
Xin LI
1f32d0127b MFp4: rework tmpfs_readdir() logic in terms of correctness.
Approved by:	re (tmpfs blanket)
Tested with:	fstest, fsx
2007-08-16 11:00:07 +00:00
David Xu
6ec46f7aa8 Regenerate.
Approved by: re(kensmith)
2007-08-16 05:32:26 +00:00
David Xu
81ca5b4257 Add thr_kill2 compat32 syscall.
Submitted by: Tijl Coosemans tijl at ulyssis dot org
Approved by: re (kensmith)
2007-08-16 05:30:04 +00:00
David Xu
0b1f0611b4 Add thr_kill2 syscall which sends a signal to a thread in another process.
Submitted by: Tijl Coosemans tijl at ulyssis dot org
Approved by: re (kensmith)
2007-08-16 05:26:42 +00:00
Randall Stewart
2dad8a55be - Remove extra comment for 7.0 (no GIANT here).
- Remove unneeded WLOCK/UNLOCK of inp for getting TCB lock.
- Fix panic that may occur when freeing an assoc that has partial
  delivery in progress (may dereference null socket pointer when
  queuing partial delivery aborted notification)
- Some spacing and comment fixes.
- Fix address add handling to clear cached routes and source addresses
  when peer acks the add in case the routing table changes.
Approved by:	re@freebsd.org (Bruce Mah)
2007-08-16 01:51:22 +00:00
Qing Li
8cb5ba02d8 Use the sequence number comparison macro to compare
projected_offset against isn_offset to account for
wrap around.

Reviewed by:	gnn, kmacy, silby
Submitted by:	yusheng.huang@bluecoat.com
Approved by:	re
MFC:		3 days
2007-08-16 01:35:55 +00:00
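Note on 8cb5ba02d8: a minimal sketch of why a wrap-aware comparison matters for 32-bit sequence numbers. It assumes the usual signed-difference form of such a macro; seq_lt() and naive_lt() are hypothetical names, and the conversion of the difference to int32_t relies on two's complement behaviour.

```c
#include <stdint.h>

/* Wrap-aware "a is before b" for 32-bit sequence numbers (assumed form). */
static int
seq_lt(uint32_t a, uint32_t b)
{
	return ((int32_t)(a - b) < 0);
}

/* Naive comparison, shown only for contrast. */
static int
naive_lt(uint32_t a, uint32_t b)
{
	return (a < b);
}
```

For a = 0xfffffff0 and b = 0x00000010 the counter has wrapped, so a precedes b: seq_lt() reports that, while the naive comparison gets it backwards.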
Dag-Erling Smørgrav
83d18f2283 Add a driver for the on-die digital thermal sensor found on Intel Core
and newer CPUs (including Core 2 and Core / Core 2 based Xeons).  The
driver attaches to each cpu device and creates a sysctl node in that
device's sysctl context (dev.cpu.N.temperature).  When invoked, the
handler binds to the appropriate CPU to ensure a correct reading.

Submitted by:	Rui Paulo <rpaulo@fnop.net>
Sponsored by:	Google Summer of Code 2007
Tested by:	des, marcus, Constantine A. Murenin, Ian FREISLICH
Approved by:	re (kensmith)
MFC after:	3 weeks
2007-08-15 19:26:03 +00:00
John Baldwin
1dc5b1cc56 On 6.x this works:
% mount | grep home
/dev/ad4s1e on /home (ufs, local, noatime, soft-updates)
% mount -u -o atime /home
% mount | grep home
/dev/ad4s1e on /home (ufs, local, soft-updates)

Restore this behavior on 7.x for the following mount options:
noatime, noclusterr, noclusterw, noexec, nosuid, nosymfollow

In addition, on 7.x, the following are equivalent:
mount -u -o atime /home
mount -u -o nonoatime /home

Ideally, when we introduce new mount options, we should avoid
options starting with "no". :)

Requested by:	jhb
Reported by:	Karol Kwiat <karol.kwiat gmail com>, Scott Hetzel <swhetzel gmail com>
Approved by:	re (bmah)
Proxy commit for:	rodrigc
2007-08-15 17:40:09 +00:00
Scott Long
9adc3a2dfb Move callout initialization to the proper spot. This prevents panics during
error recovery.

Approved by: re
Found by: kan
2007-08-14 19:17:35 +00:00
Pyun YongHyeon
c4aca09a2a Make sure to take PHY out of power down mode in device attach.
Without this the PHY wouldn't work as expected. This should fix
dual-boot Windows XP machines where RealTek Windows drivers put the
PHY in power down mode during shutdown. The magic PHY register
accesses come from the RealTek driver. No datasheets mention the magic
PHY registers.
In general, the PHY wakeup code should go into the PHY driver. However, it
seems that it only applies to the RTL8169S single chip and it would be
another hack if we have rgephy(4) check what parent driver/chip model
is attached.

Reported by:	lofi, Laurens Timmermans ( laurens AT timkapel DOT nl )
Tested by:	lofi
Obtained from:	RealTek FreeBSD driver
Approved by:	re (Ken Smith)
2007-08-14 02:00:04 +00:00
Pawel Jakub Dawidek
354eb80141 Improve vn_printf() by:
- adding missing vnode flags,
- printing unknown flags as numbers,
- using strlcat() instead of strcat().

Approved by:	re (bmah)
2007-08-13 21:23:30 +00:00
John Baldwin
cde586a75c Fix a few nits relative to the previous changes:
- Don't leak the config lock if detach() fails due to the controller char
  dev being open.
- Close a race between detach() and a process opening the controller char
  dev.

MFC after:	1 week
Approved by:	re (bmah)
2007-08-13 21:14:16 +00:00
John Baldwin
8ec5c98ba4 Teach the mfi(4) driver to handle requests from userland management
applications to add and remove volumes.

MFC after:	1 week
Approved by:	re (bmah)
Reviewed by:	ambrisko, scottl
2007-08-13 19:29:17 +00:00
Dag-Erling Smørgrav
438dafbbcf Update to support ICH[678] chipsets (based on a patch by Takeharu KATO)
Fix a resource allocation bug (explained by jhb on -acpi)
Thanks to Mike Tancsa for testing and helping track down the bug.

Approved by:	re (kensmith)
MFC after:	3 weeks
2007-08-13 18:52:37 +00:00
John Baldwin
14657ee81f Expand the data structure returned by the ATA RAID status ioctl to include
detailed status on each of the backing subdisks.  This allows userland
to see which subdisks are online, failed, missing, or a hot spare.

MFC after:	1 week
Approved by:	re (bmah)
Reviewed by:	sos
2007-08-13 18:46:31 +00:00
Maksim Yevmenkin
51713b2a7b Make ng_h4(4) MPSAFE. Use a locking strategy similar to ng_tty(4).
Reconnect ng_h4(4) back to the build.

Reviewed by:	kensmith
Approved by:	re (kensmith)
MFC after:	1 month
2007-08-13 17:19:28 +00:00
Don Lewis
4d54b88811 Replace three copies of the host controller reset sequence that
differ in their details with calls to a new function, ehci_hcreset(),
that performs the reset.

The original sequences either had no delay or a 1ms delay between
telling the controller to stop and asserting the controller reset
bit.  One instance of the original reset sequence waited for the
controller to indicate that its reset was complete before continuing,
but the other two immediately let the subsequent code execute.  The
latter is a problem on some hardware, because a read of the HCCPARAMS
register returns an incorrect value while the reset is in progress,
which triggers an infinite loop in ehci_pci_givecontroller(), which
hangs the system on shutdown.

The reset sequence in ehci_hcreset() starts with the most complete
instance from the original code, which contains a loop to wait for
the controller to indicate that its reset is complete.   This appears
to be the correct thing to do according to "Enhanced Host Controller
Interface Specification for Universal Serial Bus" revision 1.0,
section 2.3.1.  Add another loop to wait for the controller to
indicate that it has stopped before setting the HCRESET bit.  This
is required by the section 2.3.1 in the specification, which says
that setting HCRESET before the controller has halted "will result
in undefined behaviour".

Reviewed by:	imp (previous patch version without the extra wait loop)
Tested by:	se  (previous patch version without the extra wait loop)
Approved by:	re (bmah)
MFC after:	1 week
2007-08-12 18:45:24 +00:00
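Note on 4d54b88811: the ordering described in the message (stop, wait for halt, assert reset, wait for reset to complete) can be sketched against a simulated controller. Everything here is illustrative: struct fake_hc, read_sts(), read_cmd() and hcreset_sketch() are hypothetical, the register bit values follow the EHCI USBCMD/USBSTS layout as an assumption, and returning an error on timeout is a choice made for the sketch rather than the driver's behaviour.

```c
#include <stdint.h>

#define CMD_RS		0x00000001u	/* USBCMD Run/Stop */
#define CMD_HCRESET	0x00000002u	/* USBCMD Host Controller Reset */
#define STS_HCHALTED	0x00001000u	/* USBSTS HCHalted */

/* Hypothetical simulated controller: counters say how many polls each step takes. */
struct fake_hc {
	uint32_t cmd;
	uint32_t sts;
	int polls_until_halted;
	int polls_until_reset_done;
};

/* Simulated status read: reports HCHalted some polls after Run/Stop is cleared. */
static uint32_t
read_sts(struct fake_hc *hc)
{
	if ((hc->cmd & CMD_RS) == 0) {
		if (hc->polls_until_halted > 0)
			hc->polls_until_halted--;
		else
			hc->sts |= STS_HCHALTED;
	}
	return (hc->sts);
}

/* Simulated command read: HCRESET self-clears some polls after being set. */
static uint32_t
read_cmd(struct fake_hc *hc)
{
	if ((hc->cmd & CMD_HCRESET) != 0) {
		if (hc->polls_until_reset_done > 0)
			hc->polls_until_reset_done--;
		else
			hc->cmd &= ~CMD_HCRESET;
	}
	return (hc->cmd);
}

/* Returns 0 on success, -1 if either wait loop times out. */
static int
hcreset_sketch(struct fake_hc *hc, int max_polls)
{
	int i;

	hc->cmd &= ~CMD_RS;			/* 1. tell the controller to stop */
	for (i = 0; i < max_polls; i++)		/* 2. wait until it reports halted */
		if ((read_sts(hc) & STS_HCHALTED) != 0)
			break;
	if (i == max_polls)
		return (-1);
	hc->cmd |= CMD_HCRESET;			/* 3. only now assert reset */
	for (i = 0; i < max_polls; i++)		/* 4. wait for reset to complete */
		if ((read_cmd(hc) & CMD_HCRESET) == 0)
			return (0);
	return (-1);
}
```

The point of the second loop is that code following the reset never runs while the reset is still in progress, which is the condition the message blames for the bad HCCPARAMS read.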
Marcel Moolenaar
77d40ffd98 Revamp the interrupt handling in support of INTR_FILTER. This includes:
o  Revamp the PIC I/F to only abstract the PIC hardware. The
   resource handling has been moved to nexus, where it belongs.
o  Include EOI and MASK+EOI methods to the PIC I/F in support of
   INTR_FILTER.
o  With the allocation of interrupt resources and setup of
   interrupt handlers in the common platform code we can delay
   talking to the PIC hardware after enumeration of all devices.
   Introduce a call to powerpc_intr_enable() in configure_final()
   to achieve that and have powerpc_setup_intr() only program the
   PIC when !cold.
o  As a consequence of the above, remove all early_attach() glue
   from the OpenPIC and Heathrow PIC drivers and have them
   register themselves when they're found during enumeration.
o  Decouple the interrupt vector from the interrupt request line.
   Allocate vectors increasingly so that they can be used for
   the intrcnt index as well. Extend the Heathrow PIC driver to
   translate between IRQ and vector. The OpenPIC driver already
   has the support for vectors in hardware.

Approved by: re (blanket)
2007-08-11 19:25:32 +00:00
Kip Macy
93cccbf874 White space cleanups
Approved by: re (blanket)
2007-08-10 23:47:39 +00:00
Kip Macy
6b68e276ce - In all structures other than port info, port is a pointer to a port info;
  make the code less confusing by renaming the port number to port_id

Approved by: re (blanket)
2007-08-10 23:33:34 +00:00
Xin LI
ad3638ee08 MFp4:
- LK_RETRY prevents vget() and vn_lock() from returning an error.
  Remove associated code. [1]
- Properly use vhold() and vdrop() instead of their unlocked
  versions, we are guaranteed to have the vnode's interlock
  unheld. [1]
- Fix a pseudo-infinite loop caused by 64/32-bit arithmetic
  in the same way as modern NetBSD versions. [2]
- Reorganize tmpfs_readdir to reduce duplicated code.

Submitted by:	kib [1]
Obtained from:	NetBSD [2]
Approved by:	re (tmpfs blanket)
2007-08-10 11:00:30 +00:00
Xin LI
0ae6383d39 MFp4:
- Respect cnflag and don't lock vnode always as LK_EXCLUSIVE [1]
- Properly lock around tn_vnode to avoid NULL dereference
- Be more careful handling vnodes (*)

(*) This is a WIP
[1] by pjd via howardsu

Thanks kib@ for his valuable VFS related comments.

Tested with:	fsx, fstest, tmpfs regression test set
Found by:	pho's stress2 suite
Approved by:	re (tmpfs blanket)
2007-08-10 05:24:49 +00:00
Nate Lawson
3b3f28135f Add "show sysregs" command to ddb. On i386, this gives gdt, idt, ldt,
cr0-4, etc.  Support should be added for other platforms that have a
different set of registers for system use.

Loosely based on: OpenBSD
Approved by:	re
2007-08-09 20:14:35 +00:00
Tai-hwa Liang
c7f6197937 MFP4(123963): Fixing a possible NULL pointer dereference by making
the actual assignment after the NULL check.

Found by:	Coverity Prevent(tm)
CID:		2303 (run 4156)
Reviewed by:	sam
Approved by:	re (bmah)
2007-08-09 13:29:26 +00:00
Warner Losh
4ced8fb56a Use the .S version for now. I have a version optimized for size in p4,
but I'm unsure of its provenance, so rather than add it here, revert
the migration to it.

Approved by: re@ (blanket)
2007-08-09 05:16:55 +00:00
Warner Losh
d8e3f30539 Merge in the AX88178 and AX88772 register definitions (along with
the rename) from OpenBSD.  This also dribbles in a few fields from
OpenBSD.

Approved by: re@ (blanket)
Obtained from: OpenBSD
2007-08-09 04:40:07 +00:00
Marcel Moolenaar
69fc43c03b Compile ipfilter:ip_lookup.c without -Werror. The file contains
a test that assumes that char is signed by default and causes a
warning with GCC 4.2 on PowerPC.
A patch has been sent to the maintainer that addresses this.

Approved by: re (blanket)
2007-08-09 01:11:21 +00:00
Marcel Moolenaar
b66623109d Re-enable -Werror for PowerPC. This should really be unconditional again.
Approved by: re (blanket)
2007-08-08 19:12:06 +00:00
Olivier Houchard
4739da977b Ooops, we need to define TD_LOCK here.
Approved by:	re (blanket)
Pointy hat to:	cognet
2007-08-08 09:27:52 +00:00
Marcel Moolenaar
fc37ccb390 Re-enable external interrupts for faults, traps and syscalls.
Approved by: re (blanket)
2007-08-08 01:19:12 +00:00
Marcel Moolenaar
4f5d8660e5 Eliminate <machine/interruptvar.h> as it has only a single
prototype. In the future that prototype will not be needed
at all anyway, but for now it's moved to intr_machdep.h.

Approved by: re (blanket)
2007-08-07 23:33:35 +00:00
Marcel Moolenaar
0201e3e97b Remove redundant prototype.
Approved by: re (blanket)
2007-08-07 18:40:02 +00:00
Marcel Moolenaar
ad9503cd37 Add prototype for trap().
Approved by: re (blanket)
2007-08-07 18:39:28 +00:00
Olivier Houchard
f7b55b6053 Add cast to silence gcc warnings.
Approved by:	re (blanket)
2007-08-07 18:37:21 +00:00
Olivier Houchard
362a46e4f6 Use the third argument of cpu_switch(), as done for i386/amd64, as it is
required for ULE.

Approved by:	re (blanket)
2007-08-07 18:20:55 +00:00
Konstantin Belousov
deea654ebf Protect the creation of the device pager with the dev_pager_mtx. Lookup
of device pager in the pagers list by handle is now synchronized with
its removal from the list, and dev_pager_mtx is put before vm object
lock in lock order. Dispose of the dev_pager_sx lock, since dev_pager_mtx
now covers the same block.

Noted by:	kensmith
Reviewed by:	alc
Approved by:	re (kensmith)
2007-08-07 15:36:25 +00:00
Tai-hwa Liang
07b6a9bed8 MFP4(123687): Closing another LOR by dropping the driver lock around calls
to if_input().

Reviewed by:	ambrisko
Tested by:	dhw
Approved by:	re (kensmith)
2007-08-07 12:26:19 +00:00
Bruce Evans
a4e6807c49 In msdosfs_read() and msdosfs_write(), don't check explicitly for
(uio_offset < 0) since this can't happen.  If this happens, then the
general code handles the problem safely (better than before for reading,
returning 0 (EOF) instead of the bogus errno EINVAL, and the same as
before for writing, returning EFBIG).

In msdosfs_read(), don't check for (uio_resid < 0).  msdosfs_write()
already didn't check.

In msdosfs_read(), document in a comment our assumptions that the caller
passed a valid uio_offset and uio_resid.  ffs checks using KASSERT(),
and that is enough sanity checking.  In the same comment, partly document
that there is no need to check for the EOVERFLOW case, unlike in ffs where this
case can happen at least in theory.

In msdosfs_write(), add a comment about why the checking of
(uio_resid == 0) is explicit, unlike in ffs.

In msdosfs_write(), check for impossibly large final offsets before
checking if the file size rlimit would be exceeded, so that we don't
have an overflow bug in the rlimit check and are consistent with ffs.
We now return EFBIG instead of EFBIG plus a SIGXFSZ signal if the final
offset would be impossibly large but not so large as to cause overflow.
Overflow normally gave the benign behaviour of no signal.

Approved by:	re (kensmith) (blanket)
2007-08-07 10:35:27 +00:00
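Note on a4e6807c49: a minimal sketch of a final-offset check that cannot itself overflow, in the spirit of the msdosfs_write() change. SKETCH_MAX_FILESIZE and final_offset_too_large() are hypothetical; the limit of 0xffffffff is an assumption standing in for the file system's maximum file size, and the inputs are assumed non-negative as the message says the caller guarantees.

```c
#include <stdint.h>

/* Hypothetical limit standing in for the file system's maximum file size. */
#define SKETCH_MAX_FILESIZE	INT64_C(0xffffffff)

/*
 * Returns 1 if writing 'resid' bytes at 'offset' would end past the
 * maximum file size.  Assumes offset >= 0 and resid >= 0.  The sum
 * offset + resid is never formed, so the test itself cannot overflow.
 */
static int
final_offset_too_large(int64_t offset, int64_t resid)
{
	if (offset > SKETCH_MAX_FILESIZE)
		return (1);
	return (resid > SKETCH_MAX_FILESIZE - offset);
}
```

Doing this test first means a later comparison of the final offset against the file size rlimit only ever sees values that fit, so the rlimit check is not exposed to overflow.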
Konstantin Belousov
004e08be60 Do not call free() while holding vnode interlock.
Reported and tested by:	Peter Holm
Reviewed by:	jeff
Approved by:	re (kensmith)
2007-08-07 09:04:50 +00:00
Bruce Evans
b7837a91c9 Fix and update the comments about the effect of the read-only flag on writing.
They are still too verbose.

Remove nearby unreachable code for handling symlinks.

Approved by:	re (kensmith) (blanket)
2007-08-07 05:42:10 +00:00
Bruce Evans
e3117f852e Fix some style bugs (don't assume that off_t == int64_t; fix some comments;
remove some parentheses; fix some whitespace errors; fix only one case of
a boolean comparison of a non-boolean).

Improve an error message by quoting ".", and by not printing large positive
values as negative ones.

Approved by:	re (kensmith) (blanket)
2007-08-07 03:59:49 +00:00
Bruce Evans
c0f5121cac Fix some style bugs (don't assume that off_t == int64_t; fix some comments;
remove some parentheses; fix only a couple of whitespace errors).

Approved by:	re (kensmith) (blanket)
2007-08-07 03:43:28 +00:00
Bruce Evans
2d7c6b2724 Fix some style bugs (mainly some whitespace errors).
Approved by:	re (kensmith) (blanket)
2007-08-07 03:38:36 +00:00
Bruce Evans
b6d0381e7e Fix some style bugs (some whitespace errors only).
Approved by:	re (kensmith) (blanket)
2007-08-07 03:22:10 +00:00
Bruce Evans
d2bb66bacd Sort includes.
Remove rotted banal comment attached to includes.

Approved by:	re (kensmith) (blanket)
2007-08-07 02:28:33 +00:00
Bruce Evans
6becd1c855 Sort includes.
Remove banal comments attached to includes.

Approved by:	re (kensmith) (blanket)
2007-08-07 02:27:35 +00:00
Bruce Evans
5696c6e0b2 Sort includes.
Remove banal comments before includes.  Remove rotted banal comments attached
to includes.

Approved by:	re (kensmith) (blanket)
2007-08-07 02:20:37 +00:00
Bruce Evans
9b0802c90b Remove unused include(s).
Remove banal comments before includes.

Approved by:	re (kensmith) (blanket)
2007-08-07 02:11:16 +00:00
Bruce Evans
a878a31c13 Remove unused include(s).
Approved by:	re (kensmith) (blanket)
2007-08-07 02:08:06 +00:00
Bruce Evans
eba34270fa Include <sys/mutex.h> and its prerequisite <sys/lock.h> instead of
depending on namespace pollution in <sys/buf.h> and/or <sys/vnode.h>

Approved by:	re (kensmith) (blanket)
2007-08-07 01:40:27 +00:00
Bruce Evans
1103771d95 Include <sys/mutex.h>'s prerequisite <sys/lock.h> instead of depending on
namespace pollution in <sys/vnode.h>.

Sort the include of <sys/mutex.h> instead of unsorting it after
<sys/vnode.h> and depending on the pollution there.

Approved by:	re (kensmith) (blanket)
2007-08-07 01:37:59 +00:00
Bruce Evans
6fd81fc7a6 Remove unused include(s).
Approved by:	re (kensmith) (blanket)
2007-08-07 01:07:16 +00:00
Christian S.J. Peron
b244c8ad14 Over the past couple of years, there have been a number of reports relating
the use of divert sockets to deadlocks.  A number of LORs have been reported
between divert and a number of other network subsystems including: IPSEC, Pfil,
multicast, ipfw and others.  Other deadlocks could occur because of recursive
entry into the IP stack.  This change should take care of most if not all of
these issues.

A summary of the changes follows:

- We disallow multicast operations on divert sockets.  It really doesn't make
  semantic sense to allow this, since typically you would set multicast
  parameters on multicast end points.

  NOTE: As a part of this change, we actually disallow multicast options on
  any socket that IS a divert socket OR IS NOT a SOCK_RAW or SOCK_DGRAM family

- We check to see if there are any socket options that have been specified on
  the socket, and if there are (which is very uncommon and also probably
  doesn't make sense to support) we duplicate the mbuf carrying the options.

- We then drop the INP/INFO locks over the call to ip_output().  It should be
  noted that since we no longer support multicast operations on divert sockets
  and we have duplicated any socket options, we no longer need the reference
  to the pcb to be coherent.

- Finally, we replaced the call to ip_input() to use netisr queuing.  This
  should remove the recursive entry into the IP stack from divert.

By dropping the locks over the call to ip_output() we eliminate all the lock
ordering issues above.  By switching over to netisr on the inbound path,
we can no longer recursively enter the ip_input() code via divert.

I have tested this change by using the following command:

ipfwpcap -r 8000 - | tcpdump -r - -nn -v

This should exercise the input and re-injection (outbound) path, which is
very similar to the work load performed by natd(8).  Additionally, I have
run some ospf daemons which have a heavy reliance on raw sockets and
multicast.

Approved by:	re@ (kensmith)
MFC after:	1 month
LOR:		163
LOR:		181
LOR:		202
LOR:		203
Discussed with:	julian, andre et al (on freebsd-net)
In collaboration with:	bms [1], rwatson [2]

[1] bms helped out with the multicast decisions
[2] rwatson submitted the original netisr patches and came up with some
    of the original ideas on how to combat this issue.
2007-08-06 22:06:36 +00:00
Randall Stewart
63981c2b40 - change number assignments for SHA224-512 (match artisync
  for bakeoff.. using the next sequential ones)
- In cookie processing 1-2-1, we did not increment the stcb
  refcnt before releasing the tcb lock. We need to do this
  to keep the tcb from being freed by an abort or ?? unlikely
  but worth doing. Also get rid of unneeded INP_WLOCK.
- extra receive info included the rcvinfo which killed the
  padding/alignment. We now redefine all the fields properly
  so they both align properly to 128 bytes.
- A peeled off socket would not close without an error due to
  its misguided idea that sctp_disconnect() was not supported
  on it. This fixes it so it goes through the proper path.
- When an assoc was being deleted after abort (via a timer) a
  small race condition exists where we might take a packet for
  the old assoc (since we are waiting for a cleanup timer). This
  state especially happens in mac. We now add a state in the asoc
  so these can properly handle the packet as OOTB.
Approved by:	re@freebsd.org(Ken Smith)
2007-08-06 15:46:46 +00:00
Robert Watson
0bf686c125 Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which
previously conditionally acquired Giant based on debug.mpsafenet.  As that
has now been removed, they are no longer required.  Removing them
significantly simplifies error-handling in the socket layer, eliminating
quite a bit of unwinding of locking in error cases.

While here clean up the now unneeded opt_net.h, which previously was used
for the NET_WITH_GIANT kernel option.  Clean up some related gotos for
consistency.

Reviewed by:	bz, csjp
Tested by:	kris
Approved by:	re (kensmith)
2007-08-06 14:26:03 +00:00
Marcel Moolenaar
ec2af96ad1 Clear pending interrupts before we enable external interrupts.
Recently the AP in my Merced box seems to have grown a habit
of getting unexpected interrupts, such as redundant wake-ups
and legacy interrupts that require an INTA cycle.

While here, replace DELAY(0) with cpu_spinwait() so that it's
clear what we're doing as well as enable the code to take
advantage of cpu_spinwait() when it gets implemented.

Approved by: re (blanket)
2007-08-06 05:15:57 +00:00
Marcel Moolenaar
78afae27e5 Keep interrupts disabled while handling external interrupts.
There's no advantage in allowing nested external interrupts.
In fact, it leads to a potential stack overrun.

While here, put the interrupt vector in the trapframe, so as
to compensate for the 36 cycle latency of reading cr.ivr.

Further simplify assembly code by dealing with ASTs from C.

Approved by: re (blanket)
2007-08-06 05:11:01 +00:00
Alan Cox
b5e8f167b9 Consider a scenario in which one processor, call it Pt, is performing
vm_object_terminate() on a device-backed object at the same time that
another processor, call it Pa, is performing dev_pager_alloc() on the
same device.  The problem is that vm_pager_object_lookup() should not be
allowed to return a doomed object, i.e., an object with OBJ_DEAD set,
but it does.  In detail, the unfortunate sequence of events is: Pt in
vm_object_terminate() holds the doomed object's lock and sets OBJ_DEAD
on the object.  Pa in dev_pager_alloc() holds dev_pager_sx and calls
vm_pager_object_lookup(), which returns the doomed object.  Next, Pa
calls vm_object_reference(), which requires the doomed object's lock, so
Pa waits for Pt to release the doomed object's lock.  Pt proceeds to the
point in vm_object_terminate() where it releases the doomed object's
lock.  Pa is now able to complete vm_object_reference() because it can
now complete the acquisition of the doomed object's lock.  So, now the
doomed object has a reference count of one!  Pa releases dev_pager_sx
and returns the doomed object from dev_pager_alloc().  Pt now acquires
dev_pager_mtx, removes the doomed object from dev_pager_object_list,
releases dev_pager_mtx, and finally calls uma_zfree with the doomed
object.  However, the doomed object is still in use by Pa.

Repeating my key point, vm_pager_object_lookup() must not return a
doomed object.  Moreover, the test for the object's state, i.e.,
doomed or not, and the increment of the object's reference count
should be carried out atomically.

Reviewed by:	kib
Approved by:	re (kensmith)
MFC after:	3 weeks
2007-08-05 21:04:32 +00:00
Marcel Moolenaar
e54994f990 In ia64_set_rr(), don't perform data serialization. This allows
us to do the data serializations once after writing multiple
region registers, as is done in pmap_switch(). All existing
calls to ia64_set_rr() are followed with calls to ia64_srlz_d().

Approved by: re (blanket)
2007-08-05 18:19:38 +00:00
Bjoern A. Zeeb
cc977adc71 Rename option IPSEC_FILTERGIF to IPSEC_FILTERTUNNEL.
Also rename the related functions in a similar way.
There are no functional changes.

For a packet coming in with IPsec tunnel mode, the default is
to only call into the firewall with the "outer" IP header and
payload.

With this option turned on, in addition to the "outer" parts,
the "inner" IP header and payload are passed to the
firewall too when going through ip_input() the second time.

The option was never only related to a gif(4) tunnel within
an IPsec tunnel and thus the name was very misleading.

Discussed at:			BSDCan 2007
Best new name suggested by:	rwatson
Reviewed by:			rwatson
Approved by:			re (bmah)
2007-08-05 16:16:15 +00:00
Bruce Evans
8d61a735c6 Silently fix up the estimated next free cluster number from the fsinfo
sector, instead of failing the whole mount if it is garbage.  Fields
in the fsinfo sector are only advisory, so there are better sanity
checks than this, and we already silently fix up the only other advisory
field in the fsinfo (the free cluster count).

This wasn't handled quite right in rev.1.92, 1.117, or in NetBSD.  1.92
also failed the whole mount for the non-garbage magic value 0xffffffff.
1.117 fixed this well enough in practice since garbage values shouldn't
occur in practice, but left the error handling larger and more convoluted
than necessary.  Now we handle the magic value as a special case of
fixing up all out of bounds values.

Also fix up the estimated next free cluster number when there is no
fsinfo sector.  We were using 0, but CLUST_FIRST is safer.
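
A sketch of the fixup described above, under stated assumptions: fixup_next_free() and max_cluster are hypothetical names, and falling back to CLUST_FIRST for every out-of-bounds hint is an assumption (the message only says CLUST_FIRST is used when there is no fsinfo sector). The magic value 0xffffffff is then just one more out-of-bounds case.

```c
#include <assert.h>
#include <stdint.h>

#define CLUST_FIRST 2u  /* first usable FAT cluster number */

/*
 * Sanitize the advisory "next free cluster" hint from the fsinfo sector.
 * max_cluster is the highest valid cluster number on the volume.
 * Any out-of-bounds hint is replaced instead of failing the mount.
 */
uint32_t
fixup_next_free(uint32_t hint, uint32_t max_cluster)
{
    assert(max_cluster >= CLUST_FIRST);
    if (hint < CLUST_FIRST || hint > max_cluster)
        return (CLUST_FIRST);
    return (hint);
}
```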

Approved by:	re (kensmith)
2007-08-05 12:58:34 +00:00
Marius Strobl
6bbb5a106c - Divorce the IOTSBs, which so far were handled via a global list
  instead of per IOMMU, so we no longer need to program all of them
  identically in systems having multiple IOMMUs. This continues the
  rototilling of the nexus(4) done about 5 months ago, which amongst
  others changed nexus(4) and the drivers for host-to-foo bridges
  to provide bus_get_dma_tag methods, allowing to handle DMA tags in
  a hierarchical way and to link them with devices.
  This still doesn't move the silicon bug workarounds for Sabre (and
  in the uncommitted schizo(4) for Tomatillo) bridges into special
  bus_dma_tag_create() and bus_dmamap_sync() methods though, as w/o
  fully newbus'ified bus_dma_tag_create() and bus_dma_tag_destroy()
  this still requires too much hackery, i.e. per-child parent DMA
  tags in the parent driver.
- Let the host-to-foo drivers supply the maximum physical address
  of the IOMMU accompanying the bridges. Previously iommu(4) hard-
  coded an upper limit of 16GB, which actually only applies to the
  IOMMUs of the Hummingbird and Sabre bridges. The Psycho variants
  as well as the U2S in fact can translate to up to 2TB, i.e.
  translate to 41-bit physical addresses. According to the recently
  available Tomatillo documentation these bridges even translate to
  43-bit physical addresses, and the documentation hints at the Schizo
  bridges doing 43 bits as well.
  This fixes the issue the FreeBSD 6.0 todo list item "Max RAM on
  sparc64" was referring to and pretty much obsoletes the lack of
  support for bounce buffers on sparc64.

Thanks to Nathan Whitehorn for pointing me at the Tomatillo manual.

Approved by:	re (kensmith)
2007-08-05 11:56:44 +00:00
Marius Strobl
82a67a70a2 o In order to reduce bug and code duplication fold handling of NICs
  requiring DC_TX_ALIGN or DC_TX_COALESCE, which was previously done
  in dc_start_locked(), into dc_encap().
o In dc_encap():
  - If m_defrag() fails just drop the packet like other NIC drivers
    do. This should only happen when there's a mbuf shortage, in which
    case it was possible to end up with an IFQ full of packets which
    couldn't be processed as they couldn't be defragmented as they
    were taking up all the mbufs themselves. This includes adjusting
    dc_start_locked() to not try to prepend the mbuf (chain) if
    dc_encap() has freed it.
  - Likewise, if bus_dmamap_load_mbuf() fails as dc_dma_map_txbuf()
    failed, free the mbuf possibly allocated by the above call to
    m_defrag() and drop the packet.
o In dc_txeof():
  - Don't clear IFF_DRV_OACTIVE unless there are at least 6 free TX
    descriptors. Further down the road dc_encap() will bail if there
    are only 5 or fewer free TX descriptors, causing dc_start_locked()
    to abort and prepend the dequeued mbuf again so it makes no sense
    to pretend we could process mbufs again when in fact we won't.
    While at it replace this magic 5 with a macro DC_TX_LIST_RSVD.
  - Just always assign idx to sc->dc_cdata.dc_tx_cons; it doesn't
    make much sense to exclude the idx == sc->dc_cdata.dc_tx_cons
    case.
o In dc_dma_map_txbuf() there's no need to set sc->dc_cdata.dc_tx_err
  to error if the latter is != 0, bus_dmamap_load_mbuf() already
  returns the same error value in that case anyway.
o For less overhead, convert to use bus_dmamap_load_mbuf_sg() for
  loading RX buffers.
o Remove some banal and/or outdated comments.

Approved by:	re (kensmith)
MFC after:	1 week
2007-08-05 11:28:19 +00:00
Marius Strobl
9282563532 Initialize the rl_vlanctl field of the descriptors to zero (in order
to clear RL_TDESC_VLANCTL_TAG). This fixes sending packets in the
native VLAN when running both tagged and an untagged VLAN over the
same trunk and descriptors are recycled.

Approved by:	re (kensmith)
MFC after:	1 week
2007-08-05 11:20:33 +00:00
Konstantin Belousov
c6199d59e3 Do not acquire Giant unconditionally around the calls to the cdevsw
d_mmap methods. prep_cdevsw() already installs the shims that
acquire/drop Giant for the methods of a driver that specified the
D_NEEDGIANT flag.

Reviewed by:	alc
Approved by:	re (kensmith)
2007-08-05 05:40:52 +00:00
Andrew Thompson
dd04013007 - Ensure the path cost does not exceed 65535 in legacy STP mode.
- If the path cost is calculated when the link is down, set a pending flag so
  it is calculated again when it comes back up.
- To not use 00:00:00:00:00:00 as the bridge id, all interfaces are scanned and
  the lowest number wins. All zeros is too low.
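
The bridge id selection in the last item can be sketched as follows. This is an illustration only, assuming the scan compares the 6-byte Ethernet addresses numerically; struct mac and lowest_nonzero_mac() are hypothetical names, not the bridge code's own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN 6

/* Hypothetical wrapper around one Ethernet address. */
struct mac {
    uint8_t b[ETHER_ADDR_LEN];
};

/*
 * Return the numerically lowest address in the array, skipping
 * 00:00:00:00:00:00, or NULL if no usable address exists.
 */
const struct mac *
lowest_nonzero_mac(const struct mac *macs, size_t n)
{
    static const struct mac zero;   /* all-zero address */
    const struct mac *best = NULL;

    assert(macs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (memcmp(macs[i].b, zero.b, ETHER_ADDR_LEN) == 0)
            continue;               /* all zeros is "too low" */
        if (best == NULL ||
            memcmp(macs[i].b, best->b, ETHER_ADDR_LEN) < 0)
            best = &macs[i];
    }
    return (best);
}
```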

Approved by:	re (rwatson)
2007-08-04 21:09:04 +00:00
Marcel Moolenaar
f5a9fc710a Replace "__asm __volatile()" by equivalent support functions from
ia64_cpu.h. This improves readability and consistency and aids in
auditing the code.
Add instruction-serialization after writing to cr.pta.

Delay enabling interrupts until after we setup the clocks and after
we program the task priority register.

Approved by: re (blanket)
2007-08-04 19:52:10 +00:00
Marcel Moolenaar
7c31469f67 Replace "__asm __volatile()" by equivalent support functions from
ia64_cpu.h. This improves readability and consistency and aids in
auditing the code.
Add data-serialization after writing to the region registers and
add instruction-serialization after writing to cr.pta.

Approved by: re (blanket)
2007-08-04 19:36:14 +00:00
Marcel Moolenaar
09363c3636 Replace "__asm __volatile()" by equivalent support functions from
ia64_cpu.h. This improves readability and consistency and aids in
auditing the code.
Add data-serialization after writing to cr.tpr.

Approved by: re (blanket)
2007-08-04 19:33:27 +00:00
Marcel Moolenaar
9d662e5c9d Add required data-serialization after writing to cr.itm and cr.itv.
Approved by: re (blanket)
2007-08-04 19:28:19 +00:00
Marcel Moolenaar
855218fbd1 Add ia64_srlz_d() and ia64_srlz_i() functions to aid in serialization.
Approved by: re (blanket)
2007-08-04 19:26:42 +00:00
Konstantin Belousov
a045dbb8ae Set D_NEEDGIANT.
Approved by:	phk
Approved by:	re (kensmith)
2007-08-04 17:43:11 +00:00
Jeff Roberson
3a78f9658b - Fix one line that erroneously crept in my last commit.
Approved by:	re
2007-08-04 01:21:28 +00:00
Jeff Roberson
c47f202b45 - Share scheduler locks between hyper-threaded cores to protect the
   tdq_group structure.  Hyper-threaded cores won't really benefit from
   separate locks anyway.
 - Separate out the migration case from sched_switch to simplify the main
   switch code.  We only migrate here if called via sched_bind().
 - When preempted place the preempted thread back in the same queue at
   the head.
 - Improve the cpu group and topology infrastructure.

Tested by:	many on current@
Approved by:	re
2007-08-03 23:38:46 +00:00
Jeff Roberson
413ea6f543 - Set SW_PREEMPT when we preempt in critical_exit().
Approved by:	re
2007-08-03 23:35:35 +00:00
Bruce Evans
3726942956 Oops, fix the fix for the i/o size of the fsinfo block. Its log
message explained why the size is 1 sector, but the code used a
size of 1 cluster.

I/o sizes larger than necessary may cause serious coherency problems
in the buffer cache.  Here I think there were only minor efficiency
problems, since a too-large fsinfo buffer could only get far enough
to overlap buffers for the same vnode (the device vnode), so mappings
are coherent at the page level although not at the buffer level, and
the former is probably enough due to our limited use of the fsinfo
buffer.

Approved by:	re (kensmith)
2007-08-03 23:13:50 +00:00
Xin LI
fb7557140e MFp4 - Refine locking to eliminate some potential race/panics:
- Copy before testing a pointer.  This closes a race window.
 - Use msleep with the node interlock instead of tsleep.
 - Do proper locking around access to tn_vpstate.
 - Assert vnode VOP lock for dir_{atta,de}tach to capture
   inconsistent locking.

Suggested by:	kib
Submitted by:	delphij
Reviewed by:	Howard Su
Approved by:	re (tmpfs blanket)
2007-08-03 06:24:31 +00:00
Peter Wemm
b7778ae08f Move mp_topology() from apic_init(i386) and apic_setup_local(amd64) to
cpu_start_mp().  This is after we have read the cpuid registers to
calculate the hyperthreading_cpus value for the sysctl that enables or
disables hyperthread cores.  Change mp_topology() to use that information
rather than trying to do it itself.

This solves the problem of ULE being incorrectly told that dual core
Athlon64 X2 or Opteron cpus are hyperthreading cores.  At the very least,
we now have a single piece of code to identify hyperthreading.

Obtained from:  jhb
Approved by:  re (kensmith)
2007-08-02 21:17:58 +00:00
Kevin Lo
0d45c918d2 Add the device ID for the VIA CX700 chipset.
Approved by: re (hrs)
2007-08-02 04:29:19 +00:00
Tai-hwa Liang
d28ab8736f MFP4(123686): Fixing various ancontrol(8) related panics by dropping locks
around copyin()/copyout().

Reviewed by:	sam, thompsa
Tested by:	dhw
Approved by:	re (kensmith)
2007-08-02 02:20:19 +00:00
Maksim Yevmenkin
acbfc85b17 Call ttyld_close() in nmdmclose() to ensure that nmdm(4)
closes line discipline installed onto /dev/nmdmX device.

Reviewed by:	julian
Approved by:	re (hrs)
MFC after:	3 days
2007-08-01 21:38:11 +00:00
Alexander Motin
d6fe462ac1 Add 64bit statistic counters to the ng_ppp node.
64bit counters are needed to simplify traffic accounting and
reduce system load at the big PPP concentrators.

Approved by:	re (rwatson), glebius (mentor)
2007-08-01 20:49:35 +00:00
Alexander Motin
e89c150775 This patch improves fine-grained locking for the ng_ppp node.
Until now the node's transmit path was completely unprotected
and so wasn't thread safe in multilink mode. Its receive path was
declared as WRITER as the simplest protection method, but that
reduces performance when compression or encryption is enabled.

Approved by:	re (rwatson), glebius (mentor)
2007-08-01 20:38:37 +00:00
Andrew Thompson
85ce729794 Add a bridge interface flag called PRIVATE where any private port can not
communicate with another private port.

All unicast/broadcast/multicast layer2 traffic is blocked so it works much the
same way as using firewall rules but scales better and is generally easier as
firewall packages usually do not allow ARP blocking.

An example usage would be having a number of customers on separate vlans
bridged with a server network. All the vlans are marked private, they can all
communicate with the server network unhindered, but can not exchange any
traffic whatsoever with each other.

Approved by:	re (rwatson)
2007-08-01 00:33:52 +00:00
Peter Wemm
c4a184bdc4 Change TCPTV_MIN to be independent of HZ. While it was documented to
be in ticks "for algorithm stability" when originally committed, it turns
out that it has a significant impact in timing out connections.  When we
changed HZ from 100 to 1000, this had a big effect on reducing the time
before dropping connections.

To demonstrate, boot with kern.hz=100.  ssh to a box on local ethernet
and establish a reliable round-trip-time (ie: type a few commands).
Then unplug the ethernet and press a key.  Time how long it takes to
drop the connection.

The old behavior (with hz=100) caused the connection to typically drop
between 90 and 110 seconds of getting no response.

Now boot with kern.hz=1000 (default).  The same test causes the ssh session
to drop after just 9-10 seconds.  This is a big deal on a wifi connection.

With kern.hz=1000, change sysctl net.inet.tcp.rexmit_min from 3 to 30.
Note how it behaves the same as when HZ was 100.  Also, note that when
booting with hz=100, net.inet.tcp.rexmit_min *used* to be 30.

This commit changes TCPTV_MIN to be scaled with hz.  rexmit_min should
always be about 30.  If you set hz to Really Slow(TM), there is a safety
feature to prevent a value of 0 being used.
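
The scaling can be sketched like this. It is not the kernel's macro: rexmit_min_ticks() is a hypothetical helper and the 30 ms target is taken from the "about 30" figure above. The floor of one tick stands in for the safety feature mentioned for very low hz.

```c
#include <assert.h>

/*
 * Convert a minimum retransmit timeout given in milliseconds into
 * clock ticks for a given tick rate (hz), never returning 0.
 */
int
rexmit_min_ticks(int hz, int min_ms)
{
    int ticks;

    assert(hz > 0 && min_ms > 0);
    ticks = (int)((long long)hz * min_ms / 1000);
    return (ticks < 1 ? 1 : ticks);
}
```

For a 30 ms minimum this gives 3 ticks at hz=100 and 30 ticks at hz=1000, so the wall-clock minimum stays the same at both rates.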

This may be revised in the future, but for the time being, it restores the
old, pre-hz=1000 behavior, which is significantly less annoying.

As a workaround, to avoid rebooting or rebuilding a kernel, you can run
"sysctl net.inet.tcp.rexmit_min=30" and add "net.inet.tcp.rexmit_min=30"
to /etc/sysctl.conf.  This is safe to run from 6.0 onwards.

Approved by:  re (rwatson)
Reviewed by:  andre, silby
2007-07-31 22:11:55 +00:00
Scott Long
5878cbeccf Make the driver fully MPSAFE. This fixes some serious locking problems
that could cause panics and corruption under moderate load.  Many thanks
to Matt Reimer, Tom McDonald, and the rest of the guys at VPOP.net for
their help in identifying and testing this.

Approved by: re
2007-07-31 20:16:50 +00:00
Scott Long
9ab0fe8075 Fix locking mistakes in the error recovery paths of the AHC and AHD drivers.
Approved by: re
2007-07-31 20:11:03 +00:00
Warner Losh
e8b7ad8c05 Add in all the USB devices and all the wireless goo. The KB9202 has
only USB 1.1 speeds available, but this shouldn't hurt.  Now that we have
working usb support for this board, this is a natural followup.

Approved by: re (kensmith)
2007-07-31 17:45:54 +00:00
Warner Losh
3f0fd37320 Make USB work on the KB9202{,A,B} boards. This has been in p4 for about
7 months.  You must have JP6 in the 1-2 position to supply power to the
USB devices, but I've used uftdi, uplcom and umass successfully.  If you
have it in 2-3, then nothing will show up.  Also, if you have the FQPA
packaging for the AT91RM9200 (like the KN9202 boards have), you will get
the following message

uhub0: device problem (IOERROR), disabling port 2

due to a hardware erratum.  It is safe to ignore as it is about pins that
aren't brought out on the FQPA package and aren't properly terminated either.
Alas, there's no register to read to tell the FQPA from the BGA versions.

Submitted by: Daan Vreeken
Approved by: re (kensmith)
2007-07-31 17:43:18 +00:00
Olivier Houchard
6308183c5d MFppc:
revision 1.66
date: 2007/07/31 06:23:26;  author: marcel;  state: Exp;  lines: +2 -2
Fix backward compatibility of the "old" (i.e. FreeBSD6) lseek
syscall. It was broken when a new lseek syscall was introduced.
The problem is that we need to swap the 32-bit td_retval values
for the __syscall indirect syscall when the actual syscall has
a 32-bit return value. Hence, we need to exclude lseek(2). And
this means the "old" lseek(2) as well -- which we didn't.

Based on a patch from: grehan@

Approved by:	re (blanket)
2007-07-31 17:09:05 +00:00
Marcel Moolenaar
8875aa6621 Fix backward compatibility of the "old" (i.e. FreeBSD6) lseek
syscall. It was broken when a new lseek syscall was introduced.
The problem is that we need to swap the 32-bit td_retval values
for the __syscall indirect syscall when the actual syscall has
a 32-bit return value. Hence, we need to exclude lseek(2). And
this means the "old" lseek(2) as well -- which we didn't.

Based on a patch from: grehan@
Approved by: re (rwatson)
2007-07-31 06:23:26 +00:00
Marcel Moolenaar
789943cc81 Enable -Werror for ia64.
Approved by: re (blanket)
2007-07-31 03:15:32 +00:00
David Christensen
990a2aa530 - Fixed a problem that would cause kernel panics and "bce0: discard frame .."
  errors (especially when jumbo frames are enabled or in low memory systems)
  because the RX chain was corrupted when an mbuf was mapped to an unexpected
  number of buffers.
- Fixed a problem that would cause kernel panics when an excessively
  fragmented TX mbuf couldn't be defragmented and was released by
  bce_tx_encap().

Approved by:	re (hrs)
MFC after:	7 days
2007-07-31 00:06:04 +00:00
Marcel Moolenaar
cf681ceef5 o Switch to physical addressing before dereferencing the VHPT
  bucket pointer. The virtual mapping may not be present in the
  translation cache. This will result in a nested TLB fault at
  a place we don't handle (and don't want to handle).
o Make sure there's a stop after the rfi instruction, otherwise
  its behaviour is undefined.
o Make sure we switch back to virtual addressing before doing
  a rfi. Behaviour is undefined otherwise.

Approved by: re (blanket)
2007-07-30 22:52:52 +00:00
Marcel Moolenaar
ea5e2a02af Add option EXCEPTION_TRACING, which enables KTR-like functionality
for processor interruptions. This is especially useful to track
unexpected nested TLB faults.

Approved by: re (blanket)
2007-07-30 22:42:33 +00:00
Marcel Moolenaar
fe1c66b9d7 Rework the interrupt code and add support for interrupt filtering
(INTR_FILTER). This includes:
o  Save a pointer to the sapic structure and IRQ for every vector,
   so that we can quickly EOI, mask and unmask the interrupt.
o  Add locking to the sapic code now that we can reprogram a
   sapic on multiple CPUs at the same time.
o  Use u_int for the vector and IRQ. We only have 256 vectors, so
   using a 64-bit type for it is rather excessive.
o  Properly handle concurrent registration of a handler for the
   same vector.

Since vectors have a corresponding priority, we should not map
IRQs to vectors in a linear fashion, but rather pick a vector
that has a priority in line with the interrupt type. This is left
for later. The vector/IRQ interchange has been untangled as much
as possible to make this easier.

Approved by: re (blanket)
2007-07-30 22:29:33 +00:00
Marcel Moolenaar
8a2a70cb02 Explicitly map the VHPT on all processors. Previously we were
merely lucky that the VHPT was mapped as a side-effect of
mapping the kernel, but when there's enough physical memory,
this may not at all be the case.

Approved by: re (blanket)
2007-07-30 22:12:53 +00:00
Marcel Moolenaar
c183b0f2c1 Add casts to some of the more commonly used pointer-type atomic
operations. We really should be able to make those inline functions,
but this would break its use for sx_locks.

Approved by: re (blanket)
2007-07-30 22:07:01 +00:00
Andrew Thompson
de75afe64f - Propagate the largest set of interface capabilities supported by all lagg
  ports to the lagg interface.
- Use the MTU from the first interface as the lagg MTU, all extra interfaces
  must be the same.

This fixes using a lagg interface for a vlan or enabling jumbo frames, etc.
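
The "largest set of capabilities supported by all ports" is a bitwise intersection. A minimal sketch, assuming capabilities are a 32-bit flag word per port; common_capabilities() is a hypothetical name, not the lagg(4) function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the capability bits that every port supports.
 * With no ports there is nothing in common, so return 0.
 */
uint32_t
common_capabilities(const uint32_t *port_caps, size_t nports)
{
    uint32_t caps = ~(uint32_t)0;   /* start with all bits set */

    assert(port_caps != NULL || nports == 0);
    if (nports == 0)
        return (0);
    for (size_t i = 0; i < nports; i++)
        caps &= port_caps[i];       /* drop bits any port lacks */
    return (caps);
}
```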

Approved by:	re (kensmith)
MFC After:	3 days
2007-07-30 20:17:22 +00:00
Nate Lawson
430eaa744e Dynamically choose the quality of the ACPI timer depending on whether
the fast or safe/slow method is in use.  Fast remains at 1000, slow is
now at 850 (always preferred to TSC).  Since the HPET has proven slower
than ACPI-fast on some systems, drop its quality to 900.  In the future,
it is hoped that HPET performance will improve as it is the main
timer Intel supports.  HPET may move back to 2000 in -current once RELENG_7
is branched to ensure that it gets tested.

Approved by:	re
2007-07-30 15:21:26 +00:00
Dag-Erling Smørgrav
218cbbea9a Make tcpstates[] static, and make sure TCPSTATES is defined before
<netinet/tcp_fsm.h> is included into any compilation unit that needs
tcpstates[].  Also remove incorrect extern declarations and TCPDEBUG
conditionals.  This allows kernels both with and without TCPDEBUG to
build, and unbreaks the tinderbox.

Approved by:	re (rwatson)
2007-07-30 11:06:42 +00:00
David Malone
c848e0de55 Mfi386 revision 1.239 of src/sys/i386/isa/clock.c. Seemingly some
pc98 motherboards do not provide us with the correct day of week
either. Ignore the day of week when setting the clock here too.

Approved by:	re (bmah)
Requested from:	nyan
MFC after:	3 weeks
2007-07-29 20:16:48 +00:00
Bruce A. Mah
e251d2f4f6 Fix a typo in a log message: s/Reveived/Received/.
Approved by:	re (rwatson)
2007-07-29 20:13:22 +00:00
Warner Losh
1dfb823e11 Add missing newline in printf.
Submitted by:  "R.Mahmatkhanov" cvs-src at yandex ru
Approved by: re (blanket)
2007-07-29 18:16:43 +00:00
Marcel Moolenaar
7f67bed625 In pci_alloc_map(), restore the original value of the BAR for
the duration of the function.  The device we would otherwise
have left in a useless state may just as well be the low-level
console. When booting verbose, we do need it addressable if we
want to avoid a MCA.

Approved by: re (kensmith)
2007-07-29 02:44:41 +00:00
Matt Jacob
24face5416 Fix compilation problems- tcpstates is only available if TCPDEBUG
is set.

Approved by:	re (in spirit)
2007-07-29 01:31:33 +00:00
Mike Silbersack
e3020cfd3c Fix a panic introduced in rev 1.126.
Approved by: re (rwatson)
2007-07-28 20:13:40 +00:00
Andre Oppermann
773673c133 Provide a sysctl to toggle reporting of TCP debug logging:
sys.net.inet.tcp.log_debug = 1

It defaults to enabled for the moment and is to be turned off for
the next release like other diagnostics from development branches.

It is important to note that sysctl sys.net.inet.tcp.log_in_vain
uses the same logging function as log_debug.  Enabling of the former
also causes the latter to engage, but not vice versa.

Use consistent terminology in tcp log messages:

 "ignored" means a segment contains invalid flags/information and
   is dropped without changing state or issuing a reply.

 "rejected" means a segment contains invalid flags/information but
   is causing a reply (usually RST) and may cause a state change.

Approved by:	re (rwatson)
2007-07-28 12:20:39 +00:00
Andre Oppermann
cdaf208d09 o Move setting/resetting logic of syncache timer from macro
  SYNCACHE_TIMEOUT to new function syncache_timeout().
o Fix inverted timeout callout engagement logic to actually
  enable the timer for the bucket row.  Before SYN|ACK was
  not retransmitted.
o Simplify SYN|ACK retransmit timeout backoff calculation.
o Improve logging of retransmit and timeout events.
o Reset timeout when duplicate SYN arrives.
o Add comments.
o Rearrange SYN cookie statistics counting.

Bug found by:	silby
Submitted by:	silby (different version)
Approved by:	re (rwatson)
2007-07-28 12:02:05 +00:00
Andre Oppermann
19bc77c549 o Move all detailed checks for RST in LISTEN state from tcp_input() to
  syncache_rst().
o Fix tests for flag combinations of RST and SYN, ACK, FIN.  Before
  a RST for a connection in syncache did not properly free the entry.
o Add more detailed logging.

Approved by:	re (rwatson)
2007-07-28 11:51:44 +00:00
Robert Watson
c6b2899785 Replace references to NET_CALLOUT_MPSAFE with CALLOUT_MPSAFE, and remove
definition of NET_CALLOUT_MPSAFE, which is no longer required now that
debug.mpsafenet has been removed.

The once over:	bz
Approved by:	re (kensmith)
2007-07-28 07:31:30 +00:00
Alan Cox
eaa29f1ce4 Add a counter for the total number of pages cached and support for
reporting the value of this counter in the program "vmstat".

Approved by:	re (rwatson)
2007-07-27 20:01:22 +00:00
Olivier Houchard
122e1e5e24 CRB config file.
Approved by:	re (blanket)
2007-07-27 14:57:03 +00:00
Olivier Houchard
5f78cb4a35 XScale core 3 definitions.
Approved by:	re (blanket)
2007-07-27 14:54:27 +00:00
Olivier Houchard
0566a63ff3 Cleanup
Approved by:	re (blanket)
2007-07-27 14:53:42 +00:00
Olivier Houchard
55f9380c2c Do not define NIRQ, it is already defined in include/intr.h
Approved by:	re (blanket)
2007-07-27 14:53:06 +00:00
Olivier Houchard
b93e48d2f9 Share the timer and watchdog drivers with the i81342. It's the same,
except it uses different registers.

Approved by:	re (blanket)
2007-07-27 14:52:04 +00:00
Olivier Houchard
e26a6af3af Add initial IOP342 support.
Thanks to Intel for providing sample hardware.

Approved by:	re (blanket)
2007-07-27 14:50:57 +00:00
Olivier Houchard
62e70f1b69 Say if the L2 cache is enabled or disabled as well.
Approved by:	re (blanket)
2007-07-27 14:49:11 +00:00
Olivier Houchard
a9b444d065 Use coherent mapping for DMA on arm. This is probably suitable for the
other archs, but I can't test it so I made it conditional on __arm__
for now.

Approved by:	re (blanket)
2007-07-27 14:48:05 +00:00
Olivier Houchard
72d383c331 Handle supersections and L2 cache.
Approved by:	re (blanket)
2007-07-27 14:46:43 +00:00
Olivier Houchard
fcd373ffb8 Use supersection instead of standard sections to map the whole memory
when available.

Approved by:	re (blanket)
2007-07-27 14:46:15 +00:00
Olivier Houchard
e905513c06 Fix the cache mode description.
Approved by:	re (blanket)
2007-07-27 14:45:33 +00:00
Olivier Houchard
b4db6fd942 Properly handle supersections.
Make sure we cache entries in the L2 cache.

Approved by:	re (blanket)
2007-07-27 14:45:04 +00:00
Olivier Houchard
23f9626539 Bring in two bandaids to get the elf trampoline to work again, until I find
a proper solution.
- Add a dummy entry point which just calls the C entry points, and try to make
sure it's the first code in the binary.
- Copy a bit more than func_end to try to copy the whole load_kernel()
function. gcc4 puts code behind the func_end symbol.

Approved by:	re (blanket)
2007-07-27 14:42:25 +00:00
Olivier Houchard
425b5be335 Add a new set of functions to handle L2 cache. Make them no-op for every
CPU except Xscale core 3.

Approved by:	re (blanket)
2007-07-27 14:39:41 +00:00
Olivier Houchard
03631d9998 Import xscale core 3 cache management functions.
Approved by:	re (blanket)
2007-07-27 14:28:15 +00:00
Olivier Houchard
43a2baaf1c INTR_FILTER bits for arm
Approved by:	re (blanket)
2007-07-27 14:26:42 +00:00
Takanori Watanabe
32ee7eee09 Minor bug fix for a panic with some terminals that have a voice path on USB.
Approved by: re@ (kensmith)
2007-07-27 12:00:29 +00:00
Robert Watson
33d2bb9ca3 First in a series of changes to remove the now-unused Giant compatibility
framework for non-MPSAFE network protocols:

- Remove debug_mpsafenet variable, sysctl, and tunable.
- Remove NET_NEEDS_GIANT() and associated SYSINITs used by it to force
  debug.mpsafenet=0 if non-MPSAFE protocols are compiled into the kernel.
- Remove logic to automatically flag interrupt handlers as non-MPSAFE if
  debug.mpsafenet is set for an INTR_TYPE_NET handler.
- Remove logic to automatically flag netisr handlers as non-MPSAFE if
  debug.mpsafenet is set.
- Remove references in a few subsystems, including NFS and Cronyx drivers,
  which keyed off debug_mpsafenet to determine various aspects of their own
  locking behavior.
- Convert NET_LOCK_GIANT(), NET_UNLOCK_GIANT(), and NET_ASSERT_GIANT into
  no-op's, as their entire behavior was determined by the value in
  debug_mpsafenet.
- Alias NET_CALLOUT_MPSAFE to CALLOUT_MPSAFE.

Many remaining references to NET_.*_GIANT() and NET_CALLOUT_MPSAFE are still
present in subsystems, and will be removed in followup commits.

Reviewed by:	bz, jhb
Approved by:	re (kensmith)
2007-07-27 11:59:57 +00:00
David Malone
9be70a793e It seems that some i386 motherboards either do not implement the
day of week field correctly, or they remember bad values that are
written into the day of week field. For this reason, ignore the day
of week field when reading the clock on i386 rather than bailing if
it is set incorrectly.

Problems were seen on a number of platforms, including VMWare, qemu,
EPIA ME6000, Epox-3PTA and ABIT-SL30T.

This is a slightly different fix to that proposed by Ted in his PR,
but the same basic idea.

PR:		111117
Submitted by:	Ted Faber <faber@lunabase.org>
Approved by:	re (rwatson)
MFC after:	3 weeks
2007-07-27 09:34:42 +00:00
Attilio Rao
34ed040030 Actually, upcalls cannot be freed while destroying the thread because we
would call uma_zfree() with various spinlocks held.  Rearranging the
code would not help here because we cannot break atomicity with respect
to the process spinlock, so the only choice we have is to defer the operation.
In order to do this, use a global queue synchronized through the kse_lock
spinlock, which is freed at any thread_alloc() / thread_wait() through a
call to thread_reap().
Note that this approach is not ideal, as we would want a per-process
list of zombie upcalls, but it follows the initial guidelines of the KSE authors.

Tested by: jkim, pav
Approved by: jeff, julian
Approved by: re
2007-07-27 09:21:18 +00:00
Robert Watson
9e7a99e592 Continue effort to improve parity between UDPv4 and UDPv6: add a missing
scope security check for the UDPv6 socket credential lookup service,
allowing security policies to bound access to credential information.
While not an immediate issue for Jail, which doesn't allow use of UDPv6,
this may be relevant to other security policies that may wish to control
ident lookups.

While here, eliminate a very unlikely panic case, in which a socket in
the process of being freed is inspected by the sysctl.

Approved by:	re (kensmith)
Reviewed by:	bz
2007-07-27 08:25:02 +00:00
Mike Silbersack
c325962b47 Export the contents of the syncache to netstat.
Approved by: re (kensmith)
MFC after: 2 weeks
2007-07-27 00:57:06 +00:00
Pyun YongHyeon
4693e424a7 style(9)
Pointed out by:	cnst
Approved by:	re (kensmith)
2007-07-27 00:43:12 +00:00
Andrew Thompson
82056f42cf Avoid holding the softc lock when using copyout().
Reported by:	dfr
Approved by:	re (rwatson)
2007-07-26 20:30:18 +00:00
Andrew Thompson
c4dd9fb67a Fix up ndis interaction with net80211
 - make NDIS_DEBUG a sysctl
 - default to IEEE80211_MODE_11B if the card doesn't tell us the channels
 - don't mess with ic_des_chan when we associate
 - Allow a directed scan by setting the ESSID before scanning (verified
   with wireshark). Hidden APs probably wouldn't have worked before.
 - Grab the channel type and use it to look up the correct curchan for
   the scan results (mistakenly used 11B before)
 - Fix memory leak in the ndis_scan_results

Tested by:	matteo
Reviewed by:	sam
Approved by:	re (rwatson)
2007-07-26 20:11:16 +00:00
Alexander Motin
091193febe Reduce stack usage by 256 bytes per call. It helps to avoid kernel
stack overflow in complicated traffic filtering setups.

There can be minor performance degradation for the MHLEN < len <= 256 case
due to additional buffer allocation, but it is a rare case.

Approved by:	re (rwatson), glebius (mentor)
MFC after:	1 week
2007-07-26 18:15:02 +00:00
Pawel Jakub Dawidek
57fd3d5572 When we do open, we should lock the vnode exclusively. This fixes a few races:
- fifo race, where two threads assign v_fifoinfo,
- v_writecount modifications,
- v_object modifications,
- and probably more...

Discussed with:	kib, ups
Approved by:	re (rwatson)
2007-07-26 16:58:09 +00:00
Pawel Jakub Dawidek
68c1a246ae The v_mountedhere field is protected by the vnode lock, not vnode's internal
lock.

Approved by:	re (rwatson)
2007-07-26 16:52:57 +00:00
John Baldwin
de016534a8 If the trap number stored in the trapframe is corrupted into a negative
value, then we would use a negative index into the trap_msg[] array
resulting in a nested page fault.  Make the 'type' variable holding the
trap number unsigned to avoid this.
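
Why an unsigned type helps can be sketched as follows. The table contents, NTRAPS and trap_name() are hypothetical and only mimic the shape of a trap_msg[] lookup: a corrupted negative value converts to a very large unsigned number and fails the single upper-bound check instead of indexing before the array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message table; the real trap_msg[] has different entries. */
static const char *const trap_names[] = {
    "reserved",
    "privileged instruction fault",
    "breakpoint",
};
#define NTRAPS (sizeof(trap_names) / sizeof(trap_names[0]))

/*
 * Map a trap number to a message.  Because type is unsigned, one
 * comparison rejects both too-large and "negative" (corrupted) values.
 */
const char *
trap_name(unsigned int type)
{
    if (type < NTRAPS) {
        assert(trap_names[type] != NULL);
        return (trap_names[type]);
    }
    return ("unknown");
}
```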

MFC after:	2 weeks
Approved by:	re (rwatson)
2007-07-26 15:32:55 +00:00
Gleb Smirnoff
bb5ba44f82 Honor the IFF_MONITOR flag.
PR:		kern/99500
Submitted by:	Craig Leres <leres ee.lbl.gov>
Approved by:	re (kensmith)
2007-07-26 10:54:33 +00:00
Andre Oppermann
564aab1fe6 Fix comments in tcp_do_segment().
Approved by:	re (kensmith)
2007-07-25 18:48:24 +00:00
Warner Losh
6dc2dedb7a Start to converge on standard ways of saying some things like
Ethernet and Adapter.

Obtained from: NetBSD (kinda)
Approved by: re (blanket)
2007-07-25 07:11:08 +00:00
Warner Losh
3b62e837c9 Fix absolutely maddening autorepeat bug that would cause the last key
to repeat if you had more than two keys down at any given time (which
happened to me all the time with emacs).

This is taken from PR 110681, although what URATAN Shigenobu describes
there is different than the pathology that I have been seeing.  I'm
seeing this only in X, while he sees it on his console, yet I think
the two problems are related.  I've also reworked the patch slightly
to conform to the coding standards of adjacent code.

It is unclear to me if this merely masks the maddening bug that I have
seen, or if this is a real fix.  I typically see the problem when I'm
typing fast in emacs and using lots of motion keys (meta and control).
In either case, my workstation at work again is finally useful with
this patch.

PR:		110681
Submitted by:	URATAN Shigenobu
Approved by: 	re (blanket)
2007-07-25 06:48:33 +00:00
Warner Losh
8a639d8fb6 ums(4) does not work if the mouse defaults to boot protocol. Force
the protocol to be report on each open, but ignore any errors as set
protocol for mice that don't implement the boot protocol can generate
an error.  Evidently, the Gyration GyroPoint RF Technology Receiver
(Gyration Ultra Cordless) device has this problem.

Submitted by: Eugene M. Kim
PR: 106565
Approved by: re (blanket)
2007-07-25 06:43:06 +00:00
Randall Stewart
1b649582bb - take out a needless panic under invariants for sctp_output.c
- Fix addrs's error checking of sctp_sendx(3) when addrcnt is less than
   SCTP_SMALL_IOVEC_SIZE
 - re-add back inpcb_bind local address check bypass capability
 - Fix it so sctp_opt_info is independent of assoc_id position.
 - Fix cookie life set to use MSEC_TO_TICKS() macro.
 - asconf changes
   o More comment changes/clarifications related to the old local address
    "not" list which is now an explicit restricted list.

   o Rename some functions for clarity:
     - sctp_add/del_local_addr_assoc to xxx_local_addr_restricted()
     - asconf related iterator functions to sctp_asconf_iterator_xxx()

   o Fix bug when the same address is deleted and added (and removed from
     the asconf queue) where the ifa is "freed" twice refcount wise,
     possibly freeing it completely.

   o Fix bug in output where the first ASCONF would not go out after the
     last address is changed (e.g. only goes out when retransmitted).

   o Fix bug where multiple ASCONFs can be bundled in the same packet
     with the same serial numbers.

   o Fix asconf stcb iterator to not send ASCONF until after all work
     queue entries have been processed.

   o Change behavior so that when the last address is deleted (auto asconf
     on a bound all endpoint) no action is taken until an address is
     added; at that time, an ASCONF add+delete is sent (if the assoc
     is still up).

   o Fix local address counting so that address scoping is taken into
     account.

   o #ifdef SCTP_TIMER_BASED_ASCONF the old timer triggered sending
     of ASCONF (after an RTO).  The default now is to send
     ASCONF immediately (except for the case of changing/deleting the
     last usable address).
Approved by:	re(ken smith)@freebsd.org
2007-07-24 20:06:02 +00:00
Xin LI
f62e5595fd MFp4: Force 64-bit arithmetic when calculating the maximum file size.
This fixes tmpfs calculations on 32-bit systems equipped with more than
4GB swap.

Reported by:	Craig Boston <craig xfoil gank org>
PR:		kern/114870
Approved by:	re (tmpfs blanket)
2007-07-24 17:14:53 +00:00
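The class of bug fixed in f62e5595fd can be sketched as follows. This is not the tmpfs code: the function names and the 4096-byte page size are assumptions made for the example. It shows that a page count multiplied by the page size in 32-bit arithmetic wraps before the result is widened, while casting one operand first keeps the full value.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for the example */

/* 32-bit multiply: the product wraps modulo 2^32 before it is widened. */
uint64_t
maxfilesize_32bit(uint32_t pages)
{
    return ((uint32_t)(pages * SKETCH_PAGE_SIZE));
}

/* Widen one operand first so the multiply itself is done in 64 bits. */
uint64_t
maxfilesize_64bit(uint32_t pages)
{
    return ((uint64_t)pages * SKETCH_PAGE_SIZE);
}

/* 0x180000 pages of 4 KB are 6 GB, which does not fit in 32 bits. */
void
maxfilesize_selfcheck(void)
{
    assert(maxfilesize_64bit(0x180000) == UINT64_C(6442450944));
    assert(maxfilesize_32bit(0x180000) == UINT64_C(2147483648));
}
```

The wrapped result is still a plausible-looking size, which is why this kind of error tends to show up only on machines with more than 4 GB of backing store.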
Scott Long
05a4c1c1ef Attach the iscsi module build.
Approved by: re
2007-07-24 16:58:18 +00:00
Scott Long
c5933b2086 Introduce Danny Braniss' iSCSI initiator, version 2.0.99. Please read the
included man pages on how to use it.  This code is still somewhat experimental
but has been successfully tested on a number of targets.  Many thanks to
Danny for contributing this.

Approved by: re
2007-07-24 15:35:02 +00:00
Pawel Jakub Dawidek
aa222db26f Update assertion after revision 1.23.
Reviewed by:	dfr
Approved by:	re (rwatson)
2007-07-24 15:00:43 +00:00
Warner Losh
0e0e91989d Add support for ShanTou ST268 usb nic. This is from a patch for NetBSD
the PR pointed to.  This appears to have been written by Julian Suschlik.

Submitted by: Kuan-Chung Chiu
Obtained from: http://www.nabble.com/Patch-for-udav(4)-t4070804.html
PR: 114860
Approved by: re@ (blanket)
2007-07-24 14:44:23 +00:00
Pyun YongHyeon
5774c5ff93 Add MSI support.
Ever since switching to adaptive polling, re(4) occasionally spews
watchdog timeouts on systems with MSI capability. This change is the
minimal one needed to support MSI; re(4) will also need MSI-X support
for RTL8111C in the future. Because the softc structure of re(4) is
shared with rl(4), rl(4) was touched to use the modified softc.

Reported by:	cnst
Tested by:	cnst
Approved by:	re (kensmith)
2007-07-24 01:24:03 +00:00
Pyun YongHyeon
8b590ad2d1 Don't fail on device attach if jumbo frame support was unsuccessful.
Because nfe(4) hardware doesn't support SG on the Rx path, supporting
jumbo frames requires a very large block of contiguous kernel memory
(i.e. several megabytes). If contiguous kernel memory is scarce, that
allocation request may always fail. However, nfe(4) can still operate
on normal-sized MTU frames, so go ahead and just disable jumbo frame
support.
While I'm here add a new tunable "hw.nfe.jumbo_disable" to disable
jumbo frame support.
In nfe_poll, make sure to invoke correct Rx handler.

Approved by:	re (kensmith)
2007-07-24 01:11:00 +00:00
Attilio Rao
758b17a100 upcall_free() was only used in kse_GC(), which has been removed, so it
is now unused. With gcc's -Werror option, the resulting warning breaks
buildkernel.
Fix this by removing upcall_free().

Reported by: various
Approved by: jeff
Approved by: re
Pointy hat to: attilio
2007-07-23 23:16:53 +00:00
Attilio Rao
ac8094e4e3 KSE kernel bits locking is broken and is likely to lead to
dangerous races.
Fix these problems by adding correct locking for the members of 'struct
kse_upcall' and other struct proc/struct thread related members.
For the moment, just leave ku_mflag and ku_flags "lazy" locked.
While here, clean up the code by removing the function kse_GC() (unused),
and merging upcall_link(), upcall_unlink(), upcall_stash() into their
respective callers (static functions, very short and only called in one
place).

Reported by: pav
Tested by: pav (on some pointyhat cluster nodes)
Approved by: jeff
Approved by: re
Sponsorized by: NGX Italy (http://www.ngx.it)
2007-07-23 14:52:22 +00:00
Robert Watson
7bb9c8a05b When checking labels during a vnode link operation in MLS, use the file
vnode label for a check rather than the directory vnode label a second
time.

MFC after:	3 days
Submitted by:	Zhouyi ZHOU <zhouzhouyi at FreeBSD dot org>
Reviewed by:	csjp
Sponsored by:	Google Summer of Code 2007
Approved by:	re (bmah)
2007-07-23 13:28:54 +00:00
David Malone
6d8617d42a If clock_ct_to_ts fails to convert the time from the real time clock,
print a one line error message. Add some comments on not being able to
trust the day of week field (I'll act on these comments in a follow up
commit).

Approved by:	re
MFC after:	3 weeks
2007-07-23 09:42:32 +00:00
Robert Watson
8136d21ec0 Continue effort to align UDPv4 and UDPv6 implementations by merging
udp6_output() from udp6_output.c to udp6_usrreq.c, matching the UDPv4
structure, and allowing us to remove udp6_output.c.

Reviewed by:	bz, gnn
Approved by:	re (bmah)
2007-07-23 07:58:58 +00:00
Bruce Evans
4eb3abf0a5 Make using msdosfs as the root file system sort of work:
o Initialize ownerships and permissions.  They were garbage (0) for
  root mounts since vfs_mountroot_try() doesn't ask for them to be set
  and msdosfs's old incomplete code to set them was removed.  The
  garbage happened to give the correct ownerships root:wheel, but it
  gave permissions 000 so init could not be execed.  Use the macros
  for root:wheel and 0755.  (The removed code gave 0:0 and 0777.  0755
  is more normal and secure, though wrong for /tmp.)

o Check the readonly flag for initial (non-MNT_UPDATE) mounts in the
  correct place, as in ffs.  For root mounts, it is only passed in
  mp->mnt_flags, since vfs_mountroot_try() only passes it as a flag
  and nothing translates the flag to the "ro" option string.  msdosfs
  only looked for it in the string, so it gave a rw mount for root
  mounts without even clearing the flag in mp->mnt_flags, so the final
  state was inconsistent.  Checking the flag only in mp->mnt_flags
  works for initial userland mounts too.  The MNT_UPDATE case is
  messier.

The main point that should work but doesn't is fsck of msdosfs root
while it is mounted ro.  This needs mainly MNT_RELOAD support to work.
It should be possible to run fsck -p and succeed provided the fs is
consistent, not just for msdosfs, but this fails because fsck -p always
tries to open the device rw.  The hack that allows open for writing
in ffs is not implemented in msdosfs, since without MNT_RELOAD support
writing could only be harmful.  So fsck must be turned off to use
msdosfs as root.  This is quite dangerous, since msdosfs is still missing
actually using its fs-dirty flag internally, so it is happy to mount
dirty filesystems rw.

Unrelated changes:
- Fix missing error handling for MNT_UPDATE from rw to ro.
- Catch up with renaming msdos to msdosfs in a string.

Approved by:	re (kensmith)
2007-07-23 07:10:17 +00:00
Xin LI
7280082944 MFp4: When swapping is not enabled, allow creating files by taking
physical memory pages into account for tm_maxfilesize.

Reported by:	Dominique Goncalves <dominique.goncalves gmail.com>
Submitted by:	Howard Su
Approved by:	re (tmpfs blanket)
2007-07-23 06:54:58 +00:00
Attilio Rao
bcfac09734 The "KSE" preprocessor stub breaks the ABI for both modules and
userspace consumers.
This patch makes KSE no longer an optional stub for kernel structures,
fixing the breakage.
As a side note, this bug has broken kqemu for a long time now.

Tested by: Ulf Lilleengen <lulf@FreeBSD.org>
Discussed with: rwatson, jeff
Approved by: jeff (mentor)
Approved by: re
2007-07-22 21:35:44 +00:00
Andrew Thompson
a4e531102e ndis will signal the kthread to exit and then sleep on the proc pointer to
be woken up by kthread_exit. This is racy and in some cases the kthread will
exit before ndis gets around to sleep so it will be stuck indefinitely. This
change reuses the kq_exit variable to indicate that the thread has gone and
will loop on tsleep with a timeout waiting for it. If the kthread has already
exited then it will not sleep at all.

Approved by:	re (rwatson)
2007-07-22 20:53:28 +00:00
Nate Lawson
9bbad5af65 The HPET appears to be broken on silby's Acer Pentium M system, never
advancing.  Read from the timer before attaching to be sure it advances
in 1 us.  Since the slowest rate allowed by the spec is 10 MHz, the
timer is guaranteed to change in this interval if it is working.

Tested by:	Rui Paulo
Approved by:	re
MFC after:	3 days
2007-07-22 20:45:27 +00:00
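The arithmetic behind the 1 us probe in 9bbad5af65 can be written out. This is a sketch only, not the HPET driver: the helper names are hypothetical. At the 10 MHz minimum rate quoted in the commit message, one tick lasts 100 ns, so a working counter must advance by at least 10 ticks in 1 us and two reads taken 1 us apart cannot be equal.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Ticks a counter running at freq_hz accumulates in interval_ns.
 * The intermediate product can overflow for very large inputs; the
 * values used in this example are far below that limit.
 */
uint64_t
ticks_in_interval(uint64_t freq_hz, uint64_t interval_ns)
{
    return (freq_hz * interval_ns / UINT64_C(1000000000));
}

/* A working free-running counter differs between two spaced reads. */
int
counter_is_advancing(uint32_t before, uint32_t after)
{
    return (after != before);
}

/* 10 MHz over 1 us (1000 ns) gives 10 ticks. */
void
hpet_probe_selfcheck(void)
{
    assert(ticks_in_interval(UINT64_C(10000000), 1000) == 10);
    assert(counter_is_advancing(100, 110));
    assert(!counter_is_advancing(100, 100));
}
```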
Warner Losh
944f82cd4f Change new Wi-Spy device name to Wi-Spy 2.4x.
Submitted by: Brix Andersen
Approved by: re@ (blanket)
PR: 114807
2007-07-22 18:29:18 +00:00
Warner Losh
9fb43cb678 WISPY added an X.
Approved by: re
2007-07-22 15:59:45 +00:00
Robert Watson
3f3bb0d402 Merge OpenBSM 1.0 alpha 15 changes to src/sys/bsm:
- Synchronized audit event list to Solaris, picking up the *at(2) system call
  definitions, now required for FreeBSD and Linux.  Added additional events
  for *at(2) system calls not present in Solaris.

Obtained from:	TrustedBSD Project
Approved by:	re (hrs)
2007-07-22 12:28:13 +00:00
Kevin Lo
36ffd4ba6d Use bus_get_dma_tag() to obtain the parent DMA tag.
Reviewed by: sam, sephe, thompsa
Approved by: re (kensmith)
2007-07-22 06:44:10 +00:00
Warner Losh
7e23029ae6 Add some additional devices.
Submitted by: HPS hselasky at c2i dot net
Approved by: re (blanket)
2007-07-22 03:45:35 +00:00
Randall Stewart
52be287ebb - remove duplicate code from sctp_asconf.c
- remove duplicate #include <sys/priv.h> that is not under
   #ifdef FreeBSD version to allow compile on 6.1
- static analysis changes per the cisco SA tool including:
    o some SA_IGNORE comments
    o some checks for NULL before unlock.
    o type corrections int -> size_t
- Fix it so sctp_alloc_asoc takes a thread/proc argument. Without this
   we pass a NULL in to bind on implicit assoc setup and crash  :-(
Approved by:	re@freebsd.org(Ken Smith)
2007-07-21 21:41:32 +00:00
Alexander Kabaev
f7c7c876de Do not forget to cam_periph_unhold the peripheral before exiting
due to error.

PR:		kern/114636
Submitted by:	Tijl Coosemans
Approved by:	re (hrs)
2007-07-21 18:07:45 +00:00
Stefan Eßer
d2a748e232 Fix Symbios driver on amd64: Since amd64 has 64 bit pointers but the same
4KB pages as i386, data structures that just fit in one page on i386 (and
on 64 bit architectures with 8KB pages) can be distributed over two pages
on amd64. This is a problem in the case of the Symbios driver, since the
SCRIPTS engine in the SCSI chip operates on physical addresses and needs
physically contiguous memory. Earlier patches used contigmalloc on amd64,
but this version replaces part of a structure by a pointer to that data.
In order to not introduce an extra indirection for other architectures,
the change has been made conditional on __amd64__.

Earlier attempts to repair this problem are removed (i.e. the macros that
made amd64 use contigmalloc). The fix was submitted by Jan Mikkelsen and
modified by me to only affect amd64.

PR:		89550
Submitted by:	janm at transactionware dot com (Jan Mikkelsen)
Approved by:	re (Hiroki Sato)
MFC after:	2 weeks
2007-07-20 23:02:01 +00:00
Bruce Evans
6b6c5f5ef9 Implement vfs clustering for msdosfs.
This gives a very large speedup for small block sizes (in my tests,
about 5 times for write and 3 times for read with a block size of 512,
if clustering is possible) and a moderate speedup for the moderately
large block sizes that should be used on non-small media (4K is the
best size in most cases, and the speedup for that is about 1.3 times
for write and 1.2 times for read).  mmap() should benefit from clustering
like read()/write(), but the current implementation of vm only supports
clustering (at least for getpages) if the fs block size is >= PAGE_SIZE.

msdosfs is now only slightly slower than ffs with soft updates for
writing and slightly faster for reading when both use their best block
sizes.  Writing is slower for msdosfs because of more sync writes.
Reading is faster for msdosfs because indirect blocks interfere with
clustering in ffs.

The changes in msdosfs_read() and msdosfs_write() are simpler merges
of corresponding code in ffs (after fixing some style bugs in ffs).
msdosfs_bmap() needs fs-specific code.  This implementation loops
calling a lower level bmap function to do the hard parts.  This is a
bit inefficient, but is efficient enough since msdosfs_bmap() is only
called when there is physical i/o to do.

Approved by:	re (hrs)
2007-07-20 17:06:57 +00:00
Bruce Evans
d34b0a1bac Clean up before implementing vfs clustering for msdosfs:
In msdosfs_read(), mainly reorder the main loop to the same order as in
ffs_read().

In msdosfs_write() and extendfile(), use vfs_bio_clrbuf() instead of
clrbuf().  I think this is just a bogus optimization, but ffs always
does it and msdosfs already did it in one place, and it is what I've
tested.

In msdosfs_write(), merge good bits from a comment in ffs_write(), and
fix 1 style bug.

In the main comment for msdosfs_pcbmap(), improve wording and catch
up with 13 years of changes in the function.  This comment belongs in
VOP_BMAP.9 but that doesn't exist.

In msdosfs_bmap(), return EFBIG if the requested cluster number is out
of bounds instead of blindly truncating it, and fix many style bugs.

Approved by:	re (hrs)
2007-07-20 16:21:47 +00:00
Sepherosa Ziehau
7f02e579c5 In add_channel(), search 11g channels if mode is AUTO and corresponding
11b channel is not found, e.g. Atheros 5211.

Reported by: matteo
Problem outlined by: thompsa
Reviewed by: sam, thompsa
Approved by: re (kensmith), sam (mentor)
Tested by: matteo (an early version)
2007-07-20 11:38:12 +00:00
Robert Watson
825eaf3470 Make sure we release the control vnode in Coda:
We allocate coda_ctlvp when /coda is mounted, but never release it.
During the unmount this vnode was marked as UNMOUNTING and when venus
is started a second time the system would hang, possibly waiting for
the old vnode to disappear.

So now we call vrele on the control vnode when file system is unmounted
to drop the reference we got during the mount. I'm pretty sure it is
also necessary to not skip the handling in coda_inactive for the control
vnode, it seems like that is the place we actually get rid of the vnode
once the refcount has dropped to 0.

Submitted by:	Jan Harkes <jaharkes at cs dot cmu dot edu>
Approved by:	re (kensmith)
2007-07-20 11:14:51 +00:00
Konstantin Belousov
e69aee3117 ttyfree() frees the cdev(). But if there are pending kevents,
filt_ttyrdetach() etc would later attempt to dereference cdev->si_tty,
causing a 0xdeadc0de dereference.  Change kn_hook value from cdev to
struct tty to avoid dereferencing freed cdev.

In ttygone(), wake up select(), sigio and kevent() users in addition
to the queue sleepers.

Return EV_EOF from kevent filters if TS_GONE is set.

Submitted by:	peter
Tested by:	Peter Holm
Approved by:	re (kensmith)
MFC after:	2 weeks
2007-07-20 09:41:54 +00:00
Attilio Rao
6aa294be2c Fix some problems with lock profiling in rw locks:
- Adjust the semantics of the lock_profiling stubs in the hard functions
  so that they are more accurate and trustworthy
- As for sx locks, disable shared paths for lock_profiling.  At present,
  lock_profiling has a subtle race which makes results coming from shared
  paths not completely trustworthy. A macro stub (LOCK_PROFILING_SHARED)
  can be used to re-enable these paths, but is currently intended for
  development use only.
- style(9) fixes

Approved by: jeff, kmacy, jhb[1]
Approved by: re

[1] Had initial reservations not shared by others, conceded
    in the end.
2007-07-20 08:43:42 +00:00
Attilio Rao
52739c2d25 i386_set_ioperm, i386_get_ldt and i386_set_ldt are now MPSAFE
(Giant/sched_lock free) so remove the useless Giant cruft.

Approved by: jeff
Approved by: re
Sponsorized by: NGX Italy (http://www.ngx.it)
2007-07-20 08:35:18 +00:00
Alan Cox
806453645a Two changes to vm_fault_additional_pages():
1. Rewrite the backward scan.  Specifically, reverse the order in which
   pages are allocated so that upon failure it is never necessary to
   free pages that were just allocated.  Moreover, any allocated pages
   can be put to use.  This makes the backward scan behave just like the
   forward scan.

2. Eliminate an explicit, unsynchronized check for low memory before
   calling vm_page_alloc().  It serves no useful purpose.  It is, in
   effect, optimizing the uncommon case at the expense of the common
   case.

Approved by:	re (hrs)
MFC after:	3 weeks
2007-07-20 06:55:11 +00:00
Hidetoshi Shimokawa
b0f99fbdbc Protect transaction labels by its own lock to reduce lock contention.
Approved by: re (rwatson)
2007-07-20 03:42:57 +00:00
Pyun YongHyeon
53dcfbd18b Add a legacy interrupt handler, which is more appropriate for an
interrupt that is shared with other devices (e.g. USB) in the system, and
provide a new tunable "hw.msk.legacy_intr" to activate the legacy
interrupt handler. Setting the tunable automatically disables MSI
for msk(4). Previously msk(4) used adaptive polling with taskqueue(9),
as all msk(4) hardware I know of supports MSI. However, there are cases
where MSI cannot be used on some hardware due to bugs in MSI
implementations.

Tested by:	Li-Lun Wang < llwang AT infor DOT org >
Approved by:	re (kensmith)
2007-07-20 00:25:20 +00:00
Robert Watson
08af97b790 Attempt to improve feature parity between UDPv4 and UDPv6 by merging
UDPv4 features to UDPv6:

- Add MAC checks on delivery and MAC labeling on transmit.
- Check for (and reject) datagrams with destination port 0.
- For multicast delivery, check the source port only if the socket being
  considered as a destination has been connected.
- Implement UDP blackholing based on net.inet.udp.blackhole.
- Add a new ICMPv6 unreachable reply rate limiting category for failed
  delivery attempts and implement rate limiting for UDPv6 (submitted by
  bz).

Approved by:	re (kensmith)
Reviewed by:	bz
2007-07-19 22:34:25 +00:00
Jeff Roberson
28994a5852 - Refine the load balancer to improve buildkernel times on dual core
machines.
 - Leave the long-term load balancer running by default once per second.
 - Enable stealing load from the idle thread only when the remote processor
   has more than two transferable tasks.  Setting this to one further
   improves buildworld.  Setting it higher improves mysql.
 - Remove the bogus pick_zero option.  I had not intended to commit this.
 - Entirely disallow migration for threads with SRQ_YIELDING set.  This
   balances out the extra migration allowed for with the load balancers.
   It also makes pick_pri perform better as I had anticipated.

Tested by:	Dmitry Morozovsky <marck@rinet.ru>
Approved by:	re
2007-07-19 20:03:15 +00:00
Jeff Roberson
08c9a16c4f - When newtd is specified to sched_switch() it was not being initialized
properly.  We have to temporarily unlock the TDQ lock so we can lock
   the thread and add it to the run queue.  This is used only for KSE.
 - When we add a thread from the tdq_move() via sched_balance() we need to
   ipi the target if it's sitting in the idle thread or it'll never run.

Reported by:	Rene Landan
Approved by:	re
2007-07-19 19:51:45 +00:00
Andrew Gallatin
f9ae02802f - Enable static building of mxge(4) and its firmware.
- Add custom .c wrappers for the firmware, rather than the standard
  firmware(9) generated firmware objects to work around toolchain
  problems on ia64 involving linking objects produced by
  ld -b -binary into the kernel.

- Move from using Myricom's ".dat" firmware blobs to using Myricom's
  zlib compressed ".h" firmware header files.  This is done to
  facilitate the custom wrappers, and saves a fair amount of wired
  memory in the case where the firmware is built in, or preloaded.

- Fix two compile issues in mxge which only appear on non-i386/amd64.

Reviewed by: mlaier, mav (earlier version with just zlib support)
Glanced at by: sam
Approved by: re (kensmith)
2007-07-19 16:16:00 +00:00
Bjoern A. Zeeb
b28cd33459 Replace hard coded options by their defined PFIL_{IN,OUT} names.
Approved by:	re (hrs)
2007-07-19 09:57:54 +00:00
Bjoern A. Zeeb
8accf26fea Restore behavior changed with rev. 1.46 and make
IPV6_IPSEC_POLICY always visible again. This unbreaks some
third party user space applications.

PR:		114491
Reported by:	sumikawa
Reviewed by:	sumikawa
Approved by:	re (hrs)
2007-07-19 09:16:40 +00:00
Jeff Roberson
56696bd1ab - Remove explicit references to sched_lock. A simpler assert will do.
Approved by:	re
2007-07-19 08:58:40 +00:00
Jeff Roberson
6eeb364b4c - Calling sched_nice() in tdsigwakeup() is no longer required by ULE and
actually causes LORs and other panics.

Reported by:	mlaier
Approved by:	re
2007-07-19 08:49:16 +00:00
Xin LI
c5be778305 MFp4: Rework on tmpfs's mapped read/write procedures. This
should finally fix fsx test case.

The printf's added here would be eventually turned into
assertions.

Submitted by:	Mingyan Guo (mostly)
Approved by:	re (tmpfs blanket)
2007-07-19 03:34:50 +00:00
Jeff Roberson
6ea38de8aa - Remove the global definition of sched_lock in mutex.h to break
new code and third party modules which try to depend on it.
 - Initialize sched_lock in sched_4bsd.c.
 - Declare sched_lock in sparc64 pmap.c and assert that we're compiling
   with SCHED_4BSD to prevent accidental crashes from running ULE.  This
   is the sole remaining file outside of the scheduler that uses the
   global sched_lock.

Approved by:	re
2007-07-18 20:46:06 +00:00
Jeff Roberson
773890b9a8 - Add the proper lock profiling calls to _thread_lock().
Obtained from:	kipmacy
Approved by:	re
2007-07-18 20:38:13 +00:00
Jeff Roberson
bd675f58eb - Update ULE note to remove warnings against production use.
Suggested by:	Ben Kaduk <minimarmot@gmail.com>
Approved by:	re
2007-07-18 02:51:21 +00:00
Jeff Roberson
ae7a6b38d5 ULE 3.0: Fine grain scheduler locking and affinity improvements. This has
been in development for over 6 months as SCHED_SMP.
 - Implement one spin lock per thread-queue.  Threads assigned to a
   run-queue point to this lock via td_lock.
 - Improve the facility for assigning threads to CPUs now that sched_lock
   contention no longer dominates scheduling decisions on larger SMP
   machines.
 - Re-write idle time stealing in an attempt to make it less damaging to
   general performance.  This is still disabled by default. See
   kern.sched.steal_idle.
 - Call the long-term load balancer from a callout rather than sched_clock()
   so there are no locks held.  This is disabled by default.  See
   kern.sched.balance.
 - Parameterize many scheduling decisions via sysctls.  Try to document
   these via sysctl descriptions.
 - General structural and naming cleanups.
 - Document each function with comments.

Tested by:	current@ amd64, x86, UP, SMP.
Approved by:	re
2007-07-17 22:53:23 +00:00
Jeff Roberson
40380a6a6b - Optimize the amd64 cpu_switch() TD_LOCK blocking and releasing to
require fewer blocking loops.
 - Don't use atomic ops with 4BSD or on UP.
 - Only use the blocking loop if ULE is compiled in.
 - Use the correct memory barrier.

Discussed with:	attilio, jhb, ssouhlal
Tested by:	current@
Approved by:	re
2007-07-17 22:36:56 +00:00
Jeff Roberson
56a114967b - Add support for blocking and releasing threads to i386 cpu_switch(). This
is required for per-cpu scheduler lock support.

Obtained from:	attilio
Tested by:	current@ many users
Approved by:	re
2007-07-17 22:34:14 +00:00
Randall Stewart
18e198d3a3 - added pre-checks to the bindx call.
- use proper tick gathering macro instead of ticks directly.
- Placed reasonable boundaries on sets that a user can do
  that are converted to ticks from ms.
- Fix CMT_PF to always check to be sure CMT is on.
- Fix ticks use of CMT_PF.
- put back code to allow asconfs to be queued while INITs are in flight
  and before the assoc is established.
- During window probes, an ack'd packet might be left with the window
  probe mark on it causing it to be retransmitted. Change so that
  the flight decrease macro clears the window_probe mark.
- Additional logging flight size/reading and ASOC LOG. This
  is only enabled if you manually insert things into opt_sctp.h
  since its a set of debug code only.
- Found an interesting SMP race in the way data was appended which
  could cause a reader to lose a part of a message, had to
  reorder when we marked the message was complete to after
  the data was appended.
- bug in ADD-IP for the subset bound socket case when the peer has only
  one address
- fix ASCONF implicit success/error handling case
- proper support of jails in Freebsd 6>
- copy out the timeval for the 64 bit sparc world on cookie-echo
  (alignment error crashes without this).
Approved by:	re(Ken Smith)
2007-07-17 20:58:26 +00:00
Sepherosa Ziehau
733ab6b6c8 Correct RSSI calculation.
Noticed by: Hans Petter Selasky <hselasky@c2i.net>
Approved by: re (kensmith), sam (mentor)
2007-07-17 11:27:57 +00:00
Kip Macy
ac3a6d9cef - integrate most recent changes from vendor branch and upgrade to firmware revision 4.5.5
- add filter support
	- further improvements for T304
- recover gracefully from spurious immediate packets

Approved by: re(blanket)
Supported by: Chelsio
MFC after: 3 days
2007-07-17 06:50:35 +00:00
Kip Macy
8870f0e16b - Increase descriptors per call to start
- enqueue per-txq task
- fix per-txq task initialization

Approved by: re (blanket)
2007-07-17 06:12:22 +00:00
Jeff Roberson
fb62eea266 - Use ruxagg() in calcru() to make sure we have current tick information
from all threads.

Discussed with:	bde, attilio
Approved by:	re
2007-07-17 01:08:09 +00:00
Doug Ambrisko
72d7331539 Add support to the ipmi, isa attachment to attempt to read ipmi
config info. from device.hints.  Some machines have ipmi controllers
that do not have attachment info in either PCI, SMBIOS or ACPI.
This idea was hacked together by me and then done properly by
jhb.

Submitted by:	jhb
Reviewed by:	jhb (man page)
Approved by:	re (Ken Smith)
MFC after:	1 week
2007-07-16 17:03:48 +00:00
Marcel Moolenaar
871f1ddd46 Restore the value of ar.rnat after the assignment to ar.bspstore.
The SDM states that writing to ar.bspstore invalidates the ar.rnat
register as a side-effect. This was interpreted as "bits in the
ar.rnat register that correspond to registers whose value is on
the stack are undefined". Since we keep the kernel stack
NaT-aligned with the user stack (i.e. the lower 9 bits of the backing
store pointer remain unchanged when we switch to the kernel stack)
bits that need preserving would be preserved.

That interpretation is questionable. So, now, the interpretation
is more absolute: ar.rnat is undefined after writing to ar.bspstore.
As such, we write the saved value of ar.rnat back to ar.rnat after
writing to ar.bspstore.

Discussed with: christian.kandeler@hob.de
Approved by: re (kensmith)
2007-07-16 16:47:35 +00:00
Hidetoshi Shimokawa
f0441453c1 Improve acquisition of transaction labels.
- Keep last transaction label for each destination.
- If the next label is not free, just give up.
- This should reduce CPU load for TX on if_fwip under heavy load.

Approved by: re (hrs)
2007-07-15 13:00:29 +00:00
Robert Watson
2b851aeb63 Disconnect netatm from the build as it is not MPSAFE and relies on
NET_NEEDS_GIANT, which will shortly be removed.  This is done in such a
way that it may be easily reattached to the build before 7.1 if
appropriate locking is added.  Specifics:

- Don't install netatm include files
- Disconnect netatm command line management tools
- Don't build libatm
- Don't include ATM parts in rescue or sysinstall
- Don't install sample configuration files and documents
- Don't build kernel support as a module or in NOTES
- Don't build netgraph wrapper nodes for netatm

This removes the last remaining consumer of NET_NEEDS_GIANT.

Reviewed by:	harti
Discussed with:	bz, bms
Approved by:	re (kensmith)
2007-07-14 21:49:24 +00:00
Craig Rodrigues
d7f81adbd4 Revert previous commits which I committed by mistake.
Approved by:	re (implicit)
Pointy hat to:	me
2007-07-14 21:23:31 +00:00
Alan Cox
8941dc4471 Eliminate two unused functions: vm_phys_alloc_pages() and
vm_phys_free_pages().  Rename vm_phys_alloc_pages_locked() to
vm_phys_alloc_pages() and vm_phys_free_pages_locked() to
vm_phys_free_pages().  Add comments regarding the need for the free page
queues lock to be held by callers to these functions.  No functional
changes.

Approved by:	re (hrs)
2007-07-14 21:21:17 +00:00
Craig Rodrigues
d678780e60 The last entry in the ext2_opts array must be NULL,
otherwise the kernel will crash in vfs_filteropt() if an invalid
mount option is passed to ext2fs.

Approved by:	re (kensmith)
2007-07-14 21:18:19 +00:00
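Why the terminating NULL in d678780e60 matters can be seen in a small userland sketch. This is not vfs_filteropt() itself: the option list and the names `opt_is_known()` and `opt_selfcheck()` are hypothetical. A scan over a NULL-terminated list has no length to rely on, so an option that matches nothing walks all the way to the sentinel; without the sentinel the loop would read past the end of the array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option list; the final NULL marks the end of the array. */
static const char *const sketch_opts[] = { "ro", "rw", "noatime", NULL };

/* Returns 1 if 'name' appears in the NULL-terminated list 'opts'. */
int
opt_is_known(const char *const *opts, const char *name)
{
    for (; *opts != NULL; opts++) {
        if (strcmp(*opts, name) == 0)
            return (1);
    }
    return (0);
}

/* A valid option stops early; an invalid one relies on the sentinel. */
void
opt_selfcheck(void)
{
    assert(opt_is_known(sketch_opts, "rw"));
    assert(!opt_is_known(sketch_opts, "bogus"));
}
```

Valid options hide the problem because the loop returns before reaching the end, which matches the commit's note that the crash needs an invalid mount option.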
Alan Cox
bd06ab2f60 Eliminate dead code, specifically, an unused sysctl: "vm.idlezero_maxrun".
Approved by:	re (hrs)
2007-07-14 19:00:44 +00:00
Robert Watson
9c89a2e949 Remove "options SCTP_HIGH_SPEED" from NOTES as it has now been removed
from options.

Approved by:	re (bmah)
2007-07-14 15:35:45 +00:00
Randall Stewart
b54d3a6c48 - Modular congestion control, with RFC2581 being the default.
- CMT_PF states added (w/sysctl to turn the PF version on)
- sctp_input.c had a missing incr of cookie case when the
  auth was bad. This meant a free was called without an
  increment to refcnt, added increment like rest of code.
- There was a case, unlikely, when the scope of the destination
  changed (this is a TSNH case). In that case, it would not free
  the alloc'ed asoc (in sctp_input.c).
- When listed addresses found a colliding cookie/Init, then
  the collided upon tcb was not unlocked in sctp_pcb.c
- Add error checking on arguments of sctp_sendx(3) to prevent it from
  referencing a NULL pointer.
- Fix an error return of sctp_sendx(3), it was returning
  ENOMEM not -1.
- Get assoc id was changed to use the sanctified socket api
  method for getting an assoc id (PEER_ADDR_INFO instead of
  PEER_ADDR_PARAMS).
- Fix it so a peeled off socket will get a proper error return
  if it tries to send to a different address than it is connected to.
- Fix so that select_a_stream can avoid an endless loop that
  could hang a caller.
- time_entered (state set time) was not being set in all cases
  to the time we went established.
Approved by:	re(ken smith)
2007-07-14 09:36:28 +00:00
Craig Rodrigues
7a920f5761 Perform range check before allocating memory when reading
extended attributes.

Reviewed by:	kib
Approved by:	re (hrs)
PR:		114389
2007-07-13 18:51:08 +00:00
Eric Anholt
d450e052dc Add support for G965/Q965/GM965/GME965/GME945 AGP.
This adds a function to agp.c to set the aperture resource ID if it's
not the usual AGP_APBASE.  Previously, agp.c had been assuming
AGP_APBASE, which resulted in incorrect agp_info, and contortions by
agp_i810.c to work around it.

This also adds functions to agp.c for default AGP_GET_APERTURE() and
AGP_SET_APERTURE(), which return the aperture resource size and disallow
aperture size changes.  Moving to these for our AGP drivers will likely
result in stability improvements.  This should fix 855-class aperture
size detection.

Additionally, refuse to attach agp_i810 when some RAM is above 4GB and
the GART can't reference memory that high.  This should be very rare.
The correct solution would be bus_dma conversion for agp, which is
beyond the scope of this change.  Other AGP drivers could likely use
this change as well.

G33/Q35/Q33 AGP support is also included, but disconnected by default
due to lack of testing.

PR:             kern/109724 (855 aperture issue)
Submitted by:   FUJIMOTO Kou<fujimoto@j.dendai.ac.jp>
Approved by:	re (hrs)
2007-07-13 16:28:12 +00:00
Warner Losh
229af622b8 MFp4:
Add support for the CENTIPAD board (http://www.harerod.de/centipad/index.html)
	(which is a very cool, very small ARM board)
Add support for KB9202B (it has different memory)
Make BOOT_FLAVOR settable
Minor cleanup nits

Approved by: re@
2007-07-13 14:27:05 +00:00
Alan Cox
0f752392c6 Update a comment describing the page queues.
Approved by:	re (hrs)
2007-07-13 04:42:20 +00:00
Alan Cox
e99a797492 Eliminate dead code.
Approved by:	re (hrs)
2007-07-12 22:23:28 +00:00
Robert Watson
00f05dc847 Complete repo-copy and move of Coda from src/sys/coda to src/sys/fs/coda
by removing files from src/sys/coda, and updating include paths in the
new location, kernel configuration, and  Makefiles.  In one case add
$FreeBSD$.

Discussed with:		anderson, Jan Harkes <jaharkes@cs.cmu.edu>
Approved by:		re (kensmith)
Repo-copy madness:	simon
2007-07-12 21:04:58 +00:00
Robert Watson
d21e51d059 Forced commit to recognize repo-copy of Coda files from src/sys/coda to
src/sys/fs/coda.

Discussed with:         anderson, Jan Harkes <jaharkes@cs.cmu.edu>
Approved by:            re (kensmith)
Repo-copy madness:      simon
2007-07-12 20:40:38 +00:00
Jack F Vogel
d2a744ffea A couple of late-breaking bugs that testing has turned up.
- Change include style so that both in-kernel-tree and standalone builds work.
- Limit HWCSUM - I was led to believe that it would work with RSS,
  but our testing had odd issues which suggest this is false.
- A fatfinger error in the ioctl code made ifconfig up not work.

Approved by: re
2007-07-12 19:04:11 +00:00
John Baldwin
59d8f3ff08 Fix a couple of issues with the stack limit for 32-bit processes on 64-bit
kernels exposed by the recent fixes to resource limits for 32-bit processes
on 64-bit kernels:
- Let ABIs expose their maximum stack size via a new pointer in sysentvec
  and use that in preference to maxssiz during exec() rather than always
  using maxssiz for all processes.
- Apply the ABI's limit fixup to the previous stack size when adjusting
  RLIMIT_STACK to determine if the existing mapping for the stack needs to
  be grown or shrunk (as well as how much it should be grown or shrunk).

Approved by:	re (kensmith)
2007-07-12 18:01:31 +00:00
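The two adjustments in 59d8f3ff08 can be sketched roughly as follows. This is an illustration in Python, not the kernel code: the function names are hypothetical, and modelling the ABI's limit fixup as a simple clamp to the ABI maximum is an assumption.

```python
def exec_stack_limit(abi_maxssiz, global_maxssiz):
    """Prefer the ABI's own maximum stack size when it exposes one."""
    if abi_maxssiz is not None:
        return abi_maxssiz
    return global_maxssiz


def clamp_to_abi(limit, abi_max):
    """Assumed form of the ABI limit fixup: never exceed the ABI maximum."""
    return min(limit, abi_max)


def stack_resize(old_limit, new_limit, abi_max):
    """Return how far the stack mapping should grow (>0) or shrink (<0).

    Both the previous and the new limit are passed through the fixup, so the
    difference reflects what the process could actually use before and after.
    """
    old = clamp_to_abi(old_limit, abi_max)
    new = clamp_to_abi(new_limit, abi_max)
    return new - old


if __name__ == "__main__":
    print(exec_stack_limit(None, 512))
    print(stack_resize(1000, 2000, 500))
```

In this sketch, raising a limit that was already above the ABI maximum yields a resize of zero, which is the kind of case that goes wrong if only the new limit is fixed up.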
Sam Leffler
c4ed2c08ad revert handling of ssid and bssid to be mandatory instead of advisory
Prodded by:	Kevin Gerry
Reviewed by:	thompsa, sephe
Approved by:	re (kensmith)
2007-07-12 17:22:43 +00:00
Bruce Evans
93fe42b62f Round up the FAT block size to a multiple of the sector size so that i/o
to the FAT is possible.

Make the FAT block size less arbitrary before it is rounded up:
- for FAT12, default to 3*512 instead of to 3 sectors.  The magic 3 is
  the default number of 512-byte FAT sectors on a floppy drive.  That
  many sectors is too many if the sector size is larger.
- for !FAT12, default to PAGE_SIZE instead of to 4096.  Remove
  MSDOSFS_DFLTBSIZE since it only obfuscated this 4096.

For reading the BPB, use a block size of 8192 instead of 2048 so that
sector sizes up to 8192 can work.  We should try several sizes, or just
try the maximum supported size (MAXBSIZE = 64K).  I use 8192 because
that is enough for DVD-RW's (even 2048 is enough) and 8192 has been
tested a lot in use by ffs.

This completes fixing msdosfs for some large sector sizes (up to 8K
for read and 64K for write).  Microsoft documents support for sector
sizes up to 4K in msdosfs.  ffs is currently limited to 8K for both
read and write.

Approved by:	re (kensmith)
Approved by:	nyan (several years ago)
2007-07-12 17:17:47 +00:00
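The rounding described in 93fe42b62f can be illustrated with a small Python sketch. The function names are hypothetical and the 4096-byte PAGE_SIZE is an assumption (the real value is platform dependent); this is not the msdosfs code itself.

```python
PAGE_SIZE = 4096  # assumed page size; platform dependent in reality


def roundup(x, y):
    """Round x up to the next multiple of y."""
    return ((x + y - 1) // y) * y


def fat_block_size(is_fat12, sector_size):
    """Pick the default FAT block size, then round it up to whole sectors."""
    default = 3 * 512 if is_fat12 else PAGE_SIZE
    return roundup(default, sector_size)


if __name__ == "__main__":
    for sector_size in (512, 2048, 8192):
        print(sector_size,
              fat_block_size(True, sector_size),
              fat_block_size(False, sector_size))
```

With 512-byte sectors the FAT12 default stays at three sectors; with larger sectors it is rounded up to a single sector instead of growing to three large ones.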
Nate Lawson
f1172c58e5 Fix a bug where the callout might not be initialized before being used.
Rev 1.9 introduced another path where machclk_freq would be initialized
before the rest of setup was done (i.e. initializing the callout).  Make
the one-time initialization a separate function and make init_machclk()
able to be called multiple times, any time.  We depend on tsc_freq first
being updated from the highest priority eventhandler, thus we run last
and call init_machclk() to set machclk_freq.  Also, don't initialize
static variables to 0.

Tested by:	Eygene Ryabinkin
Approved by:	re
2007-07-12 17:00:51 +00:00
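The split described in f1172c58e5, one-time setup in its own function plus an init routine that is safe to call repeatedly, might look like this in outline. The class and attribute names are hypothetical stand-ins; only init_machclk and machclk_freq come from the commit message.

```python
class MachClk:
    """Toy model of a clock whose init routine may run many times."""

    def __init__(self):
        self.machclk_freq = 0
        self.callout_ready = False
        self.setup_calls = 0  # counts how often one-time setup really ran

    def _setup_once(self):
        """One-time initialization (stands in for setting up the callout)."""
        if self.callout_ready:
            return
        self.callout_ready = True
        self.setup_calls += 1

    def init_machclk(self, tsc_freq):
        """Safe to call at any time: setup is guaranteed before freq is set."""
        self._setup_once()
        self.machclk_freq = tsc_freq


if __name__ == "__main__":
    clk = MachClk()
    clk.init_machclk(1_000_000)
    clk.init_machclk(2_000_000)
    print(clk.setup_calls, clk.machclk_freq)
```

Because every path goes through the guarded setup first, no caller can set the frequency while the callout is still uninitialized.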
Bruce Evans
fd7c4230b2 Fix some bugs involving the fsinfo block (many remain unfixed). This is
part of fixing msdosfs for large sector sizes.  One of the fixed bugs
was fatal for large sector sizes.

1. The fsinfo block has size 512, but it was misunderstood and declared
   as having size 1024, with nothing in the second 512 bytes except a
   signature at the end.  The second 512 bytes actually normally (if
   the file system was created by Windows) consist of a second boot
   sector which is normally (in WinXP) empty except for a signature --
   the normal layout is one boot sector, one fsinfo sector, another
   boot sector, then these 3 sectors duplicated.  However, other
   layouts are valid.  newfs_msdos produces a valid layout with one
   boot sector, one fsinfo sector, then these 2 sectors duplicated.
   The signature check for the extra part of the fsinfo was thus
   normally checking the signature in either the second boot sector
   or the first boot sector in the copy, and thus accidentally
   succeeding.  The extra signature check would just fail for weirder
   layouts with 512-byte sectors, and for normal layouts with any other
   sector size.

   Remove the extra bytes and the extra signature check.

2. Old versions did i/o to the fsinfo block using size 1024, with the
   second half only used for the extra signature check on read.  This
   was harmless for sector size 512, and worked accidentally for sector
   size 1024.  The i/o just failed for larger sector sizes.

   The version being fixed did i/o to the fsinfo block using size
   fsi_size(pmp) = (1024 << ((pmp)->pm_BlkPerSec >> 2)).  This
   expression makes no sense.  It happens to work for small
   sector sizes, but for sector size 32K it gives the preposterous
   value of 64M and thus causes panics.  A sector size of 32768 is
   necessary for at least some DVD-RW's (where the minimum write size
   is 32768 although the minimum read size is 2048).

   Now that the size of the fsinfo block is 512, it always fits in
   one sector so there is no need for a macro to express it.  Just
   use the sector size where the old code uses 1024.

Approved by:	re (kensmith)
Approved by:	nyan (several years ago for a different version of (2))
2007-07-12 16:09:07 +00:00
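To see why the old fsi_size() expression quoted in fd7c4230b2 blows up, it can be evaluated for a few sector sizes. This Python sketch assumes that pm_BlkPerSec is the sector size divided by a 512-byte device block size; that relationship is an assumption here, not something the commit message states.

```python
DEV_BSIZE = 512  # assumed device block size used to derive pm_BlkPerSec


def old_fsi_size(sector_size):
    """Evaluate 1024 << (pm_BlkPerSec >> 2) for a given sector size."""
    blk_per_sec = sector_size // DEV_BSIZE
    return 1024 << (blk_per_sec >> 2)


if __name__ == "__main__":
    for sector_size in (512, 1024, 2048, 4096, 32768):
        print(sector_size, old_fsi_size(sector_size))
```

Under that assumption a 32K sector gives pm_BlkPerSec = 64, a shift count of 16, and therefore 1024 << 16 = 64M, which matches the value the commit message calls preposterous.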
Andrew Gallatin
eb8e82f5fd Update the mxge(4) driver's copyright to 2007, and drop
the binary distribution clause.

Approved by: re (bmah)
2007-07-12 16:04:55 +00:00
Robert Watson
07cb08fd48 Directly initialize nxge's ifaddrp pointer to ifnetp->if_addr rather
than indirecting through ifaddr_byindex, which makes things easier with
respect to virtualized network stacks.

Submitted by:	Marko Zec <zec at icir dot org>
Reviewed by:	Leonid Grossman <Leonid dot Grossman at neterion dot com>
Approved by:	re (kensmith)
2007-07-12 10:03:29 +00:00
Konstantin Belousov
73f37bf31a bus_dma_tag_create() and bus_dma_mem_alloc() shall not be called with a
non-sleepable lock held. drm_pci_alloc() calls them, thus drm mutex shall
not be held during the call.

Move the drm_pci_alloc() to the start of the i915_initialize() and drop the
drm mutex around it.

Reported by:	Ganbold <ganbold micom mng net>
Reviewed by:	anholt
Approved by:	re (hrs)
MFC after:	1 week
2007-07-12 09:02:31 +00:00
Andrew Thompson
cddce0cb90 Improve the net80211 handling within ndis
- use net80211 for scanning and pass the results back to the scan cache
- use ieee80211_init_channels to fill our channel list
- fix up state transitions
- deprecate the old wicontrol ioctls
- add some debugging lines (#define NDIS_DEBUG)

Reviewed by:	sam
Approved by:	re (kensmith)
2007-07-12 02:54:05 +00:00
Jack F Vogel
acfc6150cf Removed unnecessary global includes for ixgbe, and em. Both have been
determined to be unnecessary.

Approved by: re
2007-07-12 00:01:53 +00:00
Jack F Vogel
13705f88fa Add the actual source too :)
Approved by:	re
2007-07-11 23:03:16 +00:00
Jack F Vogel
c27bff78be New driver for Intel 10G PCI-Express adapter (82598), driver is
still in Beta, but we want early users to have access to it in
7.0. Feedback welcome. Enjoy.	-Jack

Approved by: re
2007-07-11 22:59:57 +00:00
Matt Jacob
06b642b55d Remove the internal use of __packed and put it on the structures
themselves.

Reviewed by:	nate, peter, warner, robert
Approved by:	re (ken)
2007-07-11 22:34:34 +00:00
Matt Jacob
f9f47b5bf6 In the function pc98_check_if_type for the non-8251 case
make sure we initialize fields in the iod that otherwise
would have been initialized.

Reviewed by:	nate, ken, warner
Approved by:	re (ken)
2007-07-11 22:25:38 +00:00
Robert Watson
26e3bc3a96 Fix ioctls on the control vnode: ioctls on a character device fail with
ENOTTY.  Make the control vnode a regular file so that ioctls are passed
through to our kernel module.

Submitted by:	Jan Harkes <jaharkes@cs.cmu.edu>
Approved by:	re (kensmith)
2007-07-11 21:34:41 +00:00
Robert Watson
0e3ce855cc Avoid a panic in insmntque when we pass a NULL mount: this reenables
some previously disabled code which according to the comment caused a
problem during shutdown.  But even that is still better than
triggering a kernel panic whenever venus is started.

Submitted by:	Jan Harkes <jaharkes@cs.cmu.edu>
Approved by:	re (kensmith)
2007-07-11 21:33:46 +00:00
Robert Watson
74d326ada8 Replace CODA_OPEN with CODA_OPEN_BY_FD: coda_open was disabled because
we can't open container files by device/inode number pair anymore.
Replace the CODA_OPEN upcall with CODA_OPEN_BY_FD, where venus returns
an open file descriptor for the container file.  We can then grab a
reference on the vnode in coda_psdev.c:vc_nb_write and use this vnode for
further accesses to the container file.

Submitted by:	Jan Harkes <jaharkes@cs.cmu.edu>
Approved by:	re (kensmith)
2007-07-11 21:32:08 +00:00
Andrew Thompson
9baf942d49 Now that wicontrol has been removed from the base system the corresponding
ioctls can be removed. These have been #ifdef'd out and left as a reference in
case any of the RIDs need to be turned into sysctls at a later date.

Reviewed by:	sam, avatar
Approved by:	re (kensmith)
2007-07-11 21:25:48 +00:00
Robert Watson
934030b2c9 Resolve Coda mount failing because Coda failed to match the device
operations.  But we don't have to, if we find the coda_mntinfo structure
for this device in our linked list, we know the device is good.

Submitted by:	Jan Harkes <jaharkes@cs.cmu.edu>
Approved by:	re (kensmith)
2007-07-11 21:21:55 +00:00
Robert Watson
7263babb85 Avoid crash when opening Coda device: when allocating coda_mntinfo, we
need to initialize dev so that we can actually find the allocated
coda_mntinfo structure later on.

Submitted by:	Jan Harkes <jaharkes@cs.cmu.edu>
Approved by:	re (kensmith)
2007-07-11 20:39:53 +00:00
Maksim Yevmenkin
190fa66b39 Fix kbdmux(4) issue with backslash/underscore key not working on
Japanese 106/109 keyboard.

PR:		kern/112214, kern/99090
Submitted by:	TOMITA Yoshinori, TAKAHASHI Yoshihiro
Approved by:	re (hrs)
MFC after:	3 days
2007-07-11 18:57:15 +00:00
Attilio Rao
57899e0212 Fix userland applications compilation by using correct KPI protection
macros for lock_profiling.

Reported by: Tom McLaughlin <tmclaugh@sdf.lonestar.org>
Tested by: Tom McLaughlin <tmclaugh@sdf.lonestar.org>
Approved by: jeff (mentor)
Approved by: re
2007-07-11 18:51:31 +00:00
Hartmut Brandt
2125877649 This commit was generated by cvs2svn to compensate for changes in r171364,
which included commits to RCS files with non-trunk default branches.
2007-07-11 14:41:54 +00:00
Hartmut Brandt
e52e259e88 Vendor patch to remove some __inline qualifiers on non-static functions
because they seem to cause warnings in gcc-4.2.

Submitted by:	mjacob
Approved by:	re
2007-07-11 14:41:54 +00:00
Ariff Abdullah
05cba74005 Protect against divide by zero while calculating bus speed due to
possible broken kernel timecounter.

Reported/Tested by:	silby
Approved by:		re (hrs)
MFC after:		1 day
2007-07-11 14:27:45 +00:00
Xin LI
8d9a89a3a0 MFp4: Make use of the kernel unit number allocation facility
for tmpfs nodes.

Submitted by:	Mingyan Guo <guomingyan gmail com>
Approved by:	re (tmpfs blanket)
2007-07-11 14:26:27 +00:00
Robert Watson
f11c1e88f6 Remove now-stale 00READ file in the Coda tree; rvb isn't the current
contact for the Coda kernel module in FreeBSD.

Approved by:	re (kensmith)
2007-07-11 12:14:37 +00:00
Warner Losh
5b26652b0c Add Micro Research PCMCIA LAN Adapter MR10TPC support. Patch slightly
reworked by me.

Submitted by: Osamu Hasegawa-san
PR: 93393
Approved by: re (hrs)
2007-07-11 04:14:41 +00:00
Marcel Moolenaar
ba6a2bb365 Add --no-warn-mismatch to ld(1) when linking binary files into
ELF files. On ia64 the ELF header contains information about
characteristics of the machine code and ld(1) needs that to
determine whether input files are compatible for linking. To
this end non-ELF files are not supported by binutils on ia64.
However, the resulting ELF file seems to be correct despite the
warnings and the non-supportedness of non-ELF files and it
appears enough to unbreak the build of firmware(9) files on ia64
by simply suppressing the warning.

Ran into by: gallatin@
Approved by: re (hrs)
Looks good to me: mlaier@
2007-07-11 01:20:37 +00:00
Maksim Yevmenkin
37d4ce46c3 Mark ng_h4(4) as not MPSAFE and disconnect it from the LINT build for now.
Approved by:	re (rwatson)
2007-07-11 00:15:31 +00:00
Warner Losh
36fef1500d Add additional product id and quirks entry for MetaGeek Wi-Spy
Submitted by: Robert Noland
PR: 114481
Approved by: re@ (blanket)
2007-07-10 21:00:10 +00:00
Alan Cox
20dd22a24e Correct a problem in the ZERO_COPY_SOCKETS option, specifically, in
vm_page_cowfault().  Initially, if vm_page_cowfault() sleeps, the given
page is wired, preventing it from being recycled.  However, when
transmission of the page completes, the page is unwired and returned to
the page queues.  At that point, the page is not in any special state
that prevents it from being recycled.  Consequently, vm_page_cowfault()
should verify that the page is still held by the same vm object before
retrying the replacement of the page.  Note: The containing object is,
however, safe from being recycled by virtue of having a non-zero
paging-in-progress count.

While I'm here, add some assertions and comments.

Approved by: re (rwatson)
MFC After: 3 weeks
2007-07-10 18:41:34 +00:00
Maksim Yevmenkin
08b755600f Mark ng_h4(4) as not MPSAFE and disconnect it from the build for now.
Approved by:	re (rwatson)
2007-07-10 16:38:43 +00:00
Bruce Evans
8e55bfaf4b Don't use almost perfectly pessimal cluster allocation. Allocation
of the first cluster in a file (and, if the allocation cannot be
continued contiguously, for subsequent clusters in a file) was randomized
in an attempt to leave space for contiguous allocation of subsequent
clusters in each file when there are multiple writers.  This reduced
internal fragmentation by a few percent, but it increased external
fragmentation by up to a few thousand percent.

Use simple sequential allocation instead.  Actually maintain the fsinfo
sequence index for this.  The read and write of this index from/to
disk still have many non-critical bugs, but we now write an index that
has something to do with our allocations instead of being modified
garbage.  If there is no fsinfo on the disk, then we maintain the index
internally and don't go near the bugs for writing it.

Allocating the first free cluster gives a layout that is almost as good
(better in some cases), but takes too much CPU if the FAT is large and
the first free cluster is not near the beginning.

The effect of this change for untar and tar of a slightly reduced copy
of /usr/src on a new file system was:

Before (msdosfs 4K-clusters):
untar:  459.57 real              untar from cached file (actually a pipe)
tar:    342.50 real              tar from uncached tree to /dev/zero
Before (ffs2 soft updates 4K-blocks 4K-frags)
untar:   39.18 real
tar:     29.94 real
Before (ffs2 soft updates 16K-blocks 2K-frags)
untar:   31.35 real
tar:     18.30 real

After (msdosfs 4K-clusters):
untar    54.83 real
tar      16.18 real

All of these times can be improved further.

With multiple concurrent writers or readers (especially readers), the
improvement is smaller, but I couldn't find any case where it is
negative.  342 seconds for tarring up about 342 MB on a ~47MB/S partition
is just hard to unimprove on.  (This operation would take about 7.3
seconds with reasonably localized allocation and perfect read-ahead.)
However, for active file systems, 342 seconds is closer to normal than
the 16+ seconds above or the 11 seconds with other changes (best I've
measured -- won easily by msdosfs!).  E.g., my active /usr/src on ffs1
is quite old and fragmented, so reading to prepare for the above
benchmark takes about 6 times longer than reading back the fresh copies
of it.

Approved by:	re (kensmith)
2007-07-10 13:20:24 +00:00
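The figures quoted in 8e55bfaf4b can be cross-checked with a little arithmetic. The helper names below are made up for illustration; the input numbers are taken from the commit message.

```python
def speedup(before_seconds, after_seconds):
    """Ratio of elapsed times: how many times faster the 'after' run is."""
    return before_seconds / after_seconds


def ideal_seconds(megabytes, mb_per_second):
    """Time to stream the data at the partition's raw transfer rate."""
    return megabytes / mb_per_second


if __name__ == "__main__":
    print("untar speedup:", round(speedup(459.57, 54.83), 1))
    print("tar speedup:  ", round(speedup(342.50, 16.18), 1))
    print("ideal tar time:", round(ideal_seconds(342, 47), 1))
```

On the msdosfs numbers this gives roughly an 8.4x improvement for untar and 21.2x for tar, and confirms the "about 7.3 seconds" estimate for reading 342 MB at 47 MB/s.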
Robert Watson
43bbb6aa10 Further cleanup of UDPv4:
- Move udp_sendspace and udp_recvspace global variables and associated
  sysctls to the top of the file where most other such things are present.

- Rename static variable 'blackhole' to 'udp_blackhole' and unstaticize
  so that we can add blackhole support for UDPv6 using the same MIB
  variable.

- Move udp_append() above udp_input() to match the function order in
  udp6_usrreq.c.

Approved by:	re (kensmith)
2007-07-10 09:30:46 +00:00
Tai-hwa Liang
5ee1ac4645 Fixing the mount_smbfs(8) hanging by utilising the destroy_dev_sched() KPI.
Relevant threads:

  http://lists.freebsd.org/pipermail/freebsd-current/2007-June/074329.html

Reviewed by:	kib, bp (slightly different version)
Tested by:	Yuri Pankov <yuri.pankov at gmail dot com>,
		Jiawei Ye <leafy7382 at gmail dot com>
Approved by:	re (kensmith)
2007-07-10 09:23:10 +00:00
Matt Jacob
2e4637cd75 Get rid of a couple of Coverity found sign comparison errors.
Approved by:	re (Ken)
MFC after:	3 days
2007-07-10 07:55:59 +00:00
Matt Jacob
bb4f528dd8 Be more conservative- turn off fast posting and RIO for 22XX cards.
Approved by:	re (ken)
MFC after:	3 days
2007-07-10 07:55:04 +00:00
Kip Macy
b8fe6051bf MFp4 122896
- reduce cpu usage by as much as 25% (40% -> 30%) by doing txq reclaim more efficiently
   - use mtx_trylock when trying to grab the lock to avoid spinning during long encap loop
   - add per-txq reclaim task
   - if mbufs were successfully re-claimed try another pass
- track txq overruns with sysctl

Approved by: re (blanket)
2007-07-10 06:01:45 +00:00
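The trylock idea from b8fe6051bf, skipping reclaim when the lock is busy rather than spinning on it, can be illustrated in userland Python. The lock and function names are hypothetical, and threading.Lock merely stands in for the kernel mutex that mtx_trylock operates on.

```python
import threading

txq_lock = threading.Lock()  # hypothetical stand-in for a per-txq mutex


def try_reclaim(pending):
    """Reclaim completed entries only if the lock is free right now.

    Returns the number of entries reclaimed, or None if the lock was busy.
    """
    if not txq_lock.acquire(blocking=False):
        return None  # lock is held elsewhere: skip instead of waiting
    try:
        reclaimed = len(pending)
        pending.clear()
        return reclaimed
    finally:
        txq_lock.release()


if __name__ == "__main__":
    print(try_reclaim(["mbuf0", "mbuf1"]))
```

A caller that gets None simply moves on and lets a later pass (or, in the commit's design, the per-txq reclaim task) do the work.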
Marcel Moolenaar
c108b80c8c Cast the arguments to atomic_*_ptr() when mapping it to atomic_*_32()
This is a minimal fix.

Approved by: re (kensmith)
2007-07-10 04:40:00 +00:00
Warner Losh
2f33a9032b Missed in last commit: add usb task for rue to use for its ticks.
Approved by: re (bmah)
2007-07-09 20:56:39 +00:00
Ariff Abdullah
0937dd1ec0 - Add codec id for Realtek ALC268.
- Add controller id for Intel 82801I (ICH9).
  PR:			kern/114399
  Submitted by:		Michael Fuckner <michael@fuckner.net>

- MSI support. Disabled by default due to various issues with too much
  broken hardware. MSI can be enabled through device.hints(5) or
  kenv(8) by setting "hint.pcm.%d.msi=1".
  Partially submitted by:	kevlo
                         	YAMAMOTO Taku <taku@tackymt.homeip.net>
  Tested by:			joel, kevlo, YAMAMOTO Taku

Approved by:	re (hrs)
MFC after:	3 days
2007-07-09 20:42:11 +00:00
Ariff Abdullah
8a7c4d36cb Fix stream suspend/resume activity due to its states being
clobbered by pcm channel start/stop trigger operation.

Approved by:	re (hrs)
2007-07-09 20:41:23 +00:00
Robert Watson
542a638396 General style, white space, and comment cleanup; move to ANSI C
prototypes, don't use register, etc.  Synchronize structure and
layout to the IPv4 versions of these functions to a greater extent,
making visual comparison easier.

Remove now stale or incorrect comments.

Enable full lock assertions, and correct one exception handling
case where the wrong label was jumped to.

Tested by:	bz
Approved by:	re (bmah)
2007-07-09 17:47:04 +00:00
Warner Losh
f129c7fd08 When all the other drivers were converted to scheduling a taskqueue to
do the heavy lifting of the 'mii_tick' function, rue was left behind.
Implement this in a naive way.  Reports from the field show this makes
the driver functional with some locking issues, as opposed to an
instant panic.  Those will be addressed in a later version of the
driver.

Approved by: re@ (bmah)
2007-07-09 16:58:07 +00:00
Warner Losh
bb900be1fe Fix duplicates that crept in at the last minute :-(.
Noticed by: Ian Freislich
Approved by: re@ (blanket)
2007-07-09 14:26:08 +00:00
Bruce M Simpson
d90b8675c2 Fix a regression in IPv4 multicast join path (IP_ADD_MEMBERSHIP).
With the in_mcast.c code, if an interface for an IPv4 multicast join was
not specified, and a route did not exist for the specified group in the
unicast forwarding tables, the join would be rejected with the error
EADDRNOTAVAIL.
This change restores the old behaviour whereby if no interface is specified,
and no route exists for the group destination, the IPv4 address list is
walked to find a non-loopback, multicast-capable interface to satisfy
the join request.
This should resolve problems with starting multicast services during
system boot or when a default forwarding entry does not exist.

Approved by:	re (rwatson)
2007-07-09 10:36:47 +00:00
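The fallback order restored by d90b8675c2 can be sketched as below. The Interface type and the function are hypothetical and only model the decision described in the commit message, not the in_mcast.c code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Interface:
    name: str
    loopback: bool
    multicast: bool


def pick_join_interface(requested: Optional[Interface],
                        route_ifp: Optional[Interface],
                        interfaces: List[Interface]) -> Optional[Interface]:
    """Choose the interface for an IPv4 multicast join.

    Order: explicit request, then the route for the group, then the first
    non-loopback, multicast-capable interface. None models EADDRNOTAVAIL.
    """
    if requested is not None:
        return requested
    if route_ifp is not None:
        return route_ifp
    for ifp in interfaces:
        if not ifp.loopback and ifp.multicast:
            return ifp
    return None


if __name__ == "__main__":
    ifs = [Interface("lo0", True, True), Interface("em0", False, True)]
    print(pick_join_interface(None, None, ifs))
```

The last step is what lets a join succeed early in boot, before any default route exists.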
Doug Rabson
2dc26b36c8 Correct a reference-counting mistake in the ZFS code which led to abnormal
memory usage and pessimal cache performance.

Reviewed by: pjd
Approved by: re (rwatson)
2007-07-09 09:03:49 +00:00
Warner Losh
66807691fe Further diff reduction against the proposed merged usbdevs: Add a few
more vendors, use slightly more standardized names.

No md5 changes for !USBVERBOSE kernels

Approved by: re@ (blanket)
2007-07-09 06:20:07 +00:00
Warner Losh
dc950f0469 More vendors from the merged list.
Sort NETGEAR list per convention.
Swap QUALCOMM and QUALCOMM2.
Add a few vendor products.

no md5 changes with this file (except when USBVERBOSE is enabled)

Approved by: re@ (blanket)
2007-07-09 05:47:32 +00:00
Marcel Moolenaar
acd760988d dma_tag is a static structure. Testing for it being a NULL pointer
doesn't make sense. Rewrite to what was intended.

Correctly warned about by: GCC
Approved by: re (bmah)
2007-07-09 04:58:16 +00:00
Alan Cox
d1974c0df1 Eliminate the special case handling of OBJT_DEVICE objects in
vm_fault_additional_pages() that was introduced in revision 1.47.  Then
as now, it is unnecessary because dev_pager_haspage() returns zero for
both the number of pages to read ahead and read behind, producing the
same exact behavior by vm_fault_additional_pages() as the special case
handling.

Approved by: re (rwatson)
2007-07-08 19:42:52 +00:00
Attilio Rao
ea11c140d0 NULL_LDT_BASE is used in !SMP kernels too and set_user_ldt() is not
properly called. Address these two issues.

Reported by: Tinderbox
Tested by: le
Approved by: jeff (mentor)
Approved by: re
2007-07-08 18:17:42 +00:00
Xin LI
1df86a323d MFp4:
- Plug memory leak.
- Respect underlying vnode's properties rather than assuming that
  the user wants root:wheel + 0755.  Useful for using tmpfs(5) for
  /tmp.
- Use roundup2 and howmany macros instead of rolling our own version.
- Try to fix fsx -W -R foo case.
- Instead of blindly zeroing a page, determine whether we need a pagein
  in order to prevent data corruption.
- Fix several bugs reported by Coverity.

Submitted by:	Mingyan Guo <guomingyan gmail com>, Howard Su, delphij
Coverity ID:	CID 2550, 2551, 2552, 2557
Approved by:	re (tmpfs blanket)
2007-07-08 15:56:12 +00:00
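For readers unfamiliar with the roundup2 and howmany macros mentioned in 1df86a323d, these Python functions mirror their usual arithmetic. They are equivalents written for illustration, not the C macro definitions.

```python
def roundup2(x, y):
    """Round x up to a multiple of y, where y must be a power of two."""
    return (x + (y - 1)) & ~(y - 1)


def howmany(x, y):
    """Number of y-sized units needed to hold x (ceiling division)."""
    return (x + (y - 1)) // y


if __name__ == "__main__":
    page = 4096
    print(roundup2(5000, page), howmany(5000, page))
```

A 5000-byte file, for example, rounds up to 8192 bytes and occupies two 4096-byte pages.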
Hidetoshi Shimokawa
ead41a8810 Fix a bug of retrieving configuration ROM.
- Handle directories and leaves other than unit directories and text leaves
  correctly.
- Now we can retrieve CROM of iSight correctly.

Approved by: re (hrs)
Tested by: flz
MFC after: 3 days
2007-07-08 11:47:52 +00:00
Nate Lawson
d73144e778 Now that we have a function that can be called from a cdevsw close()
entry point, use it.

Approved by:	re
2007-07-07 17:54:33 +00:00
Attilio Rao
05dfa22fe9 The current code shows several problems in ia32 LDT handling:
- When an LDT entry changes, the old one is freed while it is still
  referenced by gdt and ldtr.  This can lead to disruptive behaviour, in
  particular on SMP machines.
- When an LDT entry changes, it is assumed that the only entities sharing
  the same LDT are threads in the same proc.  It doesn't take into account
  edge cases where two processes share the same VM (rfork'ed ones, for
  example).

This patch addresses these two problems and additionally fixes the
usage of refcount by switching it back to the old manually-grown refcount
(since in this case it would be faster).

Diagnosed by: tegge
Tested by: pho (a former version)
Reviewed by: kib
Approved by: jeff (mentor)
Approved by: re
2007-07-07 16:59:01 +00:00
Robert Watson
bd84d20457 Minor UDPv4 cleanup: capitalize comment, move statistics update after mbuf
free to be consistent with other error handling, and release socket buffer
lock before freeing mbufs and statistics updates rather than after.

Approved by:	re (kensmith)
2007-07-07 09:46:34 +00:00
Alan Cox
65ea29a690 When a cached page is reactivated in vm_fault(), update the counter that
tracks the total number of reactivated pages.  (We have not been
counting reactivations by vm_fault() since revision 1.46.)

Correct a comment in vm_fault_additional_pages().

Approved by:	re (kensmith)
MFC after:	1 week
2007-07-06 21:25:21 +00:00
Warner Losh
f1d2cc831c Trivial differences with the proposed merged BSD usbdevs file merged
in.  These are exclusively in the name of the company for this round.
No new devices have been added, but the MITEL entry has been
eliminated because nothing uses it.  You won't see any difference
unless you have USBVERBOSE defined for the kernel.

Approved by: re@ (blanket)
2007-07-06 20:05:39 +00:00
Warner Losh
56f6c2d8fa uhub already does the printing and naming of a device, so don't do it
again here for compat drivers.

Approved by: re@ (blanket)
2007-07-06 20:02:37 +00:00
Attilio Rao
c1a6d9fa42 Fix some problems with lock_profiling in sx locks:
- Adjust the semantics of the lock_profiling stubs in the hard functions in
  order to be more accurate and trustworthy
- Disable shared paths for lock_profiling.  Currently, lock_profiling has a
  subtle race which makes results coming from shared paths not completely
  trustworthy. A macro stub (LOCK_PROFILING_SHARED) can be used for
  re-enabling these paths, but is currently intended for development use only.
- Use homogeneous names for automatic variables in hard functions regarding
  lock_profiling
- Style fixes
- Add a CTASSERT for some flags building

Discussed with: kmacy, kris
Approved by: jeff (mentor)
Approved by: re
2007-07-06 13:20:44 +00:00
Bjoern A. Zeeb
7a5dee0567 I4B header files were repo-copied from sys/i386/include/ to
sys/i4b/include/ so they will be available to all architectures
once I4B compiles on those.

We no longer need these "glue" files.

Reminded by:	nyan
Approved by:	re (kensmith)
2007-07-06 08:05:46 +00:00
Bjoern A. Zeeb
bebcac07fc Bump version after repo-copy of I4B headers.
The headers will now be installed to include/i4b/ and
no longer to include/machine/.

Approved by:	re (kensmith)
2007-07-06 07:36:09 +00:00
Bjoern A. Zeeb
5b919cdc47 I4B header files were repo-copied from sys/i386/include/ to
sys/i4b/include/ so they will be available to all architectures
once I4B compiles on those.

Approved by:	re (kensmith)
2007-07-06 07:23:39 +00:00
Bjoern A. Zeeb
6f5d8741e5 I4B header files were repo-copied from sys/i386/include/ to
sys/i4b/include/ so they will be available to all architectures
once I4B compiles on those.

Adapt #include paths.

Approved by:	re (kensmith)
2007-07-06 07:17:22 +00:00
Peter Wemm
01f7d072de I did not intend to turn -Werror on for pc98. Refine the test for
turning it on for i386.

Approved by:  re (rwatson, followup)
2007-07-06 01:50:58 +00:00
Peter Wemm
0a6bd02876 Turn on -Werror for sparc64 and sun4v.
Approved by:	re (rwatson)
2007-07-06 00:52:29 +00:00
Peter Wemm
89200512b3 Fix warnings.
nxge: cast page size fragments down to (int). If the vm's demand paging
PAGE_SIZE is ever too big for that, we've got far bigger problems.
ofw: move va_start() a little earlier. gcc-4.2 doesn't like us modifying
the last arg before the va_start().

Approved by:	re (rwatson)
2007-07-06 00:47:44 +00:00
Peter Wemm
c5b102f584 Fix warning - add missing #include
Submitted by:	mjacob
Approved by:	re (rwatson)
2007-07-06 00:41:53 +00:00
Pyun YongHyeon
141f92e7b5 re(4) devices require an external EEPROM. Depending on the model it
would be a 93C46 (1Kbit) or a 93C56 (2Kbit). One of the differences between
them is the number of address lines required to access the EEPROM. For example,
the 93C56 EEPROM needs 8 address lines to read/write data. If the 93C56
sees the serial clock (CLK) end prematurely, before the number of cycles
required to set the OP code/address of the EEPROM, the result would be
unexpected behavior.
Previously it tried to detect the 93C46, which requires 6 address lines,
and then assumed it would be a 93C56 if the read data was not the expected
value. However, this approach didn't work in some models/situations
as the 93C56 requires 8 address lines to access its data. In order to fix
it, change the EEPROM probing order such that the 93C56 is detected reliably.

While I'm here, replace hard-coded address line numbers with defined
constants to enhance readability.

PR:	112710
Approved by:	re (mux)
2007-07-06 00:05:12 +00:00
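
The probing order described in the re(4) commit above (141f92e7b5) can be sketched as follows. This is a toy userland model, not the driver code: the struct, the function names, the constant names, and the ID value 0xABCD are all assumptions made for illustration. The model also assumes that a read with the wrong address width returns garbage, which simplifies what real hardware does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names and values; not taken from the driver. */
#define EE_93C46_ADDR_LEN	6
#define EE_93C56_ADDR_LEN	8
#define EE_EXPECTED_ID		0xABCD	/* arbitrary placeholder ID word */

/* Toy EEPROM: records the address width the part actually needs. */
struct mock_eeprom {
	int		addr_len;
	uint16_t	id_word;
};

/*
 * Toy read of the ID word: succeeds only when the caller clocks out
 * the number of address bits the part expects, else returns garbage.
 */
static uint16_t
mock_read_id(const struct mock_eeprom *ee, int addr_len)
{
	return (addr_len == ee->addr_len ? ee->id_word : 0xFFFF);
}

/*
 * Probe the wider part first, so a 93C56 is never given a short
 * address. Fall back to the 93C46 width only if the ID word read
 * with 8 address bits is not the expected value.
 */
static int
probe_addr_len(const struct mock_eeprom *ee)
{
	assert(ee != NULL);
	if (mock_read_id(ee, EE_93C56_ADDR_LEN) == EE_EXPECTED_ID)
		return (EE_93C56_ADDR_LEN);
	return (EE_93C46_ADDR_LEN);
}
```

In this model the fallback to 6 address lines is taken only after the 8-line read has failed, so the 93C56 never sees a truncated address.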
Xin LI
2a463222be Space cleanup
Approved by:	re (rwatson)
2007-07-05 16:29:40 +00:00
Xin LI
1272577e22 ANSIfy[1] plus some style cleanup nearby.
Discussed with:	gnn, rwatson
Submitted by:	Karl Sjödahl - dunceor <dunceor gmail com> [1]
Approved by:	re (rwatson)
2007-07-05 16:23:49 +00:00
George V. Neville-Neil
a22fb0da42 Added comments explaining the requirement for device crypto with IPSEC
Approved by: re
2007-07-05 15:33:13 +00:00
Max Laier
e22a271eeb Remove unused variable from pf_subr.c to make it -Werror buildable.
Approved by:	re (kensmith)
2007-07-05 15:28:59 +00:00
Warner Losh
05adc69b08 Prefer device_printf to printf + device_get_nameunit. This saves
about 100 bytes.

Approved by: re (blanket)
2007-07-05 15:25:32 +00:00
Tai-hwa Liang
798a64346d MFp4: Fixing IPW_DEBUG enabled builds by converting the last piece of
ic->ic_des_essid to ic->ic_des_ssid[0].

Reviewed by:	sam
Approved by:	re (kensmith)
2007-07-05 15:06:49 +00:00
Robert Watson
458f818f47 In preparation for 7.0 privilege cleanup, clean up style:
- Sort copyrights by date.
- Re-wrap, and in some cases, fix comments.
- Fix tabbing, white space, remove extra blank lines.
- Remove commented out debugging printfs.

Approved by:	re (kensmith)
2007-07-05 13:16:04 +00:00
Konstantin Belousov
542a8db549 Adapt snp to the destroy_dev_sched() KPI after reverting destroy_dev()
so that it does not call destroy_dev_sched().

Tested by:	Peter Holm
Approved by:	re (kensmith)
2007-07-05 13:07:12 +00:00
Konstantin Belousov
196a7385ac Revert destroy_dev() to the state before destroy_dev_sched() was introduced.
Attempting to spawn destroy_dev_sched() from it causes inadmissible races.

Requested by:	tegge
Approved by:	re (kensmith)
2007-07-05 13:04:59 +00:00
Ariff Abdullah
36bc8661bf Properly unlock mutex before returning. There was a slight mishap
during last major locking cleanup.

Reported by:	Thierry Herbelot <thierry@herbelot.com>
Approved by:	re (mux)
2007-07-05 10:22:37 +00:00
Peter Wemm
8032d6336f Turn on -Werror for i386 kernel builds.
Approved by: re (rwatson)
2007-07-05 09:30:34 +00:00
Andrew Thompson
b3d37ca5f8 Allow the LACP state, which at the moment is the actor and partner peer
info, to be queried from userland. Print out the active aggregator and per
port data in verbose mode from ifconfig.

Approved by:	re (mux)
2007-07-05 09:18:57 +00:00
Bjoern A. Zeeb
f43455fd89 Remove netkey directory from cscope/TAGs generation and replace
it with netipsec now that KAME IPsec is gone.
While here add missing netinet6 directories.

Add comments about the ports needed to be able to run those targets.

Reviewed by:	philip
Approved by:	re (rwatson)
2007-07-05 08:55:14 +00:00
Bjoern A. Zeeb
7089081d49 Fix a build breakage as result of disabling parts of I4B.
Check for (temporarily gone) kernel options to be defined before using
them.

Reported by:	peter
Approved by:	re (rwatson)
2007-07-05 08:53:21 +00:00
Peter Wemm
a031fd450e Quiet framelen uninitialized warning. I think it was a false alarm.
If check_fhdr() returns false, the frame_ok variable should protect any
meaningful evaluations of framelen.

Approved by: re (rwatson)
2007-07-05 07:46:33 +00:00
Peter Wemm
b77acb8748 Quiet warnings. I believe gcc is incorrect about these.
Approved by:  re (rwatson)
2007-07-05 07:38:17 +00:00
Peter Wemm
e106f3d812 __packed has no effect on u_int8_t's except to cause a warning (and
never has had any effect).

Approved by:  re (rwatson)
2007-07-05 07:28:38 +00:00
Peter Wemm
61ba2e0a14 Turn -Werror back on for amd64 for kernel builds.
Approved by:  re (rwatson)
2007-07-05 07:06:17 +00:00
Peter Wemm
4085424709 Compile pf/pf_subr.c and netnatm/cc_conn.c without -Werror for the time
being.

Approved by:  re (rwatson)
2007-07-05 07:04:17 +00:00
Peter Wemm
0278f1c0a3 Quiet warnings. These do not appear to be actually used uninitialized,
but gcc's optimizer isn't smart enough to see that.  Pre-initializing
seems harmless enough.

Approved by:  re (rwatson)
2007-07-05 06:59:14 +00:00
Peter Wemm
0273079097 Fix a stray splx() that caused a new warning.
Approved by:  re (rwatson)
2007-07-05 06:54:03 +00:00
Peter Wemm
cb3a418e8d Initialize DWBuf[3].
Approved by:  re (rwatson, blanket)
2007-07-05 06:51:49 +00:00
Peter Wemm
343cc83e1b Fix a bunch of warnings due to a missing forward declaration of a struct.
Approved by: re (rwatson)
2007-07-05 06:45:37 +00:00
Warner Losh
d9c12353bf Prefer device_printf to printf("%s: ...", device_get_nameunit()). On
amd64, we save about 240 bytes (this is about 20 per instance).

Approved by: re (blanket)
2007-07-05 06:42:14 +00:00