Commit Graph

79971 Commits

Author SHA1 Message Date
John Baldwin
204404e890 Skip SMAP regions above 4GB on i386 since they will not fit into a long.
While here, update some comments to better explain the new code flow.

Tested by:	dhw
2010-11-02 13:04:25 +00:00
John Baldwin
33b31db666 Don't leak the LLE lock if the arptimer callout is pending or inactive.
Reported by:	David Rhodus
MFC after:	1 month
2010-11-02 13:00:56 +00:00
Alexander Motin
82d2b37bc0 Remove stale line, accidentally slipped into r214016.
MFC after:	3 days
2010-11-02 09:31:24 +00:00
David E. O'Brien
6dfeb66eda Shorten long lines. 2010-11-02 05:39:57 +00:00
Olivier Houchard
306cc0acfb Try to be a little smart at guessing where _start is located in flash, instead
of relying on a binutils bug.

Reported by:	dim
2010-11-01 21:04:23 +00:00
Jack F Vogel
e4c690b4f0 Sync the lem code up with the vlan and other fixes in em.
Delete a unneeded test from the beginning of em_xmit.
CRITICAL: shared code fix for 82574, a mutex might not be
          released, this can cause hangs.
2010-11-01 20:19:25 +00:00
John Baldwin
32c3d3b6e6 Move <machine/apicreg.h> to <x86/apicreg.h>. 2010-11-01 18:18:46 +00:00
John Baldwin
5ecdb3c46b Move the <machine/mca.h> header to <x86/mca.h>. 2010-11-01 17:40:35 +00:00
John Baldwin
544de89de0 Add an x86/include directory to the kernel to hold headers that are common
to amd64, i386, and pc98.  The headers are installed to /usr/include/x86
during an installworld, and an 'x86' symlink is created for kernel builds
similar to 'machine' so that the headers can be included as <x86/foo.h>.

Reviewed by:	imp
2010-11-01 17:34:04 +00:00
Alan Cox
e396eb604f Implement pmap_is_prefaultable().
Reviewed by:	nwhitehorn
2010-11-01 02:22:48 +00:00
David Xu
444528c026 Use integer for size of cpuset, as it won't be bigger than INT_MAX,
This is requested by bge.
Also move the sysctl into file kern_cpuset.c, because it should
always be there, it is independent of thread scheduler.
2010-11-01 00:42:25 +00:00
Nathan Whitehorn
e36e3d8221 Add a security nit to recent copyin/out changes: map the user segment
no-execute in case of exploitable kernel bugs.

MFC after:	1 week
2010-10-31 23:04:15 +00:00
Marius Strobl
e598f12273 Turn a panic into a printf so IFM_ETH_MASTER on !IFM_1000_T is complained
about but otherwise ignored. When allowing the master to be set manually via
ifconfig(8) by adding the former to IFM_SUBTYPE_ETHERNET_OPTION_DESCRIPTIONS
(as it should be) it seems to be unfavorable that a machine can be made to
panic with a simple ifconfig(8) invocation.
2010-10-31 22:59:49 +00:00
Nathan Whitehorn
ad6b3047a4 Next-to-leading-order perturbation of synchronization operations for
switching the user segment register. All races should now be closed and
a minimum of pipelines flushes be required to close them.
2010-10-31 22:55:51 +00:00
Marius Strobl
3b18190fe9 Try to make the style consistent (including regarding NetBSD bits not yet
merged) and adhere style(9).
2010-10-31 22:46:39 +00:00
Marius Strobl
4a446e3e8e Make a comment reflect reality. 2010-10-31 22:41:53 +00:00
Nathan Whitehorn
50fd2a5b9c Add a driver for the Apple Uninorth AGP host bridge found in all PowerPC
Macintoshes with an AGP bus.
2010-10-31 18:27:05 +00:00
Nathan Whitehorn
c4bcebed17 Add some missing parentheses so that moea_bat_mapped() actually works.
Submitted by:	alc
MFC after:	3 days
2010-10-31 15:07:09 +00:00
Alexander Motin
189795fe68 Fix callout_tickstofirst() behavior after signed integer ticks overflow.
This should fix callout precision drop to 1/4s after 25 days of uptime
with HZ = 1000.

Submitted by:	Taku YAMAMOTO <taku@tackymt.homeip.net>
2010-10-31 11:44:41 +00:00
Yoshihiro Takahashi
f8a94ecc2c Rename BUS_SPACE_IO and BUS_SPACE_MEM defines to BUS_SPACE_TAG_IO and
BUS_SPACE_TAG_MEM respectively to avoid conflict with nexus.c.
2010-10-31 03:03:20 +00:00
Alan Cox
2eeee67ce8 Add another safety belt to pmap_demote_DMAP(). 2010-10-30 23:49:37 +00:00
Nathan Whitehorn
c04246f45a Allow access to the HT I/O port space on the IBM CPC9X5 northbridge chips.
MFC after:	2 weeks
2010-10-30 23:09:56 +00:00
Nathan Whitehorn
54c562081f Restructure the way the copyin/copyout segment is stored to prevent a
concurrency bug. Since all SLB/SR entries were invalidated during an
exception, a decrementer exception could cause the user segment to be
invalidated during a copyin()/copyout() without a thread switch that
would cause it to be restored from the PCB, potentially causing the
operation to continue on invalid memory. This is now handled by explicit
restoration of segment 12 from the PCB on 32-bit systems and a check in
the Data Segment Exception handler on 64-bit.

While here, cause copyin()/copyout() to check whether the requested
user segment is already installed, saving some pipeline flushes, and
fix the synchronization primitives around the mtsr and slbmte
instructions to prevent accessing stale segments.

MFC after:	2 weeks
2010-10-30 23:07:30 +00:00
Marius Strobl
f5a1822131 Correct a bug in r213893; within a PHY driver MIIF_PHYPRIVn should be used
instead of MIIF_MACPRIVn. This didn't make a functional difference though.
2010-10-30 20:51:25 +00:00
Bjoern A. Zeeb
13a6cf24ac Announce both IPsec and UDP Encap (NAT-T) if available for
feature_present(3) checks.

This will help to run-time detect and conditionally handle specific
optionas of either feature in user space (i.e. in libipsec).

Descriptions read by:	rwatson
MFC after:		2 weeks
2010-10-30 18:52:44 +00:00
Alan Cox
d689bc0082 Correct some format strings used by sysctls.
MFC after:	1 week
2010-10-30 18:00:53 +00:00
Alan Cox
59fb2d9b04 Don't demote in pmap_demote_DMAP() if the specified length is zero. 2010-10-30 17:21:32 +00:00
Konstantin Belousov
3a40a00d56 Remove sysctl debug.ncnegfactor, it is renamed to vfs.ncnegfactor.
MFC:	do not
2010-10-30 14:08:26 +00:00
Pyun YongHyeon
d0b2f7efb7 Don't bother to enable ASPM L1 to save more power. Even though I am
not able to trigger the issue with sample boards, some users seems
to suffer from freeze/lockup when system is booted without UTP cable
plugged in. I'm not sure whether this is BIOS issue or controller
bug. This change fixes AR8132 lockup issue seen on EEE PC.

Reported by:	kmoore
Tested by:	kmoore
2010-10-30 01:12:54 +00:00
Marius Strobl
0da4045955 - When resetting pm_active and pm_context of a pmap in pmap_pinit() we
need locking as otherwise we may race against the other parts of the
  MD code which expects a consistent state of these. While at it move
  the resetting of the pmap before entering it in the TSB.
- Spell a 0 as TLB_CTX_KERNEL.
2010-10-29 20:51:30 +00:00
Marius Strobl
340e331450 Partially revert r203829; as it turns out what the PowerPC OFW loader did
was incorrect as further down the road cons_probe() calls malloc() so the
former can't be called before init_heap() has succeed. Instead just exit
to the firmware in case init_heap() fails like OF_init() does when hitting
a problem as we're then likely running in a very broken environment where
hardly anything can be trusted to work.
2010-10-29 20:42:02 +00:00
Edward Tomasz Napierala
252e4a96e6 Fix uninitialized variable.
Found with:	Coverity Prevent(tm)
CID:		8632
2010-10-29 19:07:36 +00:00
Rui Paulo
09b6dcf968 Sync DLTs with the latest pcap version. 2010-10-29 18:41:09 +00:00
Attilio Rao
8c0b6eaff1 Merging mptable under x86 left this option undefined for amd64 case.
Fix that.

Sponsored by:	Sandvine Incorporated
Reported by:	jkim
2010-10-29 18:38:36 +00:00
Attilio Rao
4e30bd6244 - Merge ram_attach() implementation for i386 and amd64
- Rename RES_BUS_SPACE_* into BUS_SPACE_* for consistency
- Trim out an unnecessary checking condition

Sponsored by:	Sandvine Incorporated
Requested and reviewed by:	jhb
2010-10-29 18:33:43 +00:00
Rick Macklem
f93d95cbf6 Modify nfs_open() in the experimental NFS client to be compatible
with the regular NFS client. Also, fix a couple of mutex lock issues.

MFC after:	1 week
2010-10-29 13:46:21 +00:00
Rick Macklem
0661e0348b Add a call for nfsrpc_close() to ncl_reclaim() in the experimental
NFSv4 client, since the call in ncl_inactive() might be missed
because VOP_INACTIVE() is not guaranteed to be called before
VOP_RECLAIM().

MFC after:	1 week
2010-10-29 13:34:57 +00:00
David Xu
b67cc292dc Add sysctl kern.sched.cpusetsize to export the size of kernel cpuset,
also add sysconf() key _SC_CPUSET_SIZE to get sysctl value.

Submitted by: gcooper
2010-10-29 13:31:10 +00:00
Gleb Smirnoff
27bf126d23 Remove meaningless XXXXX, that is a remain of comment, removed in r186200. 2010-10-29 11:13:42 +00:00
Gleb Smirnoff
28e1f17c81 Revert a small part of the r198301, that is entirely unrelated to the
r198301 itself. It also broke the logic of not sending more than one
ARP request per second, that consequently lead to a potential problem
of flooding network with broadcast packets.

MFC after:	1 week
2010-10-29 10:57:18 +00:00
Nathan Whitehorn
49939626be Fix the printf() in init_heap so that it can run before the console is up.
Pointed out by:	marius
2010-10-29 00:37:35 +00:00
Nathan Whitehorn
51b1acac58 Fix netboot on some Apple machines on which calling dma-free on the
network device can hang the machine. This causes the loss of 64 KB of
accessible memory on netbooted machines.
2010-10-29 00:36:44 +00:00
Nathan Whitehorn
e60ab831db Fix some memory management issues discovered when trying to boot the PPC
OF loader on systems where address cells and size cells are both 2 (the
Mambo simulator) and fix an error where cons_probe() was called before
init_heap() but used malloc() to set environment variables.

MFC after:	1 month
2010-10-28 23:46:05 +00:00
Attilio Rao
ba2a27351b Merge nexus.c from amd64 and i386 to x86 subtree.
Sponsored by:	Sandvine Incorporated
Tested by:	gianni
2010-10-28 16:31:39 +00:00
John Baldwin
b94e6f0ef6 Set bootverbose directly in mi_startup() rather than via a SYSINIT. This
ensures 'bootverbose' is in a valid state for all SYSINITs.

Reported by:	avg
MFC after:	1 week
2010-10-28 14:17:06 +00:00
John Baldwin
89d84a4055 Use 'PCPU_GET(apic_id)' to determine the BSP's APIC ID on a UP machine
when routing interrupts instead of cpu_apic_ids[0] since cpu_apic_ids[]
is only populated for multiple-CPU machines.  This also matches what the
code does when SMP is not enabled.

PR:		bin/151616
Tested by:	"Damian S. Kolodziejczyk"  damkol | gmail
Submitted by:	avg
MFC after:	1 week
2010-10-28 13:44:19 +00:00
Attilio Rao
a3da97926d Merge the mptable support from MD bits to x86 subtree.
Sponsored by:	Sandvine Incorporated
Discussed with:	jhb
2010-10-28 07:58:06 +00:00
Justin T. Gibbs
8f1382d1f2 sys/dev/xen/blkback/blkback.c:
In xbb_detach() only perform cleanup of our taskqueue and
	device statistics structures if they have been initialized.
	This avoids a panic when xbb_detach() is called on a partially
	initialized device instance, due to an early failure in
	attach.

Sponsored by:	Spectra Logic Corporation
2010-10-28 04:14:28 +00:00
Jack F Vogel
35928b338e In the data setup code for doing offloads the
ip and tcp pointers were not reset after some
pullups. In practice this led to an NFS mount
failure when using UDP reported by Kevin Lo,
thanks Kevin. Fix from yongari, thank you!
2010-10-28 00:16:54 +00:00
Hans Petter Selasky
8427ed847d Add support for setting per-interface PnP information.
Submitted by:	Nick Hibma
Approved by:	thompsa (mentor)
2010-10-27 17:38:05 +00:00
Pyun YongHyeon
1108273af4 Add initial BCM5718 family support. The BCM5718 family includes
the dual port BCM5717 and BCM5718 devices which are intended for
mainstream workstation and entry-level server designs and
represents the twelfth generation of NetXtreme Ethernet controllers.
This family is the successor to the BCM5714/BCM5715 family and
supports IPv4/IPv6 checksum offloading, TSO, VLAN hardware tagging,
jumbo frames, MSI/MSIX, IOV, RSS and TSS.

This change set supports all hardware features except IOV and
RSS/TSS. Unlike its predecessors, only extended RX buffer
descriptors can be posted to the jumbo producer ring. Single RX
buffer descriptors for jumbo frame are not supported. RSS requires
a more substantial set of changes and will apply to a larger set
of NetXtreme devices so RSS/TSS multi-queue support will be
implemented in a future releases.

Special thanks to Broadcom who kindly sent a sample board to me
and to davidch who gave provided the initial support code.

Submitted by:	davidch (initial version)
HW donated by:	Broadcom
2010-10-27 17:20:19 +00:00
Pyun YongHyeon
f8d8720ebc Add BCM5717C 10/100/1000TX PHY id. 2010-10-27 17:16:40 +00:00
Alan Cox
92ababa777 [1] According to the x86 architectural specifications, no virtual-to-
physical page mapping should span two or more MTRRs of different types.
Add a pmap function, pmap_demote_DMAP(), by which the MTRR module can
ensure that the direct map region doesn't have such a mapping.

[2] Fix a couple of nearby style errors in amd64_mrset().

[3] Re-enable the use of 1GB page mappings for implementing the direct
map.  (See also r197580 and r213897.)

Tested by:	kib@ on a Westmere-family processor [3]
MFC after:	3 weeks
2010-10-27 16:46:37 +00:00
Jaakko Heinonen
843ab5514d Add missing "readahead" to the nfs_opts list.
PR:		151321
Tested by:	Simon Walton
MFC after:	2 weeks
2010-10-27 14:08:37 +00:00
David Xu
4a5478709b - Revert r214409.
- Use long word to figure out sizeof kernel cpuset, hope it works.
2010-10-27 09:29:03 +00:00
David Xu
1676b42546 If input parameter cpusetsize is zero, give userland size of cpuset mask
kernel is using.
2010-10-27 02:32:54 +00:00
Rick Macklem
c5dd9d8c37 Add a flag to the experimental NFSv4 client to indicate when
delegations are being returned for reasons other than a Recall.
Also, re-organize nfscl_recalldeleg() slightly, so that it leaves
clearing NMODIFIED to the ncl_flush() call and invalidates the
attribute cache after flushing. It is hoped that these changes
might fix the problem others have seen when using the NFSv4
client with delegations enabled, since I can't reliably reproduce
the problem. These changes only affect the client when doing NFSv4
mounts with delegations enabled.

MFC after:	10 days
2010-10-26 23:18:37 +00:00
Jung-uk Kim
ae19af49e0 Add two new loader tunables 'hw.acpi.install_interface' and
'hw.acpi.remove_interface'.  hw.acpi.install_interface lets you install new
interfaces.  Conversely, hw.acpi.remove_interface lets you remove OS
interfaces from the pre-defined list in ACPICA.  For example,

	hw.acpi.install_interface="FreeBSD"

lets _OSI("FreeBSD") method to return 0xffffffff (or success) and

	hw.acpi.remove_interface="Windows 2009"

lets _OSI("Windows 2009") method to return zero (or failure).  Both are
comma-separated lists and leading white spaces are ignored.  For example,
the following examples are valid:

	hw.acpi.install_interface="Linux, FreeBSD"
	hw.acpi.remove_interface="Windows 2006, Windows 2006.1"
2010-10-26 18:59:50 +00:00
Attilio Rao
b2724beede Style fix.
Reported by:	bde, dim
2010-10-26 18:01:28 +00:00
Attilio Rao
61ba91df0d Remove usage of PRI* macro for style compliancy.
Requested by:	bde, jhb
Sponsored by:	Sandvine Incorporated
2010-10-26 16:16:15 +00:00
Martin Matuska
e25376bdd0 Bugfix merge from OpenSolaris:
OpenSolaris onnv-revision:	10209:91f47f0e7728
6830541	zfs_get_data_trips on a verify
6696242	multiple zfs_fillpage() zfs: accessing past end of object panics
6785914	zfs fails to drop dn_struct_rwlock in recovery code path

Approved by:	delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6830541, 6696242, 6785914)
MFC after:	2 weeks
2010-10-26 15:48:03 +00:00
Attilio Rao
256439c972 Merge dump_machdep.c i386/amd64 under the x86 subtree.
Sponsored by:	Sandvine Incorporated
Tested by:	gianni
2010-10-26 12:46:26 +00:00
Jack F Vogel
7deff7f9b4 Bug fix delta to the em driver:
- Chasin down bogus watchdogs has led to an improved
	  design to this handling, the hang decision takes
	  place in the tx cleanup, with only a simple report
	  check in local_timer. Our tests have shown no false
	  watchdogs with this code.
	- VLAN fixes from jhb, the shadow vfta should be per
	  interface, but as global it was not. Thanks John.
	- Bug fixes in the support for new PCH2 hardware.
	- Thanks for all the help and feedback on the driver,
	  changes to lem with be coming shortly as well.
2010-10-26 00:07:58 +00:00
Ivan Voras
8e431dd6f1 Bring vfs.ufs.dirhash_maxmem into the age of the fruitbat and make it
autotuned. It is only an upper bound (the memory is not always allocated)
and the system contains a vm_lowmem handler so nothing will crash and burn
if it's tuned too high.

Reviewed by:	mckusick
2010-10-25 21:46:23 +00:00
Marius Strobl
36c7255a81 - Given that in one-shot mode tick_et_start() also is called frequently
introduce function pointers once set up to the respective implementation
  for reading the (S)TICK and writing the (S)STICK_COMPARE registers as a
  compromise between duplicating code and selecting between different
  implementations during execution over and over again, similar to what is
  done elsewhere in the MD in order to support different CPU models that
  won't ever change at runtime.
- In the remaining tick interrupt handler further push down disabling of
  interrupts to the periodic case as it isn't necessary here in one-shot
  mode at all.
2010-10-25 20:52:33 +00:00
Andrey V. Elsukov
e7926a3703 Reimplemented "gpart destroy -F". Now it does all work in kernel.
This was needed for recover implementation.

Implement the recover command for GPT. Now GPT will marked as
corrupt when any of three types of corruption will be detected:
1. Damaged primary GPT header or table
2. Damaged secondary GPT header or table
3. Secondary header is not located in the last LBA
Marked GPT becomes read-only. Any changes with corrupt table
are prohibited. Only "destroy" and "recover" commands are allowed.

Discussed with:	geom@ (mostly silence)
Tested by:	Ilya A. Arhipov
Approved by:	mav (mentor)
MFC after:	2 weeks
2010-10-25 16:23:35 +00:00
Thomas Quinot
94294cada5 Fix typo in comment. 2010-10-25 16:11:37 +00:00
Nathan Whitehorn
495ed64c16 The EHCI_CAPLENGTH and EHCI_HCIVERSION registers are actually sub-registers
within the first 4 bytes of the EHCI memory space. For controllers that
use big-endian MMIO, reading them with 1- and 2-byte reads would then
return the wrong values. Instead, read the combined register with a 4-byte
read and mask out the interesting quantities.
2010-10-25 15:51:43 +00:00
Nathan Whitehorn
111044e6c2 Don't create spurious /dev entries.
Submitted by:	andreast
2010-10-25 15:41:12 +00:00
John Baldwin
0689bdcc19 Use 'saveintr' instead of 'savecrit' or 'eflags' to hold the state returned
by intr_disable().

Requested by:	bde
2010-10-25 15:31:13 +00:00
John Baldwin
c6390f7ac5 Use intr_disable() and intr_restore() instead of frobbing the flags register
directly to disable interrupts.

Reviewed by:	bde (earlier version)
MFC after:	2 weeks
2010-10-25 15:28:03 +00:00
Ivan Voras
61eee6b8a7 Reduce the difference between hirunningspace and lorunningspace,
it should help interactivity in edge cases.
2010-10-25 14:05:25 +00:00
David Xu
42fe684c1a Use function tdfind() to find a thread. 2010-10-25 13:13:16 +00:00
Bjoern A. Zeeb
a38de0134b Factor out DDB commands from r204145, r204279 into if_debug.c for further
enhancements (1).  Switch to a standard 2-clause BSD license for this (2).

Unfortunately we have to un-static the ifindex_table for this but do not
publicly export it.

Suggested by:	rwatson (1) a while back.
Approved by:	thompsa (2) for the change from r204279.
MFC after:	6 days
2010-10-25 08:30:19 +00:00
Alexander Motin
6ea7128dbd Make hw.snd.vpc_0db to be also a loader tunable. 2010-10-25 08:25:44 +00:00
Alexander Motin
5b9392e840 Add missing mtx_destroy() on channel attach failure. 2010-10-25 07:41:21 +00:00
Bjoern A. Zeeb
0ef7c8a20b Add initial inet DDB support for show in_ifaddr and show sin commands which
proved to be useful while debugging address list problems.

MFC after:	6 days
2010-10-24 22:02:36 +00:00
Pyun YongHyeon
713ca255b8 Add TSO support over VLAN for i82550/i82551. Controller requires
VLAN hardware tagging to make TSO work over VLAN. So if VLAN
hardware tagging is disabled explicitly clear TSO over VLAN. While
I'm here allow disabling VLAN TX checksum offloading.

Tested by:	Liudas < liudasb <> centras dot lt >
MFC after:	10 days
2010-10-24 21:59:51 +00:00
Pyun YongHyeon
427d3f3322 Use bge_chipid to compare controller ids. r214251 incorrectly used
bge_chiprev.

Reported by:	Buganini <buganini <> gmail dot com >
2010-10-24 20:54:46 +00:00
Alexander Motin
a4bd51a562 Make da driver to handle some probably broken Android devices, returning
zero media and sector size instead of "Medium not present" error,
until some confirmation button is tapped on device.
2010-10-24 18:53:16 +00:00
Rebecca Cran
fd104c151b Mostly revert r203420, and add similar functionality into ada(4) since the
existing code caused problems with some SCSI controllers.

A new sysctl kern.cam.ada.spindown_shutdown has been added that controls
whether or not to spin-down disks when shutting down.
Spinning down the disks unloads/parks the heads - this is
much better than removing power when the disk is still
spinning because otherwise an Emergency Unload occurs which may cause damage
to the actuator.

PR:	kern/140752
Submitted by:   olli
Reviewed by:	arundel
Discussed with: mav
MFC after:	2 weeks
2010-10-24 16:31:57 +00:00
Marius Strobl
4a1f2d1b35 - Given that as of r214264 all PHY drivers using mii(4) finally have been
converted to use the mii_phy_add_media()/mii_phy_setmedia() pair instead
  of mii_add_media()/mii_anar() remove the latter.
- Declare mii_media mii_media_table static as it shouldn't be used outside
  of mii_physubr.c.

MFC after:	never
2010-10-24 12:59:43 +00:00
Marius Strobl
bcbab52daf - Add IFM_10_2 and IFM_10_5 media via tlphy(4) only in case the respective
interface also has such connectors.
- In tl_attach() unify three different ways of obtaining the device and
  vendor IDs and remove the now obsolete tl_dinfo from tl_softc.
- Given that tlphy(4) only handles the integrated PHYs of NICs driven by
  tl(4) make it only probe on the latter.
- Switch mlphy(4) and tlphy(4) to use mii_phy_add_media()/mii_phy_setmedia().
- Simplify looking for the respective companion PHY in mlphy(4) and tlphy(4)
  by ignoring the native one by just comparing the device_t's directly rather
  than the device name.
2010-10-24 12:51:02 +00:00
Marius Strobl
743d2b468a Take advantage of mii_phy_add_media()/mii_phy_setmedia(). 2010-10-24 11:38:25 +00:00
Marius Strobl
f6613deb1f - Take advantage of mii_phy_dev_probe().
- Use mii_phy_add_media() instead of mii_add_media(). I'm not sure how
  this driver actually managed to work before as mii_add_media() is
  intended to be used to gether with mii_anar() while mii_phy_add_media()
  is intended to be used with mii_phy_setmedia(), however this driver
  mii_add_media() along with mii_phy_setmedia().
2010-10-24 11:37:01 +00:00
Yoshihiro Takahashi
fcaae21d92 MFi386: the part of revision 213226.
Rewrite the i386 memory probe:
  - Move the base memory setup into a new basemem_setup() routine.

MFC after:	1 week
2010-10-24 03:20:54 +00:00
Yoshihiro Takahashi
6d6f513763 MFi386: revision 214210
Avoid using memcpy() for copying 32bit chunks. This shrinks
  the resulting code a little.
2010-10-24 02:59:02 +00:00
Rick Macklem
377c50f67a Modify the experimental NFSv4 server's file handle hash function
to use the generic hash32_buf() function. Although adding the
bytes seemed sufficient for UFS and ZFS, since most of the bytes
are the same for file handles on the same volume, this might not
be sufficient for other file systems. Use of a generic function
also seems preferable to one specific to NFSv4.

Suggested by:	gleb.kurtsou at gmail.com
MFC after:	10 days
2010-10-23 22:28:29 +00:00
Pyun YongHyeon
ca4f898699 Apply the same workaround for SDI flow control used on BCM5906 A1
to BCM6906 A0/A2. This should fix a long standing BCM5906 A2 lockup
issues. Data sheet explicitly mentions BCM5906 A0, A1 and A2 use
de-pipelined mode on these revisions.
Special thanks to Buganini who tried all combinations of
experimental patches for more than 10 days.

Tested by:	Buganini <buganini <> gmail dot com >
2010-10-23 21:25:50 +00:00
Bjoern A. Zeeb
4a85b5e2ea Make the IPsec SADB embedded route cache a union to be able to hold both the
legacy and IPv6 route destination address.
Previously in case of IPv6, there was a memory overwrite due to not enough
space for the IPv6 address.

PR:		kern/122565
MFC After:	2 weeks
2010-10-23 20:35:40 +00:00
Robert Watson
a959b1f02c Add missing DTrace probe invocation to mac_vnode_check_open; the probe
was declared, but never used.

MFC after:	3 days
Sponsored by:	Google, Inc.
2010-10-23 16:59:39 +00:00
Edward Tomasz Napierala
880cb81c5a Remove workaround for ZFS bug; fix was committed to the //depot/user/pjd/zfs/...
branch some time ago.

MFC after:	two weeks
2010-10-23 14:22:50 +00:00
David Xu
0d036d55e7 In thr_exit() and kthread_exit(), only remove thread from
hash if it can directly exit, otherwise let exit1() do it.
The change should be in r213950, but for unknown reason,
it was lost.
2010-10-23 13:16:39 +00:00
Bernhard Schmidt
82510b7eca The firmware does pad notifications to an even number of bytes (at least
the association notification), the included information though always
contains an elem block with an odd number of bytes. We handle the last
byte as if it might contain a whole elem block, this of course is not
true as one byte is not enough to hold a block, we therefore discard the
complete frame. The solution here is to subtract one from the actual
notification length, this is also what the Linux driver does. With this
change the frames ends exactly where the last elem block ends.

This commit also reverts r214160 which is no longer required and now even
wrong.

MFC after:	1 week
2010-10-23 11:26:22 +00:00
Pawel Jakub Dawidek
0d2f5a4eaa - Improve error messages, so instead of 'Not fully done', the user will get
information that device is already suspended or that device is using
  one-time key and suspend is not supported.
- 'geli suspend -a' silently skips devices that use one-time key, this is fine,
  but because we log which device were suspended on the console, log also which
  devices were skipped.
2010-10-22 22:58:00 +00:00
Pawel Jakub Dawidek
2f2d7830b5 Close a race between checking if device is already suspended and suspending it. 2010-10-22 22:54:26 +00:00
Pawel Jakub Dawidek
d8d61ef8fc Add State tag, so 'geli status' will report active/suspended status, eg:
# geli status
	   Name     Status  Components
	da0.eli  SUSPENDED  da0
	da1.eli     ACTIVE  da1
2010-10-22 22:45:26 +00:00
Pawel Jakub Dawidek
4f294e1289 Encryption keys array might be NULL if device is suspended. Check for this, so
we don't panic when we detach suspended device.
2010-10-22 22:44:09 +00:00
Pawel Jakub Dawidek
1d0214411e Move sc_akeyctx and sc_ivctx initialization to the g_eli_mkey_propagate()
function which eliminates code duplication and will ensure proper order
of operation.
2010-10-22 22:13:11 +00:00
Rick Macklem
91027b4ef0 Modify the file handle hash function in the experimental NFS
server so that it will work better for non-UFS file systems.
The new function simply sums the bytes of the fh_fid field
of fhandle_t.

MFC after:	10 days
2010-10-22 21:38:56 +00:00
Hans Petter Selasky
8bb77249db Add possibility to generate devctl notifications regardless of UGEN presence.
Submitted by:  Nick Hibma
Approved by:    thompsa (mentor)
2010-10-22 20:13:45 +00:00
Pyun YongHyeon
8d5f71818f Add workaround for BCM5906 A1 controller silicon bug. When
auto-negotiation results in half-duplex operation, excess collision
on the ethernet link may cause internal chip delays that may result
in subsequent valid frames being dropped due to insufficient
receive buffer resources. The workaround is to choose de-pipeline
method as a flow control decision for SDI. De-pipeline method
allows only 1 data in TxMbuf at a time such that a request to RDMA
from SDI is made only when TxMbuf is empty. Thanks for david for
providing detailed errata information.
2010-10-22 19:30:56 +00:00
Pyun YongHyeon
f6a6548885 Enable TX MAC state machine lockup fix for both BCM5755 or higher
and BCM5906. Publicly available data sheet just says it may happen
due to corrupted TxMbuf.
2010-10-22 18:31:44 +00:00
Roman Divacky
104d506ddd Avoid using memcpy() for copying 32bit chunks. This shrinks
the resulting code a little.

Approved by:    rpaulo (mentor)
Reviewed by:    jhb
2010-10-22 18:07:21 +00:00
John Baldwin
ba577448a2 - Add a new PCI quirk to whitelist an old chipset that doesn't support
PCI-express or PCI-X capabilities if we are running in a virtual machine.
- Whitelist the Intel 82440 chipset used by QEMU.

Tested by:	jfv
MFC after:	1 week
2010-10-22 11:42:02 +00:00
Xin LI
5e5fd037d6 Call chainevh callback when we are invoked with neither MOD_LOAD nor
MOD_UNLOAD.  This makes it possible to add custom hooks for other module
events.

Return EOPNOTSUPP when there is no callback available.

Pointed out by:	jhb
Reviewed by:	jhb
MFC after:	1 month
2010-10-21 20:31:50 +00:00
Pawel Jakub Dawidek
3ac01bc2ae Free opencrypto sessions on suspend, as they also might keep encryption keys. 2010-10-21 19:44:28 +00:00
Bernhard Schmidt
9f95314538 The firmware always sets bit 14 and 15, to get the real associd we need
to clear those bits.

MFC after:	1 week
2010-10-21 19:30:55 +00:00
Bernhard Schmidt
f3c95fe748 Instead of calling return when reaching the end of the assoc notification
break the loop instead. We want to run the code after the while loop
to set an associd and capinfo. If we don't do this net80211 will drop
frames because it assumes the node has not yet been associated.

MFC after:	1 week
2010-10-21 19:28:52 +00:00
John Baldwin
d680caab73 - When disabling ktracing on a process, free any pending requests that
may be left.  This fixes a memory leak that can occur when tracing is
  disabled on a process via disabling tracing of a specific file (or if
  an I/O error occurs with the tracefile) if the process's next system
  call is exit().  The trace disabling code clears p_traceflag, so exit1()
  doesn't do any KTRACE-related cleanup leading to the leak.  I chose to
  make the free'ing of pending records synchronous rather than patching
  exit1().
- Move KTRACE-specific logic out of kern_(exec|exit|fork).c and into
  kern_ktrace.c instead.  Make ktrace_mtx private to kern_ktrace.c as a
  result.

MFC after:	1 month
2010-10-21 19:17:40 +00:00
Rick Macklem
8a1b5ade5f Modify the experimental NFS server in a manner analagous to
r214049 for the regular NFS server, so that it will not do
a VOP_LOOKUP() of ".." when at the root of a file system
when performing a ReaddirPlus RPC.

MFC after:	10 days
2010-10-21 18:49:12 +00:00
John Baldwin
fb2439a6f6 Clarify a misleading comment. The test in pci_reserve_map() was meant to
ignore BARs that are invalid due to having a size of zero, not to ignore
BARs with an existing base of zero.  While here, reorganize the code
slightly to make the intent clearer.

Reported by:	avg
MFC after:	1 week
2010-10-21 17:46:23 +00:00
John Baldwin
1a587ef2a5 - Make 'vm_refcnt' volatile so that compilers won't be tempted to treat
its value as a loop invariant.  Currently this is a no-op because
  'atomic_cmpset_int()' clobbers all memory on current architectures.
- Use atomic_fetchadd_int() instead of an atomic_cmpset_int() loop to drop
  a reference in vmspace_free().

Reviewed by:	alc
MFC after:	1 month
2010-10-21 17:29:32 +00:00
Sergey Kandaurov
9af74f3d68 Reshuffle SIOCGIFCONF32 handler from r155224.
- move all the chunks into one file, which allows to hide SIOCGIFCONF32
  global definition as well.
- replace __amd64__ with proper COMPAT_FREEBSD32 around.
- handle 32bit capacity before going into the handler itself instead of
  doing internal 32bit specific changes within it (e.g. as it's done for
  SIOCGDEFIFACE32_IN6).
- use explicitely sized types for ABI compat.

Approved by:	kib (mentor)
MFC after:	2 weeks
2010-10-21 16:20:48 +00:00
Pawel Jakub Dawidek
738ffa9780 Fix a bug introduced in r213067 where we use authentication key before
initializing it.
2010-10-21 12:58:26 +00:00
Sergey Kandaurov
d63e9da300 Update PD state firmware definitions: add copyback, system.
Reviewed by:	jhb
Approved by:	avg (mentor)
MFC after:	1 week
2010-10-21 10:38:52 +00:00
Xin LI
00e3c12e03 In syscall_module_handler(): all switch branches return, remove
unreached code as pointed out in a Chinese forum [1].

[1] http://www.freebsdchina.org/forum/viewtopic.php?t=50619

Pointed out by:		btw616 <btw s qq com>
MFC after:		1 month
2010-10-21 08:57:25 +00:00
Jung-uk Kim
d815d0abb7 Update PCI power management registers per PCI Bus Power Management Interface
Specification Rev. 1.2.  Rename pp_pcmcsr field of PM capabilities to pp_bse
to avoid further confusions and adjust some comments accordingly.  The real
PMCSR (Power Management Control/Status Register) is PCIR_POWER_STATUS and
it is actually BSE (PCI-to-PCI Bridge Support Extensions) register.
2010-10-20 23:41:16 +00:00
Pawel Jakub Dawidek
5ad4a7c74a Bring in geli suspend/resume functionality (finally).
Before this change if you wanted to suspend your laptop and be sure that your
encryption keys are safe, you had to stop all processes that use file system
stored on encrypted device, unmount the file system and detach geli provider.

This isn't very handy. If you are a lucky user of a laptop where suspend/resume
actually works with FreeBSD (I'm not!) you most likely want to suspend your
laptop, because you don't want to start everything over again when you turn
your laptop back on.

And this is where geli suspend/resume steps in. When you execute:

	# geli suspend -a

geli will wait for all in-flight I/O requests, suspend new I/O requests, remove
all geli sensitive data from the kernel memory (like encryption keys) and will
wait for either 'geli resume' or 'geli detach'.

Now with no keys in memory you can suspend your laptop without stopping any
processes or unmounting any file systems.

When you resume your laptop you have to resume geli devices using 'geli resume'
command. You need to provide your passphrase, etc. again so the keys can be
restored and suspended I/O requests released.

Of course you need to remember that 'geli suspend' won't clear file system
cache and other places where data from your geli-encrypted file system might be
present. But to get rid of those stopping processes and unmounting file system
won't help either - you have to turn your laptop off. Be warned.

Also note, that suspending geli device which contains file system with geli
utility (or anything used by 'geli resume') is not very good idea, as you won't
be able to resume it - when you execute geli(8), the kernel will try to read it
and this read I/O request will be suspended.
2010-10-20 20:50:55 +00:00
Pawel Jakub Dawidek
056638c469 - Add missing comments.
- Make a comment consistent with others.
2010-10-20 20:01:45 +00:00
Pawel Jakub Dawidek
ab05568beb Correct typos. 2010-10-20 19:52:27 +00:00
Jung-uk Kim
f3e0b10973 Introduce a new tunable 'hw.pci.do_power_suspend'. This tunable lets you
avoid PCI power state transition from D0 to D3 for suspending case.  Default
is 1 or enabled.
2010-10-20 16:47:09 +00:00
Jung-uk Kim
347263c935 Do not apply do_power_resume for suspending P2P bridge as we did in r214064. 2010-10-20 16:40:14 +00:00
Jayachandran C.
7850efa68d Network driver updates
- Fix network driver issue on a XLS eval board (major# 8).
- Fix issue uncovered by r213475 in check for XGMII

Submitted by:	Sriram Gorti (srgorti at netlogicmicro dot com)
2010-10-20 09:50:11 +00:00
Jayachandran C.
18ad6a4db2 On uniprocessor, warn and fixup hardware cpu mask if more than on CPU
is enabled by the bootloader.
2010-10-20 09:41:36 +00:00
Alexander Motin
6c87235098 Workaround strange situation when EDMA_RESQIP register returns zero instead
of proper value. It caused bunch of "EMPTY CRPB" messages and potentially
may cause premature requests completion, which could cause data corruption.
For most cases it seems enough to just reread register to get proper value.
To protect against worse cases - erase processed queue entries with
impossible values and ignore them if problem still happen.
2010-10-20 07:47:31 +00:00
Alexander Motin
c0609c547a Some style cleanup:
- remove commented debugging code;
- wrap long lines.
2010-10-20 07:22:34 +00:00
Andriy Gapon
55144670c2 PG_BUSY -> VPO_BUSY, PG_WANTED -> VPO_WANTED in manual pages and comments
Reviewed by:	alc
MFC after:	4 days
2010-10-20 05:17:23 +00:00
David Xu
cfca8a1862 - Don't include sx.h, it is not needed.
- Check NULL pointer, move timeout calculation code outside of
  process lock.
2010-10-20 00:41:38 +00:00
Pyun YongHyeon
69b5727f16 Correct handling of shared interrupt in sis_intr(). r212116 incorrectly
released a drver lock for shared interrupt case such that it caused
panic. While I'm here check whether driver is still running before
serving TX/RX handler.

Reported by:	Jerahmy Pocott < QUAKENET1 <> optusnet dot com dot au >
Tested by:	Jerahmy Pocott < QUAKENET1 <> optusnet dot com dot au >
MFC after:	3 days
2010-10-20 00:19:25 +00:00
Pyun YongHyeon
d598b626c0 Add workaround for BCM5906 controller silicon bug. If device
receive two back-to-back send BDs with less than or equal to 8
total bytes then the device may hang. The two back-to-back send
BDs must be in the same frame for this failure to occur.
Thanks to davidch for detailed errata information.

Reviewed by:	davidch
2010-10-19 23:04:23 +00:00
Justin T. Gibbs
ff662b5c98 Improve the Xen para-virtualized device infrastructure of FreeBSD:
o Add support for backend devices (e.g. blkback)
 o Implement extensions to the Xen para-virtualized block API to allow
   for larger and more outstanding I/Os.
 o Import a completely rewritten block back driver with support for fronting
   I/O to both raw devices and files.
 o General cleanup and documentation of the XenBus and XenStore support code.
 o Robustness and performance updates for the block front driver.
 o Fixes to the netfront driver.

Sponsored by: Spectra Logic Corporation

sys/xen/xenbus/init.txt:
	Deleted: This file explains the Linux method for XenBus device
	enumeration and thus does not apply to FreeBSD's NewBus approach.

sys/xen/xenbus/xenbus_probe_backend.c:
	Deleted: Linux version of backend XenBus service routines.  It
	was never ported to FreeBSD.  See xenbusb.c, xenbusb_if.m,
	xenbusb_front.c xenbusb_back.c for details of FreeBSD's XenBus
	support.

sys/xen/xenbus/xenbusvar.h:
sys/xen/xenbus/xenbus_xs.c:
sys/xen/xenbus/xenbus_comms.c:
sys/xen/xenbus/xenbus_comms.h:
sys/xen/xenstore/xenstorevar.h:
sys/xen/xenstore/xenstore.c:
	Split XenStore into its own tree.  XenBus is a software layer built
	on top of XenStore.  The old arrangement and the naming of some
	structures and functions blurred these lines making it difficult to
	discern what services are provided by which layer and at what times
	these services are available (e.g. during system startup and shutdown).

sys/xen/xenbus/xenbus_client.c:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbus_probe.c:
sys/xen/xenbus/xenbusb.c:
sys/xen/xenbus/xenbusb.h:
	Split up XenBus code into methods available for use by client
	drivers (xenbus.c) and code used by the XenBus "bus code" to
	enumerate, attach, detach, and service bus drivers.

sys/xen/reboot.c:
sys/dev/xen/control/control.c:
	Add a XenBus front driver for handling shutdown, reboot, suspend, and
	resume events published in the XenStore.  Move all PV suspend/reboot
	support from reboot.c into this driver.

sys/xen/blkif.h:
	New file from Xen vendor with macros and structures used by
	a block back driver to service requests from a VM running a
	different ABI (e.g. amd64 back with i386 front).

sys/conf/files:
	Adjust kernel build spec for new XenBus/XenStore layout and added
	Xen functionality.

sys/dev/xen/balloon/balloon.c:
sys/dev/xen/netfront/netfront.c:
sys/dev/xen/blkfront/blkfront.c:
sys/xen/xenbus/...
sys/xen/xenstore/...
	o Rename XenStore APIs and structures from xenbus_* to xs_*.
	o Adjust to use of M_XENBUS and M_XENSTORE malloc types for allocation
	  of objects returned by these APIs.
	o Adjust for changes in the bus interface for Xen drivers.

sys/xen/xenbus/...
sys/xen/xenstore/...
	Add Doxygen comments for these interfaces and the code that
	implements them.

sys/dev/xen/blkback/blkback.c:
	o Rewrite the Block Back driver to attach properly via newbus,
	  operate correctly in both PV and HVM mode regardless of domain
	  (e.g. can be in a DOM other than 0), and to deal with the latest
	  metadata available in XenStore for block devices.

	o Allow users to specify a file as a backend to blkback, in addition
	  to character devices.  Use the namei lookup of the backend path
	  to automatically configure, based on file type, the appropriate
	  backend method.

	The current implementation is limited to a single outstanding I/O
	at a time to file backed storage.

sys/dev/xen/blkback/blkback.c:
sys/xen/interface/io/blkif.h:
sys/xen/blkif.h:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
	Extend the Xen blkif API: Negotiable request size and number of
	requests.

	This change extends the information recorded in the XenStore
	allowing block front/back devices to negotiate for optimal I/O
	parameters.  This has been achieved without sacrificing backward
	compatibility with drivers that are unaware of these protocol
	enhancements.  The extensions center around the connection protocol
	which now includes these additions:

	o The back-end device publishes its maximum supported values for,
	  request I/O size, the number of page segments that can be
	  associated with a request, the maximum number of requests that
	  can be concurrently active, and the maximum number of pages that
	  can be in the shared request ring.  These values are published
	  before the back-end enters the XenbusStateInitWait state.

	o The front-end waits for the back-end to enter either the InitWait
	  or Initialize state.  At this point, the front end limits it's
	  own capabilities to the lesser of the values it finds published
	  by the backend, it's own maximums, or, should any back-end data
	  be missing in the store, the values supported by the original
	  protocol.  It then initializes it's internal data structures
	  including allocation of the shared ring, publishes its maximum
	  capabilities to the XenStore and transitions to the Initialized
	  state.

	o The back-end waits for the front-end to enter the Initalized
	  state.  At this point, the back end limits it's own capabilities
	  to the lesser of the values it finds published by the frontend,
	  it's own maximums, or, should any front-end data be missing in
	  the store, the values supported by the original protocol.  It
	  then initializes it's internal data structures, attaches to the
	  shared ring and transitions to the Connected state.

	o The front-end waits for the back-end to enter the Connnected
	  state, transitions itself to the connected state, and can
	  commence I/O.

	Although an updated front-end driver must be aware of the back-end's
	InitWait state, the back-end has been coded such that it can
	tolerate a front-end that skips this step and transitions directly
	to the Initialized state without waiting for the back-end.

sys/xen/interface/io/blkif.h:
	o Increase BLKIF_MAX_SEGMENTS_PER_REQUEST to 255.  This is
	  the maximum number possible without changing the blkif
	  request header structure (nr_segs is a uint8_t).

	o Add two new constants:
	  BLKIF_MAX_SEGMENTS_PER_HEADER_BLOCK, and
	  BLKIF_MAX_SEGMENTS_PER_SEGMENT_BLOCK.  These respectively
	  indicate the number of segments that can fit in the first
	  ring-buffer entry of a request, and for each subsequent
	  (sg element only) ring-buffer entry associated with the
          "header" ring-buffer entry of the request.

	o Add the blkif_request_segment_t typedef for segment
	  elements.

	o Add the BLKRING_GET_SG_REQUEST() macro which wraps the
	  RING_GET_REQUEST() macro and returns a properly cast
	  pointer to an array of blkif_request_segment_ts.

	o Add the BLKIF_SEGS_TO_BLOCKS() macro which calculates the
	  number of ring entries that will be consumed by a blkif
	  request with the given number of segments.

sys/xen/blkif.h:
	o Update for changes in interface/io/blkif.h macros.

	o Update the BLKIF_MAX_RING_REQUESTS() macro to take the
	  ring size as an argument to allow this calculation on
	  multi-page rings.

	o Add a companion macro to BLKIF_MAX_RING_REQUESTS(),
	  BLKIF_RING_PAGES().  This macro determines the number of
	  ring pages required in order to support a ring with the
	  supplied number of request blocks.

sys/dev/xen/blkback/blkback.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
	o Negotiate with the other-end with the following limits:
	      Reqeust Size:   MAXPHYS
	      Max Segments:   (MAXPHYS/PAGE_SIZE) + 1
	      Max Requests:   256
	      Max Ring Pages: Sufficient to support Max Requests with
	                      Max Segments.

	o Dynamically allocate request pools and segemnts-per-request.

	o Update ring allocation/attachment code to support a
	  multi-page shared ring.

	o Update routines that access the shared ring to handle
	  multi-block requests.

sys/dev/xen/blkfront/blkfront.c:
	o Track blkfront allocations in a blkfront driver specific
	  malloc pool.

	o Strip out XenStore transaction retry logic in the
	  connection code.  Transactions only need to be used when
	  the update to multiple XenStore nodes must be atomic.
	  That is not the case here.

	o Fully disable blkif_resume() until it can be fixed
	  properly (it didn't work before this change).

	o Destroy bus-dma objects during device instance tear-down.

	o Properly handle backend devices with powef-of-2 sector
	  sizes larger than 512b.

sys/dev/xen/blkback/blkback.c:
	Advertise support for and implement the BLKIF_OP_WRITE_BARRIER
	and BLKIF_OP_FLUSH_DISKCACHE blkif opcodes using BIO_FLUSH and
	the BIO_ORDERED attribute of bios.

sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkfront/block.h:
	Fix various bugs in blkfront.

       o gnttab_alloc_grant_references() returns 0 for success and
	 non-zero for failure.  The check for < 0 is a leftover
	 Linuxism.

       o When we negotiate with blkback and have to reduce some of our
	 capabilities, print out the original and reduced capability before
	 changing the local capability.  So the user now gets the correct
	 information.

	o Fix blkif_restart_queue_callback() formatting.  Make sure we hold
	  the mutex in that function before calling xb_startio().

	o Fix a couple of KASSERT()s.

        o Fix a check in the xb_remove_* macro to be a little more specific.

sys/xen/gnttab.h:
sys/xen/gnttab.c:
	Define GNTTAB_LIST_END publicly as GRANT_REF_INVALID.

sys/dev/xen/netfront/netfront.c:
	Use GRANT_REF_INVALID instead of driver private definitions of the
	same constant.

sys/xen/gnttab.h:
sys/xen/gnttab.c:
	Add the gnttab_end_foreign_access_references() API.

	This API allows a client to batch the release of an array of grant
	references, instead of coding a private for loop.  The implementation
	takes advantage of this batching to reduce lock overhead to one
	acquisition and release per-batch instead of per-freed grant reference.

	While here, reduce the duration the gnttab_list_lock is held during
	gnttab_free_grant_references() operations.  The search to find the
	tail of the incoming free list does not rely on global state and so
	can be performed without holding the lock.

sys/dev/xen/xenpci/evtchn.c:
sys/dev/xen/evtchn/evtchn.c:
sys/xen/xen_intr.h:
	o Implement the bind_interdomain_evtchn_to_irqhandler API for HVM mode.
	  This allows an HVM domain to serve back end devices to other domains.
	  This API is already implemented for PV mode.

	o Synchronize the API between HVM and PV.

sys/dev/xen/xenpci/xenpci.c:
	o Scan the full region of CPUID space in which the Xen VMM interface
	  may be implemented.  On systems using SuSE as a Dom0 where the
	  Viridian API is also exported, the VMM interface is above the region
	  we used to search.

	o Pass through bus_alloc_resource() calls so that XenBus drivers
	  attaching on an HVM system can allocate unused physical address
	  space from the nexus.  The block back driver makes use of this
	  facility.

sys/i386/xen/xen_machdep.c:
	Use the correct type for accessing the statically mapped xenstore
	metadata.

sys/xen/interface/hvm/params.h:
sys/xen/xenstore/xenstore.c:
	Move hvm_get_parameter() to the correct global header file instead
	of as a private method to the XenStore.

sys/xen/interface/io/protocols.h:
	Sync with vendor.

sys/xeninterface/io/ring.h:
	Add macro for calculating the number of ring pages needed for an N
	deep ring.

	To avoid duplication within the macros, create and use the new
	__RING_HEADER_SIZE() macro.  This macro calculates the size of the
	ring book keeping struct (producer/consumer indexes, etc.) that
	resides at the head of the ring.

	Add the __RING_PAGES() macro which calculates the number of shared
	ring pages required to support a ring with the given number of
	requests.

	These APIs are used to support the multi-page ring version of the
	Xen block API.

sys/xeninterface/io/xenbus.h:
	Add Comments.

sys/xen/xenbus/...
	o Refactor the FreeBSD XenBus support code to allow for both front and
	  backend device attachments.

	o Make use of new config_intr_hook capabilities to allow front and back
	  devices to be probed/attached in parallel.

	o Fix bugs in probe/attach state machine that could cause the system to
	  hang when confronted with a failure either in the local domain or in
	  a remote domain to which one of our driver instances is attaching.

	o Publish all required state to the XenStore on device detach and
	  failure.  The majority of the missing functionality was for serving
	  as a back end since the typical "hot-plug" scripts in Dom0 don't
	  handle the case of cleaning up for a "service domain" that is not
	  itself.

	o Add dynamic sysctl nodes exposing the generic ivars of
	  XenBus devices.

	o Add doxygen style comments to the majority of the code.

	o Cleanup types, formatting, etc.

sys/xen/xenbus/xenbusb.c:
	Common code used by both front and back XenBus busses.

sys/xen/xenbus/xenbusb_if.m:
	Method definitions for a XenBus bus.

sys/xen/xenbus/xenbusb_front.c:
sys/xen/xenbus/xenbusb_back.c:
	XenBus bus specialization for front and back devices.

MFC after:	1 month
2010-10-19 20:53:30 +00:00
Jung-uk Kim
220666153d Remove undocumented and stale debug.acpi.do_powerstate tunable. It was
added with hw.pci.do_powerstate but the PCI version was splitted into two
separate tunables later and now this is completely stale.  To make it worse,
PCI devices enumerated in ACPI tree ignore this tunable as it is handled by
a function in acpi_pci.c instead.
2010-10-19 20:38:21 +00:00
Jung-uk Kim
a7a3177f27 Remove PCI_SET_POWERSTATE method from acpi.c and eradicate all PCI-specific
knowledges from the file.  All PCI devices enumerated in ACPI tree must use
correct one from acpi_pci.c any way.  Reduce duplicate codes as we did for
pci.c in r213905.  Do not return ESRCH from PCIB_POWER_FOR_SLEEP method.
When the method is not found, just return zero without modifying the given
default value as it is completely optional.  As a side effect, the return
state must not be NULL.  Note there is actually no functional change by
removing ESRCH because acpi_pcib_power_for_sleep() always returns zero.
Adjust debugging messages and add new ones under bootverbose to help
debugging device power state related issues.

Reviewed by:	jhb, imp (earlier versions)
2010-10-19 19:53:06 +00:00
Marius Strobl
10c2bb0a10 - Wrap exchanging td_intr_frame and calling the event timer callback in
a critical section as apparently required by both. I don't think either
  belongs in the event timer front-ends but the callback should handle
  this as necessary instead just like for example intr_event_handle()
  does but this is how the other architectures currently handle it, either
  explicitly or implicitly.
- Further rename and reword references to hardclock as this front-end no
  longer has a notion of actually calling it.
2010-10-19 19:44:05 +00:00
Bernhard Schmidt
96a911f614 There is no reason to call rt_ifmsg(), remove it.
Submitted by:	Paul B Mahol <onemda at gmail.com>
MFC after:	1 week
2010-10-19 19:11:36 +00:00
Bernhard Schmidt
9a9a302fcd Fix an undefined behaviour if the desired ratectl algo is not available.
This can happen if the algos are built as modules but are not loaded. If
the selected ratectl algo is not available, try to load it (The load
module functions does nothing currently). Add a dummy ratectl algo which
always selects the first available rate. Use that one if the desired algo
is not available.

MFC after:	1 week
2010-10-19 18:49:26 +00:00
Jung-uk Kim
edc0cb7dc8 Make any PCI devices enumerated in ACPI tree honor do_power_resume as well. 2010-10-19 18:43:11 +00:00
Andrey V. Elsukov
366523d101 ZFS pool name is not a real device in devfs. Do not wait for
device appear when mounting root from ZFS.

Reviewed by:	marcel
Approved by:	mav (mentor)
2010-10-19 18:32:01 +00:00
Jung-uk Kim
6d018c85e1 Remove PCI header type 0 restriction from power state changes. PCI config.
registers for bridges are saved and restored since r200341.

OK'ed by:	imp, jhb
2010-10-19 17:15:22 +00:00
Jung-uk Kim
b56b75259b Do not apply do_power_resume for suspending case. When do_powerstate was
splitted into do_power_resume and do_power_nodriver, it became stale.
2010-10-19 17:05:51 +00:00
Jaakko Heinonen
bc2589f5b7 Use make_dev_p(9) with the MAKEDEV_CHECKNAME flag instead of make_dev(9)
and print a diagnostic if the call fails.

This avoids a panic when a device with an invalid name is attempted to
be registered. For example the label class gets device names from
untrusted input.

Reviewed by:	freebsd-geom
2010-10-19 16:48:49 +00:00
Matthew D Fleming
20ed0cb0c6 uma_zfree(zone, NULL) should do nothing, to match free(9).
Noticed by:	Ron Steinke <rsteinke at isilon dot com>
MFC after:	3 days
2010-10-19 16:06:00 +00:00
Rui Paulo
e09a0bdb32 Revert r206418 2010-10-19 13:31:43 +00:00
Ulrich Spörlein
7cc1fde083 mdoc: drop even more redundant .Pp calls
No change in rendered output, less mandoc lint warnings.

Tool provided by:	Nobuyuki Koganemaru n-kogane at syd.odn.ne.jp
2010-10-19 12:35:40 +00:00
Rick Macklem
4d4f9a3721 Fix the type of the 3rd argument for nm_getinfo so that it works
for architectures like sparc64.

Suggested by:	kib
MFC after:	2 weeks
2010-10-19 11:55:58 +00:00
Konstantin Belousov
bcc5a93fd7 When readdirplus() is handled on the exported filesystem that does
not support VFS_VGET, like msdosfs, do not call VOP_LOOKUP() for
dotdot on the root directory. Our filesystems expect that VFS handles
dotdot lookups on root on its own.

Reported and tested by:	kevlo
MFC after:   2 weeks
2010-10-19 08:55:31 +00:00
Rick Macklem
ca27c028d8 Modify the NFS clients and the NLM so that the NLM can be used
by both clients. Since the NLM uses various fields of the
nfsmount structure, those fields were extracted and put in a
separate nfs_mountcommon structure stored in sys/nfs/nfs_mountcommon.h.
This structure also has a function pointer for a function that
extracts the required information from the mount point and nfs vnode
for that particular client, for information stored differently by the
clients.

Reviewed by:	jhb
MFC after:	2 weeks
2010-10-19 00:20:00 +00:00
Konstantin Belousov
223073fd1a Do not synchronously start the nfsiod threads at all. The r212506
fixed the issues with file descriptor locks, but the same problems are
present for vnode lock/user map lock.

If the nfs_asyncio() cannot find the free nfsiod, schedule task to
create new nfsiod and return error. This causes fall back to the
synchronous i/o for nfs_strategy(), or does not start read at all in
the case of readahead. The caller that holds vnode and potentially
user map lock does not wait for kproc_create() to finish, preventing
the LORs.

The change effectively reverts r203072, because we never hand off the
request to newly created nfsiod thread anymore.

Reviewed by:	jhb
Tested by:	jhb, pluknet
MFC after:	3 weeks
2010-10-18 19:06:46 +00:00
Ed Maste
c4965cfc44 We've already set p = td->td_proc, so use it. 2010-10-18 15:46:58 +00:00
Rebecca Cran
8834bc521e Fix grammar. 2010-10-18 14:26:29 +00:00
Alexander Motin
bda55b6adb Set of legacy mode SATA enchancements:
- Implement proper combined mode decoding for Intel controllers to properly
identify SATA and PATA channels and associate ATA channels with SATA ports.
This fixes wrong reporting and in some cases hard resets to wrong SATA ports.
- Improve SATA registers support to handle hot-plug events and potentially
interface errors. For ICH5/6300ESB chipsets these registers accessible via
PCI config space. For later ones they may be accessible via PCI BAR(5).
- For controllers not generating interrupts on hot-plug events, implement
periodic status polling. Use it to detect hot-plug on Intel and VIA
controllers. Same probably could also be used for Serverworks and SIS.
2010-10-18 11:30:13 +00:00
Marius Strobl
c1ff8fd19a Revert r213867; while this driver really doesn't use any of the generic
subroutines, at least mii_capabilities is used within itself.
2010-10-18 08:36:03 +00:00
Marcel Moolenaar
e25daafbb6 Re-implement the root mount logic using a recursive approach, whereby each
root file system (starting with devfs and a synthesized configuration) can
contain directives for mounting another file system as root. The old root
file system is re-mounted under the new root file system (with /.mount or
/mnt as the mount point) to allow access to the underlying file system.

The configuration allows for creating vnode-backed memory disks that can
subsequently be mounted as root. This allows for an efficient and low-
cost way to distribute and boot FreeBSD software images that reside on
some storage media.

When trying a mount, the kernel will wait for the device in question to
arrive. The timeout is configurable and is part of the configuration.
This allows arbitrarily complex GEOM configurations to be constructed
on the fly.

A side-effect of this change is that all root specifications, whether
compiled into the kernel or typed at the prompt can contain root mount
options.
2010-10-18 05:01:53 +00:00
Marcel Moolenaar
c1f0aabb9f In vfs_filteropt(), only print the errmsg when there's no errmsg
mount option. Otherwise errors tend to get printed multiple times.
2010-10-18 04:34:42 +00:00
Marcel Moolenaar
76e18b25a0 Rename boot() to kern_reboot() and make it visible outside of
kern_shutdown.c. This makes it easier for emulators and other
parts of the kernel to initiate a reboot.
2010-10-18 04:30:27 +00:00
Marcel Moolenaar
3d5c947d9d Allow the MDIOCATTACH ioctl operation to originate from within the kernel.
To protect against malicious software, we demand that the file name is at
a particular location (i.e. appended to the mdio structure) for it to be
treated as in-kernel.
2010-10-18 04:26:32 +00:00
Kevin Lo
4bc8fad7bd Fix a possible race where the directory dirent is moved to the location
that was used by ".." entry.
This change seems fixed panic during attempt to access msdosfs data
over nfs.

Reviewed by:	kib
MFC after:	1 week
2010-10-18 03:34:33 +00:00
Scott Long
34c9624e2d Re-add opt_mps.h and opt_cam.h, lost in the previous rev. 2010-10-17 20:01:56 +00:00
Nathan Whitehorn
c8593f7c4d Fix an XXX comment by answering 'no'. OS X does not set the day-of-week
counter on SMU-based systems, which causes FreeBSD to reject the RTC time
when used in a dual-boot environment. Since we don't use the day-of-week
counter anyway, solve this by just not checking that it matches.

MFC after:	3 weeks
2010-10-17 17:31:49 +00:00
Marius Strobl
17f3c8f1e3 - In oneshot-mode it doesn't make sense to try to compensate the clock
drift in order to achieve a more stable clock as the tick intervals may
  vary in the first place. In fact I haven't seen this code kick in when
  in oneshot-mode so just skip it in that case.
- There's no need to explicitly stop the (S)TICK counter in oneshot-mode
  with every tick as it just won't trigger again with the (S)TICK compare
  register set to a value in the past (with a wrap-around once every ~195
  years of uptime at 1.5 GHz this isn't something we have to worry about
  in practice).
- Given that we'll disable interrupts completely anyway there's no
  need to enter critical sections.
2010-10-17 16:46:54 +00:00
David Xu
21ecd1e977 - Insert thread0 into correct thread hash link list.
- In thr_exit() and kthread_exit(), only remove thread from
  hash if it can directly exit, otherwise let exit1() do it.
- In thread_suspend_check(), fix cleanup code when thread needs
  to exit.
This change seems fixed the "Bad link elm " panic found by
Peter Holm.

Stress testing: pho
2010-10-17 11:01:52 +00:00
Andriy Gapon
23a1bcf8c6 zfs: add vop_getpages method implementation
This should make vnode_pager_getpages path a bit shorter and clearer.
Also this should eliminate problems with partially valid pages.
Having this method opens room for future optimizations.

To do: try to satisfy other pages besides the required one taking into
account tradeofs between number of page faults, read throughput and read
latency.  Also, eventually vop_putpages should be added too.

Reviewed by:	kib, mm, pjd
MFC after:	3 weeks
2010-10-16 20:43:05 +00:00
Bjoern A. Zeeb
12112cf676 MfP4 CH182763 (original version):
Make it harder to exploit certain in_control() related races between the
intiial lookup at the beginning and the time we will remove the entry
from the lists by re-checking that entry is still in the list before
trying to remove it.

(*) It is believed that with the current code and locking strategy we
    cannot completely fix all race.

Reported by:	Nima Misaghian (nima_misa hotmail.com) on net@ 20100817
Tested by:	Nima Misaghian (nima_misa hotmail.com) (original version)
PR:		kern/146250
Submitted by:	Mikolaj Golub (to.my.trociny gmail.com) (different version)
MFC after:	1 week
2010-10-16 19:53:22 +00:00
Alexander Motin
0aa99d33b5 Allow umass to use bigger transactions for USB 3.0 devices. It is less
important for USB 2.0 devices and some of them reported to have problems
with large transactions. But USB 3.0 benchmarks show that limited number
of transactions per second on USB makes impossible to reach high transfer
speeds without using bigger transactions.

On my tests this change allows to read up to 220MB/s from USB-attached SSD
(at block size of 256-512KB), comparing to only 113MB/s without it.

Reviewed by:	hselasky
2010-10-16 19:29:37 +00:00
Bjoern A. Zeeb
ee7c7fee94 Close a race acquiring the IF_ADDR_LOCK() for each entry while iterating
over all interfaces to make sure the address will neither change nor be
freed while we are working on it.

PR:		kern/146250
Submitted by:	Mikolaj Golub (to.my.trociny gmail.com)
MFC after:	1 week
2010-10-16 19:25:27 +00:00
Bjoern A. Zeeb
fc2bfb3294 lltable_drain() has never been used so far, thus #if 0 it for now.
While touching it add the missing locking to the now disabled code
for the time when we'll resurrect it.

MFC after:	3 days
2010-10-16 18:42:09 +00:00
Andriy Gapon
2b89f1fc9e atrtc: remove (pre-)historic check of RTC NVRAM at address 0x0e
Old scrolls tell that once upon a time IBM AT BIOS was known to put some
useful system diagnostic information into RTC NVRAM.  It is not really
known if and for how long PC BIOSes followed that convention, but I
believe that many, if not all, modern BIOSes do not do that any more
(not mentioning other types of x86 firmware).
Some diagnostic bits don't even make any sense any longer.
The check results in confusing messages upon boot on some systems.
So I am removing it.

Discussed with:	bde, jhb, mav
MFC after:	3 weeks
2010-10-16 10:45:36 +00:00
Konstantin Belousov
420cfbb460 Provide vfs.ncsizefactor instead of hard-coding namecache ratio.
Move debug.ncnegfactor to vfs.ncnegfactor [1].
Provide some descriptions for the namecache related sysctls [1].

Based on the submission by:	Rogier R. Mulhuijzen <drwilco drwilco net> [1]
MFC after:	2 weeks
X-MFC-note:	remove debug.ncnegfactor in HEAD after MFC
2010-10-16 09:44:31 +00:00
Lawrence Stewart
ca09d7728b Retire the system-wide, per-reassembly queue segment limit. The mechanism is far
too coarse grained to be useful and the default value significantly degrades TCP
performance on moderate to high bandwidth-delay product paths with non-zero loss
(e.g. 5+Mbps connections across the public Internet often suffer).

Replace the outgoing mechanism with an individual per-queue limit based on the
number of MSS segments that fit into the socket's receive buffer. This should
strike a good balance between performance and the potential for resource
exhaustion when FreeBSD is acting as a TCP receiver. With socket buffer
autotuning (which is enabled by default), the reassembly queue tracks the
socket buffer and benefits too.

As the XXX comment suggests, my testing uncovered some unexpected behaviour
which requires further investigation. By using so->so_rcv.sb_hiwat
instead of sbspace(&so->so_rcv), we allow more segments to be held across both
the socket receive buffer and reassembly queue than we probably should. The
tradeoff is better performance in at least one common scenario, versus a devious
sender's ability to consume more resources on a FreeBSD receiver.

Sponsored by:	FreeBSD Foundation
Reviewed by:	andre, gnn, rpaulo
MFC after:	2 weeks
2010-10-16 07:12:39 +00:00
Lawrence Stewart
c8dc0ab886 - Switch the "net.inet.tcp.reass.cursegments" and
"net.inet.tcp.reass.maxsegments" sysctl variables to be based on UMA zone
  stats. The value returned by the cursegments sysctl is approximate owing to
  the way in which uma_zone_get_cur is implemented.

- Discontinue use of V_tcp_reass_qsize as a global reassembly segment count
  variable in the reassembly implementation. The variable was used without
  proper synchronisation and was duplicating accounting done by UMA already. The
  lack of synchronisation was particularly problematic on SMP systems
  terminating many TCP sessions, resulting in poor TCP performance for
  connections with non-zero packet loss.

Sponsored by:	FreeBSD Foundation
Reviewed by:	andre, gnn, rpaulo (as part of a larger patch)
MFC after:	2 weeks
2010-10-16 05:37:45 +00:00
Lawrence Stewart
1c6cae9711 Change uma_zone_set_max to return the effective value of "nitems" after
rounding. The same value can also be obtained with uma_zone_get_max, but this
change avoids a caller having to make two back-to-back calls.

Sponsored by:	FreeBSD Foundation
Reviewed by:	gnn, jhb
2010-10-16 04:41:45 +00:00
Lawrence Stewart
c4ae7908a7 - Simplify implementation of uma_zone_get_max.
- Add uma_zone_get_cur which returns the current approximate occupancy of
  a zone. This is useful for providing stats via sysctl amongst other things.

Sponsored by:	FreeBSD Foundation
Reviewed by:	gnn, jhb
MFC after:	2 weeks
2010-10-16 04:14:45 +00:00
Marius Strobl
1636dde957 Convert the PHY drivers to honor the mii_flags passed down and convert
the NIC drivers as well as the PHY drivers to take advantage of the
mii_attach() introduced in r213878 to get rid of certain hacks. For
the most part these were:
- Artificially limiting miibus_{read,write}reg methods to certain PHY
  addresses; we now let mii_attach() only probe the PHY at the desired
  address(es) instead.
- PHY drivers setting MIIF_* flags based on the NIC driver they hang
  off from, partly even based on grabbing and using the softc of the
  parent; we now pass these flags down from the NIC to the PHY drivers
  via mii_attach(). This got us rid of all such hacks except those of
  brgphy() in combination with bce(4) and bge(4), which is way beyond
  what can be expressed with simple flags.

While at it, I took the opportunity to change the NIC drivers to pass
up the error returned by mii_attach() (previously by mii_phy_probe())
and unify the error message used in this case where and as appropriate
as mii_attach() actually can fail for a number of reasons, not just
because of no PHY(s) being present at the expected address(es).

This file was missed in r213893.
2010-10-15 23:34:31 +00:00
Jung-uk Kim
debfe32ccd Remove unnecessary castings and fix couple of style(9) nits. 2010-10-15 21:41:59 +00:00
Jung-uk Kim
6e877573df Move setting power state for children into a separate function as they were
essentially the same.  This also restores hw.pci.do_power_resume tunable,
which was broken since r211430.

Reviewed by:	jhb
2010-10-15 21:39:51 +00:00
Andreas Tobler
da89fa28c6 Add three new drivers for fan control and temperature reading on the
PowerMac7,2.

- The fcu driver lets us read and write the fan RPMs for all fans in the
  PowerMac7,2. This driver is PowerMac specific.
- The ds1775 is a driver to read the temperature for the drive bay sensor.
- The max6690 is another driver to read temperatures. Here it is used to
  read the inlet, the backside and the U3 heatsink temperature.

An additional driver, the ad7417, will follow later.

Thanks to nwhitehorn for guiding me through this driver development.

Approved by:	nwhitehorn (mentor)
2010-10-15 20:08:16 +00:00
Marius Strobl
e60f6da1d6 Now that all previous users of mii_phy_probe() have been converted
in r213893 and r213894 to use mii_attach() instead remove the former
and along with it the "EVIL HACK".

MFC after:	never
2010-10-15 15:46:58 +00:00
Matthew D Fleming
09631173be Currently only opt_compat.h is included by the mps(4) driver. Also
enable /dev/mps0, which was missing from my previous patches enabling
f/w upload and download.

opt_compat.h issue noticed by scottl.
2010-10-15 15:24:59 +00:00
Alan Cox
353b642ced Update pmap_extract() to handle 1GB page mappings. Some device drivers
use pmap_extract() rather than pmap_kextract() on direct map addresses.
Thus, pmap_extract() needs to be able to deal with 1GB page mappings if
we are to use 1GB page mappings for the direct map.  (See r197580.)
2010-10-15 15:23:34 +00:00
Marius Strobl
b56f1ea9d4 Remove a device_printf() accidentally left in r213894.
Submitted by:	jhb
2010-10-15 15:16:36 +00:00
Marius Strobl
d6c65d276e Converted the remainder of the NIC drivers to use the mii_attach()
introduced in r213878 instead of mii_phy_probe(). Unlike r213893 these
are only straight forward conversions though.

Reviewed by:	yongari
2010-10-15 15:00:30 +00:00
Marius Strobl
8e5d93dbb4 Convert the PHY drivers to honor the mii_flags passed down and convert
the NIC drivers as well as the PHY drivers to take advantage of the
mii_attach() introduced in r213878 to get rid of certain hacks. For
the most part these were:
- Artificially limiting miibus_{read,write}reg methods to certain PHY
  addresses; we now let mii_attach() only probe the PHY at the desired
  address(es) instead.
- PHY drivers setting MIIF_* flags based on the NIC driver they hang
  off from, partly even based on grabbing and using the softc of the
  parent; we now pass these flags down from the NIC to the PHY drivers
  via mii_attach(). This got us rid of all such hacks except those of
  brgphy() in combination with bce(4) and bge(4), which is way beyond
  what can be expressed with simple flags.

While at it, I took the opportunity to change the NIC drivers to pass
up the error returned by mii_attach() (previously by mii_phy_probe())
and unify the error message used in this case where and as appropriate
as mii_attach() actually can fail for a number of reasons, not just
because of no PHY(s) being present at the expected address(es).

Reviewed by:	jhb, yongari
2010-10-15 14:52:11 +00:00
Jung-uk Kim
37d696a38e Stop hard coding nm(1) and make it overridable. 2010-10-14 23:31:58 +00:00
Matthew D Fleming
e658ccea60 Fixes to mps_user_command():
- fix the leak of command struct on error
 - simplify the cleanup logic
 - EINPROGRESS is not a fatal error
 - buggy comment and error message

Reviewed by:   ken
2010-10-14 23:26:08 +00:00
Hans Petter Selasky
e11ad60db2 Add new USB device IDs to the list of supported devices.
PR:	usb/151043
Approved by:    thompsa (mentor)
2010-10-14 22:14:55 +00:00
Hans Petter Selasky
f3aa3ca3d8 - Add more USB devices to usbdevs and rename some previously unknown ones.
- Add more USB mass storage quirks.

Submitted by: Dmitry Luhtionov
PR: usb/149934, usb/143045
Approved by:    thompsa (mentor)
2010-10-14 22:06:52 +00:00
Marius Strobl
a55fb8a458 Add a NetBSD-compatible mii_attach(), which is intended to eventually
replace mii_phy_probe() altogether. Compared to the latter the advantages
of mii_attach() are:
- intended to be called multiple times in order to attach PHYs in multiple
  passes (f.e. in order to only use sub-ranges of the 0 to MII_NPHY - 1
  range)
- being able to pass along the capability mask from the NIC to the PHY
  drivers
- being able to specify at which address (phyloc) to probe for a PHY
  (instead of always probing at all addresses from 0 to MII_NPHY - 1)
- being able to specify which PHY instance (offloc) to attach
- being able to pass along MIIF_* flags from the NIC to the PHY drivers
  (f.e. as required to indicated to the PHY drivers that flow control is
  supported by the NIC driver, which actually is the motivation for this
  change).

While at it, I used the opportunity to get rid of some hacks in mii(4)
like miibus_probe() generally doing work besides sheer probing and the
"EVIL HACK" (which will vanish entirely along with mii_phy_probe()) by
passing the struct ifnet pointer via an argument of mii_attach() as well
as to fix some resource leaks in mii(4) in case something fails.
Commits which will update the PHY drivers to honor the MII flags passed
down from the NIC drivers and take advantage of mii_attach() to get rid
of certain types of hacks in NIC and PHY drivers as well as a conversion
of the remaining uses of mii_phy_probe() will follow shortly.

Reviewed by:	jhb, yongari
Obtained from:	NetBSD (partially)
2010-10-14 22:01:40 +00:00
Hans Petter Selasky
e781dd4fbb Add more USB device IDs to supported list of devices.
Submitted by:	Nick Hibma
PR:	usb/149900
Approved by:    thompsa (mentor)
2010-10-14 21:53:42 +00:00
Marius Strobl
15272ec763 Explicitly lower the PIL to 0 as part of enabling interrupts, similar to
what is done on other platforms. Unlike as with the sched_throw(NULL)
called on BSPs during their startup apparently there's nothing which will
reliably lower it on APs. I'm unsure why this only came up on V215 though,
breaking these with r207248. My best guess is that these are the only
supported ones so far fast enough to loose some race.

PR:		151404
MFC after:	3 days
2010-10-14 21:46:53 +00:00
Hans Petter Selasky
7ee713e89e Fix forwarding of Line Register Status changes to TTY layer.
PR: usb/149675
Approved by:    thompsa (mentor)
2010-10-14 21:45:41 +00:00
Hans Petter Selasky
f7d8cf85e3 Remove unused EHCI register definition.
Define reserved EHCI register.

Approved by:    thompsa (mentor)
2010-10-14 21:41:08 +00:00
Hans Petter Selasky
53e0bf6e70 Revert most of r197682 (EHCI Hardware BUG workaround). Implement
proper solution which is to not use the TERMINATE pointer, but rather
link to a halted TD. The initial fix was due to a misunderstanding
about how the EHCI hardware works. Thanks to Alan Stern for clearing
this up. This patch can increase mass storage read performance
significantly when the IRQ rate is less than 8000 IRQ/s.

Approved by:    thompsa (mentor)
2010-10-14 21:38:06 +00:00
Marius Strobl
45c347bed0 - In the spirit of r212559 add a comment describing what will eventually
lower the PIL.
- Just as with the AP ensure that the (S)TICK timer(s) are in a known
  state when starting BSPs.
2010-10-14 21:34:53 +00:00
Marius Strobl
4d0056a865 Just like xmphy(4) this driver doesn't use any of the generic subroutines
so there's no need to fill mii_{ext,}capabilities either.
2010-10-14 21:30:13 +00:00
Hans Petter Selasky
59c9250333 Avoid using endless retransmission at EHCI hardware level, hence this hide
errors from the applications. Only use endless retransmission while in the
non-addressed state on a High-Speed device.

Approved by:    thompsa (mentor)
2010-10-14 21:26:06 +00:00
Hans Petter Selasky
1678e1358b Correct EHCI root HUB interface descriptor.
Approved by:    thompsa (mentor)
2010-10-14 21:18:18 +00:00
Hans Petter Selasky
b494261c69 Correct EHCI port register read.
Approved by:    thompsa (mentor)
2010-10-14 21:14:33 +00:00
Hans Petter Selasky
6001d01180 - Add more USB devices to usbdevs and rename some previously unknown ones.
- Add more USB mass storage quirks.

Submitted by: Dmitry Luhtionov
PR: usb/149934, usb/143045
Approved by:    thompsa (mentor)
2010-10-14 21:09:37 +00:00
Hans Petter Selasky
51fd3d75fe - Add support for LibUSB in 32-bit compatibility mode.
Approved by:    thompsa (mentor)
2010-10-14 20:38:18 +00:00
Konstantin Belousov
113801819a Remove stale comment.
Submitted by:	arundel
MFC after:	3 days
2010-10-14 19:30:44 +00:00
Rui Paulo
ea7303d127 Revert r213765. This is required because our build infrastructure uses
the host lex instead of the lex built during buildworld. I will MFC the
lex changes soon and in a few weeks this I'll commit again r213765.
2010-10-14 19:19:19 +00:00
Pyun YongHyeon
96486faa6e Make sure to not use stale ip/tcp header pointers. The ip/tcp
header parser uses m_pullup(9) to get access to mbuf chain.
m_pullup(9) can allocate new mbuf chain and free old one if the
space left in the mbuf chain is not enough to hold requested
contiguous bytes. Previously drivers can use stale ip/tcp header
pointer if m_pullup(9) returned new mbuf chain.

Reported by:	Andrew Boyer (aboyer <> averesystems dot com)
MFC after:	10 days
2010-10-14 18:31:40 +00:00
Pyun YongHyeon
cb2f3e7f9b Backout r204230. TX mbuf parser for VLAN is still required to
enable TX checksum offloading if VLAN hardware tagging is disabled.
2010-10-14 17:57:52 +00:00
Pyun YongHyeon
39d76ed635 It seems some multi-port dc(4) controllers shares SROM of the first
port such that reading station address from second port always
returned 0xFF:0xFF:0xFF:0xFF:0xFF:0xFF Unfortunately it seems there
is no easy way to know whether SROM is shared or not. Workaround
the issue by traversing dc(4) device list and see whether we're
using second port and use station address of controller 0 as base
station address of second port.

PR:		kern/79262
MFC after:	2 weeks
2010-10-14 17:22:38 +00:00
Matthew D Fleming
3f237b0b0c Support firmware download. 2010-10-14 16:44:44 +00:00
Matthew D Fleming
13ac67a784 Re-work the internals of adding items to the driver's scatter-gather
list.  Use the new internals to simplify adding transaction context
elements, and in future diffs, more complicated SGLs.
2010-10-14 16:44:05 +00:00
Bjoern A. Zeeb
acf456a04a Remove dead code:
assignment to a local variable not used anywhere after that.

MFC after:	3 days
2010-10-14 15:15:22 +00:00
Bjoern A. Zeeb
e046b77ee1 Style: make the asterisk go with the variable name, not the type.
MFC after:	3 days
2010-10-14 14:49:49 +00:00
Bjoern A. Zeeb
dc699bac75 Use ifa_ifwithaddr_check() rather than ifa_ifwithaddr() as we are not
interested in the result and would leak a reference otherwise.

PR:		kern/151435
Submitted by:	Andrew Boyer (aboyer averesystems.com)
MFC after:	3 days
2010-10-14 12:32:49 +00:00
David Xu
407af02b6e In kern_sigtimedwait(), move initialization code out of process lock,
instead of using SIGISMEMBER to test every interesting signal, just
unmask the signal set and let cursig() return one, get the signal
after it returns, call reschedule_signal() after signals are blocked
again.

In kern_sigprocmask(), don't call reschedule_signal() when it is
unnecessary.

In reschedule_signal(), replace SIGISEMPTY() + SIGISMEMBER() with
sig_ffs(), rename variable 'i' to sig.
2010-10-14 08:01:33 +00:00
Matthew D Fleming
bf73d4d28e Use a safer mechanism for determining if a task is currently running,
that does not rely on the lifetime of pointers being the same. This also
restores the task KBI.

Suggested by:	jhb
MFC after:	1 month
2010-10-13 22:59:04 +00:00
Pyun YongHyeon
7ed3f0f0e8 Fix a regression introduced in r213710. r213710 removed the use of
auto polling such that it made all controllers obtain link status
information from the state of the LNKRDY input signal. Broadcom
recommends disabling auto polling such that driver should rely on
PHY interrupts for link status change indications. Unfortunately it
seems some controllers(BCM5703, BCM5704 and BCM5705) have PHY
related issues so Linux took other approach to workaround it.
bge(4) didn't follow that and it used to enable auto polling to
workaround it. Restore this old behavior for BCM5700 family
controllers and BCM5705 to use auto polling. For BCM5700 and
BCM5701, it seems it does not need to enable auto polling but I
restored it for safety.
Special thanks to marius who tried lots of patches with patience.

Reported by:	marius
Tested by:	marius
2010-10-13 22:29:48 +00:00
Hans Petter Selasky
7e05daae8f USB network (NCM driver):
- correct the ethernet payload remainder which
must be post-offseted by -14 bytes instead of
0 bytes. This is not very clearly defined in the
NCM specification.
- add development feature about limiting the
maximum datagram count in each NCM payload.
- zero-pad alignment data
- add TX-interval tuning sysctl

Approved by:    thompsa (mentor)
2010-10-13 22:04:55 +00:00
Pyun YongHyeon
d4f5240abd Add more checks for resolved link speed in bge_miibus_statchg().
Link UP state could be reported first before actual completion of
auto-negotiation. This change makes bge(4) reprogram BGE_MAC_MODE,
BGE_TX_MODE and BGE_RX_MODE register only after controller got a
valid link.
2010-10-13 21:53:37 +00:00
Juli Mallett
2bcbafd6be Keep polling at 50hz as long as link state is changing. 2010-10-13 21:45:56 +00:00
Jung-uk Kim
3c1812acc6 Merge ACPICA 20101013. 2010-10-13 21:37:02 +00:00
Hans Petter Selasky
d751769d2c USB Network:
- Add new driver for iPhone tethering
- Supports the iPhone 3G/3GS/4G ethernet protocol

Approved by:    thompsa (mentor)
2010-10-13 21:36:42 +00:00
Hans Petter Selasky
fb78ea8bd1 USB WLAN:
- Add new device ID

PR:	usb/150989
Approved by:    thompsa (mentor)
2010-10-13 20:56:54 +00:00
Hans Petter Selasky
d58bfbb688 USB network (UHSO):
- Correct network interface flags.

PR:	usb/149039
Submitted by:	Fredrik Lindberg
Approved by:    thompsa (mentor)
2010-10-13 20:51:06 +00:00
Hans Petter Selasky
3b6f59eeaa Correct some root HUB descriptor fields in multiple controller drivers.
Remove an unused structure.

Approved by:    thompsa (mentor)
2010-10-13 20:37:19 +00:00
Dimitry Andric
235610273e Change two missed instances of 'retq' in aeskeys_i386.S to 'retl', which
makes it possible to assemble this file with gas from newer binutils.

Reviewed by:	kib
2010-10-13 17:55:53 +00:00
Pyun YongHyeon
0759386405 Rewrite interrupt handler to give fairness for both RX and TX.
Previously rl(4) continuously checked whether there are RX events
or TX completions in forever loop. This caused TX starvation under
high RX load as well as consuming too much CPU cycles in the
interrupt handler. If interrupt was shared with other devices which
may be always true due to USB devices in these days, rl(4) also
tried to process the interrupt. This means polling(4) was the only
way to mitigate the these issues.

To address these issues, rl(4) now disables interrupts when it
knows the interrupt is ours and limit the number of iteration of
the loop to 16. The interrupt would be enabled again before exiting
interrupt handler if the driver is still running. Because RX buffer
is 64KB in size, the number of iterations in the loop has nothing
to do with number of RX packets being processed. This change
ensures sending TX frames under high RX load.

RX handler drops a driver lock to pass received frames to upper
stack such that there is a window that user can down the interface.
So rl(4) now checks whether driver is still running before serving
RX or TX completion in the loop.

While I'm here, exit interrupt handler when driver initialized
controller.

With this change, now rl(4) can send frames under high RX load even
though the TX performance is still not good(rl(4) controllers can't
queue more than 4 frames at a time so low TX performance was one of
design issue of rl(4) controllers). It's much better than previous
TX starvation and you should not notice RX performance drop with
this change. Controller still shows poor performance under high
network load but for many cases it's now usable without resorting
to polling(4).

MFC after:	2 weeks
2010-10-13 17:55:19 +00:00
Rui Paulo
3065282fdb Revert r213793. 2010-10-13 17:38:23 +00:00
Rui Paulo
d28843a449 When calling panic(), always pass a format string. 2010-10-13 17:21:21 +00:00
Rui Paulo
eab0cdc504 Don't do a logical AND of the result of strcmp() with a constant.
Found with:	clang
2010-10-13 17:17:50 +00:00
Rui Paulo
653bfa0ba3 Ignore the return value of ADDCARRY(). 2010-10-13 17:16:08 +00:00
Rui Paulo
910a5e18ba Pass a format string to panic() and to taskqueue_start_threads().
Found with:	clang
2010-10-13 17:13:43 +00:00
Rui Paulo
6e634bb80f In zfs_post_common(), use %d instead of %hhu.
Found with:	clang
2010-10-13 17:12:23 +00:00
Rui Paulo
8e5fffdfb5 Properly tell the compiler that we want to ignore the return value of
certain macros.
2010-10-13 17:09:53 +00:00
Rui Paulo
f3e830e9b0 Fix several cases were a conditional operator was used instead of a
bitwise operator.

Found with:	clang
2010-10-13 17:09:16 +00:00
Jung-uk Kim
1debbf5d79 Clean up unused headers. 2010-10-13 17:06:25 +00:00
Jung-uk Kim
e41724b279 Remove acpi_bus_number() completely. It had to be removed in r212761.
Pointed out by:	jhb
2010-10-13 16:30:41 +00:00
Rui Paulo
62e9a33814 Pass a format string to make_dev().
Found by:	clang
2010-10-13 14:57:13 +00:00
Rui Paulo
0c05779096 Add opt_compat.h to SRCS. 2010-10-13 14:44:38 +00:00
Rui Paulo
30aa84b727 Pass a format string to make_dev(). 2010-10-13 14:41:52 +00:00
Rui Paulo
577fedb6b1 Fix a brain-o: wrong case statement semantics.
Found with:	clang
2010-10-13 14:39:54 +00:00
Rui Paulo
a6d8c83fd9 WPA_CSE_WEP104 was being incorrectly checked.
Found with:	clang
2010-10-13 14:37:52 +00:00
Rui Paulo
1d90e6b224 Mark acpi_bus_number() as __unused. This allows clang to this file
without any warnings.
2010-10-13 11:38:24 +00:00
Rui Paulo
0b53cc9f56 Ignore the return value of DE_INTERNALIZE(). 2010-10-13 11:37:39 +00:00
Rui Paulo
cd1fa5bd4d Explicitly tell the compiler that we don't care about the return value
of kbdd_ioctl().
2010-10-13 11:37:12 +00:00
Rui Paulo
42a783c16a The canonical way to print __func__ when using KASSERT() is to write
("%s", __func__). This avoids clang's -Wformat-string warnings.
2010-10-13 11:35:59 +00:00
Rui Paulo
4c88812572 Purposely tell the compiler that we ignore the return value of ADDCARRY()
in the REDUCE macro.

Reviewed by:	dim, rdivacky
2010-10-13 10:45:22 +00:00
Rui Paulo
c42a2be28f Define YY_NO_INPUT. This makes aicasm buildable by clang with Werror
turned on.
2010-10-13 10:33:01 +00:00
Juli Mallett
f05957f7c6 o) Make it possible to attach a PHY directly to an octe device rather than
using miibus, since for some devices that use multiple addresses on the bus,
   going through miibus may be unclear, and for devices that are not standard
   MII PHYs, miibus may throw a fit, necessitating complicated interfaces to
   fake the interface that it expects during probe/attach.
o) Make the mv88e61xx SMI interface in octe attach a PHY directly and fix some
   mistakes in the code that resulted from trying too hard to present a nice
   interface to miibus.
o) Add a PHY driver for the mv88e61xx.  If attached (it is optional in kernel
   compiles so the default behavior of having a dumb switch is preserved) it
   will place the switch in a VLAN-tagging mode such that each physical port
   has a VLAN associated with it and interfaces for the VLANs can be created to
   address or bridge between them.
   XXX It would be nice for this to be part of a single module including the
       SMI interface, and for it to fit into a generic switch configuration
       framework and for it to use DSA rather than VLANs, but this is a start
       and gives some sense of the parameters of such frameworks that are not
       currently present in FreeBSD.  In lieu of a switch configuration
       interface, per-port media status and VLAN settings are in a sysctl tree.
   XXX There may be some minor nits remaining in the handling of broadcast,
       multicast and unknown destination traffic.  It would also be nice to go
       through and replace the few remaining magic numbers with macros at some
       point in the future.
   XXX This has only been tested with the MV88E6161, but it should work with
       minimal or no modification on related switches, so support for probing
       them was included.

Thanks to Pat Saavedra of TELoIP and Rafal Jaworowski of Semihalf for their
assistance in understanding the switch chipset.
2010-10-13 09:17:44 +00:00
David Xu
fc4ecc1d48 sigqueue_collect_set() is no longer needed because other functions
maintain pending set correctly.
2010-10-13 06:28:40 +00:00
Rick Macklem
cec077bc8f Fix the krpc so that it can handle NFSv3,UDP mounts with a read/write
data size greater than 8192. Since soreserve(so, 256*1024, 256*1024)
would always fail for the default value of sb_max, modify clnt_dg.c
so that it uses the calculated values and checks for an error return
from soreserve(). Also, add a check for error return from soreserve()
to clnt_vc.c and change __rpc_get_t_size() to use sb_max_adj instead of
the bogus maxsize == 256*1024.

PR:		kern/150910
Reviewed by:	jhb
MFC after:	2 weeks
2010-10-13 00:57:14 +00:00
Jung-uk Kim
ac731af567 Use AcpiReset() from ACPICA instead of rolling our own, which is actually
incomplete.  If FADT says the register is available, enable the capability
by default.  Remove the previous default value from acpi(4).
2010-10-13 00:21:53 +00:00
Jung-uk Kim
56b11f84a7 Remove trailing ", " from `sysctl machdep.idle_available' output. 2010-10-12 20:53:12 +00:00
Pyun YongHyeon
a4431eba57 Protect bge(4) from accessing invalid NIC internal memory regions
on BCM5906.

Tested by:	Buganini < buganini <> gmail dot com >
2010-10-12 19:31:25 +00:00