Commit Graph

86284 Commits

Author SHA1 Message Date
Peter Grehan
57a7aaa71f Sync with Bryan Venteicher's virtio git repo:
d04e609bdd1973cc7d2e8b38b7dcfae057b0962d
	virtio_blk: Use correct temporary variable in vtblk_poll_request

Obtained from:	Bryan Venteicher  bryanv at daemoninthecloset dot org
2012-04-16 18:29:12 +00:00
Marius Strobl
05374275e5 Turn on PREEMPTION by default. After fixing several bugs over time, the
last show-stopper keeping PREEMPTION from being usable on sparc64 should
have been dealt with in r230662.
At least on 2-way systems, PREEMPTION causes a little bit of a degradation
in worldstone performance. However, FreeBSD seems to have started building
up regressions in !PREEMPTION cases so sparc64 better should not be an
oddball in this regard.

MFC after:	1 week
2012-04-16 18:29:07 +00:00
Jaakko Heinonen
587fdb536f Sync tmpfs_chflags() with the recent changes to UFS:
- Add a check for unsupported file flags.
- Return EPERM when an user without PRIV_VFS_SYSFLAGS privilege attempts
  to toggle SF_SETTABLE flags.
2012-04-16 18:10:34 +00:00
Jaakko Heinonen
c5ab5ce345 tmpfs: Allow update mounts only for certain options.
Since r230208 update mounts were allowed if the list of mount options
contained the "export" option. This is not correct as tmpfs doesn't
really support updating all options.

Reviewed by:	kevlo, trociny
2012-04-16 18:07:42 +00:00
Gleb Smirnoff
ef341ee1e3 When we receive an ICMP unreach need fragmentation datagram, we take
proposed MTU value from it and update the TCP host cache. Then
tcp_mss_update() is called on the corresponding tcpcb. It finds the
just allocated entry in the TCP host cache and updates MSS on the
tcpcb. And then we do a fast retransmit of what we have in the tcp
send buffer.

This sequence gets broken if the TCP host cache is exausted. In this
case allocation fails, and later called tcp_mss_update() finds nothing
in cache. The fast retransmit is done with not reduced MSS and is
immidiately replied by remote host with new ICMP datagrams and the
cycle repeats. This ping-pong can go up to wirespeed.

To fix this:
- tcp_mss_update() gets new parameter - mtuoffer, that is like
  offer, but needs to have min_protoh subtracted.
- tcp_mtudisc() as notification method renamed to tcp_mtudisc_notify().
- tcp_mtudisc() now accepts not a useless error argument, but proposed
  MTU value, that is passed to tcp_mss_update() as mtuoffer.

Reported by:	az
Reported by:	Andrey Zonov <andrey zonov.org>
Reviewed by:	andre (previous version of patch)
2012-04-16 13:49:03 +00:00
Marko Zec
5bc2249ff8 #include <net/vnet.h> is no longer needed here.
Spotted by:	Ed Maste
MFC after:	3 days.
2012-04-16 13:41:46 +00:00
Andriy Gapon
85d5a233bd zfsboot: honor -q if it's present in boot.config
Before r228267 the option was honored but the original content of
boot.config was not preserved.  I tried to fix that but missed the idea.
Now the proper way of doing things is taken from i386/boo2.
Also, a comment is added to explain this a little bit unobvious
behavior.

Inspired by:	jhb
MFC after:	5 days
2012-04-16 10:43:06 +00:00
Andriy Gapon
8b452ff57a intpm: add ATI IXP400 pci id
PR:		kern/136762
Submitted by:	Aurelien Mere <freebsd@amc-os.com>
Tested by:	Jens Link <jens.link@gmx.de>
MFC after:	5 days
2012-04-16 10:33:46 +00:00
Andrew Turner
0b898a9ef1 Replace the C implementation of __aeabi_read_tp with an assembly version.
This ensures we follow the ABI by preserving registers r1-r3.

Reviewed by:	jmallett, imp
2012-04-16 09:38:20 +00:00
Adrian Chadd
468d6f48b3 Add in the AP96 phy configuration from openwrt.
* arge0 doesn't (yet) work via the switch PHY ports; I'm not sure why.
* arge1 maps to the WAN port. That works.

TODO:

* The PLL register needs a different (non-default) value for Gigabit
  Ethernet.  The board setup code needs to be extended a bit to allow
  for non-default pll_1000 values - right now, those values come out
  of hard-coded values in the per-chip set_pll_ge() routines.

Obtained from:	Linux / OpenWRT
2012-04-15 22:59:56 +00:00
Adrian Chadd
5fdb2379cb The AR913x MII speed configuration matches the AR71xx MII configuration.
So share the code.

Don't do it for the AR724x - that has a completely different set of PLL
and MII configuration parameters.
2012-04-15 22:34:22 +00:00
Gleb Kurtsou
f8439900d6 Provide better description for vfs.tmpfs.memory_reserved sysctl.
Suggested by:	Anton Yuzhaninov <citrin@citrin.ru>
2012-04-15 21:59:28 +00:00
Adrian Chadd
2aa563dfeb Migrate the net80211 TX aggregation state to be from per-AC to per-TID.
TODO:

* Test mwl(4) more thoroughly!

Reviewed by:	bschmidt (for iwn)
2012-04-15 20:29:39 +00:00
Adrian Chadd
82d05362e6 Drop this down from 512 to 128 for now.
This may result in a bit of a throughput drop.  However, any throughput
drop at this point should be investigated and root caused, as it's likely
because TX scheduling (all the way down to how preemption, scheduler work,
etc) is happening in a sub-optimal fashion.

This also makes it much more likely to be reloadable on a live machine.
Allocating 5120 TX ath_buf entries via contigmalloc is very unlikely
after a few hours of using X/Chromium.
2012-04-15 19:54:22 +00:00
Bernhard Schmidt
5ea748f578 Use the M_AMPDU_MPDU flag to determine when to manually set the seqno and
use a BA queue.
2012-04-15 18:25:17 +00:00
Adrian Chadd
bf9abaa954 Fix the mask logic when reading PCI configuration space registers. 2012-04-15 02:38:01 +00:00
Adrian Chadd
b890549d41 Override some default values to work around various issues in the deep,
dirty and murky past.

* Override the default cache line size to be something reasonable if
  it's set to 0.  Some NICs initialise with '0' (eg embedded ones)
  and there are comments in the driver stating that various OSes (eg
  older Linux ones) would incorrectly program things and 0 out this
  register.

* Just default to overriding the latency timer.  Every other driver
  does this.

* Use a default cache line size of 32 bytes.  It should be "reasonable
  enough".

Obtained from:	Linux ath9k, Atheros
2012-04-15 00:04:23 +00:00
Davide Italiano
99006d44f8 Fix a typo.
Approved by:	gnn (mentor)
MFC after:	2 days
2012-04-14 23:59:58 +00:00
Davide Italiano
331805a5d3 Fix some style bugs introduced in a previous commit (r233045)
Reported by:	glebius, jmallet
Reviewed by:	jmallet
Approved by:	gnn (mentor)
MFC after:	2 days
2012-04-14 23:53:31 +00:00
Michael Tuexen
4dca0ef478 Send always HBs when in PF state.
MFC after: 1 week
X-MFC with: r234296
2012-04-14 21:01:44 +00:00
Michael Tuexen
ca7567c923 Bugfix: Don't send HBs on path which are not idle.
MFC after: 1 week
2012-04-14 20:22:01 +00:00
Marius Strobl
83a8745245 Generate an obviously missing STOP when having finished transmitting data.
This fixes communication with PCF8563.
2012-04-14 17:27:34 +00:00
Marius Strobl
de98dfa146 Add support for the Atmel SAM9XE familiy of microcontrollers, which
consist of a ARM926EJ-S processor core with up to 512 Kbytes of on-chip
flash. Tested with SAM9XE512.

This file was missed in r234291.
2012-04-14 17:17:55 +00:00
Marius Strobl
0f0819525b Add support for the Atmel SAM9XE familiy of microcontrollers, which
consist of a ARM926EJ-S processor core with up to 512 Kbytes of on-chip
flash. Tested with SAM9XE512.
2012-04-14 17:09:38 +00:00
Luigi Rizzo
b1123b0137 i prefer this fix for the -Wformat warning (just one cast,
all the other variables are already correct for %x).
My previous attempt put the cast in the wrong place.
2012-04-14 16:44:18 +00:00
Bjoern A. Zeeb
c2d2534beb Fix LINT builds after r234233; not sure why modules need DEBUG by default. 2012-04-14 13:40:39 +00:00
Bjoern A. Zeeb
92083c91d2 Make compile on 64bit somehow for now after a first try at r234242 on
maybe 32bit?
2012-04-14 13:39:39 +00:00
Marius Strobl
bef0709e64 - Try to bring these files closer to style(9).
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
2012-04-14 11:29:32 +00:00
Marius Strobl
91849f349c Fix !DDB build after r234190. 2012-04-14 11:21:24 +00:00
Peter Grehan
b8a587074f Catch up with Bryan Venteicher's virtio git repo:
a8af6270bd96be6ccd86f70b60fa6512b710e4f0
      virtio_blk: Include function name in panic string

cbdb03a694b76c5253d7ae3a59b9995b9afbb67a
      virtio_balloon: Do the notify outside of the lock

      By the time we return from virtqueue_notify(), the descriptor
      will be in the used ring so we shouldn't have to sleep.

10ba392e60692529a5cbc1e9987e4064e0128447
      virtio: Use DEVMETHOD_END

80cbcc4d6552cac758be67f0c99c36f23ce62110
      virtqueue: Add support for VIRTIO_F_RING_EVENT_IDX

      This can be used to reduce the number of guest/host and
      host/guest interrupts by delaying the interrupt until a
      certain index value is reached.

      Actual use by the network driver will come along later.

8fc465969acc0c58477153e4c3530390db436c02
      virtqueue: Simplify virtqueue_nused()

      Since the values just wrap naturally at UINT16_MAX, we
      can just subtract the two values directly, rather than
      doing 2's complement math.

a8aa22f25959e2767d006cd621b69050e7ffb0ae
      virtio_blk: Remove debugging crud from 75dd732a

      There seems to be an issue with Qemu (or FreeBSD VirtIO) that sets
      the PCI register space for the device config to bogus values. This
      only seems to happen after unloading and reloading the module.

d404800661cb2a9769c033f8a50b2133934501aa
      virtio_blk: Use better variable name

75dd732a97743d96e7c63f7ced3c2169696dadd3
      virtio_blk: Partially revert 92ba40e65

      Just use the virtqueue to determine if any requests are
      still inflight.

06661ed66b7a9efaea240f99f414c368f1bbcdc7
      virtio_blk: error if allowed too few segments

      Should never happen unless the host provides use with a
      bogus seg_max value.

4b33e5085bc87a818433d7e664a0a2c8f56a1a89
      virtio_blk: Sort function declarations

426b9f5cac892c9c64cc7631966461514f7e08c6
      virtio_blk: Cleanup whitespace

617c23e12c61e3c2233d942db713c6b8ff0bd112
      virtio_blk: Call disk_err() on error'd completed requests

081a5712d4b2e0abf273be4d26affcf3870263a9
      virtio_blk: ASSERT the ready and inflight request queues are empty

a9be2631a4f770a84145c18ee03a3f103bed4ca8
      virtio_blk: Simplify check for too many segments

      At the cost of a small style violation.

e00ec09da014f2e60cc75542d0ab78898672d521
      virtio_blk: Add beginnings of suspend/resume

      Still not sure if we need to virtio_stop()/virtio_reinit()
      the device before/after a suspend.

      Don't start additional IO when marked as suspending.

47c71dc6ce8c238aa59ce8afd4bda5aa294bc884
      virtio_blk: Panic when dealt an unhandled BIO cmd

1055544f90fb8c0cc6a2395f5b6104039606aafe
      virtio_blk: Add VQ enqueue/dequeue wrappers

      Wrapper functions managed the added/removing to the in-flight
      list of requests.

      Normally biodone() any completed IO when draining the virtqueue.

92ba40e65b3bb5e4acb9300ece711f1ea8f3f7f4
      virtio_blk: Add in-flight list of requests

74f6d260e075443544522c0833dc2712dd93f49b
      virtio_blk: Rename VTBLK_FLAG_DETACHING to VTBLK_FLAG_DETACH

7aa549050f6fc6551c09c6362ed6b2a0728956ef
      virtio_blk: Finish all BIOs through vtblk_finish_bio()

      Also properly set bio_resid in the case of errors. Most geom_disk
      providers seem to do the same.

9eef6d0e6f7e5dd362f71ba097f2e2e4c3744882
      Added function to translate VirtIO status to error code

ef06adc337f31e1129d6d5f26de6d8d1be27bcd2
      Reset dumping flag when given unexpected parameters

393b3e390c644193a2e392220dcc6a6c50b212d9
      Added missing VTBLK_LOCK() in dump handler

Obtained from:	Bryan Venteicher  bryanv at daemoninthecloset dot org
2012-04-14 05:48:04 +00:00
Adrian Chadd
27ce86b8b6 Both linux ath9k and the reference driver initialises the PLL here
during chip wakeup.

Obtained from:	Linux ath9k, Atheros
2012-04-14 04:40:11 +00:00
Marius Strobl
34f4e555b5 Add a driver for the NXP (Philips) PCF8563 RTC.
Obtained from:	NetBSD (pcf8563reg.h)
2012-04-13 23:07:32 +00:00
Marius Strobl
7b88f42369 Merge from x86:
r233961:

Fix interrupt load balancing regression, introduced in revision
222813, that left all un-pinned interrupts assigned to CPU 0.
In intr_shuffle_irqs(), remove CPU_SETOF() call that initialized
the "intr_cpus" cpuset to only contain CPU0.

This initialization is too late and nullifies the results of calls
to the intr_add_cpu() that occur much earlier in the boot process.

r234074 (partial):

The BSP is not added to the mask of valid target CPUs for interrupts.
Fix this by adding the BSP as an interrupt target directly in

r234105:

Fix !SMP build after r234074.

MFC after: 3 days
2012-04-13 22:58:23 +00:00
Luigi Rizzo
ce2cb79269 fix build with -Wformat -Wmissing-prototypes 2012-04-13 22:24:57 +00:00
Adrian Chadd
c4b28bdc27 Flesh out the rest of the AP96 board/config. 2012-04-13 20:23:32 +00:00
Josh Paetzel
c4d87335a8 Update to version 2.3.1.0
Obtained from:	Daniel Braniss <danny@cs.huji.ac.il>
2012-04-13 18:21:56 +00:00
Adrian Chadd
d591b27dbc * Enable ATH_EEPROM_FIRMWARE, now that it's a compile time option
* Tidy up things a bit.
2012-04-13 18:01:53 +00:00
Adrian Chadd
79f57b35ce Upgrade ATH_EEPROM_FIRMWARE to a configuration option. 2012-04-13 18:00:48 +00:00
Luigi Rizzo
9b034c6f08 Properly disable crc stripping when operating in netmap mode.
Contrarily to what i wrote in my previous commit, the 82599
does include the CRC in the length. The operating mode is
reset in ixgbe_init_locked() and so we need to hook into
the places where the two registers (HLREG0 and RDRXCTL) are
modified.
2012-04-13 16:42:54 +00:00
Luigi Rizzo
ccdc3305e4 add the new memory allocator for netmap, which allocates memory
in small clusters instead of one big contiguous chunk.
This was already enabled in the previous commit.
2012-04-13 16:32:33 +00:00
Luigi Rizzo
d76bf4ff7b A bit of cleanup in the names of fields of netmap-related structures.
Use the name 'ring' instead of 'queue' in all fields.
Bump NETMAP_API.
2012-04-13 16:03:07 +00:00
Luigi Rizzo
82d2fe1069 do not use a deprecated field in a structure. 2012-04-13 15:33:12 +00:00
Adrian Chadd
2c61ba4db2 These are uboot, so mark them as such or booting from flash will not work. 2012-04-13 08:56:23 +00:00
Adrian Chadd
3a8a3eebfd Introduce configuration files for AP94 and AP96.
This uses the new firmware(9) method for squirreling away the EEPROM
contents from SPI flash so ath(4) can get to them later.

It won't work out of the box just yet - you have to add this to
if_ath_pci.c:

#define ATH_EEPROM_FIRMWARE

.. until I've added it as a configuration option and updated things.
2012-04-13 08:52:25 +00:00
Adrian Chadd
0f60da6fb4 Introduce the ability to grab local EEPROM data from the firmware(9)
interface.

* Introduce a device hint, 'eeprom_firmware', which is the name of firmware
  to lookup.
* If the lookup succeeds, take a copy of it and use it as the eeprom data.

This isn't enabled by default - you have to define ATH_EEPROM_FIRMWARE.
I'll add it to the configuration variables in a later commit.

TODO:

* just keep a firmware reference in ath_softc, and remove the need to
  waste the extra memory in having sc_eepromdata be a malloc()ed block.
2012-04-13 08:48:38 +00:00
Adrian Chadd
8f7015e205 (ab)Use the firmware API to store away EEPROM calibration data for
future use by the ath(4) driver.

These embedded devices put the calibration/PCI bootstrap data on the
on board SPI flash rather than on an EEPROM connected to the NIC.
For some boards, there's two NICs and two sets of EEPROM data in the
main SPI flash.

The particulars:

* Introduce ath_fixup_size, which is the size of the EEPROM area in
  bytes.
* Create a firmware image with a name based on the PCI device identifier
  (bus/slot/device/function).
* Hide some verbose debugging behind 'bootverbose'.

ath(4) can then use this to load in the EEPROM data.

This requires AR71XX_ATH_EEPROM to be defined.
2012-04-13 08:45:50 +00:00
Andriy Gapon
56c2dc796b add actual interrupt counters to back ipi_invlcache_counts
Otherwise one could run into a panic with COUNT_IPIS when cache
invalidation actually happened.

Reviewed by:	jhb
MFC after:	1 week
2012-04-13 07:18:19 +00:00
Andriy Gapon
f84633cdcc bump INTRCNT_COUNT values to reflect actual numbers of IPI counters
Maybe the numbers should be conditionalized on COUNT_IPIS

Reviewed by:	jhb
MFC after:	1 week
2012-04-13 07:15:40 +00:00
Adrian Chadd
8a138d80d0 Remove an unused variable. Grr. 2012-04-13 06:13:37 +00:00
Adrian Chadd
be94a28e2a Sync this code against what's in OpenWRT trunk.
* the openwrt code doesn't treat 0/0/0 any differently
  from other bus/slot/func combinations.
* A "local write" function writes to the LCONF area, and
  so I've added it.
* The PCI workaround at attach time uses this LCONF code,
  which it already did ..
* .. but it is a 4 byte write, not a 2 byte write.
  Even though it's PCIR_COMMAND which is a two byte PCI register.

Tested on:	AR7161
TODO:		The other two AR71xx derivatives
TODO:		More thoroughly stare at the datasheets I do have
		and if it indeed is incorrect, push fixes to both
		FreeBSD and Linux/OpenWRT.

Obtained from:	Linux OpenWRT
2012-04-13 06:11:24 +00:00
Jaakko Heinonen
295a542d96 Apply changes from r234103 to ext2fs:
Return EPERM from ext2_setattr() when an user without PRIV_VFS_SYSFLAGS
privilege attempts to toggle SF_SETTABLE flags.

Flags are now stored to ip->i_flags in one place after all checks.

Also, remove SF_NOUNLINK from the checks because ext2fs doesn't support
that flag.

Reviewed by:	bde
2012-04-13 05:48:31 +00:00
Adrian Chadd
676c1784cb Use strdup() on the name (and free it when it's done) so non-static names
can be used in firmware_register().
2012-04-13 04:22:42 +00:00
John Baldwin
28926c5766 Update the ddb and gdb backends for the new 'trace_thread' hook.
It is implemented via db_trace_thread() for DDB and not implemented
for GDB.  This should have been part of r234190.

Pointy hat to:	jhb
Reported by:	jkim
MFC after:	1 week
2012-04-12 21:34:58 +00:00
Peter Grehan
332cda07c0 Complete polled-mode operation by using a callout if the device will be
used in polled-mode. The callout invokes uart_intr, which rearms the timeout.
Implemented for bhyve, but generically useful for e.g. embedded bringup
when the interrupt controller hasn't been setup, or if it's not deemed
worthy to wire an interrupt line from a serial port.

Submitted by:	neel
Reviewed by:	marcel
Obtained from:	NetApp
MFC after:	3 weeks
2012-04-12 18:46:48 +00:00
John Baldwin
0cc457b000 - Extend the KDB interface to add a per-debugger callback to print a
backtrace for an arbitrary thread (rather than the calling thread).
  A kdb_backtrace_thread() wrapper function uses the configured debugger
  if possible, otherwise it falls back to using stack(9) if that is
  available.
- Replace a direct call to db_trace_thread() in propagate_priority()
  with a call to kdb_backtrace_thread() instead.

MFC after:	1 week
2012-04-12 17:43:59 +00:00
John Baldwin
7582954e34 If a linker file contains at least one module, but all of the modules
fail to load (the MOD_LOAD event fails) during a kldload(2), unload the
linker file and fail the kldload(2) with ENOEXEC.

Reported by:	gcooper
MFC after:	1 week
2012-04-12 14:49:25 +00:00
Luigi Rizzo
4f609083e5 Apparently the length field in advanced descriptors
does not include the CRC irrespective of the setting
of CRCSTRIP. The 82599 data sheets (sec. 7.1.6) say differently.
Very strange. Need to check what happens on legacy descriptors,
but for the time being this restores functionality.
2012-04-12 14:06:05 +00:00
John Baldwin
ed5a2b61fd Add OFED and the associated options and drivers to x86 LINT builds:
- Mark 'sdp' as requiring 'inet'.
- Always include "opt_inet.h" and "opt_inet6.h" and modify the IB
  driver Makefiles to honor WITH/WITHOUT_INET/INET6/_SUPPORT options
  to determine what should be enabled during a module build.
- Fix the mlxen(4) driver and the core IB code to compile without
  if INET is disabled (including when both INET and INET6 are disabled).

Reviewed by:	bz
MFC after:	2 weeks
2012-04-12 14:01:06 +00:00
John Baldwin
6937a7ac6b Don't update if_obytes when transmitting packets. That is already done
in IFQ_HANDOFF() when the packet is passed to the start routine, so doing
it here resulted in double counting.

Reported by:	Alex Tutubalin  lexa lexa ru
MFC after:	1 week
2012-04-12 13:53:49 +00:00
Edward Tomasz Napierala
c32b19833b Refactor da(4) to remove one of two code paths used to query capacity
data.

Reviewed by:	ken, mav (earlier version)
Sponsored by:	The FreeBSD Foundation
2012-04-12 12:58:14 +00:00
Andrey V. Elsukov
9dd5659756 Read backup GPT header from the last LBA only when primary GPT header and
table aren't valid. If they are ok, use hdr_lba_alt value to read backup
header. This will make gptboot happy when GPT used atop of some GEOM
provider, e.g. GEOM_MIRROR.

Reviewed by:	pjd
MFC after:	2 weeks
2012-04-12 12:37:53 +00:00
Luigi Rizzo
3c0caf6ce6 Some code restructuring to bring the memory allocator out of netmap.c
and make it easier to replace it with a different implementation.
On passing, also fix indentation.

NOTE: I know that #include "foo.c" is ugly, but the alternative
(add another entry to sys/conf/files, add a separate header with
structs and prototypes, and expose functions that are meant to
be private) looks even worse to me.
We need a more modular way to specify dependencies and build options.
2012-04-12 11:27:09 +00:00
Konstantin Belousov
2dd9ea6f70 Add thread-private flag to indicate that error value is already placed
in td_errno. Flag is supposed to be used by syscalls returning
EJUSTRETURN because errno was already placed into the usermode frame
by a call to set_syscall_retval(9). Both ktrace and dtrace get errno
value from td_errno if the flag is set.

Use the flag to fix sigsuspend(2) error return ktrace records.

Requested by:	bde
MFC after:	1 week
2012-04-12 10:48:43 +00:00
Luigi Rizzo
d5d42003f4 remove an unnecessary #define 2012-04-12 10:32:34 +00:00
Luigi Rizzo
13b9940fdc use correct selinfo pointer for the generic interrupt handler
(it is never used in current FreeBSD drivers).
2012-04-12 08:54:01 +00:00
Andrew Thompson
b517176ad9 Set the proto to LAGG_PROTO_NONE before calling the detach routine so packets
are discarded, this is an issue because lacp drops the lock which may allow
network threads to access freed memory. Expand the lock coverage so the
detach/attach happen atomically.

Submitted by:	Andrew Boyer (earlier version)
2012-04-12 01:07:17 +00:00
Kirk McKusick
ecb6e528c5 Export vinactive() from kern/vfs_subr.c (e.g., make it no longer
static and declare its prototype in sys/vnode.h) so that it can be
called from process_deferred_inactive() (in ufs/ffs/ffs_snapshot.c)
instead of the body of vinactive() being cut and pasted into
process_deferred_inactive().

Reviewed by: kib
MFC after:   2 weeks
2012-04-11 23:01:11 +00:00
Kirk McKusick
e8f8ad7266 Whitespace cleanup. 2012-04-11 22:43:40 +00:00
Nathan Whitehorn
e3c2930d36 We don't need kcopy() in any of the remaining places it is used, so
remove it.

MFC after:	2 weeks
2012-04-11 22:23:50 +00:00
Nathan Whitehorn
b6aeb1ab97 Only manipulate the PGA_EXECUTABLE flag on managed pages. This is a proxy
for whether the page is physical. On dense phys mem systems (32-bit),
VM_PHYS_TO_PAGE will not return NULL for device memory pages if device
memory is above physical memory even if there is no allocated vm_page.
Attempting to use the returned page could then cause either memory
corruption or a page fault.
2012-04-11 21:56:55 +00:00
John Baldwin
8546e82467 Reapply r223198 which was reverted in the previous vendor import. Some
portions were already reapplied in r233708:
- Use a dedicated task to handle deferred transmits from the if_transmit
  method instead of reusing the existing per-queue interrupt task.
  Reusing the per-queue interrupt task could result in both an interrupt
  thread and the taskqueue thread trying to handle received packets on a
  single queue resulting in out-of-order packet processing.
- Call ether_ifdetach() earlier in igb_detach().
- Drain tasks and free taskqueues during igb_detach().

MFC after:	1 week
2012-04-11 21:33:45 +00:00
John Baldwin
45b516f642 Trim stray blank line. 2012-04-11 21:00:33 +00:00
John Baldwin
77b479e644 Allow device_busy() and device_unbusy() to be invoked while a device is
being attached.  This is implemented by adding a new DS_ATTACHING state
while a device's DEVICE_ATTACH() method is being invoked.  A driver is
required to not fail an attach of a busy device.  The device's state will
be promoted to DS_BUSY rather than DS_ACTIVE() if the device was marked
busy during DEVICE_ATTACH().

Reviewed by:	kib
MFC after:	1 week
2012-04-11 20:57:41 +00:00
Nathan Whitehorn
805bee55eb Fix error in r233949. Synchronizing icaches on uncacheable pages turns out
not to be a good idea, and of course the PV entry list for a page is never
empty after the page has been mapped.
2012-04-11 20:28:05 +00:00
Luigi Rizzo
c85cb1a0a2 A couple of changes related to ixgbe operation in netmap mode:
- add a sysctl, dev.netmap.ix_crcstrip, to control whether ixgbe should
  strip the CRC on received frames. Defaults to 0, which keeps the CRC.
  and improves performance when receiving min-sized (64-byte) frames.
  This matters because  min-sized frames is one of the standard
  benchmarks for switches and routers, some chipsets seem to issue
  read-modify-write cycles for PCIe transactions that are not a
  full cache line, and a min-sized frame triggers the bug, resulting
  in reduced throughput -- 9.7 instead of 14.88 Mpps -- and heavy
  bus load.

- for the time being, always look for incoming packets on a select/poll
  even if there has not been an interrupt in the meantime. This is
  only a temporary workaround for a probable race condition in keeping
  track of rx interrupts.
  Add a couple of diagnostic vars to help studying the problem.
2012-04-11 16:11:08 +00:00
Jaakko Heinonen
e6b8bdf252 Restore the blank line incorrectly removed in r234104.
Pointed out by:	bde
2012-04-11 15:48:50 +00:00
Luigi Rizzo
aa15c59eb1 Enable prefetching of descriptors on the TX ring, using the same
values as in the Intel driver 3.8.21 for linux.  The fact that it
is standard in the above driver suggests that it has no bad side
effects.

But of course there must be a reason for enabling features, not
just "it does not harm", so here it is a good one:

Prefetching enables full line rate even using a single queue (14.88
Mpps, compared to ~12 Mpps without prefetch).  This in turn is
terribly useful when one wants to schedule traffic.

For obvious reasons the difference is only visible with netmap
or other high speed solutions, but presumably the advantage
should be in the order of a fraction of a microsecond when
starting transmission on an empty queue.

Discussed with Jack Vogel.

MFC after:	1 week
2012-04-11 15:02:14 +00:00
Eitan Adler
847d0034e3 Return EBADF instead of EMFILE from dup2 when the second argument is
outside the range of valid file descriptors

PR:		kern/164970
Submitted by:	Peter Jeremy <peterjeremy@acm.org>
Reviewed by:	jilles
Approved by:	cperciva
MFC after:	1 week
2012-04-11 14:08:09 +00:00
Gleb Smirnoff
a9a2c40ced It is a logical error that in carp_multicast_cleanup()
we look at count of addresses on a particular vhid, we
should account number of addresses on cif.

To achieve this we need to run carp_attach() and
carp_detach() under appropriate cif lock.
2012-04-11 12:26:30 +00:00
Pyun YongHyeon
7e0fa14052 Back out r228476.
r228476 fixed superfluous link UP/DOWN messages but broke IPMI
access during boot.  It's not clear why r228476 breaks IPMI and
should be revisited.

Reported by:	Paul Guyot <paulguyot <> ieee dot org >
MFC after:	1 week
2012-04-11 06:34:25 +00:00
Marcel Moolenaar
2059ee3cc0 uart_cpu_amd64.c and uart_cpu_i386.c (under sys/dev/uart) are
identical now that the bus spaces are unified under sys/x86.
Replace them with a single uart_cpu_x86.c.
o   delete uart_cpu_i386.c
o   move uart_cpu_amd64.c to uart_cpu_x86.c
o   update files.amd64 and files.i386 accordingly.
2012-04-11 02:42:01 +00:00
Adrian Chadd
53e98d5a48 Fix the default, non-superg compile.
Pointy-hat-to:	adrian
2012-04-11 02:34:32 +00:00
Nathan Whitehorn
88fe385600 Do not restore the register holding the TLS pointer when doing various
usermode context switches (long jumps and ucontext operations). If these
are used across threads, multiple threads can end up with the same TLS base.
Madness will then result.

This makes behavior on PPC match that on x86 systems and on Linux.

MFC after:	10 days
2012-04-11 00:00:40 +00:00
Adrian Chadd
43faa6b266 Fix compilation with IEEE80211_ENABLE_SUPERG defined.
PR:		kern/164951
2012-04-10 19:47:44 +00:00
Adrian Chadd
f8ab7a9fc9 Convert the flags over to a set of bit flags. 2012-04-10 19:25:43 +00:00
Jim Harris
cb77f0da67 Queue CCBs internally instead of using CAM_REQUEUE_REQ status. This fixes
problem where userspace apps such as smartctl fail due to CAM_REQUEUE_REQ
status getting returned when tagged commands are outstanding when smartctl
sends its I/O using the pass(4) interface.

Sponsored by: Intel
Found and tested by: Ravi Pokala <rpokala at panasas dot com>
Reviewed by: scottl
Approved by: scottl
MFC after: 1 week
2012-04-10 16:33:19 +00:00
Marius Strobl
0fec3e2d81 Fix !SMP build after r234074.
Reviewed by:	attilio, jhb
2012-04-10 16:08:46 +00:00
Jaakko Heinonen
034efc61ba Apply changes from r233787 to ext2fs:
- Use more natural ip->i_flags instead of vap->va_flags in the final
  flags check.
- Style improvements.

No functional change intended.

MFC after:	2 weeks
2012-04-10 16:05:52 +00:00
Jaakko Heinonen
fce74feae1 - Return EPERM from ufs_setattr() when an user without PRIV_VFS_SYSFLAGS
privilege attempts to toggle SF_SETTABLE flags.
- Use the '^' operator in the SF_SNAPSHOT anti-toggling check.

Flags are now stored to ip->i_flags in one place after all checks.

Submitted by:	bde
2012-04-10 15:59:37 +00:00
John Baldwin
efbebba22a Properly parse 40G media types from newer Mellanox adapters that are
40G capable.  For now, map all 40G links to 40GBase-CR4.

MFC after:	2 weeks
2012-04-10 14:01:09 +00:00
John Baldwin
19a3210a66 Add media types for 40G media that might be used with FreeBSD.
Reviewed by:	bz
MFC after:	2 weeks
2012-04-10 13:59:35 +00:00
Adrian Chadd
41b6b5074c Blank the aggregate stats whenever the zero ioctl is called. 2012-04-10 07:27:42 +00:00
Adrian Chadd
9467e3f3fc Squirrel away SYNC interrupt debugging if it's enabled in the HAL.
Bus errors will show up as various SYNC interrupts which will be passed
back up to ath_intr().
2012-04-10 07:23:37 +00:00
Adrian Chadd
eddd7521f1 Revert this for now - it may work for -8 and -9 and -HEAD, but not
"-HEAD driver + net80211 on -9 kernel."

I'll figure this out at some later stage.
2012-04-10 07:16:28 +00:00
Adrian Chadd
b779c10b12 Squirrel away the SYNC interrupt in case we're doing interrupt debugging. 2012-04-10 07:11:33 +00:00
Gleb Smirnoff
90b357f6ec M_DONTWAIT is a flag from historical mbuf(9)
allocator, not malloc(9) or uma(9) flag.
2012-04-10 06:52:39 +00:00
Gleb Smirnoff
e275c0d349 M_DONTWAIT is a flag from historical mbuf(9)
allocator, not malloc(9) or uma(9) flag.
2012-04-10 06:52:21 +00:00
Adrian Chadd
fdd72b4a32 * Since the API changed along the -CURRENT path (december 2011),
add a FreeBSD_version check.  It should work fine for compiling
  on -HEAD, 9.x and 8.x.

* Conditionally compile the 11n options only when 11n is enabled.

The above changes allow the ath(4) driver to compile and run on
8.1-RELEASE (Hi old PC-BSD!) but with the 11n stuff disabled.

I've done a test against the net80211 and tools in 8.1-RELEASE.
The NIC used in testing is the AR2427 in an EEEPC.

Just to be clear - this change is to allow the -HEAD ath/hal/rate
code to run on 9.x _and_ 8.x with no source changes. However,
when running on earlier kernels, it should only be used for legacy
mode. (Don't define ATH_ENABLE_11N.)
2012-04-10 06:25:11 +00:00
Gleb Smirnoff
9ca1fe0db8 CARP should be capable to run on if_bridge(4). Unfortunately,
this commit is not enough to enable CARP operation on
if_bridge(4), because the latter doesn't handle or even
initialize its ifp->if_link_state.

Reported by:	Alexander Lunev <sol289 gmail.com>
2012-04-10 05:42:48 +00:00
Attilio Rao
79257559ee BSP is not added to the mask of valid target CPUs for interrupts
in set_apic_interrupt_ids(). Besides, set_apic_interrupts_ids() is not
called in the !SMP case too.
Fix this by:
- Adding the BSP as an interrupt target directly in cpu_startup().
- Remove an obsolete optimization where the BSP are skipped in
  set_apic_interrupt_ids().

Reported by:	jh
Reviewed by:	jhb
MFC after:	3 days
X-MFC:		r233961
Pointy hat to:	me
2012-04-09 22:41:19 +00:00
Jilles Tjoelker
8a8be77610 Remove unused and wrong SA_PROC internal signal property.
The SA_PROC signal property indicated whether each signal number is directed
at a specific thread or at the process in general. However, that depends on
how the signal was generated and not on the signal number. SA_PROC was not
used.
2012-04-09 21:58:58 +00:00
Alexander Motin
70801abe8f Microoptimize cpu_search().
According to profiling, it makes one take 6% of CPU time on hackbench
with its million of context switches per second, instead of 8% before.
2012-04-09 18:24:58 +00:00
Attilio Rao
a0f2c37b6f - Introduce a cache-miss optimization for consistency with other
accesses of the cache member of vm_object objects.
- Use novel vm_page_is_cached() for checks outside of the vm subsystem.

Reviewed by:	alc
MFC after:	2 weeks
X-MFC:		r234039
2012-04-09 17:05:18 +00:00
John Baldwin
bcd6068179 Recognize the RDRAND instruction feature.
Submitted by:	Michael Fuckner  michael fuckner net
MFC after:	3 days
2012-04-09 15:20:16 +00:00
Andriy Gapon
2885c2e4bb intpm: return only SMB bus error codes from SMB methods
PR:		kern/25733
MFC after:	5 days
2012-04-08 20:48:39 +00:00
Andriy Gapon
32048b2bff intpm: reflect the fact that SB800 and later AMD chipsets are not supported
They do not have compatible configuration registers in PCI configuration
space.  Instead their configuration resides in AMD "PM I/O" space
(accessed via a pair of I/O space registers).

MFC after:	5 days
2012-04-08 19:58:38 +00:00
Alan Cox
1c8279e4e7 Fix mincore(2) so that it reports PG_CACHED pages as resident.
MFC after:	2 weeks
2012-04-08 18:25:12 +00:00
Alan Cox
908e3da10e If a page belonging a reservation is cached, then mark the reservation so
that it will be freed to the cache pool rather than the default pool.
Otherwise, the cached pages within the reservation may be recycled sooner
than necessary.

Reported by:	Andrey Zonov
2012-04-08 17:00:46 +00:00
Edward Tomasz Napierala
2b028c25d3 Fix panic in ffs_reload(), which may happen when read-only filesystem
gets resized and then reloaded.

Reviewed by:	kib, mckusick (earlier version)
Sponsored by:	The FreeBSD Foundation
2012-04-08 13:44:55 +00:00
Robert Watson
b4ef8be228 When allocation of labels on files is implicitly disabled due to MAC
policy configuration, avoid leaking resources following failed calls
to get and set MAC labels by file descriptor.

Reported by:	Mateusz Guzik <mjguzik at gmail.com> + clang scan-build
MFC after:	3 days
2012-04-08 11:01:49 +00:00
Kirk McKusick
85121b0979 Expand locking around identification of filesystem mount point when
accounting for I/O counts at completion of I/O operation. Also switch
from using global devmtx to vnode mutex to reduce contention.

Suggested and reviewed by: kib
2012-04-08 06:20:21 +00:00
Kirk McKusick
827e334c01 Add I/O accounting to msdos filesystem.
Suggested and reviewed by: kib
2012-04-08 06:18:18 +00:00
Kirk McKusick
b73ffa31d4 Drop an unnecessary setting of si_mountpt when updating a UFS mount point.
Clearly it must have been set when the mount was done.

Reviewed by: kib
2012-04-08 06:14:49 +00:00
Adrian Chadd
fcacf9318c Add some statistics to track BAR TX. 2012-04-08 04:51:25 +00:00
Adrian Chadd
b43facbff3 After reviewing the mcast/sleep station code a little, undo some brain
damage which I committed when I had less clue about such things.

Don't ever put normal data frames on the mcast software queue.
Just put mcast frames there if needed.

Pass the txq decision into ath_tx_normal_setup(), as we've already made
the decision.  Don't re-do it.

Whilst i'm here, add another random debugging statement.
2012-04-08 00:40:16 +00:00
Stanislav Sedov
b30d62be27 - Revert part of r234005, which I did not intend to commit.
Sorry! :(
2012-04-07 23:51:16 +00:00
Stanislav Sedov
47db1ea0bf - Add kernel config file for QEMU-emulated gumstix board. 2012-04-07 23:48:51 +00:00
Stanislav Sedov
208cf1fbc3 - Add new ARM kernel option QEMU_WORKAROUNDS which can be
used in the code which needs to implement some specific
  behaviour when being run under QEMU.
- Make PXA UART probe code to work under QEMU gumstix, which
  doesn't emulate all the ports properly.
2012-04-07 23:47:08 +00:00
Gleb Kurtsou
9295c62814 tmpfs supports only INT_MAX nodes due to limitations of unit number
allocator.

Replace UINT32_MAX checks with INT_MAX. Keeping more than 2^31 nodes in
memory is not likely to become possible in foreseeable feature and would
require new unit number allocator.

Discussed with: delphij
MFC after:	2 weeks
2012-04-07 15:30:46 +00:00
Gleb Kurtsou
0ff93c48da Add vfs_getopt_size. Support human readable file system options in tmpfs.
Increase maximum tmpfs file system size to 4GB*PAGE_SIZE on 32 bit archs.

Discussed with:	delphij
MFC after:	2 weeks
2012-04-07 15:27:34 +00:00
Gleb Kurtsou
da7aa2778e Add reserved memory limit sysctl to tmpfs.
Cleanup availble and used memory functions.
Check if free pages available before allocating new node.

Discussed with:	delphij
2012-04-07 15:23:51 +00:00
Stanislav Sedov
8a31831551 - Do not reinitialize the card if it is already running.
This fixes bootp on if_smc, as bootp code perform SIOCSIFADDR
  ioctl call immediately after sending the request (which causes
  if_init being called) which causes the adapter to drop all the
  packets received in the meantime.
2012-04-07 06:56:38 +00:00
Adrian Chadd
4d7f883711 Do a dma sync before the descriptors are chained together.
I need to find a better place to do this..
2012-04-07 05:51:43 +00:00
Adrian Chadd
e2e4a2c2a1 Break out the legacy duration and protection code into routines,
call these after rate control selection is done.

The duration/protection code wasn't working - it expected the rix to
be valid.  Unfortunately after I moved the rate control selection into
late in the process, the rix value isn't valid and thus the protection/
duration code would get things wrong.

HT frames are now correctly protected with an RTS and for the AR5416,
this involves having the aggregate frames be limited to 8K.

TODO:

* Fix up the DMA sync to occur just before the frame is queued to the
  hardware.  I'm adjusting the duration here but not doing the DMA
  flush.

* Doubly/triply ensure that the aggregate frames are being limited to
  the correct size, or the AR5416 will get unhappy when TXing RTS-protected
  aggregates.
2012-04-07 05:48:26 +00:00
Adrian Chadd
781e7eaffd As I thought, this is a bad idea. When forming aggregates, the RTS/CTS
stuff and rate control lookup is only done on the first frame.
2012-04-07 05:46:00 +00:00
Adrian Chadd
045bc7882e Enforce the RTS aggregation limit if RTS/CTS protection is enabled;
if any subframes in an aggregate have different protection from the
first frame in the formed aggregate, don't add that frame to the
aggregate.

This is likely a suboptimal method (I think we'll mostly be OK marking
frames that have seqno's with the same protection as normal data frames)
but I'll just be cautious for now.
2012-04-07 03:22:11 +00:00
Adrian Chadd
ce656facf3 Store away the RTS aggregate limit from the HAL.
This will be used by some upcoming code to ensure that aggregates
are enforced to be a certain size.  The AR5416 has a limitation on
RTS protected aggregates (8KiB).
2012-04-07 02:51:53 +00:00
Adrian Chadd
875a9451d9 Remove duplicate txflags field from ath_buf.
rename bf_state.bfs_flags to bf_state.bfs_txflags, as that is what
it effectively is.
2012-04-07 02:01:26 +00:00
Nathan Whitehorn
b7d0d1fabf Execute an initial ptesync if and only if the PTE is actually being
invalidated, as opposed to a ref/changed bit update.
2012-04-06 22:33:13 +00:00
Kenneth D. Merry
bf8f8f340e Change the SCSI INQUIRY peripheral qualifier that CTL reports for LUNs
that don't exist.

Anecdotal evidence indicates that it is better to return 011b (bad LUN)
than 001b (LUN offline).  However, this change also gives the user a
sysctl/tunable, kern.cam.ctl.inquiry_pq_no_lun, to override the change
and return to the previous behavior.  (The previous behavior was to
return 001b, or LUN offline.)

ctl.c:		Change the default inquiry peripheral qualifier to 011b,
		and add a sysctl and tunable to allow the user to change
		it back to 001b if needed.

		Don't insert a Copan copyright statement in the inquiry
		data.  The copyright statements on the files are
		sufficient.

ctl_private.h:	Add sysctl variable context to the CTL softc.

ctl_cmd_table.c,
ctl_frontend_internal.c,
ctl_frontend.c,
ctl_backend.c,
ctl_error.c:	Include sys/sysctl.h.

MFC after:	3 days
2012-04-06 22:23:13 +00:00
Justin T. Gibbs
47c77b2265 Fix interrupt load balancing regression, introduced in revision
222813, that left all un-pinned interrupts assigned to CPU 0.

sys/x86/x86/intr_machdep.c:
	In intr_shuffle_irqs(), remove CPU_SETOF() call that initialized
	the "intr_cpus" cpuset to only contain CPU0.

	This initialization is too late and nullifies the results of calls
	the intr_add_cpu() that occur much earlier in the boot process.
	Since "intr_cpus" is statically initialized to the empty set, and
	all processors, including the BSP, already add themselves to
	"intr_cpus" no special initialization for the BSP is necessary.

MFC after:	3 days
2012-04-06 21:19:28 +00:00
Attilio Rao
d1aa86e151 Staticize vm_page_cache_remove().
Reviewed by:	alc
2012-04-06 20:34:00 +00:00
Nathan Whitehorn
348bc07000 Substantially reduce the scope of the locks held in pmap_enter(), which
improves concurrency slightly.
2012-04-06 18:18:48 +00:00
Alan Cox
3e4c7bf65a Micro-optimize free_pv_entry() for the expected case. 2012-04-06 16:41:19 +00:00
Nathan Whitehorn
57bd5cce62 Reduce the frequency that the PowerPC/AIM pmaps invalidate instruction
caches, by invalidating kernel icaches only when needed and not flushing
user caches for shared pages.

Suggested by:	kib
MFC after:	2 weeks
2012-04-06 16:03:38 +00:00
Nathan Whitehorn
629e40e45e Give the kernel pmap lock a different name than user pmap locks. It has
(slightly) different semantics and renaming it prevents a (harmless)
WITNESS warning during bootup for 32-bit kernels on 64-bit CPUs.

MFC after:	5 days
2012-04-06 16:00:37 +00:00
Alexander V. Chernikov
9431cc1696 Fix build broken by r233938.
Pointed by:     David Wolfskill <david@catwhisker.org>
Approved by:    kib (mentor)
Pointy hat to:  melifaro
2012-04-06 13:34:19 +00:00
Andriy Gapon
bec9e056eb retrofit Safe Mode loader menu item actions
The menu item is now made completely independent with the ACPI item - most
modern systems seem to require ACPI and become even more "unsafe"
without it.
Safe Mode no longer disables APIC for the same reason.
kbdmux is not disabled as this feature has proven itself stable.

New actions:
- SMP is disabled in the Safe Mode now
- eventtimers are forced to periodic mode (some real and virtual systems
  seem to have problems otherwise)
- geom extra vigorous integrity checking is disabled, this is to
  facilitate migration from previous versions

Possible short term to do:
- make SMP switch a separate menu item
- restore APIC switch as a separate menu item

Longer term to do:
- turn various tweaks into separate menu items in a Safe Mode sub-menu

Please consider adding a safety tweak to Safe Mode when introducing
new major features or changes that may cause instabilities.

Discussed with:	jhb, scottl, Devin Teske
MFC after:	3 weeks (stable/9 only)
2012-04-06 09:36:22 +00:00
Michael Tuexen
17b611fb21 Remove duplicate condition in if statement.
Obtained from: brucec@
MFC after: 3 days
2012-04-06 09:03:02 +00:00
Sergey Kandaurov
e84459d04b Free ballooned pages with the corresponding malloc type.
MFC after:	1 week
2012-04-06 08:13:29 +00:00
Alexander V. Chernikov
51ec1eb70d - Improve performace for writer-only BPF users.
Linux and Solaris (at least OpenSolaris) has PF_PACKET socket families to send
raw ethernet frames. The only FreeBSD interface that can be used to send raw frames
is BPF. As a result, many programs like cdpd, lldpd, various dhcp stuff uses
BPF only to send data. This leads us to the situation when software like cdpd,
being run on high-traffic-volume interface significantly reduces overall performance
since we have to acquire additional locks for every packet.

Here we add sysctl that changes BPF behavior in the following way:
If program came and opens BPF socket without explicitly specifyin read filter we
assume it to be write-only and add it to special writer-only per-interface list.
This makes bpf_peers_present() return 0, so no additional overhead is introduced.
After filter is supplied, descriptor is added to original per-interface list permitting
packets to be captured.

Unfortunately, pcap_open_live() sets catch-all filter itself for the purpose of
setting snap length.

Fortunately, most programs explicitly sets (event catch-all) filter after that.
tcpdump(1) is a good example.

So a bit hackis approach is taken: we upgrade description only after second
BIOCSETF is received.

Sysctl is named net.bpf.optimize_writers and is turned off by default.

- While here, document all sysctl variables in bpf.4

Sponsored by Yandex LLC

Reviewed by:    glebius (previous version)
Reviewed by:    silence on -net@
Approved by:    (mentor)

MFC after:      4 weeks
2012-04-06 06:55:21 +00:00
Alexander V. Chernikov
e4b3229aa5 - Improve BPF locking model.
Interface locks and descriptor locks are converted from mutex(9) to rwlock(9).
This greately improves performance: in most common case we need to acquire 1
reader lock instead of 2 mutexes.

- Remove filter(descriptor) (reader) lock in bpf_mtap[2]
This was suggested by glebius@. We protect filter by requesting interface
writer lock on filter change.

- Cover struct bpf_if under BPF_INTERNAL define. This permits including bpf.h
without including rwlock stuff. However, this is is temporary solution,
struct bpf_if should be made opaque for any external caller.

Found by:       Dmitrij Tejblum <tejblum@yandex-team.ru>
Sponsored by:   Yandex LLC

Reviewed by:    glebius (previous version)
Reviewed by:    silence on -net@
Approved by:    (mentor)

MFC after:      3 weeks
2012-04-06 06:53:58 +00:00
John Baldwin
35818d2e94 Add new ktrace records for the start and end of VM faults. This gives
a pair of records similar to syscall entry and return that a user can
use to determine how long page faults take.  The new ktrace records are
enabled via the 'p' trace type, and are enabled in the default set of
trace points.

Reviewed by:	kib
MFC after:	2 weeks
2012-04-05 17:13:14 +00:00
Andriy Gapon
70542ee01f zfs_ioctl: no need for ddi_copyin/out here because sys_ioctl handles that
On FreeBSD the direct ioctl argument is automatically copied in/out
as necesary by the kernel ioctl entry point.

PR:		kern/164445
Submitted by:	Luis Garces-Erice <lge@ieee.org>
Tested by:	Attila Nagy <bra@fsn.hu>
MFC after:	5 days
2012-04-05 07:59:59 +00:00
Andrey V. Elsukov
59894e4a44 Fix VIMAGE build. 2012-04-05 04:41:06 +00:00
David Xu
8931e524bf In sem_post, the field _has_waiters is no longer used, because some
application destroys semaphore after sem_wait returns. Just enter
kernel to wake up sleeping threads, only update _has_waiters if
it is safe. While here, check if the value exceed SEM_VALUE_MAX and
return EOVERFLOW if this is true.
2012-04-05 03:05:02 +00:00
David Xu
17ce606321 umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily.  On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
Adrian Chadd
88b3d48316 Implement BAR TX.
A BAR frame must be transmitted when an frame in an A-MPDU session fails
to transmit - it's retried too often, or it can't be cloned for
re-transmission.  The BAR frame tells the remote side to advance the
left edge of the block-ack window (BAW) to a new value.

In order to do this:

* TX for that particular node/TID must be paused;
* The existing frames in the hardware queue needs to be completed, whether
  they're TXed successfully or otherwise;
* The new left edge of the BAW is then communicated to the remote side
  via a BAR frame;
* Once the BAR frame has been sucessfully TXed, aggregation can resume;
* If the BAR frame can't be successfully TXed, the aggregation session
  is torn down.

This is a first pass that implements the above.  What needs to be done/
tested:

* What happens during say, a channel reset / stuck beacon _and_ BAR
  TX.  It _should_ be correctly buffered and retried once the
  reset has completed.  But if a bgscan occurs (and they shouldn't,
  grr) the BAR frame will be forcibly failed and the aggregation session
  will be torn down.

  Yes, another reason to disable bgscan until I've figured this out.

* There's way too much locking going on here.  I'm going to do a couple
  of further passes of sanitising and refactoring so the (re) locking
  isn't so heavy.  Right now I'm going for correctness, not speed.

* The BAR TX can fail if the hardware TX queue is full.  Since there's
  no "free" space kept for management frames, a full TX queue (from eg
  an iperf test) can race with your ability to allocate ath_buf/mbufs
  and cause issues.  I'll knock this on the head with a subsequent
  commit.

* I need to do some _much_ more thorough testing in hostap mode to ensure
  that many concurrent traffic streams to different end nodes are correctly
  handled.  I'll find and squish whichever bugs show up here.

But, this is an important step to being able to flip on 802.11n by default.
The last issue (besides bug fixes, of course) is HT frame protection and
I'll address that in a subsequent commit.
2012-04-04 23:45:15 +00:00
Adrian Chadd
084c471979 Track and optionally log the actual sync interrupt cause.
These are involved in tracking host interface issues (ie, PCI/PCIe/AHB
interface.)
2012-04-04 22:51:50 +00:00
Adrian Chadd
d6b2002327 Disable the HWQ contents upon a TX queue reset, rather than a TX queue flush.
This is designed to assist in figuring out what the hardware state is
when something like a queue hang has occured.
2012-04-04 22:24:11 +00:00
Adrian Chadd
d743debcbf Now that I've fixed the BAW TX hangs, disable this verbose debugging
again.
2012-04-04 22:22:50 +00:00
Jung-uk Kim
4db9b6c3c6 Save and restore VGA display memory between suspend and resume. 2012-04-04 22:02:54 +00:00
Adrian Chadd
33d340324a Correctly handle AR_MoreAggr when assembling multi-descriptor final frames.
Linux ath9k doesn't have this issue as it doesn't try queuing multi-
descriptor frames to the hardware.

Before, I was only setting the first and last descriptor in the final
frame correctly - and that was done by accident. The first descriptor in
the last sub-frame was being correctly updated by ath_tx_setds_11n();
the last descriptor in the last sub-frame was being correctly updated
by ath_buf_set_rate(). But both of those are "incorrect".

The correct behaviour is:

* AR_IsAggr is set for all descriptors for all subframes in an aggregate.
* AR_MoreAggr is set for all descriptors for all non-final sub-frames
  in an aggregate.

Ie, all descriptors in the last sub-frame of an aggregate must have this
field set to 0.

I still need to do a couple of extra passes to ensure the pad delimiter
field is being correctly handled in all descriptors in the last sub-frame.
2012-04-04 21:49:49 +00:00
Jung-uk Kim
4c7a7f266f Do not copy VESA state buffer if the VBE call has failed for any reason.
Do not unnecessarily clear the state buffer before calling the function.
2012-04-04 21:38:26 +00:00
John Baldwin
6267007aa6 Disable INET6 support in modules when building the LINT-NOINET6 kernel.
Reviewed by:	bz
MFC after:	1 week
2012-04-04 21:31:20 +00:00
Jung-uk Kim
a3ad2822c4 Remove a useless warning. The mode information is unused for very long time
and this function may be used with VESA mode since r232069.
2012-04-04 21:19:55 +00:00
Marius Strobl
26a93551be - Const'ify the device lookup-table.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
- Enable support for flow control.
  Tested by: yongari

MFC after:	1 week
2012-04-04 21:09:02 +00:00
Adrian Chadd
2fe1131c7b Add a threadid to the ah_decode API.
This adds the current thread ID to each logged register and mark entry,
allowing for easier debugging of concurrent/overlapping NIC operations.
2012-04-04 20:46:20 +00:00
Marius Strobl
bb16310a44 Refine r233827; as it turns out, controllers with a device ID of 0x0059
can be upgraded to MegaRAID mode, in which case mfi(4) should attach to
these based on the sub-vendor and -device ID instead (not currently done).
Therefore, let mpt_pci_probe() return BUS_PROBE_LOW_PRIORITY.
While it, let mpt_pci_probe() return BUS_PROBE_DEFAULT instead of 0 in
the default case.

MFC after:	3 days
2012-04-04 20:42:45 +00:00
Adrian Chadd
7961e32527 Disable a specific Merlin hardware workaround which may cause hangs on some
PCIe controllers.

Obtained from:	Atheros / Linux
2012-04-04 20:42:32 +00:00
Jung-uk Kim
3df9a2bf07 - Do not include machine/atomic.h. It is no longer necessary since r233768.
- Remove bogus "atomic" macros and a read-only variable from softc.

Reviewed by:	ambrisko
2012-04-04 16:15:40 +00:00
Jaakko Heinonen
63dfd849a1 Add a check for unsupported file flags to ufs_setattr().
Discussed with:	bde
MFC after:	2 weeks
2012-04-04 14:50:21 +00:00
Gleb Smirnoff
07b6b55dce Merge from OpenBSD:
revision 1.173
  date: 2011/11/09 12:36:03;  author: camield;  state: Exp;  lines: +11 -12
  State expire time is a baseline time ("last active") for expiry
  calculations, and does _not_ denote the time when to expire.  So
  it should never be added to (set into the future).

  Try to reconstruct it with an educated guess on state import and
  just set it to the current time on state updates.

  This fixes a problem on pfsync listeners where the expiry time
  could be double the expected value and cause a lot more states
  to linger.
2012-04-04 14:47:59 +00:00
John Baldwin
20b5d3bf40 Add descriptions after the 'device' line for several NICs to match the
existing style.
2012-04-04 13:49:22 +00:00
John Baldwin
8b39d094aa Fix build on i386. 2012-04-04 11:55:20 +00:00
Navdeep Parhar
60a305887a - Remove redundant call to pr_ctloutput from code that handles SO_SETFIB.
- Add a check for errors during copyin while here.

Reviewed by:	julian, bz
MFC after:	2 weeks
2012-04-03 18:38:00 +00:00
Gleb Smirnoff
64484cf630 Since pf 4.5 import pf(4) has a mechanism to defer
forwarding a packet, that creates state, until
pfsync(4) peer acks state addition (or 10 msec
timeout passes).

This is needed for active-active CARP configurations,
which are poorly supported in FreeBSD and arguably
a good idea at all.

Unfortunately by the time of import this feature in
OpenBSD was turned on, and did not have a switch to
turn it off. This leaked to FreeBSD.

This change make it possible to turn this feature
off via ioctl() and turns it off by default.

Obtained from:	OpenBSD
2012-04-03 18:09:20 +00:00
Bernhard Schmidt
a2a4a2aa53 Add basic HT channel setup to ieee80211_init_channels(), this will be
used by at least ral(4).

Reviewed by:	ray
2012-04-03 17:48:42 +00:00
Marius Strobl
6fe8be540f Fix probing of SAS1068E with a device ID of 0x0059 after r232411.
Reported by:	infofarmer

MFC after:	3 days
2012-04-03 08:28:43 +00:00
Kirk McKusick
23d6e518da A file cannot be deallocated until its last name has been removed
and it is no longer referenced by a user process. The inode for a
file whose name has been removed, but is still referenced at the
time of a crash will still be allocated in the filesystem, but will
have no references (e.g., they will have no names referencing them
from any directory).

With traditional soft updates these unreferenced inodes will be
found and reclaimed when the background fsck is run. When using
journaled soft updates, the kernel must keep track of these inodes
so that it can find and reclaim them during the cleanup process.
Their existence cannot be stored in the journal as the journal only
handles short-term events, and they may persist for days. So, they
are tracked by keeping them in a linked list whose head pointer is
stored in the superblock. The journal tracks them only until their
linked list pointers have been commited to disk. Part of the cleanup
process involves traversing the list of unreferenced inodes and
reclaiming them.

This bug was triggered when confusion arose in the commit steps
of keeping the unreferenced-inode linked list coherent on disk.
Notably, a race between the link() system call adding a link-count
to a file and the unlink() system call removing a link-count to
the file. Here if the unlink() ran after link() had looked up
the file but before link() had incremented the link-count of the
file, the file's link-count would drop to zero before the link()
incremented it back up to one. If the file was referenced by a
user process, the first transition through zero made it appear
that it should be added to the unreferenced-inode list when in
fact it should not have been added. If the new name created by
link() was deleted within a few seconds (with the file still
referenced by a user process) it would legitimately be a candidate
for addition to the unreferenced-inode list. The result was that
there were two attempts to add the same inode to the unreferenced-inode
list which scrambled the unreferenced-inode list's pointers leading
to a panic. The fix is to detect and avoid the false attempt at
adding it to the unreferenced-inode list by having the link()
system call check to see if the link count is zero before it
increments it. If it is, the link() fails with ENOENT (showing that
it has failed the link()/unlink() race).

While tracking down this bug, we have added additional assertions
to detect the problem sooner and also simplified some of the code.

Reported by:      Kirk Russell
Fix submitted by: Jeff Roberson
Tested by:        Peter Holm
PR:               kern/159971
MFC (to 9 only):  2 weeks
2012-04-02 21:58:37 +00:00
Konstantin Belousov
5085ecb75a When process exists, not only the children shall be reparented to
init, but also the orphans shall be removed from the orphan list,
because the list header is destroyed.

Reported and tested by:	pho
MFC after:	3 days
2012-04-02 19:35:36 +00:00
Konstantin Belousov
2e39e24f64 Add helper function to remove the process from the orphans list and
use it instead of inlined code.

Tested by:	pho
MFC after:	3 days
2012-04-02 19:34:56 +00:00
Doug Ambrisko
7e4dd9e112 Move struct megasas_sge from mfi_ioctl.h to mfivar.h so we can
remove including machine/bus.h.  Add some more mfi_ prefixes to
avoid name space pollution.

This should address the last tinderbox issues.
2012-04-02 19:13:02 +00:00
John Baldwin
b867b16dc9 Further tweak the changes made in r233709. The kernel doesn't permit
sleeping from a swi handler (even though in this case it would be ok), so
switch the refill and scanning SWI handlers to being tasks on a fast
taskqueue.  Also, only schedule the refill task for a CMCI as an MC# can
fire at any time, so it should do the minimal amount of work needed and
avoid opportunities to deadlock before it panics (such as scheduling a
task it won't ever need in practice).  To handle the case of an MC# only
finding recoverable errors (which should never happen), always try to
refill the event free list when the periodic scan executes.

MFC after:	2 weeks
2012-04-02 17:26:21 +00:00
Jaakko Heinonen
fccf286d24 - Use more natural ip->i_flags instead of vap->va_flags in the final
flags check.
- Add a comment for the immutable/append check done after handling of
  the flags.
- Style improvements.

No functional change intended.

Submitted by:	bde
MFC after:	2 weeks
2012-04-02 16:33:21 +00:00
John Baldwin
f2e3bfc074 Make machine check exception logging more readable. On newer Intel systems,
an uncorrected ECC error tends to fire on all CPUs in a package
simultaneously and the current printf hacks are not sufficient to make
the messages legible.  Instead, use the existing mca_lock spinlock to
serialize calls to mca_log() and change the machine check code to panic
directly when an unrecoverable error is encoutered rather than falling
back to a trap_fatal() call in trap() (which adds nearly a screen-full of
logging messages that aren't useful for machine checks).

MFC after:	2 weeks
2012-04-02 15:07:22 +00:00
Jayachandran C.
2323b34a2d Reinstate the XTLB handler for CPU_NLM and CPU_RMI
These platforms set the KX bit even when booted in 32 bit mode. So
the XLTB handler is needed even when __mips_n64 is not defined.
2012-04-02 11:41:33 +00:00
Hans Petter Selasky
6d917491f5 Fix compiler warnings, mostly signed issues,
when USB modules are compiled with WARNS=9.

MFC after:	1 weeks
2012-04-02 10:50:42 +00:00
Hans Petter Selasky
4563ba7a0d Add definitions and structures for USB 2.0 Link Power Management, LPM.
MFC after:	2 weeks
2012-04-02 07:51:30 +00:00
Doug Ambrisko
1f0fc486fa Change typedef atomic_t to struct mfi_atomic to avoid name space
collision and some couple more style changes.
2012-04-02 02:22:22 +00:00
Oleksandr Tymoshenko
6de0a4fa84 Remove extra semicolon which rendered condition useless
Submitted by:	Stefan Farfelder <stefanf@FreeBSD.org>
2012-04-02 00:11:26 +00:00
John Baldwin
e506e182dd Export some more useful info about shared memory objects to userland
via procstat(1) and fstat(1):
- Change shm file descriptors to track the pathname they are associated
  with and add a shm_path() method to copy the path out to a caller-supplied
  buffer.
- Use the fo_stat() method of shared memory objects and shm_path() to
  export the path, mode, and size of a shared memory object via
  struct kinfo_file.
- Add a struct shmstat to the libprocstat(3) interface along with a
  procstat_get_shm_info() to export the mode and size of a shared memory
  object.
- Change procstat to always print out the path for a given object if it
  is valid.
- Teach fstat about shared memory objects and to display their path,
  mode, and size.

MFC after:	2 weeks
2012-04-01 18:22:48 +00:00
David Chisnall
87d94367b9 Bump __FreeBSD_version for xlocale cleanup, as requested by ports people.
Approved by:	dim (mentor)
2012-04-01 09:35:23 +00:00
Marius Strobl
3e1745a769 Remove checks that are redundant due to tf_type being unsigned.
MFC after:	3 days
2012-03-31 14:03:16 +00:00
Marius Strobl
2cc3538245 Fix panic on kernel traps having a mapping in trap_sig b0rked in r206086.
Repored by:	David E. Cross

MFC after:	3 days
2012-03-31 13:56:24 +00:00
Alexander Motin
71c8e5f440 Be more conservative in using READ CAPACITY(16) command. Previous code
checked PROTECT bit in INQUIRY data for all SPC devices, while it is defined
only since SPC-3. But there are some SPC-2 USB devices were reported, that
have PROTECT bit set, return no error for READ CAPACITY(16) command, but
return wrong sector count value in response.

MFC after:	3 days
2012-03-31 11:23:09 +00:00
Gleb Smirnoff
2b9536028f Don't check malloc(M_WAITOK) results. 2012-03-31 11:20:48 +00:00
David Xu
8b1eafa723 Remove stale comments. 2012-03-31 06:48:41 +00:00
Doug Ambrisko
a6ba0fd64d MFhead_mfi r227068
First cut of new HW support from LSI and merge into FreeBSD.
	Supports Drake Skinny and ThunderBolt cards.
MFhead_mfi r227574
	Style
MFhead_mfi r227579
	Use bus_addr_t instead of uintXX_t.
MFhead_mfi r227580
	MSI support
MFhead_mfi r227612
	More bus_addr_t and remove "#ifdef __amd64__".
MFhead_mfi r227905
	Improved timeout support from Scott.
MFhead_mfi r228108
	Make file.
MFhead_mfi r228208
	Fixed botched merge of Skinny support and enhanced handling
	in call back routine.
MFhead_mfi r228279
	Remove superfluous !TAILQ_EMPTY() checks before TAILQ_FOREACH().
MFhead_mfi r228310
	Move mfi_decode_evt() to taskqueue.
MFhead_mfi r228320
	Implement MFI_DEBUG for 64bit S/G lists.
MFhead_mfi r231988
	Restore structure layout by reverting the array header to
	use [0] instead of [1].
MFhead_mfi r232412
	Put wildcard pattern later in the match table.
MFhead_mfi r232413
	Use lower case for hexadecimal numbers to match surrounding
	style.
MFhead_mfi r232414
	Add more Thunderbolt variants.
MFhead_mfi r232888
	Don't act on events prior to boot or when shutting down.
	Add hw.mfi.detect_jbod_change to enable or disable acting
	on JBOD type of disks being added on insert and removed on
	removing.  Switch hw.mfi.msi to 1 by default since it works
	better on newer cards.
MFhead_mfi r233016
	Release driver lock before taking Giant when deleting children.
	Use TAILQ_FOREACH_SAFE when items can be deleted.  Make code a
	little simplier to follow.  Fix a couple more style issues.
MFhead_mfi r233620
	Update mfi_spare/mfi_array with the actual number of elements
	for array_ref and pd.  Change these max. #define names to avoid
	name space collisions.  This will require an update to mfiutil
	It avoids mfiutil having to do a magic calculation.

	Add a note and #define to state that a "SYSTEM" disk is really
	what the firmware calls a "JBOD" drive.

Thanks to the many that helped, LSI for the initial code drop,
mav, delphij, jhb, sbruno that all helped with code and testing.
2012-03-30 23:05:48 +00:00
Dimitry Andric
1040a0007e Fix the following compilation warning with clang trunk in isci(4):
sys/dev/isci/isci_task_request.c:198:7: error: case value not in enumerated type 'SCI_TASK_STATUS' (aka 'enum _SCI_TASK_STATUS') [-Werror,-Wswitch]
    case SCI_FAILURE_TIMEOUT:
           ^

This is because the switch is done on a SCI_TASK_STATUS enum type, but
the SCI_FAILURE_TIMEOUT value belongs to SCI_STATUS instead.

Because the list of SCI_TASK_STATUS values cannot be modified at this
time, use the simplest way to get rid of this warning, which is to cast
the switch argument to int.  No functional change.

Reviewed by:	jimharris
MFC after:	3 days
2012-03-30 22:52:08 +00:00
John Baldwin
8b9e9831bf Attempt to make machine check handling a bit more robust:
- Don't malloc() new MCA records for machine checks logged due to a
  CMCI or MC# exception.  Instead, use a pre-allocated pool of records.
  When a CMCI or MC# exception fires, schedule a swi to refill the pool.
  The pool is sized to hold at least one record per available machine
  bank, and one record per CPU. This should handle the case of all CPUs
  triggering a single bank at once as well as the case a single CPU
  triggering all of its banks.  The periodic scans still use malloc()
  since they are run from a safe context.
- Since we have to create an swi to handle refills, make the periodic scan
  a second swi for the same thread instead of having a separate taskqueue
  thread for the scans.

Suggested by:	mdf (avoiding malloc())
MFC after:	2 weeks
2012-03-30 20:17:39 +00:00
John Baldwin
d8a8648379 Fix a few issues with transmit handling in em(4) and igb(4):
- Do not define the foo_start() methods or set if_start in the ifnet if
  multiq transmit is enabled.  Also, set if_transmit and if_qflush before
  ether_ifattach rather than after when multiq transmit is enabled.  This
  helps to ensure that the drivers never try to mix different transmit
  methods.
- Properly restart transmit during resume.  igb(4) was not restarting it
  at all, and em(4) was restarting even if the link was down and was
  calling the wrong method if multiq transmit was enabled.
- Remove all the 'more' handling for transmit completions.  Transmit
  completion processing does not have a processing limit, so it always
  runs to completion and never has more work to do when it returns.
  Instead, the previous code was returning 'true' anytime there were
  packets in the queue that weren't still in the process of being
  transmitted.  The effect was that the driver would continuously
  reschedule a task to process TX completions in effect running at 100%
  CPU polling the hardware until it finished transmitting all of the
  packets in the ring.  Now it will just wait for the next TX completion
  interrupt.
- Restart packet transmission when the link becomes active.
- Fix the MSI-X queue interrupt handlers to restart packet transmission if
  there are pending packets in the relevant software queue (IFQ or buf_ring)
  after processing TX completions.  This is the root cause for the OACTIVE
  hangs as if the MSI-X queue handler drained all the pending packets from
  the TX ring, nothing would ever restart it.  As such, remove some
  previously-added workarounds to reschedule a task to poll the TX ring
  anytime OACTIVE was set.

Tested by:	sbruno
Reviewed by:	jfv
MFC after:	1 week
2012-03-30 19:54:48 +00:00
John Baldwin
435803f3c7 Move the legacy(4) driver to x86. 2012-03-30 19:10:14 +00:00
Jung-uk Kim
b64bbced28 Re-initialize model-specific MSRs when we resume CPUs.
MFC after:	1 week
2012-03-30 17:03:06 +00:00
Jung-uk Kim
3ce5dbcc3d Work around Erratum 721 for AMD Family 10h and 12h processors.
"Under a highly specific and detailed set of internal timing conditions,
the processor may incorrectly update the stack pointer after a long series
of push and/or near-call instructions, or a long series of pop and/or
near-return instructions.  The processor must be in 64-bit mode for this
erratum to occur."

MFC after:	3 days
2012-03-30 16:32:41 +00:00
Marius Strobl
4cb0ce8a60 - Remove erroneous trailing semicolon. [1]
- Correctly determine the maximum payload size for setting the TX link
  frequent NACK latency and replay timer thresholds.

Submitted by:	stefanf [1]
MFC after:	3 days
2012-03-30 15:08:09 +00:00
David Xu
b29d7d9b60 Remove trailing semicolon, it is a typo. 2012-03-30 12:57:14 +00:00
David Xu
0cf573e989 Fix COMPAT_FREEBSD32 build.
Submitted by: Andreas Tobler < andreast at fgznet dot ch >
2012-03-30 09:03:53 +00:00
Alexander Motin
742963ff31 Reenable unsolicited responses on CODEC if hdaa_sense_init() called again.
This fixes jack connection events handling after suspend/resume.

PR:		kern/166382
MFC after:	1 week
2012-03-30 08:33:08 +00:00
David Xu
4ed8858df0 Remove trailing space. 2012-03-30 05:49:32 +00:00
David Xu
e05171d939 Merge umtxq_sleep and umtxq_nanosleep into a single function by using
an abs_timeout structure which describes timeout info.
2012-03-30 05:40:26 +00:00
Pyun YongHyeon
fd3d448fa8 Do not report current link status if driver is not running.
This change also workarounds dhclient's link state handling bug by
not giving current link status.

Unlike other controllers, ale(4)'s PHY hibernation perfectly works
such that driver does not see a valid link if the controller is not
brought up.  If dhclient(8) runs on ale(4) it will blindly waits
until link UP and then gives up after 10 seconds.  Because
dhclient(8) still thinks interface got a valid link when IFM_AVALID
is not set for selected media,  this change makes dhclient initiate
DHCP without waiting for link UP.
2012-03-30 05:27:05 +00:00
Pyun YongHyeon
2b6f7122ab Remove task queue based link state change handler. Driver no longer
needs to defer link state handling.
While I'm here, mark IFF_DRV_RUNNING before changing media.  If
link is established without any delay, that link state change
handling could be lost.
2012-03-30 04:46:39 +00:00
Dimitry Andric
a80f8859c4 Fix an issue introduced in sys/x86/include/endian.h with r232721. In
that revision, the bswapXX_const() macros were renamed to bswapXX_gen().

Also, bswap64_gen() was implemented as two calls to bswap32(), and
similarly, bswap32_gen() as two calls to bswap16().  This mainly helps
our base gcc to produce more efficient assembly.

However, the arguments are not properly masked, which results in the
wrong value being calculated in some instances.  For example,
bswap32(0x12345678) returns 0x7c563412, and bswap64(0x123456789abcdef0)
returns 0xfcdefc9a7c563412.

Fix this by appropriately masking the arguments to bswap16() in
bswap32_gen(), and to bswap32() in bswap64_gen().  This should also
silence warnings from clang.

Submitted by:	jh
2012-03-29 23:31:48 +00:00
Dimitry Andric
4715a95fb4 Revert sys/x86/include/endian.h to what it was before r233419, as that
revision has two problems:
- It can produce worse code with both clang and gcc.
- It doesn't fix the actual issue introduced in r232721, which will be
  fixed in the next commit.

Submitted by:	bde, tijl and jh
Pointy hat to:	dim
2012-03-29 23:30:17 +00:00
Adrian Chadd
b5a9dfd57c oops, add a missing lock. 2012-03-29 21:54:19 +00:00
Jung-uk Kim
64d87e5bf2 Fix couple of style nits. 2012-03-29 19:29:24 +00:00
Jung-uk Kim
5757b18266 Revert r233662 and generalize the hack. Writing zero to BAR actually does
not disable it and it is even harmful as hselasky found out.  Historically,
this code was originated from (OLDCARD) CardBus driver and later leaked into
PCI driver when CardBus was newbus'ified and refactored with PCI driver.
However, it is not really necessary even for CardBus.

Reviewed by:	hselasky, imp, jhb
2012-03-29 19:26:39 +00:00
John Baldwin
0d95597ca9 Use a more proper fix for enabling HT MSI mapping windows on Host-PCI
bridges.  Rather than blindly enabling the windows on all of them, only
enable the window when an MSI interrupt is enabled for a device behind
the bridge, similar to what already happens for HT PCI-PCI bridges.

To implement this, each x86 Host-PCI bridge driver has to be able to
locate it's actual backing device on bus 0.  For ACPI, use the _ADR
method to find the slot and function of the device.  For the non-ACPI
case, the legacy(4) driver already scans bus 0 looking for Host-PCI
bridge devices.  Now it saves the slot and function of each bridge that
it finds as ivars that the Host-PCI bridge driver can then use in its
pcib_map_msi() method.

This fixes machines where non-MSI interrupts were broken by the previous
round of HT MSI changes.

Tested by:	bapt
MFC after:	1 week
2012-03-29 19:03:22 +00:00
John Baldwin
46092aeec0 Restore proper use of bounce buffers for ISA DMA. When locking was
added, the call to pmap_kextract() was moved up, and as a result the
code never updated the physical address to use for DMA if a bounce
buffer was used.  Restore the earlier location of pmap_kextract() so
it takes bounce buffers into account.

Tested by:	kargl
MFC after:	1 week
2012-03-29 18:58:02 +00:00
Adrian Chadd
03e9308f0a Defer the rescheduling of TID -> TXQ frames in some instances.
Right now ath_txq_sched() is mainly called from the TX ath_tx_processq()
routine, which is (mostly) done as part of the taskqueue.  It shouldn't
be called outside the taskqueue.

But now that I'm about to flip back on BAR TX, I'm going to start
stressing the ath_tx_tid_pause() and ath_tx_tid_resume() paths.
What I don't want to have happen is a reschedule of the TID traffic
_during_ the completion of TX frames.

Ideally I'd like to have a way to flag back up to the processing code
that the current hardware queue should be rechecked for software TID
queue frames.  But for now, this should suffice for the BAR TX case.

I may eventually delete this code once I've brought some further
sanity to the general TX queue/completion path.
2012-03-29 17:39:18 +00:00
John Baldwin
1f22be4547 - Rename VM_MEMATTR_UNCACHED to VM_MEMATTR_WEAK_UNCACHEABLE on x86 to
be less ambiguous and more clearly identify what it means.  This
  attribute is what Intel refers to as UC-, and it's only difference
  relative to normal UC memory is that a WC MTRR will override a UC-
  PAT entry causing the memory to be treated as WC, whereas a UC PAT
  entry will always override the MTRR.
- Remove the VM_MEMATTR_UNCACHED alias from powerpc.
2012-03-29 16:51:22 +00:00
John Baldwin
5e1a7cc71e Use VM_MEMATTR_UNCACHEABLE for the constant for UC memory rather than
VM_MEMATTR_UNCACHED.  VM_MEMATTR_UNCACHEABLE is the constant other
platforms use.

MFC after:	2 weeks
2012-03-29 16:48:36 +00:00
Nathan Whitehorn
7893b7f6dd Fix build after changes to trap headers. 2012-03-29 16:04:42 +00:00
Hans Petter Selasky
d1eacc02f1 Move tty_opened_ns() into syscons.c which is currently the
only client of this macro.

Suggested by:	ed @
MFC after:	1 week
2012-03-29 15:47:29 +00:00
Jim Harris
8e0f1d18c8 Fix bug where isci(4) would report only 15 bytes of returned data on a
READ_CAP_16 command to a SATA target.

Sponsored by: Intel
Reviewed by: sbruno
Approved by: sbruno
MFC after: 3 days
2012-03-29 15:43:07 +00:00
Hans Petter Selasky
0e8542711a Fix for boot issue: Don't disable BARs on AGP devices. In general:
Don't disable BARs on any PCI display devices, because doing that can
sometimes cause the main memory bus to stop working, causing all
memory reads to return nothing but 0xFFFFFFFF, even though the memory
location was previously written.  After a while a privileged
instruction fault will appear and then nothing more can be debugged.
The reason for this behaviour is unknown.

MFC after:	1 week
2012-03-29 15:33:44 +00:00
Hans Petter Selasky
8dbeb1b6cc Fix for NULL-pointer panic during boot, if keys are pressed too early.
MFC after:	1 week
2012-03-29 14:53:14 +00:00
Randall Stewart
c4e848b770 Make stream our stream reset implementation
compliant to RFC6525.

MFC after:	1 month
2012-03-29 13:36:53 +00:00
Jayachandran C.
7fb26c47df Remove unnecessary assembly code.
The compiler should generate lw/sw corresponding to register
operations.
2012-03-29 11:46:29 +00:00
Andrey V. Elsukov
ba289b84b0 VMDB offset should be greater than logical volume size only for MBR. 2012-03-29 07:29:27 +00:00
Andrey V. Elsukov
1c45872b03 Do proper cleanup for the GPT case when an error occurs. 2012-03-29 06:37:02 +00:00
Eitan Adler
50d675f7a9 Remove trailing whitespace per mdoc lint warning
Disussed with:	gavin
No objection from:	doc
Approved by:	joel
MFC after:	3 days
2012-03-29 05:02:12 +00:00
Juli Mallett
84db023ec1 Assume a big-endian default on MIPS and drop the "eb" suffix from MACHINE_ARCH.
This makes our naming scheme more closely match other systems and the
expectations of much third-party software.  MIPS builds which are little-endian
should require and exhibit no changes.  Big-endian TARGET_ARCHes must be
changed:
	From:		To:
	mipseb		mips
	mipsn32eb	mipsn32
	mips64eb	mips64

An entry has been added to UPDATING and some foot-shooting protection (complete
with warnings which should become errors in the near future) to the top-level
base system Makefile.
2012-03-29 02:54:35 +00:00
David Xu
d31f470d15 Reduce code size by creating common timed sleeping function. 2012-03-29 02:46:43 +00:00
Juli Mallett
df42d19401 Turn on messages from the Simple Executive codebase, what few there are. 2012-03-29 02:05:11 +00:00
Juli Mallett
5143d82211 Disable FP instruction emulation by default on !o32 because of ABI concerns.
Note that in practice this isn't needed because we get a coprocessor unusable
exception first, but that's actually something like a bug.
2012-03-29 02:04:15 +00:00
Juli Mallett
39dec33f2b Supply endianness implied by the -m flag when compiling ucore code. 2012-03-29 02:03:06 +00:00
Juli Mallett
5c3c01764b Fix little-endian built. 2012-03-29 02:02:23 +00:00
Nathan Whitehorn
13b5e92e01 Allow multiple inclusion of trap.h. This has always been broken, but
until recently never caused problems.
2012-03-29 02:02:14 +00:00
Kirk McKusick
6c09f4a27c A refinement of change 232351 to avoid a race with a forcible unmount.
While we have a snapshot vnode unlocked to avoid a deadlock with another
inode in the same inode block being updated, the filesystem containing
it may be forcibly unmounted. When that happens the snapshot vnode is
revoked. We need to check for that condition and fail appropriately.

This change will be included along with 232351 when it is MFC'ed to 9.

Spotted by:  kib
Reviewed by: kib
2012-03-28 21:21:19 +00:00
Fabien Thomas
f5f9340b98 Add software PMC support.
New kernel events can be added at various location for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).

Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.

Sponsored by: NETASQ
MFC after:	1 month
2012-03-28 20:58:30 +00:00
Kirk McKusick
1faacf5d09 Keep track of the mount point associated with a special device
to enable the collection of counts of synchronous and asynchronous
reads and writes for its associated filesystem. The counts are
displayed using `mount -v'.

Ensure that buffers used for paging indicate the vnode from
which they are operating so that counts of paging I/O operations
from the filesystem are collected.

This checkin only adds the setting of the mount point for the
UFS/FFS filesystem, but it would be trivial to add the setting
and clearing of the mount point at filesystem mount/unmount
time for other filesystems too.

Reviewed by: kib
2012-03-28 20:49:11 +00:00
John Baldwin
45a225844f Allocate the ioapics[] array dynamically since it is only needed for the
duration of madt_setup_io().  This avoids having the array take up
permanent space in the BSS.

Inspired by:	bde
MFC after:	2 weeks
2012-03-28 18:53:48 +00:00
Jim Harris
47c3b89324 Ensure consistent target IDs for direct-attached devices.
Sponsored by: Intel
Reported by: sbruno, <rpokala at panasas dot com>
Tested by: <rpokala at panasas dot com>
Reviewed by: scottl
Approved by: scottl
MFC after: 3 days
2012-03-28 18:38:13 +00:00
Jung-uk Kim
a6e69b92e6 Add a PNP ID for Japanese 106-key keyboard.
PR:		kern/166459
MFC after:	3 days
2012-03-28 17:58:37 +00:00
Nathan Whitehorn
7e55df27cb More PMAP performance improvements: skip 256 MB segments entirely if they
are are not mapped during ranged operations and reduce the scope of the
tlbie lock only to the actual tlbie instruction instead of the entire
sequence. There are a few more optimization possibilities here as well.
2012-03-28 17:25:29 +00:00
Jung-uk Kim
3f8d720f87 MFV: r233615
Revert r233555 and apply a fix for the reference counting regressions.

Tested by:	andreast, lme, nwhitehorn,
		Sevan / Venture37 (venture37 at gmail dot com)
Submitted by:	Robert Moore (robert dot moore at intel dot com)
2012-03-28 17:21:59 +00:00
John Baldwin
5dba6ec3b3 Move the DTrace return IDT vector back up from 0x20 to 0x92. The 0x20
vector is currently dedicated to servicing IRQ 0 from the 8259A's, so
it shouldn't be overloaded for DTrace.

Tested by:	rstone
MFC after:	1 week
2012-03-28 16:32:17 +00:00
Konstantin Belousov
ea573a50b3 Do trivial reformatting of the comment to record the missed commit
message for r233609:
Restore the writes of atimes, quotas and superblock from syncer vnode.

Noted by:   rdivacky
2012-03-28 14:16:15 +00:00
Konstantin Belousov
a988a5c609 Reviewed by: bde, mckusick
Tested by:	pho
MFC after:	2 weeks
2012-03-28 14:06:47 +00:00
Konstantin Belousov
64c8ead942 Microoptimize: in qsync loop over mount vnodes, only unlock mount
interlock after we committed to try to vget() the vnode.

Submitted by:	bde
Reviewed by:	mckusick
Tested by:	pho
MFC after:	1 week
2012-03-28 13:56:18 +00:00
Konstantin Belousov
e0c1740853 Update comment.
MFC after:	3 days
2012-03-28 13:47:07 +00:00
Alexander Motin
16b8ad6420 Stop HDA controller polling callout on suspend and reset it on resume.
PR:		kern/166382
MFC after:	1 week
2012-03-28 13:28:09 +00:00
Marko Zec
2454a7ca98 Permit tcpdrop in VNET jails.
Submitted by:	Miljenko Mikuc
MFC after:	3 days
2012-03-28 12:30:16 +00:00
Michael Tuexen
86e4703fa6 Honor the net.inet.udp.checksum sysctl when using SCTP/UDP/IPv4
encapsulation.
MFCing requires MFCing http://svn.freebsd.org/changeset/base/233554
MFC after: 2 weeks
2012-03-28 08:11:46 +00:00
Pyun YongHyeon
c3f52a31a5 Remove unnecessary #if as the software workaround for PCI protocol
violation should be activated unless the system is cold-booted
after updating EEPROM.
The PCI protocol violation happens only when established link is
10Mbps so the workaround should be updated whenever link state
change is detected.  Previously the workaround was activated only
when user checks current media status with ifconfig(8).
2012-03-28 01:52:38 +00:00
Pyun YongHyeon
8262183e5b Load entire EEPROM contents in device attach time and verify
whether the checksum of EEPROM is valid or not.  Because driver
heavily relies on EEPROM information when it selectively enables
features/workarounds, it would be helpful to know whether driver
sees valid EEPROM.
While I'm here remove all other EEPROM accesses since the entire
EEPROM is loaded at device attach time.

MFC after:	2 weeks
2012-03-28 01:27:27 +00:00
Pyun YongHyeon
1343a72fe2 Partially revert r223608 and selectively allow microcode loading
for 82550C.  For 82550 controllers this change restores CPUSaver
microcode loading.  Due to silicon bug on 82550 and 82550C with
server extension, these controllers seem to require CPUSaver
microcode to receive fragmented UDP datagrams.  However the
microcode shouldn't be used on client featured 82550C as it locks
up the controller.  In addition, client featured 82550C does not
have the silicon bug.  Also clear temporary memory used for
microcode loading since the same memory area is used for other
commands.
While I'm here use 82550C in probe message instead of generic
82550.

Reported by:	Andreas Longwitz <longwitz <> incore de>
Tested by:	Andreas Longwitz <longwitz <> incore de>
MFC after:	2 weeks
2012-03-28 01:08:55 +00:00
Jung-uk Kim
f9be5550f7 - Do not clobber softc when psm(4) is reintialized.
- Make INITAFTERSUSPEND flag independent of HOOKRESUME flag.
- Automatically set INITAFTERSUSPEND flag when ALPS GlidePoint is detected.
- Always probe Synaptics Touchpad.  Allow MOUSE_SYN_GETHWINFO ioctl and
automatically set INITAFTERSUSPEND flag when a supported device is detected,
regardless of "hw.psm.synaptics_support" tunable setting.
- Update psm(4) to reflect the above changes.
- Remove long-time defunct SYNCHACK flag while I am in the neighborhood.

MFC after:	1 month
2012-03-27 23:43:01 +00:00