Commit Graph

90106 Commits

Author SHA1 Message Date
trasz
7f09aee7a1 Fix bug where NFSv4 ACL enforcement code wouldn't unconditionally
allow the owner to read and write ACL and file attributes when there
was no entry with subject matching the owner.  In other words,
'getfacl meh' shouldn't fail for the owner if the ACL looks like this:

# file: meh
# owner: trasz
# group: wheel
         user:root:------a-------:------:allow

Reported by:	kientzle
2012-04-17 14:54:00 +00:00
trasz
29ba0a35f6 Stop treating system processes as special. This fixes panics
like the one triggered by this:

# kldload geom_vinum
# pwait `pgrep -S gv_worker` &
# kldunload geom_vinum

or this:

GEOM_JOURNAL: Shutting down geom gjournal 3464572051.
panic: destroying non-empty racct: 1 allocated for resource 6

which were tracked by jh@ to be caused by checking p->p_flag,
while it wasn't initialised yet.  Basically, during fork, the code
checked p_flag, concluded the process isn't marked as P_SYSTEM,
incremented the counter, and later on, when exiting, checked that
the process was marked as P_SYSTEM, and thus didn't decrement it.

Also, I believe there wasn't any good reason for checking P_SYSTEM
in the first place.

Tested by:	jh
2012-04-17 14:31:02 +00:00
trasz
a41bb18a29 Fix panic, triggered like this: "int main() { thr_exit(); }"
Submitted by:	Mateusz Guzik
2012-04-17 13:44:40 +00:00
trasz
c37ffba90a Enforce upper bound on the input buffer length.
Reported by:	Mateusz Guzik
2012-04-17 13:28:14 +00:00
trasz
aa6b6972ad Fix panic at boot with SD/MMC readers with no media present, introduced
at r234177.  Note that this is a temporary fix, until I come up with something
prettier.
2012-04-17 10:44:28 +00:00
adrian
1bc364bf7e Run the fatal proc as a proc, rather than where it currently is.
Otherwise the reset path will sleep, which it can't do in this context.
2012-04-17 06:02:41 +00:00
adrian
ab4c20c69d Fix the RX free list locking creation and destruction to be consistent
even in the face of errors.

If the RX descriptor list fails, the RX lock won't be initialised, but
then the DMA free path wil try freeing it.

This commit is brought to you by a working mwl(4).
2012-04-17 04:52:57 +00:00
adrian
c5d2215f58 Add missing #include 2012-04-17 04:31:50 +00:00
adrian
0e5c802c61 Style(9) and white space fixes. 2012-04-17 01:34:49 +00:00
adrian
23c6d6eeb0 Protect the PCI space registers behind a mutex.
Obtained from:	Linux/OpenWRT, Atheros
2012-04-17 01:22:59 +00:00
grehan
f3d41e0f81 Add x2apic MSR definitions
Reviewed by:	jhb
Obtained from:	bhyve via Neel via NetApp
2012-04-17 00:54:38 +00:00
jkim
c73bd7ac42 Fix a Clang warning.
Submitted by:	arundel
2012-04-16 23:29:12 +00:00
jkim
3b77573032 Regen for r234359. 2012-04-16 23:17:29 +00:00
jkim
286910d465 Correct an argument type of iopl syscall for Linuxulator. This also fixes
a warning from Clang, i. e., "args->level < 0 is always false".
2012-04-16 23:16:18 +00:00
jkim
deeb601d27 Regen for r234357. 2012-04-16 22:59:51 +00:00
jkim
7bd78fb6df Correct arguments of stat64, fstat64 and lstat64 syscalls for Linuxulator. 2012-04-16 22:58:28 +00:00
dim
97ce6d5fb1 Bump __FreeBSD_version due to the import of a new clang 3.1 prerelease
snapshot.
2012-04-16 21:28:04 +00:00
jkim
f90207b886 Regen for r234352. 2012-04-16 21:24:23 +00:00
jkim
e210f689a8 - Implement pipe2 syscall for Linuxulator. This syscall appeared in 2.6.27
but GNU libc used it without checking its kernel version, e. g., Fedora 10.
- Move pipe(2) implementation for Linuxulator from MD files to MI file,
sys/compat/linux/linux_file.c.  There is no MD code for this syscall at all.
- Correct an argument type for pipe() from l_ulong * to l_int *.  Probably
this was the source of MI/MD confusion.

Reviewed by:	emulation
2012-04-16 21:22:02 +00:00
jkim
da5eae8899 - When interrupt is not requested for VM86 call, make a fake exit point and
push the address onto stack as we do for INTn emulation.  This avoids stack
underflow when we encounter RETF instruction in VM86 mode.  Lack of this
exit point actually caused page fault in VM86 mode with VESA module when we
resume from suspend state[1].
- Remove unnecessary CLI and STI instructions from BIOS interrupt emulation.
INTn and IRET must be able to emulate the flag correctly.

Reported by:	gavin [1]
Tested by:	gavin (early revision)
MFC after:	3 days
2012-04-16 19:31:44 +00:00
grehan
1cae003d38 Sync with Bryan Venteicher's virtio git repo:
d04e609bdd1973cc7d2e8b38b7dcfae057b0962d
	virtio_blk: Use correct temporary variable in vtblk_poll_request

Obtained from:	Bryan Venteicher  bryanv at daemoninthecloset dot org
2012-04-16 18:29:12 +00:00
marius
910f312b32 Turn on PREEMPTION by default. After fixing several bugs over time, the
last show-stopper keeping PREEMPTION from being usable on sparc64 should
have been dealt with in r230662.
At least on 2-way systems, PREEMPTION causes a little bit of a degradation
in worldstone performance. However, FreeBSD seems to have started building
up regressions in !PREEMPTION cases so sparc64 better should not be an
oddball in this regard.

MFC after:	1 week
2012-04-16 18:29:07 +00:00
jh
734fbc687a Sync tmpfs_chflags() with the recent changes to UFS:
- Add a check for unsupported file flags.
- Return EPERM when an user without PRIV_VFS_SYSFLAGS privilege attempts
  to toggle SF_SETTABLE flags.
2012-04-16 18:10:34 +00:00
jh
e8d1b1d0ce tmpfs: Allow update mounts only for certain options.
Since r230208 update mounts were allowed if the list of mount options
contained the "export" option. This is not correct as tmpfs doesn't
really support updating all options.

Reviewed by:	kevlo, trociny
2012-04-16 18:07:42 +00:00
glebius
9297dd4c7e When we receive an ICMP unreach need fragmentation datagram, we take
proposed MTU value from it and update the TCP host cache. Then
tcp_mss_update() is called on the corresponding tcpcb. It finds the
just allocated entry in the TCP host cache and updates MSS on the
tcpcb. And then we do a fast retransmit of what we have in the tcp
send buffer.

This sequence gets broken if the TCP host cache is exausted. In this
case allocation fails, and later called tcp_mss_update() finds nothing
in cache. The fast retransmit is done with not reduced MSS and is
immidiately replied by remote host with new ICMP datagrams and the
cycle repeats. This ping-pong can go up to wirespeed.

To fix this:
- tcp_mss_update() gets new parameter - mtuoffer, that is like
  offer, but needs to have min_protoh subtracted.
- tcp_mtudisc() as notification method renamed to tcp_mtudisc_notify().
- tcp_mtudisc() now accepts not a useless error argument, but proposed
  MTU value, that is passed to tcp_mss_update() as mtuoffer.

Reported by:	az
Reported by:	Andrey Zonov <andrey zonov.org>
Reviewed by:	andre (previous version of patch)
2012-04-16 13:49:03 +00:00
zec
b9ad5bf236 #include <net/vnet.h> is no longer needed here.
Spotted by:	Ed Maste
MFC after:	3 days.
2012-04-16 13:41:46 +00:00
avg
5f4b7b5e72 zfsboot: honor -q if it's present in boot.config
Before r228267 the option was honored but the original content of
boot.config was not preserved.  I tried to fix that but missed the idea.
Now the proper way of doing things is taken from i386/boo2.
Also, a comment is added to explain this a little bit unobvious
behavior.

Inspired by:	jhb
MFC after:	5 days
2012-04-16 10:43:06 +00:00
avg
70742a1dd8 intpm: add ATI IXP400 pci id
PR:		kern/136762
Submitted by:	Aurelien Mere <freebsd@amc-os.com>
Tested by:	Jens Link <jens.link@gmx.de>
MFC after:	5 days
2012-04-16 10:33:46 +00:00
andrew
f784792ea9 Replace the C implementation of __aeabi_read_tp with an assembly version.
This ensures we follow the ABI by preserving registers r1-r3.

Reviewed by:	jmallett, imp
2012-04-16 09:38:20 +00:00
adrian
230291dbf1 Add in the AP96 phy configuration from openwrt.
* arge0 doesn't (yet) work via the switch PHY ports; I'm not sure why.
* arge1 maps to the WAN port. That works.

TODO:

* The PLL register needs a different (non-default) value for Gigabit
  Ethernet.  The board setup code needs to be extended a bit to allow
  for non-default pll_1000 values - right now, those values come out
  of hard-coded values in the per-chip set_pll_ge() routines.

Obtained from:	Linux / OpenWRT
2012-04-15 22:59:56 +00:00
adrian
66d589b127 The AR913x MII speed configuration matches the AR71xx MII configuration.
So share the code.

Don't do it for the AR724x - that has a completely different set of PLL
and MII configuration parameters.
2012-04-15 22:34:22 +00:00
gleb
748c9a0d1c Provide better description for vfs.tmpfs.memory_reserved sysctl.
Suggested by:	Anton Yuzhaninov <citrin@citrin.ru>
2012-04-15 21:59:28 +00:00
adrian
69f376da56 Migrate the net80211 TX aggregation state to be from per-AC to per-TID.
TODO:

* Test mwl(4) more thoroughly!

Reviewed by:	bschmidt (for iwn)
2012-04-15 20:29:39 +00:00
adrian
c5f9250e61 Drop this down from 512 to 128 for now.
This may result in a bit of a throughput drop.  However, any throughput
drop at this point should be investigated and root caused, as it's likely
because TX scheduling (all the way down to how preemption, scheduler work,
etc) is happening in a sub-optimal fashion.

This also makes it much more likely to be reloadable on a live machine.
Allocating 5120 TX ath_buf entries via contigmalloc is very unlikely
after a few hours of using X/Chromium.
2012-04-15 19:54:22 +00:00
bschmidt
1ba70fed8e Use the M_AMPDU_MPDU flag to determine when to manually set the seqno and
use a BA queue.
2012-04-15 18:25:17 +00:00
adrian
179e79fd20 Fix the mask logic when reading PCI configuration space registers. 2012-04-15 02:38:01 +00:00
adrian
3f3580908c Override some default values to work around various issues in the deep,
dirty and murky past.

* Override the default cache line size to be something reasonable if
  it's set to 0.  Some NICs initialise with '0' (eg embedded ones)
  and there are comments in the driver stating that various OSes (eg
  older Linux ones) would incorrectly program things and 0 out this
  register.

* Just default to overriding the latency timer.  Every other driver
  does this.

* Use a default cache line size of 32 bytes.  It should be "reasonable
  enough".

Obtained from:	Linux ath9k, Atheros
2012-04-15 00:04:23 +00:00
davide
63cc567af5 Fix a typo.
Approved by:	gnn (mentor)
MFC after:	2 days
2012-04-14 23:59:58 +00:00
davide
ff8b0a29f3 Fix some style bugs introduced in a previous commit (r233045)
Reported by:	glebius, jmallet
Reviewed by:	jmallet
Approved by:	gnn (mentor)
MFC after:	2 days
2012-04-14 23:53:31 +00:00
tuexen
7ad5fb0897 Send always HBs when in PF state.
MFC after: 1 week
X-MFC with: r234296
2012-04-14 21:01:44 +00:00
tuexen
bc585f5103 Bugfix: Don't send HBs on path which are not idle.
MFC after: 1 week
2012-04-14 20:22:01 +00:00
marius
330283063f Generate an obviously missing STOP when having finished transmitting data.
This fixes communication with PCF8563.
2012-04-14 17:27:34 +00:00
marius
460b10f523 Add support for the Atmel SAM9XE familiy of microcontrollers, which
consist of a ARM926EJ-S processor core with up to 512 Kbytes of on-chip
flash. Tested with SAM9XE512.

This file was missed in r234291.
2012-04-14 17:17:55 +00:00
marius
c2d55bde65 Add support for the Atmel SAM9XE familiy of microcontrollers, which
consist of a ARM926EJ-S processor core with up to 512 Kbytes of on-chip
flash. Tested with SAM9XE512.
2012-04-14 17:09:38 +00:00
luigi
c0099de7f5 i prefer this fix for the -Wformat warning (just one cast,
all the other variables are already correct for %x).
My previous attempt put the cast in the wrong place.
2012-04-14 16:44:18 +00:00
bz
d5e6290126 Fix LINT builds after r234233; not sure why modules need DEBUG by default. 2012-04-14 13:40:39 +00:00
bz
0f6ba3e6af Make compile on 64bit somehow for now after a first try at r234242 on
maybe 32bit?
2012-04-14 13:39:39 +00:00
marius
66d2871df6 - Try to bring these files closer to style(9).
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
2012-04-14 11:29:32 +00:00
marius
6f1427f0e6 Fix !DDB build after r234190. 2012-04-14 11:21:24 +00:00
grehan
f4aaf5384a Catch up with Bryan Venteicher's virtio git repo:
a8af6270bd96be6ccd86f70b60fa6512b710e4f0
      virtio_blk: Include function name in panic string

cbdb03a694b76c5253d7ae3a59b9995b9afbb67a
      virtio_balloon: Do the notify outside of the lock

      By the time we return from virtqueue_notify(), the descriptor
      will be in the used ring so we shouldn't have to sleep.

10ba392e60692529a5cbc1e9987e4064e0128447
      virtio: Use DEVMETHOD_END

80cbcc4d6552cac758be67f0c99c36f23ce62110
      virtqueue: Add support for VIRTIO_F_RING_EVENT_IDX

      This can be used to reduce the number of guest/host and
      host/guest interrupts by delaying the interrupt until a
      certain index value is reached.

      Actual use by the network driver will come along later.

8fc465969acc0c58477153e4c3530390db436c02
      virtqueue: Simplify virtqueue_nused()

      Since the values just wrap naturally at UINT16_MAX, we
      can just subtract the two values directly, rather than
      doing 2's complement math.

a8aa22f25959e2767d006cd621b69050e7ffb0ae
      virtio_blk: Remove debugging crud from 75dd732a

      There seems to be an issue with Qemu (or FreeBSD VirtIO) that sets
      the PCI register space for the device config to bogus values. This
      only seems to happen after unloading and reloading the module.

d404800661cb2a9769c033f8a50b2133934501aa
      virtio_blk: Use better variable name

75dd732a97743d96e7c63f7ced3c2169696dadd3
      virtio_blk: Partially revert 92ba40e65

      Just use the virtqueue to determine if any requests are
      still inflight.

06661ed66b7a9efaea240f99f414c368f1bbcdc7
      virtio_blk: error if allowed too few segments

      Should never happen unless the host provides use with a
      bogus seg_max value.

4b33e5085bc87a818433d7e664a0a2c8f56a1a89
      virtio_blk: Sort function declarations

426b9f5cac892c9c64cc7631966461514f7e08c6
      virtio_blk: Cleanup whitespace

617c23e12c61e3c2233d942db713c6b8ff0bd112
      virtio_blk: Call disk_err() on error'd completed requests

081a5712d4b2e0abf273be4d26affcf3870263a9
      virtio_blk: ASSERT the ready and inflight request queues are empty

a9be2631a4f770a84145c18ee03a3f103bed4ca8
      virtio_blk: Simplify check for too many segments

      At the cost of a small style violation.

e00ec09da014f2e60cc75542d0ab78898672d521
      virtio_blk: Add beginnings of suspend/resume

      Still not sure if we need to virtio_stop()/virtio_reinit()
      the device before/after a suspend.

      Don't start additional IO when marked as suspending.

47c71dc6ce8c238aa59ce8afd4bda5aa294bc884
      virtio_blk: Panic when dealt an unhandled BIO cmd

1055544f90fb8c0cc6a2395f5b6104039606aafe
      virtio_blk: Add VQ enqueue/dequeue wrappers

      Wrapper functions managed the added/removing to the in-flight
      list of requests.

      Normally biodone() any completed IO when draining the virtqueue.

92ba40e65b3bb5e4acb9300ece711f1ea8f3f7f4
      virtio_blk: Add in-flight list of requests

74f6d260e075443544522c0833dc2712dd93f49b
      virtio_blk: Rename VTBLK_FLAG_DETACHING to VTBLK_FLAG_DETACH

7aa549050f6fc6551c09c6362ed6b2a0728956ef
      virtio_blk: Finish all BIOs through vtblk_finish_bio()

      Also properly set bio_resid in the case of errors. Most geom_disk
      providers seem to do the same.

9eef6d0e6f7e5dd362f71ba097f2e2e4c3744882
      Added function to translate VirtIO status to error code

ef06adc337f31e1129d6d5f26de6d8d1be27bcd2
      Reset dumping flag when given unexpected parameters

393b3e390c644193a2e392220dcc6a6c50b212d9
      Added missing VTBLK_LOCK() in dump handler

Obtained from:	Bryan Venteicher  bryanv at daemoninthecloset dot org
2012-04-14 05:48:04 +00:00
adrian
62fe709d3a Both linux ath9k and the reference driver initialises the PLL here
during chip wakeup.

Obtained from:	Linux ath9k, Atheros
2012-04-14 04:40:11 +00:00
marius
6012fb8ec2 Add a driver for the NXP (Philips) PCF8563 RTC.
Obtained from:	NetBSD (pcf8563reg.h)
2012-04-13 23:07:32 +00:00
marius
7f51c048df Merge from x86:
r233961:

Fix interrupt load balancing regression, introduced in revision
222813, that left all un-pinned interrupts assigned to CPU 0.
In intr_shuffle_irqs(), remove CPU_SETOF() call that initialized
the "intr_cpus" cpuset to only contain CPU0.

This initialization is too late and nullifies the results of calls
to the intr_add_cpu() that occur much earlier in the boot process.

r234074 (partial):

The BSP is not added to the mask of valid target CPUs for interrupts.
Fix this by adding the BSP as an interrupt target directly in

r234105:

Fix !SMP build after r234074.

MFC after: 3 days
2012-04-13 22:58:23 +00:00
luigi
c185a54ef0 fix build with -Wformat -Wmissing-prototypes 2012-04-13 22:24:57 +00:00
adrian
7a1a60b83d Flesh out the rest of the AP96 board/config. 2012-04-13 20:23:32 +00:00
jpaetzel
4cba6f70cf Update to version 2.3.1.0
Obtained from:	Daniel Braniss <danny@cs.huji.ac.il>
2012-04-13 18:21:56 +00:00
adrian
d042200b97 * Enable ATH_EEPROM_FIRMWARE, now that it's a compile time option
* Tidy up things a bit.
2012-04-13 18:01:53 +00:00
adrian
9645923e3b Upgrade ATH_EEPROM_FIRMWARE to a configuration option. 2012-04-13 18:00:48 +00:00
luigi
3cef61f727 Properly disable crc stripping when operating in netmap mode.
Contrarily to what i wrote in my previous commit, the 82599
does include the CRC in the length. The operating mode is
reset in ixgbe_init_locked() and so we need to hook into
the places where the two registers (HLREG0 and RDRXCTL) are
modified.
2012-04-13 16:42:54 +00:00
luigi
e736e38132 add the new memory allocator for netmap, which allocates memory
in small clusters instead of one big contiguous chunk.
This was already enabled in the previous commit.
2012-04-13 16:32:33 +00:00
luigi
39cda1ca43 A bit of cleanup in the names of fields of netmap-related structures.
Use the name 'ring' instead of 'queue' in all fields.
Bump NETMAP_API.
2012-04-13 16:03:07 +00:00
luigi
9780a82bfd do not use a deprecated field in a structure. 2012-04-13 15:33:12 +00:00
adrian
d44a931d6b These are uboot, so mark them as such or booting from flash will not work. 2012-04-13 08:56:23 +00:00
adrian
018271e496 Introduce configuration files for AP94 and AP96.
This uses the new firmware(9) method for squirreling away the EEPROM
contents from SPI flash so ath(4) can get to them later.

It won't work out of the box just yet - you have to add this to
if_ath_pci.c:

#define ATH_EEPROM_FIRMWARE

.. until I've added it as a configuration option and updated things.
2012-04-13 08:52:25 +00:00
adrian
7f1f1b94c1 Introduce the ability to grab local EEPROM data from the firmware(9)
interface.

* Introduce a device hint, 'eeprom_firmware', which is the name of firmware
  to lookup.
* If the lookup succeeds, take a copy of it and use it as the eeprom data.

This isn't enabled by default - you have to define ATH_EEPROM_FIRMWARE.
I'll add it to the configuration variables in a later commit.

TODO:

* just keep a firmware reference in ath_softc, and remove the need to
  waste the extra memory in having sc_eepromdata be a malloc()ed block.
2012-04-13 08:48:38 +00:00
adrian
6b1a85efd5 (ab)Use the firmware API to store away EEPROM calibration data for
future use by the ath(4) driver.

These embedded devices put the calibration/PCI bootstrap data on the
on board SPI flash rather than on an EEPROM connected to the NIC.
For some boards, there's two NICs and two sets of EEPROM data in the
main SPI flash.

The particulars:

* Introduce ath_fixup_size, which is the size of the EEPROM area in
  bytes.
* Create a firmware image with a name based on the PCI device identifier
  (bus/slot/device/function).
* Hide some verbose debugging behind 'bootverbose'.

ath(4) can then use this to load in the EEPROM data.

This requires AR71XX_ATH_EEPROM to be defined.
2012-04-13 08:45:50 +00:00
avg
72dc8f21cc add actual interrupt counters to back ipi_invlcache_counts
Otherwise one could run into a panic with COUNT_IPIS when cache
invalidation actually happened.

Reviewed by:	jhb
MFC after:	1 week
2012-04-13 07:18:19 +00:00
avg
3e4fba4c32 bump INTRCNT_COUNT values to reflect actual numbers of IPI counters
Maybe the numbers should be conditionalized on COUNT_IPIS

Reviewed by:	jhb
MFC after:	1 week
2012-04-13 07:15:40 +00:00
adrian
ef9e8f2fea Remove an unused variable. Grr. 2012-04-13 06:13:37 +00:00
adrian
10450b4fc3 Sync this code against what's in OpenWRT trunk.
* the openwrt code doesn't treat 0/0/0 any differently
  from other bus/slot/func combinations.
* A "local write" function writes to the LCONF area, and
  so I've added it.
* The PCI workaround at attach time uses this LCONF code,
  which it already did ..
* .. but it is a 4 byte write, not a 2 byte write.
  Even though it's PCIR_COMMAND which is a two byte PCI register.

Tested on:	AR7161
TODO:		The other two AR71xx derivatives
TODO:		More thoroughly stare at the datasheets I do have
		and if it indeed is incorrect, push fixes to both
		FreeBSD and Linux/OpenWRT.

Obtained from:	Linux OpenWRT
2012-04-13 06:11:24 +00:00
jh
2c87a056cc Apply changes from r234103 to ext2fs:
Return EPERM from ext2_setattr() when an user without PRIV_VFS_SYSFLAGS
privilege attempts to toggle SF_SETTABLE flags.

Flags are now stored to ip->i_flags in one place after all checks.

Also, remove SF_NOUNLINK from the checks because ext2fs doesn't support
that flag.

Reviewed by:	bde
2012-04-13 05:48:31 +00:00
adrian
2c73480574 Use strdup() on the name (and free it when it's done) so non-static names
can be used in firmware_register().
2012-04-13 04:22:42 +00:00
jhb
4b63f702b7 Update the ddb and gdb backends for the new 'trace_thread' hook.
It is implemented via db_trace_thread() for DDB and not implemented
for GDB.  This should have been part of r234190.

Pointy hat to:	jhb
Reported by:	jkim
MFC after:	1 week
2012-04-12 21:34:58 +00:00
grehan
476cf9ae6e Complete polled-mode operation by using a callout if the device will be
used in polled-mode. The callout invokes uart_intr, which rearms the timeout.
Implemented for bhyve, but generically useful for e.g. embedded bringup
when the interrupt controller hasn't been setup, or if it's not deemed
worthy to wire an interrupt line from a serial port.

Submitted by:	neel
Reviewed by:	marcel
Obtained from:	NetApp
MFC after:	3 weeks
2012-04-12 18:46:48 +00:00
jhb
20ac4e4f81 - Extend the KDB interface to add a per-debugger callback to print a
backtrace for an arbitrary thread (rather than the calling thread).
  A kdb_backtrace_thread() wrapper function uses the configured debugger
  if possible, otherwise it falls back to using stack(9) if that is
  available.
- Replace a direct call to db_trace_thread() in propagate_priority()
  with a call to kdb_backtrace_thread() instead.

MFC after:	1 week
2012-04-12 17:43:59 +00:00
jhb
51ec6999bb If a linker file contains at least one module, but all of the modules
fail to load (the MOD_LOAD event fails) during a kldload(2), unload the
linker file and fail the kldload(2) with ENOEXEC.

Reported by:	gcooper
MFC after:	1 week
2012-04-12 14:49:25 +00:00
luigi
6f01bcaac2 Apparently the length field in advanced descriptors
does not include the CRC irrespective of the setting
of CRCSTRIP. The 82599 data sheets (sec. 7.1.6) say differently.
Very strange. Need to check what happens on legacy descriptors,
but for the time being this restores functionality.
2012-04-12 14:06:05 +00:00
jhb
7e0aa0e933 Add OFED and the associated options and drivers to x86 LINT builds:
- Mark 'sdp' as requiring 'inet'.
- Always include "opt_inet.h" and "opt_inet6.h" and modify the IB
  driver Makefiles to honor WITH/WITHOUT_INET/INET6/_SUPPORT options
  to determine what should be enabled during a module build.
- Fix the mlxen(4) driver and the core IB code to compile without
  if INET is disabled (including when both INET and INET6 are disabled).

Reviewed by:	bz
MFC after:	2 weeks
2012-04-12 14:01:06 +00:00
jhb
4f83e11b0c Don't update if_obytes when transmitting packets. That is already done
in IFQ_HANDOFF() when the packet is passed to the start routine, so doing
it here resulted in double counting.

Reported by:	Alex Tutubalin  lexa lexa ru
MFC after:	1 week
2012-04-12 13:53:49 +00:00
trasz
0b02d58773 Refactor da(4) to remove one of two code paths used to query capacity
data.

Reviewed by:	ken, mav (earlier version)
Sponsored by:	The FreeBSD Foundation
2012-04-12 12:58:14 +00:00
ae
aac5c3d394 Read backup GPT header from the last LBA only when primary GPT header and
table aren't valid. If they are ok, use hdr_lba_alt value to read backup
header. This will make gptboot happy when GPT used atop of some GEOM
provider, e.g. GEOM_MIRROR.

Reviewed by:	pjd
MFC after:	2 weeks
2012-04-12 12:37:53 +00:00
luigi
3d2bebcc35 Some code restructuring to bring the memory allocator out of netmap.c
and make it easier to replace it with a different implementation.
On passing, also fix indentation.

NOTE: I know that #include "foo.c" is ugly, but the alternative
(add another entry to sys/conf/files, add a separate header with
structs and prototypes, and expose functions that are meant to
be private) looks even worse to me.
We need a more modular way to specify dependencies and build options.
2012-04-12 11:27:09 +00:00
kib
319ab382ef Add thread-private flag to indicate that error value is already placed
in td_errno. Flag is supposed to be used by syscalls returning
EJUSTRETURN because errno was already placed into the usermode frame
by a call to set_syscall_retval(9). Both ktrace and dtrace get errno
value from td_errno if the flag is set.

Use the flag to fix sigsuspend(2) error return ktrace records.

Requested by:	bde
MFC after:	1 week
2012-04-12 10:48:43 +00:00
luigi
aed1b833cf remove an unnecessary #define 2012-04-12 10:32:34 +00:00
luigi
8827ec0618 use correct selinfo pointer for the generic interrupt handler
(it is never used in current FreeBSD drivers).
2012-04-12 08:54:01 +00:00
thompsa
8d206d7279 Set the proto to LAGG_PROTO_NONE before calling the detach routine so packets
are discarded, this is an issue because lacp drops the lock which may allow
network threads to access freed memory. Expand the lock coverage so the
detach/attach happen atomically.

Submitted by:	Andrew Boyer (earlier version)
2012-04-12 01:07:17 +00:00
mckusick
7901256b30 Export vinactive() from kern/vfs_subr.c (e.g., make it no longer
static and declare its prototype in sys/vnode.h) so that it can be
called from process_deferred_inactive() (in ufs/ffs/ffs_snapshot.c)
instead of the body of vinactive() being cut and pasted into
process_deferred_inactive().

Reviewed by: kib
MFC after:   2 weeks
2012-04-11 23:01:11 +00:00
mckusick
54bf4ddfa2 Whitespace cleanup. 2012-04-11 22:43:40 +00:00
nwhitehorn
d89ad2c78a We don't need kcopy() in any of the remaining places it is used, so
remove it.

MFC after:	2 weeks
2012-04-11 22:23:50 +00:00
nwhitehorn
e1fa94aa4e Only manipulate the PGA_EXECUTABLE flag on managed pages. This is a proxy
for whether the page is physical. On dense phys mem systems (32-bit),
VM_PHYS_TO_PAGE will not return NULL for device memory pages if device
memory is above physical memory even if there is no allocated vm_page.
Attempting to use the returned page could then cause either memory
corruption or a page fault.
2012-04-11 21:56:55 +00:00
jhb
323a33eb5f Reapply r223198 which was reverted in the previous vendor import. Some
portions were already reapplied in r233708:
- Use a dedicated task to handle deferred transmits from the if_transmit
  method instead of reusing the existing per-queue interrupt task.
  Reusing the per-queue interrupt task could result in both an interrupt
  thread and the taskqueue thread trying to handle received packets on a
  single queue resulting in out-of-order packet processing.
- Call ether_ifdetach() earlier in igb_detach().
- Drain tasks and free taskqueues during igb_detach().

MFC after:	1 week
2012-04-11 21:33:45 +00:00
jhb
fff9fdab88 Trim stray blank line. 2012-04-11 21:00:33 +00:00
jhb
294ae9574d Allow device_busy() and device_unbusy() to be invoked while a device is
being attached.  This is implemented by adding a new DS_ATTACHING state
while a device's DEVICE_ATTACH() method is being invoked.  A driver is
required to not fail an attach of a busy device.  The device's state will
be promoted to DS_BUSY rather than DS_ACTIVE() if the device was marked
busy during DEVICE_ATTACH().

Reviewed by:	kib
MFC after:	1 week
2012-04-11 20:57:41 +00:00
nwhitehorn
31bc0a8bff Fix error in r233949. Synchronizing icaches on uncacheable pages turns out
not to be a good idea, and of course the PV entry list for a page is never
empty after the page has been mapped.
2012-04-11 20:28:05 +00:00
luigi
8c1a6fb8e5 A couple of changes related to ixgbe operation in netmap mode:
- add a sysctl, dev.netmap.ix_crcstrip, to control whether ixgbe should
  strip the CRC on received frames. Defaults to 0, which keeps the CRC.
  and improves performance when receiving min-sized (64-byte) frames.
  This matters because  min-sized frames is one of the standard
  benchmarks for switches and routers, some chipsets seem to issue
  read-modify-write cycles for PCIe transactions that are not a
  full cache line, and a min-sized frame triggers the bug, resulting
  in reduced throughput -- 9.7 instead of 14.88 Mpps -- and heavy
  bus load.

- for the time being, always look for incoming packets on a select/poll
  even if there has not been an interrupt in the meantime. This is
  only a temporary workaround for a probable race condition in keeping
  track of rx interrupts.
  Add a couple of diagnostic vars to help studying the problem.
2012-04-11 16:11:08 +00:00
jh
cfeeeaffc7 Restore the blank line incorrectly removed in r234104.
Pointed out by:	bde
2012-04-11 15:48:50 +00:00
luigi
bef4cb5b9b Enable prefetching of descriptors on the TX ring, using the same
values as in the Intel driver 3.8.21 for linux.  The fact that it
is standard in the above driver suggests that it has no bad side
effects.

But of course there must be a reason for enabling features, not
just "it does not harm", so here it is a good one:

Prefetching enables full line rate even using a single queue (14.88
Mpps, compared to ~12 Mpps without prefetch).  This in turn is
terribly useful when one wants to schedule traffic.

For obvious reasons the difference is only visible with netmap
or other high speed solutions, but presumably the advantage
should be in the order of a fraction of a microsecond when
starting transmission on an empty queue.

Discussed with Jack Vogel.

MFC after:	1 week
2012-04-11 15:02:14 +00:00
eadler
2a42c5c4e9 Return EBADF instead of EMFILE from dup2 when the second argument is
outside the range of valid file descriptors

PR:		kern/164970
Submitted by:	Peter Jeremy <peterjeremy@acm.org>
Reviewed by:	jilles
Approved by:	cperciva
MFC after:	1 week
2012-04-11 14:08:09 +00:00
glebius
1143c81c42 It is a logical error that in carp_multicast_cleanup()
we look at count of addresses on a particular vhid, we
should account number of addresses on cif.

To achieve this we need to run carp_attach() and
carp_detach() under appropriate cif lock.
2012-04-11 12:26:30 +00:00
yongari
f3299cb2fb Back out r228476.
r228476 fixed superfluous link UP/DOWN messages but broke IPMI
access during boot.  It's not clear why r228476 breaks IPMI and
should be revisited.

Reported by:	Paul Guyot <paulguyot <> ieee dot org >
MFC after:	1 week
2012-04-11 06:34:25 +00:00
marcel
bf8e062d1b uart_cpu_amd64.c and uart_cpu_i386.c (under sys/dev/uart) are
identical now that the bus spaces are unified under sys/x86.
Replace them with a single uart_cpu_x86.c.
o   delete uart_cpu_i386.c
o   move uart_cpu_amd64.c to uart_cpu_x86.c
o   update files.amd64 and files.i386 accordingly.
2012-04-11 02:42:01 +00:00
adrian
b6f2ea08ae Fix the default, non-superg compile.
Pointy-hat-to:	adrian
2012-04-11 02:34:32 +00:00
nwhitehorn
5ff92b1d9c Do not restore the register holding the TLS pointer when doing various
usermode context switches (long jumps and ucontext operations). If these
are used across threads, multiple threads can end up with the same TLS base.
Madness will then result.

This makes behavior on PPC match that on x86 systems and on Linux.

MFC after:	10 days
2012-04-11 00:00:40 +00:00
adrian
f0d73b6741 Fix compilation with IEEE80211_ENABLE_SUPERG defined.
PR:		kern/164951
2012-04-10 19:47:44 +00:00
adrian
c2d174fcfd Convert the flags over to a set of bit flags. 2012-04-10 19:25:43 +00:00
jimharris
9daad3b04b Queue CCBs internally instead of using CAM_REQUEUE_REQ status. This fixes
problem where userspace apps such as smartctl fail due to CAM_REQUEUE_REQ
status getting returned when tagged commands are outstanding when smartctl
sends its I/O using the pass(4) interface.

Sponsored by: Intel
Found and tested by: Ravi Pokala <rpokala at panasas dot com>
Reviewed by: scottl
Approved by: scottl
MFC after: 1 week
2012-04-10 16:33:19 +00:00
marius
c41a4a0d87 Fix !SMP build after r234074.
Reviewed by:	attilio, jhb
2012-04-10 16:08:46 +00:00
jh
6734f8805b Apply changes from r233787 to ext2fs:
- Use more natural ip->i_flags instead of vap->va_flags in the final
  flags check.
- Style improvements.

No functional change intended.

MFC after:	2 weeks
2012-04-10 16:05:52 +00:00
jh
a1ada6f9c6 - Return EPERM from ufs_setattr() when an user without PRIV_VFS_SYSFLAGS
privilege attempts to toggle SF_SETTABLE flags.
- Use the '^' operator in the SF_SNAPSHOT anti-toggling check.

Flags are now stored to ip->i_flags in one place after all checks.

Submitted by:	bde
2012-04-10 15:59:37 +00:00
jhb
2b061688be Properly parse 40G media types from newer Mellanox adapters that are
40G capable.  For now, map all 40G links to 40GBase-CR4.

MFC after:	2 weeks
2012-04-10 14:01:09 +00:00
jhb
4e983169d1 Add media types for 40G media that might be used with FreeBSD.
Reviewed by:	bz
MFC after:	2 weeks
2012-04-10 13:59:35 +00:00
adrian
69b4bd16e0 Blank the aggregate stats whenever the zero ioctl is called. 2012-04-10 07:27:42 +00:00
adrian
ac77a9a5bc Squirrel away SYNC interrupt debugging if it's enabled in the HAL.
Bus errors will show up as various SYNC interrupts which will be passed
back up to ath_intr().
2012-04-10 07:23:37 +00:00
adrian
6c067d61df Revert this for now - it may work for -8 and -9 and -HEAD, but not
"-HEAD driver + net80211 on -9 kernel."

I'll figure this out at some later stage.
2012-04-10 07:16:28 +00:00
adrian
ed6a13cb3e Squirrel away the SYNC interrupt in case we're doing interrupt debugging. 2012-04-10 07:11:33 +00:00
glebius
9a09be5774 M_DONTWAIT is a flag from historical mbuf(9)
allocator, not malloc(9) or uma(9) flag.
2012-04-10 06:52:39 +00:00
glebius
77af9d99aa M_DONTWAIT is a flag from historical mbuf(9)
allocator, not malloc(9) or uma(9) flag.
2012-04-10 06:52:21 +00:00
adrian
a5766a62c7 * Since the API changed along the -CURRENT path (december 2011),
add a FreeBSD_version check.  It should work fine for compiling
  on -HEAD, 9.x and 8.x.

* Conditionally compile the 11n options only when 11n is enabled.

The above changes allow the ath(4) driver to compile and run on
8.1-RELEASE (Hi old PC-BSD!) but with the 11n stuff disabled.

I've done a test against the net80211 and tools in 8.1-RELEASE.
The NIC used in testing is the AR2427 in an EEEPC.

Just to be clear - this change is to allow the -HEAD ath/hal/rate
code to run on 9.x _and_ 8.x with no source changes. However,
when running on earlier kernels, it should only be used for legacy
mode. (Don't define ATH_ENABLE_11N.)
2012-04-10 06:25:11 +00:00
glebius
dcb2500d91 CARP should be capable to run on if_bridge(4). Unfortunately,
this commit is not enough to enable CARP operation on
if_bridge(4), because the latter doesn't handle or even
initialize its ifp->if_link_state.

Reported by:	Alexander Lunev <sol289 gmail.com>
2012-04-10 05:42:48 +00:00
attilio
6c01a6d7fd BSP is not added to the mask of valid target CPUs for interrupts
in set_apic_interrupt_ids(). Besides, set_apic_interrupts_ids() is not
called in the !SMP case too.
Fix this by:
- Adding the BSP as an interrupt target directly in cpu_startup().
- Remove an obsolete optimization where the BSP are skipped in
  set_apic_interrupt_ids().

Reported by:	jh
Reviewed by:	jhb
MFC after:	3 days
X-MFC:		r233961
Pointy hat to:	me
2012-04-09 22:41:19 +00:00
jilles
4360dc9ca8 Remove unused and wrong SA_PROC internal signal property.
The SA_PROC signal property indicated whether each signal number is directed
at a specific thread or at the process in general. However, that depends on
how the signal was generated and not on the signal number. SA_PROC was not
used.
2012-04-09 21:58:58 +00:00
mav
e1ffe54fb7 Microoptimize cpu_search().
According to profiling, it makes one take 6% of CPU time on hackbench
with its million of context switches per second, instead of 8% before.
2012-04-09 18:24:58 +00:00
attilio
628004ddfb - Introduce a cache-miss optimization for consistency with other
accesses of the cache member of vm_object objects.
- Use novel vm_page_is_cached() for checks outside of the vm subsystem.

Reviewed by:	alc
MFC after:	2 weeks
X-MFC:		r234039
2012-04-09 17:05:18 +00:00
jhb
24b943fa1c Recognize the RDRAND instruction feature.
Submitted by:	Michael Fuckner  michael fuckner net
MFC after:	3 days
2012-04-09 15:20:16 +00:00
avg
7e7e5aed30 intpm: return only SMB bus error codes from SMB methods
PR:		kern/25733
MFC after:	5 days
2012-04-08 20:48:39 +00:00
avg
d41eccd983 intpm: reflect the fact that SB800 and later AMD chipsets are not supported
They do not have compatible configuration registers in PCI configuration
space.  Instead their configuration resides in AMD "PM I/O" space
(accessed via a pair of I/O space registers).

MFC after:	5 days
2012-04-08 19:58:38 +00:00
alc
e30063988b Fix mincore(2) so that it reports PG_CACHED pages as resident.
MFC after:	2 weeks
2012-04-08 18:25:12 +00:00
alc
a317fe0618 If a page belonging a reservation is cached, then mark the reservation so
that it will be freed to the cache pool rather than the default pool.
Otherwise, the cached pages within the reservation may be recycled sooner
than necessary.

Reported by:	Andrey Zonov
2012-04-08 17:00:46 +00:00
trasz
63d1a2cda8 Fix panic in ffs_reload(), which may happen when read-only filesystem
gets resized and then reloaded.

Reviewed by:	kib, mckusick (earlier version)
Sponsored by:	The FreeBSD Foundation
2012-04-08 13:44:55 +00:00
rwatson
b08941230a When allocation of labels on files is implicitly disabled due to MAC
policy configuration, avoid leaking resources following failed calls
to get and set MAC labels by file descriptor.

Reported by:	Mateusz Guzik <mjguzik at gmail.com> + clang scan-build
MFC after:	3 days
2012-04-08 11:01:49 +00:00
mckusick
614fd4fe5e Expand locking around identification of filesystem mount point when
accounting for I/O counts at completion of I/O operation. Also switch
from using global devmtx to vnode mutex to reduce contention.

Suggested and reviewed by: kib
2012-04-08 06:20:21 +00:00
mckusick
4bed5dbd8f Add I/O accounting to msdos filesystem.
Suggested and reviewed by: kib
2012-04-08 06:18:18 +00:00
mckusick
a94a9f9c50 Drop an unnecessary setting of si_mountpt when updating a UFS mount point.
Clearly it must have been set when the mount was done.

Reviewed by: kib
2012-04-08 06:14:49 +00:00
adrian
f16af2226b Add some statistics to track BAR TX. 2012-04-08 04:51:25 +00:00
adrian
e793af405d After reviewing the mcast/sleep station code a little, undo some brain
damage which I committed when I had less clue about such things.

Don't ever put normal data frames on the mcast software queue.
Just put mcast frames there if needed.

Pass the txq decision into ath_tx_normal_setup(), as we've already made
the decision.  Don't re-do it.

Whilst i'm here, add another random debugging statement.
2012-04-08 00:40:16 +00:00
stas
eba865fe0d - Revert part of r234005, which I did not intend to commit.
Sorry! :(
2012-04-07 23:51:16 +00:00
stas
93ce0935e7 - Add kernel config file for QEMU-emulated gumstix board. 2012-04-07 23:48:51 +00:00
stas
857374ab0e - Add new ARM kernel option QEMU_WORKAROUNDS which can be
used in the code which needs to implement some specific
  behaviour when being run under QEMU.
- Make PXA UART probe code to work under QEMU gumstix, which
  doesn't emulate all the ports properly.
2012-04-07 23:47:08 +00:00
gleb
75744aafa6 tmpfs supports only INT_MAX nodes due to limitations of unit number
allocator.

Replace UINT32_MAX checks with INT_MAX. Keeping more than 2^31 nodes in
memory is not likely to become possible in foreseeable feature and would
require new unit number allocator.

Discussed with: delphij
MFC after:	2 weeks
2012-04-07 15:30:46 +00:00
gleb
fb452e77b0 Add vfs_getopt_size. Support human readable file system options in tmpfs.
Increase maximum tmpfs file system size to 4GB*PAGE_SIZE on 32 bit archs.

Discussed with:	delphij
MFC after:	2 weeks
2012-04-07 15:27:34 +00:00
gleb
c4ba84f86e Add reserved memory limit sysctl to tmpfs.
Cleanup availble and used memory functions.
Check if free pages available before allocating new node.

Discussed with:	delphij
2012-04-07 15:23:51 +00:00
stas
1c82d1d4eb - Do not reinitialize the card if it is already running.
This fixes bootp on if_smc, as bootp code perform SIOCSIFADDR
  ioctl call immediately after sending the request (which causes
  if_init being called) which causes the adapter to drop all the
  packets received in the meantime.
2012-04-07 06:56:38 +00:00
adrian
1e4ae572e5 Do a dma sync before the descriptors are chained together.
I need to find a better place to do this..
2012-04-07 05:51:43 +00:00
adrian
8b30efff05 Break out the legacy duration and protection code into routines,
call these after rate control selection is done.

The duration/protection code wasn't working - it expected the rix to
be valid.  Unfortunately after I moved the rate control selection into
late in the process, the rix value isn't valid and thus the protection/
duration code would get things wrong.

HT frames are now correctly protected with an RTS and for the AR5416,
this involves having the aggregate frames be limited to 8K.

TODO:

* Fix up the DMA sync to occur just before the frame is queued to the
  hardware.  I'm adjusting the duration here but not doing the DMA
  flush.

* Doubly/triply ensure that the aggregate frames are being limited to
  the correct size, or the AR5416 will get unhappy when TXing RTS-protected
  aggregates.
2012-04-07 05:48:26 +00:00
adrian
960a4d850d As I thought, this is a bad idea. When forming aggregates, the RTS/CTS
stuff and rate control lookup is only done on the first frame.
2012-04-07 05:46:00 +00:00
adrian
89f805c29a Enforce the RTS aggregation limit if RTS/CTS protection is enabled;
if any subframes in an aggregate have different protection from the
first frame in the formed aggregate, don't add that frame to the
aggregate.

This is likely a suboptimal method (I think we'll mostly be OK marking
frames that have seqno's with the same protection as normal data frames)
but I'll just be cautious for now.
2012-04-07 03:22:11 +00:00
adrian
17c1a9e4d5 Store away the RTS aggregate limit from the HAL.
This will be used by some upcoming code to ensure that aggregates
are enforced to be a certain size.  The AR5416 has a limitation on
RTS protected aggregates (8KiB).
2012-04-07 02:51:53 +00:00
adrian
8e4ce17ba2 Remove duplicate txflags field from ath_buf.
rename bf_state.bfs_flags to bf_state.bfs_txflags, as that is what
it effectively is.
2012-04-07 02:01:26 +00:00
nwhitehorn
a8563567c4 Execute an initial ptesync if and only if the PTE is actually being
invalidated, as opposed to a ref/changed bit update.
2012-04-06 22:33:13 +00:00
ken
f5030474fb Change the SCSI INQUIRY peripheral qualifier that CTL reports for LUNs
that don't exist.

Anecdotal evidence indicates that it is better to return 011b (bad LUN)
than 001b (LUN offline).  However, this change also gives the user a
sysctl/tunable, kern.cam.ctl.inquiry_pq_no_lun, to override the change
and return to the previous behavior.  (The previous behavior was to
return 001b, or LUN offline.)

ctl.c:		Change the default inquiry peripheral qualifier to 011b,
		and add a sysctl and tunable to allow the user to change
		it back to 001b if needed.

		Don't insert a Copan copyright statement in the inquiry
		data.  The copyright statements on the files are
		sufficient.

ctl_private.h:	Add sysctl variable context to the CTL softc.

ctl_cmd_table.c,
ctl_frontend_internal.c,
ctl_frontend.c,
ctl_backend.c,
ctl_error.c:	Include sys/sysctl.h.

MFC after:	3 days
2012-04-06 22:23:13 +00:00
gibbs
ea2a8fbda5 Fix interrupt load balancing regression, introduced in revision
222813, that left all un-pinned interrupts assigned to CPU 0.

sys/x86/x86/intr_machdep.c:
	In intr_shuffle_irqs(), remove CPU_SETOF() call that initialized
	the "intr_cpus" cpuset to only contain CPU0.

	This initialization is too late and nullifies the results of calls
	the intr_add_cpu() that occur much earlier in the boot process.
	Since "intr_cpus" is statically initialized to the empty set, and
	all processors, including the BSP, already add themselves to
	"intr_cpus" no special initialization for the BSP is necessary.

MFC after:	3 days
2012-04-06 21:19:28 +00:00
attilio
4bd8f1d188 Staticize vm_page_cache_remove().
Reviewed by:	alc
2012-04-06 20:34:00 +00:00
nwhitehorn
6c37128bdd Substantially reduce the scope of the locks held in pmap_enter(), which
improves concurrency slightly.
2012-04-06 18:18:48 +00:00
alc
8466366587 Micro-optimize free_pv_entry() for the expected case. 2012-04-06 16:41:19 +00:00
nwhitehorn
9c79ae8eea Reduce the frequency that the PowerPC/AIM pmaps invalidate instruction
caches, by invalidating kernel icaches only when needed and not flushing
user caches for shared pages.

Suggested by:	kib
MFC after:	2 weeks
2012-04-06 16:03:38 +00:00
nwhitehorn
e05a39a26d Give the kernel pmap lock a different name than user pmap locks. It has
(slightly) different semantics and renaming it prevents a (harmless)
WITNESS warning during bootup for 32-bit kernels on 64-bit CPUs.

MFC after:	5 days
2012-04-06 16:00:37 +00:00
melifaro
ff7ed324aa Fix build broken by r233938.
Pointed by:     David Wolfskill <david@catwhisker.org>
Approved by:    kib (mentor)
Pointy hat to:  melifaro
2012-04-06 13:34:19 +00:00
avg
3e6288731d retrofit Safe Mode loader menu item actions
The menu item is now made completely independent with the ACPI item - most
modern systems seem to require ACPI and become even more "unsafe"
without it.
Safe Mode no longer disables APIC for the same reason.
kbdmux is not disabled as this feature has proven itself stable.

New actions:
- SMP is disabled in the Safe Mode now
- eventtimers are forced to periodic mode (some real and virtual systems
  seem to have problems otherwise)
- geom extra vigorous integrity checking is disabled, this is to
  facilitate migration from previous versions

Possible short term to do:
- make SMP switch a separate menu item
- restore APIC switch as a separate menu item

Longer term to do:
- turn various tweaks into separate menu items in a Safe Mode sub-menu

Please consider adding a safety tweak to Safe Mode when introducing
new major features or changes that may cause instabilities.

Discussed with:	jhb, scottl, Devin Teske
MFC after:	3 weeks (stable/9 only)
2012-04-06 09:36:22 +00:00
tuexen
19af6e5d9d Remove duplicate condition in if statement.
Obtained from: brucec@
MFC after: 3 days
2012-04-06 09:03:02 +00:00
pluknet
9d2114a1e8 Free ballooned pages with the corresponding malloc type.
MFC after:	1 week
2012-04-06 08:13:29 +00:00
melifaro
85ccef88d3 - Improve performace for writer-only BPF users.
Linux and Solaris (at least OpenSolaris) has PF_PACKET socket families to send
raw ethernet frames. The only FreeBSD interface that can be used to send raw frames
is BPF. As a result, many programs like cdpd, lldpd, various dhcp stuff uses
BPF only to send data. This leads us to the situation when software like cdpd,
being run on high-traffic-volume interface significantly reduces overall performance
since we have to acquire additional locks for every packet.

Here we add sysctl that changes BPF behavior in the following way:
If program came and opens BPF socket without explicitly specifyin read filter we
assume it to be write-only and add it to special writer-only per-interface list.
This makes bpf_peers_present() return 0, so no additional overhead is introduced.
After filter is supplied, descriptor is added to original per-interface list permitting
packets to be captured.

Unfortunately, pcap_open_live() sets catch-all filter itself for the purpose of
setting snap length.

Fortunately, most programs explicitly sets (event catch-all) filter after that.
tcpdump(1) is a good example.

So a bit hackis approach is taken: we upgrade description only after second
BIOCSETF is received.

Sysctl is named net.bpf.optimize_writers and is turned off by default.

- While here, document all sysctl variables in bpf.4

Sponsored by Yandex LLC

Reviewed by:    glebius (previous version)
Reviewed by:    silence on -net@
Approved by:    (mentor)

MFC after:      4 weeks
2012-04-06 06:55:21 +00:00
melifaro
8b1d10268c - Improve BPF locking model.
Interface locks and descriptor locks are converted from mutex(9) to rwlock(9).
This greately improves performance: in most common case we need to acquire 1
reader lock instead of 2 mutexes.

- Remove filter(descriptor) (reader) lock in bpf_mtap[2]
This was suggested by glebius@. We protect filter by requesting interface
writer lock on filter change.

- Cover struct bpf_if under BPF_INTERNAL define. This permits including bpf.h
without including rwlock stuff. However, this is is temporary solution,
struct bpf_if should be made opaque for any external caller.

Found by:       Dmitrij Tejblum <tejblum@yandex-team.ru>
Sponsored by:   Yandex LLC

Reviewed by:    glebius (previous version)
Reviewed by:    silence on -net@
Approved by:    (mentor)

MFC after:      3 weeks
2012-04-06 06:53:58 +00:00
jhb
5829de48d9 Add new ktrace records for the start and end of VM faults. This gives
a pair of records similar to syscall entry and return that a user can
use to determine how long page faults take.  The new ktrace records are
enabled via the 'p' trace type, and are enabled in the default set of
trace points.

Reviewed by:	kib
MFC after:	2 weeks
2012-04-05 17:13:14 +00:00
avg
69c46e84cc zfs_ioctl: no need for ddi_copyin/out here because sys_ioctl handles that
On FreeBSD the direct ioctl argument is automatically copied in/out
as necesary by the kernel ioctl entry point.

PR:		kern/164445
Submitted by:	Luis Garces-Erice <lge@ieee.org>
Tested by:	Attila Nagy <bra@fsn.hu>
MFC after:	5 days
2012-04-05 07:59:59 +00:00
ae
333953a424 Fix VIMAGE build. 2012-04-05 04:41:06 +00:00
davidxu
cc55f4943b In sem_post, the field _has_waiters is no longer used, because some
application destroys semaphore after sem_wait returns. Just enter
kernel to wake up sleeping threads, only update _has_waiters if
it is safe. While here, check if the value exceed SEM_VALUE_MAX and
return EOVERFLOW if this is true.
2012-04-05 03:05:02 +00:00
davidxu
8c31e244f2 umtx operation UMTX_OP_MUTEX_WAKE has a side-effect that it accesses
a mutex after a thread has unlocked it, it event writes data to the mutex
memory to clear contention bit, there is a race that other threads
can lock it and unlock it, then destroy it, so it should not write
data to the mutex memory if there isn't any waiter.
The new operation UMTX_OP_MUTEX_WAKE2 try to fix the problem. It
requires thread library to clear the lock word entirely, then
call the WAKE2 operation to check if there is any waiter in kernel,
and try to wake up a thread, if necessary, the contention bit is set again
by the operation. This also mitgates the chance that other threads find
the contention bit and try to enter kernel to compete with each other
to wake up sleeping thread, this is unnecessary. With this change, the
mutex owner is no longer holding the mutex until it reaches a point
where kernel umtx queue is locked, it releases the mutex as soon as
possible.
Performance is improved when the mutex is contensted heavily.  On Intel
i3-2310M, the runtime of a benchmark program is reduced from 26.87 seconds
to 2.39 seconds, it even is better than UMTX_OP_MUTEX_WAKE which is
deprecated now. http://people.freebsd.org/~davidxu/bench/mutex_perf.c
2012-04-05 02:24:08 +00:00
adrian
4b2962c0bd Implement BAR TX.
A BAR frame must be transmitted when an frame in an A-MPDU session fails
to transmit - it's retried too often, or it can't be cloned for
re-transmission.  The BAR frame tells the remote side to advance the
left edge of the block-ack window (BAW) to a new value.

In order to do this:

* TX for that particular node/TID must be paused;
* The existing frames in the hardware queue needs to be completed, whether
  they're TXed successfully or otherwise;
* The new left edge of the BAW is then communicated to the remote side
  via a BAR frame;
* Once the BAR frame has been sucessfully TXed, aggregation can resume;
* If the BAR frame can't be successfully TXed, the aggregation session
  is torn down.

This is a first pass that implements the above.  What needs to be done/
tested:

* What happens during say, a channel reset / stuck beacon _and_ BAR
  TX.  It _should_ be correctly buffered and retried once the
  reset has completed.  But if a bgscan occurs (and they shouldn't,
  grr) the BAR frame will be forcibly failed and the aggregation session
  will be torn down.

  Yes, another reason to disable bgscan until I've figured this out.

* There's way too much locking going on here.  I'm going to do a couple
  of further passes of sanitising and refactoring so the (re) locking
  isn't so heavy.  Right now I'm going for correctness, not speed.

* The BAR TX can fail if the hardware TX queue is full.  Since there's
  no "free" space kept for management frames, a full TX queue (from eg
  an iperf test) can race with your ability to allocate ath_buf/mbufs
  and cause issues.  I'll knock this on the head with a subsequent
  commit.

* I need to do some _much_ more thorough testing in hostap mode to ensure
  that many concurrent traffic streams to different end nodes are correctly
  handled.  I'll find and squish whichever bugs show up here.

But, this is an important step to being able to flip on 802.11n by default.
The last issue (besides bug fixes, of course) is HT frame protection and
I'll address that in a subsequent commit.
2012-04-04 23:45:15 +00:00
adrian
7752b3142e Track and optionally log the actual sync interrupt cause.
These are involved in tracking host interface issues (ie, PCI/PCIe/AHB
interface.)
2012-04-04 22:51:50 +00:00
adrian
5f2a4bba47 Disable the HWQ contents upon a TX queue reset, rather than a TX queue flush.
This is designed to assist in figuring out what the hardware state is
when something like a queue hang has occured.
2012-04-04 22:24:11 +00:00
adrian
16c6304ced Now that I've fixed the BAW TX hangs, disable this verbose debugging
again.
2012-04-04 22:22:50 +00:00
jkim
10109cda1b Save and restore VGA display memory between suspend and resume. 2012-04-04 22:02:54 +00:00
adrian
866c0e9fc2 Correctly handle AR_MoreAggr when assembling multi-descriptor final frames.
Linux ath9k doesn't have this issue as it doesn't try queuing multi-
descriptor frames to the hardware.

Before, I was only setting the first and last descriptor in the final
frame correctly - and that was done by accident. The first descriptor in
the last sub-frame was being correctly updated by ath_tx_setds_11n();
the last descriptor in the last sub-frame was being correctly updated
by ath_buf_set_rate(). But both of those are "incorrect".

The correct behaviour is:

* AR_IsAggr is set for all descriptors for all subframes in an aggregate.
* AR_MoreAggr is set for all descriptors for all non-final sub-frames
  in an aggregate.

Ie, all descriptors in the last sub-frame of an aggregate must have this
field set to 0.

I still need to do a couple of extra passes to ensure the pad delimiter
field is being correctly handled in all descriptors in the last sub-frame.
2012-04-04 21:49:49 +00:00
jkim
1a4c884636 Do not copy VESA state buffer if the VBE call has failed for any reason.
Do not unnecessarily clear the state buffer before calling the function.
2012-04-04 21:38:26 +00:00
jhb
452f890f26 Disable INET6 support in modules when building the LINT-NOINET6 kernel.
Reviewed by:	bz
MFC after:	1 week
2012-04-04 21:31:20 +00:00
jkim
a390e7b468 Remove a useless warning. The mode information is unused for very long time
and this function may be used with VESA mode since r232069.
2012-04-04 21:19:55 +00:00
marius
cc107cbdf9 - Const'ify the device lookup-table.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
- Enable support for flow control.
  Tested by: yongari

MFC after:	1 week
2012-04-04 21:09:02 +00:00
adrian
a544afcdc9 Add a threadid to the ah_decode API.
This adds the current thread ID to each logged register and mark entry,
allowing for easier debugging of concurrent/overlapping NIC operations.
2012-04-04 20:46:20 +00:00
marius
3eaf24f2ea Refine r233827; as it turns out, controllers with a device ID of 0x0059
can be upgraded to MegaRAID mode, in which case mfi(4) should attach to
these based on the sub-vendor and -device ID instead (not currently done).
Therefore, let mpt_pci_probe() return BUS_PROBE_LOW_PRIORITY.
While it, let mpt_pci_probe() return BUS_PROBE_DEFAULT instead of 0 in
the default case.

MFC after:	3 days
2012-04-04 20:42:45 +00:00
adrian
c754ab5e30 Disable a specific Merlin hardware workaround which may cause hangs on some
PCIe controllers.

Obtained from:	Atheros / Linux
2012-04-04 20:42:32 +00:00
jkim
5c66ff4b78 - Do not include machine/atomic.h. It is no longer necessary since r233768.
- Remove bogus "atomic" macros and a read-only variable from softc.

Reviewed by:	ambrisko
2012-04-04 16:15:40 +00:00
jh
7494e2ac3f Add a check for unsupported file flags to ufs_setattr().
Discussed with:	bde
MFC after:	2 weeks
2012-04-04 14:50:21 +00:00
glebius
7676adf25d Merge from OpenBSD:
revision 1.173
  date: 2011/11/09 12:36:03;  author: camield;  state: Exp;  lines: +11 -12
  State expire time is a baseline time ("last active") for expiry
  calculations, and does _not_ denote the time when to expire.  So
  it should never be added to (set into the future).

  Try to reconstruct it with an educated guess on state import and
  just set it to the current time on state updates.

  This fixes a problem on pfsync listeners where the expiry time
  could be double the expected value and cause a lot more states
  to linger.
2012-04-04 14:47:59 +00:00
jhb
b45da04a8e Add descriptions after the 'device' line for several NICs to match the
existing style.
2012-04-04 13:49:22 +00:00
jhb
d2b8d8566c Fix build on i386. 2012-04-04 11:55:20 +00:00
np
307ef13f94 - Remove redundant call to pr_ctloutput from code that handles SO_SETFIB.
- Add a check for errors during copyin while here.

Reviewed by:	julian, bz
MFC after:	2 weeks
2012-04-03 18:38:00 +00:00
glebius
4a817f8ca2 Since pf 4.5 import pf(4) has a mechanism to defer
forwarding a packet, that creates state, until
pfsync(4) peer acks state addition (or 10 msec
timeout passes).

This is needed for active-active CARP configurations,
which are poorly supported in FreeBSD and arguably
a good idea at all.

Unfortunately by the time of import this feature in
OpenBSD was turned on, and did not have a switch to
turn it off. This leaked to FreeBSD.

This change make it possible to turn this feature
off via ioctl() and turns it off by default.

Obtained from:	OpenBSD
2012-04-03 18:09:20 +00:00
bschmidt
d54496d7dd Add basic HT channel setup to ieee80211_init_channels(), this will be
used by at least ral(4).

Reviewed by:	ray
2012-04-03 17:48:42 +00:00
marius
ae90e605cb Fix probing of SAS1068E with a device ID of 0x0059 after r232411.
Reported by:	infofarmer

MFC after:	3 days
2012-04-03 08:28:43 +00:00
mckusick
dc3cb66f4c A file cannot be deallocated until its last name has been removed
and it is no longer referenced by a user process. The inode for a
file whose name has been removed, but is still referenced at the
time of a crash will still be allocated in the filesystem, but will
have no references (e.g., they will have no names referencing them
from any directory).

With traditional soft updates these unreferenced inodes will be
found and reclaimed when the background fsck is run. When using
journaled soft updates, the kernel must keep track of these inodes
so that it can find and reclaim them during the cleanup process.
Their existence cannot be stored in the journal as the journal only
handles short-term events, and they may persist for days. So, they
are tracked by keeping them in a linked list whose head pointer is
stored in the superblock. The journal tracks them only until their
linked list pointers have been commited to disk. Part of the cleanup
process involves traversing the list of unreferenced inodes and
reclaiming them.

This bug was triggered when confusion arose in the commit steps
of keeping the unreferenced-inode linked list coherent on disk.
Notably, a race between the link() system call adding a link-count
to a file and the unlink() system call removing a link-count to
the file. Here if the unlink() ran after link() had looked up
the file but before link() had incremented the link-count of the
file, the file's link-count would drop to zero before the link()
incremented it back up to one. If the file was referenced by a
user process, the first transition through zero made it appear
that it should be added to the unreferenced-inode list when in
fact it should not have been added. If the new name created by
link() was deleted within a few seconds (with the file still
referenced by a user process) it would legitimately be a candidate
for addition to the unreferenced-inode list. The result was that
there were two attempts to add the same inode to the unreferenced-inode
list which scrambled the unreferenced-inode list's pointers leading
to a panic. The fix is to detect and avoid the false attempt at
adding it to the unreferenced-inode list by having the link()
system call check to see if the link count is zero before it
increments it. If it is, the link() fails with ENOENT (showing that
it has failed the link()/unlink() race).

While tracking down this bug, we have added additional assertions
to detect the problem sooner and also simplified some of the code.

Reported by:      Kirk Russell
Fix submitted by: Jeff Roberson
Tested by:        Peter Holm
PR:               kern/159971
MFC (to 9 only):  2 weeks
2012-04-02 21:58:37 +00:00
kib
ff6239a557 When process exists, not only the children shall be reparented to
init, but also the orphans shall be removed from the orphan list,
because the list header is destroyed.

Reported and tested by:	pho
MFC after:	3 days
2012-04-02 19:35:36 +00:00
kib
9ad701f91f Add helper function to remove the process from the orphans list and
use it instead of inlined code.

Tested by:	pho
MFC after:	3 days
2012-04-02 19:34:56 +00:00
ambrisko
d5677c25b7 Move struct megasas_sge from mfi_ioctl.h to mfivar.h so we can
remove including machine/bus.h.  Add some more mfi_ prefixes to
avoid name space pollution.

This should address the last tinderbox issues.
2012-04-02 19:13:02 +00:00
jhb
f021ea5434 Further tweak the changes made in r233709. The kernel doesn't permit
sleeping from a swi handler (even though in this case it would be ok), so
switch the refill and scanning SWI handlers to being tasks on a fast
taskqueue.  Also, only schedule the refill task for a CMCI as an MC# can
fire at any time, so it should do the minimal amount of work needed and
avoid opportunities to deadlock before it panics (such as scheduling a
task it won't ever need in practice).  To handle the case of an MC# only
finding recoverable errors (which should never happen), always try to
refill the event free list when the periodic scan executes.

MFC after:	2 weeks
2012-04-02 17:26:21 +00:00
jh
b0414ca221 - Use more natural ip->i_flags instead of vap->va_flags in the final
flags check.
- Add a comment for the immutable/append check done after handling of
  the flags.
- Style improvements.

No functional change intended.

Submitted by:	bde
MFC after:	2 weeks
2012-04-02 16:33:21 +00:00
jhb
e14923e42f Make machine check exception logging more readable. On newer Intel systems,
an uncorrected ECC error tends to fire on all CPUs in a package
simultaneously and the current printf hacks are not sufficient to make
the messages legible.  Instead, use the existing mca_lock spinlock to
serialize calls to mca_log() and change the machine check code to panic
directly when an unrecoverable error is encoutered rather than falling
back to a trap_fatal() call in trap() (which adds nearly a screen-full of
logging messages that aren't useful for machine checks).

MFC after:	2 weeks
2012-04-02 15:07:22 +00:00
jchandra
74a66a0c1f Reinstate the XTLB handler for CPU_NLM and CPU_RMI
These platforms set the KX bit even when booted in 32 bit mode. So
the XLTB handler is needed even when __mips_n64 is not defined.
2012-04-02 11:41:33 +00:00
hselasky
126953ccbe Fix compiler warnings, mostly signed issues,
when USB modules are compiled with WARNS=9.

MFC after:	1 weeks
2012-04-02 10:50:42 +00:00
hselasky
0a4ce326a7 Add definitions and structures for USB 2.0 Link Power Management, LPM.
MFC after:	2 weeks
2012-04-02 07:51:30 +00:00
ambrisko
c106dff7db Change typedef atomic_t to struct mfi_atomic to avoid name space
collision and some couple more style changes.
2012-04-02 02:22:22 +00:00
gonzo
1080fa2ace Remove extra semicolon which rendered condition useless
Submitted by:	Stefan Farfelder <stefanf@FreeBSD.org>
2012-04-02 00:11:26 +00:00
jhb
506e2f15b9 Export some more useful info about shared memory objects to userland
via procstat(1) and fstat(1):
- Change shm file descriptors to track the pathname they are associated
  with and add a shm_path() method to copy the path out to a caller-supplied
  buffer.
- Use the fo_stat() method of shared memory objects and shm_path() to
  export the path, mode, and size of a shared memory object via
  struct kinfo_file.
- Add a struct shmstat to the libprocstat(3) interface along with a
  procstat_get_shm_info() to export the mode and size of a shared memory
  object.
- Change procstat to always print out the path for a given object if it
  is valid.
- Teach fstat about shared memory objects and to display their path,
  mode, and size.

MFC after:	2 weeks
2012-04-01 18:22:48 +00:00
theraven
d5bc632dfb Bump __FreeBSD_version for xlocale cleanup, as requested by ports people.
Approved by:	dim (mentor)
2012-04-01 09:35:23 +00:00
marius
83cd4d31b1 Remove checks that are redundant due to tf_type being unsigned.
MFC after:	3 days
2012-03-31 14:03:16 +00:00
marius
6b88064bda Fix panic on kernel traps having a mapping in trap_sig b0rked in r206086.
Repored by:	David E. Cross

MFC after:	3 days
2012-03-31 13:56:24 +00:00
mav
4409a9b5ec Be more conservative in using READ CAPACITY(16) command. Previous code
checked PROTECT bit in INQUIRY data for all SPC devices, while it is defined
only since SPC-3. But there are some SPC-2 USB devices were reported, that
have PROTECT bit set, return no error for READ CAPACITY(16) command, but
return wrong sector count value in response.

MFC after:	3 days
2012-03-31 11:23:09 +00:00
glebius
03c053c63d Don't check malloc(M_WAITOK) results. 2012-03-31 11:20:48 +00:00
davidxu
42d5de0c66 Remove stale comments. 2012-03-31 06:48:41 +00:00
ambrisko
af288dfa91 MFhead_mfi r227068
First cut of new HW support from LSI and merge into FreeBSD.
	Supports Drake Skinny and ThunderBolt cards.
MFhead_mfi r227574
	Style
MFhead_mfi r227579
	Use bus_addr_t instead of uintXX_t.
MFhead_mfi r227580
	MSI support
MFhead_mfi r227612
	More bus_addr_t and remove "#ifdef __amd64__".
MFhead_mfi r227905
	Improved timeout support from Scott.
MFhead_mfi r228108
	Make file.
MFhead_mfi r228208
	Fixed botched merge of Skinny support and enhanced handling
	in call back routine.
MFhead_mfi r228279
	Remove superfluous !TAILQ_EMPTY() checks before TAILQ_FOREACH().
MFhead_mfi r228310
	Move mfi_decode_evt() to taskqueue.
MFhead_mfi r228320
	Implement MFI_DEBUG for 64bit S/G lists.
MFhead_mfi r231988
	Restore structure layout by reverting the array header to
	use [0] instead of [1].
MFhead_mfi r232412
	Put wildcard pattern later in the match table.
MFhead_mfi r232413
	Use lower case for hexadecimal numbers to match surrounding
	style.
MFhead_mfi r232414
	Add more Thunderbolt variants.
MFhead_mfi r232888
	Don't act on events prior to boot or when shutting down.
	Add hw.mfi.detect_jbod_change to enable or disable acting
	on JBOD type of disks being added on insert and removed on
	removing.  Switch hw.mfi.msi to 1 by default since it works
	better on newer cards.
MFhead_mfi r233016
	Release driver lock before taking Giant when deleting children.
	Use TAILQ_FOREACH_SAFE when items can be deleted.  Make code a
	little simplier to follow.  Fix a couple more style issues.
MFhead_mfi r233620
	Update mfi_spare/mfi_array with the actual number of elements
	for array_ref and pd.  Change these max. #define names to avoid
	name space collisions.  This will require an update to mfiutil
	It avoids mfiutil having to do a magic calculation.

	Add a note and #define to state that a "SYSTEM" disk is really
	what the firmware calls a "JBOD" drive.

Thanks to the many that helped, LSI for the initial code drop,
mav, delphij, jhb, sbruno that all helped with code and testing.
2012-03-30 23:05:48 +00:00
dim
8251950705 Fix the following compilation warning with clang trunk in isci(4):
sys/dev/isci/isci_task_request.c:198:7: error: case value not in enumerated type 'SCI_TASK_STATUS' (aka 'enum _SCI_TASK_STATUS') [-Werror,-Wswitch]
    case SCI_FAILURE_TIMEOUT:
           ^

This is because the switch is done on a SCI_TASK_STATUS enum type, but
the SCI_FAILURE_TIMEOUT value belongs to SCI_STATUS instead.

Because the list of SCI_TASK_STATUS values cannot be modified at this
time, use the simplest way to get rid of this warning, which is to cast
the switch argument to int.  No functional change.

Reviewed by:	jimharris
MFC after:	3 days
2012-03-30 22:52:08 +00:00
jhb
98a50d54e5 Attempt to make machine check handling a bit more robust:
- Don't malloc() new MCA records for machine checks logged due to a
  CMCI or MC# exception.  Instead, use a pre-allocated pool of records.
  When a CMCI or MC# exception fires, schedule a swi to refill the pool.
  The pool is sized to hold at least one record per available machine
  bank, and one record per CPU. This should handle the case of all CPUs
  triggering a single bank at once as well as the case a single CPU
  triggering all of its banks.  The periodic scans still use malloc()
  since they are run from a safe context.
- Since we have to create an swi to handle refills, make the periodic scan
  a second swi for the same thread instead of having a separate taskqueue
  thread for the scans.

Suggested by:	mdf (avoiding malloc())
MFC after:	2 weeks
2012-03-30 20:17:39 +00:00
jhb
0f7a6b4eb7 Fix a few issues with transmit handling in em(4) and igb(4):
- Do not define the foo_start() methods or set if_start in the ifnet if
  multiq transmit is enabled.  Also, set if_transmit and if_qflush before
  ether_ifattach rather than after when multiq transmit is enabled.  This
  helps to ensure that the drivers never try to mix different transmit
  methods.
- Properly restart transmit during resume.  igb(4) was not restarting it
  at all, and em(4) was restarting even if the link was down and was
  calling the wrong method if multiq transmit was enabled.
- Remove all the 'more' handling for transmit completions.  Transmit
  completion processing does not have a processing limit, so it always
  runs to completion and never has more work to do when it returns.
  Instead, the previous code was returning 'true' anytime there were
  packets in the queue that weren't still in the process of being
  transmitted.  The effect was that the driver would continuously
  reschedule a task to process TX completions in effect running at 100%
  CPU polling the hardware until it finished transmitting all of the
  packets in the ring.  Now it will just wait for the next TX completion
  interrupt.
- Restart packet transmission when the link becomes active.
- Fix the MSI-X queue interrupt handlers to restart packet transmission if
  there are pending packets in the relevant software queue (IFQ or buf_ring)
  after processing TX completions.  This is the root cause for the OACTIVE
  hangs as if the MSI-X queue handler drained all the pending packets from
  the TX ring, nothing would ever restart it.  As such, remove some
  previously-added workarounds to reschedule a task to poll the TX ring
  anytime OACTIVE was set.

Tested by:	sbruno
Reviewed by:	jfv
MFC after:	1 week
2012-03-30 19:54:48 +00:00
jhb
1dd88fec2f Move the legacy(4) driver to x86. 2012-03-30 19:10:14 +00:00
jkim
95442a8d40 Re-initialize model-specific MSRs when we resume CPUs.
MFC after:	1 week
2012-03-30 17:03:06 +00:00
jkim
6cd4f25011 Work around Erratum 721 for AMD Family 10h and 12h processors.
"Under a highly specific and detailed set of internal timing conditions,
the processor may incorrectly update the stack pointer after a long series
of push and/or near-call instructions, or a long series of pop and/or
near-return instructions.  The processor must be in 64-bit mode for this
erratum to occur."

MFC after:	3 days
2012-03-30 16:32:41 +00:00
marius
07d4fd218f - Remove erroneous trailing semicolon. [1]
- Correctly determine the maximum payload size for setting the TX link
  frequent NACK latency and replay timer thresholds.

Submitted by:	stefanf [1]
MFC after:	3 days
2012-03-30 15:08:09 +00:00
davidxu
0bd3403eb7 Remove trailing semicolon, it is a typo. 2012-03-30 12:57:14 +00:00
davidxu
febc18f31b Fix COMPAT_FREEBSD32 build.
Submitted by: Andreas Tobler < andreast at fgznet dot ch >
2012-03-30 09:03:53 +00:00
mav
e01941a391 Reenable unsolicited responses on CODEC if hdaa_sense_init() called again.
This fixes jack connection events handling after suspend/resume.

PR:		kern/166382
MFC after:	1 week
2012-03-30 08:33:08 +00:00
davidxu
f7f769bc6d Remove trailing space. 2012-03-30 05:49:32 +00:00
davidxu
5faf75d34c Merge umtxq_sleep and umtxq_nanosleep into a single function by using
an abs_timeout structure which describes timeout info.
2012-03-30 05:40:26 +00:00
yongari
4b53a41d86 Do not report current link status if driver is not running.
This change also workarounds dhclient's link state handling bug by
not giving current link status.

Unlike other controllers, ale(4)'s PHY hibernation perfectly works
such that driver does not see a valid link if the controller is not
brought up.  If dhclient(8) runs on ale(4) it will blindly waits
until link UP and then gives up after 10 seconds.  Because
dhclient(8) still thinks interface got a valid link when IFM_AVALID
is not set for selected media,  this change makes dhclient initiate
DHCP without waiting for link UP.
2012-03-30 05:27:05 +00:00
yongari
6ab556156c Remove task queue based link state change handler. Driver no longer
needs to defer link state handling.
While I'm here, mark IFF_DRV_RUNNING before changing media.  If
link is established without any delay, that link state change
handling could be lost.
2012-03-30 04:46:39 +00:00
dim
50e5af3369 Fix an issue introduced in sys/x86/include/endian.h with r232721. In
that revision, the bswapXX_const() macros were renamed to bswapXX_gen().

Also, bswap64_gen() was implemented as two calls to bswap32(), and
similarly, bswap32_gen() as two calls to bswap16().  This mainly helps
our base gcc to produce more efficient assembly.

However, the arguments are not properly masked, which results in the
wrong value being calculated in some instances.  For example,
bswap32(0x12345678) returns 0x7c563412, and bswap64(0x123456789abcdef0)
returns 0xfcdefc9a7c563412.

Fix this by appropriately masking the arguments to bswap16() in
bswap32_gen(), and to bswap32() in bswap64_gen().  This should also
silence warnings from clang.

Submitted by:	jh
2012-03-29 23:31:48 +00:00
dim
56f6402e5a Revert sys/x86/include/endian.h to what it was before r233419, as that
revision has two problems:
- It can produce worse code with both clang and gcc.
- It doesn't fix the actual issue introduced in r232721, which will be
  fixed in the next commit.

Submitted by:	bde, tijl and jh
Pointy hat to:	dim
2012-03-29 23:30:17 +00:00
adrian
d8968134ad oops, add a missing lock. 2012-03-29 21:54:19 +00:00
jkim
a48f8f3f0c Fix couple of style nits. 2012-03-29 19:29:24 +00:00
jkim
721cc884d8 Revert r233662 and generalize the hack. Writing zero to BAR actually does
not disable it and it is even harmful as hselasky found out.  Historically,
this code was originated from (OLDCARD) CardBus driver and later leaked into
PCI driver when CardBus was newbus'ified and refactored with PCI driver.
However, it is not really necessary even for CardBus.

Reviewed by:	hselasky, imp, jhb
2012-03-29 19:26:39 +00:00
jhb
876e74a14e Use a more proper fix for enabling HT MSI mapping windows on Host-PCI
bridges.  Rather than blindly enabling the windows on all of them, only
enable the window when an MSI interrupt is enabled for a device behind
the bridge, similar to what already happens for HT PCI-PCI bridges.

To implement this, each x86 Host-PCI bridge driver has to be able to
locate it's actual backing device on bus 0.  For ACPI, use the _ADR
method to find the slot and function of the device.  For the non-ACPI
case, the legacy(4) driver already scans bus 0 looking for Host-PCI
bridge devices.  Now it saves the slot and function of each bridge that
it finds as ivars that the Host-PCI bridge driver can then use in its
pcib_map_msi() method.

This fixes machines where non-MSI interrupts were broken by the previous
round of HT MSI changes.

Tested by:	bapt
MFC after:	1 week
2012-03-29 19:03:22 +00:00
jhb
38fb55f1d4 Restore proper use of bounce buffers for ISA DMA. When locking was
added, the call to pmap_kextract() was moved up, and as a result the
code never updated the physical address to use for DMA if a bounce
buffer was used.  Restore the earlier location of pmap_kextract() so
it takes bounce buffers into account.

Tested by:	kargl
MFC after:	1 week
2012-03-29 18:58:02 +00:00
adrian
9cb839a32c Defer the rescheduling of TID -> TXQ frames in some instances.
Right now ath_txq_sched() is mainly called from the TX ath_tx_processq()
routine, which is (mostly) done as part of the taskqueue.  It shouldn't
be called outside the taskqueue.

But now that I'm about to flip back on BAR TX, I'm going to start
stressing the ath_tx_tid_pause() and ath_tx_tid_resume() paths.
What I don't want to have happen is a reschedule of the TID traffic
_during_ the completion of TX frames.

Ideally I'd like to have a way to flag back up to the processing code
that the current hardware queue should be rechecked for software TID
queue frames.  But for now, this should suffice for the BAR TX case.

I may eventually delete this code once I've brought some further
sanity to the general TX queue/completion path.
2012-03-29 17:39:18 +00:00
jhb
98fa920cd5 - Rename VM_MEMATTR_UNCACHED to VM_MEMATTR_WEAK_UNCACHEABLE on x86 to
be less ambiguous and more clearly identify what it means.  This
  attribute is what Intel refers to as UC-, and it's only difference
  relative to normal UC memory is that a WC MTRR will override a UC-
  PAT entry causing the memory to be treated as WC, whereas a UC PAT
  entry will always override the MTRR.
- Remove the VM_MEMATTR_UNCACHED alias from powerpc.
2012-03-29 16:51:22 +00:00
jhb
42c95c3891 Use VM_MEMATTR_UNCACHEABLE for the constant for UC memory rather than
VM_MEMATTR_UNCACHED.  VM_MEMATTR_UNCACHEABLE is the constant other
platforms use.

MFC after:	2 weeks
2012-03-29 16:48:36 +00:00
nwhitehorn
43387e1a93 Fix build after changes to trap headers. 2012-03-29 16:04:42 +00:00
hselasky
fc9fab80c6 Move tty_opened_ns() into syscons.c which is currently the
only client of this macro.

Suggested by:	ed @
MFC after:	1 week
2012-03-29 15:47:29 +00:00
jimharris
c5537d1e1c Fix bug where isci(4) would report only 15 bytes of returned data on a
READ_CAP_16 command to a SATA target.

Sponsored by: Intel
Reviewed by: sbruno
Approved by: sbruno
MFC after: 3 days
2012-03-29 15:43:07 +00:00
hselasky
6749f469aa Fix for boot issue: Don't disable BARs on AGP devices. In general:
Don't disable BARs on any PCI display devices, because doing that can
sometimes cause the main memory bus to stop working, causing all
memory reads to return nothing but 0xFFFFFFFF, even though the memory
location was previously written.  After a while a privileged
instruction fault will appear and then nothing more can be debugged.
The reason for this behaviour is unknown.

MFC after:	1 week
2012-03-29 15:33:44 +00:00
hselasky
9cd12cd0b3 Fix for NULL-pointer panic during boot, if keys are pressed too early.
MFC after:	1 week
2012-03-29 14:53:14 +00:00
rrs
ddfb5c5980 Make stream our stream reset implementation
compliant to RFC6525.

MFC after:	1 month
2012-03-29 13:36:53 +00:00
jchandra
f9d3dd9f9c Remove unnecessary assembly code.
The compiler should generate lw/sw corresponding to register
operations.
2012-03-29 11:46:29 +00:00
ae
0c8f817a74 VMDB offset should be greater than logical volume size only for MBR. 2012-03-29 07:29:27 +00:00
ae
c982232271 Do proper cleanup for the GPT case when an error occurs. 2012-03-29 06:37:02 +00:00
eadler
1ef5fe44d3 Remove trailing whitespace per mdoc lint warning
Disussed with:	gavin
No objection from:	doc
Approved by:	joel
MFC after:	3 days
2012-03-29 05:02:12 +00:00
jmallett
4544b2987d Assume a big-endian default on MIPS and drop the "eb" suffix from MACHINE_ARCH.
This makes our naming scheme more closely match other systems and the
expectations of much third-party software.  MIPS builds which are little-endian
should require and exhibit no changes.  Big-endian TARGET_ARCHes must be
changed:
	From:		To:
	mipseb		mips
	mipsn32eb	mipsn32
	mips64eb	mips64

An entry has been added to UPDATING and some foot-shooting protection (complete
with warnings which should become errors in the near future) to the top-level
base system Makefile.
2012-03-29 02:54:35 +00:00
davidxu
362bad78ca Reduce code size by creating common timed sleeping function. 2012-03-29 02:46:43 +00:00
jmallett
4bfb81e9ab Turn on messages from the Simple Executive codebase, what few there are. 2012-03-29 02:05:11 +00:00
jmallett
5d1a09b499 Disable FP instruction emulation by default on !o32 because of ABI concerns.
Note that in practice this isn't needed because we get a coprocessor unusable
exception first, but that's actually something like a bug.
2012-03-29 02:04:15 +00:00
jmallett
2f74521514 Supply endianness implied by the -m flag when compiling ucore code. 2012-03-29 02:03:06 +00:00
jmallett
365a2ac71d Fix little-endian built. 2012-03-29 02:02:23 +00:00
nwhitehorn
dc06a3cb59 Allow multiple inclusion of trap.h. This has always been broken, but
until recently never caused problems.
2012-03-29 02:02:14 +00:00
mckusick
8c9c124d25 A refinement of change 232351 to avoid a race with a forcible unmount.
While we have a snapshot vnode unlocked to avoid a deadlock with another
inode in the same inode block being updated, the filesystem containing
it may be forcibly unmounted. When that happens the snapshot vnode is
revoked. We need to check for that condition and fail appropriately.

This change will be included along with 232351 when it is MFC'ed to 9.

Spotted by:  kib
Reviewed by: kib
2012-03-28 21:21:19 +00:00
fabient
5edfb77dd3 Add software PMC support.
New kernel events can be added at various location for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).

Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.

Sponsored by: NETASQ
MFC after:	1 month
2012-03-28 20:58:30 +00:00
mckusick
9a7982e5a0 Keep track of the mount point associated with a special device
to enable the collection of counts of synchronous and asynchronous
reads and writes for its associated filesystem. The counts are
displayed using `mount -v'.

Ensure that buffers used for paging indicate the vnode from
which they are operating so that counts of paging I/O operations
from the filesystem are collected.

This checkin only adds the setting of the mount point for the
UFS/FFS filesystem, but it would be trivial to add the setting
and clearing of the mount point at filesystem mount/unmount
time for other filesystems too.

Reviewed by: kib
2012-03-28 20:49:11 +00:00
jhb
08f2264a84 Allocate the ioapics[] array dynamically since it is only needed for the
duration of madt_setup_io().  This avoids having the array take up
permanent space in the BSS.

Inspired by:	bde
MFC after:	2 weeks
2012-03-28 18:53:48 +00:00
jimharris
a95c988456 Ensure consistent target IDs for direct-attached devices.
Sponsored by: Intel
Reported by: sbruno, <rpokala at panasas dot com>
Tested by: <rpokala at panasas dot com>
Reviewed by: scottl
Approved by: scottl
MFC after: 3 days
2012-03-28 18:38:13 +00:00
jkim
ad499681b4 Add a PNP ID for Japanese 106-key keyboard.
PR:		kern/166459
MFC after:	3 days
2012-03-28 17:58:37 +00:00
nwhitehorn
03485ae3df More PMAP performance improvements: skip 256 MB segments entirely if they
are are not mapped during ranged operations and reduce the scope of the
tlbie lock only to the actual tlbie instruction instead of the entire
sequence. There are a few more optimization possibilities here as well.
2012-03-28 17:25:29 +00:00
jkim
b5e3c2bb77 MFV: r233615
Revert r233555 and apply a fix for the reference counting regressions.

Tested by:	andreast, lme, nwhitehorn,
		Sevan / Venture37 (venture37 at gmail dot com)
Submitted by:	Robert Moore (robert dot moore at intel dot com)
2012-03-28 17:21:59 +00:00
jhb
1e917a8654 Move the DTrace return IDT vector back up from 0x20 to 0x92. The 0x20
vector is currently dedicated to servicing IRQ 0 from the 8259A's, so
it shouldn't be overloaded for DTrace.

Tested by:	rstone
MFC after:	1 week
2012-03-28 16:32:17 +00:00
kib
975e7bbd40 Do trivial reformatting of the comment to record the missed commit
message for r233609:
Restore the writes of atimes, quotas and superblock from syncer vnode.

Noted by:   rdivacky
2012-03-28 14:16:15 +00:00
kib
e8faa5a30a Reviewed by: bde, mckusick
Tested by:	pho
MFC after:	2 weeks
2012-03-28 14:06:47 +00:00
kib
378ad5717c Microoptimize: in qsync loop over mount vnodes, only unlock mount
interlock after we committed to try to vget() the vnode.

Submitted by:	bde
Reviewed by:	mckusick
Tested by:	pho
MFC after:	1 week
2012-03-28 13:56:18 +00:00
kib
1f7d5397de Update comment.
MFC after:	3 days
2012-03-28 13:47:07 +00:00
mav
42f863a907 Stop HDA controller polling callout on suspend and reset it on resume.
PR:		kern/166382
MFC after:	1 week
2012-03-28 13:28:09 +00:00
zec
3ce0f78a89 Permit tcpdrop in VNET jails.
Submitted by:	Miljenko Mikuc
MFC after:	3 days
2012-03-28 12:30:16 +00:00
tuexen
21244ace96 Honor the net.inet.udp.checksum sysctl when using SCTP/UDP/IPv4
encapsulation.
MFCing requires MFCing http://svn.freebsd.org/changeset/base/233554
MFC after: 2 weeks
2012-03-28 08:11:46 +00:00
yongari
8dfcea07d7 Remove unnecessary #if as the software workaround for PCI protocol
violation should be activated unless the system is cold-booted
after updating EEPROM.
The PCI protocol violation happens only when established link is
10Mbps so the workaround should be updated whenever link state
change is detected.  Previously the workaround was activated only
when user checks current media status with ifconfig(8).
2012-03-28 01:52:38 +00:00
yongari
58e81052f3 Load entire EEPROM contents in device attach time and verify
whether the checksum of EEPROM is valid or not.  Because driver
heavily relies on EEPROM information when it selectively enables
features/workarounds, it would be helpful to know whether driver
sees valid EEPROM.
While I'm here remove all other EEPROM accesses since the entire
EEPROM is loaded at device attach time.

MFC after:	2 weeks
2012-03-28 01:27:27 +00:00
yongari
26312bd969 Partially revert r223608 and selectively allow microcode loading
for 82550C.  For 82550 controllers this change restores CPUSaver
microcode loading.  Due to silicon bug on 82550 and 82550C with
server extension, these controllers seem to require CPUSaver
microcode to receive fragmented UDP datagrams.  However the
microcode shouldn't be used on client featured 82550C as it locks
up the controller.  In addition, client featured 82550C does not
have the silicon bug.  Also clear temporary memory used for
microcode loading since the same memory area is used for other
commands.
While I'm here use 82550C in probe message instead of generic
82550.

Reported by:	Andreas Longwitz <longwitz <> incore de>
Tested by:	Andreas Longwitz <longwitz <> incore de>
MFC after:	2 weeks
2012-03-28 01:08:55 +00:00
jkim
0b14a09c69 - Do not clobber softc when psm(4) is reintialized.
- Make INITAFTERSUSPEND flag independent of HOOKRESUME flag.
- Automatically set INITAFTERSUSPEND flag when ALPS GlidePoint is detected.
- Always probe Synaptics Touchpad.  Allow MOUSE_SYN_GETHWINFO ioctl and
automatically set INITAFTERSUSPEND flag when a supported device is detected,
regardless of "hw.psm.synaptics_support" tunable setting.
- Update psm(4) to reflect the above changes.
- Remove long-time defunct SYNCHACK flag while I am in the neighborhood.

MFC after:	1 month
2012-03-27 23:43:01 +00:00
jkim
5ca6fae2c3 Restore interrupt state after executing AcpiEnterSleepState(). 2012-03-27 23:26:58 +00:00
peter
0d73339334 Allow (with a license warning) "options ZFS" to work in static kernels.
The 'make depend' rules have to use custom -I paths for the special compat
includes for the opensolaris/zfs headers.

This option will pull in the couple of files that are shared with dtrace,
but they appear to correctly use the MODULE_VERSION/MODULE_DEPEND rules
so loader should do the right thing, as should kldload.

Reviewed by:	pjd (glanced at)
2012-03-27 21:23:56 +00:00
dumbbell
18b36c765a Make ReiserFS MPSAFE
Most functions seemed to be already fine w.r.t. what's done in msdosfs.

MFC after:	1 month
2012-03-27 20:36:03 +00:00
bschmidt
279028da36 strip (R) to match manpage and pci_vendors
MFC after:	1 week
2012-03-27 18:27:45 +00:00
jchandra
a59b18b56a Fix size of PCI softc. 2012-03-27 18:26:35 +00:00
gonzo
edf893b531 Fix crash on VirtualBox (and probably on some real hardware):
- Do not cover error returned by pmc_core_initialize with the
    result of pmc_uncore_initialize, fail right away.
- Give a user something to report instead failing silently

Reported by:	Alexandr Kovalenko <never@nevermind.kiev.ua>
2012-03-27 18:22:14 +00:00
bschmidt
419e09831f Add support for 6150 series devices.
Tested by:	Shane Riddle <sh4neriddle at yahoo dot com>
MFC after:	1 week
2012-03-27 17:45:50 +00:00
jchandra
8374f23681 Resource allocation for XLP SoC SDHCI slots
The on-chip SD slots do not have PCI BARs corresponding to them, so
this has to be handled in the custom SoC memory allocation.

Provide memory resource for rids corresponding to BAR 0 and 1 in
the custom allocation code.
2012-03-27 15:43:32 +00:00
jchandra
0ec682f566 Update memory and resource allocation code for SoC devices
The XLP on-chip devices have PCI configuration headers, but some of the
devices need custom resource allocation code.
- devices with no MEM/IO BARs with registers in PCIe extended reg
  space have to be handled in memory resource allocation
- devices without INTPIN/INTLINE in PCI header can be supported
  by having these faked with a shadow register.
- Some devices does not allow 8/16 bit access to the register space,
  he default bus space cannot be used for these.

Subclass pci and override attach and resource allocation methods to
take care of this.

Remove earlier code which did this partially.
2012-03-27 15:39:55 +00:00
jkim
debd8ef5e5 MFV: r233551
Fix two possible memory leaks in error path.

Obtained from:	ACPICA
2012-03-27 15:27:20 +00:00
jchandra
1573bb43e4 NOR flash driver for XLP.
The NOR interface on the SoC appears on the top level PCI bus. Add
a simple driver for this.
2012-03-27 15:16:38 +00:00
jkim
09a699481b MFV: r233550
Temporarily revert an upstream commit.  This change caused regressions for
too many laptop users.  Especially, automatic repair for broken _BIF caused
strange reference counting issues and kernal panics.  This reverts:

c995fed15a
2012-03-27 15:15:30 +00:00
bz
f89de17b69 Export the udp_cksum sysctl for upcoming SCTP work. Rather than always,
SCTP will only do IPv4 UDP checksum calculation as defined by the host
policy.  When tunneling SCTP always calculates the inner checksum already
so not doing the outer UDP can save cycles.

While here virtualize the variable.

Requested by:	tuexen
MFC after:	2 weeks
2012-03-27 15:14:29 +00:00
jchandra
4cab4cc0c4 CFI fixes for big endian archs.
The flash commands and responses are little-endian and have to be
byte swapped on big-endian systems.  However the raw read of data
need not be swapped.

Make the cfi_read and cfi_write do the swapping, and provide a
cfi_read_raw which does not byte swap for reading data from
flash.
2012-03-27 15:13:12 +00:00
rstone
0ee65aa24e Instead of only iterating over the set of known SDT probes when sdt.ko is
loaded and unloaded, also have sdt.ko register callbacks with kern_sdt.c
that will be called when a newly loaded KLD module adds more probes or
a module with probes is unloaded.

This fixes two issues: first, if a module with SDT probes was loaded after
sdt.ko was loaded, those new probes would not be available in DTrace.
Second, if a module with SDT probes was unloaded while sdt.ko was loaded,
the kernel would panic the next time DTrace had cause to try and do
anything with the no-longer-existent probes.

This makes it possible to create SDT probes in KLD modules, although there
are still two caveats: first, any SDT probes in a KLD module must be part
of a DTrace provider that is defined in that module.  At present DTrace
only destroys probes when the provider is destroyed, so you can still
panic the system if a KLD module creates new probes in a provider from a
different module(including the kernel) and then unload the the first module.

Second, the system will panic if you unload a module containing SDT probes
while there is an active D script that has enabled those probes.

MFC after:	1 month
2012-03-27 15:07:43 +00:00
jchandra
126e29ce5f XLP UART code udpate.
Move XLP PCI UART device to sys/mips/nlm/dev/ directory.  Other
drivers for the XLP SoC devices will be added here as well.
Update uart_cpu_xlp.c and uart_pci_xlp.c use macros for uart port,
speed and IO frequency.
2012-03-27 14:48:40 +00:00
jhb
9fc8d42e62 Use VM_MEMATTR_UNCACHEABLE instead of VM_MEMATTR_UNCACHED for UC mappings.
VM_MEMATTR_UNCACHED is actually the x86-specific UC- mode (where a WC
MTRR can override the PAT setting).
2012-03-27 14:24:29 +00:00
jchandra
2f10110045 xlpge : driver for XLP network accelerator
Features:
- network driver for the four 10G interfaces and two management ports
  on XLP 8xx.
- Support 4xx and 3xx variants of the processor.
- Source code and firmware building for the 16 mips32r2 micro-code engines
  in the Network Accelerator.
- Basic initialization code for Packet ordering Engine.

Submitted by:	Prabhath Raman (prabhath at netlogicmicro com)
		[refactored and fixed up for style by jchandra]
2012-03-27 14:05:12 +00:00
fabient
961cc7f6b4 Fix random deadlock on pmcstat exit:
- Exit the thread when soft shutdown is requested
- Wakeup owner thread.

Reproduced/tested by looping pmcstat measurement:
pmcstat -S instructions -O/tmp/test ls

MFC after:	1 week
2012-03-27 14:02:22 +00:00
jchandra
4d0abc074b Support for EEPROM and CPLD on XLP EVP boards.
On XLP evaluation platform, the board information is stored
in an I2C eeprom and the network block configuration is available
from a CPLD connected to the GBU (NOR flash bus). Add support
for both of these.
2012-03-27 12:25:47 +00:00
jchandra
b1c8867097 Opencrypto driver for XLP Security and RSA/ECC blocks
Support for the Security and RSA blocks on XLP SoC. Even though
the XLP supports many more algorithms, only the ones supported
in OCF have been added.

Submitted by:	Venkatesh J. V. (venkatesh at netlogicmicro com)
2012-03-27 11:43:46 +00:00
jchandra
88cde25ec4 I2C support for XLP, add hints for I2C devices and update PCI resource
allocation code.
2012-03-27 11:17:04 +00:00
jchandra
34e8a23c52 Driver for OpenCores I2C controller.
Add a Simple polled driver iicoc for the OpenCores I2C controller. This
is used in Netlogic XLP processors.

Submitted by:	Sreekanth M. S. (kanthms at netlogicmicro com)
2012-03-27 10:44:32 +00:00
jchandra
72abada2e8 Move driver for DS1374 RTC to sys/dev/iicbus
The earlier version of the driver is sys/mips/rmi/dev/iic/ds1374u.c
Convert all references to ds1374u to ds1374, and use DEVMETHOD_END.
Also update the license header as Netlogic is now Broadcom.
2012-03-27 09:48:18 +00:00
jchandra
96376fdd3f XLP PCIe code update.
- XLP supports hardware swap for PCIe IO/MEM accesses. Since we
  are in big-endian mode, enable hardware swap and use the normal
  bus space.
- move some printfs to bootverbose, and remove others.
- fix SoC device resource allocation code
- Do not use '|' while updating PCIE_BRIDGE_MSI_ADDRL
- some style fixes

In collaboration with: Venkatesh J. V. (venkatesh at netlogicmicro com)
2012-03-27 07:57:41 +00:00
jchandra
ee9de9d2f9 Update the L1D cache flush sequence when enabling threads.
Added more comments to the code.
2012-03-27 07:51:42 +00:00
jchandra
5239156d84 Switch to interrupt based message handling for XLP 8xx B0.
Fixup some style issues in the file as well.
2012-03-27 07:47:13 +00:00
jchandra
303cb7ff38 Support for XLP4xx and XLP 8xx B0 revision
- Add 4xx processor IDs, add workaround in CPU detection code.
- Update frequency detection code for XLP 8xx.
- Add setting device frequency code.
- Update processor ID checking code.
2012-03-27 07:39:05 +00:00
jchandra
fa4c74cfc3 Fixes to the XLP startup code.
Changes are:
- Correct the order of calling init functions.
- Fix up checking excluding reset area.
2012-03-27 07:34:27 +00:00
adrian
5e8fce0c81 Correct the ordering of tid/crypto ic_name.
Because the code lacks all the GNU extensions to printf() format stuff,
the compiler doesn't helpfully tell us that I messed up in a previous
commit.

Pointy hat to: adrian, who likely only cares about this because he's the
  only one who bothers flipping on net80211 debugging.
2012-03-27 04:15:38 +00:00
nwhitehorn
78cf971b20 Make sure to call vm_page_dirty() before the pmap lock is released to
prevent a race where another process could conclude the page was clean.

Submitted by:	alc
2012-03-27 01:26:00 +00:00
nwhitehorn
3bbb70297a More PMAP concurrency improvements: replace the table lock and (almost) all
uses of the page queues mutex with a new rwlock that protects the page
table and the PV lists. This reduces system time during a parallel
buildworld by 35%.

Reviewed by:	alc
2012-03-27 01:24:18 +00:00
gonzo
1aa843503d - For o32 ABI get arguments from the stack
- Clear CPU_DTRACE_FAULT flag in userland backtrace routine. It just
   means we hit wrong memory region and should stop.
2012-03-26 21:47:06 +00:00
gonzo
0f05d3736d Add .reginfo section entry 2012-03-26 21:26:23 +00:00
gonzo
49e07e9559 Properly cast 64-bit dofhp_dof to pointer.
For i386 this change is no-op. For AMD64 it was tested with DTrace test
suite: results are the same from the test run before the change and after
2012-03-26 21:22:51 +00:00
rmh
08e0b420dc Register signal 33 explicitly as reserved by real-time library, and
use it by its new name (SIGLIBRT) rather than internal definition
in librt (SIGSERVICE).

Approved by:	davidxu, arch
2012-03-26 19:12:09 +00:00
marius
076ad4a1d1 Remove second consts in r233288 in order to appease C++ compilers.
While at it, remove some style(9) bugs in libkern.h.

Submitted by:	kan
2012-03-26 18:22:04 +00:00
adrian
06ce35f781 Use the assigned sequence number when checking if a retried packet is
within the BAW.

This regression was introduced in ane earlier commit by me to fix the
BAW seqno allocation-but-not-insertion-into-BAW race.  Since it was only
ever using the to-be allocated sequence number, any frame retries
with the first frame in the BAW still in the software queue would
have constantly failed, as ni_txseqs[tid] would always be outside
the BAW.

TODO:

* Extract out the mostly common code here in the agg and non-agg ADDBA
  case and stuff it into a single function.

PR:		kern/166357
2012-03-26 16:05:19 +00:00
melifaro
fd561480db - Add knlist_init_rw_reader() function to kqueue(9).
Function acquired reader lock if needed.
Assert check for reader or writer lock (RA_LOCKED / RA_UNLOCKED)
- While here, add knlist_init_mtx.9 to MLINKS and fix some style(9) issues

Reviewed by:    glebius
Approved by:    ae(mentor)

MFC after:      2 weeks
2012-03-26 09:34:17 +00:00
gonzo
72dcb65231 Use macroses to load/store pointers and increase indexes instead of
hardcoded MIPS64 instructions
2012-03-26 01:26:33 +00:00
adrian
6abfad5150 Add some more debugging to try and nail down exactly what's going on when
I see traffic stalls.

It turns out that the bug isn't because the first and last frame in the
BAW is in the software queue.  It is more likely that it's because
the first frame in the BAW is still in the software queue and thus there's
no more room to allocate and do subsequent TX.

PR:		kern/166357
2012-03-25 23:50:34 +00:00
rmh
5582dc3743 Follow non-BSD case when GNU/Hurd is detected. 2012-03-25 21:54:36 +00:00
melifaro
97c3a90503 - Permit number of ipfw tables to be changed in runtime.
net.inet.ip.fw.tables_max is now read-write.

- Bump IPFW_TABLES_MAX to 65535
Default number of tables is still 128

- Remove IPFW_TABLES_MAX from ipfw(8) code.

Sponsored by Yandex LLC

Approved by:    kib(mentor)

MFC after:      2 weeks
2012-03-25 20:37:59 +00:00
gibbs
fd2370c057 Correct failure to attach the PV block front device on Citrix
XenServer configurations that advertise the multi-page ring extension,
but only allow a single page of ring space.

sys/dev/xen/blkfront/blkfront.c:
	If only one page of ring space is being used, do not publish
	in the XenStore the number of pages in use (1), via either
	of the supported multi-page ring extension schemes.

	Single page operation is the same with or without the
	ring-page extension being negotiated.   Relying on the
	legacy behavior avoids an incompatible difference in how
	the two ring-page extension schemes that are out in the
	wild, deal with the base case of a single page.  The
	Amazon/Red Hat drivers use the same XenStore variable as
	if the extension was not negotiated.  The Citrix drivers
	assume the new ring reference XenStore variables will be
	available

Reported by:	Oliver Schonefeld <schonefeld@ids-mannheim.de>
MFC after:	3 days
2012-03-25 14:20:43 +00:00
trasz
9edb0f9ffb Remove unused define.
Discussed with:	kib
2012-03-25 12:53:19 +00:00
nwhitehorn
13688ae964 More PMAP performance improvements: on powerpc64, when TLBIE can be run
with exceptions enabled, leave them enabled and use a regular mutex to
guard TLB invalidations instead of a spinlock.
2012-03-25 06:01:34 +00:00
adrian
27ff2217e7 Add the new channel width change field to the ath(4) driver.
This is not entirely correct as it simply resets the channel, flushing
whatever is in the TX/RX queue.  This can and will break aggregation
BAW tracking.  But the alternative (HT40 frames being sent with the hardware
in HT20 mode) is even worse.

There's still a small window between the htinfo being received (and the ni_chw
field being updated) which could cause problems.  I'll look at fleshing this
out in follow-up commits.

PR:		kern/166286
2012-03-25 03:14:31 +00:00
adrian
2acdb6a872 Create a new task to handle 802.11n channel width changes.
Currently, a channel width change updates the 802.11n HT info data in
net80211 but it doesn't trigger any device changes.  So the device
driver may decide that HT40 frames can be transmitted but the last
device channel set only had HT20 set.

Now, a task is scheduled so a hardware reset or change isn't done
during any active ongoing RX. It also means that it's serialised
with the other task operations (eg channel change.)

This isn't the final incantation of this work, see below.

For now, any unmodified drivers will simply receive a channel
change log entry.  A subsequent patch to ath(4) will introduce
some basic channel change handling (by resetting the NIC.)
Other NICs may need to update their rate control information.

TODO:

* There's still a small window at the present moment where the
  channel width has been updated but the task hasn't been fired.
  The final version of this should likely pass in a channel width
  field to the driver and let the driver atomically do whatever
  it needs to before changing the channel.

PR:		kern/166286
2012-03-25 03:11:57 +00:00
mckusick
e6ce77b8ea Add a third flags argument to ffs_syncvnode to avoid a possible conflict
with MNT_WAIT flags that passed in its second argument. This will be
MFC'ed together with r232351.

Discussed with: kib
2012-03-25 00:02:37 +00:00
nwhitehorn
30ea2f9f86 Only call vm_page_dirty() on pages that are writable in order not to
confuse the VM.
2012-03-24 22:32:19 +00:00
nwhitehorn
2cab577d3a Following suggestions from alc, skip wired mappings in pmap_remove_pages()
and remove moea64_attr_*() in favor of direct calls to vm_page_dirty()
and friends.
2012-03-24 19:59:14 +00:00
alc
ee76f4b5b2 Disable detailed PV entry accounting by default. Add a config option
to enable it.

MFC after:	1 week
2012-03-24 19:43:49 +00:00
marius
764884b271 Add cas(4), gem(4) and hme(4) to x86 GENERICs as suggested by netchild@ in
<20120222095239.Horde.0hpYHJjmRSRPRKzXsoFRbYk@webmail.leidinger.net>.
According to some private emails received, it apparently is not unpopular
to use at least Quad GigaSwift cards driven by cas(4) in x86 machines.

MFC after:	1 week
2012-03-24 18:08:28 +00:00
marius
23f85145bf Consistently update to the MPI header set version 01.05.20 after r224761.
Requested by:	mjacob

MFC after:	1 week
2012-03-24 16:23:21 +00:00
marius
760232ce96 Initialize the mutexes used for the NVM and the swflag as MTX_DUPOK in
order to avoid otherwise harmless witness warnings when these are acquired
at the same time and due to both using MTX_NETWORK_LOCK as their type.
The right fix actually would be to use different, descriptive types for
these. However, the latter would require undesirable changes to the shared
code base. Another approach would be to just supply NULL as the type, which
was deemed as less desirable though as it would cause the unique but cryptic
name also to be used for the type and to diverge from the type used by other
network device drivers.

MFC after:	1 week
2012-03-24 15:15:34 +00:00
marius
306ae35dca Given that this is a host-PCI-Express bridge driver, create the parent
DMA tag with a 4 GB boundary as required by PCI-Express. With r232403 in
place this actually is redundant. However, the host-PCI-Express bridge
driver is the more appropriate place for implementing this restriction.

MFC after:	3 days
2012-03-24 13:11:58 +00:00
dim
8973cd587b Fix the following clang warning in sys/dev/dcons/dcons.c, caused by the
recent changes in sys/x86/include/endian.h:

  sys/dev/dcons/dcons.c:190:15: error: implicit conversion from '__uint32_t' (aka 'unsigned int') to '__uint16_t' (aka 'unsigned short') changes value from 1684238190 to 28526 [-Werror,-Wconstant-conversion]
	  buf->magic = ntohl(DCONS_MAGIC);
		       ^~~~~~~~~~~~~~~~~~
  sys/sys/param.h:306:18: note: expanded from:
  #define ntohl(x)        __ntohl(x)
			  ^
  ./x86/endian.h:128:20: note: expanded from:
  #define __ntohl(x)      __bswap32(x)
			  ^
  ./x86/endian.h:78:20: note: expanded from:
	      __bswap32_gen((__uint32_t)(x)) : __bswap32_var(x))
			    ^
  ./x86/endian.h:68:26: note: expanded from:
	  (((__uint32_t)__bswap16(x) << 16) | __bswap16((x) >> 16))
				  ^
  ./x86/endian.h:75:53: note: expanded from:
	      __bswap16_gen((__uint16_t)(x)) : __bswap16_var(x)))
					       ~~~~~~~~~~~~~ ^

This is because the __bswapXX_gen() macros (for x86) call the regular
__bswapXX() macros.  Since the __bswapXX_gen() variants are only called
when their arguments are constant, there is no need to do that constancy
check recursively.  Also, it causes the above error with clang.

Fix it by calling __bswap16_gen() from __bswap32_gen(), and similarly,
__bswap32_gen() from  __bswap64_gen().

While here, add extra parentheses around the __bswap16_gen() macro
expansion, to prevent unexpected side effects.
2012-03-24 10:07:21 +00:00
gonzo
df574149ee Remap PMC interrupt for all cores 2012-03-24 06:28:15 +00:00
gonzo
45fa45a9cc Add DTrace-related part to machine-dependent code:
- DTrace trap handler
- invop-related variables (unused on MIPS but still referenced from dtrace)
2012-03-24 05:17:38 +00:00
gonzo
f3d75b9c28 Jusy use i386 version of cyclic_machdep.c on all supported architectures.
It's generic enough to cover all of them.
2012-03-24 05:16:26 +00:00
gonzo
326b657cf8 Make lockstat and profile modules x86-only 2012-03-24 05:15:14 +00:00
gonzo
eb11c4850d Add device part of DTrace/MIPS code 2012-03-24 05:14:37 +00:00
gonzo
71511943f0 Add MIPS support to cddl/contrib part:
- header and stub .c file for fasttrap module. It's not supported on
    MIPS yet, but there is no way to disable support completely
- Do as amd64 trying to limit allocated memory
2012-03-24 04:52:18 +00:00
marius
7f38793833 As it turns out, mpi_cnfg.h already is included by mpt.h. 2012-03-24 00:37:56 +00:00
marius
b75a3f5d87 - Use the PCI ID macros from mpi_cnfg.h rather than duplicating them here.
Note that this driver additionally probes some device IDs for the most
  part not know to other MPT drivers, if at all. So rename the macros not
  present in mpi_cnfg.h to match the naming scheme in the latter and but
  suffix them with a _FB in order to not cause conflicts.
- Like mpt_set_config_regs(), comment out mpt_read_config_regs() as the
  content of the registers read isn't actually used and both functions
  aren't exactly up to date regarding the possible layouts of the BARs
  (these function might be helpful for debugging though, so don't remove
  them completely).
- Use DEVMETHOD_END.
- Use NULL rather than 0 for pointers.
- Remove an unusual check for the softc being NULL.
- Remove redundant zeroing of the softc.
- Remove an overly banal and actually partly incorrect as well as partly
  outdated comment regarding the allocation of the memory resource.

MFC after:	3 days
2012-03-24 00:30:17 +00:00
gonzo
1586b9e9af Add define for MIPS.options 2012-03-23 22:52:23 +00:00
trociny
0079b1f6c5 Add a sysctl to set and retrieve binary osreldate of another process.
Suggested by:	kib
Reviewed by:	kib
MFC after:	2 weeks
2012-03-23 20:05:41 +00:00
bschmidt
fe30cce04e Use suspend/resume methods provided by net80211. This ensures that the
appropriate state handling takes place, not doing so results in the
device doing nothing until manual intervention.

Reviewed by:	iwasaki
Tested by:	iwasaki (iwi)
MFC after:	4 weeks
2012-03-23 19:32:30 +00:00
gonzo
15959dc921 Fix pmap_kextract prototype to align it with pmap.c change 2012-03-23 18:07:12 +00:00
jimharris
aeccd8bc9c Call xpt_bus_register during attach context, then freeze and do not release
until domain discovery is complete.  This fixes an isci(4) bug on FreeBSD 7.x
where devices weren't always appearing after boot without an explicit rescan.

Sponsored by: Intel
Reported and tested by: <rpokala at panasas dot com>
Reviewed by: scottl
Approved by: scottl
2012-03-23 16:28:11 +00:00
jhb
4f32ea5880 Don't cast a bus address to a uint8_t pointer just to add an offset to
it.  Instead, add the offset directly to the bus address.
2012-03-23 12:34:39 +00:00
dim
a7368abeec Work around the following clang warning in mps(4):
sys/dev/mps/mps_sas.c:861:1: error: function 'mpssas_discovery_timeout' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
  mpssas_discovery_timeout(void *data)
  ^

Because the driver is obtained from upstream, we don't want to modify
it; just silence the warning instead, it is harmless.

MFC after:	3 days
2012-03-23 11:35:01 +00:00
ae
7232ff0e78 Check that scheme is not already registered. This may happens when a
KLD is preloaded with loader(8) and leads to infinity loops.

Also do not return EEXIST error code from MOD_LOAD handler, because
we have undocumented(?) ability replace kernel's module with preloaded one.
And if we have so, then preloaded module will be initialized first.
Thus error in MOD_LOAD handler will be triggered for the kernel.

PR:		kern/165573
MFC after:	3 weeks
2012-03-23 07:26:17 +00:00
gonzo
0450849e9c Add pseudo-device for handling PMC interrupts and link everything
PMC-related to build
2012-03-23 00:11:54 +00:00
gonzo
d2702a3c7c Add Octeon PMC hardware backend 2012-03-23 00:09:27 +00:00
gonzo
9c120510f9 Add list of Octeon's PMC counters obtained from cvmx-core.h 2012-03-23 00:04:09 +00:00
gonzo
74c3c8ddb5 Add Octeon class and CPU type 2012-03-23 00:03:26 +00:00
gonzo
14bd63a0b0 Setup fake MODINFO variables for octeon kernel 2012-03-23 00:01:09 +00:00
adrian
0167a18d8f Add some further debugging to try and aid tracking down what the state of
things were just before a full software queue is drained.
2012-03-22 21:48:36 +00:00
adrian
a791b92eab Sprinkle some style(9) things around. 2012-03-22 21:47:14 +00:00