Commit Graph

94296 Commits

Author SHA1 Message Date
Andriy Gapon
9ba0691bdd follow up to r254051
- update powerpc/GENERIC64 as well, suggested by mdf
- update comments so that they make sense after the change, suggested by
  jhb

X-MFC after:	never (change specific to head)
2013-08-09 08:11:09 +00:00
Jeff Roberson
863c7e4562 - Reserve a special AF for SDP. The one we were incorrectly using before
was taken by another AF.

Sponsored by:	EMC / Isilon Storage Division
2013-08-09 03:26:17 +00:00
Jeff Roberson
3d71c6cf45 - Correctly handle various edge cases in sysfs emulation.
Sponsored by:	EMC / Isilon Storage Division
2013-08-09 03:24:48 +00:00
Jeff Roberson
ba397d0f16 - Use the correct type in the linux bitops emulation.
Submitted by:	Maxim Ignatenko <gelraen.ua@gmail.com>
2013-08-09 03:24:12 +00:00
Pyun YongHyeon
69b1f509a4 Fix for IPv4 fragment packets treated as RMCP.
bit25 of rxMode MAC register of 5762 needs to be set for rx mgmt
filter to work correctly when processing match for UDP header
fields.  Otherwise false positive can occur which causes IPv4
fragment to be received by APE instead of host.

Reported by:	Geans Pin <geanspin@broadcom.com>
2013-08-09 01:15:32 +00:00
Scott Long
fe8391035a Rate limit the 'out of chain frame' messages to once per 60 seconds.
Obtained from:	Netflix
MFC after:	3 days
2013-08-09 01:10:33 +00:00
Scott Long
d9802deb4e Sometimes a device misbehaves so badly that it disrupts the entire system.
Add a tunable that allows such a device to be excluded from the driver.
The id parameter is the target id that the driver assigns to a given device.

dev.mps.X.exclude_ids=<id>,<id>

Obtained from:	Netflix
MFC after:	3 days
2013-08-09 01:09:02 +00:00
Scott Long
f510415d84 Add a helpful message that can help point to why a sysctl tree removal failed
Obtained from:	Netflix
MFC after:	3 days
2013-08-09 01:04:44 +00:00
Xin LI
43667c1f68 MFV r254079:
Illumos ZFS issues:
  3957 ztest should update the cachefile before killing itself
  3958 multiple scans can lead to partial resilvering
  3959 ddt entries are not always resilvered
  3960 dsl_scan can skip over dedup-ed blocks if
       physical birth != logical birth
  3961 freed gang blocks are not resilvered and can cause pool to suspend
  3962 ztest should print out zfs debug buffer before exiting
2013-08-08 23:38:31 +00:00
Devin Teske
5b502b3b4c Update legacy static assignments in old code to support dynamic framing,
plotting, and alignment coinciding with enhancements in SVN r242667.
2013-08-08 22:34:00 +00:00
Devin Teske
5e82c321f8 Since the introduction of SVN r244048 and [follow-up] r244089, it is now
safe to build upon ``boot_serial?'' functionality to make safer UI choices.
2013-08-08 22:09:46 +00:00
Pedro F. Giffuni
95f1f8d262 Small typo.
MFC after:	3 days
2013-08-08 22:07:59 +00:00
Ryan Stone
08a42caa50 Allow drivers to return BUS_PROBE_NOWILDCARD from their attach routine to
match devices where the driver class was fixed but the unit number was
wildcarded.  This better matches the documented behaviour in
DEVICE_PROBE(9).

Reviewed by:	imp
2013-08-08 19:30:49 +00:00
Andrey V. Elsukov
b74dd6c77b gpt_entries is used as limit for the number of partition entries in
the GEOM_PART. Instead of just using number of entries from the GPT
header, calculate this limit based on the reserved space between
GPT header and first available LBA.

MFC after:	2 weeks
2013-08-08 16:09:20 +00:00
Glen Barber
a61914445d When newvers.sh is run, it is possible that the svnversion
(or svnliteversion) in the current lookup path is not what
was used to check out the tree.  If an incompatible version
is used, the svn revision number is not reported in uname(1).

Run ${svnversion} on newvers.sh itself when evaluating if the
svn(1) in use is compatible with the tree.  Fallback to an
empty ${svnversion} if necessary.

With this change, svnliteversion from base is only used
if no compatible svnversion is found, so with this change,
the version of svn(1) from the ports tree is evaluated first.

Requested by:	many
MFC after:	3 days
X-MFC-To:	stable/9, releng/9.2 only
2013-08-08 15:59:00 +00:00
Andrey V. Elsukov
4371b649aa Make the check for number of entries less strict.
Some partitioning tools can create GPT with number of entries less
than 128.

MFC after:	1 week
2013-08-08 11:24:25 +00:00
Adrian Chadd
4030a4b2a3 Cap the number of streams supported to two for now.
I haven't yet reviewed the Intel driver(s) in more depth to see if
there are 1x1 NICs that report they support 2 transmit/receive chains..
if so then we'll have to update this.

Tested:

* Intel 4965, which is a 2x2 device with 3 RX and 2 TX chains.

PR:		kern/181132
2013-08-08 05:52:41 +00:00
Adrian Chadd
e7495198d5 Convert net80211 over to using if_transmit for the dispatch from the
upper layer(s).

This eliminates the if_snd queue from net80211. Yay!

This unfortunately has a few side effects:

* It breaks ALTQ to net80211 for now - sorry everyone, but fixing
  parallelism and eliminating the if_snd queue is more important
  than supporting this broken traffic scheduling model. :-)

* There's no VAP and IC flush methods just yet - I think I'll add
  some NULL methods for now just as placeholders.

* It reduces throughput a little because now net80211 will drop packets
  rather than buffer them if the driver doesn't do its own buffering.
  This will be addressed in the future as I implement per-node software
  queues.

Tested:

* ath(4) and iwn(4) in STA operation
2013-08-08 05:09:35 +00:00
Neel Natu
f263e391a3 Use local variables with the appropriate types and eliminate a bunch of casts.
This is a cosmetic change but it does help with a proposed change to increase
the maximum size of physical memory supported on amd64 platforms.

Submitted by:	Chris Torek (torek@torek.net)
2013-08-08 03:17:39 +00:00
Xin LI
9d2f243aa6 MFV r254071:
Fix a regression introduced by fix for Illumos bug #3834.  Quote from
Matthew Ahrens on the Illumos issue:

ztest fails this assertion because ztest_dmu_read_write() does
        dmu_tx_hold_free(tx, bigobj, bigoff, bigsize);
and then
    dmu_object_set_checksum(os, bigobj,
        (enum zio_checksum)ztest_random_dsl_prop(ZFS_PROP_CHECKSUM), tx);

If the region to free is past the end of the file, the DMU assumes that there
will be nothing to do for this object.  However, ztest does set_checksum(),
which must modify the dnode.  The fix is for ztest to also call

    dmu_tx_hold_bonus(tx, bigobj);

so we can account for the dirty data associated with setting the checksum

Illumos ZFS issues:
  3955 ztest failure: assertion refcount_count(&tx->tx_space_written)
         + delta <= tx->tx_space_towrite
2013-08-07 22:21:00 +00:00
Adrian Chadd
cc80eae5cf Allow net80211 to compile on stable/9 and stable/8. 2013-08-07 22:01:43 +00:00
Xin LI
4f7b34578b MFV r254070:
Merge vendor bugfix for ZFS test suite that triggers false positives.

Illumos ZFS issues:
  3949 ztest fault injection should avoid resilvering devices
  3950 ztest: deadman fires when we're doing a scan
  3951 ztest hang when running dedup test
  3952 ztest: ztest_reguid test and ztest_fault_inject don't place nice together
2013-08-07 21:16:14 +00:00
John Baldwin
5b596f0f5f Don't emit a spurious EVFILT_PROC event with no fflags set on process exit
if NOTE_EXIT is not being monitored.  The rationale is that a listener
should only get an event for exit() if they registered interest via
NOTE_EXIT.  This matches the behavior on OS X.
- Don't save the exit status on process exit unless NOTE_EXIT is being
  monitored.
- Add an internal EV_DROP flag that requests kqueue_scan() to free the
  knote without signalling it to userland and use this when a process
  exits but the fflags in the knote is zero.

Reviewed by:	jmg
MFC after:	1 month
2013-08-07 19:56:35 +00:00
Konstantin Belousov
449c2e92c9 Split the pagequeues per NUMA domains, and split pageademon process
into threads each processing queue in a single domain.  The structure
of the pagedaemons and queues is kept intact, most of the changes come
from the need for code to find an owning page queue for given page,
calculated from the segment containing the page.

The tie between NUMA domain and pagedaemon thread/pagequeue split is
rather arbitrary, the multithreaded daemon could be allowed for the
single-domain machines, or one domain might be split into several page
domains, to further increase concurrency.

Right now, each pagedaemon thread tries to reach the global target,
precalculated at the start of the pass.  This is not optimal, since it
could cause excessive page deactivation and freeing.  The code should
be changed to re-check the global page deficit state in the loop after
some number of iterations.

The pagedaemons reach the quorum before starting the OOM, since one
thread inability to meet the target is normal for split queues.  Only
when all pagedaemons fail to produce enough reusable pages, OOM is
started by single selected thread.

Launder is modified to take into account the segments layout with
regard to the region for which cleaning is performed.

Based on the preliminary patch by jeff, sponsored by EMC / Isilon
Storage Division.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-08-07 16:36:38 +00:00
Konstantin Belousov
872d995f76 Change the pmap_ts_referenced() method of amd64 pmap to use shared
pvh_global_lock.  This allows the method to be executed in parallel,
avoiding undue contention on the pvh_global_lock for the multithreaded
pagedaemon.

The pmap_ts_referenced() function has to inspect the page mappings for
several pmaps, which need to be locked while pv list lock is owned.
This contradicts to the lock order, where pmap lock is before pv list
lock.  Introduce the generation count for the pv list of the page or
superpage, which indicate any change in the pv list, and, as usual,
perform restart of the iteration if generation changed while pv lock
was dropped for blocking acquire of a pmap lock.

Reported and tested by:	pho
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
2013-08-07 16:33:15 +00:00
Olivier Houchard
d03426e619 Don't bother trying to work around buffers which are not aligned on a cache
line boundary. It has never been 100% correct, and it can't work on SMP,
because nothing prevents another core from accessing data from an unrelated
buffer in the same cache line while we invalidated it. Just use bounce pages
instead.

Reviewed by:	ian
Approved by:	mux (mentor) (implicit)
2013-08-07 15:44:58 +00:00
Alexander Motin
a29779e865 Remove droping topology mutex after iterating 100 periphs in CAMGETPASSTHRU.
That is not so slow and so often operation to handle unneeded otherwise
xsoftc.xpt_generation and respective locking complications.
2013-08-07 11:34:20 +00:00
Ganbold Tsagaankhuu
696ec285aa Bring initial support for Allwinner A20 SoC (Cubieboard2).
Add support for A20 timer.
	Correct interrupt offset depending from chip.
	Add basic code for CPU configuration module.
	For now, add kernel config and dts file
	(only FDT blob related problem needs to be solved later in
	order to have one kernel for both cubieboard1 and 2).

Approved by: ray@
2013-08-07 11:07:56 +00:00
Alexander Motin
71185c66dd Improve r253721 by reporting detected lack of BIO_FLUSH support to GEOM.
That prevents more of such requests from coming and errors from logging.
2013-08-07 08:20:11 +00:00
Andriy Gapon
818d282e7b enable KDB_TRACE in GENERICs
KDB_TRACE is not an alternative to DDB/etc, they are complementary.
So I do not see any reason to not enable KDB_TRACE by default.

X-MFC after:	never (change specific to head)
2013-08-07 08:03:50 +00:00
Kevin Lo
3de1bd9502 Remove unsigned comparison < 0
Found by:	LLVM
Reviewed by:	luigi
2013-08-07 07:22:56 +00:00
Jeff Roberson
5df87b21d3 Replace kernel virtual address space allocation with vmem. This provides
transparent layering and better fragmentation.

 - Normalize functions that allocate memory to use kmem_*
 - Those that allocate address space are named kva_*
 - Those that operate on maps are named kmap_*
 - Implement recursive allocation handling for kmem_arena in vmem.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	EMC / Isilon Storage Division
2013-08-07 06:21:20 +00:00
Mark Johnston
5bc4f6b3ab Add a missing module version declaration to if_tun(4).
PR:		181078
Submitted by:	Brandon Gooch <jamesbrandongooch@gmail.com>
MFC after:	1 week
2013-08-07 01:32:08 +00:00
Mark Johnston
c0432fc38b Fill in the description fields for M_FICT_PAGES.
Reviewed by:	kib
MFC after:	3 days
2013-08-07 00:20:30 +00:00
Marcel Moolenaar
e01c6f329a Change <sys/diskpc98.h> to not redefine the same symbols that are
being defined in <sys/diskmbr.h>. Instead give the symbols here a
"PC98_" prefix. This way, both <sys/diskmbr.h> and <sys/diskpc98.h>
can be included in the same C source file.

The renaming is trivial. The only gotcha is that DOSBBSECTOR is
also redefined from 0 to 1. This because DOSBBSECTOR was always
used in conjunction with an addition of 1. The PC98_BBSECTOR symbol
is defined as 1 and the expression is simplified.

Note: it is not believed that ports are seriously impacted; or at
all for that matter.

Approved by: nyan@
2013-08-07 00:00:48 +00:00
Xin LI
c668ff330e MFV r254011:
This change have no effect to FreeBSD but integrated for
completeness.

Illumos ZFS issues:
  348 ZFS should handle DKIOCGMEDIAINFOEXT failure
2013-08-06 21:36:01 +00:00
Jack F Vogel
d0913b7f25 Make the various driver MSIX setup routines fallback to MSI more
gracefully. This change was suggested by Marius Strobl, thank you.

PR: kern/181016
MFC after: ASAP
2013-08-06 21:01:38 +00:00
Marius Strobl
0cfcfc1918 - Fix a bug in the MSI allocation logic so an MSI is also employed if a
controller supports only a single message. I haven't seen such an adapter
  out in the wild, though, so this change likely is a NOP.
  While at it, further simplify the MSI allocation logic; there's no need
  to check the number of available messages on our own as pci_alloc_msi(9)
  will just fail if it can't provide us with the single message we want.
- Nuke the unused softc of aacch(4).

MFC after:	1 month
2013-08-06 19:14:02 +00:00
Marius Strobl
21e5a2223c As it turns out, MSIs are broken with 2820SA so introduce an AAC_FLAGS_NOMSI
quirk and apply it to these controllers [1]. The same problem was reported
for 2230S, in which case it wasn't actually clear whether the culprit is the
controller or the mainboard, though. In order to be on the safe side, flag
MSIs as being broken with the latter type of controller as well. Given that
these are the only reports of MSI-related breakage with aac(4) so far and
OSes like OpenSolaris unconditionally employ MSIs for all adapters of this
family, however, it doesn't seem warranted to generally disable the use of
MSIs in aac(4).
While it, simplify the MSI allocation logic a bit; there's no need to check
for the presence of the MSI capability on our own as pci_alloc_msi(9) will
just fail when these kind of interrupts are not available.
Reported and tested by: David Boyd [1]

MFC after:	3 days
2013-08-06 18:55:59 +00:00
Jack F Vogel
54a6317360 When the igb driver is static there are cases when early interrupts occur,
resulting in a panic in refresh_mbufs, to prevent this add a check in the
interrupt handler for DRV_RUNNING.

MFC after: 1 day (critical for 9.2)
2013-08-06 18:00:53 +00:00
Hiroki Sato
ffa0165ae0 Fix incompatibility in ICMPV6CTL_ND6_PRLIST sysctl, and SIOCGPRLST_IN6,
SIOCGDRLST_IN6, and SIOCGNBRINFO_IN6 ioctl.  These userland interfaces
treat expiration times in time_second, not time_uptime.
2013-08-06 17:10:52 +00:00
Kirk McKusick
824009a16a This bug fix is in a code path in rename taken when there is a
collision between a rename and an open system call for the same
target file. Here, rename releases its vnode references, waits for
the open to finish, and then restarts by reacquiring its needed
vnode locks. In this case, rename was unlocking but failing to
release its reference to one of its held vnodes. The effect was
that even after all the actual references to the vnode had gone,
the vnode still showed active references. For files that had been
removed, their space was not reclaimed until the filesystem was
forcibly unmounted.

This bug manifested itself in the Postgres server which would
leak/lose hundreds of files per day amounting to many gigabytes of
disk space. This bug required shutting down Postgres, forcibly
unmounting its filesystem, remounting its filesystem and restarting
Postgres every few days to recover the lost space.

Reported by: Dan Thomas and Palle Girgensohn
Bug-fix by:  kib
Tested by:   Dan Thomas and Palle Girgensohn
MFC after:   2 weeks
2013-08-06 16:50:05 +00:00
Andriy Gapon
63230519fa fix fat-fingering in r253996
MFC after:	17 days
X-MFC with:	r253996
2013-08-06 16:18:07 +00:00
Andriy Gapon
c319ea15f4 opensolaris code: translate INVARIANTS to DEBUG and ZFS_DEBUG
Do this by forcing inclusion of
sys/cddl/compat/opensolaris/sys/debug_compat.h
via -include option into all source files from OpenSolaris.
Note that this -include option must always be after -include opt_global.h.

Additionally, remove forced definition of DEBUG for some modules and fix
their build without DEBUG.

Also, meaning of DEBUG was overloaded to enable WITNESS support for some
OpenSolaris (primarily ZFS) locks.  Now this overloading is removed and
that use of DEBUG is replaced with a new option OPENSOLARIS_WITNESS.

MFC after:	17 days
2013-08-06 15:51:56 +00:00
Marius Strobl
4af8466d8b Add MD (for now) atomic_store_acq_<type>() and use it in pmap_activate()
to get the semantics when setting the PMAP right. Prior to r251782, the
latter already used implicit acquire semantics, which - currently - means
to not employ additional explicit memory barriers under the hood (see also
r225889).
2013-08-06 15:34:11 +00:00
Alexander Motin
d9aca4ed74 Block reporting of ZFS features for suspended pools.
Before executing any subcommand, zpool tool fetches pools configuration from
the kernel.  Before features support was added, kernel was regenerating that
configuration based on data always present in memory.  Unfortunately, pool
features list and activity counters are not such. They are stored in ZAP,
that normally resides in ARC, but under heavy memory pressure may be swapped
out.  If pool is suspended at this point, there is no way to recover it back
since any zpool command will stuck.

This change has one predictable flaw: `zpool upgrade` always wish to upgrade
suspended pools, but fortunately it can't do it due to the suspension.
2013-08-06 14:41:41 +00:00
Alexander Motin
f8dcf872c4 Disable r252840 when ZFS TRIM is enabled (vfs.zfs.trim.enabled=1) and really
disable TRIM otherwise.

r252840 (illumos bug 3836) is based on assumption that zio_free_sync() has
no lock dependencies and should complete immediately. Unfortunately, with our
TRIM implementation that is not true due to ZIO_STAGE_VDEV_IO_START added
to the ZIO_FREE_PIPELINE, which, while not really accessing devices, still
acquires SCL_ZIO lock for read to be sure devices won't disappear.

When TRIM is disabled, this patch enables direct free execution from r252840
and removes ZIO_STAGE_VDEV_IO_START and ZIO_STAGE_VDEV_IO_ASSESS stages from
the pipeline to avoid lock acquisition.  Otherwise it queues free request as
it was before r252840.
2013-08-06 14:30:28 +00:00
Alexander Motin
526bb4af8a Make zpool clear to reopen also reconnected cache and spare devices.
Since `zpool status` reports about such kinds of errors, it is strange
that they are not cleared by `zpool clear`.
2013-08-06 14:23:33 +00:00
Alexander Motin
ad727e8d64 Make ZFS to use separate thread to handle SPA_ASYNC_REMOVE async events.
Existing async thread is running only on successfull spa_sync() completion,
that is impossible in case of pool loosing required (last) disk(s).  That
indefinite delay of SPA_ASYNC_REMOVE processing made ZFS to not close the
lost disks, preventing GEOM/CAM from destroying devices and reusing names
on later disk reattach.

In earlier version of the patch I've tried to just run existing thread
immediately, unrelated to spa_sync() completion, but that exposed number
of situations where it could stuck due to locks held by stuck spa_sync(),
that are required for other kinds of async events.

Experiments with OpenIndiana snapshot confirmed that they also have this
issue with lost disks reattach.
2013-08-06 14:20:41 +00:00
Andriy Gapon
5d7430f0a8 dtrace: fix compilation with gcc
Cowardly taking the easiest way and using -Wno-*

MFC after:	3 days
X-MFC with:	r253772
2013-08-06 13:55:39 +00:00
Edward Tomasz Napierala
ea7c84e46f Remove dead code. 2013-08-06 10:42:18 +00:00
Andrew Turner
6d65b3be10 We no longer need to align the stack before calling swi_handler as it is
already aligned correctly in the PUSHFRAME macro.
2013-08-06 10:03:44 +00:00
Sean Bruno
ec5d9810da Update ciss(4) with new models of raid controllers from HP
Submitted by:	scott.benesh@hp.com
MFC after:	2 weeks
Sponsored by:	Hewlett Packard
2013-08-06 03:17:01 +00:00
Justin Hibbits
98a737cf8f Micro-optimize OFW syscons 8-bit blank.
MFC after:	1 week
2013-08-06 03:09:44 +00:00
Justin Hibbits
a3b2cab451 Remove an unnecessary panic. The PVO's PTE entry and the PTEG's PTE entry may
not match, if the PVO's PTE is invalid.
2013-08-06 02:58:16 +00:00
Hiroki Sato
89cac24e48 - Use pget(PGET_CANDEBUG | PGET_NOTWEXIT) to determine if the specified
PID is valid for monitoring in FILEMON_SET_PID ioctl.

- Set the monitored PID to -1 when the process exits.

Suggested by:	jilles
Tested by:	sjg
MFC after:	3 days
2013-08-06 02:14:30 +00:00
Justin Hibbits
804d1cc1b6 Evict pages from the PTEG when it's full and trying to insert a new PTE,
rather than panicking.

Reviewed by:	nwhitehorn
MFC after:	3 weeks
2013-08-06 01:01:15 +00:00
Kirk McKusick
8cf85cf292 With the addition of journalled soft updates, the "newblk" structures
persist much longer than previously. Historically we had at most 100
entries; now the count may reach a million. With the increased count
we spent far too much time looking them up in the grossly undersized
newblk hash table. Configure the newblk hash table to accurately reflect
the number of entries that it must index.

Reviewed by: kib
Tested by:   Peter Holm
MFC after:   2 weeks
2013-08-05 22:02:45 +00:00
Kirk McKusick
57591d8e78 To better understand performance problems with journalled soft updates,
we need to collect the highest level of allocation for each of the
different soft update dependency structures. This change collects these
statistics and makes them available using `sysctl debug.softdep.highuse'.

Reviewed by: kib
Tested by:   Peter Holm
MFC after:   2 weeks
2013-08-05 22:01:16 +00:00
Olivier Houchard
7497e6267c Let the platform calculate the timer frequency at runtime, and use that for
the omap4, instead of relying on the (wrong) value provided in the dts.
2013-08-05 20:14:56 +00:00
Hiroki Sato
7d26db1792 - Use time_uptime instead of time_second in data structures for
PF_INET6 in kernel.  This fixes various malfunction when the wall time
  clock is changed.  Bump __FreeBSD_version to 1000041.

- Use clock_gettime(CLOCK_MONOTONIC_FAST) in userland utilities.

MFC after:	1 month
2013-08-05 20:13:02 +00:00
Konstantin Belousov
456597e7bd Do not override the ENOENT error for the empty path, or EFAULT errors
from copyins, with the relative lookup check.

Discussed with:	rwatson
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2013-08-05 19:42:03 +00:00
Andrew Turner
d8e3f572e2 When entering exception handlers we may not have an aligned stack. This is
because an exception may happen at any time. The stack alignment rules on
ARM EABI state the only place the stack must be 8-byte aligned is on a
function boundary.

If an exception happens while a function is setting up or tearing down it's
stack frame it may not be correctly aligned. There is also no requirement
for it to be when the function is a leaf node.

The fix is to align the stack after we have stored a backup of the old stack
pointer, but before we have stored anything in the trapframe. Along with
this we need to adjust the size of the trapframe by 4 bytes to ensure the
stack below it is also correctly aligned.
2013-08-05 19:06:28 +00:00
Konstantin Belousov
8239a7a878 The tmpfs_alloc_vp() is used to instantiate vnode for the tmpfs node,
in particular, from the tmpfs_lookup VOP method.  If LK_NOWAIT is not
specified in the lkflags, the lookup is supposed to return an alive
vnode whenever the underlying node is valid.

Currently, the tmpfs_alloc_vp() returns ENOENT if the vnode attached
to node exists and is being reclaimed.  This causes spurious ENOENT
errors from lookup on tmpfs and corresponding random 'No such file'
failures from syscalls working with tmpfs files.

Fix this by waiting for the doomed vnode to be detached from the tmpfs
node if sleepable allocation is requested.

Note that filesystems which use vfs_hash.c, correctly handle the case
due to vfs_hash_get() looping when vget() returns ENOENT for sleepable
requests.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2013-08-05 18:53:59 +00:00
Jack F Vogel
7301d64aba Correct a fat-finger in the last delta.
MFC after: ASAP
2013-08-05 16:16:50 +00:00
Alexander Motin
39fe6d85ee MFprojects/camlock r249006:
Pass SIM pointer as an argument to camisr_runqueue() instead of doneq
pointer.
2013-08-05 12:15:53 +00:00
Alexander Motin
ea541bfdaa MFprojects/camlock r249505:
Change CCB queue resize logic to be able safely handle overallocations:
 - (re)allocate queue space in power of 2 chunks with 64 elements minimum
and never shrink it; with only 4/8 bytes per element size is insignificant.
 - automatically reallocate the queue to double size if it is overflowed.
 - if queue reallocation failed, store extra CCBs in unsorted TAILQ,
fetching them back as soon as some queue element is freed.

To free space in CCB for TAILQ linking, change highpowerq from keeping
high-power CCBs to keeping devices frozen due to high-power CCBs.

This encloses all pieces of queue resize logic inside of cam_queue.[ch],
removing some not obvious duties from xpt_release_ccb().
2013-08-05 11:48:40 +00:00
Glen Barber
cfb2932bf4 Redirect svnversion stderr to /dev/null if we cannot determine
the tree version, for example if the tree is checked out with an
outdated svn from ports, but the base system svnlite is built.

Approved by:	kib (mentor)
2013-08-05 10:26:42 +00:00
Attilio Rao
be99683637 Revert r253939:
We cannot busy a page before doing pagefaults.
Infact, it can deadlock against vnode lock, as it tries to vget().
Other functions, right now, have an opposite lock ordering, like
vm_object_sync(), which acquires the vnode lock first and then
sleeps on the busy mechanism.

Before this patch is reinserted we need to break this ordering.

Sponsored by:	EMC / Isilon storage division
Reported by:	kib
2013-08-05 08:55:35 +00:00
Hiroki Sato
41541ebf94 Fix a panic in tmpaddrtimer. 2013-08-05 00:36:12 +00:00
Jeff Roberson
2c0b86b48f - Introduce a specific function, pmap_remove_kernel_pde, for removing
huge pages in the kernel's address space.  This works around several
   asserts from pmap_demote_pde_locked that did not apply and gave false
   warnings.

Discovered by:	pho
Reviewed by:	alc
Sponsored by:	EMC / Isilon Storage Division
2013-08-05 00:28:03 +00:00
Attilio Rao
66bacd7e17 Remove unused member.
Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc
Tested by:	pho
2013-08-04 21:17:05 +00:00
Attilio Rao
3b6714cacb The page hold mechanism is fast but it has couple of fallouts:
- It does not let pages respect the LRU policy
- It bloats the active/inactive queues of few pages

Try to avoid it as much as possible with the long-term target to
completely remove it.
Use the soft-busy mechanism to protect page content accesses during
short-term operations (like uiomove_fromphys()).

After this change only vm_fault_quick_hold_pages() is still using the
hold mechanism for page content access.
There is an additional complexity there as the quick path cannot
immediately access the page object to busy the page and the slow path
cannot however busy more than one page a time (to avoid deadlocks).

Fixing such primitive can bring to complete removal of the page hold
mechanism.

Sponsored by:	EMC / Isilon storage division
Discussed with:	alc
Reviewed by:	jeff
Tested by:	pho
2013-08-04 21:07:24 +00:00
Marcel Moolenaar
b9fdaa9b19 Remove inclusion of <sys/diskmbr.h>. We have no business knowing
anything related to MBR in this file.
2013-08-04 21:00:22 +00:00
Hiren Panchasara
c29173eb94 Fixing a typo.
Approved by:	sbruno (mentor, implicit)
2013-08-04 19:54:47 +00:00
Attilio Rao
878a788734 Remove unnecessary soft busy of the page before to do vn_rdwr() in
kern_sendfile() which is unnecessary.
The page is already wired so it will not be subjected to pagefault.
The content cannot be effectively protected as it is full of races
already.
Multiple accesses to the same indexes are serialized through vn_rdwr().

Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc, jeff
Tested by:	pho
2013-08-04 15:56:19 +00:00
Steven Hartland
e44e975c1b zfs_ioc_rename should not leave the value of zc_name passed in via zc altered
on return.

MFC after:	1 week
2013-08-04 11:38:08 +00:00
Marius Strobl
eb84fc9506 Make r253899 compile. 2013-08-03 21:24:52 +00:00
Justin Hibbits
450f197050 Remove duplicate definition of SPR MMCR0.
MFC after:	3 days
2013-08-03 18:05:12 +00:00
Edward Tomasz Napierala
39ca489ea9 Fix typo. 2013-08-03 13:38:56 +00:00
Ian Lepore
992a44320c Tweak the imx debug console code so that it works with multiple SoCs.
Instead of hard-coding the uart register addresses for the imx51, use
a variable that defaults to the imx51 address.  When debugging another
imx-family SoC, the variable can be set early in initarm() to provide
full console/printf support for debugging early boot.
2013-08-03 13:31:10 +00:00
Ulrich Spörlein
006afc9b5c Add missing depend. 2013-08-03 08:21:35 +00:00
Marcel Moolenaar
90aa031bf1 Add a tunable for the default timeout. 2013-08-03 04:25:25 +00:00
Peter Grehan
80a902ef7d Follow-up commit to fix CR0 issues. Maintain
architectural state on CR vmexits by guaranteeing
that EFER, CR0 and the VMCS entry controls are
all in sync when transitioning to IA-32e mode.

Submitted by:	Tycho Nightingale (tycho.nightingale <at> plurisbusnetworks.com)
2013-08-03 03:16:42 +00:00
Marius Strobl
1e53269ac2 Const'ify scc_driver_name. 2013-08-02 23:31:51 +00:00
Marius Strobl
71bda3eb9a - Use NULL instead of 0 for pointers.
- Remove unnecessary __RMAN_RESOURCE_VISIBLE.
2013-08-02 23:30:32 +00:00
Marius Strobl
c4b1deaf0d - Implement iclear methods for QUICC and SAB 82532. With r253161 in place,
this is is crucial at least for the latter.
  What happens is that attaching uart(4) to scc(4) causes the SAB 82532 to
  "receive" something and trigger a SER_INT_RXREADY interrupt, given that
  at least fast/filter interrupts are already enabled. Prior to r253161,
  uart_bus_ihand() was set up at this point and handled that condition,
  i. e. read the RX FIFO and issued a Receive Message Complete.
  Now, uart_bus_ihand() and uart_intr() are setup after attaching uart(4),
  leaving the SER_INT_RXREADY interrupt triggered during the latter to
  be handled by the iclear method. However, with that method not implement,
  this in turn causes SAB 82532 to not issue any further SER_INT_RXREADY
  interrupts until the RX FIFO is full again. Thus, 15 received bytes go
  to nowhere, given that "the other half" of the RX FIFO is used for status
  information. Hence, implementing sab82532_bfe_iclear() fixes things again.
  Potentially, the same problem exists for QUICC.
- Remove unnecessary __RMAN_RESOURCE_VISIBLE.
- Remove a superfluous header.
- Use KOBJMETHOD_END.
- Mark unused arguments as such.
- Remove variables unused after initialization.

Reviewed by:	marcel (earlier version)
2013-08-02 23:28:49 +00:00
Adrian Chadd
f86392791f Add in some definitions required for later iwn(4) device support.
This also clarifies a few existing fields.

Tested:

* Intel 5100

Submitted by:	Cedric GROSS <cg@gross.info>
2013-08-02 21:28:36 +00:00
Adrian Chadd
a5582fae07 Break out the iwn(4) device IDs into if_iwn_devid.h, as well as add
IDs for new devices.

* Add new device IDs
* Extend the ID probe code to include the newer range of bits used
  by later model devices

Tested:

* Intel 5100, STA mode

TODO:

* Test on Intel 4965, just to be sure

Submitted by:	Cedric GROSS <cg@gross.info>
2013-08-02 21:23:28 +00:00
Olivier Houchard
cca928b9e1 Only receive the interrupts on the first core, to avoid duplicate interrupts. 2013-08-02 20:32:26 +00:00
Navdeep Parhar
82342de26d Display temperature sensor data. Shows -1 if sensor not
available on the card.

# sysctl dev.t4nex.0.temperature
# sysctl dev.t5nex.0.temperature
2013-08-02 18:05:42 +00:00
Navdeep Parhar
73cd922046 Fix previous commit (r253873). "cong" has one bit per channel but the
congestion channel map has 1 nibble per channel.  So bits wxyz need to
be blown up into 000w000x000y000z.
2013-08-02 17:44:19 +00:00
Hiroki Sato
872ce24739 Add p_candebug() check to FILEMON_SET_PID ioctl.
Discussed with:	sjg
MFC after:	3 days
2013-08-02 14:44:11 +00:00
Gleb Smirnoff
977c7043eb Remove extra zeroing after M_ZERO allocation. 2013-08-02 13:06:49 +00:00
Navdeep Parhar
ba41ec4848 Set up congestion manager context properly for T5 based cards.
MFC after:	3 days (will check with re@)
2013-08-01 23:38:30 +00:00
Adrian Chadd
21f8dc458a Now that conf/options knows about if_iwn.h, add it to if_iwn.c.
This allows for IWN_DEBUG (and maybe more stuff later) to be a build
time configure option.
2013-08-01 21:50:50 +00:00
Adrian Chadd
c49debb656 Add IWN_DEBUG as an option for if_iwn. 2013-08-01 21:50:13 +00:00
Adrian Chadd
38b1a25dfd iwn(4) debugging improvements.
* Add in some new register debugging under IWN_DEBUG_REGISTER
* Make IWN_DEBUG an option now for building.  I'll chase this up
  with a commit to 'options' soon.

Submitted by:	Cedric GROSS <cg@cgross.info>
2013-08-01 21:45:30 +00:00
Jack F Vogel
cbe75ae8f5 A number of important fixes:
- mbuf reused after an RX_COPY optimized operation can sometimes have
    a bogus cached address, resulting in TCP hangs. Add critical save points
    to the cached address. Thanks to Michael and the team at Verisign for
    finding this problem.
  - A couple more spots where the rxbuf->flags member should be cleared just
    to be sure no incorrect RX_COPY state is left around. Thanks to Adrian
    for tracking these down.
  - Remove the rearm_queues function from the driver, this was found to be
    responsible for some out-of-order packets by Verisign, and was always a
    bandaid, with the other fixes in this delta the bandaid can finally be
    removed.
  - In the other/link interrupt handler the entire state of the EICS register
    was being writen back into EICR (which clears causes and thus re-enables
    those interrupts), this was wrong, so now mask off the queue portion of
    the register value, so we only clear the other/link interrupt we intend.
    Marc from Verisign found this.
  - Make the SFP+ unsupported option tuneable now, by customer request.
  - Finally, just a couple of minor DEBUG string fixes.

I want to call out and thank all the participants in the 10G community/Intel
calls for helping track down these problems and make the driver better for
everyone!

MFC after:	3 days, these are critical fixes for 9.2!
2013-08-01 20:10:16 +00:00
Marcel Moolenaar
04ae0d7cc5 Fix the build of the testmain target. This target compiles a Forth
interpreter that can be run on the system and as such cannot be
compiled against libbstand. On the one hand this means we need to
include the usual headers for system interfaces that we use and
on the the other hand we can only use standard system interfaces.

While here, define local variables only when needed to make this
WARNS=2 clean on amd64.

PR:		172542
Obtained from:	peterj@
Pointed out by: Jan Beich <jbeich@tormail.org>
2013-08-01 18:06:58 +00:00