Commit Graph

94474 Commits

Author SHA1 Message Date
Simon J. Gerraty
3d2bc9e872 Some objects - such as *_genassym.o are not hooked into
SRCS OBJS or anything else, yet have a dependency on symlinks
such as machine/

Reviewed by: obrien
2013-08-14 22:19:29 +00:00
Michael Tuexen
0e05fbded9 Don't send uninitialized memory (two instances of 4 bytes) in
every cookie on the wire. This bug was reported in
https://bugzilla.mozilla.org/show_bug.cgi?id=905080

MFC after: 3 days
2013-08-14 21:51:32 +00:00
Rick Macklem
93c5875b24 Fix several performance related issues in the new NFS server's
DRC for NFS over TCP.
- Increase the size of the hash tables.
- Create a separate mutex for each hash list of the TCP hash table.
- Single thread the code that deletes stale cache entries.
- Add a tunable called vfs.nfsd.tcphighwater, which can be increased
  to allow the cache to grow larger, avoiding the overhead of frequent
  scans to delete stale cache entries.
  (The default value will result in frequent scans to delete stale cache
   entries, analagous to what the pre-patched code does.)
- Add a tunable called vfs.nfsd.cachetcp that can be used to disable
  DRC caching for NFS over TCP, since the old NFS server didn't DRC cache TCP.
It also adjusts the size of nfsrc_floodlevel dynamically, so that it is
always greater than vfs.nfsd.tcphighwater.

For UDP the algorithm remains the same as the pre-patched code, but the
tunable vfs.nfsd.udphighwater can be used to allow the cache to grow
larger and reduce the overhead caused by frequent scans for stale entries.
UDP also uses a larger hash table size than the pre-patched code.

Reported by:	wollman
Tested by:	wollman (earlier version of patch)
Submitted by:	ivoras (earlier patch)
Reviewed by:	jhb (earlier version of patch)
MFC after:	1 month
2013-08-14 21:11:26 +00:00
Sean Bruno
a2bc8a1d0c If sys/param.h MAXPHYS has been tuned to exceed MFI_MAXPHYS, the mfi(4)
real JBOD mode (SYS PD) would fail fairly reliably during I/O.

Steal the mfi_disk.c check for this condition (indirectly) when establishing
d_maxsize.

Reviewed by:	ambrisko@
MFC after:	4 weeks
Sponsored by:	Yahoo! Inc.
2013-08-14 15:50:34 +00:00
Steven Hartland
dce643c85f Added 4K quirks for:-
* OCZ Agility 2 SSDs
* Marvell SSDs
* Intel X25-M Series SSDs
2013-08-14 15:18:28 +00:00
Pedro F. Giffuni
4a62545173 ext2fs: update format specifiers for ext4 type.
Previous bandaid was not appropriate and didn't really work for
all platforms. While here, cleanup the surrounding code to match
ffs_checkoverlap()

Reported by:	dim, jmallet and bde
MFC after:	3 weeks
2013-08-14 14:22:46 +00:00
Ulrich Spörlein
f1fe1d39e1 Fix make depend 2013-08-14 08:03:57 +00:00
Rui Paulo
a3e08d6f4c Replace the homegrown implementation of nitems() with calls to nitems()
(param.h).

Operating systems that don't have nitems() can easily define it on their own
net80211 OS-specific header file.

Discussed with:		adrian
2013-08-14 04:24:25 +00:00
Mark Johnston
12ede07ab8 Use kld_{load,unload} instead of mod_{load,unload} for the linker file load
and unload event handlers added in r254266.

Reported by:	jhb
X-MFC with:	r254266
2013-08-14 00:42:21 +00:00
Jeff Roberson
99de9af2a6 - Disable quantum caches on the kmem_arena. This can make fragmentation
worse on small KVA systems.  I had intended to only enable it for
   debugging.

Sponsored by:	EMC / Isilon Storage Division
2013-08-13 22:41:24 +00:00
Jeff Roberson
8441d1e842 - Add a statically allocated memguard arena since it is needed very early
on.
 - Pass the appropriate flags to vmem_xalloc() when allocating space for
   the arena from kmem_arena.

Sponsored by:	EMC / Isilon Storage Division
2013-08-13 22:40:43 +00:00
Jung-uk Kim
38da30b419 Merge acpica_machdep.h for amd64 and i386 and move to x86. In fact, these
two files were functionally identical.
2013-08-13 22:05:10 +00:00
Jeff Roberson
d9e232109f Improve pageout flow control to wakeup more frequently and do less work while
maintaining better LRU of active pages.

 - Change v_free_target to include the quantity previously represented by
   v_cache_min so we don't need to add them together everywhere we use them.
 - Add a pageout_wakeup_thresh that sets the free page count trigger for
   waking the page daemon.  Set this 10% above v_free_min so we wakeup before
   any phase transitions in vm users.
 - Adjust down v_free_target now that we're willing to accept more pagedaemon
   wakeups.  This means we process fewer pages in one iteration as well,
   leading to shorter lock hold times and less overall disruption.
 - Eliminate vm_pageout_page_stats().  This was a minor variation on the
   PQ_ACTIVE segment of the normal pageout daemon.  Instead we now process
   1 / vm_pageout_update_period pages every second.  This causes us to visit
   the whole active list every 60 seconds.  Previously we would only maintain
   the active LRU when we were short on pages which would mean it could be
   woefully out of date.

Reviewed by:	alc (slight variant of this)
Discussed with:	alc, kib, jhb
Sponsored by:	EMC / Isilon Storage Division
2013-08-13 21:56:16 +00:00
Jim Harris
086d23cfd3 If a controller fails to initialize, do not notify consumers (nvd) of its
namespaces.

Sponsoredy by:	Intel
Reviewed by:	carl
MFC after:	3 days
2013-08-13 21:49:32 +00:00
Jim Harris
56183abc2b Send a shutdown notification in the driver unload path, to ensure
notification gets sent in cases where system shuts down with driver
unloaded.

Sponsored by:	Intel
Reviewed by:	carl
MFC after:	3 days
2013-08-13 21:47:08 +00:00
Jung-uk Kim
3bd12ca8f1 Tidy up global locks for ACPICA. There is no functional change. 2013-08-13 21:34:03 +00:00
Ian Lepore
9908a5a5e1 Rename imx_machdep.c to imx51_machdep.c, because it contains hardware
addresses which are specific to the imx51 chips.
2013-08-13 21:12:28 +00:00
Mikolaj Golub
c5c392e7ed Virtualize carp(4) variables to have per vnet control.
Reviewed by:	ae, glebius
2013-08-13 19:59:49 +00:00
John Baldwin
e05bf4cf95 Some small cleanups to the fixes in r180340:
- Set NOTE_TRACKERR before running filt_proc().  If the knote did not
  have NOTE_FORK set in fflags when registered, then the TRACKERR event
  could miss being posted.
- Don't pass the pid in to filt_proc() for NOTE_FORK events.  The special
  handling for pids is done knote_fork() directly and no longer in
  filt_proc().

MFC after:	2 weeks
2013-08-13 18:45:58 +00:00
Pedro F. Giffuni
88ae190ea0 ext2fs: update format specifiers for ext4 type.
Reported by:	Sam Fourman Jr.
MFC after:	3 weeks
2013-08-13 18:39:36 +00:00
Pedro F. Giffuni
70097aac13 Define ext2fs local types and use them.
Add definitions for e2fs_daddr_t, e4fs_daddr_t in addition
to the already existing e2fs_lbn_t and adjust them for ext4.
Other than making the code more readable these changes should
fix problems related to big filesystems.

Setting the proper types can be tricky so the process was
helped by looking at UFS. In our implementation, logical block
numbers can be negative and the code depends on it. In ext2,
block numbers are unsigned so it is convenient to keep
e2fs_daddr_t unsigned and use the complete 32 bits. In the
case of e4fs_daddr_t, while the value should be unsigned, for
ext4 we only need to support 48 bits so preserving an extra
bit from the sign is not an issue.

While here also drop the ext2_setblock() prototype that was
never used.

Discussed with:	mckusick, bde
MFC after:	3 weeks
2013-08-13 15:40:43 +00:00
Gleb Smirnoff
90c35c1939 - Minor style(9) fix.
- Bring a comment up to date.
2013-08-13 13:40:31 +00:00
Ian Lepore
e0511b6c67 Add imx6 compatibility and make the driver work for any clock frequency.
There are still a couple references to imx51 ccm driver functions that will
need to be changed after an imx6 ccm driver is written.

Reviewed by:	ray
2013-08-13 13:14:13 +00:00
Adrian Chadd
a1df5ac10a ieee80211_rate2plcp() and ieee80211_rate2phytype() are both pre-11n
routines and thus assert if one passes in a rate code with the
high bit set.

Since the high bit can indicate either IEEE80211_RATE_BASIC or
IEEE80211_RATE_MCS, it's up to the caller to determine whether
the rate is 11n or not, and either mask out the BASIC bit, or
call a different function.

(Yes, this does mean that net80211 should grow 11n-aware rate2phytype()
and rate2plcp() functions..)

This may need to happen for the other drivers - it's currently only
done (now) for iwn(4) and bwi(4).

PR:		kern/181100
2013-08-13 09:58:27 +00:00
Alexander Motin
0f0b2fd889 Return error when opening read-only volumes (like RAID4/5/...) for writing.
Previously opens succeeded, but actual write operations returned errors.

Requested by:	peter
MFC after:	2 weeks
2013-08-13 07:56:40 +00:00
Peter Wemm
0ff204bbd1 The iconv in libc did two things - implement the standard APIs, the GNU
extensions and also tried to be link time compatible with ports libiconv.
This splits that functionality and enables the parts that shouldn't
interfere with the port by default.

WITH_ICONV (now on by default) - adds iconv.h, iconv_open(3) etc.
WITH_LIBICONV_COMPAT (off by default) adds the libiconv_open etc API, linker
symbols and even a stub libiconv.so.3 that are good enough to be able
to 'pkg delete -f libiconv' on a running system and reasonably expect it
to work.

I have tortured many machines over the last few days to try and reduce
the possibilities of foot-shooting as much as I can.  I've successfully
recompiled to enable and disable the libiconv_compat modes, ports that use
libiconv alongside system iconv etc.  If you don't enable the
WITH_LIBICONV_COMPAT switch, they don't share symbol space.

This is an extension of behavior on other system.  iconv(3) is a standard
libc interface and libiconv port expects to be able to run alongside it on
systems that have it.

Bumped osreldate.
2013-08-13 07:15:01 +00:00
Alexander Motin
db8645f05e Oops, wrong constant at r254269. 2013-08-13 06:25:34 +00:00
Alexander Motin
e70b565ba4 Fix reasonable but safe Clang warnings. 2013-08-13 06:21:36 +00:00
Mark Johnston
8776669b53 FreeBSD's DTrace implementation has a few problems with respect to handling
probes declared in a kernel module when that module is unloaded. In
particular,

* Unloading a module with active SDT probes will cause a panic. [1]
* A module's (FBT/SDT) probes aren't destroyed when the module is unloaded;
  trying to use them after the fact will generally cause a panic.

This change fixes both problems by porting the DTrace module load/unload
handlers from illumos and registering them with the corresponding
EVENTHANDLER(9) handlers. This allows the DTrace framework to destroy all
probes defined in a module when that module is unloaded, and to prevent a
module unload from proceeding if some of its probes are active. The latter
problem has already been fixed for FBT probes by checking lf->nenabled in
kern_kldunload(), but moving the check into the DTrace framework generalizes
it to all kernel providers and also fixes a race in the current
implementation (since a probe may be activated between the check and the
call to linker_file_unload()).

Additionally, the SDT implementation has been reworked to define SDT
providers/probes/argtypes in linker sets rather than using SYSINIT/SYSUNINIT
to create and destroy SDT probes when a module is loaded or unloaded. This
simplifies things quite a bit since it means that pretty much all of the SDT
code can live in sdt.ko, and since it becomes easier to integrate SDT with
the DTrace framework. Furthermore, this allows FreeBSD to be quite flexible
in that SDT providers spanning multiple modules can be created on the fly
when a module is loaded; at the moment it looks like illumos' SDT
implementation requires all SDT probes to be statically defined in a single
kernel table.

PR:		166927, 166926, 166928
Reported by:	davide [1]
Reviewed by:	avg, trociny (earlier version)
MFC after:	1 month
2013-08-13 03:10:39 +00:00
Mark Johnston
9c6139e411 Remove some unused fields from struct linker_file. They were added in
r172862 for use by the DTrace SDT framework but don't seem to have ever
been used.

MFC after:	2 weeks
2013-08-13 03:09:00 +00:00
Mark Johnston
c9b645b50b Add event handlers for module load and unload events. The load handlers are
called after the module has been loaded, and the unload handlers are called
before the module is unloaded. Moreover, the module unload handlers may
return an error to prevent the unload from proceeding.

Reviewed by:	avg
MFC after:	2 weeks
2013-08-13 03:07:49 +00:00
Jack F Vogel
83cef45266 Alter the mq_start routine to do a TRYLOCK and call to the locked routine
rather than just queueing. The former code was an attempt at getting
UDP performance up, but there have been customer reports of problems with it,
so the ixgbe approach seems the best solution for now.
2013-08-13 00:25:39 +00:00
Scott Long
c68534f1d5 Update PCI drivers to no longer look at the MEMIO-enabled bit in the PCI
command register.  The lazy BAR allocation code in FreeBSD sometimes
disables this bit when it detects a range conflict, and will re-enable
it on demand when a driver allocates the BAR.  Thus, the bit is no longer
a reliable indication of capability, and should not be checked.  This
results in the elimination of a lot of code from drivers, and also gives
the opportunity to simplify a lot of drivers to use a helper API to set
the busmaster enable bit.

This changes fixes some recent reports of disk controllers and their
associated drives/enclosures disappearing during boot.

Submitted by:	jhb
Reviewed by:	jfv, marius, achadd, achim
MFC after:	1 day
2013-08-12 23:30:01 +00:00
Jack F Vogel
4dc63104ae Improve the MSIX setup code in the drivers, thanks to Marius for
the changes. Make sure that pci_alloc_msix() does give us the vectors
we need and fall back to MSI when it doesn't, also release any that
were allocated when insufficient.

MFC after: 3 days
2013-08-12 22:54:38 +00:00
Adrian Chadd
57b5fc5f3d Blank m_nextpkt before passing it up. 2013-08-12 22:27:53 +00:00
Pedro F. Giffuni
d7511a40a7 Add read-only support for extents in ext2fs.
Basic support for extents was implemented by Zheng Liu as part
of his Google Summer of Code in 2010. This support is read-only
at this time.

In addition to extents we also support the huge_file extension
for read-only purposes. This works nicely with the additional
support for birthtime/nanosec timestamps and dir_index that
have been added lately.

The implementation may not work for all ext4 filesystems as
it doesn't support some features that are being enabled by
default on recent linux like flex_bg. Nevertheless, the feature
should be very useful for migration or simple access in
filesystems that have been converted from ext2/3 or don't use
incompatible features.

Special thanks to Zheng Liu for his dedication and continued
work to support ext2 in FreeBSD.

Submitted by:	Zheng Liu (lz@)
Reviewed by:	Mike Ma, Christoph Mallon (previous version)
Sponsored by:	Google Inc.
MFC after:	3 weeks
2013-08-12 21:34:48 +00:00
Alexander Motin
fe97b88c15 Add brace missing in r254253. 2013-08-12 20:17:37 +00:00
Scott Long
32373512c3 r253460 accidentally some moderately expensive debugging code, even
when debugging isn't enabled.  Work around this.

Submitted by:	mav
Obtained from:	Netflix
MFC after:	3 days
2013-08-12 19:16:55 +00:00
Ed Schouten
647a92d62b Fix the formatting of the error message.
The G_MIRROR_DEBUG() macro already appends a newline. Also, most of the
log messages emitted by gmirror start with an uppercase letter.
2013-08-12 18:17:45 +00:00
Michael Tuexen
2c9c61defa Make the features a 64-bit value instead of 32-bit.
This will allow an easier integration of the support
for NDATA.
While there, do also some minor cleanups.
Obtained from:	rrs@
MFC after: 2 weeks
2013-08-12 13:52:15 +00:00
Hans Petter Selasky
62a963c5f5 - Try to fix build of 32-bit compatibility USB support for FreeBSD and
Linux targets without breaking the existing IOCTL API.

- Remove some not-needed header file inclusions.

- Wrap a long line.

MFC after:	1 week
Reported by:	Damjan Jovanovic <damjan.jov@gmail.com>
2013-08-12 09:17:48 +00:00
Hans Petter Selasky
fcd51bb4fa Correct an EHCI register write.
MFC after:	1 week
Reported by:	aseem.jolly@gmail.com
2013-08-12 06:09:28 +00:00
Devin Teske
ea14379eaa Add optional support for default override of standard setup; but only if
corresponding functions are provided. If override function does not exist,
boot remains unmodified. This patch should not result in any changes.
2013-08-12 03:52:23 +00:00
Adrian Chadd
4bd57e1078 When flushing packets from the powersave queue, make sure that
m_nextpkt is NULL before passing it up to the parent transmit
method.
2013-08-12 02:21:44 +00:00
Adrian Chadd
d52d5066e7 Add a missing break. 2013-08-12 00:38:47 +00:00
Olivier Houchard
ae8ab0e2c4 Only allocate 2 bounce pages for maps that can only use them for buffers that
are unaligned on cache lines boundary, as we will never need more.
2013-08-11 21:21:02 +00:00
Attilio Rao
6006884122 Correct the recovery logic in vm_page_alloc_contig:
what is really needed on this code snipped is that all the pages that
are already fully inserted gets fully freed, while for the others the
object removal itself might be skipped, hence the object might be set to
NULL.

Sponsored by:	EMC / Isilon storage division
Reported by:	alc, kib
Reviewed by:	alc
2013-08-11 21:15:04 +00:00
Jilles Tjoelker
f24deb02bd wait: Make sure WIFSIGNALED(s) is false if WIFCONTINUED(s) is true. 2013-08-11 14:15:01 +00:00
Glen Barber
7c0af95a13 Use realpath(1) to determine the location of the newvers.sh script,
since the current working directory might not be what is expected,
causing svn{,lite}version to fail to find ${0} (itself).

Submitted by:	Dan Mack
2013-08-11 13:57:14 +00:00
Rui Paulo
957c6e86b1 Use device_printf(). 2013-08-11 06:57:57 +00:00
Adrian Chadd
899de76d2d Use the correct structure size when flipping the BT coex state machine.
This showed up when doing some basic testing on the Intel 6230.

Tested:

* Intel 6230, STA mode
2013-08-11 03:39:28 +00:00
Adrian Chadd
da8848ffb4 Prepare for the PAN (personal area network) support for iwn(4).
* Break out the single, static RX context into a pointer, and ..
* .. extend it to two RX contexts - a default and a PAN context.

Whilst here, add a few extra fields in preparation for further iwn(4)
work.

Tested:

* Intel 4965, STA mode - same level of stability
* Intel 5100, STA mode - no change

Submitted by:	Cedric Gross <cg@gross.info>
2013-08-11 01:57:54 +00:00
Adrian Chadd
aca5021d5f Add firmware for the Intel 2030 and variants.
Submitted by:	Cedric GROSS <cg@gross.info>
Obtained from:	Linux, Intel
2013-08-11 01:09:16 +00:00
Adrian Chadd
a887c8a18c Remove a now-unused firmware. 2013-08-11 01:04:07 +00:00
Adrian Chadd
0cdfe2ae89 Update the 6000g2a image.
Obtained from:	Linux, Intel
2013-08-11 01:03:32 +00:00
Rui Paulo
e009490afc fasttrap_fork(): unlock the processes before removing the tracepoints.
In the future, we'll need to come up with new proc_*() functions that accept
locked processes. For now, this prevents postgresql + DTrace from crashing the
system.

MFC after:	1 month
2013-08-11 00:57:01 +00:00
Adrian Chadd
1df885c863 Add in missing m_free()'s during error conditions. 2013-08-10 21:46:58 +00:00
Konstantin Belousov
2f7c18600c The r254167 moved initialization of the sleepqueues before the witness
is operational.  init_sleepqueues() initializes 256 mutexes, which,
due to witness still being cold, started to overflow the pending_locks
array.

As stated in the reported panic message, increase WITNESS_PENDLIST
from 768 to 1024, which provides space for additional 256 locks.

Reported by:	many
Tested by:	rakuco, bdrewery
2013-08-10 21:42:14 +00:00
Konstantin Belousov
1f039bded4 Match malloc(9) calls with free(9), not contigfree(9). Also remove
unneeded checks for NULL, free(9) can handle NULL pointers on its own,
and the regions were allocated with M_WAITOK flag as well.

Reported and tested by:	Larry Rosenman <ler@lerctr.org>
MFC after:	1 week
2013-08-10 20:54:15 +00:00
Konstantin Belousov
6c6ff43fbe The random_adapters.c is standard in the conf/files. Revert wrong
r254185.

Pointed out by:	peter
2013-08-10 19:38:29 +00:00
Konstantin Belousov
e5e4e178b7 Restore the ability to kldload random.ko, by linking in the newly
added random_adaptors.c.
2013-08-10 18:23:28 +00:00
Glen Barber
887d03eaf7 Fix a typo. The script should run /usr/bin/svnliteversion instead of
/usr/bin/svnversion in the affected section.

Reported by:	lev, Dan Mack
2013-08-10 18:23:18 +00:00
Konstantin Belousov
c325e866f4 Different consumers of the struct vm_page abuse pageq member to keep
additional information, when the page is guaranteed to not belong to a
paging queue.  Usually, this results in a lot of type casts which make
reasoning about the code correctness harder.

Sometimes m->object is used instead of pageq, which could cause real
and confusing bugs if non-NULL m->object is leaked.  See r141955 and
r253140 for examples.

Change the pageq member into a union containing explicitly-typed
members.  Use them instead of type-punning or abusing m->object in x86
pmaps, uma and vm_page_alloc_contig().

Requested and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
2013-08-10 17:36:42 +00:00
Olivier Houchard
477f81c83e Use the correct address when calling kva_free()
Pointy hat to:	cognet
Spotted out by:	alc
2013-08-10 00:53:22 +00:00
Olivier Houchard
e32c2d4742 - The address lies in the bus space handle, not in the cookie
- Use the right address when calling kva_free()
(Is there any reason why the s3c2xx0 comes with its own version of bs_map/
 bs_unmap ? It seems to be just the same as in bus_space_generic.c)
2013-08-10 00:31:49 +00:00
Andrey Zonov
767cfe52cc Remove unused definition for CTL_VM_NAMES.
Suggested by:	bde
2013-08-09 23:47:43 +00:00
Olivier Houchard
662423aeaf Don't call sleepinit() from proc0_init(), make it a SYSINIT instead.
vmem needs the sleepq locks to be initialized when free'ing kva, so we want it
called as early as possible.
2013-08-09 23:13:52 +00:00
Olivier Houchard
e137643ef3 Instead of just trying to do it for arm, make sure vm_kmem_size is properly
aligned in kmeminit(), where it'll work for any arch.

Suggested by:	alc
2013-08-09 22:30:54 +00:00
Olivier Houchard
bdd1acb296 - The address lies in the bus space handle, not in the cookie
- Use the right address when calling kva_free()
2013-08-09 21:56:28 +00:00
Olivier Houchard
c76853ec15 Make sure vm_kmem_size is aligned on a page boundary, since that's what vmem
expects.
2013-08-09 21:53:02 +00:00
John Baldwin
cdc00bf7d2 Revert the addition of VPO_BUSY and instead update vm_page_replace() to
properly unbusy the page.

Submitted by:	alc
2013-08-09 21:14:55 +00:00
Marcel Moolenaar
d6c0f33b57 Fix the freaddir implementation for the stand-alone interpreter.
Bug pointed out by: Jan Beich <jbeich@tormail.org>
2013-08-09 19:10:56 +00:00
David E. O'Brien
5d82a21469 Add missing 'VPO_BUSY' from r254141 to fix kernel build break. 2013-08-09 16:43:50 +00:00
David E. O'Brien
5711939b63 * Add random_adaptors.[ch] which is basically a store of random_adaptor's.
random_adaptor is basically an adapter that plugs in to random(4).
  random_adaptor can only be plugged in to random(4) very early in bootup.
  Unplugging random_adaptor from random(4) is not supported, and is probably a
  bad idea anyway, due to potential loss of entropy pools.
  We currently have 3 random_adaptors:
  + yarrow
  + rdrand (ivy.c)
  + nehemeiah

* Remove platform dependent logic from probe.c, and move it into
  corresponding registration routines of each random_adaptor provider.
  probe.c doesn't do anything other than picking a specific random_adaptor
  from a list of registered ones.

* If the kernel doesn't have any random_adaptor adapters present then the
  creation of /dev/random is postponed until next random_adaptor is kldload'ed.

* Fix randomdev_soft.c to refer to its own random_adaptor, instead of a
  system wide one.

Submitted by: arthurmesh@gmail.com, obrien
Obtained from: Juniper Networks
Reviewed by: so (des)
2013-08-09 15:31:50 +00:00
Attilio Rao
e946b94934 On all the architectures, avoid to preallocate the physical memory
for nodes used in vm_radix.
On architectures supporting direct mapping, also avoid to pre-allocate
the KVA for such nodes.

In order to do so make the operations derived from vm_radix_insert()
to fail and handle all the deriving failure of those.

vm_radix-wise introduce a new function called vm_radix_replace(),
which can replace a leaf node, already present, with a new one,
and take into account the possibility, during vm_radix_insert()
allocation, that the operations on the radix trie can recurse.
This means that if operations in vm_radix_insert() recursed
vm_radix_insert() will start from scratch again.

Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc (older version)
Reviewed by:	jeff
Tested by:	pho, scottl
2013-08-09 11:28:55 +00:00
Attilio Rao
ac6b769be9 Give mutex(9) the ability to recurse on a per-instance basis.
Now the MTX_RECURSE flag can be passed to the mtx_*_flag() calls.
This helps in cases we want to narrow down to specific calls the
possibility to recurse for some locks.

Sponsored by:	EMC / Isilon storage division
Reviewed by:	jeff, alc
Tested by:	pho
2013-08-09 11:24:29 +00:00
Attilio Rao
c7aebda8a1 The soft and hard busy mechanism rely on the vm object lock to work.
Unify the 2 concept into a real, minimal, sxlock where the shared
acquisition represent the soft busy and the exclusive acquisition
represent the hard busy.
The old VPO_WANTED mechanism becames the hard-path for this new lock
and it becomes per-page rather than per-object.
The vm_object lock becames an interlock for this functionality:
it can be held in both read or write mode.
However, if the vm_object lock is held in read mode while acquiring
or releasing the busy state, the thread owner cannot make any
assumption on the busy state unless it is also busying it.

Also:
- Add a new flag to directly shared busy pages while vm_page_alloc
  and vm_page_grab are being executed.  This will be very helpful
  once these functions happen under a read object lock.
- Move the swapping sleep into its own per-object flag

The KPI is heavilly changed this is why the version is bumped.
It is very likely that some VM ports users will need to change
their own code.

Sponsored by:	EMC / Isilon storage division
Discussed with:	alc
Reviewed by:	jeff, kib
Tested by:	gavin, bapt (older version)
Tested by:	pho, scottl
2013-08-09 11:11:11 +00:00
Edward Tomasz Napierala
8ddc3590cc Don't dereference null pointer should acl_alloc() be passed M_NOWAIT
and allocation failed.  Nothing in the tree passed M_NOWAIT.

Obtained from:	mjg
MFC after:	1 month
2013-08-09 08:40:31 +00:00
Andriy Gapon
9ba0691bdd follow up to r254051
- update powerpc/GENERIC64 as well, suggested by mdf
- update comments so that they make sense after the change, suggested by
  jhb

X-MFC after:	never (change specific to head)
2013-08-09 08:11:09 +00:00
Jeff Roberson
863c7e4562 - Reserve a special AF for SDP. The one we were incorrectly using before
was taken by another AF.

Sponsored by:	EMC / Isilon Storage Division
2013-08-09 03:26:17 +00:00
Jeff Roberson
3d71c6cf45 - Correctly handle various edge cases in sysfs emulation.
Sponsored by:	EMC / Isilon Storage Division
2013-08-09 03:24:48 +00:00
Jeff Roberson
ba397d0f16 - Use the correct type in the linux bitops emulation.
Submitted by:	Maxim Ignatenko <gelraen.ua@gmail.com>
2013-08-09 03:24:12 +00:00
Pyun YongHyeon
69b1f509a4 Fix for IPv4 fragment packets treated as RMCP.
bit25 of rxMode MAC register of 5762 needs to be set for rx mgmt
filter to work correctly when processing match for UDP header
fields.  Otherwise false positive can occur which causes IPv4
fragment to be received by APE instead of host.

Reported by:	Geans Pin <geanspin@broadcom.com>
2013-08-09 01:15:32 +00:00
Scott Long
fe8391035a Rate limit the 'out of chain frame' messages to once per 60 seconds.
Obtained from:	Netflix
MFC after:	3 days
2013-08-09 01:10:33 +00:00
Scott Long
d9802deb4e Sometimes a device misbehaves so badly that it disrupts the entire system.
Add a tunable that allows such a device to be excluded from the driver.
The id parameter is the target id that the driver assigns to a given device.

dev.mps.X.exclude_ids=<id>,<id>

Obtained from:	Netflix
MFC after:	3 days
2013-08-09 01:09:02 +00:00
Scott Long
f510415d84 Add a helpful message that can help point to why a sysctl tree removal failed
Obtained from:	Netflix
MFC after:	3 days
2013-08-09 01:04:44 +00:00
Xin LI
43667c1f68 MFV r254079:
Illumos ZFS issues:
  3957 ztest should update the cachefile before killing itself
  3958 multiple scans can lead to partial resilvering
  3959 ddt entries are not always resilvered
  3960 dsl_scan can skip over dedup-ed blocks if
       physical birth != logical birth
  3961 freed gang blocks are not resilvered and can cause pool to suspend
  3962 ztest should print out zfs debug buffer before exiting
2013-08-08 23:38:31 +00:00
Devin Teske
5b502b3b4c Update legacy static assignments in old code to support dynamic framing,
plotting, and alignment coinciding with enhancements in SVN r242667.
2013-08-08 22:34:00 +00:00
Devin Teske
5e82c321f8 Since the introduction of SVN r244048 and [follow-up] r244089, it is now
safe to build upon ``boot_serial?'' functionality to make safer UI choices.
2013-08-08 22:09:46 +00:00
Pedro F. Giffuni
95f1f8d262 Small typo.
MFC after:	3 days
2013-08-08 22:07:59 +00:00
Ryan Stone
08a42caa50 Allow drivers to return BUS_PROBE_NOWILDCARD from their attach routine to
match devices where the driver class was fixed but the unit number was
wildcarded.  This better matches the documented behaviour in
DEVICE_PROBE(9).

Reviewed by:	imp
2013-08-08 19:30:49 +00:00
Andrey V. Elsukov
b74dd6c77b gpt_entries is used as limit for the number of partition entries in
the GEOM_PART. Instead of just using number of entries from the GPT
header, calculate this limit based on the reserved space between
GPT header and first available LBA.

MFC after:	2 weeks
2013-08-08 16:09:20 +00:00
Glen Barber
a61914445d When newvers.sh is run, it is possible that the svnversion
(or svnliteversion) in the current lookup path is not what
was used to check out the tree.  If an incompatible version
is used, the svn revision number is not reported in uname(1).

Run ${svnversion} on newvers.sh itself when evaluating if the
svn(1) in use is compatible with the tree.  Fallback to an
empty ${svnversion} if necessary.

With this change, svnliteversion from base is only used
if no compatible svnversion is found, so with this change,
the version of svn(1) from the ports tree is evaluated first.

Requested by:	many
MFC after:	3 days
X-MFC-To:	stable/9, releng/9.2 only
2013-08-08 15:59:00 +00:00
Andrey V. Elsukov
4371b649aa Make the check for number of entries less strict.
Some partitioning tools can create GPT with number of entries less
than 128.

MFC after:	1 week
2013-08-08 11:24:25 +00:00
Adrian Chadd
4030a4b2a3 Cap the number of streams supported to two for now.
I haven't yet reviewed the Intel driver(s) in more depth to see if
there are 1x1 NICs that report they support 2 transmit/receive chains..
if so then we'll have to update this.

Tested:

* Intel 4965, which is a 2x2 device with 3 RX and 2 TX chains.

PR:		kern/181132
2013-08-08 05:52:41 +00:00
Adrian Chadd
e7495198d5 Convert net80211 over to using if_transmit for the dispatch from the
upper layer(s).

This eliminates the if_snd queue from net80211. Yay!

This unfortunately has a few side effects:

* It breaks ALTQ to net80211 for now - sorry everyone, but fixing
  parallelism and eliminating the if_snd queue is more important
  than supporting this broken traffic scheduling model. :-)

* There's no VAP and IC flush methods just yet - I think I'll add
  some NULL methods for now just as placeholders.

* It reduces throughput a little because now net80211 will drop packets
  rather than buffer them if the driver doesn't do its own buffering.
  This will be addressed in the future as I implement per-node software
  queues.

Tested:

* ath(4) and iwn(4) in STA operation
2013-08-08 05:09:35 +00:00
Neel Natu
f263e391a3 Use local variables with the appropriate types and eliminate a bunch of casts.
This is a cosmetic change but it does help with a proposed change to increase
the maximum size of physical memory supported on amd64 platforms.

Submitted by:	Chris Torek (torek@torek.net)
2013-08-08 03:17:39 +00:00
Xin LI
9d2f243aa6 MFV r254071:
Fix a regression introduced by fix for Illumos bug #3834.  Quote from
Matthew Ahrens on the Illumos issue:

ztest fails this assertion because ztest_dmu_read_write() does
        dmu_tx_hold_free(tx, bigobj, bigoff, bigsize);
and then
    dmu_object_set_checksum(os, bigobj,
        (enum zio_checksum)ztest_random_dsl_prop(ZFS_PROP_CHECKSUM), tx);

If the region to free is past the end of the file, the DMU assumes that there
will be nothing to do for this object.  However, ztest does set_checksum(),
which must modify the dnode.  The fix is for ztest to also call

    dmu_tx_hold_bonus(tx, bigobj);

so we can account for the dirty data associated with setting the checksum

Illumos ZFS issues:
  3955 ztest failure: assertion refcount_count(&tx->tx_space_written)
         + delta <= tx->tx_space_towrite
2013-08-07 22:21:00 +00:00
Adrian Chadd
cc80eae5cf Allow net80211 to compile on stable/9 and stable/8. 2013-08-07 22:01:43 +00:00
Xin LI
4f7b34578b MFV r254070:
Merge vendor bugfix for ZFS test suite that triggers false positives.

Illumos ZFS issues:
  3949 ztest fault injection should avoid resilvering devices
  3950 ztest: deadman fires when we're doing a scan
  3951 ztest hang when running dedup test
  3952 ztest: ztest_reguid test and ztest_fault_inject don't place nice together
2013-08-07 21:16:14 +00:00
John Baldwin
5b596f0f5f Don't emit a spurious EVFILT_PROC event with no fflags set on process exit
if NOTE_EXIT is not being monitored.  The rationale is that a listener
should only get an event for exit() if they registered interest via
NOTE_EXIT.  This matches the behavior on OS X.
- Don't save the exit status on process exit unless NOTE_EXIT is being
  monitored.
- Add an internal EV_DROP flag that requests kqueue_scan() to free the
  knote without signalling it to userland and use this when a process
  exits but the fflags in the knote is zero.

Reviewed by:	jmg
MFC after:	1 month
2013-08-07 19:56:35 +00:00
Konstantin Belousov
449c2e92c9 Split the pagequeues per NUMA domains, and split pageademon process
into threads each processing queue in a single domain.  The structure
of the pagedaemons and queues is kept intact, most of the changes come
from the need for code to find an owning page queue for given page,
calculated from the segment containing the page.

The tie between NUMA domain and pagedaemon thread/pagequeue split is
rather arbitrary, the multithreaded daemon could be allowed for the
single-domain machines, or one domain might be split into several page
domains, to further increase concurrency.

Right now, each pagedaemon thread tries to reach the global target,
precalculated at the start of the pass.  This is not optimal, since it
could cause excessive page deactivation and freeing.  The code should
be changed to re-check the global page deficit state in the loop after
some number of iterations.

The pagedaemons reach the quorum before starting the OOM, since one
thread inability to meet the target is normal for split queues.  Only
when all pagedaemons fail to produce enough reusable pages, OOM is
started by single selected thread.

Launder is modified to take into account the segments layout with
regard to the region for which cleaning is performed.

Based on the preliminary patch by jeff, sponsored by EMC / Isilon
Storage Division.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-08-07 16:36:38 +00:00
Konstantin Belousov
872d995f76 Change the pmap_ts_referenced() method of amd64 pmap to use shared
pvh_global_lock.  This allows the method to be executed in parallel,
avoiding undue contention on the pvh_global_lock for the multithreaded
pagedaemon.

The pmap_ts_referenced() function has to inspect the page mappings for
several pmaps, which need to be locked while pv list lock is owned.
This contradicts to the lock order, where pmap lock is before pv list
lock.  Introduce the generation count for the pv list of the page or
superpage, which indicate any change in the pv list, and, as usual,
perform restart of the iteration if generation changed while pv lock
was dropped for blocking acquire of a pmap lock.

Reported and tested by:	pho
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
2013-08-07 16:33:15 +00:00
Olivier Houchard
d03426e619 Don't bother trying to work around buffers which are not aligned on a cache
line boundary. It has never been 100% correct, and it can't work on SMP,
because nothing prevents another core from accessing data from an unrelated
buffer in the same cache line while we invalidated it. Just use bounce pages
instead.

Reviewed by:	ian
Approved by:	mux (mentor) (implicit)
2013-08-07 15:44:58 +00:00
Alexander Motin
a29779e865 Remove droping topology mutex after iterating 100 periphs in CAMGETPASSTHRU.
That is not so slow and so often operation to handle unneeded otherwise
xsoftc.xpt_generation and respective locking complications.
2013-08-07 11:34:20 +00:00
Ganbold Tsagaankhuu
696ec285aa Bring initial support for Allwinner A20 SoC (Cubieboard2).
Add support for A20 timer.
	Correct interrupt offset depending from chip.
	Add basic code for CPU configuration module.
	For now, add kernel config and dts file
	(only FDT blob related problem needs to be solved later in
	order to have one kernel for both cubieboard1 and 2).

Approved by: ray@
2013-08-07 11:07:56 +00:00
Alexander Motin
71185c66dd Improve r253721 by reporting detected lack of BIO_FLUSH support to GEOM.
That prevents more of such requests from coming and errors from logging.
2013-08-07 08:20:11 +00:00
Andriy Gapon
818d282e7b enable KDB_TRACE in GENERICs
KDB_TRACE is not an alternative to DDB/etc, they are complementary.
So I do not see any reason to not enable KDB_TRACE by default.

X-MFC after:	never (change specific to head)
2013-08-07 08:03:50 +00:00
Kevin Lo
3de1bd9502 Remove unsigned comparison < 0
Found by:	LLVM
Reviewed by:	luigi
2013-08-07 07:22:56 +00:00
Jeff Roberson
5df87b21d3 Replace kernel virtual address space allocation with vmem. This provides
transparent layering and better fragmentation.

 - Normalize functions that allocate memory to use kmem_*
 - Those that allocate address space are named kva_*
 - Those that operate on maps are named kmap_*
 - Implement recursive allocation handling for kmem_arena in vmem.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	EMC / Isilon Storage Division
2013-08-07 06:21:20 +00:00
Mark Johnston
5bc4f6b3ab Add a missing module version declaration to if_tun(4).
PR:		181078
Submitted by:	Brandon Gooch <jamesbrandongooch@gmail.com>
MFC after:	1 week
2013-08-07 01:32:08 +00:00
Mark Johnston
c0432fc38b Fill in the description fields for M_FICT_PAGES.
Reviewed by:	kib
MFC after:	3 days
2013-08-07 00:20:30 +00:00
Marcel Moolenaar
e01c6f329a Change <sys/diskpc98.h> to not redefine the same symbols that are
being defined in <sys/diskmbr.h>. Instead give the symbols here a
"PC98_" prefix. This way, both <sys/diskmbr.h> and <sys/diskpc98.h>
can be included in the same C source file.

The renaming is trivial. The only gotcha is that DOSBBSECTOR is
also redefined from 0 to 1. This because DOSBBSECTOR was always
used in conjunction with an addition of 1. The PC98_BBSECTOR symbol
is defined as 1 and the expression is simplified.

Note: it is not believed that ports are seriously impacted; or at
all for that matter.

Approved by: nyan@
2013-08-07 00:00:48 +00:00
Xin LI
c668ff330e MFV r254011:
This change have no effect to FreeBSD but integrated for
completeness.

Illumos ZFS issues:
  348 ZFS should handle DKIOCGMEDIAINFOEXT failure
2013-08-06 21:36:01 +00:00
Jack F Vogel
d0913b7f25 Make the various driver MSIX setup routines fallback to MSI more
gracefully. This change was suggested by Marius Strobl, thank you.

PR: kern/181016
MFC after: ASAP
2013-08-06 21:01:38 +00:00
Marius Strobl
0cfcfc1918 - Fix a bug in the MSI allocation logic so an MSI is also employed if a
controller supports only a single message. I haven't seen such an adapter
  out in the wild, though, so this change likely is a NOP.
  While at it, further simplify the MSI allocation logic; there's no need
  to check the number of available messages on our own as pci_alloc_msi(9)
  will just fail if it can't provide us with the single message we want.
- Nuke the unused softc of aacch(4).

MFC after:	1 month
2013-08-06 19:14:02 +00:00
Marius Strobl
21e5a2223c As it turns out, MSIs are broken with 2820SA so introduce an AAC_FLAGS_NOMSI
quirk and apply it to these controllers [1]. The same problem was reported
for 2230S, in which case it wasn't actually clear whether the culprit is the
controller or the mainboard, though. In order to be on the safe side, flag
MSIs as being broken with the latter type of controller as well. Given that
these are the only reports of MSI-related breakage with aac(4) so far and
OSes like OpenSolaris unconditionally employ MSIs for all adapters of this
family, however, it doesn't seem warranted to generally disable the use of
MSIs in aac(4).
While it, simplify the MSI allocation logic a bit; there's no need to check
for the presence of the MSI capability on our own as pci_alloc_msi(9) will
just fail when these kind of interrupts are not available.
Reported and tested by: David Boyd [1]

MFC after:	3 days
2013-08-06 18:55:59 +00:00
Jack F Vogel
54a6317360 When the igb driver is static there are cases when early interrupts occur,
resulting in a panic in refresh_mbufs, to prevent this add a check in the
interrupt handler for DRV_RUNNING.

MFC after: 1 day (critical for 9.2)
2013-08-06 18:00:53 +00:00
Hiroki Sato
ffa0165ae0 Fix incompatibility in ICMPV6CTL_ND6_PRLIST sysctl, and SIOCGPRLST_IN6,
SIOCGDRLST_IN6, and SIOCGNBRINFO_IN6 ioctl.  These userland interfaces
treat expiration times in time_second, not time_uptime.
2013-08-06 17:10:52 +00:00
Kirk McKusick
824009a16a This bug fix is in a code path in rename taken when there is a
collision between a rename and an open system call for the same
target file. Here, rename releases its vnode references, waits for
the open to finish, and then restarts by reacquiring its needed
vnode locks. In this case, rename was unlocking but failing to
release its reference to one of its held vnodes. The effect was
that even after all the actual references to the vnode had gone,
the vnode still showed active references. For files that had been
removed, their space was not reclaimed until the filesystem was
forcibly unmounted.

This bug manifested itself in the Postgres server which would
leak/lose hundreds of files per day amounting to many gigabytes of
disk space. This bug required shutting down Postgres, forcibly
unmounting its filesystem, remounting its filesystem and restarting
Postgres every few days to recover the lost space.

Reported by: Dan Thomas and Palle Girgensohn
Bug-fix by:  kib
Tested by:   Dan Thomas and Palle Girgensohn
MFC after:   2 weeks
2013-08-06 16:50:05 +00:00
Andriy Gapon
63230519fa fix fat-fingering in r253996
MFC after:	17 days
X-MFC with:	r253996
2013-08-06 16:18:07 +00:00
Andriy Gapon
c319ea15f4 opensolaris code: translate INVARIANTS to DEBUG and ZFS_DEBUG
Do this by forcing inclusion of
sys/cddl/compat/opensolaris/sys/debug_compat.h
via -include option into all source files from OpenSolaris.
Note that this -include option must always be after -include opt_global.h.

Additionally, remove forced definition of DEBUG for some modules and fix
their build without DEBUG.

Also, meaning of DEBUG was overloaded to enable WITNESS support for some
OpenSolaris (primarily ZFS) locks.  Now this overloading is removed and
that use of DEBUG is replaced with a new option OPENSOLARIS_WITNESS.

MFC after:	17 days
2013-08-06 15:51:56 +00:00
Marius Strobl
4af8466d8b Add MD (for now) atomic_store_acq_<type>() and use it in pmap_activate()
to get the semantics when setting the PMAP right. Prior to r251782, the
latter already used implicit acquire semantics, which - currently - means
to not employ additional explicit memory barriers under the hood (see also
r225889).
2013-08-06 15:34:11 +00:00
Alexander Motin
d9aca4ed74 Block reporting of ZFS features for suspended pools.
Before executing any subcommand, zpool tool fetches pools configuration from
the kernel.  Before features support was added, kernel was regenerating that
configuration based on data always present in memory.  Unfortunately, pool
features list and activity counters are not such. They are stored in ZAP,
that normally resides in ARC, but under heavy memory pressure may be swapped
out.  If pool is suspended at this point, there is no way to recover it back
since any zpool command will stuck.

This change has one predictable flaw: `zpool upgrade` always wish to upgrade
suspended pools, but fortunately it can't do it due to the suspension.
2013-08-06 14:41:41 +00:00
Alexander Motin
f8dcf872c4 Disable r252840 when ZFS TRIM is enabled (vfs.zfs.trim.enabled=1) and really
disable TRIM otherwise.

r252840 (illumos bug 3836) is based on assumption that zio_free_sync() has
no lock dependencies and should complete immediately. Unfortunately, with our
TRIM implementation that is not true due to ZIO_STAGE_VDEV_IO_START added
to the ZIO_FREE_PIPELINE, which, while not really accessing devices, still
acquires SCL_ZIO lock for read to be sure devices won't disappear.

When TRIM is disabled, this patch enables direct free execution from r252840
and removes ZIO_STAGE_VDEV_IO_START and ZIO_STAGE_VDEV_IO_ASSESS stages from
the pipeline to avoid lock acquisition.  Otherwise it queues free request as
it was before r252840.
2013-08-06 14:30:28 +00:00
Alexander Motin
526bb4af8a Make zpool clear to reopen also reconnected cache and spare devices.
Since `zpool status` reports about such kinds of errors, it is strange
that they are not cleared by `zpool clear`.
2013-08-06 14:23:33 +00:00
Alexander Motin
ad727e8d64 Make ZFS to use separate thread to handle SPA_ASYNC_REMOVE async events.
Existing async thread is running only on successfull spa_sync() completion,
that is impossible in case of pool loosing required (last) disk(s).  That
indefinite delay of SPA_ASYNC_REMOVE processing made ZFS to not close the
lost disks, preventing GEOM/CAM from destroying devices and reusing names
on later disk reattach.

In earlier version of the patch I've tried to just run existing thread
immediately, unrelated to spa_sync() completion, but that exposed number
of situations where it could stuck due to locks held by stuck spa_sync(),
that are required for other kinds of async events.

Experiments with OpenIndiana snapshot confirmed that they also have this
issue with lost disks reattach.
2013-08-06 14:20:41 +00:00
Andriy Gapon
5d7430f0a8 dtrace: fix compilation with gcc
Cowardly taking the easiest way and using -Wno-*

MFC after:	3 days
X-MFC with:	r253772
2013-08-06 13:55:39 +00:00
Edward Tomasz Napierala
ea7c84e46f Remove dead code. 2013-08-06 10:42:18 +00:00
Andrew Turner
6d65b3be10 We no longer need to align the stack before calling swi_handler as it is
already aligned correctly in the PUSHFRAME macro.
2013-08-06 10:03:44 +00:00
Sean Bruno
ec5d9810da Update ciss(4) with new models of raid controllers from HP
Submitted by:	scott.benesh@hp.com
MFC after:	2 weeks
Sponsored by:	Hewlett Packard
2013-08-06 03:17:01 +00:00
Justin Hibbits
98a737cf8f Micro-optimize OFW syscons 8-bit blank.
MFC after:	1 week
2013-08-06 03:09:44 +00:00
Justin Hibbits
a3b2cab451 Remove an unnecessary panic. The PVO's PTE entry and the PTEG's PTE entry may
not match, if the PVO's PTE is invalid.
2013-08-06 02:58:16 +00:00
Hiroki Sato
89cac24e48 - Use pget(PGET_CANDEBUG | PGET_NOTWEXIT) to determine if the specified
PID is valid for monitoring in FILEMON_SET_PID ioctl.

- Set the monitored PID to -1 when the process exits.

Suggested by:	jilles
Tested by:	sjg
MFC after:	3 days
2013-08-06 02:14:30 +00:00
Justin Hibbits
804d1cc1b6 Evict pages from the PTEG when it's full and trying to insert a new PTE,
rather than panicking.

Reviewed by:	nwhitehorn
MFC after:	3 weeks
2013-08-06 01:01:15 +00:00
Kirk McKusick
8cf85cf292 With the addition of journalled soft updates, the "newblk" structures
persist much longer than previously. Historically we had at most 100
entries; now the count may reach a million. With the increased count
we spent far too much time looking them up in the grossly undersized
newblk hash table. Configure the newblk hash table to accurately reflect
the number of entries that it must index.

Reviewed by: kib
Tested by:   Peter Holm
MFC after:   2 weeks
2013-08-05 22:02:45 +00:00
Kirk McKusick
57591d8e78 To better understand performance problems with journalled soft updates,
we need to collect the highest level of allocation for each of the
different soft update dependency structures. This change collects these
statistics and makes them available using `sysctl debug.softdep.highuse'.

Reviewed by: kib
Tested by:   Peter Holm
MFC after:   2 weeks
2013-08-05 22:01:16 +00:00
Olivier Houchard
7497e6267c Let the platform calculate the timer frequency at runtime, and use that for
the omap4, instead of relying on the (wrong) value provided in the dts.
2013-08-05 20:14:56 +00:00
Hiroki Sato
7d26db1792 - Use time_uptime instead of time_second in data structures for
PF_INET6 in kernel.  This fixes various malfunction when the wall time
  clock is changed.  Bump __FreeBSD_version to 1000041.

- Use clock_gettime(CLOCK_MONOTONIC_FAST) in userland utilities.

MFC after:	1 month
2013-08-05 20:13:02 +00:00
Konstantin Belousov
456597e7bd Do not override the ENOENT error for the empty path, or EFAULT errors
from copyins, with the relative lookup check.

Discussed with:	rwatson
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2013-08-05 19:42:03 +00:00
Andrew Turner
d8e3f572e2 When entering exception handlers we may not have an aligned stack. This is
because an exception may happen at any time. The stack alignment rules on
ARM EABI state the only place the stack must be 8-byte aligned is on a
function boundary.

If an exception happens while a function is setting up or tearing down it's
stack frame it may not be correctly aligned. There is also no requirement
for it to be when the function is a leaf node.

The fix is to align the stack after we have stored a backup of the old stack
pointer, but before we have stored anything in the trapframe. Along with
this we need to adjust the size of the trapframe by 4 bytes to ensure the
stack below it is also correctly aligned.
2013-08-05 19:06:28 +00:00
Konstantin Belousov
8239a7a878 The tmpfs_alloc_vp() is used to instantiate vnode for the tmpfs node,
in particular, from the tmpfs_lookup VOP method.  If LK_NOWAIT is not
specified in the lkflags, the lookup is supposed to return an alive
vnode whenever the underlying node is valid.

Currently, the tmpfs_alloc_vp() returns ENOENT if the vnode attached
to node exists and is being reclaimed.  This causes spurious ENOENT
errors from lookup on tmpfs and corresponding random 'No such file'
failures from syscalls working with tmpfs files.

Fix this by waiting for the doomed vnode to be detached from the tmpfs
node if sleepable allocation is requested.

Note that filesystems which use vfs_hash.c, correctly handle the case
due to vfs_hash_get() looping when vget() returns ENOENT for sleepable
requests.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2013-08-05 18:53:59 +00:00
Jack F Vogel
7301d64aba Correct a fat-finger in the last delta.
MFC after: ASAP
2013-08-05 16:16:50 +00:00
Alexander Motin
39fe6d85ee MFprojects/camlock r249006:
Pass SIM pointer as an argument to camisr_runqueue() instead of doneq
pointer.
2013-08-05 12:15:53 +00:00
Alexander Motin
ea541bfdaa MFprojects/camlock r249505:
Change CCB queue resize logic to be able safely handle overallocations:
 - (re)allocate queue space in power of 2 chunks with 64 elements minimum
and never shrink it; with only 4/8 bytes per element size is insignificant.
 - automatically reallocate the queue to double size if it is overflowed.
 - if queue reallocation failed, store extra CCBs in unsorted TAILQ,
fetching them back as soon as some queue element is freed.

To free space in CCB for TAILQ linking, change highpowerq from keeping
high-power CCBs to keeping devices frozen due to high-power CCBs.

This encloses all pieces of queue resize logic inside of cam_queue.[ch],
removing some not obvious duties from xpt_release_ccb().
2013-08-05 11:48:40 +00:00
Glen Barber
cfb2932bf4 Redirect svnversion stderr to /dev/null if we cannot determine
the tree version, for example if the tree is checked out with an
outdated svn from ports, but the base system svnlite is built.

Approved by:	kib (mentor)
2013-08-05 10:26:42 +00:00
Attilio Rao
be99683637 Revert r253939:
We cannot busy a page before doing pagefaults.
Infact, it can deadlock against vnode lock, as it tries to vget().
Other functions, right now, have an opposite lock ordering, like
vm_object_sync(), which acquires the vnode lock first and then
sleeps on the busy mechanism.

Before this patch is reinserted we need to break this ordering.

Sponsored by:	EMC / Isilon storage division
Reported by:	kib
2013-08-05 08:55:35 +00:00
Hiroki Sato
41541ebf94 Fix a panic in tmpaddrtimer. 2013-08-05 00:36:12 +00:00
Jeff Roberson
2c0b86b48f - Introduce a specific function, pmap_remove_kernel_pde, for removing
huge pages in the kernel's address space.  This works around several
   asserts from pmap_demote_pde_locked that did not apply and gave false
   warnings.

Discovered by:	pho
Reviewed by:	alc
Sponsored by:	EMC / Isilon Storage Division
2013-08-05 00:28:03 +00:00
Attilio Rao
66bacd7e17 Remove unused member.
Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc
Tested by:	pho
2013-08-04 21:17:05 +00:00
Attilio Rao
3b6714cacb The page hold mechanism is fast but it has couple of fallouts:
- It does not let pages respect the LRU policy
- It bloats the active/inactive queues of few pages

Try to avoid it as much as possible with the long-term target to
completely remove it.
Use the soft-busy mechanism to protect page content accesses during
short-term operations (like uiomove_fromphys()).

After this change only vm_fault_quick_hold_pages() is still using the
hold mechanism for page content access.
There is an additional complexity there as the quick path cannot
immediately access the page object to busy the page and the slow path
cannot however busy more than one page a time (to avoid deadlocks).

Fixing such primitive can bring to complete removal of the page hold
mechanism.

Sponsored by:	EMC / Isilon storage division
Discussed with:	alc
Reviewed by:	jeff
Tested by:	pho
2013-08-04 21:07:24 +00:00
Marcel Moolenaar
b9fdaa9b19 Remove inclusion of <sys/diskmbr.h>. We have no business knowing
anything related to MBR in this file.
2013-08-04 21:00:22 +00:00
Hiren Panchasara
c29173eb94 Fixing a typo.
Approved by:	sbruno (mentor, implicit)
2013-08-04 19:54:47 +00:00
Attilio Rao
878a788734 Remove unnecessary soft busy of the page before to do vn_rdwr() in
kern_sendfile() which is unnecessary.
The page is already wired so it will not be subjected to pagefault.
The content cannot be effectively protected as it is full of races
already.
Multiple accesses to the same indexes are serialized through vn_rdwr().

Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc, jeff
Tested by:	pho
2013-08-04 15:56:19 +00:00
Steven Hartland
e44e975c1b zfs_ioc_rename should not leave the value of zc_name passed in via zc altered
on return.

MFC after:	1 week
2013-08-04 11:38:08 +00:00
Marius Strobl
eb84fc9506 Make r253899 compile. 2013-08-03 21:24:52 +00:00
Justin Hibbits
450f197050 Remove duplicate definition of SPR MMCR0.
MFC after:	3 days
2013-08-03 18:05:12 +00:00
Edward Tomasz Napierala
39ca489ea9 Fix typo. 2013-08-03 13:38:56 +00:00
Ian Lepore
992a44320c Tweak the imx debug console code so that it works with multiple SoCs.
Instead of hard-coding the uart register addresses for the imx51, use
a variable that defaults to the imx51 address.  When debugging another
imx-family SoC, the variable can be set early in initarm() to provide
full console/printf support for debugging early boot.
2013-08-03 13:31:10 +00:00
Ulrich Spörlein
006afc9b5c Add missing depend. 2013-08-03 08:21:35 +00:00
Marcel Moolenaar
90aa031bf1 Add a tunable for the default timeout. 2013-08-03 04:25:25 +00:00
Peter Grehan
80a902ef7d Follow-up commit to fix CR0 issues. Maintain
architectural state on CR vmexits by guaranteeing
that EFER, CR0 and the VMCS entry controls are
all in sync when transitioning to IA-32e mode.

Submitted by:	Tycho Nightingale (tycho.nightingale <at> plurisbusnetworks.com)
2013-08-03 03:16:42 +00:00
Marius Strobl
1e53269ac2 Const'ify scc_driver_name. 2013-08-02 23:31:51 +00:00
Marius Strobl
71bda3eb9a - Use NULL instead of 0 for pointers.
- Remove unnecessary __RMAN_RESOURCE_VISIBLE.
2013-08-02 23:30:32 +00:00
Marius Strobl
c4b1deaf0d - Implement iclear methods for QUICC and SAB 82532. With r253161 in place,
this is is crucial at least for the latter.
  What happens is that attaching uart(4) to scc(4) causes the SAB 82532 to
  "receive" something and trigger a SER_INT_RXREADY interrupt, given that
  at least fast/filter interrupts are already enabled. Prior to r253161,
  uart_bus_ihand() was set up at this point and handled that condition,
  i. e. read the RX FIFO and issued a Receive Message Complete.
  Now, uart_bus_ihand() and uart_intr() are setup after attaching uart(4),
  leaving the SER_INT_RXREADY interrupt triggered during the latter to
  be handled by the iclear method. However, with that method not implement,
  this in turn causes SAB 82532 to not issue any further SER_INT_RXREADY
  interrupts until the RX FIFO is full again. Thus, 15 received bytes go
  to nowhere, given that "the other half" of the RX FIFO is used for status
  information. Hence, implementing sab82532_bfe_iclear() fixes things again.
  Potentially, the same problem exists for QUICC.
- Remove unnecessary __RMAN_RESOURCE_VISIBLE.
- Remove a superfluous header.
- Use KOBJMETHOD_END.
- Mark unused arguments as such.
- Remove variables unused after initialization.

Reviewed by:	marcel (earlier version)
2013-08-02 23:28:49 +00:00
Adrian Chadd
f86392791f Add in some definitions required for later iwn(4) device support.
This also clarifies a few existing fields.

Tested:

* Intel 5100

Submitted by:	Cedric GROSS <cg@gross.info>
2013-08-02 21:28:36 +00:00
Adrian Chadd
a5582fae07 Break out the iwn(4) device IDs into if_iwn_devid.h, as well as add
IDs for new devices.

* Add new device IDs
* Extend the ID probe code to include the newer range of bits used
  by later model devices

Tested:

* Intel 5100, STA mode

TODO:

* Test on Intel 4965, just to be sure

Submitted by:	Cedric GROSS <cg@gross.info>
2013-08-02 21:23:28 +00:00
Olivier Houchard
cca928b9e1 Only receive the interrupts on the first core, to avoid duplicate interrupts. 2013-08-02 20:32:26 +00:00
Navdeep Parhar
82342de26d Display temperature sensor data. Shows -1 if sensor not
available on the card.

# sysctl dev.t4nex.0.temperature
# sysctl dev.t5nex.0.temperature
2013-08-02 18:05:42 +00:00
Navdeep Parhar
73cd922046 Fix previous commit (r253873). "cong" has one bit per channel but the
congestion channel map has 1 nibble per channel.  So bits wxyz need to
be blown up into 000w000x000y000z.
2013-08-02 17:44:19 +00:00
Hiroki Sato
872ce24739 Add p_candebug() check to FILEMON_SET_PID ioctl.
Discussed with:	sjg
MFC after:	3 days
2013-08-02 14:44:11 +00:00
Gleb Smirnoff
977c7043eb Remove extra zeroing after M_ZERO allocation. 2013-08-02 13:06:49 +00:00
Navdeep Parhar
ba41ec4848 Set up congestion manager context properly for T5 based cards.
MFC after:	3 days (will check with re@)
2013-08-01 23:38:30 +00:00
Adrian Chadd
21f8dc458a Now that conf/options knows about if_iwn.h, add it to if_iwn.c.
This allows for IWN_DEBUG (and maybe more stuff later) to be a build
time configure option.
2013-08-01 21:50:50 +00:00
Adrian Chadd
c49debb656 Add IWN_DEBUG as an option for if_iwn. 2013-08-01 21:50:13 +00:00
Adrian Chadd
38b1a25dfd iwn(4) debugging improvements.
* Add in some new register debugging under IWN_DEBUG_REGISTER
* Make IWN_DEBUG an option now for building.  I'll chase this up
  with a commit to 'options' soon.

Submitted by:	Cedric GROSS <cg@cgross.info>
2013-08-01 21:45:30 +00:00
Jack F Vogel
cbe75ae8f5 A number of important fixes:
- mbuf reused after an RX_COPY optimized operation can sometimes have
    a bogus cached address, resulting in TCP hangs. Add critical save points
    to the cached address. Thanks to Michael and the team at Verisign for
    finding this problem.
  - A couple more spots where the rxbuf->flags member should be cleared just
    to be sure no incorrect RX_COPY state is left around. Thanks to Adrian
    for tracking these down.
  - Remove the rearm_queues function from the driver, this was found to be
    responsible for some out-of-order packets by Verisign, and was always a
    bandaid, with the other fixes in this delta the bandaid can finally be
    removed.
  - In the other/link interrupt handler the entire state of the EICS register
    was being writen back into EICR (which clears causes and thus re-enables
    those interrupts), this was wrong, so now mask off the queue portion of
    the register value, so we only clear the other/link interrupt we intend.
    Marc from Verisign found this.
  - Make the SFP+ unsupported option tuneable now, by customer request.
  - Finally, just a couple of minor DEBUG string fixes.

I want to call out and thank all the participants in the 10G community/Intel
calls for helping track down these problems and make the driver better for
everyone!

MFC after:	3 days, these are critical fixes for 9.2!
2013-08-01 20:10:16 +00:00
Marcel Moolenaar
04ae0d7cc5 Fix the build of the testmain target. This target compiles a Forth
interpreter that can be run on the system and as such cannot be
compiled against libbstand. On the one hand this means we need to
include the usual headers for system interfaces that we use and
on the the other hand we can only use standard system interfaces.

While here, define local variables only when needed to make this
WARNS=2 clean on amd64.

PR:		172542
Obtained from:	peterj@
Pointed out by: Jan Beich <jbeich@tormail.org>
2013-08-01 18:06:58 +00:00
Pedro F. Giffuni
d192e40f77 Add license for the half MD4 algorithm used in ext2_half_md4().
The htree implementation uses code derived from the
RSA Data Security, Inc. MD4 Message-Digest Algorithm.

Add a proper licensing statement for the code and clarify
the corresponding comments.

Approved by:	core (hrs)
2013-08-01 16:04:48 +00:00
Konstantin Belousov
1f3ad93be7 Remove unused malloc type.
Requested by:	alc
MFC after:	1 week
2013-08-01 12:55:41 +00:00
Michael Tuexen
bfd1666aad Micro-optimization suggested in
https://bugzilla.mozilla.org/show_bug.cgi?id=898234
by pchang9. While there simplify the code.

MFC after: 1 week
2013-08-01 12:05:23 +00:00
Ganbold Tsagaankhuu
dd5c5e7147 Add identification for Cortex-A7 (R0) cores.
Reviewed by: cognet@
2013-08-01 10:06:19 +00:00
Peter Grehan
81ef6611ed Moved clearing of vmm_initialized to avoid the case
of unloading the module while VMs existed. This would
result in EBUSY, but would prevent further operations
on VMs resulting in the module being impossible to
unload.

Submitted by:   Tycho Nightingale (tycho.nightingale <at> plurisbusnetworks.com)
Reviewed by:	grehan, neel
2013-08-01 05:59:28 +00:00
Peter Grehan
aaaa065629 Correctly maintain the CR0/CR4 shadow registers.
This was exposed with AP spinup of Linux, and
booting OpenBSD, where the CR0 register is unconditionally
written to prior to the longjump to enter protected
mode. The CR-vmexit handling was not updating CPU state which
resulted in a vmentry failure with invalid guest state.

A follow-on submit will fix the CPU state issue, but this
fix prevents the CR-vmexit prior to entering protected
mode by properly initializing and maintaining CR* state.

Reviewed by:	neel
Reported by:	Gopakumar.T @ netapp
2013-08-01 01:18:51 +00:00
Ian Lepore
6cbd933b37 Changes to allow using BOOTP_NFSROOT and mounting an nfs root filesystem
other than the one specified by the BOOTP server.  This configures NFS
using the BOOTP protocol while also respecting other root-path options such
as setting vfs.root.mountfrom in the environment or using the RB_DFLTROOT
boot option.  It allows you to override the root path provided by the
server, or to supply a root path when the server provides IP configuration
but no root path info.

This maintains the historical BOOTP_NFSROOT behavior of panicking on a
failure to mount the root path provided by the server, unless you've
provided an alternative via the ROOTDEVNAME kernel option or by setting
vfs.root.mountfrom.  The behavior of panicking when given no other options
is preserved because it amounts to a bit of a retry loop that could
eventually recover from a transient network or server problem.

The user can now override the root path from loader(8) even if the
kernel is compiled with BOOTP_NFSROOT.  If vfs.root.mountfrom is set in
the environment it is used unconditionally -- it always overrides the
BOOTP info.  If it begins with [old]nfs: then the BOOTP code uses it
instead of the server-provided info.  If it specifies some other
filesystem then the bootp code will not panic like it used to and the code
in vfs_mountroot.c will invoke the right filesystem to do the mount.

If the kernel is compiled with the ROOTDEVNAME option, then that name is
used by the BOOTP code if either
      * The server doesn't provide a pathname.
      * The boothowto flags include RB_DFLTROOT.
The latter allows the user to compile in alternate path in ROOTDEVNAME
such as ufs:/dev/da0s1a and boot from that path by setting
boot_dftlroot=1 in loader(8) or using the '-r' option in boot(8).

The one thing not provided here is automatic failover from a
server-provided path to a compiled-in one without the user manually
requesting that.  The code just isn't currently structured in a way that
makes that possible with a lot of rewrite.  I think the ability to set
vfs.root.mountfrom and to use ROOTDEVNAME automatically when the server
doesn't provide a name covers the most common needs.

A set of patches submitted by Lars Eggert provided the part I couldn't
figure out by myself when I tried to do this last year; many thanks.

Reviewed by:	rodrigc
2013-07-31 19:14:00 +00:00
David E. O'Brien
0e6a0799a9 Back out r253779 & r253786. 2013-07-31 17:21:18 +00:00
Sean Bruno
969eca8b4a Adjust magic numbers to allow attachment of ath(4) modules. 2013-07-31 16:27:56 +00:00
Sean Bruno
d439f5d933 device if_bridge gets me a bridge device 2013-07-31 16:26:34 +00:00
Hiroki Sato
0de0dd9be8 Allocate in6_ifextra (ifp->if_afdata[AF_INET6]) only for IPv6-capable
interfaces.  This eliminates unnecessary IPv6 processing for non-IPv6
interfaces.

MFC after:	3 days
2013-07-31 16:24:49 +00:00
Scott Long
fcd9ff2c67 Another fix for r253823; retain the default of 1 readahead block for sendfile.
Submitted by:	glebius
Obtained from:	Netflix
MFC after:	3 days
2013-07-31 15:55:01 +00:00
Rui Paulo
caa18d0c6c Add definitions for the Mailbox, Spinlock and PRU-ICSS devices. 2013-07-31 06:23:10 +00:00
Rui Paulo
53dfd5c108 Cleanup the allocations when the attachment fails. 2013-07-31 06:05:34 +00:00
Rui Paulo
db10a06d50 Initialisation routines for the mailbox, spinlock and PRU-ICSS clocks. 2013-07-31 05:52:03 +00:00
Navdeep Parhar
6e22f9f3da Display SGE tunables in the sysctl tree.
dev.t5nex.0.fl_pktshift: payload DMA offset in rx buffer (bytes)
dev.t5nex.0.fl_pad: payload pad boundary (bytes)
dev.t5nex.0.spg_len: status page size (bytes)
dev.t5nex.0.cong_drop: congestion drop setting

Discussed with:	scottl
2013-07-31 05:12:51 +00:00
Justin Hibbits
d1da193810 Remove duplicate SRCS include block. Spotted by jmallett. 2013-07-31 01:42:59 +00:00
Justin Hibbits
84cd55bb02 Add the macio attachment for wi(4). Partially obtained from NetBSD.
Reviewed by:	adrian
Obtained from:	NetBSD (partially)
2013-07-31 01:13:29 +00:00
Scott Long
de925dd31f Fix r253823. Some WIP patches snuck in.
Submitted by:	zont
2013-07-30 23:50:09 +00:00
Scott Long
fc4a5f052b Create a knob, kern.ipc.sfreadahead, that allows one to tune the amount of
readahead that sendfile() will do.  Default remains the same.

Obtained from:	Netflix
MFC after:	3 days
2013-07-30 23:26:05 +00:00
Xin LI
bd3d1456a5 MFV r253783:
Skip eviction step of processing free records when doing ZFS
receive to avoid the expensive search operation of non-existent
dbufs in dn_dbufs.

Illumos ZFS issues:
  3834 incremental replication of 'holey' file systems is slow

MFC after:      2 weeks
2013-07-30 21:35:02 +00:00
Xin LI
1c4ead73c6 MFV r253782:
To quote Illumos issue #3888:

When 'zfs recv -F' is used with an incremental recv it rolls
back any changes made since the last snapshot in case new
changes were made to the file system while the recv is in
progress (without -F the recv would fail when it does it's
final check to commit the recv-ed data as the recv-ed data
conflicts with the newly written data).

However, if there is a snapshot taken after the recv began
rolling back to the 'latest' snapshot will not help and the
recv will still fail. 'zfs recv -F' should be extended to
destroy any snapshots created since the source snapshot when
finishing the recv (effectively rolling back through all
snapshots, instead of just to the latest snapshot).

Illumos ZFS issues:
  3888 zfs recv -F should destroy any snapshots created since the
       incremental source

MFC after:	2 weeks
2013-07-30 21:20:12 +00:00