Commit Graph

179691 Commits

Author SHA1 Message Date
Konstantin Belousov
ee75e7de7b Implement the concept of the unmapped VMIO buffers, i.e. buffers which
do not map the b_pages pages into buffer_map KVA.  The use of the
unmapped buffers eliminate the need to perform TLB shootdown for
mapping on the buffer creation and reuse, greatly reducing the amount
of IPIs for shootdown on big-SMP machines and eliminating up to 25-30%
of the system time on i/o intensive workloads.

The unmapped buffer should be explicitely requested by the GB_UNMAPPED
flag by the consumer.  For unmapped buffer, no KVA reservation is
performed at all. The consumer might request unmapped buffer which
does have a KVA reserve, to manually map it without recursing into
buffer cache and blocking, with the GB_KVAALLOC flag.

When the mapped buffer is requested and unmapped buffer already
exists, the cache performs an upgrade, possibly reusing the KVA
reservation.

Unmapped buffer is translated into unmapped bio in g_vfs_strategy().
Unmapped bio carry a pointer to the vm_page_t array, offset and length
instead of the data pointer.  The provider which processes the bio
should explicitely specify a readiness to accept unmapped bio,
otherwise g_down geom thread performs the transient upgrade of the bio
request by mapping the pages into the new bio_transient_map KVA
submap.

The bio_transient_map submap claims up to 10% of the buffer map, and
the total buffer_map + bio_transient_map KVA usage stays the
same. Still, it could be manually tuned by kern.bio_transient_maxcnt
tunable, in the units of the transient mappings.  Eventually, the
bio_transient_map could be removed after all geom classes and drivers
can accept unmapped i/o requests.

Unmapped support can be turned off by the vfs.unmapped_buf_allowed
tunable, disabling which makes the buffer (or cluster) creation
requests to ignore GB_UNMAPPED and GB_KVAALLOC flags.  Unmapped
buffers are only enabled by default on the architectures where
pmap_copy_page() was implemented and tested.

In the rework, filesystem metadata is not the subject to maxbufspace
limit anymore. Since the metadata buffers are always mapped, the
buffers still have to fit into the buffer map, which provides a
reasonable (but practically unreachable) upper bound on it. The
non-metadata buffer allocations, both mapped and unmapped, is
accounted against maxbufspace, as before. Effectively, this means that
the maxbufspace is forced on mapped and unmapped buffers separately.
The pre-patch bufspace limiting code did not worked, because
buffer_map fragmentation does not allow the limit to be reached.

By Jeff Roberson request, the getnewbuf() function was split into
smaller single-purpose functions.

Sponsored by:	The FreeBSD Foundation
Discussed with:	jeff (previous version)
Tested by:	pho, scottl (previous version), jhb, bf
MFC after:	2 weeks
2013-03-19 14:13:12 +00:00
Gleb Smirnoff
093012686d iwn(4) doesn't support adhoc mode.
PR:		misc/177106
Submitted by:	Hiren Panchasara <hiren.panchasara gmail.com>
2013-03-19 13:43:55 +00:00
Konstantin Belousov
36a6d2ebc4 Add a convenience macro bread_gb() to wrap a call to
breadn_flags(). Comparing with bread(), it adds an argument to pass
the flags to getblk().

Sponsored by:	The FreeBSD Foundation
Tested by:	pho
MFC after:	2 weeks
2013-03-19 13:21:39 +00:00
Aleksandr Rybalko
ee52bd5713 Cast "start" to u_long. Temporary fix to unbreak tinderbox.
We need here max possible storage or dynamic, depend on size of address cell.
2013-03-19 13:13:26 +00:00
Konstantin Belousov
b4862fafd5 Assert that a ccb passed to cam_periph_mapmem() for XPT_SCSI_IO and
XPT_ATA_IO holds virtual buffer address.

Sponsored by:	The FreeBSD Foundation
Tested by:	pho
2013-03-19 13:10:14 +00:00
Ed Maste
96ecfd9813 Fix remainder calculation when biosize is not a power of 2
In common configurations biosize is a power of two, but is not required to
be so.  Thanks to markj@ for spotting an additional case beyond my original
patch.

Reviewed by: rmacklem@
2013-03-19 13:06:11 +00:00
Hans Petter Selasky
565d8205f3 Add new USB ID.
PR:		usb/177105
MFC after:	1 week
2013-03-19 12:52:13 +00:00
Joel Dahl
245e0480d2 Remove obsolete objformat information.
Submitted by:	db
2013-03-19 12:35:33 +00:00
Martin Matuska
87a5cb4650 Plug memory leak in dsl_check_snap_cb()
This was unnoticed because the function is very rarely used.

MFC after:	3 days
2013-03-19 07:47:51 +00:00
Joel Dahl
f11731eba1 mdoc: remove superfluous paragraph macro. 2013-03-19 07:25:58 +00:00
Andrey V. Elsukov
93bb4f9ed5 Separate the locking macros that are used in the packet flow path
from others. This helps easy switch to use pfil(4) lock.
2013-03-19 06:04:17 +00:00
Andrey V. Elsukov
5474386bd3 Fix style and comments. 2013-03-19 05:51:47 +00:00
Gleb Smirnoff
8863cc408c There are actually two different cases when mlock(2) returns
ENOMEM. Clarify this, taking text from SUS.

Reviewed by:	kib
2013-03-19 05:44:25 +00:00
Colin Percival
953bb3854c Fix typo in previous commit: Exit if */dev/dumpdev* does not exist, not if
*/bin/realpath* does not exist...

Submitted by:	markj
Pointy hat to:	cperciva
2013-03-19 05:08:25 +00:00
Colin Percival
510a7a8624 If dumpdev is AUTO but no dump device has been set -- i.e., there is no swap
space configured for rc.d/dumpon to designate for dumping -- then exit
silently rather than with a
> realpath: /dev/dumpdev: No such file or directory
error message.

An argument could be made that we should print a (more informative) warning
message; but given that under the same conditions the rc.d/dumpon script will
already print a
> No suitable dump device was found
warning, it seems that printing an additional
> Dump device does not exist.  Savecore not run.
warning would be superfluous.
2013-03-19 04:42:04 +00:00
Justin Hibbits
ec86c487a6 Fix the powerpc64 build. MACHINE_CPUARCH is common for powerpc/powerpc64,
not MACHINE_ARCH.
2013-03-19 00:39:02 +00:00
Aleksandr Rybalko
36581e4785 Don't hesitate to ask parent to setup IRQ finally.
Sponsored by:	The FreeBSD Foundation
2013-03-18 23:51:39 +00:00
Neel Natu
4e34ce3e13 Add bhyve to examples.
Requested by: alfred, julian
Obtained from:	NetApp
2013-03-18 23:46:14 +00:00
Aleksandr Rybalko
bf9d6206b0 Allow simplebus to attach to another simplebus.
Sponsored by:	The FreeBSD Foundation
2013-03-18 23:41:19 +00:00
Aleksandr Rybalko
089dfb09f1 Hide "no default resources for" warning under bootverbose. It's ok to use
optional resources.

Sponsored by:	The FreeBSD Foundation
2013-03-18 23:38:15 +00:00
Aleksandr Rybalko
2737a5a925 Allow simplebus to attach in less strict way, when "simple-bus" listed on not
first position of compatible property, so simplebus driver can be generic
driver for any bus listed as compatible with "simple-bus".

Sponsored by:	The FreeBSD Foundation
2013-03-18 23:35:01 +00:00
Jung-uk Kim
78260bb30a List TrackPoint device before generic model. 2013-03-18 23:31:22 +00:00
Jung-uk Kim
569d8f7e27 Add preliminary support for IBM/Lenovo TrackPoint.
PR:		kern/147237 (based on the initial patch for 8.x)
Tested by:	glebius (device detection and suspend/resume)
MFC after:	1 month
2013-03-18 23:22:47 +00:00
Neel Natu
b060ba5024 Simplify the assignment of memory to virtual machines by requiring a single
command line option "-m <memsize in MB>" to specify the memory size.

Prior to this change the user needed to explicitly specify the amount of
memory allocated below 4G (-m <lowmem>) and the amount above 4G (-M <highmem>).

The "-M" option is no longer supported by 'bhyveload' and 'bhyve'.

The start of the PCI hole is fixed at 3GB and cannot be directly changed
using command line options. However it is still possible to change this in
special circumstances via the 'vm_set_lowmem_limit()' API provided by
libvmmapi.

Submitted by:	Dinakar Medavaram (initial version)
Reviewed by:	grehan
Obtained from:	NetApp
2013-03-18 22:38:30 +00:00
Pawel Jakub Dawidek
a3d8ae5d2d Reduce stack usage. 2013-03-18 21:11:31 +00:00
Ryan Stone
3aff0961dd Correct the definition for Exar XR17V258IV: we must use a config_function
to specify the offset into the PCI memory spare at which each serial port
will find its registers.  This was already done for other Exar PCI serial
devices; it was accidentally omitted for this specific device.

Sponsored by:	Sandvine Incorporated
MFC after:	1 week
2013-03-18 19:22:51 +00:00
John Baldwin
1968f37bc9 Tweak some comments. 2013-03-18 18:04:09 +00:00
John Baldwin
3cf3b9f097 Partially revert r195702. Deferring stops is now implemented via a set of
calls to toggle TDF_SBDRY rather than passing PBDRY to individual sleep
calls.
- Remove the stop_allowed parameters from cursig() and issignal().
  issignal() checks TDF_SBDRY directly.
- Remove the PBDRY and SLEEPQ_STOP_ON_BDRY flags.
2013-03-18 17:23:58 +00:00
Aleksandr Rybalko
4117c1db9e o Switch to use physical addresses in rman for FDT.
o Remove vtophys used to translate virtual address to physical in case rman carry virtual.

Sponsored by:	The FreeBSD Foundation
2013-03-18 15:18:55 +00:00
Andrew Turner
da6b2089d5 do_vfp_vmrs and do_vfp_vmsr should not return anything. 2013-03-18 15:14:36 +00:00
Dag-Erling Smørgrav
c5c0dc9146 Keep the default AuthorizedKeysFile setting. Although authorized_keys2
has been deprecated for a while, some people still use it and were
unpleasantly surprised by this change.

I may revert this commit at a later date if I can come up with a way
to give users who still have authorized_keys2 files sufficient advance
warning.

MFC after:	ASAP
2013-03-18 10:50:50 +00:00
Andrew Turner
e8dde80b1d Add support for the vmsr and vmrs instructions. This supports the system
level version of the instructions. When used in userland the hardware only
allows us to read/write FPSCR.
2013-03-18 08:22:35 +00:00
Andrew Turner
90ab443e31 Some ARM vmov similar to 'vmov.f32 s1, s2' will incorrectly have the second
register added to the symbol table by the assembler. On further
investigation it was found the problem was with the my_get_expression
function. This is called by parse_big_immediate.

Fix this by moving the call to parse_big_immediate to the end of the if,
else if, ..., else block.
2013-03-18 07:41:08 +00:00
Hans Petter Selasky
bd247e9ddd Add new USB ID.
PR:		usb/177013
MFC after:	1 week
2013-03-18 07:02:58 +00:00
Justin Hibbits
80a5635c8b Add FBT for PowerPC DTrace. Also, clean up the DTrace assembly code,
much of which is not necessary for PowerPC.

The FBT module can likely be factored into 3 separate files: common,
intel, and powerpc, rather than duplicating most of the code between
the x86 and PowerPC flavors.

All DTrace modules for PowerPC will be MFC'd together once Fasttrap is
completed.
2013-03-18 05:30:18 +00:00
Pyun YongHyeon
e8bedbd24a r119712 introduced SIS_TYPE_83816 but it was not actually set in
driver such that checking against the type was always false.
To detect NS DP83816, driver should have checked silicon revision
register for NS controllers. While here, remove SIS_TYPE_83816 to
not make the similar mistake again.

Reported by:	Brad Smith ( brad@openbsd )
2013-03-18 04:46:17 +00:00
Adrian Chadd
1ab002f461 Print out the current fifo queue depth correctly - not just the max
queue depth.

Silly hat to me.
2013-03-18 02:29:57 +00:00
Kevin Lo
da5dfd565f Add restrict keyword to realpath manpage. 2013-03-18 01:22:28 +00:00
Adrian Chadd
eefc93a947 Dump out information about the RX descriptor free list and FIFO information. 2013-03-18 01:12:36 +00:00
Adrian Chadd
d50e882ab9 Log some more information when the RX buffer allocation failed. 2013-03-18 01:11:52 +00:00
Attilio Rao
774d251d99 Sync back vmcontention branch into HEAD:
Replace the per-object resident and cached pages splay tree with a
path-compressed multi-digit radix trie.
Along with this, switch also the x86-specific handling of idle page
tables to using the radix trie.

This change is supposed to do the following:
- Allowing the acquisition of read locking for lookup operations of the
  resident/cached pages collections as the per-vm_page_t splay iterators
  are now removed.
- Increase the scalability of the operations on the page collections.

The radix trie does rely on the consumers locking to ensure atomicity of
its operations.  In order to avoid deadlocks the bisection nodes are
pre-allocated in the UMA zone.  This can be done safely because the
algorithm needs at maximum one new node per insert which means the
maximum number of the desired nodes is the number of available physical
frames themselves.  However, not all the times a new bisection node is
really needed.

The radix trie implements path-compression because UFS indirect blocks
can lead to several objects with a very sparse trie, increasing the number
of levels to usually scan.  It also helps in the nodes pre-fetching by
introducing the single node per-insert property.

This code is not generalized (yet) because of the possible loss of
performance by having much of the sizes in play configurable.
However, efforts to make this code more general and then reusable in
further different consumers might be really done.

The only KPI change is the removal of the function vm_page_splay() which
is now reaped.
The only KBI change, instead, is the removal of the left/right iterators
from struct vm_page, which are now reaped.

Further technical notes broken into mealpieces can be retrieved from the
svn branch:
http://svn.freebsd.org/base/user/attilio/vmcontention/

Sponsored by:	EMC / Isilon storage division
In collaboration with:	alc, jeff
Tested by:	flo, pho, jhb, davide
Tested by:	ian (arm)
Tested by:	andreast (powerpc)
2013-03-18 00:25:02 +00:00
Jilles Tjoelker
ddd956b0b7 find: Include nanoseconds when comparing timestamps of files.
When comparing to the timestamp of a given file using -newer, -Xnewer and
-newerXY (where X and Y are one of m, c, a, B), include nanoseconds in the
comparison.

The primaries that compare a timestamp of a file to a given value (-Xmin,
-Xtime, -newerXt) continue to compare times in whole seconds.

Note that the default value 0 of vfs.timestamp_precision almost always
causes the nanoseconds part to be 0. However, touch -d can set a timestamp
to the microsecond regardless of that sysctl.

MFC after:	1 week
2013-03-17 22:51:58 +00:00
Ian Lepore
b479b38c0a Eliminate an intermediate buffer and some memcpy() operations, and do
DMA directly to/from the buffers passed in from higher layer drivers.

Reviewed by:	gonzo
2013-03-17 16:31:09 +00:00
Martin Matuska
6cf922c88b Fix typo in sysctl description
Reported by:	Jeremy Chadwick
MFC after:	3 days
2013-03-17 15:53:27 +00:00
Konstantin Belousov
0d3bb4afa8 Remove negative name cache entry pointing to the target name, which
could be instantiated while tdvp was unlocked.

Reported by:	Rick Miller <vmiller at hostileadmin com>
Tested by:	pho
MFC after:	1 week
2013-03-17 15:11:37 +00:00
Gleb Smirnoff
4f67e14304 In m_align() add assertions that mbuf is virgin, similar to assertions
in M_ALIGN(), MH_ALIGN, MEXT_ALIGN() macros.
2013-03-17 07:41:14 +00:00
Gleb Smirnoff
23909ac90d Add MEXT_ALIGN() macro, similar to M_ALIGN() and MH_ALIGN(), but for
mbufs with external buffer.
2013-03-17 07:39:45 +00:00
Gleb Smirnoff
7525c48111 In m_megapullup() instead of reserving some space at the end of packet,
m_align() it, reserving space to prepend data.

Reviewed by:	mav
2013-03-17 07:37:10 +00:00
Rui Paulo
dd85724bc9 Fix a typo in a comment. 2013-03-17 07:28:17 +00:00
Joel Dahl
9ca6ee33d0 Remove EOL whitespace accidentally introduced in r248393. 2013-03-17 06:57:25 +00:00