Commit Graph

189100 Commits

Author SHA1 Message Date
kib
23f577dda4 Support unmapped i/o for the md(4).
The vnode-backed md(4) has to map the unmapped bio because the VOP_READ()
and VOP_WRITE() interfaces do not allow passing unmapped requests to
the filesystem. Vnode-backed md(4) uses pbufs instead of relying on
the bio_transient_map, to avoid the usual md deadlock.

Sponsored by:	The FreeBSD Foundation
Tested by:	pho, scottl
2013-03-19 14:53:23 +00:00
kib
4f250cea7a The geom_part provider supports unmapped bio iff the underlying
provider does so, since geom_part never inspects the bio_data.

Sponsored by:	The FreeBSD Foundation
Tested by:	pho
2013-03-19 14:50:24 +00:00
kib
aa960ab755 A flag for the geom disk driver to indicate that it accepts the
unmapped i/o requests.

Sponsored by:	The FreeBSD Foundation
Tested by:	pho
2013-03-19 14:49:15 +00:00
kib
e5332ab955 Do not remap usermode pages into KVA for physio.
Sponsored by:	The FreeBSD Foundation
Tested by:	pho
2013-03-19 14:43:57 +00:00
kib
2ace051956 Do not map the swap i/o pbufs if the geom provider for the swap
partition accepts unmapped requests.

Sponsored by:	The FreeBSD Foundation
Tested by:	pho
2013-03-19 14:39:27 +00:00
kib
a43491886a Pass unmapped buffers for page in requests if the filesystem indicated support
for the unmapped i/o.

Sponsored by:	The FreeBSD Foundation
Tested by:	pho
2013-03-19 14:36:28 +00:00
kib
28b18148ad A flag for the filesystem to indicate to the upper levels that it accepts
unmapped buffers for the VOP_STRATEGY().

Sponsored by:	The FreeBSD Foundation
Tested by:	pho
2013-03-19 14:33:01 +00:00
kib
51030488d7 Add a helper function vfs_bio_bzero_buf() to zero the portion of the
buffer, transparently handling mapped or unmapped buffers.  Its intent
is to replace the use of bzero(bp->b_data) in cases where the buffer
might be unmapped, to avoid unneeded upgrades.

Sponsored by:	The FreeBSD Foundation
Tested by:	pho
2013-03-19 14:27:14 +00:00
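The idea behind such a helper can be sketched in userland C. This is a minimal illustration only, not the kernel implementation: the structure `sketch_buf`, the function `sketch_bzero_buf()`, and the 4096-byte page size are all hypothetical stand-ins. When a contiguous mapping exists it is zeroed directly; otherwise the range is zeroed page by page through the page array.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size, for illustration only */

/* Hypothetical stand-in for a buffer that may or may not be mapped. */
struct sketch_buf {
	char	*data;		/* contiguous mapping, or NULL when unmapped */
	char	**pages;	/* address of each backing page */
	size_t	npages;		/* number of entries in pages[] */
};

/*
 * Zero 'size' bytes starting at byte offset 'base' within the buffer.
 * The caller must keep base + size within npages * SKETCH_PAGE_SIZE.
 */
static void
sketch_bzero_buf(struct sketch_buf *bp, size_t base, size_t size)
{
	size_t idx, off, n;

	if (bp->data != NULL) {
		/* Mapped case: one contiguous memset is enough. */
		memset(bp->data + base, 0, size);
		return;
	}
	/* Unmapped case: walk the page array, clipping at page boundaries. */
	while (size > 0) {
		idx = base / SKETCH_PAGE_SIZE;
		off = base % SKETCH_PAGE_SIZE;
		n = SKETCH_PAGE_SIZE - off;
		if (n > size)
			n = size;
		memset(bp->pages[idx] + off, 0, n);
		base += n;
		size -= n;
	}
}
```

A caller would use the same call for both kinds of buffer, which is the point of the helper: no mapping has to be created just to clear part of a buffer.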
ray
0702655ee7 Return "start" and "end" to the u_long world, because rman handles
addresses as u_long too.

Discussed with:	ian@
Pointy hat to:	ray@
2013-03-19 14:15:41 +00:00
kib
7c26a038f9 Implement the concept of the unmapped VMIO buffers, i.e. buffers which
do not map the b_pages pages into buffer_map KVA.  The use of the
unmapped buffers eliminates the need to perform TLB shootdown for
mapping on the buffer creation and reuse, greatly reducing the amount
of IPIs for shootdown on big-SMP machines and eliminating up to 25-30%
of the system time on i/o intensive workloads.

The unmapped buffer should be explicitly requested by the GB_UNMAPPED
flag by the consumer.  For unmapped buffer, no KVA reservation is
performed at all. The consumer might request unmapped buffer which
does have a KVA reserve, to manually map it without recursing into
buffer cache and blocking, with the GB_KVAALLOC flag.

When the mapped buffer is requested and unmapped buffer already
exists, the cache performs an upgrade, possibly reusing the KVA
reservation.

Unmapped buffer is translated into unmapped bio in g_vfs_strategy().
Unmapped bio carry a pointer to the vm_page_t array, offset and length
instead of the data pointer.  The provider which processes the bio
should explicitly specify a readiness to accept unmapped bio,
otherwise g_down geom thread performs the transient upgrade of the bio
request by mapping the pages into the new bio_transient_map KVA
submap.

The bio_transient_map submap claims up to 10% of the buffer map, and
the total buffer_map + bio_transient_map KVA usage stays the
same. Still, it could be manually tuned by kern.bio_transient_maxcnt
tunable, in the units of the transient mappings.  Eventually, the
bio_transient_map could be removed after all geom classes and drivers
can accept unmapped i/o requests.

Unmapped support can be turned off by the vfs.unmapped_buf_allowed
tunable; disabling it makes buffer (or cluster) creation
requests ignore the GB_UNMAPPED and GB_KVAALLOC flags.  Unmapped
buffers are only enabled by default on the architectures where
pmap_copy_page() was implemented and tested.

In the rework, filesystem metadata is no longer subject to the
maxbufspace limit. Since the metadata buffers are always mapped, the
buffers still have to fit into the buffer map, which provides a
reasonable (but practically unreachable) upper bound on it. The
non-metadata buffer allocations, both mapped and unmapped, are
accounted against maxbufspace, as before. Effectively, this means that
the maxbufspace is enforced on mapped and unmapped buffers separately.
The pre-patch bufspace limiting code did not work, because
buffer_map fragmentation does not allow the limit to be reached.

At Jeff Roberson's request, the getnewbuf() function was split into
smaller single-purpose functions.

Sponsored by:	The FreeBSD Foundation
Discussed with:	jeff (previous version)
Tested by:	pho, scottl (previous version), jhb, bf
MFC after:	2 weeks
2013-03-19 14:13:12 +00:00
glebius
878ef603e2 iwn(4) doesn't support adhoc mode.
PR:		misc/177106
Submitted by:	Hiren Panchasara <hiren.panchasara gmail.com>
2013-03-19 13:43:55 +00:00
kib
3007c4a4e5 Add a convenience macro bread_gb() to wrap a call to
breadn_flags(). Comparing with bread(), it adds an argument to pass
the flags to getblk().

Sponsored by:	The FreeBSD Foundation
Tested by:	pho
MFC after:	2 weeks
2013-03-19 13:21:39 +00:00
ray
9498a8470e Cast "start" to u_long. Temporary fix to unbreak tinderbox.
We need the maximum possible storage here, or a dynamic type, depending on the size of the address cell.
2013-03-19 13:13:26 +00:00
kib
3277788fe1 Assert that a ccb passed to cam_periph_mapmem() for XPT_SCSI_IO and
XPT_ATA_IO holds virtual buffer address.

Sponsored by:	The FreeBSD Foundation
Tested by:	pho
2013-03-19 13:10:14 +00:00
emaste
2ccefecf01 Fix remainder calculation when biosize is not a power of 2
In common configurations biosize is a power of two, but is not required to
be so.  Thanks to markj@ for spotting an additional case beyond my original
patch.

Reviewed by: rmacklem@
2013-03-19 13:06:11 +00:00
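The class of bug described here can be illustrated with a small sketch. The commit does not show the original code, so this assumes the common case where a remainder is computed with a bit mask; the helper names `rem_mask()` and `rem_mod()` are hypothetical. The mask form agrees with the modulo form only when the divisor is a power of two.

```c
#include <stdint.h>

/*
 * Hypothetical helper: mask-based remainder.
 * Correct only when biosize is a power of two.
 */
static uint64_t
rem_mask(uint64_t off, uint64_t biosize)
{
	return (off & (biosize - 1));
}

/*
 * Hypothetical helper: general remainder.
 * Correct for any nonzero biosize.
 */
static uint64_t
rem_mod(uint64_t off, uint64_t biosize)
{
	return (off % biosize);
}
```

With an offset of 10000 and a biosize of 4096 both helpers return 1808. With a biosize of 3000 the modulo form returns 1000 while the mask form returns 784, which is the kind of silent miscalculation a non-power-of-two block size can trigger.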
hselasky
f67d1cc4dd Add new USB ID.
PR:		usb/177105
MFC after:	1 week
2013-03-19 12:52:13 +00:00
joel
34ea777e7c Remove obsolete objformat information.
Submitted by:	db
2013-03-19 12:35:33 +00:00
mm
38b46fc64e Plug memory leak in dsl_check_snap_cb()
This was unnoticed because the function is very rarely used.

MFC after:	3 days
2013-03-19 07:47:51 +00:00
joel
bf4e5d1eb6 mdoc: remove superfluous paragraph macro.
2013-03-19 07:25:58 +00:00
ae
23037c29f1 Separate the locking macros that are used in the packet flow path
from the others. This makes it easier to switch to the pfil(4) lock.
2013-03-19 06:04:17 +00:00
ae
b3c4973a10 Fix style and comments.
2013-03-19 05:51:47 +00:00
glebius
03b86a1452 There are actually two different cases when mlock(2) returns
ENOMEM. Clarify this, taking text from SUS.

Reviewed by:	kib
2013-03-19 05:44:25 +00:00
cperciva
f07d0be8f8 Fix typo in previous commit: Exit if */dev/dumpdev* does not exist, not if
*/bin/realpath* does not exist...

Submitted by:	markj
Pointy hat to:	cperciva
2013-03-19 05:08:25 +00:00
cperciva
262a11c529 If dumpdev is AUTO but no dump device has been set -- i.e., there is no swap
space configured for rc.d/dumpon to designate for dumping -- then exit
silently rather than with a
> realpath: /dev/dumpdev: No such file or directory
error message.

An argument could be made that we should print a (more informative) warning
message; but given that under the same conditions the rc.d/dumpon script will
already print a
> No suitable dump device was found
warning, it seems that printing an additional
> Dump device does not exist.  Savecore not run.
warning would be superfluous.
2013-03-19 04:42:04 +00:00
jhibbits
c5b34b3787 Fix the powerpc64 build. MACHINE_CPUARCH is common for powerpc/powerpc64,
not MACHINE_ARCH.
2013-03-19 00:39:02 +00:00
ray
905fc9aeb7 Don't hesitate to ask parent to setup IRQ finally.
Sponsored by:	The FreeBSD Foundation
2013-03-18 23:51:39 +00:00
neel
b893c0b25f Add bhyve to examples.
Requested by: alfred, julian
Obtained from:	NetApp
2013-03-18 23:46:14 +00:00
ray
e043b6aac4 Allow simplebus to attach to another simplebus.
Sponsored by:	The FreeBSD Foundation
2013-03-18 23:41:19 +00:00
ray
3d577dd295 Hide "no default resources for" warning under bootverbose. It's ok to use
optional resources.

Sponsored by:	The FreeBSD Foundation
2013-03-18 23:38:15 +00:00
ray
9fa825f868 Allow simplebus to attach in a less strict way, when "simple-bus" is
not listed in the first position of the compatible property, so the simplebus
driver can be a generic driver for any bus listed as compatible with "simple-bus".

Sponsored by:	The FreeBSD Foundation
2013-03-18 23:35:01 +00:00
jkim
ed21226b83 List TrackPoint device before generic model.
2013-03-18 23:31:22 +00:00
jkim
5ebabf1d3e Add preliminary support for IBM/Lenovo TrackPoint.
PR:		kern/147237 (based on the initial patch for 8.x)
Tested by:	glebius (device detection and suspend/resume)
MFC after:	1 month
2013-03-18 23:22:47 +00:00
neel
8d05d984e8 Simplify the assignment of memory to virtual machines by requiring a single
command line option "-m <memsize in MB>" to specify the memory size.

Prior to this change the user needed to explicitly specify the amount of
memory allocated below 4G (-m <lowmem>) and the amount above 4G (-M <highmem>).

The "-M" option is no longer supported by 'bhyveload' and 'bhyve'.

The start of the PCI hole is fixed at 3GB and cannot be directly changed
using command line options. However it is still possible to change this in
special circumstances via the 'vm_set_lowmem_limit()' API provided by
libvmmapi.

Submitted by:	Dinakar Medavaram (initial version)
Reviewed by:	grehan
Obtained from:	NetApp
2013-03-18 22:38:30 +00:00
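A minimal sketch of how a single memory size could be divided around the PCI hole follows. It assumes that guest memory up to the 3GB limit is placed below the hole and that any remainder is placed above 4GB; the commit message does not spell out this arithmetic, and the names `mem_split`, `split_memsize()`, and `SKETCH_LOWMEM_LIMIT` are hypothetical and not part of bhyve or libvmmapi.

```c
#include <stdint.h>

#define SKETCH_MB		(1024ULL * 1024ULL)
#define SKETCH_GB		(1024ULL * SKETCH_MB)
/* Start of the PCI hole as stated in the commit message (3GB). */
#define SKETCH_LOWMEM_LIMIT	(3ULL * SKETCH_GB)

/* Hypothetical result type: byte counts below the hole and above 4GB. */
struct mem_split {
	uint64_t lowmem;
	uint64_t highmem;
};

/*
 * Split a "-m <memsize in MB>" style value into low and high memory.
 * Assumption: memory fills the region below the hole first.
 */
static struct mem_split
split_memsize(uint64_t memsize_mb)
{
	struct mem_split s;
	uint64_t total = memsize_mb * SKETCH_MB;

	s.lowmem = total < SKETCH_LOWMEM_LIMIT ? total : SKETCH_LOWMEM_LIMIT;
	s.highmem = total - s.lowmem;
	return (s);
}
```

Under these assumptions a 1024MB guest is placed entirely below the hole, while a 4096MB guest gets 3GB of low memory and 1GB of high memory.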
pjd
ee34459918 Reduce stack usage.
2013-03-18 21:11:31 +00:00
rstone
9e3df2d114 Correct the definition for Exar XR17V258IV: we must use a config_function
to specify the offset into the PCI memory space at which each serial port
will find its registers.  This was already done for other Exar PCI serial
devices; it was accidentally omitted for this specific device.

Sponsored by:	Sandvine Incorporated
MFC after:	1 week
2013-03-18 19:22:51 +00:00
jhb
8604015a2e Tweak some comments.
2013-03-18 18:04:09 +00:00
jhb
8b099870ed Partially revert r195702. Deferring stops is now implemented via a set of
calls to toggle TDF_SBDRY rather than passing PBDRY to individual sleep
calls.
- Remove the stop_allowed parameters from cursig() and issignal().
  issignal() checks TDF_SBDRY directly.
- Remove the PBDRY and SLEEPQ_STOP_ON_BDRY flags.
2013-03-18 17:23:58 +00:00
ray
5f339017dc o Switch to use physical addresses in rman for FDT.
o Remove the vtophys used to translate virtual addresses to physical ones in case rman carries virtual addresses.

Sponsored by:	The FreeBSD Foundation
2013-03-18 15:18:55 +00:00
andrew
c94762cd4a do_vfp_vmrs and do_vfp_vmsr should not return anything.
2013-03-18 15:14:36 +00:00
des
153ad47126 Keep the default AuthorizedKeysFile setting. Although authorized_keys2
has been deprecated for a while, some people still use it and were
unpleasantly surprised by this change.

I may revert this commit at a later date if I can come up with a way
to give users who still have authorized_keys2 files sufficient advance
warning.

MFC after:	ASAP
2013-03-18 10:50:50 +00:00
andrew
7479840eb8 Add support for the vmsr and vmrs instructions. This supports the system
level version of the instructions. When used in userland the hardware only
allows us to read/write FPSCR.
2013-03-18 08:22:35 +00:00
andrew
e0722a1284 Some ARM vmov instructions similar to 'vmov.f32 s1, s2' will
incorrectly have the second register added to the symbol table by the
assembler. On further investigation the problem was found to be in the
my_get_expression function, which is called by parse_big_immediate.

Fix this by moving the call to parse_big_immediate to the end of the if,
else if, ..., else block.
2013-03-18 07:41:08 +00:00
hselasky
77164fe10d Add new USB ID.
PR:		usb/177013
MFC after:	1 week
2013-03-18 07:02:58 +00:00
jhibbits
7b62f31cdf Add FBT for PowerPC DTrace. Also, clean up the DTrace assembly code,
much of which is not necessary for PowerPC.

The FBT module can likely be factored into 3 separate files: common,
intel, and powerpc, rather than duplicating most of the code between
the x86 and PowerPC flavors.

All DTrace modules for PowerPC will be MFC'd together once Fasttrap is
completed.
2013-03-18 05:30:18 +00:00
yongari
c1c3be94b5 r119712 introduced SIS_TYPE_83816 but it was never actually set in
the driver, so checks against the type were always false.
To detect the NS DP83816, the driver should have checked the silicon revision
register for NS controllers. While here, remove SIS_TYPE_83816 to
avoid making a similar mistake again.

Reported by:	Brad Smith ( brad@openbsd )
2013-03-18 04:46:17 +00:00
adrian
cb18769932 Print out the current fifo queue depth correctly - not just the max
queue depth.

Silly hat to me.
2013-03-18 02:29:57 +00:00
kevlo
296df9cb79 Add restrict keyword to realpath manpage.
2013-03-18 01:22:28 +00:00
adrian
5061d6f712 Dump out information about the RX descriptor free list and FIFO information.
2013-03-18 01:12:36 +00:00
adrian
c17bed3d1c Log some more information when the RX buffer allocation failed.
2013-03-18 01:11:52 +00:00
attilio
6288f2f6de Sync back vmcontention branch into HEAD:
Replace the per-object resident and cached pages splay tree with a
path-compressed multi-digit radix trie.
Along with this, switch also the x86-specific handling of idle page
tables to using the radix trie.

This change is supposed to do the following:
- Allow the acquisition of read locking for lookup operations of the
  resident/cached pages collections as the per-vm_page_t splay iterators
  are now removed.
- Increase the scalability of the operations on the page collections.

The radix trie relies on the consumers' locking to ensure atomicity of
its operations.  In order to avoid deadlocks the bisection nodes are
pre-allocated in the UMA zone.  This can be done safely because the
algorithm needs at most one new node per insert, which means the
maximum number of desired nodes is the number of available physical
frames themselves.  However, a new bisection node is not always
needed.

The radix trie implements path-compression because UFS indirect blocks
can lead to several objects with a very sparse trie, increasing the number
of levels that usually have to be scanned.  It also helps in the nodes
pre-fetching by introducing the single node per-insert property.

This code is not generalized (yet) because of the possible loss of
performance from making many of the sizes in play configurable.
However, efforts to make this code more general and then reusable by
other consumers may well be undertaken.

The only KPI change is the removal of the function vm_page_splay() which
is now reaped.
The only KBI change, instead, is the removal of the left/right iterators
from struct vm_page, which are now reaped.

Further technical notes, broken into smaller pieces, can be retrieved from the
svn branch:
http://svn.freebsd.org/base/user/attilio/vmcontention/

Sponsored by:	EMC / Isilon storage division
In collaboration with:	alc, jeff
Tested by:	flo, pho, jhb, davide
Tested by:	ian (arm)
Tested by:	andreast (powerpc)
2013-03-18 00:25:02 +00:00
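The two index computations that a multi-digit radix trie with path compression depends on can be sketched as follows. This is an illustration only and not the code from the vmcontention branch: the 4-bit digit width and the names `sk_slot()` and `sk_keydiff()` are assumptions chosen for the example. The first helper picks the child slot for a key at a given level; the second finds the highest level at which two keys differ, which is where a single new interior node would be needed when levels without branching are skipped.

```c
#include <stdint.h>

#define SK_WIDTH	4			/* assumed bits per digit */
#define SK_COUNT	(1 << SK_WIDTH)		/* children per node */
#define SK_MASK		(SK_COUNT - 1)
#define SK_LIMIT	(64 / SK_WIDTH)		/* levels in a 64-bit key */

/* Return the digit of 'index' used to select a child at 'level'. */
static int
sk_slot(uint64_t index, int level)
{
	return ((int)((index >> (level * SK_WIDTH)) & SK_MASK));
}

/*
 * Return the highest level at which the two keys differ.
 * The keys are expected to be different; equal keys yield 0.
 */
static int
sk_keydiff(uint64_t a, uint64_t b)
{
	uint64_t diff = a ^ b;
	int level;

	for (level = SK_LIMIT - 1; level > 0; level--)
		if (((diff >> (level * SK_WIDTH)) & SK_MASK) != 0)
			break;
	return (level);
}
```

For example, the keys 0x1234 and 0x1334 first differ at level 2, so a path-compressed trie holding only those two keys needs one interior node at that level rather than a node for every level above it.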