94399 Commits

Author SHA1 Message Date
Xin LI
42d875a536 Fix build. 2013-08-17 00:25:11 +00:00
Ian Lepore
ed7142a72d Consistently init all mmc request, command, and data structures to zero
before using them.
2013-08-17 00:19:27 +00:00
Ian Lepore
a8328210d0 Handle command retries for commands originating at the mmc layer, and
ensure that all such commands have a non-zero retry count except for those
that are expected to fail (for example, because they are used to probe for
feature support).

While it is possible to pass a retry count down to the hardware driver in
the command request structure, no hardware driver currently implements any
retry logic.  The hardware doesn't know much about the context of a single
request, so it makes more sense to handle retries at a layer that does.

This adds retry loops to the mmc_wait_for_cmd() and mmc_wait_for_app_cmd()
functions.  These functions are the gateway from other code within mmc.c
to the hardware.  App commands are a sequence of two commands and a retry
has to rerun both of them in order, so it needs its own retry loop.

Retry looping is specifically NOT implemented in mmc_wait_for_request()
because it is the gateway for children on the bus, and they have to
implement their own retry logic depending on what makes sense for them.
2013-08-16 23:05:34 +00:00
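The retry scheme described in a8328210d0 can be sketched roughly as follows. This is a minimal illustration, not the mmc.c code: every name here (sketch_cmd, sketch_issue_fn, sketch_flaky_issue, and so on) is hypothetical, and the "hardware" is a stub that fails a configurable number of times. The point is only that a plain command retries one command, while an app command retries the whole two-command sequence.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a command; not the real mmc(4) structure. */
struct sketch_cmd {
	int opcode;
	int error;	/* 0 on success, nonzero on failure */
};

/* Hypothetical hook that hands one command to the "hardware". */
typedef int (*sketch_issue_fn)(struct sketch_cmd *cmd, void *arg);

/* Stub hardware: fails `failures_left` times, then succeeds. */
struct sketch_flaky {
	int failures_left;
	int calls;	/* total commands issued so far */
};

static int
sketch_flaky_issue(struct sketch_cmd *cmd, void *arg)
{
	struct sketch_flaky *hw = arg;

	hw->calls++;
	if (hw->failures_left > 0) {
		hw->failures_left--;
		cmd->error = 1;
		return (1);
	}
	cmd->error = 0;
	return (0);
}

/*
 * Issue one command, retrying up to `retries` extra times on error.
 * A retry count of 0 means "try exactly once", which suits probe
 * commands that are expected to fail.
 */
static int
sketch_wait_for_cmd(sketch_issue_fn issue, void *arg,
    struct sketch_cmd *cmd, int retries)
{
	int err;

	assert(retries >= 0);
	do {
		err = issue(cmd, arg);
	} while (err != 0 && retries-- > 0);
	return (err);
}

/*
 * An app command is a prefix command followed by the real command.
 * A retry has to rerun both in order, so the loop wraps the pair.
 */
static int
sketch_wait_for_app_cmd(sketch_issue_fn issue, void *arg,
    struct sketch_cmd *prefix, struct sketch_cmd *cmd, int retries)
{
	int err;

	assert(retries >= 0);
	do {
		err = issue(prefix, arg);
		if (err == 0)
			err = issue(cmd, arg);
	} while (err != 0 && retries-- > 0);
	return (err);
}
```

With a stub that fails twice, sketch_wait_for_cmd() with three retries succeeds on the third attempt; with zero retries a single failure is returned to the caller unchanged.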
John Baldwin
5aa60b6f21 Add new mmap(2) flags to permit applications to request specific virtual
address alignment of mappings.
- MAP_ALIGNED(n) requests a mapping aligned on a boundary of (1 << n).
  Requests where n >= the number of bits in a pointer, or where the alignment
  is less than the size of a page, fail with EINVAL.  This matches the API
  provided by NetBSD.
- MAP_ALIGNED_SUPER is a special case of MAP_ALIGNED.  It can be used
  to optimize the chances of using large pages.  By default it will align
  the mapping on a large page boundary (the system is free to choose any
  large page size to align to that seems best for the mapping request).
  However, if the object being mapped is already using large pages, then
  it will align the virtual mapping to match the existing large pages in
  the object instead.
- Internally, VMFS_ALIGNED_SPACE is now renamed to VMFS_SUPER_SPACE, and
  VMFS_ALIGNED_SPACE(n) is repurposed for specifying a specific alignment.
  MAP_ALIGNED(n) maps to using VMFS_ALIGNED_SPACE(n), while
  MAP_ALIGNED_SUPER maps to VMFS_SUPER_SPACE.
- mmap() of a device object now uses VMFS_OPTIMAL_SPACE rather than
  explicitly using VMFS_SUPER_SPACE.  All device objects are forced to
  use a specific color on creation, so VMFS_OPTIMAL_SPACE is effectively
  equivalent.

Reviewed by:	alc
MFC after:	1 month
2013-08-16 21:13:55 +00:00
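The range check that 5aa60b6f21 describes for MAP_ALIGNED(n) can be illustrated with a small userland sketch. The helper names below (sketch_check_map_aligned, sketch_is_aligned) are hypothetical and the page size is passed in as a parameter; this is not the kernel's validation code, only the rule as stated in the message.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return 0 if an alignment request of (1 << n)
 * would pass the range check described in the commit message, or
 * EINVAL otherwise.  `page_size` must be a power of two.
 */
static int
sketch_check_map_aligned(unsigned int n, size_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	/* n must be smaller than the number of bits in a pointer. */
	if (n >= sizeof(void *) * CHAR_BIT)
		return (EINVAL);
	/* The boundary (1 << n) must be at least one page. */
	if (((uintptr_t)1 << n) < page_size)
		return (EINVAL);
	return (0);
}

/* True if addr sits on a (1 << n) boundary; assumes n passed the check. */
static int
sketch_is_aligned(uintptr_t addr, unsigned int n)
{
	return ((addr & (((uintptr_t)1 << n) - 1)) == 0);
}
```

For example, with 4096-byte pages n = 12 is the smallest accepted value, and n = 21 asks for a 2MB boundary.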
Ian Lepore
df736d55a1 During card identification, run the bus at 400KHz, not the minimum
speed the bus claims to be capable of.  The 400KHz speed is dictated
by the SD and MMC standards.
2013-08-16 20:32:56 +00:00
Ian Lepore
65f63c73cb Print the card relative address in hex, because that's what all the
other debugging output does (when it appears in command arguments,
for example).
2013-08-16 20:22:57 +00:00
Ian Lepore
54c665855d Add named constants for 8-bit bus support. The sdhci and mmc drivers
don't have support for this yet, but some low-level hardware is ready
for it when the higher layers catch up.
2013-08-16 19:44:49 +00:00
Ian Lepore
ceb9e9f70d When the timeout clock is based on the SD clock, the timeout counter
has to be recalculated every time the SD clock frequency changes.

Also, tidy up the counter calculation... it makes no sense to calculate
a value one larger than the limit, then whine that it's too large and
truncate it to the limit.  If the BROKEN_TIMEOUT quirk is set, don't
calculate the counter at all, just set it to the limit value.
2013-08-16 19:40:00 +00:00
Kenneth D. Merry
aeb681d798 Add unmapped I/O and larger I/O support to the sa(4) driver.
We now pay attention to the maxio field in the XPT_PATH_INQ CCB,
and if it is set, propagate it up to physio via the si_iosize_max
field in the cdev structure.

We also now pay attention to the PIM_UNMAPPED capability bit in the
XPT_PATH_INQ CCB, and set the new SI_UNMAPPED cdev flag when the
underlying SIM supports unmapped I/O.

scsi_sa.c:	Add unmapped I/O support and propagate the SIM's
		maximum I/O size up.

		Adjust scsi_tape_read_write() in the same way that
		scsi_read_write() was changed to support unmapped
		I/O.  We overload the readop parameter with bits
		that tell us whether it's an unmapped I/O, and we
		need to set the CAM_DATA_BIO CCB flag.  This change
		should be backwards compatible in source and
		binary forms.

MFC after:	1 week
Sponsored by:	Spectra Logic
2013-08-16 16:14:32 +00:00
Konstantin Belousov
b1dd38f408 Restore the previous sendfile(2) behaviour on the block devices.
Provide valid .fo_sendfile method for several missed struct fileops.

Reviewed by:	glebius
Sponsored by:	The FreeBSD Foundation
2013-08-16 14:22:20 +00:00
Kevin Lo
612cf1ca01 Bring datasheet URL up to date. 2013-08-16 07:42:06 +00:00
Mark Johnston
196f2f42eb Use strdup(9) instead of reimplementing it. 2013-08-16 03:41:41 +00:00
Kenneth D. Merry
ce625ec719 Change the way that unmapped I/O capability is advertised.
The previous method was to set the D_UNMAPPED_IO flag in the cdevsw
for the driver.  The problem with this is that in many cases (e.g.
sa(4)) there may be some instances of the driver that can handle
unmapped I/O and some that can't.  The isp(4) driver can handle
unmapped I/O, but the esp(4) driver currently cannot.  The cdevsw
is shared among all driver instances.

So instead of setting a flag on the cdevsw, set a flag on the cdev.
This allows drivers to indicate support for unmapped I/O on a
per-instance basis.

sys/conf.h:	Remove the D_UNMAPPED_IO cdevsw flag and replace it
		with an SI_UNMAPPED cdev flag.

kern_physio.c:	Look at the cdev SI_UNMAPPED flag to determine
		whether or not a particular driver can handle
		unmapped I/O.

geom_dev.c:	Set the SI_UNMAPPED flag for all GEOM cdevs.
		Since GEOM will create a temporary mapping when
		needed, setting SI_UNMAPPED unconditionally will
		work.

		Remove the D_UNMAPPED_IO flag.

nvme_ns.c:	Set the SI_UNMAPPED flag on cdevs created here
		if NVME_UNMAPPED_BIO_SUPPORT is enabled.

vfs_aio.c:	In aio_qphysio(), check the SI_UNMAPPED flag on a
		cdev instead of the D_UNMAPPED_IO flag on the cdevsw.

sys/param.h:	Bump __FreeBSD_version to 1000045 for the switch from
		setting the D_UNMAPPED_IO flag in the cdevsw to setting
		SI_UNMAPPED in the cdev.

Reviewed by:	kib, jimharris
MFC after:	1 week
Sponsored by:	Spectra Logic
2013-08-15 22:52:39 +00:00
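The motivation in ce625ec719 (one shared cdevsw, many cdev instances) can be shown with a toy model. The structures and the flag value below are hypothetical simplifications, not the definitions from sys/conf.h; they only illustrate why a per-instance flag can express what a per-driver flag cannot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	SKETCH_SI_UNMAPPED	0x0001u	/* hypothetical bit value */

/* Shared by every instance of a driver. */
struct sketch_cdevsw {
	const char *d_name;
};

/* One per device instance; carries its own capability flags. */
struct sketch_cdev {
	const struct sketch_cdevsw *si_devsw;
	unsigned int si_flags;
};

/* Per-instance check, in the spirit of what physio would consult. */
static bool
sketch_supports_unmapped(const struct sketch_cdev *dev)
{
	assert(dev != NULL);
	return ((dev->si_flags & SKETCH_SI_UNMAPPED) != 0);
}
```

Two instances can point at the same sketch_cdevsw while only one of them sets SKETCH_SI_UNMAPPED, which mirrors the sa(4) case of tape drives attached to different controllers.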
Jeff Roberson
114f62c6df - Fix bug in r254304. Use the ACTIVE pq count for the active list
processing, not inactive.  This was the result of a bad merge.

Reported by:	pho
Sponsored by:	EMC / Isilon Storage Division
2013-08-15 22:29:49 +00:00
Jung-uk Kim
5772203b17 Simplify check for CMPXCHG8B instruction. Note CMPXCHG8B instruction is
always available for Rise mP6 processors although it is not set by CPUID.
2013-08-15 21:09:05 +00:00
Colin Percival
2bb93f2d18 Change the queue of locks in kern_rangelock.c from holding lock requests in
the order that they arrive, to holding
(a) granted write lock requests, followed by
(b) granted read lock requests, followed by
(c) ungranted requests, in order of arrival.

This changes the stopping condition for iterating through granted locks to
see if a new request can be granted: When considering a read lock request,
we can stop iterating as soon as we see a read lock request, since anything
after that point is either a granted read lock request or a request which
has not yet been granted.  (For write lock requests, we must still compare
against all granted lock requests.)

For workloads with R parallel reads and W parallel writes, this improves
the time spent from O((R+W)^2) to O(W*(R+W)); i.e., heavy parallel-read
workloads become significantly more scalable.

No statistically significant change in buildworld time has been measured,
but synthetic tests of parallel 'dd > /dev/null' and 'openssl enc >/dev/null'
with the input file cached yield dramatic (up to 10x) improvement with high
(up to 128 processes) levels of parallelism.

Reviewed by:	kib
2013-08-15 20:19:17 +00:00
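The stopping condition from 2bb93f2d18 can be sketched over a plain array. This is an illustrative model only: the entry layout and function names are hypothetical, the queue is assumed to be already ordered as (a) granted writes, (b) granted reads, (c) ungranted requests, and the real kern_rangelock.c uses a linked queue rather than an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplified queue entry; not the kernel's structure. */
struct sketch_rl_entry {
	bool is_read;	/* read lock request, otherwise write */
	bool granted;
	long start;	/* inclusive */
	long end;	/* exclusive */
};

static bool
sketch_overlap(const struct sketch_rl_entry *a,
    const struct sketch_rl_entry *b)
{
	return (a->start < b->end && b->start < a->end);
}

/*
 * Decide whether a read request can be granted.  Only granted writes can
 * block a read, and they sit at the head of the queue, so the scan stops
 * at the first read request (or first ungranted request) it meets.
 * *examined reports how many entries were compared against the request.
 */
static bool
sketch_can_grant_read(const struct sketch_rl_entry *q, size_t n,
    const struct sketch_rl_entry *req, size_t *examined)
{
	size_t i;

	assert(req->is_read);
	for (i = 0; i < n; i++) {
		if (q[i].is_read || !q[i].granted)
			break;
		if (sketch_overlap(&q[i], req)) {
			*examined = i + 1;
			return (false);
		}
	}
	*examined = i;
	return (true);
}

/* A write request must still be compared against every granted lock. */
static bool
sketch_can_grant_write(const struct sketch_rl_entry *q, size_t n,
    const struct sketch_rl_entry *req)
{
	size_t i;

	assert(!req->is_read);
	for (i = 0; i < n && q[i].granted; i++) {
		if (sketch_overlap(&q[i], req))
			return (false);
	}
	return (true);
}
```

With W granted writes at the head, a read request costs at most W comparisons regardless of how many reads are already granted, which is where the O(W*(R+W)) bound in the message comes from.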
Jung-uk Kim
bd00cfe2c8 Avoid potential redefinition of the macro. 2013-08-15 20:03:22 +00:00
Edward Tomasz Napierala
da4757e06b Turn comments about locking into actual lock assertions.
Reviewed by:	ken
Tested by:	ken
MFC after:	1 month
2013-08-15 20:00:32 +00:00
Brooks Davis
cd234300d3 Use an ANSI C definition of initializecpucache() to match the declaration
and the rest of the file.
2013-08-15 17:44:44 +00:00
Brooks Davis
cb261f4315 Call set_i8254_freq with MODE_STOP (0) rather than a magic number of 0. 2013-08-15 17:21:06 +00:00
Kenneth D. Merry
7bf825d1d3 Export the maxio field in the CAM XPT_PATH_INQ CCB in the isp(4)
driver.

This tells consumers up the stack the maximum I/O size that the
controller can handle.

The I/O size is bounded by the number of scatter/gather segments
the controller can handle and the page size.  For an amd64 system,
it works out to around 5MB.

Reviewed by:	mjacob
MFC after:	3 days
Sponsored by:	Spectra Logic
2013-08-15 16:41:27 +00:00
Attilio Rao
a834cbaec8 On the recovery path for vm_page_alloc(), if a page had been requested
wired, unwind the wiring bits; otherwise we can end up freeing a
page that is considered wired.

Sponsored by:	EMC / Isilon storage division
Reported by:	alc
2013-08-15 11:01:25 +00:00
Jeremie Le Hen
2c7cd47838 Belatedly bump __FreeBSD_version for libc being an ld script.
This should have been done in r251668, on June 12, 2013.

This will have no practical consequences, besides having -lssp_nonshared
appearing twice on the command-line for systems built in this time frame.
2013-08-15 08:21:00 +00:00
Gleb Smirnoff
ca04d21d5f Make sendfile() a method in the struct fileops. Currently only
vnode backed file descriptors have this method implemented.

Reviewed by:	kib
Sponsored by:	Nginx, Inc.
Sponsored by:	Netflix
2013-08-15 07:54:31 +00:00
Mark Johnston
7b77e1fe0f Specify SDT probe argument types in the probe definition itself rather than
using SDT_PROBE_ARGTYPE(). This will make it easy to extend the SDT(9) API
to allow probes with dynamically-translated types.

There is no functional change.

MFC after:	2 weeks
2013-08-15 04:08:55 +00:00
Simon J. Gerraty
3d2bc9e872 Some objects, such as *_genassym.o, are not hooked into
SRCS, OBJS, or anything else, yet have a dependency on symlinks
such as machine/

Reviewed by: obrien
2013-08-14 22:19:29 +00:00
Michael Tuexen
0e05fbded9 Don't send uninitialized memory (two instances of 4 bytes) in
every cookie on the wire. This bug was reported in
https://bugzilla.mozilla.org/show_bug.cgi?id=905080

MFC after: 3 days
2013-08-14 21:51:32 +00:00
Rick Macklem
93c5875b24 Fix several performance related issues in the new NFS server's
DRC for NFS over TCP.
- Increase the size of the hash tables.
- Create a separate mutex for each hash list of the TCP hash table.
- Single thread the code that deletes stale cache entries.
- Add a tunable called vfs.nfsd.tcphighwater, which can be increased
  to allow the cache to grow larger, avoiding the overhead of frequent
  scans to delete stale cache entries.
  (The default value will result in frequent scans to delete stale cache
   entries, analogous to what the pre-patched code does.)
- Add a tunable called vfs.nfsd.cachetcp that can be used to disable
  DRC caching for NFS over TCP, since the old NFS server didn't DRC cache TCP.
It also adjusts the size of nfsrc_floodlevel dynamically, so that it is
always greater than vfs.nfsd.tcphighwater.

For UDP the algorithm remains the same as the pre-patched code, but the
tunable vfs.nfsd.udphighwater can be used to allow the cache to grow
larger and reduce the overhead caused by frequent scans for stale entries.
UDP also uses a larger hash table size than the pre-patched code.

Reported by:	wollman
Tested by:	wollman (earlier version of patch)
Submitted by:	ivoras (earlier patch)
Reviewed by:	jhb (earlier version of patch)
MFC after:	1 month
2013-08-14 21:11:26 +00:00
Sean Bruno
a2bc8a1d0c If sys/param.h MAXPHYS has been tuned to exceed MFI_MAXPHYS, the mfi(4)
real JBOD mode (SYS PD) would fail fairly reliably during I/O.

Steal the mfi_disk.c check for this condition (indirectly) when establishing
d_maxsize.

Reviewed by:	ambrisko@
MFC after:	4 weeks
Sponsored by:	Yahoo! Inc.
2013-08-14 15:50:34 +00:00
Steven Hartland
dce643c85f Added 4K quirks for:
* OCZ Agility 2 SSDs
* Marvell SSDs
* Intel X25-M Series SSDs
2013-08-14 15:18:28 +00:00
Pedro F. Giffuni
4a62545173 ext2fs: update format specifiers for ext4 type.
Previous bandaid was not appropriate and didn't really work for
all platforms. While here, cleanup the surrounding code to match
ffs_checkoverlap()

Reported by:	dim, jmallet and bde
MFC after:	3 weeks
2013-08-14 14:22:46 +00:00
Ulrich Spörlein
f1fe1d39e1 Fix make depend 2013-08-14 08:03:57 +00:00
Rui Paulo
a3e08d6f4c Replace the homegrown implementation of nitems() with calls to nitems()
(param.h).

Operating systems that don't have nitems() can easily define it in their own
net80211 OS-specific header file.

Discussed with:		adrian
2013-08-14 04:24:25 +00:00
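For a port to a system without nitems(), the fallback mentioned in a3e08d6f4c could look like the sketch below. The guard and macro follow the common sizeof-based idiom; the table and the two helper functions are hypothetical and exist only to exercise the macro.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fallback for systems whose headers lack nitems().  It only works on
 * true arrays, not on pointers.
 */
#ifndef nitems
#define	nitems(x)	(sizeof((x)) / sizeof((x)[0]))
#endif

/* Hypothetical table, for illustration only. */
static const int sketch_rates[] = { 2, 4, 11, 22 };

static size_t
sketch_rate_count(void)
{
	return (nitems(sketch_rates));
}

/* Bounds-checked lookup using nitems() instead of a hand-rolled count. */
static int
sketch_rate_at(size_t i)
{
	assert(i < nitems(sketch_rates));
	return (sketch_rates[i]);
}
```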
Mark Johnston
12ede07ab8 Use kld_{load,unload} instead of mod_{load,unload} for the linker file load
and unload event handlers added in r254266.

Reported by:	jhb
X-MFC with:	r254266
2013-08-14 00:42:21 +00:00
Jeff Roberson
99de9af2a6 - Disable quantum caches on the kmem_arena. This can make fragmentation
worse on small KVA systems.  I had intended to only enable it for
   debugging.

Sponsored by:	EMC / Isilon Storage Division
2013-08-13 22:41:24 +00:00
Jeff Roberson
8441d1e842 - Add a statically allocated memguard arena since it is needed very early
on.
 - Pass the appropriate flags to vmem_xalloc() when allocating space for
   the arena from kmem_arena.

Sponsored by:	EMC / Isilon Storage Division
2013-08-13 22:40:43 +00:00
Jung-uk Kim
38da30b419 Merge acpica_machdep.h for amd64 and i386 and move to x86. In fact, these
two files were functionally identical.
2013-08-13 22:05:10 +00:00
Jeff Roberson
d9e232109f Improve pageout flow control to wake up more frequently and do less work while
maintaining better LRU of active pages.

 - Change v_free_target to include the quantity previously represented by
   v_cache_min so we don't need to add them together everywhere we use them.
 - Add a pageout_wakeup_thresh that sets the free page count trigger for
   waking the page daemon.  Set this 10% above v_free_min so we wake up before
   any phase transitions in vm users.
 - Adjust down v_free_target now that we're willing to accept more pagedaemon
   wakeups.  This means we process fewer pages in one iteration as well,
   leading to shorter lock hold times and less overall disruption.
 - Eliminate vm_pageout_page_stats().  This was a minor variation on the
   PQ_ACTIVE segment of the normal pageout daemon.  Instead we now process
   1 / vm_pageout_update_period pages every second.  This causes us to visit
   the whole active list every 60 seconds.  Previously we would only maintain
   the active LRU when we were short on pages which would mean it could be
   woefully out of date.

Reviewed by:	alc (slight variant of this)
Discussed with:	alc, kib, jhb
Sponsored by:	EMC / Isilon Storage Division
2013-08-13 21:56:16 +00:00
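The active-list arithmetic in d9e232109f (scan a 1 / vm_pageout_update_period share of the list each second) can be written out as a small sketch. The function name is hypothetical, and rounding up is an assumption made here so that a short list is still visited; the commit message does not say how the kernel rounds.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of active pages to examine per second so
 * that the whole list is covered once every `update_period` seconds.
 * Rounds up (an assumption of this sketch).
 */
static unsigned long
sketch_scan_per_second(unsigned long active_count,
    unsigned int update_period)
{
	assert(update_period > 0);
	return ((active_count + update_period - 1) / update_period);
}
```

With 600000 active pages and a 60-second period this comes to 10000 pages per second.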
Jim Harris
086d23cfd3 If a controller fails to initialize, do not notify consumers (nvd) of its
namespaces.

Sponsored by:	Intel
Reviewed by:	carl
MFC after:	3 days
2013-08-13 21:49:32 +00:00
Jim Harris
56183abc2b Send a shutdown notification in the driver unload path, to ensure
the notification gets sent in cases where the system shuts down with the
driver unloaded.

Sponsored by:	Intel
Reviewed by:	carl
MFC after:	3 days
2013-08-13 21:47:08 +00:00
Jung-uk Kim
3bd12ca8f1 Tidy up global locks for ACPICA. There is no functional change. 2013-08-13 21:34:03 +00:00
Ian Lepore
9908a5a5e1 Rename imx_machdep.c to imx51_machdep.c, because it contains hardware
addresses which are specific to the imx51 chips.
2013-08-13 21:12:28 +00:00
Mikolaj Golub
c5c392e7ed Virtualize carp(4) variables to have per vnet control.
Reviewed by:	ae, glebius
2013-08-13 19:59:49 +00:00
John Baldwin
e05bf4cf95 Some small cleanups to the fixes in r180340:
- Set NOTE_TRACKERR before running filt_proc().  If the knote did not
  have NOTE_FORK set in fflags when registered, then the TRACKERR event
  could miss being posted.
- Don't pass the pid in to filt_proc() for NOTE_FORK events.  The special
  handling for pids is done in knote_fork() directly and no longer in
  filt_proc().

MFC after:	2 weeks
2013-08-13 18:45:58 +00:00
Pedro F. Giffuni
88ae190ea0 ext2fs: update format specifiers for ext4 type.
Reported by:	Sam Fourman Jr.
MFC after:	3 weeks
2013-08-13 18:39:36 +00:00
Pedro F. Giffuni
70097aac13 Define ext2fs local types and use them.
Add definitions for e2fs_daddr_t, e4fs_daddr_t in addition
to the already existing e2fs_lbn_t and adjust them for ext4.
Other than making the code more readable these changes should
fix problems related to big filesystems.

Setting the proper types can be tricky so the process was
helped by looking at UFS. In our implementation, logical block
numbers can be negative and the code depends on it. In ext2,
block numbers are unsigned so it is convenient to keep
e2fs_daddr_t unsigned and use the complete 32 bits. In the
case of e4fs_daddr_t, while the value should be unsigned, for
ext4 we only need to support 48 bits so preserving an extra
bit from the sign is not an issue.

While here also drop the ext2_setblock() prototype that was
never used.

Discussed with:	mckusick, bde
MFC after:	3 weeks
2013-08-13 15:40:43 +00:00
Gleb Smirnoff
90c35c1939 - Minor style(9) fix.
- Bring a comment up to date.
2013-08-13 13:40:31 +00:00
Ian Lepore
e0511b6c67 Add imx6 compatibility and make the driver work for any clock frequency.
There are still a couple references to imx51 ccm driver functions that will
need to be changed after an imx6 ccm driver is written.

Reviewed by:	ray
2013-08-13 13:14:13 +00:00
Adrian Chadd
a1df5ac10a ieee80211_rate2plcp() and ieee80211_rate2phytype() are both pre-11n
routines and thus assert if one passes in a rate code with the
high bit set.

Since the high bit can indicate either IEEE80211_RATE_BASIC or
IEEE80211_RATE_MCS, it's up to the caller to determine whether
the rate is 11n or not, and either mask out the BASIC bit, or
call a different function.

(Yes, this does mean that net80211 should grow 11n-aware rate2phytype()
and rate2plcp() functions..)

This may need to happen for the other drivers - it's currently only
done (now) for iwn(4) and bwi(4).

PR:		kern/181100
2013-08-13 09:58:27 +00:00
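The caller-side handling described in a1df5ac10a can be sketched as follows. The constants and functions are stand-ins with a SKETCH_ prefix, their values are assumed for illustration, and the "PLCP" mapping is a placeholder rather than a real table; the sketch only shows the caller deciding how to treat the high bit before reaching a pre-11n routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the net80211 constants; values assumed for this sketch. */
#define	SKETCH_RATE_HIGH_BIT	0x80	/* BASIC or MCS, depending on context */
#define	SKETCH_RATE_VAL		0x7f	/* remaining rate bits */

/* Hypothetical pre-11n routine: asserts if the high bit is set. */
static uint8_t
sketch_legacy_rate2plcp(uint8_t rate)
{
	assert((rate & SKETCH_RATE_HIGH_BIT) == 0);
	return (rate);	/* placeholder mapping, not a real PLCP table */
}

/*
 * Caller-side dispatch.  The caller knows whether 11n rates are in use,
 * so it decides how to interpret the high bit.  Returns true and fills
 * *plcp for legacy rates (masking out the BASIC bit); returns false for
 * MCS rates, which need a different, 11n-aware function.
 */
static bool
sketch_rate_to_plcp(uint8_t rate, bool is_11n, uint8_t *plcp)
{
	if (is_11n && (rate & SKETCH_RATE_HIGH_BIT) != 0)
		return (false);
	*plcp = sketch_legacy_rate2plcp(rate & SKETCH_RATE_VAL);
	return (true);
}
```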
Alexander Motin
0f0b2fd889 Return error when opening read-only volumes (like RAID4/5/...) for writing.
Previously opens succeeded, but actual write operations returned errors.

Requested by:	peter
MFC after:	2 weeks
2013-08-13 07:56:40 +00:00