freebsd-dev

Author	SHA1	Message	Date
Gleb Smirnoff	b7e2b86cec	Document some flags to the uma_zcreate(). Not all flags are documented, only those that at least are used in the kernel, or that definitely work.	2013-03-21 16:19:46 +00:00
Gleb Smirnoff	07f490ac97	Document uma_find_refcnt().	2013-03-21 16:04:34 +00:00
Alexander Motin	359b47db97	Minimal timer period of 100us introduced in r244758 is overkill. While original 2us are indeed not enough, 3us are working quite well on my tests. To be more safe set minimal period to 5us and to be even more safe replicate here from HPET mechanism of rereading counter after programming comparator. This change allows to handle 30K of short nanosleep() calls per second on Raspberry Pi instead of just 8K before. Discussed with: gonzo	2013-03-21 15:42:41 +00:00
John Baldwin	d071a6fa33	Another NFS SIGSTOP related fix: Ignore thread suspend requests due to SIGSTOP if stop signals are currently deferred. This can occur if a process is stopped via SIGSTOP while a thread is running or runnable but before it has set TDF_SBDRY. Tested by: pho Reviewed by: kib MFC after: 1 week	2013-03-21 14:06:27 +00:00
Konstantin Belousov	c46262f810	Fix twa(4) after the r246713. The driver copies data around to satisfy some alignment restrictions. Do not set TW_OSLI_REQ_FLAGS_CCB flag for mapped data, pass the csio->data_ptr in the req->data. Do not put the ccb pointer into req->data ever, ccb is stored in req->orig_req already. Submitted by: Shuichi KITAGUCHI <ki@hh.iij4u.or.jp> PR: kern/177020	2013-03-21 13:06:28 +00:00
Gleb Smirnoff	21f97e938d	Document NGM_NAT_LIBALIAS_INFO. Submitted by: Dmitry Luhtionov <dmitryluhtionov gmail.com>	2013-03-21 13:02:43 +00:00
Konstantin Belousov	4d569af96c	Initialize the variable to avoid (false) compiler warning about use of an uninitialized local. Reported by: Ivan Klymenko <fidaj@ukr.net> MFC after: 2 weeks	2013-03-21 12:59:24 +00:00
Eitan Adler	acdfecd359	Remove a reference to instant-server which has been removed from the ports tree in r313427. PR: 177012 Submitted by: Kevin Zheng <kevinz5000@gmail.com> Approved by: bcr (mentor)	2013-03-21 12:42:25 +00:00
Steven Hartland	2b114ad2a4	Add missing descriptions for ZFS sysctls Reviewed by: pjd (mentor) Approved by: pjd (mentor) MFC after: 2 weeks	2013-03-21 11:25:21 +00:00
Joel Dahl	b22247c287	Remove EOL whitespace.	2013-03-21 11:22:13 +00:00
Steven Hartland	adea827b21	Optimisation of TRIM processing. Previously TRIM processing was very bursty. This was made worse by the fact that TRIM requests on SSDs are typically much slower than reads or writes. This often resulted in stalls while large numbers of TRIMs were processed. In addition, due to the way the TRIM thread was only woken by writes, deletes could stall in the queue for extended periods of time. This patch adds a number of controls to how often the TRIM thread for each SPA processes its outstanding delete requests. vfs.zfs.trim.timeout: Delay TRIMs by up to this many seconds vfs.zfs.trim.txg_delay: Delay TRIMs by up to this many TXGs (reduced to 32) vfs.zfs.vdev.trim_max_bytes: Maximum pending TRIM bytes for a vdev vfs.zfs.vdev.trim_max_pending: Maximum pending TRIM segments for a vdev vfs.zfs.trim.max_interval: Maximum interval between TRIM queue processing (seconds) Given the most common TRIM implementation is ATA TRIM the current defaults are targeted at that. Reviewed by: pjd (mentor) Approved by: pjd (mentor) MFC after: 2 weeks	2013-03-21 11:02:08 +00:00
Steven Hartland	6ad46cec23	Names the ZFS TRIM thread Reviewed by: pjd (mentor) Approved by: pjd (mentor) MFC after: 2 weeks	2013-03-21 10:41:30 +00:00
Steven Hartland	89e5b43079	TRIM cache devices based on time instead of TXGs. Currently, the trim module uses the same algorithm for data and cache devices when deciding to issue TRIM requests, based on how far in the past the TXG is. Unfortunately, this is not ideal for cache devices, because the L2ARC doesn't use the concept of TXGs at all. In fact, when using a pool for reading only, the L2ARC is written but the TXG counter doesn't increase, and so no new TRIM requests are issued to the cache device. This patch fixes the issue by using time instead of the TXG number as the criterion for trimming on cache devices. The basic delay principle stays the same, but parameters are expressed in seconds instead of TXGs. The new parameters are named trim_l2arc_limit and trim_l2arc_batch, and both default to 30 seconds. Reviewed by: pjd (mentor) Approved by: pjd (mentor) Obtained from: `17122c31ac` MFC after: 2 weeks	2013-03-21 10:29:05 +00:00
Steven Hartland	78ad0c1c80	Improve TXG handling in the TRIM module. This patch adds some improvements to the way the trim module considers TXGs: - Free ZIOs are registered with the TXG from the ZIO itself, not the current SPA syncing TXG (which may be out of date); - L2ARC are registered with a zero TXG number, as L2ARC has no concept of TXGs; - The TXG limit for issuing TRIMs is now computed from the last synced TXG, not the currently syncing TXG. Indeed, under extremely unlikely race conditions, there is a risk we could trim blocks which have been freed in a TXG that has not finished syncing, resulting in potential data corruption in case of a crash. Reviewed by: pjd (mentor) Approved by: pjd (mentor) Obtained from: `5b46ad40d9` MFC after: 2 weeks	2013-03-21 10:16:10 +00:00
Steven Hartland	e07e3a3792	Don't register repair writes in the trim map. The trim map inflight writes tree assumes non-conflicting writes, i.e. that there will never be two simultaneous write I/Os to the same range on the same vdev. This seemed like a sane assumption; however, in actual testing, it appears that repair I/Os can very well conflict with "normal" writes. I'm not quite sure if these conflicting writes are supposed to happen or not, but in the mean time, let's ignore repair writes for now. This should be safe considering that, by definition, we never repair blocks that are freed. Reviewed by: pjd (mentor) Approved by: pjd (mentor) Obtained from: Source: `6a3cebaf7c`	2013-03-21 10:02:32 +00:00
Steven Hartland	e05aad2d33	Add TRIM support for L2ARC. This adds TRIM support to cache vdevs. When ARC buffers are removed from the L2ARC in arc_hdr_destroy(), arc_release() or l2arc_evict(), the size previously occupied by the buffer gets scheduled for TRIMming. As always, actual TRIMs are only issued to the L2ARC after txg_trim_limit. Reviewed by: pjd (mentor) Approved by: pjd (mentor) Obtained from: `31aae37399` MFC after: 2 weeks	2013-03-21 09:34:41 +00:00
Martin Matuska	05f49d92ef	Merge libzfs_core branch: includes MFV 238590, 238592, 247580 MFV 238590, 238592: In the first zfs ioctl restructuring phase, the libzfs_core library was introduced. It is a new thin library that wraps around kernel ioctl's. The idea is to provide a forward-compatible way of dealing with new features. Arguments are passed in nvlists and not random zfs_cmd fields, new-style ioctls are logged to pool history using a new method of history logging. http://blog.delphix.com/matt/2012/01/17/the-future-of-libzfs/ MFV 247580 [1]: To address issues of several deadlocks and race conditions the locking code around dsl_dataset was rewritten and the interface to synctasks was changed. User-Visible Changes: "zfs snapshot" can create more arbitrary snapshots at once (atomically) "zfs destroy" destroys multiple snapshots at once "zfs recv" has improved performance Backward Compatibility: I have extended the compatibility layer to support full backward compatibility by remapping or rewriting the responsible ioctl arguments. Old utilities are fully supported by the new kernel module. Forward Compatibility: New utilities work with old kernels with the following restrictions: - creating, destroying, holding and releasing of multiple snapshots at once is not supported, this includes recursive (-r) commands Illumos ZFS issues: 2882 implement libzfs_core 2900 "zfs snapshot" should be able to create multiple, arbitrary snapshots at once 3464 zfs synctask code needs restructuring References: https://www.illumos.org/issues/2882 https://www.illumos.org/issues/2900 https://www.illumos.org/issues/3464 [1] MFC after: 1 month Sponsored by: Hybrid Logic Inc. [1]	2013-03-21 08:38:03 +00:00
Gleb Smirnoff	5aedfa32a4	Add NGM_NAT_LIBALIAS_INFO command, that reports internal stats of libalias instance. To be used in the mpd5 daemon. Submitted by: Dmitry Luhtionov <dmitryluhtionov gmail.com>	2013-03-21 08:36:15 +00:00
Konstantin Belousov	7db07e1c85	Only size and create the bio_transient_map when unmapped buffers are enabled. Now, disabling the unmapped buffers should result in the kernel memory map identical to pre-r248550. Sponsored by: The FreeBSD Foundation	2013-03-21 07:28:15 +00:00
Konstantin Belousov	6c83fce371	Assert that transient mapping of the bio is only done when unmapped buffers are allowed. Sponsored by: The FreeBSD Foundation	2013-03-21 07:26:33 +00:00
Konstantin Belousov	7157d8f7ab	Do not call vnode_pager_setsize() while a NFS node mutex is locked. vnode_pager_setsize() might sleep waiting for the page after EOF be unbusied. Call vnode_pager_setsize() both for the regular and directory vnodes. Reported by: mich Reviewed by: rmacklem Discussed with: avg, jhb MFC after: 2 weeks	2013-03-21 07:25:08 +00:00
Hans Petter Selasky	3232aae327	Add new USB ID. PR: usb/177173 MFC after: 1 week	2013-03-21 07:04:17 +00:00
Neel Natu	7778bd576f	Set WARNS=3 so this actually compiles.	2013-03-20 21:47:05 +00:00
Konstantin Belousov	e3269b5096	In bufwrite(), a dirty buffer is moved to the clean queue before the bufobj counter of the writes in progress is incremented. Another thread inspecting the bufobj would consider it clean. For the regular vnodes, the vnode lock is typically held both by the thread performing the bufwrite() and another thread doing syncing, which prevents the situation. On the other hand, writes to the VCHR vnodes are done without holding the vnode lock. Increment the write ref counter for the buffer object before calling bundirty(). Sponsored by: The FreeBSD Foundation Tested by: pho MFC after: 2 weeks	2013-03-20 21:08:00 +00:00
Konstantin Belousov	8d6884ce9c	When the journaled FFS volume is suspended due to the journal space becoming too low, the softdep flush thread processes the workitems, which frees the space in journal, and then unsuspends the fs. The softdep_flush() and other workitem processing functions busy the filesystem before iterating over the worklist, to prevent the parallel unmount from freeing the mount data. The vfs_busy() is called with MBF_NOWAIT flag. Now, if the unmount is already started and the filesystem is suspended due to low journal space, the journal is never flushed and filesystem is never unsuspended, because vfs_busy(MBF_NOWAIT) call cannot succeed for the unmounting fs, and softdep_flush() does not process the workitems. Unmount needs to write metadata, where it hangs in the "suspfs" state. Move the vn_start_write() call in the dounmount() before setting the MNTK_UNMOUNT flag. This practically ensures that softdep_flush() processed the pending journal writes by making dounmount() wait for the lift of the suspension. Sponsored by: The FreeBSD Foundation Reported and tested by: pho MFC after: 2 weeks	2013-03-20 21:07:49 +00:00
Kirk McKusick	3289d5877a	When renaming a directory from one parent directory to another, we need to call ufs_checkpath() to walk from our new location to the root of the filesystem to ensure that we do not encounter ourselves along the way. Until now, we accomplished this by reading the ".." entries of each directory in our path until we reached the root (or encountered an error). This change tries to avoid the I/O of reading the ".." entries by first looking them up in the name cache and only doing the I/O when the name cache lookup fails. Reviewed by: kib Tested by: Peter Holm MFC after: 4 weeks	2013-03-20 17:57:00 +00:00
Aleksandr Rybalko	a2c472e741	Integrate Efika MX project back to home. Sponsored by: The FreeBSD Foundation	2013-03-20 15:39:27 +00:00
Hans Petter Selasky	76be9c89ba	Fix spelling.	2013-03-20 11:51:26 +00:00
Alexander V. Chernikov	2d6fcc3912	Remove unused variable.	2013-03-20 10:36:38 +00:00
Alexander V. Chernikov	ae01d73c04	Add ipfw support for setting/matching DiffServ codepoints (DSCP). Setting DSCP support is done via O_SETDSCP which works for both IPv4 and IPv6 packets. Fast checksum recalculation (RFC 1624) is done for IPv4. Dscp can be specified by name (AFXY, CSX, BE, EF), by value (0..63) or via tablearg. Matching DSCP is done via another opcode (O_DSCP) which accepts several classes at once (af11,af22,be). Classes are stored in bitmask (2 u32 words). Many people made their variants of this patch, the ones I'm aware of are (in alphabetic order): Dmitrii Tejblum Marcelo Araujo Roman Bogorodskiy (novel) Sergey Matveichuk (sem) Sergey Ryabin PR: kern/102471, kern/121122 MFC after: 2 weeks	2013-03-20 10:35:33 +00:00
Martin Matuska	192d547574	Release hold on pool before calling zvol_create_minor()	2013-03-20 09:56:20 +00:00
Konstantin Belousov	6991ee13a6	Fix the logic inversion in the r248512. Noted by: mckay	2013-03-20 09:44:23 +00:00
Andrew Turner	e9a848494f	Pull in r177252 from upstream clang trunk: Make sure to use same EABI version for external assembler as for integrated as. This allows us to use gcc on a world built with clang on ARM.	2013-03-20 08:34:30 +00:00
Adrian Chadd	9cda8c8082	Fix the EDMA CABQ handling - for now, the CABQ takes a descriptor chain like the legacy chips expect.	2013-03-20 05:44:03 +00:00
Pyun YongHyeon	cf402cc979	For RTL8211B or later PHYs, enable crossover detection and auto-correction. This change makes re(4) establish a link with a system using non-crossover UTP cable. Tested by: Michael BlackHeart < amdmiek <> gmail dot com >	2013-03-20 05:31:34 +00:00
Adrian Chadd	bd8cbcc32c	Add VNET wrappers around the rest of the ieee80211 rtsock messages. I triggered the cac/radar messages when doing testing in DFS channels.	2013-03-20 02:42:52 +00:00
Martin Matuska	a0abc0d302	Run zvol_create_minors() only if in non-error case	2013-03-19 22:27:15 +00:00
Martin Matuska	e56718d734	Run zvol_create_minors() on snapshot creation	2013-03-19 22:14:50 +00:00
Joel Dahl	7cf62795b7	Add simple example.	2013-03-19 21:40:14 +00:00
Jilles Tjoelker	c2e3c52e0d	Implement SOCK_CLOEXEC, SOCK_NONBLOCK and MSG_CMSG_CLOEXEC. This change allows creating file descriptors with close-on-exec set in some situations. SOCK_CLOEXEC and SOCK_NONBLOCK can be OR'ed in socket() and socketpair()'s type parameter, and MSG_CMSG_CLOEXEC to recvmsg() makes file descriptors (SCM_RIGHTS) atomically close-on-exec. The numerical values for SOCK_CLOEXEC and SOCK_NONBLOCK are as in NetBSD. MSG_CMSG_CLOEXEC is the first free bit for MSG_. The SOCK_ flags are not passed to MAC because this may cause incorrect failures and can be done later via fcntl() anyway. On the other hand, audit is expected to cope with the new flags. For MSG_CMSG_CLOEXEC, unp_externalize() is extended to take a flags argument. Reviewed by: kib	2013-03-19 20:58:17 +00:00
Adrian Chadd	f0db652cf6	Break out the RX completion path into "FIFO check / refill" and "complete RX frames." The 128 entry RX FIFO is really easy to fill up and miss refilling when it's done in the ath taskq - as that gets blocked up doing RX completion, TX completion and other random things. So the 128 entry RX FIFO now gets emptied and refilled in the ath_intr() task (and it grabs / releases locks, so now ath_intr() can't just be a FAST handler yet!) but the locks aren't held for very long. The completion part is done in the ath taskqueue context. Details: * Create a new completed frame list - sc->sc_rx_rxlist; * Split the EDMA RX process queue into two halves - one that processes the RX FIFO and refills it with new frames; another that completes the completed frame list; * When tearing down the driver, flush whatever is in the deferred queue as well as what's in the FIFO; * Create two new RX methods - one that processes all RX queues, one that processes the given RX queue. When MSI is implemented, we get told which RX queue the interrupt came in on so we can specifically schedule that. (And I can do that with the non-MSI path too; I'll figure that out later.) * Convert the legacy code over to use these new RX methods; * Replace all the instances of the RX taskqueue enqueue with a call to a relevant RX method to enqueue one or all RX queues. Tested: * AR9380, STA * AR9580, STA * AR5413, STA	2013-03-19 19:32:28 +00:00
Adrian Chadd	74ea88c379	Add more TODO items.	2013-03-19 17:55:36 +00:00
Adrian Chadd	378a752f59	Now that the tx map field is correctly populated for both edma and legacy chips, just use that.	2013-03-19 17:54:37 +00:00
Warner Losh	2493aadade	Add a comment about why aout support is still here: We need it for compat2x, which is still in use, as evidenced by recent bug reports.	2013-03-19 16:57:04 +00:00
Konstantin Belousov	129c6621f7	ahci(4) and siis(4) are ready to process the unmapped i/o requests Sponsored by: The FreeBSD Foundation Tested by: pho Submitted by: bf (siis patch)	2013-03-19 15:09:32 +00:00
Konstantin Belousov	59a01b70af	UFS support of the unmapped i/o for the user data buffers. Sponsored by: The FreeBSD Foundation Tested by: pho, scottl, jhb, bf	2013-03-19 15:08:15 +00:00
Konstantin Belousov	2649fcc1d8	Commit the removal of a whitespace to record the proper commit message for the r248519: For the cam-attached HBAs, allow the driver to specify that it accepts the unmapped bio by the PIM_UNMAPPED flag. The CAM passes the CAM_DATA_BIO data transfer type request for the unmapped bio, and the driver could use the bus_dmamap_load_ccb() as a helper to transparently handle the ccb. Sponsored by: The FreeBSD Foundation Reviewed by: scottl Tested by: pho, scottl	2013-03-19 15:05:21 +00:00
Konstantin Belousov	abc1e60e0e	Support unmapped i/o for the md(4). The vnode-backed md(4) has to map the unmapped bio because VOP_READ() and VOP_WRITE() interfaces do not allow to pass unmapped requests to the filesystem. Vnode-backed md(4) uses pbufs instead of relying on the bio_transient_map, to avoid usual md deadlock. Sponsored by: The FreeBSD Foundation Tested by: pho, scottl	2013-03-19 15:01:50 +00:00
Konstantin Belousov	59ec9023ca	Support unmapped i/o for the md(4). The vnode-backed md(4) has to map the unmapped bio because VOP_READ() and VOP_WRITE() interfaces do not allow to pass unmapped requests to the filesystem. Vnode-backed md(4) uses pbufs instead of relying on the bio_transient_map, to avoid usual md deadlock. Sponsored by: The FreeBSD Foundation Tested by: pho, scottl	2013-03-19 14:53:23 +00:00
Konstantin Belousov	db7bfaa8ce	The geom_part provider supports unmapped bio iff the underlying provider does so, since geom_part never inspects the bio_data. Sponsored by: The FreeBSD Foundation Tested by: pho	2013-03-19 14:50:24 +00:00
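
Note on c2e3c52e0d (SOCK_CLOEXEC, SOCK_NONBLOCK): the commit message says both flags can be OR'ed into the type parameter of socket() and socketpair(). The following is a minimal userland sketch of that usage, not code from the commit. The helper names make_private_socketpair() and fd_is_cloexec_nonblock() are hypothetical and exist only for illustration.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a UNIX-domain stream socket pair with close-on-exec and
 * non-blocking mode requested at creation time, so there is no window
 * between creating the descriptors and a later fcntl() call.
 * Returns 0 on success, -1 on error (errno is set by socketpair()).
 * Hypothetical helper, for illustration only.
 */
int
make_private_socketpair(int sv[2])
{
	return (socketpair(AF_UNIX,
	    SOCK_STREAM | SOCK_CLOEXEC | SOCK_NONBLOCK, 0, sv));
}

/*
 * Return 1 if fd has both FD_CLOEXEC and O_NONBLOCK set, 0 otherwise.
 * Hypothetical helper, for illustration only.
 */
int
fd_is_cloexec_nonblock(int fd)
{
	int fdflags = fcntl(fd, F_GETFD);	/* descriptor flags */
	int flflags = fcntl(fd, F_GETFL);	/* file status flags */

	if (fdflags == -1 || flflags == -1)
		return (0);
	return ((fdflags & FD_CLOEXEC) != 0 && (flflags & O_NONBLOCK) != 0);
}
```

A caller would pass an int[2] array to make_private_socketpair() and could then check each descriptor with fd_is_cloexec_nonblock(). MSG_CMSG_CLOEXEC, also mentioned in the commit, applies to recvmsg() and is not exercised in this sketch.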

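Note on ae01d73c04 (ipfw DSCP support): the commit message states that fast checksum recalculation (RFC 1624) is done for IPv4 when the DSCP is set. The sketch below shows what such an incremental update can look like on a raw IPv4 header, using RFC 1624 equation 3. It is an assumption-laden illustration, not the ipfw implementation: the function names ip_cksum() and ipv4_set_dscp() are hypothetical, and the header is treated as a plain byte array.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Full Internet checksum over an even-length header, reading 16-bit
 * words in network byte order. Hypothetical helper used as a reference.
 */
uint16_t
ip_cksum(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;

	for (size_t i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)(hdr[i] << 8 | hdr[i + 1]);
	while (sum >> 16)			/* fold carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}

/*
 * Set the 6-bit DSCP in an IPv4 header and patch the header checksum
 * incrementally: HC' = ~(~HC + ~m + m')  (RFC 1624, eqn. 3), where m is
 * the old 16-bit word holding version/IHL and TOS, and m' the new one.
 * The two ECN bits in the TOS byte are preserved.
 */
void
ipv4_set_dscp(uint8_t *hdr, uint8_t dscp)
{
	uint16_t old_word = (uint16_t)(hdr[0] << 8 | hdr[1]);
	uint16_t old_sum = (uint16_t)(hdr[10] << 8 | hdr[11]);
	uint16_t new_word, new_sum;
	uint32_t sum;

	hdr[1] = (uint8_t)(((dscp & 0x3f) << 2) | (hdr[1] & 0x03));
	new_word = (uint16_t)(hdr[0] << 8 | hdr[1]);

	sum = (uint16_t)~old_sum;
	sum += (uint16_t)~old_word;
	sum += new_word;
	while (sum >> 16)			/* one's-complement fold */
		sum = (sum & 0xffff) + (sum >> 16);
	new_sum = (uint16_t)~sum;

	hdr[10] = (uint8_t)(new_sum >> 8);
	hdr[11] = (uint8_t)(new_sum & 0xff);
}
```

After ipv4_set_dscp() runs on a header whose checksum was valid, running ip_cksum() over the whole header (checksum field included) should yield zero, which is the usual validity check. Only one 16-bit word changes, so the update touches three words instead of re-summing the header.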
179768 Commits