freebsd-dev

Author	SHA1	Message	Date
Adrian Chadd	b837332d0a	Overhaul the TXQ locking (again!) as part of some beacon/cabq timing related issues. Moving the TX locking under one lock made things easier to progress on but it had one important side-effect - it increased the latency when handling CABQ setup when sending beacons. This commit introduces a bunch of new changes and a few unrelated changes that are just easier to lump in here. The aim is to have the CABQ locking separate from other locking. The CABQ transmit path in the beacon process thus doesn't have to grab the general TX lock, reducing lock contention/latency and making it more likely that we'll make the beacon TX timing. The second half of this commit is the CABQ related setup changes needed for sane looking EDMA CABQ support. Right now the EDMA TX code naively assumes that only one frame (MPDU or A-MPDU) is being pushed into each FIFO slot. For the CABQ this isn't true - a whole list of frames is being pushed in - and thus CABQ handling breaks very quickly. The aim here is to set up the CABQ list and then push _that list_ to the hardware for transmission. I can then extend the EDMA TX code to stamp that list as being "one" FIFO entry (likely by tagging the last buffer in that list as "FIFO END") so the EDMA TX completion code correctly tracks things. Major: * Migrate the per-TXQ add/removal locking back to per-TXQ, rather than a single lock. * Leave the software queue side of things under the ATH_TX_LOCK lock, (continuing) to serialise things as they are. * Add a new function which is called whenever there's a beacon miss, to print out some debugging. This is primarily designed to help me figure out if the beacon miss events are due to a noisy environment, issues with the PHY/MAC, or other. * Move the CABQ setup/enable to occur _after_ all the VAPs have been looked at. This means that for multiple VAPs in bursted mode, the CABQ gets primed once all VAPs are checked, rather than being primed on the first VAP and then having frames appended after this. Minor: * Add a (disabled) twiddle to let me enable/disable cabq traffic. It's primarily there to let me easily debug what's going on with beacon and CABQ setup/traffic; there's some DMA engine hangs which I'm finally trying to trace down. * Clear bf_next when flushing frames; it should quieten some warnings that show up when a node goes away. Tested: * AR9280, STA/hostap, up to 4 vaps (staggered) * AR5416, STA/hostap, up to 4 vaps (staggered) TODO: * (Lots) more AR9380 and later testing, as I may have missed something here. * Leverage this to fix CABQ handling for AR9380 and later chips. * Force bursted beaconing on the chips that default to staggered beacons and ensure the CABQ stuff is all sane (eg, the MORE bits that aren't being correctly set when chaining descriptors.)	2013-03-24 00:03:12 +00:00
Adrian Chadd	49ddabc4bd	CABQ calculation changes to try and fix some weird corner cases leading to stuck beacons. * Set the cabq readytime (ie, how long to burst for) to 50% of the total beacon interval time * fix the cabq adjustment calculation based on how the beacon offset is calculated (the SWBA/DBA time offset.) This is all still a bit magic voodoo but it does seem to have further quietened issues with missed/stuck beacons under my local testing. In any case, it better matches what the reference HAL implements. Obtained from: Qualcomm Atheros	2013-03-23 23:51:11 +00:00
Konstantin Belousov	b11d58b63f	Do not call malloc(M_WAITOK) while bodev->fence_lock mutex is held. The ttm_buffer_object_transfer() does not need the mutex locked at all, except for the call to the driver sync_obj_ref() method. Reported and tested by: dumbbell MFC after: 2 weeks	2013-03-23 22:23:15 +00:00
Jean-Sébastien Pédron	accadf8de2	drm/ttm: Fix a typo: s/pTTM]/[TTM]/	2013-03-23 20:46:47 +00:00
Jean-Sébastien Pédron	76c40c6986	drm/ttm: Explain why we don't need to acquire a ref in ttm_bo_vm_ctor()	2013-03-23 20:43:26 +00:00
Martin Matuska	7608b757d7	Fix kernel build with options ZFS after r24571 (libzfs_core). Submitted by: Bjoern A. Zeeb <bz@FreeBSD.org>	2013-03-23 20:01:45 +00:00
Jean-Sébastien Pédron	a649986089	drm/ttm: Fix TTM buffer object refcount This fixes memory leaks in the radeonkms driver. Reviewed by: Konstantin Belousov (kib@) Tested by: J.R. Oldroyd <jr@opal.com>	2013-03-23 19:19:19 +00:00
Ian Lepore	49addc5755	Don't check and warn about pmap mismatch on every call to busdma sync. With some recent busdma refactoring, sometimes it happens that a sync op gets called when bus_dmamap_load() never got called, which results in a spurious warning about a map mismatch when no sync operations will actually happen anyway. Now the check is done only if a sync operation is actually performed, and the result of the check is a panic, not just a printf. Reviewed by: cognet (who prevented me from donning a point hat)	2013-03-23 17:17:06 +00:00
Will Andrews	ef04b888d2	Be more explicit about what each bio_cmd & bio_flags value means. Reviewed by: ken (mentor)	2013-03-23 16:55:07 +00:00
Will Andrews	58567a1b4e	ZFS: Fix a panic while unmounting a busy filesystem. This particular scenario was easily reproduced using a NFS export. When the first 'zfs unmount' occurred, it returned EBUSY via this path, while vflush() had flushed references on the filesystem's root vnode, which in turn caused its v_interlock to be destroyed. The next time 'zfs unmount' was called, vflush() tried to obtain this lock, which caused this panic. Since vflush() on FreeBSD is a definitive call, there is no need to check vfsp->vfs_count after it completes. Simply #ifdef sun this check. Submitted by: avg Reviewed by: avg Approved by: ken (mentor) MFC after: 1 month	2013-03-23 16:34:56 +00:00
Will Andrews	fdbc71742b	Extend taskqueue(9) to enable per-taskqueue callbacks. The scope of these callbacks is primarily to support actions that affect the taskqueue's thread environments. They are entirely optional, and consequently are introduced as a new API: taskqueue_set_callback(). This interface allows the caller to specify that a taskqueue requires a callback and optional context pointer for a given callback type. The callback types included in this commit can be used to register a constructor and destructor for thread-local storage using osd(9). This allows a particular taskqueue to define that its threads require a specific type of TLS, without the need for a specially-orchestrated task-based mechanism for startup and shutdown in order to accomplish it. Two callback types are supported at this point: - TASKQUEUE_CALLBACK_TYPE_INIT, called by every thread when it starts, prior to processing any tasks. - TASKQUEUE_CALLBACK_TYPE_SHUTDOWN, called by every thread when it exits, after it has processed its last task but before the taskqueue is reclaimed. While I'm here: - Add two new macros, TQ_ASSERT_LOCKED and TQ_ASSERT_UNLOCKED, and use them in appropriate locations. - Fix taskqueue.9 to mention taskqueue_start_threads(), which is a required interface for all consumers of taskqueue(9). Reviewed by: kib (all), eadler (taskqueue.9), brd (taskqueue.9) Approved by: ken (mentor) Sponsored by: Spectra Logic MFC after: 1 month	2013-03-23 15:11:53 +00:00
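The per-thread callback idea in the taskqueue(9) commit above can be illustrated with a small user-space model. This is only a sketch in Python, not the kernel API: the `TaskQueue` class, its method names, and the `INIT`/`SHUTDOWN` constants are hypothetical stand-ins for `taskqueue_set_callback()`, `TASKQUEUE_CALLBACK_TYPE_INIT`, and `TASKQUEUE_CALLBACK_TYPE_SHUTDOWN`, and `threading.local()` plays the role that osd(9) plays in the commit message.

```python
import queue
import threading

INIT = "init"          # hypothetical stand-in for TASKQUEUE_CALLBACK_TYPE_INIT
SHUTDOWN = "shutdown"  # hypothetical stand-in for TASKQUEUE_CALLBACK_TYPE_SHUTDOWN


class TaskQueue:
    """Toy user-space model of a task queue with optional per-thread callbacks."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._callbacks = {}
        self._threads = []

    def set_callback(self, cb_type, callback, context=None):
        # Callbacks are optional; a queue without them runs tasks as before.
        self._callbacks[cb_type] = (callback, context)

    def _run_callback(self, cb_type):
        entry = self._callbacks.get(cb_type)
        if entry is not None:
            callback, context = entry
            callback(context)

    def _worker(self):
        self._run_callback(INIT)        # runs before this thread's first task
        while True:
            task = self._tasks.get()
            if task is None:            # sentinel: no more work for this thread
                break
            task()
        self._run_callback(SHUTDOWN)    # runs after this thread's last task

    def start_threads(self, count):
        for _ in range(count):
            thread = threading.Thread(target=self._worker)
            thread.start()
            self._threads.append(thread)

    def enqueue(self, task):
        self._tasks.put(task)

    def drain_and_stop(self):
        # One sentinel per thread, queued behind all real tasks.
        for _ in self._threads:
            self._tasks.put(None)
        for thread in self._threads:
            thread.join()


tls = threading.local()
events = []
events_lock = threading.Lock()


def log(event):
    with events_lock:
        events.append(event)


def thread_init(context):
    tls.scratch = context["make"]()   # constructor for thread-local state
    log("init")


def thread_shutdown(context):
    del tls.scratch                   # destructor for thread-local state
    log("shutdown")


def task():
    tls.scratch.append(1)             # tasks can rely on the state existing
    log("task")


tq = TaskQueue()
tq.set_callback(INIT, thread_init, {"make": list})
tq.set_callback(SHUTDOWN, thread_shutdown)
tq.start_threads(2)
for _ in range(4):
    tq.enqueue(task)
tq.drain_and_stop()
```

The point of the design is visible in `task()`: it never has to check whether its thread-local state was set up, because the queue guarantees the init callback ran first on whichever thread picks it up.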
Andriy Gapon	ca84e042a3	post mountroot event after a real/final root is mounted not every time an intermediate root (including the first devfs) is mounted. This is also consistent with waking up via root_mount_complete. Reviewed by: jhb MFC after: 13 days	2013-03-23 08:59:34 +00:00
Andriy Gapon	aaf2546b67	fbt_getargdesc: correctly handle types for return probes MFC after: 6 days	2013-03-23 08:52:50 +00:00
Andriy Gapon	a47016e9a9	fbt_typoff_init: fix an off by one in determining required memory size This issue would be silent most of the time, but if the requested memory is a multiple of a page size, then accessing one element beyond the end would lead to a kernel page fault. Otherwise, the unlucky last type would just be inaccessible. Reported by: glebius Tested by: glebius MFC after: 6 days	2013-03-23 08:48:44 +00:00
Xin LI	843b298e62	Don't attempt to reference sc before testing whether it's NULL. Submitted by: Sascha Wildner Obtained from: DragonFly MFC after: 2 weeks	2013-03-22 22:46:19 +00:00
Kirk McKusick	baa12a84a7	The purpose of this change to the FFS layout policy is to reduce the running time for a full fsck. It also reduces the random access time for large files and speeds the traversal time for directory tree walks. The key idea is to reserve a small area in each cylinder group immediately following the inode blocks for the use of metadata, specifically indirect blocks and directory contents. The new policy is to preferentially place metadata in the metadata area and everything else in the blocks that follow the metadata area. The size of this area can be set when creating a filesystem using newfs(8) or changed in an existing filesystem using tunefs(8). Both utilities use the `-k held-for-metadata-blocks' option to specify the amount of space to be held for metadata blocks in each cylinder group. By default, newfs(8) sets this area to half of minfree (typically 4% of the data area). This work was inspired by a paper presented at Usenix's FAST '13: www.usenix.org/conference/fast13/ffsck-fast-file-system-checker Details of this implementation appear in the April 2013 issue of ;login: www.usenix.org/publications/login/april-2013-volume-38-number-2. A copy of the April 2013 ;login: paper can also be downloaded from: www.mckusick.com/publications/faster_fsck.pdf. Reviewed by: kib Tested by: Peter Holm MFC after: 4 weeks	2013-03-22 21:45:28 +00:00
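The placement policy described in the FFS commit above can be sketched as a toy allocator. This is a simplified model in Python, not the FFS code: the `CylinderGroup` class, its field names, and the block counts are hypothetical, and the fallback to the other area when the preferred one is full is an assumption read into the word "preferentially".

```python
from dataclasses import dataclass, field


@dataclass
class CylinderGroup:
    inode_blocks: int      # blocks taken by inodes at the start of the group
    metadata_blocks: int   # size of the area held for metadata (the -k value)
    total_blocks: int
    used: set = field(default_factory=set)

    def _first_free(self, start, end):
        for blk in range(start, end):
            if blk not in self.used:
                return blk
        return None

    def alloc(self, is_metadata):
        # The metadata area immediately follows the inode blocks.
        meta_start = self.inode_blocks
        data_start = meta_start + self.metadata_blocks
        meta_area = (meta_start, data_start)
        data_area = (data_start, self.total_blocks)
        preferred, fallback = (
            (meta_area, data_area) if is_metadata else (data_area, meta_area)
        )
        blk = self._first_free(*preferred)
        if blk is None:
            blk = self._first_free(*fallback)  # assumed spill-over behaviour
        if blk is None:
            raise RuntimeError("cylinder group is full")
        self.used.add(blk)
        return blk


cg = CylinderGroup(inode_blocks=4, metadata_blocks=2, total_blocks=12)
indirect = cg.alloc(is_metadata=True)    # first block of the metadata area
file_data = cg.alloc(is_metadata=False)  # first block after the metadata area
dir_block = cg.alloc(is_metadata=True)   # second (last) metadata-area block
spill = cg.alloc(is_metadata=True)       # metadata area full, spills into data
```

Keeping indirect blocks and directory contents next to the inodes is what lets a full fsck read most of what it needs from one contiguous region per cylinder group.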
Gleb Smirnoff	209dddb90e	Remove __FreeBSD_version ifdefs.	2013-03-22 20:44:16 +00:00
Pawel Jakub Dawidek	051a23d4e8	- Constify local path variable for chflagsat(). - Use correct format characters (%lx) for u_long. This fixes the build broken in r248599.	2013-03-22 07:40:34 +00:00
Kevin Lo	b3dcd51dde	Clean up some unused leftover code. Pointed out by: ae	2013-03-22 01:45:54 +00:00
Kevin Lo	dda95c6e59	Remove unused global variables. Reviewed by: ae, glebius	2013-03-22 01:40:17 +00:00
Steven Hartland	def84b9736	Fix for building libzpool under i386. Reviewed by: pjd (mentor) Approved by: pjd (mentor) MFC after: 2 weeks	2013-03-21 23:06:11 +00:00
Pawel Jakub Dawidek	5d46382415	Regenerate after r248599. Sponsored by: The FreeBSD Foundation	2013-03-21 23:02:19 +00:00
Pawel Jakub Dawidek	e948704e4b	Implement chflagsat(2) system call, similar to fchmodat(2), but operates on file flags. Reviewed by: kib, jilles Sponsored by: The FreeBSD Foundation	2013-03-21 22:59:01 +00:00
Pawel Jakub Dawidek	14cd1ffdf8	Regenerate after r248597. Sponsored by: The FreeBSD Foundation	2013-03-21 22:47:03 +00:00
Pawel Jakub Dawidek	b4b2596b97	- Make 'flags' argument to chflags(2), fchflags(2) and lchflags(2) of type u_long. Before this change it was of type int for syscalls, but prototypes in sys/stat.h and documentation for chflags(2) and fchflags(2) (but not for lchflags(2)) stated that it was u_long. Now some related functions use u_long type for flags (strtofflags(3), fflagstostr(3)). - Make path argument of type 'const char *' for consistency. Discussed on: arch Sponsored by: The FreeBSD Foundation	2013-03-21 22:44:33 +00:00
Konstantin Belousov	e808788c05	Correct the page count when excess length is trimmed from the bio. Reported and tested by: Ivan Klymenko <fidaj@ukr.net>	2013-03-21 22:36:43 +00:00
Jilles Tjoelker	46f10cc265	Allow O_CLOEXEC in posix_openpt() flags. PR: kern/162374 Reviewed by: ed	2013-03-21 21:39:15 +00:00
Attilio Rao	d52d7aa871	Fix a bug in UMTX_PROFILING: UMTX_PROFILING should really analyze the distribution of locks as they index entries in the umtxq_chains hash-table. However, the current implementation increments and decrements the length counters on every thread insert/removal, which measures userland contention rather than the hash distribution. Fix this by adjusting the length counters only at the points where it is really needed. Please note that this bug led us to question the quality of the umtx hash table distribution in the past. To date, with all the benchmarks I could try, I was not able to reproduce any issue with the hash distribution on umtx. Sponsored by: EMC / Isilon storage division Reviewed by: jeff, davide MFC after: 2 weeks	2013-03-21 19:58:25 +00:00
Alexander Motin	359b47db97	Minimal timer period of 100us introduced in r244758 is overkill. While the original 2us is indeed not enough, 3us works quite well in my tests. To be safer, set the minimal period to 5us, and to be even safer, replicate here the HPET mechanism of rereading the counter after programming the comparator. This change allows handling 30K short nanosleep() calls per second on Raspberry Pi instead of just 8K before. Discussed with: gonzo	2013-03-21 15:42:41 +00:00
John Baldwin	d071a6fa33	Another NFS SIGSTOP related fix: Ignore thread suspend requests due to SIGSTOP if stop signals are currently deferred. This can occur if a process is stopped via SIGSTOP while a thread is running or runnable but before it has set TDF_SBDRY. Tested by: pho Reviewed by: kib MFC after: 1 week	2013-03-21 14:06:27 +00:00
Konstantin Belousov	c46262f810	Fix twa(4) after the r246713. The driver copies data around to satisfy some alignment restrictions. Do not set TW_OSLI_REQ_FLAGS_CCB flag for mapped data, pass the csio->data_ptr in the req->data. Do not put the ccb pointer into req->data ever, ccb is stored in req->orig_req already. Submitted by: Shuichi KITAGUCHI <ki@hh.iij4u.or.jp> PR: kern/177020	2013-03-21 13:06:28 +00:00
Konstantin Belousov	4d569af96c	Initialize the variable to avoid (false) compiler warning about use of an uninitialized local. Reported by: Ivan Klymenko <fidaj@ukr.net> MFC after: 2 weeks	2013-03-21 12:59:24 +00:00
Steven Hartland	2b114ad2a4	Add missing descriptions for ZFS sysctls Reviewed by: pjd (mentor) Approved by: pjd (mentor) MFC after: 2 weeks	2013-03-21 11:25:21 +00:00
Steven Hartland	adea827b21	Optimisation of TRIM processing. Previously TRIM processing was very bursty. This was made worse by the fact that TRIM requests on SSDs are typically much slower than reads or writes. This often resulted in stalls while large numbers of TRIMs were processed. In addition, due to the way the TRIM thread was only woken by writes, deletes could stall in the queue for extensive periods of time. This patch adds a number of controls to how often the TRIM thread for each SPA processes its outstanding delete requests. vfs.zfs.trim.timeout: Delay TRIMs by up to this many seconds vfs.zfs.trim.txg_delay: Delay TRIMs by up to this many TXGs (reduced to 32) vfs.zfs.vdev.trim_max_bytes: Maximum pending TRIM bytes for a vdev vfs.zfs.vdev.trim_max_pending: Maximum pending TRIM segments for a vdev vfs.zfs.trim.max_interval: Maximum interval between TRIM queue processing (seconds) Given the most common TRIM implementation is ATA TRIM the current defaults are targeted at that. Reviewed by: pjd (mentor) Approved by: pjd (mentor) MFC after: 2 weeks	2013-03-21 11:02:08 +00:00
Steven Hartland	6ad46cec23	Names the ZFS TRIM thread Reviewed by: pjd (mentor) Approved by: pjd (mentor) MFC after: 2 weeks	2013-03-21 10:41:30 +00:00
Steven Hartland	89e5b43079	TRIM cache devices based on time instead of TXGs. Currently, the trim module uses the same algorithm for data and cache devices when deciding to issue TRIM requests, based on how far in the past the TXG is. Unfortunately, this is not ideal for cache devices, because the L2ARC doesn't use the concept of TXGs at all. In fact, when using a pool for reading only, the L2ARC is written but the TXG counter doesn't increase, and so no new TRIM requests are issued to the cache device. This patch fixes the issue by using time instead of the TXG number as the criterion for trimming on cache devices. The basic delay principle stays the same, but parameters are expressed in seconds instead of TXGs. The new parameters are named trim_l2arc_limit and trim_l2arc_batch, and both default to 30 seconds. Reviewed by: pjd (mentor) Approved by: pjd (mentor) Obtained from: `17122c31ac` MFC after: 2 weeks	2013-03-21 10:29:05 +00:00
Steven Hartland	78ad0c1c80	Improve TXG handling in the TRIM module. This patch adds some improvements to the way the trim module considers TXGs: - Free ZIOs are registered with the TXG from the ZIO itself, not the current SPA syncing TXG (which may be out of date); - L2ARC are registered with a zero TXG number, as L2ARC has no concept of TXGs; - The TXG limit for issuing TRIMs is now computed from the last synced TXG, not the currently syncing TXG. Indeed, under extremely unlikely race conditions, there is a risk we could trim blocks which have been freed in a TXG that has not finished syncing, resulting in potential data corruption in case of a crash. Reviewed by: pjd (mentor) Approved by: pjd (mentor) Obtained from: `5b46ad40d9` MFC after: 2 weeks	2013-03-21 10:16:10 +00:00
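The TXG rules in the commit above can be modelled in a few lines. This is a minimal sketch in Python, not the ZFS trim map: the `TrimMap` class, its method names, the `txg_delay` parameter, and the exact `<=` comparison are assumptions made for illustration.

```python
class TrimMap:
    """Toy model of delaying TRIMs until their TXG is safely in the past."""

    L2ARC_TXG = 0  # L2ARC frees carry no TXG, so they are registered with zero

    def __init__(self, txg_delay):
        self.txg_delay = txg_delay
        self.pending = []  # list of (txg, offset, size)

    def free(self, zio_txg, offset, size):
        # Register the free with the TXG carried by the ZIO itself,
        # not with whatever TXG the pool happens to be syncing.
        self.pending.append((zio_txg, offset, size))

    def free_l2arc(self, offset, size):
        self.pending.append((self.L2ARC_TXG, offset, size))

    def commit(self, last_synced_txg):
        # The limit is derived from the last *synced* TXG, so a free that
        # belongs to a TXG still being written out is never trimmed early.
        limit = last_synced_txg - self.txg_delay
        ready = [seg for seg in self.pending if seg[0] <= limit]
        self.pending = [seg for seg in self.pending if seg[0] > limit]
        return ready


tm = TrimMap(txg_delay=2)
tm.free(zio_txg=10, offset=0, size=4096)
tm.free(zio_txg=12, offset=8192, size=4096)
tm.free_l2arc(offset=16384, size=4096)
ready = tm.commit(last_synced_txg=12)
```

With a delay of 2 and TXG 12 as the last synced, only the segment freed in TXG 10 and the L2ARC segment are released; the segment freed in TXG 12 stays queued for a later pass.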
Steven Hartland	e07e3a3792	Don't register repair writes in the trim map. The trim map inflight writes tree assumes non-conflicting writes, i.e. that there will never be two simultaneous write I/Os to the same range on the same vdev. This seemed like a sane assumption; however, in actual testing, it appears that repair I/Os can very well conflict with "normal" writes. I'm not quite sure if these conflicting writes are supposed to happen or not, but in the meantime, let's ignore repair writes for now. This should be safe considering that, by definition, we never repair blocks that are freed. Reviewed by: pjd (mentor) Approved by: pjd (mentor) Obtained from: Source: `6a3cebaf7c`	2013-03-21 10:02:32 +00:00
Steven Hartland	e05aad2d33	Add TRIM support for L2ARC. This adds TRIM support to cache vdevs. When ARC buffers are removed from the L2ARC in arc_hdr_destroy(), arc_release() or l2arc_evict(), the size previously occupied by the buffer gets scheduled for TRIMming. As always, actual TRIMs are only issued to the L2ARC after txg_trim_limit. Reviewed by: pjd (mentor) Approved by: pjd (mentor) Obtained from: `31aae37399` MFC after: 2 weeks	2013-03-21 09:34:41 +00:00
Martin Matuska	05f49d92ef	Merge libzfs_core branch: includes MFV 238590, 238592, 247580 MFV 238590, 238592: In the first zfs ioctl restructuring phase, the libzfs_core library was introduced. It is a new thin library that wraps around kernel ioctl's. The idea is to provide a forward-compatible way of dealing with new features. Arguments are passed in nvlists and not random zfs_cmd fields, new-style ioctls are logged to pool history using a new method of history logging. http://blog.delphix.com/matt/2012/01/17/the-future-of-libzfs/ MFV 247580 [1]: To address issues of several deadlocks and race conditions the locking code around dsl_dataset was rewritten and the interface to synctasks was changed. User-Visible Changes: "zfs snapshot" can create more arbitrary snapshots at once (atomically) "zfs destroy" destroys multiple snapshots at once "zfs recv" has improved performance Backward Compatibility: I have extended the compatibility layer to support full backward compatibility by remapping or rewriting the responsible ioctl arguments. Old utilities are fully supported by the new kernel module. Forward Compatibility: New utilities work with old kernels with the following restrictions: - creating, destroying, holding and releasing of multiple snapshots at once is not supported, this includes recursive (-r) commands Illumos ZFS issues: 2882 implement libzfs_core 2900 "zfs snapshot" should be able to create multiple, arbitrary snapshots at once 3464 zfs synctask code needs restructuring References: https://www.illumos.org/issues/2882 https://www.illumos.org/issues/2900 https://www.illumos.org/issues/3464 [1] MFC after: 1 month Sponsored by: Hybrid Logic Inc. [1]	2013-03-21 08:38:03 +00:00
Gleb Smirnoff	5aedfa32a4	Add NGM_NAT_LIBALIAS_INFO command, that reports internal stats of libalias instance. To be used in the mpd5 daemon. Submitted by: Dmitry Luhtionov <dmitryluhtionov gmail.com>	2013-03-21 08:36:15 +00:00
Konstantin Belousov	7db07e1c85	Only size and create the bio_transient_map when unmapped buffers are enabled. Now, disabling the unmapped buffers should result in the kernel memory map identical to pre-r248550. Sponsored by: The FreeBSD Foundation	2013-03-21 07:28:15 +00:00
Konstantin Belousov	6c83fce371	Assert that transient mapping of the bio is only done when unmapped buffers are allowed. Sponsored by: The FreeBSD Foundation	2013-03-21 07:26:33 +00:00
Konstantin Belousov	7157d8f7ab	Do not call vnode_pager_setsize() while a NFS node mutex is locked. vnode_pager_setsize() might sleep waiting for the page after EOF be unbusied. Call vnode_pager_setsize() both for the regular and directory vnodes. Reported by: mich Reviewed by: rmacklem Discussed with: avg, jhb MFC after: 2 weeks	2013-03-21 07:25:08 +00:00
Hans Petter Selasky	3232aae327	Add new USB ID. PR: usb/177173 MFC after: 1 week	2013-03-21 07:04:17 +00:00
Konstantin Belousov	e3269b5096	In bufwrite(), a dirty buffer is moved to the clean queue before the bufobj counter of the writes in progress is incremented. Other thread inspecting the bufobj would consider it clean. For the regular vnodes, the vnode lock is typically held both by the thread performing the bufwrite() and an other thread doing syncing, which prevents the situation. On the other hand, writes to the VCHR vnodes are done without holding vnode lock. Increment the write ref counter for the buffer object before calling bundirty(). Sponsored by: The FreeBSD Foundation Tested by: pho MFC after: 2 weeks	2013-03-20 21:08:00 +00:00
Konstantin Belousov	8d6884ce9c	When the journaled FFS volume is suspended due to the journal space becoming too low, the softdep flush thread processes the workitems, which frees the space in journal, and then unsuspends the fs. The softdep_flush() and other workitem processing functions busy the filesystem before iterating over the worklist, to prevent the parallel unmount from freeing the mount data. The vfs_busy() is called with MBF_NOWAIT flag. Now, if the unmount is already started and the filesystem is suspended due to low journal space, the journal is never flushed and filesystem is never unsuspended, because vfs_busy(MBF_NOWAIT) call cannot succeed for the unmounting fs, and softdep_flush() does not process the workitems. Unmount needs to write metadata, where it hangs in the "suspfs" state. Move the vn_start_write() call in the dounmount() before setting the MNTK_UNMOUNT flag. This practically ensures that softdep_flush() processed the pending journal writes by making dounmount() wait for the lift of the suspension. Sponsored by: The FreeBSD Foundation Reported and tested by: pho MFC after: 2 weeks	2013-03-20 21:07:49 +00:00
Kirk McKusick	3289d5877a	When renaming a directory from one parent directory to another, we need to call ufs_checkpath() to walk from our new location to the root of the filesystem to ensure that we do not encounter ourselves along the way. Until now, we accomplished this by reading the ".." entries of each directory in our path until we reached the root (or encountered an error). This change tries to avoid the I/O of reading the ".." entries by first looking them up in the name cache and only doing the I/O when the name cache lookup fails. Reviewed by: kib Tested by: Peter Holm MFC after: 4 weeks	2013-03-20 17:57:00 +00:00
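The walk described in the rename commit above can be sketched as follows. This is a user-space model in Python, not `ufs_checkpath()` itself: the `Tree` class, the `check_path` function, the directory numbers, and the choice to populate the cache on a miss are all assumptions made for illustration.

```python
class Tree:
    """Toy directory tree with a name cache in front of on-disk ".." entries."""

    def __init__(self, parents):
        self.on_disk_parent = dict(parents)  # dir -> parent, i.e. the ".." entry
        self.name_cache = {}                 # cached ".." lookups
        self.disk_reads = 0

    def parent_of(self, directory):
        cached = self.name_cache.get(directory)
        if cached is not None:
            return cached                    # cache hit: no I/O needed
        self.disk_reads += 1                 # cache miss: read ".." from disk
        parent = self.on_disk_parent[directory]
        self.name_cache[directory] = parent  # assumed: remember the result
        return parent


def check_path(tree, source, new_parent, root):
    """Return True if moving `source` under `new_parent` creates no loop."""
    current = new_parent
    while True:
        if current == source:
            return False                     # we met ourselves on the way up
        if current == root:
            return True                      # reached the root safely
        current = tree.parent_of(current)


ROOT = 2  # hypothetical root directory number for this example
# Layout: ROOT -> 10 -> 11 -> 12, and ROOT -> 20
tree = Tree({10: ROOT, 11: 10, 12: 11, 20: ROOT})

loop = check_path(tree, source=10, new_parent=12, root=ROOT)
reads_after_first = tree.disk_reads
ok = check_path(tree, source=20, new_parent=12, root=ROOT)
reads_after_second = tree.disk_reads
```

The first check walks 12, 11, 10 and rejects the move after two reads; the second walks the same chain from the cache and only reads the one `..` entry it has not seen yet.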
Aleksandr Rybalko	a2c472e741	Integrate Efika MX project back to home. Sponsored by: The FreeBSD Foundation	2013-03-20 15:39:27 +00:00
Hans Petter Selasky	76be9c89ba	Fix spelling.	2013-03-20 11:51:26 +00:00

92497 Commits