freebsd-skq

Author	SHA1	Message	Date
mdf	a3d624db5a	Regen. MFC after: 1 week	2013-04-02 05:30:52 +00:00
mdf	da578c6492	Fix return type of extattr_set_* and fix rmextattr(8) utility. extattr_set_{fd,file,link} is logically a write(2)-like operation and should return ssize_t, just like extattr_get_. Also, the user-space utility was using an int for the return value of extattr_get_ and extattr_list_*, both of which return an ssize_t. MFC after: 1 week	2013-04-02 05:30:41 +00:00
kib	060b76c803	Do not call the VOP_LOOKUP() for the doomed directory vnode. The vnode could be reclaimed while lock upgrade was performed. Sponsored by: The FreeBSD Foundation Reported and tested by: pho Diagnosed and reviewed by: rmacklem MFC after: 1 week	2013-04-01 09:59:38 +00:00
jilles	9d8a3c5c3b	Rename do_pipe() to kern_pipe2() and declare it properly.	2013-03-31 17:42:54 +00:00
mdf	4c77a4b020	Use a shared lock for VOP_GETEXTATTR, as it is a read-like operation. MFC after: 1 week	2013-03-30 15:09:04 +00:00
jimharris	f59af79144	Add bus_dmamap_load_bio for non-CAM disk drivers that wish to enable unmapped I/O. Sponsored by: Intel Reviewed by: kib	2013-03-29 16:26:25 +00:00
jimharris	5febbe1181	Add CTR5() to bus_dmamap_load_ccb, similar to other bus_dmamap_load_* functions. Sponsored by: Intel	2013-03-29 16:00:16 +00:00
jimharris	60c7cceb4c	Do not add 1 to nsegs before passing to CTR5(), since nsegs has already been incremented before these calls. Sponsored by: Intel	2013-03-29 15:54:12 +00:00
jimharris	7e64e1827b	Pass correct parameter to CTR5() in bus_dmamap_load_uio. Sponsored by: Intel	2013-03-29 15:51:45 +00:00
glebius	11f04943de	Fix bug in m_split() in a case when split len matches len of the first mbuf, and the first mbuf is M_PKTHDR. PR: kern/176144 Submitted by: Jacques Fourie <jacques.fourie gmail.com>	2013-03-29 14:10:40 +00:00
glebius	06ecb1b7ca	Once ng_ksocket(4) is fixed, re-apply r194662. See this revision for longer description. Discussed with: andre, rwatson Sponsored by: Nginx, Inc.	2013-03-29 14:06:04 +00:00
glebius	1bccb6e916	When soreceive_generic() hands off an mbuf from buffer, clear its pointer to next record, since next record belongs to the buffer, and shouldn't be leaked. The ng_ksocket(4) used to clear this pointer itself, but the correct place is here. Sponsored by: Nginx, Inc	2013-03-29 13:57:55 +00:00
scottl	84ae5b84bb	Several fixes and improvements to sendfile() 1. If we wanted to send exactly as many bytes as the socket buffer is sized for, the inner loop of kern_sendfile() would see that the socket is full before seeing that it had no more bytes left to send. This would cause it to return EAGAIN to the caller instead of success. Fix by changing the order that these conditions are tested. 2. Simplify the calculation for the bytes to send in each iteration of the inner loop of kern_sendfile() 3. Fix some calls with bogus arguments to sf_buf_ext(). These would only trigger on mbuf allocation failure, but would be hilariously bad if they did trigger. Submitted by: gibbs(3), andre(2) Reviewed by: emax, andre Obtained from: Netflix MFC after: 1 week	2013-03-28 14:14:28 +00:00
jimharris	6ed2dc4d7c	deferal -> deferral	2013-03-27 23:07:43 +00:00
kib	c45e5da903	Fix a race with the vnode reclamation in the aio_qphysio(). Obtain the thread reference on the vp->v_rdev and use the returned struct cdev dev instead of using vp->v_rdev. Call dev_strategy_csw() instead of dev_strategy(), since we now own the reference. Since the csw was already calculated, test d_flags to avoid mapping the buffer if the driver supports unmapped requests [*]. Suggested by: kan [*] Reviewed by: kan (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2013-03-27 11:47:52 +00:00
kib	448e7c1290	Add dev_strategy_csw() function, which is similar to dev_strategy() but assumes that a thread reference was already obtained on the passed device. Use the function from physio(), to avoid two extra dev_mtx lock and unlock. Note that physio() is always used as the cdevsw method, or is called from a cdevsw method, and the caller already owns the reference. dev_strategy() is left to keep KPI intact, but now it is implemented as a wrapper around dev_strategy_csw(). Do some style cleanup in physio(). Requested and reviewed by: kan (previous version) Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2013-03-27 11:34:27 +00:00
kib	df3795022f	On i386, double the default size of the bio transient map. With the maxbcache size fixed, the auto-tuned transient map is too small for real-world load on i386. Tested by: David Wolfskill Sponsored by: The FreeBSD Foundation	2013-03-27 10:56:15 +00:00
kan	49a21b7c2e	Do not pass unmapped buffers to drivers that cannot handle them In physio, check if device can handle unmapped IO and pass an appropriately mapped buffer to the driver strategy routine. The only driver in the tree that can handle unmapped buffers is one exposed by GEOM, so mark it as such with the new flag in the driver cdevsw structure. This fixes insta-panics on hosts, running dconschat, as /dev/fwmem is an example of the driver that makes use of physio routine, but bypasses the g_down thread, where the buffer gets mapped normally. Discussed with: kib (earlier version)	2013-03-26 01:17:06 +00:00
davide	f736d35281	Cache the callout precision argument as part of the information required for migrating callouts to a new CPU. This value is passed to callout_cc_add() in order to properly update the precision field in case of rescheduling/migration. Reviewed by: mav	2013-03-25 09:43:50 +00:00
will	5d3a27c743	Extend taskqueue(9) to enable per-taskqueue callbacks. The scope of these callbacks is primarily to support actions that affect the taskqueue's thread environments. They are entirely optional, and consequently are introduced as a new API: taskqueue_set_callback(). This interface allows the caller to specify that a taskqueue requires a callback and optional context pointer for a given callback type. The callback types included in this commit can be used to register a constructor and destructor for thread-local storage using osd(9). This allows a particular taskqueue to define that its threads require a specific type of TLS, without the need for a specially-orchestrated task-based mechanism for startup and shutdown in order to accomplish it. Two callback types are supported at this point: - TASKQUEUE_CALLBACK_TYPE_INIT, called by every thread when it starts, prior to processing any tasks. - TASKQUEUE_CALLBACK_TYPE_SHUTDOWN, called by every thread when it exits, after it has processed its last task but before the taskqueue is reclaimed. While I'm here: - Add two new macros, TQ_ASSERT_LOCKED and TQ_ASSERT_UNLOCKED, and use them in appropriate locations. - Fix taskqueue.9 to mention taskqueue_start_threads(), which is a required interface for all consumers of taskqueue(9). Reviewed by: kib (all), eadler (taskqueue.9), brd (taskqueue.9) Approved by: ken (mentor) Sponsored by: Spectra Logic MFC after: 1 month	2013-03-23 15:11:53 +00:00
avg	1c06448efc	post mountroot event after a real/final root is mounted not every time an intermediate root (including the first devfs) is mounted. This is also consistent with waking up via root_mount_complete. Reviewed by: jhb MFC after: 13 days	2013-03-23 08:59:34 +00:00
pjd	91184d303f	- Constify local path variable for chflagsat(). - Use correct format characters (%lx) for u_long. This fixes the build broken in r248599.	2013-03-22 07:40:34 +00:00
pjd	f44b21d5e5	Regenerate after r248599. Sponsored by: The FreeBSD Foundation	2013-03-21 23:02:19 +00:00
pjd	635dbe90f2	Implement chflagsat(2) system call, similar to fchmodat(2), but operates on file flags. Reviewed by: kib, jilles Sponsored by: The FreeBSD Foundation	2013-03-21 22:59:01 +00:00
pjd	5fc1bac315	Regenerate after r248597. Sponsored by: The FreeBSD Foundation	2013-03-21 22:47:03 +00:00
pjd	2a3cf7f364	- Make 'flags' argument to chflags(2), fchflags(2) and lchflags(2) of type u_long. Before this change it was of type int for syscalls, but prototypes in sys/stat.h and documentation for chflags(2) and fchflags(2) (but not for lchflags(2)) stated that it was u_long. Now some related functions use u_long type for flags (strtofflags(3), fflagstostr(3)). - Make path argument of type 'const char *' for consistency. Discussed on: arch Sponsored by: The FreeBSD Foundation	2013-03-21 22:44:33 +00:00
jilles	bd09044d61	Allow O_CLOEXEC in posix_openpt() flags. PR: kern/162374 Reviewed by: ed	2013-03-21 21:39:15 +00:00
attilio	83c8ef372d	Fix a bug in UMTX_PROFILING: UMTX_PROFILING should really analyze the distribution of locks as they index entries in the umtxq_chains hash-table. However, the current implementation does add/dec the length counters for every thread insert/removal, measuring at all really userland contention and not the hash distribution. Fix this by correctly add/dec the length counters in the points where it is really needed. Please note that this bug brought us questioning in the past the quality of the umtx hash table distribution. To date with all the benchmarks I could try I was not able to reproduce any issue about the hash distribution on umtx. Sponsored by: EMC / Isilon storage division Reviewed by: jeff, davide MFC after: 2 weeks	2013-03-21 19:58:25 +00:00
jhb	1b6f4e466c	Another NFS SIGSTOP related fix: Ignore thread suspend requests due to SIGSTOP if stop signals are currently deferred. This can occur if a process is stopped via SIGSTOP while a thread is running or runnable but before it has set TDF_SBDRY. Tested by: pho Reviewed by: kib MFC after: 1 week	2013-03-21 14:06:27 +00:00
kib	9382f70781	Only size and create the bio_transient_map when unmapped buffers are enabled. Now, disabling the unmapped buffers should result in the kernel memory map identical to pre-r248550. Sponsored by: The FreeBSD Foundation	2013-03-21 07:28:15 +00:00
kib	fbd9d518d5	In bufwrite(), a dirty buffer is moved to the clean queue before the bufobj counter of the writes in progress is incremented. Another thread inspecting the bufobj would consider it clean. For the regular vnodes, the vnode lock is typically held both by the thread performing the bufwrite() and another thread doing syncing, which prevents the situation. On the other hand, writes to the VCHR vnodes are done without holding the vnode lock. Increment the write ref counter for the buffer object before calling bundirty(). Sponsored by: The FreeBSD Foundation Tested by: pho MFC after: 2 weeks	2013-03-20 21:08:00 +00:00
kib	6fbc70a0bc	When the journaled FFS volume is suspended due to the journal space becoming too low, the softdep flush thread processes the workitems, which frees the space in journal, and then unsuspends the fs. The softdep_flush() and other workitem processing functions busy the filesystem before iterating over the worklist, to prevent the parallel unmount from freeing the mount data. The vfs_busy() is called with MBF_NOWAIT flag. Now, if the unmount is already started and the filesystem is suspended due to low journal space, the journal is never flushed and filesystem is never unsuspended, because vfs_busy(MBF_NOWAIT) call cannot succeed for the unmounting fs, and softdep_flush() does not process the workitems. Unmount needs to write metadata, where it hangs in the "suspfs" state. Move the vn_start_write() call in the dounmount() before setting the MNTK_UNMOUNT flag. This practically ensures that softdep_flush() processed the pending journal writes by making dounmount() wait for the lift of the suspension. Sponsored by: The FreeBSD Foundation Reported and tested by: pho MFC after: 2 weeks	2013-03-20 21:07:49 +00:00
mckusick	9904f3d968	When renaming a directory from one parent directory to another, we need to call ufs_checkpath() to walk from our new location to the root of the filesystem to ensure that we do not encounter ourselves along the way. Until now, we accomplished this by reading the ".." entries of each directory in our path until we reached the root (or encountered an error). This change tries to avoid the I/O of reading the ".." entries by first looking them up in the name cache and only doing the I/O when the name cache lookup fails. Reviewed by: kib Tested by: Peter Holm MFC after: 4 weeks	2013-03-20 17:57:00 +00:00
jilles	c9066bd014	Implement SOCK_CLOEXEC, SOCK_NONBLOCK and MSG_CMSG_CLOEXEC. This change allows creating file descriptors with close-on-exec set in some situations. SOCK_CLOEXEC and SOCK_NONBLOCK can be OR'ed in socket() and socketpair()'s type parameter, and MSG_CMSG_CLOEXEC to recvmsg() makes file descriptors (SCM_RIGHTS) atomically close-on-exec. The numerical values for SOCK_CLOEXEC and SOCK_NONBLOCK are as in NetBSD. MSG_CMSG_CLOEXEC is the first free bit for MSG_. The SOCK_ flags are not passed to MAC because this may cause incorrect failures and can be done later via fcntl() anyway. On the other hand, audit is expected to cope with the new flags. For MSG_CMSG_CLOEXEC, unp_externalize() is extended to take a flags argument. Reviewed by: kib	2013-03-19 20:58:17 +00:00
kib	e5332ab955	Do not remap usermode pages into KVA for physio. Sponsored by: The FreeBSD Foundation Tested by: pho	2013-03-19 14:43:57 +00:00
kib	51030488d7	Add a helper function vfs_bio_bzero_buf() to zero the portion of the buffer, transparently handling mapped or unmapped buffers. Its intent is to replace the use of bzero(bp->b_data) in cases where the buffer might be unmapped, to avoid unneeded upgrades. Sponsored by: The FreeBSD Foundation Tested by: pho	2013-03-19 14:27:14 +00:00
kib	7c26a038f9	Implement the concept of the unmapped VMIO buffers, i.e. buffers which do not map the b_pages pages into buffer_map KVA. The use of the unmapped buffers eliminates the need to perform TLB shootdown for mapping on the buffer creation and reuse, greatly reducing the amount of IPIs for shootdown on big-SMP machines and eliminating up to 25-30% of the system time on i/o intensive workloads. The unmapped buffer should be explicitly requested by the GB_UNMAPPED flag by the consumer. For unmapped buffer, no KVA reservation is performed at all. The consumer might request unmapped buffer which does have a KVA reserve, to manually map it without recursing into buffer cache and blocking, with the GB_KVAALLOC flag. When the mapped buffer is requested and unmapped buffer already exists, the cache performs an upgrade, possibly reusing the KVA reservation. Unmapped buffer is translated into unmapped bio in g_vfs_strategy(). Unmapped bio carry a pointer to the vm_page_t array, offset and length instead of the data pointer. The provider which processes the bio should explicitly specify a readiness to accept unmapped bio, otherwise g_down geom thread performs the transient upgrade of the bio request by mapping the pages into the new bio_transient_map KVA submap. The bio_transient_map submap claims up to 10% of the buffer map, and the total buffer_map + bio_transient_map KVA usage stays the same. Still, it could be manually tuned by kern.bio_transient_maxcnt tunable, in the units of the transient mappings. Eventually, the bio_transient_map could be removed after all geom classes and drivers can accept unmapped i/o requests. Unmapped support can be turned off by the vfs.unmapped_buf_allowed tunable, disabling which makes the buffer (or cluster) creation requests to ignore GB_UNMAPPED and GB_KVAALLOC flags. Unmapped buffers are only enabled by default on the architectures where pmap_copy_page() was implemented and tested.
In the rework, filesystem metadata is no longer subject to the maxbufspace limit. Since the metadata buffers are always mapped, the buffers still have to fit into the buffer map, which provides a reasonable (but practically unreachable) upper bound on it. The non-metadata buffer allocations, both mapped and unmapped, are accounted against maxbufspace, as before. Effectively, this means that the maxbufspace is forced on mapped and unmapped buffers separately. The pre-patch bufspace limiting code did not work, because buffer_map fragmentation does not allow the limit to be reached. At Jeff Roberson's request, the getnewbuf() function was split into smaller single-purpose functions. Sponsored by: The FreeBSD Foundation Discussed with: jeff (previous version) Tested by: pho, scottl (previous version), jhb, bf MFC after: 2 weeks	2013-03-19 14:13:12 +00:00
jhb	8604015a2e	Tweak some comments.	2013-03-18 18:04:09 +00:00
jhb	8b099870ed	Partially revert r195702. Deferring stops is now implemented via a set of calls to toggle TDF_SBDRY rather than passing PBDRY to individual sleep calls. - Remove the stop_allowed parameters from cursig() and issignal(). issignal() checks TDF_SBDRY directly. - Remove the PBDRY and SLEEPQ_STOP_ON_BDRY flags.	2013-03-18 17:23:58 +00:00
glebius	338f6e587c	In m_align() add assertions that mbuf is virgin, similar to assertions in M_ALIGN(), MH_ALIGN, MEXT_ALIGN() macros.	2013-03-17 07:41:14 +00:00
pjd	acae942b05	Require CAP_SEEK if both O_APPEND and O_TRUNC flags are absent. In other words we don't require CAP_SEEK if either O_APPEND or O_TRUNC flag is given, because O_APPEND doesn't allow to overwrite existing data and O_TRUNC requires CAP_FTRUNCATE already. Sponsored by: The FreeBSD Foundation	2013-03-16 23:19:13 +00:00
pjd	a03e9d6f4c	Style: Whitespace fixes.	2013-03-16 22:37:30 +00:00
pjd	27049a86ae	Style: Remove redundant space.	2013-03-16 22:36:24 +00:00
glebius	9edd6e1174	- Replace compat macros with function calls. - Remove superfluous cleaning of m_len after allocating. Sponsored by: Nginx, Inc.	2013-03-16 08:57:36 +00:00
glebius	9d8d25cc83	Contrary to what the deleted comment said, the m_move_pkthdr() will not smash the M_EXT and data pointer, so it is safe to pass an mbuf with external storage produced by m_getcl() to m_move_pkthdr(). Reviewed by: andre Sponsored by: Nginx, Inc.	2013-03-16 08:55:21 +00:00
pjd	552a9fbc05	Sort syscalls properly.	2013-03-15 23:00:13 +00:00
kib	284e554a67	Separate the copyright lines and the informational block by a blank line. Requested by: joel MFC after: 2 weeks	2013-03-15 14:01:37 +00:00
kib	28990b8686	Add my copyright for the 2012 year work, in particular vn_io_fault() and f_offset locking. Add required Foundation notice for r248319. Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2013-03-15 12:57:30 +00:00
kib	9b5b993aea	Implement the helper function vn_io_fault_pgmove(), intended to use by the filesystem VOP_READ() and VOP_WRITE() implementations in the same way as vn_io_fault_uiomove() over the unmapped buffers. Helper provides the convenient wrapper over the pmap_copy_pages() for struct uio consumers, taking care of the TDP_UIOHELD situations. Sponsored by: The FreeBSD Foundation Tested by: pho MFC after: 2 weeks	2013-03-15 11:16:12 +00:00
glebius	a10c1a7c72	Use m_get() and m_getcl() instead of compat macros.	2013-03-15 10:21:18 +00:00
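
Commit c9066bd014 above says that SOCK_CLOEXEC and SOCK_NONBLOCK can be OR'ed into the type parameter of socket() and socketpair(). The following is a minimal userland sketch of that usage, not code from the commit. The helper names make_pair() and fd_is_cloexec_nonblock() are hypothetical and exist only for illustration; the sketch assumes a system whose headers define both SOCK_ flags.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the commit): create a UNIX-domain
 * socket pair with close-on-exec and non-blocking mode requested at
 * creation time, instead of by separate fcntl() calls afterwards.
 * Returns 0 on success, -1 on error (as socketpair() does).
 */
static int
make_pair(int sv[2])
{
	return (socketpair(AF_UNIX,
	    SOCK_STREAM | SOCK_CLOEXEC | SOCK_NONBLOCK, 0, sv));
}

/*
 * Hypothetical helper: returns 1 if fd has both FD_CLOEXEC and
 * O_NONBLOCK set, 0 if either is missing, -1 if fcntl() fails.
 */
static int
fd_is_cloexec_nonblock(int fd)
{
	int fdflags = fcntl(fd, F_GETFD);
	int flflags = fcntl(fd, F_GETFL);

	if (fdflags == -1 || flflags == -1)
		return (-1);
	return ((fdflags & FD_CLOEXEC) != 0 && (flflags & O_NONBLOCK) != 0);
}
```

A caller would invoke make_pair(sv) and could then confirm with fd_is_cloexec_nonblock(sv[0]) that both properties are already in place. Setting them at creation time avoids the window between socket creation and a later fcntl() call in which another thread could fork and exec.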

13172 Commits