freebsd-nq

Author	SHA1	Message	Date
Navdeep Parhar	dd181b2652	cxgbe/tom: Update the CLIP table on the chip when there are changes to the list of IPv6 addresses on the system. The table is used for TOE+IPv6 only.	2013-04-18 19:52:11 +00:00
Alexander Motin	b2c63698d4	Introduce kern.timecounter.smp_tsc_adjust tunable (disabled by default) and respective functionality, allowing to synchronize TSC on APs to match BSP's during boot. It may be unsafe in general case due to theoretical chance of later drift if CPUs are using different clock rate or source, but it allows to use TSC in some cases when difference caused by some initialization bug, while TSCs are known to increment synchronously. Reviewed by: jimharris, kib MFC after: 1 month	2013-04-18 17:07:04 +00:00
Rick Macklem	175b3f31d3	Both NFS clients can deadlock when using the "rdirplus" mount option. This can occur when an nfsiod thread that already holds a buffer lock attempts to acquire a vnode lock on an entry in the directory (a LOR) when another thread holding the vnode lock is waiting on an nfsiod thread. This patch avoids the deadlock by disabling readahead for this case, so the nfsiod threads never do readdirplus. Since readaheads for directories need the directory offset cookie from the previous read, they cannot normally happen in parallel. As such, testing by jhb@ and myself didn't find any performance degredation when this patch is applied. If there is a case where this results in a significant performance degradation, mounting without the "rdirplus" option can be done to re-enable readahead for directories. Reported and tested by: jhb Reviewed by: jhb MFC after: 2 weeks	2013-04-18 13:09:04 +00:00
Alexander Motin	ca11419237	Make siis(4) and mvs(4) send bus_get_dma_tag() requests to parent buses passing real bus' child pointers instead of grandchilds. Requested by: kib	2013-04-18 12:43:06 +00:00
Rui Paulo	5dfae12246	Move the previously added CPUID7 macros to CPUID_STDEXT.	2013-04-18 07:09:27 +00:00
Alan Cox	880659fe81	When calculating the number of reserved nodes, discount the pages that will be used to store the nodes. Sponsored by: EMC / Isilon Storage Division	2013-04-18 05:34:33 +00:00
Rui Paulo	ba5f77bf16	Add the most current CPUID7_* definitions.	2013-04-18 01:30:08 +00:00
Rui Paulo	068e8f74e4	Print RDSEED, ADX, and SMAP. Pointed out by: kib	2013-04-18 01:21:44 +00:00
Kenneth D. Merry	adb974068b	Move the NFS FHA (File Handle Affinity) code from sys/nfsserver to sys/nfs, since it is now shared by the two NFS servers. Suggested by: rmacklem Sponsored by: Spectra Logic MFC after: 2 weeks	2013-04-17 22:42:43 +00:00
Hiren Panchasara	76fe16bb78	Improving r249461 by providing a better way to handle the clang warning. PR: kern/177164 Reviewed by: jhb Approved by: sbruno (mentor)	2013-04-17 21:21:27 +00:00
Kenneth D. Merry	d96b98a360	Revamp the old NFS server's File Handle Affinity (FHA) code so that it will work with either the old or new server. The FHA code keeps a cache of currently active file handles for NFSv2 and v3 requests, so that read and write requests for the same file are directed to the same group of threads (reads) or thread (writes). It does not currently work for NFSv4 requests. They are more complex, and will take more work to support. This improves read-ahead performance, especially with ZFS, if the FHA tuning parameters are configured appropriately. Without the FHA code, concurrent reads that are part of a sequential read from a file will be directed to separate NFS threads. This has the effect of confusing the ZFS zfetch (prefetch) code and makes sequential reads significantly slower with clients like Linux that do a lot of prefetching. The FHA code has also been updated to direct write requests to nearby file offsets to the same thread in the same way it batches reads, and the FHA code will now also send writes to multiple threads when needed. This improves sequential write performance in ZFS, because writes to a file are now more ordered. Since NFS writes (generally less than 64K) are smaller than the typical ZFS record size (usually 128K), out of order NFS writes to the same block can trigger a read in ZFS. Sending them down the same thread increases the odds of their being in order. In order for multiple write threads per file in the FHA code to be useful, writes in the NFS server have been changed to use a LK_SHARED vnode lock, and upgrade that to LK_EXCLUSIVE if the filesystem doesn't allow multiple writers to a file at once. ZFS is currently the only filesystem that allows multiple writers to a file, because it has internal file range locking. This change does not affect the NFSv4 code. This improves random write performance to a single file in ZFS, since we can now have multiple writers inside ZFS at one time. I have changed the default tuning parameters to a 22 bit (4MB) window size (from 256K) and unlimited commands per thread as a result of my benchmarking with ZFS. The FHA code has been updated to allow configuring the tuning parameters from loader tunable variables in addition to sysctl variables. The read offset window calculation has been slightly modified as well. Instead of having separate bins, each file handle has a rolling window of bin_shift size. This minimizes glitches in throughput when shifting from one bin to another. sys/conf/files: Add nfs_fha_new.c and nfs_fha_old.c. Compile nfs_fha.c when either the old or the new NFS server is built. sys/fs/nfs/nfsport.h, sys/fs/nfs/nfs_commonport.c: Bring in changes from Rick Macklem to newnfs_realign that allow it to operate in blocking (M_WAITOK) or non-blocking (M_NOWAIT) mode. sys/fs/nfs/nfs_commonsubs.c, sys/fs/nfs/nfs_var.h: Bring in a change from Rick Macklem to allow telling nfsm_dissect() whether or not to wait for mallocs. sys/fs/nfs/nfsm_subs.h: Bring in changes from Rick Macklem to create a new nfsm_dissect_nonblock() inline function and NFSM_DISSECT_NONBLOCK() macro. sys/fs/nfs/nfs_commonkrpc.c, sys/fs/nfsclient/nfs_clkrpc.c: Add the malloc wait flag to a newnfs_realign() call. sys/fs/nfsserver/nfs_nfsdkrpc.c: Setup the new NFS server's RPC thread pool so that it will call the FHA code. Add the malloc flag argument to newnfs_realign(). Unstaticize newnfs_nfsv3_procid[] so that we can use it in the FHA code. sys/fs/nfsserver/nfs_nfsdsocket.c: In nfsrvd_dorpc(), add NFSPROC_WRITE to the list of RPC types that use the LK_SHARED lock type. sys/fs/nfsserver/nfs_nfsdport.c: In nfsd_fhtovp(), if we're starting a write, check to see whether the underlying filesystem supports shared writes. If not, upgrade the lock type from LK_SHARED to LK_EXCLUSIVE. sys/nfsserver/nfs_fha.c: Remove all code that is specific to the NFS server implementation. Anything that is server-specific is now accessed through a callback supplied by that server's FHA shim in the new softc. There are now separate sysctls and tunables for the FHA implementations for the old and new NFS servers. The new NFS server has its tunables under vfs.nfsd.fha, the old NFS server's tunables are under vfs.nfsrv.fha as before. In fha_extract_info(), use callouts for all server-specific code. Getting file handles and offsets is now done in the individual server's shim module. In fha_hash_entry_choose_thread(), change the way we decide whether two reads are in proximity to each other. Previously, the calculation was a simple shift operation to see whether the offsets were in the same power of 2 bucket. The issue was that there would be a bucket (and therefore thread) transition, even if the reads were in close proximity. When there is a thread transition, reads wind up going somewhat out of order, and ZFS gets confused. The new calculation simply tries to see whether the offsets are within 1 << bin_shift of each other. If they are, the reads will be sent to the same thread. The effect of this change is that for sequential reads, if the client doesn't exceed the max_reqs_per_nfsd parameter and the bin_shift is set to a reasonable value (22, or 4MB works well in my tests), the reads in any sequential stream will largely be confined to a single thread. Change fha_assign() so that it takes a softc argument. It is now called from the individual server's shim code, which will pass in the softc. Change fhe_stats_sysctl() so that it takes a softc parameter. It is now called from the individual server's shim code. Add the current offset to the list of things printed out about each active thread. Change the num_reads and num_writes counters in the fha_hash_entry structure to 32-bit values, and rename them num_rw and num_exclusive, respectively, to reflect their changed usage. Add an enable sysctl and tunable that allows the user to disable the FHA code (when vfs.XXX.fha.enable = 0). This is useful for before/after performance comparisons. nfs_fha.h: Move most structure definitions out of nfs_fha.c and into the header file, so that the individual server shims can see them. Change the default bin_shift to 22 (4MB) instead of 18 (256K). Allow unlimited commands per thread. sys/nfsserver/nfs_fha_old.c, sys/nfsserver/nfs_fha_old.h, sys/fs/nfsserver/nfs_fha_new.c, sys/fs/nfsserver/nfs_fha_new.h: Add shims for the old and new NFS servers to interface with the FHA code, and callbacks for the The shims contain all of the code and definitions that are specific to the NFS servers. They setup the server-specific callbacks and set the server name for the sysctl and loader tunable variables. sys/nfsserver/nfs_srvkrpc.c: Configure the RPC code to call fhaold_assign() instead of fha_assign(). sys/modules/nfsd/Makefile: Add nfs_fha.c and nfs_fha_new.c. sys/modules/nfsserver/Makefile: Add nfs_fha_old.c. Reviewed by: rmacklem Sponsored by: Spectra Logic MFC after: 2 weeks	2013-04-17 21:00:22 +00:00
Gleb Smirnoff	8f779cc541	On non-ACPI i386 mp_ncpus is initialized at SI_SUB_CPU, and this prevents us from creating UMA_ZONE_PCPU zones earlier. As bandaid shift initialization of counter(9) zone later. Reviewed by: kib Reported & tested by: Lytochkin Boris <lytboris gmail.com>	2013-04-17 18:43:33 +00:00
Adrian Chadd	d22e3b024e	Add the static kernel boot environment, needed to actually boot this thing. (Wasting 4k just as a temporary placeholder for a boot environment seems a bit ridiculous, but hey.) Tested: gxemul: $ gxemul -e malta -d i:/home/adrian/work/freebsd/svn/mfsroot-rspro.img -C 4Kc /tftpboot/kernel.MALTA	2013-04-17 18:26:01 +00:00
Gabor Kovesdan	a8b5c2a0aa	- Correct spelling in comments Submitted by: Christoph Mallon <christoph.mallon@gmx.de> (via private mail)	2013-04-17 11:56:11 +00:00
Gabor Kovesdan	84a17a97ce	- Correct mispellings of word and Submitted by: Christoph Mallon <christoph.mallon@gmx.de> (via private mail)	2013-04-17 11:48:46 +00:00
Gabor Kovesdan	b78540b1c7	- Correct mispellings of word resource Submitted by: Christoph Mallon <christoph.mallon@gmx.de>	2013-04-17 11:47:32 +00:00
Gabor Kovesdan	8fb3bbe770	- Corrrect mispellings of word useful Submitted by: Christoph Mallon <christoph.mallon@gmx.de> (via private mail)	2013-04-17 11:45:15 +00:00
Gabor Kovesdan	f0d0985ee9	- Correct mispellings of word miscellaneous Submitted by: Christoph Mallon <christoph.mallon@gmx.de> (via private mail)	2013-04-17 11:43:46 +00:00
Gabor Kovesdan	a2098fea6d	- Correct mispellings of the word necessary Submitted by: Christoph Mallon <christoph.mallon@gmx.de> (via private mail)	2013-04-17 11:42:40 +00:00
Gabor Kovesdan	ab3f6b347e	- Correct mispellings of the word occurrence Submitted by: Christoph Mallon <christoph.mallon@gmx.de> (via private mail)	2013-04-17 11:40:10 +00:00
Ivan Voras	7ceaf939d6	Link g_label_disk_ident when building geom_label as a module	2013-04-17 09:19:29 +00:00
Adrian Chadd	91046e9c5f	Setup needed tables for TPC on AR5416->AR9287 chips. * Add ah_ratesArray[] to the ar5416 HAL state - this stores the maximum values permissable per rate. * Since different chip EEPROM formats store this value in a different place, store the HT40 power detector increment value in the ar5416 HAL state. * Modify the target power setup code to store the maximum values in the ar5416 HAL state rather than using a local variable. * Add ar5416RateToRateTable() - to convert a hardware rate code to the ratesArray enum / index. * Add ar5416GetTxRatePower() - which goes through the gymnastics required to correctly calculate the target TX power: + Add the power detector increment for ht40; + Take the power offset into account for AR9280 and later; + Offset the TX power correctly when doing open-loop TX power control; + Enforce the per-rate maximum value allowable. Note - setting a TPC value of 0x0 in the TX descriptor on (at least) the AR9160 resulted in the TX power being very high indeed. This didn't happen on the AR9220. I'm guessing it's a chip bug that was fixed at some point. So for now, just assume the AR5416/AR5418 and AR9130 are also suspect and clamp the minimum value here at 1. Tested: * AR5416, AR9160, AR9220 hostap, verified using (2GHz) spectrum analyser * Looked at target TX power in TX descriptor (using athalq) as well as TX power on the spectrum analyser. TODO: * The TX descriptor code sets the target TX power to 0 for AR9285 chips. I'm not yet sure why. Disable this for TPC and ensure that the TPC TX power is set. * AR9280, AR9285, AR9227, AR9287 testing! * 5GHz testing! Quirks: * The per-packet TPC code is only exercised when the tpc sysctl is set to 1. (dev.ath.X.tpc=1.) This needs to be done before you bring the interface up. * When TPC is enabled, setting the TX power doesn't end up with a call through to the HAL to update the maximum TX power. So ensure that you set the TPC sysctl before you bring the interface up and configure a lower TX power or the hardware will be clamped by the lower TX power (at least until the next channel change.) Thanks to Qualcomm Atheros for all the hardware, and Sam Leffler for use of his spectrum analyser to verify the TX channel power.	2013-04-17 07:31:53 +00:00
Adrian Chadd	8b470f6f71	Use the TPC bank by default for AR9160. Tested: * AR9160, hostap, verified TX power using (2GHz) spectrum analyser TODO: * 5GHz verification!	2013-04-17 07:22:23 +00:00
Adrian Chadd	de00e5cb54	Update the rate series setup code to use the decisions already made in ath_tx_rate_fill_rcflags(). Include setting up the TX power cap in the rate scenario setup code being passed to the HAL. Other things: * add a tx power cap field in ath_rc. * Add a three-stream flag in ath_rc. * Delete the LDPC flag from ath_rc - it's not a per-rate flag, it's a global flag for the transmission.	2013-04-17 07:21:30 +00:00
Rui Paulo	d1dcd93145	Print more bits from the standard extended features CPUID which will be available in the Haswell architecture (c.f. Intel Document #319433-012A).	2013-04-17 06:51:17 +00:00
Neel Natu	5685b37212	Correct misleading bootverbose output: ahc_isa_probe -> ahc_isa_identify	2013-04-17 02:33:56 +00:00
Pedro F. Giffuni	03836978be	DTrace: Revert r249367 The following change from illumos brought caused DTrace to pause in an interactive environment: 3026 libdtrace should set LD_NOLAZYLOAD=1 to help the pid provider This was not detected during testing because it doesn't affect scripts. We shouldn't be changing the environment, especially since the LD_NOLAZYLOAD option doesn't apply to our (GNU) ld. Unfortunately the change from upstream was made in such a way that it is very difficult to separate this change from the others so, at least for now, it's better to just revert everything. Reference: https://www.illumos.org/issues/3026 Reported by: Navdeep Parhar and Mark Johnston	2013-04-17 02:20:17 +00:00
Ivan Voras	8e9405e8a7	Comment typo fix. Is aware of the importance of comments: dim	2013-04-16 22:42:40 +00:00
Warner Losh	11c601447d	r249408 and r249436 cause a NULL pointer dereference on the CUBIEBOARD since it doesn't set the kernel envrionment at all. Work around this by making sure kern_envp is non-NULL before dereferencing it.	2013-04-16 22:09:08 +00:00
Adrian Chadd	12087a0769	Use the new net80211 method to fetch the node TX power, rather than directly referencing ni->ni_txpower. This provides the hardware with a slightly more accurate idea of the maximum TX power to be using. This is part of a series to get per-packet TPC to work (better). Tested: * AR5416, hostap mode	2013-04-16 21:26:44 +00:00
Adrian Chadd	ebe15a7b25	Implement a utility function to return the current TX power cap for the given node. This takes into account the per-node cap, the ic cap and the per-channel regulatory caps. This is designed to replace references to ni_txpower in various net80211 drivers - ni_txpower doesn't necessarily reflect the actual cap for the given node (eg if the node has the default value of 50dBm (100) and the administrator has manually configured a lower TX power.)	2013-04-16 20:36:32 +00:00
John Baldwin	8916af883c	- Document that sem_wait() can fail with EINTR if it is interrupted by a signal. - Fix the old ksem implementation for POSIX semaphores to not restart sem_wait() or sem_timedwait() if interrupted by a signal. MFC after: 1 week	2013-04-16 20:26:31 +00:00
Adrian Chadd	5d4dedadb6	Use a per-RX-queue deferred list, rather than a single deferred list for both queues. Since ath_rx_pkt() does multi-mbuf frame recombining based on the RX queue, this needs to occur. Tested: * AR9380 (XB112), hostap mode	2013-04-16 20:21:02 +00:00
Ivan Voras	9a796b22f6	Fix the buffer-overflow-fixing fixes. Pointy-hat to: me, for not realizing snprintf() is available in kernel. Thanks to: jh, for bringing me the good news of snprintf(), Pawel Worach, for noting that the panic can be provoked in i386 and not in amd64	2013-04-16 19:58:24 +00:00
Xin LI	f2297451fe	Fix incomplete printf. PR: kern/177889 Submitted by: Sven-Thorsten Dietrich <sven vyatta com> MFC after: 1 week	2013-04-16 19:32:12 +00:00
Xin LI	c1031303f0	Don't leak lock when returning. PR: kern/177888 Submitted by: Sven-Thorsten Dietrich <sven vyatta com> MFC after: 1 week	2013-04-16 19:25:41 +00:00
Mikolaj Golub	f1fca82ed5	Add a new set of notes to a process core dump to store procstat data. The notes format is a header of sizeof(int), which stores the size of the corresponding data structure to provide some versioning, and data in the format as it is returned by a related sysctl call. The userland tools (procstat(1)) will be taught to extract this data, providing additional info for postmortem analysis. PR: kern/173723 Suggested by: jhb Discussed with: jhb, kib Reviewed by: jhb (initial version), kib MFC after: 1 month	2013-04-16 19:19:14 +00:00
Brooks Davis	b7b63db789	Partial MFP4 of 222836: Only look for FDT partitions if our potential parent is a DISK device. Excluding direct recursion on the flashmap geoms was insufficient because it did not prevent the underlying device from being retrieved if flashmap geoms were further partitioned. Reviewed by: imp Sponsored by: DARPA, AFRL	2013-04-16 17:47:13 +00:00
Tijl Coosemans	6c81895dab	Fix build after r249543.	2013-04-16 16:59:29 +00:00
Warner Losh	dd65664bbc	Point to regdef.h. May need to dig up references to the N32 standard that support this usage (which may be a bit rough, since different parts of the standard say mutually contradictory things).	2013-04-16 16:54:37 +00:00
Rick Macklem	64fa8df6e0	Allow the vnode to be unlocked for the weird case of LK_EXCLOTHER. LK_EXCLOTHER is only used to acquire a usecount on a vnode during NFSv4 recovery from an expired lease. Reported and tested by: pho MFC after: 2 weeks	2013-04-16 14:22:16 +00:00
Andrey V. Elsukov	4ff7c740fe	Fix accounting after the r249528, also add several another counters to the statistics.	2013-04-16 11:31:26 +00:00
Andrey V. Elsukov	eca4d72003	Use IP6S_M2MMAX macro.	2013-04-16 11:19:13 +00:00
Andrey V. Elsukov	43851aae9a	Replace hardcoded numbers.	2013-04-16 11:12:58 +00:00
Konstantin Belousov	44d95698ba	Some compilers issue a warning when wider integer is casted to narrow pointer. Supposedly shut down the warning by casting through uintptr_t. Reported by: ian	2013-04-16 07:11:52 +00:00
Andrey V. Elsukov	e7a87117d3	The source address selection algorithm tries to apply several rules for the set of IPv6 addresses. Now each attempt goes into IPv6 statistics, even if given rule did not won. Change this and take into account only those rules, that won. Also add accounting for cases, when algorithm fails to select an address.	2013-04-15 21:02:40 +00:00
Warner Losh	e4be3d3ddd	Fix N32/N64 register saving by ensuring that all registers resolve to unique values. There's some confusion about what the n32 assembler API really is (since on page 9 of the spec they say that t0-t3 don't exist, then turn around on page 22 and say that t4-t7 don't exist), and this doesn't touch that. NetBSD's version of this file follows the convention I used here, and is likely to be correct. This should fix gdb/ptrace.	2013-04-15 19:32:14 +00:00
Adrian Chadd	978c5ce568	Now that the register definitions are in -HEAD, enable this.	2013-04-15 17:59:06 +00:00
Adrian Chadd	a04110a3b6	Bring over some AR9271 register definitions from the QCA HAL. Obtained from: Qualcomm Atheros	2013-04-15 17:58:11 +00:00
George V. Neville-Neil	8f2ba63493	Point args[0] not at the thread that is ending but at the one that is starting. This is in line with practice in OpenSolaris. Note that this change is only in ULE and not in the 4BSD scheduler. Once this change settles in (MFC timeout has expired) we'll try it out on 4BSD as well. PR: 177706 Submitted by: Tiwei Bie MFC after: 1 month	2013-04-15 17:21:02 +00:00

1 2 3 4 5 ...

92903 Commits