freebsd-nq

Author	SHA1	Message	Date
Alexander Motin	c0722d20d3	Microoptimize time math. Since our event periods are always below one second, we can skip adding the integer parts by using bintime_addx() instead of bintime_add(). Profiling shows a 15% reduction in handleevents() time.	2012-08-03 09:08:20 +00:00
Alan Cox	369763e31a	Inline vm_page_aflags_clear() and vm_page_aflags_set(). Add comments stating that neither these functions nor the flags that they are used to manipulate are part of the KBI.	2012-08-03 01:48:15 +00:00
Xin LI	0f3fae6159	Correct a typo. Reported by: Sascha Wildner <swildner dragonflybsd org> Reviewed by: scottl MFC after: 3 days	2012-08-03 00:11:13 +00:00
Adrian Chadd	a6e829596d	Fix an issue that crept in with the previous descriptor tidyup. When forming aggregates, the last descriptor was now not being correctly setup - instead, the "setuplasttxdesc" call was being handed the first descriptor in the last subframe, rather than the last descriptor in the last subframe. This showed up as "bad series0 hwrate" messages, as the final descriptor just didn't have any of the rate control information squirreled away. Tested: * AR9280 STA -> 11n AP, iperf TCP	2012-08-02 20:14:45 +00:00
Jaakko Heinonen	8cb51643e4	Disallow sectorsize larger than MAXPHYS and mediasize smaller than sectorsize. PR: 169947 Submitted by: Filip Palian (original version) Reviewed by: kib	2012-08-02 15:05:34 +00:00
Gleb Smirnoff	ea53792942	Fix races between in_lltable_prefix_free(), lla_lookup(), llentry_free() and arptimer(): o Use callout_init_rw() for lle timeout, this allows us to safely disestablish them. - This allows us to simplify the arptimer() and make it race safe. o Consistently use ifp->if_afdata_lock to lock access to linked lists in the lle hashes. o Introduce new lle flag LLE_LINKED, which marks an entry that is attached to the hash. - Use LLE_LINKED to avoid double unlinking via consequent calls to llentry_free(). - Mark lle with LLE_DELETED via |= operation instead of =, so that other flags won't be lost. o Make LLE_ADDREF(), LLE_REMREF() and LLE_FREE_LOCKED() more consistent and provide more informative KASSERTs. The patch is a collaborative work of all submitters and myself. PR: kern/165863 Submitted by: Andrey Zonov <andrey zonov.org> Submitted by: Ryan Stone <rysto32 gmail.com> Submitted by: Eric van Gyzen <eric_van_gyzen dell.com>	2012-08-02 13:57:49 +00:00
Gleb Smirnoff	b1d86af706	The llentry_update() is used only by flowtable and the latter always passes NULL pointer to it. Thus, code can be simplified and function renamed to llentry_alloc() to match rtalloc().	2012-08-02 13:20:44 +00:00
Luigi Rizzo	46f2f751e1	replace __unused with a portable construct; fix a couple of signed/unsigned warnings.	2012-08-02 12:45:13 +00:00
Joel Dahl	bda8f16fe8	Remove trailing whitespace.	2012-08-02 12:17:52 +00:00
Joel Dahl	02a7f9c149	mdoc: remove superfluous paragraph macro.	2012-08-02 12:16:46 +00:00
Luigi Rizzo	b3d5301688	fix some signed/unsigned warnings in the netmap code. Unfortunately the original drivers still have a lot of sign conversion/comparison warnings.	2012-08-02 11:59:43 +00:00
Konstantin Belousov	2db8baa956	fsck_ffs shall accept the configured journal size, and not refuse to operate on it if journal size is greater than SUJ_MAX. The latter constant is only to select maximal journal size when the user did not specify the size explicitly. Submitted by: Andrey Zonov <andrey@zonov.org> Reviewed by: mckusick MFC after: 1 week	2012-08-02 10:39:54 +00:00
Luigi Rizzo	13a5d88f1a	Update netmap page, fixing the API documentation and usage example. Add a new manpage for the vale switch	2012-08-02 08:46:08 +00:00
Luigi Rizzo	42a3a5bd91	Add a newline on an error message; rename linux functions to avoid confusion; fix error reporting on linux	2012-08-02 07:35:40 +00:00
Sean Bruno	8844c80848	CPU_NEXT() already handles wrapping around to the beginning. Also, in a system with sparse CPU IDs, you can have a valid CPU ID > mp_ncpus (e.g. if you have two CPUs 0 and 4, with mp_maxid == 4 and mp_ncpus == 2). Introduced at svn r235210 Submitted by: jhb@ Reviewed by: jfv@	2012-08-02 00:00:34 +00:00
Tai-hwa Liang	50e91f8ce1	Just like the other file systems found in /sys/fs, g_vfs_open() should be paired with g_vfs_close(). Though g_vfs_close() is a wrapper around g_wither_geom_close(), r206130 added the following test in g_vfs_open(): if (bo->bo_private != vp) return (EBUSY); Which will cause a 'Device busy' error inside reiserfs_mountfs() if the same file system is re-mounted again after umount or mounting failure: (case 1, /dev/ad4s3 is not a valid REISERFS partition) # mount -t reiserfs -o ro /dev/ad4s3 /mnt mount: /dev/ad4s3: Invalid argument # mount -t msdosfs -o ro /dev/ad4s3 /mnt mount: /dev/ad4s3: Device busy (case 2, /dev/ad4s3 is a valid REISERFS partition) # mount -t reiserfs -o ro /dev/ad4s3 /mnt # umount /mnt # mount -t reiserfs -o ro /dev/ad4s3 /mnt mount: /dev/ad4s3: Device busy On the other hand, g_vfs_close() 'fixed' the above cases by doing an extra step to keep 'sc->sc_bo->bo_private' and 'cp->private' pointers synchronised. Reviewed by: kib MFC after: 1 month	2012-08-01 23:05:57 +00:00
George V. Neville-Neil	26d121f5df	When we return with an error we cannot unlock the mutex, because it's been freed. Protect against that, hopefully unlikely, case. Reviewed by: rpaulo MFC after: 2 weeks	2012-08-01 19:27:12 +00:00
Luigi Rizzo	f5705b527d	replace inet_ntoa_r with the more standard inet_ntop(). As discussed on -current, inet_ntoa_r() is non standard, has different arguments in userspace and kernel, and almost unused (no clients in userspace, only net/flowtable.c, net/if_llatbl.c, netinet/in_pcb.c, netinet/tcp_subr.c in the kernel)	2012-08-01 18:52:07 +00:00
Luigi Rizzo	71ca24f182	add a cast to avoid a signed/unsigned warning (to be removed when we will have TUNABLE_UINT constructors)	2012-08-01 18:49:00 +00:00
Bryan Drewery	7b2873fb83	- Add myself to calendar.freebsd - Add my mentor relationships to committers-ports.dot Approved by: eadler (mentor)	2012-08-01 17:48:38 +00:00
Konstantin Belousov	e1a18e46e1	Do a trivial reformatting of the comment, to record the proper commit message for r238973: Rdtsc instruction is not synchronized, it seems on some Intel cores it can bypass even the locked instructions. As a result, rdtsc executed on different cores may return unordered TSC values even when the rdtsc appearance in the instruction sequences is provably ordered. Similarly to what has been done in r238755 for TSC synchronization test, add explicit fences right before rdtsc in the timecounters 'get' functions. Intel recommends to use LFENCE, while AMD refers to MFENCE. For VIA follow what Linux does and use LFENCE. With this change, I see no reordered reads of TSC on Nehalem. Change the rmb() to inlined CPUID in the SMP TSC synchronization test. On i386, locked instruction is used for rmb(), and as noted earlier, it is not enough. Since i386 machine may not support SSE2, do simplest possible synchronization with CPUID. MFC after: 1 week Discussed with: avg, bde, jkim	2012-08-01 17:34:43 +00:00
Alexander Motin	61c49b4dd1	Several fixes to allow firmware/BIOS flash access from user-level: - remove special handling of zero length transfers in mpi_pre_fw_upload(); - add missing MPS_CM_FLAGS_DATAIN flag in mpi_pre_fw_upload(); - move mps_user_setup_request() call into proper place; - increase user command timeout from 30 to 60 seconds; - avoid NULL dereference panic in case of firmware crash. Set max DMA segment size to 24bit, as MPI SGE supports it. Use mps_add_dmaseg() to add empty SGE instead of custom code. Tune endianness safety. Reviewed by: Desai, Kashyap <Kashyap.Desai@lsi.com> Sponsored by: iXsystems, Inc.	2012-08-01 17:31:31 +00:00
Konstantin Belousov	814124c33e	diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c index c253a96..3d8bd30 100644 --- a/sys/x86/x86/tsc.c +++ b/sys/x86/x86/tsc.c @@ -82,7 +82,11 @@ static void tsc_freq_changed(void arg, const struct cf_level level, static void tsc_freq_changing(void arg, const struct cf_level level, int status); static unsigned tsc_get_timecount(struct timecounter tc); -static unsigned tsc_get_timecount_low(struct timecounter tc); +static inline unsigned tsc_get_timecount_low(struct timecounter tc); +static unsigned tsc_get_timecount_lfence(struct timecounter tc); +static unsigned tsc_get_timecount_low_lfence(struct timecounter tc); +static unsigned tsc_get_timecount_mfence(struct timecounter tc); +static unsigned tsc_get_timecount_low_mfence(struct timecounter tc); static void tsc_levels_changed(void arg, int unit); static struct timecounter tsc_timecounter = { @@ -262,6 +266,10 @@ probe_tsc_freq(void) (vm_guest == VM_GUEST_NO && CPUID_TO_FAMILY(cpu_id) >= 0x10)) tsc_is_invariant = 1; + if (cpu_feature & CPUID_SSE2) { + tsc_timecounter.tc_get_timecount = + tsc_get_timecount_mfence; + } break; case CPU_VENDOR_INTEL: if ((amd_pminfo & AMDPM_TSC_INVARIANT) != 0 \|\| @@ -271,6 +279,10 @@ probe_tsc_freq(void) (CPUID_TO_FAMILY(cpu_id) == 0xf && CPUID_TO_MODEL(cpu_id) >= 0x3)))) tsc_is_invariant = 1; + if (cpu_feature & CPUID_SSE2) { + tsc_timecounter.tc_get_timecount = + tsc_get_timecount_lfence; + } break; case CPU_VENDOR_CENTAUR: if (vm_guest == VM_GUEST_NO && @@ -278,6 +290,10 @@ probe_tsc_freq(void) CPUID_TO_MODEL(cpu_id) >= 0xf && (rdmsr(0x1203) & 0x100000000ULL) == 0) tsc_is_invariant = 1; + if (cpu_feature & CPUID_SSE2) { + tsc_timecounter.tc_get_timecount = + tsc_get_timecount_lfence; + } break; } @@ -328,16 +344,31 @@ init_TSC(void) #ifdef SMP -/ rmb is required here because rdtsc is not a serializing instruction. 
/ -#define TSC_READ(x) \ -static void \ -tsc_read_##x(void arg) \ -{ \ - uint32_t tsc = arg; \ - u_int cpu = PCPU_GET(cpuid); \ - \ - rmb(); \ - tsc[cpu 3 + x] = rdtsc32(); \ +/* + * RDTSC is not a serializing instruction, and does not drain + * instruction stream, so we need to drain the stream before executing + * it. It could be fixed by use of RDTSCP, except the instruction is + * not available everywhere. + * + * Use CPUID for draining in the boot-time SMP constistency test. The + * timecounters use MFENCE for AMD CPUs, and LFENCE for others (Intel + * and VIA) when SSE2 is present, and nothing on older machines which + * also do not issue RDTSC prematurely. There, testing for SSE2 and + * vendor is too cumbersome, and we learn about TSC presence from + * CPUID. + * + * Do not use do_cpuid(), since we do not need CPUID results, which + * have to be written into memory with do_cpuid(). + / +#define TSC_READ(x) \ +static void \ +tsc_read_##x(void arg) \ +{ \ + uint32_t tsc = arg; \ + u_int cpu = PCPU_GET(cpuid); \ + \ + __asm __volatile("cpuid" : : : "eax", "ebx", "ecx", "edx"); \ + tsc[cpu 3 + x] = rdtsc32(); \ } TSC_READ(0) TSC_READ(1) @@ -487,7 +518,16 @@ init: for (shift = 0; shift < 31 && (tsc_freq >> shift) > max_freq; shift++) ; if (shift > 0) { - tsc_timecounter.tc_get_timecount = tsc_get_timecount_low; + if (cpu_feature & CPUID_SSE2) { + if (cpu_vendor_id == CPU_VENDOR_AMD) { + tsc_timecounter.tc_get_timecount = + tsc_get_timecount_low_mfence; + } else { + tsc_timecounter.tc_get_timecount = + tsc_get_timecount_low_lfence; + } + } else + tsc_timecounter.tc_get_timecount = tsc_get_timecount_low; tsc_timecounter.tc_name = "TSC-low"; if (bootverbose) printf("TSC timecounter discards lower %d bit(s)\n", @@ -599,16 +639,48 @@ tsc_get_timecount(struct timecounter tc __unused) return (rdtsc32()); } -static u_int +static inline u_int tsc_get_timecount_low(struct timecounter tc) { uint32_t rv; __asm __volatile("rdtsc; shrd %%cl, %%edx, %0" - : "=a" (rv) : "c" 
((int)(intptr_t)tc->tc_priv) : "edx"); + : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx"); return (rv); } +static u_int +tsc_get_timecount_lfence(struct timecounter tc __unused) +{ + + lfence(); + return (rdtsc32()); +} + +static u_int +tsc_get_timecount_low_lfence(struct timecounter tc) +{ + + lfence(); + return (tsc_get_timecount_low(tc)); +} + +static u_int +tsc_get_timecount_mfence(struct timecounter tc __unused) +{ + + mfence(); + return (rdtsc32()); +} + +static u_int +tsc_get_timecount_low_mfence(struct timecounter tc) +{ + + mfence(); + return (tsc_get_timecount_low(tc)); +} + uint32_t cpu_fill_vdso_timehands(struct vdso_timehands *vdso_th) {	2012-08-01 17:26:22 +00:00
Konstantin Belousov	0220d04fe3	Add lfence(). MFC after: 1 week	2012-08-01 17:24:53 +00:00
Alan Cox	879eedbc7b	Revise pmap_enter()'s handling of mapping updates that change the PTE's PG_M and PG_RW bits but not the physical page frame. First, only perform vm_page_dirty() on a managed vm_page when the PG_M bit is being cleared. If the updated PTE continues to have PG_M set, then there is no requirement to perform vm_page_dirty(). Second, flush the mapping from the TLB when PG_M alone is cleared, not just when PG_M and PG_RW are cleared. Otherwise, a stale TLB entry may stop PG_M from being set again on the next store to the virtual page. However, since the vm_page's dirty field already shows the physical page as being dirty, no actual harm comes from the PG_M bit not being set. Nonetheless, it is potentially confusing to someone expecting to see the PTE change after a store to the virtual page.	2012-08-01 16:04:13 +00:00
Alexander Motin	1914fdecbe	Fix kernel panic on `camcontrol reset` for specific target, caused by uninitialized cm_targ in mpssas_action_resetdev(). Reviewed by: Desai, Kashyap <Kashyap.Desai@lsi.com> Sponsored by: iXsystems, Inc. MFC after: 3 days	2012-08-01 12:24:13 +00:00
Dag-Erling Smørgrav	39faed57b6	Restore a piece of BSD history. PR: 169127 Submitted by: Ruben de Groot <ruben@hacktor.com> MFC after: 1 week	2012-08-01 09:10:21 +00:00
Gleb Smirnoff	b9aee262e5	Some more whitespace cleanup.	2012-08-01 09:00:26 +00:00
Warner Losh	cc90639873	Add the chip select glue.	2012-08-01 01:18:36 +00:00
Xin LI	edd4c16f78	Teach md5(1) about sha512. MFC after: 1 month	2012-08-01 00:36:12 +00:00
Xin LI	0dfbbb3391	Use calloc().	2012-08-01 00:21:55 +00:00
Adrian Chadd	9f579ef85d	Fix a case of "mis-located braces". PR: kern/170302	2012-08-01 00:18:02 +00:00
Adrian Chadd	af01710118	Allow 802.11n hardware to support multi-rate retry when RTS/CTS is enabled. The legacy (pre-802.11n) hardware doesn't support this - although the AR5212 era hardware supports MRR, it doesn't have all the bits needed to support MRR + RTS/CTS. The AR5416 and later support a packet duration and RTS/CTS flags per rate scenario, so we should support it. Tested: * AR9280, STA PR: kern/170302	2012-07-31 23:54:15 +00:00
Bjoern A. Zeeb	3b43b78342	In case of IPsec we have to do delayed checksum calculations before adding any extension header, or rather before calling into IPsec processing as we may send the packet and not return to IPv6 output processing here. PR: kern/170116 MFC After: 3 days	2012-07-31 23:34:06 +00:00
Warner Losh	adebcba798	Prefer ate over macb. macb doesn't work anymore, and ate has more errata workarounds in it.	2012-07-31 19:41:12 +00:00
Warner Losh	8203a40e4d	Note about where we can boot this.	2012-07-31 19:39:21 +00:00
Warner Losh	679d446fde	Allow chip selects other than 0. The SAM9260EK board has its dataflash on CS1.	2012-07-31 19:14:22 +00:00
Adrian Chadd	b0fa0cba65	Restore the PCI bridge configuration upon resume. This allows my TI1510 cardbus/PCI bridge to work after a suspend/resume, without having to unload/reload the cbb driver. I've also tested this on stable/9. I'll MFC it shortly. PR: kern/170058 Reviewed by: jhb MFC after: 1 day	2012-07-31 18:47:17 +00:00
Jack F Vogel	b4750260cd	Clean up some unused leftover code from em. Make IRQ style a tunable. Fix lock handling in the interrupt handler. MFC after: 3 days	2012-07-31 18:44:10 +00:00
John Baldwin	e838f09cd0	Reorder the management of advisory locks on open files so that the advisory lock is obtained before the write count is increased during open() and the lock is released after the write count is decreased during close(). The first change closes a race where an open() that will block with O_SHLOCK or O_EXLOCK can increase the write count while it waits. If the process holding the current lock on the file then tries to call exec() on the file it has locked, it can fail with ETXTBUSY even though the advisory lock is preventing other threads from successfully completing a writable open(). The second change closes a race where a read-only open() with O_SHLOCK or O_EXLOCK may return successfully while the write count is non-zero due to another descriptor that had the advisory lock and was blocking the open() still being in the process of closing. If the process that completed the open() then attempts to call exec() on the file it locked, it can fail with ETXTBUSY even though the other process that held a write lock has closed the file and released the lock. Reviewed by: kib MFC after: 1 month	2012-07-31 18:25:00 +00:00
Martin Matuska	68f76b5452	Fix wrong indent according to style(9) MFC after: 2 weeks	2012-07-31 17:32:28 +00:00
Martin Matuska	f45b531d72	Fix reporting of root pool upgrade notice. MFC after: 2 weeks	2012-07-31 17:28:28 +00:00
Adrian Chadd	8c08c07ac4	Shuffle the call to ath_hal_setuplasttxdesc() to _after_ the rate control code is called and remove it from ath_buf_set_rate(). For the legacy (non-11n API) TX routines, ath_hal_filltxdesc() takes care of setting up the intermediary and final descriptors right, complete with copying the rate control info into the final descriptor so the rate modules can grab it. The 11n version doesn't do this - ath_hal_chaintxdesc() doesn't copy the rate control bits over, nor does it clear isaggr/moreaggr/ pad delimiters. So the call to setuplasttxdesc() is needed here. So: * legacy NICs - never call the 11n rate control stuff, so filltxdesc copies the rate control info right; * 11n NICs transmitting legacy or 11n non-aggregate frames - ath_hal_set11nratescenario() is called to setup rate control and then ath_hal_filltxdesc() chains them together - so the rate control info is right; * 11n aggregate frames - set11nratescenario() is called, then ath_hal_chaintxdesc() is called to chain a list of aggregate and subframes together. This requires a call to ath_hal_setuplasttxdesc() to complete things. Tested: * AR9280 in station mode TODO: * I really should make sure that the descriptor contents get blanked out correctly or garbage left over from aggregate frames may show up in non-aggregate frames, leading to badness.	2012-07-31 17:08:29 +00:00
Jilles Tjoelker	b562679671	find: Remove unnecessary and inconsistent initialization. Submitted by: jhb	2012-07-31 16:55:41 +00:00
Adrian Chadd	d34a73472a	Push the rate control and descriptor chaining into the descriptor "set" functions, for both legacy and 802.11n. This will simplify supporting the EDMA chipsets as these two descriptor setup functions can just be overridden in their entirety, hiding all of the subtle differences in setting things up. It's not a permanent solution, as eventually the AR5416 HAL should grow similar versions of the 11n descriptor functions and then those can be used. TODO: * Push the "clr11naggr" call into the legacy setds, just to ensure that retried frames don't end up with the aggregate bits set inappropriately; * Remove the "setlasttxdesc" call from the 11n TX path and push it into setds_11n. * Ensure that setds_11n will work correctly for non-aggregate frames; * .. and then when it does, just unconditionally call "setds_11n" for 11n NICs and "setds" for non-11n NICs.	2012-07-31 16:41:09 +00:00
Gleb Smirnoff	ea50c13ebe	Some style(9) and whitespace changes. Together with: Andrey Zonov <andrey zonov.org>	2012-07-31 11:31:12 +00:00
Alexander Motin	3c5c555957	Add several performance optimizations to acpi_cpu_idle(). For C1 and C2 states use cpu_ticks() to measure sleep time instead of much slower ACPI timer. We can't do it for C3, as TSC may stop there. But it is less important there as wake up latency is high anyway. For C1 and C2 states do not check/clear bus mastering activity status, as it is important only for C3. As side effect it can make CPU enter C2 instead of C3 if last BM activity was two sleeps back (unlike one before), but that may be even good because of collecting more statistics. Premature BM wakeup from C3, entered because of overestimation, can easily be worse than entering C2 from both performance and power consumption points of view. Together on dual Xeon E5645 system on sequential 512 bytes read test this change makes cpu_idle_acpi() as fast as simplest cpu_idle_hlt() and only a few percent slower than cpu_idle_mwait(), while deeper states are still actively used during idle periods. To help with diagnostics, add C-state type into dev.cpu.X.cx_supported. Sponsored by: iXsystems, Inc.	2012-07-31 10:58:50 +00:00
Monthadar Al Jaberi	33a2506f6b	Fixed some debug output in hwmp_recv_prep.	2012-07-31 08:05:40 +00:00
Luigi Rizzo	9df9e62789	nobody uses this file except the userspace ipfw code, but the cast of a pointer to an integer needs a cast to prevent a warning for size mismatch. MFC after: 1 week	2012-07-31 08:04:49 +00:00
Monthadar Al Jaberi	cfe1569450	Fix a PREQ comparison error in 11s HWMP. * Earlier we compared two metrics that were not equivalent: one was what we received in the 'new PREQ' while the other was what we already have saved, which was 'old PREQ' + link metric for the last hop; * Fixed by adding 'new PREQ' + link metric for the last hop in a temporary variable;	2012-07-31 07:36:27 +00:00
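
The reasoning in commit c0722d20d3 ("Microoptimize time math") can be illustrated with a minimal sketch. The struct and the two add routines below are simplified stand-ins modeled on FreeBSD's bintime helpers (time_t is replaced by int64_t here), and advance_by_period() is a hypothetical helper that does not appear in the commit. The sketch only shows why adding the fraction alone is sufficient when the period is known to be below one second.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct bintime (time_t replaced by int64_t). */
struct bintime {
	int64_t  sec;	/* whole seconds */
	uint64_t frac;	/* binary fraction of a second, in units of 2^-64 s */
};

/* Add a pure fraction; carry into sec when the 64-bit fraction wraps. */
static void
bintime_addx(struct bintime *bt, uint64_t x)
{
	uint64_t u = bt->frac;

	bt->frac += x;
	if (u > bt->frac)
		bt->sec++;
}

/* General add: fraction with carry, plus the integer seconds. */
static void
bintime_add(struct bintime *bt, const struct bintime *bt2)
{
	uint64_t u = bt->frac;

	bt->frac += bt2->frac;
	if (u > bt->frac)
		bt->sec++;
	bt->sec += bt2->sec;
}

/* Hypothetical helper: advance an event time by a sub-second period. */
static void
advance_by_period(struct bintime *next, const struct bintime *period)
{
	assert(period->sec == 0);	/* precondition that makes the shortcut valid */
	bintime_addx(next, period->frac);
}
```

When period->sec is zero, the final addition in bintime_add() contributes nothing, so both routines produce the same result and the shorter one saves a load and an add on a hot path. For example, advancing 5.75 s by a 0.5 s period yields 6.25 s either way.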

172212 Commits