freebsd-nq

Author	SHA1	Message	Date
Hans Petter Selasky	a7a7f5b472	Make sure kernel modules built by default are portable between UP and SMP systems by extending defined(SMP) to include defined(KLD_MODULE). This is a regression issue after r335873 . Discussed with: mmacy@ Sponsored by: Mellanox Technologies	2018-07-06 10:13:42 +00:00
John Baldwin	79ba91952d	Use 'e' instead of 'i' constraints with 64-bit atomic operations on amd64. The ADD, AND, OR, and SUB instructions take at most a 32-bit sign-extended immediate operand. 64-bit constants that do not fit into that constraint need to be loaded into a register. The 'i' constraint tells the compiler it can pass any integer constant to the assembler, whereas the 'e' constraint only permits constants that fit into a 32-bit sign-extended value. This fixes using atomic_add/clear/set/subtract_long/64 with constants that do not fit into a 32-bit sign-extended immediate. Reported by: several folks Tested by: Pete Wright <pete@nomadlogic.org> MFC after: 2 weeks	2018-07-03 22:03:28 +00:00
Matt Macy	f4b3640475	inline atomics and allow tied modules to inline locks - inline atomics in modules on i386 and amd64 (they were always inline on other arches) - allow modules to opt in to inlining locks by specifying MODULE_TIED=1 in the makefile Reviewed by: kib Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D16079	2018-07-02 19:48:38 +00:00
Konstantin Belousov	30d4f9e888	Add atomic_load(9) and atomic_store(9) operations. They provide relaxed-ordered atomic access semantic. Due to the FreeBSD memory model, the operations are syntactic wrappers around the volatile accesses. The volatile qualifier is used to ensure that the access is not optimized out and in turn depends on the volatile semantic as implemented by supported compilers. The motivation for adding the operation is to help people coming from other systems or knowing the C11/C++ standards where atomics have special type and require use of the special access operations. It is still the case that FreeBSD requires plain load and stores of aligned integer types to be atomic. Suggested by: jhb Reviewed by: alc, jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D13534	2017-12-19 09:59:20 +00:00
Pedro F. Giffuni	c49761dd57	sys/amd64: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.	2017-11-27 15:03:07 +00:00
Gleb Smirnoff	83c9dea1ba	- Remove 'struct vmmeter' from 'struct pcpu', leaving only global vmmeter in place. To do per-cpu stats, convert all fields that previously were maintained in the vmmeters that sit in pcpus to counter(9). - Since some vmmeter stats may be touched at very early stages of boot, before we have set up UMA and we can do counter_u64_alloc(), provide an early counter mechanism: o Leave one spare uint64_t in struct pcpu, named pc_early_dummy_counter. o Point counter(9) fields of vmmeter to pcpu[0].pc_early_dummy_counter, so that at early stages of boot, before counters are allocated we already point to a counter that can be safely written to. o For sparc64 that required a whole dummy pcpu[MAXCPU] array. Further related changes: - Don't include vmmeter.h into pcpu.h. - vm.stats.vm.v_swappgsout and vm.stats.vm.v_swappgsin changed to 64-bit, to match kernel representation. - struct vmmeter hidden under _KERNEL, and only vmstat(1) is an exclusion. This is based on benno@'s 4-year old patch: https://lists.freebsd.org/pipermail/freebsd-arch/2013-July/014471.html Reviewed by: kib, gallatin, marius, lidl Differential Revision: https://reviews.freebsd.org/D10156	2017-04-17 17:34:47 +00:00
Mark Johnston	3d6732549d	Add support for 8- and 16-bit atomic_(f)cmpset to x86. Reviewed by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D10068	2017-03-22 17:29:04 +00:00
Mateusz Guzik	f7c6177038	amd64: add atomic_fcmpset Reviewed by: kib, jhb	2017-01-03 21:00:24 +00:00
Sepherosa Ziehau	dfdc9a05c6	atomic: Add testandclear on i386/amd64 Reviewed by: kib Sponsored by: Microsoft OSTC Differential Revision: https://reviews.freebsd.org/D6381	2016-05-16 07:19:33 +00:00
Hans Petter Selasky	c1ecb7e114	Add missing atomic wrapper macro. Reviewed by: alfred @ Sponsored by: Mellanox Technologies MFC after: 1 week	2016-01-21 18:22:50 +00:00
Konstantin Belousov	0b6476ec5b	Improve comments. Submitted by: bde MFC after: 2 weeks	2015-07-30 15:47:53 +00:00
Konstantin Belousov	1d1ec02c44	Remove full barrier from the amd64 atomic_load_acq_*(). Strong ordering semantic of x86 CPUs makes only the compiler barrier necessary to give the acquire behaviour. Existing implementation ensured sequentially consistent semantic for load_acq, making much stronger guarantee than required by standard's definition of the load acquire. Consumers which depend on the barrier are believed to be identified and already fixed to use proper operations. Noted by: alc (long time ago) Reviewed by: alc, bde Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-07-28 07:04:51 +00:00
Alan Cox	d8b56c8eab	Add a comment discussing the appropriate use of the atomic_() functions with acquire and release semantics versus the mb() functions on amd64 processors. Reviewed by: bde (an earlier version), kib Sponsored by: EMC / Isilon Storage Division	2015-07-24 19:43:18 +00:00
Konstantin Belousov	8954a9a4e6	Add the atomic_thread_fence() family of functions with intent to provide a semantic defined by the C11 fences with corresponding memory_order. atomic_thread_fence_acq() gives r | r, w, where r and w are read and write accesses, and | denotes the fence itself. atomic_thread_fence_rel() is r, w | w. atomic_thread_fence_acq_rel() is the combination of the acquire and release in single operation. Note that reads after the acq+rel fence could be made visible before writes preceding the fence. atomic_thread_fence_seq_cst() orders all accesses before/after the fence, and the fence itself is globally ordered against other sequentially consistent atomic operations. Reviewed by: alc Discussed with: bde Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2015-07-08 18:12:24 +00:00
Konstantin Belousov	3ac3c0f269	Add a comment about too strong semantic of atomic_load_acq() on x86. Submitted by: bde MFC after: 2 weeks	2015-06-29 09:58:40 +00:00
Konstantin Belousov	7626d062c3	Remove unneeded data dependency, currently imposed by atomic_load_acq(9), on its source, for x86. Right now, atomic_load_acq() on x86 is sequentially consistent with other atomics, code ensures this by doing store/load barrier by performing locked nop on the source. Provide separate primitive __storeload_barrier(), which is implemented as the locked nop done on a cpu-private variable, and put __storeload_barrier() before load, to keep seq_cst semantic but avoid introducing false dependency on the no-modification of the source for its later use. Note that seq_cst property of x86 atomic_load_acq() is not documented and not carried by atomics implementations on other architectures, although some kernel code relies on the behaviour. This commit does not intend to change this. Reviewed by: alc Discussed with: bde Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-06-28 05:04:08 +00:00
Jung-uk Kim	d36eb3f1c4	Remove empty lines before return statements for style consistency.	2013-08-21 22:05:58 +00:00
Jung-uk Kim	8a1ee2d346	Implement atomic_swap() and atomic_testandset(). Reviewed by: arch, bde, jilles, kib	2013-08-21 22:03:06 +00:00
Jung-uk Kim	da255e4c7f	- Remove the "a" constraint from main output operand for atomic_cmpset(). - Use "+" modifier for the "expect" because it is also an output (unused).	2013-08-21 21:30:06 +00:00
Jung-uk Kim	fe94be3da7	Use '+' modifier for a memory operand that is both an input and an output. It was actually done in r86301 but reverted in r150182 because GCC 3.x was not able to handle it for a memory operand. Apparently, this problem was fixed in GCC 4.1+ and several contrib sources already rely on this feature.	2013-08-21 21:14:16 +00:00
Jung-uk Kim	c1c84ce1bf	Remove bogus labels. No functional change.	2013-08-21 20:49:46 +00:00
Jung-uk Kim	ee93d1173a	Use consistent style. No functional change.	2013-08-21 20:43:50 +00:00
Attilio Rao	3a4730256a	Add an unified macro to deny ability from the compiler to reorder instruction loads/stores at its will. The macro __compiler_membar() is currently supported for both gcc and clang, but kernel compilation will fail otherwise. Reviewed by: bde, kib Discussed with: dim, theraven MFC after: 2 weeks	2012-10-09 14:32:30 +00:00
Konstantin Belousov	fa9f322df9	Use plain store for atomic_store_rel on x86, instead of implicitly locked xchg instruction. IA32 memory model guarantees that store has release semantic, since stores cannot pass loads or stores. Reviewed by: bde, jhb Tested by: pho MFC after: 2 weeks	2012-06-02 18:10:16 +00:00
Konstantin Belousov	7222d2fbee	Inform a compiler which asm statements in the x86 implementation of atomics change eflags. Reviewed by: jhb MFC after: 2 weeks	2010-12-18 16:41:11 +00:00
Poul-Henning Kamp	065b12a703	Rename an argument from "exp" to "expect" since the former makes FlexeLint uneasy, in case anybody think it might be exp(3) in libm. This also makes it consistent with other archs.	2010-05-20 06:18:03 +00:00
Attilio Rao	8448afced8	atomic_cmpset_barr_* was added in order to cope with compilers willing to specify their own version of atomic_cmpset_* which could have been different than the membar version. Right now, however, FreeBSD is bound mostly to GCC-like compilers and it is desired to add new support and compat shim mostly when there is a real necessity, in order to avoid too much compatibility bloats. In this optic, bring back atomic_cmpset_{acq, rel}_* to be the same as atomic_cmpset_* and unwind the atomic_cmpset_barr_* introduction. Requested by: jhb Reviewed by: jhb Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2009-10-09 15:51:40 +00:00
Attilio Rao	d9492a4483	- All the functions in atomic.h need to be in "physical" form (like not defined through macros or similar) in order to be later compiled in the kernel and offer this way the support for modules (and compatibility among the UP case and SMP case). Fix this for the newly introduced atomic_cmpset_barr_* cases by defining and specifying a template. Note that the new DEFINE_CMPSET_GEN() template saves more typing on amd64 than the current code. [1] - Fix the style for memory barriers on amd64. [1] Reported by: Paul B. Mahol <onemda at gmail dot com>	2009-10-06 23:48:28 +00:00
Attilio Rao	86d2e48c22	Per their definition, atomic instructions used in conjunction with memory barriers should also ensure that the compiler doesn't reorder paths where they are used. GCC, however, does that aggressively, even in presence of volatile operands. The most reliable way GCC offers to avoid instruction reordering is clobbering "memory" even if that is theoretically a heavy-weight operation, flushing the content of all the registers and forcing reload of them (We could rely, however, on gcc DTRT by just understanding the purpose as this is a well-known pattern for many modern operating-systems). Not all our memory barriers, right now, clobber memory for GCC-like compilers. The most notable cases are IA32 and amd64 where the memory barriers are treated the same as normal atomic instructions. Fix this by offering the possibility to implement atomic instructions with memory barriers separately from the normal version and implement the GCC-like specific one using memory clobbering. Thanks to Chris Lattner (@apple) for his discussion on llvm specifics. Reported by: jhb Reviewed by: jhb Tested by: rdivacky, Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2009-10-06 13:45:49 +00:00
Kip Macy	db7f0b974f	- bump __FreeBSD version to reflect added buf_ring, memory barriers, and ifnet functions - add memory barriers to <machine/atomic.h> - update drivers to only conditionally define their own - add lockless producer / consumer ring buffer - remove ring buffer implementation from cxgb and update its callers - add if_transmit(struct ifnet ifp, struct mbuf m) to ifnet to allow drivers to efficiently manage multiple hardware queues (i.e. not serialize all packets through one ifq) - expose if_qflush to allow drivers to flush any driver managed queues This work was supported by Bitgravity Inc. and Chelsio Inc.	2008-11-22 05:55:56 +00:00
Pawel Jakub Dawidek	6eb4157ffc	Implement atomic_fetchadd_long() for all architectures and document it. Reviewed by: attilio, jhb, jeff, kris (as a part of the uidinfo_waitfree.patch)	2008-03-16 21:20:50 +00:00
Bruce Evans	f28e1c8f99	Fixed some style bugs (mainly assorted errors in comments, and inconsistent spelling of `result').	2006-12-29 15:29:49 +00:00
Bruce Evans	6c296ffa81	Fixed some style bugs (whitespace only).	2006-12-29 14:28:23 +00:00
Bruce Evans	7e4277e591	Try harder to garbage-collect the "LOCORE" (really asm) version of MPLOCKED. The cleaning in rev.1.25 was supposed to have been undone by rev.1.26, but 1.26 could never have actually affected asm files since atomic.h is full of C declarations so including it in asm files would just give syntax errors. The asm MPLOCKED is even less needed than when misplaced definitions of it were first removed, and is now unused in any asm file in the src tree except in anachronisms in sys/i386/i386/support.s.	2006-12-29 13:36:26 +00:00
Bruce Evans	276c702d8d	Removed gratuitous cosmetic differences with the i386 version. This mainly involves removing all __CC_SUPPORTS___INLINE__ ifdefs. These ifdefs are even less needed for amd64 than for i386, but the i386 atomic.h never had them. The ifdefs here were just an optimization of obsolescent compatibility cruft (__inline) for a null set of compilers. I think null sets of compilers should only be supported in cases where this is more than an optimization, doesn't require extensive ifdefs, and only involves not-so-obsolescent compatibility cruft (plain inline here).	2006-12-28 08:15:14 +00:00
Bruce Evans	26ab2d1d23	Avoid an instruction in atomic_cmpset_{int,long}() in most cases. These functions are used a lot for mutexes, so this reduces the text size of an average kernel by about 0.75%. This wasn't intended to be a significant optimization, but it somehow increased the maximum number of packets per second that can be transmitted by my bge hardware from 320000 to 460000 (this benchmark is CPU-bound and remarkably sensitive to changes in the text section). Details: we would prefer to leave the result of the cmpxchg in %al, but cannot tell gcc that it is there, so we have to convert it to an integer register. We converted to %al, then to %[re]ax, but the latter step is usually wasted since gcc usually only wants the condition code and can recover it from %al just as easily as from %[re]ax. Let gcc promote %al in the few cases where this is needed. Nearby style fixes: - let gcc manage the load of `res', and don't abuse `res' for a copy of `exp' - don't echo `res's name in comments - consistently spell the condition code as 'e' after comparison for equality - don't hard-code %al anywhere except in constraints - for the version that doesn't use cmpxchg, there is no requirement to use %al anywhere, so don't hard-code it in the constraints either. Style non-fix: - for the versions that use cmpxchg, keep using "a" (was %[re]ax, now %al) for the main output operand, although this is not required. The input and output operands that use the "a" constraint are now decoupled, and this makes things clearer except for the reason that the output register is hard-coded. It is now just a hack to tell gcc that the input "a" has been clobbered without increasing the number of operands.	2006-12-27 20:26:00 +00:00
John Baldwin	3c2bc2bf26	Add a new atomic_fetchadd() primitive that atomically adds a value to a variable and returns the previous value of the variable. Tested on: i386, alpha, sparc64, arm (cognet) Reviewed by: arch@ Submitted by: cognet (arm) MFC after: 1 week	2005-09-27 17:39:11 +00:00
John Baldwin	80d52f16da	Stop using the '+' constraint modifier with inline assembly. The '+' constraint is actually only allowed for register operands. Instead, use separate input and output memory constraints. Education from: alc Reviewed by: alc Tested on: i386, alpha MFC after: 1 week	2005-09-15 19:31:22 +00:00
John Baldwin	5d2f4de5da	Add aliases for atomic operations on 64-bit integers just like other 64-bit platforms. MFC after: 1 week	2005-08-18 14:36:47 +00:00
Peter Wemm	9e76f9ad3f	Like on i386, bypass lock prefix for atomic ops on !SMP kernels.	2005-07-21 22:35:02 +00:00
John Baldwin	122eceef61	Convert the atomic_ptr() operations over to operating on uintptr_t variables rather than void * variables. This makes it easier and simpler to get asm constraints and volatile keywords correct. MFC after: 3 days Tested on: i386, alpha, sparc64 Compiled on: ia64, powerpc, amd64 Kernel toolchain busted on: arm	2005-07-15 18:17:59 +00:00
John Baldwin	48281036d7	Some cleanups and tweaks to some of the atomic.h files in preparation for further changes and fixes in the future: - Use aliases via macros rather than duplicated inlines wherever possible. - Move all the aliases to the bottom of these files and the inline functions to the top. - Add various comments. - On alpha, drop atomic_{load_acq,store_rel}_{8,char,16,short}(). - On i386 and amd64, don't duplicate the extern declarations for functions in the two non-inline cases (KLD_MODULE and compiler doesn't do inlines), instead, consolidate those two cases. - Some whitespace fixes. Approved by: re (scottl)	2005-07-09 12:38:53 +00:00
Joerg Wunsch	a5f50ef9e4	netchild's mega-patch to isolate compiler dependencies into a central place. This moves the dependency on GCC's and other compiler's features into the central sys/cdefs.h file, while the individual source files can then refer to #ifdef __COMPILER_FEATURE_FOO where they by now used to refer to #if __GNUC__ > 3.1415 && __BARC__ <= 42. By now, GCC and ICC (the Intel compiler) have been actively tested on IA32 platforms by netchild. Extension to other compilers is supposed to be possible, of course. Submitted by: netchild Reviewed by: various developers on arch@, some time ago	2005-03-02 21:33:29 +00:00
Peter Wemm	cda078658e	Cosmetic and/or trivial sync up with i386. Approved by: re (rwatson)	2003-11-21 03:02:00 +00:00
Peter Wemm	0d2a298904	Initial landing of SMP support for FreeBSD/amd64. - This is heavily derived from John Baldwin's apic/pci cleanup on i386. - I have completely rewritten or drastically cleaned up some other parts. (in particular, bootstrap) - This is still a WIP. It seems that there are some highly bogus bioses on nVidia nForce3-150 boards. I can't stress how broken these boards are. I have a workaround in mind, but right now the Asus SK8N is broken. The Gigabyte K8NPro (nVidia based) is also mind-numbingly hosed. - Most of my testing has been with SCHED_ULE. SCHED_4BSD works. - the apic and acpi components are 'standard'. - If you have an nVidia nForce3-150 board, you are stuck with 'device atpic' in addition, because they somehow managed to forget to connect the 8254 timer to the apic, even though its in the same silicon! ARGH! This directly violates the ACPI spec.	2003-11-17 08:58:16 +00:00
Peter Wemm	afa8862328	Commit MD parts of a loosely functional AMD64 port. This is based on a heavily stripped down FreeBSD/i386 (brutally stripped down actually) to attempt to get a stable base to start from. There is a lot missing still. Worth noting: - The kernel runs at 1GB in order to cheat with the pmap code. pmap uses a variation of the PAE code in order to avoid having to worry about 4 levels of page tables yet. - It boots in 64 bit "long mode" with a tiny trampoline embedded in the i386 loader. This simplifies locore.s greatly. - There are still quite a few fragments of i386-specific code that have not been translated yet, and some that I cheated and wrote dumb C versions of (bcopy etc). - It has both int 0x80 for syscalls (but using registers for argument passing, as is native on the amd64 ABI), and the 'syscall' instruction for syscalls. int 0x80 preserves all registers, 'syscall' does not. - I have tried to minimize looking at the NetBSD code, except in a couple of places (eg: to find which register they use to replace the trashed %rcx register in the syscall instruction). As a result, there is not a lot of similarity. I did look at NetBSD a few times while debugging to get some ideas about what I might have done wrong in my first attempt.	2003-05-01 01:05:25 +00:00
Jim Pirzyk	77e8341280	Add a knob to turn on and off the CMPXCHG instruction on > i386 IA32 systems. This is most beneficial for vmware client os installs. Reviewed by: jmallet, iedowse, tlambert2@mindspring.com MFC After: never, -STABLE does not currently use this instruction	2002-10-14 19:33:12 +00:00
Mark Murray	4c5aee92a7	Beautify. This has the side effect of improving portability and making lint work cleaner. Inspired to do this by: jhb	2002-07-18 15:56:46 +00:00
Mark Murray	8306a37bbb	Clean up the syntax WRT semicolons at the end of function-like-macros, and protect GCCisms from non-GNU compilers and lint.	2002-07-17 16:19:37 +00:00
Bosko Milekic	71acb2477f	Make MPLOCKED work again in asm files and stringify it explicitly where necessary. Reviewed by: jake	2002-02-28 06:17:05 +00:00
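
Commit 30d4f9e888 in the log above describes atomic_load(9) and atomic_store(9) as syntactic wrappers around volatile accesses. The following is a minimal C sketch of that idea only, assuming GCC or Clang (for `__typeof__`). The `sketch_` names are hypothetical and are not the macros defined in FreeBSD's atomic.h.

```c
#include <stdint.h>

/*
 * Hypothetical illustration, not FreeBSD's implementation: a relaxed
 * load or store expressed as a plain access through a volatile-qualified
 * pointer, so the compiler cannot drop or merge the access.
 */
#define sketch_atomic_load(p)      (*(volatile __typeof__(*(p)) *)(p))
#define sketch_atomic_store(p, v)  (*(volatile __typeof__(*(p)) *)(p) = (v))

static uint64_t sketch_counter;

/*
 * Reads the counter, writes back the incremented value, and returns the
 * value that was read. Each access is individually a single volatile
 * load or store; the read-modify-write as a whole is NOT atomic.
 */
static uint64_t
sketch_bump(void)
{
	uint64_t old = sketch_atomic_load(&sketch_counter);

	sketch_atomic_store(&sketch_counter, old + 1);
	return (old);
}
```

The sketch gives no ordering guarantees beyond what volatile provides. An indivisible update would need one of the read-modify-write primitives named in the log, such as atomic_fetchadd() or atomic_cmpset().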

74 Commits