Commit Graph

725 Commits

Author SHA1 Message Date
tijl
606babe108 Rename __wchar_t so it no longer conflicts with __wchar_t from clang 3.4
-fms-extensions.

MFC after:	2 weeks
2014-04-01 14:46:11 +00:00
alc
6c60c88452 As of r257209, all architectures have defined VM_KMEM_SIZE_SCALE. In other
words, every architecture is now auto-sizing the kmem arena.  This revision
changes kmeminit() so that the definition of VM_KMEM_SIZE_SCALE becomes
mandatory and the definition of VM_KMEM_SIZE becomes optional.

Replace or eliminate all existing definitions of VM_KMEM_SIZE.  With
auto-sizing enabled, VM_KMEM_SIZE effectively became an alternate spelling
for VM_KMEM_SIZE_MIN on most architectures.  Use VM_KMEM_SIZE_MIN for
clarity.

Change kmeminit() so that the effect of defining VM_KMEM_SIZE is similar to
that of setting the tunable vm.kmem_size.  Whereas the macros
VM_KMEM_SIZE_{MAX,MIN,SCALE} have had the same effect as the tunables
vm.kmem_size_{max,min,scale}, the effects of VM_KMEM_SIZE and vm.kmem_size
have been distinct.  In particular, whereas VM_KMEM_SIZE was overridden by
VM_KMEM_SIZE_{MAX,MIN,SCALE} and vm.kmem_size_{max,min,scale}, vm.kmem_size
was not.  Remedy this inconsistency.  Now, VM_KMEM_SIZE can be used to set
the size of the kmem arena at compile-time without that value being
overridden by auto-sizing.

Update the nearby comments to reflect the kmem submap being replaced by the
kmem arena.  Stop duplicating the auto-sizing formula in every machine-
dependent vmparam.h and place it in kmeminit() where auto-sizing takes
place.

Reviewed by:	kib (an earlier version)
Sponsored by:	EMC / Isilon Storage Division
2013-11-08 16:25:00 +00:00
kib
79afbd5fdd Add bus_dmamap_load_ma() function to load map with the array of
vm_pages.  Provide trivial implementation which forwards the load to
_bus_dmamap_load_phys() page by page.  Right now all architectures use
bus_dmamap_load_ma_triv().

Tested by:	pho (as part of the functional patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
2013-10-27 21:39:16 +00:00
marius
20009df359 Move the implementation of bus_space_barrier(9) to the inline function in
the header. Actually, there's only one version for all types of busses, so
it doesn't make sense to walk up the hierarchy.
2013-10-24 17:06:41 +00:00
marius
fe39ca00e6 Implement GET_STACK_USAGE.
Discussed with:	mav

Approved by:	re (kib)
MFC after:	1 week
2013-09-29 13:09:25 +00:00
glebius
cf37135185 Fix build with gcc. Move sf_buf_alloc()/sf_buf_free() declarations
to MD headers.
2013-09-06 17:44:13 +00:00
marius
95f847fd26 Add MD (for now) atomic_store_acq_<type>() and use it in pmap_activate()
to get the semantics when setting the PMAP right. Prior to r251782, the
latter already used implicit acquire semantics, which - currently - means
to not employ additional explicit memory barriers under the hood (see also
r225889).
2013-08-06 15:34:11 +00:00
attilio
598e23d2ea Remove unused member.
Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc
Tested by:	pho
2013-08-04 21:17:05 +00:00
avg
4e6c4b2a36 Revert r253748,253749
This WIP should not have been committed yet.

Pointyhat to:	avg
2013-07-28 18:44:17 +00:00
avg
ea04b46439 put contents of cpu.h under _KERNEL
no userland-serviceable parts inside

MFC after:	20 days
2013-07-28 18:32:27 +00:00
marius
98abe96b02 Prefix the alias macros for members of struct __mcontext with an underscore
in order to avoid a clash in the net80211 code.
2013-07-12 14:24:52 +00:00
kib
8a0279994d Fix issues with zeroing and fetching the counters, on x86 and ppc64.
Issues were noted by Bruce Evans and are present on all architectures.

On i386, a counter fetch should use atomic read of 64bit value,
otherwise carry from the increment on other CPU could be lost for the
given fetch, making error of 2^32.  If 64bit read (cmpxchg8b) is not
available on the machine, it cannot be SMP and it is enough to disable
preemption around read to avoid the split read.

On x86 the counter increment is not atomic on purpose, which makes it
possible for the store of the incremented result to override just
zeroed per-cpu slot.  The effect would be a counter going off by
arbitrary value after zeroing.  Perform the counter zeroing on the
same processor which does the increments, making the operations
mutually exclusive.  On i386, same as for the fetching, if the
cmpxchg8b is not available, machine is not SMP and we disable
preemption for zeroing.

PowerPC64 is treated the same as amd64.

For other architectures, the changes made to allow the compilation to
succeed, without fixing the issues with zeroing or fetching.  It
should be possible to handle them by using the 64bit loads and stores
atomic WRT preemption (assuming the architectures also converted from
using critical sections to proper asm).  If architecture does not
provide the facility, using global (spin) mutex would be non-optimal
but working solution.

Noted by:  bde
Sponsored by:	The FreeBSD Foundation
2013-07-01 02:48:27 +00:00
ed
1fdf0e5b3c Remove conflicting macros from SPARC64's atomic(9) header.
The atomic_load() and atomic_store() macros conflict with the equally
named macros from <stdatomic.h>. Remove them, as they are only used to
implement functions that are not present on any of the other
architectures.
2013-06-15 08:23:53 +00:00
attilio
b24a52ec9e Rename VM_NDOMAIN into MAXMEMDOM and move it into machine/param.h in
order to match the MAXCPU concept.  The change should also be useful
for consolidation and consistency.

Sponsored by:	EMC / Isilon storage division
Obtained from:	jeff
Reviewed by:	alc
2013-05-07 22:46:24 +00:00
glebius
9cf64d6c35 Merge from projects/counters: counter(9).
Introduce counter(9) API, that implements fast and raceless counters,
provided (but not limited to) for gathering of statistical data.

See http://lists.freebsd.org/pipermail/freebsd-arch/2013-April/014204.html
for more details.

In collaboration with:	kib
Reviewed by:		luigi
Tested by:		ae, ray
Sponsored by:		Nginx, Inc.
2013-04-08 19:40:53 +00:00
glebius
8c6eba117e Merge from projects/counters:
Pad struct pcpu so that its size is denominator of PAGE_SIZE. This
is done to reduce memory waste in UMA_PCPU_ZONE zones.

Sponsored by:	Nginx, Inc.
2013-04-08 19:19:10 +00:00
kib
bd7f0fa0bb Reform the busdma API so that new types may be added without modifying
every architecture's busdma_machdep.c.  It is done by unifying the
bus_dmamap_load_buffer() routines so that they may be called from MI
code.  The MD busdma is then given a chance to do any final processing
in the complete() callback.

The cam changes unify the bus_dmamap_load* handling in cam drivers.

The arm and mips implementations are updated to track virtual
addresses for sync().  Previously this was done in a type specific
way.  Now it is done in a generic way by recording the list of
virtuals in the map.

Submitted by:	jeff (sponsored by EMC/Isilon)
Reviewed by:	kan (previous version), scottl,
	mjacob (isp(4), no objections for target mode changes)
Discussed with:	     ian (arm changes)
Tested by:	marius (sparc64), mips (jmallet), isci(4) on x86 (jharris),
	amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)
2013-02-12 16:57:20 +00:00
kib
dfffe11d71 The 'end' word was missed in the comment.
MFC after:     3 days
2013-02-08 15:52:20 +00:00
marius
6892c8a3be Revert the part of r239864 which removed obtaining the SMP mutex around
reading registers from other CPUs. As it turns out, the hardware doesn't
really like concurrent IPI'ing causing adverse effects. Also the thought
deadlock when using this spin lock here and the targeted CPU(s) are also
holding or in case of nested locks can't actually happen. This is due to
the fact that on sparc64, spinlock_enter() only raises the PIL but doesn't
disable interrupts completely. Thus direct cross calls as used for the
register reading (and all other MD IPI needs) still will be executed by
the targeted CPU(s) in that case.

MFC after:	3 days
2013-01-23 22:52:20 +00:00
jeff
f40f3c3255 - Implement run-time expansion of the KTR buffer via sysctl.
- Implement a function to ensure that all preempted threads have switched
   back out at least once.  Use this to make sure there are no stale
   references to the old ktr_buf or the lock profiling buffers before
   updating them.

Reviewed by:	marius (sparc64 parts), attilio (earlier patch)
Sponsored by:	EMC / Isilon Storage Division
2012-11-15 00:51:57 +00:00
attilio
f3501b109e Rework the known rwlock to benefit about staying on their own
cache line in order to avoid manual frobbing but using
struct rwlock_padalign.

Reviewed by:	alc, jimharris
2012-11-03 23:03:14 +00:00
marius
807619d8ba - Give PIL_PREEMPT the lowest priority just above low/stray interrupts.
The reason for this is that the SPARC v9 architecture allows nested
  interrupts of higher priority/level than that of the current interrupt
  to occur (and we can't just entirely bypass this model, also, at least
  for tick interrupts, this also wouldn't be wise). However, when a
  preemption interrupt interrupts another interrupt of lower priority,
  f.e. PIL_ITHREAD, and that one in turn is nested by a third interrupt,
  f.e. PIL_TICK, with SCHED_ULE the execution of interrupts higher than
  PIL_PREEMPT may be migrated to another CPU. In particular, tl1_ret(),
  which is responsible for restoring the state of the CPU prior to entry
  to the interrupt based on the (also migrated) trap frame, then is run
  on a CPU which actually didn't receive the interrupt in question,
  causing an inappropriate processor interrupt level to be "restored".
  In turn, this causes interrupts of the first level, i.e. PIL_ITHREAD
  in the above scenario, to be blocked on the target of the migration
  until the correct PIL happens to be restored again on that CPU again.
  Making PIL_PREEMPT the lowest real priority, this effectively prevents
  this scenario from happening, as preemption interrupts no longer can
  interrupt any other interrupt besides stray ones (which is no issue).
  Thanks to attilio@ and especially mav@ for helping me to understand
  this problem at the 201208DevSummit.
- Give PIL_STOP (which is also used for IPI_STOP_HARD, given that there's
  no real equivalent to NMIs on SPARC v9) the highest possible priority
  just below the hardwired PIL_TICK, so it has a chance to interrupt
  more things.

MFC after:	1 week
2012-10-20 12:07:48 +00:00
attilio
6997194551 Add an unified macro to deny ability from the compiler to reorder
instruction loads/stores at its will.
The macro __compiler_membar() is currently supported for both gcc and
clang, but kernel compilation will fail otherwise.

Reviewed by:	bde, kib
Discussed with:	dim, theraven
MFC after:	2 weeks
2012-10-09 14:32:30 +00:00
attilio
3212891c92 Reverts r234074,234105,234564,234723,234989,235231-235232 and part of
r234247.
Use, instead, the static intializer introduced in r239923 for x86 and
sparc64 intr_cpus, unwinding the code to the initial version.

Reviewed by:	marius
2012-10-09 12:22:43 +00:00
gavin
9db084a6c3 Prevent indent(1) from reformatting this comment, as it contains
a formatting-sensitive table.
2012-09-07 08:18:06 +00:00
marius
1829ce3546 Add a global MD macro for the VIS block size instead of duplicating
it and using magic values all over the place.

MFC after:	1 week
2012-08-31 11:15:01 +00:00
marius
9f542929d1 - Unlike cache invalidation and TLB demapping IPIs, reading registers from
other CPUs doesn't require locking so get rid of it. As the latter is used
  for the timecounter on certain machine models, using a spin lock in this
  case can lead to a deadlock with the upcoming callout(9) rework.
- Merge r134227/r167250 from x86:
  Avoid cross-IPI SMP deadlock by using the smp_ipi_mtx spin lock not only
  for smp_rendezvous_cpus() but also for the MD cache invalidation and TLB
  demapping IPIs.
- Mark some unused function arguments as such.

MFC after:	1 week
2012-08-29 16:56:50 +00:00
marius
4c012f63e6 Merge r236494 from x86:
Isolate the global TTE list lock from data and other locks to prevent false
sharing within the cache.

MFC after:	3 days
2012-08-05 22:03:13 +00:00
andrew
0a7002aae7 Make the wchar_t type machine dependent.
This is required for ARM EABI. Section 7.1.1 of the Procedure Call for the
ARM Architecture (AAPCS) defines wchar_t as either an unsigned int or an
unsigned short with the former preferred.

Because of this requirement we need to move the definition of __wchar_t to
a machine dependent header. It also cleans up the macros defining the limits
of wchar_t by defining __WCHAR_MIN and __WCHAR_MAX in the same machine
dependent header then using them to define WCHAR_MIN and WCHAR_MAX
respectively.

Discussed with:	bde
2012-06-24 04:15:58 +00:00
kib
7b36a08108 Implement mechanism to export some kernel timekeeping data to
usermode, using shared page.  The structures and functions have vdso
prefix, to indicate the intended location of the code in some future.

The versioned per-algorithm data is exported in the format of struct
vdso_timehands, which mostly repeats the content of in-kernel struct
timehands. Usermode reading of the structure can be lockless.
Compatibility export for 32bit processes on 64bit host is also
provided. Kernel also provides usermode with indication about
currently used timecounter, so that libc can fall back to syscall if
configured timecounter is unknown to usermode code.

The shared data updates are initiated both from the tc_windup(), where
a fast task is queued to do the update, and from sysctl handlers which
change timecounter. A manual override switch
kern.timecounter.fast_gettime allows to turn off the mechanism.

Only x86 architectures export the real algorithm data, and there, only
for tsc timecounter. HPET counters page could be exported as well, but
I prefer to not further glue the kernel and libc ABI there until
proper vdso-based solution is developed.

Minimal stubs neccessary for non-x86 architectures to still compile
are provided.

Discussed with:	bde
Reviewed by:	jhb
Tested by:	flo
MFC after:	1 month
2012-06-22 07:06:40 +00:00
kib
d98ad62d7e Reserve AT_TIMEKEEP auxv entry for providing usermode the pointer to
timekeeping information.

MFC after:  1 week
2012-06-22 06:38:31 +00:00
alc
6eeaee04e4 The page flag PGA_WRITEABLE is set and cleared exclusively by the pmap
layer, but it is read directly by the MI VM layer.  This change introduces
pmap_page_is_write_mapped() in order to completely encapsulate all direct
access to PGA_WRITEABLE in the pmap layer.

Aesthetics aside, I am making this change because amd64 will likely begin
using an alternative method to track write mappings, and having
pmap_page_is_write_mapped() in place allows me to make such a change
without further modification to the MI VM layer.

As an added bonus, tidy up some nearby comments concerning page flags.

Reviewed by:	kib
MFC after:	6 weeks
2012-06-16 18:56:19 +00:00
alc
2616fac7a2 Replace all uses of the vm page queues lock by a r/w lock that is private
to this pmap.c.  This new r/w lock is used primarily to synchronize access
to the TTE lists.  However, it will be used in a somewhat unconventional
way.  As finer-grained TTE list locking is added to each of the pmap
functions that acquire this r/w lock, its acquisition will be changed from
write to read, enabling concurrent execution of the pmap functions with
finer-grained locking.

Reviewed by:	attilio
Tested by:	flo
MFC after:	10 days
2012-05-29 01:52:38 +00:00
bz
cd8b136e12 MFp4 bz_ipv6_fast:
in_cksum.h required ip.h to be included for struct ip.  To be
  able to use some general checksum functions like in_addword()
  in a non-IPv4 context, limit the (also exported to user space)
  IPv4 specific functions to the times, when the ip.h header is
  present and IPVERSION is defined (to 4).

  We should consider more general checksum (updating) functions
  to also allow easier incremental checksum updates in the L3/4
  stack and firewalls, as well as ponder further requirements by
  certain NIC drivers needing slightly different pseudo values
  in offloading cases.  Thinking in terms of a better "library".

  Sponsored by:	The FreeBSD Foundation
  Sponsored by:	iXsystems

Reviewed by:	gnn (as part of the whole)
MFC After:	3 days
2012-05-24 22:00:48 +00:00
marius
e5cc26d604 Fix mismerge in r235231. 2012-05-10 15:23:20 +00:00
marius
d42dce6416 Merge r234989 from x86:
Revert part of r234723 by re-enabling the SMP protection for intr_bind().
2012-05-10 15:17:21 +00:00
dim
81a3e2b46d Add a convenience macro for the returns_twice attribute, and apply it to
the prototypes of the appropriate functions (getcontext, savectx,
setjmp, sigsetjmp and vfork).

MFC after:	2 weeks
2012-04-29 11:04:31 +00:00
attilio
0b98e6d835 Clean up the intr* MD KPI from the SMP dependency, removing a cause of
discrepancy between modules and kernel, but deal with SMP differences
within the functions themselves.

As an added bonus this also helps in terms of code readability.

Requested by:	gibbs
Reviewed by:	jhb, marius
MFC after:	1 week
2012-04-26 20:24:25 +00:00
dim
cf2c2fde9c Add casts to __uint16_t to the __bswap16() macros on all arches which
didn't already have them.  This is because the ternary expression will
return int, due to the Usual Arithmetic Conversions.  Such casts are not
needed for the 32 and 64 bit variants.

While here, add additional parentheses around the x86 variant, to
protect against unintended consequences.

MFC after:	2 weeks
2012-03-09 20:34:31 +00:00
jhb
5013ab31bd - Change contigmalloc() to use the vm_paddr_t type instead of an unsigned
long for specifying a boundary constraint.
- Change bus_dma tags to use bus_addr_t instead of bus_size_t for boundary
  constraints.

These allow boundary constraints to be fully expressed for cases where
sizeof(bus_addr_t) != sizeof(bus_size_t).  Specifically, it allows a
driver to properly specify a 4GB boundary in a PAE kernel.

Note that this cannot be safely MFC'd without a lot of compat shims due
to KBI changes, so I do not intend to merge it.

Reviewed by:	scottl
2012-03-01 19:58:34 +00:00
marius
8ec59afebb Now that we have a working OF_printf() since r230631 and a OF_panic()
helper since r230632, use these for output and panicing during the
early cycles and move cninit() until after the static per-CPU data
has been set up. This solves a couple of issue regarding the non-
availability of the static per-CPU data:
- panic() not working and only making things worse when called,
- having to supply a special DELAY() implementation to the low-level
  console drivers,
- curthread accesses of mutex(9) usage in low-level console drivers
  that aren't conditional due to compiler optimizations (basically,
  this is the problem described in r227537 but in this case for
  keyboards attached via uart(4)). [1]

PR:	164123 [1]
2012-01-27 23:21:54 +00:00
marius
43204e516a - Now that we have a working OF_printf() since r230631, use it for
implementing a simple OF_panic() that may be used during the early
  cycles when panic() isn't available, yet.
- Mark cpu_{exit,shutdown}() as __dead2 as appropriate.
2012-01-27 22:35:53 +00:00
marius
f3dd9013de For machines where the kernel address space is unrestricted increase
VM_KMEM_SIZE_SCALE to 2, awaiting more insight from alc@. As it turns
out, the VM apparently has problems with machines that have large holes
in the physical address space, causing the kmem_suballoc() call in
kmeminit() to fail with a VM_KMEM_SIZE_SCALE of 1. Using a value of 2
allows these, namely Blade 1500 with 2GB of RAM, to boot.

PR:	164227
2012-01-27 22:25:46 +00:00
marius
16ae182079 Mark cpu_{halt,reset}() as __dead2 as appropriate. 2012-01-27 22:04:43 +00:00
das
9feb719605 Add C11 macros describing subnormal numbers to float.h.
Reviewed by:	bde
2012-01-23 06:36:41 +00:00
ed
cb983d98e7 Replace __signed by signed.
The signed keyword is an integral part of the C syntax. There's no need
to use __signed.
2011-12-13 13:38:03 +00:00
marius
6779d3e54c Revert r225889 a bit. While it's correct that in total store order there's
no need to additionally add CPU memory barriers to the acquire variants of
atomic(9), these are documented to also include compiler memory barriers.
So add the latter, which were previously included by using membar(), back.
2011-12-03 13:51:57 +00:00
marius
5846bf2bc1 For sparc64 also adjust the geometry of da(4) driven disks to not overflow
the 16-bit cylinders field of the VTOC8 disk label (at around 502GB). The
geometry chosen for disks above that limit allows to use disks up to 2TB,
which is the limit of the extended VTOC8 format. The geometry used for
disks smaller than the 16-bit cylinders limit stays the same as used by
cam_calc_geometry(9) for extended translation.
Thanks to Hans-Joerg Sirtl for providing hardware for testing this change.

MFC after:	3 days
2011-11-27 15:43:40 +00:00
marius
6e89cebeb5 Define curthread as an inline function that loads the thread pointer
directly from g7, the pcpu pointer. This guarantees correct behavior
when the thread migrates to a different CPU.
Commit message stolen from r205431. Additional testing by Peter Jeremy.

MFC after:	3 days
2011-11-15 20:17:18 +00:00
das
28e8dea258 People porting FreeBSD to new architectures ought not have to
implement a deprecated FPU control interface in addition to the
standard one.  To make this clearer, further deprecate ieeefp.h
by not declaring the function prototypes except on architectures
that implement them already.

Currently i386 and amd64 implement the ieeefp.h interface for
compatibility, and for fp[gs]etprec(), which doesn't exist on
most other hardware.  Powerpc, sparc64, and ia64 partially implement
it and probably shouldn't, and other architectures don't implement it
at all.
2011-10-21 06:41:46 +00:00