Commit Graph

1391 Commits

Author SHA1 Message Date
Konstantin Belousov
55aabb7fd1 For architectures not using direct map , and requiring real KVA page for
sf buf allocation, use wakeup() instead of wakeup_one() to notify sf
buffer waiters about free buffer.

sf_buf_alloc() calls msleep(PCATCH) when SFB_CATCH flag was given,
and for simultaneous wakeup and signal delivery, msleep() returns
EINTR/ERESTART despite the thread was selected for wakeup_one(). As
result, we loose a wakeup, and some other waiter will not be woken up.

Reported and tested by:	az
Reviewed by:	alc, jhb
MFC after:	1 week
2011-01-18 21:57:02 +00:00
Jung-uk Kim
bc35e60ec0 Remove empty dev_mem_md_init() stubs. 2011-01-17 23:06:47 +00:00
Jung-uk Kim
2fea643112 Add reader/writer lock around mem_range_attr_get() and mem_range_attr_set().
Compile sys/dev/mem/memutil.c for all supported platforms and remove now
unnecessary dev_mem_md_init().  Consistently define mem_range_softc from
mem.c for all platforms.  Add missing #include guards for machine/memdev.h
and sys/memrange.h.  Clean up some nearby style(9) nits.

MFC after:	1 month
2011-01-17 22:58:28 +00:00
Marcel Moolenaar
6949dd5cc2 Don't re-use MODINFOMD_BOOTINFO as MODINFOMD_DTBP. It breaks
compatibility without any means for the kernel to work with
an older loader.
2011-01-11 22:07:39 +00:00
John Baldwin
58ccf5b41c Remove unneeded includes of <sys/linker_set.h>. Other headers that use
it internally contain nested includes.

Reviewed by:	bde
2011-01-11 13:59:06 +00:00
Konstantin Belousov
50a57dfbec Move repeated MAXSLP definition from machine/vmparam.h to sys/vmmeter.h.
Update the outdated comments describing MAXSLP and the process
selection algorithm for swap out.

Comments wording and reviewed by:	alc
2011-01-09 12:50:44 +00:00
Tijl Coosemans
a56e818f29 On mixed 32/64 bit architectures (mips, powerpc) use __LP64__ rather than
architecture macros (__mips_n64, __powerpc64__) when 64 bit types (and
corresponding macros) are different from 32 bit. [1]

Correct the type of INT64_MIN, INT64_MAX and UINT64_MAX.

Define (U)INTMAX_C as an alias for (U)INT64_C matching the type definition
for (u)intmax_t. Do this on all architectures for consistency.

Suggested by:	bde [1]
Approved by:	kib (mentor)
2011-01-08 12:43:05 +00:00
Tijl Coosemans
d942996baf On 32 bit architectures define (u)int64_t as (unsigned) long long instead
of (unsigned) int __attribute__((__mode__(__DI__))). This aligns better
with macros such as (U)INT64_C, (U)INT64_MAX, etc. which assume (u)int64_t
has type (unsigned) long long.

The mode attribute was used because long long wasn't standardised until
C99. Nowadays compilers should support long long and use of the mode
attribute is discouraged according to GCC Internals documentation.

The type definition has to be marked with __extension__ to support
compilation with "-std=c89 -pedantic".

Discussed with:	bde
Approved by:	kib (mentor)
2011-01-08 11:47:55 +00:00
Tijl Coosemans
9858863cd4 Fix types of some values in machine/_limits.h.
On some architectures UCHAR_MAX and USHRT_MAX had type unsigned int.
However, lacking integer suffixes for types smaller than int, their type
should correspond to that of an object of type unsigned char (or short)
when used in an expression with objects of type int. In that case unsigned
char (short) are promoted to int (i.e. signed) so the type of UCHAR_MAX and
USHRT_MAX should also be int.

Where MIN/MAX constants implicitly have the correct type the suffix has
been removed.

While here, correct some comments.

Reviewed by:	bde
Approved by:	kib (mentor)
2011-01-08 11:13:34 +00:00
Tijl Coosemans
911127a0d6 Remove unused support for 64 bit long on 32 bit architectures.
It was used mainly to discover and fix some 64-bit portability problems
before 64-bit arches were widely available.

Discussed with:	bde
Approved by:	kib (mentor)
2011-01-07 22:57:31 +00:00
Konstantin Belousov
39198f15ee Add AT_STACKPROT elf aux vector. Will be used to inform rtld about the
initial stack protection set by the kernel image activator.
2011-01-07 14:22:34 +00:00
John Baldwin
c305730dc0 Remove bogus usage of INTR_FAST. "Fast" interrupts are now indicated by
registering a filter handler rather than a threaded handler.  Also remove
a bogus use of INTR_MPSAFE for a filter.
2011-01-06 21:08:06 +00:00
John Baldwin
31e55059be - Add a proper return value to mv_gpio_intr().
- Remove an obsolete use of INTR_FAST.
2011-01-06 21:03:55 +00:00
John Baldwin
e9dc47ee81 - Use macbstart_locked() directly instead of deferring it to a task.
- Expand locking scope in interrupt handler.
- Flesh out the detach routine.

Reviewed by:	cognet
2011-01-06 19:32:00 +00:00
Warner Losh
fb1f3084ea Remove support for SKYEYE simulator 2011-01-05 23:45:07 +00:00
Warner Losh
79f4804918 Remove ancient simulation code. Skyeye simulation never really worked
quite right and hasn't been used in ages and is likely broken.  QEMU
with GUMSTIX is a more promising road to FreeBSD/arm in emulation
anyway.

Reviewed by:	cognet@
2011-01-05 22:15:57 +00:00
Warner Losh
72497ecf7e IXP4XX_GPIO_{,UN}LOCK() don't take args. Remove the sc here to make
this compile again.
2010-12-23 19:28:50 +00:00
Kevin Lo
7df9d5acad Fix double ;; 2010-12-06 10:24:06 +00:00
Rebecca Cran
c90f7d9b44 Revert r216134. This checkin broke platforms where bus_space are macros:
they need to be a single statement, and do { } while (0) doesn't work in this
situation so revert until a solution can be devised.
2010-12-03 07:09:23 +00:00
Rebecca Cran
15b4888a24 Disallow passing in a count of zero bytes to the bus_space(9) functions.
Passing a count of zero on i386 and amd64 for [I386|AMD64]_BUS_SPACE_MEM
causes a crash/hang since the 'loop' instruction decrements the counter
before checking if it's zero.

PR:	kern/80980
Discussed with:	jhb
2010-12-02 22:19:30 +00:00
Andrew Thompson
a0ba8fd51c Provide a mutex around the read/modify/write of the IXP425_GPIO_*
registers. Giant was used in some places, but not all.
2010-11-14 20:41:22 +00:00
Andrew Thompson
54873b4cd6 Add a GPIO driver for the Gateworks Cambria platform.
The external gpio pins are connected to a PLD on the i2c bus, unfortunatley
this device does not conform by failing to send an ack after each byte written.
The iicbb driver will abort the transfer when the address is not ack'd and it
would introduce a lot of churn to be able to pass a flag down to
iicbb_start/iicbb_write. Instead we do bad things by grabbing the iicbus but
then doing our own bit banging.
2010-11-11 20:18:33 +00:00
Bernd Walter
bfb8239854 add hint for at45d flash device sitting of spibus0 2010-11-11 15:02:14 +00:00
John Baldwin
961135ead8 - Remove <machine/mutex.h>. Most of the headers were empty, and the
contents of the ones that were not empty were stale and unused.
- Now that <machine/mutex.h> no longer exists, there is no need to allow it
  to override various helper macros in <sys/mutex.h>.
- Rename various helper macros for low-level operations on mutexes to live
  in the _mtx_* or __mtx_* namespaces.  While here, change the names to more
  closely match the real API functions they are backing.
- Drop support for including <sys/mutex.h> in assembly source files.

Suggested by:	bde (1, 2)
2010-11-09 20:46:41 +00:00
Rebecca Cran
b1ce21c6ef Fix typos.
PR:	bin/148894
Submitted by:	olgeni
2010-11-09 10:59:09 +00:00
Kevin Lo
d6cb4126e3 Minor cosmetic changes 2010-11-09 09:34:21 +00:00
Kevin Lo
036227ca55 Intel IXP425 SoC is based on the ARMv5TE architecture
MFC after:	3 days
2010-11-08 07:54:24 +00:00
Andrew Thompson
0b773cbb23 Remove line for the uncommitted Cambria gpio drive that snuck in with r214946. 2010-11-07 20:38:14 +00:00
Andrew Thompson
cefd33c787 Hook up the five gpio pins on the Avila board to the gpio framework. There are
actually 16 I/O lines but the other ones are used for system devices and
interrupts.

The IXP4XX platform can set interrupts on these pins for
high/low/rising/falling/transitional but this is not implemented yet.

The Cambria has the same interface but as all the pins are assigned to system
functions the gpio header is toggled via a PLD on the i2c bus and is not
supported by this commit.
2010-11-07 20:33:39 +00:00
John Baldwin
0108cce0a4 Adjust the order of operations in spinlock_enter() and spinlock_exit() to
work properly with single-stepping in a kernel debugger.  Specifically,
these routines have always disabled interrupts before increasing the nesting
count and restored the prior state of interrupts after decreasing the nesting
count to avoid problems with a nested interrupt not disabling interrupts
when acquiring a spin lock.  However, trap interrupts for single-stepping
can still occur even when interrupts are disabled.  Now the saved state of
interrupts is not saved in the thread until after interrupts have been
disabled and the nesting count has been increased.  Similarly, the saved
state from the thread cannot be read once the nesting count has been
decreased to zero.  To fix this, use temporary variables to store interrupt
state and shuffle it between the thread's MD area and the appropriate
registers.

In cooperation with:	bde
MFC after:     1 month
2010-11-05 13:42:58 +00:00
Olivier Houchard
306cc0acfb Try to be a little smart at guessing where _start is located in flash, instead
of relying on a binutils bug.

Reported by:	dim
2010-11-01 21:04:23 +00:00
Alexander Motin
bda55b6adb Set of legacy mode SATA enchancements:
- Implement proper combined mode decoding for Intel controllers to properly
identify SATA and PATA channels and associate ATA channels with SATA ports.
This fixes wrong reporting and in some cases hard resets to wrong SATA ports.
- Improve SATA registers support to handle hot-plug events and potentially
interface errors. For ICH5/6300ESB chipsets these registers accessible via
PCI config space. For later ones they may be accessible via PCI BAR(5).
- For controllers not generating interrupts on hot-plug events, implement
periodic status polling. Use it to detect hot-plug on Intel and VIA
controllers. Same probably could also be used for Serverworks and SIS.
2010-10-18 11:30:13 +00:00
Marius Strobl
b56f1ea9d4 Remove a device_printf() accidentally left in r213894.
Submitted by:	jhb
2010-10-15 15:16:36 +00:00
Marius Strobl
d6c65d276e Converted the remainder of the NIC drivers to use the mii_attach()
introduced in r213878 instead of mii_phy_probe(). Unlike r213893 these
are only straight forward conversions though.

Reviewed by:	yongari
2010-10-15 15:00:30 +00:00
Marius Strobl
8e5d93dbb4 Convert the PHY drivers to honor the mii_flags passed down and convert
the NIC drivers as well as the PHY drivers to take advantage of the
mii_attach() introduced in r213878 to get rid of certain hacks. For
the most part these were:
- Artificially limiting miibus_{read,write}reg methods to certain PHY
  addresses; we now let mii_attach() only probe the PHY at the desired
  address(es) instead.
- PHY drivers setting MIIF_* flags based on the NIC driver they hang
  off from, partly even based on grabbing and using the softc of the
  parent; we now pass these flags down from the NIC to the PHY drivers
  via mii_attach(). This got us rid of all such hacks except those of
  brgphy() in combination with bce(4) and bge(4), which is way beyond
  what can be expressed with simple flags.

While at it, I took the opportunity to change the NIC drivers to pass
up the error returned by mii_attach() (previously by mii_phy_probe())
and unify the error message used in this case where and as appropriate
as mii_attach() actually can fail for a number of reasons, not just
because of no PHY(s) being present at the expected address(es).

Reviewed by:	jhb, yongari
2010-10-15 14:52:11 +00:00
Olivier Houchard
41456e2db1 Add the QILA9G20 config files.
Submitted by:	Greg Ansley
2010-10-06 22:41:32 +00:00
Olivier Houchard
d1a346128f Add support for the AT91SAM9260
Submitted by:	Greg Ansley
2010-10-06 22:40:27 +00:00
Olivier Houchard
9195433fdc Add the AT91SAM9G20EK config files.
Submitted by:	Greg Ansley
2010-10-06 22:32:31 +00:00
Olivier Houchard
4ad6106939 if_ate.c:
* Support for sam9 "EMAC" controller.
    * Support for rmii interface to phy.

at91.c & at91sam9.c:

    * Eliminate separate at91sam9.c file.
    * Add new devices to at91sam9_devs table.

at91_machdep.c & at at91sam9_machdep.c:

    * Automatic chip type determination.
    * Remove compile time chip dependencies.
    * Eliminate separate at91sam9_machdep.c file.

at91_pmc.c:

    * Corrected support for all of the sam926? and sam9g20 chips.
    * Remove compile time chip dependencies.

My apologies to Greg for taking so long to take care of it.
2010-10-06 22:25:21 +00:00
Bernd Walter
4a2b4bc031 fix outdated comment 2010-09-28 21:13:54 +00:00
Bernd Walter
974dd68e5c The TWI controller automatically stops if we don't fill up with new data in
time.
2010-09-27 15:58:19 +00:00
Bernd Walter
cd93636f3e fix off by one error for twi reads with len != 1.
STOP must be requested before the last byte is received.
2010-09-27 15:55:30 +00:00
Alexander Motin
cfa892b592 Add basic cpu_sleep() support for Marvell SoCs. This drops my SheevaPlug's
heatsink termperature in open air from 49C to 43C when idle.
2010-09-18 16:57:05 +00:00
Alexander Motin
afc1cdb92d Clear timer interrupt status before calling callback, not after it,
This fixes timer interrupt losses, fatal in one-shot mode.
2010-09-18 13:44:39 +00:00
Olivier Houchard
68710d7d2f In pmap_remove_all(), do not decrease pm_stats.wired_count if the mapping was
wired, as it's been done later in pmap_nuke_pv().

Submitted by:	Mark Tinguely
2010-09-12 20:46:32 +00:00
Andriy Gapon
3d844eddb7 bus_add_child: change type of order parameter to u_int
This reflects actual type used to store and compare child device orders.
Change is mostly done via a Coccinelle (soon to be devel/coccinelle)
semantic patch.
Verified by LINT+modules kernel builds.

Followup to:	r212213
MFC after:	10 days
2010-09-10 11:19:03 +00:00
Maksim Yevmenkin
93dbfc7840 Add custom kernel configuration and device tree source files for
Seagate FreeAgent DockStar(tm) device. It seems to be a dumb down
version of Marvell SheevaPlug. Device tree source file could use
more tweaking, but at least it wll network boot and run FreeBSD/arm.
2010-09-08 19:50:47 +00:00
Konstantin Belousov
ee235befcb Supply some useful information to the started image using ELF aux vectors.
In particular, provide pagesize and pagesizes array, the canary value
for SSP use, number of host CPUs and osreldate.

Tested by:	marius (sparc64)
MFC after:	1 month
2010-08-17 08:55:45 +00:00
John Baldwin
60c7b36b7a Update various places that store or manipulate CPU masks to use cpumask_t
instead of int or u_int.  Since cpumask_t is currently u_int on all
platforms this should just be a cosmetic change.
2010-08-11 23:22:53 +00:00
John Baldwin
a3870a1826 Very rough first cut at NUMA support for the physical page allocator. For
now it uses a very dumb first-touch allocation policy.  This will change in
the future.
- Each architecture indicates the maximum number of supported memory domains
  via a new VM_NDOMAIN parameter in <machine/vmparam.h>.
- Each cpu now has a PCPU_GET(domain) member to indicate the memory domain
  a CPU belongs to.  Domain values are dense and numbered from 0.
- When a platform supports multiple domains, the default freelist
  (VM_FREELIST_DEFAULT) is split up into N freelists, one for each domain.
  The MD code is required to populate an array of mem_affinity structures.
  Each entry in the array defines a range of memory (start and end) and a
  domain for the range.  Multiple entries may be present for a single
  domain.  The list is terminated by an entry where all fields are zero.
  This array of structures is used to split up phys_avail[] regions that
  fall in VM_FREELIST_DEFAULT into per-domain freelists.
- Each memory domain has a separate lookup-array of freelists that is
  used when fulfulling a physical memory allocation.  Right now the
  per-domain freelists are listed in a round-robin order for each domain.
  In the future a table such as the ACPI SLIT table may be used to order
  the per-domain lookup lists based on the penalty for each memory domain
  relative to a specific domain.  The lookup lists may be examined via a
  new vm.phys.lookup_lists sysctl.
- The first-touch policy is implemented by using PCPU_GET(domain) to
  pick a lookup list when allocating memory.

Reviewed by:	alc
2010-07-27 20:33:50 +00:00
Andrew Turner
fa94061701 Allow external interrupts.
- Set the external pin to interrupt in bus_setup_intr
- Implement bus_config_intr for external interrupts
- Extend arm_{,un}mask_irq to work with external interrupts

Approved by:	imp (mentor)
2010-07-24 23:41:09 +00:00
Andrew Turner
6fc08f19a6 Add the s3c24x0 real time clock driver
Approved by:	imp (mentor)
2010-07-22 23:23:39 +00:00
Andrew Turner
6eae6892da Rework how device memory is allocated on the s3c24x0 CPU's.
The device virtual addresses are now able to be allocated at runtime rather
than from the static pmap_devmap at boot. The only exception is memory
required before we have had a chance to dynamically allocate it.

While here reduce the space between the statically allocated devices by
reducing the distance between the virtual addresses.

Approved by:	imp (mentor)
2010-07-22 23:12:19 +00:00
Alexander Motin
599cf0f197 Fix several un-/signedness bugs of r210290 and r210293. Add one more check. 2010-07-20 15:48:29 +00:00
Alexander Motin
e9f0d5658c Refactor Marvell ARM SoC timer driver to the new timer infrastructure. 2010-07-20 11:46:45 +00:00
Rafal Jaworowski
294e2f046a Now that we are fully FDT-driven on MRVL platforms, remove PHYSMEM_SIZE option. 2010-07-19 19:19:33 +00:00
Rafal Jaworowski
4f124b977c Eliminate FDT_IMMR_VA define.
This removes platform dependencies from <machine>/fdt.h for the benfit of
portability.
2010-07-19 18:47:18 +00:00
Rafal Jaworowski
5710b7a9da Move MRVL FDT fixups and PIC decode routine to a platform specific area.
This allows for better encapsulation (and eliminates generic fdt_arm.c, at
least for now).
2010-07-19 18:41:50 +00:00
Olivier Houchard
50a9f1c71c Import preliminary support for Atmel AT91SAM9G20 cpu, and the Hot-e HL201.
This fine work was done by Yohanes Nugroho <yohanes a gmail dot com>
Many thanks to John Nicholls and Thinlinx for providing sample hardware.
2010-07-14 00:48:53 +00:00
Rafal Jaworowski
6031d0b167 Get rid of bootinfo for good in loader (U-Boot-based) and ARM.
For FDT-enabled platforms the device tree is a modern replacement for bootinfo
config data.
2010-07-11 21:11:23 +00:00
John Baldwin
fc0de8f0b6 Move prototypes for kern_sigtimedwait() and kern_sigprocmask() to
<sys/syscallsubr.h> where all other kern_<syscall> prototypes live.
2010-06-30 18:03:42 +00:00
Rafal Jaworowski
915421bba4 Move ARM nexus rman initialization to attach routine.
This fixes a panic, which started to trigger after r209129 cleanup.

Submitted by:	Andrew Turner
2010-06-16 14:10:39 +00:00
Olivier Houchard
8b1f99cdd4 Turn off cache if there's more than one kernel mapping, and one is writable.
Submitted by:   Mark Tinguely
2010-06-15 22:16:02 +00:00
Rafal Jaworowski
98dbf5572a Temporarily bring back the ARM bootinfo (and make tinderbox happy).
BI will be eliminated for good when powerpc transition to FDT is complete.
2010-06-14 16:05:21 +00:00
Rafal Jaworowski
db5ef4fc77 Convert Marvell ARM platforms to FDT convention.
The following systems are involved:

  - DB-88F5182
  - DB-88F5281
  - DB-88F6281
  - DB-78100
  - SheevaPlug

This overhaul covers the following major changes:

  - All integrated peripherals drivers for Marvell ARM SoC, which are
    currently in the FreeBSD source tree are reworked and adjusted so they
    derive config data out of the device tree blob (instead of hard coded /
    tabelarized values).

  - Since the common FDT infrastrucutre (fdtbus, simplebus) is used we say
    good by to obio / mbus drivers and numerous hard-coded config data.

Note that world needs to be built WITH_FDT for the affected platforms.

Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation.
2010-06-13 13:28:53 +00:00
Rafal Jaworowski
dc65d002dd Initial FDT infrastructure elements for ARM.
Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
2010-06-13 13:12:52 +00:00
Rafal Jaworowski
ddf2705d57 Improve style. 2010-06-13 13:08:23 +00:00
Alan Cox
9124d0d6a3 Relax one of the new assertions in pmap_enter() a little. Specifically,
allow pmap_enter() to be performed on an unmanaged page that doesn't have
VPO_BUSY set.  Having VPO_BUSY set really only matters for managed pages.
(See, for example, pmap_remove_write().)
2010-06-11 15:49:39 +00:00
Alan Cox
ce18658792 Reduce the scope of the page queues lock and the number of
PG_REFERENCED changes in vm_pageout_object_deactivate_pages().
Simplify this function's inner loop using TAILQ_FOREACH(), and shorten
some of its overly long lines.  Update a stale comment.

Assert that PG_REFERENCED may be cleared only if the object containing
the page is locked.  Add a comment documenting this.

Assert that a caller to vm_page_requeue() holds the page queues lock,
and assert that the page is on a page queue.

Push down the page queues lock into pmap_ts_referenced() and
pmap_page_exists_quick().  (As of now, there are no longer any pmap
functions that expect to be called with the page queues lock held.)

Neither pmap_ts_referenced() nor pmap_page_exists_quick() should ever
be passed an unmanaged page.  Assert this rather than returning "0"
and "FALSE" respectively.

ARM:

Simplify pmap_page_exists_quick() by switching to TAILQ_FOREACH().

Push down the page queues lock inside of pmap_clearbit(), simplifying
pmap_clear_modify(), pmap_clear_reference(), and pmap_remove_write().
Additionally, this allows for avoiding the acquisition of the page
queues lock in some cases.

PowerPC/AIM:

moea*_page_exits_quick() and moea*_page_wired_mappings() will never be
called before pmap initialization is complete.  Therefore, the check
for moea_initialized can be eliminated.

Push down the page queues lock inside of moea*_clear_bit(),
simplifying moea*_clear_modify() and moea*_clear_reference().

The last parameter to moea*_clear_bit() is never used.  Eliminate it.

PowerPC/BookE:

Simplify mmu_booke_page_exists_quick()'s control flow.

Reviewed by:	kib@
2010-06-10 16:56:35 +00:00
Alan Cox
85a71b2578 Don't set PG_WRITEABLE in pmap_enter() unless the page is managed.
Correct a typo in a nearby comment on sparc64.
2010-06-05 18:20:09 +00:00
Alan Cox
6e6a072d39 In pmap_enter_locked(), don't require the vector page to be VPO_BUSY. 2010-06-01 05:32:59 +00:00
Alan Cox
c46b90e90a Push down page queues lock acquisition in pmap_enter_object() and
pmap_is_referenced().  Eliminate the corresponding page queues lock
acquisitions from vm_map_pmap_enter() and mincore(), respectively.  In
mincore(), this allows some additional cases to complete without ever
acquiring the page queues lock.

Assert that the page is managed in pmap_is_referenced().

On powerpc/aim, push down the page queues lock acquisition from
moea*_is_modified() and moea*_is_referenced() into moea*_query_bit().
Again, this will allow some additional cases to complete without ever
acquiring the page queues lock.

Reorder a few statements in vm_page_dontneed() so that a race can't lead
to an old reference persisting.  This scenario is described in detail by a
comment.

Correct a spelling error in vm_page_dontneed().

Assert that the object is locked in vm_page_clear_dirty(), and restrict the
page queues lock assertion to just those cases in which the page is
currently writeable.

Add object locking to vnode_pager_generic_putpages().  This was the one
and only place where vm_page_clear_dirty() was being called without the
object being locked.

Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call
to vm_page_clear_dirty().

Change vnode_pager_generic_putpages() to the modern-style of function
definition.  Also, change the name of one of the parameters to follow
virtual memory system naming conventions.

Reviewed by:	kib
2010-05-26 18:00:44 +00:00
Rafal Jaworowski
04cb90189b Initial loader(8) support for Flattened Device Tree.
o This is disabled by default for now, and can be enabled using WITH_FDT at
  build time.

o Tested with ARM and PowerPC.

Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
2010-05-25 15:21:39 +00:00
Alan Cox
567e51e18c Roughly half of a typical pmap_mincore() implementation is machine-
independent code.  Move this code into mincore(), and eliminate the
page queues lock from pmap_mincore().

Push down the page queues lock into pmap_clear_modify(),
pmap_clear_reference(), and pmap_is_modified().  Assert that these
functions are never passed an unmanaged page.

Eliminate an inaccurate comment from powerpc/powerpc/mmu_if.m:
Contrary to what the comment says, pmap_mincore() is not simply an
optimization.  Without a complete pmap_mincore() implementation,
mincore() cannot return either MINCORE_MODIFIED or MINCORE_REFERENCED
because only the pmap can provide this information.

Eliminate the page queues lock from vfs_setdirty_locked_object(),
vm_pageout_clean(), vm_object_page_collect_flush(), and
vm_object_page_clean().  Generally speaking, these are all accesses
to the page's dirty field, which are synchronized by the containing
vm object's lock.

Reduce the scope of the page queues lock in vm_object_madvise() and
vm_page_dontneed().

Reviewed by:	kib (an earlier version)
2010-05-24 14:26:57 +00:00
Konstantin Belousov
afe1a68827 Reorganize syscall entry and leave handling.
Extend struct sysvec with three new elements:
sv_fetch_syscall_args - the method to fetch syscall arguments from
  usermode into struct syscall_args. The structure is machine-depended
  (this might be reconsidered after all architectures are converted).
sv_set_syscall_retval - the method to set a return value for usermode
  from the syscall. It is a generalization of
  cpu_set_syscall_retval(9) to allow ABIs to override the way to set a
  return value.
sv_syscallnames - the table of syscall names.

Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding
the call to cpu_set_syscall_retval().

The new functions syscallenter(9) and syscallret(9) are provided that
use sv_*syscall* pointers and contain the common repeated code from
the syscall() implementations for the architecture-specific syscall
trap handlers.

Syscallenter() fetches arguments, calls syscall implementation from
ABI sysent table, and set up return frame. The end of syscall
bookkeeping is done by syscallret().

Take advantage of single place for MI syscall handling code and
implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the
thread is stopped at syscall entry or return point respectively.  The
EXEC flag augments SCX and notifies debugger that the process address
space was changed by one of exec(2)-family syscalls.

The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
changed to use syscallenter()/syscallret(). MIPS and arm are not
converted and use the mostly unchanged syscall() implementation.

Reviewed by:	jhb, marcel, marius, nwhitehorn, stas
Tested by:	marcel (ia64), marius (sparc64), nwhitehorn (powerpc),
	stas (mips)
MFC after:	1 month
2010-05-23 18:32:02 +00:00
Alan Cox
9ab6032f73 On entry to pmap_enter(), assert that the page is busy. While I'm
here, make the style of assertion used by pmap_enter() consistent
across all architectures.

On entry to pmap_remove_write(), assert that the page is neither
unmanaged nor fictitious, since we cannot remove write access to
either kind of page.

With the push down of the page queues lock, pmap_remove_write() cannot
condition its behavior on the state of the PG_WRITEABLE flag if the
page is busy.  Assert that the object containing the page is locked.
This allows us to know that the page will neither become busy nor will
PG_WRITEABLE be set on it while pmap_remove_write() is running.

Correct a long-standing bug in vm_page_cowsetup().  We cannot possibly
do copy-on-write-based zero-copy transmit on unmanaged or fictitious
pages, so don't even try.  Previously, the call to pmap_remove_write()
would have failed silently.
2010-05-16 23:45:10 +00:00
Olivier Houchard
1c5ddcdfac Catchup with new prototype for db_printf(). 2010-05-14 00:00:19 +00:00
Kevin Lo
6c0458664f The FA526 belongs to the ARM9TDMI family 2010-05-12 05:50:56 +00:00
Alan Cox
3c4a24406b Push down the page queues into vm_page_cache(), vm_page_try_to_cache(), and
vm_page_try_to_free().  Consequently, push down the page queues lock into
pmap_enter_quick(), pmap_page_wired_mapped(), pmap_remove_all(), and
pmap_remove_write().

Push down the page queues lock into Xen's pmap_page_is_mapped().  (I
overlooked the Xen pmap in r207702.)

Switch to a per-processor counter for the total number of pages cached.
2010-05-08 20:34:01 +00:00
Kevin Lo
64c68f1c50 Add support for FA626TE.
Tested on GM8181 development board.
2010-05-04 10:14:05 +00:00
Maxim Sobolev
e50d35e6c6 Add new tunable 'net.link.ifqmaxlen' to set default send interface
queue length. The default value for this parameter is 50, which is
quite low for many of today's uses and the only way to modify this
parameter right now is to edit if_var.h file. Also add read-only
sysctl with the same name, so that it's possible to retrieve the
current value.

MFC after:	1 month
2010-05-03 07:32:50 +00:00
Alexander Motin
dd48af360f Import mvs(4) - Marvell 88SX50XX/88SX60XX/88SX70XX/SoC SATA controllers
driver for CAM ATA subsystem. This driver supports same hardware as
atamarvell, ataadaptec and atamvsata drivers from ata(4), but provides
many additional features, such as NCQ, PMP, etc.
2010-05-02 19:28:30 +00:00
Kip Macy
2965a45315 On Alan's advice, rather than do a wholesale conversion on a single
architecture from page queue lock to a hashed array of page locks
(based on a patch by Jeff Roberson), I've implemented page lock
support in the MI code and have only moved vm_page's hold_count
out from under page queue mutex to page lock. This changes
pmap_extract_and_hold on all pmaps.

Supported by: Bitgravity Inc.

Discussed with: alc, jeffr, and kib
2010-04-30 00:46:43 +00:00
Konstantin Belousov
8bac98182a Style: use #define<TAB> instead of #define<SPACE>.
Noted by:	bde, pluknet gmail com
MFC after:	11 days
2010-04-27 09:48:43 +00:00
Alan Cox
7b85f59183 Resurrect pmap_is_referenced() and use it in mincore(). Essentially,
pmap_ts_referenced() is not always appropriate for checking whether or
not pages have been referenced because it clears any reference bits
that it encounters.  For example, in mincore(), clearing the reference
bits has two negative consequences.  First, it throws off the activity
count calculations performed by the page daemon.  Specifically, a page
on which mincore() has called pmap_ts_referenced() looks less active
to the page daemon than it should.  Consequently, the page could be
deactivated prematurely by the page daemon.  Arguably, this problem
could be fixed by having mincore() duplicate the activity count
calculation on the page.  However, there is a second problem for which
that is not a solution.  In order to clear a reference on a 4KB page,
it may be necessary to demote a 2/4MB page mapping.  Thus, a mincore()
by one process can have the side effect of demoting a superpage
mapping within another process!
2010-04-24 17:32:52 +00:00
Konstantin Belousov
ed7806879b Move the constants specifying the size of struct kinfo_proc into
machine-specific header files. Add KINFO_PROC32_SIZE for struct
kinfo_proc32 for architectures providing COMPAT_FREEBSD32. Add
CTASSERT for the size of struct kinfo_proc32.

Submitted by:	pluknet
Reviewed by:	imp, jhb, nwhitehorn
MFC after:	2 weeks
2010-04-24 12:49:52 +00:00
Andrew Thompson
b850ecc180 Change USB_DEBUG to #ifdef and allow it to be turned off. Previously this had
the illusion of a tunable setting but was always turned on regardless.

MFC after:	1 week
2010-04-22 21:31:34 +00:00
Warner Losh
b938b7a366 Add BUS_SPACE_UNRESTRICTED and define it to be ~0, just like all the
other platforms.
2010-04-08 19:34:55 +00:00
Alexander Motin
0295bbe76f Oops! Wrong copy-paste in r206053. 2010-04-01 19:05:43 +00:00
Alexander Motin
4703a67a9a Fill extended ATA command registers in cPRD to support 48bit commands. 2010-04-01 18:17:53 +00:00
Warner Losh
309e3c39d2 Build modules for this config to make sure they stay buildable... 2010-03-29 22:29:41 +00:00
Rui Paulo
c0873717d4 Pass the correct pointer to fled_cb(). 2010-03-26 18:49:43 +00:00
Nathan Whitehorn
a107d8aac9 Change the arguments of exec_setregs() so that it receives a pointer
to the image_params struct instead of several members of that struct
individually. This makes it easier to expand its arguments in the future
without touching all platforms.

Reviewed by:	jhb
2010-03-25 14:24:00 +00:00
Olivier Houchard
504e4c2e74 Make sure we insert and remove the PV entries related to unmanaged kernel
mappings into the kernel pmap, not into the pmap related to the
pmap_enter_pv()/pmap_remove_pv() call.
2010-03-21 21:03:35 +00:00
Warner Losh
30e980f2d1 Add support for the Samsung S3C2xx0 family of ARM SoCs written by
Andrew Turner.  The kernel supports the LN2410SBC evaluation board,
and likely others.  These parts (or similar ones) are in some open
hardware designs for phones.

Submitted by:	Andrew Turner
2010-03-20 03:39:35 +00:00
Bernd Walter
41ecefa550 fix type in comment 2010-03-12 22:39:35 +00:00
Rafal Jaworowski
da10e7e2d6 Fix ARM cache handling yet more.
1) vm_machdep.c: remove the dangling allocations so they do not
   un-necessarily turn off the cache upon consecutive access.

2) busdma_machdep.c: remove the same amount than shadow mapped.

Reported by:	Maks Verver
Submitted by:	Mark Tinguely
Reviewed by:	Grzegorz Bernacki
MFC after:	3 days
2010-03-11 21:16:54 +00:00
Rafal Jaworowski
43404d7e0b Let detailed info about CPU features print on Marvell Sheeva CPU as well.
Provide missing entry in the cpu_classes[].

Reported by:	Maks Verver
MFC after:	1 week
2010-03-11 21:04:29 +00:00
Rafal Jaworowski
bd50890544 Provide correct TCLK value for Kirkwood A1 silicon revision.
While there improve SOC ID output accordingly.

Obtained from:	Semihalf
MFC after:	1 week
2010-03-05 19:45:45 +00:00
Bernd Walter
80720032c0 simplify hash calculation 2010-02-28 18:06:54 +00:00
Bernd Walter
f7255c488c remove debug leftover 2010-02-28 16:14:34 +00:00
Bernd Walter
31e94ae09f Fix multicast hashes.
Atmel uses a simple xor hash instead of the typical crc based one.
2010-02-28 16:11:13 +00:00
Rafal Jaworowski
a9024075a7 Do not force verbose and single mode in non-metadata boot case.
We want to go multi-user by default also in case of booting without loader(8).
2010-02-24 20:31:00 +00:00
Rebecca Cran
6c655111ad Update the commented out option for omitting the sysctl descriptions; it
was committed as NO_SYSCTL_DESCR.

Approved by:	rrs (mentor)
2010-02-24 19:28:15 +00:00
Rui Paulo
4cd0801096 Fix previous commit: led_func() doesn't exist, it should be fled_cb().
Pointed out by:	bz
2010-02-22 14:49:52 +00:00
Kevin Lo
16445d1e64 Show the cpu info for fa526
Submitted by:	Yohanes Nugroho <yohanes at gmail dot com>
2010-02-20 14:54:11 +00:00
Kevin Lo
dbb0e359a7 Correct both FA526/FA626TE cpu ids since the cpu id is always
masked with 0xfffffff0
2010-02-20 14:52:07 +00:00
Warner Losh
d01c5f360e The NetBSD Foundation has granted permission to remove clauses 3 and 4.
Obtained from:	NetBSD
2010-02-16 21:59:17 +00:00
Attilio Rao
c1210a7d97 Adjust style (following the already existing rules) for the newly
introduced option DEADLKRES.

Reported by:	danfe, julian, avg
2010-02-15 23:44:48 +00:00
Kevin Lo
4e92112d57 Correct cpu id for FA526.
While I'm here, add cpu id for FA626TE.
2010-02-14 05:02:08 +00:00
Attilio Rao
88cbfa852e Add the options DEADLKRES (introducing the deadlock resolver thread) in
the 'debugging' section of any HEAD kernel and enable for the mainstream
ones, excluding the embedded architectures.
It may, of course, enabled on a case-by-case basis.

Sponsored by:	Sandvine Incorporated
Requested by:	emaste
Discussed with:	kib
2010-02-10 16:30:04 +00:00
Rui Paulo
fb023b5a3c Turn on the front LED at boot time like we do with the Avila. 2010-02-10 11:40:18 +00:00
Rafal Jaworowski
530a5fb294 Improve checking whether an ARM VA has a valid mapping before performing cache
sync.

VIPT/PIPT caches need valid VA-PA mapping in PTE for a cache operation to
succeed (unlike VIVT). Prior to this fix pmap was using l2pte_valid() for that
check, but this is not sufficient as the function merely checks if a PTE
exists (there can be existing but _invalid_ entries in the table).

A new pmap_has_valid_mapping() routine is introduced to do this job right by
checking proper PTE flags.

Among other potential problems this cures coherency issues with L2 caches on
MV-78100.

Submitted by:	Grzegorz Bernacki, Piotr Ziecik
Reviewed, tested by:	marcel
Obtained from:	Semihalf
MFC after:	1 week
2010-02-07 20:48:57 +00:00
Marcel Moolenaar
882561186b When backtracing self, start with the current frame (i.e. the
frame of db_trace_self()) and not the caller's frame. The use
of builtin_frame_address(1) to get the caller's frame is not
reliable and can cause panics.
2010-01-29 16:14:35 +00:00
John Baldwin
13c18821fa Move the examples for the 'hints' and 'env' keywords from various GENERIC
kernel configs into NOTES.

Reviewed by:	imp
2010-01-19 17:20:34 +00:00
Olivier Houchard
59a5c7f90e Do not free the dmamap if it is still busy.
Submitted by:	Mark Tinguely
MFC after:	3 days
2010-01-15 12:39:48 +00:00
Warner Losh
56eff2143f Revert 200594. This file isn't intended for these sorts of things. 2010-01-04 21:30:04 +00:00
Rui Paulo
16ffe04c2f Remove CNS11XXNAS.hints. 2010-01-04 03:40:46 +00:00
Rui Paulo
381a19cce0 Add support for Cavium Econa CNS11XX ARM boards. These boards were
previously know by StarSemi STR9104.

Tested by the submitter on an Emprex NSD-100 board.

Submitted by:	Yohanes Nugroho <yohanes at gmail.com>
Reviewed by:	freebsd-arm, stas
Obtained from:	//depot/projects/str91xx/...
2010-01-04 03:35:45 +00:00
Robert Noland
cfd7bacef2 Update d_mmap() to accept vm_ooffset_t and vm_memattr_t.
This replaces d_mmap() with the d_mmap2() implementation and also
changes the type of offset to vm_ooffset_t.

Purge d_mmap2().

All driver modules will need to be rebuilt since D_VERSION is also
bumped.

Reviewed by:	jhb@
MFC after:	Not in this lifetime...
2009-12-29 21:51:28 +00:00
Rui Paulo
0ce207d2af Intel XScale hwpmc(4) support.
This brings hwpmc(4) support for 2nd and 3rd generation XScale cores.
Right now it's enabled by default to make sure we test this a bit.
When the time comes it can be disabled by default.
Tested on Gateworks boards.

A man page is coming.

Obtained from:	//depot/user/rpaulo/xscalepmc/...
2009-12-23 23:16:54 +00:00
Doug Barton
f1bdf073c1 Add INCLUDE_CONFIG_FILE, and a note in comments about how to also
include the comments with CONFIGARGS
2009-12-16 02:17:43 +00:00
Alexander Motin
402ea18cbd Fix the build. 2009-12-08 21:42:04 +00:00
Alexander Motin
066f913a94 MFp4:
Introduce ATA_CAM kernel option, turning ata(4) controller drivers into
cam(4) interface modules. When enabled, this options deprecates all ata(4)
peripheral drivers (ad, acd, ...) and interfaces and allows cam(4) drivers
(ada, cd, ...) and interfaces to be natively used instead.

As side effect of this, ata(4) mode setting code was completely rewritten
to make controller API more strict and permit above change. While doing
this, SATA revision was separated from PATA mode. It allows DMA-incapable
SATA devices to operate and makes hw.ata.atapi_dma tunable work again.

Also allow ata(4) controller drivers (except some specific or broken ones)
to handle larger data transfers. Previous constraint of 64K was artificial
and is not really required by PCI ATA BM specification or hardware.

Submitted by:	nwitehorn (powerpc part)
2009-12-06 00:10:13 +00:00
Andrew Thompson
1813b93dc1 Add missing ath_ar9* ath hal entries. 2009-12-02 00:38:11 +00:00
Andrew Thompson
d3da916e6e Remove unknown ath hal device entries. 2009-12-02 00:37:03 +00:00
Alan Cox
e2997fea72 Simplify the invocation of vm_fault(). Specifically, eliminate the flag
VM_FAULT_DIRTY.  The information provided by this flag can be trivially
inferred by vm_fault().

Discussed with:	kib
2009-11-27 20:24:11 +00:00
John Baldwin
3064f0530b - Initialize callout before it is used in atestop() during attach.
- Reorder detach so that ether_ifdetach() is called first.  This removes
  the race that ATE_FLAG_DETACHING closed, so that flag can be removed.
- Trim a duplicate clearing of IFF_DRV_RUNNING.

Reviewed by:	imp
2009-11-19 22:04:02 +00:00
John Baldwin
e21e2eea58 These drivers only set if_timer but never set if_watchdog. Just remove
the assignments to if_timer.
2009-11-19 18:11:23 +00:00
Konstantin Belousov
a7b890448c Extract the code that records syscall results in the frame into MD
function cpu_set_syscall_retval().

Suggested by:	marcel
Reviewed by:	marcel, davidxu
PowerPC, ARM, ia64 changes:	marcel
Sparc64 tested and reviewed by:	marius, also sunv reviewed
MIPS tested by:	gonzo
MFC after:	1 month
2009-11-10 11:43:07 +00:00
Marcel Moolenaar
4590f2282a Fix gdb_cpu_getreg() to actually match GDB's register
definition.
2009-11-05 06:31:50 +00:00
Marcel Moolenaar
2ffa44209a Implement db_trace_thread() by calling db_stack_trace_cmd() and
passing a frame pointer that comes from the thread context. This
fixes DDB backtraces by not unwinding debugger functions first.
2009-11-05 06:27:46 +00:00
Marcel Moolenaar
faa7ba7a3f Implement db_trace_self() by calling db_stack_trace_cmd()
and not db_trace_thread().
2009-11-05 06:23:02 +00:00
Alan Cox
c346328f95 Eliminate an unnecessary vm include file. 2009-11-04 04:41:03 +00:00
Alexander Motin
ebbb35ba70 MFp4:
- Remove most of direct relations between ATA(4) peripherial and controller
levels. It makes logic more transparent and is a mandatory step to wrap
ATA(4) controller level into ATA-native CAM SIM.
- Tune AHCI and SATA2 SiI drivers memory allocation a bit to allow bigger
I/O transaction sizes without additional cost.
2009-10-31 13:24:14 +00:00
Konstantin Belousov
d6e029adbe In r197963, a race with thread being selected for signal delivery
while in kernel mode, and later changing signal mask to block the
signal, was fixed for sigprocmask(2) and ptread_exit(3). The same race
exists for sigreturn(2), setcontext(2) and swapcontext(2) syscalls.

Use kern_sigprocmask() instead of direct manipulation of td_sigmask to
reschedule newly blocked signals, closing the race.

Reviewed by:	davidxu
Tested by:	pho
MFC after:	1 month
2009-10-27 10:47:58 +00:00
Marcel Moolenaar
191bb483f2 Review previous change. It has no relation to the I-cache coherency
changes and thus unintentional.

Spotted by: rdivacky@
2009-10-21 18:44:00 +00:00
Marcel Moolenaar
1a4fcaebe3 o Introduce vm_sync_icache() for making the I-cache coherent with
the memory or D-cache, depending on the semantics of the platform.
    vm_sync_icache() is basically a wrapper around pmap_sync_icache(),
    that translates the vm_map_t argumument to pmap_t.
o   Introduce pmap_sync_icache() to all PMAP implementation. For powerpc
    it replaces the pmap_page_executable() function, added to solve
    the I-cache problem in uiomove_fromphys().
o   In proc_rwmem() call vm_sync_icache() when writing to a page that
    has execute permissions. This assures that when breakpoints are
    written, the I-cache will be coherent and the process will actually
    hit the breakpoint.
o   This also fixes the Book-E PMAP implementation that was missing
    necessary locking while trying to deal with the I-cache coherency
    in pmap_enter() (read: mmu_booke_enter_locked).

The key property of this change is that the I-cache is made coherent
*after* writes have been done. Doing it in the PMAP layer when adding
or changing a mapping means that the I-cache is made coherent *before*
any writes happen. The difference is key when the I-cache prefetches.
2009-10-21 18:38:02 +00:00
John Baldwin
b675ebba10 Sync with other GENERIC kernel configs:
- Move USB serial drivers earlier to match their placement in other kernel
  configs.
- Add descriptions to various USB drivers.
- Move the USB wireless drivers into a new section.
- Add ulscom to the list of USB serial drivers.
2009-10-13 19:04:01 +00:00
Konstantin Belousov
023063938a Define architectural load bases for PIE binaries. Addresses were selected
by looking at the bases used for non-relocatable executables by gnu ld(1),
and adjusting it slightly.

Discussed with:	bz
Reviewed by:	kan
Tested by:	bz (i386, amd64), bsam (linux)
MFC after:	some time
2009-10-10 15:31:24 +00:00
Stanislav Sedov
2c7cd48d3c - Drop unused pmap_use_l1 function and comment out currently unused
pmap_dcache_wbinv_all/pmap_copy_page functions which we might want
  to take advatage of later.  This fixes the build with PMAP_DEBUG
  defined.

Discussed with:	cognet
2009-10-05 10:08:58 +00:00
Rui Paulo
bba017d6a0 Remove remaining bits of performance counter support.
Submitted by:	Tom Judge <tom at tomjudge.com>
2009-10-03 13:59:15 +00:00
Bjoern A. Zeeb
52bf2041ac Make sure that the primary native brandinfo always gets added
first and the native ia32 compat as middle (before other things).
o(ld)brandinfo as well as third party like linux, kfreebsd, etc.
stays on SI_ORDER_ANY coming last.

The reason for this is only to make sure that even in case we would
overflow the MAX_BRANDS sized array, the native FreeBSD brandinfo
would still be there and the system would be operational.

Reviewed by:	kib
MFC after:	1 month
2009-10-03 11:57:21 +00:00
Rui Paulo
f143a35bf1 Remove performance counter headers. This code came from NetBSD, but our
hardware perf. counter support is different, so we don't need these
files.

Reviewed by:	freebsd-arm (no comments)
2009-10-02 11:10:05 +00:00
Rui Paulo
98c53ad360 Promote the cpu_class local variable to global and expose it in md_var.h
Reviewed by:	freebsd-arm
2009-09-26 16:37:23 +00:00
Alan Cox
fe105d45a2 Add a new sysctl for reporting all of the supported page sizes.
Reviewed by:	jhb
MFC after:	3 weeks
2009-09-18 17:04:57 +00:00
Poul-Henning Kamp
a254d1f16d Get rid of the _NO_NAMESPACE_POLLUTION kludge by creating an
architecture specific include file containing the _ALIGN*
stuff which <sys/socket.h> needs.
2009-09-08 20:45:40 +00:00
Konstantin Belousov
8a945d109c Reintroduce the r196640, after fixing the problem with my testing.
Remove the altkstacks, instead instantiate threads with kernel stack
allocated with the right size from the start. For the thread that has
kernel stack cached, verify that requested stack size is equial to the
actual, and reallocate the stack if sizes differ [1].

This fixes the bug introduced by r173361 that was committed several days
after r173004 and consisted of kthread_add(9) ignoring the non-default
kernel stack size.

Also, r173361 removed the caching of the kernel stacks for a non-first
thread in the process. Introduce separate kernel stack cache that keeps
some limited amount of preallocated kernel stacks to lower the latency
of thread allocation. Add vm_lowmem handler to prune the cache on
low memory condition. This way, system with reasonable amount of the
threads get lower latency of thread creation, while still not exhausting
significant portion of KVA for unused kstacks.

Submitted by:	peter [1]
Discussed with:	jhb, julian, peter
Reviewed by:	jhb
Tested by:	pho (and retested according to new test scenarious)
MFC after:	1 week
2009-09-01 11:41:51 +00:00
Konstantin Belousov
f25fa6abb2 Reverse r196640 and r196644 for now. 2009-08-29 21:53:08 +00:00
Konstantin Belousov
c3cf0b476f Remove the altkstacks, instead instantiate threads with kernel stack
allocated with the right size from the start. For the thread that has
kernel stack cached, verify that requested stack size is equial to the
actual, and reallocate the stack if sizes differ [1].

This fixes the bug introduced by r173361 that was committed several days
after r173004 and consisted of kthread_add(9) ignoring the non-default
kernel stack size.

Also, r173361 removed the caching of the kernel stacks for a non-first
thread in the process. Introduce separate kernel stack cache that keeps
some limited amount of preallocated kernel stacks to lower the latency
of thread allocation. Add vm_lowmem handler to prune the cache on
low memory condition. This way, system with reasonable amount of the
threads get lower latency of thread creation, while still not exhausting
significant portion of KVA for unused kstacks.

Submitted by:	peter [1]
Discussed with:	jhb, julian, peter
Reviewed by:	jhb
Tested by:	pho
MFC after:	1 week
2009-08-29 13:28:02 +00:00