Commit Graph

1413 Commits

Author SHA1 Message Date
Konstantin Belousov
ee75e7de7b Implement the concept of the unmapped VMIO buffers, i.e. buffers which
do not map the b_pages pages into buffer_map KVA.  The use of the
unmapped buffers eliminate the need to perform TLB shootdown for
mapping on the buffer creation and reuse, greatly reducing the amount
of IPIs for shootdown on big-SMP machines and eliminating up to 25-30%
of the system time on i/o intensive workloads.

The unmapped buffer should be explicitely requested by the GB_UNMAPPED
flag by the consumer.  For unmapped buffer, no KVA reservation is
performed at all. The consumer might request unmapped buffer which
does have a KVA reserve, to manually map it without recursing into
buffer cache and blocking, with the GB_KVAALLOC flag.

When the mapped buffer is requested and unmapped buffer already
exists, the cache performs an upgrade, possibly reusing the KVA
reservation.

Unmapped buffer is translated into unmapped bio in g_vfs_strategy().
Unmapped bio carry a pointer to the vm_page_t array, offset and length
instead of the data pointer.  The provider which processes the bio
should explicitely specify a readiness to accept unmapped bio,
otherwise g_down geom thread performs the transient upgrade of the bio
request by mapping the pages into the new bio_transient_map KVA
submap.

The bio_transient_map submap claims up to 10% of the buffer map, and
the total buffer_map + bio_transient_map KVA usage stays the
same. Still, it could be manually tuned by kern.bio_transient_maxcnt
tunable, in the units of the transient mappings.  Eventually, the
bio_transient_map could be removed after all geom classes and drivers
can accept unmapped i/o requests.

Unmapped support can be turned off by the vfs.unmapped_buf_allowed
tunable, disabling which makes the buffer (or cluster) creation
requests to ignore GB_UNMAPPED and GB_KVAALLOC flags.  Unmapped
buffers are only enabled by default on the architectures where
pmap_copy_page() was implemented and tested.

In the rework, filesystem metadata is not the subject to maxbufspace
limit anymore. Since the metadata buffers are always mapped, the
buffers still have to fit into the buffer map, which provides a
reasonable (but practically unreachable) upper bound on it. The
non-metadata buffer allocations, both mapped and unmapped, is
accounted against maxbufspace, as before. Effectively, this means that
the maxbufspace is forced on mapped and unmapped buffers separately.
The pre-patch bufspace limiting code did not worked, because
buffer_map fragmentation does not allow the limit to be reached.

By Jeff Roberson request, the getnewbuf() function was split into
smaller single-purpose functions.

Sponsored by:	The FreeBSD Foundation
Discussed with:	jeff (previous version)
Tested by:	pho, scottl (previous version), jhb, bf
MFC after:	2 weeks
2013-03-19 14:13:12 +00:00
Konstantin Belousov
e8a4a618cf Add pmap function pmap_copy_pages(), which copies the content of the
pages around, taking array of vm_page_t both for source and
destination.  Starting offsets and total transfer size are specified.

The function implements optimal algorithm for copying using the
platform-specific optimizations.  For instance, on the architectures
were the direct map is available, no transient mappings are created,
for i386 the per-cpu ephemeral page frame is used.  The code was
typically borrowed from the pmap_copy_page() for the same
architecture.

Only i386/amd64, powerpc aim and arm/arm-v6 implementations were
tested at the time of commit. High-level code, not committed yet to
the tree, ensures that the use of the function is only allowed after
explicit enablement.

For sparc64, the existing code has known issues and a stab is added
instead, to allow the kernel linking.

Sponsored by:	The FreeBSD Foundation
Tested by:	pho (i386, amd64), scottl (amd64), ian (arm and arm-v6)
MFC after:	2 weeks
2013-03-14 20:18:12 +00:00
Attilio Rao
89f6b8632c Switch the vm_object mutex to be a rwlock. This will enable in the
future further optimizations where the vm_object lock will be held
in read mode most of the time the page cache resident pool of pages
are accessed for reading purposes.

The change is mostly mechanical but few notes are reported:
* The KPI changes as follow:
  - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
  - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
  - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
  - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED()
    (in order to avoid visibility of implementation details)
  - The read-mode operations are added:
    VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(),
    VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing requiring
  sys/mutex.h in consumers directly to cater its inlining functions
  using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h
  consumers now must include also sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into
  the compat layer because the name clash between FreeBSD and solaris
  versions must be avoided.
  At this purpose zfs redefines the vm_object locking functions
  directly, isolating the FreeBSD components in specific compat stubs.

The KPI results heavilly broken by this commit.  Thirdy part ports must
be updated accordingly (I can think off-hand of VirtualBox, for example).

Sponsored by:	EMC / Isilon storage division
Reviewed by:	jeff
Reviewed by:	pjd (ZFS specific review)
Discussed with:	alc
Tested by:	pho
2013-03-09 02:32:23 +00:00
Alexander Motin
fdc5dd2d2f MFcalloutng:
Switch eventtimers(9) from using struct bintime to sbintime_t.
Even before this not a single driver really supported full dynamic range of
struct bintime even in theory, not speaking about practical inexpediency.
This change legitimates the status quo and cleans up the code.
2013-02-28 13:46:03 +00:00
Attilio Rao
590f9303e5 Merge from vmobj-rwlock branch:
Remove unused inclusion of vm/vm_pager.h and vm/vnode_pager.h.

Sponsored by:	EMC / Isilon storage division
Tested by:	pho
Reviewed by:	alc
2013-02-26 01:00:11 +00:00
Konstantin Belousov
dd0b4fb6d5 Reform the busdma API so that new types may be added without modifying
every architecture's busdma_machdep.c.  It is done by unifying the
bus_dmamap_load_buffer() routines so that they may be called from MI
code.  The MD busdma is then given a chance to do any final processing
in the complete() callback.

The cam changes unify the bus_dmamap_load* handling in cam drivers.

The arm and mips implementations are updated to track virtual
addresses for sync().  Previously this was done in a type specific
way.  Now it is done in a generic way by recording the list of
virtuals in the map.

Submitted by:	jeff (sponsored by EMC/Isilon)
Reviewed by:	kan (previous version), scottl,
	mjacob (isp(4), no objections for target mode changes)
Discussed with:	     ian (arm changes)
Tested by:	marius (sparc64), mips (jmallet), isci(4) on x86 (jharris),
	amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)
2013-02-12 16:57:20 +00:00
Pedro F. Giffuni
646a7fea0c Clean some 'svn:executable' properties in the tree.
Submitted by:	Christoph Mallon
MFC after:	3 days
2013-01-26 22:08:21 +00:00
Jayachandran C.
9f499cc5ae Little-endian and other fixes for Broadcom XLP network driver
The changes are:
 - the microcore code loaded into the NAE has to be byteswapped
   in LE
 - the descriptors in memory for a P2P NAE descriptor has to be
   byteswapped in LE
 - the m_data pointer is already cacheline aligned, so the
   unnecessary m_adj to cacheline size can be removed
 - fix mask used to obtain physical address from the Tx freeback
   descriptor
 - fix a compile error in code under #ifdef

Obtained from:	Venkatesh J V <venkatesh.vivekanandan@broadcom.com>
2013-01-24 15:49:47 +00:00
Jayachandran C.
d4aba0f611 Fix credit configuration on Broadcom XLP CMS
The CMS output queue credit configuration register is 64 bit, so use
a 64 bit variable while updating it.
Obtained from:	Venkatesh J V <venkatesh.vivekanandan@broadcom.com>
2013-01-24 15:23:01 +00:00
Jayachandran C.
0a82286445 Broadcom XLP network driver update for XLP 8xx B1 rev
Update MDIO reset code to support Broadcom XLP B1 revisions.
Update nlm_xlpge_ioctl, nlm_xlpge_port_enable need not be
called after nlm_xlpge_init.

Obtained from:	Venkatesh J V <venkatesh.vivekanandan@broadcom.com>
2013-01-24 15:14:22 +00:00
Jayachandran C.
a10ce85526 Minor updates to the Broadcom XLP NAE driver
Remove unnecessary SGMII initialization code from nae.c. While there
clean up some prints and whitespace.
2013-01-24 14:42:58 +00:00
Jayachandran C.
301b961c3e Broadcom XLP updates for the new firmware
Support few more versions of board firmware.  In case the security
block is disabled, enable it at boot. Also increase the excluded
memory region to cover the area used by the firmware to initialize
devices.
2013-01-24 14:33:25 +00:00
Jayachandran C.
d30a6b30fc Little-endian fix for PCI on Broadcom XLP.
Update the function xlp_pcib_hardware_swap_enable() to do nothing
when BYTE_ORDER is not BIG_ENDIAN. PCIe hardware swap is not requred
in little-endian mode as the endianness matches that of CPU.
2013-01-24 11:42:16 +00:00
Robert Watson
1aef4ac1bc Partially merge Perforce changeset 219938 to head:
Write FDT attachment for the Terasic MTL (multitouch LCD) driver.
  Exploit the fact that FDT allows multiple memory ranges to be
  assigned to a device, giving us a cleaner description than
  device.hints does.

Portions of this changeset that remove mtl from BERI device.hints and
add to DTS will be merged separately.

Sponsored by:	DARPA, AFRL
2013-01-13 16:27:56 +00:00
Robert Watson
c4b4976dee Merge Perforce changeset 219922 to head:
Update nexus parts in copied DE4LED attachment to use FDT.

Sponsored by:	DARPA, AFRL
2013-01-13 15:12:35 +00:00
Robert Watson
100bfa3f87 Merge Perforce changeset 219918 to head:
Naive first cut at an FDT bus attachment for the Altera JTAG UART.

Sponsored by:	DARPA, AFRL
2013-01-13 15:08:17 +00:00
Alan Cox
fa12b7f6e0 Define VM_KMEM_SIZE_MAX as a fraction of the kernel address space size
rather than a constant so that VM_KMEM_SIZE_MAX will scale automatically
with the kernel address space size.  This is particularly important for
MIPS because the same definition is used by both 32- and 64-bit kernels.

Tested by:	jchandra
2013-01-12 18:06:21 +00:00
Robert Watson
c36a1b5c66 Merge Perforce changeset 219925 to head:
Provided a bus_space implementation for FDT, modelled on
  bus_space_generic, but with a local version of the map address
  routine that does a P->V translation, as is the case with NLM's
  similar routine for XLP.  It's not clear to me that this is the
  right solution -- possibly this belongs in simplebus -- however,
  it is sufficient to get the DE4 LED driver working.

Sponsored by:	DARPA, AFRL
2013-01-12 15:58:20 +00:00
Robert Watson
0fe7f25666 Merge Perforce chance 219924 to head:
In a sign of weakness, replicate the MIPS bus_space_generic.c to
  produce a new FDT version, which will perform necessary address
  space translation for bus_space -- the solution used in NLM's MIPS
  FDT support, but possibly not quite the right thing.  This is
  inconsistent with regular I/O via the nexus and the generic
  bus_space, which instead perform translation via pmap_mapdev()
  when a resource is activated.  However, it will work while I
  attempt to identify what the right way to reconcile possible
  approaches.

  (Another approach might be to make simplebus use Nexus's activate
  routine instead of a generic one?)

Sponsored by:	DARPA, AFRL
2013-01-12 15:53:45 +00:00
Robert Watson
4d19b97f11 Merge Perforce change @219948 to head:
Add code so that the BERI boot process can ask the kernel linker for
  DTB blobs that may have been left for it by the boot loader, as done
  on PowerPC and ARM.  This will require both a more mature boot
  loader, and more mature boot loader argument passing mechanism,
  than currently supported on BERI.

Sponsored by:	DARPA, AFRL
2013-01-12 13:20:21 +00:00
Robert Watson
f73faab74b Merge Perforce change @219935 to head:
Initialise Openfirmware/FDT code earlier in the FreeBSD/beri boot,
  so that the results will be available for configuring the console
  UART (eventually).

  Suggested by:   thompsa

Sponsored by:	DARPA, AFRL
2013-01-12 12:34:59 +00:00
Monthadar Al Jaberi
3fbbb3be4f Mips Atheros AR71XX: make PCI base slot configurable through hints.
* Mikrotik RouterBoard 433AH have PCI slot 18 wired to INT0 on the PCI Bus.
  This is different from e.g. Atheros PB42 and Ubiquiti boards.
* Check for hint hint.pcib.0.baseslot=X, where X is number of base slot;
* If hint not supplied print a warning and use default AR71XX_PCI_BASE_SLOT;

PR:		kern/174978
Approved by:	adrian (mentor)
2013-01-06 20:50:31 +00:00
Juli Mallett
b29648facd Add basic support for the Ubiquiti EdgeRouter Lite.
Note that USB does not currently work, and the flash is connected via USB, so
local storage is not working.
2013-01-02 23:17:50 +00:00
Robert Watson
d8c7c88283 Merge @219932 from Perforce:
FDT headers can't be included if the kernel is compiled without
  FDT support, due to dependence on generated kobj headers.  BERI
  supports both FDT and non-FDT kernels.

  Spotted by:	bz
2013-01-01 19:42:06 +00:00
Robert Watson
9eb71e68fe If FDT is compiled into a FreeBSD/beri kernel, initialise OpenFirmware.
Sponsored by:	DARPA, AFRL
2012-12-31 11:06:37 +00:00
Alan Cox
c2c46ecd68 Eliminate some definitions that haven't been used in a decade or more. 2012-12-19 05:07:27 +00:00
Gleb Smirnoff
eb1b1807af Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually
2012-12-05 08:04:20 +00:00
Juli Mallett
85129b4bd4 Use bootverbose to control debug printfs from the Cavium Simple Executive
code.  Also remove an unnecessary CVMX_ENABLE_DEBUG_PRINTS conditional around
what is already a cvmx_dprintf.
2012-11-24 02:12:24 +00:00
Juli Mallett
01a310bd57 o) Add support for specifying a model of Octeon to target at compile-time,
reducing the number of runtime checks done by the SDK code.
o) Group board/CPU information at early startup by subject matter, so that e.g.
   CPU information is adjacent to CPU information and board information is
   adjacent to board information.
2012-11-24 02:00:29 +00:00
Juli Mallett
fcd5eed4f0 Prevent hang on ATCA-7220 when transmitting packets < 60 bytes. 2012-11-19 08:30:29 +00:00
Juli Mallett
d0ebf478da Remove redundant printf of SDK version which already appears earlier in boot. 2012-11-19 08:29:53 +00:00
Juli Mallett
8aff4e5fdd Add basic support for the Radisys-specific PCI console mechanism found on the
Radisys ATCA-7220.
2012-11-19 01:58:20 +00:00
Juli Mallett
8961aadb2b o) Do boot descriptor parsing before console setup so that we can use a console
other than UART 0 from the outset.
o) Print board information from sysinfo after consoles have been initialized
   rather than doing it during boot descriptor parsing.
o) Use cvmx_safe_printf and platform_reset rather than panic when doing very
   early boot descriptor parsing before the console is set up.
o) Get rid of the global octeon_bootinfo.
2012-11-19 00:19:27 +00:00
Juli Mallett
2d7499b141 Remove one wholly-unused and buggy routine and some nearby alternative symbols.
While here, also correct a comment that seems to imply that this file is
NetBSD's all-singing, all-dancing locore.S, rather than our conservative set of
assembly support routines.
2012-11-17 23:53:12 +00:00
Adrian Chadd
f447c9bf87 Ensure hwpmc support is correctly included. 2012-11-17 04:11:57 +00:00
Adrian Chadd
54491754cb Make MIPS24k PMC optional on "hwpmc_mips24k."
Requested by:	juli
2012-11-17 04:10:42 +00:00
Adrian Chadd
c612af968c Migrate the AR71xx UART (an 8250 derivative) to hide behind uart_ar71xx.
The AR9330/AR9331 UART is a totally different thing, so having it included
with 'uart' is not going to work out.
2012-11-17 04:05:46 +00:00
Konstantin Belousov
b32ecf44bc Flip the semantic of M_NOWAIT to only require the allocation to not
sleep, and perform the page allocations with VM_ALLOC_SYSTEM
class. Previously, the allocation was also allowed to completely drain
the reserve of the free pages, being translated to VM_ALLOC_INTERRUPT
request class for vm_page_alloc() and similar functions.

Allow the caller of malloc* to request the 'deep drain' semantic by
providing M_USE_RESERVE flag, now translated to VM_ALLOC_INTERRUPT
class. Previously, it resulted in less aggressive VM_ALLOC_SYSTEM
allocation class.

Centralize the translation of the M_* malloc(9) flags in the single
inline function malloc2vm_flags().

Discussion started by:	"Sears, Steven" <Steven.Sears@netapp.com>
Reviewed by:	alc, mdf (previous version)
Tested by:	pho (previous version)
MFC after:	2 weeks
2012-11-14 20:01:40 +00:00
Alan Cox
6961891ea1 The function pmap_alloc_direct_page() unconditionally zeroes the returned
page.  Therefore, it is really inappropriate for use by the function
uma_small_alloc().  The effect of using it was that every page was zeroed
at least once and possibly twice if M_ZERO was passed as a "wait" flag.
2012-11-14 17:33:00 +00:00
Juli Mallett
36e83ea016 Add preliminary Octeon PCI console support. Radisys-specific PCI console
support may follow soon (it uses a proprietary memory layout, but operation
looks pretty trivial.)
2012-11-13 07:39:49 +00:00
Juli Mallett
43364eac2a Add some useful options to consider. 2012-11-13 07:34:46 +00:00
Adrian Chadd
f8eb8ef711 Update AP96 to directly attach an arswitch. 2012-11-07 23:50:28 +00:00
Aleksandr Rybalko
c11de6f059 Hint miibus to attach arswitch on AP91, AP93 and RSPRO boards.
Submitted by:	Luiz Otavio O Souza
Approved by:	adrian (menthor)
2012-11-07 22:46:30 +00:00
Attilio Rao
cfedf924d3 Rework the known rwlock to benefit about staying on their own
cache line in order to avoid manual frobbing but using
struct rwlock_padalign.

Reviewed by:	alc, jimharris
2012-11-03 23:03:14 +00:00
Adrian Chadd
6ff182d83d Drop this from 500 to 128, to save a little space on memory constrained
platforms.
2012-11-02 05:23:05 +00:00
Adrian Chadd
0fd5c74381 Free the dma map -after- it's checked, not before. Or you'll be
potentially referencing already-freed memory.
2012-11-02 05:22:32 +00:00
Juli Mallett
405b925e39 Don't disable PCIe just because the host is not a PCI host; the latter flag
only applies to non-PCIe systems.  If PCIe is in target mode, it will simply
and gracefully fail to attach of its own accord.
2012-11-01 20:39:39 +00:00
Juli Mallett
ca765bc7ab Fix longstanding misprint. 2012-10-31 04:44:32 +00:00
Juli Mallett
3ef3b736dc If the CF physical base is 0, attach no CF devices. This fixes a warning
about a 0 passed to cvmx_phys_to_ptr on systems without a CF interface,
such as the RSYS4GBE.
2012-10-31 04:23:36 +00:00
Juli Mallett
3631682eab Actually check board type rather than using a specialized octeon_is_simulation
function.
2012-10-30 06:36:14 +00:00