Commit Graph

941 Commits

Author SHA1 Message Date
Alan Cox
566526a957 Migrate pmap_prefault() into the machine-independent virtual memory layer.
A small helper function pmap_is_prefaultable() is added.  This function
encapsulate the few lines of pmap_prefault() that actually vary from
machine to machine.  Note: pmap_is_prefaultable() and pmap_mincore() have
much in common.  Going forward, it's worth considering their merger.
2003-10-03 22:46:53 +00:00
Alan Cox
1b2d9f0653 Make PAGE_SIZE and related quantities signed on sparc64. (They are signed
quantities on every other architecture.)  This change is required in order
to move pmap_prefault() out of the pmap and into the machine-independent
layer.
2003-10-03 19:49:08 +00:00
Maxime Henrion
4e513b7a01 Allow the compiler to micro-optimize byte swapping functions by
evaluating them at compile time rather than at run time.  As for x86
and amd64, this requires GCC and it's enabled only if __OPTIMIZE__ is
defined (ie, if at least -O is used).

Reviewed by:	jake
2003-09-30 22:35:27 +00:00
Alan Cox
79fa677d99 Add vm object locking to pmap_release(). 2003-09-28 00:11:15 +00:00
Peter Wemm
c460ac3a00 Add sysentvec->sv_fixlimits() hook so that we can catch cases on 64 bit
systems where the data/stack/etc limits are too big for a 32 bit process.

Move the 5 or so identical instances of ELF_RTLD_ADDR() into imgact_elf.c.

Supply an ia32_fixlimits function.  Export the clip/default values to
sysctl under the compat.ia32 heirarchy.

Have mmap(0, ...) respect the current p->p_limits[RLIMIT_DATA].rlim_max
value rather than the sysctl tweakable variable.  This allows mmap to
place mappings at sensible locations when limits have been reduced.

Have the imgact_elf.c ld-elf.so.1 placement algorithm use the same
method as mmap(0, ...) now does.

Note that we cannot remove all references to the sysctl tweakable
maxdsiz etc variables because /etc/login.conf specifies a datasize
of 'unlimited'.  And that causes exec etc to fail since it can no
longer find space to mmap things.
2003-09-25 01:10:26 +00:00
Yoshihiro Takahashi
33e38a2cc8 Implement the bus_space_map() function to allocate resources and initialize
a bus_handle, but currently it does only initializing a bus_handle.
2003-09-23 08:22:34 +00:00
Jake Burkholder
cfbe010793 Remove an invalid KASSERT. Apparently pmap_remove_all gets called on
unmanaged pages.
2003-09-20 17:00:59 +00:00
Thomas Moestl
894b85b150 Handle ISA devices in OF_decode_addr(), with the same code that is
used in the EBus case.
2003-09-12 20:04:29 +00:00
Alan Cox
b9850eb224 Add a new parameter to pmap_extract_and_hold() that is needed to eliminate
Giant from vmapbuf().

Idea from:	tegge
2003-09-12 07:07:49 +00:00
Alan Cox
ba2157f218 Introduce a new pmap function, pmap_extract_and_hold(). This function
atomically extracts and holds the physical page that is associated with the
given pmap and virtual address.  Such a function is needed to make the
memory mapping optimizations used by, for example, pipes and raw disk I/O
MP-safe.

Reviewed by:	tegge
2003-09-08 02:45:03 +00:00
Bill Paul
a94100fa9b Take the support for the 8139C+/8169/8169S/8110S chips out of the
rl(4) driver and put it in a new re(4) driver. The re(4) driver shares
the if_rlreg.h file with rl(4) but is a separate module. (Ultimately
I may change this. For now, it's convenient.)

rl(4) has been modified so that it will never attach to an 8139C+
chip, leaving it to re(4) instead. Only re(4) has the PCI IDs to
match the 8169/8169S/8110S gigE chips. if_re.c contains the same
basic code that was originally bolted onto if_rl.c, with the
following updates:

- Added support for jumbo frames. Currently, there seems to be
  a limit of approximately 6200 bytes for jumbo frames on transmit.
  (This was determined via experimentation.) The 8169S/8110S chips
  apparently are limited to 7.5K frames on transmit. This may require
  some more work, though the framework to handle jumbo frames on RX
  is in place: the re_rxeof() routine will gather up frames than span
  multiple 2K clusters into a single mbuf list.

- Fixed bug in re_txeof(): if we reap some of the TX buffers,
  but there are still some pending, re-arm the timer before exiting
  re_txeof() so that another timeout interrupt will be generated, just
  in case re_start() doesn't do it for us.

- Handle the 'link state changed' interrupt

- Fix a detach bug. If re(4) is loaded as a module, and you do
  tcpdump -i re0, then you do 'kldunload if_re,' the system will
  panic after a few seconds. This happens because ether_ifdetach()
  ends up calling the BPF detach code, which notices the interface
  is in promiscuous mode and tries to switch promisc mode off while
  detaching the BPF listner. This ultimately results in a call
  to re_ioctl() (due to SIOCSIFFLAGS), which in turn calls re_init()
  to handle the IFF_PROMISC flag change. Unfortunately, calling re_init()
  here turns the chip back on and restarts the 1-second timeout loop
  that drives re_tick(). By the time the timeout fires, if_re.ko
  has been unloaded, which results in a call to invalid code and
  blows up the system.

  To fix this, I cleared the IFF_UP flag before calling ether_ifdetach(),
  which stops the ioctl routine from trying to reset the chip.

- Modified comments in re_rxeof() relating to the difference in
  RX descriptor status bit layout between the 8139C+ and the gigE
  chips. The layout is different because the frame length field
  was expanded from 12 bits to 13, and they got rid of one of the
  status bits to make room.

- Add diagnostic code (re_diag()) to test for the case where a user
  has installed a broken 32-bit 8169 PCI NIC in a 64-bit slot. Some
  NICs have the REQ64# and ACK64# lines connected even though the
  board is 32-bit only (in this case, they should be pulled high).
  This fools the chip into doing 64-bit DMA transfers even though
  there is no 64-bit data path. To detect this, re_diag() puts the
  chip into digital loopback mode and sets the receiver to promiscuous
  mode, then initiates a single 64-byte packet transmission. The
  frame is echoed back to the host, and if the frame contents are
  intact, we know DMA is working correctly, otherwise we complain
  loudly on the console and abort the device attach. (At the moment,
  I don't know of any way to work around the problem other than
  physically modifying the board, so until/unless I can think of a
  software workaround, this will have do to.)

- Created re(4) man page

- Modified rlphy.c to allow re(4) to attach as well as rl(4).

Note that this code works for the sample 8169/Marvell 88E1000 NIC
that I have, but probably won't work for the 8169S/8110S chips.
RealTek has sent me some sample NICs, but they haven't arrived yet.
I will probably need to add an rlgphy driver to handle the on-board
PHY in the 8169S/8110S (it needs special DSP initialization).
2003-09-08 02:11:25 +00:00
Thomas Moestl
2cda2e47da - Clear the CE AFSR bits which indicate the error condition when handling
a correctable DMA error. Failing to do so can cause the error interrupt
  to be triggered over and over again.
- Clean up the comments for UEAFSR_* constants, fix a typo (UEAFSR_BLK is
  (1 << 23), not (1 << 22)), and add two more. Also, add similar constants
  for the CE AFSR bits.
2003-09-04 15:25:10 +00:00
Marcel Moolenaar
2ec1dc3639 Add function OF_decode_addr(). This function obtains the physical
address of the device identified by its phandle_t by traversing OFW's
device tree. The space and address returned by this function can
subsequently be passed to sparc64_fake_bustag() to construct a valid
tag and handle for use by the newbus I/O functions.

Use of this function is expected to be limited to pre-newbus access to
devices, such as consoles and keyboards.

Partially obtained from: tmm
Reviewed by: jake, jmg, tmm
SBus testing made possible by: jake
Tested with: LINT
2003-09-02 20:32:12 +00:00
Marcel Moolenaar
11a91bffe5 Preparatory commit to allow prototypes in ofw_machdep.h to contain
both newbus types and OFW types. This involves either including
<machine/bus.h> or <dev/ofw/openfirm.h>.

Reviewed by: jake, jmg, tmm
2003-09-02 20:24:42 +00:00
Alexander Kabaev
1d49585050 Standardize idempotentcy ifdefs. Consistently use _MACHINE_VARARGS_H_
symbol.
2003-09-01 03:01:45 +00:00
Jake Burkholder
93dee5536e Implement cpu_set_upcall_kse. May need tweaking. 2003-08-31 22:58:56 +00:00
Alan Cox
411d10a600 Migrate the sf_buf allocator that is used by sendfile(2) and zero-copy
sockets into machine-dependent files.  The rationale for this
migration is illustrated by the modified amd64 allocator.  It uses the
amd64's direct map to avoid emphemeral mappings in the kernel's
address space.  On an SMP, the emphemeral mappings result in an IPI
for TLB shootdown for each transmitted page.  Yuck.

Maintainers of other 64-bit platforms with direct maps should be able
to use the amd64 allocator as a reference implementation.
2003-08-29 20:04:10 +00:00
Marcel Moolenaar
3f51411974 Allow bus barrier operations on fake tags. The purpose of a fake
bus tag is to allow bus space accesses prior to having newbus
fully initialized, such as would be the case for console drivers.
Since barriers are a fundamental part of bus space accesses, not
allowing them on fake tags would defeat the purpose of these tags.
We use the barrier function normally associated with nexus. This
is the barrier used when subordinates haven't defined a barrier
themselves.
2003-08-24 07:47:52 +00:00
John-Mark Gurney
3f6268f92a reenable the caches when a PCI peek faults. Takes my kernel compile
from 3770 real down to 1250 real.

Submitted by:	jake
2003-08-24 06:23:36 +00:00
Jake Burkholder
bf920dafa4 Add a driver for creator upa frame buffers found in many sparc64 machines.
These are fixed resolution and operate only in pixel mode so they present
a challenge to syscons (square peg, round hole, etc, etc).  The driver
provides a video driver interface for syscons and a separate character
device for X to mmap.  Wherever possible the creator's accelarated graphics
functions are used so text mode is very fast.

Based roughly on the openbsd driver.
2003-08-24 01:15:40 +00:00
Jake Burkholder
6cf3586b7e "md" files for syscons. 2003-08-24 00:47:40 +00:00
Marcel Moolenaar
ac764ac32e s#<mk48txx/mk48txxreg.h>#<dev/mk48txx/mk48txxreg.h># 2003-08-23 05:56:58 +00:00
Warner Losh
3d11ce04c6 s=include <ofw/=include <dev/ofw/= to reflect removal of -I$S/dev 2003-08-23 00:11:16 +00:00
Warner Losh
d2c5276d96 Prefer new location of pci include files (which have only been in the
tree for two or more years now), except in a few places where there's
code to be compatible with older versions of FreeBSD.
2003-08-22 07:39:05 +00:00
Alan Cox
2094a9b6ce Lock the pmap's tsb object when performing vm_page_grab() on it. 2003-08-20 06:11:39 +00:00
David E. O'Brien
05ca5d4af2 Enable OFW_NEWPCI until jmg's 2003/06/21 18:26:08 PDT bus commit is fixed
that caused a 3-4 times slow down in performance.
(the primary Sparc64 developers are all using OFW_NEWPCI already, so it is
the best code path for users)
2003-08-19 21:57:29 +00:00
Gordon Tetlow
df3d69c217 Fixup the ELF branding information to point to the new home of rtld. 2003-08-17 08:08:38 +00:00
Marcel Moolenaar
710338e94f In vm_thread_swap{in|out}(), remove the alpha specific conditional
compilation and replace it with a call to cpu_thread_swap{in|out}().
This allows us to add similar code on ia64 without cluttering the
code even more.
2003-08-16 23:15:15 +00:00
Marcel Moolenaar
26502503e5 Further cleanup <machine/cpu.h> and <machine/md_var.h>: move the MI
prototypes of cpu_halt(), cpu_reset() and swi_vm() from md_var.h to
cpu.h. This affects db_command.c and kern_shutdown.c.

ia64: move all MD prototypes from cpu.h to md_var.h. This affects
madt.c, interrupt.c and mp_machdep.c. Remove is_physical_memory().
It's not used (vm_machdep.c).

alpha: the MD prototypes have been left in cpu.h with a comment
that they should be there. Moving them is left for later. It was
expected that the impact would be significant enough to be done in
a seperate commit.

powerpc: MD prototypes left in cpu.h. Comment added.

Suggested by: bde
Tested with: make universe (pc98 incomplete)
2003-08-16 16:57:57 +00:00
Warner Losh
06b4bf3e55 Expand inline the relevant parts of src/COPYRIGHT for Matt Dillon's
copyrighted files.

Approved by: Matt Dillon
2003-08-12 23:24:05 +00:00
Jake Burkholder
3014050b19 Fix sparc64 LINT build. <blush> 2003-08-11 07:05:55 +00:00
Jake Burkholder
2974095528 Use get_mcontext in sendsig and set_mcontext in sigreturn instead of
frobbing things directly.
2003-08-09 23:14:33 +00:00
John Baldwin
3bdbd658f1 - Since td_critnest is now initialized in MI code, it doesn't have to be
set in cpu_critical_fork_exit() anymore.
- As far as I can tell, cpu_thread_link() has never been used, not even
  when it was originally added, so remove it.
2003-08-04 20:32:45 +00:00
Peter Wemm
ad7a226f9d Deal with 'options KSTACK_PAGES' being a global option. 2003-07-31 01:31:32 +00:00
Thomas Moestl
416c84a212 Return 1 from pmap_protect_tte() instead of 0. When used with
tsb_foreach(), 0 signals to terminate the tsb traversal, so when
tsb_foreach() was used in pmap_protect() (which only happens when
the area to be protected is larger than PMAP_TSB_THRESH = 16MB), only
the first tsb entry in the specified range would be protected.

Reported by:	Andrew Belashov <bel@orel.ru>
2003-07-30 16:27:51 +00:00
Thomas Moestl
1cff8bd286 Respect BUS_DMA_ZERO in iommu_dvmamem_alloc(). 2003-07-27 15:19:45 +00:00
Maxime Henrion
d5afecd068 - Introduce a new busdma flag BUS_DMA_ZERO to request for zero'ed
memory in bus_dmamem_alloc().  This is possible now that
  contigmalloc() supports the M_ZERO flag.
- Remove the locking of Giant around calls to contigmalloc() since
  contigmalloc() now grabs Giant itself.
2003-07-27 13:52:10 +00:00
Jake Burkholder
30c2333b1d Avoid exposing declarations for kernel variables to userland.
PR:	54528
2003-07-17 23:42:08 +00:00
John-Mark Gurney
f760fd6128 change CLASS depending upon __ELF_WORD_SIZE. This is necessary if
someone wants to try to run 32bit binaries on sparc64.
2003-07-16 01:14:40 +00:00
John-Mark Gurney
e93581dc7a add support for interrupt counting on sparc64. This copies part of the
code from i386.  The code has a slight bogon that interrupts are counted
twice.  Once on the ithread dispatch and once on the dispatch for the vector

vmstat -i and systat -vm now contains interrupt counts.

Reviewed by:	jake
2003-07-16 00:08:43 +00:00
David Xu
20a2d71332 Rename thread_siginfo to cpu_thread_siginfo.
Suggested by: jhb
2003-07-15 00:11:04 +00:00
Thomas Moestl
928a49644f Lock down the IOMMU bus_dma implementation to make it safe to use
without Giant held.

A quick outline of the locking strategy:
Since all IOMMUs are synchronized, there is a single lock, iommu_mtx,
which protects the hardware registers (where needed) and the global and
per-IOMMU software states. As soon as the IOMMUs are divorced, each struct
iommu_state will have its own mutex (and the remaining global state
will be moved into the struct).
The dvma rman has its own internal mutex; the TSB slots may only be
accessed by the owner of the corresponding resource, so neither needs
extra protection.
Since there is a second access path to maps via LRU queues, the consumer-
provided locking is not sufficient; therefore, each map which is on a
queue is additionally protected by iommu_mtx (in part, there is one
member which only the map owner may access). Each map on a queue may
be accessed and removed from or repositioned in a queue in any context as
long as the lock is held; only the owner may insert a map.
To reduce lock contention, some bus_dma functions remove the map from
the queue temporarily (on behalf of the map owner) for some operations and
reinsert it when they are done. Shorter operations and operations which are
not done on behalf of the lock owner are completely covered by the lock.

To facilitate the locking, reorganize the streaming buffer handling;
while being there, fix an old oversight which would cause the streaming
buffer to always be flushed, regardless of whether streaming was enabled
in the TSB entry. The streaming buffer is still disabled for now, since
there are a number of drivers which lack critical bus_dmamp_sync() calls.

Additional testing by:	jake
2003-07-10 23:27:35 +00:00
Maxime Henrion
fab712dcc6 Uncomment the dc(4) driver, it should work just fine now. 2003-07-09 15:04:27 +00:00
Alan Cox
6e3bf93111 MFi386
Updates to cnt.v_wire_count, the global count of wired pages, should be
 performed using atomic ops.
2003-07-06 20:32:42 +00:00
Alan Cox
1f78f902a8 Background: pmap_object_init_pt() premaps the pages of a object in
order to avoid the overhead of later page faults.  In general, it
implements two cases: one for vnode-backed objects and one for
device-backed objects.  Only the device-backed case is really
machine-dependent, belonging in the pmap.

This commit moves the vnode-backed case into the (relatively) new
function vm_map_pmap_enter().  On amd64 and i386, this commit only
amounts to code rearrangement.  On alpha and ia64, the new machine
independent (MI) implementation of the vnode case is smaller and more
efficient than their pmap-based implementations.  (The MI
implementation takes advantage of the fact that objects in -CURRENT
are ordered collections of pages.)  On sparc64, pmap_object_init_pt()
hadn't (yet) been implemented.
2003-07-03 20:18:02 +00:00
Scott Long
f6b1c44d1f Mega busdma API commit.
Add two new arguments to bus_dma_tag_create(): lockfunc and lockfuncarg.
Lockfunc allows a driver to provide a function for managing its locking
semantics while using busdma.  At the moment, this is used for the
asynchronous busdma_swi and callback mechanism.  Two lockfunc implementations
are provided: busdma_lock_mutex() performs standard mutex operations on the
mutex that is specified from lockfuncarg.  dftl_lock() is a panic
implementation and is defaulted to when NULL, NULL are passed to
bus_dma_tag_create().  The only time that NULL, NULL should ever be used is
when the driver ensures that bus_dmamap_load() will not be deferred.
Drivers that do not provide their own locking can pass
busdma_lock_mutex,&Giant args in order to preserve the former behaviour.

sparc64 and powerpc do not provide real busdma_swi functions, so this is
largely a noop on those platforms.  The busdma_swi on is64 is not properly
locked yet, so warnings will be emitted on this platform when busdma
callback deferrals happen.

If anyone gets panics or warnings from dflt_lock() being called, please
let me know right away.

Reviewed by:	tmm, gibbs
2003-07-01 15:52:06 +00:00
Thomas Moestl
cb33c884cd Add a commented-out entry for OFW_NEWPCI to GENERIC and NOTES, along
with a comment describing it's advantages and the implication of
changing it. While being there, fix a typo in NOTES.

The option is not enabled in NOTES for now since large portions of code
are conditional on it being disabled, too.
2003-07-01 15:13:07 +00:00
Thomas Moestl
1d80cb1b37 Add the new sparc64 OFW PCI framework, conditional on options OFW_NEWPCI
for now. It introduces a OFW PCI bus driver and a generic OFW PCI-PCI
bridge driver. By utilizing these, the PCI handling is much more elegant
now.

The advantages of the new approach are:
- Device enumeration should hopefully be more like on Solaris now,
  so unit numbers should match what's printed on the box more
  closely.
- Real interrupt routing is implemented now, so cardbus bridges
  etc. have at least a chance to work.
- The quirk tables are gone and have been replaced by (hopefully
  sufficient) heuristics.
- Much cleaner code.

There was also a report that previously bogus interrupt assignments
are fixed now, which can be attributed to the new heuristics.

A pitfall, and the reason why this is not the default yet, is that
it changes device enumeration, as mentioned above, which can make
it necessary to change the system configuration if more than one
unit of a device type is present (on a system with two hme cars,
for example, it is possible that hme0 becomes hme1 and vice versa
after enabling the option). Systems with multiple disk controllers
may need to be booted into single user (and require manual specification
of the root file system on boot) to adjust the fstab.
Nevertheless, I would like to encourage users to use this option,
so that it can be made the default soon.

In detail, the changes are:
- Introduce an OFW PCI bus driver; it inherits most methods from the
  generic PCI bus driver, but uses the firmware for enumeration,
  performs additional initialization for devices and firmware-specific
  interrupt routing. It also implements an OFW-specific method to allow
  child devices to get their firmware nodes.
- Introduce an OFW PCI-PCI bridge driver; again, it inherits most
  of the generic PCI-PCI bridge driver; it has it's own method for
  interrupt routing, as well as some sparc64-specific methods (one to
  get the node again, and one to adjust the bridge bus range, since
  we need to reenumerate all PCI buses).
- Convert the apb driver to the new way of handling things.
- Provide a common framework for OFW bridge drivers, used be the two
  drivers above.
- Provide a small common framework for interrupt routing (for all
  bridge types).
- Convert the psycho driver to the new framework; this gets rid of a
  bunch of old kludges in pci_read_config(), and the whole
  preinitialization  (ofw_pci_init()).
- Convert the ISA MD part and the EBus driver to the new way
  interrupts and nodes are handled.
- Introduce types for firmware interrupt properties.
- Rename the old sparcbus_if to ofw_pci_if by repo copy (it is only
  required for PCI), and move it to a more correct location (new
  support methodsx were also added, and an old one was deprecated).
- Fix a bunch of minor bugs, perform some cleanups.

In some cases, I introduced some minor code duplication to keep the
new code clean, in hopes that the old code will be unifdef'ed soon.

Reviewed in part by:	imp
Tested by:	jake, Marius Strobl <marius@alchemy.franken.de>,
		Sergey Mokryshev <mokr@mokr.net>,
		Chris Jackman <cjackNOSPAM@klatsch.org>
Info on u30 firmware provided by:	kris
2003-07-01 14:52:47 +00:00
Alan Cox
dca96f1adc - Export pmap_enter_quick() to the MI VM. This will permit the
implementation of a largely MI pmap_object_init_pt() for vnode-backed
   objects.  pmap_enter_quick() is implemented via pmap_enter() on sparc64
   and powerpc.
 - Correct a mismatch between pmap_object_init_pt()'s prototype and its
   various implementations.  (I plan to keep pmap_object_init_pt() as
   the MD hook for device-backed objects on i386 and amd64.)
 - Correct an error in ia64's pmap_enter_quick() and adjust its interface
   to match the other versions.  Discussed with: marcel
2003-06-29 21:20:04 +00:00
Thomas Moestl
d462b4f058 Small fixes for the IOMMU code:
1.) Handle maximum segment sizes which are smaller than the IOMMU page
    size by splitting up pages across multiple segments if needed; this case
    was previously unimplemented, and would cause panics.
2.) KASSERT that the physical address is in range; remove a KASSERT that
    has become pointless.
3.) Add a comment describing what remains to be fixed in the IOMMU code;
    I plan to address these issues soon.

Desired by:	dwhite (1)
2003-06-28 21:52:16 +00:00