Commit Graph

3083 Commits

Author SHA1 Message Date
Tim J. Robbins
fa2a4d0595 Move TDF_DEADLKTREAT into td_pflags (and rename it accordingly) to avoid
having to acquire sched_lock when manipulating it in lockmgr(), uiomove(),
and uiomove_fromphys().

Reviewed by:	jhb
2004-06-03 01:47:37 +00:00
Tim J. Robbins
aa0aa7a113 Move TDF_SA from td_flags to td_pflags (and rename it accordingly)
so that it is no longer necessary to hold sched_lock while
manipulating it.

Reviewed by:	davidxu
2004-06-02 07:52:36 +00:00
John Baldwin
c9702514c0 - Add a function ioapic_program_intpin() that completely programs an I/O
APIC interrupt pin based on the settings in the corresponding interrupt
  source structure.
- Use ioapic_program_intpin() in place of manual frobbing of the intpin
  configuration in ioapic_program_destination() and ioapic_register().
- Use ioapic_program_intpin() to implement suspend/resume support for I/O
  APICs.
2004-06-01 20:28:42 +00:00
John Baldwin
d5c5dcf762 Fix legacy_add_child() to properly handle the case where
device_add_child_ordered() fails (due to a duplicate device add for
example) and properly cleanup and return NULL.
2004-06-01 19:50:42 +00:00
Nate Lawson
8fb8b5fb27 Remove debugging printf that never triggered because acpi is the first
user of nexus::bus_get_resource.
2004-06-01 01:04:25 +00:00
Bosko Milekic
099a0e588c Bring in mbuma to replace mballoc.
mbuma is an Mbuf & Cluster allocator built on top of a number of
extensions to the UMA framework, all included herein.

Extensions to UMA worth noting:
  - Better layering between slab <-> zone caches; introduce
    Keg structure which splits off slab cache away from the
    zone structure and allows multiple zones to be stacked
    on top of a single Keg (single type of slab cache);
    perhaps we should look into defining a subset API on
    top of the Keg for special use by malloc(9),
    for example.
  - UMA_ZONE_REFCNT zones can now be added, and reference
    counters automagically allocated for them within the end
    of the associated slab structures.  uma_find_refcnt()
    does a kextract to fetch the slab struct reference from
    the underlying page, and lookup the corresponding refcnt.

mbuma things worth noting:
  - integrates mbuf & cluster allocations with extended UMA
    and provides caches for commonly-allocated items; defines
    several zones (two primary, one secondary) and two kegs.
  - change up certain code paths that always used to do:
    m_get() + m_clget() to instead just use m_getcl() and
    try to take advantage of the newly defined secondary
    Packet zone.
  - netstat(1) and systat(1) quickly hacked up to do basic
    stat reporting but additional stats work needs to be
    done once some other details within UMA have been taken
    care of and it becomes clearer to how stats will work
    within the modified framework.

From the user perspective, one implication is that the
NMBCLUSTERS compile-time option is no longer used.  The
maximum number of clusters is still capped off according
to maxusers, but it can be made unlimited by setting
the kern.ipc.nmbclusters boot-time tunable to zero.
Work should be done to write an appropriate sysctl
handler allowing dynamic tuning of kern.ipc.nmbclusters
at runtime.

Additional things worth noting/known issues (READ):
   - One report of 'ips' (ServeRAID) driver acting really
     slow in conjunction with mbuma.  Need more data.
     Latest report is that ips is equally sucking with
     and without mbuma.
   - Giant leak in NFS code sometimes occurs, can't
     reproduce but currently analyzing; brueffer is
     able to reproduce but THIS IS NOT an mbuma-specific
     problem and currently occurs even WITHOUT mbuma.
   - Issues in network locking: there is at least one
     code path in the rip code where one or more locks
     are acquired and we end up in m_prepend() with
     M_WAITOK, which causes WITNESS to whine from within
     UMA.  Current temporary solution: force all UMA
     allocations to be M_NOWAIT from within UMA for now
     to avoid deadlocks unless WITNESS is defined and we
     can determine with certainty that we're not holding
     any locks when we're M_WAITOK.
   - I've seen at least one weird socketbuffer empty-but-
     mbuf-still-attached panic.  I don't believe this
     to be related to mbuma but please keep your eyes
     open, turn on debugging, and capture crash dumps.

This change removes more code than it adds.

A paper is available detailing the change and considering
various performance issues, it was presented at BSDCan2004:
http://www.unixdaemons.com/~bmilekic/netbuf_bmilekic.pdf
Please read the paper for Future Work and implementation
details, as well as credits.

Testing and Debugging:
    rwatson,
    brueffer,
    Ketrien I. Saihr-Kesenchedra,
    ...
Reviewed by: Lots of people (for different parts)
2004-05-31 21:46:06 +00:00
Poul-Henning Kamp
77409fe148 Add missing #include <sys/module.h> 2004-05-30 20:34:58 +00:00
Poul-Henning Kamp
41ee9f1c69 Add some missing <sys/module.h> includes which are masked by the
one on death-row in <sys/kernel.h>
2004-05-30 17:57:46 +00:00
Poul-Henning Kamp
a041b840d2 struct cpu_nameclass is a private to identcpu.c, move it there. 2004-05-30 15:16:07 +00:00
Alan Cox
662d471da6 Remove a broken micro-optimization from pmap_enter(). The ill effect
of this micro-optimization occurs when we call pmap_enter() to wire an
already mapped page.  Because of the micro-optimization, we fail to
mark the PTE as wired.  Later, on teardown of the address space,
pmap_remove_pages() destroys the PTE before vm_fault_unwire() has
unwired the page.  (pmap_remove_pages() is not supposed to destroy
wired PTEs.  They are destroyed by a later call to pmap_remove().)
Thus, the page becomes lost.

Note: The page is not lost if the application called munlock(2), only
if it relies on teardown of the address space to unwire its pages.

For the historically inclined, this bug was introduced by a
megacommit, revision 1.182, roughly six years ago.

Leak observed by: green@ and dillon independently
Patch submitted by: dillon at backplane dot com
Reviewed by: tegge@
MFC after: 1 week
2004-05-28 19:42:02 +00:00
John Baldwin
543e27a95b Reenable ithread preemption for interrupts that occur while executing in
the kernel.  I accidentally broke this with the new interrupt code that
came in prior to 5.2.

Submitted by:	bde
2004-05-28 17:50:07 +00:00
Thomas Moestl
65e29c4822 Retire cpu_sched_exit(); it is not used any more. 2004-05-26 12:09:39 +00:00
Bruce Evans
30346936cf MFamd64:
Fixed profiling of trap, syscall and interrupt handlers and some
ordinary functions, essentially by backing out half of rev.1.106 of
i386/exception.s.  The handlers must be between certain labels for
the purposes of profiling, and this was broken by scattering them in
separately compiled .s files, especially for ordinary functions that
ended up between the labels.  Merge the files by #including them as
before, except with different pathnames and better comments and
organization.  Changes to the scattered files are minimal -- just
move the labels to the file that does the #includes.

This also partly fixes profiling of IPIs -- all IPI handlers are now
correctly classified as interrupt handlers, but many are still missing
mcount calls.

vm86bios.s is included as before, but it is now between the labels for
interrupt handlers again, which seems to be wrong since half of it is
for a non-interrupt handler.
2004-05-26 07:43:41 +00:00
John Baldwin
87ca48b6b8 Revert part of rev 1.230 and assume that all EISA IRQs use active high
polarity rather than assuming that level triggered IRQs use active low and
edge triggered IRQs use active high.  Both the MultiProcessor 1.4
and ACPI 2.0 Specifications state in their examples that level triggered
EISA IRQs are active low, but in practice they seem to be active high.

Reported by:	Nik Azim Azam nskyline_r35 at yahoo dot com
2004-05-24 15:51:46 +00:00
Bruce Evans
c39f62a459 MFamd64 (1.117: made the FAKE_MCOUNT() in doreti work non-accidentally,
and removed buggy unnecessary FAKE_MCOUNT() in calltrap).
2004-05-23 17:25:46 +00:00
Bruce Evans
6670fdb155 MFamd64 (put TF_EIP in assym.s and use it instead of a magic offset in
FAKE_MCOUNT()s).
2004-05-23 16:50:55 +00:00
Bruce Evans
c94dd84382 MFamd64 (1.111: fixed missing call to .mexitcount in lgdt()). 2004-05-23 15:37:21 +00:00
Bruce Evans
912f07a626 Updated and reorganized the comments for the fetch and store families of
functions.
2004-05-21 16:08:26 +00:00
Bruce Evans
2151041f22 Fixed high resoultion profiling of fuword32() and suword32(). Use the
standard macro ALTENTRY() instead of a home made incomplete version
of it.
2004-05-21 16:01:54 +00:00
Peter Wemm
e8855d4f97 Make a small revision to the api between the elf linker core and the
elf_reloc() backends for two reasons.  First, to support the possibility
of there being two elf linkers in the kernel (eg: amd64), and second, to
pass the relocbase explicitly (for relocating .o format kld files).
2004-05-16 20:00:28 +00:00
John Baldwin
96025496cf - Remove a spurious blank line.
- Add a missing static keyword.
2004-05-11 20:06:55 +00:00
John Baldwin
eb8943b13e Rework the APIC mixed mode support a bit:
- Require the APIC enumerators to explicitly enable mixed mode by calling
  ioapic_enable_mixed_mode().  Calling this function tells the apic driver
  that the PC-AT 8259A PICs are present and routable through the first I/O
  APIC via an ExtINT pin.  The mptable enumerator always calls this
  function for now.  The MADT enumerator only enables mixed mode if the
  PC-AT compatability flag is set in the MADT header.
- Allow mixed mode to be enabled or disabled via a 'hw.apic.mixed_mode'
  tunable.  By default this tunable is set to 1 (true).  The kernel option
  NO_MIXED_MODE changes the default to 0 to preserve existing behavior, but
  adding 'hw.apic.mixed_mode=0' to loader.conf achieves the same effect.
- Only use mixed mode to route IRQ 0 if it is both enabled by the APIC
  enumerator and activated by the loader tunable.  Note that both
  conditions must be true, so if the APIC enumerator does not enable mixed
  mode, then you can't set the tunable to try to override the enumerator.
2004-05-10 18:49:58 +00:00
Nate Lawson
88d2c61ee8 Move the CPU newbus attachment to i386 legacy. The acpi_cpu device will
become just "cpu" and provide attachments in the !legacy case.

Tested by:	des
2004-05-06 15:54:02 +00:00
Yoshihiro Takahashi
3e48fb44a8 Disable an EISA support on PC98. 2004-05-06 13:45:45 +00:00
Maxime Henrion
601ae596e7 Unbreak kernel build in the !apic case. We moved to using enums for
setting the polarity and the trigger mode of interrupts.

Tested by:	Mark Santcroos, Russell Jackson, Christian Hiris
2004-05-05 12:39:02 +00:00
John Baldwin
4b1df14c60 - Add a new pic method pic_config_intr() to set the trigger mode and
polarity for a specified IRQ.  The intr_config_intr() function wraps
  this pic method hiding the IRQ to interrupt source lookup.
- Add a config_intr() method to the atpic(4) driver that reconfigures
  the interrupt using the ELCR if possible and returns an error otherwise.
- Add a config_intr() method to the apic(4) driver that just logs any
  requests that would change the existing programming under bootverbose.
  Currently, the only changes the apic(4) driver receives are due to bugs
  in the acpi(4) driver and its handling of link devices, hence the reason
  for such requests currently being ignored.
- Have the nexus(4) driver on i386 implement the bus_config_intr() function
  by calling intr_config_intr().
2004-05-04 21:02:56 +00:00
John Baldwin
c2ce35977e - Change the APIC code to mostly use the recently added intr_trigger
and intr_polarity enums for passing around interrupt trigger modes and
  polarity rather than using the magic numbers 0 for level/low and 1 for
  edge/high.
- Convert the mptable parsing code to use the new ELCR wrapper code rather
  than reading the ELCR directly.  Also, use the ELCR settings to control
  both the trigger and polarity of EISA IRQs instead of just the trigger
  mode.
- Rework the MADT's handling of the ACPI SCI again:
  - If no override entry for the SCI exists at all, use level/low trigger
    instead of the default edge/high used for ISA IRQs.
  - For the ACPI SCI, use level/low values for conforming trigger and
    polarity rather than the edge/high values we use for all other ISA
    IRQs.
  - Rework the tunables available to override the MADT.  The
    hw.acpi.force_sci_lo tunable is no longer supported.  Instead, there
    are now two tunables that can independently override the trigger mode
    and/or polarity of the SCI.  The hw.acpi.sci.trigger tunable can be
    set to either "edge" or "level", and the hw.acpi.sci.polarity tunable
    can be set to either "high" or "low".  To simulate hw.acpi.force_sci_lo,
    set hw.acpi.sci.trigger to "level" and hw.acpi.sci.polarity to "low".
    If you are having problems with ACPI either causing an interrupt storm
    or not working at all (e.g., the power button doesn't turn invoke a
    shutdown -p now), you can try tweaking these two tunables to find the
    combination that works.
2004-05-04 20:39:24 +00:00
John Baldwin
4f9e7c8b00 Use a private attach method for the MP Table host-PCI bridge driver rather
than using legacy_pcib_attach().  The MP Table drivers don't use the $PIR,
and the legacy_pcib_attach() function probes and parses the $PIR in
addition to adding the pci bus child device.
2004-05-03 14:49:10 +00:00
Poul-Henning Kamp
a2e4145b35 Make sure we are all set up before creating the LED instance. 2004-04-27 13:08:03 +00:00
John Baldwin
79cdd799f6 Use %eax rather than %ax when loading segment registers to avoid partial
register stalls.

Reviewed by:	bde (a while ago, and I think an earlier version)
2004-04-16 19:26:37 +00:00
Alan Cox
2c38d78e41 - is_physical_memory()'s parameter, which is a physical address, should be
a vm_paddr_t not a vm_offset_t.
2004-04-11 04:26:58 +00:00
Alan Cox
0af0eeacb6 - pmap_kenter_temporary()'s first parameter, which is a physical address,
should be declared as vm_paddr_t not vm_offset_t.
2004-04-10 23:28:49 +00:00
Mark Murray
4ebf3fc96b I hate noticing bugs after committing. :-(
ALWAYS set up the CPU base identity string. THEN optionally
add features.
2004-04-09 17:00:03 +00:00
Mark Murray
c97ed2df4e Add extra output to show when VIA C3 Nehemiah CPUs have hardware
Random Number Generator (RNG) and/or Advanced Cryptography Engine
(ACE).
2004-04-09 15:01:44 +00:00
Warner Losh
f36cfd49ad Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 20:46:16 +00:00
Doug Rabson
e776f0f9f8 Print cpu features for crusoe processors. 2004-04-05 10:12:19 +00:00
Alan Cox
c8607538c8 Remove avail_start on those platforms that no longer use it. (Only amd64
does anything with it beyond simple initialization.)
2004-04-05 04:08:00 +00:00
Alan Cox
bdb93eb248 Remove unused arguments from pmap_init(). 2004-04-05 00:37:50 +00:00
Marcel Moolenaar
60aa1737df Move the definition of rss() from db_interface.c to cpufunc.h where
it belongs. Change the implementation to match those of rfs() and
rgs() for consistency and irrespective of whether the original was
more correct or not (technically speaking).
2004-04-03 22:23:36 +00:00
Poul-Henning Kamp
93c98d4d1d Unbreak LED support on Elan cpus. 2004-04-03 18:42:52 +00:00
Alan Cox
121230a40d In some cases, sf_buf_alloc() should sleep with pri PCATCH; in others, it
should not.  Add a new parameter so that the caller can specify which is
the case.

Reported by:	dillon
2004-04-03 09:16:27 +00:00
Peter Wemm
5c89deaefc Finish tidying up a couple of leftovers from the KSTACK_PAGES stuff. Some
files still #included the opt_ file.  powerpc hadn't been updated yet.
2004-03-29 19:38:05 +00:00
Bill Paul
6041f1a522 The kthread_create() API is supposed to allow you to create threads
with more than the normal amount of stack pages, however the stack
pointer always wound up being initialized using KSTACK_PAGES. It
should be using td->td_kstack_pages instead. This means that although
the vm subsystem would give you all the stack pages you asked for,
%esp would always be initialized as if you had just 2 pages, and
the rest would go to waste.

I wanted to use the 'give me more stack pages' feature of kthread_create()
because the Intel 2200BG NDIS driver does an alloca() of about 5000 bytes,
which wrecks the stack with the default 2 page size, and I was baffled
that no matter how much code I shoved into thread contexts with
allegedly larger stacks, the thing would still crash unless I changed
KSTACK_PAGES.

Note: this bug is present in _ALL_ arches at this point. Peter has
promised to merge this fix into all of them.
2004-03-22 00:28:38 +00:00
Alan Cox
00cfafd7db Add an implementation of uiomove_fromphys() for i386. This implementation
uses sf_buf_alloc() and sf_buf_free() to create and destroy the necessary
ephemeral mappings.
2004-03-21 20:28:36 +00:00
Alan Cox
90ecfebd82 Refactor the existing machine-dependent sf_buf_free() into a machine-
dependent function by the same name and a machine-independent function,
sf_buf_mext().  Aside from the virtue of making more of the code machine-
independent, this change also makes the interface more logical.  Before,
sf_buf_free() did more than simply undo an sf_buf_alloc(); it also
unwired and if necessary freed the page.  That is now the purpose of
sf_buf_mext().  Thus, sf_buf_alloc() and sf_buf_free() can now be used
as a general-purpose emphemeral map cache.
2004-03-16 19:04:28 +00:00
Poul-Henning Kamp
f6c21c0641 The PPS code needs to be much more brutal to avoid synchronism on
hardware with non-sucky clocks.
2004-03-15 21:47:34 +00:00
Alan Cox
9a63fc0df0 Simplify sf_buf_alloc(). 2004-03-14 04:06:33 +00:00
Scott Long
11d905ecd8 Now that contigfree() does not require Giant, don't grab it in busdma. 2004-03-13 15:42:59 +00:00
Tom Rhodes
a122cca953 These are changes to allow to use the Intel C/C++ compiler (lang/icc)
to build the kernel. It doesn't affect the operation if gcc.

Most of the changes are just adding __INTEL_COMPILER to #ifdef's, as
icc v8 may define __GNUC__ some parts may look strange but are
necessary.

Additional changes:
 - in_cksum.[ch]:
   * use a generic C version instead of the assembly version in the !gcc
     case (ASM code breaks with the optimizations icc does)
     -> no bad checksums with an icc compiled kernel
     Help from:		andre, grehan, das
     Stolen from: 	alpha version via ppc version
     The entire checksum code should IMHO be replaced with the DragonFly
     version (because it isn't guaranteed future revisions of gcc will
     include similar optimizations) as in:
        ---snip---
          Revision  Changes    Path
          1.12      +1 -0      src/sys/conf/files.i386
          1.4       +142 -558  src/sys/i386/i386/in_cksum.c
          1.5       +33 -69    src/sys/i386/include/in_cksum.h
          1.5       +2 -0      src/sys/netinet/igmp.c
          1.6       +0 -1      src/sys/netinet/in.h
          1.6       +2 -0      src/sys/netinet/ip_icmp.c

          1.4       +3 -4      src/contrib/ipfilter/ip_compat.h
          1.3       +1 -2      src/sbin/natd/icmp.c
          1.4       +0 -1      src/sbin/natd/natd.c
          1.48      +1 -0      src/sys/conf/files
          1.2       +0 -1      src/sys/conf/files.amd64
          1.13      +0 -1      src/sys/conf/files.i386
          1.5       +0 -1      src/sys/conf/files.pc98
          1.7       +1 -1      src/sys/contrib/ipfilter/netinet/fil.c
          1.10      +2 -3      src/sys/contrib/ipfilter/netinet/ip_compat.h
          1.10      +1 -1      src/sys/contrib/ipfilter/netinet/ip_fil.c
          1.7       +1 -1      src/sys/dev/netif/txp/if_txp.c
          1.7       +1 -1      src/sys/net/ip_mroute/ip_mroute.c
          1.7       +1 -2      src/sys/net/ipfw/ip_fw2.c
          1.6       +1 -2      src/sys/netinet/igmp.c
          1.4       +158 -116  src/sys/netinet/in_cksum.c
          1.6       +1 -1      src/sys/netinet/ip_gre.c
          1.7       +1 -2      src/sys/netinet/ip_icmp.c
          1.10      +1 -1      src/sys/netinet/ip_input.c
          1.10      +1 -2      src/sys/netinet/ip_output.c
          1.13      +1 -2      src/sys/netinet/tcp_input.c
          1.9       +1 -2      src/sys/netinet/tcp_output.c
          1.10      +1 -1      src/sys/netinet/tcp_subr.c
          1.10      +1 -1      src/sys/netinet/tcp_syncache.c
          1.9       +1 -2      src/sys/netinet/udp_usrreq.c

          1.5       +1 -2      src/sys/netinet6/ipsec.c
          1.5       +1 -2      src/sys/netproto/ipsec/ipsec.c
          1.5       +1 -1      src/sys/netproto/ipsec/ipsec_input.c
          1.4       +1 -2      src/sys/netproto/ipsec/ipsec_output.c

          and finally remove
            sys/i386/i386        in_cksum.c
            sys/i386/include     in_cksum.h
        ---snip---
 - endian.h:
   * DTRT in C++ mode
 - quad.h:
   * we don't use gcc v1 anymore, remove support for it
   Suggested by:	bde (long ago)
 - assym.h:
   * avoid zero-length arrays (remove dependency on a gcc specific
     feature)
     This change changes the contents of the object file, but as it's
     only used to generate some values for a header, and the generator
     knows how to handle this, there's no impact in the gcc case.
   Explained by:	bde
   Submitted by:	Marius Strobl <marius@alchemy.franken.de>
 - aicasm.c:
   * minor change to teach it about the way icc spells "-nostdinc"
   Not approved by:	gibbs (no reply to my mail)
 - bump __FreeBSD_version (lang/icc needs to know about the changes)

Incarnations of this patch survive gcc compiles since a loooong time,
I use it on my desktop. An icc compiled kernel works since Nov. 2003
(exceptions: snd_* if used as modules), it survives a build of the
entire ports collection with icc.

Parts of this commit contains suggestions or submissions from
Marius Strobl <marius@alchemy.franken.de>.

Reviewed by:	-arch
Submitted by:	netchild
2004-03-12 21:45:33 +00:00
Marcel Moolenaar
cd5cb01152 Remove stale or broken call to kdb_trap() and protected by the non-
option KDB. Besides being wrong, it also interferes with ongoing
work.
2004-03-11 00:17:45 +00:00