43127 Commits

Author SHA1 Message Date
sam
357db6acd2 Always queue looped back packets (rather than potentially using
direct dispatch) to avoid extensive kernel stack usage and to
avoid directly re-entering the network stack.  The latter causes
locking problems when, for example, a complete TCP handshake`
happens w/o a context switch.
2003-10-29 18:37:47 +00:00
sam
47bb0c3b09 mark interrupt handlers MPSAFE 2003-10-29 18:32:14 +00:00
sam
1beb7cc4f2 Add a temporary mechanism to disble INTR_MPSAFE from network interface
drivers.  This is prepatory to running more parts of the network system
w/o Giant.
2003-10-29 18:29:50 +00:00
bde
f03fdf4e6f Removed mostly-dead code for setting switchtime after the idle loop
clobbers this variable.  Long ago, when the idle loop wasn't in a
process, it set switchtime.tv_sec to zero to indicate that the time
needs to be read after the idle loop finishes.  The special case for
this isn't needed now that there is an idle process (for each CPU).
The time is read in the normal way when the idle process is switched
away from.  The seconds component of the time is only zero for the
first second after the uptime is set, and the mostly-dead code was only
executed during this time.  (This was slightly broken by using uptimes
instead of times relative to the Epoch -- in the original version the
seconds component of the time was only 0 for the first second after
the Epoch.)

In mi_switch(), moved the setting of switchticks to just after the
first (and now only) setting of switchtime.  This setting used to be
delayed since a late setting was needed for the idle case and an early
setting was not needed.  Now the early setting is needed so that
fork_exit() doesn't need to set either switchtime or switchticks.
Removed now-completely-rotted comment attached to this.  Most of the
code described by the comment had already moved to sched_switch().
2003-10-29 15:23:09 +00:00
harti
5dd57a3567 Allow sending of more than one raw cell from a single mbuf. Only the
very first cell in the mbuf should have a cell header word (of which
everything except the payload type and the CLP bit is ignored). All
other cells should be 48 byte and get the same header as the first cell.

This fixes a problem with sending more than 120000 raw cells/sec through
an HE155. The card seems to need 2 cell times to DMA the transmit buffer
ready queue entry and the transmit buffer descriptor so at 1/3 the
link rate the transmit buffer ready queue starts to fill up. Even with this
patch it's obviously impossible to send raw cells at link rate.
2003-10-29 15:15:19 +00:00
harti
7ff4941bd6 Remove a superfluous ) from the previous commit. This was obviously
a result of the current solar storm.
2003-10-29 15:11:26 +00:00
harti
aa5acc483c Make the maximum number of pages for external mbufs configurable in
the kernel environment and accessible as a RO sysctl.

Explain that the HE155 will not work in 64-bit/66MHz slots, but may work
in 64-bit/33MHz slots.
2003-10-29 15:07:10 +00:00
ume
b9fecc82d3 add ECN support in layer-3.
- implement the tunnel egress rule in ip_ecn_egress() in ip_ecn.c.
   make ip{,6}_ecn_egress() return integer to tell the caller that
   this packet should be dropped.
 - handle ECN at fragment reassembly in ip_input.c and frag6.c.

Obtained from:	KAME
2003-10-29 15:07:04 +00:00
bde
6bce6afbe7 Removed sched_nest variable in sched_switch(). Context switches always
begin with sched_lock held but not recursed, so this variable was
always 0.

Removed fixup of sched_lock.mtx_recurse after context switches in
sched_switch().  Context switches always end with this variable in the
same state that it began in, so there is no need to fix it up.  Only
sched_lock.mtx_lock really needs a fixup.

Replaced fixup of sched_lock.mtx_recurse in fork_exit() by an assertion
that sched_lock is owned and not recursed after it is fixed up.  This
assertion much match the one in mi_switch(), and if sched_lock were
recursed then a non-null fixup of sched_lock.mtx_recurse would probably
be needed again, unlike in sched_switch(), since fork_exit() doesn't
return to its caller in the normal way.
2003-10-29 14:40:41 +00:00
harti
86956669e3 Make the value of the HATM_DEBUG symbol the default for the debugging
flags. Introduce a new debugging flag to dump received packets.
2003-10-29 14:33:41 +00:00
harti
379fe50af3 Inline a function that was called only in one place directly into that place.
Correct a bug when the number of pages for external mbufs was
very large. In this case the page number could overflow into the large
buffer flag. Make this more unlikley by move that flag further away.
2003-10-29 14:28:26 +00:00
iwasaki
8c1d6750c9 Alphabetical order for ACPI options broken by adding ACPI_NO_RESET_VIDEO.
Add short comment about ACPI_NO_RESET_VIDEO into NOTES.

Pointed-out by:	njl
2003-10-29 14:22:09 +00:00
harti
bba7295376 We have some space in the external mbufs so use this space for
the external buffer reference count. This saves us a malloc() + free()
per small receive mbuf.
2003-10-29 13:21:38 +00:00
harti
e0d0a97b8f Defer allocation of the actual receive mbuf until the external buffer
is returned from the card to the driver. Add a counter that shows
how many times this allocation has failed. Note, that we could even
further delay the allocation of the mbuf until we know, that we need it
(there are no receive errors and the connection is open). This will be done
in a later commit.

Print the new statistics field in atmconfig.
2003-10-29 13:14:39 +00:00
harti
568c57f368 Get rid of the mutexes for the exernal buffer free lists. Use
atomic instructions instead. Remove the stuff used to track
whether an external mbuf travels through the system. This is
temporary only and will come back soon.
2003-10-29 12:59:44 +00:00
ume
36edae8e0d ip6_savecontrol() argument is redundant 2003-10-29 12:52:28 +00:00
ume
bde54b9152 hide m_tag, again.
Requested by:	sam
2003-10-29 12:49:12 +00:00
alc
e273855447 - Synchronize updates to nswapdev using sw_dev_mtx. 2003-10-29 07:51:41 +00:00
marcel
66c2f8b643 Fix the alpha tinderbox. The alpha specific bitops used by the bitmap
code has the typical branch prediction detour, which creates cross-
section branches. A LINT kernel is apparently large enough nowadays
that the .text and .text2 sections cannot always be layed-out so that
branches between them reach.
The fix is to stop using the alpha-specific bitops and instead use
the portable implementation used by all platforms other than alpha
and i386.
2003-10-29 07:35:53 +00:00
alc
4307e55d6c - Avoid a race in swaponsomething(): Calculate the new swdevt's first and
end swblk and insert this new swdevt into the list of swap devices
   in the same critical section.
2003-10-29 05:42:28 +00:00
sam
409cf5f514 Introduce the notion of "persistent mbuf tags"; these are tags that stay
with an mbuf until it is reclaimed.  This is in contrast to tags that
vanish when an mbuf chain passes through an interface.  Persistent tags
are used, for example, by MAC labels.

Add an m_tag_delete_nonpersistent function to strip non-persistent tags
from mbufs and use it to strip such tags from packets as they pass through
the loopback interface and when turned around by icmp.  This fixes problems
with "tag leakage".

Pointed out by:	Jonathan Stone
Reviewed by:	Robert Watson
2003-10-29 05:40:07 +00:00
marcel
0e35eb1601 This commit was generated by cvs2svn to compensate for changes in r121642,
which included commits to RCS files with non-trunk default branches.
2003-10-29 04:25:17 +00:00
marcel
0202e645a9 Import beta6 of libuwx. This release has some minor fixes and
some minor corrections to beta5.
2003-10-29 04:25:17 +00:00
iwasaki
143f8d89ab Add kernel option ACPI_NO_RESET_VIDEO as workaround for problems
(e.g. LCD white-out after resume) on some machine cased by
re-initialize video BIOS code in acpi_wakecode.
2003-10-29 03:30:45 +00:00
sos
67cd4eebae Cleanup the interrupt code that deals with the busmaster bits. 2003-10-28 21:08:14 +00:00
brooks
b3e7c2f5bf Use VLANNAME instead of "vlan". 2003-10-28 20:58:02 +00:00
jhb
6ed78687ed According to the submitter, POSIX mandates that all interval timers are
reset in a child process after a fork().  Currently, however, only the
real timer is cleared while the virtual and profiling timers are inherited.

The realtimer is cleared because it lives directly in struct proc in
p_realtimer.  It is in the zero'd section of struct proc.  The other timers
live in the p_timer[] array in struct pstats.  These timers are copied on
fork() rather than zero'd.  The fix is to move p_timer[] to the zero'd
part of struct pstats so that they are zero'd instead of copied on fork().

Note: Since at least FreeBSD 2.0 (and possibly earlier) we've had storage
for two real interval timers.  Now that the uarea is less important,
perhaps we could move all of p_timer[] over to struct proc and drop the
p_realtimer special case to fix that.

PR:		kern/58647
Reported by:	Dan Nelson <dnelson@allantgroup.com>
MFC after:	1 week
2003-10-28 20:46:23 +00:00
marcel
ba29587a94 When switching the RSE to use the kernel stack as backing store, keep
the RNAT bit index constant. The net effect of this is that there's
no discontinuity WRT NaT collections which greatly simplifies certain
operations. The cost of this is that there can be up to 504 bytes of
unused stack between the true base of the kernel stack and the start
of the RSE backing store. The cost of adjusting the backing store
pointer to keep the RNAT bit index constant, for each kernel entry,
is negligible.

The primary reasons for this change are:
1. Asynchronuous contexts in KSE processes have the disadvantage of
   having to copy the dirty registers from the kernel stack onto the
   user stack. The implementation we had so far copied the registers
   one at a time without calculating NaT collection values. A process
   that used speculation would not work. Now that the RNAT bit index
   is constant, we can block-copy the registers from the kernel stack
   to the user stack without having to worry about NaT collections.
   They will be in the right place on the user stack.
2. The ndirty field in the trapframe is now also usable in userland.
   This was previously not the case because ndirty also includes the
   space occupied by NaT collections. The value could be off by 8,
   depending on the discontinuity. Now that the RNAT bit index is
   contants, we have exactly the same number of NaT collection points
   on the kernel stack as we would have had on the user stack if we
   didn't switch backing stores.
3. Debuggers and other applications that use ptrace(2) can now copy
   the dirty registers from the kernel stack (using ptrace(2)) and
   copy them whereever they want them (onto the user stack of the
   inferior as might be the case for gdb) without having to worry
   about NaT collections in the same way the kernel doesn't have to
   worry about them.

There's a second order effect caused by the randomization of the
base of the backing store, for it depends on the number of dirty
registers the processor happened to have at the time of entry into
the kernel. The second order effect is that the RSE will have a
better cache utilization as compared to having the backing store
always aligned at page boundaries. This has not been measured and
may be in practice only minimally beneficial, if at all measurable.
2003-10-28 19:38:26 +00:00
sos
b0cc5e450b This should allow us to boot with DMA enabled on unknown PCI ATA
chipsets, well at least newer ones...
2003-10-28 19:01:48 +00:00
scottl
208696733a Directly call the 'reboot' word instead of indirectly evaluating it. 2003-10-28 17:18:42 +00:00
ume
8ff2243783 make sure to accept only IPv6 packet.
Obtained from:	KAME
2003-10-28 16:45:29 +00:00
ume
67fa4b4d82 cleanup use of m_tag.
Obtained from:	KAME
2003-10-28 16:29:26 +00:00
ume
7a7c6e3d3e mib name was changed by fixing a spelling.
net.key.prefered_oldsa -> net.key.preferred_oldsa

Obtained from:	KAME
2003-10-28 16:16:04 +00:00
sam
39ba2e1c90 speedup stream socket recv handling by tracking the tail of
the mbuf chain instead of walking the list for each append

Submitted by:	ps/jayanth
Obtained from:	netbsd (jason thorpe)
2003-10-28 05:47:40 +00:00
jeff
7742522f99 - Only change the run queue in sched_prio() if the kse is non null. threads
can be in the TD_ON_RUNQ state and not have an associated kse.
 - Remove the PRI_IDLE special case from sched_clock(), it was not actually
   necessary.
2003-10-28 03:28:48 +00:00
peter
6af25febc6 Oops. Remove some rather noisy debug printfs that slipped in there
somehow.
2003-10-28 01:06:37 +00:00
marcel
23e5537e11 The previous commit removed both clause 3 and clause 4 from the UCB
license. Only clause 3 has been revoked. Restore the fourth clause
as clause 3.

Pointed out by: das@

Remove my name as a copyright holder since I don't use a BSD license
compatible or comparable to the UCB license. I choose not to add a
complete second license for my work for aesthetic reasons, nor to
replace the UCB license on grounds of rewriting more than 90% of the
source files. The rewrite can also be seen as an enhancement and since
the files were practically empty, it's rather trivial to have changed
90% of the files.
2003-10-27 22:54:34 +00:00
jhb
0e8406cd42 Fix pmap_unmapdev() to call pmap_kremove() instead of implementing it
directly so that it more closely mirrors pmap_mapdev() which calls
pmap_kenter().
2003-10-27 22:15:02 +00:00
scottl
bf1f504459 Directly call the 'boot' word instead of indirectly evaluating it.
Submitted by: dcs
2003-10-27 16:39:49 +00:00
harti
9a473fa2fc When we cannot allocate an external buffer (bacause we've hit
the maximum number of pages for buffers) return -1 instead of 0.
This fixes a panic under conditions when many mbufs are needed.

Update the head pointer of the receive buffer pool queue even when
we could not supply a buffer to the chip. Otherwise the chip will
not re-interrupt us for another try. A better strategy would probably
be to remember this condition and to supply buffers without an interrupt
as soon as buffers get available.
2003-10-27 16:21:59 +00:00
harti
e6e4f72758 Allow building the NgATM SAAL layer directly into the kernel. 2003-10-27 11:19:08 +00:00
jeff
d74e8d0250 - Don't set td_priority directly here, use sched_prio(). 2003-10-27 07:15:47 +00:00
ume
d382f2f692 M_DONTWAIT was passed into malloc().
Submitted by:	Ian Dowse <iedowse@maths.tcd.ie>
2003-10-27 07:15:22 +00:00
jeff
729434c982 - Use a better algorithm in sched_pctcpu_update()
Contributed by:	Thomaswuerfl@gmx.de

 - In sched_prio(), adjust the run queue for threads which may need to move
   to the current queue due to priority propagation .
 - In sched_switch(), fix style bug introduced when the KSE support went in.
   Columns are 80 chars wide, not 90.
 - In sched_switch(), Fix the comparison in the idle case and explicitly
   re-initialize the runq in the not propagated case.
 - Remove dead code in sched_clock().
 - In sched_clock(), If we're an IDLE class td set NEEDRESCHED so that threads
   that have become runnable will get a chance to.
 - In sched_runnable(), if we're not the IDLETD, we should not consider
   curthread when examining the load.  This mimics the 4BSD behavior of
   returning 0 when the only runnable thread is running.
 - In sched_userret(), remove the code for setting NEEDRESCHED entirely.
   This is not necessary and is not implemented in 4BSD.
 - Use the correct comparison in sched_add() when checking to see if an idle
   prio task has had it's priority temporarily elevated.
2003-10-27 06:47:05 +00:00
imp
28b8c0b6ea const char ** needs to be passed, not char **. 2003-10-27 06:41:40 +00:00
njl
dbcf41401d Call the VESA reset BIOS vector on the resume path. This may help displays
after resume.  I have not found it to break anything.
2003-10-27 06:26:51 +00:00
ken
0e546cc32c In camperiphdone(), make sure we check for fatal errors and bail out
instead of retrying them blindly.

This should fix some of the problems people have been having with cdrom
drives taking a long time to probe.  This should also eliminate the need
for the initial TUR in cdsize().

cam_periph.c:	Don't keep retrying if the error we get back is a fatal
		error.  This should help us detect the transition from
		"Logical unit not ready, cause not reportable" to "Medium
		not present" in the "TUR many" handler.  (The TUR many
		handler gets triggered for Logical unit not ready, cause
		not reportable errors.)

scsi_cd.c:	Remove the initial test unit ready in cdsize().  Hopefully
		it isn't necessary after the above change.

Submitted by:	gibbs (mostly)
Tested by:	peter
MFC After:	2 weeks
2003-10-27 06:15:55 +00:00
alc
f42a987e4e - Complete the synchronization of accesses to the swblock hash table. 2003-10-27 05:58:15 +00:00
marcel
367436bcad Add support for userland to access I/O port space. This is primarily
added for XFree86. There are 2 reasons for doing this with sysarch():
1. The memory mapped I/O space is not at a fixed physical address. An
   application has to use some interface to get the base address. It
   gets worse if the machine has multiple memory mapped I/O spaces.
2. Access to the memory mapped I/O space needs to happen through a
   translation that is flagged as uncachable. There's no interface
   that allows a process to do uncached memory I/O, other than though
   /dev/mem (possibly).

So, until we either disallow direct access to I/O or bus space from
userland or have a better way of doing this, sysarch() has the least
negative impact on existing interfaces.
2003-10-27 05:45:35 +00:00
imp
3543793158 sync to 1.77 2003-10-27 05:37:34 +00:00