vm_page_t's.
- Add a KTR_TRAP tracepoint to trap() on the alpha that displays the
contents of a0, a1, and a2 to make debugging of nested traps that
panic before displaying any useful output easier.
may need the clock lock for nanotime().
- Add KTR trace events for lock list manipulations and other witness
operations.
- Use a temporary variable instead of setting the lock list head directly
and then setting up the links to add a new lock list entry to the lock
list. This small race could result in witness "forgetting" about all
the locks held by this process temporarily during an interrupt.
- Close a more fatal race condition when removing a lock from a list.
Removing a lock from the list entails both decrementing the count of
items in this bucket as well as shuffling items in the current bucket up
a notch to replace the gap left by the removed item. Wrap these
operations in a critical section.
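
The two list fixes above can be sketched in userland C as follows. This is a minimal illustration, not the witness code itself: the structure layout, LOCK_NCHILDREN, and the critical_enter()/critical_exit() stand-ins are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define LOCK_NCHILDREN	4	/* hypothetical bucket size */

struct lock_list_entry {
	struct lock_list_entry *ll_next;
	const char *ll_children[LOCK_NCHILDREN];
	int ll_count;
};

/* Userland stand-ins for the kernel's critical section primitives. */
static int critical_depth;
static void critical_enter(void) { critical_depth++; }
static void critical_exit(void) { critical_depth--; }

/*
 * Add a new bucket in front of *head.  The links are set up on the new
 * entry first; the list head is only written once, at the end, so an
 * interrupt never sees a head whose ll_next is not yet valid.
 */
void
lock_list_push(struct lock_list_entry **head, struct lock_list_entry *lle)
{
	assert(head != NULL && lle != NULL);
	lle->ll_count = 0;
	lle->ll_next = *head;
	*head = lle;
}

/*
 * Remove slot idx from a bucket.  Shuffling the later items up and
 * decrementing the count happen inside one critical section.
 */
void
lock_list_remove(struct lock_list_entry *lle, int idx)
{
	int i;

	assert(idx >= 0 && idx < lle->ll_count);
	critical_enter();
	for (i = idx; i < lle->ll_count - 1; i++)
		lle->ll_children[i] = lle->ll_children[i + 1];
	lle->ll_count--;
	critical_exit();
}
```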
class to trace witness events.
- Make the ktr_cpu field of ktr_entry be a standard field rather than one
present only in the KTR_EXTEND case.
- Move the default definition of KTR_ENTRIES from sys/ktr.h to
kern/kern_ktr.c. It has not been needed in the header file since KTR
was un-inlined.
- Minor include cleanup in kern/kern_ktr.c.
- Fiddle with the ktr_cpumask in ktr_tracepoint() to disable KTR events
on the current CPU while we are processing an event.
- Set the current CPU inside of the critical section to ensure we don't
migrate CPUs after the critical section but before we set the CPU.
switch. Count the context switch when preempting the current thread to let
a higher priority thread blocked on a mutex we just released run as an
involuntary context switch.
Reported by: bde
rejecting INTR_FAST interrupts. Since they can't be shared anyway,
this just short circuits a failure case that should work but is panic
fodder now.
This bug is that if the interrupt condition is active when you activate
the interrupt, then the interrupt routine will be called. jhb had
a patch that may or may not work to fix it, but I've lost it.
This may be due to the sio probe doing something odd too.
only do getcred calls for sockets which were created in the same jail.
This should allow the ident to work in a reasonable way within jails.
PR: 28107
Approved by: des, rwatson
people are on track with the cause and effect of this, and although
fixing this severely degenerate case appears to violate the letter of
POSIX.1-200x, Bruce and I (and enough others) agree that it should be
committed.
So, this patch generates an ENOENT error for any attempt to do a path lookup
through an empty symlink (e.g. open(), stat()).
Submitted by: "Andrey A. Chernov" <ache@nagual.pp.ru>
Reviewed by: bde
Discussed exhaustively on: freebsd-current
Previously committed to: NetBSD 4 years ago
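
A minimal sketch of the check described above, assuming the lookup code has already read the link and knows its length. The helper name and its placement are hypothetical and do not reflect the actual lookup code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: given the contents of a symlink that a path
 * lookup has just read, return 0 if the lookup may continue through
 * the link, or ENOENT if the link is empty.
 */
int
symlink_target_check(const char *target, size_t linklen)
{
	assert(target != NULL);
	if (linklen == 0)
		return (ENOENT);	/* empty symlink: nothing to follow */
	return (0);
}
```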
connection. The information contained in a tcptemp can be
reconstructed from a tcpcb when needed.
Previously, tcp templates required the allocation of one
mbuf per connection. On large systems, this change should
free up a large number of mbufs.
Reviewed by: bmilekic, jlemon, ru
MFC after: 2 weeks
functions in ifconfig. "ifconfig an0" should output the correct
status now. Also, make the read and write functions both more
robust and more consistent. This should stop most of the incorrect
size complaints and eliminate the possibility of panics from firmware
that increases resource sizes.
PR: kern/27826
Reviewed by: imp, jlemon
Submitted by: Doug Ambrisko <ambrisko@ambrisko.com>
David Wolfskill <dhw@whistle.com>
- Grab Giant around ktrace points.
- Clean up KTR_PROC tracepoints to not display the value of
sched_lock.mtx_lock as it isn't really needed anymore and just obfuscates
the messages.
- Add a few if conditions to replace gotos.
- Ensure that every msleep KTR event ends up with a matching msleep resume
KTR event (this was broken when we didn't do a mi_switch()).
- Only note via ktrace that we resumed from a switch once rather than twice
in several places in msleep().
- Remove spl's from asleep and await as the proc lock and sched_lock provide
all the needed locking.
- In mawait() add in a needed ktrace point for noting that we are about to
switch out.
lock until after grabbing the sched_lock to avoid CURSIG racing with
psignal.
- Don't grab Giant for addupc_task() as it isn't needed.
Reported by: tegge (signal race), bde (addupc_task a while back)
rather than grabbing it and releasing it themselves. This allows callers
of these functions to get the lock to close race conditions.
- Grab Giant around ktrace in postsig.
- Count the switches performed on SIGSTOP's as involuntary context switches
in the resource usage stats.
Reported by: tegge (signal race), bde (missing csw stats)
introduce a modified allocation mechanism for mbufs and mbuf clusters; one
which can scale under SMP and which offers the possibility of resource
reclamation to be implemented in the future. Notable advantages:
o Reduce contention for SMP by offering per-CPU pools and locks.
o Better use of data cache due to per-CPU pools.
o Much less code cache pollution due to excessively large allocation macros.
o Framework for `grouping' objects from same page together so as to be able
to possibly free wired-down pages back to the system if they are no longer
needed by the network stacks.
Additional things changed with this addition:
- Moved some mbuf specific declarations and initializations from
sys/conf/param.c into mbuf-specific code where they belong.
- m_getclr() has been renamed to m_get_clrd() because the old name is really
confusing. m_getclr() HAS been preserved though and is defined to the new
name. No tree sweep has been done "to change the interface," as the old
name will continue to be supported and is not deprecated. The change was
merely done because m_getclr() sounds too much like "m_get a cluster."
- TEMPORARILY disabled mbtypes statistics displaying in netstat(1) and
systat(1) (see TODO below).
- Fixed systat(1) to display number of "free mbufs" based on new per-CPU
stat structures.
- Fixed netstat(1) to display new per-CPU stats based on sysctl-exported
per-CPU stat structures. All information is fetched via sysctl.
TODO (in order of priority):
- Re-enable mbtypes statistics in both netstat(1) and systat(1) after
introducing an SMP friendly way to collect the mbtypes stats under the
already introduced per-CPU locks (i.e. hopefully don't use atomic() - it
seems too costly for a mere stat update, especially when other locks are
already present).
- Optionally have systat(1) display not only "total free mbufs" but also
"total free mbufs per CPU pool."
- Fix minor length-fetching issues in netstat(1) related to recently
re-enabled option to read mbuf stats from a core file.
- Move reference counters at least for mbuf clusters into an unused portion
of the cluster itself, to save space and avoid the need to allocate a counter.
- Look into introducing resource freeing possibly from a kproc.
Reviewed by (in parts): jlemon, jake, silby, terry
Tested by: jlemon (Intel & Alpha), mjacob (Intel & Alpha)
Preliminary performance measurements: jlemon (and me, obviously)
URL: http://people.freebsd.org/~bmilekic/mb_alloc/
are duplicated by newly defined types/options in RFC3121
- We have no backward compatibility issue. There are no apps in our
distribution which use the above types/options.
Obtained from: KAME
MFC after: 2 weeks
lock. We now use temporary variables to save the process argument pointer
and just update the pointer while holding the lock. We then perform the
free on the cached pointer after releasing the lock.
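
The pattern described above, sketched in userland C with a pthread mutex standing in for the kernel lock. The structure and function names are assumptions made for illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct pargs_sketch {			/* hypothetical argument block */
	int ar_length;
};

struct proc_sketch {
	pthread_mutex_t p_lock;
	struct pargs_sketch *p_args;
};

/*
 * Replace the argument pointer.  Only the pointer update happens with
 * the lock held; the old block is cached in a temporary and freed
 * after the lock has been released.
 */
void
proc_set_args(struct proc_sketch *p, struct pargs_sketch *newargs)
{
	struct pargs_sketch *old;

	assert(p != NULL);
	pthread_mutex_lock(&p->p_lock);
	old = p->p_args;
	p->p_args = newargs;
	pthread_mutex_unlock(&p->p_lock);

	free(old);			/* free(NULL) is a no-op */
}
```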
. staticize out_fdc(), there's no longer an ft(4) driver sharing its use
. remove in_fdc(); it was last used by ft(4) and was long since obsoleted
by fd_in()
. move the declaration of fd_clone() to where most of the other function
declarations are
. de-__P()ify fd_clone(), it's been the only __P()ed function in the
entire file
something: offset into the first mbuf of the target chain before copying
the source data over.
Make drivers using m_devget() with a first argument "data - ETHER_ALIGN"
to use the offset argument to pass ETHER_ALIGN in. The way it was previously
done is potentially dangerous if the source data was at the top of a page
and the offset caused the previous page to be copied (if the
previous page has not yet been appropriately mapped).
The old `offset' argument in m_devget() is not used anywhere (it's always
0) and dates back to ~1995 (and earlier?) when support for ethernet trailers
existed. With that support gone, it was merely collecting dust.
Tested on alpha by: jlemon
Partially submitted by: jlemon
Reviewed by: jlemon
MFC after: 3 weeks
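
To illustrate the difference, here is a toy copy routine (not m_devget() itself; the name and signature are invented for this sketch). The caller passes the alignment as an offset into the destination, so the source pointer is never moved backwards and nothing before the start of the source data is read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETHER_ALIGN	2

/*
 * Toy stand-in for a copy routine that takes an offset: copy len bytes
 * from src into dst, starting off bytes into dst.  Returns a pointer to
 * the start of the copied data, or NULL if it does not fit.
 */
char *
buf_devget(char *dst, size_t dstlen, const char *src, size_t len, size_t off)
{
	assert(dst != NULL && src != NULL);
	if (off > dstlen || len > dstlen - off)
		return (NULL);
	memcpy(dst + off, src, len);	/* reads only src[0 .. len-1] */
	return (dst + off);
}
```

A driver would then call buf_devget(dst, dstlen, data, len, ETHER_ALIGN) rather than handing in "data - ETHER_ALIGN" and copying len + ETHER_ALIGN bytes.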
the console device was open. At other times, the interrupts that
are used to detect the break signal or ~^B sequence were disabled,
so these events would not be noticed until the next open (e.g. the
next kernel printf). This was mainly a problem while there was no
getty running on the console, such as during bootup or shutdown.
For serial consoles with break-to-debugger support, we now enable
the generation of interrupts at attach time, and we leave them
enabled while the device is closed.
Reviewed by: bde (I've since made changes as per his suggestions)
via the new DIGIIO_SETALTPIN ioctl, and allow the port's ALTPIN setting
to be queried via DIGIIO_GETALTPIN.
The initial state and lock devices are normally used to set and/or
lock ALTPIN settings although the device itself may also be used.
ALTPIN settings are applied per-device and apply to both the callin
and callout device at the same time.
sizeof(ro_dst) is not necessarily the correct one.
this change would also fix the recent path MTU discovery problem for the
destination of an incoming TCP connection.
Submitted by: JINMEI Tatuya <jinmei@kame.net>
Obtained from: KAME
MFC after: 2 weeks
of tunclose() rather than the end, and tunopen() grabbed that unit
before tunclose() finished (one process is allocating it while another
is freeing it!).
It may be worth hanging some sort of rw mutex around all specinfo
calls where d_close and the detach handler get a write lock and all
other functions get a read lock. This would guarantee certain levels
of ``atomicity'' (is that a word?) that people may expect (I believe
Solaris does something like this).
requirements(RFC1573, interface MIB). This change for 4.4BSD was
first introduced in if_ethersubr.c:1.17->1.18.
BTW, updating of iflastchange is inconsistent across interface types, e.g.
ether, tun: update
fddi, tokenring, ppp: not update
I'll make a patch later.
Obtained from: KAME
MFC after: 2 weeks
converting from the old external mbuf buffer code to the new (with the
MEXTADD() macro). Also free free list memory correctly in
foo_free_jumbo_mem() routines: grab the head of the list, then
remove it, _then_ free() it.
This fixes the memory corruption problem I've been chasing in the level 1
driver.
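
The free-list teardown order described above, sketched with the SLIST macros from <sys/queue.h>. The entry type and function name are hypothetical, not taken from any particular driver.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

struct jpool_entry {			/* hypothetical free list entry */
	int slot;
	SLIST_ENTRY(jpool_entry) jpool_entries;
};
SLIST_HEAD(jpool_head, jpool_entry);

/* Free every entry on the list; returns the number of entries freed. */
int
jpool_drain(struct jpool_head *head)
{
	struct jpool_entry *entry;
	int n = 0;

	assert(head != NULL);
	while (!SLIST_EMPTY(head)) {
		entry = SLIST_FIRST(head);		/* grab the head */
		SLIST_REMOVE_HEAD(head, jpool_entries);	/* then unlink it */
		free(entry);				/* _then_ free it */
		n++;
	}
	return (n);
}
```

Freeing the entry before unlinking it would make SLIST_REMOVE_HEAD() read the next pointer out of memory that has already been released.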
and its associated constants. Implement _SC_IOV_MAX in the usual way.
Be a bit sloppy about the namespace question; this should get cleared up
in time for 5.0.
MFC after: 1 month
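
From userland the limit can be queried with sysconf(3). A small sketch; the wrapper name and the fallback handling are assumptions for the example:

```c
#include <assert.h>
#include <unistd.h>

/*
 * Return the maximum number of iovec entries accepted by readv() and
 * writev(), or the caller's fallback if the system reports no limit.
 */
long
iov_max(long fallback)
{
	long n;

	assert(fallback > 0);
	n = sysconf(_SC_IOV_MAX);
	return (n > 0 ? n : fallback);
}
```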
startup routine more closely matches that of alpha and ia64. At some
point the common mutexes shared across all platforms probably should move
into sys/kern_mutex.c.
trace code that was brought over from NetBSD.)
- Check for "syscall_with_err_pushed" as the label prior to a syscall trap
frame rather than "Xlcall_syscall" and "Xint0x80_syscall". We don't
have a valid trapframe during the short range of code that those two
symbols now cover.
- Simplify db_next_frame() to avoid duplicating the code for the different
trap frame types.
- Don't try to trace a swapped-out process. (Brought over from NetBSD via
the new alpha trace code.)
1.307 Turn on kernel debug support
1.309 Turn off pcm
1.311 move wx to miibus chipsets
1.312 Comment out USERCONFIG
Reminded by: mihira-san <sanpei@sanpei.org>
information until the problems can be tracked down. Right now these
are unconditional, but later they will be hidden behind a boot verbose.
Also, if there are no events listed in the event mask, return right
away. Specifically avoid writing back interrupt acks in this case.
1: most drivers are sensitive to timing, and
2: the handlers are MPSAFE and need a chance to get into the kernel
before some other non-mpsafe handler blocks the ithread on Giant in
shared irq cases.
Reviewed by: cg (in principle)
worked before.
mixer, dsp and sndstat are separate devices - give them their own cdevsws
instead of demuxing requests sent to a single cdevsw.
use the si_drv1/si_drv2 fields in dev_t structures for holding information
specific to an open instance of mixer/dsp.
nuke /dev/{dsp,dspW,audio}[0-9]* links - this functionality is now provided
using cloning.
various locking fixes.
ports later on.
This includes the basic MI interface routines as well as a console driver.
The MD code is kept in the MD directories.
Reviewed by: obrien
us our first minimal glimpse of PowerPC support.
With this code we can get to the "mountroot>" prompt on my Apple iMac. We
can't get any further due to lack of clock and interrupt handling, among other
things. This does however mean that pmap and VM are initialising.
We're fairly dependent on OpenFirmware at this point, but I hope to add
support for other classes of firmware at a later stage.
Reviewed by: obrien, dfr
Print type of pci bridge we find.
Force the IRQ of a pci bridge upon all its children.
Allocate the resources on behalf of the bridge when we're testing to see if
they exist.
This should help people who don't read updating instructions very well.
This patch started out with an idea from Shigeru Yamamoto-san in -current.
make(1) wants to build loader.sym *before* the .o files. Eliminating
one seemingly intermediate step avoids the problem. Somehow, it seems
that variables are not getting expanded at the right time.
Any explanations would be appreciated...
Changing:
${BASE}.sym: ${OBJS} ${LIBSTAND} ${LIBFICL} ${LIBALPHA} ${CRT} vers.o
${LD} ...
To:
BASEOBJS= ${OBJS} ${LIBSTAND} ${LIBFICL} ${LIBALPHA} ${CRT} vers.o
${BASE}.sym: ${BASEOBJS}
echo ${BASEOBJS}
${LD} ...
.. the echo only shows LIBFICL, CRT and vers.o. ${OBJS} is not included.
told to use IRQ 6, program the pcic to use irq 7 instead. Evidently,
at least some of the cards are wired this way. If you want to use irq
6, configure it. All the mapping is done just before we set the
interrupt registers. See [FreeBSD98-testers 5064] for details.
Added commentary about valid interrupts on some CBUS pc98 CL PD6722
based cards.
Submitted by: Hiroshi TSUKADA-san <hiroshi@kiwi.ne.jp>
built in, or as an addon card (My Japanese isn't quite good enough to
know which). [FreeBSD98-testers 5098] contains all the details.
Submitted by: Kawanobe Koh-san <kawanobe@st.rim.or.jp>
(I'll bet we know which compiler and platform they developed this on...)
Minimally change them to C89 comments to make GCC happy. (this is kinda funny
as the file has pieces derived from FreeBSD 3.2)
Also fix FreeBSD id style.
The DP83820/83821 has an undocumented limitation concerning jumbo frames
and TX checksum offload. In order for TX checksum offload to work, the
outgoing frame must fit entirely within the TX FIFO, which is 8192 bytes
in size. This isn't a problem, until you try to send a 9000-byte frame,
at which point the TX DMA engine goes to sleep. It turns out that if
you want to send a jumbo frame larger than 8170 bytes (8192 - 64), you
have to turn off the TX checksum support.
As a workaround, I changed nge_ioctl() so that if the user selects an
MTU larger than 8152 bytes, we clear the if_hwassist flags. The flags
will be set again once the MTU is reduced to a smaller value.
- Use db_printf() instead of printf().
- Clean up decode_syscall() to use regular if-then-else rather than goto's.
- Use the same method of parsing PID's for per-process traces as the x86
code does: that is, if the address passed in is not a valid kernel
address, treat it as a decimal pid.
- If the pid of the current process is specified, fall back to using the
"default" parameters for the trace as curproc's pcb is not valid at this
point.
MFC after: 1 week
we want the checksums calculated on a per-packet basis using control bits
in the extsts field of the DMA descriptor structure. For TX, the chip
seems to want these bits set in the field of the first descriptor in
a fragment chain, not the last.
412: warning: long unsigned int format, unsigned int arg (arg 3)
418: warning: long unsigned int format, unsigned int arg (arg 3)
424: warning: long unsigned int format, unsigned int arg (arg 3)
take a const 'name', since they don't modify anything.
159: warning: passing arg 1 of `getenv_int' discards qualifiers...
167: warning: passing arg 1 of `getenv' discards qualifiers from pointer..
vinumhdr.h:80: warning: redundant redeclaration of `vinum_cdevsw'
vinumext.h:239: warning: previous declaration of `vinum_cdevsw'
in each of the following files:
vinum.c, vinumconfig.c, vinumdaemon.c, vinuminterrupt.c, vinumio.c,
vinumioctl.c, vinumlock.c, vinummemory.c, vinumraid5.c, vinumrequest.c,
vinumrevive.c, vinumstate.c, vinumutil.c
musycc.c:449: warning: long unsigned int format, unsigned int arg (arg 3)
musycc.c:449: warning: long unsigned int format, unsigned int arg (arg 4)
musycc.c:453: warning: long unsigned int format, unsigned int arg (arg 3)
musycc.c:453: warning: long unsigned int format, unsigned int arg (arg 4)
musycc.c:453: warning: long unsigned int format, unsigned int arg (arg 5)
These warnings used to be confined to the alpha but are on all now.
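
One common way to silence this class of warning is to cast the argument to the type the format expects. The sketch below shows that approach only; it is not necessarily how the files above were changed, and the helper name is invented.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 32-bit register value.  "%08lx" expects an unsigned long,
 * so the argument is cast to match instead of being passed as a plain
 * unsigned int.
 */
int
format_reg(char *buf, size_t len, uint32_t reg)
{
	assert(buf != NULL);
	return (snprintf(buf, len, "%08lx", (unsigned long)reg));
}
```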
554: passing arg 4 of `resource_string_value' from incompatible pointer type
576: passing arg 4 of `resource_string_value' from incompatible pointer type
593: passing arg 4 of `resource_string_value' from incompatible pointer type
commands that complete (with no apparent error) after
we receive a LIP. This has been observed mostly on
Local Loop topologies. To be safe, let's just mark
all active commands as dead if we get a LIP and we're
on a private or public loop.
MFC after: 4 weeks
could only get a chance of testing it under 4.3, but together with the
if_oltr.c fixes at least it seems to work now. If someone has the chance
to test this under -current, please do.
Unfortunately, the TR code itself (if_iso88025subr.c) is not written
in a way that would allow making a separate KLD out of it. For now,
just link it directly into the oltr KLD since it's probably the POLA
to be able to load the TR code together with the only TR hardware
driver we've got so far.
I've got one single unexplained panic (in doreti_switch or somewhere
there, calling a 0xc1XXXXXX address that did no longer belong to the
kernel at all) after unloading the modules once, thus I don't propose
an MFC of this module even though my testing has been done solely on 4.3,
unless someone is really going to test this stuff in -current.
This avoids a null pointer deref panic in TRlldClose() inside the
vendor-supplied object code. It's now actually possible to unload the
driver.
Implement deallocation of malloc()ed memory regions.
MFC after: 2 months
The symptom being treated in 1.98 was to avoid freeing a
pagedep dependency if there was still a newdirblk dependency
referencing it. That change is correct and no longer prints
a warning message when it occurs. The other part of revision
1.98 was to panic when a newdirblk dependency was encountered
during a file truncation. This fix removes that panic and
replaces it with code to find and delete the newdirblk
dependency so that the truncation can succeed.
cpu_mp_start() is never called, mp_ncpus will have a non-zero value.
This prevents systat from dying with an arithmetic exception caused
by a divide-by-zero error on UP alphas running a GENERIC kernel.
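
The failure mode and the fix can be illustrated as below, assuming the fix is to give mp_ncpus a non-zero default. The helper functions are hypothetical and only model a consumer that divides by the CPU count.

```c
#include <assert.h>

int mp_ncpus = 1;		/* sane default for UP systems */

/* Stand-in for MP startup: only runs when more CPUs are detected. */
void
cpu_mp_start_sketch(int detected)
{
	assert(detected >= 1);
	mp_ncpus = detected;
}

/* A consumer that divides by the CPU count, as systat effectively does. */
long
per_cpu_share(long total)
{
	return (total / mp_ncpus);
}
```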
Replace the a.out emulation of 'struct linker_set' with something
a little more flexible. <sys/linker_set.h> now provides macros for
accessing elements and completely hides the implementation.
The linker_set.h macros have been on the back burner in various
forms since 1998 and have ideas and code from Mike Smith (SET_FOREACH()),
John Polstra (ELF clue) and myself (cleaned up API and the conversion
of the rest of the kernel to use it).
The macros declare a strongly typed set. They return elements with the
type that you declare the set with, rather than a generic void *.
For ELF, we use the magic ld symbols (__start_<setname> and
__stop_<setname>). Thanks to Richard Henderson <rth@redhat.com> for the
trick about how to force ld to provide them for kld's.
For a.out, we use the old linker_set struct.
NOTE: the item lists are no longer null terminated. This is why
the code impact is high in certain areas.
The runtime linker has a new method to find the linker set
boundaries depending on which backend format is in use.
linker sets are still module/kld unfriendly and should never be used
for anything that may be modular one day.
Reviewed by: eivind
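
The ELF mechanism can be demonstrated in a few lines of userland C. This is a sketch of the general technique, not the contents of <sys/linker_set.h>: the macro and set names are invented, and it assumes GCC or Clang with an ELF linker that provides __start_/__stop_ symbols for sections whose names are valid C identifiers.

```c
#include <assert.h>
#include <stddef.h>

struct cmd {
	const char *c_name;
	int c_value;
};

/* Hypothetical macro: place a pointer to sym in the "cmd_set" section. */
#define CMD_SET_ADD(sym)						\
	static struct cmd const * const cmd_set_sym_##sym		\
	    __attribute__((section("cmd_set"), used)) = &sym

static struct cmd cmd_help = { "help", 1 };
static struct cmd cmd_boot = { "boot", 2 };
CMD_SET_ADD(cmd_help);
CMD_SET_ADD(cmd_boot);

/* Section boundaries supplied by the linker. */
extern struct cmd const * const __start_cmd_set[];
extern struct cmd const * const __stop_cmd_set[];

/*
 * Walk the set from start to stop.  The elements are typed pointers,
 * and the list is bounded by the stop symbol, not by a NULL terminator.
 */
int
cmd_set_sum(void)
{
	struct cmd const * const *p;
	int sum = 0;

	for (p = __start_cmd_set; p < __stop_cmd_set; p++) {
		assert(*p != NULL);
		sum += (*p)->c_value;
	}
	return (sum);
}
```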
- Replace some very poorly thought out API hacks that should have been
fixed a long while ago.
- Provide some much more flexible search functions (resource_find_*())
- Use strings for storage instead of an outgrowth of the rather
inconvenient temporary ioconf table from config(). We already had a
fallback to using strings before malloc/vm was running anyway.
The following changes were made by the previous commit:
- add a pointer to struct mauxtag. two integers were too restrictive.
- add m_aux_{add,find}2.
- make sure to nuke mbuf pointed to m_aux.
This work was based on kame-20010528-freebsd43-snap.tgz, and some
critical problems found after the snap was out were fixed.
There are many many changes since last KAME merge.
TODO:
- The definitions of SADB_* in sys/net/pfkeyv2.h are still different
from RFC2407/IANA assignment because of binary compatibility
issue. It should be fixed under 5-CURRENT.
- ip6po_m member of struct ip6_pktopts is no longer used. But, it
is still there because of binary compatibility issue. It should
be removed under 5-CURRENT.
Reviewed by: itojun
Obtained from: KAME
MFC after: 3 weeks
. remove stale comments and a stale #define (from the old days of ft(4))
. make MAX_SEC_SIZE (used in isa_dmainit()) a #define
. fix a typo in a string
. use 0 as the blocksize in devstat_add_entry(), since the actual blocksize
is unknown (devstat(9) suggests to use 0 in that case)
out nearly every platform but the one I tested on requires the intpin
to swizzle out the correct intline.
tested by: Martijn Pronk <mpkisbkl@xs4all.nl> (lca_pci)
page of the image to load section headers and if we let the text section
start at zero, it corrupts the section table when it's loaded. With this
change, the loader gets as far as the 'ok' prompt.