that goes to opt_dontuse.h is so an opt_*.h file doesn't get created even
though an option may be used for bringing stuff in via files[.*].
Pointed out by: bde
that are linked into the kernel. The KLD compilation options are
changed to call these functions, rather than in-lining the
atomic operations.
This approach makes atomic operations from KLDs significantly
faster on UP systems (though somewhat slower on SMP systems).
PR: i386/13111
Submitted by: peter.jeremy@alcatel.com.au
about a dev_t.
printf("%x", dev) now becomes printf("%s", devtoname(dev)) because
printing actual information about the device is much more useful then
printing a pointer to an address that would never help the developer debug.
Submitted by: phk, bde
didn't match the argument (p->p_pid).
While I'm at it, also fix the dupo in the format string and fix the annoying
inconsistency in all the debug-printfs wrt p_pid arguments. Change all of them
to use the %ld format specifier and cast the p_pid arguments to long.
Submitted by: billf
0x40 and then access data stored in real-mode segment 0x40, even when
called in protected mode. Microsoft unfortunately coddle these individuals,
and so must we if we want to run their code.
This change works around GPFs in some APM and PnP BIOS implementations.
Obtained from: Linux
The first reason for this rewrite is KNF conformance.
The second reason is to avoid redundancy. Each function printed the same
string, with only the syscall name being different. The actual printing is now
performed by a single function, which gets the syscall name as an argument.
The third reason is that of convenience. It's now very easy to add a new
dummy implementation. Just add ``DUMMY(foo);'' to the file. It's also a lot
easier now to see if a syscall has a dummy implementation or not.
The dummies are ordered on syscall number. Please maintain this when adding
new dummies (there're 32 candidates at the time of writing :-)
Reviewed by: bde
prototypes of o{s|g}etrlimit (from sys/sysproto.h). Update linux_{s|g}etrlimit
so that the arguments to o{s|g}etrlimit are corresponding the prototypes.
Pointed out by: bde
apm_bioscall() to check requested BIOS is supported or not.
- Add workaround in apm_driver_version() for the buggy BIOSes which
don't return the connection version in %ax.
PR: i386/13028
Reviewed by: sanpei@sanpei.org and Warner Losh.
functions use the new sigset_t and sigaction_t which allows support for more
than 32 signals. Only the lower 32 signals are supported for now.
linux_rt_sigaction, linux_sigaction and linux_signal use linux_do_sigaction
to do the actual work. That way unnecessary redundancy is avoided. The same
has been done for linux_rt_sigprocmask and linux_sigprocmask. They call
linux_do_sigprocmask to do the actual work.
of kernel space. Remove the ioctl supporting functions, and move the actual
code to the switch-statement. Now everybody can clearly see that the
implementation is really poor.
Also fix a typo in LINUX_TIOCGETD. The underlying function was given command
TIOCSETD instead op TIOCGETD...
Introduce BUF_STRATEGY(struct buf *, int flag) macro, and use it throughout.
please see comment in sys/conf.h about the flag argument.
Remove strategy argument from all the diskslice/label/bad144
implementations, it should be found from the dev_t.
Remove bogus and unused strategy1 routines.
Remove open/close arguments from dssize(). Pick them up from dev_t.
Remove unused and unfinished setgeom support from diskslice/label/bad144 code.
changes. This is part 1 of the complete termios ioctl fixes.
o change type of c_{i|o|c|l}flag in struct termios from unsigned long to
unsigned int. The type now matches the Linux definitions.
o replaced constants by the corresponding defines in sptab[] for clarity.
Since there's no define for 135 baud, its mapping has been dropped.
function bsd_to_linux_termios:
o Fix typo IXON -> IXANY.
o Remove bogus assignment to c_cc[LINUX_VSWTC].
function linux_to_bsd_termios:
o Fix dupo LINUX_IXON -> LINUX_IXANY.
o Add LINUX_CREAD mapping.
o Fix typo IEXTEN -> LINUX_IEXTEN.
function linux_to_bsd_termio:
o Small optimization: Don't preset the complete c_cc array when we next
assign to the first LINUX_NCC entries.
in deterministic behaviour. In this case known garbage out.
The fix is different than suggested in the PR.
PR: 12749
Originator: Boris Nikolaus <boris@cs.tu-berlin.de>
The linux syscalls translate the arguments first before invoking the
FreeBSD native syscalls.
PR: kern/9591
Originator: John Plevyak <jplevyak@inktomi.com>
as PCI->HOST bridges on my (440BX) box.
My change is to remove the test at the beginning entirely, letting the
switch on the device ID happen first. If the device ID is unknown, then
(in the default case) check for the generic PCIS_BRIDGE_HOST tag. This
should allow wierd cases (eg: wpaul's IMS VL bridge) to work by using the
id override. This strategy is more in line with the other PCI match
methods we use elsewhere,
I only have a limited testbed, but having my USB etc devices detected as
PCI->HOST bridges doesn't look good.
correctly. It has the following code:
if (class != PCIC_BRIDGE || subclass != PCIS_BRIDGE_HOST)
return NULL;
My 486 has an Integrated Micro Solutions PCI bridge which identifies
itself as subclass PCIS_BRIDGE_OTHER, not PCIS_BRIDGE_HOST. Consequently,
it gets ignored. In my opinion, the correct test should be:
if ((class != PCIC_BRIDGE) && (subclass != PCIS_BRIDGE_HOST))
return NULL;
That way the test still succeeds because the chip's class is PCIC_BRIDGE.
Clearly it's not reasonable to expect all host to PCI bridges to always
have a subclass of PCIS_BRIDGE_HOST since I've got one that doesn't.
This way the sanity test should remain relatively sane while still allowing
some oddball yet correct hardware to work. If somebody has a better way
to do it, go ahead and tweak the test, but be aware that
class == PCIC_BRIDGE and subclass == PCIS_BRIDGE_OTHER is a valid case.
While I was here, I also added an explicit ID string for the IMS chipset.
I also dealt with a minor style nit: it's bad karma not to have a default
case for your switch statements, but the one in this routine doesn't have
one. The default string of "Host to PCI bridge" is now assigned in a
default case of the switch statement instead of initializing "s" with the
string before the switch and then not having any default case.
Config(8) contains no documentation about this.
Fix the help for the PnP irq and drq commands. This one caused
me a bit of head scratching the other night while trying to get
a problematic PnP device configured properly.
we create the pty on the fly when it is first opened.
If you run out of ptys now, just MAKEDEV some more.
This also demonstrate the use of dev_t->si_tty_tty and dev_t->si_drv1
in a device driver.
in the pathname translation procedure. This proves fatal, and can be
easily fixed. This or a similar change needs to be committed to svr4_util.h
and ibcs2_util.h. I will update ibcs2_util.h, if noone else thinks of a
better way to do this, in the same manner. I will leave svr4 to the
respective maintainer.
This closes the problem of the only crash I've been able to produce as
a user recently, except for (currently not-in-the-source tree) fd
table sharing fixes. Thanks goes to pho for his stress-testers.
of 2 weeks ago that this be done, and anyone who wishes to make bpf more
selective according to securelevel or compile-time options is more
than free to do so.
eisa_add_intr() which now takes an additional arguement (one of
EISA_TRIGGER_LEVEL or EISA_TRIGGER_EDGE).
The flag RR_SHAREABLE has no effect when passed to
bus_alloc_resource(dev, SYS_RES_IRQ, ...) in an EISA device context as
the eisa_alloc_resource() call (bus_alloc_resource method) now deals
with this flag directly, depending on the device ivars.
This change does nothing more than move all the 'shared = inb(foo + iobsse)'
nonesense to the device probe methods rather than the device attach.
Also, print out 'edge' or 'level' in the IRQ announcement message.
Reviewed by: dfr
- Add support for calling 32-bit code in other segments
- Add support for calling 16-bit protected mode code
Update APM to use this facility.
Submitted by: jlemon
- device_print_child() either lets the BUS_PRINT_CHILD
method produce the entire device announcement message or
it prints "foo0: not found\n"
Alter sys/kern/subr_bus.c:bus_generic_print_child() to take on
the previous behavior of device_print_child() (printing the
"foo0: <FooDevice 1.1>" bit of the announce message.)
Provide bus_print_child_header() and bus_print_child_footer()
to actually print the output for bus_generic_print_child().
These functions should be used whenever possible (unless you can
just use bus_generic_print_child())
The BUS_PRINT_CHILD method now returns int instead of void.
Modify everything else that defines or uses a BUS_PRINT_CHILD
method to comply with the above changes.
- Devices are 'on' a bus, not 'at' it.
- If a custom BUS_PRINT_CHILD method does the same thing
as bus_generic_print_child(), use bus_generic_print_child()
- Use device_get_nameunit() instead of both
device_get_name() and device_get_unit()
- All BUS_PRINT_CHILD methods return the number of
characters output.
Reviewed by: dfr, peter
active or not. The only sane thing we can do here is assume that if
APM is supported it might be active at some point, and bail.
In reality, even this isn't good enough; regardless of whether we support
APM or not, the system may well futz with the CPU's clock speed and throw
the TSC off. We need to stop using it for timekeeping except under
controlled circumstances. Curse the lack of a dependable high-resolution
timer.
equivalent to SYS_RES_MEMORY for x86 but for alpha, the rman_get_virtual()
address of the resource is initialised to point into either dense-mapped
or bwx-mapped space respectively, allowing direct memory pointers to be
used to device memory.
Reviewed by: Andrew Gallatin <gallatin@cs.duke.edu>
macros) to the signal handler, for old-style BSD signal handlers as
the second (int) argument, for SA_SIGINFO signal handlers as
siginfo_t->si_code. This is source-compatible with Solaris, except
that we have no <siginfo.h> (which isn't even mentioned in POSIX
1003.1b).
An rather complete example program is at
http://www3.cons.org/cracauer/freebsd-signal.c
This will be added to the regression tests in src/.
This commit also adds code to disable the (hardware) FPU from
userconfig, so that you can use a software FP emulator on a machine
that has hardware floating point. See LINT.
ethernet controllers based on the AIC-6915 "Starfire" controller chip.
There are single port, dual port and quad port cards, plus one 100baseFX
card. All are 64-bit PCI devices, except one single port model.
The Starfire would be a very nice chip were it not for the fact that
receive buffers have to be longword aligned. This requires buffer
copying in order to achieve proper payload alignment on the alpha.
Payload alignment is enforced on both the alpha and x86 platforms.
The Starfire has several different DMA descriptor formats and transfer
mechanisms. This driver uses frame descriptors for transmission which
can address up to 14 packet fragments, and a single fragment descriptor
for receive. It also uses the producer/consumer model and completion
queues for both transmit and receive. The transmit ring has 128
descriptors and the receive ring has 256.
This driver supports both FreeBSD/i386 and FreeBSD/alpha, and uses newbus
so that it can be compiled as a loadable kernel module. Support for BPF
and hardware multicast filtering is included.
Change "void *" to "volatile TYPE *", improving type safety
and eliminating some warnings (e.g., mp_machdep.c rev 1.106).
cpufunc.h:
Eliminate setbits. As defined, it's not precisely correct;
and it's redundant. (Use atomic_set_int instead.)
ipl_funcs.c:
Use atomic_set_int instead of setbits.
systm.h:
Include atomic.h.
Reviewed by: bde
When creating new processes (or performing exec), the new page
directory is initialized too early. The kernel might grow before
p_vmspace is initialized for the new process. Since pmap_growkernel
doesn't yet know about the new page directory, it isn't updated, and
subsequent use causes a failure.
The fix is (1) to clear p_vmspace early, to stop pmap_growkernel
from stomping on memory, and (2) to defer part of the initialization
of new page directories until p_vmspace is initialized.
PR: kern/12378
Submitted by: tegge
Reviewed by: dfr
The structure is the right length, but some of the members (notably
wi_q_info) were off a bit. This causes the received signal strength
values to appear bogus.
- Support for setting memory range attributes on SMP systems using the
new SMP rendezvous function
- Don't print the confusing default memory type message.
- Allow legal overlapping range types.
- Turn interrupts back on after setting MTRRs in UP mode (whoops)
- Don't waste time calling invltlb() after wbinvd(); it's not
SMP-compatible (interrupts are off) and unncessary because
wbinvd already flushes the TLB.
This code is now essentially feature-complete.
the caller to specify a function to be guarded between an entry and exit
barrier, as well as pre- and post-barrier functions.
The primary use for this function is synchronised update of per-cpu private
data. The implementation is almost (but not quite) MI; with a better
mechanism for masking per-CPU interrupts it could probably be hoisted.
Reviewed by: peter (partially)
but broken, since tsc_timecounter is not initialised in that case,
and updating an uninitialised timecounter is fatal.
Fixed style bugs in the machdep.i8254_freq and machdep.tsc_freq
sysctls.
Reviewed by: phk
with respect to interrupts on UP systems. (The upgrade from gcc 2.7.x
to egcs 1.1.2 produced at least one non-atomic code sequence in
swap_pager_getpages.)
In addition, the primitives are now SMP-safe, but only on SMPs. (For
portability between SMPs and UPs, modules are compiled with the SMP-safe
versions.)
Submitted by: dillon and myself
Reviewed by: bde
not masked during handling of shared PCI interrupts. This resulted in
ASTs sometimes being discarded and softclock interrupts sometimes being
handled prematurely (sometimes = quite often on systems with shared PCI
interrupts, never on other systems).
Debugged by: gibbs and other people at plutotech.com
PR: 6944, maybe 12381
gigabit ethernet adapters. This includes two single port cards
(single mode and multimode fiber) and two dual port cards (also single
mode and multimode fiber). SysKonnect is currently the only
vendor with a dual port gigabit ethernet NIC.
The ports on dual port adapters are treated as separate network
interfaces. Thus, if you have an SK-9844 dual port SX card, you
should have both sk0 and sk1 interfaces attached. Dual port cards
are implemented using two XMAC II chips connected to a single
SysKonnect GEnesis controller. Hence, dual port cards are really
one PCI device, as opposed to two separate PCI devices connected
through a PCI to PCI bridge. Note that SysKonnect's drivers use
the two ports for failover purposes rather that as two separate
interfaces, plus they don't support jumbo frames. This applies to
their Linux driver too. :)
Support is provided for hardware multicast filtering, BPF and
jumbo frames. The SysKonnect cards support TCP checksum offload
however this feature is not currently enabled (hopefully it will
be once we get checksum offload support).
There are still a few things that need to be implemeted, like
the ability to communicate with the on-board LM80 voltage/temperature
monitor, but I wanted to get the driver under CVS control and into
-current so people could bang on it.
A big thanks for SysKonnect for making all their programming info
for these cards (and for their FDDI and token ring cards) available
without NDA (see www.syskonnect.com).
large (1G) memory machine configurations. I was able to run 'dbench 32'
on a 32MB system without bring the machine to a grinding halt.
* buffer cache hash table now dynamically allocated. This will
have no effect on memory consumption for smaller systems and
will help scale the buffer cache for larger systems.
* minor enhancement to pmap_clearbit(). I noticed that
all the calls to it used constant arguments. Making
it an inline allows the constants to propogate to
deeper inlines and should produce better code.
* removal of inherent vfs_ioopt support through the emplacement
of appropriate #ifdef's, with John's permission. If we do not
find a use for it by the end of the year we will remove it entirely.
* removal of getnewbufloops* counters & sysctl's - no longer
necessary for debugging, getnewbuf() is now optimal.
* buffer hash table functions removed from sys/buf.h and localized
to vfs_bio.c
* VFS_BIO_NEED_DIRTYFLUSH flag and support code added
( bwillwrite() ), allowing processes to block when too many dirty
buffers are present in the system.
* removal of a softdep test in bdwrite() that is no longer necessary
now that bdwrite() no longer attempts to flush dirty buffers.
* slight optimization added to bqrelse() - there is no reason
to test for available buffer space on B_DELWRI buffers.
* addition of reverse-scanning code to vfs_bio_awrite().
vfs_bio_awrite() will attempt to locate clusterable areas
in both the forward and reverse direction relative to the
offset of the buffer passed to it. This will probably not
make much of a difference now, but I believe we will start
to rely on it heavily in the future if we decide to shift
some of the burden of the clustering closer to the actual
I/O initiation.
* Removal of the newbufcnt and lastnewbuf counters that Kirk
added. They do not fix any race conditions that haven't already
been fixed by the gbincore() test done after the only call
to getnewbuf(). getnewbuf() is a static, so there is no chance
of it being misused by other modules. ( Unless Kirk can think
of a specific thing that this code fixes. I went through it
very carefully and didn't see anything ).
* removal of VOP_ISLOCKED() check in flushbufqueues(). I do not
think this check is necessary, the buffer should flush properly
whether the vnode is locked or not. ( yes? ).
* removal of extra arguments passed to getnewbuf() that are not
necessary.
* missed cluster_wbuild() that had to be a cluster_wbuild_wb() in
vfs_cluster.c
* vn_write() now calls bwillwrite() *PRIOR* to locking the vnode,
which should greatly aid flushing operations in heavy load
situations - both the pageout and update daemons will be able
to operate more efficiently.
* removal of b_usecount. We may add it back in later but for now
it is useless. Prior implementations of the buffer cache never
had enough buffers for it to be useful, and current implementations
which make more buffers available might not benefit relative to
the amount of sophistication required to implement a b_usecount.
Straight LRU should work just as well, especially when most things
are VMIO backed. I expect that (even though John will not like
this assumption) directories will become VMIO backed some point soon.
Submitted by: Matthew Dillon <dillon@backplane.com>
Reviewed by: Kirk McKusick <mckusick@mckusick.com>
than a review, this was a nice puzzle.
This is supposed to be binary and source compatible with older
applications that access the old FreeBSD-style three arguments to a
signal handler.
Except those applications that access hidden signal handler arguments
bejond the documented third one. If you have applications that do,
please let me know so that we take the opportunity to provide the
functionality they need in a documented manner.
Also except application that use 'struct sigframe' directly. You need
to recompile gdb and doscmd. `make world` is recommended.
Example program that demonstrates how SA_SIGINFO and old-style FreeBSD
handlers (with their three args) may be used in the same process is at
http://www3.cons.org/tmp/fbsd-siginfo.c
Programs that use the old FreeBSD-style three arguments are easy to
change to SA_SIGINFO (although they don't need to, since the old style
will still work):
Old args to signal handler:
void handler_sn(int sig, int code, struct sigcontext *scp)
New args:
void handler_si(int sig, siginfo_t *si, void *third)
where:
old:code == new:second->si_code
old:scp == &(new:si->si_scp) /* Passed by value! */
The latter is also pointed to by new:third, but accessing via
si->si_scp is preferred because it is type-save.
FreeBSD implementation notes:
- This is just the framework to make the interface POSIX compatible.
For now, no additional functionality is provided. This is supposed
to happen now, starting with floating point values.
- We don't use 'sigcontext_t.si_value' for now (POSIX meant it for
realtime-related values).
- Documentation will be updated when new functionality is added and
the exact arguments passed are determined. The comments in
sys/signal.h are meant to be useful.
Reviewed by: BDE
into uipc_mbuf.c. This reduces three sets of identical tunable code to
one set, and puts the initialisation with the mbuf code proper.
Make NMBUFs tunable as well.
Move the nmbclusters sysctl here as well.
Move the initialisation of maxsockets from param.c to uipc_socket2.c,
next to its corresponding sysctl.
Use the new tunable macros for the kern.vm.kmem.size tunable (this should have
been in a separate commit, whoops).
print_AMD_info(), L2 internal cache is shown, as are AMD's special CPUID
infos:
CPU: AMD-K6(tm) 3D processor (350.81-MHz 586-class CPU)
Origin = "AuthenticAMD" Id = 0x58c Stepping=12
Features=0x8021bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,PGE,MMX>
AMD Features=0x808029bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,SYSCALL,PGE,MMX,3DNow!>
PR: kern/12512
Submitted by: Louis A. Mamakos <louie@TransSys.COM>
frames (or just insane received packet lengths generated due to errors
reading from the NIC's internal buffers). Anything too large to fit
safely into an mbuf cluster buffer is discarded and an error logged.
I have not observed this problem with my own cards, but on user has
reported it and adding the sanity test seems reasonable in any case.
Problem noted and patch provided by: Per Andersson <per@cdg.chalmers.se>
we will never use more memory than this value (if specified), but will always
check memory for validity up to this amount.
Get rid of the speculative_mprobe option; the memory amount can now be
specified by hw.physmem.
QUEUE_AGE, QUEUE_LRU, and QUEUE_EMPTY we instead have QUEUE_CLEAN,
QUEUE_DIRTY, QUEUE_EMPTY, and QUEUE_EMPTYKVA. With this patch clean
and dirty buffers have been separated. Empty buffers with KVM
assignments have been separated from truely empty buffers. getnewbuf()
has been rewritten and now operates in a 100% optimal fashion. That is,
it is able to find precisely the right kind of buffer it needs to
allocate a new buffer, defragment KVM, or to free-up an existing buffer
when the buffer cache is full (which is a steady-state situation for
the buffer cache).
Buffer flushing has been reorganized. Previously buffers were flushed
in the context of whatever process hit the conditions forcing buffer
flushing to occur. This resulted in processes blocking on conditions
unrelated to what they were doing. This also resulted in inappropriate
VFS stacking chains due to multiple processes getting stuck trying to
flush dirty buffers or due to a single process getting into a situation
where it might attempt to flush buffers recursively - a situation that
was only partially fixed in prior commits. We have added a new daemon
called the buf_daemon which is responsible for flushing dirty buffers
when the number of dirty buffers exceeds the vfs.hidirtybuffers limit.
This daemon attempts to dynamically adjust the rate at which dirty buffers
are flushed such that getnewbuf() calls (almost) never block.
The number of nbufs and amount of buffer space is now scaled past the
8MB limit that was previously imposed for systems with over 64MB of
memory, and the vfs.{lo,hi}dirtybuffers limits have been relaxed
somewhat. The number of physical buffers has been increased with the
intention that we will manage physical I/O differently in the future.
reassignbuf previously attempted to keep the dirtyblkhd list sorted which
could result in non-deterministic operation under certain conditions,
such as when a large number of dirty buffers are being managed. This
algorithm has been changed. reassignbuf now keeps buffers locally sorted
if it can do so cheaply, and otherwise gives up and adds buffers to
the head of the dirtyblkhd list. The new algorithm is deterministic but
not perfect. The new algorithm greatly reduces problems that previously
occured when write_behind was turned off in the system.
The P_FLSINPROG proc->p_flag bit has been replaced by the more descriptive
P_BUFEXHAUST bit. This bit allows processes working with filesystem
buffers to use available emergency reserves. Normal processes do not set
this bit and are not allowed to dig into emergency reserves. The purpose
of this bit is to avoid low-memory deadlocks.
A small race condition was fixed in getpbuf() in vm/vm_pager.c.
Submitted by: Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by: Kirk McKusick <mckusick@mckusick.com>
behavior slightly.
If machine/bus.h is included, but neither bus_memio.h nor bus_pio.h
are included, then behave as if both were included.
This won't change existing drivers, all of which include one or more
of bus_{p,mem}io.h, but will allow drivers from other systems to come
over with fewer changes. I freely admit that this might not be
optimal for some drivers, but those drivers can be optimized for
FreeBSD after the initial bringup happens.
Without the change, there is a bug that preclude drivers from
compiling with strange warning/errors.
I've been running this here for a while now w/o ill effects.
Reviewed by: gibbs
Not objected to by: bde, arch@ list.