object are not page aligned). This should fix the mount_msdos panic after a
failed attemp to mount as ffs.
Reviewed By: Matthew Dillon <dillon@apollo.backplane.com>
Archie Cobbs <archie@whistle.com>
Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>
there does not seem to be a problem with this.
PR: kern/8732
Analysis by: David G Andersen <danderse@cs.utah.edu>
Tested by: Alfred Perlstein <bright@hotjobs.com>
Submitted by: "Richard Seaman, Jr." <lists@tar.com>
Obtained from: linux :-)
Code to allow Linux Threads to run under FreeBSD.
By default not enabled
This code is dependent on the conditional
COMPAT_LINUX_THREADS (suggested by Garret)
This is not yet a 'real' option but will be within some number of hours.
adjusted related casts to match (only in the kernel in this commit).
The pointer was only wanted in one place in kern_exec.c. Applications
should use the kern.ps_strings sysctl instead of PS_STRINGS, so they
shouldn't notice this change.
across the kernel -> application interface, and for the one sysctl where
they were passed and actually used (kern.ps_strings), the applications
want addresses represented as u_longs anyway (the other sysctl that
passed them, kern.usrstack, has never been used).
Agreed to by: dfr, phk
Obtained from: Stephen Clawson <sclawson@cs.utah.edu>
Wakeup anyone waiting on a mount point prior to returning from umount,
whether an error occurs or not. Fixes a stat/NFS-umount race and other
potential future problems. Fix taken from bug/pr which also indicated
that the same fix has already been applied to OpenBSD and NetBSD.
This is odd, especially in the case of USB where the driver is found
in several tries: vendor specific, class specific, interface specific.
The mouse driver is found at the interface specific level...
Reviewed by: Doug Rabson (dfr@freebsd.org)
0. This makes it difficult to do efficient manipulation of the
struct pollfd since you can't leave a slot empty.
PR: 8599
Submitted-by: Marc Slemko <marcs@znep.com>
for possible buffer overflow problems. Replaced most sprintf()'s
with snprintf(); for others cases, added terminating NUL bytes where
appropriate, replaced constants like "16" with sizeof(), etc.
These changes include several bug fixes, but most changes are for
maintainability's sake. Any instance where it wasn't "immediately
obvious" that a buffer overflow could not occur was made safer.
Reviewed by: Bruce Evans <bde@zeta.org.au>
Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by: Mike Spengler <mks@networkcs.com>
problem is worked around by using an interrupt gate for the page
fault handler. This code was originally made for NetBSD/pc98 by
Naofumi Honda <honda@kururu.math.sci.hokudai.ac.jp> and has already
been in PC98 tree. Because of this bug, trap_fatal cannot show
correct page fault address if %cr2 is obtained in this function.
Therefore, trap_fatal uses the value from trap() function.
- The trap handler always enables interruption when buggy application
or kernel code has disabled interrupts and then trapped. This code
was prepared by Bruce Evans <bde@FreeBSD.org>.
Submitted by: Bruce Evans <bde@FreeBSD.org>
Naofumi Honda <honda@kururu.math.sci.hokudai.ac.jp>
can set if your hw/sw produces the "calcru negative..." message.
Setting the alternate method (sysctl -w kern.timecounter.method=1)
makes the the get{nano|micro}*() functions call the real thing at
resulting in a measurable but minor overhead.
I decided to NOT have the "calcru" change the method automatically
because you should be aware of this problem if you have it.
The problems currently seen, related to usleep and a few other corners
are fixed for both methods.
runtime. p_runtime is unsigned while p_cpulimit is not, so this avoids the
nasty side effect of the process getting killed when the runtime comes up
"negative" due to other bugs.
out interrupts for too long. If you still see the "calcru: negative
time..." message you can increase NTIMECOUNTER (see LINT).
Sideeffect is that a timecounter is required to not wrap around in
less than (1 + delta) seconds instead of the (1/hz + delta) required
until now.
Many thanks to: msmith, wpaul, wosch & bde
from an interrupt context and fsetown() wants to peek at curproc, call
malloc(..., M_WAITOK), and fiddle with various unprotected data structures.
The fix is to move the code that duplicates the F_SETOWN/FIOSETOWN state
of the original socket to the new socket from sonewconn() to accept1(),
since accept1() runs in the correct context. Deferring this until the
process calls accept() is harmless since the process can't do anything
useful with SIGIO on the new socket until it has the descriptor for that
socket.
One could make the case for not bothering to duplicate the
F_SETOWN/FIOSETOWN state and requiring the process to explicitly make the
fcntl() or ioctl() call on the new socket, but this would be incompatible
with the previous implementation and might break programs which rely on
the old semantics.
This bug was discovered by Andrew Gallatin <gallatin@cs.duke.edu>.
Many (mostly machine-dependent ones) are still missing. NIST-PCTS found
this bug for all the ioctls used to implement the POSIX tc* functions
(TIOCCBRK, TIOCDRAIN, TIOCSPGRP, TIOCSBRK, TIOCSTART and TIOCSTOP), and
I found FIOASYNC, TIOCCONS, TIOCEXCL, TIOCHPCL, TIOCNXCL, TIOCSCTTY and
TIOCSDRAINWAIT by inspection. TIOCSPGRP was ifdefed out for some reason.
Handle tcsetattr()'s historical speed conversions correctly and more
centrally:
- don't store speeds of 0 in the final termios struct. Drivers can now
depend on tp->t_ispeed and tp->t_ospeed giving the actual speed.
Applications can now depend on tcgetattr() being POSIX.1 conformant.
- convert from a proposed input speed of 0 to the proposed output speed
(except if that is 0, convert to the current output speed). Drivers
can now depend on the proposed input speed being nonzero.
- don't reject negative speeds. Negative speeds can't happen now that
speed_t is unsigned, and rejecting invalid speeds is a bug - tcsetattr()
is supposed to succeed if it can "perform any of the requested actions",
so it shouldn't fail in practice.
bio interrupts, and a truncated file that along with the precise alignment
of the planets could result in a page being freed multiple times or a
just-freed page being put onto the inactive queue.
system, the mapping from logical to physical block number may be lost.
Hence we have to check for a reconstituted buffer and redo the call to
VOP_BMAP if the physical block number has been lost.
devstat_end_transaction error message that gets printed whenever the
busy count is < 0.
This will help catch drivers that improperly implement devstat(9) support.
MALLOC_DEFINE() and MALLOC_DEFINE() is needed by the recently
reenabled "reallocblks" code, but <sys/kernel.h> was only included
if CLUSTERDEBUG was defined. This was too harmless. gcc only
warns about garbage like `SYSINIT(blech);' at file scope ...
- Interface wth the new resource manager.
- Allow for multiple drivers implementing a single devclass.
- Remove ordering dependencies between header files.
- Style cleanup.
- Add DEVICE_SUSPEND and DEVICE_RESUME methods.
- Move to a single-phase interrupt setup scheme.
Kernel builds on the Alpha are brken until Doug gets a chance to incorporate
these changes on that side.
Agreed to in principle by: dfr
This avoids the fsck-on-reboot symptoms if you're shutting down with a
hung or unreachable NFS server mounted. Also remove non-local
filesystems from the mount list to prevent the system hanging when it tries
to unmount them (for the same reason).
Drew points out that there's a good argument for forcibly removing all
"non syncable" filesystems from the mount list (eg. NFS mounts, disks
that aren't responding, etc.) as this then allows you to sync and
cleanly unmount their parents. No such change is included in this
patch.
Submitted by: Andrew Gallatin <gallatin@cs.duke.edu>
basically do a on-the-fly defragmentation of the FFS filesystem, changing
file block allocations to make them contiguous. Thanks to Kirk McKusick
for providing hints on what needed to be done to get this working.
linker. This is intended to replace kvm_mkdb etc. The first version
only does name->value lookups, but it's open ended. value->name lookups
would probably be a good thing to do too.
It's been suggested to try and connect the symbol tables to sysctl (which
is probably a more flexible way of doing it if it's done right), but that
is far more complex and difficult than I was ready to have a shot at.
by bde, a few other tweaks to get the patch to apply cleanly again and
some improvements to the comments.
This change closes some fairly minor security holes associated with
F_SETOWN, fixes a few bugs, and removes some limitations that F_SETOWN
had on tty devices. For more details, see the description on the PR.
Because this patch increases the size of the proc and pgrp structures,
it is necessary to re-install the includes and recompile libkvm,
the vinum lkm, fstat, gcore, gdb, ipfilter, ps, top, and w.
PR: kern/7899
Reviewed by: bde, elvind
leaked memory on each unload and were limited to items referenced in
the kernel copy of vnode_if.c. Now a kernel module is free to create
it's own VOP_FOO() routines and the rest of the system will happily
deal with it, including passthrough layers like union/umap/etc.
Have VFS_SET() call a common vfs_modevent() handler rather than
inline duplicating the common code all over the place.
Have VNODEOP_SET() have the vnodeops removed at unload time (assuming a
module) so that the vop_t ** vector is reclaimed.
Slightly adjust the vop_t ** vectors so that calling slot 0 is a panic
rather than a page fault. This could happen if VOP_something() was called
without *any* handlers being present anywhere (including in vfs_default.c).
slot 1 becomes the default vector for the vnodeop table.
TODO: reclaim zones on unload (eg: nfs code)
removed at module unload (if in a module of course).
However; this introduces a new dependency on <sys/kernel.h> for things
that use MALLOC_DECLARE(). Bruce told me it is better to add sys/kernel.h
to the handful of files that need it rather than add an extra include to
sys/malloc.h for kernel compiles. Updates to follow in subsequent commits.
dereference a NULL pointer, causing a panic. Instead of following
s_leader to find the session id, store it in the session structure.
Jukka found the following info:
BTW - I just found what I have been looking for. Std 1003.1
Part 1: SYSTEM API [C LANGUAGE] section 2.2.2.80 states quite
explicitly...
Session lifetime: The period between when a session is created
and the end of lifetime of all the process groups that remain
as members of the session.
So, this quite clearly tells that while there is any single
process in any process group which is a member of the session,
the session remains as an independent entity.
Reviewed by: peter
Submitted by: "Jukka A. Ukkonen" <jau@jau.tmt.tele.fi>
of the input file more strict and the error messages more elaborate.
Second, the output file has slightly improved looks when >80 character
lines are concerned (I needed a 80 character line formatter anyway for
work...)."
Submitted by: Nick Hibma <nick.hibma@jrc.it>
truncated to 32 bits.
* Change the calling convention of the device mmap entry point to
pass a vm_offset_t instead of an int for the offset allowing
devices with a larger memory map than (1<<32) to be supported
on the alpha (/dev/mem is one such).
These changes are required to allow the X server to mmap the various
I/O regions used for device port and memory access on the alpha.
we can recurse when loading dependencies and that the kstack is limited
to something like 6 or 7KB. Having a single dependency caused an instant
double panic, and I stronly suspect some of the other strange "events"
that I have seen are possibly as a result of taking a couple of interrupts
with a large chunk of the stack already in use.
While here, fix a minor logic hiccup in a sanity check.
file to a stream socket. sendfile(2) is similar to implementations in
HP-UX, Linux, and other systems, but the API is more extensive and
addresses many of the complaints that the Apache Group and others have
had with those other implementations. Thanks to Marc Slemko of the
Apache Group for helping me work out the best API for this.
Anyway, this has the "net" result of speeding up sends of files over
TCP/IP sockets by about 10X (that is to say, uses 1/10th of the CPU
cycles) when compared to a traditional read/write loop.
Also fix data types and printf formats while I'm here.
PR: misc/8494
Panic instead of looping forever in sbflush(). If sb_mbcnt counts
more mbufs than sb_cc counts bytes, the original code can turn into an
infinite loop of removing 0 bytes from the socket buffer until it's empty.
the NFSv3 ACCESS RPC problems a little for busy clients that do a lot of
open/close. The nfs code could probably cache the results, but I'm not
sure whether this would be legal or useful. The problem is that with
a CPU farm, on each open there would be a lookup, getattr then access RPC
then the read/write RPC activity. Caching the access results probably
isn't going to help much if the clients access lots of files. Having the
nfs_access() routine interpret the getattr results is a bit of a hack, but
it's how NFSv2 is done and it might be OK for a mount attribute for v3.
- Use TAILQ_* macros extensively instead of internal names
- use b_xflags instead of the NOLIST magic number hack in the next pointer
- clean bufs are inserted at the tail rather than the head.
- redo dirty buffer insert so that metadata (negative lbn) goes to the
tail directly rather than at the HEAD. This makes a difference when
inserting dirty data blocks in lbn sorted order since data block
insertion will not have to bypass all the metadata cruft. data is
lbn sorted since it makes sense for clustering and writeback ordering,
while metadata sorting doesn't help much since the lbn's are
meaningless when walking the list for writebacks.
Small systems will not notice much (if any) benefit from this, but really
busy systems with large dirty block lists should get a lot more.
I've tested this with softdep, and it doesn't seem to mind the change of
queueing of metadata.
Reviewed (in princible) by: dg
Obtained from: partly from John Dyson's work-in-progress patches in June.
the old true/false.
While here, have vfs_msync() only call vm_object_page_clean() with
OBJPC_SYNC if called with MNT_WAIT flags. vfs_msync() is called at unmount
time (with MNT_WAIT) and from the syncer process (formerly update).
This should make dirty mmap writebacks a little less nasty.
I have tested this a little with SOFTUPDATES enabled, but I don't normally
use it since I've been badly burned too many times.
installed.
Remove cpu_power_down, and replace it with an entry at the end of the
SHUTDOWN_FINAL queue in the only place it's used (APM).
Submitted by: Some ideas from Bruce Walter <walter@fortean.com>
clear if the check is necessary, but vfs_object_create() is called
for all vnodes and it was silly to create objects for VBLK vnodes
that don't even have a driver.
- dev != NODEV was checked for, but 0 was returned on failure. This was
fixed in Lite2 (except the return code was still slightly wrong (ENODEV
instead of ENXIO)) but the changes were not merged. This case probably
doesn't actually occur under FreeBSD.
- major(dev) was not checked to have a valid non-NULL bdevsw entry. This
caused panics when the driver for the root device didn't exist.
Fixed minor misformattings in bdevvp(). Rev.1.14 consisted mainly of
gratuitous reformattings that seem to have caused many Lite2 merge
errors.
PR: 8417
If you have problems with the "calcru" messages and processes being
killed for excessive cpu time, try to increase the NTIMECOUNTER
#define and report your findings.
- Use the system headers method for Elf32/Elf64 symbol compatability
- get rid of the UPRINTF debugging.
- check the ELF header for compatability much more completely
- optimize the section mapper. Use the same direct VM interfaces that
imgact_aout.c and kern_exec.c use.
- Check the return codes from the vm_* functions better. Some return
KERN_* results, not an errno.
- prefault the page tables to reduce startup faults on page tables like
a.out does.
- reset the segment protection to zero for each loop, otherwise each
segment could get progressively more privs. (eg: if the first was
read/write/execute, and the second was meant to be read/execute, the
bug would make the second r/w/x too. In practice this was not a
problem because executables are normally laid out with text first.)
- Don't impose arbitary limits. Use the limits on headers imposed by
the need to fit them into one page.
- Remove unused switch() cases now that the verbose debugging is gone.
I've been using an earlier version of this for a month or so.
This sped up ELF exec speed a bit for me but I found it hard to get
consistant benchmarks when I tested it last (a few weeks ago).
I'm still bothered by the page read out of order caused by the
transition from data to bss. This which requires either part filling the
transition page or clearing the remainder.
a raw partition at a nonzero offset (EINVAL should have been EXDEV;
DIOCSDINFO was broken, and DIOCWDINFO was broken because it depended
on DIOCSDINFO).
A zero offset for the raw partition should probably be enforced in
setdisklabel(), and DIOCWDINFO should probably always be handled by
first calling setdisklabel() so that writedisklabel() doesn't need to
enforce it, but this has never been done; dsioctl() has a special
check. Changes in this commit are limited to dsioctl() to preserve
bug for bug compatibility in drivers that don't use the slice code
(notably the ccd driver, which allows setting a bogus label in
DIOCWDINFO and doesn't undo the setting when writedisklabel() fails).
partition that the label ioctl is being done on just because it has
offset 0, since there is no guarantee that such a partition is large
enough to contain the label. Don't use the wrong raw partition (0
instead of RAW_PART).
This fixes problems rewriting bizarre labels (with a nonzero offset
for the 'a' partition) in newfs(8). Such labels shouldn't normally
be used, but creating them was allowed if the ioctl was done on the
raw partition, and sysinstall creates them if the root partition isn't
allocated first.
Note that allowing write access to a partition other than the one that
has been checked for write access doesn't increase security holes
significantly, since write access to any partition already allows
changing the in-core label.
This fix should be in 3.0R. Rev.1.26 of newfs/newfs.c shouldn't be
in 3.0R.
This is the bulk of the support for doing kld modules. Two linker_sets
were replaced by SYSINIT()'s. VFS's and exec handlers are self registered.
kld is now a superset of lkm. I have converted most of them, they will
follow as a seperate commit as samples.
This all still works as a static a.out kernel using LKM's.
release goes out the door. We know there's a bug in the devstat
implementation in the wd driver, but bde and msmith haven't been able to
fix it yet.
So, disable the printf to avoid confusing/worrying people.
Suggested by: msmith
1) The vnode pager wasn't properly tracking the file size due to
"size" being page rounded in some cases and not in others.
This sometimes resulted in corrupted files. First noticed by
Terry Lambert.
Fixed by changing the "size" pager_alloc parameter to be a 64bit
byte value (as opposed to a 32bit page index) and changing the
pagers and their callers to deal with this properly.
2) Fixed a bogus type cast in round_page() and trunc_page() that
caused some 64bit offsets and sizes to be scrambled. Removing
the cast required adding casts at a few dozen callers.
There may be problems with other bogus casts in close-by
macros. A quick check seemed to indicate that those were okay,
however.
things, like msdosfs, do not work (panic) on devices with VMIO enabled.
FFS enable VMIO on mounted devices, and nothing previously disabled it, so,
after you mounted FFS floppy, you could not mount msdosfs floppy anymore...)
This is mostly a quick before-release fix.
Reviewed by: bde
Drastically quieten down the verbose load progress messages. They were
more useful for debugging than anything, but are beyond a joke when loading
a few dozen modules.
Simplify the ELF extended symbol table load format. Just take the main
symbol table and the string table that corresponds. This is what we will
be getting local symbols from. (needed for the alpha stack tracebacks).
Use the (optional) full symbol tables in lookups. This means we have to
furhter distinguish between symbols that can come from the dynamic linking
table and the complete table.
The alpha boot code now needs to be adapted as ddb/db_elf.c cannot use
the simpler format.
I have not implemented loading the extended symbol tables from the syscall
interface yet, just for preloaded modules.
I am not sure about the symbol resolution. I *think* it's possible that
a local symbol can be found in preference to a global, depending on the
search sequence and dependency tree.
Formerly, the heuristic involving the interpreter path took
precedence.
Also, print a better error message if the brand is missing or not
recognized. If there is no brand at all, give the user a hint that
"brandelf" needs to be run.
Implement preloading in a fairly MI way, assuming the information is
prepared.
DDB interface helpers.. Provide some support for db_kld.c so that we
don't have to export too much detail.
Debugging and cosmetic nits left in from development..
The other half of the containing file hack so modules can associate
themselves with their "file".
but I can't think of another (relatively) easy way of getting the info
since the boot-time initialization is not done immediately after "loading".
XXX module_register() gained an extra arg. This might break the alpha
compile, if so, just add a zero to get the old behavior.
should probably be moved to i386/i386/link_machdep.c (and the same for the
alpha).
Implement "deleting" a preloaded module by destroying it's tags. This is a
hack. We cannot reuse the data, it's been destroyed by relocation,
statically initialized variables have been modified, etc. Note that to
reclaim the load space is going to be more machine-dependent work.
Implement a relocate hook for machdep.c to call so that the physical
addresses get converted to the equivalent KVM addresses.
- seperate unload for preloaded linker objects.
- Don't build a kernel object if running as an a.out kernel.
- extract the real kernel name rather than hardwiring "kernel" for kldstat.
(sysctl kern.bootfile getst the full name via bootinfo)
- use real addresses on the kernel "module" rather than fictitious ones.
- preloaded module support
- search module path for file modules.
- symbols are checked to see if they are in the right containing file
before using their indexes into string tables. This is to help ddb
since it only supplies a pointer to an opaque symbol and there is no
telling which file/object/module/whatever it came from.
- symbol_values checks that the symbol is indeed belonging to the
correct symbol and string table pairs before looking up. (since there
could be many pairs, and KLD/DDB need to find out).
- different ops for files versus preload modules - the unload mechanism
is different. (a preloaded module has to be deleted on unload since
the in-core image is tainted by relocation and variables used)
- Do not build an a.out kernel module if we're running on an elf
kernel. :-) Note that it should theoretically be possible to
mix a.out and elf KLD modules providing -mno-underscores was used
to compile it, or some other symbol conversion takes place.
- Support preload modules (even though /boot/loader doesn't yet)
- Search the module path when loading files.
check off SYSINIT entries as they are run, and when more arrive, we re-sort
and restart (skipping the already-run entries).
This can *only* be done after KMEM (and malloc) is up and running - this is
fine because KLD is the only consumer of this and it's done after that.
The nice thing about this is that the SYSINIT's within preloaded KLD modules
are executed in their natural order. It should be possible to register
devices for the probes which follow, etc. (soon.. several key things
prevent this, such as use of linker sets for things like pci devices).
help track down bugs in the devstat implementation in various drivers.
(i.e., any situation where the driver does not call the devstat routines
once and only once for each transaction initiation and completion)
Prompted by: msmith
NFS_ROOT will produce kernel that cannot mount a UFS /.
Vfs type numbers must be distinct from VFS_GENERIC (and VFS_VFSCONF, but
that has the same value and should go away).
The problem happens because NFS is the first vfs (in sys/conf order) so it
gets type number 0 and conflicts harmfully with VFS_GENERIC which is also 0.
The conflict is apparently harmless in the usual case when another vfs
gets type number 0, because nfs is the only vfs that has sysctls.
Inital fix by: Dima <dima@tejblum.dnttm.rssi.ru>
Reason why it worked by: bde
Reviewed by: Luoqi Chen <luoqi@watermarkgroup.com>
Fixed problem where write()s can get lost due to buffers flagged B_DELWRI
being improperly released in brelse().
The last consumer of this code (the old SCSI system) has left us and
the CAM code does it's own bouncing. The isa dma system has been
doing it's own bouncing for a while too.
Reviewed by: core
Submitted by: Kirk McKusick <mckusick@McKusick.COM>
Two minor changes are also included,
1. Remove gratuitious checks for error return from vn_lock with LK_RETRY set,
vn_lock should always succeed in these cases.
2. Back out change rev. 1.36->1.37, which unnecessarily makes async mount
a little more unstable. It also keeps us in sync with other BSDs.
Suggested by: Bruce Evans <bde@zeta.org.au>
generation was causing unaligned access faults on the Alpha.
I have incremented the devstat version number, since this is an interface
change. You'll need to recompile libdevstat, systat, iostat, vmstat and
rpc.rstatd along with your kernel.
Partially Submitted by: Andrew Gallatin <gallatin@cs.duke.edu>
minus the NULL pointer dereference in rev. 1.33. Also simplify
things somewhat by eliminating one traversal of the VM map entries.
Finally, eliminate calls to vm_map_{un,}lock_read() which aren't
needed here. I originally took them from procfs_map.c, but here
we know we are dealing only with the map of the current process.
segments (except memory-mapped devices) in the ELF core file. This
is really nice. You get access to the data areas of all shared
libraries, and even to files that are mapped read-write.
In the future, it might be good to add a new resource limit in the
spirit of RLIMIT_CORE. It would specify the maximum sized writable
segment to include in core dumps. Segments larger than that would
be omitted. This would be useful for programs that map very large
files read/write but that still would like to get usable core dumps.
splhigh() after any system dumps have completed. SHUTDOWN_POST_SYNC
isn't quite late enough for disk controllers.
Converted at_shutdown queues to use the queue(3) macros.
object format of the executable being dumped. This is the first
step toward producing ELF core dumps in the proper format. I will
commit the code to generate the ELF core dumps Real Soon Now. In
the meantime, ELF executables won't dump core at all. That is
probably no less useful than dumping a.out-style core dumps as they
have done until now.
Submitted by: Alex <garbanzo@hooked.net> (with very minor changes by me)
detachment of vfs sysctls. Unloading of vfs LKMs doesn't actually
work for any vfs, since it leaves garbage pointers to memory
allocation control structures.
and use this when masking/unmasking interrupts.
Maintain a mapping from (iopaic number, int pin) tuple to irq number,
and use this when configuring devices and programming the ioapics.
Previous code assumed that irq number was equal to int pin number, and
that the ioapic number was 0.
Don't let an AP enter _cpu_switch before all local apics are initialized.
type numbers in vfs attach order (modulo incomplete reuse of old
numbers after vfs LKMs are unloaded). This requires reinitializing
the sysctl tree (or at least the vfs subtree) for vfs's that support
sysctls (currently only nfs). sysctl_order() already handled
reinitialization reasonably except it checked for annulled self
references in the wrong place.
Fixed sysctls for vfs LKMs.
when nfs is an LKM. Declare it in a header file. Don't forget to use
it in non-Lite2 code. Initialize it to -1 instead of to 0, since 0
will soon be the mount type number for the first vfs loaded.
NetBSD uses strcmp() to avoid this ugly global.
device drivers about sectors no longer in use.
Device-drivers receive the call through d_strategy, if they have
D_CANFREE in d_flags.
This allows flash based devices to erase the sectors and avoid
pointlessly carrying them around in compactions.
Reviewed by: Kirk Mckusick, bde
Sponsored by: M-Systems (www.m-sys.com)
- moved definition of MACHINE_ARCH from cpu.h to parm.h as alpha.
- Added definitions of _MACHINE and _MACHINE_ARCH.
- Added hw.ispc98. The hw.ispc98 is 1 in PC98 kernel and is 0 in
IBM-PC kernel.
Discussed with: John Birrell <jb@FreeBSD.ORG>
Add some overflow checks to read/write (from bde).
Change all modifications to vm_page::flags, vm_page::busy, vm_object::flags
and vm_object::paging_in_progress to use operations which are not
interruptable.
Reviewed by: Bruce Evans <bde@zeta.org.au>
for the Lite2 fix for always returning EIO in dead_read().
Cleaned up the cdevswitch initializers for all tty drivers.
Removed explicit calls to ttsetwater() from all (tty) drivers. ttsetwater()
is now called centrally for opens, not just for parameter changes.
another specialized mbuf type in the process. Also clean up some
of the cruft surrounding IPFW, multicast routing, RSVP, and other
ill-explored corners.
They shouldn't be used there either. They should have gone away
about 3 years ago when the statically initialized devswitches went
away, but su.c unfortunately still frobs the cdevswitch in the old
way.
since (hardware) ttys have too low a bandwidth to benefit significantly
from large buffers. Use twice the old limit for the new-default case
and 8 times the old limit for the driver-specifies-watermark case.
Nothing uses these cases yet.
Removed related debugging code.
in a SMP system. Unexpected things could happen if each cpu
has a different ldt setting and one cpu tries to use value
of currentldt set by another cpu.
The fix is to move currentldt to the per-cpu area. It includes
patches I filed in PR i386/6219 which are also user ldt related.
PR: i386/7591, i386/6219
Submitted by: Luoqi Chen <luoqi@watermarkgroup.com>
Cast pointers to (vm_offset_t) instead of to (u_long) (as before) or to
(uintptr_t)(void *) (as would be more correct). Don't cast vm_offset_t's
to (u_long) just to do arithmetic on them.
mp_machdep.c:
Cast pointers to (uintptr_t) instead of to (u_long). Don't forget
to cast pointers to (void *) first or to recover from integral
possible integral promotions, although this is too much work for
machine-dependent code.
vm code generally avoids warnings for pointer vs long size mismatches
by using vm_offset_t to represent pointers; pmap.c often uses plain
`unsigned int' instead of vm_offset_t and didn't use u_long elsewhere,
but this style was messed up by code apparently imported from mp_machdep.c.
in ddb) which I broke by changing %8[l]x to %8p. Hacked the central
printf routine to not add an "0x" prefix for %p formats if the field
width is nonzero. The tables are still horribly misformatted on
64-bit machines.
Use %p instead of %8p to print pointers when the field width isn't
important.
DOS partition type 15 (Extended DOS, LBA) as a container for
DOS logical volumes, so the appropriate slices (e.g. sd1s5)
are not initialized.
PR: 7549
PR: 4120
Reviewed by: phk
Submitted by: Jim Mattson <jmattson@sonic.net>
managed to avoid corruption of this variable by luck (the compiler used a
memory read-modify-write instruction which wasn't interruptable) but other
architectures cannot.
With this change, I am now able to 'make buildworld' on the alpha (sfx: the
crowd goes wild...)
and DSO_NOLABELS flags prevent searching for slices and labels
respectively. Current drivers don't set these flags. When
DSO_NOLABELS is set, the in-core label for the whole disk is cloned
to create an in-core label for each slice. This gives the correct
result (a good in-core label for the compatibility slice) if
DSO_ONESLICE is set or only one slice is found, but usually gives
broken labels otherwise, so DSO_ONESLICE should be set if DSO_NOLABELS
is set.
protection checks. Using the partition-relative blkno in some
parts broke the write protection for partitions at unusual
offsets (only for partitions at offset 1 on i386's).
best place to set it, and the wd and wfd strategy routines don't
set it (for failed transfers) because they expect dscheck() to
initialize everything necessary. dscheck() has always set B_ERROR,
but this is not quite sufficient, because b_resid is used by physio()
to decide how much of a B_ERROR'ed i/o was done.
formats and args to match. Fixed old printf format errors (all related;
most were hidden by calling printf indirectly).
This change somehow avoids compiler bugs for 64-bit longs on i386's,
although it increases the number of 64-bit calculations.
timeslice of the exiting process was counted for both the exiting
process and the next process to run if the next process runs
immediately.
Broken in: mostly in kern_clock.c rev.1.70 (1998/05/28)
Callers only need to initialize d_secperunit now, but should
initialize d_type (to reduce the IDE/SCSI confusion), d_typename
(put the disk model in it) and geometry info (if it isn't completely
ficticious). Callers will soon need to initialize d_secsize.
normally only defined in opt_devfs.h, so testing it before including
anything is normally a no-op. Undef'ing DEVFS before including
opt_devfs.h is similarly useless. OTOH, DEVFS support for sliced
but not SLICEd (despite defined(SLICE)) devices is either harmless
(if there are no such devices, then nothing in this file is used)
or necessary (otherwise). It even seems to work for sliced cd
devices.
nearly so many casts here. Casting an pointer that was an integer
back to an integer just to compare it with -1 is bad, and casting
it back just to compare it with NULL is just wrong.
Access the entry address as a uintfptr_t, not as a long, and not
necessarily as what modload(8) passes (it takes a u_long from the
exec header and passes a u_int).
Fixed bitrot in K&R support (suword() now takes a long word).
Didn't fix corresponding bitrot in store.9 and fetch.9.
The correct types for the store and fetch families are problematic.
The `word' functions are unfortunately named and need to be split
to handle ints/longs/object pointers/function pointers. Storing
argv[] as longs is quite broken when longs are longer than pointers,
but usually works because it clobbers variables that will soon be
reinitialized.
Hopefully caddr_t is large enough to hold function pointers.
Cast object pointers to uintptr_t before casting them to u_long.
Types are wronger than usual for the PT_READ_U case. ptrace() can
only return ints, but longs are accessed.
respectively. Most of the longs should probably have been
u_longs, but this changes is just to prevent warnings about
casts between pointers and integers of different sizes, not
to fix poorly chosen types.
suitable for holding object pointers (ptrint_t -> uintptr_t).
Added corresponding signed type (intptr_t). Changed/added
corresponding non-C9x types for function pointers to match. Don't
use nonstandard types to implement these types, and don't comment
on them in <machine/types.h>.