Alot of the code in sys/kern directly accesses the *Q_HEAD and *Q_ENTRY
structures for list operations. This patch makes all list operations
in sys/kern use the queue(3) macros, rather than directly accessing the
*Q_{HEAD,ENTRY} structures.
Reviewed by: phk
Submitted by: Jake Burkholder <jake@checker.org>
PR: 14914
Alot of the code in sys/kern directly accesses the *Q_HEAD and *Q_ENTRY
structures for list operations. This patch makes all list operations
in sys/kern use the queue(3) macros, rather than directly accessing the
*Q_{HEAD,ENTRY} structures.
This batch of changes compile to the same object files.
Reviewed by: phk
Submitted by: Jake Burkholder <jake@checker.org>
PR: 14914
All Makefiles now use MACHINE_ARCH for the target architecture.
Unification is required for cross-building.
Tags added to:
sys/boot/Makefile
sys/boot/arc/loader/Makefile
sys/kern/Makefile
usr.bin/cpp/Makefile
usr.bin/gcore/Makefile
usr.bin/truss/Makefile
usr.bin/gcore/Makefile:
fixed typo: MACHINDE -> MACHINE_ARCH
Note: Previous commit to these files (except coda_vnops and devfs_vnops)
that claimed to remove WILLRELE from VOP_RENAME actually removed it from
VOP_MKNOD.
Correctly lock vnodes when calling VOP_OPEN() from filesystem mount code.
Unify spec_open() for bdev and cdev cases.
Remove the disabled bdev specific read/write code.
devsw_module_handler() indirectly and not use the chain arguments. To
eliminate this indirection via that function (which does nothing now)
without duplicating a modevent handler into all the routines that don't
presently have one, supply a NOP (do nothing, return OK) routine which
is functionally equivalent to what's there now. This is a hack and is
still wrong, because there doesn't appear to be anything to reclaim
resources on an unload of a module with one of these in it. I'm not
sure whether to make the NOP handler refuse a MOD_UNLOAD event or what.
We currently only search SCSI and IDE CDROMs; if there's felt to be a
need for supporting the very old and rare soundcard etc. drives for this
application they can be trivially added.
In order to achieve this, root filesystem mount is moved from
SI_ORDER_FIRST to SI_ORDER_SECOND in the SI_SUB_MOUNT_ROOT sysinit
group. Now, modules which wish to usurp the default root mount
can use SI_ORDER_FIRST.
A compiled-in or preloaded MFS filesystem will become the root
filesystem unless the vfs.root.mountfrom environment variable refers
to a valid bootable device. This will normally only be the case when
the kernel and MFS image have been loaded from a disk which has a
valid /etc/fstab file. In this case, the variable should be manually
overridden in the loader, or the kernel booted with -a. In either
case "mfs:" should be supplied as the new value.
Also fix a typo in one DFLTROOT case that would not have compiled.
filesystem is discovered. Preference is given to using the kernel
environment variable vfs.root.mountfrom, which is set by the loader
according to the contents of /etc/fstab. Changes in the MD code
provide fallback mechanisms for systems not using the loader.
A more robust fallback path is also provided, with the last recourse
being to prompt on the console for a root device.
These changes drastically simplify the machine-dependant parts of
the root configuration process. In addition, support for CDROM root
devices has been removed; it was a nasty hack and didn't work.
be ignored by default by the df(1) program. This is used mostly to
avoid stat()-ing entries that do not represent "real" disk mount
points (such as those made by an automounter such as amd.) It is
also useful not to have to stat() these entries because it takes
longer to report them that for other file systems, being that these
mount points are served by a user-level file server and resulting in
several context switches. Worse, if the automounter is down
unexpectedly, a causal df(1) will hang in an interruptible way.
PR: kern/9764
Submitted by: Erez Zadok <ezk@cs.columbia.edu>
"rw" argument, rather than hijacking B_{READ|WRITE}.
Fix two bugs (physio & cam) resulting by the confusion caused by this.
Submitted by: Tor.Egge@fast.no
Reviewed by: alc, ken (partly)
Merge the contents (less some trivial bordering the silly comments)
of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts
the #defines for the vm_inherit_t and vm_prot_t types next to their
typedefs.
This paves the road for the commit to follow shortly: change
useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE}
as argument.
slightly older version of this code was tested by BDE and I.
Also fixes a lockup situation when kva gets too fragmented.
Remove the maxvmiobufspace variable and sysctl, they are no longer
used. Also cleanup (remove) #if 0 sections from prior commits.
This code is more of a hack, but presumably the whole buffer cache
implementation is going to be rewritten in the next year so it's no
big deal.
resource_list_release. This removes the dependancy on the
layout of ivars.
* Move set_resource, get_resource and delete_resource from
isa_if.m to bus_if.m.
* Simplify driver code by providing wrappers to those methods:
bus_set_resource(dev, type, rid, start, count);
bus_get_resource(dev, type, rid, startp, countp);
bus_get_resource_start(dev, type, rid);
bus_get_resource_count(dev, type, rid);
bus_delete_resource(dev, type, rid);
* Delete isa_get_rsrc and use bus_get_resource_start instead.
* Fix a stupid typo in isa_alloc_resource reported by Takahashi
Yoshihiro <nyan@FreeBSD.org>.
* Print a diagnostic message if we can't assign resources to a PnP
device.
* Change device_print_prettyname() so that it doesn't print
"(no driver assigned)-1" for anonymous devices.
can provide the correct context to each signal handler.
Fix broken sigsuspend(): don't use p_oldsigmask as a flag, use SAS_OLDMASK
as we did before the linuxthreads support merge (submitted by bde).
Move ps_sigstk from to p_sigacts to the main proc structure since signal
stack should not be shared among threads.
Move SAS_OLDMASK and SAS_ALTSTACK flags from sigacts::ps_flags to proc::p_flag.
Move PS_NOCLDSTOP and PS_NOCLDWAIT flags from proc::p_flag to procsig::ps_flag.
Reviewed by: marcel, jdp, bde
- Let physio take read/write compatible args and have it use uio->uio_rw
to determine the direction.
- physread/physwrite are now #defines for physio
- Remove the inversly named minphys(), dev->si_iosize_max takes over.
- Physio() always uses pbufs.
- Fix the check for non page-aligned transfers, now only unaligned
transfers larger than (MAXPHYS - PAGE_SIZE) get fragmented (only
interesting for tapes using max blocksize).
- General wash-and-clean of code.
Constructive input from: bde
NOTE: This will break building ntpd until ntpd has been upgraded to also
support draft 05. People that want to build ntpd in the meantime can
get patches from me.
add nodes to the tree. Also, default to bus_generic_driver_added for
the BUS_DRIVER_ADDED method.
This allows newbus busses to be kldload'd.
Reviewed by: dfr
- Move intrhook stuff into kernel.h
- Remove all occurrences of #device <device.h>
- Add kernel.h were necessary (nowhere)
- delete device.h
This file contained the structures for cfdata (old style config) and is no
longer used. It was included by most drivers.
It confuses the remote debugger as the definition of 'struct device' in
device.h is found before the one in bus_private.h.
two new functions spec_buf{read|write}.
Add sysctl vfs.bdev_buffered which defaults to 1 == true. This
sysctl can be used to experimentally turn buffered behaviour for
bdevs off. I should not be changed while any blockdevices are
open. Remove the misplaced sysctl vfs.enable_userblk_io.
No other changes in behaviour.
around meant that the higher level close routine never gets called.
(phk is on the road; this is a quick fix to get things working and may need
more polish)
-----------------------------
The core of the signalling code has been rewritten to operate
on the new sigset_t. No methodological changes have been made.
Most references to a sigset_t object are through macros (see
signalvar.h) to create a level of abstraction and to provide
a basis for further improvements.
The NSIG constant has not been changed to reflect the maximum
number of signals possible. The reason is that it breaks
programs (especially shells) which assume that all signals
have a non-null name in sys_signame. See src/bin/sh/trap.c
for an example. Instead _SIG_MAXSIG has been introduced to
hold the maximum signal possible with the new sigset_t.
struct sigprop has been moved from signalvar.h to kern_sig.c
because a) it is only used there, and b) access must be done
though function sigprop(). The latter because the table doesn't
holds properties for all signals, but only for the first NSIG
signals.
signal.h has been reorganized to make reading easier and to
add the new and/or modified structures. The "old" structures
are moved to signalvar.h to prevent namespace polution.
Especially the coda filesystem suffers from the change, because
it contained lines like (p->p_sigmask == SIGIO), which is easy
to do for integral types, but not for compound types.
NOTE: kdump (and port linux_kdump) must be recompiled.
Thanks to Garrett Wollman and Daniel Eischen for pressing the
importance of changing sigreturn as well.
-----------------------------
Rename sigaction, sigprocmask, sigpending and sigsuspend to
osigaction, osigprocmask, osigpending and osigsuspend (resp)
and add new syscalls for them to support the new sisgset_t
without breaking existing binaries.
Change the prototype of sigaltstack to use the typedef stack_t
instead of struct sigaltstack to reflect that it is SUSv2
compliant.
Also, rename sigreturn to osigreturn and add a new syscall
to support the modified stackframe. The change is caused by
sigreturn operating on ucontext_t now and the fact that
siginfo_t has been updated to conform to SUSv2.
an empty mbuf to stay in the queue, then causing a needless panic
because sb_cc == 0 and sb_mbcnt != 0.
But we still need to panic rather than endlessly looping if, for
some reason, sb_cc == 0 and there are non-empty mbufs in the queue.
PR: kern/11988
Reviewed by: fenner
lock specifications in kern/vnode_if.src. At present, this do not
distinguish between exclusive and shared locks, and the kernel is so full
of bugs in this area that running with auto-generation of assertions
enabled makes DEBUG_VFS_LOCKS totally useless for anybody that has used it
for anything prior to outputting automated assertions. Due to this, I made
vnode_if.sh only output locking assertions if you have the environment
variable DEBUG_ALL_VFS_LOCKS set to "YES". In order to actually use the
assertions, you need to also add "options DEBUG_VFS_LOCKS" to your kernel
config file.
Urged to commit by: phk
have been there in the first place. A GENERIC kernel shrinks almost 1k.
Add a slightly different safetybelt under nostop for tty drivers.
Add some missing FreeBSD tags
fields in struct cdevsw:
d_stop moved to struct tty.
d_reset already unused.
d_devtotty linkage now provided by dev_t->si_tty.
These fields will be removed from struct cdevsw together with
d_params and d_maxio Real Soon Now.
The changes in this patch consist of:
initialize dev->si_tty in *_open()
initialize tty->t_stop
remove devtotty functions
rename ttpoll to ttypoll
a few adjustments to these changes in the generic code
a bump of __FreeBSD_version
add a couple of FreeBSD tags
d_maxio is replaced by the dev->si_iosize_max field which the driver
should be set in all calls to cdevsw->d_open if it has a better
idea than the system wide default.
The field is a generic dev_t field (ie: not disk specific) so that
tapes and other devices can use physio as well.
clustering issues (replacing code that used to be in
ufs/ufs/ufs_readwrite.c). vm_fault also now uses the new VM page counter
inlines.
This completes the changeover from vnode->v_lastr to vm_entry_t->v_lastr
for VM, and fp->f_nextread and fp->f_seqcount (which have been in the
tree for a while). Determination of the I/O strategy (sequential, random,
and so forth) is now handled on a descriptor-by-descriptor basis for
base I/O calls, and on a memory-region-by-memory-region and
process-by-process basis for VM faults.
Reviewed by: David Greenman <dg@root.com>, Alan Cox <alc@cs.rice.edu>
the terminating zero by copying MAXCOMLEN + 1 bytes. This fixes the garbage
that occasionally appeared behind the programname when it is at least MAXCOMLEN
bytes long (such as communicator-4.61-bin).
spaces which cross a segment boundry in the page table. pmap_kextract()
is not designed for access to the user space portion of the page
table and cannot handle the null-page-directory-entry case.
The fix is to have vm_fault_quick() return a success or failure which
is then used to avoid calling pmap_kextract().
improperly ignored the B_INVAL flag when acting on the B_ERROR.
If both B_INVAL and B_ERROR are set the buffer is typically out of the
underlying device's block range and must be destroyed. If only B_ERROR
is set (for a write), a write error occured and operation remains as it
was before: the buffer must be redirtied to avoid corrupting the
filesystem state.
Reviewed by: David Greenman <dg@root.com>
Submitted by: Tor.Egge@fast.no
far-reaching in fd-land, so you'll want to consult the code for
changes. The biggest change is that now, you don't use
fp->f_ops->fo_foo(fp, bar)
but instead
fo_foo(fp, bar),
which increments and decrements the fp refcount upon entry and exit.
Two new calls, fhold() and fdrop(), are provided. Each does what it
seems like it should, and if fdrop() brings the refcount to zero, the
fd is freed as well.
Thanks to peter ("to hell with it, it looks ok to me.") for his review.
Thanks to msmith for keeping me from putting locks everywhere :)
Reviewed by: peter
addaliasu() into addalias() (no operational change) and clarify comments
relating to a trick that vclean() uses.
The fix to BOOTP is yet another hack. Actually, rootfsid handling
is already a major hack. The whole thing needs to be cleaned up.
Reviewed by: David Greenman <dg@root.com>, Alan Cox <alc@cs.rice.edu>
Make a sonewconn3() which takes an extra argument (proc) so new sockets created
with sonewconn() from a user's system call get the correct credentials, not
just the parent's credentials.
WARNING: libdevstat, iostat, vmstat, systat etc etc will need a recompile.
Add devstat_end_transaction_buf() which pulls all the vital data out
of a struct buf which is ready for biodone().
to buffered block devices are allowed. The default is to be backwards
compatible, i.e. reads and writes are allowed.
The idea is for a larger crowd to start running with this disabled and
see what problems, if any, crop up, and then to change the default to
off and see if any problems crop up in the next 6 months prior to
potentially removing support entirely. There are still a few people,
Julian and myself included, who believe the buffered block device
access from usermode to be useful.
Remove use of vnode->v_lastr from buffered block device I/O in
preparation for removal of vnode->v_lastr field, replacing it with
the already existing seqcount metric to detect sequential operation.
Reviewed by: Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>
warnings caused by the arg having the wrong type (not const enough).
The arg was also wrong (a full name instead of a short one) for calls
from from subr_diskmbr.c and pc98/diskslice_machdep.c.
reasonable defaults.
This avoids confusing and ugly casting to eopnotsupp or making dummy functions.
Bogus casting of filesystem sysctls to eopnotsupp() have been removed.
This should make *_vfsops.c more readable and reduce bloat.
Reviewed by: msmith, eivind
Approved by: phk
Tested by: Jeroen Ruigrok/Asmodai <asmodai@wxs.nl>
a quick think and discussion among various people some form of some of
these changes will probably be recommitted.
The reversion requested was requested by dg while discussions proceed.
PHK has indicated that he can live with this, and it has been agreed
that some form of some of these changes may return shortly after further
discussion.
- eliminate the fast/slow timeout lists for TCP and instead use a
callout entry for each timer.
- increase the TCP timer granularity to HZ
- implement "bad retransmit" recovery, as presented in
"On Estimating End-to-End Network Path Properties", by Allman and Paxson.
Submitted by: jlemon, wollmann
the highly non-recommended option ALLOW_BDEV_ACCESS is used.
(bdev access is evil because you don't get write errors reported.)
Kill si_bsize_best before it kills Matt :-)
Use the specfs routines rather having cloned copies in devfs.