VOP_LINK(). The reason of strange behavior was wrong order of the
argument, that is, the operation
# ln foo bar
in a union fs tried to do
# ln bar foo
in ufs layer.
Now we can make a link in a union fs.
fix!
The ufs_link() assumes that vnode is not unlocked and tries to lock it
in certain case. Because union_link calls VOP_LINK after locking vnode,
vn_lock in ufs_link causes above panic.
Currently, I don't know the real fix for a locking violation in
union_link, but I think it is important to avoid panic.
A vnode is unlocked before calling VOP_LINK and is locked after it if
the vnode is not union fs. Even though panic went away, the process
that access the union fs in which link was made will hang-up.
Hang-up can be easily reproduced by following operation:
mount -t union a b
cd b
ln foo bar
ls
were always in a tss; that tss just changed from the one in the
pcb to common_tss (who knows where it was when there was no curpcb?).
Not using the pcb also fixed the problem that there is no pcb in
idle(), so we now always get useful register values.
same directory pair.
If we do:
mount -t union a b
mount -t union a b
then, (1) namei tries to lock fs which has been already locked by
first union mount and (2) union_root() tries to lock locked fs. To
avoid first deadlock condition, unlock vnode if lowerrootvp is union
node, and to avoid second case, union_mount returns EDEADLK when multi
union mount is detected.
for the ix driver.
Add a shutdown hook that resets the etherexpress so that Windoze can find
the card after a warm boot.
Submitted by: Aaron Smith <aaron@tau.veritas.com>
Obtained From: NetBSD
switching to a child for the first time was being counted twice. I think
this only affected unimportant statistics.
Simplified arg handling in fork_trampoline(). splz() doesn't actually
smash the registers of interest.
cause noise.
Duplicated the lseek() redeclaration hack for all functions involving
off_t's (ftruncate(), mmap() and truncate()) to help broken programs
work.
<machine/types.h> gets redefined in the non-GNU and non-ANSI cases.
Since this hasn't caused problems, there must be no one actually
benefitting from the obfuscations supported by <sys/cdefs.h>.
`make CC="cc -traditional"' in /usr/src/bin shows the same. Almost
everything is broken in essentially the same way - `const' is used
in strings before <sys/cdefs.h> is included, so `const' is not
#defined away until after it is used.
Fixed some style bugs.
dolock is not set (that is, targetvp == overlaying vnode object).
Current code use FIXUP macro to do this, and never unlocks overlaying
vnode object in union_fsync. So, the vnode object will be locked
twice and never unlocked.
PR: 3271
Submitted by: kato
relookup() in union_relookup() is succeeded. However, if relookup()
returns non-zero value, that is relookup fails, VOP_MKDIR is never
called (c.f. union_mkshadow). Thus, pathname buffer is never FREEed.
Reviewed by: phk
Submitted by: kato
PR: 3262
allow large systems to boot successfully with bounce buffers compiled
in. We are now limiting bounce space to 512K. The 8MB allocated for
a 512MB system is very bogus -- and that is now fixed.
the pv entries. This problem has become obvious due to the increase
in the size of the pv entries. We need to create a more intelligent
policy for pv entry management eventually.
Submitted by: David Greenman <dg@freebsd.org>
cache queue more often. The pageout daemon had to be waken up
more often than necessary since pages were not put on the
cache queue, when they should have been.
Submitted by: David Greenman <dg@freebsd.org>
fork. (On my machine, fork is about 240usecs, vfork is 78usecs.)
Implement rfork(!RFPROC !RFMEM), which allows a thread to divorce its memory
from the other threads of a group.
Implement rfork(!RFPROC RFCFDG), which closes all file descriptors, eliminating
possible existing shares with other threads/processes.
Implement rfork(!RFPROC RFFDG), which divorces the file descriptors for a
thread from the rest of the group.
Fix the case where a thread does an exec. It is almost nonsense for a thread
to modify the other threads address space by an exec, so we
now automatically divorce the address space before modifying it.
which mistakenly got committed.
Fix two bugs in the ahc_reset_device code:
Limit search for SCBs to process to those that are active and
are not queued for done processing.
It's okay for an SCB to not have a waiting next SCB.
Be consistant about testing for parity errors after waiting for a
REQ on the bus.
Don't ack the last byte in a transaction until after we've cleared
all target state.
aic7xxx_asm.c:
Test the return value of getopt against -1 not EOF. (Yet another
shameless victum of the style guide being wrong).
<dirent.h> should be used instead to a warning. If this causes too
many warnings in ports then it should be changed back after checking
some ports for related configuration errors.
Moved the definition of DIRSIZ() from <sys/dir.h> to <sys/dirent.h>
so that it can be used in the kernel without including <sys/dir.h>.
Renamed it in some cases to avoid new namespace pollution.
resetting the keyboard.
Well, sorry, this bug is totally my fault. I DID intend to preserve
them, but somehow I failed.
The bug puts some old keyboard controllers in a strange state,
resulting in keyboard freeze or random key input.
The fix closes PR kern/3067.
that nameless pipes are not implemented as sockets.
Don't include <sys/time.h> if KERNEL is defined. It should already have
been included by including <sys/param.h>. Fixed a nearby typo.
longer has anything to do with vnodes and never had anything to do
with buffers, but it needs the definitions of B_READ and B_WRITE
for use with the bogus useracc() interface and was getting them
bogusly due to excessive cleanups in rev.1.49.
driver is waiting a bus settle delay. There should really be a facility
for the controller driver to "freeze" its queue during recovery operations
which would make all of this gymnastics unnecessary.
nothing else will lower it until either much later, or never(?) for
kernel processes.
This basically re-fixes what Bruce fixed in rev 1.29 of kern_fork.c,
which was broken again now the child does not execute back up the fork()
calling tree.
Rename the PT* index KSTK* #defines to UMAX*, since we don't have a kernel
stack there any more..
These are used to calculate VM_MAXUSER_ADDRESS and USRSTACK, and really
do not want to be changed with UPAGES since BSD/OS 2.x binary compatability
depends on it.
space. (!)
Have each process use the kernel stack and pcb in the kvm space. Since
the stacks are at a different address, we cannot copy the stack at fork()
and allow the child to return up through the function call tree to return
to user mode - create a new execution context and have the new process
begin executing from cpu_switch() and go to user mode directly.
In theory this should speed up fork a bit.
Context switch the tss_esp0 pointer in the common tss. This is a lot
simpler since than swithching the gdt[GPROC0_SEL].sd.sd_base pointer
to each process's tss since the esp0 pointer is a 32 bit pointer, and the
sd_base setting is split into three different bit sections at non-aligned
boundaries and requires a lot of twiddling to reset.
The 8K of memory at the top of the process space is now empty, and unmapped
(and unmappable, it's higher than VM_MAXUSER_ADDRESS).
Simplity the pmap code to manage process contexts, we no longer have to
double map the UPAGES, this simplifies and should measuably speed up fork().
The following parts came from John Dyson:
Set PG_G on the UPAGES that are now in kernel context, and invalidate
them when swapping them out.
Move the upages object (upobj) from the vmspace to the proc structure.
Now that the UPAGES (pcb and kernel stack) are out of user space, make
rfork(..RFMEM..) do what was intended by sharing the vmspace
entirely via reference counting rather than simply inheriting the mappings.
convenient and makes life difficult for my next commit. We still need
an i386tss to point to for the tss slot in the gdt, so we use a common
tss shared between all processes.
Note that this is going to break debugging until this series of commits
is finished. core dumps will change again too. :-( we really need
a more modern core dump format that doesn't depend on the pcb/upages.
This change makes VM86 mode harder, but the following commits will remove
a lot of constraints for the VM86 system, including the possibility of
extending the pcb for an IO port map etc.
Obtained from: bde
The typo was detected once apon a time with the -Wunused compile option.
The result was that a block of code for implementing
madvise(.. MADV_SEQUENTIAL..) behavior was "dead" and unused, probably
negating the effect of activating the option.
Reviewed by: dyson
struct direct, not using UFS' definition of DIRBLKSIZ, using directory
seek cookies to make reading non-UFS directories reliable
(e.g. cd9660, ext2fs).
A special thanks to Robert Eckardt for providing an ISC binary of GNU
ls so that I could test these changes.
Use the name argument almost the same in all LKM types. Maintain
the current behavior for the external (e.g., modstat) name for DEV,
EXEC, and MISC types being #name ## "_mod" and SYCALL and VFS only
#name. This is a candidate for change and I vote just the name without
the "_mod".
Change the DISPATCH macro to MOD_DISPATCH for consistency with the
other macros.
Add an LKM_ANON #define to eliminate the magic -1 and associated
signed/unsigned warnings.
Add MOD_PRIVATE to support wcd.c's poking around in the lkm structure.
Change source in tree to use the new interface.
Reviewed by: Bruce Evans
Use the name argument almost the same in all LKM types. Maintain
the current behavior for the external (e.g., modstat) name for DEV,
EXEC, and MISC types being #name ## "_mod" and SYCALL and VFS only
#name. This is a candidate for change and I vote just the name without
the "_mod".
Change the DISPATCH macro to MOD_DISPATCH for consistency with the
other macros.
Add an LKM_ANON #define to eliminate the magic -1 and associated
signed/unsigned warnings.
Add MOD_PRIVATE to support wcd.c's poking around in the lkm structure.
Change source in tree to use the new interface.
Reviewed by: Bruce Evans
by Alan Cox <alc@cs.rice.edu>, and his description of the problem.
The bug was primarily in procfs_mem, but the mistake likely happened
due to the lack of vm system support for the operation. I added
better support for selective marking of page dirty flags so that
vm_map_pageable(wiring) will not cause this problem again.
The code in procfs_mem is now less bogus (but maybe still a little
so.)
loop, test for them separately. The bug report from David Malone showed that
even though we had been reselected (SELDI was true), we sat in the poll for
work loop until the selection timeout timer expired. It may be that the
SSTAT0 register doesn't like to have more than one bit tested at a time.
I've seen stranger things than this on these parts.
selection loop was merged with the poll_for_work loop. We cannot assume
that the SCB for the selection timeout is the current SCB. Instead we
must look at the SCB at the head of the waiting for selection list.
This fixes part of a problem reported by David Malone, but does not explain
why he was getting selection timeouts in the first place.
the directory format (ext2fs, cd9660). For these filesystems, it must use
cookies to find the correct offset to use for subsequent reads. Without it,
linux /bin/ls tends to loop re-reading the same block over and over again.
2.2 candidate.
(see LINT). There is a new low-level console type that is more suitable
for use with gdb-remote.
Fixed setting of speed at probe time for the serial console (if any).
Reviewed by: dfr
I got tired of not seeing my worm stats show up during a burn. :)
[Joerg, I just stapled in 1MB/sec for a bogus xfer rate and left seek = 1,
as suggested - I'm not going to dynamically calculate the xfer rate from
a known device spectable, OK? :-)]
Reviewed by: joerg
implementation #ifdef out. This can be used for now by NFS. As soon
as all the other filesystems' locking is fixed, this can go away.
Print the vnode address in vprint for easier debugging.
Bump the timeout for an "ordered tag" recovery action from 1 to 5 seconds.
Remove the multiple timeout panic. Its very easy to get into a situation
where a timedout command will time out a second time even though the
recovery code is working fine. A good example is:
1) Command times out during recovery
2) reset the timeout for the command
3) Recovery actions complete and all transactions are requeued
4) second timeout fires off which puts us back into recovery bogusly
5) another transaction that timedout once during the first recovery action
times out causing the panic.
In essence, the correct solution to the problem is to put every transaction
back up into the work queue and have their timeout handling done in the same
way that all commands are handled. The CAM layer makes this easy, so it
will have to wait until then.
1. imgp->image_header needs to be cleared for the bp == NULL && `goto
interpret' case, else exec_fail_dealloc would free it twice after
an error.
2. Moved the vp->v_writecount check in exec_check_permissions() to
near the end. This fixes execve("/dev/null", ...) returning the
bogus errno ETXTBSY. ETXTBSY is still returned for attempts to
exec interpreted files that are open for writing. The man page
is very old and wrong here. It says that ETXTBSY is for pure
procedure (shared text) files that are open for writing or reading.
3. Moved the setuid disabling in exec_check_permissions() to the end.
Cosmetic. It's more natural to dispose of all the error cases
first.
...plus a couple of other cosmetic changes.
Submitted by: bde
either by looking it up in the array of pending, per target, untagged
transactions, or by using the tag value passed in during the identify. The
old code only direct indexed for tagged transactions. This makes the
"findSCB" routine only necessary when SCB paging is enabled, so appropriately
conditionalize it. This greatly simplifies the non SCB paging code flow.
if all registers are 0xff.
This allows me to run with flags 0xc0ff on my IBM-DMCA-21440 disk, which
gives 5MB/sec sequential read :-)
If you have a laptop, try adding flag 0x4000 to your disk, and tell me if
it makes any difference for you.
cache lines. Removed the struct ip proto since only a couple of chars
were actually being used in it. Changed the order of compares in the
PCB hash lookup to take advantage of partial cache line fills (on PPro).
Discussed-with: wollman
by bde.
Don't return EPERM in setre[ug]id() just because the caller passes in
the current effective id in the second arg (ie: no change), as suggested
by ache.
The magic number conflicted with the rotting disabled one in ext2fs for
debug.doasyncfree.
Removed messy debugging variable/constant/sysctl debug.doreallocblks.
Lite2 removed it, and we don't use the code that it controls.
defining doff_t both here and in <ufs/ufs/dir.h> so that this file
is independent of <ufs/ufs/dir.h>. It still has old prerequisites
<sys/param.h> and <ufs/ufs/quota.h>, and a new Lite2 prerequisite of
<sys/lock.h>, sigh.
This might fix lsof, which was broken by namespace pollution giving
conflicting definitions of DIRBLKSIZ.
This is valueable for library code which needs to be able to find out
whether the current process is or *was* set[ug]id at some point in the
past, and may have a "tainted" execution environment. This is especially
a problem with the trend to immediately revoke privs at startup and regain
them for critical sections. One problem with this is that if a cracker
is able to compromise the program while it's still got a saved id, the
cracker can direct the program to regain the privs. Another problem is
that the user may be able to affect the program in some other way (eg:
setting resolver host aliases) and the library code needs to know when it
should disable these sorts of features.
Reviewed by: ache
Inspired by: OpenBSD (but with a different implementation)
that allows traditional BSD setuid/setgid behavior.
The only visible difference should be that a non-root setuid program
(eg: inn's "rnews" program) that is setuid to news, can completely
"become" uid news. (ie: setuid(geteuid()) This was allowed in
traditional 4.2/4.3BSD and is now "blessed" by Posix as a special
case of "appropriate privilige".
Also, be much more careful with the P_SUGID flag so that we can use it
for issetugid() - only set it if something changed.
Reviewed by: ache
vector except for the egid in groups[0]. There is a risk that programs
that come from SYSV/Linux that expect this to work and don't check for
error returns may accidently pass root's groups on to child processes.
We now do what is least suprising (to non BSD programs/programmers) in
this scenario, and nothing is changed for programs written with BSD groups
rules in mind.
Reviewed by: ache
to removing the connection from the queue. The problem here is that
falloc() may block and this would allow another process to accept the
connection instead. If this happens to leave the queue empty, then the
system will panic with an "accept: nothing queued".
Also changed a wakeup() to a wakeup_one() to avoid the "thundering herd"
problem on new connections in Apache (or any other application that has
multiple processes blocked in accept() for the same socket).
as shadows of their containing directory. This should solve the problem
of users not being able to delete their symlinks from /tmp once and for
all.
Symlinks do not have modes though, they are accessable to everything that
can read the directory (as before). They are made to show this fact at
lstat time (they appear as mode 0777 always, since that's how the the
lookup routines in the kernel treat them).
More commits will follow, eg: add a real lchown() syscall and man pages.
centric rather than VM-centric to fix a problem with errors not being
detectable when the header is read.
Killed exech_map as a result of these changes.
There appears to be no performance difference with this change.
Use the same value of 512 (ufs actually uses DEV_BSIZE). There are
too many versions of DIRBLKSIZ, one for ufs, one for ext2fs, one for
nfs, one for ibcs2, one for linux, one for applications, ... I think
nfs's DIRBLKSIZ needs to be a divisor of the directory blocks sizes
of all supported file systems. There is also NFS_DIRBLKSIZ, which is
different from nfs's DIRBLKSIZ but is sometimes confused with it in
comments.
Removed a bogus #ifdef KERNEL that hid the tunable constants for nfs.
This came in undocumented with the Lite2 merge although it isn't in
Lite2. It required more-bogus #define KERNEL's in fstat and pstat
to make the constants visible.
Restored a spelling fix from rev.1.17.
Removed duplicate #defines of all the the NFS mount option flags.
they were created later on. This is not the case when processing
syscalls.isc in the ibcs2 area. (It generates no declarations, it's
all either hidden (already prototyped elsewhere) or unimplemented).
Lookup isn't done every time the system goes idle now, but it can still
take > 1800 instructions in the worst case, so if cpu interrupts are kept
disabled then it might lose 20 characters of sio input at 115200 bps.
Fixed style in vm_page_zero_idle().
functions if DDB is available. The remaining occurences are usually
only inlined and thus not available in DDB.
I'm sure Bruce will have 23 additions to these 30 lines of code, but
at least it's a starting point. ;-)
change typematic rate, or the X server (XFree86 or Accelerated X)
starts up.
So far, there have been two independent reports from Dell Latitude XPi
notebook/laptop owners. The Latitude seems to be the only system which
suffers from this problem. (I don't know the problem is with the
entire Latitude line or with only some Latitude models) No problem
report has been heard about other systems (I certainly cannot
reproduce the problem in my -current and 2.2 systems).
In 3.0-CURRENT, 2.2-RELEASE and 2.2-GAMMA-970310, when programming the
keyboard LED/repeat-rate, `set_keyboard()' in `syscons' tells the
keyboard controller not to generate keyboard interrupt (IRQ1) and then
enable tty interrupts, expecting the keyboard interrupt doesn't occur.
It appears that somehow Latitude's keyboard controller still generates
the keyboard interrupt thereafter, and `set_keyboard()' doesn't see
the return code from the keyboard because it is consumed by the
keyboard interrupt handler.
The patch entirely disables tty interrupts while setting LED and
typematic rate in `set_keyboard()', making the routine behave more
like the previous versions of `syscons' (versions in 2.1.X and
2.2-ALPHA, -BETA, and some -GAMMAs). The reporter said this patch
eliminated the problem.
(I also found another typo/bug, but the reporter and I found that it
wasn't the cause of the problem...)
This should go into RELENG_2_2.
address outside of the process's address space.
Now it matches its man page :-). Closes PR# 2682.
Discussed with: bde
Submitted by: Jonathan Lemon <jlemon@americantv.com>