because the run time exceeds the largest value a signed int can hold.
The real solution involves calculating how far we are over the limit.
To quickly solve this problem we loop removing 1/5th of the current value
until it falls below the limit. The common case requires no passes.
- Emulate lock draining (LK_DRAIN) in null_lock() to avoid deadlocks
when the vnode is being recycled.
- Don't allow null_nodeget() to return a nullfs vnode from the wrong
mount when multiple nullfs's are mounted. It's unclear why these checks
were removed in null_subr.c 1.35, but they are definitely necessary.
Without the checks, trying to unmount a nullfs mount will erroneously
return EBUSY, and forcibly unmounting with -f will cause a panic.
- Bump LOG2_SIZEVNODE up to 8, since vnodes are >256 bytes now. The old
value (7) didn't cause any problems, but made the hash algorithm
suboptimal.
These changes fix nullfs enough that a parallel buildworld succeeds.
Submitted by: tegge (partially; LK_DRAIN)
Tested by: kris
and run time.
- Scale the sleep and run time back via sched_interact_update() in more
places. This is to keep the statistic more accurate.
- Charge a parent one tick for forking a child.
- Add only the run time and not the sleep time to the parents kg when a
thread exits. This allows us to give a penalty for having an expensive
thread exit but does not give a bonus for having an interactive thread
exit.
- Change the SLP_RUN_THROTTLE to limit us to 4/5th and not 1/2.
- Change the SLP_RUN_MAX to two seconds. This keeps bursty interactive
applications like mozilla and openoffice in the interactive range even
through expensive tasks.
- Recalculate the slice after every sleep. This ensures that once a task
has been marked interactive it only has a slice of 1 at the risk of
giving tasks that sleep for a very brief period a longer time slice.
this makes connect act more sensibly in these cases.
PR: 50839
Submitted by: Barney Wolff <barney@pit.databus.com>
Patch delayed by laziness of: silby
MFC after: 1 week
This version of the driver code is compatible with near-release FreeBSD 5.1
kernel/driver interfaces.
modules/Makefile, man page and other bindings to follow shortly, once I get
this part of the check-in right.
Approved by: markm(mentor)
state, as inp_socket will then be NULL. This fixes a panic that occurs when one
tries to bind a port that was previously binded with remaining TIME_WAIT
sockets.
the terminating '\0'. Since the initialisation of rootpath in
libstand/bootp.c may copy junk into the rest of the buffer, it was
possible for the code to find a ':' after the '\0' and do the wrong
thing.
Reviewed by: ps
MFC after: 1 week
resource deallocation back to fifo_close(). This eliminates any
stale data that might be stuck in the socket buffers after all the
readers and writers have closed the fifo.
Tested by: Thorsten Schroeder <ths@katjusha.de>
on friday 13th and without making a universe). This adds struct and
constant definitions for ATM traffic parameters and re-enables the
build of the midway driver.
Tested by: make universe
Previously, any normal I/O on an fdc(4) device would fail with ENXIO
if the device had been opened in non-blocking mode and then closed
prior to the conventional access; that would last until the floppy
disk was ejected and re-inserted to raise the unit attention condition.
Add a clarifying comment.
populated. Apparently, if you use an ehci controller, it's not.
Use usbd_device2interface_handle() to retrieve the interface handle.
NOTE: uaa->iface is populated in the probe routine, so I suspect the
fact that it's NULL in the attach routine is a bug in the ehci driver.
Also, don't depend on the PHY addresses returned by the AXE_CMD_READ_PHYID
command. The address is correct for my LinkSys NIC, but a user has
reported that with a D-Link NIC, the PHYID command returns address 4
while the attached Broadcom PHY is in fact strapped for address 0.
Instead, latch onto the first PHY address that returns valid data
during a readreg operation.
- Add vm page queue locking in certain places that are only needed on
sparc64.
This should make pmap_qenter and pmap_qremove MP-safe.
Discussed with: alc
as should every block device strategy routine.
There was at least one evil consequence of not doing so:
Some errors returned by fdstrategy() could be lost (EAGAIN,
in particular.)
PR: kern/52338 (in the audit-trail)
Discussed with: bde
to access floppy parameters through it.
Note: The DIOCGSECTORSIZE and DIOCGMEDIASIZE handlers withing
fdioctl() couldn't be just moved to below the existing check
for blocking mode because fd->ft can be non-NULL while still
in non-blocking mode (fd->ft can be set with the FD_STYPE ioctl.)
PR: kern/52338
No MFC: Not applicable to STABLE
schedules an upcall. Signal delivering to a bound thread is same as
non-threaded process. This is intended to be used by libpthread to
implement PTHREAD_SCOPE_SYSTEM thread.
2. Simplify kse_release() a bit, remove sleep loop.
ulpt_status() afterwards. This fixes a crash that can occur if a
USB printer is power-cycled when printing is just starting. The
problem is similar to that fixed in revision 1.33, but it is much
less likely to occur.
MFC after: 1 week
panics. Before revision 1.38, we used to just point panicstr at the
format string if panicstr was NULL, but since we now use a static
buffer for the formatted panic message, we have to be careful to
only write to it during the first panic.
Pointed out by: bde
UFS quota implementation. Push some quite broken access control
logic out of ufs_quotactl() into the individual command
implementations in ufs_quota.c; fix that logic. Pass in the thread
argument to any quotactl command that will need to perform access
control.
o quotaon() requires privilege (PRISON_ROOT).
o quotaoff() requires privilege (PRISON_ROOT).
o getquota() requires that:
If the type is USRQUOTA, either the effective uid match the
requested quota ID, that the unprivileged_get_quota flag be
set, or that the thread be privileged (PRISON_ROOT).
If the type is GRPQUOTA, require that either the thread be
a member of the group represented by the requested quota ID,
that the unprivileged_get_quota flag be set, or that the
thread be privileged (PRISON_ROOT).
o setquota() requires privilege (PRISON_ROOT).
o setuse() requires privilege (PRISON_ROOT).
o qsync() requires no special privilege (consistent with what
was present before, but probably not very useful).
Add a new sysctl, security.bsd.unprivileged_get_quota, which when
set to a non-zero value, will permit unprivileged users to query user
quotas with non-matching uids and gids. Set this to 0 by default
to be mostly consistent with the previous behavior (the same for
USRQUOTA, but not for GRPQUOTA).
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
from the tree until it is fixed. Since it is an atm driver, it isn't
commonly used so this will not negatively impact too many people.
harti can reconnect it when he resurfaces and corrects the en module
problems. This should allow snapshots to start succeeding again.
Reported by: lots of people
which meant no process would run for longer than 20ms.
- Slightly redo the interactivity scorer. It follows the same algorithm but
in a slightly more correct way. Previously values above half were
incorrect.
- Lower the interactivity threshold to 20. It seems that in testing non-
interactive tasks are hardly ever near there and expensive interactive
tasks can sometimes surpass it. This area needs more testing.
- Remove an unnecessary KTR.
- Fix a case where an idle thread that had an elevated priority due to
priority prop. would be placed back on the idle queue.
- Delay setting NEEDRESCHED until userret() for threads that haad their
priority elevated while in kernel. This gives us the same context switch
optimization as SCHED_4BSD.
- Limit the child's slice to 1 in sched_fork_kse() so we detect its behavior
more quickly.
- Inhert some of the run/slp time from the child in sched_exit_ksegrp().
- Redo some of the priority comparisons so they are more clear.
- Throttle the frequency of sched_pctcpu_update() so that rounding errors
do not make it invalid.
and return NULL.
vinum_scandisk: Don't handle NULL device pointers.
Only look at compatibility partition for i386. This
is a kludge which should go away once I have adequate
documentation for the New World Order.
Together, these fixes remove occasional error messages about
non-existent drives. They may also fix a number of problems that have
been reported without a PR.
PRs: None
directory vnodes use to refer to their constituent vnodes, into
union_dircache_free(). Also s/union_dircache/union_dircache_get/ and
tweak the structure of union_dircache_r().
MFC after: 3 days
to the machine-independent parts of the VM. At the same time, this
introduces vm object locking for the non-i386 platforms.
Two details:
1. KSTACK_GUARD has been removed in favor of KSTACK_GUARD_PAGES. The
different machine-dependent implementations used various combinations
of KSTACK_GUARD and KSTACK_GUARD_PAGES. To disable guard page, set
KSTACK_GUARD_PAGES to 0.
2. Remove the (unnecessary) clearing of PG_ZERO in vm_thread_new. In
5.x, (but not 4.x,) PG_ZERO can only be set if VM_ALLOC_ZERO is passed
to vm_page_alloc() or vm_page_grab().
Devices below may experience a change in geometry.
* Due to a bug, aic(4) never used extended geometry. Changes all drives
>1G to now use extended translation.
* sbp(4) drives exactly 1 GB in size now no longer use extended geometry.
* umass(4) drives exactly 1 GB in size now no longer use extended geometry.
For all other controllers in this commit, this should be a no-op.
Looked over by: scottl
Devices below may experience a change in geometry.
* Due to a bug, aic(4) never used extended geometry. Changes all drives
>1G to now use extended translation.
* sbp(4) drives exactly 1 GB in size now no longer use extended geometry.
* umass(4) drives exactly 1 GB in size now no longer use extended geometry.
For all other controllers in this commit, this should be a no-op.
PR:
Submitted by:
Looked over by: scottl
Approved by:
Obtained from:
MFC after:
in smb_fphelp(): the parent vnode may have already been recycled
since we don't hold a reference to it. Fixes a panic when rebooting
with mdconfig -t vnode devices referring to vnodes on a smbfs mount.
how we registered pccard_intr, it is MPSAFE. However, since we
register the pccard_intr handler with the flags of the ISR we call,
that is the gating factor. We need do nothing specific here.
Prompted by: seeing pccard_intr in a panic.
are the same that those of the kernel in the KLD_MODULE case. If
we ever want to detect that kind of problems, this is not the right
place to do this since every network driver would be affected by
such desynchronisation.
toggle several media options (sonet/sdh, for example) with ifconfig and
to see the carrier state in ifconfig's output. It gives also read/write
access (given the right privilegs) to the S/Uni registers to user space
programs.
been tested extensively, but -CURRENT testing has been hampered by a
number of panics that also occur without the patch. Since the
destabilizing changes between 4.X and 5.X are external to unionfs,
I believe this patch applies equally well to both.
Thanks to scrappy for assistance testing these and other changes.
MFC after: 4 days
small but noticeable increase in performance for name lookup operations.
The code uses two zones, one for short names (less than 32 characters)
and one for long names (up to NAME_MAX). Since most file names are
fairly short, this saves a considerable amount of space that would
otherwise be wasted if we always allocated NAME_MAX bytes. The cutoff
value of 32 characters was picked arbitrarily and may benefit from some
tweaking; it could also be made into a tunable.
Submitted by: hmp
Restructure the error handling portion of the resource allocation
code to eliminate duplicated code.
Test for the O_NONBLOCK && fi_readers == 0 case before incrementing
fi_writers and modifying the the socket flag to avoid having to
undo these operations in this error case.
Restructure and simplify the code that handles blocking opens.
There should be no change to functionality.
than once. This appears to work around the hanging issues, at the
expense of warnings about bad RID allocations. I'm not sure this is a
permanant workaround, but does appear to help in the tests that I've
done here.
has not been cleaned in the meantime, since this can happen during
a forced unmount. Also add a comment that nfs_removeit() should
really be locking the directory vnode before calling nfs_removerpc().
Reported by: mbr
Tested by: mbr
MFC after: 1 week
It currently supports the PMC Sierra Lite, Ultra and 622 chips and
the IDT 77105. The driver handles media options and state in a consistent
manner for ATM drivers. The next commit to the midway driver will make
it use utopia.
doesn't have one. The test was bogus on these architectures, but
recent changes broke it altogether.
Prompted by: phk
This should fix the recent SPARC 64 build problems.
defined, then #error out. This is protected inside of #ifdef _KERNEL.
This allows one to tag code in the tree that will be deleted in 6.x
with the 'OBSOLETE_IN_6 #define at the top of the file. This makes
for easy grepping, plus a mechanism that automatically fails the
compilation of those files that are so tagged after we do the cutover.
However, GENERIC has wdc commented out, and COMPAT_OLDISA is required
for that. Comment out COMPAT_OLDISA and sdd a comment to this effect
near wdc.
Reviewed by: nyan@
o Register ISR INTR_MPSAFE.
o Loop on KTHREAD_DONE == 0 in the thread.
o Safe the INTR_MPSAFE flag for client drivers (don't know if there are any
CardBus/PCI drivers that are INTR_MPSAFE)
o Read status after acquiring mtx_lock(Giant) rather than before so that we
catch state changes that happen while Giant is being acquired.
o Turn off the CD bit when we see a CD interrupt, and turn it back on after
we've attached/detached the card.
o On suspend, actually set the CBB_SOCKET_MASK to zero rather than oring
in '0' to turn it off on suspend.
o If the ISR that's registerd is MPSAFE, don't acquire Giant around call to
client ISR.
o Fix comments to reflect these changes.
PCB contains FP registers, whose alignment must be 16 bytes at least.
Since the PCB pointed to by pc_pcb is immediately after the PCPU
itself, round-up the size of thge PCPU to a multiple of 16 bytes. The
PCPU is page aligned.
This fixes a misalignment trap caused by stopping a CPU in a SMP
kernel, such as been done when entering the debugger.
Reported by: Alan Robinson <alan.robinson@fujitsu-siemens.com>
too small panics on PAE machines which have odd > 4GB sizes (4.5 gig
would render a 20MB of KVA for kmem_map instead of 200MB).
Submitted by: John Cagle <john.cagle@hp.com>, jeff
Reviewed by: jeff, peter, scottl, lots of USENIX folks
about the driver version in case of an error report. It conflicts with
some other variable of the same name that has been added to the kernel
just recently and there haven't been any bug reports for quite some
time now, anyway ...
is currently executing when we try to remove it in exit1(). Without this,
it was possible for the callout to bogusly rearm itself and eventually
refire after the process had been free'd resulting in a panic.
PR: kern/51964
Reported by: Jilles Tjoelker <jilles@stack.nl>
Reviewed by: tegge, bde
curthread. Unlike td_flags, this field does not need any locking.
- Replace the td_inktr and td_inktrace variables with equivalent private
thread flags.
- Move TDF_OLDMASK over to the private flags field so it no longer requires
sched_lock.
userland, and the kernel. In the kernel by way of the 'ident[]' variable
akin to all the other stuff generated by newvers.sh. In userland it is
available to sysctl consumers via KERN_IDENT or 'kern.ident'. It is exported
by uname(1) by the -i flag.
Reviewed by: hackers@
second and equalizing the load between the two most imbalanced CPU. This
is intended to clear up long term load imbalances that would not be handled
by the 'pull' method in sched_choose().
- Pull out some bits of sched_choose() into a kseq_move() function that moves
an arbitrary thread from one kseq to another.
adding it to the nice tables. Therefore, in kseq_add_nice, we should
keep in mind that the load will be 1 if we are the only thread, and not
0.
- Assert that the sched lock is held in all the appropriate places.
- Increase the scope of the sched lock in sched_pctcpu_update().
- Hold the sched lock in sched_runnable(). It is not held by the caller.
ia64-specific.
- When trying to re-route interrupts, don't change cfg->intline if the
re-route fails by returning an invalid vector. This fixes machines
without any way of routing interrupts such as older PC's without a
$PIR table.
We do not currently write the new intline value back to the hardware, but
we should. That will likely be added in a later commit.
the caller assumes this to not happen by means of performing an
indirection without checking the return value. Add KASSERTs to
force a kernel with INVARIANTS to panic. This is a short-term
measure. The pmap code is scheduled to be overhauled.
to deliver a signal and the RSE backing store has been exhausted or
the backing store pointer has been clobbered, we need to make sure
we call userret() and do_ast() when we exit from trap(). Not adjusting
the local variable 'user' in this case will prevent the faulty process
from being terminated and we end up in an infinite fault repetition.
Faulty process provided by: bento
if it was already enabled. We don't want to set it
when it shouldn't be set, we just don't want to
inadvertantly turn it off. This should fix a recent
report of the aic7xxx driver repeatedly complaining of
"unexpected busfree while idle" in one configuration.
successfully mapping our registers. This
avoids the disabling of memory mapped I/O
just because some other driver probe happened
to touch our registers.
This drive delays going async after receiving a WDTR
message. We now send an SDTR message after a WDTR even
if our goal is to go async. This should work even for
confused devices.
If we get an unexpected busfree when attempting a WDTR
or SDTR, only set the goal negotiation parameters we were
trying to negotiate to off. This means that should a WDTR
message fail, we will still try an SDTR if our goal is
non-async.
Fix a few more places where we were looking at goal.period
instead of goal.offset for determining if we should be
negotiating sync. This should not have any impact on
our behavior, but the offset is more definitive and should
be used.
aic79xx.c:
aic79xx.h:
aic79xx_pci.c:
aic7xxx.c:
aic7xxx.h:
aic7xxx_pci.c:
Switch ah?_reset() to take an additional "reinit" argument.
Use this instead of init_level to determin if the chip
should be fully reinitialized after a chip reset. This
is required so that ah?_shutdown() can reset the chip
without side-effects.
aic79xx.c:
Implement ahd_suspend() and ahd_resume().
aic7xxx.c:
Change ahc_loadseq() to *not* restart the sequencer.
This brings the loadseq behavior in line with that
of the 7902 driver and also simplifies the init routine.
Correct the resume routine to enable interrupts and
restart the sequencer.
always kernel space. It should be treated as user space when run with
user privileges (which is the case for the signal trampolines). This
fixes its only use in a KASSERT in subr_trap.c.
fit on one line. Account for threads better.
* No need to report that it is on a sleep queue if it is actually sleeping
* "Normal" state is almost ubiquitous.. only report abnormal states.
* increment the #lines count for each separate thread shown in threaded
programs.
makes it less likely that a threaded program will make all the data
on a screen overflow off the top of the screen.
while after the legacy device was added since this driver hangs from
legacy and not nexus.
- Make several methods non-static so they can be reused in a mptable
host -> pci bridge driver that will be added at a later date.
- Let legacy_pcib() use pcibios_pcib_route_interrupt() directly instead of
wrapping it in a private function. Originally, I thought I was going to
have the nexus_pcib() driver make a runtime APIC vs. 8259A check and call
the appropriate routing method (MPTable vs. PIR) that way, but it ended
up being cleaner to make nexus_pcib() just work with PIR and have a
separate host -> pci bridge driver for the mptable/apic case.
bridge lives on (i.e., the parent bus) when probing the PIR table for a
bus. This could cause the PCIBIOS PCI-PCI bridge driver to bogusly attach
to bridges that weren't in the PIR but whose parent bus was in the PIR.
obtained from o2micro. These should only be needed for 'older'
o2micro bridges (anything before the 7xxx series of bridges), but will
work with the new bridges.
# I don't plan on porting it to oldcard, but will happily commit to
# oldcard if someone else needs them.
Add sysctl's to display statistics/debug_info
Set WAIT_FOR_AUTONEG_DEFAULT to zero by default
Increment packet in/out statistics inline instead of every two seconds.
MFC after: 3 days
attribute name of "" from ffs_getextattr(). Invoking VOP_GETETATTR()
with an empty name is now no longer supported; user application
compatibility is provided by a system call level compatibility
wrapper. We make sure to explicitly reject attempts to set an EA
with the name "".
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
"", temporarily map it to a call to extattr_list_vp() to provide
compatibility for older applications using the "" API to retrieve
EA lists.
Use VOP_LISTEXTATTR() to support extattr_list_vp() rather than
VOP_GETEXTATTR(..., "", ...).
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Asssociates Laboratories
specific attribute name. It will have the same semantics as the
older vop_getextattr() "retrieve the names" hack, returning
a buffer with ASCII nul-seperated names.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
extended attribute retrieval code: it's no longer special-cased,
and is caught by the normal UFS1 EA validity checks (and, in
fact, returns the same error, EINVAL).
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
we were passing in a void* representing the PCB of the parent thread.
Now we pass a pointer to the parent thread itself.
The prime reason for this change is to allow cpu_set_upcall() to copy
(parts of) the trapframe instead of having it done in MI code in each
caller of cpu_set_upcall(). Copying the trapframe cannot always be
done with a simply bcopy() or may not always be optimal that way. On
ia64 specifically the trapframe contains information that is specific
to an entry into the kernel and can only be used by the corresponding
exit from the kernel. A trapframe copied verbatim from another frame
is in most cases useless without some additional normalization.
Note that this change removes the assignment to td->td_frame in some
implementations of cpu_set_upcall(). The assignment is redundant.
A previous call to cpu_thread_setup() already did the exact same
assignment. An added benefit of removing the redundant assignment is
that we can now change td_pcb without nasty side-effects.
This change officially marks the ability on ia64 for 1:1 threading.
Not tested on: amd64, powerpc
Compile & boot tested on: alpha, sparc64
Functionally tested on: i386, ia64
got fixed two weeks after the ia64 version was copied from the alpha
version (see rev 1.32 of sys/alpha/alpha/mem.c). As such, we were
missing the same continue as on alpha.
While here, add a default case for the device minor switch and do
some general style(9) cleanups.
WARNING: this file still has bugs. When reading from region 6 or
region 7, we don't validate the physical address. One can trivially
cause a machine check by trying to read from address 0xFFFFFFFFFFFFFFF0
or something that uses the unimplemented physical address bits.
Reported by: Alan Robinson <alan.robinson@fujitsu-siemens.com>
we were passing in a void* representing the PCB of the parent thread.
Now we pass a pointer to the parent thread itself.
The prime reason for this change is to allow cpu_set_upcall() to copy
(parts of) the trapframe instead of having it done in MI code in each
caller of cpu_set_upcall(). Copying the trapframe cannot always be
done with a simply bcopy() or may not always be optimal that way. On
ia64 specifically the trapframe contains information that is specific
to an entry into the kernel and can only be used by the corresponding
exit from the kernel. A trapframe copied verbatim from another frame
is in most cases useless without some additional normalization.
Note that this change removes the assignment to td->td_frame in some
implementations of cpu_set_upcall(). The assignment is redundant.
A previous call to cpu_thread_setup() already did the exact same
assignment. An added benefit of removing the redundant assignment is
that we can now change td_pcb without nasty side-effects.
This change officially marks the ability on ia64 for 1:1 threading.
Not tested on: amd64, powerpc
Compile & boot tested on: alpha, sparc64
Functionally tested on: i386, ia64
Always route PCI interrupts on i386 UP machines. I was planning to enable
this for i386 anyways once SMP support is done. Having this enabled fixes
problems on many people's laptops.
Requested by: imp
Attach to the component devices using GEOM semantics.
Create a GEOM provider instead of using disk_create()
Use the GEOM OAM api for configuration.
I saw approx ~1% speedup in througput and ~7% in latency in a
simple minded test of a two-disk striped device.
This file was repo-copied from src/sys/dev/ccd/ccd.c.
This is not yet linked into the build.
While a string is readable without a tool, an array is easier to process for
a monitoring application. This also prevents the extra hoops we need with
sbufs and locking.
Move the mtx_init() in en_attach() higher before the first failure point so
that we can unconditionally destroy it in en_destroy().
extattr_list_link() system calls, which return a least of extended
attributes defined for a vnode referenced by a file descriptor
or path name. Currently, we just invoke VOP_GETEXTATTR() since
it will convert a request for an empty name into a query for a
name list, which was the old (more hackish) API. At some point
in the near future, we'll push the distinction between get and
list down to the vnode operation layer, but this provides access
to the new API for applications in the short term.
Pointed out by: Dominic Giampaolo <dbg@apple.com>
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
file/directory/link, rather than using a less explicit hack on
the extattr retrieval API:
extattr_list_fd()
extattr_list_file()
extattr_list_link()
The existing API was counter-intuitive, and poorly documented.
The prototypes for these system calls are identical to
extattr_get_*(), but without a specific attribute name to
leave NULL.
Pointed out by: Dominic Giampaolo <dbg@apple.com>
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
Don't copyin() data we are about to overwrite.
Add a flag to tell userland that KSE is officially "DONE" with the
mailbox and has gone away.
Obtained from: davidxu@
we failed to put the bucket back into the general cache/container.
Also, fix a bad assumption. There was a KASSERT() that aimed to
guarantee that whenever the pcpu container's mc_starved was > 0,
that whatever the bucket we were freeing to was an empty bucket,
assuming it belonged to the pcpu container cache. However, there
is at least one case where this is not true anymore; consider:
1) All containers empty, next thread to try to alloc will touch
a pcpu container, notice it's empty, and increment the pcpu
container's mc_starved.
2) Some other thread frees an mbuf belonging to a bucket in
the general cache/container. Then it frees another mbuf
belonging to the same bucket (still in gen container).
3) Some third thread tries to allocate an mbuf from the pcpu
container and, since empty, grabs one mbuf now available
in the general cache and moves the non-empty bucket from
which it took 1 mbuf and to which the thread in (2) freed
to, and moves it to the pcpu container.
4) A final thread tries to free an mbuf belonging to the
NON-EMPTY bucket mentionned in (2) and (3) and, since
the pcpu container's mc_starved is > 0, but the bucket
is obviously non-empty, it trips on the KASSERT.
This meant that one could potentially get a panic in some
cases when out of mbufs and clusters. The problem could
be mitigated by commenting out some cv_signal() calls,
but I'm assuming that was pure coincidence and this is
the correct fix.
on my part. The output asm looks correct with the previous commit in place
and it works on amd64, but on my laptop I got a spew of AE_BAD_PARAMETER
errors trying to unlock the acpi global lock.
and releasing ACPI global locks instead of (ab)using the pointers to those
locks as the constants. Also, rather than require that the address of
the lock be stored in a register, use a memory constraint allowing the
memory address to be used directly.
Noticed by: peter
- Use a hash of umtx queues to queue blocked threads. We hash on pid and the
virtual address of the umtx structure. This eliminates cases where we
previously held a lock across a casuptr call.
Reviwed by: jhb (quickly)
to unload. This would cause a panic on the second resetconfig.
Start Vinum at boot time at SI_SUB_RAID, not SI_SUB_VINUM.
SI_SUB_VINUM was there first, but there's no real distinction, and
SI_SUB_RAID is a more neutral name.
Submitted by: hmp
o adding locking to op submission
o mark interrupt handler MPSAFE
o don't use locking on detach; disabling interrupts should be sufficient
o change mutex string names so witness printouts are more meaningful
Note: locking is still pretty brute-force but it's probably not worth
improving it given the relatively low performance of hifn-based cards.
o replace driver-global lock with three locks: one for the handling of mcr1
operations, one for handling of mcr2 operations, and one for the mcr1
free list
o mark the interrupt handler MPSAFE
o don't use locking on detach; disabling interrupts is sufficient (I think)
o add a ``done'' flag for crypto operations; this is set when the operation
completes and is intended for callers to check operations that may complete
``prematurely'' because of direct callbacks
o close a race for operations where the crypto driver returns ERESTART: we
need to hold the q lock to insure the blocked state for the driver and any
driver-private state is consistent; otherwise drivers may take an interrupt
and notify the crypto subsystem that it can unblock the driver but operations
will be left queued and never be processed
o close a race in /dev/crypto where operations can complete before the caller
can sleep waiting for the callback: use a per-session mutex and the new done
flag to handle this
o correct crypto_dispatch's handling of operations where the driver returns
ERESTART: the return value must be zero and not ERESTART, otherwise the
caller may free the crypto request despite it being queued for later handling
(this typically results in a later panic)
o change crypto mutex ``names'' so witness printouts and the like are more
meaningful
all the argument registers etc since we have almost certainly have trashed
them by now. Take particular car of %r10 since it held the original value
of %rcx (which we saved in tf_rcx on entry and doreti doesn't know this).
Change the list interface to simplify things.
Remove old list ioctls which bogusly exported the softc to userland.
Move the softc and associated structures from the public header to
the source file.
Make CCD a GEOM class.
For now only use this for implementing a OAM config method which
can return a list of configured CCD devices in the format which
"ccdconfig -g[v]" would normally output.
mpo_copy_mbuf_label() entry point for Biba and MLS, respectively.
Otherwise, labels in m_tags may not be properly propagated across
some classes of mbuf operations. This problem caused these policies
to fail-stop the system with a panic.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
Two tokens that don't together form a vaid preprocssor token cannot be
pasted together using ANSI-C token concatinatation. GCC 3.2's cpp, at least,
produces the desired result w/o using "##".
Clarify that the implicit fallthrough was *not* intentional (thanks, Poul!)
and reorganize the code so a correct fallthrough (with /* FALLTHROUGH */)
occurs.
o Use pcb and tf for the new pcb and the new trapframe and use pcb0
for the old (current) pcb. The mix of pcb, pcb2 and tf was slightly
confusing.
o Don't define td->td_frame here. It has already been set previously
by cpu_thread_setup. Add a KASSERT to make sure pcb and tf are both
non-NULL.
o Make sure the number of dirty registers is 0 for the new thread.
There are no user registers on the backing store because we heven't
enter userland yet.
is not called, and no static rules match an outgoing packet, the
latter retains its source IP address. This is in support of the
"static NAT only" mode.
bzero(ptr, sizeof(DC_RXLEN * 5));
which should obviously be:
bzero(ptr, DC_RXLEN * 5);
Looks like this bug may have reduced the effectiveness of the
workaround for the hardware bug in the PNIC chips.
MFC after: 1 week
o Remove register keyword
o ANSIfy prototypes
o Remove "return;" at the end of void functions
o Remove trailing spaces
o Don't align local variables with tabs and reorder them
o Don't use /* FOO */ at the end of a #ifdef FOO block if
it's a small block
- Other non-functional changes :
o 6 -> ETHER_ADDR_LEN
o Don't initialize if_output; ether_ifattach() does it for us
hinge on the "verb" parameter which the class gets to interpret as
it sees fit.
Move the entire request into the kernel and move changed parameters
back when done.
Sleep on the vnode interlock while waiting for another
caller to increment fi_readers or fi_writers. Hold the
vnode interlock while incrementing fi_readers or fi_writers
to prevent a wakeup from being missed.
Only access fi_readers and fi_writers while holding the vnode
lock. Previously fifo_close() decremented their values without
holding a lock.
Move resource deallocation from fifo_close() to fifo_inactive(),
which allows the VOP_CLOSE() call in the error return path in
fifo_open() to be removed. Fifo_open() was calling VOP_CLOSE()
with the vnode lock held, in violation the current vnode locking
API. Also the way fifo_close() used vrefcnt() to decide whether
to deallocate resources was bogus according to comments in the
vrefcnt() implementation.
Reviewed by: bde
the lameness of the kstack code. The EPC overhaul de-lame-ified the
kstack code by removing the need for contigmalloc(). We can now
allocate stacks using malloc(). We probably want to make the stacks
swappable as well so that we can make it MI. But that's another story.
-Werror build with such option, but not other combinations. LINT
misses this because syscons knobs in LINT turn off a lot of code.
Reviewed by: marcel (some time ago)
if we permit them to occur, the kernel panics due to our performing
EA operations using VOP_STRATEGY on the vnode. This went unnoticed
previously because there are very for users of device nodes on UFS2
due to the introduction of devfs. However, this can come up with
the Linux compat directories and its hard-coded dev nodes (which will
need to go away as we move away from hard-coded device numbers).
This can come up if you use EA-intensive features such as ACLs and
MAC.
The proper fix is pretty complicated, but this band-aid would be
an excellent MFC candidate for the release.
why certain exceptions are made, note an inconsistency between
FreeBSD and some other implementations regarding IPC_M, and let
suser() generate our EPERM rather than forcing it ourselves.
Remove a carriage return that crept in in the last commit.
Reviewed by: gordon
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
gateway page is considered kernel space, we can panic when we should
only SIGSEGV. Hence, add the additional constraint that for page
faults we also require running with kernel privileges. The gateway
page is the only kernel code running with user privileges, iso this
is a correct way to exclude the gateway page from kernel land.
We do not currently exclude the gateway page for other faults as it
is not always the right way to do it. Further tuning will happen on
a case by case bases.