take a thread instead of a proc for their first argument.
- Add a mutex to protect the system-wide Linux osname, osrelease, and
oss_version variables.
- Change linux_get_prison() to take a thread instead of a proc for its
first argument and to use td_ucred rather than p_ucred. This is ok
because a thread's prison does not change even though it's ucred might.
- Also, change linux_get_prison() to return a struct prison * instead of
a struct linux_prison * since it returns with the struct prison locked
and this makes it easier to safely unlock the prison when we are done
messing with it.
KTRFAC_DROP to track instances when ktrace events are dropped due to the
request pool being exhausted. When a thread tries to post a ktrace event
and is unable to due to no available ktrace request objects, it sets
KTRFAC_DROP in its process' p_traceflag field. The next trace event to
successfully post from that process will set the KTR_DROP flag in the
header of the request going out and clear KTRFAC_DROP.
The KTR_DROP flag is the high bit in the type field of the ktr_header
structure. Older kdump binaries will simply complain about an unknown type
when seeing an entry with KTR_DROP set. Note that KTR_DROP being set on a
record in a ktrace file does not tell you anything except that at least one
event from this process was dropped prior to this event. The user has no
way of knowing what types of events were dropped nor how many were dropped.
Requested by: phk
struct proc as p_tracecred alongside the current cache of the vnode in
p_tracep. This credential is then used for all later ktrace operations on
this file rather than using the credential of the current thread at the
time of each ktrace event.
- Now that we have multiple ktrace-related items in struct proc that are
pointers, rename p_tracep to p_tracevp to make it less ambiguous.
Requested by: rwatson (1)
frames. A comment in if_atm.h suggests that both macros, that for extracting
the ethertype and that for inserting it, handle their argument in host
byte order. In fact, the inserting macro treated its argument as an opposite
host order short and the calling code feeds it the result of htons(). This
happens to work on i386, but fails on sparc. Make the macro use real host
endianess.
Reviewed by: kjc, atm@
- Create a new function bdone() which sets B_DONE and calls wakup(bp). This
is suitable for use as b_iodone for buf consumers who are not going
through the buf cache.
- Create a new function bwait() which waits for the buf to be done at a set
priority and with a specific wmesg.
- Replace several cases where the above functionality was implemented
without locking with the new functions.
possible for some time.
- Lock the buf before accessing fields. This should very rarely be locked.
- Assert that B_DELWRI is set after we acquire the buf. This should always
be the case now.
requiring locked bufs in vfs_bio_awrite(). Previously the buf could
have been written out by fsync before we acquired the buf lock if it
weren't for giant. The cluster_wbuild() handles this race properly but
the single write at the end of vfs_bio_awrite() would not.
- Modify flushbufqueues() so there is only one copy of the loop. Pass a
parameter in that says whether or not we should sync bufs with deps.
- Call flushbufqueues() a second time and then break if we couldn't find
any bufs without deps.
than a MAXPHYS size block ahead. Having this set too high just leaves
other processes starved for IO and screws up interactive response. Let the
users with RAID set it higher when they need it.
process to kill, don't block on a map lock while holding the
process lock. Instead, skip processes whose map locks are held
and find something else to kill.
- Add vm_map_trylock_read() to support the above.
Reviewed by: alc, mike (mentor)
- If SYSCTL_OUT() fails in sysctl_kern_proc_args(), return the error
instead of ignoring it if we have new arguments for the process.
- If the new arguments for a process are too long, return ENOMEM instead of
returning success but not doing the actual copy.
Submitted by: bde
this card is based on 16750 UART, modify sio(4) a bit to ignore 16750-specific
7th bit of MCR when probing card. This allows card to be detected and attached
as 16550A-compatible device. More work needs to be done in order to enable
nice 16750-specific features such as larger fifo buffer and higher speeds.
Sponsored by: IC Book Labs
MFC after: 2 weeks
hold hold it across the check to avoid extra lock operations in the
common case.
- Copy in the new args to a temporary pargs structure before we drop the
reference to the old one. Thus, if the copyin() fails, the process
arguments are unchanged rather than being deleted. Also, p_args is no
longer NULL during the sysctl operation.
it from its pgrp to avoid leaving zombies around with p_pgrp == NULL.
This bug was apparent as a NULL-dereference in the pid selection code
in fork1().
Harti Brandt's effort.
remove the DMA test to detect problems of the first generation PCI chipsets
back in 1998.
it is no longer needed and has been the source of the false alarm that
the driver uses too much stack space.
pmap_release.
- Merged pmap_release and pmap_release_free_page. When pmap_release is
called only the page directory page(s) can be left in the pmap pte object,
since all page table pages will have been freed by pmap_remove_pages and
pmap_remove. In addition, there can only be one reference to the pmap and
the page directory is wired, so the page(s) can never be busy. So all there
is to do is clear the magic mappings from the page directory and free the
page(s).
Sponsored by: DARPA, Network Associates Laboratories
monitors the entropy data harvested by crypto drivers to verify it complies
with FIPS 140-2. If data fails any test then the driver discards it and
commences continuous testing of harvested data until it is deemed ok.
Results are collected in a statistics block and, optionally, reported on
the console. In normal use the overhead associated with this driver is
not noticeable.
Note that drivers must (currently) be compiled specially to enable use.
Obtained from: original code by Jason L. Wright
conditional in each driver on foo_RNDTEST being defined_
o bring HIFN_DEBUG and UBSEC_DEBUG out to be visible options; they control
the debugging printfs that are set with hw.foo.debug (e.g. hw.hifn.debug)
closely what function is really doing. Update all existing consumers
to use the new name.
Introduce a new vfs_stdsync function, which iterates over mount
point's vnodes and call FSYNC on each one of them in turn.
Make nwfs and smbfs use this new function instead of rolling their
own identical sync implementations.
Reviewed by: jeff
a parameter instead of using the level of a given witness. When
recursing, pass an indent level of indent + 1.
- Make use of the information witness_levelall() provides in
witness_display_list() to use an O(n) algorithm instead of an O(n^2)
algo to decide which witnesses to display hierarchies from. Basically,
we only display a hierarchy for witnesses with a level of 0.
- Add a new per-witness flag that is reset at the start of
witness_display() for all witness's and is set the first time a witness
is displayed in witness_displaydescendants(). If a witness is
encountered more than once in the lock order tree (which happens often),
witness_displaydescendants() marks the later occurrences with the string
"(already displayed)" and doesn't display the subtree under that
witness. This avoids duplicating large amounts of the lock order tree
in the 'show witness' output in DDB.
All these changes serve to make 'show witness' a lot more readable and
useful than it was previously.
adds a witness to the child list of a parent witness. rebalancetree()
runs through the entire tree removing direct descendants of witnesses
who already have said child witness as an indirect descendant through
another direct descendant. itismychild() now calls insertchild()
followed by rebalancetree() and no longer needs the evil hack of
having static recursed variable.
- Add a function reparentchildren() that adds all the direct descendants
of one witness as direct descendants of another witness.
- Change the return value of itismychild() and similar functions so that
they return 0 in the case of failure due to lack of resources instead
of 1. This makes the return value more intuitive.
- Check the return value of itismychild() when defining the static lock
order in witness_initialize().
- Don't try to setup a lock instance in witness_lock() if itismychild()
fails. Witness is hosed anyways so no need to do any more witness
related activity at that point. It also makes the code flow easier to
understand.
- Add a new depart() function as the opposite of enroll(). When the
reference count of a witness drops to 0 in witness_destroy(), this
function is called on that witness. First, it runs through the
lock order tree using reparentchildren() to reparent direct descendants
of the departing witness to each of the witness' parents in the tree.
Next, it releases it's own child list and other associated resources.
Finally it calls rebalanacetree() to rebalance the lock order tree.
- Sort function prototypes into something closer to alphabetical order.
As a result of these changes, there should no longer be 'dead' witnesses
in the order tree, and repeatedly loading and unloading a module should no
longer exhaust witness of its internal resources.
Inspired by: gallatin
recursing on a lock instead of before. This fixes a bug where WITNESS
could get a little confused if you did an sx_tryslock() on a sx lock that
you already had an slock on. WITNESS would still function correctly but
it could result in weirdness in the output of 'show locks'. This also
makes it possible for mtx_trylock() to recurse on a lock.
used popped into my head during my morning commute a few weeks ago, but
it is also very similar (though a bit simpler) to a patch that mini@
developed a while ago. Basically, each eventhandler list has a mutex and
a run count. During an eventhandler invocation, the mutex is held while
we traverse the list but is dropped while we execute actual handlers. Also,
a runcount counter is incremented at the start of an invocation and
decremented at the end of an invocation. Adding to the list is not a big
deal since the reference of a thread currently executing the handlers
remains valid across an add operation. Whether or not new handlers are
executed by threads currently executing the handlers for a given list is
indeterminate however. The harder case is when a handler is removed from
the list. If the runcount is zero, the handler is simply removed from the
list directly. If the runcount is not zero, then another thread is
currently executing the handlers of this list, so the priority of this
handler is set to a magic value (currently -1) to mark it as dead. Dead
handlers are not executed during an invocation. If the runcount is zero
after it is decremented at the end of an invocation, then a new
eventhandler_prune_list() function is called to remove dead handlers from
the list.
Additional minor notes:
- All the common parts of EVENTHANDLER_INVOKE() and
EVENTHANDLER_FAST_INVOKE() have been merged into a common
_EVENTHANDLER_INVOKE() macro to reduce duplication and ease maintenance.
- KTR logging for eventhandlers is now available via the KTR_EVH mask.
- The global eventhander_mutex is no longer recursive.
Tested by: scottl (SMP i386)
monitors the entropy data harvested by crypto drivers to verify it complies
with FIPS 140-2. If data fails any test then the driver discards it and
commences continuous testing of harvested data until it is deemed ok.
Results are collected in a statistics block and, optionally, reported on
the console. In normal use the overhead associated with this driver is
not noticeable.
Note that drivers must (currently) be compiled specially to enable use.
Obtained from: original code by Jason L. Wright
attach routine, calling WIUNLOCK in the error case of one of the ifs
for that routine is now bogus. This should have been removed when the
WILOCK() was removed, but wasn't.
Submitted by: "Harti Brandt" <brandt@fokus.fraunhofer.de>
it is expected that they will not be enabled at the time that it
is called. This is reported to work around a problem in RELENG_4
where the kernel panics on boot if FAST_IPSEC and crypto support
are enabled.
Tested by: Scott Johnson <scottj@insane.com>
- Issue the io that we will later block on prior to doing cluster read ahead
so that it is more likely to be ready when we block.
- Loop issuing clustered reads until we've exhausted the seq count supplied
by the file system.
- Use a sysctl tunable "vfs.read_max" to determine the maximum number of
blocks that we'll read ahead.
use the underlying AsahiOptical USB chip and thus this quirk may need to
be generalized in the future.
PR: kern/46369
Submitted by: Tim Vanderhoek <vanderh@ecf.utoronto.ca>
MFC After: 3 days
I had commented the #ifdef INVARIANTS checks out to make sure I ran this
code in all kernels and forgot to comment the #ifdefs back in before I
committed.
Spotted by: bmilekic
[1] PHCC = Pointy Hat Correction Commit
ddb 'show locks' command. Thus, move witness_list() to the #ifdef DDB
section and remove extra checks for calling this function outside of
DDB. Also, witness_list() now returns void instead of returning an int.
Reported by: Steve Ames <steve@energistic.com>
Prodded by: davidxu
like secure level but which restricts changes to the keymap. Its
values impose the following restrictions:
0: No restriction - this is the default.
1: Only root can change restricted keys (like boot, panic, ...)
2: Only root can change restricted keys and regular keys.
Other users still can change accents and function keys.
3: Only root can change restricted keys, regular keys and accents.
4: Only root can change any of the keymap (restricted keys, regular
keys, accents and function keys).
Unfortunately, the keyboard's accent map is cleared when a new keymap
is loaded, which makes the distinction between level 3 and level 4
less useful.
The MAC guys might like to make this a policy?
No objections from: -audit about 6 moths ago
Remove an incorrect comment. (Incrementing an object's reference count
does not prevent a process from exiting. The real concern here is that the
physical page must not be deleted until transmission is complete. That is
already handled by the VM system and sf_buf_free().)
Tested by: ken
is more robust and prevents the hijacking of /dev/console for the typical
mistake.
Remove unneeded MAJOR_AUTO uses, it is only needed explicitly now if the
driver source has cross-branch compatibility to old releases.
have to examine the stats structure to tell if we have outstanding I/O
requests.
Making them u_int improves the chance of atomic updates to them,
but risks roll-over. Since the only interesting property is if
they are equal or not, this is not an issue.
outstanding requests to return before we unravel the mesh.
It is very important that the stuff below us plays nice and don't
overlook a couple of outstanding bio's, because until they remember
the geom event thread is blocked. At an expense in code here this
could be made more robust, but I actually _want_ a robust failure
in this case so any offending drivers can be fixed.
included in XFree86 4.3, but includes some fixes. Notable changes include
Radeon 8500-9100 support, PCI Radeon/Rage 128 support, transform & lighting
support for Radeons, and vblank syncing support for r128, radeon, and mga.
The gamma driver was removed due to lack of any users.
the device statistics structures into userland instead of using sysctl.
Introduce new devstat_new_entry() function which allocates the devstat
structure an calls devstat_add_entry() on it.
Two fields are sequence numbers for integrity check when we switch devstat
to use mmap to export data rather than sysctl, the last field is to mark
this as an allocated devstat entry.
in geom_disk.c.
As a side effect this makes a lot of #include <sys/devicestat.h>
lines not needed and some biofinish() calls can be reduced to
biodone() again.
- On receive, vm_map_lookup() needs to trigger the creation of a shadow
object. To make that happen, call vm_map_lookup() with PROT_WRITE
instead of PROT_READ in vm_pgmoveco().
- On send, a shadow object will be created by the vm_map_lookup() in
vm_fault(), but vm_page_cowfault() will delete the original page from
the backing object rather than simply letting the legacy COW mechanism
take over. In other words, the new page should be added to the shadow
object rather than replacing the old page in the backing object. (i.e.
vm_page_cowfault() should not be called in this case.) We accomplish
this by making sure fs.object == fs.first_object before calling
vm_page_cowfault() in vm_fault().
Submitted by: gallatin, alc
Tested by: ken
The random value sometimes causes macro CLKF_USERMODE to return true
because PSL_VM bit is set and really shoudn't be, this causes statclock()
to execute in wrong path, and further breaks KSE code and kernel crashes
when executing threaded program.
of a snapshot's copy of a superblock. This patch fixes a panic
when taking a snapshot of a 4096/512 filesystem.
Reported by: Ian Freislich <ianf@za.uu.net>
Sponsored by: DARPA & NAI Labs.
#include <sys/types.h>
#include <sys/stat.h>
main () {
}
cc -D_POSIX_C_SOURCE test.c
/usr/include/sys/stat.h:127: syntax error before "u_int"
/usr/include/sys/stat.h:158: syntax error before "u_int"
(u_int becomes invisible for _POSIX_C_SOURCE and some other *_SOURCE modes)
memory-allocation purposes. Right now it is also a very good idea
because we hit a Giant assertion in the free(9) processing if we
free something larger than 64k.
Include read streaming in the PPR flags we display in diagnostics.
In ahd_reset(), set the known mode after our initial pause prior to
setting the mode. We can't just set the mode directly because the
current mode, after the pause, is most likely unknown and setting the
mode when the saved mode is unknown will trigger an assertion in
the mode debug code.
Complete an audit for SCB RAM reads. These reads must be performed
via the special ahd_in?_scbram() methods so we can perform a
Rev A. PCI-X workaround.
Remove a superfluous mode save operation that was performed just
prior to a call to ahd_clear_critical_section(). The saved mode
was never restored and wouldn't have been valid anyway since the
mode could change while single stepping out of a critical section.
aic79xx.h:
Add new BUG definition AHD_PCIX_SCBRAM_RD_BUG.
aic79xx_inline.h:
Update ahd_inb_scbram routine to check for AHD_PCIX_SCBRAM_RD_BUG
and only apply the workaround if this bug is active. The old code
applied the workaround in all cases.
aic79xx_pci.c:
Set AHD_PCIX_SCBRAM_RD_BUG for the A4.
Remove an attempted saved_modes call in ahd_pci_test_register_access().
Saving the modes can only occur when we are paused, but the call was
happening before the chip was known to be paused. Restoring the
modes doesn't make sense either since the code makes no assumptions
about the state of the sequencer until the first time the mode is set
by the driver. This happens after the registers are successfully
mapped.
requests whether or not the lock is available. To avoid "unlocked
buffer" panics after a crash, we just claim that all buffers
are locked when cleaning up after a system panic.
Reported by: Attila Nagy <bra@fsn.hu>
Sponsored by: DARPA & NAI Labs.
witness. Sleepable locks such as sx locks always come before all mutexes
including Giant. However, the static lock order list placed Giant before
the proctree and allproc sx locks. This resulted in witness creating a
cycle in its lock order "tree" (real trees don't have cycles) leading to
infinite recursion and eventually a double fault. To fix, put Giant after
sx locks in the lock order list.
occurs when mounting the filesystem. The problem is that venus issues
the mount() syscall, which calls vfs_mount(), which calls coda_root()
which attempts to communicate with venus.