adding vnode to hash. The fix is to use atomic hash-lookup-and-add-if-
not-found operation. The odd thing is that this race can't happen
actually because the lowervp vnode is locked exclusively now during the
whole process of null node creation. This must be thought as a step
toward shared lookups.
Also remove vp->v_mount checks when looking for a match in the hash,
as this is the vestige.
Also add comments and cosmetic changes.
be a few bits left to clean from the HARP code in terms of what is using
the storage pools; once that's done, the memory management code can be
removed entirely.
This commit effectively changes the use of dynamic memory routines from
atm_allocate, atm_free, atm_release_pool to uma_zcreate, uma_zalloc,
uma_zfree, uma_zdestroy.
handshake between the ISR and the worker thread. Move the mutex lock
so that it only protects the cv_wait. This elimiates the not sleeping
with pccbb1 held messages some people were seeing.
Reviewed by: jhb (at least an early version)
code. Both tasks are not always performed completely by the firmware.
The former is required to get some e450 models to boot; the latter fixes
the repeated fifo underruns with hme(4)s and gem(4)s observed on some
machines (and probably performance problems with other peripherals as
well).
do not blunder around enabling interrupts and running trap handlers.
trap_pfault() will normally pass control to ddb's fault handler which
will normally do the right thing.
This bug is very old. but in old versions of FreeBSD it is probably only
serious for trap handling that involves sleeping. In -current, attempting
to examine unmapped memory while stopped at a breakpoint at mi_switch()
was always fatal.
Submitted by: tegge
o Eliminate the "!mapentzone" check from vm_map_entry_create() and
vm_map_entry_dispose(). Reviewed by: tegge
o Fix white-space usage in vm_map_entry_create().
or user vm_maps. This implementation has two key benefits when compared
to vm_map_{user_,}pageable(): (1) it avoids a race condition through
the use of "in-transition" vm_map entries and (2) it eliminates lock
recursion on the vm_map.
Note: there is still an error case that requires clean up.
Reviewed by: tegge
obviously bogous return value of ad1816chan_setformat().
PR: 37932
Submitted by: Martin Kaeske <Martin.Kaeske@Stud.TU-Ilmenau.DE>
Reviewed by: hm
MFC after: 10 days
o Add a stub for vm_map_wire().
Note: the description of the previous commit had an error. The in-
transition flag actually blocks the deallocation of a vm_map_entry by
vm_map_delete() and vm_map_simplify_entry().
in their tlb which the prom doesn't clear out, so we have to do so manually
before mapping the kernel page table or the cpu can hang due various
conditions which cause undefined behaviour from the tlb.
or user vm_maps. In accordance with the standards for munlock(2),
and in contrast to vm_map_user_pageable(), this implementation does not
allow holes in the specified region. This implementation uses the
"in transition" flag described below.
o Introduce a new flag, "in transition," to the vm_map_entry.
Eventually, vm_map_delete() and vm_map_simplify_entry() will respect
this flag by deallocating in-transition vm_map_entrys, allowing
the vm_map lock to be safely released in vm_map_unwire() and (the
forthcoming) vm_map_wire().
o Modify vm_map_simplify_entry() to respect the in-transition flag.
In collaboration with: tegge
options do. Comments should be in NOTES and having the comments in two
places usually means that one place will just bitrot. Thus, remove the
comment for KTRACE_REQUEST_POOL from the previous revision.
Requested by: bde
- ktrace no longer requires Giant so do ktrace syscall events before and
after acquiring and releasing Giant, respectively.
- For i386, ia32 syscalls on ia64, powerpc, and sparc64, get rid of the
goto bad hack and instead use the model on ia64 and alpha were we
skip the actual syscall invocation if error != 0. This fixes a bug
where if we the copyin() of the arguments failed for a syscall that
was not marked MP safe, we would try to release Giant when we had
not acquired it.
operations to dump a ktrace event out to an output file are now handled
asychronously by a ktrace worker thread. This enables most ktrace events
to not need Giant once p_tracep and p_traceflag are suitably protected by
the new ktrace_lock.
There is a single todo list of pending ktrace requests. The various
ktrace tracepoints allocate a ktrace request object and tack it onto the
end of the queue. The ktrace kernel thread grabs requests off the head of
the queue and processes them using the trace vnode and credentials of the
thread triggering the event.
Since we cannot assume that the user memory referenced when doing a
ktrgenio() will be valid and since we can't access it from the ktrace
worker thread without a bit of hassle anyways, ktrgenio() requests are
still handled synchronously. However, in order to ensure that the requests
from a given thread still maintain relative order to one another, when a
synchronous ktrace event (such as a genio event) is triggered, we still put
the request object on the todo list to synchronize with the worker thread.
The original thread blocks atomically with putting the item on the queue.
When the worker thread comes across an asynchronous request, it wakes up
the original thread and then blocks to ensure it doesn't manage to write a
later event before the original thread has a chance to write out the
synchronous event. When the original thread wakes up, it writes out the
synchronous using its own context and then finally wakes the worker thread
back up. Yuck. The sychronous events aren't pretty but they do work.
Since ktrace events can be triggered in fairly low-level areas (msleep()
and cv_wait() for example) the ktrace code is designed to use very few
locks when posting an event (currently just the ktrace_mtx lock and the
vnode interlock to bump the refcoun on the trace vnode). This also means
that we can't allocate a ktrace request object when an event is triggered.
Instead, ktrace request objects are allocated from a pre-allocated pool
and returned to the pool after a request is serviced.
The size of this pool defaults to 100 objects, which is about 13k on an
i386 kernel. The size of the pool can be adjusted at compile time via the
KTRACE_REQUEST_POOL kernel option, at boot time via the
kern.ktrace_request_pool loader tunable, or at runtime via the
kern.ktrace_request_pool sysctl.
If the pool of request objects is exhausted, then a warning message is
printed to the console. The message is rate-limited in that it is only
printed once until the size of the pool is adjusted via the sysctl.
I have tested all kernel traces but have not tested user traces submitted
by utrace(2), though they should work fine in theory.
Since a ktrace request has several properties (content of event, trace
vnode, details of originating process, credentials for I/O, etc.), I chose
to drop the first argument to the various ktrfoo() functions. Currently
the functions just assume the event is posted from curthread. If there is
a great desire to do so, I suppose I could instead put back the first
argument but this time make it a thread pointer instead of a vnode pointer.
Also, KTRPOINT() now takes a thread as its first argument instead of a
process. This is because the check for a recursive ktrace event is now
per-thread instead of process-wide.
Tested on: i386
Compiles on: sparc64, alpha
when a thread is in the ktrace subsystem to avoid ktrace'ing internal
ktrace events.
- Update the locking notes for p_traceflag and p_tracep taking into account
the new ktrace_lock mutex.
lock_object by another pointer (though all of lock_object should be
conditional on LOCK_DEBUG anyways) in exchange for an O(1) TAILQ_REMOVE()
in witness_destroy() (called for every mtx_destroy() and sx_destroy())
instead of an O(n) STAILQ_REMOVE. Since WITNESS is so dog slow as it is,
the speed-up is worth the space cost.
Suggested by: iedowse
being created and destroyed without a single long-term one around to ensure
the witness associated with that group of locks stays alive. The pipe
mutexes are an example of this group. For a dead witness we no longer
clear the witness name. Instead, when looking up the witness for a lock,
if a dead witness' (a witness with a refcount of 0) w_name pointer is
identical to the witness name of the lock then we revive that witness
instead of using a new witness for the lock. This results in far fewer
dead witness objects and also better preserves locking orders over the long
term resulting in more correct lock order checking. Note that we can't
ever derefence w_name of a dead witness since we don't know if the string
it is pointing to has been free()'d or kldunload()'d out from under us.
daadr_t is no larger than a long, and some other relatively harmless
things (*blush*). Overflow for subtracting a daddr_t from a u_long
caused "truncation" of the i/o for attempts to access blocks beyond
the end of the actually cause expansion of the i/o to a preposterous
size.
Eliminate some of the unnecessary complexity of ng_ether_glueback_header().
Simplify two functions a bit by doing the NG_FREE_META(meta) earlier.
Reviewed by: julian, brian
MFC after: 1 week
Include PPR option bits defined in SPI4.
scsi_iu.h:
Add data structures releated to parallel SCSI information units
for use in SPI4 packetized protocol.
simple reads (and on IA32, a "pause" instruction for each interation of the
loop) to spin until either the mutex owner field changes, or the lock owner
stops executing.
Suggested by: tanimura
Tested on: i386
vm_map_create(), and vm_map_submap().
o Make further use of a local variable in vm_map_entry_splay()
that caches a reference to one of a vm_map_entry's children.
(This reduces code size somewhat.)
o Revert a part of revision 1.66, deinlining vmspace_pmap().
(This function is MPSAFE.)
(P_CONTINUED) is set when a stopped process receives a SIGCONT and
cleared after it has notified a parent process that has requested
notification via waitpid(2) with WCONTINUED specified in its options
operand. The status value can be checked with the new WIFCONTINUED()
macro.
Reviewed by: jake
mask on both input and output to fpsetmask(), but this was only done for
input, so fpsetmask() returned the complement of the old mask (ANDed with
the mask bitfield).
PR: 38170
MFC after: 4 weeks
deinlining vm_map_entry_behavior() and vm_map_entry_set_behavior()
actually increases the kernel's size.
o Make vm_map_entry_set_behavior() static and add a comment describing
its purpose.
o Remove an unnecessary initialization statement from vm_map_entry_splay().
Fix GCC warnings caused by initializing a zero length array. In the process,
simply things a bit by getting rid of 'struct ng_parse_struct_info' which
was useless because it only contained one field.
But now I'm unbreaking compilation by adjusting these files to the recent
netgraph change.
panic because of a repeat make_dev if/when the device is reattached
to the system.
Remove an "#if __FreeBSD__" in code that's nested under a "#if __NetBSD__"
(*sigh*)
Reported by: Seth Hettich <sjh@whiskey.ucf.ics.uci.edu>
Tested by: Seth Hettich <sjh@whiskey.ucf.ics.uci.edu>
Don't require pin be non-zero before we map bogus intlines, always do it.
This fixes a number of problems on HP Omnibook computers.
Tested/Reviewed by: Brooks Davis
set to zero. This field indicates the total space in the external buffer
and therefore should not be modified after the external buffer is added.
Add a comment warning that the mbufs returned by m_split() might be read-only.
Fix M_TRAILINGSPACE() to return zero if !M_WRITABLE(m).
Reviewed by: freebsd-net
Obtained from: Vernier Networks, Inc.
MFC after: 1 week
a route is cloned. Previously, they took on the count
of their parent route (which was sometimes nonzero.)
Submitted by: Andre Oppermann <oppermann@pipeline.ch>
MFC after: 5 days
into the vm_object layer:
o Acquire and release Giant in vm_object_shadow() and
vm_object_page_remove().
o Remove the GIANT_REQUIRED assertion preceding vm_map_delete()'s call
to vm_object_page_remove().
o Remove the acquisition and release of Giant around vm_map_lookup()'s
call to vm_object_shadow().
because the padding should be inserted before the array and not after
it, as is done by GCC 3.1. Instead use an explicit uint32_t field
to get what was intended and on top of that make the size of the
padding explicit. This also doesn't depend on a C99 feature.
While here, expand the comment. Just to make a point.
Pointed out by: fanf
vnode creation globaly, we allow processes to create vnodes concurently.
In case of concurent creation of vnode for the one ino, we allow processes
to race and then check who wins.
Assuming that concurent creation of vnode for same ino is really rare case,
this is belived to be an improvement, as it just allows concurent creation
of vnodes.
Idea by: bp
Reviewed by: dillon
MFC after: 1 month
the pv lists in the vm_page, even unmanaged kernel mappings. This is so
that the virtual cachability of these mappings can be tracked when a page
is mapped to more than one virtual address. All virtually cachable
mappings of a physical page must have the same virtual colour, or illegal
alises can be created in the data cache. This is a bit tricky because we
still have to recognize managed and unmanaged mappings, even though they
are all on the pv lists.
struct uuid defined in <sys/uuid.h>.
Use uuid/UUID instead of guid/GUID to emphasize that the
identifiers are DCE version 1 identifiers and also to avoid
inconsistencies as much a possible.
date: 2002/05/28 12:42:39; author: augustss;
Change DMAADDR macro slightly.
Update the $NetBSD$ tags to reflect this and make slight changes to
usb_mem.h so that we're in sync with each other.
is currently conditional on both the GEOM and GEOM_GPT options to
avoid getting GPT by default and having the MBR and GPT classes
clash.
The correct behaviour of the MBR class would be to back-off (reject)
a MBR if it's a Protective MBR (a MBR with a single partition of type
0xEE that spans the whole disk (as far as the MBR is concerned).
The correct behaviour if the GPT class would be to back-off (reject)
a GPT if there's a MBR that's not a Protective MBR.
At this stage it's inconvenient to destroy a good MBR when working
with GPTs that it's more convenient to have the MBR class back-off
when it detects the GPT signature on disk and have the GPT class
ignore the MBR.
In sys/gpt.h UUIDs (GUIDs) for the following FreeBSD partitions
have been defined:
GPT_ENT_TYPE_FREEBSD
FreeBSD slice with disklabel. This is the equivalent of
the well-known FreeBSD MBR partition type.
GPT_ENT_TYPE_FREEBSD_{SWAP|UFS|UFS2|VINUM}
FreeBSD partitions in the context of disklabel. This is
speculating on the idea to use the GPT to hold partitions
instead if slices and removing the fixed (and low) limits
we have on the number of partitions.
This commit lacks a GPT image for the regression suite.
The uuidgen command, by means of the uuidgen syscall, generates one
or more Universally Unique Identifiers compatible with OSF/DCE 1.1
version 1 UUIDs.
From the Perforce logs (change 11995):
Round of cleanups:
o Give uuidgen() the correct prototype in syscalls.master
o Define struct uuid according to DCE 1.1 in sys/uuid.h
o Use struct uuid instead of uuid_t. The latter is defined
in sys/uuid.h but should not be used in kernel land.
o Add snprintf_uuid(), printf_uuid() and sbuf_printf_uuid()
to kern_uuid.c for use in the kernel (currently geom_gpt.c).
o Rename the non-standard struct uuid in kern/kern_uuid.c
to struct uuid_private and give it a slightly better definition
for better byte-order handling. See below.
o In sys/gpt.h, fix the broken uuid definitions to match the now
compliant struct uuid definition. See below.
o In usr.bin/uuidgen/uuidgen.c catch up with struct uuid change.
A note about byte-order:
The standard failed to provide a non-conflicting and
unambiguous definition for the binary representation. My initial
implementation always wrote the timestamp as a 64-bit little-endian
(2s-complement) integral. The clock sequence was always written
as a 16-bit big-endian (2s-complement) integral. After a good
nights sleep and couple of Pan Galactic Gargle Blasters (not
necessarily in that order :-) I reread the spec and came to the
conclusion that the time fields are always written in the native
by order, provided the the low, mid and hi chopping still occurs.
The spec mentions that you "might need to swap bytes if you talk
to a machine that has a different byte-order". The clock sequence
is always written in big-endian order (as is the IEEE 802 address)
because its division is resulting in bytes, making the ordering
unambiguous.
(UUIDs). On ia64 UUIDs, aka GUIDs, are used by EFI and the firmware
among others. To create GUID Partition Tables (GPTs), we need to
be able to generate UUIDs.
Make kern.ttys export a struct xtty rather than struct tty. Since struct
tty is no longer exposed to userland, remove the dev_t / udev_t hack.
Sponsored by: DARPA, NAI Labs
revision 1.124
date: 2002/05/26 03:10:02; author: minoura; state: Exp; lines: +3 -3
Clear done_head in the HCCA *before* acknoledging the interrupt.
Driver lost some completed transfers under heavy loads.
date: 2002/05/19 06:24:31; author: augustss; state: Exp;
Update dma memory access API a little.
NetBSD have adopted our way of using the KERNADDR macro. Update
the revision tags to show that we're in sync, and remove the casts
that they did in their adaptation.
"The only hard problem in cryptography is key-management."
All sectors are encrypted with AES in CBC mode using a constant key,
currently compiled in and all zero.
To activate this module, write the magic header on the partition:
echo "<<FreeBSD-GEOM-AES>>" | dd conv=sync of=/dev/md98
The encrypted device will be one sector shorter and have ".aes"
appended to its name.
Sponsored by: DARPA & NAI Labs.
and vm_map_delete(). Assert GIANT_REQUIRED in vm_map_delete()
only if operating on the kernel_object or the kmem_object.
o Remove GIANT_REQUIRED from vm_map_remove().
o Remove the acquisition and release of Giant from munmap().