FreeBSD flags instead of just adding one to the Linux flags. This should
be identical to the previous version except that I have at least one report
of this patch fixing problems people were having with Linux apps after my
last commit to this file. It is safer to use the switch then to make
assumptions about the flag values anyways, esp. since we currently use
MD defines for the values of the flags and this is MI code.
Tested by: Michael Class <michael_class@gmx.net>
following changes have been done:
- stylify. The original code was too hard to read.
- get rid of a number of compilation options (Adaptec-only, Eni-only, no-DMA).
- more debugging features.
- locking. This is not correct yet in the absence of interface layer locking,
but is correct enough to not to cause lock order reversals.
- remove RAW mode. There are no users of this in the tree and I doubt that
there are any.
- remove NetBSD compatibility code. There was no way to keep NetBSD non-busdma
and FreeBSD busdma code together.
- if_en now buildable as a module.
This has been actively tested on sparc64 and i386 with ENI server and
client cards and an Adaptec card (thanks to kjc).
Reviewed by: mdodd, arr
- Add fxp_start_body() and change fxp_start() to just acquire locks and
then call fxp_start_body(). Places that would call fxp_start() with
locks held (mutex recursion) now call fxp_start_body() directly.
Remove MTX_RECURSE flag from sc_mtx. [gallatin]
- Change fxp_attach() to work without the softc lock, saving interrupt
hooking until the head of fxp_attach().
- Call ether_ifattach() before overriding ifp parameters. This reverts
part of 1.155.
- Remove multiple error paths in fxp_attach().
- Teardown interrupt in fxp_detach() before unlocking the softc.
- Make sure mutex is not held in fxp_release()
- Delete the miibus instance and/or self in fxp_release(), not in
fxp_detach(). This can happen if attach fails partway through.
- Move ifmedia_removeall to fxp_release() since attach may fail after
media have been allocated.
- Add locking to fxp_suspend, fxp_resume, fxp_start, fxp_intr,
fxp_poll, fxp_tick, fxp_ioctl, fxp_watchdog.
- Pass in ifp to fxp_intr_body since its callers sometimes already use
it.
- Add compatibility define for INTR_MPSAFE for 4.x. [gallatin]
- You don't need to bzero softc.
Ideas from: gallatin, mux
Tested by: >400M packets of dd/ssh, NFS, ping on i386 UP
- Remove the Giant required from vm_page_free_toq(). (Any locking
errors will be caught by vm_page_remove().)
This remedies a panic that occurred when kmem_malloc(NOWAIT) performed
without Giant failed to allocate the necessary pages.
Reported by: phk
driver at long last!
Many thanks to vaidas.damosevicius@if.lt for keeping this issue alive
and pursuing Intel for a fix, Intel/ICP for working on the driver, and
Sergey Osokin for bringing the original patches up to 5-CURRENT.
syscall return values should be cleared. The system calls
getcontext() and swapcontext() want to return 0 on success
but these contexts can be switched to at a later time so
the return values need to be cleared in the saved register
sets. Other callers of get_mcontext() would normally want
the context without clearing the return values.
Remove the i386-specific context saving from the KSE code.
get_mcontext() is not i386-specific any more.
Fix a bad pointer in the alpha get_mcontext() code. The
context was being bcopy()'d from &td->tf_frame, but tf_frame
is itself a pointer, so the thread was being copied instead.
Spotted by jake.
Glanced at by: jake
Reviewed by: bde (months ago)
of the infrastructure for the gamma driver which was removed a while back.
The DRM_LINUX option is removed because the handler is now provided by the
linux compat code itself.
machines where the 'long' number of blocks in struct statfs wont fit.
Instead of chosing an artificial 512 byte block size, simply scale it up
until we avoid an overflow. NFSv3 reports the sizes in bytes, and the
blocksize is a figment of nfsclient's imagination.
Instead of applying the adjustment to processes with a start time of 1,
apply it to all processes with a start time of less than 3600.
None of this would be necessary if the start times were recorded in ticks
instead of seconds and microseconds.
don't include the kernel stacks of swapped-out threads in the page count,
but do include the alternate kernel stack. jhb provided some helpful
comments on this.
PR: 49102
- Add a parameter to vm_pageout_flush() that tells vm_pageout_flush()
whether its caller has locked the vm_object. (This is a temporary
measure to bootstrap vm_object locking.)
whose p_stats->p_start has the magic value 1, replace it with boottime.
Some users were apparently confused by the fact that ps(1) reported a
start time in early 1970 for system processes.
2. Include backwards compatibility good for the moment (eventually will
be turned off in current, but allow for a short transition period).
PR: 51333
Submited by: Scott Mitchell (1)
MFC after: 2 weeks
to get actual constant values. This is in preparation for machine/limits.h
retirement.
Discussed on: standards@
Submitted by: Craig Rodrigues <rodrigc@attbi.com> (*)
Modified by: kan
do all the various sigstack dances, unlock the proc lock, and finally do
the copyout. This more closely resembles the behavior of
kern_sigaltstack() and closes a small race.
- Remove Giant from osigstack as it is no longer needed.
(with other names) in the USB driver sources, but I felt that pcireg.h
should have a complete list - at least of classes and interfaces that we
know about and use.
held.
The only place where we want to not hold topology is when we read
(or write) the label to disk: in the case of a disk error with a
long recovery time, holding topology would prevent open/close of
any disk device.
rename them appropriately. Protect both flags with both the proc lock
and the sched_lock.
- Protect p_profthreads with the proc lock.
- Remove Giant from profil(2).
- For the 4BSD scheduler, this means that all callers of the static
function resetpriority() now always hold sched_lock, so don't lock
sched_lock explicitly in that function.
kg_nice is now protected by both. Being protected by both means that
other places in the kernel that want to read kg_nice only need one of the
two locks.
race where a thread could assume that a process was swapped in by
PHOLD() when it actually wasn't fully swapped in yet.
- In faultin(), always msleep() if PS_SWAPPINGIN is set instead of doing
this check after bumping p_lock in the PS_INMEM == 0 case. Also,
sched_lock is only needed for setting and clearning swapping PS_*
flags and the swap thread inhibitor.
- Don't set and clear the thread swap inhibitor in the same loops as the
pmap_swapin/out_thread() since we have to do it under sched_lock.
Instead, mimic the treatment of the PS_INMEM flag and use separate loops
to set the inhibitors when clearing PS_INMEM and clear the inhibitors
when setting PS_INMEM.
- swapout() now returns with the proc lock held as it holds the lock
while adjusting the swapping-related PS_* flags so that the proc lock
can be used to test those flags.
- Only use the proc lock to check the swapping-related PS_* flags in
several places.
- faultin() no longer requires sched_lock to be held by callers.
- Rename PS_SWAPPING to PS_SWAPPINGOUT to be less ambiguous now that we
have PS_SWAPPINGIN.
their prototypes.
- Remove sched_lock locking from kse_purge() as all callers already lock
the sched_lock before calling it.
- Hold the proc lock slightly longer to protect P_SHOULDSTOP().
or sched_lock are sufficient to test this flag.
XXX: vinum should really be using a kernel process via kthread_create()
instead of this hack. I'm not even sure PS_INMEM can be clear at this
point anyways.
kern_sigprocmask() in the various binary compatibility emulators.
- Replace calls to sigsuspend(), sigaltstack(), sigaction(), and
sigprocmask() that used the stackgap with calls to the corresponding
kern_sig*() functions instead without using the stackgap.
cpu_switch_load_gs, cpu is in context switch, so don't enable interrupt.
because it is in context switch, it is expected sched_lock was held,
so don't PROC_LOCK(p) and psignal, it is LOR, probably we can
set a P_XSIGBUS like flag in p_sflags, and set TDF_ASTPENDING in
td_flags, in ast(), post a SIGBUS to process if P_XSIGBUS was set.
called without Giant; and obj_alloc() in turn calls vm_page_alloc()
without Giant. This causes an assertion failure in vm_page_alloc().
Fortunately, obj_alloc() is now MPSAFE. So, we need only clean up
some assertions.
- Weaken the assertion in vm_page_lookup() to require Giant only
if the vm_object isn't locked.
- Remove an assertion from vm_page_alloc() that duplicates a check
performed in vm_page_lookup().
In collaboration with: gallatin, jake, jeff
instruction requires that a translation is present in the TC. This
may trigger a TLB miss and a subsequent call to vm_fault().
This implementation is deliberately non-inline for debugging and
profiling purposes. Partial or full inlining should eventually be
done.
Valuable insights by: jake
This means that you can no longer trash your opened partitions by writing to
the sunlabel through another partition. This is similar to the semantics
implemented for BSD labels.
if attach succeeded. device_is_alive just tells us that probe
succeeded. Since we were using it to do things like detach net
interfaces, this caused problems when there were errors in the attach
routine.
Symptoms of problem reported by: martin blapp
o KMF_NOUPCALL
Ask kse_release to not return to userland upcall entry, but instead
direct returns to userland by using current thread's stack and return
address on stack. This flags is intended to be used by UTS in critical
region to wait another UTS thread to leave critical region, by using
kse_release with this flag to avoid spinnng and burning CPU. Also this
flags can be used by UTS to poll completed context when there is nothing
to do in userland and needn't restart from its entry like normal upcall.
o KMF_NOCOMPLETED
Ask kernel to not bring completed thread contexts back to userland when
doing upcall, this flags is intend to be used with above flag when an
upcall thread is in critical region and can not process completed contexts
at that time.
Tested by: deischen
vm_object_pip_add() and vm_object_pip_wakeup().
- Remove GIANT_REQUIRED from vm_object_pip_subtract() and
vm_object_pip_subtract().
- Lock the vm_object when performing vm_object_page_remove().
the poll bits when there's actually something in the queue.
Otherwise, select always returned '2' when there were no items to be
read, and '3' when there were. This would preclude being able to read
in a threaded (libc_r) program, as well as checking to see if there
were pending events or not.
ethernet controller. The driver has been tested with the LinkSys
USB200M adapter. I know for a fact that there are other devices out
there with this chip but don't have all the USB vendor/device IDs.
Note: I'm not sure if this will force the driver to end up in the
install kernel image or not. Special magic needs to be done to exclude
it to keep the boot floppies from bloating again, someone please
advise.
other initializations succeeded.
- Initialize the TX and RX rings in epic_attach() rather than in
epic_init() where we're not supposed to fail. Similarly, free
the TX and RX rings in epic_detach() rather than in epic_stop().
- Change epic_init() to be a void function now that it can't fail.
Also change its parameter to a void * so that we have a correct
prototype for if_init.
- Now that epic_init() has a correct prototype, don't cast the
function pointer when initializing if_init.
- Fix nearby style bugs.
- Don't initialize if_output, ether_ifattach() does this for us.
- Use pci_enable_busmaster() instead of using pci_read_config()
and pci_write_config() directly.
- Don't try to enable I/O, bus_alloc_resource() does this for us.
lock assertion to it.
- SIGPENDING() no longer needs sched_lock, so only grab sched_lock to set
the TDF_NEEDSIGCHK and TDF_ASTPENDING flags in signotify().
- Add a proc lock assertion to tdsigwakeup().
- Since we always set TDF_OLDMASK while holding the proc lock, the proc
lock is sufficient protection to check its state in postsig() and we only
need sched_lock when clearing the actual flag.
a process group.
- Call pgadjustjobc() twice in fixjobc() to avoid code duplication and
improve readability.
- Use the proc lock to protect P_SHOULDSTOP() instead of sched_lock.
- Check to see if a process is PRS_NEW with sched_lock before trying to
lock its proc lock since the lock may not be constructed yet.
since they are going on the current cpu and not their previously assigned
cpu.
- sched_runnable() should only return true in the SMP case if the other
processor has more than one thread that is runnable. We can not steal
curthread.
- Change kseq_print() to accept the cpuid instead of a kseq pointer. This
makes use of this function in ddb much easier.
the process and session. Instead, cache a true reference to the session
when we do the hold and release our reference on that session. This avoids
the need for the proc lock when dropping the reference.
protects, so don't bother locking it while we assign it to a ucred's
cr_prison.
- Fully construct the new credential for a process before assigning it to
p_ucred.
- Set p_acflag earlier while already hold the proc lock in fork1().
- Mark the realitexpire() callout MPSAFE for new processes. It was already
marked safe for proc0 a long while ago.
printed out needs a prefix such as when a thread is blocked on a lock.
- Use another local variable to close another race for the td_wmesg and
td_wchan members of struct thread.
- Unconditionally call *_stop() if device is in the tree. This is to
prevent callouts from happening after the device is gone. Checks for
bus_child_present() should be added in the future to keep from touching
potentially non-existent hardware in *_detach(). Found by iedowse@.
- Always check for and free miibus children, even if the device is not in
the tree since some failure cases could have gotten here.
- Call ether_ifdetach() in the irq setup failure case
- ti(4), xl(4): move ifmedia_init() calls to the beginning of attach so
that ifmedia_removeall() can be unconditionally called on detach. There
is no way to detect whether ifmedia has been initialized without using
a separate variable (as tl(4) does).
- Add comments to indicate assumptions of code path
bytes_in_mbuf rather than MCLBYTES. Add the ethertnet header to the
front of the mbuf. Adjust bytes_in_mbuf inside the loop that reads
the packet out of the card.
generation number back to a long (sizeof(u_int) != sizeof(long) on
sparc64). The alternative would have been to heavily change the libdevstat API.
Discussed with: phk, ken
codec during initialization. This code mirrors the reset code used on
the VIA82c686 and fixes a codec initialization failure (SoundMAX
AD1885) reported by Matthias Schuendehuette.
in dc_detach() instead of only calling it if the hardware is preset.
This is a workaround for page faults in softclock() after a `dc'
device was detached, caused by not disabling a timer before freeing
its memory. The bus_child_present() checks should probably be
re-added later, but only to avoid the hardware accesses and not the
other resource cleanups in dc_stop().
Approved by: njl
Many internal structure changes for the FireWire driver.
- Compute CRC in CROM parsing.
- Add support for configuration ROM build.
- Simplify dummy buffer handling.
- busdma conversion
- Use swi_taskqueue_giant for -current. Mark the interrupt routine as MPSAFE.
- AR buffer handling.
Don't reallocate AR buffer but just recycle it.
Don't malloc and copy per packet in fwohci_arcv().
Pass packet to fw_rcv() using iovec.
Application must prepare receiving buffer in advance.
- Change fw_bind API so that application should pre-allocate xfer structure.
- Add fw_xfer_unload() for recycling struct fw_xfer.
- Add post_busreset hook
- Remove unused 'sub' and 'act_type' in struct fw_xfer.
- Remove npacket from struct fw_bulkxfer.
- Don't call back handlers in fwochi_arcv() if the packet has
not drained in AT queue
- Make firewire works on big endian platform.
- Use native endian for packet header and remove unnecessary ntohX/htonX.
- Remove FWXFERQ_PACKET mode. We don't use it anymore.
- Remove unnecessary restriction of FWSTMAXCHUNK.
- Don't set root node for phy config packet if the root node is
not cycle master capable but set myself for root node.
We should be the root node after next bus reset.
Spotted by: Yoshihiro Tabira <tabira@scd.mei.co.jp>
- Improve self id handling
Tested on: i386, sparc64 and i386 with forced bounce buffer
blocking allocation could occur as a result of a label
initialization. This will simulate the behavior of allocated
label policies such as MLS and Biba when running mac_test from
the perspective of WITNESS lock and sleep warnings.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
command config register. At the present, this represents a nop
because these bits should have been set earlier in the process. In
the future, we'll only set these bits when the driver requests the
resource, not when the bus code detects the resource.
Reviewed by: mdodd
don't try and convert the argument flags to malloc flags, or we risk
implicitly requesting blocking and generating witness warnings.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
network layer (ether).
- Don't abuse module names to facilitate ifconfig module loading;
such abuse isn't really needed. (And if we do need type information
associated with a module then we should make it explicit and not
use hacks.)
unencapsulated packet back into the IFQ. Unfortunately, the only reason
rl_encap would fail was due to m_defrag failing, which should only happen
when we're low on mbufs. Hence, it was possible for us to end up with
an IFQ full of packets which could never clear the queue because they could
never be defragmented because they were themselves taking up all the mbufs.
To solve this, take if_xl's approach to the problem of encapsulation failure:
drop the packet.
MFC after: 3 days
When enabled, this causes m_defrag to randomly return NULL (following
its normal failure case so that extra memory leaks are not introduced.)
Code similar to this was used to find / fix a few bugs last week.
to force the allocation of MAC labels for all mbufs regardless of
whether a configured policy requires labeling when the mbuf is
allocated. This can be useful it you anticipate loading a fully
labeled policy after boot and don't want mbufs to exist without
label storage, for performance measurement purposes, etc. It also
slightly lowers the overhead of m_tag labeling due to removing the
decision logic.
While here, improve commenting of other MAC options.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
returning some additional room in the first mbuf in a chain, and
avoiding feature-specific contents in the mbuf header. To do this:
- Modify mbuf_to_label() to extract the tag, returning NULL if not
found.
- Introduce mac_init_mbuf_tag() which does most of the work
mac_init_mbuf() used to do, except on an m_tag rather than an
mbuf.
- Scale back mac_init_mbuf() to perform m_tag allocation and invoke
mac_init_mbuf_tag().
- Replace mac_destroy_mbuf() with mac_destroy_mbuf_tag(), since
m_tag's are now GC'd deep in the m_tag/mbuf code rather than
at a higher level when mbufs are directly free()'d.
- Add mac_copy_mbuf_tag() to support m_copy_pkthdr() and related
notions.
- Generally change all references to mbuf labels so that they use
mbuf_to_label() rather than &mbuf->m_pkthdr.label. This
required no changes in the MAC policies (yay!).
- Tweak mbuf release routines to not call mac_destroy_mbuf(),
tag destruction takes care of it for us now.
- Remove MAC magic from m_copy_pkthdr() and m_move_pkthdr() --
the existing m_tag support does all this for us. Note that
we can no longer just zero the m_tag list on the target mbuf,
rather, we have to delete the chain because m_tag's will
already be hung off freshly allocated mbuf's.
- Tweak m_tag copying routines so that if we're copying a MAC
m_tag, we don't do a binary copy, rather, we initialize the
new storage and do a deep copy of the label.
- Remove use of MAC_FLAG_INITIALIZED in a few bizarre places
having to do with mbuf header copies previously.
- When an mbuf is copied in ip_input(), we no longer need to
explicitly copy the label because it will get handled by the
m_tag code now.
- No longer any weird handling of MAC labels in if_loop.c during
header copies.
- Add MPC_LOADTIME_FLAG_LABELMBUFS flag to Biba, MLS, mac_test.
In mac_test, handle the label==NULL case, since it can be
dynamically loaded.
In order to improve performance with this change, introduce the notion
of "lazy MAC label allocation" -- only allocate m_tag storage for MAC
labels if we're running with a policy that uses MAC labels on mbufs.
Policies declare this intent by setting the MPC_LOADTIME_FLAG_LABELMBUFS
flag in their load-time flags field during declaration. Note: this
opens up the possibility of post-boot policy modules getting back NULL
slot entries even though they have policy invariants of non-NULL slot
entries, as the policy might have been loaded after the mbuf was
allocated, leaving the mbuf without label storage. Policies that cannot
handle this case must be declared as NOTLATE, or must be modified.
- mac_labelmbufs holds the current cumulative status as to whether
any policies require mbuf labeling or not. This is updated whenever
the active policy set changes by the function mac_policy_updateflags().
The function iterates the list and checks whether any have the
flag set. Write access to this variable is protected by the policy
list; read access is currently not protected for performance reasons.
This might change if it causes problems.
- Add MAC_POLICY_LIST_ASSERT_EXCLUSIVE() to permit the flags update
function to assert appropriate locks.
- This makes allocation in mac_init_mbuf() conditional on the flag.
Reviewed by: sam
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
mbuf_to_label(). This permits the vast majority of entry point code
to be unaware that labels are stored in m->m_pkthdr.label, such that
we can experiment storage of labels elsewhere (such as in m_tags).
Reviewed by: sam
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
PCIM_CMD_MEMEN and PCIM_CMD_BUSMASTEREN, becaise some braindead
BIOSes (such as one found in my vprmatrix notebook) forget
to initialize it properly resulting in attachment failure.
Ignoring maxsegsz may lead to fatal data corruption for some devices.
ex. SBP-2/FireWire
We should apply this change to other platforms except for sparc64.
MFC after: 1 week
the cpu dependent files. It will need to be done differently for USIII.
- Simplify the logic for detecting context rollovers. Instead of dealing
with it when the next context switch would cause the context numbers to
rollover, deal with it when they actually do rollover.
- Move some things around in cpu_switch so that we only do 1 membar #Sync
when switching address space, instead of 2.
- Detect kernel threads by comparing the new vm space to vmspace0, instead
if checking if the tlb context is 0.
- Removed some debug code.
test is built to test GEOM as running in the kernel.
This commit is basically "unifdef -D_KERNEL" to remove the mainly #include
related code to support the userland-harness.