memory-allocation purposes. Right now it is also a very good idea
because we hit a Giant assertion in the free(9) processing if we
free something larger than 64k.
Include read streaming in the PPR flags we display in diagnostics.
In ahd_reset(), set the known mode after our initial pause prior to
setting the mode. We can't just set the mode directly because the
current mode, after the pause, is most likely unknown and setting the
mode when the saved mode is unknown will trigger an assertion in
the mode debug code.
Complete an audit for SCB RAM reads. These reads must be performed
via the special ahd_in?_scbram() methods so we can perform a
Rev A. PCI-X workaround.
Remove a superfluous mode save operation that was performed just
prior to a call to ahd_clear_critical_section(). The saved mode
was never restored and wouldn't have been valid anyway since the
mode could change while single stepping out of a critical section.
aic79xx.h:
Add new BUG definition AHD_PCIX_SCBRAM_RD_BUG.
aic79xx_inline.h:
Update ahd_inb_scbram routine to check for AHD_PCIX_SCBRAM_RD_BUG
and only apply the workaround if this bug is active. The old code
applied the workaround in all cases.
aic79xx_pci.c:
Set AHD_PCIX_SCBRAM_RD_BUG for the A4.
Remove an attempted saved_modes call in ahd_pci_test_register_access().
Saving the modes can only occur when we are paused, but the call was
happening before the chip was known to be paused. Restoring the
modes doesn't make sense either since the code makes no assumptions
about the state of the sequencer until the first time the mode is set
by the driver. This happens after the registers are successfully
mapped.
requests whether or not the lock is available. To avoid "unlocked
buffer" panics after a crash, we just claim that all buffers
are locked when cleaning up after a system panic.
Reported by: Attila Nagy <bra@fsn.hu>
Sponsored by: DARPA & NAI Labs.
witness. Sleepable locks such as sx locks always come before all mutexes
including Giant. However, the static lock order list placed Giant before
the proctree and allproc sx locks. This resulted in witness creating a
cycle in its lock order "tree" (real trees don't have cycles) leading to
infinite recursion and eventually a double fault. To fix, put Giant after
sx locks in the lock order list.
occurs when mounting the filesystem. The problem is that venus issues
the mount() syscall, which calls vfs_mount(), which calls coda_root()
which attempts to communicate with venus.
check, mac_check_sysarch_ioperm(), permitting MAC security policy
modules to control access to these interfaces. Currently, they
protect access to IOPL on i386, and setting HAE on Alpha.
Additional checks might be required on other platforms to prevent
bypass of kernel security protections by unauthorized processes.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
modules to authorize disabling of swap against a particular vnode.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
before the MAC check so that we pass the flags field into the MAC
check properly initialized. This didn't affect any current MAC
modules since they didn't care what the flags argument was (as
they were primarily interested in the fact that it was a meta-data
write, not the contents of the write), but would be relevant to
future modules relying on that field.
Submitted by: Mike Halderman <mrh@spawar.navy.mil>
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
Submitted by Hiroyuki Aizu <eyes@navi.org>
(refer to [FreeBSD-users-jp 65061])
Tested by Hiroharu Tamaru <tamaru@myn.rcast.u-tokyo.ac.jp>
(refer to [bsd-usb:689])
an administrative limit on the size of tty/pty input buffers, this is
mostly an inconsequential change. (slti(4) will allocate an 8 kB
static buffer instead of a 1 kB buffer due to a hack in the driver.)
The increase happens to kludge around a lame limitation of syscons,
which does not allow one to paste more than TTYHOG bytes.
PR: 42031
Reviewed by: mike (mentor)
not save (restore) the global pointer (GP) in the jmpbuf in setjmp
(longjmp) because it's not needed in general. GP is considered a
scratch register at callsites and hence is always restored after a
call (when it's possible that the call resolves to a symbol in a
different loadmodule; otherwise GP does not have to be saved and
restored at all), including calls to setjmp/longjmp. There's just
one problem with this now that we use setjmp/longjmp for context
switching: A new context must have GP defined properly for the
thread's entry point. This means that we need to put GP in the
jmpbuf and consequently that we have to restore is in longjmp.
This automaticly requires us to save it as well.
When setjmp/longjmp isn't used for context switching, this can be
reverted again.
the J_SIG0 field. While here, rename J_SIG0 to J_SIGSET and
remove J_SIG1. The main reason for this change is that the
128-bit sigset_t is now aligned on a 16-byte boundary, which
allows us to use 16-byte atomic loads and stores on CPUs that
support it. The removal of J_SIG1 is done to avoid confusion:
it is never accessed and should not be. Renaming J_SIG0 to
J_SIGSET is the icing on the cake that's better done now than
later.
drain routines are done by swi_net, which allows for better queue control
at some future point. Packets may also be directly dispatched to a netisr
instead of queued, this may be of interest at some installations, but
currently defaults to off.
Reviewed by: hsu, silby, jayanth, sam
Sponsored by: DARPA, NAI Labs
- Use gbincore() and not incore() so that we can drop the vnode interlock
as we acquire the buflock.
- Use GB_LOCK_NOWAIT when getting bufs for read ahead clusters so that we
don't block on locked bufs.
- Convert a while loop to a howmany() that will most likely be faster on
modern processors. There is another while loop divide that was left
near by because it is operating on a 64bit int and is most likely faster.
- Cleanup the cluster_read() code a little to get rid of a goto and make
the logic clearer.
Tested on: x86, alpha
Tested by: Steve Kargl <sgk@troutmask.apl.washington.edu>
Reviewd by: arch
already own. The mtx_trylock() will fail however. Enhance the comment
at the top of the try lock function to explain this.
Requested by: jlemon and his evil netisr locking
- Add a comment about special lock order rules and Giant near the top of
subr_witness.c. Specifically, this documents and explains the real lock
order relationship between Giant and sleepable locks (i.e. lockmgr locks
and sx locks). Basically, Giant can be safely acquired either before or
after sleepable locks and the case of Giant before a sleepable lock is
exempted as a special case.
- Add a new static function 'witness_list_lock()' that displays a single
line of information about a struct lock_instance. This is used to
make the output of witness messages more consistent and reduce some code
duplication.
- Fixup a few comments in witness_lock().
- Properly handle the Giant-before-sleepable-lock lock order exception in
a more general fashion and remove the no longer needed LI_SLEPT flag.
- Break up the last condition before assuming a reversal a bit to try
and make the logic less confusing in witness_lock().
- Axe WITNESS_SLEEP() now that LI_SLEPT is no longer needed and replace it
with a more general WITNESS_WARN() macro/function combination.
WITNESS_WARN() allows you to output a customized message out to the
console along with a list of held locks. It will optionally drop into
the debugger as well. You can exempt a single lock from the check by
passing it in as the second argument. You can also use flags to specify
if Giant should be exempt from the check, if all sleepable locks should
be exempt from the check, and if witness should panic if any non-exempt
locks are found.
- Make the witness_list() function static. Other areas of the kernel
should use the new WITNESS_WARN() instead.
- Declare some local variables at the top of the function instead of in a
nested block.
- Use mtx_owned() instead of masking off bits from mtx_lock manually.
- Read the value of mtx_lock into 'v' as a separate line rather than inside
an if statement for clarity. This code is hairy enough as it is.
owned. Previously the KASSERT would only trigger if we successfully
acquired a lock that we already held. However, _obtain_lock() fails to
acquire locks that we already hold, so the KASSERT was never checked in
the case it was supposed to fail.
interactivity of a kseg and assigns it a value of 0 through 100.
- Use sched_interact_score() to determine the dynamic priority.
- Define SCHED_CURR() in terms of sched_interact_score().
- Adjust the maximum slice back down to 100ms.
- Remove redundant clearing of ke_runq in sched_wakeup()
- Clean up #defines and comment them.
- Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the LK_NOWAIT
flag to the initial BUF_LOCK(). This will eventually be used in cases
were we want to use a buffer only if it is not currently in use.
- Convert all consumers of the getblk() api to use this extra parameter.
Reviwed by: arch
Not objected to by: mckusick
Remove extraneous uses of vop_null, instead defering to the default op.
Rename vnode type "vfs" to the more descriptive "syncer".
Fix formatting for various filesystems that use vop_print.
branches:
Initialize struct cdevsw using C99 sparse initializtion and remove
all initializations to default values.
This patch is automatically generated and has been tested by compiling
LINT with all the fields in struct cdevsw in reverse order on alpha,
sparc64 and i386.
Approved by: re(scottl)
calculations. Keep this changes local to the function so the tick count
is in its natural form otherwise. Previously 1000 was added each time
a tick fired and we divided by 1000 when it was reported. This is done
to reduce rounding errors.
This should fix some problem of SBP2 device probing.
Prior to rev 1.41, we keep writing the register while bus reset phase.
But in rev 1.41, we ignore successive bus reset events and some chips seem to
clear the register after we write to it.
Tested by: Michael Reifenberger <root@nihil.reifenberger.com>
permit users and groups to bind ports for TCP or UDP, and is intended
to be combined with the recently committed support for
net.inet.ip.portrange.reservedhigh. The policy is twiddled using
sysctl(8). To use this module, you will need to compile in MAC
support, and probably set reservedhigh to 0, then twiddle
security.mac.portacl.rules to set things as desired. This policy
module only restricts ports explicitly bound using bind(), not
implicitly bound ports where the port number is selected by the
IP stack. It appears to work properly in my local configuration,
but needs more broad testing.
A sample policy might be:
# sysctl security.mac.portacl.rules="uid:425:tcp:80,uid:425:tcp:79"
This permits uid 425 to bind TCP sockets to ports 79 and 80. Currently
no distinction is made for incoming vs. outgoing ports with TCP,
although that would probably be easy to add.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
queue items that can be allocated by netgraph and the number of free queue
items that are cached on a private list.
Netgraph places an upper limit on the number of queue items it may allocate.
When there is a large number of netgraph messages travelling through the
system (100k/sec and more) there is a high probability, that messages get
queued at the nodes and netgraph runs out of queue items. In this case the data
flow through netgraph gets blocked. The tuneable for the number of free
items lets one trade memory for performance.
The tunables are also available as read-only sysctls.
PR: kern/47393
Reviewed by: julian
Approved by: jake (mentor)
Always use sys/conf/kern.mk when building kernel/modules.
<bsd.kern.mk> is only preserved for sys/boot/pc98/boot2
for now, but this will be fixed. If there are other
users of <bsd.kern.mk>, please let me know.
Reminded by: bde
Clear the LQICRC_NLQ status should it pop up after we have
already handled the SCSIPERR. During some streaming operations
this status can be delayed until the stream ends. Without this
change, the driver would complain about a "Missing case in
ahd_handle_scsiint".
In the LQOBUSFREE handler...
Don't return the LQOMGR back to the idle state until after
we have cleaned up ENSELO and any status related to this
selection. The last thing we need is the LQO manager starting
another select-out before we have updated the execution queue.
It is not clear whether the LQOMGR would, or would not
start a new selection early.
Make sure ENSELO is off prior to clearing SELDO by flushing
device writes.
Move assignment of the next target SCB pointer inside of
an if to make the code clearer. The effect is the same.
Dump card state in both "Unexpected PKT busfree" paths.
In ahd_reset(), set the chip to SCSI mode before reading SXFRCTL1.
That register only exists in the SCSI mode. Also set the mode
explicitly to the SCSI mode after chip reset due to paranoia.
Re-arrange code so that SXFRCTL1 is restored as quickly after the
chip reset as possible.
S/G structurs must be 8byte aligned. Make this official by saying
so in our DMA tag.
Disable CIO bus stretch on MDFFSTAT if SHVALID is about to come
true. This can cause a CIO bus lockup if a PCI or PCI-X error
occurs while the stretch is occurring - the host cannot service
the PCI-X error since the CIO bus is locked out and SHVALID will
never resolve. The stretch was added in the Rev B to simplify the
wait for SHVALID to resolve, but the code to do this in the open
source sequencer is so simple it was never removed.
Consistently use MAX_OFFSET for the user max syncrate set from
non-volatile storage. This ensures that the offset does not
conflict with AH?_OFFSET_UNKNOWN.
Have ahd_pause_and_flushwork set the mode to ensure that it has
access to the registers it checks. Also modify the checking of
intstat so that the check against 0xFF can actually succeed if
the INT_PEND mask is something other than 0xFF. Although there
are no cardbus U320 controllers, this check may be needed to
recover from a hot-plug PCI removal that occurs without informing
the driver.
Fix a typo. sg_prefetch_cnt -> sg_prefetch_align. This fixes
an infinite loop at card initialization if the cacheline size is 0.
aic79xx.h:
Add AHD_EARLY_REQ_BUG bug flag.
Fix spelling errors.
Include the CDB's length just after the CDB pointer in the DMA'ed
CDB case.
Change AH?_OFFSET_UNKNOWN to 0xFF. This is a value that the
curr->offset can never be, unlike '0' which we previously used.
This fixes code that only checks for a non-zero offset to
determine if a sync negotiation is required since it will fire
in the unknown case even if the goal is async.
aic79xx.reg:
Add comments for LQISTAT bits indicating their names in the 7902
data book. We use slightly different and more descriptive names
in the firmware.
Fix spelling errors.
Include the CDB's length just after the CDB pointer in the DMA'ed
CDB case.
aic79xx.seq:
Update comments regarding rundown of the GSFIFO to reflect reality.
Fix spelling errors.
Since we use an 8byte address and 1 byte length, shorten the size
of a block move for the legacy DMA'ed CDB case from 11 to 9 bytes.
Remove code that, assuming the abort pending feature worked, would
set MK_MESSAGE in the SCB's control byte on completion to catch
invalid reselections. Since we don't see interrupts for completed
selections, this status update could occur prior to us noticing the
SELDO. The "select-out" queue logic will get confused by the
MK_MESSAGE bit being set as this is used to catch packatized
connections where we select-out with ATN. Since the abort pending
feature doesn't work on any released controllers yet, this code was
never executed.
Add support for the AHD_EARLY_REQ_BUG. Don't ignore persistent REQ
assertions just because they were asserted within the bus settle delay
window. This allows us to tolerate devices like the GEM318 that
violate the SCSI spec.
Remove unintentional settnig of SG_CACHE_AVAIL. Writing this bit
should have no effect, but who knows...
On the Rev A, we must wait for HDMAENACK before loading additional
segments to avoid clobbering the address of the first segment in
the S/G FIFO. This resolves data-corruption issues with certain
IBM (now Hitachi) and Fujitsu U320 drives.
Rearrange calc_residual to avoid an extra jmp instruction.
On RevA Silicon, if the target returns us to data-out after we
have already trained for data-out, it is possible for us to
transition the free running clock to data-valid before the required
100ns P1 setup time (8 P1 assertions in fast-160 mode). This will
only happen if this L-Q is a continuation of a data transfer for
which we have already prefetched data into our FIFO (LQ/Data
followed by LQ/Data for the same write transaction). This can
cause some target implementations to miss the first few data
transfers on the bus. We detect this situation by noticing that
this is the first data transfer after an LQ (LQIWORKONLQ true),
that the data transfer is a continuation of a transfer already
setup in our FIFO (SAVEPTRS interrupt), and that the transaction
is a write (DIRECTION set in DFCNTRL). The delay is performed by
disabling SCSIEN until we see the first REQ from the target.
Only compile in snapshot savepointers handler for RevA silicon
where it is enabled.
Handle the cfg4icmd packetized interrupt. We just need to load
the address and count, start the DMA, and CLRCHN once the transfer
is complete.
Fix an oversight in the overrun handler for packetized status
operations. We need to wait for either CTXTDONE or an overrun
when checking for an overrun. The previous code did not wait
and thus could decide that no overrun had occurred even though
an overrun will occur on the next data-valid req. Add some
comment to this section for clarity.
Use LAST_SEG_DONE instead of LASTSDONE for testing transfer
completion in the packetized status case. LASTSDONE may come up
more quickly since it only records completion on the SCSI side,
but since LAST_SEG_DONE is used everywhere else (and needs to be),
this is less confusing.
Add a missing invalidation of the longjmp address in the non-pack
handler. This code needs additional review.
aic79xx_inline.h:
Fix spelling error.
aic79xx_osm.c:
Set the cdb length for CDBs dma'ed from host memory.
Add a comment indicating that, should CAM start supporting cdbs
larger than 16bytes, the driver could store the CDB in the status
buffer.
aic79xx_pci.c:
Add a table entry for the 39320A.
Added a missing comma to an error string table.
Fix spelling errors.
Move <sys/conf.h> before <sys/disk.h>.
No need for raidread()/raidwrite(), we have generic code for that.
Remove non-functional dump code.
Make raidinit() return the softc, not the dev_t.
Move to "struct disk*" centric API.
Fix printfs' to get name from struct disk instead of dev_t.
OK'ed by: scottl
don't end up freezing the box. This makes VTY locking useless
in the DDB case but a box which is supposed to be physically
secure shouldn't compile DDB anyway.
Reviewed by: silence on -audit
To do this, initialize the d_maj member of the cdevsw to MAJOR_AUTO.
When the cdevsw is first passed to make_dev() a free major number
will be assigned.
Until we have a bit more experience with this a printf will announce
this fact.
Major numbers are not reclaimed, so loading/unloading the same
device driver which uses MAJOR_AUTO will eventually deplete the
pool of free major numbers and the system will panic when it can
not allocate one. Still undecided who to invonvenience with the
solution to this.
Improve SBP device probeing:
- Wait 2 sec before issuing LOGIN ORB expecting the reconnection
hold timer expires.
- Serialize management ORB and scanning LUN by CAM on each target.
This should fix the problem for devices which have multiple LUNs.
Test device is donated by: Jaye Mathisen <mrcpu@internetcds.com>
- Freeze SIM queue for 2 sec after BUS RESET.
- Retry with LOGIN rather than RECONNECT after LOGIN is not completed for
BUS RESET.
- Use appropriate CAM status for BUS RESET and DEVICE RESET.
- Let CAM to scan targets after BUS REST.
- Implement CAM scan target function.
- Keep our own devq freeze count.
- Let CAM to know that SBP does tagged queuing.
These should be merged to RELENG_4 before 4.8-RELEASE.
a correctly aligned address in this block we really want to check, that the
part of the chunk that starts at the aligned address is large enough with
regard to the original request. Comparing it to 0 makes no sense, because this
is always true except in the rare case, that the aligned address is just at
the end of the chunk.
Approved by: jake (mentor)
td_wmesg field in the thread structure points to the description string of
the condition variable or mutex. If the condvar or the mutex had been
initialized from a loadable module that was unloaded in the meantime,
td_wmesg may now point to invalid memory. Retrieving the process table now
may panic the kernel (or access junk). Setting the td_wmesg field to NULL
after unblocking on the condvar/mutex prevents this panic.
PR: kern/47408
Approved by: jake (mentor)
Fixed memory leak in the "nodevice" option implementation.
Use these instead of sed(1) in MD NOTES.
Use a single makefile (sys/conf/makeLINT.mk) to generate
LINT for all architectures. (Previous versions missed
the LINT dependency on Makefile, and i386 version also
missed the dependency on ${NOTES}.)
Fixed bugs in the previous NOTES conversion using the
"nodevice" token and sed(1):
- i386 LINT lost "device pst".
- pc98 LINT lost SC_*, MAXCONS and KBD_DISABLE_KEYMAP_LOAD
options, and got needless DPT_* options.
- Added nooptions PPC_DEBUG, PPC_PROBE_CHIPSET, KBD_INSTALL_CDEV
to sparc64 LINT so that it has a chance to config(8).
This basically returns us to where we were before.
the fxp driver. This is enabled only for the 82550/82551 chips
(PCI revision code 12 or 13). RX and TX checksum offload are
both supported. Transmit offload is limited to TCP and UDP only
right now: there seems to be a problem with IP header checksumming
on transmit in some cases.
This chip has hardware VLAN support as well. I hope to enable
support for this eventually.
- Make transmission of packets work again. This stopped working because
ether_ifattach() was forcing ifp->if_output to be ether_output() and
clobbering our attempt to override this vector with a pointer to
ng_fec_output(). Move the overriding of ifp->if_output to after
ether_ifattach().
- Abandon the use of the netgraph ng_ether_input_p hook for snagging
incoming frames, and instead override the ifp->if_input vector for
interfaces that have been aggregated into our bundle. (I would have
loved to have written things this way in the first place, but I
didn't want to have to be the one to implement the if_input hook
and change all the drivers.) This avoids collisions with the ng_ether
module, which uses the same hook. Each aggregated device now calls
ng_fec_input() directly, which then fakes up the rcvif pointer
before invoking ifp->if_input itself.
This module should actually work now.
that matches snd_max, then do not respond with an ack, just drop the
segment. This fixes a problem where a simultaneous close results in
an ack loop between two time-wait states.
Test case supplied by: Tim Robbins <tjr@FreeBSD.ORG>
Sponsored by: DARPA, NAI Labs
introduce a preprocessor define for it. The larger block size
significantly speeds up the loading of the kernel.
Submitted by: Arun Sharma <arun.sharma@intel.com>
changed since this code was written:
- The ng_ether_input_p hook only accepts two arguments now: the pointer
to the ether header structure is gone.
- It's no longer necessary to cons up a fake ether header before passing
incoming packets to BPF_MTAP().
ng_fec_input() has been modified to account for these two changes.
Running tcpdump on fec0 should work now.
PR: kern/46720
- the mutex aac_io_lock protects the main codepaths which handle queues and
hardware registers. Only one acquire/release is done in the top-half and
the taskqueue. This mutex also applies to the userland command path and
CAM data path.
- Move the taskqueue to the new Giant-free version.
- Register the disk device with DISKFLAG_NOGIANT so the top-half processing
runs without Giant.
- Move the dynamic command allocator to the worker thread to avoid locking
issues with bus_dmamem_alloc().
This gives about 20% improvement in most of my benchmarks.
turns runs its tasks free of Giant too. It is intended that as drivers
become locked down, they will move out of the old, Giant-bound taskqueue
and into this new one. The old taskqueue has been renamed to
taskqueue_swi_giant, and the new one keeps the name taskqueue_swi.
an if clause was true. Break the two clauses out into seperate statements
since they require different actions.
Reported/Tested by: jake
Spotted by: jhb
delta 1.371) we must ensure that we do not get ourselves into a
recursive trap endlessly trying to clean up after ourselves.
Reported by: Attila Nagy <bra@fsn.hu>
Sponsored by: DARPA & NAI Labs.
from the filesystem size field to the filesystem maximum blocksize
field. The problem is that older versions of growfs updated only the
new size field and not the old size field. This resulted in the old
(smaller) size field being copied up to the new size field which
caused the filesystem to appear to fsck to be badly trashed.
This also adds a sanity check to ensure that the superblock is not
being updated when the filesystem is mounted read-only. Obviously
such an update should never happen.
Reported by: Nate Lawson <nate@root.org>
Sponsored by: DARPA & NAI Labs.
o Always check for null when dereferencing the filename component.
o Implement a try-and-backoff method for allocating memory to
dump stats to avoid a spin-lock -> sleep-lock mutex lock order
panic with WITNESS.
Approved by: des, markm (mentor)
Not objected: jhb
for testing and setting the current and alternate address spaces.
- Changed PTDpde and APTDpde to arrays to support multiple page directory
pages.
ponsored by: DARPA, Network Associates Laboratories