ast().
- Actually set KEF_ASTPENDING so ast() is called. I think this is buggy
for a process with multiple KSE's in that PS_XCPU is not a KSE event,
it's a process-wide event. IMO there really should probably be two
ASTPENDING flags, one for per-process, and one for per-KSE.
Submitted by: bde
supported option and it disabled a whole 2 lines of bootverbose messages.
I wanted to see 1 of the messages (about the latency timers). This
is a wrong place to decode pci configurations, but the code is already
here and handles more details than pciconf(8).
Peter had repocopied sys/disklabel.h to sys/diskpc98.h and sys/diskmbr.h.
These two new copies are still intact copies of disklabel.h and
therefore protected by #ifndef _SYS_DISKLABEL_H_ so #including them
in programs which already include <sys/disklabel.h> is currently a
no-op.
This commit adds a number of such #includes.
Once I have verified that I have fixed all the places which need fixing,
I will commit the updated versions of the three #include files.
Sponsored by: DARPA & NAI Labs.
(1) Where previously the pipe mutex was selectively grabbed during
pipe_ioctl(), now always grab it and then release it if not
needed. This protects the call to mac_check_pipe_ioctl() to
make sure the label remains consistent. (Note: it looks
like sigio locking may be incorrect for fgetown() since we
call it not-by-reference and sigio locking assumes call by
reference).
(2) In pipe_stat(), lock the pipe if MAC is compiled in so that
the call to mac_check_pipe_stat() gets a locked pipe to
protect label consistency. We still release the lock before
returning actual stat() data, risking inconsistency, but
apparently our pipe locking model accepts that risk.
(3) In various pipe MAC authorization checks, assert that the pipe
lock is held.
(4) Grab the lock when performing a pipe relabel operation, and
assert it a little deeper in the stack.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
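The "always grab, then release if not needed" shape from item (1), together
with the lock assertion from item (3), can be sketched in userland as below.
This is only a toy model under stated assumptions: a boolean stands in for the
pipe mutex, an int stands in for the MAC label, and every model_* and MODEL_*
name is hypothetical rather than taken from sys_pipe.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy pipe: a flag stands in for the pipe mutex so ownership can be asserted. */
struct pipe_model {
	bool locked;
	int label;	/* stand-in for the MAC label; negative means "deny" */
	int nread;	/* bytes buffered in the pipe */
};

/* Hypothetical command numbers, not the real ioctl values. */
enum { MODEL_FIONREAD = 1, MODEL_FIOASYNC = 2 };

void
model_lock(struct pipe_model *p)
{
	assert(!p->locked);
	p->locked = true;
}

void
model_unlock(struct pipe_model *p)
{
	assert(p->locked);
	p->locked = false;
}

/* Authorization check; as in item (3), it asserts that the lock is held. */
int
model_check_ioctl(const struct pipe_model *p)
{
	assert(p->locked);
	return (p->label >= 0 ? 0 : -1);
}

int
model_pipe_ioctl(struct pipe_model *p, int cmd, int *out)
{
	int error;

	model_lock(p);			/* always take the lock first */
	error = model_check_ioctl(p);	/* the label cannot change here */
	if (error != 0) {
		model_unlock(p);
		return (error);
	}
	switch (cmd) {
	case MODEL_FIONREAD:		/* needs the lock for its own work */
		*out = p->nread;
		model_unlock(p);
		return (0);
	case MODEL_FIOASYNC:		/* does not: release, then do the work */
		model_unlock(p);
		*out = 0;
		return (0);
	default:
		model_unlock(p);
		return (-1);
	}
}
```

Every return path drops the lock exactly once, which is the property the
unconditional grab makes easy to check by eye.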
on a process's pending signals, use the signal queue flattener,
ksiginfo_to_sigset_t, on the process, and on a local sigset_t, and then work
with that as needed.
__mac_get_pid Retrieve MAC label of a process by pid
Similar to __mac_get_proc() except that the target process of
the operation is explicitly specified rather than assuming
curthread.
__mac_get_link Retrieve MAC label of a path with NOFOLLOW
__mac_set_link Set MAC label of a path with NOFOLLOW
extattr_set_link Set EAs on a path with NOFOLLOW
extattr_get_link Retrieve EAs on a path with NOFOLLOW
extattr_delete_link Delete EAs on a path with NOFOLLOW
These calls are similar to __mac_get_file(), __mac_set_file(),
extattr_set_file(), extattr_get_file(), and extattr_delete_file(),
except that they do not follow symlinks. The distinction between
these calls is similar to lchown() vs chown().
Implementations to follow.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
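The follow/NOFOLLOW distinction described above is the same one that separates
stat() from lstat(), which makes it easy to show with existing POSIX calls.
The sketch below does not use the new syscalls at all; nofollow_demo() is a
hypothetical helper that assumes a writable /tmp.

```c
#define _XOPEN_SOURCE 700
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Returns 0 if stat() saw the link target (a directory) while lstat()
 * saw the symlink itself, and -1 on any failure.
 */
int
nofollow_demo(void)
{
	char dir[] = "/tmp/nofollow.XXXXXX";
	char lnk[64];
	struct stat follow, nofollow;
	int rv = -1;

	if (mkdtemp(dir) == NULL)
		return (-1);
	snprintf(lnk, sizeof(lnk), "%s/link", dir);
	/* "link" points at ".", i.e. at the temporary directory itself. */
	if (symlink(".", lnk) == 0) {
		if (stat(lnk, &follow) == 0 && lstat(lnk, &nofollow) == 0 &&
		    S_ISDIR(follow.st_mode) && S_ISLNK(nofollow.st_mode))
			rv = 0;
		unlink(lnk);
	}
	rmdir(dir);
	return (rv);
}
```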
A number of functions in this driver still use the unit number in their
printouts because they pass the unit directly as a function argument
instead of passing a softc or struct ifnet pointer. This should be
resolved at a future date.
I've added a structure, kernel-private, to represent a pending or in-delivery
signal, called `ksiginfo'. It is roughly analogous to the basic information
that is exported by the POSIX interface 'siginfo_t', but more basic. I've
added functions to allocate these structures, and further to wrap all signal
operations using them.
Once the operations are wrapped, I've added a TailQ (see queue(3)) of these
structures to 'struct proc', and all pending signals are in that TailQ. When
a signal is being delivered, it is dequeued from the list. Once I finish
the spreading of ksiginfo throughout the tree, the dequeued structure will be
delivered to the process in question, whereas currently and normally, the
signal number is what is used.
has exceeded its CPU time limit.
- In mi_switch(), set PS_XCPU when the CPU time limit is exceeded.
- Perform actual CPU time limit exceeded work in ast() when PS_XCPU is set.
Requested by: many
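The split described above (note the condition cheaply in the switch path, do
the real work later in ast()) can be sketched as a toy model. Flag values,
field names and function names below are assumptions made for illustration
and do not match the kernel's definitions.

```c
/* Hypothetical flag values for the model. */
#define MODEL_PS_XCPU		0x1u
#define MODEL_ASTPENDING	0x2u

struct proc_model {
	unsigned flags;
	unsigned long runtime_us;	/* CPU time consumed so far */
	unsigned long limit_us;		/* CPU time limit */
	int xcpu_delivered;		/* how often the limit work ran */
};

/* Context-switch path: only record that the limit was crossed. */
void
model_mi_switch(struct proc_model *p, unsigned long ran_us)
{
	p->runtime_us += ran_us;
	if (p->runtime_us > p->limit_us)
		p->flags |= MODEL_PS_XCPU | MODEL_ASTPENDING;
}

/* Return-to-user path: do the actual work, then clear the flags. */
void
model_ast(struct proc_model *p)
{
	if (p->flags & MODEL_PS_XCPU) {
		p->flags &= ~MODEL_PS_XCPU;
		p->xcpu_delivered++;	/* stand-in for the limit-exceeded work */
	}
	p->flags &= ~MODEL_ASTPENDING;
}
```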
interlock in getnewvnode() to avoid possible sleeps while holding
the mutex. Note that the warning from Witness is a slight false
positive since we know there will be no contention on the interlock
since we haven't made the vnode available for use yet, but the theory
is not a bad one.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
prototyped functions to get a sigset_t, and further to check for any
queued signals, rather than an empty signal set, to go with the move
to signal queues rather than signal sets.
gets signals operating based on a TailQ, and is good enough to run X11,
GNOME, and do job control. There are some intricate parts which could be
more refined to match the sigset_t versions, but those require further
evaluation of directions in which our signal system can expand and contract
to fit our needs.
After this has been in the tree for a while, I will make in-kernel API
changes, most notably to trapsignal(9) and sendsig(9), to use ksiginfo
more robustly, such that we can actually pass information with our
(queued) signals to the userland. That will also result in using a
struct ksiginfo pointer, rather than a signal number, in a lot of
kern_sig.c, to refer to an individual pending signal queue member, but
right now there is no defined behaviour for such.
CODAFS is unfinished in this regard because the logic is unclear in
some places.
Sponsored by: New Gold Technology
Reviewed by: bde, tjr, jake [an older version, logic similar]
timestamped TCP packets where FreeBSD will send DATA+FIN and
a W2K box will ack just the DATA portion. If this occurs
after FreeBSD has done a (NewReno) fast-retransmit and is
recovering it (dupacks > threshold) it triggers a case in
tcp_newreno_partial_ack() (tcp_newreno() in stable) where
tcp_output() is called with the expectation that the retransmit
timer will be reloaded. But tcp_output() falls through and
returns without doing anything, causing the persist timer to be
loaded instead. This causes the connection to hang until W2K gives up.
This occurs because in the case where only the FIN must be acked, the
'len' calculation in tcp_output() will be 0, a lot of checks will be
skipped, and the FIN check will also be skipped because it is designed
to handle FIN retransmits, not forced transmits from tcp_newreno().
The solution is to simply set TF_ACKNOW before calling tcp_output()
to absolutely guarantee that it will run the send code and reset the
retransmit timer. TF_ACKNOW is already used for this purpose in other
cases.
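The failure mode and the fix can be illustrated with a deliberately tiny
model. This is not the logic of the real tcp_output(); the structure, the
MODEL_TF_ACKNOW value and both functions are hypothetical and keep only the
one decision that matters here: with nothing sendable and no forcing flag,
the output routine returns early and the wrong timer ends up armed.

```c
#include <stdbool.h>

#define MODEL_TF_ACKNOW	0x1u	/* hypothetical value for the model */

struct conn_model {
	unsigned flags;
	int sendable_len;	/* data bytes the output routine would send */
	bool rexmt_armed;	/* retransmit timer loaded */
	bool persist_armed;	/* persist timer loaded */
};

/* Returns true if a segment was sent. */
bool
model_output(struct conn_model *c)
{
	if (c->sendable_len > 0 || (c->flags & MODEL_TF_ACKNOW) != 0) {
		c->flags &= ~MODEL_TF_ACKNOW;
		c->rexmt_armed = true;
		c->persist_armed = false;
		return (true);
	}
	/* Nothing to send: fall through and leave only the persist timer. */
	c->persist_armed = true;
	return (false);
}

/* Partial-ack handling; 'force' models setting the flag before the call. */
bool
model_partial_ack(struct conn_model *c, bool force)
{
	if (force)
		c->flags |= MODEL_TF_ACKNOW;
	return (model_output(c));
}
```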
For some unknown reason this patch also seems to greatly reduce
the number of duplicate acks received when Guido runs his tests over
a lossy network. It is quite possible that there are other
tcp_newreno{_partial_ack()} cases which were not generating the expected
output which this patch also fixes.
X-MFC after: Will be MFC'd after the freeze is over
of 1 so that it is not probed until after acpi0 is probed and attached.
- In legacy_probe(), return ENXIO if acpi0 is around and alive.
- nexus_attach() is now much simpler and just lets its child drivers do
all the work.
and attach routines have succeeded so that if they fail we can still use
the PnP BIOS to find ISA on-board devices. The fact that we do this here
is gross but fixing it properly involves a lot more work.
code path to fix a bug in the non USB_USE_SOFTINTR path that caused
the usb bus to hang and generally misbehave when devices were unplugged.
In the process though it also reduced the throughput of usb devices because
of a less than optimal implementation under FreeBSD.
This commit fixes the non USB_USE_SOFTINTR code in uhci and ohci
so that it works again, and switches back to using this code path.
The uhci code has been tested, but the ohci code hasn't. It's
essentially the same anyway and so I don't envisage any difficulties.
Code for uhci submitted by: Maksim Yevmenkin <myevmenk@exodus.net>
testing any modifications to them, they shouldn't even bother with
disklabels in the first place and they are just plain obsolete old
hardware which should be axed entirely before 5.0-R IMO.
Sponsored by: DARPA & NAI Labs.
from stopping another thread from completing a syscall, and this allows it to
release its resources etc. Probably more related commits to follow (at least
one I know of)
Initial concept by: julian, dillon
Submitted by: davidxu
device_t the same throughout kernel.
This is a very fine point of C which fortunately does not make any
difference in normal circumstances but which due to the pervasiveness
of device_t in the kernel can make lint barf a lot.
sparc v9 ABI. The Elf_Rela records for local symbols appear to already
have the symbol's value added in to the addend field, even though the ABI
specifies we need to lookup the symbol and add its value too. This breaks
text relocations in klds because the symbol's value is added twice, and
the resulting address points off into nowhere land, so for now just use
the addend.
Tested by: rwatson
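The double addition is easy to see with numbers. In the sketch below the
function names and the explicit load base are assumptions for illustration,
not the kernel linker's actual interface; the point is only the arithmetic.

```c
#include <stdint.h>

/* What the ABI text describes: symbol value plus addend (plus load base). */
uint64_t
model_reloc_abi(uint64_t base, uint64_t symval, int64_t addend)
{
	return (base + symval + (uint64_t)addend);
}

/* The workaround: trust the addend, which already includes the symbol. */
uint64_t
model_reloc_addend_only(uint64_t base, int64_t addend)
{
	return (base + (uint64_t)addend);
}
```

For a local symbol at 0x1000 referenced with an offset of 8, a record whose
addend already holds 0x1008 relocates to base + 0x2008 under the first rule
and to the intended base + 0x1008 under the second.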
The advanced stage of computer assisted hardware design and
verification is aptly illustrated by the fact that this is necessary
because only the first port in a single-chip, dual-port async
PC-Card product lacks this register.
that this will make people use this for their future copy&paste operations.
Rework the detection of raw-disk offsets in disklabels. This actually
unearthed a number of bugs in the (now) previous version.
Also accept labels which don't have a magic RAW_PART, provided they don't
confuse us too much.
Change the order of our sanity-checks on labels found on disks to be more
robust.
Check against MAXPARTITIONS in our sanity-check and reject disklabels
we cannot cope with.
Create new g_bsd_modify() function to implement disklabel modifying
ioctls.
Implement DIOCSDINFO and DIOCWDINFO with the provision that the latter
still does not write your change back to disk. I didn't have the nerve
for that yet.
In the start routine, use g_call_me() for complex ioctls to prevent
sleeping.
Sponsored by: DARPA & NAI Labs.
with support for trying, doing and forcing.
This will eventually replace g_slice_addslice() which gets changed from
grabbing topology to requiring it in this commit as well.
Sponsored by: DARPA & NAI Labs.
work.
This prevents people from sleeping in the UP/DOWN I/O path by mistake
or design (doing so almost invariably results in deadlocks since it
stalls all I/O processing in the given direction).
Sponsored by: DARPA & NAI Labs.
a disklabel modification tries to change an open device, and no
counter-examples exist.
Be less fascist about when we can do Setattr, the openmodes of devices
are so loosely managed that the "exclusive" count is almost useless.
Sponsored by: DARPA & NAI Labs.
Add a __unused.
Make the 2byte decoder functions return 16 bits for the benefits
of picky lints.
No need to grab Giant around a tsleep() when we have a timeout.
Sponsored by: DARPA & NAI Labs.
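For the 2-byte decoders, the point is that the return type is exactly 16 bits
wide, with an explicit cast after the int-promoted shift. A minimal sketch,
with hypothetical function names:

```c
#include <stdint.h>

/* Decode two bytes, least significant byte first. */
uint16_t
model_dec_le2(const unsigned char *p)
{
	return ((uint16_t)(p[0] | (p[1] << 8)));
}

/* Decode two bytes, most significant byte first. */
uint16_t
model_dec_be2(const unsigned char *p)
{
	return ((uint16_t)((p[0] << 8) | p[1]));
}
```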
to be performed in the event-thread.
To do this, we need to lock the eventlist with g_eventlock (nee g_doorlock),
since g_call_me() being called from the UP/DOWN paths will not be able to
acquire g_topology_lock.
This also means that for now these events are not referenced on any
particular consumer/provider/geom.
For UP/DOWN path use, this will not become a problem since the access()
function will make sure we drain any bio's before we dismantle.
Sponsored by: DARPA & NAI Labs.
that a particular device driver is not Giant-challenged.
SPECFS will DROP_GIANT() ... PICKUP_GIANT() around calls to the
driver in question.
Notice that the interrupt path is not affected by this!
This does _NOT_ work for drivers accessed through cdevsw->d_strategy()
ie drivers for disk(-like), some tapes, maybe others.
Setting this flag on an ethernet interface blocks transmission of packets
and discards incoming packets after BPF processing.
This is useful if you want to monitor network traffic but not interact
with the network in question.
Sponsored by: http://www.babeltech.dk
if they are not going to cross over themselves. Also change how the list of
completed user threads is tracked and passed to the KSE. This is not
a change in design but rather the implementation of what was originally
envisioned.
aic79xx.c:
o Remove redundant ahd_update_modes() call.
o Correct panic in diagnostic should state corruption cause
the SCB Id to be invalid during a selection timeout.
o Add workaround for missing BUSFREEREV feature in Rev A silicon.
o Correct formatting nits.
o Use register pretty printing in more places.
o Save and restore our SCB pointer when updating the waiting queue
list for an "expected" LQ-out busfree.
o In ahd_clear_intstat, deal with the missing autoclear in the
CLRLQO* registers.
o BE fixup in a diagnostic printf.
o Make sure that we are in the proper mode before disabling
selections in ahd_update_pending_scbs.
o Add more diagnostics.
o task_attribute_nonpkt_tag -> task_attribute: we don't need a
nonpkt_tag field anymore for allowing all 512 SCBs to be
used in non-packetized connections.
o Negotiate HOLD_MCS to U320 devices.
o Add a few additional mode assertions.
o Restore the chip mode after clearing out the qinfifo so that
code using ahd_abort_scbs sees a consistent mode.
o Simplify the DMA engine shutdown routine prior to performing
a bus reset.
o Perform the sequencer restart after a chip reset prior to
setting up our timer to poll for the reset to be complete.
On some OSes, the timer could actually pre-empt us and order
is important here.
o Have our "reset poller" set the expected mode since there is
no guarantee of what mode will be in force when we are called
from the OS timer.
o Save and restore the SCB pointer in ahd_dump_card_state(). This
routine must not modify card state.
o Ditto for ahd_dump_scbs().
aic79xx.h:
o Add a few more chip bug definitions.
o Align our tag on a 32bit boundary.
aic79xx.reg:
aic79xx.seq:
o Start work on removing workarounds for Rev B.
o Use a special location in scratch for storing
our SCBPTR during legacy FIFO allocations. This corrects
problems in mixed packetized/non-packetized configurations
where calling into a FIFO task corrupted our SCBPTR.
o Don't rely on DMA priority to guarantee that all data in
our FIFOs will flush prior to a command completion notification
going out of the command channel. We've never seen this assumption
fail, but better safe than sorry.
o Deal with missing BUSFREEREV feature in H2A.
o Simplify disconnect list code now that the list will always
have only a single entry.
o Implement the AHD_REG_SLOW_SETTLE_BUG workaround.
o Switch to using "REG_ISR" for local mode scratch during
our ISR.
o Add a missing jmp to the data_group_dma_loop after our
data pointers have been re-initialized by the kernel.
o Correct test in the bitbucket code so that we actually
wait for the bitbucket to complete before signaling the
kernel of the overrun condition.
o Reposition pkt_saveptrs to avoid a jmp instruction.
o Update a comment to reflect that the code now waits for
a FIFO to drain prior to issuing a CLRCHN.
aic79xx_inline.h:
o Remove unused untagged queue handling code.
o Don't attempt to htole64 what could be a 32bit value.
aic79xx_pci.c:
o Set additional bug flags for rev A chips.
from DHCP in the event that no gateway is returned from DHCP, breaking
the assumption that we skip the routing insertion of the gateway
if the sin length is zero. Check also for s_addr of 0 to avoid the
"Oh no, adding my default route failed" panic, making it possible
to pxeboot machines on segments without default routes. Arguably
this could be a bug in pxeboot, or in the TUNABLE code, but this
makes my boxes boot.
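The extra test amounts to treating an all-zero gateway address the same as a
zero-length one. A sketch of that predicate follows, using a made-up
structure rather than the real sockaddr_in so that it builds anywhere; the
names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the gateway sockaddr: just a length and an IPv4 address. */
struct gw_model {
	uint8_t len;	/* 0 means "no gateway configured" */
	uint32_t addr;	/* IPv4 address; 0 means none was returned */
};

/* Only insert a default route when there is a usable gateway. */
bool
model_want_default_route(const struct gw_model *gw)
{
	if (gw->len == 0)	/* the check that already existed */
		return (false);
	if (gw->addr == 0)	/* the additional check described above */
		return (false);
	return (true);
}
```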
/h/des/src/sys/coda/coda_venus.c: In function `venus_ioctl':
/h/des/src/sys/coda/coda_venus.c:277: warning: cast from pointer to integer of
different size
/h/des/src/sys/coda/coda_venus.c:292: warning: cast from pointer to integer of
different size
/h/des/src/sys/coda/coda_venus.c: In function `venus_readlink':
/h/des/src/sys/coda/coda_venus.c:380: warning: cast from pointer to integer of
different size
/h/des/src/sys/coda/coda_venus.c: In function `venus_readdir':
/h/des/src/sys/coda/coda_venus.c:637: warning: cast from pointer to integer of
different size
Submitted by: des-alpha-tinderbox
- Make the VI asserts more orthogonal to the rest of the asserts by using a
new, common vfs_badlock() function and adding a 'str' arg.
- Adjust generated ASSERTS to match the new prototype.
- Adjust explicit ASSERTS to match the new prototype.
implement a useful VOP_BMAP() handler, so it expects the blkno not to be
changed by VOP_BMAP(). Otherwise, it'll have to find some tricky way to
determine if bp was VOP_BMAP()ed or not in VOP_STRATEGY().
PR: kern/42139
aac driver dependent on the linux emulation module. This was
especially bad for the release engineers who tried to move the
aac driver from the kernel onto the drivers floppy. The linux
compat bits for this driver are now in their own driver, aac_linux.
It can be loaded as a module or compiled into the kernel. For
the latter case, the AAC_COMPAT_LINUX option is needed, along with
the COMPAT_LINUX option.
I've tested this in every configuration I can think of. This is an
MFC candidate for 4.7.
Idea from: rwatson
MFC after: 3 days
unlocked accesses to v_usecount.
- Lock access to the buf lists in the various sync routines. Interlock
locking could be avoided almost entirely in leaf filesystems if the
fsync function had a generic helper.