We return ENOBUF to indicate the problem, which is an errno that should be
handled well everywhere.
Requested & Submitted by: green
Silently okay'ed by: The rest of the firewall gang
MFC after: 3 days
Restructure pmap_enter() to prevent the loss of a page modified (PG_M) bit
in a race between processors. (This restructuring assumes the newly atomic
pte_load_store() for correct operation.)
Reviewed by: tegge@
PR: i386/61852
problem I had but it's happening in code that is messing around with
register windows - I'm willing to live with that piece being sensitive
to this and it looks like the other problems we had reported lately
are not fixed by using -O instead of -O2.
Sorry for the churn. Looks like I need a second pointy hat. Someone
tells me they stack well. :-))))
This also adds support for bigger disks on the controller I have access to,
and maybe others if I understood the adhoc methods used on those.
Those with more PC98 bigdrive controllers it is hereby invited to add/fix
support for those in geom_pc98.c and not using #ifdef PC98 all over the place.
callouts as non-CALLOUT_MPSAFE. Otherwise, they may trigger an
assertion regarding Giant if they enter other parts of the stack from
the callout.
MFC after: 3 days
Reported by: Dikshie < dikshie at ppk dot itb dot ac dot id >
-O2 on kernel compiles after all. While working on adding a KASSERT
to sparc64/sparc64/rwindow.c I found that it was "position sensitive",
putting it above a call to flushw() instead of below caused corruption
of processes on the system. jake and jhb have both confirmed there is
no obvious explanation for that. The exact same kernel code does not
have the process corruption problem if compiled with -O instead of -O2.
There have been signs of similar issues floated on the sparc64@ mailing
list, lets see if this helps make them go away.
Note this isn't an optimal fix as far as the file format goes, if this
disgusts too many people I'll fix it the right way. Since compiling
with something other than -O is a known problem this format would prevent
a change to the default causing grief. And this may also help motivate
finding out what the compiler is doing wrong so we can shift back to
using -O2. :-)
My turn for the pointy hat... One of the florescent ones...
MFC after: 2 days
After some discussion the best option seems to be to signal the thread's
death from within the kernel. This requires that thr_exit() take an
argument.
Discussed with: davidxu, deischen, marcel
MFC after: 3 days
allocate unallocated memory resources from the top 32MB of the address
space rather than the top 2GB. While the latter works on some
chipsets, it fails badly on others. 32MB is more conservative and
matches what cheap harware from this era is hardwired to pass.
RAM. Many older, legacy bridges only allow allocation from this
range. This only appies to devices who don't have their memory
assigned by the BIOS (since we allocate the ranges so assigned
exactly), so should have minimal impact.
Hoewver, for CardBus bridges (cbb), they rarely get the resources
allocated by the BIOS, and this patch helps them greatly. Typically
the 'bad Vcc' messages are caused by this problem.
(and panic). To try to finish making BPF safe, at the very least,
the BPF descriptor lock really needs to change into a reader/writer
lock that controls access to "settings," and a mutex that controls
access to the selinfo/knote/callout. Also, use of callout_drain()
instead of callout_stop() (which is really a much more widespread
issue).
Discussion: this panic (or waning) only occurs when the kernel is
compiled with INVARIANTS. Otherwise the problem (which means that
the vp->v_data field isn't NULL, and represents a coding error and
possibly a memory leak) is silently ignored by setting it to NULL
later on.
Panicking here isn't very helpful: by this time, we can only find
the symptoms. The panic occurs long after the reason for "not
cleaning" has been forgotten; in the case in point, it was the
result of severe file system corruption which left the v_type field
set to VBAD. That issue will be addressed by a separate commit.
all other threads to suicide, problem is execve() could be failed, and
a failed execve() would change threaded process to unthreaded, this side
effect is unexpected.
The new code introduces a new single threading mode SINGLE_BOUNDARY, in
the mode, all threads should suspend themself at user boundary except
the singler. we can not use SINGLE_NO_EXIT because we want to start from
a clean state if execve() is successful, suspending other threads at unknown
point and later resuming them from there and forcing them to exit at user
boundary may cause the process to start from a dirty state. If execve() is
successful, current thread upgrades to SINGLE_EXIT mode and forces other
threads to suicide at user boundary, otherwise, other threads will be resumed
and their interrupted syscall will be restarted.
Reviewed by: julian
table. acpidump(8) concatenates the body of the DSDT and SSDTs so an
edited ASL will contain all the necessary information. We can't use a
completely empty table since ACPI-CA reports this as a problem.
MFC after: 3 days
This really doesn't belong here but is preferred (for the moment) over
adding yet another mechanism for sending msgs from the kernel to user apps.
Reviewed by: imp
the raw values including for child process statistics and only compute the
system and user timevals on demand.
- Fix the various kern_wait() syscall wrappers to only pass in a rusage
pointer if they are going to use the result.
- Add a kern_getrusage() function for the ABI syscalls to use so that they
don't have to play stackgap games to call getrusage().
- Fix the svr4_sys_times() syscall to just call calcru() to calculate the
times it needs rather than calling getrusage() twice with associated
stackgap, etc.
- Add a new rusage_ext structure to store raw time stats such as tick counts
for user, system, and interrupt time as well as a bintime of the total
runtime. A new p_rux field in struct proc replaces the same inline fields
from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime). A new p_crux
field in struct proc contains the "raw" child time usage statistics.
ruadd() has been changed to handle adding the associated rusage_ext
structures as well as the values in rusage. Effectively, the values in
rusage_ext replace the ru_utime and ru_stime values in struct rusage. These
two fields in struct rusage are no longer used in the kernel.
- calcru() has been split into a static worker function calcru1() that
calculates appropriate timevals for user and system time as well as updating
the rux_[isu]u fields of a passed in rusage_ext structure. calcru() uses a
copy of the process' p_rux structure to compute the timevals after updating
the runtime appropriately if any of the threads in that process are
currently executing. It also now only locks sched_lock internally while
doing the rux_runtime fixup. calcru() now only requires the caller to
hold the proc lock and calcru1() only requires the proc lock internally.
calcru() also no longer allows callers to ask for an interrupt timeval
since none of them actually did.
- calcru() now correctly handles threads executing on other CPUs.
- A new calccru() function computes the child system and user timevals by
calling calcru1() on p_crux. Note that this means that any code that wants
child times must now call this function rather than reading from p_cru
directly. This function also requires the proc lock.
- This finishes the locking for rusage and friends so some of the Giant locks
in exit1() and kern_wait() are now gone.
- The locking in ttyinfo() has been tweaked so that a shared lock of the
proctree lock is used to protect the process group rather than the process
group lock. By holding this lock until the end of the function we now
ensure that the process/thread that we pick to dump info about will no
longer vanish while we are trying to output its info to the console.
Submitted by: bde (mostly)
MFC after: 1 month
to control the packets injected while in sack recovery (for both
retransmissions and new data).
- Cleanups to the sack codepaths in tcp_output.c and tcp_sack.c.
- Add a new sysctl (net.inet.tcp.sack.initburst) that controls the
number of sack retransmissions done upon initiation of sack recovery.
Submitted by: Mohan Srinivasan <mohans@yahoo-inc.com>
turnstile chain lock until after making all the awakened threads
runnable. First, this fixes a priority inversion race. Second, this
attempts to finish waking up all of the threads waiting on a turnstile
before doing a preemption.
Reviewed by: Stephan Uphoff (who found the priority inversion race)
+ * 9: 0x3f0-0x3f3,0x3f4-0x3f5,0x3f7
This requires only one change to support. Rather than keying on the
size of the resource being 2, instead key off the end & 7 being 3.
This covers the same cases that the size of 2 would catch, but also
covers the new above case.
In addition, I think it is clearer to use the end in preference to the
size and start for case #8 as well. Turns two tests into one, and
catches no other cases.
Make minor commentary changes to deal with new case #9.
# This change is specifically minimal to allow easy MFC. A more
# extensive change will go into current once I've had a chance to test
# it on a lot of hardware...
with different file systems. This may cause ill things
with my previous fix. Now it translate fsid of direct child of
mount point directory only.
Pointed out by: Uwe Doering
that is no longer required. (In fact, it is not clear that it was ever
required in HEAD or RELENG_4, only RELENG_3 required a work-around.) Now,
as before revision 1.251, if the preexisting PTE is invalid, pmap_enter()
does not call pmap_invalidate_page() to update the TLB(s).
Note: Even with this change, the handling of a copy-on-write fault is
inefficient, in such cases pmap_enter() calls pmap_invalidate_page() twice.
Discussed with: bde@
PR: kern/16568
so would cause kernel to produce an unkillable process in some cases,
especially, P_STOPPED_SINGLE has a singling thread, turning off the
bit would mess the state.
need to mask off the page offset bits. (This operation made some sense
prior to i386/i386/pmap.c revision 1.254 when we passed a physical address
rather than a vm_page pointer to pmap_enter().)
Discussed extensively with KAME. The API author's intent isn't clear at this
point, so rather than remove the code entirely, #if 0 out and put a big
comment in for now. The IPV6_RECVPATHMTU sockopt is available if the
application wants to be notified of the path MTU to optimize packet sizes.
Thanks to JINMEI Tatuya <jinmei@isl.rdc.toshiba.co.jp> for putting up
with my incessant badgering on this issue, and fenner for pointing out
the API issue and suggesting solutions.
data endpoints. The control endpoint doesn't need read/write/poll
operations, and more importantly, the thread counts should be
separate so that the control endpoint can properly reference itself
while deleting and recreating the data endpoints.
* Add some macros that handle referencing/releasing devices, and use them
for sleeping/woken-up and open/close operations as apppropriate.
* Use d_purge for FreeBSD, and a loop testing the open status for all
the endpoints for NetBSD and OpenBSD, so that when the device is
detached, the right thing always happens.
restart the current waiting transfer. If this isn't done, the device's
next transfer (that we would like to do a short read on) is going to
return an error -- for short transfer.
* For bulk transfer endpoints, restore the maximum transfer length each
time a transfer is done, or the first short transfer will make all the
rest that size or smaller.
* Remove impossibilities (malloc(M_WAITOK) == NULL, &var == NULL).
New device names are "{tty|cua}A$(card)$(port)[.init|.lock]"
Put a portname in the port structure if SI_DEBUG is defined to avoid
need to inspect minor number to construct name..
Constify some strings.
Remove duplicated DBG_ #defines.
uses predate the change in the pmap_enter() interface that replaced the
page's physical address by the address of its vm_page structure. The
PHYS_TO_VM_PAGE() was being used to compute the address of the same vm_page
structure that was being passed in.
is a no-op on little endian architectures, but fixes getting the MAC
address for some dc(4) cards on big endian architectures.
This is a RELENG_5 candidate.
Tested by: gallatin (powerpc), marius (sparc64)
First version of the patch written by: gallatin
1. Process p1 is currently being swapped in.
2. Process p2 calls linux_ptrace(PTRACE_GETFPXREGS, p1_pid, ...)
3. After acquiring a reference to FIRST_THREAD_IN_PROC(p1),
p2 blocks in faultin() while p1 finishes being swapped in.
This means p2 won't get back the lock on p1 until after p1's
threads are runnable.
4. After p1 is swapped in, the first thread in p1 exits.
5. p2 now uses its dangling reference to p1's first thread.
after loading such a kernel, "module_path" will be set to
an insane value. Fixed example by providing an equivalent
setting. For the record, when automatically loading a
kernel (commands "boot" and "boot-conf"), the following is
tried, in this order:
path=/boot/${kernel} file=${bootfile}
path=/boot/${kernel} file=${kernel}
path=${kernel} file=${bootfile}
path=${kernel} file=${kernel}
path=${module_path} file=${kernel}
MP machines (hopefully). CPU timers are OK on UP machines but we
don't keep the timers in sync on MP machines so if the CPU's timer
is chosen as the primary timecounter it's possible for time to
not be monotonically increasing because different CPU's counters
may be used at different times. But the CPU's counters are otherwise
one of the higher quality counters available. So, on UP machines
we'll use a relatively high quality value but on MP machines we'll
use a quality that should prevent the CPU's counters from being chosen.
Requested by: green (who did the first version of the patch)
Reviewed by: marius, green
MFC after: 1 week
the structure space had been obtained from malloc() its contents is
random garbage. The choice of value being set is part of a larger effort
to solve some timecounter issues on MP machines (while working on that
we noticed this problem).
Noticed by: marius
Reviewed by: marius, green
MFC after: 3 days
the new subr_unit.c code.
For now assert Giant in ttycreate() and ttyfree(). It is not obvious that
it will ever pay off to lock these with anything else.
Allocation is always lowest free unit number.
A mixed range/bitmap strategy for maximum memory efficiency. In
the typical case where no unit numbers are freed total memory usage
is 56 bytes on i386.
malloc is called M_WAITOK but no locking is provided (yet). A bit of
experience will be necessary to determine the best strategy. Hopefully
a "caller provides locking" strategy can be maintained, but that may
require use of M_NOWAIT allocation and failure handling.
A userland test driver is included.
date: 2004/09/26 02:01:27; author: sam; state: Exp; lines: +0 -5
Correct handling of SADB_UPDATE and SADB_ADD requests. key_align may
split the mbuf due to use of m_pulldown. Discarding the result because
of this does not make sense as no subsequent code depends on the entire
msg being linearized (only the individual pieces). It's likely
something else is wrong here but for now this appears to get things back
to a working state.
Submitted by: Roselyn Lee
This change was also made in the KAME CVS repository as key.c:1.337 by
itojun.
people have reported problems (stickyness, aiming difficulty) which is proving
difficult to fix, so this will default to disable until sometime after 5.3R.
To enable Synaptics support, set the 'hw.psm.synaptics_support=1' tunable.
MT5 candidate.
Approved by: njl
longer than 'normal'. The cause is still being tracked down but
in the meantime there are machines where raising IPI_RETRIES does
help - it's not just a case of the machine staying locked up longer
and then panic-ing anyway. Several helpful folks on sparc64@ tried
a patch that helped figure out what to raise this number to.
Discussed on: sparc64@
MFC after: 3 days
pmap_copy(). This entails additional locking in pmap_copy() and the
addition of a "flags" parameter to the page table page allocator for
specifying whether it may sleep when memory is unavailable. (Already,
pmap_copy() checks the availability of memory, aborting if it is scarce.
In theory, another CPU could, however, allocate memory between
pmap_copy()'s check and the call to the page table page allocator,
causing the current thread to release its locks and sleep. This change
makes this scenario impossible.)
Reviewed by: tegge@
in the actual _FDE parsing. If the failure occurs earlier such as in
fdc_attach() then don't try to probe any drives.
MFC after: 3 days
Reviewed by: njl
Tested by: Christian Laursen xi at borderworlds dot dk
to make sure the pipe is ready. Some devices apparently don't support
the clear stall command however. So what happens when you issue such
devices a clear stall command? Typically, the command just times out.
This, at least, is the behavior I've observed with two devices that
I own: a Rio600 mp3 player and a T-Mobile Sidekick II.
It used to be that after the timeout expired, the pipe open operation
would conclude and you could still access the device, with the only
negative effect being a long delay on open. But in the recent past,
someone added code to make the timeout a fatal error, thereby breaking
the ability to communicate with these devices in any way.
I don't know exactly what the right solution is for this problem:
presumeably there is some way to determine whether or not a device
supports the 'clear stall' command beyond just issuing one and waiting
to see if it times out, but I don't know what that is. So for now,
I've added a special case to the error checking code so that the
timeout is once again non-fatal, thereby letting me use my two
devices again.
passing along socket information. This is required to work around a LOR with
the socket code which results in an easy reproducible hard lockup with
debug.mpsafenet=1. This commit does *not* fix the LOR, but enables us to do
so later. The missing piece is to turn the filter locking into a leaf lock
and will follow in a seperate (later) commit.
This will hopefully be MT5'ed in order to fix the problem for RELENG_5 in
forseeable future.
Suggested by: rwatson
A lot of work by: csjp (he'd be even more helpful w/o mentor-reviews ;)
Reviewed by: rwatson, csjp
Tested by: -pf, -ipfw, LINT, csjp and myself
MFC after: 3 days
LOR IDs: 14 - 17 (not fixed yet)
This changes the naming of USB serial devices to: /dev/ttyU%d and
/dev/cuaU%d for call-in and call-out devices respectively. (Please
notice: capital 'U')
Please also note that we now have .init and .lock devices for USB
serial ports. These are not persistent across device removal. devd(8)
can be used to configure them on attachment time.
These changes also improve the chances of the system surviving if
the USB device is unplugged at an inconvenient time. At least we
do not rip things apart while there are any threads in the device
driver anymore.
Remove cdevsw, rely on the tty generic one.
Don't make_dev(), use ttycreate() which does all the magic.
In detach, do close procesing if we ripped things apart
while the device was open. Call ttyfree() once we're done
cleaning up.
generic way. This code will allow a similar amount of code to be
removed from most if not all serial port drivers.
Add generic cdevsw for tty devices.
Add generic slave cdevsw for init/lock devices.
Add ttypurge function which wakes up all know generic sleep
points in the tty code, and calls into the hw-driver if it
provides a method.
Add ttycreate function which creates tty device and optionally
cua device. In both cases .init/.lock devices are created
as well.
Change ttygone() slightly to also call the hw driver provided
purge routine.
Add ttyfree() which will purge and destroy the cdevs.
Add ttyconsole mode for setting console friendly termios
on a port.
is one, detect mbuf loops and stop, add an extra arg so you can only print
the first x bytes of the data per mbuf (print all if arg is -1), print
flags using %b (bitmask)...
No code in the tree appears to use m_print, and it's just a maner of adding
-1 as an additional arg to m_print to restore original behavior..
MFC after: 4 days
select(2), and discovered to my horror that ugen(4)'s bulk in/out support
is horribly lobotomized. Bulk transfers are done using the synchronous
API instead of the asynchronous one. This causes the following broken
behavior to occur:
- You open the bulk in/out ugen device and get a descriptor
- You create some other descriptor (socket, other device, etc...)
- You select on both the descriptors waiting until either one has
data ready to read
- Because of ugen's brokenness, you block in usb_bulk_transfer() inside
ugen_do_read() instead of blocking in select()
- The non-USB descriptor becomes ready for reading, but you remain blocked
on select()
- The USB descriptor becomes ready for reading
- Only now are you woken up so that you can ready data from either
descriptor.
The result is select() can only wake up when there's USB data pending. If
any other descriptor becomes ready, you lose: until the USB descriptor
becomes ready, you stay asleep.
The correct approach is to use async bulk transfers, so I changed
the read code to use the async bulk transfer API. I left the write
side alone for now since it's less of an issue.
Note that the uscanner driver has the same brokenness in it.
to 7422 since it appears that the 8169S can't transmit anything larger..
The 8169S can receive full jumbo frames, but we don't have an mru to let
the upper layers know this...
add fixup so that this driver should work on alignment constrained platforms
(!i386 && !amd64)
MFC after: 5 days
MAXWIN to the register window manipulation functions - rwindow_load()
calls rwindow_save() so this one addition should take care of both.
This should help find places that pcb_nsaved doesn't get initialized
properly.
Suggested by: jake
but it is possible:
1. Read data from good component for synchronization.
2. Write data to the same area.
3. Write synchronization data, which are now stale.
Found by: tegge (for gmirror)
on anything but DEVFS and in this case it was not even used (see below).
Put the NFS4 vop method for fifo's behind "#if 0" because it is unused.
Add a XXX comment to say that I think the unusedness is a bug.
for the new thread. The rest of the fields in the pcb wind up being
written to before they're read as a normal part of the pcb usage but
this field may be read upon return to userland, having it be uninitialized
garbage is bad.
Submitted by: Andrew Belashov (bel at orel dot ru)
Reviewed by: jake
MFC after: 3 days
but it is possible:
1. Read data from good component for synchronization.
2. Write data to the same area.
3. Write synchronization data, which are now stale.
Found by: tegge
mutexes instead.
This closes the last (known) race issues in ATA which should fix
the various hangs etc seen on heavy loaded systems.
Change from using timeout functions to using callout functions in
the timeout code. This together with above closes the race that could
happen if timeout and device interrupt occured simultaniously.
Also fix the possible recursion in ata_reinit() on very dodgy
devices that could take us down in the probe.
the trapframe via kdb_frame, but kdb_frame was not initialized until
after the call to kdb_cpu_trap(). Ergo: kdb_cpu_trap() was moved too
far up.
Pointy hat: marcel
the mbuf due to use of m_pulldown. Discarding the result because of this
does not make sense as no subsequent code depends on the entire msg being
linearized (only the individual pieces). It's likely something else is wrong
here but for now this appears to get things back to a working state.
Submitted by: Roselyn Lee
I have from Broadcom does not give much information on these devices,
so the Broadcom Linux driver was used for clues to what these chips
support. It turns out they are similar to the 5705 with the 5751
being the PCI-Express version and needing special work-arounds and
settings.