extra entry for if_ndis_pci.c that depends on cardbus, just to cover
all the bases. (I don't think you can have cardbus without PCI, but
just in case...)
- Add gre_mtx to protect global softc list.
- Hold gre_mtx over various list operations (insert, delete).
- Centralize if_gre interface teardown in gre_destroy(), and call this
from modevent unload and gre_clone_destroy().
- Export gre_mtx to ip_gre.c, which walks the gre list to look up gre
interfaces during encapsulation. Add a wonking comment on how we need
some sort of drain/reference count mechanism to keep gre references
alive while in use and simultaneous destroy.
This commit does not lockdown softc data, which follows in a future
commit.
- Add gif_mtx, which protects globals.
- Hold gif_mtx around manipulation of gif_softc_list.
- Abstract gif destruction code into gif_destroy(), which tears down
a softc after it's been removed from the global list by either module
unload or clone destroy.
- Lock gif_called, even though we know gif_called is broken with reentrant
network processing.
- Document an event ordering problem in gif_set_tunnel() that will need
to be fixed.
gif_softc fields not locked down in this commit.
processing with gif interfaces, to a global variable named "gif_called".
Add an annotation that this approach will not work with a reentrant
network stack, and that we should instead use packet tags to detect
excessive recursive processing.
implementation could be characterized as a hybrid of the amd64 and i386
implementations. Specifically, the direct virtual-to-physical mapping is
used if possible and sf_buf_alloc() is used if the direct map cannot.
to other files in netatalk:
Log:
Since I have my hands all over netatalk adding locking and restructuring
it, cinch the file's style closer to style(9) with regard to parenthesis:
s/( /(/g
s/ )/)/g
s/return(/return (/g
s/return 0/return (0)/
s/return 1/return (1)/
when it associates with a net. Because FreeBSD's kstack size is only
2 pages by default, this blows the stack and causes a double fault.
To deal with this, we now create all our kthreads with 8 stack pages.
Also, we now run all timer callouts in the ndis swi thread (since
they would otherwise run in the clock ithread, whose stack is too
small). It happens that the alloca() in this case was occuring within
the interrupt handler, which was already running in the ndis swi
thread, but I want to deal with the callouts too just to be extra
safe.
NOTE: this will only work if you update vm_machdep.c with the change
I just committed. If you don't include this fix, setting the number
of stack pages with kthread_create() has essentially no effect.
with more than the normal amount of stack pages, however the stack
pointer always wound up being initialized using KSTACK_PAGES. It
should be using td->td_kstack_pages instead. This means that although
the vm subsystem would give you all the stack pages you asked for,
%esp would always be initialized as if you had just 2 pages, and
the rest would go to waste.
I wanted to use the 'give me more stack pages' feature of kthread_create()
because the Intel 2200BG NDIS driver does an alloca() of about 5000 bytes,
which wrecks the stack with the default 2 page size, and I was baffled
that no matter how much code I shoved into thread contexts with
allegedly larger stacks, the thing would still crash unless I changed
KSTACK_PAGES.
Note: this bug is present in _ALL_ arches at this point. Peter has
promised to merge this fix into all of them.
instead of bus_alloc_resource_any() to restore source compatibility
with 5.2-REL and 5.2.1-REL systems. bus_alloc_resource_any() doesn't
really do anything besides hide some of bus_alloc_resource()'s arguments
from us, and in my opinion this isn't worth breaking backwards
compatibility for people who want to use the NDISulator code on 5.2.x.
activation (i.e., applications are using libpthread). This is because
SCHED_ULE sometimes puts P_SA processes into ksq_next unnecessarily.
Which doesn't give fair amount of CPU time to processes which are
using scheduler-activation-based threads when other (semi-)CPU-intensive,
non-P_SA processes are running.
Further work will no doubt be done by jeffr at a later date.
Submitted by: Taku YAMAMOTO <taku@cent.saitama-u.ac.jp>
Reviewed by: rwatson, freebsd-current@
distinguish between debugger inserted breakpoints and fixed
breakpoints. While here, make sure the break instruction never
ends up in the last slot of a bundle by forcing it to be an
M-unit instruction. This makes it easier for use to skip over
it.
are actually layered on top of the KeTimer API in subr_ntoskrnl.c, just
as it is in Windows. This reduces code duplication and more closely
imitates the way things are done in Windows.
- Modify ndis_encode_parm() to deal with the case where we have
a registry key expressed as a hex value ("0x1") which is being
read via NdisReadConfiguration() as an int. Previously, we tried
to decode things like "0x1" with strtol() using a base of 10, which
would always yield 0. This is what was causing problems with the
Intel 2200BG Centrino 802.11g driver: the .inf file that comes
with it has a key called RadioEnable with a value of 0x1. We
incorrectly decoded this value to '0' when it was queried, hence
the driver thought we wanted the radio turned off.
- In if_ndis.c, most drivers don't accept NDIS_80211_AUTHMODE_AUTO,
but NDIS_80211_AUTHMODE_SHARED may not be right in some cases,
so for now always use NDIS_80211_AUTHMODE_OPEN.
NOTE: There is still one problem with the Intel 2200BG driver: it
happens that the kernel stack in Windows is larger than the kernel
stack in FreeBSD. The 2200BG driver sometimes eats up more than 2
pages of stack space, which can lead to a double fault panic.
For the moment, I got things to work by adding the following to
my kernel config file:
options KSTACK_PAGES=8
I'm pretty sure 8 is too big; I just picked this value out of a hat
as a test, and it happened to work, so I left it. 4 pages might be
enough. Unfortunately, I don't think you can dynamically give a
thread a larger stack, so I'm not sure how to handle this short of
putting a note in the man page about it and dealing with the flood
of mail from people who never read man pages.
only done minimal testing on one of these cards and the firmware folks
have been extremely uncooperative in answering my qeustions about them, so
hopefully they will work ok for everyone.
This completes the effort to handle dependent functions, which are used
in some machines for irq link resources. Also, clean up some nearby
comments while I'm at it.
level of abstraction for any and all CPU mask and CPU bitmap variables
so that platforms have the ability to break free from the hard limit
of 32 CPUs, simply because we don't have more bits in an u_int. Note
that the type is not supposed to solve massive parallelism, where
the number of CPUs can be larger than the width of the widest integral
type. As such, cpumask_t is not supposed to be a compound type. If
such would be necessary in the future, we can deal with the issues
then and there. For now, it can be assumed that the type is integral
and unsigned.
With this commit, all MD definitions start off as u_int. This allows
us to phase-in cpumask_t at our leasure without breaking anything.
Once cpumask_t is used consistently, platforms can switch to wider
(or smaller) types if such would be beneficial (or not; whatever :-)
Compile-tested on: i386
This change has not been tested.
This change was triggered by a gcc(1) warning on ia64 at -O2. The
variable v was not used after being computed, which resulted in enough
dead code elimination (DCE) to confuse the compiler and emit a bogus
warning about the use of the variable i without prior definition. The
variable i is the loop variable.
Submitted by: des
Responsibility: marcel
for uart(4) to figure out which device to use as console. Use this file
to define hw.uart.console instead so that we don't have to put it in
the default loader.conf, which makes it hard to override.
to select a serial console and debug port (resp). On ia64 these replace
the use of hints completely and take precedence over hints on alpha,
amd64 and i386. On sparc64 these variables are not yet recognised.
The reasons for introducing these variables are:
1. Hints have side-effects. They reserve the unit number for use by
isa or acpi devices and therefore cannot be used to select a pci
device. Also, the use of a unit number to select a device prior
to bus enumeration is nonsense. The new variables have no side-
effects and are not based on unit numbers.
2. Hints don't have the expression power to allow the sysadmin to
select UARTs that are not legacy PC devices and need the support
of compile-time constants to give the sysadmin some level of
flexibility.
The hw.uart.console and hw.uart.dbgport variables specify a list of
attributes. An attribute is a tag-value pair, seperated by a colon.
Attributes are seperated by a comma. Where possible, tags are the
same as those in /etc/remote (only br and pa in practice). Details
can be found in the manpage (not part of this commit).
Not tested on: amd64, pc98
from ddp_usrreq.c. Functions moved are:
at_pcballoc()
at_pcbconnect()
at_pcbdetach()
at_pcbdisconnect()
at_pcbsetaddr()
at_sockaddr()
Also moved are ddp_ports and ddpcb, global variables associated with DDP
pcbs. This makes PCB implementation more parallel to inet, inet6, and
ipx.
device, the device is probed multiple times (so each device is
detected N times after unloading/loading the module N-1 times).
The real fix is (quote Doug and Warner):
> : In an ideal world, there should be some kind of BUS_UNIDENTIFY method
> : which a driver could use to delete the devices it created in
> : BUS_IDENTIFY.
>
> Or the bus would have a driver deleted routine that got called and it
> would remove all instances of the devclass attached to it.
Reviewed by: Doug Rabson & Warner Losh
to mmap it PROT_EXEC. This also depends on the architecture, as some
architextures (e.g. i386) do not distinguish between read and exec pages
Inspired by: http://linux.bkbits.net:8080/linux-2.4/cset@1.1267.1.85
Reviewed by: alc
mappings required by mdstart_swap(). On i386, if the ephemeral mapping
is already in the sf_buf mapping cache, a swap-backed md performs
similarly to a malloc-backed md. Even if the ephemeral mapping is not
cached, this implementation is still faster. On 64-bit platforms, this
change has the effect of using the direct virtual-to-physical mapping,
avoiding ephemeral mapping overheads, such as TLB shootdowns on SMPs.
On a 2.4GHz, 400MHz FSB P4 Xeon configured with 64K sf_bufs and
"mdmfs -S -o async -s 128m md /mnt"
before:
dd if=/dev/md0 of=/dev/null bs=64k
134217728 bytes transferred in 0.430923 secs (311465697 bytes/sec)
after with cold sf_buf cache:
dd if=/dev/md0 of=/dev/null bs=64k
134217728 bytes transferred in 0.367948 secs (364773576 bytes/sec)
after with warm sf_buf cache:
dd if=/dev/md0 of=/dev/null bs=64k
134217728 bytes transferred in 0.252826 secs (530870010 bytes/sec)
malloc-backed md:
dd if=/dev/md0 of=/dev/null bs=64k
134217728 bytes transferred in 0.253126 secs (530240978 bytes/sec)
I've added -fno-strict-aliasing for now so we can ease into this.
I wanted to shoot for -O3, but the inlining caused problems due to GCC's
size heuristics; so also add -frename-registers, which is one of the things
-O3 would have given us.
entry size and the ELF version. Also, avoid a potential integer
overflow when determining whether the ELF header fits entirely
within the first page.
Reviewed by: jdp
A panic when attempting to execute an ELF binary with a bogus program
header table entry size was
Reported by: Christer Öberg <christer.oberg@texonet.com>
the tap driver, even with Giant over the cdev operation vector, due to
a non-atomic test-and-set of the si_drv1 field in the dev_t. This bug
exists with Giant under high memory pressure, as malloc() may sleep
in tapcreate(), but is less likely to occur. The resolution will
probably be to cover si_drv1 using the global tapmtx since no softc is
available, but I need to think about this problem more generally
across a range of drivers using si_drv1 in combination with SI_CHEAPCLONE
to defer expensive allocation to open().
Correct what appears to be a bug in the original if_tap implementation,
in which tapopen() will panic if a tap device instance is opened more
than once due to an incorrect assertion -- only triggered if INVARIANTS
is compiled in (i.e., when built into a kernel). Return EBUSY instead.
Expand mtx_lock() coverage using tp->tap_mtx to include tp->ether_addr.
use sf_buf_free() instead of sf_buf_mext() to consolidate all actions
that require the page queues lock in one critical section. While I'm
here remove unnecessary splvm() and splx() calls.
clip/destroy the dB value contained in the wi(4)'s receive frames,
it doesn't match with the flag set in the radiotap header
(unperturbed dB versus dBm).
Also set HOOK_HACK to true (remove the related #ifdef's) as we have the
hooks in the kernel this was missed during the merge from the port.
Noticed by: Amir S. (for the HOOK_HACK part)
Approved by: bms(mentor)
options, status pointer and rusage pointer as arguments. It is up to
the caller to copyout the status and rusage to userland if needed. This
lets us axe the 'compat' argument and hide all that functionality in
owait(), by the way. This also cleans up some locking in kern_wait()
since it no longer has to drop locks around copyout() since all the
copyout()'s are deferred.
- Convert owait(), wait4(), and the various ABI compat wait() syscalls to
use kern_wait() rather than wait1() or wait4(). This removes a bit
more stackgap usage.
Tested on: i386
Compiled on: i386, alpha, amd64
Without this fix it is possible to cheat policies like:
- sysctl security.bsd.see_other_[gu]ids=0,
- mac_seeotheruids(4),
- jail(2)
and get full processes list with their arguments.
This problem exists from revision 1.62 of kern_proc.c when it was
introduced.
Reviewed by: nectar, rwatson.
as the process that opens tun_softc can exit before the file
descriptor is closed.
Taiwan experience provided by: keichii
Crashing breakers provided by: Chia-liang Kao <clkao@clkao.org>
(tap_pid, tap_flags). if_tap should now be entirely MPSAFE.
Committed from: Bamboo house by ocean in Taiwan
Tropical paradise provided by: Chia-liang Kao <clkao@clkao.org>
group block locked. If filesystem has any active snapshots, bawrite
can come back trying to allocate new snapshot data block from the same
cylinder group and cause panic due to recursive lock attempt.
PR: 64206
Reviewed by: mckusick
Tested by: pjd
dependent function by the same name and a machine-independent function,
sf_buf_mext(). Aside from the virtue of making more of the code machine-
independent, this change also makes the interface more logical. Before,
sf_buf_free() did more than simply undo an sf_buf_alloc(); it also
unwired and if necessary freed the page. That is now the purpose of
sf_buf_mext(). Thus, sf_buf_alloc() and sf_buf_free() can now be used
as a general-purpose emphemeral map cache.
might be enqueued on a sleep queue but not be asleep when the timeout fires
if it is blocked on a lock trying to check for pending signals before going
to sleep. In the case of fixing up the TDF_TIMEOUT race, however, the
thread must be marked asleep.
Reported by: kan (the bogus one)
mini-layer. I don't have time to bing it forward into the GEOM world, and
no one else has stepped forward to claim it. It'll be in the Attic for safe
keeping for now.
Use kern_open() to implement creat() rather than taking the long route
through open(). Mark creat as MPSAFE.
While I'm at it, mark nosys() (syscall 0) as MPSAFE, for all the
difference it will make.
set it to avoid the need for a bunch of code that tests whether or
not the lock member is set to REQ_WIRED in order to determine which
length member should be used.
Fix another bug in the oldlen return value code.
Fix a potential wired memory leak if a sysctl handler uses
sysctl_wire_old_buffer() and returns an EAGAIN error to trigger
a retry.
If vslock() returns ENOMEM, sysctl_wire_old_buffer() should set
wiredlen to zero and return zero (success) so that the handler will
operate according to sysctl(3):
The size of the buffer is given by the location specified by
oldlenp before the call, and that location gives the amount
of data copied after a successful call and after a call that
returns with the error code ENOMEM.
The handler will return an ENOMEM error because the zero length
buffer will overflow.
ptrace_set_pc(), and cpu_ptrace() so that those functions are free to
acquire Giant, sleep, etc. We already do a PHOLD/PRELE around them so
that it is safe to sleep inside of these routines if necessary. This
allows ptrace() to be marked MP safe again as it no longer triggers lock
order reversals on Alpha.
Tested by: wilko
snprintf() and vsnprintf() in FreeBSD kernel land).
This is needed by the Intel Centrino 2200BG driver. Unfortunately, this
driver still doesn't work right with Project Evil even with this tweak,
but I'm unable to diagnose the problem since I don't have access to a
sample card.
This adds support for cardbus ATA/SATA controllers. I get roughly the
same transfer speeds as on true PCI controllers. Nice to be able to add
a couble of "real" disks to a laptop :)
add tapmtx, which protects globale variables.
Notes:
- The EBUSY check in MOD_UNLOAD may be subject to a race. Moving the
event handler unregister inside the mutex grab may prevent that race.
- Locking of global variables safely is now possible because tapclones
is only modified when the module is loading or unloading, thanks to
phk's recent chang to clone_setup().
- softc locking to follow.
255; USB keychains exist that use 256 as the number of heads. This
check has also been removed in Darwin (along with most of the other
head/sector sanity checks).
Only cy, bs and wd in the tree still use it. I have a replacement for
cy that I need to test on ISA and PCI cards. bs and wd are pc98 only
drivers that appear to no longer be necessary. I'll be removing them
when I hear back from the pc98 people.