code. Both the driver and the new code were wrong. Driver interrupt
handlers are supposed to take "void *vsc" arg, but some including all
COMPAT_ISA drivers and the pci part of the cy driver want an "int unit"
arg. They got this using bogus casts of function pointers which should
have kept working despite their bogusness. However, the new interrupt
code doesn't honor requests to pass an arg of ((void *)0), so things
are very broken if the arg is actually a representation of unit 0.
The fix is to use a normal "void *vsc" arg for the pci case and a
wrapper for the COMPAT_ISA case (of the cy driver). This cleans up
new-busification of the pci case but takes the COMPAT_ISA case a little
further from new-bus. The corresponding bug for the COMPAT_ISA case
has already been fixed similarly using a wrapper in compat_isa.c and
we need another wrapper just to undo that.
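A minimal sketch of the wrapper idea described above, assuming a legacy handler that wants an "int unit". The names unit_wrap, unit_wrapper and legacy_intr are hypothetical and are not taken from the cy driver. The registered argument is a pointer to a small record, so it is never ((void *)0) even for unit 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical legacy handler that wants a unit number, not a softc. */
static int last_unit = -1;
static void legacy_intr(int unit) { last_unit = unit; }

/* Normal handler shape: a single opaque pointer argument. */
typedef void intr_handler_t(void *arg);

/* Per-registration record; its address is non-NULL even for unit 0. */
struct unit_wrap {
    void (*func)(int unit);
    int unit;
};

/* Wrapper with the normal signature that forwards the stored unit. */
static void unit_wrapper(void *arg)
{
    struct unit_wrap *w = arg;

    w->func(w->unit);
}
```

Registering unit_wrapper with the address of a unit_wrap record avoids both the function pointer cast and the ambiguity of a null argument.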
Fixed some directly related style bugs (mainly by removing compatibility
cruft).
cy.c:
Fixed an indirectly related old bug in cyattach_common(). A wrong status
was returned in the unlikely event that malloc() failed.
Approved by: re (scottl)
issues which they found and asked to be changed so 3ware can officially
support the driver.
Summary of the most significant changes:
- TWE_OVERRIDE is no longer supported
- If twe_getparam failed, bogus data would be returned to the caller
- Cache the device unit in the twe_drive structure to aid debugging
- Add the 3ware driver version.
- Proper return error codes for many functions.
- Track the minimum queue length statistics
- 4.x compat: use the cached unit number from the twe_drive structure
instead of the cached si_drv2. 3ware found that after many loads
and unloads that si_drv2 became corrupted. This did not happen in
-current.
Submitted by: Vinod Kashyap (with modifications by me)
Approved by: re (rwatson)
o Back out workaround for not resetting lucent cards more than once. With
these fixes, it appears they are no longer necessary.
o Set wi_gone when the card goes awol: typically when we get 0xffff back from
the card. Also, don't interact with a card that's gone, so we fail in
seconds rather than minutes. Also reduce amount of time we wait to .5s
in wi_cmd.
o clear wi_gone on ifconfig down to give some cards a chance after they wedge
(this appears to unwedge one of my prism cards with old firmware). ifconfig
up will fail quickly enough if the card really is out to lunch.
o Add delay in wi_init of 100ms.
o wi_stop(ifp, 0->1) changes so that we clear sc_enabled so that we
exit out of the interrupt routine by just acking the interrupt
Submitted by: iedowse
Approved by: re@ (scottl)
# after the freeze I'll fix some of the minor style issues that reviewers
# of this patch have told me about.
code is compiled in to support the O_IPSEC operator. Previously no
support was included and ipsec rules were always matching. Note that
we do not return an error when an ipsec rule is added and the kernel
does not have IPsec support compiled in; this is done intentionally
but we may want to revisit this (document this in the man page).
PR: 58899
Submitted by: Bjoern A. Zeeb
Approved by: re (rwatson)
to sendfile(2) being erroneously automatically restarted after a signal
is delivered. Fixed by converting ERESTART to EINTR prior to exiting.
Updated manual page to indicate the potential EINTR error, its cause
and consequences.
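The conversion can be sketched as follows. ERESTART is a kernel-internal code, so the value used here (SKETCH_ERESTART) and the function name are placeholders for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>

/* Placeholder for the kernel-internal "restart the syscall" code. */
#define SKETCH_ERESTART (-1)

/*
 * Called on the way out of the system call: a request to restart is
 * turned into EINTR so the caller sees the interruption instead of
 * having the call silently re-issued.
 */
static int sketch_syscall_exit(int error)
{
    if (error == SKETCH_ERESTART)
        error = EINTR;
    return error;
}
```

Other error values, including success, pass through unchanged.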
Approved by: re@freebsd.org
using critical_enter() and critical_exit() to attempt to replace spl*()
calls. The critical section was calling selrecord(), which locks an
MTX_DEF mutex, which is not legal in a critical section.
Tested by: Stefan Ehmann <shoesoft@gmx.net> and "make universe"
Approved by: re (scottl)
forced unmount case. Otherwise, a file system that is referenced
only by process fd_cdir/fd_rdir references to the file system root
vnode will be successfully unmounted without the MNT_FORCE flag.
The previous behaviour was not compatible with the unmount semantics
required by amd(8), so file systems could be unexpectedly unmounted
while there were still references to the file system root directory.
Reported by: Erez Zadok <ezk@cs.sunysb.edu>
Approved by: re (scottl)
The altio resource magic no longer worked probably due to other changes
in the kernel. Redo that part so it also fits better into ATAng.
Fix detach so it doesn't panic the system when a pccard device is
yanked.
Approved by: re@
was equal to MAXCPU, we would overrun the pcpu_mtx array because maxcpu
was calculated incorrectly.
- Add some more debugging code so that memory leaks at the time of
uma_zdestroy() are more easily diagnosed.
Approved by: re (rwatson)
o fix race condition when processing rx descriptors: because we use
a self-linked descriptor at the end of the rx descriptor list to
avoid rx overruns (which can easily happen for 5212 parts that enable
PHY errors) we must carefully check that a descriptor is "done" by
looking ahead to the next descriptor before believing the done bit
in the current descriptor (this is all handled in the HAL since the
rx descriptor format is chip-specific so we need to pass in two
additional parameters--the physical address of the current descriptor
and the virtual address of the next descriptor in the list)
o check copyout return status for SIOCGATHSTATS ioctl
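One plausible shape of the look-ahead check for the rx descriptor race, as a sketch only: the HAL's real logic is not shown in this log. The struct layout, the return codes and the hw_rxdp parameter (the descriptor address the hardware is assumed to report as in progress) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptor: real formats are chip-specific. */
struct rx_desc_sketch {
    uint32_t ds_link;   /* physical address of the next descriptor */
    int      ds_done;   /* "done" bit written by the hardware */
};

enum { RX_OK = 0, RX_INPROGRESS = 1 };

/*
 * cur_pa  - physical address of the current descriptor
 * next    - virtual address of the next descriptor in the list
 * hw_rxdp - descriptor address the hardware is working on (assumed)
 */
static int
proc_rx_desc_sketch(const struct rx_desc_sketch *cur, uint32_t cur_pa,
    const struct rx_desc_sketch *next, uint32_t hw_rxdp)
{
    if (!cur->ds_done)
        return RX_INPROGRESS;
    /* The hardware has completed the next one, so cur is final. */
    if (next->ds_done)
        return RX_OK;
    /* Otherwise distrust the done bit while hardware still points at cur. */
    if (hw_rxdp == cur_pa)
        return RX_INPROGRESS;
    return RX_OK;
}
```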
Approved by: re (scottl)
o support for 5112 and 2112 radios on 5212-based products
o revised interface for ah_procRxDesc needed to handle a race
condition created with the use of self-linked rx descriptors
o support for setting the MAC address
o remove some unused methods from the public API
o revised diagnostic API (replace dump* methods with getDiagState)
o const'ify set key cache method parameters
o support for optional 32khz sleep clock
o implement ah_setSlotTime for 5211 parts
o ANI improvements for 5212 parts
Approved by: re (scottl)
mpf are allocated on the stack, which causes this check to falsely trigger.
A new check which takes on-stack mbufs into account will be reintroduced
after 5.2 is out the door.
Approved by: re (watson)
Requested by: many
When the hostcache bucket limit is reached the last bucket wasn't
removed from the bucket row but inserted a few lines later at the
bucket row head again. This leads to infinite loop when the same
bucket row is accessed the next time for a lookup/insert or purge
action.
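A minimal sketch of the corrected recycling step, using the sys/queue.h TAILQ macros. The hc_entry and hc_row names are hypothetical stand-ins for the hostcache structures. The point is that the last entry is unlinked before it is reinserted at the head, so the row never ends up linking an element twice.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-ins for a hostcache entry and bucket row. */
struct hc_entry {
    TAILQ_ENTRY(hc_entry) link;
    unsigned addr;
};
TAILQ_HEAD(hc_row, hc_entry);

/* Reuse the oldest entry: remove it from the tail, then put it at the head. */
static struct hc_entry *hc_recycle_last(struct hc_row *row)
{
    struct hc_entry *e = TAILQ_LAST(row, hc_row);

    if (e == NULL)
        return NULL;
    TAILQ_REMOVE(row, e, link);       /* the step the bug omitted */
    TAILQ_INSERT_HEAD(row, e, link);
    return e;
}

/* Walk the row; with a correctly linked list this always terminates. */
static int hc_row_length(struct hc_row *row)
{
    struct hc_entry *e;
    int n = 0;

    TAILQ_FOREACH(e, row, link)
        n++;
    return n;
}
```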
Tested by: imp, Matt Smith
Approved by: re (rwatson)
this problem put these lines back in. While they should be
unnecessary, they appear to be sometimes necessary.
Reviewed in concept: dfr
Approved by: re (scottl@)
caused crashes, typically during shutdown, because the second free
referenced a mutex that had been destroyed.
Tested by: several
Approved by: re (scottl)
Make it possible to configure GPIO pins as led(4) devices, PPS inputs
and PPS-echo outputs with a sysctl. Led(4) and PPS-echo can be configured
for active-high or active-low.
Be more complete in initialization of timecounter hardware.
Approved by: re@
them working (cache, automatic rebuild and hotswap) the FFDC
info (First Failure Data Capture) on the adapter must be
initialised.
Logical drives in critical/degraded states weren't added to
the drive list. FreeBSD was not able to see a degraded array
after a reboot. Degraded drives are now also added to the drivelist
and the state of the logical drive is given at boottime.
The adapter type is detected from information in nvram page 5
and displayed at boottime.
Change IPS_OS_FREEBSD definition from 10 to 8 according to IBM
specs.
Submitted by: Patrick Guelat <pgfb@imp.ch>
Reviewed by: mbr, scottl
Approved by: re
zeroed. Doing a bzero on the entire struct route is not more
expensive than assigning NULL to ro.ro_rt and bzero of ro.ro_dst.
Reviewed by: sam (mentor)
Approved by: re (scottl)
idx'th present CPU with pc_acpi_id equal to *acpi_id. If *acpi_id
does not match that processor's pc_acpi_id, return the value for
ProcId derived from the MADT in *acpi_id. If pc_acpi_id is 0xffffffff,
always override it with the value of *acpi_id. Finally, return
pc_cpuid in *cpu_id and use that as our primary key.
* Use pc_cpuid as our unique key because we know it is valid since
MD code set it. The values for ProcId in the ASL and MADT don't
match up on some machines (!), forcing us to fall back to ordered
probing in that case.
* Remove some #ifdef SMP since the refcount doesn't hurt performance
and will be needed for dynamic _CST objects. Only one #ifdef SMP
(for smp_rendezvous) remains.
* Hook up SMP in the compile flags in the Makefile.
Tested by: marcel, truckman
Approved by: re (scottl)
uncovering some interesting problems. Be conservative and effectively
disable this by default. Interested parties may still define
KERNBUILDDIR by hand to achieve the same effect.
I plan on reverting this change after 5.2 is released, or sooner if
the issues with building releases are resolved and re@ approves.
Approved by: re@ (scottl, marcel)
transfer descriptors when a large request needs to be split into
more than one 8k chunk. The bug was that the calculation did not
take into account the offset of the chunk within the overall request.
This is reported to fix crashes and data corruption on ohci
controllers.
Submitted by: green
Approved by: re
for ipfw processing w/o an indication the packets were generated
by ipfw--and so should not be processed (this manifested itself
as a LOR.) The flag bit in the mbuf that was used to mark the
packets was not listed in M_COPYFLAGS so if a packet had a header
prepended (as done by IPsec) the flag was lost. Correct this by
defining a new M_PROTO6 flag and use it to mark packets that need
this processing.
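How a flag outside the copy mask gets lost can be sketched like this. The flag values and the SK_ names are hypothetical; the real definitions live in sys/mbuf.h and are not reproduced here.

```c
#include <assert.h>

/* Hypothetical flag values for illustration only. */
#define SK_M_PKTHDR    0x0002
#define SK_M_PROTO6    0x0400   /* listed in the copy mask */
#define SK_M_PRIVATE   0x8000   /* NOT listed in the copy mask */
#define SK_M_COPYFLAGS (SK_M_PKTHDR | SK_M_PROTO6)

struct mbuf_sketch {
    int m_flags;
};

/* Models prepending a header: the new head inherits only masked flags. */
static void
prepend_sketch(struct mbuf_sketch *newhead, const struct mbuf_sketch *old)
{
    newhead->m_flags = old->m_flags & SK_M_COPYFLAGS;
}
```

A marker carried in SK_M_PRIVATE disappears after the prepend, while one carried in SK_M_PROTO6 survives.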
Reviewed by: bms
Approved by: re (rwatson)
MFC after: 2 weeks
rtalloc_ign() in in_pcbconnect_setup() before it is filled out.
Otherwise, stack junk would be left in sin_zero, which could
cause host routes to be ignored because they failed the comparison
in rn_match().
This should fix the wrong source address selection for connect() to
127.0.0.1, among other things.
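A userland sketch of why the zeroing matters, assuming the lookup compares the whole key bytewise. The helper names are hypothetical, and memcmp() only stands in for the comparison done in rn_match().

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Zero the whole structure first so padding and sin_zero hold no junk. */
static void fill_dst_sketch(struct sockaddr_in *sin, in_addr_t addr)
{
    memset(sin, 0, sizeof(*sin));
    sin->sin_family = AF_INET;
    sin->sin_addr.s_addr = addr;
}

/* Stand-in for a bytewise key comparison. */
static int
keys_match_sketch(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

If the structure is filled in without the memset(), leftover stack bytes in sin_zero make two keys for the same address compare as different.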
Reviewed by: sam
Approved by: re (rwatson)
and the nfs3 client. Also fix some bugs that happen to be causing crashes
in both v3 and v4 introduced by the v4 import.
Submitted by: Jim Rees <rees@umich.edu>
Approved by: re
the MTRR Base/Mask registers. If you use the documented algorithm in the
systems programming guide, you'll get a GPF. The only thing that has
prevented this so far is that the bios pre-sets some MTRR entries which
we mis-interpreted sufficiently to fool the memcontrol interface into
thinking all the address space was taken and therefore rejected XFree86's
requests. However, not all bioses do this.. You get an insta-panic in
that case. Grrr. A better fix (dynamic mask) will happen by 5.3/5-stable
so that we automatically adapt to more than 40 physical bits.
Approved by: re (scottl)
very early (SI_SUB_TUNABLES - 1) and is responsible for setting mp_maxid.
cpu_mp_probe() is now called at SI_SUB_CPU and determines if SMP is
actually present and sets mp_ncpus and all_cpus. Splitting these up
allows an architecture to probe CPUs later than SI_SUB_TUNABLES by just
setting mp_maxid to MAXCPU in cpu_mp_setmaxid(). This could allow the
CPU probing code to live in a module, for example, since
sysinit's in modules cannot be invoked prior to SI_SUB_KLD. This is
needed to re-enable the ACPI module on i386.
- For the alpha SMP probing code, use LOCATE_PCS() instead of duplicating
its contents in a few places. Also, add a smp_cpu_enabled() function
to avoid duplicating some code. There is room for further code
reduction later since much of this code is also present in cpu_mp_start().
- All archs besides i386 still set mp_maxid to the same values they set it
to before this change. i386 now sets mp_maxid to MAXCPU.
Tested on: alpha, amd64, i386, ia64, sparc64
Approved by: re (scottl)
delete of objects. Also revert our temporary workaround in dsmthdat.c
that always copied objects. This is the correct fix for errors
evaluating _BST (and GBST) on IBM Thinkpads where an argument (Arg3)
was returned to the caller and the object was freed while still in use.
This will be in a future ACPI-CA dist.
Thanks to: kochi@netbsd.org, shaohua.li@intel.com
fixes an interrupt storm for certain users. This is done on the vendor
branch since the code is already in the 20031029 ACPI-CA dist and will
be imported after 5.2R.
Tested by: sebastian ssmoller <sebastian.ssmoller@gmx.net>
PR: i386/57909
Approved by: re (jhb)
different kernel to boot with kernel="NAME" would load the kernel and
loader.conf-selected modules from /boot/NAME, but it would not change
module_path. So, for instance, the automatically loaded acpi.ko would come
from /boot/kernel/acpi.ko, *always*.
Mind you, this happened for unassisted boot. If you interrupted, typed
"unload" and then "boot NAME", it would Do The Right Thing.
The source of the problem is the double initialization with beastie's
loader.rc. One would happen inside "start", and would load the kernel. The
next one would happen later in the loader.rc script, resetting module_path.
Because module_path is set to the Right Value by the functions in support.4th
that actually load the kernel, when beastie.4th proceeded to boot
module_path would remain wrong, as the kernel was already loaded.
This can be corrected by removing either initialization, and also by changing
the command used by beastie.4th from "boot" to "boot-conf", which makes sure
you use the right kernel and modules.
I chose to remove the second initialization, since this let you interrupt
(or confirm) boot before beastie even comes up. I avoid also doing the
boot-conf change because that would simply cause the kernel and modules to
be loaded twice (in fact, that was my original patch, until, in writing this
very commit message, I saw the error of my ways).
This commit changes the semantics of module loading when using the beastie
menu. Now it does what one would expect it to, but not what it was actually
doing, so something may break for unusual setups depending on broken
behavior. As our Japanese friends so nicely put it, shikata ga nakatta. :-)
Approved by: re (scottl)
known samples of broken chipsets that needed mixed mode in the first place
are so broken (ie: locks up) that we can't use IO APIC mode at all and it
needs to be turned off in the bios. So, the MIXED_MODE penalty on the
good chipsets gained nothing.
Approved by: re (scottl)
the compiler having to parse and optimize the PCPU_GET(curthread) so often.
__curthread() is an inline optimized version of PCPU_GET(curthread) that
knows that pc_curthread is at offset zero in the pcpu struct. Add a
CTASSERT() to catch any possible changes to this. This accounts for
just over a 1% wall clock speedup for total kernel compile/link time,
and 20% compile time speedup on some specific files depending on which
compile options are used.
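The offset-zero assumption and its compile-time guard can be sketched in portable C. The real __curthread() is machine-dependent; here the per-CPU structure is passed in explicitly, the _sketch names are hypothetical, and C11 _Static_assert stands in for CTASSERT().

```c
#include <assert.h>
#include <stddef.h>

struct thread_sketch {
    int td_tid;
};

/* Hypothetical per-CPU structure with the thread pointer first. */
struct pcpu_sketch {
    struct thread_sketch *pc_curthread;
    int pc_cpuid;
};

/* Fails the build if someone moves pc_curthread away from offset 0. */
_Static_assert(offsetof(struct pcpu_sketch, pc_curthread) == 0,
    "pc_curthread must be the first member");

/* Reads the first word of the per-CPU area without naming the member. */
static inline struct thread_sketch *
curthread_sketch(const struct pcpu_sketch *pc)
{
    return *(struct thread_sketch * const *)pc;
}
```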
Approved by: re (jhb)
the routing table. Move all usage and references in the tcp stack
from the routing table metrics to the tcp hostcache.
It caches measured parameters of past tcp sessions to provide better
initial start values for following connections from or to the same
source or destination. Depending on the network parameters to/from
the remote host this can lead to significant speedups for new tcp
connections after the first one because they inherit and shortcut
the learning curve.
tcp_hostcache is designed for multiple concurrent access in SMP
environments with high contention and is hash indexed by remote
ip address.
It removes significant locking requirements from the tcp stack with
regard to the routing table.
Reviewed by: sam (mentor), bms
Reviewed by: -net, -current, core@kame.net (IPv6 parts)
Approved by: re (scottl)
boot-disabled devices instead of skipping the last interrupt. This is
especially important for devices that only have one interrupt as this
bug was keeping any interrupt from being tried at all.
Reviewed by: msmith
Approved by: re (scottl)
accordingly. The define is left intact for ABI compatibility
with userland.
This is a pre-step for the introduction of tcp_hostcache. The
network stack remains fully useable with this change.
Reviewed by: sam (mentor), bms
Reviewed by: -net, -current, core@kame.net (IPv6 parts)
Approved by: re (scottl)
on SMP systems has a chance of working. This was a loose end of the
implementation of the ACPI Cx idle states. Since our logical CPU Id
is the ACPI processor Id, we do not need to jump through hoops to
obtain it.
Approved: re@ (jhb)
happen in interrupt context; 1) sleep locks, and 2) malloc/free
calls.
1) is fixed by using spin locks instead.
2) is fixed by preallocating a FIFO (implemented with a STAILQ)
and using elements from this FIFO instead. This turns out
to be rather fast.
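A sketch of the preallocated FIFO approach from item 2), using STAILQ from sys/queue.h. The pool size, names and payload are hypothetical, and the spin locking from item 1) is only indicated by comments.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

#define SK_POOL_SIZE 4   /* hypothetical pool size */

struct event_sketch {
    STAILQ_ENTRY(event_sketch) next;
    int payload;
};
STAILQ_HEAD(event_fifo, event_sketch);

static struct event_sketch sk_pool[SK_POOL_SIZE];
static struct event_fifo sk_empty = STAILQ_HEAD_INITIALIZER(sk_empty);
static struct event_fifo sk_full = STAILQ_HEAD_INITIALIZER(sk_full);

/* Done once at startup, outside interrupt context. */
static void sk_init(void)
{
    int i;

    for (i = 0; i < SK_POOL_SIZE; i++)
        STAILQ_INSERT_TAIL(&sk_empty, &sk_pool[i], next);
}

/* Interrupt side: no malloc(); returns 0 if the event had to be dropped. */
static int sk_harvest(int payload)
{
    struct event_sketch *ev;

    /* real code would take a spin lock here */
    ev = STAILQ_FIRST(&sk_empty);
    if (ev == NULL)
        return 0;
    STAILQ_REMOVE_HEAD(&sk_empty, next);
    ev->payload = payload;
    STAILQ_INSERT_TAIL(&sk_full, ev, next);
    return 1;
}

/* Consumer side: hands the element back to the empty FIFO instead of free(). */
static int sk_consume(int *payload)
{
    struct event_sketch *ev = STAILQ_FIRST(&sk_full);

    if (ev == NULL)
        return 0;
    STAILQ_REMOVE_HEAD(&sk_full, next);
    *payload = ev->payload;
    STAILQ_INSERT_TAIL(&sk_empty, ev, next);
    return 1;
}
```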
OK'ed by: re (scottl)
Thanks to: peter, jhb, rwatson, jake
Apologies to: *
an acpi_cpu method for shutdown that disables entry to acpi_cpu_idle
and then IPIs/waits for threads to exit. This fixes a panic late in
reboot in the SMP case.
* In the !SMP case, don't use the processor id filled out by the MADT
since there can only be one processor. This was causing a panic in
acpi_cpu_idle if the id was 1 since the data was being dereferenced from
cpu_softc[1] even though the actual data was in cpu_softc[0] (which is
correct).
* Rework the initialization functions so that cpu_idle_hook is written
late in the boot process.
* Make the P_BLK, P_BLK_LEN, and cpu_cx_count all softc-local variables.
This will help SMP boxes that have _CST or multiple P_BLKs. No such
boxes are known at this time.
* Always allocate the C1 state, even if the P_BLK is invalid. This means
we will always take over idling if enabled. Remove the value -1 as
valid for cx_lowest since this is redundant with machdep.cpu_idle_hlt.
* Reduce locking for the throttle initialization case to around the write
to the smi_cmd port. Add disabled code to write the CST_CNT. It will
be enabled once _CST re-evaluation is tested (post 5.2R).
Thank you: dfr, imp, jhb, marcel, peter
Tested by: rwatson, Harald Schmalzbauer <h@schmalzbauer.de>
Approved by: re (rwatson)
occurs when kmem_malloc() fails to allocate a sufficient number of vm
pages. Specifically, we avoid the lock-order reversal by not grabbing
Giant around pmap_remove() if the map is the kmem_map.
Approved by: re (jhb)
Reported by: Eugene <eugene3@web.de>
- turn on SMP in generic
- add 'device atpic' - this is unconditional on i386, but certain nvidia
based systems need to disable acpi because the reference bios seems to be
hosed. If acpi is disabled, we won't find the apic. amd64 has the
mptable code in a separate compile option as well.
- turn sym back on, it doesn't fail to compile anymore.
Approved by: re
source count pointers at them so that intr_execute_handlers() won't
choke when it tries to handle an unregistered ATPIC interrupt source.
- Install the low-level ATPIC interrupt handlers when we first program the
ATPIC in atpic_startup() rather than at SI_SUB_INTR. This is only
necessary to work around buggy code that enables interrupts too early
in the boot process (namely, the vm86 code).
Approved by: re (rwatson)
recognized as such at the time. Now that the other bogons in the
tree have been fixed, we can remove this ugly kludge.
o Remove stale/bogus opt_foo.h files. These are left over from
by-gone resources. And they point to the need, yet again, to
improve the build system so meta information is only in one place.
Submitted by: ru
Reviewed by: bde
Approved by: re@ (jhb)
purpose and the resulting vattr structure was ignored. In addition,
the VOP_GETATTR call was made with no vnode lock held, resulting in
vnode locking violation panic with debug kernels.
Reported by: truckman
Approved by: re@ (rwatson)
In practice it seems that in situations of high packet loss the ACK
timeout seems to hit this maximum (perhaps inappropriately, but the
estimation algorithm is not perfect, so apparently it happens). In
any case, 10 seconds is way too high a value so lower to 1 second.
MFC after: 3 days
the "old" SYSINIT. This makes sure things happen in the right order.
XXX: md(4) needs to be fully geom-ified and in particular /dev/md.ctl
should be abandoned for the GEOM OaM api.
Approved by: re@
rather than right before and right after. This allows these routines
to manipulate the mesh.
KASSERT that nobody creates a geom on an alien class.
Assert topology in g_valid_obj().
Approved by: re@
TOC's for the same media!! that borks up GEOM.
Although this looks like bad HW the following patch removes the
chance for GEOM panic'ing.
Approved by: re@
the MAC label referenced from 'struct socket' in the IPv4 and
IPv6-based protocols. This permits MAC labels to be checked during
network delivery operations without dereferencing inp->inp_socket
to get to so->so_label, which will eventually avoid our having to
grab the socket lock during delivery at the network layer.
This change introduces 'struct inpcb' as a labeled object to the
MAC Framework, along with the normal circus of entry points:
initialization, creation from socket, destruction, as well as a
delivery access control check.
For most policies, the inpcb label will simply be a cache of the
socket label, so a new protocol switch method is introduced,
pr_sosetlabel() to notify protocols that the socket layer label
has been updated so that the cache can be updated while holding
appropriate locks. Most protocols implement this using
pru_sosetlabel_null(), but IPv4/IPv6 protocols using inpcbs use
the worker function in_pcbsosetlabel(), which calls into the
MAC Framework to perform a cache update.
Biba, LOMAC, and MLS implement these entry points, as do the stub
policy, and test policy.
Reviewed by: sam, bms
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
o Each source gets its own queue, which is a FIFO, not a ring buffer.
The FIFOs are implemented with the sys/queue.h macros. The separation
is so that a low entropy/high rate source can't swamp the harvester
with low-grade entropy and destroy the reseeds.
o Each FIFO is limited to 256 (set as a macro, so adjustable) events
queueable. Full FIFOs are ignored by the harvester. This is to
prevent memory wastage, and helps to keep the kernel thread CPU
usage within reasonable limits.
o There is no need to break up the event harvesting into ${burst}
sized chunks, so retire that feature.
o Break the device away from its roots with the memory device, and
allow it to get its major number automagically.
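The per-source, bounded FIFO arrangement from the first two items might look roughly like this. The source count and all hv_ names are hypothetical; only the 256-event limit comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

#define SRC_COUNT    3     /* hypothetical number of entropy sources */
#define SRC_FIFO_MAX 256   /* limit named in the commit message */

struct hv_event {
    STAILQ_ENTRY(hv_event) link;
    int source;
};
STAILQ_HEAD(hv_fifo, hv_event);

/* One FIFO and one counter per source, so sources cannot starve each other. */
struct hv_source {
    struct hv_fifo fifo;
    int count;
};
static struct hv_source hv_sources[SRC_COUNT];

static void hv_init(void)
{
    int i;

    for (i = 0; i < SRC_COUNT; i++) {
        STAILQ_INIT(&hv_sources[i].fifo);
        hv_sources[i].count = 0;
    }
}

/* Returns 0 and drops the event when that source's FIFO is already full. */
static int hv_enqueue(struct hv_event *ev, int source)
{
    struct hv_source *s = &hv_sources[source];

    if (s->count >= SRC_FIFO_MAX)
        return 0;
    ev->source = source;
    STAILQ_INSERT_TAIL(&s->fifo, ev, link);
    s->count++;
    return 1;
}
```

A high-rate source that fills its own FIFO is simply ignored until it drains, while the other sources keep queueing.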
to see_other_uids but with the logical conversion. This is based
on (but not identical to) the patch submitted by Samy Al Bahra.
Submitted by: Samy Al Bahra <samy@kerneled.com>
more than one sf_buf for one vm_page. To accomplish this, we add
a global hash table mapping vm_pages to sf_bufs and a reference
count to each sf_buf. (This is similar to the patches for RELENG_4
at http://www.cs.princeton.edu/~yruan/debox/.)
For the uninitiated, an sf_buf is nothing more than a kernel virtual
address that is used for temporary virtual-to-physical mappings by
sendfile(2) and zero-copy sockets. As such, there is no reason for
one vm_page to have several sf_bufs mapping it. In fact, using more
than one sf_buf for a single vm_page increases the likelihood that
sendfile(2) blocks, hurting throughput.
(See http://www.cs.princeton.edu/~yruan/debox/.)
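A simplified sketch of the hash table plus reference count scheme, with hypothetical names and a tiny fixed pool. The real code has to sleep when no sf_buf is free and must handle locking; both are left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_NSFBUFS 4   /* hypothetical pool size */
#define SK_NHASH   8   /* hypothetical number of hash chains */

struct page_sketch {
    int unused;
};

struct sf_buf_sketch {
    struct page_sketch *m;            /* page currently mapped */
    int ref_count;                    /* users sharing this mapping */
    struct sf_buf_sketch *hash_next;  /* chain within a hash bucket */
};

static struct sf_buf_sketch sk_bufs[SK_NSFBUFS];
static struct sf_buf_sketch *sk_hash[SK_NHASH];

static unsigned sk_hashidx(const struct page_sketch *m)
{
    return (unsigned)(((uintptr_t)m >> 4) % SK_NHASH);
}

/* Share an existing mapping of the page if there is one. */
static struct sf_buf_sketch *sk_sf_buf_alloc(struct page_sketch *m)
{
    unsigned h = sk_hashidx(m);
    struct sf_buf_sketch *sf;
    int i;

    for (sf = sk_hash[h]; sf != NULL; sf = sf->hash_next) {
        if (sf->m == m) {
            sf->ref_count++;
            return sf;
        }
    }
    for (i = 0; i < SK_NSFBUFS; i++) {
        sf = &sk_bufs[i];
        if (sf->ref_count == 0) {
            sf->m = m;
            sf->ref_count = 1;
            sf->hash_next = sk_hash[h];
            sk_hash[h] = sf;
            return sf;
        }
    }
    return NULL;   /* the real code would block here */
}

/* Drop one reference; unhash the buffer when the last user is gone. */
static void sk_sf_buf_free(struct sf_buf_sketch *sf)
{
    struct sf_buf_sketch **pp;

    if (--sf->ref_count > 0)
        return;
    for (pp = &sk_hash[sk_hashidx(sf->m)]; *pp != NULL; pp = &(*pp)->hash_next) {
        if (*pp == sf) {
            *pp = sf->hash_next;
            break;
        }
    }
    sf->m = NULL;
}
```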
- This is heavily derived from John Baldwin's apic/pci cleanup on i386.
- I have completely rewritten or drastically cleaned up some other parts.
(in particular, bootstrap)
- This is still a WIP. It seems that there are some highly bogus bioses
on nVidia nForce3-150 boards. I can't stress how broken these boards
are. I have a workaround in mind, but right now the Asus SK8N is broken.
The Gigabyte K8NPro (nVidia based) is also mind-numbingly hosed.
- Most of my testing has been with SCHED_ULE. SCHED_4BSD works.
- the apic and acpi components are 'standard'.
- If you have an nVidia nForce3-150 board, you are stuck with 'device
atpic' in addition, because they somehow managed to forget to connect the
8254 timer to the apic, even though it's in the same silicon! ARGH!
This directly violates the ACPI spec.
with multiple ports on a shared interrupt demultiplexed by the puc_intr()
handler.
siointr1() first read as much input as possible and then checked all
possibly-relevant status registers, partly for robustness and partly
for historical reasons. This is very bad if it is called for every
port sharing an interrupt like puc_intr() does. It can spend too long
reading all the input for some ports when the interrupt is for a more
urgent event on another, or just too long checking all the status
registers when there are lots of ports. The inter-character time is
too long for reading all the input even when the interrupt is for a
transmitter interrupt on the same port, and at 921600 bps the inter-char
time is 10.85 usec and was often exceeded with just 2 ports, leaving
the transmitters idle for about 6% of the time.
The tweak is to break out of the read loop after reading 1 char if
output can be done. This avoids most of the idle transmitter time for
2 active ports at 921600 bps bidirectional on the test system. It
also reduces overhead by about 20%. More complete fixes use the
programmable tx low watermark on 16950's and reduce overhead by another
65%.
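The 10.85 usec figure can be checked with a short calculation, assuming 10 bit times per character (1 start, 8 data, 1 stop). The framing and the function name are assumptions for illustration.

```c
#include <assert.h>

/* Time to transfer one character, in microseconds. */
static double char_time_usec(double bps, int bits_per_char)
{
    return (double)bits_per_char * 1e6 / bps;
}
```

At 921600 bps with 10 bits per character this comes to about 10.85 usec, which is the budget the read loop has before the transmitter goes idle.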
do not have mh_nextpkt initialized. Sometimes what's there is "1", and the
ip_input() code pukes trying to m_free() it, rendering divert sockets and
such broken.
This really underscores the need to get rid of MT_TAG.
Reviewed by: rwatson
is the warning that points to the bug in `(char *)malloc(...)' where
malloc() is implicitly declared as returning int. We do similar things
here, but they work because u_int is the same as uintptr_t on i386's.)
system calls, and prefer these calls over getsockopt()/setsockopt()
for ABI reasons. When addressing UNIX domain sockets, these calls
retrieve and modify the socket label, not the label of the
rendezvous vnode.
- Create mac_copy_socket_label() entry point based on
mac_copy_pipe_label() entry point, intended to copy the socket
label into temporary storage that doesn't require a socket lock
to be held (currently Giant).
- Implement mac_copy_socket_label() for various policies.
- Expose socket label allocation, free, internalize, externalize
entry points as non-static from mac_net.c.
- Use mac_socket_label_set() in __mac_set_fd().
MAC-aware applications may now use mac_get_fd(), mac_set_fd(), and
mac_get_peer() to retrieve and set various socket labels without
directly invoking the getsockopt() interface.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
will now need editing except for spot checks.
Changed this buffer from a circular one to a linear one. This is more
useful for some cases and the sysctl that prints it doesn't support
circular buffers.
Fixed (output) formatting bugs in this sysctl. An off by 1 error caused
a garbage byte to be returned after annotation of large deltas, and
a race with the writer sometimes caused premature string termination.
o when compiling lint, undefine certain things and redefine them so that the
driver doesn't #error out. Since lint kernels aren't supposed to be
bootable, I'm not troubled by this breakage.
This fixes the tinderbox
Suggested by: rwatson
Approved by: bms
SO_PEERLABEL. This provides an interface to query the label of a
socket peer without embedding implementation details of mac_t in
the application. Previously, sizeof(*mac_t) had to be specified
by an application when performing getsockopt().
Document mac_get_peer(3), and expand documentation of the other
mac_get(3) functions. Note that it's possible to get EINVAL back
from mac_get_fd(3) when pointing it at an inappropriate object.
NOTE: mac_get_fd() and mac_set_fd() support for sockets will
follow shortly, so the documentation is slightly ahead of the
code.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
mac_setsockopt_label() into mac_socket_label_set(); make it non-static
so that it can be invoked from kern_mac.c for mac_set_fd().
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
While we end up the same place, we end up with two different CS register
values after the jump and 0xf000 is compatible with the hardware reset
value.
This makes a difference if the BIOS does a near jump before a far jump.
Detective work and patch by: Adrian Steinmann <ast@marabu.ch>
- improve sysinfo(2) syscall;
- add dummy fadvise64(2) syscall;
- add dummy *xattr(2) family of syscalls;
- add protos for the syscalls 222-225, 238-249 and 253-267;
- add exit_group(2) syscall, which is currently just wired to exit(2).
Obtained from: OpenBSD
MFC after: 2 weeks
Its restoration in rev.1.102 was mistranslated to the equivalent of
setsofttty() in rev.1.105. This increased overheads by causing a
context switch to the SWI handler after almost every interrupt. The
increase was approx. 50% on a Celeron 366 (from 23 usec to 34 usec
per interrupt).
I'm having bad luck with different parts of the sys tree being checked
out at slightly different times. Back it out, noting it doesn't cause
harm in any case. Tinderbox also makes these things more fun.
of newfs, to signify the newfs operation has not yet completed. Re-
write the superblock with the correct magic number once all of the
cylinder groups have been created to show the operation has finished.
Sponsored by: St. Bernard Software
physical mapping.
- Move the sf_buf API to its own header file; make struct sf_buf's
definition machine dependent. In this commit, we remove an
unnecessary field from struct sf_buf on the alpha, amd64, and ia64.
Ultimately, we may eliminate struct sf_buf on those architectures
except as an opaque pointer that references a vm page.
sure to sooptcopyin() the (struct mac) so that the MAC Framework
knows which label types are being requested. This fixes process
queries of socket labels.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
opt_ddb.h. These changes expand green's work of including
opt_global.h to prefer opt files in the kernel directory. Further
refinement might be needed, but I think this is good.
Note: While this is a step on the path to moving the meta information
about modules into the config files, it doesn't actually do that. It
just pulls in the opt files in a way that allows one to build
'generic' modules outside the tree.
disposing fifo resources in fifo_cleanup() instead of using
"vp->v_usecount == 1". There may be other references to the vnode, for
instance by nullfs, at the time fifo_open() or fifo_close() is called,
which could cause a resource leak.
Don't bother grabbing the vnode interlock in fifo_cleanup() since it no
longer accesses v_usecount.
the nfsv4 files. It is intended to be a short-term bridge while
alfred deals with the problem in a better way (eg, don't hesitate to
back this out when the real fix comes along). I've not heard back
from alfred in a few hours and other people are hitting this problem.
Approved by: markm, rwatson, grog, murray
problem, but the spamage of consoles is really bad. Until we can get
this to be less chatty, disable it so people can boot. The badness of
the spamage is worse than the badness that it reports :-(. Once the
underlying problems have been fixed, it can be reenabled.
Approved by: kken, markm, rwatson, grog, murray
* Use the cpu_idle_hook() to do idling for C1-C3.
* Use both _CST and the FADT to detect Cx states.
* Use both _PTC and P_CNT for controlling throttling.
* Add a notify handler to detect changes in _CST and _PSS
* Call the _INI function for each processor if present. This will be
done by ACPI-CA in the future.
* Fix a bug on SMP systems where CPUs will attach multiple times if the
bus is rescanned.
* Document new sysctls for controlling idling.
and remove two unnecessary variable initializations.
Make the introduction comment clearer with regard to which parts of
the packet are touched.
Requested by: luigi
value as reserved for internal use in boot blocks, because RB_PAUSE
broke binary compatibility by usurping the RB_DUAL flag. Probably no
one except me has boot blocks for which this matters, since most boot
blocks based on biosboot including pc98's boot2 can't boot elf kernels,
and /boot/loader doesn't properly pass flags set by the previous stage.
reboot.h:
Also mark the historical RB_PROBEKBD flag (0x80000) as reserved for
internal use in boot blocks.
boot2.c:
Added comments to inhibit usurping of other flags.
Approved by: guido, imp
MFC after: 1 week
on non-VCHR vnodes. This fixes a panic when reading data from files on a
filesystem with a small (less than a page) block size.
PR: 59271
Reviewed by: alc
kses from the run queues. Also, on SMP, we track the transferable
count here. Threads are transferable only as long as they are on the
run queue.
- Previously, we adjusted our load balancing based on the transferable count
minus the number of actual cpus. This was done to account for the threads
which were likely to be running. All of this logic is simpler now that
transferable accounts for only those threads which can actually be taken.
Updated various places in sched_add() and kseq_balance() to account for
this.
- Rename kseq_{add,rem} to kseq_load_{add,rem} to reflect what they're
really doing. The load is accounted for separately from the runq because
the load is accounted for even as the thread is running.
- Fix a bug in sched_class() where we weren't properly using the PRI_BASE()
version of the kg_pri_class.
- Add a large comment that describes the impact of a seemingly simple
conditional in sched_add().
- Also in sched_add() check the transferable count and KSE_CAN_MIGRATE()
prior to checking kseq_idle. This reduces the frequency of access for
kseq_idle which is a shared resource.
- missing parenthesization of some macro args
- point of do-while(0) hack defeated by putting a semicolon after while(0)
Fixed some style bugs in macros:
- not splitting the line when the macro value cannot be lined up in much
the same macros that didn't parenthesize their args
- braces around a 1-line statement
- do-while(0) hack not indented in the usual way in the same macros that
defeated its point.
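The two macro bugs listed above are generic C pitfalls. The sketch below illustrates them with hypothetical SKETCH_* macros; these are not the macros touched by this commit.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Missing parenthesization of the argument: SKETCH_SQ_BAD(1 + 2)
 * expands to (1 + 2 * 1 + 2), which is 5 rather than 9.
 */
#define	SKETCH_SQ_BAD(x)	(x * x)
#define	SKETCH_SQ(x)		((x) * (x))

/*
 * The do-while(0) hack lets a multi-statement macro act as a single
 * statement.  A semicolon after while (0) defeats the point: the
 * caller's own semicolon then forms a second, empty statement, so the
 * macro can no longer be used before "else".
 */
#define	SKETCH_SWAP_BAD(a, b) do {					\
	int t_ = (a);							\
	(a) = (b);							\
	(b) = t_;							\
} while (0);
#define	SKETCH_SWAP(a, b) do {						\
	int t_ = (a);							\
	(a) = (b);							\
	(b) = t_;							\
} while (0)

/* Order two ints; return 1 if they were swapped, 0 otherwise. */
int
sketch_sort2(int *lo, int *hi)
{
	assert(lo != NULL && hi != NULL);
	if (*lo > *hi)
		SKETCH_SWAP(*lo, *hi);	/* SKETCH_SWAP_BAD fails to compile here */
	else
		return (0);
	return (1);
}
```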
complex locking and rework ip_rtaddr() to do its own rtlookup.
Adapt all its callers to this and make ip_output() callable
with NULL rt pointer.
Reviewed by: sam (mentor)
- For acpi_pci_link_entry_dump(), add a few helper functions to display
the trigger mode, polarity, and sharemode of an individual IRQ resource.
These functions are then called for both regular and extended IRQ
resources.
- In acpi_pci_link_set_irq(), use the same type of IRQ resource
(regular vs. extended) for the new current resource as the type of
the resources from _PRS.
- When routing an interrupt don't ignore extended IRQ resources. Also,
use the same type of IRQ resource (regular vs. extended) for the new
current resource as the type of the resource from _PRS.
Tested by: peter
This fixes a dependency of mac_label.c on namespace pollution in
<vm/uma.h>.
Similarly for SYSCTL_DECL() although I had no problems with it. This
probably makes some includes of <sys/sysctl.h> bogus.
longer uses these interrupt vectors for its ISA interrupt pins, so these
entries will not be overwritten. If we get a spurious interrupt from the
ATPIC when using the APIC, it will be treated as a stray interrupt instead
of causing a panic.
Short description of ip_fastforward:
o adds full direct process-to-completion IPv4 forwarding code
o handles ip fragmentation incl. hw support (ip_flow did not)
o sends icmp needfrag to source if DF is set (ip_flow did not)
o supports ipfw and ipfilter (ip_flow did not)
o supports divert, ipfw fwd and ipfilter nat (ip_flow did not)
o returns anything it can't handle back to normal ip_input
Enable with sysctl -w net.inet.ip.fastforwarding=1
Reviewed by: sam (mentor)
of depending on namespace pollution 2 layers deep in <vm/uma.h>. Fixed
most nearby include messes (another like this, several the opposite of
this, and some formatting).
be printed, if the module were loaded into a kernel which had INET6 enabled.
The gre(4) driver does not use INET6, nor is it specified for IPv6. The
tunnel_status() function in ifconfig(8) is somewhat overzealous and assumes
that all tunnel interfaces speak KAME ifioctls.
This fix follows the path of least resistance, by teaching gre(4) about
the two KAME ifioctls concerned.
PR: bin/56341
- Move the IPI and local APIC interrupt vectors up into the 0xf0 - 0xff
range. The pmap lazyfix IPI was reordered down next to the TLB
shootdowns to avoid conflicting with the spurious interrupt vector.
- Move the base of APIC interrupts up 16 so that the first 16 APIC
interrupts do not overlap the vectors used by the ATPIC.
- Remove bogus interrupt vector reservations for LINT[01].
- Now that 0xc0 - 0xef are available, use them for device interrupts.
This increases the number of APIC device interrupts to 191.
- Increase the system-wide number of global interrupts to 191 to catch up
to more APIC interrupts.
Requested by: peter (2)
the packets are immediately returned for sending (e.g. when bridging
or packet forwarding). There are more efficient ways to do this
but for now use the least intrusive approach.
Reviewed by: imp, rwatson
in exit1(), make sure the p_klist is empty after sending NOTE_EXIT.
The process won't report fork() or execve() and won't be able to handle
NOTE_SIGNAL knotes anyway.
This fixes some race conditions with do_tdsignal() calling knote() while
the process is exiting.
Reported by: Stefan Farfeleder <stefan@fafoe.narf.at>
MFC after: 1 week
- In the receive routine handle the case where last descriptor could have
less than 4 bytes of data.
- Handle race between detach/ioctl routine.
MFC after: 3 days
kernel build. This makes it possible for me not to get pissed off that
random.ko crashes the system trying to rdtsc() when the i386/cpu.h
support code decides it's okay to call that op when neither I386_CPU nor
I486_CPU is defined. I guess it also makes WITNESS/INVARIANTS defines
get picked up by the modules.
vnode of the parent. However, this check should not be performed if
the lookup failed. This change should fix "union_lookup returning
. not same as startdir" panics people were seeing. The bug was
introduced by an incomplete import of a NetBSD delta in rev 1.38.
- Move the aforementioned check out from DIAGNOSTIC. Performance
is the least of our unionfs worries.
- Minor reorganization.
PR: 53004
MFC after: 1 week
- Return EBUSY if the region was wired by mlock(2) and MS_INVALIDATE
is specified to msync(2). This is required by the Open Group Base
Specifications Issue 6.
- vm_map_sync() doesn't return KERN_FAILURE. Thus, msync(2) can't
possibly return EIO.
- The second major loop in vm_map_sync() handles sub maps. Thus,
failing on sub maps in the first major loop isn't necessary.
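From userland, the EBUSY case above can be probed roughly as follows. This is a sketch, not a regression test from the tree: sketch_msync_locked() is a hypothetical helper, it uses a one-page anonymous mapping for brevity, and it assumes the process is allowed to mlock(2) a single page.

```c
#define	_DEFAULT_SOURCE		/* MAP_ANON on glibc in strict -std modes */
#include <sys/mman.h>

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Return 1 if msync(2) with MS_INVALIDATE on an mlock(2)ed page fails
 * with EBUSY, 0 if it behaves otherwise, and -1 if the mapping or the
 * lock could not be set up.
 */
int
sketch_msync_locked(void)
{
	long pgsz = sysconf(_SC_PAGESIZE);
	size_t len;
	void *p;
	int rv, saved;

	assert(pgsz > 0);
	len = (size_t)pgsz;
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	if (p == MAP_FAILED)
		return (-1);
	if (mlock(p, len) != 0) {
		(void)munmap(p, len);
		return (-1);
	}
	errno = 0;
	rv = msync(p, len, MS_SYNC | MS_INVALIDATE);
	saved = errno;		/* keep errno across the cleanup calls */
	(void)munlock(p, len);
	(void)munmap(p, len);
	return (rv == -1 && saved == EBUSY);
}
```

A return value of 1 indicates the behaviour described in the first item; -1 only means that the probe itself could not be set up.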
any functions that call them. Calling proc_rwmem() with the proc lock
held is not safe. Currently, we're protected from any races by Giant.
Eventually proc_rwmem() should require the proc lock and not Giant.
parts of ptrace using proc_rwmem(). proc_rwmem() requires Giant, and
Giant must be acquired prior to the proc lock, so ptrace must still
require Giant.
multicast hash are written. There are still two distinct algorithms used,
and there actually isn't any reason each driver should have its own copy
of this function as they could all share one copy of it (if it grew an
additional argument).
Give the HZ/overflow check a 10% margin.
Eliminate bogus newline.
If timecounters have equal quality, prefer higher frequency.
Some inspiration from: bde
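The tie-break rule above can be sketched as a comparison function. The structure and names are hypothetical and reduced for illustration, and the assumption that a higher quality value wins outright is mine, not stated in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a timecounter; not the kernel's struct. */
struct sketch_tc {
	const char	*name;
	uint64_t	 frequency;	/* ticks per second */
	int		 quality;	/* assumed: higher is better */
};

/* Return whichever of the current and candidate timecounters is preferred. */
const struct sketch_tc *
sketch_tc_pick(const struct sketch_tc *cur, const struct sketch_tc *cand)
{
	assert(cur != NULL && cand != NULL);
	if (cand->quality != cur->quality)
		return (cand->quality > cur->quality ? cand : cur);
	/* Equal quality: prefer the higher frequency. */
	return (cand->frequency > cur->frequency ? cand : cur);
}
```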
5212-based devices because PHY errors are used to collect data
on environmental noise and don't truly reflect the state
of the communications media. The result is confused users.
Folks that want to watch PHY errors can still get the statistics
through the device ioctl (used by athstats).
o reject scan requests for a device that isn't marked up
This fixes a problem where requesting a scan before marking the device
up would cause a panic because the current channel was set to "any" (0xffff).
o correct a read-lock assert in in_pcblookup_local that should be
a write-lock assert (since time wait close cleanups may alter state)
Supported by: FreeBSD Foundation
preemption two CPUs can be in the same function at the same time
and clobber each other's variables. Remove register declaration
from local variables.
Reviewed by: sam (mentor)
and empty its turnstile while the blocking threads still pointed to the
turnstile. If the thread on the first CPU blocked on a lock owned by
one of the threads blocked on the turnstile just woken up, then the
first CPU could try to manipulate a bogus thread queue in the turnstile
during priority propagation.
- Update locking notes for ts_owner and always clear ts_owner, not just
under INVARIANTS.
Tested by: sam (1)
deleted in 1.81. Increase the initial timeout limit to 2ms to
eliminate spurious messages of excessive timeouts in the NFS
client code.
Requested by: Poul-Henning Kamp <phk@phk.freebsd.dk>
Requested by: Mike Silbersack <silby@silby.com>
Requested by: Sam Leffler <sam@errno.com>
Giant and is also MPSAFE.
Push Giant further down into __mac_get_fd() and __mac_set_fd(),
grabbing it only for constrained regions dealing with VFS, and
dropping it entirely for operations related to labeling of pipes.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
Radeon IGP support (still lacking PCI IDs), and DRM interface 1.2 updates which
include finally tying the DRM instances to specific devices rather than relying
on the X Server.
This switch toggles between strict multicast delivery, and traditional
multicast delivery.
The traditional (default) behaviour is to deliver multicast datagrams to all
sockets which are members of that group, regardless of the network interface
where the datagrams were received.
The strict behaviour is to deliver multicast datagrams received on a
particular interface only to sockets whose membership is bound to that
interface.
Note that as a matter of course, multicast consumers specifying INADDR_ANY
for their interface get joined on the interface where the default route
happens to be bound. This switch has no effect if the interface which the
consumer specifies for IP_ADD_MEMBERSHIP is not UP and RUNNING.
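The difference between the two modes can be sketched as a per-socket delivery decision. The membership structure and the sketch_should_deliver() helper are hypothetical; the real check works on the kernel's multicast option state, which is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one socket's membership in one group. */
struct sketch_membership {
	uint32_t	group;		/* multicast group address */
	int		ifindex;	/* interface the membership is bound to */
};

/*
 * Decide whether a datagram for dst_group, received on rcv_ifindex,
 * is delivered to the socket that holds membership m.
 */
bool
sketch_should_deliver(const struct sketch_membership *m, uint32_t dst_group,
    int rcv_ifindex, bool strict)
{
	assert(m != NULL);
	if (m->group != dst_group)
		return (false);		/* not a member of this group */
	if (!strict)
		return (true);		/* traditional: any interface */
	return (m->ifindex == rcv_ifindex);	/* strict: bound interface only */
}
```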
The original patch has been cleaned up somewhat from that submitted. It has
been tested on a multihomed machine with multiple QuickTime RTP streams
running over the local switch, which doesn't do IGMP snooping.
PR: kern/58359
Submitted by: William A. Carrel
Reviewed by: rwatson
MFC after: 1 week
vector stubs and into the C functions they call.
- Move disabling and EOIing of interrupt sources out of PIC driver entry
points and into intr_execute_handlers(). Intr_execute_handlers() only
disables a source for an interrupt if it is a stray interrupt or has
threaded handlers. Sources with fast handlers no longer disable (mask)
the source while executing the handlers.
- Move the setting of clkintr_pending into intr_execute_handlers() and set
the variable for any interrupt source with a vector of 0. (Should only
be true for IRQ 0.) This fixes clkintr_pending in the NO_MIXED_MODE
case.
- Implement lapic_eoi() and use it to implement ioapic_eoi_source().
- Rename atpic_sched_ithd() to atpic_handle_intr() since it is used to
handle all atpic interrupts and not just threaded ones.
Inspired by: peter's changes to amd64 in p4 (1)
Requested by: bde (2)
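The masking rule for intr_execute_handlers() described above can be sketched as a predicate. The structure and field names below are hypothetical and stand in for the real interrupt source bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the handlers attached to one interrupt source. */
struct sketch_isrc {
	int	nfast;		/* fast handlers */
	int	nthreaded;	/* threaded (ithread) handlers */
};

/*
 * Should the source be disabled (masked) while the interrupt is handled?
 * Only for a stray interrupt (no handlers at all) or when threaded
 * handlers are present; a source with only fast handlers stays unmasked.
 */
bool
sketch_should_mask(const struct sketch_isrc *s)
{
	assert(s != NULL);
	if (s->nfast == 0 && s->nthreaded == 0)
		return (true);		/* stray interrupt */
	return (s->nthreaded > 0);
}
```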
extra argument to the devfs MAC policy entry points was accidentally
merged from the MAC branch during my earlier commit to these policies,
and is not scheduled to be merged just yet.
defines for these constants that include the trailing NUL byte. These
new constants have SIZ in their name instead of LEN. As soon as all
consumers in the tree are converted to use the new defines the old
defines will be put under BURN_BRIDGES.
Reviewed by: archie, julian, ru
Approved by: re (in principle)
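The LEN/SIZ distinction can be illustrated as below. The SKETCH_* constants, the structure, and the helper are hypothetical; they only mirror the naming convention described above, where SIZ counts the trailing NUL byte and LEN does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Old style: maximum number of characters, NOT counting the NUL. */
#define	SKETCH_NAMELEN	15
/* New style: size of the buffer, counting the trailing NUL. */
#define	SKETCH_NAMESIZ	(SKETCH_NAMELEN + 1)

struct sketch_node {
	char	name[SKETCH_NAMESIZ];	/* no "+ 1" needed at the use site */
};

/* Copy name into the node; fail if it does not fit including its NUL. */
int
sketch_set_name(struct sketch_node *n, const char *name)
{
	size_t len;

	assert(n != NULL && name != NULL);
	len = strlen(name);
	if (len >= SKETCH_NAMESIZ)
		return (-1);
	memcpy(n->name, name, len + 1);
	return (0);
}
```

Sizing buffers and bounds checks with the SIZ form avoids having to remember a "+ 1" wherever the constant is used.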
after the additions made for the new statfs structure (version
1.157). These must be updated in a separate checkin after
syscalls.master has been checked in so that they reflect its
new CVS identity. As these are purely derived files, it is not
clear to me why they are under CVS at all. I presume that it has
something to do with having `make world' operate properly.
accurate reporting of multi-terabyte filesystem sizes.
You should build and boot a new kernel BEFORE doing a `make world'
as the new kernel will know about binaries using the old statfs
structure, but an old kernel will not know about the new system
calls that support the new statfs structure. Running an old kernel
after a `make world' will cause programs such as `df' that do a
statfs system call to fail with a bad system call.
Reviewed by: Bruce Evans <bde@zeta.org.au>
Reviewed by: Tim Robbins <tjr@freebsd.org>
Reviewed by: Julian Elischer <julian@elischer.org>
Reviewed by: the hordes of <arch@freebsd.org>
Sponsored by: DARPA & NAI Labs.